Unity Is Growing Their LLVM Compiler Team As They Try To Make C# Faster Than C++


  • #31
    Originally posted by atomsymbol View Post

    I originally (yesterday) wanted to post a longer response. That didn't work out. The short version is as follows:

    I am hesitant to use std::weak_ptr because composition of std::weak_ptr is impossible. If the programmer puts a graph of std::shared_ptr behind std::weak_ptr, and the graph entrance is reachable only via weak pointers, then all the std::shared_ptr in the graph effectively become std::weak_ptr. The contradiction is:
    • The programmer cannot use std::weak_ptr in the graph because constructing/initializing the graph requires std::shared_ptr
    • The programmer cannot use std::shared_ptr in the graph because that would lead to a memory leak

      Solution with the current state of the C++ STL: the programmer has to manually write code computing the liveness of the mentioned graph
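To make the quoted problem concrete, here is a minimal sketch, assuming a hypothetical `Node` type with a single owning edge. It shows a `std::shared_ptr` cycle staying alive after every external owner is gone, with only a `std::weak_ptr` left pointing at the entrance, and the manual step needed to free it. The names `Node`, `make_cycle`, `break_cycle` and `demo_cycle` are made up for illustration.

```cpp
#include <cassert>
#include <memory>

// Hypothetical node type, for illustration only.
struct Node {
    std::shared_ptr<Node> next;  // owning edge
    static int alive;            // live-instance counter for the demo
    Node() { ++alive; }
    ~Node() { --alive; }
};
int Node::alive = 0;

// Builds the cycle a -> b -> a and returns only a weak handle to `a`.
// The local shared_ptrs are destroyed when the function returns.
std::weak_ptr<Node> make_cycle() {
    auto a = std::make_shared<Node>();
    auto b = std::make_shared<Node>();
    a->next = b;
    b->next = a;
    return std::weak_ptr<Node>(a);
}

// Manual "liveness" decision: cut one owning edge so the counts reach zero.
void break_cycle(const std::weak_ptr<Node>& entrance) {
    if (auto a = entrance.lock()) {
        a->next.reset();
    }
}

void demo_cycle() {
    std::weak_ptr<Node> entrance = make_cycle();
    assert(Node::alive == 2);     // the cycle keeps itself alive
    assert(!entrance.expired());  // so the weak handle still locks
    break_cycle(entrance);
    assert(Node::alive == 0);     // freed only after the manual cut
}
```

Without the call to `break_cycle`, both nodes would stay allocated until the process exits.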
    This parent child relationship with reference counting is fairly well explored; the main solutions are:

    1) In a hierarchical structure you would use std::shared_ptr downwards and std::weak_ptr upwards. That way you can have the graph "hanging off" the shared pointer root and individual branches.

    2) a single flat list of shared_ptrs *and* then a graph structure, all of weak pointers.
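Solution 1 can be sketched roughly as follows. This is only an illustration under my own assumptions: `TreeNode`, `add_child`, `root_name` and `demo_tree` are hypothetical names, and a real hierarchy would carry more state.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical hierarchy node, for illustration only.
struct TreeNode {
    std::string name;
    std::weak_ptr<TreeNode> parent;                   // upwards: non-owning
    std::vector<std::shared_ptr<TreeNode>> children;  // downwards: owning
    explicit TreeNode(std::string n) : name(std::move(n)) {}
};

// Attaches a new child below `parent` and returns it.
std::shared_ptr<TreeNode> add_child(const std::shared_ptr<TreeNode>& parent,
                                    const std::string& name) {
    auto child = std::make_shared<TreeNode>(name);
    child->parent = parent;
    parent->children.push_back(child);
    return child;
}

// Walks upwards with weak_ptr::lock() and returns the topmost reachable name.
std::string root_name(std::shared_ptr<TreeNode> node) {
    while (auto up = node->parent.lock()) {
        node = up;
    }
    return node->name;
}

void demo_tree() {
    auto root = std::make_shared<TreeNode>("root");
    auto leaf = add_child(add_child(root, "branch"), "leaf");
    assert(root_name(leaf) == "root");

    std::weak_ptr<TreeNode> branch = root->children[0];
    root.reset();                       // drop the only owner of the root
    assert(branch.expired());           // the branch went down with it
    assert(root_name(leaf) == "leaf");  // the leaf survives, now detached
}
```

The leaf survives only because the demo holds its own `shared_ptr` to it; everything that was reachable solely through the root is released.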

    Or am I missing what you mean?

    Some might say that stating this "ownership" of data is manual memory management. I personally think that shared / weak pointers in the header just seem like useful annotation. One that I would even do in Lua scripts.

    That said, the hybrid approach of reference counting *and* limited garbage collection would be the ideal case, then you could just use something like shared_ptr.
    There are already some "stubs" in the standard to help facilitate this garbage collection (https://en.cppreference.com/w/cpp/me...lare_reachable) but no formal garbage collector yet.
    The closest to this hybrid approach is probably UE4; although they err on the side of GC a little too much IMO.

    Slightly off topic, but I would also note that locking a weak pointer to get a shared pointer in order to go up the hierarchy is often cited as being "slow". I find this quite amusing because, in my tests, it is quite a bit faster to trivially increment a reference count and create a stack object than it is to go through all the machinery C# needs before the VM can call *any* managed function. And yet Java and .NET developers are always trying to suggest that non-native / JIT languages can be on par with native C++ :/
    Last edited by kpedersen; 04-08-2019, 11:05 AM.

    Comment


    • #32
      Originally posted by kravemir View Post

      Uhm.. No,.. The problem with Android apps is that they are bloated,.. if Google had picked c++ as the primary language for Android apps, then it would have proved how c++ is unstable, buggy, and memory eater (due to memory leaks). I personally saw android developers who don't care to write sane code even in Java.. I wouldn't like to see them writing c++ apps..
      I think your premise is wrong, but I can't argue with your points, you're right.

      Comment


      • #33
        Originally posted by kravemir View Post

        Uhm.. No,.. The problem with Android apps is that they are bloated,.. if Google had picked c++ as the primary language for Android apps, then it would have proved how c++ is unstable, buggy, and memory eater (due to memory leaks). I personally saw android developers who don't care to write sane code even in Java.. I wouldn't like to see them writing c++ apps..
        I wouldn't like to see C++ become the main language for Android development, either. But it's really hard to accidentally program a memory leak in C++14 unless you use archaic memory management stuff. Crashes due to poor thread management are what I'd expect more.

        Comment


        • #34
          Originally posted by oleid View Post
          I wouldn't like to see C++ to be the main language for Android development, either.
          I would, but not on the merits of the language; entirely because it would make all code bases homogeneous. We could basically stick to one language for *everything*, which in the "games industry" happens to be C++. As it stands now:

          1) Win32 - C++
          2) iOS - Objective-C++
          3) Android - NDK / NativeActivity (C++)
          4) Web - Emscripten (C++)

          Like this, we can share so much code between projects. Imagine using the "premier" languages as the following:

          1) Win32 - C#
          2) iOS - Swift
          3) Android - Java
          4) Web - Javascript

          We couldn't share sod all. What a bloody waste of time! No porting opportunity. Why do people do this?

          I don't particularly like Java but if every single project on earth would switch to it as a language, I would be very happy with that.

          It is rarely about our "enjoyment" when it comes to programming, it is all about not having to rewrite shite pointlessly XD.

          Same with spoken language, to be fair. I am a native English speaker so I obviously prefer that language over others, but if the whole world switched to, e.g., Danish, I would be more than happy to learn it. Holidays around the world would be so much better!

          Bringing this topic back to Unity slightly, I highly recommend using Microsoft's C++/CLI compiler (the /clr option) rather than C# for the .NET bytecode. Getting locked to C# is really crappy for portability.
          Last edited by kpedersen; 04-09-2019, 05:33 AM.

          Comment


          • #35
            Originally posted by kravemir View Post

            Uhm.. No,.. The problem with Android apps is that they are bloated,.. if Google had picked c++ as the primary language for Android apps, then it would have proved how c++ is unstable, buggy, and memory eater (due to memory leaks). I personally saw android developers who don't care to write sane code even in Java.. I wouldn't like to see them writing c++ apps..
            Speaking as someone who's been forced to code in Ada for the past 8 years: Going back to C/C++ is hard. Yes, it's portable, but it's really overdue that the language gets retired for everything except maybe low-level interfaces. It is not fun to use, and the shorthand syntax that pretty much *every* C developer uses makes figuring out what programs are trying to do much more difficult than necessary. (Sorry, I'm used to ImAVeryLongAndDescriptiveVariableOrProcedureName now.)

            Comment


            • #36
              Originally posted by kpedersen View Post
              This parent child relationship with reference counting is fairly well explored; the main solutions are:

              1) In a hierarchical structure you would use std::shared_ptr downwards and std::weak_ptr upwards. That way you can have the graph "hanging off" the shared pointer root and individual branches.
              First, I like your answer because it made me think. However, the issue is that there do exist cases in which it is impossible to know at compile-time whether a pointer is a forward/downward pointer or a backward/upward pointer.

              Consider the following code found in a compiler implementation:
              Code:
              struct Jump final : Instruction {
                  // Does ptr==shared_ptr OR ptr==weak_ptr ???
                  ptr<Instruction> target;
              };
              It is impossible to decide up-front whether target is `shared_ptr` or `weak_ptr` because in some control-flow graphs the `target` is a forward/downward pointer while in other control-flow graphs the `target` is a backward/upward pointer.

              Originally posted by kpedersen View Post
              2) a single flat list of shared_ptrs *and* then a graph structure, all of weak pointers.
              This means that if the `Instruction` graph is changed (e.g: the compiler optimizes the code), then a manual cycle-aware garbage collection of the graph needs to be performed in order to update the list of shared_ptrs.

              Note: Imagine that `Instruction` has 100 subclasses representing various machine opcodes.
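As a rough sketch of what that manual, cycle-aware collection could look like: the code below assumes a heavily simplified IR in which every edge is a `weak_ptr` and a per-function flat list holds the only `shared_ptr`s. `Function`, `sweep_unreachable` and the two edge fields are hypothetical names, not taken from any real compiler.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <unordered_set>
#include <vector>

// Hypothetical, heavily simplified IR; names are for illustration only.
struct Instruction {
    std::weak_ptr<Instruction> next;    // fall-through successor (non-owning)
    std::weak_ptr<Instruction> target;  // jump target, if any (non-owning)
};

struct Function {
    std::vector<std::shared_ptr<Instruction>> owned;  // flat owner list
    std::weak_ptr<Instruction> entry;
};

// Marks everything reachable from the entry, then drops the rest from the
// owner list. Returns the number of instructions removed.
std::size_t sweep_unreachable(Function& f) {
    std::unordered_set<const Instruction*> marked;
    std::vector<std::shared_ptr<Instruction>> work;
    if (auto e = f.entry.lock()) work.push_back(e);
    while (!work.empty()) {
        auto cur = work.back();
        work.pop_back();
        if (!marked.insert(cur.get()).second) continue;  // seen before: cycle-safe
        if (auto n = cur->next.lock()) work.push_back(n);
        if (auto t = cur->target.lock()) work.push_back(t);
    }
    const std::size_t before = f.owned.size();
    f.owned.erase(std::remove_if(f.owned.begin(), f.owned.end(),
                                 [&](const std::shared_ptr<Instruction>& i) {
                                     return marked.count(i.get()) == 0;
                                 }),
                  f.owned.end());
    return before - f.owned.size();
}

void demo_sweep() {
    Function f;
    for (int i = 0; i < 4; ++i) f.owned.push_back(std::make_shared<Instruction>());
    f.entry = f.owned[0];
    f.owned[0]->next = f.owned[1];
    f.owned[1]->target = f.owned[0];  // backward jump: reachable loop
    f.owned[2]->next = f.owned[3];
    f.owned[3]->target = f.owned[2];  // unreachable loop
    assert(sweep_unreachable(f) == 2);
    assert(f.owned.size() == 2);
}
```

With 100 `Instruction` subclasses, each one would have to expose its outgoing edges to this walk, which is the maintenance cost being pointed out here.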

              Comment


              • #37
                Originally posted by atomsymbol View Post

                Consider the following code found in a compiler implementation:
                Code:
                struct Jump final : Instruction {
                    // Does ptr==shared_ptr OR ptr==weak_ptr ???
                    ptr<Instruction> target;
                };

                Note: Imagine that `Instruction` has 100 subclasses representing various machine opcodes.
                I agree in general but in this case, (am I understanding this correctly?) the answer here strongly suggests a weak_ptr because:

                1) target (almost by definition of its name) suggests that this Jump class does not own this data. It is referring to existing data to point to.
                2) In contextual terms, the Jump instruction navigates to the target instruction. This must exist and "be owned" elsewhere, and thus does not need to be held here as a shared_ptr
                3) This is basically a good example of aggregation. The target instruction exists regardless of whether this Jump instruction points to it. Composition here makes no sense because they are independent instructions.

                In the case of instructions, I think they should never hold onto each other's data (as shared_ptrs) because, even though they may point at one another, they are still all held (in what is effectively a flat list) by something like an assembly object.

                I tend to use this as a quick reference guide. Player can target other players but cannot possibly own them (Aggregation). A player has weapons glued to his hands which he cannot possibly drop. So he "owns" their data because they are part of him (Composition).

                Code:
                struct Player
                {
                  weak_ptr<Player> target;
                  shared_ptr<Weapon> weapon;
                  Weapon offhand;
                };
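A small sketch of how the two kinds of member behave at runtime, assuming a trivial hypothetical `Weapon` type. The helper `has_target` and the function `demo_player` are illustrative names only.

```cpp
#include <cassert>
#include <memory>

struct Weapon { int damage = 10; };  // hypothetical payload type

struct Player {
    std::weak_ptr<Player> target;    // aggregation: refers to, does not own
    std::shared_ptr<Weapon> weapon;  // composition through shared ownership
    Weapon offhand;                  // composition by value
};

// Returns true while the targeted player is still alive.
bool has_target(const Player& attacker) {
    return !attacker.target.expired();
}

void demo_player() {
    auto a = std::make_shared<Player>();
    auto b = std::make_shared<Player>();
    a->weapon = std::make_shared<Weapon>();
    std::weak_ptr<Weapon> weapon_watch = a->weapon;

    a->target = b;
    assert(has_target(*a));
    b.reset();                      // the target goes away on its own terms
    assert(!has_target(*a));        // the attacker notices, nothing dangles

    a.reset();                      // the owner goes away...
    assert(weapon_watch.expired()); // ...and takes the weapon with it
}
```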
                But yes, you are absolutely right, sometimes you cannot easily make this choice. But I especially notice it when wrapping C libraries.

                Code:
                struct OpenGLContext; // forward declaration, needed by Texture

                struct Texture
                {
                  shared_ptr<OpenGLContext> context; // Because if the OpenGL context is deleted, it's game over for the underlying data (GLuint) regardless of reference count
                  GLuint id;
                };

                struct OpenGLContext
                {
                  std::vector<shared_ptr<Texture> > textures; // Because if any of these get deleted it's bad news: they might be currently bound in OpenGL's stupid state system
                };
                Here you get a cyclic reference. In this case, I would use an intermediate class to hold them both (as shared_ptrs) and then let that go out of scope at the end of the program. Still a bit naff though. This solution is actually most similar to the single flat list (#2) in the first example solutions.
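One possible reading of the intermediate-class idea, sketched with stand-in types instead of real OpenGL handles. Note the assumption that differs from the snippet above: here the cross references are `weak_ptr`s and only the hypothetical `Graphics` holder owns anything, so its destructor can release both sides in a fixed order. `Context`, `Texture`, `Graphics` and `create_texture` are illustrative names.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct Context;  // stand-in for an OpenGL context wrapper

struct Texture {
    std::weak_ptr<Context> context;  // back reference, non-owning in this sketch
    unsigned id = 0;                 // stand-in for GLuint
};

struct Context {
    std::vector<std::weak_ptr<Texture>> textures;  // non-owning in this sketch
};

// Intermediate holder: the only place that keeps shared_ptrs to both sides.
// Members are destroyed in reverse declaration order, so the textures are
// released before the context.
struct Graphics {
    std::shared_ptr<Context> context = std::make_shared<Context>();
    std::vector<std::shared_ptr<Texture>> textures;

    std::weak_ptr<Texture> create_texture(unsigned id) {
        auto t = std::make_shared<Texture>();
        t->id = id;
        t->context = context;
        context->textures.push_back(t);
        textures.push_back(t);
        return std::weak_ptr<Texture>(t);
    }
};

void demo_graphics() {
    std::weak_ptr<Texture> tex;
    std::weak_ptr<Context> ctx;
    {
        Graphics g;
        tex = g.create_texture(7);
        ctx = g.context;
        assert(tex.lock()->id == 7);
        assert(tex.lock()->context.lock() == g.context);
    }
    assert(tex.expired());  // no cycle, so both sides are released
    assert(ctx.expired());
}
```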

                That said... a garbage collector would be even worse in this case, because it handles native C memory no better, especially memory stored on a GPU. There is currently no way to scan a GPU's stack memory for references. The Java jogamp / gluegen bindings are... complex in order to get round this, and end up just holding on to as much as possible to prevent it from being GC'ed.

                So as usual it leads me to state that after all these years, C++ is still pretty crappy. However it is still the best we have :/
                The later standards are not making it any better either, most of the features are gimmicks like auto, lambdas, modules and other javascript-like "quick buck" concepts.
                And don't get me started on safety; I am writing a "safe" debug STL just to tell me when the language has managed to kick me in the teeth again. Just to prevent even the most basic errors (like this: https://github.com/osen/sr1/blob/mas...gling_this.cpp) when dealing with massive codebases.
                Last edited by kpedersen; 04-09-2019, 12:10 PM.

                Comment


                • #38
                  Holy shit so many terms for pointers which are just... pointers. Just fucking use raw pointers, christ. This crap is so much over-engineering for dummies.

                  Comment


                  • #39
                    Originally posted by kpedersen View Post
                    Just to prevent even the most basic errors (like this: https://github.com/osen/sr1/blob/mas...gling_this.cpp) when dealing with massive codebases.
                    I am waiting for a C++ compiler that will print a warning in such cases.

                    ----

                    Unknown:

                    I don't know how to add a single github file at https://scan.coverity.com

                    Insufficient:
                    Code:
                    # https://clang-analyzer.llvm.org/
                    $ scan-build -v g++ -Wall -Werror -g dangling_this.cpp
                    scan-build: Using '/usr/lib64/llvm/8/bin/clang-8' for static analysis
                    scan-build: Emitting reports for this run to '/tmp/scan-build-2019-04-10-140713-26651-1'.
                    scan-build: Removing directory '/tmp/scan-build-2019-04-10-140713-26651-1' because it contains no reports.
                    scan-build: No bugs found.
                    Code:
                    # http://cppcheck.sourceforge.net/
                    $ cppcheck dangling_this.cpp -I /usr/lib/gcc/x86_64-pc-linux-gnu/7.4.0/include/g++-v7/ -I /usr/lib/gcc/x86_64-pc-linux-gnu/7.4.0/include/g++-v7/x86_64-pc-linux-gnu/ -I /usr/lib/gcc/x86_64-pc-linux-gnu/7.4.0/include/ -I /usr/include
                    Checking dangling_this.cpp ...
                    Checking dangling_this.cpp: F_LOCK...
                    Checking dangling_this.cpp: HAVE_SCHED_H;_LIBOBJC...
                    Checking dangling_this.cpp: L_SET...
                    Checking dangling_this.cpp: MB_LEN_MAX...
                    Checking dangling_this.cpp: PTHREAD_RECURSIVE_MUTEX_INITIALIZER...
                    Checking dangling_this.cpp: WINNT;__BSD_NET2__;__FreeBSD__;____386BSD____;__bsdi__;__sequent__...
                    Checking dangling_this.cpp: _ANSI_H_;_I386_ANSI_H_;_MACHINE_ANSI_H_;_X86_64_ANSI_H_;_ANSI_STDDEF_H;_STDDEF_H;_STDDEF_H_;__STDDEF_H__;__need_ptrdiff_t;__need_wint_t...
                    Checking dangling_this.cpp: _ANSI_H_;_I386_ANSI_H_;_MACHINE_ANSI_H_;_X86_64_ANSI_H_;_ANSI_STDDEF_H;_STDDEF_H;_STDDEF_H_;__STDDEF_H__;__need_ptrdiff_t;__need_wint_t;_BSD_PTRDIFF_T_;_PTRDIFF_T_...
                    Checking dangling_this.cpp: _ANSI_H_;_I386_ANSI_H_;_MACHINE_ANSI_H_;_X86_64_ANSI_H_;_ANSI_STDDEF_H;_STDDEF_H;_STDDEF_H_;__STDDEF_H__;__need_ptrdiff_t;__need_wint_t;_BSD_SIZE_T_;_SIZE_T_...
                    Checking dangling_this.cpp: _ANSI_H_;_I386_ANSI_H_;_MACHINE_ANSI_H_;_X86_64_ANSI_H_;_ANSI_STDDEF_H;_STDDEF_H;_STDDEF_H_;__STDDEF_H__;__need_ptrdiff_t;__need_wint_t;_BSD_WCHAR_T_;_WCHAR_T_...
                    Checking dangling_this.cpp: _ANSI_H_;_I386_ANSI_H_;_MACHINE_ANSI_H_;_X86_64_ANSI_H_;_ANSI_STDDEF_H;_STDDEF_H;_STDDEF_H_;__STDDEF_H__;__need_ptrdiff_t;__need_wint_t;_GCC_PTRDIFF_T_...
                    (information) Too many #ifdef configurations - cppcheck only checks 12 configurations. Use --force to check all configurations. For more details, use --enable=information.
                    Successful detection by valgrind (slow for large code bases and it isn't a compile-time analysis):
                    Code:
                    $ valgrind ./dangling_this
                    ==27349== Memcheck, a memory error detector
                    ==27349== 
                    ==27349== Invalid write of size 4
                    ==27349==    at 0x109463: Dummy::broken() (dangling_this.cpp:15)
                    ==27349==    by 0x1092B6: main (dangling_this.cpp:23)
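For readers who cannot open the linked file: the sketch below is NOT its contents, only a hypothetical reconstruction of the general "dangling this" pattern, in which a member function releases the last owner of its own object and then writes to a member. Only the safe variant is compiled here; the broken variant is described in a comment. `owner_slot`, `safe` and `demo_dangling_this` are made-up names.

```cpp
#include <cassert>
#include <memory>

// Hypothetical reconstruction of the general pattern, not the linked file.
struct Dummy : std::enable_shared_from_this<Dummy> {
    int value = 0;
    std::shared_ptr<Dummy>* owner_slot = nullptr;  // the slot that owns us

    // A broken variant would do:
    //     owner_slot->reset();  // destroys *this
    //     value = 9;            // write through a dangling `this`
    //
    // Safe variant: pin the object for the duration of the call.
    void safe() {
        auto self = shared_from_this();       // extra reference keeps *this alive
        if (owner_slot) owner_slot->reset();  // the owner lets go of us
        value = 9;                            // still valid while `self` exists
    }                                         // object is destroyed here, after the write
};

void demo_dangling_this() {
    auto owner = std::make_shared<Dummy>();
    std::weak_ptr<Dummy> watch = owner;
    owner->owner_slot = &owner;
    owner->safe();
    assert(owner == nullptr);  // the slot was cleared from inside the call
    assert(watch.expired());   // and the object is gone once safe() returned
}
```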

                    Comment


                    • #40
                      Originally posted by atomsymbol View Post

                      I am waiting for a C++ compiler that will print a warning in such cases.
                      I think PVS-Studio (https://www.viva64.com/en/pvs-studio) is the closest to a compile time check. Never used it personally.

                      Valgrind is good but I often get tired of porting my software to Linux (from either BSD or Windows) just to use it haha. I get too many false negatives when I use it on FreeBSD, or too many false positives when using NVIDIA drivers. Intel Parallel Studio / VTune misses most errors IMO. My favorite, Rational Purify, is hard to get hold of these days.

                      What is quite interesting is that Valgrind misses around 10% of the memory breakages in my "hall of shame" test suite.
                      The more mechanical nature of my std::sr1 library means it runs faster (even when active during debug) and it is guaranteed to catch most of these issues. Valgrinding large games and real-time engines is... painful

                      I ran it past the C++ ISO standards guys, but they did not see much value in a "debug version of the STL" and also didn't seem *that* interested in safety compared to speed and features, which was a little disappointing.

                      Originally posted by Weasel View Post
                      Holy shit so many terms for pointers which are just... pointers. Just fucking use raw pointers, christ. This crap is so much over-engineering for dummies.
                      Raw pointers are no good for modern C++. Consider this:

                      Code:
                      Player *p = new Player();
                      std::string bob = "My name";
                      delete p; // May never reach here
                      A std::bad_alloc exception could be thrown on the bob line, meaning that as the exception propagates and the stack unwinds, you never get to call delete on p.
                      You could litter the code with try / catch, but since C++ has no "finally" keyword you would probably end up duplicating a lot of cleanup code.
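A minimal sketch of the RAII alternative, assuming a hypothetical `may_throw` helper that stands in for any throwing operation (such as the string allocation above). The instance counter exists only so the demo can check that nothing leaks.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

struct Player {
    static int alive;  // live-instance counter for the demo
    Player() { ++alive; }
    ~Player() { --alive; }
};
int Player::alive = 0;

// Hypothetical stand-in for any operation that may throw.
void may_throw(bool fail) {
    if (fail) throw std::runtime_error("simulated failure");
}

// The Player is released on every path out of the try block, including the
// exceptional one, because unique_ptr's destructor runs during unwinding.
bool use_player(bool fail) {
    try {
        std::unique_ptr<Player> p(new Player());
        may_throw(fail);
        return true;
    } catch (const std::runtime_error&) {
        return false;
    }
}

void demo_raii() {
    assert(use_player(false));
    assert(Player::alive == 0);  // normal path: released
    assert(!use_player(true));
    assert(Player::alive == 0);  // exceptional path: also released
}
```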

                      Plus there aren't many pointers... Just shared_ptr. The others are only there for its shortcomings / failures. If shared_ptr was perfect, there would only be one pointer type and I am sure C-style pointers would be officially deprecated.

                      For me, safety is much more important than speed. Besides, in my experience, I could write the slowest C++ code possible and still come out faster than the most optimized Java. Jeez, only fairly recently did Java stop having to use the heap for everything haha. The optimization is called escape analysis; as far as I know, HotSpot has enabled it by default since Java SE 6u23.

                      When you see things like glm no longer zeroing out basic structures like vec4 just for "ultimate speed" and artificially disabling compiling on compilers as recent as Visual C++ 2012 (not even the standard but the brand, wtf :/), you can really see that C++ is starting to be ruined by people. And yet it is *still* the best choice for most problem domains haha

                      Don't get me wrong, raw pointers are fine if you are binding C code or outright writing in C. Otherwise you can take a step back, look at a massive codebase... and know somewhere... someone... has fscked up with a raw pointer XD.
                      Last edited by kpedersen; 04-10-2019, 06:13 PM.

                      Comment
