AMD Rolls Out The Threadripper 1900X: 16 Thread, 4.0GHz Boost

  • #71
    Originally posted by pal666 View Post
    note that this "compiler" can't produce executable. because it is just compiler's frontend. and you've been told this on multiple occasions, and yet you still are hallucinating
    The compiler does its job and produces a standardized IR format. There is no C++ involved.

    you are moronic enough to give me link to c++ sources and claiming it is rust
    https://github.com/rust-lang/llvm/tr...604b4e548a84d3
    Why did you link the source code to LLVM? That isn't the Rust compiler.

    that is not thread pool. you run arbitrary code on thread pool and if one task is waiting for other it will not deadlock only when other will have chance to run. which it won't have if your pool has one thread or in general if size of pool is smaller than length of dependency chain
    Sorry, but you're incredibly wrong. That is a thread pool. It's not a single thread but a pool of threads, and this pool of threads can execute events as they are received over time. No need to spawn threads every time a parallel calculation is needed, and no need to spawn as many threads as you have work units. No deadlocking of any form will happen.

    most established c++ library is boost. good luck
    Rust has no need for Boost. The things that Boost provides, Rust already has superior alternatives within its standard library, the core language, and crates within the ecosystem. Sucks for C++ though that it's not Rust. C++ is now a dying breed that won't be around for the long term.



    • #72
      Originally posted by pal666 View Post
      nobody needs simple c compiler. in real optimizing compiler parsing time is negligible
      Depends on the compiler and language. I know some recent compilers manage around 10-20k lines per second, and parsing the Linux kernel at that speed takes quite a while. Not all parsers (or parser generators) are that fast.



      • #73
        Originally posted by jrch2k8 View Post

        Again, you are pointing at one set of processes out of a bunch (you are not wrong up to that point), but a lot happens between reading the file (or TU, if you prefer) and the creation of that .o (or obj) file, and a lot more happens long before the linker kicks in.
        Indeed I simplified this quite a bit, but gcc is quite a special compiler with a lot of legacy code and design in it. There are plenty of newer compilers that pack everything into a single executable and process; they may even do linking in the same process.

        The TU is the smallest atomic piece of code, but it cannot always be parallelized, is not always unique, and is not always serialized; it depends heavily on what all the processes in between need to do, and certain operations may need deeper analysis and may even require parsing a huge number of files more deeply.
        I described some ways to parallelize and improve speed at the level of TUs, without going into finer-grained details. The main reason it works is that we can start at full load from the beginning instead of slowly discovering new tasks as we compile. The autohell+gcc+make combo is suboptimal in so many ways. We don't even need to go into the details of parallelization; we can just look at the problem and estimate how much faster it would get with a better design. Take a look at zapcc: it gives a rough idea of what's wrong. All the autohell caches, compiler caches, etc. just memoize previous compiler/configuration runs. We could do a lot better by integrating those into the compiler. That would create a new problem, running out of memory, but it's not really a problem, just a matter of managing a cache, which is a well-known problem. Besides, C/C++ compilers started in environments that couldn't do whole-program compilation without two or more passes due to memory constraints. Now my machine can hold 1000 compiler runs in memory concurrently; that's a five-fold increase in memory size relative to computational resources, and this is just an ordinary desktop PC. Compilation takes place in the cloud nowadays. Memory capacity won't become a problem: we can LRU-evict data, we have swap files, and the amount of code isn't that large anyway. I can compile the whole Linux base system (a few hundred megabytes) in RAM without discarding any intermediate data.

        The problem is that the TU is a very theoretical unit, but in reality it is not a good place to parallelize, because modern compilers need several copies not only of each TU but of the actual parsed files, and depending on the optimization level even several IR interpretations of each TU and its dependency tree in RAM; some passes can even generate more copies with mixed permutations, only to discard them and create another tree.
        This poses some problems, but the critical optimizations that diversify the object code don't take place in the first phases (depending on the language). We could have a full IR representation before we start dropping subtrees and transforming the AST. Caching this already helps quite a bit.

        My recommendation is to trace, in code, a real compiler from the moment it opens an actual file to the moment the .o is created, with several different flags; it will then be clearer why it is so complex and not as easy to parallelize as the theoretical compiler concepts on Wikipedia.
        The problem here is that you assume there needs to be a certain link between on-disk formats and processing. I'm claiming that you need to rethink the whole design to speed up the whole task. One could even argue that compilation IS NOT a hard problem in need of parallelization, even now. Take mainline Linux and gcc, make defconfig: it takes 20-30 seconds with workstation processors. Do some development and recompile: two seconds. We already have better implementations (in Java land the IDEs have incremental compilers), zapcc, and so on. We also have better parsing tech: a highly optimized parser handles 1-10M lines per second, while an ANTLR-generated toy parser does 10k LOC per second. It would be nice to make it a lot faster, but is it really one of the critical problems in need of urgent fixing?



        • #74
          Originally posted by caligula View Post
          Indeed I simplified this quite a bit, but gcc is quite a special compiler with a lot of legacy code and design in it. There are plenty of newer compilers that pack everything into a single executable and process; they may even do linking in the same process.


          I described some ways to parallelize and improve speed at the level of TUs, without going into finer-grained details. The main reason it works is that we can start at full load from the beginning instead of slowly discovering new tasks as we compile. The autohell+gcc+make combo is suboptimal in so many ways. We don't even need to go into the details of parallelization; we can just look at the problem and estimate how much faster it would get with a better design. Take a look at zapcc: it gives a rough idea of what's wrong. All the autohell caches, compiler caches, etc. just memoize previous compiler/configuration runs. We could do a lot better by integrating those into the compiler. That would create a new problem, running out of memory, but it's not really a problem, just a matter of managing a cache, which is a well-known problem. Besides, C/C++ compilers started in environments that couldn't do whole-program compilation without two or more passes due to memory constraints. Now my machine can hold 1000 compiler runs in memory concurrently; that's a five-fold increase in memory size relative to computational resources, and this is just an ordinary desktop PC. Compilation takes place in the cloud nowadays. Memory capacity won't become a problem: we can LRU-evict data, we have swap files, and the amount of code isn't that large anyway. I can compile the whole Linux base system (a few hundred megabytes) in RAM without discarding any intermediate data.


          This poses some problems, but the critical optimizations that diversify the object code don't take place in the first phases (depending on the language). We could have a full IR representation before we start dropping subtrees and transforming the AST. Caching this already helps quite a bit.



          The problem here is that you assume there needs to be a certain link between on-disk formats and processing. I'm claiming that you need to rethink the whole design to speed up the whole task. One could even argue that compilation IS NOT a hard problem in need of parallelization, even now. Take mainline Linux and gcc, make defconfig: it takes 20-30 seconds with workstation processors. Do some development and recompile: two seconds. We already have better implementations (in Java land the IDEs have incremental compilers), zapcc, and so on. We also have better parsing tech: a highly optimized parser handles 1-10M lines per second, while an ANTLR-generated toy parser does 10k LOC per second. It would be nice to make it a lot faster, but is it really one of the critical problems in need of urgent fixing?
          Ok, now that we have a better understanding of each other's point of view, I can agree with:

          1.) Yeah, GCC has a lot of legacy code, but the behavior I described also happens in Clang and ICC. I agree with you that this comes with C/C++, and probably a lot of legacy code has made its way into those as well.

          2.) I agree certain steps (depending on the language) can certainly be reworked to further improve compile performance, and probably some new techniques, including those you mentioned, could be used to reduce the footprint and the synchronization problems between passes, maybe even make those atomic.

          3.) Pretty much on the same page.

          4.) I agree compilation is not really a problem right now, and I'd even assert it is fast enough that bigger issues deserve attention before investing time in compiler speed, so yeah, same page.

          The only caveat I have is that I don't think C/C++ compilation specifically can be improved much further, because some optimizations are truly hellish to implement, and as far as I know no one has found alternative ways to implement them, even in theory. Those optimizations are a big part of the reason no other language comes close to C/C++ performance in many scenarios; in other languages you can certainly get fast enough while keeping the compiler sane and fast, like Rust does with LLVM.

          Disclaimer: I know Rust can be faster at runtime than readable, partially optimized C++, but it is no match for properly optimized (though almost unreadable) C++ in many HPC cases. I'm not saying Rust won't eventually get there, since LLVM still misses optimizations that could help Rust in the future, but I'm sure compilation speed will start to suffer as it does in C++, so it will depend on whether the Rust guys find that trade-off acceptable.



          • #75
            Originally posted by jrch2k8 View Post
            Disclaimer: I know Rust can be faster at runtime than readable, partially optimized C++, but it is no match for properly optimized (though almost unreadable) C++ in many HPC cases. I'm not saying Rust won't eventually get there, since LLVM still misses optimizations that could help Rust in the future, but I'm sure compilation speed will start to suffer as it does in C++, so it will depend on whether the Rust guys find that trade-off acceptable.
            I could say the same about less readable, non-idiomatic, unsafe Rust, too. However, I think the key here is that software written in Rust consistently outperforms C++ software across the board; Rust software is regularly released that beats existing C/C++ tools with decades of a head start. The compiler forces software to be written in a more efficient manner through its rules (the borrow checker), and this also opens the door to more brute-force optimizations in more cases (thanks to the borrow checker, safety guarantees, and Cargo).

            But any issues with performance are entirely related to LLVM in Rust's case. There are plans for the compiler to target GCC in addition to LLVM at some point, which would allow a fairer comparison between Rust and C/C++ software (mainly because most benchmarks compare GCC-compiled C/C++ against LLVM-compiled Rust).



            • #76
              Originally posted by mmstick View Post
              The compiler does its job and produces a standardized IR format.
              but you can't run ir format
              Originally posted by mmstick View Post
              Why did you link the source code to LLVM?
              because your link contains it

              rust/src/ (excerpt of the directory listing from the link; the full paste of some sixty entries is trimmed here)
              Latest commit 744dd6c 4 hours ago bors Auto merge of #44066 - cuviper:powerpc64-extern-abi, r=alexcrichton
              librustc, libstd, libsyntax, ... (the compiler and standard library crates)
              llvm @ d9e7d26 Fix LLVM assertion when a weak symbol is defined in global_asm. 2 months ago

              Originally posted by mmstick View Post
              That isn't the Rust compiler.
              so it was included in rust by mistake, imbecile?
              Originally posted by mmstick View Post
              Sorry, but you're incredibly wrong. That is a thread pool. It's not a single thread but a pool of threads, and this pool of threads can execute events as they are received over time. No need to spawn threads every time a parallel calculation is needed, and no need to spawn as many threads as you have work units. No deadlocking of any form will happen.
              imbecile, not every bunch of threads is a thread pool. thread pools execute tasks, not events
              Originally posted by mmstick View Post
              Rust has no need for Boost. The things that Boost provides, Rust already has superior alternatives within its standard library, the core language, and crates within the ecosystem.
              imbecile, this contradicts your claim "In addition, you don't have to rewrite a C or C++ library in Rust in order to use that library in Rust. A number of crates in the ecosystem are actually wrappers to established C / C++ libraries". in reality you have to rewrite all valuable c++ libraries with "superior" alternatives.
              Originally posted by mmstick View Post
              Sucks for C++ though that it's not Rust. C++ is now a dying breed that won't be around for the long term.
              lol, so you will have no compiler?



              • #77
                Originally posted by caligula View Post
                Depends on the compiler and language. I know some recent compilers manage around 10-20k lines per second, and parsing the Linux kernel at that speed takes quite a while. Not all parsers (or parser generators) are that fast.
                but all optimizers are slow. and in c++ even without optimization slow part is not turning text to ast, but metaprogramming



                • #78
                  Originally posted by pal666 View Post
                  but all optimizers are slow. and in c++ even without optimization slow part is not turning text to ast, but metaprogramming
                  That's true. Optimizers have become horribly slow and complex these days. Metaprogramming doesn't need to be that slow, though. World-class C++ coders and fathers of template metaprogramming (such as Andrei Alexandrescu) have switched to D to pursue greater things with template-heavy code. Before C++ even existed, the Lisp/Scheme communities had quite clever ideas about how to do fast metaprogramming. Metaprogramming in C++ was discovered by accident, and it really shows: it doesn't take many levels of recursion with a few templates to fill RAM with template instantiations. It's just silly.



                  • #79
                    Originally posted by caligula View Post
                    World-class C++ coders and fathers of template metaprogramming (such as Andrei Alexandrescu) have switched to D to pursue greater things with template-heavy code.
                    both alexandrescu and walter bright (author of d) are busy working on the c++ standards committee. d is a testbed for new c++ features
                    Originally posted by caligula View Post
                    Before C++ even existed, the Lisp/Scheme communities had quite clever ideas about how to do fast metaprogramming.
                    but non-compiletime, i guess
                    Originally posted by caligula View Post
                    Metaprogramming in C++ was discovered by accident, and it really shows: it doesn't take many levels of recursion with a few templates to fill RAM with template instantiations. It's just silly.
                    turing-completeness of templates was discovered after the fact, but templates were designed for metaprogramming from the beginning. and now c++ has constexpr metaprogramming, and parameter packs to avoid recursion in templates
