PyPy 2.6 Released, ~7x Faster Than CPython


  • PyPy 2.6 Released, ~7x Faster Than CPython

    Phoronix: PyPy 2.6 Released, ~7x Faster Than CPython

    Version 2.6 of the PyPy JIT-compiler-based interpreter for Python has been released. PyPy 2.6 brings some Python compatibility improvements along with NumPy improvements and preliminary support for a new lightweight statistical profiler...

    http://www.phoronix.com/scan.php?pag...y-2.6-Released

  • #2
    I have no problem believing that they are 7 times faster than CPython, but I don't believe that that's a cold-start benchmark...
    Generally the way JIT works is that it gets re-compiled and optimized in the background the longer you are running the script, right? So running a Python program for the first time, surely CPython would be faster until you've been using the program for almost an hour, then PyPy could start pulling ahead...
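
One way to check the warm-up question above is to time the same hot function over several rounds under both interpreters. The following is a minimal sketch, not a benchmark from this thread: `hot_loop`, `time_rounds`, and the loop sizes are made up for illustration. On a JIT-based interpreter such as PyPy one would expect the first round or two to be slower than the rest, while CPython should stay roughly flat; how long warm-up really takes depends on the workload.

```python
import timeit


def hot_loop(n):
    """Tight pure-Python loop, the kind of code a tracing JIT targets."""
    total = 0.0
    for i in range(n):
        total += (i % 7) * 0.5
    return total


def time_rounds(rounds=5, n=200000):
    """Return the wall-clock seconds taken by each repeated call to hot_loop."""
    timings = []
    for _ in range(rounds):
        start = timeit.default_timer()
        hot_loop(n)
        timings.append(timeit.default_timer() - start)
    return timings


if __name__ == "__main__":
    # Hypothetical sizes; raise n if every round finishes too quickly to compare.
    for index, seconds in enumerate(time_rounds(), start=1):
        print("round {}: {:.2f} ms".format(index, seconds * 1000.0))
```

Running the same file once with `python` and once with `pypy` gives a per-round view rather than a single total, which is what the cold-start question is about.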



    • #3
      As someone who likes to write in Python, I really like the direction PyPy is heading in, but it's much better practice to develop in CPython first and optimize for that. In other words, as a developer, don't depend on PyPy for good performance.



      • #4
        Here are my thoughts after using PyPy extensively for the last 6 months as a grad student working on a PhD in wireless communications signal processing. Previously I had a job for 5 years doing a lot of Matlab, Octave, C, CUDA, and Verilog work, so I have a pretty good breadth of experience to compare against.

        My code is generally a lot of tight nested for loops using mathy things to solve nasty signal processing problems. ... Basically I use a lot of NumPy.

        1) Matlab is slow unless you can vectorize. My application can't be vectorized, so Matlab is generally 10-100x slower than C code. CPython is 2-10x slower than Matlab (depends a lot on what modules you're using). Matlab costs several thousand dollars. I've 100% converted to Python so that I can do work independently without license hassle. I believe that the only reason anyone should use Matlab anymore is if they have to collaborate with a community that is fully invested in Matlab. I've never found the extra Matlab libraries very useful, though. Generally the algorithms they provide are very common and easy to write on your own.

        2) Putting Matlab in an embedded system for a quick prototype is doable but weird and heavy. Doing it with Python is amazingly easy and allows you to do many scripting tasks that get the whole system working much quicker/easier.

        3) In non-vectorized situations, PyPy is 2-4x faster than Matlab.

        4) PyPy is often 13x faster than CPython. The only times that isn't true in my application are when I'm using a NumPy function that is 'heavy' with lots of safety features, conversions, and checks. For example, the dot product function will severely slow down your code if it is called on only a few elements at a time. After a while you get a sense of which functions these are and just hand-code them or create a library of replacements. This is constantly getting better, though. The custom NumPy branch used by PyPy is relatively new.

        5) PyPy is actually slower on code that isn't repeated (maybe 2x, but that generally consumes <1% of my running time so it's irrelevant). Yes, there is a penalty when the JIT decides to compile a time-consuming function (about 100 ms). But PyPy is awesome at recognizing these hotspots quickly. I'm seeing speedups basically immediately, within the first second. I usually get the full speedup on the second iteration.

        6) I've had some contests in the lab, and I've gotten PyPy to within just 2x of a coworker's C# implementation. More importantly, I had the code done WAY faster because of the niceties of developing in a scripting language.

        7) WARNING: NumPy inside of PyPy is still buggy, but they're making steady progress.

        8) WARNING: plotting doesn't work in PyPy. So do development in Python/IPython. When running long simulations, save the results to a pickle file, then open the pickle file in Python/IPython to plot.
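
The save-then-plot workflow from point 8 could look roughly like the sketch below. The helper names, the file name, and the result values are placeholders invented for illustration, not anything taken from the post. Pickle protocol 2 is chosen here on the assumption that the file may need to be read by both Python 2 and Python 3 interpreters.

```python
import os
import pickle
import tempfile


def save_results(results, path):
    """Write simulation results to a pickle file (run this part under PyPy)."""
    with open(path, "wb") as handle:
        # Protocol 2 is readable by both Python 2.x and Python 3.x.
        pickle.dump(results, handle, protocol=2)


def load_results(path):
    """Read the results back (run this part under CPython/IPython to plot)."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


if __name__ == "__main__":
    # Placeholder data standing in for the output of a long simulation.
    results = {"x": [0, 1, 2], "y": [0.5, 0.25, 0.125]}
    path = os.path.join(tempfile.gettempdir(), "sim_results.pkl")
    save_results(results, path)
    restored = load_results(path)
    print(restored == results)
    # In CPython, a plotting library of your choice would take restored["x"]
    # and restored["y"] from here; that step is left out of this sketch.
```

Only load pickle files you produced yourself, since unpickling untrusted data can execute arbitrary code.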

        I'm even doing FPGA and silicon design work using MyHDL now.

        Hope I've convinced a few of you to try it out :-)
        Last edited by jchedstrom; 01 June 2015, 04:12 PM.
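
As an illustration of point 4 in the post above (hand-coding a replacement for a 'heavy' library call when only a few elements are involved), a plain-loop dot product might look like this. It is a hedged sketch: `dot_small` is a hypothetical helper, not the poster's code, and whether it beats the library routine depends on the interpreter, the library version, and the vector length, so it is worth timing before adopting.

```python
def dot_small(a, b):
    """Dot product of two short, equal-length sequences as a plain loop.

    Hypothetical replacement for a general-purpose dot routine when the
    inputs are tiny and per-call overhead dominates the arithmetic.
    """
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    total = 0.0
    for x, y in zip(a, b):
        total += x * y
    return total


if __name__ == "__main__":
    print(dot_small([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))
```

A loop like this is the style of code a JIT can specialize once it is hot, which fits the observation in the post that tight loops are where PyPy does best.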



        • #5
          I'd like to see whether PyPy makes emerge's dependency calculations faster, once I understand how to set up Python correctly on Gentoo, that is. Could take months.



          • #6
            I'm still wondering why you'd want to use Python. JavaScript is the language of the future and for numerics there's Fortran, Julia, heck even C++. Python is a sucky choice. If you know how to program, the compiled alternatives are just ok. And if you need a dynamic language, Julia is the solution. It's much faster than toy Python.

            Some benchmarks here:
            http://julialang.org/
            Last edited by caligula; 01 June 2015, 09:43 PM.



            • #7
              Originally posted by caligula
              I'm still wondering why you'd want to use Python. JavaScript is the language of the future and for numerics there's Fortran, Julia, heck even C++. Python is a sucky choice. If you know how to program, the compiled alternatives are just ok. And if you need a dynamic language, Julia is the solution. It's much faster than toy Python.
              I have already explained to you many times that speed is not the only concern when choosing a language for numerics; in fact, it probably isn't even in the top 10 concerns. The fact that people are still using MATLAB is proof of that. And so far predictions about "the language of the future" have a very bad track record. But maybe your crystal ball is better than everyone else's.

              I notice you don't post these troll comments on the Octave announcements. Why are you so hung up on Python? Is it because Python is still so much more popular than your language of choice, Julia?



              • #8
                Originally posted by TheBlackCat
                I have already explained to you many times that speed is not the only concern when choosing a language for numerics; in fact, it probably isn't even in the top 10 concerns. The fact that people are still using MATLAB is proof of that. And so far predictions about "the language of the future" have a very bad track record. But maybe your crystal ball is better than everyone else's.

                I notice you don't post these troll comments on the Octave announcements. Why are you so hung up on Python? Is it because Python is still so much more popular than your language of choice, Julia?
                Octave, R, Matlab, Mathematica and others have a long history. There are good legacy reasons to use them even if the library design or language runtime isn't any good. Python is the cool new kid on the block. It doesn't have any track record in this domain, it's a toy scripting language and now also has an identity crisis (2.x vs 3.x). Since a lot is still missing from Python, it would be crucial to act now and switch to better languages. Python is probably the worst of the choices we have these days for this job. Perhaps only PHP is even worse.



                • #9
                  Originally posted by caligula
                  Octave, R, Matlab, Mathematica and others have a long history. There are good legacy reasons to use them even if the library design or language runtime isn't any good. Python is the cool new kid on the block. It doesn't have any track record in this domain,
                  Python has been around for more than 25 years, older than R and only a few years younger than Octave. Its numerical computing capabilities go back 20 years, only 2 years younger than R. It is hardly "the cool new kid on the block".

                  Originally posted by caligula
                  it's a toy scripting language
                  By definition, if it were a toy language it wouldn't be so much more popular than Julia.

                  Originally posted by caligula
                  Since a lot is still missing from Python, it would be crucial to act now and switch to better languages.
                  And that is the real issue here. The open-source numerical computing community has aligned itself behind Python, not your language of choice, Julia. You don't like that, so you spread FUD about Python every chance you get. Whether your hatred of Python is petty jealousy, or a concern that the community's focus on Python is limiting the developers available to Julia, I don't know. The end result is the same: this pointless trolling on a forum almost nobody involved in the field actually reads.



                  • #10
                    Originally posted by TheBlackCat
                    Python has been around for more than 25 years, older than R and only a few years younger than Octave. Its numerical computing capabilities go back 20 years, only 2 years younger than R. It is hardly "the cool new kid on the block".


                    By definition, if it were a toy language it wouldn't be so much more popular than Julia.


                    And that is the real issue here. The open-source numerical computing community has aligned itself behind Python, not your language of choice, Julia. You don't like that, so you spread FUD about Python every chance you get. Whether your hatred of Python is petty jealousy, or a concern that the community's focus on Python is limiting the developers available to Julia, I don't know. The end result is the same: this pointless trolling on a forum almost nobody involved in the field actually reads.
                    I understand you're terribly worried about the quality of discussion in these forums. However, I don't understand why you're so much against alternative open source technologies. I think Python already has so much momentum behind it that a single person can't stop it. Maybe some people are happy with it. However, it's far from an optimal platform for numeric programming at the moment. Julia is just one example of the faster-than-Python languages, and it already has libraries built for numerics. If we give it more time, it will grow and become more ubiquitous. I guess it's hard to convince you that execution speed is also a valuable feature. It feels funny to even argue the point in a thread that's all about optimizing Python. Why is it ok for you to spend resources optimizing Python, but when people discuss other languages, it becomes trolling? Julia is backed by solid research and academic wisdom. It's a toy language only if you consider language popularity. This should be obvious considering how recent a language it is. It's already faster than PyPy, and obviously the direction PyPy has taken is to copy the ideas already presented in that other language.
