A Nicely-Built 40-Core Raspberry Pi Cluster

  • A Nicely-Built 40-Core Raspberry Pi Cluster

    Phoronix: A Nicely-Built 40-Core Raspberry Pi Cluster

    Raspberry Pi super-computing clusters have been attempted before, but usually they don't turn out as nice as this new one that's comprised of 40 Raspberry Pi boards inside of an acrylic chassis...

    http://www.phoronix.com/vr.php?view=MTYwNTI

  • #2
    The guy obviously did it for fun... 'cause you can buy a 486 for $20 that can outperform 40 rpi's.

    • #3
      Well, sure, but a 486 won't be able to do a whole lot of concurrent operations. That's like comparing a CPU to a GPU--they're different beasts, with different purposes.

      That said, he was doing it sort-of for fun, but it started out as a thesis project. According to his blog post, he's going to use it to test distributed software.

      • #4
        I appreciate you featuring my project here!

        Regarding other comments, the processor on the Pi is approximately 20 to 40 times the speed of a 486DX2-50, depending on your metric. The cluster I built should be similar in performance to the Origin2000 systems from the late 1990s, but with slower interconnects. Applications that don't rely on having such fast interconnects should work just fine.

        I wanted to point out that, with different cards, my case design could be used with other similar boards, such as the Beaglebone Black. It might be slightly harder to set it up with Gigabit Ethernet switches (due to most 24-port Gigabit switches being too wide), but I think that should become practical in the future. This would alleviate the bottlenecking issue mentioned in the article.

        And yes, part of the purpose of this project has been the fun side of it. I also think it will be a lot more interesting to run distributed code on a hardware platform where I can see lights blinking and such as it works. But I grew up on movies that had supercomputers with huge panels of blinking lights, so I suppose that sentiment may not be universal.

        • #5
          Originally posted by droidhacker
          The guy obviously did it for fun... 'cause you can buy a 486 for $20 that can outperform 40 rpi's.
          It must be fun when you can spend $3000 for fun.

          • #6
            Originally posted by DaveG
            I appreciate you featuring my project here!

            Regarding other comments, the processor on the Pi is approximately 20 to 40 times the speed of a 486DX2-50, depending on your metric. The cluster I built should be similar in performance to the Origin2000 systems from the late 1990s, but with slower interconnects. Applications that don't rely on having such fast interconnects should work just fine.

            I wanted to point out that, with different cards, my case design could be used with other similar boards, such as the Beaglebone Black. It might be slightly harder to set it up with Gigabit Ethernet switches (due to most 24-port Gigabit switches being too wide), but I think that should become practical in the future. This would alleviate the bottlenecking issue mentioned in the article.

            And yes, part of the purpose of this project has been the fun side of it. I also think it will be a lot more interesting to run distributed code on a hardware platform where I can see lights blinking and such as it works. But I grew up on movies that had supercomputers with huge panels of blinking lights, so I suppose that sentiment may not be universal.
            I love your project. I read the PDF about the Raspberry Pi Beowulf cluster last year. Really cool stuff. Are you familiar with other devices such as the Parallella? http://www.adapteva.com/parallella-board/

            • #7
              I'd have to say, this is very well done. I just think it's a real shame you chose RPi. Spend roughly $20 more per system and you could've gone for the MK808 with a USB to Ethernet adapter. That's just about the cheapest dual core system you can get. It's too bad it only has wifi support instead of ethernet, because otherwise it'd be even cheaper for you. The major downside would be that it doesn't have any blinky LEDs on it.

              • #8
                Originally posted by schmidtbag
                I'd have to say, this is very well done. I just think it's a real shame you chose RPi. Spend roughly $20 more per system and you could've gone for the MK808 with a USB to Ethernet adapter. That's just about the cheapest dual core system you can get. It's too bad it only has wifi support instead of ethernet, because otherwise it'd be even cheaper for you. The major downside would be that it doesn't have any blinky LEDs on it.
                Wi-Fi only would make it pretty useless for what he is doing. Not sure about the USB Ethernet adapter; that might not be all that great either.

                • #9
                  Originally posted by philip550c
                  I love your project. I read the PDF about the Raspberry Pi Beowulf cluster last year. Really cool stuff. Are you familiar with other devices such as the Parallella? http://www.adapteva.com/parallella-board/
                  Thanks.

                  I'm familiar with Parallella, but I don't own one. I'll probably buy one in the next year or so. However, I need to spend time with my other mini PCs before I buy more of them.

                  Originally posted by schmidtbag
                  I'd have to say, this is very well done. I just think it's a real shame you chose RPi. Spend roughly $20 more per system and you could've gone for the MK808 with a USB to Ethernet adapter. That's just about the cheapest dual core system you can get. It's too bad it only has wifi support instead of ethernet, because otherwise it'd be even cheaper for you. The major downside would be that it doesn't have any blinky LEDs on it.
                  Thanks.

                  Regarding the MK808: For this build, I didn't want to use wireless or separate USB Ethernet adapters. But I own a GK802 and I'd like to build some kind of stick PC cluster eventually. If/when I do, I'll use the on-board wireless and limit it to more like 8 to 16 nodes. I probably won't make such an elaborate case for it though.

                  • #10
                    Originally posted by DaveG
                    Thanks.

                    I'm familiar with Parallella, but I don't own one. I'll probably buy one in the next year or so. However, I need to spend time with my other mini PCs before I buy more of them.



                    Thanks.

                    Regarding the MK808: For this build, I didn't want to use wireless or separate USB Ethernet adapters. But I own a GK802 and I'd like to build some kind of stick PC cluster eventually. If/when I do, I'll use the on-board wireless and limit it to more like 8 to 16 nodes. I probably won't make such an elaborate case for it though.
                    Man, WTF. Odroid U3 community edition is $59 and has quad 1.7 GHz Exynos cores and 2 GB of RAM ==> $14.75 per core
                    Raspberry => $35 per core

                    And you need 75% fewer SD cards, cables, switch ports.

                    Guess which one is faster?
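
The per-core arithmetic in this post can be checked with a minimal sketch. The prices ($59 and $35) and core counts (4 and 1) are the ones quoted in the post; the function names `cost_per_core` and `fraction_fewer_boards` are hypothetical, introduced here only for illustration.

```python
def cost_per_core(board_price_usd: float, cores: int) -> float:
    """Price of one board divided by its number of CPU cores."""
    return board_price_usd / cores


def fraction_fewer_boards(cores_per_board: int) -> float:
    """Fraction of boards (and so SD cards, cables, switch ports) saved
    versus single-core boards, for the same total core count."""
    return 1 - 1 / cores_per_board


# Figures as quoted in the post above.
odroid_u3 = cost_per_core(59.0, 4)
raspberry_pi = cost_per_core(35.0, 1)

print(f"Odroid U3:    ${odroid_u3:.2f} per core")
print(f"Raspberry Pi: ${raspberry_pi:.2f} per core")
print(f"Fewer boards: {fraction_fewer_boards(4):.0%}")
```

This only compares price per core; it says nothing about per-core speed, which the later posts argue about separately.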

                    • #11
                      Originally posted by caligula
                      Man, WTF. Odroid U3 community edition is $59 and has quad 1.7 GHz Exynos cores and 2 GB of RAM ==> $14.75 per core
                      Raspberry => $35 per core

                      And you need 75% fewer SD cards, cables, switch ports.

                      Guess which one is faster?
                      So, there are a few things about that:
                      • If performance/cost had been my basis for comparison, I would have put together an i7-based system with one or more fast video cards. It would have cost less and performed a lot better. A couple people would have congratulated me on building a nice gaming rig. You never would have heard about it.
                      • Even though the work on this build didn't get serious until late last year, I was buying my first RPis in June 2012. It would not have been possible to choose the Odroid U3 for anything at that time. (As far as I know, it was only just released about a month ago.)
                      • I wanted 32+ nodes in this cluster.
                      • "And the maximum order quantity is limited to one unit for one person." - I could have gotten around that, but it would have been a pain and there probably would have been a lot of shipping to pay out.

                      But if you want to build a cluster of Odroid U3s, I encourage you to do so. It sounds like it would be pretty sweet.

                      • #12
                        Originally posted by DaveG
                        So, there are a few things about that:
                        • If performance/cost had been my basis for comparison, I would have put together an i7-based system with one or more fast video cards. It would have cost less and performed a lot better. A couple people would have congratulated me on building a nice gaming rig. You never would have heard about it.
                        • Even though the work on this build didn't get serious until late last year, I was buying my first RPis in June 2012. It would not have been possible to choose the Odroid U3 for anything at that time. (As far as I know, it was only just released about a month ago.)
                        • I wanted 32+ nodes in this cluster.
                        • "And the maximum order quantity is limited to one unit for one person." - I could have gotten around that, but it would have been a pain and there probably would have been a lot of shipping to pay out.

                        But if you want to build a cluster of Odroid U3s, I encourage you to do so. It sounds like it would be pretty sweet.
                        Whatever. If you don't count cores as nodes. If you use MPI, it also works with SMP systems, as you might know. You could also have ordered two Parallella boards. The 16-core versions were already shipped to early backers. RPi is such a poor choice not because of the slow CPU but because everything in the RPi is a bottleneck. The network interface is slow, the SD card reader is slow and buggy, and there are USB power issues. On top of that, the CPU is a major pain in the neck if you need to compile your software, and you often do. The hardware is so slow you can't even use your cluster to provide any speedup when dist-compiling your software. If you had bought slightly faster ARM boards, you could speed up compilation with distcc.

                        • #13
                          Originally posted by caligula
                          Man, WTF. Odroid U3 community edition is $59 and has quad 1.7 GHz Exynos cores and 2 GB of RAM ==> $14.75 per core
                          Raspberry => $35 per core

                          And you need 75% fewer SD cards, cables, switch ports.

                          Guess which one is faster?
                          Last time I checked, you're limited to 1 odroid U3 per person. That would be kind of hard to do a cluster that way. Otherwise, I'd agree - a single U3 is probably as good as at least 8 RPis and only costs twice as much. I believe there's a way to control the "heartbeat" LED on it too.

                          As for my suggestion about the MK808 and using USB Ethernet, keep in mind that 99% of all ARM platforms with built-in Ethernet have USB-based Ethernet. Unless you don't intend to use the RPi's serial ports, I believe the Ethernet jack on it is on the same hub as the other USB ports, which could potentially slow it down further. Also, I wasn't suggesting using the built-in Wi-Fi, because that is a terrible idea.

                          • #14
                            Originally posted by schmidtbag
                            Last time I checked, you're limited to 1 odroid U3 per person. That would be kind of hard to do a cluster that way. Otherwise, I'd agree - a single U3 is probably as good as at least 8 RPis and only costs twice as much. I believe there's a way to control the "heartbeat" LED on it too.

                            As for my suggestion about the MK808 and using USB Ethernet, keep in mind that 99% of all ARM platforms with built-in Ethernet have USB-based Ethernet. Unless you don't intend to use the RPi's serial ports, I believe the Ethernet jack on it is on the same hub as the other USB ports, which could potentially slow it down further. Also, I wasn't suggesting using the built-in Wi-Fi, because that is a terrible idea.
                            You're overly generous towards the RPi. The U3 has a next-gen CPU architecture compared to the RPi. This alone makes it twice as fast in some benchmarks. The clock rate is more than twice as high. Then it has 4 times as many cores. Guess how much faster the interconnect between CPU cores is than USB-attached Ethernet? If the interconnect were 10 Gbps for the RPi, a U3 would be about as fast as 16 RPis. Since it's not, and there's congestion and overhead from MPI or other distributed protocols, I'd say a U3 is equivalent to 20-24 RPis.
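
The "16 RPis" figure in this post appears to be the product of the three factors it lists, each rounded down to a whole multiple. A minimal sketch of that arithmetic follows; the function name `ideal_rpi_equivalents` is hypothetical, and the factors are the post's own rough claims (2x architecture in some benchmarks, a bit over 2x clock, 4x cores), not measurements.

```python
def ideal_rpi_equivalents(arch_factor: float,
                          clock_factor: float,
                          core_factor: int) -> float:
    """Multiply the per-board advantages together.

    This assumes perfect scaling across cores and an ideal interconnect,
    which is the best case the post describes.
    """
    return arch_factor * clock_factor * core_factor


# Rounded factors as stated in the post: 2x architecture, 2x clock, 4 cores.
estimate = ideal_rpi_equivalents(2.0, 2.0, 4)
print(f"Ideal-interconnect estimate: {estimate:.0f} RPis per U3")
```

The jump from 16 to "20-24" for network congestion and MPI overhead is the poster's own guess; nothing in the thread gives numbers from which it could be derived.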

                            • #15
                              For the price of a 40-core Pi cluster you could easily buy 2x Intel Xeon E5 with 10 cores each, which means 20 cores in one box at over 2 GHz each. Maybe for massively parallel apps a Xeon Phi with 61 active cores would be interesting as well. Using clusters of extra-slow ARMv6 systems leads to nothing; it is just wasted money. I have nothing against ARM in general, and a Pi can be used as a low-cost (VDR) server or XBMC client or for lots of other fun projects, but it was never designed for HPC usage.
