NVIDIA Announces "RAPIDS" Open-Source Data Analytics / Machine Learning Platform


  • #11
    Exactly. CUDA offers no advantages except "we are used to it" and "the general-purpose AI libraries we use only support CUDA". Both are symptoms of unhealthy lock-in. It's time for developers to move on and adopt open standards.



    • #12
      Dropping this here:

      https://github.com/ROCm-Developer-Tools/HIP

      It is not very popular yet, but if AMD comes up with interesting offerings for AI/ML/DL, their ROCm platform could get a serious bump.
      Right now, ROCm has many "requirements" that only work on certain CPU/chipset/GPU combinations, but HIP is another story.

      HIP is basically CUDA on steroids: it mimics the CUDA API and is cross-GPU. They even went the extra mile and provide a converter for existing CUDA code.

      I only have very small projects (lab-sized experiments) that don't contain 100k+ lines, but I would be very curious to see whether someone with a more polished product could try it and give feedback on all the pros and cons.
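
      To give a feel for how close it is, here is a minimal sketch of what HIP code looks like (assuming a working ROCm/HIP install and the hipcc compiler; the kernel and buffer names are just illustrative):

      Code:
      #include <hip/hip_runtime.h>
      #include <vector>
      #include <cstdio>

      // Trivial element-wise kernel; the syntax mirrors CUDA almost 1:1.
      __global__ void saxpy(int n, float a, const float* x, float* y) {
          int i = blockIdx.x * blockDim.x + threadIdx.x;
          if (i < n) y[i] = a * x[i] + y[i];
      }

      int main() {
          const int n = 1 << 20;
          std::vector<float> hx(n, 1.0f), hy(n, 2.0f);

          float *dx = nullptr, *dy = nullptr;
          hipMalloc((void**)&dx, n * sizeof(float));   // same role as cudaMalloc
          hipMalloc((void**)&dy, n * sizeof(float));
          hipMemcpy(dx, hx.data(), n * sizeof(float), hipMemcpyHostToDevice);
          hipMemcpy(dy, hy.data(), n * sizeof(float), hipMemcpyHostToDevice);

          // hipLaunchKernelGGL is the portable launch macro
          // (plain C++ syntax, no <<<...>>> needed).
          hipLaunchKernelGGL(saxpy, dim3((n + 255) / 256), dim3(256), 0, 0,
                             n, 2.0f, dx, dy);
          hipDeviceSynchronize();

          hipMemcpy(hy.data(), dy, n * sizeof(float), hipMemcpyDeviceToHost);
          std::printf("y[0] = %f\n", hy[0]);

          hipFree(dx);
          hipFree(dy);
          return 0;
      }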



      • #13
        Yet more Python. There is so much Python around in machine learning and data science: TensorFlow, PyTorch, Chainer, scikit-learn, Dask, Ray, etc. I actually think this thing, RAPIDS, is going to find itself being an also-ran in a Python-overloaded world where all the Python people have already invested their learning time somewhere else.

        AMD's ROCm should consider moving on from Python. Julia? Maybe Swift? Even Haskell, OCaml or Scala. You need a REPL; otherwise Rust, Go, etc. would be candidates. But the point is that coming in from left field and dumping Python, which has outgrown its usefulness in this 8-32-core world, would be a big USP.
        Last edited by vegabook; 10-10-2018, 04:53 PM.



        • #14
          Originally posted by vegabook View Post
          (...) coming in from left field and dumping Python, which has outgrown its usefulness in this 8-32-core world, would be a big USP.
          Python isn't intrinsically a single-core language. It's CPython that has the GIL limiting its threading abilities. I think IronPython and Jython don't have a GIL, and PyPy also has a stackless-threads feature that allows for lightweight (in terms of memory) threading.



          • #15
            So does AMD's HIP work with this? Or does AMD have plans to provide something similar? Or is this really just like TensorFlow, in which case I don't really understand the point or the speed claims? I know AMD has added support to TensorFlow for their GPUs that use ROCm.

            Is RAPIDS meant to be better in some way than the already well-established TensorFlow, with its community providing plenty of resources like articles and GitHub projects?

            There's also ArrayFire, which provides an API that compiles optimized JIT kernels for OpenCL, CUDA and something else, I think; kinda neat.
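
            For flavor, a minimal sketch of what ArrayFire usage looks like (assuming the library is installed and linked against one of its backends; the variable names are just illustrative):

            Code:
            #include <arrayfire.h>
            #include <cstdio>

            int main() {
                af::info();  // prints the selected device/backend

                const int n = 1000000;
                af::array x = af::randu(n);   // random vector on the device
                af::array y = af::randu(n);

                // Element-wise expressions like this get fused into a single
                // JIT-compiled kernel instead of one kernel per operation.
                af::array z = 2.0f * x + y * y;

                float s = af::sum<float>(z);  // the reduction forces evaluation
                std::printf("sum = %f\n", s);
                return 0;
            }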

            Originally posted by vegabook View Post
            otherwise Rust, Go etc would be candidates.
            Rust would be nice, and IIRC I have seen someone provide a REPL for Rust on r/rust.



            • #16
              Originally posted by anarki2 View Post

              That's the dumbest thing I've read today so far (which is quite an accomplishment because I've already read several dozen articles' comments today). As if the CUDA runtime being open or not had anything to do with anything.

              Developers and researchers use CUDA because it's reliable and performant, with a huge ecosystem built around it. Something AMD can't say about OpenCL, and about their GPUs in general.

              Yeah, please do insert 9 million "works4me" comments here, the industry couldn't care less about your one man garage projects.
              It has everything to do with everything. Take a look at Justin Lebar's "CUDA is a low-level language" talk at CppCon 2016. This is what it looks like when a technology is closed: Google engineers programming against a black box and measuring the size of an undocumented IR as their only "sensible" feedback. This is what you get for investing in closed tech and depending on it: you burn dollars by the 100k to reverse engineer something that should be documented.

              You can give me all the BS you want about crying over OpenCL not holding up to its promise, but it's opinions like yours that defeat OpenCL. One-man-army projects like Sony Vegas or Adobe Photoshop... geez, what do you guys smoke? And by the way, this has nothing to do with AMD. If I want to run my stuff on Intel IGPs, ARM IGPs or multi-core CPUs of any flavor, CUDA can't help me. It's not AMD's job to create the ecosystem. They contributed quite a lot, starting with the clMath libraries (clFFT, clBLAS, clRNG, clSPARSE), open-sourcing CodeXL, and Bolt, just to name a few prominent contributions.

              And yes, it's a shame that OpenCL doesn't live up to expectations, especially when SYCL is readily available as an alternative to CUDA with a much better interface that only requires an OpenCL runtime at execution time. I've said it before and I'll say it again: if AMD had invested in upstream SYCL support in Clang instead of pursuing HCC et al., the situation would be a lot better. The single-source nature of SYCL resembles CUDA closely enough that projects like GROMACS could port from CUDA to SYCL in no time, and the lack of clFFT etc. could be filled through SYCL's OpenCL interop capability: you can create API objects from one another and consume naked OpenCL from SYCL and vice versa.
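
              For reference, here is a minimal single-source SYCL sketch in the SYCL 1.2.1 style that ComputeCpp implements (the kernel name and buffer setup are just illustrative):

              Code:
              #include <CL/sycl.hpp>
              #include <vector>
              #include <cstdio>

              int main() {
                  const size_t n = 1024;
                  std::vector<float> data(n, 1.0f);

                  cl::sycl::queue q;  // picks a default OpenCL device
                  {
                      // The buffer handles synchronisation; results are written
                      // back to the host vector when it goes out of scope.
                      cl::sycl::buffer<float, 1> buf(data.data(), cl::sycl::range<1>(n));

                      q.submit([&](cl::sycl::handler& cgh) {
                          auto acc = buf.get_access<cl::sycl::access::mode::read_write>(cgh);
                          // Host and device code live in the same translation unit,
                          // much like CUDA's single-source model.
                          cgh.parallel_for<class scale_kernel>(
                              cl::sycl::range<1>(n),
                              [=](cl::sycl::id<1> i) { acc[i] = acc[i] * 2.0f; });
                      });
                  }

                  std::printf("data[0] = %f\n", data[0]);
                  return 0;
              }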

              If you're such a big fan of closed technology, I take it you only buy G-Sync enabled TVs, which you hook up via Thunderbolt while streaming to an Intel WiDi enabled projector? No, you use HDMI, USB and Miracast, because those are f**king STANDARDS backed by alliances and consortiums.
               



              • #17
                Originally posted by Meteorhead View Post
                (...) clFFT, clBLAS, clRNG, clSPARSE (...)
                It is a bit telling about the quality of the ecosystem that you are not aware those projects are dead and superseded by the ROCm libraries.




                • #18

                  I am aware that the clMath libraries are abandoned. As I said, AMD placed their priorities elsewhere and went ROCm full throttle. The only problem is that in doing so they abandoned all the users who pushed and promoted OpenCL. rocFFT et al. are built atop their HIP runtime, which is not as portable as OpenCL. I cannot teach it to students at university, because most people only have an Intel CPU+IGP in their systems, and my colleagues could not run any code I write for the same reason, other than inside their cluster. HIP is **NOT** a replacement for OpenCL; it is meant to leverage existing CUDA codebases.

                  When SYCL was announced and Codeplay showed off their first beta implementation (SYCLONE, IMHO a much better name than ComputeCpp), our group convinced our boss it was worthwhile to become beta testers. We provided ample feedback, and still do for ComputeCpp. I even started a work project, SYCL-PRNG, to grow the ecosystem (a set of STL- and SYCL-compatible PRNGs; I take pull requests). SYCL-FFT is on our roadmap, but it's up for grabs. If every 20th research group at every 3rd university/institute did the same, SYCL would be in glorious shape. The fact that AMD is pushing ROCm in TensorFlow as well, instead of the already half-baked SYCL implementation, pretty much says everything about the openness of GPUOpen. It very much seems to me that AMD is rapidly tearing down its 'good guy' image. (The fact that the cl_khr_spir extension and the CPU runtime were silently dropped from their OpenCL implementation is very hard to forgive. There is virtually no feedback on their plans for ever restoring support for either of them. bridgman might be able to shed some light on the matter.)

                  So yes, the ecosystem is not nearly as good as CUDA's, but that's largely because people are happy in the vendor lock-in. People are happy not having choices. When Maxwell came out and crippled double-precision support compared to Fermi, solely for the sake of the consumer market and graphics efficiency, you couldn't switch to an AMD card. When Nvidia decided in their EULA that you cannot install their driver in datacenters to drive GTX cards, people decided not to care and to violate the EULA, because they didn't have the option of switching to another vendor's consumer-grade hardware. Now that everybody is going to pay for ray-tracing and tensor hardware regardless of whether they ever use it... I think you get my point. (And yes, I could insert 9 million success stories here, but nobody seems to care.)

                  CUDA and ROCm both have their place under the sun, but standards should not be sabotaged; they exist for a reason.



                  • #19
                    Originally posted by Meteorhead View Post
                    If every 20th research group at every 3rd university/institute did the same, SYCL would be in glorious shape.
                    The problem here is that, with the way science is currently financed, it is hard to run long-lived infrastructure projects. Been there, done that.

                    Meteorhead I think performance will always trump portability. OpenCL is portable, but it is not performance-portable. Properly addressing the performance-portability problem adds extra complexity to already complex problems. You are *always* vendor-locked-in to a pretty big degree if you want performance; if you don't, then why bother?

                    As an example of a proper approach, I highly recommend the author and this library: https://github.com/CNugteren/CLBlast
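
                    A rough sketch of what a CLBlast SGEMM call looks like, from memory (it assumes the cl_mem buffers and command queue already exist, and the exact signature should be checked against clblast.h):

                    Code:
                    #include <clblast.h>
                    #include <CL/cl.h>

                    // Row-major C = A * B, no transposes; error handling omitted.
                    void sgemm(cl_command_queue queue,
                               cl_mem a_buf, cl_mem b_buf, cl_mem c_buf,
                               size_t m, size_t n, size_t k) {
                        cl_event event = nullptr;
                        // CLBlast picks tuned kernel parameters for whatever device
                        // sits behind the queue, so the performance-portability
                        // work is done by the library, not by hand.
                        clblast::Gemm(clblast::Layout::kRowMajor,
                                      clblast::Transpose::kNo, clblast::Transpose::kNo,
                                      m, n, k,
                                      1.0f, a_buf, 0, k,   // A is m x k, ld = k
                                            b_buf, 0, n,   // B is k x n, ld = n
                                      0.0f, c_buf, 0, n,   // C is m x n, ld = n
                                      &queue, &event);
                        clWaitForEvents(1, &event);
                        clReleaseEvent(event);
                    }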

                    As for CUDA vs OpenCL: properly written C++11 CUDA doesn't require much porting to OpenCL, and the code should already be parametrized so it can be fine-tuned for different CUDA GPUs. One can use the better tooling (profilers, debuggers, memory checkers) in the CUDA ecosystem and then cheaply port to OpenCL.

