Nouveau Persevered In 2017 For Open-Source NVIDIA But 2018 Could Be Much Better

  • #31
    hiring nouveau devs is certainly better than helping nvidia blob (which was recent nvidia-related activity of rh)

    • #32
      The amount of misinformation in this thread is pretty staggering. Rather than try to address individual wrong comments specifically, here's how things work.

      NVIDIA produces a chip, and a reference board, but usually doesn't sell the board itself (I think they have, on occasion, but not in general). OEMs buy the chips and get access to the reference designs, which they are free to ignore. OEMs produce the boards that consumers buy.

      Now, there's a certain amount of configurability in the NVIDIA-made chip to handle variations in the way add-on components are connected. Like different RAM chips, or the fan being hooked up via GPIO or via PWM or anything else. There are thousands and thousands of possible variations. Basically every board on the market is slightly different, and sometimes even 2 copies of the same board produced some time apart are different. Only the OEM knows all the specific details, not NVIDIA (at least not in a software sense - I wouldn't be surprised if designs had to be registered with them somehow).

      This board-specific knowledge is encoded in the VBIOS. The VBIOS has 2 totally separate purposes -- one is to bring up the attached screens during bootup, so that you can see stuff during early boot (BIOS/EFI, grub, etc). This is done by software retrieved via the PCI option ROM, executing in x86 16-bit real mode (since that's how the machine comes up in a BIOS boot... might go to 32-bit directly for EFI boots, not sure), and providing services accessible via software interrupt (int 10h). This is a standardized API for setting (VGA) modes, and various other video things. Fancier software (e.g. post-1996 or so) includes VBE things too which let you access higher resolutions than VGA allows for. The EFI situation is slightly different but logically equivalent.
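
To make the "retrieved via the PCI option ROM" part concrete, here is a minimal sketch of walking the images in an expansion ROM. It assumes only the generic PCI layout (0x55 0xAA signature, a pointer at offset 0x18 to a "PCIR" data structure, and a code-type byte that separates legacy x86 BIOS images from EFI ones). The names list_rom_images and make_image are made up for illustration, the device ID 0x1234 is a placeholder, and nothing here touches the NVIDIA-specific tables.

```python
import struct

# Code types defined for PCI expansion ROM images.
CODE_TYPES = {0x00: "x86 BIOS", 0x01: "Open Firmware", 0x02: "HP PA-RISC", 0x03: "EFI"}


def list_rom_images(rom: bytes):
    """Walk the chain of images in a PCI expansion ROM and describe each one."""
    images = []
    offset = 0
    while offset + 0x1A <= len(rom):
        # Every image starts with the 0x55 0xAA signature.
        if rom[offset:offset + 2] != b"\x55\xaa":
            break
        # Offset 0x18 holds a little-endian pointer to the PCI Data Structure.
        (pcir_ptr,) = struct.unpack_from("<H", rom, offset + 0x18)
        pcir = offset + pcir_ptr
        if pcir + 0x16 > len(rom) or rom[pcir:pcir + 4] != b"PCIR":
            break
        vendor, device = struct.unpack_from("<HH", rom, pcir + 0x04)
        (length_units,) = struct.unpack_from("<H", rom, pcir + 0x10)
        code_type = rom[pcir + 0x14]
        indicator = rom[pcir + 0x15]
        images.append({
            "offset": offset,
            "vendor": vendor,
            "device": device,
            "code_type": CODE_TYPES.get(code_type, hex(code_type)),
            "size": length_units * 512,  # image length is stored in 512-byte units
        })
        # Bit 7 of the indicator marks the last image in the ROM.
        if indicator & 0x80 or length_units == 0:
            break
        offset += length_units * 512
    return images


def make_image(code_type, last, vendor=0x10DE, device=0x1234):
    """Build a synthetic 512-byte image for the demo (device ID is a placeholder)."""
    img = bytearray(512)
    img[0:2] = b"\x55\xaa"
    struct.pack_into("<H", img, 0x18, 0x40)        # PCIR lives at offset 0x40
    img[0x40:0x44] = b"PCIR"
    struct.pack_into("<HH", img, 0x44, vendor, device)
    struct.pack_into("<H", img, 0x50, 1)           # one 512-byte unit
    img[0x54] = code_type
    img[0x55] = 0x80 if last else 0x00
    return bytes(img)


if __name__ == "__main__":
    rom = make_image(0x00, last=False) + make_image(0x03, last=True)
    for image in list_rom_images(rom):
        print(image)
```

The synthetic ROM carries one legacy BIOS image followed by one EFI image, which is the "slightly different but logically equivalent" arrangement mentioned above.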

      Starting with Kepler or so, the VBIOS startup scripts initialize the GPU into its lowest performance mode -- both memory speed and core clocks. This is perfectly fine for operating the BIOS / boot software, but leaves something to be desired for e.g. gaming.

      The second purpose of the VBIOS is to provide the operating system driver with the knowledge required to operate the board properly. Some of this knowledge is provided in data tables (e.g. "fan is connected to GPIO 7"), and some of it is provided in high-level scripts which are written in a Turing-complete language. These scripts are generally a long sequence of "write value X to register Y", but can also have logic which does various things depending on values in registers or data contents of the VBIOS. The scripts are executed during resume-from-RAM (since the normal VBIOS doesn't run then), as well as at various times during regular use (like when a new display is plugged in).
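
As a rough illustration of what executing such a script looks like, here is a toy interpreter. The opcode names ("write", "skip_unless_reg", "skip_unless_table"), the register addresses, and the "ram_type" table key are all invented for this sketch; the real VBIOS script encoding is different and is not reproduced here.

```python
def run_script(script, regs, tables):
    """Execute a toy init script against a dict of registers.

    script: list of tuples, one per operation (hypothetical opcodes)
    regs:   dict mapping register address -> value, modified in place
    tables: dict standing in for board data tables from the VBIOS
    """
    pc = 0
    while pc < len(script):
        op = script[pc]
        kind = op[0]
        if kind == "write":
            # Plain "write value X to register Y".
            _, reg, value = op
            regs[reg] = value
        elif kind == "skip_unless_reg":
            # Skip the next `count` ops unless (reg & mask) == expected.
            _, reg, mask, expected, count = op
            if (regs.get(reg, 0) & mask) != expected:
                pc += count
        elif kind == "skip_unless_table":
            # Skip the next `count` ops unless a board table entry matches.
            _, key, expected, count = op
            if tables.get(key) != expected:
                pc += count
        else:
            raise ValueError(f"unknown opcode: {kind!r}")
        pc += 1
    return regs


# Hypothetical script: one unconditional write, one write that depends on
# board data, and one write that depends on the current register state.
EXAMPLE_SCRIPT = [
    ("write", 0x1000, 0x1),
    ("skip_unless_table", "ram_type", "gddr5", 1),
    ("write", 0x2000, 0xA5),
    ("skip_unless_reg", 0x1000, 0x1, 0x0, 1),
    ("write", 0x3000, 0xFF),
]

if __name__ == "__main__":
    print(run_script(EXAMPLE_SCRIPT, {}, {"ram_type": "gddr5"}))
```

The same script produces different register writes on a board whose table reports a different RAM type, which is the whole point of keeping this logic in the VBIOS rather than in the driver.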

      It doesn't make sense to talk about an open-source VBIOS since it's really just the board-specific info. One could write an open-source option ROM plugin which parses the tables and provides the int10h/vbe services, but that's about it -- the core data driving it would still have to come from the manufacturer of the board.

      Now ... firmware. Firmware can mean a lot of things, but in this case it's "software that runs on internal CPUs embedded in the NVIDIA chip". Note that I said "CPUs", not "CPU". There are a bunch of them. Each one is internally connected to various things, and as a result there are a number of different pieces of firmware, one for each CPU, each of which does totally different things. There's no board-specific knowledge in the firmware, so it's generic to all boards with a particular chip (often chip generation). On a modern GPU, these are:

      - Various video decoding processors (3 on Kepler and earlier, 1 on Maxwell+) - these control the fixed function logic that does the actual data decoding
      - Video encoding processor - this controls the fixed function logic used to perform H.264 encoding
      - Graphics engine context switcher - this is inside the graphics unit and is able to read out all of the graphics context and save it off to memory, and take another graphics context and load it from memory. This is basically required for "graphics acceleration" to work.
      - "Power Management Unit" - a hodgepodge of various services, ranging from thermal control to providing the primitives for reclocking memory. It also has access to certain information from the card regarding utilization. This unit is required for fan control, changing performance levels, etc.

      Depending on the specific generation, there can be other details, but this is the general gist of it. The nouveau team has developed full implementations of the context switch logic up to (and including) GM10x (1st gen Maxwell). The nouveau team also has implemented a PMU which is sufficient to allow the kernel driver to perform memory reclocks on Kepler/Maxwell, as well as all the fan/therm management stuff that's needed. This wasn't done by *stealing* anything, it was done entirely by the nouveau team and is included as data in the kernel, not linux-firmware.

      For the video decoding logic, we did resort to extracting the firmware from the drivers that NVIDIA ships, as the pain of implementing something like this from scratch would be extreme (REing engines that are only accessible from such CPUs is an extra level harder), while the benefit would be minimal. I wouldn't call this stealing as we just supply a script which reads some bits from the software supplied by NVIDIA, and we don't distribute the results of that script.

      So now for the "interesting" part. NVIDIA made it impossible (starting with GM200) for unsigned firmware uploaded to these CPUs to access (some of) the critical functions required for them to do their jobs. But they don't distribute the firmware, and they have made it a lot harder to extract from the driver that they ship (probably not on purpose, but that's the end result). Furthermore the loading sequence to get the proper "secure" mode is extremely convoluted and requires "secure" firmware for a number of different engines that all load one another. After a while they did put out both context switch firmware and sufficient surrounding firmware/logic to get it loaded up so that it works, but nothing else. Which lets nouveau kinda limp along.

      Furthermore, reclocking scripts to change memory speed and so on are highly dependent on specific board information (different RAM chips have slightly different properties, and probably other parameters influence it). This information is contained in the VBIOS, and is then composed into reclocking scripts by the driver based on the current GPU state and the desired target state. A developer can easily trace all the transitions for any particular board, but they will be different from board to board (or sometimes even the same board but at a different temperature). In order to get wider coverage, one can just fuzz the data in the VBIOS and see how each bit in the VBIOS affects the reclocking scripts (yes, very tedious work). From that, code can be produced that handles a wide range of boards, but even that's unlikely to capture everything. One has to repeat this with additional boards that don't work, from which one eventually gets a pretty comprehensive solution.
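
The bit-fuzzing idea can be sketched roughly as follows. Here toy_driver_trace is a made-up stand-in for "load this VBIOS and trace the register writes the driver performs", and the byte layout it reads is invented. The only part that mirrors the actual technique is map_bit_influence: flip one bit at a time and record which script entries change.

```python
def toy_driver_trace(vbios: bytes) -> dict:
    """Hypothetical black box: returns {register: value} for a given VBIOS.

    The real thing would be a trace of the driver's reclocking writes;
    this toy derives two registers from two invented fields.
    """
    timing = vbios[0] & 0x0F          # invented 4-bit timing field
    strap = (vbios[1] >> 7) & 1       # invented 1-bit strap flag
    return {0x100: timing * 2, 0x104: 0x80 if strap else 0x00}


def map_bit_influence(vbios: bytes, trace) -> dict:
    """Flip each bit of the image and record which registers change.

    Returns {bit_index: set_of_registers}; bits with no observable
    effect are left out of the result.
    """
    baseline = trace(vbios)
    influence = {}
    for bit in range(len(vbios) * 8):
        mutated = bytearray(vbios)
        mutated[bit // 8] ^= 1 << (bit % 8)
        result = trace(bytes(mutated))
        changed = {
            reg
            for reg in baseline.keys() | result.keys()
            if baseline.get(reg) != result.get(reg)
        }
        if changed:
            influence[bit] = changed
    return influence


if __name__ == "__main__":
    image = bytes([0x03, 0x00, 0xFF])
    for bit, regs in sorted(map_bit_influence(image, toy_driver_trace).items()):
        print(f"bit {bit}: affects {sorted(hex(r) for r in regs)}")
```

A single pass like this only shows which bits matter from one starting state; working out how they combine is where the tedium comes in.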

      As it happens, GM20x VBIOSes store the reclocking info in the same table formats as earlier GPUs, and the memory controllers are identical, so we can reuse the exact same logic as was developed for Kepler. And amusingly enough, the hardware access required to perform reclocking is accessible from the PMU in unsecure mode. However, Pascal has an entirely different memory controller, so a new reclocking script generator would need to be developed. Except the NVIDIA drivers now ensure that the VBIOS hasn't been tampered with (starting with GM20x), which means that script development is limited to the specific GPU used for the RE. (And it's unclear whether the reclocking access would be possible on Pascal without "secure" mode access.)

      Hopefully this covers the points of confusion people have about various stuff. As for why they're doing this ... who knows. It's officially done in the name of security, I believe, but that's probably just a smoke screen. Probably sick of fake boards being released. The end result is all the same though.
      Last edited by imirkin; 01-01-2018, 02:46 PM. Reason: Clarify that vbios verification started with GM20x, not GP10x.

      • #33
        Nice writeup... deserves to be stickied. I hadn't realized you guys had written your own PMU code for Kepler/Maxwell.

        • #34
          Originally posted by eydee
          The biggest issue with Nouveau is how distro maintainers are ignoring it. Even a latest live image of cutting-edge rolling release distros won't boot on Pascal because of Nouveau: unsupported chipset, and it's not Nouveau's fault, but of those who include outdated shit in distributions.
          I don't understand your point. This isn't specific to nouveau. If you use very recent hardware with older kernel/components, that's what happens.
          Are you saying that it's worse for nouveau than other drivers?

          • #35
            Originally posted by PluMGMK

            As far as I can tell, with the talented people working on the driver it could be done quite quickly if it were possible to get the thermal system to co-operate. But you need signed firmware to actually adjust the fan speed, so reclocking essentially can't even be attempted since it'll just fry the chip. So it comes down to NVIDIA refusing to release the firmware, as usual.
            Would reverse engineering their firmware be an impossible task? And rewriting it in FOSS form, hacking around any protections they have put on the hardware?

            Maybe that would be quite difficult and make the Nvidia bosses very upset, but it would be an amazing spectacle to see. Nvidia deserves nothing; they act like they deserve everything and contribute back practically nothing.

            EDIT: I forgot to read the comment from the Nouveau developer. It seems that reverse engineering the full firmware for all their boards and variations may be an impossible task for now. Maybe it would require so much development and research effort that it would be easier to develop a new equivalent GPU from scratch, along with the drivers for it?
            Last edited by timofonic; 01-02-2018, 08:48 PM.

            • #36
              It seems like there are 2 different questions here. The first is "Why is NVIDIA implementing firmware signing and VBIOS checks on newer hardware?" The likely answer is that it's for security, to prevent fake boards (someone taking a cheap board and modifying the VBIOS and firmware to make it appear to be a more expensive board), and for DRM (making it harder to hack or modify things to defeat it or to get around the "protected media path" on Windows or whatever).

              The second question, and the one we don't have an answer to, is why NVIDIA is so reluctant to publish the signed blobs in a form the nouveau guys can use (and why they are making it harder to pull those blobs from the binary drivers for loading onto the card).

              • #37
                Originally posted by imirkin
                As for why they're doing this ... who knows.
                Now that is easy to answer: GPU manufacturers have consistently cheated like crazy over the years, fudging drivers to deliver higher frame rates in gaming tests, where even small reductions in visual quality can give you that 10% extra frame rate boost to shine in the benchmark results.
                If nVidia opened up their existing software, people would start to see all those "per-popular-game-or-benchmark" cheats.

                • #38
                  Originally posted by imirkin
                  Hopefully this covers the points of confusion people have about various stuff. As for why they're doing this ... who knows. It's officially done in the name of security, I believe, but that's probably just a smoke screen. Probably sick of fake boards being released. The end result is all the same though.
                  You are missing something very important here. The BIOS is locked to prevent excessive overclocking that doesn't void the warranty. Since Maxwell, every NVIDIA OEM needs to submit its design back to NVIDIA for approval, and they cannot ship without signed firmware or without approved clocks and hardware. NVIDIA is artificially limiting the overclocking potential of the new cards because, given unlimited power draw, they could be overclocked quite substantially, and that would slow the obsolescence of the products. They learned their lesson around the 680/780 Ti era.
