
Patches Proceed For Disabling Radeon AGP GART, Deprecating TTM AGP

  • #21
    Originally posted by kylew77 View Post
    NetBSD, like FreeBSD and unlike OpenBSD, supports Wine, as well as many open source games.
    I have my doubts that gaming on NetBSD is viable, except for those using the proprietary NVIDIA drivers.

    Also, unless there are additional NetBSD-only open-source games, most of what's available is basic stuff, and the only well-known titles are the likes of Wesnoth and 0 A.D.

    The only perk I can see of running a modern OS on a likely 32-bit system with AGP graphics is getting newer emulators for console games, or the various open source games and the latest versions of them.
    Not a great idea. Depending on what console you are emulating, you are either wasting space and power (a Raspberry Pi can do the same job), or, for newer consoles, your processor is too weak to run the emulator properly (those emulators are usually CPU-intensive).
    For example, the Dolphin emulator needs a 64-bit system with OpenGL 3.0, and that's a tall order for an AGP-era machine.
    Last edited by starshipeleven; 17 May 2020, 04:05 PM.



    • #22
      Originally posted by agd5f View Post
      Originally posted by agd5f View Post
      I think, largely come from the fact that PCI GART (at least on the oldest radeons) only had like one TLB entry, so there were a lot more page table walks.
      That is one of the differences between the powerpc build of the PCI GART driver and the x86 one: the powerpc build used more than one TLB entry.

      Originally posted by agd5f View Post
      Both AGP GART and PCI GART are handled by architecture independent drivers.
      This is not true. The AGP GART driver is not architecture independent in the Linux kernel; it has only been built on x86 platforms for over two decades now. PCI GART is over 95% platform generic, but there are differences, like the allowed TLB count, that should go away for code simplicity: there is no major technical reason today why x86 locked itself to one TLB entry, and the same goes for other quirks like it. Migrating the PCI GART code to the powerpc allowances would be a good thing.

      You can think of the single TLB entry like the big kernel lock of old. In fact the two are somewhat related.

      Originally posted by agd5f View Post
      The AGP bridge drivers for AGP GART and the GPU drivers for PCI(e) GART.
      This patch is ripping code out of the graphics driver. AGP bridge drivers don't have to have AGP GART support; the powerpc ones don't, for example.

      Yes, the newer AGP cards based on a PCIe chipset with an AGP-to-PCIe bridge chip will also use more than one TLB entry out of the box on x86, because they use the PCIe GART code, not the PCI GART code.

      If the TLB entry limit in the x86 build of PCI GART is fixed to match what the powerpc build does, there really should not be any major performance drop. Of course this fix is not needed for the PCIe-based AGP cards with bridge chips, which is why some AGP cards will see a performance hit and others won't.

      Originally posted by agd5f View Post
      In fact, you can (and probably should) use both at the same time in order to support both cached and uncached system memory.
      This is where AGP GART goes horribly wrong. PCI GART is safe with both cached and uncached system memory. AGP GART turns out not to be safe when CPU cache write-backs hit the same area of memory it is interfacing with. There are reasons why the AGP GART code had to be CPU- and chipset-specific instead of pure platform-neutral generic code.

      PS: one of the reasons the powerpc PCI GART could use more than one TLB entry is that, with no AGP GART, it could use areas of the GPU MMU that would otherwise have been reserved for AGP GART for generic PCI use. So you really cannot have PCI GART and AGP GART both performing at their absolute maximum at the same time. For AGP GART to perform at its maximum you have to cripple PCI GART (lock it to one TLB entry) or disable it. The reverse is also true: for PCI GART to perform at its maximum, you need to disable AGP GART so PCI GART has more than one TLB entry to play with.
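
      To make the TLB point concrete, here is a toy model (purely illustrative, not actual radeon or agpgart driver code; all names are made up) of a GART whose translations are cached in a small TLB. With a single entry, alternating between two buffers forces a page-table walk on every access; with two entries the walks stop after warm-up:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define GART_PAGES 64
#define TLB_MAX    8

/* Toy GART: page_table maps GPU virtual page -> bus address. */
struct gart {
    uint64_t page_table[GART_PAGES];
    size_t   tlb_size;            /* 1 on the old x86 build, more on powerpc */
    size_t   tlb_tag[TLB_MAX];    /* cached virtual page numbers */
    uint64_t tlb_val[TLB_MAX];    /* cached translations */
    size_t   next;                /* round-robin replacement cursor */
    unsigned walks;               /* page-table walks performed so far */
};

static void gart_init(struct gart *g, size_t tlb_size)
{
    g->tlb_size = tlb_size;
    g->next = 0;
    g->walks = 0;
    for (size_t i = 0; i < TLB_MAX; i++)
        g->tlb_tag[i] = SIZE_MAX;             /* mark entry empty */
    for (size_t i = 0; i < GART_PAGES; i++)   /* arbitrary backing pages */
        g->page_table[i] = 0x80000000ull + ((uint64_t)i << PAGE_SHIFT);
}

/* Translate a GPU virtual address; on a TLB miss, "walk" the page table. */
static uint64_t gart_translate(struct gart *g, uint64_t va)
{
    size_t   vpn = (size_t)(va >> PAGE_SHIFT);
    uint64_t off = va & ((1ull << PAGE_SHIFT) - 1);

    for (size_t i = 0; i < g->tlb_size; i++)
        if (g->tlb_tag[i] == vpn)
            return g->tlb_val[i] + off;       /* TLB hit: no walk needed */

    g->walks++;                               /* TLB miss: page-table walk */
    g->tlb_tag[g->next] = vpn;
    g->tlb_val[g->next] = g->page_table[vpn];
    g->next = (g->next + 1) % g->tlb_size;
    return g->page_table[vpn] + off;
}
```

      With tlb_size = 1, ping-ponging between two pages walks the table on every single access; with tlb_size = 2, only the first two accesses walk. That is the whole performance argument in miniature.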
      Last edited by oiaohm; 17 May 2020, 06:47 PM.



      • #23
        Originally posted by eydee View Post

        Do you actually use it for gaming?
        No, not really; I use that old machine mainly as a netbook.



        • #24
          Originally posted by DeepDayze View Post

          No, not really; I use that old machine mainly as a netbook.
          I don't think you'll notice anything, then. Even in a game, it's very unlikely that GPU could saturate an AGP bus. On a desktop you'd probably be fine with any type of connection.



          • #25
            Originally posted by starshipeleven View Post
            I have my doubts that gaming on NetBSD is viable, except for those using the proprietary NVIDIA drivers.

            Also, unless there are additional NetBSD-only open-source games, most of what's available is basic stuff, and the only well-known titles are the likes of Wesnoth and 0 A.D.

            Not a great idea. Depending on what console you are emulating, you are either wasting space and power (a Raspberry Pi can do the same job), or, for newer consoles, your processor is too weak to run the emulator properly (those emulators are usually CPU-intensive).
            For example, the Dolphin emulator needs a 64-bit system with OpenGL 3.0, and that's a tall order for an AGP-era machine.
            I was thinking more of people who do NES and SNES emulation, and emulation of 1980s computers like the Amiga. 21st-century consoles aren't going to be emulated well on hardware from around 2000. NetBSD doesn't even have a binary NVIDIA driver; only FreeBSD does, and it cuts old cruft just like Linux does. There is a cult group that does gaming on OpenBSD, which even has a subreddit on Reddit devoted to OpenBSD gaming, but that project cuts out old cruft too, so I could see it adopting this change.

            I also completely agree that a Raspberry Pi 4 is going to be more powerful than a Pentium 4 while using less power, but that gets into CapEx vs OpEx costs. Some people, heck many people I know, would rather spend more money on power every month than move to a new system and invest in new hardware. My mother ran a P4 system until the wheels fell off, and I moved her to a Phenom II system with way more power.



            • #26
              Originally posted by kylew77 View Post
              I was thinking more of people who do NES and SNES emulation, and emulation of 1980s computers like the Amiga. 21st-century consoles aren't going to be emulated well on hardware from around 2000.
              https://github.com/MiSTer-devel/Main_MiSTer/wiki
              Really, when you are talking about running NES/SNES and other 1980s systems as close to perfectly as possible today without the legacy hardware, you are not normally talking about a PC. Instead you use something like this MiSTer, whose main board is 130 USD and which comes in under 500 USD with all the add-on boards. Why is it so good? It is an FPGA that can reproduce a very close approximation of the circuits those old systems had, so there is zero performance jitter.

              x86 CPUs, due to SMM and other things, have a lot of jitter in performance. Even the Raspberry Pi 4 with cooling has quite a high jitter value, though way lower than any x86 processor that would have an AGP slot. The last Intel x86 CPU without major jitter issues was the 486DX4-100.

              Those old 1980s consoles are fixed-clock and zero-jitter, so they are next to impossible to reproduce properly on an x86 PC of any time-frame. The Raspberry Pi 4 is not ideal but gets closer.

              Basically, just because something is old does not mean it is easy to emulate.

              Originally posted by kylew77 View Post
              I also completely agree that a Raspberry Pi 4 is going to be more powerful than a Pentium 4 while using less power
              This is one of those horrible comparisons. Pit the Raspberry Pi 4 against a brand new x86 computer on how well it can reproduce the emulated behaviour of an old 1980s console, and the Raspberry Pi 4 will be closer to correct. Old, weaker x86 processors with AGP will be not better but worse again.

              Old PC games that expect jitter, the old PC systems do handle well. But the idea that you can recycle an old x86 PC into a decent, functional old-console emulator is one people have, and it really does not work as well as they expect. In a lot of cases they think the ROM they have is broken somehow because a game from the console is unplayable, when the issue is really the jitter problems. Yes, it shows up at times: colours and other effects that are produced by frame switching no longer display correctly, and other gameplay-affecting things go wrong.

              I have seen the stupid case where a person had a ROM they thought was broken, because the gameplay logic appeared to screw up randomly on their x86 system, while the same emulator built for ARM ran it fine on a Raspberry Pi 3.

              The jitter in performance caused by dynamic clocks, SMM and other things on x86 is more of a problem than many expect when you get into really old console emulation.

              The zero performance jitter of those old 1980s consoles is a feature that is really easy to forget: their game ROMs were designed to expect it, and it is not exactly simple to reproduce. Yes, our modern x86 and ARM CPUs do speculative execution and other things, and yes, this has come at the price of predictable execution timing, introducing a jitter factor that those old consoles just don't have and that their programs were not built for.



              • #27
                Originally posted by oiaohm View Post
                That is one of the differences between the powerpc build of the PCI GART driver and the x86 one: the powerpc build used more than one TLB entry.

                This is not true. The AGP GART driver is not architecture independent in the Linux kernel; it has only been built on x86 platforms for over two decades now. PCI GART is over 95% platform generic, but there are differences, like the allowed TLB count, that should go away for code simplicity: there is no major technical reason today why x86 locked itself to one TLB entry, and the same goes for other quirks like it. Migrating the PCI GART code to the powerpc allowances would be a good thing.

                You can think of the single TLB entry like the big kernel lock of old. In fact the two are somewhat related.

                This patch is ripping code out of the graphics driver. AGP bridge drivers don't have to have AGP GART support; the powerpc ones don't, for example.

                Yes, the newer AGP cards based on a PCIe chipset with an AGP-to-PCIe bridge chip will also use more than one TLB entry out of the box on x86, because they use the PCIe GART code, not the PCI GART code.

                If the TLB entry limit in the x86 build of PCI GART is fixed to match what the powerpc build does, there really should not be any major performance drop. Of course this fix is not needed for the PCIe-based AGP cards with bridge chips, which is why some AGP cards will see a performance hit and others won't.

                This is where AGP GART goes horribly wrong. PCI GART is safe with both cached and uncached system memory. AGP GART turns out not to be safe when CPU cache write-backs hit the same area of memory it is interfacing with. There are reasons why the AGP GART code had to be CPU- and chipset-specific instead of pure platform-neutral generic code.

                PS: one of the reasons the powerpc PCI GART could use more than one TLB entry is that, with no AGP GART, it could use areas of the GPU MMU that would otherwise have been reserved for AGP GART for generic PCI use. So you really cannot have PCI GART and AGP GART both performing at their absolute maximum at the same time. For AGP GART to perform at its maximum you have to cripple PCI GART (lock it to one TLB entry) or disable it. The reverse is also true: for PCI GART to perform at its maximum, you need to disable AGP GART so PCI GART has more than one TLB entry to play with.
                I'm not sure where to begin with this. I think there is a bit of a misunderstanding here about how this works. The number of TLB entries is an aspect of the relevant hardware, e.g., the AGP chipset on the motherboard or the GPU itself. The hardware is what it is. If you plug a GPU into an x86 box or a PowerPC box, it has the same number of TLB entries.

                Also, PCI(e) requires coherency with system memory, so by default all devices need to support CPU cache snooping in whatever mechanism they provide for DMA. Devices can also support non-coherent operations, but support for that is platform dependent: x86 generally supports it; other architectures do not necessarily. The earliest PCI GART on older radeons only supported coherent operations (i.e., cache snooped). It was expected that for unsnooped operations you'd use AGP. The extra overhead of cache snooping may also have an impact on performance in some cases relative to non-snooped. It could also be that, internally, the GPU has less bandwidth to memory via the cache-coherent interfaces. The newer versions of the MMU on the GPU (added to support PCIe) added support for both snooped and unsnooped transactions.

                As for AGP, the way it works is that the AGP chipset provides an aperture in the CPU's address space of a specific size (usually specified in the sbios). This is basically a scatter/gather aperture. Devices (any device on the system) can access this aperture as if it were a linear range of addresses, and the page tables that control the aperture translate the accesses to the actual physical pages that back it. The Linux kernel used the AGP aperture as a limited IOMMU on some platforms because that is basically what it is. From the GPU's perspective, you just point the GPU at this aperture and it assumes it's a linear aperture of memory. On radeons, AGP is just an aperture in the GPU's internal address space which points to a physical bus address. The driver points this aperture at the AGP aperture in the CPU's address space, but it doesn't have to; you can point it at any contiguous region of uncached system memory.
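
                The scatter/gather aperture described above can be sketched in a few lines (purely illustrative pseudo-driver code, not the real agpgart interface; the function names are made up): bind scattered physical pages into a page table, and any device access at a linear aperture offset resolves through it:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE      4096u
#define APERTURE_PAGES 16

/* The GART page table: one entry per aperture page, each pointing at a
 * (possibly scattered, non-contiguous) physical page of system memory. */
static uint64_t aperture_pt[APERTURE_PAGES];

/* Bind scattered physical pages so the aperture appears linear. */
static void gart_bind(const uint64_t *phys_pages, size_t n)
{
    for (size_t i = 0; i < n && i < APERTURE_PAGES; i++)
        aperture_pt[i] = phys_pages[i];
}

/* What the chipset does on a device access into the aperture:
 * translate the linear offset to the backing physical address. */
static uint64_t gart_resolve(uint64_t aperture_off)
{
    size_t page = (size_t)(aperture_off / PAGE_SIZE);
    assert(page < APERTURE_PAGES);
    return aperture_pt[page] + aperture_off % PAGE_SIZE;
}
```

                A device pointed at the aperture sees one contiguous range even though the pages behind it are scattered; that is also why the kernel could press the AGP aperture into service as a limited IOMMU.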



                • #28
                  Originally posted by jabl View Post
                  How do modern (PCIe) GPUs work? Do they all use a local IOMMU?
                  On radeons at least, there is a device-specific MMU in the GPU that works independently of the underlying bus (PCI, AGP or PCIe).
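
                  As a rough sketch of that idea (illustrative names only, not the actual radeon register layout): the GPU's MMU decodes its own internal address space, and only the target ranges differ; the host bus type never enters into the decision:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative GPU-internal address decode: the GPU's own MMU decides,
 * from its internal address map, whether an access targets VRAM, the
 * AGP/system aperture, or a per-page GART translation. Nothing in this
 * decision depends on whether the host bus is PCI, AGP or PCIe. */
enum gpu_target { TGT_VRAM, TGT_AGP_APERTURE, TGT_GART_SYSTEM };

struct gpu_mmu {
    uint64_t vram_base, vram_size; /* internal window onto local VRAM */
    uint64_t agp_base, agp_size;   /* window pointed at a bus address */
};

static enum gpu_target gpu_decode(const struct gpu_mmu *m, uint64_t addr)
{
    if (addr >= m->vram_base && addr < m->vram_base + m->vram_size)
        return TGT_VRAM;
    if (addr >= m->agp_base && addr < m->agp_base + m->agp_size)
        return TGT_AGP_APERTURE;
    return TGT_GART_SYSTEM;        /* fall through to GART page tables */
}
```

                  Swapping the bus underneath only changes where the aperture window physically points, not this decode logic.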
                  Last edited by agd5f; 18 May 2020, 12:41 PM. Reason: clarify



                  • #29
                    Dell Inspiron 8100 with Intel 82815 PCI/AGP chipset.

                    Linux Kernel 5.4.38

                    PCI GART, WITHOUT AGP
                    CPU Usage: 14-17%
                    FPS: ~59
                    NOTE: after ~5 seconds a hard freeze/lock occurs, sometimes escapable via the kernel SysRq keys (sync/unmount/reboot), but most other times it is a hard freeze.


                    WITH AGP
                    CPU Usage: 8.5-12.5%
                    FPS: ~59

                    If these patches fly through without any testing, I can all but guarantee Intel chipsets will freeze using PCI GART, including the wonderfully stable and renowned 440BX chipset motherboards.

                    If one reads the kernel help text, only AGP video cards with excessive memory are affected by the memory-related bugs.

                    Some of you complaining, trying to make people replace their stable, well-cared-for hardware, can rest assured that all of your government tax dollars are likely well cared for too ... being tossed out the window on million-dollar toilet seats. I don't know why there's a recent rash of people, coding within the past ~5-10 years, who have no respect for other people's property or computers. If it's not broken, it's probably best to leave it alone if you're not going to fix the bugs you introduce. Shrugs; at this rate I may consider buying a Mac laptop instead, in the future. At least I'll know I'll have a snapshot of an OS working 10-20 years down the road.



                    • #30
                      It's really a non-issue. Intel and AMD basically dropped support for AGP on their platforms after 2005, and the last AGP video cards that were even remotely performant came out in 2008! We have people here complaining that they won't be able to run the latest kernel on their 20 year old Pentium 3 laptops! Those systems are so old that they would absolutely struggle with any kind of modern distribution. Even browsing the web these days isn't pleasant on a mid 2000s system, let alone one with hardware from 1999/2000.

                      If you have old, decrepit hardware, just use older software and operating systems. It's really that simple. You didn't see people in 2000 whining that Debian 2.2 didn't fully support their IBM PC 5150 from '81 - that's the equivalent time-span here.

