How to tell if a driver is gallium or just mesa? (Slow rendering with radeon)


  • #21
    It is faster than Firefox by a measurable margin in my experience, and I never use its start page. In any case, since it is closed while I test 3D, it has no effect on performance here.

    With Extreme Tux Racer I see 90-100% CPU, but only for that executable. With glxgears I see a stable 25% for glxgears. In both cases there is a minor 0.3% or usually less for dwm and sometimes for Xorg.

    I feel the AGP acceleration is simply missing and I have no idea why. The code path in the driver and the other information presented above now strongly indicate this. :-(

    PS.: When I say "it is faster than Firefox by a measurable margin in my experience" I am talking only about my machine and maybe single-core machines. It seems the mainstream browsers are all moving towards a multi-core performance focus, but this one still performs well here, and that is why I am using it.
    Last edited by prenex; 05-21-2019, 10:10 AM.



    • #22
      Actually this is how top + glxgears looks if palemoon is still open:

      Code:
      top - 16:12:20 up  2:21,  5 users,  load average: 1,21, 1,07, 1,02
      Tasks:  83 total,   1 running,  79 sleeping,   1 stopped,   2 zombie
      %Cpu(s): 69,9 us, 15,2 sy,  0,0 ni, 14,9 id,  0,0 wa,  0,0 hi,  0,0 si,  0,0 st
      MiB Mem :   1380,8 total,    167,0 free,    407,9 used,    805,9 buff/cache
      MiB Swap:    988,3 total,    985,6 free,      2,8 used.    780,9 avail Mem
      
        PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND                                  
        729 prenex     7   0  149476  51896  44604 S  29,7   3,7   0:07.36 glxgears                                
        627 prenex     1   0  930236 316900  92848 S  20,5  22,4  20:24.36 palemoon                                
        349 prenex     1   0  161272  40084  22872 S   7,6   2,8   3:13.92 Xorg                                    
        361 prenex     1   0   22796  11928   8256 S   0,3   0,8   0:01.76 xterm                                    
        372 prenex     1   0    9096   2732   2400 S   0,3   0,2   0:00.99 dwm                                      
          1 root       1   0   16952   4032   3364 S   0,0   0,3   0:03.01 systemd
      I still have the 10 tabs open, and if I move to another virtual screen and run two terminals with glxgears + top there, this is what I get, and the frame rate is exactly the same as when I run it without the browser open. Extreme Tux Racer is barely slower than without the browser open; it is only measurably slower if I force the llvmpipe software renderer. In the latter case it seems to matter whether the browser is open or not, but with the radeon driver there seems to be no big impact.

      I am getting more and more sure that the AGP acceleration is missing for some reason and that this is the key. I just haven't found out why... I should have suspected it when I wrote a "16x" acceleration value in xorg.conf and the driver did not become unstable, as earlier 8x was the highest value it ran stably with. I remember that when I set 8x things became faster too on my earlier system, so I am sure it was working once. Now it seems only the other settings from my xorg.conf are honored, even though I have the ati-agp and agp kernel modules built. This is weird to me.

      dmesg does show the first line of the AGP support starting up, but nothing more, and there should be either errors or info messages there... this bugs me...
      Last edited by prenex; 05-21-2019, 10:20 AM.



      • #23
        Look what others get from dmesg:
        Code:
        ...
        Sep 19 11:29:54 Debian-G5 kernel: pmac_zilog: 0.6 (Benjamin Herrenschmidt <[email protected]>)
        Sep 19 11:29:54 Debian-G5 kernel: Linux agpgart interface v0.103
        Sep 19 11:29:54 Debian-G5 kernel: agpgart-uninorth 0000:f0:0b.0: Apple U3 chipset
        Sep 19 11:29:54 Debian-G5 kernel: agpgart-uninorth 0000:f0:0b.0: configuring for size idx: 64
        Sep 19 11:29:54 Debian-G5 kernel: agpgart-uninorth 0000:f0:0b.0: AGP aperture is 256M @ 0x0
        ...
        I do not get anything related to AGP except the first line about the agpgart interface version (which is the same for me)!

        Btw this is the code that should have been running; I found it in the kernel source code...
        (I marked the relevant part with LOOK AT HERE and a lot of stars *****):

        Code:
        // drivers/char/agp/ati-agp.c
        static int agp_ati_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
        {
            struct agp_device_ids *devs = ati_agp_device_ids;
            struct agp_bridge_data *bridge;
            u8 cap_ptr;
            int j;
        
            cap_ptr = pci_find_capability(pdev, PCI_CAP_ID_AGP);
            if (!cap_ptr)
                return -ENODEV;
        
            /* probe for known chipsets */
            for (j = 0; devs[j].chipset_name; j++) {
                if (pdev->device == devs[j].device_id)
                    goto found;
            }
        
            dev_err(&pdev->dev, "unsupported Ati chipset [%04x/%04x])\n",
                pdev->vendor, pdev->device);
            return -ENODEV;
        
        found:
            bridge = agp_alloc_bridge();
            if (!bridge)
                return -ENOMEM;
        
            bridge->dev = pdev;
            bridge->capndx = cap_ptr;
        
            bridge->driver = &ati_generic_bridge;
        
            /* LOOK AT HERE *************************************/
            dev_info(&pdev->dev, "Ati %s chipset\n", devs[j].chipset_name);
        
            /* Fill in the mode register */
            pci_read_config_dword(pdev,
                    bridge->capndx+PCI_AGP_STATUS,
                    &bridge->mode);
        
            pci_set_drvdata(pdev, bridge);
            return agp_add_bridge(bridge);
        }
        Also, the later second log line about the AGP aperture size should come from drivers/char/agp/backend.c, from the agp_add_bridge(...) call. That is not printed either. This means that the code is not even reaching these areas, or that the log level filters out info messages, which I doubt. I would be happy to know how I can check whether these should be printed at the current log level, but I strongly suspect that the card is not under AGP acceleration in the driver now, either because of some configuration or because of some code change, and that earlier it surely was accelerated with AGP.
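
One way to answer the log-level question is to read /proc/sys/kernel/printk, whose first number is the console log level. Below is a minimal sketch, not a definitive tool: the helper names (parse_printk_levels, shown_on_console, read_console_loglevel) are made up for illustration, and the numeric levels are the ones from include/linux/kern_levels.h. Note that, as far as I know, this setting only filters what reaches the console; dmesg reads the kernel ring buffer, which keeps KERN_INFO messages regardless, so a dev_info() that actually ran should normally still show up there.

```c
#include <stdio.h>

/* Kernel log levels as in include/linux/kern_levels.h (lower = more severe). */
enum { LOGLEVEL_ERR = 3, LOGLEVEL_INFO = 6, LOGLEVEL_DEBUG = 7 };

/*
 * Parse the four integers exposed by /proc/sys/kernel/printk:
 * console, default-message, minimum-console, default-console.
 * Returns 0 on success, -1 if the text does not hold four integers.
 */
int parse_printk_levels(const char *text, int levels[4])
{
    int n = sscanf(text, "%d %d %d %d",
                   &levels[0], &levels[1], &levels[2], &levels[3]);
    return n == 4 ? 0 : -1;
}

/* A message reaches the console only if its level is numerically
 * lower than the console log level. */
int shown_on_console(int msg_level, int console_loglevel)
{
    return msg_level < console_loglevel;
}

/* Read the live setting; returns the console log level, or -1 on error. */
int read_console_loglevel(void)
{
    char buf[64];
    int levels[4];
    FILE *f = fopen("/proc/sys/kernel/printk", "r");

    if (!f)
        return -1;
    if (!fgets(buf, sizeof buf, f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    return parse_printk_levels(buf, levels) == 0 ? levels[0] : -1;
}
```

Calling shown_on_console(LOGLEVEL_INFO, read_console_loglevel()) from a small main() tells whether info messages would appear on the console; for dmesg output the more likely explanation for a missing line is that the code never ran.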

        The list of hardcoded supported devices in the latest kernel driver source is this:

        Code:
        static struct agp_device_ids ati_agp_device_ids[] =
        {
            {
                .device_id    = PCI_DEVICE_ID_ATI_RS100,
                .chipset_name    = "IGP320/M",
            },
            {
                .device_id    = PCI_DEVICE_ID_ATI_RS200,
                .chipset_name    = "IGP330/340/345/350/M",
            },
            {
                .device_id    = PCI_DEVICE_ID_ATI_RS200_B,
                .chipset_name    = "IGP345M",
            },
            {
                .device_id    = PCI_DEVICE_ID_ATI_RS250,
                .chipset_name    = "IGP7000/M",
            },
            {
                .device_id    = PCI_DEVICE_ID_ATI_RS300_100,
                .chipset_name    = "IGP9100/M",
            },
            {
                .device_id    = PCI_DEVICE_ID_ATI_RS300_133,
                .chipset_name    = "IGP9100/M",
            },
            {
                .device_id    = PCI_DEVICE_ID_ATI_RS300_166,
                .chipset_name    = "IGP9100/M",
            },
            {
                .device_id    = PCI_DEVICE_ID_ATI_RS300_200,
                .chipset_name    = "IGP9100/M",
            },
            {
                .device_id    = PCI_DEVICE_ID_ATI_RS350_133,
                .chipset_name    = "IGP9000/M",
            },
            {
                .device_id    = PCI_DEVICE_ID_ATI_RS350_200,
                .chipset_name    = "IGP9100/M",
            },
            { }, /* dummy final entry, always present */
        };
        I did not check out the whole kernel source tree with git to see whether cards got removed from this list, as I only have a tar.gz version to save space on the HDD. From the code it seems that if my device is not in the list then I have no acceleration whatsoever, but I have no idea why there is no error message or anything, nor do I know what my device_id and chipset name are that the table is compared against.
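
The table lookup can be tried outside the kernel. The sketch below copies the sentinel-terminated loop from agp_ati_probe() into plain C. The numeric IDs are an assumption: they are the values I recall from include/linux/pci_ids.h and should be checked against your own tree. The struct and function names (agp_device_id, ati_agp_chipset_name) are hypothetical. The device ID 0x5a31 used in the usage note is the one lspci reports for the host bridge later in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's struct agp_device_ids. */
struct agp_device_id {
    uint16_t device_id;
    const char *chipset_name;
};

/* Same entries as the list in the post; numeric IDs are assumed from
 * include/linux/pci_ids.h, verify them before relying on this. */
static const struct agp_device_id ati_ids[] = {
    { 0xcab0, "IGP320/M" },             /* PCI_DEVICE_ID_ATI_RS100 */
    { 0xcab2, "IGP330/340/345/350/M" }, /* PCI_DEVICE_ID_ATI_RS200 */
    { 0xcbb2, "IGP345M" },              /* PCI_DEVICE_ID_ATI_RS200_B */
    { 0xcab3, "IGP7000/M" },            /* PCI_DEVICE_ID_ATI_RS250 */
    { 0x5830, "IGP9100/M" },            /* PCI_DEVICE_ID_ATI_RS300_100 */
    { 0x5831, "IGP9100/M" },            /* PCI_DEVICE_ID_ATI_RS300_133 */
    { 0x5832, "IGP9100/M" },            /* PCI_DEVICE_ID_ATI_RS300_166 */
    { 0x5833, "IGP9100/M" },            /* PCI_DEVICE_ID_ATI_RS300_200 */
    { 0x7831, "IGP9000/M" },            /* PCI_DEVICE_ID_ATI_RS350_133 */
    { 0x7833, "IGP9100/M" },            /* PCI_DEVICE_ID_ATI_RS350_200 */
    { 0, NULL },                        /* sentinel: chipset_name == NULL */
};

/* Returns the chipset name for a supported device ID, or NULL if the
 * ID is not in the table (the "unsupported Ati chipset" case). */
const char *ati_agp_chipset_name(uint16_t device_id)
{
    for (size_t j = 0; ati_ids[j].chipset_name; j++) {
        if (ati_ids[j].device_id == device_id)
            return ati_ids[j].chipset_name;
    }
    return NULL;
}
```

With these values, ati_agp_chipset_name(0x5a31) returns NULL, so this bridge would not match the table. In the driver, however, the "unsupported Ati chipset" message sits after the pci_find_capability() check, so it can only be printed if that check passes first.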



        • #24
          Also weird is that this is not even printed:

          Code:
          dev_err(&pdev->dev, "unsupported Ati chipset [%04x/%04x])\n",
              pdev->vendor, pdev->device);
          return -ENODEV;
          It is as if the agp driver stalled at some point or exited much earlier than these points - or of course it can still be that the log level just does not show these messages. Maybe I should check what the first message's log level is. I will grep for it to see whether it is the same or higher; in that case it surely indicates a problem.



          • #25
            What I see being printed is INFO level too, so both should show up, yet only one shows for me:

            Code:
            [[email protected] zen-kernel-5.0.17-lqx1]$ grep -R "Linux agpgart interface" *
            drivers/char/agp/backend.c:             printk(KERN_INFO "Linux agpgart interface v%d.%d\n"
            Maybe this returns no AGP capability for some reason:

            Code:
            cap_ptr = pci_find_capability(pdev, PCI_CAP_ID_AGP);
            if (!cap_ptr)
               return -ENODEV;
            In the above function this is the only place that fails without logging an error, and maybe later there is none either...
            I haven't changed anything in my BIOS, so I have no idea why AGP might have stopped working. Could it be that there were kernel changes that make some AGP cards no longer recognizable as AGP? :-(



            • #26
              New information: I have tried the linux-lts 4.19.38-1.0 kernel and its modules, only to see that it has the very same problem for me.

              I think I will add some more logging to the agp and ati-agp kernel modules to see where exactly things go wrong. After that I will also try building these modules into the kernel itself so that they do not need to be loaded separately. I am not really sure which kernel I last ran on my Ubuntu 16.04, because it was not completely up to date before I upgraded it to 18.04. It was surely later than 3.x, maybe a very early 4.x with the corresponding modules. I also don't know whether they ship agp built in or as a module, but maybe if I build it in it will work automagically...



              • #27
                Maybe a dumb question, but if you go back to Ubuntu 16.04 does the speed come back?

                My first thought would be to grab a log dump from running on 16.04 so you have a basis for comparison.



                • #28
                  Hmmm... it is good to know it is PCIe, but maybe PCIe is still handled the AGP way. I can only guess from here, but you made me unsure:

                  Code:
                  [[email protected] ~]$ lspci -knn
                  00:00.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD/ATI] RC410 Host Bridge [1002:5a31] (rev 01)
                          Subsystem: ASUSTeK Computer Inc. RC410 Host Bridge [1043:13d7]
                          Kernel modules: ati_agp
                  Btw it is not a dumb idea at all to compare logs, settings and perf output against the old system. I do not have it anymore, however... I had no HDD space, so when the 18.04 update failed I just chose to swap the distro instead of fixing it, because they are dropping 32-bit support anyway. Now I have added some logging to the agp code and will try it...



                  • #29
                    I let my machine compile my modified, logging-enabled kernel through the night... I am getting very interesting results that strengthen my suspicions. I have also configured the kernel to drop a lot of unnecessary drivers (only those I can surely tell are unnecessary), and the compilation still took a long time. I hope that if I only change some lines without reconfiguring, the compilation will not start over from the beginning, but this time it did...

                    I also changed the config so that the ati-agp and agp modules are now built into the kernel and no longer loaded as modules. I had a faint hope that this would help, but performance is the same as before...

                    Full dmesg output is here:

                    http://ballmerpeak.web.elte.hu/agp_dmesg_hackerman.txt

                    The relevant log parts are these:

                    Code:
                    [ 0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-prenex-hackerman root=UUID=8e....0f6 rw radeon.agp_mode=8 agp=try_unsupported
                    [ 0.000000] Trying unsupported AGP cards too......
                    
                    ...
                    
                    [    0.932692] Linux agpgart interface v0.103
                    [    0.932694] agpgart: Registering agp-ati driver.
                    [    0.932708] agpgart-ati 0000:00:00.0: probing Ati chipset [1002/5a31]) for agp capabilities
                    
                    ...
                    
                     [   15.399353] radeon: unknown parameter 'agp_mode' ignored
                    Some of these are immediately understandable (it seems the newer radeon driver ignores agp_mode on the kernel command line - maybe not when set in xorg.conf), and some are better understood once I present my small changeset to the kernel source tree.

                    The relevant changes to the kernel sources follow from this point.

                    Code:
                    #ifndef MODULE
                    static __init int agp_setup(char *s)
                    {
                        if (!strcmp(s,"off"))
                            agp_off = 1;
                        if (!strcmp(s,"try_unsupported")) {
                    +       printk(KERN_INFO "Trying unsupported AGP cards too...");
                            agp_try_unsupported_boot = 1;
                        }
                        return 1;
                    }
                    __setup("agp=", agp_setup);
                    #endif
                    I have added this to see whether "agp=try_unsupported" really takes effect or not.

                    You can see from the log that I have successfully enabled it - whatever it means, it surely does not help in my case.
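
To double-check which options actually reached the kernel without patching it, the command line (also readable from /proc/cmdline) can be split into tokens and compared exactly. The sketch below is a simplified userspace illustration with a hypothetical helper name (cmdline_has_option); it is not the kernel's own parser, which also handles quoting. It shows that "agp=try_unsupported" is a separate option from "radeon.agp_mode=8", which is a module parameter and is handled by the radeon module instead. As far as I recall the radeon module spells that parameter agpmode, which would explain the "unknown parameter" line; modinfo radeon should confirm or refute this.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if cmdline contains the whitespace-delimited token
 * "key=value" exactly, otherwise 0. Quoting is not handled. */
int cmdline_has_option(const char *cmdline, const char *key, const char *value)
{
    size_t klen = strlen(key);
    size_t vlen = strlen(value);
    const char *p = cmdline;

    while (*p) {
        const char *tok;
        size_t tlen;

        /* Skip separators in front of the next token. */
        while (*p == ' ' || *p == '\t' || *p == '\n')
            p++;
        tok = p;

        /* Advance to the end of the token. */
        while (*p && *p != ' ' && *p != '\t' && *p != '\n')
            p++;
        tlen = (size_t)(p - tok);

        /* Exact match of "key", '=', "value" over the whole token. */
        if (tlen == klen + 1 + vlen &&
            strncmp(tok, key, klen) == 0 &&
            tok[klen] == '=' &&
            strncmp(tok + klen + 1, value, vlen) == 0)
            return 1;
    }
    return 0;
}
```

Feeding it the command line from the log above, the lookup for key "agp" and value "try_unsupported" succeeds, while key "agp_mode" alone does not match because the token on the command line is "radeon.agp_mode=8".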

                    Code:
                    static int __init agp_ati_init(void)
                    {
                        if (agp_off)
                            return -EINVAL;
                    +   printk(KERN_INFO PFX "Registering agp-ati driver.\n");
                        return pci_register_driver(&agp_ati_pci_driver);
                    }
                    At this place the agp-ati driver registers itself as a PCI driver. I wanted to log here so I can be sure that the registration really happens. I just wanted to rule out that later messages are missing because this step never ran...
                    Technically it would have been best to save the return value of pci_register_driver(), log only afterwards, and return the saved value, but this was faster to write...

                    You can see from the log that the agp-ati driver is indeed registered among the PCI drivers of the kernel. The agp_ati_pci_driver is a struct with function pointers. The most meaningful one is its function for "probing the device".

                    Code:
                    static int agp_ati_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
                    {
                        struct agp_device_ids *devs = ati_agp_device_ids;
                        struct agp_bridge_data *bridge;
                        u8 cap_ptr;
                        int j;
                    
                    +   dev_info(&pdev->dev, "probing Ati chipset [%04x/%04x]) for agp capabilities\n",
                    +       pdev->vendor, pdev->device);
                    
                        cap_ptr = pci_find_capability(pdev, PCI_CAP_ID_AGP);
                        if (!cap_ptr)
                            return -ENODEV;
                    
                    +   dev_info(&pdev->dev, "found agp capability on Ati chipset [%04x/%04x])\n",
                    +       pdev->vendor, pdev->device);
                    
                        /* probe for known chipsets */
                        for (j = 0; devs[j].chipset_name; j++) {
                            if (pdev->device == devs[j].device_id)
                                goto found;
                        }
                    
                        dev_err(&pdev->dev, "unsupported Ati chipset [%04x/%04x])\n",
                            pdev->vendor, pdev->device);
                        return -ENODEV;
                    
                    found:
                        bridge = agp_alloc_bridge();
                        if (!bridge)
                            return -ENOMEM;
                    
                        bridge->dev = pdev;
                        bridge->capndx = cap_ptr;
                    
                        bridge->driver = &ati_generic_bridge;
                    
                        dev_info(&pdev->dev, "Ati %s chipset\n", devs[j].chipset_name);
                    
                        /* Fill in the mode register */
                        pci_read_config_dword(pdev,
                                bridge->capndx+PCI_AGP_STATUS,
                                &bridge->mode);
                    
                        pci_set_drvdata(pdev, bridge);
                        return agp_add_bridge(bridge);
                    }
                    This is the key point now. From the log one can see that the driver is "probing Ati chipset [1002/5a31]", but the messages placed after the pci_find_capability() AGP-capability test never appear. That function returns as if there were no AGP capability.

                    My bad: I have not traced that function back and did not fill it with log messages, so I have no idea why it says so :-(.

                    If I can recompile the kernel really fast by just commenting out the test, it would be worth seeing how the machine behaves when I ignore what this function returns and just start to use AGP anyway. It might be anything from complete success to an utter failure or maybe even worse (I only hope it will not damage anything...)
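
For anyone wondering what pci_find_capability() looks at: it walks the capability linked list in the device's PCI configuration space. The sketch below is a simplified userspace re-implementation of that walk over a 256-byte buffer, not the kernel's code; the function name find_capability is made up, while the register offsets and capability IDs are the standard PCI ones. A bridge whose list contains a PCI Express capability (ID 0x10) but no AGP capability (ID 0x02) makes the lookup return 0, which is the early -ENODEV path in agp_ati_probe(). Whether the RC410 bridge here really looks like that is an assumption; lspci -vv -s 00:00.0 lists the actual capabilities.

```c
#include <stdint.h>

#define PCI_STATUS           0x06  /* status register, low byte */
#define PCI_STATUS_CAP_LIST  0x10  /* bit 4: capability list present */
#define PCI_CAPABILITY_LIST  0x34  /* offset of first capability pointer */
#define PCI_CAP_ID_AGP       0x02
#define PCI_CAP_ID_EXP       0x10  /* PCI Express */

/*
 * Walk the capability list in a 256-byte config space image.
 * Returns the offset of the capability with the given ID, or 0 if the
 * device has no capability list or the ID is not in it.
 */
uint8_t find_capability(const uint8_t cfg[256], uint8_t cap_id)
{
    int ttl = 48;  /* guard against malformed, looping lists */
    uint8_t pos;

    if (!(cfg[PCI_STATUS] & PCI_STATUS_CAP_LIST))
        return 0;

    /* Pointers are dword aligned; the low two bits are reserved. */
    pos = cfg[PCI_CAPABILITY_LIST] & 0xfc;

    /* Capabilities live above the 64-byte standard header. */
    while (pos >= 0x40 && ttl-- > 0) {
        uint8_t id = cfg[pos];

        if (id == 0xff)        /* unreadable entry, stop */
            break;
        if (id == cap_id)
            return pos;
        pos = cfg[pos + 1] & 0xfc;  /* follow the "next" pointer */
    }
    return 0;
}
```

A real config space image can be read (as root for the full 256 bytes) from /sys/bus/pci/devices/0000:00:00.0/config and passed to find_capability() to see which of the two IDs is actually present on this bridge.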



                    • #30
                      Btw I really thank debianxfce and everyone for all their suggestions. If I can tell this is surely a bug, I guess I should contact whoever maintains the agpgart drivers in the kernel tree. Still, it is best to look around a bit more first, to at least have a grasp of what is going on or what has changed that impacts me here.
