
How to tell if a driver is gallium or just mesa? (Slow rendering with radeon)


  • #31
    Btw, these files have seen only very small changes since 2016, the year 16.04 came out:

    Linux kernel source tree. Contribute to torvalds/linux development by creating an account on GitHub.

    (click "history")

    But pci.c (which contains the capability query function) has been changed many times:

    Linux kernel source tree. Contribute to torvalds/linux development by creating an account on GitHub.

    (click "history")

    Also, PCI Express and AGP are listed as separate capabilities. That does not rule out one being a subset of the other or something like that; I am just keeping it in mind.

    Code:
    /**
     * pci_find_capability - query for devices' capabilities
     * @dev: PCI device to query
     * @cap: capability code
     *
     * Tell if a device supports a given PCI capability.
     * Returns the address of the requested capability structure within the
     * device's PCI configuration space or 0 in case the device does not
     * support it.  Possible values for @cap:
     *
     *  %PCI_CAP_ID_PM           Power Management
     *  %PCI_CAP_ID_AGP          Accelerated Graphics Port
     *  %PCI_CAP_ID_VPD          Vital Product Data
     *  %PCI_CAP_ID_SLOTID       Slot Identification
     *  %PCI_CAP_ID_MSI          Message Signalled Interrupts
     *  %PCI_CAP_ID_CHSWP        CompactPCI HotSwap
     *  %PCI_CAP_ID_PCIX         PCI-X
     *  %PCI_CAP_ID_EXP          PCI Express
     */
    int pci_find_capability(struct pci_dev *dev, int cap)
    {
        int pos;
    
        pos = __pci_bus_find_cap_start(dev->bus, dev->devfn, dev->hdr_type);
        if (pos)
            pos = __pci_find_next_cap(dev->bus, dev->devfn, pos, cap);
    
        return pos;
    }
    EXPORT_SYMBOL(pci_find_capability);
    Also, I now see that I cannot just skip this function: it returns the position of the capability structure within the device's PCI configuration space, which the calling code needs afterwards, so ignoring it would most likely just lead to a segmentation fault or worse. It is not just a check. At this point I feel a bit hopeless, because I do not know this code (or any other part of the kernel, for that matter).

    Maybe a good idea is to dig a very old, 2016-ish version of this file out of the git history, compare it with the one I have, and see whether any of the changes matter for my case...



    • #32


      ^^The function seems to be pretty much the same in a 2016.03.15 version (and likely in 16.04 too). Still, a lot can have happened around it: retries in older code may have been removed, or the code this function depends on may have changed. I can look through the surrounding code for visible changes, but I will end up at a point where side effects matter and I cannot follow them :-(

      Maybe those suggesting that I install an older Linux with an old kernel, to see what the old one does, are pointing in a good direction, and I may do that soon. I am still open to any ideas, though, and maybe I am still missing some configuration (it seems config options affecting me can even be in the PCI subsystem), like retry counts for probing the cards, or who knows what. Maybe some old cards only worked on the second or third try, etc...



      • #33
        Even this TTL number is the same (pci.h):

        Code:
        #define PCI_FIND_CAP_TTL 48
        As far as I can tell, this is a retry count for some underlying operation. Of course it is not a retry count for "probing" as a whole, but for one operation that is tried directly. If other retries were removed or shortened elsewhere, maybe I can raise this to a higher number to counteract that and still find the capabilities...

        This is where it affects the code:

        Code:
        static int __pci_find_next_cap_ttl(struct pci_bus *bus, unsigned int devfn,
                           u8 pos, int cap, int *ttl)
        {
            u8 id;
            u16 ent;
        
            pci_bus_read_config_byte(bus, devfn, pos, &pos);
        
            while ((*ttl)--) {
                if (pos < 0x40)
                    break;
                pos &= ~3;
                pci_bus_read_config_word(bus, devfn, pos, &ent);
        
                id = ent & 0xff;
                if (id == 0xff)
                    break;
                if (id == cap)
                    return pos;
                pos = (ent >> 8);
            }
            return 0;
        }
        
        static int __pci_find_next_cap(struct pci_bus *bus, unsigned int devfn,
                           u8 pos, int cap)
        {
            int ttl = PCI_FIND_CAP_TTL;
        
            return __pci_find_next_cap_ttl(bus, devfn, pos, cap, &ttl);
        }
        The other function (the *start* one) only seems to initialize the process, and this one seems to do the real search. I also feel this is the one with the higher chance of failing, but I will likely add logging to be sure. That would create a lot of log messages because it affects all PCI devices... or actually not! I can put the logging into a branch so that it only logs when someone asks for the AGP capability :-)

        I might try building a kernel without changing the configuration, making only minor changes, and see whether that is fast or not. There must be some quick way to make one-line changes to single files and try them out - is there anything I can do for that? A good makefile usually handles this, but I understand that the kernel source tree is a really big and complex system, so it might be hard to make the build process rebuild only what has changed...

        Changing the TTL might or might not help if a retry was removed somewhere. If the card only needs some time to answer, a bigger value might help. But if a higher-level retry was removed somewhere, either through configuration or code changes, it might not help: the retry may only have worked because the kernel does something between the first and second attempt that lets the second one succeed...



        • #34
          Are these all easily installable using apt or something on a Debian system? It might really help to corner the problem if they are.

          I have quite limited disk space right now (basically 30 GB, but much of it is used). Some of the partitions I have filled completely with backups from the earlier system. There are already some backups on this root partition too (like data saved from the pendrives I used for installing Arch, and stuff saved from an SD card that I am meanwhile preparing for a really old 2.6 kernel system with fglrx, just for kexec fun, etc.). So not only compilation but also installation will take a noticeable amount of time for me, mostly because I will need to move the saved data from this machine to random other places to reclaim some space. I had planned to go through it later and throw out the unnecessary stuff, because right now I still have the complete home directory of the old system, for example.

          Also, of course, it would be good to know how to compile very minor changes in a kernel without recompiling the whole thing. I will at least try the latter once, since this time I did not touch make menuconfig and made only really minor changes. Maybe just issuing make now will recompile only the necessary things and be faster...



          • #35
            Also: is there any way to know, without installing the system, whether any live images already ship with a radeon driver? That might come in really handy now. I am sure it is not best practice to include such special drivers in a live image, but maybe some distro images do so :-)



            • #36
              I am compiling a kernel with these changes now:

              Code:
              /* drivers/pci/pci.c */
              ....
              static int __pci_find_next_cap_ttl(struct pci_bus *bus, unsigned int devfn,
                                 u8 pos, int cap, int *ttl)
              {
                  u8 id;
                  u16 ent;
              
                  pci_bus_read_config_byte(bus, devfn, pos, &pos);
              
                  while ((*ttl)--) {
              -       if (pos < 0x40)
              +       if (pos < 0x40) {
              +           if(cap == PCI_CAP_ID_AGP)
              +               printk(KERN_ERR "PCI: __pci_find_next_cap_ttl pos < 0x40 at TTL value still at: %d\n", *ttl);
                          break;
              +       }
                      pos &= ~3;
                      pci_bus_read_config_word(bus, devfn, pos, &ent);
              
                      id = ent & 0xff;
              -       if (id == 0xff)
              +       if (id == 0xff) {
              +           if(cap == PCI_CAP_ID_AGP)
              +               printk(KERN_ERR "PCI: __pci_find_next_cap_ttl  id == 0xff at TTL value still at: %d\n", *ttl);
                          break;
              +       }
                      if (id == cap)
                          return pos;
                      pos = (ent >> 8);
                  }
              +   if(cap == PCI_CAP_ID_AGP)
              +       printk(KERN_ERR "PCI: __pci_find_next_cap_ttl error at TTL value still at: %d\n", *ttl);
                  return 0;
              }
              
              ...
              
              int pci_find_capability(struct pci_dev *dev, int cap)
              {
                  int pos;
              
              +   if(cap == PCI_CAP_ID_AGP)
              +       printk(KERN_INFO "PCI: Searching for agp capability for device: %x\n", dev);
                  pos = __pci_bus_find_cap_start(dev->bus, dev->devfn, dev->hdr_type);
              -   if (pos)
              +   if (pos) {
              +        if(cap == PCI_CAP_ID_AGP)
              +            printk(KERN_INFO "PCI: __pci_bus_find_cap_start success (agp capability for device: %x)\n", dev);
                      pos = __pci_find_next_cap(dev->bus, dev->devfn, pos, cap);
              +    }
              
                  return pos;
              }
              EXPORT_SYMBOL(pci_find_capability);
              
              ...
              and

              Code:
              /* drivers/pci/pci.h */
              ...
              -#define PCI_FIND_CAP_TTL    48
              +#define PCI_FIND_CAP_TTL    4800
              ...
              A bit more information for me and a possible dirty hack. I can just hope that the build will take less than an hour at most now that I am not touching the config file haha :-)



              • #37
                Now it was much faster: maybe 5-15 minutes for compilation, though the module install still takes a while :-)

                It seems I need to recompile the whole kernel - or most of it - only if I make relevant config changes, but one-liners work well, thank God (and the whole kernel team).

                Of course this is still slower than your Ryzen haha. At least it is usable now for debugging.



                • #38
                  Originally posted by prenex
                  Hmmm... it is good to know it is PCIe, but maybe PCIe is still handled in AGP ways. I only suspect from here, but you made me unsure:

                  Code:
                  [prenex@prenex-laptop ~]$ lspci -knn
                  00:00.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD/ATI] RC410 Host Bridge [1002:5a31] (rev 01)
                  Subsystem: ASUSTeK Computer Inc. RC410 Host Bridge [1043:13d7]
                  Kernel modules: ati_agp
                  Btw it is not a dumb idea at all to compare logs, settings and perf output against the old system. I do not have it anymore, however... I had no HDD space, so when the 18.04 update failed I just chose to swap the distro instead of fixing it, because they are dropping 32-bit support anyway. Now I have added some logs in the AGP code and will try it...
                  Just too many posts to read them all, but just some background info....

                  The last few generations of AGP cards were actually PCIe and they used a PCIe > AGP bridge chipset. Those cards do still need the amd agpgart for that bridge chipset. Another thing that's important to note is that AGP in that card supports X8 speed, check your BIOS to make sure it's not accidentally set to x1 or something, just make sure it is still set to X8 speed.



                  • #39
                    Originally posted by duby229

                    Just too many posts to read them all, but just some background info....

                    The last few generations of AGP cards were actually PCIe and they used a PCIe > AGP bridge chipset. Those cards do still need the amd agpgart for that bridge chipset. Another thing that's important to note is that AGP in that card supports X8 speed, check your BIOS to make sure it's not accidentally set to x1 or something, just make sure it is still set to X8 speed.
                    Thank you Duby!

                    I think this information clears up a lot for me. Before reading the kernel module code I had not even noticed that AGP is handled/queried like a PCI capability extension. I once did asm/low-level programming as a hobby, but never at a level where I had to face PCI or AGP and how they relate, and with PCIe I did not even remotely know how it works...

                    Btw the TTL trick did not help me, but the added printk messages led me to a relevant kernel configuration option that might help. I will gather all the details and post them here soon, then I will try whether it works. At least now I have hope for a moment :-)



                    • #40
                      So after my changes this is the dmesg log I am getting:



                      This kind of thing usually shows up when something asks whether the card has the AGP capability:

                      Code:
                      [    0.249427] PCI: Searching for agp capability for device: f540b800
                      [    0.249429] PCI: __pci_bus_find_cap_start success (agp capability for device: f540b800)
                      [    0.249432] PCI: __pci_find_next_cap_ttl pos < 0x40 at TTL value still at: 4798
                      [    0.249442] PCI: __pci_find_next_cap_ttl error at TTL value still at: 4798

                      Which basically means that the "__pci_bus_find_cap_start(..)" function succeeds, but the other one does not. The start is just a preparation for reading the capabilities. Then we go to the "next" function to really read them, and my custom high TTL value does not matter much, because the logs show that we get a "< 0x40" answer after fewer than 3-4 iterations!

                      I mean, the TTL starts at 4800 and I usually hit a break condition at 4798 or 4797. That can only mean this:

                      Code:
                           pci_bus_read_config_byte(bus, devfn, pos, &pos);     /* THIS WORKS */
                           while ((*ttl)--) {
                      -       if (pos < 0x40)
                      +       if (pos < 0x40) {
                      +           if(cap == PCI_CAP_ID_AGP)
                      +               printk(KERN_ERR "PCI: __pci_find_next_cap_ttl pos < 0x40 at TTL value still at: %d\n", *ttl);
                                  break;
                      +       }
                              pos &= ~3;
                              pci_bus_read_config_word(bus, devfn, pos, &ent); /* THIS FAILS AFTER SOME LOOPS */
                      Also, I am no longer sure this is a retry loop at all; it may be just a linear search. There is a "pos" value that looks like an index, plus an &ent, and it looks like a loop counting down over some register array. So raising the TTL may have been a really, really dumb idea, as there will never be that many registers...

                      Anyway, I went on and looked at how this "pci_bus_read_config_word" is implemented NOW:

                      Code:
                      /* drivers/pci/access.c */
                      
                      #define PCI_byte_BAD 0
                      #define PCI_word_BAD (pos & 1)
                      #define PCI_dword_BAD (pos & 3)
                      
                      #ifdef CONFIG_PCI_LOCKLESS_CONFIG
                      # define pci_lock_config(f)    do { (void)(f); } while (0)
                      # define pci_unlock_config(f)    do { (void)(f); } while (0)
                      #else
                      # define pci_lock_config(f)    raw_spin_lock_irqsave(&pci_lock, f)
                      # define pci_unlock_config(f)    raw_spin_unlock_irqrestore(&pci_lock, f)
                      #endif
                      
                      #define PCI_OP_READ(size, type, len) \
                      int noinline pci_bus_read_config_##size \
                          (struct pci_bus *bus, unsigned int devfn, int pos, type *value)    \
                      {                                    \
                          int res;                            \
                          unsigned long flags;                        \
                          u32 data = 0;                            \
                          if (PCI_##size##_BAD) return PCIBIOS_BAD_REGISTER_NUMBER;    \
                          pci_lock_config(flags);                        \
                          res = bus->ops->read(bus, devfn, pos, len, &data);        \
                          *value = (type)data;                        \
                          pci_unlock_config(flags);                    \
                          return res;                            \
                      }
                      
                      #define PCI_OP_WRITE(size, type, len) \
                      int noinline pci_bus_write_config_##size \
                          (struct pci_bus *bus, unsigned int devfn, int pos, type value)    \
                      {                                    \
                          int res;                            \
                          unsigned long flags;                        \
                          if (PCI_##size##_BAD) return PCIBIOS_BAD_REGISTER_NUMBER;    \
                          pci_lock_config(flags);                        \
                          res = bus->ops->write(bus, devfn, pos, len, value);        \
                          pci_unlock_config(flags);                    \
                          return res;                            \
                      }
                      
                      PCI_OP_READ(byte, u8, 1)
                      PCI_OP_READ(word, u16, 2)
                      PCI_OP_READ(dword, u32, 4)
                      PCI_OP_WRITE(byte, u8, 1)
                      PCI_OP_WRITE(word, u16, 2)
                      PCI_OP_WRITE(dword, u32, 4)
                      
                      EXPORT_SYMBOL(pci_bus_read_config_byte);
                      EXPORT_SYMBOL(pci_bus_read_config_word);
                      EXPORT_SYMBOL(pci_bus_read_config_dword);
                      EXPORT_SYMBOL(pci_bus_write_config_byte);
                      EXPORT_SYMBOL(pci_bus_write_config_word);
                      EXPORT_SYMBOL(pci_bus_write_config_dword);
                      The relevant part is this:

                      Code:
                      #ifdef CONFIG_PCI_LOCKLESS_CONFIG .... #endif
                      It seems there is a configuration option to switch between lockless and locking PCI configuration reads!

                      This seems to be quite a new thing, and I do not see it in the same file from 2016.03.15 (March 15):

                      https://github.com/torvalds/linux/bl...s/pci/access.c

                      Also, I see that what was the hardcoded default back then is what now seems to happen if CONFIG_PCI_LOCKLESS_CONFIG is not set!

                      Then I went and read /proc/config.gz, and what do I see enabled for me?

                      Code:
                      ...
                      CONFIG_PCI_LOCKLESS_CONFIG=y
                      ...
                      I now think that this might affect me for some weird and unknown reason! :-)

                      Changing the kernel config value would lead to a long compilation, so to test the theory that this may be the root of all evil here, I think I will just temporarily change the #ifdef into an #ifndef, so that exactly the opposite of the configuration happens. Then build the kernel and hope and pray and things like that :-)
                      Last edited by prenex; 22 May 2019, 10:55 AM.
