Problem with Kubuntu 23.04 and Nvidia 535

  • Problem with Kubuntu 23.04 and Nvidia 535

    The same thing happens on two completely different systems:

    Desktop - Ryzen 9 5950x - RTX 3090 connected to a screen using HDMI.
    Laptop - Ryzen 7 5800H - RTX 3070 Mobile (or whatever the mobile parts were called for 3000) just using the built in screen.

    Clean install of Kubuntu 23.04.
    Switch to a text console before logging in for the first time. (It doesn't matter whether I do this or not; the results are the same.)
    Code:
    sudo apt update
    <optionally>
    Code:
    sudo add-apt-repository ppa:graphics-drivers
    (same result with regular).
    If needed (One install it went with 535, other with 525):
    Code:
    sudo ubuntu-drivers install nvidia:535
    .
    During installation, at some point screen goes off and keyboard doesn't work. Caps lock, num lock don't light up. Machine still responds to ping.

    In the case where 535 was already installed:
    Code:
    sudo apt dist-upgrade
    Same result as above. It updates from 535.86 to 535.104.

    And I mean during the installation, not after a reboot. If I don't update, it's stable.

    I was running 22.04 before with 525 and 530, without these issues.

    Ideas?

    What I haven't tried:

    * Just use 525 or 530.
    * Ubuntu instead of Kubuntu.

    It's not a new issue, either. I installed 23.04 for the first time a few months ago (May?), and since it gave me grief I just went back to 22.04. Thought I'd try again now.

    Update 1:

    Tried Ubuntu 23.04 instead. It installed driver 535.86. On first startup I switched to command line and added the ppa and then did
    Code:
    sudo apt update && sudo apt dist-upgrade
    and it did the same as the other.

    Then I reinstalled Kubuntu and on first boot I enabled the ssh server and ssh'd in from my laptop.
    Over SSH I enabled the PPA and did
    Code:
    sudo ubuntu-drivers install nvidia:535
    and the screen disappeared like before, however the installation just continued until it was done.
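Since the installation keeps running over SSH after the screen blanks, timestamping its output makes it easier to match the moment of blanking to a specific log line. A minimal sketch; `stamp_lines` and the log path in the usage comment are hypothetical names, not part of any package.

```shell
# Prefix every line read from stdin with the current wall-clock time.
stamp_lines() {
    while IFS= read -r line; do
        printf '%s %s\n' "$(date '+%H:%M:%S')" "$line"
    done
}

# Usage over SSH (the log path is only an example):
#   sudo ubuntu-drivers install nvidia:535 2>&1 | stamp_lines | tee ~/nvidia-install.log
```

Note the time when the screen goes dark, then look for the nearest timestamp in the log.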

    The moment the screen disappeared was near or at the following lines:

    Code:
    nvidia-persistenced.service is a disabled or a static unit, not starting it. (twice)
    Setting up cpp-12 (12.3.0-1ubuntu1~23.04) ...
    Setting up nvidia-kernel-common-535 (535.104.05-0ubuntu0.23.04.1) ...
    Installing new version of config file /etc/modprobe.d/nvidia-graphics-drivers-kms.conf ...
    update-initramfs: deferring update (trigger activated) <== This is where I believe it blanked out.
    Could not execute systemctl: at /usr/bin/deb-systemd-invoke line 145.
    Setting up libnvidia-decode-535:amd64 (535.104.05-0ubuntu0.23.04.1) ... (and same for i386).
    // Stefan
    Last edited by stesmi; 25 August 2023, 09:03 PM.

  • #2
    Update 2:

    Tried 530. It installs 535, so not really a real test...
    Tried 525. It defaults to not using dkms, which 535 does. Enabled dkms and removed the prebuilt module and it still just works.

    Tried Kubuntu 22.04. Same results as 23.04.
    Tried Ubuntu 22.04. Same results as Kubuntu and 23.04.

    While having 525 installed, rebooted and blacklisted the nvidia driver:
    Code:
    module_blacklist=nvidia
    Obviously the nvidia driver wasn't loaded.

    Tried upgrading from 525 to 535 (which didn't work when 525 was enabled). This worked.

    So workarounds for now are :

    Run 525.

    When there is a dkms and/or kernel and/or nvidia update and/or some other driver update, apply blacklist, boot, do the update, reboot with normal mode.
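Before running an update under the second workaround, it may be worth confirming that the blacklist actually took effect. A small sketch that checks a /proc/modules-style listing for a module named exactly `nvidia`; the function name is made up, and the optional file argument exists only so it can be tried against a saved listing.

```shell
# Succeed if a module named exactly "nvidia" is listed in the given file
# (defaults to /proc/modules, where the first field is the module name).
nvidia_module_loaded() {
    modules_file="${1:-/proc/modules}"
    awk '$1 == "nvidia" { found = 1 } END { exit !found }' "$modules_file"
}

# Usage:
#   if nvidia_module_loaded; then
#       echo "nvidia is loaded; reboot with module_blacklist=nvidia first"
#   fi
```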

    If anyone has any other ideas, feel free to point me in the right direction. I haven't tried the workarounds on my laptop yet, I've been focusing mostly on my desktop.

    I'll actually try Ubuntu 23.04 now with my blacklist trick, since it always defaults to 535.86, as opposed to Kubuntu which mostly (but not always?!) defaults to 525. That way I'll know if that's a valid workaround for when a new driver is released.

    Update 3:

    Tried Ubuntu 23.04, which defaulted to 535.86. It doesn't use dkms apparently. Made sure it works, rebooted, blacklisting the nvidia driver, added the PPA, updated, and it just worked. Now running 23.04 with 535.104. So my conclusion, and take this with a pinch of salt is that there's something going on with dkms and the 535.104 driver.

    I guess there's one more thing I can try... Install Ubuntu 23.04 again, remove the prebuilt kernel module, enable dkms, and see if it then messes up or not. Here we go... That should tell me whether the problem is with 535 and dkms in general or with the driver from the PPA.

    Update 4:

    So, I installed Ubuntu 23.04, which defaults to the non-PPA 535.86 driver, which by default doesn't use dkms.

    I booted it normally (not disabling the nvidia driver).
    Switched it to using dkms instead of not:

    Code:
    sudo apt install nvidia-dkms-535 # This installs dkms as well. If you read carefully, it does NOT actually USE the modules it builds, as they are already installed from a package.
    sudo apt purge linux-module-nvidia-...version linux-objects-nvidia-...version linux-signatures-nvidia-...version # This removes the prebuilt stuff
    sudo apt reinstall nvidia-dkms-535 # This makes it rebuild the modules, and since there are none, it happily installs the ones it builds.
    Result: No crash.

    I then took it one step further, since I now have a working setup with 535.86, and added the PPA and switched over to the 535.104 drivers from it and ... same lockup as before. Screen off, keyboard and mouse non-functional. Can ssh in (if I have the ssh server enabled, which I mostly forget...)
    Trying to reboot or poweroff
    Code:
    sudo systemctl poweroff
    or
    sudo systemctl reboot
    makes it lock up somewhere. No idea where, as I get kicked off the SSH session and obviously can't see anything on the screen.

    One more step to try:

    Kubuntu 23.04 (since it gives me 525 most of the time) and just update to 535.86 and then same but with dkms enabled.

    Update 5:

    On Kubuntu 23.04, with 525 installed, going straight to 535.86 (non-PPA), makes it actually install dkms and nvidia-dkms-535, and it locks up.
    I tried the same, except this time I manually changed the 525 over from using the prebuilt modules to dkms and THEN upgraded to 535.86 and ... still locks up. So PPA or no PPA, same problem exists.

    Only solution I see is : Use the trick of blacklisting the module when something will update and do it through that.

    I'm totally open to suggestions or a pointer to where I should turn for help. I have a feeling the Nvidia forums aren't the right place, but...?

    As long as the pointers aren't of the type "Don't use Nvidia", "Just use AMD as it Just Works(TM)", or similar "solutions". While technically possible in my desktop, my laptop has what my laptop has.

    Oh, one thing I've changed as well: I now also have a DisplayPort-connected screen on the desktop, in portrait mode. Gotta love it when you disable the drivers, as that's the primary display according to Nvidia, so I have to look sideways...

    // Stefan
    Last edited by stesmi; 29 August 2023, 01:57 PM.

    • #3
      So I found what's going on!

      I added debug info just before that line I quoted above, in deb-systemd-invoke, and I saw that it was invoking nvidia-suspend.service with "start" as an argument, which felt weird.

      I then looked for anything pertaining to that service in the Debian source packages, and found:

      525:
      Code:
      dh_systemd_enable --name=nvidia-suspend
      535:
      Code:
      dh_installsystemd --name=nvidia-suspend
      Now I'm guessing, since I haven't dug into this bit:

      dh_systemd_enable - I'm assuming it enables a service, i.e. next boot it will autostart.
      dh_installsystemd - I'm assuming it enables it ... and starts it?
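For what it's worth, the debhelper manual pages describe dh_systemd_enable as only enabling units, while dh_installsystemd also starts them on installation unless it is given `--no-start`, which fits the guess above. One way to check on an installed system is to look in the package's generated postinst for a deb-systemd-invoke call that names the unit. A rough sketch; the function name is made up, and the postinst path in the usage comment is an assumption about which package ships the unit.

```shell
# Succeed if the given maintainer script has a deb-systemd-invoke line
# that mentions the given unit name.
postinst_touches_unit() {
    script="$1"
    unit="$2"
    grep 'deb-systemd-invoke' "$script" | grep -qF -- "$unit"
}

# Usage (path is an assumption; adjust to the package that owns the unit):
#   postinst_touches_unit /var/lib/dpkg/info/nvidia-kernel-common-535.postinst \
#       nvidia-suspend.service && echo "postinst acts on nvidia-suspend.service"
```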

      I compared the 525 and 535 versions of the service files and they are identical, same hash.
      They call /usr/bin/nvidia-sleep.sh with the argument "suspend".
      /usr/bin/nvidia-sleep.sh is also identical on both, so it has to be the calling.
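The hash comparison can be repeated with something like the following sketch. `same_sha256` is a hypothetical helper, and the directory names in the usage comment are placeholders for wherever the two package versions were unpacked.

```shell
# Succeed if both files exist and have the same SHA-256 digest.
# Returns 2 if either file is unreadable, 1 if the digests differ.
same_sha256() {
    [ -r "$1" ] && [ -r "$2" ] || return 2
    a=$(sha256sum < "$1" | cut -d' ' -f1)
    b=$(sha256sum < "$2" | cut -d' ' -f1)
    [ "$a" = "$b" ]
}

# Usage (placeholder directories for the unpacked 525 and 535 packages):
#   same_sha256 pkg-525/usr/bin/nvidia-sleep.sh pkg-535/usr/bin/nvidia-sleep.sh \
#       && echo identical
```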

      Well...

      Then I tried the following, on a kubuntu (to get 525 driver):

      Code:
      sudo add-apt-repository -y ppa:graphics-drivers && sudo apt -y update && sudo apt -y dist-upgrade && sudo apt -y install nvidia-driver-535 && sudo /usr/bin/nvidia-sleep.sh resume
      And lo and behold! Screen blanks and after some 45 seconds it comes back. I had to Alt-F1 to get back to graphical mode, but everything was alive (apart from KDE complaining that it lost graphics).

      Then I tried an Ubuntu (535.86 by default), using almost the same line (without the install nvidia-driver-535), and again it worked. I did it twice actually, the first time it came back to GUI mode and second I had to Alt-F1 to get back there, but hey.

      So if you're using the command line to update your system, this is what you need to do. I'm going to contact the maintainer and ask whether I'm correct about the change in the rules file.

      Hope this helps anyone.

      // Stefan

      • #4
        One recommendation is to use a slightly modified line:
        Code:
        sudo sh -c 'add-apt-repository -y ppa:graphics-drivers && apt -y update && apt -y dist-upgrade && apt -y install nvidia-driver-535 && /usr/bin/nvidia-sleep.sh resume'
        It does the same thing, but avoids a problem with the original line: if the first steps take a while (which they can, depending on your computer), sudo's cached authentication can time out and you'd have to re-enter the password, which you wouldn't know you need to do since... the screen's blanked out. This way sudo is invoked once for all the commands.
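One limitation of chaining everything with `&&` is that the resume step is skipped if any earlier command fails. An untested variation, offered only as a sketch: run the update, then call the resume script regardless of the outcome. `run_then_resume` and the `NVIDIA_SLEEP_SCRIPT` override are hypothetical names introduced here, and whether calling `resume` after a failed install is always harmless is an assumption.

```shell
# Run the given command, then call the resume script even if the command failed.
# Returns the exit status of the command itself.
run_then_resume() {
    resume_script="${NVIDIA_SLEEP_SCRIPT:-/usr/bin/nvidia-sleep.sh}"
    "$@"
    status=$?
    if [ -x "$resume_script" ]; then
        "$resume_script" resume
    fi
    return "$status"
}

# Usage: save this in a script file that ends with, for example,
#   run_then_resume apt -y dist-upgrade
# and run that script once with sudo, so the password is asked only once.
```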

        // Stefan
