debugging a hard crash and reboot during "benchmark ffmpeg"

  • debugging a hard crash and reboot during "benchmark ffmpeg"

    Hello

    I can reliably crash my Linux Mint box by running the following:

    Code:
    $ ./phoronix-test-suite batch-benchmark ffmpeg
    
    FFmpeg 6.0:
        pts/ffmpeg-6.0.0 [Encoder: libx265 - Scenario: Upload]
        Test 4 of 8
        Estimated Trial Run Count:    3                      
        Estimated Test Run-Time:      5 Minutes              
        Estimated Time To Completion: 24 Minutes [11:33 CDT]
            Started Run 1 @ 11:10:03
    At this point my log file ends (I redirected the batch output to a file) and the system reboots.

    I have checked syslog, journalctl, and dmesg, and none of them shows a specific error message. There is no kernel panic. It is as though the machine just blinks out of existence and then comes back up and boots.

    What are the next steps for debugging?

    The next hypothesis I would like to test is whether the Phoronix Test Suite builds the program with a compiler option that generates an invalid instruction. For that I need the exact command line used to run the program, so I tried dumping 'ps' continuously during the test and found the following:

    Code:
    $ /home/don/.phoronix-test-suite/installed-tests/pts/ffmpeg-6.0.0//ffmpeg_/bin/ffprobe
    -select_streams v:0 -count_frames -show_entries stream=nb_read_frames
     /home/don/.phoronix-test-suite/installed-tests/pts/ffmpeg-6.0.0//vbench/videos/crf0/chicken_3840x2160_30.mkv
    However, running that command by hand fails with this error:

    Code:
    /home/don/.phoronix-test-suite/installed-tests/pts/ffmpeg-6.0.0//ffmpeg_/bin/ffprobe:
    error while loading shared libraries: libx265.so.206: cannot open shared object file: No such file or directory
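    The loader error suggests the suite sets a library path that my shell does not have. A rough sketch of what I could try, assuming libx265.so.206 was built somewhere under the same installed-tests directory (not verified); find_lib_dir is a hypothetical helper:

```shell
# find_lib_dir ROOT LIBNAME: print the directory of the first LIBNAME found
# under ROOT, or nothing if there is no match.
find_lib_dir() {
    lib_path=$(find "$1" -name "$2" -print 2>/dev/null | head -n 1 || true)
    if [ -n "$lib_path" ]; then
        dirname "$lib_path"
    fi
}

# Assumption: the library lives somewhere below the test's install directory.
PTS_FFMPEG="$HOME/.phoronix-test-suite/installed-tests/pts/ffmpeg-6.0.0"
lib_dir=$(find_lib_dir "$PTS_FFMPEG" 'libx265.so.206')

if [ -n "$lib_dir" ]; then
    # Prepend the directory so the dynamic loader searches it first.
    export LD_LIBRARY_PATH="$lib_dir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    echo "LD_LIBRARY_PATH=$LD_LIBRARY_PATH"
else
    echo "libx265.so.206 not found under $PTS_FFMPEG" >&2
fi
```

    If the library turns up, re-running the ffprobe command in the same shell should at least get past the loader error.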
    I'm hoping someone can point me to a better way of using the test tools, so that I can isolate the exact command, environment, PATH, and library path setup and reproduce this crash immediately, without restarting the entire test suite and waiting 10+ minutes for it to reach the stage that breaks. From there I would like to isolate the failing code further through some kind of instrumentation.
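    A minimal sketch of how that capture might work on Linux by reading /proc, assuming the encoder process is named ffmpeg (I have only confirmed ffprobe in 'ps' so far); the function names and the output file are made up for illustration:

```shell
# Print a NUL-separated /proc file (cmdline or environ) one entry per line.
nul_to_lines() {
    tr '\0' '\n' < "$1"
}

# capture_proc NAME OUTFILE: poll until a process named exactly NAME exists,
# then save its working directory, arguments and environment to OUTFILE.
capture_proc() {
    name=$1
    out=$2
    until pid=$(pgrep -n -x "$name"); do
        sleep 1
    done
    {
        echo "# pid $pid, cwd: $(readlink "/proc/$pid/cwd")"
        echo "# cmdline:"
        nul_to_lines "/proc/$pid/cmdline"
        echo "# environ:"
        nul_to_lines "/proc/$pid/environ"
    } > "$out"
    # Flush to disk right away, since the machine may reset moments later.
    sync
}

# Usage, in a second terminal while the benchmark runs:
#   capture_proc ffmpeg "$HOME/ffmpeg-capture.txt"
```

    The saved environment should include any LD_LIBRARY_PATH and PATH the suite sets, which is the part I could not reconstruct from 'ps' alone.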

    I do not think it is either of the following:
    • temperature spike - I dumped 'sensors' into a log file and all readings were under 40 degrees Celsius at crash time
    • out of RAM - I dumped 'free' into a log file and swap was not even being touched at crash time
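    For reference, a sketch of the kind of sampling loop I mean; the log path and function name are placeholders, and the sync after each sample is there so the last reading has a chance of reaching disk before the reset:

```shell
# log_sample LOGFILE: append a timestamp plus 'sensors' and 'free' output,
# then flush to disk so the sample survives a sudden reboot.
log_sample() {
    {
        date '+%F %T'
        sensors 2>/dev/null || echo "(sensors unavailable)"
        free -m 2>/dev/null || echo "(free unavailable)"
    } >> "$1"
    sync
}

# Usage, in a second terminal while the benchmark runs:
#   while true; do log_sample "$HOME/crash-monitor.log"; sleep 1; done
```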
    Thanks!



  • #2
    thoughts? comments?
