The dangers of Linux kernel development


  • #21
    What (not to) do when you got raw hardware access

    So, once you've got direct hardware access, kernel mode or otherwise, you can try the following:
    Caution: doing these things can be extremely harmful and/or destructive. Think twice about whether it's what you want.

    1) Got a SATA drive? And even learned how to send all those cool, interesting commands over the SATA interface? Sounds like a plan! Now go send a "firmware update" command and try uploading your swap file to see if it improves HDD speed. Should it work, please do not ask how to undo that. I do not have even the slightest idea. Replace HDD and press any key. And of course you shouldn't create backups! That's cheating for sure.
    1.1) If it does not work, it can be an interesting idea to send some vendor-specific commands in random order, fuzzing parameters and so on. Basically, redirect /dev/urandom into SATA and take a look. Eventually you'll get interesting results for sure. I can't tell what effects you'll face. But some of them should be really funny. Unfortunately I do not know how to undo them if you dislike the results.
    1.2) The same idea could work for SATA SSDs and USB flash drives. Sending some garbage to SD/MMC cards could be a wise idea as well. All this crap has a ton of interesting commands. Some are one-way tickets, some are highly destructive, some are undocumented vendor stuff that does hell knows what.
    1.3) Sending funny commands could also work for CD/DVD/BD drives. They have a ton of interesting stuff to try as well!

    2) Oh, are there any I2C and SPI buses in your system? Scan 'em all. As hard as you can. If there is an EEPROM or flash IC, you're lucky! Now erase them and see if you won the jackpot.
    2.1) If you're especially lucky, the best thing you can do is erase the BIOS (or boot loader), which often starts from some SPI flash IC. Needless to say, if the erase succeeds, you're doomed to have a lot of fun on the next reboot. And you'll learn a heck of a lot about advanced recovery techniques if you want to undo this.
    2.2) Regulators. Ahh, what a nice invention. Your system needs more power, doesn't it? So once you manage to find a regulator IC on some bus, go reprogram it! Increase all the voltages you can to their maximum values. Absolute Maximum Ratings in datasheets are for nuts! Feel free to exceed them. Since many regulators are generic ICs which do not have even the slightest idea about the absolute maximum ratings of your system... well, it's not going to be a safe journey in hardware land. Feeding a regulator IC with /dev/urandom output is not such a bad way to play Russian Roulette with your system, I guess.
    2.3) Charger ICs. Another great invention. Of course most of these are generic as well, so they do not know anything about your particular system until system software programs adequate values. If you've got some (usually embedded or mobile) system with a Li-ion charger IC, it's a really cool idea to "improve" the operational limits to something more fun than all these boring correct/safe values. Exceeding the charging current would make your battery charge faster! And exceeding the charge voltage would make the battery store even more energy. The "only" issue is that the battery can go boom. But absolute maximum ratings are meant for cowards, aren't they?!
    2.4) Smart batteries. These are often found in laptops on an I2C bus. Try to hack into the smart battery and adjust the operational limits just as described above; the same idea applies. Their devs are often badass, though, and try to protect against such scenarios by demanding some silly auth keys and suchlike. Fortunately, many smart battery IC devs have a really good sense of humor, so if you bother the smart battery controller for long enough, feeding it bad management keys, the controller could lose its patience and ... permanently disable the battery in a way which isn't easy to fix, often by triggering a powerful FET to cause an artificial "short circuit", which in turn permanently blows a fuse, making the battery useless. Needless to say, you have to be rather competent in electronics to find that fuse, replace it and reset the controller's state. Or just replace your expensive laptop battery and try again. But of course, the jackpot is only awarded for exceeding absolute maximum ratings. Just killing the battery is boring.

    3) Fan controllers/PWMs. Another nice invention. If you can figure out how to control that damn blower in a particular system, you can make it really quiet by setting the fan speed to 0. Do it for all the fans you can. Let them shut up! And if your system's TDP is high enough, you're in a good position to win the jackpot. Nvidia once released a buggy driver which proved this is a really promising way to get rid of your old GPU. Needless to say, everyone with direct hardware access can try this trick as well, after figuring out some things. Interestingly, I once observed a firmware bug in a hardware fan controller from Cooler Master where its firmware failed to start. Then no coolers were spinning at all. So it looks like at least 2 vendors already got this sweet idea of making systems really silent by stopping all fans. Who cares about all these silly TDPs?!
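
For those who would rather not win the jackpot: the common thread in 2.2, 2.3 and 3 is that a generic IC accepts whatever it is told, so any sanity check has to live in software. Below is a minimal sketch of such a check. Everything in it is hypothetical: the `Limit` type, the `checked_value` helper and the numbers are placeholders for illustration, and real limits have to come from the datasheets of the actual parts on the actual board.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limit:
    """Allowed range for one programmable setting, in its natural unit."""
    name: str
    minimum: float
    maximum: float


def checked_value(limit: Limit, requested: float) -> float:
    """Return the requested value unchanged, or raise if it is out of range.

    The comparison is written so that NaN is rejected as well.
    """
    if not (limit.minimum <= requested <= limit.maximum):
        raise ValueError(
            f"{limit.name}: {requested} is outside "
            f"[{limit.minimum}, {limit.maximum}]"
        )
    return requested


# Placeholder limits for an imaginary board (assumptions, not datasheet values).
CHARGE_VOLTAGE = Limit("charge voltage (V)", 3.0, 4.2)
FAN_DUTY = Limit("fan duty (%)", 20.0, 100.0)
```

The idea is to pass every value through `checked_value` before it reaches the bus, so a request for 0% fan duty or an oversized charge voltage fails loudly instead of being written to the chip. Refusing rather than silently clamping is a design choice here; clamping hides the bug that produced the bad value.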



    • #22
      Great! Lots of thank You. Will have to try some of those.
      Will see if anything will catch fire.

      Actually modern CPUs will survive removing the fan while they are running. GPUs may be more sensitive, though. Actually my fan does not start once in a blue moon (some ACPI issue?) and the box is still working.

      On the other hand, it should not be possible to Flash / EEPROM in modern boxes with Secure Boot, the Secure Boot specification says that the hardware should prevent it.



      • #23
        Originally posted by Mat2
        ... GPUs may be more sensitive, though.
        My experience is the opposite. The max temperatures for GPUs are usually higher than for CPUs. IIRC, for CPUs the upper limit is around 70 deg Celsius, while GPUs are around 90 degrees. The AMD Radeon cards are usually not "sensitive". I can't say much about NVIDIA.

        My personal experience with an AMD GPU (Radeon 6950, overclocked for Bitcoin mining) is that the absolute max rating is as high as 105 degrees. My card once briefly went to 102 degrees (always Celsius). After that "event" I ran the card at 85 - 95 degrees for extended periods of time (months). It is still in use and working absolutely fine.



        • #24
          so there's that

          If you boot Memtest on a Mac, the fan will not turn on, no matter how hot the system gets...
          Memtest86+ is an advanced, free, open-source, stand-alone memory tester for 32- and 64-bit computers (UEFI & BIOS supported)


          Just recently I tried Lubuntu on an old laptop with an analog display, and when it resumed from suspend, the refresh rate was out of tolerance and it left some lines permanently burned into the screen (though some of them faded away after a couple of days).



          • #25
            Originally posted by Mat2
            Great! Lots of thank You. Will have to try some of those.
            Will see if anything will catch fire.
            Good luck with it.

            Actually modern CPUs will survive removing the fan while they are running. GPUs may be more sensitive, though.
            Interestingly, I recently did such a test with AMD GPUs of the 57x0 series, and they proved to have at least some overheat protection. The thermal grease had dried out and stopped doing its job, making the heatsink far less efficient than it is supposed to be (which should be comparable to a fan stop, I guess). Interestingly, the cards haven't died. Instead, an ACPI profile activated when the GPU temperature exceeded 70C, and the GPU downclocked to the absolute minimum it can do. The fan ran at full speed, making a horrible noise, but it was worth nothing: since the grease had gone dry, heat just was not transferred to the heatsink, so even a horrible amount of airflow made little difference. So, the card hasn't died. But even displaying the desktop lagged all the time, since the GPU seemed to be throttling all the time. After scratching my head over why that system lagged so much for no obvious reason and made such horrible fan noise, I noticed the GPU exceeded 70C even on light loads and had to replace the grease (TBH I hadn't even thought video cards may need something like this). So at least the card's default attitude is somewhat failsafe. But again, voltages and frequencies are a matter of BIOS tables. And AMD's binary driver even gives you the means to overclock the card ("ADL"). Not sure you can completely fry it in an efficient and quick way, but you can surely get to the point where it loses stability. IIRC, overclocking is not covered by warranty, so...
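
To make the failsafe behaviour above concrete, here is a minimal sketch of a trip-point throttle. It is not how AMD's firmware or driver actually works; the only number taken from what I observed is the 70C trip point. The class name, the clock values and the lower release temperature (hysteresis, so the clock does not flap around the trip point) are assumptions for illustration.

```python
class ThermalGovernor:
    """Toy trip-point throttle with hysteresis (illustrative only)."""

    def __init__(self, trip_c: float = 70.0, release_c: float = 60.0,
                 normal_mhz: int = 850, minimum_mhz: int = 150) -> None:
        if release_c >= trip_c:
            raise ValueError("release temperature must be below the trip point")
        self.trip_c = trip_c
        self.release_c = release_c
        self.normal_mhz = normal_mhz
        self.minimum_mhz = minimum_mhz
        self.throttled = False

    def update(self, temp_c: float) -> int:
        """Feed one temperature sample, get back the clock to run at."""
        if temp_c > self.trip_c:
            self.throttled = True       # too hot: drop to the minimum clock
        elif temp_c < self.release_c:
            self.throttled = False      # cooled down enough: restore the clock
        # Between release_c and trip_c the previous state is kept.
        return self.minimum_mhz if self.throttled else self.normal_mhz
```

With dried grease the temperature never drops below the release point under any load, so a governor like this stays throttled permanently, which matches the constant lag described above.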

            Actually my fan does not start once in a blue moon (some ACPI issue?) and the box is still working.
            Hard to tell why it happens; it can be either a software or a hardware issue.

            On the other hand, it should not be possible to Flash / EEPROM in modern boxes with Secure Boot, the Secure Boot specification says that the hardware should prevent it.
            I wouldn't rely on this too much.
            1) In ARM/MIPS systems there are often no such restrictions at all.
            2) There could be "miscellaneous" ICs with some "secondary" crap. It can still be a rather unpleasant experience if they become empty, depending on why a particular IC hangs on a particular bus and what it contains.
            3) It can turn out that the hardware is imperfect in this regard, allowing the protection to be overridden one way or another, or the secure boot crap to simply be fooled.
            4) Implementations of secure boot in proprietary firmware tend to be low quality, and from what I can observe, they are mostly focused on locking everyone but MS out of their PC systems rather than actually making things in any way "secure". In fact, UEFI firmwares were so buggy that a mere OS reinstall could completely ruin some notebooks if the OS did something with UEFI variables. While there is nothing wrong with touching variables, some UEFI firmwares proved to be buggy to the degree that they can't handle edge cases or rare operations (like garbage collection when the flash IC area is getting full, etc). I remember you could win the jackpot on some Samsung notebooks just by reinstalling the OS, be it Windows or Linux. At some point the firmware would ruin its own variables area and ... refuse to boot at all, making the device fairly useless.
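
To illustrate the edge case in point 4, here is a toy model of a variable area on append-only flash. It is a sketch under stated assumptions, not a description of any real firmware: I assume that overwriting a variable leaves the old copy behind as dead space until garbage collection runs, and that the firmware needs some free space kept in reserve. `VariableStore`, `StoreFull` and all sizes are made up.

```python
class StoreFull(Exception):
    """Raised when a write would eat into the reserved free space."""


class VariableStore:
    """Toy append-only variable area with explicit garbage collection."""

    def __init__(self, capacity: int, reserve: int) -> None:
        self.capacity = capacity
        self.reserve = reserve
        self.live: dict[str, int] = {}  # variable name -> size of current copy
        self.dead = 0                   # bytes held by stale, superseded copies

    def free(self) -> int:
        return self.capacity - sum(self.live.values()) - self.dead

    def garbage_collect(self) -> None:
        """Reclaim the space held by stale copies."""
        self.dead = 0

    def write(self, name: str, size: int) -> None:
        """Append a new copy of a variable; the old copy becomes dead space."""
        if self.free() - size < self.reserve:
            self.garbage_collect()      # the rare path that has to work
        if self.free() - size < self.reserve:
            raise StoreFull(f"cannot write {name!r}: reserve would be violated")
        self.dead += self.live.get(name, 0)
        self.live[name] = size
```

In this model, rewriting the same variable over and over is harmless only because `garbage_collect` works and `write` refuses to go below the reserve. Remove either of those and repeated writes eventually fill the area, which is the kind of state an OS reinstall could push a buggy firmware into.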

            And I can tell for sure I've *seen* some hardware systems where a generic regulator IC can output far more voltage than the absolute maximum ratings of the system's CPU and other ICs permit. Needless to say, such a system could actually die after being sent the right commands. Most ICs are unable to survive having their absolute maximum voltages exceeded by 300+%.

            Then I have to admit some systems have an inherently software-killable design. Imagine an ESC (electronic speed controller, BLDC controller, etc). Usually a microcontroller would just "directly" toggle powerful FETs to switch the motor windings in the correct sequence under software (firmware) control. This allows it to adjust motor speed on request and keep the commutation of windings in step with the actual rotation speed, etc, while keeping the circuitry simple and compact (i.e. the circuit mostly consists of a uC and FETs). But there is one catch. If the firmware does something wrong, it's completely possible to leave a powerful FET switched on (conducting) forever. Since the FET and winding resistance is negligible, it's the switching that keeps the system safe: the winding inductance limits the current as long as the current is pulsed. If a FET is left on forever, the current increases until it is limited only by the low resistance of the winding and FET. This mode is close to a short circuit. If no countermeasures were taken, the motor and/or FET would be fried or the power supply would face overload. In the best case it is caught by a fuse or current limiting circuitry. In the worst case the system could actually be fried. Needless to say, it makes debugging such firmware tricky. But it's not the worst example of software failure. There are far worse scenarios all around. Microprocessors and microcontrollers proved to be a universal way to solve various engineering problems efficiently and simply. But then it turned out many new problems arose. You see, if a train interlocking system computes allowed speeds in software, it does not even have to be kernel mode to cause an impressive amount of "hardware" damage, should something go wrong...
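
The "pulsed vs. stuck on" argument can be put into numbers with the textbook step response of a series RL circuit, $i(t) = \frac{V}{R}\left(1 - e^{-Rt/L}\right)$. The sketch below is a simplification under loud assumptions: a single pulse starting from zero current, no back-EMF from the spinning motor, and made-up component values (12 V supply, 0.05 ohm total for winding plus FET, 20 uH winding inductance).

```python
import math


def winding_current(volts: float, ohms: float, henries: float,
                    seconds: float) -> float:
    """Current in a series RL circuit a given time after the FET turns on."""
    return volts / ohms * (1.0 - math.exp(-ohms * seconds / henries))


# Made-up example values (assumptions, not taken from any real ESC).
SUPPLY_V = 12.0
RESISTANCE_OHM = 0.05
INDUCTANCE_H = 20e-6

# A short 10 microsecond on-pulse versus a FET that never switches off.
pulse_current = winding_current(SUPPLY_V, RESISTANCE_OHM, INDUCTANCE_H, 10e-6)
stuck_current = winding_current(SUPPLY_V, RESISTANCE_OHM, INDUCTANCE_H, 1.0)
```

With these values the 10 us pulse reaches roughly 6 A, while the stuck-on case settles at V/R = 240 A. The inductance only slows the rise; it puts no ceiling on the final value, which is why stopping the switching is what kills the hardware.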



            • #26
              Originally posted by 0xBADCODE
              4) Implementations of secure boot in proprietary firmware tend to be low quality, and from what I can observe, they are mostly focused on locking everyone but MS out of their PC systems rather than actually making things in any way "secure". In fact, UEFI firmwares were so buggy that a mere OS reinstall could completely ruin some notebooks if the OS did something with UEFI variables. While there is nothing wrong with touching variables, some UEFI firmwares proved to be buggy to the degree that they can't handle edge cases or rare operations (like garbage collection when the flash IC area is getting full, etc). I remember you could win the jackpot on some Samsung notebooks just by reinstalling the OS, be it Windows or Linux. At some point the firmware would ruin its own variables area and ... refuse to boot at all, making the device fairly useless.
              The UEFI firmware required some free variable space at boot to store some temporary data.

              Then I have to admit some systems have an inherently software-killable design. Imagine an ESC (electronic speed controller, BLDC controller, etc). Usually a microcontroller would just "directly" toggle powerful FETs to switch the motor windings in the correct sequence under software (firmware) control. This allows it to adjust motor speed on request and keep the commutation of windings in step with the actual rotation speed, etc, while keeping the circuitry simple and compact (i.e. the circuit mostly consists of a uC and FETs). But there is one catch. If the firmware does something wrong, it's completely possible to leave a powerful FET switched on (conducting) forever. Since the FET and winding resistance is negligible, it's the switching that keeps the system safe: the winding inductance limits the current as long as the current is pulsed. If a FET is left on forever, the current increases until it is limited only by the low resistance of the winding and FET. This mode is close to a short circuit. If no countermeasures were taken, the motor and/or FET would be fried or the power supply would face overload. In the best case it is caught by a fuse or current limiting circuitry. In the worst case the system could actually be fried. Needless to say, it makes debugging such firmware tricky. But it's not the worst example of software failure. There are far worse scenarios all around. Microprocessors and microcontrollers proved to be a universal way to solve various engineering problems efficiently and simply. But then it turned out many new problems arose. You see, if a train interlocking system computes allowed speeds in software, it does not even have to be kernel mode to cause an impressive amount of "hardware" damage, should something go wrong...
              Like the centrifuges in Iran.
              I would argue that it is not comparable to a short circuit. In a correct design, the inductance should limit the current to an allowable value.

              That's why such systems are usually designed to be formally verified.
              BTW, I am still wondering when someone will do a formal proof of LibreSSL.

              EDIT: With hardware systems there were also things that could go wrong.
              I have heard of an instance when a faulty pipe in a high voltage switch in an electric power plant made the generator fly out of its hall.
              (and everything turned out to be done correctly and according to the regulations).
              Last edited by Mat2; 01 September 2014, 04:24 PM.

