jade_opw
2014-07-19 06:01 pm

Caveman Reboot

WARNING: Do your research and make sure you know what you're doing. The commands found in this article could seriously affect your machine and I print them here with no warranty. I am not responsible for your bricked hardware. The information contained here is for entertainment and historical purposes only. You have been warned.

I, like many other people, enjoy changing the scenery from time to time and working on a laptop. Now, since I work on kernel drivers, I sometimes face some unique issues when working remotely that you just wouldn't think about when doing application development or web development. One such issue, and the subject of this article, is the problem of hanging on reboot.

tl;dr - If you're unsure whether your machine is going to reboot properly while you're working remotely, echo as root the characters r, s, u, b, separated by a few seconds each, to /proc/sysrq-trigger to sync your disks, remount them read-only and reboot by any means necessary. If you were physically at the keyboard, this is your normal magic sysrq sequence, as in the mnemonic "(r)aising (e)lephants (i)s (s)o (u)tterly (b)oring", with the "e" and "i" taken out so as not to kill your ssh session while you're entering commands, which turns the mnemonic into "(r)emote (s)sh (u)nquestionably (b)oot." There is another method detailed below that amounts to the same thing, with less platform portability but a nonetheless interesting history.
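For the copy-and-paste inclined, here is a minimal sketch of that sequence as a shell function. The function name and the SYSRQ_TRIGGER and SYSRQ_DELAY variables are my own inventions for this example (the kernel knows nothing about them); they only exist so you can point the function at a scratch file and try it without rebooting anything.

```shell
#!/bin/sh
# Sketch only: walks through the r, s, u, b sequence from the tl;dr.
# SYSRQ_TRIGGER and SYSRQ_DELAY are made-up names for this example.
# Against the real /proc/sysrq-trigger this must run as root.
remote_rsub() {
    rsub_trigger="${SYSRQ_TRIGGER:-/proc/sysrq-trigger}"
    rsub_delay="${SYSRQ_DELAY:-3}"   # "a few seconds" between keys

    # r: un-raw the keyboard, s: sync disks,
    # u: remount filesystems read-only, b: reboot immediately
    for rsub_key in r s u b; do
        echo "sysrq: $rsub_key"
        printf '%s' "$rsub_key" > "$rsub_trigger" || return 1
        sleep "$rsub_delay"
    done
}

# Dry run against a scratch file (nothing reboots):
#   SYSRQ_TRIGGER=/tmp/fake-trigger SYSRQ_DELAY=0 remote_rsub
# The real thing, as root, with the defaults:
#   remote_rsub
```

Don't expect to see your prompt again after the final "b"; the machine goes down without telling your ssh session goodbye.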

What if I try to reboot while the hardware is in an undefined state because my driver oops'd, bug'd or hit a general protection fault? I'm not sure if my box is going to hang while waiting on some kernel subsystem to dealloc or drop a reference which has been scribbled on while my driver was in its death throes. I'm sitting in a cafe or on a friend's couch right now and I'd really like it if I didn't have to walk 45 minutes back to my apartment to hit a physical button.

There should be a better way to deal with an unknown kernel state than turning it off and on again manually! This is 2014! Well, look no further than 1984 and the brand new IBM System Unit 5170, also known as the IBM Personal Computer AT.

PC history time! The IBM Personal Computer AT was the third major model in the Personal Computer line. It introduced new features and new hardware, crucially the Intel 80286 processor as its CPU. The architecture of this model was so popular that it inspired many clones and created the lineage of x86 systems that we still use today, largely feature compatible down to the keyboard controller - the 8042.

The 8042 keyboard controller is a small programmable microcontroller whose main use was the new bi-directional keyboard protocol used with the 84-key keyboard (which, coincidentally, also introduced the sysrq key), allowing the system to do some basic hardware tests and pass status information to the keyboard. This "AT" protocol went on to become the "PS/2" protocol which in turn, through the magic of backwards compatibility, is still the base command set we use to communicate with USB keyboards and turn on status indicators like your keyboard's CapsLock LED.

The history of the 8042 would be extremely boring if not for its secondary usage in the original IBM 5170's architecture. The 8042 is a fully fledged 8-bit microcontroller with plenty of I/O pins (electrical on/off switches) which were used to electrically interface with the keyboard. The keyboard interface didn't use up all the pins available on the 8042 and seeing as it was a perfectly good chip going onto the motherboard anyways, the designers of the 5170 decided to reuse the chip for two additional functions as a cost saving measure. First was controlling the "A20 gate" backwards compatibility function, a feature with much history which has been written about at length elsewhere. Second was the "system reset" line.

As the 8042 was on the system bus of the IBM 5170, its features have been carried forward on all compatible x86 systems. Today you can find it in I/O port space through /dev/port, and the same port addresses found in technical manuals from 1984 can still be used to pulse the system reset line. Simply

echo -en "\xfe" | dd of=/dev/port seek=100 bs=1 count=1

and your machine will quickly reboot without doing any checks which might hang. Further reading in the sysrq section of the kernel source reveals that this method is actually still used, albeit through a different interface, as the second option, along with a few others, for emergency reboot if the more modern ACPI interface fails or is unavailable for some reason.
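To unpack the magic numbers in that one-liner: seek=100 is decimal for 0x64, the 8042's command port, and 0xFE is the controller command that pulses the reset line. Below is a rough sketch of the same write with the constants spelled out. The function name and the PORT_DEV variable are hypothetical, added so the write can be aimed at an ordinary file for a harmless dry run. conv=notrunc is my addition as well; it keeps dd from truncating a regular file during that dry run, and I'm assuming it makes no difference on the real device.

```shell
#!/bin/sh
# Sketch only: the same port write as the dd one-liner, with names attached.
# PORT_DEV is a made-up variable for this example; the default is the real
# /dev/port, which needs root and WILL reset the machine immediately.
pulse_reset() {
    reset_dev="${PORT_DEV:-/dev/port}"
    reset_port=$((0x64))   # 8042 command port, decimal 100

    # '\376' is octal for 0xFE, the "pulse the reset line" command.
    # Octal is used because \x escapes are not portable across printf builds.
    printf '\376' |
        dd of="$reset_dev" bs=1 seek="$reset_port" count=1 conv=notrunc 2>/dev/null
}

# Dry run against a scratch file (nothing reboots):
#   dd if=/dev/zero of=/tmp/fake-port bs=1 count=256
#   PORT_DEV=/tmp/fake-port pulse_reset
```

After a dry run, byte 100 of the scratch file should read fe, which is exactly what the real controller would have received.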

This is a stupid way to reboot your system yourself. Always use /proc/sysrq-trigger instead and allow the kernel to at least sync the disks and remount them read-only (details in the tl;dr section above). But if you're not working on a mission critical machine and you want to do something a little foolhardy to cheer yourself up after a long, frustrating day/week of hunting bugs, think of simpler times and the 8042 keyboard controller while you reboot your system the caveman way.

Original Source for this Method: How to reboot a Linux server stuck into the Big Kernel Lock
Further Reading: A short history of the PS/2 controller
IBM 51xx - Manuals specifically: IBM 5170 - Technical Reference - 1502243 - MAR84