The phrase “extraordinary claims require extraordinary evidence” is most often attributed to Carl Sagan, specifically from his television series Cosmos. Sagan was probably not the first person to put forward such a hypothesis, and the show certainly didn’t claim he was. But that’s the power of TV for you; the term has since come to be known as the “Sagan Standard” and is a handy aphorism that nicely encapsulates the importance of skepticism and critical thinking when dealing with unproven theories.
It also happens to be the first phrase that came to mind when we heard about Obfuscation Revealed: Leveraging Electromagnetic Signals for Obfuscated Malware Classification, a paper presented during the 2021 Annual Computer Security Applications Conference (ACSAC). As described in the mainstream press, the paper detailed a method by which researchers were able to detect viruses and malware running on an Internet of Things (IoT) device simply by listening to the electromagnetic waves being emanated from it. One needed only to pass a probe over a troubled gadget, and the technique could identify what ailed it with near 100% accuracy.
Those certainly sound like extraordinary claims to us. But what about the evidence? Well, it turns out that digging a bit deeper into the story uncovered plenty of it. Not only has the paper been made available for free thanks to the sponsors of the ACSAC, but the team behind it has released all of the code and documentation necessary to recreate their findings on GitHub.
Unfortunately we seem to have temporarily misplaced the $10,000 1 GHz Picoscope 6407 USB oscilloscope that their software is written to support, so we’re unable to recreate the experiment in full. If you happen to come across it, please drop us a line. But in the meantime we can still walk through the process and try to separate fact from fiction in classic Sagan style.
Baking a Malware Pi
The best way of understanding what this technique is capable of, and further what it’s not capable of, is to examine the team’s test rig. In addition to the aforementioned Picoscope 6407, the hardware configuration includes a Langer PA-303 amplifier and a Langer RF-R H-Field probe that’s been brought to rest on the BCM2837 processor of a Raspberry Pi 2B. The probe and amplifier were connected to the first channel of the oscilloscope as you might expect, but interestingly, the second channel was connected to GPIO 17 on the Pi to serve as the trigger signal.
As explained in the project’s Wiki, the next step was to intentionally install various rootkits, malware, and viruses onto the Raspberry Pi. A wrapper program was then used that would first trigger the Picoscope over the GPIO pin, and then run the specific piece of software under examination for a given duration. This process was repeated until the team had amassed tens of thousands of captures for various pieces of malware including bashlite, mirai, gonnacry, keysniffer, and maK_it. This gave them data on what the electromagnetic (EM) output of the Pi’s SoC looked like when its Linux operating system had become infected.
But critically, they also performed the same data acquisition on what they called a “benign” dataset. These captures were made while the Raspberry Pi was operating normally and running tools that would be common for IoT applications. EM signatures were collected for well known programs and commands such as mpg123, wget, tar, more, grep, and dmesg. This data established a baseline for normal operations, and gave the team a control to compare against.
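To make the capture procedure a bit more concrete, here is a minimal sketch of what such a wrapper could look like. This is not the team’s actual code: the function name `capture_run`, its parameters, and the `set_trigger` callback are all hypothetical, and the GPIO handling is deliberately left to the caller so the sketch runs on any machine.

```python
import subprocess
from typing import Callable, Sequence


def capture_run(
    command: Sequence[str],
    duration_s: float,
    set_trigger: Callable[[bool], None],
) -> int:
    """Raise the trigger line, run `command` for at most `duration_s`, then lower it.

    `set_trigger` is a hypothetical callback that drives the pin wired to the
    oscilloscope's trigger channel (GPIO 17 in the rig described above).
    """
    set_trigger(True)  # rising edge tells the scope to start recording
    proc = subprocess.Popen(command)
    try:
        proc.wait(timeout=duration_s)
    except subprocess.TimeoutExpired:
        proc.kill()  # capture window is over, stop the program under test
        proc.wait()
    finally:
        set_trigger(False)  # always release the trigger, even on errors
    return proc.returncode
```

On a real Pi, `set_trigger` would wrap whatever GPIO library is in use. Passing it in as a callback also means the same wrapper can be exercised on a desktop with a stand-in such as a list’s `append` method.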
Crunching the Numbers
As explained in section 5.3 of the paper, Data Analysis and Preprocessing, the raw EM captures need to be cleaned up before any useful data can be extracted. As you can imagine, the probe picks up a cacophony of electronic noise at such close proximity. The goal of the preprocessing stage is to filter out as much of the background noise as possible, and identify the telltale frequency fluctuations and peaks that correspond to individual programs running on the processor.
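For readers who haven’t worked with spectrograms before, the general idea can be sketched in a few lines of NumPy. To be clear, this is a generic illustration and not the paper’s preprocessing pipeline: the window size, hop length, and the median-subtraction step are our own assumptions, as are the function names.

```python
import numpy as np


def spectrogram(trace, window=256, hop=128):
    """Short-time Fourier magnitudes, shape (frames, window // 2 + 1)."""
    trace = np.asarray(trace, dtype=float)
    taper = np.hanning(window)  # reduces spectral leakage at frame edges
    starts = range(0, len(trace) - window + 1, hop)
    frames = np.stack([trace[s:s + window] * taper for s in starts])
    return np.abs(np.fft.rfft(frames, axis=1))


def suppress_background(spec, background):
    """Subtract the per-bin median of a background capture, clipping at zero.

    This is an assumed, simplistic noise-floor estimate, not the method
    used in the paper.
    """
    floor = np.median(background, axis=0)
    return np.clip(spec - floor, 0.0, None)
```

Each row of the result is one slice of time and each column one frequency bin, which is what lets the cleaned-up output be handed to an image-style classifier.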
The resulting cleaned up spectrograms were then put through a neural network designed to classify the EM signatures. In much the way a computer vision system is able to classify objects in an image based on its training set, the team’s software demonstrated an uncanny ability to pick out what type of software was running on the Pi when presented with a captured EM signature.
When asked to classify a signature as ransomware, rootkit, DDoS, or benign, the neural network had an accuracy of better than 98%. Similar accuracy was achieved when the system was tasked with drilling down and determining the specific type of malware that was running. This meant the system was not only capable of detecting if the Pi was compromised, but could even tell the difference between a gonnacry and a bashlite infection.
Accuracy took a considerable hit when attempting to identify the specific binary being executed, but the system still managed a respectable 82.28%. Perhaps most impressively, the team claims an accuracy of 82.70% when attempting to distinguish between various types of malware even when attempts were made to actively obfuscate their execution, such as running them in a virtualized environment.
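It’s worth pausing on why accuracy depends so heavily on how fine-grained the question is. The sketch below scores the same set of predictions at two levels of detail. The family-to-type mapping is our own assumption based on how these samples are generally described, not something taken from the paper, and the helper names are hypothetical.

```python
# Assumed mapping from malware family to broad type (not from the paper).
FAMILY_TO_TYPE = {
    "bashlite": "ddos",
    "mirai": "ddos",
    "gonnacry": "ransomware",
    "keysniffer": "rootkit",
    "maK_it": "rootkit",
    "benign": "benign",
}


def accuracy(truth, predicted):
    """Fraction of predictions that exactly match the true label."""
    if len(truth) != len(predicted):
        raise ValueError("label lists must be the same length")
    hits = sum(t == p for t, p in zip(truth, predicted))
    return hits / len(truth)


def coarsen(labels):
    """Collapse family-level labels down to their broad type."""
    return [FAMILY_TO_TYPE[label] for label in labels]
```

A classifier that mistakes bashlite for mirai is wrong at the family level but still right at the type level, since both map to the same class here. Comparing `accuracy(truth, predicted)` against `accuracy(coarsen(truth), coarsen(predicted))` shows that the coarse score can never be lower than the fine one.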
Realistic Expectations
While the results of the experiment are certainly compelling, it’s important to stress that this all took place under controlled and ideal conditions. At no point in the paper is it claimed that this technique, at least in its current form, could actually be used in the wild to determine if a computer or IoT device has been infected with malware.
At the absolute minimum, data would need to be collected on a much wider array of computing devices before you could even say if this idea has any practical application outside of the lab. For their part, the authors say they chose the Pi 2B as a sort of “boilerplate” device, believing its 32-bit ARM processor and vanilla Linux operating system provided a reasonable stand-in for a generic IoT gadget. That’s a logical enough assumption, but there are still far too many variables at play to say that any of the EM signatures collected on the Pi test rig would be applicable to a random wireless router pulled off the shelf.
Still, it’s hard not to come away impressed. While the researchers might not have created the IT equivalent of the Star Trek medical tricorder, a device that you can simply wave over the patient to instantly see what malady of the week they’ve been struck by, it certainly seems like they’re tantalizingly close.