RF Fingerprinting with an SDR
The growth in Radio Frequency (RF) devices driven by the Internet of Things (IoT) has created a need to authenticate these devices efficiently and reliably, especially since traditional authentication methods are easily compromised through identity spoofing. These challenges have motivated research into RF-fingerprinting as a potential solution. RF-fingerprinting relies on detecting minute imperfections in individual transmitters and uses these imperfections to create a unique identifier for each transmitter. Like many classification problems, fingerprinting has traditionally relied on hand-crafted features, which are expensive to design and unreliable. Recently, researchers have shifted their focus toward artificial intelligence and machine learning (AI/ML) techniques, which could lead to more efficient and more accurate identification without the costly feature design. Researchers here at GA-CCRi were curious whether RF-fingerprinting would be possible with simple equipment.
A recent post by Nihal Pasham laid out a quick-and-dirty way to begin developing RF-fingerprinting with modern machine learning. In his post, Nihal used a Software Defined Radio (SDR) to record signals from cheap devices, such as a car key fob, and tried to identify each individual transmitter. While his code can be found in the link above, he did not post any results. We really liked the idea of using an SDR as a test-bed for ML-based RF-fingerprinting algorithms. Specifically, we wanted to determine whether identical car key fobs could be identified with an SDR front-end, some signal processing, and a convolutional neural net. Below is a diagram of a potential ML-based RF-fingerprinting system. In this post we focus on the signal processing and ML models.
Key fobs are Radio Frequency Identification transmitters that are used to lock and unlock your car. When a command button is pressed, a modulated radio wave, typically at around 315 MHz in the U.S., is broadcast into the world. If you are close enough to your car, this signal is received and demodulated, and the desired action is performed. These transmitters are good candidates for RF-fingerprinting since they are ubiquitous, and most people have a pair of identical transmitters. To begin to identify a key fob, we first need a way to record the signal and get it onto a computer. For this we use a software defined radio (SDR).
Signal Exploration with an SDR
SDRs are great for quick experiments because they are cheap and flexible in their application. The most commonly used SDR is the RTL-SDR, because it is inexpensive and has good software support. The RTL-SDR provides a general-purpose receiver that can tune up to about 1.7 GHz and record a bandwidth of about 2.4 MHz.
Most key fobs in the U.S. transmit at approximately 315 MHz (there are some exceptions, which transmit on 433.92 MHz). For our first tests, we used a key fob from one of our engineers. Using pyrtlsdr, we recorded a one-second time interval and captured the key fob unlock signal, which is about 0.2 seconds long. RF data is multidimensional in nature, so we need to use multiple views of the data to get a more complete picture. Perhaps the most common way to visualize RF data is to plot the amplitude of the signal over time, but RF data also contains phase information (often represented as a complex number and visualized by plotting the real and imaginary components), as well as power and frequency information. Usually power and frequency are plotted against each other, which lets you see how the signal's power is distributed across the frequency components that make it up (called a Power Spectral Density (PSD) plot).
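As a rough illustration of these views, here is a minimal Python sketch. The capture function uses pyrtlsdr's RtlSdr class, but the tuning values, the function names, and the simple Hann-windowed periodogram are our assumptions for illustration, not the exact settings or code used for the recordings in this post. A single one-second read may need to be split into smaller chunks on some systems. The demo runs on a synthetic tone so that it works without hardware.

```python
import numpy as np


def capture_one_second(center_freq_hz=315e6, sample_rate_hz=2.4e6):
    """Record roughly one second of complex IQ samples from an RTL-SDR.

    Requires RTL-SDR hardware and the pyrtlsdr package, so it is not called
    in the demo below. The tuning values are assumptions.
    """
    from rtlsdr import RtlSdr  # lazy import: the rest runs without hardware

    sdr = RtlSdr()
    try:
        sdr.sample_rate = sample_rate_hz
        sdr.center_freq = center_freq_hz
        sdr.gain = "auto"
        return sdr.read_samples(int(sample_rate_hz))
    finally:
        sdr.close()


def signal_views(iq, sample_rate_hz):
    """Return amplitude, phase, and a windowed-periodogram PSD estimate."""
    iq = np.asarray(iq)
    amplitude = np.abs(iq)   # envelope over time
    phase = np.angle(iq)     # wrapped phase in radians
    window = np.hanning(len(iq))
    spectrum = np.fft.fftshift(np.fft.fft(iq * window))
    freqs_hz = np.fft.fftshift(np.fft.fftfreq(len(iq), d=1.0 / sample_rate_hz))
    psd = np.abs(spectrum) ** 2 / (sample_rate_hz * np.sum(window ** 2))
    psd_db = 10.0 * np.log10(psd + 1e-20)  # small offset avoids log(0)
    return amplitude, phase, freqs_hz, psd_db


# Demo on a synthetic tone 100 kHz above the tuned center frequency.
fs = 2.4e6
n = np.arange(4096)
tone = np.exp(2j * np.pi * 100e3 * n / fs)
amplitude, phase, freqs_hz, psd_db = signal_views(tone, fs)
peak_hz = freqs_hz[np.argmax(psd_db)]
print(f"PSD peak at {peak_hz / 1e3:.1f} kHz offset from center")
```

The frequency axis is relative to the tuned center frequency, so a fob transmitting slightly off 315 MHz shows up as a peak offset from zero.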
Zooming in on the time-domain signal as shown below, you can see its binary amplitude modulation (on-off keying). To demodulate this, we would take the amplitude (green), threshold it, sample it, and get a binary signal out to unlock the car. But demodulation is not our goal. Instead, the message should be removed completely, since from the perspective of RF-fingerprinting it is noise that does not help differentiate devices. For the purpose of RF-fingerprinting, the tiny artifacts imprinted on this transmission by the radio are the signal of interest. We started feature engineering on those artifacts.
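For reference, the demodulation steps just described (amplitude, threshold, sample) can be sketched in a few lines of Python. This is a simplified illustration on a synthetic signal: the function name, the mid-level threshold, and the assumption that the bit timing is already known and aligned are ours, and a real receiver would also need clock recovery.

```python
import numpy as np


def ook_demodulate(iq, samples_per_bit):
    """Recover on-off-keyed bits: amplitude -> threshold -> sample.

    Assumes the capture starts on a bit boundary and the bit length is known.
    """
    amplitude = np.abs(iq)
    # Threshold halfway between the weakest and strongest envelope values.
    threshold = 0.5 * (amplitude.min() + amplitude.max())
    is_on = amplitude > threshold
    # Sample the thresholded envelope at the center of each bit period.
    centers = np.arange(samples_per_bit // 2, len(is_on), samples_per_bit)
    return is_on[centers].astype(int)


# Synthetic on-off-keyed burst with a little additive noise.
rng = np.random.default_rng(0)
sent_bits = np.array([1, 0, 1, 1, 0, 0, 1, 0])
samples_per_bit = 50
envelope = np.repeat(sent_bits, samples_per_bit).astype(float)
n = np.arange(len(envelope))
carrier = np.exp(2j * np.pi * 0.05 * n)
noise = 0.02 * (rng.standard_normal(len(n)) + 1j * rng.standard_normal(len(n)))
iq = envelope * carrier + noise

received_bits = ook_demodulate(iq, samples_per_bit)
print(received_bits)
```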
Through our research, we discovered a technique for RF-fingerprinting used by cell phone towers. The idea was to record the rise times at the beginning of each received transmission, and use this rise-time signature to identify each handset. In the binary AM case shown above, the rise-time of the carrier amplitude is prominent. By using an edge-detector to align each pulse, we characterize the rise-time amplitudes.
Shown below is a plot of several pulse leading edges aligned to each other. Let's marinate on this nice RC time constant; this feature has nothing to do with the message, and everything to do with the transmitter, so it is a great candidate for an RF-fingerprint feature. Furthermore, it is real-valued, about 32-64 samples long, and there are about 1000 of them per signal transmission. These are all great characteristics for feature vectors when used in conjunction with a neural net.
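A minimal sketch of this alignment step is shown below. It assumes a simple threshold-crossing edge detector and a synthetic RC-like envelope; the function name, window length, and number of pre-edge samples are illustrative choices, not the exact detector used to make the plot.

```python
import numpy as np


def aligned_rising_edges(amplitude, window=32, pre=4):
    """Cut a fixed-length window around every rising edge of the envelope.

    Each row starts `pre` samples before the threshold crossing, so all rows
    are aligned to the same point on the rise.
    """
    amplitude = np.asarray(amplitude, dtype=float)
    threshold = 0.5 * (amplitude.min() + amplitude.max())
    is_on = amplitude > threshold
    # A rising edge is an "off" sample immediately followed by an "on" sample.
    edges = np.flatnonzero(~is_on[:-1] & is_on[1:]) + 1
    starts = edges - pre
    starts = starts[(starts >= 0) & (starts + window <= len(amplitude))]
    if len(starts) == 0:
        return np.empty((0, window))
    return np.stack([amplitude[s:s + window] for s in starts])


# Synthetic envelope: five pulses, each rising with an RC-like time constant.
rng = np.random.default_rng(1)
tau = 8.0  # rise time constant in samples (assumed value)
amplitude = np.zeros(1000)
t = np.arange(100)
for k in range(5):
    start = 50 + 200 * k
    amplitude[start:start + 100] = 1.0 - np.exp(-t / tau)
amplitude = np.abs(amplitude + 0.002 * rng.standard_normal(len(amplitude)))

features = aligned_rising_edges(amplitude)
print(features.shape)  # one row per detected rising edge
```

Each row of `features` is one candidate feature vector. Because every row is cut relative to its own edge, the message bits no longer affect where the rise sits in the window.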
Most newer cars don't use digital AM for their key fob modulations. We also got our hands on a key fob for a newer vehicle. Below, we see that the pulse has a short gap in it, and that the PSD tells us that it's made up of two frequencies. To get a better look, we inspect the time series, shown below, and determine that it is binary frequency-shift keying (FSK).
To use something similar to the rise-time concept, we home in on the discontinuity around the bit flip. In the figure below, Plot 1 shows the real part of the complex time signal. Plot 2 shows the real vs. imaginary components. Taking the phase of this signal and unwrapping it leads to Plot 3. This clearly shows the two frequencies as two different (positive and negative) phase slopes. Since the signal has been mixed down to a frequency midway between the two carriers, the two frequencies have equal and opposite phase slopes. The change in phase around the bit flip is the part of interest, since it reveals an unwanted transient. Next, we took the first finite difference of the unwrapped phase, as shown in Plot 4, which gets close to the binary signal that represents the message. To focus on the phase discontinuity around the bit flip, we applied a median-filter-based edge detector to highlight the zero-crossings, as shown in Plot 5. This finally allows us to align the phase transients around the bit flips, shown in Plot 6. As with the digital AM signal from the E46 fob, this feature is a real-valued signal, about 64 samples long, and there are thousands of them per signal transmission, making them ideal candidates for an RF-fingerprint feature.
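The chain of steps from Plot 3 to Plot 6 can be sketched as follows, using a synthetic continuous-phase binary FSK signal. This is an approximation under stated assumptions: the running-median helper, the kernel size, and the choice to cut the windows from the phase difference (rather than from the unwrapped phase itself) are ours, and the helper needs NumPy 1.20 or later for `sliding_window_view`.

```python
import numpy as np


def running_median(x, kernel=9):
    """Median filter with an odd kernel; output has the same length as x."""
    pad = kernel // 2
    padded = np.pad(x, pad, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, kernel)
    return np.median(windows, axis=-1)


def phase_transients(iq, window=64, kernel=9):
    """Align windows of the phase difference around each bit flip."""
    phase = np.unwrap(np.angle(iq))        # two slopes, one per frequency
    dphi = np.diff(phase)                  # close to the binary message
    smooth = running_median(dphi, kernel)  # suppress noise before edge detection
    sign = np.sign(smooth)
    # A bit flip shows up as a sign change (zero crossing) of the smoothed signal.
    crossings = np.flatnonzero(sign[:-1] * sign[1:] < 0) + 1
    half = window // 2
    keep = (crossings - half >= 0) & (crossings + half <= len(dphi))
    crossings = crossings[keep]
    rows = [dphi[c - half:c + half] for c in crossings]
    features = np.stack(rows) if rows else np.empty((0, window))
    return crossings, features


# Synthetic binary FSK, already mixed down midway between the two tones.
rng = np.random.default_rng(2)
bits = np.array([1, 0, 1, 1, 0, 1, 0, 0])
samples_per_bit = 100
deviation = 0.02  # cycles per sample (assumed value)
freq = deviation * (2 * np.repeat(bits, samples_per_bit) - 1)
clean = np.exp(1j * 2 * np.pi * np.cumsum(freq))
noise = 0.01 * (rng.standard_normal(len(freq)) + 1j * rng.standard_normal(len(freq)))
iq = clean + noise

crossings, features = phase_transients(iq)
print(len(crossings), features.shape)
```

In this synthetic signal the flips are ideal steps, so the aligned windows only show the step itself. In a real recording, the same windows would contain the transmitter's transient around each flip.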
The real test of RF-fingerprinting is to be able to distinguish between two identical transmitters. Here, we leverage a pair of key fobs for the same vehicle. The same series of plots used for feature engineering on the earlier fob's signal is shown below. The signal is much noisier and the transient is much sharper, and thus harder to characterize. The other fob for the same car has a signal that looks identical to the human eye. Can machine learning distinguish them? We feed these features into a neural net to see if it can tell the difference between two identical fobs.
Neural Net and Results
We took three recordings of each fob at different locations with respect to our SDR, to generate different received power levels and signal-to-noise ratios. Each pulse was then turned into a set of 32-sample-long features through a series of pre-processing steps similar to those depicted above. The feature sets for fob 1 and fob 2 were trimmed to the same length and split in half into training/test datasets. The data sets were then fed into a shallow convolutional neural network (CNN) using the tflearn library with GPU support. The net was run for 1000 epochs on a Dell Inspiron laptop with an NVIDIA GTX 1650, which takes only about 20s (it takes much longer without GPU support). By providing the CNN with both training and testing data, the accuracy of the model can be tracked as the neural net evolves. Plots of the CNN's loss and accuracy vs. step are shown below, which indicate an accuracy of >95% in distinguishing between the two fobs.
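The sketch below shows one way the data preparation and a shallow 1-D CNN could look with TFLearn. It is not the network used for the results above: the layer sizes, the shuffle before splitting, and the function names are assumptions for illustration. The model builder imports TFLearn lazily and is not called here, so the data preparation runs with NumPy alone.

```python
import numpy as np


def make_datasets(feats_a, feats_b, seed=0):
    """Trim both feature sets to equal length, label them, and split in half."""
    n = min(len(feats_a), len(feats_b))
    x = np.concatenate([feats_a[:n], feats_b[:n]]).astype(np.float32)
    labels = np.concatenate([np.zeros(n, dtype=int), np.ones(n, dtype=int)])
    y = np.eye(2, dtype=np.float32)[labels]  # one-hot: fob 1 vs. fob 2
    # Shuffle before splitting (our assumption) so both halves mix both fobs.
    order = np.random.default_rng(seed).permutation(2 * n)
    x, y = x[order][..., np.newaxis], y[order]  # add a channel axis for conv_1d
    half = n  # half of the 2 * n examples
    return (x[:half], y[:half]), (x[half:], y[half:])


def build_cnn(feature_len=32):
    """Shallow 1-D CNN in TFLearn; layer sizes are illustrative guesses."""
    import tflearn
    from tflearn.layers.core import input_data, fully_connected
    from tflearn.layers.conv import conv_1d, max_pool_1d
    from tflearn.layers.estimator import regression

    net = input_data(shape=[None, feature_len, 1])
    net = conv_1d(net, 16, 3, activation="relu")
    net = max_pool_1d(net, 2)
    net = fully_connected(net, 32, activation="relu")
    net = fully_connected(net, 2, activation="softmax")
    net = regression(net, optimizer="adam", loss="categorical_crossentropy")
    return tflearn.DNN(net, tensorboard_verbose=0)


# Stand-in random features; real ones come from the pre-processing steps.
rng = np.random.default_rng(3)
feats_fob1 = rng.standard_normal((120, 32))
feats_fob2 = rng.standard_normal((100, 32))
(x_train, y_train), (x_test, y_test) = make_datasets(feats_fob1, feats_fob2)
print(x_train.shape, x_test.shape)

# With TFLearn installed, training would look roughly like:
# model = build_cnn()
# model.fit(x_train, y_train, n_epoch=1000,
#           validation_set=(x_test, y_test), show_metric=True)
```

Passing the test half as `validation_set` is what lets accuracy on held-out data be tracked during training.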
Below is a graph illustrating the neural net, courtesy of TensorBoard, which is integrated into TFLearn.
We demonstrated a proof-of-concept AI-based RF-fingerprinting algorithm that distinguished between two identical key fobs operating at 315 MHz with >95% accuracy. The system consisted of an RTL-SDR for the RF front-end and open-source Python-based libraries for the feature engineering and machine learning components. The algorithm produces features that are well suited for AI-based applications. While the algorithm is specific to FSK modulation, the concept could be adapted to other modulation schemes.