GA-CCRi Analytical Development Services

RF Fingerprinting with an SDR

The increase in Radio Frequency (RF) devices driven by the Internet of Things (IoT) has created a need to authenticate these devices efficiently and reliably, especially since traditional authentication methods are easily compromised through identity spoofing. These challenges have motivated research into RF-fingerprinting as a potential solution. RF-fingerprinting relies on detecting minute imperfections in individual transmitters and uses these imperfections to create a unique identifier for each transmitter. Like many classification problems, fingerprinting has traditionally relied on hand-crafted features, which are expensive to design and unreliable. Recently, researchers have shifted their focus toward artificial intelligence and machine learning (AI/ML) techniques, which could lead to more efficient and more accurate identification without the high cost of manual feature design. Researchers here at GA-CCRi were curious whether RF-fingerprinting would be possible with simple equipment.

Overview

A recent post by Nihal Pasham laid out a quick-and-dirty way to begin developing RF-fingerprinting with modern machine learning. In his post, Nihal used a Software Defined Radio (SDR) to record signals from cheap devices, such as a car key fob, and tried to identify each individual transmitter. While his code is available from his post, he did not post any results. We really liked the idea of using an SDR as a test-bed for ML-based RF-fingerprinting algorithms. Specifically, we wanted to determine whether identical car key fobs could be told apart with an SDR front-end, some signal processing, and a convolutional neural net. Below is a diagram of a potential ML-based RF-fingerprinting system. In this post we focus on the signal processing and ML models.

The RF Identification Pipeline

Key Fobs 

Key fobs are Radio Frequency Identification transmitters used to lock and unlock your car. When a command button is pressed, a modulated radio wave at a few hundred MHz is broadcast into the world. If you are close enough to your car, this signal is received and demodulated, and the desired action is performed. These transmitters are good candidates for RF-fingerprinting since they are ubiquitous, and most people have a pair of identical transmitters. To begin to identify a key fob, we first need a way to record the signal and get it onto a computer. For this we use a software defined radio (SDR).

A key fob in action.

Signal Exploration with an SDR

SDRs are great for quick experiments because they are cheap and flexible. The most commonly used SDR is the RTL-SDR, because it is inexpensive and has good software support. The RTL-SDR is a general-purpose receiver that can tune up to about 1.7 GHz with a bandwidth of 2.4 MHz.

The RTL-SDR

Most key fobs in the U.S. transmit at approximately 315 MHz (there are some exceptions, which transmit on 433.92 MHz). For our first tests, we used a key fob from one of our engineers. Using pyrtlsdr, we recorded a one-second interval that captured the key fob unlock signal, which is about 0.2 seconds long. RF data is multidimensional in nature, so we need multiple views of the data to get a more complete picture. Perhaps the most common way to visualize RF data is to plot the amplitude of the signal over time, but an RF signal also carries phase information (often represented as a complex number and visualized by plotting the real and imaginary components), as well as power and frequency information. Power is usually plotted against frequency, showing how the signal's power is distributed across the frequency components that make it up (a Power Spectral Density (PSD) plot).

Measurement from a keyfob, in time domain, IQ-plane, and PSD.
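The three views described above can be computed from a single complex IQ capture. The following is a minimal sketch, not the code used for the figures: the function names `capture_iq` and `signal_views` are hypothetical, the automatic gain setting is an assumption, and the PSD is a plain periodogram rather than whatever estimator produced the plot. The capture function needs an RTL-SDR dongle, so the demonstration at the bottom uses a synthetic tone instead.

```python
import numpy as np


def capture_iq(center_freq_hz=315e6, sample_rate_hz=2.4e6, duration_s=1.0):
    """Record complex IQ samples from an RTL-SDR (needs a dongle and pyrtlsdr)."""
    from rtlsdr import RtlSdr  # lazy import so the analysis below runs without hardware

    sdr = RtlSdr()
    try:
        sdr.sample_rate = sample_rate_hz
        sdr.center_freq = center_freq_hz
        sdr.gain = "auto"  # assumption: the post does not state the gain setting
        return np.asarray(sdr.read_samples(int(sample_rate_hz * duration_s)))
    finally:
        sdr.close()


def signal_views(iq, sample_rate_hz):
    """Return amplitude, phase, and a periodogram PSD estimate of an IQ capture."""
    amplitude = np.abs(iq)   # time-domain envelope
    phase = np.angle(iq)     # wrapped phase in radians
    freqs_hz = np.fft.fftshift(np.fft.fftfreq(len(iq), d=1.0 / sample_rate_hz))
    spectrum = np.fft.fftshift(np.fft.fft(iq))
    psd = np.abs(spectrum) ** 2 / (len(iq) * sample_rate_hz)
    return amplitude, phase, freqs_hz, psd


# Synthetic stand-in for a capture: a complex tone 100 kHz above the tuned frequency.
fs = 2.4e6
n = np.arange(2400)
tone = np.exp(2j * np.pi * 100e3 * n / fs)
amplitude, phase, freqs_hz, psd = signal_views(tone, fs)
peak_hz = freqs_hz[np.argmax(psd)]  # frequency offset of the strongest component
```

With a real dongle attached, `signal_views(capture_iq(), 2.4e6)` would give the same three views for an actual key fob press, with frequencies expressed as offsets from the tuned center frequency.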

Zooming in on the time-domain signal as shown below, you can see its binary amplitude modulation (on-off keying). To demodulate this, we would take the amplitude (green), threshold it, sample it, and get a binary signal out to unlock the car. But demodulation is not our goal. Instead, the message should be removed entirely, since from the perspective of RF-fingerprinting it is noise that does nothing to differentiate devices. For RF-fingerprinting, the tiny artifacts imprinted on this transmission by the radio are the signal of interest. We started feature engineering on those artifacts.

Time domain of a short segment of key fob signal illustrating the binary-AM modulation. Real part in blue, magnitude shown in green. 
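For reference, the demodulation steps mentioned above (take the amplitude, threshold it, sample it) can be sketched in a few lines. This is an illustration only: `ook_demodulate` is a hypothetical helper, and it assumes the capture starts on a bit boundary and that the number of samples per bit is already known, neither of which is given in this post.

```python
import numpy as np


def ook_demodulate(iq, samples_per_bit, threshold=None):
    """Recover bits from an on-off-keyed capture: envelope, threshold, sample."""
    envelope = np.abs(iq)
    if threshold is None:
        # Assumption: halfway between the weakest and strongest sample is a usable slicer level.
        threshold = 0.5 * (envelope.min() + envelope.max())
    sliced = envelope > threshold
    # Sample once per bit, in the middle of each bit period.
    centers = np.arange(samples_per_bit // 2, len(sliced), samples_per_bit)
    return sliced[centers].astype(int)


# Synthetic example: eight bits keyed onto a carrier, plus a little noise.
rng = np.random.default_rng(0)
sent_bits = np.array([1, 0, 1, 1, 0, 0, 1, 0])
samples_per_bit = 20
keying = np.repeat(sent_bits, samples_per_bit)
carrier = np.exp(2j * np.pi * 0.05 * np.arange(keying.size))
noise = 0.02 * (rng.standard_normal(keying.size) + 1j * rng.standard_normal(keying.size))
received_bits = ook_demodulate(keying * carrier + noise, samples_per_bit)
```

For fingerprinting, the interesting part is what this procedure throws away: the shape of the envelope between the clean "off" and "on" levels.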

Feature Engineering

E46

Through our research, we discovered a technique for RF-fingerprinting used by cell phone towers. The idea was to record the rise times at the beginning of each received transmission and use this rise-time signature to identify each handset. In the binary AM case shown above, the rise time of the carrier amplitude is prominent. By using an edge detector to align each pulse, we can characterize the amplitude during the rise time.
Shown below is a plot of several pulse leading edges aligned to each other. Let's marinate on this nice RC time constant; this feature has nothing to do with the message and everything to do with the transmitter, so it is a great candidate for an RF-fingerprint feature. Furthermore, it's real-valued, about 32-64 samples long, and there are about 1000 of them per signal transmission. These are all great characteristics for feature vectors when used in conjunction with a neural net.

Differential phase of leading edges  aligned in time. 
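The rise-time alignment described above can be sketched as a threshold-crossing edge detector followed by a fixed-length window around each crossing. This is a minimal sketch under stated assumptions: `aligned_rising_edges` is a hypothetical helper, the threshold and the 8/24-sample window split are illustrative choices, and a noisy capture would likely need smoothing or hysteresis before the crossing test.

```python
import numpy as np


def aligned_rising_edges(iq, threshold, pre=8, post=24):
    """Cut a (pre + post)-sample envelope window around every rising edge."""
    envelope = np.abs(iq)
    above = envelope > threshold
    # A rising edge is a sample above threshold whose predecessor was not.
    crossings = np.flatnonzero(~above[:-1] & above[1:]) + 1
    # Keep only edges whose full window fits inside the capture.
    crossings = crossings[(crossings >= pre) & (crossings + post <= envelope.size)]
    if crossings.size == 0:
        return np.empty((0, pre + post))
    return np.stack([envelope[c - pre:c + post] for c in crossings])


# Synthetic example: five pulses whose amplitude rises like an RC circuit charging.
tau = 5.0
one_pulse = np.concatenate([np.zeros(100), 1.0 - np.exp(-np.arange(100) / tau)])
envelope_model = np.tile(one_pulse, 5)
carrier = np.exp(2j * np.pi * 0.05 * np.arange(envelope_model.size))
edges = aligned_rising_edges(envelope_model * carrier, threshold=0.5)
```

Each row of `edges` is one candidate feature vector. Because every row is anchored to the same threshold crossing, differences between rows reflect the shape of the rise rather than where in the message the pulse occurred.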

E90

Most newer cars don't use digital AM for their key fob modulation. We also got our hands on a key fob for a newer vehicle. Below, we see that the pulse has a short gap in it, and the PSD tells us that it's made up of two frequencies. To get a better look, we inspect the time series, shown below, and determine that it is binary frequency-shift keying (FSK).

Close up of time-domain signal illustrating binary FSK.

Close up of time-domain signal illustrating binary FSK.

To use something similar to the rise-time concept, we home in on the discontinuity around the bit-flip. In the figure below, Plot 1 shows the real part of the complex time signal. Plot 2 shows the real vs. imaginary components. Taking the phase of this signal and unwrapping it leads to Plot 3. This clearly shows the two frequencies as two different (positive and negative) phase slopes. Since the signal has been mixed down to a frequency between the two carriers, the two frequencies have equal and opposite phase slopes. The change in phase around the bit-flip is the part of interest, since it reveals an unwanted transient. Next, we took the first finite difference of the unwrapped phase, as shown in Plot 4, which gets close to the binary signal that represents the message. To focus on the phase discontinuity around the bit-flip, we applied a median-filter-based edge detector to highlight the zero-crossings, as shown in Plot 5. This finally allows us to align the phase transients around the bit-flips, shown in Plot 6. As with the digital AM signal from the E46 fob, this feature is a real-valued signal, about 64 samples long, and there are thousands of them per signal transmission, making them ideal candidates for an RF-fingerprint feature.

Illustration of processing steps for feature extraction from the E90  key fob signal.
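The processing chain in Plots 3 through 6 can be approximated with NumPy alone. The sketch below is one plausible reading of those steps, not the code behind the figure: `median_filter` and `fsk_transient_features` are hypothetical names, the kernel size of 9 and the 64-sample window are assumptions, and the median filter is used here only to stabilize the sign of the phase difference before looking for zero-crossings.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def median_filter(x, kernel=9):
    """Sliding median with edge padding; kernel must be odd."""
    pad = kernel // 2
    padded = np.pad(x, pad, mode="edge")
    return np.median(sliding_window_view(padded, kernel), axis=-1)


def fsk_transient_features(iq, kernel=9, half_width=32):
    """Return bit-flip positions and the phase-difference window around each one."""
    phase = np.unwrap(np.angle(iq))        # Plot 3: unwrapped phase
    dphi = np.diff(phase)                  # Plot 4: first finite difference
    smooth = median_filter(dphi, kernel)   # suppress spikes before edge detection
    sign = np.sign(smooth)
    flips = np.flatnonzero(sign[:-1] * sign[1:] < 0) + 1   # Plot 5: zero-crossings
    fits = (flips >= half_width) & (flips + half_width <= dphi.size)
    flips = flips[fits]
    if flips.size == 0:
        return flips, np.empty((0, 2 * half_width))
    # Plot 6: windows of the unfiltered phase difference, aligned on the flips.
    windows = np.stack([dphi[f - half_width:f + half_width] for f in flips])
    return flips, windows


# Synthetic example: ideal binary FSK mixed down midway between the two tones.
bits = np.array([0, 1, 1, 0, 1, 0, 0, 1])
samples_per_bit = 200
freq = np.repeat(np.where(bits == 1, 0.05, -0.05), samples_per_bit)  # cycles/sample
iq = np.exp(2j * np.pi * np.cumsum(freq))
flips, windows = fsk_transient_features(iq)
```

In this ideal signal the windows are clean steps. In a real capture, the samples near the middle of each window would carry the transmitter-specific transient. Note that the windows include both low-to-high and high-to-low flips; whether to treat those separately or flip their sign is a design choice this post does not specify.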

The real test of RF-fingerprinting is distinguishing between two identical transmitters. Here, we use a pair of key fobs for the same vehicle. The same series of plots used for the feature engineering on the earlier fob's signal is shown below. The signal is much noisier and the transient is much sharper, and thus harder to characterize. The other fob for the same car has a signal that is visually indistinguishable from the first. Can machine learning distinguish them? We feed these features into a neural net to see whether it can tell the two identical fobs apart.

Illustration of processing steps for feature extraction from the pair of key fob signals.

Neural Net and Results

We took three recordings of each fob at different locations with respect to our SDR, to generate different received power levels and signal-to-noise ratios. Each pulse was then turned into a set of 32-sample-long features through a series of pre-processing steps similar to those depicted above. The feature sets for fob 1 and fob 2 were trimmed to the same length and split in half into training and test datasets. The datasets were then fed into a shallow convolutional neural network (CNN) using the tflearn library with GPU support. The net was run for 1000 epochs on a Dell Inspiron laptop with an NVIDIA GTX 1650, which only takes about 20 s (it takes much longer without GPU support). By providing the CNN with both training and testing data, the accuracy of the model can be tracked as the neural net evolves. Plots of the CNN's loss and accuracy vs. step are shown below, which indicate an accuracy of >95% in distinguishing between the two fobs.
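The trimming and splitting step can be sketched as follows. This is an illustration under assumptions rather than the code we ran: `build_dataset` is a hypothetical helper, and the shuffling before the split, the one-hot labels, and the trailing channel axis are all choices made for the sketch that the description above does not specify.

```python
import numpy as np


def build_dataset(features_fob1, features_fob2, seed=0):
    """Trim both feature sets to equal length, label them, and split in half."""
    n = min(len(features_fob1), len(features_fob2))
    x = np.concatenate([features_fob1[:n], features_fob2[:n]]).astype(np.float32)
    labels = np.concatenate([np.zeros(n, dtype=int), np.ones(n, dtype=int)])
    y = np.eye(2, dtype=np.float32)[labels]   # one-hot: column 0 = fob 1, column 1 = fob 2

    # Assumption: shuffle before splitting so both halves contain both fobs.
    order = np.random.default_rng(seed).permutation(2 * n)
    x, y = x[order], y[order]

    x = x[..., np.newaxis]   # (examples, feature length, 1 channel) for a 1-D CNN
    half = n                 # 2n examples in total, so half of them is n
    return (x[:half], y[:half]), (x[half:], y[half:])


# Synthetic example: random stand-ins for the aligned 32-sample features.
rng = np.random.default_rng(1)
fob1 = rng.standard_normal((10, 32))
fob2 = rng.standard_normal((7, 32))
(train_x, train_y), (test_x, test_y) = build_dataset(fob1, fob2)
```

One caveat with a random split like this: features from the same recording land in both halves, so the test accuracy says more about separating these particular recordings than about recognizing a fob in a new location. Holding out an entire recording would be a stricter test.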

CNN loss as a function of iteration.
CNN accuracy as a function of iteration.

Below is a graph illustrating the neural net, courtesy of TensorBoard, which is integrated into TFLearn.

Flow graph for CNN. (produced by tensorboard)

Conclusion

We demonstrated a proof-of-concept AI-based RF-fingerprinting algorithm that distinguished between two identical key fobs operating at 315 MHz with greater than 95% accuracy. The system consisted of an RTL-SDR for the RF front-end and open-source Python-based libraries for the feature engineering and machine learning components. The algorithm produces features that are well suited for AI-based applications. While the algorithm as tested is specific to FSK modulation, the concept could be adapted to other modulation schemes.
