Article

Cutting through the Emissions: Feature Selection from Electromagnetic Side-Channel Data for Activity Detection

Asanka Sayakkara; Luis Miralles; Nhien-An Le-Khac; Mark Scanlon

April 2020 Forensic Science International: Digital Investigation

Contribution Summary

This paper addresses the challenge of processing high-dimensional electromagnetic (EM) data in real time for digital forensics and cybersecurity applications. The authors propose a systematic methodology to identify information-leaking frequency channels in EM data using multiple filtering techniques and machine learning. The approach is evaluated on a dataset of EM signals from an IoT device, where it reduces the number of candidate channels from over 20,000 to fewer than a few hundred. This reduction in dimensionality enables real-time analysis and improves the efficiency of EM side-channel analysis (EM-SCA). The methodology is based on a Random Forest classifier and is shown to achieve high accuracy in identifying information-leaking channels. The results have implications for the application of EM-SCA in digital forensics and cybersecurity, particularly where real-time analysis is critical, such as in the investigation of IoT devices.
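The idea of narrowing many frequency channels down to a few informative ones can be illustrated with a small sketch. This is not the authors' pipeline: the data are synthetic, a variance filter and a correlation ranking stand in for the paper's "multiple filtering techniques", and no classifier is trained. All function names, channel indices, and thresholds below are hypothetical choices made for illustration only.

```python
import random
from statistics import mean, pvariance


def make_traces(n_samples=120, n_channels=200, leaking=(17, 64, 150), seed=0):
    """Build synthetic per-channel magnitudes for two device activities (0 and 1)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2  # alternate between the two activities
        row = [rng.gauss(1.0, 0.2) for _ in range(n_channels)]
        for ch in range(50, 60):
            row[ch] = 0.5  # silent channels: constant, carry no information
        for ch in leaking:
            row[ch] += 1.0 * label  # leaking channels shift with the activity
        X.append(row)
        y.append(label)
    return X, y


def variance_filter(X, min_var=1e-6):
    """Stage 1: drop channels whose magnitude never changes."""
    n_channels = len(X[0])
    return [ch for ch in range(n_channels)
            if pvariance([row[ch] for row in X]) > min_var]


def abs_correlation(xs, ys):
    """Absolute Pearson correlation between one channel and the activity label."""
    mx, my = mean(xs), mean(ys)
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sx = sum((a - mx) ** 2 for a in xs) ** 0.5
    sy = sum((b - my) ** 2 for b in ys) ** 0.5
    return abs(cov / (sx * sy))


def select_channels(X, y, k=3):
    """Stage 2: rank the surviving channels by correlation and keep the top k."""
    kept = variance_filter(X)
    scored = [(abs_correlation([row[ch] for row in X], y), ch) for ch in kept]
    scored.sort(reverse=True)
    return sorted(ch for _, ch in scored[:k])


if __name__ == "__main__":
    X, y = make_traces()
    print("channels after variance filter:", len(variance_filter(X)))
    print("selected channels:", select_channels(X, y, k=3))
```

In a real setting the rows would come from spectra of captured EM traces, and the surviving channels would be passed to a classifier. The sketch only shows why cheap filters applied first can shrink the search space before any model is trained.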

Keywords: Digital forensics; Electromagnetic side-channels; Feature selection; Internet-of-things (IoT); Machine learning; EM-SCA; Real-time analysis; High-dimensional data

Abstract

Electromagnetic side-channel analysis (EM-SCA) has been used as a window to eavesdrop on computing devices for information security purposes. It has recently been proposed as a digital evidence acquisition method in forensic investigation scenarios as well. The massive amount of data produced by EM signal acquisition devices makes it difficult to process in real time, making on-site EM-SCA nearly impossible. Uncertainty about the exact information-leaking frequency channel requires the investigator to acquire signals over a wide bandwidth. As a consequence, investigators are left with a large number of potential frequency channels in EM data to inspect, many of which may not contain any information leakage at all. Under these circumstances, identifying a small subset of frequency channels that leak a sufficient amount of information can significantly boost the performance of real-time analysis of EM side-channel data. This work presents a systematic methodology to identify information-leaking frequency channels from high-dimensional EM data with the help of multiple filtering techniques and machine learning. The evaluations show that it is possible to narrow down the number of frequency channels from over 20,000 to fewer than a few hundred, which makes real-time EM data processing highly efficient.

BibTeX

@article{sayakkara2020EMFeatureSelection,
	author={Sayakkara, Asanka and Miralles, Luis and Le-Khac, Nhien-An and Scanlon, Mark},
	title="{Cutting through the Emissions: Feature Selection from Electromagnetic Side-Channel Data for Activity Detection}",
	journal="{Forensic Science International: Digital Investigation}",
	year="2020",
	month="04",
	publisher={Elsevier},
	volume = "32",
	pages = "300927",
	issn = "2666-2817",
	doi = "10.1016/j.fsidi.2020.300927",
	url = "http://www.sciencedirect.com/science/article/pii/S2666281720300226",
	keywords = "Digital forensics, Electromagnetic side-channels, Feature selection, Internet-of-things (IoT), Machine learning",
	abstract={Electromagnetic side-channel analysis (EM-SCA) has been used as a window to eavesdrop on computing devices for information security purposes. It has recently been proposed as a digital evidence acquisition method in forensic investigation scenarios as well. The massive amount of data produced by EM signal acquisition devices makes it difficult to process in real time, making on-site EM-SCA nearly impossible. Uncertainty about the exact information-leaking frequency channel requires the investigator to acquire signals over a wide bandwidth. As a consequence, investigators are left with a large number of potential frequency channels in EM data to inspect, many of which may not contain any information leakage at all. Under these circumstances, identifying a small subset of frequency channels that leak a sufficient amount of information can significantly boost the performance of real-time analysis of EM side-channel data. This work presents a systematic methodology to identify information-leaking frequency channels from high-dimensional EM data with the help of multiple filtering techniques and machine learning. The evaluations show that it is possible to narrow down the number of frequency channels from over 20,000 to fewer than a few hundred, which makes real-time EM data processing highly efficient.}
}