Inproceedings

A Framework for Integrated Digital Forensic Investigation Employing AutoGen AI Agents

Akila Wickramasekara; Mark Scanlon

April 2024, Proceedings of the 12th International Symposium on Digital Forensics and Security

Contribution Summary

This paper presents an architecture for digital forensic (DF) investigations that integrates AutoGen AI agents and Large Language Models (LLMs) to streamline the investigative workflow. The AI agents and LLMs perform tasks that a human agent describes in natural language, which reduces the investigative workload and shortens the learning curve for investigators. The design accounts for the challenges of evolving requirements, information accuracy, and legal barriers. The authors apply prompt engineering to digital forensics in order to break intricate, sequential tasks into subtasks; the framework's integrity therefore depends directly on the precision and clarity of the prompts. The framework is built on AutoGen and integrates the LLaMA and StarCoder LLMs alongside four AI agents. It processes natural-language input, distinguishes specific tasks from irrelevant information, and recognizes varied language patterns and technical DF terminology, all of which improve the model's accuracy in interpreting user commands. Validation relies on a baseline data set, and the Language Feedback Benchmark (LLF-Bench) is used to assess LLM performance in subtask decomposition within the DF domain.

Keywords: Digital Forensics; Large Language Models; AI Agents; AutoGen; Prompt Engineering; Digital Forensic Investigation; Artificial Intelligence; Natural Language Processing

Abstract

The increasing frequency and rapidity of criminal activities require faster digital forensic (DF) investigations. Currently, most DF phases involve manual procedures that require significant human effort and time and often face evolving requirements. This paper proposes an integrated framework employing AutoGen Artificial Intelligence (AI) agents and Large Language Models (LLMs) such as LLaMA and StarCoder. The suggested framework utilizes AI agents and LLMs to perform tasks articulated in natural language by a human agent. The proposed architecture presents a significant advantage by alleviating the investigative workload and shortening the learning curve for investigators. However, it still carries risks such as information inaccuracy, the impact of hallucination, and legal barriers. Nevertheless, this research contributes to the ongoing discourse on optimizing DF processes in response to the evolving landscape of criminal activities and the corresponding demands placed on investigative resources.

BibTeX

@inproceedings{wickramasekara2024DFAutoGenAI,
	author={Wickramasekara, Akila and Scanlon, Mark},
	title="{A Framework for Integrated Digital Forensic Investigation Employing AutoGen AI Agents}",
	booktitle="{Proceedings of the 12th International Symposium on Digital Forensics and Security}",
	year=2024,
	pages = {},
	month=apr,
	publisher={IEEE},
	abstract={The increasing frequency and rapidity of criminal activities require faster digital forensic (DF) investigations. Currently, most DF phases involve manual procedures that require significant human effort and time and often face evolving requirements. This paper proposes an integrated framework employing AutoGen Artificial Intelligence (AI) agents and Large Language Models (LLMs) such as LLaMA and StarCoder. The suggested framework utilizes AI agents and LLMs to perform tasks articulated in natural language by a human agent. The proposed architecture presents a significant advantage by alleviating the investigative workload and shortening the learning curve for investigators. However, it still carries risks such as information inaccuracy, the impact of hallucination, and legal barriers. Nevertheless, this research contributes to the ongoing discourse on optimizing DF processes in response to the evolving landscape of criminal activities and the corresponding demands placed on investigative resources.},
	doi={10.1109/ISDFS60797.2024.10527235},
}