Article in Press
This article is currently in the Just Accepted phase. The final published version may have formatting changes or additional corrections.
Abstract
Amid growing information overload, researchers find it increasingly difficult to identify reliable, verified, and up-to-date knowledge in vast online data sources. Large Language Models (LLMs) and traditional Retrieval-Augmented Generation (RAG) systems support factual retrieval, but they remain limited by single-agent reasoning, a lack of integrated validation, and dependence on static or outdated knowledge bases. This paper proposes DeepRAG, a conceptual but implementation-ready multi-agent RAG framework designed to support autonomous and verifiable research synthesis. DeepRAG coordinates a set of specialised agents for planning, semantic retrieval, real-time web data acquisition, citation validation, knowledge synthesis, and structured report generation. The framework combines agentic orchestration mechanisms (e.g., AutoGen), vector-based retrieval using ChromaDB, and real-time web crawling via Crawl4AI to address these limitations of existing RAG pipelines. Rather than presenting empirical benchmarks, this work focuses on architectural design, agent-level responsibilities, and workflow specification, and describes how DeepRAG is intended to generate citation-backed research reports in PDF format using the FPDF library. Design-level reasoning suggests that the proposed framework can address challenges related to factual grounding, citation consistency, and transparency in AI-assisted research. DeepRAG thus provides a foundation for future prototype development, experimental evaluation, and deployment of autonomous research assistants.
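The agent workflow summarised in the abstract can be illustrated with a minimal sketch. It is not the DeepRAG implementation: it uses only the Python standard library, and every name in it (`ResearchState`, `planner`, `retriever`, `web_crawler`, `citation_validator`, `synthesiser`, `run_pipeline`) is a hypothetical stand-in. The stubs mark where an orchestration framework, a vector store, a web crawler, and a PDF writer would plug in. The sketch assumes a strictly sequential hand-off between agents, which the abstract does not specify.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ResearchState:
    """Shared state handed from one agent to the next (hypothetical)."""
    query: str
    subtasks: List[str] = field(default_factory=list)
    evidence: List[Dict[str, str]] = field(default_factory=list)
    verified: List[Dict[str, str]] = field(default_factory=list)
    report: str = ""


def planner(state: ResearchState) -> ResearchState:
    # Assumption: planning splits the query into fixed sub-questions.
    state.subtasks = [f"background: {state.query}", f"recent work: {state.query}"]
    return state


def retriever(state: ResearchState) -> ResearchState:
    # Stand-in for semantic retrieval from a vector store.
    for task in state.subtasks:
        state.evidence.append({"claim": f"note on {task}", "source": "local-corpus"})
    return state


def web_crawler(state: ResearchState) -> ResearchState:
    # Stand-in for real-time web acquisition; the empty source is
    # deliberate so the validation step has something to reject.
    state.evidence.append({"claim": f"fresh result for {state.query}", "source": ""})
    return state


def citation_validator(state: ResearchState) -> ResearchState:
    # Toy rule (assumption): keep only evidence with a non-empty source.
    state.verified = [item for item in state.evidence if item["source"]]
    return state


def synthesiser(state: ResearchState) -> ResearchState:
    # Produces plain text; a PDF writer would consume this in practice.
    lines = [
        f"{i}. {item['claim']} [{item['source']}]"
        for i, item in enumerate(state.verified, start=1)
    ]
    state.report = f"Report: {state.query}\n" + "\n".join(lines)
    return state


PIPELINE: List[Callable[[ResearchState], ResearchState]] = [
    planner, retriever, web_crawler, citation_validator, synthesiser,
]


def run_pipeline(query: str) -> ResearchState:
    """Run each agent in order over a single shared state object."""
    state = ResearchState(query=query)
    for agent in PIPELINE:
        state = agent(state)
    return state
```

Calling `run_pipeline("retrieval-augmented generation")` returns a state whose `report` lists only the evidence items that passed the validation stub. Passing one shared state object through every agent keeps each stage's contribution inspectable, which is one simple way to approach the transparency goal stated above.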