AI-powered search for scientific publications
Guides users through the complex taxonomy of scientific publications and helps them find the right articles and papers.
Chat Agent, Scientific papers, LLM, Knowledge Graph, Generative AI

What motivated the project?
Navigating the vast and complex taxonomy of scientific fields to find relevant research papers can be daunting and time-consuming. Traditional scholarly search systems like Google Scholar often require users to sift through numerous irrelevant results, making the search for specific scientific papers inefficient and overwhelming. As the volume of scientific literature continues to grow, there is a pressing need for a more intelligent and efficient way to search for and identify relevant research.
What was the objective of the project?
In the era of advanced artificial intelligence, we embarked on a mission to revolutionize the way researchers search for scientific papers. Our goal was to develop an AI chat agent that helps users navigate the taxonomy of scientific fields and find papers that align with their research goals. This innovative solution leverages hierarchical clustering, knowledge graphs, and large language models (LLMs) to provide a superior search experience compared to traditional scholarly search systems.
How It Works
Our AI chat agent simplifies the process of finding relevant scientific papers by guiding users through a series of questions about their research goals. Here is how it works:
- Initial Inquiry: The AI chat agent begins by asking the user about their specific research goals, including the topic of interest, desired outcomes, and any particular subfields they are focusing on.
- Hierarchical Narrowing: Using hierarchical clustering techniques, the agent evaluates the user's responses to progressively narrow down the list of potential papers. This method ensures that the search is focused and relevant, filtering out unnecessary results.
- Paper Matching: The agent searches a vast database of scientific literature to identify papers that match the refined criteria, ensuring that the most pertinent studies are highlighted.
- Suggestions and Selection: The user is presented with a list of 10 relevant papers. This curated list allows for a streamlined and efficient selection process.
- Paper Comparison: If the user wishes to compare two papers from the list, the AI chat agent utilizes large language models to generate detailed comparison summaries. These summaries highlight key differences and similarities, aiding the user in making an informed decision about which paper best fits their research needs.
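The narrowing and suggestion steps above can be illustrated with a minimal sketch. It assumes that topics form a tree of clusters, that each cluster has a centroid embedding, and that each user answer has already been turned into a vector. The `Cluster` class, the function names, and the toy two-dimensional embeddings are hypothetical and chosen only for illustration; the production agent may work differently.

```python
from dataclasses import dataclass, field
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


@dataclass
class Cluster:
    """A node in a hypothetical topic hierarchy."""
    name: str
    centroid: list
    children: list = field(default_factory=list)
    papers: list = field(default_factory=list)  # (title, embedding) pairs


def narrow(root, answer_vectors):
    """Descend one level per user answer, following the closest child cluster."""
    node, path = root, [root.name]
    for vec in answer_vectors:
        if not node.children:
            break  # reached a leaf; further answers cannot narrow the topic
        node = max(node.children, key=lambda c: cosine(c.centroid, vec))
        path.append(node.name)
    return node, path


def collect_papers(node):
    """Gather every paper stored at or below the given cluster."""
    found = list(node.papers)
    for child in node.children:
        found.extend(collect_papers(child))
    return found


def suggest(root, answer_vectors, limit=10):
    """Return the topic path taken and up to `limit` best-matching titles."""
    if not answer_vectors:
        raise ValueError("at least one answer vector is required")
    node, path = narrow(root, answer_vectors)
    query = answer_vectors[-1]  # rank against the most specific answer
    ranked = sorted(collect_papers(node),
                    key=lambda p: cosine(p[1], query), reverse=True)
    return path, [title for title, _ in ranked[:limit]]


# Toy hierarchy with made-up embeddings.
nlp = Cluster("NLP", [1.0, 0.0],
              papers=[("Paper A", [0.9, 0.1]), ("Paper B", [0.7, 0.3])])
vision = Cluster("Computer Vision", [0.0, 1.0],
                 papers=[("Paper C", [0.1, 0.9])])
root = Cluster("Computer Science", [0.5, 0.5], children=[nlp, vision])

path, titles = suggest(root, [[0.8, 0.2]])
print(path, titles)
```

Each answer moves the search one level down the tree, so only the papers under the final cluster are ranked, and the result is capped at the 10 suggestions mentioned above.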
What are the benefits of using this agent?
By integrating an intelligent and user-friendly interface, our AI chat agent transforms the search experience for researchers. This automated system ensures that users can quickly and efficiently find relevant scientific papers, saving valuable time and effort. The hierarchical clustering and LLM-powered comparison features provide a comprehensive and insightful search process, setting a new standard for academic research in the digital age. The expertly designed virtual assistant not only enhances the efficiency of academic searches but also empowers researchers to focus on their core work—advancing knowledge and innovation—without getting bogged down by the complexities of traditional search methods.
What technologies are used to build the agent?
The AI chat agent is built using a robust and efficient architecture comprising multiple components:
- Knowledge Base:
  - The knowledge base is a Neo4j database that contains publication entities and their relationships to authors, other publications, conferences, and the topics and subtopics of the scientific field.
  - For topic recognition, a vector database running on a Weaviate instance is used. Searching publication embeddings there is 70 times faster than in Neo4j. However, since Weaviate is not suitable for storing entities, both databases are used in tandem.
- AI Framework:
  - The agent is built with the Rasa framework, which trains transformer models to understand user intents and trigger specific actions. For example, if a user indicates a desire to compare papers, the agent calls the comparison service, which uses the Mistral-based Zephyr-7B-beta LLM running on a GPU server.
- Client-Side Application:
  - The user interface is developed with the Streamlit framework, which facilitates easy prototyping and deployment of chat-based applications.
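The way the two databases work in tandem can be sketched as a two-step lookup: a similarity search over embeddings returns paper IDs, and those IDs are then resolved to full publication entities. The sketch below uses plain in-memory dictionaries as stand-ins for the vector store and the graph store, so it makes no claim about the actual Weaviate or Neo4j client calls. All identifiers, records, and embeddings are hypothetical.

```python
from math import sqrt

# Stand-in for the vector database: paper id -> embedding (made-up values).
VECTOR_INDEX = {
    "p1": [0.9, 0.1],
    "p2": [0.2, 0.8],
    "p3": [0.7, 0.3],
}

# Stand-in for the graph database: paper id -> entity with its relationships.
GRAPH = {
    "p1": {"title": "Paper One", "authors": ["A. Author"], "topics": ["NLP"]},
    "p2": {"title": "Paper Two", "authors": ["B. Author"], "topics": ["Vision"]},
    "p3": {"title": "Paper Three", "authors": ["C. Author"], "topics": ["NLP"]},
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest_ids(query, k):
    """Step 1: similarity search over embeddings only; returns paper ids."""
    ranked = sorted(VECTOR_INDEX,
                    key=lambda pid: cosine(VECTOR_INDEX[pid], query),
                    reverse=True)
    return ranked[:k]


def fetch_entities(ids):
    """Step 2: resolve ids to full records, preserving the rank order."""
    return [dict(id=pid, **GRAPH[pid]) for pid in ids if pid in GRAPH]


def search(query, k=2):
    """Vector search first, then entity lookup in the graph store."""
    return fetch_entities(nearest_ids(query, k))


results = search([1.0, 0.0])
for record in results:
    print(record["id"], record["title"], record["topics"])
```

Keeping the vector index limited to IDs and embeddings lets the fast store handle similarity search while the graph store remains the single source for entities and their relationships.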