Build an AI-Powered Question-Answering Application
You’ve got a ton of data — structured, unstructured, you name it — and you want to put it to use in an application. Whether you’re looking to gain insights or find answers, you need a solution that delivers results quickly and accurately. With the right tools, this might be easier than you think.
The old standby of keyword searching has its limits, especially when unstructured data enters the mix. At that point, the chances of getting quick, relevant results start to fade. But there’s a way forward. By combining Milvus, an open source vector database, with Haystack 2.0, an open source framework from Deepset for building end-to-end large language model (LLM) applications and retrieval-augmented generation (RAG) pipelines, you can build the kind of advanced applications that users and developers crave.
In this article, I’ll explain how to harness the power of Milvus and Haystack 2.0 to create an AI-powered question-answering application using RAG. Let’s dive in!
Data Storage With Milvus
To use data, you need to store it somewhere first. Maintained by Zilliz developers, Milvus is an open source vector database designed to handle high-dimensional vectors efficiently. Vectors are numerical representations of unstructured data (text, photos, audio files, etc.) in a high-dimensional space.
Converting data to a vector embedding preserves the semantic meaning and relationships between data points: the closer two embeddings sit in this space, the more semantically related they are. When you store these vectors in a vector database, related embeddings cluster together, making searches for contextually relevant data more efficient.
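You can see this in practice with a few lines of Python using the sentence-transformers library (the model name and sentences here are just illustrative):

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # maps text to 384-dimensional vectors
embeddings = model.encode([
    "How do I cook chickpeas?",
    "Instructions for preparing garbanzo beans",
    "The history of jazz music",
])

# Semantically similar sentences score high even with no shared keywords.
print(util.cos_sim(embeddings[0], embeddings[1]))  # high similarity
print(util.cos_sim(embeddings[0], embeddings[2]))  # low similarity
```

The first two sentences share almost no words, yet their embeddings land close together; keyword search would miss that connection entirely.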
Factors to Consider When Selecting a Vector Database
It’s important to consider critical functionality when selecting a vector database for your RAG pipeline.
- High-dimensional vector indexing: Is the vector database optimized for indexing and searching high-dimensional vectors? It should be able to utilize advanced indexing techniques, such as hierarchical navigable small world (HNSW) graphs or inverted file (IVF) indexes, to create efficient index structures. These index structures enable fast similarity searches even in large-scale data sets with millions or billions of vectors.
- Scalability and performance: Even though your prototype might use a small data set, it’s important to plan for production scale. Can your vector database handle massive amounts of vector data without sacrificing performance? It should scale horizontally across multiple nodes or machines, allowing distributed storage and parallel processing. This scalability ensures that it can accommodate growing data sets and high query throughput without compromising the speed and accuracy of your application.
- Hybrid search: Your vector database should support hybrid searches, enabling you to retrieve vectors across different modalities or generated by various embedding models. Additionally, it should allow you to combine hybrid vector search with keyword matching for more complex queries. This capability simplifies finding the most relevant items in a vector database, whether based on metadata or vector embeddings representing images, audio, text and more.
- Integration with popular machine learning models: Make sure your vector database integrates seamlessly with popular machine learning models, such as OpenAI text embedding models, Cohere multilingual models and Voyage AI code embedding models, to streamline the conversion of unstructured data into vector embeddings for efficient similarity retrieval.
- Support for multiple indexes and distance metrics: Different indexing algorithms and distance metrics cater to different use cases and data characteristics. Depending on your specific requirements, you should be able to choose among index types such as HNSW, IVF variants or ANNOY, and among distance metrics such as Euclidean distance, cosine similarity and inner product, so you can pick the most appropriate combination for your application (see the sketch after this list).
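As a concrete illustration of that last point, here is a minimal sketch of defining a Milvus collection with an HNSW index and cosine similarity via pymilvus. The collection name, field names, dimension and index parameters are assumptions for this example, not requirements:

```python
from pymilvus import DataType, MilvusClient

client = MilvusClient(uri="./milvus_demo.db")  # Milvus Lite: a local, file-backed instance

schema = client.create_schema(auto_id=True)
schema.add_field("id", DataType.INT64, is_primary=True)
schema.add_field("embedding", DataType.FLOAT_VECTOR, dim=384)

index_params = client.prepare_index_params()
index_params.add_index(
    field_name="embedding",
    index_type="HNSW",     # could also be an IVF variant, for example
    metric_type="COSINE",  # or "L2" (Euclidean), "IP" (inner product)
    params={"M": 16, "efConstruction": 200},
)

client.create_collection("recipes", schema=schema, index_params=index_params)
```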
Building Pipelines With Haystack 2.0
Once your data is safely stored, you need to build pipelines to use it effectively. This is where Haystack 2.0 comes into play. Haystack is an open source Python framework for building production-ready LLM applications, RAG pipelines and modern search systems that work intelligently with large document collections.
How do you build Haystack pipelines for an LLM application? Haystack gives you components you can link to build custom data pipelines. The components can help you perform tasks like document retrieval, text generation or summarization. You can also build your own components or use one of the example pipelines as a starting point.
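As a small taste of the component model (the component choices here are illustrative), a two-step Haystack 2.0 pipeline might look like this:

```python
from haystack import Pipeline
from haystack.components.converters import TextFileToDocument
from haystack.components.preprocessors import DocumentCleaner

pipeline = Pipeline()
pipeline.add_component("converter", TextFileToDocument())
pipeline.add_component("cleaner", DocumentCleaner())

# Feed the converter's output documents into the cleaner's input.
pipeline.connect("converter.documents", "cleaner.documents")
```

You’ll see this same add-and-connect pattern again in the full recipe application below.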
Haystack 2.0 also integrates seamlessly with Milvus. This means developers can quickly connect their data pipelines with Milvus data storage and retrieval capabilities, accelerating the development of RAG pipelines and LLM applications.
Build an AI-Powered Application
In the following sections, I’ll show you how to build an AI-powered question-answering recipe application using the popular RAG technique with Haystack 2.0 and the Milvus vector database.
Setup and Installation
To get started building with Haystack and Milvus, the following instructions will walk you through building a sample RAG-based recipe application that lets you ask questions, request recipes and create meal plans from a set of popular vegan recipes. You can also substitute your own recipes to customize it further.
First and most importantly, make sure you have Python installed on your local machine. Haystack 2.0 requires version 3.8 or higher. If you need to install or update Python, download it from python.org.
To install the necessary packages:
```bash
pip install --upgrade pymilvus milvus-haystack markdown-it-py mdit_plain pypdf sentence-transformers
```
Build an Indexing Pipeline
Next, you need to build your indexing pipeline. Since this tutorial uses Milvus as the data store, your indexing pipeline, which processes and stores documents, will use MilvusDocumentStore.
Make sure you have some sample files to use and test with. Put your files in a folder called recipe_files. You can use the sample files from this example or your own recipes. If you change the folder name or file path, update the example to reflect your choices. You may also want to add error checking, handling and logging to the sample code to catch any issues that arise along the way.
1. Initialize the MilvusDocumentStore
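A minimal sketch of this step, assuming Milvus Lite (the local, file-backed Milvus bundled with pymilvus); for production, you would point connection_args at a Milvus server URI instead:

```python
from milvus_haystack import MilvusDocumentStore

document_store = MilvusDocumentStore(
    # Milvus Lite stores everything in a local file; use a server URI
    # (e.g., "http://localhost:19530") for a full Milvus deployment.
    connection_args={"uri": "./milvus.db"},
    drop_old=True,  # drop any existing collection so each run starts clean
)
```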
2. Configure the Necessary Components for Document Processing
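Continuing from step 1, here is one plausible set of components for markdown recipe files; the converter and splitter settings are illustrative choices, not requirements:

```python
from haystack.components.converters import MarkdownToDocument
from haystack.components.embedders import SentenceTransformersDocumentEmbedder
from haystack.components.preprocessors import DocumentSplitter
from haystack.components.writers import DocumentWriter

converter = MarkdownToDocument()  # turns markdown files into Haystack Documents
splitter = DocumentSplitter(split_by="sentence", split_length=2)  # chunk for retrieval
embedder = SentenceTransformersDocumentEmbedder()  # local embedding model
writer = DocumentWriter(document_store)  # persists chunks and vectors in Milvus
```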
3. Index the Documents
This step adds the Haystack components to your pipeline. Make sure you connect each component’s output to the next component’s input so they run in the order you want.
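Here is how the components from step 2 might be wired together; with single-input, single-output components, Haystack can infer which sockets to connect:

```python
from haystack import Pipeline

indexing_pipeline = Pipeline()
indexing_pipeline.add_component("converter", converter)
indexing_pipeline.add_component("splitter", splitter)
indexing_pipeline.add_component("embedder", embedder)
indexing_pipeline.add_component("writer", writer)

# Data flows converter -> splitter -> embedder -> writer.
indexing_pipeline.connect("converter", "splitter")
indexing_pipeline.connect("splitter", "embedder")
indexing_pipeline.connect("embedder", "writer")
```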
4. Run the Pipeline
Indexing processes the documents, converts them to text, splits them into chunks and embeds those chunks into high-dimensional vectors.
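A sketch of running the pipeline, assuming markdown recipes sit in the recipe_files folder from the setup step:

```python
import glob

file_paths = glob.glob("./recipe_files/*.md")
indexing_pipeline.run({"converter": {"sources": file_paths}})
print(f"Indexed {document_store.count_documents()} document chunks")
```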
Integrate the RAG Pipeline
The RAG pipeline combines document retrieval with answer generation using an LLM. For this example, you need an OpenAI API key. Set it as an environment variable named OPENAI_API_KEY. See Haystack’s documentation for the full list of models it supports.
The RAG pipeline retrieves the documents most relevant to the query, builds a prompt for the OpenAIGenerator from those retrieved documents and uses the LLM to generate an answer, which it returns as the final output.
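Putting that together, a minimal sketch of the RAG pipeline might look like the following; the prompt template, top_k value and sample question are illustrative assumptions:

```python
import os

from haystack import Pipeline
from haystack.components.builders import PromptBuilder
from haystack.components.embedders import SentenceTransformersTextEmbedder
from haystack.components.generators import OpenAIGenerator
from milvus_haystack import MilvusEmbeddingRetriever

assert "OPENAI_API_KEY" in os.environ, "Set OPENAI_API_KEY before running."

# An illustrative prompt that stuffs retrieved recipe chunks into the context.
prompt_template = """Answer the question based on the recipes below.

{% for doc in documents %}
{{ doc.content }}
{% endfor %}

Question: {{ question }}
Answer:"""

rag_pipeline = Pipeline()
rag_pipeline.add_component("text_embedder", SentenceTransformersTextEmbedder())
rag_pipeline.add_component(
    "retriever", MilvusEmbeddingRetriever(document_store=document_store, top_k=3)
)
rag_pipeline.add_component("prompt_builder", PromptBuilder(template=prompt_template))
rag_pipeline.add_component("generator", OpenAIGenerator())

# Embed the question, retrieve similar chunks, build the prompt, generate the answer.
rag_pipeline.connect("text_embedder.embedding", "retriever.query_embedding")
rag_pipeline.connect("retriever.documents", "prompt_builder.documents")
rag_pipeline.connect("prompt_builder.prompt", "generator.prompt")

question = "What can I make for dinner with chickpeas and spinach?"
result = rag_pipeline.run(
    {"text_embedder": {"text": question}, "prompt_builder": {"question": question}}
)
print(result["generator"]["replies"][0])
```

Note that the query is embedded with the same sentence-transformers model family used during indexing, so the question and the stored chunks live in the same vector space.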
Conclusion
This was a basic example illustrating how to integrate Milvus with Haystack 2.0. Combining Milvus’ vector storage, indexing and retrieval capabilities with Haystack’s RAG pipelines lets you build systems that effectively process and evaluate documents to deliver relevant answers. Hopefully, this example will help jump-start your efforts to build advanced LLM applications.