In this episode of the Data Engineering Podcast, Tobias Macey interviews Kacper Lukawski, a Senior Developer Advocate at Qdrant, about using MCP servers with vector databases to streamline unstructured data processing. They discuss the challenges teams face in building pipelines for unstructured data, the applications of LLMs in transforming this data, and the design considerations for storing vector embeddings. Kacper distinguishes vector databases from search engines, highlighting Qdrant's role as a search engine and the importance of keeping original data close to vectors. They explore retrieval methods, the efficiency of vector databases compared to traditional search engines, and the broader applications of vector engines beyond RAG. The conversation also covers the role of MCP servers, best practices for structuring data, the need for experimentation in data teams, and strategies for managing the lifecycle of embeddings. Kacper shares insights on grounded vibe coding, the cost of running vector search, and the importance of choosing the right embedding model, as well as Qdrant's future plans, including code generation-specific MCP servers.