Effective document search relies on representing your data in a way that captures meaning. Common approaches include keyword search, vector similarity search, and semantic graph traversal.
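
To make the contrast between the first two approaches concrete, here is a minimal, library-free sketch. It does not use Morphik: the helper names (build_inverted_index, keyword_search, vector_search) and the two-dimensional embeddings are hypothetical and hand-made for illustration, whereas a real system would obtain embeddings from a model.

```python
import math
from collections import defaultdict

# Toy corpus (made up for illustration).
DOCS = {
    "d1": "quarterly revenue grew",
    "d2": "sales figures rose this quarter",
    "d3": "office party photos",
}

# Hand-made 2-D "embeddings" (assumption: axis 0 ~ finance, axis 1 ~ social).
EMBEDDINGS = {
    "d1": [0.9, 0.1],
    "d2": [0.8, 0.3],
    "d3": [0.0, 1.0],
}


def build_inverted_index(docs):
    """Map each lowercase term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def keyword_search(index, query):
    """Return ids of documents that contain every term of the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def vector_search(embeddings, query_vec, k):
    """Return the k document ids whose vectors are most similar to query_vec."""
    ranked = sorted(
        embeddings,
        key=lambda doc_id: cosine(embeddings[doc_id], query_vec),
        reverse=True,
    )
    return ranked[:k]


index = build_inverted_index(DOCS)
keyword_hits = keyword_search(index, "sales")          # exact term match only
vector_hits = vector_search(EMBEDDINGS, [1.0, 0.0], 2)  # a "finance-like" query
```

In this toy setup the keyword query "sales" can only match a document that contains that exact word, while the vector query ranks both finance-related documents ahead of the unrelated one, even though only one of them mentions sales.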

With Morphik, you can ingest text, images, and other modalities. Use retrieve_docs for a plain vector similarity search, or use query to combine retrieval with language model generation:

from morphik import Morphik

db = Morphik()

# Retrieve top matching documents
docs = db.retrieve_docs(query="latest sales figures", k=3)

# Or generate an answer from the documents
answer = db.query("summarize the trends", k=3)
print(answer.text)

  • Q: What is the difference between keyword and vector search?
    A: Keyword search matches exact terms via an inverted index, while vector search compares dense embeddings to capture semantic similarity even when different words are used.

  • Q: How can I limit search to a specific document category?
    A: Pass a filters dictionary when calling retrieve_docs or query, e.g. filters={"category": "finance"}, to restrict results to documents with matching metadata.

  • Q: When should I use query instead of retrieve_docs?
    A: Use query when you need the language model to read the retrieved docs and generate a synthesized answer; use retrieve_docs when you only need the raw documents.
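
The metadata filtering mentioned in the second answer can be pictured with a small standalone sketch. This is not Morphik's implementation: it assumes that a filter matches when every key in the filters dictionary equals the corresponding metadata value, and the helper names (matches, filter_docs) and sample documents are hypothetical.

```python
def matches(metadata, filters):
    """True if every filter key is present in metadata with an equal value."""
    return all(metadata.get(key) == value for key, value in filters.items())


def filter_docs(docs, filters):
    """Keep only the documents whose metadata satisfies all filters."""
    return [doc for doc in docs if matches(doc["metadata"], filters)]


# Sample documents (made up for illustration).
docs = [
    {"id": "a", "metadata": {"category": "finance", "year": 2024}},
    {"id": "b", "metadata": {"category": "legal", "year": 2024}},
    {"id": "c", "metadata": {"category": "finance", "year": 2023}},
]

finance_ids = [d["id"] for d in filter_docs(docs, {"category": "finance"})]
finance_2024_ids = [
    d["id"] for d in filter_docs(docs, {"category": "finance", "year": 2024})
]
```

Under this assumed equality semantics, adding more keys to filters narrows the result set, and an empty filters dictionary keeps every document.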