How do I perform search over documents?
Techniques for searching document collections efficiently
Effective document search relies on representing your data in a way that captures meaning. Common approaches include keyword search, vector similarity search, and semantic graph traversal.
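To make the first two approaches concrete, here is a minimal sketch in plain Python. The corpus, the three-dimensional vectors, and all function names are made up for illustration; a real system would obtain embeddings from an embedding model and would typically use a dedicated index rather than a linear scan.

```python
import math
from collections import defaultdict

# Toy corpus; texts and vectors are invented for this example.
DOCS = {
    "d1": "quarterly revenue report",
    "d2": "annual earnings summary",
    "d3": "hiking trail guide",
}

# Hypothetical dense embeddings (a real system gets these from a model).
EMBEDDINGS = {
    "d1": [0.9, 0.1, 0.0],
    "d2": [0.8, 0.2, 0.1],
    "d3": [0.0, 0.1, 0.9],
}


def build_inverted_index(docs):
    """Map each lowercase term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def keyword_search(index, query):
    """Return ids of documents that contain every term of the query."""
    term_sets = [index.get(term, set()) for term in query.lower().split()]
    return set.intersection(*term_sets) if term_sets else set()


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def vector_search(embeddings, query_vec, k=2):
    """Return the k document ids whose vectors are closest to query_vec."""
    ranked = sorted(
        embeddings,
        key=lambda doc_id: cosine(embeddings[doc_id], query_vec),
        reverse=True,
    )
    return ranked[:k]


index = build_inverted_index(DOCS)
keyword_hits = keyword_search(index, "revenue")
# Pretend this vector is the embedding of the query "revenue".
vector_hits = vector_search(EMBEDDINGS, [0.85, 0.15, 0.05], k=2)
```

In this toy setup the keyword search for "revenue" only finds the document that contains that exact word, while the vector search also surfaces the "earnings" document because its vector lies close to the query vector.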
With Morphik, you can ingest text, images, and other modalities. Use the `retrieve_docs` function for a simple vector similarity search, or `query` to combine retrieval with language model generation.
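The difference between the two calls can be sketched with a small in-memory stand-in. This is not the Morphik client: `ToySearchClient`, `Doc`, the `k` parameter, and the word-overlap scoring are all assumptions made for illustration, and the "generation" step is a placeholder string where a real system would call a language model. Only the method names `retrieve_docs` and `query` come from the text above.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    text: str
    score: int


class ToySearchClient:
    """Hypothetical in-memory stand-in; NOT the real Morphik client."""

    def __init__(self, texts):
        self.texts = list(texts)

    def retrieve_docs(self, question, k=2):
        """Return the k best-matching documents, without generating anything."""
        # Word overlap stands in for vector similarity in this sketch.
        q_terms = set(question.lower().split())
        scored = [
            Doc(text, len(q_terms & set(text.lower().split())))
            for text in self.texts
        ]
        scored.sort(key=lambda doc: doc.score, reverse=True)
        return scored[:k]

    def query(self, question, k=2):
        """Retrieve first, then produce an answer from the retrieved text."""
        context = " ".join(doc.text for doc in self.retrieve_docs(question, k=k))
        # Placeholder for the language model call of a real system.
        return f"Answer based on: {context}"


client = ToySearchClient([
    "revenue grew in the third quarter",
    "the office moved to Berlin",
    "third quarter costs fell",
])
docs = client.retrieve_docs("third quarter revenue", k=2)
answer = client.query("third quarter revenue", k=2)
```

Here `retrieve_docs` hands back ranked documents for the caller to process, whereas `query` runs the same retrieval internally and returns a single answer string.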
Related questions
- Q: What is the difference between keyword and vector search?
  A: Keyword search matches exact terms via an inverted index, while vector search compares dense embeddings to capture semantic similarity even when different words are used.
- Q: How can I limit search to a specific document category?
  A: Pass a `filters` dictionary when calling `retrieve_docs` or `query`, e.g. `filters={"category": "finance"}`, to restrict results to documents with matching metadata.
- Q: When should I use `query` instead of `retrieve_docs`?
  A: Use `query` when you need the language model to read the retrieved docs and generate a synthesized answer; use `retrieve_docs` when you only need the raw documents.
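
As a rough illustration of what metadata filtering amounts to, the sketch below applies a `filters` dictionary to an in-memory list of documents. The helper names `matches_filters` and `filter_docs`, the document layout, and the exact-equality matching rule are assumptions for this example; Morphik's own filter semantics may differ.

```python
def matches_filters(metadata, filters):
    """True when every filter key is present in metadata with an equal value."""
    return all(
        key in metadata and metadata[key] == value
        for key, value in filters.items()
    )


def filter_docs(docs, filters=None):
    """Keep only documents whose metadata satisfies all filters."""
    if not filters:
        return list(docs)
    return [doc for doc in docs if matches_filters(doc["metadata"], filters)]


# Invented sample documents with a "category" metadata field.
sample_docs = [
    {"id": "a", "metadata": {"category": "finance", "year": 2023}},
    {"id": "b", "metadata": {"category": "legal", "year": 2023}},
    {"id": "c", "metadata": {"category": "finance", "year": 2024}},
]

finance_docs = filter_docs(sample_docs, filters={"category": "finance"})
finance_2024 = filter_docs(sample_docs, filters={"category": "finance", "year": 2024})
```

Applying the filter before ranking keeps unrelated categories out of the candidate set, so the similarity search only has to order documents that are already eligible.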