def retrieve_docs(
    query: str,
    filters: Optional[Dict[str, Any]] = None,
    k: int = 4,
    min_score: float = 0.0,
    use_colpali: bool = True,
    use_reranking: Optional[bool] = None,
    folder_name: Optional[Union[str, List[str]]] = None,
    folder_depth: Optional[int] = None,
) -> List[DocumentResult]

Parameters

  • query (str): Search query text
  • filters (Dict[str, Any], optional): Metadata filters. Defaults to None.
  • k (int, optional): Number of results. Defaults to 4.
  • min_score (float, optional): Minimum similarity threshold. Defaults to 0.0.
  • use_colpali (bool, optional): Whether to use a ColPali-style embedding model for retrieval. Only works for documents ingested with use_colpali=True. Defaults to True.
  • use_reranking (bool, optional): Overrides the workspace reranking configuration for this request. Defaults to None.
  • folder_name (str | List[str], optional): Folder scope. Accepts a canonical path (e.g., /projects/alpha/specs) or a list of paths/names. Defaults to None.
  • folder_depth (int, optional): Folder scope depth. None/0 = exact match, -1 = include all descendants, n > 0 = include descendants up to n levels deep.
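
The folder_depth rules can be easier to follow as code. The sketch below is a local illustration only: in_folder_scope is a hypothetical helper, not part of the Morphik SDK, and it assumes folders are plain /-separated paths and that a depth of n > 0 also includes the scoped folder itself.

```python
from typing import Optional


def in_folder_scope(doc_folder: str, scope: str, depth: Optional[int] = None) -> bool:
    """Hypothetical helper: does doc_folder fall inside scope for this depth?

    Illustrates the folder_depth semantics described above; the real
    matching happens server-side and may differ in detail.
    """
    if depth is not None and depth < -1:
        raise ValueError("depth must be None, -1, 0, or a positive integer")

    scope_parts = [p for p in scope.split("/") if p]
    doc_parts = [p for p in doc_folder.split("/") if p]

    # The document's folder must sit at or below the scope path.
    if doc_parts[: len(scope_parts)] != scope_parts:
        return False

    levels_below = len(doc_parts) - len(scope_parts)
    if depth is None or depth == 0:
        return levels_below == 0   # exact match only
    if depth == -1:
        return True                # all descendants
    return levels_below <= depth   # descendants up to n levels deep
```

Under these assumptions, a document in /projects/alpha/specs/v2 is in scope for /projects/alpha with a depth of -1 or 2, but not with a depth of 1 or None.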

Metadata Filters

Filters share a common JSON DSL. Review the Metadata Filtering guide for supported operators and typed comparisons. Example:
filters = {
    "$and": [
        {"department": {"$eq": "research"}},
        {"priority": {"$gte": 40}},
        {"start_date": {"$lte": "2024-06-01"}}
    ]
}

# db is a Morphik client instance (created as in the Examples section)
docs = db.retrieve_docs("budget summary", filters=filters, k=5)
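
To make the meaning of the example filter concrete, here is a minimal local evaluator for just the operators it uses ($and, $eq, $gte, $lte). It is a sketch for reasoning about filters, not how the service evaluates them: matches and example_filters are hypothetical names, and dates are compared as plain ISO strings, whereas the Metadata Filtering guide describes typed comparisons.

```python
import operator
from typing import Any, Dict

# Only the operators used in the example above; the full DSL supports more.
_OPS = {"$eq": operator.eq, "$gte": operator.ge, "$lte": operator.le}


def matches(metadata: Dict[str, Any], flt: Dict[str, Any]) -> bool:
    """Hypothetical helper: return True if metadata satisfies the filter."""
    for key, cond in flt.items():
        if key == "$and":
            # Every sub-filter must hold.
            if not all(matches(metadata, sub) for sub in cond):
                return False
        else:
            # A missing field never matches in this sketch.
            if key not in metadata:
                return False
            for op, target in cond.items():
                if not _OPS[op](metadata[key], target):
                    return False
    return True


example_filters = {
    "$and": [
        {"department": {"$eq": "research"}},
        {"priority": {"$gte": 40}},
        # ISO YYYY-MM-DD strings sort chronologically when compared as text.
        {"start_date": {"$lte": "2024-06-01"}},
    ]
}
```

With this sketch, metadata such as {"department": "research", "priority": 42, "start_date": "2024-05-15"} satisfies example_filters, while the same metadata with priority 10 does not.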

Returns

  • List[DocumentResult]: The matching documents, each with a relevance score, document ID, metadata, and content (see DocumentResult Properties)

Examples

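from morphik import Morphik

db = Morphik()

# Retrieve 5 documents, keeping only those with a score of at least 0.5
docs = db.retrieve_docs(
    "machine learning",
    k=5,
    min_score=0.5
)

# Scope the search to /projects/alpha and all of its descendant folders
nested_docs = db.retrieve_docs(
    "design notes",
    folder_name="/projects/alpha",
    folder_depth=-1,
)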

for doc in docs:
    print(f"Score: {doc.score}")
    print(f"Document ID: {doc.document_id}")
    print(f"Metadata: {doc.metadata}")
    print(f"Content: {doc.content}")
    print("---")

DocumentResult Properties

The DocumentResult objects returned by this method have the following properties:
  • score (float): Relevance score
  • document_id (str): Document ID
  • metadata (Dict[str, Any]): Document metadata
  • content (DocumentContent): Document content or URL
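
Client-side post-processing of results might look like the sketch below. It uses a stand-in dataclass that mirrors the four properties listed above so it runs without a server; FakeDocumentResult and top_document_ids are hypothetical names, and the real content field is a DocumentContent object whose structure is not shown here.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class FakeDocumentResult:
    """Stand-in with the same property names as DocumentResult (illustration only)."""
    score: float
    document_id: str
    metadata: Dict[str, Any] = field(default_factory=dict)
    content: Any = None  # the real type is DocumentContent


def top_document_ids(
    results: List[FakeDocumentResult],
    min_score: float = 0.0,
    limit: Optional[int] = None,
) -> List[str]:
    """Return document IDs with score >= min_score, highest score first."""
    kept = sorted(
        (r for r in results if r.score >= min_score),
        key=lambda r: r.score,
        reverse=True,
    )
    if limit is not None:
        kept = kept[:limit]
    return [r.document_id for r in kept]


sample_results = [
    FakeDocumentResult(score=0.62, document_id="doc_a"),
    FakeDocumentResult(score=0.91, document_id="doc_b"),
    FakeDocumentResult(score=0.30, document_id="doc_c"),
]
```

Because the helper only reads score and document_id, it should work the same way on the list returned by retrieve_docs, assuming those attributes behave as documented above.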