batch_get_chunks

Retrieve specific chunks by their document ID and chunk number in a single batch operation.

def batch_get_chunks(sources: List[Union[ChunkSource, Dict[str, Any]]]) -> List[FinalChunkResult]

Parameters

  • sources (List[Union[ChunkSource, Dict[str, Any]]]): List of ChunkSource objects or dictionaries with document_id and chunk_number

Returns

  • List[FinalChunkResult]: List of chunk results

Example

from databridge.sync import DataBridge
from databridge.models import ChunkSource

db = DataBridge()

# Using dictionaries
sources = [
    {"document_id": "doc_123", "chunk_number": 0},
    {"document_id": "doc_456", "chunk_number": 2}
]

# Or using ChunkSource objects
sources = [
    ChunkSource(document_id="doc_123", chunk_number=0),
    ChunkSource(document_id="doc_456", chunk_number=2)
]

chunks = db.batch_get_chunks(sources)
for chunk in chunks:
    print(f"Chunk from {chunk.document_id}, number {chunk.chunk_number}: {chunk.content[:50]}...")

FinalChunkResult Properties

Each FinalChunkResult object in the returned list has the following properties:

  • content (str | PILImage): Chunk content (text or image)
  • score (float): Relevance score
  • document_id (str): Parent document ID
  • chunk_number (int): Chunk sequence number
  • metadata (Dict[str, Any]): Document metadata
  • content_type (str): Content type
  • filename (Optional[str]): Original filename
  • download_url (Optional[str]): URL to download full document