chunklet.base_chunker

Base Chunker Abstract Class

Defines the interface for chunkers.

Classes:

BaseChunker

BaseChunker(verbose: bool = False)

Bases: ABC

Abstract base class for chunkers.

Defines the standard interface for chunking content into units.

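Methods:

  • chunk_file – Read and chunk a file.
  • chunk_files – Process multiple files.
  • chunk_text – Extract chunks from text.
  • chunk_texts – Process multiple texts.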

Source code in src/chunklet/base_chunker.py
def __init__(self, verbose: bool = False):
    self.verbose = verbose
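
Because BaseChunker derives from ABC and all four methods are marked abstract, Python refuses to instantiate it, or any subclass that leaves a method unimplemented. The sketch below illustrates that behavior with a stand-in class that mirrors the interface listed on this page, so it runs without chunklet installed. The names `PartialChunker` and `try_instantiate` are hypothetical, and plain lists of dicts stand in for `list[Box]`.

```python
from abc import ABC, abstractmethod


class BaseChunker(ABC):
    """Stand-in mirroring the documented interface (not imported from chunklet)."""

    def __init__(self, verbose: bool = False):
        self.verbose = verbose

    @abstractmethod
    def chunk_text(self, *args, **kwargs) -> list: ...

    @abstractmethod
    def chunk_texts(self, *args, **kwargs) -> list: ...

    @abstractmethod
    def chunk_file(self, *args, **kwargs) -> list: ...

    @abstractmethod
    def chunk_files(self, *args, **kwargs): ...


class PartialChunker(BaseChunker):
    """Hypothetical subclass that implements only one of the four methods."""

    def chunk_text(self, text: str) -> list:
        return [{"content": text}]


def try_instantiate(cls) -> str:
    """Return "ok" if cls can be constructed, else the TypeError message."""
    try:
        cls(verbose=True)
    except TypeError as exc:
        return f"TypeError: {exc}"
    return "ok"


print(try_instantiate(BaseChunker))
print(try_instantiate(PartialChunker))
```

Both calls report a TypeError naming the abstract methods that are still missing, which is the usual signal that a custom chunker is incomplete.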

chunk_file abstractmethod

chunk_file(*args, **kwargs) -> list[Box]

Read and chunk a file.

Returns:

  • list[Box] – List of chunks with content and metadata.

Source code in src/chunklet/base_chunker.py
@abstractmethod
def chunk_file(self, *args, **kwargs) -> list[Box]:
    """
    Read and chunk a file.

    Returns:
        list[Box]: List of chunks with content and metadata.
    """
    pass

chunk_files abstractmethod

chunk_files(*args, **kwargs) -> Generator[Box, None, None]

Process multiple files.

Yields:

  • Box – A Box object representing a chunk with its content and metadata.

Source code in src/chunklet/base_chunker.py
@abstractmethod
def chunk_files(self, *args, **kwargs) -> Generator[Box, None, None]:
    """
    Process multiple files.

    Yields:
        Box: `Box` object, representing a chunk with its content and metadata.
    """
    pass
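
Unlike the other methods, chunk_files is annotated as a generator, so chunks can be consumed one at a time instead of being collected into a list first. The following is a minimal sketch of one way an implementation might satisfy that contract. `LineChunker`, its one-chunk-per-line rule, the metadata keys, and the `demo` helper are all assumptions made for illustration, and plain dicts stand in for Box objects.

```python
import tempfile
from pathlib import Path
from typing import Generator


class LineChunker:
    """Hypothetical chunker: one chunk per non-empty line (not chunklet's logic)."""

    def chunk_file(self, path: Path) -> list[dict]:
        lines = Path(path).read_text(encoding="utf-8").splitlines()
        return [
            {"content": line, "metadata": {"source": str(path), "line": number}}
            for number, line in enumerate(lines, start=1)
            if line.strip()
        ]

    def chunk_files(self, paths: list[Path]) -> Generator[dict, None, None]:
        # Each file is read only when the consumer advances the generator to it.
        for path in paths:
            yield from self.chunk_file(path)


def demo() -> list[str]:
    """Write two temporary files and collect the chunk contents in order."""
    with tempfile.TemporaryDirectory() as tmp:
        first = Path(tmp) / "a.txt"
        second = Path(tmp) / "b.txt"
        first.write_text("alpha\n\nbeta\n", encoding="utf-8")
        second.write_text("gamma\n", encoding="utf-8")
        chunks = LineChunker().chunk_files([first, second])
        return [chunk["content"] for chunk in chunks]


print(demo())
```

Delegating to chunk_file with `yield from` keeps the per-file logic in one place. A real implementation may batch or parallelize the work differently.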

chunk_text abstractmethod

chunk_text(*args, **kwargs) -> list[Box]

Extract chunks from text.

Returns:

  • list[Box] – List of chunks with content and metadata.

Source code in src/chunklet/base_chunker.py
@abstractmethod
def chunk_text(self, *args, **kwargs) -> list[Box]:
    """
    Extract chunks from text.

    Returns:
        list[Box]: List of chunks with content and metadata.
    """
    pass

chunk_texts abstractmethod

chunk_texts(*args, **kwargs) -> list[list[Box]]

Process multiple texts.

Returns:

  • list[list[Box]] – List of chunks for each input text.

Source code in src/chunklet/base_chunker.py
@abstractmethod
def chunk_texts(self, *args, **kwargs) -> list[list[Box]]:
    """
    Process multiple texts.

    Returns:
        list[list[Box]]: List of chunks for each input text.
    """
    pass
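
The return type `list[list[Box]]` implies one inner list per input text, in input order. Below is a hedged sketch of how a concrete chunker could build chunk_texts on top of chunk_text. `ParagraphChunker` and its blank-line splitting rule are hypothetical, the class deliberately does not inherit from chunklet so that the example stays self-contained, and dicts stand in for Box objects.

```python
class ParagraphChunker:
    """Hypothetical chunker that splits text on blank lines."""

    def __init__(self, verbose: bool = False):
        self.verbose = verbose

    def chunk_text(self, text: str) -> list[dict]:
        # One chunk per non-empty paragraph, with surrounding whitespace removed.
        return [
            {"content": part.strip()}
            for part in text.split("\n\n")
            if part.strip()
        ]

    def chunk_texts(self, texts: list[str]) -> list[list[dict]]:
        # One inner list per input text, preserving the input order.
        return [self.chunk_text(text) for text in texts]


chunker = ParagraphChunker()
result = chunker.chunk_texts(["first\n\nsecond", "third", ""])
print([len(chunks) for chunks in result])
```

An empty input text yields an empty inner list instead of being dropped, so the result always lines up index by index with the inputs.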