chunklet.common.token_utils
Functions:
- count_tokens – Count tokens in a string using a provided token counting function.
count_tokens
cached
Count tokens in a string using a provided token counting function.
Wraps the token counting function with error handling. Ensures the returned value is numeric and converts it to an integer.
Parameters:
- text (str) – Text to count tokens in.
- token_counter (Callable[[str], int]) – Function that returns the number of tokens.
Returns:
- int – Number of tokens.
Raises:
- CallbackError – If the token counter fails or returns an invalid type.
Examples:
>>> def simple_word_counter(text: str) -> int:
...     return len(text.split())
>>> text = "This is a sample sentence."
>>> count_tokens(text, simple_word_counter)
5
>>> def char_counter(text: str) -> int:
...     return len(text)
>>> count_tokens("hello", char_counter)
5
>>> # Example with a failing token counter
>>> def failing_counter(text: str) -> int:
...     raise ValueError("Something went wrong!")
>>> try:
...     count_tokens("test", failing_counter)
... except CallbackError as e:
...     print(e)
Token counter failed while processing text starting with: 'test...'.
💡 Hint: Please ensure the token counter function handles all edge cases and returns an integer.
Details: Something went wrong!
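
The wrapping behavior described above (error handling, a numeric check, integer conversion, and caching) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not chunklet's actual implementation: `count_tokens_sketch` and the local `CallbackError` class are hypothetical stand-ins, and the use of `functools.lru_cache`, the cache size, the 50-character preview, and the exact message wording are all assumed for the example.

```python
from functools import lru_cache
from typing import Callable


class CallbackError(Exception):
    """Hypothetical stand-in for the library's CallbackError."""


# Assumption: the "cached" label is modeled here with functools.lru_cache.
# Both arguments must be hashable; str and plain functions are.
@lru_cache(maxsize=256)
def count_tokens_sketch(text: str, token_counter: Callable[[str], int]) -> int:
    """Call token_counter(text) and return the result as an int."""
    try:
        result = token_counter(text)
    except Exception as exc:
        # Wrap any failure from the user-supplied callback in one error type.
        raise CallbackError(
            f"Token counter failed while processing text starting with: "
            f"'{text[:50]}...'.\nDetails: {exc}"
        ) from exc

    # Reject non-numeric return values before converting.
    if not isinstance(result, (int, float)):
        raise CallbackError(
            f"Token counter returned {type(result).__name__}, expected a number."
        )
    return int(result)
```

In this sketch a counter that returns `5.0` yields `5`, while a counter that raises or returns a string produces a `CallbackError`. Because `lru_cache` does not store raised exceptions, a failing call is retried on the next invocation rather than served from the cache.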