Text Chunk Visualizer
Interactive chunking in action - upload, process, and explore!
Text Chunk Visualizer: Your Window into the Chunking Abyss
Quick Install
This installs the web interface dependencies (FastAPI + Uvicorn) for interactive chunk visualization! 🌐
Ever wondered what your text or code looks like after being chopped up by a chunking algorithm? The Text Chunk Visualizer demystifies text segmentation with a clean web interface - WYSIWYG (what you see is what you get).
No more guessing games - see your chunking results in real time!
So How Do I Get This Running?
First, make sure you have the visualization dependencies:
Here's the basic code to get it running:
- Host: The IP address where the server will listen. Use "127.0.0.1" for localhost or "0.0.0.0" to allow access from other devices on your network.
- Port: The port number for the web server. The visualizer will be accessible at http://host:port.
Example output:
Starting Chunklet Visualizer...
URL: http://127.0.0.1:8000
Press Ctrl+C to stop the server
Opened in default browser
= = = = = = = = = = = = = = = = = = = = = = = = = =
TEXT CHUNK VISUALIZER
= = = = = = = = = = = = = = = = = = = = = = = = = =
URL: http://127.0.0.1:8000
INFO: Started server process [30999]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
INFO: 127.0.0.1:45482 - "GET / HTTP/1.1" 200 OK
INFO: 127.0.0.1:45490 - "GET /static/js/app.js HTTP/1.1" 200 OK
INFO: 127.0.0.1:45482 - "GET /static/css/style.css HTTP/1.1" 304 Not Modified
INFO: 127.0.0.1:45490 - "GET /api/token_counter_status HTTP/1.1" 200 OK
Run this and you'll see the server start up with the URL where your visualizer is ready!
(But honestly, the CLI chunklet visualize command is way easier for most use cases!)
Prefer command line?
For quick access without writing code, check out the CLI visualize command.
What's the Web Interface Like?
Open your browser to the URL shown in the terminal output. You'll find a clean interface designed for quick chunking experiments.
How do I upload files?
Simple: drag and drop your text files (.txt, .md, .py, etc.) onto the upload area, or click "Browse Files" to select them manually. The visualizer accepts any text-based file.
What's the difference between Document and Code mode?
Choose your chunking strategy after upload:
- Document Mode: For general text, articles, and documents - focuses on sentences and sections
- Code Mode: For source code - understands functions, classes, and code structure
Each mode has its own parameter controls because text and code need different chunking approaches.
How Do I Process My Content?
Select your mode and parameters, then click "Process Document" or "Process Code". The visualizer applies your settings and shows exactly how your content gets chunked.
What About the Interactive Features?
The interface lets you inspect each chunk directly:
- Click to Highlight: Click text to see which chunk(s) contain it
- Double-Click for Details: Get full metadata popups with span, chunk number, and source info
- Overlap Toggle: Use "Reveal Overlaps" to see where chunks share content
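To build intuition for what "Reveal Overlaps" highlights, here is a minimal sketch that finds shared regions between consecutive chunks from their spans. It assumes each span is a `[start, end)` pair of character offsets into the source text, which this page does not state explicitly, and `overlapping_pairs` is a hypothetical helper, not part of Chunklet or the visualizer's own implementation.

```python
def overlapping_pairs(spans: list[tuple[int, int]]) -> list[tuple[int, int, int, int]]:
    """Find overlaps between consecutive spans.

    Assumption: each span is (start, end) in character offsets, end exclusive.
    Returns (chunk_num_a, chunk_num_b, shared_start, shared_end) tuples,
    numbering chunks from 1 in the order given.
    """
    result = []
    for i in range(len(spans) - 1):
        (start_a, end_a), (start_b, end_b) = spans[i], spans[i + 1]
        # The shared region is the intersection of the two intervals.
        shared_start = max(start_a, start_b)
        shared_end = min(end_a, end_b)
        if shared_start < shared_end:
            result.append((i + 1, i + 2, shared_start, shared_end))
    return result
```

Calling `overlapping_pairs([(0, 150), (120, 300), (300, 400)])` reports one shared region between chunks 1 and 2, covering offsets 120 to 150.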
Can I Export My Results?
Absolutely! Click "Download Chunks" to get a JSON file with all chunks, their content, and complete metadata - perfect for further processing or analysis.
The exported JSON follows this structure:
{
"chunks": [
{
"content": "The actual text content of this chunk...",
"metadata": {
"source": "path/to/source/file.txt",
"chunk_num": 1,
"span": [0, 150],
// ... other metadata fields
}
},
// ... more chunks
],
"stats": {
"chunk_count": 3,
"overlap_count": 2,
"text_length": 696,
"mode": "document",
"generated": "2025-12-18T15:16:11.379Z"
}
}
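Once you have a downloaded export, a short script can sanity-check it. The sketch below relies only on the fields shown in the structure above (`chunks`, `content`, `metadata.span`, `stats.chunk_count`, `stats.mode`); the helper names and the `chunks.json` filename in the usage note are hypothetical, and a real export may carry extra metadata fields that this code ignores.

```python
import json
from pathlib import Path


def load_export(path: str) -> dict:
    """Read a downloaded export from disk (the file name is whatever you saved it as)."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def summarize_export(data: dict) -> dict:
    """Summarize an export that follows the structure shown above."""
    chunks = data["chunks"]
    stats = data.get("stats", {})
    lengths = [len(chunk["content"]) for chunk in chunks]
    return {
        "chunk_count": len(chunks),
        # True when the stats block agrees with the number of chunks present.
        "count_matches_stats": stats.get("chunk_count") == len(chunks),
        "mode": stats.get("mode"),
        "longest_chunk": max(lengths, default=0),
        "spans": [tuple(chunk["metadata"]["span"]) for chunk in chunks],
    }
```

For example, `summarize_export(load_export("chunks.json"))` returns a small dictionary you can print or assert on in a test pipeline.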
Quick Tips for Better Results
- Start with small files to get familiar with the interface
- Experiment with different parameter combinations to see their effects
- Use the metadata views to understand chunk boundaries
- The visualizer is perfect for comparing chunking strategies side-by-side
Go experiment! The visualizer makes it easy to see exactly what your settings produce, so you can fine-tune for optimal chunking.
Headless/REST API Usage
The Visualizer isn't just a web interface - it also exposes a REST API for headless chunking operations. This means you can run the same chunking programmatically, without the web UI!
Available Endpoints
| Method | Endpoint | Description |
|---|---|---|
| GET | /health | Health check endpoint |
| GET | /api/token_counter_status | Check if token counter is configured |
| POST | /api/chunk | Upload and chunk a file |
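Before sending files, it can help to confirm that the server is reachable. Here is a minimal sketch using only the Python standard library to call the two GET endpoints listed above. The base URL is an assumption taken from the example startup output, `endpoint_url` and `get_raw` are hypothetical helper names, and the response bodies are returned as raw text because their exact shape is not documented on this page.

```python
import urllib.request
from urllib.parse import urljoin

# Assumption: the server runs at the address shown in the example startup output.
BASE_URL = "http://127.0.0.1:8000"


def endpoint_url(base_url: str, path: str) -> str:
    """Join a base URL and an endpoint path without doubling or dropping slashes."""
    return urljoin(base_url.rstrip("/") + "/", path.lstrip("/"))


def get_raw(base_url: str, path: str, timeout: float = 5.0) -> tuple[int, str]:
    """Send a GET request and return (HTTP status, body as text).

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses and
    urllib.error.URLError when the server cannot be reached.
    """
    with urllib.request.urlopen(endpoint_url(base_url, path), timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8")
```

With a server running, `get_raw(BASE_URL, "/health")` and `get_raw(BASE_URL, "/api/token_counter_status")` each return the status code and the raw body for you to inspect.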
Chunking Files Programmatically
Option 1: CLI Headless Server (Recommended)
The easiest way to start a headless visualizer server is with the CLI:
CLI Headless Mode
See Scenario 3: Headless Mode in the CLI documentation for more details on headless CLI usage with custom tokenizers.
Option 2: Python Server Script
For more programmatic control, create a custom server script (server.py):
Run the server:
Using the REST API Client
Use this Python client to chunk files programmatically:
Response Format
The /api/chunk endpoint returns:
{
"text": "Original file content...",
"chunks": [
{
"content": "Chunk text content...",
"metadata": {
"source": "filename.txt",
"chunk_num": 1,
"span": [0, 150],
// ... additional metadata
}
}
],
"stats": {
"text_length": 696,
"chunk_count": 3,
"mode": "document"
}
}
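Because the response includes both the original `text` and each chunk's `span`, you can check how the two relate. The sketch below assumes a span is a `[start, end)` pair of character offsets into `text`; that is an assumption, not something this page confirms, and the function reports the chunks where it does not hold rather than treating a mismatch as an error. `check_spans` is a hypothetical helper name.

```python
def check_spans(response: dict) -> list[int]:
    """Return the chunk_num of every chunk whose content differs from text[start:end].

    Assumption: metadata.span holds character offsets with an exclusive end.
    An empty result means every chunk matches its slice of the original text.
    """
    text = response["text"]
    mismatched = []
    for chunk in response["chunks"]:
        start, end = chunk["metadata"]["span"]
        if text[start:end] != chunk["content"]:
            mismatched.append(chunk["metadata"]["chunk_num"])
    return mismatched
```

If the list comes back non-empty for your data, the chunker may normalize whitespace or use a different offset convention, so treat the result as a diagnostic rather than a validation failure.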
Perfect for Integration
Use the REST API to integrate Chunklet's visualizer capabilities into your own applications, automation scripts, or testing pipelines!
API Reference
For complete technical details on the Visualizer class, check out the API documentation.