This project is a Byte Pair Encoding (BPE) tokenizer written in Zig that tokenizes text and calculates the cost across various LLM providers.
It reads input from src/prompt.txt, performs BPE tokenization, and displays a comprehensive pricing table for popular language models.
- Pure Zig 0.15 implementation (no dependencies outside `std`)
- BPE Tokenization (see the sketch after this list):
  - Iteratively finds and merges the most frequent adjacent byte pair
  - Stops when no pair occurs more than once
  - ANSI-colored output for token visualization
- LLM Pricing Calculator:
  - Calculates prompt costs
  - Displays cost per prompt and price per million tokens
- Reads input from the `src/prompt.txt` file
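
The merge loop works exactly as the bullets above describe: count adjacent token pairs, replace the most frequent pair with a new token id, and repeat until no pair occurs more than once. Below is a minimal sketch of those two steps; the function names, the `u16` token type, and the in-place merge are illustrative assumptions, not the actual code in `src/main.zig`.

```zig
// Illustrative sketch of one BPE merge step; the real implementation in
// src/main.zig may differ in names, types, and data layout.
const Pair = struct { a: u16, b: u16, count: usize };

/// Scan adjacent token pairs and return the most frequent one, or null when
/// no pair occurs more than once (the loop's stopping condition).
fn mostFrequentPair(tokens: []const u16) ?Pair {
    var best: ?Pair = null;
    var i: usize = 0;
    while (i + 1 < tokens.len) : (i += 1) {
        var count: usize = 0;
        var j: usize = 0;
        while (j + 1 < tokens.len) : (j += 1) {
            if (tokens[j] == tokens[i] and tokens[j + 1] == tokens[i + 1]) count += 1;
        }
        if (best == null or count > best.?.count) {
            best = .{ .a = tokens[i], .b = tokens[i + 1], .count = count };
        }
    }
    if (best) |b| {
        if (b.count > 1) return b;
    }
    return null;
}

/// Replace every occurrence of the pair (a, b) with new_id, compacting the
/// slice in place and returning the new logical length.
fn mergePair(tokens: []u16, a: u16, b: u16, new_id: u16) usize {
    var write: usize = 0;
    var read: usize = 0;
    while (read < tokens.len) {
        if (read + 1 < tokens.len and tokens[read] == a and tokens[read + 1] == b) {
            tokens[write] = new_id;
            read += 2;
        } else {
            tokens[write] = tokens[read];
            read += 1;
        }
        write += 1;
    }
    return write;
}
```

A driver loop would call `mostFrequentPair`, assign the next unused token id to the winning pair, and call `mergePair`, repeating until `mostFrequentPair` returns `null`.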
Create a file `src/prompt.txt` with your text:

```
This sentence must be tokenized.
```

Run:

```sh
zig build run
```

The output shows the ANSI-colored token breakdown followed by the per-model pricing table.
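
The cost column follows directly from the token count and each model's per-million-token price. A minimal sketch of that calculation, assuming a helper like the one below (the name `promptCost` is illustrative, not taken from `src/main.zig`):

```zig
// Illustrative helper, not the project's actual code:
// cost per prompt = token_count * price_per_million / 1,000,000.
fn promptCost(token_count: usize, price_per_million: f64) f64 {
    return @as(f64, @floatFromInt(token_count)) * price_per_million / 1_000_000.0;
}
```

For example, a 1,000-token prompt priced at $0.50 per million tokens costs $0.0005.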
To add a new LLM model, edit the `models` array in `src/main.zig`:

```zig
const models = [_]Model{
    .{ .name = "Your Model Name", .price_per_million = 0.50 },
    // ... other models
};
```

Planned improvements:

- Allow reading text from a file instead of hardcoding
- Command-line arguments for file input
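
One possible shape for the planned command-line file input (purely hypothetical, not part of the current code) is to take the prompt path from the first argument and fall back to `src/prompt.txt`:

```zig
const std = @import("std");

// Hypothetical sketch of the planned command-line file input; shown only to
// illustrate the roadmap item above.
pub fn main() !void {
    const allocator = std.heap.page_allocator;

    // Read process arguments; argv[0] is the executable name.
    const args = try std.process.argsAlloc(allocator);
    defer std.process.argsFree(allocator, args);

    // Use the first argument as the prompt path, or fall back to the default.
    const path: []const u8 = if (args.len > 1) args[1] else "src/prompt.txt";
    std.debug.print("tokenizing {s}\n", .{path});
}
```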