Prune your logs
Context pruning middleware — collapse noise, rank relevance, pack to your token budget before the model ever sees it.
Thousands of repetitive log lines become a focused context stream — critical errors preserved, duplicates collapsed.
Roka is a pruning middleware: ingest logs, code, or prose, detect structure, score relevance to your query, and deliver exactly what fits the budget.
Fingerprints repetitive lines by replacing UUIDs, IPs, and timestamps with placeholders, then collapses any pattern that repeats more often than a threshold into a summary chunk with provenance.
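The fingerprint-and-collapse step can be sketched roughly as follows. This is a minimal illustration, not Roka's actual implementation: the regexes, the placeholder names, the `threshold` default, and the summary-line format are all assumptions.

```python
import re
from collections import defaultdict

# Hypothetical placeholder patterns. Order matters: specific shapes
# (UUIDs, timestamps, IPs) are replaced before bare numbers.
PLACEHOLDERS = [
    (re.compile(r"\b[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}"
                r"-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\b"), "<UUID>"),
    (re.compile(r"\b\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}(?:\.\d+)?Z?"), "<TS>"),
    (re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"), "<IP>"),
    (re.compile(r"\b\d+\b"), "<NUM>"),
]


def fingerprint(line):
    """Replace variable tokens so lines with the same shape compare equal."""
    for pattern, placeholder in PLACEHOLDERS:
        line = pattern.sub(placeholder, line)
    return line.strip()


def collapse(lines, threshold=3):
    """Collapse fingerprints seen more than `threshold` times into one summary line."""
    groups = defaultdict(list)
    for lineno, line in enumerate(lines, start=1):
        groups[fingerprint(line)].append(lineno)

    out, emitted = [], set()
    for line in lines:
        fp = fingerprint(line)
        hits = groups[fp]
        if len(hits) <= threshold:
            out.append(line)  # rare enough: keep verbatim
        elif fp not in emitted:
            emitted.add(fp)
            # Provenance: repeat count plus first and last original line numbers.
            out.append(f"[collapsed x{len(hits)}, lines {hits[0]}..{hits[-1]}] {fp}")
    return out
```

The summary line keeps the repeat count and the first and last line numbers, so a reader can still trace a collapsed pattern back to the raw input.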
Panics, exceptions, OOM, auth failures, and stack traces are never collapsed — always packed first regardless of score.
Fast lexical scoring with BM25Okapi, then optional sentence-transformer re-ranking so chunks match your query intent.
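To make the lexical stage concrete, here is a small standard-library sketch of Okapi BM25 scoring. It is not the rank-bm25 implementation the pipeline names: the whitespace tokenizer, the `k1` and `b` defaults, and the non-negative IDF variant are assumptions chosen for illustration, and the optional semantic re-ranking stage is not shown.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive lowercase whitespace tokenizer (an assumption for this sketch).
    return text.lower().split()


def bm25_scores(query, chunks, k1=1.5, b=0.75):
    """Return one Okapi BM25 score per chunk for the given query."""
    docs = [tokenize(c) for c in chunks]
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs if n_docs else 0.0

    # Document frequency: in how many chunks each term appears.
    df = Counter()
    for d in docs:
        df.update(set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            # Non-negative IDF variant: ln((N - n + 0.5) / (n + 0.5) + 1).
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term-frequency saturation with length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores
```

Chunks that share no term with the query score zero, which is why a second, semantic pass can help when the query and the logs use different wording.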
Greedy packing with exact tiktoken counts (cl100k_base). A source diversity cap prevents any one file from hogging the budget.
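A greedy packer along these lines might look like the sketch below. Everything named here is hypothetical: the `Chunk` fields, the `max_source_share` parameter, and the choice to exempt critical chunks from the per-source cap are assumptions. The default counter is a whitespace stand-in; an exact count would come from a real tokenizer such as tiktoken's cl100k_base encoding, passed in as `counter`.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    source: str
    score: float
    critical: bool = False


def count_tokens(text):
    # Stand-in only: whitespace words approximate, but do not equal, token counts.
    return len(text.split())


def pack(chunks, budget, max_source_share=0.5, counter=count_tokens):
    """Greedily pack chunks into `budget` tokens, critical chunks first."""
    # Critical chunks sort ahead of everything else; the rest by descending score.
    order = sorted(chunks, key=lambda c: (not c.critical, -c.score))
    used = 0
    per_source = defaultdict(int)
    packed = []
    for chunk in order:
        cost = counter(chunk.text)
        if used + cost > budget:
            continue  # does not fit; a smaller chunk later still might
        over_cap = per_source[chunk.source] + cost > max_source_share * budget
        if over_cap and not chunk.critical:
            continue  # diversity cap: this source already has its share
        packed.append(chunk)
        used += cost
        per_source[chunk.source] += cost
    return packed, used
```

Skipping an oversized chunk instead of stopping at it lets smaller, lower-ranked chunks fill the remaining budget.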
Auto-detects logs, code, or prose. Code can be minified (docstrings/comments stripped) and chunked by function/class scope.
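For Python input, scope-based chunking and docstring/comment stripping can be approximated with the standard `ast` module. This is a sketch under assumptions, not Roka's code path: the function names are hypothetical, only top-level functions and classes are chunked, and `ast.unparse` (Python 3.9+) reformats the source as a side effect.

```python
import ast

_SCOPES = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)


def chunk_by_scope(source):
    """Return (name, text) pairs for each top-level function or class."""
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks = []
    for node in tree.body:
        if isinstance(node, _SCOPES):
            # lineno/end_lineno are 1-based and inclusive.
            text = "\n".join(lines[node.lineno - 1:node.end_lineno])
            chunks.append((node.name, text))
    return chunks


def strip_docstrings(source):
    """Drop docstrings; comments vanish too because the AST never stores them."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.Module,) + _SCOPES):
            body = node.body
            if (body and isinstance(body[0], ast.Expr)
                    and isinstance(body[0].value, ast.Constant)
                    and isinstance(body[0].value.value, str)):
                # Keep the body non-empty so the result still parses.
                node.body = body[1:] or [ast.Pass()]
    return ast.unparse(tree)
```

Chunking on scope boundaries keeps each function intact, so a chunk that ranks highly is still readable on its own.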
Pipe stdin, point at a file, or POST to /api/prune. The same pipeline runs everywhere; on the CLI, stdout is always the pruned context.
Paste a log dump, set your query and budget, and run the live pipeline.
Drop Roka into your agent pipeline, CI debug step, or observability stack. Send one POST with your raw context and query, and get back the pruned text plus compression stats.
curl -X POST http://localhost:8000/api/prune \
  -H "Content-Type: application/json" \
  -d '{
    "text": "<raw logs or code>",
    "query": "why did auth fail",
    "token_budget": 8000,
    "use_semantic": true,
    "use_minification": false
  }'
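The same request can be sent from Python with only the standard library. The sketch below mirrors the curl call above; the helper names are hypothetical, and because the response schema is not documented here, `prune` simply returns the parsed JSON without assuming any field names.

```python
import json
import urllib.request


def build_prune_request(text, query, token_budget=8000, use_semantic=True,
                        use_minification=False,
                        url="http://localhost:8000/api/prune"):
    """Build (but do not send) the POST request shown in the curl example."""
    payload = {
        "text": text,
        "query": query,
        "token_budget": token_budget,
        "use_semantic": use_semantic,
        "use_minification": use_minification,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def prune(text, query, **options):
    """Send the request and return the decoded JSON body as-is."""
    request = build_prune_request(text, query, **options)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running locally, `prune(open("logs.txt").read(), "why did auth fail")` sends the request; keeping request construction separate from sending makes the payload easy to inspect without a live server.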
Requires Python 3.10+. Install the dependencies, then run the API server or use the CLI directly.
pip install fastapi uvicorn rank-bm25 tiktoken \
sentence-transformers typer rich
python project/main.py
# → http://127.0.0.1:8000
cat logs.txt | python project/prune.py \
--query "deploy failure" --budget 8000 --stats