1 comment

  • raaihank 4 hours ago
    Costbase is an LLM cost optimization proxy. We just shipped TOON (Token-Oriented Object Notation) compression.

    TOON is an open format (not ours): https://github.com/toon-format/toon

    It converts JSON like this:

        {"id": "cust_001", "name": "Acme", "mrr": 15000}
    
    Into:

        id: cust_001
        name: Acme
        mrr: 15000

    We integrated it into our gateway to automatically compress JSON in tool results, user messages, and tool call arguments before they hit the LLM.
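
    To make the gateway step concrete, here is a rough Python sketch of the idea, not our production code. The names `encode_flat` and `compress_message` are made up for illustration, and the encoder only handles flat objects and skips the quoting rules the real TOON spec defines. Anything that isn't a JSON object, or that the sketch can't encode, passes through unchanged.

```python
import json


def encode_flat(obj):
    """Render a flat dict as `key: value` lines (simplified: no quoting, no nesting)."""
    lines = []
    for key, value in obj.items():
        if isinstance(value, (dict, list)):
            raise ValueError("nested values are out of scope for this sketch")
        # Strings are emitted bare; numbers, booleans and null use their JSON spelling.
        text = value if isinstance(value, str) else json.dumps(value)
        lines.append(f"{key}: {text}")
    return "\n".join(lines)


def compress_message(content, encode=encode_flat):
    """Return a compact encoding if `content` is a JSON object, else the original string."""
    try:
        parsed = json.loads(content)
    except (TypeError, ValueError):
        return content  # not JSON: pass through untouched
    if not isinstance(parsed, dict):
        return content  # arrays and scalars are not handled by this sketch
    try:
        return encode(parsed)
    except ValueError:
        return content  # unsupported shape: fall back to the original JSON


if __name__ == "__main__":
    print(compress_message('{"id": "cust_001", "name": "Acme", "mrr": 15000}'))
```

    The fallback paths matter more than the encoder: a proxy sitting in front of the LLM should never make a payload worse or drop it because it didn't parse.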

    Benchmarks on real payloads:

    - CRM query (10 records): 48% tokens saved
    - E-commerce orders (4 orders): 34% saved
    - API metrics (8 endpoints): 43% saved

    Latency overhead is under 100μs. In our testing, LLMs (GPT-4o, Claude, etc.) parse it correctly.

    It's not a silver bullet: it works best on arrays of objects with uniform schemas. Deeply nested or irregular JSON sees less benefit.
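
    The reason uniform arrays do well is that TOON's tabular form writes the field names once in a header instead of repeating them in every record. A simplified Python sketch of that layout follows. `encode_table` is a hypothetical helper, the second customer row is invented sample data, and the sketch assumes values are plain strings or numbers with no commas, so it skips the spec's quoting and escaping rules.

```python
def encode_table(key, rows):
    """Encode a list of flat dicts sharing one schema as a TOON-style table."""
    if not rows:
        raise ValueError("need at least one row")
    fields = list(rows[0].keys())
    for row in rows:
        # Irregular records cannot share a header, so this sketch rejects them.
        if list(row.keys()) != fields:
            raise ValueError("rows do not share one schema")
    # Header: name, row count, then the field list written a single time.
    header = key + "[" + str(len(rows)) + "]{" + ",".join(fields) + "}:"
    body = ["  " + ",".join(str(row[f]) for f in fields) for row in rows]
    return "\n".join([header] + body)


if __name__ == "__main__":
    customers = [
        {"id": "cust_001", "name": "Acme", "mrr": 15000},
        {"id": "cust_002", "name": "Globex", "mrr": 9000},  # made-up sample row
    ]
    print(encode_table("customers", customers))
```

    With irregular records the shared header is no longer possible, so each record has to carry its own keys again and most of the saving goes away.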

    Curious what strategies others use for token compression. We considered CSV for tabular data but it doesn't handle nested structures.

    https://www.costbase.ai