Improved Streaming Support and Async Stream Parser
We’ve made significant improvements to our streaming functionality with two key updates:
Stream Fixes
We’ve resolved several issues with stream handling across different LLM providers, ensuring more reliable and consistent streaming experiences. These fixes address edge cases and improve compatibility with various streaming implementations, including:
- Better handling of stream interruptions and reconnections
- Improved error handling for streaming responses
- Enhanced compatibility with different LLM provider streaming formats
- Fixed timing calculations for streamed responses
New Streaming Methods
The `HeliconeManualLogger` class now includes enhanced methods for working with streams:
- `logStream`: Logs a streaming operation with full control over stream handling
- `logSingleStream`: Simplified method for logging a single `ReadableStream`
- `logSingleRequest`: Logs a single request with a response body
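The full streaming example below uses `logStream`; for a non-streaming call, `logSingleRequest` is the lighter option. As a rough sketch (the shape of the options argument here is an assumption based on the description above, not a confirmed signature):

```typescript
import Together from "together-ai";
import { HeliconeManualLogger } from "@helicone/helpers";

const together = new Together({ apiKey: process.env.TOGETHER_API_KEY });
const helicone = new HeliconeManualLogger({
  apiKey: process.env.HELICONE_API_KEY!,
});

export async function askOnce(question: string) {
  const body = {
    model: "meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo",
    messages: [{ role: "user", content: question }],
  } as Together.Chat.CompletionCreateParamsNonStreaming;

  const completion = await together.chat.completions.create(body);

  // logSingleRequest: record a completed, non-streaming call.
  // NOTE: the options shape below is an assumption for illustration.
  await helicone.logSingleRequest(body, JSON.stringify(completion), {
    additionalHeaders: { "Helicone-User-Id": "123" },
  });

  return completion;
}
```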
Example Usage with Together AI
```typescript
import Together from "together-ai";
import { HeliconeManualLogger } from "@helicone/helpers";

// Initialize the Together client
const together = new Together({ apiKey: process.env.TOGETHER_API_KEY });

// Initialize the Helicone logger with properties
const helicone = new HeliconeManualLogger({
  apiKey: process.env.HELICONE_API_KEY!,
  loggingEndpoint: "https://api.worker.helicone.ai/oai/v1/log",
  headers: {
    "Helicone-Property-Environment": "production",
  },
});

export async function POST(request: Request) {
  const { question } = await request.json();

  const body = {
    model: "meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo",
    messages: [{ role: "user", content: question }],
    stream: true,
  } as Together.Chat.CompletionCreateParamsStreaming & { stream: true };

  const response = await together.chat.completions.create(body);

  // Split the stream: one copy goes back to the client, the other is logged
  const [stream1, stream2] = response.tee();

  // Log the stream without blocking the response to the client
  helicone.logStream(
    body,
    async (resultRecorder) => {
      resultRecorder.attachStream(stream2.toReadableStream());
    },
    {
      "Helicone-User-Id": "123",
    }
  );

  return new Response(stream1.toReadableStream());
}
```
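Note the `response.tee()` call: it splits the provider stream into two copies so one can be returned to the client immediately while the other is consumed by the logger, keeping logging off the user-facing path.

If you don't need the `resultRecorder` callback, `logSingleStream` can stand in for the `logStream` call above. A minimal sketch, assuming it accepts the request body, a `ReadableStream`, and optional Helicone headers:

```typescript
// Drop-in replacement for the logStream call in the POST handler above.
// Assumption: logSingleStream(body, stream, headers) — this changelog
// does not show the exact signature.
helicone.logSingleStream(body, stream2.toReadableStream(), {
  "Helicone-User-Id": "123",
});
```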
These improvements make working with streaming LLMs more reliable and efficient, especially for applications that require real-time responses.