Changelog

June 2, 2025

May Product Update

Performance Tests

We ran performance tests on our staging environment across a battery of diverse ingestion scenarios. On a single-node deployment, we sustained an ingestion rate of 250k QPS with no effect on read performance: graphs continued to load and show data with sub-15s data currency, and neither the UI nor query performance showed any sluggishness. We believe we can double this on the same setup. And since our platform scales horizontally, we are confident that a multinode cluster can safely serve up to 1M ingestions per second.
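As a rough sizing aid, the figures above can be turned into a node count. This is a minimal sketch that assumes throughput scales linearly with the number of nodes, which the tests above have not yet confirmed; the function name `nodes_needed` is hypothetical and used only for illustration.

```python
import math


def nodes_needed(target_per_sec: int, per_node_per_sec: int) -> int:
    """Smallest node count whose combined capacity covers the target.

    Assumes perfectly linear horizontal scaling (an assumption, not a
    measured result).
    """
    if per_node_per_sec <= 0:
        raise ValueError("per-node capacity must be positive")
    return math.ceil(target_per_sec / per_node_per_sec)


MEASURED_PER_NODE = 250_000                 # single-node rate observed in the tests
PROJECTED_PER_NODE = 2 * MEASURED_PER_NODE  # the "double this" estimate
TARGET = 1_000_000                          # multinode goal

print(nodes_needed(TARGET, MEASURED_PER_NODE))   # 4
print(nodes_needed(TARGET, PROJECTED_PER_NODE))  # 2
```

Under that assumption, the 1M target needs four nodes at the measured rate, or two if the single-node rate does double.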

Onboarding Improvements

While most of the changes below are operational, the backing tech was interesting to build. We are seeing better engagement with our potential customers through the following improvements.

AWS Monitoring Coverage and Improvements

Our focus on helping potential customers on AWS get high value from Scout continues. Alongside our custom CloudWatch-based implementation, we onboarded a prospective customer through the ADOT (AWS Distro for OpenTelemetry) route, which we believe is the best way to instrument AWS components today.

Postgres Monitoring Improvements

Postgres is one of the best and most popular databases around, and we want to double down on advanced observability for Postgres clusters. In May, we started by building the ability to ingest over 200 important time series from the engine, which is already available to customers. We will continue this work in June with meaningful visualisations.

Operations Improvements

We’re building a control plane for customers to manage self-onboarding, starting with user management and access control. The beta version is available for testing. Most of our configs will soon be available here. So while GitOps remains our preferred and recommended approach to observability configuration, we see the need for a friendly UX.

Anomaly Detection and Knowledge Graph

We continued to iterate on our ML models for anomaly detection and on the summary algorithms for building the knowledge graph. As we ingest more production data, our models are improving and continue to bolster our original thesis: we will soon be able to build agents that help our customers drastically reduce downtime!


May 1, 2025

Accelerated Playground access

We want to ensure our customers have quick access to a playground to evaluate Scout. Scout now has -

Operations Improvements

Our customers consistently highlight how manual dashboard and alert setup increases their operational complexity. We've addressed this head-on with GitOps for Scout configuration: all dashboards, alerts and notification policies can now be managed through streamlined GitOps workflows. Scout also integrates with all leading incident management tools, and now sends emails on alerts as well.

AWS Monitoring

We've engineered custom workflows to capture and process metrics from AWS Fargate, VPC, and other components that traditionally resist direct OpenTelemetry metric publishing, expanding visibility across your entire cloud infrastructure.

Dashboard delivery

Dashboards are integral to observability. We have built a library of functional dashboards for a variety of components, ranging from database-specific and platform-specific dashboards to ones that monitor services and entire production environments. As we add new dashboards for the systems we encounter and enhance existing ones, updates are now delivered to our customers instantly, requiring zero effort on their end to apply.

Help & Guides

We have a docs site now! We have added guides for setting up OpenTelemetry internally and for setting up the Scout exporter. This will continue to be a work in progress as our implementations grow.

ScoutMaster

Eat your own dogfood! Who watches the watchmen? ScoutMaster. We now have a central Scout that observes all Scout instances in a region and helps us respond to situations faster. The integrations are automated, so as soon as we onboard a customer to a Scout playground or production instance, ScoutMaster starts watching over it.

Anomaly Detection (WIP)

As we continue to build Scout and Monk as tools that can help customers reduce downtime drastically, we spent significant time and focus on anomaly detection, a fundamental building block for autonomous downtime prevention. A beta version will be available soon for Scout customers, who will be able to set alerts that fire when anomalies relative to set patterns are detected.
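To make the idea of "anomalies relative to set patterns" concrete, here is a minimal sketch of one common textbook approach, a rolling z-score check. It is not Scout's model, whose details are not described here; the function name `find_anomalies`, the window size, and the threshold are all illustrative assumptions.

```python
from statistics import mean, stdev


def find_anomalies(values, window=5, threshold=3.0):
    """Return indices whose value lies more than `threshold` standard
    deviations from the mean of the preceding `window` points."""
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Flat history: any change at all is a deviation from the pattern.
            if values[i] != mu:
                anomalies.append(i)
            continue
        if abs(values[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical latency samples (ms) with one obvious spike at index 6.
latencies = [100, 102, 98, 101, 99, 100, 250, 101]
print(find_anomalies(latencies))  # [6]
```

An alert rule built on a check like this would fire on the spike itself and stay quiet once values return to the recent baseline.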

Knowledge Graph (WIP)

Another important part of autonomous downtime prevention is building a knowledge graph of a customer's components and infrastructure. A beta version will be available soon so our customers can visualise relationships between services, between services and components, and between services and infrastructure. We believe that this graph will help us automate root cause detection and help our customers minimize MTTR.


Mar 18, 2025 - Data Tiering

We're excited to announce enhancements to our data storage capabilities, designed to significantly reduce your storage costs while maintaining complete data access and integrity. These improvements allow you to retain your valuable metrics data for extended periods without the traditional cost barriers.

High-Performance Compression

Implementation of LZ4 Compression Algorithm

S3-Backed Tiered Storage

Intelligent Multi-Tier Data Management

Customer Benefits

Availability

These features are available to all customers today.