
Alternative to Elasticsearch

Elasticsearch and Apache Doris are both popular in observability, cybersecurity, and real-time analytics. However, Elasticsearch can be costly in storage and write resources. Apache Doris reduces these costs through efficient storage and high compression, and it offers broader analytical capabilities, such as multi-table JOINs, along with faster query performance.

“By replacing Elasticsearch with the Doris Commercial Distributed Version supported by VeloDB, GuanceDB showcases a big stride in improving data processing speed and reducing costs.”

Highlight:

  • 70% Cost Reduction
  • 2-3x Faster full-text search performance
  • Flexible VARIANT data type for handling semi-structured data in logs and traces

“Previously, we used multiple components for complex security analysis... Adopting Doris as a unified solution has significantly improved data writes, query performance and storage efficiency.”

Highlight:

  • 4x Faster write speeds
  • 3x Better query performance
  • 50% Storage space savings

“Compared to the original OLAP database, query performance has improved 5-10 times, concurrency has doubled, and analysis time has dropped from 10 minutes to under 1 minute for 90% of cases, all while using just one-third of the original resources.”

Highlight:

  • 2x Higher report analysis concurrency
  • 65% Storage space reduction
  • Simplified query with standard SQL

Apache Doris vs. Elasticsearch

Open Source License

    Apache Doris:

  • Licensed under Apache License 2.0
  • Stable license, since the project is governed by the Apache Software Foundation

    Elasticsearch:

  • License changed from Apache License 2.0 to Elastic License, then to AGPL License
  • License subject to change, since the project is governed by Elastic NV

Architecture

    Apache Doris: higher flexibility and elasticity

  • Strict workload isolation by workload group, powered by Linux CGroups, ideal for multi-tenancy
  • Compute-storage decoupled and coupled modes

    Elasticsearch: traditional deployment with limited elasticity

  • Soft workload isolation by thread group
  • Does not support decoupling compute and storage

Real-Time Data Writes

    Apache Doris:

  • High throughput: indexing on only one replica
  • Pull-based ingestion via Kafka CDC, which is simpler to set up
  • Supports Logstash and Beats output plugins

    Elasticsearch:

  • Low throughput: indexing on multiple data replicas
  • Requires additional tools like Logstash and Beats for pull-based ingestion, which is less convenient

Real-Time Data Storage

    Apache Doris:

  • Low storage consumption with compression rates up to 1:5 - 1:10
  • Unique model supports both write and read optimization (MoW & MoR), retaining 90% of write speed when data is duplicated by key
  • Aggregation model supports strong consistency, allows aggregated data updates, and coexists with original data
  • Flexible Schema Change to meet dynamic business needs

    Elasticsearch:

  • High storage consumption with a compression ratio of 1:1.5
  • Unique model only supports write optimization, with write performance loss of up to 3 times
  • The aggregation model does not allow aggregated data to be updated and does not coexist with the original data
  • Limited support for Schema Change

Real-Time Data Queries

    Apache Doris:

  • Lightning-fast across various query workloads
  • Supports multi-table JOINs and optimization for complex analysis
  • Easy to use with standard SQL
  • Open MySQL ecosystem

    Elasticsearch:

  • Good at point queries, but not suited for data analysis
  • No support for multi-table JOINs or complex analysis
  • Difficult for users due to custom DSL
  • Proprietary Elasticsearch ecosystem
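
To make the storage figures in the comparison concrete, the short Python sketch below converts the quoted compression ratios into on-disk size. It assumes that a ratio written as 1:N means the compressed data occupies 1/N of the raw size, and it uses a hypothetical 10 TB raw log volume; replicas, indexes, and other overhead are ignored, so the numbers illustrate the arithmetic only.

```python
def compressed_size_tb(raw_tb: float, ratio: float) -> float:
    """On-disk size in TB, assuming a 1:ratio compression (compressed : raw)."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return raw_tb / ratio


RAW_TB = 10.0  # hypothetical raw data volume, not taken from the comparison

# Ratios quoted in the comparison above.
doris_at_1_to_5 = compressed_size_tb(RAW_TB, 5.0)
doris_at_1_to_10 = compressed_size_tb(RAW_TB, 10.0)
elasticsearch_at_1_to_1_5 = compressed_size_tb(RAW_TB, 1.5)

print(f"Apache Doris at 1:5    -> {doris_at_1_to_5:.2f} TB")
print(f"Apache Doris at 1:10   -> {doris_at_1_to_10:.2f} TB")
print(f"Elasticsearch at 1:1.5 -> {elasticsearch_at_1_to_1_5:.2f} TB")
```

Under these assumptions, the same raw volume needs roughly 1 to 2 TB at the Doris ratios and about 6.67 TB at the Elasticsearch ratio.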

Performance Comparison

Observability & Cyber Security

The HTTP Logs benchmark is an official Elasticsearch performance test designed for log storage and analysis. It uses a real-world HTTP log dataset to evaluate indexing performance, storage efficiency, and query performance.

This benchmark comprises 11 queries commonly used in log analysis scenarios, including keyword search, time range queries, aggregations, and sorting. As a result, it is highly suitable for assessing performance in observability and network security analysis contexts.
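
The four query classes named above can be sketched in standard SQL. The example below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in engine rather than Apache Doris or Elasticsearch, the `http_logs` table, its columns, and the sample rows are hypothetical, and `LIKE` is a naive substitute for a real full-text index. It is not one of the benchmark's actual queries.

```python
import sqlite3

# In-memory database used purely as a stand-in SQL engine.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE http_logs ("
    "ts TEXT, client_ip TEXT, request TEXT, status INTEGER, size INTEGER)"
)

# Hypothetical sample rows, not taken from the benchmark dataset.
rows = [
    ("1998-06-01 10:00:00", "10.0.0.1", "GET /index.html HTTP/1.0", 200, 1024),
    ("1998-06-01 10:05:00", "10.0.0.2", "GET /images/logo.gif HTTP/1.0", 200, 2048),
    ("1998-06-01 11:00:00", "10.0.0.1", "GET /missing.html HTTP/1.0", 404, 0),
    ("1998-06-02 09:30:00", "10.0.0.3", "POST /login HTTP/1.0", 500, 512),
]
conn.executemany("INSERT INTO http_logs VALUES (?, ?, ?, ?, ?)", rows)

# 1. Keyword search (LIKE stands in for a full-text match).
keyword_hits = conn.execute(
    "SELECT COUNT(*) FROM http_logs WHERE request LIKE '%images%'"
).fetchone()[0]

# 2. Time range query over ISO-formatted timestamps.
in_range = conn.execute(
    "SELECT COUNT(*) FROM http_logs WHERE ts >= ? AND ts < ?",
    ("1998-06-01 00:00:00", "1998-06-02 00:00:00"),
).fetchone()[0]

# 3. Aggregation: request count per status code.
by_status = conn.execute(
    "SELECT status, COUNT(*) FROM http_logs GROUP BY status ORDER BY status"
).fetchall()

# 4. Sorting: the two largest responses.
largest = conn.execute(
    "SELECT request, size FROM http_logs ORDER BY size DESC LIMIT 2"
).fetchall()

print(keyword_hits, in_range, by_status, largest)
```

Each statement is plain SQL, so the same shapes carry over to any engine with a SQL interface once the table definition and the full-text predicate are adapted to that engine's syntax.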

Real-Time Analytics

ClickBench is a benchmarking tool to evaluate the performance of analytical databases. It focuses on testing the performance of large, flat tables rather than complex multi-table joins. It uses real-world data from a major web analytics platform, covering typical scenarios such as clickstream analysis and structured logs.

The benchmark consists of a set of queries that test aggregation operations and single-table performance, without involving complex joins. This makes it especially useful for evaluating databases optimized for real-time analytics and large-scale data processing.

Note: These test results are archived benchmarks captured in December 2024. Current real-time comparisons are maintained at ClickBench.

[Figure: ClickBench benchmark results]

More Migration Stories