Introduction
SelectDB Cloud is a new-generation, multi-cloud native real-time data warehouse based on Apache Doris. It focuses on the real-time analysis needs of enterprise-level big data and provides customers with cost-effective, easy-to-use data analysis services.
SelectDB Cloud is publicly available. To deploy a data warehouse on AWS (Amazon Web Services), visit and log in to SelectDB Cloud International.
Key Features
- Extreme Performance: For storage, SelectDB Cloud uses efficient columnar storage and data indexing. For computing, it relies on an MPP distributed computing architecture and a vectorized execution engine optimized for X64 and ARM64. SelectDB Cloud ranks among the global leaders in the public ClickBench performance benchmark.
- Cost-Effective: SelectDB Cloud adopts a cloud-native architecture that separates storage and computing, and is designed and built on cloud infrastructure. For storage, shared object storage keeps costs very low; for computing, SelectDB Cloud supports on-demand scaling and starting and stopping clusters to maximize resource utilization.
- Easy-to-Use: One-click deployment that works out of the box; supports the MySQL-compatible network connection protocol; provides integrated connectors for Kafka, Flink, Spark, and dbt; includes a powerful, easy-to-use visual console for operation and maintenance management, along with data development tools.
- Unified: Multiple analytical workloads can run on a single product. SelectDB Cloud supports real-time, interactive, and batch computing; structured and semi-structured data; and federated analysis of external data lakes (such as Hive, Iceberg, and Hudi) and databases (such as MySQL and Elasticsearch).
- Open Source: SelectDB Cloud is developed on top of open source Apache Doris and continues to contribute innovations back to the open source community. It is fully compatible with Doris syntax and protocols, and data can be migrated freely between SelectDB Cloud and Doris. It maintains compatibility and mutual certification with domestic and international ecosystem products and tools. Through open cooperation with domestic and international cloud providers, the product runs on multiple clouds and provides a consistent user experience.
- Safe and Stable: For data security, SelectDB Cloud provides comprehensive permission control, data encryption, and backup and recovery mechanisms. For operation and maintenance management, it provides comprehensive observability metrics collection and visual management of data warehouse services. For technical support, it has a complete ticketing system and a remote assistance platform, and offers multiple levels of expert support.
Key Concepts
- Organization: An organization represents an enterprise or a relatively independent group. After registering with SelectDB Cloud, users use the service as part of an organization. The organization is the unit of billing and settlement in SelectDB Cloud, and billing, resources, and data are isolated between organizations.
- Warehouse: A warehouse is a logical concept that includes computing and storage resources. Each organization can create multiple warehouses to meet the data analysis needs of different businesses, such as orders, advertising, and logistics. Resources and data are likewise isolated between warehouses, which can be used to meet security requirements within the organization.
- Cluster: A cluster is a computing resource in the warehouse, including one or more computing nodes, which can be elastically scaled. A warehouse can contain multiple clusters, which share the underlying data. Different clusters can meet different workloads, such as statistical reports, interactive analysis, etc., and the workloads between multiple clusters do not interfere with each other.
- Storage: SelectDB Cloud stores the full data set in a mature, stable object storage system that is shared by multiple computing clusters. This gives the data warehouse very low storage cost, high data reliability, and almost unlimited storage capacity, and greatly reduces the implementation complexity of the computing clusters built on top of it.
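The relationships among these concepts can be sketched as a small data model. This is a minimal illustration only, not the SelectDB Cloud API: the class names, methods, and the in-memory dictionary standing in for object storage are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Cluster:
    """Hypothetical compute resource inside a warehouse (one or more nodes)."""
    name: str
    nodes: int = 1

    def scale_to(self, nodes: int) -> None:
        # Elastic scaling: a cluster always keeps at least one computing node.
        if nodes < 1:
            raise ValueError("a cluster needs at least one computing node")
        self.nodes = nodes


@dataclass
class Warehouse:
    """Hypothetical warehouse: several clusters over one shared store."""
    name: str
    clusters: Dict[str, Cluster] = field(default_factory=dict)
    # In-memory stand-in for the shared object storage (an assumption).
    storage: Dict[str, List[dict]] = field(default_factory=dict)

    def add_cluster(self, name: str, nodes: int = 1) -> Cluster:
        cluster = Cluster(name, nodes)
        self.clusters[name] = cluster
        return cluster

    def write(self, table: str, rows: List[dict]) -> None:
        self.storage.setdefault(table, []).extend(rows)

    def read(self, cluster_name: str, table: str) -> List[dict]:
        # Every cluster of this warehouse reads the same underlying data.
        if cluster_name not in self.clusters:
            raise KeyError(f"unknown cluster: {cluster_name}")
        return list(self.storage.get(table, []))


@dataclass
class Organization:
    """Hypothetical organization: owns warehouses isolated from each other."""
    name: str
    warehouses: Dict[str, Warehouse] = field(default_factory=dict)

    def create_warehouse(self, name: str) -> Warehouse:
        warehouse = Warehouse(name)
        self.warehouses[name] = warehouse
        return warehouse


# Usage: two clusters in one warehouse see the same data.
org = Organization("example-org")
orders = org.create_warehouse("orders")
orders.add_cluster("reports", nodes=2)
orders.add_cluster("adhoc")
orders.write("sales", [{"id": 1}])
```

In this sketch, each warehouse owns its own storage dictionary, so a second warehouse in the same organization cannot see the `sales` table, while both clusters of `orders` can.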
Product Architecture
- Cloud Service Layer: The cloud service layer is the collection of supporting services provided by SelectDB Cloud, including authentication, access control, cloud infrastructure management, metadata management, and query parsing and optimization. These services are presented to users in the form of a "warehouse", and warehouses are isolated from each other.
- Computing Cluster Layer: The computing layer is decoupled from the storage layer, which allows flexible elastic scaling and smooth upgrades. It consists of several computing clusters that share storage while keeping their workloads isolated from one another. Each cluster contains one or more computing nodes. Computing nodes use high-speed disks to build a hot data cache, and the query optimizer and rich indexing techniques avoid unnecessary reads of cold data. Together, these significantly mitigate the high response latency of object storage and deliver strong data analysis performance.
- Shared Storage Layer: At the bottom, SelectDB Cloud uses inexpensive, highly available, and almost infinitely scalable object storage as the shared storage layer, with a design deeply optimized for object storage. This can cut customers' data analysis costs several-fold and easily supports PB-scale data analysis. Because object storage is standardized and mature across cloud environments, it also reinforces a consistent SelectDB Cloud experience on multiple clouds.
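The role of the hot data cache on computing nodes can be illustrated with a small read-through cache placed in front of a slow object store. This is a sketch under stated assumptions, not SelectDB Cloud's actual cache: the `HotDataCache` class, the LRU eviction policy, and the dictionary standing in for object storage are all hypothetical choices made for the example.

```python
from collections import OrderedDict
from typing import Callable


class HotDataCache:
    """Hypothetical read-through LRU cache in front of object storage."""

    def __init__(self, fetch_remote: Callable[[str], bytes], capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._fetch_remote = fetch_remote  # slow path: read from object storage
        self._capacity = capacity          # maximum number of cached blocks
        self._blocks: "OrderedDict[str, bytes]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, key: str) -> bytes:
        if key in self._blocks:
            # Hot data: serve locally and mark the block as most recently used.
            self._blocks.move_to_end(key)
            self.hits += 1
            return self._blocks[key]
        # Cold data: fetch from object storage, then keep a local copy.
        self.misses += 1
        data = self._fetch_remote(key)
        self._blocks[key] = data
        if len(self._blocks) > self._capacity:
            # Evict the least recently used block.
            self._blocks.popitem(last=False)
        return data


# Usage: a dictionary stands in for the shared object storage.
object_store = {"block-a": b"aaa", "block-b": b"bbb", "block-c": b"ccc"}
cache = HotDataCache(object_store.__getitem__, capacity=2)
cache.read("block-a")  # miss: fetched from object storage
cache.read("block-a")  # hit: served from the local cache
```

Repeated reads of the same block hit the local cache, so only the first access pays the object storage latency.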
Application Scenarios
- High-Concurrency Real-time Reporting and Analysis: Use SelectDB Cloud to serve online, high-concurrency reports with real-time, fast, stable, and highly available service. It supports real-time data writes, sub-second query response, and high-concurrency point queries, and meets the requirements of highly available cluster deployment.
- User Profile and Behavior Analysis: Build the layered data warehouse of a CDP (Customer Data Platform) on SelectDB Cloud. Millisecond-level column addition and dynamic tables respond flexibly to business changes, rich behavior analysis functions simplify development and improve efficiency, and advanced orthogonal bitmap computation enables audience selection within seconds in user profile scenarios.
- Log Storage and Analysis: Integrate the SelectDB Cloud data warehouse into the logging system to provide real-time log queries, low-cost storage, and efficient processing. This reduces the overall cost of the enterprise log system and improves its performance and reliability.
- Lakehouse Integration and Federated Analysis: Integrate data lakes, databases, and data warehouses into a single platform. Relying on the federated query acceleration of SelectDB Cloud, the platform provides high-performance business intelligence reports, ad hoc analysis, and incremental ETL/ELT data processing services.
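The bitmap-based audience selection mentioned in the user profile scenario above can be illustrated with plain bitmap intersection. This is a simplified sketch, not SelectDB Cloud's orthogonal bitmap implementation: it uses Python integers as bitmaps, and the function names and sample tags are hypothetical.

```python
from typing import Iterable, List, Sequence


def to_bitmap(user_ids: Iterable[int]) -> int:
    """Encode a set of non-negative user IDs as an integer bitmap."""
    bitmap = 0
    for uid in user_ids:
        bitmap |= 1 << uid  # bit `uid` is set when the user carries the tag
    return bitmap


def to_user_ids(bitmap: int) -> List[int]:
    """Decode a bitmap back into a sorted list of user IDs."""
    ids = []
    position = 0
    while bitmap:
        if bitmap & 1:
            ids.append(position)
        bitmap >>= 1
        position += 1
    return ids


def select_audience(include: Sequence[int], exclude: Sequence[int] = ()) -> int:
    """Users that carry every `include` tag and none of the `exclude` tags."""
    if not include:
        raise ValueError("at least one tag bitmap is required")
    result = include[0]
    for bitmap in include[1:]:
        result &= bitmap   # intersection: user must have this tag too
    for bitmap in exclude:
        result &= ~bitmap  # difference: drop users that have this tag
    return result


# Usage with hypothetical tags: VIP users active recently, excluding churned ones.
vip = to_bitmap([1, 2, 3, 5])
recent = to_bitmap([2, 3, 4, 5])
churned = to_bitmap([5])
audience = select_audience(include=[vip, recent], exclude=[churned])
audience_size = bin(audience).count("1")
```

Because each tag is a single bitmap, combining several tags is a handful of bitwise operations rather than a join over user rows, which is why bitmap representations suit this kind of selection.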
Relationship to Apache Doris
Flywheels Data Technology International HK Limited ("FLYWHEEL", "SelectDB") is the company commercializing Apache Doris. It was founded in January 2022 by the founding team of Apache Doris and the founding team of Baidu Smart Cloud. FLYWHEEL is an important driving force behind Apache Doris: it has 5 PMC members and 16 Committers, and has led the release of a series of core Apache Doris versions. FLYWHEEL follows a two-track strategy. It actively promotes open source Apache Doris so that the technology benefits open source users and developers, and it launches commercial products based on Apache Doris to serve commercial customers. Together, the two tracks drive healthy growth of both the open source project and the business.
SelectDB Cloud is a new generation of multi-cloud native real-time data warehouse built by FLYWHEEL based on Apache Doris. Compared with Apache Doris, SelectDB Cloud has the following main differences:
- The core version is more mature and stable, with more enterprise-level features and cloud-native features.
- Provides a built-in visual console for operation and maintenance management, along with data development tools. Users do not need to install or deploy anything: the tools work out of the box and keep operation, maintenance, and management simple.