Gazette
Gazette is a platform that integrates SQL, batch, and millisecond-latency stream processing. Its core abstraction, the 'journal', exposes the same data both as a real-time stream and as files in cloud storage.
Company Overview
Gazette specializes in enabling platforms that combine SQL, batch, and millisecond-latency stream processing. Its core technology is the 'journal': a streaming append log stored as regular files in BLOB stores like S3. A journal can be read as a real-time stream or as static files in cloud storage, so the same data serves both streaming and batch workloads.
Stream Processing and Journal Abstraction
Gazette's core abstraction, the 'journal', can be consumed in two ways: in real time as a data stream, or as files stored in cloud systems like Amazon S3. Users can pick whichever access path suits the workload. Because the data is held as files in cloud storage, journals can also be queried with SQL tools like Snowflake, BigQuery, or Presto.
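The dual view described above can be illustrated with a toy model. This is a minimal sketch, not Gazette's implementation or API: the `Journal` class, its methods, the in-memory `store` dictionary standing in for a BLOB store, and the offset-range file naming are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Journal:
    """Toy append-only byte log. All names here are hypothetical."""
    name: str
    buffer: bytearray = field(default_factory=bytearray)  # bytes not yet persisted
    persisted: int = 0                                     # offset where the buffer begins
    store: dict = field(default_factory=dict)              # stand-in for a BLOB store

    def append(self, record: bytes) -> int:
        """Append one newline-delimited record and return its starting offset."""
        offset = self.persisted + len(self.buffer)
        self.buffer += record + b"\n"
        return offset

    def flush(self) -> str:
        """Persist buffered bytes as one immutable file named by its offset range."""
        begin, end = self.persisted, self.persisted + len(self.buffer)
        key = f"{self.name}/{begin:016x}-{end:016x}"
        self.store[key] = bytes(self.buffer)
        self.buffer.clear()
        self.persisted = end
        return key

    def read(self, offset: int = 0) -> bytes:
        """Stream view: all bytes from `offset`, spanning files and the buffer."""
        # Fixed-width hex names sort in offset order.
        files = b"".join(self.store[k] for k in sorted(self.store))
        return (files + bytes(self.buffer))[offset:]
```

In this sketch a reader that calls `read(offset)` sees one continuous log, while a batch or SQL tool could instead list the keys in `store` and process each file independently. Both views cover the same bytes.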
Broker and Consumer Framework
Gazette offers a broker service that provides millisecond-latency, serializable publish/subscribe, with storage delegated to S3. It also provides a consumer framework for building streaming applications in Go, supporting stateful streaming with embedded stores such as RocksDB and SQLite. Brokers can stage recent writes on local SSDs, which helps both performance and cost efficiency.
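The idea of stateful streaming with an embedded store can be sketched as follows. This is an illustrative pattern only, written in Python rather than Go, and it does not use or reproduce Gazette's consumer framework: the function names, table layout, and the newline-delimited JSON record format with a `name` field are all assumptions. The sketch keeps per-name counts in SQLite and commits the read offset in the same transaction, so a restarted consumer resumes where it left off without double counting.

```python
import json
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open an embedded SQLite store holding state and a read checkpoint."""
    db = sqlite3.connect(path)
    db.execute("CREATE TABLE IF NOT EXISTS counts (name TEXT PRIMARY KEY, n INTEGER NOT NULL)")
    db.execute("CREATE TABLE IF NOT EXISTS checkpoint (id INTEGER PRIMARY KEY, next_offset INTEGER NOT NULL)")
    db.execute("INSERT OR IGNORE INTO checkpoint VALUES (0, 0)")
    db.commit()
    return db


def consume(db: sqlite3.Connection, log: bytes) -> int:
    """Apply complete records past the checkpoint; return the new offset."""
    (offset,) = db.execute("SELECT next_offset FROM checkpoint WHERE id = 0").fetchone()
    with db:  # one transaction: counts and checkpoint commit together
        while True:
            end = log.find(b"\n", offset)
            if end == -1:
                break  # no complete record left; a partial tail waits for more bytes
            name = json.loads(log[offset:end])["name"]
            db.execute("INSERT OR IGNORE INTO counts VALUES (?, 0)", (name,))
            db.execute("UPDATE counts SET n = n + 1 WHERE name = ?", (name,))
            offset = end + 1
        db.execute("UPDATE checkpoint SET next_offset = ? WHERE id = 0", (offset,))
    return offset
```

Calling `consume` repeatedly with a growing log processes only the new records each time. Storing the offset next to the state it produced is the design choice that makes reprocessing after a crash safe in this sketch.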
Multi-Cloud and Scalability
Gazette supports multi-cloud, worldwide deployments. Journal replicas span multiple availability zones for durability. Clusters can scale to millions of streamed records per second and recover from faults in seconds without requiring data migration, which suits globally distributed operations.
Tools and Management
Gazette provides tools to simplify the management of its services. The command-line tool, gazctl, manages brokers and consumer applications. Journals are created and configured using familiar Kubernetes primitives, which eases integration into existing workflows.