Skip to content

Conversation

@platinumhamburg
Copy link
Contributor

Purpose

Linked issue: close #2254

Brief change log

Tests

API and Format

Documentation

@platinumhamburg platinumhamburg changed the title basic aggregate merge engine support Implement basic Aggregate Merge Engine Dec 25, 2025
@platinumhamburg platinumhamburg force-pushed the base-agg branch 21 times, most recently from 907559e to e7d0af7 Compare December 26, 2025 06:45
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR implements the Aggregation Merge Engine for Fluss, which allows field-level aggregation of rows with the same primary key. The implementation includes 12 aggregate functions (sum, product, max, min, last_value, last_value_ignore_nulls, first_value, first_value_ignore_nulls, listagg, string_agg, bool_and, bool_or) with comprehensive schema evolution support.

Key changes:

  • Core aggregation engine with field-level aggregators and caching
  • Schema API enhancements to support aggregation function configuration
  • Comprehensive documentation with examples for all supported functions
  • Utility classes for string concatenation and value comparison

Reviewed changes

Copilot reviewed 56 out of 56 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
website/docs/table-design/table-types/pk-table/merge-engines/aggregation.md Comprehensive documentation for the Aggregation Merge Engine with examples
fluss-server/src/main/java/org/apache/fluss/server/kv/rowmerger/AggregateRowMerger.java Main row merger implementation with partial update support
fluss-server/src/main/java/org/apache/fluss/server/kv/rowmerger/aggregate/*.java Aggregation context, caching, and field processing logic
fluss-server/src/main/java/org/apache/fluss/server/kv/rowmerger/aggregate/functions/*.java Field aggregator implementations for all functions
fluss-server/src/main/java/org/apache/fluss/server/kv/rowmerger/aggregate/factory/*.java Factory classes for creating aggregators via SPI
fluss-common/src/main/java/org/apache/fluss/metadata/AggFunction.java Enum defining all supported aggregation functions
fluss-common/src/main/java/org/apache/fluss/metadata/Schema.java Schema enhancements to support aggregation functions
fluss-common/src/main/java/org/apache/fluss/utils/*.java Utility classes for string operations and comparisons
fluss-common/src/main/java/org/apache/fluss/config/ConfigOptions.java New configuration options for aggregation
Test files Comprehensive test coverage for all components

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

* <p>Efficiently encodes schema ID and target column indices for cache lookup. Uses array
* content-based equality and hashCode for correct cache behavior.
*/
private static class CacheKey {
Copy link

Copilot AI Dec 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Class CacheKey overrides hashCode but not equals.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[server] Implement basic Aggregate Merge Engine

2 participants