
Investigate Regression with Merges #297

@stanbrub

In a previous re-scale attempt, it was discovered that using merge on some operations to simulate large row counts appears to cause periodic large regressions. For example, running update().sum_by() without merges at high scale produces relatively low variability between test runs. Running the same operation with merged copies at the same scale (number of rows) produces consistent rates most of the time, but with occasional large drops in rate. (See chart below.)

  • A difference in rates between merged and not-merged runs is expected, while a large difference in variability is not
  • Investigate the drops with the perfmon tables and Java Flight Recorder
  • Understand differences in GC characteristics between normal and regression runs for the same bits and test
  • Look at possible divergences in compilation paths
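As a starting point for separating "normal" from "regression" runs, the per-iteration rates could be screened against the median. The following is a minimal sketch, not part of the benchmark tooling: the function names, the 20% cutoff, and the sample rates are all illustrative assumptions.

```python
from statistics import median


def flag_drops(rates, max_drop=0.2):
    """Return indices of iterations whose rate falls more than
    max_drop (as a fraction) below the median rate.

    The median is used as the baseline because a few large drops
    barely move it, unlike the mean.
    """
    baseline = median(rates)
    cutoff = baseline * (1.0 - max_drop)
    return [i for i, r in enumerate(rates) if r < cutoff]


def drop_gaps(drop_indices):
    """Return the number of iterations between consecutive drops.

    Roughly constant gaps would suggest a periodic cause.
    """
    return [b - a for a, b in zip(drop_indices, drop_indices[1:])]


# Hypothetical rates (rows/sec, arbitrary units), not measured data.
sample_rates = [100, 101, 99, 60, 100, 102, 58, 100]
drops = flag_drops(sample_rates)
gaps = drop_gaps(drops)
```

The flagged iteration indices can then be matched against GC logs or flight recordings for the same run to compare normal and regression iterations.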

Chart for "Update-Sum- 2 Calcs Using 2 Cols":

  • Running 100 iterations on my laptop (110G heap w/ 24 CPU threads)
  • Each benchmark runs 640M rows
  • Each series name says how many merges it took to get 640M rows
  • Even 1 merge (320M x 2) causes large periodic regressions
  • Merge does introduce an extra layer, so the slower rates are expected, but the wild periodic swings don't make sense to me
    [Chart: update-sum-2calcs-2-cols_merges_laptop-100]

Default Deephaven Properties

ENTRYPOINT ["java", "-server", "-XX:+UseG1GC", "-XX:MaxGCPauseMillis=100", "-XX:+UseStringDeduplication", "-XX:InitialRAMPercentage=25.0", "-XX:MinRAMPercentage=70.0", "-XX:MaxRAMPercentage=80.0", "--add-opens", "java.base/java.nio=ALL-UNNAMED", "-XshowSettings:vm", "-cp", "/app/resources:/app/classes:/app/libs/*", "io.deephaven.grpc_api.runner.Main"]
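To capture flight recordings and GC logs for comparison, diagnostic flags could be added to a launch command like the one above. This is a sketch under assumptions: the helper name and output paths are hypothetical, the base command is abbreviated, and the `-XX:StartFlightRecording` and `-Xlog:gc*` options assume a JDK 11 or later runtime.

```python
def with_diagnostics(java_cmd, jfr_path, gc_log_path):
    """Return a copy of java_cmd with JFR and GC-logging flags
    inserted right after the java executable, so they precede
    the classpath and main class."""
    extra = [
        # Start a flight recording at JVM startup, written to jfr_path.
        f"-XX:StartFlightRecording=filename={jfr_path},settings=profile",
        # Unified GC logging to a file, with timestamps and tags.
        f"-Xlog:gc*:file={gc_log_path}:time,uptime,level,tags",
    ]
    return [java_cmd[0], *extra, *java_cmd[1:]]


# Abbreviated form of the default command; the paths are placeholders.
base_cmd = ["java", "-server", "-XX:+UseG1GC", "-XX:MaxGCPauseMillis=100",
            "-cp", "/app/resources:/app/classes:/app/libs/*",
            "io.deephaven.grpc_api.runner.Main"]
cmd = with_diagnostics(base_cmd, "/tmp/merge-regression.jfr", "/tmp/gc.log")
```

Running the merged and not-merged benchmarks with the same flags should yield one recording and one GC log per run that can be compared side by side.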

Use the perfmon tables

from deephaven import perfmon as perf

qpl = perf.query_performance_log()
qpl_tree = perf.query_performance_tree_table()
qopl = perf.query_operation_performance_log()
qopl_tree = perf.query_operation_performance_tree_table()

One option is to introduce a named nugget, which would create a single QOPL line for all instrumented operations to roll up into.

io.deephaven.engine.table.impl.perf.QueryPerformanceRecorder#getInstance
io.deephaven.engine.table.impl.perf.QueryPerformanceRecorder#getNugget(java.lang.String)
(Nuggets are closeable)

Tips:
