Schema ORDER BY Rules
Introduction
The ORDER BY clause in ClickHouse schemas determines how data is physically sorted on disk. Understanding how ORDER BY is constructed from meta and derived schemas is crucial for optimal performance.
Common Header Fields
Common header fields, specifically timestamp_load, are always first in the ORDER BY clause. This is a fixed rule that cannot be overridden by meta or derived schemas.
# Common Headers (always first)
column,type,index_type,index_order,os_order,comment
timestamp_load,timestamp,,1,1,load timestampSchema Precedence Rules
Common Headers Priority: timestamp_load is always first in ORDER BY
Derived Schema Precedence: index_order values in derived schema override meta schema
Invalid Values: Invalid index_order values (null, non-numeric) are ignored, falling back to meta schema
Missing Values: Columns with missing index_order are excluded from ORDER BY
Duplicate Values: Duplicate index_order values are allowed and maintain their relative order
Examples
Basic Override Example
Invalid Index Order Example
Missing Index Order Example
Duplicate Index Order Example
Best Practices
Common Headers: Remember that timestamp_load will always be first
Explicit Ordering: Always provide explicit index_order values for columns that need to be in ORDER BY
Derived Schema: Use derived schema to override ordering when needed for specific use cases
Valid Values: Ensure index_order values are valid numbers to avoid fallback behavior
Documentation: Document the intended order in column comments for clarity
Last updated