XDE Build System Release Notes
v0.3.10 (December 23, 2024)
Schema Data Type Mismatch:
Issue: Mismatch detected during schema updates between Nullable(Bool) (ClickHouse) and Nullable(Boolean) (schema output) due to differing Boolean representations.
Fix: Updated schema comparison logic to treat Nullable(Bool) as equivalent to Nullable(Boolean).
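The comparison fix above can be illustrated with a minimal sketch. It is not the project's actual code: the alias table and the function names normalize_type and types_match are hypothetical, and the sketch only assumes that the two spellings should compare as equal.

```python
import re

# Hypothetical alias table: spellings that should compare as the same type.
TYPE_ALIASES = {"Boolean": "Bool"}


def normalize_type(type_name):
    """Return a canonical spelling of a ClickHouse-style type string."""
    def canonical(match):
        word = match.group(0)
        return TYPE_ALIASES.get(word, word)

    # Rewrite each bare identifier and leave wrappers such as Nullable(...) intact.
    return re.sub(r"[A-Za-z_][A-Za-z0-9_]*", canonical, type_name.replace(" ", ""))


def types_match(left, right):
    """True when both type strings are equal after alias normalization."""
    return normalize_type(left) == normalize_type(right)
```

With this approach, further aliases can be added to the table without touching the comparison itself.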
v0.3.9 (December 5, 2024)
Overview
hunts: Improved Whitespace Handling. Enhanced the customer filters handling in hunts.py to better manage whitespace variations. The system now handles both spaced and non-spaced versions of filter variables (e.g., both "{customer_filters}" and "{ customer_filters }"), ensuring consistent query generation regardless of template formatting.
schema_executor: Fixed Directory Structure Issue. Resolved a problem where the create_roles.sql file was being searched for in an outdated org_id directory structure. The system now correctly adheres to the updated directory structure.
schema_builder: Role Grants Update. The create_roles.sql file has been updated to include the missing INSERT grant for the detections_role on the logs_alerts table, ensuring proper population of alerts into the table.
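One way to get the whitespace tolerance described for hunts is a regular expression that allows optional spaces inside the braces. The sketch below is an assumption about the approach, not the code in hunts.py, and render_query is a hypothetical helper name.

```python
import re

# Matches "{customer_filters}" with optional whitespace inside the braces.
CUSTOMER_FILTERS = re.compile(r"\{\s*customer_filters\s*\}")


def render_query(template, customer_filters):
    """Substitute the filter text for every spelling of the placeholder."""
    # A callable replacement keeps any backslashes in the filter text literal.
    return CUSTOMER_FILTERS.sub(lambda _match: customer_filters, template)
```

Both "{customer_filters}" and "{ customer_filters }" then render to the same query text.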
v0.3.8 (November 19, 2024)
Overview
Fixed skip index logic in the schema-ch class, addressing issues with column names that contain a dot (".").
v0.3.7 (November 19, 2024)
Overview
Reintroduced skip index logic into the schema-ch class, addressing issues caused by major schema refactoring.
Key Changes
Index type application for ClickHouse schemas
Enabled a derived schema's index_type to override the meta schema's index_type.
schema-ch Updates
v0.3.6 (November 18, 2024)
Overview
Reintroduced skip index logic into the schema-ch class, addressing issues caused by major schema refactoring.
Key Changes
schema-ch Updates
Restored skip index logic in the schema-ch class to handle skipped indexes effectively.
Unit Tests
Enhanced unit tests to verify the creation and application of skip indexes.
GitLab CI Pipeline
Temporarily disabled documentation generation logic to focus on building the release and package.
Common Schema V001_001_005
Added a skip index for the timestamp column to improve query performance.
v0.3.5 (November 13, 2024)
Overview: This release improves asynchronous task management, error handling, and default settings for CLI commands, with a focus on CronJob and HuntScheduler functionalities. The checkpoint structure has also been updated for better JSON serialization.
Key Changes:
Asynchronous Task Management Fixes:
Addressed unawaited coroutine warnings by ensuring tasks are properly awaited.
Updated task scheduling to prevent blocking and ensure proper handling of async functions.
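As a generic illustration of the asynchronous fixes above (not the CronJob or HuntScheduler code itself), the sketch below schedules coroutines as tasks and awaits them all, which is the usual way to avoid "coroutine was never awaited" warnings. The names run_hunt and run_all are hypothetical.

```python
import asyncio


async def run_hunt(name):
    # Stand-in for real hunt work; the sleep yields control to the event loop.
    await asyncio.sleep(0)
    return f"{name}: done"


async def run_all(names):
    # create_task schedules each coroutine right away; gather then awaits
    # every task, so no coroutine object is left un-awaited.
    tasks = [asyncio.create_task(run_hunt(name)) for name in names]
    return await asyncio.gather(*tasks)


results = asyncio.run(run_all(["hunt_a", "hunt_b"]))
```

asyncio.gather returns results in the order the tasks were passed in, regardless of completion order.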
Checkpoint Serialization:
Removed non-serializable xdr_logger from successful_checkpoints to enhance JSON compatibility.
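The serialization problem can be sketched as follows. The shape of a checkpoint record here is an assumption (a flat dict); only the xdr_logger key name comes from the note above, and to_json is a hypothetical helper.

```python
import json
import logging

# Keys whose values cannot be JSON-encoded; xdr_logger is named in the note above.
NON_SERIALIZABLE_KEYS = {"xdr_logger"}


def to_json(checkpoint):
    """Serialize a checkpoint dict after dropping non-serializable entries."""
    clean = {k: v for k, v in checkpoint.items() if k not in NON_SERIALIZABLE_KEYS}
    return json.dumps(clean, sort_keys=True)


# Hypothetical checkpoint record that still carries a logger object.
checkpoint = {
    "hunt": "example_hunt",
    "status": "ok",
    "xdr_logger": logging.getLogger("xdr"),
}
```

Passing the unfiltered dict to json.dumps would raise a TypeError because a Logger object is not JSON serializable.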
Packaged pipeline_template.yaml:
Bundled the pipeline_template.yaml file for access via the package's resource files, making it available in the package distribution.
v0.3.4 (November 15, 2024)
Changes:
Pipeline Fixes: Fixed the failing GitLab pipeline that uploads packages.
Ingestion Pipeline: Fixed the input parameter from s3 to S3 for uploading templates to the S3 bucket.
v0.3.3 (November 5, 2024)
Changes:
OpenSearch Flag: Fixed the opensearch_flag option to control the creation of OpenSearch templates. If set to False, templates will not be created.
Schema Application Logic: Merged the apply-schema and update-schema commands into a single apply-schema command. Schema updates will only run if --schema_update_flag is set to YES.
Bug Fix - ClickHouse Schema: Resolved an issue in ClickHouse schema creation for the WinlogBeat dataset where event_hash appeared twice due to overlap between the common and core schemas.
CLI Command - Generate Docs: Added a new Python script to generate documents automatically. The script is kept at src/scripts/document_generator.py.
Dependency Upgrade: Upgraded poetry for enhanced dependency management and compatibility.
Test Updates: Updated all tests to align with the new schema application logic and opensearch_flag changes.
Upgrade Common Schema: Upgraded the XDR OpenSearch framework to v3.2.6, including dev-test deployment and support for non-XDR sources. Upgraded the common schema to version common/v001.001.005.
Sigma Rules: Built a Python script that walks the Sigma open-source Git master branch and builds the rules in the post_build_artefacts/sigma_rules folder.
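The event_hash duplication above arises when two column lists are concatenated without checking for overlap. A minimal sketch of an overlap-safe merge follows. It assumes columns are (name, type) pairs and that the first definition wins; neither is confirmed by the source. Apart from event_hash, the column names are made up.

```python
def merge_columns(common, core):
    """Merge two ordered column lists, keeping the first definition of each name."""
    merged = {}
    for name, column_type in list(common) + list(core):
        # setdefault ignores a later column that reuses an existing name.
        merged.setdefault(name, column_type)
    return list(merged.items())


# Illustrative columns; only event_hash is taken from the release note.
common = [("timestamp", "DateTime64(3)"), ("event_hash", "String")]
core = [("event_hash", "String"), ("winlog_channel", "String")]
```

The merged list keeps the original column order and contains event_hash once.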
Deployment Steps
Prepare and Test Pipeline Updates:
Replace update-schema with apply-schema and use --schema_update_flag=YES in all relevant pipelines.
Validate these changes in integration and ghostburner environments to ensure they trigger schema updates correctly.
Verify OpenSearch Flag Functionality:
Confirm that setting opensearch_flag=False prevents the creation of OpenSearch templates as expected.
Run Schema and Compatibility Tests:
Confirm that schema duplication issues (e.g., event_hash) are resolved.
Run all tests to verify alignment with the updated schema logic and OpenSearch settings.
Upgrade Common Schema:
Update the xdr_package.yaml configuration files to use the new common schema version common/v001.001.005.
Deploy the upgraded OpenSearch framework (v3.2.6) to dev-test for initial validation.
Documentation Generator and Fetching Sigma Rules:
Run the document_generator.py script to automatically generate CLI documentation.
Build Sigma rules from the Sigma open-source repository, placing them in the post_build_artefacts/sigma_rules folder for reference.
Dependency and Environment Preparation:
Upgrade poetry to the latest version for improved dependency management.
v0.3.2 (October 31, 2024)
Changes
Version Bump: This release includes a version bump to v0.3.2.
v0.3.1 (October 31, 2024)
Changes:
Build Schemas: Fixed logic for handling nullable data types. Previously, columns were printed in the DDL without the Nullable attribute.
Apply Schemas: Removed the logic that updates data types during schema execution.
Update Schemas: Moved the logic for detecting data type changes to update-schemas. This will no longer apply the changes automatically but will detect any type discrepancies, raise an error, and continue execution.
Merged main into this branch to capture all recent changes to the schema and ingestion pipeline.
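The "detect, report, and continue" behaviour described for update-schemas could look roughly like the sketch below. It assumes schemas are available as column-to-type dicts, and find_type_discrepancies is a hypothetical name, not the tool's API.

```python
def find_type_discrepancies(expected, actual):
    """Compare column types and return one message per mismatching column."""
    errors = []
    for column, expected_type in expected.items():
        actual_type = actual.get(column)
        # Columns missing from the live table are ignored in this sketch.
        if actual_type is not None and actual_type != expected_type:
            errors.append(f"{column}: expected {expected_type}, found {actual_type}")
    return errors


expected = {"user_name": "Nullable(String)", "org_id": "String"}
actual = {"user_name": "String", "org_id": "String"}
discrepancies = find_type_discrepancies(expected, actual)
```

Collecting messages instead of raising on the first mismatch lets the caller report every discrepancy and still carry on with the remaining schemas.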
v0.3.0 (October 30, 2024)
Changes:
🚀 Beats Optimisations:
Enhancement for Flexible Meta Schema and Pipeline Config external:
Exposing all of the meta schemas, filebeat derived schemas, ingestion pipeline mappings and configuration.
Ability for customers to add their own meta schemas.
Fix for FutureWarning on DataFrame Concatenation:
Resolved a future deprecation warning regarding the behavior of DataFrame concatenation with empty or all-NA columns. Adjustments have been made to ensure the correct dtype handling without dropping columns unnecessarily.
Enhanced Schema Validation for ClickHouse:
Added validation to remove nullable columns from PRIMARY KEY and ORDER BY statements, with a warning log message. When a nullable ClickHouse data type is detected, it is excluded from the final schema, and a warning is issued.
Enhanced and added new CLI Commands.
Bug Fix for update-schemas Class:
Resolved an issue relating to how the system.query_log is queried after an insertion is completed.
Enhance Ingestion Pipelines: Refactored the ingestion pipelines CLI commands to dump the files to local storage or S3 with versions.
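The nullable-key validation described above can be sketched as a filter over the key columns. This is an illustration under assumptions: types are plain strings, only a top-level Nullable(...) wrapper is checked, and drop_nullable_keys and the sample columns are hypothetical.

```python
import logging

logger = logging.getLogger("schema_validation")


def drop_nullable_keys(key_columns, column_types):
    """Return key_columns without any column whose type is Nullable(...)."""
    kept = []
    for name in key_columns:
        if column_types[name].startswith("Nullable("):
            # Warn instead of failing, mirroring the behaviour in the note above.
            logger.warning("Removing nullable column %r from key", name)
            continue
        kept.append(name)
    return kept


column_types = {
    "timestamp": "DateTime64(3)",
    "org_id": "String",
    "user_name": "Nullable(String)",
}
order_by = drop_nullable_keys(["org_id", "user_name", "timestamp"], column_types)
```

The same helper can be applied to both the PRIMARY KEY and the ORDER BY column lists before the DDL is rendered.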
Deployment Steps:
Update all xdr_package.yaml global settings:
In xdr_package.yaml, rename xdr_schema_template to meta_schema_paths.
In xdr_package.yaml, rename xdr_schema_template_version to meta_schema_version.
In xdr_package.yaml, rename sub_schema_file_path to derived_schema_paths.
Update xdr_package.yaml to reference the ingestion pipeline transform mappings and templates in the format below.
Update the xdr_package.yaml schemas to use the terms meta and derived schemas.
global_settings:
  ingestion_template_s3_bucket: development-config-bucket-afterburner
  schema_common_version: v001.001.004
  schema_output_path: ../.xdr_schema_output/
  ingestion_pipeline_output_path: ../.xdr_ingestion_pipelines_output
  upload_ingestion_template_output_path: ../.xdr_ingestion_template_output
  schema_versioning_output_path: .migrations/xdr_schema_versioning
  default_profile: ghostburner
  target_path: ~/.xdr/xdr_targets.yaml
  ingestion_pipeline_transform_mappings_paths:
    standard:
      - ../post_build_artefacts/ingestion_pipeline_enrichment_standard
      - ../post_build_artefacts/ingestion_pipeline_enrichment_standard_custom
    geoip:
      - ../post_build_artefacts/ingestion_pipeline_enrichment_geoip
  ingestion_pipeline_templates_paths:
    vector:
      - ../post_build_artefacts/ingestion_pipeline_templates
  meta_schema_paths: ../post_build_artefacts/xdr_meta_schemas_package
  derived_schema_paths: ../_sample_xdr_config/sample_schemas/openmss_schemas/stable_schemas
schemas:
  logs_beats_filebeat_activemq:
    name: logs_beats_filebeat_activemq
    meta_schema: logs_beats_filebeat.csv
    meta_schema_version: v001.000.003
    derived_schema_file_path: logs_beats_filebeat/logs_beats_filebeat_activemq.csv
    derived_schema_ttl: 90
All meta schemas, pipeline config files and templates are delivered as external artefacts:
Download the artefacts from the HyperSec GitLab account and place them in a single directory with the correct subdirectory structure.
Update all xdr_package.yaml files to reference .csv files as their core templates:
Update the xdr_package.yaml files from logs_beats_filebeat.json to logs_beats_filebeat.csv
logs_beats_filebeat:
  name: logs_beats_filebeat
  meta_schema: logs_beats_filebeat.csv # Change here, swapped from logs_beats_filebeat.json
  meta_schema_version: v001.000.003
  derived_schema_ttl: 90
Download Artefacts
Here are the downloadable links for the package assets:
v0.2.59 (October 17, 2024)
Changes:
🚀 Build Schemas:
Fix for FutureWarning on DataFrame Concatenation:
Resolved a future deprecation warning regarding the behavior of DataFrame concatenation with empty or all-NA columns. Adjustments have been made to ensure the correct dtype handling without dropping columns unnecessarily.
Enhanced Schema Validation for ClickHouse:
Added validation to remove nullable columns from PRIMARY KEY and ORDER BY statements, with a warning log message. When a nullable ClickHouse data type is detected, it is excluded from the final schema, and a warning is issued.
Enhanced and added new CLI Commands.
Bug Fix for update-schemas Class:
Resolved an issue relating to how the system.query_log is queried after an insertion is completed.
v0.2.58 (October 16, 2024)
Changes:
🚀 Build Schemas:
Bug Fix in build-schemas and update-schemas Class:
Resolved an issue where a missing newline in the schema build output caused an upstream extraction issue in the schema update.
v0.2.57 (October 16, 2024)
Changes:
🚀 Build Schemas:
Bug Fix in build-schemas Class:
Resolved an issue where the index_type was specified in the schema heading but no values were assigned to it.
v0.2.56 (October 15, 2024)
Changes:
🚀 Fixed Failing Schema Unit tests:
Resolved the failing Schema unit tests.
v0.2.55 (October 14, 2024)
Changes:
🚀 Enhancements to Common Schema and CLI Commands:
Fixed Issues in CLI Commands:
Resolved a bug in the download-core-schemas command.
Addressed a bug in the build-schema command to prevent failures when a schema column is defined multiple times.
Fixed issues in the xdrcli package caused by a bug in the Pyparsing package.
Schema Upgrade:
Upgraded the common schema version to v001.001.004 across all components.
🚀 New CLI Commands Introduced:
download-vector-templates: Added a new CLI command to download vector and enrichment templates.
v0.2.55 (14/10/2024)
Changes:
🚀 Update Schemas:
Bug Fix in common-header hc skip index applied Class:
Removed the high-cardinality skip index from the main timestamp field in the common header. A timestamp-friendly skip index such as minmax will be reviewed for a later version.
v0.2.54 (12/10/2024)
Changes:
🚀 Update Schemas:
Bug Fix in update-schemas Class:
Resolved an issue where the replacement and base tables were incorrectly referenced in the insert_records_into_new_table method. The process now functions as intended.
Added a new flag, drop_replacement_table_flag, to the update-schemas CLI.
Updated documentation.
v0.2.53 (30/09/2024)
Changes:
🚀 New Features:
Helm Charts for Ingestion Pipelines:
Introduced working Helm charts that describe the ingestion pipelines, allowing for simplified deployment and management of various components in Kubernetes.
Stringing Together Vector Components:
Developed a mechanism to string together different vector components, enabling users to create more complex and efficient data ingestion workflows.
Variable Abstraction:
Abstracted variables across the pipeline configurations and charts, enabling the use of easier variable names.
v0.2.52 (20/09/2024)
Changes:
🚀 Vectors:
Synced Vector templates from XDRConfig to xdr-data-engine.
Ran package_vector to sync the vector templates to src/resources.
🛠 Enhancements:
Unit Test Fixes:
Resolved issues in unit tests to ensure more reliable testing of the ingestion pipeline components.
These fixes improve test accuracy and help in catching errors earlier in the development cycle.
Ingestion Pipeline Uploads:
Fixed issues related to ingestion pipeline uploads to use xdr_package config, ensuring smoother and error-free uploads of pipeline configurations.
v0.2.51 (18/09/2024)
Changes:
API Development:
Introduced a new FastAPI application to manage schema and hunt-related operations. This includes a unified API interface for schema creation and hunt execution, previously handled via CLI commands.
New Features:
Schema Management using FastAPI:
Added all schema CLI commands into the FastAPI application with dedicated endpoints.
Hunt Framework:
Integrated hunt management into the FastAPI application with dedicated endpoints for scheduling and executing hunts.
UI Support:
Built an API class to support a user interface, enabling interaction with schema and hunt operations through RESTful endpoints.
Prometheus Metrics and Logging:
Exposed metrics and logs for Prometheus, allowing for better monitoring and performance tracking of the FastAPI applications.
Health Checks:
Implemented health checks to ensure the applications can access essential resources like ClickHouse and required configuration files.
Containerization:
Containerized both the schema management and hunt management FastAPI applications, facilitating local testing and deployment.
v0.2.50 (18/09/2024)
Changes:
🚀 New Features:
Added Key Specifications to build-schemas, apply-schemas, plan-schemas, and update-schemas Commands:
Users can now specify a particular organization and schema for updating. This provides more granular control over schema updates, allowing for selective operations instead of updating all customers and schemas by default.
If no specific organization or schema is provided, the commands will process updates for all customers and schemas.
New Command-Line Arguments:
--schema_filter_list: Accepts a string of comma-separated schema names to selectively target specific schemas for updating.
--derived_schema_filter_list: Similar to schema_filter_list, this allows specifying sub-schemas for more focused operations.
--schema_filter_wildchar: Introduces wildcard matching to schemas, enabling flexible lookup based on schema patterns.
--derived_schema_filter_wildchar: Like schema_filter_wildchar, this option applies to sub-schemas, allowing users to apply wildcards to sub-schema names for matching multiple entities.
--verbose: Enables verbose logging, allowing users to see detailed debug information during command execution.
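The filter arguments above combine a comma-separated list with a wildcard pattern. The sketch below shows one plausible selection rule. It assumes shell-style wildcards, which the notes do not specify, and select_schemas is a hypothetical helper rather than the CLI's implementation.

```python
from fnmatch import fnmatchcase


def select_schemas(available, filter_list=None, wildcard=None):
    """Return schemas named in filter_list or matching wildcard; all if neither is set."""
    if not filter_list and not wildcard:
        return list(available)
    # Split the comma-separated string and tolerate spaces around names.
    wanted = {name.strip() for name in filter_list.split(",")} if filter_list else set()
    return [
        name
        for name in available
        if name in wanted or (wildcard is not None and fnmatchcase(name, wildcard))
    ]


schemas = ["logs_alerts", "logs_syslog", "logs_beats_filebeat"]
```

For example, select_schemas(schemas, wildcard="logs_beats_*") keeps only the beats schema, while calling it with no filters returns every schema, matching the default described above.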
🛠 Enhancements:
Improved Error Management:
Enhanced mechanisms for managing and reporting errors, ensuring that failures are properly flagged. This addresses an issue where certain failures were being overlooked and tests appeared to pass even when they should have failed.
Consistent management of files to exclude from update-schemas.
v0.2.49 (09/09/2024)
Changes:
CLI Commands: Changed the names of CLI commands from build-schemas to plan-schemas and update-schemas for improved clarity.
New Features:
- Incorporated --all and --only-beats flags in the build-schemas command:
  - --all: Processes all schemas without filtering.
  - --only-beats: Scans and processes only beats schemas.
Enhancements:
- Updated plan-schemas to handle logging more effectively when databases or schemas are missing.
- Added new unit tests and updated existing ones to cover the latest functionality and improvements.
- Incorporated multi-threading using ThreadPoolExecutor from concurrent.futures in build and apply schemas to improve performance and scalability.
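The multi-threading enhancement above names ThreadPoolExecutor from concurrent.futures. A minimal sketch of that pattern follows. build_schema is a hypothetical placeholder for the real per-schema work, and the worker count is arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor


def build_schema(name):
    # Placeholder for real build work (hypothetical); returns a status string.
    return f"built {name}"


def build_all(names, max_workers=4):
    """Build every schema on a thread pool and return results in input order."""
    # executor.map runs the calls concurrently but yields results in input order.
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(build_schema, names))


results = build_all(["logs_alerts", "logs_syslog"])
```

Threads suit this kind of work when each build spends most of its time waiting on I/O such as database calls.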
v0.2.48 (09/09/2024)
Enhancements:
Sigma Rules Processor:
Sigma-to-Schema Mapping: Integrated Sigma rules with schema mappings in the xdr_package.yaml configuration, enabling more granular control and flexible schema adaptation across organizations.
Advanced Sigma Refactoring: Comprehensive refactoring of the Sigma Backend and Converter to improve modularity, maintainability, and performance.
Extended Unit Test Coverage: Enhanced unit test suite, covering edge cases and complex processing scenarios, ensuring the integrity of Sigma rule ingestion and execution.
Logging & Observability:
XDRLog Integration: Implemented structured logging via XDRLog across all Sigma components, providing improved traceability and insight into rule processing, error handling, and performance metrics.
Additional Improvements:
Code Quality & Reliability:
Robust Error Handling: Strengthened error handling mechanisms, reducing processing downtime and making troubleshooting easier through detailed logging.
Performance Optimization: Fine-tuned performance optimizations in the Sigma rule conversion pipeline to improve processing speed and resource utilization.
v0.2.47 (08/09/2024)
Changes:
GitLab pipeline: Container registry version fix. The pipeline was taking the poetry version from the feature branch; the latest version should be pulled from the main branch when pushing the Docker image to the container registry.
v0.2.46 (08/09/2024)
Changes:
GitLab pipeline: Release version fix. The pipeline was taking the poetry version from the feature branch; the latest version should be pulled from the main branch.
v0.2.45 (07/09/2024)
Changes:
GitLab pipeline: Package distribution fix.
v0.2.44 (31/08/2024)
Changes:
CLI Commands:
plan-schemas: Introduced a new command to generate a detailed plan for schema changes before applying them. This allows users to review and verify schema modifications in advance.
update-schemas: Added a command to apply all schema updates across the ClickHouse environment, streamlining the rollout of schema changes.
v0.2.43 (25/08/2024)
Changes:
Vector Templates: Refreshed the Vector templates from XDR config to xdr-data-engine in src/resources/vector_templates/** using the package_vector.py Python job.
v0.2.42 (14/08/2024)
Changes:
schema builder: Updated the schema builder logic to create all schemas for all customers, i.e. removed the logic of customers subscribing to schemas.
v0.2.41 (14/08/2024)
Changes:
cronjob: Added logic in the hunt config to handle the orchestration of multiple cron jobs for multiple customers. Updated unit tests to support this change.
v0.2.37 (13/08/2024)
Changes:
modify-orderby-table and modify-index-table: Added two new CLI commands. modify-orderby-table adds or removes columns from the ORDER BY and PRIMARY KEY statements of a ClickHouse table. modify-index-table adds or removes indexes from a ClickHouse table.
v0.2.36 (11/08/2024)
Changes:
Schema Upgrade-Skip Indexes: Updated the logic so that the primary key columns are the same as the ORDER BY columns.
Common Header v1.1.3: Removed indexing from the event_hash column.
v0.2.35 (10/08/2024)
Changes:
Schema Upgrade-Skip Indexes: Added a new index_type to the schema definition language that allows users to define skip indexes.
Common Header v1.1.3: A series of new fields added to the common header.
v0.2.34 (04/08/2024)
v0.2.33 (04/08/2024)
v0.2.32 (04/08/2024)
Changes:
Hunt Checkpointing: Upgraded the HuntCheckpointManager class to create unique xdr_audit databases and detection_checkpoint tables for each checkpoint test run.
Hunt Checkpointing: Upgraded the HuntCheckpointManager class to write the previous_successful_checkpoint to make it easy to validate checkpoint windows.
Cron job: Simplified and reduced the function scoping for executing cron jobs on threads and within asyncio loops. Fixed hunt threading to ensure it displays correctly for each hunt per customer.
diagrams: Added Hunt Framework Workflow diagrams, including both drawio and PNG files.
Deployment Steps:
Database Schema Update:
Execute the following SQL command to add the new column to the detection_checkpoint table:
ALTER TABLE xdr_audit.detection_checkpoint ADD COLUMN previous_successful_checkpoint DateTime CODEC(DoubleDelta, LZ4);
v0.2.31 (02/08/2024)
Changes:
gitlab-ci: Decouple the CI and CD. The unit tests and stress tests will always run but the package, docker image build and tags/release will only run on manual execution of the pipeline on main branch.
v0.2.27 (31/07/2024)
Changes:
Hunts: Added hunt framework test to validate that the hunt framework is spinning up the right number of threads and effectively isolating the hunts to run in parallel.
Example Output:
- 2024-07-17 01:07:46,691 INFO | Summary:
- 2024-07-17 01:07:46,692 INFO | Total threads used: 8
- 2024-07-17 01:07:46,692 INFO | Thread 0: Execution Time: 74.97776647203136 seconds
- 2024-07-17 01:07:46,693 INFO | Thread 1: Execution Time: 74.7663262829883 seconds
- 2024-07-17 01:07:46,693 INFO | Thread 2: Execution Time: 74.69299309497 seconds
- 2024-07-17 01:07:46,694 INFO | Thread 3: Execution Time: 74.62824630603427 seconds
- 2024-07-17 01:07:46,695 INFO | Thread 4: Execution Time: 74.55346441699658 seconds
- 2024-07-17 01:07:46,696 INFO | Thread 5: Execution Time: 74.42174733499996 seconds
- 2024-07-17 01:07:46,696 INFO | Thread 6: Execution Time: 74.31992934900336 seconds
- 2024-07-17 01:07:46,697 INFO | Thread 7: Execution Time: 74.24981495796237 seconds
- 2024-07-17 01:08:57,786 INFO | Benchmark analysis completed.
- 2024-07-17 01:08:57,788 INFO | Total threads used: 8
- 2024-07-17 01:08:57,789 INFO | Total execution time: 565.3543259570142 seconds
- 2024-07-17 01:08:57,790 INFO | Average execution time per thread: 70.66929074462678 seconds
Detection Checkpoint: Added a new column query_checkpoint_time to the detection_checkpoint table.
Deployment Steps:
Database Schema Update:
Execute the following SQL command to add the new column to the detection_checkpoint table:
ALTER TABLE xdr_audit.detection_checkpoint ADD COLUMN query_checkpoint_time DateTime CODEC(DoubleDelta, LZ4);
Initial Data Scan:
The first hunt run (i.e. with no existing checkpoint) will scan all existing data and populate the new query_checkpoint_time column. This process may take time depending on the volume of data.
Watermark Configuration:
To optimize the checkpointing process and avoid a full data scan every time, we have configured a watermark in the query_checkpoint_time column.
Testing:
Ensure that the new checkpointing functionality is correctly implemented and that the watermark setting is functioning as intended.
Monitor:
After deployment, monitor system performance and logs to ensure that the update does not negatively impact operations.
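The watermark behaviour in the deployment steps above can be pictured as choosing the start of each scan window. The sketch below is an assumption about the idea, not the hunt framework's code: scan_window is a hypothetical helper, and the epoch fallback simply stands for "scan everything" on the first run.

```python
from datetime import datetime, timezone

# Fallback start used when no checkpoint exists yet (first run, full scan).
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def scan_window(last_query_checkpoint_time, scheduled_start):
    """Return the (start, end) time window a hunt run should scan."""
    # With a stored watermark, only data after it needs to be scanned.
    start = last_query_checkpoint_time or EPOCH
    return start, scheduled_start


now = datetime(2024, 7, 31, 12, 0, tzinfo=timezone.utc)
previous = datetime(2024, 7, 31, 11, 0, tzinfo=timezone.utc)
first_run = scan_window(None, now)
later_run = scan_window(previous, now)
```

After a successful run, the end of the window would be stored as the new query_checkpoint_time so the next run starts from there.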
v0.2.22 (16/07/2024)
[hunts]
Handled merge conflicts from v0.2.18
Pushed common Header
Fixed the GitLab pipeline to run build, package and tags only when merged to main, while still running tests on merge requests.
v0.2.21 (14/07/2024)
Framework Enhancements:
Enhanced [vector-ingestion-pipelines] by integrating new templates and abstracting additional variables into the xdr_package.yaml configuration file.
v0.2.20 (14/07/2024)
[schema-executor] Regression tested Schema DataType Changes
v0.2.18.3 (14/07/2024)
[cron_job] Passed the scheduled start time of the cron task from JobScheduler to Hunts Checkpoint.
v0.2.18.2 (14/07/2024)
[stress-testing] Expanded Stress tests to run on 20 databases integration_test1-integration_test20.
v0.2.18.1 (14/07/2024)
[cronjob] Moved multi-threading down from the hunt level to the individual customers on each hunt.
[hunts] The scheduled start of the cron task is now passed in and used as the checkpoint date-time.
[tests] Updated unit and stress tests.
v0.2.15 (10/07/2024)
Framework Enhancements:
[schema-builder] Organisation List moved to the xdr_package_build.yaml.
[schema-builder] Added Timestamp_receiver and timestamp_fianlise to the common headers. Upgraded to v001.001.001 to build the two fields.
[hunt-runner] Checkpointing bug - not pulling
Bug Fixes:
[hunt-framework] Fixed a bug where the same checkpoint was used across all customers; added the customer name to the checkpointing solution.
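The per-customer checkpoint fix above amounts to including the customer in the checkpoint key. A minimal in-memory sketch, with a hypothetical CheckpointStore class standing in for the real checkpoint table:

```python
class CheckpointStore:
    """In-memory stand-in for a checkpoint table, keyed by hunt and customer."""

    def __init__(self):
        self._checkpoints = {}

    def save(self, hunt, customer, checkpoint_time):
        # Keying on (hunt, customer) keeps one customer's progress
        # from overwriting another's.
        self._checkpoints[(hunt, customer)] = checkpoint_time

    def load(self, hunt, customer):
        # Returns None when this customer has no checkpoint for the hunt yet.
        return self._checkpoints.get((hunt, customer))


store = CheckpointStore()
store.save("example_hunt", "detectionlab", "2024-07-10T00:00:00")
store.save("example_hunt", "org321", "2024-07-10T01:00:00")
```

Keyed on the hunt alone, the second save would have replaced the first customer's checkpoint.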
Deployment Steps:
[schema-builder] Migrate your list of organisations into the xdr-package.yaml and specify the schemas that you want to build.
organisations:
  - org_id: detectionlab
    cluster_name: ''
    schemas:
      - logs_alerts
      - nx_log_windows_sub_ee
      - logs_nxlog_windows
      - logs_syslog
  - org_id: org321
    cluster_name: ''
    schemas:
      - logs_alerts
      - logs_syslog
Ensure that the org_id and schemas fields are updated to reflect your organization's specific requirements. The cluster_name field can be ignored for SaaS ClickHouse deployments.