Package Configuration
The xdr_package.yaml file is the central configuration file for the XDR Data Engine. It defines the system's behavior, data processing pipelines, and schema configurations.
Global Settings
global_settings:
  ingestion_template_s3_bucket: "development-config-bucket-afterburner"
  schema_common_version: "v001.001.005"
  schema_output_path: "../.xdr_schema_output/"
  ingestion_pipeline_output_path: "../.xdr_ingestion_pipelines_output"
  upload_ingestion_template_output_path: "../.xdr_ingestion_template_output"
  default_profile: "ghostburner"
  target_path: "~/.xdr/xdr_targets.yaml"
Organizations Configuration
Define organizations and their cluster settings:
organisations:
  - org_id: "org123"
    cluster_name: ""
  - org_id: "org456"
    cluster_name: ""
Schema Configuration
Build Settings
build_schemas:
  no_cluster_declarations_needed: true
  use_replicated_merge_tree: false
apply_schemas:
  do_add_roles: false
  do_add_columns: true
Schema Definitions
schemas:
  logs_alerts:
    name: "logs_alerts"
    meta_schema: "logs_alerts.csv"
    meta_schema_version: "v001.000.000"
    derived_schema_file_path: "logs_alerts/logs_alerts_sub.csv"
    additional_fields_config: "logs_alerts/logs_alerts_add.csv"
Ingestion Pipeline Configuration
Pipeline Globals
ingestion_pipeline_globals:
  ingestion_data_dir: "/vector-data-dir"
  ingestion_mtls_path: "/etc/vector_tls"
  kafka_mtls_path: "/etc/vector_mtls"
Pipeline Definition
Each pipeline can have multiple stages:
ingestion_pipelines:
  - name: "logs_alerts"
    stages:
      - name: "finalise"
        description: "Finalization stage"
        config:
          kafka_source_topic: "logs_alerts"
          kafka_sink_topic: "logs_alerts_load"
          # ... additional configuration
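To illustrate a pipeline with more than one stage, the sketch below adds a second stage ahead of finalise. The stage name normalise and the topic logs_alerts_normalised are hypothetical, and the idea that stages hand off to each other through Kafka topics is an assumption drawn from the kafka_source_topic and kafka_sink_topic keys, not something the configuration above confirms.

```yaml
ingestion_pipelines:
  - name: "logs_alerts"
    stages:
      # Hypothetical first stage, added for illustration only.
      - name: "normalise"
        description: "Normalisation stage"
        config:
          kafka_source_topic: "logs_alerts"
          # Hypothetical intermediate topic linking the two stages.
          kafka_sink_topic: "logs_alerts_normalised"
      # Existing stage; its source topic is changed here (assumption)
      # so that it reads the output of the stage above.
      - name: "finalise"
        description: "Finalization stage"
        config:
          kafka_source_topic: "logs_alerts_normalised"
          kafka_sink_topic: "logs_alerts_load"
```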
Vector Templates
Define Vector transformation templates:
vector_templates:
  core:
    - name: "000-source-file.yml"
      version: "v001.000.000"
    - name: "001-source-kafka-aws-saas.yml"
      version: "v001.000.000"
Best Practices
Version Management
Use semantic versioning for schemas and templates, in the zero-padded three-part form used throughout this file (for example, v001.000.000)
Document version changes
Maintain backward compatibility
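As a rough sketch of a version bump, assume the three zero-padded parts follow the usual major.minor.patch order (this guide does not spell that out). Under that assumption, a backward-compatible change to the logs_alerts meta schema would raise the middle part and leave the first untouched:

```yaml
schemas:
  logs_alerts:
    name: "logs_alerts"
    meta_schema: "logs_alerts.csv"
    # Assumption: parts are major.minor.patch.
    # Previously "v001.000.000"; a backward-compatible change
    # bumps the middle (minor) part.
    meta_schema_version: "v001.001.000"
    derived_schema_file_path: "logs_alerts/logs_alerts_sub.csv"
    additional_fields_config: "logs_alerts/logs_alerts_add.csv"
```

A change that breaks existing consumers would, under the same assumption, raise the first part instead (for example to v002.000.000).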
Pipeline Configuration
Group related transformations into stages
Use descriptive names for pipelines and stages
Document pipeline dependencies
Schema Management
Keep schema definitions organized
Use consistent naming conventions
Document schema relationships
Resource Management
Configure appropriate resource limits
Monitor pipeline performance
Adjust configurations based on usage patterns
Common Tasks
Adding a New Schema
Define schema in the schemas section
Specify meta schema and version
Add any derived schema configurations
Configure additional fields if needed
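The steps above might look like the following sketch. The schema name logs_network and its file paths are hypothetical; they only mirror the naming pattern of the existing logs_alerts entry.

```yaml
schemas:
  # Existing entries such as logs_alerts stay as they are.
  # Hypothetical new schema, named for illustration only.
  logs_network:
    name: "logs_network"
    meta_schema: "logs_network.csv"
    meta_schema_version: "v001.000.000"
    # Derived schema and additional fields, if needed (hypothetical paths).
    derived_schema_file_path: "logs_network/logs_network_sub.csv"
    additional_fields_config: "logs_network/logs_network_add.csv"
```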
Creating a New Pipeline
Add pipeline definition under ingestion_pipelines
Configure required stages
Define transformation steps
Set appropriate resource limits
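A minimal sketch of a new list entry follows, using only keys that appear elsewhere in this guide. The pipeline name and topic names are hypothetical. This guide does not show the keys for transformation steps or resource limits, so they are left out here rather than guessed.

```yaml
ingestion_pipelines:
  # Existing pipelines such as logs_alerts remain earlier in this list.
  # Hypothetical new pipeline, named for illustration only.
  - name: "logs_network"
    stages:
      - name: "finalise"
        description: "Finalization stage"
        config:
          kafka_source_topic: "logs_network"       # hypothetical topic
          kafka_sink_topic: "logs_network_load"    # hypothetical topic
```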
Updating Vector Templates
Add new template under vector_templates
Specify version number
Reference in pipeline configurations
Test transformations before deployment
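For example, a new entry could be appended to the core list as sketched below. The file name 002-transform-example.yml is hypothetical and chosen only to follow the numbering of the existing templates. How a template is referenced from a pipeline is not shown in this guide, so that step is omitted.

```yaml
vector_templates:
  core:
    - name: "000-source-file.yml"
      version: "v001.000.000"
    - name: "001-source-kafka-aws-saas.yml"
      version: "v001.000.000"
    # Hypothetical new template; the file name is illustrative only.
    - name: "002-transform-example.yml"
      version: "v001.000.000"
```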