XDR Data Engine Overview

What is XDR Data Engine?

XDR Data Engine is a toolset for managing and packaging security data processing components. It automates the handling of Beats schemas, Vector templates, and XDR configurations.

Prerequisites and Dependencies

Required Access

  • GitLab account with access to HyperSec repositories

  • GitLab access token with registry read permissions

  • Access to the following package repositories:

    • Python Wheel Package

    • Derive Schemas Package

    • Meta Schemas Package

    • Ingestion Pipelines Package
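
As an illustration of how the access token is typically used, the sketch below builds a pip index URL for a GitLab project-level PyPI registry. The URL layout follows GitLab's documented PyPI endpoint. The host, project ID, token name, and environment variable are placeholders and not values from this documentation, and the sketch covers only the Python wheel, since the other packages may be published in a different registry format.

```python
import os
from urllib.parse import quote


def gitlab_pypi_index_url(host, project_id, token_name, token):
    """Build a pip --index-url for a GitLab project-level PyPI registry."""
    # Percent-encode the credentials so special characters cannot break the URL.
    credentials = f"{quote(token_name, safe='')}:{quote(token, safe='')}"
    return f"https://{credentials}@{host}/api/v4/projects/{project_id}/packages/pypi/simple"


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    url = gitlab_pypi_index_url(
        host="gitlab.example.com",
        project_id=12345,
        token_name="xdr-reader",
        token=os.environ.get("GITLAB_TOKEN", "<token>"),
    )
    print(url)
```

The resulting URL can be passed to pip with --index-url. Because it embeds the token, avoid writing it to logs or committing it to version control.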

System Requirements

  • Python 3.11 or higher

  • Docker and Docker Compose

  • Git

  • curl, wget, unzip

  • 4GB RAM minimum (8GB recommended)

  • 20GB disk space minimum
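
The requirements above can be checked before installation with a short script. This is a minimal sketch and not part of the XDR Data Engine tooling: the function name is hypothetical, it does not check RAM (the standard library has no portable way to do so), and it checks only for the docker binary, not for Docker Compose.

```python
import shutil
import sys

# Values taken from the system requirements list.
MIN_PYTHON = (3, 11)
REQUIRED_TOOLS = ["docker", "git", "curl", "wget", "unzip"]
MIN_FREE_DISK_GB = 20


def check_prerequisites(tools=REQUIRED_TOOLS, path="."):
    """Return a list of problems; an empty list means every check passed."""
    problems = []

    # Compare only (major, minor) against the minimum version.
    if sys.version_info[:2] < MIN_PYTHON:
        problems.append(
            f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ required, found "
            f"{sys.version_info.major}.{sys.version_info.minor}"
        )

    # shutil.which returns None when an executable is not on PATH.
    for tool in tools:
        if shutil.which(tool) is None:
            problems.append(f"'{tool}' not found on PATH")

    # Free space on the filesystem that contains `path`, in GiB.
    free_gb = shutil.disk_usage(path).free / 1024**3
    if free_gb < MIN_FREE_DISK_GB:
        problems.append(
            f"only {free_gb:.1f} GB free disk space, need {MIN_FREE_DISK_GB} GB"
        )

    return problems


if __name__ == "__main__":
    issues = check_prerequisites()
    for issue in issues:
        print("PROBLEM:", issue)
    print("OK" if not issues else f"{len(issues)} problem(s) found")
```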

Core Components

1. Schema Management

  • Build and manage ClickHouse schemas for security data

  • Version control and track schema changes

  • Automated schema updates and migrations

  • Performance optimization through intelligent indexing

2. Data Pipeline Management

  • Configure and manage Vector-based data ingestion pipelines

  • Template-based configuration for consistent data processing

  • Support for multiple data sources and formats

  • Efficient data routing and transformation

3. Hunt Framework

  • Execute and manage threat hunting operations

  • Parallel processing capabilities

  • Real-time hunt status monitoring

  • Flexible hunt configuration options

4. CLI Tools

  • Comprehensive command-line interface for all operations

  • Automated workflow support

  • Configuration management

  • Health monitoring and diagnostics

Package Structure

The XDR Data Engine consists of several key packages:

  1. Core Python Package

    • Main CLI tool and core functionality

    • Available as a wheel package from GitLab registry

  2. Schema Packages

    • Derive Schemas: Base schema definitions

    • Meta Schemas: Schema metadata and relationships

    • Used for data structure management

  3. Ingestion Pipeline Package

    • Vector templates and configurations

    • Data transformation rules

    • Pipeline definitions

XDE Overview

The system includes the following components:

  1. Schema Builder

    • Creates and maintains data schemas

    • Manages schema versions

    • Handles schema migrations

  2. Vector Pipeline Manager

    • Manages data ingestion

    • Handles data routing

    • Processes transformations

  3. Hunt Framework

    • Executes hunting operations

    • Manages hunt scheduling

    • Processes results

  4. API Servers

    • Schema management API

    • Hunt operation API

    • Configuration API

Configuration Hierarchy

Settings are applied in the following order:

  1. Environment Variables

  2. CLI Parameters

  3. xdr_package.yaml settings

  4. xdr_targets.yaml settings
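
One way to picture a layered configuration like this is a chain of lookups across the four sources. The sketch below assumes that a source listed earlier overrides one listed later. The text does not state the precedence direction, so treat that as an assumption and reverse the argument order if the opposite holds. The setting names are hypothetical.

```python
from collections import ChainMap


def resolve_settings(env_vars, cli_params, package_yaml, targets_yaml):
    """Combine the four sources. ASSUMPTION: the first source listed wins."""
    # ChainMap searches its mappings left to right and returns the first match.
    return ChainMap(env_vars, cli_params, package_yaml, targets_yaml)


# Hypothetical setting names and values, for illustration only.
settings = resolve_settings(
    env_vars={"log_level": "debug"},
    cli_params={"log_level": "info", "threads": 8},
    package_yaml={"threads": 4, "output_dir": "dist"},
    targets_yaml={"output_dir": "build", "target": "default"},
)

print(settings["log_level"])   # debug   (environment variable wins)
print(settings["threads"])     # 8       (CLI parameter wins over xdr_package.yaml)
print(settings["output_dir"])  # dist    (xdr_package.yaml wins over xdr_targets.yaml)
print(settings["target"])      # default (only defined in xdr_targets.yaml)
```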

Best Practices

1. Installation and Setup

  • Use the provided setup script

  • Keep packages updated

  • Follow version control best practices

  • Document custom configurations

2. Schema Management

  • Version all schemas

  • Test changes in development

  • Monitor performance impacts

  • Keep schema documentation updated

3. Pipeline Configuration

  • Use templates consistently

  • Validate changes in a test environment

  • Monitor pipeline performance

  • Document transformations

4. Hunt Operations

  • Set appropriate timeouts

  • Configure thread limits

  • Monitor execution status

  • Document hunt rules
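
The timeout, thread limit, and status monitoring practices above can be sketched with Python's standard thread pool. This is not the Hunt Framework's API: run_hunts, its parameters, and the status labels are hypothetical, and each hunt is modeled as a plain callable.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def run_hunts(hunts, max_threads=4, timeout_seconds=30.0):
    """Run hunt callables in a bounded thread pool and return a status per hunt."""
    statuses = {}
    # max_workers caps how many hunts execute at the same time.
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        futures = {name: pool.submit(func) for name, func in hunts.items()}
        for name, future in futures.items():
            try:
                # The timeout bounds how long we wait for this result, counted
                # from this call. It does not stop the worker thread itself.
                statuses[name] = ("completed", future.result(timeout=timeout_seconds))
            except FutureTimeout:
                statuses[name] = ("timed_out", None)
            except Exception as exc:
                statuses[name] = ("failed", repr(exc))
    # Leaving the `with` block waits for any still-running hunts to finish.
    return statuses


if __name__ == "__main__":
    # Hypothetical hunts, for illustration only.
    demo_hunts = {
        "quick_hunt": lambda: 42,
        "slow_hunt": lambda: time.sleep(0.5),
    }
    for name, status in run_hunts(demo_hunts, max_threads=2, timeout_seconds=0.1).items():
        print(name, status)
```

Because threads cannot be forcibly cancelled, a hunt that must be stopped at its deadline needs its own internal timeout or should run in a separate process.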

Getting Started

For detailed setup instructions, see the Quick Start Guide.

