# XDR Data Engine Overview

## What is XDR Data Engine?

XDR Data Engine is a toolset for managing and packaging security data processing components. It automates the handling of Beats schemas, Vector templates, and XDR configurations.

## Prerequisites and Dependencies

### Required Access

* GitLab account with access to HyperSec repositories
* GitLab access token with registry read permissions
* Access to the following package repositories:
  * Python Wheel Package
  * Derive Schemas Package
  * Meta Schemas Package
  * Ingestion Pipelines Package

### System Requirements

* Python 3.11 or higher
* Docker and Docker Compose
* Git
* curl, wget, unzip
* 4GB RAM minimum (8GB recommended)
* 20GB disk space minimum
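
The requirements above can be checked before installation with a short preflight script. The following is a minimal sketch, not part of XDR Data Engine itself: the function names are hypothetical, and it covers only the Python version, the command-line tools, and free disk space. It does not check RAM or whether the Docker Compose plugin is installed.

```python
import shutil
import sys

# Values taken from the System Requirements list above.
REQUIRED_TOOLS = ["docker", "git", "curl", "wget", "unzip"]
MIN_PYTHON = (3, 11)
MIN_FREE_DISK_GB = 20


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def python_ok(version=None, minimum=MIN_PYTHON):
    """Check that the (major, minor) version meets the minimum."""
    version = sys.version_info[:2] if version is None else version
    return tuple(version) >= tuple(minimum)


def free_disk_gb(path="."):
    """Free disk space at `path`, in gigabytes (10**9 bytes)."""
    return shutil.disk_usage(path).free / 10**9


if __name__ == "__main__":
    print("Python 3.11+:", python_ok())
    print("Missing tools:", missing_tools() or "none")
    print(f"Free disk: {free_disk_gb():.1f} GB (need {MIN_FREE_DISK_GB})")
```

Running the script prints one line per check, so any unmet requirement can be fixed before the setup script is run.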

## Core Components

### 1. Schema Management

* Build and manage ClickHouse schemas for security data
* Track schema changes under version control
* Automate schema updates and migrations
* Optimize query performance through indexing

### 2. Data Pipeline Management

* Configure and manage Vector-based data ingestion pipelines
* Template-based configuration for consistent data processing
* Support for multiple data sources and formats
* Efficient data routing and transformation
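
To illustrate what template-based pipeline configuration can look like, the sketch below renders a small Vector configuration from a reusable template. It is an assumption-laden example, not one of the templates shipped in the Ingestion Pipelines Package: the template text, the `render_pipeline` helper, and all parameter values are hypothetical. The `file` source and `clickhouse` sink are standard Vector component types.

```python
from string import Template

# Illustrative Vector config template; NOT an actual XDR Data Engine template.
PIPELINE_TEMPLATE = Template("""\
[sources.${name}_in]
type = "file"
include = ["${log_path}"]

[sinks.${name}_out]
type = "clickhouse"
inputs = ["${name}_in"]
endpoint = "${clickhouse_url}"
database = "${database}"
table = "${table}"
""")


def render_pipeline(**params):
    """Fill the template; raises KeyError if a placeholder is missing."""
    return PIPELINE_TEMPLATE.substitute(**params)


# Hypothetical values for a single pipeline.
config = render_pipeline(
    name="auth",
    log_path="/var/log/auth.log",
    clickhouse_url="http://localhost:8123",
    database="xdr",
    table="auth_events",
)
print(config)
```

Because `substitute` fails on a missing placeholder rather than leaving it blank, an incomplete parameter set is caught at render time instead of producing a broken pipeline file.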

### 3. Hunt Framework

* Execute and manage threat hunting operations
* Parallel processing capabilities
* Real-time hunt status monitoring
* Flexible hunt configuration options

### 4. CLI Tools

* Comprehensive command-line interface for all operations
* Automated workflow support
* Configuration management
* Health monitoring and diagnostics

## Package Structure

The XDR Data Engine consists of several key packages:

1. **Core Python Package**
   * Main CLI tool and core functionality
   * Available as a wheel package from GitLab registry
2. **Schema Packages**
   * Derive Schemas: Base schema definitions
   * Meta Schemas: Schema metadata and relationships
   * Used for data structure management
3. **Ingestion Pipeline Package**
   * Vector templates and configurations
   * Data transformation rules
   * Pipeline definitions

## XDE Overview

The system includes the following components:

1. **Schema Builder**
   * Creates and maintains data schemas
   * Manages schema versions
   * Handles schema migrations
2. **Vector Pipeline Manager**
   * Manages data ingestion
   * Handles data routing
   * Processes transformations
3. **Hunt Framework**
   * Executes hunting operations
   * Manages hunt scheduling
   * Processes results
4. **API Servers**
   * Schema management API
   * Hunt operation API
   * Configuration API

## Configuration Hierarchy

Settings are applied in the following order:

1. Environment Variables
2. CLI Parameters
3. xdr\_package.yaml settings
4. xdr\_targets.yaml settings
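
A layered configuration like this can be modelled as a chain of dictionaries that is searched in order. The sketch below is illustrative only. The list above does not say which source wins when the same key appears in more than one place, so the code ASSUMES that sources earlier in the list take precedence. If the tool behaves the other way round, reverse the argument order passed to `ChainMap`. The function name and all setting keys are hypothetical.

```python
from collections import ChainMap


def resolve_settings(env_vars, cli_params, package_yaml, targets_yaml):
    """Merge the four layers into one dict.

    ASSUMPTION: on a key conflict, the earlier layer wins
    (env vars > CLI > xdr_package.yaml > xdr_targets.yaml).
    """
    return dict(ChainMap(env_vars, cli_params, package_yaml, targets_yaml))


# Hypothetical keys and values, for illustration only.
settings = resolve_settings(
    env_vars={"log_level": "debug"},
    cli_params={"log_level": "info", "threads": 8},
    package_yaml={"threads": 4, "timeout": 300},
    targets_yaml={"timeout": 60, "target": "dev"},
)
print(settings["log_level"], settings["threads"], settings["timeout"], settings["target"])
# prints: debug 8 300 dev
```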

## Best Practices

### 1. Installation and Setup

* Use the provided setup script
* Keep packages updated
* Follow version control best practices
* Document custom configurations

### 2. Schema Management

* Version all schemas
* Test changes in development
* Monitor performance impacts
* Keep schema documentation updated

### 3. Pipeline Configuration

* Use templates consistently
* Validate changes in test environment
* Monitor pipeline performance
* Document transformations

### 4. Hunt Operations

* Set appropriate timeouts
* Configure thread limits
* Monitor execution status
* Document hunt rules
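
The effect of thread limits and timeouts can be seen in a small standalone sketch built on Python's standard `concurrent.futures` module. This is not the Hunt Framework's API: `run_hunts`, its parameters, and the example hunts are hypothetical stand-ins. Note that the timeout here only limits how long the caller waits for each result. It does not stop a hunt that is already running, and the pool still waits for running hunts to finish before `run_hunts` returns.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def run_hunts(hunts, max_threads=4, timeout_s=30.0):
    """Run hunt callables in parallel; return {name: (status, result)}."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        # Submit everything up front; at most max_threads run at once.
        futures = {name: pool.submit(fn) for name, fn in hunts.items()}
        for name, future in futures.items():
            try:
                results[name] = ("ok", future.result(timeout=timeout_s))
            except FutureTimeout:
                results[name] = ("timeout", None)
            except Exception as exc:  # record the failure, keep going
                results[name] = ("error", repr(exc))
    return results


# Hypothetical hunts standing in for real hunt rules.
def quick_hunt():
    return 42


def slow_hunt():
    time.sleep(0.5)
    return "late"


def broken_hunt():
    raise ValueError("bad rule")


out = run_hunts(
    {"quick": quick_hunt, "slow": slow_hunt, "broken": broken_hunt},
    max_threads=3,
    timeout_s=0.1,
)
print(out["quick"], out["slow"], out["broken"][0])
```

Recording a status per hunt, rather than raising on the first failure, keeps one slow or faulty rule from hiding the results of the others.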

## Getting Started

For detailed setup instructions, see the [Quick Start Guide](https://xdr-data-engine.docs.hypersec.io/quickstart).

Additional resources:

* [Environment Configuration](https://xdr-data-engine.docs.hypersec.io/getting-started/environment_config)
* Package Configuration
* Schema Management
* [Hunt Framework](https://xdr-data-engine.docs.hypersec.io/hunt-framework/hunt_framework)

## Support and Resources

* Release Notes
* [Version Management](https://xdr-data-engine.docs.hypersec.io/release-information/versioning)
* [Deployment Guide](https://xdr-data-engine.docs.hypersec.io/deployment/deployments)
* GitLab Repository: <https://gitlab.com/hypersec-repo/hyperstack/xdr-data-engine>
