# XDR Data Engine Overview

## What is XDR Data Engine?

XDR Data Engine is a toolset for managing and packaging security data processing components. It automates the handling of Beats schemas, Vector templates, and XDR configurations.

## Prerequisites and Dependencies

### Required Access

* GitLab account with access to HyperSec repositories
* GitLab access token with registry read permissions
* Access to the following package repositories:
  * Python Wheel Package
  * Derive Schemas Package
  * Meta Schemas Package
  * Ingestion Pipelines Package

### System Requirements

* Python 3.11 or higher
* Docker and Docker Compose
* Git
* curl, wget, unzip
* 4GB RAM minimum (8GB recommended)
* 20GB disk space minimum
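
The requirements above can be checked before installation with a short script. The following is a minimal sketch, not part of the XDR Data Engine tooling: the function names `missing_tools` and `python_ok` are hypothetical, and the executable names (for example `docker`) are assumptions about how each tool is installed. Docker Compose is not checked here because it may be installed either as a standalone binary or as a Docker plugin.

```python
import shutil
import sys

# Tool names taken from the requirements list above; the executable names
# are assumptions about how each tool appears on PATH.
REQUIRED_TOOLS = ["docker", "git", "curl", "wget", "unzip"]
MIN_PYTHON = (3, 11)


def missing_tools(tools, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


def python_ok(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the interpreter meets the minimum (major, minor) version."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    if not python_ok():
        print(f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]} or higher is required")
    for tool in missing_tools(REQUIRED_TOOLS):
        print(f"Missing required tool: {tool}")
```

The script prints nothing when every check passes. Memory and disk space are not checked in this sketch.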

## Core Components

### 1. Schema Management

* Build and manage ClickHouse schemas for security data
* Version control and track schema changes
* Automated schema updates and migrations
* Performance optimization through intelligent indexing

### 2. Data Pipeline Management

* Configure and manage Vector-based data ingestion pipelines
* Template-based configuration for consistent data processing
* Support for multiple data sources and formats
* Efficient data routing and transformation

### 3. Hunt Framework

* Execute and manage threat hunting operations
* Parallel processing capabilities
* Real-time hunt status monitoring
* Flexible hunt configuration options

### 4. CLI Tools

* Comprehensive command-line interface for all operations
* Automated workflow support
* Configuration management
* Health monitoring and diagnostics

## Package Structure

The XDR Data Engine consists of several key packages:

1. **Core Python Package**
   * Main CLI tool and core functionality
   * Available as a wheel package from GitLab registry
2. **Schema Packages**
   * Derive Schemas: Base schema definitions
   * Meta Schemas: Schema metadata and relationships
   * Used for data structure management
3. **Ingestion Pipeline Package**
   * Vector templates and configurations
   * Data transformation rules
   * Pipeline definitions

## XDE Overview

The system includes the following components:

1. **Schema Builder**
   * Creates and maintains data schemas
   * Manages schema versions
   * Handles schema migrations
2. **Vector Pipeline Manager**
   * Manages data ingestion
   * Handles data routing
   * Processes transformations
3. **Hunt Framework**
   * Executes hunting operations
   * Manages hunt scheduling
   * Processes results
4. **API Servers**
   * Schema management API
   * Hunt operation API
   * Configuration API

## Configuration Hierarchy

Settings are applied in the following order:

1. Environment Variables
2. CLI Parameters
3. xdr\_package.yaml settings
4. xdr\_targets.yaml settings
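
A layered lookup is one way to picture how four sources combine into a single set of settings. The sketch below is illustrative only and makes a loud assumption: it reads the list above as highest precedence first, so an environment variable wins over a CLI parameter, which wins over either YAML file. If the engine applies the sources in the opposite sense, reverse the argument order. The function `resolve_settings` and all setting keys are hypothetical.

```python
from collections import ChainMap


def resolve_settings(env, cli, package_yaml, targets_yaml):
    """Merge four setting sources; earlier arguments win on conflicts.

    ASSUMPTION: the documented order is read as highest precedence first.
    """
    # ChainMap returns the value from the first mapping that contains the key.
    return dict(ChainMap(env, cli, package_yaml, targets_yaml))


# Hypothetical keys and values, for illustration only.
settings = resolve_settings(
    env={"log_level": "debug"},
    cli={"log_level": "info", "threads": 4},
    package_yaml={"threads": 2, "output_dir": "build"},
    targets_yaml={"output_dir": "dist", "target": "dev"},
)
print(settings)
```

Under this assumption, `log_level` resolves to `debug`, `threads` to `4`, `output_dir` to `build`, and `target` to `dev`.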

## Best Practices

### 1. Installation and Setup

* Use the provided setup script
* Keep packages updated
* Follow version control best practices
* Document custom configurations

### 2. Schema Management

* Version all schemas
* Test changes in a development environment
* Monitor performance impacts
* Keep schema documentation updated

### 3. Pipeline Configuration

* Use templates consistently
* Validate changes in a test environment
* Monitor pipeline performance
* Document transformations

### 4. Hunt Operations

* Set appropriate timeouts
* Configure thread limits
* Monitor execution status
* Document hunt rules
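
To make the timeout and thread-limit advice concrete, the sketch below runs a set of tasks on a bounded thread pool and records a status for each. It does not use the Hunt Framework API: `run_hunts`, `max_threads`, and `timeout_s` are hypothetical names, and each "hunt" is just a plain Python callable standing in for a real hunt.

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout
import time


def run_hunts(hunts, max_threads=4, timeout_s=5.0):
    """Run hypothetical hunt callables in parallel and return a status per hunt."""
    statuses = {}
    # max_workers caps how many hunts execute at the same time.
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        futures = {name: pool.submit(fn) for name, fn in hunts.items()}
        for name, future in futures.items():
            try:
                # timeout_s bounds how long we wait for this result;
                # it does not stop the worker thread itself.
                statuses[name] = ("done", future.result(timeout=timeout_s))
            except FutureTimeout:
                statuses[name] = ("timeout", None)
            except Exception as exc:
                statuses[name] = ("failed", repr(exc))
    return statuses


if __name__ == "__main__":
    demo = {
        "quick": lambda: "no findings",
        "slow": lambda: time.sleep(0.5),
    }
    print(run_hunts(demo, max_threads=2, timeout_s=0.1))
```

Because Python threads cannot be forcibly stopped, a timed-out task keeps running until it returns, and the pool waits for it on exit. A real deployment would rely on whatever cancellation the hunt backend provides.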

## Getting Started

For detailed setup instructions, see the [Quick Start Guide](https://xdr-data-engine.docs.hypersec.io/quickstart).

Additional resources:

* [Environment Configuration](https://xdr-data-engine.docs.hypersec.io/getting-started/environment_config)
* [Package Configuration](https://xdr-data-engine.docs.hypersec.io/broken-reference)
* [Schema Management](https://xdr-data-engine.docs.hypersec.io/broken-reference)
* [Hunt Framework](https://xdr-data-engine.docs.hypersec.io/hunt-framework/hunt_framework)

## Support and Resources

* [Release Notes](https://xdr-data-engine.docs.hypersec.io/broken-reference)
* [Version Management](https://xdr-data-engine.docs.hypersec.io/release-information/versioning)
* [Deployment Guide](https://xdr-data-engine.docs.hypersec.io/deployment/deployments)
* GitLab Repository: <https://gitlab.com/hypersec-repo/hyperstack/xdr-data-engine>


