Why Is Wazuh Using High CPU? Troubleshooting Guide

Wazuh is a distributed security monitoring system built on a modular architecture composed of agents, manager, indexer, and dashboard components.

At a high level:

Agents collect telemetry from endpoints (logs, FIM events, system activity).
Wazuh Manager processes incoming events, applies decoders, evaluates rules, and triggers alerts.
Indexer (OpenSearch/Elasticsearch-based) stores and indexes security events for search and correlation.
Dashboard provides visualization and investigation capabilities.

This architecture is powerful, but inherently resource-intensive.

In SIEM/XDR systems like Wazuh, CPU spikes are expected under certain workloads, especially during:

High log ingestion bursts
Rule evaluation surges
Indexing backpressure in OpenSearch/Elasticsearch

In line with NIST's continuous monitoring guidance, it helps to treat security telemetry pipelines as high-throughput analytical systems whose compute demand fluctuates significantly under real-time detection workloads.

Symptoms of High CPU Usage

When CPU saturation occurs in a Wazuh environment, common symptoms include:

Delayed or missing alerts in the dashboard
Increased event processing latency on the manager
Slow query performance in the dashboard
Agent communication lag or queue buildup
System-wide performance degradation on the Wazuh node(s)

In production deployments, these symptoms often indicate bottlenecks in either rule processing (manager layer) or indexing throughput (storage layer).

For a complete guide, see The Complete Wazuh Performance Optimization Guide.

How to Identify High CPU Usage in Wazuh

Before fixing CPU issues, you need to pinpoint which component is responsible.

Using `top`, `htop`, and System Monitoring Tools

Start with standard Linux observability tools:

top / htop → Identify processes consuming CPU in real time
pidstat → Break down CPU usage per process (add -t for per-thread detail)
vmstat → Detect CPU run queue pressure
iostat → Check if CPU spikes correlate with disk I/O saturation

Focus on:

wazuh-analysisd
wazuh-remoted
wazuh-db
filebeat (if used in your pipeline)
OpenSearch/Elasticsearch JVM processes

Checking Wazuh Manager Processes

On the manager node, the most CPU-heavy components are typically:

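wazuh-analysisd → decoding and rule evaluation engine (most common culprit)
wazuh-remoted → receives and queues events sent by agents
wazuh-logcollector → reads log sources local to the manager host
wazuh-db → state tracking and integrity data handling

Sustained high CPU usage in wazuh-analysisd usually indicates: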

Too many active rules
High event throughput
Inefficient decoding patterns

Indexer and OpenSearch CPU Impact

If CPU spikes originate from the indexer layer:

OpenSearch/Elasticsearch JVM heap pressure may be high
Garbage collection cycles increase CPU consumption
Shard rebalancing or indexing bursts may overload CPU

Elastic’s performance documentation notes that indexing throughput is tightly coupled with heap sizing and shard strategy, and misconfiguration can lead to CPU saturation during ingestion spikes.

Correlating Spikes with Log Ingestion Rates

To confirm root cause:

Compare CPU spikes with log ingestion rate (EPS: events per second)
Check if spikes align with agent deployment changes or traffic surges
Review queue metrics in Wazuh manager logs

A sudden increase in EPS without filtering is one of the most common triggers of CPU saturation.
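
One way to check whether CPU really tracks ingestion is to compute a correlation coefficient over paired samples. The sketch below is a minimal example; the sample values are invented for illustration, and how you collect per-minute EPS and CPU readings is left to your own monitoring setup.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat series carries no correlation signal
    return cov / sqrt(var_x * var_y)

# Hypothetical per-minute samples: events per second and manager CPU %
eps_samples = [800, 950, 2400, 2600, 900, 850]
cpu_samples = [35, 40, 88, 93, 42, 38]
r = pearson(eps_samples, cpu_samples)
print(f"EPS/CPU correlation: {r:.2f}")
```

A value close to 1 suggests CPU is ingestion-driven. A weak correlation points toward other causes, such as rule changes or indexer activity.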

Most Common Causes of High CPU Usage

Excessive Log Volume

High log volume is the primary driver of CPU overload in Wazuh environments.

Typical causes:

No log filtering at the agent level
High-frequency system logs (auditd, syslog, application debug logs)
Misconfigured syslog ingestion pipelines flooding the manager

When every event is forwarded without filtering, the manager is forced to:

Decode each event
Apply rule matching
Evaluate correlation logic

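Because each event is decoded and then checked against many rules, CPU cost grows roughly with event volume multiplied by the number of rules evaluated per event, so load climbs quickly during bursts.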

Rule Overload and Inefficient Decoders

Wazuh performance heavily depends on rule and decoder efficiency.

Common issues:

Too many active rules (especially unused or redundant rules)
Complex regex patterns in decoders
Duplicate rule evaluation across multiple rule groups

Each event may trigger dozens or even hundreds of rule evaluations, significantly increasing CPU usage in analysisd.
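
The cost of regex-heavy decoders can be reduced by putting a cheap check in front of the expensive pattern, which is the idea behind narrowing decoders by program name before the regex runs. The sketch below illustrates that idea in plain Python. The `Decoder` class and the sample events are hypothetical and do not reflect how analysisd is implemented internally.

```python
import re

class Decoder:
    """Hypothetical decoder: a cheap program-name check guards an expensive regex."""
    def __init__(self, program, pattern):
        self.program = program
        self.regex = re.compile(pattern)
        self.regex_calls = 0

    def match(self, program, message):
        if program != self.program:   # cheap string comparison first
            return None
        self.regex_calls += 1         # regex only runs for relevant events
        return self.regex.search(message)

decoders = [
    Decoder("sshd", r"Failed password for (\S+) from (\S+)"),
    Decoder("sudo", r"(\S+) : .* COMMAND=(.+)"),
]

# Invented sample events as (program, message) pairs
events = [
    ("sshd", "Failed password for root from 10.0.0.5 port 22 ssh2"),
    ("cron", "(root) CMD (run-parts /etc/cron.hourly)"),
    ("cron", "(root) CMD (logrotate)"),
]

matches = [m for prog, msg in events for d in decoders if (m := d.match(prog, msg))]
total_regex_calls = sum(d.regex_calls for d in decoders)
print(f"{len(matches)} match(es), {total_regex_calls} regex evaluation(s) for {len(events)} events")
```

Without the program check, every decoder would run its regex against every event. With it, unrelated events are rejected by a string comparison.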

Wazuh Manager Bottlenecks

The manager layer is often the first point of failure under load.

Key bottleneck patterns:

analysisd saturation → CPU maxed due to rule evaluation backlog
Queue backlog issues → events waiting in processing queues
Thread contention → multiple workers competing for CPU cycles

When queues fill up, latency increases and CPU remains pinned at high utilization as the system tries to catch up.
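
The arithmetic behind a filling queue is simple enough to sketch. The numbers below (queue capacity, arrival rate, and processing rate) are hypothetical; substitute values measured in your own environment.

```python
def seconds_until_full(capacity, depth, eps_in, eps_out):
    """Time until a bounded queue overflows, or None if it is draining or stable."""
    growth = eps_in - eps_out          # net events added per second
    if growth <= 0:
        return None
    return (capacity - depth) / growth

# Hypothetical numbers: 16384-slot queue, 3000 EPS arriving, 2500 EPS processed
t = seconds_until_full(capacity=16384, depth=0, eps_in=3000, eps_out=2500)
print(f"queue full in about {t:.0f} s")
```

Even a modest gap between arrival and processing rates fills a bounded queue within seconds, which is why short bursts can already produce dropped events.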

Indexer / OpenSearch Pressure

The indexing layer can silently become the CPU bottleneck.

Typical issues include:

Heavy indexing load from high event ingestion rates
Poor shard configuration (too many or too large shards)
Insufficient JVM heap allocation causing frequent garbage collection cycles

OpenSearch documentation highlights that improper shard sizing can significantly degrade indexing throughput and increase CPU consumption due to coordination overhead.

Agent Misconfiguration

Poor endpoint configuration can push unnecessary load upstream.

Common misconfigurations:

Over-reporting agents sending verbose logs
File Integrity Monitoring (FIM) scanning too frequently
Audit and rootcheck modules enabled with overly aggressive policies

This leads to:

High event volume at the source
Amplified processing load on the manager
Increased indexing pressure downstream

Step-by-Step Troubleshooting Guide

This section focuses on isolating whether CPU pressure originates from the Wazuh manager, indexer, or ingestion pipeline, and then progressively reducing load in a controlled manner.

Check System Resource Usage

Start by establishing a baseline of system utilization.

Identify top CPU-consuming processes

Run standard Linux profiling tools:

top / htop → quick real-time view of CPU-heavy processes
pidstat -u 1 → per-process CPU usage over time
ps -eo pid,ppid,cmd,%cpu,%mem --sort=-%cpu | head → snapshot of top consumers

Focus on:

wazuh-analysisd
wazuh-remoted
wazuh-db
OpenSearch / Elasticsearch Java process
Filebeat (if used in ingestion pipeline)
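
If you want to track these processes over time rather than watch them interactively, a small parser over `ps` output is one option. The sketch below assumes output in the shape produced by `ps -eo comm,%cpu --no-headers` and assumes the indexer shows up as a `java` process; the sample text is invented.

```python
WATCHED = ("wazuh-analysisd", "wazuh-remoted", "wazuh-db", "java", "filebeat")

def watched_cpu(ps_output):
    """Sum %CPU per watched command from `ps -eo comm,%cpu --no-headers` output."""
    usage = {}
    for line in ps_output.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank lines and command names containing spaces
        comm, cpu = parts
        if comm in WATCHED:
            usage[comm] = usage.get(comm, 0.0) + float(cpu)
    return sorted(usage.items(), key=lambda item: item[1], reverse=True)

# Invented sample output for illustration
sample = """\
wazuh-analysisd 87.5
wazuh-remoted   12.0
java            45.2
bash             0.1
wazuh-db         3.4
"""
for name, cpu in watched_cpu(sample):
    print(f"{name:<16} {cpu:5.1f}%")
```

To run it against a live system, pass the standard output of `subprocess.run(["ps", "-eo", "comm,%cpu", "--no-headers"], capture_output=True, text=True)` instead of the sample string.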

Validate whether issue is manager vs indexer

A key distinction:

Manager CPU spike
- High wazuh-analysisd, wazuh-remoted, or wazuh-logcollector CPU usage
- High rule evaluation latency
- Increased queue size in /var/ossec/queue/
Indexer CPU spike
- High JVM CPU usage
- Frequent garbage collection cycles
- Slow indexing or shard reallocation

This separation is critical because tuning strategies differ significantly between layers.

Reference:

How to Build a Wazuh Indexer Cluster

Analyze Wazuh Logs

Wazuh logs provide direct visibility into bottlenecks and queue saturation.

Key log files

/var/ossec/logs/ossec.log → core manager activity and errors
/var/ossec/logs/alerts/alerts.json → generated alerts and rule activity

What to look for

Repeated warnings along these lines (exact wording varies by version and daemon):
- “Queue is full”
- “Too many events received”
- “Analysisd high load”
Dropped event messages
Frequent decoder failures
Sudden spikes in alert generation frequency

A pattern of repeated queue warnings is a strong indicator that CPU is being saturated due to ingestion or rule processing overload.
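
To see whether such warnings cluster around specific times, you can bucket matching log lines by hour. The sketch below is a rough starting point: the keyword pattern is an assumption, the sample lines are illustrative rather than copied from a real ossec.log, and it assumes each line starts with a `YYYY/MM/DD HH:MM:SS` timestamp.

```python
import re
from collections import Counter

# Assumed keywords; adjust to the messages your Wazuh version actually emits.
PATTERN = re.compile(r"queue.*full|dropp(ed|ing)|discard", re.IGNORECASE)

def warnings_per_hour(lines):
    """Count suspicious log lines per 'YYYY/MM/DD HH' bucket."""
    counts = Counter()
    for line in lines:
        if PATTERN.search(line):
            counts[line[:13]] += 1  # 'YYYY/MM/DD HH' prefix of the timestamp
    return counts

# Illustrative lines only; not copied from a real ossec.log.
sample = [
    "2024/05/01 10:02:11 wazuh-remoted: WARNING: Message queue is full.",
    "2024/05/01 10:15:40 wazuh-analysisd: WARNING: Input queue is full.",
    "2024/05/01 11:01:05 wazuh-analysisd: INFO: Started.",
]
print(warnings_per_hour(sample))
```

Because the function accepts any iterable of lines, an open file handle on /var/ossec/logs/ossec.log can be passed directly.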

Reduce Log Ingestion Load

One of the most effective ways to immediately reduce CPU pressure is to lower event volume before it reaches the manager.

Filter noisy logs at agent level

Exclude verbose system logs (debug-level application logs)
Limit high-frequency event sources (e.g., auditd, syslog spam)
Apply ignore rules in agent configuration

Disable unnecessary modules

Disable modules that are not required in your environment:

Rootcheck (if not actively used)
Unused FIM directories
Excess cloud integrations or collectors

Reducing upstream noise directly reduces analysisd CPU load.

Optimize Rules and Decoders

Rule and decoder efficiency has a direct impact on CPU usage in the manager layer.

Disable unused rules

Audit active rule sets
Remove rules that are not relevant to your environment
Disable entire rule groups if not needed
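
A practical starting point for a rule audit is to see which rules fire most often, since those are the first candidates for tuning or suppression. The sketch below counts rule IDs in JSON-lines alert data such as alerts.json, assuming each alert carries a `rule` object with `id` and `description` fields. The sample alerts are invented.

```python
import json
from collections import Counter

def top_rules(lines, n=5):
    """Return the n most frequent (rule id, description) pairs in JSON-lines alerts."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            rule = json.loads(line).get("rule", {})
        except json.JSONDecodeError:
            continue  # skip partially written lines
        counts[(rule.get("id"), rule.get("description"))] += 1
    return counts.most_common(n)

# Invented sample alerts for illustration
sample = [
    '{"rule": {"id": "5715", "description": "sshd: authentication success."}}',
    '{"rule": {"id": "5503", "description": "PAM: User login failed."}}',
    '{"rule": {"id": "5715", "description": "sshd: authentication success."}}',
]
for (rule_id, desc), count in top_rules(sample):
    print(count, rule_id, desc)
```

Note that this surfaces noisy rules, not unused ones. Rules that never appear in the output over a long period are the ones to review for removal.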

Merge duplicate rules

Consolidate overlapping detection logic
Avoid multiple rules triggering on the same event pattern
Reduce redundant regex evaluation

Use rule frequency tuning

Apply frequency and timeframe options to limit repetitive triggering
Prevent high-volume alerts from repeatedly firing on identical conditions

This reduces both CPU usage and alert noise.

Reference:

How to Create Custom Detection Rules in Wazuh (With Examples)

Tune Indexer Settings

If the bottleneck is in OpenSearch/Elasticsearch, indexing configuration must be optimized.

Adjust shard size

Avoid excessive small shards (high overhead)
Avoid oversized shards (slow queries and merges)
Aim for balanced shard distribution per node
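
A quick way to sanity-check shard counts is to divide the expected index size by a target shard size. The sketch below assumes a 30 GB target, which sits within commonly cited guidance of a few tens of gigabytes per shard. Treat it as a rough estimate, not a sizing rule.

```python
from math import ceil

def primary_shards(daily_index_gb, target_shard_gb=30):
    """Primary shards needed so each stays near the target size (at least 1)."""
    return max(1, ceil(daily_index_gb / target_shard_gb))

# Hypothetical daily index sizes
for size in (5, 45, 200):
    print(f"{size} GB/day -> {primary_shards(size)} primary shard(s)")
```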

Increase heap memory (if needed)

Ensure JVM heap is appropriately sized (commonly about 50% of system RAM, and below roughly 32 GB so the JVM can keep using compressed object pointers)
Monitor garbage collection frequency; frequent GC burns CPU without doing useful indexing work
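
The usual heap guidance (about half of system RAM, capped below roughly 32 GB) can be expressed as a one-line calculation. The 31 GB ceiling below is an assumed conservative cap; verify the right value for your JVM and workload.

```python
def heap_gb(system_ram_gb, ceiling_gb=31):
    """Half of RAM, capped below the ~32 GB compressed-pointer threshold."""
    return min(system_ram_gb // 2, ceiling_gb)

for ram in (8, 32, 128):
    print(f"{ram} GB RAM -> -Xms{heap_gb(ram)}g -Xmx{heap_gb(ram)}g")
```

Setting the minimum and maximum heap to the same value avoids resize pauses, which is why the output prints both flags with identical sizes.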

Reduce indexing refresh rate

Increase refresh_interval to reduce indexing overhead
Batch indexing where possible to reduce CPU spikes
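
Changing refresh_interval is done through the index settings API. The sketch below only builds the request so you can inspect it before sending anything. The endpoint URL, the `wazuh-alerts-*` index pattern, and the 30-second interval are assumptions, and authentication and TLS handling are deliberately left out.

```python
import json
from urllib.request import Request

def refresh_interval_request(base_url, index_pattern, interval="30s"):
    """Build (but do not send) a PUT <index>/_settings request."""
    body = json.dumps({"index": {"refresh_interval": interval}}).encode()
    return Request(
        url=f"{base_url}/{index_pattern}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )

# Hypothetical endpoint and index pattern; authentication is omitted.
req = refresh_interval_request("https://localhost:9200", "wazuh-alerts-*")
print(req.get_method(), req.full_url, req.data.decode())
```

A settings update like this applies to indices that already exist. For newly created daily indices, the same setting would typically need to go into the index template.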

Performance Optimization Best Practices

Once immediate CPU issues are stabilized, long-term optimizations help prevent recurrence.

Enable log throttling

Limit repetitive event ingestion
Prevent burst traffic from overwhelming analysis pipeline
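
Conceptually, throttling means forwarding the first occurrence of an event and suppressing identical repeats for a set period. The sketch below illustrates that idea with a hypothetical `RepeatThrottle` class and invented messages. It is not how Wazuh implements throttling internally.

```python
class RepeatThrottle:
    """Drop repeats of the same key seen within `window` seconds (illustrative only)."""
    def __init__(self, window):
        self.window = window
        self.last_seen = {}

    def allow(self, key, now):
        last = self.last_seen.get(key)
        if last is not None and now - last < self.window:
            return False          # same event too soon: suppress
        self.last_seen[key] = now
        return True

throttle = RepeatThrottle(window=60)
# Invented (message, timestamp in seconds) pairs
stream = [
    ("disk full /var", 0),
    ("disk full /var", 10),
    ("login failed bob", 15),
    ("disk full /var", 70),
]
forwarded = [msg for msg, ts in stream if throttle.allow(msg, ts)]
print(forwarded)
```

The window is measured from the last forwarded event, so a steady stream of identical messages still produces one event per window rather than none.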

Use centralized filtering strategies

Filter logs at ingestion layer rather than manager
Standardize syslog filtering across all agents
Apply consistent log severity thresholds

Optimize agent configurations

Reduce unnecessary FIM monitoring paths
Disable unused integrations per endpoint type
Tune log collection frequency per environment role (server vs workstation)

Scale Wazuh horizontally (multi-node setup)

Split roles across multiple nodes:
- Manager nodes
- Indexer nodes
- Dashboard nodes
Distribute ingestion load to avoid single-node CPU saturation

This is especially important in environments with sustained high EPS (events per second).

Regular performance audits

Monitor CPU trends over time
Review rule efficiency quarterly
Analyze ingestion growth patterns
Benchmark system under peak load conditions

Advanced Debugging Techniques

For persistent or complex CPU issues, deeper system-level diagnostics are required.

Enable debug logging in Wazuh manager

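Raise daemon debug levels in /var/ossec/etc/local_internal_options.conf (for example, analysisd.debug=2), restart the manager, and revert afterwards, since debug logging itself adds load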
Helps identify rule processing delays and queue bottlenecks
Useful for pinpointing inefficient decoders or rules

Use performance profiling tools

pidstat -t → CPU usage per thread over time
perf top → live view of the hottest functions in user and kernel space
strace → system call tracing for bottleneck detection

These tools help determine whether CPU usage is driven by:

User-space rule evaluation
Kernel I/O waits
Indexing or disk bottlenecks

Monitor queue metrics in real time

Key areas:

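Event queue usage reported by the manager daemons (in recent 4.x versions, for example, the wazuh-analysisd.state and wazuh-remoted.state files under /var/ossec/var/run/)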
Agent buffer backlog
Analysisd processing lag

A continuously growing queue is a direct indicator that processing capacity is below ingestion rate.
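
To tell a temporary bump from continuous growth, you can fit a trend line to periodic queue-depth readings. The sketch below computes a least-squares slope; the readings are hypothetical, and how you sample the queue depth depends on your Wazuh version and tooling.

```python
def slope(samples):
    """Least-squares slope of (time_s, depth) samples, in events per second."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_d = sum(d for _, d in samples) / n
    num = sum((t - mean_t) * (d - mean_d) for t, d in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    return num / den

# Hypothetical queue-depth readings taken every 10 seconds
readings = [(0, 100), (10, 600), (20, 1100), (30, 1600)]
growth = slope(readings)
print(f"queue growing by about {growth:.0f} events/s")
```

A slope that stays positive across several sampling windows means the backlog is not recovering on its own. The samples need distinct timestamps, otherwise the denominator is zero.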

Trace rule execution timing

Identify slow rules using debug logs
Detect regex-heavy rules causing CPU spikes
Reorder or disable inefficient rules based on execution cost

This level of tracing is often necessary in large-scale deployments where rule complexity becomes the primary performance limiter.

When to Scale Your Wazuh Deployment

Scaling becomes necessary when optimization alone can no longer stabilize CPU usage or ingestion throughput.

At this point, the issue is no longer configuration efficiency; it is architectural capacity.

CPU consistently above threshold (>80–90%)

Sustained high CPU utilization on the manager or indexer nodes indicates that the system is operating at or beyond its designed processing capacity.

Key signals:

analysisd or OpenSearch processes consistently pegged near max CPU
No improvement after rule tuning or log filtering
Increased event processing latency even under normal load

At this stage, additional tuning yields diminishing returns.
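
"Consistently above threshold" is easier to act on when it is defined precisely. One possible definition, sketched below, is that at least 90% of CPU samples in a window exceed 85%. Both numbers are assumptions to adapt, and the sample series are invented.

```python
def sustained_high(samples, threshold=85.0, min_fraction=0.9):
    """True if at least min_fraction of CPU samples exceed threshold."""
    if not samples:
        return False
    above = sum(1 for s in samples if s > threshold)
    return above / len(samples) >= min_fraction

spike = [40, 95, 97, 42, 38, 41, 39, 40, 43, 41]      # brief burst
pinned = [91, 94, 96, 92, 95, 97, 93, 88, 96, 95]     # sustained saturation
print(sustained_high(spike), sustained_high(pinned))
```

A check like this separates short ingestion bursts, which tuning can absorb, from saturation that persists regardless of load.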

High event ingestion rates

A rapid increase in EPS (events per second) is one of the strongest indicators that scaling is required.

Common triggers:

New logging sources (cloud integrations, Kubernetes clusters)
Increased audit verbosity across endpoints
Security incidents generating burst telemetry

When ingestion grows faster than processing capacity, CPU saturation becomes unavoidable without horizontal scaling.

Growing number of endpoints

As agent count increases:

Rule evaluation workload scales linearly (or worse, depending on rule complexity)
Log aggregation pressure increases on the manager
Queue depth grows under peak traffic

Large environments require:

Multi-manager deployments
Load-balanced agent distribution
Dedicated indexer clusters

Indexer unable to keep up with ingestion

When the indexer becomes the bottleneck:

Indexing latency increases
CPU usage remains high even during idle periods
Shard reallocation or GC cycles dominate processing time

This typically indicates the need for:

Additional indexer nodes
Better shard distribution
Increased hardware resources per node

Reference:

How to Build a Wazuh Indexer Cluster

Frequently Asked Questions (FAQ)

Question: Why is Wazuh using so much CPU?

High CPU usage in Wazuh is typically caused by excessive log ingestion, inefficient rule evaluation, or indexer bottlenecks.

The most common root cause is unfiltered high-volume telemetry overwhelming the analysisd process.

Question: Which Wazuh process consumes the most CPU?

In most deployments:

Manager layer: wazuh-analysisd is the primary CPU consumer
Indexer layer: OpenSearch/Elasticsearch JVM process dominates CPU usage

The exact bottleneck depends on whether the system is rule-bound or indexing-bound.

Question: Can reducing rules improve performance?

Yes. Reducing active rules directly lowers CPU consumption in analysisd because fewer evaluations are performed per event.

Best practices:

Disable unused rulesets
Remove redundant detection logic
Avoid overly complex regex patterns

Question: Does increasing memory reduce CPU usage?

Not directly.

Increasing memory may:

Reduce garbage collection pressure on the indexer
Improve caching efficiency

However, CPU usage is primarily driven by:

Rule evaluation complexity
Event volume
Indexing workload

So memory tuning helps indirectly, not as a primary fix.

Question: How do I monitor Wazuh performance effectively?

Effective monitoring requires visibility across all layers:

System tools: top, htop, pidstat
Wazuh logs: /var/ossec/logs/ossec.log
Indexer metrics: JVM heap, GC activity, shard health
Queue monitoring: event backlog and processing delays

A strong approach is correlating:

CPU spikes
EPS (event ingestion rate)
Queue depth
Alert latency

Reference:

Wazuh Dashboard Not Loading? Complete Troubleshooting Guide

Conclusion

High CPU usage in Wazuh is rarely caused by a single factor.

It is usually the result of compounding pressure across ingestion, rule evaluation, and indexing layers.

Recap main causes of high CPU usage

The most common contributors include:

Excessive log volume without filtering
Inefficient or overloaded rule sets
Manager-side bottlenecks in analysisd
Indexer pressure from shard or heap misconfiguration
Misconfigured or overly verbose agents

Importance of tuning and monitoring

Sustainable Wazuh performance depends on continuous tuning:

Reducing noise at the source (agents)
Optimizing detection logic (rules/decoders)
Ensuring indexing efficiency (OpenSearch tuning)
Monitoring system health proactively rather than reactively

Without ongoing observability, CPU issues tend to reappear as environments scale.

Recommendation: proactive optimization over reactive troubleshooting

Instead of waiting for CPU spikes to impact alerting or system stability, organizations should:

Establish baseline performance metrics
Continuously audit rule and log efficiency
Scale architecture before saturation occurs

Related reading:

Wazuh vs Splunk
Wazuh vs Graylog
Wazuh vs OSSIM

A properly tuned Wazuh deployment is not just about preventing CPU spikes; it is about maintaining predictable detection performance under evolving security workloads.
