Fix Wazuh Logcollector Dropped Messages

Wazuh is designed to process large volumes of security events from endpoints, servers, applications, and network devices.

However, when log volume exceeds the processing capacity of the Wazuh pipeline, administrators may encounter warnings indicating that messages are being dropped by Logcollector.

These warnings should never be ignored because they often signal that critical security events are being lost before they can be analyzed or correlated.

What the “Logcollector Dropped Messages” Warning Means in Wazuh

The “Logcollector dropped messages” warning typically appears when the Wazuh Logcollector component cannot move incoming log events into the internal processing queues as fast as they arrive.

Once the queue reaches its maximum capacity, newly arriving events are discarded to prevent the agent or manager from becoming unresponsive.

Warnings of this kind look similar to the following (the exact wording varies by Wazuh version):

wazuh-logcollector: WARNING: Messages were dropped due to full event queue
wazuh-logcollector: WARNING: Logcollector queue is full
wazuh-logcollector: WARNING: Discarding events due to queue saturation

In simple terms, Wazuh is receiving more logs than it can process at that moment. As a result, some events never reach the analysis engine and are permanently lost.

Why Dropped Logs Are a Serious Security and Compliance Concern

Dropped messages create blind spots in your security monitoring program.

If important events are discarded before analysis, Wazuh cannot generate alerts, trigger active responses, or store the data for future investigations.

Potential consequences include:

  • Missed indicators of compromise (IOCs)
  • Undetected brute-force attacks
  • Incomplete audit trails
  • Delayed incident response
  • Compliance violations due to missing log records

According to guidance from the National Institute of Standards and Technology (NIST), complete log collection and retention are critical components of effective security monitoring and incident response.

Missing security logs can significantly reduce an organization’s ability to detect and investigate threats.

Security monitoring experts at SANS Institute similarly emphasize that log visibility gaps often become a major obstacle during forensic investigations because analysts cannot reconstruct events that were never collected.

Common Symptoms Administrators Observe

When Logcollector begins dropping events, administrators often notice other performance-related issues throughout the Wazuh environment.

Common symptoms include:

  • Missing alerts that previously triggered correctly
  • Delays in event processing
  • Gaps in dashboard visualizations
  • Unexpected reductions in event counts
  • Increased CPU or memory utilization
  • Queue-related warnings in ossec.log
  • Agent communication bottlenecks

In heavily loaded environments, dropped messages may occur alongside resource issues covered in our guide on Why Is Wazuh Using High CPU? Troubleshooting Guide.

Similarly, environments experiencing OpenSearch resource bottlenecks may benefit from How to Tune OpenSearch Heap Size to Stop Wazuh High Memory Crashes.

What Readers Will Learn in This Guide

In this guide, you’ll learn:

  • How Wazuh Logcollector works internally
  • Why dropped message warnings occur
  • Where to locate Logcollector-related errors
  • How to identify queue saturation issues
  • How to troubleshoot performance bottlenecks
  • Configuration changes that prevent message loss
  • Best practices for handling high-volume log sources
  • Long-term strategies to improve Wazuh scalability

By the end of this article, you’ll have a structured process for diagnosing and fixing Logcollector dropped message warnings while ensuring critical security events are reliably collected and analyzed.


Understanding Wazuh Logcollector

What Is Wazuh Logcollector?

Logcollector is one of the core Wazuh agent components responsible for gathering log data from monitored systems.

It continuously reads configured log sources and forwards events into the Wazuh processing pipeline for analysis.

The component supports numerous log sources, including:

  • Linux system logs
  • Windows Event Logs
  • Application logs
  • Audit logs
  • Web server logs
  • Security software logs
  • Custom text-based log files

Without Logcollector, Wazuh would have no mechanism for ingesting endpoint log data.

Purpose of the Logcollector Component

The primary responsibility of Logcollector is to act as the data ingestion layer between operating system logs and the Wazuh detection engine.

Its duties include:

  • Reading configured log files
  • Monitoring log changes in real time
  • Parsing incoming entries
  • Buffering events
  • Forwarding events to internal processing queues

Logcollector is designed for efficiency, but every environment has practical limits.

Excessive event rates can overwhelm the queueing system and lead to dropped messages.

How Logcollector Gathers Logs from Monitored Endpoints

Logcollector continuously watches configured log sources defined in the Wazuh configuration.

For example, the following block in the agent configuration tells Logcollector to follow the authentication log:

<localfile>
  <location>/var/log/auth.log</location>
  <log_format>syslog</log_format>
</localfile>

On Windows systems, Logcollector can subscribe directly to Windows Event Channels, allowing near real-time monitoring of security, application, and system events.

For organizations monitoring Windows infrastructure, see our How to Monitor Windows Event Logs Using Wazuh guide.

Relationship Between Logcollector, analysisd, and the Wazuh Manager

Several Wazuh components work together to process events:

  1. Logcollector
    • Reads logs from the operating system.
  2. Agent Communication Layer
    • Sends collected events to the manager.
  3. analysisd
    • Decodes logs and evaluates detection rules.
  4. Indexer
    • Stores processed alerts and events.
  5. Dashboard
    • Displays results to analysts.

If any component downstream becomes overloaded, queues can fill and eventually cause Logcollector to drop messages.

How Log Processing Works in Wazuh

Log Collection

The process begins when Logcollector reads data from configured log sources.

Examples include:

  • /var/log/auth.log
  • /var/log/syslog
  • Windows Security Logs
  • Apache access logs
  • Firewall logs

For web server monitoring, see How to Monitor Apache Logs with Wazuh.

Event Buffering and Queues

Collected events are placed into internal queues before processing.

These queues help absorb temporary spikes in log volume and prevent downstream components from becoming overwhelmed.

However, queues have finite capacity. When event production exceeds consumption rates for extended periods, saturation occurs.

Event Analysis and Rule Matching

Once events reach the manager, the analysisd process:

  • Decodes log entries
  • Applies rules
  • Performs correlation
  • Identifies suspicious activity

This stage is computationally intensive and often becomes a bottleneck in large deployments.

Alert Generation and Indexing

Events matching detection rules generate alerts.

These alerts are then:

  • Indexed into OpenSearch
  • Stored for historical analysis
  • Displayed in the dashboard
  • Used for reporting and investigations

If indexing performance degrades, queue backlogs may propagate upstream toward Logcollector.

What Happens When Messages Are Dropped?

Queue Saturation

Queue saturation occurs when events arrive faster than they can be processed.

Once the queue reaches its maximum capacity:

  • New events cannot be accepted
  • Logcollector starts discarding messages
  • Warning messages appear in logs

This behavior prevents complete service failure but results in data loss.

Lost Security Events

The most immediate consequence is the permanent loss of security events.

Examples include:

  • Failed login attempts
  • Malware detections
  • Privilege escalation activity
  • File modifications
  • Network intrusion alerts

If these events never enter the analysis pipeline, Wazuh cannot detect them.

Delayed Detection

Even before events are dropped, overloaded queues often introduce processing delays.

Security teams may notice:

  • Alerts arriving minutes late
  • Dashboards showing stale data
  • Correlation rules triggering after the fact

Delayed detection increases attacker dwell time and reduces response effectiveness.

Impact on Investigations and Auditing

From a forensic perspective, missing logs are particularly problematic.

Investigators rely on complete timelines to answer questions such as:

  • When did the attack begin?
  • Which systems were affected?
  • What actions did the attacker perform?

Dropped events create gaps that may make definitive conclusions impossible.

For organizations subject to compliance frameworks such as PCI DSS, HIPAA, or ISO 27001, missing audit records can also introduce regulatory concerns.

This is why many security architects treat dropped-message warnings as high-priority operational issues rather than harmless performance notifications.


Common Logcollector Dropped Message Errors

Typical Warning Messages

The exact wording varies by Wazuh version, but administrators commonly encounter warnings similar to the following:

wazuh-logcollector: WARNING: Messages were dropped due to full event queue

This warning indicates that incoming events exceeded the queue’s processing capacity.

wazuh-logcollector: WARNING: Logcollector queue is full

This message confirms that the queue has reached its maximum size and cannot accept additional events.

wazuh-logcollector: WARNING: Discarding events due to queue saturation

This warning indicates that new log events are being intentionally dropped to protect overall system stability.

In some environments, these warnings may appear together with other performance-related messages involving analysisd, agent communication, or indexing services.

Where to Find These Errors

Identifying the source and frequency of dropped-message warnings is the first step toward remediation.

Linux Agents

On Linux systems, Logcollector warnings are typically written to:

/var/ossec/logs/ossec.log

Useful commands:

grep -i "dropped" /var/ossec/logs/ossec.log
grep -i "queue" /var/ossec/logs/ossec.log

You can also monitor the log in real time:

tail -f /var/ossec/logs/ossec.log

Windows Agents

On Windows agents, Logcollector warnings are generally stored in:

C:\Program Files (x86)\ossec-agent\ossec.log

Administrators can review the file directly or search using PowerShell:

Select-String -Path "C:\Program Files (x86)\ossec-agent\ossec.log" -Pattern "queue"

If the issue occurs on Windows systems, you may also want to review Why the Wazuh Windows Agent Service Starts Then Stops (And How to Fix It) for related agent-side troubleshooting techniques.

Wazuh Manager

The manager often provides additional context regarding downstream bottlenecks.

The manager writes its own log to the same default path used on Linux agents:

/var/ossec/logs/ossec.log

Look for:

  • Queue overflow warnings
  • analysisd performance issues
  • Communication delays
  • Indexing bottlenecks
  • Resource exhaustion messages

These messages frequently reveal whether the root cause exists on the agent or manager side.

Manager-Side Logs

Useful manager diagnostics include:

grep -i "analysisd" /var/ossec/logs/ossec.log
grep -i "queue" /var/ossec/logs/ossec.log
grep -i "overflow" /var/ossec/logs/ossec.log

Reviewing these entries helps determine whether queue saturation originates from:

  • Excessive event volume
  • Slow rule processing
  • Hardware resource limitations
  • OpenSearch indexing delays

Monitoring Dashboard Alerts

The Wazuh Dashboard may reveal symptoms before administrators notice dropped-message warnings in log files.

Watch for:

  • Unexpected decreases in event counts
  • Missing alerts
  • Delayed dashboards
  • Data ingestion gaps
  • Indexing backlogs

Detecting these warning signs early can prevent minor queue pressure from escalating into significant event loss.


Root Causes of Logcollector Dropped Messages

Understanding the root cause is critical before attempting any remediation.

While the warning itself indicates queue saturation, the underlying reason can vary significantly between environments.

Excessive Log Volume

The most common cause of dropped messages is simply generating more logs than the Wazuh agent can process.

When event rates exceed processing capacity for an extended period, internal buffers begin filling until the queue reaches its maximum limit.

High-Frequency Applications

Certain applications generate thousands of events per second under normal operation.

Examples include:

  • Web servers handling large traffic volumes
  • Reverse proxies
  • Database servers
  • SIEM forwarding systems
  • Containerized workloads

A busy Apache or Nginx server can generate enormous amounts of access log data, especially during peak traffic periods.

Debug Logging Enabled

Debug logging is frequently enabled during troubleshooting and then forgotten. Debug logs typically generate significantly more events than normal operational logs.

Common examples include:

  • Application debugging
  • Database query logging
  • Verbose authentication logging
  • Network packet debugging

These logs often provide little long-term security value while consuming substantial processing resources.

Burst Traffic Scenarios

Even well-sized environments can experience temporary logging spikes.

Examples include:

  • Brute-force attacks
  • Vulnerability scans
  • Malware outbreaks
  • System outages
  • Large application deployments

During these bursts, log production may temporarily exceed processing capacity, causing queues to fill rapidly.

Logcollector Queue Overflow

Logcollector relies on internal queues to buffer incoming events before forwarding them through the processing pipeline.

Internal Event Queue Limitations

Every queue has a finite size.

Queues are designed to absorb short-term traffic spikes, not sustain unlimited event ingestion.

Once the queue reaches capacity:

  • New events cannot be accepted
  • Warning messages appear
  • Log entries are discarded

This protective behavior prevents service crashes but introduces data loss.

Event Production Exceeding Processing Capacity

Dropped messages occur when event generation consistently exceeds event consumption.

In practical terms:

Events Generated > Events Processed

The larger this imbalance becomes, the faster queues fill.

Even modest overloads can eventually result in dropped events if sustained long enough.
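To make the imbalance concrete, the sketch below estimates how long a queue survives a sustained overload. The function name and the sample figures (a 5,000-event queue, 550 events produced and 500 consumed per second) are illustrative assumptions, not measured Wazuh values.

```bash
# Estimate how many seconds a fixed-size queue lasts under sustained overload.
# Arguments: queue capacity (events), produced EPS, consumed EPS.
# All three inputs are hypothetical numbers taken from your own measurements.
seconds_until_full() {
  local capacity="$1" produced_eps="$2" consumed_eps="$3"
  local surplus=$(( produced_eps - consumed_eps ))

  if [ "$surplus" -le 0 ]; then
    # Consumption keeps up with production, so the queue never fills.
    echo "never"
    return 0
  fi

  # Integer division: whole seconds until the queue is full.
  echo $(( capacity / surplus ))
}

# Example: a 10% overload fills a 5,000-event queue in under two minutes.
seconds_until_full 5000 550 500   # prints 100
```

The point of the calculation is that the surplus, not the absolute event rate, determines how quickly drops begin.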

Wazuh Agent Resource Constraints

Resource shortages on monitored endpoints frequently contribute to dropped-message warnings.

High CPU Utilization

When CPU utilization remains elevated, Logcollector receives less processing time from the operating system.

Common causes include:

  • Antivirus scans
  • Database workloads
  • Backup operations
  • Resource-intensive applications
  • Competing monitoring agents

As CPU contention increases, queue backlogs often follow.

Memory Pressure

Insufficient available memory can affect queue management and event processing.

Symptoms may include:

  • Increased swapping
  • Slower event handling
  • Delayed forwarding
  • Overall agent performance degradation

Disk Bottlenecks

Logcollector constantly reads log files from disk.

Slow storage can significantly reduce ingestion performance.

Particularly problematic scenarios include:

  • Traditional HDD storage
  • Shared storage environments
  • Heavily fragmented file systems
  • Overloaded virtual machines

Slow Wazuh Manager Processing

The issue isn’t always on the agent side.

In many environments, agents are functioning correctly while the manager struggles to keep up.

Overloaded Manager

Large deployments may process millions of events daily.

When manager resources become exhausted, downstream queues begin backing up.

Common indicators include:

  • Increased CPU utilization
  • High memory consumption
  • Delayed alert generation
  • Communication bottlenecks

Rule Processing Delays

Complex detection logic can increase processing times.

Examples include:

  • Heavy regex usage
  • Large custom rule sets
  • Extensive rule chaining
  • Complex correlation rules

Organizations with extensive custom detection logic should periodically review How to Create Custom Detection Rules in Wazuh (With Examples) to ensure rules remain efficient.

Analysis Bottlenecks

The analysisd process performs event decoding and rule evaluation.

If analysis throughput falls behind ingestion rates:

  • Internal queues grow
  • Processing delays increase
  • Event drops become more likely

According to the official Wazuh architecture documentation, event analysis is one of the most resource-intensive stages of the processing pipeline.

Excessive File Monitoring

Monitoring too many files can overwhelm Logcollector.

Monitoring Large Directories

Some administrators configure monitoring on directories containing:

  • Thousands of log files
  • Application-generated archives
  • Rotated logs
  • Temporary files

This dramatically increases processing overhead.

High-Churn Log Files

Certain files change extremely frequently.

Examples include:

  • Access logs
  • Security audit logs
  • Container logs
  • API gateway logs

These sources may generate hundreds or thousands of events every second.

Duplicate Log Collection Sources

Duplicate monitoring is more common than many administrators realize.

Examples include:

  • Monitoring the same file twice
  • Overlapping directory definitions
  • Multiple agents collecting identical logs
  • Syslog forwarding plus local collection

Duplicate collection effectively doubles event volume without providing additional visibility.
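One quick way to spot the first case (the same file monitored twice) is to list repeated location values in the agent configuration. The helper below is a rough sketch: the function name is made up for this example, and it performs a plain text match rather than parsing the XML.

```bash
# List <location> values that appear more than once in a Wazuh configuration file.
# Caveat: this is a plain text match, so it also sees <location> tags used by
# other configuration sections and does not resolve overlapping wildcards.
duplicate_locations() {
  local conf_file="$1"
  grep -o '<location>[^<]*</location>' "$conf_file" |
    sed -e 's/<location>//' -e 's/<\/location>//' |
    sort |
    uniq -d
}

# Typical use on a Linux agent (default configuration path):
#   duplicate_locations /var/ossec/etc/ossec.conf
```

An empty result means no location string is repeated verbatim; overlapping wildcards and syslog-plus-local duplication still need a manual review.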

Improper Agent Configuration

Poor configuration choices often create unnecessary workload.

Inefficient localfile Definitions

Broad log collection rules may gather far more data than intended.

For example:

<localfile>
  <location>/var/log/*</location>
  <log_format>syslog</log_format>
</localfile>

Configurations like this can ingest numerous low-value logs.
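Before keeping a wildcard location, it helps to see how many files it actually covers. The following sketch counts the regular files a pattern matches at the moment it runs; the function name is hypothetical and the check is only an approximation of what Logcollector would pick up.

```bash
# Count the regular files a wildcard location would match right now.
# The pattern is left unquoted inside the loop on purpose so the shell
# expands it (patterns containing spaces are not handled by this sketch).
count_matching_files() {
  local pattern="$1" count=0 path
  for path in $pattern; do
    if [ -f "$path" ]; then
      count=$(( count + 1 ))
    fi
  done
  echo "$count"
}

# Example: how many files does the broad definition above cover?
#   count_matching_files '/var/log/*'
```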

Redundant Monitoring Rules

Administrators sometimes leave old configurations in place after migrations or upgrades.

This can result in:

  • Duplicate event collection
  • Increased resource consumption
  • Unnecessary queue pressure

Excessive Monitored Paths

Every monitored path consumes resources.

The more files Logcollector must track, the greater the processing overhead.

Following the principle of collecting only security-relevant logs significantly improves performance and scalability.


How to Diagnose Dropped Messages

Effective troubleshooting begins with identifying where the bottleneck exists.

The goal is to determine whether the problem originates from excessive event generation, insufficient agent resources, or downstream processing limitations.

Step 1: Check the Wazuh Logs

The first step is confirming that dropped-message warnings are occurring and determining their frequency.

Linux

Search the Wazuh logs for dropped-event warnings:

grep -i "dropped" /var/ossec/logs/ossec.log

Additional useful searches include:

grep -i "queue" /var/ossec/logs/ossec.log
grep -i "overflow" /var/ossec/logs/ossec.log

Review timestamps carefully to identify recurring patterns.
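To make recurring patterns easier to see, the warnings can be grouped by hour. This is a minimal sketch that assumes each ossec.log line starts with a "YYYY/MM/DD HH:MM:SS" timestamp; the function name and the default search pattern are choices made for this example.

```bash
# Count matching warnings per hour in an ossec.log style file.
# Assumes lines begin with "YYYY/MM/DD HH:MM:SS" (plain-text log format).
warnings_per_hour() {
  local log_file="$1"
  local pattern="${2:-dropped|queue}"
  grep -iE -- "$pattern" "$log_file" |
    awk '{ bucket = $1 " " substr($2, 1, 2) ":00"; count[bucket]++ }
         END { for (b in count) print b, count[b] }' |
    sort
}

# Example:
#   warnings_per_hour /var/ossec/logs/ossec.log
```

Each output line shows a date, an hour, and the number of matching warnings in that hour, which makes it straightforward to line the spikes up with backups, scans, or traffic peaks.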

Windows

On Windows agents, use PowerShell:

Select-String -Path "C:\Program Files (x86)\ossec-agent\ossec.log" -Pattern "dropped"

You can also search for queue-related warnings:

Select-String -Path "C:\Program Files (x86)\ossec-agent\ossec.log" -Pattern "queue"

Document:

  • Warning frequency
  • Time of occurrence
  • Affected systems
  • Associated performance events

Step 2: Measure Log Volume

Before making configuration changes, determine how much data the system is attempting to process.

Identify Noisy Log Sources

Look for sources generating unusually high event counts.

Common offenders include:

  • Web server access logs
  • Firewall logs
  • Audit logs
  • Authentication logs
  • Container logs

Review file growth rates and event generation patterns.

Estimate Events Per Second

A useful metric is Events Per Second (EPS).

For example:

600,000 events per hour ÷ 3,600 seconds
≈ 167 EPS

Higher EPS environments require more careful tuning.
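When no event counter is available, a rough EPS figure for a single source can be obtained by sampling the log file's line count twice. The sketch below assumes one event per line and no log rotation during the sampling window; the function name is illustrative.

```bash
# Estimate events per second for one log file by sampling its line count twice.
# Assumes one event per line and no log rotation during the sampling window.
measure_eps() {
  local log_file="$1" interval="${2:-10}"
  local before after
  before=$(wc -l < "$log_file")
  sleep "$interval"
  after=$(wc -l < "$log_file")
  # Integer division, so rates below 1 EPS are reported as 0.
  echo $(( (after - before) / interval ))
}

# Example: sample the authentication log for 60 seconds.
#   measure_eps /var/log/auth.log 60
```

Running it against each monitored file during a busy period gives a quick ranking of the noisiest sources.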

Determine Peak Logging Periods

Many systems exhibit predictable traffic spikes.

Examples include:

  • Business hours
  • Scheduled scans
  • Backup windows
  • Software deployments
  • Security incidents

If dropped messages align with these periods, temporary overload is likely the root cause.

Step 3: Monitor Agent Resource Usage

Agent-side resource constraints are a common contributor to queue saturation.

Linux

Monitor CPU and memory usage:

top
htop
vmstat 5

Look for:

  • High CPU utilization
  • Memory exhaustion
  • Excessive swapping
  • I/O wait conditions
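To focus on the Wazuh processes rather than the whole system, the `ps` output can be filtered. This sketch assumes the agent daemons are named with a wazuh- prefix (as in the wazuh-logcollector log prefix shown earlier); the function names are made up for the example.

```bash
# Summarize CPU and memory for Wazuh processes from `ps` output on stdin.
# Expects three columns: %CPU, RSS in KiB, command name.
summarize_wazuh_usage() {
  awk '$3 ~ /^wazuh-/ { print $3, "cpu=" $1 "%", "rss_mb=" int($2 / 1024) }'
}

# Live wrapper. Assumes daemon names start with "wazuh-"; adjust the
# pattern in summarize_wazuh_usage if your installation differs.
wazuh_process_usage() {
  ps -eo pcpu,rss,comm | summarize_wazuh_usage
}

# Example:
#   wazuh_process_usage
```

Linux truncates command names in the comm column, so a daemon may appear with a shortened name.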

Windows

Review resource consumption using:

  • Task Manager
  • Resource Monitor
  • Performance Monitor

Pay particular attention to:

  • Processor utilization
  • Available memory
  • Disk activity
  • Wazuh agent processes

Sustained resource pressure often correlates directly with dropped-event warnings.

Step 4: Review Manager Health

Even if agents appear healthy, the manager may be struggling to process incoming events.

CPU Utilization

Check whether manager CPU usage remains consistently elevated.

Linux example:

top

Pay close attention to:

  • analysisd
  • wazuh-db
  • Indexer-related processes

Memory Usage

Evaluate:

  • Available RAM
  • Swap utilization
  • OpenSearch heap consumption

High memory pressure may indicate a broader scaling issue.

For memory-related environments, see How to Tune OpenSearch Heap Size to Stop Wazuh High Memory Crashes.

Event Throughput

Compare:

Events Received
vs
Events Processed

If processing consistently lags behind ingestion, queue growth is expected.
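One possible way to put numbers on this comparison is to read the counters the analysis daemon writes to its state file. The sketch below rests on two assumptions that should be checked against your Wazuh version: that the file lives at /var/ossec/var/run/wazuh-analysisd.state, and that it contains key='value' lines named events_received and events_dropped. The function names are hypothetical.

```bash
# Read one counter from a key='value' style state file.
state_value() {
  local state_file="$1" key="$2"
  sed -n "s/^${key}='\([^']*\)'.*/\1/p" "$state_file" | head -n 1
}

# Percentage of received events that were dropped, as a whole number.
# ASSUMPTION: the file exposes events_received and events_dropped counters.
drop_percent() {
  local state_file="$1" received dropped
  received=$(state_value "$state_file" events_received)
  dropped=$(state_value "$state_file" events_dropped)
  received=${received:-0}
  dropped=${dropped:-0}
  if [ "$received" -eq 0 ]; then
    echo 0
    return 0
  fi
  echo $(( dropped * 100 / received ))
}

# Example (the path is an assumption, see above):
#   drop_percent /var/ossec/var/run/wazuh-analysisd.state
```

A value that stays above zero across several readings suggests the manager, rather than the agent, is where events are being lost.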

Indexing Performance

Slow indexing can create downstream bottlenecks that propagate backward through the pipeline.

Review:

  • Indexing latency
  • OpenSearch health
  • Cluster resource utilization

The official OpenSearch performance guidance highlights indexing throughput as a critical factor in log analytics platforms.

Step 5: Identify Queue Saturation

The final step is confirming whether queue saturation is the direct cause of event loss.

Queue-Related Warnings

Look for messages such as:

queue is full
messages dropped
discarding events
event queue overflow

These warnings provide strong evidence of queue exhaustion.

Backlog Indicators

Signs of growing backlogs include:

  • Increasing event latency
  • Delayed alerts
  • Increasingly frequent queue warnings
  • Reduced dashboard responsiveness

Processing Delays

Compare event timestamps with alert timestamps.

If alerts consistently arrive minutes after the original event occurred, processing bottlenecks are likely present.
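The comparison can be reduced to a single number of seconds. The helper below is a small sketch that relies on `date -d`, which GNU coreutils provides; the function name and the sample timestamps are illustrative, and the timestamps are treated as UTC.

```bash
# Seconds between an event timestamp and its alert timestamp.
# Timestamps use "YYYY-MM-DD HH:MM:SS" and are interpreted as UTC.
# Relies on `date -d` (GNU coreutils); BSD and macOS date use other flags.
processing_lag_seconds() {
  local event_ts="$1" alert_ts="$2"
  local event_epoch alert_epoch
  event_epoch=$(date -u -d "$event_ts" +%s) || return 1
  alert_epoch=$(date -u -d "$alert_ts" +%s) || return 1
  echo $(( alert_epoch - event_epoch ))
}

# Example: an alert raised three and a half minutes after the event.
processing_lag_seconds "2024-01-15 10:00:00" "2024-01-15 10:03:30"   # prints 210
```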

At this point, you should have enough information to determine whether the issue stems from:

  • Excessive log generation
  • Agent resource limitations
  • Manager performance problems
  • Queue configuration constraints
  • Indexing bottlenecks

The next step is reducing unnecessary event volume before pursuing more advanced tuning options.


Fix 1: Reduce Excessive Log Volume

In many environments, reducing unnecessary log volume is the fastest and safest solution for Logcollector dropped-message warnings.

Rather than increasing hardware resources immediately, start by eliminating logs that provide little security value.

This approach often resolves queue saturation while improving overall Wazuh performance.

Disable Unnecessary Debug Logging

Debug logging is one of the most common sources of excessive event generation.

Application Debug Logs

Many applications support verbose debug modes intended only for troubleshooting.

Examples include:

  • Web applications
  • Databases
  • Authentication services
  • Reverse proxies
  • Security tools

After troubleshooting is complete, debug logging should typically be disabled.

Temporary Troubleshooting Logs

Administrators frequently enable:

  • Audit tracing
  • Verbose service logging
  • Detailed transaction logging

These settings often remain enabled long after the original issue has been resolved.

Regular audits of logging configurations can eliminate significant amounts of unnecessary traffic.

Exclude Low-Value Logs

Not every log source provides meaningful security insights.

Removing low-value data reduces processing overhead and improves signal-to-noise ratio.

Temporary Files

Avoid monitoring:

  • Temporary directories
  • Cache locations
  • Installation logs
  • Application scratch files

These sources rarely contribute to threat detection.

Repetitive Informational Events

Examples include:

  • Successful health checks
  • Routine status messages
  • Periodic heartbeat logs
  • Service startup confirmations

Large volumes of repetitive informational data can overwhelm processing pipelines.
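Before excluding a repetitive message, it is worth measuring how much of a log it actually represents. The sketch below reports the share of lines matching a pattern; the function name, the sample log path, and the health-check pattern are all hypothetical.

```bash
# Percentage of lines in a log file that match a "noise" pattern (extended regex).
noise_percent() {
  local log_file="$1" pattern="$2" total matched
  total=$(wc -l < "$log_file")
  # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
  matched=$(grep -cE -- "$pattern" "$log_file" || true)
  if [ "$total" -eq 0 ]; then
    echo 0
    return 0
  fi
  echo $(( matched * 100 / total ))
}

# Example with a hypothetical access log and health-check path:
#   noise_percent /var/log/nginx/access.log 'GET /healthz'
```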

Non-Security Logs

Focus collection efforts on logs relevant to:

  • Authentication
  • Authorization
  • Privilege escalation
  • Malware detection
  • Network activity
  • Configuration changes

Security-focused collection strategies improve both performance and detection quality.

Filter Events at the Source

Whenever possible, reduce event volume before Logcollector processes the data.

Filtering at the source is generally more efficient than collecting everything and filtering later.

Examples of Targeted Log Collection

Instead of monitoring an entire log directory, point Logcollector at the specific files you need:

<localfile>
  <location>/var/log/auth.log</location>
  <log_format>syslog</log_format>
</localfile>

Use narrowly scoped definitions focused on security-relevant sources.

Avoid broad configurations such as:

<localfile>
  <location>/var/log/*</location>
  <log_format>syslog</log_format>
</localfile>

Targeted monitoring reduces:

  • Event volume
  • Queue pressure
  • Resource consumption
  • Processing latency

Organizations collecting firewall logs should also review How to Collect Firewall Logs in Wazuh to ensure only valuable security events are being ingested.

Benefits of Reducing Noise

Reducing unnecessary log volume provides benefits beyond simply eliminating dropped-message warnings.

Advantages include:

  • Lower CPU utilization
  • Reduced memory consumption
  • Faster rule processing
  • Improved dashboard responsiveness
  • Better alert quality
  • Lower storage requirements
  • More efficient investigations

Many Wazuh administrators discover that 20–40% of collected events provide little operational value and can be safely removed from the pipeline.

Security monitoring experts at the Center for Internet Security (CIS) consistently recommend prioritizing high-value security telemetry over collecting every available log source.

Before increasing queue sizes or adding infrastructure, reducing unnecessary log volume should almost always be the first optimization step.


Fix 2: Increase Logcollector Queue Capacity

If excessive log volume cannot be reduced further, the next step is increasing the amount of buffering available to Logcollector.

Larger queues allow Wazuh to absorb temporary traffic spikes without immediately dropping events.

It’s important to understand, however, that increasing queue capacity treats the symptom rather than the root cause.

If event generation consistently exceeds processing capacity, larger queues will eventually fill as well.

Understanding Event Queues

Logcollector uses internal event queues to temporarily store events before forwarding them through the Wazuh processing pipeline.

These queues serve several purposes:

  • Buffer bursts of incoming logs
  • Smooth out temporary processing delays
  • Prevent immediate event loss
  • Improve pipeline stability

When queues become full, Logcollector begins discarding new events and generates the dropped-message warnings discussed earlier.

Queue Sizing Fundamentals

Queue sizing should be based on expected event rates and peak workload conditions.

For example:

Average Event Rate: 200 EPS
Peak Event Rate: 1,000 EPS
Peak Duration: 5 minutes

A queue that can absorb several minutes of peak traffic provides much greater resilience than one sized only for average conditions.

However, increasing queue sizes also increases:

  • Memory consumption
  • Event latency during overload conditions
  • Recovery times after spikes

Finding the right balance is critical.
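The sizing example above can be turned into a rough calculation. The sketch below uses the peak rate and duration from that example and adds one assumption that is not in the example: a drain rate of 500 EPS, meaning the rate at which the pipeline empties the queue. The function name is illustrative.

```bash
# Events that pile up during a burst: (peak rate - drain rate) * burst duration.
# drain_eps is the rate at which the pipeline empties the queue; it is a
# hypothetical input here and must come from your own measurements.
burst_backlog() {
  local peak_eps="$1" drain_eps="$2" duration_seconds="$3"
  local surplus=$(( peak_eps - drain_eps ))
  if [ "$surplus" -le 0 ]; then
    # The pipeline keeps up with the burst, so nothing accumulates.
    echo 0
    return 0
  fi
  echo $(( surplus * duration_seconds ))
}

# Peak 1,000 EPS for 5 minutes (300 s) with an assumed drain rate of 500 EPS:
burst_backlog 1000 500 300   # prints 150000
```

With these assumed numbers the backlog reaches 150,000 events, and a 5,000-entry queue would fill about ten seconds into the burst. That gap is why queue increases usually need to be paired with lower volume or higher throughput.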

Throughput Considerations

Before increasing queue capacity, determine whether the bottleneck is:

  • Event generation
  • Agent processing
  • Network communication
  • Manager analysis
  • OpenSearch indexing

If the downstream systems remain overloaded, larger queues simply delay eventual event loss.

For this reason, queue tuning should be combined with the performance optimizations discussed later in this guide.

Modifying Queue Settings

Depending on your Wazuh version and deployment architecture, queue-related parameters can be adjusted to provide additional buffering capacity.

Always back up configuration files before making changes.
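A timestamped copy kept next to the original is usually enough. The helper below is a minimal sketch; the function name and the backup naming scheme are choices made for this example.

```bash
# Copy a configuration file to a timestamped backup next to the original
# and print the backup path. -p preserves mode, ownership and timestamps.
backup_config() {
  local conf_file="$1"
  local backup_file
  backup_file="${conf_file}.bak.$(date +%Y%m%d-%H%M%S)"
  cp -p "$conf_file" "$backup_file" || return 1
  echo "$backup_file"
}

# Example (default agent configuration path on Linux):
#   backup_config /var/ossec/etc/ossec.conf
```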

Relevant Configuration Options

Common areas to review include:

  • Agent internal queue settings
  • Manager queue settings
  • Analysis queues
  • Remoted queues
  • OpenSearch ingestion capacity

Refer to the official Wazuh documentation for version-specific queue parameters.

Example Configuration Snippets

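On agents, the anti-flooding buffer is controlled by the <client_buffer> block in ossec.conf. The values below match the documented defaults, so treat them as the baseline to raise from: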

<client_buffer>
  <disabled>no</disabled>
  <queue_size>5000</queue_size>
  <events_per_second>500</events_per_second>
</client_buffer>

The exact options available depend on the Wazuh release you’re running.

When modifying queue sizes:

  1. Increase gradually.
  2. Test under realistic load.
  3. Monitor memory usage.
  4. Verify dropped-message warnings disappear.

Avoid making large increases without validating system capacity.

Restarting the Agent

After applying configuration changes, restart the agent so the new settings take effect.

Linux

systemctl restart wazuh-agent

Verify the service starts successfully:

systemctl status wazuh-agent

If the service fails after configuration changes, consult our How to Fix ossec.conf Syntax Errors in Wazuh Agents guide.

Windows

Restart the Wazuh service using PowerShell:

Restart-Service Wazuh

You can confirm the service status using:

Get-Service Wazuh

After the restart, monitor ossec.log for at least several hours to ensure dropped-message warnings no longer appear.
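To check only the period after the restart, warnings can be counted from a given timestamp onward. This sketch assumes ossec.log lines begin with "YYYY/MM/DD HH:MM:SS"; the function name, the default pattern, and the sample restart time are illustrative.

```bash
# Count queue or drop warnings logged at or after a given timestamp.
# ossec.log lines start with "YYYY/MM/DD HH:MM:SS", so plain string
# comparison on the first two fields orders them chronologically.
# The pattern is matched against the lowercased line, so keep it lowercase.
warnings_since() {
  local log_file="$1" since="$2" pattern="${3:-dropped|queue}"
  awk -v since="$since" -v pat="$pattern" '
    ($1 " " $2) >= since && tolower($0) ~ pat { n++ }
    END { print n + 0 }
  ' "$log_file"
}

# Example: warnings since a restart at 14:00 on 15 January 2024.
#   warnings_since /var/ossec/logs/ossec.log "2024/01/15 14:00:00"
```

A count that stays at 0 through the next peak logging period is a reasonable sign that the change worked.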

Increasing queue capacity often resolves temporary burst-related issues, but environments experiencing sustained overload typically require additional performance tuning.


Fix 3: Optimize Wazuh Agent Performance

When Logcollector is unable to keep up with incoming events, the underlying issue is often insufficient agent performance.

Improving the agent’s ability to collect and forward logs can dramatically reduce dropped messages.

Allocate More System Resources

Resource shortages frequently contribute to queue saturation and delayed event processing.

CPU

Log collection, parsing, and forwarding all consume CPU resources.

If the monitored endpoint consistently operates above 80–90% CPU utilization, Wazuh may struggle to process events efficiently.

Consider:

  • Adding additional CPU cores
  • Migrating resource-intensive workloads
  • Reducing competing background processes
  • Scheduling heavy jobs outside peak periods

High CPU conditions may also contribute to issues discussed in Why Is Wazuh Using High CPU? Troubleshooting Guide.

RAM

Insufficient memory can create processing delays and excessive swapping.

Monitor:

  • Available memory
  • Swap usage
  • Page faults
  • Wazuh process memory consumption

Adding RAM often improves queue handling during periods of elevated log volume.

Disk I/O

Logcollector continuously reads files from storage.

Slow storage devices can become a significant bottleneck.

Common indicators include:

  • High disk wait times
  • Elevated I/O latency
  • Slow file access

SSD storage typically performs substantially better than traditional hard drives for log-intensive workloads.

Reduce Concurrent Monitoring Tasks

Many administrators enable multiple Wazuh features simultaneously without considering their cumulative resource requirements.

Reducing concurrent workloads can free resources for Logcollector.

Log Collection

Review monitored log sources and eliminate unnecessary collection points.

Focus on:

  • Security logs
  • Authentication logs
  • Critical application logs

Avoid collecting excessive operational noise.

File Integrity Monitoring

Real-time File Integrity Monitoring (FIM) can generate significant system activity.

Review your configuration to ensure only critical directories are monitored.

For optimization recommendations, see:

How to Configure File Integrity Monitoring (FIM) in Wazuh

Vulnerability Scans

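Vulnerability detection relies on periodic inventory scans of each endpoint (the correlation itself runs on the manager), and those scans can be resource-intensive on busy systems.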

If scans coincide with periods of heavy logging, consider:

  • Scheduling scans during off-hours
  • Reducing scan frequency
  • Staggering scans across systems

This prevents resource contention between scanning and log collection.
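
One simple way to stagger scans across systems is to derive a stable start offset from each hostname, so that hosts do not all begin scanning at the same minute. The Python sketch below is only an illustration: the function name, the host names, and the four-hour window are assumptions, and applying the offset to your scheduler is left out.

```python
import hashlib


def stagger_offset_minutes(hostname, window_minutes=240):
    """Map a hostname to a stable start offset inside the scan window."""
    digest = hashlib.sha256(hostname.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % window_minutes


# Hypothetical host names, used only to show the spread of offsets.
for host in ["web-01", "web-02", "db-01"]:
    print(host, stagger_offset_minutes(host))
```

Because the offset is computed from a hash rather than drawn at random, each host keeps the same slot from run to run, which makes resource usage easier to compare over time.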

Update the Agent

Older Wazuh versions may contain performance limitations, bugs, or inefficiencies that contribute to dropped events.

Keeping agents updated is one of the simplest ways to improve stability.

Benefits of Newer Versions

Recent releases often include:

  • Performance optimizations
  • Improved queue management
  • Better memory utilization
  • Enhanced scalability
  • Bug fixes

These improvements can significantly reduce dropped-message occurrences in high-volume environments.

How Updates Improve Performance

New versions frequently introduce:

  • More efficient event handling
  • Improved thread management
  • Reduced resource consumption
  • Enhanced communication reliability

Before upgrading, review release notes and compatibility requirements.

To safely perform upgrades, follow:

How to Upgrade a Wazuh Agent

The official Wazuh release documentation regularly highlights performance-related enhancements and bug fixes that can directly impact log collection reliability.

In many cases, a simple agent upgrade combined with modest resource improvements eliminates dropped-message warnings without requiring major architectural changes.


Fix 4: Optimize the Wazuh Manager

Many dropped-message incidents originate on the manager side rather than the agent itself.

If the manager cannot process incoming events quickly enough, queues begin backing up throughout the pipeline.

Optimizing manager performance is therefore a critical step when troubleshooting persistent Logcollector warnings.

Check Analysis Throughput

Begin by determining whether the manager can process events as quickly as they arrive.

Event Processing Rates

Compare:

Events Received Per Second
vs
Events Processed Per Second

If incoming volume consistently exceeds processing throughput, backlogs will eventually develop.

Monitor trends during:

  • Normal operations
  • Peak traffic periods
  • Security incidents
  • Scheduled scans
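
To make the received-versus-processed comparison concrete, one option is to read the counters that the analysis daemon writes to its state file on the manager. The Python sketch below assumes the file lives at /var/ossec/var/run/wazuh-analysisd.state and contains key='value' lines named events_received, events_processed, and events_dropped. Treat both the path and the counter names as assumptions and check them against your own installation, since they can differ between versions.

```python
import re

# Matches lines such as: events_received='1200'
STATE_LINE = re.compile(r"^(\w+)='([^']*)'$")


def parse_state(text):
    """Parse key='value' lines, skipping comments and blank lines."""
    counters = {}
    for line in text.splitlines():
        match = STATE_LINE.match(line.strip())
        if match:
            counters[match.group(1)] = match.group(2)
    return counters


def throughput_summary(counters):
    """Compare received, processed, and dropped counts for one interval."""
    received = int(counters.get("events_received", 0))
    processed = int(counters.get("events_processed", 0))
    dropped = int(counters.get("events_dropped", 0))
    return {
        "received": received,
        "processed": processed,
        "dropped": dropped,
        "backlog": received - processed,
    }
```

A non-zero dropped count, or a backlog that stays positive across many consecutive readings, suggests the manager is not keeping up during that period.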

Queue Backlogs

Review manager logs for evidence of:

  • Queue growth
  • Processing delays
  • Event accumulation
  • Resource exhaustion

Search the logs for keywords such as the following (exact message wording varies by Wazuh version):

queue full
event backlog
analysis delay
discarded events

Persistent queue growth almost always indicates insufficient processing capacity somewhere in the pipeline.

Scale Manager Resources

Resource limitations on the manager are among the most common causes of sustained queue saturation.

CPU Recommendations

The analysisd component is CPU-intensive because it must:

  • Decode logs
  • Evaluate rules
  • Perform correlation
  • Generate alerts

For larger environments:

  • Allocate additional CPU cores
  • Use modern processors
  • Monitor CPU utilization continuously

High event-volume deployments often require substantially more processing resources than small proof-of-concept environments.

Memory Sizing

Memory affects:

  • Event buffering
  • Rule processing
  • OpenSearch operations
  • Cache efficiency

Monitor:

  • Available RAM
  • Swap usage
  • JVM heap consumption
  • Memory pressure indicators

Memory shortages can create cascading bottlenecks throughout the Wazuh stack.

Tune OpenSearch Performance

Even when the manager performs well, slow indexing can cause backpressure throughout the environment.

Heap Configuration

Proper heap sizing is essential for stable indexing performance.

Key recommendations include:

  • Set Xms and Xmx to the same value
  • Avoid oversized heaps; a common rule of thumb is no more than half of system RAM and below roughly 32 GB
  • Monitor garbage collection activity
  • Track heap utilization trends
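
The heap recommendations above can be turned into a small calculation. The Python sketch below applies a widely cited rule of thumb (about half of system RAM, capped near 31 GB) and prints matching -Xms and -Xmx flags. The fraction and ceiling are assumptions for this example rather than values taken from this guide, so adjust them to your indexer's documented guidance.

```python
def suggest_heap_gb(total_ram_gb, ram_fraction=0.5, ceiling_gb=31):
    """Return a whole-GB heap size: a fraction of RAM, capped at a ceiling."""
    heap = int(total_ram_gb * ram_fraction)
    return max(1, min(heap, ceiling_gb))


def jvm_heap_flags(heap_gb):
    """Format identical minimum and maximum heap flags."""
    return [f"-Xms{heap_gb}g", f"-Xmx{heap_gb}g"]


for ram in (8, 16, 128):  # example RAM sizes in GB
    print(ram, jvm_heap_flags(suggest_heap_gb(ram)))
```

Keeping both flags identical avoids heap resizing at runtime, which is the reason for the first recommendation in the list.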

For a complete tuning walkthrough, see:

How to Tune OpenSearch Heap Size to Stop Wazuh High Memory Crashes

Indexing Optimization

Review:

  • Shard counts
  • Replica settings
  • Index lifecycle policies
  • Query workloads

Poor indexing configurations can significantly reduce throughput.

Storage Considerations

OpenSearch performance is heavily influenced by storage speed.

Best practices include:

  • Use SSD or NVMe storage
  • Minimize storage latency
  • Monitor disk queue lengths
  • Avoid heavily oversubscribed storage systems

According to the OpenSearch project, storage performance directly impacts indexing throughput and cluster responsiveness.

Consider Distributed Deployments

At some scale, infrastructure upgrades alone become insufficient.

Distributed architectures provide a more sustainable solution.

Multi-Node Architectures

Separating components across multiple systems can dramatically improve scalability.

Examples include:

  • Dedicated managers
  • Dedicated indexers
  • Dedicated dashboard nodes

This reduces resource contention and increases overall throughput.

Clustered Environments

For larger deployments, clustered architectures offer:

  • Improved fault tolerance
  • Better scalability
  • Higher event-processing capacity
  • Reduced risk of bottlenecks

Organizations managing large event volumes should evaluate clustered deployments as a long-term strategy.

For implementation guidance, see:

How to Build a Wazuh Indexer Cluster

The official Wazuh architecture guidance recommends distributed deployments for environments processing substantial event volumes or requiring high availability.

If dropped-message warnings persist after optimizing agents and managers, the next step is usually redesigning log collection strategies to further reduce ingestion pressure and improve scalability.


Fix 5: Review and Optimize Localfile Configurations

One of the most overlooked causes of Logcollector dropped-message warnings is inefficient log collection configuration.

Over time, many Wazuh deployments accumulate redundant, outdated, or overly broad localfile definitions that dramatically increase event volume.

Optimizing these configurations often provides immediate improvements without requiring additional hardware.

Audit Existing Log Sources

Start by reviewing every configured log source on affected agents.

The objective is to identify:

  • Duplicate monitoring entries
  • Unused log sources
  • Overly broad collection rules
  • Legacy configurations left behind after upgrades

Many environments discover that a significant percentage of monitored logs provide little security value.

Remove Duplicates

Duplicate log collection can double event volume without providing any additional visibility.

Examples include:

<localfile>
  <location>/var/log/auth.log</location>
  <log_format>syslog</log_format>
</localfile>

and later:

<localfile>
  <location>/var/log/*.log</location>
  <log_format>syslog</log_format>
</localfile>

In this scenario, auth.log may be collected twice.

Common sources of duplication include:

  • Multiple localfile definitions
  • Overlapping directory patterns
  • Configuration migrations
  • Copy-and-paste configuration errors
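
Overlaps like the auth.log example above can be spotted with a short script. The Python sketch below extracts every localfile location from a configuration string and reports literal paths that are also covered by a wildcard entry, plus exact repeats. It assumes the file parses as XML once wrapped in a single root element, and it uses Python's fnmatch rules, which only approximate how Logcollector expands wildcards. The function names are hypothetical.

```python
import fnmatch
import xml.etree.ElementTree as ET
from collections import Counter


def localfile_locations(conf_text):
    """Collect <location> values from every <localfile> block."""
    # ossec.conf may hold several <ossec_config> roots, so wrap them in one.
    root = ET.fromstring("<root>" + conf_text + "</root>")
    locations = []
    for block in root.iter("localfile"):
        location = block.findtext("location")
        if location:
            locations.append(location.strip())
    return locations


def find_duplicates(locations):
    """Return locations that are configured more than once."""
    return sorted(loc for loc, n in Counter(locations).items() if n > 1)


def find_overlaps(locations):
    """Return (path, pattern) pairs where a wildcard entry also covers path."""
    overlaps = []
    for path in locations:
        for pattern in locations:
            if "*" in pattern and "*" not in path:
                if fnmatch.fnmatchcase(path, pattern):
                    overlaps.append((path, pattern))
    return overlaps
```

Reading /var/ossec/etc/ossec.conf into a string and passing it through these functions gives a starting list for review. Each reported pair still needs a human decision about which entry to keep.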

Remove Unused Entries

Many environments continue monitoring logs for applications that no longer exist.

Examples include:

  • Retired applications
  • Legacy databases
  • Decommissioned services
  • Test environments

Removing unused entries reduces:

  • CPU utilization
  • Queue pressure
  • Memory consumption
  • Storage requirements

Monitor Only Required Files

Security monitoring works best when focused on high-value telemetry.

Prioritize:

  • Authentication logs
  • Security logs
  • Audit logs
  • Firewall logs
  • Critical application logs

Avoid collecting logs solely because they are available.

Examples of Efficient Monitoring Configurations

A focused configuration might look like:

<localfile>
  <location>/var/log/auth.log</location>
  <log_format>syslog</log_format>
</localfile>

<localfile>
  <location>/var/log/secure</location>
  <log_format>syslog</log_format>
</localfile>

This is typically more efficient than broad monitoring such as:

<localfile>
  <location>/var/log/*</location>
  <log_format>syslog</log_format>
</localfile>

Targeted collection reduces noise while preserving security visibility.

Avoid Monitoring High-Churn Directories

Some directories generate enormous numbers of file changes and log entries.

Monitoring these locations indiscriminately can overwhelm Logcollector.

Common Problematic Locations

Examples include:

/tmp
/var/tmp
/var/cache
/var/lib/docker
/var/log/containers

On Windows systems:

C:\Windows\Temp
C:\Temp
C:\ProgramData\Temp

These locations often contain:

  • Temporary files
  • Rotating logs
  • Container artifacts
  • Application cache data

Monitoring them usually produces more noise than actionable intelligence.

Alternative Approaches

Instead of monitoring entire directories:

  • Monitor specific files
  • Filter unwanted events
  • Collect summarized logs
  • Forward only security-relevant data

The principle is simple: collect less data, but collect higher-quality data.

This approach improves performance while enhancing the signal-to-noise ratio for security analysts.

The official Wazuh documentation recommends carefully defining monitored sources rather than relying on broad collection patterns.


Fix 6: Configure Rate Limiting and Event Filtering

If certain systems legitimately generate high event volumes, filtering and rate limiting can prevent queue saturation without sacrificing meaningful security visibility.

The objective is to eliminate repetitive, low-value events before they consume processing resources.

Use Ignore and Restriction Options

Wazuh provides several mechanisms that help reduce event volume.

These controls can be used to:

  • Ignore repetitive messages
  • Exclude known benign activity
  • Restrict unnecessary log collection
  • Reduce processing overhead

Filtering should always be performed as close to the source as possible.

Examples of Filtering Noisy Events

Examples of commonly filtered events include:

  • Health check requests
  • Application heartbeat messages
  • Successful routine authentications
  • Repetitive status notifications
  • Debug-level messages

For example, if an application generates thousands of identical informational messages every hour, filtering them can dramatically reduce event volume.

Always verify that filtered events do not contain information required for security investigations or compliance reporting.
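
Before deciding what to filter, it helps to measure which messages actually dominate a log. The Python sketch below groups lines that differ only by numbers and lists the most frequent templates. It is a generic illustration that works on any text log, and the normalisation step (replacing digit runs with a placeholder) is a deliberately crude assumption.

```python
import re
from collections import Counter

_NUMBER = re.compile(r"\d+")


def message_template(line):
    """Collapse digit runs so lines differing only by numbers group together."""
    return _NUMBER.sub("<n>", line.strip())


def top_templates(lines, limit=5):
    """Return the most frequent message templates with their counts."""
    counts = Counter(message_template(line) for line in lines if line.strip())
    return counts.most_common(limit)
```

Templates that account for a large share of total lines and carry no security value are the first candidates for filtering, subject to the verification step described above.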

Reduce Duplicate Event Generation

Many environments unknowingly generate duplicate events through multiple collection mechanisms.

Application-Level Filtering

Whenever possible, reduce noise at the application itself.

Examples include:

  • Lowering log verbosity
  • Disabling debug mode
  • Limiting audit output
  • Restricting unnecessary event categories

Application-level filtering is typically the most efficient solution because unwanted events are never generated.

Wazuh-Level Filtering

When application changes are not possible, Wazuh can filter events after collection.

Examples include:

  • Decoder-based filtering
  • Rule exclusions
  • Event suppression
  • Log source restrictions

Filtering at the Wazuh layer reduces downstream processing requirements while preserving visibility into critical activity.

Create Custom Rules

Custom rules can be used to suppress repetitive low-value events that do not require analyst attention.

Suppressing Repetitive Low-Value Events

For example:

<rule id="100500" level="0">
  <if_sid>5501</if_sid>
  <match>Routine informational message</match>
  <description>Ignore repetitive informational event</description>
</rule>

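Level 0 rules prevent unnecessary alert generation, but they are evaluated on the manager after the event has already been collected and forwarded. They reduce alert noise and indexing load rather than pressure on the agent's Logcollector queue.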

Carefully test all custom rules before production deployment.
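
As a pre-check before testing a rule against real log lines, a small lint pass can catch basic mistakes in a custom rule file. The Python sketch below checks that each rule ID falls in the 100000 to 120000 range commonly reserved for custom rules, that the level is between 0 and 16, that IDs are not repeated, and that a description is present. It assumes the rule text is well-formed XML, and the function name is hypothetical. It does not evaluate rule logic and is not a substitute for testing with sample events.

```python
import xml.etree.ElementTree as ET


def lint_rules(rules_xml, id_range=(100000, 120000)):
    """Return a list of problems found in custom <rule> elements."""
    # Wrap the text so several top-level elements still parse.
    root = ET.fromstring("<root>" + rules_xml + "</root>")
    problems = []
    seen = set()
    for rule in root.iter("rule"):
        rule_id = int(rule.get("id", "-1"))
        level = int(rule.get("level", "-1"))
        if not id_range[0] <= rule_id <= id_range[1]:
            problems.append(f"rule {rule_id}: id outside custom range")
        if not 0 <= level <= 16:
            problems.append(f"rule {rule_id}: level {level} out of range")
        if rule_id in seen:
            problems.append(f"rule {rule_id}: duplicate id")
        seen.add(rule_id)
        if rule.find("description") is None:
            problems.append(f"rule {rule_id}: missing description")
    return problems
```

An empty list means only that these structural checks passed.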

For detailed guidance, see:

How to Create Custom Detection Rules in Wazuh (With Examples)

You should also validate rule behavior using:

How to Test Wazuh Rules

The ultimate goal is to ensure that analysts spend time investigating meaningful security events rather than repetitive operational noise.

According to guidance from the Center for Internet Security (CIS), effective security monitoring depends not only on collecting logs but also on reducing alert fatigue through intelligent filtering and prioritization.


Verifying the Fix

After implementing one or more of the remediation steps discussed in this guide, it is important to verify that the issue has truly been resolved.

The absence of warnings alone is not enough.

You should also confirm that events are flowing correctly through the entire Wazuh pipeline.

Confirm Dropped Message Warnings Have Stopped

The first validation step is reviewing the Wazuh logs.

Log Verification Methods

Linux:

grep -i "dropped" /var/ossec/logs/ossec.log

Windows (default agent install path):

Select-String -Path "C:\Program Files (x86)\ossec-agent\ossec.log" -Pattern "dropped"

You should no longer see recurring warnings such as:

Messages were dropped due to full event queue

Continue monitoring logs during normal operations and peak workload periods.
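
A plain keyword search shows whether warnings exist, but not whether they are becoming less frequent. The Python sketch below groups matching lines by hour so that before-and-after periods can be compared. It assumes each ossec.log line starts with a timestamp in the form YYYY/MM/DD HH:MM:SS. The keyword and function name are illustrative.

```python
import re
from collections import Counter

# Assumes ossec.log lines start with "YYYY/MM/DD HH:MM:SS".
_STAMP = re.compile(r"^(\d{4}/\d{2}/\d{2} \d{2}):\d{2}:\d{2} ")


def drops_per_hour(lines, keyword="dropped"):
    """Count lines containing the keyword, grouped by date and hour."""
    counts = Counter()
    for line in lines:
        if keyword.lower() in line.lower():
            match = _STAMP.match(line)
            if match:
                counts[match.group(1) + ":00"] += 1
    return dict(counts)
```

Running it over the log before and after a change makes it clear whether warnings have stopped entirely or have only moved to peak hours.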

Monitor Event Throughput

Verify that event processing remains stable after remediation.

Monitor:

  • Events per second
  • Alert generation rates
  • Queue growth
  • Indexing throughput

Expected Behavior After Remediation

A healthy environment should demonstrate:

  • Stable event processing
  • No queue overflow warnings
  • Consistent alert generation
  • Minimal event latency
  • Normal dashboard responsiveness

Short-lived queue increases during traffic spikes are generally acceptable provided they quickly return to normal levels.

Validate Event Delivery

The most important test is confirming that events successfully travel through the entire Wazuh pipeline.

Generate Test Events

Examples include:

Linux:

logger "Wazuh dropped message test"

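Windows PowerShell (register the WazuhTest source once from an elevated session before writing to it):

New-EventLog -LogName Application -Source WazuhTest
Write-EventLog -LogName Application -Source WazuhTest -EntryType Information -EventId 1000 -Message "Wazuh test event"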

You can also generate security-related test activity if appropriate for your environment.

Confirm Alerts Reach the Dashboard

After generating test events:

  1. Verify the agent collected the event.
  2. Confirm the manager received it.
  3. Check that rules processed it.
  4. Verify visibility in the dashboard.
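
The first three checks are easier when the test message carries a unique marker that can be searched for on the manager. The Python sketch below generates such a marker and scans JSON-lines output for it. It assumes the manager writes one JSON object per line with a full_log field, as in /var/ossec/logs/alerts/alerts.json, or in archives.json when JSON archiving is enabled. Those paths and the field name are assumptions to verify locally. A plain test message may not match any alerting rule, in which case it would only appear in the archive.

```python
import json
import uuid


def new_marker(prefix="wazuh-delivery-test"):
    """Build a unique string to embed in a test log message."""
    return f"{prefix}-{uuid.uuid4().hex[:12]}"


def find_marker(json_lines, marker, field="full_log"):
    """Return parsed events whose chosen field contains the marker."""
    hits = []
    for line in json_lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip partially written or non-JSON lines
        if marker in str(event.get(field, "")):
            hits.append(event)
    return hits
```

Send the marker through the usual test command on the agent, then pass the lines of the manager-side file to find_marker. No hits after a reasonable wait points to a break between collection and analysis.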

If dashboard visibility issues persist, review:

Wazuh Dashboard Not Loading? Complete Troubleshooting Guide

and

Troubleshooting “No Matching Indices Found” Error in Wazuh Dashboard

Track System Performance

Even after fixing dropped-message warnings, ongoing monitoring is essential.

CPU

Monitor:

  • Agent CPU usage
  • Manager CPU usage
  • wazuh-analysisd utilization

Look for sustained high utilization that could indicate future bottlenecks.

Memory

Track:

  • Available RAM
  • Swap utilization
  • OpenSearch heap usage
  • Wazuh process memory consumption

Memory trends often reveal scaling issues before dropped messages return.

Queue Utilization

Queue utilization is one of the best early-warning indicators.

Monitor for:

  • Queue growth trends
  • Backlog accumulation
  • Processing latency
  • Event throughput changes

If queue utilization begins increasing steadily over time, additional tuning may be required before event loss occurs.
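
A steady increase is easier to detect with a trend estimate than by eye. The Python sketch below fits a least-squares line to evenly spaced queue-usage samples and reports the slope. How the samples are collected is left open, and the minimum slope is an arbitrary value chosen for illustration.

```python
def trend_slope(samples):
    """Least-squares slope of evenly spaced samples (units per interval)."""
    n = len(samples)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    numerator = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(samples))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    return numerator / denominator


def is_growing(samples, min_slope=0.5):
    """True when usage rises by at least min_slope per interval on average."""
    return trend_slope(samples) >= min_slope
```

A single spike followed by recovery produces a flat or negative slope, while gradual saturation produces a positive one, which matches the distinction between acceptable bursts and sustained overload.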

By validating log delivery, monitoring throughput, and tracking system health, you can confirm that the dropped-message issue has been fully resolved and that your Wazuh deployment is once again collecting and processing security events reliably.


Best Practices to Prevent Future Dropped Messages

Preventing Logcollector dropped messages is fundamentally about maintaining balance between log ingestion rates and system processing capacity.

Once the immediate issue is resolved, long-term stability depends on continuous tuning and operational discipline.

Monitor Queue Health Regularly

Queue health is one of the earliest indicators of emerging performance issues.

Key metrics to monitor include:

  • Queue utilization percentage
  • Event backlog growth
  • Processing latency
  • Dropped message counters
  • EPS (events per second)

Set up alerts when queue utilization exceeds safe thresholds so that you can respond before event loss occurs.
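
To avoid alerting on every brief spike, a threshold check can require several consecutive high readings. The Python sketch below shows that logic. The 80% threshold and the three-sample window are placeholder values, not recommended settings.

```python
def should_alert(usage_samples, threshold=80.0, consecutive=3):
    """True when the last `consecutive` samples all exceed the threshold."""
    if len(usage_samples) < consecutive:
        return False
    return all(value > threshold for value in usage_samples[-consecutive:])
```

For example, should_alert([50, 85, 90, 95]) returns True, while a series whose latest reading has dropped back below the threshold does not trigger.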

In mature environments, queue monitoring should be treated as a core observability signal alongside CPU, memory, and disk I/O.

Avoid Collecting Unnecessary Logs

Over-collection is one of the most common long-term causes of instability.

To reduce unnecessary load:

  • Remove low-value log sources
  • Disable debug logging in production
  • Avoid collecting temporary or cache files
  • Focus on security-relevant telemetry only

A lean logging strategy not only reduces dropped messages but also improves detection quality by reducing noise.

Review Logging Policies Periodically

Logging requirements change over time as systems evolve.

Establish a recurring review cycle to:

  • Validate all localfile configurations
  • Identify redundant sources
  • Remove deprecated applications
  • Reassess security relevance of each log source

Organizations often accumulate “logging debt” where outdated configurations silently degrade performance.

Keep Agents Updated

Wazuh agent updates frequently include:

  • Performance improvements
  • Memory optimization enhancements
  • Bug fixes affecting log processing
  • Improved queue handling mechanisms

Running outdated agents can introduce inefficiencies that increase the likelihood of dropped messages under load.

Align agent upgrades with your standard maintenance cycle, and verify compatibility with your manager and indexer versions.

Scale Infrastructure Before Bottlenecks Occur

Reactive scaling often leads to instability. Instead, monitor trends and scale proactively.

Consider scaling when you observe:

  • Sustained CPU utilization above safe thresholds
  • Consistently rising queue usage
  • Increasing event latency
  • Regular peak-time saturation

Scaling options include:

  • Adding CPU resources
  • Increasing memory allocation
  • Deploying additional manager nodes
  • Expanding OpenSearch clusters

For large environments, distributed architectures are often more effective than vertical scaling alone.

Establish Event Volume Baselines

Without baseline metrics, it is impossible to detect abnormal behavior early.

Track:

  • Average EPS per host
  • Peak EPS during business hours
  • Night/weekend traffic patterns
  • Seasonal or scheduled spikes

Once baselines are established, deviations become immediately visible and actionable.

This enables early intervention before queue saturation occurs.
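
A baseline can be as simple as the mean and standard deviation of past EPS samples for a host. The Python sketch below builds that summary and flags a reading that sits well above it. The three-deviation threshold is a conventional starting point and an assumption here. Real traffic with strong daily cycles may need separate baselines per time window.

```python
import statistics


def build_baseline(eps_samples):
    """Summarise historical events-per-second samples."""
    return {
        "mean": statistics.fmean(eps_samples),
        "stdev": statistics.pstdev(eps_samples),
        "peak": max(eps_samples),
    }


def is_anomalous(current_eps, baseline, z_threshold=3.0):
    """Flag a reading more than z_threshold deviations above the mean."""
    if baseline["stdev"] == 0:
        return current_eps > baseline["mean"]
    z_score = (current_eps - baseline["mean"]) / baseline["stdev"]
    return z_score > z_threshold
```

Only readings above the baseline are flagged, since unusually low volume is a different problem (for example, a stopped agent) and deserves its own check.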


Frequently Asked Questions (FAQ)

Question: What causes Wazuh Logcollector dropped messages?

Dropped messages typically occur when the Logcollector event queue becomes full.

This is usually caused by excessive log volume, insufficient system resources, slow manager processing, or inefficient configuration.

Question: Are dropped messages permanently lost?

Yes. Once events are dropped at the Logcollector level, they are not forwarded to the manager or stored in the indexer. This is why prevention is critical.

Question: How can I determine which logs are being dropped?

Wazuh does not recover dropped events, but you can identify patterns by:

  • Reviewing ossec.log for queue warnings
  • Correlating timestamps with missing alerts
  • Analyzing high-volume log sources
  • Comparing expected vs observed event counts

Question: Does increasing queue size always solve the problem?

No. A larger queue absorbs short bursts, but if event ingestion consistently exceeds processing capacity, the queue will eventually fill again. Sustained overload requires root cause remediation.

Question: Can a slow Wazuh manager cause dropped messages?

Yes. If the manager cannot process events quickly enough (due to CPU, memory, rule complexity, or indexing delays), backpressure propagates upstream and results in Logcollector queue saturation.

Question: How many logs per second can Wazuh process?

There is no fixed limit. Throughput depends on:

  • Hardware resources
  • Rule complexity
  • Log format
  • Indexing performance
  • Deployment architecture

Well-tuned systems can handle thousands of EPS, while poorly configured systems may struggle at much lower rates.

Question: Should I disable noisy logs or increase resources?

In most cases, you should first reduce noisy logs. Increasing resources is appropriate when:

  • Logs are already optimized
  • Traffic is legitimately high-value
  • The system is correctly configured but under-provisioned

A hybrid approach is often best.

Question: How can I monitor queue utilization in Wazuh?

Queue utilization can be monitored through:

  • Wazuh logs (ossec.log)
  • Manager performance metrics
  • OpenSearch ingestion statistics
  • Custom monitoring dashboards

You should also track indirect indicators such as event latency and alert delays.


Conclusion

Dropped messages in Wazuh Logcollector are a direct indicator that your event ingestion pipeline is under stress.

While the symptom appears at the agent level, the root cause can exist anywhere in the system, from excessive log generation to manager or indexer bottlenecks.

Recap of the Most Common Causes of Dropped Messages

The primary contributors include:

  • Excessive log volume from high-churn sources
  • Logcollector queue saturation
  • Resource constraints on agents
  • Slow or overloaded Wazuh manager
  • Inefficient or redundant log configurations
  • OpenSearch indexing bottlenecks

Summary of the Troubleshooting Process

A structured remediation approach includes:

  1. Identifying dropped-message warnings in logs
  2. Measuring log volume and EPS
  3. Checking agent and manager resource usage
  4. Reducing unnecessary log sources
  5. Optimizing queue capacity (carefully)
  6. Improving system performance at both agent and manager levels
  7. Validating that event flow is restored

Importance of Balancing Log Collection Volume with Processing Capacity

Wazuh performance depends on maintaining equilibrium between:

  • Event generation rate
  • Processing throughput
  • Storage and indexing capacity

Over-collecting logs is just as harmful as under-provisioning infrastructure. The most stable deployments are those that prioritize high-signal security data over raw volume.

Recommendations for Long-Term Wazuh Performance Monitoring and Tuning

To maintain long-term stability:

  • Continuously monitor queue health and EPS
  • Regularly audit log sources and configurations
  • Scale infrastructure proactively based on trends
  • Keep agents and managers updated
  • Optimize OpenSearch performance regularly
  • Revisit logging policies as systems evolve

A well-tuned Wazuh deployment should operate with minimal queue pressure, consistent throughput, and no persistent dropped-message warnings, even under peak load conditions.
