A Wazuh Manager core dump is one of the clearest indicators that something has gone seriously wrong inside the Wazuh server.
When a critical Wazuh process crashes unexpectedly, the operating system may generate a core dump file containing a snapshot of the process’s memory, execution state, loaded libraries, and stack traces at the exact moment of failure.
While many administrators focus on restoring service availability after a crash, the core dump itself often contains the information needed to identify and permanently resolve the underlying problem.
Core dumps should never be treated as isolated incidents. In most environments, they are symptoms of deeper issues such as software bugs, memory corruption, resource exhaustion, incompatible integrations, corrupted databases, configuration errors, or operating system limitations.
Ignoring repeated core dumps can lead to recurring outages, degraded security visibility, and unreliable event processing.
During an outage, important security events may be delayed, dropped, or never processed at all, potentially allowing malicious activity to go unnoticed.
In this guide, you’ll learn how Wazuh Manager core dumps occur, how to identify the affected components, how to analyze crash data, and how to systematically troubleshoot the most common root causes. You’ll also learn preventive measures that reduce the likelihood of future crashes and improve overall Wazuh stability.
Understanding Wazuh Manager Core Dumps
What Is a Core Dump?
A core dump is a file generated by the Linux kernel when a running process terminates unexpectedly due to a fatal error such as a segmentation fault, illegal instruction, memory access violation, or abort signal.
Think of a core dump as a forensic snapshot of a crashed process.
It captures the internal state of the application at the exact moment the failure occurred, allowing developers and system administrators to reconstruct what happened.
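Linux generates core dumps only when they are enabled through the process's core file size limit (RLIMIT_CORE, shown by ulimit -c) and the kernel.core_pattern setting, which controls where dumps are written or which helper program receives them.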
Depending on the distribution and configuration, these files may be stored directly on disk or managed by systemd-coredump.
Core dump files typically contain:
- Process memory contents
- Stack traces
- Register values
- Loaded shared libraries
- Thread information
- Execution state
- Signal information that triggered the crash
This information is extremely valuable during troubleshooting because it allows engineers to determine the exact code path that caused the failure.
According to the Linux kernel documentation, core dumps are specifically designed to assist post-mortem debugging by preserving process state after abnormal termination.
For Wazuh deployments experiencing repeated crashes, core dump analysis often reveals the root cause far faster than reviewing log files alone.
How Wazuh Manager Generates Core Dumps
The Wazuh Manager consists of multiple services and internal modules working together to collect, analyze, store, and correlate security events.
Under normal conditions, these processes shut down gracefully when administrators stop the service. During a graceful shutdown, processes release resources, close connections, and exit cleanly without generating a core dump.
Core dumps occur when a process crashes unexpectedly before normal cleanup procedures can execute.
Several Wazuh components are commonly involved in crash scenarios:
analysisd
The analysis engine responsible for decoding events, matching rules, generating alerts, and processing incoming security data.
remoted
Handles communication between Wazuh agents and the manager, including event reception and agent connectivity.
logcollector
Responsible for collecting and forwarding local logs when installed on manager systems.
modulesd
Runs various Wazuh modules and integrations, including vulnerability detection and external data sources.
authd
Handles agent registration and authentication processes.
Related reading: /fix-authd-registration-failures-wazuh-agent-password-mismatched-guide/
wazuh-db
Manages internal database operations and agent-related data storage.
Related reading: Fixing wazuh-db Worker Thread Crashes
If any of these components encounter fatal software defects, corrupted data structures, memory allocation failures, or resource exhaustion conditions, Linux may generate a core dump before the process terminates.
Why Core Dumps Should Never Be Ignored
Many administrators make the mistake of restarting Wazuh services after a crash without investigating the cause.
While this may temporarily restore functionality, the underlying issue often remains unresolved.
Core dumps frequently indicate hidden stability problems such as:
- Memory leaks
- Software defects
- Corrupted index data
- Resource exhaustion
- Integration failures
- Operating system limitations
- Incompatible package versions
When manager processes crash, event processing may be interrupted for seconds, minutes, or even hours depending on how quickly the issue is detected.
These interruptions can lead to:
- Lost log events
- Delayed alerts
- Missed detections
- Incomplete incident timelines
- Agent communication failures
Security researchers at the SANS Institute have repeatedly emphasized that monitoring gaps and logging interruptions significantly reduce an organization’s ability to detect and investigate attacks.
Even if a crashed process automatically restarts, the resulting monitoring blind spot may leave critical security events unrecorded.
For this reason, every Wazuh Manager core dump should be treated as a high-priority operational incident requiring investigation and root cause analysis.
Common Symptoms of Wazuh Manager Core Dumps
Unexpected Service Restarts
One of the most common indicators of manager core dumps is frequent service restarts.
Administrators may notice that the Wazuh Manager service repeatedly transitions between running and failed states.
On systems managed by systemd, automatic restart policies often hide the original crash by immediately launching a new process instance.
Common indicators include:
- Frequent service restarts in logs
- Unexpected manager downtime
- Intermittent dashboard functionality
- Repeated crash-recovery cycles
You can often identify these patterns using:
systemctl status wazuh-manager
journalctl -u wazuh-manager
In severe cases, the service may enter a restart loop where crashes occur immediately after startup.
Missing or Delayed Security Alerts
Another major symptom is a sudden interruption in alert generation.
When critical components such as analysisd crash, incoming events may stop being analyzed even if agents continue sending logs.
Administrators may observe:
- Missing alerts
- Delayed detections
- Reduced alert volume
- Incomplete rule matches
- Gaps in security monitoring
This behavior is often mistaken for rule configuration problems when the real issue is a crashed processing component.
Agent Connection Problems
Manager crashes frequently affect agent communication.
Depending on which component fails, agents may:
- Disconnect unexpectedly
- Fail health checks
- Miss heartbeat acknowledgments
- Experience registration failures
- Stop transmitting events
Common symptoms include:
- Agents showing disconnected status
- Increased reconnect attempts
- Communication timeout errors
- Delayed event delivery
For environments already experiencing connectivity issues, review:
- Wazuh Agent Not Connecting to Manager? 12 Proven Fixes
- Resolving Duplicate Name or IP Errors in Wazuh Agent Registration
High Resource Utilization Before Crashes
Many Wazuh Manager crashes are preceded by abnormal resource consumption.
Before the core dump occurs, administrators may observe:
Memory Spikes
Rapid increases in RAM consumption can indicate memory leaks, oversized event queues, or database issues.
CPU Saturation
Excessive event processing workloads can push manager processes to 100% CPU utilization for extended periods.
Related reading: Why Is Wazuh Using High CPU? Troubleshooting Guide
File Descriptor Exhaustion
Large environments handling thousands of agent connections may exhaust available file descriptors, causing process instability.
The Open Source Security Foundation (OpenSSF) recommends continuous monitoring of resource consumption as part of production security platform reliability practices.
Tracking resource trends often helps administrators identify the root cause before the next crash occurs.
Core Dump Files Appearing on Disk
The most direct symptom is the appearance of core dump files themselves.
Depending on the Linux distribution and core dump configuration, files may appear in locations such as:
/var/lib/systemd/coredump/
/var/crash/
/tmp/
Common file naming patterns include:
core
core.12345
core.wazuh-analysisd.12345
Many administrators first discover the problem when:
- Disk usage unexpectedly increases
- Backup jobs begin processing large dump files
- System monitoring tools report new files
- Crash analysis directories fill with data
The presence of one or more core dump files should immediately trigger an investigation, especially if multiple dumps are generated over a short period of time.
Repeated dumps almost always indicate a persistent stability problem rather than an isolated incident.
Common Causes of Wazuh Manager Core Dumps
Understanding why a Wazuh Manager process crashed is the most important part of troubleshooting.
While the core dump itself provides evidence of the failure, identifying the underlying root cause prevents future outages and improves platform stability.
Below are the most common reasons Wazuh Manager components generate core dumps.
Memory Exhaustion and Out-of-Memory Conditions
Memory-related failures are among the leading causes of Wazuh Manager crashes.
As event volume grows, Wazuh must allocate memory for:
- Event processing
- Rule evaluation
- Agent communication
- Internal queues
- Database operations
- Vulnerability detection tasks
If memory consumption exceeds available resources, processes can become unstable and eventually crash.
Memory Leaks
A memory leak occurs when an application allocates memory but fails to release it after use.
Over time, leaked memory accumulates until the process exhausts available RAM or virtual memory resources.
Common indicators include:
- Gradually increasing memory usage
- Slower performance over time
- Crashes after running for several days or weeks
- Frequent OOM (Out of Memory) events
Excessive Event Volumes
Large environments can overwhelm manager processes with sudden bursts of events.
Examples include:
- Malware outbreaks
- Log forwarding loops
- Firewall floods
- Authentication storms
- Network scanning activity
When event rates exceed processing capacity, resource consumption can spike dramatically.
Large Queues
If event ingestion exceeds processing speed, internal queues may grow uncontrollably.
Large queues consume memory and increase pressure on components such as analysisd and remoted.
Resource Starvation
Memory shortages often affect more than one subsystem.
When critical resources become scarce, Wazuh processes may fail to allocate memory required for normal operations, resulting in crashes and core dumps.
Corrupted Log Data
Wazuh processes millions of log events from different sources and formats.
Not all logs are well-formed.
Unexpected input can occasionally trigger parser failures or expose software defects.
Malformed Log Entries
Corrupted or improperly formatted logs may contain:
- Missing fields
- Broken JSON structures
- Invalid delimiters
- Unexpected character sequences
Although Wazuh is designed to handle malformed data gracefully, some edge cases can trigger crashes in affected versions.
Unexpected Encoding Formats
Encoding mismatches can create parsing problems.
Examples include:
- UTF-8 versus UTF-16 mismatches
- Invalid Unicode characters
- Binary data embedded in text logs
These issues may cause decoders or integrations to behave unpredictably.
Oversized Events
Extremely large events can consume excessive memory during processing.
Examples include:
- Multi-megabyte JSON payloads
- Large audit logs
- Oversized application logs
- Corrupted log files
Oversized events have historically been responsible for processing failures across many SIEM and log management platforms.
Software Bugs and Version Defects
Not every crash is caused by environmental issues.
Sometimes the root cause is a software defect within Wazuh itself.
Known Bugs in Specific Releases
Every major software platform occasionally ships with defects that are later corrected through updates.
Before spending hours troubleshooting, check whether the crash matches a known issue. Wazuh release notes frequently contain bug fixes related to stability, memory management, and crash prevention.
Module-Specific Crashes
Certain modules may crash independently while the rest of the platform continues functioning.
Examples include:
- Vulnerability detection modules
- Cloud integrations
- External connectors
- Agent enrollment services
Module-specific failures often appear repeatedly under identical workloads.
Edge-Case Processing Failures
Many software defects only occur under unusual circumstances.
Examples include:
- Extremely large event payloads
- Rare decoder combinations
- Unexpected agent behavior
- Concurrent processing conditions
These failures can be difficult to reproduce without analyzing the core dump.
Third-Party Integration Issues
Many Wazuh deployments include custom integrations and external automation.
While powerful, these integrations can also introduce instability.
External Scripts
Custom scripts may:
- Consume excessive resources
- Return malformed data
- Create deadlocks
- Trigger unexpected module behavior
Poorly written integrations are a common source of intermittent crashes.
Custom Integrations
Organizations frequently connect Wazuh to:
- Ticketing platforms
- Threat intelligence feeds
- SIEM solutions
- Automation tools
Problems within these integrations can indirectly affect manager stability.
Related reading:
- How to Integrate Wazuh with VirusTotal for Threat Intelligence
- How to Integrate Wazuh with Suricata for Better Threat Detection
API-Related Crashes
API integrations occasionally trigger failures when:
- Responses are malformed
- Timeouts occur unexpectedly
- Authentication failures are not handled correctly
- Returned data exceeds expected limits
Reviewing integration logs often helps identify these issues.
Database and Internal Communication Problems
Several Wazuh components depend on internal database communication.
Failures in this area can cause instability throughout the platform.
wazuh-db Failures
The wazuh-db service handles important internal operations involving agent and configuration data.
When wazuh-db becomes unstable, dependent components may also fail.
Related reading:
Fixing wazuh-db Worker Thread Crashes
Database Corruption
Corrupted databases can lead to:
- Failed queries
- Invalid responses
- Unexpected process termination
- Repeated crash cycles
Corruption may result from:
- Improper shutdowns
- Disk failures
- File system corruption
- Incomplete upgrades
IPC Communication Issues
Wazuh components communicate internally through Inter-Process Communication (IPC) mechanisms.
If IPC channels become corrupted or unavailable, processes may:
- Hang indefinitely
- Receive invalid responses
- Terminate unexpectedly
These failures often appear in manager logs shortly before a crash.
Rule and Decoder Configuration Errors
Custom configurations can introduce instability when not properly tested.
Invalid Custom Rules
Incorrect rule syntax may cause parsing failures during startup or event processing.
Always validate custom rules before deploying them to production.
Recursive Logic Problems
Poorly designed rule chains can create excessive processing overhead.
Examples include:
- Circular rule references
- Excessive inheritance chains
- Deep dependency relationships
These conditions can dramatically increase CPU and memory usage.
Decoder Parsing Issues
Custom decoders sometimes fail when encountering unexpected data.
Common problems include:
- Incorrect regex patterns
- Missing fields
- Invalid assumptions about log structure
Decoder-related crashes often appear only when specific event types are processed.
Disk and File System Problems
Storage problems can affect every Wazuh component.
Full Disks
A full disk can prevent:
- Log writing
- Queue storage
- Database updates
- Temporary file creation
When critical write operations fail, processes may terminate unexpectedly.
Corrupted Storage
File system corruption can damage:
- Databases
- Index files
- Queue files
- Configuration files
Corruption often causes recurring crashes that persist across service restarts.
File Permission Issues
Incorrect ownership or permissions may prevent Wazuh components from accessing required files.
Symptoms include:
- Startup failures
- Unexpected process exits
- Incomplete initialization
- Core dumps during file operations
Operating System and Dependency Issues
The underlying operating system can also contribute to crashes.
Unsupported Libraries
Library mismatches may occur after:
- Partial upgrades
- Repository changes
- Manual package installations
An incompatible shared library can cause immediate application crashes.
Broken Package Dependencies
Missing dependencies may prevent modules from functioning correctly.
Administrators should verify package integrity whenever crashes occur after upgrades.
Kernel-Related Compatibility Problems
Certain kernel versions occasionally expose compatibility issues with user-space applications.
According to guidance from the Linux Foundation, maintaining supported kernel and dependency combinations is an important best practice for production system stability.
When crashes begin shortly after operating system updates, dependency compatibility should be investigated immediately.
Step 1: Confirm That a Core Dump Occurred
Before investigating root causes, verify that a core dump was actually generated.
This helps distinguish true process crashes from configuration problems, graceful shutdowns, or service restarts.
Check Wazuh Service Status
Start by examining the current service state.
Using systemctl
Run:
systemctl status wazuh-manager
Look for:
- Failed status indicators
- Signal termination messages
- Segmentation fault errors
- Restart loop behavior
Recent crashes often appear directly within the status output.
Reviewing Recent Crashes
Systemd typically records termination events in the journal.
Use:
journalctl -u wazuh-manager -n 200
Look for messages containing:
Segmentation fault
Aborted
Core dumped
Killed
Signal 11
Signal 6
These entries frequently indicate that a core dump was generated.
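To get a rough count of such entries, the journal output can be piped through a small filter. This is a minimal sketch: count_crash_markers is a hypothetical helper name, and the patterns are simply the strings listed above, so the exact wording on your system may differ.

```shell
#!/usr/bin/env bash
# Count input lines that contain any of the crash indicators listed above.
# count_crash_markers is a hypothetical helper; it reads log text on stdin.
count_crash_markers() {
  # grep -c prints the number of matching lines; "|| true" keeps the exit
  # status at 0 even when nothing matches.
  grep -Eic 'segmentation fault|aborted|core dumped|killed|signal (11|6)' || true
}

# Example (requires systemd and a wazuh-manager unit):
#   journalctl -u wazuh-manager --since yesterday --no-pager | count_crash_markers
```

A count that keeps growing between checks suggests a restart loop rather than a single isolated crash.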
Identify Terminated Processes
If the manager service restarted automatically, determine which component actually failed.
Examples include:
- wazuh-analysisd
- wazuh-remoted
- wazuh-db
- wazuh-modulesd
- wazuh-authd
Knowing the affected process significantly narrows the investigation scope.
Check Kernel Messages
The Linux kernel often records crash details.
Use:
dmesg -T | grep -i "segfault"
or:
journalctl -k
Kernel logs frequently reveal:
- Faulting addresses
- Signal numbers
- Memory violations
- Crashed binaries
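Kernel segfault lines follow a fairly regular layout (process[pid]: segfault at ADDRESS ip ... in MODULE[...]), so they can be summarized with awk. The sketch below assumes that layout; parse_segfaults is a hypothetical helper name, and lines that deviate from the layout may produce incomplete fields.

```shell
#!/usr/bin/env bash
# Summarize kernel segfault messages read from stdin.
# parse_segfaults is a hypothetical helper name.
parse_segfaults() {
  awk '/segfault at/ {
    proc = ""; addr = ""; module = "unknown"
    for (i = 2; i <= NF; i++) {
      # The token before "segfault" is "name[pid]:", the one after "at" is the address.
      if ($i == "segfault" && $(i + 1) == "at") { proc = $(i - 1); addr = $(i + 2) }
      # The token after "in" is the binary or library containing the faulting code.
      if ($i == "in" && i < NF) module = $(i + 1)
    }
    sub(/:$/, "", proc)
    split(proc, parts, /[][]/)     # parts[1] = process name, parts[2] = PID
    sub(/\[.*/, "", module)        # drop the [base+size] suffix
    printf "process=%s pid=%s fault_addr=%s module=%s\n", parts[1], parts[2], addr, module
  }'
}

# Example: dmesg -T | parse_segfaults
```

The module field is often the quickest hint: a fault inside a shared library points toward dependency problems, while a fault inside the Wazuh binary itself points toward a product defect or bad input.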
Review Service Logs
Examine Wazuh logs immediately before the crash.
Useful files include:
/var/ossec/logs/ossec.log
Look for:
- Error messages
- Module failures
- Database issues
- Queue overflows
- Resource warnings
Many root causes become visible shortly before process termination.
Verify Core Dump Generation
After confirming a crash occurred, determine whether Linux created a dump file.
Using coredumpctl
On systemd-based systems:
coredumpctl list
Example output (abbreviated; real output also includes COREFILE and EXE columns):
TIME                         PID UID GID SIG
Sat 2026-01-10 10:15:12 UTC 1234   0   0  11
This confirms a process generated a core dump.
You can inspect details using:
coredumpctl info
Using Core Dump Files Directly
On systems that store traditional core files:
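find / -type f \( -name "core" -o -name "core.*" \) 2>/dev/null
Unrelated files can match these names, so confirm each candidate with the file command (a genuine dump is reported as an ELF core file). If dump files are present, a crash likely occurred and further analysis can begin.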
Step 2: Locate Wazuh Core Dump Files
Once you’ve confirmed a crash occurred, locate the dump file associated with the failed process.
Common Core Dump Locations
The storage location depends on Linux distribution and core dump configuration.
systemd-coredump Storage
Modern distributions commonly use systemd-coredump.
Typical location:
/var/lib/systemd/coredump/
List available dumps:
coredumpctl list
Extract a dump if necessary:
coredumpctl dump <PID> --output=core.<PID>
Traditional Core File Locations
Older systems often write core files directly to disk.
Common locations include:
/var/crash/
/tmp/
or the working directory of the crashed process.
Search for them using:
find / -type f -name "core*" 2>/dev/null
Custom Core Pattern Locations
Linux allows administrators to customize dump storage through:
cat /proc/sys/kernel/core_pattern
Example:
/var/core/core.%e.%p
This setting determines where newly generated dumps are written.
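The first character of the pattern decides how the kernel handles the dump: a leading | pipes it to a helper program such as systemd-coredump, a leading / writes it to an absolute path, and anything else is written relative to the crashed process's working directory. In the example above, %e expands to the executable name and %p to the PID. A minimal sketch of that interpretation, where describe_core_pattern is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Explain how the kernel will treat a given core_pattern value.
# describe_core_pattern is a hypothetical helper name.
describe_core_pattern() {
  local pattern="$1"
  case "$pattern" in
    '|'*) echo "piped to helper: ${pattern#|}" ;;
    /*)   echo "written to absolute path template: $pattern" ;;
    *)    echo "written relative to the working directory of the crashed process: $pattern" ;;
  esac
}

# Example: describe_core_pattern "$(cat /proc/sys/kernel/core_pattern)"
```

If the result says the dump is piped to a helper, look for it with coredumpctl rather than searching the filesystem.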
Determine Which Process Crashed
Finding a core file is only the first step.
Next, identify which Wazuh component generated it.
Mapping Dump Files to Wazuh Components
Many dump files include process information within the filename.
Examples:
core.wazuh-analysisd.12345
core.wazuh-remoted.9876
core.wazuh-db.5555
This immediately identifies the affected service.
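When many dumps have accumulated, the component name can be pulled out of each filename automatically. The sketch below assumes the core.<executable>.<pid> naming shown in the examples above (which depends on your core_pattern); component_from_core_name is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Extract the executable name from a dump named core.<executable>.<pid>.
# component_from_core_name is a hypothetical helper name.
component_from_core_name() {
  local name
  name="$(basename "$1")"
  case "$name" in
    core.*.*)
      name="${name#core.}"   # strip the leading "core."
      echo "${name%.*}"      # strip the trailing ".<pid>"
      ;;
    *)
      echo "unknown"         # e.g. "core" or "core.12345" carry no executable name
      ;;
  esac
}

# Example:
#   for f in /var/core/core.*; do echo "$f -> $(component_from_core_name "$f")"; done
```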
If filenames are not descriptive, use:
coredumpctl info
or:
file core.*
to obtain executable information.
Identifying Affected Services
The most commonly crashed Wazuh processes include:
| Process | Primary Function |
|---|---|
| analysisd | Event analysis and rule matching |
| remoted | Agent communications |
| wazuh-db | Internal database operations |
| modulesd | Module execution |
| authd | Agent enrollment and authentication |
Identifying the crashed process early dramatically reduces troubleshooting time because it allows you to focus on the subsystem most likely responsible for the failure.
Step 3: Examine Wazuh Logs Before the Crash
Once you’ve identified the crashed process and located the core dump, the next step is reviewing Wazuh logs generated immediately before the failure.
In many cases, the logs reveal the root cause without requiring deep core dump analysis.
Fatal errors, resource exhaustion warnings, database communication failures, and malformed event processing issues often appear minutes or even seconds before a process terminates.
Review Manager Logs
Wazuh components generate extensive operational logs that provide valuable context surrounding a crash.
Rather than focusing only on the exact crash timestamp, examine activity occurring several minutes beforehand.
Many failures are preceded by warning messages that progressively worsen until the process becomes unstable.
Important Log Locations
The primary log file for troubleshooting manager crashes is:
/var/ossec/logs/ossec.log
Search recent activity using:
tail -n 500 /var/ossec/logs/ossec.log
Or filter by component:
grep analysisd /var/ossec/logs/ossec.log
grep remoted /var/ossec/logs/ossec.log
grep wazuh-db /var/ossec/logs/ossec.log
For system-level events, also review:
journalctl -u wazuh-manager
and
journalctl -xe
These logs frequently contain information unavailable within ossec.log itself.
Identifying Fatal Errors
Start by searching for obvious failure indicators.
Examples include:
ERROR
CRITICAL
FATAL
Aborted
Segmentation fault
Out of memory
Connection refused
Database error
Useful commands:
grep -Ei "fatal|critical|error|abort" /var/ossec/logs/ossec.log
Look especially for messages occurring immediately before the service restart or crash timestamp.
Many Wazuh crashes leave a clear trail of warnings before termination.
Look for Warning Messages Leading Up to the Crash
Warnings often provide the earliest indication of instability.
Administrators frequently overlook these messages because the service may continue functioning temporarily before finally crashing.
Queue Warnings
Queue-related warnings indicate that incoming events are arriving faster than they can be processed.
Examples include:
Queue is full
Event queue saturated
Messages dropped
Large queue backlogs can contribute to memory pressure and eventual process failures.
Related reading:
Fix Wazuh Logcollector Dropped Messages
Memory Allocation Errors
Memory-related warnings should always be treated seriously.
Examples:
Cannot allocate memory
Out of memory
Allocation failed
These messages often appear before segmentation faults and process crashes.
Related reading:
- How to Tune OpenSearch Heap Size to Stop Wazuh High Memory Crashes
- Why Is Wazuh Using High CPU? Troubleshooting Guide
Decoder Failures
Malformed or unexpected log formats can trigger decoder problems.
Examples:
Decoder error
Regex compilation failed
Invalid log format
Repeated decoder failures may indicate corrupted log sources or configuration issues.
Related reading:
How to Create Custom Detection Rules in Wazuh (With Examples)
Database Communication Errors
Database instability frequently affects multiple Wazuh components.
Watch for messages such as:
Database connection failed
Unable to communicate with wazuh-db
IPC timeout
Database unavailable
These warnings often precede crashes involving modulesd, analysisd, or wazuh-db itself.
Related reading:
Fixing wazuh-db Worker Thread Crashes
Correlate Crash Timing
Finding errors is important, but understanding their relationship to the crash is even more valuable.
Building a Timeline
Create a timeline of events leading to the crash.
Document:
| Time | Event |
|---|---|
| 10:01 | Queue warnings begin |
| 10:05 | Memory usage spikes |
| 10:08 | Database timeout errors appear |
| 10:10 | Process crashes |
| 10:10 | Core dump generated |
This approach often reveals patterns that individual log entries cannot.
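To gather the raw material for such a timeline, it helps to cut the log down to the minutes around the crash. The sketch below assumes each ossec.log line starts with a YYYY/MM/DD HH:MM:SS timestamp, which sorts correctly as plain text; log_window is a hypothetical helper name, and continuation lines without a timestamp may be dropped.

```shell
#!/usr/bin/env bash
# Print only the log lines whose timestamp falls between START and END (inclusive).
# log_window is a hypothetical helper; it reads the log on stdin.
log_window() {
  local start="$1" end="$2"
  awk -v start="$start" -v end="$end" '{
    ts = $1 " " $2                       # "YYYY/MM/DD HH:MM:SS"
    if (ts >= start && ts <= end) print  # string comparison is safe for this format
  }'
}

# Example: ten minutes leading up to a crash at 10:10
#   log_window "2026/01/10 10:00:00" "2026/01/10 10:10:59" < /var/ossec/logs/ossec.log
```

The same window can then be compared against journalctl and dmesg output for the identical period.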
Matching Logs to Crash Events
Compare timestamps from:
- ossec.log
- journalctl
- dmesg
- coredumpctl
- monitoring systems
Your goal is to identify what changed immediately before the failure occurred.
Experienced incident responders frequently emphasize timeline reconstruction as one of the most effective methods for identifying root causes because it helps distinguish symptoms from the actual triggering event.
Step 4: Analyze the Core Dump
Once you’ve collected relevant logs, it’s time to examine the core dump itself.
Core dump analysis can reveal exactly where the process failed and which function triggered the crash.
Even if you’re not a software developer, basic analysis often provides enough information to identify known bugs, resource issues, or module-specific failures.
Install Required Debugging Tools
Several tools are required before you can inspect a dump file.
GDB
The GNU Debugger (GDB) is the most common utility used for Linux crash analysis.
Install it on Debian-based systems:
sudo apt install gdb
On RHEL-based systems:
sudo yum install gdb
GDB allows you to inspect:
- Stack traces
- Thread information
- Register values
- Memory state
- Crashed functions
The official GNU debugger documentation provides detailed guidance on post-mortem debugging techniques.
Debug Symbol Packages
Without debugging symbols, stack traces may contain limited information.
Install:
- Wazuh debug packages (if available)
- Operating system debug symbols
- Library debug packages
Symbols allow GDB to display function names and source code references rather than memory addresses.
Open the Core Dump in GDB
After installing the necessary tools, load the dump.
Loading the Dump
Example:
gdb /var/ossec/bin/wazuh-analysisd core.12345
Or using systemd:
coredumpctl gdb <PID>
GDB will load the process state captured at the time of the crash.
Generating a Backtrace
The first command most engineers run is:
bt
or:
thread apply all bt
This generates a backtrace showing the sequence of function calls that led to the crash.
Simplified example (function names are illustrative):
#0 process_event()
#1 decode_log()
#2 rule_matching()
#3 main()
The backtrace is often the single most valuable artifact produced during troubleshooting.
Understanding Backtrace Output
A backtrace may look intimidating at first, but several patterns are easy to recognize.
Function Call Stacks
The stack trace shows which functions were executing when the process failed.
Repeated function names may indicate:
- Infinite recursion
- Looping logic
- Stack exhaustion
These patterns frequently point directly to software defects.
Segmentation Faults
A segmentation fault (SIGSEGV) occurs when a process attempts to access memory it does not own.
Example:
Program terminated with signal SIGSEGV
Segmentation fault
This is one of the most common causes of Wazuh core dumps.
Abort Signals
Abort signals typically appear as:
Program terminated with signal SIGABRT
These crashes often occur when internal safety checks detect invalid program states.
Memory Access Violations
Memory corruption indicators may include:
- Invalid pointers
- Null pointer dereferences
- Buffer overflows
- Corrupted heap structures
When these patterns appear, a software defect is often involved.
According to guidance from the GNU Project and major Linux distribution maintainers, stack traces and signal information are usually the most important artifacts for diagnosing application crashes.
Collect Information for Vendor Support
If the root cause is not immediately obvious, gather information before opening a support case or GitHub issue.
Backtrace Output
Save the output of the following GDB commands to a text file:
bt
thread apply all bt
These are typically the first artifacts requested by support engineers.
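Both backtraces can be captured non-interactively by running GDB in batch mode. This is a minimal sketch: save_backtrace is a hypothetical helper name, the GDB environment variable is only a convenience for pointing at a different gdb binary, and the paths in the example are placeholders.

```shell
#!/usr/bin/env bash
# Write "bt" and "thread apply all bt" output for a core dump to a text file.
# save_backtrace is a hypothetical helper name.
save_backtrace() {
  local exe="$1" core="$2" out="$3"
  # -batch exits after running the -ex commands; GDB may be overridden via $GDB.
  "${GDB:-gdb}" -batch \
    -ex "bt" \
    -ex "thread apply all bt" \
    "$exe" "$core" > "$out" 2>&1
}

# Example:
#   save_backtrace /var/ossec/bin/wazuh-analysisd core.12345 backtrace.txt
```

Review the resulting file before sharing it, since backtraces with symbols can include argument values taken from processed events.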
System Details
Collect:
uname -a
cat /etc/os-release
This helps identify operating system compatibility issues.
Version Information
Document:
/var/ossec/bin/wazuh-control info
or:
rpm -qa | grep wazuh
or:
dpkg -l | grep wazuh
Include:
- Wazuh version
- Operating system version
- Kernel version
- Installed integrations
- Deployment architecture
Providing complete diagnostic information significantly accelerates vendor troubleshooting.
Step 5: Verify System Resource Health
Many Wazuh crashes are caused by resource exhaustion rather than software defects.
Before assuming a bug exists, verify that the underlying system has sufficient resources to support the workload.
Check Available Memory
Memory shortages are among the most common causes of instability.
Physical RAM
Review memory utilization:
free -h
Look for:
- Very low available memory
- Consistently high utilization
- Frequent memory pressure events
Memory consumption approaching system limits should be investigated immediately.
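For scripted checks, the same information is available in /proc/meminfo. The sketch below computes the percentage of memory the kernel reports as available; mem_available_pct is a hypothetical helper name, and any alerting threshold you apply to the result is your own choice.

```shell
#!/usr/bin/env bash
# Print MemAvailable as a whole-number percentage of MemTotal.
# mem_available_pct is a hypothetical helper; it reads meminfo-format text on stdin.
mem_available_pct() {
  awk '/^MemTotal:/     { total = $2 }
       /^MemAvailable:/ { avail = $2 }
       END { if (total > 0) printf "%d\n", (avail * 100) / total }'
}

# Example: mem_available_pct < /proc/meminfo
```

Recording this value every few minutes makes it easier to tell a slow leak from a sudden spike.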
Swap Usage
Check swap activity:
swapon --show
and
free -h
Heavy swap usage often indicates insufficient physical memory.
Systems relying extensively on swap frequently experience:
- Increased latency
- Slower event processing
- Process instability
- Unexpected crashes
Monitor CPU Utilization
CPU saturation can create cascading failures throughout the manager.
Sustained High CPU Usage
Monitor system load:
top
or:
htop
Look for:
- CPU usage consistently above 80–90%
- Load averages exceeding CPU core counts
- wazuh-analysisd consuming excessive resources
Process-Level Analysis
Identify which processes are consuming resources:
ps aux --sort=-%cpu | head
and:
ps aux --sort=-%mem | head
This helps determine whether the crash is linked to a specific component.
Verify Disk Capacity
Storage exhaustion can destabilize Wazuh and its supporting services.
Filesystem Usage
Check available space:
df -h
Pay special attention to:
- /
- /var
- /var/ossec
- OpenSearch data volumes
Full filesystems commonly trigger service failures.
Related reading:
How to Fix a Yellow Cluster Status in Wazuh Indexer
Inode Availability
A filesystem can run out of inodes even when free space remains.
Check inode consumption:
df -i
Low inode availability may prevent new files from being created.
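Both checks can be automated with the same filter, because df -P and df -Pi place the usage percentage in the fifth column. A minimal sketch, where over_threshold is a hypothetical helper name and the 90 percent default is an arbitrary assumption:

```shell
#!/usr/bin/env bash
# List mount points whose usage percentage meets or exceeds a limit.
# over_threshold is a hypothetical helper; it reads "df -P" style output on stdin.
over_threshold() {
  local limit="${1:-90}"
  awk -v limit="$limit" 'NR > 1 {      # skip the header line
    pct = $5
    sub(/%$/, "", pct)                 # "95%" -> "95"
    if (pct + 0 >= limit + 0) print $6, $5
  }'
}

# Examples:
#   df -P  | over_threshold 90    # block usage
#   df -Pi | over_threshold 90    # inode usage
```

Mount points containing spaces are not handled by this sketch.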
Inspect File Descriptor Limits
Wazuh managers handling thousands of agents may encounter file descriptor limitations.
Current Limits
View the limit for your current shell (a running daemon may have a different one):
ulimit -n
Review system-wide settings:
cat /proc/sys/fs/file-max
Low limits can cause:
- Connection failures
- Queue problems
- Service instability
- Unexpected process exits
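Because ulimit -n reports the limit of the current shell, the effective limit of a running daemon is better read from /proc/<pid>/limits. The sketch below assumes that wazuh-analysisd is the process of interest and that pgrep is installed; the function name soft_nofile_limit is hypothetical.

```shell
# Print the soft "Max open files" limit recorded in a /proc/<pid>/limits file.
soft_nofile_limit() {
    awk '/^Max open files/ {print $4}' "$1"
}

# Usage (assumes wazuh-analysisd is running):
#   pid=$(pgrep -o -x wazuh-analysisd)
#   echo "limit: $(soft_nofile_limit "/proc/$pid/limits")"
#   echo "open:  $(ls "/proc/$pid/fd" | wc -l)"
```

Comparing the open descriptor count with the soft limit shows how much headroom the process actually has.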
Increasing Limits When Necessary
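If limits are too restrictive for login sessions and other PAM-launched processes, adjust:
/etc/security/limits.conf
Example:
wazuh soft nofile 65535
wazuh hard nofile 65535
Note that limits.conf is applied by PAM and does not affect services started by systemd. When the manager runs as a systemd unit, set LimitNOFILE=65535 in a drop-in for the wazuh-manager service instead (for example with systemctl edit wazuh-manager).
After increasing limits, restart affected services and continue monitoring.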
Resource validation is a critical troubleshooting step because many crashes that initially appear to be software bugs ultimately turn out to be memory shortages, CPU saturation, disk exhaustion, or operating system limitations.
Step 6: Validate Wazuh Configuration
Configuration problems are a common source of Wazuh Manager instability, especially in environments with extensive customization.
Custom rules, decoders, integrations, and manual configuration changes can introduce unexpected behavior that eventually leads to process crashes.
If core dumps began appearing after a configuration change, validation should be one of your highest-priority troubleshooting steps.
Check Manager Configuration Syntax
Before investigating more complex causes, verify that the manager configuration is syntactically correct.
Even small formatting mistakes can create instability or prevent components from operating properly.
Validate ossec.conf
The primary Wazuh Manager configuration file is:
/var/ossec/etc/ossec.conf
Inspect the file for:
- Missing XML tags
- Invalid nesting
- Duplicate configuration blocks
- Typographical errors
- Unsupported options
Wazuh logs often reveal configuration-related errors during startup.
Review the most recent entries immediately after restarting the service:
tail -n 100 /var/ossec/logs/ossec.log
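Wazuh daemons also accept a -t flag that tests the configuration and exits, which can surface errors without a full restart. A minimal sketch, assuming the default /var/ossec/bin install path; the function name test_wazuh_config and the choice of the two daemons checked are illustrative, not an official procedure.

```shell
# Run each listed daemon with -t (test configuration and exit) and report the result.
# bin_dir defaults to the standard install path; the daemon list is an assumption.
test_wazuh_config() {
    local bin_dir="${1:-/var/ossec/bin}"
    local rc=0
    local daemon
    for daemon in wazuh-analysisd wazuh-remoted; do
        if "$bin_dir/$daemon" -t; then
            echo "$daemon: configuration OK"
        else
            echo "$daemon: configuration test FAILED"
            rc=1
        fi
    done
    return "$rc"
}

# Usage (run as root on the manager):
#   test_wazuh_config
```

The function returns a non-zero status if any daemon rejects the configuration, so it can gate a restart in a deployment script.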
Related reading:
How to Fix ossec.conf Syntax Errors in Wazuh Agents
Identify Recent Changes
One of the fastest ways to locate a root cause is determining what changed before the crashes began.
Ask questions such as:
- Were new rules recently added?
- Was Wazuh upgraded?
- Was an integration deployed?
- Were decoder changes introduced?
- Were manager settings modified?
Many incidents can be traced directly to a recent configuration change.
If version control is available, compare current and previous configurations.
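When no version control is in place, file modification times give a rough picture of what changed. The sketch below assumes the default /var/ossec/etc configuration directory and a seven-day window; both defaults and the function name recently_changed are illustrative.

```shell
# List files under a directory modified within the last N days, newest first.
recently_changed() {
    local dir="${1:-/var/ossec/etc}"
    local days="${2:-7}"
    find "$dir" -type f -mtime "-$days" -exec ls -lt {} +
}

# Usage:
#   recently_changed /var/ossec/etc 3
```

Modification times only show that a file changed, not what changed, so this complements rather than replaces a proper diff.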
Review Custom Rules
Custom detection rules are powerful but can introduce processing problems when improperly designed.
Detect Faulty Rule Logic
Review recently added rules for:
- Invalid syntax
- Unsupported fields
- Excessive inheritance
- Circular dependencies
- Inefficient matching logic
Examples of problematic patterns include a parent reference such as:
<if_sid>100001</if_sid>
that points to a rule that does not exist, or recursive rule chains that repeatedly trigger each other.
These issues can dramatically increase processing overhead and occasionally expose edge-case software defects.
Test Rule Changes Safely
Never deploy major rule changes directly to production without validation.
Use:
/var/ossec/bin/wazuh-logtest
to verify behavior before rollout.
This tool allows administrators to:
- Test rule matching
- Validate syntax
- Verify decoder interactions
- Identify unexpected behavior
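For repeatable testing, a file of representative log lines can be replayed through the tool instead of typing events interactively. This sketch assumes wazuh-logtest reads one event per line from standard input when it is not attached to a terminal; the function name replay_samples and the sample file path are hypothetical.

```shell
# Feed every line of a sample file to wazuh-logtest and print the combined output.
replay_samples() {
    local samples="$1"
    local logtest="${2:-/var/ossec/bin/wazuh-logtest}"
    "$logtest" < "$samples" 2>&1
}

# Usage:
#   replay_samples /tmp/sample-events.log > /tmp/logtest-before.txt
```

Saving the output before and after a rule change and comparing the two files makes unintended matches easier to spot.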
Review Custom Decoders
Custom decoders are another frequent source of instability.
Decoder errors may not appear immediately and can remain hidden until a specific log format is processed.
Decoder Validation
Inspect custom decoders for:
- Invalid regular expressions
- Incorrect field mappings
- Missing parent decoders
- Unsupported XML elements
Validate decoder behavior using representative log samples before deployment.
Common Decoder Mistakes
The most common issues include:
- Overly complex regex patterns
- Greedy matching expressions
- Invalid capture groups
- Decoder inheritance errors
- Assumptions about log structure
For example, a decoder may work perfectly with expected logs but fail when encountering malformed or unexpected input.
These edge cases can trigger excessive resource consumption or process instability under heavy workloads.
Step 7: Investigate Database and Module Failures
Many Wazuh Manager crashes originate from internal modules rather than the manager framework itself.
Database communication problems, module failures, and subsystem-specific defects can all produce core dumps.
The goal of this step is identifying whether a particular component is consistently involved in the crash.
Check wazuh-db Health
The wazuh-db service is one of the most important components within the Wazuh architecture.
Many manager functions rely on it for configuration management, agent information, and internal data operations.
Database Errors
Review logs for messages such as:
Database error
Database unavailable
Query failed
Database timeout
Search logs using:
grep -i database /var/ossec/logs/ossec.log
Repeated database errors often indicate corruption, communication failures, or resource exhaustion.
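To see which daemon is producing the most errors, the log can be summarized by component. A rough sketch, assuming the common "YYYY/MM/DD HH:MM:SS tag: LEVEL: message" layout of ossec.log; the function name errors_by_daemon is illustrative, and lines in other layouts are simply ignored.

```shell
# Count ERROR and CRITICAL lines per daemon tag in an ossec.log-style file.
# Assumes the "YYYY/MM/DD HH:MM:SS tag: LEVEL: message" layout.
errors_by_daemon() {
    local log="${1:-/var/ossec/logs/ossec.log}"
    awk '$4 == "ERROR:" || $4 == "CRITICAL:" {
        tag = $3
        sub(/:$/, "", tag)      # drop the trailing colon
        sub(/\[.*$/, "", tag)   # drop a [pid] suffix if present
        count[tag]++
    }
    END { for (t in count) print count[t], t }' "$log" | sort -rn
}

# Usage:
#   errors_by_daemon
```

A daemon that dominates this summary shortly before each crash is a natural place to focus the investigation.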
Communication Failures
Wazuh components communicate extensively with wazuh-db.
Common warning messages include:
Unable to communicate with wazuh-db
IPC timeout
Socket communication error
Connection lost
When communication breaks down, dependent processes may become unstable and eventually crash.
Related reading:
Fixing wazuh-db Worker Thread Crashes
Review Wazuh Modules
Several Wazuh modules perform specialized functions and may crash independently under certain conditions.
Examine logs for module-specific errors.
Vulnerability Detection
The vulnerability detection module processes package inventory information and vulnerability feeds.
Potential issues include:
- Corrupted vulnerability databases
- Feed synchronization failures
- Excessive resource consumption
- Version compatibility problems
Related reading:
Wazuh Vulnerability Detection Not Working? Here’s How to Fix It
Syscollector
Syscollector gathers inventory information from monitored endpoints.
Problems may occur when:
- Agents send unexpected inventory data
- Large environments generate excessive inventory updates
- Resource limits are reached
Review Syscollector-related log entries surrounding the crash.
FIM
File Integrity Monitoring (FIM) can generate significant processing workloads.
Potential crash contributors include:
- Monitoring extremely large directories
- Excessive file changes
- Aggressive scan schedules
- Resource exhaustion
Related reading:
- How to Configure File Integrity Monitoring (FIM) in Wazuh
- How to Stop Wazuh File Integrity Monitoring From Eating Your CPU
SCA
Security Configuration Assessment (SCA) scans can place additional load on manager resources.
Review:
- Scan frequency
- Policy complexity
- Concurrent scan activity
Large-scale SCA deployments occasionally expose scalability issues.
Identify Module-Specific Crashes
The objective is determining whether crashes consistently occur within the same subsystem.
Look for patterns such as:
- Every crash involving wazuh-modulesd
- Every crash occurring during vulnerability scans
- Every crash occurring after FIM activity
- Every crash occurring during agent enrollment
Consistent patterns usually indicate a module-specific problem.
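On hosts that use systemd-coredump, the pattern check can start with a tally of recorded dumps per executable. This is a sketch under the assumption that the executable path is the first field starting with "/" in coredumpctl list output, since the exact column layout varies between systemd versions; the function name crashes_by_exe is hypothetical.

```shell
# Tally core dumps per executable from `coredumpctl list --no-legend` output
# on stdin. The executable is taken to be the first field starting with "/".
crashes_by_exe() {
    awk '{
        for (i = 1; i <= NF; i++)
            if ($i ~ /^\//) { count[$i]++; break }
    }
    END { for (e in count) print count[e], e }' | sort -rn
}

# Usage:
#   coredumpctl list --no-legend | crashes_by_exe
```

If one binary accounts for nearly every entry, the remaining steps can concentrate on that component.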
Isolating Problematic Components
If evidence points toward a specific module, isolate it for testing.
For example:
- Disable the suspected module.
- Restart Wazuh.
- Monitor system stability.
- Compare behavior before and after the change.
This controlled approach often confirms the root cause quickly.
Temporary Module Disablement for Testing
Disabling a module temporarily can help determine whether it is responsible for the crashes.
Examples include:
- Vulnerability Detection
- SCA
- Syscollector
- Third-party integrations
Do not leave critical security features disabled permanently, but temporary testing can provide valuable diagnostic information.
Document every change so that configurations can be restored after troubleshooting.
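One simple way to keep changes reversible is to snapshot the configuration file before each test. A minimal sketch, assuming the default ossec.conf path; the function name snapshot_config and the backup naming scheme are illustrative.

```shell
# Copy a configuration file to a timestamped backup next to it and print the backup path.
snapshot_config() {
    local conf="${1:-/var/ossec/etc/ossec.conf}"
    local backup="${conf}.$(date +%Y%m%d-%H%M%S).bak"
    cp -p "$conf" "$backup" && echo "$backup"
}

# Usage:
#   backup=$(snapshot_config)
#   ... disable the suspected module, restart, observe ...
#   diff -u "$backup" /var/ossec/etc/ossec.conf   # review what was changed
```

The diff output doubles as documentation of exactly which settings were altered during the test.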
Step 8: Determine Whether the Crash Is a Known Bug
Not every core dump is caused by local configuration problems or infrastructure issues.
Sometimes the crash is the result of a documented software defect that has already been identified and fixed by the Wazuh development team.
Before investing excessive time in deep debugging, verify whether the issue is already known.
Verify Installed Wazuh Version
Begin by identifying the exact version running in your environment.
Examples:
rpm -qa | grep wazuh
or:
dpkg -l | grep wazuh
Document:
- Manager version
- Agent versions
- Dashboard version
- Indexer version
Version mismatches can sometimes contribute to instability.
Review Release Notes
Wazuh release notes frequently contain bug fixes addressing:
- Memory leaks
- Segmentation faults
- Database crashes
- Module instability
- Integration failures
Pay particular attention to fixes involving the component identified in your backtrace.
For example, if wazuh-analysisd generated the core dump, search release notes for analysisd-related fixes.
Search Known Issues
The next step is reviewing publicly reported bugs.
Search using:
- Error messages
- Backtrace functions
- Signal names
- Module names
- Version numbers
You may discover that other administrators have already encountered the same issue.
Compare Crash Signatures
Core dump analysis becomes especially valuable when comparing crash signatures against known defects.
Matching Stack Traces
If your backtrace contains functions such as the following (illustrative names, not taken from a real trace):
process_event()
decode_event()
db_query()
search those function names together with your Wazuh version.
Matching stack traces are often strong evidence that you’re encountering an existing bug.
Many software vendors use crash signatures as the primary method for categorizing and resolving defects.
Existing Bug Reports
Review issue reports for:
- Similar stack traces
- Similar workloads
- Similar deployment architectures
- Matching error messages
Pay attention to comments from maintainers because they often contain workarounds or temporary mitigations.
Fixed Versions
If a bug has already been fixed, upgrading may be the fastest resolution.
Before upgrading:
- Verify the bug matches your symptoms.
- Review release notes carefully.
- Confirm upgrade compatibility.
- Test in a non-production environment whenever possible.
Many organizations spend days troubleshooting issues that have already been resolved in newer releases.
Checking known bugs early in the investigation process can save significant time and effort while reducing future stability risks.
Step 9: Apply Corrective Actions
After identifying the likely root cause of the crash, the next step is implementing corrective actions that permanently eliminate the issue.
Avoid the temptation to simply restart the manager and move on.
A successful troubleshooting effort should not only restore service but also reduce the likelihood of future crashes.
Upgrade to a Stable Release
If your investigation points to a known software defect, upgrading to a newer stable release is often the most effective solution.
Many Wazuh Manager core dumps are eventually traced back to bugs that have already been fixed by the development team.
Before upgrading:
- Review release notes
- Verify compatibility requirements
- Back up critical configurations
- Test upgrades in a staging environment
- Validate custom rules and integrations
Pay particular attention to fixes involving:
- analysisd crashes
- memory leaks
- database communication failures
- module instability
- agent communication issues
Fix Resource Bottlenecks
Resource exhaustion is one of the most common causes of manager instability.
If memory, CPU, disk, or file descriptor limitations contributed to the crash, address them before returning the system to production.
Common corrective actions include:
- Increasing available RAM
- Expanding swap space
- Adding CPU resources
- Increasing file descriptor limits
- Expanding storage capacity
- Reducing event ingestion rates
Organizations that proactively address infrastructure bottlenecks often eliminate recurring crash cycles without making any application-level changes.
Correct Configuration Errors
Configuration issues should be corrected immediately once identified.
Examples include:
- Invalid XML syntax
- Incorrect module settings
- Broken integrations
- Unsupported configuration options
- Misconfigured cluster settings
After applying corrections:
- Validate the configuration.
- Restart affected services.
- Review startup logs.
- Monitor for recurring errors.
Repair Corrupted Files
Corrupted files frequently contribute to unexpected process failures.
Files that may require repair or replacement include:
- Internal databases
- Queue files
- Configuration files
- Index data
- Integration artifacts
Potential indicators of corruption include:
- Unexpected parsing failures
- Repeated database errors
- Invalid file format messages
- Consistent crashes during startup
When corruption is suspected, restore affected files from a known-good backup whenever possible.
Remove Faulty Customizations
Customizations often introduce instability, especially after upgrades.
Examples include:
- Custom scripts
- Third-party integrations
- Custom decoders
- Custom rules
- Modified startup procedures
Temporarily remove nonessential customizations and observe system behavior.
If crashes stop occurring, reintroduce customizations individually until the problematic component is identified.
Tune Event Processing Workloads
Large environments frequently overwhelm Wazuh through sheer event volume.
Potential tuning strategies include:
- Filtering unnecessary logs
- Reducing noisy event sources
- Optimizing custom rules
- Limiting excessive FIM activity
- Adjusting scan schedules
- Increasing processing capacity
Related reading:
- How to Stop Wazuh File Integrity Monitoring From Eating Your CPU
- Why Is Wazuh Using High CPU? Troubleshooting Guide
- Fix Wazuh Logcollector Dropped Messages
The goal is ensuring that incoming workloads remain within the capacity of the manager infrastructure.
Preventing Future Wazuh Manager Core Dumps
While troubleshooting is important, prevention is even more valuable.
Organizations that implement proactive monitoring and maintenance practices experience significantly fewer stability incidents than those operating reactively.
The following best practices can dramatically reduce the likelihood of future core dumps.
Keep Wazuh Updated
Running outdated software increases exposure to:
- Known bugs
- Memory leaks
- Stability defects
- Security vulnerabilities
- Compatibility issues
Establish a process for:
- Reviewing release notes
- Evaluating new versions
- Testing upgrades
- Deploying approved updates
According to guidance from the Wazuh project, staying current with supported releases is one of the most effective ways to maintain platform stability.
Monitor Resource Consumption Proactively
Resource-related crashes rarely occur without warning.
Monitor key metrics such as:
- Memory utilization
- CPU usage
- Queue depth
- Disk capacity
- File descriptor usage
- Process restart frequency
Alerting on abnormal trends allows administrators to intervene before instability develops.
Validate Configuration Changes Before Deployment
Every configuration change carries risk.
Before deploying modifications:
- Review syntax carefully
- Validate XML structures
- Test integrations
- Verify dependencies
- Document changes
A formal change validation process can eliminate many avoidable outages.
Test Custom Rules and Decoders in Staging
Custom content should never be deployed directly to production without testing.
A staging environment allows administrators to verify:
- Rule behavior
- Decoder accuracy
- Performance impact
- Compatibility with existing configurations
Many production incidents originate from untested customizations rather than defects in Wazuh itself.
Implement Log and Performance Monitoring
Effective monitoring provides early warning signs before crashes occur.
Track:
- Service restarts
- Error messages
- Queue growth
- Database communication failures
- Memory allocation warnings
- Agent connectivity issues
Monitoring platforms should generate alerts whenever abnormal behavior is detected.
As noted by observability experts at the OpenTelemetry project, early detection of abnormal system behavior is critical for maintaining application reliability.
Establish Routine Health Checks
Periodic health reviews help identify hidden issues before they become critical.
A routine health check may include:
- Reviewing logs
- Verifying module status
- Checking disk utilization
- Examining memory trends
- Confirming agent connectivity
- Reviewing cluster health
Organizations that conduct regular health assessments often discover developing problems long before they cause outages.
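Part of a routine check can be scripted by filtering the status report of the manager's control script. The sketch assumes that /var/ossec/bin/wazuh-control status prints one line per daemon in the form "<name> is running..." or "<name> not running..."; the function name stopped_daemons is hypothetical.

```shell
# Read `wazuh-control status` output on stdin and print daemons reported as not running.
stopped_daemons() {
    awk '/ not running/ {print $1}'
}

# Usage:
#   /var/ossec/bin/wazuh-control status | stopped_daemons
```

Some daemons are expected to be stopped depending on the deployment (for example, cluster components on a single-node manager), so compare the result against a known-good baseline instead of treating every entry as a fault.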
Maintain Sufficient System Capacity
As deployments grow, infrastructure requirements increase.
Many Wazuh environments remain stable for months before suddenly experiencing crashes due to capacity constraints.
Review capacity regularly and plan for:
- Additional agents
- Higher event volumes
- New integrations
- Increased retention periods
- Expanded security monitoring requirements
Maintaining adequate headroom helps prevent resource-related failures and improves overall reliability.
When to Escalate to Wazuh Support
Some crashes cannot be fully diagnosed internally.
If the root cause remains unclear after completing the troubleshooting process, escalation may be necessary.
Providing complete diagnostic information significantly improves the chances of a fast resolution.
Information to Collect Before Opening a Case
Support engineers can only work with the information provided.
Gather as much evidence as possible before opening a ticket or submitting a bug report.
Core Dump Files
Collect:
- Original core dump files
- coredumpctl output
- Crash timestamps
- Associated process names
These files often contain the most valuable diagnostic data.
Backtraces
Generate and save the output of these GDB commands:
bt
thread apply all bt
Backtraces are frequently the first artifact requested by developers.
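Backtraces can be captured without an interactive session by running GDB in batch mode. A minimal sketch: the function name save_backtrace, the GDB override variable, and the example paths in the usage comment are assumptions for illustration.

```shell
# Write backtraces from a core file to a text file using GDB batch mode.
# Set the GDB variable to override the debugger binary (defaults to "gdb").
save_backtrace() {
    local binary="$1"
    local core="$2"
    local out="$3"
    "${GDB:-gdb}" -batch \
        -ex "bt" \
        -ex "thread apply all bt" \
        "$binary" "$core" > "$out" 2>&1
}

# Usage (paths are examples only):
#   save_backtrace /var/ossec/bin/wazuh-analysisd /tmp/core.12345 /tmp/backtrace.txt
```

The resulting text file is small and easy to attach to a ticket, unlike the core file itself.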
Wazuh Logs
Include:
/var/ossec/logs/ossec.log
particularly entries immediately preceding the crash.
Capture:
- Error messages
- Warning messages
- Service restart events
- Database communication failures
Version Information
Document:
- Wazuh Manager version
- Wazuh Agent versions
- Dashboard version
- Indexer version
- Operating system version
- Kernel version
Version details often help identify known bugs quickly.
System Specifications
Provide:
- CPU count
- Available RAM
- Storage configuration
- Number of agents
- Daily event volume
- Cluster architecture
Environmental information helps support engineers reproduce conditions associated with the crash.
Creating a Useful Support Request
A well-prepared support request can reduce troubleshooting time from days to hours.
Diagnostic Information Checklist
Before submitting a case, ensure you have collected:
- Core dump files
- Stack traces
- Wazuh logs
- System logs
- Version information
- Resource utilization data
- Configuration changes made before the crash
- Relevant screenshots or error messages
The more evidence provided, the faster engineers can isolate the root cause.
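Several checklist items can be gathered into one archive. The following is a sketch, not an official collection tool: the function name collect_diagnostics, the archive naming scheme, and the selection of files are all assumptions, and the bundle should be reviewed for sensitive data before it is shared.

```shell
# Bundle ossec.log plus basic host details into a tar.gz archive and print its path.
collect_diagnostics() {
    local log_dir="${1:-/var/ossec/logs}"
    local out="${2:-wazuh-diag-$(date +%Y%m%d-%H%M%S).tar.gz}"
    local work
    work=$(mktemp -d) || return 1
    uname -a > "$work/uname.txt"
    cp "$log_dir/ossec.log" "$work/" 2>/dev/null ||
        echo "ossec.log not found in $log_dir" > "$work/ossec.log.missing"
    # Core dump metadata, if systemd-coredump is in use on this host.
    if command -v coredumpctl >/dev/null 2>&1; then
        coredumpctl list > "$work/coredumps.txt" 2>&1 || true
    fi
    tar -czf "$out" -C "$work" . && echo "$out"
    rm -rf "$work"
}

# Usage:
#   collect_diagnostics /var/ossec/logs /tmp/wazuh-diag.tar.gz
```

Core files themselves are left out on purpose because they can be very large and may contain sensitive memory contents; attach them separately when requested.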
Reproduction Details
One of the most valuable pieces of information is whether the crash can be reproduced consistently.
Document:
- What happened before the crash
- Which component failed
- How frequently it occurs
- Whether specific logs trigger the failure
- Whether certain integrations are involved
- Whether the crash appeared after an upgrade or configuration change
Providing clear reproduction steps dramatically increases the likelihood that developers can identify and fix the underlying problem.
By combining detailed diagnostics, core dump analysis, resource validation, configuration reviews, and proactive monitoring, most Wazuh Manager core dumps can be resolved systematically.
The key is treating every core dump as an opportunity to identify and eliminate the underlying cause rather than simply restoring service and waiting for the next crash.
Frequently Asked Questions (FAQ)
Question: What causes Wazuh Manager core dumps?
Wazuh Manager core dumps can be triggered by a wide range of issues, including:
- Memory exhaustion
- Memory leaks
- Software defects
- Corrupted log data
- Database communication failures
- Faulty custom rules or decoders
- Third-party integration problems
- Disk and filesystem issues
- Operating system dependency conflicts
The core dump itself is not the root cause. It is evidence that a process terminated unexpectedly.
Identifying the underlying trigger requires reviewing logs, analyzing the dump, and examining system health.
Question: Where are Wazuh core dump files stored?
The location depends on your Linux distribution and core dump configuration.
Common locations include:
/var/lib/systemd/coredump/
/var/crash/
/tmp/
Some systems use custom storage paths defined by:
cat /proc/sys/kernel/core_pattern
If you’re unsure where dumps are being stored, use:
find / -type f \( -name "core" -o -name "core.*" \) 2>/dev/null
or:
coredumpctl list
to locate them.
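The core_pattern value itself tells you which of these locations applies: a leading "|" means dumps are piped to a handler program, an absolute path means they are written to that location, and anything else is written relative to the working directory of the crashing process. A small sketch that interprets the value; the function name describe_core_pattern and the message wording are illustrative.

```shell
# Explain where the kernel sends core dumps for a given core_pattern value.
describe_core_pattern() {
    local pattern="${1:-$(cat /proc/sys/kernel/core_pattern)}"
    case "$pattern" in
        "|"*systemd-coredump*) echo "piped to systemd-coredump; use coredumpctl" ;;
        "|"*)                  echo "piped to a handler program: ${pattern#"|"}" ;;
        /*)                    echo "written to absolute path pattern: $pattern" ;;
        *)                     echo "written relative to the working directory of the crashing process: $pattern" ;;
    esac
}

# Usage:
#   describe_core_pattern
```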
Question: How do I know which Wazuh process crashed?
Several methods can help identify the affected process.
Start by reviewing:
coredumpctl info
You can also inspect:
- systemd logs
- kernel logs
- ossec.log
- core dump filenames
Common Wazuh processes that generate core dumps include:
- wazuh-analysisd
- wazuh-remoted
- wazuh-db
- wazuh-modulesd
- wazuh-authd
Identifying the crashed process is one of the most important steps because it narrows the investigation to a specific subsystem.
Question: Can a core dump cause data loss?
A core dump itself does not cause data loss.
However, the crash that generated the dump can interrupt:
- Event processing
- Alert generation
- Agent communication
- Database operations
- Log collection
Depending on the duration of the outage, some security events may be delayed, missed, or lost entirely.
This is why recurring crashes should be treated as high-priority operational incidents.
Question: How do I analyze a Wazuh core dump using GDB?
Install GDB and load the dump file:
gdb /var/ossec/bin/wazuh-analysisd core.12345
For systemd-managed dumps:
coredumpctl gdb <PID>
After loading the dump, generate a stack trace using:
bt
or:
thread apply all bt
The resulting backtrace shows the function calls that occurred before the crash and is often the most valuable artifact during troubleshooting.
Question: Are core dumps always caused by software bugs?
No.
While software defects can certainly generate core dumps, many crashes are caused by environmental problems such as:
- Insufficient memory
- High CPU utilization
- Full disks
- Corrupted databases
- Invalid configurations
- Third-party integrations
- Dependency conflicts
In production environments, resource-related issues are often just as common as software bugs.
Question: Should I delete core dump files after analysis?
Yes, in most cases.
Core dump files can consume significant disk space, especially when large processes crash.
However, do not delete them until:
- Analysis has been completed.
- Backtraces have been collected.
- Required support artifacts have been archived.
- Any support cases have been opened.
Once the necessary information has been preserved, old dumps can usually be removed safely.
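Before removing anything, it helps to list which dumps are old enough to be candidates and how much space they use. A minimal sketch, assuming dumps live in /var/lib/systemd/coredump and that 14 days is an acceptable retention period; both defaults and the function name list_old_dumps are assumptions.

```shell
# List files older than N days in a dump directory, with sizes, without deleting them.
list_old_dumps() {
    local dir="${1:-/var/lib/systemd/coredump}"
    local days="${2:-14}"
    find "$dir" -type f -mtime "+$days" -exec du -h {} +
}

# Usage:
#   list_old_dumps /var/crash 30
```

The function only reports; deletion is left as a deliberate manual step once the artifacts have been archived.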
Question: Can insufficient memory trigger Wazuh Manager crashes?
Absolutely.
Memory shortages are one of the most common causes of Wazuh instability.
When available memory becomes limited, Wazuh components may experience:
- Allocation failures
- Queue growth
- Performance degradation
- Process termination
- Core dump generation
Administrators should regularly monitor:
free -h
and overall memory consumption trends to identify problems before they cause outages.
Question: How can I prevent recurring core dumps?
The most effective prevention strategies include:
- Keeping Wazuh updated
- Monitoring resource utilization
- Testing configuration changes before deployment
- Validating custom rules and decoders
- Reviewing logs regularly
- Performing routine health checks
- Maintaining adequate system capacity
Proactive maintenance is significantly more effective than reacting to crashes after they occur.
Question: When should I contact Wazuh support?
Consider contacting support or opening a bug report when:
- The root cause remains unclear after troubleshooting
- Crashes continue after corrective actions
- The backtrace points to a possible software defect
- Multiple manager components are crashing
- Core dumps appear immediately after upgrades
- You suspect a previously unknown bug
Before escalating, gather:
- Core dump files
- GDB backtraces
- Wazuh logs
- System logs
- Version information
- System specifications
Providing complete diagnostic information dramatically improves the chances of a quick resolution.
Conclusion
Wazuh Manager core dumps are among the most serious indicators of instability within a Wazuh deployment.
While it may be tempting to simply restart the affected service and move on, doing so often leaves the underlying problem unresolved and increases the likelihood of future outages.
A systematic troubleshooting approach is far more effective.
The workflow outlined in this guide moves through the following stages: confirming that a core dump actually occurred, locating the associated dump files, reviewing logs leading up to the crash, analyzing the dump with GDB, validating system resources, checking configurations, investigating database and module failures, and determining whether the issue matches a known software defect.
Throughout the investigation, the primary objective should be identifying the root cause rather than treating the symptoms.
Whether the issue stems from memory exhaustion, malformed log data, corrupted databases, faulty integrations, configuration errors, or a software bug, understanding why the process crashed is the key to preventing it from happening again.
Long-term stability depends on strong operational practices, including:
- Keeping Wazuh updated with supported releases
- Monitoring memory, CPU, disk, and queue utilization
- Testing custom rules and decoders before deployment
- Validating configuration changes carefully
- Performing routine health checks
- Maintaining sufficient infrastructure capacity as deployments grow
By combining proactive monitoring, disciplined change management, and thorough root-cause analysis, administrators can significantly reduce the frequency of Wazuh Manager crashes and maintain a more reliable, resilient, and effective security monitoring platform.