Cluster Statistics Collection Guide
This guide provides an overview of the statistics collection system within the Cluster API platform. It explains how statistics are collected, stored, and accessed, as well as how to interpret the data for monitoring your infrastructure.
The statistics collection system monitors the following aspects of your virtual infrastructure:
- CPU utilization
- Memory usage
- Storage usage and IOPS
- Network bandwidth
- Instance-specific metrics
- System-wide resource utilization
These statistics are collected in real time and stored in RRD (Round-Robin Database) files, which hold time-series data in a fixed amount of space and allow fast retrieval.
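The round-robin idea behind RRD files can be illustrated with a fixed-size buffer that overwrites its oldest sample once it is full. The sketch below is only a conceptual model: it does not read or write the RRD file format, and the class name `RoundRobinSeries` and its methods are hypothetical, not part of the platform.

```python
from collections import deque
from typing import Deque, List, Tuple


class RoundRobinSeries:
    """Fixed-capacity time series that discards the oldest sample when full."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        # deque(maxlen=...) drops items from the left once capacity is reached.
        self._samples: Deque[Tuple[int, float]] = deque(maxlen=capacity)

    def add(self, timestamp: int, value: float) -> None:
        """Record one (timestamp, value) sample."""
        self._samples.append((timestamp, value))

    def samples(self) -> List[Tuple[int, float]]:
        """Return the retained samples, oldest first."""
        return list(self._samples)

    def average(self) -> float:
        """Return the mean of the retained values."""
        if not self._samples:
            raise ValueError("no samples recorded")
        return sum(value for _, value in self._samples) / len(self._samples)
```

Because the capacity is fixed, storage does not grow with time: a series with capacity 3 that receives four samples keeps only the last three.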
Dashboard Statistics
The dashboard provides a high-level overview of your system's resource utilization:
Resource Usage Overview
The dashboard displays the following key metrics:
- Core Usage: Percentage of CPU cores in use across all nodes
- Memory Usage: Percentage of total memory in use across all nodes
- Storage Usage: Percentage of total storage space in use across all datastores
These metrics help you quickly assess the overall health and capacity of your infrastructure.
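As an illustration of how a cluster-wide percentage can be derived from per-node figures, the sketch below sums used and total capacity across nodes before dividing. This guide does not say how the platform aggregates, so treat this as one plausible reading; `NodeCapacity` and `usage_percent` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class NodeCapacity:
    used: float   # e.g. cores in use, or bytes of memory in use
    total: float  # e.g. total cores, or total bytes of memory


def usage_percent(nodes: Iterable[NodeCapacity]) -> float:
    """Return used capacity as a percentage of total capacity across nodes."""
    node_list = list(nodes)
    total = sum(node.total for node in node_list)
    if total == 0:
        # No capacity reported: avoid dividing by zero.
        return 0.0
    used = sum(node.used for node in node_list)
    return 100.0 * used / total
```

Summing before dividing weights large nodes more heavily than small ones, which differs from averaging each node's own percentage.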
Real-time Monitoring
The dashboard also provides real-time monitoring of:
- Total CPU Usage: Average CPU utilization across all physical CPUs
- Total Storage IOPS: Combined read and write operations per second across all storage devices
- Total Bandwidth Usage: Network traffic in and out of the system
These metrics are updated continuously through WebSocket connections, providing immediate visibility into system performance.
Instance Statistics
For each virtual machine instance, the system collects detailed statistics:
CPU Statistics
- Per-vCPU utilization percentage
- Total CPU time consumed by the instance
Disk Statistics
For each virtual disk attached to an instance:
- Read operations per second (IOPS)
- Write operations per second (IOPS)
- Read bandwidth (bytes/sec)
- Write bandwidth (bytes/sec)
Network Statistics
For each network interface:
- Incoming traffic (bytes/sec)
- Outgoing traffic (bytes/sec)
- Packet counts
Memory Statistics
- Memory utilization by the instance
Accessing Statistics
REST API
Statistics can be accessed through the REST API:
- /api/stat_metrics: Get all statistics metrics
- /api/stat_metrics/last10min: Get statistics from the last 10 minutes
- /api/dashboard/usage: Get overall system resource usage
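A minimal client sketch for the endpoints listed above, using only the Python standard library. The base URL, the JSON response format, and the absence of authentication are assumptions; this guide does not specify them, so adjust them to your deployment. The helper names are hypothetical.

```python
import json
from urllib.parse import urljoin
from urllib.request import Request, urlopen

# Paths taken from the endpoint list in this guide.
ENDPOINTS = {
    "all": "/api/stat_metrics",
    "last10min": "/api/stat_metrics/last10min",
    "usage": "/api/dashboard/usage",
}


def endpoint_url(base_url: str, name: str) -> str:
    """Join an assumed base URL with one of the documented paths."""
    return urljoin(base_url.rstrip("/") + "/", ENDPOINTS[name].lstrip("/"))


def fetch_json(base_url: str, name: str, timeout: float = 10.0):
    """GET one endpoint and decode the body, assuming it is JSON."""
    request = Request(endpoint_url(base_url, name),
                      headers={"Accept": "application/json"})
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `fetch_json("https://cluster.example.com", "usage")` would request `/api/dashboard/usage` on that (placeholder) host.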
WebSocket Channels
Real-time statistics are available through WebSocket channels:
- onTotalCpuUsage: Real-time CPU usage across the system
- onTotalStorageIops: Real-time storage IOPS across all disks
- onTotalBandwidthUsage: Real-time network bandwidth usage
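On the client side, messages from these channels need to be routed to the right handler. The sketch below shows one way to do that, independent of any particular WebSocket library. The message envelope (a JSON object with `channel` and `data` fields) is an assumption made for illustration; this guide does not document the wire format, and `ChannelDispatcher` is a hypothetical name.

```python
import json
from typing import Any, Callable, Dict

Handler = Callable[[Any], None]


class ChannelDispatcher:
    """Route incoming messages to handlers registered per channel name."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}

    def on(self, channel: str, handler: Handler) -> None:
        """Register the handler to call for one channel."""
        self._handlers[channel] = handler

    def dispatch(self, raw_message: str) -> bool:
        """Route one message; return True if a handler was called."""
        message = json.loads(raw_message)
        if not isinstance(message, dict):
            return False
        # Assumed envelope: {"channel": "<name>", "data": <payload>}
        handler = self._handlers.get(message.get("channel"))
        if handler is None:
            return False
        handler(message.get("data"))
        return True
```

A client would register a handler with, for example, `dispatcher.on("onTotalCpuUsage", update_cpu_chart)` and call `dispatch` for every text frame it receives.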
Statistics Storage
Statistics are stored in three places:
1. RRD Files: Time-series samples are kept in RRD files, a compact form suited to fast retrieval
2. Database Records: Aggregated statistics are stored in the database for historical analysis
3. Archive Tables: Historical statistics are archived for long-term storage and analysis
Understanding Statistics Data
CPU Utilization
CPU utilization is measured as a percentage of available CPU time. For a virtual CPU (vCPU), it is the percentage of the allocated vCPU that is in use. For a physical CPU, it is the percentage of that physical CPU that is in use.
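To make "percentage of available CPU time" concrete, the sketch below derives a utilization figure from two readings of cumulative CPU time. It assumes the consumed CPU time is available in seconds and that the available time is the wall-clock interval multiplied by the number of CPUs; the function name is hypothetical and the platform may compute the value differently.

```python
def cpu_utilization_percent(cpu_seconds_start: float,
                            cpu_seconds_end: float,
                            wall_seconds: float,
                            cpu_count: int = 1) -> float:
    """Return CPU time consumed as a percentage of CPU time available."""
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    if cpu_count <= 0:
        raise ValueError("cpu_count must be positive")
    busy = cpu_seconds_end - cpu_seconds_start
    available = wall_seconds * cpu_count
    percent = 100.0 * busy / available
    # Clamp to [0, 100] to absorb small timing or rounding errors.
    return max(0.0, min(100.0, percent))
```

For example, an instance with one vCPU that consumes 5 seconds of CPU time during a 10-second interval is at 50% utilization; with two vCPUs, the same consumption is 25%.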
Storage IOPS
Storage IOPS (Input/Output Operations Per Second) measures the number of read and write operations a storage device performs per second. A higher value indicates heavier storage activity.
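IOPS values are commonly derived from cumulative operation counters sampled at two points in time. The sketch below assumes that model; this guide does not state how the platform samples disks, and `DiskSample` and `iops_between` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class DiskSample:
    timestamp: float  # seconds
    read_ops: int     # cumulative completed read operations
    write_ops: int    # cumulative completed write operations


def iops_between(earlier: DiskSample,
                 later: DiskSample) -> Optional[Tuple[float, float]]:
    """Return (read IOPS, write IOPS) between two samples, or None if the
    samples cannot be compared."""
    elapsed = later.timestamp - earlier.timestamp
    if elapsed <= 0:
        return None
    reads = later.read_ops - earlier.read_ops
    writes = later.write_ops - earlier.write_ops
    if reads < 0 or writes < 0:
        # A counter went backwards, e.g. after a reset; skip this interval.
        return None
    return reads / elapsed, writes / elapsed
```

Adding the two values gives the combined read and write figure for that disk.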
Network Bandwidth
Network bandwidth is measured in bytes per second and represents the amount of data being transferred over the network. This is split into incoming traffic (data received) and outgoing traffic (data sent).
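Network link speeds are usually quoted in bits per second, so it can help to convert the bytes-per-second values described here before comparing them with link capacity. The sketch below performs that conversion using decimal megabits (1 Mbit = 1,000,000 bits); the function names are hypothetical.

```python
def bytes_per_sec_to_mbit(bytes_per_sec: float) -> float:
    """Convert bytes per second to megabits per second (decimal units)."""
    if bytes_per_sec < 0:
        raise ValueError("bytes_per_sec must not be negative")
    return bytes_per_sec * 8 / 1_000_000


def total_bandwidth_mbit(incoming_bytes_per_sec: float,
                         outgoing_bytes_per_sec: float) -> float:
    """Return combined incoming and outgoing traffic in Mbit/s."""
    return bytes_per_sec_to_mbit(incoming_bytes_per_sec + outgoing_bytes_per_sec)
```

For example, 125,000,000 bytes/sec corresponds to 1000 Mbit/s, the nominal rate of a gigabit link.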
Advanced Usage
Time Range Analysis
You can analyze statistics for specific time ranges:
GET /api/stat_metrics?start=2023-01-01T00:00:00Z&end=2023-01-02T00:00:00Z
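The query string in the request above can be built from `datetime` values rather than typed by hand. The sketch below formats timezone-aware datetimes as UTC timestamps in the same form as the example; the function names are hypothetical, and the check that `end` follows `start` is an assumption about what the API expects.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode


def to_utc_timestamp(moment: datetime) -> str:
    """Format a timezone-aware datetime as e.g. 2023-01-01T00:00:00Z."""
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def time_range_path(start: datetime, end: datetime) -> str:
    """Build the request path and query string for a time range."""
    if end <= start:
        raise ValueError("end must be later than start")
    query = urlencode(
        {"start": to_utc_timestamp(start), "end": to_utc_timestamp(end)},
        safe=":",  # keep colons readable, matching the example request
    )
    return "/api/stat_metrics?" + query
```

Calling `time_range_path` with 1 January 2023 and 2 January 2023 (both midnight UTC) reproduces the path shown in the example request.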
Instance-Specific Analysis
For detailed analysis of a specific instance:
GET /api/instances/{instance_id}/stats
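When the `{instance_id}` placeholder is filled in programmatically, percent-encoding the identifier keeps the path valid even if it contains reserved characters. The format of instance identifiers is not described in this guide, so the example ID below is invented and the function name is hypothetical.

```python
from urllib.parse import quote


def instance_stats_path(instance_id: str) -> str:
    """Build the statistics path for one instance."""
    if not instance_id:
        raise ValueError("instance_id must not be empty")
    # safe="" also encodes "/", so the ID cannot add extra path segments.
    return "/api/instances/" + quote(instance_id, safe="") + "/stats"
```

For a made-up identifier `vm-42`, this yields `/api/instances/vm-42/stats`.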