Skip to content

DB: vm_stats table missing indices #8775

@kohrar

Description

@kohrar
ISSUE TYPE
  • Improvement Request
COMPONENT NAME
DB
CLOUDSTACK VERSION
4.18.1
CONFIGURATION

Standard install

OS / ENVIRONMENT

Rocky Linux 8

SUMMARY

The vm_stats table only has 1 index on id:

mysql> SHOW INDEX FROM vm_stats;
+----------+------------+-----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| Table    | Non_unique | Key_name        | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression |
+----------+------------+-----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| vm_stats |          0 | PRIMARY         |            1 | id          | A         |     1197045 |     NULL |   NULL |      | BTREE      |         |               | YES     | NULL       |
+----------+------------+-----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
1 rows in set (0.00 sec)

However, on my CloudStack instance, I see that MySQL is quite busy with these SQL queries (when listing VMs) which requires sorting on the timestamp and filtering on vm_id. Both of these columns are not indexed.

SELECT vm_stats.id, vm_stats.vm_id, vm_stats.mgmt_server_id, vm_stats.timestamp, vm_stats.vm_stats_data FROM vm_stats WHERE vm_stats.vm_id = 1273  ORDER BY vm_stats.timestamp DESC 

Our system keeps VM stats for 1 week with a polling interval of 60 seconds. We have about 1.2 million records. A query such as the one above takes about 1 second to complete and is suboptimal due to it needing a full table scan.

Beacuse of the types of queries being executed on this table, I propose adding 2 additional indices to the vm_stats table:

  1. timestamp
  2. vm_id

I added the following indices manually to test if the performance was better, and it was (from 1+ second to under 0.1s in my setup). The database system load was also dramatically reduced.

create index vmid_index on vm_stats (vm_id);
create index timestamp_index on vm_stats (timestamp);
STEPS TO REPRODUCE
  1. Have stats enabled and have vm_stats with lots of data.
  2. Run list virtualmachines listall=true in cmk or via API
  3. Look at the processlist on mysql.
EXPECTED RESULTS

Have a reasonable MySQL load when running VM list commands

ACTUAL RESULTS

High MySQL load and slow queries when listing VM commands.

Metadata

Metadata

Assignees

Labels

No labels
No labels

Type

No type

Projects

Milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions