@vishesh92
Member

@vishesh92 vishesh92 commented Mar 4, 2024

Description

Fixes #8775
This PR adds an index on the vm_id column of the vm_stats table.

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)
  • build/CI

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • Major
  • Minor

Screenshots (if appropriate):

How Has This Been Tested?

How did you try to break this feature and the system with this change?

@vishesh92 vishesh92 force-pushed the vmstats-add-indexes branch from 5ab6621 to 5faf462 Compare March 4, 2024 11:10
@apache apache deleted a comment from blueorangutan Mar 4, 2024
@vishesh92
Member Author

@blueorangutan package

@blueorangutan

@vishesh92 a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@codecov

codecov bot commented Mar 4, 2024

Codecov Report

Attention: Patch coverage is 12.50%, with 14 lines in your changes missing coverage. Please review.

Project coverage is 30.89%. Comparing base (b82ea3d) to head (a71b7e0).
Report is 16 commits behind head on 4.19.

Files Patch % Lines
...ava/com/cloud/upgrade/dao/Upgrade41900to41910.java 12.50% 14 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               4.19    #8737      +/-   ##
============================================
- Coverage     30.91%   30.89%   -0.03%     
+ Complexity    34244    34242       -2     
============================================
  Files          5354     5355       +1     
  Lines        376071   376643     +572     
  Branches      54693    54807     +114     
============================================
+ Hits         116255   116346      +91     
- Misses       244520   244987     +467     
- Partials      15296    15310      +14     
Flag Coverage Δ
simulator-marvin-tests 24.71% <12.50%> (-0.02%) ⬇️
uitests 4.39% <ø> (+<0.01%) ⬆️
unit-tests 16.57% <12.50%> (-0.02%) ⬇️

@sureshanaparti sureshanaparti added this to the 4.19.1.0 milestone Mar 4, 2024
@blueorangutan

Packaging result [SF]: ✔️ el7 ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 8846

@vishesh92
Member Author

@blueorangutan test

@blueorangutan

@vishesh92 a [SL] Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

@JoaoJandre
Contributor

Hey, @vishesh92

Adding the vm_stats -> timestamp index to the table might not produce the effect that you want. When deleting very specific rows, an index on the table will generally make the deletion much faster. However, when a table has indexes, those must also be updated on deletes (and on inserts too); therefore, if you are deleting large parts of a table, such as all the entries older than a given time, your delete query will probably be slower.

I actually ran into a situation where a user was retaining the last 30 days of VM statistics but decided to start retaining only 7 days. When the retention time changed from 30 days to 7 days, ACS had to delete 23 days' worth of data. Because of the number of VMs and MGMT servers in their environment, they had about 1 billion entries to delete. ACS had trouble performing this operation because the delete was done in a single query that always timed out; thus the table started to grow endlessly. I tried adding the exact index you are proposing, but when I tested deleting 10 million rows from vm_stats, the index actually made the delete take twice as long.

I've found another solution to the vm_stats delete problem. You can check it out in #8740.

@blueorangutan

[SF] Trillian test result (tid-9382)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 47963 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr8737-t9382-kvm-centos7.zip
Smoke tests completed. 129 look OK, 0 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test Result Time (s) Test File

@vishesh92
Member Author

@JoaoJandre I understand what you mean. But for day-to-day operations, the index on timestamp will make the cleanup much faster. The scenario you mentioned only occurs when the operator changes the stats retention interval, which IMO doesn't happen very often. And with the changes you are suggesting in #8740, the query won't time out, since we will be deleting only a small part of the table.
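
The timeout scenario discussed above comes from deleting a very large range in a single statement. A minimal sketch of the general alternative, deleting in bounded batches, is shown here. It uses Python's sqlite3 module as a stand-in for MySQL; the table layout, the `delete_in_batches` helper, and the batch size are assumptions for illustration, and this is not a description of what #8740 actually implements.

```python
import sqlite3


def delete_in_batches(conn, cutoff, batch_size):
    """Delete rows older than `cutoff` in bounded batches; return the total deleted."""
    total = 0
    while True:
        # SQLite has no DELETE ... LIMIT by default, so the batch is selected by id.
        cur = conn.execute(
            "DELETE FROM vm_stats WHERE id IN ("
            "  SELECT id FROM vm_stats WHERE timestamp < ? LIMIT ?)",
            (cutoff, batch_size),
        )
        conn.commit()  # each batch is committed as its own short transaction
        if cur.rowcount == 0:
            return total
        total += cur.rowcount


# Hypothetical table layout: only the columns needed for the example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vm_stats (id INTEGER PRIMARY KEY, vm_id INTEGER, timestamp TEXT)"
)
conn.executemany(
    "INSERT INTO vm_stats (vm_id, timestamp) VALUES (?, ?)",
    [(i % 10, f"2024-03-{(i % 28) + 1:02d} 00:00:00") for i in range(1000)],
)
conn.commit()

deleted = delete_in_batches(conn, "2024-03-08 00:00:00", batch_size=100)
remaining = conn.execute("SELECT COUNT(*) FROM vm_stats").fetchone()[0]
print(deleted, remaining)
```

Each statement touches at most `batch_size` rows, so no single query has to remove the whole backlog at once.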

@JoaoJandre
Contributor

@vishesh92 could you share a description of the tests that you've done and their results?

@vishesh92
Member Author

Sure. I created a table with 661683 entries, with timestamps ranging from 1970-11-29 to 2024-03-07 00:00:00.

EXPLAIN ANALYZE SELECT * FROM vm_stats WHERE timestamp < '2000-01-01' ;

Query plan with index

+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                                                                        |
|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| -> Filter: (vm_stats.`timestamp` < TIMESTAMP'2000-01-01 00:00:00')  (cost=66558 rows=330170) (actual time=0.0104..47.4 rows=282075 loops=1)                                    |
|     -> Covering index range scan on vm_stats using temp_idx over (timestamp < '2000-01-01 00:00:00')  (cost=66558 rows=330170) (actual time=0.00948..35.4 rows=282075 loops=1) |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

Query plan without index

+---------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                                     |
|---------------------------------------------------------------------------------------------------------------------------------------------|
| -> Filter: (vm_stats.`timestamp` < TIMESTAMP'2000-01-01 00:00:00')  (cost=66500 rows=220091) (actual time=0.0198..87.5 rows=282075 loops=1) |
|     -> Table scan on vm_stats  (cost=66500 rows=660340) (actual time=0.0186..64 rows=661683 loops=1)                                        |
+---------------------------------------------------------------------------------------------------------------------------------------------+

As you can see, the index reduced the query time by roughly 45% (from 87.5 ms to 47.4 ms).
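
For readers who want to reproduce the plan comparison without a MySQL server, here is a minimal sketch using Python's sqlite3 module. SQLite's EXPLAIN QUERY PLAN stands in for MySQL's EXPLAIN ANALYZE, so the output format differs and no timings are reported. The table layout and the `query_plan` helper are assumptions; the index name `temp_idx` simply mirrors the one in the plans above.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's query plan for `sql` as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each plan row holds the human-readable detail.
    return " | ".join(row[-1] for row in rows)


# Hypothetical table layout: only the columns needed for the example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vm_stats (id INTEGER PRIMARY KEY, vm_id INTEGER, timestamp TEXT)"
)

sql = "SELECT * FROM vm_stats WHERE timestamp < ?"
cutoff = ("2000-01-01 00:00:00",)

# Plan before the index exists: the planner has to scan the table.
plan_without_index = query_plan(conn, sql, cutoff)

# Plan after adding the index on timestamp.
conn.execute("CREATE INDEX temp_idx ON vm_stats (timestamp)")
plan_with_index = query_plan(conn, sql, cutoff)

print(plan_without_index)
print(plan_with_index)
```

The exact wording of the plan lines depends on the SQLite version, so the sketch only prints them.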

@JoaoJandre
Contributor

@vishesh92, from the description of your PR:

This PR adds the upgrade path for 4.19 to 4.19.1 and creates two missing indexes on vm_stats table
...
2. vm_stats -> timestamp - To speed up clean up of old vm_stats.

You're proposing the addition of an index on the timestamp column to speed up the deletion of old records from vm_stats. Why would you test a SELECT and not a DELETE? Could you share the test results using a DELETE query?

@vishesh92 vishesh92 force-pushed the vmstats-add-indexes branch from 5faf462 to ead5be1 Compare March 6, 2024 17:02
@vishesh92 vishesh92 force-pushed the vmstats-add-indexes branch from ead5be1 to 74eaea3 Compare March 6, 2024 17:08
@vishesh92
Member Author

@JoaoJandre We can't run EXPLAIN ANALYZE with DELETE commands. Adding the index also helps reduce the load on the database while cleaning up the entries. Otherwise, the DB has to go through every record to filter the entries, which can also cause a spike in I/O. I understand that it can cause issues if the administrator reduces the retention period, but having the index will speed up the deletion, which runs every minute.

@JoaoJandre
Contributor

@vishesh92 you are proposing to add an index to improve DELETE operations; however, you are presenting results for SELECT operations. Indeed, indexes can improve SELECT operations, which only read the data. However, the DELETE operation works differently: it removes data, recalculates statistics, reorganizes indexes, and more. Please present some tests and results for DELETE operations; otherwise, there is nothing to justify adding those indexes.

@vishesh92
Member Author

@JoaoJandre Here are the results for a delete operation where the filter doesn't match any rows.

MySQL root@(none):test> DELETE FROM vm_stats WHERE timestamp < '1950-01-01';
Query OK, 0 rows affected
Time: 0.001s
MySQL root@(none):test> DELETE FROM vm_stats WHERE timestamp < '1950-01-01';
Query OK, 0 rows affected
Time: 0.001s

query plan

+------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                        |
|------------------------------------------------------------------------------------------------|
| {                                                                                              |
|   "query_block": {                                                                             |
|     "select_id": 1,                                                                            |
|     "table": {                                                                                 |
|       "delete": true,                                                                          |
|       "table_name": "vm_stats",                                                                |
|       "access_type": "range",                                                                  |
|       "possible_keys": [                                                                       |
|         "temp_idx"                                                                             |
|       ],                                                                                       |
|       "key": "temp_idx",                                                                       |
|       "used_key_parts": [                                                                      |
|         "timestamp"                                                                            |
|       ],                                                                                       |
|       "key_length": "5",                                                                       |
|       "ref": [                                                                                 |
|         "const"                                                                                |
|       ],                                                                                       |
|       "rows_examined_per_scan": 1,                                                             |
|       "filtered": "100.00",                                                                    |
|       "attached_condition": "(`test`.`vm_stats`.`timestamp` < TIMESTAMP'1950-01-01 00:00:00')" |
|     }                                                                                          |
|   }                                                                                            |
| }                                                                                              |
+------------------------------------------------------------------------------------------------+

After dropping the index

MySQL root@(none):test> DELETE FROM vm_stats WHERE timestamp < '1950-01-01';
Query OK, 0 rows affected
Time: 0.364s
MySQL root@(none):test> DELETE FROM vm_stats WHERE timestamp < '1950-01-01';
Query OK, 0 rows affected
Time: 0.362s

query plan

+------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                        |
|------------------------------------------------------------------------------------------------|
| {                                                                                              |
|   "query_block": {                                                                             |
|     "select_id": 1,                                                                            |
|     "table": {                                                                                 |
|       "delete": true,                                                                          |
|       "table_name": "vm_stats",                                                                |
|       "access_type": "ALL",                                                                    |
|       "rows_examined_per_scan": 660340,                                                        |
|       "filtered": "100.00",                                                                    |
|       "attached_condition": "(`test`.`vm_stats`.`timestamp` < TIMESTAMP'1950-01-01 00:00:00')" |
|     }                                                                                          |
|   }                                                                                            |
| }                                                                                              |
+------------------------------------------------------------------------------------------------+

As you can see, without the index it takes around 0.36 seconds because it has to go through every row in the table.
With the index it takes 0.001 seconds because it doesn't need to go through all the rows in the table.

@sureshanaparti
Contributor

@vishesh92 are the vm_stats table data and the timestamp column cardinality the same in both cases? Please check and confirm the results with a filter that matches rows and with the same delete count.

@vishesh92
Member Author

Here are the stats.

MySQL root@(none):test> DELETE FROM vm_stats_without_index WHERE timestamp < '1972-05-01';  -- without index
Query OK, 9213 rows affected
Time: 0.414s
MySQL root@(none):test> DELETE FROM vm_stats_with_index WHERE timestamp < '1972-05-01';   -- with index
Query OK, 9213 rows affected
Time: 0.100s
MySQL root@(none):test> DELETE FROM vm_stats_without_index WHERE timestamp < '1980-01-01';
You're about to run a destructive command.
Do you want to proceed? (y/n): y
Your call!
Query OK, 84840 rows affected
Time: 0.438s
MySQL root@(none):test> DELETE FROM vm_stats_with_index WHERE timestamp < '1980-01-01';
You're about to run a destructive command.
Do you want to proceed? (y/n): y
Your call!
Query OK, 84840 rows affected
Time: 0.375s

For a delete operation with a filter, the database first has to locate the matching records before it can delete them. If no index is present, locating them is slow because every row in the table has to be scanned, which results in high disk I/O.
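
A minimal sketch of the same comparison (identical data in two tables, one with an index on timestamp, same delete count) is shown here using Python's sqlite3 module as a stand-in for MySQL. The column layout, the integer timestamps, the row counts, and the helper names are assumptions for illustration. Timings depend heavily on the engine, the data volume, and the fraction of rows deleted, so the sketch prints them without claiming which side wins.

```python
import sqlite3
import time


def make_table(conn, name, with_index, n_rows):
    """Create a table with n_rows rows, optionally indexed on timestamp."""
    conn.execute(
        f"CREATE TABLE {name} (id INTEGER PRIMARY KEY, vm_id INTEGER, timestamp INTEGER)"
    )
    conn.executemany(
        f"INSERT INTO {name} (vm_id, timestamp) VALUES (?, ?)",
        ((i % 50, i) for i in range(n_rows)),
    )
    if with_index:
        conn.execute(f"CREATE INDEX {name}_ts ON {name} (timestamp)")
    conn.commit()


def timed_delete(conn, name, cutoff):
    """Delete rows older than cutoff; return (rows deleted, elapsed seconds)."""
    start = time.perf_counter()
    cur = conn.execute(f"DELETE FROM {name} WHERE timestamp < ?", (cutoff,))
    conn.commit()
    return cur.rowcount, time.perf_counter() - start


conn = sqlite3.connect(":memory:")
make_table(conn, "vm_stats_without_index", with_index=False, n_rows=100_000)
make_table(conn, "vm_stats_with_index", with_index=True, n_rows=100_000)

# Same filter on both tables, so both deletes remove the same number of rows.
count_plain, secs_plain = timed_delete(conn, "vm_stats_without_index", 1_000)
count_indexed, secs_indexed = timed_delete(conn, "vm_stats_with_index", 1_000)

print(f"without index: {count_plain} rows in {secs_plain:.4f}s")
print(f"with index:    {count_indexed} rows in {secs_indexed:.4f}s")
```

Raising the cutoff so that a large share of the table is deleted exercises the other side of the trade-off raised earlier in the thread, where index maintenance adds to the cost of each deleted row.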

@sureshanaparti
Contributor

Thanks for sharing the results, @vishesh92. It seems there is some improvement with the index.

@JoaoJandre
Contributor

@JoaoJandre We can't run EXPLAIN ANALYZE with DELETE commands. Adding the index also helps reduce the load on the database while cleaning up the entries. Otherwise, the DB has to go through every record to filter the entries, which can also cause a spike in I/O. I understand that it can cause issues if the administrator reduces the retention period, but having the index will speed up the deletion, which runs every minute.

@vishesh92 you are proposing to add an index to improve DELETE operations; however, you are presenting results for SELECT operations. Indeed, indexes can improve SELECT operations, which only read the data. However, the DELETE operation works differently: it removes data, recalculates statistics, reorganizes indexes, and more. Please present some tests and results for DELETE operations; otherwise, there is nothing to justify adding those indexes.

@JoaoJandre Here are the results for a delete operation where the filter doesn't match any rows.

MySQL root@(none):test> DELETE FROM vm_stats WHERE timestamp < '1950-01-01';
Query OK, 0 rows affected
Time: 0.001s
MySQL root@(none):test> DELETE FROM vm_stats WHERE timestamp < '1950-01-01';
Query OK, 0 rows affected
Time: 0.001s

query plan

+------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                        |
|------------------------------------------------------------------------------------------------|
| {                                                                                              |
|   "query_block": {                                                                             |
|     "select_id": 1,                                                                            |
|     "table": {                                                                                 |
|       "delete": true,                                                                          |
|       "table_name": "vm_stats",                                                                |
|       "access_type": "range",                                                                  |
|       "possible_keys": [                                                                       |
|         "temp_idx"                                                                             |
|       ],                                                                                       |
|       "key": "temp_idx",                                                                       |
|       "used_key_parts": [                                                                      |
|         "timestamp"                                                                            |
|       ],                                                                                       |
|       "key_length": "5",                                                                       |
|       "ref": [                                                                                 |
|         "const"                                                                                |
|       ],                                                                                       |
|       "rows_examined_per_scan": 1,                                                             |
|       "filtered": "100.00",                                                                    |
|       "attached_condition": "(`test`.`vm_stats`.`timestamp` < TIMESTAMP'1950-01-01 00:00:00')" |
|     }                                                                                          |
|   }                                                                                            |
| }                                                                                              |
+------------------------------------------------------------------------------------------------+

After dropping the index

MySQL root@(none):test> DELETE FROM vm_stats WHERE timestamp < '1950-01-01';
Query OK, 0 rows affected
Time: 0.364s
MySQL root@(none):test> DELETE FROM vm_stats WHERE timestamp < '1950-01-01';
Query OK, 0 rows affected
Time: 0.362s

query plan

+------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                        |
|------------------------------------------------------------------------------------------------|
| {                                                                                              |
|   "query_block": {                                                                             |
|     "select_id": 1,                                                                            |
|     "table": {                                                                                 |
|       "delete": true,                                                                          |
|       "table_name": "vm_stats",                                                                |
|       "access_type": "ALL",                                                                    |
|       "rows_examined_per_scan": 660340,                                                        |
|       "filtered": "100.00",                                                                    |
|       "attached_condition": "(`test`.`vm_stats`.`timestamp` < TIMESTAMP'1950-01-01 00:00:00')" |
|     }                                                                                          |
|   }                                                                                            |
| }                                                                                              |
+------------------------------------------------------------------------------------------------+

As you can see, without the index it takes around 0.36 seconds because it has to go through each row in the database. It takes 0.001 seconds with the index because it doesn't need to go through all the rows in the table.

@vishesh92, I'm sorry, but I don't understand the point of a DELETE test where no rows are deleted.

I've decided to test/compare the performance of deletion on the vm_stats table using a more realistic scenario. Consider the following:
We have 1000 VMs and 2 MGMT servers, each collecting stats every 30 seconds, with a retention time of 7 days. This gives us a table of around 40 million rows. Thus, I've created a mock vm_stats table with that many rows, using random dates between 2024-03-01 00:00:00 and 2024-03-08 00:00:00, and ran the following tests (regenerating the table when needed):
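
For reference, the "around 40 million rows" estimate can be reproduced with a quick calculation. This is only a sketch: it assumes that each management server writes one row per VM per collection interval, which is what the estimate above implies, and the variable names are made up for illustration.

```python
# Back-of-the-envelope size of vm_stats for the scenario described above.
# Assumption: every management server stores one row per VM per interval.
vms = 1000
mgmt_servers = 2
collection_interval_s = 30
retention_days = 7

# Number of collection rounds kept within the retention window.
samples_per_collector = retention_days * 24 * 60 * 60 // collection_interval_s

# Rows kept in the table once the retention window is full.
total_rows = vms * mgmt_servers * samples_per_collector

# Rows that expire every minute and have to be removed by the cleanup.
rows_per_minute = vms * mgmt_servers * 60 // collection_interval_s

print(total_rows)       # 40320000
print(rows_per_minute)  # 4000
```

The per-minute figure is in line with the 4148 rows touched by the one-minute deletion in the tests below.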

Without adding the proposed indexes, I deleted the rows that had a timestamp lower than 2024-03-01 00:01:00:

MariaDB [teste]> ANALYZE DELETE FROM vm_stats WHERE timestamp < '2024-03-01 00:01:00';
+------+-------------+----------+------+---------------+------+---------+------+----------+-------------+----------+------------+-------------+
| id   | select_type | table    | type | possible_keys | key  | key_len | ref  | rows     | r_rows      | filtered | r_filtered | Extra       |
+------+-------------+----------+------+---------------+------+---------+------+----------+-------------+----------+------------+-------------+
|    1 | SIMPLE      | vm_stats | ALL  | NULL          | NULL | NULL    | NULL | 39379728 | 40198741.00 |   100.00 |       0.01 | Using where |
+------+-------------+----------+------+---------------+------+---------+------+----------+-------------+----------+------------+-------------+
1 row in set (31.688 sec)

Afterwards, I added the indexes and deleted all the rows with a timestamp lower than 2024-03-01 00:02:00 (the ones before 00:01:00 had already been deleted by the earlier test):

MariaDB [teste]> ANALYZE DELETE FROM vm_stats WHERE timestamp < '2024-03-01 00:02:00';
+------+-------------+----------+-------+---------------+--------+---------+------+------+---------+----------+------------+-------------+
| id   | select_type | table    | type  | possible_keys | key    | key_len | ref  | rows | r_rows  | filtered | r_filtered | Extra       |
+------+-------------+----------+-------+---------------+--------+---------+------+------+---------+----------+------------+-------------+
|    1 | SIMPLE      | vm_stats | range | index1        | index1 | 5       | NULL | 4148 | 4148.00 |   100.00 |     100.00 | Using where |
+------+-------------+----------+-------+---------------+--------+---------+------+------+---------+----------+------------+-------------+
1 row in set (0.819 sec)

Great! The indexes made it much faster, roughly 39 times (31.7 s versus 0.8 s). Let's test deleting a whole day's worth of rows then:

Without indexes:

MariaDB [teste]> ANALYZE DELETE FROM vm_stats WHERE timestamp < '2024-03-02 00:00:00';
+------+-------------+----------+------+---------------+------+---------+------+----------+-------------+----------+------------+-------------+
| id   | select_type | table    | type | possible_keys | key  | key_len | ref  | rows     | r_rows      | filtered | r_filtered | Extra       |
+------+-------------+----------+------+---------------+------+---------+------+----------+-------------+----------+------------+-------------+
|    1 | SIMPLE      | vm_stats | ALL  | NULL          | NULL | NULL    | NULL | 39285187 | 40198741.00 |   100.00 |      14.29 | Using where |
+------+-------------+----------+------+---------------+------+---------+------+----------+-------------+----------+------------+-------------+
1 row in set (2 min 57.537 sec)

With indexes:

MariaDB [teste]> ANALYZE DELETE FROM vm_stats WHERE timestamp < '2024-03-02 00:00:00';
+------+-------------+----------+------+---------------+------+---------+------+----------+-------------+----------+------------+-------------+
| id   | select_type | table    | type | possible_keys | key  | key_len | ref  | rows     | r_rows      | filtered | r_filtered | Extra       |
+------+-------------+----------+------+---------------+------+---------+------+----------+-------------+----------+------------+-------------+
|    1 | SIMPLE      | vm_stats | ALL  | index1        | NULL | NULL    | NULL | 39371512 | 40190525.00 |   100.00 |      14.27 | Using where |
+------+-------------+----------+------+---------------+------+---------+------+----------+-------------+----------+------------+-------------+
1 row in set (45 min 50.800 sec)

The performance was about 15 times worse this time with indexes. This happens because of what I've explained in this comment: #8737 (comment). When deleting a small amount of the data (0.01% of it), we can find the specific rows we need to delete much faster using the indexes, and all the overhead of updating them is worth it. However, when deleting a bigger amount of data (14% of the table), we can see the impact of the indexes' overhead: they make the deletion much slower.

The above tests only consider the proposed PR, but I've also tested the performance of the indexes alongside what's proposed on #8740. I've tested making a query to delete a whole day's worth of rows with a limit of 100000 :

Without indexes:

MariaDB [teste]> ANALYZE DELETE FROM vm_stats WHERE timestamp < '2024-03-02 00:00:00' limit 100000;
+------+-------------+----------+------+---------------+------+---------+------+----------+-----------+----------+------------+-------------+
| id   | select_type | table    | type | possible_keys | key  | key_len | ref  | rows     | r_rows    | filtered | r_filtered | Extra       |
+------+-------------+----------+------+---------------+------+---------+------+----------+-----------+----------+------------+-------------+
|    1 | SIMPLE      | vm_stats | ALL  | NULL          | NULL | NULL    | NULL | 39285187 | 697806.00 |   100.00 |      14.33 | Using where |
+------+-------------+----------+------+---------------+------+---------+------+----------+-----------+----------+------------+-------------+
1 row in set (2.075 sec)

With indexes:

MariaDB [teste]> ANALYZE DELETE FROM vm_stats WHERE timestamp < '2024-03-02 00:00:00' limit 100000;
+------+-------------+----------+-------+---------------+--------+---------+------+----------+-----------+----------+------------+-------------+
| id   | select_type | table    | type  | possible_keys | key    | key_len | ref  | rows     | r_rows    | filtered | r_filtered | Extra       |
+------+-------------+----------+-------+---------------+--------+---------+------+----------+-----------+----------+------------+-------------+
|    1 | SIMPLE      | vm_stats | range | index1        | index1 | 5       | NULL | 12540998 | 100000.00 |   100.00 |     100.00 | Using where |
+------+-------------+----------+-------+---------------+--------+---------+------+----------+-----------+----------+------------+-------------+
1 row in set (21.133 sec)

Deleting without indexes was 10 times faster than with indexes.
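
To make the three comparisons easier to check, here is a small sketch that recomputes the ratios from the timings printed in the outputs above. The numbers are copied from those outputs; the helper and the labels are made up for illustration. Note that the first pair deleted two different one-minute windows, so that ratio is only a rough indication.

```python
def to_seconds(minutes, seconds):
    """Convert a 'M min S sec' timing into seconds."""
    return minutes * 60 + seconds

# (seconds without the timestamp index, seconds with it), from the outputs above.
timings = {
    "one minute of rows": (31.688, 0.819),
    "one day of rows": (to_seconds(2, 57.537), to_seconds(45, 50.800)),
    "one day of rows, limit 100000": (2.075, 21.133),
}

# Ratio > 1 means the run with the index was slower.
ratios = {
    label: with_idx / without_idx
    for label, (without_idx, with_idx) in timings.items()
}

for label, ratio in ratios.items():
    if ratio < 1:
        print(f"{label}: with index {1 / ratio:.1f}x faster")
    else:
        print(f"{label}: with index {ratio:.1f}x slower")
```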

Looking at the tests that I've done, I can agree with you that, for the most common case (deleting 1 minute worth of data), the index addition is worth it. However, if at any time the user decides that they want to retain less stats, the query to delete the excess stats might take much longer, and eventually timeout, leading to a snowball where ACS is unable to clean the vm_stats table and it keeps growing, making the problem worse.

Another point is that, while the feature on #8740 is optional, and by default turned off, the indexes proposed here are not optional (at least not without manual DB intervention). Taking this into consideration, I really think that adding the timestamp index is not the best approach, as it might lead to the problems that I've described both here and on #8740.
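
As an illustration of the general idea of bounding each delete with a limit, the sketch below removes expired rows in fixed-size batches until none are left. It is not the implementation from #8740: it uses Python's built-in sqlite3 as a stand-in for MariaDB, a simplified vm_stats table, and a hypothetical helper named delete_in_batches. MariaDB accepts DELETE ... LIMIT directly, as in the tests above; a default SQLite build does not, so the sketch limits a subquery instead.

```python
import sqlite3

def delete_in_batches(conn, cutoff, batch_size):
    """Delete rows older than cutoff in chunks; return how many batches removed rows."""
    batches = 0
    while True:
        cur = conn.execute(
            "DELETE FROM vm_stats WHERE id IN ("
            "SELECT id FROM vm_stats WHERE timestamp < ? LIMIT ?)",
            (cutoff, batch_size),
        )
        conn.commit()  # keep every batch in its own short transaction
        if cur.rowcount == 0:
            break
        batches += 1
    return batches

# Simplified stand-in table: 700 rows spread evenly over 7 days.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vm_stats (id INTEGER PRIMARY KEY, vm_id INTEGER, timestamp TEXT)"
)
conn.executemany(
    "INSERT INTO vm_stats (vm_id, timestamp) VALUES (?, ?)",
    [(i % 10, f"2024-03-0{1 + i % 7} 00:00:00") for i in range(700)],
)
conn.commit()

batches = delete_in_batches(conn, "2024-03-02 00:00:00", 30)
remaining = conn.execute("SELECT COUNT(*) FROM vm_stats").fetchone()[0]
print(batches, remaining)
```

Each statement touches at most batch_size rows, so no single delete has to process a whole day of data at once.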

@kohrar
Contributor

kohrar commented Mar 12, 2024

Thanks for the detailed breakdown on your deletion performance @JoaoJandre. Since you have so many rows in the vm_stats table, what's the performance like when you run a list VM command (e.g. list virtualmachines listall=true)? Without an index on timestamp, are you noticing any significant slowdowns with these commands? I only noticed this after users complained that listing VMs when creating a port forwarding rule was taking >30 seconds to load. Our instance did have vm.stats.increment.metrics enabled, which we've subsequently disabled to help speed up these types of API calls.

However, if at any time the user decides that they want to retain less stats, the query to delete the excess stats might take much longer, and eventually timeout, leading to a snowball where ACS is unable to clean the vm_stats table and it keeps growing, making the problem worse.

Where is the timeout happening and could the timeout be increased? A slow deletion happening in the background isn't as impactful as a slow or unresponsive user interface, so I'd still rather have the index in place.

Member

@rohityadavcloud rohityadavcloud left a comment


LGTM - the change is largely from a large prod. env where table scans/queries were observed like:

SELECT vm_stats.id, vm_stats.vm_id, vm_stats.mgmt_server_id, vm_stats.timestamp, vm_stats.vm_stats_data FROM vm_stats WHERE vm_stats.vm_id =

It seemed that vm_stats table needed an index or two. I'm not a MySQL expert but the change in prod environment took the query from 15s to < 1s.

@GutoVeronezi
Contributor

@JoaoJandre, could you make some tests listing and removing data with the proposed index?

@rohityadavcloud
Member

fwiw @borisstoyanov is helping to test this.

@blueorangutan

[SF] Trillian test result (tid-9478)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 49903 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr8737-t9478-kvm-centos7.zip
Smoke tests completed. 129 look OK, 0 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test Result Time (s) Test File

@borisstoyanov
Contributor

@blueorangutan test matrix

@blueorangutan

@borisstoyanov a [SL] Trillian-Jenkins matrix job (centos7 mgmt + xenserver71, rocky8 mgmt + vmware67u3, centos7 mgmt + kvmcentos7) has been kicked to run smoke tests

@JoaoJandre
Contributor

@JoaoJandre, could you make some tests listing and removing data with the proposed index?

@GutoVeronezi, @vishesh92, @rohityadavcloud, @sureshanaparti, @mlsorensen, @kohrar I've run some tests listing and deleting data from the vm_stats table using the proposed index. I've also taken the liberty of reproducing the tests using an index on (vm_id) only. I've used the same assumptions as in the tests that I've done here: #8737 (comment).

First, I tested two listings: Select all the data on a VM; select the data on a VM between two specific dates.

Without any indexes:

MariaDB [teste]> analyze select * from vm_stats where vm_id = 368;
+------+-------------+----------+------+---------------+------+---------+------+----------+-------------+----------+------------+-------------+
| id   | select_type | table    | type | possible_keys | key  | key_len | ref  | rows     | r_rows      | filtered | r_filtered | Extra       |
+------+-------------+----------+------+---------------+------+---------+------+----------+-------------+----------+------------+-------------+
|    1 | SIMPLE      | vm_stats | ALL  | NULL          | NULL | NULL    | NULL | 39285187 | 40198741.00 |   100.00 |       0.10 | Using where |
+------+-------------+----------+------+---------------+------+---------+------+----------+-------------+----------+------------+-------------+
1 row in set (24.905 sec)
MariaDB [teste]> analyze select * from vm_stats where vm_id = 368 AND (timestamp between '2024-03-02' AND '2024-03-04');
+------+-------------+----------+------+---------------+------+---------+------+----------+-------------+----------+------------+-------------+
| id   | select_type | table    | type | possible_keys | key  | key_len | ref  | rows     | r_rows      | filtered | r_filtered | Extra       |
+------+-------------+----------+------+---------------+------+---------+------+----------+-------------+----------+------------+-------------+
|    1 | SIMPLE      | vm_stats | ALL  | NULL          | NULL | NULL    | NULL | 39285187 | 40198741.00 |   100.00 |       0.03 | Using where |
+------+-------------+----------+------+---------------+------+---------+------+----------+-------------+----------+------------+-------------+
1 row in set (24.528 sec)

Using the (vm_id,timestamp) index:

MariaDB [teste]> analyze select * from vm_stats where vm_id = 368;
+------+-------------+----------+------+---------------+--------+---------+-------+-------+----------+----------+------------+-------+
| id   | select_type | table    | type | possible_keys | key    | key_len | ref   | rows  | r_rows   | filtered | r_filtered | Extra |
+------+-------------+----------+------+---------------+--------+---------+-------+-------+----------+----------+------------+-------+
|    1 | SIMPLE      | vm_stats | ref  | index2        | index2 | 8       | const | 87578 | 40866.00 |   100.00 |     100.00 |       |
+------+-------------+----------+------+---------------+--------+---------+-------+-------+----------+----------+------------+-------+
1 row in set (3.687 sec)

MariaDB [teste]> analyze select * from vm_stats where vm_id = 368 AND (timestamp between '2024-03-02' AND '2024-03-04');
+------+-------------+----------+-------+---------------+--------+---------+------+-------+----------+----------+------------+-----------------------+
| id   | select_type | table    | type  | possible_keys | key    | key_len | ref  | rows  | r_rows   | filtered | r_filtered | Extra                 |
+------+-------------+----------+-------+---------------+--------+---------+------+-------+----------+----------+------------+-----------------------+
|    1 | SIMPLE      | vm_stats | range | index2        | index2 | 13      | NULL | 23986 | 11581.00 |     0.06 |     100.00 | Using index condition |
+------+-------------+----------+-------+---------------+--------+---------+------+-------+----------+----------+------------+-----------------------+
1 row in set (1.025 sec)

Using the (vm_id) index:

MariaDB [teste]> analyze select * from vm_stats where vm_id = 368;
+------+-------------+----------+------+---------------+--------+---------+-------+-------+----------+----------+------------+-------+
| id   | select_type | table    | type | possible_keys | key    | key_len | ref   | rows  | r_rows   | filtered | r_filtered | Extra |
+------+-------------+----------+------+---------------+--------+---------+-------+-------+----------+----------+------------+-------+
|    1 | SIMPLE      | vm_stats | ref  | index1        | index1 | 8       | const | 87528 | 40866.00 |   100.00 |     100.00 |       |
+------+-------------+----------+------+---------------+--------+---------+-------+-------+----------+----------+------------+-------+
1 row in set (3.381 sec)
MariaDB [teste]> analyze select * from vm_stats where vm_id = 368 AND (timestamp between '2024-03-02' AND '2024-03-04');
+------+-------------+----------+------+---------------+--------+---------+-------+-------+----------+----------+------------+-------------+
| id   | select_type | table    | type | possible_keys | key    | key_len | ref   | rows  | r_rows   | filtered | r_filtered | Extra       |
+------+-------------+----------+------+---------------+--------+---------+-------+-------+----------+----------+------------+-------------+
|    1 | SIMPLE      | vm_stats | ref  | index1        | index1 | 8       | const | 87528 | 40866.00 |   100.00 |      28.34 | Using where |
+------+-------------+----------+------+---------------+--------+---------+-------+-------+----------+----------+------------+-------------+
1 row in set (3.365 sec)

Again, listing with indexes is much faster: the selects with an index are roughly an order of magnitude faster than the selects without one (about 3.5 s versus about 25 s). Furthermore, on the select with a condition on timestamp, the (vm_id,timestamp) index is about three times faster than the (vm_id) index (1.0 s versus 3.4 s).

Then, I did three delete tests: deleting one minute of data, deleting a whole day of data, and deleting a whole day of data with limit 100000.

Without any indexes:

MariaDB [teste]> ANALYZE delete from vm_stats where `timestamp` <= '2024-03-01 00:03:00';
+------+-------------+----------+------+---------------+------+---------+------+----------+-------------+----------+------------+-------------+
| id   | select_type | table    | type | possible_keys | key  | key_len | ref  | rows     | r_rows      | filtered | r_filtered | Extra       |
+------+-------------+----------+------+---------------+------+---------+------+----------+-------------+----------+------------+-------------+
|    1 | SIMPLE      | vm_stats | ALL  | NULL          | NULL | NULL    | NULL | 39285187 | 40190453.00 |   100.00 |       0.01 | Using where |
+------+-------------+----------+------+---------------+------+---------+------+----------+-------------+----------+------------+-------------+
1 row in set (30.399 sec)
MariaDB [teste]> ANALYZE delete from vm_stats where `timestamp` <= '2024-03-02 00:00:00';
+------+-------------+----------+------+---------------+------+---------+------+----------+-------------+----------+------------+-------------+
| id   | select_type | table    | type | possible_keys | key  | key_len | ref  | rows     | r_rows      | filtered | r_filtered | Extra       |
+------+-------------+----------+------+---------------+------+---------+------+----------+-------------+----------+------------+-------------+
|    1 | SIMPLE      | vm_stats | ALL  | NULL          | NULL | NULL    | NULL | 39281076 | 40186342.00 |   100.00 |      14.26 | Using where |
+------+-------------+----------+------+---------------+------+---------+------+----------+-------------+----------+------------+-------------+
1 row in set (2 min 50.000 sec)
MariaDB [teste]> ANALYZE delete from vm_stats where `timestamp` <= '2024-03-02 00:00:00' limit 100000;
+------+-------------+----------+------+---------------+------+---------+------+----------+-----------+----------+------------+-------------+
| id   | select_type | table    | type | possible_keys | key  | key_len | ref  | rows     | r_rows    | filtered | r_filtered | Extra       |
+------+-------------+----------+------+---------------+------+---------+------+----------+-----------+----------+------------+-------------+
|    1 | SIMPLE      | vm_stats | ALL  | NULL          | NULL | NULL    | NULL | 39285187 | 697806.00 |   100.00 |      14.33 | Using where |
+------+-------------+----------+------+---------------+------+---------+------+----------+-----------+----------+------------+-------------+
1 row in set (2.023 sec)

Using the (vm_id,timestamp) index:

MariaDB [teste]> ANALYZE delete from vm_stats where `timestamp` <= '2024-03-01 00:01:00';
+------+-------------+----------+------+---------------+------+---------+------+----------+-------------+----------+------------+-------------+
| id   | select_type | table    | type | possible_keys | key  | key_len | ref  | rows     | r_rows      | filtered | r_filtered | Extra       |
+------+-------------+----------+------+---------------+------+---------+------+----------+-------------+----------+------------+-------------+
|    1 | SIMPLE      | vm_stats | ALL  | NULL          | NULL | NULL    | NULL | 39285187 | 40198741.00 |   100.00 |       0.01 | Using where |
+------+-------------+----------+------+---------------+------+---------+------+----------+-------------+----------+------------+-------------+
1 row in set (30.866 sec)
MariaDB [teste]> ANALYZE delete from vm_stats where `timestamp` <= '2024-03-02 00:00:00';
+------+-------------+----------+------+---------------+------+---------+------+----------+-------------+----------+------------+-------------+
| id   | select_type | table    | type | possible_keys | key  | key_len | ref  | rows     | r_rows      | filtered | r_filtered | Extra       |
+------+-------------+----------+------+---------------+------+---------+------+----------+-------------+----------+------------+-------------+
|    1 | SIMPLE      | vm_stats | ALL  | NULL          | NULL | NULL    | NULL | 39285187 | 40198741.00 |   100.00 |      14.29 | Using where |
+------+-------------+----------+------+---------------+------+---------+------+----------+-------------+----------+------------+-------------+
1 row in set (22 min 6.165 sec)
MariaDB [teste]> ANALYZE delete from vm_stats where `timestamp` <= '2024-03-02 00:00:00' limit 100000;
+------+-------------+----------+------+---------------+------+---------+------+----------+------------+----------+------------+-------------+
| id   | select_type | table    | type | possible_keys | key  | key_len | ref  | rows     | r_rows     | filtered | r_filtered | Extra       |
+------+-------------+----------+------+---------------+------+---------+------+----------+------------+----------+------------+-------------+
|    1 | SIMPLE      | vm_stats | ALL  | NULL          | NULL | NULL    | NULL | 39185187 | 1900366.00 |   100.00 |       5.26 | Using where |
+------+-------------+----------+------+---------------+------+---------+------+----------+------------+----------+------------+-------------+
1 row in set (14.068 sec)

Using the (vm_id) index:

MariaDB [teste]> ANALYZE delete from vm_stats where `timestamp` <= '2024-03-01 00:02:00';
+------+-------------+----------+------+---------------+------+---------+------+----------+-------------+----------+------------+-------------+
| id   | select_type | table    | type | possible_keys | key  | key_len | ref  | rows     | r_rows      | filtered | r_filtered | Extra       |
+------+-------------+----------+------+---------------+------+---------+------+----------+-------------+----------+------------+-------------+
|    1 | SIMPLE      | vm_stats | ALL  | NULL          | NULL | NULL    | NULL | 39285187 | 40194609.00 |   100.00 |       0.01 | Using where |
+------+-------------+----------+------+---------------+------+---------+------+----------+-------------+----------+------------+-------------+
1 row in set (31.899 sec)
MariaDB [teste]> ANALYZE delete from vm_stats where `timestamp` <= '2024-03-02 00:00:00';
+------+-------------+----------+------+---------------+------+---------+------+----------+-------------+----------+------------+-------------+
| id   | select_type | table    | type | possible_keys | key  | key_len | ref  | rows     | r_rows      | filtered | r_filtered | Extra       |
+------+-------------+----------+------+---------------+------+---------+------+----------+-------------+----------+------------+-------------+
|    1 | SIMPLE      | vm_stats | ALL  | NULL          | NULL | NULL    | NULL | 39285187 | 40198741.00 |   100.00 |      14.29 | Using where |
+------+-------------+----------+------+---------------+------+---------+------+----------+-------------+----------+------------+-------------+
1 row in set (7 min 28.040 sec)
MariaDB [teste]> ANALYZE delete from vm_stats where `timestamp` <= '2024-03-02 00:00:00' limit 100000;
+------+-------------+----------+------+---------------+------+---------+------+----------+------------+----------+------------+-------------+
| id   | select_type | table    | type | possible_keys | key  | key_len | ref  | rows     | r_rows     | filtered | r_filtered | Extra       |
+------+-------------+----------+------+---------------+------+---------+------+----------+------------+----------+------------+-------------+
|    1 | SIMPLE      | vm_stats | ALL  | NULL          | NULL | NULL    | NULL | 39185187 | 1299394.00 |   100.00 |       7.70 | Using where |
+------+-------------+----------+------+---------------+------+---------+------+----------+------------+----------+------------+-------------+
1 row in set (5.265 sec)

From the delete tests, we can once again see that, when deleting small amounts of data, the index overhead is unnoticeable. However, when deleting larger parts of the table, the indexes make a large difference: the deletion with the (vm_id,timestamp) index (22 min 6 s) is almost an order of magnitude slower than the deletion without any indexes (2 min 50 s). The deletion with the (vm_id) index (7 min 28 s) stays in the same order of magnitude as the no-index deletion, being about 2.6 times slower.
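
The delete plans above show type ALL with no possible keys even when an index exists. A likely explanation is that both indexes start with vm_id, and a B-tree index generally cannot be used for a range filter on a column that is not its leading column. The sketch below illustrates this with Python's built-in sqlite3 as a stand-in for MariaDB; the table layout and the index name idx_vm_ts are assumptions made for the example, and the timestamp filter is shown as a SELECT so that the plan is easy to read.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vm_stats ("
    "id INTEGER PRIMARY KEY, vm_id INTEGER, timestamp TEXT, vm_stats_data TEXT)"
)
# Composite index whose leading column is vm_id, as in the proposed change.
conn.execute("CREATE INDEX idx_vm_ts ON vm_stats (vm_id, timestamp)")

def plan(sql):
    """Return the human-readable plan lines SQLite reports for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)

# Filter on the leading column: the index can be used.
by_vm = plan("SELECT * FROM vm_stats WHERE vm_id = 368")

# Filter on timestamp only (the cleanup's WHERE clause): full table scan.
by_time = plan("SELECT * FROM vm_stats WHERE timestamp <= '2024-03-02 00:00:00'")

print(by_vm)
print(by_time)
```

In this stand-in, the first plan searches idx_vm_ts and the second one scans the table, which matches the behaviour seen in the MariaDB outputs. The index is therefore pure maintenance overhead for the timestamp-based cleanup.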

Looking at the test results, we can make the following conclusions:

  1. Selects without any indexes take too long (who would've thought?);
  2. While the (vm_id,timestamp) index makes the select faster when filtering for both vm_id and timestamp, the deletion is much slower;
  3. The (vm_id) index lowers the select time considerably, in the same order of magnitude as the (vm_id,timestamp) speed-up, while being only a bit slower when selecting by timestamp. Moreover, the deletion is slower than without an index; however, it is still in the same order of magnitude.

Therefore, based on these results, I believe that we could change the proposed (vm_id,timestamp) index to a (vm_id) index, as the (vm_id) index offers a good speed boost for listing and a minimal slowdown when deleting, especially when used with the feature proposed in #8740.

}

private void addIndexes(Connection conn) {
DbUpgradeUtils.addIndexIfNeeded(conn, "vm_stats", "vm_id", "timestamp");
Contributor


Suggested change
- DbUpgradeUtils.addIndexIfNeeded(conn, "vm_stats", "vm_id", "timestamp");
+ DbUpgradeUtils.addIndexIfNeeded(conn, "vm_stats", "vm_id");

@vishesh92
Member Author

@JoaoJandre Can you share query times for this query?

SELECT vm_stats.id, vm_stats.vm_id, vm_stats.mgmt_server_id, vm_stats.timestamp, vm_stats.vm_stats_data FROM vm_stats WHERE vm_stats.vm_id = 368  ORDER BY vm_stats.timestamp DESC;
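
One reason this query may be worth measuring separately is the ORDER BY on timestamp: with a composite (vm_id, timestamp) index, the rows for one VM can be read in timestamp order straight from the index, while an index on (vm_id) alone still needs a separate sort step. The sketch below shows the difference with Python's built-in sqlite3 as a stand-in for MariaDB, so the actual MariaDB plans and timings may differ. The table layout, the index name idx and the helper plan_with_index are assumptions made for the example.

```python
import sqlite3

QUERY = (
    "SELECT id, vm_id, mgmt_server_id, timestamp, vm_stats_data "
    "FROM vm_stats WHERE vm_id = 368 ORDER BY timestamp DESC"
)

def plan_with_index(index_columns):
    """Build an empty stand-in table with one index and return the query plan lines."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE vm_stats (id INTEGER PRIMARY KEY, vm_id INTEGER, "
        "mgmt_server_id INTEGER, timestamp TEXT, vm_stats_data TEXT)"
    )
    conn.execute(f"CREATE INDEX idx ON vm_stats ({index_columns})")
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY).fetchall()
    conn.close()
    return [row[3] for row in rows]

single = plan_with_index("vm_id")
composite = plan_with_index("vm_id, timestamp")

print(single)     # includes a temporary B-tree step for the ORDER BY
print(composite)  # index search only, no separate sort step
```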

@blueorangutan

[SF] Trillian test result (tid-9496)
Environment: xenserver-71 (x2), Advanced Networking with Mgmt server 7
Total time taken: 48441 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr8737-t9496-xenserver-71.zip
Smoke tests completed. 128 look OK, 1 have errors, 0 did not run
Only failed and skipped tests results shown below:

| Test | Result | Time (s) | Test File |
|------|--------|----------|-----------|
| test_01_cancel_host_maintenace_with_no_migration_jobs | Error | 1040.48 | test_host_maintenance.py |

@blueorangutan

[SF] Trillian test result (tid-9497)
Environment: vmware-67u3 (x2), Advanced Networking with Mgmt server r8
Total time taken: 48937 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr8737-t9497-vmware-67u3.zip
Smoke tests completed. 129 look OK, 0 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test Result Time (s) Test File

@blueorangutan

[SF] Trillian test result (tid-9498)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 49936 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr8737-t9498-kvm-centos7.zip
Smoke tests completed. 123 look OK, 6 have errors, 0 did not run
Only failed and skipped tests results shown below:

| Test | Result | Time (s) | Test File |
|------|--------|----------|-----------|
| test_02_unsecure_vm_migration | Error | 224.53 | test_vm_life_cycle.py |
| test_03_secured_to_nonsecured_vm_migration | Error | 273.24 | test_vm_life_cycle.py |
| test_01_create_vm_snapshots | Failure | 810.66 | test_vm_snapshots.py |
| test_02_revert_vm_snapshots | Failure | 810.79 | test_vm_snapshots.py |
| test_03_delete_vm_snapshots | Failure | 0.02 | test_vm_snapshots.py |
| test_04_deploy_vnf_appliance | Error | 102.44 | test_vnf_templates.py |
| test_04_deploy_vnf_appliance | Error | 102.44 | test_vnf_templates.py |
| test_05_delete_vnf_template | Error | 1.07 | test_vnf_templates.py |
| ContextSuite context=TestVnfTemplates>:teardown | Error | 1.15 | test_vnf_templates.py |
| test_01_create_redundant_VPC_2tiers_4VMs_4IPs_4PF_ACL | Error | 372.69 | test_vpc_redundant.py |
| test_02_redundant_VPC_default_routes | Error | 362.60 | test_vpc_redundant.py |
| test_03_create_redundant_VPC_1tier_2VMs_2IPs_2PF_ACL_reboot_routers | Error | 240.13 | test_vpc_redundant.py |
| test_04_rvpc_network_garbage_collector_nics | Error | 186.84 | test_vpc_redundant.py |
| test_05_rvpc_multi_tiers | Error | 277.02 | test_vpc_redundant.py |
| test_05_rvpc_multi_tiers | Error | 277.04 | test_vpc_redundant.py |
| test_01_VPC_nics_after_destroy | Error | 185.27 | test_vpc_router_nics.py |
| test_02_VPC_default_routes | Error | 193.49 | test_vpc_router_nics.py |
| test_01_redundant_vpc_site2site_vpn | Failure | 423.11 | test_vpc_vpn.py |
| test_01_vpc_site2site_vpn_multiple_options | Failure | 245.56 | test_vpc_vpn.py |
| test_01_vpc_site2site_vpn | Failure | 271.98 | test_vpc_vpn.py |

@JoaoJandre
Contributor

Sure @vishesh92, here are the results:

No index:

MariaDB [teste]> ANALYZE SELECT vm_stats.id, vm_stats.vm_id, vm_stats.mgmt_server_id, vm_stats.timestamp, vm_stats.vm_stats_data FROM vm_stats WHERE vm_stats.vm_id = 368  ORDER BY vm_stats.timestamp DESC;
+------+-------------+----------+------+---------------+------+---------+------+----------+----------+----------+------------+-----------------------------+
| id   | select_type | table    | type | possible_keys | key  | key_len | ref  | rows     | r_rows   | filtered | r_filtered | Extra                       |
+------+-------------+----------+------+---------------+------+---------+------+----------+----------+----------+------------+-----------------------------+
|    1 | SIMPLE      | vm_stats | ALL  | NULL          | NULL | NULL    | NULL | 39285187 | 40866.00 |   100.00 |     100.00 | Using where; Using filesort |
+------+-------------+----------+------+---------------+------+---------+------+----------+----------+----------+------------+-----------------------------+
1 row in set (27.276 sec)

Index on (vm_id, timestamp):

MariaDB [teste]> ANALYZE SELECT vm_stats.id, vm_stats.vm_id, vm_stats.mgmt_server_id, vm_stats.timestamp, vm_stats.vm_stats_data FROM vm_stats WHERE vm_stats.vm_id = 368  ORDER BY vm_stats.timestamp DESC;
+------+-------------+----------+------+---------------+--------+---------+-------+-------+----------+----------+------------+-------------+
| id   | select_type | table    | type | possible_keys | key    | key_len | ref   | rows  | r_rows   | filtered | r_filtered | Extra       |
+------+-------------+----------+------+---------------+--------+---------+-------+-------+----------+----------+------------+-------------+
|    1 | SIMPLE      | vm_stats | ref  | index2        | index2 | 8       | const | 87578 | 40866.00 |   100.00 |     100.00 | Using where |
+------+-------------+----------+------+---------------+--------+---------+-------+-------+----------+----------+------------+-------------+
1 row in set (4.544 sec)

Index on (vm_id):

MariaDB [teste]> ANALYZE SELECT vm_stats.id, vm_stats.vm_id, vm_stats.mgmt_server_id, vm_stats.timestamp, vm_stats.vm_stats_data FROM vm_stats WHERE vm_stats.vm_id = 368  ORDER BY vm_stats.timestamp DESC;
+------+-------------+----------+------+---------------+--------+---------+-------+-------+----------+----------+------------+-----------------------------+
| id   | select_type | table    | type | possible_keys | key    | key_len | ref   | rows  | r_rows   | filtered | r_filtered | Extra                       |
+------+-------------+----------+------+---------------+--------+---------+-------+-------+----------+----------+------------+-----------------------------+
|    1 | SIMPLE      | vm_stats | ref  | index2        | index2 | 8       | const | 87528 | 40866.00 |   100.00 |     100.00 | Using where; Using filesort |
+------+-------------+----------+------+---------------+--------+---------+-------+-------+----------+----------+------------+-----------------------------+
1 row in set (8.123 sec)

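The results and conclusions are the same. With no index the query is very slow (27.276 sec): it scans the whole table and then does a filesort. The (vm_id, timestamp) index is the fastest for listing when ordering by timestamp (4.544 sec), since the filesort is no longer needed. The (vm_id) index is much faster than no index (8.123 sec), but it still needs the filesort, so it is slower than the (vm_id, timestamp) index.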

One point to consider is that this query returns a whole week's worth of data for one VM and sorts it. That amounts to 40866 rows in this example, which have to be selected from roughly 40 million rows in the table, so it is expected to take some time.
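For anyone who wants to see the filesort difference without a 40-million-row table, here is a minimal sketch. It is only an illustration under stated assumptions: it uses SQLite from the Python standard library as a stand-in for MariaDB (the optimizers differ, so timings and plan wording will not match the output above), the column types are simplified guesses, and the index name `idx_demo` is hypothetical. It only checks whether the planner needs a separate sort step for `ORDER BY timestamp DESC` under each indexing option.

```python
import sqlite3

# Same shape as the query analyzed above; column types below are assumptions.
QUERY = (
    "SELECT id, vm_id, mgmt_server_id, timestamp, vm_stats_data "
    "FROM vm_stats WHERE vm_id = 368 ORDER BY timestamp DESC"
)


def query_plan(index_columns=None):
    """Return SQLite's plan for QUERY, optionally with an index on index_columns."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE vm_stats ("
        "id INTEGER PRIMARY KEY, vm_id INTEGER, mgmt_server_id INTEGER, "
        "timestamp TEXT, vm_stats_data TEXT)"
    )
    if index_columns:
        # idx_demo is a hypothetical name, not the one used in the PR.
        conn.execute(f"CREATE INDEX idx_demo ON vm_stats ({index_columns})")
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY).fetchall()
    conn.close()
    # The fourth column of each plan row holds the human-readable detail.
    return " | ".join(row[3] for row in rows)


for label, columns in [
    ("no index", None),
    ("(vm_id)", "vm_id"),
    ("(vm_id, timestamp)", "vm_id, timestamp"),
]:
    print(f"{label}: {query_plan(columns)}")
```

In SQLite a separate sort shows up in the plan as a temporary B-tree for the ORDER BY, which plays the role of `Using filesort` in the MariaDB output. The composite index lets the engine read the matching rows already ordered by timestamp, which is the same reason the (vm_id, timestamp) run above has no filesort.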

@vishesh92
Member Author

@blueorangutan package

@blueorangutan

@vishesh92 a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan

Packaging result [SF]: ✔️ el7 ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 8969

@rohityadavcloud
Member

@blueorangutan test

@blueorangutan

@rohityadavcloud a [SL] Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

@blueorangutan

[SF] Trillian test result (tid-9524)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 43389 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr8737-t9524-kvm-centos7.zip
Smoke tests completed. 128 look OK, 1 have errors, 0 did not run
Only failed and skipped tests results shown below:

| Test | Result | Time (s) | Test File |
| --- | --- | --- | --- |
| test_02_trigger_shutdown | Failure | 336.56 | test_safe_shutdown.py |

Contributor

@JoaoJandre left a comment

LGTM, only one question

Contributor

@borisstoyanov left a comment

LGTM

@rohityadavcloud
Member

Merging based on reviews, smoketests and Bobby's manual testing.

@rohityadavcloud merged commit 24d5c45 into apache:4.19 on Mar 21, 2024
@rohityadavcloud deleted the vmstats-add-indexes branch on March 21, 2024 08:35
dhslove pushed a commit to ablecloud-team/ablestack-cloud that referenced this pull request Mar 25, 2024
* Add indexes for vm_stats

* Remove index on timestamp

* Change index from vm_id,timestamp to vm_id

9 participants