Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Postgres high CPU usage #68

Open
Phu2 opened this issue Dec 4, 2024 · 8 comments
Open

Postgres high CPU usage #68

Phu2 opened this issue Dec 4, 2024 · 8 comments

Comments

@Phu2
Copy link
Contributor

Phu2 commented Dec 4, 2024

From monit alert:

	Date:        Wed, 04 Dec 2024 12:27:58
...
	Description: cpu user usage of 97.7% matches resource limit [cpu user usage > 80.0%]

postgres processes on production server, sorted by CPU usage:

sudo ps --sort=-pcpu auxf | grep postgres
...
100070   1848558  0.0  0.1 191460 10232 ?        Ss   04:57   0:00  \_ postgres
100070   1902894 36.3  1.3 240872 105844 ?       Rs   12:23  19:29      \_ postgres: parallel worker for PID 2429   
100070   1903199 36.2  1.2 240876 102528 ?       Rs   12:25  18:38      \_ postgres: parallel worker for PID 2443   
100070   1902937 36.0  1.4 244784 114172 ?       Rs   12:23  19:14      \_ postgres: strapi strapi 192.168.192.3(56116) SELECT
100070   1902939 36.0  1.2 244744 104956 ?       Rs   12:23  19:12      \_ postgres: strapi strapi 192.168.192.3(33680) SELECT
100070   1902959 36.0  1.2 240868 99116 ?        Rs   12:23  19:11      \_ postgres: parallel worker for PID 2445   
100070   1903070 36.0  1.3 236568 110524 ?       Rs   12:24  18:56      \_ postgres: strapi strapi 192.168.192.3(34080) SELECT
100070   1903094 36.0  1.3 236600 113232 ?       Rs   12:24  18:49      \_ postgres: strapi strapi 192.168.192.3(43746) SELECT
100070   1902960 35.8  1.0 240648 81928 ?        Rs   12:23  19:04      \_ postgres: strapi strapi 192.168.192.3(33694) SELECT
100070   1902570 34.4  1.9 240812 161028 ?       Rs   12:20  19:27      \_ postgres: strapi strapi 192.168.192.3(56312) SELECT
100070   1902576 34.4  1.7 244672 140004 ?       Rs   12:20  19:29      \_ postgres: strapi strapi 192.168.192.3(56374) SELECT
100070   1902913 12.8  1.1 232680 94660 ?        Ss   12:23   6:52      \_ postgres: parallel worker for PID 2423   
100070   1902985  0.4  0.9 232672 74568 ?        Ss   12:23   0:13      \_ postgres: parallel worker for PID 2448   
100070   1848690  0.0  0.1 191516 14848 ?        Ss   04:57   0:00      \_ postgres: checkpointer   
100070   1848691  0.0  0.0 191472  7172 ?        Ss   04:57   0:00      \_ postgres: background writer   
100070   1848692  0.0  0.0 191460  4464 ?        Ss   04:57   0:00      \_ postgres: walwriter   
100070   1848693  0.0  0.0 191980  2768 ?        Ss   04:57   0:00      \_ postgres: autovacuum launcher   
100070   1848694  0.0  0.0  45964  1232 ?        Ss   04:57   0:00      \_ postgres: stats collector   
100070   1848695  0.0  0.0 191888  1432 ?        Ss   04:57   0:00      \_ postgres: logical replication launcher   

Docker Stats:

sudo docker stats
CONTAINER ID   NAME                                           CPU %     MEM USAGE / LIMIT     MEM %     NET I/O           BLOCK I/O         PIDS
b78d7ff9c594   strapi-rpb                                     0.02%     126MiB / 7.763GiB     1.58%     72.3MB / 96.9MB   114MB / 13.9MB    22
7e280fa48953   strapi-rpbDB                                   422.24%   1.011GiB / 7.763GiB   13.03%    28.3MB / 56.2MB   17GB / 49.3GB     19
...

See also statistics in our internal vcenter.

@Phu2
Copy link
Contributor Author

Phu2 commented Dec 4, 2024

It seems not to be traffic related. The number of requests:

sudo grep "04/Dec/2024:12" /var/log/apache2/access_log_rpb-cms | cut -d[ -f2 | cut -d] -f1 | awk -F: '{print $2":"$3}' | sort -nk1 -nk2 | uniq -c
     17 12:00
     45 12:01
     49 12:02
     13 12:03
     20 12:04
     10 12:06
     37 12:07
      3 12:08
      8 12:09
      1 12:10
      6 12:12
      1 12:13
      6 12:14
     22 12:16
      3 12:18
     32 12:19
     10 12:20
     12 12:22
     16 12:23
     23 12:24
      7 12:25
     17 12:26
     18 12:27
     26 12:28
     45 12:29
      9 12:30
     10 12:31
      5 12:32
      4 12:33
     28 12:34
     10 12:35
      1 12:36
      5 12:37
     12 12:39
      1 12:41
      4 12:44
      5 12:45
      8 12:46
      8 12:47
     12 12:48
      5 12:49
      6 12:50
     21 12:51
     15 12:52
      9 12:53
     13 12:54

@Phu2
Copy link
Contributor Author

Phu2 commented Dec 4, 2024

Looks like complex queries, eg:

[xxx.xxx.xxx.xxx] [HOST] - [04/Dec/2024:12:25:11 +0100] "GET /api/rpb-authorities?populate=*&pagination[limit]=10&filters[$or][0][$and][0][$or][0][preferredName][$containsi]=%09Sankt&filters[$or][0][$and][1][$or][0][preferredName][$containsi]=Peter&filters[$or][0][$and][2][$or][0][preferredName][$containsi]=%2F&filters[$or][0][$and][3][$or][0][preferredName][$containsi]=Horbach%2C&filters[$or][0][$and][4][$or][0][preferredName][$containsi]=Landkreis&filters[$or][0][$and][5][$or][0][preferredName][$containsi]=Pirmasens&filters[$or][0][$and][6][$or][0][preferredName][$containsi]=&filters[$or][1][$and][0][$or][1][variantName][value][$containsi]=%09Sankt&filters[$or][1][$and][1][$or][1][variantName][value][$containsi]=Peter&filters[$or][1][$and][2][$or][1][variantName][value][$containsi]=%2F&filters[$or][1][$and][3][$or][1][variantName][value][$containsi]=Horbach%2C&filters[$or][1][$and][4][$or][1][variantName][value][$containsi]=Landkreis&filters[$or][1][$and][5][$or][1][variantName][value][$containsi]=Pirmasens&filters[$or][1][$and][6][$or][1][variantName][value][$containsi]=&filters[$or][2][$or][0][rpbId][$endsWithi]=%09Sankt%20Peter%20%2F%20Horbach%2C%20Landkreis%20Pirmasens%20 HTTP/1.1" 502 341 "https://rpb-cms.lobid.org/admin/content-manager/collection-types/api::article.article/4327344" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36"

@Phu2
Copy link
Contributor Author

Phu2 commented Dec 4, 2024

We should either asks IT for more CPU or add some limits on the postgres container, see main...postgres-cpu-mem-limit or both.
Also investigate whether query or database optimizations are feasible.
@fsteeg What do you think?

@fsteeg
Copy link
Member

fsteeg commented Dec 5, 2024

Query or database optimizations would probably require going very much under the hood of Strapi, since we don't work with the DB directly, but use the Strapi REST API. I think a good first measure would be setting limits on the Postgres container. If users complain about performance, we can still look into increasing CPUs.

@Phu2
Copy link
Contributor Author

Phu2 commented Dec 5, 2024

If users complain about performance

During the high usage of CPUs (around 11:30 until 14:15 as depicted in vcenter) i didn't noticed any performance issues in the system as a logged in user. But i only viewed some records, no edits or other actions.

@Phu2
Copy link
Contributor Author

Phu2 commented Dec 5, 2024

@fsteeg
Copy link
Member

fsteeg commented Dec 5, 2024

But i only viewed some records, no edits or other actions.

I think the main use case these queries are produced for are lookups. That's why'd I'd assume caching won't help much here, but a good thing to keep in mind in case we have performance issues in the future.

@fsteeg fsteeg closed this as completed in 2561ccd Dec 5, 2024
@Phu2
Copy link
Contributor Author

Phu2 commented Jan 9, 2025

CPU usage at nearly 100% again from 08.01.2025, 15:59:34 until 09.01.2025, 04:50:47.

@Phu2 Phu2 reopened this Jan 9, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants