
I'm using an open source (RHEL 6.2) based machine running SIEM software. When I run the top command, I see postgres and postmaster both at 96% CPU usage. Is there a way to pinpoint or see what is causing these services to stack up?

asadz

6 Answers


You can match the system process IDs you see in top to specific Postgres backends using the pg_stat_activity system view.

SELECT pid, datname, usename, query FROM pg_stat_activity; can be a good starting point.
Once you know what queries are running you can investigate further (EXPLAIN/EXPLAIN ANALYZE; check locks, etc.)
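
For example, if top shows one Postgres process pinned at high CPU, you can look up what that backend is doing by its OS process ID (12345 below is just a placeholder for the PID reported by top):

SELECT pid, datname, usename, state, query
FROM pg_stat_activity
WHERE pid = 12345;

(On Postgres 9.1 and older the columns are procpid and current_query rather than pid, state, and query.)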

voretaq7

I was having the same issue. PostgreSQL is set up on AWS RDS, and it sat at 100% CPU utilisation even after upgrading the instance. I debugged it with the methods shown here, and one of them worked for me.

I checked for the queries that had been running the longest and found that certain queries were stuck and had been running for more than 3-4 hours. To check how long the oldest transaction has been running, run the following command:

SELECT max(now() - xact_start) FROM pg_stat_activity
                               WHERE state IN ('idle in transaction', 'active');

If this is more than an hour, then this is the issue. Kill the long-running connections and limit the maximum connection age on the application side.
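
A sketch of how you might find and stop the offenders (the pid 12345 is a placeholder, and pg_terminate_backend kills the whole connection, so use it with care):

-- list the longest-running transactions first
SELECT pid, now() - xact_start AS xact_age, state, query
FROM pg_stat_activity
WHERE state IN ('idle in transaction', 'active')
ORDER BY xact_age DESC;

-- terminate one offending backend by its pid
SELECT pg_terminate_backend(12345);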

Ajeet Khan

If this is really the postmaster using all that CPU, then you likely have lock contention issues, probably due to very high max_connections. Consider lowering max_connections and using a connection pooler if this is the case.
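
For illustration only (the numbers are assumptions, not tuned recommendations), that could mean capping max_connections in postgresql.conf and putting a pooler such as PgBouncer in front, so clients connect to the pooler instead of directly to Postgres:

# postgresql.conf -- keep the number of real backends modest
max_connections = 100

# pgbouncer.ini -- "mydb" is a placeholder database name
[databases]
mydb = host=127.0.0.1 port=5432 dbname=mydb

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
pool_mode = transaction
default_pool_size = 20
max_client_conn = 1000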

Otherwise: Details, please. Full output of top -b -n 1 for a start.

Craig Ringer

You can use pg_stat_statements:
https://www.postgresql.org/docs/current/pgstatstatements.html

First, pg_stat_activity (from the other answers) is good advice, but sometimes it's not enough: it only shows the current activity. What if there are many, many quick queries, none of which lasts long enough to look interesting in pg_stat_activity? When there are that many of them, they can still cause high CPU usage.

Then you can use pg_stat_statements:

pg_stat_statements records queries that are run against your database, strips out a number of variables from them, and then saves data about the query, such as how long it took, as well as what happened to underlying reads/writes.

(From this blog post: The most useful Postgres extension: pg_stat_statements)
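
If the extension isn't enabled yet, the setup is roughly this (it needs a server restart, so plan for that):

# in postgresql.conf
shared_preload_libraries = 'pg_stat_statements'

-- then, after restarting Postgres, in the database you want to inspect:
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;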

Here I use it to find out why my own CPU usage is unexpectedly high:

\x

select (total_exec_time / 1000 / 3600) as total_hours,
       (total_exec_time / 1000) as total_seconds,
       (total_exec_time / calls) as avg_millis,
       calls num_calls,
       query
from pg_stat_statements
order by 1 desc
limit 10;

-[ RECORD 1 ]-+-------------------------
total_hours   | 0.128210291016666
total_seconds | 461.557047659998
avg_millis    | 9.06506889111474
num_calls     | 50916
query         | select p.site_id, p.page_id, p.version current_version, h.page_version cached_version from ...
-[ RECORD 2 ]-+------------------------
total_hours   | 0.0586912504452778
total_seconds | 211.288501603
avg_millis    | 4.14966517279101
num_calls     | 50917
query         | select p.site_id, p.page_id ... from ...
:
:

Did you notice something above? The top queries are quick: about 0.009 and 0.004 seconds each. But they get called many, many times.

(You could also try total_plan_time + total_exec_time instead of just total_exec_time.)
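
For example (total_plan_time only exists on Postgres 13 and later, and planning time is only tracked when pg_stat_statements.track_planning is enabled):

select ((total_plan_time + total_exec_time) / 1000 / 3600) as total_hours,
       calls as num_calls,
       query
from pg_stat_statements
order by 1 desc
limit 10;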

KajMagnus

I would like to recommend the pg_activity utility to see which SQL queries are using CPU/RAM. Very useful, like top.
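
A minimal way to try it (assuming pip is available; adjust the connection options and the placeholder database name mydb for your setup):

pip install pg_activity
pg_activity -h localhost -p 5432 -U postgres -d mydb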

Vsevolod Gromov

I had a similar issue in Postgres 9.1 and solved it by increasing the shared_buffers and effective_cache_size parameters.

The official docs say:

shared_buffers (integer)

Sets the amount of memory the database server uses for shared memory buffers. The default is typically 32 megabytes (32MB), but might be less if your kernel settings will not support it (as determined during initdb). This setting must be at least 128 kilobytes. (Non-default values of BLCKSZ change the minimum.) However, settings significantly higher than the minimum are usually needed for good performance. This parameter can only be set at server start.

If you have a dedicated database server with 1GB or more of RAM, a reasonable starting value for shared_buffers is 25% of the memory in your system. There are some workloads where even large settings for shared_buffers are effective, but because PostgreSQL also relies on the operating system cache, it is unlikely that an allocation of more than 40% of RAM to shared_buffers will work better than a smaller amount. Larger settings for shared_buffers usually require a corresponding increase in checkpoint_segments, in order to spread out the process of writing large quantities of new or changed data over a longer period of time.

On systems with less than 1GB of RAM, a smaller percentage of RAM is appropriate, so as to leave adequate space for the operating system. Also, on Windows, large values for shared_buffers aren't as effective. You may find better results keeping the setting relatively low and using the operating system cache more instead. The useful range for shared_buffers on Windows systems is generally from 64MB to 512MB.

Increasing this parameter might cause PostgreSQL to request more System V shared memory than your operating system's default configuration allows. See Section 17.4.1 for information on how to adjust those parameters, if necessary.

On a system with 64GB of RAM, dedicated entirely to Postgres, I've set:

ALTER SYSTEM SET shared_buffers = '8GB';
ALTER SYSTEM SET effective_cache_size = '32GB';
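
Note that ALTER SYSTEM only writes these values to postgresql.auto.conf; shared_buffers takes effect only after a server restart, after which you can verify the settings with:

SHOW shared_buffers;
SHOW effective_cache_size;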

But depending on your Postgres version and your Linux distribution, it may not be possible to start the server with those values, due to the Linux kernel.shmmax limit.

You can change it by running:

sysctl -w kernel.shmmax=14294967296 

where 14294967296 is the limit (in bytes) you want to set.

SHMMAX is a kernel parameter used to define the maximum size of a single shared memory segment a Linux process can allocate.
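
Note that sysctl -w only changes the running kernel. To keep the setting across reboots on RHEL-style systems, the usual approach is to also add it to /etc/sysctl.conf and reload:

# /etc/sysctl.conf
kernel.shmmax = 14294967296

# apply without rebooting
sysctl -p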

Christian