
I have a single Postgres server and a set of PgBouncers configured to help me with connection pooling.

Recently, I discovered that database connections are accumulating in the "idle" state over time.

=> SELECT COUNT(*), state FROM pg_stat_activity GROUP BY state;

 count | state
-------+--------
     9 |
     1 | active
    11 | idle
(3 rows)
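
The 9 rows with an empty state are most likely background processes (checkpointer, WAL writer, autovacuum launcher, and so on) rather than client sessions; grouping on backend_type as well would make that explicit:

SELECT backend_type, state, COUNT(*)
FROM pg_stat_activity
GROUP BY backend_type, state
ORDER BY backend_type, state;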

I ran the following command on the admin console of each PgBouncer to check whether any of them still holds a server connection. Every PgBouncer returned the same empty result:

pgbouncer=# SHOW SERVERS;

 type | user | database | state | addr | port | local_addr | local_port | connect_time | request_time | wait | wait_us | close_needed | ptr | link | remote_pid | tls
------+------+----------+-------+------+------+------------+------------+--------------+--------------+------+---------+--------------+-----+------+------------+-----
(0 rows)
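
For completeness, the PgBouncer admin console can also show the pooler's side of things, in case any pool is holding or waiting on server connections:

pgbouncer=# SHOW POOLS;   -- per database/user pool: cl_active, cl_waiting, sv_active, sv_idle, sv_used, ...
pgbouncer=# SHOW CLIENTS; -- client connections currently attached to this PgBouncer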

Here are the details of the connections that are in the "idle" state.

=> SELECT * FROM pg_stat_activity WHERE state='idle';

-[ RECORD 1 ]----+---------------------------------------------------------------------
datid            | 16498
datname          | users
pid              | 337260
usesysid         | 14419
usename          | novadeadmin
application_name | psql
client_addr      | x.x.x.x
client_hostname  |
client_port      | 38850
backend_start    | 2021-11-20 14:26:53.648459+00
xact_start       |
query_start      | 2021-11-20 14:27:12.585782+00
state_change     | 2021-11-20 14:27:12.585782+00
wait_event_type  | Client
wait_event       | ClientRead
state            | idle
backend_xid      |
backend_xmin     |
query            | SELECT name FROM workspaces WHERE id='wnujb0lh37tsou44k7b1o';
backend_type     | client backend

-[ RECORD 2 ]----+---------------------------------------------------------------------
datid            | 376313
datname          | wnujb0lh37tsou44k7b1o
pid              | 363348
usesysid         | 14419
usename          | novadeadmin
application_name |
client_addr      | x.x.x.x
client_hostname  |
client_port      | 12992
backend_start    | 2021-11-19 19:24:03.696311+00
xact_start       |
query_start      | 2021-11-19 19:26:41.16366+00
state_change     | 2021-11-19 19:26:41.16366+00
wait_event_type  | Client
wait_event       | ClientRead
state            | idle
backend_xid      |
backend_xmin     |
query            | SELECT "currentVersion"."updatedAt" AS "currentVersion_updatedAt", "currentVersion"."deletedAt" AS "currentVersion_deletedAt", "currentVersion"."updatedBy" AS "currentVersion_updatedBy", "currentVersion"."id" AS "currentVersion_id", "currentVersion"."fileName" AS "currentVersion_fileName", "currentVersion"."drawingID" AS "currentVersion_drawingID", "currentVersion"."versionNumber" AS "currentVersion_versionNumber", "currentVersion"."issuedAt" AS "currentVersion_issuedAt", "currentVersion"."createdBy" AS "currentVersion_createdBy", "currentVersion"."createdAt" AS "currentVersion_createdAt", "currentVersion"."mime" AS "currentVersion_mime", "currentVersion"."size" AS "currentVersion_size" FROM "drawingVersions" "currentVersion" INNER JOIN "drawings" "Drawing" ON "Drawing"."currentVersionID" = "currentVersion"."id" AND "Drawing"."deletedAt" IS NULL WHERE ( "Drawing"."id" IN ($1) ) AND ( "currentVersion"."deletedAt" IS NULL )
backend_type     | client backend

--More--

  1. The query in the first record was executed by me yesterday (2021-11-20) through a psql client (not through a PgBouncer). I'm not entirely sure whether I forgot to close that psql session, so this one might be my own mistake.

  2. The queries in all the other records are identical. They were all executed by my backend application (Node.js with TypeORM) through the PgBouncers. But, as shown above, the PgBouncers report no open server connections to the Postgres server.

All the records show the same client_addr because the psql client I used and the PgBouncers run in the same Kubernetes cluster.

Could you help me understand how 1. and 2. are happening? I would also like a mechanism to prevent long-lived "idle" connections from piling up.
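
For example, something along these lines is roughly what I am looking for, assuming the server is on Postgres 14 or later where idle_session_timeout exists (on older versions the idle sessions would have to be terminated manually or from a scheduled job):

-- Postgres 14+: disconnect sessions that stay idle outside a transaction for too long
ALTER SYSTEM SET idle_session_timeout = '30min';
SELECT pg_reload_conf();

-- Older versions: terminate long-idle client backends by hand or from a job
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE state = 'idle'
  AND backend_type = 'client backend'
  AND state_change < now() - interval '30 minutes';

(With PgBouncer in between, the timeout would presumably need to be longer than PgBouncer's own server_idle_timeout, so that the pooler, not the server, is the one closing idle pooled connections.)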

Thanks in advance!


Update:

I see the following netstat result on the PgBouncer host (y.y.y.y is the IP of my Postgres server):

/ $ netstat -tp | grep "y.y.y.y"
tcp        0      0 api-7659758786-2tpfv:57250 y.y.y.y:postgresql  ESTABLISHED -
tcp        0      0 api-7659758786-2tpfv:40576 y.y.y.y:postgresql  ESTABLISHED -
tcp        0      0 api-7659758786-2tpfv:40570 y.y.y.y:postgresql  ESTABLISHED -
tcp        0      0 api-7659758786-2tpfv:40580 y.y.y.y:postgresql  ESTABLISHED -
tcp        0      0 api-7659758786-2tpfv:40572 y.y.y.y:postgresql  ESTABLISHED -
tcp        0      0 api-7659758786-2tpfv:40566 y.y.y.y:postgresql  ESTABLISHED -
tcp        0      0 api-7659758786-2tpfv:40574 y.y.y.y:postgresql  ESTABLISHED -
tcp        0      0 api-7659758786-2tpfv:37988 y.y.y.y:postgresql  ESTABLISHED 1/pgbouncer
tcp        0      0 api-7659758786-2tpfv:40568 y.y.y.y:postgresql  ESTABLISHED -
tcp        0      0 api-7659758786-2tpfv:57252 y.y.y.y:postgresql  ESTABLISHED -
tcp        0      0 api-7659758786-2tpfv:40578 y.y.y.y:postgresql  ESTABLISHED -
tcp        0      0 api-7659758786-2tpfv:46606 y.y.y.y:postgresql  ESTABLISHED 92/psql
tcp        0      0 api-7659758786-2tpfv:51548 y.y.y.y:postgresql  ESTABLISHED 1/pgbouncer

This is the result from the PgBouncer admin console for open server connections:

pgbouncer=# SHOW SERVERS;
 type |                  user                  |  database  | state |    addr | port |  local_addr  | local_port |      connect_time       |      request_time       | wait | wait_us | close_needed |      ptr       | link | remote_pid |                             tls
------+----------------------------------------+------------+-------+---------+------+--------------+------------+-------------------------+-------------------------+------+---------+--------------+----------------+------+------------+--------------------------------------------------------------
 S    | novadeadmin@psql-frc-stage-novade-lite | novadelite | idle  | y.y.y.y | 5432 | 10.244.4.196 |      51548 | 2021-11-22 05:01:42 UTC | 2021-11-22 05:14:34 UTC |    0 |       0 |            0 | 0x562d3d78aba0 |      |     900512 | TLSv1.2/ECDHE-RSA-AES256-GCM-SHA384/ECDH=secp384r1/OCSP=good
 S    | novadeadmin@psql-frc-stage-novade-lite | users      | used  | y.y.y.y | 5432 | 10.244.4.196 |      37988 | 2021-11-22 04:38:56 UTC | 2021-11-22 05:09:35 UTC |    0 |       0 |            0 | 0x562d3d78a740 |      |     879848 | TLSv1.2/ECDHE-RSA-AES256-GCM-SHA384/ECDH=secp384r1/OCSP=good
(2 rows)
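
The local_port of these two server connections should show up as client_port in pg_stat_activity on the Postgres side, so something like the following should confirm which of the idle backends actually belong to PgBouncer:

SELECT pid, datname, usename, client_addr, client_port, state, state_change
FROM pg_stat_activity
WHERE client_port IN (51548, 37988);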

Answer:

You suggest that PgBouncer, for some reason, closed its connections to the server (gracefully or ungracefully), but that the Postgres server did not close its side, probably because it doesn't know the client is gone.

Backends whose wait_event is 'ClientRead' would notice immediately if the client closed the other end of the connection, and would respond by closing their own end and exiting, shutting those backend processes down. So whatever is going on, it isn't that.

You said you used lsof and netstat to look for the connections and couldn't find them. Could you describe specifically what you looked for and what you saw? Did you look only on the PgBouncer host, or on the database server host as well?

jjanes