V2EX = way to explore
V2EX 是一个关于分享和探索的地方
Sign Up Now
For Existing Member  Sign In
Koril

请教一个关于 PostgreSQL 连接被 Server 关闭的问题

  •  
  •   Koril · Mar 11, 2025 · 3060 views
    This topic created in 412 days ago, the information mentioned may be changed or developed.

    背景

    我在自己的阿里云服务器上( 2C2G ,3M ,Debian 12 )装了一个 PostgreSQL ( 15 ),安装后,仅仅做了以下配置改动:

    pg_hba.conf:

    添加:host all all 0.0.0.0/0 md5

    postgresql.conf

    开放端口:listen_addresses = '*'

    给 postgres 设置了密码,sudo -i -u postgres -> psql -> \password


    异常

    无论是 Navicat 还是 Python 的 Psycopg2 在超过一定时间(大概 3-5 分钟),就会连接失效了。

    Navicat 报错如下:

    Server closed the connection unexpectedly
    This probably means the server terminated abnormally before or while processing the request.
    

    Psycopg2 报错如下:

    psycopg2.OperationalError: server closed the connection unexpectedly
    	This probably means the server terminated abnormally
    	before or while processing the request.
    server closed the connection unexpectedly
    	This probably means the server terminated abnormally
    	before or while processing the request. 
    

    已知

    服务器没有太大的负载,cpu ,内存,磁盘,网络,占用率都很低,除了 Prometheus/Grafana/nginx 之外,这个机器上只有刚刚安装的 PostgreSQL 。

    重新获取连接,能够正常访问,但是我使用的 Psycopg2 的 ThreadedConnectionPool 连接池,连接池没法保持连接么?代码如下:

    
    def get_pg_pool(pg_config):
        pool = psycopg2.pool.ThreadedConnectionPool(
            minconn=1,
            maxconn=200,
            host=pg_config['PG_HOST'],
            port=pg_config['PG_PORT'],
            dbname=pg_config['PG_DB'],
            user=pg_config['PG_USER'],
            password=pg_config['PG_PASSWORD'],
            connect_timeout=5,
        )
        return pool
        
    @flask_app.route('/task/log', methods=['GET'])
        def task_log():
            task_log_list = []
            # 在进程启动时,给 flask_app 初始化了一个 pg_pool 对象
            conn = flask_app.pg_pool.getconn()
            try:
                with conn.cursor() as cur:
                    cur.execute('SELECT * FROM t_log')
                    task_log_list = cur.fetchall()
            except psycopg2.Error as e:
                return JsonResult.failed('获取日志列表失败')
            finally:
                flask_app.pg_pool.putconn(conn)
    
            return JsonResult.successful(task_log_list)
    
    

    问题

    这个问题是和 Linux 服务器配置有关呢?还是跟 PostgreSQL 配置有关?请问如何排查和解决呢?

    3 replies    2025-03-11 12:47:12 +08:00
    zbinlin
        1
    zbinlin  
       Mar 11, 2025
    你看下 pg 的日志里有什么报错信息
    Koril
        2
    Koril  
    OP
       Mar 11, 2025
    @zbinlin /var/log/postgresql/postgresql-15-main.log 的日志里显示:

    2025-03-11 11:47:12.722 CST [1370665] postgres@badminton LOG: could not receive data from client: Connection timed out
    2025-03-11 11:47:12.722 CST [1370663] postgres@badminton LOG: could not receive data from client: Connection timed out
    2025-03-11 11:49:03.314 CST [1370701] postgres@badminton LOG: could not receive data from client: Connection timed out
    Nt6Z1g
        3
    Nt6Z1g  
       Mar 11, 2025
    印象中阿里云服务器修改了 OS 的 tcp_keepalive 参数, pg 默认用系统的参数会导致这个问题. 你手动设置 pg 的 tcp_keepalive 参数试试. 比如

    ```
    select name,setting from pg_settings where name like 'tcp_keepalives%';
    ALTER SYSTEM set tcp_keepalives_idle = 600;
    ALTER SYSTEM set tcp_keepalives_interval = 30;
    ALTER SYSTEM set tcp_keepalives_count = 10;
    SELECT pg_reload_conf();
    ```
    About   ·   Help   ·   Advertise   ·   Blog   ·   API   ·   FAQ   ·   Solana   ·   1305 Online   Highest 6679   ·     Select Language
    创意工作者们的社区
    World is powered by solitude
    VERSION: 3.9.8.5 · 27ms · UTC 23:41 · PVG 07:41 · LAX 16:41 · JFK 19:41
    ♥ Do have faith in what you're doing.