PostgreSQL 10.0 preview 多核并行增强 - 控制集群并行度

2 minute read

背景

PostgreSQL 9.6引入多核并行,一条SQL可以使用多个CPU核,提升SQL性能。

但是多核并行一定不要滥用,因为CPU资源有限,如果单个QUERY把CPU都用光了,其他QUERY就会因为缺乏CPU资源造成性能抖动。

9.6刚出来的时候,可以控制单个gather的并行度,比如全表扫描,扫描节点算一个gather,一个gather下面会fork 一些worker process执行并行任务。

9.6通过max_worker_processes参数控制整个集群的并行度,同时运行的QUERY,同时启动的worker process总数不能超过max_worker_processes设置的值。

但是系统中还有其他功能还需要fork worker process,比如pg_base_basckup,比如standby,都会用到worker process。

那么多核计算很可能会影响这些应用。

10.0 新增了一个控制多核并行的参数max_parallel_workers,用于控制整个集群允许开启的用于多核计算的worker process.

这样就可以控制用于并行的workers不会占满所有的max_worker_processes。

设置max_parallel_workers后,可以大幅提升开启并行计算后的数据库OLTP稳定性。

建议的配置

比如64核的机器

max_parallel_workers_per_gather(8) < max_parallel_workers(48) < max_worker_processes(64)

相关参数

1. max_worker_processes (integer)

设置集群允许fork的最大worker process数目

Sets the maximum number of background processes that the system can support. This parameter can only be set at server start. The default is 8.    
    
When running a standby server, you must set this parameter to the same or higher value than on the master server. Otherwise, queries will not be allowed in the standby server.    
    
When changing this value, consider also adjusting max_parallel_workers and max_parallel_workers_per_gather.    

2. max_parallel_workers_per_gather (integer)

设置QUERY中单个gather node允许开启的worker process数目

Sets the maximum number of workers that can be started by a single Gather node.     
    
Parallel workers are taken from the pool of processes established by max_worker_processes, limited by max_parallel_workers.     
    
Note that the requested number of workers may not actually be available at runtime.     
    
If this occurs, the plan will run with fewer workers than expected, which may be inefficient.     
    
The default value is 2. Setting this value to 0 disables parallel query execution.    
    
Note that parallel queries may consume very substantially more resources than non-parallel queries,     
    
because each worker process is a completely separate process which has roughly the same impact on the system as an additional user session.     
    
This should be taken into account when choosing a value for this setting, as well as when configuring other settings that control resource utilization,     
    
such as work_mem. Resource limits such as work_mem are applied individually to each worker,     
    
which means the total utilization may be much higher across all processes than it would normally be for any single process.     
    
For example, a parallel query using 4 workers may use up to 5 times as much CPU time, memory, I/O bandwidth, and so forth as a query which uses no workers at all.    
    
For more information on parallel query, see Chapter 15, Parallel Query.    

3. max_parallel_workers (integer)

设置整个数据库集群,允许同时开启的用于多核计算的worker process数目

Sets the maximum number of workers that the system can support for parallel queries. The default value is 8.     
    
When increasing or decreasing this value, consider also adjusting max_parallel_workers_per_gather.     
    
Also, note that a setting for this value which is higher than max_worker_processes will have no effect, since parallel workers are taken from the pool of worker processes established by that setting.    

除此之外,PostgreSQL的并行度是如何计算,如何控制并行度的?请参考

《PostgreSQL 9.6 并行计算 优化器算法浅析》

后续优化

1. 按时间段配置并行度,比如0点到8点,并行度开到最大。平时降一半。

这种配置可以用于OLAP+OLTP的混合场景,例如晚上用于统计,白天用于OLTP,白天即使有多核并行的QUERY,由于现在了集群级别的并行度,所以对OLTP也不会有影响。

2. 如果你要设置单个会话的最大并行度,可以设置会话级别的max_parallel_workers参数,如果你要设置单个QUERY的最大并行度,则设置max_parallel_workers或者max_parallel_workers_per_gather即可

参考

https://www.postgresql.org/docs/devel/static/runtime-config-resource.html

《PostgreSQL 9.6 并行计算 优化器算法浅析》

Flag Counter

digoal’s 大量PostgreSQL文章入口