PostgreSQL 11 preview - 并行计算 增强 汇总

1 minute read

背景

PostgreSQL 11 并行计算能力的增强。

E.1.3.1.2. Parallel Queries

  • Allow btree indexes to be built in parallel (Peter Geoghegan, Rushabh Lathia, Heikki Linnakangas)

    支持并行排序,支持并行创建索引(并行写索引文件)。

    《PostgreSQL 11 preview - 并行排序、并行索引 (性能线性暴增) 单实例100亿TOP-K仅40秒》

  • Allow hash joins to be performed in parallel using a shared hash table (Thomas Munro)

    HASH JOIN支持共享哈希表了。原来是每个parallel worker进程一份哈希表副本。

  • Allow UNION to run each SELECT in parallel if the individual SELECTs cannot be parallelized (Amit Khandekar, Robert Haas, Amul Sul)

    当各个UNION ALL内的子句无法支持并行时,PostgreSQL 11会选择union的各个子句并行。

    https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=0927d2f46ddd4cf7d6bf2cc84b3be923e0aedc52

    query1 union query2 union query3;  
      
    当query1,query2,query3 这些QUERY本身无法并行执行时。  
      
    PostgreSQL 11, 选择让 query1,query2,query3 同时执行。  
      
    老版本, 无法并行。  
    

    《PostgreSQL 11 preview - Parallel Append(包括 union all\分区查询) (多表并行计算) sharding架构并行计算核心功能之一》

  • Allow partition scans to more efficiently use parallel workers (Amit Khandekar, Robert Haas, Amul Sul)

    同上,支持paralle append扫描多个子分区。

  • Allow LIMIT to be passed to parallel workers (Robert Haas, Tom Lane)

    This allows workers to reduce returned results and use targeted index scans.

    允许LIMIT子句下层到各个paralle worker进程。加速带LIMIT的并行查询。

  • Allow single-evaluation queries, e.g. WHERE clause aggregate queries, and functions in the target list to be parallelized (Amit Kapila, Robert Haas)

    允许”单次评估的QUERY”并行执行,例如”where子句中的聚合子句”,”select目标中的函数”。

  • Add server option parallel_leader_participation to control if the leader executes subplans (Thomas Munro)

    The default is enabled, meaning the leader will execute subplans.

    Allows the leader process to execute the query plan under Gather and Gather Merge nodes instead of waiting for worker processes. The default is on. Setting this value to off reduces the likelihood that workers will become blocked because the leader is not reading tuples fast enough, but requires the leader process to wait for worker processes to start up before the first tuples can be produced. The degree to which the leader can help or hinder performance depends on the plan type, number of workers and query duration.

    允许parallel leader 进程在gather或gather merge节点主动接收worker进程产生的数据,而不是等待。

  • Allow parallelization of commands CREATE TABLE .. AS, SELECT INTO, and CREATE MATERIALIZED VIEW (Haribabu Kommi)

    允许CREATE TABLE .. AS, SELECT INTO, and CREATE MATERIALIZED VIEW这几类SQL并行执行。

  • Improve performance of sequential scans with many parallel workers (David Rowley)

    并行全表扫描性能增强。

  • Add reporting of parallel worker sort activity to EXPLAIN (Robert Haas, Tom Lane)

    explain增加输出详情,包括parallel worker节点排序的统计信息。

Flag Counter

digoal’s 大量PostgreSQL文章入口