PostgreSQL sharding : citus 系列2 - TPC-H
背景
紧接着上一篇文档,本文测试citus的tpc-h能力(包括兼容性).
《PostgreSQL sharding : citus 系列1 - 多机部署(含OLTP(TPC-B)测试)》
《(TPC-H测试 SF=10,SF=200) PostgreSQL 11 vs 10 vs Deepgreen》
https://github.com/digoal/gp_tpch
实际测试过程中,发现CITUS对TPC-H的SQL支持并不完整。
citus tpc-h 测试
1、下载gp_tpch包
git clone https://github.com/digoal/gp_tpch
2、生成200G测试数据
cd gp_tpch
ln -s `pwd` /tmp/dss-data
./dbgen -s 200
3、将数据转换为PG识别的格式
for i in `ls *.tbl`; do sed 's/|$//' $i > ${i/tbl/csv}; echo $i; done;
4、生成测试SQL
SF=200
mkdir dss/queries
for q in `seq 1 22`
do
DSS_QUERY=dss/templates ./qgen -s $SF $q > dss/queries/$q.sql
sed 's/^select/explain select/' dss/queries/$q.sql > dss/queries/$q.explain.sql
done
5、修改citus 的几个参数,确保在跑两类QUERY的时候不报错(末尾会提到报错原因)。
alter role postgres set citus.enable_repartition_joins =on;
alter role postgres set citus.max_intermediate_result_size =-1;
6、测试TPC-H(与coordinator同一台主机上测试)
./tpch.sh ./results 127.0.0.1 port tpch-db tpch-user password citus
tpc-h 性能
有些SQL不支持,显示0.
2018-08-29 19:24:30 [1535541870] : query 1 finished OK (12 seconds)
2018-08-29 19:24:30 [1535541870] : query 2 finished OK (0 seconds)
2018-08-29 19:25:48 [1535541948] : query 3 finished OK (77 seconds)
2018-08-29 19:25:50 [1535541950] : query 4 finished OK (2 seconds)
2018-08-29 19:29:45 [1535542185] : query 5 finished OK (234 seconds)
2018-08-29 19:29:47 [1535542187] : query 6 finished OK (1 seconds)
2018-08-29 19:37:33 [1535542653] : query 7 finished OK (465 seconds)
2018-08-29 19:44:30 [1535543070] : query 8 finished OK (415 seconds)
2018-08-29 19:58:29 [1535543909] : query 9 finished OK (837 seconds)
2018-08-29 20:00:26 [1535544026] : query 10 finished OK (116 seconds)
2018-08-29 20:00:26 [1535544026] : query 11 finished OK (0 seconds)
2018-08-29 20:00:32 [1535544032] : query 12 finished OK (6 seconds)
2018-08-29 20:00:33 [1535544033] : query 13 finished OK (0 seconds)
2018-08-29 20:01:40 [1535544100] : query 14 finished OK (67 seconds)
2018-08-29 20:05:33 [1535544333] : query 15 finished OK (232 seconds)
2018-08-29 20:05:34 [1535544334] : query 16 finished OK (0 seconds)
2018-08-29 20:05:34 [1535544334] : query 17 finished OK (0 seconds)
2018-08-29 20:05:34 [1535544334] : query 18 finished OK (0 seconds)
2018-08-29 20:06:51 [1535544411] : query 19 finished OK (76 seconds)
2018-08-29 20:06:51 [1535544411] : query 20 finished OK (0 seconds)
2018-08-29 20:06:52 [1535544412] : query 21 finished OK (0 seconds)
2018-08-29 20:06:52 [1535544412] : query 22 finished OK (0 seconds)
2018-08-29 20:06:52 [1535544412] : finished TPC-H benchmark
citus tpc-h SQL文件讲解
1、分片字段
select create_distributed_table('part','p_partkey');
select create_distributed_table('region','r_regionkey');
select create_distributed_table('nation','n_nationkey');
select create_distributed_table('supplier','s_suppkey');
select create_distributed_table('customer','c_custkey');
select create_distributed_table('partsupp','ps_suppkey');
select create_distributed_table('orders','o_orderkey');
select create_distributed_table('lineitem','l_orderkey');
2、colocate(默认情况下是同一个分组,不需要colocate)
SELECT mark_tables_colocated('part', ARRAY['region', 'nation', 'supplier', 'customer', 'partsupp', 'orders', 'lineitem']);
默认分组,在创建分片表时colocate默认为default,
只要表的shard数量、分片类型(append, or hash)、colocate组 都一致,则他们就是colocate的。
create_distributed_table
参数:colocate_with: (Optional) include current table in the co-location group of another table.
By default tables are co-located when they are distributed by columns of the same type,
have the same shard count, and have the same replication factor.
Possible values for colocate_with are default, none to start a new co-location group,
or the name of another table to co-locate with that table. (See Co-Locating Tables.)
3、索引
ALTER TABLE PART ADD constraint pk1 PRIMARY KEY (P_PARTKEY);
ALTER TABLE SUPPLIER ADD constraint pk2 PRIMARY KEY (S_SUPPKEY);
ALTER TABLE PARTSUPP ADD constraint pk3 PRIMARY KEY (PS_PARTKEY, PS_SUPPKEY);
ALTER TABLE CUSTOMER ADD constraint pk4 PRIMARY KEY (C_CUSTKEY);
ALTER TABLE ORDERS ADD constraint pk5 PRIMARY KEY (O_ORDERKEY);
ALTER TABLE LINEITEM ADD constraint pk6 PRIMARY KEY (L_ORDERKEY, L_LINENUMBER);
ALTER TABLE NATION ADD constraint pk7 PRIMARY KEY (N_NATIONKEY);
ALTER TABLE REGION ADD constraint pk8 PRIMARY KEY (R_REGIONKEY);
CREATE INDEX idx_nation_regionkey ON public.nation USING btree (n_regionkey);
CREATE INDEX idx6 ON public.nation USING btree (n_nationkey, n_regionkey);
CREATE INDEX idx5 ON public.region USING btree (r_name, r_regionkey);
CREATE INDEX IDX_CUSTOMER_NATIONKEY ON CUSTOMER (C_NATIONKEY);
CREATE INDEX IDX_LINEITEM_PART_SUPP ON LINEITEM (L_PARTKEY,L_SUPPKEY);
CREATE INDEX idx_lineitem_shipdate ON public.lineitem USING btree (l_shipdate, l_discount, l_quantity);
CREATE INDEX idx_lineitem__2 ON public.lineitem USING btree (l_partkey);
CREATE INDEX idx_lineitem__3 ON public.lineitem USING btree (l_suppkey);
CREATE INDEX idx_lineitem__11 ON public.lineitem USING btree (l_shipdate);
CREATE INDEX idx_lineitem_orderkey ON public.lineitem USING btree (l_orderkey);
CREATE INDEX idx1 ON public.lineitem USING btree (l_orderkey) WHERE (l_commitdate < l_receiptdate);
CREATE INDEX idx_orders__6 ON public.orders USING btree (o_orderpriority);
CREATE INDEX idx_orders_orderdate ON public.orders USING btree (o_orderdate);
CREATE INDEX idx_orders_custkey ON public.orders USING btree (o_custkey)
CREATE INDEX idx_part__5 ON public.part USING btree (p_type);
CREATE INDEX idx_part__6 ON public.part USING btree (p_size);
CREATE INDEX idx_part_1 ON public.part USING btree (p_container, p_brand);
CREATE INDEX idx_supplier_nation_key ON public.supplier USING btree (s_nationkey);
CREATE INDEX idx4 ON public.supplier USING btree (s_suppkey, s_nationkey);
CREATE INDEX idx ON public.partsupp USING btree (ps_partkey, ps_suppkey, ps_supplycost);
CREATE INDEX idx_partsupp_partkey ON public.partsupp USING btree (ps_partkey);
CREATE INDEX idx_partsupp_suppkey ON public.partsupp USING btree (ps_suppkey);
CREATE INDEX idx_partsupp__4 ON public.partsupp USING btree (ps_supplycost);
citus测试TPC-H遇到的兼容性报错与解决方法
错误1
ERROR: could not run distributed query with subquery outside the FROM and WHERE clauses
HINT: Consider using an equality filter on the distributed table's partition column.
报错SQL
select
ps_partkey,
sum(ps_supplycost * ps_availqty) as value
from
partsupp,
supplier,
nation
where
ps_suppkey = s_suppkey
and s_nationkey = n_nationkey
and n_name = 'SAUDI ARABIA'
group by
ps_partkey having
sum(ps_supplycost * ps_availqty) > (
select
sum(ps_supplycost * ps_availqty) * 0.0000005000
from
partsupp,
supplier,
nation
where
ps_suppkey = s_suppkey
and s_nationkey = n_nationkey
and n_name = 'SAUDI ARABIA'
)
order by
value desc
LIMIT 1;
改成
select
ps_partkey,
sum(ps_supplycost * ps_availqty) as value
from
partsupp,
supplier,
nation
where
ps_suppkey = s_suppkey
and s_nationkey = n_nationkey
and n_name = 'SAUDI ARABIA'
group by
ps_partkey having
sum(ps_supplycost * ps_availqty) >
( sum(ps_supplycost * ps_availqty) filter
(where
ps_suppkey = s_suppkey
and s_nationkey = n_nationkey
and n_name = 'SAUDI ARABIA'
)
) * 0.0000005000
order by
value desc
LIMIT 1;
错误2
ERROR: the query contains a join that requires repartitioning
HINT: Set citus.enable_repartition_joins to on to enable repartitioning
报错SQL
select
l_orderkey,
sum(l_extendedprice * (1 - l_discount)) as revenue,
o_orderdate,
o_shippriority
from
customer,
orders,
lineitem
where
c_mktsegment = 'AUTOMOBILE'
and c_custkey = o_custkey
and l_orderkey = o_orderkey
and o_orderdate < date '1995-03-08'
and l_shipdate > date '1995-03-08'
group by
l_orderkey,
o_orderdate,
o_shippriority
order by
revenue desc,
o_orderdate
LIMIT 10;
有些SQL需要repartition join,需要设置如下参数
alter role postgres set citus.enable_repartition_joins =on;
错误3
ERROR: the intermediate result size exceeds citus.max_intermediate_result_size (currently 1048576 kB)
DETAIL: Citus restricts the size of intermediate results of complex subqueries and CTEs to avoid accidentally pulling large result sets into once place.
HINT: To run the current query, set citus.max_intermediate_result_size to a higher value or -1 to disable.
有些SQL的中间结果比较大,可以设置为无限制,重新测试
alter role postgres set citus.max_intermediate_result_size =-1;
错误4
ERROR: 0A000: complex joins are only supported when all distributed tables are joined on their distribution columns with equal operator
复杂JOIN,仅支持分布键作为JOIN的字段,并且只能使用等值JOIN。
postgres=# explain select
sum(l_extendedprice) / 7.0 as avg_yearly
from
lineitem join
part on (p_partkey = l_partkey and p_brand = 'Brand#13' and p_container = 'SM PKG') join
(SELECT l_partkey AS agg_partkey, 0.2 * avg(l_quantity) AS avg_quantity FROM lineitem GROUP BY l_partkey) part_agg
on (agg_partkey = l_partkey) limit 10;
ERROR: 0A000: complex joins are only supported when all distributed tables are joined on their distribution columns with equal operator
LOCATION: DeferErrorIfUnsupportedSubqueryPushdown, query_pushdown_planning.c:469
对应的代码如下
https://github.com/citusdata/citus/blob/d63cbf382288358533b71d907f6a3a7fc8a1df5f/src/backend/distributed/planner/query_pushdown_planning.c
else if (!RestrictionEquivalenceForPartitionKeys(plannerRestrictionContext))
{
return DeferredError(ERRCODE_FEATURE_NOT_SUPPORTED,
"complex joins are only supported when all distributed tables are "
"joined on their distribution columns with equal operator",
NULL, NULL);
}
错误5
ERROR: failed to execute task 18
报错SQL
select
ps_partkey,
sum(ps_supplycost * ps_availqty) as value
from
partsupp,
supplier,
nation
where
ps_suppkey = s_suppkey
and s_nationkey = n_nationkey
and n_name = 'SAUDI ARABIA'
group by
ps_partkey having
sum(ps_supplycost * ps_availqty) > ( sum(ps_supplycost * ps_availqty) filter (where
ps_suppkey = s_suppkey
and s_nationkey = n_nationkey
and n_name = 'SAUDI ARABIA') ) * 0.0000005000
order by
value desc
LIMIT 1;
错误6
ERROR: 0A000: cannot run outer join query if join is not on the partition column
DETAIL: Outer joins requiring repartitioning are not supported.
LOCATION: FixedJoinOrderList, multi_join_order.c:189
报错SQL
select
c_count,
count(*) as custdist
from
(
select
c_custkey,
count(o_orderkey)
from
customer left outer join orders on
c_custkey = o_custkey
and o_comment not like '%unusual%requests%'
group by
c_custkey
) as c_orders (c_custkey, c_count)
group by
c_count
order by
custdist desc,
c_count desc
LIMIT 1;
小结
citus的复杂SQL的语法支持还不是非常完备,如果要实现比较复杂的运算,建议Writing in SQL, thinking in MapReduce
https://www.citusdata.com/blog/2018/08/17/breaking-down-citus-real-time-executor/
这样的话,基本上你可以避免掉复杂的SQL,使用分解动作来实现。