PostgreSQL 10.0 preview 功能增强 - libpq支持多主机连接(failover,LB)让数据库HA和应用配合更紧密
背景
数据库一主多备,这个词在互联网应该不陌生。但是主备切换和应用程序如何配合才能天衣无缝呢?你可能会有这样的疑问。
1. 什么类型的QUERY发给主库,什么类型的QUERY发给备库?
2. 主库和备库发生了角色切换之后,客户端连接如何配合?
业界有一些做法可以回答这两个问题。
1. 通常使用集群软件,使用VIP来表示主备库的角色,不管怎么切换,VIP1永远都是主库,VIP2永远都是备库。
对于客户端来说,通过VIP来判断是主库还是备库,程序自己控制哪些SQL发给主库,哪些SQL发给备库。
一个典型的例子
https://github.com/digoal/PostgreSQL_HA_with_primary_standby_2vip
2. 使用VIP是让数据库和应用程序形成默契的一种方法,还有没有更好的方法呢?比如数据库驱动层能不能配合角色的判断?
Oracle做得不错,Oracle的客户端,可以配合RAC,自动识别failover, switchover。
PostgreSQL jdbc驱动,也实现了类似的功能,支持failover, loadbalance。
https://jdbc.postgresql.org/documentation/94/connect.html
PostgreSQL jdbc相关参数如下
1.
targetServerType = {any, master, slave, preferSlave}
Allows opening connections to only servers with required state, the allowed values are any, master, slave and preferSlave.
The master/slave distinction is currently done by observing if the server allows writes.
The value preferSlave tries to connect to slaves if any are available, otherwise allows falls back to connecting also to master.
any表示连接到任意节点
preferSlave表示优先连接到slave节点
master或slave表示连接到master或slave节点。
2.
hostRecheckSeconds = int
Controls how long in seconds the knowledge about a host state is cached in JVM wide global cache.
The default value is 10 seconds.
重新检测节点状态的时间间隔
3.
loadBalanceHosts = boolean
In default mode (disabled) hosts are connected in the given order.
If enabled hosts are chosen randomly from the set of suitable candidates.
是否随机选择节点,负载均衡
4.
Connection Fail-over
To support simple connection fail-over it is possible to define multiple endpoints (host and port pairs) in the connection url separated by commas.
The driver will try to once connect to each of them in order until the connection succeeds.
If none succeed, a normal connection exception is thrown.
当配置了多个目标节点时,JDBC会按顺序连接目标,直到连接成功为止。
jdbc连接串语法
The syntax for the connection url is:
jdbc:postgresql://host1:port1,host2:port2/database
The simple connection fail-over is useful when running against a high availability postgres installation that has identical data on each node.
For example streaming replication postgres or postgres-xc cluster.
5.
典型的例子,应用程序可以配置两个连接池,一个配置为master,写请求发往这个连接池。另一个配置为slave优先,同时开启负载均衡参数,读请求发往这个连接池。
For example an application can create two connection pools.
One data source is for writes, another for reads.
The write pool limits connections only to master node:
jdbc:postgresql://node1,node2,node3/accounting?targetServerType=master .
And read pool balances connections between slaves nodes, but allows connections also to master if no slaves are available:
jdbc:postgresql://node1,node2,node3/accounting?targetServerType=preferSlave&loadBalanceHosts=true
PostgreSQL 10.0 libpq增加多个连接的功能
PostgreSQL 10.0 libpq层,也增加了多连接功能,设计时引入了target_session_attrs参数,可以设置为read-write或者any。不同的target_session_attrs配置,对应不同的节点检测机制。
target_session_attrs=read-write,使用show transaction_read_only检测节点,返回on,表示这是只读节点,off表示这是可读写节点。(standby返回on, 同时通过default_transaction_read_only可以让master也返回on)。
target_session_attrs=any,表示不检测。
两个patch的commit信息如下。
libpq: Allow connection strings and URIs to specify multiple hosts.
author Robert Haas <rhaas@postgresql.org>
Thu, 3 Nov 2016 21:25:20 +0800 (09:25 -0400)
committer Robert Haas <rhaas@postgresql.org>
Thu, 3 Nov 2016 21:25:20 +0800 (09:25 -0400)
commit 274bb2b3857cc987cfa21d14775cae9b0dababa5
tree 488b5fd46e2cb4acdab7fb2dd30c4e4d1d4bb7d1 tree | snapshot
parent 770671062f130a830aa89100c9aa2d26f8d4bf32 commit | diff
libpq: Allow connection strings and URIs to specify multiple hosts.
It's also possible to specify a separate port for each host.
Previously, we'd loop over every address returned by looking up the
host name; now, we'll try every address for every host name.
Patch by me. Victor Wagner wrote an earlier patch for this feature,
which I read, but I didn't use any of his code. Review by Mithun Cy.
libpq: Add target_session_attrs parameter.
author Robert Haas <rhaas@postgresql.org>
Wed, 30 Nov 2016 01:18:31 +0800 (12:18 -0500)
committer Robert Haas <rhaas@postgresql.org>
Wed, 30 Nov 2016 01:18:31 +0800 (12:18 -0500)
Commit 274bb2b3857cc987cfa21d14775cae9b0dababa5 made it possible to
specify multiple IPs in a connection string, but that's not good
enough for the case where you have a read-write master and a bunch of
read-only standbys and want to connect to whichever server is the
master at the current time. This commit allows that, by making it
possible to specify target_session_attrs=read-write as a connection
parameter.
There was extensive discussion of the best name for the connection
parameter and its values as well as the best way to distinguish master
and standbys. For now, adopt the same solution as JDBC: if the user
wants a read-write connection, issue 'show transaction_read_only' and
rejection the connection if the result is 'on'. In the future, we
could add additional values of this new target_session_attrs parameter
that issue different queries; or we might have some way of
distinguishing the server type without resorting to an SQL query; but
right now, we have this, and that's (hopefully) a good start.
Victor Wagner and Mithun Cy. Design review by Álvaro Herrera, Catalin
Iacob, Takayuki Tsunakawa, and Craig Ringer; code review by me. I
changed Mithun's patch to skip all remaining IPs for a host if we
reject a connection based on this new parameter, rewrote the
documentation, and did some other cosmetic cleanup.
Discussion: http://postgr.es/m/CAD__OuhqPRGpcsfwPHz_PDqAGkoqS1UvnUnOnAB-LBWBW=wu4A@mail.gmail.com
libpq用法介绍
URI格式
postgresql://[user[:password]@][netloc][:port][,...][/dbname][?param1=value1&...]
postgresql://
postgresql://localhost
postgresql://localhost:5433
postgresql://localhost/mydb
postgresql://user@localhost
postgresql://user:secret@localhost
postgresql://other@localhost/otherdb?connect_timeout=10&application_name=myapp
postgresql://host1:123,host2:456/somedb?target_session_attrs=any&application_name=myapp
配置多个目标节点,host:port使用逗号隔开即可。
host
Comma-separated list of host names.
If a host name begins with a slash, it specifies Unix-domain communication rather than TCP/IP communication;
the value is the name of the directory in which the socket file is stored.
If multiple host names are specified, each will be tried in turn in the order given.
The default behavior when host is not specified is to connect to a Unix-domain socket in /tmp (or whatever socket directory was specified when PostgreSQL was built).
On machines without Unix-domain sockets, the default is to connect to localhost.
port
Port number to connect to at the server host, or socket file name extension for Unix-domain connections.
If the host parameter included multiple, comma-separated hosts, this parameter may specify a list of ports of equal length,
or it may specify a single port number to be used for all hosts.
target_session_attrs
If this parameter is set to read-write, only a connection in which read-write transactions are accepted by default is considered acceptable.
The query show transaction_read_only will be sent upon any successful connection;
if it returns on, the connection will be closed.
If multiple hosts were specified in the connection string, any remaining servers will be tried just as if the connection attempt had failed.
The default value of this parameter, any, regards all connections as acceptable.
使用libpq,你同样可以实现与jdbc一样的效果(负载均衡,自动找到master)。
HA只需要负责切换角色,不需要再负责切换IP地址了。可以更省心一些。
这个patch的讨论,详见邮件组,本文末尾URL。
PostgreSQL社区的作风非常严谨,一个patch可能在邮件组中讨论几个月甚至几年,根据大家的意见反复的修正,patch合并到master已经非常成熟,所以PostgreSQL的稳定性也是远近闻名的。
参考
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=274bb2b3857cc987cfa21d14775cae9b0dababa5
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=721f7bd3cbccaf8c07cad2707826b83f84694832
https://www.postgresql.org/docs/devel/static/libpq-connect.html#libpq-connstring
https://jdbc.postgresql.org/documentation/94/connect.html