PostgreSQL 和 Greenplum pgcrypto 加解密bytea处理差异(convert, convert_from)
背景
pgcrypto是一个加解密插件,可以对数据进行加解密,支持多种加密算法。
https://www.postgresql.org/docs/devel/static/pgcrypto.html
encrypt(data bytea, key bytea, type text) returns bytea
decrypt(data bytea, key bytea, type text) returns bytea
encrypt_iv(data bytea, key bytea, iv bytea, type text) returns bytea
decrypt_iv(data bytea, key bytea, iv bytea, type text) returns bytea
在Greenplum和PostgreSQL中使用略有差异。
PostgreSQL 加解密
加密举例
postgres=# select encrypt('abcde'::bytea, 'hello_key'::bytea, 'aes');
encrypt
------------------------------------
\x670d356c4df5a5b6b6f37e0a0e5a8e93
(1 row)
解密举例
postgres=# select decrypt('\x670d356c4df5a5b6b6f37e0a0e5a8e93', 'hello_key'::bytea, 'aes');
decrypt
--------------
\x6162636465
(1 row)
postgres=# select convert_from(decrypt('\x670d356c4df5a5b6b6f37e0a0e5a8e93', 'hello_key'::bytea, 'aes'), 'sql_ascii');
convert_from
--------------
abcde
(1 row)
PostgreSQL用到了convert_from来转换bytea和text。
Function | Return Type | Description | Example | Result |
---|---|---|---|---|
convert(string bytea, src_encoding name, dest_encoding name) | bytea | Convert string to dest_encoding. The original encoding is specified by src_encoding. The string must be valid in this encoding. Conversions can be defined by CREATE CONVERSION. Also there are some predefined conversions. See Table 9.10 for available conversions. | convert(‘text_in_utf8’, ‘UTF8’, ‘LATIN1’) | text_in_utf8 represented in Latin-1 encoding (ISO 8859-1) |
convert_from(string bytea, src_encoding name) | text | Convert string to the database encoding. The original encoding is specified by src_encoding. The string must be valid in this encoding. | convert_from(‘text_in_utf8’, ‘UTF8’) | text_in_utf8 represented in the current database encoding |
convert_to(string text, dest_encoding name) | bytea | Convert string to dest_encoding. | convert_to(‘some text’, ‘UTF8’) | some text represented in the UTF8 encoding |
Greenplum 加解密
Greenplum 早期的版本基于PostgreSQL 8.2改进而来,还没有convert_from函数。
加密举例
postgres=# select encrypt('abcde'::bytea, 'hello_key'::bytea, 'aes');
encrypt
------------------------------------------------
g\0155lM\365\245\266\266\363~\012\016Z\216\223
(1 row)
解密举例, HEX格式需要去除\x
。
postgres=# select decrypt(decode('g\0155lM\365\245\266\266\363~\012\016Z\216\223', 'escape'), 'hello_key'::bytea, 'aes');
decrypt
---------
abcde
(1 row)
postgres=# select decrypt(decode('670d356c4df5a5b6b6f37e0a0e5a8e93', 'hex'), 'hello_key'::bytea, 'aes');
decrypt
---------
abcde
(1 row)
postgres=# select decrypt(decode(ltrim('\x670d356c4df5a5b6b6f37e0a0e5a8e93','\x'), 'hex'), 'hello_key'::bytea, 'aes');
decrypt
---------
abcde
(1 row)
greenplum用到了decode来转换bytea和text
Function | Return Type | Description | Example | Result |
---|---|---|---|---|
decode(string text, format text) | bytea | Decode binary data from textual representation in string. Options for format are same as in encode. | decode(‘MTIzAAE=’, ‘base64’) | \x3132330001 |
encode(data bytea, format text) | text | Encode binary data into a textual representation. Supported formats are: base64, hex, escape. escape converts zero bytes and high-bit-set bytes to octal sequences (\nnn) and doubles backslashes. | encode(E’123\000\001’, ‘base64’) | MTIzAAE= |
对于decrypt转换后的bytea得到的中文字符,需要增加greenplum的有convert 函数支持,否则只能显示为bytea格式。后面有方法:
-- PostgreSQL 8.3+
-- PostgreSQL 8.3+可以设置BYTEA类型的输出格式,HEX或ESCAPE格式。
postgres=# set bytea_output ='hex';
SET
postgres=# select '你'::bytea;
bytea
----------
\xe4bda0
(1 row)
postgres=# set bytea_output ='escape';
SET
postgres=# select '你'::bytea;
bytea
--------------
\344\275\240
(1 row)
postgres=# select convert_from(decrypt(ENCRYPT('你'::bytea, 'xm'::bytea,'aes'::text), 'xm'::bytea,'aes'::text), 'utf8');
convert_from
--------------
你
(1 row)
-- Greenplum 早期版本, 没有convert_from函数,多字节字符经过decrypt后,输出BYTEA。
postgres=# select decrypt(ENCRYPT('你'::bytea, 'xm'::bytea,'aes'::text), 'xm'::bytea,'aes'::text);
decrypt
--------------
\344\275\240
(1 row)
encode无法显示中文。
postgres=# select encode(decrypt(ENCRYPT('你'::bytea, 'xm'::bytea,'aes'::text), 'xm'::bytea,'aes'::text), 'escape');
encode
--------------
\344\275\240
(1 row)
Greenplum 4.x版本基于PostgreSQL 8.2,增加convert(bytea,name,name)函数的方法:
create or replace function pg_catalog.convert (bytea, name, name) returns text as $$
declare
res text;
begin
execute $_$select convert(E'$_$||encode($1, 'escape')||$_$','$_$||$2||$_$','$_$||$3||$_$')$_$ into res;
return res;
end;
$$ language plpgsql strict immutable;
目前greenplum 4.x可以通过escape的文本得到中文:
postgres=# select convert(E'\344\275\240', 'UTF8', 'UTF8');
convert
---------
哈哈
(1 row)
在添加了convert(bytea,name,name)后,可以直接从bytea得到中文
postgres=# select convert(decrypt(ENCRYPT('你'::bytea, 'xm'::bytea,'aes'::text), 'xm'::bytea,'aes'::text), 'UTF8', 'UTF8');
convert
---------
你
(1 row)
postgres=# select convert(bytea '\344\275\240', 'UTF8', 'UTF8');
convert
---------
你
(1 row)
postgres=# select convert(bytea('\344\275\240'), 'UTF8', 'UTF8');
convert
---------
你
(1 row)
参考
《PostgreSQL 按拼音排序 - convert to GBK/EUC_CN coding》
《PostgreSQL Oracle 兼容性之 - 字符编码转换 CONVERT》
https://www.postgresql.org/docs/devel/static/pgcrypto.html
《固若金汤 - PostgreSQL pgcrypto加密插件》