HOW PostgreSQL deal NULL-Values follow Three ternary valued logic RULE
背景
在逻辑运算中有三种状态表示,真、假、不知道。
数据库的NULL表示没有值, 空的意思(在逻辑中属于 不知道)。
在三价逻辑运算中, 数据库的NULL相当于UNKNOWN的意思.
三价逻辑运算请参考 :
http://en.wikipedia.org/wiki/Three-valued_logic
来看看三价逻辑运算的真值表 :
Kleene logic
Below is a set of truth tables showing the logic operations for Kleene’s logic.
A AND B | True | Unknown | False |
---|---|---|---|
True | True | Unknown | False |
Unknown | Unknown | Unknown | False |
False | False | False | False |
A OR B | True | Unknown | False |
---|---|---|---|
True | True | True | True |
Unknown | True | Unknown | Unknown |
False | True | Unknown | False |
A | NOT A |
---|---|
True | False |
Unknown | Unknown |
False | True |
Lukasiewicz logic真值表略…
简单的解释一下, 在这里Unknown可能是true也可能是false。
因此 :
对于Unknown and true 可能是true and true也可能是false and true. 那么结果应该是true或者false. 最终还是不确定. 所以还是Unknown.
对于NOT Unknown. 也一样, 可能是NOT true也可能是NOT false, 结果有可能是true或者false, 最终还是不确定, 所以还是Unknown.
对于Unknown or true, 不管Unknown是true还是false, 结果都是true.
对于Unknown and false, 不管Unknown是true还是false, 结果都是false.
对于Unknown and Unknown, 可能是true and true或者true and false或者false and false最终结果不确定, 所以还是Unknown.
对于Unknown or Unknown, 可能是true or true或者true or false或者false or false最终结果不确定, 所以还是Unknown.
在PostgreSQL数据库中是如何处理的呢?
正文
验证以上真值表(仅验证含null的部分) :
ocz@db-172-16-3-150-> psql digoal
psql (9.2.1)
Type "help" for help.
digoal=# \pset null UnKnown
Null display is "UnKnown".
digoal=# select null and true;
UnKnown
digoal=# select null and null;
UnKnown
digoal=# select null and false;
f
digoal=# select null or true;
t
digoal=# select null or null;
UnKnown
digoal=# select null or false;
UnKnown
digoal=# select not null;
UnKnown
接下来测试where条件中的null.
digoal=# select 1 where null;
?column?
----------
(0 rows)
digoal=# select 1 where true;
?column?
----------
1
(1 row)
digoal=# select 1 where false;
?column?
----------
(0 rows)
从测试结果可以看出WHERE子句中的 null和false一样, 被排除掉了.
接下来测试数据库基本的比较操作中用到null的地方的运算结果.
Operator | Description |
---|---|
< | less than |
> | greater than |
<= | less than or equal to |
>= | greater than or equal to |
= | equal |
<> or != | not equal |
以上操作符只要带有null的比较返回值都是null。
例如 :
digoal=# select 1 < null;
?column?
----------
UnKnown
(1 row)
digoal=# select true < null;
?column?
----------
UnKnown
(1 row)
表示不知道null和null是否相等, 所以输出还是null.
digoal=# select null=null;
?column?
----------
UnKnown
(1 row)
表示不知道null和null是否不相等, 所以输出还是null.
digoal=# select null <> null;
?column?
----------
UnKnown
(1 row)
between and 相当于>= and <=, 例如 :
digoal=# select 1 between null and null;
?column?
----------
UnKnown
(1 row)
digoal=# select 1 between null and 2;
?column?
----------
UnKnown
(1 row)
digoal=# select null between null and 2;
?column?
----------
UnKnown
(1 row)
digoal=# select null between 1 and 2;
?column?
----------
UnKnown
(1 row)
digoal=# select null between 1 and null;
?column?
----------
UnKnown
(1 row)
not between and 相当于 < or >, 例如 :
digoal=# select null not between 1 and null;
?column?
----------
UnKnown
(1 row)
2. case, 注意CASE中如果使用NULL, 是使用的=操作符. 所以null分支用于不会执行. 如下 :
digoal=# select case 1 when null then 'is null' else 'is not null' end;
case
-------------
is not null
(1 row)
digoal=# select case null when null then 'is null' else 'is not null' end;
case
-------------
is not null
(1 row)
其他表达式中的null :
表示不知道null+1等于多少, 结果输出还是null.
postgres=# select null+1;
?column?
----------
UnKnown
(1 row)
digoal=# select null||'abc';
?column?
----------
UnKnown
(1 row)
digoal=# select 'abc'||null;
?column?
----------
UnKnown
(1 row)
注意
1. SQL中的特例, 某些使用场景中null和null被认为是相同的, 这打破了三价逻辑的规则. 如下.
postgres=# select null union select null;
?column?
----------
UnKnown
(1 row)
postgres=# select null intersect select null;
?column?
----------
UnKnown
(1 row)
digoal=# select null except select null;
?column?
----------
(0 rows)
2. 聚合函数, 除了count(*) 其他聚合函数都不处理null值. 例如 :
Table
i | j |
---|---|
150 | 150 |
200 | 200 |
250 | 250 |
NULL | 0 |
Here AVG(i) is 200 (the average of 150, 200, and 250), while AVG(j) is 150 (the average of 150, 200, 250, and 0).
A well-known side effect of this is that in SQL AVG(z) is not equivalent with SUM(z)/COUNT(*).
聚合函数注意, 如果传入了distinct的话, 就要看函数的strict标记 :
* Aggregate functions that are called with DISTINCT are now passed
NULL values if the aggregate transition function is not marked as
STRICT (Andrew Gierth)
For example, agg(DISTINCT x) might pass a NULL x value to agg().
This is more consistent with the behavior in non-DISTINCT cases.
3. 在SQL92标准扩展文件F571定义了6个操作符, 仅返回true或false, 不返回unknown如下 :
p | true | false | unknown |
---|---|---|---|
p IS TRUE | true | false | false |
p IS NOT TRUE | false | true | true |
p IS FALSE | false | true | false |
p IS NOT FALSE | true | false | true |
p IS UNKNOWN | false | false | true |
p IS NOT UNKNOWN | true | true | false |
digoal=# select null is true;
?column?
----------
f
(1 row)
digoal=# select null is not true;
?column?
----------
t
(1 row)
digoal=# select null is not false;
?column?
----------
t
(1 row)
digoal=# select null is false;
?column?
----------
f
(1 row)
digoal=# select null is null;
?column?
----------
t
(1 row)
digoal=# select null is not null;
?column?
----------
f
(1 row)
digoal=# select null is not unknown;
?column?
----------
f
(1 row)
digoal=# select null is unknown;
?column?
----------
t
(1 row)
4. 除此之外, PostgreSQL 中还包含两个逻辑操作符.
IS DISTINCT FROM 和 IS NOT DISTINCT FROM, 所有操作都返回true或者false, 不会返回null :
Ordinary comparison operators yield null (signifying "unknown"), not true or false, when either input is null.
For example, 7 = NULL yields null, as does 7 <> NULL. When this behavior is not suitable, use the IS [ NOT ] DISTINCT FROM constructs:
expression IS DISTINCT FROM expression expression IS NOT DISTINCT FROM expression For non-null inputs, IS DISTINCT FROM is the same as the <> operator.
However, if both inputs are null it returns false, and if only one input is null it returns true.
Similarly, IS NOT DISTINCT FROM is identical to = for non-null inputs, but it returns true when both inputs are null, and false when only one input is null.
Thus, these constructs effectively act as though null were a normal data value, rather than "unknown".
digoal=# select null is distinct from null;
?column?
----------
f
(1 row)
digoal=# select null is distinct from 1;
?column?
----------
t
(1 row)
digoal=# select null is not distinct from 1;
?column?
----------
f
(1 row)
digoal=# select null is not distinct from null;
?column?
----------
t
(1 row)
5. PostgreSQL transform_null_equals参数打开, 将’表达式=null’或’null=表达式’转换成’表达式 is null’. 如下 :
digoal=# set transform_null_equals=on;
SET
digoal=# select 1=null;
?column?
----------
f
(1 row)
digoal=# select null=null;
?column?
----------
t
(1 row)
digoal=# select null=1;
?column?
----------
f
(1 row)
注意这个参数不影响case表达式的判断, 和修改前结果一致 :
digoal=# select case null when null then 'is null' else 'is not null' end;
case
-------------
is not null
(1 row)
6. greatest和least不处理null值.
digoal=# select greatest(null,null);
greatest
----------
UnKnown
(1 row)
digoal=# select least(null,null);
least
---------
UnKnown
(1 row)
digoal=# select least(null,null,1,2,3);
least
-------
1
(1 row)
digoal=# select greatest(null,null,1,2,3);
greatest
----------
3
(1 row)
7. 索引中的null值, PostgreSQL 8.3以前(不含8.3)的版本的BTREE索引不支持IS NULL的查询.
参见 HISTORY :
* Allow col IS NULL to use an index (Teodor)
参考
1. http://en.wikipedia.org/wiki/Null_(SQL)
2. http://en.wikipedia.org/wiki/Three-valued_logic
3. http://www.databasedesign-resource.com/null-values-in-a-database.html
4. http://en.wikipedia.org/wiki/Propositional_logic
5. http://en.wikipedia.org/wiki/%C5%81ukasiewicz_logic
6. http://en.wikipedia.org/wiki/Stephen_Cole_Kleene
7. http://link.springer.com/chapter/10.1007%2F3-540-36596-6_7
8. http://www.postgresql.org/docs/9.2/static/functions-comparison.html