Greenplum Oracle Compatibility - LOG ERRORS INTO


Background

Oracle supports error logging (LOG ERRORS) for DML statements, which is a very handy feature.

https://docs.oracle.com/cd/B19306_01/appdev.102/b14258/d_errlog.htm#CEGEJAAJ

https://oracle-base.com/articles/10g/dml-error-logging-10gr2

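Error logging is supported for INSERT, UPDATE, DELETE, and MERGE. Rows that raise an error are recorded in the error log table and skipped, so the statement can continue with the remaining rows.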

INSERT INTO dest  
SELECT *  
FROM   source  
LOG ERRORS INTO err$_dest ('INSERT') REJECT LIMIT UNLIMITED;  
  
99998 rows created.  
  
SQL>  
COLUMN ora_err_mesg$ FORMAT A70  
SELECT ora_err_number$, ora_err_mesg$  
FROM   err$_dest  
WHERE  ora_err_tag$ = 'INSERT';  
  
ORA_ERR_NUMBER$ ORA_ERR_MESG$  
--------------- ---------------------------------------------------------  
           1400 ORA-01400: cannot insert NULL into ("TEST"."DEST"."CODE")  
           1400 ORA-01400: cannot insert NULL into ("TEST"."DEST"."CODE")  
  
2 rows selected.  
  
SQL>  
UPDATE dest  
SET    code = DECODE(id, 9, NULL, 10, NULL, code)  
WHERE  id BETWEEN 1 AND 10  
LOG ERRORS INTO err$_dest ('UPDATE') REJECT LIMIT UNLIMITED;  
  
8 rows updated.  
  
SQL>  
COLUMN ora_err_mesg$ FORMAT A70  
SELECT ora_err_number$, ora_err_mesg$  
FROM   err$_dest  
WHERE  ora_err_tag$ = 'UPDATE';  
  
ORA_ERR_NUMBER$ ORA_ERR_MESG$  
--------------- ---------------------------------------------------------  
           1400 ORA-01400: cannot insert NULL into ("TEST"."DEST"."CODE")  
           1400 ORA-01400: cannot insert NULL into ("TEST"."DEST"."CODE")  
  
2 rows selected.  
  
SQL>  
MERGE INTO dest a  
    USING source b  
    ON (a.id = b.id)  
  WHEN MATCHED THEN  
    UPDATE SET a.code        = b.code,  
               a.description = b.description  
  WHEN NOT MATCHED THEN  
    INSERT (id, code, description)  
    VALUES (b.id, b.code, b.description)  
  LOG ERRORS INTO err$_dest ('MERGE') REJECT LIMIT UNLIMITED;  
  
99998 rows merged.  
  
SQL>  
COLUMN ora_err_mesg$ FORMAT A70  
SELECT ora_err_number$, ora_err_mesg$  
FROM   err$_dest  
WHERE  ora_err_tag$ = 'MERGE';  
  
ORA_ERR_NUMBER$ ORA_ERR_MESG$  
--------------- ---------------------------------------------------------  
           1400 ORA-01400: cannot insert NULL into ("TEST"."DEST"."CODE")  
           1400 ORA-01400: cannot insert NULL into ("TEST"."DEST"."CODE")  
  
2 rows selected.  
  
SQL>  
DELETE FROM dest  
LOG ERRORS INTO err$_dest ('DELETE') REJECT LIMIT UNLIMITED;  
  
99996 rows deleted.  
  
SQL>  
COLUMN ora_err_mesg$ FORMAT A69  
SELECT ora_err_number$, ora_err_mesg$  
FROM   err$_dest  
WHERE  ora_err_tag$ = 'DELETE';  
  
ORA_ERR_NUMBER$ ORA_ERR_MESG$  
--------------- ---------------------------------------------------------------------  
           2292 ORA-02292: integrity constraint (TEST.DEST_CHILD_DEST_FK) violated -  
                child record found  
  
           2292 ORA-02292: integrity constraint (TEST.DEST_CHILD_DEST_FK) violated -  
                child record found  
  
  
2 rows selected.  
  
SQL>  
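
The session above assumes that the error log table err$_dest already exists. A minimal sketch of how such a table is typically created in Oracle, using the DBMS_ERRLOG package described in the first link above (the table name dest is taken from the examples; everything else is left at its defaults):

```sql
-- Create the error log table for DEST. When no explicit name is passed,
-- Oracle derives it as ERR$_ followed by the table name, here ERR$_DEST.
BEGIN
  DBMS_ERRLOG.CREATE_ERROR_LOG(dml_table_name => 'dest');
END;
/

-- SQL*Plus command: list the generated columns. The ORA_ERR_* control
-- columns (number, message, rowid, operation type, tag) come first,
-- followed by one column per column of DEST.
DESC err$_dest
```

The tag passed in parentheses after the error table name ('INSERT', 'UPDATE', and so on) is stored in ora_err_tag$, which is why each query above can filter on it.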

LOG ERRORS support in Greenplum COPY

Greenplum supports LOG ERRORS through COPY. Error logging for INSERT, MERGE, UPDATE, and DELETE is not supported yet.

COPY table [(column [, ...])] FROM {'file' | STDIN}  
  [ [WITH]  
    [OIDS]  
    [HEADER]  
    [DELIMITER [ AS ] 'delimiter']  
    [NULL [ AS ] 'null string']  
    [ESCAPE [ AS ] 'escape' | 'OFF']  
    [NEWLINE [ AS ] 'LF' | 'CR' | 'CRLF']  
    [CSV [QUOTE [ AS ] 'quote']  
         [FORCE NOT NULL column [, ...]]]  
    [FILL MISSING FIELDS]  
    [[LOG ERRORS [INTO error_table] [KEEP]]  
    SEGMENT REJECT LIMIT count [ROWS | PERCENT]] ]  
  
  
COPY {table [(column [, ...])] | (query)} TO {'file' | STDOUT}  
  [ [WITH]  
    [OIDS]  
    [HEADER]  
    [DELIMITER [ AS ] 'delimiter']  
    [NULL [ AS ] 'null string']  
    [ESCAPE [ AS ] 'escape' | 'OFF']  
    [CSV [QUOTE [ AS ] 'quote']  
    [FORCE QUOTE column [, ...]] ]  
    [IGNORE EXTERNAL PARTITIONS ] ]  

LOG ERRORS [INTO error_table] [KEEP]

This is an optional clause that can precede a SEGMENT REJECT LIMIT clause to log information about rows with formatting errors. The INTO error_table clause specifies an error table where rows with formatting errors will be logged when running in single row error isolation mode.

If the INTO error_table clause is not specified, the error log information is stored internally (not in an error table). Error log information that is stored internally is accessed with the Greenplum Database built-in SQL function gp_read_error_log().

If the error_table specified already exists, it is used. If it does not exist, it is created. If error_table exists and does not have a random distribution (the DISTRIBUTED RANDOMLY clause was not specified when creating the table), an error is returned.

If the command generates the error table and no errors are produced, the default is to drop the error table after the operation completes unless KEEP is specified. If the table is created and the error limit is exceeded, the entire transaction is rolled back and no error data is saved. If you want the error table to persist in this case, create the error table prior to running the COPY.

See Notes for information about the error log information and built-in functions for viewing and managing error log information.

Note: The optional INTO error_table clause is deprecated and will not be supported in a future release. Only internal error logs will be supported.
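
As a rough illustration of the clause described above, the following sketch loads a file in single row error isolation mode and keeps the rejected rows in the internal error log. The table dest, its columns, and the path /tmp/dest.csv are hypothetical names chosen for this example.

```sql
-- Hypothetical target table, for illustration only.
CREATE TABLE dest (
    id          int,
    code        text,
    description text
) DISTRIBUTED BY (id);

-- Load a server-side file (COPY from a file requires superuser).
-- Rows with formatting errors are skipped and written to the internal
-- error log because no INTO error_table clause is given.
-- The whole load is aborted once any single segment instance rejects
-- too many rows (the limit is 10 rows here).
COPY dest FROM '/tmp/dest.csv'
    WITH CSV
    LOG ERRORS
    SEGMENT REJECT LIMIT 10 ROWS;
```

Note that, as the description says, this covers rows with formatting errors (for example a wrong number of fields or a value that cannot be converted to the column type). Constraint violations such as the NOT NULL errors in the Oracle examples are, to my knowledge, not isolated this way and still fail the whole COPY.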
Notes

When you specify LOG ERRORS INTO error_table, Greenplum Database creates the table error_table that contains errors that occur while reading the external table. The table is defined as follows:

CREATE TABLE error_table_name ( cmdtime timestamptz, relname text,  
filename text, linenum int, bytenum int, errmsg text,  
rawdata text, rawbytes bytea ) DISTRIBUTED RANDOMLY;  

You can view the information in the table with SQL commands.

For error log data that is stored internally when the INTO error_table clause is not specified:

- Use the built-in SQL function gp_read_error_log('table_name'). It requires SELECT privilege on table_name. This example displays the error log information for data loaded into table ext_expenses with a COPY command:

SELECT * from gp_read_error_log('ext_expenses');  

The error log contains the same columns as the error table. The function returns FALSE if table_name does not exist.

- If error log data exists for the specified table, the new error log data is appended to existing error log data. The error log information is not replicated to mirror segments.

- Use the built-in SQL function gp_truncate_error_log('table_name') to delete the error log data for table_name. It requires the table owner privilege. This example deletes the error log information captured when moving data into the table ext_expenses:

SELECT gp_truncate_error_log('ext_expenses');  

The function returns FALSE if table_name does not exist.

Specify the * wildcard character to delete error log information for existing tables in the current database. Specify the string *.* to delete all database error log information, including error log information that was not deleted due to previous database issues. If * is specified, database owner privilege is required. If *.* is specified, operating system super-user privilege is required.
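
A short sketch of how the internal error log might be inspected and cleaned up after a load. It assumes a hypothetical table named dest that was loaded with COPY ... LOG ERRORS; the column names are the ones listed in the error table definition above.

```sql
-- Show the rejected rows for the hypothetical table dest:
-- when the error occurred, the input line number, the error message,
-- and the raw text of the offending row.
SELECT cmdtime, linenum, errmsg, rawdata
FROM   gp_read_error_log('dest')
ORDER  BY cmdtime, linenum;

-- Group by error message to see which kind of problem dominates.
SELECT errmsg, count(*) AS rejected_rows
FROM   gp_read_error_log('dest')
GROUP  BY errmsg
ORDER  BY rejected_rows DESC;

-- Once the bad rows have been handled, clear the log for this table
-- (requires the table owner privilege).
SELECT gp_truncate_error_log('dest');
```

Because new error log data is appended to existing data, truncating the log between loads keeps the results of separate COPY runs from mixing.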
When a Greenplum Database user who is not a superuser runs a COPY command, the command can be controlled by a resource queue. The resource queue must be configured with the ACTIVE_STATEMENTS parameter that specifies a maximum limit on the number of queries that can be executed by roles assigned to that queue. Greenplum Database does not apply a cost value or memory value to a COPY command, so resource queues with only cost or memory limits do not affect the running of COPY commands.

A non-superuser can run these types of COPY commands:

- COPY FROM command where the source is stdin
- COPY TO command where the destination is stdout

For information about resource queues, see "Workload Management with Resource Queues" in the Greenplum Database Administrator Guide.
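
The resource queue setup described above could look roughly like this. The queue name load_queue, the role name loader, and the limit of 3 are hypothetical values for illustration.

```sql
-- Hypothetical queue: at most 3 statements from its roles run concurrently.
-- ACTIVE_STATEMENTS is the limit that applies to COPY; a queue with only
-- cost or memory limits would not throttle COPY commands.
CREATE RESOURCE QUEUE load_queue WITH (ACTIVE_STATEMENTS=3);

-- Hypothetical non-superuser role assigned to that queue.
CREATE ROLE loader LOGIN RESOURCE QUEUE load_queue;

-- An existing role can be moved to the queue instead:
-- ALTER ROLE loader RESOURCE QUEUE load_queue;
```

COPY commands issued by loader (from stdin or to stdout, as listed above) then count against the queue's active statement limit.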

References

https://greenplum.org/docs/570/ref_guide/sql_commands/COPY.html

"PostgreSQL 11 preview - MERGE syntax support, including inside CTEs; compatible with SQL:2016 and Oracle"
