Category archive: PL/SQL

Returning a Specified Number of Rows in 12c

When limiting the number of rows a query returns, the usual approach is ORDER BY plus a rownum < condition, and returning rows from the middle of the result set requires nested pagination. Other databases such as MySQL let you do this conveniently with LIMIT:

SELECT *
FROM t1
ORDER BY 1 LIMIT 0, 10;

The row limiting concept

Starting with 12c, Oracle has greatly enhanced this area by introducing the row_limiting_clause, which lets a query limit the number of rows returned and set the starting position of the result set. Many queries need exactly this, for example returning only the top-N rows of a result set; now the FETCH FIRST and OFFSET options do the job simply.

The row_limiting_clause specifies a starting offset and a row count n (or a percentage of the rows to return), and these options are appended after the regular ORDER BY clause.

It is built from the following options:

  • OFFSET
    An integer specifying how many rows to skip before row limiting begins. If the offset is negative or not specified, it defaults to 0 and limiting starts from the first row. If it is NULL or greater than or equal to the total number of rows, 0 rows are returned, effectively truncating the result.

  • FETCH
    Specifies the number or percentage of rows to return. If omitted, all rows are returned, starting from row offset+1.

  • FIRST | NEXT
    These keywords are interchangeable; pick whichever reads better in context.

  • rowcount | percent PERCENT
    Use rowcount to specify the number of rows to return; a negative value or NULL is treated as 0. If the specified value exceeds the number of available rows, only the available rows are returned. percent PERCENT works similarly, expressing the rows to return as a percentage.
    If neither is specified, one row is returned by default.

  • ONLY | WITH TIES
    ONLY returns exactly the specified number of rows. WITH TIES additionally returns any rows whose ORDER BY values tie with the last returned row. WITH TIES requires an ORDER BY clause; without one, no extra rows are returned.

Restrictions on the row limiting clause:

  1. It cannot be used with FOR UPDATE.
  2. The SELECT list cannot contain a sequence's CURRVAL or NEXTVAL.
  3. A materialized view whose query contains a row limiting clause cannot be refreshed.

Examples

Create a test table and insert some duplicate data:

DROP TABLE t1 purge;

CREATE TABLE t1 (id number);

INSERT INTO t1
SELECT LEVEL
FROM dual CONNECT BY LEVEL <=5;

INSERT INTO t1
SELECT *
FROM t1;

 COMMIT;

Table t1 now contains 10 rows, half of them duplicates:

xb@PDB12C> select * from t1 order by 1; 

        ID
----------
         1
         1
         2
         2
         3
         3
         4
         4
         5
         5

Take only the first 5 rows ordered by id:

SELECT *
FROM t1
ORDER BY 1 FETCH FIRST 5 ROWS ONLY;

        ID
----------
         1
         1
         2
         2
         3

Because there are two rows with ID=3, WITH TIES also returns the rows that share the 5th row's id value:

SELECT *
FROM t1
ORDER BY 1 FETCH FIRST 5 ROWS WITH ties;

        ID
----------
         1
         1
         2
         2
         3
         3

Use PERCENT to take the first 30 percent of rows by id:

SELECT *
FROM t1
ORDER BY 1 FETCH FIRST 30 percent ROWS ONLY;

        ID
----------
         1
         1
         2

PERCENT combined with WITH TIES:

SELECT *
FROM t1
ORDER BY 1 FETCH FIRST 30 percent ROWS WITH ties;

        ID
----------
         1
         1
         2
         2

In earlier versions, fetching rows from the middle of a result set required nested pagination; for example, to take rows 3 through 5 ordered by ID:

-- before 12c
SELECT id
FROM
  (SELECT rownum rn,
          id
   FROM
     (SELECT id
      FROM t1
      ORDER BY 1)
   WHERE rownum <=5)
WHERE rn >=3;

-- after 12c
SELECT id
FROM t1
ORDER BY 1
OFFSET 2 ROWS FETCH NEXT 3 ROWS ONLY;

Without ordering by id, take rows 5-7 by rownum:

SELECT rownum,
       id
FROM t1
OFFSET 4 ROWS FETCH NEXT 3 ROWS ONLY;

    ROWNUM         ID
---------- ----------
         5          5
         6          1
         7          2

Efficiency

When writing pagination in the past you would often run into the id BETWEEN 5 AND 7 style, which we know is very inefficient, so everything was written with nested rownum subqueries instead. How efficient is the new syntax?
Let's run a 10053 trace and see what the execution plan looks like:

oradebug setmypid
oradebug event 10053 trace name context forever,level 12;

SELECT id
FROM xb.t1
ORDER BY 1
OFFSET 2 ROWS FETCH NEXT 3 ROWS ONLY;

oradebug event 10053 trace name context off;

sys@ORA12C> oradebug tracefile_name
/u01/app/oracle/diag/rdbms/ora12c/ora12c/trace/ora12c_ora_112949.trc

Final query after transformations:******* UNPARSED QUERY IS *******
SELECT "from$_subquery$_002"."ID" "ID"
FROM
  (SELECT "T1"."ID" "ID",
          "T1"."ID" "rowlimit_$_0",
          ROW_NUMBER() OVER (
                             ORDER BY "T1"."ID") "rowlimit_$$_rownumber"
   FROM "XB"."T1" "T1"
   WHERE 2<CASE WHEN (2>=0) THEN 2 ELSE 0 END +3) "from$_subquery$_002"
WHERE "from$_subquery$_002"."rowlimit_$$_rownumber"<=CASE
                                                         WHEN (2>=0) THEN 2
                                                         ELSE 0
                                                     END +3
  AND "from$_subquery$_002"."rowlimit_$$_rownumber">2
ORDER BY "from$_subquery$_002"."rowlimit_$_0"

----- Plan Table -----

============
Plan Table
============
--------------------------------------------+-----------------------------------+
| Id  | Operation                 | Name    | Rows  | Bytes | Cost  | Time      |
--------------------------------------------+-----------------------------------+
| 0   | SELECT STATEMENT          |         |       |       |     4 |           |
| 1   |  VIEW                     |         |    10 |   390 |     4 |  00:00:01 |
| 2   |   WINDOW SORT PUSHED RANK |         |    10 |   130 |     4 |  00:00:01 |
| 3   |    TABLE ACCESS FULL      | T1      |    10 |   130 |     3 |  00:00:01 |
--------------------------------------------+-----------------------------------+
Query Block Name / Object Alias(identified by operation id):
------------------------------------------------------------
 1 - SEL$1                / from$_subquery$_002@SEL$2
 2 - SEL$1
 3 - SEL$1                / T1@SEL$1
------------------------------------------------------------
Predicate Information:
----------------------

*** 2019-08-21T13:50:43.663264+08:00 (PDB12C(3))
1 - filter(("from$_subquery$_002"."rowlimit_$$_rownumber"<=CASE  WHEN (2>=0) THEN 2 ELSE 0 END +3 AND "from$_subquery$_002"."rowlimit_$$_rownumber">2))
2 - filter(ROW_NUMBER() OVER ( ORDER BY "ID")<=CASE  WHEN (2>=0) THEN 2 ELSE 0 END +3)

You can see that Oracle actually transforms the SQL: the ROW_NUMBER() analytic function first assigns a row number over the ORDER BY column, a CASE expression validates the OFFSET value (a non-negative offset, here 2, is used as-is, otherwise 0), and then the next 3 rows after the offset are returned.

From this you can roughly guess that for the WITH TIES variant, the ROW_NUMBER analytic function is probably replaced by RANK.
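As a sketch of that guess (an assumption, not the actual transformed text from a trace), a hand-written equivalent of the earlier WITH TIES query might look like this:

```sql
-- Hypothetical manual rewrite of:
--   SELECT * FROM t1 ORDER BY 1 FETCH FIRST 5 ROWS WITH TIES;
-- RANK() assigns tied ORDER BY values the same rank, so rows tied
-- with the 5th row are returned as well
SELECT id
FROM  (SELECT id,
              RANK() OVER (ORDER BY id) rnk
       FROM   t1)
WHERE  rnk <= 5
ORDER BY id;
```

Against the test data above this returns 6 rows (both ID=3 rows), matching the WITH TIES output.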

Updating a Large Table with DBMS_REDEFINITION

Updating a large table usually generates a large amount of redo and undo and consumes considerable resources. If the updated column is long, it also causes a lot of row migration, because some blocks will inevitably be unable to hold the enlarged rows.

We usually consider replacing such large DML updates with DDL. The first idea is probably CTAS followed by a rename, but that brings other problems: the table's other attributes are not copied to the new table, constraints and indexes all have to be rebuilt, which is tedious and easy to get wrong, and the change requires downtime. Online redefinition removes these headaches: it automatically carries the dependent objects over for us. We usually use online redefinition to change a table's storage attributes, but don't forget that it also supports column mapping; as long as the mapping expression is deterministic, we can get much more value out of it, including updates.

First create two test tables, T1 and T2. T1 is the table to be updated; T2 holds the data that T1's updated column will be joined from:

drop table t1 purge;
create table t1 as
select object_id ,object_name 
from dba_objects 
where object_id is not null;

drop table t2 purge;
create table t2 as
select object_id ,object_name ,object_id||object_name cname
from dba_objects 
where object_id is not null;

alter table t1 
   add constraint pk_t1
   primary key (object_id);

alter table t2 
   add constraint pk_t2
   primary key (object_id);

Add a cname column to T1; we will join to T2 on object_id and copy T2's cname values into T1's cname column:

alter table t1 add cname varchar2(200);

xb@PDB12C> select * from t1 where rownum<10;

 OBJECT_ID OBJECT_NAME                         CNAME
---------- ----------------------------------- --------------
         9 I_FILE#_BLOCK#
        38 I_OBJ3
        45 I_TS1
        51 I_CON1
        19 IND$
        31 CDEF$
         6 C_TS#
        58 I_CCOL2
        24 I_PROXY_DATA$

xb@PDB12C> select * from t2 where rownum<10;

 OBJECT_ID OBJECT_NAME          CNAME
---------- -------------------- ---------------------------
         9 I_FILE#_BLOCK#       9I_FILE#_BLOCK#
        38 I_OBJ3               38I_OBJ3
        45 I_TS1                45I_TS1
        51 I_CON1               51I_CON1
        19 IND$                 19IND$
        31 CDEF$                31CDEF$
         6 C_TS#                6C_TS#
        58 I_CCOL2              58I_CCOL2
        24 I_PROXY_DATA$        24I_PROXY_DATA$

DBMS_REDEFINITION does not allow subqueries or joins in the column mapping, but deterministic expressions are allowed, so in this example we can use a function:

create or replace
function get_cname(v_id int )  return varchar2 deterministic is
  f varchar2(200);
begin
  select  cname
  into f
  from t2
  where object_id = v_id;
  return f;
end;
/

Create an interim table to receive the transformed data during redefinition:

drop table t_tmp purge;
create table t_tmp (
  object_id number,
  object_name varchar2(200),
  cname varchar2(200));

Now we can use the DBMS_REDEFINITION package. The key point here is the COL_MAPPING parameter, where we plug in the get_cname function created above to produce the data for T1:

declare
  l_colmap varchar2(512);
begin
  l_colmap :=
        'OBJECT_ID
        ,OBJECT_NAME
        ,get_cname(OBJECT_ID) cname';

  dbms_redefinition.start_redef_table
  (  uname        => user,
     orig_table   => 'T1',
     int_table    => 'T_TMP',
     col_mapping  => l_colmap );
end;
/

There's no need to worry about other dependent objects here; COPY_TABLE_DEPENDENTS takes care of all of that:

var nerrors number
begin
  dbms_redefinition.copy_table_dependents
    ( user, 'T1', 'T_TMP',
      copy_indexes => dbms_redefinition.cons_orig_params,
      num_errors => :nerrors );
end;
/

xb@PDB12C> print nerrors

   NERRORS
----------
         0

-- done
begin
  dbms_redefinition.finish_redef_table
    ( user, 'T1', 'T_TMP' );
end;
/

Finally, check the data in T1:

xb@PDB12C> select * from t1 where rownum<10;

 OBJECT_ID OBJECT_NAME          CNAME
---------- -------------------- ----------------------
         9 I_FILE#_BLOCK#       9I_FILE#_BLOCK#
        38 I_OBJ3               38I_OBJ3
        45 I_TS1                45I_TS1
        51 I_CON1               51I_CON1
        19 IND$                 19IND$
        31 CDEF$                31CDEF$
         6 C_TS#                6C_TS#
        58 I_CCOL2              58I_CCOL2
        24 I_PROXY_DATA$        24I_PROXY_DATA$

Online redefinition avoids interrupting the application and handles dependent objects for you; it works especially well for large tables.

Updating a Large Table with DBMS_PARALLEL_EXECUTE

The DBMS_PARALLEL_EXECUTE package can split one task into many small chunks and complete them concurrently, improving efficiency.

When updating a very large table, this technique helps in several ways:

  • it only locks a small subset of rows at a time instead of the whole table
  • if some chunks fail before the whole task completes, the chunks that already finished are unaffected
  • it reduces undo consumption
  • it improves throughput

Note that DBMS_PARALLEL_EXECUTE runs its chunks as jobs, so the job_queue_processes parameter must be greater than 0; otherwise the task will sit forever in a "PL/SQL lock timer" wait.

DBMS_PARALLEL_EXECUTE offers several chunking methods:

  1. CREATE_CHUNKS_BY_NUMBER_COL chunks the table by a specified column, which must be numeric; it finds the column's minimum and maximum values and divides that range evenly into chunks.
  2. CREATE_CHUNKS_BY_ROWID chunks the table by rowid, so the table must be a physical table with real rowids; index-organized tables and the like are not supported.
  3. CREATE_CHUNKS_BY_SQL chunks by a user-supplied SELECT statement.

Prerequisites for using DBMS_PARALLEL_EXECUTE:

  • to run multiple chunk jobs concurrently, the user must have the CREATE JOB privilege
  • the user needs EXECUTE on the DBMS_SQL package, because the CHUNK_BY_SQL, RUN_TASK, and RESUME_TASK subprograms issue their queries through DBMS_SQL
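As a quick sketch, assuming the XB account used in this article's examples, the setup could look like this (run by a privileged user; the EXECUTE grant on DBMS_SQL is usually already available through PUBLIC):

```sql
-- Allow XB to create the jobs that run the chunks
GRANT CREATE JOB TO xb;

-- Needed by CHUNK_BY_SQL, RUN_TASK and RESUME_TASK
GRANT EXECUTE ON DBMS_SQL TO xb;

-- Chunks run as jobs; they will hang in "PL/SQL lock timer" if this is 0
ALTER SYSTEM SET job_queue_processes = 10;
```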

Create a test table and insert some data:

DROP TABLE TEST_TAB;
CREATE TABLE test_tab
(
  REC_ID      NUMBER,
  DESCRIPTION VARCHAR2(50),
  NUM_COL     NUMBER,
  CONSTRAINT  TEST_TAB_PK PRIMARY KEY (REC_ID)
);

INSERT /*+ APPEND */ INTO
TEST_TAB
SELECT LEVEL,
'Description for ' || LEVEL,
CASE
  WHEN MOD(level, 5) = 0 THEN 10
  WHEN MOD(level, 3) = 0 THEN 20
  ELSE 30
END
FROM DUAL
CONNECT BY LEVEL <= 500000;
COMMIT;

SELECT NUM_COL, COUNT(*) FROM TEST_TAB
GROUP BY NUM_COL ORDER BY NUM_COL;

   NUM_COL   COUNT(*)
---------- ----------
        10     100000
        20     133333
        30     266667

Create the task:

BEGIN
  DBMS_PARALLEL_EXECUTE.create_task (task_name => 'test_task');
END;
/

Information about the newly created task can be viewed in a dictionary view:

COLUMN task_name FORMAT A10
SELECT task_name,
       status
FROM   user_parallel_execute_tasks;

TASK_NAME  STATUS
---------- -------------------
test_task  CREATED

Here's a chunking example, by rowid:

BEGIN
DBMS_PARALLEL_EXECUTE.CREATE_CHUNKS_BY_ROWID
  (
    TASK_NAME   => 'test_task',
    TABLE_OWNER => 'XB',
    TABLE_NAME  => 'TEST_TAB',
    BY_ROW      => TRUE,
    CHUNK_SIZE  => 2500
  );
END;
/

Check the task status again:

COLUMN task_name FORMAT A10
SELECT task_name,
       status
FROM   user_parallel_execute_tasks;

TASK_NAME  STATUS
---------- -------------------
test_task  CHUNKED

Look at the information for each individual chunk:

SELECT chunk_id, status, start_rowid, end_rowid
FROM   user_parallel_execute_chunks
WHERE  task_name = 'test_task'
ORDER BY chunk_id;

 CHUNK_ID STATUS               START_ROWID        END_ROWID
---------- -------------------- ------------------ ------------------
       257 UNASSIGNED           AAASyXAAMAAAACAAAA AAASyXAAMAAAAC5H//
       258 UNASSIGNED           AAASyXAAMAAAAC6AAA AAASyXAAMAAAADrH//
       259 UNASSIGNED           AAASyXAAMAAAADsAAA AAASyXAAMAAAAGVH//
       260 UNASSIGNED           AAASyXAAMAAAAGWAAA AAASyXAAMAAAAHHH//
       261 UNASSIGNED           AAASyXAAMAAAAHIAAA AAASyXAAMAAAAH5H//
       262 UNASSIGNED           AAASyXAAMAAAAH6AAA AAASyXAAMAAAAIrH//
       263 UNASSIGNED           AAASyXAAMAAAAIsAAA AAASyXAAMAAAAJdH//
       264 UNASSIGNED           AAASyXAAMAAAAJeAAA AAASyXAAMAAAAKPH//
       265 UNASSIGNED           AAASyXAAMAAAAKQAAA AAASyXAAMAAAALBH//
       266 UNASSIGNED           AAASyXAAMAAAALCAAA AAASyXAAMAAAALzH//
       267 UNASSIGNED           AAASyXAAMAAAAL0AAA AAASyXAAMAAAAMlH//
       268 UNASSIGNED           AAASyXAAMAAAAMmAAA AAASyXAAMAAAANXH//
       269 UNASSIGNED           AAASyXAAMAAAANYAAA AAASyXAAMAAAAOJH//
       270 UNASSIGNED           AAASyXAAMAAAAOKAAA AAASyXAAMAAAAO7H//
       271 UNASSIGNED           AAASyXAAMAAAAO8AAA AAASyXAAMAAAAPtH//
       272 UNASSIGNED           AAASyXAAMAAAAPuAAA AAASyXAAMAAAAQfH//
       273 UNASSIGNED           AAASyXAAMAAAAQgAAA AAASyXAAMAAAARRH//
       274 UNASSIGNED           AAASyXAAMAAAARSAAA AAASyXAAMAAAASDH//
       275 UNASSIGNED           AAASyXAAMAAAASEAAA AAASyXAAMAAAAS1H//
       276 UNASSIGNED           AAASyXAAMAAAAS2AAA AAASyXAAMAAAATnH//
       277 UNASSIGNED           AAASyXAAMAAAAToAAA AAASyXAAMAAAAUZH//
       278 UNASSIGNED           AAASyXAAMAAAAUaAAA AAASyXAAMAAAAVLH//
       279 UNASSIGNED           AAASyXAAMAAAAVMAAA AAASyXAAMAAAAV9H//
       280 UNASSIGNED           AAASyXAAMAAAAV+AAA AAASyXAAMAAAAWvH//
       281 UNASSIGNED           AAASyXAAMAAAAWwAAA AAASyXAAMAAAAXhH//
       282 UNASSIGNED           AAASyXAAMAAAAXiAAA AAASyXAAMAAAAYTH//
       283 UNASSIGNED           AAASyXAAMAAAAYUAAA AAASyXAAMAAAAZFH//
       284 UNASSIGNED           AAASyXAAMAAAAZGAAA AAASyXAAMAAAAZ3H//
       285 UNASSIGNED           AAASyXAAMAAAAZ4AAA AAASyXAAMAAAAapH//
       286 UNASSIGNED           AAASyXAAMAAAAaqAAA AAASyXAAMAAAAbbH//
       287 UNASSIGNED           AAASyXAAMAAAAbcAAA AAASyXAAMAAAAcNH//
       288 UNASSIGNED           AAASyXAAMAAAAcOAAA AAASyXAAMAAAAc/H//
       289 UNASSIGNED           AAASyXAAMAAAAdAAAA AAASyXAAMAAAAdxH//
       290 UNASSIGNED           AAASyXAAMAAAAdyAAA AAASyXAAMAAAAejH//
       291 UNASSIGNED           AAASyXAAMAAAAekAAA AAASyXAAMAAAAfVH//
       292 UNASSIGNED           AAASyXAAMAAAAfWAAA AAASyXAAMAAAAgHH//
       293 UNASSIGNED           AAASyXAAMAAAAgIAAA AAASyXAAMAAAAg5H//
       294 UNASSIGNED           AAASyXAAMAAAAg6AAA AAASyXAAMAAAAhrH//
       295 UNASSIGNED           AAASyXAAMAAAAhsAAA AAASyXAAMAAAAidH//
       296 UNASSIGNED           AAASyXAAMAAAAieAAA AAASyXAAMAAAAjPH//
       297 UNASSIGNED           AAASyXAAMAAAAjQAAA AAASyXAAMAAAAkBH//
       298 UNASSIGNED           AAASyXAAMAAAAkCAAA AAASyXAAMAAAAkzH//
       299 UNASSIGNED           AAASyXAAMAAAAk0AAA AAASyXAAMAAAAllH//
       300 UNASSIGNED           AAASyXAAMAAAAlmAAA AAASyXAAMAAAAmXH//
       301 UNASSIGNED           AAASyXAAMAAAAmYAAA AAASyXAAMAAAAnJH//
       302 UNASSIGNED           AAASyXAAMAAAAnKAAA AAASyXAAMAAAAn7H//
       303 UNASSIGNED           AAASyXAAMAAAAn8AAA AAASyXAAMAAAAotH//
       304 UNASSIGNED           AAASyXAAMAAAAouAAA AAASyXAAMAAAApfH//
       305 UNASSIGNED           AAASyXAAMAAAApgAAA AAASyXAAMAAAAqRH//
       306 UNASSIGNED           AAASyXAAMAAAAqSAAA AAASyXAAMAAAArDH//
       307 UNASSIGNED           AAASyXAAMAAAArEAAA AAASyXAAMAAAAr1H//
       308 UNASSIGNED           AAASyXAAMAAAAr2AAA AAASyXAAMAAAAr/H//

52 rows selected.

Complete examples

Chunking with CREATE_CHUNKS_BY_ROWID:

SET SERVEROUTPUT ON
BEGIN
  DBMS_PARALLEL_EXECUTE.DROP_TASK ('test_task');
EXCEPTION WHEN OTHERS THEN
NULL;
END;
/

DECLARE
  l_task     VARCHAR2(30) := 'test_task';
  l_sql_stmt VARCHAR2(32767);
  l_try      NUMBER;
  l_status   NUMBER;
BEGIN
  -- Create the TASK
  DBMS_PARALLEL_EXECUTE.CREATE_TASK (task_name => l_task);

  -- Chunk the table by the ROWID
  DBMS_PARALLEL_EXECUTE.CREATE_CHUNKS_BY_ROWID
  (
    TASK_NAME   => l_task,
    TABLE_OWNER => 'XB', /* Replace the TABLE_OWNER as appropriately */
    TABLE_NAME  => 'TEST_TAB',
    BY_ROW      => TRUE,
    CHUNK_SIZE  => 2500
  );

  -- DML to be executed in parallel
  l_sql_stmt := 'UPDATE test_tab t SET t.num_col = t.num_col + 10 WHERE rowid BETWEEN :start_id AND :end_id';

  -- Run the task
  DBMS_PARALLEL_EXECUTE.RUN_TASK
  (
    TASK_NAME      => l_task,
    SQL_STMT       => l_sql_stmt,
    LANGUAGE_FLAG  => DBMS_SQL.NATIVE,
    PARALLEL_LEVEL => 10
  );

  -- If there is error, RESUME it for at most 2 times.
  l_try := 0;
  l_status := DBMS_PARALLEL_EXECUTE.TASK_STATUS(l_task);

  WHILE(l_try < 2 and l_status != DBMS_PARALLEL_EXECUTE.FINISHED)
  LOOP
    l_try := l_try + 1;
    DBMS_PARALLEL_EXECUTE.RESUME_TASK(l_task);
    l_status := DBMS_PARALLEL_EXECUTE.TASK_STATUS(l_task);
  END LOOP;

  -- Done with processing; drop the task
  DBMS_PARALLEL_EXECUTE.DROP_TASK(l_task);
EXCEPTION WHEN OTHERS THEN
  DBMS_OUTPUT.PUT_LINE('Error in the code :' || SQLERRM);
END;
/ 

Chunking with CREATE_CHUNKS_BY_NUMBER_COL:

SET SERVEROUTPUT ON

DECLARE
  l_task     VARCHAR2(30) := 'test_task';
  l_sql_stmt VARCHAR2(32767);
  l_try      NUMBER;
  l_status   NUMBER;
BEGIN
  -- Create the TASK
  DBMS_PARALLEL_EXECUTE.CREATE_TASK (task_name => l_task);

  -- Chunk the table by the ID
  DBMS_PARALLEL_EXECUTE.CREATE_CHUNKS_BY_NUMBER_COL
  (
    TASK_NAME    => l_task,
    TABLE_OWNER  => 'XB', /* Replace the TABLE_OWNER value as appropriately */
    TABLE_NAME   => 'TEST_TAB',
    TABLE_COLUMN => 'REC_ID',
    CHUNK_SIZE   => 2500
  );

  -- Procedure to be executed in parallel
  l_sql_stmt := 'BEGIN PROCESS_UPDATE(:start_id, :end_id); END;';

  DBMS_PARALLEL_EXECUTE.RUN_TASK
  (
    TASK_NAME      => l_task,
    SQL_STMT       => l_sql_stmt,
    LANGUAGE_FLAG  => DBMS_SQL.NATIVE,
    PARALLEL_LEVEL => 10
  );

  -- If there is error, RESUME it for at most 2 times.
  l_try := 0;
  l_status := DBMS_PARALLEL_EXECUTE.TASK_STATUS(l_task);

  WHILE(l_try < 2 and l_status != DBMS_PARALLEL_EXECUTE.FINISHED)
  LOOP
    l_try := l_try + 1;
    DBMS_PARALLEL_EXECUTE.RESUME_TASK(l_task);
    l_status := DBMS_PARALLEL_EXECUTE.TASK_STATUS(l_task);
  END LOOP;

  -- Done with processing; drop the task
  DBMS_PARALLEL_EXECUTE.DROP_TASK(l_task);
EXCEPTION WHEN OTHERS THEN
  DBMS_OUTPUT.PUT_LINE('Error in the code :' || SQLERRM);
END;
/ 
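The PROCESS_UPDATE procedure called above is not shown in the original listing; a minimal sketch of what it might look like, assuming it applies the same NUM_COL update as the rowid example but keyed on each chunk's REC_ID range, would be:

```sql
-- Hypothetical body for PROCESS_UPDATE; the listing above only calls it
CREATE OR REPLACE PROCEDURE process_update (
  p_start_id IN NUMBER,
  p_end_id   IN NUMBER
) AS
BEGIN
  UPDATE test_tab t
  SET    t.num_col = t.num_col + 10
  WHERE  t.rec_id BETWEEN p_start_id AND p_end_id;
  COMMIT;  -- each chunk commits its own work
END process_update;
/
```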

Chunking with CREATE_CHUNKS_BY_SQL:

SET SERVEROUTPUT ON
DECLARE
  l_chunk_sql VARCHAR2(1000);
  l_sql_stmt  VARCHAR2(1000);
  l_try       NUMBER;
  l_status    NUMBER;
BEGIN
  -- Create the TASK
  DBMS_PARALLEL_EXECUTE.CREATE_TASK ('test_task');
  -- Chunk the table by NUM_COL
  l_chunk_sql := 'SELECT DISTINCT NUM_COL, NUM_COL FROM TEST_TAB';
  DBMS_PARALLEL_EXECUTE.CREATE_CHUNKS_BY_SQL
  (
    TASK_NAME => 'test_task',
    SQL_STMT  => l_chunk_sql,
    BY_ROWID  => false
  );

  -- Execute the DML in parallel
  -- the WHERE clause contain a condition on num_col, which is the chunk
  -- column. In this case, grouping rows is by num_col.
  l_sql_stmt := 'UPDATE test_tab t SET t.num_col = t.num_col + 10 WHERE num_col BETWEEN :start_id AND :end_id';
  DBMS_PARALLEL_EXECUTE.RUN_TASK
  (
    TASK_NAME      => 'test_task',
    SQL_STMT       => l_sql_stmt,
    LANGUAGE_FLAG  => DBMS_SQL.NATIVE,
    PARALLEL_LEVEL => 10
  );

  -- If there is error, RESUME it for at most 2 times.
  l_try := 0;
  l_status := DBMS_PARALLEL_EXECUTE.TASK_STATUS('test_task');

  WHILE(l_try < 2 and l_status != DBMS_PARALLEL_EXECUTE.FINISHED)
  LOOP
    l_try := l_try + 1;
    DBMS_PARALLEL_EXECUTE.RESUME_TASK('test_task');
    l_status := DBMS_PARALLEL_EXECUTE.TASK_STATUS('test_task');
  END LOOP;

  -- Done with processing; drop the task
  DBMS_PARALLEL_EXECUTE.DROP_TASK('test_task');
EXCEPTION WHEN OTHERS THEN
  DBMS_OUTPUT.PUT_LINE('Error in the code :' || SQLERRM);
END;
/

ORA-00942: table or view does not exist despite the DBA role

Ran into a small issue while creating a function.

A user with the DBA role can query a few views directly without any problem, but the same query inside a function fails with ORA-00942:

SYS@ora12c> grant dba to xb;

Grant succeeded.

SYS@ora12c> conn xb/xb
Connected.

CREATE OR REPLACE FUNCTION func_test RETURN NUMBER as
cnt number;
BEGIN
  SELECT count(1)
   into cnt
  FROM   v$mystat;
  RETURN cnt;
END func_test;
/

Warning: Function created with compilation errors.

XB@ora12c> show errors
Errors for FUNCTION FUNC_TEST:

LINE/COL ERROR
-------- -----------------------------------------------------------------
4/3      PL/SQL: SQL Statement ignored
6/10     PL/SQL: ORA-00942: table or view does not exist

Hmm? Running the query standalone works fine:

XB@ora12c> SELECT count(1)   FROM   v$mystat;

  COUNT(1)
----------
      1804

This happens because when a privilege is granted to a role, and that role is granted to a user, the user normally enjoys all of the role's privileges. Inside the user's definer's-rights PL/SQL, however, privileges obtained through roles are disabled, so the only fix is to grant the privilege to the user directly:

SYS@ora12c> grant select on v_$mystat to xb;

Grant succeeded.

CREATE OR REPLACE FUNCTION func_test RETURN NUMBER as
cnt number;
BEGIN
  SELECT count(1)
   into cnt
  FROM   v$mystat;
  RETURN cnt;
END func_test;
/

Function created.

XB@ora12c> select func_test from dual;

 FUNC_TEST
----------
      1804

Noting it down for the record.

Pipelined Table Functions

Table functions

A table function is a user-defined PL/SQL function that returns a collection of rows (an associative array or nested table). You can query it like a regular table by using the TABLE expression in a SELECT statement:

SELECT * FROM TABLE(table_function_name(parameter_list))

A table function can also take a collection of rows as input: its parameters can be nested tables, arrays, or cursors, so the result set of table function tf1 can serve as input to tf2, tf2's result set can feed tf3, and so on.
Because a traditional table function must materialize its entire result before returning, and that result is held in memory, table functions are not well suited to large data volumes.

First, build a plain table function example:

DROP TYPE t_tf_type;
DROP TYPE t_tf_obj;

CREATE TYPE t_tf_obj AS OBJECT (
  id           NUMBER,
  name  VARCHAR2(50)
);
/

CREATE TYPE t_tf_type IS TABLE OF t_tf_obj;
/

CREATE OR REPLACE FUNCTION get_tab (cnt IN NUMBER) RETURN t_tf_type AS
  l_tab  t_tf_type := t_tf_type();
BEGIN
  FOR i IN 1 .. cnt LOOP
    l_tab.extend;
    l_tab(l_tab.last) := t_tf_obj(i, 'Name is ' || i);
  END LOOP;

  RETURN l_tab;
END;
/

XB@ora12c> SELECT * FROM   TABLE(get_tab(10));

        ID NAME
---------- -------------------------
         1 Name is 1
         2 Name is 2
         3 Name is 3
         4 Name is 4
         5 Name is 5
         6 Name is 6
         7 Name is 7
         8 Name is 8
         9 Name is 9
        10 Name is 10

10 rows selected.

On 12.2 and later you can even omit the TABLE keyword:

XB@ora12c> SELECT * FROM   get_tab(10);

        ID NAME
---------- --------------------------------
         1 Name is 1
         2 Name is 2
         3 Name is 3
         4 Name is 4
         5 Name is 5
         6 Name is 6
         7 Name is 7
         8 Name is 8
         9 Name is 9
        10 Name is 10

10 rows selected.

Pipelined Table Functions

To improve table function performance, you can:

  • enable parallel execution with the PARALLEL_ENABLE option
  • stream the table function's result set directly to the next process
  • pipeline the function's results with the PIPELINED option

This post focuses on the last one.
A pipelined table function returns each row to its caller as soon as it is produced, then goes on to fetch the next row. Since it doesn't wait for the whole collection to be built, response time improves substantially; memory is also saved, because the full result set never needs to be buffered.

A pipelined table function includes the PIPELINED clause and calls PIPE ROW to return each row the moment it is produced, without waiting for the whole result to be constructed.

CREATE OR REPLACE FUNCTION get_tab_ptf (cnt IN NUMBER) RETURN t_tf_type PIPELINED AS
BEGIN
  FOR i IN 1 .. cnt LOOP
    PIPE ROW(t_tf_obj(i, 'Name is ' || i));   
  END LOOP;

  RETURN;
END;
/

XB@ora12c> SELECT * FROM   get_tab_ptf(10);

        ID NAME
---------- ------------------------------------------------------------------------------------------------------------------------------------------------------
         1 Name is 1
         2 Name is 2
         3 Name is 3
         4 Name is 4
         5 Name is 5
         6 Name is 6
         7 Name is 7
         8 Name is 8
         9 Name is 9
        10 Name is 10

10 rows selected.

When working with large data volumes, this performance advantage becomes very noticeable.

The NO_DATA_NEEDED exception

You need to understand how NO_DATA_NEEDED behaves in these scenarios:

  • the block containing the PIPE ROW statement has an OTHERS exception handler
  • your PIPE ROW code is paired with a cleanup routine
    Typically, the cleanup routine releases resources that the code no longer needs.

If a pipelined table function produces more rows than the invoking statement needs, the call to the function stops while the invoking statement continues to run; a NO_DATA_NEEDED handler at that point lets you release the resources that are no longer needed.

For example, here is a version without any exception handler:

CREATE OR REPLACE FUNCTION get_tab_ptf (cnt IN NUMBER) RETURN t_tf_type PIPELINED AS
BEGIN
  FOR i IN 1 .. cnt LOOP
    PIPE ROW(t_tf_obj(i, 'Name is ' || i));   
  END LOOP;
  RETURN;
END;
/


Function created.

XB@ora12c> SELECT * FROM   get_tab_ptf(10) where rownum<=5;

        ID NAME
---------- ------------------------------------------------------------------------------------------------------------------------------------------------------
         1 Name is 1
         2 Name is 2
         3 Name is 3
         4 Name is 4
         5 Name is 5

The statement reports no error and simply returns 5 rows. If we add an exception handler, we can see that an exception does occur:

set serveroutput on

CREATE OR REPLACE FUNCTION get_tab_ptf (cnt IN NUMBER) RETURN t_tf_type PIPELINED AS
BEGIN
  FOR i IN 1 .. cnt LOOP
    PIPE ROW(t_tf_obj(i, 'Name is ' || i));   
  END LOOP;
  RETURN;
  EXCEPTION
  WHEN OTHERS THEN
    DBMS_OUTPUT.put_line('EXCEPTION');
    RAISE;
END;
/

XB@ora12c> SELECT * FROM   get_tab_ptf(10) where rownum<=5;

        ID NAME
---------- ------------------------------------------------------------------------------------------------------------------------------------------------------
         1 Name is 1
         2 Name is 2
         3 Name is 3
         4 Name is 4
         5 Name is 5

EXCEPTION

The EXCEPTION output at the end shows the exception really was raised; it just wasn't reported explicitly.

Now catch NO_DATA_NEEDED explicitly:

CREATE OR REPLACE FUNCTION get_tab_ptf (cnt IN NUMBER) RETURN t_tf_type PIPELINED AS
BEGIN
  FOR i IN 1 .. cnt LOOP
    PIPE ROW(t_tf_obj(i, 'Name is ' || i));   
  END LOOP;
  RETURN;
  EXCEPTION
  WHEN NO_DATA_NEEDED THEN
    DBMS_OUTPUT.put_line('NO_DATA_NEEDED');
    RAISE;
  WHEN OTHERS THEN
    DBMS_OUTPUT.put_line('EXCEPTION');
    RAISE;
END;
/


Function created.

XB@ora12c> SELECT * FROM   get_tab_ptf(10) where rownum<=5;

        ID NAME
---------- ------------------------------------------------------------------------------------------------------------------------------------------------------
         1 Name is 1
         2 Name is 2
         3 Name is 3
         4 Name is 4
         5 Name is 5

NO_DATA_NEEDED

The output at the end confirms the exception really is NO_DATA_NEEDED.
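As a sketch of the cleanup scenario described earlier (a hypothetical get_names function, reusing this article's t_tf_obj/t_tf_type types), a pipelined function that reads from an explicit cursor can close it in the NO_DATA_NEEDED handler when the caller stops fetching early:

```sql
-- Hypothetical cleanup example: release the cursor when the caller
-- needs no more rows
CREATE OR REPLACE FUNCTION get_names RETURN t_tf_type PIPELINED AS
  CURSOR c IS SELECT object_id, object_name FROM dba_objects;
  r c%ROWTYPE;
BEGIN
  OPEN c;
  LOOP
    FETCH c INTO r;
    EXIT WHEN c%NOTFOUND;
    PIPE ROW(t_tf_obj(r.object_id, SUBSTR(r.object_name, 1, 50)));
  END LOOP;
  CLOSE c;
  RETURN;
EXCEPTION
  WHEN NO_DATA_NEEDED THEN
    CLOSE c;  -- the cleanup: free the open cursor
    RAISE;
END;
/
```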

Comparing memory usage

The main point is to compare how a regular table function and a pipelined table function differ in memory consumption. The collection data is buffered in PGA memory, so we can simply compare the session pga memory statistic.

Since the measurement is done inside an anonymous block, first create a small helper function:

CREATE OR REPLACE FUNCTION get_stat RETURN NUMBER AS
  l_sum  NUMBER;
BEGIN
  SELECT a.value
  INTO   l_sum
  FROM   v$mystat a,
         v$statname b
  WHERE  a.statistic# = b.statistic#
  AND    b.name = 'session pga memory';
  RETURN l_sum;
END get_stat;
/

Function created.

First, the regular table function:

set serveroutput on;

declare
  v_begin number;
  v_end   number;
BEGIN
  v_begin := sys.get_stat;
  FOR cur IN (SELECT *
              FROM   TABLE(get_tab(10000)))
  LOOP
    NULL;
  END LOOP;
  v_end := sys.get_stat;
  DBMS_OUTPUT.put_line('Begin memory is: '||v_begin);
  DBMS_OUTPUT.put_line('End memory is: '||v_end);
  DBMS_OUTPUT.put_line('Total memory used: '||(v_end-v_begin));
END;
/
Begin memory is: 4753336
End memory is: 9013176
Total memory used: 4259840

Next, switch to the pipelined table function:

set serveroutput on;
declare
  v_begin number;
  v_end   number;
BEGIN
  v_begin := sys.get_stat;
  FOR cur IN (SELECT *
              FROM   TABLE(get_tab_ptf(10000)))
  LOOP
    NULL;
  END LOOP;
  v_end := sys.get_stat;
  DBMS_OUTPUT.put_line('Begin memory is: '||v_begin);
  DBMS_OUTPUT.put_line('End memory is: '||v_end);
  DBMS_OUTPUT.put_line('Total memory used: '||(v_end-v_begin));
END;
/
Begin memory is: 4032440
End memory is: 4032440
Total memory used: 0

The result is astonishing: zero. Repeated runs keep showing 0.

This is because a pipelined table function doesn't need to buffer the whole result set in memory, so each row stays in memory only very briefly.

Cardinality

If you have ever tuned SQL involving table functions, you will have noticed that Oracle's cardinality estimate for them is always wrong. It depends on the db block size: with the default 8k, the estimate is always 8168.

XB@ora12c> 
SELECT *
  2  FROM   TABLE(get_tab_ptf(10));

        ID NAME
---------- ------------------------------------------------------------------------------------------------------------------------------------------------------
         1 Name is 1
         2 Name is 2
         3 Name is 3
         4 Name is 4
         5 Name is 5
         6 Name is 6
         7 Name is 7
         8 Name is 8
         9 Name is 9
        10 Name is 10

10 rows selected.

XB@ora12c> @x

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  2gpbj20w5kmbn, child number 0
-------------------------------------
SELECT * FROM   TABLE(get_tab_ptf(10))

Plan hash value: 822655197

---------------------------------------------------------------------------------------
| Id  | Operation                         | Name        | E-Rows |E-Bytes| Cost (%CPU)|
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                  |             |        |       |    29 (100)|
| 1   |  COLLECTION ITERATOR PICKLER FETCH| GET_TAB_PTF |   8168 | 16336 |    29   (0)|    <<<<==== estimated 8168 rows
---------------------------------------------------------------------------------------

There are several common ways to deal with this:

  • the cardinality hint, to tell the optimizer how many rows will be returned
  • the dynamic_sampling hint, for dynamic sampling
  • cardinality feedback: after the SQL actually executes, the cardinality is corrected and subsequent executions use the right value. But the feedback is stored in the shared pool, so it can be aged out.
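For the first option, a sketch with the CARDINALITY hint (an undocumented hint, so treat this as an assumption about its usual behavior rather than guaranteed syntax) would be:

```sql
-- Tell the optimizer the collection returns roughly 10 rows
SELECT /*+ CARDINALITY(t 10) */ *
FROM   TABLE(get_tab_ptf(10)) t;
```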

Dynamic sampling is a good option:

XB@ora12c> alter system flush shared_pool;

System altered.

XB@ora12c> select /*+ DYNAMIC_SAMPLING(2)*/ * FROM   TABLE(get_tab_ptf(10));

        ID NAME
---------- ------------------------------------------------------------------------------------------------------------------------------------------------------
         1 Name is 1
         2 Name is 2
         3 Name is 3
         4 Name is 4
         5 Name is 5
         6 Name is 6
         7 Name is 7
         8 Name is 8
         9 Name is 9
        10 Name is 10

10 rows selected.

XB@ora12c> @x
Display execution plan for last statement for this session from library cache...

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  76sppwrcf832x, child number 0
-------------------------------------
select /*+ DYNAMIC_SAMPLING(2)*/ * FROM   TABLE(get_tab_ptf(10))

Plan hash value: 822655197

---------------------------------------------------------------------------------------
| Id  | Operation                         | Name        | E-Rows |E-Bytes| Cost (%CPU)|
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                  |             |        |       |    11 (100)|
| 1   |  COLLECTION ITERATOR PICKLER FETCH| GET_TAB_PTF |     10 |    20 |    11   (0)|          <<<<==== the estimate is now much more accurate
---------------------------------------------------------------------------------------

   - dynamic statistics used: dynamic sampling (level=2)


15 rows selected.