Tuesday, June 8, 2010

Improving LOAD performance

Example of a LOAD instruction (small table with 219146 rows) and related costs:

LOAD FROM table.ixf of ixf INSERT INTO table;

The utility is beginning the "LOAD" phase at time 11:28:37.789940
The utility has finished the "LOAD" phase at time 11:29:09.774459
Total: about 32 sec.

LOAD FROM table.ixf of ixf SAVECOUNT 200000 INSERT INTO table DATA BUFFER 5000
SORT BUFFER 768 CPU_PARALLELISM 4;

The utility is beginning the "LOAD" phase at time 11:50:08.747113
The utility has finished the "LOAD" phase at time 11:50:29.627021
Total: about 21 sec.

Now we study the parameters that have influenced results:

CPU_PARALLELISM
Use this parameter to exploit intra-partition parallelism (if this is part of your machine’s capability), and significantly improve load performance. The parameter specifies the number of processes or threads used by the load utility to parse, convert, and format data records. The maximum number allowed is 30. If there is insufficient memory to support the specified value, the utility adjusts the value. If this parameter is not specified, the load utility selects a default value that is based on the number of CPUs on the system.
Record order in the source data is preserved regardless of the value of this parameter.
If tables include either LOB or LONG VARCHAR data, CPU_PARALLELISM is set to one. Parallelism is not supported in this case.
Although use of this parameter is not restricted to symmetric multiprocessor (SMP) hardware, you may not obtain any discernible performance benefit from using it in non-SMP environments.

DATA BUFFER
The DATA BUFFER parameter specifies the total amount of memory allocated to the load utility as a buffer. It is recommended that this buffer be several extents in size. An extent is the unit of movement for data within DB2, and the extent size can be one or more 4KB pages. The DATA BUFFER parameter is useful when working with large objects (LOBs); it reduces I/O waiting time. The data buffer is allocated from the utility heap. Depending on the amount of storage available on your system, you should consider allocating more memory for use by the DB2(R) utilities. The database configuration parameter util_heap_sz can be modified accordingly. The default value for the Utility Heap Size configuration parameter is 5 000 4KB pages. Because load is only one of several utilities that use memory from the utility heap, it is recommended that no more than fifty percent of the pages defined by this parameter be available for the load utility, and that the utility heap be defined large enough.

SORT BUFFER
This option specifies a value that overrides the sortheap database configuration parameter during a load operation. It is relevant only when loading tables with indexes and only when the INDEXING MODE parameter is not specified as DEFERRED. The value that is specified cannot exceed the value of sortheap. This parameter is useful for throttling the sort memory that is used when loading tables with many indexes without changing the value of sortheap, which would also affect general query processing.

No comments:

Post a Comment