TPUMP in Teradata


TPUMP is a flexible multi-session load utility.  The only characteristic of TPUMP that limits its use is its transaction-oriented nature.  If the situation allows execution of Fastload or Multiload, these utilities will always outperform TPUMP in a large batch operation when measuring Teradata resource consumption and response time.  Even so, consideration of TPUMP in a load scenario should occur before that of Fastload or Multiload, because TPUMP has none of the disadvantages of Multiload (table locking, load slot usage, single-AMP operations), Fastload (empty table only, load slot usage), or BTEQ (no checkpointing or restartability).

Best practices for the development of TPUMP applications:

  • Logtable and error tables are job specific.  
  • The BTEQ clean up step drops the logtable and prepares the target table.
  • The file layout section should completely define the input file.  Replace .FILLER with .FIELD, and handle any NULLIF logic in the DML section of the TPUMP script.
  • If the script is doing complex updates, ROBUST ON is in the "BEGIN LOAD" statement.  
  • If the script is doing a high volume of simple inserts (due to a high PACK factor and/or a high number of sessions) and the checkpoint is not 1, ROBUST ON is in the "BEGIN LOAD" statement.
  • The PACK factor is lower than the maximum (32K block). You will receive a warning if it is too high.
  • SERIALIZE is on for UPSERT scripts.
  • If maintaining a table with a nonunique primary index (NUPI), you probably need to use the SERIALIZE ON feature to prevent blocking.
  • When using SERIALIZE ON, be sure to specify KEY on each of the primary index fields in the INFILE_LAYOUT section.
  • All dates are formatted.
  • All 6-digit dates in non-character fields are converted to CHAR(06) so that the sliding date rule can be applied properly.
  • The number of sessions is appropriate.
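
The checklist above can be read against a minimal sketch of a TPUMP UPSERT job. This is an illustration under stated assumptions, not a reference script: every database, table, column, and file name, the logon string, and the SESSIONS, PACK, and CHECKPOINT values are hypothetical placeholders, and the date conversion assumes the input carries dates as YYMMDD text.

```sql
/* Hypothetical TPUMP job. All object names, the logon string, the  */
/* input file name, and the numeric settings are placeholders.      */

/* Job-specific restart log table. */
.LOGTABLE load_db.cust_upsert_log;
.LOGON tdpid/load_user,load_password;

/* ROBUST ON and SERIALIZE ON because this is an UPSERT script;     */
/* the error table is job specific, and PACK is kept well below     */
/* the maximum.                                                     */
.BEGIN LOAD
    SESSIONS 4
    PACK 20
    CHECKPOINT 15
    ROBUST ON
    SERIALIZE ON
    ERRORTABLE load_db.cust_upsert_err;

/* Every input field is defined with .FIELD (no .FILLER).           */
/* KEY marks the primary index field, as SERIALIZE ON requires.     */
.LAYOUT infile_layout;
.FIELD cust_id    * VARCHAR(10) KEY;
.FIELD cust_name  * VARCHAR(30);
.FIELD open_date  * VARCHAR(6);

/* NULLIF logic and date formatting live in the DML section.        */
.DML LABEL upsert_cust
    DO INSERT FOR MISSING UPDATE ROWS;
UPDATE load_db.customer
   SET cust_name = NULLIF(:cust_name, '')
 WHERE cust_id = :cust_id;
INSERT INTO load_db.customer (cust_id, cust_name, open_date)
VALUES (:cust_id,
        NULLIF(:cust_name, ''),
        CAST(:open_date AS DATE FORMAT 'YYMMDD'));

.IMPORT INFILE customer.dat
    FORMAT VARTEXT '|'
    LAYOUT infile_layout
    APPLY upsert_cust;

.END LOAD;
.LOGOFF;
```

A separate BTEQ cleanup step, not shown here, would drop load_db.cust_upsert_log and prepare the target table before the job runs.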


No Primary Index Tables


The syntax for the CREATE TABLE statement has been changed to permit user data tables to be created without a primary index. Such tables are referred to as NoPI (No Primary Index) tables.
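
As a minimal sketch of the syntax, a NoPI staging table might be declared as follows. The database, table, and column names are hypothetical.

```sql
-- Hypothetical staging table; all names are placeholders.
-- NoPI tables are MULTISET tables, so MULTISET is stated explicitly.
CREATE MULTISET TABLE stage_db.sales_stage
(
    sale_id    INTEGER,
    store_id   INTEGER,
    sale_date  DATE,
    amount     DECIMAL(12,2)
)
NO PRIMARY INDEX;
```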

This feature provides a performance advantage when using FastLoad or TPump Array INSERT to load data into staging tables. Because NoPI staging tables have no row-ordering constraints, the system can always append rows to the end of a NoPI table. Rows in a NoPI table can also be stored on any AMP, which is advantageous for TPump Array INSERT operations because many rows can then be packed into a single AMP step, thus dramatically reducing the performance burden on both CPU and I/O. After a NoPI staging table has been populated, the table can be processed further using SQL DML statements such as DELETE, INSERT, and SELECT.

While using a NoPI table, you can:
• Manipulate rows directly using most SQL DML statements, or move its rows into a primary-indexed target table using INSERT…SELECT, MERGE, or UPDATE…FROM statements.
• Create unique secondary indexes to avoid full-table scans during row access. For example, while single-AMP retrieval of NoPI rows by means of their primary index is not possible, you can work around this through the appropriate assignment of unique secondary indexes (USIs) to NoPI tables, and by careful construction of request conditions to specify those USIs to access individual NoPI table rows.
• Create nonunique secondary indexes (NUSIs) to facilitate set processing retrieval of rows from NoPI tables.
• Avoid full-table scans when deleting a set of rows from a NoPI table by assigning secondary indexes and specifying them in your request conditions.

Benefits:
• Enhanced performance for FastLoad bulk data loads into staging tables.
• Enhanced performance for TPump Array INSERT minibatch loads into staging tables.
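
The operations listed above can be sketched in SQL. This example assumes a NoPI staging table named stage_db.sales_stage and a primary-indexed target named prod_db.sales; both names, the columns, and the filter value are hypothetical.

```sql
-- Hypothetical names throughout: stage_db.sales_stage is assumed to be
-- a NoPI table and prod_db.sales a target table with a primary index.

-- USI: reach an individual staged row without a full-table scan.
CREATE UNIQUE INDEX (sale_id) ON stage_db.sales_stage;

-- NUSI: support set-oriented retrieval and deletes.
CREATE INDEX (store_id) ON stage_db.sales_stage;

-- Move the staged rows into the primary-indexed target table.
INSERT INTO prod_db.sales (sale_id, store_id, sale_date, amount)
SELECT sale_id, store_id, sale_date, amount
FROM stage_db.sales_stage;

-- Delete a set of rows, with the condition on the NUSI column.
DELETE FROM stage_db.sales_stage
WHERE store_id = 101;
```

In this sketch the indexes are created on a table that is already populated, so the load itself runs without any secondary index to maintain.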

Points to note:
• Because a NoPI table has no primary index, all row access is done using full-table scans unless a secondary index is defined.
• The drawback to using secondary indexes for NoPI tables is that while they can enhance query processing significantly, they can also reduce load performance.
• You cannot modify a NoPI table using SQL UPDATE or UPSERT.
• This feature introduces a new DBS Control flag, PrimaryIndexDefault, that determines the behavior of a CREATE TABLE statement that does not explicitly specify any of the following:
  • PRIMARY INDEX clause
  • NO PRIMARY INDEX clause
  • PRIMARY KEY or UNIQUE constraints
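
For illustration, the statement below is one that falls under the PrimaryIndexDefault flag, because it specifies none of the three items. The table and column names are hypothetical, and the resulting table structure depends on how the flag is set on the system.

```sql
-- Hypothetical table: no PRIMARY INDEX clause, no NO PRIMARY INDEX clause,
-- and no PRIMARY KEY or UNIQUE constraint, so PrimaryIndexDefault
-- determines how the table is created.
CREATE TABLE stage_db.event_stage
(
    event_id  INTEGER,
    event_ts  TIMESTAMP(0),
    payload   VARCHAR(200)
);

-- Inspect the DDL the system stored to see which default was applied.
SHOW TABLE stage_db.event_stage;
```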