Data Distribution in Teradata


This post explains how Teradata distributes data across its AMPs.


PEs are assigned either to LAN connections or to channel connections (e.g., an IBM mainframe). The AMPs always store data as 8-bit ASCII. If the input arrives in another form, such as EBCDIC, the PE converts it to ASCII before any hashing or data distribution takes place.
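The conversion step can be illustrated with a small Python sketch. This is not Teradata code: the function name to_ascii is hypothetical, and the cp037 code page is an assumed stand-in for whatever EBCDIC variant the mainframe host actually uses.

```python
# Illustrative sketch only; not how a Teradata PE is implemented.
# Assumption: the channel-attached host sends EBCDIC in code page 037.

def to_ascii(ebcdic_bytes: bytes) -> bytes:
    """Convert EBCDIC (cp037) input to ASCII, as would happen before hashing."""
    text = ebcdic_bytes.decode("cp037")   # interpret the bytes as EBCDIC
    return text.encode("ascii")           # re-encode as ASCII for storage

if __name__ == "__main__":
    incoming = "CUST001".encode("cp037")  # simulate input from an EBCDIC host
    print(incoming.hex())                 # EBCDIC byte values
    print(to_ascii(incoming))             # the same text as ASCII bytes
```

The point of the sketch is the ordering: the value is normalised to one character set first, so the same logical value hashes the same way whether it came from a LAN client or a mainframe.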

A USER may be defined with a HOST setting or with COLLATION = MULTINATIONAL, EBCDIC, or ASCII. If COLLATION = EBCDIC, or the HOST is an EBCDIC host, the AMPs convert the data from ASCII to EBCDIC before doing any sorts or comparisons. MULTINATIONAL collation allows sites to create their own collation file. Otherwise, all sorts and comparisons use the ASCII collating sequence.
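To see why the collation setting matters, here is a rough Python sketch that sorts the same values under both collating sequences. The helper names are hypothetical, and cp037 is again an assumed EBCDIC code page rather than the exact one Teradata uses.

```python
# Illustrative sketch only: compares ASCII and EBCDIC sort order.
# Assumption: cp037 stands in for the EBCDIC collating sequence.

def sort_ascii(values):
    """Sort by the ASCII byte values of each string."""
    return sorted(values, key=lambda s: s.encode("ascii"))

def sort_ebcdic(values):
    """Sort by the EBCDIC (cp037) byte values of each string."""
    return sorted(values, key=lambda s: s.encode("cp037"))

if __name__ == "__main__":
    values = ["apple", "Banana", "42"]
    print(sort_ascii(values))   # digits, then uppercase, then lowercase
    print(sort_ebcdic(values))  # lowercase, then uppercase, then digits
```

In ASCII, digits sort before uppercase letters and uppercase before lowercase; in EBCDIC the order is reversed, so the same ORDER BY can return rows in a different sequence depending on the collation in effect.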

Teradata does not have the concept of pre-allocated table space. The rows of all tables are distributed randomly across all AMPs after hashing, and then randomly within the space available on the selected AMP. Data distribution depends directly on the hash value of the primary index.
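The idea of hash-based distribution can be sketched in a few lines of Python. This is a simplified model, not Teradata's algorithm: MD5 is a stand-in for the real hashing function, the modulo step is a stand-in for Teradata's hash-to-AMP mapping, and NUM_AMPS, row_hash, and amp_for are hypothetical names chosen for illustration.

```python
# Simplified model of hash distribution; not Teradata's actual algorithm.
# Assumptions: MD5 replaces the real row hash, modulo replaces the real
# hash-to-AMP mapping, and the system has four AMPs.
import hashlib
from collections import Counter

NUM_AMPS = 4  # hypothetical system size

def row_hash(pi_value: str) -> int:
    """Derive a 32-bit hash from the primary index value."""
    digest = hashlib.md5(pi_value.encode("ascii")).digest()
    return int.from_bytes(digest[:4], "big")

def amp_for(pi_value: str, num_amps: int = NUM_AMPS) -> int:
    """Pick the AMP that owns a row, based only on its primary index hash."""
    return row_hash(pi_value) % num_amps

if __name__ == "__main__":
    # Unique primary index values spread across the AMPs.
    unique_rows = Counter(amp_for(f"CUST{i:05d}") for i in range(10_000))
    print("unique PI:", dict(unique_rows))

    # A primary index with only one distinct value puts every row on one AMP.
    skewed_rows = Counter(amp_for("SAME") for _ in range(10_000))
    print("single-value PI:", dict(skewed_rows))
```

Because the target AMP is a function of the primary index value alone, the same value always lands on the same AMP. A primary index with many distinct values tends to spread rows evenly, while one with few distinct values concentrates rows on a few AMPs.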

AMP stands for Access Module Processor.
PE stands for Parsing Engine.

