Teradata SQL Reference: June 2012

List of Interview Questions On Collect Statistics

1.Suppose if collect stats is created on table , we can delete the data of the table by using delete statement those collect stats also be deleted or not?

Ans: No collect stats will not be deleted on table. We need to drop it manually by using DROP COLLECT STATISTICS statement.

2.Where these Collect statistics are actually stored?

Ans: Collect Statistics are stored in DBC views like IndexStats,ColumnStats and MultiColumnStats.

3.What is the purpose of Collect Statistics Command?

Ans: The purpose is to gather and store demographic data for indices or join index or one or more columns of a table. If you want optimizer to use high confidence, You should create COLLECT STATISTICS on columns or indices

Advantages of Partition Primary Index

· Range queries eliminates the use of Full Table Scan.
· PPI is an alternate to secondary index, it provides an excellent solution compared to secondary indexes
. PPI can make modeling and querying easier.
· We can't use fastload and multiload with all secondary indexes, But we an use Fastload and Multiload work with PPI tables.

The POSITION Function

The POSITION function Returns the actual position of the character which occurs first. POSITION function is ANSI standard.

Teradata has an equivalent function called INDEX.

Both the POSITION and INDEX functions returns position of character's first occurrence in a string.

Examples for the POSITION function

SELECT POSITION( 'u' IN 'formula'); Displays Result as '5'

SELECT POSITION( 'fo' IN 'formula'); Displays Result as '1'

SELECT POSITION( 'e' IN 'formula'); Displays Result as '0'

Examples for the INDEX function.

SELECT INDEX('formula', 'u'); Displays Result as '5'

SELECT INDEX('formula', 'mu'); Displays Result as '4'

SELECT INDEX('formula', 'e'); Displays Result as '0'

MERGE INTO Statement

The MERGE INTO statement 'merges' a source row into a target table based on whether any rows in the target table match a specified condition with the source row.

MERGE INTO replicates the functionality of the Upsert capability provided by the UPDATE with INSERT option. This option updates a row if it exists and inserts the row if it does not exists.

Example 1:

Add a new row for Department 700 using the MERGE INTO form of Upsert.

MERGE INTO department

USING VALUES (700,'Shipping',800000.00)

AS Dept (deptnum, dept_name, budgamt)

ON Dept.deptnum = dept_num

WHEN MATCHED THEN UPDATE

SET budget_amount = Dept.budgamt

WHEN NOT MATCHED THEN INSERT

VALUES (Dept.deptnum, Dept.dept_name,Dept.budgamt,NULL);

SELECT dept_num AS Dept

, department_name (CHAR(10))AS Dept_Name

, budget_amount

, manager_employee_number

FROM department

WHERE dept_num = 700;

Dept Dept_Name budget_amount manager_employee_number

------ ---------- ------------------ -----------------------

700 Shipping 800000.00 ?

Things to notice about this example:

• This example does an insert of a 'Shipping' department row.

• The USING clause defines the source row to be merged.

• The AS clause defines the name of the temporary source table and columns which contain the source row. This is also referred to as the 'correlation table'.

• The ON clause followed by condition returns the matching rows in both the source and target tables.

• It must use the primary index of the target table.

• If the Primary Index is non-unique, additional qualifying conditions are required to reduce the test to a single target row.

Example 2:

Update the budget for Department 700 to be $9,900,000 using the MERGE INTO form of Upsert.

MERGE INTO department

USING VALUES (700,'Shipping',9900000.00) AS Dept (deptnum, dept_name, budgamt)

ON Dept.deptnum = dept_num

WHEN MATCHED THEN UPDATE

SET budget_amount = Dept.budgamt

WHEN NOT MATCHED THEN INSERT

VALUES (Dept.deptnum, Dept.dept_name,Dept.budgamt,NULL);

SELECT dept_num AS Dept

, department_name (CHAR(10))AS Dept_Name

, budget_amount

, manager_employee_number

FROM department

WHERE dept_num = 700;

Dept Dept_Name budget_amount manager_employee_number

------ ------------ ------------------ ---------------------------------

700 Shipping 9900000.00 ?

Things to notice about this example:

• This example does an update of the previously inserted 'Shipping' department row.

Hashing Algorithm

When the primary index value of a row is input to the hashing algorithm, then the output is called the “row hash”. Row hash is the logical storage address of the row, and identifies the amp of the row. Also, the “table id” plus the row hash identifies the cylinder and data block, and is used for row distribution, placement and retrieval of the row. Based on the row hash uniqueness, data distribution happens.

The “table id” is a sequential number assigned whenever a table is created. This number changes whenever a table is re-created.

“Hash code redistribution” is used in join operation. This is used when the foreign key (join column) of a table (i.e. table A) is joined to a primary index of another table (i.e. table B). For each table A row, the row hash of the foreign key is calculated. Then, the table A row is sent to the amp dictated by the row hash, which is the same amp that contains table B’s row for that row hash.

“Join column hash code sequence” is the result of a sorting. The row hash of the foreign key (join column) of a table (i.e. table A) is sorted into this sequence. These are matched in sequence to the other table (i.e. table B) on the same amp.