SQL and Indexes :
Reducing Table Update Time :
1) Primary indexes: Use primary indexes for joins whenever
possible, and specify in the where clause all the columns for the primary
indexes.
2) Secondary indexes (10% rule rumor): The optimizer does not actually use a
10% rule to determine if a secondary index will be used. But, this is a good
estimation: If less than 10% of a table will be accessed if the secondary index
is used, then assume the sql will use the secondary index. Otherwise, the sql
execution will do a full table scan.
The optimizer actually uses a “least cost” method: The optimizer determines if the cost of using a secondary index is cheaper than the cost of doing a full table scan. The cost involves the cpu usage, and diskio counts.
The optimizer actually uses a “least cost” method: The optimizer determines if the cost of using a secondary index is cheaper than the cost of doing a full table scan. The cost involves the cpu usage, and diskio counts.
3) Constants: Use constants to specify index column
contents whenever possible, instead of specifying the constant once, and
joining the tables. This may provide a small savings on performance.
4) Mathematical operations: Mathematical operations are faster than
string operations (i.e. concatenation), if both can achieve the same result.
5) Variable length columns: The use of variable length columns
should be minimized, and should be by exception. Fixed length columns should
always be used to define tables.
6) Union: The “union” command can be used to
break up a large sql process or statement into several smaller sql processes or
statements, which would run in parallel. But these could then cause spoolspace
limit problems. “Union all” executes the sql’s single threaded.
7) Where in/where not in (subquery): The sql “where in” is more efficient
than the sql “where not in”. It is more efficient to specify constants in
these, but if a subquery is specified, then the subquery has a direct impact on
the sql time.
If there is a sql time problem with the subquery, then the sql subquery could be separated from the original query. This would require 2 sql statements, and an intermediate table. The 2 sql statements would be: 1) New sql statement, which does the previous subquery function, and inserts into the temporary table, and 2) Modified original sql statement, which doesn’t have the subquery, and reads the temporary table.
If there is a sql time problem with the subquery, then the sql subquery could be separated from the original query. This would require 2 sql statements, and an intermediate table. The 2 sql statements would be: 1) New sql statement, which does the previous subquery function, and inserts into the temporary table, and 2) Modified original sql statement, which doesn’t have the subquery, and reads the temporary table.
8) Strategic Semicolon: At the end of every sql statement,
there is a semicolon. In some cases, the strategic placement of this semicolon
can improve the sql time of a group of sql statements. But this will not
improve an individual sql statement’s time. These are a couple cases: 1) The
group’s sql time could be improved if a group of sql statements share the same
tables (or spool files), 2) The group’s sql time could be improved if several
sql statements use the same unix input file.
Reducing Large SQL’s :
The following methods can be used to scope down the size of
sql’s.
1) Table denormalization: Duplicating data in another table.
This provides faster access to the duplicated data, but requires more update
time.
2) Table summarization: The data from one/many table(s) is
summarized into commonly used summary tables. This provides faster access to
the summarized data, but requires more update time.
3) SQL union: The DBC/SQL Union can be used to break up a
large SQL process or statement into several smaller SQL processes or
statements, which would run in parallel.
4) Unix split: A large input unix files could be split into
several smaller unix files, which could then be input in series, or in
parallel, to create smaller SQL processing steps.
5) Unix concatenation: A large query could be broken up into
smaller independent queries, whose output is written to several smaller unix
files. Then these smaller files are unix concatenated together to provide a
single unix file.
6) Trigger tables: A group of tables, each contains a subset
of the keys of the index of an original table. the tables could be created
based on some value in the index of the original table. This provides an
ability to break up a large SQL statement into multiple smaller SQL statements,
but creating the trigger tables requires more update time.
7) Sorts (order by): Although sorts take time, these are
always done at the end of the query, and the sort time is directly dependent on
the size of the solution. Unnecessary sorts could be eliminated.
8) Export/Load: Table data could be exported (Bteq,
Fastexport) to a unix file, and updated, and then reloaded into the table
(Bteq, fastload, Multiload).
9) C PROGRAM/UNIX SCRIPTS: Some data manipulation is very
difficult and time consuming in sql. These could be replaced with c
programs/unix scripts. See the “C/Embedded sql” tip.
1) Table update time can be improved by dropping the table’s
indexes first, and then doing the updates. After the completion of the updates,
then rebuild the indexes, and recollect the table’s statistics on the indexes.
The best improvement is obtained when the volume of table updates is large in
relation to the size of the table. If more then 5% of a large table is changed.
2) Try to avoid
dropping a table, instead, delete the table. Table related statements (i.e.
create table, drop table) are single threaded thru a system permissions table
and become a bottleneck. They can also cause deadlocks on the dictionary
tables. Also, any user permissions specific to the table are dropped when the
table is dropped, and these permissions must be recreated.
Excellent information on Teradata SQL in a crisp. Many thanks for posting ...
ReplyDeleteThanks
ReplyDeleteThank you..Very useful info
ReplyDeleteGood tips in a best summarized way. Excellent.
ReplyDeleteGood Asim Good....
DeleteGood Summarization !!
ReplyDeleteExcellent !
ReplyDeleteThanks , very useful!
ReplyDelete