Data modeling is a method used to define and analyze data requirements needed to support the business processes of an organization. The data requirements are recorded as a conceptual data model with associated data definitions. Actual implementation of the conceptual model is called a logical data model. To implement one conceptual data model may require multiple logical data models. Data modeling defines the relationships between data elements and structures. Data modeling is also a technique for defining business requirements for a database. It is sometimes called database modeling because a data model is eventually implemented in a database.
Conceptual schema
Conceptual schema
Logical schema
Physical schema
Modeling methodologies:
Bottom-up models are often the result of a reengineering effort. They usually start with existing data structures forms, fields on application screens, or reports. These models are usually physical, application-specific, and incomplete from an enterprise perspective.
Top-down logical data models are created in an abstract way by getting information from people who know the subject area. A system may not implement all the entities in a logical model, but the model serves as a reference point or template.
Top-down logical data models are created in an abstract way by getting information from people who know the subject area. A system may not implement all the entities in a logical model, but the model serves as a reference point or template.
Evaluation in organization use of data warehouses:
Off line Operational Database
Data warehouses in this initial stage are developed by simply copying the data off an operational system to another server where the processing load of reporting against the copied data does not impact the operational system's performance.
Off line Data Warehouse
Data warehouses at this stage are updated from data in the operational systems on a regular basis and the data warehouse data is stored in a data structure designed to facilitate reporting.
Real Time Data Warehouse
Data warehouses at this stage are updated every time an operational system performs a transaction (e.g. an order or a delivery or a booking.)
Integrated Data Warehouse
Data warehouses at this stage are updated every time an operational system performs a transaction and then generate transactions that are passed back into the operational systems.
Dimension: A dimension is a data element that categorizes each item in a data set into non-overlapping regions.
The primary function of dimensions is threefold: to provide filtering, grouping and labeling. For example, in a data warehouse where each person is categorized as having a gender of male, female or unknown, a user of the data warehouse would then be able to filter or categorize each presentation or report by either filtering based on the gender dimension or displaying results broken out by the gender.
Types of dimension:
Confirmed Dimension: Dimension connecting to multiple facts is called confirmed dimension. This dimension does not change with time. Some examples are time dimension, customer dimension and product dimension
Junk Dimension: Consolidated dimension from several (two or more) smaller dimension is called junk dimension.
Degenerated Dimension: Fact containing attribute from dimension is called degenerated dimension. Fact generally contains measure, but since one or more attribute of (to be) dimension is present inside this fact, it is treated as dimension.
Role playing dimension: Dimensions are often recycled for multiple applications within the same database. For instance, a "Date" dimension can be used for "Date of Sale", as well as "Date of Delivery", or "Date of Hire". This is often referred to as a "role-playing dimension".
Junk Dimension: Consolidated dimension from several (two or more) smaller dimension is called junk dimension.
Degenerated Dimension: Fact containing attribute from dimension is called degenerated dimension. Fact generally contains measure, but since one or more attribute of (to be) dimension is present inside this fact, it is treated as dimension.
Role playing dimension: Dimensions are often recycled for multiple applications within the same database. For instance, a "Date" dimension can be used for "Date of Sale", as well as "Date of Delivery", or "Date of Hire". This is often referred to as a "role-playing dimension".
Codd’s Rules for normalization:
1NF: Eliminate Repeating Groups - Make a separate table for each set of related attributes, and give each table a primary key.
2NF: Eliminate Redundant Data - If an attribute depends on only part of a multi-valued key, remove it to a separate table. To avoid update and delete anomalies!
3NF: Eliminate Columns Not Dependent On Key - If attributes do not contribute to a description of the key, remove them to a separate table.
No comments:
Post a Comment