Unit 5
Unit 5
Relational Database Design and Normalization
1. Features of Good Relational Database Designs
A well-designed relational database ensures:
- Data Integrity: Strongly enforced rules like foreign key constraints maintain the consistency of related data across tables.
- Optimized Queries: Carefully structured tables and indexed columns improve query performance, enabling faster data retrieval.
- Data Scalability: A good database design allows seamless integration of new features or tables without disrupting existing ones.
- Logical Organization: The use of clear table relationships (e.g., one-to-one, one-to-many) ensures easy interpretation and maintenance.
2. Decomposing Relational Schemas Using Functional Dependencies
What Are Functional Dependencies?
- A functional dependency (FD) exists when one attribute (or a group of attributes) uniquely determines another attribute.
- Example: In a Student table,
StudentID -> StudentName
implies eachStudentID
corresponds to exactly oneStudentName
.
- Example: In a Student table,
Why Decompose Using FDs?
- Large tables may contain redundant data, leading to anomalies. Decomposing tables using functional dependencies ensures data consistency and minimizes redundancy.
Practical Steps:
- Identify functional dependencies in the given table.
- Split the table into smaller tables based on dependencies while:
- Preserving relationships between attributes.
- Ensuring that the decomposition is lossless (no data is lost).
Example:
For a table with {OrderID, ProductID, ProductName, CustomerID, CustomerName}
:
- If
CustomerID -> CustomerName
andProductID -> ProductName
, you can decompose it into two tables:{CustomerID, CustomerName, OrderID}
{OrderID, ProductID, ProductName}
3. Implementing Normal Forms for Database Normalization
First Normal Form (1NF):
- Ensure all columns contain atomic values.
- Eliminate repeating groups or arrays.
- Example:
- Non-atomic table: | StudentID | Subjects | |———–|——————-| | 1 | Math, Science |
- 1NF-compliant table: | StudentID | Subject | |———–|———-| | 1 | Math | | 1 | Science |
Second Normal Form (2NF):
- Eliminate partial dependencies.
- A column should only depend on the entire primary key, not part of it.
- Example: If
OrderID + ProductID
is a composite key, thenCustomerName
depends only onOrderID
, which violates 2NF.
Third Normal Form (3NF):
- Eliminate transitive dependencies.
- A non-primary attribute should not depend on another non-primary attribute.
- Example: If
StudentID -> DepartmentID -> DepartmentName
, removeDepartmentName
to create a separate table{DepartmentID, DepartmentName}
.
Boyce-Codd Normal Form (BCNF):
- A stricter version of 3NF; ensures every functional dependency has a superkey as its determinant.
4. Functional Dependency Theory
Key Concepts:
- Closure of Attributes: Determines all attributes that can be functionally derived from a set of attributes.
- Example: If
A -> B
andB -> C
, thenA -> C
is derived.
- Example: If
- Candidate Keys: Minimal set of attributes that uniquely identify a record.
- Example:
{OrderID, ProductID}
in a composite key.
- Example:
- Canonical Cover: Minimal set of functional dependencies that retains the original FD set’s properties.
Importance:
This theory provides the foundation for designing and validating relational schemas, ensuring all dependencies are represented efficiently.
5. Algorithms for Decomposition Using Functional Dependencies
Lossless Decomposition Algorithm:
- Identify all candidate keys and functional dependencies.
- Divide the attributes into smaller tables ensuring:
- The original table can be reconstructed (lossless join).
- Functional dependencies are preserved (dependency preservation).
Dependency Preservation:
- Verify that each functional dependency can still be enforced in the decomposed schema.
Example:
For a table {A, B, C, D}
with A -> B
, B -> C
:
- Step 1: Identify functional dependencies.
- Step 2: Decompose into
{A, B}
,{B, C}
, and{A, D}
. - Step 3: Verify the decomposition preserves dependencies and supports lossless joins.
Advanced Database Design Concepts
1. Decomposing Schemas Using Multivalued Dependencies
What Are Multivalued Dependencies?
- A multivalued dependency (MVD) exists when one attribute determines a set of values for another, independently of other attributes in the schema.
- Example: In a
Movie
table with{Title, Actor, Genre}
, eachTitle
can be associated with multipleActors
and multipleGenres
.
- Example: In a
Why Decompose Using MVDs?
- Multivalued dependencies can lead to redundancy and anomalies in data, similar to functional dependencies.
- Decomposition resolves these issues while maintaining the original data relationships.
Practical Steps:
- Identify the multivalued dependencies in the schema.
- Decompose the table into smaller tables to eliminate redundancy.
- Ensure the decomposition is lossless and preserves dependencies.
Example:
For a table {StudentID, Course, Hobby}
:
- If
StudentID ->> Course
andStudentID ->> Hobby
(MVDs), decompose into:{StudentID, Course}
{StudentID, Hobby}
2. Understanding Additional Normal Forms
Fourth Normal Form (4NF):
- A table is in 4NF if it is in BCNF and has no non-trivial multivalued dependencies.
- Eliminates redundancy caused by MVDs.
- Example:
- A table
{StudentID, Course, Hobby}
withStudentID ->> Course
andStudentID ->> Hobby
should be split as shown above.
- A table
Fifth Normal Form (5NF) or Project-Join Normal Form (PJNF):
- A table is in 5NF if it is in 4NF and cannot be further decomposed into smaller tables without losing information.
- Addresses redundancy caused by join dependencies.
Domain-Key Normal Form (DKNF):
- Ensures all constraints are represented by domain constraints and key constraints, minimizing anomalies.
3. Applying Concepts of Atomic Domains and 1NF
Atomic Domains:
- Domains where each value is indivisible or atomic.
- Example:
- Non-atomic domain:
{Name: "John, Jane"}
- Atomic domain:
{Name: "John"}, {Name: "Jane"}
- Non-atomic domain:
First Normal Form (1NF):
- A table satisfies 1NF if:
- All attributes contain atomic values.
- There are no repeating groups or arrays.
- Example:
- Non-atomic table: | StudentID | Subjects | |———–|——————-| | 1 | Math, Science |
- 1NF-compliant table: | StudentID | Subject | |———–|———-| | 1 | Math | | 1 | Science |
4. Evaluating Systematic Design Processes
Steps in a Systematic Design Process:
- Requirement Analysis:
- Gather and analyze user requirements to identify entities, attributes, and relationships.
- Conceptual Design:
- Use Entity-Relationship (ER) diagrams to represent data and relationships visually.
- Logical Design:
- Convert the ER model into a relational schema.
- Normalization:
- Apply normalization rules to remove redundancy and anomalies.
- Physical Design:
- Define storage structures, indexing, and query optimization.
Benefits:
- Ensures a logical and efficient database structure.
- Reduces the risk of anomalies and redundancy.
- Enhances scalability and performance.
5. Modeling Temporal Data in Database Designs
Challenges of Temporal Data:
- Representing data that changes over time.
- Maintaining historical records and future projections.
Temporal Modeling Techniques:
- Valid-Time and Transaction-Time:
- Use two timestamps:
- Valid-Time: When the data is true in the real world.
- Transaction-Time: When the data is stored in the database.
- Use two timestamps:
- Temporal Tables:
- Add additional columns for timestamps to track changes over time.
- Example: | EmployeeID | Salary | StartDate | EndDate | |————|——–|————|————| | 1 | 50000 | 2023-01-01 | 2024-01-01 |
- Bitemporal Data:
- Combines valid-time and transaction-time for comprehensive tracking.
Temporal Querying:
- Specialized SQL clauses (e.g.,
PERIOD
,AS OF
) enable querying historical and future data.
Conclusion and Key Takeaways
Conclusion
Relational database design and advanced normalization techniques are fundamental for creating efficient, scalable, and reliable databases. Mastering these concepts ensures data consistency, reduces redundancy, and enhances query performance.
Key Takeaways
Lesson 1: Relational Database Design and Normalization
- Good database designs prioritize data integrity, scalability, and logical organization.
- Decomposition using functional dependencies eliminates redundancy and anomalies.
- Normal forms, from 1NF to BCNF, systematically refine tables for better structure.
- Functional dependency theory underpins effective schema design and validation.
- Algorithms for decomposition ensure lossless joins and dependency preservation.
Lesson 2: Advanced Database Concepts
- Decomposing schemas using multivalued dependencies further reduces redundancy in complex scenarios.
- Additional normal forms (4NF, 5NF, and DKNF) address anomalies beyond traditional functional dependencies.
- Atomic domains and 1NF ensure tables contain indivisible values for clean data representation.
- Systematic design processes provide a structured approach to building robust databases.
- Modeling temporal data introduces methods to handle evolving and historical information effectively.
This post is licensed under CC BY 4.0 by the author.