Why one grouping of attributes into a relation schema may be better than another
two levels we can discuss the goodness of relation schemas.
- logical(conceptual) level - how users interpret the relation schemas and the meaning of their attributes.
- implementation (physical storage) level - how the tuples in a base relation are stored and updated.
- Top-down design approach and as such is more appropriate when performing design of database by analysis and decomposition of sets of attributes
Relational database design ultimately produces a set of relations. The implicit goals of the design activity are information preservation and minimum redundancy.
- Minimizing redundancy = minimize redundant storage of same information + reduce the need for multiple updates to maintain consistency across multiple copies of the same information.
- functional dependency: a formal constraint among attributes that is the main tool for formally measuring the appropriateness of attribute groupings into relation schemas.
normalization 与 functional dependency 的关系:
the process of normalization using functional dependencies.
1NF: 列的原子性, 即列不能够再分成其他几列 联系电话包括: 家庭电话 和 公司电话, 需要拆分为两列在表里进行储存
2NF: 完全依赖主键 OrderID 和 ProductID 也就是说, 2NF 在 1NF 的基础之上, 消除了非主属性对于码的部分函数依赖.
3NF: 任何非主属性不依赖于其它非主属性. 第三范式 (3NF) 是第二范式 (2NF) 的一个子集, 即满足第三范式 (3NF) 必须满足第二范式(2NF).
Update Anomaly(更新异常): 修改一个列会 影响到其它的列 主要包括 insertion anomalies, deletion anomalies, and modification anomalies
EMP_PROJ(Emp#, Proj#, Ename, Pname, No_hours) Update Anomaly: change the name of project number P1 from "Billing" to "Customer-Accounting" may cause this update to be made for all 100 employees working on project P1
Insert Anomaly(插入异常): Can not insert a project unless an employee is assigned to it. 混在一起, 必须要处理不同的信息
Delete Anomaly(删除异常): When a project is deleted, it will result in deleting all the employees who work on the project 会有多个行存一条 Project 信息
改变 Entity 的状态: 不应该受到影响
- Modification Anomaly(修改异常):
- Guidelines:
非异常的情况: Design a schema that does not suffer from the insertion, deletion and update anomalies
- Relations should be designed such that their tuples will have as few NULL values as possible
- The relations should be designed to satisfy the losslee join condition:
Candidate Key: 表中的一个属性或属性组, 若除 K 之外的所有属性都完全依赖于 K
如何判断一个 Relation 是否符合 2NF?
数据表中所有的 candidate key
根据 candidate key, 找出主属性
找到所有的非主属性
查看是否存在非主属性对码的部分函数依赖
完全函数依赖: 在一张表中, 如果有 X -> Y, 那么对于 X 的任何一个真子集, X' -> Y 都不成立, 则 Y 对于 X 完全函数依赖. 也就是说 X 当中少了任何一个 attribute 都不能够唯一确定 Y
部分函数依赖: 是完全函数依赖的取反, X 当中的部分的属性就可以确定 Y
传递函数依赖: Z 函数依赖于 Y, 且 Y 函数依赖于 X, 那么我们就称 Z 传递函数依赖于 X
NULL 值用在什么地方?
- not applicable or invalid
- Attribute value unknown
- Value known to exist, but unavailable
- Closure of Attributes
- Given a relation, FD, a set of attributes A, find all B such that A -> B
来源: http://www.bubuko.com/infodetail-2869334.html