Database System Concepts and Architecture
Data Abstraction --> !!!View!!!
A major purpose of a DBS: providing abstract view of data
Data abstraction at several levels
DBS: retrieve data efficiently [new view means new index -> quick search]
Need for efficiency has led designers to use complex data structures to represent data in DB.
View can simplify user's interaction with the system.
Physical (internal) level:
The lowest level of abstraction describes complex low-level data structures in detail (HOW the data are stored).
Covers the physical implementation of the database to achieve optimal runtime performance and storage space utilization; the data structures and file organizations used to store data on storage devices;
Interfaces with the OS access methods to place the data on the storage devices, build the indexes, retrieve the data, and so on.
Logic (conceptual) level:
The next-higher level of abstraction describes WHAT data are stored in the DB, and WHAT relationships exist among the data.
Describes the entire DB in terms of a small number of relatively simple structures.
The community view of the database, seen by the database application.
Represent: all entities, their attributes and their relationships, the constraints on the data, semantic information about the data, security and integrity information.
View (external) Level:
Describe various parts of the entire DB. Simplify the interaction with the system. DBMS may provide many views for the same DB.
Describe the part of the DB that is relevant to each user, includes only those entities, attributes and relationships in the real world that the user is interested in, others will not be unaware by the user.
Instances and Schemas of DB
Instances of DBs: 【实例】【特定时刻的数据统称】【动态】【快照】
The collection of data stored in the DB at a particular moment is called an instance of the DB.
Schemas of DBs: 【数据库应该是什么样的】【静态】
The overall design / description of the DB is called the DB schema.
External schema: (view level, corresponding to different views)
Describe different views of the DB.
The Logic (conceptual) schema: (logic level) 【提供对整个概念的抽象，不必考虑具体实现】
Describes all the entities, attributes, and relationships together with integrity constraints.
Most important! Programmers construct applications based on this schema.
Logical data independence:
The immunity of the external-schemas to changes in the logical schema. (addition or removal of new entities, attributes, relationships not require the rewriting of application. )
The physical (internal) schema: (physical level)
Complete describe the definition of the stored records, the method of representation, the data fields, and the indexes and storage structures used.
Hidden beneath the logical schema, can usually be change easily without affecting application programs.
Physical data independence: If programs do not depend on the physical schema, thus no need for rewritten if physical schema changed. 【物理数据的独立性】
The ANSI-SPARC Three-level Architecture
- Objective of this architecture
- Separate each user's view of the DB from the way the DB is physically represented.
- Reason why Separate
- Different user should be able to access the same data, but have a different customized view of the data.
- Each user should be able to change the way he or she views the data. and this change should not affect other users.
- User should not have to deal directly with physical database storage details. (indexing, hashing)
- The DBA should be able to change the database storage structures or conceptual structure without affecting the user's views.
- The internal structure of the database should be unaffected be changes to the physical aspects of storage. (迁移硬盘)
Mapping between Schemas of the DBs.
The two stage mapping between schemas:
- Logical-physical mapping: enable DBMS to find the actual record or combination of records in physical schema, together with any constraints to be enforced on the operations for that logical record.
- External-conceptual mapping: enable the DBMS to map names in the user's view to the relevant part of the logical schema.
Data model: Underlying the structure of a DB
A collection of conceptual tools for describing data, data relationship, data semantics, and consistency constraints.
Provides a way to describe the design of the database at the physical, logical, and view level.
Components of data model:
- Structure of the data
- Operation on the data
- Constraint on the data
Relational Model: uses a set of tables (relations) to represent both data and the relationships among those data.
Record-based model: The DB is structured in fixed-format records of several types. (Relational Model is an example of a record-based model)
Entity-Relationship Model: uses a collection of basic objects, called entities, and relationships among these objects.
Objected-oriented data model:
can be seen as extending the ER model with notions of encapsulation, methods(functions), and objects identity.
combines features of the object-oriented data model and relational data model.
Semi-structured data model:
permits the specification of data where individual data items of the same type may have different sets of attributes.
Network data model and hierarchical data model: [preceded the relational data model]
were tied closely to the underlying implementation, and complicated the task of modeling data.
DDL: Data-definition Language => specify the DB schema
Specify a DB schema by a set of definitions and additional properties of the data.
Data storage and definition language:
specify the storage structure and access methods used by the DB system by a set of statements, which define the implementation details of the DB schemas, which are usually hidden from the user.
- Consistency constraints. 一致性约束
- Domain constraints. 域约束 【变量声明、列定义】
- Referential integrity 参照完整性约束 【外键】
- Assertions 断言 【数据库需要某时刻满足某一条件。实例约束。】
- Authorization (read, insert, update, delete) 授权 【对于不同用户在数据库中不同数据值上允许不同的访问类型】
DML: Data-manipulation Language => express DB queries and updates
- Retrieval of data stored in the DB
- Insertion of new data into the DB
- Deletion of data from the DB
- Modification of data stored in the DB
- Procedural DMLs: requires a user to specify WHAT data are needed and HOW to get the data. 过程化DML 要求用户指定需要什么数据,以及如何获得数据库。
- Declarative DMLs: requires a user to specify WHAT data are needed WITHOUT SPECIFYING HOW to get those data. 声明式DML 只要求用户指定需要什么数据,而不指明如何获得这些数据。
Database design: mainly involves the design of the DB schemas
- Initial phase: Characterize fully the data needs of the prospective DB users.
- Conceptual-design phase: Choose a data model, and by applying it into a conceptual schema of the DB
- Logical-design phase: Map the high-level conceptual schema onto the implementation data model.
- Physical-design phase: Specify the physical features if the DB
Functional Components of a Database System
Authorization and integrity manager: tests the constraints, check the authorization.
Transaction manager: ensure consistent.
a collection of operations that performs a single logical function in a DB application.
- Ensure atomicity and durability (recovery manager)
- Ensure interactions amount concurrent transaction (concurrency-control manager)
- Consistency: 数据库事务不能破坏关系数据的完整性以及业务逻辑上的一致性。
- Isolation 事务不会查看到中间状态的数据
- Durability 持久性：只要事务成功结束，它对数据库所做的更新就必须永久保存下来
File manager: allocation of space, data structure
- Data files: DB itself
- Data dictionary: mete-data about the structure of the DB
- Indices: fast access to data item
Buffer manager: fetching data into main memory, deciding cache.
Query processor: Help to simplify and facilitate access to data.
- DDL interpreter
- DML complier: perform query optimization
- Query evaluation engine: execute low-level instructions generated by the DML complier
Query processor = Query complier + Execution engine
Transaction processor = Logging and recovery manager + Concurrency-control manager (scheduler)
Functions / Services of any Full-scale DBMS
- Data storage, retrieval and update
- User-accessible catalog
- Transaction support
- Concurrency control services
- Recovery services
- Authorization services
- Support for data communication
- Integrity services
- Services to promote data independence
- Utility services
Database Application Architetures
- The two-tier architecture: Client + Server
- The three-tier architecture:
- Client: User interface
- Server (With DBMS) : Business Logic + Data processing Logic
- DBMS: Data Validation + database access
Database Users and User Interface
- Naive users (interact with invoking one of the application programs)
- Application programmers (write application programs)
- Sophisticated users (interact without writing program)
- Specialized users (write specialized DB app which not fit into traditional data-processing framework)
person with central control
Storage structure and access-method definition
Schema and physical-organization modification
Granting of authorization for data access
- Periodically backing up the database
- Ensuring that enough free disk space is available for normal operations, updating disk space as required
- Monitoring jobs running on the database and ensuring the performancec