Chapter 7 – Accessing Organizational Data
- structured data - data files; dabase records; database tables
- unstructured data - document / text files
- Managing Organizational Data
- Problems with Separate Files and File Ownership
- Separated and Isolated Data
- Data Duplication - data integrity
- Application Program Dependency on Data (e.g. Year 2000 crisis)
- Incompatible Files
- The Difficulty of Representing Data in Users' Perspective (not natural)
- The Database Solution
Definition of a Database
bits -> bytes(characters) -> fields -> records -> files -> database
A database is a self-describing collection of integrated records
- A Database is Self-Describing - description is in a data dictionary
- A Database is a Collection of Integerated Records - uses indexes
- A Database is a Model of a Model (model of the users' model = views)
- Integrated Data - database
- Reduced Data Duplication
- Program/Data Independence
- Easier Representation of the Users' Perspective
- requires a database administrator
- database management system (DBMS) - software that manages & creates the databases
- data manipuation language (DML) - uses to program the DBMS
- database query language - used to formulate queries and reports
- data definition language (DDL) - used to define the structure and schema of the database
- Enterprise Databases - requires a centralized enterprise-wide database
- Client/Server Databases - allocates data to a database server and assigns local responsibility for its maintenance and integrity [database appears local to the user]
- Data Warehousing - database created specifically for retrieval by knowledge workers
- Review of Data Concepts
- Entities, Attributes, and Relationships
- entity (entity class) - type or category of objects about which data is collected
- entity instance - individual instance within the category
- attribute - characteristic of interest about an entity (field)
- identifier (key) - attribute that distinguishes one entity instance from another
- relationship - correspondence or association between entities
- degree of a relationship is determined by the largest # of instances on each end of the relationship
- Relating Entity Terminology to Files, Records, and Data Items
- Data Names - ex. CUSTOMERID
- Data Types - integer, real, string, date [each data item has a domain]
- Data Structures
- E-R Diagram
- Data Structure Diagram
- Semantic Object Diagram
- Algebraic Data Description
- Data Order (collating sequences: ASCII, EBCDIC)
- Relational Databases Maintained by the Organization
need to develop tables for querying and reports
- Identifying the Tables of Interest - select tables; include
- names and types of attributes (include meanings)
- identify which attributes are keys and foreign keys
- determine the domains of each attribute
- describe constraints on the data
- Formatting the Output or Report
- data for use in spreadsheet analysis
- data for use as part of a document
- report output
- Determining Selection and Ordering Criteria
- Forming and Executing a Database Query (SQL - Structured Query Language vs. QBE - Query By Example)
ex. SELECT CUSTOMER-NAME, ORDER-NO, AMOUNT
FROM ORDERS
ORDER BY AMOUNT DESCENDING
WHERE AMOUNT > 1000 AND MONTH = CURRENT-MONTH
- Validating Query Results
- estimate the number of records to be retrieved prior to the query operation
- inspect or print a sample from each of the tables being used in order to review your understanding of the tables and their values
- make rough estimates for any calculations such as totals that will result from your queries
- compare your results with previous queries or with estimates by others in your workgroup
- do a complex query by starting with the simplest part and adding query elements one by one
- sort results so as to highlight possible errors or omissions
- perform a second query using independently formulated conditions to verify that the WHERE clause and other parts of the expression have been processed as expected
- consider a complementary selection (or at least a count)
- Online Analytical Processing (OLAP) - multidimensional analysis combining analytic and retrieval software
- Constraints on Accessing Organizational Databases (who has priviliges; views into the database; snapshots of data)
- Knowledge Work Productivity in Accessing Organizational Data
- Efficient Use of Data Access Software - should include a GUI, easy to learn, and help facilities
- Obtaining Data for Direct Use - should have a GUI for designing queries and reports
- Obtaining Data for Spreadsheet Analysis
- Obtaining Data for Word Processing Documents (RFT)