CIS 9616: Data Management
[Spring 2008]
Prerequisites, Text, Description, Grading, Exams, Homeworks, Final Project, Outline.
Additional information about this course may be found on the Web at
http://knight.cis.temple.edu/~yates/cis9616/.
Lecture Time: Mondays, 4:40pm to 7:10pm, in Tuttleman 403B
Instructor: Alexander Yates
- Office: Wachman Hall, Room 303A
- E-Mail:
- Contact Hours: Mondays from 2:00pm to 4:00pm, or by appointment, or drop by and see if I'm in.
Miscellaneous:
- Our first class is on Monday, January 28, and our last class is on Monday, May 5.
- Each student should have a general Temple email address, usually of the form firstName.LastName@temple.edu
- Important student information is accessible from http://owlnet.temple.edu/
- The last day to drop the course (and get a tuition refund) is Monday, February 4, 2008.
The last day to withdraw from the course (no refund) is Monday, March 31, 2008.
Students who have previously withdrawn from this course, or who have already withdrawn from 5 courses since September 2003, may not withdraw.
- Any student who needs an accommodation based on the impact of a
disability should contact me privately to discuss the specific situation
as soon as possible. Students with documented disabilities should contact
Disability Resources and Services at
215-204-1280, in 100 Ritter Hall, to coordinate reasonable accommodations.
- Freedom to teach and freedom to learn are inseparable facets
of academic freedom. The University has adopted a policy on
Student and Faculty Academic Rights and Responsibilities
(Policy # 03.70.02) which can be accessed through the
following link:
http://policies.temple.edu/getdoc.asp?policy_no=03.70.02
- Students should be familiar with the University statement on academic
honesty, found at the following link:
http://www.temple.edu/bulletin/Responsibilities_rights/responsibilities/responsibilities.shtm
- The grievance procedures are available online
PREREQUISITES
TEXT
- Database System Concepts, 5th Edition, by Abraham Silberschatz, Henry F. Korth, and S. Sudarshan. Published by McGraw Hill; ISBN 0-07-295886-3.
DESCRIPTION
This course covers fundamental and advanced topics in Database Management Systems, including:
- Database System Architecture: ANSI/SPARC architecture; data abstraction; external, conceptual, and internal schemata; data independence; data definition and data manipulation languages.
- Data Models: Entity-relationship and relational data models; data structures, integrity constraints, and operations for each data model.
- Relational Query Languages: SQL, algebra, calculus.
- Theory of Database Design: Functional dependencies, normal forms, dependency preservation, information loss.
- Query Optimization: Equivalence of expressions, algebraic manipulation, optimization of selections and joins.
- Storage Strategies: Indices, B-trees, hashing.
- Transaction Processing: Recovery and concurrency control.
- Other advanced topics, possibly including: object-oriented and object-relational data models; parallel and distributed databases; multimedia databases and queries by content; data mining; data warehousing; mobile databases; Web databases.
GRADING
- Weekly Quizzes: 20%
- Homeworks: 20%
- Midterm: 30% (On Monday, February 25, 2008)
- Final Project: 30%
EXAMS AND QUIZZES
All exams and quizzes are closed book. Their content is cumulative, i.e., they cover
the material from the entire semester up to the day of the exam or quiz. If a student misses
the midterm because of an emergency (as agreed with the instructor), there will be no
makeup exam; instead, the homeworks, quizzes, and final project will count
proportionally more. If a student misses the midterm without prior
agreement and without definitive proof of the medical or legal reasons,
he or she will receive a zero for that exam. Missed quizzes
will not be made up.
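As an illustration only (not an official formula), one possible reading of "proportionally" is that the remaining weights from the GRADING section are rescaled so that they again sum to 100%. The short Python sketch below assumes that reading; the function name reweight_without is hypothetical, and the instructor's actual calculation may differ.

```python
# Weights taken from the GRADING section of this syllabus.
# The rescaling rule is an ASSUMED reading of "proportionally",
# not a formula stated by the instructor.
WEIGHTS = {"quizzes": 20, "homeworks": 20, "midterm": 30, "final_project": 30}


def reweight_without(weights, excused):
    """Drop one component and rescale the others so they sum to 100."""
    remaining = {name: w for name, w in weights.items() if name != excused}
    total = sum(remaining.values())
    return {name: 100 * w / total for name, w in remaining.items()}


adjusted = reweight_without(WEIGHTS, "midterm")
for name, pct in adjusted.items():
    print(f"{name}: {pct:.1f}%")
```

Under this assumption, quizzes and homeworks would each count for about 28.6% and the final project for about 42.9%.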
HOMEWORKS
Homeworks will be a mixture of programming problems and essay or short answer questions.
Each assignment must be sent by e-mail to the instructor. Homeworks will be graded, commented upon, and returned by the instructor, usually before the next homework is due.
Lateness Policy: Computer programs can be fickle and finicky, and tracking down bugs may take an unexpectedly long time.
Students therefore have three (3) late days in total (not three late days per assignment) to use on homeworks over the course of the semester. If a student does not hand in an
assignment by the due date and time, the instructor will still accept it during the following 24 hours, provided the student has any late days left. Late days
cannot be subdivided: an assignment that is late by any amount uses up a whole late day. Once all three late days have
been used, late assignments will not be accepted. Any exceptions must be approved by the instructor.
You are expected to work and complete all the homeworks on your own, except as otherwise noted.
Plagiarism will be severely punished.
See the University Policy on Plagiarism and Academic Cheating.
FINAL PROJECT
During the second half of the semester, students will have few homework assignments (except for reading), and instead will focus on the course project. Several project ideas will be suggested during the course of the semester, but students are free to suggest their own, especially if they relate to their current research. Students will be expected to come up with innovative, novel solutions to problems with modern databases.
Course projects will be undertaken by teams of 2 or 3 students. Each student on a team will receive the same grade for the project; it is up to the team members to divide the work fairly.
COURSE OUTLINE
Introduction to Relational Databases
- History of database systems. Why databases? Differing views of databases: user, administrator, application. The Relational Model: history, concept, justifications, and comparison with other models. Schemas, relations, and relational queries. The Relational Algebra: selection, join, and other operators. Database modification. Transactions. Complexity of database query languages.
Modern Structured Query Language (SQL)
- Database creation and data definition. Data types. Data manipulation: basic SQL queries; SQL joins; SQL updates, inserts, and deletes. Tuple variables, nulls, string operations, aggregate functions, and other syntactic sugar. Nested queries. SQL transactions. Views. Integrity constraints. Administration. Functions and procedures. Recursive queries.
Database Application Design and Development
- ODBC and JDBC. Development tools and user interfaces. Web applications. JSP. Triggers. Authorization and security. XML.
Relational Database Design
- The Entity-Relationship Model. ER diagrams. Relationship with relational schemas. The Unified Modeling Language. First Normal Form and other normal forms. Functional-dependency theory. Decomposition. Temporal data, geospatial data, and other tricky data types.
Transaction Management
- ACID properties. Concurrency control mechanisms. Serializability. Recoverability. Deadlock and deadlock handling. Weaker levels of consistency. Transaction failures. Recovery mechanisms. Storage structures for recovery. Log-based recovery. Remote backup.
Data Storage and Querying
- Disk drives and file systems. RAID. Organization of records in files. Index structure options. Static and dynamic hashing. B-trees and R-trees. SQL indices. Efficient query processing. Approximate queries. Query optimization.
Data Mining and Information Retrieval
- Data warehousing. Standard data mining techniques. Mining huge data sets. Statistical relational learning. Probabilistic databases. Natural language databases. Relevance ranking. Ontologies. Web search engines.
Database System Architecture
- Centralized and Client-Server Architectures. Parallel and Distributed Architectures. Scaling to really huge datasets.