CIS 9616: Data Management [Spring 2008]


Additional information about this course may be found on the Web at
http://knight.cis.temple.edu/~yates/cis9616/.

Lecture Time: Mondays, 4:40pm to 7:10pm, in Tuttleman 403B

Instructor: Alexander Yates

PREREQUISITES

TEXT

DESCRIPTION

This course covers fundamental and advanced topics in Database Management Systems, including:

GRADING

EXAMS AND QUIZZES

All exams and quizzes are closed book. Their content is cumulative, i.e., they cover the material from the entire semester up to the day of the exam. If a student misses the midterm because of an emergency (as agreed with the instructor), there will be no makeup exam; instead, the homeworks, quizzes, and final project will count proportionally more. If a student misses the midterm without prior agreement and without definitive proof of a medical or legal reason, he or she will receive a zero for that exam. Missed quizzes will not be made up. The final exam must be taken on the scheduled day.

HOMEWORKS

Homeworks will be a mixture of programming problems and essay or short answer questions. Each assignment must be sent by e-mail to the instructor. The homeworks will be graded, commented upon, and returned by the instructor usually before the next homework is due.

Lateness Policy: Computer programs can be fickle and finicky. Tracking down bugs may take an unexpectedly long time. Students will have three (3) late days total (not three late days per assignment) to use on homeworks during the course of the semester, in case tricky bugs crop up. If a student does not hand in an assignment by the due date and time, the instructor will still accept the assignment during the next 24 hours, provided the student has any late days left. Late days cannot be subdivided: an assignment that is late by any amount uses a whole late day. Once all three late days have been used, late assignments will not be accepted. Any exceptions must be approved by the instructor.

You are expected to work and complete all the homeworks on your own, except as otherwise noted. Plagiarism will be severely punished. See the University Policy on Plagiarism and Academic Cheating.

FINAL PROJECT

During the second half of the semester, students will have few homework assignments (except for reading), and instead will focus on the course project. Several project ideas will be suggested during the course of the semester, but students are free to suggest their own, especially if they relate to their current research. Students will be expected to come up with innovative, novel solutions to problems with modern databases.

Course projects will be undertaken by teams of 2 or 3 students. Each student on a team will receive the same grade for the project; it is up to the team members to divide the work fairly.

COURSE OUTLINE

Introduction to Relational Databases

History of database systems. Why databases? Differing views of databases: user, administrator, application. The Relational Model: history, concept, justifications, and comparison with other models. Schemas, relations, and relational queries. The Relational Algebra: selection, join, and other operators. Database modification. Transactions. Complexity of database query languages.

Modern Structured Query Language (SQL)

Database creation and data definition. Data types. Data manipulation: basic SQL queries; SQL joins; SQL updates, inserts, and deletes. Tuple variables, nulls, string operations, aggregate functions, and other syntactic sugar. Nested queries. SQL transactions. Views. Integrity constraints. Administration. Functions and procedures. Recursive queries.

Database Application Design and Development

ODBC and JDBC. Development tools and user interfaces. Web applications. JSP. Triggers. Authorization and security. XML.

Relational Database Design

The Entity-Relationship Model. ER diagrams. Relationship with relational schemas. The Unified Modeling Language. First Normal Form and other normal forms. Functional-dependency theory. Decomposition. Temporal data, geospatial data, and other tricky data types.

Transaction Management

ACID properties. Concurrency control mechanisms. Serializability. Recoverability. Deadlock and deadlock handling. Weaker levels of consistency. Transaction failures. Recovery mechanisms. Storage structures for recovery. Log-based recovery. Remote backup.

Data Storage and Querying

Disk drives and file systems. RAID. Organization of records in files. Index structure options. Static and dynamic hashing. B-trees and R-trees. SQL indices. Efficient query processing. Approximate queries. Query optimization.

Data Mining and Information Retrieval

Data warehousing. Standard data mining techniques. Mining huge data sets. Statistical relational learning. Probabilistic databases. Natural language databases. Relevance ranking. Ontologies. Web search engines.

Database System Architecture

Centralized and client-server architectures. Parallel and distributed architectures. Scaling to really huge datasets.