Captain's Log: Database Management Systems, Spring 2008

1: January 18 (Friday)
Introduction and motivation. Go through syllabus.
Cover selected slides from Chapter 1 in textbook.
January 21 (Monday)
University Holiday (Martin Luther King)
2: January 23 (Wednesday)
Embark on Chapter 2, covering most of Sections 2.1 and 2.2 on relational algebra with fundamental operations.
We use many of the first 25 slides from Chapter 2.
3: January 25 (Friday)
Homework 1 out.
Cover (most of) Section 2.3, introducing natural join.
Do Exercise 2.1(b).
Look at slides 25--39,44.
4: January 28 (Monday)
Introduce the division operation (Sect. 2.3.3), generalized projection (Sect. 2.4.1), aggregate functions (Sect. 2.4.2), and modification of the database (Sect. 2.6).
Do Exercises 2.5(b,c,d,e).
Look at slides 40--43, 46--47, 48--53, 60--66.
5: January 30 (Wednesday)
Cover outer joins (Sect. 2.4.3) and null values (Sect. 2.5), thus finishing Chapter 2.
Do Exercises 2.11(a,b), 2.9(b), and 2.1(c) (with and without explicit max operator).
Look at Chap 2 slides, 54--59.
6: February 1 (Friday)
Homework 1 in.
Cover Sections 3.1-3.2, and Sections 3.3.1-3.3.5, introducing the basics of SQL.
Look at Chap 3 Slides, 1--17.
Do Exercise 2.8 (with and without aggregates), Exercise 2.11 (d), and Exercise 2.3 (b).
7: February 4 (Monday)
Homework 2 out.
Cover Sections 3.3.6-3.3.8, and Sections 3.4-3.6, introducing several useful SQL constructs.
Do Exercises 3.9(a-d), and Exercise 3.1(a).
Look at Chap 3 Slides, 18--31.
8: February 6 (Wednesday)
Cover Sections 3.7.1-3.7.3, and Section 3.8.
Do Exercise 3.2(g) (using "having"), Exercise 3.2(c) (using "not in"), Exercise 3.9(e) (using "≤= all", and using "with"), Exercise 3.2 (e) (encoding division).
Look at Chap 3 Slides, 32--39, 44.
Show how to set up and run MySQL, running simple queries, like Exercise 3.9(e).
Last day for 100% refund for a course with 78 or more calendar days (10 or more weeks in length).
9: February 8 (Friday)
Finish Chapter 3, covering Section 3.7.4, and Sections 3.9-3.11.
Look at Chap 3 Slides, pp. 40-62.
Do Exercises 3.7, 3.14, 3.21(b).
Play around with MySQL.
10: February 11 (Monday)
Homework 2 in.
Go through model solutions for Homework 1.
Embark on Chapter 4 on advanced features of SQL, covering Sections 4.1 & 4.2. Do Exercise 4.4.
Look at Chap 4 Slides, pp. 1-19.
11: February 13 (Wednesday)
Go through model solutions for Homework 2.
Finish Section 4, covering Sections 4.3, 4.4 & 4.6. (We skip Sections 4.5 & 4.8, and postpone Section 4.7.)
Do Exercise 4.10.
Look at Chap 4 Slides, pp. 20-27, 43-45, 50-52.
Last day for 50% refund for a course with 78 or more calendar days (10 or more weeks in length).
12: February 15 (Friday)
Homework 3 out. Graded Homework 1 back.
Embark on Chapter 5, covering the tuple relational calculus (Sect. 5.1).
Do Exercise 5.1(a,b,c), Exercise 5.11, Exercise 5.6(g,h).
Look at Chap 5 Slides, pp. 3-11.
13: February 18 (Monday)
Briefly cover the domain relational calculus (Sect. 5.2), the theoretical foundation for QBE (Query-by-Example), doing Exercise 5.6(g&h), and Exercise 5.2(a&b).
Cover the main aspects of QBE (Sect 5.3), doing Exercise 5.12(b&d&f).
Look at Chap 5 Slides, pp. 12-27.
14: February 20 (Wednesday)
Finish QBE (Section 5.3), doing Exercises 5.12(g) (with and without aggregates), 5.12(h), and 5.13.
Embark on Datalog (Sect 5.4).
Look at Chap 5 Slides, pp. 28-39, 51-56.
Thursday (February 21) is last day to drop a course without a W being recorded for a course with 98 calendar days (14 or more weeks in length).
15: February 22 (Friday)
Continue Datalog (Sect 5.4), with focus on negation.
Do Exercises 5.12(bcdefg).
Look at Chap 5 Slides, pp. 53,56-63.
16: February 25 (Monday)
Homework 3 in. Graded Homework 2 back.
Finish Datalog, with focus on recursion which adds a lot of power; for recursion in SQL, see Section 4.7. Also briefly discuss how recursion and negation interacts in Datalog.
Do Exercises 5.12(h) and 5.4(d).
Look at Chap 5 Slides, pp. 64-73.
17: February 27 (Wednesday)
Introduction to relational database design, covering Section 7.1 and Section 7.3.1.
Look at relation schema r(S,A,D) where S -> A and A -> D: decomposing into (S,A) and (A,D) is lossless and dependency-preserving; decomposing into (S,A) and (S,D) is lossless but not dependency-preserving; decomposing into (S,D) and (A,D) is not lossless.
Do Exercises 7.2 & 7.9.
Prove a sufficent condition for a decomposition to be lossless.
Look at Chap 7 Slides, pp. 1--7, 11--16.
18: February 29 (Friday)
Homework 4 out.
Finish Sections 7.2 & 7.3, motivating and introducing various normal forms; 3NF is motivated by considering a relation with the dependencies C -> P and PH -> C.
We can always decompose into BNCF, and always decompose dependency preserving into 3NF; but sometime we have to choose between (i) get BNCF but not preserve dependencies, (ii) preserve dependencies but "only" get 3NF.
Cover Section 7.4.1 on the closure of functional dependencies.
Look at Chap 7 Slides, pp. 8-10, 17-27.
19: March 3 (Monday)
Cover Section 7.4 on functional dependencies, including the computation of canonical cover, and checking for dependency preservation.
Do Exercises 7.5 & 7.22.
Illustrate the computation of canonical cover by the example on top of p.285.
Illustrate the algorithm on p.287 (lower part), checking for dependency preservation, on two ways of decomposing (S,A,D) with S -> A and A -> D.
Look at Chap 7 Slides, pp. 25--44.
Amtoft has no office hours Tuesday.
20: March 5 (Wednesday)
Graded Homework 3 back; go through model solutions.
Cover Section 7.5.1, decomposing into BCNF.
Do Exercises 7.25 and 7.27, decomposing the schema from Exercise 7.1 into BCNF while realizing it is already in 3NF.
Look at Chap 7 Slides, pp. 17,20,43--47, 55(where the schema is already in 3NF!)
21: March 7 (Friday)
Homework 4 in, go through model solutions.
Decomposition into 3NF (Sections 7.5.2 & 7.5.3), illustrated on various examples: decomposing ABCD wrt. dependencies A -> C and B -> D shows why one schema must contain a candidate key; decomposing ABCD wrt. dependencies AB -> CD and A -> C shows why one must first take the canonical cover.
Look at Chap 7 Slides, pp. 48--58.
22: March 10 (Monday)
Optional review session, going through some previous midterm exam questions.
23: March 12 (Wednesday)
1:30-3:20pm: Midterm exam.
24: March 14 (Friday)
Introduce multi-valued dependencies (Sect. 7.6.1); we prove that alpha -> beta implies alpha ->-> beta, and show (Exercise 7.29) that ->-> has some surprising properties.
Introduce "The Chase".
Look at Chap 7 Slides, pp. 23, 59-64.
March 17-21
Spring (Easter) break.
25: March 24 (Monday)
Graded midterm exam back; go through model solutions.
Homework 5 out.
Last day to drop a course with 98 calendar days (14 or more weeks in length) with a "W".
26: March 26 (Wednesday)
One more example of the chase (Question 2 in this set with these solutions).
Introduce 4NF (Section 7.6.2), and prove that 4NF implies BCNF.
Decomposition into 4NF (Section 7.6.3).
Prove that decomposing (A,B,C) into (A,B) and (B,C) is lossless if and only if there is a multi-valued dependency B ->-> A (equivalent to B ->-> C).
Look at Chap 7 Slides, pp. 65--68, 73.
27: March 28 (Friday)
Other normal forms (Section 7.7), including general join dependencies.
Take a step back, looking at the big design picture (Section 7.8).
Look at Chap 7 Slides, pp. 69--72.
Embark on the E-R model (Chapter 6), covering Sections 6.1-6.4.
Look at Chap 6 Slides, pp. 1-27.
28: March 31 (Monday)
Homework 5 in. Homework 6 out.
Try out E-R design, doing Exercises 6.1 & 6.4 & 6.17.
29: April 2 (Wednesday)
Continue the E-R model, covering Sections 6.5 (design issues), 6.6 (weak entity sets), 6.7.1-6.7.4 (generalization/specialization), and 6.9.1-6.9.3 (translation into schemas).
Do Exercise 6.2, and translate the resulting E-R diagram into tables, cf. Exercise 6.16.
Look at Chap 6 Slides, pp. 28-43.
30: April 4 (Friday)
All have now Oracle accounts.
Graded Homework 5 back, go through model solutions.
Finish the E-R model, covering the rest of Chapter 6 (except Section 6.11).
Look at Chap 6 Slides, pp. 44-61.
31: April 7 (Monday)
Homework 6 in. Project out.
Embark on Data Analysis and Mining (Chapter 18), covering Data Analysis and OLAP (Section 18.2).
Do Exercises 18.1 & 18.3. mention Exercises 18.4 & 18.8. Express "ranking" (even the "dense" kind), using the "count" aggregate.
Look at Chap 18 Slides, pp. 1--23.
32: April 9 (Wednesday)
Go through model solutions for (the very open-ended) Homework 6.
Embark on Data Mining (Section 18.4), covering Association Rules (Sect. 18.4.3), with key notions of support and confidence.
Look at Chap 18 Slides, pp. 30, 43--46.
33: April 11 (Friday)
Wrap up Association rules, doing Exercise 18.13 and mentioning Exercise 18.15.
Cover Data Warehousing (Section 18.3), doing exercise 18.7 and mentioning Exercise 18.5.
Run a simple "rollup" query in Oracle.
Continue Data Mining, covering decision tree classifiers (Sect. 18.4.2.1), motivated by an example of constructing a decision tree (taken from Maria Zamfir Bleyberg). We looked at an example on how to use the notion of information-gain when computing the best split.
Look at Chap 18 Slides, pp. 24--29, 32--36, 38--39, 47.
34: April 14 (Monday)
An example on how to use the notion of information-gain ratio when computing the best split.
Cover Bayesian classifiers (Sect. 18.4.2.2), where
                 P(WZ|X)P(X)
 P(X|WZ) =  ------------------------
           P(WZ|X)P(X) + P(WZ|~X)P(~X)

and where in the "naive" case we can assume that P(WZ|X) = P(W|X)P(Z|X).
Look at Chap 18 Slides, pp. 37, 40--41.
Embark on Chapter 15 on Transactions, introducing the ACID properties.
Look at Chap 15 Slides, pp. 1--11.
35: April 16 (Wednesday)
Part I of Project in. Homework 7 out (later today).
Finish Chapter 15 on transactions, introducing the key concepts of (conflict) serializability and (cascadeless) recoverability.
Look at Chap 15 Slides, pp. 12--34.
Embark on concurrency control, motivating lock-based protocols (Section 16.1) while pointing out potential pitfalls.
Look at Chap 16 Slides, pp. 1--7.
April 18 (Friday)
Class canceled (All-University Open House)
36: April 21 (Monday)
Graded Homework 6 back.
Graded Phase I of project back.
Continue Chapter 16 on concurrency control, covering lock-based protocols (Section 16.1); this includes those based on graph ordering (Section 16.1.5) and on multiple granularity (Section 16.4).
Sketch a proof that the two-phase locking protocol yields only serializable schedules (cf. Exercise 16.1).
Do Exercise 16.5; mention Exercises 16.12&15.
Look at Chap 16 Slides, pp. 8--11, 16--23.
37: April 23 (Wednesday)
Continue concurrency control, covering Do Exercises 16.16&22&25.
Encourage looking at the suggested solution to Exercise 16.14, comparing the various protocols.
Look at Chap 16 Slides, pp. 31--47.
Thursday April 24, Amtoft has office hours from 3pm to 4pm.
38: April 25 (Friday)
Homework 7 in.
We now turn our attention to back-end issues, and cover Sections 12.1-12.3, with focus on B+ trees (deciding to ignore the gritty details of insertion and deletion algorithms).
Look at Chap 12 Slides, pp. 1--32.
39: April 28 (Monday)
Finish Chapter 12: wrap up B+ trees; then focus on Hashing.
Do Exercise 12.7.
Look at Chap 12 Slides, pp. 33--72.
April 30 (Wednesday)
Part II of Project in. (Prepare to give demos very soon.)
Class canceled (Landon Lecture by CIA director Hayden).
40: May 2 (Friday)
Homework 8 out. Graded Homework 7 back.
Introduction to Query Processing (Sections 13.1-13.2), doing Exercise 13.1.
Cover Section 13.3, discussing a variety of ways to implement selection.
Cover Section 13.4, discussing sorting.
Start on Section 13.5, discussing a variety of ways to implement join: nested-loop (13.5.1), block nested-loop (13.5.2), indexed nested-loop (13.5.3).
Look at Chap 13 Slides, pp. 1--28.
41: May 5 (Monday)
Finish Section 13.5, introducing merge join (13.5.4) and hash join (13.5.5).
Look at Chap 13 Slides, pp. 29--41.
Introduction to Query Optimization (Section 14.1), covering Size Estimation for selection and join (Sections 14.3.1-3).
Do Exercises 14.4 & 14.5.
Look at Chap 14 Slides, pp. 1--5, 33--41.
42: May 7 (Wednesday)
Homework 8 in. Go through model solutions for Homework 7.
Finish Estimating Statistics (Sections 14.3.4-5).
Cover Section 14.2 on Equivalent Expressions.
Do Exercise 14.2(abc), and some of Exercise 14.3(b) (for part a, see pp 86--89 in textbook).
Cover Section 14.4 (except Section 14.4.4) on choice of evaluation plan.
Give an example (p.589) where "perform selection early" is a bad idea.
Look at Chap 14 Slides, pp. 6--30, 42--46.
43: May 9 (Friday)
Review session.
Spent most of the time on going through model solutions for Homework 8.
Also very briefly mention Sections 13.6-7 (thus wrapping up Query Processing), and Section 14.5 (Materialized Views).
May 12 (Monday)
Final exam, 4:10-6:00pm


Torben Amtoft