The multiple-granularity protocol rules specify that a transaction Ti can lock a node Q in S or IS mode only if Ti currently has the parent of Q locked in either IX or IS mode. Given that SIX and S locks are stronger than IX or IS locks, why does the protocol not allow locking a node in S or IS mode if the parent is locked in either SIX or S mode?
> Discuss the advantages and disadvantages of the two methods that we presented in Section 23.3.4 for generating globally unique timestamps.
> If we apply a distributed version of the multiple-granularity protocol of Chapter 18 to a distributed database, the site responsible for the root of the DAG may become a bottleneck. Suppose we modify that protocol as follows: • Only intention-mode locks
> In the majority protocol, what should the reader do if it finds different values from different copies, to (a) decide what is the correct value, and (b) to bring the copies back to consistency? If the reader does not bother to bring the copies back to consi
> Give an example where the read one, write all available approach leads to an erroneous state.
> What characteristics of an application make it easy to scale the application by using a key-value store, and what characteristics rule out deployment on key-value stores?
> Consider a system that is processing a stream of tuples for a relation r with attributes (A, B, C, timestamp). Suppose the goal of a parallel stream processing system is to compute the number of tuples for each A value in each 5-minute window (based on the
> Suppose you wish to perform keyword querying on a set of tuples in a database, where each tuple has only a few attributes, each containing only a few words. Does the concept of term frequency make sense in this context? And that of inverse document frequ
> The attribute on which a relation is partitioned can have a significant impact on the cost of a query. a. Given a workload of SQL queries on a single relation, what attributes would be candidates for partitioning? b. How would you choose between the alter
> Using the university schema, write an SQL query to find section(s) with maximum enrollment. The result columns should appear in the order “course_id, sec_id, year, semester, num”. (It may be convenient to use the with construct.)
> What is the motivation for work-stealing with virtual nodes in a shared-memory setting? Why might work-stealing not be as efficient in a shared-nothing setting?
> Suppose you wish to handle a workload consisting of a large number of small transactions by using shared-nothing parallelism. a. Is intraquery parallelism required in such a situation? If not, why, and what form of parallelism is appropriate? b. What fo
> Describe a good way to parallelize each of the following: a. The difference operation b. Aggregation by the count operation c. Aggregation by the count distinct operation d. Aggregation by the avg operation e. Left outer join, if the join condition involv
> Can partitioned join be used for r ⋈r? A
> Joins can be expensive in a key-value store, and difficult to express if the system does not support SQL or a similar declarative query language. What can an application developer do to efficiently get results of join or aggregate queries in such a setting?
> Why is it easier for a distributed file system such as GFS or HDFS to support replication than it is for a key-value store?
> What is the motivation for storing related records together in a key-value store? Explain the idea using the notion of an entity group.
> What factors could result in skew when a relation is partitioned on one of its attributes by: a. Hash partitioning? b. Range partitioning? In each case, what can be done to reduce the skew?
> Consider the E-R diagram in Figure 8.9, which contains specializations, using subtypes and subtables. a. Give an SQL schema definition of the E-R diagram. b. Give an SQL query to find the names of all people who are not secretaries. c. Give an SQL query t
> For each of the three partitioning techniques, namely, round-robin, hash partitioning, and range partitioning, give an example of a query for which that partitioning technique would provide the fastest response.
> Suppose that a major database vendor offers its database system (e.g., Oracle, SQL Server, DB2) as a cloud service. Where would this fit among the cloud-service models? Why?
> Using the university schema, write an SQL query to find the number of students in each section. The result columns should appear in the order “course_id, sec_id, year, semester, num”. You do not need to output sections with 0 students.
> In a shared-nothing system, data access from a remote node can be done by remote procedure calls, or by sending messages. But remote direct memory access (RDMA) provides a much faster mechanism for such data access. Explain why.
> Assume we have data items d1, d2, ..., dn with each di protected by a lock stored in memory location Mi. a. Describe the implementation of lock-X(di) and unlock(di) via the use of the test-and-set instruction. b. Describe the implementation of lock-X(di)
> Memory systems today are divided into multiple modules, each of which can be serving a separate request at a given time, in contrast to earlier architectures where there was a single interface to memory. What impact has such a memory architecture had on
> What are the factors that can work against linear scaleup in a transaction processing system? Which of the factors are likely to be the most important in each of the following architectures: shared memory, shared disk, and shared nothing?
> Is it wise to allow a user process to access the shared-memory area of a database system? Explain your answer.
> Database systems are typically implemented as a set of processes (or threads) accessing shared memory. a. How is access to the shared-memory area controlled? b. Is two-phase locking appropriate for serializing access to the data structures in shared memo
> Assume that a growing enterprise has outgrown its current computer system and is purchasing a new parallel computer. If the growth has resulted in many more transactions per unit time, but the length of individual transactions has not changed, what measu
> Consider the schemas for the table people, and the tables students and teachers, which were created under people, in Section 8.2.1.3. Give a relational schema in third normal form that represents the same information. Recall the constraints on subtable
> If an enterprise uses its own ERP application on a cloud service under the platform-as-a-service model, what restrictions would there be on when that enterprise may upgrade the ERP system to a new version?
> Consider a bank that has a collection of sites, each running a database system. Suppose the only way the databases interact is by electronic transfer of money between themselves, using persistent messaging. Would such a system qualify as a distributed da
> Suppose there is a transaction that has been running for a very long time but has performed very few updates. a. What effect would the transaction have on recovery time with the recovery algorithm of Section 19.4, and with the ARIES recovery algorithm? b.
> Using the university schema, write an SQL query to find the ID and title of each course in Comp. Sci. that has had at least one section with afternoon hours (i.e., ends at or after 12:00). (You should eliminate duplicates if any.)
> Consider the log in Figure 19.5. Suppose there is a crash just before the log
> Explain why logical undo logging is used widely, whereas logical redo logging (other than physiological redo logging) is rarely used.
> Physiological redo logging can reduce logging overheads significantly, especially with a slotted page record organization. Explain why.
> Suppose two-phase locking is used, but exclusive locks are released early, that is, locking is not done in a strict two-phase manner. Give an example to show why transaction rollback can result in a wrong final state, when using the log-based recovery al
> Outline the drawbacks of the no-steal and force buffer management policies.
> Explain how the database may become inconsistent if some log records pertaining to a block are not output to stable storage before the block is output to disk.
> Redesign the database of Exercise 8.4 into first normal form and fourth normal form. List any functional or multivalued dependencies that you assume. Also list all referential-integrity constraints that should be present in the first and fourth normal form
> Stable storage cannot be implemented. a. Explain why it cannot be. b. Explain how database systems deal with this problem.
> For each of the following requirements, identify the best choice of degree of durability in a remote backup system: a. Data loss must be avoided, but some loss of availability may be tolerated. b. Transaction commit must be accomplished quickly, even at
> Explain the difference between a system crash and a “disaster.”
> In the ARIES recovery algorithm: a. If at the beginning of the analysis pass, a page is not in the checkpoint dirty page table, will we need to apply any redo records to it? Why? b. What is RecLSN, and how is it used to minimize unnecessary redos?
> Rewrite the preceding query, but also ensure that you include only instructors who have given at least one other non-null grade in some course.
> Compare log-based recovery with the shadow-copy scheme in terms of their overheads for the case when data are being added to newly allocated disk pages (in other words, there is no old value to be restored in case the transaction aborts).
> Consider the log in Figure 19.7. Suppose there is a crash during recovery, just before the operation abort log record is written for operation O1. Explain what will happen when the system recovers again.
> Explain the difference between the three storage types — volatile, nonvolatile, and stable— in terms of I/O cost.
> Suppose the lock hierarchy for a database consists of database, relations, and tuples. a. If a transaction needs to read a lot of tuples from a relation r, what locks should it acquire? b. Now suppose the transaction wants to update a few of the tuples i
> Describe the differences in meaning between the terms relation and relation schema.
> Although SIX mode is useful in multiple-granularity locking, an exclusive and intention-shared (XIS) mode is of no use. Why is it useless?
> In multiple-granularity locking, what is the difference between implicit and explicit locking?
> If deadlock is avoided by deadlock-avoidance schemes, is starvation still possible? Explain your answer.
> Under what conditions is it less expensive to avoid deadlock than to allow deadlocks to occur and then to detect them?
> Consider a variant of the tree protocol called the forest protocol. The database is organized as a forest of rooted trees. Each transaction Ti must follow the following rules: • The first lock in each tree may be on any data item. • The second, and all su
> Using the university schema, write an SQL query to find the ID and name of each instructor who has never given an A grade in any course she or he has taught. (Instructors who have never taught a course trivially satisfy this condition.)
> Consider the following locking protocol: All items are numbered, and once an item is unlocked, only higher-numbered items may be locked. Locks may be released at any time. Only X-locks are used. Show by an example that this protocol does not guarantee se
> Most implementations of database systems use strict two-phase locking. Suggest three reasons for the popularity of this protocol.
> Many transactions update a common item (e.g., the cash balance at a branch) and private items (e.g., individual account balances). Explain how you can increase concurrency (and throughput) by ordering the operations of the transaction.
> Give example schedules to show that with key-value locking, if lookup, insert, or delete does not lock the next-key value, the phantom phenomenon could go undetected.
> Show that the following decomposition of the schema R of Exercise 7.1 is not a lossless decomposition:
> Explain the reason for the use of degree-two consistency. What disadvantages does this approach have?
> Explain the phantom phenomenon. Why may this phenomenon lead to an incorrect concurrent execution despite the use of the two-phase locking protocol?
> Consider a relation r (A, B, C) and a transaction T that does the following: find the maximum A value. Assume that an index is used to find the maximum A value. a. Suppose that the transaction locks each tuple it reads in S mode, and the tuple it creates in X
> Outline the key similarities and differences between the timestamp-based implementation of the first-committer-wins version of snapshot isolation, described in Exercise 18.15, and the optimistic-concurrency-control-without-read-validation scheme, descri
> As discussed in Exercise 18.15, snapshot isolation can be implemented using a form of timestamp validation. However, unlike the multiversion timestamp-ordering scheme, which guarantees serializability, snapshot isolation does not guarantee serialize a
> Under a modified version of the timestamp protocol, we require that a commit bit be tested to see whether a read request must wait. Explain how the commit bit can prevent cascading abort. Why is this test not necessary for write requests?
> Consider the following SQL query on the university schema: select avg(salary) - (sum(salary) / count(*)) from instructor. We might expect that the result of this query is zero since the average of a set of numbers is defined to be the sum of the numbers d
> List four significant differences between a file-processing system and a DBMS.
> Show that there are schedules that are possible under the two-phase locking protocol but not possible under the timestamp protocol, and vice versa.
> When a transaction is rolled-back under timestamp ordering, it is assigned a new timestamp. Why can it not simply keep its old timestamp?
> Using the functional dependencies of Exercise 7.6, compute B+.
> What benefit does strict two-phase locking provide? What disadvantages result?
> For each of the following isolation levels, give an example of a schedule that respects the specified level of isolation but is not serializable: a. Read uncommitted b. Read committed c. Repeatable read
> Explain why the read-committed isolation level ensures that schedules are cascade-free.
> Why do database systems support concurrent execution of transactions, despite the extra effort needed to ensure that concurrent execution does not cause any problems?
> What is a recoverable schedule? Why is recoverability of schedules desirable? Are there any circumstances under which it would be desirable to allow nonrecoverable schedules? Explain your answer.
> Give an example of a serializable schedule with two transactions such that the order in which the transactions commit is different from the serialization order.
> Consider the following two transactions: Let the consistency requirement be A = 0 ∨ B = 0, with A = B = 0 as the initial values. a. Show that every serial execution involving these two transactions preserves the consistency of t
> Explain the distinction between the terms serial schedule and serializable schedule.
> Write the following queries in relational algebra, using the university schema. a. Find the ID and name of each instructor in the Physics department. b. Find the ID and name of each instructor in a department located in the building “Watson”. c. Find the
> During its execution, a transaction passes through several states, until it finally commits or aborts. List all possible sequences of states through which a transaction may pass. Explain why each state transition may occur.
> Use Armstrong’s axioms to prove the soundness of the decomposition rule.
> List four applications you have used that most likely employed a database system to store persistent data.
> What is international entrepreneurship? Why is it important?
> What motives might encourage managers to overdiversify their firm?
> What incentives and resources encourage diversification?
> This Mini-Case includes descriptions of recent AmEx innovations. Do you anticipate that most of these innovations resulted from autonomous strategic behavior or from induced strategic behavior? Why?
> What actions do you believe AmEx should take to establish an entrepreneurial mind-set among employees throughout the company?
> Use material from Chapter 4 to identify the business-level strategy AmEx uses. What dimensions do you believe AmEx should emphasize to use the strategy you identified successfully across time?
> This Mini-Case suggests that a lack of continuous innovation contributed to American Express’s (AmEx) poor performance in 2014. Assuming this is true, what factors might prevent a firm the size and scope of AmEx from being able to innovate continuously
> Using information in this Mini-Case as well as additional materials available to you via searches, how do you evaluate Tim Cook as a CEO? Is he an effective strategic leader or not? Use examples from the chapter’s discussion of “Key Strategic Leadership
> Given their different leadership styles, describe the differences you see in Apple’s culture under Tim Cook’s leadership compared to the culture in Apple when Steve Jobs was CEO.
> Tim Cook came from Apple’s internal managerial labor market to succeed Steve Jobs. In your view, was using the internal managerial labor market the best approach to follow when replacing Jobs? Use materials in the chapter regarding the internal and exter
> What makes a CEO’s job so complex? Use the challenge Tim Cook faces as Steve Jobs’ successor to provide examples that support your answer.
> What additional organizational structure and/or process adjustments will Sony need to make to realize its revised strategic objectives?
> What are the two ways to obtain financial economies when using an unrelated diversification strategy?
> Do you think that Sony has the right organization structure to foster the necessary integration among its electronic and entertainment content businesses that its revamped strategy seems to entail?
> To implement a corporate strategy, a firm needs to have a strong set of capabilities to “parent” the set of business units that the firm has established or acquired. Given Sony’s history and organization structure, what would you argue are Sony’s stronge
> What would you recommend to improve the governance systems in Japan, Germany, and China, respectively, given the governance devices described in Chapter 10?