
Question: Explain the concept of physical data independence


Explain the concept of physical data independence and its importance in database systems.


> Explain the difference between two-tier and three-tier application architectures. Which is better suited for web applications? Why?

> Suppose there are two relations r and s, such that the foreign key B of r references the primary key A of s. Describe how the trigger mechanism can be used to implement the on delete cascade option when a tuple is deleted from s.

> Hackers may be able to fool you into believing that their web site is actually a web site (such as a bank or credit card web site) that you trust. This may be done by misleading email, or even by breaking into the network infrastructure and rerouting net

> Redo Exercise 5.12 using the language of your database system for coding stored procedures and functions. Note that you are likely to have to consult the online documentation for your system as a reference, since most systems use syntax diff

> Consider the relational schema from Exercise 5.16. Write a JDBC function using nonrecursive SQL to find the total cost of part “P-100”, including the costs of all its subparts. Be sure to take into account the fact that a part may have multiple occurrenc

> Consider the relational schema where the primary-key attributes are underlined. A tuple (p1, p2, 3) in the subpart relation denotes that the part with part id p2 is a direct subpart of the part with part id p1, and p1 has 3 copies of p2. Note that p2 may

> Consider an employee database with two relations where the primary keys are underlined. Write a function avg_salary that takes a company name as an argument and finds the average salary of employees at that company. Then, write an SQL stat

> Repeat Exercise 5.13 using ODBC, defining void printTable(char *r) as a function instead of a method.

> Suppose you were asked to define a class MetaDisplay in Java, containing a method static void printTable(String r); the method takes a relation name r as input, executes the query “select * from r”, and prints the result out in tabular format, with the

> Consider the relation, r, shown in Figure 5.22. Give the result of the following query:

> Consider the nyse relation of Exercise 5.9. For each month of each year, show the total monthly dollar volume and the average monthly dollar volume for that month and the two prior months. (Hint: First write a query to find the total dollar volume for eac

> Given relation s (a, b, c), write an SQL statement to generate a histogram showing the sum of c values versus a, dividing a into 20 equal-sized partitions (i.e., where each partition contains 5 percent of the tuples in s, sorted by a).

> Assume that two students are trying to register for a course in which there is only one open seat. What component of a database system prevents both students from being given that last seat?

> Consider the advisor relation shown in the schema diagram in Figure 2.9, with s_id as the primary key of advisor. Suppose a student can have more than one advisor. Then, would s_id still be a primary key of the advisor relation? If not, what should the

> Modify the recursive query in Figure 5.16 to define a relation prereq_depth(course_id, prereq_id, depth), where the attribute depth indicates how many levels of intermediate prerequisites there are between the course and the prerequisite. Direct prerequis

> Write a Java program that allows university administrators to print the teaching record of an instructor. a. Start by having the user input the login ID and password; then open the proper connection. b. The user is asked next for a search substring and t

> Show how to express the coalesce function using the case construct.

> For the view of Exercise 4.18, explain why the database system would not allow a tuple to be inserted into the database through this view.

> Show how to define a view tot_credits (year, num_credits), giving the total number of credits taken in each year.

> Under what circumstances would the query include tuples with null values for the title attribute?

> For the database of Figure 4.12, write a query to find the ID of each employee with no manager. Note that an employee may simply have no manager listed or may have a null manager. Write your query using an outer join and then write it again using no outer

> Express the following query in SQL using no subqueries and no set operations.

> Write an SQL query using the university schema to find the ID of each student who has never taken a course at the university. Do this using no subqueries and no set operations (use an outer join).

> Rewrite the query select * from section natural join classroom without using a natural join but instead using an inner join with a using condition.

> Suppose you wish to create an audit trail of changes to the takes relation. a. Define triggers to create an audit trail, logging the information into a relation called, for example, takes_trail. The logged information should include the user-id (assume a

> List at least two reasons why database systems support data manipulation using a declarative query language such as SQL, instead of just providing a library of C or C++ functions to carry out data manipulation.

> Explain the difference between integrity constraints and authorization constraints.

> Suppose a user creates a new relation r1 with a foreign key referencing another relation r2. What authorization privilege does the user need on r2? Why should this not simply be allowed without any such authorization?

> Suppose user A, who has all authorization privileges on a relation r, grants select on relation r to public with grant option. Suppose user B then grants select on r to A. Does this cause a cycle in the authorization graph? Explain why.

> Explain why, when a manager, say Satoshi, grants an authorization, the grant should be done by the manager role, rather than by the user Satoshi.

> Consider the query Explain why appending natural join section in the from clause would not change the result.

> List two reasons why null values might be introduced into the database.

> Give an SQL schema definition for the employee database of Figure 3.19. Choose an appropriate domain for each attribute and an appropriate primary key for each relation schema. Include any foreign-key constraints that might be appropriate.

> Consider the employee database of Figure 3.19. Give an expression in SQL for each of the following queries. a. Give all employees of “First Bank Corporation” a 10 percent raise. b. Give all managers of “First Bank Corporation” a 10 percent raise. c. Dele

> Consider the employee database of Figure 3.19, where the primary keys are underlined. Give an expression in SQL for each of the following queries. a. Find ID and name of each employee who lives in the same city as the location of the company for which th

> What are two advantages of encrypting data stored in the database?

> Consider the bank database of Figure 3.18, where the primary keys are underlined. Construct the following SQL queries for this relational database. a. Find each customer who has an account at every branch located in “Brooklyn”. b. Find the total sum

> List five responsibilities of a database-management system. For each responsibility, explain the problems that would arise if the responsibility were not discharged.

> Consider the insurance database of Figure 3.17, where the primary keys are underlined. Construct the following SQL queries for this relational database. a. Find the number of accidents involving a car belonging to a person named “John Smith”. b. Update t

> Write SQL DDL corresponding to the schema in Figure 3.17. Make any reasonable assumptions about data types, and be sure to declare primary and foreign keys.

> Write the SQL statements using the university schema to perform the following operations: a. Create a new course “CS-001”, titled “Weekly Seminar”, with 0 credits. b. Create a section of this course in fall 2017, with sec_id of 1, and with the location o

> Using the university schema, write an SQL query to find the name and ID of each History student whose name begins with the letter ‘D’ and who has not taken at least five Music courses.

> Using the university schema, write an SQL query to find the names and IDs of those instructors who teach every course taught in his or her department (i.e., every course that appears in the course relation with the instructor’s department name). Order res

> Using the university schema, write an SQL query to find the IDs of those students who have retaken at least three distinct courses at least once (i.e., the student has taken the course at least two times).

> Using the university schema, use SQL to do the following: For each student who has retaken a course at least twice (i.e., the student has taken the course at least three times), show the course ID and the student’s ID. Please display your results in orde

> Using the university schema, write an SQL query to find the names of those departments whose budget is higher than that of Philosophy. List them in alphabetic order.

> Consider the Oracle Virtual Private Database (VPD) feature described in Section 9.8.5 and an application based on our university schema. a. What predicate (using a subquery) should be generated to allow each faculty member to see only takes tuples corr

> Using the university schema, write an SQL query to find the name and ID of those Accounting students advised by an instructor in the Physics department.

> with dept_total (dept_name, value) as (select dept_name, sum(salary) from instructor group by dept_name), dept_total_avg (value) as (select avg(value) from dept_total) select dept_name from dept_total, dept_total_avg where dept_total.v

> Rewrite the where clause where unique (select title from course) without using the unique construct.

> Choose an enterprise of personal interest to you and explain how blockchain technology could be employed usefully in that business.

> Explain how off-chain transaction processing can enhance throughput. What are the trade-offs for this benefit?

> Why is Byzantine consensus a poor consensus mechanism in a public blockchain?

> How is the difficulty of proof-of-work mining adjusted as more nodes join the network, thus increasing the total computational power of the network? Describe the process in detail.

> Consider the library database of Figure 3.20. Write the following queries in SQL. a. Find the member number and name of each member who has borrowed at least one book published by “McGraw-Hill”. b. Find the member number and name of each member who has b

> Suppose a user forgets or loses her or his private key. How is the user affected?

> Write a servlet and associated HTML code for the following very simple application: A user is allowed to submit a form containing a value, say n, and should get a response containing n “*” symbols.

> Describe at least three tables that might be used to store information in a social-networking system such as Facebook.

> Since pointers in a blockchain include a cryptographic hash of the previous block, why is there the additional need for replication of the blockchain to ensure immutability?

> Since blockchains are immutable, how is a transaction abort implemented so as not to violate immutability?

> In what order are blockchain transactions serialized?

> Given that the LDAP functionality can be implemented on top of a database system, what is the need for the LDAP standard?

> Explain what application characteristics would help you decide which of TPC-C, TPC-H, or TPC-R best models the application.

> Why was the TPC-D benchmark replaced by the TPC-H and TPC-R benchmarks?

> List at least four features of the TPC benchmarks that help make them realistic and dependable measures.

> Suppose the price of memory falls by half, and the speed of disk access (number of accesses per second) doubles, while all other factors remain the same. What would be the effect of this change on the 5-minute and 1-minute rule?

> What is the motivation for splitting a long transaction into a series of small ones? What problems could arise as a result, and how can these problems be averted?

> Show that, in SQL, <> all is identical to not in.

> The Google search engine provides a feature whereby web sites can display advertisements supplied by Google. The advertisements supplied are based on the contents of the page. Suggest how Google might choose which advertisements to supply for a page, giv

> Suppose that your application has transactions that each access and update a single tuple in a very large relation stored in a B+-tree file organization. Assume that all internal nodes of the B+-tree are in memory, but only a very small fraction of the leaf pages can fit in memory. Explain how to calculate the minimum number of disks

> When carrying out performance tuning, should you try to tune your hardware (by adding disks or memory) first, or should you try to tune your transactions (by adding indices or materialized views) first? Explain your answer.

> Database tuning: a. What are the three broad levels at which a database system can be tuned to improve performance? b. Give two examples of how tuning can be done for each of the levels.

> Our description of static hashing assumes that a large contiguous stretch of disk blocks can be allocated to a static hash table. Suppose you can allocate only C contiguous blocks. Suggest how to implement the hash table, if it can be much larger than C

> Why is a hash structure not the best choice for a search key on which range queries are likely?

> What are the causes of bucket overflow in a hash file organization? What can be done to reduce the occurrence of bucket overflows?

> Explain the distinction between closed and open hashing. Discuss the relative merits of each technique in database applications.

> Suppose you want to use the idea of a quadtree for data in three dimensions. How would the resultant data structure (called an octree) divide up space?

> The stepped merge variant of the LSM tree allows multiple trees per level. What are the tradeoffs in having more trees per level?

> For correct execution of a replicated state machine, the actions must be deterministic. What could happen if an action is non-deterministic?

> Web sites that want to get some publicity can join a web ring, where they create links to other sites in the ring in exchange for other sites in the ring creating links to their site. What is the effect of such rings on popularity ranking techniques such

> Write the following queries in SQL, using the university schema. a. Find the ID and name of each student who has taken at least one Comp. Sci. course; make sure there are no duplicate names in the result. b. Find the ID and name of each student who has n

> Why is the notion of term important when an election is used to choose a coordinator? What are the analogies between elections with terms and elections used in a democracy?

> Merkle trees can be made short and fat (like B+-trees) or thin and tall (like binary search trees). Which option would be better if you are comparing data across two sites that are geographically separated, and why?

> Spanner provides read-only transactions a snapshot view of data, using multi-version two-phase locking. a. In the centralized multi-version 2PL scheme, read-only transactions never wait. But in Spanner, reads may have to wait. Explain why. b. Using an o

> Discuss the advantages and disadvantages of the two methods that we presented in Section 23.3.4 for generating globally unique timestamps.

> If we apply a distributed version of the multiple-granularity protocol of Chapter 18 to a distributed database, the site responsible for the root of the DAG may become a bottleneck. Suppose we modify that protocol as follows: • Only intention-mode locks

> In the majority protocol, what should the reader do if it finds different values from different copies, to (a) decide what is the correct value, and (b) to bring the copies back to consistency? If the reader does not bother to bring the copies back to consi

> Give an example where the read one, write all available approach leads to an erroneous state.

> What characteristics of an application make it easy to scale the application by using a key-value store, and what characteristics rule out deployment on key-value stores?

> Consider a system that is processing a stream of tuples for a relation r with attributes (A, B, C, timestamp). Suppose the goal of a parallel stream processing system is to compute the number of tuples for each A value in each 5 minute window (based on the

> Suppose you wish to perform keyword querying on a set of tuples in a database, where each tuple has only a few attributes, each containing only a few words. Does the concept of term frequency make sense in this context? And that of inverse document frequ

> The attribute on which a relation is partitioned can have a significant impact on the cost of a query. a. Given a workload of SQL queries on a single relation, what attributes would be candidates for partitioning? b. How would you choose between the alter

> Using the university schema, write an SQL query to find section(s) with maximum enrollment. The result columns should appear in the order “course_id, sec_id, year, semester, num”. (It may be convenient to use the with construct.)

> What is the motivation for work-stealing with virtual nodes in a shared-memory setting? Why might work-stealing not be as efficient in a shared-nothing setting?

> Suppose you wish to handle a workload consisting of a large number of small transactions by using shared-nothing parallelism. a. Is intraquery parallelism required in such a situation? If not, why, and what form of parallelism is appropriate? b. What fo

> Describe a good way to parallelize each of the following: a. The difference operation b. Aggregation by the count operation c. Aggregation by the count distinct operation d. Aggregation by the avg operation e. Left outer join, if the join condition involv

> Can partitioned join be used for r ⋈r? A

> Joins can be expensive in a key-value store, and difficult to express if the system does not support SQL or a similar declarative query language. What can an application developer do to efficiently get results of join or aggregate queries in such a setting?

> Why is it easier for a distributed file system such as GFS or HDFS to support replication than it is for a key-value store?

> What is the motivation for storing related records together in a key-value store? Explain the idea using the notion of an entity group.
