Consider the EER diagram for Pine Valley Furniture shown in Figure 3-12. Figure 8-15 looks at a portion of that EER diagram.
Let's make a few assumptions about the average usage of the system:
• There are 60,000 customers, and of these, 85 percent represent regular accounts and 15 percent national accounts.
• Currently, the system stores 2,500,000 orders, although this number is constantly changing.
• Each order has an average of 30 products.
• There are 5,000 products.
• Approximately 1,500 orders are placed per hour.
a. Based on these assumptions, draw a usage map for this portion of the EER diagram.
b. Management would like employees only to use this database. Do you see any opportunities for denormalization?
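
Before drawing the usage map for part (a), it can help to derive the record counts and access frequencies implied by the assumptions. The following is a minimal sketch of that arithmetic only. The figures are not reproduced here, so the variable names (for example, `order_lines`) are illustrative assumptions, not the labels used in Figure 8-15.

```python
# Stated assumptions from the exercise.
customers = 60_000
regular_pct = 85            # percent of customers with regular accounts
national_pct = 15           # percent of customers with national accounts
orders = 2_500_000          # orders currently stored
products_per_order = 30     # average products per order
products = 5_000
orders_per_hour = 1_500

# Derived volumes (integer arithmetic avoids floating-point rounding).
regular_customers = customers * regular_pct // 100
national_customers = customers * national_pct // 100
orders_per_customer = orders / customers

# Assumption: each product on an order is stored as one order-line record.
order_lines = orders * products_per_order
order_lines_per_product = order_lines / products

# Derived access frequency: order lines written per hour of order entry.
order_line_accesses_per_hour = orders_per_hour * products_per_order

print(f"Regular customers:       {regular_customers:,}")
print(f"National customers:      {national_customers:,}")
print(f"Orders per customer:     {orders_per_customer:.2f}")
print(f"Order lines stored:      {order_lines:,}")
print(f"Order lines per product: {order_lines_per_product:,.0f}")
print(f"Order lines per hour:    {order_line_accesses_per_hour:,}")
```

These derived numbers are the kind of annotations a usage map carries on its entities and access paths. Whether they suggest denormalization in part (b) depends on which accesses dominate in practice.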
Data from Figure 8-15:
Data from Figure 3-12:
> Write two HIVE queries, the first to create a PRODUCT table with fields ProdID, Name, Seller, Price; the second to load data into the table from file ProductInfo.csv. Make all necessary assumptions.
> For each situation presented below, illustrate a document as depicted in Figures 10-4 and 10-5 and specify whether it contains an array, an embedded subdocument, relationships, or collections. Use hypothetical data and make necessary assumptions. a. A do
> Review Figure 10-15 and answer the following questions based on it. a. What has happened between Input and Input’? b. Assume that the values associated with each of the keys (k1, k2, and so forth) are counts. What is the purpose of the
> Figure 10-14 describes a simple Hadoop architecture. If a real-world system is implemented using this approach, it will suffer from a specific weakness. Identify what this weakness is and find out what the latest versions of Hadoop have done to address i
> Assume that the following data regarding Students need to be stored—Name: First Name and Last Name, Roll Number, and Mobile Number. Illustrate with figures how it would be stored in different NoSQL database models.
> Review Figure 10-5 (a). Write a MongoDB query to display all products with review ratings greater than 3 stars and suppress the fields “height” and “width” in the output using the su
> Review Figure 10-3. For each of the formats, identify the elements that are data values and those that are labels describing the data. Data from Figure 10-3:
> Compare the JSON and XML representations of a record in Figure 10-1. What is the primary difference between these? Can you identify any advantages of one compared to the other? Data from Figure 10-1:
> GROUP BY by itself creates subtotals by category, and the ROLLUP extension to GROUP BY creates even more categories for subtotals. Using all the orders, do a rollup to get total order amounts by product, sales region, and month and all combinations, incl
> Because data warehouses and even data marts can become very large, it may be sufficient to work with a subset of data for some analyses. Create a sample of orders from 2004 using the SAMPLE SQL command (which is standard SQL); put a randomized allocation
> Consider the data needs of a small accounting department at a tax services firm. What would some of the data entities be in this setting? List and explain their relevance. Develop a project data model for this firm applying the database design concepts y
> Using the MDIFF “ordered analytical function” in Teradata SQL (see the Functions and Operators manual), show the differences (label the difference CHANGE) in TOTAL from quarter to quarter. Hint: You will likely create a derived table based on your query
> Take the query you scrapped from Problem and Exercise 9-58 and modify it to show only the U.S. region grouped by each quarter, not just for 2005 but for all years available, in order by quarter. Label the total orders by quarter with the heading TOTAL an
> The database you are using was developed by MicroStrategy, a leading business intelligence software vendor. The MicroStrategy software is also available on TUN. Most business intelligence tools generate SQL to retrieve the data they need to produce the r
> Review the metadata file for the db_samwh database and the definitions of the database tables. (You can use SHOW TABLE commands to display the DDL for tables.) Are dimension tables conformed in this data mart? Explain.
> Review the metadata file for the db_samwh database and the definitions of the database tables. (You can use SHOW TABLE commands to display the DDL for tables.) Explain what dimension data, if any, are maintained to support slowly changing dimensions. If
> Review the metadata file for the db_samwh database and the definitions of the database tables. (You can use SHOW TABLE commands to display the DDL for tables.) Explain the methods used in this database for modeling hierarchies. Are hierarchies modeled as
> Fitchwood Insurance Company, which is involved primarily in the sale of annuity products, would like to design a data mart for its sales and marketing organization. Presently, the OLTP system is a legacy system residing on a shared network drive consisti
> In the section “Disadvantages of File Processing Systems,” the statement is made that the disadvantages of file processing systems can also be limitations of databases, depending on how an organization manages its databases. First, why do organizations c
> Contrast the following terms: a. data dependence; data independence b. structured data; unstructured data c. metadata; data d. repository; database e. entity; enterprise data model f. data warehouse; data lake g. personal databases; multi-tiered database
> Pine Valley Furniture wants you to help design a data mart for analysis of sales. The subjects of the data mart are as follows: Salesperson: Attributes: SalespersonID, Years with PVFC, SalespersonName, and SupervisorRating. Product: Attributes: ProductID
> A firm wants to reduce fluid drilling costs substantially by increasing drilling fluid efficiency. Research finds that both fluid drilling speed and cost are significantly influenced by Time, Geography, Drilling fluid type, Formation, and Well type. Geog
> A pharmaceutical retail store manages its current sales, procurement and materials availability at the store through Excel sheets. Owing to the increase in the number of branches in the city, the store manager is now finding this process of data maintena
> A university gathers student admission data from three different sources: through forms filled manually at university desks, by registering at the university Web site, or by registering on the department’s Web site. All the three sources have disparate f
> Employees working in IT organizations are assigned different projects for a specific duration, such as a few months or years. The duration is specified by the project start date and end date in the database. The project location is different for each pro
> Table 1-1 shows example metadata for a set of data items. Identify three other columns for these data (i.e., three other metadata characteristics for the listed attributes) and complete the entries of the table in Table 1-1 for these three additional col
> Simplified Automobile Insurance Company would like to add a Claims dimension to its star schema. Attributes of Claim are ClaimID, ClaimDescription, and ClaimType. Attributes of the fact table are now PolicyPremium, Deductible, and MonthlyClaimTotal. a. E
> You are to construct a star schema for Simplified Automobile Insurance Company (for a more realistic example, see Kimball, 1996b). The relevant dimensions, dimension attributes, and dimension sizes are as follows: InsuredParty: Attributes: InsuredPartyID and
> A table Student stores the values StudentID, name, date of result, and total marks obtained. A student’s information is: StudentID: S876, Name: Sabcd, Date of result: 22/12/14, and Total marks obtained: 650. An update transaction has changed the date and
> Drilling often accounts for one-third to two-thirds of the total cost in the search for fluid. Advances in drilling technology can reduce these costs substantially. The key point is redesigning the scheme of drilling fluid. A research study identifies th
> The following table shows some simple album and price data as of the date 07/18/2015: The following transactions occur on 07/19/2015: • Album K3 price discounted to $7. • Album K5 is deleted from the file.
> Examine the three tables with student data shown in Figure 9-1. Design a single-table format that will hold all of the data (non-redundantly) that are contained in these three tables. Choose column names that you believe are most appropriate for these da
> Based on the table above as well as additional research, write a memo in support of or against the following statement: “Cloud databases will increasingly eliminate the need for data/database administrators in corporations.”
> Assume that a bank operates multinationally and has millions of financial records of customers in its database. The bank also offers e-banking services to its clients. Based on what you have learned from the book, suggest how they can take regular backups
> Revisit the six issues identified in Problem and Exercise 8-72. What risk, if any, do each of them pose to the firm? Data from Problem and Exercise 8-72: During the Sarbanes-Oxley audit of a financial services company, you note the following issues. Cat
> During the Sarbanes-Oxley audit of a financial services company, you note the following issues. Categorize each of them into the area to which they belong: IT change management, logical access to data, and IT operations. a. Five DBAs have access to the S
> You are the manager of a department in a small logistics company. The current database system being used is hierarchical, and you have been tasked to formulate a team that can create a plan to develop a more efficient database system that is consistent w
> A number of situations have been listed below. For each one, identify the need, if any, to create an index. Justify your answer. If there is indeed a need, suggest a way for the index to be created. a. Banking applications that involve frequent retrieval
> For each of the situations described, decide which technique for data field design listed below would be most appropriate and how it could be applied. • Code lookup table • Default value • Range control • Referential integrity • Handling missing data a
> Fill in the two authorization tables for Pine Valley Furniture Company below based on the following assumptions (enter Y for yes or N for no): • Salespersons, managers, and carpenters may read inventory records but may not perform any o
> Refer to Figure 4-5. For each of the following reports (with sample data), indicate any indexes that you feel would help the report run faster as well as the type of index: a. State, by products (user-specified period) State, by Products Report, January
> Consider the composite usage map in Figure 8-1. After a period of time, the assumptions for this usage map have changed, as follows: • There is an average of 60 supplies (rather than 40) for each supplier. • Manufactur
> Create an index on the CustomerID column of the Customer_T and Order_T table in Figure 4-4. Data from Figure 4-4:
> Consider the following assumptions: • A music company offers three types of music genres: Jazz, Hip-hop, and Metal (subtypes of the Genre supertype). An “Artist” instances “Records” of these Genres. • There is a total of 8,000 songs in the company’s database,
> Parallel query processing, as described in this chapter, means that the same query is run on multiple processors and that each processor accesses in parallel a different subset of the database. Another form of parallel query processing, not discussed in
> Suggest an application for each type of file organization. Explain your answer.
> Visit the PHP website (php.net) and investigate how a failure to sanitize database inputs can leave a database exposed to online attack.
> Assume that the most important reports that the organization needs are as follows: • A list of the current developer’s project assignments. • A list of the total costs for all projects. • For each team, a list of its membership history. • For each countr
> Consider Figure 4-35 and your answer to Problem and Exercise 4-44 in Chapter 4. Assume that it is essential that customers who had rented from Vacation Property Rentals earlier can be identified quickly based on their last name–first na
> Specify the format for the Oracle date data type. How does it account for change in century? What is the purpose of ‘TIMESTAMP WITH LOCAL TIMEZONE’? Suppose the system time zone in database in City A = –9:00 and City B = –4:00. A client in City B inserts
> Consider the relations specified in Problem and Exercise 8-53. Assume that the database has been implemented without denormalization. Further assume that the database is global in scope and covers thousands of leagues, tens of thousands of teams, and hun
> Assume that the table BOOKS in a database with the primary key on BookID has more than 25,000 records. A query is frequently executed in which the Publisher of the book appears in the WHERE clause. The Publisher field has more than 100 different values,
> A company offering music services provides a search feature to its users and allows them to mix music (a key feature for disc jockeys), which is supported through parallel processing. All music information is stored in a database management system. a. Wh
> Search the Internet for at least three examples where parallel processing is applied. How was the underlying database prepared for this? What were the advantages of this implementation?
> Use the Internet to search for examples of each type of horizontal partitioning provided by Oracle. Explain your answer.
> Consider the following normalized relations for a sports league: TEAM(TeamID, TeamName, TeamLocation, TeamManager) PLAYER(PlayerID, PlayerFirstName, PlayerLastName, PlayerDateOfBirth, PlayerSpecialtyCode) SPECIALTY(SpecialtyCode, SpecialtyDescription, Sa
> Consider the relations in Problem and Exercise 8-51. Identify possible opportunities for denormalizing these relations as part of the physical design of the database. Which ones would you be most likely to implement? Data from Problem and Exercise 8-51:
> On a smaller scale than in Field Exercise 7-25, investigate the computing architecture of a department within your university. Try to find out how well the current system is meeting the department’s information-processing needs. Data from Exercise 7-25:
> Consider the following set of normalized relations from a database used by a mobile service provider to keep track of its users and advertiser customers. USER(UserID, UserLName, UserFName, UserEmail, UserYearOfBirth, UserCategoryID, UserZip) ADVERTISERCLI
> Consider the following normalized relations from a database in a large retail chain: STORE (StoreID, Region, ManagerID, SquareFeet) EMPLOYEE (EmployeeID, WhereWork, EmployeeName, EmployeeAddress) DEPARTMENT (DepartmentID, ManagerID, SalesGoal) SCHEDULE (
> When students fill out forms for admission to various courses or to write their exams, they leave many missing values. This may lead to issues while compiling data. Can this be handled at the data capture stage? What are the alternate approaches to handl
> In a normalized database, all customer information is stored in a Customer table, invoices are stored in an Invoice table, and related account manager information in an Employee table. Suppose a customer changes their address and then demands old invoice
> Say that you are interested in storing the numeric value 3,456,349.2334. What will be stored with each of the following Oracle data types? a. NUMBER(11) b. NUMBER(11,1) c. NUMBER(11,-2) d. NUMBER(11,6) e. NUMBER(6) f. NUMBER
> Explain in your own words what the precision (p) and scale (s) parameters for the Oracle data type NUMBER mean.
> Choose Oracle data types for the attributes in the normalized relations that you created in Problem and Exercise 4-47 in Chapter 4. Data from Problem and Exercise 4-47: Figure includes an EER diagram describing a publisher specializing in large edited w
> Choose Oracle data types for the attributes in the normalized relations in the middle panel of Figure 8-4.
> Consider the following two relations for a firm: EMPLOYEE(EmployeeID, EmployeeName, Contact, Email) PERFORMANCE(EmployeeID, DepartmentID, Rank) The following is a typical query against these relations SELECT Employee_T.EmployeeID, EmployeeName, Departmen
> Examine the two applications shown in Figures 7-5a and 7-5b. Identify the various security considerations that are relevant to each environment. Data from Figure 7-5:
> Investigate the computing architecture of your university. Trace the history of computing at your university and determine what path the university followed to get to its present configurations. Some universities started early with mainframe environments
> List and discuss five areas where threats to data security may occur.
> Conduct some research to find out how a Java-based application can be connected to a database. Provide some brief code snippets and annotate the code.
> How does versioning work in a current database environment? What advantages does versioning offer?
> What is the difference between shared locks and exclusive locks?
> What is the advantage of optimistic concurrency control compared with pessimistic concurrency control?
> Visit an online retailer such as Amazon or eBay and explain the system’s design using the MVC paradigm.
> For each product, display in ascending order, by product ID, the product ID and description, along with the customer ID and name for the customer who has bought the most of that product; also show the total quantity ordered by that customer (who has boug
> The head of marketing is interested in some opportunities for cross-selling of products. She thinks that the way to identify cross-selling opportunities is to know for each product how many other products are sold to the same customer on the same order (
> Display employee information for all the employees in each state who were hired before the most recently hired person in that state.
> Display in product ID order the product ID and total amount ordered of that product by the customer who has bought the most of that product; use a derived table in a FROM clause to answer this query.
> For each of the descriptions below, perform the following tasks: i. Identify the degree and cardinalities of each relationship. ii. Express the relationships in each description graphically with an E-R diagram. a. A book is identified by its ISBN number,
> Write an SQL query to list the salesperson who has sold the most computer desks.
> Write an SQL query to list the order number, product ID, and ordered quantity for all ordered products for which the ordered quantity is greater than the average ordered quantity for that product.
> List the IDs and names of those sales territories that have at least 50 percent more customers than the average number of customers per territory.
> List the IDs and names of all products that cost less than the average product price in their product line.
> Review the first query in the “Correlated Subqueries” section. Can you identify a special set of standard prices for which this query will not yield the desired result? How might you rewrite the query to handle this situation?
> Write an SQL query that lists the vendor ID, vendor name, material ID, material name, and supply unit prices for all those materials that are provided by more than one vendor.
> Display the customer names of all customers who have ordered (on the same or different orders) both products with IDs 3 and 4.
> Show the customer ID and name for all the customers who have ordered both products with IDs 3 and 4 on the same order.
> Display the customer ID, name, and order ID for all customer orders. For those customers who do not have any orders, include them in the display once by showing order ID 0.
> Rewrite your answer to Problem and Exercise 6-71 for each customer, not just customer 16. Data from Problem and Exercise 6-71: Display the name of customer 16 and the names of all the customers that are in the same zip code as customer 16.
> A cellular operator needs a database to keep track of its customers, their subscription plans, and the handsets (mobile phones) that they are using. The E-R diagram in Figure 2-24 illustrates the key entities of interest to the operator and the relations
> Display the name of customer 16 and the names of all the customers that are in the same zip code as customer 16.
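
The last exercise above (customers who share a zip code with customer 16) is a typical self-join. The sketch below shows one possible approach using Python's built-in `sqlite3` module. The table name `Customer_T` and the column `CustomerID` appear in the exercises; the columns `CustomerName` and `CustomerPostalCode` and all sample rows are hypothetical assumptions made for illustration.

```python
import sqlite3

# In-memory database with a hypothetical subset of Customer_T.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Customer_T (
        CustomerID         INTEGER PRIMARY KEY,
        CustomerName       TEXT NOT NULL,
        CustomerPostalCode TEXT
    )
    """
)
# Hypothetical sample data, not taken from the textbook.
conn.executemany(
    "INSERT INTO Customer_T VALUES (?, ?, ?)",
    [
        (15, "Hypothetical Customer A", "11111"),
        (16, "Hypothetical Customer B", "22222"),
        (17, "Hypothetical Customer C", "22222"),
    ],
)

# Self-join: c1 pins down customer 16, c2 ranges over every customer
# with the same postal code (which includes customer 16 itself).
rows = conn.execute(
    """
    SELECT c2.CustomerName
    FROM Customer_T AS c1
    JOIN Customer_T AS c2
      ON c2.CustomerPostalCode = c1.CustomerPostalCode
    WHERE c1.CustomerID = 16
    ORDER BY c2.CustomerID
    """
).fetchall()

names = [row[0] for row in rows]
print(names)
conn.close()
```

Because customer 16 matches its own postal code, its name appears in the result alongside the others, which is what the exercise asks for. Generalizing to every customer (as in the rewrite exercise) would mean dropping the `WHERE` filter and also selecting `c1.CustomerID`.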