Martin was very impressed with your project plan and has given you the go-ahead for the project. He also indicates to you that he has e-mails from several key staff members that should help with the design of the system. The first is from Alex Martin (administrative assistant to Pat Smith, an artist manager). Pat is on vacation, and Martin has promised that Pat’s perspective will be provided at a later date. The other two are from Dale Dylan, an artist whom Pat manages, and Sandy Wallis, an event organizer. The text of these e-mails is provided below.

E-mail from Alex Martin, Administrative Assistant

My name is Alex Martin, and I am the administrative assistant to Pat Smith. While Pat’s role is to create and maintain relationships with our clients and the event organizers, I am responsible for running the show at the operational level. I take care of Pat’s phone calls while Pat is on the road, respond to inquiries and relay the urgent ones to Pat, write letters to organizers and artists, collect information on prospective artists, send bills to the event organizers and make sure that they pay their bills, take care of the artist accounts, and arrange Pat’s travel (and keep track of travel costs).

Most of my work I manage with Word and simple Excel spreadsheets, but it would be very useful to be able to have a system that would help me to keep track of the event fees that have been agreed upon, the events that have been successfully completed, cancellations (in the current system, I sometimes don’t get information about a cancellation and I end up sending an invoice for a cancelled concert, which is pretty embarrassing), payments that need to be made to the artists, etc. Pat and other managers seem to think that it would be a good idea if they could better track their travel costs and the impact these costs have on their income.
We don’t have a very good system for managing our artist accounts because we have separate spreadsheets for keeping track of a particular artist’s fees earned and the expenses incurred, and then at the end of each month we manually create a simple statement for each of the artists. This is a lot of work, and it would make much more sense to have a computer system that would allow us to be able to keep the books constantly up to date.

A big thing for me is to keep track of the artists whom Pat manages. We need to keep in our databases plenty of information on them: their name, gender, address (including country, as they live all over the world), phone number(s), instrument(s), e-mail, etc. We also try to keep track of how they are doing in terms of the reviews they get, and thus we are subscribing to a clipping service that provides us articles on the artists whom we manage. For some of the artists, the amount of material we get is huge, and we would like to reduce it somehow. At any rate, we would at least like to be able to have a better idea of what we have in our archives on a particular artist, and thus we should probably start to maintain some kind of a list of the news items we have for a particular artist. I don’t know if this is worth it but it would be very useful if we could get it done.

Scheduling is, of course, a major headache for me. Although Pat and the artists negotiate the final schedules, I do, in practice, at this point maintain a big schedule book for each artist whom we manage. You know, somebody has to have the central copy. This means that Pat, the artists, and the event organizers are calling me all the time to verify the current situation and make changes to the schedule. Sometimes things get mixed up and we don’t get the latest changes to the central calendar (for example, an artist schedules a vacation and forgets to tell us; as you can understand, this can lead to a pretty difficult situation).
It would be so wonderful to get a centralized calendar which both Pat and the artists could access; it is probably, however, better if Pat (and the other managers for the other artists, of course) was the only person in addition to me who had the right to change the calendar. Hmmm . . . I guess it would be good if the artists could block time out if they decide that they need it for personal purposes (they are not, however, allowed to book any performances without discussing it first with us).

One more thing: I would need to have something that would remind me of the upcoming changes in artist contracts. Every artist’s contract has to be renewed annually, and sometimes I forget to remind Pat to do this with the artist. Normally this is not a big deal, but occasionally we have had a situation where the lack of a valid contract led to unfortunate and unnecessary problems. It seems that we would need to maintain some type of list of the contracts with their start dates, end dates, royalty percentages, and simple notes related to each of the contracts.

This is a pretty hectic job, and I have not had time to get as good computer training as I would have wanted. I think I am still doing pretty well. It is very important that whatever you develop for us, it has to be easy to use because we are in such a hurry all the time and we cannot spend much time learning complex commands.

E-mail from Dale Dylan, Established Artist

Hi! I am Dale Dylan, a pianist from Austin, TX. I have achieved reasonable success during my career and I am very thankful that I have been able to work with Pat Smith and Mr. Forondo during the past five years. They have been very good at finding suitable performance opportunities for me, particularly after I won an international piano competition in Amsterdam a few years ago. Compared to some other people with whom I have worked, Pat is very conscientious and works hard for me.
During the recent months, FAME and its managers’ client base has grown quite a lot, and unfortunately I have seen this in the service they have been able to provide to me. I know that Pat and Alex don’t mean any harm but it seems that they simply have too much to do, particularly in scheduling and getting my fees to me. Sometimes things seem to get lost pretty easily these days, and occasionally I have been waiting for my money for 2–3 months. This was never the case earlier but it has been pretty typical during the last year or so. Please don’t say anything to Pat or Alex about this; I don’t want to hurt their feelings, but it just simply seems that they have too much to do. Do you think your new system could help them?

What I would like to see in a new system (if you will develop one for them) are just simple facilities that would help them do even better what they have always done pretty well (except very recently): collecting money from the concert organizers and getting it to me fast (they are, after all, taking 20 percent of my money; at least they should get the rest of it to me quickly) and maintaining my schedule.

I have either a laptop or at least my smartphone/iPad with me all the time while I am on the road, thus I certainly should be able to check my schedule on the Web. Now I always need to call Alex to get any last-minute changes. It seems pretty silly that Pat has to be in touch with Alex before any changes can be made to the calendar; I feel that I should be allowed to make my own changes. Naturally, I would always notify Pat about anything that changes (or maybe the system could do that for me). The calendar system should be able to give me at least a simple list of the coming events in chronological order for any time period I want. Furthermore, I would like to be able to search for events using specific criteria (location, type, etc.).
In addition, we do, of course, get annual summaries from FAME regarding the fees we have earned, but it would be nice to have this information a bit more often. I don’t need it on paper but if I could access that information on the Web, it would be very, very good. It seems to me that Alex is doing a lot of work with these reports by hand; if you could help her with any of the routine work she is doing, I am sure she would be quite happy. Maybe then she and Pat would have more time for getting everything done as they always did earlier.

E-mail from Sandy Wallis, Event Organizer

I am Sandy Wallis, the executive director of the Greater Tri-State Area Concert Halls, and it has been a pleasure to have a good working relationship with Pat Smith at FAME for many years. Pat has provided me and my annual concert series several excellent artists per year, and I believe that our cooperation has the potential to continue into the foreseeable future. This does, however, require that Pat is able to continue to give me the best service in the industry during the years to come.

Our business is largely based on personal trust, and the most important aspect of our cooperation is that I can know that I can rely on the artists managed by Pat. I am not interested in the technology Pat is using, but it is important for us that practical matters such as billing and scheduling work smoothly and that technology does not prevent us from making decisions fast, if necessary. We don’t want to be billed for events that were cancelled and never rescheduled, and we are quite unhappy if we need to spend our time on these types of technicalities. At times, we need a replacement artist to substitute for a musician who becomes ill or cancels for some other reason, and the faster we can get information about the availability of world-class performers in these situations, the better it is for us.
Yes, we work in these situations directly with Pat, but we have seen that occasionally all the information required for fast decision making is not readily available, and this is something that is difficult for us to understand. We would like to be able to assume that Pat’s able assistant Alex should be able to give us information regarding the availability of a certain artist on a certain date on the phone without any problems. Couldn’t this information be available on the Web, too? Of course, we don’t want anybody to know in advance whom we have booked before we announce our annual program; therefore, security is very important for us.

I hope you understand that we run multiple venues but we definitely still want to be treated as one customer. With some agencies we have seen silly problems that have forced them to send us invoices with several different names and customer numbers, which does not make any sense from our perspective and causes practical problems with our systems.

Project Questions:

Use the narratives in Chapter 1 and above to identify the typical outputs (reports and displays) the various stakeholders might want to retrieve from your database. Now, revisit the E-R diagram you created in 2-60 to ensure that your model has captured the information necessary to generate the outputs desired. Update your E-R diagram as necessary.
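One way to check whether a model can generate the desired outputs is to prototype a few of them. The sketch below is only an illustration, not the expected answer: it uses Python's built-in sqlite3 module, and every table name, column name, venue name, date, and fee in it is a hypothetical assumption rather than something given in the narratives. It covers three outputs suggested by the e-mails: a chronological list of an artist's upcoming events (Dale), billable totals per customer that group several venues under one customer and skip cancelled events (Sandy), and contracts that are about to end (Alex).

```python
import sqlite3
from datetime import date, timedelta

# Hypothetical schema: all table and column names are assumptions for illustration.
SCHEMA = """
CREATE TABLE artist   (artist_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE venue    (venue_id INTEGER PRIMARY KEY,
                       customer_id INTEGER NOT NULL REFERENCES customer,
                       name TEXT NOT NULL);
CREATE TABLE event    (event_id INTEGER PRIMARY KEY,
                       artist_id INTEGER NOT NULL REFERENCES artist,
                       venue_id INTEGER NOT NULL REFERENCES venue,
                       event_date TEXT NOT NULL,   -- ISO date, sorts chronologically
                       fee REAL NOT NULL,
                       status TEXT NOT NULL
                           CHECK (status IN ('scheduled', 'completed', 'cancelled')));
CREATE TABLE contract (contract_id INTEGER PRIMARY KEY,
                       artist_id INTEGER NOT NULL REFERENCES artist,
                       start_date TEXT NOT NULL,
                       end_date TEXT NOT NULL,
                       royalty_pct REAL NOT NULL,
                       notes TEXT);
"""

def build_demo_db():
    """Create an in-memory database with invented sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO artist VALUES (1, 'Dale Dylan')")
    conn.execute("INSERT INTO customer VALUES (1, 'Greater Tri-State Area Concert Halls')")
    # Two venues that belong to the same customer (venue names are made up).
    conn.executemany("INSERT INTO venue VALUES (?, ?, ?)",
                     [(1, 1, 'Hall A'), (2, 1, 'Hall B')])
    conn.executemany("INSERT INTO event VALUES (?, ?, ?, ?, ?, ?)", [
        (1, 1, 1, '2024-03-01', 5000.0, 'completed'),
        (2, 1, 2, '2024-03-15', 4000.0, 'cancelled'),
        (3, 1, 2, '2024-04-10', 6000.0, 'completed'),
        (4, 1, 1, '2024-05-20', 5500.0, 'scheduled'),
    ])
    conn.execute("INSERT INTO contract VALUES "
                 "(1, 1, '2023-06-01', '2024-05-31', 20.0, 'annual renewal')")
    return conn

def upcoming_events(conn, artist_id, start, end):
    """Scheduled events for one artist in a date range, in chronological order."""
    return conn.execute(
        """SELECT e.event_date, v.name
           FROM event e JOIN venue v ON v.venue_id = e.venue_id
           WHERE e.artist_id = ? AND e.status = 'scheduled'
             AND e.event_date BETWEEN ? AND ?
           ORDER BY e.event_date""",
        (artist_id, start, end)).fetchall()

def billable_by_customer(conn):
    """Total fees of completed events per customer, across all of its venues."""
    return conn.execute(
        """SELECT c.name, SUM(e.fee)
           FROM event e
           JOIN venue v ON v.venue_id = e.venue_id
           JOIN customer c ON c.customer_id = v.customer_id
           WHERE e.status = 'completed'
           GROUP BY c.customer_id
           ORDER BY c.name""").fetchall()

def contracts_expiring(conn, as_of, days=60):
    """Contracts whose end date falls within `days` days after `as_of`."""
    cutoff = (date.fromisoformat(as_of) + timedelta(days=days)).isoformat()
    return conn.execute(
        """SELECT a.name, k.end_date
           FROM contract k JOIN artist a ON a.artist_id = k.artist_id
           WHERE k.end_date BETWEEN ? AND ?
           ORDER BY k.end_date""",
        (as_of, cutoff)).fetchall()

if __name__ == "__main__":
    db = build_demo_db()
    print(upcoming_events(db, 1, '2024-01-01', '2024-12-31'))
    print(billable_by_customer(db))
    print(contracts_expiring(db, '2024-04-15'))
```

If an output like these cannot be produced without adding an entity, attribute, or relationship (for example, a link from venue to customer, or a status on each event), that is a sign the E-R diagram needs the corresponding update.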
> Identify six broad categories of implications of big data analytics and decision making.
> How is data quality and management vital in realizing the full potential of big data and analytics?
> Describe the core idea underlying in-database analytics.
> Describe the core idea underlying in-memory DBMSs.
> Describe the mechanism through which prescriptive analytics is dependent on descriptive and predictive analytics.
> Having reviewed your conceptual models (from Chapters 2 and 3) with the appropriate stakeholders and gaining their approval, you are now ready to move to the next phase of the project, logical design. Your next deliverable is the creation of a relational
> How is KNIME used as a predictive analytics tool?
> Discuss why data mining applications are growing rapidly in business.
> Illustrate the goals of data mining and how they answer fundamental business questions.
> Discuss the different types of dashboards and their role in business performance management.
> How does Apache Spark differ from Hadoop?
> Define each of the following terms: a. data mining b. online analytical processing c. business intelligence d. predictive analytics e. Apache Spark
> What is the difference between a wide-column store and a graph-oriented database?
> What is the trade-off one needs to consider while using a NoSQL database management system?
> What is the difference between the explanatory and exploratory goals of data mining?
> Identify the differences between Hadoop and NoSQL technologies.
> What are the two challenges faced in visualizing big data?
> Identify and briefly describe the five Vs that are often used to define big data.
> Contrast the following terms: a. data lake; data warehouse b. Pig; Hive c. volume; velocity d. NoSQL; SQL
> Match the following terms to the appropriate definitions: - Hive - Big data - Data lake - Pig - Analytics a. data exist in large volumes and variety and need to be processed at a very high speed b. a language that is used to extract, load and transform data
> HBase and Cassandra share a common purpose. What is it? What is their relationship to HDFS and Google BigTable?
> Explain the implementation of MapReduce on HDFS clusters.
> How does HDFS aid in coping with hardware failure?
> Describe and explain the two main components of MapReduce.
> What is the role of YARN in the management of highly distributed systems?
> List the purposes Hadoop is used for.
> Discuss the features of NoSQL DBMS that ensure high availability but do not guarantee consistency.
> What is the format that can be used to describe database schema besides JSON?
> Define each of the following terms: a. Hadoop b. MapReduce c. HDFS d. NoSQL e. Pig
> Why is it important to consolidate a Web-based customer interaction in a data warehouse?
> List five claimed limitations of independent data marts.
> Explain the need to separate operational and information systems.
> List the issues that one encounters while achieving a single corporate view of data in a firm.
> Briefly describe the factors that have led to the evolution of the data warehouse.
> Why does an information gap still exist despite the surge in data in most firms?
> List the functions performed by a Data Warehouse Administrator and explain how they differ from the typical data administrator and database administrator.
> Explain the reasons why Data Warehousing 2.0 is necessary.
> Explain how the phrase “extract–transform–load” relates to the data reconciliation process.
> List five errors and inconsistencies that are commonly found in operational data.
> List and briefly describe five steps in the data reconciliation process.
> Contrast the following terms: a. transient data; periodic data b. data scrubbing; data transformation c. data warehouse; data mart; operational data store d. reconciled data; derived data e. static extract; incremental extract f. fact table; dimension ta
> List six typical characteristics of reconciled data.
> Explain why it is essential to scrub data before transformation and how they blend together.
> Which three techniques form the building blocks of any data integration approach?
> Describe the current key trends in data warehousing.
> Explain how data integration is not the only data consolidation technique across an enterprise.
> Briefly explain how the dimensions and facts required for a data mart are driven by the context for decision making.
> Why should changes be made to the data warehouse design? What are the changes that need to be accommodated?
> What is the meaning of the phrase “slowly changing dimension”?
> What are the two situations in which factless fact tables may apply?
> Explain through common examples why determining grain is critical.
> Match the following terms and definitions: - periodic data - data mart - star schema - data scrubbing - data transformation - grain - reconciled data - dependent data mart - real-time data warehouse - selection - transient data - snowflake schema a. lost
> List and describe the various situations in which it becomes essential to further normalize dimension tables.
> Explain the components of a star schema with figures and suitable examples.
> Describe the characteristics of a surrogate key as used in a data warehouse or data mart.
> Discuss the role of an enterprise data model and metadata in the architecture of a data warehouse.
> FAME (Forondo Artist Management Excellence) Inc. is an artist management company that represents classical music artists (only soloists) both nationally and internationally. FAME has more than 500 artists under its management and wants to replace its spr
> What are the key differences between data warehousing and big data approaches to analytical data management?
> What type of an impact has the significant decrease in the cost of storage space had on data warehouse and data mart design?
> Why is real-time data warehousing called active data warehousing?
> Explain how the characteristics of data for data warehousing differ from the characteristics of data for operational databases.
> List the different roles played by data marts and data warehouses in a data warehouse environment.
> What is meant by a corporate information factory?
> List the 10 essential rules for dimensional modeling.
> Define each of the following terms: a. data warehouse b. data mart c. reconciled data d. derived data e. enterprise data warehouse f. real-time data warehouse g. star schema h. snowflake schema i. grain j. conformed dimension k. static extract l. increme
> What is the role of a DBA? List various regulations and standards for physical database design and their functions.
> Identify some limitations of normalized data as outlined in the text.
> What is a translation or code table? When should it be implemented, and what are its advantages?
> What decisions have to be made to develop a field specification?
> What are the key decisions in physical database design?
> Discuss the potential advantages, technical challenges, and disadvantages of using cloud-based database provisioning.
> Describe the differences between the IaaS, PaaS, and SaaS models of cloud-based database management solutions.
> How can views be used as part of data security? What are the limitations of views for data security?
> What are the major inputs into physical database design?
> Briefly describe four components of a disaster recovery plan.
> Explain the role of encryption in data security.
> List and describe four common types of database failure.
> Briefly describe four DBMS facilities that are required for database backup and recovery.
> Research various graphics and drawing packages, such as Microsoft Office and SmartDraw, and compare the E-R diagramming capabilities of each. Is each package capable of using the notation found in this text? Is it possible to draw a ternary or higher-ord
> What are the two key types of security policies and procedures that must be established to aid in Sarbanes-Oxley compliance?
> What are the key areas of IT that are examined during a Sarbanes-Oxley audit?
> What is the difference between an authentication scheme and an authorization scheme?
> List and briefly explain how integrity controls can be used for database security.
> Explain the components of a repository system architecture. List and explain the functions supported by a repository engine.
> List and discuss five areas where threats to data security may occur.
> Contrast the following terms: a. horizontal partitioning; vertical partitioning b. repository; data dictionary c. physical file; tablespace d. before image; after image e. normalization; denormalization f. range control; null control g. transaction log;
> Contrast the uses of a data dictionary and a repository in data and database management.
> What are the different elements of a query that can be processed in parallel?
> Explain how query writers can improve query processing performance through knowledge of query optimizers.
> Interview one person from a key business function, such as finance, human resources, or marketing. Concentrate your questions on the following items: How does he or she retrieve data needed to make business decisions? From what kind of system (personal d
> What role can a query optimizer play in the selection of an optimal set of indexes?
> Database servers frequently use one of the many parallel processing architectures. Discuss which elements of a query can be processed in parallel.
> Explain why an index is useful only if there is sufficient variety in the values of an attribute.
> Discuss the trade-off between improved performance for retrieval through use of indexes and degraded performance due to updates of indexed records in a file.
> State 10 rules of thumb for choosing indexes.
> How is storage space in a database divided logically by the DBMS? What is the role of a DBA in managing it?