FAME (Forondo Artist Management Excellence) Inc. is an artist management company that represents classical music artists (only soloists) both nationally and internationally. FAME has more than 500 artists under its management and wants to replace its spreadsheet-based system with a new state-of-the-art computerized information system.
Its core business idea is simple: FAME finds paid performance opportunities for the artists whom it represents and receives a 10 to 30 percent royalty on all the fees the artists earn (the royalties vary by artist and are based on a contract between FAME and each artist). To accomplish this objective, FAME needs technology support for several tasks. For example, it needs to keep track of prospective artists. FAME receives information regarding possible new artists both from promising young artists themselves and as recommendations from current artists and a network of music critics. FAME employees collect information regarding promising prospects and maintain that information in the system. When FAME management decides to propose a contract to a prospect, it first sends the artist a tentative contract, and if the response is positive, a final contract is mailed to the prospect. New contracts are issued annually to all artists.
FAME markets its artists to opera houses and concert halls (customers); in this process, a customer normally requests a specific artist for a specific date. FAME maintains the artists' calendars and responds back based on the requested artist's availability. After the performance, FAME sends an invoice to the customer, who sends a payment to FAME (note that FAME requires a security deposit, but you do not need to capture that aspect in your system). Finally, FAME pays the artist after deducting its own fee.
Currently, FAME has no IT staff. Its technology infrastructure consists of a variety of desktops, printers, laptops, tablets, and smartphones all connected with a simple wired and wireless network. A local company manages this infrastructure and provides the required support. Martin Forondo, the owner of FAME, has commissioned your team to design and develop a database application. In his e-mail soliciting your help, he provides the following information:
E-mail from Martin Forondo, Owner
My name is Martin Forondo, and I am the owner and founder of FAME. I have built this business over the past 30 years together with my wonderful staff and I am very proud of my company. We are in the business of creating bridges between the finest classical musicians and the best concert venues and opera houses of the world and finding the best possible opportunities for the musicians we represent. It is very important for us to provide the best possible service to the artists we represent. It used to be possible to run our business without any technology, particularly when the number of the artists we represented was much smaller than it currently is. The situation is, however, changing, and we seem to have a need to get some technical help for us. At this moment we have about 500 different artists and every one of them is very special for us. We have about 20 artist managers who are responsible for different numbers of artists; some of them have only 10, but some manage as many as 30 artists.
The artist managers really keep this business going, and each of them has the ultimate responsibility for the artists for whom they work. Every manager has an administrative assistant to help him or her with daily routine work; the managers are focusing on relationship building and finding new talent for our company.
The managers report to me but they are very independent in their work, and I am very pleased that I only very seldom have to deal with operational issues related to the managers' work. By the way, I also have my own artists (only a few but, of course, the very best within the company, if I may say so). As I said, we find performance opportunities for the artists and, in practice, we organize their entire professional lives, of course, in agreement with them. Our main source of revenue consists of the royalties we get when we are successful in finding a performance opportunity for an artist:
We get up to 30 percent of the fee paid to an artist (this is agreed separately with every artist and is a central part of our contract with the artist). Of course, we get the money only after the artist has successfully completed the performance; thus, if an artist has to cancel the performance, for example, because of illness, we will not get anything. Within the company the policy is very clear: A manager gets 50 percent of the royalties we earn based on the work of the artists he or she manages, and the remaining 50 percent will be used to cover administrative costs (including the administrative assistants' salaries), rent, electricity, computer systems, accounting services, and, of course, my modest profits. Each manager pays their own travel expenses from their 50 percent. Keeping track of the revenues by manager and by artist is one of the most important issues in running this business.
Right now, we take care of it manually, which occasionally leads to unfortunate mistakes and a lot of extra work trying to figure out what the problem is. It is amazing how difficult simple things can sometimes become. When thinking about the relationship between us and an artist whom we represent, it is important to remember that the artists are ultimately responsible for a lot of the direct expenses we pay when working for them, such as flyers, photos, prints of photos, advertisements, and publicity mailings. We don't, however, charge for phone calls made on behalf of a certain artist, but rather this is part of the general overhead. We would like to settle the accounts with each of the artists once per month so that either we pay them what we owe after our expenses are deducted from their portion of the fee or they pay us, if the expenses are higher than a particular month's fees.
The artists take care of their own travel expenses, meals, etc. From my perspective, the most important benefit of a new system would be an improved ability to know in real time how my managers are serving their artists. Are they finding opportunities for them, and how good are the opportunities? What are the fees that their artists have earned, and what are they projected to be? Furthermore, the better the system could predict the future revenues of the company, the better for me. Whatever we could do with the system to better cultivate new relationships with promising young artists would be great. I am not very computer savvy; thus, it is essential that the system be easy to use.
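The revenue rules in the e-mail can be sketched in a few lines of Python. This is only an illustration of the arithmetic as described by Martin, not part of the case itself: the function names `split_fee` and `monthly_settlement`, the use of `Decimal`, and the choice to reject rates above 30 percent are all assumptions made for the example. Only completed performances are passed in, since a cancelled performance earns nothing.

```python
from decimal import Decimal

# From the e-mail: a manager gets 50 percent of the royalties FAME earns.
MANAGER_SHARE = Decimal("0.50")
# From the e-mail: FAME gets up to 30 percent of the fee (agreed per artist).
MAX_ROYALTY_RATE = Decimal("0.30")


def split_fee(fee, royalty_rate):
    """Split one completed performance fee (hypothetical helper).

    Returns the FAME royalty, the manager's and company's halves of it,
    and the portion that belongs to the artist before expenses.
    """
    if not Decimal("0") <= royalty_rate <= MAX_ROYALTY_RATE:
        raise ValueError("royalty rate must be between 0 and 30 percent")
    royalty = fee * royalty_rate
    manager_share = royalty * MANAGER_SHARE
    return {
        "royalty": royalty,
        "manager_share": manager_share,
        "company_share": royalty - manager_share,
        "artist_portion": fee - royalty,
    }


def monthly_settlement(completed_fees, royalty_rate, expenses):
    """Net amount for one artist for one month (hypothetical helper).

    Positive result: FAME pays the artist.
    Negative result: the artist pays FAME, because the direct expenses
    exceeded the artist's portion of that month's fees.
    """
    artist_total = sum(
        (split_fee(fee, royalty_rate)["artist_portion"] for fee in completed_fees),
        Decimal("0"),
    )
    return artist_total - sum(expenses, Decimal("0"))
```

For instance, a 10,000 fee at a 20 percent royalty gives FAME 2,000, of which 1,000 goes to the manager, and leaves 8,000 for the artist before the month's direct expenses are deducted. Keeping the split and the settlement as separate steps mirrors the two things Martin wants to track: revenue by manager and by artist, and the monthly account with each artist.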
Project Questions:
Create an enterprise data model that captures the data needs of FAME. Use a notation similar to the one shown in Figure 1-4.
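As a starting point for discussion, the entities mentioned in the narrative can be listed in plain text before drawing them. The Python sketch below is one possible reading of the case, not a reference solution and not the figure's notation: every entity name and relationship verb is an assumption inferred from the description above, and a real model would be refined with the stakeholders.

```python
# Hypothetical entity names inferred from the case narrative.
ENTITIES = [
    "ARTIST MANAGER",
    "ADMINISTRATIVE ASSISTANT",
    "ARTIST",
    "PROSPECT",
    "CONTRACT",
    "CUSTOMER",
    "PERFORMANCE",
    "INVOICE",
    "PAYMENT",
    "EXPENSE",
]

# Hypothetical relationships as (entity, verb phrase, entity) triples.
RELATIONSHIPS = [
    ("ARTIST MANAGER", "manages", "ARTIST"),
    ("ADMINISTRATIVE ASSISTANT", "assists", "ARTIST MANAGER"),
    ("PROSPECT", "is offered", "CONTRACT"),
    ("ARTIST", "signs", "CONTRACT"),
    ("CUSTOMER", "books", "PERFORMANCE"),
    ("ARTIST", "gives", "PERFORMANCE"),
    ("PERFORMANCE", "is billed on", "INVOICE"),
    ("PAYMENT", "settles", "INVOICE"),
    ("ARTIST", "incurs", "EXPENSE"),
]


def undeclared_entities():
    """Return relationship endpoints that are missing from ENTITIES."""
    known = set(ENTITIES)
    missing = []
    for left, _verb, right in RELATIONSHIPS:
        for name in (left, right):
            if name not in known and name not in missing:
                missing.append(name)
    return missing


def describe():
    """Render each relationship as one readable line."""
    return [f"{left} {verb} {right}" for left, verb, right in RELATIONSHIPS]


if __name__ == "__main__":
    for line in describe():
        print(line)
```

An enterprise data model stays at this level of detail: entities and the relationships between them, without attributes or keys. The small consistency check simply guards against naming an entity in a relationship that was never declared.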
[Figure 1-4 is not reproduced here.]