Discuss the role of OLAP in the context of descriptive analytics.
> In Chapter 4, you created the relational schema for the FAME system, and in Chapter 5 you implemented that schema with a relational database management system without giving full consideration to the details of physical database design. In Chapters 5 and
> Compare two versions of SQL to which you have access, such as Microsoft Access and Oracle. Identify at least five similarities and three dissimilarities in the SQL code from these two SQL systems. Do the dissimilarities cause results to differ?
> Develop an EER model for the following situation using the traditional EER notation, the Visio notation, or the subtypes inside supertypes notation, as specified by your instructor: An international school of technology has hired you to create a database
> Contact the DBA of an organization you are familiar with. Interview them to understand how they collect, manage, and utilize metadata. Do they store it using a data repository? If they do use a data repository, do they allow users to create their own pas
> Draw an EER diagram for the following problem: MoneyBags Bank customers can access over 10,000 ATMs worldwide. These are a combination of the Bank’s own machines and those provided through the Visa network. Customers using the Bank’s own machines can che
> Draw an EER diagram for the following problem: A company posts job openings online. Job seekers must apply for these jobs online as well. The company will process the applications and call eligible job seekers for an interview. Upon successful completion
> Visit an organization that has implemented a database approach. Evaluate each of the following: a. The organizational placement of data administration, database administration, and data warehouse administration. b. The assignment of responsibilities for
> Visit the database designer or administrator of an organization that has heavy transaction-oriented applications and requires frequent updates to records in the database as well. Examples include any organization dealing with customers, point-of-sale sys
> Denormalization can be a controversial topic among database designers. Some believe that any database should be fully normalized (even using all the normal forms discussed in Appendix B, available on the book’s Web site). Others look for ways to denormal
> Using a table, differentiate between how data is represented in a file processing environment and how it is represented in a relational database.
> Using the Web and Internet resources, search for the application areas of parallel processing. Which DBMSs on the market support this feature?
> Visit an organization that has implemented a database management system and interview concerned individuals about the standards and regulations used by the organization for financial reporting and ensuring the security of the IT Infrastructure. How does
> Look for a receipt from a supermarket or other retail store you have purchased from. Based on the receipt, draw an EER diagram of the data in this form or report. Transform the diagram into a set of 3NF relations.
> Using the online Appendix B, available on the book’s Web site, as a resource, interview a database analyst/designer to determine whether he or she normalizes relations to higher than 3NF. Why or why not does he or she use normal forms beyond 3NF?
> Obtain an EER diagram from a database administrator or system designer. Based on what you have learned in this book, convert this into a relational schema in 3NF. Now interview the administrator on how they convert the diagram into relations. How do they
> Interview system designers and database designers at several organizations. Ask them to describe the process they use for logical design. How do they transform their conceptual data models (e.g., E-R diagrams) to relational schema? What is the role of CA
> For the same E-R diagram used in Field Exercise 2-56 or for a different database in the same or a different organization, identify any uses of time stamping or other means to model time-dependent data. Why are time-dependent data necessary for those who
> Ask a database or systems analyst in a local company to show you an E-R diagram for one of the organization’s primary databases. Ask questions to be sure you understand what each entity, attribute, and relationship means. Does this organization use the s
> Ask a database or systems analyst to give you examples of unary, binary, and ternary relationships that the analyst has dealt with personally at his or her company. Ask which is most common and why. Ask them if they ever model weak or dependent entities
> Interview a database analyst or a systems analyst. How do they extract business rules for ER modeling? Ask for specific sources. Are they all listed in the text? Did they purchase an ER model and customize it or design it on their own? How did they decid
> List the nine major components in a database system environment.
> Interview a database analyst and ask how they go about identifying business rules in the data modeling process. How do they decide to document what business rules they will gather, utilize, manage, and consider while developing an E-R model? How do they
> What changes can be made in data administration at each stage of the traditional database development life cycle to deliver high-quality, robust systems more quickly?
> Briefly describe four database administration trends that are emerging today.
> What factors must be considered when deciding on an open-source DBMS?
> What functions require the input and involvement of both the data administrator and the database administrator?
> Why are data administrators required to maintain an information repository?
> Indicate whether data administration or database administration is typically responsible for each of the following functions: a. Managing the data repository b. Installing and upgrading the DBMS c. Conceptual data modeling d. Managing data security and p
> Many organizations are now offering cloud-based data warehousing services such as IBM’s dashDB, Amazon’s Redshift, and Microsoft Azure. Pick any three such firms and, using the Internet, compare them based on the factors listed. Prepare a report based on
> Contrast the following terms: a. chief data officer; DBA b. data administration; database administration c. open source DBMS; commercial DBMS d. ETL; MDM
> What distinguishes MDM from other forms of data integration?
> Describe the three major approaches to MDM.
> State any four data availability problems and how they can potentially be addressed.
> Explain how TQM techniques can help in improving data quality.
> Match the following terms and definitions: - data administration database - master data management - data steward - open source DBMS a. oversees data quality for a particular data subject b. a database management system available for free (typically incl
> What are some of the advanced techniques that can be applied by a software solution for data quality improvement?
> Explain how an organization’s business rules can be checked as part of a data audit.
> Describe the key steps to improve data quality in an organization.
> Explain four reasons why the quality of data is poor in many organizations.
> Visit an organization that has implemented information systems on a data warehouse, and interview managers to discuss the following issues: a. Does increased data collection lead to any information gaps for managers? b. Do they receive information from diver
> Define the eight characteristics of quality data.
> What are the four basic facilities for the backup and recovery of a database?
> What are four reasons why data quality is important to an organization?
> How can fuzzy logic, pattern matching, and expert systems be used to improve data quality?
> How can the data capture process be improved?
> Briefly describe four threats to high data availability and at least one measure that can be taken to counter each of these threats.
> Define each of the following terms: a. database administration b. data administration c. chief data officer d. master data management e. open source DBMS
> Compare and contrast R and Python as computational environments for analytics.
> Briefly describe three types of operations that can easily be performed with OLAP tools.
> Having reviewed your conceptual models (from Chapters 2 and 3) with the appropriate stakeholders and gaining their approval, you are now ready to move to the next phase of the project, logical design. Your next deliverable is the creation of a relational
> Explain the different tools for querying and analyzing data in traditional data warehouses and marts that enable various forms of descriptive analytics.
> Explain the three different generations of business intelligence and analytics.
> Explain the progression from DSS to analytics through business intelligence.
> Contrast the following terms: a. Data mining; text mining b. ROLAP; MOLAP c. R; Python
> Match the following terms to the appropriate definitions: - text mining - data mining - descriptive analytics - analytics - predictive analytics - prescriptive analytics a. knowledge discovery using a variety of statistical and computational techniques b
> Identify six broad categories of implications of big data analytics and decision making.
> How are data quality and management vital in realizing the full potential of big data and analytics?
> Describe the core idea underlying in-database analytics.
> Describe the core idea underlying in-memory DBMSs.
> Describe the mechanism through which prescriptive analytics is dependent on descriptive and predictive analytics.
> How is KNIME used as a predictive analytics tool?
> Discuss why data mining applications are growing rapidly in business.
> Illustrate the goals of data mining and how they answer fundamental business questions.
> Discuss the different types of dashboards and their role in business performance management.
> How does Apache Spark differ from Hadoop?
> Define each of the following terms: a. data mining b. online analytical processing c. business intelligence d. predictive analytics e. Apache Spark
> What is the difference between a wide-column store and a graph-oriented database?
> What is the trade-off one needs to consider while using a NoSQL database management system?
> What is the difference between the explanatory and exploratory goals of data mining?
> Identify the differences between Hadoop and NoSQL technologies.
> What are the two challenges faced in visualizing big data?
> Identify and briefly describe the five Vs that are often used to define big data.
> Contrast the following terms: a. data lake; data warehouse b. Pig; Hive c. volume; velocity d. NoSQL; SQL
> Match the following terms to the appropriate definitions: - Hive - Big data - Data lake - Pig - Analytics a. data exist in large volumes and variety and need to be processed at a very high speed b. a language that is used to extract, load and transform data
> HBase and Cassandra share a common purpose. What is it? What is their relationship to HDFS and Google BigTable?
> Explain the implementation of MapReduce on HDFS clusters.
> How does HDFS aid in coping with hardware failure?
> Describe and explain the two main components of MapReduce.
> What is the role of YARN in the management of highly distributed systems?
> List the purposes Hadoop is used for.
> Martin was very impressed with your project plan and has given you the go-ahead for the project. He also indicates to you that he has e-mails from several key staff members that should help with the design of the system. The first is from Alex Martin (ad
> Discuss the features of NoSQL DBMS that ensure high availability but do not guarantee consistency.
> What format, besides JSON, can be used to describe a database schema?
> Define each of the following terms: a. Hadoop b. MapReduce c. HDFS d. NoSQL e. Pig
> Why is it important to consolidate a Web-based customer interaction in a data warehouse?
> List five claimed limitations of independent data marts.
> Explain the need to separate operational and information systems.
> List the issues that one encounters while achieving a single corporate view of data in a firm.
> Briefly describe the factors that have led to the evolution of the data warehouse.
> Why does an information gap still exist despite the surge in data in most firms?
> List the functions performed by a Data Warehouse Administrator and explain how they differ from the typical data administrator and database administrator.
> Explain the reasons why Data Warehousing 2.0 is necessary.
> Explain how the phrase “extract–transform–load” relates to the data reconciliation process.
> List five errors and inconsistencies that are commonly found in operational data.
> List and briefly describe five steps in the data reconciliation process.
> Contrast the following terms: a. transient data; periodic data b. data scrubbing; data transformation c. data warehouse; data mart; operational data store d. reconciled data; derived data e. static extract; incremental extract f. fact table; dimension ta
> List six typical characteristics of reconciled data.