Data Warehousing Interview Questions

Data Warehousing Interview Questions

When it comes to career people see very few options in the ground level but when it comes to the cyber world, you can go beyond the expectation. Data warehouse or the DW serves as one of the initial checkpoints for many important business data which are of high demand. Data warehouse interview questions are one such area that has a great career opportunity. Here are a few questions that will help you find your dream job within Data warehouse field.

Read on the important Data warehouse interview questions and their answers.

Download Data Warehousing Interview Questions PDF

Below are the list of Best Data Warehousing Interview Questions and Answers

There are three major types of data warehouses as-

  • Enterprise data warehouse
  • Operational data store
  • Data mart

A data warehouse consists of data which is obtained from data sources or in other words, external sources. The aim is to make the data available, searchable and valuable for business users. There are three fundamental elements are: -

  • Various data sources like ERP, Excel, financial applications or CRM.
  • A place where data is refined, sorted and put in order.
  • A warehoused space where data is presented

Data analytics or simply DA is the science used for examining raw data with the purpose of concluding that information. This is mostly built to enable the Data Analytics

This method was adopted back in the 1980s by IBM researchers Paul Murphy and Barry Devlin. They happen to put together business data warehouse in a 1988 paper, written by the duo.

William H. Inmon improved it further as data warehouse development, by the introduction of his book Building the Data Warehouse in 1992.

The data warehousing institute was founded in 1995 and the technology started growing. In 2002, Inmon introduced a new concept – data warehousing 2.0.

  • Top-down approach

    According to Inmon’s methods, the data warehouse has to be built first. The data derived from the third party’s external system is verified and finally combined into a normalized data model. The data stored in data warehouse leads to further creation of data marts.

  • Bottom-up method

    According to Kimball’s method, one should create the dimensional data marts first. Data obtained from the systems is passed to the staging area and then shaped into a star schema design. This data is at the end, processed and stored with the data marts and each of the marts focus on individual business process..

  • Hybrid method

    This is an approach obtained from the combination of both the top-down and bottom-up method. It means the pace of the bottom-up method is combined with the integration from top-down design.

  • People who rely on a mass amount of data to make decisions.
  • Users who wish to obtain information from multiple data sources using a customized, complex process.
  • People who wish to access the data using simple technology can also, use it.
  • People who wish to make decisions, based on a systematic approach.
  • Users who want fast results out of a huge amount of data to be used in reports and grids or charts.
  • Data warehousing is the first step toward discovering hidden patterns of data flows and groupings.

Here is a list of the most common sectors which use data warehousing.

  • Airlines

    Used for the purpose such as crew assignment, analysis of profitability, frequent flyers program promotion etc.

  • Banking

    Helps in management of the available resources. Along with the management of market research, performance analysis of product and operations.

  • Healthcare

    This sector uses a data warehouse to strategize and predict outcomes. It also helps in generating patient’s treatment reports, medical aid services, share data with tie-in insurance companies, etc.

  • Public sector

    In this sector, data warehousing is used to gather intelligence. Government agencies use it to maintain and analyze tax records, health policy records, for every individual.

  • Investment and insurance sector

    In this sector, the warehouses come handy in analyzing various data patterns, customer trends and tacking market movements.

  • Retail chain

    As far as the retail chain is considered, data warehousing helps in distribution and marketing. It also helps to track items, customer buying pattern, promotions and also used for determining pricing policy.

  • Telecommunication

    Telecommunication uses data warehousing for product promotions, sales decision along with distribution decisions.

  • Hospitality industry

    This sector uses the warehouse to design and estimate its advertising and promotional campaigns. The main targets are the clients, based on their feedbacks along with travel patterns.

Transaction system that collects the business data is called as OLTP.OLAP tends to report and analyze the system on that data.
OLTP systems are usually optimized for INSERT and UPDATE operations, hence are highly normalized in general.When it comes to OLAP, systems are made denormalized for faster data retrieval through the operation of SELECT.

Data marts are usually designed for just one unique subject area. The organization may have data pertaining to various departments such as Finance, Marketing, HR etc. Hence the data warehouse stores of each department need to be separate, which is solved by data marts. These can also be built on top of a data warehouse if needed.

The hierarchical clustering of the algorithm that overcomes all limitations of the base models and methods that are present in the data warehousing in combination is called the Chameleon. This method operates as a sparse graph that has nodes, that is a Chameleon can represent data items and edges representing the need of the data items.

Chameleon representation is the one in the data warehouse that allows a large dataset to create and operate successfully. The method finds the clusters that can be used in the dataset using the two-phase algorithm.

There are three steps, which would help address the business risk associated.

  • Enterprise strategy

    Technical requirement is identified here, including the current architecture and tools. Facts, dimensions and attributes are also identified. It also includes data mapping and transformation.

  • Phased delivery

    Data warehouse implementation demands to be phased as per the subject areas. Any kind of related business entities such as booking and billing needs to be implemented prior to integration with each other.

  • Iterative prototyping

    The data warehouse needs to developed and tested iteratively and, does need a big approach for the implementation.

Data warehouse system can also be referred to as:

  • Decision Support System (DSS)
  • Business Intelligence Solution
  • Management Information System
  • Analytic Application
  • Executive Information System

Cluster analysis is mostly used to define an object without a class label. It helps in analyzing all the data that is present in the data warehouse. It can compare the cluster with another already running cluster. It also performs assigning tasks to set some of the objects into the groups.

Cluster analysis includes all the information and knowledge around other fields like the machine learning, image analysis, pattern recognition, and bio-informatics and helps in performing the iterative process of knowledge discovery that is used with pre-processing and other parameters

The prominent tools of warehousing are given as below: -

  • MarkLogic
  • Oracle
  • Amazon RedShift

The size of the databases continues to grow, which might actually be a problem in the future. The present data warehousing system would not be able to support such a huge data in the future.

Regulatory constraints are changing too, which might lead to loss of ability in combining a source of data. This can lead to unstructured data which is quite difficult to store.