OLAP

OLAP stands for Online Analytical Processing.

It allows for interactive analysis of data, allowing data to be summarized and viewed in different ways in an online fashion (with negligible delay).

Data that can be modeled as dimension attributes and measure attributes are called multidimensional data.

  • Measure attributes

    • measure some value

    • can be aggregated upon

    • Ex. the attribute quantity of the sales relation

  • Dimension attributes

    • define the dimensions on which measure attributes (or aggregates thereof) are viewed

    • Ex. the attributes item_name, color, and clothes_size of the sales relation

Example (sales relation):

Cross Tabulation / Pivot

The above shows the cross-tabulation table (or cross-tab or pivot table) of the sales table by item_name and color.

  • Values for one of the dimension attributes form the row headers

  • Values for another dimension attribute form the column headers

  • Other dimension attributes are listed on top

  • Values in individual cells are (aggregates of) the values of the dimension attributes that specify the cell

Data Cube

A data cube is a multidimensional generalization of a cross-tab.

Cross-tabs can be used as views on a data cube.

Cross-Tabulation with Hierarchy

Extended Aggregation

The cube operation computes a union of group by’s on every subset of the specified attributes.

select item_name, color, size, sum(quantity)
from sales
group by cube(item_name, color, size)

This computes the union of eight different groupings of the sales relation:

{ (item_name, color, size),

(item_name, color),

(item_name, size),

(color, size),

(item_name),

(color),

(size),

( ) }

where ( ) denotes an empty group by list.

For each grouping, the result contains null for attributes not present in the grouping.

OLAP Operations

  • Pivoting: changing the dimensions used in a cross-tab

  • Slicing: creating a cross-tab for fixed values only

    • Sometimes called dicing, particularly when values for multiple dimensions are fixed

  • Rollup: moving from finer-granularity data to a coarser granularity

  • Drill down: The opposite operation - that of moving from coarser granularity data to finer-granularity data

OLAP Implementation

  • The earliest OLAP systems used multidimensional arrays in memory to store data cubes, and are referred to as multidimensional OLAP (MOLAP) systems

  • OLAP implementations using only relational database features are called relational OLAP (ROLAP) systems

  • Hybrid systems, which store some summaries in memory and store the base data and other summaries in a relational database, are called hybrid OLAP (HOLAP) systems

Last updated