OpenCube Query Language

 

The OQL (OpenCube Query Language) is divided in 3 types of commands:

  • Cube Management
  • Data Management
  • Reports

 

Language Conventions

The following conventions are used in this manual:

 

  • All commands must end in a semicolon ( ; )
  • All syntax elements enclosed in single quotes ( ‘ ‘ ) means that they are used literally and without the quotes
  • Keywords are case insensitive
  • STRING is a double quoted string, if it needs to contain a double quote inside, then this must be signaled by using 2 double quotes (i.e. “The lazy dog”, “The “”lazy”” dog”)
  • NUMBER is a series of digits optionally followed by a dot and another series of digits. The optional part is the decimal part (i.e. 123, 123.25)
  • ID is an identifier, it beings with a letter or an underscore, and it is followed by 0 or more letters, digits or underscores. IDs are case sensitive. An ID might optionally be preceded by another ID and a dot, this is used when identifying elements that are part of another ID (i.e. price, brand.coke )
  • A syntax element followed by a question mark ( ? ) means that the element is optional
  • A syntax element followed by an asterisk ( * ) means that the element is optional, or can be repeated multiple times.
  • A syntax element followed by a plus sign ( + ) means that the element can be repeated multiple times, but it has to appear at least once.
  • Syntax elements will be grouped using curly braces ( { } )
  • A group of syntax elements separated by a vertical bar ( | ) means that 1 and only one of the options is used
  • Use a single apostrophe in the script to create a line comment ( ‘ )

 

Cube Management

 

The cube management commands allow creating new cube files, as well as load and saving them.

 

Load

 

The load command is used to load data into the workspace

 

Syntax

 

CmdLoad -> ‘load’ STRING LoadUsing? LoadFields?

 

 

 

Where:

 

The string is the location of the data, typically a file name and path.

 

LoadUsing -> ‘using’ STRING STRING

 

 

 

This optional part allows using external DLLs as data providers for the load.

 

The first string is the file name to the external DLL that contains a class that implements OuterCube.Importers.IImport

 

The second string is the name of the class inside the DLL that implements the interface. The string must also contain the name space to the class

 

LoadFields -> LoadField { ‘,’ LoadField }*

 

 

 

This is a comma separated list of field definitions where

 

 

 

LoadField -> ID { ‘context’ | ‘dimension’ | ‘measure’ } ‘notmapped’? STRING? LoadFieldParam? LoadValues?

 

 

 

ID is the name of the field

 

The field must be defined to what type it belongs, either a context, and dimension, or a measure

 

 

 

If the data for the field contains a string description instead of the index into the value, then the field is not value mapped and it should be marked with the keyword notmapped

 

 

 

The optional string is a description to be loaded for this field

 

The LoadFieldParam is defined as follows:

 

LoadFieldParam -> ‘(‘ { STRING | NUMBER }+  ‘)’

 

 

 

This section is used for extra data that the import dll might need for the field. For example, the fix width text importer uses 2 parameters from this section, the first one is the starting position for the field, and the second is the length of the field.

 

 

 

The syntax is a series of string or number values enclosed by parenthesis. There is no need to separate the individual values by commas.

 

 

 

LoadValues -> ‘[‘ LoadValue { ‘,’ LoadValue }* ‘]’

 

The load values section allows to preload values names, descriptions and indices for a specific field

 

 

 

It is a comma separated list of load values enclosed by square brakets, and is defined as follows:

 

 

 

LoadValue -> ID NUMBER STRING?

 

 

 

The ID is the name of the value.

 

 

 

The number is the index of the value.

 

 

 

The optional string is a description of the value

 

 

 

Examples

 

 

 

load "cola.dat"

 

    hhid  measure   ( 1 8 ),

 

    date  dimension ( 9 2 ),

 

    cat   dimension ( 11 2 ) [ cola 1, uncola 2, water 3 ],

 

    brand dimension ( 13 2 ) [ coke 1, pepsi 2, rc 3, sprite 4, 

 

                                   fanta 5, mist 6, desani 7, aqua 8, 

 

                                   perrier 9 ],

 

    type  dimension ( 15 2 ) [ reg 1, diet 2 ],

 

    pkg   dimension ( 17 2 ) [ can 1, bot 2 ]

 

    ;

 

    

 

This example loads the file cola.dat using the internal importer (there is no ‘using’ clause). Since the file is not a ‘.csv’ then is assumed to be fixed length. The file contains 6 fields:

 

  • The field hhid is numeric, begins at position 1 and has a length of 8, and consider is a measure
  • The field date is a dimension the begins at position 9 and has a length of 2, and it does not know the possible values so it does not define them and names will be assigned during the load itself (Value_1, Value_2, etc)
  • The field cat is also a dimension that starts at position 11 and has a length of 2, the values and not their labels are stored in the data file hence there is no notmapped flag, and the value labels are defined as the name cola for the value 1, uncola for the value 2, and water for the value 3.
  • The rest of the fields are easily understood

 

 

 

load "Rx_GI_2.sav" 

 

using "OuterCube.Importers.spss.dll" "OuterCube.Importers.spss.ImportSPSS";

 

 

 

This example loads an SPSS file using an external DLL and class. In this case there is no need to define the fields (LoadFields part) because the DLL will read the data field configuration from the SPSS file.

 

 

 

Open

 

This command can open a previously saved OuterCube file (.oc)

 

Syntax

 

CmdOpen -> ‘open’ STRING

 

Where:

 

The string is the file name (and path is necessary) to the OpenCube file

 

Example

 

open "soda.oc";

 

 

 

This opens the file ‘soda.oc’.

 

Save

 

The save command allows to save the current workspace.

 

Syntax

 

CmdSave -> ‘save’ STRING

 

 

 

Where:

 

The string is the name of the file to be saved into. The file will be overwritten if it already exists

 

Example

 

save "soda.oc";

 

 

 

This line saves the current data to the ‘soda.oc’ file

 

 

 

 


 

 

Data Management

 

The data management commands allow modifying or creating new data based on the data already loaded.

 

Using

 

Using allows setting a global filter that the system will use for all report operations

 

Syntax

 

CmdUsing -> ‘using’ Context

 

Where:

 

Context is as detailed in the basic concepts section.

 

To eliminate the global filter influence set it to sys.All, which is the value when the workspace is just opened or loaded

 

Example

 

using pkg.can;

 

using sys.All;

 

Measure

 

The measure command creates new measures by combining existing measures and constant number values. The records affected can be limited using a context

 

Syntax

 

CmdMeasure -> ‘measure’ ID ‘=’ Measure { ‘using’ Context }?

 

Where:

 

ID is the name of the new measure

 

Measure is the combination of 1 or more measures or numeric constants using the basic math operators ( +, -, *, / ). If the operation for a particular row results in a division by 0, then the result of that operation will be replaced with a 0.

 

If optionally the keyword ‘using’ is included, then the context will limit the rows affected, and it will make 0 all the other rows

 

Example

 

measure client_price = price * 1.25;

 

This example creates a new measure called ‘client_price’ where every row will have the multiplication the value of the same row for the measure ‘price’ times 1.25

 

 

 

Dimension

 

The dimension commands allows to create new dimensions based on existing contexts, or by binning a measure, or by crossing 2 dimensions

 

Syntax

 

CmdDimension  -> ‘dimension’ ID ‘=’ { IdentifierList

 

                                     | { Measure ‘is’ RangeList }

 

                                     | { ID ‘cross’ ID }

 

                                   }

 

 

 

Where:

 

ID is the name if the new dimension

 

If an identifier list is used it has this syntax:

 

IdentifierList -> ‘[‘ Context { ‘,’ Context }* ‘]’

 

 

 

It is a comma delimited list of contexts. The contexts are as detailed in the basic concepts section. The list is enclosed in square brackets

 

Since this can be unnamed formulas, the name of the contexts of the new dimension will be automatically generated, you can sue the ‘rename’ command to change the names

 

If a measure (or a measure calculation) is specified, then the dimension will be the binning of that measure, and the syntax is as follows:

 

Measure ‘is’ RangeList

 

Measure is as detailed in the basic concepts section.

 

The range list has this syntax:

 

RangeList -> Range { ‘,’ Range }*;

 

 

 

It is a comma delimited list of ranges with this syntax:

 

Range -> ID { ‘under’ NUMBER | ‘between’ NUMBER ‘and’ NUMBER | ‘above’ NUMBER };

 

 

 

ID is the name that this context will have in the dimension

 

The numbers are the limits for that range.

 

If a cross dimension is used then the IDs are the names of the dimensions to be crossed

 

 

 

 

 

Example

 

dimension dimColas = [ brand.coke, brand.pepsi, brand.rc ];

 

dimension priceLevel = price is low under 10,

 

                                medium between 10 and 20,

 

                                high above 20;

 

 

 

dimension priceColas = dimColas cross pricelevel;

 

Context

 

The context command allows to create new contexts. These new contexts will be placed in the ‘global’ dimension

 

Syntax

 

CmdContext -> ‘context’ ID ‘=’ Context

 

Where:

 

ID is the name of the new context, and Context is the combination of other contexts as detailed in the basic concepts section

 

Example

 

context bottledCoke = pkg.bot and brand.coke;

 

context sodas = sys.All minus cat.water;

 

Rename

 

The rename commands change the name of a measure, dimension or context

 

Syntax

 

CmdRename -> ‘rename’ { ID ‘.’ }? ID ID

 

Where:

 

The first identifier might be the name of a measure or dimension, or using the dot notation, the name of a context. The second identifier will be the new name.

 

The dimensions ‘global’ and ‘sys’ cannot be renamed.

 

Example

 

rename priceColas.high_coke expensiveCoke;

 

Delete

 

The delete commands removes a measure, dimension or context from the workspace

 

Syntax

 

CmdDelete -> ‘delete’ { ID ‘.’ }?ID

 

Where:

 

The identifier might be the name of a measure or dimension, or using the dot notation, the name of a context, that will be deleted

 

The dimensions ‘global’ and ‘sys’ cannot be deleted

 

Example

 

Delete priceColas.low_coke

 

Print

 

The print ( ? ) command is invoked simple by the use of a question mark, and it can list details of the elements that are part of the workspace

 

Syntax

 

CmdPrint -> ‘?’ { ‘context’ | ‘dimension’ | ‘measure’ }? STRING?

 

Where:

 

Including one of the optional keywords ‘context’,  ‘dimension’, or ‘measure’ will limit the result to include only that kind of element.

 

The optional string is a regular expression that can be used to filter the results; the filter will be applied to the name and to the description of the element. The filter is not case sensitive.

 

Example

 

‘ List everything

 

?;

 

‘ List all dimensions

 

? dimension;

 

‘ List anything that contains ‘co’

 

? “co”;

 

‘ List measures that begin with ‘p’

 

? measure ”^p”;


 

 

Reports

 

The report commands extract data from the workspace

 

Select

 

The select retrieves the information as rows of data without any aggregation. . This command gets impacted by the global filter, on top of the context.

 

Syntax

 

CmdSelect -> ‘select’ IdList ‘by’ Context

 

Where

 

IdList is the list of columns that need to be retrieved and it is a comma delimited list of names of measures, dimensions, or contexts.

 

For measures, the returning column would be type of double

 

For dimensions, the returning column would be type of string and will contain the name of the contexts that has that row

 

For contexts, the returning column would be type of Boolean, and it will contain true or false if that rows is in that context

 

Context is as defined in the basic concepts section. Use ‘sys.All’ if you want to retrieve all the data.

 

Example

 

select cat, pkg.can by brand.rc;

 

Total

 

The ‘total’ command calculates aggregations. This command gets impacted by the global filter

 

Syntax

 

CmdTotal -> ‘total’ AggList { ‘[‘ IdList ‘]’ }? { ‘[‘ IdList ‘]’ }?

 

Where:

 

The AggList is the list of aggregations that will be done in each cell, and it has this syntax:

 

AggList -> AggExp { ‘,’ AggExp }*;

 

 

 

It is a comma delimited list of Aggregation expressions with this syntax:

 

AggExp -> AggFunc Measure?

 

 

 

The aggregation function is one of these keywords: ‘count’, ‘sum’, ‘avg’, ‘min’, ‘max’, ‘stddev’, ‘var’

 

 

 

The measure is as detailed in the basic concepts section, and it is optional only if the aggregation function is ‘count’

 

The two optional set of id lists helps detail the result. The first set details in the rows direction, and the second set details in the columns direction.

 

The list of id is a comma delimited list of dimensions or contexts. Measures are not valid here.

 

Example

 

' total count

 

total count;

 

 

 

' counts by brand

 

total count [ brand ];

 

 

 

' counts by brand and package

 

total count [ brand, pkg ];

 

 

 

' counts by brand crossed by pkg

 

total count [ brand ] [ pkg ];

 

 

 

‘ counts by brand crossed by package including average price

 

Total count, avg price [ brand ]  [ pkg ];