The following options mainly control data processing and the customization of the content to be extracted. While describing these options, we will see how they support:
performance enhancement;
advanced reporting;
flexible data manipulation.
Let's consider the following CSV sample:
The CSV delimiter is the character that delimits two fields.
By default, the csvDelimiter is ";" (which is also the delimiter used in our example). It can be changed if the fields of the CSV data source are separated by a different character.
Record delimiter (Kap Lab component property: recordDelimiter)
The recordDelimiter is the string that delimits the content of a field (if any). It is generally needed when field content contains the csvDelimiter string, because CSV parsing produces wrong results in that situation. By default, the recordDelimiter property is "" and can be changed when needed.
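As a minimal sketch of the two delimiter properties, the fragment below configures a component for a comma-separated source whose fields are wrapped in double quotes. The labComponent variable stands for any Kap Lab component instance, as in the other samples of this section. The comma and double-quote values, and the sample row shown in the comment, are illustrative assumptions and are not taken from the CSV sample.

```actionscript
// Hypothetical source row: "Smith, John","R&D"
// The first field contains the csvDelimiter, so it is wrapped in the recordDelimiter.
labComponent.csvDelimiter = ",";      // fields are separated by commas instead of the default ";"
labComponent.recordDelimiter = "\"";  // field content is wrapped in double quotes
```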
With Headers (Kap Lab component property: withHeaders)
The withHeaders property indicates whether the first row should be treated as column headers. If it is set to false, each column of the parsed CSV input is referenced by the string representation of its index. This reference is also used as the property name on the component's unit data elements (nodes for the Visualizer, TreeMapNode for the TreeMap...). If it is set to true (the default), the column headers become the properties of the constructed elements.
The analysisPath is an array describing a logical linking path between columns (for CSV). It defines how the content of the columns is connected: the column references are placed in the order that expresses the linking logic. In our example, ["Entreprise","Departement","Name"] can be used as an analysis path to show a hierarchy that classifies employees by department.
The analysisPath is required to generate data graphs, to extract hierarchies from CSV data, or to generate item lists. CSV analysis cannot be performed if the analysisPath property is not set or contains no column reference.
The analysisPath is used specifically by the Visualizer.
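The hierarchy mentioned above could be configured roughly as follows. This is only a sketch: visualizer stands for an existing Visualizer instance, and the column names are the ones quoted in the text.

```actionscript
// Each column reference is one level of the hierarchy:
// Entreprise -> Departement -> Name
visualizer.analysisPath = ["Entreprise", "Departement", "Name"];
```

Reordering the array changes the linking logic without touching the CSV source itself.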
The following example shows a CSV template that can be extracted as a data graph by simply defining the array ["source","target"] as analysisPath of a Visualizer instance.
Analysis Path Source Sample
source;target
A;B
A;C
B;C
C;D
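For the source above, the corresponding setting might look like the fragment below. Here visualizer is assumed to be an existing Visualizer instance. With this path, each row is read as a link between the value in the source column and the value in the target column, so the first data row connects A and B.

```actionscript
// Link the value of the "source" column to the value of the "target" column on each row.
visualizer.analysisPath = ["source", "target"];
```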
The analysisPath setting enables building reduced tables of the CSV input according to some advanced properties that will be discussed later in this section (reporting functions, merge descriptor...).
Here is an example of how analysisPath can be used to extract hierarchies from our CSV sample:
Unlike the Visualizer's analysis path, the filter path of all other components may contain a single column reference, because these components display entities and do not depend on hierarchies.
This class defines the merging tasks to apply to the CSV source. A MergeDescriptor is a set of MergeEntity instances, each of which defines columns to be merged into one column by a merging function. A MergeEntity instance is created by defining:
an array containing the names of the columns to be merged;
the name of the resulting column (the merging column);
the merging function to be applied to the field content of the merged columns.
A MergeEntity is a single merging task. To be taken into account, it must be added to the MergeDescriptor using the addMergeDescription(mergeEntity) method. When merging, you can either keep the merged columns available for data processing alongside the new merging column, or replace them with the resulting column. This is controlled by setting the leaveMergeColumns property of the MergeDescriptor instance to true or false.
The following example shows how a MergeDescriptor can be applied to our CSV sample.
Merge Descriptor Integration Sample
var mergeDescriptor:MergeDescriptor=new MergeDescriptor();
var mergeEntity:MergeEntity=new MergeEntity(["work_days","total_days"],"holidays",mergeFunction);
mergeDescriptor.addMergeDescription(mergeEntity);
mergeDescriptor.leaveMergeColumns=false;
labComponent.mergeDescriptor=mergeDescriptor;
labComponent.build();
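// Merging function: receives the values of the merged columns
// (here work_days, then total_days).
private function mergeFunction(arr:Array):Number
{
    // Two values are expected; otherwise fall back to 0.
    if (arr.length != 2)
        return 0;
    // holidays = total_days - work_days
    return Number(arr[1]) - Number(arr[0]);
}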
When manipulating CSV files, some columns are often irrelevant to the current data analysis. Their content is junk and can cause data overload, especially with large CSV files. The AttributesDescriptor addresses that problem. For the columns defined in the analysis or filter path (which are parsed as data items), the AttributesDescriptor assigns selected column headers to a buildPath column as attributes. The generated data sets thus have the selected column headers as properties.
An AttributesDescriptor instance is a set of AttributesEntity instances defining the attributes of the data sets extracted from each buildPath column reference. An AttributesEntity is initialized with the column name (which must be a buildPath element) and an attributes array containing the selected column names. As with the MergeDescriptor class, an AttributesEntity must be registered in the AttributesDescriptor using the function addAttributesEntity(attributesEntity).
The following example shows how an AttributesDescriptor can be used (defined after the merge).
Attributes Descriptor Integration Sample
visualizer.buildPath=["Entreprise", "Name", "Departement"];
var attributesDescriptor:AttributesDescriptor=new AttributesDescriptor();
var attributesEntity1:AttributesEntity=new AttributesEntity("Entreprise",["Technology","Age"]);
var attributesEntity2:AttributesEntity=new AttributesEntity("Name",["Holidays"]);
var attributesEntity3:AttributesEntity=new AttributesEntity("Departement",["Age"]);
attributesDescriptor.AddAttributesDescription(attributesEntity1);
attributesDescriptor.AddAttributesDescription(attributesEntity2);
attributesDescriptor.AddAttributesDescription(attributesEntity3);
labComponent.attributesDescriptor=attributesDescriptor;
The component can recognize the type of a column's content on its own when standard types are used. In some cases, however, numbers refer to IDs and must be treated as Strings rather than Numbers. The types descriptor is a Dictionary that assigns a type Class to the data of a given column. The developer can thus leave the data source unmodified and instead set the typesDescriptor property to a dictionary that correctly describes the content of the data source columns.
Types Descriptor Integration Sample
var typesDescriptor:Dictionary= new Dictionary();
typesDescriptor["Age"]=String;
labComponent.typesDescriptor=typesDescriptor;
Reporting functions are an important part of CSV data analysis. When data items and their corresponding data are built from a given column (based on a ReducedTable analysis), several rows are grouped into one row and the content of each data field (attribute column) is collected in an array. Transforming these arrays into usable values, by applying standard or custom functions, is important for obtaining correct and coherent data for the data items generated from the CSV analysis.
The reportingFunctions property is a Dictionary that maps a column key (the column name) to its reporting function. This function should accept an array and a type class (uint for example) as parameters.
Currently, two standard functions can be accessed statically from the ReportingUtils class:
Sum: sums the content of a data field array (column content). When this function is applied, the values of String-typed columns are pushed into an Array, while the values of Number/uint... typed columns are summed.
Mean: calculates the mean value of a data field array (column content). It can only be applied to Number/uint... typed columns.
The following example shows how the reportingFunctions property can be used for our CSV sample.
Reporting functions Integration Sample
var reportingFunctions:Dictionary=new Dictionary();
reportingFunctions["Age"]=ReportingUtils.mean;
reportingFunctions["Holidays"]=ReportingUtils.mean;
labComponent.reportingFunctions=reportingFunctions;
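Custom functions can be registered in the same way as the standard ones. The sketch below uses a hypothetical maxValue function that is not part of ReportingUtils. It follows the parameter list described above (an array and a type class). The return type expected by the component and the exact role of the type class are not specified in this section, so the Number return type and the unused type parameter are assumptions.

```actionscript
import flash.utils.Dictionary;

var reportingFunctions:Dictionary = new Dictionary();
reportingFunctions["Holidays"] = maxValue;   // hypothetical custom function, defined below
labComponent.reportingFunctions = reportingFunctions;

// Keeps the largest value of a grouped data field array.
// The type parameter is accepted to match the described signature
// but is not used in this sketch.
private function maxValue(values:Array, type:Class):Number
{
    if (values == null || values.length == 0)
        return 0;
    var max:Number = Number(values[0]);
    for (var i:int = 1; i < values.length; i++)
    {
        var current:Number = Number(values[i]);
        if (current > max)
            max = current;
    }
    return max;
}
```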
ICollectionView handling in Kap Lab components covers XML, XMLList, XMLListCollection and ICollectionView instances, Arrays, and Objects with a children property or with a custom tree data descriptor...
Navigating these entities is straightforward using the DefaultTreeDescriptor or a custom TreeDataDescriptor, and links are generated on the fly according to the navigation hierarchy.
In addition to this simple data parsing, Lab components can analyze some ICollectionView instances in depth by inspecting the links between properties (for XML and other objects). This is done using the filter path (or the analysis path for the Visualizer) and reporting functions. This feature enables Lab components to analyze the same data from different angles and with different interpretations by changing only the filter path (or the analysis path for the Visualizer). The TreeMap component demo uses this feature to change the data hierarchy without changing the data input.
Custom Tree Data Descriptor (Kap Lab component property: treeDataDescriptor)
In some cases, objects or ICollectionView instances cannot be browsed by the DefaultTreeDescriptor. Providing a custom tree data descriptor that implements the ITreeDataDescriptor interface then makes it possible to analyze the content of that object and to build the corresponding internal data structure of the Kap Lab component.
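A custom descriptor might look like the sketch below. It assumes that the interface in question is the standard Flex mx.controls.treeClasses.ITreeDataDescriptor, which this section does not state explicitly. The class name MembersTreeDescriptor and the "members" property holding the children are hypothetical and only serve the illustration.

```actionscript
package
{
    import mx.collections.ArrayCollection;
    import mx.collections.ICollectionView;
    import mx.controls.treeClasses.ITreeDataDescriptor;

    // Hypothetical descriptor for objects that keep their children
    // in an Array property named "members".
    public class MembersTreeDescriptor implements ITreeDataDescriptor
    {
        public function getChildren(node:Object, model:Object = null):ICollectionView
        {
            return hasChildren(node, model)
                ? new ArrayCollection(node.members as Array)
                : null;
        }

        public function hasChildren(node:Object, model:Object = null):Boolean
        {
            return node != null
                && node.hasOwnProperty("members")
                && node.members is Array
                && (node.members as Array).length > 0;
        }

        public function isBranch(node:Object, model:Object = null):Boolean
        {
            return node != null && node.hasOwnProperty("members");
        }

        public function getData(node:Object, model:Object = null):Object
        {
            return node;
        }

        // Editing is not needed for read-only analysis, so both operations are refused.
        public function addChildAt(parent:Object, newChild:Object, index:int, model:Object = null):Boolean
        {
            return false;
        }

        public function removeChildAt(parent:Object, child:Object, index:int, model:Object = null):Boolean
        {
            return false;
        }
    }
}
```

An instance of such a class would then be assigned to the component property named in the heading above.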
XML files and some objects are generally wrapped in a dummy root that is not needed for analysis and that prevents Lab components from showing the disconnected entities the user wants to display. The new version of Lab components lets developers choose to ignore or keep the top root of a data input by setting the ignoreRoot property to true or false.
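As a rough sketch of the setting, consider the hypothetical input below, in which the data element is only a wrapper. The XML content and the labComponent variable are assumptions made for illustration, and the way the input is handed to the component is left out because it is not covered in this section.

```actionscript
// Hypothetical input: <data> is a dummy root that only wraps the real entities.
var input:XML =
    <data>
        <team name="A"/>
        <team name="B"/>
    </data>;

// Ignore the top root so that the wrapped entities are not tied together by it.
labComponent.ignoreRoot = true;
```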
The following example shows an XML template that can be extracted as a disconnected data graph (4 disconnected data graphs) by simply setting the ignoreRoot property of the Lab component to true.
The analysisPath is an array describing a logical linking path between data properties. It defines how the content of XML entity attributes or object properties is connected: the references are placed in the order that expresses the linking logic. The analysisPath is not mandatory for ICollectionView data sources, but it can be used to analyze the data from a different point of view.
The analysisPath is used specifically by the Visualizer.
The following example shows an XML template that can be extracted as a data graph by simply defining the array ["source","target"] as an analysisPath of a Visualizer instance.