Setup your machine to process your data.
Data coming from the datasource is piped through various processing nodes until it reaches the data store as final destination.
In the previous section, we learned how to set settings for datasources to connect and pull data from your database or another sources. The data will be then streaming through many processing nodes. Each processing node does a specific task upon the data stream. Together they will help to transform original data into meaningful information/data to be consumed in the report view.
<?php
use \koolreport\processes\Filter;
use \koolreport\processes\Group;
class MyReport extends \koolreport\KoolReport
{
...
public function setup()
{
$this->src('sales')
->query("SELECT customerName, productName,region, amount from tblPurchases")
->pipe(new Filter(array(
array("region","=","asia")
)))
->pipe(new Group(array(
"by"=>"customerName",
"sum"=>"amount"
)))
->pipe($this->dataStore('sales'));
}
}
In above example, we want to summarize total purchased amount per customer from Asia region. The data stream from SQL Query will filtered by region then grouped by customerName. In the end, the result of processing will be store in dataStore named "sales".
There are FOUR (4) major groups of processing nodes. They are Row, Column, Transformation and Analysis.
Table process nodes contain process that apply changes to the whole data set.
Transpose
process help to transpose table, changing row to column and column to row.
The new columns will be name as "c0"
, "c1"
, "c2"
to "cn"
.
The "c0"
is a specical column which contains the name
of column before being transposed.
<?php
use \koolreport\processes\Transpose;
class MyReport extends \koolreport\KoolReport
{
public function setup()
{
...
->pipe(new Transpose())
...
}
}
Row process nodes contain processes make changes to data rows. For example, you need to filter row by condition, limit row result and so on.
Filter
process helps to filter your data based on condition.
It is like the WHERE
statement in SQL Query.
If your data is from database, we encourage you to use the filtering feature
of database. However, if your data coming from other sources such as CSV or Microsoft
Excel then you need Filter
.
<?php
use \koolreport\processes\Filter;
class MyReport extends \koolreport\KoolReport
{
public function setup()
{
...
->pipe(new Filter(array(
array("region","=","asia"),
array("customerAge",">",50)
)))
...
}
}
Below are the list of supported operators in Filter
Name | description | example |
---|---|---|
= | Equal to | array("age","=",32) |
!= | Not equal to | array("age","!=",32) |
> | Greater than | array("age",">",32) |
< | Less than | array("age","<",32) |
>= | Greater than or equal to | array("age",">=",32) |
<= | Less than or equal to | array("age","<=",32) |
contain | Contain a string | array("name","contain","John") |
notContain | Not contain a string | array("name","notContain","John") |
startWith | Start with a string | array("name","startWith","John") |
notStartWith | Not start with a string | array("name","notStartWith","John") |
endWith | End with a string | array("name","endWith","John") |
notEndWith | Not end with a string | array("name","notEndWith","John") |
between | Between two given values | array("name","between",24,32) |
notBetween | Not between two given values | array("name","notBetween",24,32) |
By default the condition are joined with and
operator, however
we can change to or
like below example:
...
->pipe(new Filter(array(
array("region","=","asia"),
'or',
array("region","=","america")
)))
...
Limit
process will limit the rows returned.
It works the same as LIMIT statement in SQL Query.
<?php
use \koolreport\processes\Limit;
class MyReport extends \koolreport\KoolReport
{
public function setup()
{
...
->pipe(new Limit(array(20,10));//Limit 20 rows starting from offset 10
...
}
}
Example: ->pipe(new Limit(array(10))
will return first 10 rows.
Sort
process helps you to sort data based on columns.
It works like the SORT BY in SQL Statement.
The input array contains key=>value
where key
will be column name
and the value
is the direction of sort.
<?php
use \koolreport\processes\Sort;
class MyReport extends \koolreport\KoolReport
{
public function setup()
{
...
->pipe(new Sort(array(
"age"=>"asc",
"name"=>"desc"
));
...
}
}
If you need sort differently rather than simple desc
or asc
, you can set your own comparison function
<?php
use \koolreport\processes\Sort;
class MyReport extends \koolreport\KoolReport
{
public function setup()
{
...
->pipe(new Sort(array(
"age"=>function($a,$b){
return $a<$b;
},
));
...
}
}
Join
process help to join two data sources based on the matching of key columns.
<?php
use \koolreport\processes\Join;
class MyReport extends \koolreport\KoolReport
{
public function setup()
{
$user_source = $this->src('user_source')->query("select id,name from users");
$purchase_source = $this->src('purchase_source')->query("select user_id,item,amount from purchases");
//Note that: user_source and purchase_source can be from different database
$join = new Join($user_source,$purchase_source,array("id"=>"user_id"));
$join->pipe($this->dataStore('together'));
}
}
In above example, we join data from different data sources. The output data will
contains data from both sources that is match "user_id"
to "id"
.
Columns processes are those causes changes to data column such as create new columns, remove columns or alternate column meta data.
CalculatedColumn
helps you to create new column from existed ones.
For example, you have price and quantity column, you want to create
new column named amount which is the price x quantity.
CalculateColumn will take expression to create new column.
<?php
use \koolreport\processes\CalculatedColumn;
class MyReport extends \koolreport\KoolReport
{
public function setup()
{
...
->pipe(new CalculateColumn(array(
"amount"=>"{price}*{quantity}",
"recievedAmount"=>"{amount}-{fee}",
"power"=>"pow({number_column},10)"
)))
...
}
}
You may define the new column by function:
...
->pipe(new CalculateColumn(array(
"amount"=>function($data){
return $data["price"]*$data["quantity"];
},
)))
...
Create new column store the row number:
...
->pipe(new CalculateColumn(array(
"rowNum"=>"{#}"
)))
...
You can set custom meta data for the new column
...
->pipe(new CalculateColumn(array(
"amount"=>array(
"exp"=>"{price}*{quantity}",
"type"=>"number",
),
"name"=>array(
"exp"=>function($data){
return $data["first_name"]." ".$data["last_name"];
},
"type"=>"string",
)
)))
...
AggregatedColumn
are specical process help you to calculate the aggregated result of an column and create a new column to store result.
For example, you have a sale_amount
column, the AggregatedColumn
can help you to calculate the total sale_amount and put result
in column name total_sale_amount
...
->pipe(new AggregatedColumn(array(
"total_sale_amount"=>array("sum","sale_amount")
)))
...
The great thing is now you are capable of calculating percentage of each sale_amount
over the total_sale_amount
with CalculatedColumn
process.
...
->pipe(new AggregatedColumn(array(
"total_sale_amount"=>array("sum","sale_amount")
)))
->pipe(new CalculatedColumn(array(
"percentage"=>"{sale_amount}*100/{total_sale_amount}"
)))
...
Beside "sum"
, AggregatedColumn supports "count"
, "avg"
, "min"
, "max"
operation.
ColumnMeta
does nothing to data except that it allows you to set
meta data for columns. Meta data are extra description of columns. Meta data
define the column type and extra settings depended on column type.
Name | type | default | description |
---|---|---|---|
name | string | Set new name for column | |
type | string | unknown | Type of column data which can be "string" , "number" or "datetime" |
hidden | bool | false | Whether the column is hidden |
label | string | The column name | The display name of column |
"type"=>"number"
)Name | type | default | description |
---|---|---|---|
decimals | number | 0 | The number of zeros after decimal point |
decimalPoint | string | "." | The decimal point |
thousandSeparator | string | "," | The separator character between each thousand |
prefix | string | The character in front of number. If the number represent dollar currency, your prefix could be "$" | |
suffix | string | The character appeared at the end of number. |
"type"=>"datetime"
)Name | type | default | description |
---|---|---|---|
format | string | "Y-m-d H:i:s" | The format string of datetime column |
<?php
use \koolreport\processes\ColumnMeta;
class MyReport extends \koolreport\KoolReport
{
public function setup()
{
...
->pipe(new ColumnMeta(array(
"customerName"=>array(
"type"=>"string",
"label"=>"Customer Name"
),
"amount"=>array(
"type"=>"number",
"label"=>"Amount in USD"
"decimals"=>2,
"prefix"=>"$ "
),
"orderDate"=>array(
"type"=>"datetime",
"label"=>"Order Date",
"format"=>"Y-m-d H:i:s",
)
)))
...
}
}
ColumnsSort
process helps you to sort the column list by its name or its labels.
For example, you have data {name:"Peter",age:35}
, the ColumnsSort
will turn those data to {age:35,name:"Peter"}
if you choose to sort by name of columns.
<?php
use \koolreport\processes\ColumnsSort;
class MyReport extends \koolreport\KoolReport
{
public function setup()
{
//Sort by the name of columns
->pipe(new ColumnsSort(array(
"{name}"=>"asc",
)))
...
//Sort by the label of columns
->pipe(new ColumnsSort(array(
"{label}"=>"asc",
)))
...
//Sort by name of columns using custom function
->pipe(new ColumnsSort(array(
"{name}"=>function($a,$b){
return $a<$b;
},
)))
...
//Sort by label of columns using custom function
->pipe(new ColumnsSort(array(
"{label}"=>function($a,$b){
return $a<$b;
},
)))
...
}
}
CopyColumn
will make a copy of an existed column
<?php
use \koolreport\processes\CopyColumn;
class MyReport extends \koolreport\KoolReport
{
public function setup()
{
...
->pipe(new CopyColumn(array(
"name"=>"copy_of_name",
"amount"=>"copy_of_amount"
)))
...
}
}
Sometime, you need to remove columns after usage to keep thing neat.
RemoveColumn
process help you to remove those unwanted columns.
<?php
use \koolreport\processes\CopyColumn;
class MyReport extends \koolreport\KoolReport
{
public function setup()
{
...
->pipe(new RemoveColumn(array(
"extra_column",
"unimportant_column"
)))
...
}
}
The "extra_column"
or "unimportant_task"
will be removed from datastream
If you want to rename a column, you may use ColumnRename
process.
<?php
use \koolreport\processes\ColumnRename;
class MyReport extends \koolreport\KoolReport
{
public function setup()
{
...
->pipe(new ColumnRename(array(
"old_name"=>"new_name",
"amount"=>"sale_amount",
)))
...
}
}
The "old_name"
and "amount"
column will be renamed to "new_name"
and "sale_amount"
OnlyColumn
process helps you to keep those columns data you need.
<?php
use \koolreport\processes\OnlyColumn;
class MyReport extends \koolreport\KoolReport
{
public function setup()
{
...
->pipe(new OnlyColumn(array(
"important_column",
"another_important_one"
)))
...
}
}
In above example, ONLY "important_column"
column and "another_important_one"
column will be kept.
The Transformation processes will make change to the data cell, transform original value to another.
DateTimeFormat
process will transform to datetime of a column to another format.
To transform it requires the original format of datetime and the format you want to convert to.
<?php
use \koolreport\processes\DateTimeFormat;
class MyReport extends \koolreport\KoolReport
{
public function setup()
{
...
->pipe(new DateTimeFormat(array(
"last_login_time"=>"F j, Y",
"created_time"=>array(
"from"=>"Y-m-d H:i:s",
"to"=>"F j, Y"
)
)))
...
}
}
There are two ways to write settings for DateTimeFormat
"created_time"=>array("from"=>"Y-m-d","to"=>"F j, Y")
.
This is very clear that this created_time column will be converted
from 2016-1-12 to January 12, 2016
"last_login_time"=>"F j, Y"
.
This is used when the date format of the column (in this example is "last_login_time") has been
defined with ColumnMeta. If not then the default "from"
format is
"Y-m-d H:i:s"
.
When you have datetime column and you want to group those date into week of year, month, quarter or year,
TimeBucket
process will come into play.
This process will collect the datetime and put them into bucket of your choice whether year or month or other.
The datetime data after categorized by TimeBucket can be aggregated by Group or Pivot process.
<?php
use \koolreport\processes\TimeBucket;
use \koolreport\processes\Group;
class MyReport extends \koolreport\KoolReport
{
public function setup()
{
...
->pipe(new TimeBucket(array(
"created_time"=>"quarter"
)))
->pipe(new Group(array(
"by"=>"created_time",
"sum"=>"amount"
))
...
}
}
Bucket | description |
---|---|
date | Categorize datetime into date |
month | Categorize datetime into month |
quarter | Categorize dateime into quarter |
week | Categorize datetime into week number |
year | Categorize datetime into year |
hourofday | Categorize time into hour of day, outout value's range is [0-23] |
dayofweek |
Categorize datetime into day of week, output value's range is [0-6] [Monday - Sunday]
|
dayofmonth | Categorize datetime into day of month, output value's range is [1-31] |
monthofyear | Categorize datetime into month of year, output value's range is [1-12] [January - December] |
NumberBucket
helps to categorize number into group of predefined range for example 0-10
or 10-20
.
The number data after categorized by NumberBucket can be aggregated by Group or Pivot process.
<?php
use \koolreport\processes\NumberBucket;
use \koolreport\processes\Group;
class MyReport extends \koolreport\KoolReport
{
public function setup()
{
...
->pipe(new NumberBucket(array(
"age"=>array("step"=>5)
)))
->pipe(new Group(array(
"by"=>"age",
"avg"=>"income"
))
...
}
}
For each column need bucket, we can have follow advanced settings:
Name | type | default | description |
---|---|---|---|
step | number | *required This is range of bucket you want to set |
|
formatString | string | "{from} - {to}" | The format string of output. With default settings, output will be 0-10 for example. |
decimals | number | 0 | The number of decimals for {from} and {to} |
thousandSeparator | string | "," | Thousand separator format for number |
decimalPoint | string | "." | Decimal character separating number and it's decimal |
prefix | string | The string in front of number | |
suffix | string | The string goes after number |
This below example will count the number of people by their income ranges.
<?php
use \koolreport\processes\NumberBucket;
use \koolreport\processes\Group;
class MyReport extends \koolreport\KoolReport
{
public function setup()
{
...
->pipe(new NumberBucket(array(
"income"=>array(
"step"=>1000,
"formatString"=>"From {from} to {to}",
"decimals"=>0,
"prefix"=>"$",
)
)))
->pipe(new Group(array(
"by"=>"income",
"count"=>"user_id"
))
...
}
}
NumberRange
helps to set custom name for defined range.
For example, you can defined income from $0 to $1000 as low
,
$1000 to $6000 is medium
and above $6000 is high
.
<?php
use \koolreport\processes\NumberRange;
use \koolreport\processes\Group;
class MyReport extends \koolreport\KoolReport
{
public function setup()
{
...
->pipe(new NumberRange(array(
"income"=>array(
"low"=>array(null,999.99)
"medium"=>array(1000,5999.99),
"high"=>array(6000,null)
)
)))
->pipe(new Group(array(
"by"=>"income",
"count"=>"user_id"
))
...
}
}
StringCase
process helps you to format string
<?php
use \koolreport\processes\StringCase;
class MyReport extends \koolreport\KoolReport
{
public function setup()
{
...
->pipe(new StringCase(array(
"upper"=>"country,region",
"lower"=>"address1,address2",
"first-cap"=>"description,note",
"all-cap"=>"name"
)))
...
}
}
Name | description | example |
---|---|---|
lower | Lowercase all data of a column | "joHn DoE"=>"john doe" |
upper | Uppercase data of a column | "joHn DoE"=>"JOHN DOE" |
first-cap | Capitalize the first character | "john doe"=>"John doe" |
all-cap | Capitalize all words in string | "john doe"=>"John Doe" |
StringTrim
helps to trim off the space or some of special character at beginning and at the end of string data
<?php
use \koolreport\processes\StringTrim;
class MyReport extends \koolreport\KoolReport
{
public function setup()
{
...
->pipe(new StringTrim(array(
"name",
"address",
"description"
)))
...
}
}
Set character_mask
to determine what character to trim off.
For example:
->pipe(new StringTrim(array(
"name",
"address",
"description",
"character_mask"=>"\t\n\r\0\x0B"
)))
ValueMap
process helps to map from one value to other.
<?php
use \koolreport\processes\ValueMap;
class MyReport extends \koolreport\KoolReport
{
public function setup()
{
...
->pipe(new ValueMap(array(
"level"=>array(
"0"=>"easy",
"1"=>"medium",
"2"=>"hard"
)
)))
...
}
}
Problem: It happens that the converted value may be in different type compared to original value As the above example, we map from number to string.
Solution: You can add the {meta}
to set new meta data for columns
<?php
use \koolreport\processes\ValueMap;
class MyReport extends \koolreport\KoolReport
{
public function setup()
{
...
->pipe(new ValueMap(array(
"level"=>array(
0=>"Monday",
1=>"Tuesday",
3=>"Wednesday",
4=>"Thursday",
5=>"Friday",
6=>"Saturday",
7=>"Sunday",
"{meta}"=>array(
"type"=>"string",
)
)
)))
...
}
}
Sometime you need function to map value to another. Here is how to do:
<?php
use \koolreport\processes\ValueMap;
class MyReport extends \koolreport\KoolReport
{
public function setup()
{
...
->pipe(new ValueMap(array(
"level"=>array(
"{func}"=>function($value){
return "This is $value";
},
"{meta}"=>array(
"type"=>"string",
)
)
)))
...
}
}
Custom
is a special process which you set your own function to perform custom change to data row.
Your function will receive data row as parameters. After you are done with processing, return the row data for the next process.
<?php
use \koolreport\processes\Custom;
class MyReport extends \koolreport\KoolReport
{
public function setup()
{
...
->pipe(new Custom(function($row){
$row["name"] = strtolower($row["name"]);
return $row;
}))
...
}
}
Analyis processes are those summarize whole data and output meaningful information or data
Group
process acts like the GROUP BY
statement in SQL Query.
<?php
use \koolreport\processes\Group;
class MyReport extends \koolreport\KoolReport
{
public function setup()
{
...
->pipe(new Group(array(
"by"=>"country,state",
"sum"=>"polulation"
)))
...
}
}
Name | type | default | description | example |
---|---|---|---|---|
by | string | List of columns to be group | "by"=>"country,state" |
|
sum | string | List of columns to be sum | "sum"=>"population" |
|
avg | string | List of columns to be averaged | "avg"=>"income" |
|
min | string | List of columns to get min value | "min"=>"income" |
|
max | string | List of columns to get max value | "avg"=>"max" |
|
sort | bool | true | Whether to sort data in grouped columns | "sort"=>false |
The `Pivot` process has been move to the Pivot package. Pivot package contains both Pivot process to analyze data as well as Pivot widget to let user navigate data summarization.
Pivot
process helps to summarize data in multi dimensions.
If you have familiar with Excel Pivot table then Pivot
produce results in the same way. However, Pivot
process
is more advanced in the way that it does not stop at two dimensions
rather it can summarize data in three or more dimensions.
<?php
use use \koolreport\pivot\processes\Pivot;
class MyReport extends \koolreport\KoolReport
{
public function setup()
{
...
->pipe(new Pivot(array(
"dimensions"=>array(
"row"=>"customerName",
"column"=>"productName",
),
"aggregates"=>array(
"sum"=>"dollar_sales"
)
)))
...
}
}
Creating new data process is very simple. You follow this below template:
<?php
class MyProcess extends \koolreport\core\Process
{
public function onInit()
{
// Get called when the process is initiated.
// You may get the params setting from $this->params
}
public function onStartInput()
{
// Get called when the previous process prepared to send your process data.
// You may know which process send this start input through $this->streamingSource
}
public function onInput($data)
{
// $data is the associate array in format array("customerName"=>"John Doe","amount"=>500)
// Get the streaming source through $this->streamingSource
}
public function onEndInput()
{
// Get called when previous process ended input
// You may know which process end by $this->streamingSource
}
}