Reporting: Query Aggregates

Document created by RSA Information Design and Development on Sep 14, 2017Last modified by RSA Information Design and Development on Oct 15, 2017
Version 9Show Document
  • View in full screen mode

This section explains the supported aggregate functions.

Supported Aggregate Functions

The following table lists the supported Aggregate Functions.

                                                         
Aggregate FunctionDescriptionInput data typesOutput data types
countReturns the count of meta values, which includes duplicate values as well.NumericNumeric
countdistinctReturns the total number of distinct or unique values.NumericNumeric
distinctReturns all the unique values.AnyAny
firstReturns the first occurrence of the meta value.AnySame as input
lastReturns the last occurrence of the meta value.AnySame as input
sumReturns a sum of all non-NULL values of metaKey in a group.NumericNumeric
avg (Average)Returns the average value of all non-NULL values of the metaKey within a group.NumericNumeric
min (Minimum)Returns the minimum for all values of metaKey in each group. This value is based on order by field.AnyAny
max (Maximum)Returns the maximum for all values of metaKey in each group. The maximum value is the value that is returned by order by field.AnyAny
lengthReturns the length of the values of metakey. This is called a "scalar function" in SQL.AnyNumeric

Examples of Queries and Results per Function

Count

This function returns the number of values for a specified meta key, that exclude null values but include duplicate ones. .

Example

The following figure shows a sample query for count function used for the destination IP and the respective source IP.

Sample query for count function used for the destination IP and the respective source IP

The following figure shows the result for the above query.

Result of the count function used for the destination IP and the respective source IP

Here, for each unique ip.src (source IP), the page returns the total number or count of ip.dst (destination IP) values, which include the duplicate values as well.

Note: If your RSA NetWitness Suite is currently on 10.5 or newer version and any of the NetWitness Suite Core devices are on 10.3 or 10.4 versions, then some of the aggregate functions may display unexpected errors. However, aggregate functions such as sum() and count() are supported in 10.4 version.

Countdistinct

The countdistinct function returns the count of unique or distinct values for the metakey. In other words, countdistinct function can be used to retrieve a number of distinct values for the specified metakey.

The following figure shows a sample query where the countdistinct function is used along with IP source (ip.src) and data size(size).

Example

Sample query where the countdistinct function is used along with IP source (ip.src) and data size(size)

The following figure shows the result for the above query.
Result of  query where the countdistinct function is used along with IP source (ip.src) and data size(size)

Here, the page displays the data size along with the total number or count of distinct filenames from the respective IP source. Unlike the count function, the countdistinct excludes the duplicate values from the result.

Distinct

This function returns all the unique or distinct values of the metakey.

Example

The following figure shows a sample query for distinct function used to retrieve e-mails, between various source and destination IP (ip.dst).

Sample query for distinct function used to retrieve e-mails, between various source and destination IP (ip.dst)

The following figure shows the result for the above query.

Here, the page displays the list of unique e-mails that were exchanged between the respective IP source and destination.

First

This function is used to retrieve the first value from an ordered sequence of values for a specified metakey.

Example

The following figure shows a sample query for first function used to retrieve the first destination city name.

Sample query for first function used to retrieve the first destination city name

The following figure shows the result for the above query.

Result of query for first function used to retrieve the first destination city name

Here, the page displays the the first destination city for the corresponding source and destination IP. You can use the first function to isolate a particular value from a search result.

Last

This function is used to retrieve the last value from an ordered sequence of values for a specified metakey.

Example

The following figure shows a sample query for last function used to retrieve the most recent user name.

Sample query for last function used to retrieve the most recent user name

The following figure shows the result for the above query.

Result of query for last function used to retrieve the most recent user name

Here, the page displays the list of most recent or last usernames in full, that were exchanged between the source and destination IP.

Sum

This function returns the total of the non-NULL values of the metaKey within a group.

Example

The following figure shows the query for the Sum function used for packets.

Query for the Sum function used for packets

The following figure shows the result of the above query.

Result of query for the Sum function used for packets

Here the page displays the total or sum of the packets along with the size of the data for the respective destination country.

Avg

The average function returns the average of non-NULL values of the meta within a group.

Example

The following figure shows a sample query for average data size transmitted between a source and destination IP.

Sample query for average data size transmitted between a source and destination IP

The following figure shows the result for the above query.

Result of query for average data size transmitted between a source and destination IP

Here, the page displays the average size of data exchanged between source and destination IP:

Max and Min

Max and Min functions provide the maximum and minimum for given values of a meta respectively.

The following figure shows a sample query for max and min functions for various data sizes, for source IP and destination country.

Example

Sample query for max and min functions for various data sizes, for source IP and destination country

The following figure shows the result for the above query.

Result of query for max and min functions for various data sizes, for source IP and destination country

Here, the page displays the max(size) and min(size) columns, along with the list of source IP and destination country. The max(size) column lists the maximum data sizes exchanged while the min(size) column lists the minimum data sizes that were exchanged.

Filter aggregate meta results with Max_threshold

You can further filter the results of any function by using the threshold rule action.

Example
Following is a sample query for max_threshold used along with the Max function in the Then field is:
max_threshold(5000,max(size))

The following figure shows the Build Rule screen for the above query.

Sample query for max_threshold used along with the Max function in the Then field is: max_threshold(5000,max(size))

Here the max_threshold is applied for data size with an upper limit of 5000. The following figure shows the result.

Result of query for max_threshold used along with the Max function in the Then field is: max_threshold(5000,max(size))

Here, the result page displays the max(size) column, that lists the data sizes lesser than 5000 as this is the maximum threshold in the query, along with the corresponding IP source and the respective directory.

Filter aggregate meta results withMin_threshold

Similarly, min_threshold is used to filter the results for any function. A similar scenario as max_threshold is considered to explain this.

Example
Query for min_threshold used along with the Max function in the Then field is:
min_threshold(5000,max(size))

The following figure shows the Build Rule screen for the above query.

 Query for min_threshold used along with the Max function in the Then field is: min_threshold(5000,max(size))

Here the min_threshold is applied for data size with a lower limit of 5000. The following figure shows the result.

Result of  query for min_threshold used along with the Max function in the Then field is: min_threshold(5000,max(size))

Here, the result page displays the max(size) column, that lists the data sizes greater than 5000 as this is the minimum threshold in the query, along with the corresponding IP source and the respective directory.

Note: Max_threshold and Min_threshold rule actions are common across all the functions, and can be used along with the other queries in the Then field to retrieve the respective output.

Length

This function returns the length of a meta value. In other words, Length function returns the number of bytes used to store the actual value.
For instance, for the value "Analytics" it returns the length as 9. Similarly, for an IPv4 ip.src, it returns 4 (representing 4 bytes).

Example

The following figure shows a sample query for the length function used for usernames.

Sample query for the length function used for usernames

The following figure shows the result for the above query.

Result of query for the length function used for usernames

Here, the page displays the length of the usernames associated with the user account and their respective source IP.

Additional Information

When you query for aggregates (E.g. sum(size)) with Group By on a meta which has multiple values in a session, then the session with multiple values is accounted for aggregate calculation for each value of that meta.

Example

When you query for the Count aggregate function with Group By on Alias.host and if the column has multiple values in a session, then the session is counted for each occurrence, including the duplicate values.

Consider the following table.

                           
SessionIDAlias.hostIp.srcSize
1host-a, host-b, host-aa10
2host-b, host-c, host-a, host-cc20
3host-b, host-c, host-db30
4host-c, host-aa40

In the above table, alias.host for host-a and host-c has duplicate values listed for a single session. Let us consider the following query:

Select : alias.host, count(ip.src), sum(size)
Group By : alias.host

Here, host-a and host-c are present in 3 sessions and they are duplicated for two different sessions. However, the output is as shown below.

                      
Alias.hostcount(Ip.src)Sum (size)
host-a480
host-b360
host-c4110
host-d130

Output table shows that the count for host-a and host-c is 4. This is because for each alias.host value, the entire session is considered. Similarly to calculate sum (size), the same sessions are considered for each alias.host value.

In the report output if the number of rows has reached NWDB maximum aggregate rows defined in RE configuration, then a message Max Aggregate Row Limit Reached is displayed to indicate that there is more information to be displayed. The default limit is 1000, and you can change this value as per your requirement, in the Reporting Engine Configuration page .

Max Aggregate Row Limit Reached message

You are here
Table of Contents > Appendix > Query Aggregates

Attachments

    Outcomes