Davide Veneziano

Sample Contents: detect unused accounts with the SAW

Discussion created by Davide Veneziano Employee on Dec 16, 2014

This articles follows up another example posted here a few days ago (Sample Contents: detect uncommon user-agent with the SAW) and will be focused as well on how to use the Security Analytics Warehouse to generate advanced reports.


We are going to leverage once again the expert-mode and Hive as a query language to accomplish our objective that is detecting user accounts which have not been used for the last 7 days.


Also here the first step is to create an external table which is pointing to the location in our SAW which contains our data set, specified by the LOCATION directive containing the avro files (everything from November 2014).

We also need to define the schema, listing all the meta keys need for our query:

SET hive.mapred.supports.subdirectories=true;

SET mapred.input.dir.recursive=true;


CREATE EXTERNAL TABLE AllSessions ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat' LOCATION '/rsasoc/v1/sessions/data/2014/11' TBLPROPERTIES ('avro.schema.literal'=' { "type":"record", "name":"nextgen", "fields": [




{"name":"user_dst","type":["string","null"],"default":"null"} ] }');

We now want to create a temporary table storing for each username, the last time has been seen in our dataset.


CREATE TABLE Users (username string,lastSeen bigint);


Time now to populate the termporary table. To do so, we group our databa set by user_src and add for each username the greatest "time" meta (which is the time of the last event since it is a unix timestamp):

INSERT OVERWRITE TABLE Users SELECT user_src,MAX(time) from AllSessions GROUP BY user_src;


Then, if we want to extract all the accounts not used for the last 7 days, we just need to select from the temporary table the rows whose timestamp is less than the difference between now (UNIX_TIMESTAMP() returns the current timestamp in seconds) and the number of seconds elapsed in 7 days (7*24*3600):

SELECT username,FROM_UNIXTIME(lastSeen) FROM Users WHERE lastSeen < UNIX_TIMESTAMP()-7*24*3600;


Also this full sample has been attached below.


Disclaimer: please DO NOT consider what is described and attached to this post as RSA official content. As any other unofficial material, it has to be tested in a controlled environment first and impacts have to be evaluated carefullybefore being promoted in production. Also note the content I publish is usually intended to prove a concept rather than being fully working in any environments. As such, handle it with care.