000035852 - How to Restore Missing Hourly Data in RSA Web Threat Detection

Document created by RSA Customer Support Employee on Jul 12, 2018
Version 1

Article Content

Article Number: 000035852
Applies To: RSA Product Set: Web Threat Detection
RSA Product/Service Type: Forensics
RSA Version/Condition: 5.1 - 6.2
Issue: A known or unknown issue has caused the blue bars in the Forensics User Interface (FUI) to disappear. How can I determine the best way to resolve this?

Example customer issue: Our WTD server that hosts r2b2 hung and was restarted several hours later. The mitigator and all other services were working as expected, and we continued to receive mitigator alerts/output during that time, so we are sure that the application was processing data. However, because the PRC server was hung, it doesn't look like those hourly files were processed, so I can't access the data in the UI. Can Customer Support help me determine a way to rerun those hours, or confirm whether the data is simply lost?

Additional comments: 
I had notes in the past instructing us to move files from /var/opt/silvertail/data/tasks/r2b2/failed to /var/opt/silvertail/data/tasks/organizer. However, I don't see the files in question in the ./failed directory, and we couldn't find them in the ./completed directory either. I am looking for next steps in trying to restore the data to the UI, if possible.
Tasks: Please read the information in the Notes section for background on the directories and the data flow within the /var/opt/silvertail/data directory.
Resolution: If something has interrupted the system data flow from Organizer -> Indexer -> Report Builder (which create logs, indices and reports), do the following:
  1. First, do some checks. Make sure the disk space on the system is adequate with df -h, and go to Varz to follow the data flow and confirm that the Organizer is creating the shards.
  2. Next, identify whether the shards in /var/opt/silvertail/data/logs are present for the time of the missing data. If they are not, nothing more can be done.

(Establish the time window of the missing data, in UTC.)
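
Once the window is known, it helps to list the task file names that should exist for it. The following is a minimal sketch, assuming the YYYY-MM-DD.HH.task naming convention described later in this article and a date command that supports -d (GNU coreutils). The function name task_names_for_window is hypothetical and not part of the product.

```shell
# List the task file name for every hour in a UTC window, inclusive.
# $1, $2: start and end of the window as "YYYY-MM-DD HH", interpreted as UTC.
# Assumes date(1) supports -d, as GNU coreutils does.
task_names_for_window() (
    start=$(date -u -d "$1:00" +%s) || exit 1
    end=$(date -u -d "$2:00" +%s) || exit 1
    t=$start
    while [ "$t" -le "$end" ]; do
        # One name per hour, in the article's YYYY-MM-DD.HH.task form.
        date -u -d "@$t" +%Y-%m-%d.%H.task
        t=$((t + 3600))
    done
)

# Example: task_names_for_window "2017-10-26 22" "2017-10-27 01"
```

The output can be used as a checklist while inventorying shards, tasks and reports for each hour.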

A.  The Organizer starts the log files at the top of the hour and completes them at the top of the next hour, thereby committing the files. If that process is interrupted, you may find incomplete or nearly empty files, and these cannot be recovered either. In some situations the reports were not processed (they do not show up in the FUI), but the logs directory contains the log files and has 256 shards. If the shards are a good size, approximately 200 MB each, but no task files are seen, it is possible to create the task files manually. That is, if the shards in /var/opt/silvertail/data/logs are present, the data is present in the system and is just not displaying in the Forensics UI.
  • In this case, find the task files for the equivalent hour (task files are named with the date and hour in UTC). Depending on where they are, it may be possible to move them manually into the completed folder that the failed process polls.
  • Task files will be located in either a completed or a failed directory. Run a find command on .task to see where they were left behind. Only a handful should exist in any folder, as they are created only once an hour. The following approach works well; try the various folders under tasks (e.g. /organizer/failed, /organizer/completed), and see the find man page for variations:

find /var/opt/silvertail/data/tasks/indexer/failed -name '*.task' | sort -n
(Look in each of indexer and r2b2.)


Or search the failed directory of every stage at once:

find /var/opt/silvertail/data/tasks/*/failed -name '*.task' | sort -n
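
When only one hour is of interest, the same search can be narrowed to that hour's task file and widened to every directory under tasks, so that completed, failed and any other folders are covered in one pass. This is a sketch only; find_task_for_hour is a hypothetical helper name.

```shell
# Print every task file for one UTC hour found anywhere under a tasks root.
# $1: tasks root (this article uses /var/opt/silvertail/data/tasks)
# $2: hour in the task naming convention, e.g. 2017-10-26.03
find_task_for_hour() {
    # The pattern is quoted so the shell does not expand it before find sees it.
    find "$1" -type f -name "$2.task" | sort
}

# Example: find_task_for_hour /var/opt/silvertail/data/tasks 2017-10-26.03
```

The directory in which the file turns up (for example indexer/failed) indicates which stage stopped processing that hour.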


The following command is better suited to checking the log files. It shows the disk usage of each directory directly under the 2017 logs:


du -h /var/opt/silvertail/data/logs/2017 --max-depth=1  
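
The shard check described above (256 shards of a good size) can also be scripted. The sketch below counts the files in one directory and reports any that fall under a size threshold. The layout below the logs directory is not described in this article, so the directory is passed in explicitly. The function name check_shards and the threshold value are assumptions to adjust for your site.

```shell
# Count the files in one hour's shard directory and flag undersized ones.
# $1: directory holding the shards for the hour (pass it explicitly)
# $2: expected shard count (this article mentions 256)
# $3: minimum plausible size in bytes (hypothetical threshold)
check_shards() (
    dir=$1 expected=$2 min_bytes=$3
    count=$(find "$dir" -type f | wc -l | tr -d ' ')
    # -size -Nc matches files smaller than N bytes.
    small=$(find "$dir" -type f -size -"${min_bytes}"c | wc -l | tr -d ' ')
    echo "files=$count expected=$expected undersized=$small"
    # Succeed only if the count matches and nothing is undersized.
    [ "$count" -eq "$expected" ] && [ "$small" -eq 0 ]
)

# Example (the directory argument is a placeholder):
#   check_shards /path/to/shards/for/the/hour 256 1048576
```

A non-zero exit status means the hour is missing shards or contains nearly empty ones, in which case the data for that hour may not be recoverable.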

B. Also review the messages file for the cause of the issue in these components. Syslog is the guide to what was failing and to which directory the task files should be moved.
  1. Since there is a lot more going on in Report Builder, there is more of a chance that it can fail, in which case you would see Report Builder failures. Here you can go into the reports and see whether there is partial information or none.
  2. Once the window of missing data has been determined and the contents of that window have been inventoried and listed, recovery can start:

  • Shards are present or absent >> missing shards mean the data is not recoverable (you can only work with shards that exist).
  • Reports are present or absent >> if reports are incomplete and the tasks are found, add the tasks to the indexer/completed directory.
  • Tasks are present or absent >> tasks are identified by the particular hour that matches a shard (naming convention YYYY-MM-DD.HH.task, e.g. 2017-10-26.03.task).
  • Process the task files, or create a missing task file (when the shard is present). The next steps are:
    • For tasks that failed at the Indexer: the Indexer looks at the organizer completed folder to pick up new tasks, so place the tasks in the organizer completed directory to start processing.
      • System logs show: indexer[120633]: [info] global Waiting for task file in directory /var/opt/silvertail/data/tasks/global/organizer/completed/
    • For tasks that failed at r2b2: r2b2 looks at the indexer completed folder for new tasks, so place the existing task, or a newly created one (when the shard is present), in the indexer completed directory.
      • System logs show: r2b2[82828]: [info] tenant Waiting for task file in directory /var/opt/silvertail/data/tasks/tenant/indexer/completed/
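
The two recovery actions above, moving an existing task file and creating a missing one, can be sketched as follows. This is an illustration under stated assumptions rather than a supported tool. The function names requeue_task and create_task are hypothetical. The empty-file approach relies on the statement in the Notes that task files are empty placeholders. The directory listing in the Notes shows the files owned by rsawtd, so running as that user is assumed to be appropriate.

```shell
# Move an existing task file into the directory that the next stage polls.
# Refuses to overwrite a task that is already queued there.
# $1: path to the task file, $2: destination directory
requeue_task() (
    task=$1 dest_dir=$2
    [ -f "$task" ] || { echo "no such task file: $task" >&2; exit 1; }
    [ -d "$dest_dir" ] || { echo "no such directory: $dest_dir" >&2; exit 1; }
    name=$(basename "$task")
    [ ! -e "$dest_dir/$name" ] || { echo "already queued: $dest_dir/$name" >&2; exit 1; }
    mv "$task" "$dest_dir/$name"
)

# Create an empty task file for an hour whose shards exist but whose task is missing.
# $1: directory to create it in, $2: hour such as 2017-10-26.03
create_task() (
    dir=$1 hour=$2
    case $hour in
        [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9].[0-9][0-9]) ;;
        *) echo "hour must look like YYYY-MM-DD.HH: $hour" >&2; exit 1 ;;
    esac
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; exit 1; }
    [ ! -e "$dir/$hour.task" ] || { echo "task already exists: $dir/$hour.task" >&2; exit 1; }
    : > "$dir/$hour.task"
)
```

Take the destination directory from the "Waiting for task file in directory" syslog lines on your own system, since the tenant component of the path (global, tenant) differs between installations.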
Notes: The data directory (/var/opt/silvertail/data) contains the following directories. The annotated directories are the ones used to store data:
[root@wtd data]# ll
drwxr-xr-x 2 rsawtd rsawtd   4096 Oct 20 14:00 alerts
drwxr-xr-x 2 rsawtd rsawtd   4096 Sep 21 20:26 audit
-rw-r--r-- 1 rsawtd rsawtd 603983 May 26 15:54 autotune.conf
drwxr-xr-x 5 rsawtd rsawtd   4096 Jan  8  2016 cassandra
drwxr-xr-x 2 rsawtd rsawtd   4096 Jan  8  2016 edsserver
drwxr-xr-x 2 rsawtd rsawtd   4096 Oct 20 14:19 guiduser
drwxr-xr-x 4 rsawtd rsawtd   4096 Jan  1  2017 logs     >> all the shard files are in logs
drwxr-xr-x 2 rsawtd rsawtd   4096 Sep 20 22:10 mitregisters
drwxr-xr-x 4 rsawtd rsawtd   4096 Jan  1  2017 reports  >> the end of the process: multiple files for FUI display of data
drwxr-xr-x 2 rsawtd rsawtd   4096 Jun 25  2016 snapshot
drwxr-xr-x 6 rsawtd rsawtd   4096 Jan  8  2016 tasks    >> task files control the hourly processing of data
The Organizer also goes to Back Plex. It starts log files at the start of the hour and commits them at the top of the hour. It creates logs, which are the shards sorted by IP, and also tasks (which are just empty files that act as placeholders for the data processing steps).
The order of processing the data, as represented by the task files, is:
ORGANIZER >> INDEXER >> R2B2 (Report Builder)
Tasks are seen in these directories:
drwxr-xr-x 2 rsawtd rsawtd 380928 Oct 20 14:00 completed  >> completed tasks... 
drwxr-xr-x 5 rsawtd rsawtd   4096 Jan  8  2016 indexer/completed    >> r2b2 polls for any files entering here and processes them to create the reports that show up in the FUI
drwxr-xr-x 3 rsawtd rsawtd   4096 Jan  8  2016 organizer/completed  >> Indexer polls for any files entering here and processes them to present to R2B2 (Report Builder)
drwxr-xr-x 4 rsawtd rsawtd   4096 Jan  8  2016 r2b2/failed and in_process
The Indexer puts tasks in its own completed folder, from where r2b2 picks them up and generates the reports (the blue bars).
For example, the Indexer looks at the organizer completed folder to pick up new tasks:
  • indexer[120633]: [info] global Waiting for task file in directory /var/opt/silvertail/data/tasks/global/organizer/completed/
r2b2 does the same; it looks at the indexer completed folder for new tasks:
  • r2b2[82828]: [info] tenant Waiting for task file in directory /var/opt/silvertail/data/tasks/tenant/indexer/completed/
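
To confirm which directory each component is polling on a particular system, the "Waiting for task file" lines can be pulled out of syslog. The sketch below assumes the log lines have the shape shown in the two examples above. The function name polled_task_dirs is hypothetical, and the location of the messages file is site-specific, so it is passed as an argument.

```shell
# Print "component directory" for each distinct "Waiting for task file" line.
# $1: syslog file to scan (for example the messages file)
polled_task_dirs() {
    sed -nE 's/.*(indexer|r2b2)\[[0-9]+\]: .*Waiting for task file in directory (.*)$/\1 \2/p' "$1" | sort -u
}

# Example: polled_task_dirs /path/to/messages
```

Each output line pairs a component with the completed folder it reads from, which is the folder a task file needs to be placed in for that component to process it.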
The Reports directory holds the processed data used to populate the FUI. More than ten types of data are present as .json and .gz files, for example rules and scores.
If an issue prevented the FUI from displaying blue bars for a period of time, it is because something prevented the Reports files from being created.