We've been prepping to upgrade from 10.6.2 to 10.6.4 and reviewing the backup scripts and docs provided by RSA.
We also had an opportunity to test them out by migrating a decoder from SD to HDD [we used v2 despite the lack of official certification for that. Core isn't very complicated]
I'd like to share our experience to raise awareness of the improvements and fixes in the newer version of the scripts, and to put pressure on the RSA product team to certify the v2 script for 10.6.2.
First off - the scripts are quite nice:
- You can centrally get an appliance list from the head unit, propagate SSH keys, then remotely back up your appliances to your head unit [generally streaming via tar over ssh].
- They also exclude the common locations with 'long backup times' [malware repo for files, mnesia db stats bits on log collector, run reports on RE]. You can re-include those with optional args.
- In v2 you can also move the backups off to an external mountpoint automagically, and some other things.
I suppose RSA is prepping for the v11 upgrade - hence the -U switch [we'd have to buildstick everything for 11, I hear]
- v2 provides better instructions on migrating custom users/groups etc.
- v2 has disk space check options.
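For anyone stuck on v1 in the meantime, a disk space check before kicking off a backup is easy enough to do yourself. This is only a rough sketch of the idea, not how the RSA script does it: the function names, the 10% reserve and the target path are all my own assumptions.

```python
import shutil


def enough_space(free_bytes: int, total_bytes: int, required_bytes: int,
                 reserve_fraction: float = 0.10) -> bool:
    """Return True if the backup fits while leaving a reserve of the volume free.

    reserve_fraction is a hypothetical safety margin, not an RSA setting.
    """
    reserve = int(total_bytes * reserve_fraction)
    return free_bytes - reserve >= required_bytes


def can_back_up_to(target_dir: str, required_bytes: int,
                   reserve_fraction: float = 0.10) -> bool:
    """Check the volume that holds target_dir before writing a backup to it."""
    usage = shutil.disk_usage(target_dir)
    return enough_space(usage.free, usage.total, required_bytes, reserve_fraction)


if __name__ == "__main__":
    # Hypothetical example: do we have room for a 5 GiB backup in the current directory?
    print(can_back_up_to(".", 5 * 1024 ** 3))
```

Point `target_dir` at wherever the backups land on the head unit and size `required_bytes` from a previous backup run.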
1) Some of the things we really did not enjoy with the scripts:
- The script looks to have been substantially rewritten but there is no change log or known issues list for v1 vs v2.
Support seem to be blindly recommending v1 without actually testing it or having experience running it:
i) v1 script writes the ESA backup to a small partition using mongodump - potentially a non-restorable ESA - fixed in v2. Maybe we'd pick up mongodump failing and the partition running over via health policy... if it works.
ii) v1 script does not back up the postgres DB on the malware server (that doesn't sound restorable).
iii) No guidance on validating output - although there is a log, and v2 also does checksums.
iv) Both scripts don't back up the /etc/netwitness/ng/Geo* feed dat files [domain/org/country/etc meta won’t get tagged - bad] - no mention in the PDFs. A useful reference is 000035021 - How-to Update the geoIP Databases on RSA NetWitness decoders [well, you can get it out of the RPMs, but it's curious to see it's not a feed. Feed redist negotiation with Maxmind failed, I suppose?]
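Until the scripts cover the Geo* files, you can grab them yourself and record a checksum so you have something to validate against at restore time. A minimal sketch, assuming the files live under /etc/netwitness/ng/ as above; the function names and the archive path are made up for illustration and have nothing to do with the RSA scripts.

```python
import glob
import hashlib
import tarfile


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives don't have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_matching(pattern: str, archive_path: str) -> list:
    """Write every file matching the glob pattern into a gzip'd tar archive."""
    files = sorted(glob.glob(pattern))
    with tarfile.open(archive_path, "w:gz") as tar:
        for name in files:
            tar.add(name)
    return files


def matches_checksum(archive_path: str, expected_hex: str) -> bool:
    """Re-hash the archive and compare against the checksum recorded at backup time."""
    return sha256_of(archive_path) == expected_hex


if __name__ == "__main__":
    # Hypothetical archive location; pick somewhere with room on your own box.
    archive = "/var/tmp/geo-feeds.tar.gz"
    saved = archive_matching("/etc/netwitness/ng/Geo*", archive)
    print(len(saved), "files archived, sha256:", sha256_of(archive))
```

Keep the printed checksum alongside the archive and run `matches_checksum` after copying it anywhere, same idea as the checksums v2 writes.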
2) Build stick doc quality was suspect.
- Some KBs we were referred to were very vague. The most useful one was 000029977 - Instructions for build sticking an RSA Security Analytics appliance using the "SA 10.4.0.2B" image.
- HW compatibility issues were not well documented [plus the variety of r610-20-30 hardware is not well documented in terms of BIOS options either. Especially r610 - e.g. hiding PERC8xx PCI vs unplugging DAC].
- Cold run testing was very unclear [test your build stick... before wiping the RAID config. We've had an issue where the boot menu comes up ok, but then it can't find the KS scripts and sees no USB, so changing SDA/SDB/SDC doesn't work - a different brand worked ok]. We have provided some suggestions on improving documentation and removing some older, less helpful docs.
- Support teams in some regions have never touched hw appliances, and RSA keep trying to push back to professional services for backup/restore/rebuild [mmm come on...]
- Realistically, I take it the v1 script has been well tested for core appliances and less can go wrong with non-core, but I personally would be a lot less comfortable running v1 on real servers vs v2.
So yes, if you're on <= 10.6.2 - stand up for your rights - put some pressure on your account manager so they have a chat with the product team. Oh, and test your buildsticks:
- Get RSA to back-certify the v2 scripts. Don't accept extra operational risk from RSA by running v1. Backup and restore should be easy.
- Get RSA to publish a diffs list for the scripts (and a fix list).
- Push RSA to document and fix build stick processes and improve documentation - this shouldn't be a painful, difficult process.
- Get some test servers to play with. Virtual?
Thanks for listening.