Parsing content that contains a lot of taboo characters

Discussion created by RSA Admin Employee on Aug 29, 2008

Occasionally, we run across flat files that are essentially logs in an XML- or HTML-like format. These contain a LOT of those nasty < and > characters that we know we can't code up literally. Here's a snippet of such a line I was given recently:



The goal was to parse this line so that the string "TX" is read into a variable named "state" and the string "Plano" is read into a second variable named "city". Everything else is to be treated as fixed text. So, how would you code that up in the content parameter?


Here's the solution:



The parts highlighted in blue are the fixed text, and the parts highlighted in green are the actual variables. 


The key is to "escape" the left bracket (<) characters by doubling up the ones you want to treat as fixed text. Refer to your UDS class slides for the details on escaping the < and > brackets.
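
To make the doubling rule concrete, here is a rough Python sketch, not anything taken from the product itself. It assumes variables are written as <name> in the content string, and it only doubles the < characters, since that is the rule described above; anything special about > is left to the class slides. The sample line, its tag names, and the helper names (escape_fixed, build_content, to_regex) are all made up for illustration, and the small matcher is only a toy stand-in for the real parser so the result can be checked.

```python
import re

# Hypothetical log line, invented for this sketch (not the line from the post).
SAMPLE_LINE = "<loc><st>TX</st><town>Plano</town></loc>"

# The line split into fixed text and variables, in order.
PARTS = [
    ("fixed", "<loc><st>"),
    ("var", "state"),
    ("fixed", "</st><town>"),
    ("var", "city"),
    ("fixed", "</town></loc>"),
]


def escape_fixed(text):
    """Double every '<' so it is treated as fixed text, not a variable start."""
    return text.replace("<", "<<")


def build_content(parts):
    """Join the parts into one content string.

    Assumption: a variable is written as <name>. Fixed text gets its
    left brackets doubled; right brackets are left untouched.
    """
    out = []
    for kind, value in parts:
        if kind == "fixed":
            out.append(escape_fixed(value))
        else:
            out.append("<" + value + ">")
    return "".join(out)


def to_regex(content):
    """Toy interpreter for the content string (illustration only).

    '<<' becomes a literal '<', '<name>' becomes a named capture group,
    and every other character is matched literally.
    """
    pieces = []
    i = 0
    while i < len(content):
        if content.startswith("<<", i):
            pieces.append(re.escape("<"))
            i += 2
        elif content[i] == "<":
            end = content.index(">", i)
            pieces.append("(?P<%s>.+?)" % content[i + 1:end])
            i = end + 1
        else:
            pieces.append(re.escape(content[i]))
            i += 1
    return re.compile("^" + "".join(pieces) + "$")


content = build_content(PARTS)
match = to_regex(content).match(SAMPLE_LINE)

print(content)
print(match.group("state"), match.group("city"))
```

Notice that a single < survives only where a variable begins. Every left bracket that belongs to the fixed text shows up doubled in the content string.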

