Parsing content that contains a lot of taboo characters
Occasionally, we run across flat files that are essentially logs in an XML- or HTML-like format. These contain a LOT of those nasty < and > characters, which we know we can't code up literally because the brackets also mark variable names. Here's a snippet of such a line I was given recently:
<state>TX</state><city>Plano</city>
The goal was to parse this so that the string "TX" is read into a variable named "state" and the string "Plano" is read into a second variable named "city". Everything else is to be treated as fixed text. So, how would you code that up in the content parameter?
Here's the solution:
<<state><state><</state><<city><city><</city>
In the original post, the fixed text was highlighted in blue and the variables in green. Without the colors: <<state>, <</state>, <<city>, and <</city> are the fixed text, while <state> and <city> are the actual variables.
The key is to "escape" the left bracket (<) characters by doubling each one you want treated as fixed text; a single < starts a variable name. Refer to your UDS class slides for the details on escaping the < and > brackets.
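To see the doubling rule in action outside the tool, here is a minimal Python sketch that turns the content string above into a regular expression and applies it to the sample line. This is not how UDS itself is implemented; the function names template_to_regex and parse are hypothetical, and the rules are only the ones inferred from this post ("<<" is a literal "<", "<name>" is a variable, and a lone ">" is fixed text). Any additional escaping of ">" covered in the class slides is not modeled.

```python
import re


def template_to_regex(template):
    """Translate a content-style template into a compiled regex.

    Assumed rules (inferred from the post, not from product docs):
      "<<"     -> a literal "<"
      "<name>" -> a variable called name
      anything else (including a lone ">") is fixed text
    """
    parts = []
    i = 0
    while i < len(template):
        if template.startswith("<<", i):
            # Doubled bracket: one literal "<" in the input line.
            parts.append(re.escape("<"))
            i += 2
        elif template[i] == "<":
            # Single bracket: everything up to the next ">" is a variable name.
            # str.index raises ValueError if the closing ">" is missing.
            end = template.index(">", i)
            name = template[i + 1:end]
            # Names must be valid Python identifiers to be used as group names.
            parts.append("(?P<%s>.*?)" % name)
            i = end + 1
        else:
            parts.append(re.escape(template[i]))
            i += 1
    return re.compile("".join(parts))


def parse(template, line):
    """Return a dict of variable values, or None if the line does not match."""
    match = template_to_regex(template).fullmatch(line)
    return match.groupdict() if match else None


content = "<<state><state><</state><<city><city><</city>"
line = "<state>TX</state><city>Plano</city>"
print(parse(content, line))
```

For the sample line, the returned dictionary maps "state" to "TX" and "city" to "Plano". A line that does not follow the fixed text exactly gives None.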
