|Applies To||RSA Product Set: RSA Identity Governance & Lifecycle|
RSA Version/Condition: All
|Issue||Many customers provide comma separated value (CSV) files for collection of their identity, account, entitlement, and role data. RSA Identity Governance & Lifecycle leverages a type 4 JDBC driver called HXTT Text (CSV) Pure Java JDBC Driver to be able to collect data from CSV files as if the data in the files were in database tables. This has proven to be a very powerful tool for manipulating data coming from a CSV file or even joining data across multiple CSV files. However, there have been issues using this driver to collect data due to some of the SQL issues associated with using a SQL language to access data in a text file vs. a true relational database. Some issues are due to the fact that a text file does not have metadata around the columns of data within the file so declared datatypes for the data are needed. There are also issues around performance; particularly when joining across files or doing complex SQL due to the fact that a CSV file is not a relational database and is not going to perform as one.|
It is highly recommended to test queries in collectors against the data source before running collectors. When the data source is a relational database such as Oracle or Microsoft SQL Server, the collector SQL queries can be testing using a tool such as SQL*Plus or SQL Developer. When the collector is an LDAP collector, an LDAP browser can be used to test the LDAP filters in the collector definitions. When using a text-based CSV database driver, one way to test the queries in the collector definition is to use a tool that supports using any JDBC type 4 driver to query the data source (CSV file.) Another option, described in this RSA Knowledge Base Article, is to use the HXTT Text (CSV) Pure Java JDBC Driver test tool built into the JDBC driver.
In this RSA Knowledge Base Article, we are referring to a trial version of the HXTT Text (CSV) Pure Java JDBC Driver. This version has limited functionality but is useful for testing SQL queries from a desktop computer. For supported SQL syntax documentation, license restrictions, or information on fully functional licensed versions, please refer to the official HXTT site.
Requirements to run the HXTT testing tool
The testing tool is simply a Java program with a user interface for typing in SQL queries and viewing the data results of that query obtained using the HXTT driver. To run the the tool you will need:
Setting up and using the HXTT testing tool
NOTE: This example is on Windows.
means that the driver will look in the /home/oracle/database/SampleData/Demo directory for any CSV files.
On Windows, set the URL based on a local directory where the CSV files live. For example, if using C:\temp as in this example, the JDBC URL would look like:
In either case, be sure to set the _CSV_Header param to true or false depending on whether the CSV files have a header row in them or not.
|Notes||If you are getting a parse error similar to the error below and believe you are using valid SQL syntax according to the HXTT documentation, it is suggested to look at the SQL statement in a HEX editor to look for invalid characters.|
There are additional URL parameters available to specify different CSV file properties such as a different column separator other than a comma (_CSV_Separator). Refer to the Advanced Programming documentation from HXTT.