EQuIS Data Processor
Copyright © 2020 • Modified: 18 Jun 2020
Introduction to the EQuIS Data Processor
Once data are collected and analyzed, the EQuIS Data Processor (EDP) is the EQuIS workflow component used to check and process them. EDP is designed to keep data quality management simple while giving users flexible data checking. Data are checked and submitted to the EQuIS database via EDP using an Electronic Data Deliverable (EDD), an electronic tabular format for sharing, manipulating and using data.
EDP is available through either a desktop or a web user interface. There are two primary desktop modes for using EDP:
•EDP Standalone – Data providers check EDDs prior to submitting those EDDs to their client.
•Professional EDP – EQuIS power users check EDDs for data quality and then import the data into the EQuIS database.
Both EDP Standalone and Professional EDP let the user check EDDs for formatting, valid values and logic before the data are accepted into EQuIS. Problematic EDDs are rejected and flagged for specific errors. EDD tables with errors are denoted in red, and individual fields with errors are highlighted in different colors that signify the error type and make the errors easier to correct.
There is a separate web mode for EDP:
•Enterprise EDP – Performs the same checks as the other EDP modes, but the entire process may be automated. Enterprise EDP provides an automated workflow that receives EDDs (via FTP, email, or web) and processes them. Acceptance and/or rejection notifications are sent automatically.
Electronic Data Deliverable (EDD) – EDDs are electronic tabular formats for sharing, manipulating and using data. Microsoft Excel spreadsheets and tab-delimited text files are examples of EDDs.
EQuIS Format Files – The EQuIS format file is the essence of data checking in EDP and EQuIS. Structured in XML, the EQuIS format file set contains the definitions and restrictions for each individual field in available data tables. The format files control data checks, such as range checking, reference values, formatting and enumerations. Data providers may submit data to EQuIS using one of the standard format files (e.g., EFWEDD, EQEDD, EZEDD, etc.) or a custom format file.
Package – The package is the EDD data that have been converted into the EQuIS data structure (tables and fields) before they are committed to EQuIS.
Create – The process in which data in the EDD is mapped from the structure of the format to the structure of the EQuIS database and the package is created. The data are compared to the rules of the EQuIS database and each record in the newly created package is assigned an EBatch number.
EBatch – The EDD Batch, or EBatch, number is how EQuIS tracks data when it is uploaded via EDP. As each EDD is loaded into the EQuIS database, a unique EBatch number is assigned based on file name, username, date and time, machine ID, etc. Every record created by the EDD contains that EBatch number and can be traced back to the EDD from which it was loaded.
Commit – The process in which created data are compared to data already in EQuIS via the primary keys. The selected commit type determines how the data are imported into EQuIS.
Rollback – The process in which data may be removed from the EQuIS database, based on the EBatch of the data. Rollback will not permit parent-child relationships to be broken. Rollback will not return a record to a previous state; the entire record will be removed from the EQuIS database.
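The Create, Commit and Rollback terms above can be pictured with a small Python sketch. This is a toy in-memory model, not the EQuIS implementation: the class name, the simple counter used for the EBatch number, the "parent" field, and the overwrite flag (standing in for an unspecified commit type) are all assumptions made for illustration.

```python
import itertools


class ToyDatabase:
    """In-memory stand-in for one table; hypothetical, not the EQuIS schema."""

    def __init__(self):
        self.records = {}  # primary key value -> record dict
        self._ebatch_counter = itertools.count(1)  # assumption: simple counter

    def create_package(self, edd_rows):
        """Create: copy EDD rows into a package and stamp each with one EBatch."""
        ebatch = next(self._ebatch_counter)
        package = [dict(row, ebatch=ebatch) for row in edd_rows]
        return package, ebatch

    def commit(self, package, key_field, overwrite=False):
        """Commit: compare package records to existing ones by primary key."""
        written = 0
        for record in package:
            key = record[key_field]
            if key in self.records and not overwrite:
                continue  # keep the existing record
            self.records[key] = record
            written += 1
        return written

    def rollback(self, ebatch):
        """Rollback: remove every record that carries the given EBatch."""
        doomed = {k for k, r in self.records.items() if r["ebatch"] == ebatch}
        for key, record in self.records.items():
            # Refuse to orphan a child record that belongs to another batch.
            if key not in doomed and record.get("parent") in doomed:
                raise ValueError(f"record {key!r} depends on EBatch {ebatch}")
        for key in doomed:
            del self.records[key]
        return len(doomed)


db = ToyDatabase()
package, ebatch = db.create_package([{"id": "S1", "value": 1}])
db.commit(package, key_field="id")
removed = db.rollback(ebatch)  # whole records are removed, not reverted
```

In this model a rollback deletes whole records by EBatch and never restores an earlier version, which mirrors the definition of Rollback given above.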
EQuIS EDD Conventions
The column headers in EQuIS EDD tables contain the data field names from the associated EQuIS database schema. The first row names the data in each column, and the second row indicates the data type and length. The second row also contains a tool tip with more detail on the column content. Both rows start with the "#" sign, which marks them as comment rows that do not contain data. These comment header rows are not uploaded into the EQuIS database; they are in the file only to provide context to the reader. For that reason, some EDDs are created without any header rows.
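As a rough illustration of the comment-row convention, the Python sketch below separates rows that begin with "#" from data rows in a tab-delimited EDD. The function name, field names, data types and sample values are illustrative assumptions and are not taken from a specific EQuIS format.

```python
import csv
import io


def read_edd_rows(text, delimiter="\t"):
    """Split EDD text into comment header rows and data rows."""
    headers, data = [], []
    for row in csv.reader(io.StringIO(text), delimiter=delimiter):
        if not row:
            continue  # ignore blank lines
        if row[0].startswith("#"):
            # Comment row: drop the leading "#" and keep it for context only.
            headers.append([row[0][1:]] + row[1:])
        else:
            data.append(row)
    return headers, data


# Hypothetical three-column EDD with two comment header rows and one record.
sample = (
    "#sample_code\tsample_date\tmatrix_code\n"
    "#Text(40)\tDateTime\tText(10)\n"
    "SAMPLE-01\t2015-08-10\tWS\n"
)
headers, data = read_edd_rows(sample)
```

Because the comment rows are handled separately, the same function also works on an EDD that has no header rows at all; `headers` is then simply empty.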
To help understand data connections and assist with data population, criteria for certain fields are denoted by the following conventions:
•Red and Underlined text indicates a Primary Key field.
•Red text indicates a required field.
•Blue text indicates a field linked to a Reference table (e.g., look-up or valid value).
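The three field criteria above can be sketched as simple checks in Python. This is a minimal illustration under stated assumptions, not how EDP or the format files implement checking: the `FieldRule` class, the field names, the reference values, and the choice to treat primary key fields as required and unique within an EDD are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FieldRule:
    name: str
    required: bool = False
    primary_key: bool = False
    reference_values: Optional[frozenset] = None  # None: no look-up restriction


def check_rows(rows, rules):
    """Return (row_index, field, message) tuples for every failed check."""
    errors = []
    key_fields = [rule.name for rule in rules if rule.primary_key]
    seen_keys = set()
    for index, row in enumerate(rows):
        for rule in rules:
            value = row.get(rule.name, "")
            if value == "":
                # Assumption: primary key fields are also required.
                if rule.required or rule.primary_key:
                    errors.append((index, rule.name, "required field is empty"))
            elif (rule.reference_values is not None
                  and value not in rule.reference_values):
                errors.append((index, rule.name, "value not in reference table"))
        if key_fields:
            key = tuple(row.get(name, "") for name in key_fields)
            if key in seen_keys:
                errors.append((index, ",".join(key_fields), "duplicate primary key"))
            seen_keys.add(key)
    return errors


# Hypothetical rules: one key field, one required look-up field, one free field.
rules = [
    FieldRule("sample_code", primary_key=True),
    FieldRule("matrix_code", required=True, reference_values=frozenset({"WS", "SO"})),
    FieldRule("comment"),
]
rows = [
    {"sample_code": "A", "matrix_code": "WS"},
    {"sample_code": "A", "matrix_code": "XX"},
    {"sample_code": "", "matrix_code": ""},
]
errors = check_rows(rows, rules)
```

Each error carries the row index and field name, which is the kind of information needed to highlight an individual field in an EDD table.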
Training Scenario Overview
We will be working with data collected for the Gold King Mine facility. Located approximately 10 miles north of Silverton, CO, the Gold King Mine Site was claimed in 1887 and was last active in the 1920s. Mining activities produced acid mine drainage that required monitoring to avoid contamination of the nearby Animas River.
In August 2015, approximately three million gallons of mine waste were accidentally released into Cement Creek, a tributary of the Animas River. The spill contained the known contaminants arsenic, cadmium, copper, lead, and aluminum, and it turned the river orange. Post-spill monitoring covers surface water, soil and sediment to measure the impact of the spill on the greater San Juan River.
One of our laboratories has completed its analysis of some samples from the Gold King Mine facility and has produced an EDD of the results from its Laboratory Information Management System (LIMS). We will help the laboratory submit the data successfully into the EQuIS database via EDP.