Data Mining with Forms
QEST Platform 5.7 Documentation
Applies to QESTField Forms
This article describes how forms data is structured when it is saved in the QEST Platform database. It is intended as a starting point for extracting this data for reporting purposes.
Overview
Forms will typically contain two types of data:
- mapped data, which is tightly bound to the work order and report hierarchy in the QESTLab database
- unmapped data, which includes all of the entry fields used by the field technician when filling out the bulk of the report, and which usually does not correspond to existing QESTLab tests
When a form is uploaded, both types of data are recorded in the database in the DocumentExternal.XmlData field.
Field Types
Fields Defined in Adobe Acrobat
Consider a mapped form that contains a mix of data: some mapped from the QEST Platform database and some entered by the user in the field. In Adobe Acrobat, the field names are defined as shown below. Note some of the field names (in black) in the header:
- Contract No
- Date
- TIP Number Inspector
Fields Mapped in QEST Form Mapper
In the QEST Form Mapper, mappings are created for the header data, which is known to be available in the QEST Platform database. The fields at the bottom are left unmapped, as they are not known to QEST Platform and will be completed by the field technician. Note the mappings to the same fields listed before:
- Contract No - ID20002/ProjectCode
- Date - ID101/WorkDate
- TIP Number Inspector - ID101/PersonName
Fields Completed by the Field Operator
Once an instance of the form is created, some of the header information is populated from the QESTLab database, and the field technician enters data for the remaining parts. Neither the names of the form fields nor the mappings on those fields are visible to the user.
The field technician completes the rest of the fields, including:
- TIP Number
- High Temp
- Low Temp
- AM Conditions
- PM Conditions
XML data
Once the field technician uploads the form, the form is analysed and the data is extracted. This is divided into:
- mapped data, in the Data/QestData node
- unmapped data, in the Data/Raw node
Note that for each form there is only a single block of XML in the database, but the two parts will be considered separately in the sections below.
Mapped data
The data in the QestData node is hierarchical and corresponds to the structure of QESTLab documents. In this sense it most closely reflects the field mappings seen in the QEST Form Mapper.
<Data>
  <QestData>
    <ID20002>
      <ProjectCode>DSI</ProjectCode>
      <QestUUID>f775c498-cb94-4aa3-8954-a6ff012ce0fc</QestUUID>
    </ID20002>
    <ID101>
      <PersonName>Field Tech 1</PersonName>
      <WorkDate>11-May-18</WorkDate>
      <QestUUID>926d68bd-608c-415c-bf87-a8dd000acbec</QestUUID>
      <ID190001>
        <SignatureImage/>
        <QestUUID>19067c18-56c8-46de-bfa8-a8dd005b4308</QestUUID>
      </ID190001>
    </ID101>
  </QestData>
  <Raw>
    <!-- Redacted -->
  </Raw>
</Data>
In the above example the top node (ID101) is the work order, and the child node (ID190001) is the form itself.
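As a rough illustration of how the mapping paths relate to the QestData node, the sketch below resolves each Form Mapper path from this article against the sample XML using Python's standard xml.etree.ElementTree module. The function name read_mapped_values and the MAPPINGS dictionary are hypothetical helpers introduced for this example; they are not part of QEST Platform.

```python
import xml.etree.ElementTree as ET

# Sample XmlData, copied from the mapped-data example in this article.
XML_DATA = """
<Data>
  <QestData>
    <ID20002>
      <ProjectCode>DSI</ProjectCode>
      <QestUUID>f775c498-cb94-4aa3-8954-a6ff012ce0fc</QestUUID>
    </ID20002>
    <ID101>
      <PersonName>Field Tech 1</PersonName>
      <WorkDate>11-May-18</WorkDate>
      <QestUUID>926d68bd-608c-415c-bf87-a8dd000acbec</QestUUID>
      <ID190001>
        <SignatureImage/>
        <QestUUID>19067c18-56c8-46de-bfa8-a8dd005b4308</QestUUID>
      </ID190001>
    </ID101>
  </QestData>
  <Raw><!-- Redacted --></Raw>
</Data>
"""

# Mappings listed in this article: form field name -> path under Data/QestData.
MAPPINGS = {
    "Contract No": "ID20002/ProjectCode",
    "Date": "ID101/WorkDate",
    "TIP Number Inspector": "ID101/PersonName",
}


def read_mapped_values(xml_text, mappings):
    """Return {field name: text} for each mapping path under Data/QestData."""
    root = ET.fromstring(xml_text.strip())  # the root element is <Data>
    qest_data = root.find("QestData")
    values = {}
    for field_name, path in mappings.items():
        # findtext returns None when the path does not exist.
        values[field_name] = None if qest_data is None else qest_data.findtext(path)
    return values


mapped = read_mapped_values(XML_DATA, MAPPINGS)
print(mapped)
```

The values come back as plain strings, exactly as stored in the XML; no type conversion is attempted.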
Property names
The properties on each document will usually (but not always) correspond to a database field of the same name. For example, a ReportNo property corresponds to DocumentExternal.ReportNo in the database.
Date and numeric formats
At this time, the XML is WYSIWYG (what you see is what you get): it reflects exactly what the user entered, regardless of whether the corresponding database field is strongly typed (as a date, for instance). For example, suppose that a PDF form has two date fields:
- one has been configured to display as dd-mmm-yyyy, and a field technician sets a value of 12-May-2018
- one has been configured to display as mm/dd/yy, and a field technician sets a value of 08/30/18
In the resultant XML, these will not be converted back to a standardized date format, but rather will directly contain the text "12-May-2018" and "08/30/18". This imposes some limits on data mining, as in this scenario it's not possible to compare such dates without first using a specific format conversion.
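One way to make such values comparable is to convert each one with the format that its form field was configured to display. The sketch below assumes that the display format of every date field is known in advance; the FIELD_DATE_FORMATS table and the parse_form_date helper are hypothetical names introduced for this example.

```python
from datetime import datetime

# Assumption: the display format configured on each form field is known.
# Keys are the display formats named in this article; values are the
# equivalent Python strptime patterns.
FIELD_DATE_FORMATS = {
    "dd-mmm-yyyy": "%d-%b-%Y",
    "mm/dd/yy": "%m/%d/%y",
}


def parse_form_date(text, display_format):
    """Convert the raw text stored in the XML into a date object.

    Note: %b (abbreviated month name) is locale-dependent; this sketch
    assumes an English locale.
    """
    pattern = FIELD_DATE_FORMATS[display_format]
    return datetime.strptime(text, pattern).date()


first = parse_form_date("12-May-2018", "dd-mmm-yyyy")
second = parse_form_date("08/30/18", "mm/dd/yy")

# Once both values are real dates, they can be compared or sorted.
print(first.isoformat(), second.isoformat(), first < second)
```

A value that does not match its expected pattern raises ValueError, which is a useful signal when a form's display format has changed between versions.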
Identifiers
Every document will have a QestUUID, which uniquely identifies the document in the database, and can be used in conjunction with the qestReverseLookup table to traverse document relationships.
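As a minimal sketch, the identifiers can be collected by walking the QestData node and reading the QestUUID child of each document node. The helper name collect_document_uuids is hypothetical, and the layout of the qestReverseLookup table is not described here, so the example stops at extracting the identifiers.

```python
import xml.etree.ElementTree as ET

# Sample XmlData, copied from the mapped-data example in this article.
XML_DATA = """
<Data>
  <QestData>
    <ID20002>
      <ProjectCode>DSI</ProjectCode>
      <QestUUID>f775c498-cb94-4aa3-8954-a6ff012ce0fc</QestUUID>
    </ID20002>
    <ID101>
      <PersonName>Field Tech 1</PersonName>
      <WorkDate>11-May-18</WorkDate>
      <QestUUID>926d68bd-608c-415c-bf87-a8dd000acbec</QestUUID>
      <ID190001>
        <SignatureImage/>
        <QestUUID>19067c18-56c8-46de-bfa8-a8dd005b4308</QestUUID>
      </ID190001>
    </ID101>
  </QestData>
  <Raw><!-- Redacted --></Raw>
</Data>
"""


def collect_document_uuids(xml_text):
    """Return (node tag, QestUUID) pairs in document order."""
    root = ET.fromstring(xml_text.strip())
    qest_data = root.find("QestData")
    if qest_data is None:
        return []
    pairs = []
    for node in qest_data.iter():
        # Only a direct QestUUID child counts, so each document node
        # reports its own identifier rather than a descendant's.
        uuid = node.findtext("QestUUID")
        if uuid is not None:
            pairs.append((node.tag, uuid))
    return pairs


document_uuids = collect_document_uuids(XML_DATA)
for tag, uuid in document_uuids:
    print(tag, uuid)
```

A list of pairs is used rather than a dictionary so that two document nodes sharing a tag would not overwrite each other.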
Unmapped data
The data in the Raw node is flat and corresponds to the names of the form fields. In this sense it most closely reflects the fields defined in Adobe Acrobat. Note that the mapped fields are included in this block as well.
<Data>
  <QestData>
    <!-- redacted -->
  </QestData>
  <Raw>
    <ContractNo>DSI</ContractNo>
    <undefined>227261</undefined>
    <TIPNumberInspector>Field Tech 1</TIPNumberInspector>
    <Day/>
    <Date>11-May-18</Date>
    <HighTemp>70</HighTemp>
    <LowTemp>55</LowTemp>
    <AMConditions>Good</AMConditions>
    <PMConditions>Bad</PMConditions>
    <ItemsofWorkRow1/>
    <!--snip-->
  </Raw>
</Data>
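Because the Raw node is flat, it can be read into a simple name-to-value dictionary. The following is a sketch using Python's standard xml.etree.ElementTree module; the helper name read_raw_fields is hypothetical, and empty elements such as Day are returned as empty strings by choice.

```python
import xml.etree.ElementTree as ET

# Sample XmlData, copied from the unmapped-data example in this article.
XML_DATA = """
<Data>
  <QestData><!-- redacted --></QestData>
  <Raw>
    <ContractNo>DSI</ContractNo>
    <undefined>227261</undefined>
    <TIPNumberInspector>Field Tech 1</TIPNumberInspector>
    <Day/>
    <Date>11-May-18</Date>
    <HighTemp>70</HighTemp>
    <LowTemp>55</LowTemp>
    <AMConditions>Good</AMConditions>
    <PMConditions>Bad</PMConditions>
    <ItemsofWorkRow1/>
    <!--snip-->
  </Raw>
</Data>
"""


def read_raw_fields(xml_text):
    """Return {form field name: text} for every child of Data/Raw."""
    root = ET.fromstring(xml_text.strip())
    raw = root.find("Raw")
    if raw is None:
        return {}
    # Empty elements have text None; normalise them to "".
    return {child.tag: (child.text or "") for child in raw}


raw_fields = read_raw_fields(XML_DATA)
print(raw_fields["ContractNo"], raw_fields["HighTemp"], raw_fields["LowTemp"])

# Values are stored as text, so numeric fields need explicit conversion.
temperature_range = int(raw_fields["HighTemp"]) - int(raw_fields["LowTemp"])
print(temperature_range)
```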
Property names
The properties correspond to the names of the fields that the administrator configured in e.g. Adobe Acrobat, minus the spaces.
- by coincidence only, properties such as ReportNo and ClientName correspond to the same field names in the QestData node (at Data/QestData/ID101/ID190010)
- other properties such as ClientStreet1 are named differently to the corresponding field Street in the QestData node (at Data/QestData/ID20001)
Date and numeric formats
The same limitations apply as with mapped data.
Fields with invalid characters
There are reasonably strict naming rules for XML which, if broken, will render the XML invalid and make it impossible to parse. The naming conventions of the PDF form fields are generally less strict, so an administrator may make form fields which need to be sanitized in the raw XML. For example:
- the field 90 Degrees would appear in the XML as <_x0039_0_x0020_Degrees>
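When mining the Raw node, sanitized names can be turned back into the original field names. The sketch below assumes that every escaped character takes the form _xHHHH_ (four hexadecimal digits), as in the example above. This resembles the convention used by .NET's XmlConvert.EncodeName, but this article does not confirm which encoder QEST Platform uses, so treat the pattern as an assumption. The helper name decode_field_name is hypothetical.

```python
import re

# Assumed escape pattern: _xHHHH_ where HHHH is a hexadecimal code point.
_ESCAPE = re.compile(r"_x([0-9A-Fa-f]{4})_")


def decode_field_name(xml_name):
    """Replace each _xHHHH_ escape with the character it encodes."""
    return _ESCAPE.sub(lambda match: chr(int(match.group(1), 16)), xml_name)


decoded = decode_field_name("_x0039_0_x0020_Degrees")
print(decoded)  # 90 Degrees
```

Names that contain no escapes pass through unchanged, so the function can be applied to every element name in the Raw node.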
If it is necessary to ensure that the original field names are preserved, make certain that the field names entered in Adobe Acrobat conform to the same rules as for XML naming. Namely:
- Element names must start with a letter or underscore
- Element names cannot start with the letters xml (or XML, or Xml, etc)
- Element names can contain letters, digits, hyphens, underscores, and periods
- Element names cannot contain spaces
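The rules above can be checked before a form is published. The following sketch is a simplified validator: it accepts only ASCII letters, although XML itself permits a wider range of letters, and the function name is_xml_safe_field_name is hypothetical.

```python
import re

# First character: letter or underscore.
# Remaining characters: letters, digits, hyphens, underscores, periods.
# Simplification: only ASCII letters are accepted here.
_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def is_xml_safe_field_name(name):
    """Return True if the field name can be used as an XML element name as-is."""
    if name.lower().startswith("xml"):
        return False  # names may not start with xml in any letter case
    return _NAME.fullmatch(name) is not None


for candidate in ["HighTemp", "90 Degrees", "High Temp", "xmlData", "Low_Temp-1.a"]:
    print(candidate, is_xml_safe_field_name(candidate))
```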
Versioning of XML data
Multiple versions of the XmlData are not currently retained. The data in the table will always reflect that of the most recently uploaded version of the form.
Products described on these pages, including but not limited to QESTLab®, QESTNet, QESTField, QEST Web App, Construction Hive, and associated products are Trademarks (™) of Spectra QEST Australia Pty Ltd and/or related companies.
The content of this page is confidential. Do not share, duplicate or distribute without permission.
© 2024 Spectra QEST® Australia Pty Ltd and/or related companies.