Text analytics

Text analytics for IUCLID is currently shared with users for testing purposes. It can be installed alongside IUCLID 6 (preferably IUCLID 6.2) and supports only Oracle-based installations. Please note that standard IUCLID support does not apply to this tool, but you can send your questions and feedback to the IUCLID team.

Text analytics is a web application built on top of IUCLID that supports routine analysis of the information contained in IUCLID dossiers. The analysis covers both structured information (e.g. picklists, numbers, dates) and unstructured information (i.e. free text fields and attachments).

Text analytics extracts all information from IUCLID dossiers and indexes it using Elasticsearch, which allows rapid searches across the index. It also performs optical character recognition (OCR) on scanned attachments, as well as language detection.
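To illustrate how a search over such an index can combine unstructured and structured information, here is a minimal sketch that builds a request body in the Elasticsearch query DSL. It is not the query Text analytics actually sends: the field names `content`, `language` and `submission_date` and the function `build_dossier_query` are hypothetical and chosen only for illustration. The `bool`, `match`, `term` and `range` clauses are standard query DSL constructs.

```python
import json


def build_dossier_query(free_text, language=None, submitted_after=None):
    """Build a query DSL body: full-text match plus optional structured filters.

    All field names below are hypothetical; the real index mapping is not
    described on this page.
    """
    filters = []
    if language is not None:
        # Exact match on a structured value (e.g. the detected language).
        filters.append({"term": {"language": language}})
    if submitted_after is not None:
        # Date range on a structured value (ISO 8601 date string).
        filters.append({"range": {"submission_date": {"gte": submitted_after}}})

    return {
        "query": {
            "bool": {
                # Scored full-text search over free text and attachment content.
                "must": [{"match": {"content": free_text}}],
                # Filters narrow the results without affecting the score.
                "filter": filters,
            }
        }
    }


if __name__ == "__main__":
    body = build_dossier_query(
        "skin irritation", language="en", submitted_after="2018-01-01"
    )
    print(json.dumps(body, indent=2))
```

The resulting dictionary could be serialised to JSON and posted to the `_search` endpoint of an index. Keeping the structured conditions in `filter` rather than `must` means they restrict the result set without contributing to relevance scoring.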

Typical hardware requirements are presented below, assuming that the WildFly and Elasticsearch servers are installed on the same host (the IUCLID server and database could also be hosted on the same server). The hardware requirements will vary according to the amount of data managed by IUCLID.

  • CPU: 6
  • RAM: 16GB (4GB WildFly server, 4GB Elasticsearch server, 8GB OS and Tesseract)
  • HDD: 50GB
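As a quick check of the RAM figure above, the following sketch adds up the per-component allocations and compares the total with the memory available on a candidate host. The component figures come from the list above; the helper names and the example host sizes are assumptions made for illustration.

```python
# Per-component RAM allocations in GB, taken from the requirements list.
MEMORY_BUDGET_GB = {
    "WildFly server": 4,
    "Elasticsearch server": 4,
    "OS and Tesseract": 8,
}


def total_memory_gb(budget):
    """Return the sum of all component allocations in GB."""
    return sum(budget.values())


def host_is_sufficient(budget, available_gb):
    """Return True if the host has at least as much RAM as the budget needs."""
    return available_gb >= total_memory_gb(budget)


if __name__ == "__main__":
    print(total_memory_gb(MEMORY_BUDGET_GB))         # 16
    print(host_is_sufficient(MEMORY_BUDGET_GB, 16))  # True
    print(host_is_sufficient(MEMORY_BUDGET_GB, 12))  # False
```

If the IUCLID server and database share the same host, their own memory needs would have to be added to the budget as further entries.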

Text analytics has been tested with JDK version 8.

Text analytics Test version (18th July 2018, version 2.2.0)

The installation package contains all the components needed to run Text analytics. It can be deployed together with an Oracle-based IUCLID 6 installation (preferably IUCLID 6.2). Deployment instructions are available earlier on this page.

