DOT Document Server

v5.75.0

Overview

DOT Document Server is a RESTful microservice for normalizing document images and recognizing the visual zones of a document, mainly text fields.

API Reference

The DOT Document Server API reference is published here

Distribution package contents

You can find the distribution package in our CRM portal. Your sales representative will provide you with the credentials for the CRM login. The package contains these files:

  • config – The configuration folder

    • application.yml – The application configuration file, see Externalized configuration

    • logback-spring.xml – The logging configuration file

  • doc – The documentation folder

    • Innovatrics_DOT_Document_Server_5.75.0_Technical_Documentation.html – Technical documentation

    • Innovatrics_DOT_Document_Server_5.75.0_Technical_Documentation.pdf – Technical documentation

    • swagger.json – Swagger API file

    • EULA.txt – The license agreement

  • docker – The Docker folder

    • Dockerfile – The text document that contains all the commands to assemble a Docker image, see Docker

    • entrypoint.sh – The entry point script

  • libs – The libraries folder

    • libsam.so – The Innovatrics OCR library

    • libiface.so – The Innovatrics IFace library

    • solvers – The Innovatrics IFace library solvers

  • dot-document-server.jar – The executable JAR file, see How to run

  • Innovatrics_DOT_Document_Server_5.75.0_postman_collection.json – Postman collection

Installation

System requirements

  • Ubuntu 18.04 (64-bit)

Steps

  1. Install the following packages:

    • OpenJDK Runtime Environment (JRE) (openjdk-17-jre-headless)

    • userspace USB programming library (libusb-0.1)

    • GCC OpenMP (GOMP) support library (libgomp1)

    • Locales

    apt-get update
    apt-get install -y openjdk-17-jre-headless libusb-0.1 libgomp1 locales
  2. Set the locale:

    sed -i '/en_US.UTF-8/s/^# //g' /etc/locale.gen && locale-gen
    export LANG=en_US.UTF-8; export LANGUAGE=en_US:en; export LC_ALL=en_US.UTF-8
  3. Extract the DOT Document Server distribution package to any folder.

  4. Link the application libraries:

    ldconfig /local/path/to/current/dir/libs
    Replace the path /local/path/to/current/dir in the command with your current path. Keep /libs as a suffix in the path.

Activate the DOT license

The activation of the DOT license depends on the type of your deployment.

If you perform serverless or Docker deployments, please contact your sales representative or sales@innovatrics.com to receive a license. Once you receive the license, please deploy it as described in step 5 below.

If you perform a bare metal installation, or use a fixed VM or AWS instance, perform the following steps:

  1. Run DOT Document Server to generate the Hardware ID necessary for the license.

    java -Dspring.config.additional-location=file:config/application.yml -Dlogging.config=file:config/logback-spring.xml -DLOGS_DIR=logs -Djna.library.path=libs/ -jar dot-document-server.jar

    Copy the Hardware ID, which you can find in the output. See the example below:

    Unable to init IFace. Hardware ID: xxxxxxxxxxxx
  2. Visit our CRM portal and go to Products > Digital Onboarding Toolkit > Licenses.

  3. Then, select Generate License and paste the Hardware ID.

  4. Confirm again with Generate License and download the license.

  5. Copy your license file iengine.lic for Innovatrics IFace SDK {iface-version} into {DOT_DOCUMENT_SERVER_DIR}/license/
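When step 1 is scripted, the Hardware ID can be picked out of the captured server output. The following is a minimal sketch, assuming the output contains a line in the same format as the example above; the function name extract_hardware_id is hypothetical and not part of the product.

```python
import re
from typing import Optional

# Matches the "Hardware ID: <value>" fragment shown in the example output.
HARDWARE_ID_PATTERN = re.compile(r"Hardware ID:\s*(\S+)")


def extract_hardware_id(output: str) -> Optional[str]:
    """Return the first Hardware ID found in the server output, or None."""
    match = HARDWARE_ID_PATTERN.search(output)
    return match.group(1) if match else None


if __name__ == "__main__":
    sample = "Unable to init IFace. Hardware ID: xxxxxxxxxxxx"
    print(extract_hardware_id(sample))
```

The returned value is what you paste into the Generate License form in the CRM portal.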

How to run

DOT Document Server is a stand-alone Spring Boot application with an embedded servlet container, so it does not need to be deployed on a pre-installed web server. Instead, run the following command in the application folder:

java -Dspring.config.additional-location=file:config/application.yml -Dlogging.config=file:config/logback-spring.xml -DLOGS_DIR=logs -Djna.library.path=libs/ -jar dot-document-server.jar

The embedded Tomcat web server starts and the application listens on port 8080 (or another configured port).
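A startup script may need to wait until the port accepts connections before sending requests. The sketch below is one way to do that with plain TCP connection attempts; the helper names, the retry count, and the delay are illustrative assumptions. An open port only shows that Tomcat is accepting connections, not that the application is healthy.

```python
import socket
import time


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_port(host: str, port: int, attempts: int = 30, delay: float = 1.0) -> bool:
    """Poll the port until it opens or the attempts are used up."""
    for _ in range(attempts):
        if is_port_open(host, port):
            return True
        time.sleep(delay)
    return False


if __name__ == "__main__":
    # 8080 is the default port mentioned above; adjust if server.port is changed.
    print(wait_for_port("localhost", 8080))
```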

Docker

To build a Docker image, you can use the Dockerfile and the entrypoint.sh script. A Dockerfile example and an entrypoint.sh script example can also be found in the Appendix.

Build the Docker image as follows:

cd docker
cp ../dot-document-server.jar .
cp ../libs/libsam.so.* .
cp ../libs/libiface.so.* .
cp -r ../libs/solvers/ ./solvers
docker build --build-arg JAR_FILE=dot-document-server.jar --build-arg SAM_OCR_LIB=libsam.so.* --build-arg IFACE_LIB=libiface.so.* -t dot-document-server .

Run the container according to the instructions below:

docker run -v /local/path/to/license/dir/:/srv/dot-document-server/license -v /local/path/to/config/dir/:/srv/dot-document-server/config -v /local/path/to/logs/dir/:/srv/dot-document-server/logs -p 8080:8080 dot-document-server
Replace the path /local/path/to/license/dir/ in the command with your local path to the license directory.
Replace the path /local/path/to/config/dir/ in the command with your local path to the config directory (from the distribution package).
Important: Replace the path /local/path/to/logs/dir/ in the command with your local path to the logs directory. You need to create this directory on a persistent drive. The volume mount into the container is mandatory; otherwise the application does not start successfully.
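Because the container does not start without the logs mount, it can help to verify the host directories before calling docker run. The sketch below assembles the argument list from the command above and fails early if a directory is missing. The function name docker_run_args and its parameters are hypothetical; the container paths and image name are taken from the command above.

```python
from pathlib import Path
from typing import List

# Container-side root directory, as used in the docker run command above.
CONTAINER_ROOT = "/srv/dot-document-server"


def docker_run_args(license_dir: str, config_dir: str, logs_dir: str,
                    host_port: int = 8080,
                    image: str = "dot-document-server") -> List[str]:
    """Build the docker run arguments, checking that each host directory exists."""
    mounts = {"license": license_dir, "config": config_dir, "logs": logs_dir}
    args = ["docker", "run"]
    for name, host_path in mounts.items():
        path = Path(host_path)
        if not path.is_dir():
            raise FileNotFoundError(f"{name} directory does not exist: {path}")
        args += ["-v", f"{path.resolve()}:{CONTAINER_ROOT}/{name}"]
    # The application listens on 8080 inside the container.
    args += ["-p", f"{host_port}:8080", image]
    return args
```

The resulting list can be passed to subprocess.run, which avoids shell quoting issues with paths that contain spaces.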

Externalized configuration

The YAML configuration file is located in the config folder:

config/application.yml

There are two groups of properties:

  • Spring Boot properties

  • DOT Document Server specific properties

Spring Boot properties

You can find the specification at Common Application properties.

For example, if you would like to specify a different server port, you just add the following property:

server:
    port: 9080

To fully understand how the externalized configuration works in Spring Boot, see Spring Boot documentation, chapter Externalized Configuration.

DOT Document Server specific properties

These properties are tied to DOT Document Server specific behavior. The following property collection is provided as a guideline and shows the default values and their meaning.

innovatrics:
  dot:
    iface:
      license:
        filepath: license/iengine.lic # (1)
      solvers:
        filepath: libs/solvers # (2)
    document:
      jpg-compression-quality: 0.9 # (3)
      thresholds: # (4)
        "[COLOR_SIMILARITY]":
          - checkpoint: 0.0
            level: very_low
          - checkpoint: 0.02
            level: low
          - checkpoint: 0.2
            level: medium
          - checkpoint: 0.4
            level: high
        AUTHENTICITY:
          - checkpoint: 0.0
            level: very_low
          - checkpoint: 0.2
            level: low
          - checkpoint: 0.5
            level: medium
          - checkpoint: 0.65
            level: high
        DISPLAY_ATTACKS:
          - checkpoint: 0.0
            level: low
          - checkpoint: 0.46
            level: medium
          - checkpoint: 0.54
            level: high
    data-downloader: # (5)
      connection-timeout: 2000 # (6)
      read-timeout: 30000 # (7)
(1) The Innovatrics IFace license file path.
(2) The Innovatrics IFace solvers file path.
(3) The JPG compression quality of the document image as percentage.
(4) The threshold configurations. Each threshold configuration has to contain the zero checkpoint. The checkpoints divide the 0 - 100 score range. Each checkpoint indicates which level the result should have when the value is equal to or above the checkpoint value.
(5) The configuration of the downloader of data from a provided URL.
(6) The data downloader connection timeout in milliseconds.
(7) The data downloader read timeout in milliseconds.
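The checkpoint rule for thresholds can be illustrated with a short lookup: the result level is the one attached to the highest checkpoint that the value equals or exceeds. This is a sketch of that rule only, not the server's implementation. It assumes the score is expressed on the same scale as the configured checkpoint values, and the names level_for and AUTHENTICITY are illustrative.

```python
from bisect import bisect_right
from typing import List, Tuple

Checkpoints = List[Tuple[float, str]]

# Default AUTHENTICITY thresholds copied from the configuration above.
AUTHENTICITY: Checkpoints = [
    (0.0, "very_low"),
    (0.2, "low"),
    (0.5, "medium"),
    (0.65, "high"),
]


def level_for(score: float, checkpoints: Checkpoints) -> str:
    """Return the level of the highest checkpoint that is <= score."""
    ordered = sorted(checkpoints)
    if not ordered or ordered[0][0] != 0.0:
        raise ValueError("threshold configuration must contain the zero checkpoint")
    if score < 0.0:
        raise ValueError("score must not be negative")
    values = [checkpoint for checkpoint, _ in ordered]
    # bisect_right counts the checkpoints that are <= score.
    return ordered[bisect_right(values, score) - 1][1]


if __name__ == "__main__":
    for score in (0.0, 0.19, 0.2, 0.64, 0.65, 0.9):
        print(score, level_for(score, AUTHENTICITY))
```

The zero checkpoint is required because, without it, scores below the lowest checkpoint would have no level to map to.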

Externalized configuration via command line arguments

You can specify any Spring Boot property via command line arguments, as you can see below:

java -jar dot-document-server.jar --server.port=9080
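Nested YAML keys correspond to dot-separated property names on the command line. The sketch below flattens a nested mapping into arguments of the form shown above. The helper to_cli_args is hypothetical, and list-valued properties such as thresholds are intentionally not handled.

```python
from typing import Any, Dict, List


def to_cli_args(config: Dict[str, Any], prefix: str = "") -> List[str]:
    """Flatten a nested mapping into --dotted.key=value arguments."""
    args: List[str] = []
    for key, value in config.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            args.extend(to_cli_args(value, name))
        elif isinstance(value, bool):
            # Write booleans in lowercase, as they appear in YAML.
            args.append(f"--{name}={str(value).lower()}")
        else:
            args.append(f"--{name}={value}")
    return args


if __name__ == "__main__":
    config = {
        "server": {"port": 9080},
        "innovatrics": {"dot": {"data-downloader": {"read-timeout": 30000}}},
    }
    print(to_cli_args(config))
```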

Logging

DOT Document Server logs to the console and also writes a log file (dot-document-server.log). The log file is located in the directory defined by the LOGS_DIR system property. By default, log files rotate when they reach 5 MB, and the maximum history is 5 files.

API Transaction Counter Log

Separate log files that follow the filename pattern dot-document-transaction-counter.log.%d{yyyy-MM-dd}.%i.gz are located in the directory defined by the LOGS_DIR system property. The %d{yyyy-MM-dd} template represents the date, and %i represents the index of the log window within the day, starting at 0. These log files contain information about the counts of API calls (transactions). The same rolling policy applies as for the application log, except that the maximum history of these log files is 455 files.
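When collecting transaction counter logs for billing, it can be useful to recognize and order the files by date and index. The following sketch parses filenames that follow the pattern above; the function name and the sample filename are hypothetical.

```python
import re
from datetime import date
from typing import Optional, Tuple

# Regex form of dot-document-transaction-counter.log.%d{yyyy-MM-dd}.%i.gz
LOG_NAME = re.compile(
    r"^dot-document-transaction-counter\.log\.(\d{4})-(\d{2})-(\d{2})\.(\d+)\.gz$"
)


def parse_counter_log_name(filename: str) -> Optional[Tuple[date, int]]:
    """Return (date, index) for a transaction counter log name, else None."""
    match = LOG_NAME.match(filename)
    if match is None:
        return None
    year, month, day, index = (int(group) for group in match.groups())
    try:
        return date(year, month, day), index
    except ValueError:
        # Digits matched the pattern but do not form a valid calendar date.
        return None


if __name__ == "__main__":
    print(parse_counter_log_name("dot-document-transaction-counter.log.2024-01-31.0.gz"))
```

Sorting files by the returned tuple gives chronological order, with the index breaking ties within a day.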

For correct transaction billing, be sure to send all transaction logs every time.

Docker: Persisting log files in local filesystem

When you run DOT Document Server as a Docker container, you may need access to the log files even after the container no longer exists. This can be achieved by using Docker volumes. To find out how to run a container, see Docker.

Monitoring

Information such as build or license info can be accessed at /api/v5/actuator/info. Information about the available endpoints can be viewed at /swagger-ui.html.

The health endpoint, accessible at /api/v5/health, provides information about the health of the application. This feature can be used by an external tool such as Spring Boot Admin.

The application also supports exposing metrics in the standardized Prometheus format. These are accessible at /api/v5/prometheus. You can expose this endpoint in your configuration:

management:
  endpoints:
    web:
      exposure:
        include: health, info, prometheus

For more information, see Spring Boot documentation, sections Endpoints and Metrics. Spring Boot Actuator Documentation also provides info about other monitoring endpoints that can be enabled.
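An external monitoring script can poll the health endpoint directly. The sketch below uses only the Python standard library. It assumes the response body is JSON with a top-level "status" field, which is the Spring Boot Actuator default and is not confirmed for this product; the base URL and function names are illustrative.

```python
import json
from urllib.error import HTTPError
from urllib.request import urlopen

HEALTH_PATH = "/api/v5/health"  # endpoint path from the text above


def endpoint_url(base_url: str, path: str) -> str:
    """Join a base URL and an absolute path without doubling slashes."""
    return base_url.rstrip("/") + path


def parse_health_status(body: str) -> str:
    """Extract the assumed top-level "status" field from the JSON body."""
    return json.loads(body).get("status", "UNKNOWN")


def fetch_health(base_url: str, timeout: float = 5.0) -> str:
    """Call the health endpoint and return the reported status."""
    try:
        with urlopen(endpoint_url(base_url, HEALTH_PATH), timeout=timeout) as response:
            return parse_health_status(response.read().decode("utf-8"))
    except HTTPError as error:
        # An unhealthy application may answer with an error code and a JSON body.
        return parse_health_status(error.read().decode("utf-8"))


if __name__ == "__main__":
    print(fetch_health("http://localhost:8080"))
```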

Tracing

The OpenTracing API with the Jaeger implementation is used for tracing. The DOT Document Server tracing implementation supports SpanContext extraction from an HTTP request using the HTTP Headers format. For more information, see OpenTracing Specification. Tracing is disabled by default. To enable Jaeger tracing, set these application properties:

opentracing:
  jaeger:
    enabled: true
    udp-sender:
      host: jaegerhost
      port: portNumber

For more information about Jaeger configuration, see Jaeger Client Lib Docs.
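A client that wants its requests to join an existing trace has to send the span context in HTTP headers. The sketch below builds the uber-trace-id header used by Jaeger's default propagation format ({trace-id}:{span-id}:{parent-span-id}:{flags}, hexadecimal). Whether DOT Document Server is configured with this default format is an assumption; the text above only states that the HTTP Headers format is supported. The function names are illustrative.

```python
import secrets
from typing import Dict


def uber_trace_id(trace_id: int, span_id: int,
                  parent_span_id: int = 0, sampled: bool = True) -> str:
    """Format a span context as a Jaeger uber-trace-id header value."""
    flags = 1 if sampled else 0  # bit 0 marks the trace as sampled
    return f"{trace_id:x}:{span_id:x}:{parent_span_id:x}:{flags:x}"


def new_trace_headers(sampled: bool = True) -> Dict[str, str]:
    """Create headers that start a new trace with random non-zero 64-bit IDs."""
    trace_id = secrets.randbits(64) or 1
    span_id = secrets.randbits(64) or 1
    return {"uber-trace-id": uber_trace_id(trace_id, span_id, 0, sampled)}


if __name__ == "__main__":
    print(new_trace_headers())
```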

Features

Quality check

The document quality check is provided on demand. It checks whether the document has the right brightness and sharpness, whether it is free of hotspots, whether there is enough background border around the document in the image, and whether the document is large enough.

The document quality check result contains details with detection confidence and the coordinates of the document in the image.

Table 1. Quality check examples

Each entry lists the quality check result, its meaning, and the input image.

  • OK (input image: Ok)

  • WARNING: DOCUMENT_CLOSE_TO_IMAGE_BORDER – The distance of at least one detected corner point from the nearest image border is less than 2% of the image width or height. (input image: Document close to image borders)

  • FAILED: BRIGHTNESS_LOW – The brightness score is below 0.25. (input image: Low brightness)

  • FAILED: BRIGHTNESS_HIGH – The brightness score is over 0.9. (input image: High brightness)

  • FAILED: SHARPNESS_LOW – The sharpness score is below 0.85. (input image: Low sharpness)

  • FAILED: HOTSPOTS_SCORE_HIGH – The hotspots score is over 0.008. (input image: Hotspots)

  • FAILED: DOCUMENT_OUT_OF_IMAGE – At least one corner point of the document was detected outside the image area. (input image: Document out of image)

  • FAILED: DOCUMENT_SMALL – The width of the detected document should be over 450px and the height over (450px / aspect ratio). If any detected edge of the document does not meet these requirements, the document is considered to be too small. (input image: Small document)
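The score-based rules in Table 1 can be expressed as simple comparisons. The sketch below is an illustration of those limits, not the server's implementation. The function name, the parameter names, and the handling of values exactly on a limit are assumptions, and the corner-based checks (document close to border, document out of image) are left out.

```python
from typing import List

MIN_WIDTH_PX = 450  # minimum document width from Table 1


def quality_issues(brightness: float, sharpness: float, hotspots: float,
                   width_px: float, height_px: float,
                   aspect_ratio: float) -> List[str]:
    """Return the failed checks for the given scores and document size."""
    issues: List[str] = []
    if brightness < 0.25:
        issues.append("BRIGHTNESS_LOW")
    if brightness > 0.9:
        issues.append("BRIGHTNESS_HIGH")
    if sharpness < 0.85:
        issues.append("SHARPNESS_LOW")
    if hotspots > 0.008:
        issues.append("HOTSPOTS_SCORE_HIGH")
    # Width must be over 450px and height over 450px / aspect ratio.
    if width_px <= MIN_WIDTH_PX or height_px <= MIN_WIDTH_PX / aspect_ratio:
        issues.append("DOCUMENT_SMALL")
    return issues


if __name__ == "__main__":
    print(quality_issues(0.5, 0.9, 0.001, 900, 600, 1.5))
    print(quality_issues(0.1, 0.5, 0.02, 400, 300, 1.5))
```

An empty list corresponds to the OK result for the checks covered here.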

Display attack detection

The display attack detection is provided on demand.

Table 2. Display attack detection examples
Genuity confidence level | Input image

LOW

Low genuity means a display attack was detected.