DOT Document Server
v5.82.0
Overview
DOT Document Server is a RESTful microservice for normalizing document images and recognizing document visual zones, mainly text fields.
API Reference
The DOT Document Server API reference is published here
Distribution package contents
You can find the distribution package in our CRM portal. Your sales representative will provide you with the credentials for the CRM login. The package contains these files:
config – The configuration folder
  application.yml – The application configuration file, see Externalized configuration
  logback-spring.xml – The logging configuration file
doc – The documentation folder
  Innovatrics_DOT_Document_Server_5.82.0_Technical_Documentation.html – Technical documentation
  Innovatrics_DOT_Document_Server_5.82.0_Technical_Documentation.pdf – Technical documentation
  swagger.json – Swagger API file
  EULA.txt – The license agreement
docker – The Docker folder
  Dockerfile – The text document that contains all the commands to assemble a Docker image, see Docker
  entrypoint.sh – The entry point script
libs – The libraries folder
  libdot-sam.so – The Innovatrics OCR library. NOTE: before version DOT Document Server 5.76.0, the library was named libdot-sam.so
  libiface.so – The Innovatrics IFace library
  solvers – The Innovatrics IFace library solvers
dot-document-server.jar – The executable JAR file, see How to run
Innovatrics_DOT_Document_Server_5.82.0_postman_collection.json – Postman collection
Installation
System requirements
Ubuntu 18.04 (64-bit)
Steps
Install the following packages:
OpenJDK Runtime Environment (JRE) (openjdk-17-jre-headless)
userspace USB programming library (libusb-0.1)
GCC OpenMP (GOMP) support library (libgomp1)
Locales
apt-get update
apt-get install -y openjdk-17-jre-headless libusb-0.1 libgomp1 locales
Set the locale
sed -i '/en_US.UTF-8/s/^# //g' /etc/locale.gen && locale-gen
export LANG=en_US.UTF-8; export LANGUAGE=en_US:en; export LC_ALL=en_US.UTF-8
Extract the DOT Document Server distribution package to any folder.
Link the application libraries:
ldconfig /local/path/to/current/dir/libs
Replace the path /local/path/to/current/dir in the command with your current path. Keep /libs as a suffix in the path.
Activate the DOT license
The activation of the DOT license depends on the type of your deployment.
If you perform serverless or Docker deployments, please contact your sales representative or sales@innovatrics.com to receive a license. Once you receive the license, please deploy it as described in step 5 below.
If you perform a bare metal installation, or use a fixed VM or AWS instance, perform the following steps:
Run DOT Document Server to generate the Hardware ID necessary for the license.
java -Dspring.config.additional-location=file:config/application.yml -Dlogging.config=file:config/logback-spring.xml -DLOGS_DIR=logs -Djna.library.path=libs/ -jar dot-document-server.jar
Copy the Hardware ID, which you can find in the output. See the example below:
Unable to init IFace. Hardware ID: xxxxxxxxxxxx
Visit our CRM portal and go to Products > Digital Onboarding Toolkit > Licenses.
Then, select Generate License and paste the Hardware ID.
Confirm again with Generate License and download the license.
Copy your license file iengine.lic for Innovatrics IFace SDK {iface-version} into {DOT_DOCUMENT_SERVER_DIR}/license/
How to run
Because DOT Document Server is a stand-alone Spring Boot application with an embedded servlet container, it does not need to be deployed on a pre-installed web server. Instead, run the following command in the application folder:
java -Dspring.config.additional-location=file:config/application.yml -Dlogging.config=file:config/logback-spring.xml -DLOGS_DIR=logs -Djna.library.path=libs/ -jar dot-document-server.jar
The embedded Tomcat web server starts and the application listens on port 8080 (or another configured port).
Docker
To build a Docker image, you can use the Dockerfile and the entrypoint.sh script. A Dockerfile example and an entrypoint.sh script example can also be found in the Appendix.
Build the Docker image as follows:
cd docker
cp ../dot-document-server.jar .
cp ../libs/libdot-sam.so.* .
cp ../libs/libiface.so.* .
cp -r ../libs/solvers/ ./solvers
docker build \
  --build-arg JAR_FILE=dot-document-server.jar \
  --build-arg SAM_OCR_LIB=libdot-sam.so.* \
  --build-arg IFACE_LIB=libiface.so.* \
  -t dot-document-server .
Run the container according to the instructions below:
docker run \
  -v /local/path/to/license/dir/:/srv/dot-document-server/license \
  -v /local/path/to/config/dir/:/srv/dot-document-server/config \
  -v /local/path/to/logs/dir/:/srv/dot-document-server/logs \
  -p 8080:8080 dot-document-server
Replace the path /local/path/to/license/dir/ in the command with your local path to the license directory.
Replace the path /local/path/to/config/dir/ in the command with your local path to the config directory (from the distribution package).
Important: Replace the path /local/path/to/logs/dir/ in the command with your local path to the logs directory (you need to create the directory and mount it to a persistent drive). The volume mount into the Docker container is mandatory; otherwise, the application does not start successfully.
Externalized configuration
The YAML configuration file is located in the config folder: config/application.yml
There are two groups of properties:
Spring Boot properties
DOT Document Server specific properties
Spring Boot properties
You can find the specification at Common Application properties.
For example, if you would like to specify a different server port, you just add the following property:
server:
  port: 9080
To fully understand how the externalized configuration works in Spring Boot, see Spring Boot documentation, chapter Externalized Configuration.
DOT Document Server specific properties
These properties are tied to DOT Document Server specific behavior. The following property collection is provided as a guideline and shows the default values and their meaning.
innovatrics:
  dot:
    iface:
      license:
        filepath: license/iengine.lic # (1)
      solvers:
        filepath: libs/solvers # (2)
    document:
      jpg-compression-quality: 0.9 # (3)
      thresholds: # (4)
        "[COLOR_SIMILARITY]":
          - checkpoint: 0.0
            level: very_low
          - checkpoint: 0.02
            level: low
          - checkpoint: 0.2
            level: medium
          - checkpoint: 0.4
            level: high
        AUTHENTICITY:
          - checkpoint: 0.0
            level: very_low
          - checkpoint: 0.2
            level: low
          - checkpoint: 0.5
            level: medium
          - checkpoint: 0.65
            level: high
        DISPLAY_ATTACKS:
          - checkpoint: 0.0
            level: low
          - checkpoint: 0.46
            level: medium
          - checkpoint: 0.54
            level: high
      data-downloader: # (5)
        connection-timeout: 2000 # (6)
        read-timeout: 30000 # (7)
1 | Innovatrics IFace license file path. |
2 | Innovatrics IFace solvers file path. |
3 | JPG compression quality of the document image, as a percentage. |
4 | The threshold configurations. Each threshold configuration has to contain the zero checkpoint. The checkpoints divide the 0 - 100 score range. Each checkpoint indicates which level is reported in the result when the value is equal to or above the checkpoint value. |
5 | The configuration of the downloader of data from a provided URL. |
6 | The data downloader connection timeout in milliseconds. |
7 | The data downloader read timeout in milliseconds. |
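The checkpoint rule described in (4) can be illustrated with a short Python sketch. This is not the server's implementation: the function name level_for is hypothetical, and the sketch assumes that scores are expressed on the same scale as the default checkpoint values. The DISPLAY_ATTACKS checkpoints are copied from the defaults above.

```python
from bisect import bisect_right

# Default DISPLAY_ATTACKS checkpoints, copied from the configuration above.
DISPLAY_ATTACKS = [(0.0, "low"), (0.46, "medium"), (0.54, "high")]


def level_for(score, checkpoints):
    """Return the level of the highest checkpoint that is <= score.

    Hypothetical helper for illustration only.
    """
    ordered = sorted(checkpoints)
    if not ordered or ordered[0][0] != 0.0:
        raise ValueError("a threshold configuration must contain the zero checkpoint")
    values = [value for value, _ in ordered]
    # bisect_right counts the checkpoints that are <= score.
    index = bisect_right(values, score) - 1
    if index < 0:
        raise ValueError("score is below the zero checkpoint")
    return ordered[index][1]
```

A value exactly equal to a checkpoint already receives that checkpoint's level, so level_for(0.54, DISPLAY_ATTACKS) yields "high" while level_for(0.5, DISPLAY_ATTACKS) yields "medium".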
Externalized configuration via command line arguments
You can specify any Spring Boot property via command line arguments, as you can see below:
java -jar dot-document-server.jar --server.port=9080
Logging
DOT Document Server logs to the console and also writes a log file (dot-document-server.log). The log file is located in the directory defined by the LOGS_DIR system property. Log files rotate when they reach 5 MB, and by default the maximum history is 5 files.
API Transaction Counter Log
Separate log files following the filename pattern dot-document-transaction-counter.log.%d{yyyy-MM-dd}.%i.gz are located in the directory defined by the LOGS_DIR system property. The %d{yyyy-MM-dd} template represents the date and %i represents the index of the log window within the day, starting at 0. These log files contain information about the counts of API calls (transactions). The same rolling policy applies as for the application log, except that the maximum history of these log files is 455 files.
For proper transaction billing, please be sure to send all transaction logs every time.
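When collecting the transaction logs, it can help to select and order them by the date and index encoded in the filename. The following Python sketch shows one way to do that under the filename pattern given above; the helper names parse_log_name and sort_logs are hypothetical and not part of the product.

```python
import re
from datetime import date

# Matches dot-document-transaction-counter.log.<yyyy-MM-dd>.<index>.gz
LOG_NAME = re.compile(
    r"^dot-document-transaction-counter\.log\."
    r"(?P<day>\d{4}-\d{2}-\d{2})\.(?P<index>\d+)\.gz$"
)


def parse_log_name(name):
    """Return (date, index) for a rolled transaction log name, or None."""
    match = LOG_NAME.match(name)
    if match is None:
        return None
    return date.fromisoformat(match.group("day")), int(match.group("index"))


def sort_logs(names):
    """Keep only names that match the pattern, ordered by date and then index."""
    parsed = [(parse_log_name(name), name) for name in names]
    return [name for _, name in sorted(p for p in parsed if p[0] is not None)]
```

The index is compared as a number, so a file with index 10 sorts after a file with index 2 from the same day.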
Docker: Persisting log files in local filesystem
When you run DOT Document Server as a Docker container, you may need access to the log files even after the container no longer exists. This can be achieved by using Docker volumes. To find out how to run a container, see Docker.
Monitoring
Information such as build or license info can be accessed at /api/v5/actuator/info. Information about available endpoints can be viewed at /swagger-ui.html.
The health endpoint, accessible at /api/v5/health, provides information about the health of the application.
This feature can be used by an external tool such as Spring Boot Admin.
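A simple external probe of the health endpoint might look like the Python sketch below. It assumes the server is reachable at http://localhost:8080 and that the response body follows the usual Spring Boot Actuator shape, a JSON object with a "status" field equal to "UP" when healthy. The function names are hypothetical.

```python
import json
from urllib.request import urlopen


def is_up(body):
    """Return True when a health response body reports status UP."""
    try:
        payload = json.loads(body)
    except ValueError:
        # Covers malformed JSON and undecodable bytes.
        return False
    return isinstance(payload, dict) and payload.get("status") == "UP"


def check_health(base_url="http://localhost:8080", timeout=5.0):
    """Query the health endpoint; any network or HTTP error counts as unhealthy."""
    try:
        with urlopen(base_url + "/api/v5/health", timeout=timeout) as response:
            return is_up(response.read())
    except OSError:
        # urllib's URLError and HTTPError are subclasses of OSError.
        return False
```

Treating every error as "unhealthy" keeps the probe simple; a monitoring tool would typically also record the reason.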
The application also supports exposing metrics in the standardized Prometheus format. These are accessible at /api/v5/prometheus. You can expose this endpoint in your configuration:
management:
  endpoints:
    web:
      exposure:
        include: health, info, prometheus
For more information, see Spring Boot documentation, sections Endpoints and Metrics. Spring Boot Actuator Documentation also provides info about other monitoring endpoints that can be enabled.
Tracing
The OpenTracing API with the Jaeger implementation is used for tracing purposes. The DOT Document Server tracing implementation supports SpanContext extraction from an HTTP request using the HTTP Headers format. For more information, see OpenTracing Specification. Tracing is disabled by default. To enable Jaeger tracing, set these application properties:
opentracing:
  jaeger:
    enabled: true
    udp-sender:
      host: jaegerhost
      port: portNumber
For more information about Jaeger configuration, see Jaeger Client Lib Docs.
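To let the server continue a trace started by the caller, the client has to send the span context in the request headers. The Python sketch below builds such a header under an assumption that should be verified for your deployment: that the server uses the Jaeger client's default propagation format, which reads the uber-trace-id header in the form trace-id:span-id:parent-span-id:flags. The function name is hypothetical.

```python
import secrets


def jaeger_trace_header(sampled=True):
    """Build a root-span uber-trace-id header (assumed Jaeger default format)."""
    trace_id = secrets.token_hex(16)  # 128-bit trace ID as 32 hex characters
    span_id = secrets.token_hex(8)    # 64-bit span ID as 16 hex characters
    parent_span_id = "0"              # 0 marks a root span without a parent
    flags = "1" if sampled else "0"   # lowest bit set means "sampled"
    return {"uber-trace-id": f"{trace_id}:{span_id}:{parent_span_id}:{flags}"}
```

The returned dictionary can be merged into the headers of any HTTP client request sent to the server.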
Features
Quality check
The document quality check is provided on demand. It checks whether the document has the right brightness and sharpness, does not contain hotspots, has enough background border around it in the image, and is not too small.
The document quality check result contains details with detection confidence and the coordinates of the document in the image.
Quality check result | Input image |
---|---|
OK | |
WARNING: The distance of at least one detected corner point from the nearest image border is less than 2% of the image width or height. | |
FAILED: The brightness score is below 0.25 | |
FAILED: The brightness score is over 0.9 | |
FAILED: The sharpness score is below 0.85 | |
FAILED: The hotspots score is over 0.008 | |
FAILED: At least one corner point of the document was detected outside the image area. | |
FAILED: The width of the detected document should be over 450px and the height over (450px / aspect ratio). If any detected edge of the document does not meet these requirements, the document is considered to be too small. |
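The failure conditions listed in the table can be summarized in a short Python sketch. It only restates the limits from the table; it is not the server's implementation, and the function and parameter names are hypothetical rather than API field names.

```python
def quality_issues(brightness, sharpness, hotspots, width_px, height_px, aspect_ratio):
    """Return the list of failed checks, using the limits from the table above."""
    issues = []
    if brightness < 0.25:
        issues.append("brightness too low")
    if brightness > 0.9:
        issues.append("brightness too high")
    if sharpness < 0.85:
        issues.append("sharpness too low")
    if hotspots > 0.008:
        issues.append("hotspots detected")
    # Width must be over 450 px and height over 450 px / aspect ratio.
    if width_px <= 450 or height_px <= 450 / aspect_ratio:
        issues.append("document too small")
    return issues
```

An empty list corresponds to a document that passes all of these checks; the corner-position conditions from the table are not covered by this sketch.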
Display attack detection
The display attack detection is provided on demand.
Genuity confidence level | Input image |
---|---|
Low genuity means a display attack was detected. | |
High genuity means no display attack was detected. |
You can override the default configuration and configure your own thresholds for display attack genuity confidence levels. The table below shows the false accept rate (FAR) and the false reject rate (FRR) for various thresholds, measured on our test dataset.
Threshold | FAR (%) | FRR (%) |
---|---|---|
0.807 | 35.41 | 0.00 |
0.843 | 11.93 | 1.04 |
0.85 | 9.08 | 2.07 |
0.861 | 5.25 | 2.98 |
0.869 | 2.98 | 3.89 |
0.876 | 1.82 | 5.05 |
0.88 | 1.17 | 6.09 |
0.886 | 0.91 | 7.12 |
0.907 | 0.00 | 12.56 |
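One way to read the table is to pick the lowest threshold whose FAR stays within an acceptable limit, which keeps the FRR as low as the table allows. The Python sketch below does this with the measured values copied from the table; the function name is hypothetical, and results on your own data may differ from those on the test dataset.

```python
# (threshold, FAR %, FRR %) rows copied from the table above, ascending threshold.
MEASUREMENTS = [
    (0.807, 35.41, 0.00),
    (0.843, 11.93, 1.04),
    (0.85, 9.08, 2.07),
    (0.861, 5.25, 2.98),
    (0.869, 2.98, 3.89),
    (0.876, 1.82, 5.05),
    (0.88, 1.17, 6.09),
    (0.886, 0.91, 7.12),
    (0.907, 0.00, 12.56),
]


def threshold_for_max_far(max_far_percent):
    """Return the first (lowest) listed row whose FAR is within the limit, or None."""
    for threshold, far, frr in MEASUREMENTS:
        if far <= max_far_percent:
            return threshold, far, frr
    return None
```

For example, a FAR limit of 2% selects the row with threshold 0.876, where the measured FRR is 5.05%.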
Document authenticity detection
The detection of areas in document overlaid by paper or sticker was deprecated in Document Server version 5.26.0 and it is not working as intended in versions 5.44.0 and later. The result of the confidence score in the authenticity object should not be evaluated, but will be retained in API for compatibility reasons.
Passport reading
An image of any passport that fulfills the ICAO Document 9303 specification can be read via further classification. The machine-readable zone (MRZ) is recognized and parsed. The validity of the MRZ check digits is provided too.
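For readers unfamiliar with MRZ check digits, the Python sketch below shows the check digit calculation defined in ICAO Document 9303: each character is converted to a number (digits keep their value, A to Z map to 10 to 35, the filler < counts as 0), multiplied by the repeating weights 7, 3, 1, and the sum is taken modulo 10. This is an independent illustration, not the server's code, and the function names are hypothetical.

```python
WEIGHTS = (7, 3, 1)


def char_value(ch):
    """Numeric value of one MRZ character."""
    if ch in "0123456789":
        return int(ch)
    if "A" <= ch <= "Z":
        return ord(ch) - ord("A") + 10
    if ch == "<":
        return 0
    raise ValueError(f"invalid MRZ character: {ch!r}")


def check_digit(field):
    """Compute the check digit for an MRZ field."""
    total = sum(char_value(ch) * WEIGHTS[i % 3] for i, ch in enumerate(field))
    return total % 10


def is_valid(field, digit):
    """Compare the computed check digit with the one printed in the MRZ."""
    return digit in "0123456789" and check_digit(field) == int(digit)
```

For the document number L898902C3, the weighted sum is 316, so the check digit is 6.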
Image requirements
The supported image formats are JPEG and PNG
The document image must be large enough — when the document card is normalized, the text height must be at least 32 px (document card height is approximately 1000 px)
The document card edges must be clearly visible and be placed at least 10 px inside the image area
The image must be sharp enough for the human eye to recognize the text
The image should not contain objects or a background with visible edges (example below), because these can confuse the detection of the card in the image.
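Two of these requirements can be pre-checked on the client side before an image is sent. The Python sketch below detects the image format from the standard JPEG and PNG file signatures and verifies that the card corners keep the 10 px distance from the image border. The function names are hypothetical, and the corner coordinates are assumed to come from your own detection step.

```python
JPEG_SIGNATURE = b"\xff\xd8\xff"
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def detect_format(data):
    """Return "JPEG" or "PNG" based on the file signature, or None otherwise."""
    if data.startswith(JPEG_SIGNATURE):
        return "JPEG"
    if data.startswith(PNG_SIGNATURE):
        return "PNG"
    return None


def corners_inside(corners, width, height, margin=10):
    """Check that every (x, y) corner is at least `margin` px inside the image."""
    return all(
        margin <= x <= width - margin and margin <= y <= height - margin
        for x, y in corners
    )
```

Checking the signature rather than the file extension avoids rejecting or accepting files that were simply renamed.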
Appendix
Changelog
5.82.0 - 2025-03-24
Internal improvements
5.81.0 - 2025-02-28
Internal improvements
5.80.0 - 2025-02-05
Internal improvements
5.79.1 - 2025-01-28
Internal improvements
5.79.0 - 2025-01-17
Internal improvements
5.78.0 - 2024-12-20
Internal improvements
5.77.0 - 2024-11-28
Internal improvements
5.76.0 - 2024-10-25
Internal improvements
5.75.0 - 2024-08-28
Internal improvements
5.74.0 - 2024-08-08
Internal improvements
5.73.0 - 2024-07-19
Internal improvements
5.72.0 - 2024-06-28
Internal improvements
5.71.0 - 2024-06-05
Internal improvements
5.70.0 - 2024-05-15
Internal improvements
5.69.0 - 2024-05-03
Internal improvements
5.68.0 - 2024-04-04
Internal improvements
5.67.0 - 2024-03-14
Internal improvements
5.66.0 - 2024-02-15
Increased minimum required Java version to 17
5.65.0 - 2024-02-01
Internal improvements
5.64.0 - 2024-01-11
Internal improvements
5.63.0 - 2023-12-21
Internal improvements
5.62.0 - 2023-12-07
Internal improvements
5.61.0 - 2023-11-16
Internal improvements
5.60.0 - 2023-10-27
Internal improvements
5.59.1 - 2023-10-19
Internal improvements
5.59.0 - 2023-10-03
Internal improvements
5.58.0 - 2023-09-14
Internal improvements
5.57.0 - 2023-08-17
Internal improvements
5.56.1 - 2023-07-31
Internal improvements
5.56.0 - 2023-07-27
Internal improvements
5.55.0 - 2023-06-09
Internal improvements
5.54.1 - 2023-06-06
Internal improvements
5.54.0 - 2023-05-19
Internal improvements
5.53.0 - 2023-04-27
Internal improvements
5.52.0 - 2023-03-24
Internal improvements
5.51.0 - 2023-03-03
Internal improvements
5.50.0 - 2023-02-10
Internal improvements
5.49.0 - 2023-01-25
Internal improvements
5.48.0 - 2022-12-16
Internal improvements
5.47.1 - 2022-11-29
Internal improvements
5.47.0 - 2022-11-25
AVX instruction set support added to reduce latency and increase throughput (a CPU supporting AVX instruction set required)
Internal improvements
5.46.0 - 2022-10-27
Internal improvements
5.45.0 - 2022-10-06
Internal improvements
5.44.0 - 2022-09-08
Internal improvements
5.43.0 - 2022-08-17
Increased transaction log retention to 455 days
Internal improvements
5.42.1 - 2022-08-11
Internal improvements
5.42.0 - 2022-08-03
Internal improvements
5.41.0 - 2022-07-14
Internal improvements
5.40.0 - 2022-07-07
Internal improvements.
5.39.0 - 2022-06-15
Internal improvements.
5.38.0 - 2022-05-18
Internal improvements.
5.37.1 - 2022-05-10
Internal improvements.
5.37.0 - 2022-05-04
Internal improvements.
5.36.0 - 2022-04-21
Internal improvements.
5.35.0 - 2022-03-31
Internal improvements.
5.34.0 - 2022-03-24
Fixed
Fixed classification problem when the text field filter is used.
5.33.0 - 2022-03-09
Changed
Internal improvements.
5.32.0 - 2022-02-14
Internal improvements.
5.31.0 - 2022-01-27
Changed
Internal improvements.
5.30.0 - 2022-01-13
Fixed
Fixed the position of ROI for text fields.
5.29.0 - 2021-12-16
Internal improvements.
5.28.0 - 2021-12-02
Internal improvements.
5.27.0 - 2021-12-02
Internal improvements.
5.26.0 - 2021-11-11
Changed
Document Authenticity functionality has been marked as deprecated and removed from documentation.
5.25.0 - 2021-10-28
Parameter innovatrics.dot.document.normalization.jpg-compression-quality changed to innovatrics.dot.document.jpg-compression-quality
5.24.0 - 2021-10-13
Changed
Update IFace to 4.13.0
Update thresholds and documentation for improved Display Attack Detection.
5.23.0 - 2021-09-29
Added
Check if logs directory is properly mounted to host machine when application is running in the docker image
5.22.0 - 2021-09-16
Changed
Internal improvements.
5.21.0 - 2021-09-14
Added
API: PassportResponse.Document.croppedImage: Cropped passport.
5.20.0 - 2021-09-10
Changed
An active DOT license is required to run the server.
Update IFace to 4.11.0
Deleted
Deleted config property: innovatrics.dot.iface.enabled.
5.19.0 - 2021-08-27
Changed
Internal improvements.
5.18.0 - 2021-08-23
Added
API: /api/v5/passports/: Added the endpoint to process passports (ICAO Document 9303)
Changed
API: All decimal numbers will be returned rounded to 7 decimal places.
5.17.0 - 2021-08-05
Internal improvements.
5.16.1 - 2021-07-22
Fixed
API: DocumentOcrResponse.textLabels: Removed undocumented field from the response
5.16.0 - 2021-07-15
Internal improvements.
5.15.0 - 2021-06-18
Added
Added Display Attack Detection. Enabled and configured IFace is required for this functionality.
Added config property: innovatrics.dot.iface.enabled: Enable IFace. When set to true, an active IFace license is needed.
API: DocumentOcrRequest.documentProperties.displayAttackDetection.enabled: Request the display attack detection.
API: DocumentOcrResponse.documentProperties.displayAttackDetection: Display attack detection result.
5.14.0 - 2021-05-14
Internal improvements.
5.13.0 - 2021-04-27
Changed
Internal improvements.
5.12.1 - 2021-04-09
Changed
API: DocumentOcrResponse.textFields: The textFields field will always be returned when the response is 200.
API: DocumentOcrResponse.imageFields: The imageFields field will always be returned when the response is 200.
5.12.0 - 2021-04-07
Added
API: DocumentOcrRequest.documentProperties.quality.enabled: Enable the quality check.
API: DocumentOcrResponse.documentProperties.quality: The quality check result.
Changed
API: DocumentOcrResponse.documentProperties: The document properties are now returned when the authenticity check or the quality check is enabled in the request. The document properties can contain the document color similarity and a result of the authenticity check and/or a result of the quality check. Therefore, the authenticity check result and the color similarity are optional in the document properties from now on.
5.11.0 - 2021-03-18
Changed
API: DocumentOcrResponse.documentProperties.colorProfile.similarityScoreLevel: Added VERY_LOW value
API: DocumentOcrResponse.documentProperties.authenticity.confidenceLevel: Added VERY_LOW value
API: DocumentOcrResponse.documentProperties.authenticity.suspiciousFields.textFields.confidenceLevel: Added VERY_LOW value
API: DocumentOcrResponse.documentProperties.authenticity.suspiciousFields.imageFields.confidenceLevel: Added VERY_LOW value
5.10.1 - 2021-03-08
Changed
Internal improvements.
5.10.0 - 2021-03-05
Changed
Internal improvements.
5.9.0 - 2021-02-26
Changed
Internal improvements.
5.8.0 - 2021-02-18
Changed
Internal improvements.
5.7.0 - 2021-01-15
Changed
Internal improvements.
5.6.0 - 2020-12-16
Changed
Internal improvements.
5.5.0 - 2020-12-14
Changed
Internal improvements.
5.4.0 - 2020-12-08
Changed
Internal improvements.
5.3.0 - 2020-12-04
Added
API: DocumentMetadataResponse.documentTypes.pages.textFields.valueNormalized: Flag to inform if the value in this field is being returned normalized.
5.2.0 - 2020-11-12
Changed
Internal improvements.
5.1.1 - 2020-10-30
Fixed
API: DocumentOcrResponse.documentProperties.authenticity.details: Removed from the response.
5.1.0 - 2020-10-30
Added
API: DocumentOcrResponse.documentProperties.authenticity.suspiciousFields: Text fields and image fields whose authenticity is suspicious.
Removed
API: DocumentOcrResponse.documentProperties.authenticity.details: The authenticity details were removed from the response.
5.0.0 - 2020-10-23
Added
Added the authenticity check feature.
API: DocumentOcrRequest.documentProperties.authenticity.enabled: Enable authenticity check.
API: DocumentMetadataResponse.documentTypes.pages.authenticity: The authenticity metadata.
API: DocumentOcrResponse.documentProperties: The document properties, which are returned only when the authenticity check is enabled in the request. The document properties contain a document color similarity and a result of the authenticity check.
Changed
Rename the application to Document Server.
New API version 5.
API: DocumentOcrResponse.textFields.confidence: Change the data type from Int to Double and the value interval from [0,1000] to [0,1].
API: DocumentOcrResponse.textFields.lines.confidence: Change the data type from Int to Double and the value interval from [0,1000] to [0,1].
4.26.0 - 2020-10-19
Changed
Internal improvements.
4.25.0 - 2020-10-12
Changed
API: DocumentOcrRequest.documentTypeAdvice.country: Case insensitive.
API: DocumentOcrRequest.documentTypeAdvice.type: Case insensitive.
API: DocumentOcrRequest.documentTypeAdvice.edition: Case insensitive.
API: DocumentOcrRequest.documentTypeAdvice.machineReadableTravelDocument: Case insensitive.
API: DocumentOcrRequest.documentTypeAdvice.pageTypes: Case insensitive.
4.24.0 - 2020-10-09
Changed
Internal improvements.
4.23.0 - 2020-10-01
Changed
Internal improvements.
4.22.0 - 2020-09-30
Changed
Internal improvements.
4.21.0 - 2020-09-10
Changed
Internal improvements.
4.20.0 - 2020-09-04
Changed
Internal improvements.
4.19.0 - 2020-08-25
Changed
Internal improvements.
4.18.0 - 2020-08-22
Changed
Internal improvements.
4.17.0 - 2020-08-13
Changed
Internal improvements.
4.16.0 - 2020-08-13
Changed
Internal improvements.
4.15.1 - 2020-08-10
Changed
Internal improvements.
4.14.0 - 2020-08-07
Changed
Internal improvements.
4.13.0 - 2020-07-30
Changed
Internal improvements.
4.12.0 - 2020-07-15
Changed
Internal improvements.
4.11.0 - 2020-07-02
Changed
Internal improvements.
4.10.0 - 2020-06-24
Changed
Internal improvements.
4.9.0 - 2020-06-22
Changed
Internal improvements.
4.8.0 - 2020-06-12
Changed
Internal improvements.
4.7.0 - 2020-06-11
Changed
Internal improvements.
4.6.0 - 2020-06-10
Changed
Internal improvements.
4.5.0 - 2020-06-08
Changed
Internal improvements.
4.4.0 - 2020-05-07
Fixed
Normalized image aspect ratio.
4.3.0 - 2020-04-27
Added
API: New attribute DocumentOcrResponse.normalizedDocumentImage: If an image of a normalized document should be present in a response.
Changed
API: DocumentOcrResponse.normalizedImage: Change to JPG image format.
API: DocumentOcrResponse.normalizedImage: Not present in response by default. (See DocumentOcrResponse.normalizedDocumentImage).
OCR performance optimization.
4.2.0 - 2020-04-17
Changed
Improve document image normalization.
4.1.0 - 2020-04-03
Added
New major release.
DOT Document Server Dockerfile example
FROM ubuntu:22.04

# Install dependencies
RUN set -ex && \
    apt-get update && apt-get install -y --no-install-recommends \
    openjdk-17-jre-headless \
    locales \
    libusb-0.1 \
    libgomp1 \
    && rm -rf /var/lib/apt/lists/*

# Set the locale
ENV LANG en_US.UTF-8
ENV LANGUAGE en_US:en
ENV LC_ALL en_US.UTF-8
RUN set -eux && \
    sed -i "/${LANG}/s/^# //g" /etc/locale.gen && \
    locale-gen

# Add a user to run the application
ARG UID=1000
ARG GID=1000
RUN set -eux && \
    addgroup --gid=${GID} dot-ds && \
    adduser --gid=${GID} --uid=${UID} --disabled-login --disabled-password dot-ds

# Add entrypoint script
COPY entrypoint.sh /usr/local/bin/

WORKDIR /srv/dot-document-server

# Add libs
ARG IFACE_LIB
COPY ${IFACE_LIB} ./libs/
COPY solvers/* ./libs/solvers/
ARG SAM_OCR_LIB
COPY ${SAM_OCR_LIB} ./libs/

# Config libs
RUN ldconfig "$(realpath libs)"

# Add application
ARG JAR_FILE
COPY ${JAR_FILE} ./app.jar

# Set timestamps and run user permissions
RUN touch ./*.jar

ENV CONFIG_DIR=/srv/dot-document-server/config
ENV LOGS_DIR=/srv/dot-document-server/logs
ARG JAVA_OPTS
ENV JAVA_OPTS=""

EXPOSE 8080

CMD ["entrypoint.sh"]
Entrypoint.sh script example
#!/bin/sh
set -eux

java $JAVA_OPTS \
  -Dspring.config.additional-location=file:$CONFIG_DIR/application.yml \
  -Dlogging.config=file:$CONFIG_DIR/logback-spring.xml \
  -jar app.jar