Brussels connection

the best address for international procurement

Patent Office needs to convert text and images to coded data



Provision of Optical Character Recognition processing and conversion of scanned images and PDF documents to coded data and data extraction

European Patent Organisation, Rijswijk NL

Contract for 36 months

Description: Every day the European Patent Office receives a large number of confidential new patent applications, forms and free-text letters, as well as patent and non-patent literature, in various formats. These documents relate to the patent grant process and contain plain text, tables, chemical and mathematical formulae, and drawings. They need to be OCR-converted into XML-structured data, where formulae, drawings and many tables can be captured as images. The requested Services cover XML conversion, extraction of citations and XML mark-up in line with the EPO’s capturing guidelines. On average, the Services relate to 2 million pages per month spread over 3 Lots.

All data shall be kept, and all Services shall be performed, within the territory of one of the EPC Member States. The Services shall be performed at the Contractor’s premises.

Lot 1 – Early OCR – Optical Character Recognition concerning patent documents requiring digitisation of text

Lot 2 – Citation Extraction (covering digitised patent documents requiring citation extraction)

Lot 3 – Forms OCR – Optical Character Recognition concerning images and/or PDFs of forms related to the patent grant procedure requiring the localisation, extraction and mark-up of information in such forms

Submission deadline: Jan 9, 2025

