Brussels connection

[the one that wins]

the best address for international procurement

Patent Office needs to convert text and images to coded data


Provision of Optical Character Recognition processing and conversion of scanned images and PDF documents to coded data and data extraction

European Patent Organisation, Rijswijk NL

Contract for 36 months

Description: Every day the European Patent Office receives a large number of confidential new patent applications, forms and free-text letters, as well as patent and non-patent literature, all relating to the patent grant process and arriving in various formats. These documents contain plain text, tables, chemical and mathematical formulae, and drawings. They need to be OCR-converted into XML-structured data, in which formulae, drawings and many tables can be captured as images. The requested Services cover XML conversion, extraction of citations and XML mark-up in line with the EPO’s capturing guidelines. On average, the Services cover 2 million pages per month, spread over 3 Lots.

All data shall be kept, and all Services performed, within the territory of one of the EPC Member States. The Services shall be performed at the Contractor’s premises.

Lot 1 – Early OCR – Optical Character Recognition concerning patent documents requiring digitisation of text

Lot 2 – Citation Extraction (covering digitised patent documents requiring citation extraction)

Lot 3 – Forms OCR – Optical Character Recognition concerning images and/or PDFs of forms related to the patent grant procedure requiring the localisation, extraction and mark-up of information in such forms

Submission deadline: Jan 9, 2025
