Brussels connection

the best address for international procurement

Confidential patent applications must be OCR-converted into XML-structured data. Ready?


Call: Provision of Optical Character Recognition processing and conversion of scanned images and PDF documents to coded data and data extraction

Buyer: European Patent Office, Munich, Germany

Deadline: Jan 1, 2025

Duration in months: 36

Description: The EPO receives a large number of confidential new patent applications, forms and free-text letters every day, as well as patent and non-patent literature relating to the patent grant process. These documents arrive in various formats and contain plain text, tables, chemical and mathematical formulae, and drawings. They need to be OCR-converted into XML-structured data, in which formulae, drawings and many tables can be captured as images. The requested Services cover XML conversion, extraction of citations and XML mark-up in line with the EPO’s capturing guidelines. On average, the Services relate to 2 million pages per month spread over 3 Lots. Service descriptions and volumes are subject to change.
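To give a rough idea of what "XML-structured data where formulae, drawings and tables are captured as images" could look like, here is a minimal Python sketch. It is an illustration only: the element and attribute names (`page`, `p`, `img`, `type`, `file`), the function `page_to_xml` and the sample file name are all hypothetical and are not taken from the EPO's capturing guidelines, which the call does not reproduce. The OCR step itself is assumed to have already happened.

```python
import xml.etree.ElementTree as ET


def page_to_xml(page_number, blocks):
    """Build a hypothetical XML element for one OCR-processed page.

    blocks is a list of (kind, value) tuples. For kind "text", value is the
    recognised text. For "formula", "drawing" or "table", value is the name
    of an image file holding the captured region.
    """
    page = ET.Element("page", {"number": str(page_number)})
    for kind, value in blocks:
        if kind == "text":
            # Recognised text becomes a paragraph element.
            para = ET.SubElement(page, "p")
            para.text = value
        elif kind in ("formula", "drawing", "table"):
            # Content that is not converted to text is referenced as an image.
            ET.SubElement(page, "img", {"type": kind, "file": value})
        else:
            raise ValueError(f"unknown block kind: {kind}")
    return page


# Illustrative input: two text blocks around a formula kept as an image.
blocks = [
    ("text", "A compound of formula (I):"),
    ("formula", "formula-0001.tif"),
    ("text", "wherein R is hydrogen."),
]
xml_string = ET.tostring(page_to_xml(1, blocks), encoding="unicode")
print(xml_string)
```

A real submission would follow whatever schema and mark-up rules the tender documents specify; the sketch only shows the general idea of mixing text elements with image references.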

All data shall be kept and services shall be performed within the territory of one of the EPC Member States. The Services shall be performed at Contractor’s premises.

Lot 1 – Early OCR – Optical Character Recognition concerning patent documents requiring digitisation of text

Lot 2 – Citation extraction covering digitised patent documents requiring citation extraction

Lot 3 – Optical Character Recognition concerning images and/or PDFs of forms related to the patent grant procedure requiring the localisation, extraction and mark-up of information in such forms
