Provision of Optical Character Recognition processing and conversion of scanned images and PDF documents to coded data and data extraction
European Patent Organisation, Rijswijk NL
Contract for 36 months
Description: Every day the European Patent Office receives, in various formats, a large number of documents relating to the patent grant process: confidential new patent applications, forms, free-text letters, and patent and non-patent literature containing plain text, tables, chemical and mathematical formulae, and drawings. These documents need to be OCR-converted into XML-structured data, in which formulae, drawings and many tables can be captured as images. The requested Services cover XML conversion, extraction of citations and XML mark-up in line with the EPO’s capturing guidelines. On average, the Services relate to 2 million pages per month spread over 3 Lots.
All data shall be kept and services shall be performed within the territory of one of the EPC Member States. The Services shall be performed at the Contractor’s premises.
Lot 1 – Early OCR – Optical Character Recognition concerning patent documents requiring digitisation of text
Lot 2 – Citation Extraction (covering digitised patent documents requiring citation extraction)
Lot 3 – Forms OCR – Optical Character Recognition concerning images and/or PDFs of forms related to the patent grant procedure requiring the localisation, extraction and mark-up of information in such forms
Submission deadline: Jan 9, 2025