Vincent CHRISTLEIN
(Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany)

Keywords: Writer recognition, historical documents, writer retrieval

Abstract:
Handwritten documents were a fundamental part of communication until the end of the twentieth century. Consequently, there is a large stock of historical handwriting, which is gradually being digitized and is the subject of research in many respects. This presentation focuses on automatic writer identification and retrieval. While a paleographer or historian can typically distinguish different scribes well, the sheer mass of available documents makes an exhaustive manual search infeasible. We first give an overview of the development of writer recognition. We show that on clean benchmark data, only a small portion of text is necessary to obtain very high recognition rates. Historical data, however, is much more challenging because varying preservation conditions produce artifacts in the digitized images, such as rips and holes. In the remainder of the presentation, we focus on methods for dealing with historical data. We propose a method that learns robust local features by means of deep neural networks in an unsupervised fashion. These local descriptors are then encoded to form a global representation. Finally, these representations are classified with exemplar classifiers. We evaluate our method on a large historical dataset of more than 700 writers, each of whom contributed five samples. Our proposed method obtains a recognition accuracy of nearly 90%.
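
The encoding and comparison steps described in the abstract can be illustrated with a minimal sketch. It is not the presented method. It assumes a VLAD-style residual encoding, which is one common choice, since the abstract does not name the encoding. It uses a plain cosine-similarity ranking in place of the exemplar classifiers. The descriptors, the two cluster centers, and the function names `vlad_encode` and `rank_by_similarity` are all hypothetical; in practice the local descriptors would come from the trained network.

```python
import math

def vlad_encode(descriptors, centers):
    """Aggregate local descriptors into one L2-normalized global vector.

    Assumption: VLAD-style encoding, i.e. sum the residuals between each
    descriptor and its nearest cluster center, then concatenate.
    """
    dim = len(centers[0])
    residuals = [[0.0] * dim for _ in centers]
    for d in descriptors:
        # Hard-assign the descriptor to its nearest center.
        k = min(range(len(centers)), key=lambda i: math.dist(d, centers[i]))
        for j in range(dim):
            residuals[k][j] += d[j] - centers[k][j]
    flat = [v for r in residuals for v in r]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat

def rank_by_similarity(query, gallery):
    """Return gallery indices ordered from most to least similar to query.

    Assumption: cosine similarity stands in for the exemplar classifiers.
    Inputs are L2-normalized, so the dot product is the cosine.
    """
    def cosine(a, b):
        return sum(x * y for x, y in zip(a, b))
    return sorted(range(len(gallery)), key=lambda i: -cosine(query, gallery[i]))

# Toy 2-D descriptors (hypothetical); real descriptors are high-dimensional.
centers = [[0.0, 0.0], [10.0, 10.0]]
writer_a_page1 = vlad_encode([[1.0, 0.0], [2.0, 0.0]], centers)
writer_a_page2 = vlad_encode([[1.5, 0.0], [1.0, 0.1]], centers)
writer_b_page1 = vlad_encode([[0.0, 1.0], [10.0, 11.0]], centers)

gallery = [writer_b_page1, writer_a_page2]
ranking = rank_by_similarity(writer_a_page1, gallery)
print(ranking)  # the other page by writer A (index 1) is ranked first
```

In a retrieval setting, the ranking returned for a query page would be presented to the paleographer or historian as a shortlist of candidate pages by the same scribe.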

Relevance for the conference: Identifying the correct writer in large historical datasets can be very time-consuming; assistive methods can therefore be of great help in speeding up the process.
Relevance for the session: Writer recognition methods trained on clean benchmark data typically do not work well for historical data, such as archeological material.
Innovation: We compare results on common writer recognition benchmarks with those obtained in realistic scenarios using historical data.
References:

  • V. Christlein, M. Gropp, S. Fiel, and A. Maier. “Unsupervised Feature Learning for Writer Identification and Writer Retrieval”. In: 2017 14th International Conference on Document Analysis and Recognition (ICDAR). Kyoto, Japan, Nov. 2017. doi: 10.1109/ICDAR.2017.165
  • V. Christlein and A. Maier. “Encoding CNN Activations for Writer Recognition”. In: 2018 13th IAPR International Workshop on Document Analysis Systems (DAS). Vienna, Austria, 2018, pp. 169-174. doi: 10.1109/DAS.2018.9