Articles

Text detection and reading

Build a Handwritten Text Recognition System using TensorFlow

This article explains how to build a modern (handwritten) text recognition system. Besides a theoretical introduction, the code of the SimpleHTR repository is discussed.

FAQ: Build a Handwritten Text Recognition System using TensorFlow

There were some questions regarding the last article which are discussed here: how to use custom datasets, how to detect text on line-level and how to compute a confidence score for the recognized text.

An Intuitive Explanation of Connectionist Temporal Classification

CTC is an essential part of modern text and speech recognition systems. This article explains how CTC works without hiding the clever ideas it is based on behind complicated formulas.

Beam Search Decoding in CTC-trained Neural Networks

To get the recognized text from a CTC-trained neural network, its output must be decoded. Beam search decoding is a fast and well-performing algorithm which optionally can integrate a character-level language model.

Word Beam Search: A CTC Decoding Algorithm

Word beam search is an extension to the vanilla beam search algorithm. It avoids spelling mistakes of words, allows arbitrary numbers and punctuation marks between words and optionally makes use of a word-level language model. This article gives a high-level overview of how the algorithm works.

Handwritten Word Detector

Documentation of the handwritten word detector implementation.

Deep learning deep dives

Padding in deep learning models

The article presents two experiments that show the influence of padding in deep learning models.

Pooling in deep learning models

Even though pooling layers are omnipresent in convolutional neural networks, they can cause problems due to not being shift invariant.

Quasi Cauchy Optimizer

Some notes on the implementation of the quasi cauchy optimizer. It uses a diagonal approximation of the Hessian and therefore is very memory-efficient.