run with font class priors #7
Comments
This option can easily be added to the font classifier, which could improve performance and will ensure these classes are never in the predictions, so I will do that.
Yes, I guess that would require changing the network of COCR, with an input-as-output scheme (i.e. representing the font as an additional output dimension).
It's also linked to the training process of the model: at the beginning we have dedicated modules specialized on specific fonts, but then the whole network is fine-tuned at once. The result is a kind of interlinked structure where these modules are no longer specialized on specific fonts, but most probably on a mix of fonts.
Oh, interesting. I do think this would still be compatible with an input-as-output extension. The network would simply (be forced to) learn to factor this in at every phase (perhaps with some custom regularizer). Or you just add it as another (uninitialized) layer during the fine-tuning phase.
So far, we have tried once to modify the COCR architecture to also output font groups at character level, which partially (but not fully) forced the different components to specialize in different font groups. Unfortunately, it had a negative impact on the CER. Investigating this further is on our to-do list, but it will take time, as training combined OCR models isn't that fast.
It would be really nice if it were possible to constrain the font predictions to classes known in advance. This could be implemented in the OCR-D wrapper by suppressing certain results from the prediction, but ideally it's passed to the neural network decoder so all the probability mass gets reassigned.
For example, if I know the document only contains Fraktur and Antiqua, or Hebrew and Greek, or Antiqua and Italic and Manuscript, or Gotico-Antiqua and Schwabacher, then I don't want to risk "surprise" outliers (or systematic misclassification as in the Greek-Italic example).
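To make the request concrete, here is a minimal sketch of what "reassigning the probability mass" could look like at the classifier level: the logits of disallowed classes are dropped and the softmax is renormalized over the remaining ones. The label list `FONT_CLASSES`, the function `constrained_probabilities`, and the example logits are all hypothetical and not taken from the project's code; it also only covers a per-line or per-page classifier output, not the COCR decoder itself.

```python
import math

# Hypothetical label set, for illustration only; the real classifier's
# classes and their order may differ.
FONT_CLASSES = ["Antiqua", "Fraktur", "Greek", "Hebrew", "Italic"]


def constrained_probabilities(logits, classes, allowed):
    """Softmax restricted to the allowed classes.

    Disallowed classes get probability 0, and the remaining probabilities
    are renormalized so that they sum to 1.
    """
    unknown = set(allowed) - set(classes)
    if unknown:
        raise ValueError(f"unknown font classes: {sorted(unknown)}")
    kept = [i for i, name in enumerate(classes) if name in allowed]
    if not kept:
        raise ValueError("at least one font class must be allowed")

    # Subtract the largest kept logit for numerical stability.
    peak = max(logits[i] for i in kept)
    weights = {i: math.exp(logits[i] - peak) for i in kept}
    total = sum(weights.values())

    return {
        name: (weights[i] / total if i in weights else 0.0)
        for i, name in enumerate(classes)
    }


# Example: the unconstrained argmax would be "Greek", but the document is
# known to contain only Fraktur and Antiqua.
example_logits = [1.0, 2.0, 3.0, 0.5, 2.5]
probs = constrained_probabilities(
    example_logits, FONT_CLASSES, allowed={"Antiqua", "Fraktur"}
)
best = max(probs, key=probs.get)
```

Simply deleting unwanted labels from the final result, as a wrapper-side filter would, could leave no prediction at all when the top class is excluded; renormalizing over the allowed set always yields a valid answer among the known classes.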