add 1st draft line GT/training specs #105
I strongly recommend the introduction of a third GT subset, `devel`.
In addition, some minor comments.
@wrznr Can you elaborate?
Most training procedures allow for the application of three different sets of GT:
@wrznr So far I have mainly applied k-fold cross-validation. Would you still see added benefits in partitioning into three sets?
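To make the three-set proposal concrete, here is a minimal sketch of partitioning line GT into disjoint train, devel and test subsets. The function name `split_ground_truth`, the fractions, and the line IDs are hypothetical and not part of the spec under review; the sketch only assumes that each GT line has a unique identifier.

```python
import random


def split_ground_truth(line_ids, devel_fraction=0.1, test_fraction=0.1, seed=0):
    """Partition line IDs into disjoint train / devel / test subsets.

    Hypothetical helper for illustration; the fractions are assumptions,
    not values prescribed by the spec.
    """
    if devel_fraction < 0 or test_fraction < 0:
        raise ValueError("fractions must be non-negative")
    if devel_fraction + test_fraction >= 1:
        raise ValueError("devel and test fractions must sum to less than 1")

    shuffled = list(line_ids)
    # A seeded generator keeps the partition reproducible across runs.
    random.Random(seed).shuffle(shuffled)

    n_devel = int(len(shuffled) * devel_fraction)
    n_test = int(len(shuffled) * test_fraction)

    devel = shuffled[:n_devel]
    test = shuffled[n_devel:n_devel + n_test]
    train = shuffled[n_devel + n_test:]
    return train, devel, test


# Made-up line identifiers, purely for demonstration.
line_ids = [f"line_{i:04d}" for i in range(100)]
train, devel, test = split_ground_truth(line_ids)
```

The two approaches can be combined: k-fold cross-validation could still be run over the train and devel portions, while the test subset stays untouched until the final evaluation.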
- groundTruthBag
- model
properties:
engineName:
Delete. Derived from model.
- kraken
- tesseract
- calamari
engineVersion:
Delete. Derived from model.
required: false
default: 'image/png'
values:
- 'image/png'
Would it make more sense to differentiate between compressed TIFF and JPEG 2000?
You mean additionally allow `image/jp2`? Do engines allow JPEG 2000 input for training?
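As a rough illustration of what the quoted parameter definition implies (optional, default `image/png`, enumerated `values`), the following sketch resolves a user-supplied value against such a definition. The dictionary `MEDIA_TYPE_PARAM` and the helper `resolve_parameter` are hypothetical; they mirror the three keys shown in the diff and are not taken from any OCR-D implementation.

```python
# Hypothetical mirror of the parameter definition quoted in the diff.
MEDIA_TYPE_PARAM = {
    "required": False,
    "default": "image/png",
    "values": ["image/png"],
}


def resolve_parameter(spec, value=None):
    """Return the effective parameter value, or raise ValueError.

    Assumed semantics: a missing value falls back to the default unless
    the parameter is required; a given value must be one of the
    enumerated values, if any are listed.
    """
    if value is None:
        if spec.get("required", False):
            raise ValueError("parameter is required but no value was given")
        return spec.get("default")
    allowed = spec.get("values")
    if allowed is not None and value not in allowed:
        raise ValueError(f"{value!r} is not one of {allowed!r}")
    return value


# With the definition as quoted, only PNG is accepted.
effective = resolve_parameter(MEDIA_TYPE_PARAM)

# Allowing JPEG 2000 would amount to extending the enumeration.
extended = dict(MEDIA_TYPE_PARAM, values=["image/png", "image/jp2"])
jp2 = resolve_parameter(extended, "image/jp2")
```

Under these assumed semantics, accepting `image/jp2` is a one-line change to the enumeration; whether the training engines can actually read such input is the open question raised above.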
BagIt-Profile-Info:
BagIt-Profile-Identifier: https://ocr-d.github.io/gt-profile.json
BagIt-Profile-Version: '1.2.0'
Source-Organization: OCR-D
What about information about the origin of the digitized lines?
- minimal bibliographic record based on DC?
- and artificially generated lines (+ degeneration)
- what about the degeneration algorithm?
I think this comment may be in the wrong place. It probably belongs under the "Line metadata" section.
See https://github.com/OCR-D/spec/pull/105/files/6827085d051e945062203b82ef921e54025cfbda#diff-ee256e83a17cfe309565c88ab376091a for the definition of what is currently supposed to be in there. Bibliographic metadata would be in the METS referred to by `metsUrl`. I am not sure how to encode provenance at the line level, though. @VolkerHartmann?
@wrznr Do your remaining
@Doreenruirui's work on okralact has diverged significantly from these specs. It makes little sense to publish these specs when the only implementation implements them differently. @Doreenruirui can you compare your schemas and documentation with this so we can integrate that part of okralact into the specs?
@kba I am sorry that I am not very familiar with GitHub. Can you point me to the document I should compare with my schemas?
@Doreenruirui We're discussing these changes/new files: https://github.com/OCR-D/spec/pull/105/files. In particular, I would like to harmonize the proposal here (https://github.com/OCR-D/spec/pull/105/files?file-filters%5B%5D=.md#diff-2ae93b1f468c44b9f7e195133a0fb539) of using BagIt for the line GT with your approach in okralact with respect to input format. It would also be interesting to compare okralact's engine schemas with the schemas proposed here: https://github.com/OCR-D/spec/pull/105/files?file-filters%5B%5D=.md&file-filters%5B%5D=.yml#diff-690d5874f98dfbd6737bc0168b6084d8 and https://github.com/OCR-D/spec/pull/105/files?file-filters%5B%5D=.md&file-filters%5B%5D=.yml#diff-a1f62fd4dd219fc5c5d5f0ccb419c88b
My original review does not relate to the current state very much.