Make the Categories configurable #59
Comments
yep, that'll work!
I have two ideas, and I would like to ask your opinion.

Case 1: the categories at the field level are arbitrary, so you can add anything you want; there is no check at all.

Case 2: you must also add a "categories" list at the schema level, which contains all possible values. It behaves as a controlled vocabulary, and every field-level category MUST be in this list.

Here is an example for Case 2:

    format: json
    fields:
      - name: edm:ProvidedCHO/@about
        path: $.['providedCHOs'][0]['about']
        categories:
          - MANDATORY
      - name: Proxy/dc:title
        path: $.['proxies'][?(@['europeanaProxy'] == false)]['dcTitle']
        categories:
          - DESCRIPTIVENESS
          - SEARCHABILITY
          - IDENTIFICATION
          - MULTILINGUALITY
          - CUSTOM
      - name: Proxy/dcterms:alternative
        path: $.['proxies'][?(@['europeanaProxy'] == false)]['dctermsAlternative']
        categories:
          - DESCRIPTIVENESS
          - SEARCHABILITY
          - IDENTIFICATION
          - MULTILINGUALITY
    groups:
      - fields:
          - Proxy/dc:title
          - Proxy/dc:description
        categories:
          - MANDATORY
    categories:
      - MANDATORY
      - DESCRIPTIVENESS
      - SEARCHABILITY
      - IDENTIFICATION
      - MULTILINGUALITY
      - CUSTOM

Note the categories in the last section. It gives the schema creator a bit more work, but keeps the categories consistent. If that section is missing, the default list against which the tool compares the field categories will be the current enumeration. Would you vote for Case 1 or Case 2?
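To make the Case 2 check concrete, here is a minimal sketch of how field-level categories could be compared against the schema-level list, falling back to a default list when the schema does not define one. The class name, method name, and the contents of `DEFAULT_CATEGORIES` are assumptions made for illustration; they are not taken from the tool's code, and the real enumeration may contain other values.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class CategoryVocabularyCheck {

    // Placeholder for the tool's current enumeration (assumed values, names from the example).
    static final List<String> DEFAULT_CATEGORIES = List.of(
        "MANDATORY", "DESCRIPTIVENESS", "SEARCHABILITY",
        "IDENTIFICATION", "MULTILINGUALITY");

    /**
     * Returns the field-level categories that are not part of the vocabulary.
     * If the schema defines no categories, the default list is used instead.
     */
    static List<String> unknownCategories(List<String> schemaCategories,
                                          List<String> fieldCategories) {
        List<String> vocabulary =
            (schemaCategories == null || schemaCategories.isEmpty())
                ? DEFAULT_CATEGORIES
                : schemaCategories;
        Set<String> allowed = new LinkedHashSet<>(vocabulary);

        List<String> unknown = new ArrayList<>();
        for (String category : fieldCategories) {
            if (!allowed.contains(category)) {
                unknown.add(category);
            }
        }
        return unknown;
    }

    public static void main(String[] args) {
        List<String> schema = List.of("MANDATORY", "SEARCHABILITY", "CUSTOM");
        List<String> field = List.of("SEARCHABILITY", "CUSTOM", "TYPO");

        // With a schema-level list, only TYPO is rejected.
        System.out.println(unknownCategories(schema, field)); // [TYPO]
        // Without one, CUSTOM is not in the default list either.
        System.out.println(unknownCategories(null, field));   // [CUSTOM, TYPO]
    }
}
```

Under Case 1 this check would simply be skipped, and any string would be accepted as a category.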
I would say 1, because 2 won't add much in practice except redundancy and less transparency. You can still implement the current situation if the categories at the field level are arbitrary. In fact, the schema doesn't even have to change. Agreed, there is no check, but I think it's up to the writer of the schema to do it properly :) The worst that can happen is that the results are wrongly classified. Repeating the list at the schema level won't entirely prevent this mistake. But I don't have the full picture of course; is there something I'm missing about the current list of categories?
Thanks! There is only one more thing I forgot to mention. The final output (the order of columns in the output CSV or in the Java collection) could be sorted against this canonical list. Otherwise the order will be set on a first-come, first-served basis.
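The difference between the two orderings can be sketched as follows. This is only an illustration under assumed names (`CategoryColumnOrder`, `firstSeenOrder`, `canonicalOrder`); it does not reflect how the tool actually builds its output columns. Categories missing from the canonical list are placed last here, which is one possible choice.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class CategoryColumnOrder {

    /** First-come, first-served: categories in order of first appearance across the fields. */
    static List<String> firstSeenOrder(List<List<String>> fieldCategories) {
        Set<String> seen = new LinkedHashSet<>();
        for (List<String> categories : fieldCategories) {
            seen.addAll(categories);
        }
        return new ArrayList<>(seen);
    }

    /**
     * Categories sorted by their position in the canonical list.
     * Categories not in the canonical list go last, keeping first-seen order
     * (the sort is stable).
     */
    static List<String> canonicalOrder(List<String> canonical,
                                       List<List<String>> fieldCategories) {
        List<String> result = firstSeenOrder(fieldCategories);
        result.sort(Comparator.comparingInt(category -> {
            int index = canonical.indexOf(category);
            return index < 0 ? Integer.MAX_VALUE : index;
        }));
        return result;
    }

    public static void main(String[] args) {
        List<String> canonical = List.of(
            "MANDATORY", "DESCRIPTIVENESS", "SEARCHABILITY",
            "IDENTIFICATION", "MULTILINGUALITY", "CUSTOM");
        List<List<String>> fields = List.of(
            List.of("SEARCHABILITY", "CUSTOM"),
            List.of("MANDATORY", "SEARCHABILITY"));

        System.out.println(firstSeenOrder(fields));            // [SEARCHABILITY, CUSTOM, MANDATORY]
        System.out.println(canonicalOrder(canonical, fields)); // [MANDATORY, SEARCHABILITY, CUSTOM]
    }
}
```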
I implemented it, but it requires some changes in the API. It is no longer possible to pass categories to the JsonBranch constructor; one has to use a separate setter call instead. Here is an example.

Old style:

    new JsonBranch("Proxy/dc:title", "$.['dcTitle']",
        Category.DESCRIPTIVENESS,
        Category.SEARCHABILITY,
        Category.IDENTIFICATION,
        Category.MULTILINGUALITY);

New style:

    new JsonBranch("Proxy/dc:title", "$.['dcTitle']")
        .setCategories(
            Category.DESCRIPTIVENESS,
            Category.SEARCHABILITY,
            Category.IDENTIFICATION,
            Category.MULTILINGUALITY
        );

One more thing: if the schema configuration has the
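For readers unfamiliar with the pattern, the change from constructor arguments to a chained setter can be sketched with a self-contained stand-in. `Branch` and the nested `Category` enum below are hypothetical simplifications written for this illustration; they are not the library's `JsonBranch` or `Category` classes, and the real method may behave differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FluentCategoriesSketch {

    // Hypothetical stand-in for the library's Category enumeration.
    enum Category { DESCRIPTIVENESS, SEARCHABILITY, IDENTIFICATION, MULTILINGUALITY }

    // Hypothetical stand-in for JsonBranch: categories are set after construction.
    static class Branch {
        private final String label;
        private final String jsonPath;
        private List<Category> categories = new ArrayList<>();

        Branch(String label, String jsonPath) {
            this.label = label;
            this.jsonPath = jsonPath;
        }

        // Returns this so the call can be chained directly onto the constructor.
        Branch setCategories(Category... categories) {
            this.categories = new ArrayList<>(Arrays.asList(categories));
            return this;
        }

        String getLabel() { return label; }
        String getJsonPath() { return jsonPath; }
        List<Category> getCategories() { return categories; }
    }

    public static void main(String[] args) {
        Branch title = new Branch("Proxy/dc:title", "$.['dcTitle']")
            .setCategories(
                Category.DESCRIPTIVENESS,
                Category.SEARCHABILITY,
                Category.IDENTIFICATION,
                Category.MULTILINGUALITY);

        // Prints [DESCRIPTIVENESS, SEARCHABILITY, IDENTIFICATION, MULTILINGUALITY]
        System.out.println(title.getCategories());
    }
}
```

Returning `this` from the setter keeps the declaration a single expression, so existing call sites only need to move the category arguments out of the constructor.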
Sounds good to me!
@pkiraly I think you can close this one |
@mielvds suggested the following: Would it be an idea to enable custom extensions of this list with arbitrary groups?
Right now the category is an enumeration and cannot be configured.