Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feat: footnotes validation #702

Open
wants to merge 13 commits into
base: master
Choose a base branch
from

Conversation

Rossi-Luciano
Copy link
Collaborator

O que esse PR faz?

Este PR adiciona uma série de validações para notas de rodapé (<fn>) e elementos de correspondência (<corresp>) em artigos XML, garantindo conformidade com versões específicas do DTD e padrões de marcação. As principais funcionalidades incluídas são:

  1. Validação de tipo de nota de rodapé (coi_statement_vs_conflict_by_dtd_validation): Verifica se o atributo fn-type está correto conforme a versão DTD ("coi-statement" para DTD >= 1.3 e "conflict" para versões menores).
  2. Validação de rótulos ausentes: Checa a presença de <label> em <corresp> e <fn>, gerando alertas se estiverem ausentes.
  3. Validação da presença de <title> e <bold>: Gera erros fatais quando <title> ou <bold> são encontrados dentro de <corresp> ou <fn>, orientando para sua remoção.

Onde a revisão poderia começar?

Por commit.

Como este poderia ser testado manualmente?

Para testar manualmente as funcionalidades:

  1. Carregue um documento XML que inclua notas de rodapé (<fn>) e elementos de correspondência (<corresp>).
  2. Execute as validações implementadas, verificando se os erros e avisos são gerados corretamente para os cenários:
  • <fn> com fn-type incorreto em relação à versão DTD.
  • Falta de <label> dentro de <corresp> ou <fn>.
  • Presença indevida de <title> e <bold> dentro de <corresp> ou <fn>.
  1. Confirme que as mensagens de validação são claras e indicam corretamente as ações recomendadas para corrigir os erros.

Algum cenário de contexto que queira dar?

NA

Screenshots

NA

Quais são tickets relevantes?

NA

Referências

NA

@@ -22,13 +22,14 @@ def data(self):

class BaseNoteGroups:
def __init__(self, article_or_sub_article_node, fn_parent_tag_name, NoteGroupClass):
self.article_or_sub_article_node = article_or_sub_article_node
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Rossi-Luciano estava correto antes... A instanciação deveria ocorrer antes

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sub_item="title",
validation_type="exist",
is_valid=False,
expected="<title> not in <corresp>",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Rossi-Luciano Trocar por expected="<label> in <corresp>"

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is_valid=False,
expected="<title> not in <corresp>",
obtained=f'<title>{self.fns_dict.get("corresp_title")}</title>',
advice="Remove <title> from <corresp>",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Rossi-Luciano replace <title> by <label>

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sub_item="title",
validation_type="exist",
is_valid=False,
expected="<title> not in <fn>",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Rossi-Luciano o mesmo que mencionado em corresp

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sub_item="bold",
validation_type="exist",
is_valid=False,
expected="<bold> not in <corresp>",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Rossi-Luciano recomende usar label no lugar de bold

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -80,67 +81,171 @@ def fn_validation(self):
if not dtd:
return

fns = ArticleFootnotes(self.xmltree)
for fn in fns.article_footnotes:
for fn in self.fns_dict.get("fns"):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Rossi-Luciano uso excessivo de for, sendo que há a classe FootnoteValidation e FootnotesValidation.

Na FootnoteValidation (singular), crie todas as validações SEM for, é para ser aplicado em um item do self.fns_dict.get("fns")

Na FootnotesValidation (plural), faça 1 for

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self.fn_parent_tag_name = fn_parent_tag_name
self.parent = article_or_sub_article_node.tag
self.parent_id = article_or_sub_article_node.get("id")
self.parent_lang = article_or_sub_article_node.get("{http://www.w3.org/XML/1998/namespace}lang")
self.parent_article_type = article_or_sub_article_node.get("article-type")
self.NoteGroupClass = NoteGroupClass
self.article_or_sub_article_node = article_or_sub_article_node \
if self.parent == "sub-article" else article_or_sub_article_node.find("./")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Rossi-Luciano só vai pegar o front

In [27]: from lxml import etree

In [28]: root = etree.fromstring(xml)

In [29]: x = root.find('.').find("./")

In [30]: x
Out[30]: <Element front at 0x1125c6340>

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Rossi-Luciano No método items, faça:

if self.article_or_sub_article_node == "article":
    xpath = f".//front//{self.fn_parent_tag_name} | .//body//{self.fn_parent_tag_name} | .//back//{self.fn_parent_tag_name}"
else:
    xpath = f".//{self.fn_parent_tag_name}"

for fn_parent_node in self.article_or_sub_article_node.xpath(xpath):
    data = self.NoteGroupClass(fn_parent_node).data
    yield put_parent_context(data, self.parent_lang, self.parent_article_type, self.parent, self.parent_id)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


parent : str
The parent element.
def missing_fn_label_validation(self, error_level="WARNING"):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Rossi-Luciano métodos em comum de Author e foot podem ter uma classe Base

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

class ArticleNotesValidation:
def __init__(self, xml_tree):
self.xml_tree = xml_tree
self.fns_dict = ArticleNotes(self.xml_tree).all_notes()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Rossi-Luciano melhor ter um método para recuperar separadamente authornotes e footnotes

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


expected : str
The expected value.
def article_notes_validation(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Rossi-Luciano Tenha um método só para validar footnotes e authornotes separadamente

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines 259 to 263
def author_validation(self, fn_dict):
author_validation = AuthorNoteValidation(fn_dict)
yield author_validation.missing_corresp_label_validation()
yield author_validation.title_presence_in_corresp_validation()
yield author_validation.bold_presence_in_corresp_validation()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Rossi-Luciano mover isso para um método validate de AuthorNoteValidation

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@robertatakenaka, podemos discutir isso?

advice=f"replace {obtained} with {expected}",
data=fn
)
def footnote_validation(self, fn):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Rossi-Luciano mover isso para um método validate de FootnoteValidation

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@robertatakenaka, podemos discutir isso?

def __init__(self, note_dict, element_tag):
self.note_dict = note_dict
self.element_tag = element_tag
self.check_element_existence = bool(self.note_dict.get("corresp")) if self.element_tag == "corresp" else True
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Rossi-Luciano se é classe Base, não deve ter conteúdo referente a AuthorNotes

class FootnoteValidation:
class AuthorNoteValidation(BaseNoteValidation):
def __init__(self, fn_dict):
super().__init__(note_dict=fn_dict, element_tag='corresp')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@robertatakenaka
Copy link
Member

@Rossi-Luciano foram feitas mudanças desnecessárias e as solicitadas não foram feitas. Vamos fazer juntos. Sugiro retornar ao que era antes

@@ -1,19 +1,84 @@
from packtools.sps.models.footnotes import ArticleFootnotes
from packtools.sps.models.article_and_subarticles import ArticleAndSubArticles
from packtools.sps.models.v2.notes import ArticleNotes
from packtools.sps.validation.utils import format_response
from packtools.sps.validation.exceptions import ValidationFootnotes
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usar nome mais adequado para a exceção ValidationFootnotes. Este nome deve conter no final Error ou Exception. E ter no nome o tipo de erro, por exemplo: FileNotFoundError está explícito que é um erro de arquivo não encontrado.

)


class ArticleNotesValidation:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Rossi-Luciano acho que não tem sentido ter uma classe para validar os dois tipos de notas.
Faz mais sentido ter a validação de author notes na classe de ContribGroup, por exemplo.
Se tiver dúvida disso, fazemos juntos.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants