The breaking changes in lxml version 5.2 (seen in #2351) made me notice that we currently use two tools (lxml and bs4) for the same purpose: extracting plain text from HTML content. We should consolidate these rather than keep two tools without a good reason (and I don't see one in this case). Since bs4 seems to be the more versatile, forgiving, and easier-to-use parser, I would opt to remove lxml.
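For illustration, here is a minimal sketch of what plain-text extraction with bs4 alone could look like. It assumes we use the standard library's `html.parser` backend so that lxml is not needed at all; `html_to_text` is a hypothetical helper name, not an existing function in our codebase, and dropping `script`/`style` tags is an assumption about what we want to exclude.

```python
from bs4 import BeautifulSoup


def html_to_text(html: str) -> str:
    """Return the visible text of an HTML fragment (hypothetical helper)."""
    # "html.parser" is the pure-Python backend from the standard library,
    # so this path does not import lxml.
    soup = BeautifulSoup(html, "html.parser")

    # Assumption: script and style contents are not wanted in the plain text.
    for tag in soup(["script", "style"]):
        tag.decompose()

    # Join the remaining text nodes with single spaces, stripping each one.
    return soup.get_text(separator=" ", strip=True)


if __name__ == "__main__":
    print(html_to_text("<p>Hello <b>world</b></p><script>x = 1</script>"))
```

Note that bs4 picks lxml as its default backend when it is installed and no parser is named, so passing `"html.parser"` explicitly matters if the goal is to drop the dependency.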