-
-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
XML schema conforms to v1.1 of the schema spec but not to v1.0 #113
Comments
For example, when trying to validate Security/SafeRedirectStandard.xml with Xerces2-j (in schema 1.0 mode), the following error message is output:
(even when there is no element in the XML instance that could potentially match the wildcard) |
Hi @pbiron! Thanks for the comments, I wasn't aware we violated anything (I never got any notice about the schema version when working on it from my IDE). I guess I should have made better research, as I'm not really that knowledgeable about XML schemas 😬 The original discussion was started in this issue. The original PR was also made in the WPCS repo. What I used as a basis of the XSD file is the default phpcs.xsd schema. Is there a tool we can use to verify the validity of the schema against a certain XML version spec? I think @jrfnl won't be against a PR that will make the schema even better 🙂 |
@pbiron Thanks for opening this issue to discuss your concerns. The use of wildcards in select places is intentional as we didn't want to make the XSD more restrictive than is strictly necessary.
This XSD is not intended for processing via PHP. It's intended use is with XMLlint. See the usage instructions in the README. And we do actually validate the XSD against the W3C specs during CI and the XSD validates against those:
The target public for this XSD file is very limited: it is a tool which is only interesting for maintainer of sniff libraries. Suggestions to improve the XSD are definitely welcome, but please don't take offense if any are rejected as, as I state above, the goal is for it to be a helpful tool for sniff developers and not to restrict them more than necessary. |
As a side-note: the 1.1 version of the specs, which we used during our research when creating the XSD, was published in 2012, so any tooling not compatible with version 1.1 has had 10 years to update already. |
Could: <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified"> ...be changed to: <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema/v1.1" elementFormDefault="qualified"> ...as per https://www.w3.org/TR/xmlschema11-1/#langids to explictely indicate that it's written with the XSD 1.1 spec? |
@GaryJones Sounds like a good improvement to make. |
Thanx for all the replies everyone. My only goal in opening this issue is to make sure that the XML Schema is usable in the widest array of XML schema processors...just as WP tries to be available on the widest array of PHP/webserver environments. Since the XML instances are very simple, I believe there is no need to write a schema that doesn't conform to v1.0 of the schema spec. As one of the editors of v1.0 of the XML Schema spec (https://www.w3.org/TR/xmlschema-2/) and a member of the XML Schema Working Group that produced v1.1 of the schema spec, XML Schema is something that I am well versed in and am just trying to contribute that knowledge to this project. Julliette, I agree that using wildcards is a good thing...and I'm not suggesting removing the wildcard. Generally, when allowing "extension" elements in XML instances governed by an XML schema, it's a good idea to require those extension elements to be in an XML namespace different from the
to
Not only would this change make it perfectly clear which elements in the XML instances are part of the "standard" and which are "extension" elements, it also means that the element wildcard no longer violates v1.0 UPA...and hence, an XML Schema v1.0 processor is able to validate the instances. Similarly, I would suggest that the The next change I would make to that element wildcard is to make be "lax" as the attribute wildcards already are, as in:
What does "lax" mean? It means that if the XML schema processor can find a schema for those extension elements, then it will try to validate them against the extension schema, if not then it will silently ignore them (essentially, treat them "as if" they were valid). As is, the element wildcard is For example, if someone were to add a
even a v1.1 schema processor (such as xmllint), would produce a validation error such as:
With the I'll address several of the other comments in separate comments of my own. |
Xerces2-j is the most widely used XML Schema processor out there. In fact, the vast majority of XML tools that are written in Java (which is most of them) use Xerces2-j "under the hood". For instance, PhpStorm does (see https://www.jetbrains.com/help/phpstorm/working-with-xml.html), altho I suspect that that page is out-of-date, since it mentions Xerces 2.11 which did not include support for XML Schema v1.1. Xerces didn't support schema 1.1 until version 2.12.0 (release in late 2018). Many codebases that use Xerces "under the hood" still don't support schema v1.1 (such as Eclipse, the IDE I use...in some senses, PhpStorm is basically a fork of Eclipse). Why? Because the schema v1.1 support in Xerces 2.12 is only available through it's JAXP interface and it's not always easy for tools to switch from using Xerces' DOM or SAX interfaces. |
Unfortunately, the
Essentially, the use of the term As the v1.1 schema spec says in The Schema Namespace
If you were to change the namespace URI in the schema to
|
@pbiron Sounds like you have some useful improvements in mind. I look forward to a PR. When working on this, please be aware that the most important aspect of this XSD is that PHP_CodeSniffer should be able to interpret and display the information correctly - this XSD is specifically for the XML files which PHPCS displays when using the The current XSD is safeguarded by tests to ensure this and those tests would still need to pass for the PR (and if any additions are made, new tests may need to be added).
Considering that this toolset is only interesting for a very small group of people, that is not really a big concern for this project. |
I'll have to go through the existing tests to see if any of them will need to be changed. I suspect that at least 2 new tests will need to add to check that any elements/attributes that match the wildcards are explicitly in an XML namespace (one for extension elements, another for extension attributes). Since I've never used I just did an informal test of adding some extension elements/attributes to a few of the |
👍🏻
The code for the generators can be found here: https://github.com/squizlabs/PHP_CodeSniffer/tree/master/src/Generators To see it in action, try the below commands on the command line: # For sniff documentation on the command line:
> phpcs --standard=PSR12 --generator=Text
# To generate the same documentation as an HTML page to add to a website:
> phpcs --standard=PSR12 --generator=HTML > psr12.html
# Similar, to generate the same documentation to be published as markdown (like in a GH wiki or a GHPages website):
> phpcs --standard=PSR12 --generator=Markdown > psr12.md
For WPCS, there is currently no intention to add extra elements/attributes. For the PHPCompatibility standard, there is, though IIRC, the limited set of currently available documentation doesn't do so yet. A typical attribute PHPCompatibility may start using would be Does that help ?
I suppose we could set something up in the test suite to verify that the docs are generated by PHPCS as expected. This package contains one sniff with (very minimal) documentation which could possibly be used for that. Then again, it would probably be better if PHPCS itself would test the generators they provide (currently not the case, well, only done manually). |
Thanx. Looking at that code, I don't foresee any problems...since it is looking for specific named elements/attributes via And I know you previously said that doing schema validation in PHP was a non-goal, but it might be helpful for
will result in a PHP fatal error in any of
And DOMDocument::schemaValidate() only supports XML Schema 1.0, so one more reason it will be good if the XML Schema conforms to v1.0 :-) Of course, if the CI is done correctly, then there should be no invalid XML instances...but still, always a good idea to either code defensively OR validate before using the XML. |
And that is as it should be. In contrast to WP, in most PHP projects, it is not customary to hide PHP errors. |
@pbiron Sorry, I realize you are using WP as a frame of reference, but WP is, in all honesty, one of the worst pieces of code out there and really not a good example of good coding practices, nor of a good coding culture. |
We'll have to discuss this a little more. Because if I understand what you're saying, I would NOT consider the By extension elements/attributes, I think of elements/attributes added to the XML instances that are intended to be processed by a tool other than And it is those "company-specific" elements/attributes that I'm suggesting should be in a company-specific namespace. |
I'm not suggesting hiding PHP errors!! Trust me, I won't be offended if you don't take any of my advice, but I would much rather the
or, at least:
That is, process what you can and tell the user what you can't process...graceful degradation. |
PHPCS will not add support for additional attributes/elements to the doc generators. This has been discussed in that repo before and Greg is leaning more towards changing the whole docs setup completely, but it is completely unknown to "what" it will be changed. If and when that happens, I intend to add a tool to convert existing XML based PHPCS docs to whatever the new format is to this tooling repo. As for external tooling using the docs files... there is a WIP/Proof of concept repo here: https://github.com/PHPCSStandards/phpcs-docs On the one hand, this PoC may be used to start auto-generating a docs website for PHPCompatibility, combining information in the sniff docblocks with the information/code samples in the XML docs. For now, all of that is still up in the air as there's only so many hours in a day, For reference, here are some links to previous discussions about the docs in the PHPCS repo:
|
And, no, I'm not necessarily using WP as a frame of reference. While it's true that for the last 12+ years, 90% of what is do is WP-related, I've been a software engineer for ~30 years. And in fact, long before I'd ever heard of WP, I spent 10+ years writing XML related standards (not only through the W3C, but also for Health Level Seven, an ANSI-accredited SDO that defines clinical messaging standards used throughout the world) and code to process XML for the healthcare industry (including teaching other developers how to write XML processing code). |
Ah, I think we were talking sideways. Yes, I agree your suggestion would be a nice improvement for PHPCS. however there are two caveats:
In case you are not aware: the proposal to add an XSD was first proposed to be added to PHPCS itself, but that was rejected (twice), which is why we decided to add it here as we still felt it would be useful to have the XSD available. |
Apologies, I just got that impression based on some of the references and suggestions you made. No offense intended Paul! |
👋 Hiya, just checking in whether this is still being worked on ? |
Some useful improvement suggestions were made in this ticket, but I've yet to see a PR. As it's been over a year, I'm tempted to close this ticket. |
thanx for the reminder Juliette...I'd forgotten that I opened this. I think I might have some time in the next week to put together a preliminary PR. Tomorrow I'll refresh my memory about everything that's been said on this issue and that'll give me a better idea of when I'm likely to have a PR. |
The phpcsdoc.xsd schema does not conform to v1.0 of the XML schema spec because of the wildcard in the
rulegroup
Model Group.Since the schema has no
targetNamespace
, the element wildcard in that model group (<xs:any minOccurs="0"/>
), as written, violates the Unique Particle Attribution constraint (since the wildcard could potentially match any unqualified element, including those specifically named in the model group).XML Schema 1.1 loosened UPA so that wildcard is OK as specified provided that a v1.1 schema processor is used.
Since there are not that many v1.1 schema processors out there (especially in PHP, and most IDEs), it would probably be a good idea to modify the schema to conform to v1.0 of the schema spec.
I have some other suggestions for improvements to the schema, but it would be a good idea to have some general discussions around the "goals" of the schema (and validation in general) before jumping into opening a PR.
Note: I came across the schema when I saw WordPress/WordPress-Coding-Standards#2084.
The text was updated successfully, but these errors were encountered: