Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

XML null namespaces need to be preserved #47

Open
bedney opened this issue Mar 11, 2019 · 13 comments
Open

XML null namespaces need to be preserved #47

bedney opened this issue Mar 11, 2019 · 13 comments

Comments

@bedney
Copy link

bedney commented Mar 11, 2019

When serializing a node that should not belong in any (including the default) namespace, it is important to preserve that fact. XMLSerializer/DOMParser objects have long supported such serialization via the xmlns="" mechanism, which instructs the parser to not place the element in the default namespace, but that the element should have no namespace.

This behavior is detailed in the XML namespaces specification, Namespaces in XML 1.0 (Third Edition): "The attribute value in a default namespace declaration MAY be empty. This has the same effect, within the scope of the declaration, of there being no default namespace."

As a quick example, consider this:

<!DOCTYPE html>

<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <people xmlns="">
      <person>
        <lastname>Smith</lastname>
        <firstname>Joe</firstname>
      </person>
      <person>
        <lastname>Jones</lastname>
        <firstname>John</firstname>
      </person>
    </people>
  </head>
  <body>
  </body>
</html>

The people element and all of its descendants should be considered to be in no namespace whatsoever (i.e. they do not inherit the XHTML namespace).

This is a critical distinction that has impact in many areas, but especially things like XPath querying.

@bedney bedney changed the title XML null namespaces are not preserved XML null namespaces need to be preserved Mar 11, 2019
@tkent-google
Copy link

With the current specification, XMLSerializer emits xmlns="" if you do serializeToString() for <html> or <head>, and no xmlns="" if you specify <people>. Because <people> and <people xmlns=""> have equivalent meanings, you would have troubles only if you specify <people> and wrap the serialized result with another serialized element in another namespace.

Anyway, I understand web authors want to ask XMLSerializer to keep the original serialized representation or DOM representation as much as possible. We might want to change the meaning of ignore namespace definition attribute flag so that XMLSerializer ignores default namespace declarations only if their values are inconsistent with element's namespace.

Chrome<=73, Firefox, and Safari keep such default namespace declarations, Edge and IE don't.

chromium-wpt-export-bot pushed a commit to web-platform-tests/wpt that referenced this issue Mar 12, 2019
…ions

Since crrev.com/632142, XMLSerializer has dropped redundant xmlns="..."
declarations. It matches to IE, Edge, and DOM P&S standard.  This CL
reverts the behaviro to unbreak existing applications.  The restored
behavior matches to Firefox and Safari.

* MarkupAccumulator::AppendElement():
  Even if ignore_namespace_definition_attribute_ is set, we drop a
  xmlns="..." only if its value is inconsistent with element's
  namespace.

* MarkupAccumulator::AppendStartTagOpen():
  if local_default_namespace is "" and ns is null, do not emit xmlns="".
  This avoids to serialize xmlns="" twice.

Bug: w3c/DOM-Parsing#47
Bug: 940204
Change-Id: I2978ddc9a3f9511d227a9a1b902f1811ac1c3c07
@bedney
Copy link
Author

bedney commented Mar 12, 2019

Hmmm... I would state that the purpose of specifying xmlns="" allows me as a web author to have an 'escape' mechanism to prevent that element (and any descendants that don't provide their own namespace) from being in the default namespace.

These two are quite different:

<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <people xmlns="">
    </people>
  </head>
  <body>
  </body>
</html>

and

<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <people>
    </people>
  </head>
  <body>
  </body>
</html>

In the first case, <people xmlns=""> and its descendants have no namespace, whereas in the second case <people> and its descendants are in the XHTML namespace.

This is the only authoring mechanism available to me as a web author to specify that a chunk of markup should have no namespace (i.e. be escaped from "inheriting" it's parent namespace).

Therefore, I'm concerned with your use of the term "redundant". The xmlns="" is not a redundant construct here.

@tkent-google
Copy link

tkent-google commented Mar 12, 2019

I think an intention of the current specification is that the result of serializeToString() can be parsed as an XML document. <person /> and <person xmlns=""/> are equivalent because an XML parser sets namespaces of both of persons to null if they are parsed as XML documents.

@bedney
Copy link
Author

bedney commented Mar 12, 2019

Ah, I see.

Thanks for the clarification.

@annevk
Copy link
Member

annevk commented Mar 12, 2019

Which node are you serializing @bedney in your example? The root? I'd agree that dropping xmlns="" in that case would be a bug.

(I also looked at the wpt changes and found an issue there: web-platform-tests/wpt#15779 (review).)

chromium-wpt-export-bot pushed a commit to web-platform-tests/wpt that referenced this issue Mar 12, 2019
…ions

Since crrev.com/632142, XMLSerializer has dropped redundant xmlns="..."
declarations. It matches to IE, Edge, and DOM P&S standard.  This CL
reverts the behaviro to unbreak existing applications.  The restored
behavior matches to Firefox and Safari.

* MarkupAccumulator::AppendElement():
  Even if ignore_namespace_definition_attribute_ is set, we drop a
  xmlns="..." only if its value is inconsistent with element's
  namespace.

* MarkupAccumulator::AppendStartTagOpen():
  if local_default_namespace is "" and ns is null, do not emit xmlns="".
  This avoids to serialize xmlns="" twice.

Bug: w3c/DOM-Parsing#47
Bug: 940204
Change-Id: I2978ddc9a3f9511d227a9a1b902f1811ac1c3c07
chromium-wpt-export-bot pushed a commit to web-platform-tests/wpt that referenced this issue Mar 12, 2019
…ions

Since crrev.com/632142, XMLSerializer has dropped redundant xmlns="..."
declarations. It matches to IE, Edge, and DOM P&S standard.  This CL
reverts the behavior to unbreak existing applications.  The restored
behavior matches to Firefox and Safari.

* MarkupAccumulator::AppendElement():
  Even if ignore_namespace_definition_attribute_ is set, we drop a
  xmlns="..." only if its value is inconsistent with element's
  namespace.

* MarkupAccumulator::AppendStartTagOpen():
  if local_default_namespace is "" and ns is null, do not emit xmlns="".
  This avoids to serialize xmlns="" twice.

Bug: w3c/DOM-Parsing#47
Bug: 940204
Change-Id: I2978ddc9a3f9511d227a9a1b902f1811ac1c3c07
chromium-wpt-export-bot pushed a commit to web-platform-tests/wpt that referenced this issue Mar 12, 2019
…ions

Since crrev.com/632142, XMLSerializer has dropped redundant xmlns="..."
declarations. It matches to IE, Edge, and DOM P&S standard.  This CL
reverts the behavior to unbreak existing applications.  The restored
behavior matches to Firefox and Safari.

* MarkupAccumulator::AppendElement():
  Even if ignore_namespace_definition_attribute_ is set, we drop a
  xmlns="..." only if its value is inconsistent with element's
  namespace.

* MarkupAccumulator::AppendStartTagOpen():
  if local_default_namespace is "" and ns is null, do not emit xmlns="".
  This avoids to serialize xmlns="" twice.

Bug: w3c/DOM-Parsing#47
Bug: 940204
Change-Id: I2978ddc9a3f9511d227a9a1b902f1811ac1c3c07
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/1516124
Reviewed-by: Yoshifumi Inoue <[email protected]>
Commit-Queue: Kent Tamura <[email protected]>
Auto-Submit: Kent Tamura <[email protected]>
Cr-Commit-Position: refs/heads/master@{#639873}
aarongable pushed a commit to chromium/chromium that referenced this issue Mar 12, 2019
…ions

Since crrev.com/632142, XMLSerializer has dropped redundant xmlns="..."
declarations. It matches to IE, Edge, and DOM P&S standard.  This CL
reverts the behavior to unbreak existing applications.  The restored
behavior matches to Firefox and Safari.

* MarkupAccumulator::AppendElement():
  Even if ignore_namespace_definition_attribute_ is set, we drop a
  xmlns="..." only if its value is inconsistent with element's
  namespace.

* MarkupAccumulator::AppendStartTagOpen():
  if local_default_namespace is "" and ns is null, do not emit xmlns="".
  This avoids to serialize xmlns="" twice.

Bug: w3c/DOM-Parsing#47
Bug: 940204
Change-Id: I2978ddc9a3f9511d227a9a1b902f1811ac1c3c07
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/1516124
Reviewed-by: Yoshifumi Inoue <[email protected]>
Commit-Queue: Kent Tamura <[email protected]>
Auto-Submit: Kent Tamura <[email protected]>
Cr-Commit-Position: refs/heads/master@{#639873}
chromium-wpt-export-bot pushed a commit to web-platform-tests/wpt that referenced this issue Mar 12, 2019
…ions

Since crrev.com/632142, XMLSerializer has dropped redundant xmlns="..."
declarations. It matches to IE, Edge, and DOM P&S standard.  This CL
reverts the behavior to unbreak existing applications.  The restored
behavior matches to Firefox and Safari.

* MarkupAccumulator::AppendElement():
  Even if ignore_namespace_definition_attribute_ is set, we drop a
  xmlns="..." only if its value is inconsistent with element's
  namespace.

* MarkupAccumulator::AppendStartTagOpen():
  if local_default_namespace is "" and ns is null, do not emit xmlns="".
  This avoids to serialize xmlns="" twice.

Bug: w3c/DOM-Parsing#47
Bug: 940204
Change-Id: I2978ddc9a3f9511d227a9a1b902f1811ac1c3c07
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/1516124
Reviewed-by: Yoshifumi Inoue <[email protected]>
Commit-Queue: Kent Tamura <[email protected]>
Auto-Submit: Kent Tamura <[email protected]>
Cr-Commit-Position: refs/heads/master@{#639873}
@bedney
Copy link
Author

bedney commented Mar 12, 2019

Yes, I am serializing the root.

@annevk
Copy link
Member

annevk commented Mar 12, 2019

I should have asked more questions, sorry. I agree with OP. What I'm wondering about is:

  1. Is this a bug in the specification text? If so, where?
  2. Is this a bug in implementations? If so, are there implementation bugs?
  3. Is this tested by wpt? XMLSerializer: Keep redundant but harmless default namespace declarations web-platform-tests/wpt#15779 suggests yes, but maybe there's a nuanced scenario that's not.

@bedney
Copy link
Author

bedney commented Mar 12, 2019

Anne -

---- "Is this a bug in the specification text? If so, where?"

It looks like this might be covered in section 3.2.1.1.1 Recording the namespace, step 4.

@tkent-google Can you confirm? I'm not a specification person, sorry ;-). You know what I'm after here. Thank you!

---- "Is this a bug in implementations? If so, are there implementation bugs?"

I would note that, while currently Edge is out of compliance (until they switch to Chromium, obviously), all other currently-supported (not old IE) browsers support this construct. On IE, the old ActiveX DOM Parser did support this construct.

In fact, 14 years ago, I made this report about this very issue to the Mozilla team :-).

https://bugzilla.mozilla.org/show_bug.cgi?id=301260

---- "Is this tested by wpt?"

Those tests might cover the use case I'm talking about here. Obviously, due to the fact that this feature really has to do with 'escaping' elements from the surrounding default namespace context, having tests exercising that would be ideal. I think that would require adding tests to the DOMParser side of the house to make sure that the value of the .namespaceURI DOM property on those elements would, in fact, be null and not the 'inherited' default namespace.

@tkent-google
Copy link

Yes, I am serializing the root.

In crbug.com/940204, you complained about serialization starting with <people>, not the root.
if you call serializeToString() with an ancestor of <people>, it has xmlns="" with all major browsers. The specification has no problem in this case.

Your request is that <people> should not drop xmlns="" even if we call serializeToString(people), right?

@bedney
Copy link
Author

bedney commented Mar 14, 2019

@tkent-google -

Yes, you are correct. In looking back at my test case, I see I was serializing starting with <people>.

Therefore, the spec seems fine here.

And, yes, that is my request. Therefore, your fix in crbug.com/940204 is correct.

Thank you!

@tkent-google
Copy link

  1. Is this a bug in the specification text? If so, where?

IMO it's hard to say this is a specification bug. The current specification is reasonable, and Edge follows it. However, now three major browsers except for Edge are against the specification.

If we change the specification so that it matches to non-Edge, we should update 3.5.2.1.
the second bullet in https://w3c.github.io/DOM-Parsing/#serializing-an-element-s-attributes.

"the attr's prefix is null and the ignore namespace definition attribute flag is true"

to something like "the attr's prefix is null, the ignore namespace definition attribute flag is true, and the attr's value does not match to the element's namespaceURI"

  1. Is this tested by wpt? XMLSerializer: Keep redundant but harmless default namespace declarations web-platform-tests/wpt#15779 suggests yes, but maybe there's a nuanced scenario that's not.

domparsing/XMLSerializer-serializeToString.html contains some test cases related to this issue. They follow the current specification.
Now "Check if start tag serialization drops element prefix if the namespace is same as inherited default namespace." fails on Firefox and Safari. I reverted the Chrome's behavior in this week. As you know web-platform-tests/wpt#15779 moved the test cases.

@annevk
Copy link
Member

annevk commented Mar 15, 2019

Given that Edge is going away, it seems like an update is warranted, especially as doing what Edge does goes against expectations.

moz-v2v-gh pushed a commit to mozilla/gecko-dev that referenced this issue Apr 1, 2019
…ss default namespace declarations, a=testonly

Automatic update from web-platform-tests
XMLSerializer: Keep redundant but harmless default namespace declarations

Since crrev.com/632142, XMLSerializer has dropped redundant xmlns="..."
declarations. It matches to IE, Edge, and DOM P&S standard.  This CL
reverts the behavior to unbreak existing applications.  The restored
behavior matches to Firefox and Safari.

* MarkupAccumulator::AppendElement():
  Even if ignore_namespace_definition_attribute_ is set, we drop a
  xmlns="..." only if its value is inconsistent with element's
  namespace.

* MarkupAccumulator::AppendStartTagOpen():
  if local_default_namespace is "" and ns is null, do not emit xmlns="".
  This avoids to serialize xmlns="" twice.

Bug: w3c/DOM-Parsing#47
Bug: 940204
Change-Id: I2978ddc9a3f9511d227a9a1b902f1811ac1c3c07
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/1516124
Reviewed-by: Yoshifumi Inoue <[email protected]>
Commit-Queue: Kent Tamura <[email protected]>
Auto-Submit: Kent Tamura <[email protected]>
Cr-Commit-Position: refs/heads/master@{#639873}

--

wpt-commits: 4dc79803ad127fb635a03ec8b248c8671b4c3b67
wpt-pr: 15779
mykmelez pushed a commit to mykmelez/gecko that referenced this issue Apr 2, 2019
…ss default namespace declarations, a=testonly

Automatic update from web-platform-tests
XMLSerializer: Keep redundant but harmless default namespace declarations

Since crrev.com/632142, XMLSerializer has dropped redundant xmlns="..."
declarations. It matches to IE, Edge, and DOM P&S standard.  This CL
reverts the behavior to unbreak existing applications.  The restored
behavior matches to Firefox and Safari.

* MarkupAccumulator::AppendElement():
  Even if ignore_namespace_definition_attribute_ is set, we drop a
  xmlns="..." only if its value is inconsistent with element's
  namespace.

* MarkupAccumulator::AppendStartTagOpen():
  if local_default_namespace is "" and ns is null, do not emit xmlns="".
  This avoids to serialize xmlns="" twice.

Bug: w3c/DOM-Parsing#47
Bug: 940204
Change-Id: I2978ddc9a3f9511d227a9a1b902f1811ac1c3c07
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/1516124
Reviewed-by: Yoshifumi Inoue <[email protected]>
Commit-Queue: Kent Tamura <[email protected]>
Auto-Submit: Kent Tamura <[email protected]>
Cr-Commit-Position: refs/heads/master@{#639873}

--

wpt-commits: 4dc79803ad127fb635a03ec8b248c8671b4c3b67
wpt-pr: 15779
imran1008 pushed a commit to bloomberg/chromium.bb that referenced this issue May 21, 2019
…eclarations" to M74 branch

Since crrev.com/632142, XMLSerializer has dropped redundant xmlns="..."
declarations. It matches to IE, Edge, and DOM P&S standard.  This CL
reverts the behavior to unbreak existing applications.  The restored
behavior matches to Firefox and Safari.

* MarkupAccumulator::AppendElement():
  Even if ignore_namespace_definition_attribute_ is set, we drop a
  xmlns="..." only if its value is inconsistent with element's
  namespace.

* MarkupAccumulator::AppendStartTagOpen():
  if local_default_namespace is "" and ns is null, do not emit xmlns="".
  This avoids to serialize xmlns="" twice.

Bug: w3c/DOM-Parsing#47
Bug: 940204
Change-Id: I2978ddc9a3f9511d227a9a1b902f1811ac1c3c07
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/1516124
Reviewed-by: Yoshifumi Inoue <[email protected]>
Commit-Queue: Kent Tamura <[email protected]>
Auto-Submit: Kent Tamura <[email protected]>
Cr-Original-Commit-Position: refs/heads/master@{#639873}(cherry picked from commit 31f0236 [formerly 6208aa43211c09758870c44f7dd7c737e2cc3110])
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/1524508
Reviewed-by: Kent Tamura <[email protected]>
Cr-Commit-Position: refs/branch-heads/3729@{#147}
Cr-Branched-From: 1926073 [formerly d4a8972e30b604f090aeda5dfff68386ae656267]-refs/heads/master@{#638880}
Former-commit-id: f75d3ff1174b12dce871d1f942ed8798296971c9
marcoscaceres pushed a commit to web-platform-tests/wpt that referenced this issue Jul 23, 2019
…ions

Since crrev.com/632142, XMLSerializer has dropped redundant xmlns="..."
declarations. It matches to IE, Edge, and DOM P&S standard.  This CL
reverts the behavior to unbreak existing applications.  The restored
behavior matches to Firefox and Safari.

* MarkupAccumulator::AppendElement():
  Even if ignore_namespace_definition_attribute_ is set, we drop a
  xmlns="..." only if its value is inconsistent with element's
  namespace.

* MarkupAccumulator::AppendStartTagOpen():
  if local_default_namespace is "" and ns is null, do not emit xmlns="".
  This avoids to serialize xmlns="" twice.

Bug: w3c/DOM-Parsing#47
Bug: 940204
Change-Id: I2978ddc9a3f9511d227a9a1b902f1811ac1c3c07
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/1516124
Reviewed-by: Yoshifumi Inoue <[email protected]>
Commit-Queue: Kent Tamura <[email protected]>
Auto-Submit: Kent Tamura <[email protected]>
Cr-Commit-Position: refs/heads/master@{#639873}
gecko-dev-updater pushed a commit to marco-c/gecko-dev-comments-removed that referenced this issue Oct 4, 2019
…ss default namespace declarations, a=testonly

Automatic update from web-platform-tests
XMLSerializer: Keep redundant but harmless default namespace declarations

Since crrev.com/632142, XMLSerializer has dropped redundant xmlns="..."
declarations. It matches to IE, Edge, and DOM P&S standard.  This CL
reverts the behavior to unbreak existing applications.  The restored
behavior matches to Firefox and Safari.

* MarkupAccumulator::AppendElement():
  Even if ignore_namespace_definition_attribute_ is set, we drop a
  xmlns="..." only if its value is inconsistent with element's
  namespace.

* MarkupAccumulator::AppendStartTagOpen():
  if local_default_namespace is "" and ns is null, do not emit xmlns="".
  This avoids to serialize xmlns="" twice.

Bug: w3c/DOM-Parsing#47
Bug: 940204
Change-Id: I2978ddc9a3f9511d227a9a1b902f1811ac1c3c07
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/1516124
Reviewed-by: Yoshifumi Inoue <yosinchromium.org>
Commit-Queue: Kent Tamura <tkentchromium.org>
Auto-Submit: Kent Tamura <tkentchromium.org>
Cr-Commit-Position: refs/heads/master{#639873}

--

wpt-commits: 4dc79803ad127fb635a03ec8b248c8671b4c3b67
wpt-pr: 15779

UltraBlame original commit: 41a76d86404be10336a61e3a8fbd86ac0aca467f
gecko-dev-updater pushed a commit to marco-c/gecko-dev-wordified that referenced this issue Oct 4, 2019
…ss default namespace declarations, a=testonly

Automatic update from web-platform-tests
XMLSerializer: Keep redundant but harmless default namespace declarations

Since crrev.com/632142, XMLSerializer has dropped redundant xmlns="..."
declarations. It matches to IE, Edge, and DOM P&S standard.  This CL
reverts the behavior to unbreak existing applications.  The restored
behavior matches to Firefox and Safari.

* MarkupAccumulator::AppendElement():
  Even if ignore_namespace_definition_attribute_ is set, we drop a
  xmlns="..." only if its value is inconsistent with element's
  namespace.

* MarkupAccumulator::AppendStartTagOpen():
  if local_default_namespace is "" and ns is null, do not emit xmlns="".
  This avoids to serialize xmlns="" twice.

Bug: w3c/DOM-Parsing#47
Bug: 940204
Change-Id: I2978ddc9a3f9511d227a9a1b902f1811ac1c3c07
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/1516124
Reviewed-by: Yoshifumi Inoue <yosinchromium.org>
Commit-Queue: Kent Tamura <tkentchromium.org>
Auto-Submit: Kent Tamura <tkentchromium.org>
Cr-Commit-Position: refs/heads/master{#639873}

--

wpt-commits: 4dc79803ad127fb635a03ec8b248c8671b4c3b67
wpt-pr: 15779

UltraBlame original commit: 41a76d86404be10336a61e3a8fbd86ac0aca467f
gecko-dev-updater pushed a commit to marco-c/gecko-dev-wordified-and-comments-removed that referenced this issue Oct 4, 2019
…ss default namespace declarations, a=testonly

Automatic update from web-platform-tests
XMLSerializer: Keep redundant but harmless default namespace declarations

Since crrev.com/632142, XMLSerializer has dropped redundant xmlns="..."
declarations. It matches to IE, Edge, and DOM P&S standard.  This CL
reverts the behavior to unbreak existing applications.  The restored
behavior matches to Firefox and Safari.

* MarkupAccumulator::AppendElement():
  Even if ignore_namespace_definition_attribute_ is set, we drop a
  xmlns="..." only if its value is inconsistent with element's
  namespace.

* MarkupAccumulator::AppendStartTagOpen():
  if local_default_namespace is "" and ns is null, do not emit xmlns="".
  This avoids to serialize xmlns="" twice.

Bug: w3c/DOM-Parsing#47
Bug: 940204
Change-Id: I2978ddc9a3f9511d227a9a1b902f1811ac1c3c07
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/1516124
Reviewed-by: Yoshifumi Inoue <yosinchromium.org>
Commit-Queue: Kent Tamura <tkentchromium.org>
Auto-Submit: Kent Tamura <tkentchromium.org>
Cr-Commit-Position: refs/heads/master{#639873}

--

wpt-commits: 4dc79803ad127fb635a03ec8b248c8671b4c3b67
wpt-pr: 15779

UltraBlame original commit: 41a76d86404be10336a61e3a8fbd86ac0aca467f
@cscott
Copy link

cscott commented Jul 2, 2021

I found https://chromium-review.googlesource.com/c/chromium/src/+/1516124/5/third_party/blink/renderer/core/editing/serializers/markup_accumulator.cc helpful when trying to determine what spec changes were made to address this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants