Skip to content

Commit

Permalink
Merge remote-tracking branch 'ariadne/master'
Browse files Browse the repository at this point in the history
  • Loading branch information
poef committed Feb 23, 2020
2 parents d322bb2 + fa156f6 commit 95c8d49
Showing 1 changed file with 153 additions and 34 deletions.
187 changes: 153 additions & 34 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -19,15 +19,15 @@ Application Framework and Content Management System
arc/xml
=======

This component provides a unified xml parser and writer. The writer allows for readable and correct xml in code, not using
templates. The parser is a wrapper around both DOMDocument and SimpleXML.
This component provides a unified xml parser and writer. The writer allows for readable and always correct xml in code, without using templates. The parser is a wrapper around both DOMDocument and SimpleXML.

The parser and writer also work on fragments of XML. The parser also makes sure that the output is identical to the input.
When converting a node to a string, \arc\xml will return the full xml string, including tags. If you don't want that, you
can always access the 'nodeValue' property to get the original SimpleXMLElement.

Finally the parser also adds the ability to use basic CSS selectors to find elements in the XML.

Example code:
```php
use \arc\xml as x;
$xmlString =
Expand All @@ -41,21 +41,154 @@ Finally the parser also adds the ability to use basic CSS selectors to find elem
);
```

And parsing it:

```php
$xml = \arc\xml::parse($xmlString);
$title = $xml->channel->title->nodeValue; // SimpleXMLElement 'Wikipedia'
$titleTag = $xml->channel->title; // <title>Wikipedia</title>
echo $title;
```

CSS selectors
-------------
Installation
------------

This library requires PHP 7.1 or higher. It is installable and autoloadable via Composer as arc/xml.

```
composer require arc/xml
```

Parsing XML
===========

Examples
--------

For these examples we'll use the following XML

```xml
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:dc="http://purl.org/dc/elements/1.1/">
<channel rdf:about="http://slashdot.org/">
<title>Slashdot</title>
<link>http://slashdot.org/</link>
<description>News for nerds, stuff that matters</description>
<dc:language>en-us</dc:language>
<dc:date>2016-01-30T20:38:08+00:00</dc:date>
</channel>
<item rdf:about="http://hardware.slashdot.org/story/1757209/">
<title>Drone Races To Be Broadcast To VR Headsets</title>
<link>http://hardware.slashdot.org/story/1757209/</link>
</item>
<item rdf:about="http://it.slashdot.org/story/1720259/">
<title>FTDI Driver Breaks Hardware Again</title>
<link>http://it.slashdot.org/story/1720259/</link>
</item>
</rdf:RDF>
```

### Getting the title
```php
$title = current($xml->find('title'));
$xml = \arc\xml::parse( $xmlString );
$title = $xml->channel->title;
echo $title;
```

result:
```
<title>Slashdot</title>
```

The parser returns the full XML element by default. If you just want the contents, you must be explicit:

```php
$title = $xml->channel->title->nodeValue;
echo $title;
```

result:
```
Slashdot
```

Instead of the default in SimpleXML, arc\xml must be explicitly told to get the value of the node using the `nodeValue` property.

### Setting the title
```php
$xml->channel->title = 'Update title';
```

As you can see, there is no need to mention the nodeValue here, the name 'title' is enough to select the correct element. It would not make sense to turn the title into another tag entirely here. You can still use the `nodeValue` if you prefer though.

### Getting attributes
```php
$about = $xml->channel['rdf:about'];
```

result
```
http://slashdot.org/
```

Just what you would expect, even though there is a namespace in there. When you use a namespace that the parser hasn't been told about before, it will simply look it up in the document and re-use it.

Since attributes aren't XML nodes, there is no nodeValue. Attributes are always returned as just a string.

### Setting attributes
```php
$xml->channel['title-attribute'] = 'This is a title attribute';
```

This adds the `title-attribute` if it wasn't there before, or updates it if it was.

### Removing attributes
```php
unset($xml->channel['title-attribute']);
```

### Searching the document
```php
$items = $xml->find('item');
echo implode($items);
```

result:
```xml
<item rdf:about="http://hardware.slashdot.org/story/1757209/">
<title>Drone Races To Be Broadcast To VR Headsets</title>
<link>http://hardware.slashdot.org/story/1757209/</link>
</item>
<item rdf:about="http://it.slashdot.org/story/1720259/">
<title>FTDI Driver Breaks Hardware Again</title>
<link>http://it.slashdot.org/story/1720259/</link>
</item>
```

Again, you get the full XML of the result and it is just an array. (Its been joined here using `implode` for clarity).

The `find()` method accepts most CSS2.0 selectors. For now you can't enter more than one selector, so you can't select 'item, channel' for instance.
Either use the SimpleXML `xpath()` method or run multiple queries.

### Searching using namespaces
```php
$xml->registerNamespace('dublincore','http://purl.org/dc/elements/1.1/');
$date = current($xml->find('dublincore|date));
echo $date;
```

result:
```xml
<dc:date>2016-01-30T20:38:08+00:00</dc:date>
```

Again, you get the full XML by default. But in addition, though you've used a namespace alias not known in the document ( `dublincore` ), `find()` returns the `<dc:date>` element for you. The alias is different, but the namespace is the same and that is what matters.

The find() method always returns an array, which may be empty. By using current() you get the first element found, or null if nothing was found.

### Supported CSS Selectors

The following CSS selectors are supported:

- `tag1 tag2`<br>
Expand All @@ -76,6 +209,8 @@ The following CSS selectors are supported:
This matches any `tag` with id `id`.
- `#id`<br>
This matches any element with id `id`.
- `ns|tag`<br>
This matches `ns:tag` or more generally `tag` in the namespace indicated by the alias `ns`

SimpleXML
---------
Expand All @@ -95,8 +230,7 @@ You can walk through the node tree:

Any method or property available in SimpleXMLElement is included in \arc\xml parsed data.

DOMElement
----------
### DOMElement methods

In addition to SimpleXMLElement methods, you can also call any method that is available in DOMElement.

Expand All @@ -106,12 +240,9 @@ In addition to SimpleXMLElement methods, you can also call any method that is av
->getElementsByTagName('title')[0];
```

But right now you cannot access the properties of DOMElement. This will be fixed soonish.

Parsing fragments
-----------------
### Parsing fragments

The arc\xml parser also accepts partial XML content. It doesn't require a single root element.
The arc\xml parser accepts partial XML content. It doesn't require a single root element.

```php
$xmlString = <<< EOF
Expand All @@ -123,27 +254,15 @@ The arc\xml parser also accepts partial XML content. It doesn't require a single
</item>
EOF;
$xml = \arc\xml::parse($xmlString);
$titles = $xml->find('title');
echo $xml;
```

And when you convert the xml back to a string, it will still be a partial XML fragment.


Why use this instead of DOMDocument or SimpleXML?
-------------------------------------------------

arc\xml::parse has the following differences:

- When converted to string, it returns the original XML, without additions you didn't make.
- You can use it with partial XML fragments.
- No need to remember calling importNode() before appendChild() or insertBefore()
- No need to switch between SimpleXML and DOMDocument, because you need that one method only available in the other API.
- When returning a list of elements, you always get a standard array, not a magic NodeList.

In addition the arc\xml writer is a simple way to generate valid and indented XML, with readable and self-validating code.

- Filters all illegal characters from your XML
- Automatically translate &, < and > characters to &amp;, &lt; and &gt;
- Automatically translate " to &quot; in attributes

You can include raw unfiltered XML with the \arc\xml::raw() method
result:
```xml
<item>
<title>An item</title>
</item>
<item>
<title>Another item</title>
</item>
```

0 comments on commit 95c8d49

Please sign in to comment.