rexml

Simple XML parsing with a regular expression.

Version: 2.2.2 (updated 4 years ago, created 5 years ago)

rexml is a Node.js package for simple XML parsing with a regular expression. It has been tested on simple use cases and does work on nested tags.
yarn add rexml

Table Of Contents

* extractTags(tag, string): Return
  * [`Return`](#type-return)
  * [Extracting Multiple Tags](#extracting-multiple-tags)
* extractProps(string: string, parseValue?: boolean): Object<string,(boolean|string|number)>
* extractTagsSpec(tag: string, string: string): {content, props}[]

API

The package is available by importing its default and named functions:
import rexml, { extractProps, extractTagsSpec, extractPropsSpec } from 'rexml'

extractTags(
  tag: string|!Array<string>,
  string: string,
): Return

Extract member elements from an XML string. Numbers and booleans will be parsed into their JS types.
- tag (string | !Array<string>): Which tag to extract, e.g., div. An array of tags can also be passed, in which case the name of the tag will also be returned.
- string (string): The XML string.
The tags are returned as an array of objects with content, props and tag properties. content is the inner content of the tag, props holds the attributes specified inside the tag, and tag is the name of the matched element.
Source:

import extractTags from 'rexml'

const xml = `
<html>
  <div id="d1"
    class="example"
    contenteditable />
  <div id="d2" class="example">Hello World</div>
</html>
`

const res = extractTags('div', xml)

Output:

[ { content: '',
    props: 
     { id: 'd1',
       class: 'example',
       contenteditable: true },
    tag: 'div' },
  { content: 'Hello World',
    props: { id: 'd2', class: 'example' },
    tag: 'div' } ]

Return: The return type.

| Name    | Type    | Description                                            |
| ------- | ------- | ------------------------------------------------------ |
| content | string  | The content of the tag, including possible whitespace. |
| props   | !Object | The properties of the element.                         |
| tag     | string  | The name of the extracted element.                     |
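
To give a feel for what regex-based tag extraction involves, here is a minimal sketch written without the package. It is not rexml's implementation: `simpleExtract` is a hypothetical name, attributes are returned as a raw string rather than parsed props, and the tag name is assumed to be a plain identifier with no regex special characters. It will not cope with nested tags of the same name, comments, CDATA, or a `>` inside an attribute value.

```javascript
// Illustrative only: a tiny regex-based extractor, NOT rexml's own code.
// `simpleExtract` is a hypothetical helper name.
function simpleExtract(tag, xml) {
  // Group 1: raw attribute text (optional).
  // Group 2: inner content (undefined for self-closing tags).
  const re = new RegExp(
    `<${tag}(\\s[^>]*?)?(?:/>|>([\\s\\S]*?)</${tag}>)`,
    'g'
  )
  const results = []
  let m
  while ((m = re.exec(xml)) !== null) {
    results.push({
      tag,
      attrs: (m[1] || '').trim(),
      content: m[2] || '',
    })
  }
  return results
}

const xml =
  '<html><div id="d1" class="example" contenteditable />' +
  '<div id="d2" class="example">Hello World</div></html>'

const res = simpleExtract('div', xml)
console.log(res.length)     // 2
console.log(res[1].content) // Hello World
```

The lazy quantifiers keep each match as short as possible, which is what lets two sibling `div` elements come back as separate results instead of one long match.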

Extracting Multiple Tags

It's possible to give an array of tags which should be extracted from the XML string.
Source:

import extractTags from 'rexml'

const xml = `<html>
  <div id="d1"/>
  <div id="d2" class="example">Hello World</div>
  <footer>Art Deco, 2019</footer>
</html>
`

const res = extractTags(['div', 'footer'], xml)

Output:

[ { content: '',
    props: { id: 'd1' },
    tag: 'div' },
  { content: 'Hello World',
    props: { id: 'd2', class: 'example' },
    tag: 'div' },
  { content: 'Art Deco, 2019',
    props: {},
    tag: 'footer' } ]
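
Because every result carries its tag name, a mixed result list can be split back up by element type. The sketch below assumes the result shape shown above and copies that output into a literal so it runs without the package; `groupByTag` is a hypothetical helper, not part of rexml.

```javascript
// Result array copied from the documented output above.
const extracted = [
  { content: '', props: { id: 'd1' }, tag: 'div' },
  { content: 'Hello World', props: { id: 'd2', class: 'example' }, tag: 'div' },
  { content: 'Art Deco, 2019', props: {}, tag: 'footer' },
]

// Hypothetical helper: bucket results by their `tag` property.
function groupByTag(items) {
  const groups = new Map()
  for (const item of items) {
    if (!groups.has(item.tag)) groups.set(item.tag, [])
    groups.get(item.tag).push(item)
  }
  return groups
}

const byTag = groupByTag(extracted)
console.log(byTag.get('div').length)        // 2
console.log(byTag.get('footer')[0].content) // Art Deco, 2019
```

A `Map` is used rather than a plain object so that tag names which clash with built-in object properties cannot cause surprises.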

extractProps(
  string: string,
  parseValue?: boolean,
): Object

Extracts the properties from the attributes part of the tag and returns them as an object. Values are parsed into numbers and booleans unless parseValue is set to false.
Source:

import { extractProps, extractPropsSpec } from 'rexml'

const s = `id="d2"
class="example"
value="123"
parsable="true"
ignore="false"
2-non-spec
required`

const res = extractProps(s)
console.log(JSON.stringify(res, null, 2))

// don't parse booleans and integers
const res2 = extractProps(s, false)
console.log(JSON.stringify(res2, null, 2))

// conform to the spec
const res3 = extractPropsSpec(s)
console.log(JSON.stringify(res3, null, 2))

Output:

{
  "id": "d2",
  "class": "example",
  "value": 123,
  "parsable": true,
  "ignore": false,
  "2-non-spec": true,
  "required": true
}
{
  "id": "d2",
  "class": "example",
  "value": "123",
  "parsable": "true",
  "ignore": "false",
  "2-non-spec": true,
  "required": true
}
{
  "id": "d2",
  "class": "example",
  "value": 123,
  "parsable": true,
  "ignore": false,
  "required": true
}
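
The difference between the first two outputs comes down to how string values are coerced. The sketch below mimics the behaviour visible in those outputs; it is not rexml's code. `coerceValue` is a hypothetical helper, and which numeric formats the package actually accepts is not stated in this README, so the sketch assumes plain decimal numbers only.

```javascript
// Hypothetical helper mirroring the documented outputs: "true"/"false"
// become booleans and plain decimal strings become numbers.
function coerceValue(raw) {
  if (raw === 'true') return true
  if (raw === 'false') return false
  // Assumption: only plain decimals count as numeric.
  if (/^-?\d+(\.\d+)?$/.test(raw)) return Number(raw)
  return raw
}

// Unparsed string values, as in the second output above.
const rawProps = { id: 'd2', class: 'example', value: '123', parsable: 'true', ignore: 'false' }

const parsed = Object.fromEntries(
  Object.entries(rawProps).map(([key, value]) => [key, coerceValue(value)])
)
console.log(parsed.value)    // 123
console.log(parsed.parsable) // true
console.log(parsed.id)       // d2
```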

extractTagsSpec(
  tag: string,
  string: string,
): {content, props}[]

Same as the default method, but conforms to the XML specification when defining attributes.

import { extractTagsSpec } from 'rexml'

const xml = `
<html>
  <div id="d1" class="example" contenteditable />
  <div 5-non-spec>Attributes cannot start with a number.</div>
</html>`

const res = extractTagsSpec('div', xml)

console.log(JSON.stringify(res, null, 2))

Output:

[
  {
    "props": {
      "id": "d1",
      "class": "example",
      "contenteditable": true
    },
    "content": ""
  }
]
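
The second `div` is absent from the output because `5-non-spec` is not a valid attribute name. As a rough illustration of the rule, the sketch below checks names against a simplified, ASCII-only version of the XML `Name` production. The full specification also permits many non-ASCII characters. `isSpecAttributeName` is a hypothetical helper, and how rexml performs this check internally is not shown in this README.

```javascript
// Simplified ASCII-only form of the XML `Name` production:
// first character is a letter, "_" or ":", followed by letters,
// digits, "_", ":", "." or "-".
const XML_NAME_ASCII = /^[A-Za-z_:][A-Za-z0-9_:.-]*$/

// Hypothetical helper, not part of rexml.
function isSpecAttributeName(name) {
  return XML_NAME_ASCII.test(name)
}

console.log(isSpecAttributeName('contenteditable')) // true
console.log(isSpecAttributeName('data-id'))         // true
console.log(isSpecAttributeName('5-non-spec'))      // false
```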

Copyright

© [Art Deco](https://artd.eco) 2019

[Tech Nation Visa Sucks](https://www.technation.sucks)