mdast-util-to-nlcst

mdast utility to transform to nlcst

Contents

*   [`toNlcst(tree, file, Parser[, options])`](#tonlcsttree-file-parser-options)
*   [`Options`](#options)
*   [`ParserConstructor`](#parserconstructor)
*   [`ParserInstance`](#parserinstance)

What is this?

This package is a utility that takes an mdast (markdown) syntax tree as input and turns it into nlcst (natural language).

When should I use this?

This project is useful when you want to deal with ASTs and inspect the natural language inside markdown. Unfortunately, there is no way yet to apply changes to the nlcst back into mdast.
The hast utility hast-util-to-nlcst does the same but uses an HTML tree as input.
The remark plugin remark-retext wraps this utility to do the same at a higher-level (easier) abstraction.

Install

This package is ESM only. In Node.js (version 14.14+ and 16.0+), install with npm:
npm install mdast-util-to-nlcst

In Deno with esm.sh:
import {toNlcst} from 'https://esm.sh/mdast-util-to-nlcst@6'

In browsers with esm.sh:
<script type="module">
  import {toNlcst} from 'https://esm.sh/mdast-util-to-nlcst@6?bundle'
</script>

Use

Say we have the following example.md:
Some *foo*sball.

…and next to it a module example.js:
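import {read} from 'to-vfile'
import {ParseEnglish} from 'parse-english'
import {inspect} from 'unist-util-inspect'
import {fromMarkdown} from 'mdast-util-from-markdown'
import {toNlcst} from 'mdast-util-to-nlcst'

// Read the markdown document into a virtual file.
const file = await read('example.md')
// Parse the file contents (as a string) into an mdast tree with positions.
const mdast = fromMarkdown(String(file))
// Transform to nlcst; pass the same file so that positions line up.
const nlcst = toNlcst(mdast, file, ParseEnglish)

console.log(inspect(nlcst))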

Yields:
RootNode[1] (1:1-1:17, 0-16)
└─0 ParagraphNode[1] (1:1-1:17, 0-16)
    └─0 SentenceNode[4] (1:1-1:17, 0-16)
        ├─0 WordNode[1] (1:1-1:5, 0-4)
        │   └─0 TextNode "Some" (1:1-1:5, 0-4)
        ├─1 WhiteSpaceNode " " (1:5-1:6, 4-5)
        ├─2 WordNode[2] (1:7-1:16, 6-15)
        │   ├─0 TextNode "foo" (1:7-1:10, 6-9)
        │   └─1 TextNode "sball" (1:11-1:16, 10-15)
        └─3 PunctuationNode "." (1:16-1:17, 15-16)
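
Once you have a tree like the one above, you typically walk it and read the nodes you care about. The following is a minimal sketch in plain JavaScript with no dependencies. The tree literal is a hand-written copy of the output above with positions left out, and collectWords is a hypothetical helper written for this example, not something this package exports.

```javascript
// Hand-written copy of the nlcst tree shown above (positional info omitted).
const tree = {
  type: 'RootNode',
  children: [
    {
      type: 'ParagraphNode',
      children: [
        {
          type: 'SentenceNode',
          children: [
            {type: 'WordNode', children: [{type: 'TextNode', value: 'Some'}]},
            {type: 'WhiteSpaceNode', value: ' '},
            {
              type: 'WordNode',
              children: [
                {type: 'TextNode', value: 'foo'},
                {type: 'TextNode', value: 'sball'}
              ]
            },
            {type: 'PunctuationNode', value: '.'}
          ]
        }
      ]
    }
  ]
}

// Hypothetical helper: gather the text of every word in the tree.
function collectWords(node, words = []) {
  if (node.type === 'WordNode') {
    // A word can hold several text nodes, such as `foo` and `sball`.
    words.push(node.children.map((child) => child.value ?? '').join(''))
  } else if (Array.isArray(node.children)) {
    for (const child of node.children) collectWords(child, words)
  }

  return words
}

console.log(collectWords(tree)) // [ 'Some', 'foosball' ]
```

Note how the emphasis in the markdown split the word into two text nodes, while the word itself stays one WordNode.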

API

This package exports the identifier toNlcst. There is no default export.

toNlcst(tree, file, Parser[, options])

Turn an mdast tree into an nlcst tree.
👉 Note: tree must have positional info and file must be a VFile corresponding to tree.
Parameters
*   tree (mdast node): mdast tree to transform
*   file (VFile): virtual file
*   Parser (ParserConstructor or ParserInstance): parser to use
*   options (Options, optional): configuration
Returns
nlcst tree (NlcstNode).
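
Because the transform depends on positional info, it can help to check a tree before passing it in. The sketch below assumes only the unist convention that positions live in node.position.start and node.position.end. The helper everyNodeHasPosition is hypothetical and not part of this package, and it is deliberately stricter than what the package itself may require.

```javascript
// Hypothetical helper: report whether a node and all of its descendants carry
// unist-style positional info (`position.start` and `position.end`).
function everyNodeHasPosition(node) {
  const position = node.position
  const ok =
    Boolean(position) &&
    typeof position.start?.line === 'number' &&
    typeof position.start?.column === 'number' &&
    typeof position.end?.line === 'number' &&
    typeof position.end?.column === 'number'

  if (!ok) return false

  return (node.children ?? []).every((child) => everyNodeHasPosition(child))
}

// Hand-written mdast fragments, for illustration only.
const withPositions = {
  type: 'root',
  position: {start: {line: 1, column: 1}, end: {line: 1, column: 5}},
  children: [
    {
      type: 'paragraph',
      position: {start: {line: 1, column: 1}, end: {line: 1, column: 5}},
      children: [
        {
          type: 'text',
          value: 'Some',
          position: {start: {line: 1, column: 1}, end: {line: 1, column: 5}}
        }
      ]
    }
  ]
}

// The paragraph here has no `position`, as in a tree built by hand.
const missingPositions = {
  type: 'root',
  position: {start: {line: 1, column: 1}, end: {line: 1, column: 5}},
  children: [{type: 'paragraph', children: []}]
}

console.log(everyNodeHasPosition(withPositions)) // true
console.log(everyNodeHasPosition(missingPositions)) // false
```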

Options

Configuration (TypeScript type).
Fields
ignore
List of mdast node types to ignore (Array<string>, optional).
The types 'table', 'tableRow', and 'tableCell' are always ignored.
Say we have the following file example.md:
A paragraph.

> A paragraph in a block quote.

…and if we now transform with ignore: ['blockquote'], we get:
RootNode[2] (1:1-3:1, 0-14)
├─0 ParagraphNode[1] (1:1-1:13, 0-12)
│   └─0 SentenceNode[4] (1:1-1:13, 0-12)
│       ├─0 WordNode[1] (1:1-1:2, 0-1)
│       │   └─0 TextNode "A" (1:1-1:2, 0-1)
│       ├─1 WhiteSpaceNode " " (1:2-1:3, 1-2)
│       ├─2 WordNode[1] (1:3-1:12, 2-11)
│       │   └─0 TextNode "paragraph" (1:3-1:12, 2-11)
│       └─3 PunctuationNode "." (1:12-1:13, 11-12)
└─1 WhiteSpaceNode "\n\n" (1:13-3:1, 12-14)
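
Each node in the output above is followed by its position: start line:column to end line:column, then start offset to end offset. To illustrate how the two notations relate, the sketch below converts a 1-based line and column into a 0-based offset. The helper pointToOffset is hypothetical and written for this example, and the string is the example.md content under the assumption of \n line endings.

```javascript
// The content of `example.md`, assuming `\n` line endings.
const doc = 'A paragraph.\n\n> A paragraph in a block quote.'

// Hypothetical helper: turn a 1-based line and column into a 0-based offset.
function pointToOffset(text, line, column) {
  const lines = text.split('\n')
  let offset = 0

  // Add the length of every earlier line plus its line ending.
  for (let index = 0; index < line - 1; index++) {
    offset += lines[index].length + 1
  }

  return offset + column - 1
}

// Start and end of the trailing `WhiteSpaceNode` (1:13-3:1, 12-14).
console.log(pointToOffset(doc, 1, 13)) // 12
console.log(pointToOffset(doc, 3, 1)) // 14
```

The root ends at 3:1 (offset 14) because the ignored block quote, which starts there, contributes no nodes.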

source
List of mdast node types to mark as nlcst source nodes (Array<string>, optional).
The type 'inlineCode' is always marked as source.
Say we have the following file example.md:
A paragraph.

> A paragraph in a block quote.

…and if we now transform with source: ['blockquote'], we get:
RootNode[3] (1:1-3:32, 0-45)
├─0 ParagraphNode[1] (1:1-1:13, 0-12)
│   └─0 SentenceNode[4] (1:1-1:13, 0-12)
│       ├─0 WordNode[1] (1:1-1:2, 0-1)
│       │   └─0 TextNode "A" (1:1-1:2, 0-1)
│       ├─1 WhiteSpaceNode " " (1:2-1:3, 1-2)
│       ├─2 WordNode[1] (1:3-1:12, 2-11)
│       │   └─0 TextNode "paragraph" (1:3-1:12, 2-11)
│       └─3 PunctuationNode "." (1:12-1:13, 11-12)
├─1 WhiteSpaceNode "\n\n" (1:13-3:1, 12-14)
└─2 ParagraphNode[1] (3:1-3:32, 14-45)
    └─0 SentenceNode[1] (3:1-3:32, 14-45)
        └─0 SourceNode "> A paragraph in a block quote." (3:1-3:32, 14-45)
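
The offsets on a source node point straight back into the original markdown. As a quick check, again assuming \n line endings in example.md, slicing the document with the offsets of the SourceNode above recovers the block quote verbatim, marker included:

```javascript
// The content of `example.md`, assuming `\n` line endings.
const markdown = 'A paragraph.\n\n> A paragraph in a block quote.'

// Offsets copied from the `SourceNode` in the tree above: 14-45.
const sourceValue = markdown.slice(14, 45)

console.log(sourceValue) // > A paragraph in a block quote.
console.log(markdown.length) // 45
```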

ParserConstructor

Create a new parser (TypeScript type).
Type
type ParserConstructor = new () => ParserInstance

ParserInstance

nlcst parser (TypeScript type).
For example, parse-dutch, parse-english, or parse-latin.
Type
type ParserInstance = {
  tokenizeSentencePlugins: ((node: NlcstSentence) => void)[]
  tokenizeParagraphPlugins: ((node: NlcstParagraph) => void)[]
  tokenizeRootPlugins: ((node: NlcstRoot) => void)[]
  parse(value: string | null | undefined): NlcstRoot
  tokenize(value: string | null | undefined): Array<NlcstSentenceContent>
}
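
To make the two types above concrete, here is a minimal sketch of a class whose instances have the ParserInstance shape, which also makes the class itself fit ParserConstructor. TinyParser is a hypothetical toy that only splits on white space. It illustrates the shape and is not a substitute for parse-english or the other real parsers, nor has it been checked against toNlcst.

```javascript
// Hypothetical toy parser with the same shape as `ParserInstance`.
// It only splits on white space; real parsers do far more.
class TinyParser {
  constructor() {
    // Plugin lists; each plugin receives a node and may change it in place.
    this.tokenizeSentencePlugins = []
    this.tokenizeParagraphPlugins = []
    this.tokenizeRootPlugins = []
  }

  // Turn a string into sentence content: words and white space.
  tokenize(value) {
    return String(value ?? '')
      .split(/(\s+)/)
      .filter((part) => part !== '')
      .map((part) =>
        /^\s+$/.test(part)
          ? {type: 'WhiteSpaceNode', value: part}
          : {type: 'WordNode', children: [{type: 'TextNode', value: part}]}
      )
  }

  // Wrap the tokens in one sentence, one paragraph, and a root.
  parse(value) {
    const sentence = {type: 'SentenceNode', children: this.tokenize(value)}
    const paragraph = {type: 'ParagraphNode', children: [sentence]}
    const root = {type: 'RootNode', children: [paragraph]}

    for (const plugin of this.tokenizeSentencePlugins) plugin(sentence)
    for (const plugin of this.tokenizeParagraphPlugins) plugin(paragraph)
    for (const plugin of this.tokenizeRootPlugins) plugin(root)

    return root
  }
}

const parser = new TinyParser()
console.log(parser.tokenize('Some foosball').map((node) => node.type))
// [ 'WordNode', 'WhiteSpaceNode', 'WordNode' ]
```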

Types

This package is fully typed with TypeScript. It exports the types Options, ParserConstructor, and ParserInstance.

Compatibility

Projects maintained by the unified collective are compatible with all maintained versions of Node.js. As of now, that is Node.js 12.20+, 14.14+, and 16.0+. Our projects sometimes work with older versions, but this is not guaranteed.

Security

Use of mdast-util-to-nlcst does not involve hast, so there are no openings for cross-site scripting (XSS) attacks.

Related

*   mdast-util-to-hast: transform mdast to hast
*   hast-util-to-nlcst: transform hast to nlcst
*   hast-util-to-mdast: transform hast to mdast
*   hast-util-to-xast: transform hast to xast
*   hast-util-sanitize: sanitize hast nodes

Contribute

See contributing.md in syntax-tree/.github for ways to get started. See support.md for ways to get help.
This project has a code of conduct. By interacting with this repository, organization, or community you agree to abide by its terms.

License

MIT © Titus Wormer