mdast-util-to-nlcst
mdast utility to transform to nlcst.
Contents
* [`toNlcst(tree, file, Parser[, options])`](#tonlcsttree-file-parser-options)
* [`Options`](#options)
* [`ParserConstructor`](#parserconstructor)
* [`ParserInstance`](#parserinstance)
What is this?
This package is a utility that takes an mdast (markdown) syntax tree as input and turns it into nlcst (natural language).
When should I use this?
This project is useful when you want to deal with ASTs and inspect the natural language inside markdown. Unfortunately, there is no way yet to apply changes to the nlcst back into mdast.
The hast utility `hast-util-to-nlcst` does the same but uses an HTML tree as input.
The remark plugin `remark-retext` wraps this utility to do the same at a higher-level (easier) abstraction.
Install
This package is ESM only. In Node.js (version 14.14+ and 16.0+), install with npm:
npm install mdast-util-to-nlcst
In Deno with esm.sh:
import {toNlcst} from 'https://esm.sh/mdast-util-to-nlcst@6'
In browsers with esm.sh:
<script type="module">
  import {toNlcst} from 'https://esm.sh/mdast-util-to-nlcst@6?bundle'
</script>
Use
Say we have the following `example.md`:
Some *foo*sball.
…and next to it a module `example.js`:
import {read} from 'to-vfile'
import {ParseEnglish} from 'parse-english'
import {inspect} from 'unist-util-inspect'
import {fromMarkdown} from 'mdast-util-from-markdown'
import {toNlcst} from 'mdast-util-to-nlcst'
const file = await read('example.md')
const mdast = fromMarkdown(String(file))
const nlcst = toNlcst(mdast, file, ParseEnglish)
console.log(inspect(nlcst))
Yields:
RootNode[1] (1:1-1:17, 0-16)
└─0 ParagraphNode[1] (1:1-1:17, 0-16)
    └─0 SentenceNode[4] (1:1-1:17, 0-16)
        ├─0 WordNode[1] (1:1-1:5, 0-4)
        │   └─0 TextNode "Some" (1:1-1:5, 0-4)
        ├─1 WhiteSpaceNode " " (1:5-1:6, 4-5)
        ├─2 WordNode[2] (1:7-1:16, 6-15)
        │   ├─0 TextNode "foo" (1:7-1:10, 6-9)
        │   └─1 TextNode "sball" (1:11-1:16, 10-15)
        └─3 PunctuationNode "." (1:16-1:17, 15-16)
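Once you have an nlcst tree, inspecting the natural language comes down to walking it. The following is a minimal sketch and not part of this package: it hand-writes a plain object that mirrors the tree printed above (positions omitted for brevity) and uses two hypothetical helpers, `textOf` and `collectWords`, to gather every word. In a real project you would more likely reach for existing unist utilities.

```javascript
// A hand-written copy of the tree shown above, without positional info.
const tree = {
  type: 'RootNode',
  children: [
    {
      type: 'ParagraphNode',
      children: [
        {
          type: 'SentenceNode',
          children: [
            {type: 'WordNode', children: [{type: 'TextNode', value: 'Some'}]},
            {type: 'WhiteSpaceNode', value: ' '},
            {
              type: 'WordNode',
              children: [
                {type: 'TextNode', value: 'foo'},
                {type: 'TextNode', value: 'sball'}
              ]
            },
            {type: 'PunctuationNode', value: '.'}
          ]
        }
      ]
    }
  ]
}

// Hypothetical helper: concatenate the values of all literal descendants.
function textOf(node) {
  if (typeof node.value === 'string') return node.value
  return (node.children || []).map(textOf).join('')
}

// Hypothetical helper: depth-first walk that collects every word as a string.
function collectWords(node, words = []) {
  if (node.type === 'WordNode') {
    words.push(textOf(node))
  } else if (node.children) {
    for (const child of node.children) collectWords(child, words)
  }
  return words
}

const words = collectWords(tree)
console.log(words) // [ 'Some', 'foosball' ]
```

Note that the emphasis markers are gone: `foo` and `sball` come from different mdast nodes but end up in a single word.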
API
This package exports the identifier `toNlcst`.
There is no default export.
toNlcst(tree, file, Parser[, options])
Turn an mdast tree into an nlcst tree.
👉 Note: `tree` must have positional info and `file` must be a `VFile` corresponding to `tree`.
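Trees produced by a markdown parser normally carry positional info, but trees built by hand often do not. As a rough sketch, a guard like the one below could be run before transforming. `hasPosition` is a hypothetical helper written for this example, not something this package exports.

```javascript
// Hypothetical helper: does this node carry usable positional info?
function hasPosition(node) {
  const position = node && node.position
  return Boolean(
    position &&
      position.start &&
      position.end &&
      typeof position.start.line === 'number' &&
      typeof position.start.column === 'number' &&
      typeof position.end.line === 'number' &&
      typeof position.end.column === 'number'
  )
}

// A root as a markdown parser would produce it, with positions...
const parsed = {
  type: 'root',
  children: [],
  position: {
    start: {line: 1, column: 1, offset: 0},
    end: {line: 1, column: 17, offset: 16}
  }
}

// ...and a root built by hand, without them.
const handmade = {type: 'root', children: []}

console.log(hasPosition(parsed)) // true
console.log(hasPosition(handmade)) // false
```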
Parameters
* `tree` (`MdastNode`): mdast tree to transform
* `file` (`VFile`): virtual file
* `Parser` (`ParserConstructor` or `ParserInstance`): parser to use
* `options` (`Options`, optional): configuration
Returns
nlcst tree (`NlcstNode`).
Options
Configuration (TypeScript type).
Fields
ignore
List of mdast node types to ignore (`Array<string>`, optional).
The types `'table'`, `'tableRow'`, and `'tableCell'` are always ignored.
Say we have the following file `example.md`:
A paragraph.

> A paragraph in a block quote.
…and if we now transform with `ignore: ['blockquote']`, we get:
RootNode[2] (1:1-3:1, 0-14)
├─0 ParagraphNode[1] (1:1-1:13, 0-12)
│   └─0 SentenceNode[4] (1:1-1:13, 0-12)
│       ├─0 WordNode[1] (1:1-1:2, 0-1)
│       │   └─0 TextNode "A" (1:1-1:2, 0-1)
│       ├─1 WhiteSpaceNode " " (1:2-1:3, 1-2)
│       ├─2 WordNode[1] (1:3-1:12, 2-11)
│       │   └─0 TextNode "paragraph" (1:3-1:12, 2-11)
│       └─3 PunctuationNode "." (1:12-1:13, 11-12)
└─1 WhiteSpaceNode "\n\n" (1:13-3:1, 12-14)
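The rule that some types are always ignored, in addition to whatever you pass, can be pictured as a simple list merge. This is only a sketch of the behavior described above, not the package's actual code. `effectiveIgnore` is a hypothetical helper.

```javascript
// Types that the text above says are always ignored.
const alwaysIgnored = ['table', 'tableRow', 'tableCell']

// Hypothetical helper: combine the built-in list with `options.ignore`.
function effectiveIgnore(options = {}) {
  return [...alwaysIgnored, ...(options.ignore || [])]
}

const ignore = effectiveIgnore({ignore: ['blockquote']})

console.log(ignore.includes('blockquote')) // true
console.log(ignore.includes('table')) // true, even though it was not passed
console.log(ignore.includes('paragraph')) // false
```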
source
List of mdast node types to mark as nlcst source nodes (`Array<string>`, optional).
The type `'inlineCode'` is always marked as source.
Say we have the following file `example.md`:
A paragraph.

> A paragraph in a block quote.
…and if we now transform with `source: ['blockquote']`, we get:
RootNode[3] (1:1-3:32, 0-45)
├─0 ParagraphNode[1] (1:1-1:13, 0-12)
│   └─0 SentenceNode[4] (1:1-1:13, 0-12)
│       ├─0 WordNode[1] (1:1-1:2, 0-1)
│       │   └─0 TextNode "A" (1:1-1:2, 0-1)
│       ├─1 WhiteSpaceNode " " (1:2-1:3, 1-2)
│       ├─2 WordNode[1] (1:3-1:12, 2-11)
│       │   └─0 TextNode "paragraph" (1:3-1:12, 2-11)
│       └─3 PunctuationNode "." (1:12-1:13, 11-12)
├─1 WhiteSpaceNode "\n\n" (1:13-3:1, 12-14)
└─2 ParagraphNode[1] (3:1-3:32, 14-45)
    └─0 SentenceNode[1] (3:1-3:32, 14-45)
        └─0 SourceNode "> A paragraph in a block quote." (3:1-3:32, 14-45)
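Unlike ignored nodes, source nodes stay in the tree, so later steps can either look at them or skip them. The sketch below hand-writes a plain object mirroring the tree above (positions omitted) and shows both. `sourcesOf` and `proseOf` are hypothetical helpers written for this example.

```javascript
// Hand-written copy of the tree above, without positional info.
const result = {
  type: 'RootNode',
  children: [
    {
      type: 'ParagraphNode',
      children: [
        {
          type: 'SentenceNode',
          children: [
            {type: 'WordNode', children: [{type: 'TextNode', value: 'A'}]},
            {type: 'WhiteSpaceNode', value: ' '},
            {type: 'WordNode', children: [{type: 'TextNode', value: 'paragraph'}]},
            {type: 'PunctuationNode', value: '.'}
          ]
        }
      ]
    },
    {type: 'WhiteSpaceNode', value: '\n\n'},
    {
      type: 'ParagraphNode',
      children: [
        {
          type: 'SentenceNode',
          children: [
            {type: 'SourceNode', value: '> A paragraph in a block quote.'}
          ]
        }
      ]
    }
  ]
}

// Hypothetical helper: collect the values of all source nodes.
function sourcesOf(node, found = []) {
  if (node.type === 'SourceNode') found.push(node.value)
  for (const child of node.children || []) sourcesOf(child, found)
  return found
}

// Hypothetical helper: serialize the tree but leave source nodes out.
function proseOf(node) {
  if (node.type === 'SourceNode') return ''
  if (typeof node.value === 'string') return node.value
  return (node.children || []).map(proseOf).join('')
}

console.log(sourcesOf(result)) // [ '> A paragraph in a block quote.' ]
console.log(JSON.stringify(proseOf(result))) // "A paragraph.\n\n"
```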
ParserConstructor
Create a new parser (TypeScript type).
Type
type ParserConstructor = new () => ParserInstance
ParserInstance
nlcst parser (TypeScript type).
For example, `parse-dutch`, `parse-english`, or `parse-latin`.
Type
type ParserInstance = {
  tokenizeSentencePlugins: ((node: NlcstSentence) => void)[]
  tokenizeParagraphPlugins: ((node: NlcstParagraph) => void)[]
  tokenizeRootPlugins: ((node: NlcstRoot) => void)[]
  parse(value: string | null | undefined): NlcstRoot
  tokenize(value: string | null | undefined): Array<NlcstSentenceContent>
}
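Because the `Parser` argument may be either a constructor or an instance, code that accepts it has to tell the two apart. The sketch below shows one way this could be done. `ExampleParser` is a deliberately naive stand-in with the shape of the type above and `resolveParser` is a hypothetical helper. Neither is taken from this package.

```javascript
// Stand-in for a real parser such as `ParseEnglish`; only the shape matters here.
class ExampleParser {
  constructor() {
    this.tokenizeSentencePlugins = []
    this.tokenizeParagraphPlugins = []
    this.tokenizeRootPlugins = []
  }

  // Wrap the tokens in a single sentence inside a single paragraph.
  parse(value) {
    const sentence = {type: 'SentenceNode', children: this.tokenize(value)}
    const paragraph = {type: 'ParagraphNode', children: [sentence]}
    return {type: 'RootNode', children: [paragraph]}
  }

  // Deliberately naive: treat the whole value as one word.
  tokenize(value) {
    if (!value) return []
    return [
      {type: 'WordNode', children: [{type: 'TextNode', value: String(value)}]}
    ]
  }
}

// Hypothetical helper: accept either a constructor or an instance.
function resolveParser(Parser) {
  return typeof Parser === 'function' ? new Parser() : Parser
}

const fromConstructor = resolveParser(ExampleParser)
const existing = new ExampleParser()
const fromInstance = resolveParser(existing)

console.log(fromConstructor instanceof ExampleParser) // true
console.log(fromInstance === existing) // true
```

Passing an instance is useful when you have already configured it, for example by adding functions to its plugin arrays.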
Types
This package is fully typed with TypeScript. It exports the types `Options`, `ParserConstructor`, and `ParserInstance`.
Compatibility
Projects maintained by the unified collective are compatible with all maintained versions of Node.js. As of now, that is Node.js 14.14+ and 16.0+. Our projects sometimes work with older versions, but this is not guaranteed.
Security
Use of `mdast-util-to-nlcst` does not involve hast so there are no openings for cross-site scripting (XSS) attacks.
Related
— transform mdast to hast
— transform hast to nlcst
— transform hast to mdast
— transform hast to xast
— sanitize hast nodes
Contribute
See `contributing.md` in `syntax-tree/.github` for ways to get started.
See `support.md` for ways to get help.
This project has a code of conduct. By interacting with this repository, organization, or community you agree to abide by its terms.