degausser

Transforms HTML to plain text by eliminating tags from a document.

HTML to plain text conversion.
For when you want to eliminate HTML tags from a document and leave reasonably rendered text behind.
The target algorithm is similar to the HTMLElement.innerText property of the HTML5 DOM, with the limitation that it does not take layout or styling into account.
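
To make the idea concrete, here is a rough sketch of an innerText-like walk that ignores layout and styling. It is not degausser's actual implementation. The node shape (`{ tag, children }` for elements, `{ text }` for text nodes), the `BLOCK` and `SKIP` sets, and the function names are all assumptions made up for illustration.

```javascript
// Hypothetical node shape: { tag, children } for elements, { text } for text.
const BLOCK = new Set(['div', 'p', 'h1', 'h2', 'h3', 'li']) // assumed block-level tags
const SKIP = new Set(['style', 'script']) // content that is never rendered as text

function walk (node, out) {
  if (node.text !== undefined) {
    // Collapse runs of whitespace in text nodes to a single space
    out.push(node.text.replace(/\s+/g, ' '))
    return
  }
  if (SKIP.has(node.tag)) return
  if (node.tag === 'br') {
    out.push('\n')
    return
  }
  const isBlock = BLOCK.has(node.tag)
  if (isBlock) out.push('\n\n') // blocks are separated by a blank line
  for (const child of node.children || []) walk(child, out)
  if (isBlock) out.push('\n\n')
}

function toPlainText (root) {
  const out = []
  walk(root, out)
  return out.join('')
    .replace(/ {2,}/g, ' ') // merge spaces left by adjacent text nodes
    .replace(/[ \t]*\n[ \t]*/g, '\n') // drop spaces around line breaks
    .replace(/\n{3,}/g, '\n\n') // keep at most one blank line
    .trim()
}

const tree = {
  tag: 'div',
  children: [
    { tag: 'h3', children: [{ text: 'For example:' }] },
    {
      tag: 'p',
      children: [
        { tag: 'style', children: [{ text: '#source { color: red; }' }] },
        { text: '\n  Take a look at\n  ' },
        { tag: 'br' },
        { text: '\n  ' },
        { tag: 'strong', children: [{ text: 'how' }] },
        { text: '\n  ' },
        { tag: 'em', children: [{ text: 'this' }] },
        { text: '\n  text' },
        { tag: 'br' },
        { text: 'is\n  ' },
        { tag: 'mark', children: [{ text: 'inter' }] },
        { text: 'preted\n  below.\n  ' },
        // No style information is consulted, so a display:none span still counts
        { tag: 'span', children: [{ text: 'HIDDEN TEXT' }] },
        { text: '\n' }
      ]
    }
  ]
}

console.log(toPlainText(tree))
```

Because the walk never consults CSS, text inside an element styled with display:none is still emitted, which is the limitation described above.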

Usage

Example:
import { degausser } from 'degausser'

const template = document.createElement('template')

template.innerHTML = `
<h3>For example:</h3>
<p id="source">
  <style>#source { color: red; }</style>
  Take a look at
  <br>
  <strong>how</strong>
  <em>this</em>
  text<br>is
  <mark>inter</mark>preted
  below.
  <span style="display:none">HIDDEN TEXT</span>
</p>
`

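// template.content is a DocumentFragment holding the parsed nodes
const documentFragment = template.content

// Convert the fragment to plain text
const plain = degausser(documentFragment)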

console.log(plain)

Output:
For example:

Take a look at
how this text
is interpreted below. HIDDEN TEXT

Using with Node.js

It's recommended to use jsdom for a DOM implementation.
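
A minimal sketch of what that could look like, assuming both packages are installed (`npm install jsdom degausser`), that degausser can be loaded with `require`, and that it needs nothing beyond the DOM node passed to it. The helper name `htmlToPlainText` and its optional `deps` parameter are hypothetical and exist only so the function can be tried with stand-ins.

```javascript
// Hypothetical helper: parse an HTML string with jsdom and pass the resulting
// <body> element to degausser. Packages are loaded lazily unless stand-ins
// are supplied through `deps`.
function htmlToPlainText (html, deps = {}) {
  const JSDOM = deps.JSDOM || require('jsdom').JSDOM
  const degausser = deps.degausser || require('degausser').degausser

  // jsdom builds a full document; the parsed markup ends up under document.body
  const dom = new JSDOM(html)
  return degausser(dom.window.document.body)
}

// Example (requires jsdom and degausser to be installed):
// console.log(htmlToPlainText('<h3>Title</h3><p>Some <em>text</em></p>'))
```

Passing `dom.window.document.body` mirrors passing the DocumentFragment in the browser example; whether other node types work the same way under jsdom is not confirmed here.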