detect-is-it-html-or-xhtml
Answers whether an input string is more HTML or XHTML (or neither).
Purpose
As you know, XHTML is slightly different from HTML: HTML (4 and 5) does not close <img> and other single tags, while XHTML does. There is more to it than that, but that's the major difference from a developer's perspective.
When I was working on email-remove-unused-css, I was parsing the HTML and rendering it back. At the rendering stage, I had to identify whether the source code was HTML or XHTML, so that I could instruct the renderer to close all the single tags (or not). I couldn't find any library that analyses code to tell whether it is HTML or XHTML. That's how detect-is-it-html-or-xhtml was born.
Feed the string into this library. If it's more of an HTML, it will output the string "html". If it's more of an XHTML, it will output the string "xhtml". If it doesn't contain any tags, or it does but there is no doctype and it's impossible to distinguish between the two, it will output null.
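To illustrate how the three possible results might be consumed at the rendering stage described above, here is a minimal sketch. The helper name renderSingleTag and its parameters are hypothetical and not part of this package, and treating null the same as "html" is an assumption made for the example.

```javascript
// Hypothetical helper, not part of detect-is-it-html-or-xhtml.
// flavour is expected to be 'html', 'xhtml' or null.
function renderSingleTag (name, attributes, flavour) {
  var attrs = Object.keys(attributes)
    .map(function (key) {
      return ' ' + key + '="' + attributes[key] + '"'
    })
    .join('')
  // Assumption: only 'xhtml' closes the tag; 'html' and null leave it open.
  return flavour === 'xhtml'
    ? '<' + name + attrs + ' />'
    : '<' + name + attrs + '>'
}

console.log(renderSingleTag('img', { src: 'some.jpg' }, 'xhtml'))
// => <img src="some.jpg" />
console.log(renderSingleTag('img', { src: 'some.jpg' }, 'html'))
// => <img src="some.jpg">
```

In real use, the third argument would come from calling the library on the original source string.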
Install
$ npm install --save detect-is-it-html-or-xhtml
Use
var detect = require('detect-is-it-html-or-xhtml')
console.log(detect('<img src="some.jpg" width="zzz" height="zzz" border="0" style="display:block;" alt="zzz"/>'))
// => 'xhtml'
API
detect(
htmlAsString // Some code in string format. Or some other string.
)
// => 'html'|'xhtml'|null
Under the hood
The algorithm is the following:
- Look for a doctype. If it is recognised, Bob's your uncle.
- If there's no doctype, or it's messed up beyond recognition, scan all singleton tags (<img>, <br> and <hr>) and see which type the majority is (closed or not closed).
- In the rare case when there is an equal number of closed and unclosed tags, lean towards html.
- If there are no tags in the input, or there is no doctype and no singleton tags, return null.
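The steps above can be sketched roughly as follows. This is an illustration written from the description, not the package's actual source: the function name detectFlavour is hypothetical, and the doctype rule (a doctype mentioning "xhtml" means XHTML, any other recognisable html doctype means HTML) and the regular expressions are assumptions.

```javascript
// Hypothetical sketch of the described steps; not the package's real code.
function detectFlavour (input) {
  if (typeof input !== 'string' || input.length === 0) {
    return null
  }

  // Step 1: look for a doctype.
  // Assumption: "xhtml" inside the doctype means XHTML, otherwise HTML.
  var doctype = input.match(/<!doctype\s+html\b[^>]*>/i)
  if (doctype) {
    return /xhtml/i.test(doctype[0]) ? 'xhtml' : 'html'
  }

  // Step 2: collect the singleton tags (<img>, <br>, <hr>).
  var singletons = input.match(/<(?:img|br|hr)\b[^>]*>/gi) || []
  if (singletons.length === 0) {
    // No doctype and no singleton tags: impossible to tell.
    return null
  }

  // Count tags ending in "/>" as closed, the rest as unclosed.
  var closed = singletons.filter(function (tag) {
    return /\/\s*>$/.test(tag)
  }).length
  var unclosed = singletons.length - closed

  // Step 3: the majority wins; a tie leans towards html.
  return closed > unclosed ? 'xhtml' : 'html'
}

console.log(detectFlavour('<img src="some.jpg" alt="zzz"/>'))
// => xhtml
console.log(detectFlavour('<br><hr/>'))
// => html
console.log(detectFlavour('just some text'))
// => null
```

The regular expressions here are deliberately simple and would be fooled by a ">" inside an attribute value; a real implementation may handle such cases differently.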
Contributing & testing
All contributions are welcome. This library uses Standard JavaScript notation. See test.js. It's a very minimalistic testing setup using AVA.
npm test
If you see anything incorrect whatsoever, raise an issue. PRs are welcome: fork, hack and PR.
Licence
MIT © Roy Reveltas