detect-is-it-html-or-xhtml
Answers whether the input string is more likely HTML or XHTML (or neither)
Install
npm i detect-is-it-html-or-xhtml
// consume using a CommonJS require:
const detect = require("detect-is-it-html-or-xhtml");
// or as a native ES Module:
import detect from "detect-is-it-html-or-xhtml";
// then, pass it a string containing HTML:
console.log(
  detect(
    '<img src="some.jpg" width="zzz" height="zzz" border="0" style="display:block;" alt="zzz"/>'
  )
);
// => 'xhtml'
Here's what you'll get:
Type | Key in package.json | Path | Size |
---|---|---|---|
Main export - CommonJS version, transpiled to ES5, contains require and module.exports | main | dist/detect-is-it-html-or-xhtml.cjs.js | 2 KB |
ES module build that Webpack/Rollup understands. Untranspiled ES6 code with import/export. | module | dist/detect-is-it-html-or-xhtml.esm.js | 2 KB |
UMD build for browsers, transpiled, minified, containing IIFEs and with all dependencies baked in | browser | dist/detect-is-it-html-or-xhtml.umd.js | 795 B |
Purpose
As you know, XHTML is slightly different from HTML: HTML (4 and 5) does not close the <img> and other single tags, while XHTML does. There is more to it than that, but that's the main difference from a developer's perspective.
When I was working on email-remove-unused-css, I was parsing the HTML and rendering it back. At the rendering stage, I had to identify whether the source code was HTML or XHTML, because I had to instruct the renderer to close all the single tags (or not close them). Ignoring this setting would have had nasty consequences: roughly speaking, my library would have produced the correct code in only half of the cases.
I couldn't find any library that analyses code and tells whether it is HTML or XHTML. That's how detect-is-it-html-or-xhtml was born.
Feed a string into this library. If it's more HTML-like, it will output the string "html". If it's more XHTML-like, it will output the string "xhtml". If your code doesn't contain any tags, or it does but there is no doctype and it's impossible to distinguish between the two, it will output null.
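To illustrate why the three possible results matter at the rendering stage, here is a minimal sketch of how calling code might act on them. `renderSingleton` and its `fallback` parameter are hypothetical names invented for this example. They are not part of this package, and attribute values are not escaped here.

```javascript
// Hypothetical helper, not part of detect-is-it-html-or-xhtml.
// `detected` is expected to be "html", "xhtml" or null, i.e. the kind of
// value this package returns. `fallback` is used when the result is null.
function renderSingleton(name, attrs, detected, fallback = "html") {
  const mode = detected === null ? fallback : detected;
  // Note: no escaping of attribute values, this is illustration only.
  const attrText = Object.entries(attrs)
    .map(([key, value]) => ` ${key}="${value}"`)
    .join("");
  // XHTML closes singleton tags, HTML leaves them open.
  return mode === "xhtml"
    ? `<${name}${attrText} />`
    : `<${name}${attrText}>`;
}

console.log(renderSingleton("img", { src: "a.jpg", alt: "" }, "xhtml"));
// => '<img src="a.jpg" alt="" />'
console.log(renderSingleton("img", { src: "a.jpg", alt: "" }, "html"));
// => '<img src="a.jpg" alt="">'
console.log(renderSingleton("br", {}, null));
// => '<br>'
```

Choosing what to do with a null result is up to the caller. Falling back to "html" here is only one possible choice.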
API
detect(
  htmlAsString // Some code in string format. Or some other string.
);
// => 'html'|'xhtml'|null
API - Input
Input argument | Type | Obligatory? | Description |
---|---|---|---|
htmlAsString | String | yes | String, hopefully containing some HTML code |
If the input is not of type String, this package will throw an error. If the input is missing completely, it will return null.
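If your input comes from an untrusted source and might not be a string, you may prefer to guard the call. Below is a hedged sketch: `safeDetect` is a hypothetical wrapper, and `stubDetect` is a stand-in that only mimics the input contract described above, so that the example runs without the package installed. In real code you would pass the imported `detect` instead.

```javascript
// Hypothetical wrapper, not part of the package: converts the thrown
// error for non-string input into a null ("could not tell") result.
function safeDetect(detectFn, input) {
  try {
    return detectFn(input);
  } catch (err) {
    return null;
  }
}

// Stand-in for the real `detect`, for illustration only. It mimics the
// documented input handling and does not analyse the string at all.
function stubDetect(input) {
  if (input === undefined) return null; // missing input
  if (typeof input !== "string") {
    throw new TypeError("input must be a string");
  }
  return null; // a real detector would inspect the string here
}

console.log(safeDetect(stubDetect, 42));
// => null (the TypeError is swallowed)
console.log(safeDetect(stubDetect));
// => null (missing input)
```

Swallowing the error hides programming mistakes, so this trade-off only makes sense where a null result is an acceptable answer for bad input.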
API - Output
Type | Value | Description |
---|---|---|
String or null | 'html', 'xhtml' or null | Identified type of your input |
Under the hood
The algorithm is the following:
- Look for a doctype. If it's recognised, Bob's your uncle, here's your answer.
- If there's no doctype or it's messed up beyond recognition, scan all singleton tags (<img>, <br> and <hr>) and see which type is in the majority (closed or not closed).
- In the rare case when there are equal numbers of closed and unclosed tags, lean towards html.
- If there are no tags in the input, or there is no doctype and no singleton tags, return null.
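The steps above can be sketched roughly as follows. This is not the package's actual source: `sketchDetect` is a hypothetical name, the doctype check is a simple substring test, and the regular expressions are a simplification that will misread attribute values containing ">".

```javascript
// Rough sketch of the steps listed above, NOT the package's real code.
function sketchDetect(input) {
  if (input === undefined) return null; // input missing completely
  if (typeof input !== "string") {
    throw new TypeError("sketchDetect: input must be a string");
  }

  // Step 1: look for a doctype and try to recognise it.
  const doctype = input.match(/<!doctype[^>]*>/i);
  if (doctype) {
    const declaration = doctype[0].toLowerCase();
    if (declaration.includes("xhtml")) return "xhtml";
    if (declaration.includes("html")) return "html";
    // Unrecognised doctype: fall through to tag counting.
  }

  // Step 2: collect the singleton tags <img>, <br> and <hr>.
  const singletons = input.match(/<(?:img|br|hr)\b[^>]*>/gi) || [];

  // Step 4: nothing to go on, so no answer.
  if (singletons.length === 0) return null;

  // Count how many singleton tags end with "/>".
  let closed = 0;
  for (const tag of singletons) {
    if (/\/\s*>$/.test(tag)) closed += 1;
  }
  const unclosed = singletons.length - closed;

  // Step 3: a tie leans towards "html".
  return closed > unclosed ? "xhtml" : "html";
}

console.log(sketchDetect('<img src="some.jpg" alt="zzz"/>')); // => 'xhtml'
console.log(sketchDetect("<br/><br>")); // => 'html' (tie)
console.log(sketchDetect("no tags here")); // => null
```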
Contributing
If you want a new feature in this package or you would like us to change some of its functionality, raise an issue on this repo.
If you tried to use this library and it misbehaves, or you need advice on setting it up and the readme doesn't make sense, describe the problem and raise an issue on this repo.
If you would like to add or change some features, just fork it, hack away, and file a pull request. We'll do our best to merge it quickly. Prettier is enabled, so you don't need to worry about the code style.
Licence
MIT License (MIT)
Copyright © 2018 Codsen Ltd, Roy Revelt