JSPM

  • License MIT

Answers the question: is the input string more like HTML or XHTML (or neither)?


Readme

detect-is-it-html-or-xhtml


Purpose

As you know, XHTML is slightly different from HTML: HTML (4 and 5) does not close <img> and other single tags, while XHTML does. There is more to it than that, but that's the main difference from a developer's perspective.

When I was working on email-remove-unused-css, I was parsing the HTML and rendering it back. At the rendering stage, I had to identify whether the source code was HTML or XHTML, because I had to instruct the renderer to close all the single tags (or not to close them). Ignoring this setting would have had nasty consequences: roughly speaking, my library would have produced correct code in only half of the cases.

I couldn't find any library that analyses code and tells whether it is HTML or XHTML. That's how detect-is-it-html-or-xhtml was born.

Feed a string into this library. If it's more like HTML, it will output the string "html". If it's more like XHTML, it will output the string "xhtml". If your code doesn't contain any tags, or it does but there is no doctype and it's impossible to distinguish between the two, it will output null.

Install

$ npm install --save detect-is-it-html-or-xhtml

Use

var detect = require('detect-is-it-html-or-xhtml')
console.log(detect('<img src="some.jpg" width="zzz" height="zzz" border="0" style="display:block;" alt="zzz"/>'))
// => 'xhtml'

API

detect(
  htmlAsString   // Some code in string format. Or some other string.
)
// => 'html'|'xhtml'|null

API - Input

Input argument | Type   | Obligatory? | Description
htmlAsString   | String | yes         | String, hopefully containing some HTML code

API - Output

Type           | Value                   | Description
String or null | 'html', 'xhtml' or null | Identified type of your input
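
Since the result can be one of three values, calling code usually has to decide what to do with null. The snippet below is only a sketch of one way to do that: singletonTagEnding and its fallback parameter are hypothetical names made up for this example and are not part of this library.

```javascript
// Hypothetical helper, NOT part of detect-is-it-html-or-xhtml.
// `type` is the value returned by detect(): 'html', 'xhtml' or null.
// `fallback` is an assumed, caller-chosen default used when `type` is null.
function singletonTagEnding (type, fallback) {
  var resolved = type === null ? (fallback || 'html') : type
  // XHTML closes single tags with "/>", HTML leaves them as ">"
  return resolved === 'xhtml' ? '/>' : '>'
}

console.log('<br' + singletonTagEnding('xhtml'))
// => '<br/>'
console.log('<br' + singletonTagEnding(null))
// => '<br>'
```

Defaulting to 'html' for null is a choice made for this sketch; pass a different fallback if your output should be XHTML.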

Under the hood

The algorithm is the following:

  1. Look for a doctype. If it is recognised, Bob's your uncle, here's your answer.
  2. If there's no doctype, or it's messed up beyond recognition, scan all singleton tags (<img>, <br> and <hr>) and see which type is in the majority (closed or not closed).
  3. In the rare case when there are equal numbers of closed and unclosed tags, lean towards html.
  4. If there are no tags in the input, or there is no doctype and there are no singleton tags, return null.
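
The steps above can be roughly sketched in code. This is an illustration written for this readme, not the library's actual source: the name sketchDetect and all the regular expressions are assumptions, and real-world markup (for example, a ">" inside an attribute value) would need more careful parsing.

```javascript
// Illustrative sketch only, NOT the actual implementation of this library.
// sketchDetect and its regular expressions are assumptions made for this example.
function sketchDetect (str) {
  if (typeof str !== 'string') return null

  // Step 1: a recognisable doctype settles the question
  var doctype = str.match(/<!doctype\s[^>]*>/i)
  if (doctype) {
    if (/xhtml/i.test(doctype[0])) return 'xhtml'
    if (/html/i.test(doctype[0])) return 'html'
    // unrecognised doctype: fall through to the tag count
  }

  // Step 2: collect the singleton tags <img>, <br> and <hr>
  var tags = str.match(/<(?:img|br|hr)\b[^>]*>/gi) || []

  // Step 4: nothing to go by
  if (tags.length === 0) return null

  // Count how many of them are closed, i.e. end with "/>"
  var closed = tags.filter(function (tag) {
    return /\/\s*>$/.test(tag)
  }).length
  var unclosed = tags.length - closed

  // Step 3: on a tie, lean towards html
  return closed > unclosed ? 'xhtml' : 'html'
}

console.log(sketchDetect('<img src="some.jpg" alt="zzz"/>'))
// => 'xhtml'
console.log(sketchDetect('<br><br/>'))
// => 'html'
console.log(sketchDetect('just some text'))
// => null
```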

Testing

$ npm test

Unit tests use AVA and JS Standard notation.

Contributing

All contributions are welcome. Please stick to Standard JavaScript notation and supplement the test.js with new unit tests covering your feature(s).

If you see anything incorrect whatsoever, do raise an issue. If you file a pull request, I'll do my best to help you get it merged in a timely manner. If you have any comments on the code, including ideas on how to improve things, don't hesitate to contact me by email.

Licence

MIT License (MIT)

Copyright (c) 2016 Code and Send Ltd, Roy Reveltas

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.