Package Exports
- @allystudio/url-utils
Readme
@allystudio/url-utils
Comprehensive URL normalization, validation, and manipulation utilities for web applications. This package consolidates all URL handling patterns used across the AllyStudio ecosystem and provides a robust, well-tested foundation for URL operations.
Features
- π§ URL Normalization - Consistent URL cleaning across browsers
- β Validation - Comprehensive URL validation with detailed error messages
- π― Extraction - Extract domains, hostnames, paths, and components
- π Comparison - Compare URLs with flexible options
- π International Support - Handles international domains and punycode
- π Performance - Optimized for high-throughput applications
- π¦ Zero Dependencies - Only depends on
tldts
for domain parsing - π§ͺ Well Tested - Comprehensive test suite with 100+ test cases
Installation
npm install @allystudio/url-utils
Quick Start
import { normalizeUrl, extractDomain, compareUrls } from '@allystudio/url-utils'
// Normalize URLs for consistent storage
const normalized = normalizeUrl('https://www.Google.com/Path/?utm_source=test')
console.log(normalized.full) // "google.com/path"
// Extract domain from any URL
const domain = extractDomain('https://sub.example.com/path')
console.log(domain) // "example.com"
// Compare URLs intelligently
const isSame = compareUrls(
'https://example.com/path?param=1',
'https://www.example.com/path'
) // true (ignores query params and www by default)
API Reference
Normalization
normalizeUrl(url, options?)
Normalizes a URL with consistent rules across browsers.
interface NormalizationOptions {
keepQueryParams?: boolean // Keep query parameters (default: false)
keepFragment?: boolean // Keep hash fragments (default: false)
removeWww?: boolean // Remove www prefix (default: true)
removeTrailingSlash?: boolean // Remove trailing slashes (default: true)
sortQueryParams?: boolean // Sort query parameters (default: true)
}
const result = normalizeUrl('https://www.example.com/path/?b=2&a=1', {
keepQueryParams: true,
sortQueryParams: true
})
// Returns: { hostname: "example.com", domain: "example.com", path: "/path", full: "example.com/path?a=1&b=2", raw: "..." }
normalizeUrlString(url, keepQueryParams?)
Simple string-based normalization (compatible with legacy code).
const normalized = normalizeUrlString('https://www.example.com/path', false)
// Returns: "example.com/path"
Validation
isValidPageUrl(url)
Quick validation for web page URLs.
isValidPageUrl('https://example.com') // true
isValidPageUrl('chrome://settings') // false
isValidPageUrl('localhost') // false
validateUrl(url)
Detailed validation with error messages.
const result = validateUrl('invalid-url')
// Returns: { isValid: false, error: "Invalid domain" }
Extraction
extractDomain(url)
/ extractHostname(url)
/ extractPath(url)
Extract specific components from URLs.
extractDomain('https://sub.example.com/path') // "example.com"
extractHostname('https://sub.example.com/path') // "sub.example.com"
extractPath('https://example.com/path?param=1') // "/path"
Comparison
compareUrls(url1, url2, options?)
Compare URLs with flexible options.
interface ComparisonOptions {
ignoreQueryParams?: boolean // Ignore query parameters (default: true)
ignoreFragment?: boolean // Ignore hash fragments (default: true)
ignoreCase?: boolean // Ignore case differences (default: true)
}
compareUrls(
'https://Example.com/path?param=1',
'https://www.example.com/path#section',
{ ignoreQueryParams: true, ignoreFragment: true }
) // true
compareHostnames(url1, url2)
Compare just the domains of two URLs.
compareHostnames('https://sub1.example.com', 'https://sub2.example.com') // true
Utility Functions
isHomepage(url)
Check if a URL points to a homepage.
isHomepage('https://example.com') // true
isHomepage('https://example.com/') // true
isHomepage('https://example.com/about') // false
isUrlUnderDomain(childUrl, parentDomain)
Check if a URL belongs to a specific domain.
isUrlUnderDomain('https://blog.example.com/post', 'example.com') // true
getDisplayUrl(url)
Create a display-friendly version of a URL.
getDisplayUrl('https://example.com/path?param=value')
// Returns: "example.com/path?param=value"
Use Cases
Website/Page Management
import { normalizeUrl, extractDomain, extractPath } from '@allystudio/url-utils'
// Normalize website URLs (remove query params)
const websiteUrl = normalizeUrl(userInput).full
// Extract domain and path for database storage
const domain = extractDomain(pageUrl)
const path = extractPath(pageUrl)
Web Crawling
import { normalizeUrlForCrawling, shouldSkipUrl, isUrlUnderDomain } from '@allystudio/url-utils'
// Normalize URLs for crawling (removes query params and fragments)
const normalized = normalizeUrlForCrawling(foundUrl, baseUrl)
// Skip non-HTML resources
if (shouldSkipUrl(url)) {
continue
}
// Stay within domain
if (!isUrlUnderDomain(foundUrl, targetDomain)) {
continue
}
Analytics & SEO
import { normalizeUrl, compareUrls, extractDomain } from '@allystudio/url-utils'
// Deduplicate URLs for analytics
const canonical = normalizeUrl(pageUrl, { keepQueryParams: false }).full
// Group pages by domain
const domain = extractDomain(pageUrl)
// Compare URLs ignoring tracking parameters
const isSamePage = compareUrls(url1, url2, { ignoreQueryParams: true })
Browser Compatibility
- Chrome/Edge: Full support
- Firefox: Full support (IDNs shown in native script)
- Safari: Full support (may handle some Unicode differently)
- Mobile: Full support (respects length limitations)
International Domain Support
The package fully supports international domain names (IDNs):
// Handles international domains
normalizeUrl('https://γ°γΌγ°γ«.jp/ζ€η΄’')
// Returns normalized punycode version
// Supports complex TLD structures
extractDomain('https://service.gov.uk') // "gov.uk"
extractDomain('https://university.edu.au') // "edu.au"
Error Handling
All functions include robust error handling:
try {
const result = normalizeUrl(userInput)
// Use result.full for normalized URL
} catch (error) {
// Handle invalid URL
console.error('Invalid URL:', error.message)
}
// Or use validation first
const validation = validateUrl(userInput)
if (!validation.isValid) {
console.error('Invalid URL:', validation.error)
} else {
const result = normalizeUrl(userInput)
}
Performance
The package is optimized for high-throughput applications:
- Efficient domain parsing with
tldts
- Minimal string operations
- Graceful fallbacks for invalid input
- No unnecessary object creation
Migration Guide
From AllyStudio URL Utils
// Old
import { normalizeUrl, extractDomain } from '@/utils/url'
// New
import { normalizeUrl, extractDomain } from '@allystudio/url-utils'
// API is identical, but returns structured objects
From Allyship URL Utils
// Old
import { normalizeUrl } from '@/utils/url'
const normalized = normalizeUrl(url, keepQueryParams)
// New
import { normalizeUrlString } from '@allystudio/url-utils'
const normalized = normalizeUrlString(url, keepQueryParams)
From Custom Implementations
// Replace custom normalization
function customNormalize(url) {
return url.replace(/^https?:\/\//, '').replace(/^www\./, '')
}
// With robust normalization
import { normalizeUrlString } from '@allystudio/url-utils'
const normalized = normalizeUrlString(url)
Contributing
This package is part of the AllyStudio ecosystem. See the main repository for contribution guidelines.
License
MIT License - see LICENSE file for details.