Mirror Web CLI v1.1.3
Professional Website Mirroring with Intelligent Framework Preservation & Enhanced Asset Processing
A powerful, universal website mirroring tool that intelligently detects and preserves framework structures while creating offline-ready websites. Works seamlessly with React, Next.js, Vue, Angular, Svelte, WordPress, and static sites.
Key Features
Intelligent Framework Detection
- Automatically detects 14+ frameworks (React, Vue, Angular, Next.js, Nuxt, Gatsby, Svelte, etc.)
- Comprehensive pattern matching with confidence scoring
- Framework-specific optimization strategies
Beautiful Terminal Experience
- Modern UI with gradient effects and smooth animations
- Professional progress tracking with step-by-step indicators
- Color-coded status messages and comprehensive feedback
Advanced Asset Processing
- Complete asset extraction and optimization (images, CSS, JS, fonts, icons, videos)
- Smart URL rewriting for offline functionality
- Framework-preserving structure generation
- Comprehensive video support with 14+ video formats (.mp4, .webm, .ogg, etc.)
Clean Code Generation
- Optional tracking script removal (analytics, GTM, Facebook Pixel)
- Professional project structure ready for development
- Offline-ready websites with localized resources
- Next.js/React error handling for graceful offline operation
Auto-Differentiated Output Directories
- Standard mirroring: Creates ./domain-standard/ directories
- AI-enhanced mirroring: Creates ./domain-ai-enhanced/ directories
- Easy comparison: Side-by-side analysis of different approaches
- Organized workflow: Never overwrite previous results
Recent Improvements (v1.1.3)
Enhanced Environment Variable System
- Priority-based .env loading with shell environment preservation
- Improved OpenAI API key handling with multiple configuration sources
- Better development workflow with .env.local support
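A minimal sketch of priority-based loading that leaves shell variables untouched. The file order (.env.local before .env) and the helper names parseEnv, mergeEnv, and loadEnv are assumptions made for illustration; the tool's actual loader may work differently.

```javascript
// Hypothetical helpers; not taken from the mirror-web-cli source.
const fs = require('node:fs');

// Parse KEY=VALUE lines, skipping blank lines and comments.
function parseEnv(text) {
  const out = {};
  for (const line of text.split(/\r?\n/)) {
    const m = line.match(/^\s*([A-Za-z_][A-Za-z0-9_]*)\s*=\s*(.*?)\s*$/);
    if (!m) continue;
    // Strip one pair of matching surrounding quotes, if present.
    out[m[1]] = m[2].replace(/^(['"])(.*)\1$/, '$2');
  }
  return out;
}

// Earlier sources win; values already present in `target` are never overwritten.
function mergeEnv(target, sources) {
  for (const src of sources) {
    for (const [key, value] of Object.entries(src)) {
      if (target[key] === undefined) target[key] = value;
    }
  }
  return target;
}

// Assumed priority: shell environment > .env.local > .env
function loadEnv(files = ['.env.local', '.env'], target = process.env) {
  const sources = files
    .filter((file) => fs.existsSync(file))
    .map((file) => parseEnv(fs.readFileSync(file, 'utf8')));
  return mergeEnv(target, sources);
}
```

Because mergeEnv only fills in missing keys, a key exported in the shell keeps its value even when a .env file defines the same name.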
Next.js Image Optimizer Support
- Robust handling of /_next/image endpoints with HTTP 402 avoidance
- Original image extraction from optimizer URLs
- Runtime asset rewriting with DOM mutation observer
- Enhanced offline compatibility for Next.js applications
Advanced Asset Processing
- Microlink integration for screenshot services
- Comprehensive hover/popover content capture
- Responsive image support with srcset rewriting
- Enhanced video and audio processing with extended timeouts
Smart Output Organization
- Auto-differentiated directories prevent accidental overwrites
- Easy comparison between standard and AI-enhanced results
- Professional project organization
Quick Start
Installation
# Global installation (recommended)
npm install -g mirror-web-cli
# Or run directly with npx (no installation required)
npx mirror-web-cli https://example.com
OpenAI API Setup (Optional)
For AI-powered website analysis, you'll need an OpenAI API key:
Option 1: Environment Variable (Recommended)
Windows PowerShell:
$env:OPENAI_API_KEY="sk-proj-your-openai-key-here"
Windows Command Prompt:
set OPENAI_API_KEY=sk-proj-your-openai-key-here
macOS/Linux (Bash/Zsh):
export OPENAI_API_KEY="sk-proj-your-openai-key-here"
Permanent Setup (recommended for regular use):
Windows (PowerShell as Administrator):
[System.Environment]::SetEnvironmentVariable('OPENAI_API_KEY', 'sk-proj-your-openai-key-here', 'User')
macOS/Linux (add to ~/.bashrc or ~/.zshrc):
echo 'export OPENAI_API_KEY="sk-proj-your-openai-key-here"' >> ~/.bashrc
source ~/.bashrc
Option 2: Command Line Parameter
mirror-web-cli https://example.com --ai --openai-key "sk-proj-your-key-here"
Requirements:
- Only OpenAI API keys are supported (must start with sk-)
- Uses OpenAI GPT-4o model for intelligent analysis
- Get your API key: OpenAI Platform
Basic Usage
# Standard mirroring (outputs to example.com-standard)
mirror-web-cli https://example.com
# AI-enhanced mirroring (outputs to example.com-ai-enhanced)
mirror-web-cli https://example.com --ai
# Clean mirror without tracking scripts
mirror-web-cli https://react-site.com --clean
# Custom output directory (overrides automatic naming)
mirror-web-cli https://vue-app.com -o ./my-project
# Debug mode with detailed logging
mirror-web-cli https://complex-site.com --debug
Auto-Differentiated Output Directories
Mirror Web CLI automatically creates different output directories based on the analysis method:
- Standard: ./domain-standard (e.g., ./example.com-standard)
- AI-Enhanced: ./domain-ai-enhanced (e.g., ./example.com-ai-enhanced)
- Custom: Uses the path you specify with the -o flag
This allows easy comparison between different analysis approaches and organized project management.
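The naming rules above can be sketched as a small function. The name outputDirFor and its option names are hypothetical, chosen for illustration; only the resulting directory names come from this README.

```javascript
// Hypothetical helper mirroring the documented naming rules.
function outputDirFor(targetUrl, { ai = false, output } = {}) {
  // A custom path (the -o flag) overrides automatic naming.
  if (output) return output;
  const { hostname } = new URL(targetUrl);
  return `./${hostname}-${ai ? 'ai-enhanced' : 'standard'}`;
}

console.log(outputDirFor('https://example.com'));
console.log(outputDirFor('https://example.com', { ai: true }));
```

Deriving the name from the hostname alone keeps subdomains distinct, so shop.example.com and example.com end up in separate directories.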
Serving the Output
# The tool generates a complete project structure
cd ./example.com-standard # or ./example.com-ai-enhanced
# Use any static server to serve the mirrored site
python -m http.server 8000
# Open http://localhost:8000
# Or use Node.js static server
npx serve .
How It Works
1. Intelligent Page Loading
- Launches headless browser with optimized settings
- Waits for framework-specific elements (#__next, #root, #app)
- Performs scroll-to-bottom for lazy-loaded content
- Waits for images and network idle state
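A rough sketch of two of these steps, assuming `page` is a Puppeteer Page (only its waitForSelector and evaluate methods are used). The function names, default timeout, and scroll step are illustrative assumptions, not the tool's real values.

```javascript
// Sketch only: resolves with the first framework root selector that appears,
// or null if none shows up before the timeout (likely a static site).
async function waitForFrameworkRoot(page, selectors = ['#__next', '#root', '#app'], timeout = 10000) {
  try {
    return await Promise.any(
      selectors.map((sel) => page.waitForSelector(sel, { timeout }).then(() => sel))
    );
  } catch {
    return null;
  }
}

// Scroll in steps until the scroll position stops changing, so that
// lazy-loaded content has a chance to be requested.
async function scrollToBottom(page, step = 600, pauseMs = 100) {
  await page.evaluate(async (stepPx, pause) => {
    let last = -1;
    while (window.scrollY !== last) {
      last = window.scrollY;
      window.scrollBy(0, stepPx);
      await new Promise((resolve) => setTimeout(resolve, pause));
    }
  }, step, pauseMs);
}
```

Racing the selectors with Promise.any avoids waiting out a full timeout for each framework in turn.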
2. Framework Analysis Engine
Detection Methods:
├── Script Source Analysis → Framework bundles & runtime files
├── DOM Element Inspection → Framework-specific containers
├── Meta Tag Analysis → Generator tags & signatures
├── Content Pattern Matching → Component structures
├── CSS Class Analysis → Framework styling patterns
├── JSON Data Detection → State management structures
└── Link Href Analysis → Framework asset paths
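To illustrate how several detection methods might combine into a confidence score, here is a minimal sketch. The signature table, the weights, and the detectFramework name are all hypothetical; the real analyzer covers more frameworks and pattern types.

```javascript
// Hypothetical signatures and weights, for illustration only.
const SIGNATURES = {
  'Next.js': [
    { weight: 0.5, test: (html) => /id=["']__next["']/.test(html) },       // DOM container
    { weight: 0.5, test: (html) => /\/_next\/static\//.test(html) },       // script/link paths
  ],
  React: [
    { weight: 0.6, test: (html) => /id=["']root["']/.test(html) },
    { weight: 0.4, test: (html) => /react(-dom)?(\.production)?(\.min)?\.js/.test(html) },
  ],
  WordPress: [
    { weight: 0.6, test: (html) => /<meta[^>]+name=["']generator["'][^>]+WordPress/i.test(html) },
    { weight: 0.4, test: (html) => /\/wp-content\//.test(html) },
  ],
};

// Sum the weights of matching checks per framework and keep the best score.
function detectFramework(html) {
  let best = { name: 'Static', confidence: 0 };
  for (const [name, checks] of Object.entries(SIGNATURES)) {
    const confidence = checks.reduce((sum, c) => sum + (c.test(html) ? c.weight : 0), 0);
    if (confidence > best.confidence) best = { name, confidence };
  }
  return best;
}
```

A page matching no signature falls back to 'Static' with zero confidence, which fits the "Static Sites: Always works" row in the support table.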
3. Comprehensive Asset Extraction
Asset Categories:
├── Images → src, srcset, lazy attributes, backgrounds
├── Stylesheets → External CSS + inline styles with url() rewriting
├── Scripts → External JS + inline scripts (with optional cleaning)
├── Fonts → Web fonts and icon fonts
├── Icons → Favicons and app icons
└── Media → Videos (.mp4, .webm, .ogg, .avi, .mov, etc.), audio files
4. Smart URL Rewriting
- Converts all absolute URLs to relative paths
- Creates organized asset directory structure
- Generates short, stable, hashed filenames
- Maintains proper file extensions and MIME types
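One way to produce short, stable, hashed filenames is to hash the asset URL. This sketch assumes SHA-1 truncated to 10 hex characters and a `.bin` fallback extension; neither is confirmed by the source, and localAssetPath is a hypothetical name.

```javascript
const crypto = require('node:crypto');
const path = require('node:path');

// Assumption: the hash algorithm and length are illustrative, not the tool's documented choice.
function localAssetPath(assetUrl, category = 'images') {
  const { pathname } = new URL(assetUrl);
  // Keep the original extension (lowercased); query strings are ignored here.
  const ext = path.posix.extname(pathname).toLowerCase() || '.bin';
  const hash = crypto.createHash('sha1').update(assetUrl).digest('hex').slice(0, 10);
  return `./assets/${category}/asset_${hash}${ext}`;
}

console.log(localAssetPath('https://example.com/img/logo.PNG?v=2'));
```

Hashing the full URL means the same asset always maps to the same local file, so repeated references across HTML and CSS resolve to one download.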
5. Framework-Preserving Output
Output Structure:
website.com/
├── index.html # Main page with framework intact
├── package.json # Project metadata & serve scripts
├── README.md # Usage instructions
├── server.js # Optional Node.js static server
└── assets/
    ├── images/ # All images with optimized names
    ├── css/ # Stylesheets with localized assets
    ├── js/ # JavaScript files (cleaned if --clean)
    ├── fonts/ # Web fonts and typography
    ├── icons/ # Favicons and app icons
    └── media/ # Videos (.mp4, .webm, .ogg), audio files, and other media
Next.js + Microlink offline support (v1.0.2)
Modern sites often use:
- Next.js Image Optimizer: /_next/image?url=<original>&w=<size>&q=<quality>
- Microlink-based previews: https://api.microlink.io/?url=... returning either JSON or direct images
This tool:
- Skips downloading /_next/image directly (avoids 402s)
- Extracts the original image URL from the url= param and downloads that
- Aliases /_next/image?... to the same local file as the original
- Injects a runtime MutationObserver rewriter that:
  - Rewrites src, href, poster, inline style background-image
  - Rewrites srcset and imagesrcset (browsers prefer srcset over src)
  - Handles dynamically added DOM (hover cards, popovers, etc.)
- Captures Microlink responses; if JSON, follows to the actual screenshot URL and downloads bytes
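The url= extraction and srcset rewriting described above can be sketched as two pure functions. The names originalFromNextImage and rewriteSrcset are hypothetical, and the srcset splitter is a simplification that assumes candidate URLs contain no unencoded commas.

```javascript
// Return the original image URL behind a Next.js optimizer URL,
// or null if `src` does not point at /_next/image.
function originalFromNextImage(src, pageUrl) {
  const u = new URL(src, pageUrl);
  if (!u.pathname.endsWith('/_next/image')) return null;
  const original = u.searchParams.get('url'); // already percent-decoded
  return original ? new URL(original, pageUrl).href : null;
}

// Rewrite every candidate in a srcset while keeping its descriptor (1x, 640w, ...).
function rewriteSrcset(srcset, mapUrl) {
  return srcset
    .split(',')
    .map((part) => part.trim())
    .filter(Boolean)
    .map((part) => {
      const [url, ...descriptor] = part.split(/\s+/);
      return [mapUrl(url), ...descriptor].join(' ');
    })
    .join(', ');
}
```

In a runtime rewriter, a function like rewriteSrcset would be applied to both srcset and imagesrcset attributes, since browsers pick from those before falling back to src.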
Verification
1. Run with --debug and open the DevTools Console
2. Interact with the page (e.g., hover “Preview” links)
3. Look for lines like:
[MW rewrite] imagesrcset: /_next/image?url=... -> ./assets/images/asset_dc814d3448.png 1x, ...
4. Open the local asset path (e.g., http://localhost:8000/assets/images/asset_dc814d3448.png)
Troubleshooting (quick)
Blank hover/popover preview
- Serve over HTTP (not file://)
- Ensure srcset/imagesrcset are being rewritten (use --debug)
- Open the local asset URL from logs; if 404, rebuild the mirror
HTTP 402 from Next.js /_next/image
- Expected; the tool avoids these endpoints and downloads the original target from url=
Helpful snippet to locate candidates:
document.querySelectorAll('img, [style]').forEach(n => {
  const src = n.currentSrc || n.getAttribute('src') || '';
  const styleAttr = n.getAttribute('style') || '';
  const bg = getComputedStyle(n).backgroundImage || '';
  const hay = [src, styleAttr, bg].join(' ');
  if (/(microlink|_next\/image|og|twitter|card)/i.test(hay)) {
    console.log('el:', n, { src, styleAttr, bg });
  }
});
CLI Reference
Usage: mirror-web-cli <url> [options]
Arguments:
url Target website URL to mirror
Options:
-o, --output <dir> Custom output directory (default: domain name)
--clean Remove tracking scripts and analytics
--ai Enable AI-powered analysis (requires OpenAI API key)
--openai-key <key> OpenAI API key for AI features (or set OPENAI_API_KEY env var)
--debug Enable detailed debug logging
--timeout <ms> Page load timeout in milliseconds (default: 120000)
--headless <bool> Run browser in headless mode (default: true)
-h, --help Show help information
-V, --version Show version number
OpenAI API Key Priority
The tool checks for OpenAI API keys in this order:
1. --openai-key command line parameter
2. OPENAI_API_KEY environment variable
- If neither is found, AI features are disabled with a helpful message
- Keys must start with sk- (validated automatically)
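The lookup order can be expressed in a few lines. This is a sketch under the rules listed above; resolveOpenAIKey and the shape of its return value are hypothetical.

```javascript
// Hypothetical helper: CLI parameter first, then environment variable.
function resolveOpenAIKey({ cliKey, env = process.env } = {}) {
  const key = cliKey || env.OPENAI_API_KEY;
  if (!key) {
    return { key: null, reason: 'no key provided; AI features disabled' };
  }
  if (!key.startsWith('sk-')) {
    return { key: null, reason: 'key must start with "sk-"' };
  }
  return { key, reason: null };
}
```

Returning a reason alongside a null key lets the caller print a helpful message instead of failing outright.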
Framework Support
Framework | Detection | Preservation | Output Quality |
---|---|---|---|
React | ✅ High confidence | ✅ Component structure | ⭐⭐⭐⭐⭐ |
Next.js | ✅ Advanced patterns | ✅ SSR/SSG structure | ⭐⭐⭐⭐⭐ |
Vue.js | ✅ Reactive patterns | ✅ Template structure | ⭐⭐⭐⭐⭐ |
Nuxt | ✅ SSR detection | ✅ Module organization | ⭐⭐⭐⭐⭐ |
Angular | ✅ Component analysis | ✅ Module structure | ⭐⭐⭐⭐⭐ |
Svelte | ✅ Store patterns | ✅ Component logic | ⭐⭐⭐⭐⭐ |
Gatsby | ✅ GraphQL detection | ✅ Static generation | ⭐⭐⭐⭐⭐ |
WordPress | ✅ Theme detection | ✅ Content structure | ⭐⭐⭐⭐ |
Static Sites | ✅ Always works | ✅ Clean HTML/CSS/JS | ⭐⭐⭐⭐⭐ |
Usage Examples
Basic Website Mirroring
# Simple static site
mirror-web-cli https://example.com
# → Creates: ./example.com-standard/ with complete offline functionality
React Application
# React SPA with complex routing
mirror-web-cli https://react-app.com --clean
# → Creates: ./react-app.com-standard/ preserves React structure, removes tracking, offline-ready
Next.js Website
# Next.js with image optimization and error handling
mirror-web-cli https://nextjs-site.com --clean
# → Creates: ./nextjs-site.com-standard/ with enhanced Next.js compatibility
# → Handles /_next/image URLs, fixes hydration issues, preserves SSR structure
E-commerce Site
# Complex site with lots of assets
mirror-web-cli https://shop.example.com --debug --clean
# → Creates: ./shop.example.com-standard/ with detailed logging, removes analytics
AI-Powered Analysis (OpenAI)
Windows PowerShell:
# Set environment variable first
$env:OPENAI_API_KEY="sk-proj-your-openai-key-here"
mirror-web-cli https://complex-app.com --ai --clean
# → Creates: ./complex-app.com-ai-enhanced/ with OpenAI GPT-4o framework analysis
macOS/Linux:
# Set environment variable first
export OPENAI_API_KEY="sk-proj-your-openai-key-here"
mirror-web-cli https://complex-app.com --ai --clean
# → Creates: ./complex-app.com-ai-enhanced/ with OpenAI GPT-4o framework analysis
Cross-platform (using CLI parameter):
# Compare standard vs AI-enhanced outputs
mirror-web-cli https://react-app.com --clean # → ./react-app.com-standard/
mirror-web-cli https://react-app.com --ai --clean # → ./react-app.com-ai-enhanced/
Development Workflow
# Mirror for development reference
mirror-web-cli https://design-system.com -o ./reference
cd ./reference
npm start # Built-in development server
Video-Rich Websites
# Websites with hero videos (like VS Code, Apple, etc.)
mirror-web-cli https://code.visualstudio.com --clean
# → Downloads all video formats (.mp4, .webm), preserves video posters
# → Handles responsive video sources with media queries
# → Supports autoplay, muted, and poster attributes
# Complex video embedding
mirror-web-cli https://video-heavy-site.com --timeout 180000
# → Extended timeout for large video downloads
# → Maintains video element structure and JavaScript controls
Terminal UI Showcase
────────────────────────────────────────────────────────────────────────────────
Mirror Web CLI v1.1.3
Professional Website Mirroring
────────────────────────────────────────────────────────────────────────────────
Features:
• Intelligent framework detection (React, Vue, Angular, Next.js, etc.)
• Framework-preserving output with professional structure
• Comprehensive asset extraction and optimization
• Clean code generation with tracking script removal
Quick Start:
mirror-web-cli https://example.com
mirror-web-cli https://react-app.com --clean -o ./my-project
Progress Tracking
╭──────────────────────────────────────────────────────────────────────────────╮
  Step 3/7 • Framework Analysis
  Detecting technology stack and framework patterns...
╰──────────────────────────────────────────────────────────────────────────────╯
╭──────────────────────────────────────────────────────────────────────────────╮
  Framework Analysis
  Framework: Next.js
  Confidence: 95%
  Complexity: HIGH
  Strategy: Preserve DOM; localize assets for exact Next.js look
╰──────────────────────────────────────────────────────────────────────────────╯
Privacy & Security
Tracking Removal (--clean flag)
- Google Analytics (gtag, ga, analytics.js)
- Google Tag Manager (gtm, dataLayer)
- Facebook Pixel (fbevents, facebook.com/tr)
- Service Workers (registration scripts)
- Third-party trackers (extensive database)
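As an illustration of how script filtering for --clean might work, the sketch below matches a script's src or inline content against a short pattern list. The patterns are a hypothetical subset derived from the tracker names above, not the tool's actual database, and isTrackingScript is an invented name.

```javascript
// Hypothetical pattern list; the real tool is described as using a larger database.
const TRACKER_PATTERNS = [
  /googletagmanager\.com/i,                       // Google Tag Manager / gtag loader
  /google-analytics\.com/i,                       // analytics.js
  /\bgtag\s*\(/,                                  // inline gtag() calls
  /\bdataLayer\b/,                                // GTM data layer
  /connect\.facebook\.net\/.*fbevents\.js/i,      // Facebook Pixel loader
  /facebook\.com\/tr\b/i,                         // Facebook Pixel beacon
  /navigator\.serviceWorker\.register/,           // service worker registration
];

// A script is flagged if its URL or inline body matches any pattern.
function isTrackingScript({ src = '', content = '' }) {
  return TRACKER_PATTERNS.some((re) => re.test(src) || re.test(content));
}
```

Checking both src and inline content matters because tag managers are usually loaded by a small inline bootstrap rather than a plain script tag.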
Safety Considerations
- Always respect robots.txt and terms of service
- Ensure you have permission to mirror content
- Use responsibly and ethically
- Consider rate limiting for large sites
Architecture Overview
src/
├── cli.js # Command-line interface & argument parsing
├── core/ # Core functionality modules
│   ├── mirror-cloner.js # Main orchestrator class
│   ├── browser-engine.js # Puppeteer browser management
│   ├── framework-analyzer.js # Intelligent framework detection
│   ├── asset-manager.js # Comprehensive asset extraction
│   ├── framework-writer.js # Output generation & structure
│   ├── display.js # Beautiful terminal UI system
│   ├── logger.js # Logging & warning management
│   ├── file-writer.js # File system operations
│   ├── filename-utils.js # Smart filename generation
│   └── server.js # Optional static server
└── ai/ # AI-powered analysis (optional)
    └── ai-analyzer.js # OpenAI integration for analysis
Extending the Tool
Adding New Framework Detection
// In src/core/framework-analyzer.js
this.frameworks.myframework = {
name: 'My Framework',
patterns: [
{ type: 'script', pattern: /myframework\.js/ },
{ type: 'element', selector: '#my-app' },
{ type: 'meta', name: 'generator', pattern: /myframework/i }
]
};
Custom Asset Processing
// In src/core/asset-manager.js
async extractCustomAssets() {
// Add your custom asset extraction logic
}
Contributing
We welcome contributions! Here's how to get started:
# Development setup
git clone https://github.com/SanjeevSaniel/mirror-web-cli.git
cd mirror-web-cli
npm install
# Run tests
npm test
# Development with debugging
npm run dev -- https://example.com --debug
Key Areas for Contribution
- Framework Detection: Add support for new frameworks
- Asset Processing: Improve extraction algorithms
- Output Optimization: Enhance generated code quality
- Terminal UI: Improve user experience
- Documentation: Help others understand the tool
Troubleshooting
Common Issues
"Cannot read properties of undefined" Error
- Fixed in v1.0 - update to latest version
- Use the --debug flag for detailed error information
Incomplete Asset Loading
- Increase timeout: --timeout 180000 (3 minutes)
- Check network connectivity
- Some dynamic content may require JavaScript enabled
Framework Not Detected
- Use --debug to see the detection process
- Framework patterns may need updating for newer versions
- Manual inspection may be needed for custom frameworks
Environment Variable Issues
Windows PowerShell "export command not found":
# ❌ Wrong (Bash syntax)
export OPENAI_API_KEY="sk-..."
# ✅ Correct (PowerShell syntax)
$env:OPENAI_API_KEY="sk-..."
Windows Command Prompt:
REM ✅ Correct (CMD syntax)
set OPENAI_API_KEY=sk-your-key-here
Verify environment variable is set:
# PowerShell
echo $env:OPENAI_API_KEY
# Command Prompt
echo %OPENAI_API_KEY%
# Bash/Zsh
echo $OPENAI_API_KEY
AI Features Not Working
- Verify OpenAI API key is set correctly (see above)
- Check API key format: Must start with sk-
- Ensure sufficient OpenAI credits/quota
- Use --debug to see the AI analysis process
Blank Screen or Empty Content
Iframe-based sites (like hitesh.ai):
- Some sites are just iframe wrappers pointing to external URLs
- Example: hitesh.ai loads hiteshchoudhary.com in an iframe
- Solution: Mirror the actual content site directly:
# Instead of the wrapper
mirror-web-cli https://hitesh.ai
# Mirror the actual content
mirror-web-cli https://hiteshchoudhary.com --clean
Sites with heavy JavaScript dependencies:
Some React/Next.js sites may need additional processing
Try AI-enhanced mode for better framework handling:
mirror-web-cli https://your-site.com --ai --clean
Getting Help
- Check the GitHub Issues
- Use the --debug flag for detailed logging
- Include error output when reporting bugs
Performance Stats
- Average Processing Time: 15-45 seconds per site
- Asset Extraction Rate: 95%+ success rate
- Framework Detection Accuracy: 90%+ for supported frameworks
- Memory Usage: Optimized for large sites (>1000 assets)
Acknowledgments
Special thanks to the amazing open-source community:
- Puppeteer - Headless browser automation
- Cheerio - Server-side HTML parsing
- Chalk - Terminal styling
- Commander - CLI framework
- Sharp - Image processing
License
MIT License - see LICENSE file for details.
Made with ❤️ by Sanjeev Saniel Kujur
Convert any website to universal HTML/CSS/JS with intelligent framework preservation!