Package Exports

document-processing-cleaner

Readme

📄 react-document-cleaner

A React hook-based utility for processing document images in the browser using DeepLab (TensorFlow.js) and OpenCV.js. It segments paper-like regions, applies masks, deskews the result, and extracts the clean document for use in OCR, analysis, or archival.

✨ Features

✅ DeepLab segmentation using TensorFlow.js

✅ In-browser OpenCV.js post-processing

✅ Auto masking and document region extraction

✅ Deskewing and perspective transformation

✅ Debug image inspection support

Example Usage

Check out the live demo to see how it works.

📦 Installation

Install with npm:

npm install react-document-cleaner

Or with yarn:

yarn add react-document-cleaner

🚀 Quick Start

Load the DeepLab model

import { loadDeepLabModel } from 'react-document-cleaner';

const model = await loadDeepLabModel();

Use the OpenCV loader hook

import { useOpenCVReady } from 'react-document-cleaner';

const isOpenCVReady = useOpenCVReady();

Use the document processor hook

import { useDocumentProcess } from 'react-document-cleaner';

const {
  originalImage,
  processedImage,
  debugImages,
  isProcessing,
  fileInputRef,
  handleImageUpload,
  processImage,
  setOriginalImage
} = useDocumentProcess(model, isOpenCVReady);

Build your UI

<input type="file" ref={fileInputRef} onChange={handleImageUpload} />
<button onClick={processImage} disabled={!model || !isOpenCVReady || isProcessing}>
  Process Document
</button>

{originalImage && <img src={originalImage} alt="Original" />}
{processedImage && <img src={processedImage} alt="Cleaned" />}

{debugImages.map((entry, i) => {
  const [label, url] = entry.split('|');
  return (
    <div key={i}>
      <strong>{label}</strong>
      <img src={url} alt={label} />
    </div>
  );
})}
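The UI example above suggests that each debug entry is a single string of the form "label|url". As a small sketch under that assumption, the hypothetical helper below (parseDebugEntry is not part of the package) splits on the first separator only and falls back to a generic label when no separator is present:

```javascript
// Assumption: each debug entry is one string shaped like "label|url".
// parseDebugEntry is an illustrative helper, not an export of the package.
function parseDebugEntry(entry) {
  const separatorIndex = entry.indexOf('|');
  if (separatorIndex === -1) {
    // No separator found: treat the whole string as the URL.
    return { label: 'debug', url: entry };
  }
  return {
    // Everything before the first "|" is the label.
    label: entry.slice(0, separatorIndex),
    // Everything after the first "|" is kept intact as the URL.
    url: entry.slice(separatorIndex + 1),
  };
}
```

Inside the map callback, `const { label, url } = parseDebugEntry(entry);` could then replace the direct `split('|')` call.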

🔍 Under the Hood

loadDeepLabModel: Loads a pretrained DeepLab model.

useOpenCVReady: Loads OpenCV.js and tracks when it's ready.

useDocumentProcess: Handles uploading, segmenting, masking, deskewing, and returning cleaned images.
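The package's internals are not shown here, so the following is only an illustration of one step that deskewing pipelines commonly need: putting the four detected document corners into a consistent order before a perspective transform. The function name orderCorners and the { x, y } point shape are assumptions made for this sketch.

```javascript
// Illustrative only: a common corner-ordering heuristic, not the package's code.
// points: array of four { x, y } objects in image coordinates (y grows downward).
function orderCorners(points) {
  if (points.length !== 4) {
    throw new Error('Expected exactly four points');
  }
  // x + y is smallest at the top-left corner and largest at the bottom-right.
  const bySum = [...points].sort((a, b) => (a.x + a.y) - (b.x + b.y));
  // y - x is smallest at the top-right corner and largest at the bottom-left.
  const byDiff = [...points].sort((a, b) => (a.y - a.x) - (b.y - b.x));
  return {
    topLeft: bySum[0],
    bottomRight: bySum[3],
    topRight: byDiff[0],
    bottomLeft: byDiff[3],
  };
}
```

This heuristic assumes the document is only moderately rotated; a page turned close to 45 degrees can make the sums or differences ambiguous.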

📁 Output

originalImage: Base64 of the uploaded image

processedImage: Cleaned, deskewed base64 image

debugImages: Array of labeled debug image base64 strings
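Because the outputs are base64 strings used directly as image sources, they are presumably data URLs. Assuming that format, the hypothetical helper below (dataUrlToBytes is not part of the package) decodes one into raw bytes, which can be useful before uploading the result or handing it to an OCR step:

```javascript
// Assumption: the image strings are data URLs like "data:image/png;base64,....".
// dataUrlToBytes is an illustrative helper, not an export of the package.
function dataUrlToBytes(dataUrl) {
  const match = /^data:([^;,]+);base64,(.*)$/.exec(dataUrl);
  if (!match) {
    throw new Error('Expected a base64 data URL');
  }
  const [, mimeType, base64] = match;
  // atob decodes base64 into a binary string (available in browsers and modern Node.js).
  const binary = atob(base64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return { mimeType, bytes };
}
```

In the browser, `new Blob([bytes], { type: mimeType })` then yields a Blob suitable for download links or form uploads.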

⚠️ Requirements

Runs only in the browser (uses window, ...)

Needs OpenCV.js, which useOpenCVReady loads dynamically

Requires a TensorFlow.js-compatible environment
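Since the package is browser-only, code that may also run on a server (for example during server-side rendering) can check for a browser environment before loading the model. A minimal sketch, where isBrowserEnvironment is a hypothetical helper and the injectable globalObject parameter exists only to make it easy to test:

```javascript
// Illustrative guard, not part of the package.
// Returns true when a browser-style global `window` is present.
function isBrowserEnvironment(globalObject = globalThis) {
  return typeof globalObject.window !== 'undefined';
}
```

A caller might skip loadDeepLabModel entirely when this returns false and defer loading until the component runs on the client.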

🧪 Model Classes

By default, the DeepLab model looks for document-like classes with IDs:

const paperClassIds = [15, 14, 72]; // typically person, book, paper-like regions
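How the package applies these IDs internally is not documented here. As a rough sketch of the general idea, assuming the segmentation result is a flat array with one class ID per pixel, a binary mask can be built by marking every pixel whose class is in the list. The function buildPaperMask and the flat-array layout are assumptions for illustration.

```javascript
// Class IDs quoted from the README; everything else is illustrative.
const paperClassIds = [15, 14, 72];

// segmentationMap: flat array with one class ID per pixel (assumed layout).
// Returns a Uint8Array with 255 for document-like pixels and 0 elsewhere.
function buildPaperMask(segmentationMap, classIds = paperClassIds) {
  const wanted = new Set(classIds);
  const mask = new Uint8Array(segmentationMap.length);
  for (let i = 0; i < segmentationMap.length; i++) {
    mask[i] = wanted.has(segmentationMap[i]) ? 255 : 0;
  }
  return mask;
}
```

A 0/255 single-channel mask of this kind is the usual input shape for OpenCV-style contour detection and masking.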
