JSPM

  • License MIT

Package Exports

    This package does not declare an exports field, so the exports above have been automatically detected and optimized by JSPM instead. If any package subpath is missing, it is recommended to post an issue to the original package (text-hoarder) to support the "exports" field. If that is not possible, create a JSPM override to customize the exports field for this package.

    Text Hoarder CLI Docs

    The Text Hoarder browser extension comes with an optional command line companion that can generate statistics about your saved articles, process them for text-to-speech software, and find likely spam lines.

    Getting started

    As a prerequisite, you need to have Node.js installed.

    # Replace YOUR_USERNAME with your GitHub username.
    # Replace YOUR_TEXT_HOARDER_REPOSITORY with the name of the repository you
    # created to store Text Hoarder's saved articles
    git clone https://github.com/YOUR_USERNAME/YOUR_TEXT_HOARDER_REPOSITORY
    cd YOUR_TEXT_HOARDER_REPOSITORY
    # Installs Text Hoarder CLI companion
    npm install
    # Shows documentation for Text Hoarder CLI companion
    npx text-hoarder --help

    If you need help cloning the repository from the command line, see GitHub's documentation.

    If you are a Windows user, consider running this command in your terminal to allow Git to handle long file names:

    git config --global core.longpaths true

    Without this, "git clone" may fail if your Text Hoarder repository has saved articles with very long URLs.

    Generating Stats

    You can create a webpage with comprehensive statistics about the saved articles using the npx text-hoarder stats command.

    Example Usage

    # Open the repository you created to store Text Hoarder's saved articles
    cd YOUR_TEXT_HOARDER_REPOSITORY
    # Generate stats based on all saved articles and open results in your browser.
    # To see all options, run "npx text-hoarder stats --help"
    npx text-hoarder stats

    Example output:

    Computing statistics...
    1%
    ... trimmed ...
    99%
    100%
    Finalizing output...

    Once complete, stats.html will open in your browser. The stats page shows:

    • A chart of saved articles per time period and a button to download the stats as JSON
    • Metrics for the total number of articles, paragraphs, sentences, unique words, words and characters
    • Tables of the most commonly saved websites and the most common words
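As a rough illustration of what a few of these metrics mean, here is a minimal shell sketch that counts articles, words and unique words in a directory of Markdown files. It is not how the stats command is implemented. The function name article_metrics and the assumption that each saved article is a single .md file are hypothetical.

```shell
# Hypothetical helper: print rough metrics for all .md files under a directory.
# Assumption: each saved article is a single Markdown file.
article_metrics() {
  dir="$1"
  # Number of articles = number of .md files
  articles=$(find "$dir" -type f -name '*.md' | wc -l | tr -d ' ')
  # Total words: turn every run of non-letters into a newline, count non-empty lines
  words=$(find "$dir" -type f -name '*.md' -exec cat {} + \
    | tr -cs '[:alpha:]' '\n' | grep -c .)
  # Unique words: same split, lowercased, then de-duplicated
  unique=$(find "$dir" -type f -name '*.md' -exec cat {} + \
    | tr -cs '[:alpha:]' '\n' | tr '[:upper:]' '[:lower:]' \
    | grep . | sort -u | wc -l | tr -d ' ')
  echo "articles=$articles words=$words unique_words=$unique"
}
```

Calling article_metrics . from inside the repository prints a single summary line. The word splitting here is deliberately naive (letters only), so the numbers will differ from what stats.html reports.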

    Processing Text

    The npx text-hoarder process command optimizes saved articles for text-to-speech software: it removes likely spam and advertisement lines, removes characters that text-to-speech software handles poorly, and so on.

    This command also converts markdown files to plaintext and splits large articles into smaller files to work around the max length limit in some text-to-speech tools.
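The idea of splitting a long article into smaller parts can be sketched with the standard split utility. This is only an illustration, not the CLI's own logic: the helper name split_article, the 200-line default and the part-file naming are all hypothetical, and the real command may split by a different measure than line count.

```shell
# Hypothetical helper: split a plain-text file into parts of at most N lines.
# Usage: split_article article.txt 200
# Produces article.txt.part-aa, article.txt.part-ab, ...
split_article() {
  file="$1"
  max_lines="${2:-200}"
  # -l sets the maximum number of lines per output file;
  # the last argument is the prefix for the generated file names.
  split -l "$max_lines" "$file" "$file.part-"
}
```

The original file is left in place, so the parts can be regenerated with a different limit if a text-to-speech tool still rejects them.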

    By default, it processes all new articles saved since the last time this command was run.

    Example Usage

    # Open the repository you created to store Text Hoarder's saved articles
    cd YOUR_TEXT_HOARDER_REPOSITORY
    # Process all articles saved since the last time this command was run.
    # To see all options, run "npx text-hoarder process --help"
    npx text-hoarder process

    By default, process automatically removes duplicated lines between saved articles. Why this is useful:

    • If you accidentally saved the same article twice, this step will remove the duplicate
    • It will automatically remove commonly repeated lines like Advertisement, or footers from websites (e.g., wired.com has a lot of lines like More Great WIRED Stories at the end of each article)
    • Some websites are not fully accessibility-compliant, which leads tools like Text Hoarder to extract the same line twice in a row. This step will remove the duplicates.

    If you wish to disable this, pass the --no-exclude-duplicated-lines option when running the command.
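To make the effect of duplicate-line removal concrete, here is a minimal sketch using awk. It is an assumption-laden illustration rather than the CLI's implementation: the helper name dedupe_lines is hypothetical, and this version keeps the first occurrence of each line, which may not match what process does.

```shell
# Hypothetical helper: print the given files with repeated lines removed.
# A line is kept only the first time it is seen across all files;
# blank lines are always kept so paragraph breaks survive.
dedupe_lines() {
  awk 'NF == 0 || !seen[$0]++' "$@"
}
```

For example, dedupe_lines a.txt b.txt prints both files back to back, dropping any line of b.txt that already appeared in a.txt.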

    Converting Processed Text to Audio

    The output of the npx text-hoarder process command can be used with various text-to-speech software. This is a great way to consume saved articles while doing other things, like walking or household chores.

    Here is a small example script for converting the processed text files to audio using macOS's "say" utility:

    # Find the directory where process outputted the files
    cd processed/ && ls
    # Open the directory where the processed text files are located
    cd 2024-02-18
    # Convert each text file that hasn't yet been converted
    for f in *.txt; do
      echo "Generating $f.flac"
      # -r controls speaking rate, -f reads the text from a file.
      # Run "man say" to see all options
      # If the conversion fails, keep the text file and move on
      say -r 100 -o "$f.flac" --progress -f "$f" || continue
      # NOTE: this deletes the processed text file after it's converted to audio
      rm "$f"
    done

    NOTE: the above script removes each processed text file after converting it to audio. This keeps track of progress and makes it easy to restart the command if it freezes. If you do not want this, remove the rm "$f" line.

    If you are not on macOS, see some of the options for other operating systems.

    For best results, download the high-quality Siri voices. See the following section for more information.

    (macOS) Get High-Quality Text-To-Speech Voices

    On macOS, high-quality Siri voices are available for text-to-speech through the say CLI command, as well as through the "Spoken Content" accessibility feature.

    To download these, follow Apple's tutorial on adding a new voice. In the list of voices, look for a section titled "English (US) - Siri" (or another language, as long as the name ends with "Siri"). These are the highest quality voices available.

    After downloading, make sure to select it as the default voice.

    Now, when you use the say CLI command, the high-quality voice will be used.

    Finding spam lines

    npx text-hoarder find-spam finds commonly repeated lines in your saved articles, which are possible spam/advertisement lines that should be excluded (for example, lines like Advertisement, RECOMMENDED VIDEOS FOR YOU, etc.).
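The general idea of spotting repeated lines can be sketched with standard Unix tools. This is only an illustration of the technique, not what find-spam actually runs: the helper name repeated_lines and the threshold argument are hypothetical.

```shell
# Hypothetical helper: list lines that occur at least MIN times across files.
# Usage: repeated_lines 3 file1.md file2.md ...
repeated_lines() {
  min="$1"
  shift
  # Drop blank lines, count identical lines, keep the frequent ones,
  # and print them most-frequent first as "COUNT LINE".
  cat "$@" | grep -v '^[[:space:]]*$' | sort | uniq -c \
    | awk -v min="$min" '$1 >= min { c = $1; sub(/^ *[0-9]+ /, ""); print c " " $0 }' \
    | sort -rn
}
```

Lines that appear near the top of this output in many unrelated articles are good candidates for manual review.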

    Example usage

    Run the find-spam command, check whether it reported any common undesirable lines of text, and add those to the exclude-list.txt file in the repository Text Hoarder saves articles to.

    # Open the repository you created to store Text Hoarder's saved articles
    cd YOUR_TEXT_HOARDER_REPOSITORY
    # Report possible unwanted lines
    # To see all options, run "npx text-hoarder find-spam --help"
    npx text-hoarder find-spam
    # Add detected spam lines to the exclude-list.txt file

    Next time you run npx text-hoarder process or npx text-hoarder find-spam, the unwanted lines will be excluded automatically.
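To preview what excluding a set of lines does to a single file, a minimal sketch with grep could look like the following. The helper name filter_excluded is hypothetical, and the sketch assumes the exclude list holds one unwanted line per row matched exactly. The CLI's real matching rules may differ.

```shell
# Hypothetical helper: print FILE without any line listed in EXCLUDE_LIST.
# Assumption: the exclude list holds one unwanted line per row.
# -F fixed strings, -x whole-line match, -v invert, -f read patterns from file
filter_excluded() {
  exclude_list="$1"
  file="$2"
  grep -v -x -F -f "$exclude_list" "$file"
}
```

Because of -x, only lines that match an entry in full are dropped, so a sentence that merely contains the word Advertisement is kept.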

    By default, text-hoarder's CLI comes with a list of common spam lines built in. See the full list. If you do not wish to use this list, pass the --no-default-exclude option when running the commands.