  • License MIT

Flexible CLI regex replace in files.

Package Exports

  • rexreplace

This package does not declare an exports field, so the exports above have been automatically detected and optimized by JSPM instead. If any package subpath is missing, it is recommended to post an issue to the original package (rexreplace) to support the "exports" field. If that is not possible, create a JSPM override to customize the exports field for this package.

Readme


RexReplace

RexReplace is a versatile tool for doing search-and-replaces in files from the command line. It is inspired by how developers often need to do quick fixes or one-liners for build scripts.

Key features:

  • Replacement can be dynamically generated by javascript code.
  • Multiple files can be given as glob notation (so docs/*.md represents each markdown file in your docs/ dir).
  • No more brute forcing the right combination of find, cat, sed, tr, and awk to replace a text pattern in a bunch of files.

Install

To use RexReplace from your command line

> npm install -g rexreplace

To use RexReplace from an npm build script

> npm install rexreplace --save-dev

Examples

Let 'foobar' in myfile.md become 'xxxbar'

> rexreplace 'Foo' 'xxx' myfile.md

Hard for your fingers to write on your keyboard? We got you covered with the rr alias for rexreplace:

> rr Foo xxx myfile.md

Let all markdown files in the docs/ dir get headlines moved one level deeper

> rexreplace '^#' '##' docs/*.md            

Let the version number from package.json get into your distribution js files (use the string VERSION_NUMBER in your source files).

> rexreplace 'VERSION_NUMBER' 'require("package.json").version' -j dist/*.js 
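
As a rough illustration of what the -j example above amounts to, here is a plain JavaScript sketch where the replacement is computed by code instead of being a literal string. It does not call RexReplace itself. The pkg object and the source string are made-up stand-ins for package.json and a file in dist/.

```javascript
// Hypothetical stand-ins for package.json and one file in dist/.
const pkg = { version: "1.2.3" };
const source = 'console.log("v" + "VERSION_NUMBER");';

// The replacement is the result of running code, not a literal string.
// The g, m and i flags mirror the defaults documented for RexReplace.
const result = source.replace(/VERSION_NUMBER/gmi, () => pkg.version);

console.log(result);
```

Since the code does not reference the match or any captured group, this corresponds to the "run once per file" case described for the -j flag.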

Let 'foobar' in myfile.md become 'barfoo' (backreferences to a matching group)

> rexreplace '(foo)(.*)' '$2$1' myfile.md

RexReplace normally treats € as an alias for $, so the following will do the same as the previous example

> rexreplace '(foo)(.*)' '€2€1' myfile.md  
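
The two backreference examples can be mirrored in plain JavaScript, since patterns and replacements follow javascript regex notation. The euroToDollar helper below is hypothetical and only a guess at the idea behind the € alias, not how RexReplace actually implements it.

```javascript
// Swap the two captured groups, as in the '$2$1' example.
const swapped = "foobar".replace(/(foo)(.*)/gmi, "$2$1");

// Hypothetical helper: turn every € into $ before the string is used.
const euroToDollar = (s) => s.split("€").join("$");

// Same swap, written with the € alias.
const swappedEuro = "foobar".replace(/(foo)(.*)/gmi, euroToDollar("€2€1"));

console.log(swapped, swappedEuro);
```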

Usage

> rexreplace pattern replacement [fileGlob|option]+

Please update to version 3 as several flags have been altered (for the better) since version 2. Most noticeable: the -J flag has merged into the -j flag and the function of -O must now be obtained via -m.

Flag Effect
-v --version Print rexreplace version (can be given as only argument) [boolean]
-V --verbose More chatty output [boolean]
-I --void-ignore-case Void case insensitive search pattern. [boolean]
-G --void-global Void global search (work only with the first match). [boolean]
-M --void-multiline Void multiline search pattern. Makes ^ and $ match start/end of whole content rather than each line. [boolean]
-u --unicode Treat pattern as a sequence of unicode code points. [boolean]
-e --encoding Encoding of files/piped data. [default: "utf8"]
-q --quiet Only display errors (no other info) [boolean]
-Q --quiet-total Never display errors or info [boolean]
-H --halt Halt on first error [boolean] [default: false]
-d --debug Print debug info [boolean]
-€ --void-euro Void having € as alias for $ in pattern and replacement parameters [boolean]
-o --output Output the final result instead of saving to file. Will also output content even if no replacement has taken place. [boolean]
-A --void-async Handle files in a synchronous flow. Good to limit memory usage when handling large files. [boolean]
-B --void-backup Avoid temporarily backing up the file. Works async (independent of -A flag) and will speed things up, but at one point data lives only in memory and you will lose the content if the process is interrupted. [boolean]
-b --keep-backup Keep a backup file of the original content. [boolean]
-m --output-match Output each match on a new line. Will not replace any content but you still need to provide a dummy value (like _) as replacement parameter. If search pattern does not contain matching groups the full match will be outputted. If search pattern does contain matching groups only matching groups will be outputted (same line with no delimiter). [boolean]
-T --trim-pipe Trim piped data before processing. If piped data only consists of chars that can be trimmed (new line, space, tabs...) it will become an empty string. [boolean]
-R --replacement-pipe Replacement will be piped in. You still need to provide a dummy value (like _) as replacement parameter. [boolean]
-j --replacement-js Treat replacement as javascript source code. The value of the last expression will become the replacement string. Purposefully implemented the most insecure way possible to remove any incentive to consider running code from an untrusted person - that is, anyone who is not yourself. The full match will be available as a javascript variable named $0, while each captured group will be available as $1, $2, $3, and so on. At some point the $ char will give you a headache when used from the command line, so use €0, €1, €2, €3 ... instead. If the javascript source code references the full match or a captured group, the code will run once per match. Otherwise it will run once per file. [boolean]
    The code has access to the following variables:
      • _fs from node
      • _globs from npm
      • _pipe is the piped data into the command (null if no piped data)
      • _find is the final pattern searched for
      • _text is the full text being searched (= file contents or piped data)
    The following values are also available if working on a file (if data is being piped they are all set to an empty string):
      • _file is the full path of the active file being searched (including full filename)
      • _path is the full path without filename of the active file being searched
      • _filename is the full filename of the active file being searched
      • _name is the filename of the active file being searched with no extension
      • _ext is the extension of the filename including leading dot
-h --help Display help. [boolean]
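
To make the per-match behaviour of -j a bit more concrete, here is a minimal sketch of how $0, $1, $2, ... could be handed to a piece of replacement code. The replaceWithJs function is hypothetical and is not RexReplace's implementation. It only handles a single expression and ignores the € alias and the _file style variables.

```javascript
// Hypothetical helper, not part of RexReplace: evaluate `code` for each match.
function replaceWithJs(text, pattern, code) {
  const re = new RegExp(pattern, "gmi"); // the documented default flags
  return text.replace(re, (...args) => {
    // The callback receives: full match, group 1..n, offset, whole string.
    // Everything before the numeric offset is the match and its groups.
    const offsetIndex = args.findIndex((a) => typeof a === "number");
    const captures = args.slice(0, offsetIndex);
    const names = captures.map((_, i) => "$" + i); // $0, $1, $2, ...
    // Insecure on purpose: this runs arbitrary code, so only feed it your own.
    const fn = new Function(...names, "return (" + code + ");");
    return String(fn(...captures));
  });
}

const out = replaceWithJs("a1 b2", "([a-z])(\\d)", "$1.toUpperCase() + $2 * 2");
console.log(out);
```

Here the code references captured groups, so it is evaluated once for every match, in line with the description of the flag.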

Good to know

Features

  • Patterns are described as javascript notation regex
  • Pattern defaults to global multiline case-insensitive search
  • Supports regex lookaheads in pattern
  • Supports backreference to matching groups in the replacement
  • Data to be treated can be piped in
  • See the release note for a log of changes. Descriptions are given in latest patch version.
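
A small JavaScript sketch of what the default flags mean in practice, and what happens when one of them is dropped (roughly what -G, -M and -I do). The sample text is made up for the illustration.

```javascript
// Made-up sample content.
const text = "# Title\nsome FOO text\n# Next";

// Default behaviour as documented: global, multiline, case-insensitive.
const deeper = text.replace(/^#/gmi, "##");

// Without "m" (compare -M): ^ only matches at the start of the whole content.
const noMultiline = text.replace(/^#/gi, "##");

// Without "g" (compare -G): only the first match is replaced.
const noGlobal = text.replace(/^#/mi, "##");

// Without "i" (compare -I): the search is case-sensitive, so FOO is not hit.
const caseSensitive = text.replace(/foo/gm, "xxx");

// A lookahead: replace "foo" only when it is followed by " text".
const lookahead = text.replace(/foo(?= text)/gmi, "xxx");

console.log(deeper);
console.log(lookahead);
```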

Limitations

  • RexReplace reads each file fully into memory, so working on your 4Gb log files will probably not be ideal.
  • For versions of Node prior to 6, please use version 2.2.x. For versions of Node prior to 0.12, please use the legacy version of RexReplace called rreplace

Quirks

  • Per default € is treated as an alias for $ in the CLI input. The main reason is for you not to worry about how command line tools often have a special relationship with the $ char. You can escape your way out of this old love story, but it often pops up in unexpected ways. Use the -€ flag if you need to search or replace the actual euro char.

  • Options can only be set after the replacement parameter. "But I like to put my options as the first thing, so I know what I am doing" I agree, but we must sometimes sacrifice habits for consistency.

Priorities

  • Flexibility regarding text pattern matching
  • Easy to filter what files to be treated
  • Helpful interface
  • Tests (if you know how to do a test coverage report on javascript code run via the command line, please let me know)

Not a priority

  • Speed. Obviously, speed is important, but to what extent does a 21-millisecond command really satisfy the user compared to a 294-millisecond command? See test->speed for more info.
> time cat README.md | sed 's/x/y/g'  > /dev/null
cat myfile  0,00s user 0,00s system 45% cpu 0,011 total
sed 's/x/y/g' > /dev/null  0,00s user 0,00s system 43% cpu 0,010 total
> time rr x y README.md -o > /dev/null 
rr x y myfile -o > /dev/null  0,21s user 0,04s system 86% cpu 0,294 total

Test

Regression

All CLI end-to-end tests are defined in test/cli/run.sh and all unit tests are described in test/*.js. After git clone'ing the repo and npm install'ing you can invoke them with:

> npm test

Speed

tl;dr: Files over 5 Mb are faster with rr than with sed - but - it does not matter as any file under 25 Mb has less than 0.7 seconds in difference.

The speed test is initiated by npm run test-speed. The test takes files in different sizes and compares the processing time for RexReplace (rr) and the Unix tool sed. The test uses the sources of a website displaying the book 1984 by George Orwell. The task for the tests is to remove all HTML tags by search-and-replace so only the final text is left. The source is 888Kb, so all files up to 500Kb are generated directly from the source, while larger files are created by combining the first 500Kb several times. Each test runs 10 times to even out any temporary workload fluctuations. Results from latest test run can always be seen in the speed test log.

The graph visualises speed relative to the fastest overall run (sed on a 1Kb file). This chart also has an interactive version in log scale, so the details in the low end can be studied better. Interestingly, files of 1Kb and 5Kb take longer for rr than 10Kb files.

Now, what is relevant to notice is how sed only takes 3.3 seconds longer for the 100Mb file - even if the difference looks drastic on the graph.

Speed relative to fastest tool for each file size
---------------------------------------------------
Kb       sed    rr    Time it took longer (seconds)
1          1    60    0,3    <= sed is 60x faster  
5          1    44    0,3
10         1    35    0,2
100        1    24    0,2
500        1     8    0,2
1000       1     5    0,2
5000       1     1    0,0    <= same speed for 5Mb file
10000      1     1    0,2
25000      2     1    0,7
50000      2     1    1,7
100000     3     1    3,3    <= rr is 3x faster

So even though the speed evolves very differently, there is only little practical use of the focus on speed for most use cases. Replacing in 10000 small files? Use RexReplace and go get a cup of coffee - or spend half an hour getting sed to work as you want it to and enjoy the thrilling few seconds it takes to do its magic.

Please note that speeds might look very different when files get as large as the memory available.

Rumours

Inspiration

.oO(What should "sed" have looked like by now?)

Future ideas

  • Test-run with info outputted about what will happen (sets -t and does not change anything)
  • Let search and replace be within the names of the files (ask for overwriting. -Y = no questions)
  • Let search and replace be within the path of the files (ask for overwriting. -Y = no questions)
  • Let pattern and globs be piped
  • Let Pattern, replacement, and globs come from file
  • Let pattern and glob be javascript code returning string as result
  • Error != warning
  • Flag for simple string search (all other chars than [\n\r\t])
  • Flag for plain string search literal (no regex, no special chars, no escape chars)
  • Check if https://github.com/eugeneware/replacestream is good to rely on
  • Check if regex engine from spider monkey can be wrapped in something that does not need node
  • Implement in go so that all platforms can be supported with no need for node (might be based on)
  • Let https://github.com/dthree/vorpal deal with the interface? Or maybe https://www.npmjs.com/package/pretty-cli
  • Expand speed test to compare all related projects
  • Check if modular + compile is slower than mini monolith + require fs

There are many projects seeking to solve the same problem as RexReplace. Most lack the flexible CLI interface or are limited in how diverse the replacement can be. If our way does not suit you, we suggest you have a look at:


RexReplace mascot Benny on the RexReplace logo

Please note that RexReplace is an OPEN open source project. This means that individuals making significant and valuable contributions are given commit access to the project to contribute as they see fit. This project is more like an open wiki than a standard guarded open source project.


Icon inspired by Freepik from www.flaticon.com