JSPM

  • License MIT

CLI tool to collect field values from a MongoDB collection into a target collection, with batching

mongoCollector

CLI tool for extracting values of a specific field from a MongoDB collection and saving them into a target collection.
Supports batching, large dataset processing, and flexible write configurations.

Features

  1. Extract values of any field from MongoDB documents.
  2. Data filtering using match.
  3. Batching (batchSize) to avoid MongoDB’s 16MB per-document limit.
  4. ObjectId transformation: ObjectId('68a8c8207090be6dd0e23a90') → '68a8c8207090be6dd0e23a90'.
  5. Large collections supported via allowDiskUse.
  6. Flexible array handling:
     • Overwrite or append to arrays.
     • Allow or eliminate duplicates.
  7. Informative logs.

Installation & Usage

  1. Install the package:
npm i mongo-collector
  2. Add a script in your package.json:
"scripts": {
  "mongoCollector": "mongo-collector"
}
  3. In the root of the project, create a file named mongo-collector.config.js.

Example of file contents:

export default {
  source: {
    uri: "mongodb://127.0.0.1:27017",
    db: "crystalTest",
    collection: "users",
    field: "_id",
    match: {}
  },

  target: {
    uri: "mongodb://127.0.0.1:27017",
    db: "pool",
    collection: "usersIdFromCrystalTest",
    field: "users",
    documentId: false,
    rewriteDocuments: true,
    rewriteArray: true,
    duplicatesInArray: false,
    unwrapObjectId: true
  },

  aggregation: {
    allowDiskUse: true,
    batchSize: 200
  },
};

⚠️ All parameters are required - if any is missing, the tool will throw an error.

  4. Run from the project root:
npm run mongoCollector
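
Because every parameter is required, a quick pre-flight check can catch a missing key before the tool is run. The sketch below is not the tool's own validation: the helper name findMissingKeys is hypothetical, and the key list is simply copied from the example config above.

```javascript
// Required keys per section, copied from the example config above.
const REQUIRED_KEYS = {
  source: ["uri", "db", "collection", "field", "match"],
  target: [
    "uri", "db", "collection", "field", "documentId",
    "rewriteDocuments", "rewriteArray", "duplicatesInArray", "unwrapObjectId",
  ],
  aggregation: ["allowDiskUse", "batchSize"],
};

// Hypothetical helper: returns the dotted paths of all missing keys.
function findMissingKeys(config) {
  const missing = [];
  for (const [section, keys] of Object.entries(REQUIRED_KEYS)) {
    const block = config ? config[section] : undefined;
    if (block === undefined || block === null) {
      missing.push(section);
      continue;
    }
    for (const key of keys) {
      // `false` and `{}` are valid values, so only `undefined` counts as missing.
      if (block[key] === undefined) missing.push(`${section}.${key}`);
    }
  }
  return missing;
}
```

An empty result means the config has every key shown in the example; anything else lists what to add.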

Example

Source collection users (from source):

{ "_id": ObjectId("68a8c8207090be6dd0e23a90"), "name": "Alice" }
{ "_id": ObjectId("68a8c8207090be6dd0e23a91"), "name": "Sarah" }
{ "_id": ObjectId("68a8c8207090be6dd0e23a92"), "name": "John" }

After running mongo-collector, in the target collection usersIdFromCrystalTest:

{ "users": [ "68a8c8207090be6dd0e23a90", "68a8c8207090be6dd0e23a91", "68a8c8207090be6dd0e23a92" ] }

Config parameters

match

match can hold any filter you need, for example:

match: {} - take all documents.

match: { createdAt: { $gte: new Date("2025-08-20T01:26:11.327+00:00") } } - filter documents by date.
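
To make the role of match more concrete, here is a minimal sketch of how a source block could be turned into an aggregation pipeline. This is an assumption about the general approach, not the tool's actual pipeline; buildSourcePipeline and the output key value are hypothetical names.

```javascript
// Hypothetical helper: builds a pipeline that filters documents with `match`
// and keeps only the configured field, exposed under the key `value`.
function buildSourcePipeline(source) {
  return [
    { $match: source.match },
    { $project: { _id: 0, value: `$${source.field}` } },
  ];
}

// Usage with the source block from the example config.
const pipeline = buildSourcePipeline({ field: "_id", match: {} });
console.log(JSON.stringify(pipeline));
```

With match: {} the $match stage lets every document through, which corresponds to "take all documents".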

documentId

documentId: false - create a new document.

documentId: '68a8c8207090be6dd0e23a90' - append data to an existing document, or create one with this _id if missing.

rewriteDocuments

rewriteDocuments: true - clear the entire target collection before writing.

rewriteArray

true - overwrite the existing array
false - append to the existing array

duplicatesInArray

false - eliminate duplicates (uses $addToSet)
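
The way rewriteArray and duplicatesInArray interact can be sketched as a choice of update operator. Only the use of $addToSet is stated above; the use of $set and $push, and the helper name buildArrayUpdate, are assumptions made for illustration.

```javascript
// Hypothetical helper: picks an update document for one batch of values.
function buildArrayUpdate(field, values, { rewriteArray, duplicatesInArray }) {
  if (rewriteArray) {
    // Overwrite: replace the whole array, dropping duplicates first if requested.
    // Note: Set only deduplicates primitives such as strings, not ObjectId objects.
    const next = duplicatesInArray ? values : [...new Set(values)];
    return { $set: { [field]: next } };
  }
  // Append: $addToSet skips values already present, $push keeps everything.
  const operator = duplicatesInArray ? "$push" : "$addToSet";
  return { [operator]: { [field]: { $each: values } } };
}

// Usage: append a batch without duplicates.
const update = buildArrayUpdate("users", ["a", "b"], {
  rewriteArray: false,
  duplicatesInArray: false,
});
console.log(JSON.stringify(update));
```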

unwrapObjectId

true - ObjectId('68a8c8207090be6dd0e23a90') → '68a8c8207090be6dd0e23a90' (final result in target).
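
As a rough illustration of the unwrapping step, the sketch below converts ObjectId-like values to their hex string and leaves everything else untouched. It relies on the toHexString() method that the driver's ObjectId provides; the helper name unwrapValue is hypothetical and the tool may implement this differently.

```javascript
// Hypothetical helper: converts ObjectId-like values to hex strings.
function unwrapValue(value) {
  const isObjectIdLike =
    value !== null &&
    typeof value === "object" &&
    typeof value.toHexString === "function";
  // Non-ObjectId values (strings, numbers, plain objects) pass through unchanged.
  return isObjectIdLike ? value.toHexString() : value;
}

// Stand-in for a real ObjectId, so the sketch runs without the driver installed.
const fakeObjectId = { toHexString: () => "68a8c8207090be6dd0e23a90" };
console.log(unwrapValue(fakeObjectId));
```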

allowDiskUse

true - allows MongoDB to write temporary data to disk when processing aggregation stages.

  • Use this option for large datasets to avoid memory limitations.

false - restricts processing to memory only.

  • This can improve performance, but may result in errors if the dataset is too large to fit into memory.
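
In the official mongodb Node.js driver, this flag is passed as an option to aggregate(). The sketch below shows that call in isolation; the helper name aggregateWithDiskUse is hypothetical, and it is an assumption that the tool forwards the setting this way.

```javascript
// Hypothetical helper: runs a pipeline with the configured allowDiskUse flag.
// `collection` is expected to be a Collection from the official `mongodb` driver.
function aggregateWithDiskUse(collection, pipeline, aggregation) {
  return collection.aggregate(pipeline, {
    allowDiskUse: aggregation.allowDiskUse,
  });
}
```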

batchSize

batchSize: 10 - controls the length of the array inside each target document.

⚠️ Make sure each target document, including its array, stays under MongoDB's 16MB document limit; otherwise MongoDB will throw an error.
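
The effect of batchSize can be pictured as splitting the collected values into fixed-size chunks, one chunk per target document. This is a minimal sketch of that idea, not the tool's code; splitIntoBatches is a hypothetical name.

```javascript
// Hypothetical helper: splits values into arrays of at most batchSize items.
function splitIntoBatches(values, batchSize) {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const batches = [];
  for (let i = 0; i < values.length; i += batchSize) {
    batches.push(values.slice(i, i + batchSize));
  }
  return batches;
}

// Usage: five values with batchSize 2 give three arrays.
console.log(JSON.stringify(splitIntoBatches([1, 2, 3, 4, 5], 2)));
// prints [[1,2],[3,4],[5]]
```

A smaller batchSize produces more target documents with shorter arrays, which keeps each document further from the size limit.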
