Generating MongoDB Sample Data

Last year I wrote about generating better random strings.

Lorem Ipsum is the devil. It messes with those of us who are students of Latin; Cicero is hard enough without people throwing randomized Cicero in our faces. It's better to use something that isn't part of a linguistic insurgency. Use my Hamlet generator instead.

Anyway...

Because MongoDB shows up in so many modern architectures these days, we need the ability to generate not just strings but full objects for our test databases.

The following MongoDB script will do just that. Change the argument passed to run() on the last line to set the number of objects to throw at MongoDB.

You run this with the MongoDB shell:

./mongo < hamlet.js

Note: The third-party tool Robomongo, while awesome for day-to-day usage, will not work for this. It doesn't play nicely with initializeUnorderedBulkOp, which you need for bulk data import. Think of a bulk operation as MongoDB's rough counterpart to SQL Server's BULK INSERT command.
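The idea behind a bulk operation is queue first, send later: insert() only collects documents, and execute() hands them over in one go. Here is a minimal sketch of that shape, plus a way to split a big run into smaller chunks. To keep it runnable outside the shell, it uses fakeCollection, a hypothetical in-memory stand-in I made up for illustration; it imitates only the calls used here and is not the real shell object. The names runInBatches and batchSize are likewise assumptions, not part of MongoDB.

```javascript
// Hypothetical in-memory stand-in for a collection so this runs without a server.
// It imitates only the calls used below; it is NOT the real shell object.
function fakeCollection() {
    var stored = [];
    return {
        stored: stored,
        initializeUnorderedBulkOp: function () {
            var queued = [];
            return {
                // insert() only queues the document locally.
                insert: function (doc) { queued.push(doc); },
                // execute() hands over everything queued so far in one go.
                execute: function () {
                    var n = queued.length;
                    for (var i = 0; i < n; i++) { stored.push(queued[i]); }
                    queued = [];
                    return { nInserted: n };
                }
            };
        }
    };
}

// Insert `count` documents in chunks of at most `batchSize`.
// Both parameter names are illustrative.
function runInBatches(collection, count, batchSize, makeDoc) {
    var inserted = 0;
    while (inserted < count) {
        var size = Math.min(batchSize, count - inserted);
        var bulk = collection.initializeUnorderedBulkOp();
        for (var n = 0; n < size; n++) {
            bulk.insert(makeDoc());
        }
        inserted += bulk.execute().nInserted;
    }
    return inserted;
}

var demo = fakeCollection();
var total = runInBatches(demo, 2500, 1000, function () {
    return { title: "words words words" };
});
```

In the shell, you would presumably pass db.book and your own document generator in place of the stand-ins. Chunking like this is optional; it mainly keeps any single bulk operation from holding the whole data set in memory at once.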

You can use the script below, which includes an abridged word list, or this one with the full Hamlet lexicon.

var raw = "o my offence is rank it smells to heaven hath the primal eldest curse upont a brothers murder pray can i not though inclination be as sharp will stronger guilt defeats strong intent and like man double business bound stand in pause where shall first begin both neglect what if this cursed hand were thicker than itself with blood there rain enough sweet heavens wash white snow whereto serves mercy but confront visage of whats prayer two-fold force forestalled ere we come fall or pardond being down then ill look up fault past form serve turn forgive me foul that cannot since am still possessd those effects for which did crown mine own ambition queen may one retain corrupted currents world offences gilded shove by justice oft tis seen wicked prize buys out law so above no shuffling action lies his true nature ourselves compelld even teeth forehead our faults give evidence rests try repentance can yet when repent wretched state bosom black death limed soul struggling free art more engaged help angels make assay bow stubborn knees heart strings steel soft sinews newborn babe all well";
var data = raw.split(" ");

// Returns `count` random words (at least one) from the word list, separated by spaces.
function hamlet(count) {
    return data[Math.floor(Math.random() * data.length)] + (count <= 1 ? "" : " " + hamlet(count - 1));
}

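// randrange(n) returns an integer in 1..n; randrange(min, max) returns one in min..max (both inclusive).
function randrange(min, max) {
    if (!max) { max = min; min = 1; }
    return Math.floor(Math.random() * (max - min + 1)) + min;
}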

function createArray(count, generator) {
    var list = [];
    for(var n=0; n<count; n++) {
        list.push(generator());
    }
    return list;
}

// Left-pads a one-digit number with a zero, e.g. 7 becomes "07".
function pad(number) {
    return ("0" + number).slice(-2);
}

function createItem() {
    var item = {
        "_id": "9780" + randrange(100000000, 999999999),
        "title": hamlet(randrange(4, 8)),
        "authors": createArray(randrange(4), function() { return hamlet(2) }),
        "metadata": {
            "pages": NumberInt(randrange(1, 400)),
            "genre": createArray(randrange(2), function() { return hamlet(1) }),
            "summary": hamlet(randrange(100, 400)),
        },
        "published": new Date(randrange(1960, 2016) + "-" + pad(randrange(12)) + "-" + pad(randrange(28)))
    };

    if (randrange(4) == 1) {
        item.editor = hamlet(1);
    }

    return item;
}

function run(count) {
    var bulk = db.book.initializeUnorderedBulkOp();
    for (var n = 0; n < count; n++) {
        bulk.insert(createItem());
    }
    bulk.execute();
}

run(100000);
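One thing to watch: createItem picks its _id at random from 900 million possible values, so a run of 100,000 documents should expect five or six duplicate IDs. As far as I know, a duplicate _id makes that one insert fail with a duplicate key error; with an unordered bulk operation the remaining inserts still go through and the errors are reported when execute() finishes. If you would rather avoid them, here is a minimal sketch that does the arithmetic and hands out IDs that never repeat within a single run. The names expectedCollisions and uniqueIdGenerator are my own for illustration, and the sketch does nothing about IDs already sitting in the collection.

```javascript
// Same helper as in the script above, repeated so this block runs on its own.
function randrange(min, max) {
    if (!max) { max = min; min = 1; }
    return Math.floor(Math.random() * (max - min + 1)) + min;
}

// Expected number of colliding pairs when drawing n random values
// from `space` equally likely possibilities: n * (n - 1) / (2 * space).
function expectedCollisions(n, space) {
    return n * (n - 1) / (2 * space);
}

// Returns a function that produces ISBN-like ids, never repeating
// one within the lifetime of that function.
function uniqueIdGenerator() {
    var seen = {};
    return function () {
        var id;
        do {
            id = "9780" + randrange(100000000, 999999999);
        } while (seen[id]);
        seen[id] = true;
        return id;
    };
}

var estimate = expectedCollisions(100000, 900000000);
var nextId = uniqueIdGenerator();
var sampleIds = [];
for (var i = 0; i < 1000; i++) {
    sampleIds.push(nextId());
}
```

To use it in the script, you could create nextId once before the loop and write "_id": nextId() in createItem instead of building the string inline.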