Developing Azure Modular ARM Templates

Cloud architectures are nearly ubiquitous. Managers are letting go of their FUD and embracing a secure model that can extend their reach globally. IT guys, who don't lose any sleep over the fact that their company's finance data is on the same physical wire as their public data, because the data is separated by VLANs, are realizing that VNets on Azure function on the same principle. Developers are embracing a cross-platform utopia where Python and .NET can live together as citizens in a harmonious cloud solution. OK... maybe I'm dreaming about that last one, but the cloud is widely used.

With Azure 2.0 (aka Azure ARM), we finally have a declarative model for managing our resources (database, storage account, network card, VM, load balancer, etc.): we throw nouns at Azure and let it verb them into existence. The JSON templates give us a beautiful 100% GUI-free environment to restore the sanity stolen from us by years of dreadfully clicking buttons. Yet, there's gotta be a better way of dealing with our ARM templates than scrolling up and down all the time. Well, there is...

Below is a link to a template that defines all kinds of awesome:

Baseline ARM Template

Take this magical spell and throw it at Azure and you'll get a full infrastructure of many Elasticsearch nodes, all talking to each other, each with their own endpoint, and a traffic manager to unify the endpoints to make sure everyone in the US gets a fast search connection. There's also the multiple VNets, mesh VPN, and the administrative VM and all that stuff.

Yet, this isn't even remotely how I work with my templates. This is:

ARM Components

Synopsis

Before moving on, note that there are a lot of related concepts going on here. It's important that I give you a quick synopsis of what follows:

  • Modularly splitting ARM templates into manageable, mergeable, reusable JSON files
  • Deploying ARM templates in phases
  • Proposal for symlinking for reusable architectures
  • Recording production deployments
  • Managing deployment arguments
  • Automating support files

Let's dive in...

Modular Resources

Notice that the above screenshot does not show a monolith. Instead, I manage individual resources, not the entire template at once. This lets me find, add, remove, enable, disable, merge, etc. things quickly.

Note that each folder represents "resource provider/resource type/resource.json". The root is where you would put the optional sections variables.json, parameters.json, and outputs.json. In this example, I have a PS1 file there just because it supports this particular template.

My deployment PowerShell script combines the appropriate JSON files together to create the final azuredeploy-generated.json file.
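
A minimal sketch of that merge idea, written in Python purely for illustration. It is not the armdeploy.ps1 logic. It assumes that each optional root file (parameters.json, variables.json, outputs.json) holds the body of its section, and that every JSON file in a subfolder holds exactly one resource object; the function name merge_template is hypothetical.

```python
import json
from pathlib import Path


def merge_template(template_dir):
    """Combine modular JSON files under template_dir into one ARM template dict."""
    root = Path(template_dir)
    template = {
        "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
        "contentVersion": "1.0.0.0",
    }

    # Optional root-level sections, each kept in its own file (assumed layout).
    for section in ("parameters", "variables", "outputs"):
        section_file = root / (section + ".json")
        if section_file.is_file():
            template[section] = json.loads(section_file.read_text())

    # Every JSON file below the root is treated as a single resource.
    resources = []
    for path in sorted(root.rglob("*.json")):
        if path.parent == root:
            continue  # root files are sections, not resources
        resources.append(json.loads(path.read_text()))
    template["resources"] = resources
    return template
```

Writing the result with json.dumps(merge_template("./template"), indent=4) would give the equivalent of an azuredeploy-generated.json under these assumptions.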

I originally started with grunt to handle the merging. grunt-contrib-concat + grunt-json-format worked for a while, but my Gruntfile.js became rather long, and the entire process was wildly unreliable anyway. Besides, it was just one extra moving part that I didn't need. I was already deploying with PowerShell. So, might as well just do that...

You can get my PowerShell Azure modular JSON magical script at the end of this article.

There's a lot to discuss here, but let's review some core benefits...

Core Benefits

Aside from the obvious benefit of modularity to help you sleep at night, there are at least two other core benefits:

First is the ability to add and remove resources via files, but a much greater benefit is the ability to enable or disable resources. In my merge script, I exclude any file that starts with an underscore. This acts as a simple way to comment out a resource.

Second is the ability to version and merge individual resources in Git (I'm assuming you're living in 2016 or beyond, and therefore are using Git, not that one old subversive version control thing or Terrible Foundation Server). The ability to diff and merge resources, not entire JSON monoliths, is great.

Phased Deployment

When something is refactored, often fringe benefits naturally appear. In this case, modular JSON resources allow for programmatically enabling and disabling resources. More specifically, I'd like to mention a concept I integrate into my deployment model: phased deployment.

When deploying a series of VMs and VNets, it's important to make sure your dependencies are set up correctly. That's fairly simple: just make sure dependsOn is set up right in each resource. Azure will take that information into account to see what to deploy in parallel.

That's epic, but I don't really want to wait around forever if part of my dependency tree is a network gateway. Those things take forever to deploy. Not only that, but I've some phases that are simply done in PowerShell.

Go back and look at the screenshot we started with. Notice that some of the resources start with 1., 2., etc. Starting a JSON resource file name with "#." states the phase at which that resource will deploy. In my deployment script I'll state what phase I'm currently deploying. I might specify that I only want to deploy phase 1. This will deploy everything up to and including phase 1. If I like what I see, I'll deploy phase 2.

In my example, phase 2 is my network gateway phase. After I've aged a bit, I'll come back to run some PowerShell to create a VPN mesh (not something I'd try to declare in JSON). Then, I'll deploy phase 3 to set up my VMs.
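
The two file-naming rules (a leading underscore disables a resource, a leading "#." assigns a phase) can be sketched as a single filter. This is a Python illustration under stated assumptions, not the deploy script itself: it assumes that files without a phase prefix always deploy and that deploying phase N includes every phase up to N. The name is_enabled and the sample file names are hypothetical.

```python
import re
from pathlib import Path

# Matches a leading phase number such as "2." in "2.gateway.json".
PHASE_PREFIX = re.compile(r"^(\d+)\.")


def is_enabled(path, current_phase):
    """Decide whether a resource file takes part in this deployment."""
    name = Path(path).name
    if name.startswith("_"):
        return False  # underscore prefix: resource is "commented out"
    match = PHASE_PREFIX.match(name)
    if match is None:
        return True  # assumption: unphased resources always deploy
    # assumption: deploying phase N includes every phase up to N
    return int(match.group(1)) <= current_phase
```

For example, with current_phase set to 1, a file named 1.vnet.json would be included and 2.gateway.json would be held back until phase 2.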

Crazy SymLink Idea

This section acts more as an extended sidebar than part of the main idea.

Most benefits of this modular approach are obvious. What might not be obvious is the following:

You can symlink resources for reuse. For any local Hyper-V Windows VM I spin up, I usually have a Linux VM to go along with it. For my day-to-day stuff, I have a Linux VM that I use for general development, which I never turn off. I keep all my templates/Git repos on it.

On any *nix-based system, you can create symbolic links to expose the same file under multiple file names (similar to how myriad Git "filenames" can point to the same blob based on a common SHA1 hash).

Don't drift off simply because you think it's some crazy fringe idea.

For this discussion, this can mean the following:

./storage/storageAccounts/storage-copyIndex.json
./network/publicIPAddresses/pip-copyIndex.json
./network/networkInterfaces/nic-copyIndex.json
./network/networkSecurityGroups/nsg-copyIndex.json
./network/virtualNetworks/vnet-copyIndex.json

These resources could be some epic, pristine awesomeness that you want to reuse somewhere. Now, use the following Bash script:

#!/bin/bash
# Links the shared resource files under ./_common into ./<type>/template/resources.

if [ -z "$1" ]; then
    echo "usage: link_common.sh type"
    exit 1
fi

TYPE=$1
SRC="$(pwd)/_common"
DEST="$(pwd)/$TYPE/template/resources"

# Quoting keeps paths with spaces intact; the braces still expand outside the quotes.
mkdir -p "$DEST/storage/storageAccounts"
mkdir -p "$DEST"/network/{publicIPAddresses,networkInterfaces,networkSecurityGroups,virtualNetworks}

# -s creates a symbolic link; -f replaces an existing link of the same name.
ln -sf "$SRC/storage/storageAccounts/storage-copyIndex.json" "$DEST/storage/storageAccounts/storage-copyIndex.json"
ln -sf "$SRC/network/publicIPAddresses/pip-copyIndex.json" "$DEST/network/publicIPAddresses/pip-copyIndex.json"
ln -sf "$SRC/network/networkInterfaces/nic-copyIndex.json" "$DEST/network/networkInterfaces/nic-copyIndex.json"
ln -sf "$SRC/network/networkSecurityGroups/nsg-copyIndex.json" "$DEST/network/networkSecurityGroups/nsg-copyIndex.json"
ln -sf "$SRC/network/virtualNetworks/vnet-copyIndex.json" "$DEST/network/virtualNetworks/vnet-copyIndex.json"

Run this:

chmod +x ./link_common.sh
./link_common.sh myimpressivearchitecture

This won't create duplicate files, but it will create files that point to the same content. Change one => Change all.

Doing this, you might want to make the source-of-truth files read-only. There are a few ways to do this, but the simplest is to give root ownership of the common stuff, then give yourself file-read and directory-list rights.

sudo chown -R root:$USER _common
sudo chmod -R 755 _common 

LINUX NOTE: directory-list rights are set with the directory execute bit

If you need to edit something, you'll have to do it as root (e.g. sudo). This will protect you from doing stupid stuff.

Linux symlinks look like normal files and folders to Windows. There's nothing to worry about there.

This symlinking concept will help you link to already established architectures. You can add/remove symlinks as you need to add/remove resources. This is an established practice in the Linux world. It's very common to create a folder for ./sites-available and ./sites-enabled. You never delete from ./sites-available; you simply create or remove links in ./sites-enabled to enable or disable.

Hmm, OK, yes, that is a crazy fringe idea. I don't even do it. Just something you can try on Linux, or on Windows with some sysinternals tools.

Deployment

When you're watching an introductory video or following a hello world example of ARM templates, throwing variables at a template is great, but I'd never do this in production.

In production, you're going to archive each script that is thrown at the server. You might even have a Git repo for each and every server. You're going to stamp everything with files and archive everything you did together. Because this is how you work anyway, it's best to keep that as an axiom and let everything else mold to it.

To jump to the punchline, after I deploy a template twice (perhaps once with gateways disabled, and once with them enabled, to verify in phases), here's what my ./deploy folder looks like:

./09232016-072446.1/arguments-generated.json
./09232016-072446.1/azuredeploy-generated.json
./09232016-072446.1/success.txt
./09242016-051529.2/arguments-generated.json
./09242016-051529.2/azuredeploy-generated.json
./09242016-051529.2/success.txt

Each deployment archives the generated files with the timestamp. Not a whole lot to talk about there.

Let's back up a little bit and talk about dealing with arguments and that arguments-generated.json listed above.

If I'm doing phased deployment, the phase will be suffixed to the deploy folder name (e.g. 09242016-051529.1).
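
The folder naming can be sketched in a few lines of Python. This is an illustration only: the MMDDYYYY-HHMMSS layout is inferred from the folder names listed above, and deploy_folder_name is a hypothetical helper, not part of the deploy script.

```python
from datetime import datetime


def deploy_folder_name(moment, phase=None):
    """Build an archive folder name: timestamp, plus '.<phase>' when phased."""
    # Layout inferred from names like 09242016-051529.2 (an assumption).
    stamp = moment.strftime("%m%d%Y-%H%M%S")
    if phase is None:
        return stamp
    return "{}.{}".format(stamp, phase)
```

Calling deploy_folder_name(datetime(2016, 9, 24, 5, 15, 29), 2) yields 09242016-051529.2, which matches the second folder in the listing.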

Deployment Arguments

Instead of setting up parameters in the traditional ARM manner, I opt to generate an arguments file. So, my model is to not only generate the "azuredeploy.json", but also the "azuredeploy-parameters.json". Once these are generated, they can be stamped with a timestamp, then archived with the status.

Sure, zip them and throw them on a blob store if you want. Meh. I find it a bit overkill and old school. If anything, I'll throw my templates at my Elasticsearch cluster so I can view the archives that way.

While my azuredeploy-generated.json is generated from myriad JSON files, my arguments-generated.json is generated from my ./template/arguments.json file.

Here's my ./template/arguments.json file:

{
    "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentParameters.json#",
    "contentVersion": "1.0.0.0",
    "parameters": {
        "admin-username": {
            "value": "{{admin-username}}"
        },
        "script-base": {
            "value": "{{blobpath}}/"
        },
        "ssh-public-key": {
            "value": "{{ssh-public-key}}"
        }
    }
}

My deployment script will add in the variables to generate the final arguments file.

$arguments = @{
    "blobpath" = $blobPath
    "admin-username" = "dbetz"
    "ssh-public-key" = (cat $sshPublicKeyPath -raw)
}
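
How the {{name}} placeholders get filled in is not shown here, so the following is only a sketch of one way to do it, in Python. It assumes each placeholder sits inside a JSON string, which is why the value is JSON-escaped before being dropped in (an SSH public key file usually ends with a newline). The function name render_arguments is hypothetical.

```python
import json
import re

# Matches {{admin-username}}, {{blobpath}}, {{ssh-public-key}}, and so on.
PLACEHOLDER = re.compile(r"\{\{([\w-]+)\}\}")


def render_arguments(template_text, arguments):
    """Replace each {{name}} in template_text with its JSON-escaped value."""

    def substitute(match):
        value = str(arguments[match.group(1)])  # KeyError if a name is missing
        # json.dumps adds surrounding quotes; the template already has them.
        return json.dumps(value)[1:-1]

    return PLACEHOLDER.sub(substitute, template_text)
```

Failing loudly on a missing name is a deliberate choice in this sketch: a half-filled arguments file is worse than no file at all.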

Aside from the benefits of automating the public key creation for Linux, there's that blobpath argument. That's important. In fact, dynamic arguments like this might not even make sense until you see my support file model.

Support Files

If you are going to upload assets/scripts/whatever to your server during deployment, you need to get them to a place they are accessible. One way to do this is to commit to Git every 12 seconds. Another way is to simply use blob storage.

Here's the idea:

You have the following folder structure:

./template
./support

You saw ./template in VS Code above. In this example, ./support looks like this:

support/install.sh
support/create_data_generation_setup.sh
support/generate/hamlet.py

These are files that I need to get on the server. Use Git if you want, but Azure can handle this directly:

$key = (Get-AzureRmStorageAccountKey -ResourceGroupName $deploymentrg -Name $deploymentaccount)[0].value
$ctx = New-AzureStorageContext -StorageAccountName $deploymentaccount -StorageAccountKey $key
$blobPath = Join-Path $templatename $ts
$supportPath = (Join-Path $projectFolder "support")
(ls -File -Recurse $supportPath).foreach({
    $relativePath = $_.fullname.substring($supportPath.length + 1)
    $blob = Join-Path $blobPath $relativePath
    Write-Host "Uploading $blob"
    Set-AzureStorageBlobContent -File $_.fullname -Container 'support' -Blob $blob -BlobType Block -Context $ctx -Force > $null
})

This PowerShell code walks my ./support folder and replicates its structure to blob storage.
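
The path mapping itself needs no Azure calls, so it can be sketched on its own. The Python below only computes the blob name for each local file as template name, timestamp, then the path relative to the support folder; it uploads nothing. The function name blob_names is hypothetical.

```python
from pathlib import Path


def blob_names(support_dir, template_name, timestamp):
    """Map each file under support_dir to the blob name it would be stored as."""
    support = Path(support_dir)
    names = {}
    for path in sorted(support.rglob("*")):
        if path.is_file():
            # Blob names use forward slashes regardless of the local OS.
            relative = path.relative_to(support).as_posix()
            names[str(path)] = "/".join([template_name, timestamp, relative])
    return names
```

Because the timestamp is part of every blob name, each deployment gets its own copy of the support files and older deployments stay reproducible.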

You ask: "what blob storage?"

Response: I keep a resource group named deploy01 around with a storage account named files (with 8 random characters appended to make it unique). I reuse this account for all my Azure deployments. You might duplicate this per client. Upon deployment, blobs are loaded with the fully qualified file path including the template that I'm using and my deployment timestamp.

The result is that by the time the ARM template is thrown at Azure, the following URL was generated and the files are in place to be used:

https://files0908bf7n.blob.core.windows.net/support/elasticsearch-secure-nodes/09232016-052804 

For each deployment, I'm going to have a different set of files in blob storage.

In this case, the following blobs were uploaded:

elasticsearch-secure-nodes/09232016-072446/generate/hamlet.py                                
elasticsearch-secure-nodes/09232016-072446/install.sh                                        
elasticsearch-secure-nodes/09232016-072446/create_data_generation_setup.sh 

SECURITY NOTE: For anything sensitive, disable public access, create a SAS token policy, and use that policy to generate a SAS token URL. Give this a few hours to live so your entire template can successfully complete. Remember, gateways take a while to create. Once again: this is why I do phased deployments.

When the arguments-generated.json is used, the script-base parameter is populated like this:

"script-base": {
    "value": "https://files0c0a8f6c.blob.core.windows.net/support/elasticsearch-secure-nodes/09232016-072446"
},

You can then use this parameter to do things like this in your VM extensions:

"fileUris": [
    "[concat(parameters('script-base'), '/install.sh')]"
],
"commandToExecute": "[concat('sh install.sh ', length(variables('locations')), ' ''', parameters('script-base'), ''' ', variables('names')[copyindex()])]"

Notice that https://files0908bf7n.blob.core.windows.net/support/elasticsearch-secure-nodes/09232016-072446/install.sh is the script to be called, but https://files0908bf7n.blob.core.windows.net/support/elasticsearch-secure-nodes/09232016-072446 is also sent in as a parameter. This will tell the script itself where to pull the other files. Actually, in this case, that endpoint is passed a few levels deep.
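
To make the commandToExecute expression easier to read, here is a small Python sketch that mirrors what the concat call produces. Inside an ARM string literal, two single quotes stand for one literal single quote, so the script base ends up quoted on the command line. The locations and names values in the usage note are made up for illustration; the real variables are not shown in this article.

```python
def command_to_execute(locations, script_base, names, copy_index):
    """Mirror the ARM expression:
    concat('sh install.sh ', length(locations), ' ''', script-base, ''' ', names[copyindex()])
    """
    return "sh install.sh {} '{}' {}".format(
        len(locations),      # length(variables('locations'))
        script_base,         # parameters('script-base'), wrapped in single quotes
        names[copy_index],   # variables('names')[copyindex()]
    )
```

With two hypothetical locations, names of alpha and beta, and a copy index of 1, this gives sh install.sh 2 '<script-base>' beta, so install.sh receives the node count, the blob base URL, and its own node name.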

In my script, when I'm doing phased deployment, I can set uploadSupportFilesAtPhase to whatever phase I want to upload support files. I generally don't do this at phase 1, because, for me, that phase is everything up to the VM or gateway. The support files are for the VMs, so there's no need to play around with them while doing idempotent updates to phase 1.

Visual Studio Code

I've a lot of different editors that I use. Yeah, sure, there's Visual Studio, whatever. For me, it's .NET only. It's far too bulky for most anything else. For ARM templates, it's absolutely terrible. I feel like I'm playing with VB6 with its GUI-driven resource seeking.

While I use EditPlus or Notepad2 (scintilla) for most everything, this specific scenario calls for Visual Studio Code (Atom). It allows you to open a folder directly without the need for pointless SLN files and lets you view the entire hierarchy at once. It also lets you quickly CTRL-C/CTRL-V a JSON file to create a new one (File->New can die). F2 also works for rename. Not much else you need in life.

Splitting a Monolith

Going from an existing monolithic template is simple. Just write a quick tool to open the JSON and dump it into various files. Below is a subpar script I wrote in PowerShell to make this happen:

$templateBase = '\\10.1.40.1\dbetz\azure\armtemplates'
$template = 'python-uwsgi-nginx'
$templateFile = Join-Path $templateBase "$template\azuredeploy.json"
$json = cat $templateFile -raw
$partFolder = 'E:\Drive\Code\Azure\Templates\_parts'
$counters = @{ "type"=0 }

((ConvertFrom-Json $json).resources).foreach({
    $index = $_.type.indexof('/')
    $resourceProvider = $_.type.substring(0, $index).split('.')[1].tolower()
    $resourceType = $_.type.substring($index+ 1, $_.type.length - $index - 1)
    
    $folder = Join-Path $partFolder $resourceProvider
    if(!(Test-Path $folder)) {
        mkdir $folder > $null
    }

    $netResourceType = $resourceType
    while($resourceType.contains('/')) {    
        $index = $resourceType.indexof('/')
        $parentResourceType = $resourceType.substring(0, $index)
        $resourceType = $resourceType.substring($index+ 1, $resourceType.length - $index - 1)
        $netResourceType = $resourceType
        $folder = Join-Path $folder $parentResourceType
        if(!(Test-Path $folder)) {
            mkdir $folder > $null
        }
    }
    $folder = Join-Path $folder $netResourceType
    if(!(Test-Path $folder)) {
        mkdir $folder > $null
    }
    
    $counters[$_.type] = $counters[$_.type] + 1
    $file = $folder + "\" + $netResourceType + $counters[$_.type] + '.json'
    Write-Host "saving to $file"
    (ConvertTo-Json -Depth 100 $_ -Verbose).Replace('\u0027', '''') | sc $file
})

Here's a Python tool I wrote that does the same thing, but the JSON formatting is much better: https://jampadcdn01.azureedge.net/netfx/2016/09/modulararm/armtemplatesplit.py

This is compatible with Python 3 and legacy Python (2.7+).
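
For readers who only want the path rule that the PowerShell script above applies, here it is restated as a tiny Python sketch. It is not the linked armtemplatesplit.py; the function name part_path is hypothetical and the returned path is relative to the parts folder.

```python
from pathlib import PurePosixPath


def part_path(resource_type, count):
    """Turn an ARM resource type into a relative file path for one resource.

    'Microsoft.Network/virtualNetworks' with count 1 becomes
    'network/virtualNetworks/virtualNetworks1.json'.
    """
    provider, _, type_path = resource_type.partition("/")
    # 'Microsoft.Network' -> 'network' (second dotted segment, lower-cased)
    provider_folder = provider.split(".")[1].lower()
    # Nested types such as 'virtualMachines/extensions' become nested folders.
    leaf = type_path.split("/")[-1]
    file_name = "{}{}.json".format(leaf, count)
    return str(PurePosixPath(provider_folder) / type_path / file_name)
```

The count argument plays the role of the per-type counter in the script, so several resources of the same type do not overwrite each other.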

Deploy script

Here's my current armdeploy.ps1 deploy script:

Deploy ARM Template (ps1)

Learning Elasticsearch with PowerShell

Reframing Elasticsearch

Before I talk about any topic, I like to reframe it away from the marketing, lame "Hello World" examples, and your own personal echo chamber. So, I'm going to begin by talking about what Elasticsearch ("ES") is. I do not consider it to be a "search engine", so... pay attention.

I'm not big on marketing introductions. They are usually filled with non-technical pseudo-truths and gibberish worthy of the "As seen on TV" warning label. So, what is Elasticsearch? Marketing says it's a search system. People who have used it say it's a hardcore Aristotelian database that makes for a fine primary datastore as well as for a fine search engine.

One of the major differences with MongoDB is that Elastic is more explicit about its indexing. Every database does indexing and everything has a schema. Have a data-structure? You have an index. Have fields? At a minimum, you have an implicit schema. This is what makes an Aristotelian-system Aristotelian.

See my video on Platonic and Aristotelian Data Philosophies for more information on why "NoSQL" is a modern marketing fiction similar to "AJAX".

More particularly, Elasticsearch has a stronger focus on the schema than MongoDB.

You might find people say that Elastic is schemaless. These people have neither read nor peeked at the docs. Elastic is very explicit about its indexes. Sometimes you'll hear that it's schemaless because it uses Lucene (the engine deep deep deep down that does the searching), which is schemaless. That's stupid. Lucene uses straight bytes and Elastic adds the schema on top of it. Your file system uses bytes and SQL Server adds a schema on top of it. Just because your file system uses bytes, not a schema, this doesn't mean that SQL Server "doesn't have a schema" because it uses MDF files on a file system using bytes. SQL Server has a schema. Elastic has a schema. It might be implicit, but it has a schema. Even if you never create one, there is a schema. Elastic is explicit about having a schema.

Beyond the Aristotelian nature though, like MongoDB, ES is an object database (or "document database" as the marketing guys call it, but, unless we wildly redefine what "document" means, I've never stored a document in MongoDB / ES!) You work with objects anyway, why in the world are you translating them to and from relational patterns? Square peg -> round hole. ES and MongoDB are perfect for systems that rely heavily on objects. JSON in / JSON out. No translation required. This one feature here is why many ditch SQL Server for ES or MongoDB. Translation for the sake of translation is insane.

Yet another difference is in terms of access: because MongoDB uses TCP and ES uses HTTP, for all practical purposes, MongoDB requires a library, but an ES library is redundant. When something is as dynamic as ES, using strongly-typed objects in .NET makes .NET fodder for ridicule. .NET developers with an old school mindset (read: inability to paradigm shift) in particular have an unhealthy attachment to making everything they touch into some strongly-typed abstraction layer (though not .NET, the blasphemy known as Mongoose comes to mind.) This is also everything right about the dynamic keyword in C#. It's good to see C# finally get close to the modern world.

The stereotype of a .NET developer (for non-.NET developers) is this: the dude is given a perfectly good HTTP endpoint, he then proceeds to express his misguided cleverness by destroying the open nature of the endpoint by wrapping it in yet another API you have to learn. You can simply look at Nuget to see the massive number of pointless abstraction layers everyone feels the need to dump onto the world. To make matters worse, when you want to make a simple call, you're forced to defeat the entire point of the clean RESTful API by using the pointless abstraction layer. Fail. Abstraction layers are fun to write to learn an API, but... dude... keep them to yourself. Go simple. Go raw. Nobody wants abstraction layer complexity analogous to WebForms; there's a reason Web API is so popular now. Don't ruin it. This lesson is for learning, not to create yet another pointless abstraction layer. Live and love dynamic programming. Welcome to the modern world.

This is most likely why REST is little more than a colloquialism. People who attempt to use hypermedia (a requirement for REST) either find it pointlessly complicated or impossible to wrap. REST is dead until something like the HAL spec gains widespread acceptance; for now, we use the term "REST" to mean any HTTP verb-driven, resource-based API. ES is not REST, it's "REST" only in this latter sense. We usually refer to this, ironically, as "RESTful".

Game Plan

This leads to the point of this entire document: you will learn to use ES from first-principles; you will be able to write queries as wrapped or unwrapped as you want.

My game plan here will seem absolutely backward:

  • Abstract some lower-level ES functionality with PowerShell using search as an example.
  • Discuss ES theory, index setup, and data inserting.

Given that setup is a one-time thing and day-to-day stuff is... uh... daily, the day-to-day stuff comes first. The first-things-first fallacy can die.

I chose PowerShell for this because it's almost guaranteed to be something you've not seen before-- and it offends .NET developers, iOS developers, and Python developers equally. In practice, I use raw curl to manage my ES clusters and Python for mass importing (.NET has far too much bloat for one-off tools!)

As I demonstrate using ES from PowerShell, I will give commentary on what I'm doing with ES. You should be able to learn both ES and some practical, advanced PowerShell. If you don't care about PowerShell... even better! One reason I chose PowerShell was to make sure you focus on the underlying concepts.

There's a lot of really horrible PowerShell out there. I'm not part of the VB-style / Cmdlet / horribly-tedious-and-tiring-long-command-name PowerShell weirdness. If you insist on writing out Invoke-WebRequest instead of simply using wget, but use int and long instead of Int32 and Int64, then you have a ridiculous inconsistency to work out. Also, you're part of the reason PowerShell isn't more widespread. You are making it difficult on everyone. In the following code, we're going to use PowerShell that won't make you hate PowerShell; it will be pleasant and the commands will end up rolling off your fingers (something that won't happen with the horrible longhand command names). Linux users will feel at home with this syntax. VB-users will be wildly offended. EXCELLENT!

Flaw Workaround

Before we do anything, we have to talk about one of the greatest epic fails in recent software history...

While the Elasticsearch architecture is truly epic, it has a known design flaw (the absolute worst form of a bug): it allows a POST body in a GET request. This makes development painful:

  • Fiddler throws a huge red box at you.
  • wget in PowerShell gives you an error.
  • Postman in Chrome doesn't even try.
  • HttpClient in .NET throws System.Net.ProtocolViolationException saying "Cannot send a content-body with this verb-type."

.NET is right. Elasticsearch is wrong. Breaking the rules for the sake of what you personally feel makes you a vigilante. Forcing a square peg into a round hole just for the sake of making sure "gets" are done with GET makes you an extremist. Bodies belong in POST and PUT, not GET.

It's a pretty stupid problem to have, given how clever their overall architecture is. There's an idiom for this in English: Homer nodded.

To get around this flaw, instead of actually allowing us to search with POST (like normal people would), we are forced to use a hack: the source query string parameter.
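
As a sketch of what that hack looks like on the wire, the Python below serializes the query body to JSON, URL-encodes it, and puts it in the source parameter of a plain GET URL. It assumes an Elasticsearch version that accepts source this way; the host and type names are taken from the samples in this article and search_url is a hypothetical helper.

```python
import json
from urllib.parse import quote


def search_url(base, index, doc_type, query):
    """Build a GET search URL that carries the query body in ?source=..."""
    # safe="" percent-encodes everything that is not unreserved, including
    # braces, quotes, colons and spaces in the serialized JSON.
    source = quote(json.dumps(query), safe="")
    return "https://{}/{}/{}/_search?pretty&source={}".format(
        base, index, doc_type, source
    )
```

The upside is that any tool that can issue a GET can run a search; the downside is that large queries turn into very long URLs.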

PowerShell Setup

When following along, you'll want to use the PowerShell ISE. Just type ISE in PowerShell.

PowerShell note: hit F5 in ISE to run a script

If you are going to run these in a ps1 file, make sure to run Set-ExecutionPolicy RemoteSigned as admin. Microsoft doesn't seem to like PowerShell at all. It's not the default in Windows 8/10. It's not the default on Windows Server. You can't run scripts by default. Someone needs to be fired. Run the aforementioned command to allow local scripts.

HTTP/S call

We're now ready to be awesome.

To start, let's create a call to ES. In the following code, I'm calling HTTPS with authorization. I'm not giving you the sissy Hello World, this is from production. While you're playing around, you can remove HTTPS and authorization. You figure out how. That's part of learning.

$base = 'search.domain.net' 

$call = {
    param($params)

    $uri = "https://$base"

    $headers = @{ 
        'Authorization' = 'Basic fVmBDcxgYWpndYXJj3RpY3NlkZzY3awcmxhcN2Rj'
    }

    $response = $null
    $response = wget -Uri "$uri/$params" -method Get -Headers $headers -ContentType 'application/json'
    $response.Content
}

PowerShell note: prefix your : with ` or else you'll get a headache

So far, simple.

We can call &$call to call an HTTPS service with authorization.

But, let's break this out a bit...

$call = {
    param($verb, $params)

    $uri = "https://$base"

    $headers = @{ 
        'Authorization' = 'Basic fVmBDcxgYWpndYXJj3RpY3NlkZzY3awcmxhcN2Rj'
    }

    $response = wget -Uri "$uri/$params" -method $verb -Headers $headers -ContentType 'application/json'
    $response.Content
}

$get = {
    param($params)
    &$call "Get" $params
}

$delete = {
    param($params)
    &$call "Delete" $params
}

Better. Now we can call various verb functions.

To add PUT and POST, we need to account for the POST body. I'm also going to add some debug output to make life easier.

$call = {
    param($verb, $params, $body)

    $uri = "https://$base"

    $headers = @{ 
        'Authorization' = 'Basic fVmBDcxgYWpndYXJj3RpY3NlkZzY3awcmxhcN2Rj'
    }

    Write-Host "`nCalling [$uri/$params]" -f Green
    if($body) {
        Write-Host "BODY`n--------------------------------------------`n$body`n--------------------------------------------`n" -f Green
    }

    $response = wget -Uri "$uri/$params" -method $verb -Headers $headers -ContentType 'application/json' -Body $body
    $response.Content
}

$put = {
    param($params,  $body)
    &$call "Put" $params $body
}

$post = {
    param($params,  $body)
    &$call "Post" $params $body
}

In addition to having POST and PUT, we can also see what serialized data we are sending, and where.
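
The same verb-plus-body idea does not depend on PowerShell. As a rough comparison, here is a Python sketch using only the standard library. It builds the request without sending it, so the pieces are easy to inspect. The user and password defaults are placeholders, and build_request is a hypothetical name; a Basic Authorization header is simply the Base64 encoding of user:password.

```python
import base64
import json
from urllib.request import Request


def build_request(base, verb, params, body=None, user="user", password="password"):
    """Build (but do not send) an HTTPS request with Basic auth and a JSON body."""
    credentials = "{}:{}".format(user, password).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    headers = {
        "Authorization": "Basic " + token,
        "Content-Type": "application/json",
    }
    # Only serialize a body when one was supplied (PUT and POST).
    data = None if body is None else json.dumps(body).encode("utf-8")
    url = "https://{}/{}".format(base, params)
    return Request(url, data=data, headers=headers, method=verb.upper())
```

Passing the result to urllib.request.urlopen would perform the call; keeping construction separate from sending makes the debug output above trivial to reproduce.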

ES Catalog Output

Now, let's use $call (or $get, etc) in something with some meaning:

$cat = {
    param($json)

    &$get "_cat/indices?v&pretty"
}

This will get the catalog of indexes.

Elasticsearch note: You can throw pretty anywhere to get formatted JSON.

Running &$cat gives me the following json:

[ {
  "health" : "yellow",
  "status" : "open",
  "index" : "site1!production",
  "pri" : "5",
  "rep" : "1",
  "docs.count" : "170",
  "docs.deleted" : "0",
  "store.size" : "2.4mb",
  "pri.store.size" : "2.4mb"
}, {
  "health" : "yellow",
  "status" : "open",
  "index" : "site2!production",
  "pri" : "5",
  "rep" : "1",
  "docs.count" : "141",
  "docs.deleted" : "0",
  "store.size" : "524.9kb",
  "pri.store.size" : "524.9kb"
} ]

But, we're in PowerShell; we can do better:

ConvertFrom-Json (&$cat) | ft

Output:

health status index                 pri rep docs.count docs.deleted store.size pri.store.size
------ ------ -----                 --- --- ---------- ------------ ---------- --------------
yellow open   site1!staging         5   1   176        0            2.5mb      2.5mb         
yellow open   site2!staging         5   1   144        0            514.5kb    514.5kb    
yellow open   site1!production      5   1   170        0            2.4mb      2.4mb         
yellow open   site2!production      5   1   141        0            524.9kb    524.9kb     

Example note: !production and !staging have nothing to do with ES. It's something I do in ES, Redis, Mongo, SQL Server, and every other place the data will be stored to separate deployments. Normally I would remove this detail from article samples, but the following examples use this to demonstrate filtering.

PowerShell note: Use F8 to run a selection or single line. It might be worth removing your entire $virtualenv, if you want to play around with this.

Much nicer. Not only that, but we have the actual object we can use to filter on the client side. It's not just text.

(ConvertFrom-Json (&$cat)) `
    | where { $_.index -match '!production' }  `
    | select index, docs.count, store.size |  ft

Output:

index              docs.count store.size
-----              ---------- ----------
site1!production   170        2.4mb     
site2!production   141        532.9kb   
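
The where/select pipeline is just filter-then-project over parsed JSON, which looks much the same in any dynamic language. A minimal Python equivalent, assuming the catalog arrives as the JSON array shown earlier (production_summary is a hypothetical name, and a plain substring test stands in for the regular-expression -match):

```python
import json


def production_summary(cat_json, marker="!production"):
    """Keep only indexes whose name contains marker, with three chosen fields."""
    rows = json.loads(cat_json)
    return [
        {key: row[key] for key in ("index", "docs.count", "store.size")}
        for row in rows
        if marker in row["index"]
    ]
```

The point carries over from the PowerShell version: once the response is an object rather than text, filtering on the client is a one-liner.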

Getting Objects from ES

Let's move forward by adding our search function:

$search = {
    param($index, $json)

    &$get "$index/mydatatype/_search?pretty&source=$json"
}

Calling it...

&$search 'site2!production' '{
    "query": {
        "match_phrase": {
            "content": "struggling serves"
        }
    }
}'

Elasticsearch note: match_phrase will match the entire literal phrase "struggling serves"; match would have searched for "struggling" or "serves". Results will return with a score, sorted by that score; entries with both words would have a higher score than an entry with only one of them. Also, wildcard will allow stuff like struggl*.

Meh, I'm not a big fan of this analog of SELECT * FROM [site2!production]:

&$search 'site2!production' '{
    "query": {
        "match_phrase": {
            "content": "struggling serves"
        }
    },
    "fields": ["selector", "title"]
}'

This will return a bunch of JSON.

{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : 0.6106973,
    "hits" : [ {
      "_index" : "site2!staging",
      "_type" : "entry",
      "_id" : "AVGUw_v5EFj4l3MNvkeV",
      "_score" : 0.6106973,
      "fields" : {
        "selector" : [ "sweet/mine" ]
      }
    }, {
      "_index" : "site2!staging",
      "_type" : "entry",
      "_id" : "AVGU3PwoEFj4l3MNvk9D",
      "_score" : 0.4333064,
      "fields" : {
        "selector" : [ "of/ambition" ]
      }
    }, {
      "_index" : "site2!staging",
      "_type" : "entry",
      "_id" : "AVGU3QHeEFj4l3MNvk9G",
      "_score" : 0.4333064,
      "fields" : {
        "selector" : [ "or/if" ]
      }
    } ]
  }
}

We can improve on this.

First, we can convert the input to something nicer:

&$search 'site2!production' (ConvertTo-Json @{
    query = @{
        match_phrase = @{
            content = "struggling serves"
        }
    }
    fields = @('selector', 'title')
})

Here we're just creating a dynamic object and serializing it. No JIL or Newtonsoft converters required.

To make this a lot cleaner, here's a modified $search:

$search = {
    param($index, $json, $obj)
    if($obj) {
        $json = ConvertTo-Json -Depth 10 $obj
    }

    &$get "$index/mydatatype/_search?pretty&source=$json"
}

You need -Depth <Int32> because the default is 2. Nothing deeper than the default will serialize. It will simply show "System.Collections.Hashtable". In ES, you'll definitely have deep objects.
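
To see the truncation for yourself without touching ES, here's a small sketch. The nested hashtable is made up purely for the demo; the only thing being shown is the difference between the default depth and an explicit one.

```powershell
# Four levels of nesting: deeper than the default -Depth of 2.
$nested = @{ a = @{ b = @{ c = @{ d = 1 } } } }

# -WarningAction hides the truncation warning that newer PowerShell versions emit.
$shallow = ConvertTo-Json $nested -Compress -WarningAction SilentlyContinue
$deep    = ConvertTo-Json $nested -Compress -Depth 10

$shallow   # the innermost hashtable is flattened to its type name
$deep      # the full structure survives
```

If your searches come back empty for no obvious reason, checking the serialized body for "System.Collections.Hashtable" is a cheap first step.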

Now, I can call this with this:

&$search 'site2!production' -obj @{
    query = @{
        match_phrase = @{
            content = "struggling serves"
        }
    }
    fields = @('selector', 'title')
}

This works fine. Not only that, but the following code still works:

&$search 'site2!production' '{
    "query": {
        "match_phrase": {
            "content": "struggling serves"
        }
    },
    "fields": ["selector", "title"]
}'

Now you don't have to fight with escaping strings; you can also still copy/paste JSON with no problem.

JSON to PowerShell Conversion Notes:

  • : becomes =
  • all ending commas go away
    • newlines denote new properties
  • @ before all new objects (e.g. {})
  • [] becomes @()
    • @() is PowerShell for array
  • " becomes ""
    • PowerShell escaping is double-doublequotes
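
Here's a quick sketch that applies those rules to one small object and checks the result by round-tripping it. The "tag" property is invented for the demo just to exercise the quote-escaping rule; nothing here talks to ES.

```powershell
# The JSON we are translating from (kept only for comparison).
$json = '{"query":{"match":{"content":"river"}},"fields":["selector","title"],"tag":"<b class=\"hit\">"}'

# The same thing as a PowerShell hashtable, following the conversion notes.
$obj = @{
    query = @{
        match = @{
            content = "river"
        }
    }
    fields = @('selector', 'title')
    tag = "<b class=""hit"">"
}

# Parse both sides so key order and escaping style don't matter.
$fromJson = ConvertFrom-Json $json
$fromObj  = ConvertFrom-Json (ConvertTo-Json -Depth 10 $obj)

$fromObj.tag -eq $fromJson.tag   # True when the translation is faithful
```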

DO NOT FORGET THE @ BEFORE {. If you do, it will sit there forever as it tries to serialize nothing into nothing. After a few minutes, you'll get hundreds of thousands of JSON entries. Seriously. It tries to serialize every aspect of every .NET property forever. This is why the -Depth defaults to 2.

Next, let's format the output:

(ConvertFrom-Json(&$search 'content!staging' 'entry' -obj @{
    query = @{
        match_phrase = @{
            content = "struggling serves"
        }
    }
    fields = @('selector', 'title')
})).hits.hits.fields | ft

Could probably just wrap this up:

$matchPhrase = {
    param($index, $type, $text, $fieldArray)
    (ConvertFrom-Json(&$search $index $type -obj @{
        query = @{
            match_phrase = @{
                content = $text
            }
        }
        fields = $fieldArray
    })).hits.hits.fields | ft
}

Just for completeness, here's $match. Nothing too shocking.

$match = {
    param($index, $type, $text, $fieldArray)
    (ConvertFrom-Json(&$search $index $type -obj @{
        query = @{
            match = @{
                content = $text
            }
        }
        fields = $fieldArray
    })).hits.hits.fields | ft
}
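
Since $matchPhrase and $match differ by exactly one key, one possible way to fold them together is to build the query body from a parameter. This is only a sketch: $buildQuery and its $kind parameter are names I'm making up here, and the result is just a hashtable you could hand to $search via -obj.

```powershell
# Hypothetical helper: $kind is 'match' or 'match_phrase'.
$buildQuery = {
    param($kind, $text, $fieldArray)
    @{
        query = @{
            # A variable works as a hashtable key, so one block covers both query types.
            $kind = @{
                content = $text
            }
        }
        fields = $fieldArray
    }
}

$body = &$buildQuery 'match_phrase' 'struggling serves' @('selector', 'title')
ConvertTo-Json -Depth 10 $body
```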

Finally, we have this:

&$match 'content!staging' 'entry' 'even' @('selector', 'title')

Output:

title                               selector           
-----                               --------           
{There Defeats Cursed Sinews}       {forestalled/knees}
{Foul Up Of World}                  {sweet/mine}       
{And As Rank Down}                  {crown/faults}     
{Inclination Confront Angels Stand} {or/if}            
{Turn For Effects World}            {of/ambition}      
{Repent Justice Is Defeats}         {white/bound}      
{Buys Itself Black I}               {form/prayer}

There we go: phenomenal cosmic power in an itty bitty living space.

Beefier Examples and Objects

Here's an example of a search that's a bit beefier:

&$search  'bible!production' -obj @{
    query = @{
        filtered = @{
            query = @{
                match = @{
                    content = "river Egypt"
                }
            }
            filter = @{
                term = @{
                    "labels.key" = "isaiah"
                }
            }
        }
    }
    fields = @("passage")
    highlight = @{
        pre_tags = @("<span class=""search-result"">")
        post_tags = @("</span>")
        fields = @{
            content = @{
                fragment_size = 150
                number_of_fragments = 3
            }
        }
    }
}

Elasticsearch note: Filters are binary: have it or not? Queries are analog: they have a score. In this example, I'm combining a filter with a query. Here I'm searching the index for content containing "river" or "Egypt" where labels: { "key": "isaiah" }.

Using this I'm able to filter my documents by label where my labels are complex objects like this:

  "labels": [
    {
      "key": "isaiah",
      "implicit": false,
      "metadata": []
    }
  ]

I'm able to search by labels.key to do a hierarchical filter. This isn't an ES tutorial; rather, this is to explain why "labels.key" was in quotes in my PowerShell, but nothing else is.

Design note: The objects you send to ES should be something optimized for ES. This nested type example is somewhat contrived to demonstrate nesting. You can definitely just throw your data at ES and it will figure out the schema on the fly, but that just means you're lazy. You're probably the type of person who used the forbidden [KnownType] attribute in WCF because you were too lazy to write DTOs. Horrible. Go away.

This beefier example also shows me using ES highlighting. In short, it allows me to tell ES that I want a content summary of a certain size with some keywords wrapped in some specified HTML tags.

This content will show in addition to the requested fields.

The main reason I mention highlighting here is this:

When you serialize the object, it will look weird:

"pre_tags":  [
   "\u003cspan class=\"search-result\"\u003e"
]

Chill. It's fine. I freaked out at first too. Turns out ES can handle unicode just fine.
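
If you want proof, here's a tiny sketch: feed the escaped form to a JSON parser and look at what comes out. This only shows that \u003c and \u003e are ordinary JSON escapes; depending on your PowerShell version, you may not even see them in the serialized output.

```powershell
# The escaped form, as it appears in the serialized request body.
$escaped = '{"pre_tags":["\u003cspan class=\"search-result\"\u003e"]}'

# Any JSON parser turns \u003c and \u003e back into < and >.
$tag = (ConvertFrom-Json $escaped).pre_tags[0]
$tag
```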

Let's run with this highlighting idea a bit by simplifying it, parameterizing it, and deserializing the result (like we've done already):

$result = ConvertFrom-Json(&$search  $index -obj @{
    query = @{
        filtered = @{
            query = @{
                match = @{
                    content = $word
                }
            }
        }
    }
    fields = @("selector", "title")
    highlight = @{
        pre_tags = @("<span class=""search-result"">")
        post_tags = @("</span>")
        fields = @{
            content = @{
                fragment_size = 150
                number_of_fragments = 3
            }
        }
    }
})

Nothing new so far. Same thing, just assigning it to a variable...

The JSON passed back was this...

{
  "took" : 6,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 30,
    "max_score" : 0.093797974,
    "hits" : [ {
      "_index" : "content!staging",
      "_type" : "entry",
      "_id" : "AVGhy1octuYZuX6XP7zu",
      "_score" : 0.093797974,
      "fields" : {
        "title" : [ "Pause First Bosom The Oft" ],
        "selector" : [ "hath come/hand of" ]
      },
      "highlight" : {
        "content" : [ " as to fault <span class=\"search-result\">smells</span> ourselves am free wash not tho
se lies enough business eldest sharp first what corrupted blood which state knees wash our cursed oft", " give
 above shall curse be help faults offences this snow me pray shuffling confront ere forgive newborn a engaged 
<span class=\"search-result\">smells</span> rain state angels up form", " ambition eldest i guilt <span class=
\"search-result\">smells</span> forehead true snow thicker rain compelld intent bound my which currents our so
ul of limed angels white snow this buys" ]
      }
    },
    ...    

The data we want is in hits.hits.fields and hits.hits.highlight.

So, we can get them, and play with the output...

$hits = $result.hits.hits
$formatted = $hits | select `
        @{ Name='selector'; Expression={$_.fields.selector} },
        @{ Name='title'; Expression={$_.fields.title} }, 
        @{ Name='highlight'; Expression={$_.highlight.content} }
$formatted

This is basically the following...

hits.Select(p=>new { selector = p.fields.selector, ...});

Output:

selector                   title                            highlight                                        
--------                   -----                            ---------                                        
hath come/hand of          Pause First Bosom The Oft        { as to fault <span class="search-result">smel...
all fall/well thicker      Enough Nature Heaven Help Smells { buys me sweet queen <span class="search-resu...
with tis/law oft           Me And Even Those Business       { if what form this engaged wretched heavens m...
hand forehead/engaged when Form Confront Prize Oft Defeats  {white begin two-fold faults than that strong ...
newborn will/not or        Were Gilded Help Did Nature      { above we evidence still to me no where law o...
cursed thicker/as free     Cursed Tis Corrupted Guilt Where { justice wicked neglect by <span class="searc...
currents hand/true of      Both Than Serves Were May        { serves engaged down ambition to man it is we...
two-fold can/eldest queen  Then Sweet Intent Help Turn      { heart double than stubborn enough the begin ...
force did/neglect whereto  Compelld There Strings Like Not  { it oft sharp those action enough art rests s...
babe whereto/whereto is    As Currents Prayer That Free     { defeats form stand above up <span class="sea...

This part is important: highlight is an array. Your search terms may show up more than once in a document. That's where the whole number_of_fragments = 3 came in. The highlight size is from that fragment_size = 150. So, for each entry we have, we have a selector (basically an ID), a title, and up to three highlights up to 150-characters each.
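
Here's a small offline sketch of that shape. The JSON is a made-up miniature of a real response (two hits, short fragments), so you can see that each hit carries its own array of fragments.

```powershell
# Hypothetical miniature response; real data would come from &$search.
$sampleJson = @'
{ "hits": { "hits": [
    { "fields": { "selector": ["a/b"], "title": ["First"] },
      "highlight": { "content": ["one <b>x</b>", "two <b>x</b>"] } },
    { "fields": { "selector": ["c/d"], "title": ["Second"] },
      "highlight": { "content": ["three <b>x</b>"] } }
] } }
'@

$sample = ConvertFrom-Json $sampleJson

# One row per hit, with the number of highlight fragments that hit returned.
$rows = $sample.hits.hits | Select-Object `
    @{ Name = 'selector';  Expression = { $_.fields.selector } },
    @{ Name = 'fragments'; Expression = { @($_.highlight.content).Count } }

$rows | Format-Table
```

Wrapping the highlight in @() keeps the count honest even when a hit comes back with a single fragment.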

Let's abstract this stuff before we go to the final data analysis step:

$loadHighlights = {
    param($index, $word)

    $result = ConvertFrom-Json(&$search  $index -obj @{
        query = @{
            filtered = @{
                query = @{
                    match = @{
                        content = $word
                    }
                }
            }
        }
        fields = @("selector", "title")
        highlight = @{
            pre_tags = @("<span class=""search-result"">")
            post_tags = @("</span>")
            fields = @{
                content = @{
                    fragment_size = 150
                    number_of_fragments = 3
                }
            }
        }
    })

    $hits = $result.hits.hits
    $hits | select `
            @{ Name='selector'; Expression={$_.fields.selector} },
            @{ Name='title'; Expression={$_.fields.title} }, 
            @{ Name='highlight'; Expression={$_.highlight.content} }
}

Now, we can run:

&$loadHighlights 'content!staging' 'smells'

Let's use this and analyze the results:

(&$loadHighlights 'content!staging' 'smells').foreach({
    Write-Host ("Selector: {0}" -f $_.selector)
    Write-Host ("Title: {0}`n" -f $_.title)
    $_.highlight | foreach {$i=0} {
        $i++
        Write-Host "$i $_"
    }
    Write-Host "`n`n"
})

Here's a small sampling of the results:

Selector: two-fold can/eldest queen
Title: Then Sweet Intent Help Turn

1  heart double than stubborn enough the begin and confront as <span class="search-result">smells</span> ourse
lves death stronger crown murder steel pray i though stubborn struggling come by
2  of forehead newborn mine above forgive limed offences bosom yet death come angels angels <span class="searc
h-result">smells</span> i sinews past that murder his bosom being look death
3  <span class="search-result">smells</span> where strong ill action mine foul heavens turn so compelld our to
 struggling pause force stubborn look forgive death then death try corrupted



Selector: force did/neglect whereto
Title: Compelld There Strings Like Not

1  it oft sharp those action enough art rests shove stand cannot rain bosom bosom give tis repentance try upon
t possessd my state itself lies <span class="search-result">smells</span> the
2  brothers blood white shove no stubborn than in ere <span class="search-result">smells</span> newborn art re
pentance as like though newborn will form upont pause oft struggling forehead help
3  shuffling serve lies <span class="search-result">smells</span> stand well queen well visage and free his pr
ayer lies that art ere a there law even by business confront offences may retain

We have the selector, the title, and the highlights with <span class="search-result">...</span> showing us where our terms were found.

Setup

Up to this point, I've assumed that you already have an ES setup. Setup is once, but playing around is continuous. So, I got the playing around out of the way first.

Now, I'll go back and talk about ES setup with PowerShell. This should GREATLY improve your ES development. Well, it's what helps me at least...

ES has a schema. It's not fully Platonic like SQL Server, nor is it fully Aristotelian like MongoDB. You can throw all kinds of things at ES and it will figure them out. This is what ES calls dynamic mapping. If you like the idea of digging through incredibly massive data dumps during debugging or passing impossibly huge datasets back and forth, then this might be the way to go (were you "that guy" who threw [KnownType] on your WCF objects? This is you. You have no self-respect.) On the other hand, if you are into light-weight objects, you're probably passing ES a nice tight JSON object anyway. In any case, want schema to be computed as you go? That's dynamic mapping. Want to define your schema and have ES ignore unknown properties? That's, well, disabling dynamic mapping.

Dynamic mapping ends up being similar to the lowest-common denominator ("LCD") schema like in Azure Table Storage: your schema might end up looking like a combination of all fields in all documents.
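
For the "define your schema and ignore unknown properties" route, the mapping body might look roughly like this. Treat it as a sketch: I'm assuming the type-level "dynamic" setting is what you want (false to ignore undeclared fields), and the type name and properties are just examples, so check the mapping docs for your ES version before relying on it.

```powershell
# Assumption: "dynamic" set to $false at the type level tells ES to ignore
# fields that are not declared under properties.
$mapping = @{
    mappings = @{
        entry = @{
            dynamic = $false
            properties = @{
                selector = @{ type = "string" }
                title    = @{ type = "string" }
            }
        }
    }
}

$json = ConvertTo-Json -Depth 10 -Compress $mapping
$json
```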

ES doesn't so much deal with "schema" in the abstract, but with concrete indexes and types.

No, that doesn't mean it's schemaless. That's insane. It means that the index and the types are the schema.

In any case, in ES, you create indexes. These are like your tables. Your indexes will have metadata and various properties much like SQL Server metadata and columns. Properties have types just like SQL Server columns. Unlike SQL Server, there's also a concept of a type. Indexes can have multiple types.

Per the Elasticsearch: Definitive Guide, the type is little more than a "_type" property internally, thus types are almost like partition keys in Azure Table Storage. This means that when searching, you're searching across all types, unless you specify the type as well. Again, this maps pretty closely to a partition key in Azure Table Storage.

Creating an index

Creating an index with a type is a matter of calling POST /<index> with your mapping JSON object. Our $createIndex function will be really simple:

$createIndex = {
    param($index, $json, $obj)
    if($obj) {
        $json = ConvertTo-Json -Depth 10 $obj
    }
    &$post $index $json
}

Things don't get interesting until we call it:

&$createIndex 'content!staging' -obj @{
    mappings = @{
        entry = @{
            properties = @{
                selector = @{
                    type = "string"
                }
                title = @{
                    type = "string"
                }
                content = @{
                    type = "string"
                }
                created = @{
                    type = "date"
                    format = "yyyy-MM-dd"
                }
                modified = @{
                    type = "date"                  
                }
            }
        }
    }
}

This creates an index called content!staging with a type called entry with five properties: selector, title, content, created, and modified.

The created property is there to demonstrate the fact that you can throw formats on properties. Normally, dates are UTC, but here I'm specifying that I don't even care about times when it comes to the create date.

With this created, we can see how ES sees our data. We do this by calling GET /<index>/_mapping:

$mapping = {
    param($index)
    &$get "$index/_mapping?pretty"
}

Now to call it...

&$mapping 'content!staging'

Adding Data

Now to throw some data at this index...

Once again, the PowerShell function is simple:

$add = {
    param($index, $type, $json, $obj)
    if($obj) {
        $json = ConvertTo-Json -Depth 10 $obj
    }
    &$post "$index/$type" $json
}

To add data, I'm going to use a trick I wrote about elsewhere:

if (!([System.Management.Automation.PSTypeName]'_netfxharmonics.hamlet.Generator').Type) {
    Add-Type -Language CSharp -TypeDefinition '
        namespace _Netfxharmonics.Hamlet {
            public static class Generator {
                private static readonly string[] Words = "o my offence is rank it smells to heaven hath the primal eldest curse upont a brothers murder pray can i not though inclination be as sharp will stronger guilt defeats strong intent and like man double business bound stand in pause where shall first begin both neglect what if this cursed hand were thicker than itself with blood there rain enough sweet heavens wash white snow whereto serves mercy but confront visage of whats prayer two-fold force forestalled ere we come fall or pardond being down then ill look up fault past form serve turn forgive me foul that cannot since am still possessd those effects for which did crown mine own ambition queen may one retain corrupted currents world offences gilded shove by justice oft tis seen wicked prize buys out law so above no shuffling action lies his true nature ourselves compelld even teeth forehead our faults give evidence rests try repentance can yet when repent wretched state bosom black death limed soul struggling free art more engaged help angels make assay bow stubborn knees heart strings steel soft sinews newborn babe all well".Split('' '');
                private static readonly int Length = Words.Length;
                private static readonly System.Random Rand = new System.Random();

                public static string Run(int count, bool subsequent = false) {
                    return Words[Rand.Next(1, Length)] + (count == 1 ? "" : " " + Run(count - 1, true));
                }
            }
        }

    '
}

Let's use the generator with $add to load up some data:

$ti = (Get-Culture).TextInfo
(1..30).foreach({
    &$add -index 'content!staging' -type 'entry' -obj @{
        selector = "{0}/{1}" -f ([_netfxharmonics.hamlet.Generator]::Run(1)), ([_netfxharmonics.hamlet.Generator]::Run(1))
        title = $ti.ToTitleCase([_netfxharmonics.hamlet.Generator]::Run(4))
        content = [_netfxharmonics.hamlet.Generator]::Run(400) + '.'
        created = [DateTime]::Now.ToString("yyyy-MM-dd")
        modified = [DateTime]::Now.ToUniversalTime().ToString("o")
    }
})

This runs fast; crank the sucker up to 3000 with a larger content size if you want. Remove "Write-Host" from $call for more speed.
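
The created and modified values in the loop above use two different .NET format strings, and it's worth seeing exactly what each one sends. A minimal sketch, using a fixed sample timestamp (my assumption, so the output is repeatable) instead of [DateTime]::Now:

```powershell
$culture = [Globalization.CultureInfo]::InvariantCulture

# A fixed UTC timestamp so the result doesn't change from run to run.
$stamp = [DateTime]::new(2015, 11, 7, 5, 45, 10, [DateTimeKind]::Utc)

$created  = $stamp.ToString('yyyy-MM-dd', $culture)   # date only, no time component
$modified = $stamp.ToString('o', $culture)            # round-trip format, with time and Z suffix

$created
$modified
```

The date-only string is what a date-only mapping format expects; the round-trip string carries the full UTC timestamp.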

Your output will look something like this:

Calling [https://search.domain.net/content!staging/entry]
BODY
--------------------------------------------
{
    "selector":  "death/defeats",
    "title":  "Sinews Look Lies Rank",
    "created":  "2015-11-07",
    "content":  "Engaged shove evidence soul even stronger bosom bound form soul wicked oft compelld steel which turn prize yet stand prize.",
    "modified":  "2015-11-07T05:45:10.6943622Z"
}
--------------------------------------------

When I run one of the earlier searches...

&$match 'content!staging' 'entry' 'even'

...we get the following results:

selector        
--------        
{death/defeats}

Debugging

If stuff doesn't work, you need to figure out how to find out why; not simply find out why. So, brain-mode activated: wrong results? Are you getting them wrong or are they actually wrong? Can't insert? Did the data make it to the server at all? Did it make it there, but couldn't get inserted? Did it make it there, get inserted, but you were simply told that it didn't insert? Figure it out.

As far as simple helpers go, I'd recommend doing some type of dump:

$dump = {
    param($index, $type)
    &$get "$index/$type/_search?q=*:*&pretty"
}

This is a raw JSON dump. You might want to copy/paste somewhere for analysis, or play around in PowerShell:

(ConvertFrom-Json(&$dump 'content!staging' 'entry')).hits.hits

I'd recommend just using a text editor to look around instead of random faux PowerShell data-mining. Because JSON and XML are both perfectly human readable, you'll see what you need quickly. Even then, there's no reason not to just type the actual link into your own browser:

http://10.1.60.3:9200/content!staging/entry/_search?q=*:*&pretty

I'd recommend the Pretty Beautiful Javascript extension for Chrome.

You can remove &pretty when using this Chrome extension.

Another thing I'd strongly recommend is having a JSON beautifier toggle for input JSON:

$pretty = $False

So you can do something like this:

$serialize = {
    param($obj)
    if(!$pretty) {
        $pretty = $false
    }
    if($pretty) {
        ConvertTo-Json -Depth 10 $obj;
    }
    else {
        ConvertTo-Json -Compress -Depth 10 $obj
    }
}

Instead of calling ConvertTo-Json in your other places, just call &$serialize.

$search = {
    param($index, $type, $json, $obj)
    if($obj) {
        $json = &$serialize $obj
    }
    &$get "$index/$type/_search?pretty&source=$json"
}

Remember, this is for input, not output. This is for the data going to the server.

You want this option because once you disable this, you can do this:

&$match 'content!staging' 'entry' 'struggling' @('selector', 'title')

To get this...

Calling [http://10.1.60.3:9200/content!staging/entry/_search?pretty&source={"fields":["selector","title"],"query":{"match":{"content":"struggling"}}}]

title                            selector         
-----                            --------         
{Those Knees Look State}         {own/heavens}    

Now you have a URL you can dump into your web browser. Also, you have a link to share with others.
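
One caveat if you share these links: JSON containing characters like &, #, or spaces can get mangled in a URL. Here's a sketch of one way to guard against that, assuming you're fine with a less readable link. The search text is made up to include an awkward character; the host is the same one used above.

```powershell
$obj = @{ query = @{ match = @{ content = 'struggling & free' } } }

# Compact JSON keeps the URL short; escaping keeps it intact.
$compact = ConvertTo-Json -Compress -Depth 10 $obj
$escaped = [uri]::EscapeDataString($compact)

$url = "http://10.1.60.3:9200/content!staging/entry/_search?pretty&source=$escaped"
$url

# Unescaping gives back the exact JSON, which is what the server ends up parsing.
[uri]::UnescapeDataString($escaped) -eq $compact
```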

Regarding logs, on Linux systems you can view error messages at the following location:

/var/log/elasticsearch/CLUSTERNAME.log

I like to keep an SSH connection open while watching the log closely:

tail -f /var/log/elasticsearch/david-content.log 

Your cluster name will be configured in the YAML config somewhere around /etc/elasticsearch/elasticsearch.yml.

Updating (and Redis integration)

Updating Elasticsearch objects ("documents") is interesting for two reasons, a good one and a weird one:

Good reason: documents are immutable. Updates involve marking the existing item as deleted and inserting a new document. This is exactly how SQL Server 2014 IMOLTP works. It's one secret of extreme efficiency. It's an excellent practice to follow.

Weird reason: you have to know the integer ID to update a document. It's highly efficient, which makes it, at worst, "weird"; not "bad". If it allowed updates based on custom fields, you'd have a potential perf hit. Key lookups are the fastest.

Prior to Elasticsearch 2.x, you could add something like { "_id": { "path": "selector" } } to tell ES that you want to use your "selector" field as your ID. This was deprecated in version 1.5 and removed in 2.x (yes, they are two separate things). Today, _id is immutable. So, when you see docs saying you can do this, check the version. It will probably be something like version 1.4. Compare the docs for _id in version 1.4 with version 2.1 to see what I mean.

When you make a call like the following example, a cryptic ID is generated:

POST http://10.1.60.3:9200/content!staging/entry

But, you can specify the integer:

POST http://10.1.60.3:9200/content!staging/entry/5

This is great, but nobody anywhere cares about integer IDs. These surrogate keys have absolutely no meaning to your document. How in the world could you possibly know how to update something? You have to know the ID.

If you have your own useful identifier, then good for you, just use the following:

POST http://10.1.60.3:9200/content!staging/entry/tacoburger

Yet, you can't use any type of slash. Soooo.... forget it. Since I usually use ES to store content linked from URLs, this isn't going to fly. Besides, nobody wants to have to keep track of all the various encodings you have to do to make your data clean. So, we need to add some normalization to our IDs both to make ES happy and to keep our usage simple.

So, OK, fine... I have to save some type of surrogate key to key map somewhere. Where could I possibly save them? Elasticsearch IS. MY. DATABASE. I need something insanely efficient for key / value lookups, but that persists to disk. I need something easy to use on all platforms. It should also be non-experimental. It should be a time-tested system. Oh... right: Redis.

The marketing says that Redis is a "cache". Whatever that means. It's the job of marketing to either lie about products to trick people into buying stuff or to downplay stuff for the sake of a niche market. In reality, Redis is a key/value database. It's highly efficient and works everywhere. It's perfect. Let's start making the awesome...

I'm all about doing things based on first-principles (people who can't do this laugh at people who can do this and accuse them of "not invented here syndrome"; jealousy expresses itself in many ways), but here I'm going to use the StackExchange.Redis package. It seems to be pretty standard and it works pretty well. I'm running it in a few places. Create some VS2015 (or whatever) project and add the NuGet package. Or, go find it and download it. But... meh... that sounds like work. Use NuGet. Now we're going to reference that DLL:

$setupTracking = {
    Add-Type -Path 'E:\_GIT\awesomeness\packages\StackExchange.Redis.1.0.488\lib\net45\StackExchange.Redis.dll'
    $cs = '10.1.60.2'
    $config = [StackExchange.Redis.ConfigurationOptions]::Parse($cs)
    $connection = [StackExchange.Redis.ConnectionMultiplexer]::Connect($config)
    $connection.GetDatabase()
}

Here I'm adding the assembly, creating my connection string, creating a connection, then getting the database.

Let's call this and set some type of relative global:

$redis = &$setupTracking

We need to go over a few things in Redis first:

Redis communicates over TCP. You send commands to it and you get stuff back. The commands are assembler-looking codes like:

  • HGET
  • FLUSHALL
  • KEYS
  • GET
  • SET
  • INCR

When you use INCR, you are incrementing a counter. So...

INCR taco

That sets taco to 1.

INCR taco

Now it's 2.

We can get the value...

GET taco

The return value will be 2.

By the way, this is how you set up realtime counters on your website. You don't have to choose between database locking and eventual consistency. Use Redis.

Then there's the idea of a hash. You know, a dictionary-looking thingy.

So,

HSET chicken elephant "dude"

This sets elephant on the chicken hash to "dude".

HGET chicken elephant

This gets "dude". Shocking... I know.

HGETALL chicken

This dumps the entire chicken hash.

Weird names demonstrate that the name has nothing to do with the system and it forces you to think, thus remembering it better long-term.

To list all the keys, do something like this:

KEYS *

When I say "all", I mean "all". Both the keys created by INCR and the keys created by HSET will show. This is a typical wildcard. You can do stuff like KEYS *name* just fine.

Naming note: Do whatever you want, but it's common to use names like "MASTERSCOPE:SCOPE#VARIABLE". My system already has a well defined internal naming system of Area!Environment, so in what follows we'll use "content!staging#counter" and "content!staging#Hlookup".

OK, that's enough to get started. Here's the plan: Because the integer IDs mean absolutely nothing to me, I'm going to treat them as an implementation detail; more technically, as a surrogate key. My key is selector. I want to update via selector, not some internal ID that means nothing to me.

To do this, I'll basically just emulate what Elasticsearch 1.4 did: specify what property I want as my key.

To this end, I need to add a new $lookupId function, plus update both $add and $delete:

$lookupId = {
    param($index, $selector)

    if($redis) {
        $id = [int]$redis.HashGet("$index#Hlookup", $selector)
    }
    if(!$id) {
        $id = 0
    }
    $id
}

$add = {
    param($index, $type, $json, $obj, $key)
    if($obj) {
        $json = &$serialize $obj
        if($key) {
            $keyValue = $obj[$key]
        }
    }
    
    if($redis -and $keyValue) {
        $id = &$lookupId $index $keyValue
        Write-Host "`$id is $id"
        if($id -eq 0) {
            $id = [int]$redis.StringIncrement("$index#counter")
            if($verbose) {
                Write-Host "Linking $keyValue to $id"
            }
            &$post "$index/$type/$id" $json
            [void]$redis.HashSet("$index#Hlookup", $keyValue, $id)
        }
        else {
            &$put "$index/$type/$id" $json
        }

    }
    else {
        &$post "$index/$type" $json
    }
}

$delete = {
    param($index)
    &$call "Delete" $index

    if($redis) {
        [void]$redis.KeyDelete("$index#counter")
        [void]$redis.KeyDelete("$index#Hlookup")
    }
}

When stuff doesn't exist, you get some type of blank entity. I've never seen a null while using the StackExchange.Redis package, so that's something to celebrate. The values that StackExchange.Redis methods work with are RedisKey and RedisValue. There's not much to learn there though, since there are operators for many different conversions. You can work with strings just fine without needing to know about RedisKey and RedisValue.

So, if I'm sending it a key, get the key value from the object I sent in. If there is a key value and Redis is enabled and active, see if that key value already maps to the ID of an existing item. That's a Redis lookup. Not there? OK, must be new, use Redis to generate a new ID and send that to Elasticsearch (POST $index/$type/$id). The ID was already there? That means the selector was already assigned a unique, sequential ID by Redis, use that for the update.
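
If you want to poke at that decision logic without Redis or ES running, here's a rough in-memory sketch. To be clear about the assumptions: $state is a plain hashtable standing in for the Redis counter and lookup hash, and $resolveId is a name I'm inventing for the demo; it only decides between "new" and "update", it doesn't call anything.

```powershell
# In-memory stand-ins: counter plays the INCR key, lookup plays the HSET/HGET hash.
$state = @{ counter = 0; lookup = @{} }

# Hypothetical helper: returns the ID for a selector and whether it was just created.
$resolveId = {
    param($selector)
    $isNew = -not $state.lookup.ContainsKey($selector)
    if ($isNew) {
        # New selector: take the next counter value and remember the link.
        $state.counter++
        $state.lookup[$selector] = $state.counter
    }
    [pscustomobject]@{ Id = $state.lookup[$selector]; IsNew = $isNew }
}

$first  = &$resolveId 'visage/is'   # new, so this would be a POST
$second = &$resolveId 'if/blood'    # new, so this would be a POST
$again  = &$resolveId 'visage/is'   # known, so this would be a PUT to the same ID

$first, $second, $again | Format-Table
```

The slash in the selector never reaches the URL; only the integer does, which is the whole point of the mapping.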

For now, POST works fine for an Elasticsearch update as well. Regardless, I'd recommend using PUT for update even though POST works. You never know when they'll enforce it.

Let's run a quick test:

$selectorArray = &$generate 'content!staging' 'entry' 2 -key 'selector'

($selectorArray).foreach({
    $selector = $_
    Write-Host ("ID for $selector is {0}" -f (&$lookupId 'content!staging' $selector))
})

Output:

ID for visage/is is 4
ID for if/blood is 5

I'm going to hop over to Chrome to see how my data looks:

http://10.1.60.3:9200/content!staging/_search?q=*:*

It's there...

{
    "_index": "content!staging",
    "_type": "entry",
    "_id": "4",
    "_score": 1,
    "_source": {
        "selector": "visage/is",
        "title": "Bound Fault Pray Or",
        "created": "2015-11-07",
        "content": "very long content omitted",
        "modified": "2015-11-07T22:24:23.0283870Z"
    }
}

Cool, ID is 4.

What about updating?

Let's try it...

$obj = @{
    selector = "visage/is"
    title = 'new title, same document'
    content = 'smaller content'
    created = [DateTime]::Now.ToString("yyyy-MM-dd")
    modified = [DateTime]::Now.ToUniversalTime().ToString("o")
}
&$add -index 'content!staging' -type 'entry' -obj $obj -key 'selector' > $null

Output:

{
    "_index": "content!staging",
    "_type": "entry",
    "_id": "4",
    "_score": 1,
    "_source": {
        "selector": "visage/is",
        "title": "new title, same document",
        "created": "2015-11-07",
        "content": "smaller content",
        "modified": "2015-11-07T23:11:58.4607963Z"
    }
}

Sweet. Now I can update via my own key (selector) and not have to ever touch Elasticsearch surrogate keys (_id).

Full SSL, HOSTS, and IIS Dev Setup via PowerShell

I'm all about people having their own setup and way of doing things. If you like Resharper, cool. If you like Notepad++, fine. Problems arise when you are forced into a standardized setup. Focus on the interface, not the implementation. With standardization you lose the natural QA you get from diverse environments.

Getting to specifics: HTTP is great, but times have changed. While we all know and love it (whatever), it needs to be taken out back. You can't throw a site into the wild without protection from the beasts. This is NOT an after-the-fact production implementation detail. You need this in development. You can't have HTTPS surprises at the last second. That's simply irresponsible.

At a minimum, you should use IIS Express with SSL where possible. My preference is definitely a full-IIS dev box setup with HTTPS on everywhere.

What goes into this setup?

  • Dev IP Addresses
  • Updating HOSTS/DNS
  • SSL Certs
  • WebSites in IIS
  • Getting Visual Studio to like it

This is all where PowerShell comes in.

Prerequisite: Make sure you can run ps1 files in PowerShell. Here's one suggestion: Set-ExecutionPolicy RemoteSigned.

Adding IP addresses and updating HOSTS

While I sometimes teach kids, this explanation is for adults. I'll assume you can read and can deduce what does what. Code is self-documenting.

Do this in PowerShell as Administrator. I'd recommend the ISE.

10..30 | % {
    $ip = "10.1.111.$_"
    New-NetIPAddress -InterfaceAlias "Ethernet" -IPAddress $ip -PrefixLength 16
    #++ for Hyper-V
    #New-NetIPAddress -InterfaceAlias "vEthernet (External Virtual Switch)" -IPAddress $ip -PrefixLength 16
}

"
10.1.111.13 api.domain.local
10.1.111.12 viewer.domain.local
10.1.111.11 admin.domain.local
10.1.111.10 domain.local
" | ac "$($env:systemroot)\system32\drivers\etc\hosts"

Obviously, this adds the IP addresses 10.1.111.10 through 10.1.111.30 to the network interface and assigns .local domain names to four of them in the hosts file (don't hijack a top level domain; use .local!)
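One thing to watch: appending to the hosts file is not idempotent, so running the snippet twice leaves duplicate lines. Below is a minimal sketch of a duplicate-safe merge, written in Python so it runs anywhere. `merge_hosts` and the sample data are hypothetical, and it only builds the new text; it never touches the real hosts file.

```python
def merge_hosts(existing_text, entries):
    """Return hosts-file text with any missing 'ip name' lines appended."""
    present = set()
    for line in existing_text.splitlines():
        parts = line.split('#', 1)[0].split()  # drop comments, split on whitespace
        if len(parts) >= 2:
            for name in parts[1:]:
                present.add((parts[0], name.lower()))

    missing = [f"{ip} {name}" for ip, name in entries
               if (ip, name.lower()) not in present]
    if not missing:
        return existing_text

    # Make sure the appended lines start on their own line.
    separator = "" if (existing_text.endswith("\n") or not existing_text) else "\n"
    return existing_text + separator + "\n".join(missing) + "\n"


entries = [("10.1.111.10", "domain.local"), ("10.1.111.11", "admin.domain.local")]
current = "127.0.0.1 localhost\n10.1.111.10 domain.local\n"
updated = merge_hosts(current, entries)
print(updated)
```

Feeding the result back in returns it unchanged, so the merge is safe to rerun.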

Next...

Creating SSL certs

Below is a PowerShell script that will:

  • create a master certificate and
  • create SSL certificates 1.

First, it uses makecert.exe to create the cert 2. Then, it uses pvk2pfx.exe to create a pfx file. Finally, it uses Import-PfxCertificate to import this pfx file into the certificate store.

Look at the "##+ run" section for examples of how to run this:

  • First, create a master certificate.
  • Then, create the others.

Uncomment and comment as needed. It just makes life easier. No need for fancy PowerShell modules.

Troubleshooting: Go to run (Win-R) and type MMC to get to a place where you can delete the certs. File->Add/Remove Snap-in. Add "Certificates". Add. Select Local Computer. Look in Trusted Root Certification Authorities for the master and Personal for the others. Use your eyes. You'll see it.

$virtualenv = {

##+ environment
$apiFolder = 'C:\Program Files (x86)\Windows Kits\10\bin\x64'
$target = 'E:\Drive\Code\Security\Cert'

if(!(Test-Path $apiFolder)) {
    "$apiFolder not found; check SDK installation"
    exit
}

##+ /environment

$root = {
    param([string]$name)

    $certFile = Join-Path $target $name
    $rootBaseName = "$certFile`Root"

    if(Test-Path "$rootBaseName.cer") {
        "$rootBaseName.cer already exists. ABORTING."
        exit
    }

    # use the $name parameter for the subject, not the outer-scope $rootName
    &"$apiFolder\makecert.exe" -r -n "CN=$name`Root" -pe -sv "$rootBaseName.pvk" -a sha1 -len 2048 -b 01/01/2010 -e 01/01/2030 -cy authority "$rootBaseName.cer"
    &"$apiFolder\pvk2pfx.exe" -pvk "$rootBaseName.pvk" -spc "$rootBaseName.cer" -pfx "$rootBaseName.pfx"

    Import-Certificate -FilePath "$rootBaseName.cer" -CertStoreLocation Cert:\LocalMachine\Root > $null
}

$ssl = {
    param([string]$name, [string]$rootName, [boolean]$isClientCert = $false)

    $rootCertFile = Join-Path $target $rootName
    $rootBaseName = "$rootCertFile`Root"

    if(!(Test-Path "$rootBaseName.cer")) {
        "$rootBaseName.cer does not exist. Stopping."
        return
    }

    $siteBaseName = Join-Path $target $name
    $siteBaseName

    if($isClientCert) {
        &"$apiFolder\makecert.exe" -iv "$rootBaseName.pvk" -ic "$rootBaseName.cer" -n "CN=$name" -pe -sv "$siteBaseName.pvk" -a sha1 -len 2048 -b 01/01/2010 -e 01/01/2030 -sky exchange "$siteBaseName.cer" -eku 1.3.6.1.5.5.7.3.2
    }
    else {
        &"$apiFolder\makecert.exe" -iv "$rootBaseName.pvk" -ic "$rootBaseName.cer" -n "CN=$name" -pe -sv "$siteBaseName.pvk" -a sha1 -len 2048 -b 01/01/2010 -e 01/01/2030 -sky exchange "$siteBaseName.cer" -eku 1.3.6.1.5.5.7.3.1
    }

    &"$apiFolder\pvk2pfx.exe" -pvk "$siteBaseName.pvk" -spc "$siteBaseName.cer" -pfx "$siteBaseName.pfx"

    ##++ cer is only public key; need private key for this one; that's pfx
    Import-PfxCertificate -FilePath "$siteBaseName.pfx" -CertStoreLocation Cert:\LocalMachine\My > $null
}

##+ run
$rootName = 'Master'

#&$root $rootName
&$ssl 'identity.jampad.local' -rootName $rootName
#&$ssl 'devworkerrole01.local' -rootName $rootName
#&$ssl 'client.ssl' -rootName $rootName -isClientCert $true
##+ /run

}

&$virtualenv

Now for IIS...

Setting up IIS

If you are on Windows Server, just go here and run that one line: IIS PowerShell Installation.

Now for setup...

This is somewhat of a monster: it creates the site, adds the IP addresses, and adds the certs. IIS doesn't let you create a binding-less WebSite, so a temporary HTTP binding comes first. Then it gets removed. HTTP is obsolete and shouldn't be used.

$virtualenv = {

$networkInterfaceName = 'Ethernet'
##+ hyper-v
#$networkInterfaceName = 'vEthernet (External Virtual Switch)'

$ipCheck = {
    param([string]$ip)
    $addressData = Get-NetIPAddress | where { $_.InterfaceAlias -eq $networkInterfaceName } | select -expand IPAddress

    if(($addressData | where { $_ -eq $ip }).Count -eq 0) {
        Write-Host "IP address $ip not found. Run the following command. ABORTING."
        Write-Host "    New-NetIPAddress -InterfaceAlias '$networkInterfaceName' -IPAddress $ip -PrefixLength 16"
        exit
    }
}

$addSslBinding = {
    param([string]$name, [string]$ip, [string] $sslCert)
    if((Get-WebBinding  -name $name -IP $ip -Port 443 -Protocol https).Count -eq 0) {
        Write-Host "Adding SSL for address $ip..."

        New-WebBinding -name $name -IP $ip -Port 443 -Protocol https

        try {
            Write-Host "Setting SSL cert $sslCert..."
            $subjectName = "CN=$sslCert"
            $cert = ls Cert:\LocalMachine\My | where { $_.Subject -eq $subjectName }
            $cert | ni "IIS:\SslBindings\$ip!443" > $null
        }
        catch {
            Write-Host ("`t{0}" -f $_.Exception.Message)
        }
    }
}

$setup = {
    param (
        [Parameter(Mandatory=$True)]
        [string] $name,
        [string] $path,
        [string] $ip,
        [int32] $httpPort = 80,
        [string] $sslCert,
        $ipHostMap,
        [boolean] $assignSslForEachHost,
        [boolean] $sslOnly
    )

    if($ip) {
        &$ipCheck $ip
    }
    elseif($ipHostMap) {
        foreach($ip in $ipHostMap.Keys) {
            # Write-Host ("$ip => {0}" -f $ipHostMap[$ip])
            &$ipCheck $ip
        }
        $ip = '127.0.0.1'
        $tempIp = $true
    }

    try {
        if((ls IIS:\AppPools | where name -eq $name).Count -eq 0) {
            Write-Host 'Creating application pool...'
            pushd IIS:\AppPools
            New-Item $name > $null
        }
        if((ls IIS:\Sites | where name -eq $name).Count -eq 0) {
            cd IIS:\Sites
            Write-Host "Creating web site ($name)..."
            Write-Host "`nNOTE: Adding temporary HTTP binding (required). If SSL-only, will try to remove HTTP in a moment.`n" -f Yellow
            New-Item $name -bindings @{protocol="http"; bindingInformation=$ip + ":" + $httpPort + ":"} -physicalPath $path > $null
            Set-ItemProperty $name -name applicationPool -value $name > $null
        }
        if($sslOnly -and !$sslCert) {
            $sslCert = $name
        }
        if($sslCert -and !$assignSslForEachHost) {
            &$addSslBinding $name $ip $sslCert
        }
        elseif($assignSslForEachHost) {
            foreach($ip in $ipHostMap.Keys) {
                $sslCert = $ipHostMap[$ip]
                &$addSslBinding $name $ip $sslCert
            }
        }
        if($sslOnly -and ($sslCert -or $ipHostMap)) {
            Write-Host "Removing HTTP..."
            Remove-WebBinding -Name $name -Protocol 'http'
            Write-Host "`nNOTE: You should double check to make sure HTTP was removed." -f Yellow
        }
    }
    finally {
        popd
    }
}

Import-Module WebAdministration

$ipHostMap = @{
  '10.1.111.13' = 'api.domain.local'
  '10.1.111.12' = 'viewer.domain.local'
  '10.1.111.11' = 'admin.domain.local'
  '10.1.111.10' = 'domain.local'
}

$name = 'domain.local'
if((ls IIS:\Sites | where name -eq $name).Count -gt 0) {
    cd IIS:\Sites
    Write-Host 'Deleting web site...'
    rm -r $name
}

# single-site multi-tenancy
&$setup -name 'domain.local' -path 'E:\_GIT\Domain.Project\Domain.Project.WebSite' -ipHostMap $ipHostMap -assignSslForEachHost $true -sslOnly $true

&$setup -name 'anotherdomain.local' -ip '10.1.111.111' -path 'E:\_GIT\anotherdomain\anotherdomain.WebSite' -sslOnly $true
&$setup -name 'yetanotherdomain.local' -ip '10.1.111.112' -path 'E:\_GIT\yetanotherdomain\yetanotherdomain.WebSite' -sslOnly $true
&$setup -name 'another.local' -ip '10.1.111.113' -path 'E:\_GIT\another\another.WebSite' -sslOnly $true
&$setup -name 'onemore.local' -ip '10.1.111.113' -path 'E:\_GIT\onemore\onemore.WebSite' -sslOnly $true

}

&$virtualenv

Your question: "What the heck is that &$virtualenv?" I do everything in the ISE. It shares variables between tabs. This &$virtualenv runs the entire thing in its own scope. The name isn't magic. I used to call it "scope". Whatever.

At this point the server is set up. Do this enough and it will take only the time it takes to type in the addresses and host names.

Removing annoying Visual Studio message

One problem: Visual Studio hates attaching to IIS. First, you have to be admin. Get over it. Be admin. I have Visual Studio set to always run as admin. Second, it complains when attaching to IIS.

Visual Studio Image

The fix is simple.

# Visual Studio 2013 (version 12.0)
sp -path HKCU:\Software\Microsoft\VisualStudio\12.0\Debugger -Name DisableAttachSecurityWarning -Value 1

# Visual Studio 2015 (version 14.0)
sp -path HKCU:\Software\Microsoft\VisualStudio\14.0\Debugger -Name DisableAttachSecurityWarning -Value 1

That's it. Now you have a nice server system.

Also, no more F5 nonsense. No more "run app". What "app"? It's not a console app. It's a WebSite. It's running anyway. F5 doesn't "start it". So... no more of that nonsense.

Quick HowTos

How do you debug Startup? Because Global.asax is obsolete and you're now using OWIN/Startup, you know that startup is Startup. Things now make sense. To debug this, throw in the following code:

System.Diagnostics.Debugger.Launch();

How do I restart the WebSite? Touch web.config (open and save it).

How do I restart the WebSite while attached? Yeah, for ABSOLUTELY NO REASON Visual Studio wants to detach when you touch web.config. The solution shouldn't be a surprise: PowerShell. When I'm developing I have a lot of PowerShell stuff open. Actually, when I'm doing anything I have it open.

Here's what I actually use to touch all my sites to reset them all at once. No F5-on-each. No AppPool recycle.

$touch = {
    param([string]$base)
    $config = Join-Path $base 'web.config'
    (ls $config).LastWriteTime = [DateTime]::Now
}

&$touch 'E:\_GIT\jampad.content\Content.Sample.WebSite'
&$touch 'E:\_GIT\jampad.content\Content.Sample.WebSite.Api'
&$touch 'E:\_GIT\jampad.content\Jampad.Content.WebSite.Api'
&$touch 'E:\_GIT\netfxharmonics\NetFXHarmonics.WebSite'
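The same touch trick is easy to do outside PowerShell. Here is a minimal Python sketch; `touch_config` is a hypothetical name, and a temp directory stands in for a real site root so the example runs anywhere.

```python
import os
import tempfile
import time

def touch_config(base):
    """Set web.config's modified time to now, like the $touch block above."""
    config = os.path.join(base, 'web.config')
    os.utime(config, None)  # None means "use the current time"
    return os.path.getmtime(config)

# Stand-in site folder (hypothetical); a real call would pass the site root.
site = tempfile.mkdtemp()
config_path = os.path.join(site, 'web.config')
with open(config_path, 'w') as f:
    f.write('<configuration />')
os.utime(config_path, (1000000000, 1000000000))  # pretend the file is old

stamp = touch_config(site)
print(time.time() - stamp < 60)
```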


1. It will also create client certificates. See the 'client.ssl' example.
2. You can use pure PowerShell to create a self-signed certificate, but, apparently, makecert.exe goes above and beyond the call of duty.

Quick Google Drive Install

So, you set up a new VM (locally or on Azure). How do you get your stuff on it?

Me? I use Google Drive. I sync up a custom folder named _Transport_SystemSetup. This folder has my Notepad2, WinRar, Chrome, and a ton of other things I just dump into my Windows folder (so I can Win-R run them) (e.g. putty, makecert, nuget, all sysinternals).

Now... how do I get Google Drive on there? Chicken and Egg?

No. Run the following in PowerShell. It will download and install Google Drive for you.

$local = "$env:userprofile\downloads"
$url = 'https://dl.google.com/tag/s/appguid%3D%7B3C122445-AECE-4309-90B7-85A6AEF42AC0%7D%26iid%3D%7BD7394D9A-AF62-3D75-9686-627C62B1E926%7D%26lang%3Den%26browser%3D4%26usagestats%3D0%26appname%3DGoogle%2520Drive%26needsadmin%3Dtrue/drive/googledrivesync.exe'
$file = "$local\googledrivesync.exe"
wget $url -outfile $file

&$file

That's it.

Just for fun... sometimes I'll spin up an Azure VM just for a quick web test. So, here's the PowerShell to get Chrome quickly:

$local = "$env:userprofile\downloads"
$url = 'https://dl.google.com/tag/s/appguid%3D%7B8A69D345-D564-463C-AFF1-A69D9E530F96%7D%26iid%3D%7B01986938-8680-E7F2-FAFC-001798EF10B8%7D%26lang%3Den%26browser%3D4%26usagestats%3D0%26appname%3DGoogle%2520Chrome%26needsadmin%3Dprefers%26installdataindex%3Ddefaultbrowser/update2/installers/ChromeSetup.exe'
$file = "$local\ChromeSetup.exe"
wget $url -outfile $file

&$file

Hamlet: Better Random Strings

I don't like Lorem Ipsum. Why? Because I'm a student of Latin. Whenever I see the Lorem Ipsum nonsense, my brain spends more time trying to decode it than anything else. I know it's just randomized Cicero, but my brain tries to troll me.

So, I'm just going to use English. Instead of Lorem Ipsum, I use Hamlet. I took a section from Hamlet and threw it into PowerShell. Then, I optimized it down to just a few lines so I can reuse it in samples without much bloat:

By the way, Hamlet has 4752 distinct words with each inflection and declension being counted (e.g. "I", "be", "being", "was", and "am" being counted as five distinct words, though they really aren't).

$genData = 'o my offence is rank it smells to heaven hath the primal eldest curse upont a brothers murder pray can i not though inclination be as sharp will stronger guilt defeats strong intent and like man double business bound stand in pause where shall first begin both neglect what if this cursed hand were thicker than itself with blood there rain enough sweet heavens wash white snow whereto serves mercy but confront visage of whats prayer two-fold force forestalled ere we come fall or pardond being down then ill look up fault past form serve turn forgive me foul that cannot since am still possessd those effects for which did crown mine own ambition queen may one retain corrupted currents world offences gilded shove by justice oft tis seen wicked prize buys out law so above no shuffling action lies his true nature ourselves compelld even teeth forehead our faults give evidence rests try repentance can yet when repent wretched state bosom black death limed soul struggling free art more engaged help angels make assay bow stubborn knees heart strings steel soft sinews newborn babe all well'.split(' ')

$gen = {
    param ($count)
    1..$count | foreach {$r = @()} { $r += $genData[(random) % $genData.count] } {[String]::Join(' ', $r)}
}

&$gen 10

Don't care about PowerShell? Chill. Python and JavaScript versions are at the end.

"like man teeth shall turn since repentance turn bound corrupted"

It's worth noting that if you run &$gen 100, you might be a candidate for some sort of poetry award:

so help intent rests a foul strong visage angels our double stubborn effects no angels did buys not law death i action since man oft is soul wash possessd which were we snow confront cursed shove forgive so up murder there offence repentance faults i but our forestalled black forgive man out serve look can defeats engaged down one mercy assay assay ere knees were retain gilded struggling a then prayer there law engaged prayer evidence of hand faults snow white a itself we business defeats serve double stand struggling smells defeats but repent heart both since defeats out shall

I should mention that if you wanted a concise way to capitalize the first letter, that's definitely doable:

$gen = {
    param ($count)
    1..$count | foreach {$a = @()} { $a += $genData[(random) % $genData.count] } { $s = [String]::Join(' ', $a); $s.Substring(0, 1).ToUpper() + $s.Substring(1) }
}

All this works so far, but it's not very efficient.

Let's use our super-duper object-oriented, results-focused, enterprise-class benchmark software:

$then = [DateTime]::Now
&$gen 400
$now = [DateTime]::Now
Write-Host ($now - $then).TotalMilliseconds

Our fancy benchmark enterprise-class reporting software shows 818ms on my machine.

What can we do better? Raw .NET:

if (!([System.Management.Automation.PSTypeName]'_netfxharmonics.hamlet.Generator').Type) {
    Add-Type -Language CSharp -TypeDefinition '
        namespace _Netfxharmonics.Hamlet {
            public static class Generator {
                private static readonly string[] Words = "o my offence is rank it smells to heaven hath the primal eldest curse upont a brothers murder pray can i not though inclination be as sharp will stronger guilt defeats strong intent and like man double business bound stand in pause where shall first begin both neglect what if this cursed hand were thicker than itself with blood there rain enough sweet heavens wash white snow whereto serves mercy but confront visage of whats prayer two-fold force forestalled ere we come fall or pardond being down then ill look up fault past form serve turn forgive me foul that cannot since am still possessd those effects for which did crown mine own ambition queen may one retain corrupted currents world offences gilded shove by justice oft tis seen wicked prize buys out law so above no shuffling action lies his true nature ourselves compelld even teeth forehead our faults give evidence rests try repentance can yet when repent wretched state bosom black death limed soul struggling free art more engaged help angels make assay bow stubborn knees heart strings steel soft sinews newborn babe all well".Split('' '');
                private static readonly int Length = Words.Length;
                private static readonly System.Random Rand = new System.Random();

                public static string Run(int count, bool subsequent = false) {
                    // Rand.Next(Length) covers every index, including 0; stop at 1 so exactly count words come back.
                    return Words[Rand.Next(Length)] + (count <= 1 ? "" : " " + Run(count - 1, true));
                }
            }
        }

    '
}

You call this with:

[_netfxharmonics.hamlet.Generator]::Run(400)

Rerunning our fancy enterprise-class benchmark again...

$then = [DateTime]::Now
[_netfxharmonics.hamlet.Generator]::Run(400)
$now = [DateTime]::Now
Write-Host ($now - $then).TotalMilliseconds

This time I get 8ms. Yeah. Eight.

Man, given this speed, we can increase our dictionary (see below to download hamlet_distinct.t):

if (!([System.Management.Automation.PSTypeName]'_netfxharmonics.hamlet.Generator').Type) {
    Add-Type -Language CSharp -TypeDefinition '
        namespace _Netfxharmonics.Hamlet {
            public static class Generator {
                private static readonly string[] Words = System.IO.File.ReadAllText(@"E:\Drive\Documents\Content\NetFX\NetFXContent\2015\03\hamlet\hamlet_distinct.t").Split('' '');
                private static readonly int Length = Words.Length;
                private static readonly System.Random Rand = new System.Random();

                public static string Run(int count, bool subsequent = false) {
                    // Rand.Next(Length) covers every index, including 0; stop at 1 so exactly count words come back.
                    return Words[Rand.Next(Length)] + (count <= 1 ? "" : " " + Run(count - 1, true));
                }
            }
        }

    '
}
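A distinct-word list like that can be built from any text in a few lines. This is a rough Python sketch, not necessarily how hamlet_distinct.t was produced: the tokenizing rule (lowercase letters, with inner apostrophes and hyphens allowed) and the `distinct_words` name are assumptions, and a one-line sample stands in for the play.

```python
import re

def distinct_words(text):
    """Return the sorted set of lowercase words found in text."""
    # Letters, optionally joined by an apostrophe or hyphen (e.g. two-fold).
    words = re.findall(r"[a-z]+(?:['-][a-z]+)*", text.lower())
    return sorted(set(words))

sample = "To be, or not to be: that is the question."
unique = distinct_words(sample)

# Space-separated output, matching what the Split call above expects.
print(' '.join(unique))
print(len(unique))
```

Note that this counts every inflected form separately, just like the 4752 figure does.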

This needs to work in more than simply PowerShell. I do a lot of work in both Python and JavaScript, so...

Python

#!/usr/bin/env python

from random import randint

genData = 'o my offence is rank it smells to heaven hath the primal eldest curse upont a brothers murder pray can i not though inclination be as sharp will stronger guilt defeats strong intent and like man double business bound stand in pause where shall first begin both neglect what if this cursed hand were thicker than itself with blood there rain enough sweet heavens wash white snow whereto serves mercy but confront visage of whats prayer two-fold force forestalled ere we come fall or pardond being down then ill look up fault past form serve turn forgive me foul that cannot since am still possessd those effects for which did crown mine own ambition queen may one retain corrupted currents world offences gilded shove by justice oft tis seen wicked prize buys out law so above no shuffling action lies his true nature ourselves compelld even teeth forehead our faults give evidence rests try repentance can yet when repent wretched state bosom black death limed soul struggling free art more engaged help angels make assay bow stubborn knees heart strings steel soft sinews newborn babe all well'.split(' ')

def gen(count):
    # randint is inclusive on both ends, so start at 0 to include the first word
    return genData[randint(0, len(genData) - 1)] + ('' if count <= 1 else ' ' + gen(count - 1))

print(gen(300))
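The recursive version is fine for a few hundred words, but Python's default recursion limit (1000 frames) caps how far it can go. Here is a non-recursive sketch using random.choice; `gen_iter` is a hypothetical name and the word list is cut down for brevity.

```python
from random import choice

# Shortened word list for the example; swap in the full genData list.
words = 'o my offence is rank it smells to heaven'.split(' ')

def gen_iter(count):
    """Return `count` random words joined by single spaces."""
    return ' '.join(choice(words) for _ in range(count))

print(gen_iter(10))
print(len(gen_iter(5000).split(' ')))
```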

JavaScript

var genData = 'o my offence is rank it smells to heaven hath the primal eldest curse upont a brothers murder pray can i not though inclination be as sharp will stronger guilt defeats strong intent and like man double business bound stand in pause where shall first begin both neglect what if this cursed hand were thicker than itself with blood there rain enough sweet heavens wash white snow whereto serves mercy but confront visage of whats prayer two-fold force forestalled ere we come fall or pardond being down then ill look up fault past form serve turn forgive me foul that cannot since am still possessd those effects for which did crown mine own ambition queen may one retain corrupted currents world offences gilded shove by justice oft tis seen wicked prize buys out law so above no shuffling action lies his true nature ourselves compelld even teeth forehead our faults give evidence rests try repentance can yet when repent wretched state bosom black death limed soul struggling free art more engaged help angels make assay bow stubborn knees heart strings steel soft sinews newborn babe all well'.split(' ');

function gen(count) {
    // Math.floor keeps the index an integer in range; stop at 1 so exactly `count` words come back
    return genData[Math.floor(Math.random() * genData.length)] + (count <= 1 ? '' : ' ' + gen(count - 1));
}

console.log(gen(300));

Want the full hamlet_distinct.t? OK, fine....


Powered by
Python / Django / Elasticsearch / Azure / Nginx / CentOS 7

Mini-icons are part of the Silk Icons set of icons at famfamfam.com