Migrating from NUnit to MSpec with psake and TeamCity

We have a large project with thousands of NUnit tests. We are starting to use MSpec for new projects because it is less verbose than NUnit for BDD-style tests. We would like new tests in existing projects to be written in MSpec, while keeping the current NUnit tests and getting all of the nice TeamCity testing integration for both NUnit and MSpec.

First, we reference MSpec from one of our existing test assemblies using NuGet, and write an MSpec test.

Next, we modify our psake build script to run both NUnit and MSpec tests.

Existing psake unit testing task:

task UnitTests -depends Compile -description "Unit Tests" {
 exec{ & $nunit $nunitTestsNUnitFile /nologo /config:$buildConfiguration /noshadow "/exclude=LongRunning,EndToEnd,Deployment" }
}

Updated psake unit testing task:

task UnitTests -depends NUnitUnitTests, MSpecUnitTests -description "Unit Tests" {
}

task NUnitUnitTests -depends Compile -description "NUnit unit tests" {
 exec{ & $nunit $nunitTestsNUnitFile /nologo /config:$buildConfiguration /noshadow "/exclude=LongRunning,EndToEnd,Deployment" }
}

task MSpecUnitTests -depends Compile -description "MSpec unit tests" {
  $testDlls = ls "$srcDir\*\bin\$buildConfiguration" -rec `
    | where { $_.Name.EndsWith(".Tests.dll") } `
    | where { (Test-Path ([System.IO.Path]::GetDirectoryName($_.FullName) + "\Machine.Specifications.dll")) -eq $True } `
    | foreach { $_.FullName }

  $mspecExePath = Join-Path $srcDir "packages\Machine.Specifications.0.5.7\tools\mspec-clr4.exe"

  if($env:TEAMCITY_VERSION -ne $Null)
  {
      exec{ & $mspecExePath $testDlls --teamcity }
  }
  else
  {
      exec{ & $mspecExePath $testDlls }
  }
}

As you can see, we have split the unit testing task into both NUnit and MSpec dependent tasks. We keep a flat project structure, so this allows us to perform a query to find all MSpec test DLLs (we assume that any test project build directory that contains Machine.Specifications.dll should be run with the MSpec runner). These DLLs can also contain NUnit tests.

To integrate with TeamCity, we check for the existence of the TEAMCITY_VERSION environment variable. Running the MSpec exe with --teamcity causes it to output in a format that TeamCity understands. Now we have both our NUnit and MSpec tests in the list of tests in TeamCity.

If a developer wants to start using MSpec in a project, they add a reference to MSpec via NuGet.

Secure SSH proxy with Windows Azure Linux VM

A proxy server can provide a secure mechanism to route Internet traffic through when on an insecure network, such as at a hotel or coffee shop. This guide demonstrates how to create a Linux VM within Windows Azure, set up SSH, and set up your laptop to easily connect to it and enjoy secure browsing.

Steps

Create an Azure account and subscription, and then browse to the management portal.

Sign up for the VM preview.

Create a new Linux VM from the gallery (I’ve chosen Ubuntu, but you can use other distributions).

When prompted, give it a username and password.  We will use an Extra-Small instance, and we will not upload an SSH key for this example.

VM Setup (cont). Note this image is slightly incorrect, as I used the DNS name SshProxy.

VM starting:

Once your VM is created, you can test that it’s available by logging on using SSH.  I’m using the SSH client that comes with Git, but feel free to use any.  PuTTY (http://www.chiark.greenend.org.uk/~sgtatham/putty/) is a particularly good one.

Now that you have an SSH server available, you have many options for tunnelling traffic. A simple approach is to use PuTTY to set up a dynamic tunnel, and set your browser’s configuration to use the tunnel.

The following screenshots show the PuTTY setup.

Hostname:

Dynamic proxy:

Once you’ve done this, press the Open button. You will be prompted to login.

Now set up your browser to route all traffic through this port. The following shows the configuration for Firefox:

Now you can google “What’s my IP” and you should see it changed. You now have web browsing traffic routed through a secure tunnel.

Although this method works, it is not as easy as it could be, especially if you are changing browser settings every time you go to a coffee shop. Additionally, it will not secure connections made outside of your browser. I use ProxyCap (http://www.proxycap.com) to route all Internet traffic through the SSH server. This means I don’t need to use PuTTY, change browser settings, or worry about other processes that don’t have proxy configuration. ProxyCap is not free (about $25, as far as I recall) but I have found it to be a very reliable application.

The following is the server configuration for ProxyCap:

In addition, you will need to set up the following rules: one for forwarding all traffic, the other for ignoring connections to the SSH server.

WebRole Deployment with Azure CmdLets

We recently set up a new Azure project.  We had previously built a custom Azure management API client for this purpose, but didn’t want to share this much code, so we looked for another solution.

We used the Azure CmdLets and were surprised at how quickly we were able to automate our deployment.

I’ve posted the code to http://github.com/awithy/AzureCmdLetDeployment.  You will not need the Azure SDK or tools installed to use this project (one of our requirements).  Please refer to the README.md file for more info.

Enigma Machine Exercise

At London Software Craftsmanship 2012, there was an excellent session held by the guys at Financial Agile where they had everyone attempt to build an Enigma encryption machine.  We were given about two hours to complete the exercise (with the promise of beer on success), and unfortunately I did not finish.

The purpose of the exercise was to provoke discussions regarding TDD approaches, such as the value of varying levels of tests, the need for mocking, etc.

The code is on GitHub: http://github.com/awithy/EnigmaMachine

Update: I rewrote the NUnit BDD style tests using MSpec as an exercise (branch MSpec).

Large scale logging in Windows Azure: Queues, WCF, and Page Blobs

Introduction

When something goes wrong in a large-scale software system, how do you diagnose it?  This article provides an overview of the journey our team took in developing diagnostic logging for a large scale Azure grid computing system.

Logging philosophy

The value of very detailed developer-facing diagnostics logging can be debated.  The main argument for it is that it captures as much information as possible to debug an issue, for those situations where reproducing a bug is not an option.  The main arguments against it are that verbose logging decreases code readability and can be a hindrance to performance.

For the purposes of this discussion, I will take as given that a very verbose level of diagnostics logging is required.  A bug should only be observed once, and the development team should have enough information to fix it without reproducing.

Summary of requirements

We have the following logging levels, shown here in increasing order of verbosity: Error, Warning, Info, Verbose, Debug, and Data.  Error, Warning, and Info diagnostics are written in customer language, and are understandable by anyone who uses the system.  Everything below Info is intended for developer consumption.

We have two primary needs for logging: real-time visibility, and issue debugging.  A closely related requirement is error reporting.

Real-time logging

Real-time logging provides an almost real-time log of top-level system information.  This provides a Twitter-like feed of what’s going on.  The amount of information that can be digested in real time is limited, however, so we limit this level of logging to Info and up.

Issue debugging

To support issue debugging, a very low level of logging is required, down to code control flow, and data operations.  This information does not need to be real-time, but can be gathered after the fact by a developer.
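The split between the two audiences and the two delivery paths can be sketched in a few lines. This is only an illustration in Python, not our production code: the `Level` enum and the two helper functions are hypothetical names, and the only facts taken from the text are the ordering of the levels and the "Info and up" cut-off.

```python
from enum import IntEnum


class Level(IntEnum):
    # Ordered by increasing verbosity, as listed in the requirements.
    ERROR = 0
    WARNING = 1
    INFO = 2
    VERBOSE = 3
    DEBUG = 4
    DATA = 5


def is_customer_facing(level: Level) -> bool:
    """Error, Warning and Info are written in customer language."""
    return level <= Level.INFO


def goes_to_real_time_feed(level: Level) -> bool:
    """Only Info and up is shown live; everything is kept for later debugging."""
    return level <= Level.INFO


if __name__ == "__main__":
    for level in Level:
        print(level.name, goes_to_real_time_feed(level))
```

Every level is still recorded for after-the-fact debugging; the functions only decide what is additionally pushed to the live feed.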

Best effort and no performance compromises

We consider logging (unlike error handling/reporting) to be a best effort mechanism.  What this means is that logging should not have the ability to crash a system.  If a single log is lost, we do not consider it to be a system failure.  If 5% of your logs are missing, then there’s an issue.

We also do not accept that there is a tradeoff between logging and performance other than in isolated high performance transactions and algorithms.

How we started: Azure Queues

I have long had logging as a primary feature in my toolbox of system scaffolding.  When applying this to a new Azure system on an Agile team, the question was: how do we log as simply as possible with the least effort?  The answer: Azure queues.  We simply wrote log messages to a queue in plain text.

Azure tables would have worked as well, but the API was a bit more difficult to work with.  The built-in Azure diagnostics features would have worked, but are a bit tricky to set up (no matter how many times I’ve done it, I have to look it up).  Azure diagnostics also have a minimum one-minute refresh interval, which is too long for my brain to hold a thought under stress.  Also, built-in infrastructure risks not growing with your needs.

As with all of the logging discussed here, we use in-memory client message buffers to have minimal effect on performance.
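A minimal sketch of such a client-side buffer, in Python for brevity. The class name, the `send_batch` callback and the batch size are assumptions made for illustration; the sketch is single-threaded, whereas a real client would need locking and a background flush. It also shows the best-effort rule from earlier: a failed send loses the batch but never throws into the calling code.

```python
class BufferedLogClient:
    """Collects messages in memory and hands them to a sink in batches."""

    def __init__(self, send_batch, max_batch=100):
        self._send_batch = send_batch   # hypothetical sink, e.g. a queue writer
        self._max_batch = max_batch
        self._buffer = []
        self.dropped = 0                # messages lost to sink failures

    def log(self, message):
        self._buffer.append(message)
        if len(self._buffer) >= self._max_batch:
            self.flush()

    def flush(self):
        batch, self._buffer = self._buffer, []
        if not batch:
            return
        try:
            self._send_batch(batch)
        except Exception:
            # Best effort: count the loss, but never let logging crash the caller.
            self.dropped += len(batch)


if __name__ == "__main__":
    client = BufferedLogClient(print, max_batch=2)
    for i in range(5):
        client.log(f"message {i}")
    client.flush()
```

Tracking `dropped` is what makes the "if 5% of your logs are missing, there’s an issue" rule measurable.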

Queues did well for a long time.  I’ve found that a single queue can scale to about 60 transactions per second (this may have gone up since I measured it early last year).  We did, however, quickly outgrow queues as a logging mechanism.   If you’re starting a new project, and want something fast, queues may work well for you.

The next step: custom WCF log service

Our next step was to build a WCF service that would provide a sink for client processes to send logs.  What this involves is a WCF service that maintains an in-memory ring-buffer.  We would store the last 100,000 logs, while a process asynchronously persisted them.  We would maintain a ring-buffer for each log level, so that we could keep a long history of info events, without them being drowned by lower level logs.  We would then serve these to a UI to provide real-time logging.  We were able to get over 10,000 logs/second through the system using a single WCF service (with a very tweaked net.tcp endpoint).  This was a big step up from queues, but in time we outgrew this as well.
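The per-level ring buffer idea can be sketched as follows. This is a Python illustration under stated assumptions, not the WCF service itself: `LevelRingBuffers` and its methods are hypothetical names, and persistence and the network endpoint are left out. A bounded `deque` discards the oldest entry once it is full, which is exactly the ring-buffer behaviour described above.

```python
from collections import deque


class LevelRingBuffers:
    """One bounded buffer per log level, so chatty levels cannot evict quiet ones."""

    def __init__(self, levels, capacity=100_000):
        self._buffers = {level: deque(maxlen=capacity) for level in levels}

    def add(self, level, message):
        # When the deque is full, appending silently drops the oldest entry.
        self._buffers[level].append(message)

    def latest(self, level, count):
        """Return up to `count` most recent messages, oldest first."""
        if count <= 0:
            return []
        return list(self._buffers[level])[-count:]


if __name__ == "__main__":
    buffers = LevelRingBuffers(["Info", "Debug"], capacity=3)
    buffers.add("Info", "grid started")
    for i in range(5):
        buffers.add("Debug", f"step {i}")
    print(buffers.latest("Info", 10))
    print(buffers.latest("Debug", 10))
```

Keeping the buffers separate is the design choice that matters: a flood of Debug messages wraps its own buffer without shortening the Info history.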

Current solution: Queues, WCF service, and Page Blobs

Two things caused us to outgrow our pure WCF solution: the scale of our grid (up to several thousand VMs) and a limitation of Windows Azure that you can’t have private endpoints shared across services.  The former meant that the sheer volume of log messages coming in per second overwhelmed our diagnostics sink.  The latter mattered because we needed to run our system across multiple Windows Azure services; to work around it, we would have needed to host our logging endpoint on a public IP and deal with security.

Page blobs provided the answer.  Page blobs are a Windows Azure Storage mechanism that can be appended to in 512-byte pages.  What we did was have each process (role instance) in our system write to its own page blob.  The blobs are rotated to a new name every day, and whenever one reaches 50MB, to keep them manageable.  This provides us the low level logging we require.  When fully utilized, we can log a terabyte of information a day on a large grid.  Due to the nature of this type of logging, we only use these logs for after-the-fact issue investigation.
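Two small pieces of bookkeeping follow from this: each write has to be padded to a whole number of 512-byte pages, and the writer has to decide when to move on to a new blob. The Python sketch below illustrates both. The zero-byte padding, the naming scheme in `blob_name` and the function names are assumptions for illustration only; the 512-byte page size, the daily rotation and the 50MB limit come from the text.

```python
from datetime import date

PAGE_SIZE = 512                     # page blobs are written in 512-byte pages
MAX_BLOB_BYTES = 50 * 1024 * 1024   # rotate after 50MB


def pad_to_page(payload: bytes) -> bytes:
    """Pad a buffer with zero bytes up to the next page boundary (assumed padding)."""
    remainder = len(payload) % PAGE_SIZE
    if remainder == 0:
        return payload
    return payload + b"\x00" * (PAGE_SIZE - remainder)


def blob_name(instance_id: str, day: date, sequence: int) -> str:
    """Hypothetical naming scheme: one folder per instance and day."""
    return f"{instance_id}/{day:%Y-%m-%d}/{sequence:04d}.log"


def needs_rotation(current_day: date, today: date,
                   current_size: int, write_size: int) -> bool:
    """Rotate when the day changes or the next write would exceed the size limit."""
    return today != current_day or current_size + write_size > MAX_BLOB_BYTES


if __name__ == "__main__":
    chunk = pad_to_page(b"2012-06-01 12:00:00 Debug entering Calculate()\n")
    print(len(chunk), blob_name("worker_0", date(2012, 6, 1), 0))
```

A reader of these blobs would have to skip the padding bytes when parsing the log back out.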

For real time logging, we use a combination of queues and our WCF service.  We use queues to transmit Info, Warning, and Error messages from each of our processes across the Windows Azure service boundary, to a diagnostics process, that also hosts the WCF service described earlier.  This allows us to get near real-time diagnostics messages (Info and up) aggregated and able to be served to a UI.  An overview diagram of this solution is provided.

Azure Diagnostics Diagram

I will try and pull some of this code together into a GitHub project if anyone is interested.  Cheers.

Technology breaks: it’s how you react to it

After years developing and operating mission- and life-critical software systems, I held a deep belief that failures in software systems, processes, and the people who manage them were unacceptable. I believed that any defect in these aspects of software delivery could be engineered away to the point of being a negligible risk. The best people, good practices, rigorous TDD, a high level of automation, chaos testing, and manual QA will ensure software simply will not break… right?

There are some fields of software engineering in which the answer to this has to be yes: missile guidance systems, x-ray machines, space shuttle navigation systems, etc.

For most enterprise systems, however, this level of quality is prohibitively costly: few customers want to pay for it, and it would slow development down to the point of losing the required business agility.

What we must do is understand that technology breaks. We make mistakes. Bugs will be written, and we cannot cover all possible scenarios with the above-mentioned strategies. Up-front software quality matters enormously. But what also matters is how we handle the situation when it breaks.

Do we actively monitor our software to find issues before our customers do?

Do we have a well-defined and practiced incident management process, so when incidents occur we are able to react to them in an organized, professional, and calm manner?

Do we transparently keep our customers informed during critical incidents?

Do we communicate the risks of software systems to our customers and work with them to understand that software sometimes fails?

Do we measure the severity, frequency, and duration of critical incidents?

Do we learn from our incidents? Do we take root cause analysis seriously?

Given limited resources, we can only take software quality so far for most business cases. What we can do is compensate by easing the pain of critical incidents and having a good process in place to assure our customers of our expertise, learn from our mistakes, and not allow ourselves to be drowned by the fire drill.

C# power set algorithm implementation

I’ve been working on understanding algorithms for constructing a power set. The power set of a set is the set of all of its subsets. We are using these to construct terms of a polynomial.

A precise definition: The power set P(S) of set S is the set of all subsets of S, including S itself and the empty set.

Example: The set S with elements {x,y,z} would have power set P(S) = {{x,y,z},{x,y},{x,z},{y,z},{x},{y},{z},{}}.

The following is a power set builder algorithm in C#:

using System.Collections.Generic;

public class PowerSetBuilder<T>
{
    // A list keeps the subsets in the order they were generated.
    private List<ISet<T>> _powerSet;

    // Returns every non-empty subset of startingSet.
    public IEnumerable<ISet<T>> GetPowerSet(IEnumerable<T> startingSet)
    {
        _powerSet = new List<ISet<T>>();
        // Copy into a HashSet so duplicate inputs are only processed once.
        _BuildPowerSet(new HashSet<T>(startingSet));
        return _powerSet;
    }

    private void _BuildPowerSet(IEnumerable<T> workingSet)
    {
        foreach (var element in workingSet)
        {
            var workingElement = new[] { element };
            _AddNewCombinations(workingElement);
            _AddWorkingElement(workingElement);
        }
    }

    // Extends every subset found so far with the new element.
    private void _AddNewCombinations(T[] workingElement)
    {
        // Snapshot with ToArray() because _powerSet is modified inside the loop.
        foreach (var currentPowerSetElement in _powerSet.ToArray())
        {
            var newPowerSetElement = new HashSet<T>(currentPowerSetElement);
            newPowerSetElement.UnionWith(workingElement);
            _powerSet.Add(newPowerSetElement);
        }
    }

    // Adds the new element as a single-element subset.
    private void _AddWorkingElement(T[] workingElement)
    {
        _powerSet.Add(new HashSet<T>(workingElement));
    }
}

A quick test:

[Test]
public void TestPowerSet()
{
	var set = new HashSet<string>{"x", "y", "z"};
	var powerSet = new PowerSetBuilder<string>().GetPowerSet(set);
	foreach (var subset in powerSet)
		Console.WriteLine(subset.Aggregate((x1, x2) => x1 + x2));
}

This should output the seven non-empty subsets (the builder never adds the empty set):

x
xy
y
xz
xyz
yz
z

This algorithm is not novel. I stole the basics from Wikipedia: http://en.wikipedia.org/wiki/Power_set.

Check out http://rosettacode.org/wiki/Power_Set for a good number of other solutions.
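For comparison, the same incremental idea fits in a few lines of Python. This is a sketch rather than a port of the C# class: the function name is my own, and unlike the builder above it starts from the empty set, so the result matches the formal definition of P(S). Each new element doubles the number of subsets, because every subset found so far is kept once without the element and once with it.

```python
def power_set(items):
    """Return all subsets of `items`, including the empty set, as frozensets."""
    subsets = [frozenset()]
    # dict.fromkeys removes duplicate inputs while keeping their order.
    for item in dict.fromkeys(items):
        # The right-hand side is built from the current list before it is extended.
        subsets += [subset | {item} for subset in subsets]
    return subsets


if __name__ == "__main__":
    for subset in power_set(["x", "y", "z"]):
        print("".join(sorted(subset)) or "{}")
```

A set of n elements therefore yields 2^n subsets, which is why a degree limit or similar pruning matters once the starting set grows.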

New open source project: Azure Storage using WebAPI

I’ve been working on a project to use the Microsoft WebAPI to consume the Windows Azure Storage Services REST API. Currently only queues are supported and there’s no retry for 409s or network errors.


http://github.com/awithy/AzureStorage

The plan is to develop this into a production quality client implementation. Updates to follow…

Azure Create Deployment (with start) creates, but fails to start

For many Azure applications, failures at deployment time can be dealt with manually (in fact, I would bet most Azure deployments are manual!). When Azure deployments are automated, and especially when deployment operations are an important runtime aspect of your system, it is important to have a robust client implementation of the Management API.

To that end, a quick note about the CreateDeployment with the StartDeployment option enabled:

We have seen this operation, although infrequently, create the deployment but fail to start it. To compensate, you must either fix it manually (sometimes not an option) or make your system able to tell the created and started deployment states apart. This defeats the whole purpose of CreateDeployment with start, so you might as well do Create and Start as separate operations. That way you can handle each failure case independently.
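The control flow for the two-step approach might look like the sketch below. It is written in Python against a purely hypothetical `client` object: `create_deployment`, `start_deployment` and `is_running` are placeholder names, not calls from the Management API or any SDK. The point is only that each step gets its own failure handling and its own result.

```python
def create_then_start(client, name, start_attempts=3):
    """Return 'started', 'create_failed' or 'start_failed'."""
    try:
        client.create_deployment(name)       # hypothetical: create without starting
    except Exception:
        return "create_failed"               # nothing was created; retry from scratch

    for _ in range(start_attempts):
        try:
            client.start_deployment(name)    # hypothetical: start the existing deployment
        except Exception:
            continue                         # transient failure, try the start again
        if client.is_running(name):          # hypothetical status check
            return "started"

    # The deployment exists but is not running; the caller can alert or delete it.
    return "start_failed"
```

Because a failed start is reported separately from a failed create, the caller never has to guess which state the deployment was left in.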

MSDN: Create Deployment

Remove msshrtmi.dll as a dependency in your Azure project

Msshrtmi.dll is a nasty little assembly that ships as part of the Azure SDK and is required as a dependency of ServiceRuntime to determine if you are running in or outside of Azure.

The problem lies in the fact that the assembly is architecture-specific rather than pure managed code, and ships as different DLLs for x86 and x64. This causes all kinds of havoc as you need to build your projects with different target architectures depending upon context. As an example, IISExpress only runs in 32-bit mode.

Note: some of this pain is self-inflicted, as I absolutely refuse to install things into the GAC of my build server. That is a topic for a different post.

In order to remove this dependency, you need to find a different way of determining if you’re on Azure. We found the simplest way was to add an environment variable that was set as an Azure start-up task. We then look for this variable when constructing our RoleEnvironment adapter.

Add a batch file such as this to your project and set it to copy always.

SetEnvVar.cmd

setx INAZURE True /M

Add the following element to the top of each WorkerRole element in your csdef files.

<Startup>
  <Task commandLine="SetEnvVar.cmd" 
    executionContext="elevated" 
    taskType="simple" />
</Startup>
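Reading the variable back is then a simple lookup. The sketch below shows the check in Python for brevity; in our case the equivalent test lives in the C# RoleEnvironment adapter. The function name and the optional `environ` parameter are assumptions added for illustration and testing; the variable name `INAZURE` and its value `True` come from the batch file above.

```python
import os


def running_in_azure(environ=None):
    """True when the start-up task has set INAZURE to 'True'."""
    environ = os.environ if environ is None else environ
    # Compare case-insensitively so 'True', 'true' and 'TRUE' all count.
    return environ.get("INAZURE", "").strip().lower() == "true"


if __name__ == "__main__":
    print("In Azure" if running_in_azure() else "Running locally")
```

On a developer machine the variable is simply absent, so the check falls back to local behaviour without touching the ServiceRuntime assemblies.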

Update: Check out http://github.com/awithy/AzureCmdLetDeployment for a working example project.
