Microsoft’s Azure platform provides excellent data storage facilities in the form of the Windows Azure Storage service, with Table, Blob and Queue stores, and SQL Azure, which is a near-complete SQL Server-as-a-service offering. But one thing it doesn’t provide is a “document database”, in the NoSQL sense of the term.

I saw Captain Codeman’s excellent post on running MongoDB on Azure with CloudDrive, and wondered if Ayende’s new RavenDB database could be run in a similar fashion; specifically, on a Worker role providing persistence for a Web role.

The short answer was, with the current build, no. RavenDB uses the .NET HttpListener class internally, and apparently that class will not work on worker roles, which are restricted to listening on TCP only.

I’m not one to give up that easily, though. I’d already downloaded the source for Raven so I could step-debug through a few other blockers (to do with config, mainly), and I decided to take a look at the HTTP stack. Turns out Ayende, ever the software craftsman, has abstracted his HTTP classes and provided interfaces for them. I forked the project and hacked together implementations of those interfaces built on the TcpListener, with my own HttpContext, HttpRequest and HttpResponse types. It’s still not perfect, but I have an instance running at http://ravenworker.cloudapp.net:8080.
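
As a rough illustration of what serving HTTP over a raw TcpListener involves, here is a minimal sketch. It is not the code from the fork: the names TinyTcpHttp, ParsedRequest, ParseRequestLine and ServeOnce are hypothetical. It handles a single connection, ignores request bodies and just echoes the method and path back as plain text.

[sourcecode language="csharp"]
using System;
using System.IO;
using System.Net;
using System.Net.Sockets;
using System.Text;

// Hypothetical types for illustration only; not part of RavenDB.
public sealed class ParsedRequest
{
    public string Method;
    public string Path;
    public string Version;
}

public static class TinyTcpHttp
{
    // Parses a request line such as "GET /docs/1 HTTP/1.1".
    public static ParsedRequest ParseRequestLine(string line)
    {
        var parts = (line ?? "").Split(' ');
        if (parts.Length != 3)
            throw new FormatException("Malformed request line: " + line);
        return new ParsedRequest { Method = parts[0], Path = parts[1], Version = parts[2] };
    }

    // Accepts one connection, reads the request head and writes a plain-text reply.
    public static void ServeOnce(TcpListener listener)
    {
        using (var client = listener.AcceptTcpClient())
        using (var stream = client.GetStream())
        {
            var reader = new StreamReader(stream, Encoding.ASCII);
            var request = ParseRequestLine(reader.ReadLine());

            // Skip the headers: the request head ends at the first empty line.
            string header;
            while (!string.IsNullOrEmpty(header = reader.ReadLine())) { }

            var body = Encoding.UTF8.GetBytes(request.Method + " " + request.Path);
            var head = Encoding.ASCII.GetBytes(
                "HTTP/1.1 200 OK\r\n" +
                "Content-Type: text/plain; charset=utf-8\r\n" +
                "Content-Length: " + body.Length + "\r\n" +
                "Connection: close\r\n\r\n");
            stream.Write(head, 0, head.Length);
            stream.Write(body, 0, body.Length);
        }
    }
}
[/sourcecode]

To try it locally, create a TcpListener on IPAddress.Loopback with a free port, call Start(), and call TinyTcpHttp.ServeOnce(listener) in a loop. A real implementation would also need request bodies, keep-alive, chunked encoding and error responses, which is where most of the work lies.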

My fork of RavenDB is at http://github.com/markrendle/RavenDB – the Samples solution contains an Azure Cloud Service and a Worker Role project. My additions to the core code are mainly within the Server.Abstractions namespace if you want to poke around.

HOWTO

My technique for getting Raven running on a worker role differs from the methods commonly used for third-party software. Generally these rely on running the application as a separate process and getting it to listen on a specified TCP port. With Raven, this is unnecessary, since it consists of .NET assemblies which can be referenced directly from the worker project, so that’s how I did it.

The role uses an Azure CloudDrive for Raven’s data files. A CloudDrive is a VHD disk image that is held in Blob storage, and can be mounted as a drive within Azure instances.

Mounting a CloudDrive requires some fairly straightforward, boilerplate code:
[sourcecode language="csharp"]
private void MountCloudDrive()
{
    // Local disk space reserved for the drive's read cache.
    var localCache = RoleEnvironment.GetLocalResource("RavenCache");

    CloudDrive.InitializeCache(localCache.RootPath.TrimEnd('\\'), localCache.MaximumSizeInMegabytes);

    var ravenDataStorageAccount =
        CloudStorageAccount.Parse(RoleEnvironment.GetConfigurationSettingValue("StorageAccount"));
    var blobClient = ravenDataStorageAccount.CreateCloudBlobClient();
    var ravenDrives = blobClient.GetContainerReference("ravendrives");
    ravenDrives.CreateIfNotExist();
    var vhdUrl = ravenDrives.GetPageBlobReference("RavenData.vhd").Uri.ToString();

    _ravenDataDrive = ravenDataStorageAccount.CreateCloudDrive(vhdUrl);

    try
    {
        _ravenDataDrive.Create(localCache.MaximumSizeInMegabytes);
    }
    catch (CloudDriveException)
    {
        // This exception is thrown if the drive exists already, which is fine.
    }

    _ravenDrivePath = _ravenDataDrive.Mount(localCache.MaximumSizeInMegabytes, DriveMountOptions.Force);
}
[/sourcecode]

(This code has been trimmed for size; the actual code involves more exception handling and logging.)

Once the drive is mounted, we can start the server:

[sourcecode language="csharp"]
private void StartTheServer()
{
    var ravenConfiguration = new RavenConfiguration
    {
        AnonymousUserAccessMode = AnonymousUserAccessMode.All,
        Port = _endPoint.Port,
        ListenerProtocol = ListenerProtocol.Tcp,
        DataDirectory = _ravenDrivePath
    };

    _documentDatabase = new DocumentDatabase(ravenConfiguration);
    _documentDatabase.SpinBackgroundWorkers();

    _ravenHttpServer = new HttpServer(ravenConfiguration, _documentDatabase);
    _ravenHttpServer.Start();
}
[/sourcecode]
Again, trimmed for size.

A few points on the configuration properties:

  • Port is obtained from the Azure endpoint, which specifies the internal port that the server should listen on, rather than the external port that is visible to clients;
  • ListenerProtocol is a new enum I added, which tells the server whether to use the Http or Tcp stack;
  • AnonymousUserAccessMode is set to All. My intended use for this project only exposes the server internally, to other Azure roles, so I have not yet implemented authentication in the TCP-based HTTP classes;
  • DataDirectory is set to the path returned by the CloudDrive Mount operation.

I have to sign a contribution agreement, and do some more extensive testing, but I hope that Ayende is going to pull my TCP changes into the RavenDB trunk so that this deployment model is supported by the official releases. I’ll keep you posted.