Thursday, February 27, 2014

Sending binary data along with a REST API request

The problem I would like to discuss is an API call where you need to send binary data (for example, multiple images) and some metadata together. There are various ways you can approach this, and I will describe them briefly. Then I will go into more detail on multipart/form-data requests and how they can help you with this task.

Approach 1 – Send metadata and files in separate requests

The steps could be this:

  1. Send metadata to server
  2. Server stores the metadata and generates a unique URL to which files should be uploaded, then sends that URL in the response.
  3. Client posts files to the specified URL

This may be a good approach in a scenario where you don’t need to receive the files right away together with the metadata. It enables the client to upload some initial files and later add some more, for example when creating a new photo album (the metadata) and then adding photos to it.
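For illustration, the two-step flow on the client could look something like this with HttpClient (just a sketch: the routes, the Location-header convention and the JSON payload are made up for the example, not part of any concrete API):

using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Threading.Tasks;

class TwoStepUploadSketch
{
    static async Task UploadAsync(byte[] imageBytes)
    {
        using (var client = new HttpClient())
        {
            // step 1: send the metadata only
            var metadataJson = "{\"name\":\"Holiday album\"}";
            var metadataResponse = await client.PostAsync(
                "http://example.com/api/albums",
                new StringContent(metadataJson, Encoding.UTF8, "application/json"));

            // step 2: the server replies with the URL to upload files to,
            // here assumed to be returned in the Location header
            var uploadUrl = metadataResponse.Headers.Location;

            // step 3: post the binary content to that URL (can be repeated later for more files)
            var imageContent = new ByteArrayContent(imageBytes);
            imageContent.Headers.ContentType = new MediaTypeHeaderValue("image/jpeg");
            await client.PostAsync(uploadUrl, imageContent);
        }
    }
}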

Approach 2 – Send metadata and files together in one request

There are some cases, however, when the metadata and files form one entity and neither makes sense on its own. In this scenario you want to act on the whole client request right away. Let’s say you are creating a service that combines some images into one, calculates some statistics, etc. If you used approach 1 here, you would need to manage state between client requests, which can be complex and may hurt your scalability.

Fortunately you can send all data in one request if this is what makes the most sense. Here are a few ways to do this:

JSON / XML

Probably the easiest and most compatible way to send the data is to serialize all of it to JSON or XML, binary data included, which means roughly +33% in message size due to Base64 encoding. This may be just fine in some cases.
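To see where the overhead comes from: Json.NET, which the default WebAPI JSON formatter uses, serializes a byte[] property as a Base64 string automatically, so sending binary data this way requires no extra code (a tiny sketch):

using System;
using Newtonsoft.Json;

class Attachment
{
    public string Name { get; set; }
    public byte[] Data { get; set; }
}

class Base64Sketch
{
    static void Main()
    {
        var attachment = new Attachment { Name = "pixel", Data = new byte[] { 1, 2, 3 } };

        // prints {"Name":"pixel","Data":"AQID"} - the byte array became a Base64 string
        Console.WriteLine(JsonConvert.SerializeObject(attachment));
    }
}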

BSON

However if the extra message size is not something you can put up with, then you can use a form of binary serialization. With ASP.NET WebAPI 2.1 you’ll get BsonMediaTypeFormatter out of the box, which will help you use BSON serialization (see http://bsonspec.org/). On the client side, you can use JSON.NET to do the serialization of the request message.
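A rough sketch of both ends could look like this (assuming the WebAPI 2.1 and Json.NET packages are referenced; the route and request object are placeholders):

using System.IO;
using System.Net.Http;
using System.Net.Http.Formatting;
using System.Net.Http.Headers;
using System.Web.Http;
using Newtonsoft.Json;
using Newtonsoft.Json.Bson;

public static class BsonSketch
{
    // server side: register the built-in BSON formatter next to the default ones
    public static void Register(HttpConfiguration config)
    {
        config.Formatters.Add(new BsonMediaTypeFormatter());
    }

    // client side: serialize the request with Json.NET's BSON writer
    // and send it with Content-Type: application/bson
    public static HttpResponseMessage Send(object request)
    {
        var stream = new MemoryStream();
        using (var writer = new BsonWriter(stream))
        {
            new JsonSerializer().Serialize(writer, request);
        }

        var content = new ByteArrayContent(stream.ToArray());
        content.Headers.ContentType = new MediaTypeHeaderValue("application/bson");

        return new HttpClient()
            .PostAsync("http://example.com/api/send", content)
            .Result;
    }
}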

With some effort you should also be able to use Google’s protobuf protocol, which may be more efficient according to this blog post.

multipart/form-data

In some cases, maybe for compatibility reasons, you won’t be able to use binary serialization like BSON or protobuf. Even then you can still avoid sending binary data as a Base64-encoded string: you can use a multipart/form-data request, effectively simulating an HTML form with file uploads. This is a bit more complex than the previous approaches, so I would like to go into more detail.

But before that, I should mention that this approach is not semantically correct. By using the Content-Type multipart/form-data you state that what you send is actually a form, but it is not. It is rather some custom data format, and for that the most appropriate content type seems to be multipart/mixed (see the RFC). The HttpClient library for .NET won’t have any problems handling different subtypes of the multipart/* MIME type, but the same may not be true for other platforms. I have seen some libraries (for Python, for example) which had the multipart/form-data content type hardcoded. So in this case you have two options: either write your own client library or give up on being semantically correct and go with multipart/form-data. I can live with the latter.

So how to put some metadata together with multiple files into one request? Look at this example:

POST http://127.0.0.1:53908/api/send HTTP/1.1
Content-Type: multipart/form-data; boundary="01ead4a5-7a67-4703-ad02-589886e00923"
Host: 127.0.0.1:53908
Content-Length: 707419

--01ead4a5-7a67-4703-ad02-589886e00923
Content-Type: application/json; charset=utf-8
Content-Disposition: form-data; name=imageset

{"name":"Model"}
--01ead4a5-7a67-4703-ad02-589886e00923
Content-Type: image/jpeg
Content-Disposition: form-data; name=image0; filename=Small-Talk-image.jpg

...image content...
--01ead4a5-7a67-4703-ad02-589886e00923
Content-Type: image/jpeg
Content-Disposition: form-data; name=image1; filename=url.jpg

...image content...
--01ead4a5-7a67-4703-ad02-589886e00923--

Going from the top, we have a part with name = imageset. This part contains the metadata: it has a JSON content type and carries a serialized JSON object. Then there are two parts containing image data, named image0 and image1. Each of those additionally specifies the image filename (in the Content-Disposition header) and the image type (in the Content-Type header).

The server, after receiving such a request, can distinguish metadata from image data by looking at the part names (the names are part of the API contract the client needs to follow). Then it can put the request together and execute a controller action, passing in the received data.

So how to actually implement this using ASP.NET WebAPI? (You can look at the code on GitHub.)

Let’s start quickly with the definition of the data passed around:

    public class ImageSet
    {
        public string Name { get; set; }

        public List<Image> Images { get; set; }
    }

    public class Image
    {
        public string FileName { get; set; }

        public string MimeType { get; set; }

        public byte[] ImageData { get; set; }
    }

Now, when implementing a WebAPI controller, we would like to receive an instance of ImageSet as a parameter of the action.

[Route("api/send")]
public string UploadImageSet(ImageSet model)
{
    var sb = new StringBuilder();
            
    sb.AppendFormat("Received image set {0}: ", model.Name);
    model.Images.ForEach(i =>
        sb.AppendFormat("Got image {0} of type {1} and size {2} bytes,", 
            i.FileName, 
            i.MimeType,
            i.ImageData.Length)
        );

    var result = sb.ToString();
    Trace.Write(result);

    return result;
}

Fortunately, WebAPI has a notion of MediaTypeFormatter, which basically lets you define logic for translating a certain type of request to a certain .NET type (and back). Here’s how to implement one for the ImageSet:

public class ImageSetMediaTypeFormatter : MediaTypeFormatter
{
    public ImageSetMediaTypeFormatter()
    {
        SupportedMediaTypes.Add(new MediaTypeHeaderValue("multipart/form-data"));
    }
        
    public override bool CanReadType(Type type)
    {
        return type == typeof (ImageSet);
    }

    public override bool CanWriteType(Type type)
    {
        return false;
    }

    public async override Task<object> ReadFromStreamAsync(
        Type type,
        Stream readStream, 
        HttpContent content, 
        IFormatterLogger formatterLogger)
    {
        var provider = await content.ReadAsMultipartAsync();

        var modelContent = provider.Contents
            .FirstOrDefault(c => c.Headers.ContentDisposition.Name.NormalizeName() == "imageset");
            
        var imageSet = await modelContent.ReadAsAsync<ImageSet>();

        var fileContents = provider.Contents
            .Where(c => c.Headers.ContentDisposition.Name.NormalizeName().Matches(@"image\d+"))
            .ToList();

        imageSet.Images = new List<Image>();
        foreach (var fileContent in fileContents)
        {
            imageSet.Images.Add(new Image
            {
                ImageData = await fileContent.ReadAsByteArrayAsync(),
                MimeType = fileContent.Headers.ContentType.MediaType,
                FileName = fileContent.Headers.ContentDisposition.FileName.NormalizeName()
            });
        }

        return imageSet;

    }
}

public static class StringExtensions
{
    public static string NormalizeName(this string text)
    {
        return text.Replace("\"", "");
    }

    public static bool Matches(this string text, string pattern)
    {
        return Regex.IsMatch(text, pattern);
    }
}

By adding an entry to SupportedMediaTypes, you specify which content types this MediaTypeFormatter is able to handle. Then in CanReadType and CanWriteType you specify the .NET types to which (or from which) a request can be translated. ReadFromStreamAsync does the actual work.
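One detail the snippets above don’t show is the registration of the formatter; assuming the standard WebApiConfig setup, it could be as simple as:

using System.Web.Http;

public static class WebApiConfig
{
    public static void Register(HttpConfiguration config)
    {
        config.MapHttpAttributeRoutes();

        // make WebAPI consider our formatter for incoming multipart/form-data requests
        config.Formatters.Add(new ImageSetMediaTypeFormatter());
    }
}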

Decoupling controller logic from request parsing logic gives you the possibility to handle multiple request formats, and lets your clients choose the format they are most comfortable with by specifying the appropriate Content-Type header.

Note: when you call content.ReadAsMultipartAsync(), you are using the MultipartMemoryStreamProvider, which handles all processing in memory. Depending on your scenario, you may take a different approach, for example MultipartFormDataStreamProvider, for which you’ll find a nice sample here.

For clarity, the above code doesn’t do any request format validation. In production you’ll get all kinds of malformed requests from 3rd-party clients, so you need to handle those situations.
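As a starting point, a few defensive guards could be added at the top of ReadFromStreamAsync, for example (a sketch only, not exhaustive validation; throwing HttpResponseException is one option for rejecting the request):

// possible guards at the beginning of ReadFromStreamAsync
if (content == null || !content.IsMimeMultipartContent())
{
    throw new HttpResponseException(HttpStatusCode.UnsupportedMediaType);
}

var provider = await content.ReadAsMultipartAsync();

var modelContent = provider.Contents
    .FirstOrDefault(c => c.Headers.ContentDisposition != null
                         && c.Headers.ContentDisposition.Name.NormalizeName() == "imageset");
if (modelContent == null)
{
    // the metadata part is missing - reject the request
    throw new HttpResponseException(HttpStatusCode.BadRequest);
}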

How about some client code? First of all, on the client side we’ll make a small change in the ImageSet class:

    public class ImageSet
    {
        public string Name { get; set; }

        [JsonIgnore]
        public List<Image> Images { get; set; }
    }

We want the JSON serialization to ignore the Images collection, since the images will be put into separate request parts.

This is how you could prepare a request to be sent:

static void Main(string[] args)
{
    var imageSet = new ImageSet()
    {
        Name = "Model",
        Images = Directory
            .EnumerateFiles("../../../../../SampleImages")
            .Where(file => new[] {".jpg", ".png"}.Contains(Path.GetExtension(file)))
            .Select(file => new Image
            {
                FileName = Path.GetFileName(file),
                MimeType = MimeMapping.GetMimeMapping(file),
                ImageData = File.ReadAllBytes(file)
            })
            .ToList()
    };

    SendImageSet(imageSet);
}

And here’s how to send it using HttpClient and Json.NET:

private static void SendImageSet(ImageSet imageSet)
{
    var multipartContent = new MultipartFormDataContent();

    var imageSetJson = JsonConvert.SerializeObject(imageSet, 
        new JsonSerializerSettings
        {
            ContractResolver = new CamelCasePropertyNamesContractResolver()
        });

    multipartContent.Add(
        new StringContent(imageSetJson, Encoding.UTF8, "application/json"), 
        "imageset"
        );

    int counter = 0;
    foreach (var image in imageSet.Images)
    {
        var imageContent = new ByteArrayContent(image.ImageData);
        imageContent.Headers.ContentType = new MediaTypeHeaderValue(image.MimeType);
        multipartContent.Add(imageContent, "image" + counter++, image.FileName);
    }

    var response = new HttpClient()
        .PostAsync("http://localhost:53908/api/send", multipartContent)
        .Result;

    var responseContent = response.Content.ReadAsStringAsync().Result;
    Trace.Write(responseContent);
}

Summary

There are multiple ways to send mixed plain text / binary data to a REST API endpoint. The best thing you can do when implementing a public-facing API is to let your clients choose the format that is convenient for them. ASP.NET WebAPI has ways to facilitate that. multipart/form-data requests are a bit more complicated than whole-request binary serialization (BSON or protobuf), but may be more compatible with some platforms.

The code presented in this post can be found on GitHub. There you’ll also find client code samples for the multipart/form-data scenario in Java, Python and node.js.

Saturday, August 3, 2013

Experiments with System.AddIn

Introduction

The System.AddIn namespace (sometimes called MAF – Managed Add-in Framework) has been present in .NET since version 3.5. It is quite old and nothing significant has changed in it since the 2008 release. It was not widely adopted, probably due to its steep learning curve and high degree of complexity. However, it is still a great tool for building applications in a host – addin model, especially when you need a certain degree of separation and sandboxing between the two. This is my goal: I’m about to build a module that hosts addins. Those addins may be built and maintained separately. They may use 3rd party libraries in different versions. They may crash, but the host process should handle this gracefully and restart them. This post is to share some observations and solutions from my research.

You can browse the code on GitHub.

Required components

System.AddIn enforces some strict rules about components present in the solution and their location on disk. You need to have at least projects for:

  • the addin
  • the host
  • the contract between the addin and the host
  • the addin view – which is a “host from the addin’s point of view”, used by the addin to communicate with the host
  • the addin adapter – which can be used to translate between different versions of addin/host
  • the host view – which is an “addin from the host’s point of view”, used by the host to communicate with addins
  • the host adapter – which can be used to translate between different versions of addin/host

That’s quite a lot. The structure is complicated because it provides the means to achieve backward and forward compatibility between the host and addins. You can find a great Architecture Journal article explaining the structure here. There is also a CodeProject article I used to get started.

Scenario

We’ll be building a task scheduler (host) and some sample tasks (addins) that will run on a specified schedule. Those addins will show some challenges you can face when building an extensible system, and how to deal with them using System.AddIn. The contract between the host and an addin is really simple:

    [AddInContract]
    public interface IScheduledTask : IContract
    {
        ScheduleOptions GetScheduleOptions();

        TaskResult Run(RunOptions options);
    }

    [Serializable]
    public class RunOptions
    {
        public DateTime PointInTime { get; set; }
    }

    [Serializable]
    public class ScheduleOptions
    {
        public string CronExpression { get; set; }
    }

    [Serializable]
    public class TaskResult
    {
        public bool Successful { get; set; }
    }

I’ll try not to paste all of the source code here, but rather pinpoint the most interesting parts.
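For orientation, the host side of discovering and activating these addins follows the usual System.AddIn pattern; a minimal sketch (assuming the standard pipeline directory layout, and not the repository’s exact ActivationHelper) could look like this:

using System;
using System.AddIn.Hosting;

class HostSketch
{
    static void Main()
    {
        // the root folder containing the pipeline directories (AddIns, AddInViews, Contracts, ...)
        var pipelineRoot = AppDomain.CurrentDomain.BaseDirectory;

        // rebuild the pipeline cache and find addins exposing our host view
        AddInStore.Update(pipelineRoot);
        var tokens = AddInStore.FindAddIns(typeof(ScheduledTaskHostView), pipelineRoot);

        foreach (AddInToken token in tokens)
        {
            Console.WriteLine("Found addin: " + token.Name);

            // activate the addin in its own application domain
            var addIn = token.Activate<ScheduledTaskHostView>(AddInSecurityLevel.FullTrust);

            // the host would now read addIn.GetScheduleOptions() and schedule Run calls
        }
    }
}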

REALLY ensuring backward / forward compatibility

Adapters are the components that, properly implemented, can translate the contract between host and addin from one version to another. However, they are no good if you have a hard dependency on the contract library in your host or addin. As you can see above, there are some types like RunOptions passed around the application. In order not to have hard dependencies, I decided to introduce copies of those types in both the host view and the addin view. The adapters are responsible for mapping the data from one type to another. This ensures you can change the contract and adjust the adapters to transform data between different versions of the contract.

Here’s the host adapter as an example:

    [HostAdapter]
    public class ScheduledTaskHostAdapter : ScheduledTaskHostView
    {
        private readonly IScheduledTask _scheduledTask;
        private readonly ContractHandle _contractHandle;

        public ScheduledTaskHostAdapter(IScheduledTask scheduledTask)
        {
            _scheduledTask = scheduledTask;
            _contractHandle = new ContractHandle(scheduledTask);
        }


        public override HostView.ScheduleOptions GetScheduleOptions()
        {
            var options = _scheduledTask.GetScheduleOptions();
            return ReflectionCopier.Copy<HostView.ScheduleOptions>(options);
        }

        public override HostView.TaskResult Run(HostView.RunOptions options)
        {
            var contractOptions = ReflectionCopier.Copy<AddInContracts.RunOptions>(options);
            var result = _scheduledTask.Run(contractOptions);
            return ReflectionCopier.Copy<HostView.TaskResult>(result);
        }
    }

This has performance implications, as the mapping is performed twice (once in each adapter). The overhead of the remoting communication used by System.AddIn also needs to be taken into account. In my case this is not a big deal; you may want to find another solution if performance is critical for your case.
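The ReflectionCopier used above isn’t listed in the post; conceptually it’s just a property-by-property copy between structurally identical types, something along these lines (a naive sketch that ignores nested objects and collections):

using System;

public static class ReflectionCopier
{
    // copies public properties with matching names from source onto a new TTarget
    public static TTarget Copy<TTarget>(object source) where TTarget : new()
    {
        if (source == null)
        {
            return default(TTarget);
        }

        var target = new TTarget();
        foreach (var targetProperty in typeof(TTarget).GetProperties())
        {
            var sourceProperty = source.GetType().GetProperty(targetProperty.Name);
            if (sourceProperty != null && targetProperty.CanWrite)
            {
                targetProperty.SetValue(target, sourceProperty.GetValue(source, null), null);
            }
        }

        return target;
    }
}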

Different versions of libraries loaded by addins

To test how loading of different assembly versions behaves, I used both signed and unsigned assemblies.

Host and addins use different versions of the NLog library, which has a signed assembly (Host, RogueTask, SickTask - 2.0.1.2, SayByeTask - 1.0.0.505, SayHelloTask - 2.0.0.2000). Tasks use 3 different versions of the unsigned MultiversionLib (included in source code). Each version of this lib produces different results, which is visible in the application console output.

Here’s how this looks in the loaded modules view:

[image: loaded modules view]

This is from the version of the host that uses separate application domains to load addins. As you can see, each application component loaded its own copy of NLog. Also, three different versions of MultiversionLib are loaded.

Preventing rogue addin from crashing the host

There are two addins in the application that behave in an unexpected way (as seen by the host).

The SickTask throws an exception when it is invoked:

    [AddIn("Sick Task", Version = "1.0.0.0", Description = "Is sick and throws exception")]
    public class SickTask : ScheduledTaskAddInView
    {
        private static Logger _logger = LogManager.GetCurrentClassLogger();
        private static string[] _thingsToSay =
            {
                ", I feel sick...", 
                ", seriously, I think I'm gonna faint!", 
                ", blaaargh..."
            };

        private NameGenerator _nameGenerator = new NameGenerator();
        private int _state = 0;

        public override ScheduleOptions GetScheduleOptions()
        {
            return new ScheduleOptions { CronExpression = "0/5 * * * * ?" };
        }

        public override TaskResult Run(RunOptions options)
        {
            _logger.Debug("NLog version is " + typeof(Logger).Assembly.GetName().Version);
            
            _logger.Info(_nameGenerator.GetName() + _thingsToSay[_state++ % _thingsToSay.Length]);
            if(_state % _thingsToSay.Length == 0)
                throw new Exception("(falls to the ground)");

            return new TaskResult { Successful = true };
        }
    }

This type of problem is dealt with quite easily. The exception is marshalled between application domains or processes and all you need is a regular catch:

            try
            {
                info.AddIn.Run(new RunOptions { PointInTime = DateTime.Now });
            }
            catch (Exception ex)
            {
                _logger.ErrorException("Running task resulted in exception", ex);
            }

However with RogueTask the situation is more complicated:

        public override TaskResult Run(RunOptions options)
        {
            _logger.Debug("Nothing suspicious going on here...");

            if (_exceptionThrower == null)
            {
                _exceptionThrower = new Thread(() =>
                    {
                        Thread.Sleep(30000);
                        throw new Exception("Nobody expects the Spanish Inquisition!");
                    });
                _exceptionThrower.Start();
            }

            return new TaskResult {Successful = true};
        }

This task throws an exception on a separate thread. As the exception is unhandled, it crashes the whole process.

In order to deal with this problem, addins need to be hosted in separate processes. The demo application contains two basic hosts: AddInHost (uses application domains) and AddInHostExternalProcess (uses processes). It is really easy to spin up a new process with System.AddIn. You just need to create a new instance of the AddInProcess class (or use an existing one) and pass it in during addin activation, as sketched below. This is what you will see in Task Manager:

[image: Task Manager showing the separate addin processes]
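Activation in a separate process differs only in the AddInToken.Activate overload used; reusing the token from the discovery sketch earlier, it could look like this (the demo’s AddInHostExternalProcess presumably does something along these lines):

// spin up (or reuse) an external process and activate the addin inside it
var addInProcess = new AddInProcess();
addInProcess.Start();

var addIn = token.Activate<ScheduledTaskHostView>(addInProcess, AddInSecurityLevel.FullTrust);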

Also, you won’t be able to see log entries from addins in the console, as they are now written in different processes. You can still see debug output from those processes if you attach to them explicitly. When the exception is thrown by the RogueTask, its process crashes and you are no longer able to communicate with this addin. However, other addins will keep working just fine.

But what if you need to restart the RogueTask? It turns out that restarting itself is quite easy – just activate the addin again. Knowing that something is wrong is the hard part. Unfortunately the AddInProcess.ShuttingDown event is not fired when the process crashes; it’s fired only when you request a process shutdown. The approach I took is to catch the RemotingException thrown while communicating with the addin and then reactivate it:

try
{
    info.AddIn.Run(new RunOptions { PointInTime = DateTime.Now });
}
catch (RemotingException ex)
{
    _logger.ErrorException(
        string.Format(
            "Exception occured when communicating with addin {0}, probably process crashed",
            info.Token.AddInFullName), ex);

    _logger.Debug("Attempting to restart addin process");

    info.Process.Shutdown();

    var reactivatedInfo = ActivationHelper.ActivateAddIn(info.Token);
    // store new information in existing info
    info.AddIn = reactivatedInfo.AddIn;
    info.Process = reactivatedInfo.Process;
}
catch (Exception ex)
{
    _logger.ErrorException("Running task resulted in exception", ex);
}

The problem with this approach is that you won’t know that something is wrong with the addin until you try to talk to it. Until then, the addin’s background processing is down. There may be different solutions for this. You may include some kind of Ping method in the addin contract and call it periodically. You may also monitor the state of the processes at the OS level.
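A simple variant of the ping idea that doesn’t require changing the contract is to periodically call an existing cheap operation and treat a RemotingException as a sign that the process is gone (a sketch; AddInInfo stands in for whatever type the host uses to track an activated addin, its process and its token):

// called from a timer in the host; AddInInfo is a hypothetical holder for AddIn, Process and Token
private void CheckAddInHealth(AddInInfo info)
{
    try
    {
        // any cheap call will do - here the existing GetScheduleOptions acts as a ping
        info.AddIn.GetScheduleOptions();
    }
    catch (RemotingException)
    {
        _logger.Debug("Addin process seems to be down, restarting it");

        info.Process.Shutdown();

        var reactivated = ActivationHelper.ActivateAddIn(info.Token);
        info.AddIn = reactivated.AddIn;
        info.Process = reactivated.Process;
    }
}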

Deploying to Windows Azure

I would like to be able to deploy my host to a Windows Azure Cloud Service. The challenge here is to package the addin files during the build process and preserve the directory structure. Right now I will not go into the subject of installing / swapping addins without redeploying the cloud service. The goal is to get all the necessary files into the CSPKG file and allow the solution to run as a Worker Role.

I tried two approaches. The one that failed was to use the Contents element in the service definition. The problem with this approach is this:

SourceDirectory: path: Required. Relative or absolute path of a local directory whose contents will be copied to the Windows Azure virtual machine. Expansion of environment variables in the directory path is supported.

In my experience, resolving a relative path behaved really unpredictably. It started out being resolved relative to the build directory, but a slight change in the path caused it to be resolved relative to the path containing the CSDEF file. In the end, I was not able to point to the directory I expected.

The approach that worked is described in this blog post by Phil Hoff. Basically, you need to override the BeforeAddRoleContent target in the cloud host project file and add AzureRoleContent items. Here’s how this looked for me:

  <Import Project="$(CloudExtensionsDir)Microsoft.WindowsAzure.targets" />
  <Target Name="BeforeAddRoleContent">
    <ItemGroup>
      <AzureRoleContent Include="..\bin">
        <RoleName>AzureWorkerRole</RoleName>
        <Destination>AddInfiles</Destination>
      </AzureRoleContent>
    </ItemGroup>
  </Target>

In this case the addin-related files are deployed to the AddInfiles subfolder, so you need to point the AddInStore.FindAddIns method to it.

One more step is required for the Worker Role to be able to spawn new processes: it has to run elevated:

  <WorkerRole name="AzureWorkerRole" vmsize="Small">
    <Runtime executionContext="elevated" />
    ...
  </WorkerRole>

Summary

System.AddIn is quite powerful, but it puts some constraints on development and brings a high level of complexity. In simple scenarios you should probably forget about it and use a DI container or MEF. In more advanced ones, you should definitely look at System.AddIn – maybe it can save you the time you would spend building your own solution.

Thursday, February 7, 2013

.NET developer’s view on node.js

Recently I had an impulse to play with a technology from outside Microsoft’s world. At the same time I was working on a small ASP.NET WebAPI application and I thought – I wonder how difficult it would be for me (as a .NET developer) to port this application to node.js? Actually it was not that difficult at all. But let me start from the beginning.

Getting an overview

Before I started to build anything in node.js, the only thing I knew about it was that it is that new trendy framework for writing server-side code in JavaScript (say whaaat?). And there was also the keyword – async. Writing asynchronous server-side code is not a new thing, but while .NET merely gives you the possibility to do so, in node.js it is the basic assumption for the core framework and any library you might use.

To get some more general knowledge before I started coding anything, I went through a Pluralsight course by Paul O'Fallon. I know there are some people who go straight to the code when learning a new technology, but I find it valuable to grasp the basic concepts first, so that I don’t try to port patterns that are specific to the platform I’m used to but are not a good match for the new environment.

After some digging around I knew what I wanted to use for the port of my little application. While the original ran on ASP.NET WebAPI + SQL Server, the port would use express.js + CouchDB.

IDE

Call me old-fashioned, but in my opinion a good IDE is extremely important in any developer’s work. There is a trend I’ve seen recently where more and more people use tools like Sublime Text or vim to do their coding. I use Visual Studio and ReSharper in my day-to-day work, and that sets my expectations high regarding the functionality and ease of use of an IDE.

I began to search for something similar for node.js. Paul O’Fallon uses the Cloud9 IDE in his course, so I decided to give it a try. It’s a browser-based IDE that gives you access to your private or public workspace, preinstalled node.js and npm (the node package manager), and also a fully featured Linux terminal that you can use, for example, to issue git commands. What’s great about it is that it requires zero setup on your machine. You can develop, run and debug node.js applications, all in your browser. Combined with CouchDB hosted on Iris Couch, I had a development environment in no time, without installing anything on my machine. Cloud9 also has nice integration with GitHub and BitBucket.

I actually did some coding in Cloud9 and it worked great most of the time, but there were some lags and freezes, and one time I couldn’t access my workspace at all. So I went looking for some alternatives. And guess what? You don’t have many options here. There is some support in Eclipse and in NetBeans, but I wouldn’t call that “native support”. On Stack Overflow you get redirected to vim, Sublime Text or… Cloud9. I ended up using a trial version of WebStorm from JetBrains and it was quite a nice experience. I set it up on an Ubuntu VM to be a little adventurous.

Javascript

JavaScript is not new to me as a language, but it continues to amaze me how ubiquitous it has become and, at the same time, what a horrible language it is. I don’t want to focus on JavaScript specifically, but rather on, let’s say, any dynamically typed language. It is really easy and fast to build something small and to provide basic functionality using such a tool. I mean, I was ready with some basic stuff in no time. No tedious importing of types and assemblies, setting up namespaces, declaring data structures, and so on. Just plug and play. But as the project grows, it becomes harder and harder to manage. The dynamic nature of the language limits the amount of help you get from the IDE. You easily lose track of what methods are available on what classes, what the type of that parameter you added just two days ago was, and so on. WebStorm does its best to provide hints and “Intellisense”, but that has its limits. And refactoring tools? Forget it. There is an option to code for node.js using TypeScript, and I imagine that when IDEs get better support for that language it will be a real help. I haven’t tried that, however.

One more thing: writing asynchronous code with callbacks is not really comfortable, especially when you need to call several external resources one by one. node.js would really benefit from a language supporting something like C#’s async keyword.

I must say, however, that using one language across the whole application (including CouchDB, because it uses JS too) was really nice. Very consistent. No entity mapping, no changing naming conventions and so on – just load a JSON document from the db, manipulate it and return it from the RESTful API.

Libs and community

As with NuGet, npm is a great tool and saves lots of time. Apart from the modules integrated into node.js itself, you have access to tons of modules that I think cover all the basic needs you might have. I, for instance, needed a module for CouchDB access and there it is: cradle, just to mention one. And it was really user friendly too. However, I encountered an opinion on Stack Overflow, dated a few months back, that most node.js modules are not mature enough and that if you want to use them in production you’ll most probably need to get involved in their development. But I guess this is true for any OSS project.

The outcome

I managed to port the application quite quickly for a person with no previous node.js experience. The express.js framework was really helpful, and its use of the chain-of-responsibility pattern is very interesting. I would like to see something like that in ASP.NET MVC. Connecting to hosted CouchDB was a breeze, although I wish I had tried MongoDB instead – just to learn something new.

Is it better than ASP.NET?

A colleague of mine said: the correct answer is always “it depends”. It depends on what your needs are. I myself wouldn’t build a large system with node.js, however I am aware that such systems exist and work just fine. Maybe it’s just because I have more experience with another technology. Maybe it’s because I’m afraid the project would become unmaintainable.

Someone said they used node.js solely for building a proxy for cross-domain AJAX requests. I imagine this is a great use of node.js. Its main power lies in its asynchronous nature, which allows you to get high throughput when you’re doing a lot of IO (for example requests to external systems). I should mention that this can also be achieved quite easily in ASP.NET, however it’s not a popular practice. Also – you can save some bucks on MS licenses (don’t get me started on MS licensing schemes).

I’ve also seen someone using it for quickly prototyping a client – server solution with requests being forwarded to different services, which allowed them to quickly prove some assumptions about feasibility. I did not have a chance to see how a node.js application behaves in a production environment, how easy it would be to monitor its performance and behavior, and so on. But people are saying it’s OK, so I’m going to trust them.

So to sum up – working with node.js was a nice experience and it certainly has its uses. Also, I find it very valuable to venture outside the Microsoft ecosystem from time to time to see what others are doing. You can learn a lot that way.

Sunday, November 4, 2012

Using jQuery Mobile with knockout

// EDIT 2013-05-12: fixed “this” issue pointed out by commenter

Just a quick note on how I decided to structure my solution using jQuery Mobile with knockout. I’ve seen some other solutions on the web, but they seemed not to be easily scalable to larger projects. You can find a working sample on jsfiddle.

Problem definition

We need a solution that will:

  • Have a separate view model for each JQM view. We don’t want to have a god object view model spanning the whole application, as this will not scale to larger projects.
  • Allow pages to be downloaded on demand by the jQuery Mobile engine. We don’t want to store all pages inside a single HTML document downloaded upfront. This implies that knockout bindings need to be applied at the right moment – after a page has been downloaded and stored in the DOM.
  • Notify the view model when a bound page is being navigated to.

The solution

To achieve those goals, we’ll implement a pagechange event listener that will apply knockout bindings and notify a view model that its page is being navigated to. The pagechange event occurs each time navigation to another page takes place. It is fired even when JQM displays the first page in the HTML document after it has been loaded into the user’s browser.

To demonstrate this technique, we’ll build a little sample app. It will let the user pick a fruit from a list and navigate to another page that displays the name of the selected fruit. The user can then return to the fruit list and pick a different one.

[image: the sample app’s fruit list and selection pages]

First of all, we need some HTML. For simplicity’s sake, I’ll put all of it in one file, but this will also work when pages are in separate HTML files downloaded on demand.

<div id="fruitList" data-role="page" data-viewmodel="fruitListViewModel">
    <div data-role="header" data-position="fixed">
        <h1>Fruit</h1>
    </div>
    <ul data-role="listview" data-bind="foreach: fruit, refreshList: fruit">
        <li><a href="#" data-bind="text: $data, click: selectItem"></a></li>
    </ul>
</div>

<div id="selectedFruit" data-role="page" data-viewmodel="selectedFruitViewModel">
    <div data-role="header" data-position="fixed">
        <a href="#" data-rel="back">Back</a>
        <h1>Your selection</h1>
    </div>
    <h2 data-bind="text: selectedItem"></h2>
</div>    

As you can see, the HTML already includes knockout bindings as well as jQuery Mobile attributes. There is the refreshList custom binding on the UL element (more on that later). There are also the data-viewmodel attributes. We’ll use those to associate a page with a view model. The value of this attribute is the name of the global variable containing a reference to the view model. Let’s declare the view models:

globalContext = (function () {

    var fruit = ["banana", "apple", "orange", "strawberry"];
    var selectedItem = null;

    return {
        fruit: fruit,
        selectedItem: selectedItem
    };

})();


fruitListViewModel = (function (globalContext) {

    // properties
    var fruit = ko.observableArray(globalContext.fruit);

    // behaviors
    var selectItem = function (item) {
        globalContext.selectedItem = item;
        $.mobile.changePage("#selectedFruit");
    };

    return {
        fruit: fruit,
        selectItem: selectItem
    };

})(globalContext);

selectedFruitViewModel = (function (globalContext) {
    // properties
    var selectedItem = ko.observable();

    // behaviors
    var activate = function () {
        selectedItem(globalContext.selectedItem);
    };

    return {
        selectedItem: selectedItem,
        activate: activate
    };

})(globalContext);

globalContext is a helper object for holding values that all view models in the application share. fruitListViewModel is responsible for displaying a list of items. When an item is clicked, it is saved to globalContext and the user is redirected to the selectedFruit page. Note: if your pages are each in their own file, you can specify a URL instead of a page id in the changePage method. selectedFruitViewModel has the activate method, which will be called after the user navigates to the selectedFruit page. This method sets the selectedItem observable to the value stored in globalContext.

Now the navigation listener itself:

$(document).bind("pagechange", function (e, info) {

    var page = info.toPage[0];

    console.log("Changing page to: " + page.id);

    // get view model name
    var viewModelName = info.toPage.attr("data-viewmodel");
    if (viewModelName) {
        // get view model object
        var viewModel = window[viewModelName];

        // apply bindings if they are not yet applied
        if (!ko.dataFor(page)) {
            ko.applyBindings(viewModel, page);
        }

        // call activate on view model if implemented
        if (viewModel && viewModel.activate && typeof viewModel.activate === "function") {
            viewModel.activate();
        }
    }
});

The page being navigated to is examined for the data-viewmodel attribute. The name of the view model is extracted and a global variable with that name is looked up in the window object. If bindings are not yet applied for the page, applyBindings is called. Then the activate method of the view model is called, if it exists.

Note: if working with pages in separate HTML files, you’d want either to clear the bindings for a page when it is being navigated away from (as it will be removed from the DOM by JQM by default and then reloaded when the user returns) or to tell JQM to cache pages in the DOM by setting $.mobile.page.prototype.options.domCache = true.

There is still one thing missing, namely the refreshList custom binding. If the data displayed in the list changes (for example, it is downloaded using AJAX), we need a way to tell JQM to enhance all the new / changed elements that were inserted into the DOM by the knockout binding. This is what the refreshList binding does:

// refreshList custom binding
// re-runs jQuery Mobile enhancement on newly added DOM elements in a listview
ko.bindingHandlers.refreshList = {
    update: function (element, valueAccessor, allBindingsAccessor) {
        $(element).listview("refresh");
        $(element).trigger("create");
    }
};

In a similar way, other types of JQM controls can be refreshed when a knockout binding modifies the DOM.

See the working sample on jsfiddle.

Next steps

The sample I presented is very simplistic. In a real-world scenario, you’d want to declare the view models and other components of your application as AMD-style modules and use a framework like requirejs to manage the dependencies. Then the navigation listener, instead of looking for global variables, could load view model modules by name.

You can also handle more jQuery Mobile navigation-related events to provide more flexible view model activation / deactivation framework.

The globalContext object could be modified to be more of an application-wide controller than just data storage. It could manage use case flow by navigating to appropriate pages based on the current user and application context.

Sunday, April 1, 2012

Dealing with STA COMs in web applications and WCF

Instantiating and using Single-Threaded Apartment (STA) COM objects is quite straightforward in desktop applications (by that I mean console / WinForms / WPF), because the main thread is STA itself. It becomes more problematic once ASP.NET or WCF applications are considered. The threads used to serve requests are MTA, and when trying to instantiate an STA COM object you get a nasty exception. The workaround for this problem is to spawn a new thread, make it STA, and run our COM operation on it. Here I’ll show how to make this nice and easy.

The easiest way to call an operation on a different thread is to use the Thread class. I’ll put all the COM-related operations into a wrapper class called ComComponentFacade:

public class ComComponentFacade
{
    
    public int CallComObject1(string parameter)
    {
        int result = 0;

        var thread = new Thread(() => result = DoCallComObject(parameter));
        thread.SetApartmentState(ApartmentState.STA);
        thread.Start();
        thread.Join();

        return result;
    }

    private int DoCallComObject(string parameter)
    {
        // instantiate COM

        // perform operation and return result
    }

}

Then you just need to call this facade from an MVC controller or WCF operation to get the desired results instead of the aforementioned exception. Yet this approach has a drawback: it doesn’t control the number of simultaneous STA threads being spawned and COM instances created. It would be better to use some kind of thread pool. Unfortunately the standard .NET ThreadPool cannot be used, since its threads are MTA. Instead we’re going to take advantage of the Task Parallel Library and its pluggable mechanism for scheduling tasks on threads. What we need is a custom task scheduler that uses STA threads. This scheduler would create a certain number of threads and reuse them for queued tasks. If there are more tasks than available threads, they will wait until a thread is available to run them. We could write this scheduler on our own or use the existing implementation from the Parallel Extensions Extras. I’ll do the latter.

The task scheduler needs to be common to all requests to our application. I’m going to store it in a private field of the façade and then make sure the façade is used as a singleton (register it in the Unity container with an appropriate lifetime manager).

private StaTaskScheduler _staTaskScheduler;

public StaTaskScheduler TaskScheduler
{
    get { return _staTaskScheduler; }
}

public ComComponentFacade()
{
    _staTaskScheduler = new StaTaskScheduler(4);
}

public int CallComObject2(string parameter)
{
    return Task<int>.Factory.StartNew(
        () => { return DoCallComObject(parameter); },
        CancellationToken.None,
        TaskCreationOptions.None,
        _staTaskScheduler
    ).Result;
}
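The IoC wrapper referenced in the aspect below and the singleton registration could be as simple as this (a sketch assuming Unity; any container with a singleton lifetime will do):

using Microsoft.Practices.Unity;

public static class IoC
{
    public static readonly IUnityContainer Container = new UnityContainer();

    public static void Configure()
    {
        // one façade instance (and therefore one StaTaskScheduler with its 4 STA threads)
        // shared by the whole application
        Container.RegisterType<ComComponentFacade>(new ContainerControlledLifetimeManager());
    }
}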

This takes care of thread pooling, but still leaves us with a lot of boilerplate code that needs to be repeated in each method of our façade. In the next step we’re going to move it into an aspect. I’m going to use PostSharp’s MethodInterceptionAspect in order to substitute a method call with code that will run the method’s body inside the thread pool created in the previous step. You could try a different AOP library or possibly take advantage of action filters in MVC or operation invokers in WCF.

[Serializable]
[AttributeUsage(AttributeTargets.Method)]
public class ScheduleStaOperationAttribute : MethodInterceptionAspect
{
    public override void OnInvoke(MethodInterceptionArgs args)
    {
        var comComponentFacade = IoC.Container.Resolve<ComComponentFacade>();

        Task.Factory.StartNew(
            () => 
            {
                args.Proceed();
            },
            CancellationToken.None,
            TaskCreationOptions.None,
            comComponentFacade.TaskScheduler
        ).Wait();
        
        // the return value is already in the args
    }
}

Now we can use the aspect in our façade.

[ScheduleStaOperation]
public int CallComObject3(string parameter)
{
    // instantiate COM

    // perform operation and return result
    
}

There is one more thing we can do. If the instantiation of the COM object takes a lot of time (as in my case), or creating and destroying instances multiple times is undesirable for another reason, you may want to cache the instance on the current thread. Each thread from the thread pool would then have its own instance of the COM object and reuse it in subsequent calls. For this, a thread-static field can be used.

[ThreadStatic]
private static MyComType _comComponent;

[ScheduleStaOperation]
public int CallComObject4(string parameter)
{
    if (_comComponent == null)
    {
        // instantiate COM

    }
    
    // perform operation and return result

}

Thursday, November 17, 2011

Process integrity

Recently I finished the “Enterprise SOA: Service-Oriented Architecture Best Practices” book by Dirk Krafzig, Karl Banke and Dirk Slama. A great book I must say, because it covers not only purely technical and architectural aspects, but also the business and policy environment with which you have to deal in large organizations actually using the SOA approach. One particular thing in the architecture part of the book got me really interested, namely process integrity.

Some theory…

Imagine this: you are creating a web service method (whoa!). The method is responsible for conducting some business process. The challenge is that this process involves interaction with two external systems over which you have no control. For example, you might be booking a flight ticket along with a hotel reservation (first external system) and a car rental reservation (second external system). External systems tend to be, well… external, which means you have to assume that sooner or later communication with them is going to fail. What if you are able to make the hotel reservation but then the car rental reservation fails? You have to be able to a) detect when something goes wrong, and b) handle this situation, for example by cancelling the hotel reservation and the ticket (a compensation process).

The most natural way to handle this would be to have some kind of ACID transaction spanning all the systems involved. While it is possible to use distributed transactions in some cases, there are a lot of problems involved. First, there are technology constraints: the external system probably won’t support a compatible transaction coordination system / standard. Then there is the lack of support for long-lived processes (for example, a reservation confirmation from the hotel system may take up to an hour). Also, distributed transactions over the internet are probably a bad idea.

The way to deal with that problem in an SOA environment is to have so-called persistent queues and transactional steps. You split the whole process into steps, where each step is transactional. After taking each of those steps, you record in a persistent way the fact that it either passed or failed. A passed step is a sign that you can perform the next step in the process, or finish the whole process if you have taken all the steps.

[diagram: the booking process split into transactional steps with persistent queues]

What happens here? First we save the ticket information and enqueue the hotel reservation task. Then the hotel reservation processor, which periodically checks its queue, loads the task and processes it. After a successful hotel reservation, the car rental reservation task is enqueued. If the hotel reservation task failed, the transaction would be rolled back and the hotel reservation task would return to the queue. A similar process happens for the car rental reservation task.

The important thing to note here is that the user who initiated the process doesn’t find out straight away whether it was successful. The process is asynchronous, so the user has to check its status periodically. Also, all tasks and data saved to the database should be marked with a common process (or transaction) id. This way we’ll be able to identify the state of the process.

So what can be done if one of the tasks fails? Maybe the external system is just overloaded or under maintenance; then we can just try again later. The problem might also be with the data passed to the system (for example, it doesn’t pass validation), in which case trying over and over again won’t do much good. Tasks with no good chance of completing successfully should be marked as failed. Then, either automatically or manually (by a technician), the processes of which they are a part should be rolled back, which usually means executing some kind of compensating logic.
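As a hypothetical illustration (anticipating the sample code shown below, and assuming an extra AttemptCount column on the task table), the "retry a few times before giving up" behaviour could replace the immediate MarkFailed call with something like:

public static void MarkFailedOrRetry(TasksDbContext context, Guid taskId, int maxAttempts)
{
    var task = context.PersistentTasks.SingleOrDefault(pt => pt.Id == taskId);
    if (task == null)
    {
        return;
    }

    // AttemptCount is a hypothetical extra column on PersistentTask
    task.AttemptCount++;
    if (task.AttemptCount >= maxAttempts)
    {
        // no point in retrying any more - mark it failed so the process can be compensated
        task.IsFailed = true;
    }
    // otherwise the task simply stays in the queue and gets picked up on the next run

    context.SaveChanges();
}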

…and some practice

So what is the best way to implement all that? You may not know it, but you’ve got all the infrastructure ready in… Workflow Foundation. Well, maybe it’s not 1:1 with what I was explaining above, but it serves the purpose. Yet WF may be too heavyweight a solution for you, since it requires a certain shift in thinking about building your service.

Sticking to transactional steps and queues, you may wonder how to implement such a queue mechanism. I think the two most popular approaches would be either a database queue or an MSMQ queue. Both have their pros and cons. MSMQ is nicely suited to processing large amounts of messages, but brings some maintenance overhead and also requires you to promote your transactions to distributed ones if you want them to span the queue and the database. With a database the transactions are local and there’s no maintenance overhead, but it is quite challenging to create an efficient queuing mechanism due to locking problems. Then again – you might need the integrity, not the performance.

Here I have a simple example of how to create a database queue and task processors working in the background. I’m not working with a hotel or car rental system, but rather with Twitter and Facebook. Let’s pretend for a minute that it’s absolutely crucial for you to have your status updates hit both Twitter and Facebook in an integral manner :) In this example I’m using a nice scheduling library called Quartz.NET, and also Twitterizer and the Facebook C# SDK. (It is amazing how NuGet makes putting together such mashup apps quick and easy.) Grab the source here.


It’s a simple console app:

static void Main(string[] args)
{
    // required to use sql ce with entity framework
    Database.DefaultConnectionFactory = new SqlCeConnectionFactory("System.Data.SqlServerCe.4.0");

    // start scheduler
    var factory = new StdSchedulerFactory();
    var scheduler = factory.GetScheduler();
    scheduler.Start();

    // those are our task processors
    var twitterJobDetail = new JobDetail("twitterJob", null, typeof(PostOnTwitterJob));
    var facebookJobDetail = new JobDetail("facebookJob", null, typeof(PostOnFacebookJob));

    // they check the queue every 5 seconds
    var twitterTrigger = TriggerUtils.MakeSecondlyTrigger(5);
    twitterTrigger.Name = "twitter";
    var facebookTrigger = TriggerUtils.MakeSecondlyTrigger(5);
    facebookTrigger.Name = "facebook";

    // lets start them
    scheduler.ScheduleJob(twitterJobDetail, twitterTrigger);
    scheduler.ScheduleJob(facebookJobDetail, facebookTrigger);

    while (true)
    {
        // get post text from user
        var post = Console.ReadLine();
        // and initiate posting
        TaskQueue.Enqueue(new TasksDbContext(), Guid.NewGuid(), new PostOnTwitterTaskData { Text = post });
        // also you could force the twitter job to run straight away so that you don't wait <= 5s
    }
}

Basically, we take user input and initiate the posting process by storing the post-on-Twitter task in the database. The database representation of the task is very simple:

public class PersistentTask
{
    public Guid Id { get; set; }
    
    /// <summary>
    /// Helps to identify the process across different tasks
    /// </summary>
    public Guid TransactionId { get; set; }

    public bool IsFailed { get; set; }

    public string Type { get; set; }

    public string SerializedData { get; set; }
}

The task queue:

public class TaskPair<TTaskData>
 {
     /// <summary>
     /// DB representation of the task
     /// </summary>
     public PersistentTask PersistentTask { get; set; }

     /// <summary>
     /// Deserialized task data
     /// </summary>
     public TTaskData TaskData { get; set; }

     public TaskPair(PersistentTask persistentTask, TTaskData taskData)
     {
         PersistentTask = persistentTask;
         TaskData = taskData;
     }
 }
 
 public static class TaskQueue
 {
     /// <summary>
     /// Puts a task into database queue
     /// </summary>
     public static void Enqueue(TasksDbContext context, Guid transactionId, object taskData)
     {
         Console.WriteLine("Enqueueing {0} with transactionId = {1}", taskData.GetType().Name, trasactionId);
         
         var serializer = new XmlSerializer(taskData.GetType());
         var writer = new StringWriter();
         serializer.Serialize(writer, taskData);

         var persistentTask = new PersistentTask
         {
             Id = Guid.NewGuid(),
             TransactionId = transactionId,
             IsFailed = false,
             Type = taskData.GetType().Name,
             SerializedData = writer.ToString()
         };

         context.PersistentTasks.Add(persistentTask);
         context.SaveChanges();
     }

     /// <summary>
     /// Dequeues a task from database queue and deserializes its data
     /// </summary>
     public static TaskPair<TTaskData> Dequeue<TTaskData>(TasksDbContext context)
     {
         var task = context.PersistentTasks
             .Where(pt => pt.Type == typeof(TTaskData).Name && !pt.IsFailed).FirstOrDefault();
         if (task == null)
         {
             return null;
         }

         Console.WriteLine("Dequeued {0} with transactionId = {1}", typeof(TTaskData).Name, task.TransactionId);

         context.PersistentTasks.Remove(task);
         context.SaveChanges();

         var serializer = new XmlSerializer(typeof(TTaskData));
         var reader = new StringReader(task.SerializedData);

         return new TaskPair<TTaskData>(task, (TTaskData)serializer.Deserialize(reader));
     }

     /// <summary>
     /// Marks a certain task as failed
     /// </summary>
     public static void MarkFailed(TasksDbContext context, Guid taskId)
     {
         var task = context.PersistentTasks.SingleOrDefault(pt => pt.Id == taskId);
         if (task != null)
         {
             task.IsFailed = true;
             context.SaveChanges();
         }
     }
 }

And the jobs (or task processors):

public abstract class TaskJobBase<TTaskDescription> : IStatefulJob
{
    public void Execute(JobExecutionContext jobContext)
    {
        TaskPair<TTaskDescription> pair = null;
        ObjectContext objctx = null;

        using (var context = new TasksDbContext())
        {
            // unfortunately with sql ce you have to manually open/close connection or it'll not enlist in transaction
            objctx = ((IObjectContextAdapter)context).ObjectContext;
            objctx.Connection.Open();

            // foreach task waiting in the database
            do
            {
                try
                {
                    // open transaction
                    using (var scope = new TransactionScope())
                    {
                        // dequeue task
                        pair = TaskQueue.Dequeue<TTaskDescription>(context);
                        if (pair != null)
                        {
                            // execute it
                            PerformTask(context, pair);
                            // if ok - commit transaction
                            scope.Complete();
                        }
                    }
                }
                catch
                {
                    // transaction was rolled back
                    if (pair != null)
                    {
                        Console.WriteLine("Task {0} with transaction id = {1} failed!", pair.TaskData.GetType().Name,
                            pair.PersistentTask.TransactionId);

                        // in this example we also mark the task as failed
                        using (var scope = new TransactionScope())
                        {
                            TaskQueue.MarkFailed(context, pair.PersistentTask.Id);
                            scope.Complete();
                        }
                    }
                }
            } while (pair != null);

            objctx.Connection.Close();
        }
    }

    public abstract void PerformTask(TasksDbContext context, TaskPair<TTaskDescription> pair);
}

public class PostOnTwitterJob : TaskJobBase<PostOnTwitterTaskData>
{
    public override void PerformTask(TasksDbContext context, TaskPair<PostOnTwitterTaskData> pair)
    {
        Console.WriteLine("Posting \"{0}\" on Twitter", pair.TaskData.Text);

        OAuthTokens tokens = new OAuthTokens();
        tokens.AccessToken = "[your access token]";
        tokens.AccessTokenSecret = "[your access token secret]";
        tokens.ConsumerKey = "[your consumer key]";
        tokens.ConsumerSecret = "[your consumer secret]";

        TwitterResponse<TwitterStatus> tweetResponse = TwitterStatus.Update(tokens, pair.TaskData.Text);
        if (tweetResponse.Result != RequestResult.Success)
        {
            throw new Exception("Unable to post to twitter!");
        }

        // ok, posted! now the next step - facebook
        // it's still in the transaction!
        TaskQueue.Enqueue(context, pair.PersistentTask.TransactionId, new PostOnFacebookTaskData { Text = pair.TaskData.Text });
    }
}

public class PostOnFacebookJob : TaskJobBase<PostOnFacebookTaskData>
{
    public override void PerformTask(TasksDbContext context, TaskPair<PostOnFacebookTaskData> pair)
    {
        Console.WriteLine("Posting \"{0}\" on Facebook", pair.TaskData.Text);

        // in real world app you need to refresh this token since it expires after ~2h
        var fb = new FacebookClient("[your access token]");
 
        dynamic parameters = new ExpandoObject();
        parameters.message = pair.TaskData.Text;

        dynamic result = fb.Post("me/feed", parameters);
        var id = result.id;

        // nothing left to do - process finished
    }
}

Of course the app is not something I would put into production, but it should give you an idea of how things can be organized. Grab the source here.
