WebSocket subprotocol negotiation in ASP.NET Core

WebSocket is the closest thing to a raw network socket available in the browser, which probably makes it the most flexible transport a web application can use. That flexibility comes at a price. From the WebSocket perspective the message content is opaque; the protocol only distinguishes between text and binary data. There is also no ready-to-use mechanism for communicating additional metadata. This means that the client and the server must agree on an application subprotocol. That isn't a problem as long as the scenario is simple, but a problem arises the moment we want to evolve the subprotocol while there are clients outside our control. WebSocket addresses this with a simple subprotocol negotiation mechanism, and the Microsoft.AspNetCore.WebSockets package, which provides low-level WebSocket support for ASP.NET Core, fully supports it.

Sample scenario

The sample scenario is a very simple web application that regularly receives plain text messages over WebSocket and displays them to users. The relevant part of the client-side code is the snippet below.

var handleWebSocketPlainTextData = function(data) {
    ...
};

var webSocket = new WebSocket('ws://example.com/socket');

webSocket.onmessage = function(message) {
    handleWebSocketPlainTextData(message.data);
};

On the server side there is a simple middleware which manages WebSocket connections.

public class WebSocketConnectionsMiddleware
{
    private IWebSocketConnectionsService _connectionsService;


    // The constructor name must match the class name.
    public WebSocketConnectionsMiddleware(RequestDelegate next,
        IWebSocketConnectionsService connectionsService)
    {
        _connectionsService = connectionsService ??
            throw new ArgumentNullException(nameof(connectionsService));
    }

    public async Task Invoke(HttpContext context)
    {
        if (context.WebSockets.IsWebSocketRequest)
        {
            WebSocket webSocket = await context.WebSockets.AcceptWebSocketAsync();

            WebSocketConnection webSocketConnection = new WebSocketConnection(webSocket);

            _connectionsService.AddConnection(webSocketConnection);

            byte[] webSocketBuffer = new byte[1024 * 4];
            WebSocketReceiveResult webSocketReceiveResult = await webSocket.ReceiveAsync(
                new ArraySegment<byte>(webSocketBuffer), CancellationToken.None);
            if (webSocketReceiveResult.MessageType != WebSocketMessageType.Close)
            {
                ...
            }
            await webSocket.CloseAsync(webSocketReceiveResult.CloseStatus.Value,
                webSocketReceiveResult.CloseStatusDescription, CancellationToken.None);

            _connectionsService.RemoveConnection(webSocketConnection.Id);
        }
        else
        {
            context.Response.StatusCode = 400;
        }
    }
}

The IWebSocketConnectionsService implementation manages connections with the help of a ConcurrentDictionary, and WebSocketConnection is a wrapper around the WebSocket class which abstracts the low-level aspects of the API.

public class WebSocketConnection
{
    private WebSocket _webSocket;

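    // Generated once per connection; an expression-bodied property would return a new Guid on every read.
    public Guid Id { get; } = Guid.NewGuid();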

    public WebSocketConnection(WebSocket webSocket)
    {
        _webSocket = webSocket ?? throw new ArgumentNullException(nameof(webSocket));
    }

    public async Task SendAsync(string message, CancellationToken cancellationToken)
    {
        if (_webSocket.State == WebSocketState.Open)
        {
            // WebSocket text messages must be UTF-8, and the segment length is a byte count.
            byte[] messageBytes = Encoding.UTF8.GetBytes(message);
            ArraySegment<byte> buffer = new ArraySegment<byte>(messageBytes);

            await _webSocket.SendAsync(buffer, WebSocketMessageType.Text, true, cancellationToken);
        }
    }

    ...
}

The goal is to introduce a new (JSON-based) subprotocol which allows sending additional metadata, while keeping backward compatibility.

Abstracting the subprotocol

First, an abstraction of the subprotocol is needed. It needs to provide the name of the subprotocol and methods for sending and receiving. In general the application will still be sending text-based messages, so the following interface should be sufficient.

public interface ITextWebSocketSubprotocol
{
    string SubProtocol { get; }

    Task SendAsync(string message, WebSocket webSocket, CancellationToken cancellationToken);

    ...
}

The implementation for the plain text version can be extracted from WebSocketConnection.

public class PlainTextWebSocketSubprotocol : ITextWebSocketSubprotocol
{
    public string SubProtocol => "aspnetcore-ws.plaintext";

    public async Task SendAsync(string message, WebSocket webSocket,
        CancellationToken cancellationToken)
    {
        if (webSocket.State == WebSocketState.Open)
        {
            // WebSocket text messages must be UTF-8, and the segment length is a byte count.
            byte[] messageBytes = Encoding.UTF8.GetBytes(message);
            ArraySegment<byte> buffer = new ArraySegment<byte>(messageBytes);

            await webSocket.SendAsync(buffer, WebSocketMessageType.Text, true, cancellationToken);
        }
    }

    ...
}

This means that WebSocketConnection should now depend on the subprotocol abstraction.

public class WebSocketConnection
{
    private WebSocket _webSocket;
    private ITextWebSocketSubprotocol _subProtocol;

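    // Generated once per connection; an expression-bodied property would return a new Guid on every read.
    public Guid Id { get; } = Guid.NewGuid();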

    public WebSocketConnection(WebSocket webSocket, ITextWebSocketSubprotocol subProtocol)
    {
        _webSocket = webSocket ?? throw new ArgumentNullException(nameof(webSocket));
        _subProtocol = subProtocol ?? throw new ArgumentNullException(nameof(subProtocol));
    }

    public Task SendAsync(string message, CancellationToken cancellationToken)
    {
        return _subProtocol.SendAsync(message, _webSocket, cancellationToken);
    }

    ...
}

A small adjustment to the middleware is also needed.

public class WebSocketConnectionsMiddleware
{
    private readonly ITextWebSocketSubprotocol _defaultSubProtocol;
    private IWebSocketConnectionsService _connectionsService;


    // The constructor name must match the class name.
    public WebSocketConnectionsMiddleware(RequestDelegate next,
        IWebSocketConnectionsService connectionsService)
    {
        _defaultSubProtocol = new PlainTextWebSocketSubprotocol();
        _connectionsService = connectionsService ??
            throw new ArgumentNullException(nameof(connectionsService));
    }

    public async Task Invoke(HttpContext context)
    {
        if (context.WebSockets.IsWebSocketRequest)
        {
            WebSocket webSocket = await context.WebSockets.AcceptWebSocketAsync();

            WebSocketConnection webSocketConnection = new WebSocketConnection(webSocket,
                _defaultSubProtocol);

            ...
        }
        else
        {
            context.Response.StatusCode = 400;
        }
    }
}

Now the infrastructure needed for introducing a second subprotocol is in place. It will be a JSON-based subprotocol which, in addition to the message, provides a timestamp.

public class JsonWebSocketSubprotocol : ITextWebSocketSubprotocol
{
    public string SubProtocol => "aspnetcore-ws.json";

    public async Task SendAsync(string message, WebSocket webSocket,
        CancellationToken cancellationToken)
    {
        if (webSocket.State == WebSocketState.Open)
        {
            string jsonMessage = JsonConvert.SerializeObject(new {
                message,
                timestamp = DateTime.UtcNow
            });

            // WebSocket text messages must be UTF-8, and the segment length is a byte count.
            byte[] messageBytes = Encoding.UTF8.GetBytes(jsonMessage);
            ArraySegment<byte> buffer = new ArraySegment<byte>(messageBytes);

            await webSocket.SendAsync(buffer, WebSocketMessageType.Text, true, cancellationToken);
        }
    }
}

Subprotocol negotiation

The subprotocol negotiation starts on the client side. An array of supported subprotocols can be passed as the second argument to the WebSocket constructor. If the negotiation succeeds, the chosen subprotocol is available through the protocol attribute of the WebSocket instance.

var handleWebSocketPlainTextData = function(data) {
    ...
};

var handleWebSocketJsonData = function(data) {
    ...
};

var webSocket = new WebSocket('ws://example.com/socket',
    ['aspnetcore-ws.plaintext', 'aspnetcore-ws.json']);

webSocket.onmessage = function(message) {
    if (webSocket.protocol == 'aspnetcore-ws.json') {
        handleWebSocketJsonData(message.data);
    } else {
        handleWebSocketPlainTextData(message.data);
    }
};
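
For illustration, a JSON handler could start by parsing the payload. The sketch below assumes the message and timestamp property names produced by JsonWebSocketSubprotocol; the helper name parseJsonSubprotocolMessage is hypothetical and not taken from the demo project.

```javascript
// Hypothetical helper: turns a raw 'aspnetcore-ws.json' payload into an object.
// Assumes the payload shape { message: string, timestamp: ISO 8601 string }.
function parseJsonSubprotocolMessage(data) {
    var payload = JSON.parse(data);

    return {
        message: payload.message,
        // The timestamp arrives as a string, so it is converted to a Date here.
        timestamp: new Date(payload.timestamp)
    };
}

var parsed = parseJsonSubprotocolMessage(
    '{"message":"Hello","timestamp":"2018-01-01T12:00:00Z"}');
console.log(parsed.message, parsed.timestamp.toISOString());
```

Keeping the parsing separate from the rendering makes it easier to add further metadata fields to the subprotocol later.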

The advertised subprotocols are transferred to the server as part of the connection handshake in the Sec-WebSocket-Protocol header. On the server side this list is available through the HttpContext.WebSockets.WebSocketRequestedProtocols property. The server completes the handshake by passing the name of the selected subprotocol as a parameter to the HttpContext.WebSockets.AcceptWebSocketAsync method.
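
To make the wire format concrete: the header carries the names as a comma-separated list. ASP.NET Core does this parsing itself, so the sketch below is only an illustration, and the function name parseRequestedSubprotocols is hypothetical.

```javascript
// Hypothetical helper: splits a Sec-WebSocket-Protocol header value into names.
function parseRequestedSubprotocols(headerValue) {
    if (!headerValue) {
        // A missing or empty header means the client advertised nothing.
        return [];
    }

    return headerValue
        .split(',')
        .map(function (name) { return name.trim(); })
        .filter(function (name) { return name.length > 0; });
}

var requested = parseRequestedSubprotocols('aspnetcore-ws.plaintext, aspnetcore-ws.json');
console.log(requested);
```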

The rules of subprotocol negotiation are simple. If the client has advertised a list of subprotocols, the server must choose one of them. If the client hasn't advertised any subprotocols, the server can't provide a subprotocol name as part of the handshake. The best place to implement those rules seems to be the middleware.
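
The two rules can be sketched as a small check. This is only an illustration of the rules as stated above; isValidSelection is a hypothetical name and not an ASP.NET Core or browser API.

```javascript
// Hypothetical check of the two negotiation rules.
// requested: array of names advertised by the client (may be empty).
// selected: name returned by the server, or null when none was returned.
function isValidSelection(requested, selected) {
    if (requested.length === 0) {
        // Nothing advertised: the server must not name a subprotocol.
        return selected === null;
    }

    // Something advertised: the server must pick one of the advertised names.
    return selected !== null && requested.indexOf(selected) !== -1;
}

console.log(isValidSelection([], null));
console.log(isValidSelection(['aspnetcore-ws.json'], 'aspnetcore-ws.json'));
console.log(isValidSelection(['aspnetcore-ws.json'], 'aspnetcore-ws.plaintext'));
```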

The client lists its subprotocols in order of preference, but nothing forces the server to honor that order, so the choice is ultimately up to the server. In the code below the available subprotocols are kept in a list, and the order of that list represents the server's preference.

public class WebSocketConnectionsMiddleware
{
    private readonly ITextWebSocketSubprotocol _defaultSubProtocol;
    private readonly IList<ITextWebSocketSubprotocol> _supportedSubProtocols;
    private IWebSocketConnectionsService _connectionsService;


    // The constructor name must match the class name.
    public WebSocketConnectionsMiddleware(RequestDelegate next,
        IWebSocketConnectionsService connectionsService)
    {
        _defaultSubProtocol = new PlainTextWebSocketSubprotocol();
        _supportedSubProtocols = new List<ITextWebSocketSubprotocol>
        {
            new JsonWebSocketSubprotocol(),
            _defaultSubProtocol
        };
        _connectionsService = connectionsService ??
            throw new ArgumentNullException(nameof(connectionsService));
    }

    public async Task Invoke(HttpContext context)
    {
        if (context.WebSockets.IsWebSocketRequest)
        {
            ITextWebSocketSubprotocol subProtocol =
                NegotiateSubProtocol(context.WebSockets.WebSocketRequestedProtocols);

            WebSocket webSocket =
                await context.WebSockets.AcceptWebSocketAsync(subProtocol?.SubProtocol);

            WebSocketConnection webSocketConnection =
                new WebSocketConnection(webSocket, subProtocol ?? _defaultSubProtocol);

            ...
        }
        else
        {
            context.Response.StatusCode = 400;
        }
    }

    private ITextWebSocketSubprotocol NegotiateSubProtocol(IList<string> requestedSubProtocols)
    {
        ITextWebSocketSubprotocol subProtocol = null;

        // Iterate in the server's preference order; the first match wins.
        foreach (ITextWebSocketSubprotocol supportedSubProtocol in _supportedSubProtocols)
        {
            if (requestedSubProtocols.Contains(supportedSubProtocol.SubProtocol))
            {
                subProtocol = supportedSubProtocol;
                break;
            }
        }

        return subProtocol;
    }
}

With this implementation, all existing clients which don't support the new subprotocol will continue to work without any changes: they will not advertise any subprotocols, and the server will use the default one without providing its name as part of the handshake. Meanwhile our application, which advertises support for the new subprotocol, will use it because it is the one preferred by the server.
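
The outcomes for both kinds of clients can be traced with a small sketch that mirrors the selection loop from the middleware. It is written in JavaScript purely for illustration; the names supported and negotiate are hypothetical.

```javascript
// Server preference order, mirroring _supportedSubProtocols in the middleware.
var supported = ['aspnetcore-ws.json', 'aspnetcore-ws.plaintext'];

// Returns the first supported name the client advertised, or null if none matched.
function negotiate(requested) {
    for (var i = 0; i < supported.length; i++) {
        if (requested.indexOf(supported[i]) !== -1) {
            return supported[i];
        }
    }

    return null;
}

// A legacy client advertises nothing, so no name is returned and the default is used.
var legacyClient = negotiate([]);
// The updated client advertises both, and the server's preference decides.
var updatedClient = negotiate(['aspnetcore-ws.plaintext', 'aspnetcore-ws.json']);
console.log(legacyClient, updatedClient);
```

Note that the order in which the client lists the names does not affect the result here, because the loop walks the server's list.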

It is also worth mentioning that subprotocols are not meant only for internal purposes. A number of well-known subprotocols are listed in the WebSocket Subprotocol Name Registry maintained by IANA.

The demo project is available on GitHub; it should be a good starting point for playing with WebSocket subprotocol negotiation.