Monitoring C# Azure Functions in the Isolated Worker Model - Infrastructure & Configuration Deep Dive

Not so long ago my colleague reached out with a question: "Are there any known issues with ITelemetryInitializer in Azure Functions?". That question started a discussion about the monitoring configuration for C# Azure Functions in the isolated worker model. At some point in that discussion I stated "Alright, you've motivated me, I'll make a sample". When I sat down to do that, I started wondering which scenario I should cover, and my conclusion was that there are several things I should go through... So here I am.

Setting the Scene

Before I start describing how we can monitor an isolated worker model C# Azure Function, allow me to introduce you to the one we are going to use for this purpose. I want to start with as basic a setup as possible. This is why the initial Program.cs will contain only four lines.

var host = new HostBuilder()
    .ConfigureFunctionsWebApplication()
    .Build();

host.Run();

The host.json will also be minimal.

{
    "version": "2.0",
    "logging": {
        "logLevel": {
            "default": "Information"
        }
    }
}

For the function itself, I've decided to go for the Fibonacci sequence implementation as it can easily generate a ton of logs.

public class FibonacciSequence
{
    private readonly ILogger<FibonacciSequence> _logger;

    public FibonacciSequence(ILogger<FibonacciSequence> logger)
    {
        _logger = logger;
    }

    [Function(nameof(FibonacciSequence))]
    public async Task<HttpResponseData> Run(
        [HttpTrigger(AuthorizationLevel.Anonymous, "get", Route = "fib/{index:int}")]
        HttpRequestData request,
        int index)
    {
        _logger.LogInformation(
            $"{nameof(FibonacciSequence)} function triggered for {{index}}.",
            index
        );

        var response = request.CreateResponse(HttpStatusCode.OK);
        response.Headers.Add("Content-Type", "text/plain; charset=utf-8");

        await response.WriteStringAsync(FibonacciSequenceRecursive(index).ToString());

        return response;
    }

    private int FibonacciSequenceRecursive(int index)
    {
        int fibonacciNumber = 0;

        _logger.LogInformation("Calculating Fibonacci sequence for {index}.", index);

        if (index <= 0)
        {
            fibonacciNumber = 0;
        }
        else if (index <= 1)
        {
            fibonacciNumber = 1;
        }
        else
        {
            fibonacciNumber = 
              FibonacciSequenceRecursive(index - 1)
              + FibonacciSequenceRecursive(index - 2);
        }

        _logger.LogInformation(
            "Calculated Fibonacci sequence for {index}: {fibonacciNumber}.",
            index,
            fibonacciNumber
        );

        return fibonacciNumber;
    }
}

This is a recursive implementation which in our case has the added benefit of being crashable on demand 😉.

Now we can start capturing the signals this function will produce after deployment.

Simple and Limited Option for Specific Scenarios - File System Logging

What we can capture in a minimal deployment scenario is the logs coming from our function. What do I understand by a minimal deployment scenario? The bare minimum that an Azure Function requires: a storage account, an app service plan, and a function app. The function can push all its logs into that storage account. Is this something I would recommend for a production scenario? Certainly not. It captures only logs (and in production you will want metrics too), and it works reasonably only for Azure Functions deployed on Windows (for production I would suggest Linux due to better cold start performance or lower pricing for a dedicated plan). But it may be the right option for some development scenarios (when you want to run some work, download the logs, and analyze them locally). So, how do we achieve this? Let's start with some Bicep snippets for the required infrastructure. First, we need to deploy a storage account.

resource storageAccount 'Microsoft.Storage/storageAccounts@2023-05-01' = {
  name: 'stwebjobs${uniqueString(resourceGroup().id)}'
  location: resourceGroup().location
  sku: {
    name: 'Standard_LRS'
  }
  kind: 'Storage'
}

We also need an app service plan. It must be a Windows one (the below snippet creates a Windows Consumption Plan).

resource appServicePlan 'Microsoft.Web/serverfarms@2024-04-01' = {
  name: 'plan-monitored-function'
  location: resourceGroup().location
  sku: {
    name: 'Y1'
    tier: 'Dynamic'
  }
  properties: {
    computeMode: 'Dynamic'
  }
}

And the actual service doing the work, the function app.

resource functionApp 'Microsoft.Web/sites@2024-04-01' = {
  name: 'func-monitored-function'
  location: resourceGroup().location
  kind: 'functionapp'
  properties: {
    serverFarmId: appServicePlan.id
    siteConfig: {
      netFrameworkVersion: 'v8.0'
      appSettings: [
        {
          name: 'AzureWebJobsStorage'
          value: 'DefaultEndpointsProtocol=https;AccountName=${storageAccount.name};EndpointSuffix=${environment().suffixes.storage};AccountKey=${storageAccount.listKeys().keys[0].value}'
        }
        {
          name: 'WEBSITE_CONTENTAZUREFILECONNECTIONSTRING'
          value: 'DefaultEndpointsProtocol=https;AccountName=${storageAccount.name};EndpointSuffix=${environment().suffixes.storage};AccountKey=${storageAccount.listKeys().keys[0].value}'
        }
        {
          name: 'WEBSITE_CONTENTSHARE'
          value: 'func-monitored-function'
        }
        {
          name: 'FUNCTIONS_EXTENSION_VERSION'
          value: '~4'
        }
        {
          name: 'FUNCTIONS_WORKER_RUNTIME'
          value: 'dotnet-isolated'
        }
      ]
    }
    httpsOnly: true
  }
}

A couple of words about this function app. The kind: 'functionapp' indicates that this is a Windows function app, which creates the requirement of setting netFrameworkVersion to the desired .NET version. To make this an isolated worker model function, FUNCTIONS_EXTENSION_VERSION is set to ~4 and FUNCTIONS_WORKER_RUNTIME to dotnet-isolated. When it comes to the settings referencing the storage account, your attention should go to WEBSITE_CONTENTAZUREFILECONNECTIONSTRING and WEBSITE_CONTENTSHARE - those indicate the file share that will be created for this function app in the storage account (this is where the logs will go).

After deployment, the resulting infrastructure should be as below.

Infrastructure for Azure Functions with file system logging (storage account, app service plan, and function app)

What remains is configuring the file system logging in the host.json file of our function. The default behavior is to generate log files only when the function is being debugged using the Azure portal. We want them to be generated always.

{
    "version": "2.0",
    "logging": {
        "fileLoggingMode": "always",
        "logLevel": {
            "default": "Information"
        }
    }
}

When you deploy the code and execute some requests, you will be able to find the log files in the defined file share (the path is /LogFiles/Application/Functions/Host/).

2024-11-17T15:33:39.339 [Information] Executing 'Functions.FibonacciSequence' ...
2024-11-17T15:33:39.522 [Information] FibonacciSequence function triggered for 6.
2024-11-17T15:33:39.526 [Information] Calculating Fibonacci sequence for 6.
2024-11-17T15:33:39.526 [Information] Calculating Fibonacci sequence for 5.
...
2024-11-17T15:33:39.527 [Information] Calculated Fibonacci sequence for 5: 5.
...
2024-11-17T15:33:39.528 [Information] Calculated Fibonacci sequence for 4: 3.
2024-11-17T15:33:39.528 [Information] Calculated Fibonacci sequence for 6: 8.
2024-11-17T15:33:39.566 [Information] Executed 'Functions.FibonacciSequence' ...
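
If you go the "run some work, download the logs, analyze them locally" route, you don't have to click through the portal to fetch the files. Below is a minimal sketch using the Azure.Storage.Files.Shares SDK; the connection string variable and the share name (taken from WEBSITE_CONTENTSHARE) are my assumptions matching the infrastructure above, not part of the original sample.

using System;
using System.IO;
using Azure.Storage.Files.Shares;
using Azure.Storage.Files.Shares.Models;

// Assumption: a connection string for the storage account that holds the content share.
string connectionString = Environment.GetEnvironmentVariable("LOGS_STORAGE_CONNECTION_STRING")!;

ShareDirectoryClient hostLogsDirectory = new ShareClient(connectionString, "func-monitored-function")
    .GetDirectoryClient("LogFiles/Application/Functions/Host");

await foreach (ShareFileItem item in hostLogsDirectory.GetFilesAndDirectoriesAsync())
{
    if (item.IsDirectory)
    {
        continue;
    }

    // Download every host log file into the current working directory.
    var download = await hostLogsDirectory.GetFileClient(item.Name).DownloadAsync();
    using FileStream target = File.OpenWrite(item.Name);
    await download.Value.Content.CopyToAsync(target);
}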

For Production Scenarios - Entra ID Protected Application Insights

As I said, I wouldn't use file system logging for production. Very often it's not even enough for development. This is why Application Insights is considered the default these days. But before we deploy one, we can use the fact that we no longer need Windows and change the deployment to a Linux-based one. First I'm going to change the app service plan to a Linux Consumption Plan.

resource appServicePlan 'Microsoft.Web/serverfarms@2024-04-01' = {
  ...
  kind: 'linux'
  properties: {
    reserved: true
  }
}

With a different app service plan, the function app can be changed to a Linux one, which means changing the kind to functionapp,linux and replacing netFrameworkVersion with the corresponding linuxFxVersion.

resource functionApp 'Microsoft.Web/sites@2024-04-01' = {
  ...
  kind: 'functionapp,linux'
  properties: {
    serverFarmId: appServicePlan.id
    siteConfig: {
      linuxFxVersion: 'DOTNET-ISOLATED|8.0'
      ...
    }
    httpsOnly: true
  }
}

There is also a good chance that you no longer need the file share (although the documentation itself is inconsistent about whether it's needed when running Linux functions on the Elastic Premium plan), so WEBSITE_CONTENTAZUREFILECONNECTIONSTRING and WEBSITE_CONTENTSHARE can simply be removed. Not requiring the file share creates one more improvement opportunity - we can drop the credentials for the blob storage connection (AzureWebJobsStorage), as it can use a managed identity instead. I'm going to use a user-assigned managed identity because I often prefer them and they usually cause more trouble to set up 😉. To do so, we need to create one and grant it the Storage Blob Data Owner role for the storage account.

resource managedIdentity 'Microsoft.ManagedIdentity/userAssignedIdentities@2023-01-31' = {
  name: 'id-monitored-function'
  location: resourceGroup().location
}

resource storageBlobDataOwnerRoleDefinition 'Microsoft.Authorization/roleDefinitions@2022-04-01' existing = {
  name: 'b7e6dc6d-f1e8-4753-8033-0f276bb0955b' // Storage Blob Data Owner
  scope: subscription()
}

resource storageBlobDataOwnerRoleAssignment 'Microsoft.Authorization/roleAssignments@2022-04-01' = {
  scope: storageAccount
  name: guid(storageAccount.id, managedIdentity.id, storageBlobDataOwnerRoleDefinition.id)
  properties: {
    roleDefinitionId: storageBlobDataOwnerRoleDefinition.id
    principalId: managedIdentity.properties.principalId
    principalType: 'ServicePrincipal'
  }
}

The change to the function app is only about assigning the managed identity and replacing AzureWebJobsStorage with AzureWebJobsStorage__accountName.

resource functionApp 'Microsoft.Web/sites@2024-04-01' = {
  ...

  ...
  properties: {
    ...
    siteConfig: {
      ...
      appSettings: [
        {
          name: 'AzureWebJobsStorage__accountName'
          value: storageAccount.name
        }
        ...
      ]
    }
    httpsOnly: true
  }
}
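
The identity assignment itself is the standard user-assigned identity block on the site resource - a sketch following the resource names from the earlier snippets:

resource functionApp 'Microsoft.Web/sites@2024-04-01' = {
  ...
  identity: {
    type: 'UserAssigned'
    userAssignedIdentities: {
      // Attach the user-assigned identity created above.
      '${managedIdentity.id}': {}
    }
  }
  properties: {
    ...
  }
}

Worth knowing: the identity-based connections documentation also describes AzureWebJobsStorage__credential (set to managedidentity) and AzureWebJobsStorage__clientId settings for pointing the host at a specific user-assigned identity, in case the account name alone isn't enough in your setup.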

Enough detouring (although it wasn't without a purpose 😜) - it's time to deploy Application Insights. As classic Application Insights is retired, we are going to create a workspace-based instance.

resource logAnalyticsWorkspace 'Microsoft.OperationalInsights/workspaces@2023-09-01' = {
  name: 'log-monitored-function'
  location: resourceGroup().location
  properties: {
    sku: { 
      name: 'PerGB2018' 
    }
  }
}

resource applicationInsights 'Microsoft.Insights/components@2020-02-02' = {
  name: 'appi-monitored-function'
  location: resourceGroup().location
  kind: 'web'
  properties: {
    Application_Type: 'web'
    WorkspaceResourceId: logAnalyticsWorkspace.id
    DisableLocalAuth: true
  }
}

You might have noticed that I've set DisableLocalAuth to true. This is a security improvement. It enforces Entra ID authentication for ingestion and, as a result, makes the InstrumentationKey a resource identifier instead of a secret. This is nice, and we can easily handle it because we already have a managed identity in place (I told you it had a purpose 😊). All we need to do is grant the Monitoring Metrics Publisher role to our managed identity.

resource monitoringMetricsPublisherRoleDefinition 'Microsoft.Authorization/roleDefinitions@2022-04-01' existing = {
  name: '3913510d-42f4-4e42-8a64-420c390055eb' // Monitoring Metrics Publisher
  scope: subscription()
}

resource monitoringMetricsPublisherRoleAssignment 'Microsoft.Authorization/roleAssignments@2022-04-01' = {
  scope: applicationInsights
  name: guid(applicationInsights.id, managedIdentity.id, monitoringMetricsPublisherRoleDefinition.id)
  properties: {
    roleDefinitionId: monitoringMetricsPublisherRoleDefinition.id
    principalId: managedIdentity.properties.principalId
    principalType: 'ServicePrincipal'
  }
}

Adding two application settings definitions to our function app will tie the services together.

resource functionApp 'Microsoft.Web/sites@2024-04-01' = {
  ...
  properties: {
    ...
    siteConfig: {
      ...
      appSettings: [
        ...
        {
          name: 'APPLICATIONINSIGHTS_CONNECTION_STRING'
          value: applicationInsights.properties.ConnectionString
        }
        {
          name: 'APPLICATIONINSIGHTS_AUTHENTICATION_STRING'
          value: 'ClientId=${managedIdentity.properties.clientId};Authorization=AAD'
        }
        ...
      ]
    }
    httpsOnly: true
  }
}

Are we done with the infrastructure? Not really. There is one more useful thing that is often forgotten - enabling storage logs. There is some important function app data in that storage account, so it would be nice to monitor it.

resource storageAccountBlobService 'Microsoft.Storage/storageAccounts/blobServices@2023-05-01' existing = {
  name: 'default'
  parent: storageAccount
}

resource storageAccountDiagnosticSettings 'Microsoft.Insights/diagnosticSettings@2021-05-01-preview' = {
  name: '${storageAccount.name}-diagnostic'
  scope: storageAccountBlobService
  properties: {
    workspaceId: logAnalyticsWorkspace.id
    logs: [
      {
        category: 'StorageWrite'
        enabled: true
      }
    ]
    metrics: [
      {
        category: 'Transaction'
        enabled: true
      }
    ]
  }
}

Now we are done and our infrastructure should look like below.

Infrastructure for Azure Functions with Application Insights (managed identity, application insights, log workspace, storage account, app service plan, and function app)

Once we deploy our function and make some requests, we can find the traces in Application Insights - thanks to the codeless monitoring performed by the host and the relaying of worker logs through the host.

Relayed worker logs ingested through host codeless monitoring

You may be asking what I mean by "relaying worker logs through the host". You probably remember that in the case of C# Azure Functions in the isolated worker model, we have two processes: the functions host and the isolated worker. Azure Functions wants to be helpful and, by default, sends the logs from the worker process to the functions host process, which then sends them to Application Insights.

Azure Functions relaying worker logs through the host

This is nice, but may not be exactly what you want. You may want to split the logs so you can treat them separately (for example by configuring different default log levels for the host and the worker). To achieve that you must explicitly set up Application Insights integration in the worker code.

var host = new HostBuilder()
    .ConfigureFunctionsWebApplication()
    .ConfigureServices(services => {
        services.AddApplicationInsightsTelemetryWorkerService();
        services.ConfigureFunctionsApplicationInsights();
    })
    .Build();

host.Run();

The telemetry from the worker can now be controlled in the code, while the telemetry from the host is still controlled through host.json. But if you deploy this, you will find no telemetry from the worker in Application Insights. You can still see your logs in the Log stream. You will also see a lot of entries like this:

Azure.Identity: Service request failed.
Status: 400 (Bad Request)

Content:
{"statusCode":400,"message":"Unable to load the proper Managed Identity.","correlationId":"..."}

That's the Application Insights SDK not being able to use the user-assigned managed identity. The Azure Functions runtime uses the APPLICATIONINSIGHTS_AUTHENTICATION_STRING setting to provide a user-assigned managed identity OAuth token to Application Insights, but none of that (at the time of writing this) happens when we set up the integration explicitly. That said, we can do it ourselves. The parser for the setting is available in the Microsoft.Azure.WebJobs.Logging.ApplicationInsights package, so we can mimic the host implementation.

using TokenCredentialOptions =
    Microsoft.Azure.WebJobs.Logging.ApplicationInsights.TokenCredentialOptions;

var host = new HostBuilder()
    ...
    .ConfigureServices(services => {
        services.Configure<TelemetryConfiguration>(config =>
        {
            string? authenticationString =
                Environment.GetEnvironmentVariable("APPLICATIONINSIGHTS_AUTHENTICATION_STRING");

            if (!String.IsNullOrEmpty(authenticationString))
            {
                var tokenCredentialOptions = TokenCredentialOptions.ParseAuthenticationString(
                    authenticationString
                );
                config.SetAzureTokenCredential(
                    new ManagedIdentityCredential(tokenCredentialOptions.ClientId)
                );
            }
        });
        services.AddApplicationInsightsTelemetryWorkerService();
        services.ConfigureFunctionsApplicationInsights();
    })
    .Build();

host.Run();

The telemetry should now reach the Application Insights without a problem.

Azure Functions emitting worker logs directly

Be cautious. As the Application Insights SDK now controls emitting the logs, you must be aware of its opinions. It so happens that it tries to optimize by default and adds a logging filter that sets the minimum log level to Warning. You may want to get rid of that (be sure to do it after ConfigureFunctionsApplicationInsights).

...

var host = new HostBuilder()
    ...
    .ConfigureServices(services => {
        ...
        services.ConfigureFunctionsApplicationInsights();
        services.Configure<LoggerFilterOptions>(options =>
        {
            LoggerFilterRule? sdkRule = options.Rules.FirstOrDefault(rule =>
                rule.ProviderName == typeof(Microsoft.Extensions.Logging.ApplicationInsights.ApplicationInsightsLoggerProvider).FullName
            );

            if (sdkRule is not null)
            {
                options.Rules.Remove(sdkRule);
            }
        });
    })
    .Build();

host.Run();
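
With the worker emitting its telemetry on its own (and the SDK's Warning filter removed), the two processes can finally be treated differently. A sketch of what that could look like: the worker's log levels are configured in code via ConfigureLogging, while the host keeps being governed by host.json. The category name below is an assumption based on this sample, not something the original code defines.

var host = new HostBuilder()
    ...
    .ConfigureServices(services => {
        ...
    })
    .ConfigureLogging(logging =>
    {
        // Worker-side filtering, independent of the host's host.json configuration.
        // "MonitoredFunction" is an assumed root namespace of the function classes.
        logging.SetMinimumLevel(LogLevel.Warning);
        logging.AddFilter("MonitoredFunction", LogLevel.Information);
    })
    .Build();

host.Run();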

Modifying Application Map in Application Insights

The default application map is not impressive - it will simply show your function app by its resource name.

Default application map in Application Insights

As function apps rarely exist in isolation, we often want to present them in a more meaningful way on the map by providing a cloud role. The simplest way to do this is through the WEBSITE_CLOUD_ROLENAME setting.

resource functionApp 'Microsoft.Web/sites@2024-04-01' = {
  ...
  properties: {
    ...
    siteConfig: {
      ...
      appSettings: [
        ...
        {
          name: 'WEBSITE_CLOUD_ROLENAME'
          value: 'InstrumentedFunctions'
        }
        ...
      ]
    }
    ...
  }
}

This will impact both the host and the worker. With explicit Application Insights integration, we can separate the two (if there is such a need) and change the cloud role for the worker. This is done in the usual way, by registering an ITelemetryInitializer. The only important detail is the place of the registration - it needs to be after ConfigureFunctionsApplicationInsights, as Azure Functions adds its own initializers there.

public class IsolatedWorkerTelemetryInitializer : ITelemetryInitializer
{
    public void Initialize(ITelemetry telemetry)
    {
        telemetry.Context.Cloud.RoleName = "InstrumentedFunctionsIsolatedWorker";
    }
}
...

var host = new HostBuilder()
    ...
    .ConfigureServices(services => {
        ...
        services.ConfigureFunctionsApplicationInsights();
        services.AddSingleton<ITelemetryInitializer, IsolatedWorkerTelemetryInitializer>();
        services.Configure<LoggerFilterOptions>(options =>
        {
            ...
        });
    })
    .Build();

host.Run();

The resulting map will look as below.

Modified application map in Application Insights

Monitoring for Scaling Decisions

This is currently in preview, but you can enable emitting scale controller logs (for a chance to understand the scaling decisions). This is done with a single setting (SCALE_CONTROLLER_LOGGING_ENABLED) that takes a value in the destination:verbosity format. The possible values for the destination are Blob and AppInsights, while the verbosity can be None, Verbose, or Warning. Verbose seems to be the most useful one when you are looking for understanding, as it's supposed to provide the reason for changes in the worker count and information about the triggers that impact those decisions.

resource functionApp 'Microsoft.Web/sites@2024-04-01' = {
  ...
  properties: {
    ...
    siteConfig: {
      ...
      appSettings: [
        ...
        {
          name: 'SCALE_CONTROLLER_LOGGING_ENABLED'
          value: 'AppInsights:Verbose'
        }
        ...
      ]
    }
    ...
  }
}

Is This Exhaustive?

Certainly not 🙂. There are other things that you can fiddle with. You can have a very granular configuration of different log levels for different categories. You can fine-tune aggregation and sampling. When you are playing with all those settings, please remember that you're looking for the right balance between the gathered details and the costs. My advice is not to configure monitoring for the worst-case scenario (when you need every detail), as that often isn't financially sustainable. Rather, aim for a lower amount of information and change the settings to gather more when needed (a small hint - you can override host.json settings through application settings, without redeploying the code).
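
As an illustration of that last hint, the host's default log level can be changed through an application setting following the AzureFunctionsJobHost__ naming convention - a sketch in the style of the earlier snippets (the same value can also be set straight in the portal, without touching the code or host.json):

resource functionApp 'Microsoft.Web/sites@2024-04-01' = {
  ...
  properties: {
    ...
    siteConfig: {
      ...
      appSettings: [
        ...
        {
          // Overrides logging.logLevel.default from host.json.
          name: 'AzureFunctionsJobHost__logging__logLevel__default'
          value: 'Warning'
        }
        ...
      ]
    }
    ...
  }
}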