Entity Framework 6 - Dynamically creating temporary tables from IQueryable (Part 2 of 2)

In previous post I've shown how a temporary table can be generated based on IQueryable and then reused in subsequent queries. I have also pointed out a serious limitation. Described approach won't work if projection contains columns not coming directly from any tables. In such case the generated query will contain columns names like C2 which are not matching property names. One possible workaround is overriding columns names in configuration class by using HasColumnName("C2") but this is far from perfect. Something more generic would be desired.

What I did know is that after ToTraceString Entity Framework keeps the query plan in cache. As part of this plan there should be information how properties are being mapped to columns. The only thing I was able to find was this StackOverflow question. Apparently Entity Framework is storing only the positions of properties. This actually makes sense as access by index is the primary method provided by ADO.NET, but unfortunately this is not an information which can be used to properly build the SELECT clause. After some thinking I decided I will use those positions to properly order columns in INSERT INTO list.

First thing to change was my extension method for getting the entity columns because now I needed to get the names of the corresponding properties as well.

internal static class DbContextExtensions
{
    internal static IEnumerable<ScalarPropertyMapping> GetEntityPropertyMappings<TEntity>(this DbContext dbContext)
    {
        // Get the metadata
        MetadataWorkspace metadata = ((IObjectContextAdapter)dbContext).ObjectContext.MetadataWorkspace;

        // Get the space within the metadata which contains information about CLR types
        ObjectItemCollection clrSpace = ((ObjectItemCollection)metadata.GetItemCollection(DataSpace.OSpace));

        // Get the entity type from the metadata that maps to the CLR type
        EntityType entityEntityType = metadata.GetItems<EntityType>(DataSpace.OSpace).Single(e => clrSpace.GetClrType(e) == typeof(TEntity));

        // Get the entity set that uses this entity type
        EntitySet entityEntitySet = metadata.GetItems<EntityContainer>(DataSpace.CSpace).Single().EntitySets.Single(s => s.ElementType.Name == entityEntityType.Name);

        // Get the mapping between conceptual and storage model for this entity set
        EntitySetMapping entityEntitySetMapping = metadata.GetItems<EntityContainerMapping>(DataSpace.CSSpace).Single().EntitySetMappings.Single(m => m.EntitySet == entityEntitySet);

        // Get the entity columns
        return entityEntitySetMapping.EntityTypeMappings.Single().Fragments.Single().PropertyMappings.OfType<ScalarPropertyMapping>();
    }
}

That was the easy part. Next step was getting property positions out of the IQueryable. Based on the answer to the question mentioned above I have extended my ObjectQuery extensions.

internal static class ObjectQueryExtensions
{
    internal static ObjectQuery<TQueryProjection> GetObjectQuery<TQueryProjection>(this IQueryable<TQueryProjection> query)
    {
        object internalQuery = GetField(query, "_internalQuery");

        ObjectQuery<TQueryProjection> objectQuery = GetField(internalQuery, "_objectQuery") as ObjectQuery<TQueryProjection>;

        return objectQuery;
    }

    internal static IReadOnlyDictionary<string, int> GetQueryPropertyPositions<TQueryProjection>(this IQueryable<TQueryProjection> query)
    {
        ObjectQuery<TQueryProjection> objectQuery = query.GetObjectQuery();

        objectQuery.ToTraceString();

        return GetQueryPropertyPositions(objectQuery);
    }

    internal static IReadOnlyDictionary<string, int> GetQueryPropertyPositions<TQueryProjection>(this ObjectQuery<TQueryProjection> objectQuery)
    {
        IDictionary<string, int> propertyPositions = new Dictionary<string, int>();

        // Get the query state.
        object objectQueryState = GetProperty(objectQuery, "QueryState");

        // Get the cached query execution plan.
        object cachedPlan = GetField(objectQueryState, "_cachedPlan");

        // Get the command definition.
        object commandDefinition = GetField(cachedPlan, "CommandDefinition");

        // Get the column map generator.
        Array columnMapGenerators = GetField(commandDefinition, "_columnMapGenerators") as Array;
        object columnMapGenerator = ((columnMapGenerators != null) && (columnMapGenerators.Length == 1)) ? columnMapGenerators.GetValue(0) : null;

        // Get the column map.
        object columnMap = GetField(columnMapGenerator, "_columnMap");

        // get the record column map.
        object columnMapElement = GetProperty(columnMap, "Element");

        // Get column map properties.
        Array properties = GetProperty(columnMapElement, "Properties") as Array;
        if (properties != null)
        {
            for (int propertyIndex = 0; propertyIndex < properties.Length; propertyIndex++)
            {
                object property = properties.GetValue(propertyIndex);
                propertyPositions.Add(GetProperty(property, "Name") as String, (int)GetProperty(property, "ColumnPos"));
            }
        }

        return new ReadOnlyDictionary<string, int>(propertyPositions);
    }

    private static object GetProperty(object objectInstance, string propertyName)
    {
        object propertyValue = null;

        if (objectInstance != null)
        {
            PropertyInfo property = objectInstance.GetType().GetProperty(propertyName, BindingFlags.NonPublic | BindingFlags.Instance);
            if (property != null)
            {
                propertyValue = property.GetValue(objectInstance, new object[0]);
            }
        }

        return propertyValue;
    }

    private static object GetField(object objectInstance, string fieldName)
    {
        object fieldValue = null;

        if (objectInstance != null)
        {
            FieldInfo field = objectInstance.GetType().GetField(fieldName, BindingFlags.NonPublic | BindingFlags.Instance);

            if (field != null)
            {
                fieldValue = field.GetValue(objectInstance);
            }
        }

        return fieldValue;
    }
}

The change to actual temporary table creation code was pretty straightforward with usage of the methods above. Some ordering and couple StringBuilders did the trick.

public IQueryable<TTemporaryEntity> AsTemporarySnapshotQueryable<TQueryProjection, TTemporaryEntity>(IQueryable<TQueryProjection> query)
{
    IQueryable<TTemporaryEntity> snapshotQueryable = null;

    ObjectQuery<TQueryProjection> temporarySnapshotObjectQuery = query.GetObjectQuery();
    IDictionary<string, EdmProperty> temporarySnapshotColumns = _dBContext.GetEntityPropertyMappings<TTemporaryEntity>().ToDictionary(p => p.Property.Name, p => p.Column);


    if ((temporarySnapshotObjectQuery != null) && temporarySnapshotColumns.Any())
    {
        string temporarySnapshotTableName = "#" + typeof(TTemporaryEntity).Name;
        string temporarySnapshotQuerySql = temporarySnapshotObjectQuery.ToTraceString();
        IOrderedEnumerable<KeyValuePair<string, int>> temporarySnapshotObjectQueryColumnsPositions = temporarySnapshotObjectQuery.GetQueryPropertyPositions().OrderBy(cp => cp.Value);

        StringBuilder temporarySnapshotCreateColumnsListBuilder = new StringBuilder();
        StringBuilder temporarySnapshotFillColumnsListBuilder = new StringBuilder();
        foreach (KeyValuePair<string, int> temporarySnapshotObjectQueryColumnPosition in temporarySnapshotObjectQueryColumnsPositions)
        {
            EdmProperty temporarySnapshotColumn = temporarySnapshotColumns[temporarySnapshotObjectQueryColumnPosition.Key];

            temporarySnapshotCreateColumnsListBuilder.Append(GetTemporarySnapshotColumnCreateSql(temporarySnapshotColumn));
            temporarySnapshotFillColumnsListBuilder.AppendFormat("[{0}],", temporarySnapshotColumn.Name);
        }
        temporarySnapshotCreateColumnsListBuilder.Length -= 1;
        temporarySnapshotFillColumnsListBuilder.Length -= 1;

        // We need to handle "1 AS [C1]" column here
        if (temporarySnapshotObjectQueryColumnsPositions.First().Value == 1)
        {
            temporarySnapshotCreateColumnsListBuilder.Insert(0, "[RESERVED_EF_INTERNAL] INT,");
            temporarySnapshotFillColumnsListBuilder.Insert(0, "[RESERVED_EF_INTERNAL],");
        }

        string temporarySnapshotCreateSqlCommand = String.Format("IF OBJECT_ID('tempdb..{0}') IS NOT NULL BEGIN DROP TABLE {0} END{1}CREATE TABLE {0} ({2})", temporarySnapshotTableName, Environment.NewLine, temporarySnapshotCreateColumnsListBuilder);
        string temporarySnapshotFillSqlCommand = String.Format("INSERT INTO {0}({1}) (SELECT * FROM ({2}) AS [TemporarySnapshotQueryable])", temporarySnapshotTableName, temporarySnapshotFillColumnsListBuilder, temporarySnapshotQuerySql);
        object[] temporarySnapshotFillSqlCommandParameters = temporarySnapshotObjectQuery.Parameters.Select(parameter => new SqlParameter(parameter.Name, parameter.Value)).ToArray();

        // We are opening connection manually here because since Entity Framework 6 it will not be automatically closed until context disposal - this way the temporary table will be visible for other queries.
        _dbContext.Database.Connection.Open();
        _dbContext.Database.ExecuteSqlCommand(temporarySnapshotCreateSqlCommand);
        _dbContext.Database.ExecuteSqlCommand(temporarySnapshotFillSqlCommand, temporarySnapshotFillSqlCommandParameters);

        snapshotQueryable = _dbContext.Set<TTemporaryEntity>().AsNoTracking().AsQueryable();
    }

    return snapshotQueryable;
}

One thing worth explaining in the code above is the if statement commented by 'We need to handle "1 AS [C1]" column here'. If you happen to look at SQL Server profiler log when Entity Framework queries are being performed you would notice that for a lot of them 1 AS [C1] column is being generated. To my knowledge this allows Entity Framework to distinguish between empty tables and selecting only nullable fields (especially for sub queries scenarios), but for our use case this is one more thing we have to account for. Above code does it in simplest way possible - if the properties columns positions start at 1 it will create a dummy column which will store the 1 AS [C1].

This modified solution has less limitations than previous one but in order to achieve it a number of "internal" dependencies had to be introduced. This can easily be broken just by updating Entity Framework. The solution should be somehow protected against this. The absolute minimum which should be done are unit tests which will guard the most important aspects.

Unit Testing

The thing that can be broken in easiest way is the GetQueryPropertyPositions method as it is entirely based on non-public API. So first test should check if that method returns anything.

[TestMethod]
public void TemporarySnapshot_GetStudentsIdentities_GetQueryPropertyPositions_ReturnsNonEmptyCollection()
{
    using (var temporarySnapshotTestContext = new FakeDbContext())
    {
        StudentsManager studentsManager = PrepareStudentsManager(temporarySnapshotTestContext);
        IQueryable<StudentIdentity> studentsIdentitiesQueryable = studentsManager.GetStudentsIdentities(...);

        IReadOnlyDictionary<string, int> studentsIdentitiesQueryableColumns = studentsIdentitiesQueryable.GetQueryPropertyPositions();

        Assert.IsTrue((studentsIdentitiesQueryableColumns != null) && (studentsIdentitiesQueryableColumns.Any()));
    }
}

For consistency the GetEntityPropertyMappings method should be checked in same way.

[TestMethod]
public void TemporarySnapshot_GetEntityPropertyMappings_ReturnsNonEmptyCollection()
{
    using (var temporarySnapshotTestContext = new FakeDbContext())
    {
        IEnumerable<ScalarPropertyMapping> temporarySnapshotColumns = temporarySnapshotTestContext.GetEntityPropertyMappings<TemporaryStudentIdentity>();

        Assert.IsTrue((temporarySnapshotColumns != null) && (temporarySnapshotColumns.Any()));
    }
}

Now we can test if (at least theoretically) the code can create correct temporary table. There are four preconditions which are absolute minimum for that:

  • The columns positions returned from GetQueryPropertyPositions must be in a sequence
  • First column index has to be either 0 or 1
  • All properties from entity representing temporary table exist in the query
  • The number of property mappings in entity and query is the same

Following unit tests cover those requirements.

[TestMethod]
public void TemporarySnapshot_GetStudentsIdentities_GetQueryPropertyPositions_ColumnsInSequence()
{
    bool inSequence = false;

    using (var temporarySnapshotTestContext = new FakeDbContext())
    {
        StudentsManager studentsManager = PrepareStudentsManager(temporarySnapshotTestContext);
        IQueryable<StudentIdentity> studentsIdentitiesQueryable = studentsManager.GetStudentsIdentities(...);

        IReadOnlyDictionary<string, int> studentsIdentitiesQueryableColumns = studentsIdentitiesQueryable.GetQueryPropertyPositions();

        IEnumerable<int> orderedPositions = studentsIdentitiesQueryableColumns.Values.OrderBy(position => position);
        inSequence = orderedPositions.Zip(orderedPositions.Skip(1), (previousPosition, position) => (previousPosition + 1) == position).All(positionsInSequence => positionsInSequence);
    }

    Assert.IsTrue(inSequence);
}

[TestMethod]
public void TemporarySnapshot_GetStudentsIdentities_GetQueryPropertyPositions_CorrectFirstColumnIndex()
{
    bool correctFirstColumnIndex = false;

    using (var temporarySnapshotTestContext = new FakeDbContext())
    {
        StudentsManager studentsManager = PrepareStudentsManager(temporarySnapshotTestContext);
        IQueryable<StudentIdentity> studentsIdentitiesQueryable = studentsManager.GetStudentsIdentities(...);

        IReadOnlyDictionary<string, int> studentsIdentitiesQueryableColumns = studentsIdentitiesQueryable.GetQueryPropertyPositions();

        int firstPosition = studentsIdentitiesQueryableColumns.Values.OrderBy(position => position).First();
        correctFirstColumnIndex = (firstPosition == 0) || (firstPosition == 1);
    }

    Assert.IsTrue(correctFirstColumnIndex);
}

[TestMethod]
public void TemporarySnapshot_GetEntityPropertyMappings_GetSupervisees_RetrievePhoto_CheckIsSupervisor_AllColumnsExists()
{
    bool allColumnsExists = true;

    using (var temporarySnapshotTestContext = new FakeDbContext())
    {
        StudentsManager studentsManager = PrepareStudentsManager(temporarySnapshotTestContext);
        IQueryable<StudentIdentity> studentsIdentitiesQueryable = studentsManager.GetStudentsIdentities(...);

        IReadOnlyDictionary<string, int> studentsIdentitiesQueryableColumns = studentsIdentitiesQueryable.GetQueryPropertyPositions();
        IEnumerable<ScalarPropertyMapping> temporarySnapshotColumns = temporarySnapshotTestContext.GetEntityPropertyMappings<TemporaryStudentIdentity>();

        foreach (ScalarPropertyMapping temporarySnapshotColumn in temporarySnapshotColumns)
        {
            allColumnsExists = allColumnsExists && studentsIdentitiesQueryableColumns.ContainsKey(temporarySnapshotColumn.Property.Name);
        }
    }

    Assert.IsTrue(allColumnsExists);
}

[TestMethod]
public void TemporarySnapshot_GetEntityPropertyMappings_GetStudentsIdentities_GetQueryPropertyPositions_EqualColumnsCount()
{
    using (var temporarySnapshotTestContext = new FakeDbContext())
    {
        StudentsManager studentsManager = PrepareStudentsManager(temporarySnapshotTestContext);
        IQueryable<StudentIdentity> studentsIdentitiesQueryable = studentsManager.GetStudentsIdentities(...);

        IReadOnlyDictionary<string, int> studentsIdentitiesQueryableColumns = studentsIdentitiesQueryable.GetQueryPropertyPositions();
        IEnumerable<ScalarPropertyMapping> temporarySnapshotColumns = temporarySnapshotTestContext.GetEntityPropertyMappings<TemporaryStudentIdentity>();

        Assert.AreEqual(temporarySnapshotColumns.Count(), studentsIdentitiesQueryableColumns.Count);
    }
}

Conclusion

The solution is complete and lets face it - it is "hacky". It solves a very specific problem and comes with cost of relying on Entity Framework internals. Even with unit tests there are maintainability risks here. Still, at least for the specific case mentioned, it was worth it. Also the journey to the solution was interesting and satisfactory one.