-
Notifications
You must be signed in to change notification settings - Fork 249
Description
I am not sure if this is an EF issue or a efcore.pg issue so I am starting here.
We have an entity with a primitive collection represented as a Postgres array column. We have found an issue in EF Core wherein the collection order is not preserved by EF. In our case, the incorrect ordering seems dependent upon various combinations of:
OrderByon the main query- Number of entities returned by the main query
- Projection of the primitive collection to a DTO or just
ToList - Number of items in the primitive collections
In our actual application any one of the first three items seems to have an impact; we whittled our queries down such that we were only dealing with one of 1, 2, or 3 at a time and each would reproduce depending on the number of items in the related collections (4).
For the sake of this bug report, I've boiled it down to a minimal reproducible example (see below). However I've only been able to reproduce it when OrderBy is involved. I've not be able to reproduce it otherwise, so will continue trying to reduce our real application down until I can.
Entity and DTO Types
public sealed class Thing
{
public int Id { get; set; }
public string Name { get; set; } = null!;
public int[] Related { get; set; } = [ ];
}
public sealed class Dto
{
public int Id { get; set; }
}Context
public sealed class TestContext : DbContext
{
public DbSet<Thing> Things => Set<Thing>();
private const string Conn = "Host=localhost;Port=5432;Database=arraytest;Username=postgres;Password=postgres;Include Error Detail=true";
private static readonly DbDataSource Datatsource = new Npgsql.NpgsqlDataSourceBuilder(Conn)
.Build();
protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
{
optionsBuilder.UseNpgsql(Datatsource, e => e.SetPostgresVersion(15, 0))
.EnableDetailedErrors()
.EnableSensitiveDataLogging()
.LogTo(s => Debug.WriteLine(s));
}
protected override void OnModelCreating(ModelBuilder modelBuilder)
{
base.OnModelCreating(modelBuilder);
modelBuilder.Entity<Thing>(e =>
{
e.ToTable("things");
e.HasKey(p => p.Id);
e.Property(p => p.Id)
.HasColumnName("id")
.UseIdentityByDefaultColumn();
e.Property(p => p.Name)
.HasColumnName("name")
.IsRequired()
.HasColumnType("text");
// primitive collection has the issue
e.PrimitiveCollection(p => p.Related)
.HasColumnName("related")
.HasColumnType("integer[]")
.HasDefaultValueSql("ARRAY[]::integer[]")
.IsRequired()
.ElementType(i => i.IsRequired());
// As does defining this way instead
// e.Property(p => p.Related)
// .HasColumnName("related")
// .HasColumnType("integer[]")
// .HasDefaultValueSql("ARRAY[]::integer[]");
e.HasIndex(p => p.Related)
.HasMethod("gin");
});
}
}Here is the application code/reproduction.
Application code
using Microsoft.EntityFrameworkCore;
using System.Data.Common;
using System.Diagnostics;
var alt = false;
await using (var ctx = new TestContext())
{
await ctx.Database.EnsureDeletedAsync();
await ctx.Database.EnsureCreatedAsync();
}
await using (var ctx = new TestContext())
{
var a = new Thing
{
Id = 1,
Name = "Thing A",
Related = [1,2,3]
};
var b = new Thing
{
Id = 2,
Name = "B Thing",
Related = [3,2]
};
var c = new Thing
{
Id = 3,
Name = "Empty Thing",
Related = [ ]
};
var d = new Thing
{
Id = 4,
Name = "Third Thing",
Related = [ 4,5,6,7,8 ]
};
ctx.Add(a);
ctx.Add(b);
if (alt)
ctx.Add(c);
ctx.Add(d);
await ctx.SaveChangesAsync();
}
await using (var ctx = new TestContext())
{
WriteSpacer("#1 Correct (OrderBy + Basic Projection)");
var res1 = await ctx.Things
.OrderBy(e => e.Name)
.Select(e => new { e.Id, e.Related })
.ToListAsync();
foreach (var output in res1)
Console.WriteLine("{0} - {1}", output.Id, string.Join("-", output.Related));
WriteSpacer($"#2 {(alt ? "Correct?!?!": "Incorrect")} (OrderBy + ProjectRelated)");
var res2 = await ctx.Things
.OrderBy(e => e.Name)
.Select(e => new { e.Id, Related = e.Related.Select(r => new Dto { Id=r }).ToArray() })
.ToListAsync();
foreach (var output in res2)
Console.WriteLine("{0} - {1}", output.Id, string.Join("-", output.Related.Select(o => o.Id)));
WriteSpacer("#3 Correct (ProjectRelated)");
var res3 = await ctx.Things
.Select(e => new { e.Id, Related = e.Related.Select(r => new Dto { Id=r }).ToArray() })
.ToListAsync();
foreach (var output in res3)
Console.WriteLine("{0} - {1}", output.Id, string.Join("-", output.Related.Select(o => o.Id)));
WriteSpacer($"#4 {(alt ? "Correct?!?!": "Incorrect")} (OrderBy + ToList)");
var res4 = await ctx.Things
.OrderBy(e => e.Name)
.Select(e => new { e.Id, r = e.Related.ToList() })
.ToListAsync();
foreach (var output in res4)
Console.WriteLine("{0} - {1}", output.Id, string.Join("-", output.r));
WriteSpacer("#5 Always Incorrect (TotalItems + OrderBy + ToList)");
var res5 = await ctx.Things
.Where(e => e.Id != 3)
.OrderBy(e => e.Name)
.Select(e => new { e.Id, r = e.Related.ToList() })
.ToListAsync();
foreach (var output in res5)
Console.WriteLine("{0} - {1}", output.Id, string.Join("-", output.r));
}
void WriteSpacer(string title)
{
Console.WriteLine("");
Console.WriteLine($"=== {title} ".PadRight(70, '='));
Console.WriteLine("");
}Run the code once, as is. The expected result is that the output of the primitive collection matches the original order, despite the ordering of the main query itself. Meaning that (order of this list aside), the values should always be print to the console as:
1 - 1-2-3
2 - 3-2
4 - 4-5-6-7-8
Instead, under result sets #2, #4, and #5 you will see that the ordering for items with Id's 1 and 4 are incorrectly output as:
1 - 2-3-1
4 - 8-4-5-6-7
Now run the code again, this time with the alt variable set to true. This adds a fourth entity to the database with an empty primitive collection. The outputs of tests #2 and #4 are as expected this time, while #5 remains incorrect.
Output with alt=false
=== #1 Correct (OrderBy + Basic Projection) ==========================
2 - 3-2
1 - 1-2-3
4 - 4-5-6-7-8
=== #2 Incorrect (OrderBy + ProjectRelated) ==========================
2 - 3-2
1 - 2-3-1
4 - 8-4-5-6-7
=== #3 Correct (ProjectRelated) ======================================
1 - 1-2-3
2 - 3-2
4 - 4-5-6-7-8
=== #4 Incorrect (OrderBy + ToList) ==================================
2 - 3-2
1 - 2-3-1
4 - 8-4-5-6-7
=== #5 Always Incorrect (TotalItems + OrderBy + ToList) ==============
2 - 3-2
1 - 2-3-1
4 - 8-4-5-6-7
Output with alt=true
=== #1 Correct (OrderBy + Basic Projection) ==========================
2 - 3-2
3 -
1 - 1-2-3
4 - 4-5-6-7-8
=== #2 Correct?!?! (OrderBy + ProjectRelated) ========================
2 - 3-2
3 -
1 - 1-2-3
4 - 4-5-6-7-8
=== #3 Correct (ProjectRelated) ======================================
1 - 1-2-3
2 - 3-2
3 -
4 - 4-5-6-7-8
=== #4 Correct?!?! (OrderBy + ToList) ================================
2 - 3-2
3 -
1 - 1-2-3
4 - 4-5-6-7-8
=== #5 Always Incorrect (TotalItems + OrderBy + ToList) ==============
2 - 3-2
1 - 2-3-1
4 - 8-4-5-6-7
We were getting random reports of incorrect ordering from customers. Two of us spent a combined 48 hours trying to debug into this and figure out what was going on. Every time we thought we had a solution, a new test case would start failing because ultimately it all seemed dependent on ordering, filtering, and total counts. We ended up working around this by writing a Postgres function that preserved and sorted by the Ordinality of the array collection, unnesting it into the type we were trying to project to.
While we have the workaround, we are concerned the issue may creep up in other forms. We'd also like to ultimately remove the workaround since a function manually added into migrations won't benefit from changes to the model.
Versions used
- Npgsql.EntityFrameworkCore.PostgreSQL - 8.0.4
- Npgsql - 8.0.3
- Microsoft.EntityFrameworkCore - 8.0.6
- .NET 8