When I create a query on a DbSet with too many Concat or Where clauses, I get a stack overflow error.
Essentially, I have a list of thousands of AND clauses, all connected with OR clauses. It looks something like this:
    (A AND B) OR (C AND D) OR ...
The clauses are built from a list, so the number of AND clauses joined by OR is dynamic and can be anywhere from 0 to thousands.
I tried creating a select for each AND clause and using Concat to combine the selects in Entity Framework, but I get a stack overflow exception.
I feel like there should be a better way to write this, but I'm not sure what it is, so I've included the error and some example code in the hope that someone knows how to do this without falling back to inline SQL (which goes against the Entity Framework paradigm).
The exact error is as follows:
Stack overflow.
Repeat 798 times:
--------------------------------
at Microsoft.EntityFrameworkCore.Query.Internal.ExpressionTreeFuncletizer.Visit[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](System.Collections.ObjectModel.ReadOnlyCollection`1<System.__Canon>, System.Func`2<System.__Canon,System.__Canon>, StateType ByRef, State[] ByRef, Boolean)
at Microsoft.EntityFrameworkCore.Query.Internal.ExpressionTreeFuncletizer.Visit(System.Collections.ObjectModel.ReadOnlyCollection`1<System.Linq.Expressions.Expression>, StateType ByRef, State[] ByRef, Boolean)
at Microsoft.EntityFrameworkCore.Query.Internal.ExpressionTreeFuncletizer.VisitMethodCall(System.Linq.Expressions.MethodCallExpression)
at Microsoft.EntityFrameworkCore.Query.Internal.ExpressionTreeFuncletizer.Visit(System.Linq.Expressions.Expression)
--------------------------------
at Microsoft.EntityFrameworkCore.Query.Internal.ExpressionTreeFuncletizer.Visit[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](System.Collections.ObjectModel.ReadOnlyCollection`1<System.__Canon>, System.Func`2<System.__Canon,System.__Canon>, StateType ByRef, State[] ByRef, Boolean)
at Microsoft.EntityFrameworkCore.Query.Internal.ExpressionTreeFuncletizer.Visit(System.Collections.ObjectModel.ReadOnlyCollection`1<System.Linq.Expressions.Expression>, StateType ByRef, State[] ByRef, Boolean)
Libraries:
EFCore.BulkExtensions.PostgreSql Version="8.1.2"
Microsoft.EntityFrameworkCore.Design Version="9.0.0"
Npgsql.EntityFrameworkCore.PostgreSQL Version="9.0.2"
Npgsql.EntityFrameworkCore.PostgreSQL.Design Version="1.1.0"
Here's an example of how to reproduce the issue:
    private class SimpleDbContext : DbContext
    {
        // Stores the values 0, 1, 2, ..., 100000
        public virtual DbSet<SequencePoint> SequencePoints { get; set; }
    }

    private class SequencePoint
    {
        public int SequenceNumber { get; set; }
    }

    private void ConcatErrorTest()
    {
        SimpleDbContext simpleDbContext = new();
        List<Tuple<int, int>> selectRanges = new(); // [0, 0], [10, 11], [20, 22], [30, 33], etc...
        for (int i = 0; i < 7500; i++)
        {
            int startRange = i * 10;
            int endRange = startRange + (i % 5);
            selectRanges.Add(new Tuple<int, int>(startRange, endRange));
        }

        // Build one Where query per range and chain them all together with Concat.
        IQueryable<SequencePoint> queryable = null;
        foreach (Tuple<int,int> selectRange in selectRanges)
        {
            IQueryable<SequencePoint> whereQueryable = simpleDbContext.SequencePoints.AsQueryable().Where(point =>
                (point.SequenceNumber >= selectRange.Item1) &&
                (point.SequenceNumber <= selectRange.Item2)
            );
            queryable = queryable == null ? whereQueryable : queryable.Concat(whereQueryable);
        }

        // Throws Stack overflow.
        List<int> result = queryable.Select(sequenceNumber => sequenceNumber.SequenceNumber).ToList();
        _logger.LogInformation("result = {result}", result);
    }
edited Feb 7 at 7:32 by DarkBee
asked Feb 7 at 4:25 by Peter
2 Answers
Not only is EF Core going to struggle with this, the database itself will struggle too. You need to rethink your whole approach.
Use an array or JSON parameter to pass in your data, and do a single .Where on your table.
    private async Task ConcatErrorTest()
    {
        using SimpleDbContext simpleDbContext = new();
        List<int> selectRanges = new(); // Range starts only: 0, 10, 20, 30, etc...
        for (int i = 0; i < 7500; i++)
        {
            int startRange = i * 10;
            selectRanges.Add(startRange);
        }

        var array = selectRanges.ToArray();
        IQueryable<SequencePoint> queryable = simpleDbContext.SequencePoints
            .Where(point =>
                array.Any(i =>
                    point.SequenceNumber >= i &&
                    // Recomputes the question's end of range from its start: start + (index % 5)
                    point.SequenceNumber <= i + i / 10 % 5
                )
            );

        List<int> result = await queryable
            .Select(sequenceNumber => sequenceNumber.SequenceNumber)
            .ToListAsync();
        _logger.LogInformation("result = {result}", result);
    }
Note also the addition of using and await.
Thanks for the responses.
In my case the example problem here was a simplification of my actual problem, but the responses helped me get on the right track.
The most idiomatic Entity Framework approach was to store the ranges in another table and query both tables with multiple from clauses. Note that a LINQ join would not work here: the match is a range comparison, and the LINQ join clause only supports equijoins.
Here's what the code would look like:
    private class SimpleDbContext : DbContext
    {
        // Stores the values 0, 1, 2, ..., 100000
        public virtual DbSet<SequencePoint> SequencePoints { get; set; }

        // Stores [lower, upper] pairs: [0, 0], [10, 11], [20, 22], etc...
        public virtual DbSet<SelectionRange> SelectionRanges { get; set; }
    }

    private class SequencePoint
    {
        public int SequenceNumber { get; set; }
    }

    private class SelectionRange
    {
        public int LowerRange { get; set; }
        public int UpperRange { get; set; }
    }

    private void ConcatReworked()
    {
        using SimpleDbContext simpleDbContext = new();

        // Two from clauses pair every point with every range; the where clauses keep the matches.
        IQueryable<SequencePoint> queryable = from sequencePoint in simpleDbContext.SequencePoints
                                              from selectionRange in simpleDbContext.SelectionRanges
                                              where sequencePoint.SequenceNumber >= selectionRange.LowerRange
                                              where sequencePoint.SequenceNumber <= selectionRange.UpperRange
                                              orderby sequencePoint.SequenceNumber
                                              select sequencePoint;

        List<int> result = queryable.Select(sequenceNumber => sequenceNumber.SequenceNumber).ToList();
        _logger.LogInformation("result = {result}", result);
    }
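The shape of this query can be tried without a database by running the same multi-from form over in-memory lists. The sketch below is only an illustration using LINQ to Objects: the names RangeFilterDemo and FilterByRanges are hypothetical, plain ints stand in for SequencePoint, and it says nothing about the SQL that EF Core generates. It also applies Distinct, on the assumption that ranges may overlap, because the cross join otherwise returns a point once per matching range.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class RangeFilterDemo
{
    // Returns the sequence numbers that fall inside at least one range, in ascending order.
    public static List<int> FilterByRanges(
        IEnumerable<int> sequenceNumbers,
        IEnumerable<(int Lower, int Upper)> ranges)
    {
        // Same shape as the EF query: two from clauses filtered by where clauses.
        IEnumerable<int> matches =
            from sequenceNumber in sequenceNumbers
            from range in ranges
            where sequenceNumber >= range.Lower
            where sequenceNumber <= range.Upper
            select sequenceNumber;

        // A point inside two overlapping ranges appears twice, so de-duplicate before sorting.
        return matches.Distinct().OrderBy(n => n).ToList();
    }
}
```

For example, filtering the numbers 0 to 30 with the ranges (0, 0), (10, 11), (20, 22) and (21, 25) gives 0, 10, 11, 20, 21, 22, 23, 24, 25; without Distinct, 21 and 22 would each appear twice.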
The other approach I tried was to write the query manually using a StringBuilder, which looked something along the lines of:
    private async Task LoadSelection(List<SelectionRange> selectionRange)
    {
        using NpgsqlConnection connection = await npgsqlDataSource.OpenConnectionAsync();
        StringBuilder commandSqlBuilder = new();
        commandSqlBuilder.Append("SELECT ");
        // ... More SQL here
        for (int i = 0; i < selectionRange.Count; i++)
        {
            SelectionRange range = selectionRange[i];
            // Each range becomes one OR'ed group with its own pair of named parameters.
            commandSqlBuilder.Append("OR (");
            commandSqlBuilder.Append($"value >= (@range_lower_param_{i})");
            commandSqlBuilder.Append(" AND ");
            commandSqlBuilder.Append($"value <= (@range_upper_param_{i})");
            // ... More SQL here
            commandSqlBuilder.Append(")"); // Close the group opened by "OR ("
        }
        // ... More SQL here
        using NpgsqlCommand command = new NpgsqlCommand(commandSqlBuilder.ToString(), connection);
        // One value is bound for every @range_lower_param_{i} / @range_upper_param_{i} placeholder.
        command.Parameters.AddWithValue(...);
        using NpgsqlDataReader reader = await command.ExecuteReaderAsync();
        while (await reader.ReadAsync())
        {
            // ... Read the columns of the current row here
        }
    }
While this does work, and runs about twice as fast as the join, it has new limitations, such as the error "A statement cannot have more than 65535 parameters". So I needed to split the query into smaller statements and combine the results in code afterwards. Also, when the database names change in a migration, I will need to come back and update this StringBuilder code to reflect the changes.
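The splitting step could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the code from the answer: RangeBatcher, SplitIntoBatches and reservedParameters are hypothetical names, the SelectionRange record is a stand-in for the entity above, and it assumes every range binds exactly two parameters (one lower and one upper bound). Enumerable.Chunk requires .NET 6 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for the SelectionRange entity (hypothetical record for this sketch).
public sealed record SelectionRange(int LowerRange, int UpperRange);

public static class RangeBatcher
{
    // Limit quoted in the error message.
    public const int MaxParametersPerStatement = 65535;

    // Assumption: each range binds one lower-bound and one upper-bound parameter.
    public const int ParametersPerRange = 2;

    // Splits the ranges into batches that each fit into a single statement.
    // reservedParameters is the number of parameters used by the rest of the statement.
    public static List<SelectionRange[]> SplitIntoBatches(
        IReadOnlyCollection<SelectionRange> ranges,
        int reservedParameters = 0)
    {
        int rangesPerBatch =
            (MaxParametersPerStatement - reservedParameters) / ParametersPerRange;
        if (rangesPerBatch < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(reservedParameters));
        }

        // Chunk yields arrays of at most rangesPerBatch items; the last one may be shorter.
        return ranges.Chunk(rangesPerBatch).ToList();
    }
}
```

Each batch would then be passed to its own statement, with the per-batch rows appended to one result list (for example with AddRange). With no reserved parameters, a batch holds at most 32,767 ranges, so 70,000 ranges are split into batches of 32,767, 32,767 and 4,466.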
Comments:
[...] selectRanges? The obvious solution is to find the threshold where the stack overflow occurs, and then, rather than having a single queryable, run the queries in batches and append to result. E.g. batch selectRanges into batches of 100, then run multiple queries, each with 200 parameters in them, one at a time, AddRange-ing into result. This will likely also allow you to avoid stackoverflow.com/questions/1009706/… . – mjwills Commented Feb 7 at 7:20
simpleDbContext.SequencePoints.AsQueryable().Where(point => selectRanges.Any(sr => point.SequenceNumber >= sr.Item1 && point.SequenceNumber <= sr.Item2)) or something along these lines should remove the need for the for loop and the Concat. That still leaves open the option that the db provider will choke when selectRanges is big. – rene Commented Feb 7 at 7:23