Composite specifications

Every once in a while, we get requests for composite specifications. In this issue, I want to elaborate on why we've been reluctant to add them and why we think they might not be the best fit for this library.

First, let's briefly explain the "original" specification pattern. It was proposed in the early 2000s by Eric Evans and Martin Fowler. The core idea is to encapsulate business rules/conditions into separate constructs. Then, using boolean operations (AND/OR/NOT), these atomic specifications can be combined to form composite specifications. Their main output is a boolean result: whether a given object/entity satisfies the specification or not. And that's the crucial point: they're all about criteria.
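To make the classic pattern concrete, here is a minimal sketch of a criteria-only specification with AND/OR/NOT composition. The `Criteria<T>` type, the `Customer` record, and the `CanBuyAlcohol` rule are hypothetical names made up for illustration; none of them exist in this library.

```csharp
using System;

// Hypothetical type for illustration; not part of Ardalis.Specification.
// It wraps a plain delegate, so evaluating it needs no expression compilation.
public sealed class Criteria<T>
{
    private readonly Func<T, bool> _predicate;

    public Criteria(Func<T, bool> predicate) => _predicate = predicate;

    public bool IsSatisfiedBy(T candidate) => _predicate(candidate);

    // Each combinator returns a new criteria that delegates to its operands.
    public Criteria<T> And(Criteria<T> other)
        => new Criteria<T>(c => IsSatisfiedBy(c) && other.IsSatisfiedBy(c));

    public Criteria<T> Or(Criteria<T> other)
        => new Criteria<T>(c => IsSatisfiedBy(c) || other.IsSatisfiedBy(c));

    public Criteria<T> Not()
        => new Criteria<T>(c => !IsSatisfiedBy(c));
}

public record Customer(int Age, bool IsBlocked);

public static class ClassicSpecificationDemo
{
    public static bool CanBuyAlcohol(Customer customer)
    {
        var isAdult = new Criteria<Customer>(c => c.Age >= 21);
        var isBlocked = new Criteria<Customer>(c => c.IsBlocked);

        // Composite rule: adult AND NOT blocked.
        return isAdult.And(isBlocked.Not()).IsSatisfiedBy(customer);
    }
}
```

Note that this sketch only answers "does this object satisfy the rule?". It carries nothing a query provider could translate, which is the key difference from the query specifications described next.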

This library, on the other hand, implements query specifications. The main goal is to extract common queries into separate constructs and apply them across different providers. The intent has been clear since its inception: it was always about queries. Accordingly, the overall design is optimized for this purpose. The aim was to have as little overhead as possible, and we've done tons of optimizations to achieve that. Allocation-wise, as seen from [the benchmarks](https://github.com/ardalis/Specification/blob/main/tests/Ardalis.Specification.Benchmarks/Benchmark2_ToQueryString.cs), we have barely 0.5% overhead, and the difference in execution time is statistically insignificant. We plan to improve this even further in the next versions.

Why am I writing about it? The design that makes this library efficient for queries is the very reason it's not a good fit for in-memory operations. The primary issue is the `state`, or the data. Once you design it for queries and keep the state as expressions, the implementation for the other usage will be "mediocre" at best. Any in-memory operation requires us to compile the expressions we store, and that's a very expensive process. For in-memory collections and the `Evaluate` feature, we cache the delegates locally, so for large collections it might be acceptable. But for the `IsSatisfiedBy` feature, it's quite inefficient: we're compiling expressions just to check a single entity. Even if we ignore the compilation, the cost of merely storing the criteria as an expression (instead of a delegate) is not small at all.

```csharp
Expression<Func<Customer, bool>> criteria = x => x.Age > 18;
```

This seemingly simple expression (no capturing, no closures) allocates ~600 bytes. That's totally acceptable for queries, since users have to create these expressions anyway and there is no additional overhead from our side. However, that's not the case for in-memory operations.
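As a rough sketch of why the in-memory path is costly: criteria stored as an expression must go through `Compile()` before they can be applied to an object. The `ExpressionCriteria<T>` helper below is hypothetical and is not how the library is implemented; it only contrasts compiling on every check with compiling once and reusing the delegate.

```csharp
using System;
using System.Linq.Expressions;

public record Customer(int Age);

// Hypothetical helper for illustration; not a type from Ardalis.Specification.
public sealed class ExpressionCriteria<T>
{
    private readonly Expression<Func<T, bool>> _expression;
    private Func<T, bool>? _compiled;

    public ExpressionCriteria(Expression<Func<T, bool>> expression)
        => _expression = expression;

    // Compiles on every call: the full compilation cost is paid per entity.
    public bool IsSatisfiedByUncached(T entity)
        => _expression.Compile()(entity);

    // Compiles on first use and reuses the delegate afterwards.
    // Not thread-safe as written; a real implementation would need to decide on that.
    public bool IsSatisfiedByCached(T entity)
        => (_compiled ??= _expression.Compile())(entity);
}

public static class CompileCostDemo
{
    public static bool Run()
    {
        var criteria = new ExpressionCriteria<Customer>(x => x.Age > 18);
        var customer = new Customer(30);

        // Both paths give the same answer; only the cost differs.
        return criteria.IsSatisfiedByUncached(customer)
            && criteria.IsSatisfiedByCached(customer);
    }
}
```

Caching only pays off when the same instance is reused many times. A specification that is created, checked against one entity, and discarded still pays for the expression allocation and one full compilation.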

Let me be more concrete and provide some simple benchmarks. The example below is very simple and rudimentary, and yet it allocates about 10 KB of memory. For more complex cases, and for composite specifications, this value will be drastically higher.

| Method         | Mean           | Error          | StdDev      | Ratio     | RatioSD  | Gen0   | Gen1   | Allocated | Alloc Ratio |
|--------------- |---------------:|---------------:|------------:|----------:|---------:|-------:|-------:|----------:|------------:|
| IfStatements   |      0.7186 ns |      0.3452 ns |   0.0189 ns |      1.00 |     0.03 |      - |      - |         - |          NA |
| Specifications | 65,891.1906 ns | 12,552.7049 ns | 688.0561 ns | 91,731.85 | 2,221.37 | 0.7324 | 0.6104 |   10371 B |          NA |

<details>
<summary>Benchmark Code</summary>

```csharp
using Ardalis.Specification;
using BenchmarkDotNet.Attributes;

namespace CompositeSpecifications;

[MemoryDiagnoser]
[ShortRunJob]
public class Benchmark
{
    private Customer _customer = null!;
    private Order _order = null!;

    [GlobalSetup]
    public void Setup()
    {
        _customer = new Customer(30);
        _order = new Order("Alcohol");
    }

    [Benchmark(Baseline = true)]
    public bool IfStatements()
    {
        return _customer.Age >= 21
            && _order.ItemName == "Alcohol";
    }

    [Benchmark]
    public bool Specifications()
    {
        var adultSpec = new AdultCustomerSpec();
        var alcoholSpec = new AlcoholBeveragesSpec();
        return adultSpec.IsSatisfiedBy(_customer)
            && alcoholSpec.IsSatisfiedBy(_order);
    }

    public class AdultCustomerSpec : Specification<Customer>
    {
        public AdultCustomerSpec()
            => Query.Where(c => c.Age >= 21);
    }

    public class AlcoholBeveragesSpec : Specification<Order>
    {
        public AlcoholBeveragesSpec()
            => Query.Where(o => o.ItemName == "Alcohol");
    }

    public record Customer(int Age);
    public record Order(string? ItemName);
}
```
</details>

You may be tempted to say we've done a terrible job. That wouldn't be entirely accurate. Here is another benchmark that shows the pure cost of compiling expressions. Creating and compiling a single, very simple expression allocates about 5 KB of memory. In the previous example we had two of them, hence the roughly 10 KB of allocations. So, nearly all of the cost comes from this one operation.

| Method       | Mean           | Error         | StdDev      | Gen0   | Gen1   | Allocated |
|------------- |---------------:|--------------:|------------:|-------:|-------:|----------:|
| Delegate     |      0.7886 ns |     1.0418 ns |   0.0571 ns |      - |      - |         - |
| Expression   |    266.9091 ns |   100.0398 ns |   5.4835 ns | 0.0458 |      - |     584 B |
| CompiledExpr | 14,349.5107 ns | 1,907.9908 ns | 104.5834 ns | 0.3662 | 0.3357 |    4903 B |

<details>
<summary>Benchmark Code</summary>

```csharp
using BenchmarkDotNet.Attributes;
using System.Linq.Expressions;

namespace CompositeSpecifications;

[MemoryDiagnoser]
[ShortRunJob]
public class Benchmark
{
    public record Customer(int Age);

    [Benchmark]
    public Func<Customer, bool> Delegate()
    {
        return x => x.Age >= 21;
    }

    [Benchmark]
    public Expression<Func<Customer, bool>> Expression()
    {
        return x => x.Age >= 21;
    }

    [Benchmark]
    public Func<Customer, bool> CompiledExpr()
    {
        Expression<Func<Customer, bool>> expr = x => x.Age >= 21;
        var func = expr.Compile();
        return func;
    }
}
```

</details>

And there is not much we can do here. Either we optimize for queries (and that's 95% of our users), or we optimize for in-memory operations.
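For a sense of what composition over expression-based state would involve, here is a minimal sketch of AND-ing two criteria expressions. `ExpressionComposer` and `ParameterReplacer` are hypothetical names, and this is one common technique rather than anything the library ships: the two lambdas have distinct parameters, so one body has to be rewritten before the trees can be merged, and the merged tree still has to be compiled before it can check an entity.

```csharp
using System;
using System.Linq.Expressions;

public record Customer(int Age, bool IsBlocked);

// Hypothetical helper for illustration; not part of Ardalis.Specification.
public static class ExpressionComposer
{
    public static Expression<Func<T, bool>> And<T>(
        Expression<Func<T, bool>> left,
        Expression<Func<T, bool>> right)
    {
        // Both bodies must refer to the same parameter instance,
        // so rewrite the right body to use the left lambda's parameter.
        var parameter = left.Parameters[0];
        var rightBody = new ParameterReplacer(right.Parameters[0], parameter)
            .Visit(right.Body);

        return Expression.Lambda<Func<T, bool>>(
            Expression.AndAlso(left.Body, rightBody), parameter);
    }

    private sealed class ParameterReplacer : ExpressionVisitor
    {
        private readonly ParameterExpression _from;
        private readonly ParameterExpression _to;

        public ParameterReplacer(ParameterExpression from, ParameterExpression to)
        {
            _from = from;
            _to = to;
        }

        protected override Expression VisitParameter(ParameterExpression node)
            => node == _from ? _to : base.VisitParameter(node);
    }
}

public static class ComposeDemo
{
    public static bool Check(Customer customer)
    {
        Expression<Func<Customer, bool>> isAdult = c => c.Age >= 21;
        Expression<Func<Customer, bool>> notBlocked = c => !c.IsBlocked;

        // New tree nodes are allocated for the combined expression,
        // and Compile() is still required for the in-memory check.
        var combined = ExpressionComposer.And(isAdult, notBlocked);
        return combined.Compile()(customer);
    }
}
```

Every composition step adds tree rewriting and allocations on top of the compilation cost measured above, which is the overhead the benchmarks suggest would dominate.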

I hope it's clearer now why we've been so reluctant to expand the in-memory features. Yes, we do expose the `IsSatisfiedBy` functionality. But at this point, that's a niche feature (somewhere on the edge) that no one talks about. If we expand into composite specifications and other related features, that will effectively become the main theme of the library, while offering sub-standard performance. We care very deeply about quality and performance. The epic for [version 9](https://github.com/ardalis/Specification/issues/427) was all about reducing allocations, and we went to great lengths just to save a few bytes.

There are other reasons why we've been avoiding composite specifications. We've elaborated on them on [our FAQ page](https://specification.ardalis.com/getting-started/faq.html#how-do-i-use-composite-specifications). But in this post/issue, I didn't want to focus on all the "subjective" reasons, and instead wanted to focus on a very tangible and objective one.

We'll leave this issue open, and we're keen to hear your opinion. Now that you have more details on this issue, would you still consider using those new features in this library?
