[SqlClient/EFCore] summary optimization phase 2 #3116

stevejgordon · 2025-09-15T14:26:21Z

Contributes to #2238

Changes

This is a rather large refactor of the SqlProcessor. The initial work focused on removing all allocation overhead. Further work optimised the non-cached execution time with Span-based code. Several iterations have led to the code in this PR which is a balance between readability and performance. In all cases, the execution time is reduced compared to the existing code.

Further work has been introduced to support additional SQL keywords and less common statements. This is not exhaustive but builds upon those previously supported.

@alanwest I'll add some PR comments around some specific items that may need discussion. I've included SqlProcessorAdditionalTestCases.json with more statements. We could eventually contribute these into the semantic conventions. These will need a careful review to check we are happy the expected summaries align with the intent from the semantic conventions.

Before

| Method          | Sql                  | Mean     | Error    | StdDev    | Median   | Allocated |
|---------------- |--------------------- |---------:|---------:|----------:|---------:|----------:|
| GetSanitizedSql | CREAT(...)s(Id) [56] | 420.6 ns | 22.18 ns |  60.71 ns | 428.3 ns |     584 B |
| GetSanitizedSql | DELET(...) = 42 [32] | 255.7 ns |  7.11 ns |  20.18 ns | 252.0 ns |     448 B |
| GetSanitizedSql | INSER(...)3e-5) [76] | 514.4 ns | 16.39 ns |  48.33 ns | 513.3 ns |     680 B |
| GetSanitizedSql | SELEC(...)ls od [39] | 439.8 ns | 39.56 ns | 116.65 ns | 374.5 ns |     528 B |
| GetSanitizedSql | SELE(...)tory [111]  | 584.5 ns | 11.55 ns |  24.36 ns | 583.5 ns |    1056 B |
| GetSanitizedSql | SELEC(...)table [69] | 269.9 ns |  8.34 ns |  23.79 ns | 271.8 ns |     600 B |
| GetSanitizedSql | SELEC(...) c.Id [74] | 563.1 ns | 22.71 ns |  66.25 ns | 553.4 ns |     736 B |
| GetSanitizedSql | SELE(...)_id) [101]  | 691.8 ns | 19.53 ns |  56.02 ns | 693.5 ns |     912 B |
| GetSanitizedSql | UPDAT(...) = 42 [44] | 328.1 ns | 11.58 ns |  33.40 ns | 329.8 ns |     504 B |

After

| Method          | Sql                  | Mean     | Error   | StdDev  | Median   | Allocated |
|---------------- |--------------------- |---------:|--------:|--------:|---------:|----------:|
| GetSanitizedSql | CREAT(...)s(Id) [56] | 207.9 ns | 4.12 ns | 7.54 ns | 205.0 ns |     248 B |
| GetSanitizedSql | DELET(...) = 42 [32] | 161.2 ns | 1.90 ns | 1.59 ns | 161.2 ns |     128 B |
| GetSanitizedSql | INSER(...)3e-5) [76] | 254.1 ns | 2.37 ns | 2.22 ns | 253.8 ns |     192 B |
| GetSanitizedSql | SELEC(...)ls od [39] | 146.3 ns | 1.98 ns | 1.65 ns | 145.8 ns |     184 B |
| GetSanitizedSql | SELE(...)tory [111]  | 267.8 ns | 1.41 ns | 1.32 ns | 268.0 ns |     424 B |
| GetSanitizedSql | SELEC(...)table [69] | 106.5 ns | 2.13 ns | 3.45 ns | 104.7 ns |     128 B |
| GetSanitizedSql | SELEC(...) c.Id [74] | 246.1 ns | 1.65 ns | 1.29 ns | 246.3 ns |     264 B |
| GetSanitizedSql | SELE(...)_id) [101]  | 294.1 ns | 2.04 ns | 1.81 ns | 294.1 ns |     312 B |
| GetSanitizedSql | UPDAT(...) = 42 [44] | 196.9 ns | 0.97 ns | 0.86 ns | 197.2 ns |     144 B |

Merge requirement checklist

CONTRIBUTING guidelines followed (license requirements, nullable enabled, static analysis, etc.)
Unit tests added/updated
~~Appropriate CHANGELOG.md files updated for non-trivial changes~~
~~Changes in public API reviewed (if applicable)~~

build/Common.nonprod.props

codecov · 2025-09-15T14:27:43Z

Codecov Report

❌ Patch coverage is 96.43917% with 12 lines in your changes missing coverage. Please review.
✅ Project coverage is 70.18%. Comparing base (fabb87f) to head (007f300).
✅ All tests successful. No failed tests found.

Files with missing lines	Patch %	Lines
src/Shared/SqlProcessor.cs	96.43%	12 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #3116      +/-   ##
==========================================
+ Coverage   70.04%   70.18%   +0.13%     
==========================================
  Files         439      439              
  Lines       16889    17086     +197     
==========================================
+ Hits        11830    11991     +161     
- Misses       5059     5095      +36

Flag	Coverage Δ
unittests-Contrib.Shared.Tests	`86.76% <96.43%> (+3.02%)`	⬆️
unittests-Exporter.Geneva	`53.33% <ø> (-0.45%)`	⬇️
unittests-Exporter.InfluxDB	`95.14% <ø> (ø)`
unittests-Exporter.Instana	`74.86% <ø> (ø)`
unittests-Exporter.OneCollector	`94.61% <ø> (ø)`
unittests-Extensions	`90.65% <ø> (ø)`
unittests-Extensions.Enrichment	`100.00% <ø> (ø)`
unittests-Extensions.Enrichment.AspNetCore	`94.11% <ø> (ø)`
unittests-Extensions.Enrichment.Http	`94.33% <ø> (ø)`
unittests-Instrumentation.AWS	`83.59% <ø> (-0.22%)`	⬇️
unittests-Instrumentation.AspNet	`74.93% <ø> (-0.25%)`	⬇️
unittests-Instrumentation.AspNetCore	`70.76% <ø> (ø)`
unittests-Instrumentation.Cassandra	`23.52% <ø> (ø)`
unittests-Instrumentation.ConfluentKafka	`14.10% <ø> (ø)`
unittests-Instrumentation.ElasticsearchClient	`80.12% <ø> (ø)`
unittests-Instrumentation.EntityFrameworkCore	`80.80% <ø> (ø)`
unittests-Instrumentation.EventCounters	`76.36% <ø> (ø)`
unittests-Instrumentation.GrpcCore	`91.42% <ø> (ø)`
unittests-Instrumentation.GrpcNetClient	`79.61% <ø> (ø)`
unittests-Instrumentation.Hangfire	`84.61% <ø> (ø)`
unittests-Instrumentation.Http	`74.18% <ø> (ø)`
unittests-Instrumentation.Owin	`88.62% <ø> (ø)`
unittests-Instrumentation.Process	`100.00% <ø> (ø)`
unittests-Instrumentation.Quartz	`78.76% <ø> (ø)`
unittests-Instrumentation.Runtime	`100.00% <ø> (ø)`
unittests-Instrumentation.ServiceFabricRemoting	`34.54% <ø> (ø)`
unittests-Instrumentation.SqlClient	`87.29% <ø> (ø)`
unittests-Instrumentation.StackExchangeRedis	`71.80% <ø> (ø)`
unittests-Instrumentation.Wcf	`78.95% <ø> (ø)`
unittests-OpAmp.Client	`71.30% <ø> (ø)`
unittests-PersistentStorage	`65.88% <ø> (+0.33%)`	⬆️
unittests-Resources.AWS	`74.42% <ø> (ø)`
unittests-Resources.Azure	`85.31% <ø> (ø)`
unittests-Resources.Container	`67.34% <ø> (ø)`
unittests-Resources.Gcp	`71.42% <ø> (ø)`
unittests-Resources.Host	`73.91% <ø> (ø)`
unittests-Resources.OperatingSystem	`76.98% <ø> (ø)`
unittests-Resources.Process	`100.00% <ø> (ø)`
unittests-Resources.ProcessRuntime	`79.59% <ø> (ø)`
unittests-Sampler.AWS	`88.25% <ø> (ø)`

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines	Coverage Δ
src/Shared/SqlStatementInfo.cs	`100.00% <ø> (ø)`
src/Shared/SqlProcessor.cs	`96.04% <96.43%> (-1.76%)`	⬇️

... and 5 files with indirect coverage changes

🚀 New features to boost your workflow:

📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

stevejgordon · 2025-09-15T14:29:50Z

test/OpenTelemetry.Contrib.Shared.Tests/SqlProcessorTestCases.cs

+        data.Add(new()
+        {
+            Name = "alter_table",
+            Input = new()
+            {
+                DbSystemName = "other_sql",
+                Query = "ALTER  TABLE MyTable ADD Name varchar(255)",
+            },
+            Expected = new()
+            {
+                SanitizedQueryText = ["ALTER  TABLE MyTable ADD Name varchar(255)"],
+                Summary = "ALTER TABLE MyTable",
+            },
+        });


@alanwest This specific test in the semantic conventions expects the 255 literal to be sanitized. My current implementation handles column definitions such that we retain the value. If you're happy with this, should I update this in the semantic conventions?

The intent of the original implementation was to aggressively sanitize since I figured any number or string literal in the query added little value to db.query.text. That said, I think this is ok.

The reason SanitizedQueryText is an array is to allow alternate acceptable forms of sanitization. I don't feel strongly that we should keep the previous form as an option, but it's a question we can raise when we update the test suite upstream in case others have a preference.

We just want to be extra cautious not to expose sensitive data. One thing the test suite from the semantic conventions repo does not cover are pathological SQL statements. For example, SQL that is not valid. Something like ALTER TABLE MyTable SELECT * FROM Users WHERE (something sensitive here). I don't think pathological test cases belong in the test suite from the semantic conventions, but I think we may want to add some coverage separately. I guess ideally it would be nice to not set db.query.text at all in case of invalid SQL, but I think it may be challenging to detect without implementing a full SQL parser which would be silly. Maybe we could consider unsetting db.query.text in the case of an exception indicating invalid SQL?

I'll open a PR upstream for the test and we can discuss there. I think both forms are acceptable if its hard to achieve without overhead in other languages.

Regarding invalid SQL statements. I'm unsure if we want to avoid setting db.query.text. Firstly, as you say, we end up building a full parser which is complex. Also, if an invalid statement is included, that will fail the DB command and in that case, the user likely wants to know what the bad SQL was. My only real concern would be SQL built from unrestricted/invalidated user input. In that case, there could be a small vector to make something nasty that hangs up our parsing. Realistically though, if the customer code accept unvalidated user input, they have bigger problems!

I'm unsure if we want to avoid setting db.query.text.

This is kind of a secondary concern to me. Primary concern is that the literal values are sanitized even if the SQL is malformed. I think we should err on the side of being extra aggressive.

So this test case does bring me a little bit of pause. Are you confident that any literal values that might come after the statement will be sanitized? Few examples

ALTER TABLE MyTable ADD Name varchar(255) 'mypassword' 1234 would become ALTER TABLE MyTable ADD Name varchar(255) ? ?

ALTER TABLE MyTable 'mypassword' ADD 123 Name varchar(255) would become ALTER TABLE MyTable ? ADD ? Name varchar(255)

build/Common.nonprod.props

test/OpenTelemetry.Contrib.Shared.Tests/OpenTelemetry.Contrib.Shared.Tests.csproj

src/Shared/SqlProcessor.cs

stevejgordon · 2025-09-15T15:23:01Z

A couple of the tests are failin gon net462 - I'll look into those tomorrow.

stevejgordon · 2025-09-17T13:04:48Z

@martincostello There seem to be a bunch of EFCore failures now. I'll take a look into those, hopefully tomorrow and see why this breaks those.

stevejgordon · 2025-09-18T14:38:51Z

@martincostello The EF Core tests pass on my machine. Are you able to rerun these please?

stevejgordon · 2025-09-19T13:18:25Z

@alanwest What are you thoughts on the revised test (see #3116 (comment)) and additional test cases? I'm on leave next week, so will pick up any changes when I'm back.

github-actions · 2025-09-27T03:19:07Z

This PR was marked stale due to lack of activity. It will be closed in 7 days.

alanwest · 2025-09-29T17:31:48Z

test/OpenTelemetry.Contrib.Shared.Tests/SqlProcessorAdditionalTestCases.json

+    },
+    "expected": {
+      "db.query.text": [
+        "SELECT * FROM Vecnzotjejucwitzgrfifscuevittljlnrlpbruvkezeptqciyvbjhsytmbucbhwttidayecnthaztxbyppbyztcqccedeirgkxzrfezjxfwbtuqxeusroqgbvulgmsvnelovkxqsaqmlogomkhtjuirzhaocxlrmerihnmwaelullionarkmxwdamhduwrbooknqsnilurutgyerxphokeqnnoumbpcfjtmqrbpukjllofiwaltyoawkp o, OrderDetails od"


The spec for generating db.query.summary states that truncation should not occur within an operation or target.

Instrumentations that parse the query to set db.query.summary SHOULD truncate the summary to 255 characters (ensuring truncation does not occur within an operation name or target).

Ah yes, I must have scanned over that. I'll update to handle this. Do you think we should a) truncate at the end of the last token that falls before 255, or b) all completion of any tokens starting before 255 chars, so potentially going over the 255 recommended limit? The later is only likely an issue for unusually long target names, but I doubt that's a real issue.

I think always staying <= 255 would be more true to the spec. So in this example db.query.summary would just be SELECT.

alanwest · 2025-09-29T18:04:25Z

test/OpenTelemetry.Contrib.Shared.Tests/SqlProcessorAdditionalTestCases.json

+      "db.system.name": "other_sql",
+      "query": "CREATE UNIQUE CLUSTERED INDEX IX_Employee_Email ON Employees (Email);"
+    },
+    "expected": {
+      "db.query.text": [
+        "CREATE UNIQUE CLUSTERED INDEX IX_Employee_Email ON Employees (Email);"
+      ],
+      "db.query.summary": "CREATE UNIQUE CLUSTERED INDEX IX_Employee_Email"


There is not a strict specification for this, but I think I'd lean towards trying to exclude qualifiers like UNIQUE and CLUSTERED when parsing the operation.

Also this appears to be a test that should specifically target MS SQL Server, so db.system.name should equal microsoft.sql_server. This is so that the test suite can be filtered down to tests only applicable to certain database systems if necessary. The list of known db.system.name values are listed here.

Yes I was unsure whether we wanted those keywords capture in the summary. It adds context to what the statement does, directly in the span name, but perhaps there's little value in that. I doubt many instrumented apps create indices. I'm happy to update to exclude these for now. I can add the test upstream and perhaps a discussion can occur there on whether to include these (or not). My only concern would be if that discussion delays getting this instrumentation to a stable state.

Fixed in 007f300

Sounds good. We can always decide they should be included later if there is demand for it.

test/OpenTelemetry.Contrib.Shared.Tests/SqlProcessorAdditionalTestCases.json

Co-authored-by: Martin Costello <[email protected]>

…Shared.Tests.csproj Co-authored-by: Martin Costello <[email protected]>

stevejgordon · 2025-09-30T13:34:47Z

@martincostello I have one more (hopefully) thing to tackle which is the truncation once we decide on option a) or b).

stevejgordon requested a review from a team as a code owner September 15, 2025 14:26

stevejgordon requested a deployment to external September 15, 2025 14:26 — with GitHub Actions Waiting

github-actions bot added infra Infra work - CI/CD, code coverage, linters perf Performance related labels Sep 15, 2025

stevejgordon commented Sep 15, 2025

View reviewed changes

build/Common.nonprod.props Show resolved Hide resolved

stevejgordon commented Sep 15, 2025

View reviewed changes

stevejgordon requested a deployment to external September 15, 2025 14:34 — with GitHub Actions Waiting

martincostello reviewed Sep 15, 2025

View reviewed changes

stevejgordon force-pushed the sql-summary-optimisation-phase-2 branch from a6bfe75 to f563d5e Compare September 15, 2025 14:57

stevejgordon requested a deployment to external September 15, 2025 14:57 — with GitHub Actions Waiting

stevejgordon requested a deployment to external September 15, 2025 15:08 — with GitHub Actions Waiting

stevejgordon requested a deployment to external September 15, 2025 15:11 — with GitHub Actions Waiting

Kielek changed the title ~~Sql summary optimisation phase 2~~ [SqlClient/EFCore] summary optimization phase 2 Sep 16, 2025

stevejgordon requested a deployment to external September 16, 2025 07:12 — with GitHub Actions Waiting

stevejgordon requested a deployment to external September 16, 2025 08:19 — with GitHub Actions Waiting

stevejgordon requested a review from martincostello September 16, 2025 08:22

github-actions bot added the Stale label Sep 27, 2025

Kielek removed the Stale label Sep 29, 2025

alanwest reviewed Sep 29, 2025

View reviewed changes

test/OpenTelemetry.Contrib.Shared.Tests/SqlProcessorAdditionalTestCases.json Outdated Show resolved Hide resolved

alanwest reviewed Sep 29, 2025

View reviewed changes

test/OpenTelemetry.Contrib.Shared.Tests/SqlProcessorAdditionalTestCases.json Outdated Show resolved Hide resolved

Add additional benchmark cases

2b35dd3

stevejgordon and others added 13 commits September 30, 2025 12:11

Add extra test cases

302aa55

Refactor SqlProcessor for optimised performance

76c386c

Update comments

f4330b2

Final benchmarks

1999923

Update readme

4da6681

Update src/Shared/SqlProcessor.cs

4ed94a8

Co-authored-by: Martin Costello <[email protected]>

Update src/Shared/SqlProcessor.cs

129e7f8

Co-authored-by: Martin Costello <[email protected]>

Update test/OpenTelemetry.Contrib.Shared.Tests/OpenTelemetry.Contrib.…

267ba2d

…Shared.Tests.csproj Co-authored-by: Martin Costello <[email protected]>

Code review comments

445f860

Review feedback and cleanup

a59e6cb

Updated benchmarks

f65ade8

Fix and update test cases

ab848e1

Update test cases per PR feedback

007f300

stevejgordon force-pushed the sql-summary-optimisation-phase-2 branch from 8e5aeab to 007f300 Compare September 30, 2025 11:11

stevejgordon requested a deployment to external September 30, 2025 11:11 — with GitHub Actions Waiting

martincostello approved these changes Sep 30, 2025

View reviewed changes

[SqlClient/EFCore] summary optimization phase 2 #3116

Are you sure you want to change the base?

[SqlClient/EFCore] summary optimization phase 2 #3116

Uh oh!

Conversation

stevejgordon commented Sep 15, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Changes

Before

After

Merge requirement checklist

Uh oh!

Uh oh!

codecov bot commented Sep 15, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Choose a reason for hiding this comment

Uh oh!

alanwest Sep 19, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

alanwest Sep 30, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

stevejgordon commented Sep 15, 2025

Uh oh!

stevejgordon commented Sep 17, 2025

Uh oh!

stevejgordon commented Sep 18, 2025

Uh oh!

stevejgordon commented Sep 19, 2025

Uh oh!

github-actions bot commented Sep 27, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

stevejgordon commented Sep 30, 2025

Uh oh!

Uh oh!

stevejgordon commented Sep 15, 2025 •

edited

Loading

codecov bot commented Sep 15, 2025 •

edited

Loading

alanwest Sep 19, 2025 •

edited

Loading

alanwest Sep 30, 2025 •

edited

Loading