fix: disable LTO on mods lib and synchronize entity processing by Guffawaffle · Pull Request #125 · netniV/stfc-mod

Guffawaffle · 2026-04-13T05:41:50Z

Summary

Two latent crash bugs found during hotkey feature development. Both cause intermittent crashes that are extremely difficult to reproduce because they depend on code layout, timing, and optimization decisions.

Bug 1: LTO/LTCG breaks SPUD detour hooks

Symptom: Intermittent C0000005 (access violation) at detour call sites. Adding or removing any struct member changes whether it crashes because it shifts code layout.

Root Cause: The mods static library compiles with /GL (Whole Program Optimization) by default, so its object files contain the compiler's intermediate representation rather than native machine code. When the linker performs LTCG on the final DLL, it re-optimizes and re-lays-out function bodies, but SPUD's detour trampolines were patched against pre-LTCG addresses. The trampoline then jumps into relocated/rewritten instructions.

Fix: set_policy("build.optimization.lto", false) on the mods target in mods/xmake.lua.

Why this wasn't caught before: LTO is a link-time concern. Everything compiled and linked fine. Crashes only manifested at runtime in code paths that got relocated, and the crash location moved depending on what code was added/removed — making it look like data corruption rather than codegen.

Bug 2: Detached threads cause stack corruption

Symptom: Intermittent C0000409 (GS cookie / stack buffer overrun) in submit_async or RTC handler.

Root Cause: std::thread().detach() in submit_async creates fire-and-forget threads accessing shared state without synchronization. When the OS reclaims a detached thread's stack while it's still executing, or multiple threads race on shared containers, the GS security cookie fails.

Fix: Replace std::thread().detach() with synchronous execution. Entity processing is fast enough that blocking is not a problem.
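
A minimal sketch of the difference, assuming hypothetical names throughout (`EntityBatch`, `submit_sync`, and `process_batch_sync` are illustrative and are not the mod's actual `submit_async` signature). It only shows the shape of the change: the work runs on the caller's thread, so everything it touches is still alive when it runs.

```cpp
#include <cstddef>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for the entity payload; not the mod's real type.
struct EntityBatch {
    std::vector<std::string> items;
};

// Before (the fire-and-forget pattern described above): the detached thread
// may still be running after the state it references has changed or gone.
//
//   std::thread([&batch] { process(batch); }).detach();

// After: invoke the work on the caller's thread. The call returns only once
// `work` has finished, so nothing else touches the captured state meanwhile.
template <typename Work>
void submit_sync(Work&& work) {
    std::forward<Work>(work)();
}

// Records how many entities were seen and which thread did the work, so a
// caller can confirm the processing really ran inline.
struct ProcessResult {
    std::size_t count = 0;
    std::thread::id ran_on;
};

inline ProcessResult process_batch_sync(const EntityBatch& batch) {
    ProcessResult result;
    submit_sync([&] {
        result.count = batch.items.size();
        result.ran_on = std::this_thread::get_id();
    });
    return result;
}
```

Calling `process_batch_sync(batch)` from the game thread would block that thread for the duration of the work, which is the trade-off the fix accepts.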

Other changes

.vscode/settings.json removed from tracking (personal editor config)
.gitignore updated to exclude .smartergpt/

Testing

Mod loads, hooks install, hotkeys function with both fixes
Stress-tested config struct modifications that previously triggered Bug 1 — no crashes
Zero behavioral changes

- Disable LTO (build.optimization.lto) on mods static lib to prevent LTCG codegen from breaking SPUD detour hooks at runtime (C0000005)
- Replace detached std::thread in submit_async and RTC handler with synchronous execution to fix GS cookie stack corruption (C0000409)
- Remove .vscode/settings.json from tracking (personal editor config)
- Add .smartergpt/ to .gitignore

Guffawaffle · 2026-04-13T06:31:09Z

Why disable LTO instead of fixing the root cause?

The /GL + LTCG pipeline works by deferring final code generation to link time. Instead of emitting machine code per translation unit, MSVC emits an intermediate representation, then at link time LTCG sees all code at once and re-optimizes globally — inlining across translation units, merging identical function bodies, eliminating "unused" functions, and reshaping prologues.

The problem: SPUD installs detours by patching function prologues at runtime, after linking is long finished. LTCG has no visibility into these runtime patches. When it inlines or eliminates a function that SPUD later tries to detour, the patch either lands on the wrong code or targets an address that no longer exists. The compiler is making correct optimizations based on incomplete information.

Alternatives considered

| Approach | Trade-off |
| --- | --- |
| `__declspec(noinline)` on detour targets | Fragile: miss one and you get a mystery crash. Must be maintained as new hooks are added. |
| LTO-aware detour library | SPUD doesn't support this; would require upstream changes or a library swap. |
| Disable LTO on the mods target only | Chosen. Single policy line, prevents the entire class of breakage, negligible perf impact for a small injected DLL. |

This is a pragmatic workaround, not a root-cause fix. If SPUD gains LTO compatibility in the future, the set_policy("build.optimization.lto", false) line can be removed.
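
For reference, the rejected `__declspec(noinline)` alternative would look roughly like the sketch below. `MOD_NOINLINE`, `on_hotkey_pressed`, and `dispatch_hotkey` are hypothetical names, not identifiers from this repository. Note that a noinline marker only addresses inlining; by itself it does not stop the linker from merging identical function bodies.

```cpp
// Portable "do not inline" marker. __declspec(noinline) is the MSVC spelling;
// __attribute__((noinline)) is the GCC/Clang equivalent.
#if defined(_MSC_VER)
#  define MOD_NOINLINE __declspec(noinline)
#else
#  define MOD_NOINLINE __attribute__((noinline))
#endif

// Hypothetical detour target. Keeping it out of line means it keeps its own
// prologue and address instead of being folded into its callers.
MOD_NOINLINE int on_hotkey_pressed(int key_code) {
    return key_code + 1;
}

// Hypothetical caller: with the marker above, this remains a real call.
int dispatch_hotkey(int key_code) {
    return on_hotkey_pressed(key_code);
}
```

Every hooked function would need the same annotation, which is the maintenance burden the table describes.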

Guffawaffle · 2026-04-13T08:02:40Z

Guffawaffle · 2026-04-13T08:55:26Z

Dev Note: Why entity processing must be synchronous

During testing, we explored reducing the game-thread footprint of the synchronous entity processing fix. Documenting findings here for future reference.

What was tried

Replaced the synchronous submit_async calls in HandleEntityGroup with a single persistent worker thread using a queue (same architecture as TargetWorker):

Game thread: null check → byte copy → mutex lock → queue push → notify CV (~microseconds)
Worker thread: protobuf parse → diff → JSON serialize → queue_data() (the heavy work)
All Config::Get() checks remained on the game thread before queueing

This was a middle ground between the original per-call std::thread().detach() (crash-prone) and fully-synchronous processing (blocks game thread).
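
The architecture described above can be sketched as a single-consumer queue guarded by a mutex and condition variable. This is an illustration only: `EntityWorker` and its members are hypothetical names, the "heavy work" is a placeholder, and it is not the code that was tested.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>

// Hypothetical persistent worker; names are illustrative, not the mod's.
class EntityWorker {
public:
    EntityWorker() : thread_([this] { run(); }) {}
    ~EntityWorker() { stop(); }

    // Producer side (the game thread in the description): take a copy of the
    // bytes, push under the lock, then wake the worker.
    void push(std::string bytes) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(bytes));
        }
        cv_.notify_one();
    }

    // Lets the worker drain whatever is queued, then joins it.
    void stop() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        cv_.notify_one();
        if (thread_.joinable()) thread_.join();
    }

    // Guarded by the mutex; the total is final once stop() has returned.
    std::size_t bytes_processed() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return bytes_processed_;
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            cv_.wait(lock, [this] { return stopping_ || !queue_.empty(); });
            if (queue_.empty()) return;  // only reachable when stopping
            std::string bytes = std::move(queue_.front());
            queue_.pop();
            lock.unlock();
            // Placeholder for the heavy work (parse, diff, serialize).
            std::size_t handled = bytes.size();
            lock.lock();
            bytes_processed_ += handled;
        }
    }

    mutable std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::string> queue_;
    bool stopping_ = false;
    std::size_t bytes_processed_ = 0;
    std::thread thread_;  // declared last so the other members exist first
};
```

The producer's critical section is just the queue push, which is why the game-thread cost was expected to be in the microsecond range.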

Result

Same C0000005 crash at ~20% game load. Identical to the original detached-thread crash.

Conclusion

The issue is not specific to detached threads or Config::Get() access patterns. Any processing on a separate thread's stack (whether detached, persistent worker, or joined) triggers the crash. The SPUD detour trampoline appears to be sensitive to the call stack layout in a way that requires entity processing to execute on the same thread and call stack as the original hooked function.

Synchronous inline processing is the only working approach. The actual frame cost is negligible in practice — it's just protobuf parse + diff + JSON serialize + queue push (no I/O). HTTP sending is already async via TargetWorker threads, which are unaffected because they don't run inside a SPUD detour call chain.

netniV · 2026-04-14T07:57:30Z

We do NOT want to run the processing synchronously, that is a bad idea.

netniV · 2026-04-14T07:58:00Z

Also, your PRs are NOT clean... you have mixed up various changes and bled them into other PRs

Guffawaffle · 2026-04-14T08:26:54Z

Yeah this one cascades into others because, without it, I crash on init on pretty much anything I touch so this one does get pulled in. I add a "depends on" to kinda point towards this but it's not clean when looking at a PR.

I'll step back and examine more about this init crash tomorrow. I can clearly say, without this PR, any other changes I make become a crap shoot on if it will crash on init (20% load) or not. [edit: this might also be fixed by the thin hook > dispatcher > handlers method on the other ticket]

I'll come back to this with the lessons I picked up today and rework it. I've learned a lot. I likely shouldn't have opened any of these as PRs yet, but in the dev cycle I work in it's normal to get a PR open so folks can start reviewing as you iterate. Helps to get early feedback imo.

Guffawaffle · 2026-04-15T01:43:28Z

Closing: this did not fully resolve the crash issues, or even noticeably improve the experience.

Guffawaffle mentioned this pull request Apr 13, 2026

Feature: hotkey dispatch table + 3-tier architecture extraction #126

Open

Guffawaffle mentioned this pull request Apr 13, 2026

fix: legacy sync url/token now inherits data sync options (#106) #127

Open

Guffawaffle closed this Apr 14, 2026

Guffawaffle reopened this Apr 14, 2026

Guffawaffle closed this Apr 15, 2026

Guffawaffle wants to merge 1 commit into netniV:main from Guffawaffle:fix/lto-and-sync-crashes


2 participants


Guffawaffle commented Apr 13, 2026

QA Checklist

Prerequisites

Bug 1: LTO Disabled

Bug 2: Synchronous Entity Processing

Sync Functionality

Regression
