Ignore alternatives that are binary-identical to existing ones#157
Ignore alternatives that are binary-identical to existing ones#157AdamWill wants to merge 1 commit intofedora-sysv:mainfrom
Conversation
In https://bugzilla.redhat.com/show_bug.cgi?id=2363937 we found a problem that is ultimately triggered by alternatives configs having multiple entries that point to the same binary, or the same *effective* binary after /usr and /sbin merges (which is what streq_bin handles). I can't see a reason why we'd ever want to support this as a real thing, so when reading the config, let's just skip ingesting any alternative whose leader target is the same effective binary as an alternative we've already read. Signed-off-by: Adam Williamson <awilliam@redhat.com>
|
we might also want to add something to not add binary-equal alternatives into the config file in the first place, but it seemed more important to do this, since we know there are systems already out there in the wild that have them. We could also actually go ahead and wipe dupe entries from the config file when we encounter them, but that seems more radical (and I don't know how to do it properly/safely, I'm not really a C coder.) |
|
The known consequences of this, btw:
|
There was a problem hiding this comment.
Pull Request Overview
This PR adds duplicate detection logic to skip alternatives with duplicate leader targets during configuration reading. When a duplicate is found, the code jumps to skip adding it to the set.
- Added loop to check for duplicates by comparing
newAlt.leader.targetwith existing alternatives - Added
nextaltlabel andgotostatement to skip duplicate entries
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| } | ||
|
|
||
| for (i = 0; i < set->numAlts; i++) { | ||
| if (streq_bin(newAlt.leader.target, set->alts[i].leader.target)) { |
There was a problem hiding this comment.
Memory leak when skipping duplicate alternatives. The newAlt structure has allocated memory (via strdup, malloc, and strsteal on lines 461-520) that must be freed before jumping to nextalt. The memset at line 537 only zeros the pointers without freeing the memory. Call clearAlternative(&newAlt) before the goto nextalt statement.
| if (streq_bin(newAlt.leader.target, set->alts[i].leader.target)) { | |
| if (streq_bin(newAlt.leader.target, set->alts[i].leader.target)) { | |
| clearAlternative(&newAlt); |
There was a problem hiding this comment.
hmm, but we don't do this in the existing loop, so I didn't think it was necessary on the short-circuit path. can somebody who knows anything about C comment? :D
|
Sorry, I was away last week. If the problem is the duplicate entries, this might go deeper. I could imagine that you would have the same problem when removing such an alternative, which this patch does not cover |
|
The existence of duplicate entries is the root of all the problems, I'd say, yeah. (Where we define 'duplicate' as 'entries with the same leader binary, after resolving symlinks'). It's possible the other recent changes here have made it so we'll no longer create duplicate entries. It's hard to be sure, there's a lot of moving parts and I don't actually have a clear way to reproduce the bug 'naturally' (I've been reproducing it by hand-editing the data file). But from the bug report it's clear that there are real-world users who've got duplicate entries, so we do need to handle that situation somehow. If you can come up with something that detects and removes duplicate entries reliably and consistently that should also resolve the problems, I believe. |
|
@lnykryn still, I would like to push this fix to stable for Fedora for now, to stop people running into the problem. Do you think it's OK to push https://bodhi.fedoraproject.org/updates/FEDORA-2025-aa88a0d00b and https://bodhi.fedoraproject.org/updates/FEDORA-2025-bbaef37b87 - which have this backported - stable now, so people stop hitting the upgrade problem, then we can do more comprehensive change later? Thanks. |
|
Yeah, go ahead. That will fix the symptoms. I think I have an idea how to handle this a bit more definitely, but it is still work in progress lnykryn@7a627d4 |
In https://bugzilla.redhat.com/show_bug.cgi?id=2363937 we found a problem that is ultimately triggered by alternatives configs having multiple entries that point to the same binary, or the same effective binary after /usr and /sbin merges (which is what streq_bin handles). I can't see a reason why we'd ever want to support this as a real thing, so when reading the config, let's just skip ingesting any alternative whose leader target is the same effective binary as an alternative we've already read.