@oltolm oltolm commented Sep 27, 2025

I think I have fixed it properly this time, but I will leave it as a draft for a while.

I will explain what the problem was. It has something to do with sequences, but the previous fix was wrong.

I am executing

addr2line.exe -C -e bin\rpcs3.exe -f 0x1f7686c

it prints this (I added debug logging)

observe
addr > plineaddr && addr < lineaddr: 0x141f7686c > 0x141e80fa2 && 0x141f7686c < 0x14267a920
C:/src/rpcs3/rpcs3/util/types.hpp:924

which is wrong.

If we look at objdump -WL output we see this

C:/src/rpcs3/rpcs3/util/types.hpp:
types.hpp                                921         0x141e80f80               x
types.hpp                                923         0x141e80f8c               x
types.hpp                                923         0x141e80f94               x
types.hpp                                924         0x141e80f9c               x
types.hpp                                  -         0x141e80fa2

C:/msys64/ucrt64/include/c++/15.2.0/bits/stl_algobase.h:
stl_algobase.h                           234         0x14267a920               x

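It compares two addresses from different sequences: 0x141e80fa2 is the end of the types.hpp sequence, and 0x14267a920 is the first row of the next sequence (stl_algobase.h).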

It should have found this instead

C:/src/rpcs3/rpcs3/util/atomic.hpp:
atomic.hpp                              1315         0x141f7686c               x

The solution is to reset the search each time we encounter an end of sequence (DW_LNE_end_sequence). This is what libdwarf has to say:

/*  At the end of any contiguous line-table there may be
    a DW_LNE_end_sequence operator.
    This returns non-zero thru *return_bool
    if and only if this 'line' entry was a DW_LNE_end_sequence.

    Within a compilation unit or function there may be multiple
    line tables, each ending with a DW_LNE_end_sequence.
    Each table describes a contiguous region.
    Because compilers may split function code up in arbitrary ways
    compilers may need to emit multiple contigous regions (ie
    line tables) for a single function.
    See the DWARF3 spec section 6.2.  */
/*  Each 'line' entry has a line-number.
    If the entry is a DW_LNE_end_sequence the line-number is
    meaningless (see dwarf_lineendsequence(), just above).  */
/*  Each 'line' entry has a file-number, an index
    into the file table.
    If the entry is a DW_LNE_end_sequence the index is
    meaningless (see dwarf_lineendsequence(), just above).

If we find a DW_LNE_end_sequence, we ignore its file and line information and reset the state of the loop so that it never compares addresses from different sequences (or line tables, as libdwarf calls them).
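
To illustrate the idea, here is a minimal sketch of such a lookup. It is not the code from this PR and does not use the libdwarf API: `LineRow`, `find_line` and `example_rows` are hypothetical names, and the rows are a flattened model of what `objdump -WL` printed above. The end addresses of the stl_algobase.h and atomic.hpp sequences are not in the output shown, so the values used for them are made up for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical flattened view of one line-table row (not a libdwarf type).
struct LineRow {
    std::uint64_t addr;
    std::string file;   // meaningless when end_sequence is true
    unsigned line;      // meaningless when end_sequence is true
    bool end_sequence;  // true for a DW_LNE_end_sequence row
};

// Returns the row covering addr, or nullptr if no sequence contains it.
// Only consecutive rows of the same sequence are ever compared.
const LineRow* find_line(const std::vector<LineRow>& rows, std::uint64_t addr)
{
    const LineRow* prev = nullptr;
    for (const LineRow& row : rows) {
        if (prev && addr >= prev->addr && addr < row.addr) {
            return prev;
        }
        // An end_sequence row only closes the current range. Forget it so
        // it is never paired with the first row of the next sequence.
        prev = row.end_sequence ? nullptr : &row;
    }
    return nullptr;
}

// Rows modelled on the objdump -WL excerpt, in table order.
std::vector<LineRow> example_rows()
{
    return {
        {0x141e80f80, "types.hpp", 921, false},
        {0x141e80f8c, "types.hpp", 923, false},
        {0x141e80f94, "types.hpp", 923, false},
        {0x141e80f9c, "types.hpp", 924, false},
        {0x141e80fa2, "", 0, true},
        {0x14267a920, "stl_algobase.h", 234, false},
        {0x14267a930, "", 0, true},  // assumed end address, not in the excerpt
        {0x141f7686c, "atomic.hpp", 1315, false},
        {0x141f76870, "", 0, true},  // assumed end address, not in the excerpt
    };
}
```

Without the reset, the end row at 0x141e80fa2 would be paired with the stl_algobase.h row at 0x14267a920, and 0x141f7686c would fall into that gap, which matches the debug line shown above. With the reset, the lookup for 0x141f7686c only matches inside the atomic.hpp sequence.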

@codecov

codecov bot commented Sep 27, 2025

Codecov Report

❌ Patch coverage is 58.82353% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 51.06%. Comparing base (5dc71ed) to head (3722532).

Files with missing lines      Patch %   Lines
src/mgwhelp/dwarf_find.cpp    58.82%    2 Missing and 5 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #97      +/-   ##
==========================================
- Coverage   51.11%   51.06%   -0.05%     
==========================================
  Files          15       15              
  Lines        2160     2166       +6     
  Branches      824      830       +6     
==========================================
+ Hits         1104     1106       +2     
- Misses        809      811       +2     
- Partials      247      249       +2     


@jrfonseca
Owner

Thanks for the explanations. I spent some time today trying to understand the issue.

I got a rough picture of what DW_LNE_end_sequence means, but it's difficult for me to deduce how the code needs to change in practice, and to review the code changes. That is mainly because I'm not the original author of this code: it was based on elftoolchain, as mentioned in the header, though it doesn't look like upstream has been updated. Furthermore, I haven't looked at it in a very long time.

A few preliminary thoughts:

  • Is there any way we could create a test case that reliably triggers this issue? I fear we might keep getting it wrong (or regress) without one. For example:
    • a function, with a bunch of force-inline functions, all in one single line of code,
    • or a huge function with dummy calculations spread across lots of switch statements to force the compiler to spread the code through many ranges.
  • Is there third party code we could rely upon instead of implementing this logic directly ourselves?
    • In particular, I have long wanted to integrate libbacktrace, because it not only resolves line addresses but also walks the stack using DWARF. Furthermore, it already supports Windows and PE files. They handle DW_LNE_end_sequence here. Alas, I have never had much time to spend on DrMingw (I can squeeze in little things now and then, but it's difficult to squeeze in an endeavor as large as replacing libdwarf with libbacktrace).
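
As a starting point, the first test-case idea above might look roughly like the sketch below. The function names and the `FORCE_INLINE` macro are hypothetical, and whether this makes a given compiler emit more than one line-table sequence is untested; the `objdump -WL` output of the resulting binary would have to be checked.

```cpp
#include <cassert>

// Hypothetical portability macro for forced inlining.
#if defined(_MSC_VER)
#define FORCE_INLINE __forceinline
#else
#define FORCE_INLINE inline __attribute__((always_inline))
#endif

FORCE_INLINE int add_one(int x) { return x + 1; }
FORCE_INLINE int twice(int x) { return x * 2; }
FORCE_INLINE int square(int x) { return x * x; }

// All three inlined calls sit on a single source line, so the line table
// for this function has to mix rows from several source lines.
int combined(int x) { return square(twice(add_one(x))); }
```

A test built on this would resolve an address inside `combined` and check the reported file and line.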

@oltolm
Contributor Author

oltolm commented Oct 4, 2025

I will try to create a test case. I also thought about it, but didn't know how to force the compiler to create the file that I wanted. BTW, the GitHub mirror of elftoolchain has not been updated in years. The original repo is https://sourceforge.net/projects/elftoolchain/, but line2addr there does not handle DW_LNE_end_sequence.

I tested 3 other implementations:

  1. the one from dwarfstack handles it here https://github.com/ssbssa/dwarfstack/blob/36ef3ce1fa26bf5d080430883b77ab43e18aa15e/src/dwst-file.c#L652
  2. I could not find where llvm-addr2line handles it. Maybe here https://github.com/llvm/llvm-project/blob/8243c368b750cfe127aed3cd96c675b4499be7f9/llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp#L906.
  3. GNU addr2line maybe here https://github.com/bminor/binutils-gdb/blob/025c45fdaca4e4bbfe16cb7931f77d6f68c5356c/bfd/dwarf2.c#L2973, I am not sure.

@oltolm
Contributor Author

oltolm commented Nov 1, 2025

I took a look at dwarfstack and I liked it. dwarfstack uses almost the same code that I wrote for drmingw, but it's better, at least now that I have fixed a couple of bugs in dwarfstack. I benchmarked it and found a performance problem, which I fixed in dwarfstack v2.3. If I had known about dwarfstack, I would not have written my own code.

I have created a branch https://github.com/oltolm/drmingw/tree/dwarfstack where I have replaced the code in mgwhelp with code that uses dwarfstack. All tests pass. I had to fork dwarfstack because the interface didn't fit mgwhelp, but there are almost no functional changes to dwarfstack. I could open a PR and close this one.
