

pmda/rds: Introduce new PMDA for RDS #2447

Open
Hannibal404 wants to merge 5 commits into performancecopilot:main from Hannibal404:rds_pmda

@Hannibal404
Contributor

This change adds a new PMDA (Performance Metrics Domain Agent) for Reliable Datagram Sockets (RDS). It exports key metrics including connection information, socket and connection statistics, and details of send, receive, and retransmit queues for performance analysis using Performance Co-Pilot (PCP).

This PMDA is intended to aid in diagnosing network-related issues on systems using RDS over InfiniBand or TCP.

Replaces #2230

This commit adds a new PMDA (Performance Metrics Domain Agent) for
Reliable Datagram Sockets (RDS). It exports key metrics including
connection information, socket and connection statistics, and details
of send, receive, and retransmit queues for performance analysis using
Performance Co-Pilot (PCP).

This PMDA is intended to aid in diagnosing network-related issues
on systems using RDS over Infiniband or TCP.

Signed-off-by: Mohith Kumar Thummaluru <mohith.k.kumar.thummaluru@oracle.com>
Add manpage for rds pmda and address some linting issues

Signed-off-by: Pradyumn Rahar <pradyumn.rahar@oracle.com>
@natoscott
Member

Install fails for me after building rpm packages with:

[pcpqa@fedora rds]$ sudo ./Install 
Traceback (most recent call last):
  File "/var/lib/pcp/pmdas/rds/pmdards.python", line 25, in <module>
    from modules.rds_ping import rds_ping_all_avlbl_dest
ModuleNotFoundError: No module named 'modules.rds_ping'

I expect it relates to the .python file extensions, and the more dynamic import mechanism used by pmdabcc might be more what you're after here.

Unrelated to this, the new QA test .out file also contains several errors that shouldn't be there (relating to 'unknown metric name'). However, the Install fails for me, so I've not been able to observe that second issue locally to advise further (it's definitely wrong, I just don't know why).
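
To illustrate the file-extension hypothesis above: Python's default import machinery only recognizes source suffixes such as .py, so a module stored with a .python suffix is not found by a plain import. The sketch below reproduces that in a temporary directory and shows that a .py symlink makes the same file importable. The names demo_modules and demo_ping are hypothetical stand-ins, not files from this PR, and whether this is the actual cause of the Install failure is not established here.

```python
import importlib
import os
import sys
import tempfile


def can_import(name):
    """Return True if `name` imports cleanly, False on ModuleNotFoundError."""
    importlib.invalidate_caches()  # pick up files created since the last attempt
    try:
        importlib.import_module(name)
        return True
    except ModuleNotFoundError:
        return False


def symlink_demo():
    """Try importing a '.python' module before and after adding a '.py' symlink."""
    with tempfile.TemporaryDirectory() as top:
        pkg = os.path.join(top, "demo_modules")  # hypothetical package name
        os.mkdir(pkg)
        with open(os.path.join(pkg, "demo_ping.python"), "w") as handle:
            handle.write("VALUE = 42\n")

        sys.path.insert(0, top)
        try:
            before = can_import("demo_modules.demo_ping")
            # Expose the same file under a '.py' name via a relative symlink.
            os.symlink("demo_ping.python", os.path.join(pkg, "demo_ping.py"))
            after = can_import("demo_modules.demo_ping")
        finally:
            sys.path.remove(top)
    return before, after


before, after = symlink_demo()
print("importable before symlink:", before)
print("importable after symlink:", after)
```

The sketch assumes a platform where unprivileged symlinks are available (for example Linux).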

@Hannibal404
Contributor Author

Added symlinks for the module files to fix the errors.

The QA output had 'unknown metric name' errors due to IB-specific metrics on a machine without InfiniBand. Updated.

@natoscott
Member

@Hannibal404 thanks for the updates, but I'm still seeing issues. The test fails because the rds Install still fails in much the same way as before:

[pcpqa@fedora rds]$ sudo ./Install 
Traceback (most recent call last):
  File "/var/lib/pcp/pmdas/rds/pmdards.python", line 50, in <module>
    from modules.rds_ping import rds_ping_all_avlbl_dest
ModuleNotFoundError: No module named 'modules.rds_ping'
Arrgh! failed to create /var/lib/pcp/pmdas/rds/domain.h.python from /var/lib/pcp/pmdas/rds/pmdards.python

I think you may need something more like this code from pmdabcc:

    def init_modules(self):
        """ Initialize modules """
        self.log("Initializing modules:")

        # For packaging, allow both .python and .py suffixed files
        cwd = os.getcwd()
        pmdadir = PCP.pmGetConfig('PCP_PMDASADM_DIR') + '/' + self.read_name()
        for root, _, filenames in os.walk(pmdadir):
            os.chdir(root)
            for filename in fnmatch.filter(filenames, '*.python'):
                if filename in ('pmdabcc.python', 'domain.h.python', 'pmns.python'):
                    continue
                pyf = filename[:-4]
                if not os.path.exists(pyf):
                    os.symlink(filename, pyf)
            os.chdir(pmdadir)
        os.chdir(cwd)

        import pmdautil # pylint: disable=import-outside-toplevel
        self.proc_helper = pmdautil.ProcMon(self.log, self.err)
        for module in self.modules:
            self.log(module)
            try:
                mod = importlib.import_module('modules.%s' % self.modules[module][MODULE])
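
The excerpt above stops inside its try block. As a minimal, self-contained sketch of the same idea (import each module by name and keep going when one fails), something like the following could be used. The helper load_modules and its arguments are hypothetical and are not part of pmdabcc or this PR; the standard-library json package stands in for the PMDA's modules directory.

```python
import importlib


def load_modules(package, names, log=print):
    """Import package.<name> for each name, collecting failures instead of aborting."""
    loaded, failed = {}, {}
    for name in names:
        try:
            loaded[name] = importlib.import_module("%s.%s" % (package, name))
        except ImportError as error:  # ModuleNotFoundError is a subclass
            log("module %s failed to load: %s" % (name, error))
            failed[name] = str(error)
    return loaded, failed


# Stand-in usage: one submodule that exists, one that does not.
loaded, failed = load_modules("json", ["decoder", "no_such_module"])
```

Collecting failures this way lets an agent log which module is missing and carry on with the rest, rather than dying with a traceback at import time.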

@Hannibal404
Contributor Author

That's strange: it was failing for me on a Fedora machine, but creating the symlinks resolved it. I'll try using importlib.

@Hannibal404
Contributor Author

Replaced the regular imports with importlib.

@natoscott
Member

Something is still wrong, this is what I see:

[pcpqa@fedora rds]$ sudo ./Install 
Traceback (most recent call last):
  File "/var/lib/pcp/pmdas/rds/pmdards.python", line 43, in <module>
    mod_rds_ping = importlib.import_module('modules.rds_ping')
  File "/usr/lib64/python3.14/importlib/__init__.py", line 88, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen importlib._bootstrap>", line 1398, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1371, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1335, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'modules.rds_ping'

I realize now there's a simpler example you can use - see the netcheck PMDA. The .py/.python aspect seems to be a red herring as it doesn't have to bother with that.

Can you do a ./Makepkgs build, install the new RPMs, and then run "./check -g pmda.rds" in the qa directory before resending? Thanks!
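
One way to narrow down a ModuleNotFoundError like the one in the traceback above is to ask the import system where it is looking. The sketch below uses importlib.util.find_spec to distinguish "the parent package is not on sys.path" from "the package is found but the submodule file is missing". The diagnose helper is hypothetical; for the PMDA one might call it with 'modules.rds_ping' and the installed PMDA directory as extra_path, but that usage is an assumption and has not been run against this PR.

```python
import importlib
import importlib.util
import sys


def diagnose(modname, extra_path=None):
    """Describe whether a dotted module name can be resolved, and from where."""
    if extra_path and extra_path not in sys.path:
        sys.path.insert(0, extra_path)
    importlib.invalidate_caches()
    try:
        # For dotted names, find_spec imports the parent package first.
        spec = importlib.util.find_spec(modname)
    except ModuleNotFoundError:
        return "parent package not found"
    if spec is None:
        return "package found, module missing"
    return "found at %s" % spec.origin


# Stand-in usage with standard-library and made-up names.
print(diagnose("json.decoder"))
print(diagnose("json.no_such_submodule"))
print(diagnose("no_such_package_xyz.rds_ping"))
```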

Signed-off-by: Pradyumn Rahar <pradyumn.rahar@oracle.com>
@Hannibal404
Contributor Author

Hannibal404 commented Feb 23, 2026

The installation now works for me without creating the symlinks, both with and without importlib. It doesn't seem to be an issue with how the modules are imported, since netcheck uses similar imports:

from modules.pcpnetcheck import PCPNetcheckModuleParams, DGW, DNS, NTP

I do not see any mention of modules.pcpnetcheck in netcheck.conf either.
I noticed that the Install file for rds did not mention the domain number, so I have updated that as well.
I ran the check script and it worked as expected.

PS: Made some updates to the test output.

