Notable performance regression between version 0.2.0(?) and latest master branch code #26

@QuantumToasted

Hello,

When I shelved my massive, broken cleanup PR, I set out to try again, with smaller steps, more commits, more testing, to make sure I didn't break anything.

My very first move was simply to remove the NuGet dependencies within the library, so that the projects strictly reference each other as projects, allowing tighter coupling and more rapid development. After doing so, I observed an extremely sharp decline in the ability of the CartPole-v1 example to "learn" and make meaningful progress, even with the exact same hyperparameters. I am unsure in exactly which version of RLMatrix this regression became noticeable. My memory is a little foggy because I performed this test several weeks ago, but I believe that, since the current master branch code references a 0.4.x version, I simply picked an older version at random, 0.2.0, and the issue went away.

I wrote a very simple example demonstrating this issue on the nouveau-2.0 branch of my fork of RLMatrix. A video demonstrating the situation can be found below. Simply changing Old to false in CartPole-v1.csproj creates a stark contrast in performance. I would be happy to provide relevant logs or other debugging information if needed; finding details that could explain this is a little out of my range of expertise.
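
For readers who have not opened the fork, here is a rough sketch of how a toggle like Old could be wired up in a .csproj. It is not copied from the repository. The target framework, the relative project path, and the assumption that Old=true selects the published 0.2.0 package while Old=false selects the in-repo source are all guesses for illustration.

```xml
<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <!-- Assumed target framework; the real project may differ. -->
    <TargetFramework>net8.0</TargetFramework>
    <!-- Assumed meaning: true = older NuGet package, false = current source. -->
    <Old>true</Old>
  </PropertyGroup>

  <!-- Older, published package (the version that still learns well). -->
  <ItemGroup Condition="'$(Old)' == 'true'">
    <PackageReference Include="RLMatrix" Version="0.2.0" />
  </ItemGroup>

  <!-- Current source, referenced as a project. Hypothetical path. -->
  <ItemGroup Condition="'$(Old)' != 'true'">
    <ProjectReference Include="..\..\src\RLMatrix\RLMatrix.csproj" />
  </ItemGroup>

</Project>
```

With a layout like this, the example program and its hyperparameters stay identical between runs, and only the library build underneath changes.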

2025-03-18.19-54-57-00.00.00.000-00.01.40.967.mp4

There absolutely could be something I'm missing here! I'm hoping that maybe I'm misusing the newer code in some way that makes it inconsistent with the older code.
