Skip to content

Conversation

@korydraughn
Copy link

@korydraughn korydraughn commented Aug 14, 2025

This PR updates support for iRODS by replacing Jargon (legacy iRODS library) with irods4j.

The foundational work was implemented by @MINGYJ, a recent iRODS intern. My commits are mainly polish and corrections around the use of the irods4j library.

Basic functionality is working - i.e. single stream uploads/downloads, renames, editing, etc.

The parallel transfer implementation doesn't appear to be working as intended. This is likely due to not having a full understanding of how the Cyberduck components fit together - i.e. Read/WriteFeature vs Upload/DownloadFeature.

Here are the steps for performing parallel transfer (i.e. multipart uploads) in iRODS.

Putting in draft for now. Feedback and guidance on how to implement proper support for parallel transfer would be greatly appreciated.

Resolves #14449.

Comment on lines +46 to +51
// TODO If we're dealing with a collection, should existing data objects sharing
// the same name be overwritten? This should probably be a configurable option.
IRODSFilesystem.copy(conn.getRcComm(), from, to, IRODSFilesystem.CopyOptions.RECURSIVE);
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Handle TODO

@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@korydraughn
Copy link
Author

@dkocher Can you provide the Entity CLA?

@dkocher dkocher changed the title [#14449] Update support for iRODS Update support for iRODS Sep 1, 2025
@dkocher
Copy link
Contributor

dkocher commented Sep 1, 2025

Putting in draft for now. Feedback and guidance on how to implement proper support for parallel transfer would be greatly appreciated.

Discussed in #14449 (comment)

@korydraughn
Copy link
Author

TODO: Update documentation for iRODS - https://docs.cyberduck.io/protocols/irods/

@trel
Copy link

trel commented Oct 9, 2025

updating the docs is a separate repo, would have its own PR

@korydraughn
Copy link
Author

TODO: Release new version of irods4j for checksum fixes.

@korydraughn
Copy link
Author

korydraughn commented Oct 9, 2025

Rebased PR on top of master.

Verified the following:

  • Parallel/Single stream uploads and downloads works
  • Config options defined in new iRODS profile are honored
  • PAM authentication works via the pam_password auth scheme
  • Other operations work too - rename, delete, etc.

With this PR, users will not be able to calculate checksums on data in an iRODS zone. Cyberduck will report checksums in their hex form if they exist in iRODS. We could add an option to the iRODS profile which allows users to instruct Cyberduck to calculate checksums following an upload, but that didn't feel like the correct approach, mainly because profiles seem to be loaded once on program start and never reread.

As for the unit/integration tests, I'm not sure how the implementation can be tested without a real iRODS server.

Finally, this PR bumps the minimum iRODS version requirement to 4.3.2. A new version of irods4j will be needed before this is merged. See irods/irods4j#126. Will get a new version released this week soon.

@korydraughn
Copy link
Author

@dkocher When I move a file in or out of a directory, I see a new iRODS connection get created. It disappears eventually, but not cleanly.

Can you explain why Cyberduck creates a new connection every time a file is moved?

The move class is very similar to other implementations so it's not clear to me what causes the new connection.

@dkocher
Copy link
Contributor

dkocher commented Oct 14, 2025

@dkocher When I move a file in or out of a directory, I see a new iRODS connection get created. It disappears eventually, but not cleanly.

Can you explain why Cyberduck creates a new connection every time a file is moved?

The move class is very similar to other implementations so it's not clear to me what causes the new connection.

You will need to return Statefulness.stateless in IRODSProtocol#getStatefulness.

@korydraughn
Copy link
Author

@dkocher When would someone choose stateless or stateful?
If I change the protocol to stateless, how will that affect other operations?
If I continue using stateful, what needs to change to make sure the additional connections are closed immediately?

To provide some context, a single iRODS connection cannot be used to execute multiple API operations in parallel. A single request is sent and the server returns a response.

@dkocher
Copy link
Contributor

dkocher commented Oct 15, 2025

If you can share credentials with us for your test environment we can run daily integration tests from our CI.

@dkocher
Copy link
Contributor

dkocher commented Oct 15, 2025

@dkocher When would someone choose stateless or stateful? If I change the protocol to stateless, how will that affect other operations? If I continue using stateful, what needs to change to make sure the additional connections are closed immediately?

To provide some context, a single iRODS connection cannot be used to execute multiple API operations in parallel. A single request is sent and the server returns a response.

We will have to keep it stateful then as otherwise it will be attempted to use a single connection for multiple actions in parallel, i.e. when the user is browsing folders or a file transfer with multiple files in parallel.

I will need to review how we can still support the native copy feature implementation for iRODS that would not require a new connection.

@korydraughn
Copy link
Author

If you can share credentials with us for your test environment we can run daily integration tests from our CI.

We don't have a CI system for people to hook into yet. We have a small set of tools which make it easy to launch one or more iRODS servers for testing. It's likely overkill for your needs though.

With that said, building a containerized environment for testing iRODS is pretty easy. I'm happy to put together a Docker compose project for the iRODS component. That will allow you to launch it on a local computer and have it available for testing.

We will have to keep it stateful then as otherwise it will be attempted to use a single connection for multiple actions in parallel, i.e. when the user is browsing folders or a file transfer with multiple files in parallel.

I will need to review how we can still support the native copy feature implementation for iRODS that would not require a new connection.

So, because the IRODSProtocol class reports iRODS as being stateful, does that mean every operation results in a new iRODS connection being created?

If that's true, why is it that the Move operation results in connections which do not disconnect? No other operation shares that behavior.

I added some log statements to my local build to try and box in what is leading to the additional connections, but it didn't help. However, it did reveal a high number of instantiations of IRODSMoveFeature. Any idea why the move operation is instantiated so much?

@dkocher
Copy link
Contributor

dkocher commented Oct 15, 2025

We have other usages of Docker Compose containers in integration tests, thus that should be feasible.

@korydraughn
Copy link
Author

Where should I place the Docker Compose project? Is the test directory for irods appropriate?

I figure you can move things around if needed.

@korydraughn
Copy link
Author

@dkocher What triggers the Copy implementation?

Using the Duplicate option within the context menu doesn't appear to trigger it.

@korydraughn
Copy link
Author

Docker compose project added under test directory.

Squashing everything down.

@korydraughn
Copy link
Author

@dkocher Any updates regarding the CLA?

@korydraughn
Copy link
Author

TODO: Release new version of irods4j for checksum fixes.

Released irods4j 0.5.0 and updated PR to use it.

Comment on lines -82 to -88
@Test
public void testResource() {
final Host bookmark = new Host(new Profile(new IRODSProtocol(), new Deserializer<String>() {
@Override
public String stringForKey(final String key) {
return null;
}
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Restore this test.

@dkocher
Copy link
Contributor

dkocher commented Oct 16, 2025

@dkocher What triggers the Copy implementation?

Using the Duplicate option within the context menu doesn't appear to trigger it.

The Duplicate… option in Cyberduck is expected to trigger the Copy feature.

@dkocher
Copy link
Contributor

dkocher commented Oct 16, 2025

@dkocher Any updates regarding the CLA?

I have received your signed CLA and I should be able to merge this regardless of the auto check failing.

@dkocher
Copy link
Contributor

dkocher commented Oct 16, 2025

Docker compose project added under test directory.

Squashing everything down.

  • We previously had integration tests running with iPlant Collaborative which are still functioning. Can you confirm this is still the case and there are no regressions?
  • We have tests that use a Testcontainers/Docker setup in the smb and the sts package of the s3 module. Ideally the iRODS container would be launched similarly to run tests.

@dkocher
Copy link
Contributor

dkocher commented Oct 16, 2025

So, because the IRODSProtocol class reports iRODS as being stateful, does that mean every operation results in a new iRODS connection being created?

That is true but the connections are pooled using the DefaultSessionPool implementation. We only have FTP that uses the same model, all other implementations are stateless.

@korydraughn
Copy link
Author

The Duplicate… option in Cyberduck is expected to trigger the Copy feature.

I do see references to CopyWorker in the log file. Maybe I didn't have any logging in the IRODSCopyFeature class. I'll take another look at it.

  • We previously had integration tests running with iPlant Collaborative which are still functioning. Can you confirm this is still the case and there are no regressions?

mvn test -DskipITs, as documented in the README, results in passing tests, but it only covers tests which do not require interaction with a server.

Whenever I tried running the tests without -DskipITs, they failed immediately due to vault/permission issues. I think I lack credentials to do that level of testing.

So, I can't confirm the state of the tests. I can say that clicking the buttons within the GUI have shown the desired behavior and do work with a live iRODS 4.3.4 server.

  • We have tests that use a Testcontainers/Docker setup in the smb and the sts package of the s3 module. Ideally the iRODS container would be launched similarly to run tests.

I'll take a look at those and see if I can mirror them. I may not be able to resolve the vault/permission related issues, but I can at least document the necessary changes so that you can wire things up in the way that you like.

That is true but the connections are pooled using the DefaultSessionPool implementation. We only have FTP that uses the same model, all other implementations are stateless.

I'll take a look at the FTP implementation. Given the connection management requires a deeper understanding of the Cyberduck internals, I might end up needing to let someone more familiar with that handle it.

@korydraughn
Copy link
Author

@dkocher Any thoughts on my previous comment? Specifically, the vault/permission issues?

@dkocher
Copy link
Contributor

dkocher commented Oct 29, 2025

Unfortunately the secrets to access the credentials for integration tests are only accessible in our CI enviroment. Running the tests against data.iplantcollaborative.org gives error [ -12000 ].

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

PAM passwords not handled correctly

4 participants