Skip to content

Conversation

jbtrystram
Copy link
Member

Using the FUSE mount was unstable in my testing. This approach use more disk space but is faster.
Files will also survive under tmp-diff will also still be there after git diff returns.

Using the FUSE mount was unstable in my testing. This approach use more
disk space but is faster.
Files will also survive under `tmp-diff` will also still be there after
git diff returns.
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors cmd-diff to replace the unstable FUSE-based mounting of disk images with a more robust approach that extracts files using libguestfs and tar. This aligns with the goal of improving stability and speed. The changes also introduce new generic command-line arguments, --artifact and --artifact-part-table, for diffing artifacts.

My review has identified a critical issue where the functions to handle these new arguments are missing, which will cause the script to crash. I've also pointed out a medium-severity issue in the new diff_metal implementation related to exception handling that could mask the root cause of errors. Addressing these points will ensure the new functionality is complete and robust.

Comment on lines +97 to +101
if args.artifact:
diff_artifact(diff_from, diff_to, args.artifact)
if args.artifact_part_table:
diff_artifact_partitions(diff_from, diff_to, args.artifact_part_table)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

The functions diff_artifact and diff_artifact_partitions are called here but are not defined anywhere in the file. This will cause a NameError at runtime when using the --artifact or --artifact-part-table arguments.

To fix this, you need to implement these functions. They should likely act as dispatchers to the specific diffing functions based on the artifact name. For example:

# A map of artifact names to their diffing functions
ARTIFACT_DIFFERS = {
    'metal': diff_metal,
}

# A map of artifact names to their partition table diffing functions
ARTIFACT_PARTITION_DIFFERS = {
    'metal': diff_metal_partitions,
}

def diff_artifact(diff_from, diff_to, artifact_name):
    """Looks up and calls the diffing function for a given artifact."""
    differ = ARTIFACT_DIFFERS.get(artifact_name)
    if not differ:
        raise ValueError(f"No differ found for artifact: {artifact_name}")
    print(f"Diffing {artifact_name} artifact...")
    differ(diff_from, diff_to)

def diff_artifact_partitions(diff_from, diff_to, artifact_name):
    """Looks up and calls the partition diffing function for a given artifact."""
    differ = ARTIFACT_PARTITION_DIFFERS.get(artifact_name)
    if not differ:
        raise ValueError(f"No partition differ found for artifact: {artifact_name}")
    print(f"Diffing {artifact_name} artifact partition table...")
    differ(diff_from, diff_to)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. Maybe you didn't mean to include this bit in this PR?

Comment on lines +367 to +369
finally:
if g:
g.close()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

If an exception occurs in the try block, and g.close() also raises an exception in this finally block, the original exception will be hidden. This can make debugging more difficult.

To make the error handling more robust, you can wrap the call to g.close() in its own try...except block. This ensures that an error during cleanup doesn't mask the root cause of a failure.

Suggested change
finally:
if g:
g.close()
finally:
if g:
try:
g.close()
except Exception as e:
print(f"Error closing guestfs for {image_path}, ignoring: {e}", file=sys.stderr)

finally:
if g:
g.close()
diff_cmd_outputs(['sfdisk', '--dump'], metal_from, metal_to)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO this is probably worth its own commit (i.e. I'm interested in the reasoning and future people may be interested in that reasoning too).


for image_path, target_dir in [(metal_from, dir_from), (metal_to, dir_to)]:
if os.path.exists(target_dir):
shutil.rmtree(target_dir)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I run cosa diff --metal && cosa diff --metal will the results of the previous run survive? i.e. it would be nice to detect if the work had already been done and use the previous work here.


g = None
try:
g = guestfs.GuestFS(python_return_dict=True)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we're still using GuestFS here, which is completely fine (I'm not sure of another unprivileged way to mount things anyway), but I'm wondering if the previous problems about all files not being detected/used are taken care of with this PR?

i.e. if we're still using GuestFS, what's different now that fixes the problem?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was because I had I/O errors with FUSE so I thought dumping everything would be more reliable.

I can't say if it's more reliable, i haven't done enough testing

excludes = ["*backingFsBlockDev"]

with tempfile.NamedTemporaryFile(suffix=".tar", delete=True) as tmp_tar:
g.tar_out("/", tmp_tar.name, xattrs=True, selinux=True, excludes= excludes)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
g.tar_out("/", tmp_tar.name, xattrs=True, selinux=True, excludes= excludes)
g.tar_out("/", tmp_tar.name, xattrs=True, selinux=True, excludes=excludes)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants