Enhance Azure Blob Storage SDK with OpenReadAsync for Efficient Streaming Over DownloadAsync #97

@henrikroschmann

Description

An alternative to the current method is to use the OpenReadAsync method built into the Azure SDK for Blob Storage. It streams blob content on demand instead of downloading it upfront, which should improve performance for large blobs.

Current Implementation

The existing implementation uses DownloadAsync, which manually checks blob properties, handles edge cases (e.g., zero-sized blobs), and downloads the blob content upfront. This approach increases complexity and is less efficient for streaming large blobs.

Proposed Change

Update the method to use the built-in OpenReadAsync from the Azure SDK. Example implementation:

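// Namespaces this snippet relies on:
using System.IO;
using System.Threading;
using System.Threading.Tasks;
using Azure;
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Specialized;

public async Task<Stream> OpenReadAsync(string fullPath, CancellationToken cancellationToken = default)
{
    GenericValidation.CheckBlobFullPath(fullPath);

    // Split the full path into its container and the blob path inside it.
    (BlobContainerClient container, string path) = await GetPartsAsync(fullPath, false).ConfigureAwait(false);

    BlockBlobClient client = container.GetBlockBlobClient(path);

    try
    {
        // Let the SDK return a readable stream instead of downloading the blob upfront.
        return await client.OpenReadAsync(cancellationToken: cancellationToken).ConfigureAwait(false);
    }
    catch (RequestFailedException ex) when (ex.ErrorCode == "BlobNotFound")
    {
        // A missing blob is reported to the caller as null rather than as an exception.
        return null;
    }
}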
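
To illustrate how a caller might consume the returned stream, here is a minimal sketch. It assumes the proposed method returns null for a missing blob, as in the example above. The names StreamingReadSketch and CopyBlobAsync and the openRead delegate are hypothetical, and a MemoryStream stands in for the blob so the snippet runs without a storage account.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

public static class StreamingReadSketch
{
    // Copies the stream produced by openRead to destination in fixed-size chunks.
    // Returns the number of bytes copied, or null when openRead yields null
    // (the "blob not found" convention assumed from the proposal).
    public static async Task<long?> CopyBlobAsync(
        Func<Task<Stream>> openRead,
        Stream destination,
        int bufferSize = 81920,
        CancellationToken cancellationToken = default)
    {
        Stream source = await openRead().ConfigureAwait(false);
        if (source == null)
        {
            return null;
        }

        using (source)
        {
            // Only one buffer of bufferSize bytes is held in memory at a time.
            byte[] buffer = new byte[bufferSize];
            long total = 0;
            int read;
            while ((read = await source.ReadAsync(buffer, 0, buffer.Length, cancellationToken).ConfigureAwait(false)) > 0)
            {
                await destination.WriteAsync(buffer, 0, read, cancellationToken).ConfigureAwait(false);
                total += read;
            }
            return total;
        }
    }

    public static async Task Main()
    {
        var destination = new MemoryStream();

        // Stand-in for: () => storage.OpenReadAsync("container/large-file.bin")
        long? copied = await CopyBlobAsync(
            () => Task.FromResult<Stream>(new MemoryStream(new byte[200_000])),
            destination);
        Console.WriteLine(copied);

        // Stand-in for a missing blob.
        long? missing = await CopyBlobAsync(
            () => Task.FromResult<Stream>(null),
            destination);
        Console.WriteLine(missing.HasValue);
    }
}
```

The caller owns the returned stream and should dispose it. With a real blob, the chunked loop means memory use is bounded by the buffer rather than by the blob size.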

Purpose and Benefits

  1. Streamlined Code: OpenReadAsync abstracts the manual steps involved in downloading blob content, reducing the implementation's complexity.
  2. Efficient Streaming: Instead of downloading the entire blob upfront, OpenReadAsync streams data on demand, which is particularly beneficial for large blobs. This can reduce memory usage and improve performance. See the SDK reference for BlobBaseClient.OpenReadAsync(BlobOpenReadOptions, CancellationToken).
  3. Alignment with Azure SDK Best Practices: Encourages usage of modern SDK capabilities for optimized blob handling.
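
The overload cited in item 2 takes a BlobOpenReadOptions object, which lets the caller tune how the stream reads. The sketch below is only an illustration and requires the Azure.Storage.Blobs NuGet package. The class and method names are hypothetical, and the 4 MiB buffer size is an arbitrary assumption, not a recommendation from the SDK or this issue.

```csharp
using System.IO;
using System.Threading;
using System.Threading.Tasks;
using Azure.Storage.Blobs.Models;
using Azure.Storage.Blobs.Specialized;

public static class OpenReadOptionsSketch
{
    // Opens a read stream over the blob starting at startOffset.
    public static Task<Stream> OpenWithOptionsAsync(
        BlockBlobClient client,
        long startOffset = 0,
        CancellationToken cancellationToken = default)
    {
        // The constructor argument controls whether the blob may be modified
        // while it is being read; false is used here.
        var options = new BlobOpenReadOptions(false)
        {
            // Size of each ranged read performed by the stream (assumed value).
            BufferSize = 4 * 1024 * 1024,
            // Byte offset in the blob at which reading starts.
            Position = startOffset
        };

        return client.OpenReadAsync(options, cancellationToken);
    }
}
```

A larger buffer means fewer round trips per blob at the cost of more memory per open stream, so the right value depends on the workload.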

Why This Matters

This refactor aligns with performance and scalability goals by enabling efficient streaming of large blobs, making the application more resource-friendly. By simplifying the logic, it also reduces maintenance costs.
