Implement Azure Blob Store #1554

Open
amankrx wants to merge 1 commit into main from azure-blob-store

Conversation

amankrx
Contributor

@amankrx amankrx commented Dec 26, 2024

Description

This PR implements an Azure Blob Store that closely follows the existing AWS S3 implementation.
It uses the Azure SDK crates (azure_core, azure_storage, and azure_storage_blobs). These crates are still unofficial and under active development, so we should watch for breaking changes in future releases.

Apart from that, I started development by first creating a POC: https://gist.github.com/amankrx/45e7d2a6ed935aa13dda0318681af2ad
The POC tests getting and uploading blobs with all default Azure SDK features disabled. It creates a custom HttpClient and signs the requests manually to perform the transactions.

Fixes #1542
/claim #1542

Type of change

Please delete options that aren't relevant.

  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

How Has This Been Tested?

I tested it locally using the bazel test command with the azure_blob_backend.json5 file, configured to use actual blob storage. To test this locally, set the environment variable AZURE_STORAGE_KEY to the storage account's access key. Additionally, I created a test suite that covers both the individual happy paths and the error scenarios.
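
As a rough illustration of the environment-variable step, the sketch below shows one way a local test harness could read the access key and fail early with a clear message when it is missing. The helper names resolve_account_key and account_key_from_env are hypothetical and are not taken from this PR; only the variable name AZURE_STORAGE_KEY comes from the description above.

```rust
use std::env;

/// Hypothetical helper: resolves the access key through a lookup function so
/// the logic can be exercised without touching the real process environment.
pub fn resolve_account_key<F>(lookup: F) -> Result<String, String>
where
    F: Fn(&str) -> Option<String>,
{
    match lookup("AZURE_STORAGE_KEY") {
        // Reject blank values so a misconfigured shell fails fast.
        Some(key) if !key.trim().is_empty() => Ok(key),
        Some(_) => Err("AZURE_STORAGE_KEY is set but empty".to_string()),
        None => Err("AZURE_STORAGE_KEY is not set".to_string()),
    }
}

/// Hypothetical convenience wrapper that reads the real environment.
pub fn account_key_from_env() -> Result<String, String> {
    resolve_account_key(|name| env::var(name).ok())
}
```

Passing the lookup as a closure keeps the check testable without mutating process-wide state.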

Checklist

  • Updated documentation if needed
  • Tests added/amended
  • Local testing completed
  • bazel test //... passes locally
  • PR is contained in a single commit (using git commit --amend)

@CLAassistant

CLAassistant commented Dec 26, 2024

CLA assistant check
All committers have signed the CLA.

@amankrx amankrx marked this pull request as draft December 26, 2024 20:05
@amankrx amankrx marked this pull request as ready for review December 29, 2024 22:10
@amankrx amankrx force-pushed the azure-blob-store branch from 7e2a535 to 2fa208e Compare May 6, 2025 02:56
@amankrx amankrx force-pushed the azure-blob-store branch from 2fa208e to 65c0342 Compare May 6, 2025 03:13
@amankrx
Contributor Author

amankrx commented May 6, 2025

I have updated the PR with the latest changes and have verified that it works locally.
Awaiting review from @MarcusSorealheis @aaronmondal

@amankrx
Contributor Author

amankrx commented May 11, 2025

Any updates here?

Member

@aaronmondal aaronmondal left a comment

Looks pretty good already. One thing I'm wondering is what happened to the buffer logic from the S3 implementation. Was that not usable here or did it have other issues?

Regarding the code, there seems to be quite a lot of duplication going on though. WDYT about something roughly like this?

+---------------Current Architecture-----+  +------Refactored Architecture----+
|                                        |  |                                 |
|       +----------------------------+   |  | +----------------------------+  |
|       |      StoreDriver Trait     |   |  | |      StoreDriver Trait     |  |
|       |  Common interface for all  |   |  | |  Common interface for all  |  |
|       |         storage            |   |  | |          storage           |  |
|       +----------------------------+   |  | +----------------------------+  |
|                    |                   |  |                |                |
|              implemented by            |  |          implemented by         |
|                    |                   |  |                |                |
|                    v                   |  |                v                |
| +----------+ +----------+ +----------+ |  | +-----------------------------+ |
| |          | |          | |          | |  | |  GenericCloudStore<P,NowFn> | |
| |  S3Store | |AzureStore| | GcsStore | |  | |   Common implementation     | |
| |          | |          | |          | |  | |with optimized critical paths| |
| +----------+ +----------+ +----------+ |  | +-----------------------------+ |
|                                        |  |                |                |
+----------------------------------------+  |              uses               |
                                            |                |                |
                                            |                v                |
                                            | +----------------------------+  |
                                            | |  CloudStorageProvider Trait|  |
                                            | |    Minimal provider-       |  |
                                            | |    specific operations     |  |
                                            | +----------------------------+  |
                                            |                |                |
                                            |          implemented by         |
                                            |                |                |
                                            |      +---------+---------+      |
                                            |      |         |         |      |
                                            |      v         v         v      |
                                            | +------+   +------+   +------+  |
                                            | |  S3  |   |Azure |   | GCS  |  |
                                            | +------+   +------+   +------+  |
                                            |                                 |
                                            +---------------------------------+

Other parts also look heavily duplicated among the now three store implementations, such as the object path logic and the request transformations.
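
To make the refactored layout in the diagram more concrete, here is a minimal sketch of how it could look. It is a simplified, synchronous, standard-library-only illustration, not code from this PR. The method names on CloudStorageProvider, the key_prefix and consider_expired_after fields, and the InMemoryProvider stand-in are all assumptions chosen for the example. A real version would be async and would stream data rather than buffer it.

```rust
use std::collections::HashMap;
use std::sync::Mutex;
use std::time::{Duration, SystemTime};

/// Hypothetical minimal set of provider-specific operations.
pub trait CloudStorageProvider {
    fn put_object(&self, path: &str, data: Vec<u8>) -> Result<(), String>;
    fn get_object(&self, path: &str) -> Result<Option<Vec<u8>>, String>;
    /// Returns the object's size and last-modified time, if it exists.
    fn head_object(&self, path: &str) -> Result<Option<(u64, SystemTime)>, String>;
}

/// Shared logic written once: key-to-path mapping and expiry checks.
pub struct GenericCloudStore<P, NowFn> {
    provider: P,
    now_fn: NowFn,
    key_prefix: String,
    consider_expired_after: Duration,
}

impl<P, NowFn> GenericCloudStore<P, NowFn>
where
    P: CloudStorageProvider,
    NowFn: Fn() -> SystemTime,
{
    pub fn new(provider: P, now_fn: NowFn, key_prefix: &str, consider_expired_after: Duration) -> Self {
        Self { provider, now_fn, key_prefix: key_prefix.to_string(), consider_expired_after }
    }

    /// Object path logic lives in one place instead of once per store.
    fn object_path(&self, key: &str) -> String {
        format!("{}{}", self.key_prefix, key)
    }

    pub fn update(&self, key: &str, data: Vec<u8>) -> Result<(), String> {
        self.provider.put_object(&self.object_path(key), data)
    }

    pub fn get(&self, key: &str) -> Result<Option<Vec<u8>>, String> {
        self.provider.get_object(&self.object_path(key))
    }

    /// Returns the size of a live object; expired objects count as missing.
    pub fn has(&self, key: &str) -> Result<Option<u64>, String> {
        let (size, last_modified) = match self.provider.head_object(&self.object_path(key))? {
            Some(meta) => meta,
            None => return Ok(None),
        };
        // A zero duration disables the expiry check in this sketch.
        if !self.consider_expired_after.is_zero() {
            let age = (self.now_fn)()
                .duration_since(last_modified)
                .unwrap_or(Duration::ZERO);
            if age >= self.consider_expired_after {
                return Ok(None);
            }
        }
        Ok(Some(size))
    }
}

/// In-memory stand-in for an S3, Azure, or GCS provider (illustration only).
pub struct InMemoryProvider {
    stamp: SystemTime,
    objects: Mutex<HashMap<String, Vec<u8>>>,
}

impl InMemoryProvider {
    /// Every stored object reports `stamp` as its last-modified time.
    pub fn new(stamp: SystemTime) -> Self {
        Self { stamp, objects: Mutex::new(HashMap::new()) }
    }
}

impl CloudStorageProvider for InMemoryProvider {
    fn put_object(&self, path: &str, data: Vec<u8>) -> Result<(), String> {
        let mut objects = self.objects.lock().map_err(|e| e.to_string())?;
        objects.insert(path.to_string(), data);
        Ok(())
    }

    fn get_object(&self, path: &str) -> Result<Option<Vec<u8>>, String> {
        let objects = self.objects.lock().map_err(|e| e.to_string())?;
        Ok(objects.get(path).cloned())
    }

    fn head_object(&self, path: &str) -> Result<Option<(u64, SystemTime)>, String> {
        let objects = self.objects.lock().map_err(|e| e.to_string())?;
        Ok(objects.get(path).map(|data| (data.len() as u64, self.stamp)))
    }
}
```

With this split, each backend would only implement the three provider methods, while the path handling and the expiry check, parameterized over NowFn so tests can inject a clock, would exist once in GenericCloudStore.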

+@jhpratt

Reviewable status: 0 of 2 LGTMs obtained, and 0 of 10 files reviewed (waiting on @jhpratt)

Contributor Author

@amankrx amankrx left a comment

Exactly! This is something that I had planned as well. But it needs some restructuring that was out of scope for this PR, at least. We could create a separate PR that handles the GenericCloudStore, which would remove a lot of the duplication.

Reviewable status: 0 of 2 LGTMs obtained, and 0 of 10 files reviewed (waiting on @jhpratt)

Development

Successfully merging this pull request may close these issues.

💎 Implement an Azure store
4 participants