Google Cloud storage support #300

Open
wants to merge 16 commits into base: master

Conversation

githubzilla commented Oct 26, 2023

This pull request introduces support for Google Cloud Storage as an optional storage backend. Users can enable this feature by using the compile flag USE_GCP. Additionally, the USE_GCP flag can be used in conjunction with the USE_AWS flag to enable both the S3 and GCS storage backends simultaneously. To validate this feature, two new test suites, namely gcp_db_cloud_test and gcp_file_system_test, have been created.

Here are the steps to set up and run this feature:

# Set up the Google Cloud Storage client environment
# according to https://cloud.google.com/cpp/docs/setup

# Initialize the Google Cloud client environment and Application Default Credentials (ADC);
# provide the region/zone and project ID
gcloud init
# Log in to generate Application Default Credentials
gcloud auth application-default login

# Build the feature
USE_RTTI=1 USE_GCP=1 make all -j8

# Run the tests. Since GCS does not support domain-name-style bucket names (names containing dots),
# you have to customize the bucket prefix and name.
ROCKSDB_CLOUD_TEST_BUCKET_PREFIX=rockset- ROCKSDB_CLOUD_TEST_BUCKET_NAME=dbcloudtest GOOGLE_CLOUD_PROJECT=your_google_project_id ./gcp_file_system_test
ROCKSDB_CLOUD_TEST_BUCKET_PREFIX=rockset- ROCKSDB_CLOUD_TEST_BUCKET_NAME=dbcloudtest GOOGLE_CLOUD_PROJECT=your_google_project_id ./gcp_db_cloud_test
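
For non-interactive environments such as CI, where `gcloud auth application-default login` is not practical, Application Default Credentials can instead be supplied through a service-account key file. This is a minimal sketch, assuming the GCS backend resolves credentials through the standard ADC lookup; the key path is a placeholder:

# Non-interactive alternative to `gcloud auth application-default login`
# (placeholder key path; assumes the standard ADC lookup is used)
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account-key.json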


dhruba commented Oct 26, 2023

Thanks for contributing this code! Can we get an idea of the status of this code? For example, is it being used in any application? Any insights into where, how, and at what scale this code is being used would be great.

@@ -688,7 +688,7 @@ IOStatus S3StorageProvider::ExistsCloudObject(const std::string& bucket_name,
IOStatus S3StorageProvider::GetCloudObjectSize(const std::string& bucket_name,
const std::string& object_path,
uint64_t* filesize) {
HeadObjectResult result;

Can we get rid of this change? It is not needed and makes the diff large.

githubzilla (Author):

Done

githubzilla (Author) commented:

> Thanks for contributing this code! Can we get an idea of the status of this code? For example, is it being used in any application? Any insights into where, how, and at what scale this code is being used would be great.

This pull request (PR) originates from MonographDB. In our current setup, we use RocksDB Cloud as the log server backend to store log items, and the S3 backend has been performing well. We now aim to support Google Cloud as well, which is the motivation for this PR. The integration work is still in progress, but we expect to complete it in the near future.


airhorns commented Nov 1, 2024

Would be great to get this in!


dhruba commented Nov 4, 2024

Thanks for the great contribution. Some minor comments within.
Can you also sign the ICLA? See https://github.com/rockset/rocksdb-cloud/blob/master/CONTRIBUTING.md

githubzilla (Author) commented:

> Thanks for the great contribution. Some minor comments within. Can you also sign the ICLA? See https://github.com/rockset/rocksdb-cloud/blob/master/CONTRIBUTING.md

This is my CLA issue: #299

I notice there are some conflicting files; I will fix them later.


dhruba commented Nov 4, 2024

@airhorns, are you using this patch now? If so, let us know the size and scale of your testing on Google Cloud.

githubzilla (Author) commented:

Rebased gcs_support onto rockset:master. Here are the test results:

(All test cases passed, except those excluded by --gtest_filter, which also fail on rockset:master.)

  1. USE_AWS=1

LIBNAME=librocksdb-cloud-aws USE_RTTI=1 USE_AWS=1 make -j8 db_test db_test2 db_basic_test db_cloud_test cloud_manifest_test
export ROCKSDB_CLOUD_REGION="your-region"
export ROCKSDB_CLOUD_TEST_BUCKET_PREFIX="rockset."
export ROCKSDB_CLOUD_TEST_BUCKET_NAME="your-unique-bucket-name"
./db_test --gtest_filter=-DBTest.PurgeInfoLogs
./db_test2 --gtest_filter=-DBTest2.BestEffortsRecoveryWithSstUniqueIdVerification:DistributedFS/RenameCurrentTest.Flush/0
./db_basic_test --gtest_filter=-DBBasicTest.MultiGetStats:DBBasicTest.LastSstFileNotInManifest:DBBasicTest.RecoverWithNoCurrentFile:DeadlineIO/DBBasicTestMultiGetDeadline.MultiGetDeadlineExceeded/0
./env_basic_test
./db_cloud_test
./cloud_manifest_test

  2. USE_GCP=1

LIBNAME=librocksdb-cloud-gcp USE_RTTI=1 USE_GCP=1 make -j8 db_test db_test2 db_basic_test env_basic_test gcp_db_cloud_test cloud_manifest_test
export GOOGLE_CLOUD_PROJECT="your-gcp-project"
export ROCKSDB_CLOUD_TEST_BUCKET_PREFIX="rockset-"  # GCS bucket names used here must not contain dots
export ROCKSDB_CLOUD_TEST_BUCKET_NAME="your-unique-bucket-name"  # no dots
./db_test --gtest_filter=-DBTest.PurgeInfoLogs
./db_test2 --gtest_filter=-DBTest2.BestEffortsRecoveryWithSstUniqueIdVerification:DistributedFS/RenameCurrentTest.Flush/0
./db_basic_test --gtest_filter=-DBBasicTest.MultiGetStats:DBBasicTest.LastSstFileNotInManifest:DBBasicTest.RecoverWithNoCurrentFile:DeadlineIO/DBBasicTestMultiGetDeadline.MultiGetDeadlineExceeded/0
./env_basic_test
./db_cloud_test
./cloud_manifest_test
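
The PR description states that USE_GCP can be combined with USE_AWS to enable both backends at once; here is a minimal build sketch under the assumption that the flags compose the same way as the single-backend builds above:

# Hedged sketch: build with both the S3 and GCS backends enabled,
# assuming USE_AWS and USE_GCP compose like the single-flag builds above
USE_RTTI=1 USE_AWS=1 USE_GCP=1 make all -j8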
