Releases: octue/octue-sdk-python

Use twined=^0.5.0

16 May 11:05
9878126

Contents (#454)

Dependencies

  • Use twined=^0.5.0, which removes deprecated support for datasets provided as lists instead of as dictionaries

Remove deprecated code

12 May 17:40
8cec1a6

Summary

Remove deprecated code that's built up over the past few months.

Contents (#450)

IMPORTANT: There are 5 breaking changes.

Refactoring

  • BREAKING CHANGE: Remove deprecated cloud bucket name and path parameters

    Cloud paths must now be provided as gs://bucket-name/path/within/bucket instead of via the bucket_name and path_within_bucket parameters

  • BREAKING CHANGE: Remove deprecated provision of datasets as a list to Manifest

    Datasets must now be provided as a dictionary mapping their name to their path or Dataset instance.

  • BREAKING CHANGE: Remove deprecation warning if using store_datasets parameter

    Upload datasets using Dataset.to_cloud after running Manifest.to_cloud if you wish to upload the datasets within a manifest

  • BREAKING CHANGE: Remove deprecated update_cloud_metadata parameter

    This has been renamed to update_metadata

  • BREAKING CHANGE: Remove Dataset.from_cloud and Dataset.from_local_directory

    Use the Dataset constructor instead, e.g. Dataset(path="gs://bucket-name/dataset") or
    Dataset(path="local/dataset")
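
For code still using the removed forms, the first two breaking changes above could be handled with small conversion helpers along these lines. This is only a sketch: `to_cloud_path` and `datasets_list_to_dict` are hypothetical names and not part of the SDK, and naming each dataset after the final component of its path is an assumption made for illustration.

```python
def to_cloud_path(bucket_name, path_within_bucket):
    """Join the two removed parameters into a single gs:// path (hypothetical helper)."""
    return f"gs://{bucket_name}/{path_within_bucket.lstrip('/')}"


def datasets_list_to_dict(dataset_paths):
    """Turn a list of dataset paths into a name -> path dictionary (hypothetical helper).

    Assumption: each dataset is named after the last component of its path.
    """
    return {path.rstrip("/").split("/")[-1]: path for path in dataset_paths}


cloud_path = to_cloud_path("bucket-name", "path/within/bucket")
datasets = datasets_list_to_dict(["gs://bucket-name/met_data", "local/turbines/"])
```

The resulting dictionary has the shape the notes describe for Manifest: dataset names mapped to their paths.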

Allow Dataset.to_cloud to infer cloud location

10 May 19:23
bc82804

Contents (#449)

Enhancements

  • Allow Dataset.to_cloud to infer cloud location

Fixes

  • Use cloud paths for relative paths when possible in Dataset.to_cloud

Refactoring

  • Factor out cloud properties and methods common to Dataset and Datafile into new CloudPathable mixin

Make Cloud Run services idempotent to questions

10 May 16:47
c70506d

Summary

Stop Cloud Run services from running a given question more than once and rely on Pub/Sub retries for connection problems.

Contents (#446)

Enhancements

  • Increase default delivery acknowledgement deadline to 120s

Fixes

  • Acknowledge and drop questions redelivered to Cloud Run services based on their UUIDs
  • Remove redundant question retries in Service and raise error instead if the delivery acknowledgement deadline is reached
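
The UUID-based dropping of redelivered questions can be pictured roughly as below. This is a minimal sketch and not the SDK's implementation: `QuestionDeduplicator` and `run_analysis` are hypothetical names, and the in-memory set is an assumption (a deployed service with several instances would need a shared store).

```python
class QuestionDeduplicator:
    """Run each question at most once, keyed on its UUID (hypothetical stand-in)."""

    def __init__(self, run_analysis):
        self._run_analysis = run_analysis
        self._seen_question_uuids = set()

    def handle(self, question_uuid, question):
        """Return the answer for a new question, or None for a redelivered one.

        In both cases the caller would acknowledge the delivery so that it is
        not sent again.
        """
        if question_uuid in self._seen_question_uuids:
            # Redelivery: drop it without running the analysis a second time.
            return None

        self._seen_question_uuids.add(question_uuid)
        return self._run_analysis(question)


runs = []
deduplicator = QuestionDeduplicator(run_analysis=lambda question: runs.append(question) or "answer")
first = deduplicator.handle("uuid-1", {"input_values": {"n": 1}})
second = deduplicator.handle("uuid-1", {"input_values": {"n": 1}})
```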

Refactoring

  • Factor out saving/updating of local metadata files
  • Rename local metadata save function
  • Rename Service.send_exception_to_asker to Service.send_exception
  • Use "answer" instead of "response" terminology in Service

Unify cloud and local dataset instantiation

09 May 15:59
267f3ed

Summary

Remove the need for alternative constructors when instantiating datasets from the cloud or from a local directory; the Dataset constructor now handles this automatically.

Contents (#445)

Enhancements

  • Unify local and cloud dataset instantiation via Dataset.__init__
  • Raise a deprecation warning if datasets are constructed via Dataset.from_cloud or Dataset.from_local_directory
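
The pattern described here, one constructor that works out where the dataset lives plus an old alternative constructor that warns and delegates, might look something like the following. `ExampleDataset` is a hypothetical stand-in and not the SDK's Dataset class; detecting cloud paths by the `gs://` prefix is an assumption based on the path format used elsewhere in these notes.

```python
import warnings

CLOUD_PREFIX = "gs://"


class ExampleDataset:
    """Hypothetical stand-in showing a unified constructor."""

    def __init__(self, path):
        self.path = path
        # Assumption: a cloud dataset is recognised by its gs:// prefix.
        self.exists_in_cloud = path.startswith(CLOUD_PREFIX)

    @classmethod
    def from_cloud(cls, cloud_path):
        """Deprecated alternative constructor that delegates to __init__."""
        warnings.warn(
            "from_cloud is deprecated; pass the cloud path to the constructor instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return cls(path=cloud_path)


cloud_dataset = ExampleDataset(path="gs://bucket-name/dataset")
local_dataset = ExampleDataset(path="local/dataset")
```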

Unify metadata update methods

06 May 10:43
280bb5f

Summary

Add a single method for updating stored datafile and dataset metadata that deduces whether to update the local or cloud metadata.

Contents (#443)

Enhancements

  • Add Datafile.update_metadata method
  • Add Dataset.update_metadata method
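
As a rough illustration of a single update method that deduces which store to write to, consider the sketch below. `ExampleDatafile` is a hypothetical stand-in; the `gs://` check and the recording of calls in a list are assumptions made so the example is self-contained, and the real methods write to the cloud object or the local metadata file.

```python
class ExampleDatafile:
    """Hypothetical stand-in showing one update method that picks the store."""

    def __init__(self, path):
        self.path = path
        self.updates = []  # Records which store was updated, for illustration only.

    @property
    def exists_in_cloud(self):
        # Assumption: cloud paths are recognised by their gs:// prefix.
        return self.path.startswith("gs://")

    def update_cloud_metadata(self):
        self.updates.append("cloud")

    def update_local_metadata(self):
        self.updates.append("local")

    def update_metadata(self):
        """Update whichever stored metadata matches where the file lives."""
        if self.exists_in_cloud:
            self.update_cloud_metadata()
        else:
            self.update_local_metadata()


cloud_file = ExampleDatafile("gs://bucket-name/dataset/file.csv")
local_file = ExampleDatafile("local/dataset/file.csv")
cloud_file.update_metadata()
local_file.update_metadata()
```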

Refactoring

  • Rename update_cloud_metadata parameter to update_metadata in Datafile instantiation and raise a deprecation warning if the old parameter name is used

Add ability to download datafiles from a signed URL without cloud permissions

03 May 18:03
e1f637b

Contents (#440)

Enhancements

  • Add ability to download datafiles from a signed URL without cloud permissions
  • Add Datafile.generate_signed_url method
  • Raise error if trying to modify a URL-based datafile
  • Raise error if trying to generate a signed URL for a local datafile or dataset
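
A signed URL carries its authorisation in the URL itself, so downloading from one needs no cloud credentials. A minimal sketch using only the Python standard library is shown below; `download_from_signed_url` is a hypothetical helper and not the SDK's API.

```python
import shutil
import urllib.request


def download_from_signed_url(signed_url, local_path):
    """Stream the object behind a signed URL to a local file (hypothetical helper).

    No credentials are supplied: the request is a plain GET on the URL.
    """
    with urllib.request.urlopen(signed_url) as response, open(local_path, "wb") as f:
        shutil.copyfileobj(response, f)

    return local_path
```

The URL itself would come from something like the Datafile.generate_signed_url method listed above, called by someone who does have access to the bucket.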

Refactoring

  • Use URI terminology in cloud storage path module

Testing

  • Test that datafiles and datasets can be downloaded from URLs without cloud permissions

Validate output location in runner instead of twine

03 May 15:34
64cb3e0

Contents (#439)

Enhancements

  • Validate output locations given to Runner

Dependencies

  • Use twined=^0.4.1

Allow dataset metadata to be updated

02 May 16:24
77ef960

Summary

This release provides public methods and a context manager for easily updating datasets' metadata. It also standardises the internals for getting, setting, and using metadata across Datafile and Dataset.

Contents (#436)

New features

  • Add context manager for updating dataset stored metadata
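
The general shape of a context manager for updating stored metadata is sketched below. It is not the SDK's context manager: `updated_metadata` is a hypothetical name, and storing the metadata in a local JSON file is an assumption made to keep the example self-contained.

```python
import json
from contextlib import contextmanager


@contextmanager
def updated_metadata(metadata_path):
    """Load stored metadata, let the caller change it, then write it back.

    Hypothetical helper: if the body of the `with` block raises, the write
    below is never reached and the stored metadata is left unchanged.
    """
    with open(metadata_path) as f:
        metadata = json.load(f)

    yield metadata

    with open(metadata_path, "w") as f:
        json.dump(metadata, f)
```

Usage would then look like `with updated_metadata("metadata.json") as metadata: metadata["tags"]["a"] = "b"`, with the change stored when the block exits cleanly.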

Enhancements

  • Add new Metadata mixin to Datafile, Dataset, and Manifest
  • Allow kwargs to be provided to Dataset.from_cloud

Fixes

  • Stop creating local metadata file on instantiation of Dataset
  • Stop implicitly uploading metadata when calling Dataset.from_cloud
  • Add missing name property setter to Dataset
  • Use correct metadata path for signed URL datasets

Refactoring

  • Factor out metadata method into new Metadata mixin
  • Rename Dataset._upload_cloud_metadata to Dataset.update_cloud_metadata
  • Rename Dataset._save_local_metadata to Dataset.update_local_metadata

Simplify datafile metadata resolution order

02 May 13:30
e78370c

Summary

This release simplifies the Datafile class internals and clarifies its metadata resolution order. Stored metadata will now be used in preference to instantiation metadata unless the hypothetical parameter is True, allowing the removal of some confusing internal logic from the class.
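
The resolution order described above can be summarised in a few lines. This is a sketch of the rule only, not the Datafile internals: `resolve_metadata` is a hypothetical function, and falling back to the instantiation metadata when nothing is stored is an assumption covering brand-new datafiles.

```python
def resolve_metadata(instantiation_metadata, stored_metadata, hypothetical=False):
    """Pick which metadata a re-instantiated datafile ends up with (hypothetical).

    Stored metadata wins unless `hypothetical` is True. Assumption: if nothing
    is stored yet, the instantiation metadata is used.
    """
    if hypothetical or stored_metadata is None:
        return dict(instantiation_metadata)

    return dict(stored_metadata)


stored = {"tags": {"stage": "raw"}}
given = {"tags": {"stage": "processed"}}

default_result = resolve_metadata(given, stored)
hypothetical_result = resolve_metadata(given, stored, hypothetical=True)
```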

Contents (#433)

Enhancements

  • If hypothetical is not True when re-instantiating existing datafiles, always use their stored metadata (from the cloud object or local metadata file)
  • Store cloud metadata on Datafile instances without the octue__ namespace prefix in its keys
  • Make Datafile metadata update methods public so they can be called easily by users
  • Make it optional whether to include the SDK version in the output of Datafile.metadata
  • Return None from GoogleCloudStorageClient.get_metadata if bucket not found instead of raising an error
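
Storing cloud metadata without the octue__ namespace prefix amounts to a key-renaming step like the one sketched here. `strip_namespace` and the example keys are hypothetical; only the prefix itself comes from the notes above.

```python
NAMESPACE_PREFIX = "octue__"


def strip_namespace(cloud_metadata):
    """Return a copy of the metadata with the namespace prefix removed from its keys.

    Hypothetical helper: keys without the prefix are passed through unchanged.
    """
    return {
        (key[len(NAMESPACE_PREFIX):] if key.startswith(NAMESPACE_PREFIX) else key): value
        for key, value in cloud_metadata.items()
    }


stripped = strip_namespace({"octue__tags": {"a": "b"}, "octue__id": "abc", "other": 1})
```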

Fixes

  • Allow instantiation of a cloud datafile with a non-existent or inaccessible cloud path (defer raising errors until attempting to access it)

Refactoring

  • Simplify Datafile internals by removing the "initialisation parameters" concept
  • Rename Datafile._get_local_metadata to Datafile._use_local_metadata
  • Align Datafile._use_cloud_metadata and Datafile._use_local_metadata methods
  • Factor out setting Datafile instance metadata from stored metadata into _set_metadata method