Releases: octue/octue-sdk-python
Fix problems with Dataflow-related dependencies and remove unused dependencies
Summary
Fix the installation of octue when including the dataflow extra by pre-determining the packages needed for apache-beam to work with GCP. This also significantly slims down the dependencies of octue by only including those explicitly needed.
Contents (#495)
Dependencies
- Widen range of acceptable packaging versions to avoid conflict with poetry
- Remove gcp option from apache-beam package
- Explicitly add necessary GCP package ranges for Dataflow
Fixes
- Ensure octue.services namespace is used for Dataflow service topics
Use python3.9 for testing
Contents (#499)
Operations
- Use python3.9 for testing
- Use and run latest pre-commit hooks
Dependencies
- Align pyproject.toml code style package versions with those used by pre-commit
- Replace flake8-isort with isort
Ensure outputs of analyses are always validated against the twine
Contents (#468)
Enhancements
- Call Analysis.finalise if it isn't called in an app
Fixes
- Ensure output values sent to stdout by CLI are JSON formatted
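As a minimal sketch of why JSON formatting matters here (this is not the SDK's own implementation, and write_output_values is a hypothetical helper name): printing a Python dictionary directly gives a repr with single quotes, True and None, which JSON parsers reject, whereas json.dumps gives output any downstream tool can parse.

```python
import json
import sys


def write_output_values(output_values, stream=sys.stdout):
    """Write output values to a stream as a single JSON document.

    Hypothetical helper for illustration only; not part of the octue API.
    """
    # json.dumps emits double-quoted strings, true/false and null, so the
    # result can be parsed by any JSON-aware consumer of the CLI's stdout.
    stream.write(json.dumps(output_values) + "\n")


if __name__ == "__main__":
    write_output_values({"height": 3.2, "valid": True, "notes": None})
```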
Add metadata to blobs before uploading to cloud storage
Summary
Avoid potential race conditions between file and metadata upload in, for example, Cloud Functions that are triggered by file upload and rely on certain metadata being immediately present on those files.
This potential problem was brought to light in this issue: aerosense-ai/data-gateway#87.
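The ordering problem can be illustrated with a small in-memory simulation. Everything in it (FakeBucket, patch_metadata, the file names and the metadata key) is hypothetical and stands in for a real bucket with an upload-triggered function. It is a sketch of the race, not the SDK's code.

```python
class FakeBucket:
    """In-memory stand-in for a cloud bucket with an upload trigger (hypothetical)."""

    def __init__(self, on_upload):
        self.objects = {}
        self.on_upload = on_upload

    def upload(self, name, data, metadata=None):
        # Data and metadata are stored in one operation, then the trigger fires.
        self.objects[name] = {"data": data, "metadata": dict(metadata or {})}
        self.on_upload(name, self.objects[name])

    def patch_metadata(self, name, metadata):
        # A separate, later request, as in "upload first, add metadata after".
        self.objects[name]["metadata"].update(metadata)


seen_by_trigger = []


def record_metadata(name, stored_object):
    # Snapshot the metadata a triggered function would see at upload time.
    seen_by_trigger.append((name, dict(stored_object["metadata"])))


bucket = FakeBucket(on_upload=record_metadata)

# Racy order: the trigger fires before the metadata exists.
bucket.upload("old-order.csv", b"1,2,3")
bucket.patch_metadata("old-order.csv", {"sensor": "a1"})

# Fixed order: the metadata travels with the upload, so the trigger sees it.
bucket.upload("new-order.csv", b"1,2,3", metadata={"sensor": "a1"})
```

In this simulation the trigger sees empty metadata for old-order.csv and the full metadata for new-order.csv. With the google-cloud-storage client, the equivalent is expected to be assigning blob.metadata before calling an upload method rather than patching the blob afterwards.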
Contents (#494)
Fixes
- Add metadata to blobs before uploading to cloud storage
Ensure children can always access input datasets
Summary
Use signed URLs to point to cloud datasets in input manifests when sending them to children for processing. This means the children will always be able to access the input datasets - not just in the special case of the parent and child having access to the same buckets.
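To show the idea behind a signed URL, here is a generic HMAC-based sketch. It is not Google Cloud Storage's signing scheme and not the SDK's implementation. The key, the URL and the function names are all assumptions made for illustration. The point is that the URL itself carries a time-limited proof of access, so the child needs no bucket permissions of its own.

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode, urlparse

SECRET_KEY = b"hypothetical-signing-key"  # Assumption: known only to the storage side.


def sign_url(url, expires_at, key=SECRET_KEY):
    """Return the URL with an expiry timestamp and an HMAC signature appended."""
    message = f"{url}\n{expires_at}".encode()
    signature = hmac.new(key, message, hashlib.sha256).hexdigest()
    return f"{url}?{urlencode({'expires': expires_at, 'signature': signature})}"


def verify_url(signed_url, now=None, key=SECRET_KEY):
    """Return True if the signature matches the URL and it has not expired."""
    parsed = urlparse(signed_url)
    query = parse_qs(parsed.query)
    try:
        expires_at = int(query["expires"][0])
        signature = query["signature"][0]
    except (KeyError, ValueError):
        # Unsigned or malformed URLs are rejected.
        return False
    base_url = f"{parsed.scheme}://{parsed.netloc}{parsed.path}"
    message = f"{base_url}\n{expires_at}".encode()
    expected = hmac.new(key, message, hashlib.sha256).hexdigest()
    now = time.time() if now is None else now
    return hmac.compare_digest(signature, expected) and now < expires_at


if __name__ == "__main__":
    # A parent would put the signed URL, not the bare path, in the manifest.
    dataset_url = "https://storage.example.com/my-bucket/my-dataset"
    manifest = {"datasets": {"my_dataset": sign_url(dataset_url, int(time.time()) + 3600)}}
    print(verify_url(manifest["datasets"]["my_dataset"]))
```

The expiry is part of the signed message, so a recipient cannot extend access by editing the expires parameter.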
Contents (#491)
Fixes
- Use signed URLs when sending input manifests to services
Refactoring
- Factor out dataset URL signing to new Manifest method
Testing
- Simplify service tests' arguments
- Update test method docstring
- Ignore irrelevant files in test
Get metadata for signed URL datafiles
Summary
Allow metadata from datafiles instantiated from signed URLs to be retrieved. This has been tested with the real Google Cloud Storage endpoint and works in production, but doesn't currently work with the storage emulator used in our tests. See #489 for more information.
Contents (#490)
Fixes
- Get datafile metadata when instantiating from a signed URL
Dependencies
- Use latest gcp-storage-emulator for tests
Avoid h5py ImportError when not using HDF5 files
Fix poetry/importlib-metadata dependency resolution failure for docs
Contents (#484)
Dependencies
- Use poetry pre-release 1.2.0b2 in docs requirements to avoid importlib-metadata dependency resolution failure
Improve documentation
Summary
This PR is a complete overhaul of the octue documentation. The docs are now high-level, much more comprehensive, and make it significantly easier to get started with the SDK.
Contents (#441)
IMPORTANT: There is 1 breaking change.
Enhancements
- 💥 BREAKING CHANGE: Remove redundant name argument from Child
- Improve and simplify the template apps
- Improve CLI help text
- Remove redundant "logger" logic in Serialisable
- Remove credentials and monitors parameters from Analysis constructor
Fixes
- Use unique folder for output data in fractal template app
Dependencies
- Use latest docs packages and unify docs dependencies with those specified in pre-commit-config.yaml
- Add documentation dev dependencies to pyproject.toml
- Update poetry.lock file
Refactoring
- Rename fractal template
- Rename Analysis.app_src to Analysis.app_source
- Make file hashing in Datafile clearer
Testing
- Use non-existing and clearer names for test project and bucket
- Prefix created bucket names with test bucket name
Documentation
- Summarise SDK features on docs index page
- Update installation page
- Add repo, organisation, version, and copyright info to docs
- Dynamically get octue version to display in docs
- Add data containers page
- Summarise data container key features on their own pages
- Improve data containers' pages
- Add page on asking questions to digital twins
- Add digital twin creation page
- Remove old cloud path usages from docs
- Add link to deprecated code removal schedule
- Improve README
- Move contribution guidelines into separate doc and update it
- Add selected API documentation
- Link to automated API docs from manually-written docs
- Add info on how to install hdf5 dependency with poetry
- Replace Datafile context manager with Datafile.open context manager in docs
- Link to template apps in docs
- Link to default dockerfiles in docs
- Add information about semver to version history page
- Add definitions to docs using a custom admonition
- Add example data container use cases
- Update Service docstrings
- Improve Child docstrings
- Improve logging page
- Remove outdated cloud storage docs
- Move information about Analysis instances into class docstring
- Link to Analysis API docs in creating services doc
- Improve Analysis docstring
- Merge deploying services page into creating services page
- Update licence to cover 2022
- Update FilterContainer docstrings
- Simplify filter container documentation
- Move available filters into its own page
- Add documentation covering CLI
- Add documentation on running services locally
- Merge child services page into asking questions page
- Improve Datafile.open docstring
- Add authentication page to docs
Upgrade instructions
💥 Remove redundant name argument from Child
Remove the name argument from the Child constructor and use name-based services.