Add support for presigned URLs #53

@marktgraham

Description

There may be cases where ExtractTable needs access to a file in a private repository or bucket (e.g. an S3 bucket). Access to private images can be granted through presigned URLs. For example, an image in a private S3 bucket can be accessed via a presigned URL of the form:

https://[bucket_name].s3.amazonaws.com/[image_name].png?X-Amz-Algorithm=XXX&X-Amz-Credential=AKIA...%2Feu-west-2%2Fs3%2Faws4_request&X-Amz-Date=20230207T103049Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=xxxxxx

ExtractTable supports '.pdf', '.jpeg', '.jpg', and '.png' files, but the check is filepath.lower().endswith(self.__SUPPORTED_EXTENSIONS__), which fails for a presigned URL with the following error:

Exception: Failed to get response from ExtractTable API. Exception = Allowed file types are ('.pdf', '.jpeg', '.jpg', '.png')

This is because the URL ends with the signature query parameters rather than the file extension, even though the image itself is a valid image.
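The mismatch can be reproduced outside the library. The snippet below is a minimal sketch, not ExtractTable's actual code: the extensions tuple is copied from the error message, and the bucket name, file name, and signature values in the URL are made-up placeholders. It also shows that the path component returned by the standard library's `urllib.parse.urlsplit` still ends with the real extension.

```python
from urllib.parse import urlsplit

# Tuple copied from the error message above; the variable name is illustrative.
SUPPORTED_EXTENSIONS = (".pdf", ".jpeg", ".jpg", ".png")

# Placeholder presigned URL: bucket, key, and signature values are invented.
url = (
    "https://example-bucket.s3.amazonaws.com/table.png"
    "?X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=abc123"
)

# Same style of check as in the issue: looks at the end of the whole string.
plain_check = url.lower().endswith(SUPPORTED_EXTENSIONS)

# Checking only the URL path ignores the query string that carries the signature.
path = urlsplit(url).path
path_check = path.lower().endswith(SUPPORTED_EXTENSIONS)

print(plain_check)  # False
print(path)         # /table.png
print(path_check)   # True
```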

The request is for an option to tell ExtractTable that the URL is presigned, and a second option to specify the delimiter that marks the end of the filename and the beginning of the signature (in the S3 example above, the `?` that starts the query string).
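One way the two requested options could look is sketched below. This is only an illustration of the idea: `has_supported_extension`, `presigned`, and `delimiter` are hypothetical names and do not exist in ExtractTable today.

```python
# Tuple copied from the error message in this issue; the name is illustrative.
SUPPORTED_EXTENSIONS = (".pdf", ".jpeg", ".jpg", ".png")


def has_supported_extension(filepath: str, presigned: bool = False,
                            delimiter: str = "?") -> bool:
    """Hypothetical helper: extension check that tolerates presigned URLs.

    When `presigned` is True, everything from the first `delimiter` onward
    (the signature part) is dropped before the extension is checked.
    """
    if presigned:
        filepath = filepath.split(delimiter, 1)[0]
    return filepath.lower().endswith(SUPPORTED_EXTENSIONS)


# Placeholder URL; bucket, key, and signature values are invented.
example_url = (
    "https://example-bucket.s3.amazonaws.com/table.png"
    "?X-Amz-Expires=3600&X-Amz-Signature=abc123"
)

print(has_supported_extension(example_url))                  # False
print(has_supported_extension(example_url, presigned=True))  # True
print(has_supported_extension("scan.PDF"))                   # True
```

Defaulting `presigned` to False keeps the current behaviour for local paths and plain URLs, so callers would have to opt in explicitly.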
