Description
There may be cases where ExtractTable needs access to a file in a private repo or bucket (e.g. an S3 bucket). Access to private images can be granted with presigned URLs. For example, an image in a private S3 bucket can be accessed via a presigned URL of the form:
https://[bucket_name].s3.amazonaws.com/[image_name].png?X-Amz-Algorithm=XXX&X-Amz-Credential=AKIA...%2Feu-west-2%2Fs3%2Faws4_request&X-Amz-Date=20230207T103049Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=xxxxxx
ExtractTable supports '.pdf', '.jpeg', '.jpg', and '.png' files, but the check is filepath.lower().endswith(self.__SUPPORTED_EXTENSIONS__), which fails for a presigned URL with the following error:
Exception: Failed to get response from ExtractTable API. Exception = Allowed file types are ('.pdf', '.jpeg', '.jpg', '.png')
This is because the URL ends with the signature query parameters rather than the file extension, even though the file itself is a valid image.
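To illustrate the failure, here is a minimal sketch that applies the same kind of suffix test to a presigned-style URL. The SUPPORTED_EXTENSIONS tuple mirrors the values in the error message, and the bucket name, object key, and signature are made-up placeholders. It is not the library's actual code.

```python
from urllib.parse import urlsplit

# Mirrors the tuple shown in the error message (assumed, not copied from the library).
SUPPORTED_EXTENSIONS = ('.pdf', '.jpeg', '.jpg', '.png')

# Placeholder presigned URL; bucket, key, and signature values are made up.
url = (
    "https://example-bucket.s3.amazonaws.com/scan.png"
    "?X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=abc123"
)

# The whole URL ends with the signature, so the suffix test fails.
full_url_ok = url.lower().endswith(SUPPORTED_EXTENSIONS)

# The path component alone still ends with the real extension.
path_ok = urlsplit(url).path.lower().endswith(SUPPORTED_EXTENSIONS)

print(full_url_ok, path_ok)  # False True
```

Checking only the path component is one possible way to validate the extension without being affected by the query string.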
The request is for an option to tell ExtractTable that the URL is presigned, and a second option to specify the delimiter that marks the end of the filename and the start of the signature (for S3 URLs, the ? that begins the query string).
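A rough sketch of how the two requested options might behave is shown below. The function name has_supported_extension and the parameters presigned and delimiter are hypothetical and are not part of the ExtractTable API. The tuple of extensions is taken from the error message.

```python
# Taken from the error message; the real attribute lives on the client class.
SUPPORTED_EXTENSIONS = ('.pdf', '.jpeg', '.jpg', '.png')


def has_supported_extension(filepath, presigned=False, delimiter="?"):
    """Hypothetical extension check that can ignore a presigned suffix.

    When presigned is True, everything from the first occurrence of
    delimiter onward is dropped before the extension is tested.
    """
    if presigned:
        # Keep only the part before the signature, e.g. ".../scan.png".
        filepath = filepath.split(delimiter, 1)[0]
    return filepath.lower().endswith(SUPPORTED_EXTENSIONS)


# Placeholder URL; the signature value is made up.
signed = "https://example-bucket.s3.amazonaws.com/scan.png?X-Amz-Signature=abc123"

print(has_supported_extension(signed))                  # False
print(has_supported_extension(signed, presigned=True))  # True
```

Defaulting presigned to False would leave the existing behaviour for local file paths and plain URLs unchanged.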