AssetManagerClient¶
-
class scenebox.clients.AssetManagerClient
(asset_type='sessions', asset_manager_url=None, metadata_only=False, auth_token=None, cache_disabled=False, is_non_standard_asset=False, kv_store=None, num_threads=20)¶ Asset manager client for fetching/putting assets and metadata.
- Parameters:
asset_type (str) – The type of asset.
asset_manager_url (Optional[str]) – The URL of the asset manager.
metadata_only (bool) – Whether to only return metadata.
auth_token (Optional[str]) – Authentication token of the user; must be provided.
cache_disabled (bool) – Disable caching; defaults to False.
is_non_standard_asset (bool) – Whether the asset is a non-standard asset. If True, ignore asset-type checks.
kv_store (Optional[KVStoreTemplate]) – Key-value store to cache bytes; defaults to None.
Methods
add_auxiliary_metadata – Add auxiliary metadata to an asset.
assets_iterator – Retrieve an iterator for Assets with a search query.
average – Average a specified field over assets that satisfy a query.
cardinality – Count the distinct values of a specified field over assets that satisfy a query.
clear_cache – Clear the asset manager’s Redis cache by organization.
copy – Copy an asset.
count – Count the number of assets satisfying a query.
delete – Delete an asset.
delete_with_list – Delete assets specified with a list of asset IDs.
delete_with_query – Delete assets specified with a search query.
download_data_in_batch – Download asset files using a search query or ids.
exists – Check if an asset exists.
exists_multiple – Check if assets exist.
get_bytes – Get the data bytes of an asset.
get_bytes_from_kv_store – Get the data bytes from kv_store if available.
get_bytes_in_batch – Get the data bytes for a list of assets.
get_metadata – Fetch an asset’s metadata.
get_metadata_in_batch – Fetch the metadata of multiple assets.
get_url – Get the public URL of an asset.
get_url_in_batch – Get the public URLs of a list of assets.
put_directory – Convert a directory to a zipfile and upload.
put_file – Register and upload a file with the asset manager.
remove_key_from_index – Remove a key from index and its mapping (Admin only).
search_assets – Retrieve asset IDs with a search query.
search_assets_large – Retrieve all asset IDs matching a search query.
search_meta – Retrieve asset metadata with a search query.
search_meta_large – Retrieve all asset metadata matching a search query.
sum – Sum a specified field over assets that satisfy a query.
summary_meta – Get a metadata summary.
update_aux_metadata – Update an asset’s auxiliary metadata.
update_by_query – Update a field on the assets that match a query.
update_metadata – Update an asset’s metadata.
update_metadata_batch – Update the metadata of multiple assets.
upsert_metadata – Upsert an asset’s metadata.
wait_for_batch_update_task – Wait for an update-by-query task to complete and return its status.
with_asset_state – Set the asset state, for use in chaining.
-
delete
(id, wait_for_deletion=True, raise_for_failure=False)¶ Delete an asset.
- Parameters:
id – The ID of the asset to delete.
wait_for_deletion – If True, polls until the specified asset no longer exists. Otherwise, returns immediately (even if the asset is still in the process of being deleted).
raise_for_failure – If True, raises an exception when the deletion fails.
- Return type:
dict
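The wait_for_deletion behaviour described above amounts to polling until the asset no longer exists. The following is a minimal sketch of that idea, assuming only the documented exists(id) method. The wait_until_deleted helper, its timeout and interval parameters, and the FakeClient stand-in are hypothetical and are not part of scenebox.

```python
import time


def wait_until_deleted(client, asset_id, timeout=30.0, interval=0.5):
    """Poll client.exists(asset_id) until it returns False or the timeout elapses.

    Returns True if the asset disappeared in time, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        if not client.exists(asset_id):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


class FakeClient:
    """Hypothetical in-memory stand-in so the sketch runs without a server."""

    def __init__(self, checks_before_gone):
        self.remaining = checks_before_gone

    def exists(self, asset_id):
        # Report the asset as present for the first few checks only.
        if self.remaining > 0:
            self.remaining -= 1
            return True
        return False
```

With a real client, a call such as wait_until_deleted(client, "some_id") could follow client.delete("some_id", wait_for_deletion=False) when a custom timeout is wanted.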
-
put_directory
(directory_path, metadata, temp_dir=PosixPath('/tmp'))¶ Convert a directory to a zipfile and upload.
- Return type:
str
-
put_file
(file_object, id=None, folder=None, owner_organization_id=None, content_type=None, content_encoding=None, content_size=None, add_to_redis_cache=False, retry=False)¶ Register and upload a file with the asset manager Args:
file (BytesIO or bytes): bytes object id (str): the uid of the file. If not provided, will be set automatically owner_organization_id (str): file owner organization id wait_for_completion (bool): Whether to wait for the upload to finish before returning retry: should we retry?
- Return type:
ObjectAccess
- Returns:
object_access for the uploaded file
-
get_url
(id, expiration=43200)¶ Get the public URL of an asset.
- Parameters:
id – The ID of the asset to get the URL of.
expiration – URL expiration time in seconds (default 43200, i.e. 12 hours).
- Returns:
The URL of the specified asset.
- Return type:
str
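Because expiration is given in seconds, it can help to derive the value from a readable duration. A small sketch using only the standard library follows; the expiration_seconds helper is hypothetical and not part of scenebox.

```python
from datetime import timedelta


def expiration_seconds(**duration):
    """Convert a duration (keyword arguments accepted by timedelta) to whole seconds."""
    return int(timedelta(**duration).total_seconds())


# Twelve hours corresponds to the documented default of 43200 seconds.
default_expiration = expiration_seconds(hours=12)
one_hour = expiration_seconds(hours=1)
```

The result can then be passed as client.get_url(id, expiration=one_hour).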
-
get_bytes_from_kv_store
(id)¶ Get the data bytes from kv_store if available.
- Parameters:
id – The ID of the asset to get the bytes of.
- Return type:
Optional
[bytes
]
-
get_bytes
(id, add_to_diskcache=False, add_to_redis_cache=True, url=None, retry=False)¶ Get the data bytes of an asset.
- Parameters:
id – The ID of the asset to get the bytes of.
add_to_diskcache – bool. Default is False. Flag to add retrieved asset to diskcache.
add_to_redis_cache – bool. Default is True. Flag to add retrieved asset to redis cache.
url – Optional string. Url of the asset.
retry – If True, retries the call for total_tries set in retry decorator
- Returns:
The data bytes of the specified asset.
- Return type:
bytes
-
get_bytes_in_batch
(ids, add_to_redis_cache=True, add_to_diskcache=False)¶ Get the data bytes for a list of assets.
- Parameters:
ids – The IDs of the assets to get the data bytes of.
add_to_redis_cache – If True, adds the retrieved assets to the Redis cache.
add_to_diskcache – If True, adds the retrieved assets to the disk cache.
- Returns:
The fetched bytes in a dictionary. Keys are the input ids. Values are the corresponding data bytes.
- Return type:
dict[str, str]
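A common follow-up to a batch fetch is writing the returned dictionary to disk. The sketch below assumes the values are bytes objects, as the description above states, and that each asset ID is usable as a file name. The save_batch helper is hypothetical and not part of scenebox.

```python
from pathlib import Path


def save_batch(batch, destination_dir):
    """Write each {asset_id: data_bytes} entry to destination_dir/asset_id.

    Returns a dict mapping asset IDs to the written paths.
    """
    destination = Path(destination_dir)
    destination.mkdir(parents=True, exist_ok=True)
    written = {}
    for asset_id, data in batch.items():
        path = destination / asset_id
        path.write_bytes(data)
        written[asset_id] = path
    return written
```

For example, save_batch(client.get_bytes_in_batch(ids), "downloads") would store one file per asset under the downloads directory.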
-
get_url_in_batch
(ids)¶ Get the public URLs of a list of assets.
- Parameters:
ids – The IDs of the assets to get the URLs of.
- Returns:
The URLs of the specified assets. Keys are the input ids. Values are the corresponding URLs.
- Return type:
dict[str, str]
-
exists
(id)¶ Check if an asset exists.
- Parameters:
id – The ID of the asset to check the existence of.
- Returns:
Returns True if the named asset exists. Otherwise, returns False.
- Return type:
bool
-
exists_multiple
(ids)¶ Check if assets exist.
- Parameters:
ids – List of the IDs of assets to check the existence of.
- Returns:
Returns True if the named assets exist. Otherwise, returns False.
- Return type:
bool
-
copy
(id, new_id)¶ Copy an asset.
- Parameters:
id – The ID of the asset to copy.
new_id – The ID to give to the created asset copy.
-
count
(search=None)¶ Count the number of assets satisfying a query.
- Parameters:
search – Query to locate the data subset of interest. Filters through existing assets according to the dictionary passed.
- Returns:
The number of assets that fulfil search.
- Return type:
int
-
sum
(field_name, is_nested=False, search=None)¶ Sum a specified field over assets that satisfy a query.
Find the aggregate sum of field_name over all the assets specified by search.
- Parameters:
field_name – The metadata field to sum over the filtered assets.
is_nested – Whether field is nested type or not.
search – Query to locate the data subset of interest. Filters through existing assets according to the dictionary passed.
- Returns:
The aggregate sum of the specified field name over the filtered assets.
- Return type:
float
-
average
(field_name, is_nested=False, search=None)¶ Average a specified field over assets that satisfy a query.
Find the aggregate average of field_name over all the assets specified by search.
- Parameters:
field_name – The metadata field to average over the filtered assets.
is_nested – Whether field is nested type or not.
search – Query to locate the data subset of interest. Filters through existing assets according to the dictionary passed.
- Returns:
The aggregate average of the specified field name over the filtered assets.
- Return type:
float
-
cardinality
(field_name, string_field=True, is_nested=False, search=None)¶ Count the distinct values of a specified field over assets that satisfy a query.
- Parameters:
field_name – The field to count the distinct values of.
is_nested – Whether field is nested type or not.
search – Search criteria for the assets.
- Returns:
Count of distinct values.
- Return type:
float
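To make the three aggregations (sum, average, cardinality) concrete, the sketch below computes the same quantities locally over a list of metadata dicts. It only illustrates what each value means; it is not how the asset manager computes them, and the aggregate helper and the sample records are made up.

```python
def aggregate(metadata_list, field_name):
    """Compute the sum, average, and distinct-value count of one field locally."""
    # Records that lack the field are skipped.
    values = [m[field_name] for m in metadata_list if field_name in m]
    total = sum(values)
    return {
        "sum": total,
        "average": total / len(values) if values else None,
        "cardinality": len(set(values)),
    }


records = [{"duration": 2.0}, {"duration": 4.0}, {"duration": 4.0}, {"other": 1}]
stats = aggregate(records, "duration")
```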
-
get_metadata
(id, retry=False)¶ Fetch an asset’s metadata.
- Parameters:
id – The ID of the asset to receive the metadata of.
retry – If True, retries the call for total_tries set in retry decorator
- Returns:
The metadata of the specified asset.
- Return type:
dict
-
get_metadata_in_batch
(ids, source_selected_fields=None)¶ Fetch the metadata of multiple assets.
- Parameters:
ids – The IDs of the assets to receive the metadata of.
source_selected_fields – List of desired metadata fields. By default, everything is retrieved.
- Returns:
A dict mapping each asset ID to its metadata, or to None if the asset is not found.
- Return type:
Dict[str, Optional[dict]]
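Since an ID that is not found maps to None, callers usually separate found and missing entries before using the result. A minimal sketch follows; the split_found_missing helper and the sample data are hypothetical.

```python
def split_found_missing(metadata_by_id):
    """Separate IDs that have metadata from IDs whose value is None."""
    found = {k: v for k, v in metadata_by_id.items() if v is not None}
    missing = [k for k, v in metadata_by_id.items() if v is None]
    return found, missing


# Made-up result shaped like the documented return value.
result = {"img_1": {"width": 640}, "img_2": None}
found, missing = split_found_missing(result)
```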
-
update_metadata
(id, metadata, buffered_write=False, replace_sets=False, geo_field=None, shape_group_field=None, nested_fields=None, retry=False)¶ Update an asset’s metadata.
- Parameters:
id – The ID of the asset to update the metadata of.
metadata – The metadata to update the existing metadata with.
buffered_write – If True, ingests the metadata in a buffered fashion.
replace_sets – If True, replaces existing metadata with metadata. Otherwise, attempts to append metadata to the existing metadata.
geo_field – Geolocation field
shape_group_field – Shape group field (example: UMAP)
nested_fields – nested fields (example: [“annotations_meta”])
retry – If True, retries the call for total_tries set in retry decorator
- Return type:
dict
-
update_aux_metadata
(id, metadata, geo_field=None, shape_group_field=None, nested_fields=None)¶ Update an asset’s auxiliary metadata.
- Parameters:
id – The ID of the asset to update the metadata of.
metadata – The metadata to update the existing metadata with.
geo_field – Geolocation field
shape_group_field – Shape group field (example: UMAP)
nested_fields – nested fields (example: [“annotations_meta”])
- Return type:
dict
-
update_metadata_batch
(ids, metadata, buffered_write=False, replace_sets=False, geo_field=None, shape_group_field=None, nested_fields=None)¶ Update the metadata of multiple assets.
- Parameters:
ids – The IDs of the assets to update the metadata of.
metadata – The metadata to update the existing metadata with.
buffered_write – If True, ingests the metadata in a buffered fashion.
replace_sets – If True, replaces existing metadata with metadata. Otherwise, attempts to append metadata to the existing metadata.
geo_field – Geolocation field
shape_group_field – Shape group field (example: UMAP)
nested_fields – nested fields (example: [“annotations_meta”])
retry – If True, retries the call for total_tries set in retry decorator
- Returns:
Update status for each of the chunks the ids and metadata were broken into.
- Return type:
dict
-
upsert_metadata
(id, metadata, geo_field=None, shape_group_field=None, nested_fields=None, buffered_write=False)¶ Upsert an asset’s metadata.
Inserts the fields of metadata that do not already exist and updates the fields that do.
- Parameters:
id – The ID of the asset to upsert the metadata of.
metadata – The metadata to upsert the existing metadata with.
geo_field – Geolocation field
shape_group_field – Shape group field (example: UMAP)
nested_fields – nested fields (example: [“annotations_meta”])
buffered_write – If True, ingests metadata in a buffered fashion.
- Return type:
dict
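The insert-or-update behaviour can be pictured on a plain dictionary. The sketch below shows only the field-level idea; how the asset manager treats nested fields, sets, or buffered writes is not covered, and the upsert_fields helper and sample values are hypothetical.

```python
def upsert_fields(existing, new_fields):
    """Return a copy of existing with new_fields inserted or overwritten."""
    merged = dict(existing)
    merged.update(new_fields)
    return merged


before = {"label": "car", "score": 0.5}
# "score" already exists and is updated; "reviewed" is new and is inserted.
after = upsert_fields(before, {"score": 0.9, "reviewed": True})
```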
-
update_by_query
(update_field, update_value, operation, limit=None, query=None)¶ Update a field on the assets that match a query.
- Parameters:
update_field – The field to update.
update_value – The update value.
operation – The operation to apply: add or remove.
limit – Limits the number of docs in the query result; limit docs are randomly selected. If not provided, all documents satisfying the query are added to the set.
query – Search query to limit the docs to update.
- Returns:
Update task ID(s). You can check the status of a task using the jobs/query_task endpoint. Once the task is complete, you may need to clear the cache for the associated assets to view the latest changes.
- Return type:
list
-
assets_iterator
(query=None, filters=None, sort_field='id', sort_order='asc')¶ Retrieve an iterator for Assets with a search query.
- Parameters:
query – Query to locate the data subset of interest. Filters through existing assets according to the dictionary passed.
filters – Filter names and values to append to query. Dict keys represent the existing filter names, and dict values are the filter values.
sort_field – The field to sort the results by.
sort_order – Specifies the sorting order: asc or desc.
- Returns:
Iterator for assets
- Return type:
Iterator[Asset]
-
search_assets
(query=None, filters=None, size=50, offset=0, sort_field=None, sort_order=None, search_after=None, scan=False)¶ Retrieve asset IDs with a search query.
Returns the top size matching hits. If more than 10000 hits are desired, use AssetManagerClient.search_assets_large().
- Parameters:
query – Query to locate the data subset of interest. Filters through existing assets according to the dictionary passed.
filters – Filter names and values to append to query. Dict keys represent the existing filter names, and dict values are the filter values.
size – Specifies the Elasticsearch search size. The maximum number of hits to return. Has no effect if scan is False.
offset – Specifies the Elasticsearch search offset. The number of hits to skip. Has no effect if scan is False.
sort_field – The field to sort the results by.
sort_order – Specifies the Elasticsearch string sorting order.
search_after – Used for retrieving all assets. See search_assets_large
scan – If True, uses the Elasticsearch scan capability. Otherwise, uses the Elasticsearch search API.
- Returns:
A list of the IDs of the assets fulfilling the search query.
- Return type:
List[str]
-
search_assets_large
(query=None, filters=None)¶ Retrieve all asset IDs matching a search query.
Returns all hits matching a search query. If fewer than 10000 hits are desired, use AssetManagerClient.search_assets() or SceneEngineClient.search_assets().
- Parameters:
query – Query to locate the data subset of interest. Filters through existing assets according to the dictionary passed.
filters – Filter names and values to append to query. Dict keys represent the existing filter names, and dict values are the filter values.
- Returns:
A list of the IDs of the assets fulfilling the search query.
- Return type:
List[str]
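The 10000-hit guidance for search_assets and search_assets_large can be wrapped in a small helper that picks the method from the hit count. This is a sketch under assumptions: it uses only the documented count, search_assets, and search_assets_large signatures, and it assumes that size is honoured by a regular (non-scan) search. The search_all_ids helper and the FakeClient stand-in are hypothetical.

```python
HIT_LIMIT = 10000  # threshold mentioned in the search_assets description


def search_all_ids(client, query=None):
    """Return all matching asset IDs, choosing the search method by hit count."""
    total = client.count(search=query)
    if total == 0:
        return []
    if total > HIT_LIMIT:
        return client.search_assets_large(query=query)
    return client.search_assets(query=query, size=total)


class FakeClient:
    """Hypothetical stand-in that records which search method was called."""

    def __init__(self, ids):
        self.ids = ids
        self.called = None

    def count(self, search=None):
        return len(self.ids)

    def search_assets(self, query=None, size=50):
        self.called = "search_assets"
        return self.ids[:size]

    def search_assets_large(self, query=None):
        self.called = "search_assets_large"
        return list(self.ids)
```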
-
remove_key_from_index
(key, wait_for_completion=False)¶ Remove a key from index and its mapping (Admin only).
- Parameters:
key – Key to be removed from index and its mapping
wait_for_completion – If True, polls until job is complete. Otherwise, continues execution and does not raise an error if the job does not complete.
- Returns:
The id of the Job that carries out the key removal.
- Return type:
str
-
delete_with_query
(query=None, filters=None, wait_for_completion=False)¶ Delete assets specified with a search query.
- Parameters:
query – Query to locate the data subset of interest. Filters through existing assets according to the dictionary passed.
filters – Filter names and values to append to query. Dict keys represent the existing filter names, and dict values are the filter values.
wait_for_completion – If True, polls until job is complete. Otherwise, continues execution and does not raise an error if the job does not complete.
- Returns:
The id of the Job that carries out the asset deletion.
- Return type:
str
-
delete_with_list
(assets_list=None, wait_for_completion=False)¶ Delete assets specified with a list of asset IDs.
- Parameters:
assets_list – Asset IDs of the assets to delete.
wait_for_completion – If True, polls until job is complete. Otherwise, continues execution and does not raise an error if the job does not complete.
- Returns:
The id of the Job that carries out the asset deletion.
- Return type:
str
-
search_meta
(query=None, size=50, filters=None, offset=0, sort_field=None, sort_order=None, scan=False, compress=False)¶ Retrieve asset metadata with a search query.
Returns the top size matching hits. If more than 10000 hits are desired, use AssetManagerClient.search_meta_large().
- Parameters:
query – Query to locate the data subset of interest. Filters through existing assets according to the dictionary passed.
filters – Filter names and values to append to query. Dict keys represent the existing filter names, and dict values are the filter values.
size – Specifies the Elasticsearch search size. The maximum number of hits to return. Has no effect if scan is False.
offset – Specifies the Elasticsearch search offset. The number of hits to skip. Has no effect if scan is False.
sort_field – The field to sort the results by.
sort_order – Specifies the Elasticsearch string sorting order.
scan – If True, uses the Elasticsearch scan capability. Otherwise, uses the Elasticsearch search API.
compress – Boolean. If set to True, a gzip compressed list of metadata is returned. Typically used in cases where the metadata returned is over 20MB.
- Returns:
A list of the metadata dicts of each asset that fulfills the search query.
- Return type:
List[dict]
-
search_meta_large
(query=None, filters=None, compress=False)¶ Retrieve all asset metadata matching a search query.
Returns all hits matching a search query. If fewer than 10000 hits are desired, use AssetManagerClient.search_meta() or SceneEngineClient.search_meta().
- Parameters:
query – Query to locate the data subset of interest. Filters through existing assets according to the dictionary passed.
filters – Filter names and values to append to query. Dict keys represent the existing filter names, and dict values are the filter values.
compress – Boolean. If set to True, a gzip compressed list of metadata is returned. Typically used in cases where the metadata returned is over 20MB.
- Returns:
A list of the metadata dicts of each asset that fulfills the search query.
- Return type:
List[dict]
-
summary_meta
(summary_request)¶ Get a metadata summary.
- Parameters:
summary_request – Dict of summary settings of the following form:
{
"search" (dict): <Query to locate the data subset of interest>,
"dimensions" (Optional[List[str]]): <Dimensions of summary (term aggregations)>,
"nested_dimensions" (Optional[List[str]]): <Nested dimensions of summary (term aggregations)>,
"custom_buckets_numeric" (Optional[List[dict]]): <Buckets for range aggregation>,
"custom_buckets_time" (Optional[List[dict]]): <Buckets for date_range aggregation>,
"aggregate_field" (Optional[str]): <Field for metric aggregations>,
"is_nested" (Optional[bool]): <Whether field is nested type or not>,
"aggregation_type" (Optional[str]): <Type of metric aggregation (e.g. avg, sum)>,
"max_size_for_agg" (Optional[str]): <Maximum number of buckets to return (default is 100)>
}
- Returns:
Metadata summary according to summary_request.
- Return type:
dict
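A summary_request is an ordinary dict, so it can be assembled with a small builder that leaves out unset optional keys. The sketch below uses only key names listed in the form above. The build_summary_request helper and the field names session_uid and duration are made up for illustration.

```python
def build_summary_request(search, dimensions=None, aggregate_field=None,
                          aggregation_type=None, is_nested=None):
    """Assemble a summary_request dict, omitting optional keys left as None."""
    request = {"search": search}
    optional = {
        "dimensions": dimensions,
        "aggregate_field": aggregate_field,
        "aggregation_type": aggregation_type,
        "is_nested": is_nested,
    }
    request.update({k: v for k, v in optional.items() if v is not None})
    return request


# "session_uid" and "duration" are hypothetical metadata field names.
summary_request = build_summary_request(
    search={},
    dimensions=["session_uid"],
    aggregate_field="duration",
    aggregation_type="avg",
)
```

The resulting dict would then be passed as client.summary_meta(summary_request).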
-
with_asset_state
(asset_type, metadata_only=False)¶ Set the asset state, for use in chaining.
E.g. client.with_asset_state("images", True).search_files({})
-
download_data_in_batch
(destination_dir, ids=None, query=None, download_metadata=True)¶ Download asset files using a search query or ids.
- Parameters:
destination_dir – The existing local directory to download the files to.
ids – The IDs of the assets to be downloaded.
query – Query to locate the data subset of interest. Filters through existing assets according to the dictionary passed.
download_metadata – If True, downloads asset metadata. Otherwise, does not download asset metadata.
- Returns:
Maps asset IDs obtained with query to local filepath and metadata. Highest level keys represent asset IDs. Lower level keys "filepath" and "metadata" map to local filepath and metadata for a specific ID.
- Return type:
dict
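The nested return value can be flattened when only the local paths are needed. A minimal sketch, assuming the result has the shape described above; the filepaths_by_id helper and the sample paths are hypothetical.

```python
def filepaths_by_id(download_result):
    """Extract {asset_id: filepath} from a download_data_in_batch-style dict."""
    return {
        asset_id: entry["filepath"]
        for asset_id, entry in download_result.items()
    }


# Made-up result shaped like the documented return value.
result = {
    "img_1": {"filepath": "/data/img_1.png", "metadata": {"width": 640}},
    "img_2": {"filepath": "/data/img_2.png", "metadata": {"width": 320}},
}
paths = filepaths_by_id(result)
```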
-
add_auxiliary_metadata
(id=None, ids=None, search=None, filters=None, **kwargs)¶ Add auxiliary metadata to an asset.
- Parameters:
id – An asset ID to add the auxiliary metadata to.
ids – A list of asset IDs to add the auxiliary metadata to.
search – Query to locate the data subset of interest. Filters through existing assets according to the dictionary passed.
filters – Filter names and values to append to search. Dict keys represent the existing filter names, and dict values are the filter values.
**kwargs – Maps custom metadata labels to any value. Keys represent the auxiliary metadata labels. Values represent the auxiliary metadata values.
-
clear_cache
(all_organizations=False, partitions=None)¶ Clears the asset manager’s Redis cache by organization.
- Parameters:
all_organizations – If True, the Redis cache for all organizations is cleared. Requires Super Admin privilege.
partitions – List of partitions (like images, sets, annotations) whose cache to clear. If not set, all are cleared.
- Returns:
ACK_OK_RESPONSE on success
-
wait_for_batch_update_task
(query_task_id, timeout=1200)¶ Waits for update by query task completion and returns the status.