AssetManagerClient

class scenebox.clients.AssetManagerClient(asset_type='sessions', asset_manager_url=None, metadata_only=False, auth_token=None, cache_disabled=False, is_non_standard_asset=False, kv_store=None, num_threads=20)

Asset manager client for fetching/putting assets and metadata.

Parameters:
  • asset_type (str) – The type of asset.

  • asset_manager_url (Optional[str]) – The URL of the asset manager.

  • metadata_only (bool) – Whether to only return metadata.

  • auth_token (Optional[str]) – Authentication token of the user; must be provided.

  • cache_disabled (bool) – Disable caching; defaults to False.

  • is_non_standard_asset (bool) – Whether the asset is a non-standard asset. If True, ignore asset-type checks.

  • kv_store (Optional[KVStoreTemplate]) – key-value store to cache bytes, default to None.
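A minimal construction sketch, assuming the class is importable as scenebox.clients.AssetManagerClient as the signature above indicates. The helper name build_client and the choice of "images" as asset type are illustrative assumptions; the client class is passed in as an argument only so the snippet runs without the package installed.

```python
def build_client(client_cls, auth_token, asset_type="images"):
    """Instantiate an asset manager client with explicit keyword arguments.

    client_cls is expected to be scenebox.clients.AssetManagerClient;
    it is a parameter here only so the sketch runs without the package.
    """
    return client_cls(
        asset_type=asset_type,   # the type of asset the client operates on
        auth_token=auth_token,   # documented as required
        metadata_only=False,     # documented default
        cache_disabled=False,    # documented default: caching stays enabled
    )
```

With the package installed, the call would presumably look like build_client(AssetManagerClient, auth_token=my_token).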

Methods

add_auxiliary_metadata

Add auxiliary metadata to an asset.

assets_iterator

Retrieve an iterator for Assets with a search query.

average

Average a specified field over assets that satisfy a query.

cardinality

Count the distinct values of a specified field over assets that satisfy a query.

clear_cache

Clear the asset manager’s Redis cache by organization.

copy

Copy an asset.

count

Count the number of assets satisfying a query.

delete

Delete an asset.

delete_with_list

Delete assets specified with a list of asset IDs.

delete_with_query

Delete assets specified with a search query.

download_data_in_batch

Download asset files using a search query or ids.

exists

Check if an asset exists.

exists_multiple

Check if assets exist.

get_bytes

Get the data bytes of an asset.

get_bytes_from_kv_store

Get the data bytes from kv_store if available.

get_bytes_in_batch

Get the data bytes for a list of assets.

get_metadata

Fetch an asset’s metadata.

get_metadata_in_batch

Fetch the metadata of a list of assets.

get_url

Get the public URL of an asset.

get_url_in_batch

Get the public URLs of a list of assets.

put_directory

Convert a directory to a zipfile and upload.

put_file

Register and upload a file with the asset manager.

remove_key_from_index

Remove a key from index and its mapping (Admin only).

search_assets

Retrieve asset IDs with a search query.

search_assets_large

Retrieve all asset IDs matching a search query.

search_meta

Retrieve asset metadata with a search query.

search_meta_large

Retrieve all asset metadata matching a search query.

sum

Sum a specified field over assets that satisfy a query.

summary_meta

Get a metadata summary.

update_aux_metadata

Update an asset’s auxiliary metadata.

update_by_query

Update a field on the assets matching a query.

update_metadata

Update an asset’s metadata.

update_metadata_batch

Update the metadata of a list of assets.

upsert_metadata

Upsert an asset’s metadata.

wait_for_batch_update_task

Waits for update by query task completion and returns the status.

with_asset_state

Set the asset state, for use in chaining.

delete(id, wait_for_deletion=True, raise_for_failure=False)

Delete an asset.

Parameters:
  • id – The ID of the asset to delete.

  • wait_for_deletion – If True, polls until the specified asset no longer exists. Otherwise, returns immediately (even if the asset is still in the process of being deleted).

  • raise_for_failure – If True, raises an error when the deletion fails.

Return type:

dict
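The wait_for_deletion behaviour described above amounts to polling until the asset is gone. A caller-side sketch of the same idea with an explicit timeout follows; delete_and_poll, timeout, and interval are hypothetical names, and this is not the library's own implementation.

```python
import time


def delete_and_poll(client, asset_id, timeout=30.0, interval=0.5):
    """Delete without blocking, then poll exists() until the asset is gone.

    Returns True if the asset disappeared within `timeout` seconds.
    """
    client.delete(asset_id, wait_for_deletion=False)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if not client.exists(asset_id):
            return True
        time.sleep(interval)
    # One last check after the deadline has passed.
    return not client.exists(asset_id)
```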

put_directory(directory_path, metadata, temp_dir=PosixPath('/tmp'))

Convert a directory to a zipfile and upload.

Return type:

str

put_file(file_object, id=None, folder=None, owner_organization_id=None, content_type=None, content_encoding=None, content_size=None, add_to_redis_cache=False, retry=False)

Register and upload a file with the asset manager.

Parameters:
  • file_object (BytesIO or bytes) – The bytes object to upload.

  • id (str) – The uid of the file. If not provided, it is set automatically.

  • owner_organization_id (str) – The ID of the organization that owns the file.

  • wait_for_completion (bool) – Whether to wait for the upload to finish before returning.

  • retry – Whether to retry the upload.

Return type:

ObjectAccess

Returns:

object_access for the uploaded file

get_url(id, expiration=43200)

Get the public URL of an asset.

Parameters:
  • id – The ID of the asset to get the URL of.

  • expiration – url expiration time in seconds (default 12 hours)

Returns:

The URL of the specified asset.

Return type:

str
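Because expiration is given in seconds, it can be convenient to derive it from a readable duration. A small sketch using only the standard library; expiration_seconds is a hypothetical helper.

```python
from datetime import timedelta


def expiration_seconds(**duration):
    """Convert a duration given as timedelta keywords into whole seconds."""
    return int(timedelta(**duration).total_seconds())


DEFAULT_EXPIRATION = expiration_seconds(hours=12)  # 43200, the documented default
ONE_HOUR = expiration_seconds(hours=1)             # 3600
```

A shorter-lived link would then be requested with get_url(id, expiration=ONE_HOUR).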

get_bytes_from_kv_store(id)

Get the data bytes from kv_store if available.

Parameters:

id – The ID of the asset to get the bytes of.

Return type:

Optional[bytes]

get_bytes(id, add_to_diskcache=False, add_to_redis_cache=True, url=None, retry=False)

Get the data bytes of an asset.

Parameters:
  • id – The ID of the asset to get the bytes of.

  • add_to_diskcache – bool. Default is False. Flag to add retrieved asset to diskcache.

  • add_to_redis_cache – bool. Default is True. Flag to add retrieved asset to redis cache.

  • url – Optional string. Url of the asset.

  • retry – If True, retries the call for total_tries set in retry decorator

Returns:

The data bytes of the specified asset.

Return type:

bytes
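A sketch of a cache-first read that combines get_bytes_from_kv_store and get_bytes. Whether get_bytes already consults the kv_store internally is not stated here; this hypothetical helper, fetch_bytes, only makes the lookup order explicit.

```python
def fetch_bytes(client, asset_id):
    """Return asset bytes, preferring the key-value store when it has them."""
    cached = client.get_bytes_from_kv_store(asset_id)  # Optional[bytes]
    if cached is not None:
        return cached
    # Fall back to the asset manager; retry=True turns on the documented retry.
    return client.get_bytes(asset_id, retry=True)
```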

get_bytes_in_batch(ids, add_to_redis_cache=True, add_to_diskcache=False)

Get the data bytes for a list of assets.

Parameters:
  • ids – The IDs of the assets to get the data bytes of.

  • add_to_redis_cache – If True, adds the retrieved assets to the Redis cache.

  • add_to_diskcache – If True, adds the retrieved assets to the disk cache.

Returns:

The fetched bytes in a dictionary. Keys are the input ids. Values are the corresponding data bytes.

Return type:

dict[str, bytes]

get_url_in_batch(ids)

Get the public URLs of a list of assets.

Parameters:

ids – The IDs of the assets to get the URLs of.

Returns:

The URLs of the specified assets. Keys are the input IDs. Values are the corresponding URLs.

Return type:

dict[str, str]

exists(id)

Check if an asset exists.

Parameters:

id – The ID of the asset to check the existence of.

Returns:

Returns True if the named asset exists. Otherwise, returns False.

Return type:

bool

exists_multiple(ids)

Check if assets exist.

Parameters:

ids – List of the IDs of assets to check the existence of.

Returns:

Returns True if the named assets exist. Otherwise, returns False.

Return type:

bool

copy(id, new_id)

Copy an asset.

Parameters:
  • id – The ID of the asset to copy.

  • new_id – The ID to give to the created asset copy.

count(search=None)

Count the number of assets satisfying a query.

Parameters:

search – Query to locate the data subset of interest. Filters through existing assets according to the dictionary passed.

Returns:

The number of assets that fulfil search.

Return type:

int

sum(field_name, is_nested=False, search=None)

Sum a specified field over assets that satisfy a query.

Find the aggregate sum of field_name over all the assets specified by search.

Parameters:
  • field_name – The name of the metadata field to sum over the filtered assets.

  • is_nested – Whether field is nested type or not.

  • search – Query to locate the data subset of interest. Filters through existing assets according to the dictionary passed.

Returns:

The aggregate sum of the specified field name over the filtered assets.

Return type:

float

average(field_name, is_nested=False, search=None)

Average a specified field over assets that satisfy a query.

Find the aggregate average of field_name over all the assets specified by search.

Parameters:
  • field_name – The name of the metadata field to average over the filtered assets.

  • is_nested – Whether field is nested type or not.

  • search – Query to locate the data subset of interest. Filters through existing assets according to the dictionary passed.

Returns:

The aggregate average of the specified field name over the filtered assets.

Return type:

float
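For assets that all carry the field, the three aggregations are related: the average equals the sum divided by the count. The sketch below illustrates this with an in-memory stand-in, the hypothetical InMemoryAggregates, that mirrors the three method signatures. Its exact-match treatment of search is a toy assumption and does not reproduce the real query format.

```python
class InMemoryAggregates:
    """Hypothetical stand-in mirroring count/sum/average on metadata dicts."""

    def __init__(self, metadata_list):
        self._metadata = metadata_list

    def _matching(self, search):
        # Toy matcher: `search` is treated as exact key/value pairs.
        search = search or {}
        return [m for m in self._metadata
                if all(m.get(k) == v for k, v in search.items())]

    def count(self, search=None):
        return len(self._matching(search))

    def sum(self, field_name, is_nested=False, search=None):
        # The bare name `sum` below resolves to the builtin, not this method.
        return float(sum(m[field_name] for m in self._matching(search)))

    def average(self, field_name, is_nested=False, search=None):
        # Raises ZeroDivisionError when nothing matches.
        return self.sum(field_name, search=search) / self.count(search)
```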

cardinality(field_name, string_field=True, is_nested=False, search=None)

Count the distinct values of a specified field over assets that satisfy a query.

Parameters:
  • field_name – The field to count the distinct values of.

  • is_nested – Whether field is nested type or not.

  • search – Search criteria for the assets.

Returns:

The count of distinct values.

Return type:

float

get_metadata(id, retry=False)

Fetch an asset’s metadata.

Parameters:
  • id – The ID of the asset to receive the metadata of.

  • retry – If True, retries the call for total_tries set in retry decorator

Returns:

The metadata of the specified asset.

Return type:

dict

get_metadata_in_batch(ids, source_selected_fields=None)

Fetch the metadata of a list of assets.

Parameters:
  • ids – The IDs of the assets to receive the metadata of.

  • source_selected_fields – List of desired metadata fields. By default, everything is retrieved.

Returns:

A dict mapping each asset ID to its metadata, or to None if the asset is not found.

Return type:

str->dict or None
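Since missing assets come back as None, callers usually need to separate the two cases before using the result. A small sketch; split_found_and_missing is a hypothetical helper that works on the returned dict only.

```python
def split_found_and_missing(metadata_by_id):
    """Separate a batch result into found metadata and missing asset IDs."""
    found = {k: v for k, v in metadata_by_id.items() if v is not None}
    missing = [k for k, v in metadata_by_id.items() if v is None]
    return found, missing
```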

update_metadata(id, metadata, buffered_write=False, replace_sets=False, geo_field=None, shape_group_field=None, nested_fields=None, retry=False)

Update an asset’s metadata.

Parameters:
  • id – The ID of the asset to update the metadata of.

  • metadata – The metadata to update the existing metadata with.

  • buffered_write – If True, ingests the metadata in a buffered fashion.

  • replace_sets – If True, replaces existing metadata with metadata. Otherwise, attempts to append metadata to the existing metadata.

  • geo_field – Geolocation field

  • shape_group_field – Shape group field (example: UMAP)

  • nested_fields – nested fields (example: [“annotations_meta”])

  • retry – If True, retries the call for total_tries set in retry decorator

Return type:

dict

update_aux_metadata(id, metadata, geo_field=None, shape_group_field=None, nested_fields=None)

Update an asset’s auxiliary metadata.

Parameters:
  • id – The ID of the asset to update the metadata of.

  • metadata – The metadata to update the existing metadata with.

  • geo_field – Geolocation field

  • shape_group_field – Shape group field (example: UMAP)

  • nested_fields – nested fields (example: [“annotations_meta”])

Return type:

dict

update_metadata_batch(ids, metadata, buffered_write=False, replace_sets=False, geo_field=None, shape_group_field=None, nested_fields=None)

Update the metadata of a list of assets.

Parameters:
  • ids – The IDs of the assets to update the metadata of.

  • metadata – The metadata to update the existing metadata with.

  • buffered_write – If True, ingests the metadata in a buffered fashion.

  • replace_sets – If True, replaces existing metadata with metadata. Otherwise, attempts to append metadata to the existing metadata.

  • geo_field – Geolocation field

  • shape_group_field – Shape group field (example: UMAP)

  • nested_fields – nested fields (example: [“annotations_meta”])

  • retry – If True, retries the call for total_tries set in retry decorator

Returns:

Update status for each of the chunks the ids and metadata were broken into.

Return type:

dict

upsert_metadata(id, metadata, geo_field=None, shape_group_field=None, nested_fields=None, buffered_write=False)

Upsert an asset’s metadata.

Inserts new fields of metadata if they do not already exist. Otherwise, updates the existing field if it does exist.

Parameters:
  • id – The ID of the asset to upsert the metadata of.

  • metadata – The metadata to upsert the existing metadata with.

  • geo_field – Geolocation field

  • shape_group_field – Shape group field (example: UMAP)

  • nested_fields – nested fields (example: [“annotations_meta”])

  • buffered_write – If True, ingests metadata in a buffered fashion.

Return type:

dict
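The insert-or-update rule described above behaves like a dictionary update on top-level fields. The sketch below is a local model of that rule only; upsert_fields is hypothetical, and how the service merges nested or set-valued fields is not specified here.

```python
def upsert_fields(existing, new_fields):
    """Local model of an upsert: add missing fields, overwrite existing ones."""
    merged = dict(existing)    # copy so the input is left untouched
    merged.update(new_fields)  # new keys are inserted, existing keys updated
    return merged
```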

update_by_query(update_field, update_value, operation, limit=None, query=None)

Update a field on the assets matching a query.

Parameters:
  • update_field – The field to update.

  • update_value – The update value.

  • operation – The operation to apply: add or remove.

  • limit – Limits the number of docs in the query result; limit docs are selected at random. If not provided, all documents satisfying the query will be added to the set.

  • query – Search query to limit the docs to update.

Returns:

The update task ID(s). You can check the status of a task using the jobs/query_task endpoint. Once the task is complete, you may need to clear the cache for the associated assets to view the latest changes.

Return type:

list

assets_iterator(query=None, filters=None, sort_field='id', sort_order='asc')

Retrieve an iterator for Assets with a search query.

Parameters:
  • query – Query to locate the data subset of interest. Filters through existing assets according to the dictionary passed.

  • filters – Filter names and values to append to query. Dict keys represent the existing filter names, and dict values are the filter values.

  • sort_field – The name of the field to sort the results by.

  • sort_order – Specifies the sorting order: asc or desc.

Returns:

Iterator for assets

Return type:

Iterator[Asset]
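An iterator lets a caller stop early instead of materializing every match. A sketch of taking only the first few items with the standard library; first_n is a hypothetical helper.

```python
from itertools import islice


def first_n(iterator, n):
    """Materialize at most n items from a (possibly very long) iterator."""
    return list(islice(iterator, n))
```

For example, first_n(client.assets_iterator(query=my_query), 100) would consume at most 100 assets, where client and my_query are placeholders.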

search_assets(query=None, filters=None, size=50, offset=0, sort_field=None, sort_order=None, search_after=None, scan=False)

Retrieve asset IDs with a search query.

Returns the top size matching hits. If a return of more than 10000 hits is desired, please use AssetManagerClient.search_assets_large().

Parameters:
  • query – Query to locate the data subset of interest. Filters through existing assets according to the dictionary passed.

  • filters – Filter names and values to append to query. Dict keys represent the existing filter names, and dict values are the filter values.

  • size – Specifies the Elasticsearch search size. The maximum number of hits to return. Has no effect if scan is True.

  • offset – Specifies the Elasticsearch search offset. The number of hits to skip. Has no effect if scan is True.

  • sort_field – The name of the field to sort the results by.

  • sort_order – Specifies the Elasticsearch string sorting order.

  • search_after – Used for retrieving all assets. See search_assets_large

  • scan – If True, uses the Elasticsearch scan capability. Otherwise, uses the Elasticsearch search API.

Returns:

A list of the IDs of the assets fulfilling the search query.

Return type:

List[str]

search_assets_large(query=None, filters=None)

Retrieve all asset IDs matching a search query.

Return all hits matching a search query. If a return of fewer than 10000 hits is desired, please use AssetManagerClient.search_assets() or SceneEngineClient.search_assets().

Parameters:
  • query – Query to locate the data subset of interest. Filters through existing assets according to the dictionary passed.

  • filters – Filter names and values to append to query. Dict keys represent the existing filter names, and dict values are the filter values.

Returns:

A list of the IDs of the assets fulfilling the search query.

Return type:

List[str]
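The 10000-hit guidance can be turned into a small dispatcher that picks the call based on count(). This is a sketch; search_all_ids is hypothetical, and it assumes the search argument of count() accepts the same query dict as the search methods. Filters are not passed to count(), so the total is an upper bound when filters are used.

```python
MAX_SEARCH_HITS = 10000  # threshold quoted in the documentation above


def search_all_ids(client, query=None, filters=None):
    """Return every matching asset ID, choosing the call by result size."""
    total = client.count(search=query)
    if total == 0:
        return []
    if total > MAX_SEARCH_HITS:
        return client.search_assets_large(query=query, filters=filters)
    # Small result: one regular search sized to hold every hit.
    return client.search_assets(query=query, filters=filters, size=total)
```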

remove_key_from_index(key, wait_for_completion=False)

Remove a key from index and its mapping (Admin only).

Parameters:
  • key – Key to be removed from index and its mapping

  • wait_for_completion – If True, polls until job is complete. Otherwise, continues execution and does not raise an error if the job does not complete.

Returns:

The id of the Job that carries out the key removal.

Return type:

str

delete_with_query(query=None, filters=None, wait_for_completion=False)

Delete assets specified with a search query.

Parameters:
  • query – Query to locate the data subset of interest. Filters through existing assets according to the dictionary passed.

  • filters – Filter names and values to append to query. Dict keys represent the existing filter names, and dict values are the filter values.

  • wait_for_completion – If True, polls until job is complete. Otherwise, continues execution and does not raise an error if the job does not complete.

Returns:

The id of the Job that carries out the asset deletion.

Return type:

str

delete_with_list(assets_list=None, wait_for_completion=False)

Delete assets specified with a list of asset IDs.

Parameters:
  • assets_list – Asset IDs of the assets to delete.

  • wait_for_completion – If True, polls until job is complete. Otherwise, continues execution and does not raise an error if the job does not complete.

Returns:

The id of the Job that carries out the asset deletion.

Return type:

str

search_meta(query=None, size=50, filters=None, offset=0, sort_field=None, sort_order=None, scan=False, compress=False)

Retrieve asset metadata with a search query.

Returns the top size matching hits. If a return of more than 10000 hits is desired, please use AssetManagerClient.search_meta_large().

Parameters:
  • query – Query to locate the data subset of interest. Filters through existing assets according to the dictionary passed.

  • filters – Filter names and values to append to query. Dict keys represent the existing filter names, and dict values are the filter values.

  • size – Specifies the Elasticsearch search size. The maximum number of hits to return. Has no effect if scan is True.

  • offset – Specifies the Elasticsearch search offset. The number of hits to skip. Has no effect if scan is True.

  • sort_field – The name of the field to sort the results by.

  • sort_order – Specifies the Elasticsearch string sorting order.

  • scan – If True, uses the Elasticsearch scan capability. Otherwise, uses the Elasticsearch search API.

  • compress – Boolean. If set to True, a gzip compressed list of metadata is returned. Typically used in cases where the metadata returned is over 20MB.

Returns:

A list of the metadata dicts of each asset that fulfills the search query.

Return type:

List[dict]

search_meta_large(query=None, filters=None, compress=False)

Retrieve all asset metadata matching a search query.

Return all hits matching a search query. If a return of fewer than 10000 hits is desired, please use AssetManagerClient.search_meta() or SceneEngineClient.search_meta().

Parameters:
  • query – Query to locate the data subset of interest. Filters through existing assets according to the dictionary passed.

  • filters – Filter names and values to append to query. Dict keys represent the existing filter names, and dict values are the filter values.

  • compress – Boolean. If set to True, a gzip compressed list of metadata is returned. Typically used in cases where the metadata returned is over 20MB.

Returns:

A list of the metadata dicts of each asset that fulfills the search query.

Return type:

List[dict]

summary_meta(summary_request)

Get a metadata summary.

Parameters:

summary_request – Dict of summary settings of the following form:

{
  "search" (dict): <Query to locate the data subset of interest>,
  "dimensions" (Optional[List[str]]): <Dimensions of summary (term aggregations)>,
  "nested_dimensions" (Optional[List[str]]): <Nested dimensions of summary (term aggregations)>,
  "custom_buckets_numeric" (Optional[List[dict]]): <Buckets for range aggregation>,
  "custom_buckets_time" (Optional[List[dict]]): <Buckets for date_range aggregation>,
  "aggregate_field" (Optional[str]): <Field for metric aggregations>,
  "is_nested" (Optional[bool]): <Whether field is nested type or not>,
  "aggregation_type" (Optional[str]): <Type of metric aggregation (e.g. avg, sum)>,
  "max_size_for_agg" (Optional[str]): <Maximum number of buckets to return (default is 100)>
}

Returns:

Metadata summary according to summary_request.

Return type:

dict
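A sketch of assembling a summary_request from the keys listed above, leaving out optional settings that were not supplied. build_summary_request is a hypothetical helper and covers only a subset of the documented keys.

```python
def build_summary_request(search, dimensions=None, aggregate_field=None,
                          aggregation_type=None, max_size_for_agg=None):
    """Assemble a summary_request dict, omitting settings that were not given."""
    request = {"search": search}
    optional = {
        "dimensions": dimensions,
        "aggregate_field": aggregate_field,
        "aggregation_type": aggregation_type,
        "max_size_for_agg": max_size_for_agg,
    }
    request.update({k: v for k, v in optional.items() if v is not None})
    return request
```

The resulting dict would be passed as summary_meta(summary_request).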

with_asset_state(asset_type, metadata_only=False)

Set the asset state, for use in chaining.

E.g. client.with_asset_state("images", True).search_assets({})

download_data_in_batch(destination_dir, ids=None, query=None, download_metadata=True)

Download asset files using a search query or ids.

Parameters:
  • destination_dir – The existing local directory to download the files to.

  • ids – The IDs of the assets to be downloaded.

  • query – Query to locate the data subset of interest. Filters through existing assets according to the dictionary passed.

  • download_metadata – If True, downloads asset metadata. Otherwise, does not download asset metadata.

Returns:

Maps asset IDs obtained with query to local filepath and metadata. Highest level keys represent asset IDs. Lower level keys “filepath” and “metadata” map to local filepath and metadata for a specific ID.

Return type:

dict
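The nested return value can be flattened for logging or further processing. A sketch that relies only on the "filepath" and "metadata" keys described above; list_downloads is a hypothetical helper.

```python
def list_downloads(download_result):
    """Flatten the nested result into (asset_id, filepath, has_metadata) tuples."""
    rows = []
    for asset_id, entry in download_result.items():
        has_metadata = entry.get("metadata") is not None
        rows.append((asset_id, entry.get("filepath"), has_metadata))
    return rows
```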

add_auxiliary_metadata(id=None, ids=None, search=None, filters=None, **kwargs)

Add auxiliary metadata to an asset.

Parameters:
  • id – The ID of a single asset to add the auxiliary metadata to.

  • ids – A list of asset IDs to add the auxiliary metadata to.

  • search – Query to locate the data subset of interest. Filters through existing assets according to the dictionary passed.

  • filters – Filter names and values to append to search. Dict keys represent the existing filter names, and dict values are the filter values.

  • **kwargs – Maps custom metadata labels to any value. Keys represent the auxiliary metadata labels. Values represent the auxiliary metadata values.

clear_cache(all_organizations=False, partitions=None)

Clear the asset manager’s Redis cache by organization.

Parameters:
  • all_organizations – If True, the Redis cache for all organizations is cleared. Requires Super Admin privilege.

  • partitions – List of partitions (like images, sets, annotations) whose cache should be cleared. If not set, all are cleared.

Returns:

ACK_OK_RESPONSE on success

wait_for_batch_update_task(query_task_id, timeout=1200)

Waits for update by query task completion and returns the status.
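Since update_by_query returns a list of task IDs, a caller can wait on each of them in turn. A sketch under that assumption; update_and_wait is a hypothetical helper, and the status values are passed through as returned, without interpretation.

```python
def update_and_wait(client, update_field, update_value, operation,
                    query=None, timeout=1200):
    """Start an update-by-query and wait for every returned task to finish."""
    task_ids = client.update_by_query(
        update_field, update_value, operation, query=query
    )
    statuses = {}
    for task_id in task_ids:
        # Blocks until the task completes or the timeout (seconds) is reached.
        statuses[task_id] = client.wait_for_batch_update_task(
            task_id, timeout=timeout
        )
    return statuses
```

After the tasks finish, clearing the cache with clear_cache() may be needed before the changes become visible.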