AssetManagerClient

class scenebox.clients.AssetManagerClient(asset_type='sessions', asset_manager_url=None, metadata_only=False, auth_token=None, cache_disabled=False, is_non_standard_asset=False, kv_store=None, num_threads=20)

Asset manager client for fetching/putting assets and metadata.

Parameters
  • asset_type (str) – The type of asset.

  • asset_manager_url (Optional[str]) – The URL of the asset manager.

  • metadata_only (bool) – Whether to only return metadata.

  • auth_token (Optional[str]) – Authentication token of the user. Must be provided.

  • cache_disabled (bool) – Whether to disable caching. Defaults to False.

  • is_non_standard_asset (bool) – Whether the asset is a non-standard asset. If True, asset-type checks are ignored.

  • kv_store (Optional[KVStoreTemplate]) – Key-value store used to cache bytes. Defaults to None.

Methods

add_auxiliary_metadata

Add auxiliary metadata to an asset.

assets_iterator

Retrieve an iterator for Assets with a search query.

average

Average a specified field over assets that satisfy a query.

cardinality

Count the distinct values of a specified field over assets that satisfy a query.

clear_cache

Clears the asset manager’s Redis cache by organization.

copy

Copy an asset.

count

Count the number of assets satisfying a query.

delete

Delete an asset.

delete_with_list

Delete assets specified with a list of asset IDs.

delete_with_query

Delete assets specified with a search query.

download_data_in_batch

Download asset files using a search query or ids.

exists

Check if an asset exists.

exists_multiple

Check if multiple assets exist.

get_bytes

Get the data bytes of an asset.

get_bytes_from_kv_store

Get the data bytes from kv_store if available.

get_bytes_in_batch

Get the data bytes for a list of assets.

get_metadata

Fetch an asset’s metadata.

get_metadata_in_batch

Fetch the metadata of multiple assets.

get_url

Get the public URL of an asset.

get_url_in_batch

Get the public URLs of a list of assets.

put_directory

Convert a directory to a zipfile and upload.

put_file

Register and upload a file with the asset manager.

search_assets

Retrieve asset IDs with a search query.

search_assets_large

Retrieve all asset IDs matching a search query.

search_meta

Retrieve asset metadata with a search query.

search_meta_large

Retrieve all asset metadata matching a search query.

sum

Sum a specified field over assets that satisfy a query.

summary_meta

Get a metadata summary.

update_by_query

Add or remove a value on a field for assets matching a search query.

update_metadata

Update an asset’s metadata.

update_metadata_batch

Update the metadata of multiple assets.

upsert_metadata

Upsert an asset’s metadata.

wait_for_batch_update_task

Waits for update by query task completion and returns the status.

with_asset_state

Set the asset state, for use in chaining.

delete(id, wait_for_deletion=True)

Delete an asset.

Parameters
  • id – The ID of the asset to delete.

  • wait_for_deletion – If True, polls until the specified asset no longer exists. Otherwise, returns immediately (even if the asset is still in the process of being deleted).

Return type

dict

put_directory(directory_path, metadata, temp_dir=PosixPath('/tmp'))

Convert a directory to a zipfile and upload.

Return type

str

put_file(file_object, id=None, owner_organization_id=None, content_type=None, content_encoding=None, content_size=None, add_to_redis_cache=False, wait_for_completion=False, retry=False)

Register and upload a file with the asset manager.

Parameters
  • file_object (BytesIO or bytes) – The bytes object to upload.

  • id (str) – The uid of the file. If not provided, it is set automatically.

  • owner_organization_id (str) – The ID of the organization that owns the file.

  • wait_for_completion (bool) – Whether to wait for the upload to finish before returning.

  • retry – Whether to retry the call.

Return type

ObjectAccess

Returns

object_access for the uploaded file

get_url(id)

Get the public URL of an asset.

Parameters

id – The ID of the asset to get the URL of.

Returns

The URL of the specified asset.

Return type

str

get_bytes_from_kv_store(id)

Get the data bytes from kv_store if available.

Parameters

id – The ID of the asset to get the bytes of.

Return type

Optional[bytes]

get_bytes(id, add_to_diskcache=False, url=None, retry=False)

Get the data bytes of an asset.

Parameters
  • id – The ID of the asset to get the bytes of.

  • add_to_diskcache – bool. If True, adds the retrieved asset to the disk cache. Defaults to False.

  • url – Optional string. Url of the asset.

  • retry – If True, retries the call for total_tries set in retry decorator

Returns

The data bytes of the specified asset.

Return type

bytes
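The two byte-fetching calls above can be combined into a cache-first lookup. The following is a minimal sketch, assuming get_bytes_from_kv_store returns None when the store has no entry (its return type is Optional[bytes]). Whether get_bytes already consults kv_store internally is not stated here. The helper fetch_bytes and the _KVStub class are hypothetical and exist only to make the example runnable without a server.

```python
from typing import Optional


def fetch_bytes(client, asset_id: str) -> bytes:
    """Return asset bytes, preferring the key-value store when it has them."""
    cached: Optional[bytes] = client.get_bytes_from_kv_store(asset_id)
    if cached is not None:
        return cached
    # Fall back to the asset manager; retry=True asks the client to retry.
    return client.get_bytes(asset_id, retry=True)


class _KVStub:
    """Hypothetical stand-in for AssetManagerClient, for illustration only."""

    def __init__(self, kv, remote):
        self._kv = kv
        self._remote = remote

    def get_bytes_from_kv_store(self, id):
        # Mirrors the documented Optional[bytes] return type.
        return self._kv.get(id)

    def get_bytes(self, id, add_to_diskcache=False, url=None, retry=False):
        return self._remote[id]


stub = _KVStub(kv={"a": b"cached"}, remote={"a": b"remote-a", "b": b"remote-b"})
from_cache = fetch_bytes(stub, "a")   # served from the stub's kv store
from_remote = fetch_bytes(stub, "b")  # not in the kv store, fetched remotely
```

With a real client, the same helper would be called as fetch_bytes(client, asset_id).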

get_bytes_in_batch(ids)

Get the data bytes for a list of assets.

Parameters

ids – The IDs of the assets to get the data bytes of.

Returns

The fetched bytes in a dictionary. Keys are the input ids. Values are the corresponding data bytes.

Return type

dict[str, str]
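For long ID lists, one option is to split the list into chunks and merge the per-chunk dictionaries. This is a sketch only: the documentation above does not state a maximum batch size, so the default chunk_size of 100 is an arbitrary assumption, and get_bytes_chunked and _BatchStub are hypothetical names used for illustration.

```python
def get_bytes_chunked(client, ids, chunk_size=100):
    """Fetch bytes for many asset IDs, chunk_size IDs per request."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    merged = {}
    for start in range(0, len(ids), chunk_size):
        chunk = ids[start:start + chunk_size]
        # Each call returns a dict keyed by the IDs that were passed in.
        merged.update(client.get_bytes_in_batch(chunk))
    return merged


class _BatchStub:
    """Hypothetical stand-in for AssetManagerClient, for illustration only."""

    def __init__(self):
        self.batch_sizes = []

    def get_bytes_in_batch(self, ids):
        self.batch_sizes.append(len(ids))
        return {asset_id: asset_id.encode() for asset_id in ids}


stub = _BatchStub()
all_bytes = get_bytes_chunked(stub, ["a", "b", "c", "d", "e"], chunk_size=2)
```

In this run the stub receives three requests, with two, two, and one IDs.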

get_url_in_batch(ids)

Get the public URLs of a list of assets.

Parameters

ids – The IDs of the assets to get the URLs of.

Returns

The URLs of the specified assets. Keys are the input ids. Values are the corresponding URLs.

Return type

dict[str, str]

exists(id)

Check if an asset exists.

Parameters

id – The ID of the asset to check the existence of.

Returns

Returns True if the named asset exists. Otherwise, returns False.

Return type

bool

exists_multiple(ids)

Check if multiple assets exist.

Parameters

ids – List of the IDs of assets to check the existence of.

Returns

Returns True if the named assets exist. Otherwise, returns False.

Return type

bool

copy(id, new_id)

Copy an asset.

Parameters
  • id – The ID of the asset to copy.

  • new_id – The ID to give to the created asset copy.

count(search=None)

Count the number of assets satisfying a query.

Parameters

search – Query to locate the data subset of interest. Filters through existing assets according to the dictionary passed.

Returns

The number of assets that fulfil search.

Return type

int

sum(field_name, is_nested=False, search=None)

Sum a specified field over assets that satisfy a query.

Find the aggregate sum of field_name over all the assets specified by search.

Parameters
  • field_name – The name of the metadata field to sum over in the filtered assets.

  • is_nested – Whether field is nested type or not.

  • search – Query to locate the data subset of interest. Filters through existing assets according to the dictionary passed.

Returns

The aggregate sum of the specified field name over the filtered assets.

Return type

float

average(field_name, is_nested=False, search=None)

Average a specified field over assets that satisfy a query.

Find the aggregate average of field_name over all the assets specified by search.

Parameters
  • field_name – The name of the metadata field to average over in the filtered assets.

  • is_nested – Whether field is nested type or not.

  • search – Query to locate the data subset of interest. Filters through existing assets according to the dictionary passed.

Returns

The aggregate average of the specified field name over the filtered assets.

Return type

float

cardinality(field_name, string_field=True, is_nested=False, search=None)

Count the distinct values of a specified field over assets that satisfy a query.

Parameters
  • field_name – The field to count the distinct values of.

  • is_nested – Whether field is nested type or not.

  • search – Search criteria for the assets.

Returns

Count of distinct values.

Return type

float

get_metadata(id, retry=False)

Fetch an asset’s metadata.

Parameters
  • id – The ID of the asset to receive the metadata of.

  • retry – If True, retries the call for total_tries set in retry decorator

Returns

The metadata of the specified asset.

Return type

dict

get_metadata_in_batch(ids, source_selected_fields=None)

Fetch the metadata of multiple assets.

Parameters
  • ids – The IDs of the assets to receive the metadata of.

  • source_selected_fields – List of desired metadata fields. By default, everything is retrieved.

Returns

A dict mapping each asset ID to its metadata, or to None if the asset is not found.

Return type

str->dict or None
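Because IDs that are not found map to None, callers usually want to separate hits from misses. The following is a minimal sketch of that pattern; split_found_missing and _MetadataStub are hypothetical, and the "width" field is a made-up metadata key used only for illustration.

```python
def split_found_missing(client, ids, fields=None):
    """Return (found, missing): metadata by ID, and IDs with no metadata."""
    metadata_by_id = client.get_metadata_in_batch(ids, source_selected_fields=fields)
    found = {}
    missing = []
    for asset_id in ids:
        metadata = metadata_by_id.get(asset_id)
        if metadata is None:
            # Documented behaviour: assets that are not found map to None.
            missing.append(asset_id)
        else:
            found[asset_id] = metadata
    return found, missing


class _MetadataStub:
    """Hypothetical stand-in for AssetManagerClient, for illustration only."""

    def __init__(self, store):
        self._store = store

    def get_metadata_in_batch(self, ids, source_selected_fields=None):
        return {asset_id: self._store.get(asset_id) for asset_id in ids}


stub = _MetadataStub({"img_1": {"width": 640}})
found, missing = split_found_missing(stub, ["img_1", "img_2"])
```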

update_metadata(id, metadata, buffered_write=False, replace_sets=False, geo_field=None, shape_group_field=None, nested_fields=None, retry=False)

Update an asset’s metadata.

Parameters
  • id – The ID of the asset to update the metadata of.

  • metadata – The metadata to update the existing metadata with.

  • buffered_write – If True, ingests the metadata in a buffered fashion.

  • replace_sets – If True, replaces existing metadata with metadata. Otherwise, attempts to append metadata to the existing metadata.

  • geo_field – Geolocation field

  • shape_group_field – Shape group field (example: UMAP)

  • nested_fields – nested fields (example: [“annotations_meta”])

  • retry – If True, retries the call for total_tries set in retry decorator

Return type

dict

update_metadata_batch(ids, metadata, buffered_write=False, replace_sets=False, geo_field=None, shape_group_field=None, nested_fields=None)

Update the metadata of multiple assets.

Parameters
  • ids – The IDs of the assets to update the metadata of.

  • metadata – The metadata to update the existing metadata with.

  • buffered_write – If True, ingests the metadata in a buffered fashion.

  • replace_sets – If True, replaces existing metadata with metadata. Otherwise, attempts to append metadata to the existing metadata.

  • geo_field – Geolocation field

  • shape_group_field – Shape group field (example: UMAP)

  • nested_fields – nested fields (example: [“annotations_meta”])

  • retry – If True, retries the call for total_tries set in retry decorator

Return type

dict

upsert_metadata(id, metadata, geo_field=None, shape_group_field=None, nested_fields=None, buffered_write=False)

Upsert an asset’s metadata.

Inserts new fields of metadata if they do not already exist. Otherwise, updates the existing field if it does exist.

Parameters
  • id – The ID of the asset to upsert the metadata of.

  • metadata – The metadata to upsert the existing metadata with.

  • geo_field – Geolocation field

  • shape_group_field – Shape group field (example: UMAP)

  • nested_fields – nested fields (example: [“annotations_meta”])

  • buffered_write – If True, ingests metadata in a buffered fashion.

Return type

dict

update_by_query(update_field, update_value, operation, limit=None, query=None)

Add or remove a value on a field for assets matching a search query.

Parameters
  • update_field – The field to update.

  • update_value – The update value.

  • operation – The operation to apply: add or remove.

  • limit – Limits the number of documents in the query result; limit documents are selected at random. If not provided, all documents satisfying the query are added to the set.

  • query – Search query that limits the documents to update.

Returns

The update task ID(s). The status of a task can be checked with the jobs/query_task endpoint. Once the task is complete, the cache for the associated assets may need to be cleared to view the latest changes.

Return type

dict

assets_iterator(query=None, filters=None, sort_field='id', sort_order='asc')

Retrieve an iterator for Assets with a search query.

Parameters
  • query – Query to locate the data subset of interest. Filters through existing assets according to the dictionary passed.

  • filters – Filter names and values to append to query. Dict keys represent the existing filter names, and dict values are the filter values.

  • sort_field – The name of the field to sort by.

  • sort_order – Specifies the sorting order: asc or desc.

Returns

Iterator for assets

Return type

Iterator[Asset]

search_assets(query=None, filters=None, size=50, offset=0, sort_field=None, sort_order=None, scan=False)

Retrieve asset IDs with a search query.

Returns the top size matching hits. If a return of more than 10000 hits is desired, please use AssetManagerClient.search_assets_large().

Parameters
  • query – Query to locate the data subset of interest. Filters through existing assets according to the dictionary passed.

  • filters – Filter names and values to append to query. Dict keys represent the existing filter names, and dict values are the filter values.

  • size – Specifies the Elasticsearch search size. The maximum number of hits to return. Has no effect if scan is False.

  • offset – Specifies the Elasticsearch search offset. The number of hits to skip. Has no effect if scan is False.

  • sort_field – The name of the field to sort by.

  • sort_order – Specifies the Elasticsearch string sorting order.

  • scan – If True, uses the Elasticsearch scan capability. Otherwise, uses the Elasticsearch search API.

Returns

A list of the IDs of the assets fulfilling the search query.

Return type

List[str]

search_assets_large(query=None, filters=None)

Retrieve all asset IDs matching a search query.

Return all hits matching a search query. If a return of fewer than 10000 hits is desired, please use AssetManagerClient.search_assets() or SceneEngineClient.search_assets().

Parameters
  • query – Query to locate the data subset of interest. Filters through existing assets according to the dictionary passed.

  • filters – Filter names and values to append to query. Dict keys represent the existing filter names, and dict values are the filter values.

Returns

A list of the IDs of the assets fulfilling the search query.

Return type

List[str]
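The choice between search_assets and search_assets_large can be made from the hit count. The sketch below assumes count(search=...) and the two search methods accept the same query dictionary, and treats exactly 10000 hits as the small case, which the descriptions above leave open. search_all_ids and _SearchStub are hypothetical helpers for illustration.

```python
LARGE_RESULT_THRESHOLD = 10000  # hit count named in the method descriptions


def search_all_ids(client, query=None):
    """Return every matching asset ID, picking the method by hit count."""
    total = client.count(search=query)
    if total == 0:
        return []
    if total > LARGE_RESULT_THRESHOLD:
        return client.search_assets_large(query=query)
    # size defaults to 50, so request exactly as many hits as were counted.
    return client.search_assets(query=query, size=total)


class _SearchStub:
    """Hypothetical stand-in for AssetManagerClient, for illustration only."""

    def __init__(self, ids):
        self._ids = list(ids)
        self.called = []

    def count(self, search=None):
        return len(self._ids)

    def search_assets(self, query=None, filters=None, size=50, offset=0,
                      sort_field=None, sort_order=None, scan=False):
        self.called.append("search_assets")
        return self._ids[offset:offset + size]

    def search_assets_large(self, query=None, filters=None):
        self.called.append("search_assets_large")
        return list(self._ids)


small = _SearchStub(["a", "b", "c"])
small_ids = search_all_ids(small, query={})

large = _SearchStub(str(i) for i in range(10001))
large_ids = search_all_ids(large, query={})
```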

delete_with_query(query=None, filters=None, wait_for_completion=False)

Delete assets specified with a search query.

Parameters
  • query – Query to locate the data subset of interest. Filters through existing assets according to the dictionary passed.

  • filters – Filter names and values to append to query. Dict keys represent the existing filter names, and dict values are the filter values.

  • wait_for_completion – If True, polls until job is complete. Otherwise, continues execution and does not raise an error if the job does not complete.

Returns

The id of the Job that carries out the asset deletion.

Return type

str

delete_with_list(assets_list=None, wait_for_completion=False)

Delete assets specified with a list of asset IDs.

Parameters
  • assets_list – Asset IDs of the assets to delete.

  • wait_for_completion – If True, polls until job is complete. Otherwise, continues execution and does not raise an error if the job does not complete.

Returns

The id of the Job that carries out the asset deletion.

Return type

str

search_meta(query=None, size=50, filters=None, offset=0, sort_field=None, sort_order=None, scan=False, compress=False)

Retrieve asset metadata with a search query.

Returns the top size matching hits. If a return of more than 10000 hits is desired, please use AssetManagerClient.search_meta_large().

Parameters
  • query – Query to locate the data subset of interest. Filters through existing assets according to the dictionary passed.

  • filters – Filter names and values to append to query. Dict keys represent the existing filter names, and dict values are the filter values.

  • size – Specifies the Elasticsearch search size. The maximum number of hits to return. Has no effect if scan is False.

  • offset – Specifies the Elasticsearch search offset. The number of hits to skip. Has no effect if scan is False.

  • sort_field – The name of the field to sort by.

  • sort_order – Specifies the Elasticsearch string sorting order.

  • scan – If True, uses the Elasticsearch scan capability. Otherwise, uses the Elasticsearch search API.

  • compress – Boolean. If set to True, a gzip compressed list of metadata is returned. Typically used in cases where the metadata returned is over 20MB.

Returns

A list of the metadata dicts of each asset that fulfills the search query.

Return type

List[dict]

search_meta_large(query=None, filters=None, compress=False)

Retrieve all asset metadata matching a search query.

Return all hits matching a search query. If a return of fewer than 10000 hits is desired, please use AssetManagerClient.search_meta() or SceneEngineClient.search_meta().

Parameters
  • query – Query to locate the data subset of interest. Filters through existing assets according to the dictionary passed.

  • filters – Filter names and values to append to query. Dict keys represent the existing filter names, and dict values are the filter values.

  • compress – Boolean. If set to True, a gzip compressed list of metadata is returned. Typically used in cases where the metadata returned is over 20MB.

Returns

A list of the metadata dicts of each asset that fulfills the search query.

Return type

List[dict]

summary_meta(summary_request)

Get a metadata summary.

Parameters

summary_request – Dict of summary settings of the following form:

{
  "search" (dict): <Query to locate the data subset of interest>,
  "dimensions" (Optional[List[str]]): <Dimensions of summary (term aggregations)>,
  "nested_dimensions" (Optional[List[str]]): <Nested dimensions of summary (term aggregations)>,
  "custom_buckets_numeric" (Optional[List[dict]]): <Buckets for range aggregation>,
  "custom_buckets_time" (Optional[List[dict]]): <Buckets for date_range aggregation>,
  "aggregate_field" (Optional[str]): <Field for metric aggregations>,
  "is_nested" (Optional[bool]): <Whether field is nested type or not>,
  "aggregation_type" (Optional[str]): <Type of metric aggregation (e.g. avg, sum)>,
  "max_size_for_agg" (Optional[str]): <Maximum number of buckets to return (default is 100)>
}

Returns

Metadata summary according to summary_request.

Return type

dict
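A summary_request can be assembled with a small builder that leaves out unset optional keys. This is a sketch under assumptions: it presumes that optional keys may simply be omitted, the empty search dictionary stands for an unrestricted query, and the "sensor" dimension and "width" field are made-up metadata names. build_summary_request is a hypothetical helper, not part of the client.

```python
def build_summary_request(search, dimensions=None, aggregate_field=None,
                          aggregation_type=None, is_nested=None):
    """Build a summary_request dict, omitting optional keys left as None."""
    request = {"search": search}
    optional = {
        "dimensions": dimensions,
        "aggregate_field": aggregate_field,
        "aggregation_type": aggregation_type,
        "is_nested": is_nested,
    }
    request.update({key: value for key, value in optional.items() if value is not None})
    return request


# "sensor" and "width" are hypothetical metadata names used for illustration.
request = build_summary_request(
    search={},
    dimensions=["sensor"],
    aggregate_field="width",
    aggregation_type="avg",
)
# With a real client: summary = client.summary_meta(request)
```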

with_asset_state(asset_type, metadata_only=False)

Set the asset state, for use in chaining.

Example: client.with_asset_state("images", True).search_files({})

download_data_in_batch(destination_dir, ids=None, query=None, download_metadata=True)

Download asset files using a search query or ids.

Parameters
  • destination_dir – The existing local directory to download the files to.

  • ids – The id of the assets to be downloaded

  • query – Query to locate the data subset of interest. Filters through existing assets according to the dictionary passed.

  • download_metadata – If True, downloads asset metadata. Otherwise, does not download asset metadata.

Returns

Maps asset IDs obtained with query to local filepath and metadata. Highest level keys represent asset IDs. Lower level keys “filepath” and “metadata” map to local filepath and metadata for a specific ID.

Return type

dict
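The nested return value can be flattened into a simple ID-to-path mapping. The sketch below relies only on the documented "filepath" key; the stub's path layout (destination_dir joined with the asset ID) is an assumption made for illustration and may not match what the real client writes. download_and_list and _DownloadStub are hypothetical.

```python
def download_and_list(client, destination_dir, ids):
    """Download the given assets and return a mapping of ID to local filepath."""
    result = client.download_data_in_batch(
        destination_dir, ids=ids, download_metadata=True
    )
    # Top-level keys are asset IDs; each entry carries "filepath" and "metadata".
    return {asset_id: entry["filepath"] for asset_id, entry in result.items()}


class _DownloadStub:
    """Hypothetical stand-in for AssetManagerClient, for illustration only."""

    def download_data_in_batch(self, destination_dir, ids=None, query=None,
                               download_metadata=True):
        return {
            asset_id: {
                "filepath": f"{destination_dir}/{asset_id}",
                "metadata": {"id": asset_id} if download_metadata else None,
            }
            for asset_id in ids
        }


paths = download_and_list(_DownloadStub(), "/tmp/assets", ["img_1", "img_2"])
```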

add_auxiliary_metadata(id=None, ids=None, search=None, filters=None, **kwargs)

Add auxiliary metadata to an asset.

Parameters
  • id – An asset ID to add the auxiliary metadata to.

  • ids – A list of asset IDs to add the auxiliary metadata to.

  • search – Query to locate the data subset of interest. Filters through existing assets according to the dictionary passed.

  • filters – Filter names and values to append to search. Dict keys represent the existing filter names, and dict values are the filter values.

  • **kwargs – Maps custom metadata labels to any value. Keys represent the auxiliary metadata labels. Values represent the auxiliary metadata values.

clear_cache(all_organizations=False, partitions=None)

Clears the asset manager’s Redis cache by organization.

Parameters
  • all_organizations – If True, the Redis cache for all organizations is cleared. Requires Super Admin privilege.

  • partitions – List of partitions (like images, sets, annotations) whose cache to clear. If not set, all partitions are cleared.

Returns

ACK_OK_RESPONSE on success.

wait_for_batch_update_task(query_task_id, timeout=1200)

Waits for update by query task completion and returns the status.
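The update_by_query description notes that the cache may need clearing once the task completes. One way to sequence that is sketched below, assuming the task ID has already been extracted from the update_by_query result (the key it is stored under is not documented here). The form of the returned status is also not documented; the stub's "finished" value is a placeholder. wait_and_refresh and _TaskStub are hypothetical.

```python
def wait_and_refresh(client, query_task_id, partitions=None, timeout=1200):
    """Wait for an update-by-query task, then clear the relevant cache."""
    status = client.wait_for_batch_update_task(query_task_id, timeout=timeout)
    # Clear cached entries so later reads reflect the updated metadata.
    client.clear_cache(partitions=partitions)
    return status


class _TaskStub:
    """Hypothetical stand-in for AssetManagerClient, for illustration only."""

    def __init__(self):
        self.calls = []

    def wait_for_batch_update_task(self, query_task_id, timeout=1200):
        self.calls.append(("wait", query_task_id, timeout))
        return "finished"  # placeholder status value

    def clear_cache(self, all_organizations=False, partitions=None):
        self.calls.append(("clear", partitions))
        return "ACK_OK_RESPONSE"


stub = _TaskStub()
status = wait_and_refresh(stub, "task-123", partitions=["images"])
```

Passing partitions limits the cache clear to the asset types the update touched; leaving it as None clears all partitions, as described for clear_cache.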