SceneEngineClient

class scenebox.clients.SceneEngineClient(auth_token=None, scene_engine_url=None, asset_manager_url=None, environment=None, artifacts_path='.', kv_store=None, num_threads=20)

Acts as the root client for all user-tasks on SceneBox.

Create a SceneEngineClient object. Used to interact with SceneBox’s high-level REST APIs.

Parameters:
  • auth_token – Personal token associated with a valid SceneBox account. Find this token under user profile on the SceneBox web app.

  • scene_engine_url – URL for the SceneEngine REST server.

  • asset_manager_url – URL for the AssetManager associated with the SceneEngine server.

  • environment – The environment that the client is going to use (dev or prod). If provided, it overrides the default server URLs.

  • artifacts_path – SceneBox’s output directory.

  • kv_store – Key-value store used by the asset-manager-client to cache bytes.

  • num_threads – Maximum number of threads to run internal processes in. Default specified in the constants file.
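As a minimal usage sketch, the client is typically constructed once and reused. The helper below is hypothetical (not part of the SDK), and the token and artifacts path are placeholders; the import lives inside the function so the sketch stays importable even when the scenebox package is not installed.

```python
def make_client(auth_token, artifacts_path="./scenebox_artifacts"):
    """Hypothetical convenience wrapper around SceneEngineClient construction."""
    from scenebox.clients import SceneEngineClient

    # Server URLs are omitted here, so the client falls back to its defaults;
    # pass scene_engine_url / asset_manager_url explicitly to override them.
    return SceneEngineClient(
        auth_token=auth_token,
        artifacts_path=artifacts_path,
    )
```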

Methods

add_annotation

Add a single annotation.

add_annotations

Add several annotations at once.

add_assets_to_set

Add assets to a set.

add_auxiliary_session_data

Add auxiliary session data to a session.

add_coco_annotations_from_folder

Add several images and/or COCO annotations.

add_comment_to_assets

Add a comment to a list of assets.

add_comments

Add a comment to a time segment of a session.

add_df

Add a measurement DataFrame to a session as scalar entities.

add_ego_pose

Add an ego pose file to a session. An ego pose file has a series of frames in which a list of timestamped objects with their ego position in "lat" and "lon" is given. Example:

    [
        {
            "frame": 0,
            "timestamp": "2021-08-24T21:15:15.574000+00:00",
            "objects": {
                "0": {"lat": 33.6, "lon": -0.15},
                "1": {"lat": 5.6, "lon": -3.45},
                "2": {"lat": 23.25, "lon": -3.85},
                "3": {"lat": 83.0, "lon": -5.9},
                "4": {"lat": 198.55, "lon": 12.15}
            }
        },
        {
            "frame": 1,
            "timestamp": "2021-08-24T21:15:15.599000+00:00",
            "objects": {
                "5": {"lat": 33.55, "lon": -0.15},
                "6": {"lat": 5.7, "lon": -3.45},
                "7": {"lat": 23.25, "lon": -3.85},
                "8": {"lat": 82.45, "lon": -5.9},
                "9": {"lat": 197.8, "lon": 12.15}
            }
        }
    ]
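Frames in this format can be built programmatically before writing the ego pose file. A minimal sketch (the helper name, timestamps, and coordinates are illustrative, not part of the SDK):

```python
from datetime import datetime, timezone

def make_ego_pose_frame(frame_index, timestamp, objects):
    # objects: mapping of object id -> (lat, lon), matching the file format above.
    return {
        "frame": frame_index,
        "timestamp": timestamp.isoformat(),
        "objects": {
            str(obj_id): {"lat": lat, "lon": lon}
            for obj_id, (lat, lon) in objects.items()
        },
    }

frame = make_ego_pose_frame(
    0,
    datetime(2021, 8, 24, 21, 15, 15, 574000, tzinfo=timezone.utc),
    {0: (33.6, -0.15), 1: (5.6, -3.45)},
)
```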

add_embeddings

Add existing embeddings to an asset.

add_enrichments_configs

Add enrichment configurations to session metadata.

add_entities

Add a list of entities to the specified sessions.

add_entity

Add an entity to the specified session.

add_event_interval

Add an event interval to a session.

add_event_intervals

Add a list of several event intervals to a session.

add_geolocation_entities

Add geolocation entities to a session.

add_geolocation_entity

Add a geolocation entity to a session.

add_image

Upload an image onto SceneBox.

add_images

Upload multiple images onto SceneBox.

add_images_from_folder

Upload images from a single local folder path onto SceneBox.

add_lidar

Upload a LIDAR onto SceneBox.

add_lidars

Upload multiple LIDARs onto SceneBox.

add_organization

Add a new SceneBox organization.

add_point_events

Add a list of several point events to a session.

add_rosbag2

Upload Rosbag2 onto SceneBox.

add_scalar_intervals

Add a timeseries with interval values to a session.

add_scalar_measurements

Add a timeseries with point values (“scalars”) to a session.

add_session

Add a session.

add_sets_to_project

Add sets to a project.

add_source_data

Add source data to session metadata.

add_task_to_campaign

Add new tasks to a campaign.

add_timeseries_csv

Add a timeseries from a CSV file.

add_user

Add a new SceneBox user account to your organization.

add_video

Upload a video onto SceneBox.

add_videos

Upload multiple videos onto SceneBox.

annotations_to_objects

Extracts objects from annotations.

assets_in_set

List assets within a set.

cleanup_assets_annotation_meta

Clean up annotation lists by removing repeated and non-existent entries in an asset’s metadata. Takes asset_ids (a list of asset IDs to clean up) and asset_type (the type of asset; to ensure a valid asset_type, import AssetsConstants from scenebox.constants).

clear_cache

Clear the scene engine’s Redis cache by organization. If all_organizations is True, the Redis cache for all organizations is cleared (requires admin privileges). partitions is a list of partitions (like images, sets, annotations) whose cache should be cleared; if not set, all are cleared.

commit_set

Commit the contents of a set to a new version.

compare_sets

Compare sets based on a given metric_type in an embedding space with given index_name.

compress_images

Compress a list of images.

compress_lidars

Compress a list of LIDARs.

compress_videos

Compress a list of videos.

concatenate_videos

Concatenate a list of videos into a single video.

count

Retrieve summary of asset metadata with a search query.

create_annotation_instruction

Create an annotation instruction.

create_campaign

Create a campaign. A campaign is a collection of tasks to run on the raw data or metadata on SceneBox.

create_config

Create a config.

create_operation

Create an operation inside a project.

create_project

Create a project.

create_set

Create a named dataset.

create_umap

Applies UMAP onto existing asset indices.

delete_annotation_instruction

Delete an existing annotation instruction.

delete_annotations

Delete annotations using a list of IDs or a search query, and update the metadata of corresponding assets.

delete_assets_annotations

Delete annotations using a list of asset IDs, and update the metadata of corresponding assets.

delete_campaign

Delete a campaign. Deleting a campaign will delete all tasks and, by extension, all operation runs contained in it.

delete_campaign_operation_config

Delete a config.

delete_comment_from_assets

Delete a comment from a list of assets.

delete_config

Delete a config.

delete_embeddings

Delete embeddings using an index name, and update the metadata of corresponding assets.

delete_events

Delete session events.

delete_operation

Delete an operation.

delete_project

Delete a project.

delete_session

Delete an existing session.

delete_set

Delete a set.

delete_task

Delete a task. Deleting a task will delete the operation instances associated with it.

disable_webapp_features

Disable a list of features on SceneBox WebApp.

display_dashboard

Display a given dashboard on the SceneBox web app.

display_image

Display a given image on the SceneBox web app.

display_lidar

Display a given Lidar on the SceneBox web app.

display_object

Display a given object on the SceneBox web app.

display_projects

Display a given project on the SceneBox web app.

display_session

Display a given session on the SceneBox web app.

display_set

Display a given set on the SceneBox web app.

display_video

Display a given video on the SceneBox web app.

download_annotations

Download annotations, including mask files, to a destination folder.

download_timeseries

Download the timeseries using a search query.

download_zipfile

Download a zipfile into either a bytesIO object or a file.

enable_webapp_features

Enable a list of features on SceneBox WebApp.

entity_summary

Get the entity summary for a session.

extend_time_span

Extend the start time and end time of a session based on min/max of a list of timestamps.

extract_frame_thumbnails

Extract thumbnails for a list of frames in a video.

extract_images_from_videos

Extract images from videos.

extract_subclip_from_videos

Create a subclip of a video.

flush_entities_buffer

Write all of the entities in the buffer.

get_annotation_labels

Returns all labels, either for all images or for images that satisfy a search query.

get_annotation_sources

Returns all sources of annotation, either for all images or for images that satisfy a search query.

get_asset_manager

Get the asset manager client, setting its state.

get_campaign_operation_config

Get config metadata.

get_comments

Get (common) comments from a list of assets.

get_config

Get config metadata.

get_embeddings

Retrieve asset embeddings.

get_job

Get the job_manager_client.Job object associated with the provided job ID.

get_job_manager

Get the job manager client.

get_metadata

Get the metadata of an asset.

get_operation

Get the specifications of an operation.

get_operation_status

Get the status of an operation.

get_performance_metrics

Get performance metrics.

get_performance_metrics_per_dimension

Get performance metrics per dimension.

get_platform_version

Get version of the SceneBox Client.

get_project_sets

Get sets at a specific stage in a project.

get_searchable_field_statistics

Get the metadata searchable field statistics for a certain asset type.

get_session_resolution_status

Get the status of the session resolution task (if it exists).

get_set_id_by_name

Get a set’s ID by its name.

get_set_status

Get the status of a set.

get_set_version_history

Get the versioning history of a set as a list of commit details.

get_sets_comparison_matrix

Get the comparison matrix of an index.

get_streamable_set

A StreamableSet is a SceneBox object that can be used to build dataloaders for a set.

get_task_rundown

Get a task run history including details such as config, input, output and phase payloads.

get_video_frame_thumbnail

Get a thumbnail PNG for a requested frame from a video.

image_properties_enrichment

Enrich images or objects with classical image processing features.

index_recording

Index Recording.

index_rosbag

Index Ros1 files.

index_rosbag2

Index Ros2 files.

index_s3_images_batch

Index S3 images.

is_coco_annotation_file_valid

Tests if a COCO annotation file is valid.

jira_create_issue

Create a Jira issue.

jira_is_connected

Check the Jira connection; returns whether Jira is connected (bool).

jira_link_new_issue_to_existing

Create a new Jira issue and link it to an existing one.

list_operations

Get the specifications of all operations.

list_organizations

Returns a list of all organizations.

list_ownerships

Returns a list of all ownerships.

list_users

Returns a list of all users.

make_video_from_image_uris

Make video from frames in a given cloud_storage, bucket, and folder.

make_video_with_annotations_from_uris

Make a video with overlaid annotations from the provided image URIs and annotation URIs, per provider.

model_inference

Perform inference from a list of supported models.

modify_organization

Modify an existing organization, given organization_id and the new organization_name.

modify_user

Modify an existing user account.

register_rosbag_session

Register a rosbag session with the SceneEngine.

remove_annotation_sources

Remove annotation sources using a list of IDs or a search query, and update the metadata of corresponding assets.

remove_assets_from_set

Remove assets from a set either by query or by IDs.

remove_sets_from_project

Remove sets from a project.

resolve_session

Resolve a session.

run_operation_instance

Run an individual instance of an operation contained in a task.

save_campaign_operation_config

Create a config for a campaign operation.

search_assets

Retrieve asset IDs with a search query.

search_campaigns

Search for campaigns that satisfy the filtering criteria.

search_meta

Retrieve asset metadata with a search query.

search_similar_assets

Find top k similar assets to an existing asset with embeddings.

search_tasks

Search for tasks that satisfy the filtering criteria.

search_within_set

Search for assets within a set.

set_is_locked

Check the lock status of a set.

set_lock

Lock a set.

set_unlock

Unlocks a set.

similarity_search_bulk_index

Bulk indexes embeddings to allow for similarity search.

slice_session

Slice an existing session with a start and end time.

split_set

Split a set into subsets by randomly sampling and distributing constituent assets.

start_operation

Start an operation.

summary_meta

Retrieve summary of asset metadata with a search query.

update_artifact_path

Updates the SceneEngineClient artifact path.

update_campaign_pipeline

Update a campaign’s task pipeline.

update_time_span

Update the start/end times of a session.

whoami

Get the logged-in user’s authentication details.

with_auth

Save auth_token into SceneEngineClient as an instance attribute.

zip_set

Create a zipfile of a set.

zip_set_and_download

Zip and download an existing set.

whoami()

Get the logged-in user’s authentication details.

Returns:

A dictionary with the following keys: “version”, “build_host”, “token”, “username”, “role”, “organization”

Return type:

dict

with_auth(auth_token=None)

Save auth_token into SceneEngineClient as an instance attribute. Authorizes the current SceneEngineClient object with the auth_token provided in this call or uses the auth_token the client was created with.

Parameters:

auth_token – Personal token associated with a valid SceneBox account. Find this token under user profile on the SceneBox web app.

get_platform_version()

Get version of the SceneBox Client.

Returns:

SceneBox version number

Return type:

str

Raises:

ValueError – If a valid version cannot be obtained from SceneBox.

update_artifact_path(artifacts_path)

Updates the SceneEngineClient artifact path.

Parameters:

artifacts_path – Path that specifies the location to save output files.

get_job(job_id)

Get the job_manager_client.Job object associated with the provided job ID.

Can be a currently executing or finished Job.

Parameters:

job_id – The job ID associated with the Job of interest.

Returns:

Job object with the job id job_id.

Return type:

Job

create_project(name, tags=None)

Create a project.

Projects are used to organize and create data workflows. Once a project is made, users can add sets, and apply operations associated with object/embeddings/annotation extraction, and more.

Parameters:
  • name – Name for the created project.

  • tags – A list of tags to associate this project with. Helpful for filtering them later.

Returns:

The ID of the created project.

Return type:

str

Raises:

ResponseError – If the project could not be created.
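A typical flow is to create a project and then attach sets to it at a given stage (via add_sets_to_project, documented below). The helper below is a hypothetical sketch, with client assumed to be an authenticated SceneEngineClient and the project name and tags illustrative:

```python
def set_up_project(client, set_ids, stage=0):
    # Create a project, then add existing sets to it at the given stage.
    project_id = client.create_project(name="curation-demo", tags=["demo"])
    client.add_sets_to_project(project_id=project_id, sets=set_ids, stage=stage)
    return project_id
```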

delete_project(project_id)

Delete a project.

Deleting a project will not delete any related sets (primary or curated) or annotation recipes.

Parameters:

project_id – The name of the project to delete.

Raises:

ResponseError – If the project could not be deleted.

get_project_sets(name, stage)

Get sets at a specific stage in a project.

Parameters:
  • name – The name of the project.

  • stage – The stage of the sets.

Return type:

List[str]

create_operation(name, operation_type, project_id, config, stage=0, description=None)

Create an operation inside a project.

Operations are the custom workflows that can be created inside projects. Several operations can be added to a single project. Choose from annotation, dataset similarity search, consensus, object extraction, and model inference operation types.

Parameters:
  • name – The name of the operation to be created.

  • operation_type – The type of the operation to execute. Must match a constant in scenebox.constants.OperationTypes.VALID_TYPES.

  • project_id – The ID of the project to add the operation to.

  • config – Configurations for the operation to be added. If operation_type is "annotation", then config must contain a dict entry with a key named “annotation_instruction_id” with the value of the relevant annotation instruction details.

  • stage – The stage of the operation to be added. Represents the order of the operations to be executed inside a project.

  • description – A brief description outlining the purpose of the operation, what the operation will accomplish, etc.

Returns:

The ID of the created operation.

Return type:

str

Raises:
  • ResponseError – If the operation could not be created.

  • OperationError – If the provided operation type is not a valid operation type.

delete_operation(operation_id)

Delete an operation.

Deleting an operation will not delete any related sets (primary or curated) or annotation recipes used by or generated by the operation.

Parameters:

operation_id – The ID of the operation to delete.

Raises:

ResponseError – If the operation could not be deleted.

start_operation(operation_id, operation_step=None, wait_for_completion=False)

Start an operation.

Parameters:
  • operation_id – The ID of the operation to execute.

  • operation_step – If the operation is a CVAT or SuperAnnotate operation, chooses whether to send data to the given annotator or to receive annotations. To send data, pass send_data; to receive, pass receive_annotations. Otherwise, has no effect.

  • wait_for_completion – If True, polls until job is complete. Otherwise, continues execution and does not raise an error if the job does not complete.

Returns:

The job ID of the Job that starts the operation.

Return type:

str
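For CVAT/SuperAnnotate-style annotation operations, the two-step operation_step flow can be sketched as below. The helper is hypothetical; client is assumed to be an authenticated SceneEngineClient, and in practice the two calls would usually happen at different times (after annotators finish), not back to back:

```python
def run_annotation_round(client, operation_id):
    # Step 1: push data to the external annotator.
    send_job = client.start_operation(
        operation_id, operation_step="send_data", wait_for_completion=True
    )
    # Step 2 (later, once annotation is done): pull annotations back.
    receive_job = client.start_operation(
        operation_id, operation_step="receive_annotations", wait_for_completion=True
    )
    return send_job, receive_job
```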

get_operation_status(operation_id)

Get the status of an operation.

Parameters:

operation_id – The ID of the operation to execute.

Returns:

The full status of the operation.

Return type:

dict

add_sets_to_project(project_id, sets, stage=0)

Add sets to a project.

Specify sets to add to a project. To run a given operation (associated with some project) on a set, the set must be added to the same project as the desired operation.

Parameters:
  • project_id – The ID of the project to add the given sets to.

  • sets – List of IDs of sets to add to the project.

  • stage – The stage associated with the set. Match this with the stage of the desired operation to run.

remove_sets_from_project(project_id, sets)

Remove sets from a project.

Parameters:
  • project_id – ID of the project to remove sets from.

  • sets – A list of dictionaries listing sets to be removed. Dict format:

    [{"set_id": "id_of_the_set_to_be_removed", "stage": "project_stage_for_the_set"}]

create_set(name, asset_type, origin_set_id=None, expected_count=None, tags=None, id=None, aux_metadata=None, is_primary=False, raise_if_set_exists=False)

Create a named dataset.

Set creation is helpful for performing data operations on several assets at once (e.g. ingesting an entire set of images/videos/LIDARs/etc. into SceneBox). Each set can only contain one asset type. Can optionally raise a SetsError if a set by the given name already exists.

Parameters:
  • name – The name to give to the set.

  • asset_type – The type of assets for the set to contain. e.g. images, videos, LIDARs, etc. To ensure using a valid asset_type, you can import AssetsConstants from scenebox.constants

  • origin_set_id – ID of the set that the new set is created from. Useful when creating a set by deriving its assets from other sets.

  • expected_count – Estimated count of the number of assets this set is expected to contain.

  • tags – Labels associated with the set. Allows for easy set search.

  • id – Optional unique ID for the set (must be a string).

  • aux_metadata – Optional dictionary of keys and values to be added to the set’s metadata.

  • is_primary – If True, gives the set an additional “Primary” tag for easy set sorting. Otherwise, does nothing.

  • raise_if_set_exists – Raises a SetsError if the name parameter matches an existing set.

Returns:

The name of the successfully created set.

Return type:

str

Raises:

SetsError – If raise_if_set_exists is True, then will raise an error if the given set name matches an existing set.
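A sketch of creating an image set. The helper is hypothetical; the string "images" is assumed to be a valid asset type (the docs recommend importing AssetsConstants from scenebox.constants to be safe), and client is assumed to be an authenticated SceneEngineClient:

```python
def create_image_set(client, name, tags=None):
    # "images" is assumed here; prefer a constant from
    # scenebox.constants.AssetsConstants in real code.
    return client.create_set(
        name=name,
        asset_type="images",
        tags=tags or [],
        raise_if_set_exists=True,
    )
```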

get_set_id_by_name(name)

Get a set’s ID by its name.

Parameters:

name – The name of the set.

Returns:

The ID of the set.

Return type:

str

commit_set(set_id, commit_message, wait_for_completion=False)

Commit the contents of a set to a new version. This will also enable versioning on the set if it is previously unversioned.

Parameters:
  • set_id – String. ID of the set you want to commit.

  • commit_message – String. A short message to be associated with the commit.

  • wait_for_completion – Boolean. If True, polls until job is complete. Otherwise, continues execution and does not raise an error if the job does not complete.

Returns:

String. The Job ID that is versioning the set; None if the set is already versioned and the current version is in sync with the contents of the set.

Return type:

job_id

Raises:

SetsError – If versioning was unsuccessful.

create_annotation_instruction(name, key, annotator, annotation_type, payload, media_type='images', description=None)

Create an annotation instruction.

Annotation instructions hold the configurations for annotating assets with external annotators. When executing an operation of type “annotation”, the relevant annotation instruction card is parsed.

Parameters:
  • name – Name of the annotation instruction to create.

  • key – Authentication key to send to Scale or SuperAnnotate.

  • annotator – Name of the annotator to use. Choose an option from scenebox.constants.AnnotationProviders.VALID_PROVIDERS.

  • annotation_type – Name of the annotation type to use. Choose an option from scenebox.constants.AnnotationTypes.VALID_TYPES.

  • payload – Configuration for the given annotator.

  • media_type – Media type to send to the annotator. Choose an option from scenebox.constants.AssetConstants.VALID_TYPES.

  • description – A brief description of the annotation instruction’s purpose, what it aims to accomplish, etc.

Returns:

The ID of the created annotation instruction.

Return type:

str

Raises:

ResponseError – If the annotation instruction could not be created.

delete_annotation_instruction(annotation_instruction_id)

Delete an existing annotation instruction.

Parameters:

annotation_instruction_id – The ID of the annotation instruction to delete.

Raises:

ResponseError – If the annotation instruction could not be deleted.

create_config(name, type, config, description=None, id=None, **kwarg)

Create a config.

Parameters:
  • name – Name of the configuration.

  • type – Type of the config to create. Choose an option from scenebox.constants.ConfigTypes.VALID_TYPES.

  • config – Body of the config to create. Form depends on the configuration needs. Same as the payload argument for self.create_annotation_instruction.

  • description – A brief description of the config’s purpose, settings, etc.

  • id – optional unique identifier

  • **kwarg – Any other misc. kwargs to save into the config.

Returns:

The ID of the created config.

Return type:

str

delete_config(config_id)

Delete a config.

Parameters:

config_id – The ID of the config to delete.

Raises:

ResponseError – If the config could not be deleted.

get_config(config_id, complete_info=False)

Get config metadata.

Parameters:
  • config_id – The ID of the config to receive the metadata of.

  • complete_info – If True, returns the entire metadata of the config (with the config parameters contained inside). Otherwise, only returns the config parameters.

Returns:

The metadata and/or parameters of the desired config.

Return type:

dict

Raises:

ResponseError – If the config metadata could not be obtained.

set_lock(set_id)

Lock a set.

Locks a set from changes. Useful for protecting a set (i.e. marks a set as “in use”) when performing a data operation on a set’s assets.

Parameters:

set_id – The ID of the set to lock.

Returns:

Always returns True.

Return type:

bool

Raises:

HTTPError – If a server or client error occurs.

set_unlock(set_id)

Unlocks a set.

Releases a set from protection/external changes.

Parameters:

set_id – The ID of the set to unlock.

Returns:

Always returns True.

Return type:

bool

Raises:

HTTPError – If a server or client error occurs.

set_is_locked(set_id)

Check the lock status of a set.

Parameters:

set_id – The ID of the set to check the lock status of.

Returns:

Returns True if the set is locked, False otherwise.

Return type:

bool

Raises:

HTTPError – If a server or client error occurs.

delete_set(set_id, delete_content=False, wait_for_completion=False)

Delete a set.

Parameters:
  • set_id – The ID of the set to delete.

  • delete_content – Whether to delete the content of the set or just the set itself. Defaults to False, meaning the content won’t be deleted.

  • wait_for_completion – If True, polls until job is complete. Otherwise, continues execution and does not raise an error if the job does not complete.

Returns:

The ID of the Job that attempts to delete the set.

Return type:

str

split_set(set_id, percentages_or_sections, subset_names=None, random_seed=42, wait_for_completion=False)

Split a set into subsets by randomly sampling and distributing constituent assets.

Parameters:
  • set_id – The ID of the parent set to split.

  • percentages_or_sections – An integer n or a list of floating point values. If percentages_or_sections is an integer n, the input set is split into n sections. If the input set size is divisible by n, each section will be of equal size, input_set_size / n. If the input set size is not divisible by n, the first int(input_set_size % n) sections will have size int(input_set_size / n) + 1, and the rest will have size int(input_set_size / n).

    If percentages_or_sections is a list of floating point values, the input set is split into subsets with the amount of data in each split i equal to floor(percentages_or_sections[i] * input set size). Any leftover data due to the floor operation is equally redistributed across the len(percentages_or_sections) subsets according to the same logic as with percentages_or_sections being an integer value.

  • subset_names – Names for the subsets to be created. Subsets corresponding to the provided subset names should not previously exist.

  • random_seed – Optional. Set a different seed to get a different random subsample of the data in each split.

  • wait_for_completion – If True, polls until job is complete. Otherwise, continues execution and does not raise an error if the job does not complete. If the number of splits is greater than 5, wait_for_completion cannot be set to False.

Returns:

The job ID that carried out the split set job.

Return type:

str
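The size arithmetic described for percentages_or_sections can be mirrored locally. This sketch is one reading of the documented rules (not the server implementation), useful for predicting subset sizes before splitting:

```python
import math

def split_sizes(total, percentages_or_sections):
    """Predict subset sizes for split_set's documented arithmetic (sketch)."""
    if isinstance(percentages_or_sections, int):
        # Integer n: n sections; the first (total % n) sections get one extra asset.
        n = percentages_or_sections
        base, rem = divmod(total, n)
        return [base + 1] * rem + [base] * (n - rem)
    # List of floats: floor each share, then redistribute the leftover
    # one by one to the earliest subsets (same logic as the integer case).
    sizes = [math.floor(p * total) for p in percentages_or_sections]
    leftover = total - sum(sizes)
    base, rem = divmod(leftover, len(sizes))
    return [s + base + (1 if i < rem else 0) for i, s in enumerate(sizes)]
```

For example, splitting 10 assets into 3 sections yields sizes [4, 3, 3], and fractional shares always sum back to the full set size.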

get_set_status(set_id)

Get the status of a set.

Parameters:

set_id – The ID of the set to get the status of.

Returns:

The full status of the desired set.

Return type:

dict

Raises:

HTTPError – If a server or client error occurs.

add_assets_to_set(set_id, search=None, ids=None, limit=None, wait_for_availability=True, timeout=30, wait_for_completion=False)

Add assets to a set.

Add assets to a set either with a search query or by IDs. Use only one of search or ids.

Parameters:
  • search – Query to locate the data subset of interest. Filters through existing assets according to the dictionary passed.

  • ids – IDs of the assets to add to the given set.

  • set_id – The ID of the set to add assets to.

  • limit – The limit of the number of additions. If the limit is reached, randomly down-sample the list of assets to add.

  • wait_for_availability – If True and the specified set is locked, waits until timeout for the set to unlock. Otherwise, if False, raises a SetsError if the set is locked.

  • timeout – If wait_for_availability is True, represents the maximum amount of time for a set to unlock. Otherwise, has no effect.

  • wait_for_completion – If True, polls until job is complete. Otherwise, continues execution and does not raise an error if the job does not complete.

Returns:

ID of the Job that runs/ran the add assets job.

Return type:

str

Raises:
  • SetsError – If wait_for_availability is False and the specified set is locked, or if neither ids nor search are passed.

  • TimeoutError – If wait_for_availability is True, and timeout is reached before the desired set is unlocked.

  • ResponseError – If assets could not be added to the desired set.
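Adding by query and adding by IDs are mutually exclusive; the sketch below shows both calls. The helper names, set and asset IDs, and the "sensor" metadata field are hypothetical, and client is assumed to be an authenticated SceneEngineClient:

```python
def add_camera_images_to_set(client, set_id):
    # By search query: every asset whose "sensor" field is "camera_1".
    search = {"filters": [{"field": "sensor", "values": ["camera_1"]}]}
    return client.add_assets_to_set(set_id=set_id, search=search,
                                    wait_for_completion=True)

def add_specific_assets_to_set(client, set_id, asset_ids):
    # By explicit IDs: pass ids instead of search, never both.
    return client.add_assets_to_set(set_id=set_id, ids=asset_ids,
                                    wait_for_completion=True)
```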

remove_assets_from_set(search=None, ids=None, set_id=None, wait_for_availability=True, timeout=30, wait_for_completion=False)

Remove assets from a set either by query or by IDs.

Parameters:
  • search – Query to locate the data subset of interest. Filters through existing assets according to the dictionary passed.

  • ids – IDs of the assets to remove from the given set.

  • set_id – The ID of the set to remove assets from.

  • wait_for_availability – If True and the specified set is locked, waits until timeout for the set to unlock. Otherwise, if False, raises a SetsError if the set is locked.

  • timeout – If wait_for_availability is True, represents the maximum amount of time for a set to unlock. Otherwise, has no effect.

  • wait_for_completion – If True, polls until job is complete. Otherwise, continues execution and does not raise an error if the job does not complete.

Returns:

The ID of the job that runs/ran the Job to remove assets from a set.

Return type:

str

assets_in_set(set_id=None)

List assets within a set.

Parameters:

set_id – The ID of the set to list assets from.

Returns:

A list of the IDs of the assets contained in the specified set.

Return type:

List[str]

search_within_set(set_id, search=None, show_meta=True)

Search for assets within a set.

Parameters:
  • search – Search query to send to the endpoint. Filters through existing sets according to the dictionary passed.

  • set_id – The ID of the set to search.

  • show_meta – If True, returns the metadata of each asset in the set. Otherwise, simply lists the IDs of each asset in the set.

Returns:

Contains asset IDs and/or metadata for the assets in the given set.

Return type:

List

Raises:

ResponseError – If the search could not be performed.

zip_set(set_id, version=None, force_recreate=False)

Create a zipfile of a set.

Parameters:
  • set_id – The ID of the set to zip.

  • version – String. Commit tag for the set’s version to download.

  • force_recreate – Boolean. Default False. Set to True to force recreate the zip file for the set.

Returns:

(The ID of the job that zipped the set, the zipfile ID of the zipped set).

Return type:

(str, str)

Raises:

ValueError – If no set ID is provided.

get_set_version_history(set_id)

Get the versioning history of the set as a list of commit details. Commit details are dictionaries containing the following keys: (“short_hash”, “full_hash”, “timestamp”, “author_name”, “author_email”, “message”)

Parameters:

set_id – The ID of the set to get the version history of.

Returns:

A list of the commit details for each time the set was versioned. Latest commit first.

Return type:

List[dict]

Raises:

ValueError – If no set ID is provided, or if set does not exist or set is not versioned.

compare_sets(sets, index_name, metric_type=None, metric_params=None, wait_for_completion=True)

Compare sets based on a given metric_type in an embedding space with given index_name.

Parameters:
  • sets – The IDs of the sets to compare with each other.

  • index_name – The name of the embedding space.

  • metric_type – The metric type that the comparison is based on; if not given, default value is wasserstein.

  • metric_params – key-values of any parameters that are needed to calculate the metric; if not given, default values for num_projections and num_seeds are 25 and 50, respectively, for wasserstein distance calculation.

  • wait_for_completion – If True, polls until job is complete. Otherwise, continues execution and does not raise an error if the job does not complete.

Returns:

The ID of the job that compared sets.

Return type:

str

Raises:

ValueError – If sets is empty or contains only one set ID, or if index_name is not provided.

get_sets_comparison_matrix(search=None, index_name=None)

Get the comparison matrix of an index.

Parameters:
  • search – Query to locate the data subset of interest. Filters through existing assets according to the dictionary passed.

  • index_name – index_name of the embeddings of interest.

Returns:

The requested comparison matrix.

Return type:

List[dict]

get_performance_metrics(comparison_task_id, search=None, iou_threshold=0.0)

Get performance metrics.

Get performance metrics, including precision, recall, and F1 score, for a given search over the annotations-comparisons asset space and a given IoU threshold.

Parameters:
  • comparison_task_id – id of the comparison task to calculate metrics for

  • search – Query to locate the data subset of interest. Filters through existing annotations comparisons according to the dictionary passed.

    # Search example
    search = {
                "filters": [{"field": "sets", "values": ["some_set_id"]}]
             }
    
  • iou_threshold – iou threshold to use in confusion matrix calculation

Returns:

The requested performance metrics.

# Example return
{
    "cat": {'precision': 0.75, 'recall': 0.47368421052631576, 'f1': 0.5806451612903226},
    "dog": {'precision': 0, 'recall': 0, 'f1': 'N/A'}
}

Return type:

Dict[str, Any]
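A usage sketch building on the search example above (the token, comparison task ID, and set ID are hypothetical):

```python
# Sketch: fetch per-class precision/recall/F1 for one set.
# from scenebox.clients import SceneEngineClient
# client = SceneEngineClient(auth_token="YOUR_TOKEN")

search = {"filters": [{"field": "sets", "values": ["some_set_id"]}]}

# metrics = client.get_performance_metrics(
#     comparison_task_id="some_comparison_task_id",
#     search=search,
#     iou_threshold=0.5,
# )
# for label, scores in metrics.items():
#     print(label, scores["precision"], scores["recall"], scores["f1"])
```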

get_performance_metrics_per_dimension(comparison_task_id, dimension, iou_threshold=0.0, search=None, force_recalculation=False)

Get performance metrics per dimension.

Get performance metrics, including precision, recall, and F1 score, per existing bucket in a given dimension, e.g. image IDs.

Parameters:
  • comparison_task_id – id of the comparison task to calculate metrics for

  • dimension – the dimension for which the performances should be calculated e.g. “id”

  • iou_threshold – iou threshold to use in confusion matrix calculation

  • search – Query images to locate the data subset of interest.

    # Search example:
    search = {
                "filters": [{"field": "sensor", "values": ["camera_1"]}]
             }
    
  • force_recalculation – If True, cached results are ignored if available.

Returns:

A dictionary with two keys: job_id and performances.

The job_id key is mapped to the ID of the job that calculated the performances, and the performances key is mapped to the resultant performance.

# Example return
{
    "job_id": "some_job_id",
    "performances": {
        "image_id_1": {
            "cat": {'precision': 0.75, 'recall': 0.47368421052631576, 'f1': 0.5806451612903226},
            "dog": {'precision': 0, 'recall': 0, 'f1': 'N/A'}
        },
        "image_id_2": {
            "cat": {'precision': 0.25, 'recall': 0.5, 'f1': 0.33333333333},
            "dog": {'precision': 0.5, 'recall': 1.0, 'f1': 0.66666666666}
        }
    }
}

Return type:

Dict[str, Dict[str, Any]]

download_zipfile(zip_id, output_filename=None)

Download a zipfile into either a bytesIO object or a file.

Parameters:
  • zip_id – The ID of the zipfile to download.

  • output_filename – The filename to give the downloaded zipfile.

Returns:

If output_filename is provided, returns the path where the zipfile was locally saved as str. Otherwise, the zipfile is returned directly as a io.BytesIO object.

Return type:

Union[str, io.BytesIO]
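A sketch of consuming the returned io.BytesIO with the standard library. The in-memory zip below stands in for the download; the commented client call (and its zip ID) is hypothetical:

```python
import io
import zipfile

# Build an in-memory zip to stand in for the downloaded BytesIO object.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("metadata.json", "{}")
buffer.seek(0)

# With a real client, the buffer would come from:
# buffer = client.download_zipfile(zip_id="some_zip_id")
names = zipfile.ZipFile(buffer).namelist()
```

Passing output_filename instead returns the local path as a str, with no in-memory handling needed.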

download_timeseries(search_dic=None, fields=None)

Download the timeseries using a search query.

Parameters:
  • search_dic – Query to locate the data subset of interest. Filters through existing assets according to the dictionary passed.

  • fields – Field names of fields to return from each session.

Returns:

Mapping between session uids, and the returned timeseries data.

Return type:

dict

zip_set_and_download(set_id=None, version=None, filename=None)

Zip and download an existing set.

The downloaded zip file contains several subfolders/files encapsulating the logged data on SceneBox. More specifically, this includes the following:
  • Asset files with their original extension (images/videos/LIDARs, etc.)
  • Asset metadata in .json format
  • Annotation data in .json format (if available)

Parameters:
  • set_id – The ID of the set to zip and download

  • version – String. If the set is versioned, then download a specific version of the set

  • filename – Name of the downloaded zip folder

Returns:

If filename is provided, returns the path where the zipfile was locally saved as str. Otherwise, the zipfile is returned directly as a io.BytesIO object.

Return type:

Union[str, io.BytesIO]

Raises:

ValueError – If no set_id is provided.
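A usage sketch (the token, set ID, version string, and filename are hypothetical):

```python
# Sketch: zip a versioned set and save it locally.
# from scenebox.clients import SceneEngineClient
# client = SceneEngineClient(auth_token="YOUR_TOKEN")

kwargs = {
    "set_id": "some_set_id",
    "version": "some_version",  # only meaningful for versioned sets
    "filename": "my_set.zip",
}
# path = client.zip_set_and_download(**kwargs)  # returns the local path as str
```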

get_asset_manager(asset_type)

Get an asset manager client.

Get the AssetManagerClient used to access assets of a particular type. This is often used in chaining. Eg. client.get_asset_manager("images").search_files({})

Parameters:

asset_type – Desired AssetManagerClient asset type. To ensure using a valid asset_type, you can import AssetsConstants from scenebox.constants

Returns:

The AssetManagerClient associated with the specified asset.

Return type:

AssetManagerClient
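A chaining sketch matching the docstring example. The literal "images" asset-type string is an assumption; valid values should come from scenebox.constants.AssetsConstants:

```python
# Sketch: chain get_asset_manager with an asset-level search.
# from scenebox.clients import SceneEngineClient
# client = SceneEngineClient(auth_token="YOUR_TOKEN")

asset_type = "images"  # prefer scenebox.constants.AssetsConstants values

# images_manager = client.get_asset_manager(asset_type=asset_type)
# matching_files = images_manager.search_files({})
```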

get_job_manager(job_id=None)

Get a job manager.

Create a manager for tracking and updating the status of a job.

Parameters:

job_id – Optional. A job ID for the job to manage. If not provided, a job manager is created with a GUID as the job ID.

Returns:

The job manager associated with the specified job ID.

Return type:

JobManagerClient

get_streamable_set(set_id, force_recreate=False, shard_size=50000)

A StreamableSet is a SceneBox object that can be used to build dataloaders for a set. A dataloader enables fetching batches of data tuples, such as (images, annotations), from SceneBox for use in a machine learning model's training or testing loop.

Parameters:
  • set_id – String. ID of the set that you want to stream.

  • force_recreate – Boolean. Default False. Flag for mandatory update of the StreamableSet object’s data repository.

  • shard_size – Integer. Maximum number of assets in a shard. Data will be split into multiple shards of shard_size.

Returns:

A StreamableSet object that supports pytorch dataloader methods.

Return type:

StreamableSet
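A sketch of the sharding arithmetic plus the call itself (the asset count, set ID, and token are hypothetical):

```python
import math

# Sketch: a set's data is split into ceil(num_assets / shard_size) shards.
num_assets = 120_000  # hypothetical set size
shard_size = 50_000   # the documented default
num_shards = math.ceil(num_assets / shard_size)

# With a real client:
# streamable = client.get_streamable_set(set_id="some_set_id", shard_size=shard_size)
# streamable can then back a PyTorch-style dataloader over (image, annotation) batches.
```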

register_rosbag_session(session_name, session_directory_path=None, rosbag_ids=None, config_name=None, wait_for_completion=True)

Register a rosbag session with the SceneEngine.

Parameters:
  • session_name – Name of the session.

  • session_directory_path – Local path (on ros-extractor) to a session directory.

  • rosbag_ids – List of rosbag IDs belonging to the session.

  • config_name – Name of the session configuration file.

  • wait_for_completion – If True, polls until job is complete. Otherwise, continues execution and does not raise an error if the job does not complete.

Returns:

the Job ID that carried out the indexing job, and the session UID of the created session.

Return type:

Tuple[str, str]

get_session_resolution_status(session_uid)

Get the status of the session resolution task (if it exists).

Parameters:

session_uid – The session uid to get the session resolution status of.

Returns:

The session resolution status.

Return type:

str

get_searchable_field_statistics(asset_type)

Get the metadata searchable field statistics for a certain asset type.

Parameters:

asset_type – The asset type to receive the searchable field statistics of. To ensure using a valid asset_type, you can import AssetsConstants from scenebox.constants

Returns:

Dictionary with keys of the searchable statistics, and values of the min/max/count/defaults for that statistic.

Return type:

dict

get_metadata(id, asset_type, with_session_metadata=False)

Get the metadata of an asset.

Parameters:
  • id – The ID of the asset to get metadata from.

  • asset_type – The asset type of the asset to get metadata from. To ensure using a valid asset_type, you can import AssetsConstants from scenebox.constants

  • with_session_metadata – If True and the asset belongs to a session, also return session entities

Returns:

The asset metadata.

Return type:

dict

compress_images(ids, desired_shape=None, thumbnail_tag=None, use_preset=None)

Compress a list of images.

Either specify desired_shape or use_preset to compress an image.

Parameters:
  • ids – The IDs of the images to compress.

  • desired_shape – Specifies the desired output shape. Not used when use_preset is set.

  • thumbnail_tag – Tag of the thumbnail to be created. Not used when use_preset is set.

  • use_preset – Use a preset configuration for desired_shape & thumbnail tag. Must be included inside the config.

Returns:

The ID of the job that carries out the image compression.

Return type:

str

compress_videos(ids, desired_shape=None, thumbnail_tag=None, use_preset=None, wait_for_completion=False)

Compress a list of videos.

Either specify desired_shape or use_preset to compress a video.

Parameters:
  • ids – The IDs of the videos to compress.

  • desired_shape – Specifies the desired output shape. Not used when use_preset is set.

  • thumbnail_tag – Tag of the thumbnail to be created. Not used when use_preset is set.

  • use_preset – Use a preset configuration for desired_shape & thumbnail tag. Must be included inside the config.

  • wait_for_completion – If True, polls until job is complete. Otherwise, continues execution without waiting for the job to complete.

Returns:

The job ID of the compression job.

Return type:

str
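A usage sketch (the video IDs, output shape, and thumbnail tag are hypothetical; the width/height interpretation of desired_shape is an assumption):

```python
# Sketch: compress two videos to a fixed shape.
# from scenebox.clients import SceneEngineClient
# client = SceneEngineClient(auth_token="YOUR_TOKEN")

video_ids = ["video_1", "video_2"]
desired_shape = (640, 480)  # assumption: (width, height)

# job_id = client.compress_videos(
#     ids=video_ids,
#     desired_shape=desired_shape,
#     thumbnail_tag="small",
#     wait_for_completion=True,
# )
```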

compress_lidars(ids, skip_factors=None, gzip_compress=True)

Compress a list of LIDARs.

Parameters:
  • ids – The IDs of the LIDARs to compress.

  • skip_factors – Determines how many LIDAR points to skip. Each point listed creates a new thumbnail. Defaults to [1, 10, 100].

  • gzip_compress – Default True. Store a gzip compressed version of the lidars. Allows quicker loading on the webapp.

Returns:

The job ID of the Job that carries out the LIDAR compression.

Return type:

str

concatenate_videos(ids, output_video_id, video_metadata, job_id=None)

Concatenate a list of videos into a single video.

Parameters:
  • ids – The IDs of the videos to concatenate.

  • output_video_id – The ID of the concatenated video produced.

  • video_metadata – Metadata of the concatenated video produced.

  • job_id – If provided, creates a job with the given job_id. Otherwise, automatically generates a job ID.

Returns:

The job ID of the Job that carries out the video concatenation.

Return type:

str

add_annotation(annotation, update_asset=True, buffered_write=False, replace_sets=False, add_to_cache=False, annotation_to_objects=False, remove_asset_repeated_annotations=False)

Add a single annotation.

Add an annotation using a class from the scenebox.models.annotation module.

Parameters:
  • annotation – The formatted annotation to ingest.

  • update_asset – If True, updates the metadata of the previously uploaded asset associated with the annotation

  • remove_asset_repeated_annotations – If update_asset is True and remove_asset_repeated_annotations is True, removes repeated annotation listings in the asset’s metadata.

  • buffered_write – If True, ingests annotation in a buffered fashion.

  • replace_sets – If True and update_asset is False, appends to existing annotation metadata rather than replacing it. If False and update_asset is True, replaces existing annotation metadata with the inputted annotation metadata.

  • add_to_cache – If True, corresponding bytes will be added to cache for quick access.

  • annotation_to_objects – If True, runs the annotations_to_objects async task after adding the annotations.

cleanup_assets_annotation_meta(asset_ids, asset_type)

Cleanup annotation lists by removing repeated and non-existent entries in an asset's metadata.

Parameters:
  • asset_ids – List of asset IDs to clean up.

  • asset_type – The type of asset. To ensure using a valid asset_type, you can import AssetsConstants from scenebox.constants

add_annotations(annotations, buffered_write=True, update_asset=True, threading=True, disable_tqdm=False, replace_sets=False, cleanup_annotation_masks=False, remove_asset_repeated_annotations=False, session_uid=None, annotations_to_objects=False, add_to_cache=False, wait_for_annotations_to_objects_completion=False)

Add several annotations at once.

Parameters:
  • annotations – The annotations to ingest.

  • buffered_write – If True, ingests annotations in a buffered fashion.

  • update_asset – If True, updates the metadata of the asset associated with each of the items in annotations. Otherwise, does not update the metadata of the associated asset.

  • threading – If True, uses multithreading to speed up annotations ingestion. Otherwise, does not use multithreading.

  • disable_tqdm – If False, uses a progressbar wrapper that calculates and shows the progress of the threading process. Otherwise, disables/silences the tqdm progressbar wrapper.

  • replace_sets – If True and update_asset is False, appends to existing annotation metadata rather than replacing it. If False and update_asset is True, replaces existing annotation metadata with the inputted annotation metadata.

  • cleanup_annotation_masks – If True, runs the cleanup_annotation_masks async task after adding all annotations.

  • remove_asset_repeated_annotations – If update_asset is True and remove_asset_repeated_annotations is True, removes repeated annotation listings in the asset’s metadata.

  • session_uid – If provided and annotations_to_objects is set to True, objects entities are added to the session during the annotations_to_objects async task.

  • annotations_to_objects – If True, runs the annotations_to_objects async task after adding the annotations.

  • wait_for_annotations_to_objects_completion – If True, wait for completion of the annotations_to_objects async task.

  • add_to_cache – If True, corresponding bytes will be added to cache for quick access.

static is_coco_annotation_file_valid(file_path)

Tests if a COCO annotation file is valid.

A valid COCO annotations file is a .json file that only contains keys listed in scenebox.constants.COCOAnnotations.KEYS, and values that are all of type list.

Parameters:

file_path – Location of the COCO annotation file to validate.

Returns:

Returns True if the COCO annotation file is valid (and can be ingested into SceneBox). Otherwise, returns False if the COCO annotation file is invalid.

Return type:

bool
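A sketch of the validity rule using only the standard library. The key names below follow the common COCO layout and are assumptions; the authoritative list is scenebox.constants.COCOAnnotations.KEYS, and the commented line shows the actual client call:

```python
import json
import tempfile

# Build a minimal COCO-style file: every top-level value must be a list.
coco = {"images": [], "annotations": [], "categories": []}
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump(coco, f)
    file_path = f.name

# The check the validator performs (conceptually): all values are lists.
with open(file_path) as fh:
    data = json.load(fh)
all_lists = all(isinstance(v, list) for v in data.values())

# With the client:
# SceneEngineClient.is_coco_annotation_file_valid(file_path)
```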

add_coco_annotations_from_folder(file_path, provider, images_set_id, folder_path=None, version=None, annotation_group='ground_truth', session_uid=None, annotations_set_id=None, use_image_id_as_annotation_id=False, preprocesses=None, thumbnailize_at_ingest=False, annotation_to_objects=False, wait_for_completion=False)

Add several images and/or COCO annotations.

Upload several images and/or COCO annotations at once with the same local folder path. This method is best used with local images that are all located in the same folder. For images not located in the same folder, or images that are not located on your local machine, check out self.add_image.

Parameters:
  • file_path – The filepath to the file that contains the coco annotations. Must be in json format.

  • provider – Name of the provider that supplied the annotations.

  • folder_path – The local path to the folder of images to upload. If not provided, no new images are uploaded.

  • version – The version of the model that supplied the annotations.

  • annotation_group – The group that the annotation belongs to. It can be one of ground_truth, model_generated, or other

  • session_uid – The session to associate with the images to upload. If folder_path is not provided, has no effect.

  • images_set_id – The set to add the images to. If folder_path is not provided, has no effect.

  • annotations_set_id – The set to add the COCO annotations to

  • use_image_id_as_annotation_id – If True, appends each annotation’s associated image_id onto the annotation’s ID. Otherwise, makes the annotation IDs None.

  • preprocesses – Specifies which process to treat the image thumbnails with.

  • thumbnailize_at_ingest – If True, create thumbnail at ingestion time. Otherwise, create thumbnails “on the fly”.

  • annotation_to_objects – If True, extract objects from the annotations when adding them.

  • wait_for_completion – If True, wait for the completion of the annotation to object process
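A usage sketch (all paths, set IDs, and the provider name are hypothetical):

```python
# Sketch: ingest a local folder of images plus a COCO annotations file.
# from scenebox.clients import SceneEngineClient
# client = SceneEngineClient(auth_token="YOUR_TOKEN")

kwargs = {
    "file_path": "/data/annotations/instances.json",
    "provider": "my_labeling_vendor",
    "images_set_id": "my_images_set",
    "annotations_set_id": "my_annotations_set",
    "folder_path": "/data/images",
    "annotation_group": "ground_truth",
}
# client.add_coco_annotations_from_folder(**kwargs)
```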

delete_annotations(ids=None, search=None)

Delete annotations using a list of ids or a search query, and updates metadata of corresponding assets.

Parameters:
  • ids – A list of annotation IDs to delete.

  • search – Query to locate the data subset of interest. Filters through existing assets according to the dictionary passed.

Returns:

A list of job IDs. One for each job that carries out the deletion of a chunk of annotations.

Return type:

List[str]

delete_assets_annotations(asset_ids, annotation_type, annotation_group=None, annotation_provider=None, annotation_version=None)

Delete annotations using a list of asset ids, and updates metadata of corresponding assets.

Parameters:
  • asset_ids – A list of asset ids to clean the annotations of.

  • annotation_type – Mandatory annotation type. If set to “all”, deletes all annotation types. Examples: “polygon”, “two_d_bounding_box”

  • annotation_group – Optional annotation group. Example “ground_truth” “model_generated”

  • annotation_provider – Optional annotation provider.

  • annotation_version – Optional annotation version.

Returns:

A list of job IDs. One for each job that carries out the deletion of a chunk of annotations.

Return type:

List[str]

remove_annotation_sources(annotation_sources, asset_type, asset_filters=None, wait_for_completion=True)

Delete all annotations from the given annotation sources, and update the metadata of the corresponding assets.

Parameters:
  • annotation_sources – A list of dictionaries. Each dictionary defines an annotation source with the mandatory field: provider and optional fields: version, annotation_type, annotation_group. Can be collected from the get_annotation_sources method.

  • asset_type – Asset media type for annotations provided by all annotation_sources. To ensure using a valid asset_type, you can import AssetsConstants from scenebox.constants

  • asset_filters – Query to locate the data subset of interest. Filters through existing assets according to the dictionary passed.

  • wait_for_completion – If True, polls until job is complete. Otherwise, continues execution and does not raise an error if the job does not complete. Default is True.

Returns:

The job ID of the Job that carries out the deletion of annotations.

Return type:

str

add_session(session_name, session_type='generic', set_id=None, resolution=None, session_aux_metadata=None, status='ready', raise_if_session_exists=False)

Add a session.

Parameters:
  • session_name – The name to give to the session.

  • session_type – The session type to give to the session. Choose from scenebox.constants.SessionTypes.

  • set_id – The session set to add the session to.

  • resolution – The resolution at which to sample events on. Measured in seconds. Ensure you choose a small enough resolution that none of your events get aliased to occur at a lower frequency than expected.

  • session_aux_metadata – Session auxiliary metadata, including any metadata that is not primary on SceneBox and that the user wants to search sessions with. Example:

    {
        "key_1": 1,
        "key_2": 1.0,
        "key_3": "1",
        "key_4": ["abc", "def"],
        "key_5": {"a": 1, "b": 2}
    }

  • status – Session status. By default, this is READY but can be IN PROGRESS as well.

  • raise_if_session_exists – Raises a SessionError if session_name matches the name of an existing session.

Returns:

The session UID of the added session.

Return type:

str
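A usage sketch (the session name, resolution, and metadata values are hypothetical):

```python
# Sketch: create a session with searchable auxiliary metadata.
# from scenebox.clients import SceneEngineClient
# client = SceneEngineClient(auth_token="YOUR_TOKEN")

session_aux_metadata = {
    "vehicle_id": "truck_7",
    "route_length_km": 12.5,
    "tags": ["night", "rain"],
}

# session_uid = client.add_session(
#     session_name="drive_2021_08_24",
#     resolution=0.5,  # seconds; pick smaller than your fastest-changing event
#     session_aux_metadata=session_aux_metadata,
# )
```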

slice_session(session_uid, start_time, end_time, session_name=None)

Slice an existing session with start and end times.

Parameters:
  • session_uid – session id to be sliced

  • start_time – start time stamp of the slice

  • end_time – end time stamp of the slice

  • session_name – Optional new name for the sliced session

Returns:

The session UID of the sliced session.

Return type:

str

update_time_span(session_uid, session_start_time, session_end_time)

Update the start/end times of a session.

Parameters:
  • session_uid – session uid of the session to change the start/end time for.

  • session_start_time – The new session start time.

  • session_end_time – The new session end time.

Returns:

The updated metadata of the session after the timespan update.

Return type:

dict

extend_time_span(session_uid, timestamps)

Extend the start time and end time of a session based on min/max of a list of timestamps.

Parameters:
  • session_uid – session uid of the session to change the start/end time for.

  • timestamps – list of timestamps

Returns:

The updated metadata of the session after the timespan update.

Return type:

dict

add_source_data(session_uid, source_data, sensors, replace=True)

Add source data to session metadata.

Parameters:
  • session_uid – The session UID of the session metadata to update.

  • source_data – The source data to add to the session metadata. Example:

    [
        {"sensor": "camera_1", "id": "id_1", "type": "videos"},
        {"sensor": "camera_2", "id": "id_2", "type": "videos"}
    ]

  • sensors – The sensors to be added to session metadata. Describes what sensors were used to capture the data. Example:

    [
        {"name": "camera_1", "type": "camera"},
        {"name": "camera_2", "type": "camera"}
    ]

  • replace – If True, overwrites existing source metadata with the supplied source data. Otherwise, appends the supplied source data to the existing source metadata.

Returns:

The updated metadata of the session after the source data update.

Return type:

dict

add_auxiliary_session_data(session_uid, auxiliary_data, replace=True)

Add auxiliary session data e.g. video data to metadata of a given session.

Parameters:
  • session_uid – The session UID of the session to add auxiliary data to.

  • auxiliary_data – The auxiliary data to add to the session metadata. Example:

    [
        {"id": "id_1", "type": "concatenated_video", "tags": ["camera_1"]},
        {"id": "id_2", "type": "concatenated_video", "tags": ["camera_2"]}
    ]

  • replace – If True, overwrites existing auxiliary metadata with the supplied auxiliary data. Otherwise, appends the supplied auxiliary data to the existing auxiliary metadata.

Returns:

The updated metadata of the session after the auxiliary session data update.

Return type:

dict

add_ego_pose(session_uid, ego_pose_data=None, ego_pose_path=None, title=None)

Add an ego pose file to a session. An ego pose file has a series of frames in which a list of timestamped objects with their ego positions in "lat" and "lon" is given. Example:

    [
        {
            "frame": 0,
            "timestamp": "2021-08-24T21:15:15.574000+00:00",
            "objects": {
                "0": {"lat": 33.6, "lon": -0.15},
                "1": {"lat": 5.6, "lon": -3.45},
                "2": {"lat": 23.25, "lon": -3.85},
                "3": {"lat": 83.0, "lon": -5.9},
                "4": {"lat": 198.55, "lon": 12.15}
            }
        },
        {
            "frame": 1,
            "timestamp": "2021-08-24T21:15:15.599000+00:00",
            "objects": {
                "5": {"lat": 33.55, "lon": -0.15},
                "6": {"lat": 5.7, "lon": -3.45},
                "7": {"lat": 23.25, "lon": -3.85},
                "8": {"lat": 82.45, "lon": -5.9},
                "9": {"lat": 197.8, "lon": 12.15}
            }
        }
    ]

Parameters:
  • session_uid – The session UID of the session to add the ego pose data to.

  • ego_pose_data – Dictionary for the ego pose of the objects.

  • ego_pose_path – Path to a JSON file containing the ego pose of the objects.

  • title – Title of the ego pose attached to the session.

add_enrichments_configs(session_uid, enrichments_configs, replace=True)

Add enrichment configurations to session metadata.

Parameters:
  • session_uid – The session UID of the session to add the enrichment configurations to.

  • enrichments_configs – The enrichments configuration to add to the session metadata. Example:

    [
        {
            "input_event": "geo_locations",
            "type": "location",
            "configuration": {"influence_type": "interval", "influence_radius_in_seconds": 0.5}
        },
        {
            "input_event": "geo_locations",
            "type": "visibility",
            "configuration": {"influence_type": "interval", "influence_radius_in_seconds": 2.0}
        },
        {
            "input_event": "geo_locations",
            "type": "weather",
            "configuration": {"influence_type": "interval", "influence_radius_in_seconds": 2.0}
        }
    ]

  • replace – If True, overwrites existing enrichments config metadata with the supplied enrichments config. Otherwise, appends the supplied enrichments config to the existing enrichments config metadata.

Returns:

The updated metadata of the session after adding the enrichments configs.

Return type:

dict

add_geolocation_entity(latitude, longitude, timestamp, session_uid, buffered=False)

Add a geolocation entity to a session.

Parameters:
  • latitude – The latitude of the geolocation.

  • longitude – The longitude of the geolocation.

  • timestamp – The timestamp of the geolocation.

  • session_uid – The session uid of the session to add the geolocation entity to.

  • buffered – Buffer write

Returns:

A confirmation that the Job to add the entity was acknowledged.

Return type:

dict

add_geolocation_entities(latitudes, longitudes, timestamps, session_uid, buffered=False)

Add geolocation entities to a session.

Parameters:
  • latitudes – A list of latitudes of the geolocations.

  • longitudes – A list of longitudes of the geolocations.

  • timestamps – A list of timestamps of the geolocations.

  • session_uid – The session uid of the session to add the geolocation entities to.

  • buffered – Buffer write

Returns:

A list of dictionaries containing confirmations that the Job to add the entity chunks were acknowledged.

Return type:

List
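A usage sketch (the coordinates, timestamps, and session UID are hypothetical):

```python
from datetime import datetime, timezone

# Sketch: add a short GPS track to a session.
latitudes = [37.7749, 37.7751, 37.7754]
longitudes = [-122.4194, -122.4190, -122.4185]
timestamps = [
    datetime(2021, 8, 24, 21, 15, second, tzinfo=timezone.utc)
    for second in (15, 16, 17)
]
# The three lists must line up element-for-element.
assert len(latitudes) == len(longitudes) == len(timestamps)

# client.add_geolocation_entities(
#     latitudes=latitudes, longitudes=longitudes,
#     timestamps=timestamps, session_uid="some_session_uid",
# )
```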

add_entity(entity_dict, urgent=False, buffered=False)

Add an entity to the specified session.

Parameters:
  • entity_dict – The entity data to add. Typically has the following form:

    {
        "session": session_uid,
        "start_time": datetime_or_str_to_iso_utc(start_time),
        "end_time": datetime_or_str_to_iso_utc(end_time),
        "timestamp": datetime_or_str_to_iso_utc(start_time),
        "entity_id": entity_id,
        "entity_name": event_name,
        "entity_value": event_value,
        "entity_type": <choose from scenebox.constants.EntityTypes>,
        "influence": <choose from scenebox.constants.EntityInfluenceTypes>,
        "extend_session": <bool; whether to extend the session to cover the influence interval>
    }

  • urgent – If True, entity is ingested immediately. However, a manual resolution is needed afterwards to make the entity searchable.

  • buffered – If true, ignores the urgent and uses a built-in buffer for writing.

Returns:

A confirmation that the Job to add the entity was acknowledged.

Return type:

dict

add_entities(entity_dicts, urgent=False, buffered=False)

Add a list of entities to the specified sessions.

Parameters:
  • entity_dicts – The entity data to add. Typically, each list item has the following form:

    {
        "session": session_uid,
        "start_time": datetime_or_str_to_iso_utc(start_time),
        "end_time": datetime_or_str_to_iso_utc(end_time),
        "timestamp": datetime_or_str_to_iso_utc(start_time),
        "entity_id": entity_id,
        "entity_name": event_name,
        "entity_value": event_value,
        "entity_type": <choose from scenebox.constants.EntityTypes>,
        "influence": <choose from scenebox.constants.EntityInfluenceTypes>,
        "extend_session": <bool; whether to extend the session to cover the influence interval>
    }

  • urgent – If True, entity is ingested immediately. However, a manual resolution is needed afterwards to make the entity searchable.

  • buffered – If True, the entities are written to the buffer and urgent is ignored. The user should eventually call flush_entities_buffer to write all the data.

Returns:

A confirmation that the Job to add the entities was acknowledged.

Return type:

dict

flush_entities_buffer()

Write all of the entities in the buffer. If the program exits without calling this, all unflushed entities are discarded.

resolve_session(session_uid, resolution=None, wait_for_completion=False)

Resolve a session.

Project session events onto a single timeline. Events are sampled at the given resolution. A smaller resolution is recommended if events rapidly change in value.

Parameters:
  • session_uid – The session UID of the session to resolve.

  • resolution – The resolution at which to sample session events. Measured in seconds.

  • wait_for_completion – If True, polls until job is complete. Otherwise, continues execution and does not raise an error if the job does not complete.

Returns:

The ID of the Job that attempts to resolve the session.

Return type:

dict

add_event_interval(event_name, event_value, start_time, end_time, session_uid, urgent=False, buffered=False)

Add an event interval to a session.

Adds a “state_set” entity to a specific session’s timespan.

Parameters:
  • event_name – The name of the event interval to add to the session.

  • event_value – The value(s) of the event over the timespan.

  • start_time – The start time of the event.

  • end_time – The end time of the event.

  • session_uid – The session UID of the session to add the event to.

  • urgent – If True, entity is ingested immediately. However, a manual resolution is needed afterwards to make the entity searchable.

  • buffered – buffered write

Returns:

The ID of the added entity.

Return type:

str
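A usage sketch (the event name, value, times, and session UID are hypothetical):

```python
from datetime import datetime, timedelta, timezone

# Sketch: mark a 30-second "lane_change" interval in a session.
start_time = datetime(2021, 8, 24, 21, 15, 0, tzinfo=timezone.utc)
end_time = start_time + timedelta(seconds=30)

# entity_id = client.add_event_interval(
#     event_name="lane_change",
#     event_value="left",
#     start_time=start_time,
#     end_time=end_time,
#     session_uid="some_session_uid",
# )
```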

add_comments(comments, start_time, end_time, session_uid, wait_for_completion=False)

Add a comment to a time segment of a session.

Parameters:
  • comments – The comment(s) to add to the given timespan.

  • start_time – The start time of the timespan to add the comment to.

  • end_time – The end time of the timespan to add the comment to.

  • session_uid – The session UID of the session to add the comment to.

  • wait_for_completion – If True, polls until job is complete. Otherwise, continues execution and does not raise an error if the job does not complete.

Returns:

The ID of the Job that attempts to add the comment.

Return type:

dict

add_event_intervals(event_name, event_values, start_times, end_times, session_uid, urgent=False, extend_session=True, epsilon=0, buffered=False)

Add a list of several event intervals to a session.

Adds several “state_set” entities to a specific session’s timespan.

Parameters:
  • event_name – The name of the event interval to add to the session.

  • event_values – The value(s) of the event over the timespan.

  • start_times – The start time of each event.

  • end_times – The end time of each event.

  • session_uid – The session UID of the session to add the events to.

  • urgent – If True, entity is ingested immediately. However, a manual resolution is needed afterwards to make the entity searchable.

  • extend_session – should extend the session to include this event (default to true)

  • epsilon – Constant added to the time-delta between each entity’s start and end times. Measured in seconds.

  • buffered – buffered write

Returns:

A list of the added entity IDs

Return type:

List[str]

add_scalar_intervals(measurement_name, measurement_values, start_times, end_times, session_uid, epsilon=0.001)

Add a timeseries with interval values to a session.

Parameters:
  • measurement_name – The name to assign to the measurements.

  • measurement_values – The numerical values of each interval. Each interval can only have one value.

  • start_times – The start times of each interval.

  • end_times – The end times of each interval.

  • session_uid – The UID of the session to add the timeseries to.

  • epsilon – The amount of time to add to start times and subtract off end times. Prevents intervals from having undesired overlap.

Returns:

The IDs of the created entities.

Return type:

List[str]

add_scalar_measurements(measurement_name, measurement_values, timestamps, session_uid, buffered=False)

Add a timeseries with point values (“scalars”) to a session.

Parameters:
  • measurement_name – The name to assign to the measurements.

  • measurement_values – The numerical values of each point. Each point can only have one value.

  • timestamps – The timestamps at which each measurement occurred.

  • session_uid – The UID of the session to add the timeseries to.

  • buffered – If True, writes events in a buffered fashion.

Returns:

The IDs of the created entities.

Return type:

List[str]

add_point_events(measurement_name, measurement_values, timestamps, session_uid, buffered=False)

Add a list of several point events to a session.

Parameters:
  • measurement_name – The name to assign to the measurements.

  • measurement_values – The state of the event at each timestamp.

  • timestamps – A list of timestamps at which the events occurred.

  • session_uid – The UID of the session to add the events to.

  • buffered – If True, writes events in a buffered fashion.

Returns:

A list of the added entity IDs.

Return type:

List[str]

add_timeseries_csv(session_uid, csv_filepath, df_labels=None)

Add a timeseries from a CSV file.

Add a measurement DataFrame to a session as scalar entities from a CSV. Creates a measurement_df Pandas DataFrame from the inputted CSV file, then passes this dataframe to self.add_df.

Parameters:
  • session_uid – The session to add the timeseries to.

  • csv_filepath – The filepath of the CSV to turn into a timeseries.

  • df_labels – List of the CSV column names to use.
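A sketch of preparing a compatible CSV with the standard library. The column names and values are illustrative, and the client call is commented out since it needs a live connection.

```python
import csv
import os
import tempfile

# Minimal CSV with a "timestamp" column plus one measurement column.
rows = [
    {"timestamp": "2021-08-24T21:15:15+00:00", "speed": 12.5},
    {"timestamp": "2021-08-24T21:15:16+00:00", "speed": 13.1},
]
fd, csv_filepath = tempfile.mkstemp(suffix=".csv")
with os.fdopen(fd, "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["timestamp", "speed"])
    writer.writeheader()
    writer.writerows(rows)

# client.add_timeseries_csv(
#     session_uid="my_session_uid",           # assumed to exist
#     csv_filepath=csv_filepath,
#     df_labels=["timestamp", "speed"],
# )
```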

add_df(measurement_df, session_uid, timestamps=None)

Add a measurement DataFrame to a session as scalar entities.

Add several measurements across time in a session by populating the measurement_df Pandas DataFrame argument. Can add an arbitrary number of named measurements, either numeric or non-numeric. Timestamps must either be specified in the timestamps argument, or under a column named “timestamps” inside measurement_df.

Parameters:
  • measurement_df – DataFrame holding the measurement(s) of interest. Add an arbitrary number of named columns to represent different measurement types. If a column named “timestamp” specifying the timestamps of the measurements does not exist, must be specified under the timestamps method argument.

    If latitude/longitude columns are included in under the names “lat” and “lon” respectively, will automatically be added as geolocation entities.

  • session_uid – The session UID to add the measurement(s) to.

  • timestamps – A list of timestamps corresponding to the data measurements in measurement_df.
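A minimal sketch of building such a DataFrame. The column names and values are illustrative, and the client call is commented out since it needs a live connection.

```python
import pandas as pd

# "timestamp" supplies the measurement times; "lat"/"lon", if present,
# are automatically added as geolocation entities.
measurement_df = pd.DataFrame({
    "timestamp": pd.to_datetime(
        ["2021-08-24T21:15:15Z", "2021-08-24T21:15:16Z"]),
    "speed": [12.5, 13.1],
    "lat": [33.6, 33.55],
    "lon": [-0.15, -0.16],
})

# client.add_df(measurement_df=measurement_df, session_uid="my_session_uid")
```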

entity_summary(session_uid, entity_names=None, start_time=None, end_time=None)

Get the entity summary for a session.

Return type:

dict

delete_session(session_uid, delete_assets_contents=True, wait_for_completion=False)

Delete an existing session.

Optionally delete the assets inside the session, as well.

Parameters:
  • session_uid – The UID of the session to delete.

  • delete_assets_contents – If True, deletes the assets contained inside the specified session. Otherwise, does not delete the assets in the session.

  • wait_for_completion – If True, polls until job is complete. Otherwise, continues execution and does not raise an error if the job does not complete.

delete_events(session_uids, event_names, start_time=None, end_time=None, wait_for_completion=False)

Delete session events.

Delete any events that either begin or end between the specified start/end times (if listed). If no start/end times are provided, deletes the provided events across the entire session time-span.

Parameters:
  • session_uids – List of session UIDs to delete events from.

  • event_names – Event names to delete.

  • start_time – Start time of items to delete. Defaults to the start time of each listed session_uid.

  • end_time – End time of items to delete. Defaults to the end time of each listed session_uid.

  • wait_for_completion – If True, polls until job is complete. Otherwise, continues execution and does not raise an error if the job does not complete.

Returns:

The ID of the job that deletes the events.

Return type:

str

search_assets(asset_type, search=None, size=None, offset=None, sort_field=None, sort_order=None, scan=True)

Retrieve asset IDs with a search query.

Returns the top size matching hits. To retrieve more than 10000 hits, or all hits, leave size and offset at their defaults and set only the search query, if any.

Parameters:
  • asset_type – Asset type to filter for in the asset search. To ensure using a valid asset_type, you can import AssetsConstants from scenebox.constants

  • search – Query to locate the data subset of interest. Filters through existing assets according to the dictionary passed.

  • size – Specifies the Elasticsearch search size. The maximum number of hits to return. Has no effect if scan is False.

  • offset – Specifies the Elasticsearch search offset. The number of hits to skip. Has no effect if scan is False.

  • sort_field – Sorts results by the specified field name.

  • sort_order – Specifies the Elasticsearch sort order, e.g. "asc" or "desc".

  • scan – If True, uses the Elasticsearch scan capability. Otherwise, uses the Elasticsearch search API.

Returns:

A list of all IDs of the assets fulfilling the search query.

Return type:

List[str]
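A sketch of a search query in the filter format shown elsewhere in these docs (see the summary_meta example below); the field name and value here are illustrative, and the client call is commented out because it requires a live connection.

```python
# Hypothetical query: all images from sensor "camera_0".
search = {
    "filters": [
        {"field": "sensor", "values": ["camera_0"], "filter_out": False},
    ]
}

# image_ids = client.search_assets(
#     asset_type="images",        # or use AssetsConstants from scenebox.constants
#     search=search,
#     sort_field="timestamp",
#     sort_order="asc",
#     scan=True,                  # stream all matching hits
# )
```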

search_meta(asset_type, query=None, size=50, offset=0, sort_field=None, sort_order=None, scan=True, compress=False)

Retrieve asset metadata with a search query.

Returns the top size matching hits. If a return of more than 10000 hits is desired, please use AssetManagerClient.search_meta_large().

Parameters:
  • asset_type – Asset type to filter for in the asset metadata search. To ensure using a valid asset_type, you can import AssetsConstants from scenebox.constants

  • query – Query to locate the data subset of interest. Filters through existing assets according to the dictionary passed.

  • size – Specifies the Elasticsearch search size. The maximum number of hits to return. Has no effect if scan is False.

  • offset – Specifies the Elasticsearch search offset. The number of hits to skip. Has no effect if scan is False.

  • sort_field – Sorts results by the specified field name.

  • sort_order – Specifies the Elasticsearch sort order, e.g. "asc" or "desc".

  • scan – If True, uses the Elasticsearch scan capability. Otherwise, uses the Elasticsearch search API.

  • compress – If True, returns a gzip-compressed list of metadata. Typically used when the returned metadata exceeds 20 MB.

Returns:

A list of the metadata of the assets fulfilling the search query.

Return type:

List[Dict]

count(asset_type, search=None)

Retrieve the number of assets matching a search query.

Parameters:
  • asset_type – Asset type to count. To ensure using a valid asset_type, you can import AssetsConstants from scenebox.constants

  • search – Query to locate the data subset of interest. Filters through existing assets according to the dictionary passed.

Return type:

int

summary_meta(asset_type, summary_request)

Retrieve summary of asset metadata with a search query.

Parameters:
  • asset_type – Asset type to collect metadata summary for. To ensure using a valid asset_type, you can import AssetsConstants from scenebox.constants

  • summary_request

    Includes:
    • search: (Dict) to locate the data subset of interest.

    • dimensions: (List[str]) to collect statistical summary of the desired dimensions

    • nested_dimensions: (List[str]) to collect statistical summary of the desired nested dimensions

    Example:

    {
        "search": {
            "filters": [
                {
                    "field": "objects",
                    "values": ["ball.camera_0.tester_manual_point"],
                    "filter_out": false
                }
            ]
        },
        "nested_dimensions": ["annotations_meta.provider"]
    }

Returns:

Example:

{
    "total": 3,
    "aggregations": [
        {
            "dimension": "annotations_meta.provider",
            "buckets": [
                {"key": "COCO_bounding_box", "doc_count": 1},
                {"key": "COCO_polygon", "doc_count": 2}
            ]
        }
    ]
}

Return type:

Dict
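The summary_request above can be built as a plain Python dict; this mirrors the documented example, with the client call commented out because it requires a live connection.

```python
# Mirrors the documented summary_meta example.
summary_request = {
    "search": {
        "filters": [
            {
                "field": "objects",
                "values": ["ball.camera_0.tester_manual_point"],
                "filter_out": False,
            }
        ]
    },
    "nested_dimensions": ["annotations_meta.provider"],
}

# summary = client.summary_meta(
#     asset_type="annotations",   # illustrative asset type
#     summary_request=summary_request,
# )
# summary["aggregations"][0]["buckets"] then holds per-provider doc counts.
```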

add_embeddings(embeddings, similarity_ingest=True, create_umap=True, wait_for_completion=False, add_to_cache=False, umap_params=None)

Add existing embeddings to an asset.

Parameters:
  • embeddings – List of embeddings objects that are created from Scenebox Embedding class.

  • similarity_ingest – If True, enables similar image/object search by performing bulk indexing. Otherwise, has no effect.

  • create_umap – If True, adds the embeddings to a new UMAP visualization.

  • wait_for_completion – If True, polls until job is complete. Otherwise, continues execution and does not raise an error if the job does not complete.

  • add_to_cache – If True, corresponding bytes will be added to cache for quick access.

  • umap_params – UMAP parameters, including n_neighbors, min_dist, and train_size. Default is {"n_neighbors": 20, "min_dist": 0.3, "train_size": 10000}.

Returns:

The IDs of the successfully added embeddings

Return type:

List[str]

get_embeddings(embedding_space, asset_type='images', asset_ids=None, use_umap_as_embedding=False, max_number_of_embeddings=200000, search=None)

Retrieve asset embeddings.

Parameters:
  • embedding_space – The embedding space to obtain the embeddings from (mandatory).

  • asset_type – Asset types, default is images

  • asset_ids – IDs of the assets to fetch embeddings for (image IDs by default). If not specified, fetches everything.

  • use_umap_as_embedding – If True, returns 2D UMAP coordinates instead of the high-dimensional embeddings. Default is False.

  • max_number_of_embeddings – Maximum number of embeddings to fetch. If exceeded, an error is raised.

  • search – Search query to filter the assets. By default, searches everything.

Returns:

A list of Embedding objects.

Return type:

List[Embedding]

delete_embeddings(embedding_space, asset_type='images', wait_for_completion=False)

Delete embeddings using an index name and update the metadata of the corresponding assets.

Parameters:
  • embedding_space – The embedding space to be deleted (mandatory)

  • asset_type – The corresponding asset type

  • wait_for_completion – If True, polls until job is complete. Otherwise, continues execution and does not raise an error if the job does not complete.

Returns:

The ID of the job that deletes the embeddings, which can be monitored.

Return type:

str

add_image(image_path=None, id=None, image_url=None, image_uri=None, image_bytes=None, sensor=None, timestamp=None, session_uid=None, set_id=None, annotations=None, preprocesses=None, aux_metadata=None, geo_field=None, shape_group_field=None, nested_fields=None, enrichments=None, add_to_session=False, thumbnailize_at_ingest=False, buffered_write=False, add_provider_to_labels=True, overwrite=True, add_to_cache=False)

Upload an image onto SceneBox.

Upload an image with a local file path, URL, URI, or image bytes. This method is best used with singular images, or images that are not all located in the same folder. For several images all located in the same folder, check out self.add_images_from_folder.

Parameters:
  • image_path – The local path to the image to upload. If not None, image_url, image_uri, and image_bytes should all be None.

  • id – A unique image ID. A common choice is the image filename.

  • image_url – The URL of the image to upload. Images must be publicly available. If not None, image_url, image_uri, and image_path should all be None.

  • image_uri – The URI of the image to upload. Can be from a private source. If not None, image_url, image_path, and image_bytes should all be None.

  • image_bytes – The image bytes to upload. If not None, image_path, image_url, and image_uri should all be None.

  • sensor – The sensor associated with the image.

  • timestamp – The time at which the image was taken.

  • session_uid – The session associated with the image.

  • set_id – The set (str) or sets (List[str]) to add the image to.

  • annotations – Annotations associated with the image. Each item in the passed list must be of a class from scenebox.models.annotation.

  • preprocesses – Specifies which process to treat the image thumbnails with.

  • aux_metadata – Auxiliary metadata associated with the image (partitioned from primary metadata)

  • geo_field – Geolocation metadata associated with the image.

  • shape_group_field – Shape group metadata associated with the image (example: UMAP).

  • nested_fields – nested fields (example: “[annotations_meta]”)

  • enrichments – The types of enrichments to add to the image.

  • thumbnailize_at_ingest – If True, create thumbnail at ingestion time. Otherwise, create thumbnails “on the fly”.

  • buffered_write – If True, writes images in a buffered fashion.

  • add_to_session – If True and session_uid is not None, add to the existing session with the passed session_uid.

  • add_provider_to_labels – If True, the annotation labels in aux.label will include the provider as well. If False, only the labels are ingested.

  • overwrite – If True and id is not None, updates the metadata/annotations/etc. of a previously uploaded image.

  • add_to_cache – If True, corresponding bytes will be added to cache for quick access.

Returns:

The id of the added image.

Return type:

str

Raises:
  • SetsErrorInvalidArgumentsError: – If more than one of image_path, image_url, image_uri, and image_bytes is not None

  • ValueError: – If image_path, image_bytes, image_uri, and image_url are all None
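The contract on the four source arguments is that exactly one must be provided. This can be sketched as a small check; the helper below is our own illustration, not part of the SceneBox API.

```python
def exactly_one_source(image_path=None, image_url=None,
                       image_uri=None, image_bytes=None):
    # add_image accepts exactly one of the four source arguments:
    # fewer raises ValueError, more raises SetsErrorInvalidArgumentsError.
    sources = (image_path, image_url, image_uri, image_bytes)
    return sum(arg is not None for arg in sources) == 1

# Valid: a single source.
assert exactly_one_source(image_path="frame_0001.png")
# Invalid: no source at all (the ValueError case in add_image).
assert not exactly_one_source()
# Invalid: two sources at once.
assert not exactly_one_source(image_path="a.png", image_bytes=b"\x89PNG")
```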

add_images_from_folder(folder_path, set_id, session_uid=None, filename_image_id_map=None, preprocesses=None, thumbnailize_at_ingest=False)

Upload images from a single local folder path onto SceneBox.

Upload several images at once with the same local folder path. This method is best used with local images that are all located in the same folder. For images not located in the same folder, or images that are not located on your local machine, check out self.add_image.

Parameters:
  • folder_path – The local path to the folder of images to upload.

  • session_uid – The session associated with the image.

  • set_id – The set to add the images to.

  • filename_image_id_map – Provides a mapping from the image filenames to the desired image IDs. If not specified, Image IDs are automatically assigned to a randomized, unique string.

  • preprocesses – Specifies which process to treat the image thumbnails with.

  • thumbnailize_at_ingest – If True, create thumbnail at ingestion time. Otherwise, create thumbnails “on the fly”.

Raises:

ValueError: – If image_path, image_bytes, image_uri, and image_url are all None

add_images(images, set_id, geo_field=None, shape_group_field=None, nested_fields=None, preprocesses=None, add_to_session=False, thumbnailize_at_ingest=False, add_provider_to_labels=True, add_to_cache=False, overwrite=True, disable_tqdm=True)

Upload multiple images onto SceneBox.

Upload an image with a local file path, URL, URI, or image bytes. This method is best used with singular images, or images that are not all located in the same folder. For several images all located in the same folder, check out self.add_images_from_folder.

Parameters:
  • images – A list of objects of the type Image defined in models.Image. For Image objects that exist, the metadata is overwritten with the information provided in this call. For updating metadata, refer to asset_manager_client.update_metadata().

  • set_id – The set (str) or sets (List[str]) to add the images to.

  • geo_field – Geolocation metadata associated with the image.

  • shape_group_field – Shape group metadata associated with the image (example: UMAP).

  • nested_fields – nested fields (example: “[annotations_meta]”)

  • preprocesses – Specifies which process to treat the image thumbnails with.

  • thumbnailize_at_ingest – If True, create thumbnail at ingestion time. Otherwise, create thumbnails “on the fly”.

  • add_to_session – If True and session_uid is not None, add to the existing session with the passed session_uid.

  • add_provider_to_labels – If True, the annotation labels in aux.label will include the provider as well. If False, only the labels are ingested.

  • add_to_cache – If True, corresponding bytes will be added to cache for quick access.

  • overwrite – Only effective if the id is provided in the image. If True, and an asset with the same id exists on SceneBox, it overwrites the asset completely with new metadata. It removes it from the previous sets and adds to new sets. If False and asset with the same id exists on SceneBox, it only adds them to the new set. It will not affect the metadata and annotations.

  • disable_tqdm – If False, TQDM is used to show progress.

Returns:

List of ids of the added images.

Return type:

List[str]

Raises:
  • SetsErrorInvalidArgumentsError: – If more than one of image_path, image_url, image_uri, and image_bytes is not None

  • ValueError: – If image_path, image_bytes, image_uri, and image_url are all None

add_video(set_id, video_path=None, video_url=None, video_uri=None, sensor=None, start_timestamp=None, session_uid=None, id=None, annotations=None, aux_metadata=None, enrichments=None, tags=None, compress_video=True, buffered_write=False, add_to_session=False, create_session=False)

Upload a video onto SceneBox.

Upload a video with a local file path, URL, or URI.

Parameters:
  • video_path – The local path to the video to upload. If not None, video_uri and video_url should both be None.

  • video_url – The URL of the video to upload. Video must be publicly available. If not None, video_path and video_uri should both be None.

  • video_uri – The URI of the video to upload. Can be from a private source. If not None, video_path, and video_url, should both be None.

  • sensor – The sensor associated with the video.

  • start_timestamp – The time at which the video recording began.

  • session_uid – The session associated with the video.

  • id – A unique video ID. A common choice is the video filename.

  • set_id – The set to add the video to.

  • annotations – Annotations associated with the Video. Each item in the passed list must be of a class from scenebox.models.annotation.

  • aux_metadata – Auxiliary metadata associated with the video (partitioned from primary metadata)

  • enrichments – The types of enrichments to add to the video.

  • tags – Labels associated with the video. Allows for easy video search.

  • compress_video – If True, registers compressed video thumbnails.

  • buffered_write – If True, writes videos in a buffered fashion.

  • add_to_session – If True and session_uid is not None, add to the existing session with the passed session_uid.

  • create_session – If True and session_uid is None, creates a single-video session named after the video, spanning the video’s start time and duration. Session resolution defaults to 1.0 second, and the video is added as aux_metadata. In this case, if the sensor name is not specified, it defaults to “main”.

Returns:

The ID of the added video.

Return type:

str

Raises:

ValueError: – If video_path, video_url, and video_uri are all None

add_videos(videos, set_id, add_to_session=False, compress_videos=True, add_to_cache=False, overwrite=True, wait_for_compression_completion=False)

Upload multiple videos onto SceneBox.

Parameters:
  • videos – A list of objects of the type Video defined in models.Video. For Video objects that exist, the metadata is overwritten with the information provided in this call. For updating metadata, refer to asset_manager_client.update_metadata().

  • set_id – The set (str) or sets (List[str]) to add the videos to.

  • add_to_session – If True and session_uid is not None, add to the existing session with the passed session_uid.

  • compress_videos – If True, registers compressed video thumbnails.

  • add_to_cache – If True, corresponding bytes will be added to cache for quick access.

  • overwrite – If False and asset exists on SceneBox, adds the previously uploaded videos to the given set_id.

  • wait_for_compression_completion – If True, waits for compression to complete.

Returns:

List of ids of the added videos.

Return type:

List[str]

Raises:

SetsErrorInvalidArgumentsError: – If more than one of video path, url, uri, and bytes is not None

get_video_frame_thumbnail(video_id, frame, thumbnail_tag)

Get a thumbnail PNG for the requested frame from a video.

Parameters:
  • video_id – ID of the video to fetch the frame from.

  • frame – The number ‘n’ for the n-th frame from the beginning.

  • thumbnail_tag – Tag of the thumbnail to be created.

Returns:

URL for the requested thumbnail

Return type:

str

extract_frame_thumbnails(ids, frames, thumbnail_tags=None, wait_for_completion=False)

Extract thumbnails for a list of frames in a video.

Parameters:
  • ids – IDs of the videos to extract frame thumbnails from.

  • frames – A list of frame numbers starting from the beginning of each video.

  • thumbnail_tags – List of thumbnail tags to extract.

  • wait_for_completion – If True, polls until job is complete. Otherwise, continues execution and does not raise an error if the job does not complete.

Returns:

The ID of the job running the operation.

Return type:

str

add_lidar(set_id, lidar_path=None, lidar_url=None, lidar_uri=None, lidar_bytes=None, sensor=None, timestamp=None, session_uid=None, id=None, format='pcd', binary_format=None, num_fields=None, enrichments=None, buffered_write=False, add_to_session=False, add_to_cache=False, aux_metadata=None)

Upload a LIDAR onto SceneBox.

Upload a LIDAR with a local file path, URL, or URI.

Parameters:
  • lidar_path – The local path to the LIDAR to upload. If not None, lidar_uri and lidar_url should both be None.

  • lidar_url – The URL of the LIDAR to upload. LIDAR must be publicly available. If not None, lidar_path and lidar_uri should both be None.

  • lidar_uri – The URI of the LIDAR to upload. Can be from a private source. If not None, lidar_path, and lidar_url, should both be None.

  • sensor – The sensor associated with the LIDAR.

  • timestamp – The time at which the LIDAR was taken.

  • session_uid – The session associated with the LIDAR.

  • id – A unique LIDAR ID. A common choice is the LIDAR filename.

  • format – The format in which the LIDAR was embedded. pcd|numpy|binary. Default is pcd. PCD files are expected to be valid. Support for binary files is experimental.

  • binary_format – Experimental. For binary data, a decoder to decode each packet to x,y,z,intensity,epoch. Format should be compatible with struct (https://docs.python.org/3/library/struct.html) library.

  • num_fields – The number of fields in the LIDAR file, used to reshape the numpy array for display. Required for the numpy format.

  • set_id – The set to add the LIDAR to.

  • enrichments – The types of enrichments to add to the LIDAR.

  • buffered_write – If True, writes LIDARs in a buffered fashion.

  • add_to_session – If True and session_uid is not None, add to the existing session with the passed session_uid.

  • add_to_cache – If True, corresponding bytes will be added to cache for quick access.

  • aux_metadata – Auxiliary metadata associated with the LIDAR (partitioned from primary metadata)

Returns:

The ID of the added LIDAR.

Return type:

str

Raises:

ValueError: – If lidar_path, lidar_url, and lidar_uri are all None.
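A sketch of an experimental binary_format, written in the struct syntax the parameter requires. The packet layout below (little-endian float32 x, y, z, intensity followed by a float64 epoch) is hypothetical; an actual sensor's layout must be checked against its documentation.

```python
import struct

# Hypothetical per-point packet: x, y, z, intensity as little-endian
# float32, epoch as float64 -- a struct-compatible decoder string for
# the binary_format parameter.
binary_format = "<ffffd"

# Pack one synthetic point and decode it back to verify the layout.
packet = struct.pack(binary_format, 1.5, -2.0, 0.25, 0.9, 1629839715.574)
x, y, z, intensity, epoch = struct.unpack(binary_format, packet)

# client.add_lidar(
#     set_id="my_lidar_set",        # illustrative set
#     lidar_path="scan_0001.bin",   # illustrative file
#     format="binary",
#     binary_format=binary_format,
# )
```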

add_lidars(lidars, set_id, add_to_session=False, add_to_cache=False, overwrite=True)

Upload multiple LIDARs onto SceneBox.

Parameters:
  • lidars – A list of objects of the type Lidar defined in models.Lidar. For Lidar objects that exist, the metadata is overwritten with the information provided in this call. For updating metadata, refer to asset_manager_client.update_metadata().

  • set_id – The set (str) or sets (List[str]) to add the lidars to.

  • add_to_session – If True and session_uid is not None, add to the existing session with the passed session_uid.

  • add_to_cache – If True, corresponding bytes will be added to cache for quick access.

  • overwrite – If False and asset exists on SceneBox, adds the previously uploaded lidars to the given set_id.

Returns:

List of ids of the added lidars.

Return type:

List[str]

Raises:

SetsErrorInvalidArgumentsError: – If more than one of lidar path, url, uri, and bytes is not None

enable_webapp_features(features)

Enable a list of features on SceneBox WebApp.

Parameters:

features – A list of features to enable on SceneBox WebApp. For example: [“UPLOAD_RECORDING”]

Raises:

AssetError: – If multiple org_configs are found. If org config exists but the name does not have a supported format.

disable_webapp_features(features)

Disable a list of features on SceneBox WebApp.

Parameters:

features – A list of features to disable on SceneBox WebApp. For example: [“UPLOAD_RECORDING”]

Raises:

AssetError: – If multiple org_configs are found. If org config exists but the name does not have a supported format.

index_s3_images_batch(bucket, folder, thumbnailize_at_ingest=True, overwrite=True, wait_for_completion=False, extensions=['jpeg', 'jpg', 'png', 'tif', 'tiff'], max_wait_sec=None)

Index s3 images.

Ingest images in a given s3 bucket and folder.

Parameters:
  • bucket – The bucket name containing images for example “my_bucket”.

  • folder – The folder containing images for example “folder” or “folder/sub_folder”.

  • thumbnailize_at_ingest – If True, create thumbnail at ingestion time. Otherwise, create thumbnails “on the fly”.

  • overwrite – If True, re-index the image which will overwrite an existing image with the same id.

  • wait_for_completion – If True, polls until job is complete. Otherwise, continues execution and does not raise an error if the job does not complete.

  • extensions – Extensions of images to ingest. Pass “*” to ingest all objects with/without any extension.

  • max_wait_sec – If a value is given, the wait for job completion will be for that many seconds. If None (default), and wait_for_completion is True, there will be no limit on the wait for job completion.

Returns:

The associated job id

Return type:

str

index_recording(path, recording_name=None, session_set_id=None, session_aux_metadata=None, wait_for_completion=False, max_wait_sec=None)

Index Recording.

Ingests recording data into scenebox including avi videos, etc.

Parameters:
  • path – The full path, including the bucket, e.g. bucket/a/b/recording_name or bucket/recording_name

  • recording_name – The name of the recording to ingest. This is used as the session name. If not given, it is inferred from the path.

  • session_set_id – The id of the existing session set id to add the new session to.

  • session_aux_metadata – The auxiliary metadata to be added to session’s metadata

  • wait_for_completion – If True, polls until job is complete. Otherwise, continues execution and does not raise an error if the job does not complete.

  • max_wait_sec – If a value is given, the wait for job completion will be for that many seconds. If None (default), and wait_for_completion is True, there will be no limit on the wait for job completion.

Returns:

The associated job id

Return type:

str

index_rosbag(sets_prefix, rosbag_uri=None, rosbag_path=None, session_name=None, session_uid=None, session_set_id=None, session_aux_metadata=None, index_session_aux_keys=False, deep_analysis=False, models=None, sensors=None, topics=None, session_resolution=1.0, image_skip_factor=10, indexing_parameters=None, wait_for_completion=False, max_wait_sec=None)

Index Ros1 files.

Ingests ros1 data into scenebox, then extracts all available images, videos, and other asset types.

Parameters:
  • sets_prefix – The prefix to append on all created set names.

  • rosbag_uri – The uri location of the folder containing the rosbag file If not None, rosbag_path should be None.

  • rosbag_path – The local path of the rosbag If not None, rosbag_uri should be None.

  • session_name – The name of the session (new, or previously added to SceneBox) associated with the indexed rosbag.

  • session_uid – The UID of a session previously added to SceneBox to associate with the indexed rosbag. If not provided, a new session is added.

  • session_set_id – The id of the existing session set id to add the new rosbag session to.

  • session_aux_metadata – The auxiliary metadata to be added to session’s metadata

  • index_session_aux_keys – If True, collects all keys from session_aux_metadata and indexes them under available_keys.

  • deep_analysis – Enables enrichments such as object extraction, image/object embeddings, and UMAP visualization.

  • models – A list of the models to use to enrich the rosbag data. Choose from the models listed in scenebox.constants.AssetsConstants.MLModelConstants.

  • sensors – A list of the sensors (listed in image metadata) to create videos from.

  • topics – A list of topics to ingest. If none are passed, all topics are ingested.

  • session_resolution – resolution of the ingested session in seconds

  • image_skip_factor – skip factor for image messages

  • indexing_parameters – indexing parameters for the rosbag ingestion

  • wait_for_completion – If True, polls until job is complete. Otherwise, continues execution and does not raise an error if the job does not complete.

  • max_wait_sec – If a value is given, the wait for job completion will be for that many seconds. If None (default), and wait_for_completion is True, there will be no limit on the wait for job completion.

Returns:

The associated job id

Return type:

str

index_rosbag2(sets_prefix, relative_file_paths, folder_uri=None, folder_path=None, session_name=None, session_set_id=None, deep_analysis=False, models=None, sensors=None, topics=None, indexing_parameters=None, session_resolution=1.0, wait_for_completion=False)

Index Ros2 files.

Ingests ros2 data into scenebox, then extracts all available images, videos, and other asset types.

Parameters:
  • sets_prefix – The prefix to append on all created set names.

  • relative_file_paths – The relative file paths of rosbag files to ingest (.yaml/db3 files). Relative to the specified folder uri or path.

  • folder_uri – The uri location of the folder containing the rosbag db3 file and metadata yaml file. If not None, folder_path should be None.

  • folder_path – The local folder path of the folder containing the rosbag db3 file and metadata yaml file. If not None, folder_uri should be None.

  • session_name – The name to give to the session associated with the indexed ros2.

  • session_set_id – The id of the existing session set id to add the new ros2 session to.

  • deep_analysis – Enables enrichments such as object extraction, image/object embeddings, and UMAP visualization.

  • models – A list of the models to use to enrich the ros2 data. Choose from the models listed in scenebox.constants.AssetsConstants.MLModelConstants.

  • sensors – A list of the sensors (listed in image metadata) to create videos from.

  • topics – A list of topics to ingest. If none are passed, all topics are ingested.

  • indexing_parameters – indexing parameters for the rosbag ingestion

  • session_resolution – resolution of the ingested session in seconds

  • wait_for_completion – If True, polls until job is complete. Otherwise, continues execution and does not raise an error if the job does not complete.

Returns:

A dictionary containing “job_id” as the key and the associated job id as the value.

Return type:

dict

add_rosbag2(relative_file_paths, folder_uri=None, folder_path=None, session_uid=None, set_id=None)

Upload Rosbag2 onto SceneBox.

Upload Rosbag2 with a local file path, URL, or URI. Can only process ros2.

Parameters:
  • relative_file_paths – The relative file paths of rosbag files to ingest (.yaml/db3 files). Relative according to specified folder uri or path.

  • folder_uri – The uri location of the folder containing the rosbag db3 file and metadata yaml file. If not None, folder_path should be None.

  • folder_path – The local folder path of the folder containing the rosbag db3 file and metadata yaml file. If not None, folder_uri should be None.

  • session_uid – The UID of the session.

  • set_id – The set id to associate with the ingested rosbag.

Returns:

The ID of the added Rosbag2 asset.

Return type:

str

annotations_to_objects(ids=None, search=None, create_objects=False, margin_ratio=0.1, output_set_id=None, margin_pixels=None, source_annotation_type=None, session_uid=None, add_to_session=True, wait_for_completion=True, progress_callback=None)

Extracts objects from annotations.

Converts annotations to objects from annotations that have been previously ingested into SceneBox. Gets labels from the asset’s auxiliary metadata.

Parameters:
  • ids – IDs of the previously ingested annotations.

  • search – Query to locate the data subset of interest. Filters through existing assets according to the dictionary passed.

  • create_objects – If False and if mask annotations are available for the given asset id, adds the area of each annotation to the asset’s auxiliary metadata. Else, additionally creates object entities from the existing annotation data associated with the passed ID.

  • margin_ratio – Widens/shrinks the object’s area of interest. A larger number increases the object’s margin. Minimum value 0.

  • margin_pixels – Widens/shrinks the object’s area of interest by this amount, in pixels. Minimum value 0. If both margin_ratio and margin_pixels are specified, margin_pixels takes precedence.

  • output_set_id – The name of the set to add the created objects to, if create_objects is True. Otherwise, has no effect.

  • source_annotation_type – If given, extracts objects from the annotation type specified. Otherwise, extracts objects from any/all of the existing annotation types: polygons, bounding boxes, and poses.

  • session_uid – The session associated with the objects. Adds an event interval (and thus, an entity) to the existing session.

  • add_to_session – Whether to add the object to the session with session_uid for session search

  • wait_for_completion – If True, polls until job is complete. Raises an error if job fails to finish after (2 * #ids) seconds if IDs are supplied. Otherwise, continues execution and does not raise an error if the job does not complete.

  • progress_callback – Callback function to log the progress of the inference job. The callback should accept a parameter of type float for progress.

Returns:

ID of the Job that runs/ran the object extraction job.

Return type:

str

Raises:

ResponseError – If endpoint call does not get a valid response
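
A sketch of an extraction call. The annotation IDs, the output set name, and the SCENEBOX_TOKEN environment variable are assumptions; the API call only runs when a token is present.

```python
import os

annotation_ids = ["ann-0001", "ann-0002"]  # hypothetical, previously ingested annotation IDs

if os.environ.get("SCENEBOX_TOKEN"):
    from scenebox.clients import SceneEngineClient

    client = SceneEngineClient(auth_token=os.environ["SCENEBOX_TOKEN"])
    job_id = client.annotations_to_objects(
        ids=annotation_ids,
        create_objects=True,            # create object entities, not just annotation areas
        margin_pixels=16,               # takes precedence over margin_ratio
        output_set_id="extracted-objects",  # hypothetical destination set
        wait_for_completion=True,       # poll until the extraction job finishes
    )
    print(job_id)
```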

get_annotation_sources(search=None)

Returns all sources of annotation for all images or images that satisfy a search query

Parameters:

search – Query to locate the data subset of interest. Filters through existing assets (images) according to the dictionary passed.

Returns:

a list of dictionaries that include provider, version, annotation_group, and annotation_type information for existing annotation sources

Return type:

List

get_annotation_labels(search=None)

Returns all labels for all images or images that satisfy a search query

Parameters:

search – Query to locate the data subset of interest. Filters through existing assets (images) according to the dictionary passed.

Returns:

a list of strings which are labels of existing annotation entities

Return type:

List
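
The two queries above can be combined to inventory annotations on a filtered image subset. The filter shape inside the search dictionary and the SCENEBOX_TOKEN environment variable are placeholder assumptions; the API calls only run when a token is present.

```python
import os

# Hypothetical search body; the exact filter schema depends on your deployment.
search = {"filters": [{"field": "sensor", "values": ["camera_front"]}]}

if os.environ.get("SCENEBOX_TOKEN"):
    from scenebox.clients import SceneEngineClient

    client = SceneEngineClient(auth_token=os.environ["SCENEBOX_TOKEN"])
    sources = client.get_annotation_sources(search=search)
    labels = client.get_annotation_labels(search=search)
    for source in sources:
        # Each entry carries provider, version, annotation_group, annotation_type.
        print(source["provider"], source["annotation_type"])
    print(sorted(labels))
```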

model_inference(asset_type='images', ids=None, search=None, model='mask_rcnn', obtain_mask=False, obtain_bbox=False, obtain_object_entities=False, obtain_embeddings=False, threshold=None, classes_of_interest=None, wait_for_completion=False, progress_callback=None, additional_params=None)

Perform inference with one of the supported models.

Extracts inferences such as masks, bounding boxes, objects, and/or embeddings on an asset of choice using a model of choice. Select from Mask RCNN, StarDist, or Image Intensity Histogram.

Parameters:
  • asset_type – The type of assets to perform model inference on. To ensure using a valid asset_type, you can import AssetsConstants from scenebox.constants

  • ids – IDs of the previously uploaded assets on which to perform a model inference.

  • search – Query to locate the data subset of interest. Filters through existing assets according to the dictionary passed.

  • model – The model to perform the inference with. Select from Mask RCNN (‘mrcnn’), StarDist (‘stardist’), or Image Intensity Histogram (‘histogram’)

  • obtain_mask – If True and model is Mask RCNN or StarDist, infer segmentations and add masks as a scenebox.models.annotation.SegmentationAnnotation. Otherwise, do nothing but log an error if the model chosen is Image Intensity Histogram.

  • obtain_bbox – If True and model is Mask RCNN or StarDist, infer bounding boxes from masks and add them as a scenebox.models.annotation.BoundingBoxAnnotation. Otherwise, do nothing but log an error if the model chosen is Image Intensity Histogram.

  • obtain_object_entities – If True and model is Mask RCNN, obtains and adds object entities. Otherwise, do nothing but log an error if the model chosen is Image Intensity Histogram or StarDist.

  • obtain_embeddings – If True, extract asset embeddings with the chosen model.

  • image_size – If not None, resizes the input images to the provided size before performing inference(s). Otherwise, do nothing.

  • wait_for_completion – If True, polls until job is complete. Otherwise, continues execution and does not raise an error if the job does not complete.

  • progress_callback – Callback function to log the progress of the inference job. The callback should accept a parameter of type float for progress.

  • additional_params – A JSON formatted key-value pair structure for additional parameters such as threshold and classes of interest supported by the chosen model.
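
A sketch of a Mask RCNN inference call with a progress callback. The image IDs, the threshold value, and the SCENEBOX_TOKEN environment variable are assumptions; the API call only runs when a token is present.

```python
import os

def log_progress(progress: float) -> None:
    # The callback receives a single float; a 0-1 range is assumed here.
    print(f"inference progress: {progress:.0%}")

if os.environ.get("SCENEBOX_TOKEN"):
    from scenebox.clients import SceneEngineClient

    client = SceneEngineClient(auth_token=os.environ["SCENEBOX_TOKEN"])
    client.model_inference(
        asset_type="images",
        ids=["img-0001", "img-0002"],          # hypothetical image IDs
        model="mrcnn",
        obtain_mask=True,
        obtain_bbox=True,
        progress_callback=log_progress,
        additional_params={"threshold": 0.5},  # model-dependent extras
    )
```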

image_properties_enrichment(asset_type='images', properties=None, ids=None, search=None, wait_for_completion=False, progress_callback=None, additional_params=None)

Enrich images or objects with classical image processing features.

Enrich image or object assets with classical image properties/features such as brightness or contrast and add this as auxiliary metadata for the asset.

Parameters:
  • asset_type – The type of assets to perform enrichment on. To ensure using a valid asset_type, you can import AssetsConstants from scenebox.constants

  • ids – IDs of the previously uploaded assets on which to perform a model inference.

  • search – Query to locate the data subset of interest. Filters through existing assets according to the dictionary passed.

  • properties – A list of strings for each property to be added to the metadata of the asset. Choose from [‘mean_brightness’, ‘rms_contrast’, ‘variance_of_laplacian’].

  • wait_for_completion – If True, polls until job is complete. Otherwise, continues execution and does not raise an error if the job does not complete.

  • progress_callback – Callback function to log the progress of the inference job. The callback should accept a parameter of type float for progress.

  • additional_params – A JSON formatted key-value pair structure for additional parameters such as batch_size for the task.
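
A sketch of an enrichment call over the three documented properties. The image IDs and the SCENEBOX_TOKEN environment variable are assumptions; the API call only runs when a token is present.

```python
import os

# The three classical image properties listed in the parameter description.
properties = ["mean_brightness", "rms_contrast", "variance_of_laplacian"]

if os.environ.get("SCENEBOX_TOKEN"):
    from scenebox.clients import SceneEngineClient

    client = SceneEngineClient(auth_token=os.environ["SCENEBOX_TOKEN"])
    client.image_properties_enrichment(
        asset_type="images",
        ids=["img-0001", "img-0002"],  # hypothetical image IDs
        properties=properties,         # added to each asset's auxiliary metadata
        wait_for_completion=True,
    )
```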

extract_images_from_videos(video_id, extract_start_time=None, extract_end_time=None, fps=None, set_id=None, wait_for_completion=False)

Extract images from video

Parameters:
  • video_id – id of the video to extract images from

  • extract_start_time – optional start timestamp for extraction interval. If not given, will be set to start timestamp of the video.

  • extract_end_time – optional end timestamp for extraction interval. If not given, will be set to end timestamp of the video.

  • fps – Desired fps (frames per second) value for extraction. If not given, a default value of 1.0 / 3.0 will be used.

  • set_id – The ID of the set to add the images to

  • wait_for_completion – If True, polls until job is complete. Raises an error if job fails to finish after (2 * #ids) seconds if IDs are supplied. Otherwise, continues execution and does not raise an error if the job does not complete.

Returns:

The ID of the Job that attempts to perform image extraction.

Return type:

str
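
A sketch of a frame-extraction call at a custom rate. The video ID, the set name, and the SCENEBOX_TOKEN environment variable are assumptions; the API call only runs when a token is present.

```python
import os

fps = 2.0  # two frames per second instead of the 1/3 default

if os.environ.get("SCENEBOX_TOKEN"):
    from scenebox.clients import SceneEngineClient

    client = SceneEngineClient(auth_token=os.environ["SCENEBOX_TOKEN"])
    job_id = client.extract_images_from_videos(
        video_id="video-0001",      # hypothetical video ID
        fps=fps,
        set_id="extracted-frames",  # hypothetical destination set
        wait_for_completion=True,
    )
    print(job_id)
```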

extract_subclip_from_videos(video_id, subclip_id=None, extract_start_time=None, extract_end_time=None, set_id=None, wait_for_completion=False)

Create a subclip of a video.

Parameters:
  • video_id – The ID of the full-length video to extract from

  • subclip_id – The ID of the generated subclip

  • extract_start_time – The time in the full-length video to start the subclip from

  • extract_end_time – The time in the full-length video to end the subclip at

  • set_id – The ID of the set to place the generated subclip into

  • wait_for_completion – If True, polls until job is complete. Otherwise, continues execution and does not raise an error if the job does not complete.

Returns:

The job ID of the Job that tracks the subclip extraction.

Return type:

str

make_video_from_image_uris(bucket, folder, cloud_storage, sensor, session_uid=None, video_id=None, tags=None, set_id=None, image_skip_factor=1, wait_for_completion=False)

Make video from frames in a given cloud_storage, bucket, and folder. The path “{cloud_storage}://{bucket}/{folder}/” should contain frames with the following format: Unix_timestamps_in_nanoseconds.extension e.g. 1664999890113942016.jpeg

Parameters:
  • bucket – bucket where the frames are located

  • folder – folder where the frames are located. Note that bucket should not be included here.

  • cloud_storage – cloud provider e.g. s3 or gs

  • sensor – Sensor name associated with the image uris

  • session_uid – The ID of the session that the images and video belong to

  • video_id – The ID of the video

  • tags – Tags to add to video metadata

  • set_id – The ID of the set to add the video to

  • image_skip_factor – An integer factor for skipping images when making the video. Default is 1: no images are skipped.

  • wait_for_completion – If True, polls until job is complete. Raises an error if job fails to finish after (2 * #ids) seconds if IDs are supplied. Otherwise, continues execution and does not raise an error if the job does not complete.

Returns:

The ID of the Job that attempts to perform video making.

Return type:

str
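
The expected frame naming (Unix timestamp in nanoseconds plus extension) can be produced from a timezone-aware datetime; a small helper sketch:

```python
from datetime import datetime, timedelta, timezone

def frame_filename(ts: datetime, extension: str = "jpeg") -> str:
    """Build a frame name of the form <unix_ns>.<extension>."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    # Exact integer arithmetic: microseconds since epoch, scaled to nanoseconds.
    unix_ns = ((ts - epoch) // timedelta(microseconds=1)) * 1000
    return f"{unix_ns}.{extension}"

print(frame_filename(datetime(2022, 10, 5, 19, 58, 10, tzinfo=timezone.utc)))
# → 1664999890000000000.jpeg
```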

make_video_with_annotations_from_uris(timestamps_image_uris_csv_filepath, sensor, width, height, image_format, session_uid=None, video_id=None, tags=None, set_id=None, image_skip_factor=1, wait_for_completion=False)

Make a video with overlaid annotations from provided image URIs and annotation URIs per provider

Parameters:
  • timestamps_image_uris_csv_filepath – filepath for the csv file that contains two columns: timestamp and image_uri

  • sensor – Sensor name associated with the image uris

  • width – Width of images

  • height – Height of images

  • image_format – Format of images e.g. png

  • session_uid – The ID of the session that the images and video belong to

  • video_id – The ID of the video

  • tags – Tags to add to video metadata

  • set_id – The ID of the set to add the video to

  • image_skip_factor – An integer factor for skipping images when making the video. Default is 1: no images are skipped.

  • wait_for_completion – If True, polls until job is complete. Raises an error if job fails to finish after (2 * #ids) seconds if IDs are supplied. Otherwise, continues execution and does not raise an error if the job does not complete.

Returns:

The ID of the Job that attempts to perform video making.

Return type:

str

similarity_search_bulk_index(embedding_ids=None, search_dic=None, wait_for_completion=False)

Bulk indexes embeddings to allow for similarity search.

Existing embeddings are mapped to a high-dimensional vector space, and a distance measure is applied to find which assets are most similar. After running this method, similarity search is available for the assets associated with the input embeddings.

Parameters:
  • embedding_ids – The IDs of the embeddings to bulk index.

  • search_dic – Query to locate the data subset of interest. Filters through existing assets according to the dictionary passed.

  • wait_for_completion – If True, polls until job is complete. Raises an error if job fails to finish after (2 * #ids) seconds if IDs are supplied. Otherwise, continues execution and does not raise an error if the job does not complete.

Returns:

The ID of the Job that attempts to perform bulk indexing.

Return type:

str

create_umap(asset_type, index_name, search_dic=None, n_neighbors=20, min_dist=0.3, train_size=10000, progress_callback=None, transform_only=False, wait_for_completion=False)

Applies UMAP onto existing asset indices.

Applies UMAP (Uniform Manifold Approximation and Projection) to existing asset embeddings, and outputs a visual representation of the embeddings space. Can be viewed from the “Embeddings” view on the SceneBox web app.

Note

Embeddings must be indexed before this function can be used.

Parameters:
  • asset_type – The type of asset to apply UMAP on. To ensure using a valid asset_type, you can import AssetsConstants from scenebox.constants

  • index_name – index_name of the embeddings. Used to filter through the embeddings metadata.

  • search_dic – Query to locate the data subset of interest. Filters through existing assets according to the dictionary passed.

  • n_neighbors – Controls how UMAP balances local vs. global structure in data. Low values of n_neighbors force UMAP to concentrate on local structure, while high values push UMAP to look at larger neighborhoods of each point.

  • min_dist – Controls how tightly UMAP is allowed to pack points together.

  • train_size – How much of the data should be used to train the UMAP. Default is 10000.

  • progress_callback – Callback function to log the progress of the inference job. The callback should accept a parameter of type float for progress.

  • transform_only – If a UMAP already exists and new data needs to be transformed into the same embedding space, set this parameter to True.

  • wait_for_completion – If True, polls until job is complete. Otherwise, continues execution and does not raise an error if the job does not complete.

Returns:

The job_id of the Job that applies UMAP.

Return type:

str
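
A sketch of a UMAP run on already-indexed embeddings. The index name, the parameter values, and the SCENEBOX_TOKEN environment variable are assumptions; the API call only runs when a token is present.

```python
import os

# Hypothetical UMAP settings: fewer neighbors favour local structure,
# a smaller min_dist packs points more tightly.
umap_params = {"n_neighbors": 15, "min_dist": 0.1}

if os.environ.get("SCENEBOX_TOKEN"):
    from scenebox.clients import SceneEngineClient

    client = SceneEngineClient(auth_token=os.environ["SCENEBOX_TOKEN"])
    job_id = client.create_umap(
        asset_type="images",
        index_name="resnet50-embeddings",  # hypothetical embeddings index name
        n_neighbors=umap_params["n_neighbors"],
        min_dist=umap_params["min_dist"],
        wait_for_completion=True,
    )
    print(job_id)
```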

search_similar_assets(id, similar_asset_count=10, asset_type='images', embedding_space=None)

Find top k similar assets to an existing asset with embeddings.

Parameters:
  • id – asset id.

  • similar_asset_count – count of similar assets (10 by default)

  • asset_type – type of assets (Images by default) To ensure using a valid asset_type, you can import AssetsConstants from scenebox.constants

  • embedding_space – The embedding space in which similarity search is performed. Optional; has no effect if there is only one embedding space.

Returns:

list of asset ids similar to the asset sorted by most to least similar.

Return type:

list
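
The two similarity methods combine into a simple workflow: index first, then query. The embedding and asset IDs and the SCENEBOX_TOKEN environment variable are assumptions; the API calls only run when a token is present.

```python
import os

top_k = 5

if os.environ.get("SCENEBOX_TOKEN"):
    from scenebox.clients import SceneEngineClient

    client = SceneEngineClient(auth_token=os.environ["SCENEBOX_TOKEN"])
    # Similarity search needs a built index, so bulk index the embeddings first.
    client.similarity_search_bulk_index(
        embedding_ids=["emb-0001", "emb-0002"],  # hypothetical embedding IDs
        wait_for_completion=True,
    )
    similar_ids = client.search_similar_assets(
        id="img-0001",  # hypothetical query asset
        similar_asset_count=top_k,
        asset_type="images",
    )
    print(similar_ids)  # sorted from most to least similar
```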

download_annotations(ids, folder_path=None)

Download annotations to a destination folder including the mask files

Parameters:
  • ids – list of annotation ids

  • folder_path – optional destination folder_path. If not given, all annotations will be downloaded and returned as the result

Return type:

Dict
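
A sketch of downloading annotations (including mask files) to a temporary folder. The annotation IDs and the SCENEBOX_TOKEN environment variable are assumptions; the API call only runs when a token is present.

```python
import os
import tempfile

destination = tempfile.mkdtemp(prefix="scenebox_annotations_")

if os.environ.get("SCENEBOX_TOKEN"):
    from scenebox.clients import SceneEngineClient

    client = SceneEngineClient(auth_token=os.environ["SCENEBOX_TOKEN"])
    result = client.download_annotations(
        ids=["ann-0001", "ann-0002"],  # hypothetical annotation IDs
        folder_path=destination,       # omit to get everything back in the return value
    )
    print(result)
```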

add_user(email, username, role, organization_id)

Add a new SceneBox user account to your organization.

Parameters:
  • email – E-mail of the user to add.

  • username – Username of the user to add.

  • role – Role of the user to add. Choose from admin, tester, health_checker, data_provider, data_user, or public_user. See the SceneBox user guides for more information.

  • organization_id – Organization id to add the user to.

Returns:

The successfully created user username, email, role, and organization.

Return type:

dict

modify_user(username, email=None, role=None, organization_id=None)

Modify an existing user account.

Please note that an existing user’s username cannot be changed.

Parameters:
  • username – The user username to modify.

  • email – The new user email.

  • role – The new user role. Choose from admin, tester, health_checker, data_provider, data_user, or public_user. See the SceneBox user guides for more information.

  • organization_id – The new user organization.

list_users()

Returns a list of all users.

Returns:

Returns a dictionary of username, organization name, role, email, and token for each user.

Return type:

List[Dict]

add_organization(organization_name)

Add a new SceneBox organization. Requires superadmin privilege.

Parameters:

organization_name – organization name.

Returns:

The successfully created organization id, name.

Return type:

dict

list_organizations()

Returns a list of all organizations.

Returns:

Returns a list of dictionary of organization name and id.

Return type:

List[Dict]

modify_organization(organization_id, organization_name)

Modify an existing organization.

Parameters:
  • organization_id – Organization ID.

  • organization_name – The new organization name.

list_ownerships()

Returns a list of all ownerships.

Returns:

A list of the ownerships the user is associated with.

Return type:

List[Dict]

display_image(image_id)

Display a given image on the SceneBox web app.

Note

This method will only run properly from a Google Colab or Jupyter notebook. Please also make sure that you are logged into the SceneBox web app, and that pop-ups are allowed in your browser.

Parameters:

image_id – The ID of the image to display on the SceneBox web app.

display_video(video_id)

Display a given video on the SceneBox web app.

Note

This method will only run properly from a Google Colab or Jupyter notebook. Please also make sure that you are logged into the SceneBox web app, and that pop-ups are allowed in your browser.

Parameters:

video_id – The ID of the video to display on the SceneBox web app.

display_object(object_id)

Display a given object on the SceneBox web app.

Note

This method will only run properly from a Google Colab or Jupyter notebook. Please also make sure that you are logged into the SceneBox web app, and that pop-ups are allowed in your browser.

Parameters:

object_id – The ID of the object to display on the SceneBox web app.

display_lidar(lidar_id)

Display a given Lidar on the SceneBox web app.

Note

This method will only run properly from a Google Colab or Jupyter notebook. Please also make sure that you are logged into the SceneBox web app, and that pop-ups are allowed in your browser.

Parameters:

lidar_id – The ID of the Lidar to display on the SceneBox web app.

display_session(session_uid)

Display a given session on the SceneBox web app.

Note

This method will only run properly from a Google Colab or Jupyter notebook. Please also make sure that you are logged into the SceneBox web app, and that pop-ups are allowed in your browser.

Parameters:

session_uid – The ID of the session to display on the SceneBox web app.

display_set(set_id)

Display a given set on the SceneBox web app.

Note

This method will only run properly from a Google Colab or Jupyter notebook. Please also make sure that you are logged into the SceneBox web app, and that pop-ups are allowed in your browser.

Parameters:

set_id – The ID of the set to display on the SceneBox web app.

display_projects(project_id)

Display a given project on the SceneBox web app.

Note

This method will only run properly from a Google Colab or Jupyter notebook. Please also make sure that you are logged into the SceneBox web app, and that pop-ups are allowed in your browser.

Parameters:

project_id – The ID of the project to display on the SceneBox web app.

display_dashboard(dashboard_name)

Display a given dashboard on the SceneBox web app.

Note

This method will only run properly from a Google Colab or Jupyter notebook. Please also make sure that you are logged into the SceneBox web app, and that pop-ups are allowed in your browser.

Parameters:

dashboard_name – The Name of the dashboard to display on the SceneBox web app.

clear_cache(all_organizations=False, partitions=None)

Clears the scene engine’s Redis cache by organization.

Parameters:
  • all_organizations – If True, the Redis cache for all organizations is cleared. Requires admin privilege.

  • partitions – List of partitions (such as images, sets, annotations) whose cache should be cleared. If not set, all partitions are cleared.

Returns:

ACK_OK_RESPONSE on success

jira_is_connected()

Check the Jira connection.

Returns:

True if Jira is connected.

Return type:

bool

jira_create_issue(summary, jira_project, type='Task', description=None, user_id=None, link_url=None, link_title='attachment')

Create a Jira issue.

Note

This method only works if SceneBox and Jira are connected

Parameters:
  • summary – ticket summary or title

  • jira_project – project to add the ticket under

  • type – ticket type Task | Story | Bug

  • description – ticket description

  • user_id – optional assignee

  • link_url – optional link url

  • link_title – optional title for link

Returns:

url of the created Jira ticket

Return type:

str

Create a new Jira issue and link it to an existing one.

Note

This method only works if SceneBox and Jira are connected

Parameters:
  • existing_issue_key – Key for an existing Jira issue

  • summary – ticket summary or title

  • jira_project – project to add the ticket under

  • type – ticket type Task | Story | Bug

  • description – ticket description

  • user_id – optional assignee

  • link_url – optional link url

  • link_title – optional title for link

  • link_comment – optional comment for link

Returns:

url of the created Jira ticket

Return type:

str

save_campaign_operation_config(name, config, operation_id, description=None, id=None)

Create a config for a campaign operation.

Parameters:
  • name – Name of the configuration.

  • config – Body of the config to create. Form dependent on the configuration needs.

  • description – A brief description of the config’s purpose, settings, etc.

  • id – optional unique identifier

  • operation_id – String. The ID of the campaign operation the config is meant for.

Returns:

The ID of the created config.

Return type:

str

delete_campaign_operation_config(config_id)

Delete a config.

Parameters:

config_id – The ID of the config to delete.

Raises:

ResponseError – If the config could not be deleted.

get_campaign_operation_config(config_id, complete_info=False)

Get config metadata.

Parameters:
  • config_id – The ID of the config to receive the metadata of.

  • complete_info – If True, returns the entire metadata of the config (with the config parameters contained inside). Otherwise, only returns the config parameters.

Returns:

The metadata and/or parameters of the desired config.

Return type:

dict

Raises:

ResponseError – If the config metadata could not be obtained.

create_campaign(name, operations_pipeline, tags=None)

Create a Campaign.

A Campaign is a collection of tasks to run on the raw data or metadata on SceneBox. Use this method to create a new campaign with a template pipeline. A template pipeline is the blueprint of an operations pipeline that every new task added to the campaign will follow. A pipeline is made up of TemplateOperation objects that define the order of the operations in a task along with their configs.

Parameters:
  • name – A name to associate the campaign with. Needs to be unique for an organization.

  • tags – List of strings. Attach a list of tags to a campaign to search / sort for them later.

  • operations_pipeline – List of TemplateOperation objects. A list defining the order of the operations with their respective configs.

Returns:

A unique id of the campaign if creation was successful.

Return type:

str

Raises:

CampaignError – If the campaign could not be created
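
A sketch of creating a campaign and adding a task to it. The TemplateOperation import path and constructor arguments, the operation/config IDs, and the SCENEBOX_TOKEN environment variable are all assumptions to illustrate the flow; check your SDK version for the real names. The API calls only run when a token is present.

```python
import os

campaign_tags = ["ingestion", "nightly"]  # searchable later via search_campaigns

if os.environ.get("SCENEBOX_TOKEN"):
    from scenebox.clients import SceneEngineClient
    # Hypothetical import path and constructor; verify against your SDK.
    from scenebox.models import TemplateOperation

    client = SceneEngineClient(auth_token=os.environ["SCENEBOX_TOKEN"])
    pipeline = [
        TemplateOperation(operation_id="ingest_images", config_id="cfg-ingest"),
        TemplateOperation(operation_id="run_inference", config_id="cfg-mrcnn"),
    ]  # hypothetical operation and config IDs
    campaign_id = client.create_campaign(
        name="nightly-ingestion",  # must be unique within the organization
        operations_pipeline=pipeline,
        tags=campaign_tags,
    )
    # Each new task follows the template pipeline defined above.
    task_id = client.add_task_to_campaign(campaign_id=campaign_id)
    print(campaign_id, task_id)
```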

delete_campaign(campaign_id)

Delete a campaign.

Deleting a campaign will delete all tasks and, by extension, all operation runs contained in it.

Parameters:

campaign_id – ID of the campaign to delete

Raises:

ResponseError – If the campaign could not be found or could not be deleted.

search_campaigns(filters=None, return_fields=None, sort_field=None, sort_order=None)

Search for campaigns that satisfy the filtering criteria

Parameters:
  • filters – A list of key-value dictionaries to impose a criteria on available campaigns.

  • return_fields – List of strings. The fields to return from the available campaign metadata fields.

  • sort_field – String. The field to sort the return record with. Must be a valid field in the campaigns metadata. By default, results are sorted by their timestamps.

  • sort_order – String. One of [“asc”, “desc”] to indicate the order in which to return the results. By default, results are sorted in the descending order.

Returns:

A list of campaign details as a dictionary. A ResponseError if the task could not be found.

Return type:

List[dict]

update_campaign_pipeline(campaign_id, new_operations_pipeline)

Update a campaign’s task pipeline.

Specifically, this method allows updating the configs of operations already contained in the pipeline, or adding new operations to extend the pipeline further. Existing operations cannot be removed or reordered; for such changes, please look into creating new campaigns.

Parameters:
  • campaign_id – ID of the campaign to update

  • new_operations_pipeline – List of TemplateOperation objects to update the campaign with. A list defining the entire order of the operations with their respective configs. Should contain the entire chain even if not being updated.

Raises:

CampaignError – If the update was unsuccessful

Return type:

dict

add_task_to_campaign(campaign_id, input_payload=None)

Add new tasks to a campaign. Each new task will follow the blueprint pipeline of the operations and their configs defined as the campaign’s template task.

Parameters:
  • campaign_id – String. ID of the campaign to define the template for.

  • input_payload – Dictionary. Each dictionary key should conform to the input schema outlined for the first operation in the pipeline.

Returns:

ID of the task created

Return type:

str

Raises:

CampaignError – If the task could not be created successfully.

get_task_rundown(task_id)

Get a task’s run history, including details such as config, input, output, and phase payloads.

Parameters:

task_id – ID of the task to get.

Returns:

Task details as list of dictionaries. A ResponseError if the task could not be found.

Return type:

dict

search_tasks(filters=None, return_fields=None, sort_field=None, sort_order=None, include_template_tasks=False)

Search for tasks that satisfy the filtering criteria

Parameters:
  • filters – A list of key-value dictionaries to impose a criteria on available tasks.

  • return_fields – List of strings. The fields to return from available task metadata fields.

  • sort_field – String. The field to sort the return record with. Must be a valid field in the tasks metadata. By default, results are sorted by their timestamps.

  • sort_order – String. One of [“asc”, “desc”] to indicate the order in which to return the results. By default, results are sorted in the descending order.

  • include_template_tasks – Boolean. Flag to indicate whether to search over template tasks set for a campaign.

Returns:

A list of task details as a dictionary. A ResponseError if the task could not be found.

Return type:

List[dict]

delete_task(task_id)

Delete a task Deleting a task will delete operation instances associated with it.

Parameters:

task_id – ID of the task to delete

Raises:

ResponseError – If the task could not be found or could not be deleted.

list_operations(return_fields=None)

List operations in the OpsStore

This method can be used to get information about the definitions and specifications of the operations listed in the OpsStore.

Parameters:

return_fields – List of strings. The fields to return from the operation spec.

Raises:

ResponseError – If the operation could not be found.

Return type:

dict

get_operation(operation_id)

Get specifications of an operation

This method can be used to get information about the definition and specification of an operation listed in the OpsStore.

Parameters:

operation_id – String. A unique string identifier for the operation.

Raises:

ResponseError – If the operation could not be found.

Return type:

dict

run_operation_instance(operation_instance_id, phase=None, phase_payload=None, wait_for_completion=False)

Run an individual instance of an operation contained in a task. An operation instance is a self-contained work unit of a task. Running a task will consume the input defined in its input payload according to its pre-defined config.

Parameters:
  • operation_instance_id – String. ID of the campaign to define the template for.

  • phase – Optional string. An operation can have different run-time behaviours based on its phase. Refer to the operations Phase and Phase Payload Schema for details.

  • phase_payload – Optional dictionary. A parameter key-value dictionary to provide phase-level configuration to the operation. Will be validated against the phase_payload_schema defined in the operation.

  • wait_for_completion – If True, polls until job is complete. Otherwise, continues execution and returns job_id

Returns:

ID of the job running the operation.

Return type:

str

Raises:

OperationError – If the operation could not be run successfully.

add_comment_to_assets(asset_type, ids, comment)

Add a comment to a list of assets.

Parameters:
  • asset_type – Asset type to add a comment to. To ensure using a valid asset_type, you can import AssetsConstants from scenebox.constants

  • ids – The IDs of the assets to add the comment to.

  • comment – The comment to be added.

Return type:

dict

delete_comment_from_assets(asset_type, ids, comment)

Delete a comment from a list of assets.

Parameters:
  • asset_type – Asset type to delete a comment from. To ensure using a valid asset_type, you can import AssetsConstants from scenebox.constants

  • ids – The IDs of the assets to delete the comment from.

  • comment – The comment to be deleted.

Return type:

dict

get_comments(asset_type, ids)

Get (common) comments from a list of assets.

Parameters:
  • asset_type – Asset type to collect common comments from. To ensure using a valid asset_type, you can import AssetsConstants from scenebox.constants

  • ids – The IDs of the assets.

Return type:

dict
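
The three comment methods compose into an add / read / delete round trip. The asset IDs, the comment text, and the SCENEBOX_TOKEN environment variable are assumptions; the API calls only run when a token is present.

```python
import os

comment = "needs re-annotation"
image_ids = ["img-0001", "img-0002"]  # hypothetical asset IDs

if os.environ.get("SCENEBOX_TOKEN"):
    from scenebox.clients import SceneEngineClient

    client = SceneEngineClient(auth_token=os.environ["SCENEBOX_TOKEN"])
    client.add_comment_to_assets(asset_type="images", ids=image_ids, comment=comment)
    # get_comments returns the comments common to all of the listed assets.
    common = client.get_comments(asset_type="images", ids=image_ids)
    print(common)
    client.delete_comment_from_assets(asset_type="images", ids=image_ids, comment=comment)
```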