atomscale.client.Client#

class atomscale.client.Client(api_key: str | None = None, endpoint: str | None = None, mute_bars: bool = False)[source]

Bases: BaseClient

Atomic Data Sciences API client

Parameters:
  • api_key (str | None) – API key. Explicit value takes precedence; if None, falls back to the AS_API_KEY environment variable.

  • endpoint (str | None) – Root API endpoint. Explicit value takes precedence; if None, falls back to the AS_API_ENDPOINT environment variable, defaulting to ‘https://api.atomscale.ai/’ if not set.

  • mute_bars (bool) – Whether to mute progress bars. Defaults to False.

__init__(api_key: str | None = None, endpoint: str | None = None, mute_bars: bool = False)[source]
Parameters:
  • api_key (str | None) – API key. Explicit value takes precedence; if None, falls back to the AS_API_KEY environment variable.

  • endpoint (str | None) – Root API endpoint. Explicit value takes precedence; if None, falls back to the AS_API_ENDPOINT environment variable, defaulting to ‘https://api.atomscale.ai/’ if not set.

  • mute_bars (bool) – Whether to mute progress bars. Defaults to False.

Methods

__init__([api_key, endpoint, mute_bars])

aiter_poll_similarity_trajectory(source_id, *)

Asynchronously poll similarity trajectory data without blocking the loop.

create_growth_instrument(label, name, ...[, ...])

Create a new growth instrument.

delete_growth_instrument(synth_source_id)

Delete a growth instrument.

download(data_ids[, dest_dir, data_type])

Download raw or processed files for any data type to disk.

download_videos(data_ids[, dest_dir, data_type])

Deprecated alias for download().

get(data_ids)

Get analyzed data results

get_changepoints(data_ids[, latest_only, ...])

Get changepoint detection records for one or more data IDs.

get_physical_sample(physical_sample_id, *[, ...])

Get all data for a physical sample.

get_project(project_id, *[, ...])

Get all data grouped by physical sample for a project.

get_similarity_trajectory(source_id, *[, ...])

Fetch a one-shot similarity trajectory for a source data_id or physical_sample_id.

iter_poll_similarity_trajectory(source_id, *)

Synchronously poll similarity trajectory data, yielding DataFrames.

list_growth_instruments()

List all growth instruments accessible by the user.

list_physical_samples()

List physical samples available to the user.

list_projects()

List projects available to the user.

search([keywords, ...])

Search and obtain data catalogue entries

start_polling_similarity_trajectory_task(...)

Start polling similarity trajectory data as an asyncio.Task.

start_polling_similarity_trajectory_thread(...)

Start polling similarity trajectory data in a background daemon thread.

upload(files[, physical_sample, project])

Upload and process files.

Attributes

session

Session under which HTTP requests are issued

search(keywords: str | list[str] | None = None, include_organization_data: bool = True, data_ids: str | list[str] | None = None, physical_sample_ids: str | list[str] | None = None, project_ids: str | list[str] | None = None, data_type: Literal['rheed_image', 'rheed_stationary', 'rheed_rotating', 'xps', 'xrd', 'photoluminescence', 'pl', 'raman', 'recipe', 'optical', 'metrology', 'ellipsometry', 'all'] = 'all', status: Literal['success', 'pending', 'error', 'running', 'stream_active', 'stream_interrupted', 'stream_finalizing', 'stream_error', 'all'] = 'all', growth_length: tuple[int | None, int | None] = (None, None), upload_datetime: tuple[datetime | None, datetime | None] = (None, None), last_accessed_datetime: tuple[datetime | None, datetime | None] = (None, None)) DataFrame[source]

Search and obtain data catalogue entries

Parameters:
  • keywords (str | list[str] | None) – Keyword or list of keywords to search all data catalogue fields with. This searching is applied after all other explicit filters. Defaults to None.

  • include_organization_data (bool) – Whether to include catalogue entries from other users in your organization. Defaults to True.

  • data_ids (str | list[str] | None) – Data ID or list of data IDs. Defaults to None.

  • physical_sample_ids (str | list[str] | None) – Physical sample ID or list of IDs. Defaults to None.

  • project_ids (str | list[str] | None) – Project ID or list of IDs. Defaults to None.

  • data_type (Literal["rheed_image", "rheed_stationary", "rheed_rotating", "xps", "xrd", "photoluminescence", "pl", "raman", "recipe", "optical", "metrology", "ellipsometry", "all"]) – Type of data. Defaults to “all”.

  • status (Literal["success", "pending", "error", "running", "stream_active", "stream_interrupted", "stream_finalizing", "stream_error", "all"]) – Analyzed status of the data. Defaults to “all”.

  • growth_length (tuple[int | None, int | None]) – Minimum and maximum growth length in seconds. Defaults to (None, None), which applies no growth-length filter and therefore also includes non-video data.

  • upload_datetime (tuple[datetime | None, datetime | None]) – Minimum and maximum values of the upload datetime. Defaults to (None, None).

  • last_accessed_datetime (tuple[datetime | None, datetime | None]) – Minimum and maximum values of the last accessed datetime. Defaults to (None, None).

Returns:

Pandas DataFrame containing matched entries in the data catalogue.

Return type:

DataFrame
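Example

A sketch of combining filters, here recent successful XRD uploads owned by the current user. The helper builds the open-ended upload_datetime window; everything else maps directly onto the parameters above:

```python
from datetime import datetime, timedelta

def recent_window(days: int = 7):
    """Upload-datetime filter covering the last `days` days, open-ended above."""
    return (datetime.now() - timedelta(days=days), None)

def search_recent_xrd(client, days: int = 7):
    # Keyword search (not used here) would be applied after these explicit filters.
    return client.search(
        data_type="xrd",
        status="success",
        upload_datetime=recent_window(days),
        include_organization_data=False,  # only this user's uploads
    )
```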

get(data_ids: str | list[str]) list[RHEEDVideoResult | RHEEDImageResult | XPSResult | XRDResult | PhotoluminescenceResult | RamanResult | OpticalResult | MetrologyResult | EllipsometryResult | UnknownResult][source]

Get analyzed data results

Parameters:

data_ids (str | list[str]) – Data ID or list of data IDs from the data catalogue to obtain analyzed results for.

Returns:

List of result objects

Return type:

list[RHEEDVideoResult | RHEEDImageResult | XPSResult | XRDResult | PhotoluminescenceResult | RamanResult | OpticalResult | MetrologyResult | EllipsometryResult | UnknownResult]

get_changepoints(data_ids: str | list[str], latest_only: bool = True, detection_method: Literal['forecasting', 'clustering', 'intensity_profile'] | None = 'intensity_profile', severity: Literal['info', 'warning', 'critical'] | None = 'critical', as_dataframe: bool = True) DataFrame | list[ChangepointResult][source]

Get changepoint detection records for one or more data IDs.

Parameters:
  • data_ids (str | list[str]) – Data ID or list of data IDs from the data catalogue.

  • latest_only (bool) – If True (default), only return changepoints from the most recently completed detection run for each (data_id, detection_method) pair. If False, return all changepoints from every historical run.

  • detection_method (str | None) – Filter to a single detection method. One of “forecasting”, “clustering”, “intensity_profile”. Defaults to “intensity_profile”. Pass None to include all detection methods.

  • severity (str | None) – Filter to a single severity level. One of “info”, “warning”, “critical”. Defaults to “critical”. Pass None to include all severities.

  • as_dataframe (bool) – If True (default) return a pandas DataFrame. If False return a list of ChangepointResult objects.

Returns:

Changepoint records matching the filters.

Return type:

DataFrame | list[ChangepointResult]
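Example

Passing None for detection_method and severity disables those filters, and latest_only=False pulls in historical runs; the sketch below requests the fully unfiltered view:

```python
def all_changepoints(client, data_ids):
    """Fetch every changepoint from every run, across methods and severities.

    None disables the detection_method and severity filters, and
    latest_only=False includes all historical detection runs.
    """
    return client.get_changepoints(
        data_ids,
        latest_only=False,
        detection_method=None,
        severity=None,
        as_dataframe=True,
    )
```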

get_similarity_trajectory(source_id: str, *, workflow: str = 'rheed_stationary', last_n: int | None = None, window_span: float | None = None, reference_ids: list[str] | None = None, softmax_mode: str | None = None, reference_n_values: int | None = None) SimilarityTrajectoryResult[source]

Fetch a one-shot similarity trajectory for a source data_id or physical_sample_id.

Parameters:
  • source_id (str) – Data ID or physical sample ID the trajectory is computed against.

  • workflow (str) – Similarity workflow name (e.g. “rheed_stationary”). Defaults to “rheed_stationary”.

  • last_n (int | None) – If set, only fetch the last N points of the trajectory.

  • window_span (float | None) – Optional window span parameter forwarded to the provider.

  • reference_ids (list[str] | None) – Optional list of reference data IDs to compare against.

  • softmax_mode (str | None) – Optional softmax mode forwarded to the provider.

  • reference_n_values (int | None) – Optional number of reference values forwarded to the provider.

Return type:

SimilarityTrajectoryResult

Returns:

SimilarityTrajectoryResult with the populated timeseries DataFrame.

iter_poll_similarity_trajectory(source_id: str, *, interval: float = 1.0, last_n: int | None = None, **kwargs: Any) Iterator[DataFrame][source]

Synchronously poll similarity trajectory data, yielding DataFrames.

Thin wrapper around atomscale.similarity.iter_poll_trajectory(). See that function for the full set of keyword arguments (distinct_by, until, max_polls, fire_immediately, jitter, on_error).

Parameters:
  • source_id (str)

  • interval (float)

  • last_n (int | None)

  • kwargs (Any)

Return type:

Iterator[DataFrame]
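Example

The iterator can be consumed with an ordinary for loop; extra keyword arguments such as max_polls are forwarded to atomscale.similarity.iter_poll_trajectory() as described above. A bounded-polling sketch:

```python
def watch_trajectory(client, source_id, polls: int = 10):
    """Poll a similarity trajectory a bounded number of times (sketch)."""
    latest = None
    for df in client.iter_poll_similarity_trajectory(
        source_id,
        interval=2.0,
        last_n=100,        # only the newest 100 points per poll
        max_polls=polls,   # forwarded to iter_poll_trajectory()
    ):
        latest = df        # each yield is a fresh DataFrame snapshot
    return latest
```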

aiter_poll_similarity_trajectory(source_id: str, *, interval: float = 1.0, last_n: int | None = None, **kwargs: Any) AsyncIterator[DataFrame][source]

Asynchronously poll similarity trajectory data without blocking the loop.

Thin wrapper around atomscale.similarity.aiter_poll_trajectory().

Parameters:
  • source_id (str)

  • interval (float)

  • last_n (int | None)

  • kwargs (Any)

Return type:

AsyncIterator[DataFrame]

start_polling_similarity_trajectory_thread(source_id: str, *, interval: float = 1.0, last_n: int | None = None, on_result: Callable[[DataFrame], None], **kwargs: Any) Event[source]

Start polling similarity trajectory data in a background daemon thread.

Returns a threading.Event that can be set to stop polling.

Parameters:
  • source_id (str)

  • interval (float)

  • last_n (int | None)

  • on_result (Callable[[DataFrame], None])

  • kwargs (Any)

Return type:

Event
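Example

A sketch of collecting snapshots in the background and then stopping the poller via the returned threading.Event:

```python
import time

def poll_in_background(client, source_id, seconds: float = 30.0):
    """Poll in a daemon thread for a while, then stop via the Event (sketch)."""
    snapshots = []

    stop = client.start_polling_similarity_trajectory_thread(
        source_id,
        interval=1.0,
        on_result=snapshots.append,  # called with each DataFrame snapshot
    )
    time.sleep(seconds)
    stop.set()  # signal the daemon thread to stop polling
    return snapshots
```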

start_polling_similarity_trajectory_task(source_id: str, *, interval: float = 1.0, last_n: int | None = None, on_result: Callable[[DataFrame], Any] | None = None, **kwargs: Any) Task[None][source]

Start polling similarity trajectory data as an asyncio.Task.

Parameters:
  • source_id (str)

  • interval (float)

  • last_n (int | None)

  • on_result (Callable[[DataFrame], Any] | None)

  • kwargs (Any)

Return type:

Task[None]

list_physical_samples() DataFrame[source]

List physical samples available to the user.

Return type:

DataFrame

list_projects() DataFrame[source]

List projects available to the user.

Return type:

DataFrame

get_physical_sample(physical_sample_id: str, *, include_organization_data: bool = True, align: bool | str = False) PhysicalSampleResult[source]

Get all data for a physical sample.

Parameters:
  • physical_sample_id (str) – Identifier of the physical sample.

  • include_organization_data (bool) – Whether to include organization data. Defaults to True.

  • align (bool | str) – Whether to align timeseries data. If truthy, an aligned DataFrame is returned.

Return type:

PhysicalSampleResult

get_project(project_id: str, *, include_organization_data: bool = True, align: bool | str = False) ProjectResult[source]

Get all data grouped by physical sample for a project.

Parameters:
  • project_id (str) – Identifier of the project.

  • include_organization_data (bool) – Whether to include organization data. Defaults to True.

  • align (bool | str) – Whether to align timeseries at the project level. Defaults to False.

Return type:

ProjectResult

upload(files: list[str | BinaryIO], physical_sample: str | None = None, project: str | None = None) list[str][source]

Upload and process files.

Parameters:
  • files (list[str | BinaryIO]) – List of string paths to files, or BinaryIO objects returned by open().

  • physical_sample (str | None) – Physical sample name or UUID to link uploads to. If a name is given and no matching sample exists, one is created automatically.

  • project (str | None) – Project name or UUID to associate the uploads with. The project must already exist (the SDK does not auto-create projects). When provided, physical_sample is required so the sample can be added to the project’s tracking list via POST /projects/{id}/configuration/tracking_samples.

Returns:

Data IDs assigned to the uploaded files.

Return type:

list[str]
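Example

Because project requires physical_sample (see above), a small wrapper can enforce that invariant before calling the API. This is a sketch, not part of the SDK:

```python
def upload_growth_run(client, paths, sample, project=None):
    """Upload files against a sample and, optionally, an existing project.

    The docs above require physical_sample whenever project is given, so
    this sketch enforces that before calling the API.
    """
    if project is not None and sample is None:
        raise ValueError("project requires a physical_sample")
    return client.upload(list(paths), physical_sample=sample, project=project)
```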

download(data_ids: str | list[str], dest_dir: str | Path | None = None, data_type: Literal['raw', 'processed'] = 'processed')[source]

Download raw or processed files for any data type to disk.

Works for every data_type the platform stores (RHEED video, XPS, XRD, PL, Raman, optical, metrology, ellipsometry, etc.); the underlying data_entries/{raw_data|processed_data}/{data_id} endpoint is data-type-agnostic and returns whatever file format the backend has on record.

Parameters:
  • data_ids (str | list[str]) – One or more data IDs from the data catalogue.

  • dest_dir (str | Path | None) – Directory to write the files to. Defaults to the current working directory.

  • data_type (Literal["raw", "processed"]) – Whether to download raw or processed data.
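Example

A sketch that fetches both the raw and the processed file for one data ID into sibling folders (the folder layout is this example's choice, not the SDK's):

```python
from pathlib import Path

def download_both(client, data_id, root="exports"):
    """Download raw and processed files for one data ID into sibling folders."""
    for kind in ("raw", "processed"):
        dest = Path(root) / kind
        dest.mkdir(parents=True, exist_ok=True)
        client.download(data_id, dest_dir=dest, data_type=kind)
```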

download_videos(data_ids: str | list[str], dest_dir: str | Path | None = None, data_type: Literal['raw', 'processed'] = 'processed')[source]

Deprecated alias for download(). Kept for backwards compatibility.

Parameters:
  • data_ids (str | list[str])

  • dest_dir (str | Path | None)

  • data_type (Literal['raw', 'processed'])

list_growth_instruments() list[dict[str, Any]][source]

List all growth instruments accessible by the user.

Returns instruments within the user’s organization.

Returns:

List of instruments with keys including:
  • synth_source_id (int): Unique instrument ID

  • source_name (str): Display name

  • synth_source_type (str): Instrument type (mbe, cvd, etc.)

  • source_manufacturer (str | None): Manufacturer name

  • source_model (str | None): Model name

Return type:

list[dict]

Example

>>> instruments = client.list_growth_instruments()
>>> for inst in instruments:
...     print(f"{inst['synth_source_id']}: {inst['source_name']}")
create_growth_instrument(label: str, name: str, instrument_type: Literal['mbe', 'cvd', 'pvd', 'sputter', 'ald', 'pld'], serial_id: str | None = None) int[source]

Create a new growth instrument.

Parameters:
  • label (str) – Display name for the instrument (e.g., “Main MBE”).

  • name (str) – Manufacturer and model (e.g., “Veeco GEN10”).

  • instrument_type (Literal['mbe', 'cvd', 'pvd', 'sputter', 'ald', 'pld']) – Type of instrument.

  • serial_id (str | None) – Optional serial number or identifier.

Returns:

The synth_source_id of the created instrument.

Return type:

int

Example

>>> instrument_id = client.create_growth_instrument(
...     label="Main MBE",
...     name="Veeco GEN10",
...     instrument_type="mbe",
...     serial_id="SN-12345",
... )
delete_growth_instrument(synth_source_id: int) None[source]

Delete a growth instrument.

Parameters:

synth_source_id (int) – ID of the instrument to delete.

Raises:

ClientError – If the instrument is not found or not accessible.

Return type:

None

Example

>>> client.delete_growth_instrument(synth_source_id=42)
property session

Session under which HTTP requests are issued