atomscale.streaming#

High-performance streaming for RHEED and instrument data.

class atomscale.streaming.RHEEDStreamer(api_key: str, endpoint: str | None = None)

Bases: object

A thin, high-performance Python interface (via PyO3) for real-time RHEED frame streaming into the Atomscale platform. This class takes chunks of 8-bit frames (NumPy arrays) and uploads them for analysis while they are still being captured programmatically from a camera.

Typical usage:

  1. Instantiate the streamer.

  2. initialize(…) to create the remote data item and receive data_id.

  3. Either run(data_id, frames_iter) to stream by yielding frame chunks from a generator/iterator, or push(data_id, chunk_idx, frames) repeatedly to send chunks from your own loop.

  4. finalize(data_id) to mark the stream complete on the server.
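The sequence above can be sketched as follows. The frame generator stands in for a camera acquisition loop; the streamer calls are shown commented out because they require a valid API key and network access, and the key and argument values here are placeholders:

```python
import numpy as np

# from atomscale.streaming import RHEEDStreamer  # requires the atomscale package

def frame_chunks(n_chunks=3, chunk_size=8, h=64, w=64):
    """Yield (N, H, W) uint8 chunks, as a camera acquisition loop might produce."""
    rng = np.random.default_rng(0)
    for _ in range(n_chunks):
        yield rng.integers(0, 256, size=(chunk_size, h, w), dtype=np.uint8)

# The streaming sequence itself (placeholder key; requires network access):
# streamer = RHEEDStreamer(api_key="YOUR_API_KEY")
# data_id = streamer.initialize(fps=30.0, rotations_per_min=10.0, chunk_size=8)
# streamer.run(data_id, frame_chunks())  # blocks until all chunk uploads finish
# streamer.finalize(data_id)             # marks the stream complete

chunks = list(frame_chunks())
```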

Notes

  • Frame dtype is coerced to uint8. Shapes (H, W) or (N, H, W) are accepted; (N,H,W) is preferred for chunks.

  • Packaging happens concurrently for throughput; network PUTs are async.

  • This class is safe to call from Python; heavy work is offloaded to multithreaded async workers.

  • See also: finalize(…).

Parameters:
  • api_key (str) – Your Atomscale API key.

  • endpoint (Optional[str]) – Base API URL. Defaults to “https://api.atomscale.ai”.

Raises:

RuntimeError – If the HTTP client or async runtime cannot be constructed.

finalize(self, data_id: str) None

Explicitly closes the remote stream for the given data_id. This signals to the server that no further chunks will be uploaded and allows any downstream jobs (e.g., indexing, aggregation, or post-processing) to begin.

Typical use:

  • After run(…) returns (it waits for all chunk tasks), call finalize(data_id).

  • In push(…) mode, call finalize(data_id) only after you have pushed your last chunk and ensured any in-flight uploads have finished (since push(…) detaches tasks).

Notes

  • The operation performs a single HTTP POST to the …/end endpoint.

  • It is safe to call more than once; the server may treat it as idempotent, but repeated calls are unnecessary.

Parameters:

data_id (str) – The stream identifier returned by initialize(…).

Returns:

None

Raises:

RuntimeError – If the finalization POST fails.

See also

run(data_id, frames_iter), push(data_id, chunk_idx, frames)

initialize(self, fps: float, rotations_per_min: float, chunk_size: int, stream_name: str | None = None, physical_sample: str | None = None, project_id: str | None = None, tags: list[str] | None = None) str

Creates a new remote data item for this stream and returns its data_id. Also captures runtime configuration used for subsequent chunk uploads.

The rotational period (frames per rotation) is computed as: fpr = (fps * 60.0) / rotations_per_min. If rotations_per_min <= 0.0, the stream is treated as stationary (no rotation).

After streaming via run(…) or push(…), call finalize(data_id) to mark the stream as complete.

Parameters:
  • fps (float) – Capture rate in frames per second.

  • rotations_per_min (float) – Wafer/crystal rotations per minute; use 0.0 for stationary operation.

  • chunk_size (int) – The intended number of frames per chunk you will send with run(…) or push(…).

  • stream_name (Optional[str]) – Human-readable name shown in the platform. If None or an empty string, a default like “RHEED Stream @ 1:23PM” is used.

  • physical_sample (Optional[str]) – Name or UUID of a physical sample to associate with the data item. If a UUID is provided, it must match an existing sample. If a name is provided, it is matched case-insensitively against existing samples, or a new sample is created if no match is found.

  • project_id (Optional[str]) – UUID of a project to associate with the stream. When provided along with physical_sample, the project’s tracking_physical_sample_id configuration is automatically updated to link the physical sample to the project for growth monitoring.

  • tags (Optional[List[str]]) – List of tag names or UUIDs to attach to the data item. Names are matched case-insensitively against existing org tags; unknown names are created. UUIDs must reference existing tags.

Returns:

The created data_id for this stream.

Return type:

str

Raises:

RuntimeError – If the initialization POST fails.

push(self, data_id: str, chunk_idx: int, frames: numpy.ndarray, capture_start_ms_utc: int | None = None) None

Callback mode. Push a single chunk of frames that you produced externally (e.g., from a camera callback). Call repeatedly for subsequent chunks.

The frames argument may be (N,H,W) or (H,W); dtype is coerced to uint8.

After your last push(…), call `finalize(data_id)` to mark the stream as complete on the server.

Timestamps#

When capture_start_ms_utc is not provided, this method stamps each chunk with Utc::now() sampled the moment push() is entered, before any GIL-held packaging work. The chunk’s end_unix_ms_utc is then start + (n / fps) * 1000. Inter-chunk gaps therefore reflect real arrival jitter; intra-chunk span follows declared fps. Pass an explicit timestamp when you have a hardware clock (camera trigger, OS monotonic-to-utc conversion).
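A hypothetical helper mirroring the stamping rule above (the function is illustrative, not part of the API; the timestamp value is an arbitrary placeholder):

```python
import time

def chunk_time_bounds(n_frames: int, fps: float, capture_start_ms_utc=None):
    """Mirror the documented rule: start defaults to 'now' on entry;
    end_unix_ms_utc = start + (n / fps) * 1000."""
    start = capture_start_ms_utc if capture_start_ms_utc is not None else int(time.time() * 1000)
    end = start + int(n_frames / fps * 1000)
    return start, end

start, end = chunk_time_bounds(n_frames=60, fps=30.0, capture_start_ms_utc=1_700_000_000_000)
# 60 frames at 30 fps span 2000 ms, so end - start == 2000
```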

Parameters:
  • data_id (str) – The remote data identifier returned by initialize(…).

  • chunk_idx (int) – Zero-based index of this chunk (used in the Zarr shard path).

  • frames (numpy.ndarray) – (N,H,W) or (H,W) grayscale frames as uint8.

  • capture_start_ms_utc (Optional[int]) – Capture-start timestamp in milliseconds since UNIX epoch (UTC). Pass when you have a real capture-time clock (camera trigger, OS monotonic-to-UTC conversion). When None, sampled from Utc::now() on entry.

Returns:

None

Raises:

RuntimeError – If packaging or upload fails internally.

See also

finalize(data_id)

run(self, data_id: str, frames_iter: Iterable) None

Generator/iterator mode. Iterates frames_iter, where each yielded item is one of:

  • (N, H, W) numpy.ndarray[uint8]: a chunk of N grayscale frames

  • (H, W) numpy.ndarray[uint8]: a single frame (treated as N = 1)

  • (frames, capture_start_ms_utc) 2-tuple: frames as above + an explicit capture-start timestamp (int milliseconds since UNIX epoch, UTC). Use this when you have hardware/OS-level timestamps for each chunk (camera trigger time, OS monotonic-to-utc conversion, etc.).

For each yielded item, the method:

  1. Converts it to flat uint8 bytes,

  2. Packages the frames on a blocking worker thread,

  3. Uploads the shard,

  4. Continues to the next item while uploads proceed concurrently, for high throughput.

This method blocks until all spawned tasks complete. After run(…) returns, call finalize(data_id) to mark the stream complete on the server.

Timestamps#

When the caller does not supply a capture_start_ms_utc, this method stamps each chunk with Utc::now() sampled the moment the iterator yields, before any GIL-held packaging work. The chunk’s end_unix_ms_utc is then start + (n / fps) * 1000. Inter-chunk gaps therefore reflect real arrival jitter; intra-chunk span follows declared fps. Pass an explicit timestamp when you have a hardware clock (camera trigger, OS monotonic-to-utc conversion).
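A generator using the 2-tuple form is sketched below. The frames and clock offsets are synthetic stand-ins; `streamer` and `data_id` are placeholder names from a prior initialize(…):

```python
import time
import numpy as np

def timed_chunks(n_chunks=2, chunk_size=4, h=32, w=32, fps=30.0):
    """Yield (frames, capture_start_ms_utc) tuples, the 2-tuple form run(...) accepts."""
    t0 = int(time.time() * 1000)
    for i in range(n_chunks):
        frames = np.zeros((chunk_size, h, w), dtype=np.uint8)
        start_ms = t0 + int(i * chunk_size / fps * 1000)  # stand-in for a hardware clock
        yield frames, start_ms

# streamer.run(data_id, timed_chunks())  # placeholder names; see initialize(...)
items = list(timed_chunks())
```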

Parameters:
  • data_id (str) – The stream data ID returned by initialize(…).

  • frames_iter (Iterable) – Python iterable/generator of (N,H,W) / (H,W) uint8 arrays, or (frames, capture_start_ms_utc) tuples.

Returns:

None

Raises:

RuntimeError – If any packaging/join/upload step fails.

ValueError – If a yielded tuple’s second element is not an int.

See also

finalize(data_id)

class atomscale.streaming.TimeseriesStreamer(api_key: str, endpoint: str | None = None, points_per_chunk: int = 100)

Bases: object

A high-performance Python interface (via PyO3) for real-time instrument time series streaming into the Atomscale platform. This class takes chunks of scalar time series data and uploads them asynchronously for storage and analysis.

Typical usage:

  1. Instantiate the streamer with API key and optional endpoint

  2. initialize(…) to create the remote data item and get data_id

  3. push(…) repeatedly to send data chunks (fire-and-forget, non-blocking)
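The push-mode workflow can be sketched like this. The chunking helper is ours (not part of the API); the streamer calls are commented out because they need real credentials, and the key and stream name are placeholders:

```python
def batch_points(timestamps, values, points_per_chunk=100):
    """Split parallel timestamp/value lists into push()-sized chunks."""
    if len(timestamps) != len(values):
        raise ValueError("timestamps and values must have the same length")
    for i in range(0, len(timestamps), points_per_chunk):
        yield timestamps[i:i + points_per_chunk], values[i:i + points_per_chunk]

ts = [t * 0.1 for t in range(250)]
vs = [25.0 + 0.01 * t for t in range(250)]
chunks = list(batch_points(ts, vs))
# 250 points → chunks of 100, 100, and 50

# streamer = TimeseriesStreamer(api_key="YOUR_API_KEY")   # placeholder key
# data_id = streamer.initialize(stream_name="demo")
# for idx, (t, v) in enumerate(chunks):
#     streamer.push(data_id, idx, "temperature", t, v)    # fire-and-forget
# streamer.finalize(data_id)
```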

Notes

  • Each push() spawns an async upload task and returns immediately.

  • Chunks are ordered server-side by chunk_index, so out-of-order arrival is handled.

  • Multiple channels can be streamed for the same data_id.

Parameters:
  • api_key (str) – Your Atomscale API key.

  • endpoint (Optional[str]) – Base API URL. Defaults to “https://api.atomscale.ai”.

  • points_per_chunk (int) – Expected number of points per chunk. Defaults to 100.

Raises:

RuntimeError – If the HTTP client or async runtime cannot be constructed.

finalize(self, data_id: str) None

Finalize the stream, marking it as complete on the server.

Call this after all data has been pushed to signal that the stream is finished.

Parameters:

data_id (str) – The stream identifier returned by initialize().

Returns:

None

Raises:

RuntimeError – If the request fails.

initialize(self, stream_name: Optional[str] = None, synth_source_id: Optional[int] = None, ...) str

Initialize a new time series stream on the server.

Creates data_catalogue and processed_metrology_catalogue entries. Returns data_id to use for subsequent push() calls.

Parameters:
  • stream_name (Optional[str]) – Human-readable name for the stream.

  • synth_source_id (Optional[int]) – Growth instrument ID to link. Must belong to your organization. Use list_instruments() to see available instruments.

  • physical_sample (Optional[str]) – Name or UUID of a physical sample to associate with the data item. If a UUID is provided, it must match an existing sample. If a name is provided, it is matched case-insensitively against existing samples, or a new sample is created if no match is found.

  • project_id (Optional[str]) – UUID of a project to associate with the stream. When provided along with physical_sample, the project’s tracking_physical_sample_id configuration is automatically updated to link the physical sample to the project for growth monitoring.

  • tags (Optional[List[str]]) – List of tag names or UUIDs to attach to the data item. Names are matched case-insensitively against existing org tags; unknown names are created. UUIDs must reference existing tags.

Returns:

The data_id for this stream.

Return type:

str

Raises:

RuntimeError – If the initialization request fails or instrument not found.

push(self, data_id: str, chunk_index: int, channel_name: str, timestamps: list[float], values: list[float], ...) None

Push a single chunk of time series data for a channel. This method is fire-and-forget: it spawns an async upload task and returns immediately.

Parameters:
  • data_id (str) – The stream identifier returned by initialize().

  • chunk_index (int) – Zero-based index of this chunk for ordering.

  • channel_name (str) – Name of the data channel (e.g., “temperature”, “pressure”).

  • timestamps (list[float]) – Unix epoch timestamps in seconds.

  • values (list[float]) – Measured values corresponding to timestamps.

  • units (Optional[str]) – Units for the values.

Returns:

None

Raises:

RuntimeError – If timestamps/values have different lengths.

push_multi(self, data_id: str, chunk_index: int, channels: dict[str, dict]) None

Push data for multiple channels at once. Each channel’s data is uploaded as a separate async task.

Parameters:
  • data_id (str) – The stream identifier returned by initialize().

  • chunk_index (int) – Zero-based index of this chunk.

  • channels (dict[str, dict]) – Mapping of channel_name to channel data dict. Each channel dict should have: - “timestamps” (list[float]): Unix epoch timestamps in seconds - “values” (list[float]): Measured values - “units” (str, optional): Units for the values

Example

streamer.push_multi(data_id, 0, {
    "temperature": {"timestamps": [0.0, 0.1], "values": [25.0, 25.1], "units": "C"},
    "pressure": {"timestamps": [0.0, 0.1], "values": [1.0, 1.1]},
})

Returns:

None

Raises:

RuntimeError – If any channel has mismatched lengths.

run(self, data_id: str, channel_name: str, data_iter: Iterable[tuple[list[float], list[float]]], units: str | None = None) None

Iterator mode. Stream time series data by iterating over chunks. The chunk_index is automatically assigned based on iteration order.

Each yielded item should be a tuple of (timestamps, values) where both are lists of floats.

This method blocks until all upload tasks complete.
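A generator producing the (timestamps, values) tuples run(…) expects might look like this; the readings are synthetic, and `streamer`/`data_id` are placeholder names from a prior initialize():

```python
def window_stream(readings, window=100):
    """Yield (timestamps, values) tuples for run(); chunk_index follows iteration order."""
    for i in range(0, len(readings), window):
        batch = readings[i:i + window]
        yield [t for t, _ in batch], [v for _, v in batch]

readings = [(i * 0.5, 20.0 + i) for i in range(120)]
chunks = list(window_stream(readings))
# 120 readings → two chunks of 100 and 20 points

# streamer.run(data_id, "pressure", window_stream(readings), units="Torr")  # placeholders
```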

Parameters:
  • data_id (str) – The stream identifier returned by initialize().

  • channel_name (str) – Name of the data channel (e.g., “temperature”, “pressure”).

  • data_iter (Iterable[tuple[list[float], list[float]]]) – Iterator yielding (timestamps, values) tuples.

  • units (Optional[str]) – Units for the values.

Returns:

None

Raises:

RuntimeError – If any chunk has mismatched lengths.