# Data Flow

This document describes the primary data flows through XPCS Viewer, from HDF5 file ingestion through computation to GUI visualization.

## XPCS Analysis Pipeline

The main data flow for G2 correlation analysis:

```{mermaid}
flowchart TD
    HDF5["HDF5 File<br/>/xpcs/ hierarchy"]
    CP["HDF5ConnectionPool<br/>fileIO/hdf_reader.py"]
    XF["XpcsFile<br/>In-memory data model"]
    DC["DataCache<br/>LRU cache for arrays"]
    VK["ViewerKernel<br/>Analysis coordinator"]

    subgraph Backend["Backend Computation"]
        BE["get_backend()<br/>JAX or NumPy"]
        COMP["Array Operations<br/>backend.exp(), backend.mean(), ..."]
        JIT["JIT Cache<br/>(JAX only)"]
    end

    subgraph Fitting["Fitting Pipeline"]
        NLSQ["NLSQ 0.6.0<br/>Point estimates"]
        NUTS["NumPyro NUTS<br/>Posterior samples"]
        FRES["FitResult<br/>ArviZ diagnostics"]
    end

    subgraph Output["Output Boundaries"]
        PQG["PyQtGraph<br/>Interactive plots"]
        MPL["Matplotlib<br/>Publication plots"]
        H5W["HDF5 Write<br/>Fit results"]
    end

    EN["ensure_numpy()<br/>I/O boundary conversion"]

    HDF5 -->|"pool.get_connection()"| CP
    CP -->|"np.ndarray (float64)"| XF
    XF -->|"cached reads"| DC
    XF -->|"g2_data, qmap"| VK
    VK -->|"lazy module load"| COMP
    VK -->|"fit request"| NLSQ
    BE -->|"backend arrays"| COMP
    COMP -->|"JIT-compiled"| JIT
    NLSQ -->|"warm-start init"| NUTS
    NUTS -->|"posterior samples"| FRES
    COMP -->|"analysis results"| EN
    FRES -->|"plot data"| EN
    EN -->|"np.ndarray"| PQG
    EN -->|"np.ndarray"| MPL
    EN -->|"np.ndarray"| H5W

    style EN fill:#f96,stroke:#333,color:#000
    style BE fill:#69f,stroke:#333,color:#000
    style NLSQ fill:#9c6,stroke:#333,color:#000
    style NUTS fill:#9c6,stroke:#333,color:#000
```

### Array Type Transitions

| Stage | Array Type | Notes |
|-------|-----------|-------|
| HDF5 read | `np.ndarray` | Always NumPy from h5py |
| XpcsFile storage | `np.ndarray` | Cached as NumPy |
| Backend computation | JAX array or `np.ndarray` | Depends on active backend |
| JIT cache | JAX array | Compiled traces cached |
| NLSQ fitting | `np.ndarray` | NLSQ operates on NumPy |
| NumPyro NUTS | JAX array | Requires JAX backend |
| I/O boundary | `np.ndarray` | `ensure_numpy()` conversion |
| PyQtGraph/Matplotlib | `np.ndarray` | Display libraries require NumPy |

## SimpleMask Data Flow

Mask editing and Q-map computation flow:

```{mermaid}
flowchart TD
    IMG["Detector Image<br/>np.ndarray (H, W)"]
    SMW["SimpleMaskWindow<br/>User draws ROIs"]
    SMK["SimpleMaskKernel<br/>Computation core"]

    subgraph MaskOps["Mask Operations"]
        AM["AreaMask<br/>Undo/redo history"]
        DT["DrawingTools<br/>Rect, Circle, Polygon, ..."]
        MASK["Final Mask<br/>int32, values 0/1"]
    end

    subgraph QMapComp["Q-Map Computation"]
        GEO["GeometryMetadata<br/>bcx, bcy, det_dist, ..."]
        QM["qmap.compute_qmap()<br/>JIT-compiled (JAX)"]
        QRES["QMapSchema<br/>sqmap, dqmap, phis"]
    end

    subgraph Partition["Partition Generation"]
        PG["utils.generate_partition()<br/>JIT-compiled (JAX)"]
        PS["PartitionSchema<br/>partition_map, val_list, num_list"]
    end

    subgraph Export["Signal Export"]
        SIG1["mask_exported<br/>Signal(np.ndarray)"]
        SIG2["qmap_exported<br/>Signal(dict)"]
    end

    XV["XPCS Viewer<br/>apply_mask(), apply_qmap_result()"]
    H5["HDF5 File<br/>Mask/Partition persistence"]

    IMG --> SMW
    SMW --> SMK
    SMK --> AM
    DT --> AM
    AM --> MASK
    SMK --> QM
    GEO --> QM
    QM --> QRES
    QRES --> PG
    MASK --> PG
    PG --> PS
    MASK --> SIG1
    PS --> SIG2
    SIG1 -.->|"loose coupling"| XV
    SIG2 -.->|"loose coupling"| XV
    SMK --> H5

    style SIG1 fill:#f96,stroke:#333,color:#000
    style SIG2 fill:#f96,stroke:#333,color:#000
```

### Signal Payload Schemas

**`mask_exported(np.ndarray)`**:

- Shape: `(H, W)`, dtype: `int32`
- Values: 0 (masked) or 1 (valid)

**`qmap_exported(dict)`**:

```python
{
    "partition_map": np.ndarray,  # int32, (H, W), Q-bin indices
    "num_pts": int,               # Number of Q-bins
    "val_list": list[float],      # Q-bin center values
    "num_list": list[int],        # Pixels per Q-bin
}
```

## Fitting Data Flow

The two-stage Bayesian fitting pipeline:

```{mermaid}
flowchart TD
    G2["G2 Data<br/>delay_times, g2, g2_err"]

    subgraph Stage1["Stage 1: NLSQ Warm-Start"]
        NF["nlsq.nlsq_optimize()"]
        NR["NLSQResult<br/>Point estimates + CurveFitResult"]
        P0["Initial params<br/>tau, baseline, contrast"]
    end

    subgraph Stage2["Stage 2: Bayesian Inference"]
        MC["NumPyro Model<br/>single_exp_model()"]
        NU["NUTS Sampler<br/>4 chains x 1000 samples"]
        PS["Posterior Samples<br/>dict[str, ndarray]"]
    end

    subgraph Diagnostics["Convergence Diagnostics"]
        RH["R-hat < 1.01"]
        ESS["ESS bulk/tail > 400"]
        DIV["Divergences == 0"]
        BF["BFMI >= 0.2"]
        FD["FitDiagnostics"]
    end

    subgraph Results["Result Objects"]
        FR["FitResult<br/>samples + diagnostics"]
        AZ["xarray DataTree<br/>Posterior plots"]
    end

    G2 --> NF
    NF --> NR
    NR -->|"warm-start"| P0
    P0 --> MC
    MC --> NU
    NU --> PS
    PS --> RH
    PS --> ESS
    PS --> DIV
    PS --> BF
    RH --> FD
    ESS --> FD
    DIV --> FD
    BF --> FD
    PS --> FR
    FD --> FR
    FR --> AZ

    style NR fill:#9c6,stroke:#333,color:#000
    style FR fill:#69f,stroke:#333,color:#000
    style FD fill:#f96,stroke:#333,color:#000
```

### Convergence Thresholds

| Diagnostic | Threshold | Meaning |
|-----------|-----------|---------|
| R-hat | < 1.01 | All chains converge to same distribution |
| ESS bulk | > 400 | Sufficient effective samples for mean/variance |
| ESS tail | > 400 | Sufficient effective samples for quantiles |
| Divergences | == 0 | No divergent transitions |
| BFMI | >= 0.2 | Adequate energy exploration |

## HDF5 File Schema

The standard HDF5 file structure for XPCS data:

```{mermaid}
graph LR
    subgraph HDF5["HDF5 File Structure"]
        ROOT["/"]
        XPCS["/xpcs/"]
        QMAP["/xpcs/qmap/"]
        G2["/xpcs/g2/"]
        META["/xpcs/metadata/"]
        SM["/simplemask/"]
        SMMASK["/simplemask/mask/"]
        SMPART["/simplemask/partition/"]
    end

    ROOT --> XPCS
    ROOT --> SM
    XPCS --> QMAP
    XPCS --> G2
    XPCS --> META
    SM --> SMMASK
    SM --> SMPART

    subgraph QMapDS["Q-Map Datasets"]
        SQ["sqmap<br/>float64, (H,W)"]
        DQ["dqmap<br/>float64, (H,W)"]
        PH["phis<br/>float64, (H,W)"]
        MK["mask<br/>int32, (H,W)"]
        PM["partition_map<br/>int32, (H,W)"]
    end

    subgraph G2DS["G2 Datasets"]
        G2V["g2<br/>float64, (n_delay, n_q)"]
        G2E["g2_err<br/>float64, (n_delay, n_q)"]
        DT["delay_times<br/>float64, (n_delay,)"]
        QV["q_values<br/>float64, (n_q,)"]
    end

    subgraph MetaDS["Metadata Attributes"]
        BCX["bcx: float"]
        BCY["bcy: float"]
        DD["det_dist: float (mm)"]
        LAM["lambda_: float (A)"]
        PD["pix_dim: float (mm)"]
        SHP["shape: (H, W)"]
    end

    QMAP --> QMapDS
    G2 --> G2DS
    META --> MetaDS
```

## I/O Boundary Summary

All I/O boundaries where `ensure_numpy()` conversion occurs:

| Boundary | Direction | Source Module | Target |
|----------|-----------|--------------|--------|
| PyQtGraph plotting | Backend -> NumPy | `module/saxs1d`, `module/saxs2d`, `module/g2mod`, `module/twotime` | `plot.setData()` |
| Matplotlib plotting | Backend -> NumPy | `fitting/visualization` | `plt.plot()` |
| HDF5 write | Backend -> NumPy | `simplemask/area_mask`, `simplemask/simplemask_kernel` | `h5py.create_dataset()` |
| HDF5 read | NumPy -> Backend | `xpcs_file`, `fileIO/hdf_reader` | `backend.from_numpy()` |
| Signal export | Backend -> NumPy | `simplemask/qmap`, `simplemask/utils` | `Signal.emit()` |
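
The Backend -> NumPy rows above all rely on one conversion step before data leaves the computation layer. This document does not show how `ensure_numpy()` is implemented, so the following is only a minimal sketch of what such a boundary helper might look like. The names `ensure_numpy_sketch` and `FakeBackendArray` are hypothetical and exist only for illustration; the sketch assumes backend arrays expose the standard `__array__` protocol (as JAX arrays do), which lets `np.asarray()` convert them without importing JAX here.

```python
import numpy as np


def ensure_numpy_sketch(arr):
    """Hypothetical stand-in for ensure_numpy(): return a plain np.ndarray."""
    if isinstance(arr, np.ndarray):
        return arr  # already NumPy, returned as-is without copying
    # np.asarray() converts anything implementing __array__ (e.g. JAX arrays),
    # as well as nested lists and scalars.
    return np.asarray(arr)


class FakeBackendArray:
    """Hypothetical backend array used only to exercise the sketch."""

    def __init__(self, data):
        self._data = data

    def __array__(self, dtype=None, copy=None):
        # Called by np.asarray(); builds the NumPy view of the data.
        return np.array(self._data, dtype=dtype)


# Usage: convert a backend result just before handing it to a plot or HDF5 writer.
g2_backend = FakeBackendArray([[1.30, 1.25], [1.10, 1.08]])
g2_plot = ensure_numpy_sketch(g2_backend)
```

After the call, `g2_plot` is an ordinary `np.ndarray` of shape `(2, 2)` that display libraries and h5py can consume directly. Passing an existing `np.ndarray` returns the same object, so calling the helper repeatedly at a boundary costs nothing extra in this sketch.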