XPCS Viewer Dependency Analysis and Integration Catalog

Analysis Date: 2026-01-06 (updated 2026-02-18)

Codebase Version: master branch

Analyzed Components: Core modules, backends, fitting, simplemask, file I/O

.. note::

   As of February 2026, JAX, NumPyro, NLSQ, and ArviZ are core dependencies (listed in pyproject.toml under dependencies). References to “optional” JAX installation in this document are historical.


Executive Summary

Dependency Health Overview

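=====================  =================  ============================================
Metric                 Status             Notes
=====================  =================  ============================================
Circular Dependencies  ✅ None detected   Clean dependency tree in key modules
High Fan-In Modules    ⚠️ 3 modules       Logging, backends, conversions
Tight Coupling         ⚠️ Moderate        XpcsFile tightly coupled to analysis modules
Backend Abstraction    ✅ Well-isolated   JAX/NumPy switching via clean interface
Integration Points     ⚠️ 12 identified   HDF5 I/O boundaries need standardization
=====================  =================  ============================================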

Key Findings

  1. No circular dependencies in core modules - clean separation of concerns

  2. Backend abstraction layer is well-designed - minimal coupling to JAX/NumPy

  3. High coupling at I/O boundaries - HDF5 read/write scattered across modules

  4. SimpleMask integration is loosely coupled - uses signals for data export

  5. Shared data schemas lack formal definition - implicit contracts, no validation


1. Internal Module Dependencies

1.1 Dependency Graph

xpcsviewer/
├── backends/               [FAN-IN: 8, FAN-OUT: 0]
│   ├── _base.py           - Protocol definition
│   ├── _jax_backend.py    - JAX implementation
│   ├── _numpy_backend.py  - NumPy fallback
│   ├── _conversions.py    [FAN-IN: 9] - I/O boundary conversion
│   └── _device.py         - GPU device management
│
├── fitting/               [FAN-IN: 0, FAN-OUT: 2]
│   ├── models.py          - NumPyro model definitions
│   ├── sampler.py         - NUTS sampler orchestration
│   ├── nlsq.py            - JAX-accelerated NLSQ warm-start
│   ├── results.py         - FitResult, NLSQResult classes
│   └── legacy.py          - scipy.optimize compatibility layer
│
├── xpcs_file.py           [FAN-IN: 2, FAN-OUT: 3]
│   └── xpcs_file/
│       ├── cache.py       - LRU data cache
│       ├── memory.py      - Memory monitoring
│       └── fitting.py     - Legacy fitting functions
│
├── viewer_kernel.py       [FAN-IN: 1, FAN-OUT: 0]
│   └── Coordinates module loading and analysis orchestration
│
├── module/                [Analysis modules]
│   ├── saxs1d.py          [FAN-OUT: 2] - 1D SAXS plotting
│   ├── saxs2d.py          [FAN-OUT: 2] - 2D SAXS plotting
│   ├── g2mod.py           [FAN-OUT: 2] - G2 correlation analysis
│   ├── twotime.py         [FAN-OUT: 4] - Two-time correlation
│   ├── tauq.py            [FAN-OUT: 3] - Tau-Q analysis
│   └── intt.py            - Intensity-time analysis
│
├── simplemask/            [Self-contained subsystem]
│   ├── simplemask_window.py  [FAN-OUT: 3] - QMainWindow + signals
│   ├── simplemask_kernel.py  [FAN-IN: 2, FAN-OUT: 4]
│   ├── area_mask.py          - Mask assembly with undo/redo
│   ├── qmap.py               [FAN-IN: 2] - Q-map computation
│   ├── utils.py              [FAN-IN: 2] - Partition generation
│   └── pyqtgraph_mod.py      [FAN-IN: 3] - Custom ROI classes
│
├── fileIO/
│   ├── hdf_reader.py      [FAN-IN: 2] - HDF5 connection pooling
│   ├── qmap_utils.py      - Q-map HDF5 I/O
│   └── aps_8idi.py        - Beamline-specific key mapping
│
└── utils/
    ├── logging_config.py  [FAN-IN: 24] ⚠️ High coupling
    └── validation.py      - Input validation utilities

1.2 High Fan-In Modules (Integration Hotspots)

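========================  ==========  ======  ============================================
Module                    Dependents  Risk    Recommendation
========================  ==========  ======  ============================================
utils.logging_config      24          Low     Stable utility, acceptable coupling
backends._conversions     9           Medium  Critical for I/O boundaries, monitor closely
backends                  8           Medium  Core abstraction, changes cascade widely
constants                 6           Low     Configuration values, stable
simplemask.pyqtgraph_mod  3           Low     UI components, isolated
========================  ==========  ======  ============================================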

Analysis: backends._conversions is the critical integration point between JAX and I/O systems. Any changes to ensure_numpy() affect 9 modules.

1.3 High Fan-Out Modules (Brittle Dependencies)

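============================  ============  ======  =======================================
Module                        Dependencies  Risk    Recommendation
============================  ============  ======  =======================================
module.twotime                4             High    Refactor to reduce backend coupling
simplemask.simplemask_kernel  4             Medium  Well-structured, acceptable
xpcs_file                     3             Medium  Consider breaking into smaller services
============================  ============  ======  =======================================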

Analysis: module.twotime depends on backends, conversions, scipy replacements, and logging. Consider facade pattern to reduce direct dependencies.
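The fan-in and fan-out figures quoted in Sections 1.1 to 1.3 can be recomputed from an import graph. The following is a minimal sketch, assuming the graph is available as a mapping from each module to the modules it imports. The edges in `IMPORTS` and the function name `fan_metrics` are illustrative assumptions, not the measured graph of the codebase.

```python
from collections import defaultdict

# Hypothetical import edges (importer -> imported modules); illustrative only.
IMPORTS = {
    "module.twotime": ["backends", "backends._conversions", "utils.logging_config"],
    "module.saxs1d": ["backends._conversions", "utils.logging_config"],
    "simplemask.qmap": ["backends", "utils.logging_config"],
    "backends": [],
}


def fan_metrics(imports):
    """Return {module: (fan_in, fan_out)} for every module in the graph."""
    fan_in = defaultdict(int)
    fan_out = {}
    for importer, imported in imports.items():
        targets = set(imported)           # ignore duplicate imports
        fan_out[importer] = len(targets)  # fan-out: modules this one depends on
        for target in targets:
            fan_in[target] += 1           # fan-in: modules depending on target
    modules = set(imports) | set(fan_in)
    return {m: (fan_in[m], fan_out.get(m, 0)) for m in sorted(modules)}


metrics = fan_metrics(IMPORTS)
for name, (n_in, n_out) in metrics.items():
    print(f"{name:28s} fan-in={n_in} fan-out={n_out}")
```

In practice the edge list would come from a static import scan rather than a hand-written dictionary. Thresholds for "high" fan-in or fan-out remain a judgment call.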


2. External Service Integrations

2.1 HDF5 File I/O

Integration Points (12 identified)

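===============================  =================  =================  =====================
Module                           Operation          Schema             Boundary Type
===============================  =================  =================  =====================
xpcs_file.py                     Read/Write         XPCS data files    Primary I/O
fileIO/hdf_reader.py             Read with pooling  Generic HDF5       Connection management
fileIO/qmap_utils.py             Read               /xpcs/qmap         Q-map loader
simplemask/simplemask_kernel.py  Write              Mask HDF5          Export boundary
simplemask/area_mask.py          Read/Write         Mask HDF5          Persistence
module/twotime_utils.py          Read               Two-time data      Analysis cache
utils/lazy_loader.py             Read               Deferred loading   Memory optimization
utils/validation.py              Read               Schema validation  Input validation
===============================  =================  =================  =====================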

HDF5 Schema Contracts (Implicit)

XPCS Data File Schema:

/xpcs/
  ├── qmap/
  │   ├── mask           # int32, shape=(H, W)
  │   ├── sqmap          # float64, shape=(H, W) - Static Q
  │   ├── dqmap          # float64, shape=(H, W) - Dynamic Q
  │   └── partition_map  # int32, shape=(H, W) - Q-bin indices
  ├── g2/                # G2 correlation results
  ├── saxs1d/            # 1D SAXS profiles
  └── metadata/          # Geometry, detector config

SimpleMask Partition Schema:

{
    "version": "1.0.0",
    "mask": np.ndarray,           # int32, shape=(H, W), values 0/1
    "partition_map": np.ndarray,  # int32, shape=(H, W), bin indices
    "num_pts": int,               # Number of Q-bins
    "val_list": list[float],      # Q-bin center values
    "num_list": list[int],        # Pixels per bin
    "metadata": dict              # Geometry parameters
}

Signal Export Schema (SimpleMask → XPCS Viewer):

# Signal: mask_exported(np.ndarray)
mask: np.ndarray[int32]  # Shape=(H, W), values 0/1

# Signal: qmap_exported(dict)
{
    "partition_map": np.ndarray[int32],  # Shape=(H, W)
    "num_pts": int,
    "val_list": list[float],
    "num_list": list[int]
}

⚠️ Integration Risks

  1. No schema validation - HDF5 files loaded without runtime checks

  2. Inconsistent error handling - Some modules raise, others return None

  3. Mixed use of h5py.File contexts - Connection pooling vs direct access

  4. No versioning - Schema changes could break backward compatibility
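Risks 1, 2, and 4 above can be addressed together by a single load-time check that always raises one exception type. The sketch below validates the SimpleMask partition dictionary described earlier, assuming the key set and the `length=num_pts` rule from Appendix A.3. `SchemaError`, `validate_partition`, and the rule that only major version 1 is readable are assumptions made for illustration.

```python
SUPPORTED_MAJOR = 1  # assumption: only schema major version 1 is readable


class SchemaError(ValueError):
    """Raised when a loaded structure violates the expected contract."""


def validate_partition(data):
    """Check keys, version, and list lengths of a partition dictionary."""
    required = {"version", "partition_map", "num_pts",
                "val_list", "num_list", "metadata"}
    missing = required - data.keys()
    if missing:
        raise SchemaError(f"missing keys: {sorted(missing)}")

    # Only the major component decides readability in this sketch.
    try:
        major = int(str(data["version"]).split(".")[0])
    except ValueError as exc:
        raise SchemaError(f"malformed version: {data['version']!r}") from exc
    if major != SUPPORTED_MAJOR:
        raise SchemaError(f"unsupported schema version {data['version']!r}")

    n = data["num_pts"]
    if len(data["val_list"]) != n or len(data["num_list"]) != n:
        raise SchemaError("val_list and num_list must each have num_pts entries")
    return data
```

Raising a single exception type, never returning None, gives callers one failure path to handle.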


3. Shared Data Schemas

3.1 Core Data Structures

Q-Map Dictionary (Cross-Module Contract)

# Produced by: simplemask.qmap.compute_qmap()
# Consumed by: xpcs_file.py, viewer_kernel.py, module/saxs*.py

QMapDict = {
    "sqmap": np.ndarray,      # float64, shape=(H, W) - Static Q
    "dqmap": np.ndarray,      # float64, shape=(H, W) - Dynamic Q
    "phis": np.ndarray,       # float64, shape=(H, W) - Azimuthal angle
    "sqmap_unit": str,        # "nm^-1" or "A^-1"
    "dqmap_unit": str,
    "phis_unit": str,         # "rad" or "deg"
}

Dependency Graph:

simplemask.qmap.compute_qmap()
    ↓
xpcs_file.get_cropped_qmap() → viewer_kernel.plot_qmap()
    ↓                              ↓
module.saxs1d.plot()          module.saxs2d.plot()

⚠️ Coupling Issue: No formal validation - consumers assume specific keys exist.

Geometry Metadata (Cross-Module Contract)

# Produced by: fileIO/hdf_reader.py (from HDF5 /xpcs/metadata)
# Consumed by: simplemask.qmap, xpcs_file.py

GeometryMetadata = {
    "bcx": float,          # Beam center X (column, pixels)
    "bcy": float,          # Beam center Y (row, pixels)
    "det_dist": float,     # Detector distance (mm)
    "lambda_": float,      # Wavelength (Å)
    "pix_dim": float,      # Pixel size (mm)
    "shape": tuple[int, int]  # (height, width)
}

Dependency Graph:

fileIO.hdf_reader → xpcs_file.metadata
                       ↓
            simplemask.qmap.compute_qmap()
                       ↓
            viewer_kernel.plot_qmap()

⚠️ Coupling Issue: Metadata passed as plain dict without type checking.
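One way to replace the plain dict with a checked structure is a frozen dataclass built from the dictionary. This is a minimal sketch that mirrors the field list above. The class name `Geometry`, the `from_dict` helper, and the positivity checks on `det_dist`, `lambda_`, and `pix_dim` are assumptions, not existing code.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Geometry:
    bcx: float       # beam center X (column, pixels)
    bcy: float       # beam center Y (row, pixels)
    det_dist: float  # detector distance (mm)
    lambda_: float   # wavelength (Å)
    pix_dim: float   # pixel size (mm)
    shape: tuple     # (height, width)

    def __post_init__(self):
        for name in ("bcx", "bcy", "det_dist", "lambda_", "pix_dim"):
            value = getattr(self, name)
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                raise TypeError(f"{name} must be a number, got {type(value).__name__}")
        # Assumed physical constraint: distances and wavelength are positive.
        for name in ("det_dist", "lambda_", "pix_dim"):
            if getattr(self, name) <= 0:
                raise ValueError(f"{name} must be positive")
        if len(self.shape) != 2 or not all(
            isinstance(n, int) and n > 0 for n in self.shape
        ):
            raise ValueError("shape must be (height, width) with positive integers")

    @classmethod
    def from_dict(cls, data):
        """Build from a plain dict, rejecting missing keys and ignoring extras."""
        names = [f.name for f in fields(cls)]
        missing = [n for n in names if n not in data]
        if missing:
            raise KeyError(f"missing geometry keys: {missing}")
        return cls(**{n: data[n] for n in names})
```

Constructing the object once at the producer, for example where metadata leaves the HDF5 reader, means consumers no longer need their own key checks.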

G2 Data Structure

# Produced by: xpcs_file.get_g2_data()
# Consumed by: module.g2mod, viewer_kernel.plot_g2()

G2Data = {
    "g2": np.ndarray,         # shape=(n_q, n_delay)
    "g2_err": np.ndarray,     # shape=(n_q, n_delay)
    "delay_times": np.ndarray,  # shape=(n_delay,)
    "q_values": list[float],  # Length n_q
}

Backend Array Protocol

# Produced by: backends.get_backend()
# Consumed by: All analysis modules using JAX/NumPy
from typing import Union

import jax
import numpy as np

BackendArray = Union[np.ndarray, jax.Array]

# I/O boundary conversion (signatures only):
def ensure_numpy(array: BackendArray) -> np.ndarray: ...
def ensure_backend_array(array: np.ndarray) -> BackendArray: ...

Critical I/O Boundaries:

  1. PyQtGraph plotting: Always requires NumPy (use ensure_numpy())

  2. HDF5 writing: Always requires NumPy (use ensure_numpy())

  3. Matplotlib: Always requires NumPy (use ensure_numpy())

  4. User input: Convert to backend array for computation
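The boundary rules above follow a convert-on-entry, convert-on-exit pattern. The sketch below uses simplified stand-ins for `ensure_numpy()` and `ensure_backend_array()`; the real implementations in `backends._conversions` may differ. JAX is imported optionally here only so the example also runs in a NumPy-only environment, and `column_sums` is a hypothetical computation.

```python
import numpy as np

try:  # optional in this sketch so it also runs without JAX installed
    import jax.numpy as jnp
except ImportError:
    jnp = None


def ensure_numpy(array):
    """Return a NumPy array; a JAX array is copied back to host memory."""
    return np.asarray(array)


def ensure_backend_array(array):
    """Return an array of the active backend (JAX if importable, else NumPy)."""
    return jnp.asarray(array) if jnp is not None else np.asarray(array)


def column_sums(image):
    """Example computation: convert on entry, compute, convert on exit."""
    data = ensure_backend_array(image)  # entry boundary (user input)
    result = data.sum(axis=0)           # backend computation
    return ensure_numpy(result)         # exit boundary (plotting, HDF5)


sums = column_sums(np.ones((2, 3)))
```

Whatever the backend, the caller only ever sees `np.ndarray`, which is what PyQtGraph, Matplotlib, and h5py expect.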


4. Cross-Module Data Flows

4.1 Primary Data Flow: XPCS Analysis Pipeline

[HDF5 File]
    ↓ (fileIO.hdf_reader)
[XpcsFile] ← Memory cache, fitting cache
    ↓ (viewer_kernel coordinates)
[Analysis Modules]
    ├→ module.saxs1d → [PyQtGraph plot]
    ├→ module.saxs2d → [PyQtGraph plot]
    ├→ module.g2mod → [fitting.fit_single_exp] → [FitResult]
    └→ module.twotime → [backends._conversions] → [Plot]
    ↓
[Export to HDF5 / GUI signals]

Backend Conversion Points:

  • Entry: xpcs_file.py reads NumPy from HDF5

  • Computation: Analysis modules use backends.get_backend() for array ops

  • Exit: ensure_numpy() before PyQtGraph/Matplotlib/HDF5

4.2 SimpleMask Data Flow

[User Detector Image]
    ↓
[SimpleMaskWindow]
    ↓
[SimpleMaskKernel]
    ├→ [MaskAssemble] - Mask editing with undo/redo
    ├→ [qmap.compute_qmap()] - JAX/NumPy backend
    └→ [utils.generate_partition()] - JAX JIT-compiled
    ↓
[Export Signals]
    ├→ mask_exported(np.ndarray) → [XPCS Viewer applies mask]
    └→ qmap_exported(dict) → [XPCS Viewer updates partition]

Backend Conversion Points:

  • Entry: SimpleMaskWindow receives NumPy from user

  • Computation: Q-map uses backends.get_backend() with JIT cache

  • Exit: ensure_numpy() before emitting signals

4.3 Fitting Data Flow

[module.g2mod requests fit]
    ↓
[fitting.fit_single_exp]
    ├→ [nlsq.nlsq_optimize] - JAX gradient descent warm-start
    │   └→ backends.value_and_grad()
    ↓
    └→ [sampler.run_single_exp_fit] - NumPyro NUTS
        └→ backends (JAX required)
    ↓
[FitResult] - ArviZ-compatible posterior samples
    ↓
[visualization.plot_posterior_predictive]
    └→ ensure_numpy() before Matplotlib

Backend Requirements:

  • NLSQ: Works with NumPy or JAX backend

  • NumPyro: Requires JAX backend (raises RuntimeError otherwise)

  • Visualization: Always converts to NumPy for Matplotlib
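The backend requirements above suggest a guard that either fails with a clear message or falls back to NLSQ when JAX is unavailable. The following is a sketch under that assumption. `jax_available` and `select_fit_method` are hypothetical helpers, not part of the fitting package.

```python
import importlib.util


def jax_available():
    """Return True if the jax package can be found in this environment."""
    return importlib.util.find_spec("jax") is not None


def select_fit_method(requested="nuts", allow_fallback=True):
    """Pick a fitting method, falling back from NUTS to NLSQ when JAX is absent."""
    if requested not in ("nuts", "nlsq"):
        raise ValueError(f"unknown fit method: {requested!r}")
    if requested == "nuts" and not jax_available():
        if not allow_fallback:
            raise RuntimeError(
                "NumPyro NUTS sampling requires the JAX backend; "
                "install jax or request method='nlsq'"
            )
        return "nlsq"  # point estimate only, no posterior samples
    return requested


method = select_fit_method("nuts")
```

A caller that needs posterior samples would pass `allow_fallback=False`, so that a missing backend surfaces as an error and does not silently change the result type.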


5. Integration Points Requiring Attention

5.1 Critical Facades Needed

1. HDF5 I/O Facade (Priority: HIGH)

Problem: HDF5 access is spread across 12 integration points (Section 2.1), mixing pooled and direct h5py.File usage, with inconsistent error handling.

Proposed Solution:

# xpcsviewer/io/hdf5_facade.py
class HDF5Facade:
    def __init__(self, connection_pool: HDF5ConnectionPool):
        self._pool = connection_pool

    def read_qmap(self, file_path: str) -> QMapDict:
        """Read Q-map with schema validation and connection pooling."""
        with self._pool.get_connection(file_path) as f:
            return self._validate_qmap_schema(f['/xpcs/qmap'])

    def write_partition(self, file_path: str, partition: PartitionDict) -> None:
        """Write partition with versioning and compression."""
        pass

Affected Modules:

  • xpcs_file.py - Migrate to facade for qmap access

  • simplemask/simplemask_kernel.py - Use facade for mask I/O

  • fileIO/qmap_utils.py - Replace with facade

  • module/twotime_utils.py - Use facade for cache reads

Migration Strategy:

  1. Create facade in parallel (no breaking changes)

  2. Add deprecation warnings to direct h5py.File usage

  3. Migrate module by module with tests

  4. Remove direct access after 2 releases
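Step 2 of the migration strategy can be implemented with a small decorator built on the standard `warnings` module. This is a minimal sketch. The decorator name `deprecated_direct_access` is an assumption, and the body of `legacy_qmap_loader` is a placeholder for the real loader.

```python
import functools
import warnings


def deprecated_direct_access(replacement):
    """Mark a legacy loader as deprecated in favor of `replacement`."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            warnings.warn(
                f"{func.__name__} is deprecated; use {replacement} instead",
                DeprecationWarning,
                stacklevel=2,  # attribute the warning to the caller
            )
            return func(*args, **kwargs)
        return wrapper
    return decorator


@deprecated_direct_access("HDF5Facade.read_qmap")
def legacy_qmap_loader(path):
    # Placeholder body; the real loader would open the file with h5py.
    return {"path": path}
```

Python hides most `DeprecationWarning`s by default outside `__main__` and test runners. Running the test suite with `-W error::DeprecationWarning` turns any remaining legacy call into a failure.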

2. Backend Array Adapter (Priority: MEDIUM)

Problem: I/O boundary conversions scattered across 9 modules.

Proposed Solution:

# xpcsviewer/backends/io_adapter.py
class BackendIOAdapter:
    """Automatic backend conversion at I/O boundaries."""

    @staticmethod
    def for_pyqtgraph(array: BackendArray) -> np.ndarray:
        """Convert backend array for PyQtGraph plotting."""
        return ensure_numpy(array)

    @staticmethod
    def for_hdf5(array: BackendArray) -> np.ndarray:
        """Convert backend array for HDF5 writing."""
        return ensure_numpy(array)

    @staticmethod
    def from_user_input(array: np.ndarray) -> BackendArray:
        """Convert user input to active backend."""
        return ensure_backend_array(array)

Affected Modules:

  • module/saxs1d.py - PyQtGraph boundary

  • module/saxs2d.py - PyQtGraph boundary

  • module/twotime.py - PyQtGraph + HDF5 boundaries

  • simplemask/qmap.py - Signal export boundary

  • simplemask/area_mask.py - HDF5 boundary

3. Data Schema Validators (Priority: HIGH)

Problem: No runtime validation of shared data structures.

Proposed Solution:

# xpcsviewer/schemas/validators.py
from dataclasses import dataclass
import numpy as np

@dataclass
class QMapSchema:
    sqmap: np.ndarray
    dqmap: np.ndarray
    phis: np.ndarray
    sqmap_unit: str
    dqmap_unit: str
    phis_unit: str

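    def __post_init__(self):
        # Raise explicitly: assert statements are skipped under `python -O`.
        if not (self.sqmap.shape == self.dqmap.shape == self.phis.shape):
            raise ValueError("sqmap, dqmap, and phis must share one shape")
        if self.sqmap_unit not in ("nm^-1", "A^-1"):
            raise ValueError(f"unsupported sqmap_unit: {self.sqmap_unit!r}")
        # ... more validation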

def validate_qmap(data: dict) -> QMapSchema:
    """Validate Q-map dictionary schema."""
    return QMapSchema(**data)

Affected Modules:

  • simplemask.qmap - Producer validation

  • xpcs_file.py - Consumer validation

  • viewer_kernel.py - Consumer validation

5.2 Circular Dependency Risks

Current Status: ✅ No circular dependencies detected

Potential Future Risks:

  1. fitting ← → backends:

    • fitting depends on backends for JAX access

    • Risk: If backends adds fitting utilities, creates cycle

    • Mitigation: Keep backends purely computational (no domain logic)

  2. xpcs_file ← → module/*:

    • xpcs_file loads data, modules analyze

    • Risk: Modules might cache results in XpcsFile

    • Mitigation: Use separate cache service (already done via xpcs_file.cache)

  3. simplemask ← → viewer_kernel:

    • Currently decoupled via signals

    • Risk: Direct method calls for convenience

    • Mitigation: Enforce signal-only communication in code reviews
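The "no circular dependencies" status can be guarded automatically by running a cycle check over the import graph, for example in CI. Below is a minimal depth-first-search sketch. The function name `find_cycle` and the two example graphs are illustrative assumptions.

```python
def find_cycle(graph):
    """Return one import cycle as a list of modules, or None if acyclic.

    `graph` maps each module to the modules it imports.
    """
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color = {node: WHITE for node in graph}
    stack = []

    def visit(node):
        color[node] = GRAY
        stack.append(node)
        for dep in graph.get(node, ()):
            state = color.get(dep, WHITE)
            if state == GRAY:  # back edge: dep is already on the current path
                return stack[stack.index(dep):] + [dep]
            if state == WHITE:
                cycle = visit(dep)
                if cycle:
                    return cycle
        stack.pop()
        color[node] = BLACK
        return None

    for node in list(graph):
        if color[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


# Today's layering (acyclic) versus the first risk listed above.
current = {"fitting": ["backends"], "backends": []}
risky = {"fitting": ["backends"], "backends": ["fitting"]}
print(find_cycle(current))  # None
print(find_cycle(risky))    # ['fitting', 'backends', 'fitting']
```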

5.3 Tight Coupling Analysis

High Coupling: XpcsFile ↔ Analysis Modules

Evidence:

  • xpcs_file.py has 39 public methods, many module-specific

  • fit_g2(), fit_tauq(), get_g2_data() - tightly coupled to analysis

Recommendation:

# Current (tight coupling):
xf = XpcsFile(path)
xf.fit_g2(q_idx=0, model='single_exp')  # XpcsFile knows about fitting

# Proposed (loose coupling via services):
xf = XpcsFile(path)
g2_data = xf.get_g2_data(q_idx=0)
result = fitting.fit_single_exp(g2_data['delay_times'], g2_data['g2'])

Migration Strategy:

  1. Create service layer: services/g2_analysis.py, services/saxs_analysis.py

  2. Move analysis logic from xpcs_file.py to services

  3. Keep convenience methods in xpcs_file.py as thin wrappers (deprecated)

  4. Fully migrate after 3 releases



7. Migration Roadmap

Phase 1: Non-Breaking Additions (Weeks 1-4)

Goal: Add new patterns in parallel without breaking existing code.

  1. Week 1-2:

    • Create io/hdf5_facade.py with basic read/write methods

    • Create backends/io_adapter.py with PyQtGraph/HDF5 adapters

    • Add unit tests for new components

  2. Week 3-4:

    • Create schema validators in schemas/validators.py

    • Add repositories/xpcs_repository.py

    • Integration tests for facade + repository

Deliverables:

  • ✅ New modules coexist with old code

  • ✅ 100% test coverage for new code

  • ✅ Documentation with migration examples

Phase 2: Gradual Migration (Weeks 5-12)

Goal: Migrate modules one at a time with feature parity.

Priority Order:

  1. simplemask/simplemask_kernel.py - Self-contained, low risk

  2. module/saxs1d.py - Low complexity, good reference

  3. module/saxs2d.py - Similar to saxs1d

  4. module/twotime.py - Higher complexity, more I/O points

  5. xpcs_file.py - Core module, migrate last

Per-Module Process:

  1. Add facade/adapter usage alongside existing code

  2. Run full test suite + integration tests

  3. Add deprecation warnings to old code paths

  4. Monitor for 2 weeks in production

  5. Remove old code if no issues

Deliverables:

  • ✅ All analysis modules use facade pattern

  • ✅ Backward compatibility maintained

  • ✅ Deprecation warnings in logs

Phase 3: Cleanup and Optimization (Weeks 13-16)

Goal: Remove deprecated code, optimize patterns.

  1. Week 13-14:

    • Remove deprecated direct h5py.File usage

    • Remove deprecated ensure_numpy() calls outside adapters

    • Update documentation

  2. Week 15-16:

    • Performance benchmarks for facade vs. direct access

    • Optimize connection pooling based on metrics

    • Final integration tests

Deliverables:

  • ✅ Clean dependency graph with facades

  • ✅ Performance parity or improvement

  • ✅ Updated architecture documentation


8. Risk Assessment

High-Risk Integration Points

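=======================  ==========  ===============================  ======================================
Integration Point        Risk Level  Impact                           Mitigation
=======================  ==========  ===============================  ======================================
backends._conversions    HIGH        9 modules depend on it           Extensive tests, version lock
HDF5 schema changes      HIGH        Breaking changes to file format  Schema versioning, migration tools
xpcs_file.py refactor    MEDIUM      Core module, many dependents     Gradual migration, parallel paths
NumPyro JAX requirement  MEDIUM      Fitting requires JAX             Clear error messages, fallback to NLSQ
PyQtGraph array format   LOW         Well-tested, stable API          Use adapters consistently
=======================  ==========  ===============================  ======================================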

Mitigation Strategies

  1. Schema Versioning:

    # Add to all HDF5 writes
    f.attrs['schema_version'] = '2.0.0'
    
    # Migration utilities
    def migrate_v1_to_v2(file_path):
        # Upgrade old files automatically
        pass
    
  2. Feature Flags:

    # Enable new facade gradually
    import os

    USE_HDF5_FACADE = os.environ.get('XPCS_USE_FACADE', 'false').lower() == 'true'
    
    if USE_HDF5_FACADE:
        qmap = facade.read_qmap(path)
    else:
        qmap = legacy_qmap_loader(path)
    
  3. Extensive Integration Tests:

    # tests/integration/test_hdf5_facade.py
    def test_facade_backward_compatibility():
        """Ensure facade reads old HDF5 files correctly."""
        old_file = "tests/data/legacy_v1.hdf5"
        result = facade.read_qmap(old_file)
        assert result.schema_version == "1.0.0"
        # ... validation
    

9. Performance Implications

Connection Pooling Impact

Current: fileIO/hdf_reader.py has connection pooling (good ✅)

With Facade:

  • All HDF5 access goes through single pool

  • Easier to monitor/tune pool size

  • Estimated overhead: < 1% (facade layer is thin)
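To illustrate what routing all access through a single pool involves, here is a minimal least-recently-used pool sketch. It is not the implementation in fileIO/hdf_reader.py. The class name `ConnectionPool`, the `opener` callable (in practice something like `lambda p: h5py.File(p, "r")`), and the hit/miss counters are assumptions. The sketch is single-threaded and does not protect a handle that is evicted while still in use.

```python
from collections import OrderedDict
from contextlib import contextmanager


class ConnectionPool:
    """Keep up to `max_size` open handles, evicting the least recently used."""

    def __init__(self, opener, max_size=4):
        if max_size < 1:
            raise ValueError("max_size must be at least 1")
        self._opener = opener          # callable: path -> handle with .close()
        self._max_size = max_size
        self._handles = OrderedDict()  # path -> open handle, oldest first
        self.hits = 0
        self.misses = 0

    @contextmanager
    def get_connection(self, path):
        if path in self._handles:
            self.hits += 1
            self._handles.move_to_end(path)  # mark as most recently used
        else:
            self.misses += 1
            self._handles[path] = self._opener(path)
            while len(self._handles) > self._max_size:
                _, oldest = self._handles.popitem(last=False)
                oldest.close()
        # The handle stays open after the block so later calls can reuse it.
        yield self._handles[path]

    def close_all(self):
        while self._handles:
            _, handle = self._handles.popitem()
            handle.close()
```

The `hits` and `misses` counters are the kind of metric that makes pool size tunable once every caller goes through the same object.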

Backend Conversion Overhead

Measured in JAX migration tests:

  • ensure_numpy() on GPU array: ~0.5ms for 1024x1024 array

  • ensure_backend_array(): ~0.1ms (view creation)

With Adapter Pattern:

  • Same conversions, but centralized

  • Easier to optimize (e.g., caching small arrays)

  • Estimated overhead: 0% (same underlying operations)

JIT Compilation Benefits

SimpleMask Q-map with JAX:

  • First call: ~200ms (compilation)

  • Subsequent calls: ~5ms (10x faster than NumPy)

  • Cache hit rate: ~95% in typical usage

Recommendation: Keep JIT caching in facade layer for best performance.
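The first-call versus subsequent-call figures above can be reproduced with a small timing harness. This is a sketch and not the benchmark used for the numbers quoted. `time_calls` is a hypothetical helper, and `slow_square` uses `functools.lru_cache` with a sleep as a stand-in for a JIT-compiled function. When timing real JAX code, the timed callable should call `.block_until_ready()` on its result, because JAX dispatches work asynchronously.

```python
import functools
import time


def time_calls(func, *args, repeats=5):
    """Return (first_call_seconds, best_subsequent_seconds) for func(*args)."""
    start = time.perf_counter()
    func(*args)
    first = time.perf_counter() - start

    later = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        later.append(time.perf_counter() - start)
    return first, min(later)


@functools.lru_cache(maxsize=None)
def slow_square(n):
    time.sleep(0.01)  # stand-in for one-time compilation cost
    return n * n


first, later = time_calls(slow_square, 12)
print(f"first call: {first * 1e3:.2f} ms, cached call: {later * 1e3:.4f} ms")
```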


10. Conclusion

Strengths

  1. No circular dependencies - Clean modular architecture

  2. Backend abstraction is well-designed - JAX/NumPy switching works seamlessly

  3. SimpleMask integration uses loose coupling - Signal-based, testable

  4. Connection pooling exists - HDF5 access is already optimized

Weaknesses

  1. ⚠️ No schema validation - Runtime errors possible from malformed HDF5

  2. ⚠️ High coupling at I/O boundaries - 9 modules directly use ensure_numpy()

  3. ⚠️ Implicit data contracts - Shared dicts without type checking

  4. ⚠️ XpcsFile is a “god object” - 39 public methods, knows too much

Priority Actions

Immediate (Next Sprint):

  1. Create io/hdf5_facade.py with schema validation

  2. Add schemas/validators.py for QMapSchema, GeometryMetadata

  3. Document all shared data structures in this file

Short-Term (Next Quarter):

  1. Migrate SimpleMask to use HDF5 facade

  2. Create backend I/O adapters

  3. Add integration tests for facades

Long-Term (Next Release):

  1. Refactor xpcs_file.py into service layer

  2. Remove deprecated direct HDF5 access

  3. Enforce facade pattern via linting rules
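A lint rule for the last long-term item could be as simple as scanning each module's syntax tree for `h5py.File(...)` calls outside the facade. The sketch below uses the standard `ast` module. The function name `find_direct_h5py_opens` is an assumption, and the check only catches the literal `h5py.File` spelling, not aliases such as `from h5py import File`.

```python
import ast


def find_direct_h5py_opens(source, filename="<string>"):
    """Return line numbers where `h5py.File(...)` is called in `source`."""
    hits = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if (
            isinstance(func, ast.Attribute)
            and func.attr == "File"
            and isinstance(func.value, ast.Name)
            and func.value.id == "h5py"
        ):
            hits.append(node.lineno)
    return sorted(hits)


sample = (
    "import h5py\n"
    "with h5py.File('a.hdf', 'r') as f:\n"
    "    pass\n"
    "x = open('b')\n"
)
print(find_direct_h5py_opens(sample))  # [2]
```

A CI job could run this over every file except the facade module and fail when the returned list is non-empty.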


Appendix A: Data Structure Reference

A.1 Complete QMapDict Schema

from typing import Literal, NotRequired, Optional, TypedDict  # NotRequired: Python 3.11+

import numpy as np

QMapDict = TypedDict('QMapDict', {
    # Required fields
    'sqmap': np.ndarray,      # float64, shape=(H, W), Static Q magnitude
    'dqmap': np.ndarray,      # float64, shape=(H, W), Dynamic Q magnitude
    'phis': np.ndarray,       # float64, shape=(H, W), Azimuthal angle

    # Units (required)
    'sqmap_unit': Literal["nm^-1", "A^-1"],
    'dqmap_unit': Literal["nm^-1", "A^-1"],
    'phis_unit': Literal["rad", "deg"],

    # Optional fields: NotRequired lets the key be absent, Optional lets the value be None
    'mask': NotRequired[Optional[np.ndarray]],  # int32, shape=(H, W), 0=masked, 1=valid
    'partition_map': NotRequired[Optional[np.ndarray]],  # int32, Q-bin indices
})

A.2 Complete GeometryMetadata Schema

from typing import NotRequired, Optional, Tuple, TypedDict  # NotRequired: Python 3.11+

GeometryMetadata = TypedDict('GeometryMetadata', {
    # Beam center (pixels, 0-indexed)
    'bcx': float,            # Column (X)
    'bcy': float,            # Row (Y)

    # Detector configuration
    'det_dist': float,       # mm, detector-to-sample distance
    'pix_dim': float,        # mm, pixel size
    'shape': Tuple[int, int],  # (height, width) in pixels

    # X-ray properties
    'lambda_': float,        # Angstroms, wavelength

    # Optional: NotRequired lets the key be absent, Optional lets the value be None
    'det_rotation': NotRequired[Optional[float]],  # degrees, detector rotation
    'incident_angle': NotRequired[Optional[float]],  # degrees, grazing incidence
})

A.3 Partition Dictionary Schema

from typing import List, Literal, NotRequired, Optional, TypedDict  # NotRequired: Python 3.11+

import numpy as np

PartitionDict = TypedDict('PartitionDict', {
    'version': str,           # Schema version, e.g., "1.0.0"
    'partition_map': np.ndarray,  # int32, shape=(H, W), bin indices
    'num_pts': int,           # Number of Q-bins
    'val_list': List[float],  # Q-bin center values (length=num_pts)
    'num_list': List[int],    # Pixels per bin (length=num_pts)
    'metadata': GeometryMetadata,  # Geometry used for computation (see A.2)

    # Optional: NotRequired lets the key be absent, Optional lets the value be None
    'mask': NotRequired[Optional[np.ndarray]],  # Mask used during partitioning
    'method': NotRequired[Optional[Literal["linear", "log"]]],  # Binning method
})

Document Version: 1.0

Author: Architecture Review

Next Review Date: 2026-02-06 (1 month)