ionique.cparsers

Cython implementations of ionic current parsers.

This module provides an optimized Cython implementation of the segmentation parser (SpeedyStatSplit) defined in parsers.py. The implementation, FastStatSplit, performs variance-based recursive segmentation and is approximately 50 to 100 times faster than the equivalent pure-Python implementation.

Classes and functions

FastStatSplit

Cython implementation of the StatSplit variance-based segmenter.

pairwise

Utility generator yielding consecutive pairs from an iterable.

class ionique.cparsers.FastStatSplit

Bases: object

Cython implementation of the variance-based signal segmenter by Kevin Karplus.

Approximately 50-100x faster than the equivalent pure-Python implementation depending on parameters. Segments a 1D ionic current trace by recursively finding split points that maximize the reduction in total variance across the two resulting sub-segments.

Parameters:
  • min_width (int, optional) – Minimum number of samples required in any segment. Defaults to 100.

  • max_width (int, optional) – Maximum number of samples allowed in any segment before a forced split is inserted. Defaults to 1000000.

  • window_width (int, optional) – Width of the sliding window used during stepwise split search. Must be >= 2 * min_width. Defaults to 10000.

  • min_gain_per_sample (float or None, optional) – Deprecated. If provided, the gain threshold is set by the legacy method (min_gain = min_gain_per_sample * window_width) instead of the Bayesian calculation. Defaults to None.

  • false_positive_rate (float or None, optional) – Expected number of false-positive split detections per second. Used in the Bayesian gain threshold calculation. If None, sampling_freq is used.

  • prior_segments_per_second (float or None, optional) – Prior expectation of how many true segments occur per second. Used in the Bayesian gain threshold calculation. If None, sampling_freq / 2 is used.

  • sampling_freq (float, optional) – Sampling frequency of the signal in Hz. Defaults to 1e5.

  • cutoff_freq (float or None, optional) – Low-pass cutoff frequency in Hz used to adjust the Bayesian threshold. Must be <= 0.5 * sampling_freq if provided. Defaults to None.

min_gain

Minimum log-likelihood gain required to accept a split point, computed from the Bayesian formulation or the legacy min_gain_per_sample method.

Type:

float

__init__()

Initialize FastStatSplit and compute the minimum gain threshold.

Parameters:
  • min_width (int, optional) – Minimum number of samples in any segment. Defaults to 100.

  • max_width (int, optional) – Maximum number of samples before a forced split. Defaults to 1000000.

  • window_width (int, optional) – Width of the sliding search window. Must be >= 2 * min_width. Defaults to 10000.

  • min_gain_per_sample (float or None, optional) – Legacy gain-per-sample threshold (deprecated). If provided, min_gain is set to min_gain_per_sample * window_width. Defaults to None.

  • false_positive_rate (float or None, optional) – Expected false-positive splits per second for the Bayesian threshold. If None, sampling_freq is used.

  • prior_segments_per_second (float or None, optional) – Prior rate of true segments per second for the Bayesian threshold. If None, sampling_freq / 2 is used.

  • sampling_freq (float, optional) – Sampling frequency of the signal in Hz. Defaults to 1e5.

  • cutoff_freq (float or None, optional) – Low-pass cutoff frequency in Hz for threshold adjustment. Must be <= 0.5 * sampling_freq. Defaults to None.
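The constraints and fallbacks listed above can be sketched in plain Python. This is an illustration only: resolve_settings is a hypothetical helper and not part of ionique, raising ValueError is an assumption about how invalid values might be reported, and the Bayesian threshold formula is left out because it is not given in this reference.

```python
def resolve_settings(min_width=100, window_width=10000,
                     min_gain_per_sample=None, false_positive_rate=None,
                     prior_segments_per_second=None, sampling_freq=1e5,
                     cutoff_freq=None):
    """Hypothetical helper mirroring the documented constraints and fallbacks."""
    # Documented constraint: window_width must be >= 2 * min_width.
    if window_width < 2 * min_width:
        raise ValueError("window_width must be >= 2 * min_width")
    # Documented constraint: cutoff_freq must be <= 0.5 * sampling_freq.
    if cutoff_freq is not None and cutoff_freq > 0.5 * sampling_freq:
        raise ValueError("cutoff_freq must be <= 0.5 * sampling_freq")

    settings = {
        # Documented fallbacks when the rates are not supplied.
        "false_positive_rate": (sampling_freq if false_positive_rate is None
                                else false_positive_rate),
        "prior_segments_per_second": (sampling_freq / 2
                                      if prior_segments_per_second is None
                                      else prior_segments_per_second),
    }
    # Documented legacy rule; the Bayesian alternative is not reproduced here.
    if min_gain_per_sample is not None:
        settings["min_gain"] = min_gain_per_sample * window_width
    return settings


legacy = resolve_settings(min_gain_per_sample=0.5)
```

With the default window_width of 10000, a legacy min_gain_per_sample of 0.5 gives a min_gain of 5000.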

best_single_split(current)

Find the single best split point in the entire current array.

Parameters:

current (numpy.ndarray) – 1D array of ionic current values to analyze.

Returns:

A tuple (gain, index), where gain is the log-likelihood gain achieved by splitting at index, and index is the position in the current array at which the split should occur.

Return type:

tuple[float, int]
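To make the (gain, index) result concrete, the search can be sketched in pure Python. The gain formula below is an assumption (a Gaussian log-likelihood in which each segment contributes -n/2 * ln(variance)), and split_gain and best_single_split_sketch are hypothetical names. The Cython method may compute the gain differently, for example by using cumulative sums or a cutoff_freq correction.

```python
import math
from statistics import pvariance


def split_gain(values, index, floor=1e-12):
    """Assumed gain: improvement in Gaussian log-likelihood from one split."""
    def score(chunk):
        # Floor the variance so a perfectly flat chunk does not hit log(0).
        return -0.5 * len(chunk) * math.log(max(pvariance(chunk), floor))

    return score(values[:index]) + score(values[index:]) - score(values)


def best_single_split_sketch(values, min_width=2):
    """Return (gain, index) for the candidate split with the largest gain."""
    # Both sides must keep at least min_width samples.
    # Raises ValueError if the input is too short to offer any candidate.
    candidates = range(min_width, len(values) - min_width + 1)
    return max((split_gain(values, i), i) for i in candidates)


# Two flat levels with small fluctuations; the level changes at index 6.
trace = [1.0, 1.1, 0.9, 1.0, 1.1, 0.9, 5.0, 5.1, 4.9, 5.0, 5.1, 4.9]
gain, index = best_single_split_sketch(trace)
```

For this trace the best index is 6, the position where the level changes, and the gain is positive.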

parse(current)

Segment a current trace and return a list of Segment objects.

Computes cumulative sum arrays for efficient variance calculations, then recursively finds split points and constructs one Segment per detected sub-segment, each containing its slice of the current array.

Parameters:

current (numpy.ndarray) – 1D array of ionic current values to segment.

Returns:

A list of Segment objects covering the full input array with no gaps or overlaps.

Return type:

list[Segment]
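The cumulative-sum idea and the recursion can be sketched as follows. This is a simplified illustration under stated assumptions: the gain is the Gaussian log-likelihood form -n/2 * ln(variance), segment_sketch is a hypothetical name, the threshold min_gain=5.0 is an arbitrary value chosen for the toy data, and the max_width forced splits and window_width stepping described for FastStatSplit are omitted. It returns (start, end) index pairs rather than Segment objects.

```python
import math
from itertools import accumulate


def segment_sketch(values, min_width=2, min_gain=5.0, floor=1e-12):
    """Return (start, end) pairs covering values with no gaps or overlaps."""
    # Cumulative sums of x and x**2 with a leading zero, so the sum over
    # values[start:end] is c[end] - c[start] in constant time.
    c = [0.0] + list(accumulate(values))
    c2 = [0.0] + list(accumulate(v * v for v in values))

    def score(start, end):
        n = end - start
        mean = (c[end] - c[start]) / n
        var = (c2[end] - c2[start]) / n - mean * mean
        return -0.5 * n * math.log(max(var, floor))

    def split_points(start, end):
        # Too short to leave min_width samples on both sides.
        if end - start < 2 * min_width:
            return []
        whole = score(start, end)
        gain, index = max(
            (score(start, i) + score(i, end) - whole, i)
            for i in range(start + min_width, end - min_width + 1)
        )
        if gain < min_gain:
            return []
        # Accept the split, then search each side recursively.
        return split_points(start, index) + [index] + split_points(index, end)

    bounds = [0] + split_points(0, len(values)) + [len(values)]
    return list(zip(bounds, bounds[1:]))


# Three flat levels of six samples each.
trace = [1.0, 1.1, 0.9, 1.0, 1.1, 0.9,
         5.0, 5.1, 4.9, 5.0, 5.1, 4.9,
         2.0, 2.1, 1.9, 2.0, 2.1, 1.9]
segments = segment_sketch(trace)
```

Each returned pair marks the slice values[start:end]. Because consecutive pairs share a boundary, the slices tile the input exactly.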

parse_meta(current)

Segment a current trace and return lightweight MetaSegment objects.

Identical to parse, but constructs MetaSegment instances instead of Segment instances, so no signal data is stored in the returned objects.

Parameters:

current (numpy.ndarray) – 1D array of ionic current values to segment.

Returns:

A list of MetaSegment objects covering the full input array with no gaps or overlaps.

Return type:

list[MetaSegment]
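The difference between keeping a slice of the data and keeping only a summary can be illustrated with a small stand-in. SegmentSummary and its fields (start, duration, mean, std) are hypothetical and chosen for illustration; they are not ionique's MetaSegment, whose attributes are not listed in this reference.

```python
from dataclasses import dataclass
from statistics import fmean, pstdev


@dataclass(frozen=True)
class SegmentSummary:
    """Hypothetical lightweight record; not ionique's MetaSegment."""
    start: int
    duration: int
    mean: float
    std: float


def summarize(values, bounds):
    """Build one summary per consecutive pair of boundaries."""
    return [
        # Only scalars are kept; the slice itself is discarded.
        SegmentSummary(s, e - s, fmean(values[s:e]), pstdev(values[s:e]))
        for s, e in zip(bounds, bounds[1:])
    ]


trace = [1.0, 1.0, 1.0, 5.0, 5.0]
summaries = summarize(trace, [0, 3, 5])
```

Storing a few scalars per segment keeps memory use independent of segment length, which is the point of returning metadata rather than data.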

score_samples(current, no_split=False)

Return per-sample log-likelihood gain scores for each recursive scan.

For each recursive sweep across the data, a score array is produced recording the log-likelihood gain evaluated at every candidate split position. One score array is returned per scan; splits detected during a scan trigger further recursive scans of the resulting sub-segments.

Parameters:
  • current (numpy.ndarray) – 1D array of ionic current values to score.

  • no_split (bool, optional) – If True, perform only a single (non-recursive) scan and return one score array without looking for further splits. Defaults to False.

Returns:

A list of score arrays, one per recursive scan. Each array has the same length as the stored cumulative array and contains the log-likelihood gain at each sample index.

Return type:

list[numpy.ndarray]

ionique.cparsers.pairwise(iterable)

Return an iterator of overlapping pairs from the input iterable.

Parameters:

iterable (iterable) – Any iterable to consume. Each element is paired with its immediate successor, so an input of length n yields n-1 pairs.

Returns:

An iterator of (a, b) tuples where b immediately follows a in the original iterable.

Return type:

zip
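A return type of zip is consistent with the well-known itertools recipe shown below, though whether ionique implements pairwise exactly this way is an assumption. The name pairwise_sketch is hypothetical and is used to avoid implying that this is the library's code.

```python
from itertools import tee


def pairwise_sketch(iterable):
    """Yield (a, b) pairs where b immediately follows a."""
    a, b = tee(iterable)   # two independent iterators over the same items
    next(b, None)          # advance the second iterator by one element
    return zip(a, b)       # zip stops when the shorter iterator is exhausted


# Turning split boundaries into (start, end) index pairs.
pairs = list(pairwise_sketch([0, 6, 12, 18]))
```

A list of four boundaries yields three pairs, matching the n-1 rule stated above. An empty or single-element input yields no pairs.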