Read-ID Intersection
Enumeration of read IDs from each input file type and the per-condition
three-way intersection reads(BAM) ∩ reads(FASTQ) ∩ reads(BLOW5). For the
rationale see Inputs › Read-ID intersection.
Internal module
These live in baleen.eventalign._read_ids. They are not part of the stable
top-level API but are documented here because the intersection is a core
pipeline guarantee.
Enumeration
read_ids_from_bam
read_ids_from_bam(
    bam_path: PathLike,
    *,
    primary_only: bool = True,
    min_mapq: int = 0
) -> set[str]
Return the set of read query names in bam_path that pass the same primary_only and min_mapq filters the pipeline applies later.
Source code in baleen/eventalign/_read_ids.py
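The filtering step can be sketched without opening a BAM at all. This is a minimal illustration, not the module's actual code: filter_query_names and FakeRecord are hypothetical names, and the attribute names mirror those of pysam.AlignedSegment. It assumes that primary_only drops both secondary and supplementary alignments and that unmapped records are always skipped; the real function may differ on both points.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class FakeRecord:
    """Hypothetical stand-in exposing the pysam.AlignedSegment attributes used below."""
    query_name: str
    is_unmapped: bool = False
    is_secondary: bool = False
    is_supplementary: bool = False
    mapping_quality: int = 0


def filter_query_names(
    records: Iterable,
    *,
    primary_only: bool = True,
    min_mapq: int = 0,
) -> set[str]:
    """Collect query names of records that pass the alignment filters."""
    ids: set[str] = set()
    for rec in records:
        # Assumption: unmapped records never count as "in the BAM".
        if rec.is_unmapped:
            continue
        # Assumption: "primary" excludes secondary and supplementary alignments.
        if primary_only and (rec.is_secondary or rec.is_supplementary):
            continue
        if rec.mapping_quality < min_mapq:
            continue
        ids.add(rec.query_name)
    return ids


records = [
    FakeRecord("a", mapping_quality=60),
    FakeRecord("b", is_secondary=True, mapping_quality=60),
    FakeRecord("c", mapping_quality=5),
    FakeRecord("d", is_unmapped=True),
]
kept = filter_query_names(records, min_mapq=10)
```

With real data, the same function could be fed the records yielded by iterating over a pysam.AlignmentFile.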
read_ids_from_fastq
Parse a FASTQ file (optionally gzipped) and return the set of read IDs.
Read IDs are the first whitespace-delimited token of each header line
(@<read_id> runid=... ch=...). This is the sole read-ID source for
the FASTQ side of the intersection: the krill engine reads signal directly
from BLOW5 and never produces an f5c .index.readdb, so any such file
found adjacent to the FASTQ is ignored. (Older f5c readdb files used a
*<TAB>blow5_path "single-BLOW5" form that carries no read IDs at all.)
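The header rule above can be sketched with the standard library alone. This is an illustrative sketch rather than the module's implementation: fastq_read_ids is a hypothetical name, gzip input is detected by magic bytes (an assumption; the real function may check the extension instead), and the parser assumes strict four-line records with no wrapped sequence lines.

```python
import gzip
from pathlib import Path


def fastq_read_ids(path) -> set[str]:
    """Return the first whitespace-delimited header token of every record."""
    path = Path(path)
    # Detect gzip by its two magic bytes rather than by file extension.
    with open(path, "rb") as probe:
        is_gzip = probe.read(2) == b"\x1f\x8b"
    opener = gzip.open if is_gzip else open

    ids: set[str] = set()
    with opener(path, "rt", encoding="utf-8") as handle:
        for line_no, line in enumerate(handle):
            # Only every fourth line is a header; quality lines may also
            # start with "@", so position matters more than the prefix.
            if line_no % 4 != 0:
                continue
            if not line.strip():
                continue  # tolerate a trailing blank line
            if not line.startswith("@"):
                raise ValueError(f"line {line_no + 1}: expected a FASTQ header")
            fields = line[1:].split()
            if not fields:
                raise ValueError(f"line {line_no + 1}: empty read ID")
            ids.add(fields[0])
    return ids
```

Selecting headers by line position is a deliberate choice here: a quality string can legitimately begin with "@", so a prefix test alone would misidentify records.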
read_ids_from_blow5
Enumerate read IDs in a BLOW5/SLOW5 file via pyslow5.
pyslow5.Open(path, 'r').get_read_ids() returns (ids, n_reads)
where ids is a Python list of UUID strings.
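As a small sketch of how the (ids, n_reads) tuple might be consumed, the helper below takes an already opened handle and converts the list to a set. ids_from_slow5_handle and FakeSlow5 are hypothetical names introduced for illustration, and the count check is an added assumption rather than documented behaviour. The fake handle lets the sketch run without pyslow5 installed.

```python
def ids_from_slow5_handle(handle) -> set[str]:
    """Unpack get_read_ids() from an opened handle and return the IDs as a set."""
    ids, n_reads = handle.get_read_ids()
    # Assumption: a mismatch between the list length and the reported
    # count indicates a damaged or truncated index, so fail loudly.
    if len(ids) != n_reads:
        raise ValueError(f"expected {n_reads} read IDs, got {len(ids)}")
    return set(ids)


class FakeSlow5:
    """Hypothetical stand-in that mimics the (ids, n_reads) return shape."""

    def __init__(self, ids, n_reads=None):
        self._ids = list(ids)
        self._n = len(self._ids) if n_reads is None else n_reads

    def get_read_ids(self):
        return self._ids, self._n


blow5_ids = ids_from_slow5_handle(FakeSlow5(["uuid-1", "uuid-2"]))

# With a real file, the handle would come from pyslow5 instead:
#   import pyslow5
#   s5 = pyslow5.Open("reads.blow5", "r")
#   blow5_ids = ids_from_slow5_handle(s5)
```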
Intersection & persistence
compute_condition_intersection
compute_condition_intersection(
    *,
    bam: PathLike,
    fastq: PathLike,
    blow5: PathLike,
    primary_only: bool = True,
    min_mapq: int = 0,
    label: str = ""
) -> set[str]
Compute reads(BAM) ∩ reads(FASTQ) ∩ reads(BLOW5) for one condition.
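Once the three per-file sets exist, the intersection itself is a plain set operation. The sketch below assumes the sets have already been enumerated; intersect_read_ids and dropped_per_source are hypothetical helpers, and logging the per-source loss under label is a guess at how that parameter might be used, not documented behaviour.

```python
import logging

logger = logging.getLogger(__name__)


def dropped_per_source(bam_ids, fastq_ids, blow5_ids, common) -> dict[str, int]:
    """Count how many IDs each source loses to the intersection."""
    return {
        "bam": len(bam_ids - common),
        "fastq": len(fastq_ids - common),
        "blow5": len(blow5_ids - common),
    }


def intersect_read_ids(bam_ids, fastq_ids, blow5_ids, *, label: str = "") -> set[str]:
    """Return reads(BAM) ∩ reads(FASTQ) ∩ reads(BLOW5)."""
    common = set(bam_ids) & set(fastq_ids) & set(blow5_ids)
    # Assumption: label only tags the log line for one condition.
    logger.info(
        "%s: kept %d reads, dropped %s",
        label or "condition",
        len(common),
        dropped_per_source(bam_ids, fastq_ids, blow5_ids, common),
    )
    return common


bam_ids = {"a", "b", "c"}
fastq_ids = {"b", "c", "d"}
blow5_ids = {"b", "c", "e"}
common = intersect_read_ids(bam_ids, fastq_ids, blow5_ids, label="control")
```

Only reads present in all three sources survive; here each source contributes one read that the other two lack, so each loses exactly one ID.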
write_read_ids
Persist a set of read IDs to path, newline-separated, atomically.
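One common way to get an atomic write is to write to a temporary file in the destination directory and then rename it over the target. The sketch below illustrates that pattern; it is not necessarily how write_read_ids is implemented. write_ids_atomically is a hypothetical name, and sorting the IDs for a reproducible file is an added assumption.

```python
import os
import tempfile
from pathlib import Path


def write_ids_atomically(ids, path) -> None:
    """Write one read ID per line to path via a temp file plus os.replace."""
    path = Path(path)
    # The temp file must live in the same directory so the final rename
    # stays on one filesystem and can therefore be atomic.
    fd, tmp_name = tempfile.mkstemp(
        dir=path.parent, prefix=path.name + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            for read_id in sorted(ids):  # assumption: sorted for stable output
                tmp.write(read_id + "\n")
            tmp.flush()
            os.fsync(tmp.fileno())
        # Readers see either the old file or the complete new one.
        os.replace(tmp_name, path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
```

Because the rename happens only after the data is flushed, an interrupted run leaves either the previous file or none at all, never a truncated list.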
load_read_ids
Inverse of write_read_ids. Returns None when path is None
so callers can short-circuit when the intersection feature is off.
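The None pass-through can be sketched as follows. load_ids is a hypothetical name, and skipping blank lines is an assumption about the file format rather than documented behaviour.

```python
from pathlib import Path
from typing import Optional, Union


def load_ids(path: Optional[Union[str, Path]]) -> Optional[set[str]]:
    """Read a newline-separated ID file, or return None when no path is given."""
    if path is None:
        # Intersection feature is off: signal "no restriction" to the caller.
        return None
    with open(path, encoding="utf-8") as handle:
        # Assumption: blank lines carry no meaning and are ignored.
        return {line.strip() for line in handle if line.strip()}
```

A caller can then distinguish "feature off" from "feature on but nothing survived": None means no filtering should happen, whereas an empty set means every read is excluded.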