Aggregation
Per-site aggregation of per-read modification probabilities, and TSV writers.
SiteResult (dataclass)
    SiteResult(
        contig: str,
        position: int,
        kmer: str,
        mod_ratio: float,
        ci_low: float,
        ci_high: float,
        pvalue: float,
        padj: float,
        effect_size: float,
        n_native: int,
        n_ivt: int,
        mean_p_mod: float,
        stoichiometry: float,
    )
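As an illustration of how these fields might be consumed downstream, the sketch below filters sites by adjusted p-value and ranks them. `SiteRow` is a hypothetical stand-in that carries only a subset of the fields listed above, so the example runs without baleen installed; `significant_sites` and the 0.05 cutoff are likewise assumptions, not part of the library.

```python
from dataclasses import dataclass


@dataclass
class SiteRow:
    """Hypothetical stand-in with a subset of the SiteResult fields."""
    contig: str
    position: int
    mod_ratio: float
    padj: float


def significant_sites(sites, alpha=0.05):
    """Return sites with padj <= alpha, most significant first."""
    hits = [s for s in sites if s.padj <= alpha]
    # Ties on padj fall back to contig, then position, for a stable order.
    return sorted(hits, key=lambda s: (s.padj, s.contig, s.position))


sites = [
    SiteRow("tx1", 10, 0.80, 0.001),
    SiteRow("tx1", 42, 0.10, 0.600),
    SiteRow("tx2", 7, 0.55, 0.020),
]
hits = significant_sites(sites)
print([(s.contig, s.position) for s in hits])
```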
aggregate_all
    aggregate_all(
        results: dict[str, ContigModificationResult],
        *,
        score_field: str = "p_mod_hmm",
        mod_threshold: float = 0.9
    ) -> list[SiteResult]
Aggregate all contigs and apply per-transcript FDR correction.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| results | dict[str, ContigModificationResult] | | required |
| score_field | str | Which per-read score to aggregate. | 'p_mod_hmm' |
| mod_threshold | float | Per-read probability threshold for counting a read as modified. | 0.9 |
Returns:
| Type | Description |
|---|---|
| list[SiteResult] | All sites across all contigs, sorted by contig then position, with |
Source code in baleen/eventalign/_aggregation.py
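The role of mod_threshold can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in _aggregation.py: it assumes a read counts as modified when its score is greater than or equal to the threshold, and that a site-level ratio is the modified fraction of reads. `summarize_site` and its output keys are hypothetical.

```python
def summarize_site(p_mod, mod_threshold=0.9):
    """Summarise per-read modification probabilities at one site.

    Assumption: a read is 'modified' when its probability >= mod_threshold.
    """
    if not p_mod:
        raise ValueError("need at least one read")
    n_reads = len(p_mod)
    n_modified = sum(1 for p in p_mod if p >= mod_threshold)
    return {
        "n_reads": n_reads,
        "modified_fraction": n_modified / n_reads,
        "mean_p_mod": sum(p_mod) / n_reads,
    }


summary = summarize_site([0.95, 0.99, 0.20, 0.91])
print(summary)
```

Raising the threshold makes the per-read call stricter, so the modified fraction can only stay the same or fall, while the mean probability is unaffected.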
aggregate_contig
    aggregate_contig(
        cmr: ContigModificationResult,
        *,
        score_field: str = "p_mod_hmm",
        mod_threshold: float = 0.9
    ) -> list[SiteResult]
Aggregate per-read results into site-level calls for one contig.
P-values are not FDR-corrected here; use `aggregate_all` for multi-contig FDR correction, or apply `_benjamini_hochberg` manually.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| cmr | ContigModificationResult | Output of | required |
| score_field | str | Which per-read score to aggregate. Default | 'p_mod_hmm' |
Returns:
| Type | Description |
|---|---|
| list[SiteResult] | One entry per position, sorted by position. |
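For readers who want to correct the uncorrected p-values themselves, the sketch below implements the standard Benjamini-Hochberg step-up procedure in plain Python. It is an assumption that this matches what `_benjamini_hochberg` does internally; the private helper's signature and tie handling are not documented here, and `benjamini_hochberg` below is a hypothetical standalone function.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


padj = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
print(padj)
```

Because the adjustment depends on the number of tests, correcting each contig separately and correcting all contigs together generally give different adjusted values.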
write_site_tsv
    write_site_tsv(
        sites: list[SiteResult], path: str | Path
    ) -> Path
Write site-level results to a TSV file (header + rows).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| sites | list[SiteResult] | Output of :func: | required |
| path | str \| Path | Output file path. | required |
Returns:
| Type | Description |
|---|---|
| Path | The written file path. |
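The header-plus-rows layout can be sketched with the standard library alone. This is a rough illustration, not the library's writer: `SiteRow` is a hypothetical stand-in with a subset of the SiteResult fields, `write_header` and `write_rows` are hypothetical helpers, and it is an assumption that column names equal the dataclass field names. The real header and number formatting may differ.

```python
import csv
import io
from dataclasses import astuple, dataclass, fields


@dataclass
class SiteRow:
    """Hypothetical stand-in with a subset of the SiteResult fields."""
    contig: str
    position: int
    padj: float


def write_header(handle):
    """Write one tab-separated header line built from the field names."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow([f.name for f in fields(SiteRow)])


def write_rows(handle, sites):
    """Write one tab-separated data row per site, with no header."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    for site in sites:
        writer.writerow(astuple(site))


buffer = io.StringIO()
write_header(buffer)
write_rows(buffer, [SiteRow("tx1", 10, 0.001), SiteRow("tx2", 7, 0.02)])
text = buffer.getvalue()
print(text)
```

Keeping the header and the rows in separate functions is what allows rows-only slices to be written per contig and stitched together later.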
write_site_tsv_header
write_site_tsv_rows
    write_site_tsv_rows(
        file: IO[str], sites: list[SiteResult]
    ) -> None
Write sites as TSV data rows (no header) to file.
merge_contig_tsvs
Concatenate per-contig TSV slices (rows only) into one TSV with a header.
The caller is responsible for sorting per_contig_tsvs into the desired output order (typically alphabetical by contig name).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| per_contig_tsvs | list[Path] | List of per-contig TSV paths. Each file contains data rows only (no header), produced via :func: | required |
| output_path | str \| Path | Final TSV path. | required |
Returns:
| Type | Description |
|---|---|
| Path | The written final TSV path. |
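The merge step described above amounts to writing a header once and then appending each rows-only file in the order given. The sketch below shows that idea under stated assumptions: `merge_rows_only_tsvs` is a hypothetical function, and its explicit `header` argument is an illustration only, since the documented function takes just per_contig_tsvs and output_path.

```python
import shutil
import tempfile
from pathlib import Path


def merge_rows_only_tsvs(per_contig_tsvs, output_path, header):
    """Write the header once, then append each rows-only file in order."""
    output_path = Path(output_path)
    with output_path.open("w", encoding="utf-8", newline="") as out:
        out.write(header + "\n")
        for part in per_contig_tsvs:
            with Path(part).open("r", encoding="utf-8", newline="") as src:
                # Stream the slice so large files are not loaded into memory.
                shutil.copyfileobj(src, out)
    return output_path


with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    parts = []
    for name, row in [("tx2", "tx2\t7\t0.02\n"), ("tx1", "tx1\t10\t0.001\n")]:
        part = tmp / f"{name}.tsv"
        part.write_text(row, encoding="utf-8")
        parts.append(part)
    # The caller sorts the slices; here that is alphabetical by file name.
    merged = merge_rows_only_tsvs(sorted(parts), tmp / "all.tsv", "contig\tposition\tpadj")
    merged_text = merged.read_text(encoding="utf-8")

print(merged_text)
```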