Main Function for Genomic Distance Calculation and Phylogenetic Tree Construction
fit.Rd
This function computes genomic distances between samples with flexible distance-function selection, processes the data, and constructs a phylogenetic tree. It supports distinct metrics for greedy (G) and breakage–fusion–bridge (BFB; B) distance calculations.
Arguments
- data
Input data frame with columns:
  - cell_id: Cell identifier
  - chr: Chromosome name
  - start: Start of genomic bin
  - end: End of genomic bin
  - CN: Total copy number
  - A: Copy number of allele A
  - B: Copy number of allele B
- chromosomes
Vector of chromosomes to include (default: c(1:22, "X", "Y")).
- alleles
Alleles to consider (default: c("A","B"); alternative: "CN").
- k_jitter_fix
Numeric jitter factor for numerical stability (default: 0).
- bfb_penalty
Penalty to apply to BFB events (default: 0). Ignored if b_dist_func = NULL.
- tree_func
Function used to build the tree from the final distance matrix (default: ape::nj).
- fillna
Value to fill NA entries in pre-processing (default: 0).
- g_dist_func
Name of the greedy (G) distance function. Must be one of names(G_DISTS) (default: "greedy_fast").
- b_dist_func
Name of the BFB (B) distance function. Must be one of names(B_DISTS). Set to NULL to disable the BFB stage and run a greedy-only analysis (default: "bfb_fast").
- ...
Additional arguments passed to downstream helpers.
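The column layout expected for data can be illustrated with a small hand-built data frame. This is only a sketch: the values are invented, the bin coordinates are arbitrary, and the relation CN = A + B is an assumption made here for the toy data, not something the documentation states.

```r
# Toy input with the documented columns: two cells, one chromosome, two bins each.
# All values are made up for illustration.
toy <- data.frame(
  cell_id = rep(c("cell1", "cell2"), each = 2),
  chr     = "1",
  start   = rep(c(1, 500001), times = 2),
  end     = rep(c(500000, 1000000), times = 2),
  A       = c(1, 2, 1, 1),
  B       = c(1, 1, 0, 1),
  stringsAsFactors = FALSE
)

# Assumption: total copy number is the sum of the two allele-specific values.
toy$CN <- toy$A + toy$B

# Check that every documented column is present, then put them in the documented order.
required <- c("cell_id", "chr", "start", "end", "CN", "A", "B")
missing_cols <- setdiff(required, names(toy))
if (length(missing_cols) > 0) {
  stop("Missing columns: ", paste(missing_cols, collapse = ", "))
}
toy <- toy[, required]
```

A frame shaped like this is what the data argument describes; whether the function itself enforces column order or the CN = A + B relation is not stated above.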
Value
A list with:
- tree: The constructed phylogenetic tree (rooted, diploid removed)
- all_input_Xs: Processed input data
- D: Final distance matrix
- greedy_Ds: Per-chromosome/allele greedy distance matrices
- avg_Ds: Per-chromosome/allele balanced (G vs B) distance matrices; if b_dist_func = NULL, then avg_Ds equals greedy_Ds
- g_dist_func: Name of the G distance function used
- b_dist_func: Name of the B distance function used (or NULL)
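The "rooted, diploid removed" description of tree can be pictured with ape directly. The sketch below assumes that the final distance matrix carries a reference row labelled "diploid", that the tree is rooted on it, and that the tip is dropped afterwards. The label, the toy distances, and the exact order of operations are illustrative assumptions, not the package's confirmed implementation.

```r
library(ape)

# Toy symmetric distance matrix; "diploid" is a hypothetical reference label.
labels <- c("diploid", "cell1", "cell2", "cell3")
D <- matrix(c(0, 2, 3, 4,
              2, 0, 3, 4,
              3, 3, 0, 3,
              4, 4, 3, 0),
            nrow = 4, dimnames = list(labels, labels))

tree <- nj(as.dist(D))                                       # unrooted neighbor-joining tree
tree <- root(tree, outgroup = "diploid", resolve.root = TRUE) # root on the reference
tree <- drop.tip(tree, "diploid")                             # keep only the real cells
```

Any function that accepts a distance matrix and returns a tree could stand in for nj here, which is the role the tree_func argument plays.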
Details
The pipeline performs:
- Validation of selected distance functions
- Input pre-processing and diploid augmentation
- Computation of greedy (G) distances
- Optional BFB stage: if b_dist_func is provided, BFB (B) distances are computed and merged with G; if b_dist_func = NULL, the BFB stage is skipped and avg_Ds <- greedy_Ds (greedy-only analysis). In the NULL case, bfb_penalty is ignored.
- Merging to minimal distances and optional allele summation
- Tree construction via tree_func
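The merge and summation steps can be sketched in base R. The example assumes that "merging to minimal distances" means an element-wise minimum of the G and B matrices and that allele summation adds the per-allele results; the helper merge_min, the object bfb_Ds, and all matrix values are hypothetical, and bfb_penalty is left out because its exact role is not described here.

```r
labels <- c("cell1", "cell2", "cell3")
mk <- function(v) matrix(v, 3, 3, dimnames = list(labels, labels))

# Hypothetical per-allele distance matrices for a single chromosome.
greedy_Ds <- list(A = mk(c(0, 4, 6, 4, 0, 2, 6, 2, 0)),
                  B = mk(c(0, 1, 3, 1, 0, 5, 3, 5, 0)))
bfb_Ds    <- list(A = mk(c(0, 3, 7, 3, 0, 2, 7, 2, 0)),
                  B = mk(c(0, 2, 2, 2, 0, 6, 2, 6, 0)))

# Element-wise minimum of G and B per allele; with no B distances the
# result is the greedy matrices unchanged (the b_dist_func = NULL case).
merge_min <- function(g, b) {
  if (is.null(b)) return(g)
  Map(pmin, g, b)
}

avg_Ds <- merge_min(greedy_Ds, bfb_Ds)

# Optional allele summation: add the per-allele matrices into one matrix.
D <- Reduce(`+`, avg_Ds)
```

Calling merge_min(greedy_Ds, NULL) returns greedy_Ds as is, which mirrors the greedy-only behaviour described for b_dist_func = NULL.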
Available functions can be inspected with names(G_DISTS) and names(B_DISTS).
Using b_dist_func = NULL is useful for ablation studies, speed-ups, or when
BFB modelling is not desired; results reduce to the greedy metric.
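A name check against the registries might look like the sketch below. The G_DISTS and B_DISTS objects defined here are stand-ins holding only the documented default names with placeholder function bodies, and check_dist_name is a hypothetical helper; the real registries and validation logic live in the package and may differ.

```r
# Stand-in registries: only the names "greedy_fast" and "bfb_fast" come from
# the documentation; the function bodies are placeholders.
G_DISTS <- list(greedy_fast = function(x, y) sum(abs(x - y)))
B_DISTS <- list(bfb_fast    = function(x, y) sum(abs(x - y)))

# Hypothetical validator: accepts a registered name, optionally accepts NULL.
check_dist_name <- function(name, registry, allow_null = FALSE) {
  if (is.null(name)) {
    if (allow_null) return(invisible(NULL))
    stop("a distance function name is required")
  }
  if (!is.character(name) || length(name) != 1 || !(name %in% names(registry))) {
    stop("unknown distance function '", name, "'; choose one of: ",
         paste(names(registry), collapse = ", "))
  }
  invisible(name)
}

check_dist_name("greedy_fast", G_DISTS)              # passes
check_dist_name(NULL, B_DISTS, allow_null = TRUE)    # greedy-only analysis
```

Allowing NULL only for the B registry matches the documented behaviour that the BFB stage is optional while the greedy stage is always run.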