Table 1.

NeighbourNet regression setting and computational cost

| Data | Cells (N) | Responses (P) | Predictors (Q) | Computational time (s) | Memory usage (GB) |
|---|---|---|---|---|---|
| PBMC3k data (Case study 1) | N = 2638 | P = 288 (CollecTRI TFs) | Q = 4116 (CollecTRI targets) | LVC: 34.56; Regression: 784.52 | 25.60 |
| Perturb-seq data 1 (Case study 1; Papalexi et al. 2021) | N = 7411 | P = 5 (CollecTRI TFs) | Q = 5151 (CollecTRI targets) | LVC: 155.39; Regression: 18.6 | 9.35 |
| Perturb-seq data 2 (d7) (Supplemental Results; Dixit et al. 2016) | N = 16,506 | P = 10 (perturbed TFs) | Q = 3227 (CollecTRI targets) | LVC: 872.87; Regression: 120.48 | 5.62 |
| Perturb-seq data 2 (d13) (Supplemental Results; Dixit et al. 2016) | N = 9633 | P = 10 (perturbed TFs) | Q = 3227 (CollecTRI targets) | LVC: 305.20; Regression: 63.18 | 3.00 |
| Lin early hematopoietic cell atlas (Case study 2; Pellin et al. 2019) | N = 1078 (subsampled) | P = 805 (PKN TFs) | Q = 4600 (PKN targets) | LVC: 33.68; Regression: 1045.09; Meta-network: 482.11 | 34.50 |
| Small cell lung cancer atlas (Case study 3; Chan et al. 2021) | N = 2909 (subsampled) | P = 28 (DEGs) | Q = 900 (PKN TFs) | LVC: 87.03; Regression: 66.61; Meta-network: 39.11 | 1.23 |

[i] We summarize the number of cells (N), the number of response genes (P), and the number of predictor genes (Q) involved in the NNet analysis for each case study. Each NNet analysis thus generates an ensemble of N × P × Q coexpression networks. For the NNet analyses in the early hematopoiesis (Case study 2) and lung cancer (Case study 3) case studies, networks were built on a subset of cells (indicated by “subsampled”) chosen to represent the full data set, further reducing the computational burden. The Computational time column records the runtime of NNet in seconds, broken down into the stages of local gene variance calculation (LVC), regression, and meta-network construction. LVC is an initial part of the coexpression measurement (Methods) and only needs to be performed once before regression; changing the response genes for subsequent regression steps does not require recalculating the local variance. The Memory usage column shows the change in memory (in gigabytes) from before to after the LVC and regression steps. All analyses were run on an RStudio server allocated four cores of an Intel Xeon Gold 6254 CPU @ 3.10 GHz. We did not perform parallel computing. Other abbreviations: DEGs, differentially expressed genes; PKN, prior knowledge network.
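
Because the runtime is reported per stage, the end-to-end cost of each analysis has to be summed by the reader. The following Python sketch is a reading aid only, not part of NNet: it transcribes the stage times from Table 1, adds them up per data set, and reports the share spent on the one-off LVC step. The names `STAGE_SECONDS`, `total_seconds`, and `lvc_share`, and the shortened data set labels, are illustrative assumptions.

```python
# Stage runtimes in seconds, transcribed from Table 1.
# The dictionary keys are shortened labels chosen for this sketch.
STAGE_SECONDS = {
    "PBMC3k": {"LVC": 34.56, "Regression": 784.52},
    "Perturb-seq 1": {"LVC": 155.39, "Regression": 18.6},
    "Perturb-seq 2 (d7)": {"LVC": 872.87, "Regression": 120.48},
    "Perturb-seq 2 (d13)": {"LVC": 305.20, "Regression": 63.18},
    "Early hematopoiesis": {
        "LVC": 33.68, "Regression": 1045.09, "Meta-network": 482.11,
    },
    "Small cell lung cancer": {
        "LVC": 87.03, "Regression": 66.61, "Meta-network": 39.11,
    },
}


def total_seconds(stages):
    """Sum all reported stages, rounded to the table's two decimals."""
    return round(sum(stages.values()), 2)


def lvc_share(stages):
    """Fraction of the total runtime spent on the one-off LVC step."""
    return stages["LVC"] / sum(stages.values())


TOTALS = {name: total_seconds(s) for name, s in STAGE_SECONDS.items()}

for name, stages in STAGE_SECONDS.items():
    total = TOTALS[name]
    print(f"{name}: {total:.2f} s ({total / 60:.1f} min), "
          f"LVC share {lvc_share(stages):.0%}")
```

For example, the PBMC3k analysis totals 34.56 + 784.52 = 819.08 s. The LVC share matters because LVC is computed once and reused when the response genes change, so it indicates how much of a run would not need to be repeated.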