Benchmarking the performance of stMLnet. (A) Overview of the benchmark framework, which comprises three components: input, methods, and benchmark metrics. We benchmark the performance of stMLnet and seven other representative CCC inference methods on ST data sets. Specifically, CellChatV2, COMMOT, CytoSignal, CytoTalk, MISTy, NicheNet, Scriabin, and stMLnet can all infer intercellular ligand–receptor (LR) interactions, whereas COMMOT, CytoTalk, MISTy, NicheNet, and stMLnet can further infer LR-TG regulation, namely, intracellular signaling. We used DLRC to evaluate intercellular LR inference and AUPRC to evaluate the accuracy of intracellular target gene predictions. (B) Evaluation and comparison of the eight LR inference methods based on the DLRC metric. Rank indicates the performance ranking of a method on a data set according to its DLRC value (i.e., score). The higher the DLRC value, the lower the rank number. (C) Evaluation and comparison of the five LR-TG inference methods using the AUPRC metric. The statistical significance of the difference in AUPRC values between stMLnet and each of the other methods was assessed using the Wilcoxon rank-sum test. (D) Comprehensive comparison of stMLnet with the other methods in terms of both intercellular and intracellular communication. (E) Summary of properties/functionalities and benchmarking results for the eight CCC inference methods. (a) The names of the CCC methods. (b) The properties/functionalities of each method, including its use of spatial information and its ability to predict intercellular LR interactions, intracellular LR-TG regulation, multilayer network structures, and feedback loops. (c) Accuracy and overall ranking of the methods according to the two metrics. (NA) Not applicable. The lighter the color, the higher the ranking, indicating better method performance.
