Characterizing spatial gene expression heterogeneity in spatially resolved single-cell transcriptomic data with nonuniform cellular densities

Recent technological advances have enabled spatially resolved measurements of expression profiles for hundreds to thousands of genes in fixed tissues at single-cell resolution. However, scalable computational analysis methods able to take into consideration the inherent 3D spatial organization of cell types and nonuniform cellular densities within tissues are still lacking. To address this, we developed MERINGUE, a computational framework based on spatial autocorrelation and cross-correlation analysis to identify genes with spatially heterogeneous expression patterns, infer putative cell–cell communication, and perform spatially informed cell clustering in 2D and 3D in a density-agnostic manner using spatially resolved transcriptomic data. We applied MERINGUE to a variety of spatially resolved transcriptomic data sets including multiplexed error-robust fluorescence in situ hybridization (MERFISH), spatial transcriptomics, Slide-seq, and aligned in situ hybridization (ISH) data. We anticipate that such statistical analysis of spatially resolved transcriptomic data will facilitate our understanding of the interplay between cell state and spatial organization in tissue development and disease.


MERINGUE analysis of Spatial Transcriptomics of the mouse olfactory bulb (MOB)
Data was downloaded from https://www.spatialresearch.org/ for the MOB Replicate 11 sample that was described in the original publication (Ståhl et al. 2016).
Beginning with 262 probe spots and 15928 genes, we filtered out spots with fewer than 100 genes detected and filtered out genes with fewer than 100 reads across all spots resulting in 260 probe spots and 7365 genes. Counts were normalized to counts per million (CPM) values for downstream analysis.
For the expression-based analysis, we performed principal components analysis (PCA) on the performed on the log10 transform CPM values with pseudo count of 1. To identify transcriptional clusters, we performed Louvain clustering on a k-nearest neighbor graph constructed from the top 5 PCs with k=30. We manually annotated identified clusters as different cell layers to best correspond to the original publication (Ståhl et al. 2016). We visualized the results using t-stochastic neighbor embedding (tSNE) on the top 5 PCs.
For differential expression analysis, we use ANOVA testing to identify genes with significant expression variability across the annotated cell layers (adjusted p-value < 0.05).
For the spatial analysis, we filtered out adjacency relationships beyond a Euclidean distance of 2 (corresponding approximately 300 microns) to minimize long-range adjacency relationships towards the edge of the sample. We restricted to significantly (adjusted p-value < 0.05) spatially heterogeneous genes driven by more than 5% of spots. We grouped these significantly spatially variable genes into primary spatial patterns using hierarchical clustering with the 'ward.D' approach and deepSplit=4 parameter for dynamic tree cutting using the hybrid approach (Langfelder et al. 2008). For visualization, we interpolated expression patterns with a binSize of 50.
For the analysis of olfactory receptor (Olfr) genes, we identified summing the counts of all Olfr genes detected in each spot and assessed the summed expression for spatial heterogeneity.

MERINGUE analysis of aligned In Situ hybridization (ISH) of the Drosophila melanogaster embryo
Data was downloaded from DVEX (Karaiskos et al. 2017) at https://shiny.mdc-berlin.de/DVEX/ for the continuous in situ expression matrix from the BDTNP (Fowlkes et al. 2008).
We analyzed the in situ expression of 84 genes across the 3039 aligned stage 6 Drosophila melanogaster embryonic cell locations. For the spatial analysis, we used the 3D x, y, z coordinates and filtered out adjacency relationships beyond a 3D Euclidean distance of 10 to minimize long-range adjacency relationships towards the edge of the sample. We restricted to significantly (adjusted p-value < 0.05) spatially variable genes driven by more than 5% of locations. We grouped these significantly spatially variable genes into primary spatial patterns using hierarchical clustering with the 'ward.D' approach and deepSplit=2 parameter for dynamic tree cutting. For visualization, we interpolated expression to the whole embryo with a binSize of 50.
For expression-based clustering, to recapitulate results from (Karaiskos et al. 2017), we performed Louvain clustering on a k-nearest neighbor graph constructed from the top 20 PCs with k=150. For spatially aware clustering, we apply Louvain clustering to this same k-nearest neighbor graph but with edge weights using the same 3D adjacency weight matrix as the spatial analysis with = = 0.01.

MERINGUE analysis of multi-section Spatial Transcriptomics of breast cancer tissue
Data was downloaded from https://www.spatialresearch.org/ for the 4 breast cancer sections that was described in the original publication (Ståhl et al. 2016).
Beginning with 4 breast cancer tissue sections, we concatenated expression matrices based on genes detected in all tissue layers, resulting in 13360 genes across 1031 probe spots. We further filtered out spots with fewer than 100 genes detected and filtered out genes with fewer than 100 reads across all spots resulting in 6214 genes across 1029 probe spots. Counts were normalized to counts per million (CPM) values for downstream analysis.
For each individual breast cancer tissue layer, we first performed spatial analysis by constructing an adjacency weight matrix based on probe spots within the individual layer and filtered out adjacency relationships beyond a Euclidean distance of 2. We restricted to significantly (adjusted p-value < 0.05) spatially variable genes driven by more than 5% of spots in that tissue layer.
To identify patterns consistent across tissue layers, we manually rotated and centered spatial spot coordinates to enable the construction of a cross-layer adjacency weight matrix based on the k mutual nearest neighbors of probe spots across aligned layers with K = 3. We then identified significantly (adjusted p-value < 0.05) spatially variable genes driven by more than 1.25% of all spots.

MERINGUE analysis of SlideSeq data of the mouse cerebellum
Data was downloaded for Puck_180819_12 that was described in the original publication (Rodriques et al. 2019).
We focus our analysis on beads within the Purkinje layer of the mouse cerebellum, which was annotated as AnalogizerClusterAssignmentsOriginal=6 in the original publication, resulting in 1589 spots. We further filtered out beads with fewer than 100 genes detected and filtered out genes with fewer than 100 reads across all beads resulting in 191 genes across 883 beads. Counts were normalized to counts per million (CPM) values for downstream analysis.
For the spatial analysis, we filtered out adjacency relationships beyond a Euclidean distance of 250. We restricted to significantly (adjusted p-value < 0.05) spatially variable genes driven by more than 5% of beads.
For the inference of cell-cell communication, we focus on known receptor-ligand pairs (Ramilowski et al. 2015). From the adjacency matrix, we identify beads spatially adjacent to the Purkinje layer beads. We restrict to testing receptor and ligand pairs that are detected (> 0 expression) in at least 30 beads in the Purkinje and neighbor beads respectively and vice versa. Observed iSCI were evaluated for statistical significance using adaptive permutation testing for 100, 1000, and 10000 permutations. We applied the Benjamini-Hochberg procedure for multiple testing correction to account for all receptor and ligand pairs tested.

MERINGUE analysis of MERFISH data of the mouse hypothalamic preoptic region
Data was downloaded from the original publication (Moffitt et al. 2018).
For the spatial analysis, we focused on each cell-type and subtype as annotated in the original publication. Expression levels were normalized per cell by the imaged volume of each cell per the originally published analysis (Moffitt et al. 2018). For each individual tissue layer in each animal, we first performed spatial analysis on cell-types and subtypes that are present with more than 30 cells, constructed an adjacency weight matrix, and filtered out adjacency relationships beyond a Euclidean distance of 250 to minimize long-range adjacency relationships towards the edge of the sample. Due to concerns of subtype annotation errors potentially driving spatial patterns, we require significantly spatial variable genes to be driven by more than 10% of cells. In this manner, if certain subtypes contain a few misannotated cells that are truly from another transcriptionally and spatially distinct subtype, we would not mis-interpret the mis-annotation as spatial heterogeneity. Further, due to concerns of spatially patterned misidentification, we further require cells driving identified spatially heterogeneous genes to express the genes at a minimum expression magnitude calculated from the 95 th percentile of blank control misidentification rate. In this manner, low expression from spatially patterned misidentification would not be mis-interpreted as spatial heterogeneity. In addition, we required patterns to be significantly consistent across tissue sections. To identify patterns consistent across tissue sections, for each animal, we ordered tissue sections by the annotated Bregma position per the original publication. We centered the x and ypositions to approximately align tissue sections. We then constructed a cross-layer adjacency weight matrix based on the k mutual nearest neighbors of cells across aligned layers with k = 3. Finally, we considered a gene as spatially heterogeneous if it was identified to exhibit significant spatial heterogeneity both within individual tissue sections and across tissue sections in at least 25% of evaluated animals.
To identify potential sexually dimorphic differences in gene expression patterns, visually evaluated identified spatially heterogeneous genes to identify Nos1 expressing in Inhibitory neuron I-11 as a potential candidate. To test for statistical differences, for each animal, we quantified the fraction of Nos1+ I-11 neurons by comparing I-11 neurons expressing Nos1 versus all I-11 neurons. We defined I-11 neurons expressing Nos1 as those with Nos1 detected at above the 95-percentile blank expression rate = 0.1. We tested for differences between the fraction of Nos1+ I-11 neurons between female and male naïve mice for a representative tissue section with the most abundant I-11 neurons using a Student's t-test. Likewise, we systematically evaluated genes for spatial heterogeneity separately for naïve male and female mice using the same standards above and identified Tacr1 expression in Excitatory neuron E-8 as being significantly spatially heterogeneous in male but not female naïve animals. We used LISA to quantify the scale of spatial heterogeneity for Tacr1. We tested for differences between the scale of Tacr1 spatial heterogeneity in E-8 neurons between female and male naïve mice for a representative tissue section with the most abundant E-8 neurons using a Student's t-test.
For expression-based clustering, we focus on only inhibitory neurons in animal FN7, tissue section brain_pos=9. We restricted analysis to neuronal subtypes that are present with more than 100 cells. To recapitulate results from (Moffitt et al. 2018), we performed Infomap clustering on a k-nearest neighbor graph constructed from the top 50 PCs with k=50. For spatially informed clustering, due to bilateral symmetry in the preoptic region, we constructed the adjacency matrix W on transformed xpositions where ! = ( − ) in order to accommodate the known symmetry about the y-axis. We again applied Infomap clustering to this same k-nearest neighbor graph but with edge weights using the adjacency matrix W on transformed x-positions with = = 1. Our spatially informed clustering identified additional clusters splitting Inhibitory neurons subtype I-11 and I-2. To generalize the spatially aware clustering to all animals, we trained a linear discriminant classifier to map from gene expression to the identified spatially aware clustering annotations for I-11 and I-2 separately and applied the classifier to I-11 and I-2 in all animals separately. This analysis was repeated for excitatory neurons. To test for differential activation between I-2 subtypes identified by spatially informed clustering, we evaluated the fraction of activated neurons for each I-2 subtype. We define activated neurons as those with cFos expression > 0.1.
For the inference of cell-cell communication, we specifically focus testing for inter-cell-type spatial cross-correlation between Cyp19a1 with Esr1 or Ar. For each tissue layers in each animal, we restricted analysis to neuronal subtypes that are present with more than 100 cells. We construct an adjacency weight matrix based on all annotated cells within the tissue layer, excluding cells annotated as ambiguous from the original publication.

Comparing MERINGUE to previously published spatial analysis methods
We compare MERINGUE to previously published spatial analysis methods SpatialDE v1.1.3, Trendsceek v1.0.0, and SPARK v1.1.0. Trendsceek and SPARK were run from within R while SpatialDE was run using R's reticulate package to interface with Python.
To compare each method's results on the MOB dataset, we restricted to the same filtered set of 260 cells and 7365 genes. For MERINGUE and SpatialDE, CPM normalized expression values were used. For SPARK, counts were used as SPARK performs internal library size normalization. Trendsceek could not complete in a reasonable amount of time and was omitted from comparison. To evaluate the overlap between identified significantly spatially heterogenenous genes, MERINGUE results were filtered for genes with adjusted p-value < 0.05 driven by > 5% of spots, SpatialDE results were filtered for genes with a adjusted p-value threshold < 0.05, and SPARK results were filtered with an adjusted pvalue < 0.05 combined across all default evaluated kernels.
To compare the runtime and memory usage of each method in identifying significantly heterogeneous genes, we used the 10X Visium dataset of the mouse coronal brain section downloaded from https://www.10xgenomics.com/products/spatial-gene-expression. To evaluate runtime as a function of spots, we downsampled the data to a constant set of 1000 genes and varied the number of spots from 125, 250, 500, to 1000 and evaluated the runtime of each method. Likewise, to evaluate runtime as a function of genes, we downsampled the data to a constant set of 1000 spots and varied the number of genes from 125, 250, 500, to 1000 and evaluated the runtime of each method. We evaluated memory usage as the amount of memory allocated and then subsequently released after the running lines corresponding to each method, excluding the amount of memory necessary to create or store the positional and gene expression matrices. For Trendsceek, we limited to 100 permutations and only evaluated performance for the two smallest datasets due to runtime constraints. For SPARK and Trendsceek, we limited computation to a single core for comparison purposes. SpatialDE was omitted from comparison due to concerns that R-to-Python conversion differences introduced non-comparable runtime and memory usage.

Comparing the implementation of Moran's I in MERINGUE with spdep
spdep (Bivand et al, 2013) also has an implementation of Moran's I in R. We compared the implementation of Moran's I in the spdep library (v1.1-7) to the implementation in MERINGUE in evaluating 2000 genes across 260 spots in the MOB dataset. We provide identical adjacency matrices to confirm that both implementations yielded equivalent I statistics (Spearman correlation R=1). We also evaluated 2000 genes across 260 spots in the MOB dataset to compare runtime per gene. Moran's I in MERINGUE is written in C++ whereas the spdep implementation is written in native R.

Evaluating robustness to changes in cellular density
To evaluate robustness to changes in cellular density, we used the MOB dataset's spatial coordinates, centering and raising 1.1 to the power of the x-coordinates in order to induce non-uniform spot densities. We then compared each method's resulting -log10(adjusted p-values) or -log10(combined adjusted p-values) where appropriate under the uniform and non-uniform density cases.

Potential limitations of MERINGUE
MERINGUE builds on Moran's I, which others have noted may result in inflated P-values, which could lead to false positives (Sun et al. 2020). However, we find MERINGUE's additional filtering criterions using LISA to restrict to spatial patterns driven by a certain proportion of cells can minimize false positives (Supplemental Fig. 2C). Furthermore, conservative P-value corrections scores can help minimize such false positives.
MERINGUE requires that gene expression magnitudes within an individual spatially resolved transcriptomics dataset have been either collected in a single batch or have been corrected for batch effects. Therefore, expression-level batch correction may be applied prior to analysis with MERINGUE to ensure that identified spatial patterns are not driven by batch (Johnson et al. 2007).

B. Supplementary Figures
Supplemental Figure S1. Challenges imposed by non-homogenous cell-densities in spatial analysis. We simulate cells in space for illustrative purposes. Each point is a cell. Red indicates high expression, while blue indicates low expression of a simulated gene. A. An example of random gene expression with non-uniform cell density. Cells are denser in the top right corner. Gaussian randomly distributed gene expression is simulated across cells. Although gene expression is randomly distributed across all cells, higher gene expression is observed by chance in the denser region. Methods that fail to take into consideration differences in cell density may mistake such density-confounded random gene expression as spatial heterogeneity due to density aggregation. B. An example of patterned gene expression confounded by cell density. Figure S2. Parameter and design justifications. A. Stability of Voronoi-based neighbor inference compared to k-nearest-neighbors-based neighbor inference. We simulate cells in space for illustrative purposes. Each point is a cell. Red lines connect the cells if they are inferred to be neighbors. k-nearest neighbor approach for inferring neighbors for k=3 (left) and k=6 (middle). Cells are considered neighbors if they are within the 3 spatially closest cells based on Euclidean distance. Voronoi-based approach for inferring neighbors is parameter free (right). Cells are considered neighbors if their Voronoitessellation share an edge. B. Correspondence between moment-based and permutation-based p-values in assessing significance of the Moran's I statistic. Each point is a gene. Red line denotes linear fit. P-values from the moment-based and permutation-based approach to assessing the significance of the Moran's I statistic is highly correlated on both the linear (left) and log scale (right), though log-scale p-values from the permutation-based approach for assessing the significance of the Moran's I statistic plateaus at 4 due to limitations of permutation as only 10 4 permutations were generated. C. Histogram of the percentage of probe spots driving spatial patterns for randomly permuted gene expression. Red line denotes 5% threshold. Supplemental Figure S4. Analysis of SlideSeq of the mouse cerebellum. A. Neighborhood adjacency relationship between beads that correspond to the Purkinje layer. B. Spatial expression pattern of one identified significantly spatially variable gene Aldoc.

Supplemental Figure S3. Additional analysis of Spatial Transcriptomics (ST) data of the mouse olfactory bulb (MOB). A. Expression-based clustering analysis. Each point is a spot colored based on
Supplemental Figure S5. Comparison of MERINGUE to previously published methods for analyzing spatial transcriptomics data using ST data of the MOB. Significance of spatial heterogeneity in terms of -log10(adjusted p-value) across evaluated genes for A. MERINGUE versus SpatialDE and B. MERINGUE versus SPARK. C. Venn diagram of genes identified as significantly spatially heterogeneous by MERINGUE, SpatialDE, and SPARK. Runtime as a function of D. number of cells and E. number of genes across spatial transcriptomics analysis methods. Memory usage, defined as the amount of memory allocated and then subsequently released after the running lines corresponding to each spatial transcriptomics analysis method in R, as a function of F. number of cells and G. number of genes.   Figure S13. Simulation of inter-cell-type spatial cross correlation. A. Three simulated cell-types in space represented as different shapes (left). We focus on communications between cell-type A (blue squares) and cell-type B (red circles). Lines are drawn between cells if they are neighbors in space and are represent both our cell-types of interest. B. The corresponding adjacency weight matrix W to (A). W has a value of 1 is two cells are neighbors and represent both our cell-types of interest. W has a value of 0 if two cells are not neighbors or are neighbors with the same cell-type. C. Simulated expression of a receptor and corresponding ligand. Receptor expression is depicted on the x-axis and ligand expression on the y-axis. Cell-type A cells express the ligand highly but not the receptor while cell-type B cells express the receptor highly but not the ligand. Correlation between the receptor and ligand gene expression across cells is therefore negative, as depicted by the negative correlation line in red. We simulate high correlation between receptor expression in cell-type B cells and ligand expression in spatially neighboring cell-type A cells. D. Distribution of the inter-cell-type spatial cross-correlation index (iSCI) for the ligand and receptor expression in (C) for randomly permuted cell labels provides a null distribution. The observed iSCI for the ligand and receptor expression between cell-type A and B is shown as a red horizontal line. The permutation p-value is derived by comparing the observed iSCI to the null distribution. In this case, the observed iSCI is much higher than what we expect by chance, giving it a significant p-value. E. Distribution of the inter-cell-type spatial cross-correlation index (iSCI) for random normally distributed gene expressions. The distribution of resulting p-values are uniformly distributed with the expected ~5% of the computed iSCI to reach significance based on a p-value < 0.05 without multiple-testing correction, suggesting a type I error rate of 5% or a 5% probability of incorrectly rejecting the true null hypothesis that the genes are not spatially cross-correlated across cell-types.