The spatialHeatmap Shiny App is an interactive implementation of most of the functionality in the spatialHeatmap software, which specializes in visualizing spatial bulk and single-cell assays in anatomical images. This user manual introduces the most important features and basic operations of the App.

1 Datasets

This tab is designed for selecting pre-configured datasets or uploading custom datasets.
Quick start: Select a pre-configured dataset (Figure 1.6) and click “Spatial Heatmap” (then select genes in the table to see spatial heatmaps).

Figure 1.1-3: Go to Figure 1.1-3 (top red line: current selected tab) to select/upload a dataset, and click “Spatial Heatmap” to see spatial heatmaps.
Figure 1.4-5: Upload custom datasets. Details of each portal are given in the respective tooltips. To format custom bulk data, please refer to instructions available here. For formatting both bulk and single-cell data, instructions are provided here. To format anatomical images, guidelines are provided here.
Figure 1.6: Instead of uploading custom datasets, select pre-configured datasets.


Figure 1: Page for selecting datasets

2 Spatial Heatmap

The spatial heatmap functionality is designed for coloring spatial features (e.g. tissues) annotated in SVG images (aSVGs) based on the quantitative abundance levels of biomolecules (e.g. mRNAs) using a color key. The resulting plot is called a spatial heatmap (SHM). This tab includes different output forms of SHMs.

2.1 Static image

This tab displays SHMs in the form of static images.

Quick start: Select genes (Figure 2.10) in the table and click “Plot” (Figure 2.11).

Figure 2.1-2: Go to tabs displaying SHMs in form of static images.

Customize SHMs using the settings in Figure 2.3-7:

  • Figure 2.3: The most frequently used (basic) settings.
  • Figure 2.4: Set selected spatial features transparent.
  • Figure 2.5: Add heat colors and numeric values as secondary legend at the bottom of each spatial heatmap.
  • Figure 2.6: Adjust the outlines of spatial features (widths, colors).
  • Figure 2.7: See respective pop-up tooltips.

Figure 2: Settings for SHMs in form of static images

Figure 2.8-9: The input assay data.
Figure 2.10-12: Select genes (Figure 2.10), click the button (Figure 2.11), then spatial heatmaps will be created (Figure 2.12).

Figure 2.13-15: In the Experiment design (Figure 2.14), if a “reference” column that contains strings of comma-separated experiment variables is included (Figure 2.15), relative expression levels can be used by choosing “Yes” in Figure 2.13. For example, choosing “Yes” in Figure 2.13 computes relative expression levels in brain under “DBA.2J” based on “C57BL” and “CD1” respectively.
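
The idea of a comma-separated “reference” column can be sketched in a few lines of Python. This is only an illustration, not the App's implementation: the manual does not state the formula, so the sketch assumes log2-scale values and takes relative expression as the difference between a variable and each of its references. The function name, the dictionary layout, the `_VS_` naming, and the numbers are all hypothetical.

```python
def relative_expression(expr, design):
    """Compute target-minus-reference values for one gene.

    expr:   {(feature, variable): log2 expression value}  (assumed layout)
    design: experiment-design rows with a comma-separated 'reference' string
    """
    relative = {}
    for row in design:
        # Split the comma-separated reference variables; empty strings mean "no reference".
        refs = [r.strip() for r in row["reference"].split(",") if r.strip()]
        target = expr[(row["feature"], row["variable"])]
        for ref in refs:
            # Hypothetical naming scheme for the resulting comparison.
            key = (row["feature"], f'{row["variable"]}_VS_{ref}')
            relative[key] = target - expr[(row["feature"], ref)]
    return relative


# Illustrative values only (not taken from any real dataset).
expr = {("brain", "DBA.2J"): 7.5, ("brain", "C57BL"): 6.0, ("brain", "CD1"): 8.0}
design = [
    {"feature": "brain", "variable": "DBA.2J", "reference": "C57BL,CD1"},
    {"feature": "brain", "variable": "C57BL", "reference": ""},
    {"feature": "brain", "variable": "CD1", "reference": ""},
]
rel = relative_expression(expr, design)
print(rel)
```

Here only the “DBA.2J” row carries references, so two relative values are produced for brain, one per reference variable.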

2.2 Interactive image


Figure 3: Settings for SHMs in form of interactive images

Figure 3.1-3: Go to tabs displaying SHMs in form of interactive images (top red line: current selected tab).
Figure 3.4: Click the “Run” button to show the interactive images.
Figure 3.5: Click the “Play” button to show images sequentially.

2.3 Video


Figure 4: Settings for SHMs in form of videos

Figure 4.1-3: Go to tabs displaying SHMs in form of videos (top red line: current selected tab).
Figure 4.4: Click the “Run” button to show the videos.

3 Spatial Enrichment

The spatial enrichment module identifies spatially enriched or depleted genes that are significantly up- or down-regulated in one feature (e.g. tissue) relative to reference features, and their abundance values are visualized as enrichment SHMs. Similarly, genes enriched or depleted in one experimental variable (e.g. treatment) relative to reference variables can also be detected and visualized.

Quick start: Click “Run” (Figure 5.2.1) to perform spatial enrichment for each selected spatial feature (Figure 5.2.3), select a query feature (Figure 5.3.1) to get the corresponding results (Figure 5.4.2), then click Figure 5.4.1 to create enrichment SHMs.

Figure 5.1: Go to the tab displaying spatial enrichment (top red line: current selected tab).
Figure 5.2: Perform spatial enrichment according to the settings:

  • The input data are pre-processed: genes with expression values over a cutoff (Figure 5.2.2A) across at least a proportion (Figure 5.2.2P) of samples and a coefficient of variation (CV) within a range (Figure 5.2.2CV1, CV2) are retained. Then the assay data are normalized.

  • Spatial features and experimental variables (e.g. treatments) are listed in Figure 5.2.3 and Figure 5.2.4 respectively. Only those chosen will be considered for spatial enrichment. If the comparison (Figure 5.2.5) is across spatial features, variables under the same spatial feature will be treated as replicates, and vice versa.

  • The stringency of spatial enrichment can be relaxed by allowing a number of outliers (Figure 5.2.6) in reference features. The methods (Figure 5.2.7) for spatial enrichment include differential expression analysis tools of edgeR (McCarthy et al. 2012), limma (Ritchie et al. 2015), DESeq2 (Love, Huber, and Anders 2014), and distinct (Tiberi et al., n.d.). The top up- or down-regulated genes can be selected by log2-fold change (e.g. \(\geq\) 1) and FDR (Figure 5.2.8, e.g. \(\leq\) 0.05).

  • By clicking “Run” (Figure 5.2.1), all-against-all comparisons will be performed according to these settings.
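
The pre-processing filter described in the first bullet can be sketched as follows. This is a minimal sketch rather than the App's code: the function and parameter names are hypothetical, “over a cutoff” is read as greater than or equal to, and CV is computed as the sample standard deviation divided by the mean.

```python
from statistics import mean, stdev


def filter_genes(assay, cutoff, proportion, cv_min, cv_max):
    """Keep genes expressed at >= cutoff in >= proportion of samples
    and with a coefficient of variation inside [cv_min, cv_max]."""
    kept = {}
    for gene, values in assay.items():
        # Fraction of samples in which the gene reaches the cutoff.
        frac = sum(v >= cutoff for v in values) / len(values)
        m = mean(values)
        # CV = standard deviation / mean; guard against a zero mean.
        cv = stdev(values) / m if m != 0 else 0.0
        if frac >= proportion and cv_min <= cv <= cv_max:
            kept[gene] = values
    return kept


# Illustrative values only: rows are genes, columns are samples.
assay = {
    "geneA": [10, 12, 30, 2],   # expressed in most samples, variable
    "geneB": [5, 5, 5, 5],      # expressed but constant (CV = 0)
    "geneC": [0, 1, 0, 20],     # expressed in too few samples
}
kept = filter_genes(assay, cutoff=5, proportion=0.5, cv_min=0.2, cv_max=2.0)
print(list(kept))
```

With these settings only `geneA` survives: `geneB` fails the CV range and `geneC` fails the proportion requirement.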


Figure 5: Spatial enrichment

Figure 5.3: Query the results

  • The enrichment results among spatial features (Figure 5.2.3) are compared in three types of plots. The results can be queried by choosing a feature in Figure 5.3.1, where all selected features (Figure 5.2.3) are listed, while the other features in Figure 5.3.1 will be regarded as references (Figure 5.3.2). Note, variables (Figure 5.2.4) are treated as replicates (Figure 5.3.4).

Figure 5.4: Results of the query feature

  • The enrichment results of the query feature (Figure 5.3.1) are displayed in a table (Figure 5.4.2). By clicking “Enrichment SHMs” (Figure 5.4.1), the table will be sent to the “Spatial Enrichment” tab for visualization.
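
How a query feature, its references, the log2-fold change and FDR cutoffs, and the allowed number of outliers might interact can be sketched as below. This is an assumption-laden illustration, not the App's algorithm: it presumes that pairwise differential expression results (log2-fold change and FDR per gene) are already available from one of the tools listed above, and it treats a gene as enriched when it passes both cutoffs against all references except at most the allowed number of outliers. All names and values are hypothetical.

```python
def enriched_genes(pairwise, query, references, lfc=1.0, fdr=0.05, outliers=0):
    """Return genes up-regulated in `query` relative to the reference features.

    pairwise: {(query, reference): {gene: (log2_fold_change, fdr)}}
    """
    # Only consider genes tested in every query-vs-reference comparison.
    genes = set.intersection(*(set(pairwise[(query, r)]) for r in references))
    hits = []
    for gene in sorted(genes):
        passed = sum(
            1
            for r in references
            if pairwise[(query, r)][gene][0] >= lfc
            and pairwise[(query, r)][gene][1] <= fdr
        )
        # Allow up to `outliers` references in which the gene fails the cutoffs.
        if passed >= len(references) - outliers:
            hits.append(gene)
    return hits


# Illustrative pairwise results for a query feature against two references.
pairwise = {
    ("brain", "liver"):  {"g1": (2.0, 0.01), "g2": (1.5, 0.03), "g3": (0.2, 0.80)},
    ("brain", "kidney"): {"g1": (1.2, 0.04), "g2": (0.5, 0.30), "g3": (0.1, 0.90)},
}
strict = enriched_genes(pairwise, "brain", ["liver", "kidney"])
relaxed = enriched_genes(pairwise, "brain", ["liver", "kidney"], outliers=1)
print(strict, relaxed)
```

Raising `outliers` from 0 to 1 relaxes the stringency, so a gene that fails against a single reference is still reported.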

4 Data mining

Although SHMs are powerful for visualization, only a few genes can be plotted simultaneously as each requires an individual plot. To overcome this limitation and support analysis routines involving a large number of genes, the Shiny App integrates functionalities for large-scale data mining, including hierarchical clustering, K-means clustering, and network analysis (Figure 6).

Quick start: Click “Run” (Figure 6.3.2) to identify the cluster containing the query gene (Figure 6.2.1) chosen from SHMs.

Figure 6.1: Go to the tab displaying the data mining interface.
Figure 6.2: Step1: To obtain genes showing expression similarity with a query gene chosen from SHMs (Figure 6.2.1), the complete assay data can be subsetted using a similarity measure (Figure 6.2.2) and a cutoff (Figure 6.2.3). The subsetted matrix will be passed to Step2. If no subsetting is applied, the whole matrix will be used in Step2.
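
Step1 can be pictured with a short sketch. The manual does not say which similarity measures are offered in Figure 6.2.2, so Pearson correlation is used here purely as an assumption, and the function names and expression values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long profiles."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    # A constant profile has no defined correlation; treat it as 0.
    return cov / (sx * sy) if sx and sy else 0.0


def subset_by_similarity(assay, query, cutoff):
    """Keep genes whose correlation with the query gene is >= cutoff."""
    q = assay[query]
    return {g: v for g, v in assay.items() if pearson(q, v) >= cutoff}


# Illustrative expression profiles across four samples.
assay = {
    "query": [1, 2, 3, 4],
    "up":    [2, 4, 6, 9],   # follows the query closely
    "down":  [4, 3, 2, 1],   # anti-correlated
    "flat":  [5, 5, 5, 5],   # constant
}
subset = subset_by_similarity(assay, "query", cutoff=0.9)
print(list(subset))
```

The subsetted matrix keeps the query itself plus the genes that resemble it, which is what would be handed on to the next step.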


Figure 6: Large-scale data mining downstream of spatial heatmaps

Figure 6.3: Step2: Select a method (Figure 6.3.1) and click “Run” (Figure 6.3.2), then a cluster or network module showing highly similar expression patterns with the query will be identified in the subsetted matrix from step1, and the results will be shown in Figure 6.3.3A-C respectively.
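
As a rough illustration of Step2 with hierarchical clustering, the sketch below groups expression profiles and then looks up the cluster that contains the query. It is not the App's implementation: average linkage, Euclidean distance, and stopping at a fixed number of clusters are assumptions, and all names and values are hypothetical.

```python
from math import dist


def hierarchical_clusters(profiles, k):
    """Average-linkage agglomerative clustering, stopped at k clusters."""
    clusters = [[name] for name in profiles]

    def linkage(a, b):
        # Mean pairwise Euclidean distance between two clusters.
        total = sum(dist(profiles[i], profiles[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > k:
        # Find the closest pair of clusters and merge them.
        pairs = [
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        ]
        _, i, j = min(pairs)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def cluster_of(clusters, query):
    """Return the cluster that contains the query gene."""
    return next(c for c in clusters if query in c)


# Illustrative profiles: three genes resemble the query, two do not.
profiles = {
    "query": (1.0, 2.0, 3.0),
    "g1": (1.1, 2.1, 2.9),
    "g2": (9.0, 8.0, 7.0),
    "g3": (9.2, 7.9, 7.1),
    "g4": (1.0, 1.8, 3.2),
}
clusters = hierarchical_clusters(profiles, k=2)
query_cluster = cluster_of(clusters, "query")
print(query_cluster)
```

The returned cluster is the set of genes with expression patterns most similar to the query under these assumptions.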

Network analysis is performed with the WGCNA algorithm (Langfelder and Horvath 2008; Ravasz et al. 2002). The objective is to identify the network module containing the query, which can be visualized in the form of network graphs. See more details here.

Figure 6.4: Step3: Perform optional further network analysis on the cluster containing the query (Figure 6.3.3A-B) from step2. This tab is disabled until the cluster is shown (Figure 6.3.3A-B).

5 Co-visualization

The co-visualization module provides novel plotting functionalities designed to gain insights into the tissue-level organization of single-cell data or, vice versa, the cellular composition of tissues (Figure 7.9.5-9.6). It combines SHMs and embedding plots where matching tissues and cells are associated by identical colors. The coloring (Figure 7.9.3) of the single cells (dots) and tissue features can be based on quantitative values (heat coloring) or fixed group-based colors. Cell group labels are required for the cell-tissue matching. This includes support for existing cell annotations, marker gene-based methods, manual assignments, and co-clustering of bulk and single-cell data (Figure 8.7).

When using the first three methods, there are often differences in naming conventions between cell group labels and tissue labels, so the user interface for cell labels obtained by these methods uses a translation map to bridge the cell and tissue labels (Figure 7.7.2-7.4). By contrast, the co-clustering method directly groups cells using source tissue labels, so the cell groups and tissues already have programmatically identical labels. Due to this inherent alignment, the user interface for the co-clustering method is designed separately (Figure 8).

5.1 Interface for annotation (or other) labels

This user interface (Figure 7.1-7) is designed for cell group labels from existing cell annotations, marker gene-based methods, manual assignments, etc.

Quick start: Select “Annotation (or other) labels” (Figure 7.2), have an overview on the single-cell data (Figure 7.6), match cells and tissues (Figure 7.7), and click “Run” (Figure 7.7.5) to create co-visualization plots.

Figure 7.1: Go to the tab for co-visualization (red line: current selected tab).
Figure 7.2: Select the source of cell group labels. The options “Annotation (or other) labels” and “Co-clustering” will bring up the interfaces in Figure 7 and Figure 8 respectively.
Figure 7.3: In the “Cell-to-bulk” option, when choosing the “cell-by-group” coloring option in Figure 7.9.3, the heat colors will be derived from the single-cell data. Vice versa for the “Bulk-to-cell” option.
Figure 7.4-5: Go to Figure 7.4 to pre-process the bulk and single-cell assay data if needed, which will be provided in tables in Figure 7.5.
Figure 7.6: This tab is designed for exploring the single-cell data before going to Figure 7.7. The metadata (colData slot of SingleCellExperiment) are provided in Figure 7.6.4. In the embedding plot, single cells are colored according to the chosen group label in Figure 7.6.1. By selecting rows in Figure 7.6.4 and clicking Figure 7.6.2, the selected cells will be highlighted in the embedding plot.


Figure 7: Co-visualizing bulk and single-cell data using annotation (or other) labels

Figure 7.7: After gaining an understanding of the single-cell data in Figure 7.6, click Figure 7.7 to match cells and tissue features. By dragging (Figure 7.7.4) one or multiple spatial features (Figure 7.7.2) to the desired cell labels (Figure 7.7.3), the cell-tissue matching will be established for subsequent co-visualization. Then clicking “Run” (Figure 7.7.5) will turn the page to Spatial Heatmap automatically for co-visualization (Figure 7.9).
Figure 7.8: The source of cell group labels (Figure 7.2) and mapping direction (Figure 7.3) are shown in a box for tracking.
Figure 7.9.1-9.2: Go to the tabs/settings for co-visualization.
Figure 7.9.3: Select coloring options for co-visualization plots (Figure 7.9.5-9.6):

  • Cell-by-value: cells in Figure 7.9.5 and tissues in Figure 7.9.6 are colored independently according to expression values of a chosen gene in single-cell and bulk data respectively. This option provides the most detailed information in the plots.
  • Cell-by-group: expression values of a chosen gene in single-cell data are averaged by cell groups. The same heat color derived from the averaged value will be assigned to cells of the same group and matching tissues.
  • Feature-by-group: similar to “Cell-by-group” except that the averaged expression values are from the bulk data.
  • Fixed-by-group: cells of the same group and matching tissue features are assigned the same constant colors.
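
The “Cell-by-group” option above can be sketched as a group-wise average that is then shared between cells and their matched tissues. This is only a sketch under assumptions: the function name, the cell and tissue names, and the dictionary-based matching (standing in for the drag-and-drop matching in Figure 7.7) are hypothetical, and the values returned are the numbers that would be mapped to heat colors rather than colors themselves.

```python
from collections import defaultdict


def cell_by_group(cell_values, cell_groups, tissue_to_group):
    """Average one gene's expression per cell group and share the value
    between the cells of a group and the tissues matched to that group."""
    by_group = defaultdict(list)
    for cell, value in cell_values.items():
        by_group[cell_groups[cell]].append(value)
    group_mean = {g: sum(v) / len(v) for g, v in by_group.items()}

    # Every cell receives the mean of its group.
    cell_color_value = {c: group_mean[cell_groups[c]] for c in cell_values}
    # Every matched tissue receives the mean of the group it is matched to.
    tissue_color_value = {
        t: group_mean[g] for t, g in tissue_to_group.items() if g in group_mean
    }
    return cell_color_value, tissue_color_value


# Illustrative single-cell values for one gene and a hypothetical matching.
cell_values = {"c1": 2.0, "c2": 4.0, "c3": 10.0}
cell_groups = {"c1": "neuron", "c2": "neuron", "c3": "glia"}
tissue_to_group = {"cerebral.cortex": "neuron"}

cells, tissues = cell_by_group(cell_values, cell_groups, tissue_to_group)
print(cells, tissues)
```

“Feature-by-group” would follow the same pattern with the averages taken from the bulk data instead of the single-cell data.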

Figure 7.9.4-9.6: Single-cell and bulk data are visualized in an embedding plot (Figure 7.9.5) and an SHM (Figure 7.9.6) respectively. In Figure 7.9.5, grey dots represent cells not matched with any tissue feature (Figure 7.7.4). All cell group labels that are matched with tissue features (Figure 7.7.4) are listed in Figure 7.9.4, where options are provided to visualize all (default) or a single group in Figure 7.9.5.

5.2 Interface for co-clustering

This user interface is designed for co-clustering only.

Quick start: Get an overview of the co-clustering results (Figure 8.6), then click “Co-visualizing” (Figure 8.6.3) to create co-visualization plots.

Figure 8.1-3: These are the same as Figure 7.1-3. Select “Co-clustering labels” (Figure 8.2) to display the interface for co-clustering (Figure 8).
Figure 8.4-5: The co-clustering workflow (Figure 8.7, see below) is performed according to settings in Figure 8.4. The bulk and single-cell assay data are displayed in Figure 8.5.
Figure 8.6: Before co-visualization, go to this tab to see the co-clustering (see below) results in the form of an embedding plot and a table (Figure 8.6.6). The bulk labels assigned to cells and corresponding similarities (Spearman’s correlation coefficients) are shown in Figure 8.6.4 and Figure 8.6.5 respectively, where “none” denotes no assignments. All (default) or a chosen cluster can be selected (Figure 8.6.2) to show in the embedding plot. Selecting rows in Figure 8.6.6 and clicking Figure 8.6.1 will highlight corresponding cells/tissues in the embedding plot. Clicking Figure 8.6.3 will automatically turn the page to “Spatial Heatmap” for co-visualization, which is the same as Figure 7.9.


Figure 8: Co-visualizing bulk and single-cell data using co-clustering labels

Figure 8.7: Co-clustering illustration:

  • Although the co-clustering method (Figure 8.7) is generally applicable to various data modalities (transcriptome, proteome, metabolome, etc.), it is explained here using RNA-seq data. Initially, the raw count matrices of bulk and single cells are combined column-wise for joint normalization (Figure 8.7A1) using scater and scran (McCarthy et al. 2017; Lun, McCarthy, and Marioni 2016).

  • Figure 8.7A: Following separation from the single-cell data, the bulk data are filtered to retain genes whose expression values exceed a cutoff across a certain proportion of bulk samples and whose coefficient of variation (CV) falls within a range (CV1, CV2). The single-cell data, on the other hand, are filtered to include genes with robust expression (\(\geq\) cutoff) across a certain proportion of cells and cells with robust expression across a certain proportion of genes (Figure 8.7A2). Next, the bulk data are subsetted to include the same set of genes as the single-cell data to reduce sparsity in the latter and make these two types of data more comparable (Figure 8.7A3).

  • Figure 8.7B: In the subsequent step, the bulk and single-cell data are combined column-wise for joint embedding using a dimensionality reduction technique (PCA or UMAP).

  • Figure 8.7C: Co-clustering is then performed on the top joint dimensions. Specifically, a graph is built with methods (buildKNNGraph or buildSNNGraph) from scran where nodes are cells (or tissues) and edges are connections between nearest neighbors (Lun, McCarthy, and Marioni 2016), and subsequently this graph is partitioned with methods (cluster_walktrap, cluster_fast_greedy, or cluster_leading_eigen) from igraph to obtain clusters (Csardi and Nepusz 2006). Three types of clusters are shown: (i) multiple cells are co-clustered and assigned to one bulk tissue sample (Figure 8.7C1); (ii) multiple cells are co-clustered with several bulk tissues, and then assigned to a single bulk tissue with a nearest-neighbor approach (Figure 8.7C2), which is based on the Spearman’s correlation coefficient (similarity, Figure 8.6.5); and (iii) cells that do not co-cluster with any bulk tissue remain unassigned (Figure 8.7C3).

  • Figure 8.7D-E: After co-clustering, cells are labeled by bulk tissues or remain unlabeled (“none” in Figure 8.7D). Lastly, the obtained labels are used to match cells with tissues in embedding plots and SHMs, respectively (Figure 8.7E).
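
The label assignment that follows co-clustering (the three cluster types above) can be sketched as below. This is an illustration under assumptions, not the package's code: clusters are taken as given (in the workflow they come from scran and igraph), the profile vectors on which Spearman's correlation is computed are assumed to be comparable between cells and bulk samples, and all function names, sample names, and values are hypothetical.

```python
from math import sqrt


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's correlation = Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy) if sx and sy else 0.0


def assign_labels(clusters, profiles, bulk):
    """Label each cell with the most similar bulk sample in its cluster."""
    labels = {}
    for members in clusters:
        bulks = [m for m in members if m in bulk]
        for cell in (m for m in members if m not in bulk):
            if not bulks:
                # No bulk tissue in this cluster: the cell stays unassigned.
                labels[cell] = ("none", None)
                continue
            scored = [(spearman(profiles[cell], profiles[b]), b) for b in bulks]
            similarity, best = max(scored)
            labels[cell] = (best, similarity)
    return labels


# Illustrative profiles and clusters.
profiles = {
    "bulk_A": [1, 2, 3, 4], "bulk_B": [4, 3, 2, 1],
    "cell1": [1, 3, 2, 5], "cell2": [5, 4, 2, 1], "cell3": [2, 2, 2, 2],
}
clusters = [["bulk_A", "bulk_B", "cell1", "cell2"], ["cell3"]]
labels = assign_labels(clusters, profiles, bulk={"bulk_A", "bulk_B"})
print(labels)
```

In this toy input the first cluster contains several bulk samples, so each cell is resolved to its nearest bulk neighbor, while the cell in the second cluster has no bulk partner and is labeled “none”.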

References

Csardi, Gabor, and Tamas Nepusz. 2006. “The Igraph Software Package for Complex Network Research.” InterJournal Complex Systems: 1695. http://igraph.org.
Langfelder, Peter, and Steve Horvath. 2008. “WGCNA: An R Package for Weighted Correlation Network Analysis.” BMC Bioinformatics 9 (December): 559.
Love, Michael I., Wolfgang Huber, and Simon Anders. 2014. “Moderated Estimation of Fold Change and Dispersion for RNA-Seq Data with DESeq2.” Genome Biology 15: 550. https://doi.org/10.1186/s13059-014-0550-8.
Lun, Aaron T. L., Davis J. McCarthy, and John C. Marioni. 2016. “A Step-by-Step Workflow for Low-Level Analysis of Single-Cell RNA-Seq Data with Bioconductor.” F1000Res. 5: 2122. https://doi.org/10.12688/f1000research.9501.2.
McCarthy, Davis J., Kieran R. Campbell, Aaron T. L. Lun, and Quin F. Wills. 2017. “Scater: Pre-Processing, Quality Control, Normalisation and Visualisation of Single-Cell RNA-Seq Data in R.” Bioinformatics 33: 1179–86. https://doi.org/10.1093/bioinformatics/btw777.
McCarthy, Davis J., Yunshun Chen, and Gordon K. Smyth. 2012. “Differential Expression Analysis of Multifactor RNA-Seq Experiments with Respect to Biological Variation.” Nucleic Acids Research 40 (10): 4288–97.
Ravasz, E, A L Somera, D A Mongru, Z N Oltvai, and A L Barabási. 2002. “Hierarchical Organization of Modularity in Metabolic Networks.” Science 297 (5586): 1551–55.
Ritchie, Matthew E, Belinda Phipson, Di Wu, Yifang Hu, Charity W Law, Wei Shi, and Gordon K Smyth. 2015. “Limma Powers Differential Expression Analyses for RNA-sequencing and Microarray Studies.” Nucleic Acids Res. 43 (7): e47.
Tiberi, Simone, Helena L Crowell, Lukas M Weber, Pantelis Samartsidis, and Mark D Robinson. n.d. “Distinct: A Novel Approach to Differential Distribution Analyses.” bioRxiv.