There are two functions for visualizing DAGs. dag_graphviz() uses the DiagrammeR package to visualize small DAGs as HTML widgets.
dag_circular_viz() uses a circular layout for huge DAGs.
library(simona)
parents = c("a", "a", "b", "b", "c", "d")
children = c("b", "c", "c", "d", "e", "f")
dag_small = create_ontology_DAG(parents, children)
There are several graphical parameters for controlling nodes in the DAG.
color = sample(colors(), 6)
shape = c("polygon", "box", "oval", "egg", "diamond", "parallelogram")
dag_graphviz(dag_small, color = color, shape = shape)
The DOT code can be obtained with dag_as_DOT():
digraph {
graph [overlap = true]
node [fontname=Helvetical]
"a" [color="beige", style="solid", shape="polygon", fontsize=10, fontcolor="black"];
"b" [color="green1", style="solid", shape="box", fontsize=10, fontcolor="black"];
"c" [color="grey57", style="solid", shape="oval", fontsize=10, fontcolor="black"];
"d" [color="thistle1", style="solid", shape="egg", fontsize=10, fontcolor="black"];
"e" [color="lightgoldenrod3", style="solid", shape="diamond", fontsize=10, fontcolor="black"];
"f" [color="steelblue4", style="solid", shape="parallelogram", fontsize=10, fontcolor="black"];
edge [dir="back"]
"a" -> "b";
"a" -> "c";
"b" -> "c";
"b" -> "d";
"c" -> "e";
"d" -> "f";
}
You can paste the DOT code into http://magjac.com/graphviz-visual-editor/ to generate the graph.
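As a rough illustration of what the DOT text contains, the sketch below builds a minimal DOT string for the same six edges using base R only. It is not how dag_as_DOT() is implemented, and it omits the node attributes and the edge direction setting shown above. The commented-out last line assumes the DiagrammeR package is installed; its grViz() function accepts a DOT string and renders it locally.

```r
# Parent -> child edges of the small example DAG.
parents  = c("a", "a", "b", "b", "c", "d")
children = c("b", "c", "c", "d", "e", "f")

# One DOT edge statement per parent-child pair.
edge_lines = sprintf('  "%s" -> "%s";', parents, children)

# Wrap the edge statements in a digraph block.
dot = paste(c("digraph {", edge_lines, "}"), collapse = "\n")
cat(dot, "\n")

# Optional local rendering (assumes DiagrammeR is installed):
# DiagrammeR::grViz(dot)
```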
The following example visualizes all upstream terms of a GO term. Note that dag[, "GO:0010228"] returns a sub-DAG of all upstream terms of GO:0010228.
Edge attributes should be set as a named vector where names correspond to all relation types.
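A minimal sketch of such named vectors is given below. The relation types "is_a" and "part_of" and the chosen colors and line styles are assumptions for illustration; the actual relation types depend on the DAG object, and the names of the arguments that accept these vectors should be checked in the documentation of dag_graphviz().

```r
# Assumed relation types, for illustration only.
relation_types = c("is_a", "part_of")

# One value per relation type; the vector names are the relation types.
edge_color = c("is_a" = "purple", "part_of" = "darkgreen")
edge_style = c("is_a" = "solid", "part_of" = "dashed")

# Check that every relation type is covered before plotting.
stopifnot(setequal(names(edge_color), relation_types))
stopifnot(setequal(names(edge_style), relation_types))

# A named vector lets the attribute of each edge be looked up by its relation type.
edge_relations = c("is_a", "is_a", "part_of")
per_edge_color = edge_color[edge_relations]
```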
Visualizing huge DAGs is not easy because a term can have more than one parent. dag_circular_viz() addresses this with a circular layout.
In the circular layout, each circle corresponds to a specific depth (the maximal distance to the root). The distance from a circle to the center is proportional to the logarithm of the number of terms whose depth is equal to or less than the depth of that circle. On each circle, each term is assigned a sector, and its offspring terms are drawn only within that sector. The width of the sector is proportional to the number of offspring terms of the term. Dot size corresponds to the number of child terms. If colors are not set, the DAG is partitioned into sub-DAGs: taking the root term as level 0, all links between level-1 terms and level-0 terms are cut, and each resulting sub-DAG is assigned a different color. A legend of the top terms of the sub-DAGs is also added to the plot.
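The depth and radius rules can be sketched in base R on the small six-term DAG from earlier. This is only an illustration of the description above, not the code used by dag_circular_viz(); in particular, scaling the radii to the range 0 to 1 is an assumption.

```r
# Parent -> child edges of the small example DAG.
parents  = c("a", "a", "b", "b", "c", "d")
children = c("b", "c", "c", "d", "e", "f")
terms = unique(c(parents, children))

# Depth = maximal distance to the root. Repeatedly relax every edge;
# this is fine for a toy DAG but not meant for large ontologies.
depth = setNames(rep(0L, length(terms)), terms)
for (pass in seq_along(terms)) {
    for (i in seq_along(parents)) {
        depth[children[i]] = max(depth[children[i]], depth[parents[i]] + 1L)
    }
}

# Number of terms with depth equal to or less than each depth value.
n_cum = cumsum(table(depth))

# Circle radius proportional to the logarithm of that count
# (scaled so that the outermost circle has radius 1).
radius = log(n_cum) / log(max(n_cum))
print(depth)
print(radius)
```

Because the counts are cumulative, the radii always increase with depth, and the logarithm keeps the outer circles from being pushed too far apart when deep levels contain many terms.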
The term IDs are not informative in the plot. If additional information about the terms is stored in the meta data frame of the DAG object, the name of the column to use for the legend labels can be set with the legend_labels_from argument.
## id name
## GO:0000001 GO:0000001 mitochondrion inheritance
## GO:0000002 GO:0000002 mitochondrial genome maintenance
## GO:0000003 GO:0000003 reproduction
## GO:0000011 GO:0000011 vacuole inheritance
## GO:0000012 GO:0000012 single strand break repair
## GO:0000017 GO:0000017 alpha-glucoside transport
## definition
## GO:0000001 The distribution of mitochondria, including the mitochondrial genome, into daughter cells after mitosis or meiosis, mediated by interactions between mitochondria and the cytoskeleton.
## GO:0000002 The maintenance of the structure and integrity of the mitochondrial genome; includes replication and segregation of the mitochondrial chromosome.
## GO:0000003 The production of new individuals that contain some portion of genetic material inherited from one or more parent organisms.
## GO:0000011 The distribution of vacuoles into daughter cells after mitosis or meiosis, mediated by interactions between vacuoles and the cytoskeleton.
## GO:0000012 The repair of single strand breaks in DNA. Repair of such breaks is mediated by the same enzyme systems as are used in base excision repair.
## GO:0000017 The directed movement of alpha-glucosides into, out of or within a cell, or between cells, by means of some agent such as a transporter or pore. Alpha-glucosides are glycosides in which the sugar group is a glucose residue, and the anomeric carbon of the bond is in an alpha configuration.
dag_treelize() converts a DAG to a tree in which each term has only one parent. The circular visualization of the reduced tree is as follows:
tree = dag_treelize(dag)
dag_circular_viz(tree, legend_labels_from = "name", edge_transparency = 0.95)
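To make the idea of reducing a DAG to a tree concrete, here is a base R sketch on the small six-term DAG. It uses one simple rule, keeping for each term the parent with the greatest depth, which is an assumption for illustration; dag_treelize() may select parents differently. The depth values are written out by hand for this toy DAG.

```r
# Parent -> child edges of the small example DAG.
parents  = c("a", "a", "b", "b", "c", "d")
children = c("b", "c", "c", "d", "e", "f")

# Depth (maximal distance to the root) of each term, hard-coded for this toy DAG.
depth = c(a = 0L, b = 1L, c = 2L, d = 2L, e = 3L, f = 3L)

edges = data.frame(parent = parents, child = children)
edges$parent_depth = unname(depth[edges$parent])

# For each child, put its deepest parent first, then keep only that edge.
edges = edges[order(edges$child, -edges$parent_depth), ]
tree_edges = edges[!duplicated(edges$child), c("parent", "child")]
print(tree_edges)
```

In this example term "c" has two parents, "a" and "b"; the sketch keeps only the link from "b", so every non-root term ends up with exactly one parent.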
One useful application is to map GO terms of interest (e.g. significant GO terms from a functional enrichment analysis) onto the DAG. In the following example, sig_go_ids contains 249 significant GO terms from an enrichment analysis.
sig_go_ids = readRDS(system.file("extdata", "sig_go_ids.rds", package = "simona"))
dag_circular_viz(dag, highlight = sig_go_ids, legend_labels_from = "name")
## R version 4.3.1 (2023-06-16)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 22.04.3 LTS
##
## Matrix products: default
## BLAS: /home/biocbuild/bbs-3.18-bioc/R/lib/libRblas.so
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_GB LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: America/New_York
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats4 stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] org.Hs.eg.db_3.18.0 AnnotationDbi_1.64.0 IRanges_2.36.0
## [4] S4Vectors_0.40.0 Biobase_2.62.0 BiocGenerics_0.48.0
## [7] igraph_1.5.1 simona_1.0.0 knitr_1.44
##
## loaded via a namespace (and not attached):
## [1] KEGGREST_1.42.0 circlize_0.4.15 shape_1.4.6
## [4] rjson_0.2.21 xfun_0.40 bslib_0.5.1
## [7] visNetwork_2.1.2 htmlwidgets_1.6.2 GlobalOptions_0.1.2
## [10] vctrs_0.6.4 tools_4.3.1 bitops_1.0-7
## [13] curl_5.1.0 parallel_4.3.1 Polychrome_1.5.1
## [16] RSQLite_2.3.1 cluster_2.1.4 blob_1.2.4
## [19] pkgconfig_2.0.3 RColorBrewer_1.1-3 scatterplot3d_0.3-44
## [22] GenomeInfoDbData_1.2.11 compiler_4.3.1 textshaping_0.3.7
## [25] Biostrings_2.70.0 codetools_0.2-19 ComplexHeatmap_2.18.0
## [28] clue_0.3-65 GenomeInfoDb_1.38.0 htmltools_0.5.6.1
## [31] sass_0.4.7 RCurl_1.98-1.12 yaml_2.3.7
## [34] crayon_1.5.2 jquerylib_0.1.4 GO.db_3.18.0
## [37] ellipsis_0.3.2 cachem_1.0.8 iterators_1.0.14
## [40] foreach_1.5.2 digest_0.6.33 fastmap_1.1.1
## [43] grid_4.3.1 colorspace_2.1-0 cli_3.6.1
## [46] DiagrammeR_1.0.10 magrittr_2.0.3 bit64_4.0.5
## [49] rmarkdown_2.25 XVector_0.42.0 httr_1.4.7
## [52] matrixStats_1.0.0 bit_4.0.5 ragg_1.2.6
## [55] png_0.1-8 GetoptLong_1.0.5 memoise_2.0.1
## [58] evaluate_0.22 doParallel_1.0.17 rlang_1.1.1
## [61] Rcpp_1.0.11 glue_1.6.2 DBI_1.1.3
## [64] xml2_1.3.5 rstudioapi_0.15.0 jsonlite_1.8.7
## [67] R6_2.5.1 systemfonts_1.0.5 zlibbioc_1.48.0