1 Abstract
2 Introduction and Motivation
3 Why Bioconductor?
4 Related Bioconductor Packages and Comparison
5 Using ExpoRiskR with SummarizedExperiment Objects
6 Summary
7 Session information

1 Abstract

Environmental and lifestyle exposures play a central role in shaping host-associated biological systems, including the microbiome and metabolome. Integrative analysis of multi-omics data together with exposure information is therefore essential for understanding disease risk and environmental health mechanisms. However, many existing multi-omics integration tools focus on correlation or latent factor discovery without explicitly accounting for exposure variables.

ExpoRiskR is a Bioconductor package designed for exposure-aware multi-omics integration, providing standardized workflows for aligning, preprocessing, and analyzing microbiome, metabolomics, and exposure data. The package emphasizes interoperability with Bioconductor data structures, particularly the SummarizedExperiment class, so that it fits into existing Bioconductor workflows. ExpoRiskR supports reproducible and interpretable analysis of exposure-adjusted cross-omics associations and is aimed at exposome, environmental epidemiology, and disease onset studies.

2 Introduction and Motivation

High-throughput profiling technologies have enabled the simultaneous measurement of multiple molecular layers, such as the microbiome and metabolome, across large cohorts. In parallel, advances in exposure science have made it possible to quantify environmental and lifestyle factors that influence biological systems. Integrating these heterogeneous data types remains challenging because they differ in scale and structure, and because exposures can confound cross-omics relationships.

Most multi-omics integration approaches aim to identify shared variation or discriminative features across omics layers. While powerful, these methods often do not explicitly incorporate exposure variables into the integration framework, making it difficult to disentangle intrinsic biological relationships from exposure-driven effects.

ExpoRiskR was developed to address this gap by providing a coherent workflow to align samples across multiple omics layers and exposure metadata, preprocess heterogeneous data in a consistent manner, and enable downstream analysis of cross-omics associations in the context of measured exposures.

3 Why Bioconductor?

ExpoRiskR is implemented within the Bioconductor ecosystem to leverage its mature infrastructure for high-throughput biological data analysis. Bioconductor provides standardized data containers, a formal package review process, and coordinated, versioned releases, all of which support interoperability and reproducibility.

A key design principle of ExpoRiskR is native support for the SummarizedExperiment class. By operating directly on SummarizedExperiment objects, ExpoRiskR integrates naturally with existing Bioconductor workflows and can be combined with other Bioconductor packages without ad-hoc data transformations.

5 Using ExpoRiskR with SummarizedExperiment Objects

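ExpoRiskR works with SummarizedExperiment objects through two helper functions: align_omics_se(), which aligns the input tables by sample ID and returns one SummarizedExperiment per omics layer with the exposures stored in colData, and prep_omics_se(), which converts the aligned objects into sample-by-feature matrices for modeling.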

5.1 Creating and Aligning SummarizedExperiment Inputs

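library(ExpoRiskR)

# Simulate a small dataset: 12 samples, 5 microbial features,
# 6 metabolites, and 3 exposures.
set.seed(7)
d <- generate_dummy_exporisk(
  n = 12, p_micro = 5, p_metab = 6, p_expo = 3
)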

aligned <- align_omics_se(
  microbiome = d$microbiome,
  metabolome = d$metabolome,
  exposures  = d$exposures,
  meta       = d$meta,
  id_col     = "sample_id",
  strict     = TRUE
)

aligned$se_microbiome

## class: SummarizedExperiment 
## dim: 5 12 
## metadata(0):
## assays(1): abundance
## rownames(5): micro_1 micro_2 micro_3 micro_4 micro_5
## rowData names(0):
## colnames(12): S1 S10 ... S8 S9
## colData names(5): sample_id outcome expo_1 expo_2 expo_3

aligned$se_metabolome

## class: SummarizedExperiment 
## dim: 6 12 
## metadata(0):
## assays(1): intensity
## rownames(6): metab_1 metab_2 ... metab_5 metab_6
## rowData names(0):
## colnames(12): S1 S10 ... S8 S9
## colData names(5): sample_id outcome expo_1 expo_2 expo_3
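
To make the alignment step more concrete, the following base R sketch shows one way to restrict several tables to their shared sample IDs and put them in a common order. It is not ExpoRiskR's implementation: the helper `align_by_id()` and the toy tables are hypothetical, and the sorted sample order is only an assumption suggested by the column order printed above.

```r
# Hypothetical helper (not part of ExpoRiskR): keep only the samples that
# occur in every table and return all tables in the same sorted order.
align_by_id <- function(tables, id_col = "sample_id") {
  ids <- lapply(tables, function(tab) as.character(tab[[id_col]]))
  shared <- sort(Reduce(intersect, ids))
  lapply(tables, function(tab) {
    tab[match(shared, as.character(tab[[id_col]])), , drop = FALSE]
  })
}

# Toy inputs with different sample orders and one unmatched sample each.
micro <- data.frame(
  sample_id = c("S3", "S1", "S2", "S4"),
  micro_1   = c(3, 1, 2, 4)
)
metab <- data.frame(
  sample_id = c("S2", "S5", "S1", "S3"),
  metab_1   = c(20, 50, 10, 30)
)

aligned_toy <- align_by_id(list(microbiome = micro, metabolome = metab))

# Both tables now contain the same samples in the same order.
aligned_toy$microbiome
aligned_toy$metabolome
```

Matching on an explicit ID column, rather than relying on row position, is what protects downstream cross-omics comparisons from silently pairing measurements from different samples.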

5.2 Preprocessing SummarizedExperiment-based Data

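# Convert the aligned objects into sample-by-feature matrices:
# X (microbiome), Y (metabolome), and E (exposures).
prepped <- prep_omics_se(aligned)
prepped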

## $X
##         micro_1     micro_2     micro_3     micro_4      micro_5
## S1   0.96280547  0.49814473  0.13225423 -0.19812021  1.985941121
## S10  0.51873745  0.88795075  0.28888014  0.46846818 -0.987800961
## S11  0.35726331  0.41016554  0.37517057  1.44914418  1.254101882
## S12 -1.15611767 -1.96343160 -0.06759319 -0.42001847  1.221623144
## S2   0.26460300 -1.56626266 -0.77608060 -0.80542213 -1.174711261
## S3  -0.80698984 -0.14138258 -0.87774006 -0.49430148 -0.565836250
## S4  -0.54817202  0.09251204 -0.79635845 -0.50057990 -0.399492091
## S5  -0.17120643 -0.86465355 -0.66862520 -1.31349826  0.085060252
## S6   1.84115243  0.17546235 -0.63470412  0.01619189 -0.092259816
## S7   0.67636537  0.31542631 -0.06096963  2.13570484 -0.314834932
## S8  -0.08532825  0.72016071  2.79432566 -0.86788420 -1.020051451
## S9  -1.85311281  1.43590796  0.29144063  0.53031557  0.008260362
## 
## $Y
##        metab_1     metab_2     metab_3     metab_4    metab_5     metab_6
## S1   1.5712429 -1.16289061  2.15807796  1.87140658 -1.5015621  0.15117522
## S10  1.1861832  1.47381448 -0.92195103  0.04002660 -0.6141845  1.58543227
## S11  1.1114371 -0.71964871  0.27921407  1.42698701  0.5957128 -0.04847733
## S12  0.1317138  1.12156629  0.24852391 -0.56284777  1.2029108 -0.73344698
## S2  -0.8690632  0.92947408 -0.58859908 -0.68363917  0.4450536 -1.35460560
## S3  -0.8916094  0.35332375  1.10502953 -0.90031879  1.5946824 -1.44861422
## S4  -0.1592589 -0.78102443 -1.13079326  0.40320431 -0.7554514  0.35249979
## S5  -0.0927472  0.15149007 -0.13783062  0.05774713 -0.4172125  1.35151675
## S6  -0.4272096  0.93604755  0.08503764 -1.24286251 -1.1505514 -1.13738301
## S7  -1.1768566  0.07661987 -0.14576299 -0.49809242 -0.6689004  0.52663318
## S8   0.9583355 -0.76118737  0.52630631  1.01953836  0.1330800  0.07534041
## S9  -1.3421677 -1.61758496 -1.47725244 -0.93114934  1.1364226  0.67992951
## 
## $E
##          expo_1      expo_2     expo_3
## S1   1.41930250  1.64714633  1.7672908
## S10  1.34830361  0.17509081 -0.1645179
## S11  0.01036126  0.73606065 -0.6534422
## S12  1.73280766 -1.78004535 -0.8559107
## S2  -1.12376188 -0.18105570  0.2109550
## S3  -0.75699093  1.28720452  1.0230380
## S4  -0.55115306 -0.04688015  0.7935528
## S5  -0.95872757 -1.31847657 -1.4576272
## S6  -0.94165220 -0.77072298 -0.4469839
## S7   0.29587279 -0.48818850 -1.2972346
## S8  -0.33557936  0.43924123  0.9750505
## S9  -0.13878282  0.30062570  0.1058294
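
The columns printed above are centered at zero with unit sample standard deviation (this can be checked by hand on expo_3), which is consistent with column-wise standardization. The sketch below shows that operation in base R on simulated data. It is an illustration only: any transformation that prep_omics_se() may apply before scaling is not described here, and the matrix `raw` is a made-up stand-in.

```r
# Simulated stand-in for a sample-by-exposure matrix on an arbitrary scale.
set.seed(42)
raw <- matrix(
  rnorm(12 * 3, mean = 5, sd = 2),
  nrow = 12,
  dimnames = list(paste0("S", 1:12), paste0("expo_", 1:3))
)

# Column-wise z-scoring: subtract each column mean and divide by the
# column's sample standard deviation (denominator n - 1).
z <- scale(raw, center = TRUE, scale = TRUE)

# After scaling, every column has mean 0 and standard deviation 1
# up to floating-point error.
round(colMeans(z), 10)
round(apply(z, 2, sd), 10)
```

Putting all features on a common scale keeps features with large raw variances from dominating correlation-based or regression-based summaries.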

5.3 Network construction and visualization

Below we construct an exposure-adjusted microbe–metabolite network from the preprocessed matrices and visualize the resulting graph.

X <- prepped$X
Y <- prepped$Y
E <- prepped$E

net <- build_exposure_network(
  X = X, Y = Y, E = E,
  fdr = 0.8, # deliberately lenient so this small example yields a non-empty network; use a much stricter threshold on real data
  max_pairs = 1500,
  seed = 1
)

plot_exposure_network(net)
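
One common way to obtain exposure-adjusted associations is to regress the exposures out of both omics layers, correlate the residuals, and control the false discovery rate across all feature pairs. The base R sketch below illustrates that idea on simulated data. It is an assumption about the general technique, not a description of what build_exposure_network() does internally, and all object names are local to the example.

```r
set.seed(1)
n <- 60
E <- matrix(rnorm(n * 2), n, 2, dimnames = list(NULL, c("expo_1", "expo_2")))
X <- matrix(rnorm(n * 3), n, 3, dimnames = list(NULL, paste0("micro_", 1:3)))
Y <- matrix(rnorm(n * 2), n, 2, dimnames = list(NULL, paste0("metab_", 1:2)))

# micro_1 and metab_1 share a direct signal; micro_2 and metab_2 are
# related only because both respond to expo_1.
shared <- rnorm(n)
X[, "micro_1"] <- X[, "micro_1"] + 2 * shared
Y[, "metab_1"] <- Y[, "metab_1"] + 2 * shared
X[, "micro_2"] <- X[, "micro_2"] + 2 * E[, "expo_1"]
Y[, "metab_2"] <- Y[, "metab_2"] + 2 * E[, "expo_1"]

# Remove the linear effect of all exposures from every feature.
rx <- residuals(lm(X ~ E))
ry <- residuals(lm(Y ~ E))

# Test every microbe-metabolite pair on the residuals.
# Note: cor.test() does not subtract the degrees of freedom used by the
# adjustment, so these p-values are approximate.
pair_tab <- expand.grid(
  microbe = colnames(X), metabolite = colnames(Y),
  stringsAsFactors = FALSE
)
pair_tab$r <- vapply(seq_len(nrow(pair_tab)), function(i) {
  cor(rx[, pair_tab$microbe[i]], ry[, pair_tab$metabolite[i]])
}, numeric(1))
pair_tab$p <- vapply(seq_len(nrow(pair_tab)), function(i) {
  cor.test(rx[, pair_tab$microbe[i]], ry[, pair_tab$metabolite[i]])$p.value
}, numeric(1))

# Benjamini-Hochberg adjustment, then keep pairs below the FDR threshold.
pair_tab$fdr <- p.adjust(pair_tab$p, method = "BH")
edges_toy <- pair_tab[pair_tab$fdr < 0.05, ]
edges_toy
```

In this simulation the micro_2/metab_2 correlation is large before adjustment and shrinks towards zero afterwards, while the micro_1/metab_1 association survives, which is the distinction an exposure-adjusted network is meant to capture.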

5.4 Exposure perturbation ranking

ExpoRiskR summarizes how strongly each exposure perturbs the estimated cross-omics associations and ranks exposures accordingly.

scores <- exposure_perturbation_score(
  X = X, Y = Y, E = E,
  fdr = 0.8,
  max_pairs = 1500,
  seed = 1
)

print(plot_exposure_ranking(scores, top_n = 15))
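
The scoring rule used by exposure_perturbation_score() is not spelled out in this vignette. As one possible reading of "how strongly an exposure perturbs cross-omics associations", the sketch below scores each exposure by the mean absolute change in microbe-metabolite correlations after adjusting for that exposure alone. The function `perturbation_sketch()` and this definition are assumptions made for illustration, not the package's method.

```r
# Hypothetical scoring rule: for each exposure, compare the raw
# cross-correlation matrix with the one obtained after regressing that
# single exposure out of both omics layers.
perturbation_sketch <- function(X, Y, E) {
  base <- cor(X, Y)
  scores <- vapply(colnames(E), function(k) {
    rx <- residuals(lm(X ~ E[, k]))
    ry <- residuals(lm(Y ~ E[, k]))
    mean(abs(cor(rx, ry) - base))
  }, numeric(1))
  sort(scores, decreasing = TRUE)
}

set.seed(2)
n <- 80
E <- matrix(rnorm(n * 2), n, 2, dimnames = list(NULL, c("expo_1", "expo_2")))
X <- matrix(rnorm(n * 3), n, 3, dimnames = list(NULL, paste0("micro_", 1:3)))
Y <- matrix(rnorm(n * 2), n, 2, dimnames = list(NULL, paste0("metab_", 1:2)))

# Only expo_1 drives features in both layers; expo_2 is pure noise.
X[, "micro_1"] <- X[, "micro_1"] + 2 * E[, "expo_1"]
Y[, "metab_1"] <- Y[, "metab_1"] + 2 * E[, "expo_1"]

ranking <- perturbation_sketch(X, Y, E)
ranking
```

Under this definition an exposure that induces correlation between the two layers receives a high score, because removing it changes the association structure, whereas an unrelated exposure leaves the correlations almost unchanged.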

5.5 Exposure feature importance

This example uses the simulated outcome included in the dummy metadata to illustrate simple exposure feature importance summaries.

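# Named outcome vector, reordered to match the sample order of the
# preprocessed matrices (assumes sample_id values equal rownames(E)).
outcome <- d$meta$outcome
names(outcome) <- d$meta$sample_id
outcome <- outcome[rownames(E)]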

print(plot_feature_importance(
  E = E,
  outcome = outcome,
  top_n = 15
))
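
How plot_feature_importance() computes importance is not stated here. A very simple summary of the same kind is the absolute correlation between each standardized exposure and a binary outcome, sketched below in base R on simulated data. This is an assumed, generic measure chosen for illustration, and every object in the block is local to the example.

```r
set.seed(3)
n <- 100
E_toy <- matrix(
  rnorm(n * 3), n, 3,
  dimnames = list(paste0("S", seq_len(n)), paste0("expo_", 1:3))
)

# Binary outcome whose probability depends on expo_1 only.
outcome_toy <- rbinom(n, size = 1, prob = plogis(2 * E_toy[, "expo_1"]))
names(outcome_toy) <- rownames(E_toy)

# Guard against silent misalignment between samples and outcome.
stopifnot(identical(names(outcome_toy), rownames(E_toy)))

# Absolute (point-biserial) correlation of each exposure with the outcome,
# sorted from most to least important.
importance <- sort(abs(cor(E_toy, outcome_toy))[, 1], decreasing = TRUE)
importance
```

Univariate summaries like this ignore correlation among exposures, so they are best read as a screening step rather than as evidence of independent effects.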

5.6 Risk ROC curve

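As an illustrative end-to-end example, ExpoRiskR can visualize discrimination performance for a simple risk model derived from the exposure-adjusted network. With only 12 simulated samples, the resulting curve demonstrates the interface rather than real predictive performance.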

print(plot_risk_roc(
  X = X, Y = Y, E = E,
  outcome = outcome,
  edges = net$edges,
  top_edges = 80
))
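
For readers unfamiliar with ROC curves, the sketch below computes the curve and its area (AUC) from a vector of risk scores and binary labels using only base R. The scores and labels are made up, and the helpers `roc_points()` and `auc_trapezoid()` are hypothetical; how ExpoRiskR turns network edges into a risk score is not covered by this example.

```r
# Hypothetical helper: true and false positive rates at every threshold,
# classifying a sample as positive when score >= threshold.
roc_points <- function(score, label) {
  thresholds <- c(Inf, sort(unique(score), decreasing = TRUE))
  tpr <- vapply(thresholds, function(t) mean(score[label == 1] >= t), numeric(1))
  fpr <- vapply(thresholds, function(t) mean(score[label == 0] >= t), numeric(1))
  data.frame(threshold = thresholds, fpr = fpr, tpr = tpr)
}

# Hypothetical helper: area under the curve by the trapezoidal rule.
auc_trapezoid <- function(roc) {
  sum(diff(roc$fpr) * (head(roc$tpr, -1) + tail(roc$tpr, -1)) / 2)
}

# Toy risk scores for three cases (label 1) and three controls (label 0).
score <- c(0.9, 0.8, 0.7, 0.6, 0.4, 0.3)
label <- c(1, 1, 0, 1, 0, 0)

roc <- roc_points(score, label)
auc <- auc_trapezoid(roc)
auc  # 8/9: in 8 of the 9 case-control pairs the case has the higher score
```

The AUC equals the proportion of case-control pairs in which the case receives the higher score, which is why it is a threshold-free measure of discrimination.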

5.7 Integration with Bioconductor Workflows

Because ExpoRiskR operates on SummarizedExperiment objects, the aligned objects can be passed directly to other Bioconductor packages for downstream analysis, including differential testing, visualization, or additional statistical modeling.
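
As a small illustration of what this interoperability looks like in practice, the sketch below builds a toy SummarizedExperiment and uses the standard accessors assay() and colData() to pull out a matrix, read an exposure column, and subset samples. The object is a hand-made stand-in that mimics the layout shown earlier (an "abundance" assay with exposures in colData), not output from ExpoRiskR, and the block only runs if the SummarizedExperiment package is installed.

```r
# Run the example only when SummarizedExperiment is available.
has_se <- requireNamespace("SummarizedExperiment", quietly = TRUE)

if (has_se) {
  # Toy feature-by-sample abundance matrix.
  abundance <- matrix(
    c(5, 0, 3, 8, 2, 7),
    nrow = 2,
    dimnames = list(c("micro_1", "micro_2"), c("S1", "S2", "S3"))
  )

  # Sample-level metadata, including one exposure, stored in colData.
  se <- SummarizedExperiment::SummarizedExperiment(
    assays = list(abundance = abundance),
    colData = S4Vectors::DataFrame(
      sample_id = colnames(abundance),
      expo_1    = c(0.2, -1.1, 0.9),
      row.names = colnames(abundance)
    )
  )

  # Standard accessors used throughout Bioconductor.
  mat  <- SummarizedExperiment::assay(se, "abundance")
  expo <- SummarizedExperiment::colData(se)$expo_1

  # Subsetting columns keeps the assay and the metadata in sync.
  high <- se[, expo > 0]
  dim(high)
}
```

Because assay data and sample metadata travel together in one object, subsetting or reordering samples cannot desynchronize measurements from their exposures.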

6 Summary

ExpoRiskR provides a Bioconductor-native framework for exposure-aware multi-omics integration with a strong emphasis on interpretability and reproducibility. By supporting SummarizedExperiment objects and aligning with existing Bioconductor workflows, the package enables robust investigation of exposure-driven biological mechanisms in environmental health and disease studies.

7 Session information

sessionInfo()

## R version 4.6.0 RC (2026-04-17 r89917)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.4 LTS
## 
## Matrix products: default
## BLAS:   /home/biocbuild/bbs-3.23-bioc/R/lib/libRblas.so 
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0  LAPACK version 3.12.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_GB              LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: America/New_York
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] ExpoRiskR_1.0.0  BiocStyle_2.40.0
## 
## loaded via a namespace (and not attached):
##  [1] sass_0.4.10                 generics_0.1.4             
##  [3] SparseArray_1.12.0          lattice_0.22-9             
##  [5] digest_0.6.39               magrittr_2.0.5             
##  [7] RColorBrewer_1.1-3          evaluate_1.0.5             
##  [9] grid_4.6.0                  bookdown_0.46              
## [11] fastmap_1.2.0               jsonlite_2.0.0             
## [13] Matrix_1.7-5                tinytex_0.59               
## [15] BiocManager_1.30.27         scales_1.4.0               
## [17] jquerylib_0.1.4             abind_1.4-8                
## [19] cli_3.6.6                   rlang_1.2.0                
## [21] XVector_0.52.0              Biobase_2.72.0             
## [23] withr_3.0.2                 cachem_1.1.0               
## [25] DelayedArray_0.38.0         yaml_2.3.12                
## [27] otel_0.2.0                  S4Arrays_1.12.0            
## [29] tools_4.6.0                 dplyr_1.2.1                
## [31] ggplot2_4.0.3               SummarizedExperiment_1.42.0
## [33] BiocGenerics_0.58.0         vctrs_0.7.3                
## [35] R6_2.6.1                    matrixStats_1.5.0          
## [37] stats4_4.6.0                lifecycle_1.0.5            
## [39] magick_2.9.1                Seqinfo_1.2.0              
## [41] S4Vectors_0.50.0            IRanges_2.46.0             
## [43] pkgconfig_2.0.3             pillar_1.11.1              
## [45] bslib_0.10.0                gtable_0.3.6               
## [47] glue_1.8.1                  Rcpp_1.1.1-1.1             
## [49] tidyselect_1.2.1            tibble_3.3.1               
## [51] xfun_0.57                   GenomicRanges_1.64.0       
## [53] dichromat_2.0-0.1           MatrixGenerics_1.24.0      
## [55] knitr_1.51                  farver_2.1.2               
## [57] htmltools_0.5.9             igraph_2.3.0               
## [59] labeling_0.4.3              rmarkdown_2.31             
## [61] compiler_4.6.0              S7_0.2.2

ExpoRiskR: Exposure-aware multi-omics integration in Bioconductor

28 April 2026
