Update Vignette bug: mc4 summary stat plots not displaying #320

Open: wants to merge 4 commits into base: dev
45 changes: 28 additions & 17 deletions vignettes/Introduction_Appendices.Rmd
@@ -4,20 +4,23 @@ author: US EPA's Center for Computational Toxicology and Exposure [email protected]
output:
rmdformats::readthedown:
fig_retina: false
code_folding: show
params:
my_css: css/rmdformats.css

vignette: >
%\VignetteIndexEntry{1. Introduction to tcpl and invitrodb}
%\VignetteEngine{knitr::rmarkdown}
%\usepackage[utf8]{inputenc}


---

```{css, code = readLines(params$my_css), hide=TRUE, echo = FALSE}
```

```{r, echo = FALSE, message = FALSE, warning = FALSE}
- #devtools::load_all() #use this instead of lbrary(tcpl) when dev versions are installed locally
+ # devtools::load_all() #use this instead of lbrary(tcpl) when dev versions are installed locally
library(tcpl)
library(tcplfit2)
# Data Formatting Packages #
@@ -83,6 +86,7 @@ library(data.table) # recommended for interacting with tcpl data frame-like obje
library(tcpl)
```


After loading <font face="CMTT10">tcpl</font>, the function <font face="CMTT10">tcplConf</font> is used to establish connection to a database server or the API. While a typical database connection requires 5 parameters to be provided, using an API connection requires the user to only specify password (`pass`) and driver (`drvr`):

```{r setup-api, eval=FALSE}
@@ -915,7 +919,7 @@ Once the Level 0 data are loaded, data processing occurs via the <font face="CMT

The processing is sequential, and every level of processing requires successful processing at the antecedent level. Any processing changes will trigger a "delete cascade," removing any subsequent data affected by the processing change to ensure complete data fidelity. For example, processing level 3 data will first require data from levels 4 through 6 to be deleted for the corresponding IDs. **Changing method assignments will also trigger a delete cascade for any corresponding data.**

- For tcplRun, the user must supply a starting level (slvl) and ending level (elvl). There are four phases of processing, as reflected by messages printed in console: (1) data for the given IDs are loaded, (2) the data are processed, (3) data for the same ID in subsequent levels are deleted, and (4) the processed data is written to the database. The 'outfile' parameter can give the user the option of print this output text to a file. If an id fails while processing multiple levels, the function will not attempt to process the failed ids in subsequent levels. When finished processing, a list indicating the processing success of each ID is returned. For each level processed, the list will contain two elements: (1) "l#" a named Boolean vector where <font face="CMTT10">TRUE</font> indicates successful processing, and (2) "l#_failed" containing the names of any ids that failed processing, where "#" is the processing level.
+ For tcplRun, the user must supply a starting level (slvl) and ending level (elvl). There are four phases of processing, as reflected by messages printed in console: (1) data for the given IDs are loaded, (2) the data are processed, (3) data for the same ID in subsequent levels are deleted, and (4) the processed data is written to the database. The 'outfile' parameter can give the user the option of printing this output text to a file. If an id fails while processing multiple levels, the function will not attempt to process the failed ids in subsequent levels. When finished processing, a list indicating the processing success of each ID is returned. For each level processed, the list will contain two elements: (1) "l#" a named Boolean vector where <font face="CMTT10">TRUE</font> indicates successful processing, and (2) "l#_failed" containing the names of any ids that failed processing, where "#" is the processing level.
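
As a rough illustration of the return value described above, the sketch below builds a hypothetical list shaped like that description and pulls out the ids that succeeded and failed. It is not actual `tcplRun` output: the ids and outcomes are invented, and only the "l#" / "l#_failed" naming is taken from the text.

```r
# Hypothetical stand-in for a tcplRun() result after processing level 3 only.
# Element names follow the "l#" / "l#_failed" pattern described in the text;
# the ids and their outcomes are invented for illustration.
res <- list(
  l3 = c("1" = TRUE, "2" = FALSE, "3" = TRUE),
  l3_failed = "2"
)

# Ids that processed successfully at level 3.
ok_ids <- names(res$l3)[res$l3]

# Ids flagged FALSE in the logical vector; these should match "l3_failed".
failed_ids <- names(res$l3)[!res$l3]
stopifnot(identical(failed_ids, res$l3_failed))

print(ok_ids)
```

Under this assumed shape, a caller could decide which ids to re-run by inspecting the "l#_failed" element for each processed level.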

Processing of multiple assay components or endpoints can be executed simultaneously. This is done with the internal utilization of the <font face="CMTT10">mclapply</font> function from the <font face="CMTT10">parallel</font> package. Parallel processing is done by id. Depending on the system environment and memory constraints, the user may wish to use more or less processing power. For processing on a Windows operating system, the default is $mc.cores = 1$, unless otherwise specified. For processing on a Unix-based operating system, the default is $mc.cores = NULL$ i.e. to utilize all cores except for one, which is necessary for 'overhead' processes. The user can specify more or less processing power by setting the "mc.cores" parameter to the desired level. **Note, this specification should meet the following criteria $1 \space \leq \space \mathit{mc.cores} \space \leq \space \mathit{detectCores()}-1$.**
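
The constraint on `mc.cores` stated above can be sketched as a small helper. `clamp_cores` is a hypothetical name and not part of <font face="CMTT10">tcpl</font>; it only assumes `parallel::detectCores()`, which may return `NA` on some platforms.

```r
library(parallel)

# Hypothetical helper (not part of tcpl): clamp a requested core count
# to the range 1 <= mc.cores <= detectCores() - 1.
clamp_cores <- function(requested, available = parallel::detectCores()) {
  # detectCores() can return NA when the count is unknown; fall back to one core.
  if (is.na(available)) return(1L)
  # Leave one core free for overhead, but never go below one worker.
  upper <- max(1L, available - 1L)
  as.integer(min(max(1L, requested), upper))
}

# Example: request four cores, limited by what the machine offers.
n_cores <- clamp_cores(4)
```

The resulting value could then be supplied as the "mc.cores" parameter mentioned above; on Windows the default of one core still applies unless the user specifies otherwise.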

@@ -1264,7 +1268,7 @@ tcplMthdAssign(lvl = 3, # processing level
mc3_res <- tcplRun(id = 1, slvl = 3, elvl = 3, type = "mc")
```

- Notice that MC3 processing takes an acid, not an aeid, as the input ID. As mentioned in previous sections, the user must will assign MC3 normalization methods by aeid then process by acid. The MC3 processing will attempt to process all endpoints for a given component. If one endpoint fails for any reason (e.g., does not have appropriate methods assigned), the processing for the entire component fails.
+ Notice that MC3 processing takes an acid, not an aeid, as the input ID. As mentioned in previous sections, the user must assign MC3 normalization methods by aeid then process by acid. The MC3 processing will attempt to process all endpoints for a given component. If one endpoint fails for any reason (e.g., does not have appropriate methods assigned), the processing for the entire component fails.

::: {.noticebox data-latex=""}

@@ -1360,7 +1364,7 @@ mc3 <- tcplPrepOtpt(mc3)

For demonstration purposes, the <font face="CMTT10"> mc_vignette </font> R data object is provided in the package since the vignette is not directly connected to such a database. The <font face="CMTT10"> mc_vignette </font> object contains a subset of data from levels 3 through 5 from invitrodb v4.2. The following code loads the example mc3 data object, then plots the concentration-response series for an example spid with the summary estimates indicated.

```{r fig.align='center',message=FALSE,message=FALSE,fig.dim=c(8,10),eval = FALSE}
```{r class.source="fold-hide", fig.align='center',message=FALSE,message=FALSE,fig.dim=c(8,10),eval = TRUE}
# Load the example data from the `tcpl` package.
data(mc_vignette, package = 'tcpl')
# Allocate the level 3 example data to `mc3`.
@@ -1370,6 +1374,9 @@ mc3_example[, logc := log10(conc)]
# Obtain the MC4 example data.
mc4_example <- mc_vignette[["mc4"]]
# Obtain the minimum response observed and the 'logc' group - 'resp_min'.



level3_min <- mc3_example %>%
dplyr::group_by(spid, chnm) %>%
dplyr::filter(resp == min(resp)) %>%
@@ -1423,11 +1430,12 @@ B <- mc3_example %>%
theme(legend.position = 'bottom')
# Plot the maximum mean & median responses at the related log-concentration -
# 'max_mean' & 'max_mean_conc'.

C <- mc3_example %>%
dplyr::filter(spid == "01504209") %>%
ggplot(data = .,aes(logc, resp)) +
geom_point(pch = 1, size = 2) +
- geom_point(data = dplyr::filter(mc4, spid == "01504209"),
+ geom_point(data = dplyr::filter(mc4_example, spid == "01504209"),
aes(x = log10(max_mean_conc), y = max_mean,
col = 'maximum mean response'),
alpha = 0.75, size = 2)+
@@ -1441,11 +1449,11 @@ C <- mc3_example %>%
theme(legend.position = 'bottom')
# Plot the maximum mean & median responses at the related log-concentration -
# 'max_med' & 'max_med_conc'.
- D <- example %>%
+ D <- mc3_example %>%
dplyr::filter(spid == "01504209") %>%
ggplot(data = ., aes(logc, resp)) +
geom_point(pch = 1, size = 2) +
- geom_point(data = dplyr::filter(mc4, spid == "01504209"),
+ geom_point(data = dplyr::filter(mc4_example, spid == "01504209"),
aes(x = log10(max_med_conc), y = max_med,
col = "maximum median response"),
alpha = 0.75, size = 2)+
@@ -1484,11 +1492,11 @@ G <- mc3_example %>%
dplyr::filter(spid == "01504209") %>%
ggplot(data = ., aes(logc, resp)) +
geom_point(pch = 1, size = 2) +
- geom_vline(data = dplyr::filter(mc4, spid == "01504209"),
+ geom_vline(data = dplyr::filter(mc4_example, spid == "01504209"),
aes(xintercept = log10(conc_min),
col = 'minimum concentration'),
lty = "dashed") +
- geom_vline(data = dplyr::filter(mc4, spid == "01504209"),
+ geom_vline(data = dplyr::filter(mc4_example, spid == "01504209"),
aes(xintercept = log10(conc_max),
col = 'maximum concentration'),
lty = "dashed") +
@@ -1501,10 +1509,13 @@ G <- mc3_example %>%
theme_bw() +
theme(legend.position = 'bottom')
## Compile Summary Plots in One Figure ##

+ aenm = "TOX21_ERa_BLA_Agonist_ratio"
+ spid = "01504209"
gridExtra::grid.arrange(
A,B,C,D,E,G,
nrow = 3, ncol = 2,
- top = mc3[which(mc4[,spid] == "01504209"), aenm]
+ top = mc3[which(mc4_example[,spid] == "01504209"), aenm]
)
```

@@ -1618,7 +1629,7 @@ where $n$ is the number of observations.

The following plots provide simulated concentration-response curves to illustrate the general curve shapes captured by <font face="CMTT10">tcplFit2</font> models. When fitting 'real-world' experimental data, the resulting curve shapes will minimize the error between the observed data and the concentration-response curve. Thus, the shape for each model fit may or may not reflect what is illustrated below:

```{r class.source="scroll-100",fig.align='center'}
```{r class.source="fold-hide", fig.align='center'}
## Example Data ##
# example fit concentration series
ex_conc <- seq(0.03, 100, length.out = 100)
@@ -1711,7 +1722,7 @@ A subset of MC4 data is available within the <font face="CMTT10"> mc_vignette </

The level 4 data includes fields for each of the ten model fits as well as the ID fields, as defined [here](#mc4). Model fit information are prefaced by the model abbreviations (e.g. $\mathit{cnst}$, $\mathit{hill}$, $\mathit{gnls}$, $\mathit{poly1}$, etc.). The fields ending in $\mathit{success}$ indicate the convergence status of the model, where 1 means the model converged, 0 otherwise. NA values indicate the fitting algorithm did not attempt to fit the model. Smoothed model fits of the concentration-response data from the MC4 data object are displayed below:
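
To make the meaning of the success flags concrete, the following sketch tabulates them on a toy table. The data frame and its values are hypothetical; only the `<model>_success` naming and the 1 / 0 / NA coding come from the description above, and the actual layout of the MC4 object may differ.

```r
# Hypothetical toy table; the "<model>_success" column names mirror the
# description above (1 = converged, 0 = did not converge, NA = not attempted).
fits <- data.frame(
  hill_success = c(1, 0, NA),
  gnls_success = c(1, 1, NA),
  cnst_success = c(1, 1, 1)
)

# Select every field ending in "_success".
success_cols <- grep("_success$", names(fits), value = TRUE)

# Count converged, failed, and not-attempted fits for each model.
status <- sapply(fits[success_cols], function(x) {
  c(converged = sum(x == 1, na.rm = TRUE),
    failed = sum(x == 0, na.rm = TRUE),
    not_attempted = sum(is.na(x)))
})

print(status)
```

The result is a small matrix with one column per model, which makes it easy to see which models were skipped rather than failed.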

```{r fig.align='center',fig.dim=c(8,5.5),class.source = "scroll-100", warnings=FALSE, message=FALSE}
```{r class.source="fold-hide", fig.align='center',fig.dim=c(8,5.5), warnings=FALSE, message=FALSE}
# Load the example data from the `tcpl` package.
data(mc_vignette, package = 'tcpl')
# Allocate the level 3 example data to `mc3`.
@@ -1887,7 +1898,7 @@ mc5_example <- mc_vignette[["mc5"]]

For demonstrative purposes, an alternative visual of the model fits from MC4 and the best model as well as the potency estimates from MC5 data is produced below.{#mc5_plot}

```{r fig.align='center',fig.dim=c(8,5.5),class.source = "scroll-100"}
```{r class.source="fold-hide", fig.align='center',fig.dim=c(8,5.5)}
## Obtain Data ##
# Load the example data from the `tcpl` package.
data(mc_vignette,package = 'tcpl')
@@ -2485,7 +2496,7 @@ tcplLoadChemList(field = "chid", val = 1:2)

The <font face="CMTT10">**tcplMthdList**</font> function returns methods available for processing at a specified level (i.e. step in the <font face="CMTT10">tcpl</font> pipeline). The user defined function in the following code chunk retrieves and outputs all available methods for both the SC and MC data levels.

```{r mthd_list, fig.align='center',class.source="scroll-100",message=FALSE, eval=FALSE}
```{r mthd_list, fig.align='center',message=FALSE, eval=FALSE}
# Create a function to list all available methods function (SC & MC).
method_list <- function() {
# Single Concentration
@@ -2573,7 +2584,7 @@ mc0 <- tcplPrepOtpt(tcplLoadData(lvl = 0, fld = "acid", val = 1, type = "mc"))

The goal of this section is to provide example quantitative metrics, such as z-prime and coefficient of variance, to evaluate assay performance relative to controls.

```{r mc0_aq, fig.align='center', class.source = "scroll-100", message=FALSE, eval=FALSE}
```{r mc0_aq, fig.align='center', message=FALSE, eval=FALSE}
# Create a function to review assay quality metrics using indexed Level 0 data.
aq <- function(ac){
# obtain level 1 multiple concentration data for specified acids
@@ -2666,7 +2677,7 @@ sc2 <- tcplPrepOtpt(tcplLoadData(lvl = 2, fld = "aeid", val = aeids$aeid, type =

## - Load SC Methods

```{r sc2_mthd, fig.align='center',class.source="scroll-100",message=FALSE, eval=FALSE}
```{r sc2_mthd, fig.align='center',message=FALSE, eval=FALSE}
# Create a function to load methods for single concentration data processing
# steps for given aeids.
sc_methods <- function(aeids) {
@@ -2710,7 +2721,7 @@ mc5 <- tcplPrepOtpt(tcplLoadData(lvl = 5, fld = "aeid", val = aeids$aeid, type =

## - Load MC Methods

```{r mc5_methods, fig.align='center',class.source="scroll-100",message=FALSE, eval=FALSE}
```{r mc5_methods, fig.align='center',message=FALSE, eval=FALSE}
# Create a function to load methods for MC data processing
# for select aeids.
mc_methods <- function(aeids) {