Macs3/3.0.2 #19

Merged
merged 11 commits into from
Sep 19, 2024
2 changes: 1 addition & 1 deletion .github/workflows/R-CMD-check.yaml
@@ -25,7 +25,7 @@ jobs:
- name: Install dependencies
run: |
BiocManager::install("AnVIL")
-AnVIL::install(c("remotes", "rcmdcheck", "basilisk", "reticulate", "S4Vectors", "ExperimentHub", "AnnotationHub"))
+BiocManager::install(c("remotes", "rcmdcheck", "basilisk", "reticulate", "S4Vectors", "ExperimentHub", "AnnotationHub"))
remotes::install_deps(dependencies = TRUE)
shell: Rscript {0}
- name: Check
13 changes: 9 additions & 4 deletions DESCRIPTION
@@ -1,19 +1,24 @@
Package: MACSr
Title: MACS: Model-based Analysis for ChIP-Seq
-Version: 1.11.2
-Authors@R:
+Version: 1.11.3
+Authors@R: c(
+person(given = "Philippa",
+family = "Doherty",
+role = c("aut"),
+email = "[email protected]"),
person(given = "Qiang",
family = "Hu",
role = c("aut", "cre"),
email = "[email protected]")
+)
Description: The Model-based Analysis of ChIP-Seq (MACS) is a widely
used toolkit for identifying transcription factor binding sites.
This package is an R wrapper of the latest MACS3.
License: BSD_3_clause + file LICENSE
Encoding: UTF-8
LazyData: true
Roxygen: list(markdown = TRUE)
-RoxygenNote: 7.2.3
+RoxygenNote: 7.3.2
Depends: R (>= 4.1.0)
Imports: utils, reticulate, S4Vectors, methods, basilisk, ExperimentHub, AnnotationHub
Suggests:
@@ -22,7 +27,7 @@ Suggests:
rmarkdown,
BiocStyle,
MACSdata
-PythonRequirements: Python (>= 3.6.0), macs3
+PythonRequirements: Python (>= 3.9.0), macs3
VignetteBuilder: knitr
biocViews: Software, ChIPSeq, ATACSeq, ImmunoOncology
StagedInstall: no
2 changes: 2 additions & 0 deletions NEWS
@@ -1,2 +1,4 @@
+Changes in version 1.11.3 (2024-9-12)
++ Upgrade to MACS 3.0.2.
Changes in version 1.11.2 (2023-11-20)
+ Upgrade to MACS 3.0.0.
2 changes: 1 addition & 1 deletion R/basilisk.R
@@ -1,4 +1,4 @@
#' @import basilisk
env_macs <- BasiliskEnvironment("env_macs", pkgname="MACSr",
packages = c("python=3.10"),
-pip = c("macs3==3.0.0"))
+pip = c("macs3==3.0.2"))
16 changes: 8 additions & 8 deletions R/bdgbroadcall.R
@@ -5,13 +5,8 @@
#' bedGraph files from MACS3 are acceptable.
#'
#' @param ifile MACS score in bedGraph. REQUIRED.
-#' @param cutoffpeak Cutoff for peaks depending on which method you
-#' used for score track. If the file contains qvalue scores from
-#' MACS3, score 2 means qvalue 0.01. DEFAULT: 2
-#' @param cutofflink Cutoff for linking regions/low abundance regions
-#' depending on which method you used for score track. If the file
-#' contains qvalue scores from MACS3, score 1 means qvalue 0.1,
-#' and score 0.3 means qvalue 0.5. DEFAULT: 1", default = 1
+#' @param cutoffpeak Cutoff for peaks depending on which method you used for score track. If the file contains qvalue scores from MACS3, score 2 means qvalue 0.01. Regions with signals lower than cutoff will not be considered as enriched regions. DEFAULT: 2
+#' @param cutofflink Cutoff for linking regions/low abundance regions depending on which method you used for score track. If the file contains qvalue scores from MACS3, score 1 means qvalue 0.1, and score 0.3 means qvalue 0.5. DEFAULT: 1
#' @param minlen minimum length of peak, better to set it as d value. DEFAULT: 200",
#' default = 200
#' @param lvl1maxgap maximum gap between significant peaks, better to
@@ -20,7 +15,10 @@
#' to set it as 4 times of d value. DEFAULT: 800
#' @param trackline Tells MACS not to include trackline with bedGraph
#' files. The trackline is required by UCSC.
-#' @param outputfile The output file.
+#' @param outputfile Output file name. Mutually exclusive with --o-prefix
+#' @param oprefix The PREFIX of output bedGraph file to write
+#' scores. If it is given as A, and method is 'ppois', output file
+#' will be A_ppois.bdg. Mutually exclusive with -o/--ofile.
#' @param outdir The output directory.
#' @param log Whether to capture logs.
#' @param verbose Set verbose level of runtime message. 0: only show
@@ -44,6 +42,7 @@ bdgbroadcall <- function(ifile, cutoffpeak = 2, cutofflink = 1,
minlen = 200L, lvl1maxgap = 30L,
lvl2maxgap = 800L, trackline = TRUE,
outdir = ".", outputfile = character(),
+oprefix = character(),
log = TRUE, verbose = 2L){
ifile <- normalizePath(ifile)
cl <- basiliskStart(env_macs)
@@ -58,6 +57,7 @@ bdgbroadcall <- function(ifile, cutoffpeak = 2, cutofflink = 1,
trackline = trackline,
outdir = outdir,
ofile = outputfile,
+oprefix = oprefix,
verbose = verbose)
.bdgbroadcall <- reticulate::import("MACS3.Commands.bdgbroadcall_cmd")
if(log){
13 changes: 1 addition & 12 deletions R/bdgcmp.R
@@ -20,18 +20,7 @@
#' @param pseudocount The pseudocount used for calculating logLR,
#' logFE or FE. The count will be applied after normalization of
#' sequencing depth. DEFAULT: 0.0, no pseudocount is applied.
-#' @param method Method to use while calculating a score in any bin by
-#' comparing treatment value and control value. Available choices
-#' are: ppois, qpois, subtract, logFE, logLR, and slogLR. They
-#' represent Poisson Pvalue (-log10(pvalue) form) using control as
-#' lambda and treatment as observation, q-value through a BH
-#' process for poisson pvalues, subtraction from treatment, linear
-#' scale fold enrichment, log10 fold enrichment(need to set
-#' pseudocount), log10 likelihood between ChIP-enriched model and
-#' open chromatin model(need to set pseudocount), symmetric log10
-#' likelihood between two ChIP-enrichment models, or maximum value
-#' between the two tracks. Default option is
-#' ppois.",default="ppois".
+#' @param method Method to use while calculating a score in any bin by comparing treatment value and control value. Available choices are: ppois, qpois, subtract, logFE, FE, logLR, slogLR, and max. They represent Poisson Pvalue (-log10(pvalue) form) using control as lambda and treatment as observation, q-value through a BH process for poisson pvalues, subtraction from treatment, linear scale fold enrichment, log10 fold enrichment(need to set pseudocount), log10 likelihood between ChIP-enriched model and open chromatin model(need to set pseudocount), symmetric log10 likelihood between two ChIP-enrichment models, or maximum value between the two tracks. Default option is ppois.
#' @param oprefix The PREFIX of output bedGraph file to write
#' scores. If it is given as A, and method is 'ppois', output file
#' will be A_ppois.bdg. Mutually exclusive with -o/--ofile.
14 changes: 3 additions & 11 deletions R/bdgdiff.R
@@ -13,26 +13,18 @@
#' 1. Incompatible with callpeak --SPMR output. REQUIRED
#' @param c2bdg MACS control lambda bedGraph for condition
#' 2. Incompatible with callpeak --SPMR output. REQUIRED
-#' @param cutoff logLR cutoff. DEFAULT: 3 (likelihood
-#' ratio=1000)", default = 3
+#' @param cutoff log10LR cutoff. Regions with signals lower than cutoff will not be considered as enriched regions. DEFAULT: 3 (likelihood ratio=1000)
#' @param minlen Minimum length of differential region. Try bigger value to remove small regions. DEFAULT: 200",
#' default = 200
#' @param maxgap Maximum gap to merge nearby differential
#' regions. Consider a wider gap for broad marks. Maximum gap
#' should be smaller than minimum length (-g). DEFAULT:
-#' 100", default = 100
+#' 100
#' @param depth1 Sequencing depth (# of non-redundant reads in
#' million) for condition 1. It will be used together with
#' --d2. See description for --d2 below for how to assign
#' them. Default: 1
-#' @param depth2 Sequencing depth (# of non-redundant reads in
-#' million) for condition 2. It will be used together with
-#' --d1. DEPTH1 and DEPTH2 will be used to calculate scaling
-#' factor for each sample, to down-scale larger sample to the
-#' level of smaller one. For example, while comparing 10 million
-#' condition 1 and 20 million condition 2, use --d1 10 --d2 20,
-#' then pileup value in bedGraph for condition 2 will be divided
-#' by 2. Default: 1
+#' @param depth2 Sequencing depth (# of non-redundant reads in million) for condition 2. It will be used together with --d1. DEPTH1 and DEPTH2 will be used to calculate scaling factor for each sample, to down-scale larger sample to the level of smaller one. For example, while comparing 10 million condition 1 and 20 million condition 2, use --d1 10 --d2 20, then pileup value in bedGraph for condition 2 will be divided by 2. Default: 1
#'
#' @param oprefix Output file prefix. Actual files will be named as
#' PREFIX_cond1.bed, PREFIX_cond2.bed and
21 changes: 2 additions & 19 deletions R/bdgopt.R
@@ -6,28 +6,11 @@
#'
#' @param ifile MACS score in bedGraph. Note: this must be a bedGraph
#' file covering the ENTIRE genome. REQUIRED
-#' @param method Method to modify the score column of bedGraph
-#' file. Available choices are: multiply, add, max, min, or
-#' p2q. 1) multiply, the EXTRAPARAM is required and will be
-#' multiplied to the score column. If you intend to divide the
-#' score column by X, use value of 1/X as EXTRAPARAM. 2) add, the
-#' EXTRAPARAM is required and will be added to the score
-#' column. If you intend to subtract the score column by X, use
-#' value of -X as EXTRAPARAM. 3) max, the EXTRAPARAM is required
-#' and will take the maximum value between score and the
-#' EXTRAPARAM. 4) min, the EXTRAPARAM is required and will take
-#' the minimum value between score and the EXTRAPARAM. 5) p2q,
-#' this will convert p-value scores to q-value scores using
-#' Benjamini-Hochberg process. The EXTRAPARAM is not
-#' required. This method assumes the scores are -log10 p-value
-#' from MACS3. Any other types of score will cause unexpected
-#' errors.", default="p2q"
+#' @param method Method to modify the score column of bedGraph file. Available choices are: multiply, add, max, min, or p2q. 1) multiply, the EXTRAPARAM is required and will be multiplied to the score column. If you intend to divide the score column by X, use value of 1/X as EXTRAPARAM. 2) add, the EXTRAPARAM is required and will be added to the score column. If you intend to subtract the score column by X, use value of -X as EXTRAPARAM. 3) max, the EXTRAPARAM is required and will take the maximum value between score and the EXTRAPARAM. 4) min, the EXTRAPARAM is required and will take the minimum value between score and the EXTRAPARAM. 5) p2q, this will convert p-value scores to q-value scores using Benjamini-Hochberg process. The EXTRAPARAM is not required. This method assumes the scores are -log10 p-value from MACS3. Any other types of score will cause unexpected errors. Default="p2q"
#'
#' @param extraparam The extra parameter for METHOD. Check the detail
#' of -m option.
-#' @param outputfile Output filename. Mutually exclusive with
-#' --o-prefix. The number and the order of arguments for --ofile
-#' must be the same as for -m.
+#' @param outputfile Output BEDGraph filename. Required.
#' @param outdir The output directory.
#' @param log Whether to capture logs.
#' @param verbose Set verbose level of runtime message. 0: only show
22 changes: 14 additions & 8 deletions R/bdgpeakcall.R
@@ -5,9 +5,7 @@
#' bedGraph files from MACS3 are acceptable.
#'
#' @param ifile MACS score in bedGraph. REQUIRED.
-#' @param cutoff Cutoff depends on which method you used for score
-#' track. If the file contains pvalue scores from MACS3, score 5
-#' means pvalue 1e-5. DEFAULT: 5", default = 5.
+#' @param cutoff Cutoff depends on which method you used for score track. If the file contains pvalue scores from MACS3, score 5 means pvalue 1e-5. Regions with signals lower than cutoff will not be considered as enriched regions. DEFAULT: 5
#' @param minlen minimum length of peak, better to set it as d
#' value. DEFAULT: 200", default = 200.
#' @param maxgap maximum gap between significant points in a peak,
@@ -19,11 +17,15 @@
#' or total length of peaks that can be called by different cutoff
#' then output a summary table to help user decide a better
#' cutoff. Note, minlen and maxgap may affect the
-#' results. DEFAULT: False", default = False.
-#' @param trackline Tells MACS not to include trackline with bedGraph
-#' files. The trackline is required by UCSC.
-#' @param outputfile The output file.
+#' results. DEFAULT: False
+#' @param cutoff_analysis_max The maximum cutoff score for performing cutoff analysis. Together with --cutoff-analysis-steps, the resolution in the final report can be controlled. Please check the description in --cutoff-analysis-steps for detail. DEFAULT: 100
+#' @param cutoff_analysis_steps Steps for performing cutoff analysis. It will be used to decide which cutoff value should be included in the final report. The larger the value, the higher the resolution of the cutoff analysis. The cutoff analysis function will first find the smallest (at least 0) and the largest (controlled by --cutoff-analysis-max) score in the data, then break the range of score into `CUTOFF_ANALYSIS_STEPS` intervals. It will then use each score as cutoff to call peaks and calculate the total number of candidate peaks, the total basepairs of peaks, and the average length of peak in basepair. Please note that the final report ideally should include `CUTOFF_ANALYSIS_STEPS` rows, but in practice, if a cutoff yields zero peaks, the row for that value won't be included. DEFAULT: 100
+#' @param trackline Tells MACS not to include trackline with bedGraph files. The trackline is used by UCSC for the options for display.
+#' @param outputfile Output file name. Mutually exclusive with --o-prefix
#' @param outdir The output directory.
+#' @param oprefix Output file prefix. Actual files will be named as
+#' PREFIX_cond1.bed, PREFIX_cond2.bed and
+#' PREFIX_common.bed. Mutually exclusive with -o/--ofile.
#' @param log Whether to capture logs.
#' @param verbose Set verbose level of runtime message. 0: only show
#' critical message, 1: show additional warning message, 2: show
@@ -46,7 +48,8 @@
#' }
bdgpeakcall <- function(ifile, cutoff = 5, minlen = 200L, maxgap = 30L,
call_summits = FALSE, cutoff_analysis = FALSE,
-trackline = TRUE, outdir = ".",
+cutoff_analysis_max = 100, cutoff_analysis_steps = 100,
+trackline = TRUE, outdir = ".", oprefix= character(),
outputfile = character(),
log = TRUE, verbose = 2L){
cl <- basiliskStart(env_macs)
@@ -59,8 +62,11 @@ bdgpeakcall <- function(ifile, cutoff = 5, minlen = 200L, maxgap = 30L,
maxgap = maxgap,
call_summits = call_summits,
cutoff_analysis = cutoff_analysis,
+cutoff_analysis_max = cutoff_analysis_max,
+cutoff_analysis_steps = cutoff_analysis_steps,
trackline = trackline,
outdir = outdir,
+oprefix = oprefix,
ofile = outputfile,
verbose = verbose)
.bdgpeakcall <- reticulate::import("MACS3.Commands.bdgpeakcall_cmd")