Merge remote-tracking branch 'upstream/master'
knausb committed Sep 1, 2020
2 parents c208923 + 2d53fa5 commit 3abd40a
Showing 13 changed files with 152 additions and 63 deletions.
4 changes: 2 additions & 2 deletions DESCRIPTION
@@ -8,7 +8,7 @@ Description: Facilitates easy manipulation of variant call format (VCF) data.
data may be written to a VCF file (*.vcf.gz). It also may be converted into
other popular R objects (e.g., genlight, DNAbin). VcfR provides a link between
VCF data and familiar R software.
Version: 1.10.0
Version: 1.12.0
Authors@R: c(
person(c("Brian", "J."), "Knaus", role = c("cre", "aut"),
email = "[email protected]", comment = c(ORCID = "0000-0003-1665-4343")),
@@ -56,5 +56,5 @@ Suggests:
tidyr
VignetteBuilder: knitr
License: GPL-3
RoxygenNote: 7.0.2
RoxygenNote: 7.1.1
Encoding: UTF-8
12 changes: 12 additions & 0 deletions NEWS.md
@@ -10,6 +10,18 @@ I think I encountered a situation where 4-96 was not enough so I've bumped it to
This may have been addressed at 64a308ba50b9119108e8946737460de5997b805b by adding `samples` to vcfR method `[`.
* In issue #92 (vcfR2genlight big data #92), JimWhiting91 has documented that `extract.gt()` could be greatly improved with multithreading. While he used `mclapply()` I do not feel this is the best solution because it does not work on Windows. I think a better solution would be [RCppParallel](https://rcppcore.github.io/RcppParallel/) because this should work on all CRAN platforms.

# vcfR 1.13.0
Released on CRAN 202X-XX-XX

# vcfR 1.12.0
Released on CRAN 2020-09-01
* Added ```PKGTYPE: both``` to appveyor.yml so Windows packages can be built from source
* Omitted ```configure``` file that unnecessarily tried to invoke checkbashisms
* Incorporated help from https://stackoverflow.com/a/62721142 to use checkbashisms when checking on Debian flavors of Linux (ended up omitting this change but left this here to document it and the link)

# vcfR 1.11.0
Released on CRAN 2020-06-05
* Now compatible with R 4.0.0 and dplyr 1.0.0

# vcfR 1.10.0
Released on CRAN 2020-02-06
2 changes: 1 addition & 1 deletion R/vcfR2DNAbin.R
@@ -66,7 +66,7 @@
#'
#'
#' When a variant overlaps a deletion it may be encoded by an \strong{asterisk allele (*)}.
#' The GATK site covers this in a post on \href{https://gatkforums.broadinstitute.org/gatk/discussion/6926/spanning-or-overlapping-deletions-allele}{Spanning or overlapping deletions} ].
#' The GATK site covers this in a post on \href{https://gatk.broadinstitute.org/hc/en-us/articles/360035531912-Spanning-or-overlapping-deletions-allele-}{Spanning or overlapping deletions} ].
#' This is handled in vcfR by allowing the user to decide how it is handled with the paramenter \code{asterisk_as_del}.
#' When \code{asterisk_as_del} is TRUE this allele is converted into a deletion ('-').
#' When \code{asterisk_as_del} is FALSE the asterisk allele is converted to NA.
4 changes: 2 additions & 2 deletions R/vcfR_to_tidy_functions.R
@@ -597,15 +597,15 @@ guess_types <- function(D) {
tmp <- D %>%
# dplyr::filter_(~Number == 1) %>%
dplyr::filter(Number == 1) %>%
dplyr::mutate(tt = ifelse(Type == "Integer", "i", ifelse(Type == "Numeric" | Type == "Float", "n", ""))) %>%
dplyr::mutate(tt = dplyr::if_else(Type == "Integer", "i", dplyr::if_else(Type == "Numeric" | Type == "Float", "n", ""))) %>%
dplyr::filter(tt %in% c("n", "i")) %>%
# dplyr::filter_(~tt %in% c("n", "i")) %>%
dplyr::select(ID, Number, Type, tt)
# dplyr::select_(~ID, ~Number, ~Type, ~tt)

# tmp <- D %>% dplyr::filter_(~Number == 0 & Type == 'Flag') %>%
tmp <- D %>% dplyr::filter(Number == 0 & Type == 'Flag') %>%
dplyr::mutate(tt = ifelse(Type == "Flag", "f")) %>%
dplyr::mutate(tt = dplyr::if_else(Type == "Flag", "f", "")) %>%
dplyr::filter(tt %in% c("f")) %>%
# dplyr::filter_(~tt %in% c("f")) %>%
dplyr::select(ID, Number, Type, tt) %>%
8 changes: 4 additions & 4 deletions README.md
@@ -129,9 +129,9 @@ If you know of a software that I have not included on this list, particularly if

* [Cortex](http://cortexassembler.sourceforge.net/)
* [freebayes](https://github.com/ekg/freebayes)
* [GATK haplotype caller](https://www.broadinstitute.org/gatk/guide/tooldocs/org_broadinstitute_gatk_tools_walkers_haplotypecaller_HaplotypeCaller.php)
* [GATK MuTect2](https://www.broadinstitute.org/gatk/guide/tooldocs/org_broadinstitute_gatk_tools_walkers_cancer_m2_MuTect2.php)
* [GATK GenotypeGVCFs](https://www.broadinstitute.org/gatk/guide/tooldocs/org_broadinstitute_gatk_tools_walkers_variantutils_GenotypeGVCFs.php)
* [GATK haplotype caller](https://software.broadinstitute.org/gatk/guide/tooldocs/org_broadinstitute_gatk_tools_walkers_haplotypecaller_HaplotypeCaller.php)
* [GATK MuTect2](https://software.broadinstitute.org/gatk/guide/tooldocs/org_broadinstitute_gatk_tools_walkers_cancer_m2_MuTect2.php)
* [GATK GenotypeGVCFs](https://software.broadinstitute.org/gatk/guide/tooldocs/org_broadinstitute_gatk_tools_walkers_variantutils_GenotypeGVCFs.php)
* [LoFreq](http://csb5.github.io/lofreq/)
* [Samtools](http://www.htslib.org/)
* [VarScan2](http://dkoboldt.github.io/varscan/)
@@ -140,7 +140,7 @@ If you know of a software that I have not included on this list, particularly if
**Restriction site associated DNA markers (e.g., RADseq, GBS):**

* [Stacks](http://catchenlab.life.illinois.edu/stacks/)
* [Tassel](http://www.maizegenetics.net/#!tassel/c17q9)
* [Tassel](https://www.maizegenetics.net)

**Manipulation of VCF data:**

1 change: 1 addition & 0 deletions appveyor.yml
@@ -18,6 +18,7 @@ cache:
environment:
global:
WARNINGS_ARE_ERRORS: 1
PKGTYPE: both

matrix:
- R_VERSION: devel
164 changes: 116 additions & 48 deletions cran-comments.md
@@ -1,80 +1,148 @@

vcfR was last published on CRAN on 2020-01-10, so this may appear as an early resubmission.
However, I was asked by CRAN to fix warnings occurring on R-devel by 2-17 so I'm submitting.

## Resubmission

This is a resubmission.
In my previous submission I asserted that

https://gatkforums.broadinstitute.org/gatk/discussion/6926/spanning-or-overlapping-deletions-allele

was a valid URL. However, CRAN correctly identified it as invalid. This has been updated to the following.

https://gatk.broadinstitute.org/hc/en-us/articles/360035531912-Spanning-or-overlapping-deletions-allele-

I have now also validated that

http://www.1000genomes.org/node/101
https://uswest.ensembl.org/info/docs/tools/vep/index.html

are valid URLs by pasting them into firefox.


## Submission

This package, vcfR, was archived on CRAN on 2020-07-05 because CRAN asked me to address issues that I was unable to address before their deadline.
This submission is in hope of being restored to CRAN.
The issues I received via email are as follows.

```
checkbashisms is not even in SystemRequirements and used
unconditionally. See 'Writing R Extensions', which called that 'annoying'.
It is a Debian script and not widely installed.
Your moniker "briank.lists" is not appropriate for a CRAN maintainer --
see the CRAN policy.
```

It appears that I misunderstood how to handle "checkbashisms."
I posted on R-pkg-devel and was advised that I should assume that CRAN machines that require this script should have it.
I have removed my "configure" script which attempted to handle this on my side.

I do not understand the criticism of my email or "moniker" of "[email protected]."
I feel this is a misunderstanding.
I have consulted the CRAN Repository Policy at the below link.

https://cran.r-project.org/web/packages/policies.html

It states that the maintainer must be "a person, not a mailing list" which I feel may be the source of the confusion.
The address "[email protected]" is my personal address where I receive email from the various lists I subscribe to (and have been using since vcfR 1.0.0).
It is not a mailing list.
If I am mistaken please provide clarification.
Thank you!


## Test environments
* local: ubuntu 16.04 LTS and R 3.6.2
* local: OS X Catalina 10.15.2 and R 3.6.2 and clang
* travis-ci: ubuntu 16.04 LTS, R 3.6.2 and R Under development (unstable) (2020-02-04 r77771)
* AppVeyor: Windows Server 2012 R2 x64 (build 9600) R version 3.6.2 Patched (2020-01-25 r77764)
* winbuilder: R version 3.6.2 (2019-12-12) and R Under development (unstable) (2020-01-28 r77738)

## R CMD check results
There were no ERRORs.
* local:
ubuntu 18.04 LTS and R 4.0.2

There were 2 NOTEs:
* local:
OS X Catalina 10.15.6 and R 4.0.2 and clang

Found the following (possibly) invalid URLs:
URL: http://www.1000genomes.org/node/101
From: inst/doc/intro_to_vcfR.html
Status: Error
Message: libcurl error code 60:
SSL certificate problem: unable to get local issuer certificate
(Status without verification: OK)
win-builder:
* using R version 4.0.2 (2020-06-22)
* using R Under development (unstable) (2020-08-23 r79071)

This url works when I copy and paste it into firefox.
travis-ci:
* Ubuntu 16.04.6 LTS, R version 4.0.0 (2020-04-24)
* Ubuntu 16.04.6 LTS, R Under development (unstable) (2020-08-26 r79084)

> checking installed package size ... NOTE
installed size is 9.9Mb
sub-directories of 1Mb or more:
libs 8.0Mb
Currently failing:
Error in dyn.load(file, DLLpath = DLLpath, ...) :
unable to load shared object '/home/travis/R/Library/ape/libs/ape.so':
libRlapack.so: cannot open shared object file: No such file or directory

## Downstream dependencies
My interpretation is that this is not due to vcfR.

I have also run R CMD check on downstream dependencies of vcfR
All packages that I could install passed:
AppVeyor:
* Windows Server 2012 R2 x64 (build 9600), R Under development (unstable) (2020-08-24 r79088)
* Windows Server 2012 R2 x64 (build 9600), R version 4.0.2 (2020-06-22)

devtools::revdep_check() is no longer a part of devtools and the package revdepcheck (on GitHub but not CRAN) threw an error because a dependent R package could not be installed.
rhub:
* None for this submission


## R CMD check results

I used:
R CMD check --as-cran
on the tarballs from CRAN for the following packages.
* checking CRAN incoming feasibility ... NOTE
Maintainer: 'Brian J. Knaus <[email protected]>'

Reverse imports: binmapr, pcadapt, whoa
Reverse suggests: LDheatmap, onemap, perfectphyloR, rehh, SimRVSequences
New submission

I spent an entire afternoon trying to install dependencies of reversedependencies of vcfR but was not successful.
It appears there are dependencies of these packages that are not available for R version 3.6.2.
Package was archived on CRAN

Possibly mis-spelled words in DESCRIPTION:
DNAbin (9:46)
VCF (2:33, 3:68, 4:62, 5:5, 8:30, 10:5)
VcfR (9:55)
genlight (9:36)

These words and acronyms are esoteric to working with genomic data and are all correctly spelled.

Found the following (possibly) invalid URLs:
URL: https://uswest.ensembl.org/info/docs/tools/vep/index.html
From: man/vep.Rd
Status: Error
Message: libcurl error code 60:
SSL certificate problem: unable to get local issuer certificate
(Status without verification: OK)
URL: https://www.internationalgenome.org/node/101
From: inst/doc/intro_to_vcfR.html
Status: Error
Message: libcurl error code 60:
SSL certificate problem: unable to get local issuer certificate
(Status without verification: OK)

These URLs all work when pasted into firefox.

Results:

WARNINGS were thrown because the version tested was the same as on CRAN.
* checking installed package size ... NOTE
installed size is 10.4Mb
sub-directories of 1Mb or more:
libs 8.5Mb

binmapr
* checking Rd cross-references ... NOTE
Package unavailable to check Rd xrefs: ‘qtl’
This has not been an issue in the past.

pcadapt
* checking package dependencies ... ERROR
Packages required but not available:
'mmapcharr', 'plotly', 'robust', 'RSpectra', 'rmio'

These do not appear to be available for R version 3.6.2.
* checking for future file timestamps ... NOTE
unable to verify current time

LDheatmap
Warning message:
package ‘snpStats’ is not available (for R version 3.6.2)
I interpret this as not an issue with vcfR.

onemap
package ‘MDSmap’ is not available (for R version 3.6.2)

## Thank you CRAN Core Team!

[CRAN Repository Policy](https://cran.r-project.org/web/packages/policies.html) states that all correspondence should be with CRAN and not members of the team.
However, I think its polite to thank those who have helped this project.
So I've decided to start a list of thanks with the hope that these individuals may see this in the future.

v1.12.0 Thank you Uwe Ligges for processing my submission!

v1.11.0 Thank you Uwe Ligges for processing my submission!

v1.10.0 Thank you Uwe Ligges for processing my submission!

v1.9.0 Thank you Uwe Ligges for processing my submission!

v1.8.0 Thank you Uwe Ligges for processing my submission!
4 changes: 3 additions & 1 deletion man/chromR_example.Rd

2 changes: 1 addition & 1 deletion man/vcfR2DNAbin.Rd

4 changes: 3 additions & 1 deletion man/vcfR_example.Rd

4 changes: 3 additions & 1 deletion man/vcfR_test.Rd

4 changes: 3 additions & 1 deletion man/vep.Rd

2 changes: 1 addition & 1 deletion vignettes/intro_to_vcfR.Rmd
@@ -41,7 +41,7 @@ This is one situation where not having complete chromosomes actually can be an a
## Data input


The vcfR package is designed to work with data from [VCF](http://www.1000genomes.org/node/101) files.
The vcfR package is designed to work with data from [VCF](https://www.internationalgenome.org/node/101) files.
The use of a sequence file ([FASTA format](https://en.wikipedia.org/wiki/FASTA_format)) and an annotation file ([GFF format](https://github.com/The-Sequence-Ontology/Specifications/blob/master/gff3.md)) can provide helpful context, but are not required.
We'll begin our example by locating the data files from the package 'pinfsc50.'
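
Note on the `R/vcfR_to_tidy_functions.R` change in this commit: it replaces base `ifelse()` with `dplyr::if_else()` and supplies an explicit third argument (`""`) where the old call passed only two. The commit does not state the reason beyond the NEWS entry on dplyr 1.0.0 compatibility, so the following is only a sketch of one observable difference between the two functions, not a claim about the author's motivation. The data frame `D` and the names `flags`, `base_empty`, and `dplyr_empty` are made up for illustration; only the column names `ID`, `Number`, `Type`, and `tt` come from the diff.

```r
# Toy INFO metadata; illustrative values only, not taken from vcfR.
library(dplyr)

D <- data.frame(
  ID     = c("DP", "AF", "DB"),
  Number = c(1, 1, 0),
  Type   = c("Integer", "Float", "Flag"),
  stringsAsFactors = FALSE
)

# Same shape as the updated pipeline: if_else() requires an explicit
# `false` value, whereas base ifelse() only evaluates `no` when needed.
flags <- D %>%
  dplyr::filter(Number == 0 & Type == "Flag") %>%
  dplyr::mutate(tt = dplyr::if_else(Type == "Flag", "f", ""))

# With zero-length input the two functions return different types:
# base ifelse() takes its type from `test`, if_else() from `true`/`false`.
base_empty  <- ifelse(logical(0), "f")
dplyr_empty <- dplyr::if_else(logical(0), "f", "")

class(base_empty)
class(dplyr_empty)
```

Because `if_else()` checks that both branches have a compatible type, a column built with it keeps a predictable type even when a filter leaves no rows, which is one reason it is often preferred inside `mutate()`.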
