Commit f83e9d97 authored by Andreas Tille

Update upstream source from tag 'upstream/1.28.0+dfsg'

Update to upstream version '1.28.0+dfsg'
with Debian dir a9d2c4a3a022214d41b8a2d027e8c67618c1ace2
parents ecc4c9d7 30996592
Package: phyloseq
Version: 1.26.1
Date: 2018-07-15
Version: 1.28.0
Date: 2019-04-23
Title: Handling and analysis of high-throughput microbiome census data
Description: phyloseq provides a set of classes and tools
to facilitate the import, storage, analysis, and
......@@ -18,8 +18,8 @@ Imports: ade4 (>= 1.7.4), ape (>= 5.0), Biobase (>= 2.36.2),
1.4.1), scales (>= 0.4.0), vegan (>= 2.5)
Depends: R (>= 3.3.0)
Suggests: BiocStyle (>= 2.4), DESeq2 (>= 1.16.1), genefilter (>= 1.58),
knitr (>= 1.16), metagenomeSeq (>= 1.14), rmarkdown (>= 1.6),
testthat (>= 1.0.2)
knitr (>= 1.16), magrittr (>= 1.5), metagenomeSeq (>= 1.14),
rmarkdown (>= 1.6), testthat (>= 1.0.2)
VignetteBuilder: knitr
Enhances: doParallel (>= 1.0.10)
biocViews: ImmunoOncology, Sequencing, Microbiome, Metagenomics,
......@@ -37,11 +37,11 @@ Collate: 'allClasses.R' 'allPackage.R' 'allData.R' 'as-methods.R'
'network-methods.R' 'distance-methods.R'
'deprecated_functions.R' 'extend_DESeq2.R' 'phylo-class.R'
'extend_metagenomeSeq.R'
RoxygenNote: 6.0.1
RoxygenNote: 6.1.1
git_url: https://git.bioconductor.org/packages/phyloseq
git_branch: RELEASE_3_8
git_last_commit: a084072
git_last_commit_date: 2019-01-04
Date/Publication: 2019-01-04
git_branch: RELEASE_3_9
git_last_commit: a86ed1e
git_last_commit_date: 2019-05-02
Date/Publication: 2019-05-02
NeedsCompilation: no
Packaged: 2019-01-05 00:36:54 UTC; biocbuild
Packaged: 2019-05-03 00:24:42 UTC; biocbuild
......@@ -63,7 +63,7 @@
#' # Filter samples that don't have Enterotype
#' x <- subset_samples(enterotype, !is.na(Enterotype))
#' # (the taxa are at the genera level in this dataset)
#' res = mt(x, "Enterotype", method=c("fdr", "bonferroni"), test="f", B=300)
#' res = mt(x, "Enterotype", method="fdr", test="f", B=300)
#' head(res, 10)
#' ## # Not surprisingly, Prevotella and Bacteroides top the list.
#' ## # Different test, multiple-adjusted t-test, whether samples are ent-2 or not.
......
......@@ -159,7 +159,7 @@ rarefy_even_depth <- function(physeq, sample.size=min(sample_sums(physeq)),
message(length(rmsamples), " samples removed",
"because they contained fewer reads than `sample.size`.")
message("Up to first five removed samples are: \n")
message(rmsamples[1:min(5, length(rmsamples))], sep="\t")
message(paste(rmsamples[1:min(5, length(rmsamples))], sep="\t"))
message("...")
}
# Now done with notifying user of pruning, actually prune.
......
......@@ -4,6 +4,20 @@
[![Travis-CI Build Status](https://travis-ci.org/joey711/phyloseq.svg?branch=master)](https://travis-ci.org/joey711/phyloseq)
## Quick Install
In R terminal:
```
if(!requireNamespace("BiocManager")){
install.packages("BiocManager")
}
BiocManager::install("phyloseq")
```
See [the phyloseq installation page](http://joey711.github.io/phyloseq/install.html)
for further details, examples.
## Article on Improved Microbiome Analysis
McMurdie and Holmes (2014)
......
......@@ -544,19 +544,48 @@ A Google search for "phyloseq differential abundance"
will also likely turn up a number of useful, related resources.
# - I need help analyzing my data. It has the following study design...
I am currently a biostatistician at Second Genome, Inc.,
which offers complete
[end-to-end microbiome experiment solutions](http://www.secondgenome.com/solutions)
## Please be more specific
The biggest problem is often scope.
Your correspondence to open-source package authors
should be very specific features, bugs, or contributions;
where the answer will benefit other users who can later
find the post in a search or benefit from the new feature/bugfix.
## Please respect my time (and other package authors)
Package authors have finite time available to help.
There is an unavoidable trade-off of time
between package features/maintenance
and answering user correspondence.
My full-time efforts are in understanding the role of the microbiome
in human health and disease
as Senior Staff Data Scientist at a biotech startup in San Francisco
(a full-time job).
Meanwhile, there are thousands of monthly-downloads of phyloseq.
**It is a mathematical fact that I do not have time to help you analyze your specific dataset.**
What you post on the issue tracker might be answered by others,
but if it is sufficiently of the "help me analyze my data for me" variety of questions,
I will eventually close the issue.
## Pay for Help (not me)
I believe there are now competing options to help you with your analysis
as a fee-for-service.
I was previously a biostatistician and developer at one such company,
Second Genome, Inc.
which offered complete
end-to-end microbiome experiment solutions
as a fee-for-service.
In some cases Second Genome clients already have their microbiome data
and want to make use of our team of trained microbiome analysts
to get the most information from their expeirment.
and want to make use of their team of trained microbiome analysts
to get the most information from their experiment.
I recommend contacting one of the sales associates at the link above.
My day-to-day efforts are in understanding the role of the microbiome
in human health and disease.
If you're looking for a collaboration on your microbiome
data collection or data analysis,
please contact [Second Genome Solutions](http://www.secondgenome.com/solutions).
please contact [Second Genome's capable team](https://www.secondgenome.com/platform/microbiome-technologies/16s-sequencing-community-analysis).
......@@ -8,8 +8,8 @@
\usage{
UniFrac(physeq, weighted=FALSE, normalized=TRUE, parallel=FALSE, fast=TRUE)
\S4method{UniFrac}{phyloseq}(physeq, weighted = FALSE, normalized = TRUE,
parallel = FALSE, fast = TRUE)
\S4method{UniFrac}{phyloseq}(physeq, weighted = FALSE,
normalized = TRUE, parallel = FALSE, fast = TRUE)
}
\arguments{
\item{physeq}{(Required). \code{\link{phyloseq-class}}, containing at minimum
......
......@@ -9,8 +9,8 @@
\usage{
capscale.phyloseq(physeq, formula, distance, ...)
\S4method{capscale.phyloseq}{phyloseq,formula,dist}(physeq, formula, distance,
...)
\S4method{capscale.phyloseq}{phyloseq,formula,dist}(physeq, formula,
distance, ...)
\S4method{capscale.phyloseq}{phyloseq,formula,character}(physeq, formula,
distance, ...)
......
......@@ -12,10 +12,11 @@ distance(physeq, method, type = "samples", ...)
\S4method{distance}{phyloseq,ANY}(physeq, method)
\S4method{distance}{otu_table,character}(physeq, method, type = "samples",
...)
\S4method{distance}{otu_table,character}(physeq, method,
type = "samples", ...)
\S4method{distance}{phyloseq,character}(physeq, method, type = "samples", ...)
\S4method{distance}{phyloseq,character}(physeq, method, type = "samples",
...)
}
\arguments{
\item{physeq}{(Required). A \code{\link{phyloseq-class}} or
......
......@@ -4,8 +4,9 @@
\alias{gapstat_ord}
\title{Estimate the gap statistic on an ordination result}
\usage{
gapstat_ord(ord, axes = c(1:2), type = "sites", FUNcluster = function(x,
k) { list(cluster = pam(x, k, cluster.only = TRUE)) }, K.max = 8, ...)
gapstat_ord(ord, axes = c(1:2), type = "sites",
FUNcluster = function(x, k) { list(cluster = pam(x, k, cluster.only
= TRUE)) }, K.max = 8, ...)
}
\arguments{
\item{ord}{(Required). An ordination object. The precise class can vary.
......
......@@ -6,7 +6,8 @@
\usage{
import_mothur(mothur_list_file = NULL, mothur_group_file = NULL,
mothur_tree_file = NULL, cutoff = NULL, mothur_shared_file = NULL,
mothur_constaxonomy_file = NULL, parseFunction = parse_taxonomy_default)
mothur_constaxonomy_file = NULL,
parseFunction = parse_taxonomy_default)
}
\arguments{
\item{mothur_list_file}{(Optional). The list file name / location produced by \emph{mothur}.}
......
......@@ -4,10 +4,10 @@
\alias{import_qiime}
\title{Import function to read the now legacy-format QIIME OTU table.}
\usage{
import_qiime(otufilename = NULL, mapfilename = NULL, treefilename = NULL,
refseqfilename = NULL, refseqFunction = readDNAStringSet,
refseqArgs = NULL, parseFunction = parse_taxonomy_qiime, verbose = TRUE,
...)
import_qiime(otufilename = NULL, mapfilename = NULL,
treefilename = NULL, refseqfilename = NULL,
refseqFunction = readDNAStringSet, refseqArgs = NULL,
parseFunction = parse_taxonomy_qiime, verbose = TRUE, ...)
}
\arguments{
\item{otufilename}{(Optional). A character string indicating
......
......@@ -4,8 +4,8 @@
\alias{import_usearch_uc}
\title{Import usearch table format (\code{.uc}) to OTU table}
\usage{
import_usearch_uc(ucfile, colRead = 9, colOTU = 10, readDelimiter = "_",
verbose = TRUE)
import_usearch_uc(ucfile, colRead = 9, colOTU = 10,
readDelimiter = "_", verbose = TRUE)
}
\arguments{
\item{ucfile}{(Required). A file location character string
......
......@@ -4,8 +4,8 @@
\alias{microbio_me_qiime}
\title{Import microbio.me/qiime (QIIME-DB) data package}
\usage{
microbio_me_qiime(zipftp, ext = ".zip", parsef = parse_taxonomy_greengenes,
...)
microbio_me_qiime(zipftp, ext = ".zip",
parsef = parse_taxonomy_greengenes, ...)
}
\arguments{
\item{zipftp}{(Required). A character string that is the full URL
......
......@@ -77,7 +77,7 @@ data(enterotype)
# Filter samples that don't have Enterotype
x <- subset_samples(enterotype, !is.na(Enterotype))
# (the taxa are at the genera level in this dataset)
res = mt(x, "Enterotype", method=c("fdr", "bonferroni"), test="f", B=300)
res = mt(x, "Enterotype", method="fdr", test="f", B=300)
head(res, 10)
## # Not surprisingly, Prevotella and Bacteroides top the list.
## # Different test, multiple-adjusted t-test, whether samples are ent-2 or not.
......
......@@ -4,7 +4,8 @@
\alias{ordinate}
\title{Perform an ordination on phyloseq data}
\usage{
ordinate(physeq, method = "DCA", distance = "bray", formula = NULL, ...)
ordinate(physeq, method = "DCA", distance = "bray", formula = NULL,
...)
}
\arguments{
\item{physeq}{(Required). Phylogenetic sequencing data
......
......@@ -3,9 +3,7 @@
\name{parse_taxonomy_default}
\alias{parse_taxonomy_default}
\alias{parse_taxonomy_greengenes}
\alias{parse_taxonomy_default}
\alias{parse_taxonomy_qiime}
\alias{parse_taxonomy_default}
\title{Parse elements of a taxonomy vector}
\usage{
parse_taxonomy_default(char.vec)
......
......@@ -7,8 +7,8 @@
plot_heatmap(physeq, method = "NMDS", distance = "bray",
sample.label = NULL, taxa.label = NULL, low = "#000033",
high = "#66CCFF", na.value = "black", trans = log_trans(4),
max.label = 250, title = NULL, sample.order = NULL, taxa.order = NULL,
first.sample = NULL, first.taxa = NULL, ...)
max.label = 250, title = NULL, sample.order = NULL,
taxa.order = NULL, first.sample = NULL, first.taxa = NULL, ...)
}
\arguments{
\item{physeq}{(Required). The data, in the form of an instance of the
......
......@@ -6,8 +6,8 @@
\usage{
plot_net(physeq, distance = "bray", type = "samples", maxdist = 0.7,
laymeth = "fruchterman.reingold", color = NULL, shape = NULL,
rescale = FALSE, point_size = 5, point_alpha = 1, point_label = NULL,
hjust = 1.35, title = NULL)
rescale = FALSE, point_size = 5, point_alpha = 1,
point_label = NULL, hjust = 1.35, title = NULL)
}
\arguments{
\item{physeq}{(Required).
......
......@@ -4,11 +4,11 @@
\alias{plot_tree}
\title{Plot a phylogenetic tree with optional annotations}
\usage{
plot_tree(physeq, method = "sampledodge", nodelabf = NULL, color = NULL,
shape = NULL, size = NULL, min.abundance = Inf, label.tips = NULL,
text.size = NULL, sizebase = 5, base.spacing = 0.02,
ladderize = FALSE, plot.margin = 0.2, title = NULL, treetheme = NULL,
justify = "jagged")
plot_tree(physeq, method = "sampledodge", nodelabf = NULL,
color = NULL, shape = NULL, size = NULL, min.abundance = Inf,
label.tips = NULL, text.size = NULL, sizebase = 5,
base.spacing = 0.02, ladderize = FALSE, plot.margin = 0.2,
title = NULL, treetheme = NULL, justify = "jagged")
}
\arguments{
\item{physeq}{(Required). The data about which you want to
......
......@@ -41,7 +41,7 @@ the class of \code{physeq}.
}
\description{
This method merges species that have the same taxonomy at a certain
taxaonomic rank.
taxonomic rank.
Its approach is analogous to \code{\link{tip_glom}}, but uses categorical data
instead of a tree. In principal, other categorical data known for all taxa
could also be used in place of taxonomy,
......
......@@ -4,7 +4,6 @@
\name{transform_sample_counts}
\alias{transform_sample_counts}
\alias{transformSampleCounts}
\alias{transformSampleCounts}
\title{Transform abundance data in an \code{otu_table}, sample-by-sample.}
\usage{
transform_sample_counts(physeq, fun, ...)
......
################################################################################
# Use testthat to test phyloseq transformation functions/methods
################################################################################
library("phyloseq"); library("testthat")
library("phyloseq"); library("magrittr"); library("testthat")
# # # # TESTS!
################################################################################
# rarefy_even_depth
################################################################################
data("GlobalPatterns")
set.seed(711) # The random seed for randomly selecting subset of OTUs
randoOTUs = sample(taxa_names(GlobalPatterns), 100, FALSE)
GP100 = prune_taxa(randoOTUs, GlobalPatterns)
min_lib = 1000
# The default rng seed is being implied in this call (also 711)
rGP = suppressMessages(rarefy_even_depth(GP100, sample.size=min_lib, rngseed=FALSE))
rGPr = suppressMessages(rarefy_even_depth(GP100, sample.size=min_lib, rngseed=FALSE, replace=FALSE))
# The random seed for randomly selecting subset of OTUs
rngseed = 711
keepTaxa <-
GlobalPatterns %>% taxa_sums() %>% sort %>% tail(100) %>% names() %>%
# Append some OTUs that will be cut...
# c(., (GlobalPatterns %>% taxa_sums() %>% sort %>% names() %>% .[101:length(.)]) %>% sample(size = 100))
c(., c("12589", "10444", "9286", "374370", "63062", "158132", "324145", "180450", "178513", "542714"))
GP100 = prune_taxa(keepTaxa, GlobalPatterns)
min_lib = 4000
# The default rng seed is being implied in this call
rGP = rarefy_even_depth(GP100, sample.size=min_lib, rngseed=rngseed)
rGPr = rarefy_even_depth(GP100, sample.size=min_lib, rngseed=rngseed + 1L, replace=FALSE)
################################################################################
# Test that specific OTUs and samples were removed
################################################################################
test_that("Test that empty OTUs and samples were automatically pruned", {
rmOTU = setdiff(taxa_names(GP100), taxa_names(rGP))
expect_equal(length(rmOTU), 20L)
expect_equal(rmOTU[1:5], c("534601", "408325", "325564", "8112", "571917"))
expect_true(taxa_names(GP100)[taxa_sums(GP100) <= 0] %in% rmOTU)
setdiff(taxa_names(rGP), taxa_names(rGPr))
expect_equal(length(rmOTU), 3L)
expect_equal(rmOTU, c("10444", "9286", "542714"))
expect_true(all(taxa_sums(rGP) > 0))
rmsam = setdiff(sample_names(GP100), sample_names(rGP))
expect_equal(length(rmsam), 12L)
expect_equal(rmsam[1:5], c("M11Fcsw", "M31Tong", "M11Tong", "NP2", "TRRsed1"))
expect_equal(length(rmsam), 2L)
expect_equal(rmsam, c("CC1", "SV1"))
expect_true(all(sample_sums(rGP) > 0))
})
################################################################################
# Test specific values. Should be reproducible, and you set the seed.
################################################################################
test_that("Test values", {
# with replacement values
expect_equal(as(otu_table(rGP)[1, 3:10], "vector"), rep(0, 8))
expect_equal(as(otu_table(rGP)[2, 1:10], "vector"), c(rep(0, 9), 2))
expect_equal(as(otu_table(rGP)[3, 8:12], "vector"), c(892, 956, 56, 10, 25))
expect_equal(as(otu_table(rGP)[70:78, 4], "vector"),
c(710, 2, 0, 2, 0, 8, 154, 2, 0))
# without replacement values
expect_equal(as(otu_table(rGPr)[1, 3:10], "vector"), c(rep(0, 7), 1))
expect_equal(as(otu_table(rGPr)[2, 1:10], "vector"),
c(rep(0, 5), 4, 0, 877, 960, 55))
expect_equal(as(otu_table(rGPr)[3, 8:12], "vector"),
c(10, 34, 2, 0, 2))
expect_equal(as(otu_table(rGPr)[70:78, 4], "vector"),
c(0, 706, 1, 0, 2, 0, 5, 173, 1))
})
################################################################################
# Include tests from the rarefy-without-replacement results, used by many.
#################################################################################
test_that("Test library sizes are all the same set value", {
......@@ -55,7 +41,7 @@ test_that("Test library sizes are all the same set value", {
expect_true(all(sample_sums(rGPr)==min_lib))
})
test_that("The same samples should have been cut in each results", {
expect_equal(nsamples(rGP), 14)
expect_equal(nsamples(rGP), 24)
expect_true(setequal(sample_names(rGP), sample_names(rGPr)))
})
################################################################################
......@@ -100,11 +100,11 @@ data("enterotype")
test_that("filter_taxa gives correct, reliable logicals and pruning", {
flist <- filterfun(kOverA(5, 2e-05))
ent.logi <- filter_taxa(enterotype, flist)
expect_that(ent.logi, is_a("logical"))
expect_is(ent.logi, ("logical"))
ent.trim <- filter_taxa(enterotype, flist, TRUE)
expect_that(ent.trim, is_a("phyloseq"))
expect_that(sum(ent.logi), equals(ntaxa(ent.trim)))
expect_that(prune_taxa(ent.logi, enterotype), is_identical_to(ent.trim))
expect_that(ntaxa(ent.trim), equals(416L))
expect_that(nsamples(ent.trim), equals(nsamples(enterotype)))
expect_is(ent.trim, ("phyloseq"))
expect_equal(sum(ent.logi), (ntaxa(ent.trim)))
expect_identical(prune_taxa(ent.logi, enterotype), (ent.trim))
expect_equal(ntaxa(ent.trim), (416L))
expect_equal(nsamples(ent.trim), (nsamples(enterotype)))
})
......@@ -11,9 +11,9 @@ test_that("Can transform_sample_counts of an OTU table that is either orientatio
OTU0 = otu_table(esophagus)
OTU1 = transform_sample_counts(OTU0, rank)
OTU2 = transform_sample_counts(t(OTU0), rank)
expect_that(identical(ntaxa(OTU0), ntaxa(OTU1)), is_true(),
expect_true(identical(ntaxa(OTU0), ntaxa(OTU1)),
"ntaxa OTU1 doesn't match original after transformation.")
expect_that(identical(ntaxa(OTU0), ntaxa(OTU2)), is_true(),
expect_true(identical(ntaxa(OTU0), ntaxa(OTU2)),
"ntaxa OTU2 doesn't match original after transformation.")
})
test_that("Can transform_sample_counts on phyloseq with either orientation", {
......