Using a positive control #7

Open
RichStack opened this issue May 30, 2023 · 4 comments
Labels
enhancement New feature or request

Comments

@RichStack

Hi again,
I have a mock community I'm running through with my other samples, but I'm a little unsure how to specify this in the config file.

I am able to point to the mock sample by specifying the metadata value under 'sample-type' column, but I don't know how to direct grimer to the expected composition of the mock.
I have a tsv file which contains taxonomic levels for each mock member and expected relative frequency. Is it possible to also direct grimer to this data in the config file?
Thanks in advance.

@pirovc
Owner

pirovc commented May 30, 2023

Can you give me some examples of your files so I can better help?

@RichStack
Author

Thanks. This is an example of the mock community expected relative frequency tsv file I have:

Taxonomy MC
Bacteria;Actinobacteriota;Actinobacteria;Micrococcales;Micrococcaceae;Micrococcus 0.02
Bacteria;Firmicutes;Bacilli;Lactobacillales;Enterococcaceae;Enterococcus 0.23
Bacteria;Proteobacteria;Gammaproteobacteria;Pseudomonadales;Moraxellaceae;Acinetobacter 0.23
Bacteria;Proteobacteria;Gammaproteobacteria;Enterobacterales;Morganellaceae;Providencia 0.23
Bacteria;Firmicutes;Bacilli;Staphylococcales;Staphylococcaceae;Staphylococcus 0.02
Bacteria;Proteobacteria;Gammaproteobacteria;Enterobacterales;Enterobacteriaceae;Escherichia-Shigella 0.02
Bacteria;Firmicutes;Bacilli;Lactobacillales;Lactobacillaceae;Lactobacillus 0.02
Bacteria;Proteobacteria;Gammaproteobacteria;Pseudomonadales;Pseudomonadaceae;Pseudomonas 0.23

So it has taxonomic ranks of the organisms with a ';' separator, as does my input file, which is also in tsv format.

My Mock Community (MC) is one of the sample types included in the input tsv file, and I have sequence counts for this sample as with all the others.

I have a sample metadata file too - and this includes a categorical column, specifying sample-type. The Mock Community I ran in this run of sequencing is identified as 'sample-type' Mock.

So I can include this in the config file:

controls:
  "Positive Control":
    "sample-type":
      - "Mock"

However, that only tells grimer which sample is the mock; it doesn't tell grimer what relative frequencies are expected.
I can see how negative controls are easily run as part of the decontam package, but I'm unsure how Mock Controls are interpreted.

So, I guess one of my questions is, is it possible for grimer to handle this type of sample, or is it just negative/blanks?

Hope that makes sense, and thanks.

@pirovc added the enhancement label May 31, 2023
@pirovc
Owner

pirovc commented May 31, 2023

Now I get it, but unfortunately there's no support for this kind of analysis in GRIMER. By adding the "sample-type": - "Mock" in your controls, GRIMER will only check which organisms are in your mock samples and treat them as positive controls, without checking their relative frequencies, as you noted.

If there were such a feature, how would you expect to see it in the plots/table? I'll mark this as an enhancement.

@RichStack
Author

Hi thanks for confirming with me.

There is a function in Qiime that carries out a linear regression for mock expected relative frequencies versus those observed, so I can get this functionality elsewhere, but your question made me think about this use of mock samples in GRIMER in the context of use as a reliable positive control.
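
For anyone wanting this comparison outside GRIMER, here is a rough sketch of an expected-versus-observed regression in plain Python. It is not GRIMER functionality and not the Qiime implementation, just an ordinary least-squares fit written by hand. The function names (`read_expected`, `relative_frequencies`, `fit_line`, `compare_mock`) are made up for illustration, and it assumes the expected file is a two-column tab-separated table with a header row, like the example posted above.

```python
import csv
import io


def read_expected(tsv_text):
    """Parse a two-column TSV (taxonomy, expected relative frequency).

    Assumes a header row and a tab between the two columns.
    """
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    next(reader)  # skip the header row
    return {row[0]: float(row[1]) for row in reader if row}


def relative_frequencies(counts):
    """Convert raw sequence counts per taxon into relative frequencies."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("mock sample has no counts")
    return {taxon: n / total for taxon, n in counts.items()}


def fit_line(xs, ys):
    """Least-squares fit of ys = intercept + slope * xs.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("expected frequencies are all identical; slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy) if syy else 1.0
    return slope, intercept, r_squared


def compare_mock(expected, observed_counts):
    """Regress observed on expected relative frequencies for the mock sample.

    Expected taxa that are absent from the sample count as 0.0 observed.
    Taxa that are observed but not expected still contribute to the total
    used for normalisation, but are not points in the regression.
    """
    observed = relative_frequencies(observed_counts)
    taxa = sorted(expected)
    xs = [expected[t] for t in taxa]
    ys = [observed.get(t, 0.0) for t in taxa]
    return fit_line(xs, ys)
```

A slope near 1, an intercept near 0 and a high r_squared would suggest the mock came through roughly as expected. The taxonomy strings have to match exactly between the expected file and the count table for the lookup to work.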

In my last run my mock sample also contained contaminant sequences, and although these sequences are usually shared with negative controls, I wasn't sure whether adding the mock as a positive control to the GRIMER analysis would stop these sequences from being flagged as contaminants (hope I explained that well). So maybe a feature that at least lets the user indicate which sequences are expected in the mock sample would allow for some useful filtering.
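
To make the idea concrete, this is roughly the kind of filtering I mean, as a small Python sketch. It is purely illustrative: `split_mock_taxa` is a made-up helper, not something GRIMER provides, and it assumes the taxon names in the mock counts match the names in the expected list exactly.

```python
def split_mock_taxa(observed_counts, expected_taxa):
    """Separate taxa seen in the mock sample into expected and unexpected.

    observed_counts: dict mapping taxon name -> sequence count in the mock.
    expected_taxa:   iterable of taxon names the mock is supposed to contain.

    Returns (found, unexpected, missing):
      found      - expected taxa with a non-zero count
      unexpected - observed taxa that are not in the expected list
                   (candidate contaminants)
      missing    - expected taxa with no reads, sorted by name
    """
    expected_taxa = set(expected_taxa)
    found = {t: n for t, n in observed_counts.items()
             if n > 0 and t in expected_taxa}
    unexpected = {t: n for t, n in observed_counts.items()
                  if n > 0 and t not in expected_taxa}
    missing = sorted(expected_taxa - set(found))
    return found, unexpected, missing
```

Only the `found` taxa would then be treated as genuine positive-control members, while anything in `unexpected` could still be checked against the negative controls.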

Thanks again for getting back to me, and would just like to say that I really like the functionality of GRIMER so far. The database of contaminant organisms that you've curated has been excellent and very useful to me - so thanks very much. R
