Commits on Source (7)
language: python
python:
- "2.7"
- "3.4"
- "3.5"
- "3.6"
- "3.6-dev"
before_install:
- sudo apt-get build-dep python-scipy
- pip install scipy
install:
#- pip install --upgrade pip setuptools wheel
#- pip install --only-binary numpy --only-binary scipy
- pip install -r requirements.txt
script:
#- pytest
# Normal run
- python scoary.py -g scoary/exampledata/Gene_presence_absence.csv -t scoary/exampledata/Tetracycline_resistance.csv -o Test1 --no-time
# No_pairwise run
- python scoary.py -g scoary/exampledata/Gene_presence_absence.csv -t scoary/exampledata/Tetracycline_resistance.csv --no_pairwise -o Test2 --no-time
# Restricted run
- python scoary.py -g scoary/exampledata/Gene_presence_absence.csv -t scoary/exampledata/Tetracycline_resistance.csv -r scoary/exampledata/Restrict_to.csv -w -o Test3 --no-time
# Advanced opts run
- python scoary.py -g scoary/exampledata/Gene_presence_absence.csv -t scoary/exampledata/Tetracycline_resistance.csv -p 0.01 1E-5 -c B EPW --collapse -m 50 -u -n scoary/exampledata/ExampleTree.nwk --threads 4 -o Test4 --no-time
- python scoary/vcf2scoary.py --force scoary/exampledata/Example.vcf
# Add test to verify output
- python 'tests/test_scoary_output.py'
# CHANGELOG
v1.6.16 (Aug 2017)
- Bug fixes to vcf2scoary
v1.6.15 (Jul 2017)
- Apparently 1.6.14 did not fix the pip issue for all users. Deleted the function that seemed to cause the pip crash
v1.6.14 (Jul 2017)
- Fixed a bug where Scoary could not be upgraded using pip
v1.6.13 (Jul 2017)
- Fixed a bug where Scoary would not handle converted VCF files due to different headers
- Changed the parameters of --test
- Fixed the GUI that was broken since 1.6.12 due to introduction of --include_input_columns
- Include_input_columns now correctly handles the ALL keyword
- Added the ExampleVCFTrait.csv file to test VCF functionality
v1.6.12 (Jun 2017)
- Convert VCF files to Roary/Scoary format, allowing analysis on a wide range of variants (SNPs, indels, structural variations etc)
- Grab columns from the Roary input and put in the output (To get strain-specific protein names, for example)
- Scoary now comes with a manual, located under docs/tex/scoary_manual.pdf
- The log now includes the original command line
v1.6.11 (Apr 2017)
- Blank values in trait files will now correctly be read as missing. Fixes (#54)
- Added --no_pairwise option for simple set differences / categorical enrichment analysis without causal hypothesis (as requested, among others, in #53)
- Modified GUI with no_pairwise and slightly modified look.
- Added ExampleTree.nwk to exampledata
- Added support for travis. (CI tests will be further developed)
- Added example cases in README.md
- Fixed broken links in README.md
v1.6.10 (Jan 2017)
- Scoary now creates a log file (both in terminal and GUI mode)
- Fixed a bug where empirical p-values would exceed 1.0
- Fixed a bug where Scoary would crash when pruning many isolates from the internally calculated phylogenetic tree
v1.6.9 (Dec 2016)
- Scoary now handles missing data specified in the traits file as "NA", "-" or ".".
- Now also handles missing isolates (rows) in the traits file.
......
graft scoary/exampledata
graft docs
......@@ -3,6 +3,10 @@
Scoary is designed to take the gene_presence_absence.csv file from [Roary](https://sanger-pathogens.github.io/Roary/) as well as a traits file created by the user and calculate the associations between all genes in the accessory genome and the traits. It reports a list of genes sorted by strength of association per trait.
[![DOI](https://zenodo.org/badge/51000172.svg)](https://zenodo.org/badge/latestdoi/51000172)
[![PyPI version](https://badge.fury.io/py/scoary.svg)](https://badge.fury.io/py/scoary)
[![Build Status](https://travis-ci.org/AdmiralenOla/Scoary.svg?branch=master)](https://travis-ci.org/AdmiralenOla/Scoary)
[![OMICtools](https://omictools.com/img/logo-blue.png)](https://omictools.com/association-mapping-category)
## Contents
- [What's new](#whats-new)
......@@ -15,6 +19,7 @@ Scoary is designed to take the gene_presence_absence.csv file from [Roary] (http
- [Options](#options)
- [Population structure](#population-structure)
- [Example data](#example-data)
- [Examples](#examples)
- [License](#license)
- [Etymology](#etymology)
- [Bugs](#bugs)
......@@ -27,13 +32,14 @@ Scoary is designed to take the gene_presence_absence.csv file from [Roary] (http
## What's new?
**LATEST VERSION - 1.6.9**
**LATEST VERSION - 1.6.16**
- Bug fixes to vcf2scoary
All changes are logged in the [CHANGELOG](CHANGELOG.md)
## Dependencies
- Python (Tested with versions 2.7 and 3.5)
- Python (Tested with versions 2.7, 3.4, 3.5, 3.6 and 3.6-dev)
- [SciPy](http://www.scipy.org/install.html) (Tested with versions 0.16, 0.17, 0.18)
#### If you supply custom trees (Optional)
......@@ -92,7 +98,7 @@ to bring up a graphical interface. It is fairly intuitive, has a progress bar an
scoary.py -g <gene_presence_absence.csv> -t <traits.csv>
## Input
Scoary requires two input files: The gene_presence_absence.csv file from [Roary] (https://sanger-pathogens.github.io/Roary/) and a list of traits to test associations to.
Scoary requires two input files: The gene_presence_absence.csv file from [Roary](https://sanger-pathogens.github.io/Roary/) and a list of traits to test associations to. A trait can be anything that you can classify into binary categories (e.g. antibiotic resistance, group membership (yes/no), MIC value higher/lower than 16).
The **gene_presence_absence.csv** file will look something like this:
![gene_presence_absence.csv output](http://sanger-pathogens.github.io/Roary/images/gene_presence_and_absence.png)
......@@ -121,6 +127,23 @@ It should look something like this:
You can see an example of how the input files could look in the exampledata folder.
#### LS-BSR input
You can also use as input the pan-genome as called from Jason Sahl's program [LS-BSR](https://github.com/jasonsahl/LS-BSR) (Large-Scale Blast Score Ratio). The program includes a python script for converting LS-BSR output to the Roary/Scoary format.
#### Converting VCF files to use as Scoary input
From version 1.6.12, Scoary has a function for converting VCF files to the Roary/Scoary format. This allows you to use a wide range of variants (e.g. SNPs, indels, structural variants etc.) in your input. The script can be run using the following command:
vcf2scoary myvariants.vcf
The current vcf2scoary script is a beta version, and may not correctly handle every VCF file. (Please report bugs!)
You can extract specific variant types (e.g. snps, indels, complex etc) using the --types argument. (This requires that TYPE=XX is defined in the INFO column of the vcf file.) The following command line would for example extract all snp, mnp and complex types.
vcf2scoary --types snp,mnp,complex myvariants.vcf
Note that Scoary simplifies analysis for variants with more than 2 alleles. Rather than comparing all possible contrasts, it compares each non-reference allele with the reference. Say for example that 4 different alleles exist at a known SNP site. Let's call them A, C, G, and T, and let A be the reference allele. (The reference category is always inferred from the VCF file). This site can be encoded in a single line in a VCF file, but in the Scoary format it needs to be spread over 3 different lines (one for each contrast to the reference, i.e. A vs C, A vs G, and A vs T). Thus, not every possible contrast is tested in the association analysis! It is for example possible that there is a real difference in phenotype between G and T, but this contrast is not tested.
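The reference-versus-alternative expansion described above can be sketched as follows. This is a minimal illustration, not the code in vcf2scoary: the function name, the row naming scheme and the haploid genotype dictionary are all hypothetical, and coding isolates that carry a different non-reference allele as 0 is an assumption made here for simplicity.

```python
def split_multiallelic(site_id, ref, alts, genotypes):
    """Expand one multi-allelic site into one presence/absence row per ALT allele.

    genotypes maps isolate name -> allele carried (hypothetical haploid calls).
    Returns {row_name: {isolate: 0 or 1}}.
    """
    rows = {}
    for alt in alts:
        # One row per contrast to the reference, e.g. "pos1234_A_vs_C".
        row_name = "{}_{}_vs_{}".format(site_id, ref, alt)
        # Assumption: 1 means the isolate carries this ALT allele, 0 otherwise.
        rows[row_name] = {
            isolate: int(allele == alt) for isolate, allele in genotypes.items()
        }
    return rows


calls = {"iso1": "A", "iso2": "C", "iso3": "G", "iso4": "T"}
rows = split_multiallelic("pos1234", "A", ["C", "G", "T"], calls)
for name in sorted(rows):
    print(name, rows[name])
```

With four alleles this yields three rows, and no row contrasts G directly with T.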
#### Missing data
Don't worry if you have not measured the phenotype for all your traits. From v1.6.9 on, Scoary can handle missing data. The missing values need to be specified as "NA", "." or "-". Note that Scoary does not actually specify any kind of uncertainty model for these missing values; it simply excludes them from further analysis.
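As a rough sketch of what "excluding" means here, the snippet below drops isolates whose trait value is one of the missing markers before any counting is done. The function and variable names are hypothetical and this is not Scoary's own parser. Blank values are also treated as missing, in line with the v1.6.11 changelog entry.

```python
# Markers that denote a missing trait value; "" covers blank cells.
MISSING = {"NA", ".", "-", ""}


def usable_isolates(trait_column):
    """Return {isolate: 0 or 1}, leaving out isolates with a missing value."""
    kept = {}
    for isolate, raw in trait_column.items():
        value = raw.strip()
        if value in MISSING:
            continue  # excluded from further analysis, no imputation
        kept[isolate] = int(value)
    return kept


column = {"iso1": "1", "iso2": "NA", "iso3": "0", "iso4": "-", "iso5": ".", "iso6": ""}
print(usable_isolates(column))
```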
......@@ -164,7 +187,7 @@ usage: scoary.py [-h] [-t TRAITS] [-g GENES] [-o OUTDIR]
[--threads THREADS] [--no-time] [--test] [--citation]
[--version]
Scoary version 1.6.9 - Screen pan-genome for trait-associated genes
Scoary version 1.6.10 - Screen pan-genome for trait-associated genes
optional arguments:
-h, --help show this help message and exit
......@@ -288,11 +311,13 @@ From version 1.4.0, you can also mix different restrictions together. For exampl
Alternatively, you can specify a single (one) p-value, and this will be taken as the filter for all the specified -c options. For example _-c EPW BH -p 0.05_ will filter the results to only include genes where the entire range of pairwise comparison p-values as well as the Benjamini-Hochberg p-values are < 0.05.
#### The -u flag
Calling Scoary with the **-u** flag will cause it to write a newick file of the UPGMA tree that is calculated internally. The tree is based on pairwise Hamming distances in the gene_presence_absence matrix.
Calling Scoary with the **-u** flag will cause it to write a newick file of the UPGMA tree that is calculated internally. The tree is based on pairwise Hamming distances in the gene_presence_absence matrix. Taxa have to be named the same as they are in the gene presence/absence and trait files.
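To make the distance measure concrete, here is a small sketch of pairwise Hamming distances between isolates in a presence/absence matrix. It covers only the distance step, not the UPGMA clustering itself, and it is not Scoary's internal code: the function name and the toy matrix are hypothetical, and the real implementation may normalise or store distances differently.

```python
from itertools import combinations


def hamming_distances(presence):
    """presence maps isolate -> list of 0/1 calls, one per gene, same gene order.

    Returns {(isolate_a, isolate_b): number of genes where the calls differ}.
    """
    dist = {}
    for a, b in combinations(sorted(presence), 2):
        dist[(a, b)] = sum(x != y for x, y in zip(presence[a], presence[b]))
    return dist


matrix = {
    "iso1": [1, 0, 1, 1],
    "iso2": [1, 1, 1, 0],
    "iso3": [0, 1, 0, 0],
}
print(hamming_distances(matrix))
```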
#### The -n parameter
Can be used to supply a custom phylogenetic tree (in newick format) to Scoary. This tree will be used for calculating contrasting pairs rather than Scoary using the gene presence absence file for UPGMA calculation.
Note: The input sample tree topology is a fixed parameter in Scoary. It is assumed to be without error. By default, Scoary calculates a UPGMA tree topology internally from the presence/absence status in the gene presence/absence matrix, which is probably not the most robust data for phylogenetic inference. Since pairwise comparisons rely on the branching order in the tree, a best practices approach would be to supply tree(s) that you have calculated using a more robust approach (e.g. a ML tree based on your sequence data).
#### Post-analysis label-switching permutations
Use **-e X** to permute the dataset X times, rank the test estimators (number of successes (AB-ab pairs) / total number of contrasting pairs (i.e. AB-ab and Ab-aB)) and report the unpermuted test estimator's empirical p-value. This is calculated as (r+1)/(n+1), where r is the number of permuted estimators that exceed the unpermuted estimator in value and n is the total number of permutations (North, Curtis and Sham, 2002). Empirical p-values are great for deciding whether your result looks significant because of a true association or just by coincidence. The permutation procedure destroys the relationship between the variant and the phenotype, making the null hypothesis true, so each permuted test estimator is sampled under the null hypothesis. If these data look like your real data, you're in trouble. So if your empirical p-value is not low, chances are you are seeing a false positive result, even if your other p-values (Bonferroni, pairwise comparisons etc.) indicate significance of the variant. You can use empirical p-values as a results filter by using **-c P**.
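The (r+1)/(n+1) formula can be illustrated with a few lines of Python. This is only a sketch of the arithmetic: the function name and the example estimator values are made up, and Scoary's own permutation loop is not reproduced here.

```python
def empirical_p_value(observed, permuted):
    """(r + 1) / (n + 1), with r = permuted estimators exceeding the observed one."""
    r = sum(1 for value in permuted if value > observed)
    n = len(permuted)
    # float() keeps the division exact under Python 2.7 as well.
    return float(r + 1) / (n + 1)


# Hypothetical estimators: one observed value and four permuted values.
observed = 0.9
permuted = [0.5, 0.95, 0.7, 0.6]
print(empirical_p_value(observed, permuted))
```

Because of the +1 in both numerator and denominator, the empirical p-value can never be exactly zero: its smallest possible value is 1/(n+1).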
......@@ -334,6 +359,48 @@ Running Scoary with the --test flag is equivalent to the following command:
python ./scoary.py -t ./exampledata/Tetracycline_resistance.csv -g ./exampledata/Gene_presence_absence.csv -u -c I EPW
```
## Examples
Below are some common use cases, with examples of how to run Scoary and interpret the results.
#### 1. Resistance towards an antibiotic compound in Mycobacterium abscessus
A user wanted to screen for possible genetic causes of resistance towards a new antibiotic in Mycobacterium abscessus. One hypothesis was that the resistance could be related to a truncated form of a gene. In this experiment, the user had classified the resistance pattern as (S)usceptible, (I)ntermediate and (R)esistant. This information was coded into the traits file as dummy variables, e.g. the first trait was SI_vs_R and the second was S_vs_IR.
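The dummy coding mentioned above could be produced along these lines. The isolate names and the choice that 1 marks the more resistant group are assumptions made for illustration, as is the comma-separated layout with isolate names in the first column.

```python
def dummy_code(category):
    """Map one S/I/R call to the two binary traits named in the text."""
    return {
        # Assumption: 1 = resistant, 0 = susceptible or intermediate.
        "SI_vs_R": int(category == "R"),
        # Assumption: 1 = intermediate or resistant, 0 = susceptible.
        "S_vs_IR": int(category in ("I", "R")),
    }


# Hypothetical resistance calls for three isolates.
calls = {"isolate1": "S", "isolate2": "I", "isolate3": "R"}

lines = [",SI_vs_R,S_vs_IR"]
for name in sorted(calls):
    coded = dummy_code(calls[name])
    lines.append("{},{},{}".format(name, coded["SI_vs_R"], coded["S_vs_IR"]))
print("\n".join(lines))
```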
Mycobacterium abscessus contains numerous subspecies, and the user wanted to test only M. abscessus ss abscessus. The Roary output additionally contained other subspecies, such as M. abscessus ss massiliense. To avoid altering the Roary file, a csv was made containing the names of all isolates that were M. abscessus ss abscessus. To write a separate gene presence/absence file from only these isolates (and to speed up analysis), the -w parameter was used.
A high number of isolates was used in the experiment, so a strict significance threshold was chosen (i.e. low p-values were required). The experiment was interested in causal mutations, so pairwise comparisons had to be used, since population structure could be a major confounder. It was decided to require that the entire range of pairwise comparison p-values should be < 1E-4. Additionally, after 10,000 permutations the input configuration should be in the top 0.1 percent. (Among 10,000 randomly permuted datasets, no more than 9 were allowed to have an even higher number of contrasting pairs for a gene to be included in results).
A ML phylogeny was built with a dedicated tree program and provided as a custom tree.
Finally, since it was possible that the resistance determinant was inherited as a set of genes (such as a plasmid), the --collapse flag was used to collapse genes with identical distribution patterns.
The analysis was run with the following command:
```
scoary -t Resistancefile -g Gene_presence_absence.csv -p 1E-4 1E-3 -c EPW P -e 10000 -w -r OnlyAbscessusIsolates.csv --collapse -n raxmltree.nwk
```
Results showed that the top two hits were different alleles of the same gene, one positively and one negatively associated with the trait. (The two alleles were different enough to not be clustered as the same by Roary). The interpretation was that this gene was likely to play a role in the resistance pattern.
#### 2. Enrichment of genes in select host groups
Another user had a high number of E. coli isolated from different hosts, and wanted to know which genes were enriched in which host groups. In this case, Scoary was not used to infer causal association, but simply to discover trends in different sets. The input trait file consisted of dummy variable memberships to different host groups, reminiscent of the below table:
| | Cattle | Human | Sheep | Food |
| - | ------ | ----- | ----- | ---- |
| Ecoli1 | 1 | 0 | 0 | 0 |
| Ecoli2 | 0 | 1 | 0 | 0 |
| Ecoli3 | 0 | 1 | 0 | 0 |
Here, the user is not trying to infer which genes "cause" membership in a group, just which genes are overrepresented. Therefore, the --no_pairwise flag was used. The Benjamini-Hochberg adjusted p-value was used to only show the genes most overrepresented in a specific host group.
The analysis was run with the following command:
```
scoary -g gene_presence_absence.csv -t Hostgroup_membership.csv -p 1E-5 -c BH --no_pairwise
```
#### 3. SNPs linked to penicillin resistance in Neisseria meningitidis
For population structure-aware association analysis to work, it is imperative to work on trees that best represent the genealogy of the input sample. Due to the high frequency of recombination in Neisseria meningitidis, the internal tree builder in Scoary is likely to perform poorly. In this case (actually, in almost any case) it would be advisable to build the tree with a dedicated tree program and provide it to Scoary instead. There are now many programs that can produce phylogenetic trees where only the clonally evolved patterns are retained (i.e. "free" from the obfuscating effects of recombination). Some examples are [Gubbins](https://sanger-pathogens.github.io/gubbins), [ClonalFrameML](https://github.com/xavierdidelot/ClonalFrameML) and [BRATNextGen](http://www.helsinki.fi/bsg/software/BRAT-NextGen).
```
scoary -g gene_presence_absence.csv -t penicillinres.csv -n clonaltree.nwk
```
## License
Scoary is freely available under a GPLv3 license.
......@@ -356,8 +423,9 @@ In theory yes, but it requires some tinkering. Obviously you couldn't take the i
- **A lot of my empirical p-values are identical. Bug?**
Not really. In order to save time, Scoary calculates the empirical p-values as it goes through permutations. If it sees early on that a particular variant is not interesting (e.g. empirical p-value above 0.1) it does not waste any more resources on this variant. As a result, note that higher empirical p-values are less accurate (because they have been calculated from fewer permutations).
- **Can I use this for SNPs/kmers?**
- **Do I need to convert my gene_presence_absence.csv file into 1s and 0s rather than gene/locus_tag names?**
No. Scoary treats "0", "-" and "" as gene absence, anything else as presence. You should be able to feed the file directly from Roary.
- **Can I use this for archaea?**
Honestly, I don't know enough about archaea to say for sure.
......@@ -367,6 +435,7 @@ Most certainly not.
## Coming soon
- Multiprocessing also when using the GUI. (The GUI currently only uses a single thread. See Issues).
- Support for non-binary traits
- Please feel free to suggest improvements, point out bugs or methods that could be better optimized.
## Acknowledgements
......
......@@ -5,7 +5,7 @@ Scoary - Microbial pan-GWAS
Dependencies
------------
- Python (Tested with versions 2.7 and 3.5)
- Python (Tested with versions 2.7, 3.4, 3.5, 3.6 and 3.6-dev)
- `SciPy <http://www.scipy.org/install.html>`_ (Tested with versions 0.16, 0.17, 0.18)
If you supply custom trees (Optional)
......@@ -37,7 +37,7 @@ The most updated documentation for scoary is found at `the project site <https:/
Citation
--------
If you use Scoary, please citing our `paper <https://dx.doi.org/10.1186/s13059-016-1108-8>`_
If you use Scoary, please cite our `paper <https://dx.doi.org/10.1186/s13059-016-1108-8>`_
License
-------
......
scoary (1.6.9-1) UNRELEASED; urgency=medium
scoary (1.6.16-1) UNRELEASED; urgency=medium
* Initial release (Closes: #nnnn)
-- Afif Elghraoui <afif@debian.org> Sun, 15 Jan 2017 15:16:03 -0800
-- Andreas Tille <tille@debian.org> Tue, 29 Jan 2019 18:26:15 +0100
Source: scoary
Maintainer: Debian Med Packaging Team <debian-med-packaging@lists.alioth.debian.org>
Uploaders: Afif Elghraoui <afif@debian.org>,
Andreas Tille <tille@debian.org>
Section: science
Priority: optional
Maintainer: Debian Med Packaging Team <debian-med-packaging@lists.alioth.debian.org>
Uploaders: Afif Elghraoui <afif@debian.org>
Build-Depends:
debhelper (>=9),
Build-Depends: debhelper (>=9),
dh-python,
python-all,
python-setuptools,
python-scipy (>= 0.16),
# Test-Depends:
python-nose,
Standards-Version: 3.9.8
python-nose
Standards-Version: 4.3.0
Vcs-Browser: https://salsa.debian.org/med-team/scoary
Vcs-Git: https://salsa.debian.org/med-team/scoary.git
Homepage: https://github.com/AdmiralenOla/Scoary
Vcs-Git: https://anonscm.debian.org/debian-med/scoary.git
Vcs-Browser: https://anonscm.debian.org/cgit/debian-med/scoary.git
Package: scoary
Architecture: all
Depends:
${misc:Depends},
Depends: ${misc:Depends},
${python:Depends},
python-pkg-resources,
python-pkg-resources
Suggests: roary
Description: pangenome-wide association studies
Scoary is designed to take the gene_presence_absence.csv file from
......
......@@ -8,6 +8,7 @@ License: GPL-3.0
Files: debian/*
Copyright: 2017 Afif Elghraoui <afif@debian.org>
2019 Andreas Tille <tille@debian.org>
License: GPL-3.0
License: GPL-3.0
......
......@@ -6,8 +6,10 @@ Reference:
Volume: 17
Number: 238
DOI: 10.1186/s13059-016-1108-8
PMID: 27887642
URL: >
https://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-1108-8
eprint: https://genomebiology.biomedcentral.com/track/pdf/10.1186/s13059-016-1108-8
Registry:
- Name: OMICtools
Entry: OMICS_13120
......
version=4
opts="filenamemangle=s%(?:.*?)?v?(\d[\d.]*)\.tar\.gz%scoary-$1.tar.gz%" \
https://github.com/AdmiralenOla/Scoary/tags \
(?:.*?/)?v?(\d[\d.]*)\.tar\.gz debian uupdate
https://github.com/AdmiralenOla/Scoary/releases/latest .*/archive/v?@ANY_VERSION@@ARCHIVE_EXT@
tex/scoary_manual.pdf
\ No newline at end of file
@article{north2002note,
title={A note on the calculation of empirical P values from Monte Carlo procedures},
author={North, Bernard V and Curtis, David and Sham, Pak C},
journal={The American Journal of Human Genetics},
volume={71},
number={2},
pages={439--441},
year={2002},
publisher={Cell Press}
}
@article{read1995inference,
title={Inference from binary comparative data},
author={Read, Andrew F and Nee, Sean},
journal={Journal of Theoretical Biology},
volume={173},
number={1},
pages={99--108},
year={1995},
publisher={Elsevier}
}
@article{maddison2000testing,
title={Testing character correlation using pairwise comparisons on a phylogeny},
author={Maddison, Wayne P},
journal={Journal of Theoretical Biology},
volume={202},
number={3},
pages={195--204},
year={2000},
publisher={Elsevier}
}
@article{brynildsrud2016rapid,
title={Rapid scoring of genes in microbial pan-genome-wide association studies with Scoary},
author={Brynildsrud, Ola and Bohlin, Jon and Scheffer, Lonneke and Eldholm, Vegard},
journal={Genome biology},
volume={17},
number={1},
pages={238},
year={2016},
publisher={BioMed Central}
}
@article{page2015roary,
title={Roary: rapid large-scale prokaryote pan genome analysis},
author={Page, Andrew J and Cummins, Carla A and Hunt, Martin and Wong, Vanessa K and Reuter, Sandra and Holden, Matthew TG and Fookes, Maria and Falush, Daniel and Keane, Jacqueline A and Parkhill, Julian},
journal={Bioinformatics},
volume={31},
number={22},
pages={3691--3693},
year={2015},
publisher={Oxford Univ Press}
}
\relax
\citation{brynildsrud2016rapid}
\citation{page2015roary}
\@writefile{toc}{\contentsline {section}{\numberline {1}Scoary utility}{1}}
\@writefile{toc}{\contentsline {section}{\numberline {2}Installation}{1}}
\@writefile{toc}{\contentsline {subsection}{\numberline {2.1}Dependencies}{1}}
\@writefile{lof}{\contentsline {figure}{\numberline {1}{\ignorespaces Scoary GUI}}{2}}
\newlabel{fig:gui}{{1}{2}}
\@writefile{toc}{\contentsline {subsection}{\numberline {2.2}Installation}{2}}
\@writefile{toc}{\contentsline {section}{\numberline {3}Basic usage}{2}}
\@writefile{toc}{\contentsline {subsection}{\numberline {3.1}Getting started}{2}}
\@writefile{toc}{\contentsline {subsection}{\numberline {3.2}Input}{3}}
\@writefile{toc}{\contentsline {subsubsection}{\numberline {3.2.1}Gene presence/absence file}{3}}
\@writefile{lof}{\contentsline {figure}{\numberline {2}{\ignorespaces Input Roary file (Source: http://sanger-pathogens.github.io/Roary)}}{3}}
\newlabel{fig:gpa}{{2}{3}}
\@writefile{toc}{\contentsline {subsubsection}{\numberline {3.2.2}Traits file}{3}}
\gdef \LT@i {\LT@entry
{1}{69.32pt}\LT@entry
{1}{68.92001pt}\LT@entry
{1}{68.92001pt}\LT@entry
{1}{68.92001pt}\LT@entry
{1}{68.92001pt}}
\@writefile{lot}{\contentsline {table}{\numberline {1}{A properly formatted traits file}}{4}}
\newlabel{tab:traits}{{1}{4}}
\@writefile{toc}{\contentsline {subsubsection}{\numberline {3.2.3}Converting VCF files to use as Scoary input}{4}}
\gdef \LT@ii {\LT@entry
{2}{143.64886pt}\LT@entry
{1}{201.35114pt}}
\@writefile{toc}{\contentsline {subsection}{\numberline {3.3}Output}{5}}
\@writefile{lot}{\contentsline {table}{\numberline {2}{Explanation of columns in the output}}{5}}
\newlabel{tab:cols}{{2}{5}}
\@writefile{lot}{\contentsline {table}{\numberline {2}{Explanation of columns in the output}}{6}}
\newlabel{tab:cols}{{2}{6}}
\@writefile{toc}{\contentsline {section}{\numberline {4}Advanced usage}{7}}
\@writefile{toc}{\contentsline {subsection}{\numberline {4.1}Restricting analysis to a subset of isolates with the -r parameter}{9}}
\@writefile{toc}{\contentsline {subsection}{\numberline {4.2}Getting input right when using non-standard Roary files using -s}{9}}
\@writefile{toc}{\contentsline {subsection}{\numberline {4.3}Controlling the output}{10}}
\@writefile{toc}{\contentsline {subsection}{\numberline {4.4}Writing a newick tree}{10}}
\@writefile{toc}{\contentsline {subsection}{\numberline {4.5}Setting a custom tree with the -n parameter}{10}}
\citation{north2002note}
\citation{read1995inference}
\citation{maddison2000testing}
\@writefile{toc}{\contentsline {subsection}{\numberline {4.6}Post-analysis label-switching permutations}{11}}
\@writefile{toc}{\contentsline {subsection}{\numberline {4.7}Collapsing correlated variants}{11}}
\@writefile{toc}{\contentsline {section}{\numberline {5}Population structure}{11}}
\@writefile{lof}{\contentsline {figure}{\numberline {3}{\ignorespaces A not-so-significant link between gene and trait}}{12}}
\newlabel{fig:badlink}{{3}{12}}
\@writefile{lof}{\contentsline {figure}{\numberline {4}{\ignorespaces A significant link between gene and trait}}{13}}
\newlabel{fig:goodlink}{{4}{13}}
\@writefile{lof}{\contentsline {figure}{\numberline {5}{\ignorespaces A best possible pairing}}{13}}
\newlabel{fig:best}{{5}{13}}
\@writefile{lof}{\contentsline {figure}{\numberline {6}{\ignorespaces A worst possible pairing}}{14}}
\newlabel{fig:worst}{{6}{14}}
\@writefile{toc}{\contentsline {section}{\numberline {6}Example data}{14}}
\@writefile{toc}{\contentsline {section}{\numberline {7}Example use cases}{15}}
\gdef \LT@iii {\LT@entry
{1}{69.32pt}\LT@entry
{1}{68.92001pt}\LT@entry
{1}{68.92001pt}\LT@entry
{1}{68.92001pt}\LT@entry
{1}{68.92001pt}}
\@writefile{lot}{\contentsline {table}{\numberline {3}{Traits input for example 2}}{16}}
\newlabel{tab:hostgroups}{{3}{16}}
\@writefile{toc}{\contentsline {section}{\numberline {8}License}{17}}
\@writefile{toc}{\contentsline {section}{\numberline {9}Etymology}{17}}
\@writefile{toc}{\contentsline {section}{\numberline {10}FAQ}{17}}
\bibdata{citations}
\bibcite{brynildsrud2016rapid}{1}
\@writefile{toc}{\contentsline {section}{\numberline {11}Acknowledgements}{19}}
\@writefile{toc}{\contentsline {section}{\numberline {12}Feedback}{19}}
\@writefile{toc}{\contentsline {section}{\numberline {13}Citation}{19}}
\@writefile{toc}{\contentsline {section}{\numberline {14}Contact}{19}}
\bibcite{page2015roary}{2}
\bibcite{north2002note}{3}
\bibcite{read1995inference}{4}
\bibcite{maddison2000testing}{5}
\bibstyle{unsrt}