Finding non-binary SNP candidates.
The NCBI has, amongst other things,
a public database of single nucleotide polymorphisms (SNPs).
We have made a tool that searches this database for non-binary SNP candidates
(SNPs with more than two observed alleles), which can be used for various purposes.
To install the program:
- Make sure the expat development files and libraries are installed.
- Get the program source.
- Compile the program.
To run the program:
- Get one of the files from this location.
- If we assume that the downloaded file is named gt_chrXX.xml.gz,
use the following command to find the SNP candidates:
zcat gt_chrXX.xml.gz | ./snp <threshold> > output.txt
The <threshold> argument specifies the minimum allele frequency as a
percentage. If it is omitted, the threshold defaults to 0. Increasing the
threshold can greatly reduce the amount of output; setting it to 1 or higher
is recommended.
Although we have not tested this program on files for other species, we
see no reason why it should not work. To find input files for other species,
go here, select one of the directories and choose the genotype subdirectory
(if present).
When running multiple sessions with different thresholds, it is recommended to
decompress the downloaded file once, so that it is not decompressed again for
every run, and then to run:
./snp <threshold> < gt_chrXX.xml > output.txt
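Repeated runs over the same decompressed file can be scripted. The following is a minimal sketch, not part of the distributed program: the function name run_thresholds, the variable SNP_BIN, and the output_<threshold>.txt naming scheme are assumptions made for illustration. It only relies on the command-line form shown above (threshold as the first argument, XML on standard input).

```shell
#!/bin/sh
# Hypothetical helper: run the snp program once per threshold against an
# already decompressed input file. SNP_BIN and run_thresholds are names
# invented for this sketch; adjust SNP_BIN to where the binary was built.
SNP_BIN="${SNP_BIN:-./snp}"

run_thresholds() {
    input="$1"   # decompressed XML file, e.g. gt_chrXX.xml
    shift
    for t in "$@"; do
        # One output file per threshold, e.g. output_1.txt (assumed naming).
        "$SNP_BIN" "$t" < "$input" > "output_${t}.txt" || return 1
    done
}

# Example usage (commented out so that sourcing this file runs nothing):
# gunzip -c gt_chrXX.xml.gz > gt_chrXX.xml   # decompress once
# run_thresholds gt_chrXX.xml 1 2 5          # three runs, three output files
# wc -l output_*.txt                         # compare the amount of output
```

Writing one output file per threshold keeps the results of the sessions apart, so the effect of raising the threshold can be compared directly.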