SNAP Combine

SNAP Combine is a command-line tool that merges the contents of multiple single-locus DNA sequence files into a single multilocus output file. Several input and output file formats are supported. The files can be merged as either the union or the intersection of the individuals in all input loci. Additionally, Combine tracks the start and end positions of each input file, allowing the user to exclude variable sites or taxa, which is important when creating input files for multilocus analyses.

TO INSTALL: Download the file combine.zip and extract the contents.

SNAP Combine is distributed as a single Java jar file, Combine.jar. It was designed and tested on Mac OS X 10.3/10.4 with Java 1.4/1.5, but it should be compatible with other operating systems and more recent versions of Java.

COMMAND LINE

java -jar Combine.jar [-i | -u] [-rc {column range}] [-rr {row range}] [-I] OUTFILE [INFILE [... INFILE]]

EXAMPLE

java -jar Combine.jar combine_union.phy

 

Option                 Description

-u                     Union mode: includes individuals in the output file even if they are not represented in every input file. Missing regions are padded with '?' characters. Note: this mode is enabled by default and is mutually exclusive with intersection mode, described below.

-i                     Intersection mode: excludes any individual that is not represented in every input locus.

-rc [x-y[,...,x-y]]    Remove column: takes a hyphen-delimited range of columns, or a comma-delimited list of such ranges, and removes those columns from the final output file. Note: the range list must not contain spaces around the hyphens or commas.

-rr [x-y[,...,x-y]]    Remove row: takes a hyphen-delimited range of rows, or a comma-delimited list of such ranges, and removes those rows from the final output file. Note: the range list must not contain spaces around the hyphens or commas.

-I                     Interleave output: writes the sequences in interleaved layout instead of sequential layout (the default) for the output format selected by the output file's extension.
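The merge and removal options above can be illustrated with a minimal Python sketch. It is not Combine's actual implementation. The function names (merge_loci, parse_ranges, remove_columns), the dictionary representation of an alignment, and the choice of 1-based inclusive ranges are all assumptions made for illustration.

```python
def merge_loci(loci, mode="union"):
    """Concatenate per-locus alignments into one multilocus alignment.

    loci: list of dicts mapping individual name -> aligned sequence,
          one dict per input file (hypothetical in-memory representation).
    mode: "union" pads individuals missing from a locus with '?';
          "intersection" keeps only individuals present in every locus.
    """
    if not loci:
        return {}
    name_sets = [set(locus) for locus in loci]
    if mode == "union":
        names = set().union(*name_sets)
    elif mode == "intersection":
        names = set.intersection(*name_sets)
    else:
        raise ValueError("mode must be 'union' or 'intersection'")

    merged = {}
    for name in sorted(names):
        parts = []
        for locus in loci:
            # All sequences within one locus are assumed to have equal length.
            length = len(next(iter(locus.values()))) if locus else 0
            parts.append(locus.get(name, "?" * length))
        merged[name] = "".join(parts)
    return merged


def parse_ranges(spec):
    """Parse 'x-y[,x-y...]' into (start, end) pairs. No spaces allowed."""
    ranges = []
    for item in spec.split(","):
        start, end = item.split("-")
        ranges.append((int(start), int(end)))
    return ranges


def remove_columns(alignment, spec):
    """Drop the listed columns (assumed 1-based, inclusive) from every sequence."""
    drop = set()
    for start, end in parse_ranges(spec):
        drop.update(range(start, end + 1))
    return {
        name: "".join(ch for pos, ch in enumerate(seq, start=1) if pos not in drop)
        for name, seq in alignment.items()
    }


# Two small hypothetical loci; ind2 and ind3 each appear in only one of them.
locus_a = {"ind1": "ACGT", "ind2": "ACGA"}
locus_b = {"ind1": "TTG", "ind3": "TCG"}

union = merge_loci([locus_a, locus_b], mode="union")
intersection = merge_loci([locus_a, locus_b], mode="intersection")
trimmed = remove_columns(union, "1-2,6-7")
```

In this sketch the column removal is applied to the merged alignment, which follows the description of -rc as acting on the final output file.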

  

 

SUPPORTED FORMATS

Combine lets the user specify the desired output format implicitly through the filename extension of the output file. Additionally, each supported format has both a sequential and an interleaved sub-format; interleaved output is selected with the -I flag. The supported file extensions are listed below:

 

  

 

Format     Extension

NEXUS      nxs
CLUSTAL    aln
FASTA      fas
PHYLIP     phy
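As a rough illustration of how an output filename maps to a format and layout, the table above can be expressed as a lookup. This is a hypothetical Python sketch, not Combine's code: the function name output_format, the case-insensitive matching, and the error raised for an unknown extension are assumptions.

```python
import os

# Extension-to-format mapping taken from the table above.
OUTPUT_FORMATS = {"nxs": "NEXUS", "aln": "CLUSTAL", "fas": "FASTA", "phy": "PHYLIP"}


def output_format(outfile, interleaved=False):
    """Return (format, layout) implied by the output filename and the -I flag."""
    ext = os.path.splitext(outfile)[1].lstrip(".").lower()
    if ext not in OUTPUT_FORMATS:
        # Assumption: how Combine treats unknown extensions is not documented here.
        raise ValueError("unsupported output extension: " + repr(ext))
    layout = "interleaved" if interleaved else "sequential"
    return OUTPUT_FORMATS[ext], layout
```

For example, output_format("combine_union.phy") yields the PHYLIP format with the default sequential layout.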

 

Each of these file types is also supported as an input file format. Combine determines input file types from the actual content of each file, not its filename extension, so input files do not have to follow the extension conventions used to select the output format.

Also included in this distribution is MLCombine, a Java program that combines multiple single-locus MIGRATE infiles generated using SNAP Map into a single multilocus MIGRATE infile.

 

COMMAND LINE

java -jar MLCombine.jar OUTFILE [MIGRATE INFILE [... MIGRATE INFILE]]