The RAq Approach: Readme Documentation


(c) 2006 Suzanne Matthews


Table of Contents:

  1. Project description
  2. Minimum requirements
  3. Project directory structure
  4. Known bugs
  5. Compile and execution instructions
  6. FAQ
  7. Notice of GPL
  8. Contact information

Project description:

The RAqApproach software package creates a starting tree for phylogenetic construction using the RAq Approach. After the program runs, the starting tree is written in Newick format to a file called newick.txt. This file can then be used to supply the tree needed for a parsimony-ratchet analysis, or simply viewed in TreeView or a similar program.
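
For readers unfamiliar with Newick format, the following is a minimal sketch of how a small tree can be written as a Newick string. The Node struct and both function names are hypothetical and are not taken from the RAqApproach sources; branch lengths and quoting of unusual taxon names are left out.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical tree node, for illustration only.
struct Node {
    std::string name;            // taxon label; empty for internal nodes
    std::vector<Node> children;  // empty for leaves
};

// Writes one subtree: a leaf is its name, an internal node is
// "(child,child,...)" followed by its (possibly empty) name.
std::string to_newick_subtree(const Node& n) {
    if (n.children.empty()) return n.name;
    std::string out = "(";
    for (std::size_t i = 0; i < n.children.size(); ++i) {
        if (i > 0) out += ",";
        out += to_newick_subtree(n.children[i]);
    }
    out += ")";
    return out + n.name;
}

// A complete Newick tree ends with a semicolon.
std::string to_newick(const Node& root) {
    return to_newick_subtree(root) + ";";
}
```

With three taxa A, B and C, where A and B are grouped together first, this sketch produces the string ((A,B),C);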


Minimum requirements:

PC: x86 with Cygwin (sorry, support with the MS C++ compiler has not been tested!)
Mac: OS X
Unix/Linux: Solaris, FreeBSD, Ubuntu. Other distros are assumed to be supported

Browsers: any browser that supports CSS

other:


Project directory structure:

Once you download and extract the software package, you will see a number of folders in your specified directory. Their structure and contents are explained below:

+ \source
| ::Contains the source code for the RAqApproach software, including the run file and taxa.txt
- run file: contains the commands for compiling and running the program, so you don't have to enter them manually
- taxa.txt: file that contains the taxa information. The first line MUST specify the number of taxa

+ \docs
| ::Contains the reference manual (.pdf and .ps) and the HTML version of the manual. Also contains the GPL and FDL license notices
- copy.txt: contains the GPL license

+ \sample_taxa
| ::Contains some examples of valid input files for the RAqApproach. To create trees with one of these taxa sets, copy its contents over the taxa.txt file located in the \source directory
- example: to create a tree with taxa_51, simply enter: cp sample_taxa/taxa_51 source/taxa.txt
- make sure you are in the project directory before you do this

+ \example_files
| ::Contains some examples of the structure of the input file, taxa.txt, and of the output files from the RAqApproach program. It is recommended that you look at these files to understand exactly what is read and written, and why.
- graph_me.txt: used for preliminary benchmarking. Kept here only for the curious
- newick.txt: file to which the final tree is written. The tree in this file can be used/modified for further analysis
- output.txt: file to which the program's standard output is redirected, so that output text does not clutter the terminal. By default, only the bare minimum of output is shown. To view the created distance matrix and the overall tree creation steps, you need to set the 'DISPLAY' variable in common.h to "true". By default, this variable is set to "false", which is fine for normal use.
- taxa.txt: an example of the taxa.txt input file. Note how the first line of the file is the number of taxa. It is VERY important that this number is at the beginning of every file!
- warning: remove any terminating \n at the end of the file. A trailing newline will break the program!
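
The two input rules above (taxa count on the first line, no terminating newline) can be checked before running the program. The following is a minimal sketch of such a check; check_taxa_text and check_taxa_file are hypothetical helpers, not part of the RAqApproach sources, and they do not inspect the taxa lines themselves, since their layout is not specified here.

```cpp
#include <cctype>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: returns an empty string if the text passes both
// checks, otherwise a short description of the first problem found.
std::string check_taxa_text(const std::string& text) {
    if (text.empty()) return "file is empty";

    // Check 1: the first line must be the number of taxa (digits only).
    std::string first = text.substr(0, text.find('\n'));
    if (!first.empty() && first.back() == '\r') first.pop_back();  // tolerate CRLF
    if (first.empty()) return "first line is empty";
    for (unsigned char c : first) {
        if (!std::isdigit(c)) return "first line must be the number of taxa";
    }

    // Check 2: the file must not end with a newline.
    if (text.back() == '\n') return "file ends with a terminating newline";

    return "";
}

// Hypothetical helper: reads the whole file and applies the checks above.
std::string check_taxa_file(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return "cannot open " + path;
    const std::string text((std::istreambuf_iterator<char>(in)),
                           std::istreambuf_iterator<char>());
    return check_taxa_text(text);
}
```

Calling check_taxa_file("taxa.txt") from the \source directory would return an empty string for a file that satisfies both rules.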


Known bugs:

# main.cpp:
- Currently, only the score-based distance matrix works. This means that the Kimura and Jukes-Cantor distances do not work.
- It is recommended that you do not use uncorrected distances, as this may break the program as well.
- Erroneous characters (such as 'e') are treated as unknown characters during distance matrix creation.
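
To illustrate the last point, here is a minimal sketch of mapping erroneous characters to an unknown symbol before distances are computed. It is not the code in main.cpp. It assumes DNA data over A, C, G and T, uses '?' as the unknown symbol, and folds lowercase letters to uppercase; all three choices, and both function names, are assumptions made for the example.

```cpp
#include <cctype>
#include <string>

// Assumption: '?' stands for an unknown character.
const char UNKNOWN = '?';

// Hypothetical helper: keeps A, C, G and T (in either case) and maps
// every other character, such as 'e', to UNKNOWN.
char normalize_base(char c) {
    const char u = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    return (u == 'A' || u == 'C' || u == 'G' || u == 'T') ? u : UNKNOWN;
}

// Applies normalize_base to every character of a sequence.
std::string normalize_sequence(std::string seq) {
    for (char& c : seq) c = normalize_base(c);
    return seq;
}
```

Under these assumptions, the sequence ACeT would be handled as AC?T.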

# common.h
- It is recommended that no variables are changed, save for DISPLAY. Changes to other values have not yet been tested.
- The program only supports a lambda value of 2. It will break if the lambda value is altered.


Compile and execution instructions:

- Compilation and execution of the program can be done in one easy step. Simply go to the \source directory and type in the following command:

./run

This will compile and subsequently run the RAqApproach program. Output is redirected to the file output.txt.

- To output debug statements, open common.h in the \source directory and change the 'DISPLAY' value to "true". This causes the distance matrix and tree creation steps to be written to output.txt
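
As a rough illustration of how a flag like DISPLAY can gate debug output, consider the sketch below. The declaration of DISPLAY shown here is an assumption (the real one in common.h may look different, for example a #define), and debug_print is a hypothetical helper, not a function from the RAqApproach sources.

```cpp
#include <iostream>
#include <string>

// Assumption: a boolean constant; the real declaration in common.h may differ.
const bool DISPLAY = false;

// Hypothetical helper: writes msg to out only when display is true.
// Returns true if the message was written.
bool debug_print(std::ostream& out, const std::string& msg, bool display = DISPLAY) {
    if (!display) return false;
    out << msg << '\n';
    return true;
}
```

With the flag left at false, a call such as debug_print(std::cout, "building distance matrix") writes nothing, which matches the default behaviour of showing only the bare minimum of output.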


FAQ

  1. How do I get a copy of the RAqApproach program?
    - The RAq Approach code is conveniently located on the web. (link will be provided in the future). Download either the .zip or .tar file, depending on your OS. It is assumed you already know how to extract these files.
  2. How do I run the RAqApproach program?
    - Go to the \source directory and type the command ./run into your terminal window.
  3. The program is taking a long time to run. It is making me nervous! How can I make sure that it is still running and not segfaulting or doing something horrific?
    - As the data set gets larger, it takes more and more time to create the starting tree. To give a rough estimate, it took less than two minutes to create a tree with 921 taxa on a 3 GHz dual-core Pentium D Optiplex GX620 with 2 GB of RAM. However, this estimate will vary depending on your computer specs, so please be patient!
    - If it helps, set the value of the 'DISPLAY' variable in common.h to "true" to see what stage the algorithm is currently in
  4. The program crashed/didn't output the tree! What do I do???
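    - Check to make sure the following are set correctly:
      - the first line of taxa.txt is the number of taxa
      - taxa.txt has no terminating \n at the end of the file
      - the lambda value in common.h is still 2
      - the score-based distance matrix is in use (Kimura and Jukes-Cantor do not work, and uncorrected distances may break the program)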
  5. I tried your suggestions, but nothing worked! What should I do??
    - Well, I wasn't expecting THIS to happen. Please set the 'DISPLAY' variable in common.h to true, and determine exactly where the program is having trouble. Then, send me the output.txt file and your taxa.txt file. I'll see if I can reproduce your results, and we'll take it from there. Or, what I would really like you to do, is to look in the source code and figure it out yourself :-) It's all commented, so it shouldn't be too hard.


Notice of GPL:

Copyright (C) 2006
Suzanne Matthews, The RAq Approach Project

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA


Contact information

Suzanne Matthews
Rensselaer Polytechnic Institute
matths3 AT rpi DOT edu
