Consensus RNA secondary structure prediction with KNetFold

I. INSTALLATION

The following steps install the KNetFold system. In this section, "rootdir" denotes the absolute pathname of the knetfold directory.
 a) Unpack the tar file:
	gunzip knetfold_v1.4.4b.tar.gz
	tar xvf knetfold_v1.4.4b.tar

 b)
 The environment variable KNETFOLD_HOME must be set to the value of rootdir. Optionally, the environment variable KNETFOLD_TMP sets a location for temporary files; if it is not set, the directory /tmp is used. Depending on your shell, add the following lines to your .cshrc (or .bashrc etc.) script:
   For the C shell (csh), add to the file .cshrc in your home directory (replace YOUR_LOCATION with your path to knetfold_v1.4.4b and MY_TMP_DIR_LOCATION with the absolute path to a directory for temporary files):
   setenv KNETFOLD_HOME YOUR_LOCATION/knetfold_v1.4.4b
   # optionally:
   setenv KNETFOLD_TMP /tmp # or any other location convenient for you, /tmp is the default

   For the Bash shell, the corresponding commands are:
    export KNETFOLD_HOME="YOUR_LOCATION/knetfold_v1.4.4b"
    export KNETFOLD_TMP="MY_TMP_DIR_LOCATION"

  It is also convenient to add the absolute pathname of the directory $KNETFOLD_HOME/bin to your $PATH environment variable.
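
For Bourne-style shells, the PATH addition could be sketched as follows. The function name add_knetfold_to_path is a hypothetical helper and not part of KNetFold; it only prepends the bin directory when that entry is not already present, so sourcing the startup file repeatedly does not grow $PATH.

```shell
# Hypothetical helper (not part of KNetFold) for ~/.bashrc or a similar file.
# It prepends DIR/bin to PATH unless that entry is already present.
add_knetfold_to_path() {
  case ":$PATH:" in
    *":$1/bin:"*) ;;                 # already on PATH, nothing to do
    *) PATH="$1/bin:$PATH" ;;
  esac
  export PATH
}

# Only touch PATH when KNETFOLD_HOME has been set as described in step b).
if [ -n "${KNETFOLD_HOME:-}" ]; then
  add_knetfold_to_path "$KNETFOLD_HOME"
fi
```

C shell users would need the equivalent "set path" or "setenv PATH" line in .cshrc instead.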
 c)
   The RNA secondary structure prediction program RNAfold from the Vienna package must be installed in a directory that is part of your $PATH variable. RNAfold can be obtained from http://www.tbi.univie.ac.at/~ivo/RNA/ ; the literature reference is [2].
 d) (optional) The statistics program "R" is needed to generate PDF output files. It can be downloaded from http://www.r-project.org/ . The executable must be available under the name "R" in one of the directories listed in the $PATH variable. If it cannot be found, the generation of PDF files is skipped.
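
Before a first run, it can help to check that the external programs from steps c) and d) are visible on $PATH. The following is a minimal sketch; have_program and check_knetfold_prereqs are hypothetical helper names, not part of KNetFold, and the check only looks up the program names without testing their versions.

```shell
# Hypothetical helpers (not part of KNetFold): report whether the external
# programs named in steps c) and d) can be found on PATH.
have_program() {
  command -v "$1" >/dev/null 2>&1
}

check_knetfold_prereqs() {
  status=0
  if have_program RNAfold; then
    echo "RNAfold: found"
  else
    echo "RNAfold: NOT found (required)"
    status=1
  fi
  if have_program R; then
    echo "R: found"
  else
    echo "R: not found (optional, PDF output will be skipped)"
  fi
  # Non-zero only when the required program RNAfold is missing.
  return "$status"
}
```

Calling check_knetfold_prereqs prints one line per program and returns a non-zero status only if RNAfold is missing.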

II. USAGE

    a) Quickstart:
      Change into the subdirectory "example" and try either of the two test scripts:
      cd $KNETFOLD_HOME/example
      ./run_example.sh
      or 
      ./run_smallexample.sh
	
    b) Command line parameters:

    The secondary structure prediction is started with the command:
    knetfold.pl -i fastafile

    Parameters of knetfold.pl are:
	
    -i filename : mandatory specification of sequence alignment filename in fasta format.
    -d 0|1      : debug mode (1 for on, 0 for off (default)). Optional.
    -m length   : minimum length of predicted stems. The default is 2 since version 1.4.4; it was 1 in earlier versions.
    -n value    : number of iterations (optional). The structure prediction is restarted this many times, each time using an alignment that was collapsed with respect to another randomly chosen sequence. The default value is 10. Values smaller than 10 are not recommended for production runs.
    -o name     : path and prefix for output files. Optional.
    -q 0|1      : optional parallelization using the PBS queuing system.
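
As an illustration of how the documented options combine, the sketch below wraps a single call to knetfold.pl. The wrapper name run_knetfold and its two arguments are hypothetical; only the options -i, -m, -n and -o listed above are used, with -m and -n written out at their documented default values.

```shell
# Hypothetical wrapper (not part of KNetFold): run one prediction with
# explicit values for the documented -m, -n and -o options.
run_knetfold() {
  alignment=$1   # FASTA-format alignment, passed to -i
  prefix=$2      # path and prefix for output files, passed to -o
  if [ ! -r "$alignment" ]; then
    echo "cannot read alignment file: $alignment" >&2
    return 1
  fi
  # -m 2 and -n 10 are the documented defaults, spelled out for clarity.
  knetfold.pl -i "$alignment" -m 2 -n 10 -o "$prefix"
}
```

A call such as "run_knetfold myfile.fasta results/myfile" assumes that knetfold.pl is on $PATH and that the directory part of the prefix already exists.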

III. OUTPUT
    The KNetFold program generates RNA secondary structure predictions in different output formats. The predictions used in our publication have the file extension ".ct" or ".sec". For example, if the program is called with the command:
    knetfold.pl -i myfile.fasta
    it generates the files myfile_knet.sec (prediction in bracket notation), myfile_knet.ct (ct file format) and myfile_knet.pdf (contact matrix in PDF format).
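
The naming pattern in this example can be sketched in shell. This assumes, by generalizing from the single example above, that the extension of the input file is replaced by "_knet" plus the format suffix when -o is not given; the function name expected_knetfold_outputs is hypothetical, and the directory in which the files are written is not covered here.

```shell
# Hypothetical helper (not part of KNetFold): list the output file names
# expected for an input alignment, following the myfile.fasta example.
expected_knetfold_outputs() {
  base=$(basename "$1")   # drop any leading directory
  stem=${base%.*}         # drop the last extension, e.g. ".fasta"
  printf '%s\n' "${stem}_knet.sec" "${stem}_knet.ct" "${stem}_knet.pdf"
}

expected_knetfold_outputs myfile.fasta
# prints:
#   myfile_knet.sec
#   myfile_knet.ct
#   myfile_knet.pdf
```

Such a list can be used in a script to test, for example with "[ -f ... ]", whether a run produced the expected files. The PDF file is only present when R was found.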

Version History:

v 1.1 : initial version corresponding to 2006 publication.
v 1.2 : version including optional pseudoknot filter
v 1.3 : redesigned but equivalent source code
v 1.3.18 : implemented possible use of relative paths as KNETFOLD_HOME variable
v 1.4.0  : first version that is officially tagged in CVS (command: cvs tag version_1_4_0 knetfold). Minor additions that are not directly related to knetfold (experimental scanning for certain RNA structures etc.). Best version for giving out the software so far. (Aug 28, 2006)
v 1.4.1  : cvs tag version_1_4_1 knetfold. Minor changes, for example to this README file.
v 1.4.3  : cvs tag version_1_4_3 knetfold. Minor bugfix in vectornumerics.cc that appears for very small uniform score matrices. Added output formats to editmatrix
v 1.4.4  : several minor changes such as new output formats in alignedit2. Default stem length changed from 1 to 2.
v 1.4.4b : added missing parameter file to tar file, updated README file.

References:

[1] E. Bindewald, B.A. Shapiro (2006) RNA secondary structure prediction from sequence alignments using a network of k-nearest neighbor classifiers. RNA 12(3): 342-352

[2] I.L. Hofacker, W. Fontana, P.F. Stadler, S. Bonhoeffer, M. Tacker, P. Schuster (1994) Fast Folding and Comparison of RNA Secondary Structures. Monatshefte f. Chemie 125: 167-188



