Clustal Omega  1.1.0
hhalign_wrapper.c File Reference
#include <stdlib.h>
#include <string.h>
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include "seq.h"
#include "tree.h"
#include "progress.h"
#include "hhalign/general.h"
#include "hhalign/hhfunc.h"
#include "hhalign/hhalign.h"

Macros

#define APPLY_BG_HMM_UP_TO_TREE_DEPTH   10
#define TIMING   0
#define TRACE   0
#define NOTX   'N'

Functions

void SetDefaultHhalignPara (hhalign_para *prHhalignPara)
 FIXME.
void SanitiseUnknown (mseq_t *mseq)
 get rid of unknown residues
void TranslateUnknown2Ambiguity (mseq_t *mseq)
 translate unknown residues back to ambiguity codes. hhalign translates ambiguity codes (B,Z) into the unknown residue (X). The original (un-aligned) residue information is still available, so by iterating along the original and aligned sequences we can find where codes have been changed and restore them to their original value
void ReAttachLeadingGaps (mseq_t *prMSeq, int iProfProfSeparator)
 re-attach leading and trailing gaps to alignment
void PrepareAlignment (mseq_t *mseq, char **ppcProfile1, char **ppcProfile2, double *pdWeightsL, double *pdWeightsR, double *pdSeqWeights, int iLeafCountL, int *piLeafListL, int iLeafCountR, int *piLeafListR)
 reallocate enough memory for alignment and attach sequence pointers to profiles
double HHalignWrapper (mseq_t *prMSeq, int *piOrderLR, double *pdSeqWeights, int iNodeCount, hmm_light *prHMMList, int iHMMCount, int iProfProfSeparator, hhalign_para rHhalignPara)
 wrapper for hhalign. This is a frontend function to the ported hhalign code.

Macro Definition Documentation

#define APPLY_BG_HMM_UP_TO_TREE_DEPTH   10
#define NOTX   'N'
#define TIMING   0
#define TRACE   0

Function Documentation

double HHalignWrapper ( mseq_t *  prMSeq,
int *  piOrderLR,
double *  pdSeqWeights,
int  iNodeCount,
hmm_light *  prHMMList,
int  iHMMCount,
int  iProfProfSeparator,
hhalign_para  rHhalignPara 
)

wrapper for hhalign. This is a frontend function to the ported hhalign code.

Parameters
[in,out]  prMSeq  holds the unaligned sequences [in] and the final alignment [out]
[in]  piOrderLR  holds the order in which sequences/profiles are to be aligned; even elements specify left nodes, odd elements right nodes; if an even element and the odd element after it are the same, the node is a leaf
[in]  pdSeqWeights  weight per sequence. No weights are used if NULL
[in]  iNodeCount  number of nodes in the tree; piOrderLR has 2*iNodeCount elements
[in]  prHMMList  list of background HMMs (transition/emission probabilities)
[in]  iHMMCount  number of input background HMMs
[in]  iProfProfSeparator  gives the number of sequences in the first profile, if in profile/profile alignment mode (iNodeCount==3). This assumes that prMSeq holds the sequences of profile 1 and profile 2.
[in]  rHhalignPara  various parameters read from the command line
Returns
score of the alignment FIXME what is this?
Note
complex function. could use some simplification, more documentation and a struct'uring of piOrderLR
HHalignWrapper can be entered in 2 different ways: (i) all sequences are un-aligned, (ii) there are 2 (aligned) profiles. In the un-aligned case (i) the sequences come straight from Squid, that is, they have been sanitised: all non-alphabetic residues have been rendered as X's. In profile mode (ii) one profile may have been produced internally. In that case residues may have been translated back into their 'native' form, that is, they may contain un-sanitised residues. These will cause trouble during alignment.
introduced argument hhalign_para rHhalignPara; FS, r240 -> r241
if hhalign() fails then try with Viterbi by setting MAC-RAM=0; FS, r241 -> r243

translate back ambiguity residues: hhalign translates ambiguity codes (B,Z) into unknown residues (X). As we still have the original input we can substitute them back.
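
The piOrderLR layout described in the parameter list can be illustrated with a minimal sketch. It assumes exactly the layout documented here (2*iNodeCount elements, left node at even index, right node at the following odd index, equal values marking a leaf). CountLeaves is a hypothetical helper written for illustration only; it is not part of hhalign_wrapper.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not in hhalign_wrapper.c): count the leaf entries
 * of a piOrderLR-style array. Assumed layout, as documented above:
 * element 2*i is the left node and element 2*i+1 the right node of
 * step i; if both are equal the step refers to a leaf. */
int CountLeaves(const int *piOrderLR, int iNodeCount)
{
    int iLeaves = 0;
    int i;

    assert(piOrderLR != NULL);
    for (i = 0; i < iNodeCount; i++) {
        int iLeft = piOrderLR[2 * i];
        int iRight = piOrderLR[2 * i + 1];

        if (iLeft == iRight) {
            iLeaves++; /* leaf: nothing to align at this step */
        }
    }
    return iLeaves;
}
```

For the three-node case mentioned under iProfProfSeparator, an array such as {0,0, 1,1, 0,1} would describe two leaves followed by one step that aligns them.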

void PrepareAlignment ( mseq_t *  mseq,
char **  ppcProfile1,
char **  ppcProfile2,
double *  pdWeightsL,
double *  pdWeightsR,
double *  pdSeqWeights,
int  iLeafCountL,
int *  piLeafListL,
int  iLeafCountR,
int *  piLeafListR 
)

reallocate enough memory for alignment and attach sequence pointers to profiles

Parameters
[in,out]  mseq  sequence/profile data; memory for the sequences in the profiles is increased
[out]  ppcProfile1  pointers to sequences in 1st profile
[out]  ppcProfile2  pointers to sequences in 2nd profile
[out]  pdWeightsL  weights (normalised to 1.0) for sequences in left profile
[out]  pdWeightsR  weights (normalised to 1.0) for sequences in right profile
[in]  pdSeqWeights  weights for all sequences in alignment
[in]  iLeafCountL  number of sequences in 1st profile
[in]  piLeafListL  array of integer IDs of sequences in 1st profile
[in]  iLeafCountR  number of sequences in 2nd profile
[in]  piLeafListR  array of integer IDs of sequences in 2nd profile
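
The "normalised to 1.0" weights could be derived along the following lines. This is only a sketch under stated assumptions: NormaliseProfileWeights is a hypothetical helper, and treating a NULL pdSeqWeights as uniform weights is an assumption made for the example, not documented behaviour of PrepareAlignment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not in hhalign_wrapper.c): pick the weights of the
 * sequences listed in piLeafList out of pdSeqWeights and rescale them so
 * that they sum to 1.0. */
void NormaliseProfileWeights(double *pdOut, const double *pdSeqWeights,
                             const int *piLeafList, int iLeafCount)
{
    double dSum = 0.0;
    int i;

    assert(pdOut != NULL && piLeafList != NULL);
    for (i = 0; i < iLeafCount; i++) {
        /* ASSUMPTION: NULL weights are treated as uniform weights here */
        pdOut[i] = (pdSeqWeights != NULL) ? pdSeqWeights[piLeafList[i]] : 1.0;
        dSum += pdOut[i];
    }
    if (dSum > 0.0) {
        for (i = 0; i < iLeafCount; i++) {
            pdOut[i] /= dSum;
        }
    }
}
```

The same routine would be called once with piLeafListL/iLeafCountL and once with piLeafListR/iLeafCountR, so that each profile's weights sum to 1.0 independently.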
void ReAttachLeadingGaps ( mseq_t *  prMSeq,
int  iProfProfSeparator 
)

re-attach leading and trailing gaps to alignment

Parameters
[in,out]  prMSeq  alignment structure (at this stage there should be no un-aligned sequences)
[in]  iProfProfSeparator  gives sizes of input profiles, -1 if no input-profiles but un-aligned sequences
Note
Leading and trailing profile columns that contain only gaps have no effect on the alignment and are removed during the alignment. If they are encountered a warning message is printed to screen. Some users like to preserve these gap columns. FS, r213->214
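
To make "columns that contain only gaps" concrete, the sketch below counts such columns at the start of a profile. It is an illustration only: CountLeadingGapColumns is a hypothetical helper, '-' is assumed to be the only gap character, and the actual re-attachment logic of ReAttachLeadingGaps is not reproduced.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not in hhalign_wrapper.c): count the leading
 * columns of a profile that consist of gaps only.
 * ASSUMPTION: '-' is the only gap character. */
int CountLeadingGapColumns(const char *const *ppcSeq, int iSeqCount)
{
    int iCol = 0;

    assert(ppcSeq != NULL && iSeqCount > 0);
    for (;;) {
        int iS;

        for (iS = 0; iS < iSeqCount; iS++) {
            /* stop at the first residue, or at the end of any sequence */
            if (ppcSeq[iS][iCol] != '-') {
                return iCol;
            }
        }
        iCol++;
    }
}
```

A count obtained this way before the alignment is the number of gap columns that would have to be put back in front of the corresponding profile afterwards.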
void SanitiseUnknown ( mseq_t *  mseq)

get rid of unknown residues

Note
HHalignWrapper can be entered in 2 different ways: (i) all sequences are un-aligned, (ii) there are 2 (aligned) profiles. In the un-aligned case (i) the sequences come straight from Squid, that is, they have been sanitised: all non-alphabetic residues have been rendered as X's. In profile mode (ii) one profile may have been produced internally. In that case residues may have been translated back into their 'native' form, that is, they may contain un-sanitised residues. These will cause trouble during alignment. FS, r213->214
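
A sanitising pass of the kind described in this note might look like the sketch below. SanitiseSequence, its alphabet argument and the replacement character are all hypothetical and chosen for the example; the real function works on an mseq_t and its exact alphabet handling is not documented here.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical helper (not in hhalign_wrapper.c): replace every residue
 * that is not in pcAlphabet by cUnknown. Gaps ('-') are left untouched.
 * ASSUMPTION: pcAlphabet holds upper-case residue codes only. */
void SanitiseSequence(char *pcSeq, const char *pcAlphabet, char cUnknown)
{
    char *pcRes;

    assert(pcSeq != NULL && pcAlphabet != NULL);
    for (pcRes = pcSeq; *pcRes != '\0'; pcRes++) {
        if (*pcRes == '-') {
            continue; /* gap, nothing to sanitise */
        }
        if (strchr(pcAlphabet, toupper((unsigned char)*pcRes)) == NULL) {
            *pcRes = cUnknown;
        }
    }
}
```

With a 20-letter amino-acid alphabet plus X and 'X' as the replacement, the ambiguity codes B and Z and any non-alphabetic character would all be rendered as X, which is the sanitised form that hhalign expects according to the note above.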
void SetDefaultHhalignPara ( hhalign_para *  prHhalignPara)

FIXME.

Note
prHhalignPara has to point to an already allocated instance
void TranslateUnknown2Ambiguity ( mseq_t *  mseq)

translate unknown residues back to ambiguity codes. hhalign translates ambiguity codes (B,Z) into the unknown residue (X). The original (un-aligned) residue information is still available, so by iterating along the original and aligned sequences we can find where codes have been changed and restore them to their original value

Parameters
[in,out]  mseq  sequence/profile data; mseq->seq [in,out] is changed to conform with mseq->orig_seq [in]
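
The parallel walk along the original and aligned sequences can be sketched for a single sequence as follows. RestoreAmbiguity is a hypothetical helper; it assumes that the original sequence contains no gaps, that '-' is the only gap character in the aligned sequence, and that both hold the same residues in the same order apart from the X substitutions.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper (not in hhalign_wrapper.c): restore residues that
 * were turned into X during alignment.
 * ASSUMPTIONS: pcOrig is ungapped; '-' is the only gap character. */
void RestoreAmbiguity(char *pcAligned, const char *pcOrig)
{
    const char *pcO = pcOrig;
    char *pcA;

    assert(pcAligned != NULL && pcOrig != NULL);
    for (pcA = pcAligned; *pcA != '\0' && *pcO != '\0'; pcA++) {
        if (*pcA == '-') {
            continue; /* gaps have no counterpart in the original */
        }
        if (toupper((unsigned char)*pcA) == 'X'
            && toupper((unsigned char)*pcO) != 'X') {
            *pcA = *pcO; /* put the original code (e.g. B or Z) back */
        }
        pcO++; /* advance the original only on residues */
    }
}
```

Given the aligned sequence "A-XC-X" and the original "ABCZ", the walk would yield "A-BC-Z": gaps are skipped, and each X is replaced by the residue at the matching position of the original.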