org.biojava.bio.structure.io
public class StructureSequenceMatcher extends java.lang.Object
Constructor and Description |
---|
StructureSequenceMatcher() |
Modifier and Type | Method and Description |
---|---|
static ProteinSequence | getProteinSequenceForStructure(Structure struct, java.util.Map<java.lang.Integer,Group> groupIndexPosition): Generates a ProteinSequence corresponding to the sequence of struct, and maintains a mapping from the sequence back to the original groups. |
static ResidueNumber[] | matchSequenceToStructure(ProteinSequence seq, Structure struct): Given a sequence and the corresponding Structure, get the ResidueNumber for each residue in the sequence. |
static ProteinSequence | removeGaps(ProteinSequence gapped): Removes all gaps ('-') from a protein sequence. |
static <T> T[][] | removeGaps(T[][] gapped): Creates a new list consisting of all columns of gapped where no row contained a null value. |
public static ProteinSequence getProteinSequenceForStructure(Structure struct, java.util.Map<java.lang.Integer,Group> groupIndexPosition)
struct
- Input structure
groupIndexPosition
- An empty map, which will be populated with
(residue index in returned ProteinSequence) -> (Group within struct)
See also: SeqRes2AtomAligner#getFullAtomSequence(List, Map), which
does the heavy lifting.
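The index-to-group bookkeeping described above can be illustrated without the library. The following is a minimal sketch, not BioJava's implementation: `Residue` is a hypothetical stand-in for `Group`, `buildSequence` is an illustrative name, and both the zero-based indexing and the simple concatenation of chains are assumptions made for the example.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SequenceIndexSketch {

    /** Hypothetical stand-in for a BioJava Group; not part of the library. */
    static final class Residue {
        final String chainId;
        final int number;
        final char code;

        Residue(String chainId, int number, char code) {
            this.chainId = chainId;
            this.number = number;
            this.code = code;
        }

        @Override
        public String toString() {
            return chainId + ":" + number + "(" + code + ")";
        }
    }

    /**
     * Concatenates the one-letter codes of all chains and fills the supplied
     * (initially empty) map with sequence index -> originating residue.
     * Zero-based indices are an assumption of this sketch.
     */
    static String buildSequence(List<List<Residue>> chains,
                                Map<Integer, Residue> indexToResidue) {
        StringBuilder seq = new StringBuilder();
        for (List<Residue> chain : chains) {
            for (Residue r : chain) {
                // seq.length() is the index of the character about to be appended
                indexToResidue.put(seq.length(), r);
                seq.append(r.code);
            }
        }
        return seq.toString();
    }

    public static void main(String[] args) {
        List<List<Residue>> chains = Arrays.asList(
                Arrays.asList(new Residue("A", 1, 'M'), new Residue("A", 2, 'K')),
                Arrays.asList(new Residue("B", 1, 'G')));
        Map<Integer, Residue> index = new LinkedHashMap<>();
        String seq = buildSequence(chains, index);
        System.out.println(seq);
        System.out.println(index);
    }
}
```

Passing in an empty map and letting the method populate it mirrors the calling convention of `getProteinSequenceForStructure`: the caller keeps the map and can later translate any position in the returned sequence back to the group it came from.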
public static ResidueNumber[] matchSequenceToStructure(ProteinSequence seq, Structure struct)
Smith-Waterman alignment is used to match the sequences. Residues that are in the sequence but not in the structure, or that are mismatched between sequence and structure, will have a null entry in the returned array, while residues that are in the structure but not in the sequence are ignored with a warning.
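To make the null-handling rules concrete, here is a minimal sketch that starts from an alignment that has already been computed. It does not perform Smith-Waterman alignment and is not the library's code: `mapResidues` is a hypothetical helper, residue numbers are represented as plain strings rather than `ResidueNumber` objects, and the two aligned rows are assumed to cover both sequences end to end, with '-' marking gaps.

```java
import java.util.Arrays;

final class AlignedResidueMappingSketch {

    /**
     * @param alignedSeq    query sequence row of the alignment ('-' marks a gap)
     * @param alignedStruct structure sequence row of the alignment, same length
     * @param structNumbers residue label of each ungapped structure residue, in order
     * @return one entry per ungapped query residue; null where the residue is
     *         absent from the structure or mismatched
     */
    static String[] mapResidues(String alignedSeq, String alignedStruct,
                                String[] structNumbers) {
        if (alignedSeq.length() != alignedStruct.length()) {
            throw new IllegalArgumentException("aligned rows must have equal length");
        }
        if (alignedStruct.replace("-", "").length() != structNumbers.length) {
            throw new IllegalArgumentException("one label per structure residue is required");
        }
        int seqLength = alignedSeq.replace("-", "").length();
        String[] result = new String[seqLength]; // entries default to null

        int seqPos = 0;    // index into the ungapped query sequence
        int structPos = 0; // index into the ungapped structure sequence
        for (int col = 0; col < alignedSeq.length(); col++) {
            char s = alignedSeq.charAt(col);
            char t = alignedStruct.charAt(col);
            if (s != '-' && t != '-' && s == t) {
                result[seqPos] = structNumbers[structPos]; // matched column
            }
            // Mismatches and query-only columns leave result[seqPos] as null;
            // structure-only columns are skipped without producing an entry.
            if (s != '-') seqPos++;
            if (t != '-') structPos++;
        }
        return result;
    }

    public static void main(String[] args) {
        String[] numbers = {"10", "11", "12", "13", "14"};
        String[] mapped = mapResidues("MKT-AY", "-KTGAF", numbers);
        System.out.println(Arrays.toString(mapped));
    }
}
```

In this example the leading M has no counterpart in the structure and the final Y is mismatched against F, so both positions stay null, while the structure-only G (label 12) is dropped from the result.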
seq
- The protein sequence. Should match the sequence of struct very
closely.
struct
- The corresponding protein structure
public static ProteinSequence removeGaps(ProteinSequence gapped)
gapped
- A protein sequence that may contain gap characters ('-')
public static <T> T[][] removeGaps(T[][] gapped)
gapped
- A rectangular matrix containing null to mark gaps
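The behaviour of the two removeGaps overloads can be sketched with standard-library types only. This is an illustrative re-implementation under stated assumptions, not the BioJava source: a plain `String` stands in for `ProteinSequence`, the class name `RemoveGapsSketch` is hypothetical, and the matrix is assumed to be rectangular, as the parameter description requires.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class RemoveGapsSketch {

    /** Drops every '-' character from a gapped sequence string. */
    static String removeGaps(String gapped) {
        return gapped.replace("-", "");
    }

    /**
     * Returns a new matrix holding only the columns of gapped in which no
     * row has a null value. Assumes all rows have the same length.
     */
    static <T> T[][] removeGaps(T[][] gapped) {
        // copyOf preserves the runtime array type; rows are replaced below
        T[][] result = Arrays.copyOf(gapped, gapped.length);
        if (gapped.length == 0) {
            return result;
        }
        int width = gapped[0].length;

        // Collect the indices of columns that contain no null in any row
        List<Integer> keep = new ArrayList<>();
        for (int col = 0; col < width; col++) {
            boolean complete = true;
            for (T[] row : gapped) {
                if (row[col] == null) {
                    complete = false;
                    break;
                }
            }
            if (complete) {
                keep.add(col);
            }
        }

        // Build each output row from the kept columns only
        for (int r = 0; r < gapped.length; r++) {
            T[] row = Arrays.copyOf(gapped[r], keep.size());
            for (int i = 0; i < keep.size(); i++) {
                row[i] = gapped[r][keep.get(i)];
            }
            result[r] = row;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(removeGaps("MK--TA-Y"));
        Integer[][] matrix = {{1, null, 3, 7}, {4, 5, null, 8}};
        System.out.println(Arrays.deepToString(removeGaps(matrix)));
    }
}
```

Using `Arrays.copyOf` sidesteps Java's restriction on creating generic arrays directly, since the copies inherit the runtime component type of the input. The input matrix itself is left unmodified.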