AmphipaSeeK is a method for predicting amphipathic in-plane membrane anchors (IPM anchors) in protein sequences. IPM anchoring is a particular type of membrane interaction found both in monotopic proteins (e.g. NonStructural protein 5A of the Hepatitis C virus [1]) and in polytopic proteins (e.g. gp41 from the Human Immunodeficiency Virus [2]). AmphipaSeeK was initially developed with a set of 21 monotopic membrane proteins, each including an experimentally characterized IPM anchor. The set is currently being extended to IPM segments found in transmembrane proteins. To date, 7 transmembrane proteins are included in the data set.
[ ! ] AmphipaSeeK is not a method for predicting transmembrane segments or monotopic proteins. It only indicates whether a protein sequence potentially contains an IPM anchor [ ! ]
AmphipaSeeK uses an SVM classifier based on a Gaussian kernel. The classification process requires a Gram matrix derived from a substitution matrix (currently: PHAT [3]). This part of the AmphipaSeeK code comes from the M-SVM source code distributed by Yann Guermeur under the GNU Public License. M-SVM is a Multiclass Support Vector Machine; for details see [4]. Additionally, AmphipaSeeK automatically generates a multiple alignment to compute a weighted average prediction. Sequences homologous to your query sequence are retrieved from the UniProt database. If no homologous sequence is retrieved, the prediction is computed on your query only. The computation of the average prediction uses the Squid-1.9g library distributed by Sean Eddy under the GNU Public License.
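To make the link between a Gram matrix and a Gaussian kernel concrete, here is a minimal Python sketch of one common construction: the squared distance between two items is recovered from their inner products, then passed through an exponential. How AmphipaSeeK actually derives its Gram matrix from PHAT is not described on this page, so the function name, the `sigma` parameter and the toy matrix below are assumptions made for illustration only.

```python
import math

def gaussian_kernel_from_gram(gram, sigma):
    """Turn a Gram matrix of inner products into a Gaussian kernel matrix.

    Hypothetical helper: illustrates the general technique, not the M-SVM code.
    """
    n = len(gram)
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Squared distance in feature space, computed from inner products only:
            # ||x_i - x_j||^2 = <x_i,x_i> + <x_j,x_j> - 2<x_i,x_j>
            sq_dist = gram[i][i] + gram[j][j] - 2.0 * gram[i][j]
            kernel[i][j] = math.exp(-sq_dist / (2.0 * sigma ** 2))
    return kernel

# Toy 3x3 Gram matrix with made-up values (NOT taken from PHAT).
toy_gram = [
    [4.0, 1.0, 0.0],
    [1.0, 3.0, 2.0],
    [0.0, 2.0, 5.0],
]
toy_kernel = gaussian_kernel_from_gram(toy_gram, sigma=2.0)
```

Each diagonal entry of the result equals 1, and entries shrink toward 0 as the two items become less similar; `sigma` controls how quickly that happens.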
If you use AmphipaSeeK in a publication, please cite :
References :
1 | ADP Ribosylation Factor 1 | [ UniProt ] | [ abstract ] |
2 | ADP Ribosylation Factor GTPase Activating Protein 1 | [ UniProt ] | [ abstract ] |
3 | Aerobic Glycerol-3-Phosphate Dehydrogenase | [ UniProt ] | [ abstract ] |
4 | Cholinephosphate CytidylylTransferase α | [ UniProt ] | [ abstract ] |
5 | Coat Protein γ | [ UniProt ] | [ abstract ] |
6 | Core protein | [ UniProt ] | [ abstract ] |
7 | Dense Granule Protein 2 | [ UniProt ] | [ abstract ] |
8 | DnaA, chromosomal replication initiator protein | [ UniProt ] | [ abstract ] |
9 | Epsin 1 | [ UniProt ] | [ abstract ] |
10 | G protein-coupled Receptor Kinase 5 | [ UniProt ] | [ abstract ] |
11 | Glucose-specific IIa component | [ UniProt ] | [ abstract ] |
12 | Lactophorin (Proteose-Peptone Component 3) | [ UniProt ] | [ abstract ] |
13 | Myelin Basic Protein | [ UniProt ] | [ abstract ] |
14 | NonSpecific Lipid-Transfer Protein (Sterol Carrier Protein 2) | [ UniProt ] | [ abstract ] |
15 | NonStructural protein 5A - Hepatitis C Virus | [ UniProt ] | [ abstract ] |
16 | NonStructural protein 5A - GB virus A | [ UniProt ] | submitted |
17 | NonStructural protein 5A - GB virus B | [ UniProt ] | submitted |
18 | NonStructural protein 5A - GB virus C | [ UniProt ] | submitted |
19 | NonStructural protein 5A - Bovine Viral Diarrhea Virus | [ UniProt ] | [ abstract ] |
20 | NonStructural Protein 1 | [ UniProt ] | [ abstract ] |
21 | NonStructural protein 2C - Poliovirus | [ UniProt ] | [ abstract ] |
22 | NonStructural protein 2C - Foot and Mouth Disease Virus | [ UniProt ] | [ abstract ] |
23 | Penicillin Binding Protein 5 (D-alanyl-D-alanine carboxypeptidase 5) | [ UniProt ] | [ abstract ] |
24 | Penicillin Binding Protein 6 (D-alanyl-D-alanine carboxypeptidase 6) | [ UniProt ] | [ abstract ] |
25 | PhosphoDiEsterase 4A cAMP specific (RD1) | [ UniProt ] | [ abstract ] |
26 | Prostaglandin H2 synthase - isoform 1 | [ UniProt ] | [ abstract ] |
27 | Prostaglandin H2 synthase - isoform 2 | [ UniProt ] | [ abstract ] |
28 | SAR1 | [ UniProt ] | [ abstract ] |
29 | Septum site-determining protein minD | [ UniProt ] | [ abstract ] |
30 | Squalene-Hopene Cyclase | [ UniProt ] | [ abstract ] |
1 | crcA (PagP) | [ UniProt ] | [ abstract ] |
2 | GP41 | [ UniProt ] | [ abstract ] |
3 | KcsA | [ UniProt ] | [ abstract ] |
4 | KirBac | [ UniProt ] | [ abstract ] |
5 | M2 | [ UniProt ] | [ abstract ] |
6 | Major Coat Protein | [ UniProt ] | [ abstract ] |
7 | Amine Oxidase B | [ UniProt ] | [ abstract ] |
8 | VPU | [ UniProt ] | [ abstract ] |
AmphipaSeeK is available :
[ ! ] Analysis options are not used by AmphipaSeeK to compute a prediction. They are provided only as an aid [ ! ]
Parameters
SVM efficiency : choice of the SVM training set, specificity and sensitivity. Available options correspond to different SVM training parameterizations, each giving a different performance. For specific predictions, try option [1] or [3]. If you prefer more sensitive predictions, option [2] or [4] is more suitable. The efficiency of each option is summarized in Table 1 :
Option | Training Set | Accuracy (%) | Sensitivity (%) | Specificity (%) | Positive Predictive Value (%) | Pearson-Matthews Coefficient |
[1] | 30 monotopic | 94.2 | 31.0 (low) | 99.5 (high) | 82.9 (high) | 0.49 |
[2] | 30 monotopic | 93.9 | 33.3 (high) | 99.0 (low) | 73.8 (low) | 0.47 |
[3]a | 30 monotopic + 8 transmembrane proteins | 92.9 | 23.1 (low) | 99.3 (high) | 74.6 (high) | 0.39 |
[4]a | 30 monotopic + 8 transmembrane proteins | 92.5 | 26.0 (high) | 98.6 (low) | 62.3 (low) | 0.37 |
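For readers who want to compare their own benchmarks with Table 1, the sketch below shows how these five measures are conventionally derived from confusion-matrix counts. It assumes that the Pearson-Matthews coefficient is the usual Matthews correlation coefficient; the function name and the counts in the example are made up and do not come from the AmphipaSeeK evaluation.

```python
import math

def classification_metrics(tp, fp, tn, fn):
    """Standard binary-classification measures from confusion-matrix counts.

    tp/fp/tn/fn = true positives, false positives, true negatives, false negatives.
    Assumes at least one actual positive, one actual negative and one predicted positive.
    """
    total = tp + fp + tn + fn
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # fraction of true IPM residues found
        "specificity": tn / (tn + fp),   # fraction of non-IPM residues rejected
        "ppv": tp / (tp + fp),           # fraction of predicted IPM residues that are correct
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }

# Hypothetical counts, for illustration only.
example = classification_metrics(tp=30, fp=10, tn=950, fn=10)
```

Multiplying the first four values by 100 gives percentages comparable to the table columns. The pattern seen in Table 1 (high specificity with low sensitivity) is what one expects when positives are rare and the classifier is conservative.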
The output is divided into 1 HTML page and 4 text files.
        10        20        30        40        50.......... sequence position
         |         |         |         |         |
SGSWLRDIWDWICEVLSDFKTWLKAKLMPQLPGIPFVSCQRGYRGVWRGD.......... protein sequence
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA                .......... AmphipaSeeK prediction : A = IPM anchor
cchhhhhhhhhhhhhhhhhhhhhhhhccccccccceeeccccceeeeecc.......... predicted secondary structure
33434455555444444344322222222332111222334455554442.......... amphipathy : from 0=low to 5=high amphipathy
based on the sequence-average μH
- 0 : μH < average - 2sd
- 1 : average - 2sd ≤ μH < average - 1sd
- 2 : average - 1sd ≤ μH < average
- 3 : average ≤ μH < average + 1sd
- 4 : average + 1sd ≤ μH < average + 2sd
- 5 : average + 2sd ≤ μH
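The 0 to 5 amphipathy scale bins each residue's μH against the sequence average, with cut-offs at one and two standard deviations on either side. A minimal Python sketch of such a binning follows. It assumes that each class includes its lower bound and that "sd" is the population standard deviation; neither detail is stated on this page, and the function name is hypothetical.

```python
import bisect
import statistics

def amphipathy_classes(mu_h_values):
    """Map per-residue muH values to classes 0 (low) .. 5 (high).

    Assumptions: population standard deviation, lower bound of each class inclusive.
    """
    mean = statistics.mean(mu_h_values)
    sd = statistics.pstdev(mu_h_values)
    # Five cut-offs delimit six classes.
    thresholds = [mean - 2 * sd, mean - sd, mean, mean + sd, mean + 2 * sd]
    # bisect_right counts how many cut-offs are <= the value, which is the class index.
    return [bisect.bisect_right(thresholds, value) for value in mu_h_values]

# Made-up muH values, for illustration only.
classes = amphipathy_classes([0.0, 1.0, 2.0, 3.0, 4.0])
```

Because the cut-offs depend on the query's own mean and spread, the same μH value can fall into different classes in different sequences.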
result file............................... AmphipaSeeK results in a nice text format
result score.............................. score computed by AmphipaSeeK in an aligned text format
used alignment............................ multiple alignment used to compute an average score (in CLUSTALW format)
predicted secondary structure consensus... predicted secondary structure in NPS@ format
>JohnDoe.................................................... sequence header
========
        10        20        30        40        50.......... sequence position
         |         |         |         |         |
SGSWLRDIWDWICEVLSDFKTWLKAKLMPQLPGIPFVSCQRGYRGVWRGD.......... protein sequence
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA________________.......... AmphipaSeeK prediction : "A" = IPM anchor | "_" = non IPM anchor
cchhhhhhhhhhhhhhhhhhhhhhhhccccccccceeeccccceeeeecc.......... predicted secondary structure
33434455555444444344322222222332111222334455554442.......... amphipathy : from 0=low to 5=high amphipathy
based on the sequence-average μH
- 0 : μH < average - 2sd
- 1 : average - 2sd ≤ μH < average - 1sd
- 2 : average - 1sd ≤ μH < average
- 3 : average ≤ μH < average + 1sd
- 4 : average + 1sd ≤ μH < average + 2sd
- 5 : average + 2sd ≤ μH
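When post-processing the result file, it is often handier to work with residue ranges than with the raw prediction string. The sketch below extracts every run of "A" characters as a 1-based, inclusive (start, end) pair. The function name is hypothetical, and the code assumes the prediction line has already been isolated from the file and stripped of its dot-leader annotation.

```python
import re

def ipm_segments(prediction):
    """Return 1-based inclusive (start, end) ranges of consecutive 'A' characters."""
    return [(m.start() + 1, m.end()) for m in re.finditer(r"A+", prediction)]

# Prediction line from the example above, without the trailing annotation.
prediction = "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA________________"
segments = ipm_segments(prediction)
```

For the example above this yields a single segment starting at residue 1; a sequence with several predicted anchors gives one pair per anchor.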
>JohnDoe ............................................................. sequence header
# rank  residue  score  topology  structII  <muH>  amphipathy ........ score and other information associated
  1     S        0.028  A         c         0.438  3                   with each sequence position (membrane topology,
  2     G        0.056  A         c         0.415  3                   secondary structure, μH, ...)
  3     S        0.198  A         h         0.457  4
...
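The score file lends itself to simple scripted parsing. The following Python sketch reads the seven columns shown above into named records. It assumes that columns are whitespace-separated, that data rows start with an integer rank, and that the explanatory annotations shown on this page are not part of the real file; the class and function names are hypothetical.

```python
from typing import NamedTuple

class ScoreRow(NamedTuple):
    rank: int          # sequence position
    residue: str       # one-letter amino acid code
    score: float       # AmphipaSeeK score
    topology: str      # "A" = IPM anchor
    struct2: str       # predicted secondary structure state
    mu_h: float        # <muH>
    amphipathy: int    # class 0..5

def parse_score_lines(lines):
    """Parse data rows, skipping the '>' header, the '#' column line and any other text."""
    rows = []
    for line in lines:
        fields = line.split()
        if len(fields) < 7 or not fields[0].isdigit():
            continue
        rows.append(ScoreRow(int(fields[0]), fields[1], float(fields[2]),
                             fields[3], fields[4], float(fields[5]), int(fields[6])))
    return rows

sample = """>JohnDoe
# rank residue score topology structII <muH> amphipathy
1 S 0.028 A c 0.438 3
2 G 0.056 A c 0.415 3
3 S 0.198 A h 0.457 4
""".splitlines()
rows = parse_score_lines(sample)
```

With the rows in this form, filtering is a one-liner, for example `[r for r in rows if r.topology == "A"]` to keep only the positions predicted as IPM anchor.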