2. Refine Your Query Using the Heat Map
After submitting a query, you can refine the results with the new heat map retrieval tool to quickly find the entries most relevant to you. Text classification helps you find candidate peptides related to cancer, cardiovascular diseases, diabetes, apoptosis, angiogenesis, and molecular imaging, or peptides for which binding data exist.
PepBank is a database of peptides built from sequence text mining and public peptide data sources. Only peptides that have an available sequence and are 20 amino acids or shorter are stored.
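The storage rule above can be illustrated with a minimal sketch. The function name `is_storable` and the restriction to the 20 standard one-letter amino acid codes are assumptions made for illustration; they are not taken from PepBank itself.

```python
# Hypothetical illustration of the storage rule: a peptide is kept only if
# its sequence is available and is 20 amino acids or shorter.
# Assumption: sequences use the 20 standard one-letter amino acid codes.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")
MAX_LENGTH = 20


def is_storable(sequence):
    """Return True if the sequence is present, valid, and at most 20 residues."""
    if not sequence:
        # No sequence available (None or empty string).
        return False
    seq = sequence.strip().upper()
    if not seq or len(seq) > MAX_LENGTH:
        return False
    # Every residue must be one of the assumed standard codes.
    return set(seq) <= STANDARD_AMINO_ACIDS


candidates = ["RGD", "A" * 21, "", None, "grgdsp"]
stored = [s for s in candidates if is_storable(s)]
```

In this sketch only "RGD" and "grgdsp" pass the filter; the 21-residue sequence and the missing sequences are rejected.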
Version 2 (October 2009) [PubMed] [Full text]
The heat map can be used to navigate the results. A machine learning algorithm automatically classifies entries into predefined categories of interest, for example, related to cancer or having binding data available. Users can vote on entries; each vote tags the entry and improves the classification of other, untagged entries.
Version 1 (August 2007) [PubMed] [Full text]
A new text mining tool was developed and used to identify peptide sequences in MEDLINE abstracts. These data were combined with two of the public sources of peptide sequence data, ASPD and UniProt, as well as with peptide data that are manually curated from abstracts and full text articles. To search for sequences, use BLAST search or Smith-Waterman search. Score is a measure of confidence of the entry, ranging from 0 (lowest) to 1 (highest). Confidence is higher for manually annotated than for automatically mined entries.
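For readers unfamiliar with Smith-Waterman search, the following is a minimal sketch of how a local alignment score is computed. The scoring values (match 2, mismatch -1, linear gap penalty -1) are assumptions chosen for illustration; alignment tools typically use amino acid substitution matrices and affine gap penalties, and this is not PepBank's implementation.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score between sequences a and b.

    Scoring parameters are illustrative assumptions, not PepBank's settings.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] holds the best score of a local alignment ending at a[i-1], b[j-1].
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = H[i - 1][j] + gap      # gap in b
            left = H[i][j - 1] + gap    # gap in a
            # Clamping at 0 lets an alignment restart anywhere (local alignment).
            H[i][j] = max(0, diag, up, left)
            best = max(best, H[i][j])
    return best


score = smith_waterman("GRGDSP", "RGD")
```

Because the alignment is local, the query "RGD" scores as highly against "GRGDSP" as it does against itself: the flanking residues are ignored instead of being penalized.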