UniProt: Universal Protein Resource

UniProt (Apweiler et al., 2004) joins the experience and expertise of the three major annotated and comprehensive protein sequence databases, PIR-PSD, Swiss-Prot and TrEMBL. The Protein Information Resource (PIR) (Wu et al., 2003) provides an integrated public resource of protein informatics to support genomic and proteomic research and scientific discovery. PIR produces the Protein Sequence Database (PSD) of functionally annotated protein sequences, which grew out of the Atlas of Protein...

References

Achard, F., Cussat-Blanc, C., Viara, E. and Barillot, E. (1998). The new Virgil database: a service of rich links. Bioinformatics 14, 342-348.
Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W. and Lipman, D.J. (1997). Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389-3402.
Bader, G.D., Betel, D. and Hogue, C.W. (2003). BIND: the biomolecular interaction network database. Nucleic Acids Res. 31, 248-250.
Barker,...

P/FDM mediator architecture

The main components of the P/FDM mediator are shown inside the dashed box in Figure 12.8 and are described below. The parser reads a Daplex query (Daplex is the query language for the FDM), checks it for consistency and produces a list comprehension containing the essential elements of the query in a form that is easier to process than Daplex text (we call this internal form 'ICode'). The simplifier's role is to produce shorter, more elegant and more consistent ICode, mainly through removing...
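To make the idea of a list-comprehension internal form concrete, here is a minimal sketch in Python. It is not the mediator's actual ICode: the `Comprehension` class, its field names, the rendering syntax and the example query are all assumptions made for illustration. It only shows how the essential elements of a query (what to return, what to iterate over, what to test) can be held in a structure that is easier to process than query text.

```python
from dataclasses import dataclass, field


@dataclass
class Comprehension:
    """Hypothetical stand-in for ICode: the essential elements of one query."""
    results: list                                  # expressions to return
    generators: list                               # (variable, class) pairs to iterate over
    filters: list = field(default_factory=list)    # restrictions on the variables

    def render(self) -> str:
        """Render the structure in a conventional comprehension notation."""
        gens = [f"{var} <- {cls}" for var, cls in self.generators]
        body = "; ".join(gens + self.filters)
        return f"[ {', '.join(self.results)} | {body} ]"


# Illustrative query (assumed schema): names of proteins longer than 100 residues.
query = Comprehension(
    results=["name(p)"],
    generators=[("p", "protein")],
    filters=["length(p) > 100"],
)
rendered = query.render()
print(rendered)
```

Because the query is held as separate lists rather than as text, a later stage can inspect or rewrite individual generators and filters without re-parsing anything.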

Identification of the Biologically Relevant Assembly

Quaternary structure is defined as that level of form in which units of tertiary structure aggregate to form homo- or hetero-multimers. Considering whether a quaternary state is present is important for understanding a protein's biological function. For a PDB entry determined using X-ray crystallography, the deposited coordinates typically consist of the contents of the asymmetric unit (ASU). The deposited coordinates may, therefore, contain one or more complete macromolecule(s), some...

Contents

1 Annotation and Databases: Status and Prospects
M. Hoebeke, H. Chiapello, J.-F. Gibrat, Ph. Bessieres and J. Garnier
1.2 Annotation of Genomic Data
1.3 Databases: Concepts and Definitions
1.4 Access to Annotation Databases
Glossary
References
2 Survey of Sequence Databases: Archival Projects
M. Magrane, M. Garcia-Pastor and R. Apweiler
2.2 Nucleotide Sequence Databases
2.6 UniProt
References
3 Survey of Sequence Databases: Derived Databases
M. Pruess, N. Mulder and R....

CIF and mmCIF

Internally, the PDB maintains the detailed information about structures in mmCIF (Bourne et al., 1997), the macromolecular version of the Crystallographic Information File, and uses extensions to mmCIF to maintain metadata about the processing of entries. The mmCIF data sets differ from the fixed-field PDB data sets in format and, in some ways, in content. mmCIF provides a more detailed, database-oriented view of the information in a data set. The same information may be presented in different...
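The database-oriented view can be illustrated with a small sketch. In mmCIF, each data item is named `_category.item`, so simple name-value lines map naturally onto tables (categories) and columns (items). The Python function below is a simplified assumption, not a full mmCIF parser: it ignores `loop_` blocks and multi-line values, and it borrows shell-style quote handling from `shlex`, which only approximates the CIF quoting rules. The entry identifier and the values in the sample are invented for illustration.

```python
import shlex
from collections import defaultdict


def parse_simple_mmcif(text):
    """Group single-line '_category.item value' pairs by category.

    Simplified sketch: loop_ blocks, multi-line values and comments are skipped.
    """
    categories = defaultdict(dict)
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("_"):
            continue  # skip data_ headers, comments, loop_ keywords and loop rows
        tokens = shlex.split(line)
        if len(tokens) != 2:
            continue  # item names inside a loop_ carry no value on the same line
        name, value = tokens
        category, _, item = name[1:].partition(".")
        categories[category][item] = value
    return dict(categories)


# Invented sample data in mmCIF name-value style.
sample = """\
data_XXXX
_cell.length_a   61.2
_cell.length_b   61.2
_exptl.method    'X-RAY DIFFRACTION'
"""

parsed = parse_simple_mmcif(sample)
print(parsed["cell"]["length_a"], parsed["exptl"]["method"])
```

In a fixed-field PDB record the meaning of a value depends on its column position within the line, whereas here every value is addressed by an explicit category and item name.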

Architectures for Data Integration

We are aiming to develop a system that will provide uniform access to heterogeneous databases via a single high-level query language or graphical interface and will enable multi-database queries. This objective is illustrated in Figure 12.5. Data replication and multi-databases are two alternative approaches that could help us to meet this objective. This section contains an overview of these two approaches, but we start by...

[Figure text: Models of Database Interconnectivity; Ad Hoc Queries; Graphical User...]