Necessary Resources

Hardware

Computer connected to the Internet

Command-line MEME works on many uniprocessor computers, some multiprocessor computers, and clusters that have the MPICH message-passing software installed. A list of supported operating systems and their manufacturers is available at ftp://ftp.sdsc.edu/pub/sdsc/biology/meme/README.

Software

Web browser (e.g., Internet Explorer, Netscape Navigator)

E-mail reader (e.g., Netscape Messenger)

Command-line MEME (optional; http://meme.sdsc.edu/meme/website/meme-download.html)

MEME can be used remotely over the Web (Web MEME), with results being returned by E-mail, or it can be installed and run on the user's Unix-based computer (command-line MEME). The Web interface has the advantage of not requiring any software installation, but some MEME features are only available in the command-line version. Command-line MEME removes the restriction on the size of the training set imposed by the MEME Web server (maximum of 60,000 characters). Web access is free (currently available at http://meme.sdsc.edu and http://bioweb.pasteur.fr/seqanal/motif/meme). The command-line version is free for noncommercial use or can be obtained with a commercial license, and can be downloaded over the Web (http://meme.sdsc.edu/meme/website/meme-download.html).

When MEME is used via a Web interface, results typically arrive within a few hours. The arrival time cannot be predicted, however, because the computers on which MEME runs at SDSC and the Pasteur Institute are shared resources; depending on the load, a job can sometimes take a day or more to be processed. This unpredictability can be avoided by installing command-line MEME locally on the user's Unix-based computer.

Files

A sequence file (training set) containing one or more protein sequences (FASTA format required for command-line MEME; APPENDIX 1B). Other formats, described on the MEME Web site, are supported if using MEME via the Web interface, but the total number of characters in the sequences may not exceed 60,000.
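Before submitting a training set to the Web server, it can be useful to check it against the 60,000-character limit. The following is a minimal sketch, assuming that the limit applies to sequence characters only (not to the FASTA header lines) and that the file is plain FASTA; the function names and the demonstration sequences are hypothetical and are not part of MEME.

```python
import io

# Limit quoted in the text for the MEME Web server.
WEB_MEME_MAX_CHARS = 60_000


def count_sequence_chars(handle):
    """Return (number of sequences, total sequence characters) for a FASTA stream.

    Header lines (starting with '>') are counted as sequences but their
    text is NOT counted toward the character total (an assumption).
    """
    n_seqs = 0
    n_chars = 0
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            n_seqs += 1
        else:
            # Ignore any internal whitespace in sequence lines.
            n_chars += len("".join(line.split()))
    return n_seqs, n_chars


def fits_web_meme(handle, limit=WEB_MEME_MAX_CHARS):
    """True if the total sequence length does not exceed the limit."""
    _, n_chars = count_sequence_chars(handle)
    return n_chars <= limit


# Hypothetical two-sequence training set used only for demonstration.
example = io.StringIO(">seq1 demo\nMKTAYIAK\nQRQISFVK\n>seq2 demo\nMSHHWGYG\n")
n_seqs, n_chars = count_sequence_chars(example)
print(n_seqs, n_chars)
```

For a real file, the same functions can be applied to an open file handle, e.g., `with open("tf4.fasta") as fh: print(fits_web_meme(fh))`. A training set that fails the check can still be analyzed with command-line MEME.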

There are many ways to construct a family of protein sequences for input to MEME. For example, file tf4.fasta contains a family of bacterial protein sequences related to Entrez sequence gi|15897224|ref|NP_341829.1| hypothetical protein [Sulfolobus solfataricus]. It was constructed by doing a BLASTP search (UNIT 3.4) of the nonredundant protein database using the sequence named above (gi|15897224) as the query. The accession numbers of all of the sequences matching the query with BLAST E-values ≤0.01 were then placed in file tf4.acc. Then, Batch Entrez (UNIT 1.3) was used with the file of accession numbers to download the sequences in FASTA format into file tf4.fasta.
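The E-value filtering step described above can be sketched in a few lines. This is an illustration only, not the procedure actually used to build tf4.acc: it assumes the BLASTP hits have been saved in the standard 12-column tab-delimited BLAST format, in which column 2 is the subject identifier and column 11 is the E-value. The function names and the hit lines shown are hypothetical.

```python
import io

# Cutoff quoted in the text: keep hits with E-value <= 0.01.
EVALUE_CUTOFF = 0.01


def select_accessions(tabular_lines, cutoff=EVALUE_CUTOFF):
    """Return unique subject identifiers whose E-value is <= cutoff.

    Assumes 12-column tab-delimited BLAST output; order of first
    appearance is preserved and duplicate subjects are reported once.
    """
    seen = set()
    accessions = []
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        subject_id = fields[1]
        evalue = float(fields[10])
        if evalue <= cutoff and subject_id not in seen:
            seen.add(subject_id)
            accessions.append(subject_id)
    return accessions


def write_accessions(accessions, handle):
    """Write one identifier per line, as expected for a batch download list."""
    for acc in accessions:
        handle.write(acc + "\n")


# Hypothetical hits (made-up identifiers and scores) for demonstration.
demo = "\n".join([
    "query\tHYP_0001\t98.0\t200\t4\t0\t1\t200\t1\t200\t1e-80\t300",
    "query\tHYP_0002\t35.0\t180\t100\t5\t10\t190\t5\t185\t0.005\t45.0",
    "query\tHYP_0003\t28.0\t90\t60\t3\t20\t110\t15\t105\t0.5\t30.1",
    "query\tHYP_0001\t40.0\t50\t30\t0\t1\t50\t210\t260\t0.002\t40.0",
])

kept = select_accessions(io.StringIO(demo))
out = io.StringIO()  # in practice: open("tf4.acc", "w")
write_accessions(kept, out)
print(out.getvalue(), end="")
```

The resulting list (one identifier per line) corresponds to the role of tf4.acc in the text; the file would then be uploaded to Batch Entrez to retrieve the sequences in FASTA format.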

The data file used in this example (tf4.fasta) should be downloaded from the Current Protocols Web site (http://www.currentprotocols.com).
