'97 BC/BP 578

Week 1

A tour of the computing resources available to you in this course.

Author:

Susan Jean Johns

Welcome to BC/BP 578. In this course you will expand your understanding of biocomputing by applying computing techniques to problems in molecular biology. These techniques will allow you to analyze protein and nucleotide sequences and to visualize their structures.

For this first week, do not worry about the mechanics of doing computing tasks. Making connections to the platforms housing the software to be used, and running the needed software, will be dealt with later in the course. Right now concentrate on exploring this new computing environment.

This week's activities center on a tour of the computing resources available to you in the course. You will explore the nets, contact the various machines which house these computing resources and run a series of demonstrations to illustrate the capabilities of the system.

The machines in front of you are Power PC MAC clones. In this course they serve a number of purposes. First, they run software located on the machines themselves. Locally available software includes Netscape, Versa Term Pro and Fetch.

Second, they serve as terminal emulators. This means that the computers pretend to be terminals connected to another computer and do not use their own computing resources to perform the required computing tasks. The computers that they are connected to, and serve as terminals for, are the VADMS' platforms, ribozyme and model1. These machines normally perform the computing tasks required by the course and are known as host computers.

Emulators affect how a device (in this case the MAC clone) communicates with its host machine (the VADMS' machines). They make it possible to change the functions of the keyboard's keys. Since the computer pretends to be another type of terminal, the keyboard setup must match that of the original terminal. Changing the keys' functions to match those of another keyboard is known as remapping. Depending on the terminal being emulated, you may need more than one set of remapped keys, because some terminals use the same keys for one purpose in general operation and for another in editing tasks. The clones are set up to act like VT220 terminals, a very generic ASCII terminal type.
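Remapping can be pictured as a lookup table that translates each key press into the character sequence the emulated terminal would have sent. The sketch below is only an illustration: the table names, the key names and the remap function are hypothetical and are not taken from Versa Term Pro. The escape sequences are the ones commonly documented for VT100/VT220 cursor keys, and the two tables show why more than one set of remapped keys may be needed.

```python
# Hypothetical illustration of key remapping; not the Versa Term Pro configuration.
# Each table maps a key name to the sequence the emulated terminal would send.
ESC = "\x1b"

# Cursor keys in ordinary operation (assumed VT220 "normal" cursor mode).
NORMAL_MAP = {
    "UP": ESC + "[A",
    "DOWN": ESC + "[B",
    "RIGHT": ESC + "[C",
    "LEFT": ESC + "[D",
}

# The same keys when a program switches them to "application" mode.
APPLICATION_MAP = {
    "UP": ESC + "OA",
    "DOWN": ESC + "OB",
    "RIGHT": ESC + "OC",
    "LEFT": ESC + "OD",
}


def remap(key, mode="normal"):
    """Return what the emulator would send to the host for one key press."""
    table = APPLICATION_MAP if mode == "application" else NORMAL_MAP
    # Keys that are not in the table (ordinary letters, digits) pass through unchanged.
    return table.get(key, key)
```

Calling remap("UP") and remap("UP", "application") gives two different sequences for the same physical key, while remap("a") returns the letter unchanged.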

BC/BP 578 relies on the computing resources of the VADMS Center. To connect to these resources, the clones must communicate with their host machines. This is done over a network. Once connected to a host, the clone uses Versa Term Pro for terminal emulation for normal sessions and MacX for x-windowing sessions.

When the screen saver of the clone is not active, the image on the next page appears on the screen. It shows the locally available software on the machine as standard icons. To deactivate the screen saver to see this image on the terminal, move the mouse.

You will probably use the icons from the Launcher window the most: NETSCAPE, RIBOZYME, MODEL1, X RIBOZYME and ENTREZ. Other less used software is available through the Apple pull-down menu. Netscape is the course's web browser. A number of biocomputing resources are only available on the web. Versa Term Pro is the terminal emulator that connects to the VADMS' computers for normal terminal sessions and is represented by the RIBOZYME and MODEL1 icons. This emulator also allows the clones to act like a Tektronix color graphics terminal and to use software that calls for that type of graphical output. MacX is the x-window emulator on the machine and is used when x-windowing software is run with the X RIBOZYME icon. ENTREZ is a database browser for accessing information at NCBI.

Notice the icon arrangement on this facsimile of the terminal screen. All the heavily used software is in the Launcher window.

The clones have mice. Some of the software you will be using depends on moving the cursor (or arrow) around on the screen with the mouse. The mouse has a single button. Try to keep the mouse on the mouse pad.



Introduction to Web Resources

To start your tour, let's begin with resources located on the web. Use your mouse to double click on the NETSCAPE icon with the arrow. The arrow changes to an hourglass while the connection is made to the VADMS home page, the default home page location for the course and each of these machines.

Before you explore the biocomputing resources off campus, check out the local ones. The VADMS home page is still under construction, but contains valuable information. In the future it will serve as the main information source for the VADMS Center and its activities. Most of the development time so far has been spent getting the instructional activities portion of the web site online. Items containing information are marked with a blue ball; those with none, a white one. Read the information given there. To scroll down the page, move the cursor to the scroll bar on the right-hand side of the window, place it on the down arrow at the lower right, and hold down the mouse button.

Use the mouse to click on the Instructional activities line. The arrow turns into a hand, the color of the selected subject turns red and then the new page appears. The color of the subject line depends on whether it was recently visited by someone else using this same machine. It will be red if visited lately, otherwise it will be blue. The Instructional activities page is then loaded and displayed on the screen.

Read the information given there, using the scroll bar lower arrow to scan through the entire text. Go back to the top of the page and then explore the various parts of the page to answer the following questions. Record your answers in the space provided. Use either the Back button from the top of the Netscape window or a return link on a page to move back and forth and locate the needed information. (You will need to poke into all aspects of the pages to find the necessary information.)

1) How long has the VADMS Center been in existence? _____________

2) On which Hawaiian island was the '97 Pacific Symposium on Biocomputing held?

   ________________________________________________________________

3) What English college was planning on coordinating a Structure-Based Drug Design 
   course on the web in the summer of 1996? 

   ________________________________________________________________

Now use the mouse to select the Home button from the top of the Netscape window to return to the VADMS home page. Scroll down this page to the Software information link. Poke around this page to find the answer to the following question.

4) What does the name of the PILOT program stand for in the Pine mailer system?

   ________________________________________________________________

Use the mouse to select the Home button from the top of the Netscape window to return to the VADMS home page. From this location you will move off campus to another site of interest in molecular biology.

There is a site on the web known as Molecules R US. It is maintained by NIH. This site contains structural and image data for all the entries contained in the Protein Data Bank (PDB). Use the arrow to open the Bookmarks menu, and select the FORM for PDB query: Molecules R US entry from this menu.

You are now connected to the Molecules R US home page. Depending on network traffic, it may take a moment for their logo to appear. This is a form driven system. Note the empty white box. Move the arrow to the beginning of this box and press the mouse button.

For this tour, there are three selected molecules. Their PDB access codes are: 1crn for crambin, 6lyz for lysozyme and 1rmn for a hammerhead ribozyme. Type the first of these codes in the box and press the RETURN key.
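As a small illustration of what an access code looks like, the sketch below holds the three tour codes in a table and checks whether a typed string has the general shape of a PDB access code. It assumes the classic four-character format (a digit from 1 to 9 followed by three letters or digits). The names TOUR_MOLECULES, looks_like_pdb_code and describe are hypothetical and are not part of the Molecules R US form.

```python
import re

# The three access codes used in this tour, taken from the text above.
TOUR_MOLECULES = {
    "1crn": "crambin",
    "6lyz": "lysozyme",
    "1rmn": "hammerhead ribozyme",
}

# Assumed shape of a classic PDB access code: one digit (1-9) plus three letters or digits.
PDB_CODE = re.compile(r"[1-9][A-Za-z0-9]{3}")


def looks_like_pdb_code(text):
    """Return True if the text has the shape of a four-character PDB access code."""
    return PDB_CODE.fullmatch(text.strip()) is not None


def describe(code):
    """Return the tour molecule for a code, ignoring case and surrounding spaces."""
    return TOUR_MOLECULES.get(code.strip().lower(), "not one of the tour molecules")
```

A keyword such as crambin fails the shape check, which is one way to tell an access code search from a keyword search.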

The results of the PDB database search for the 1crn code are shown on the screen in blue text. There is only one data file with that access code. A keyword search would have produced a number of hits to choose from. Move the arrow to that line (it turns into a hand in the process), and click the mouse button.

This form actually requests a structural image for the desired access code. Position the arrow on the Submit Request box and press the mouse button.

A Viewing Location window appears showing the transfer process. This appears as a black box on the screen with its program's name at the top. Then a structure appears in the RasMol window, a wire frame representation of the requested protein in default CPK colors. This image can be moved. Use the scroll bar at the bottom of the screen to rotate the molecule. Try out the various options under the Display and Colours menus. The ones that are most meaningful for nucleotide structures are shapely and group under Colours. In the Display options, Ribbons and Strands produce the same image for nucleotide structures. These options are more useful with protein structures. When you are finished exploring, select the Exit option from the File menu of the RasMol control bar to close the RasMol program.

Click twice on the Back button at the top of the screen to go back to the initial query screen. At this point you can either exit the program if you are finished looking at all the molecules or look at the rest of the codes given on the previous page.

To look at the other access codes, move the arrow to the white box and click the mouse button. Then use the delete key to remove the text found in the box and type in the new code. Repeat the instructions given earlier for submitting a request and looking at the desired molecule.

To exit the Netscape program select the File option from the top of the screen and select its Exit option. This will return you to the machine's Launcher window screen.



Introduction to Ribozyme Resources

The tour next moves on to the local biocomputing resources available on ribozyme. This machine is the main VADMS' platform. It holds the bulk of VADMS' computing resources. All its sequence and evolutionary analysis software packages are housed here.

In this course, a number of accounts have been set up on this machine for student use. They all are called bcsxx, where the xx refers to a number between 01 and 28. All these accounts have the same initial password, mygenes0. For this first week's activities, you will log into the machine using an account that corresponds to the number of the machine you are using. The number is given on the front of the terminal screen and is similar to the following, clone1.vadms.wsu.edu. If you were using this machine you would use account bcs01. Find the name of your clone and determine which account you are to use for this part of the tour. Record that account name below.

using account: _______________________________________________________________
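The naming rule described above can be written out as a short sketch. It assumes that every clone name follows the pattern of the example (clone, a number, then .vadms.wsu.edu) and that the number maps directly onto the two-digit account suffix. The function name account_for_clone is hypothetical.

```python
import re


def account_for_clone(hostname):
    """Return the bcsxx account matching a clone name such as clone1.vadms.wsu.edu."""
    # Assumed host name pattern, based on the single example given in the text.
    match = re.fullmatch(r"clone(\d+)\.vadms\.wsu\.edu", hostname.strip().lower())
    if match is None:
        raise ValueError("not a clone host name: " + hostname)
    number = int(match.group(1))
    # The course accounts run from bcs01 to bcs28.
    if not 1 <= number <= 28:
        raise ValueError("no course account for clone number %d" % number)
    return "bcs%02d" % number
```

For the example in the text, account_for_clone("clone1.vadms.wsu.edu") gives bcs01.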

Select the RIBOZYME icon from the Launcher window. This will get you to the login prompt for the ribozyme machine. Enter your account name and password.

IRIX (ribozyme)

login:
Password: 

The following screen will appear on the terminal.

IRIX Release 6.2 IP21 ribozyme
Copyright 1987-1996 Silicon Graphics, Inc. All Rights Reserved.
Last login: Mon Jan 10 13:17:55 PST 1997 by UNKNOWN@clone2.vadms.wsu.edu

            This is a Power Challenge L running IRIX 6.2

   Sequence analysis users, may enter the term seqtool to get into the
   character driven menuing system for access to VADMS sequencing software.

   Use pico for editing and pine for mail.

   The EST files have been separated out of GenEmbl.  Use est:* for the
   database if you want to search this information.

   Please note that changes in the BLAST menu.  Searching only GENBANK or
   EMBL is no longer possible.

   A new solution to the font problem in PHYLIP has been created.  If you are
   interested, check out the following file" $PHYLIP_DIR/font.readme

   VADMS' Web page URL:
     Mosaic users    http://ribozyme.vadms.wsu.edu/~vadms/
     Netscape users  http://www.vadms.wsu.edu/

The introductory message talks about a seqtool menuing system for accessing VADMS sequencing software. Let's check it out. To use this tool more fully, first activate the principal sequencing package, GCG. This package displays an introductory message when the software is activated. Enter the term gcg and press the RETURN key at the % prompt. After that, launch the seqtool system by entering the term seqtool and pressing the RETURN key at the % prompt. The input you should enter is shown in bold type.

% gcg 
                     Welcome to the WISCONSIN PACKAGE
                      Version 9.0-UNIX, December 1996

                             Installed on irix

  Copyright 1982, 1983, 1984, 1985, 1986, 1987, 1989, 1991, 1992, 1994, 1996
            Genetics Computer Group, Inc.  All rights reserved.

         Published research assisted by this software should cite:

                 Program Manual for the Wisconsin Package,
            Version 9, December 1996, Genetics Computer Group,
             575 Science Drive, Madison, Wisconsin, USA  53711

              Databases available:
                   GenBank            Release 98.0 (12/96)
                   EMBL (Abridged)    Release 43.0 ( 5/95)
                   PIR-Protein        Release 49.0 ( 6/96)
                   SWISS-PROT         Release 33.0 ( 3/96)
                   NRL_3D             Release 2x.0 ( 6/96)
                   PROSITE            Release 12.2 ( 3/95)
                   Restriction Enzymes (REBASE)    ( 6/95)

               Help is available with the command % genhelp or by
            calling (608) 231-5200 or sending e-mail to Help@GCG.Com

% seqtool

The following menuing system comes up on the screen. You can move through the system by using the arrow keys on the terminal. Once you have the cursor on the desired subject, press the RETURN key.

               +---------------------------------------+   arrow keys navigate
               |     VADMS sequence analysis tools     |   "?" for help
               |          based on HYBROW 1.3          |   "q" to quit
               | Hypertext-like Unix Browser/Interface |
               |    Copyright 1992 D.Kiong, TW Tan     |
               +---------------------------------------+

[GCG programs]         GCG Sequence Analysis software
[EGCG programs]        Extended GCG Sequence Analysis software
[PHYLIP programs]      Joe Felsenstein's PHYLIP phylogenetic inference software
[Sequence Utilities]   Assorted software for sequence data manipulations
[Linkage programs]     Classical genetics linkage software
[Net Analysis tools]   Sequence Analysis tools from off the Networks

[tutorial1]            How to use a HYBROW interface
[Copyright]            Information of the HYBROW Copyright

For this pass through the menuing system, you will be looking at the GCG software. When the menuing system starts, the cursor is on the first item in the list, in this case the GCG programs. To look at this software suite press the RETURN key.

The following text appears on the terminal, reminding you to activate the GCG software package before attempting to use it in this menuing system. The same holds true for using the EGCG package within this system. You have already done this when you entered the gcg command before getting into the seqtool menuing system.

           HYGCGMENU, the character-based menu system for

        GCG sequence analysis package Unix vn 8.0 and below.
(by Tan Tin Wee and Derek Kiong of the National University of Singapore)
    (modified by Susan Johns to fit the VADMS computing environment)

   WARNING:  Please make sure you have activated the package by entering

             gcg

             before you start to use the software.

        [Press RETURN to continue]
        Press "q" to quit.

________________________
Latest update May 1995 

By pressing the RETURN key, you move on to a listing of the various analysis functions that GCG can perform.

                 | Welcome to HYGCGmenu  Unix version 1.0 |
                     Editing Sequences   [edseq]
                     Fragment Assembly   [fragas]
                   Restriction Mapping   [restmap]
                   Sequence Comparison   [seqcomp]
        Database Searching / Retrieval   [dbsearch]
            Multiple Sequence Analysis   [multseqan]
                 Evolutionary Analysis   [evolan]
      DNA Sequence Pattern Recognition   [patrecog]
               RNA Secondary Structure   [rnasecstr]
             Protein Sequence Analysis   [protan]
                           Translation   [trans]
                          Manipulation   [manip]
                               Display   [disp]
               Convert Sequence Format   [seqconv]
             File Management Utilities   [util]
                         Miscellaneous   [misc]

      What is GCGmenu? [whatisgcgmenu]   Request  for help [mailforhelp]
Right arrow or RETURN to go into an option.  Arrow keys up and down navigate.
Left arrow to get out of an option.          "?" for help
"q" to quit program                          Directory browser []

As you can see, the GCG software package has been broken down into groupings of programs based on their general function. Explore one of these functional groups, Protein Sequence Analysis. Move through the list with the down arrow until you come to the protan line and press the RETURN or right arrow key. You now have a listing on the screen of the various protein analysis programs that GCG has available. Use this listing to come up with the answers to the following questions. Record your answers in the space provided.

1) How many programs are listed under this grouping? _______________________________

2) How many of these require graphics set up prior to use? ____________________________

Now use the left arrow key to move back up through the menuing system to get to the main window, the one with the box stating that this tool is based on HYBROW. From here, explore other options to find the necessary information to answer the questions on the next page. Record your answers in the space provided there.

1) What is the current number of usable net analysis tools at the moment? ________________

2) What is the current number of linkage tools available? _____________________________

When you have the information required, press q to get out of the system. The prompt given below should appear on the screen. Respond with y and press the RETURN key to get back to a machine prompt, which will appear in the middle of the screen. Once another command is entered, the prompt will go back to the left-hand side of the screen.

Really want to quit? [y] y 

The use of seqtool will not be covered in the course. Most of the time you will use command line entries to activate necessary software. After the course is over, however, you may find it very useful to use the seqtool interface to access VADMS software. You don't have to remember all the names of the various programs in the GCG, EGCG or PHYLIP packages, but can get at them by just knowing the types of manipulations you want to perform.

The tour continues with examples of solving sequence analysis problems with GCG software. This part of the tour is set up as an automatic demo. You start the demo running and it automatically moves through the examples, presenting text for you to read, or images for you to look at. Pressing the RETURN key moves you from one screen of text or image to the next one. This way you can pace the demo to your own reading speed.



Solving Sequence Analysis Problems

At WSU, the most commonly used software for solving sequence analysis problems is that of the Genetics Computer Group, known as GCG. For this part of the ribozyme resource tour, you will look at two sequence analysis problems. The first is a characterization problem dealing with RNAs that have enzymatic qualities. These compounds are known in the literature as ribozymes. The question we will attempt to answer is: are there any structural or feature similarities between these compounds which would allow you to recognize them? The second problem is an attempt to infer the phylogenetic relationships of the G protein-coupled receptors. These receptors are vital to several sensory and hormonal pathways in animals. The evolution and sequence conservation of a number of the human receptors will be explored.

sequence analysis tour highlights

1) A number of ribozymes are displayed to see if you can spot any sequence similarities.

2) Multiple alignment analysis is run on them to see what sort of similarities the computer can spot.

3) Folding analysis is then run and displayed to see if these compounds show any similar structural features at their enzyme behaving site.

4) A profile is assembled of one of the receptor representatives and the database is searched to find all human G protein-coupled TM7 receptors.

5) A new multiple sequence alignment is assembled using pregapped sequences from profile analysis; similarity plots show sequence conservation of transmembrane regions. Structural predictions also point out the seven transmembrane helices.

6) Finally an evolutionary phylogenetic hypothesis is proposed based on the alignment. It is displayed as a GrowTree phylogram.

To start this tour enter the term sequence_show.

% sequence_show

After the demo stops, it is time to move on to another of VADMS' computing platforms, MODEL1. This platform houses most of the molecular modelling software to be used in the course. To get back to the Launcher window screen, enter logout at the machine prompt.

% logout

The cursor turns into a rotating world and then returns to normal. Use the mouse to move the arrow to the File location on the control bar at the top of the screen. Press down on the mouse button to see the pull down menu located there. While holding down the mouse button, move the arrow cursor to the Quit line, turning that location black, and then release the mouse button. Messages will appear about saving settings for Versa Term Pro and then the screen will return to the original Launcher window screen.

Introduction to Model1 Resources

The tour moves on to local biocomputing resources available on model1. This machine is an auxiliary VADMS' platform. It holds the bulk of VADMS' molecular modelling resources. The software used to enter and minimize molecular structures is located here. Those students doing a modelling project will spend a lot of time here. Model1 uses a different operating system than ribozyme, and may seem confusing at first.

For student use in the course, a set of accounts has been set up on model1 similar to the one on ribozyme. They all are called bcsxx, where the xx refers to a number between 01 and 28. All these accounts have the same password, mygenes0. For this first week's activities, log into the machine using an account that corresponds to the number of the machine you are using. If you were using clone1.vadms.wsu.edu, then you would use account bcs01. Use the same account name you recorded in the ribozyme portion of the tour.

Select the MODEL1 icon from the Launcher window. This will get you to the login prompt for model1. Enter your account name and password.

        Welcome to OpenVMS VAX V6.1

Username:

The following welcome screen appears on your terminal. It states that there is modelling software on this machine.

        Welcome to OpenVMS VAX version V6.1 on node MODEL1
    Last interactive login on Tuesday, 28-DEC-1996 13:53
    Last non-interactive login on Tuesday, 28-DEC-1996 13:52

                        Welcome to the Model1
                        ---------------------

        This machine is for the molecular modelling needs of the WSU
        campus. Authorized use is controlled by the VADMS LABORATORY.

   Current software running on the device:
        1) Macromodel - a molecular modelling package from Columbia Univ.
           enter mmv30 to run program
        2) Specialized user software.

        printing is with vpr for output in 2015 ITB
        printing is with cpr for output in 210 Commons

   Report any problems to Susan Johns, prcadams@ribozyme.vadms.wsu.edu, 5-0424.

The tour continues with an example of entering a structure, minimizing it and then looking at another molecule with MacroModel software. This part of the tour is set up as an automatic demo. This demo is not self-pacing.

Solving a Molecular Modelling Problem

Molecular modelling studies usually start with the use of a modelling program such as MacroModel. This software allows you to enter structures, energy minimize them, and then do various types of analyses on the resultant structure(s). Molecules can be docked either visually or through the minimization process. A recent area of increased interest is that of fullerenes. Fullerenes are carbon-only compounds that were first identified in 1985 in laboratory experiments meant to mimic the chemistry of carbon-rich stars, and have interesting electrical and filtering properties. These materials are known as bucky balls and will be the subject of the molecular modelling tour.

tour highlights

1) A simple structure is entered and minimized to show the minimization process.

2) The various steps on the way to forming one of these fullerenes (having 28 carbons) are shown along with the final structure.

3) Different types of structural analyses are shown for this structure, such as CPK modelling, van der Waals surfaces and volume determinations.

4) A number of the balls are docked, and analysis run on the spacing between molecules to explore filtering characteristics.

To run this tour, enter the term mmv30, respond to the prompt for a script file with bucky_balls.log and reply with n to the batch process question. Instructions are given below.

$ mmv30

  bucky_balls.log

  n

The demo will finish with a prompt at the top of the screen asking if you really want to stop the program or not. This occurs when the image of the four 60-member bucky balls is on the screen and the size of the hole between these molecules is shown. Record those distances below.

x distance: ___________________________ y distance: ____________________________

Now get out of the program. Respond with y to the Confirm Program Stop (Y/N): prompt. The screen goes white. Select Emulation from the Versa Term Pro control bar and select DEC VT220 from the menu presented. The screen changes color from white to blue and shows the question init, Delete the current log file (Y/N):. Respond with y and press the RETURN key. The y you try to enter turns into ù and there is a message of junk on the line below the prompt. Select Emulation from the Versa Term Pro control bar again, this time selecting Reset Terminal from the menu options presented. The type on the blue screen is now readable and you can continue on with your computing tasks.

To start the exiting process and get back to the Launcher window screen, enter log at the prompt and press the RETURN key.

$ log

After model1 displays information about the session you have just run, move the arrow up to the Versa Term Pro control bar again, this time to the File location. Press down on the mouse button to see the pull down menu located there. While holding down the mouse button, move the arrow cursor to the Quit line, turning that location black, and then release the mouse button. Messages will appear about saving settings for Versa Term Pro and then the screen will return to the original Launcher window screen.



Introduction to X-Windowing Resources

The tour moves on to the local biocomputing x-windowing resources. These resources are actually housed on ribozyme, but require a special means of accessing the platform. Look closely at the initial screen. Among the icons in the Launcher window is X RIBOZYME. Move the arrow to this icon and double click the mouse button.

You will need to log into ribozyme again using the same account name and password you used earlier in the tour. When you are connected to ribozyme, an xterm window appears on the screen containing the ribozyme welcoming message and a % prompt. This means of accessing ribozyme is very sensitive to network load levels. The more traffic on the nets, the slower your xterm session will be.

In visualizing molecules, some pieces of software produce PostScript files. One way of looking at these files without printing them is to use the GhostScript program. This software has been installed on ribozyme, but requires an x-windowing environment to run. The tour continues with the viewing of two such PostScript files.

At the machine prompt enter the command given below. The screen shows an initializing... comment and then a GhostScript window appears on the left-hand side of the screen. This window draws an image of two DNA helical segments, a normal one and one with a carcinogen intercalated in it. This image was generated by the Molscript program. Depending on network traffic it may take some time for this image to be completed. Record below your observations on the nature of the distortion caused by the carcinogen in the DNA helical structure.

% gs mx.ps

DNA observations: ________________________________________________________

__________________________________________________________________________

__________________________________________________________________________

__________________________________________________________________________

When the image is completed, move the cursor into the window behind the white GhostScript window and press the mouse button. The xterm window is now in front of the other one. Press the RETURN key. The GS> prompt appears on the screen. Get out of the program by entering quit.

To see the second image, enter the command given below. This time the image is the protein lysozyme. The secondary structural elements have been colored. The helices are red and the sheets green. When the image is completed and you are finished looking at it, record your observations on the next page and repeat the instructions given above to get out of the program and back to the machine prompt.

% gs 6lyz.ps

Lysozyme observations: ____________________________________________________

___________________________________________________________________________

___________________________________________________________________________

___________________________________________________________________________

___________________________________________________________________________

Just as getting into this session required special commands, so does exiting it. Use the mouse to move the arrow to the File location of the control bar. This is a pull-down menu. From this listing select the QUIT option to exit the session and release the mouse button. This moves you back to the Launcher window screen.



Finishing up

With your tours completed, fill out the report form on the following pages and turn it in to your lab instructor before the end of your lab period. Use the data collected during the running of the tours to complete the form.


Internet resources used:

VADMS Center Web site:

   mosaic users:     http://ribozyme.vadms.wsu.edu/~vadms
   netscape users:   http://www.vadms.wsu.edu

Molecules R US site:

   http://molbio.info.nih.gov/cgi-bin/pdb

Entrez:

   While you used ENTREZ over the Internet, it requires that special software
   exist on your local machine in order for it to work. Your local machine also
   has to have an IP address (be on the campus backbone or some other network
   access to the Internet) in order to use the software. ENTREZ is available 
   for DOS, Mac and UNIX platforms.

578 Week 1 Lab Report

Name:  _____________________________________________________________________

Lab day: ___________________________________________________________________


1) What information did you record from your searching of the VADMS web site? 

   a) how long has the VADMS Center been in existence: _______________________________

   b) which island hosted the '97 Pacific Symposium on Biocomputing:____________

   c) what English college planned on offering a Structure-Based Drug Design course 
      on the nets in the summer of 96: _______________________________________________

2) What information did you record when looking at the GCG programs in the seqtool 
   interface?

   a) how many programs are listed in the Protein Sequence Analysis section: ________

   b) how many of these programs required graphics set up: __________________________


3) While exploring the rest of the seqtool interface what answers did you find to these questions?

   a) how many usable net analysis tools are on-line at the moment: _________________

   b) how many linkage tools are available: _________________________________________


4) What was the size of the opening between the bucky balls in the modelling demo?

   x distance :_____________________         y distance : ______________________


5) What did you think of the x windowing images?

   a) DNA observations : ____________________________________________________

      _______________________________________________________________________

      _______________________________________________________________________

      _______________________________________________________________________


   b) Lysozyme observations : _______________________________________________

      _______________________________________________________________________

      _______________________________________________________________________

      _______________________________________________________________________