A Users' Guide to the Molecular Modeling Core Facility.

Mihaly Mezei, Ph.D.

Mount Sinai School of Medicine, NYU.
Department of Structural and Chemical Biology

IMI (East Bldg) 15-23A
(212) 659-5475 (Ext. 85475)
E-mail address: Mihaly.Mezei@mssm.edu

UNDER REVISION

You are invited to send comments, suggestions and corrections.

Jan. 23, 2008.

1. INTRODUCTION

The Molecular Modeling Core facility is based on a variety of computers, some of them oriented toward graphics, others toward number crunching. Different uses of the facility require different levels of expertise. The present document collects information on the hardware and the operating systems that is specific to the facility to provide the knowledge necessary to operate in our environment: run, compile and write programs. To be able to do these, one has to be cognizant of

A Survival Guide is also available providing the general survival skills necessary to operate in a Unix/Linux environment. The fundamentals of editing files, managing and navigating around files, directories and file systems, use of the network to access other computers, and moving files between computers are briefly described.

The software library is described in a separate document.

These guides are intended to be updated frequently as the hardware configuration changes or new software is added. Any comment, suggestion or correction would be greatly appreciated and will be incorporated expeditiously - contact Dr. Mezei by e-mail, telephone (X 85475) or in person (IMI 15-23A).

2. COMPUTERS

The computers, also called hosts when on a network, belonging to the facility or accessible to users of the facility are listed in the table below, with their principal characteristics.

  Servers and special purpose computers

Hostname                    Hardware     OS         # of CPU's
fulcrum.physbio.mssm.edu    SGI R10000   Irix 6.5     2
atlas.physbio.mssm.edu      Dell         Linux        2
lucy.physbio.mssm.edu       SGI R12000   Irix 6.5     8
pepi.physbio.mssm.edu       SGI R12000   Irix 6.5     8
bon.physbio.mssm.edu        SGI R12000   Irix 6.5     8
faradis.physbio.mssm.edu    Apple G5     OSX        150
compass.physbio.mssm.edu    Dell         Linux        4
magnet.physbio.mssm.edu     Dell         Linux        8

Fulcrum is the file server for all the SGI's and atlas is the web server for the Department of Structural and Chemical Biology. Unless otherwise stated, logging in to any of the SGI systems will place the user into the same home directory located on fulcrum (see the Survival Guide for details).

Fulcrum also acts as the 'password server'. This means that your password is the same on all machines served by fulcrum. Note that it is important to choose a nontrivial password (i.e., NOT your username). In particular, it should not be a dictionary word.

Lucy, pepi and bon are shared-memory multiprocessor machines, mostly for number crunching. Faradis is a cluster serving the whole school. Compass and Magnet are departmental servers.

Special care must be exercised for compatibility of binary data generated on Linux systems with data generated on the SGI's - see Sec 3.4 or contact Dr. Mezei for details.

Calculations requiring more power can be performed at one of the national supercomputer centers, collectively known as the Teragrid. Access to the Teragrid requires a small application, emphasizing the computational needs of the project.

3. BASIC OPERATIONS.

3.1. Obtaining accounts.

To use any of the computers in the list above, it is necessary to obtain an account first. Contact the Structural and Chemical Biology system manager, Mr. Kevin Kelliher, at Ext. 40493 (E-mail address: kevin.kelliher@mssm.edu).

3.2. Printing, viewing and converting files.

We have a locally developed script called qprint, written by Ben Goldsteen, that can print files in a wide variety of formats. qprint detects the file format and prints the file on the requested output device. It recognizes text (i.e., ASCII) files (the simplest and most transferable format), PostScript files (.ps), Adobe Acrobat (.pdf) files, and various graphics files: .rgb, .tiff, .jpeg and possibly .gif.

The format of the command is

qprint -to <printer name> [<format>] <filename>

where <format> can be

Other printers have their names on them. By default, images are printed at 300 dpi. It is usually desirable to print images captured from the screen at 150 dpi, using the -dpi option:

qprint -to claser -dpi 150 density.rgb

We also have the public-domain program a2ps that can print a text formatted in a variety of ways (multi column, headers, line numbers, etc.). Type man a2ps to obtain a list of options. For example, a2ps myfile | qprint -to laser will give a nice 2-column printout, in landscape orientation, with labels specifying the origin of the file and page numbers and context-sensitive font choice.

PostScript (.ps) files can be viewed with the public-domain program ghostview (running on an X-terminal) or with SGI's DesktopTool xpsview. The command acroread opens the Adobe Acrobat viewer on the SGI graphics workstations, which lets you view .pdf files.

Regular PostScript files can be converted to Encapsulated PostScript with the ps2epsi command. Acrobat Distiller (installed on the PC in Rm 21-77A) can convert a PostScript file into PDF format. Photoshop provides the option of saving in a variety of formats, effectively acting as a converter. Also, the SGI's have a number of conversion commands - type apropos convert to get a list.

3.3. Running number-crunching applications.

Calculations that require more than a few minutes should be run in the background. Furthermore, if the machine has batch queues set up, users should send their jobs (a script file with the commands to be executed) there, and the operating system will execute them in turn (a few at a time).

To run a job in the background, append & to the execution command, e.g., calculate < input.data >& output.data &. This command executes the program calculate using the file input.data as the standard input, writes both the results (>) and the system error messages (&) into the file output.data, and runs in the background. This has the additional advantage that when you log off, the job will continue to run.

To submit a script file <fn> to the queue named <que>, issue the command qsub -q <que> <fn>. The script file's permissions have to include x (execute), otherwise the submission will fail - type chmod a+x <fn> to add it. To check on the status of your job(s), issue the command qstat [<que>] (the <que> is optional). To delete a job from a queue, issue the command qdel -k <jobid>.
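The background run described above can be tried out with the following minimal sketch. It is written for Bourne-type shells (sh, bash), where the csh redirection >& is spelled > file 2>&1. The program calculate created here is a hypothetical stand-in that merely sums the numbers it reads; replace it with your own executable.

```shell
#!/bin/sh
# Create a stand-in for a long calculation (hypothetical; use your real program).
cat > calculate <<'EOF'
#!/bin/sh
# Reads one number per line from standard input, prints the sum to
# standard output and a completion note to standard error.
sum=0
while read -r x; do
    sum=$((sum + x))
done
echo "sum = $sum"
echo "calculate: finished" >&2
EOF
chmod a+x calculate

# Sample input for the stand-in program.
printf '1\n2\n3\n' > input.data

# Bourne-shell spelling of:  calculate < input.data >& output.data &
./calculate < input.data > output.data 2>&1 &
pid=$!
echo "started background job with process id $pid"

# Waiting is only needed here so that the demonstration can show the result.
wait "$pid"
cat output.data
```

In Bourne-type shells, prefixing the command with nohup (nohup ./calculate < input.data > output.data 2>&1 &) is the usual way to make sure the job is not stopped by the hangup signal sent at logout.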
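As an illustration of the submission steps, the sketch below writes a small job script, makes it executable and submits it only if a qsub command is present on the machine. The script name myjob.sh, the queue name myqueue and the program name calculate are all hypothetical placeholders; ask the system manager for the actual queue names.

```shell
#!/bin/sh
# Write the job script: the commands the queue will execute on your behalf.
# File and program names are placeholders. Batch jobs may not start in the
# directory you submitted from, so absolute paths are safer in real scripts.
cat > myjob.sh <<'EOF'
#!/bin/sh
./calculate < input.data > output.data 2>&1
EOF

# The script must be executable, otherwise the submission fails.
chmod a+x myjob.sh

# QUEUE is a hypothetical queue name; override it in the environment.
QUEUE=${QUEUE:-myqueue}

if command -v qsub >/dev/null 2>&1; then
    # Submit the job, then list the jobs currently known to the queue system.
    qsub -q "$QUEUE" myjob.sh
    qstat
else
    # No batch system on this machine: only show what would be run.
    echo "qsub not found; would run: qsub -q $QUEUE myjob.sh"
fi
```

Keeping the commands in a script file also documents exactly how a calculation was run, which helps when a job has to be repeated.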

3.4. Writing programs

The writing of new programs (or the modification of existing ones) is governed by the rules of the programming language chosen and is not described here. The corresponding compiler has to be invoked to produce executable code.

The form of the instruction to compile a Fortran program is

<Compile command> [Options] -o <Executable file> <Source file>

Note that while the source code is frequently transferable from machine to machine, the executable is NOT. Major programming languages are reasonably well standardized, but the compilers may have different names and options on different operating systems. Most compilers are able to optimize at various levels. For debugging purposes, there is the index-check option which, when set, causes the program to check at run time whether array and string indices are within bounds. It adds to the execution time, but it is a very important debugging tool. Most of the compilers also have additional debugging facilities (e.g., dbx) that allow you to probe the status of your variables during the run (consult the manual!).

The table below gives the minimum information necessary to compile a Fortran program.

Host    Compile  Optimize     Index check  Swap byte order  Debugger  use cmd
SGI's   f77      -O2 or -O3   -C           n/a              -g        n/a
Linux   g77      -O4 or -O5   -C           -byteswapio      -g        use pgi
OSX     xlf      -O4 or -O5   -C           -byteswapio      -g        use pgi

For example, g77 -O4 -o mmc.bin mmc.f will create the executable mmc.bin from the source code mmc.f, compiled at optimization level 4. The -byteswapio compilation option will create binary data files compatible with the SGI's.
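The reason binary files need byte swapping can be seen in the small sketch below. It assumes that the Linux hosts use Intel-compatible (little-endian) processors, whereas the SGI MIPS processors under Irix are big-endian. The file names are hypothetical, and the example uses only the standard printf and od utilities, not the compilers discussed above.

```shell
#!/bin/sh
# The 4-byte integer 1, most significant byte first (big-endian, as on the SGI's).
printf '\000\000\000\001' > big_endian.bin

# The same integer, least significant byte first (little-endian, as on Intel).
printf '\001\000\000\000' > little_endian.bin

# Dump both files byte by byte: same value, opposite byte order.
echo "big-endian bytes:"
od -An -tx1 big_endian.bin
echo "little-endian bytes:"
od -An -tx1 little_endian.bin

# Find out the byte order of the machine this script runs on:
# the bytes 01 00 read as a 2-byte integer give 1 on a little-endian host.
host_value=$(printf '\001\000' | od -An -td2 | tr -d ' \n')
if [ "$host_value" = "1" ]; then
    echo "this host is little-endian"
else
    echo "this host is big-endian"
fi
```

A program that reads the four bytes in the wrong order would interpret the integer 1 as 16777216, which is why unformatted data must be written or read with byte swapping when it is moved between the two kinds of machines.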

3.4.1. Using the debugger dbx on the SGI's

Once an execution terminates with a core dump, a debugger can be used to examine the status of the program. On the SGI's the command dbx <executable name> starts the debugger, yielding a dbx> prompt. On the alpha's you have to type dbx <executable name> core. Typing l results either in a few lines of code being printed, with the line where the program stopped marked, or in the message Source not available. In the latter case, keep typing the command up until it prints the line where the program aborted. If this place is inside a subroutine, then further up commands will give you the listing of the place it was called from.

Additionally, if the program was compiled with the -g compilation option, then you can query your variables: the command p <variable name> will print the value of that variable at the time the program stopped. Note that the -g switch prevents optimization, thus it should only be used in anticipation of a core dump.

On the SGI's the default is to continue after a floating-point exception occurs. To abort on a floating-point exception on the SGI's, you have to end the compilation command with -trapuv -lfpe and set the environment variable TRAP_FPE. The command setenv TRAP_FPE "ALL=ABORT,trace(1)" will abort the program even when an underflow occurs. More selective control can be exercised, however, as in setenv TRAP_FPE "UNDERFL=ZERO;INT_OVERFL=ZERO;OVERFL=ABORT;INVALID=ABORT;DIVZERO=ABORT, trace(1)"

Note also that the -C compilation option will abort the program (on the SGI's without producing a core dump) when an index boundary is exceeded. Exceeding an index boundary is quite often the prelude to floating-point exceptions.

3.4.2. Using the debugger gdb on Linux

3.5. Running parallel programs

Running a program in parallel requires the program to use a communication library. The most popular such library is MPI.

3.5.1. Running parallel programs on the SGI's

To execute a parallel job using the MPI communication library on the SGI's, simply type

mpirun -np <number of CPU's> <run command>

where the <run command> is what you would type for running a single CPU job (e.g., myprog < job.inp >& job.out &). On the alphas, use dmpirun instead of mpirun.

Note, however, that there is a special script executing Charmm.

3.5.2. Running parallel programs under the Sun Grid Engine

To execute a parallel job using MPI on the SGI's, the job has to be sent to the queue with the qsub command.

4. USAGE POLICIES

4.1. Disk space policies

Since the total disk space is limited on each machine, fair usage requires limits on the disk space each user may occupy. These limits are different for the home directories (which are regularly backed up by the system) and for the scratch directories. The command quota -v will tell you your current allocation (when set) and usage.
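As a rough illustration of keeping disk usage in check, the sketch below measures the space taken by a file and by a directory tree, and compresses a file that is not needed immediately. It uses du and gzip, which are standard Unix tools not otherwise covered in this guide; sample.dat is a hypothetical stand-in file created only for the demonstration.

```shell
#!/bin/sh
# Create a highly compressible stand-in data file (hypothetical; use your own files).
i=0
: > sample.dat
while [ "$i" -lt 2000 ]; do
    echo "0.000000 0.000000 0.000000" >> sample.dat
    i=$((i + 1))
done

# Size before compression, in kilobytes.
du -k sample.dat

# Compress in place: gzip replaces sample.dat with sample.dat.gz.
# The -f flag overwrites an existing sample.dat.gz from an earlier run.
gzip -f sample.dat
du -k sample.dat.gz

# Total usage of the current directory tree, in kilobytes.
du -sk .

# To restore the file when it is needed again:
#   gunzip sample.dat.gz
```

Running du -sk on your home and scratch directories from time to time gives an early warning before a quota or a full disk interrupts a running job.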

4.2. Job submission policies

4.3. General code of conduct.

As a matter of principle, the Molecular Modeling Core would like to establish as few rules as possible. Achieving this goal requires responsible behavior on everybody's part.

Responsible users

1. familiarize themselves with the capacity of the various facilities and generally refrain from 'monopolizing' their resources;

2. exit from programs with a limited number of licenses (e.g., insight, matlab) when not using them;

3. back up files that are not likely to be used soon (the larger the files, the shorter the time period represented by this 'soon'), compress files that are not needed immediately and heed periodic backup requests from the system managers that may appear among the login messages;

4. scrupulously observe the policies related to both disk space usage and job submission as spelled out above in sections 4.1. and 4.2.

Lack of responsible behavior (as defined above) is likely to result in grumbles of increasing intensity with each occurrence, mostly on the part of the system manager, but you may hear from your affected colleagues as well.

Adherence to the disk use policies (as described above in Sec. 4.1.) is of prime importance, since consuming an excessive amount of disk space may lead (and in fact, several times has led) to filling up disks and corrupting jobs running at that time.
