C     PROTEIN DATA BANK SOURCE CODE         REFMTE
C     AUTHOR.        L.RELLICK,J.DUANE
C     ENTRY DATE.    1/84                   UNSUPPORTED
C     LAST REVISION. 1/84
C     PURPOSE.       PREPROCESSOR FOR MOLECULAR
C     PURPOSE.       MODELLING AND FINITE ELEMENT
C     PURPOSE.       ANALYSIS
C     LANGUAGE.      FORTRAN 77,VAX 11/780
C     NOTE.          USES SUPERTAB (NOT INCLUDED)
C     NOTE.          OPTIONALLY USES SUPERB (NOT INCLUDED)
C
C     BEGIN DOCUMENTATION BLOCK
C
C     AUTHORS   LORRAINE M. RELLICK
C                 BIOPHYSICS PROGRAM, OHIO STATE UNIVERSITY
C               JOSANN W. DUANE
C                 ENGINEERING GRAPHICS DEPARTMENT,
C                 OHIO STATE UNIVERSITY
C
C     SYSTEM    VAX 11/780 WITH VMS OPERATING SYSTEM
C
C     PROGRAM   PREPROCESSOR FOR MOLECULAR MODELING AND FINITE ELEMENT
C               ANALYSIS
C
C     ABSTRACT
C          PROGRAM TO REFORMAT CRYSTALLOGRAPHIC DATA OF PROTEINS
C          AND NUCLEIC ACIDS. USES DATA STORED ON THE PROTEIN DATA
C          BANK TAPES (BROOKHAVEN NATIONAL LABORATORY). PREPARES
C          DATA FOR MOLECULAR MODELING AND FINITE ELEMENT ANALYSIS
C          USING THE PROGRAMS SUPERTAB AND SUPERB (STRUCTURAL DYNAMICS
C          CORPORATION, CINCINNATI OHIO).
C
C     COMPUTING SYSTEM REQUIREMENTS
C
C          A.) HARDWARE
C
C               COMPUTING DEVICE COMPARABLE TO VAX 11/780
C
C               ONE OF THE FOLLOWING GRAPHICS TERMINALS;
C
C                    TEKTRONIX 4014,4012,COLOR 2027
C                    DEC VS11
C                    IMLAC 610
C
C          B.) SOFTWARE
C               THIS PROGRAM IS NOT OPERATING SYSTEM DEPENDENT
C               THE FOLLOWING SOFTWARE IS NEEDED;
C                    1) SUPERTAB NEEDED FOR MOLECULAR MODELING
C                    2) SUPERB ALSO NEEDED IF FINITE ELEMENT STUDIES
C                       ARE GOING TO BE DONE ON MOLECULE
C
C     BACKGROUND
C          BECAUSE OF THE SOCIAL CONSEQUENCES OF DESIGN FAILURE
C          IN ENGINEERING, RESOURCES HAVE BEEN MADE AVAILABLE TO
C          DEVELOP GRAPHICS SUPPORT FOR ENGINEERING DESIGN AND
C          ANALYSIS WHICH GOES BEYOND THAT WHICH IS AVAILABLE IN
C          THE BIOLOGICAL SCIENCES. TWO GRAPHICS APPLICATIONS
C          PROGRAMS FOR ENGINEERING DESIGN AND ANALYSIS, PADL AND
C          SUPERTAB, ARE BEING USED IN THE INTERACTIVE GRAPHICS
C          LABORATORY AND IN THE ADVANCED DESIGN METHODS LABORATORY
C          AT THE OHIO STATE UNIVERSITY.
C          PADL IS BEING DEVELOPED AT
C          THE UNIVERSITY OF ROCHESTER AND IS AVAILABLE TO
C          EDUCATIONAL INSTITUTIONS FOR $600.00. SUPERTAB IS A
C          COMMERCIAL CODE WHICH IS AVAILABLE TO UNIVERSITIES
C          THROUGH GRANTS FROM ITS DEVELOPER, STRUCTURAL DYNAMICS
C          CORPORATION, CINCINNATI OHIO.
C
C          BOTH PADL AND SUPERTAB ALLOW THE NOVICE COMPUTER USER TO
C          BRING GRAPHICS SUPPORT TO DESIGN AND ANALYSIS OTHERWISE
C          ONLY AVAILABLE TO THOSE SKILLED IN GRAPHICS PROGRAMMING.
C          BY ADAPTING THE TWO PROGRAMS FOR USE IN THE BIOLOGICAL
C          SCIENCES, THIS TECHNOLOGY CAN BE TRANSFERRED AT A FRACTION
C          OF THE ORIGINAL DEVELOPMENT COST.
C
C     PURPOSE
C          WE HAVE WRITTEN A DATA FORMATTER WHICH REFORMATS AND CODES
C          DATA STORED IN THE PROTEIN DATA BANK SO THAT BIOLOGICAL
C          STRUCTURES CAN BE DISPLAYED USING SUPERTAB. WE ARE IN THE
C          PROCESS OF DEVELOPING SIMILAR SOFTWARE TO ADAPT PADL FOR USE
C          IN MOLECULAR MODELING AND TO EXTEND THE CAPABILITIES OF OUR
C          DATA FORMATTER FOR SUPERTAB FOR USE IN FINITE ELEMENT
C          ANALYSIS OF BIOLOGICAL STRUCTURES USING THE ANALYSIS PROGRAM
C          SUPERB (STRUCTURAL DYNAMICS CORPORATION).
C
C     DISCUSSION OF MODELING CAPABILITIES OF SUPERTAB
C
C          SUPERTAB IS A PROGRAM USED BY ENGINEERS TO PREPARE BULK
C          DATA FOR ANALYSIS USING A FINITE ELEMENT PROGRAM. HOWEVER,
C          IT CAN BE USED AS A GENERAL PURPOSE GRAPHICS DISPLAY
C          PROGRAM. SUPERTAB HAS MANY FEATURES. IT ALLOWS THE USER TO
C          DISPLAY THE MODEL FROM ANY ORIENTATION. IT ALLOWS PANNING
C          OF THE MODEL AND ZOOMING IN ON SPECIFIC REGIONS. IT ALLOWS
C          THE USER TO VIEW THE STRUCTURE WITH PERSPECTIVE. A DETAILED
C          DESCRIPTION OF THE APPLICATION OF THESE FEATURES TO
C          MOLECULAR MODELS FOLLOWS.
C
C     1) DISPLAY AND CODING OF STICK MODEL
C          SUPERTAB PRODUCES A BALL AND STICK REPRESENTATION OF THE
C          MOLECULAR STRUCTURE. THIS MODEL CAN BE VIEWED WITH OR
C          WITHOUT THE ATOMS OR BONDS NUMBERED. WHEN SMALL SECTIONS
C          OF THE MODEL ARE BEING VIEWED THESE NUMERIC LABELS ASSIST
C          IN IDENTIFYING THE PARTICULAR RESIDUES INVOLVED.
C          THE ATOMIC NUMBERING IS SEQUENTIAL, BEGINNING WITH THE
C          N-TERMINUS OF THE PROTEIN, OR WITH THE 5-PRIME END OF THE
C          NUCLEIC ACID. FOR PROTEINS, THE ATOMS IN THE N-TH RESIDUE
C          FROM THE N-TERMINUS WILL BE NUMBERED IN THE RANGE FROM
C          (N-1)*20+1 TO N*20 (IN OTHER WORDS, THE RESIDUES ARE ALLOWED
C          TWENTY ATOMS EACH; THE RESIDUES ARE NUMBERED IN INCREMENTS
C          OF TWENTY). ATOMS OCCURRING IN THE N-TH NUCLEOTIDE FROM THE
C          STARTING TERMINUS OF A NUCLEIC ACID WILL BE NUMBERED IN THE
C          RANGE FROM (N-1)*30+1 TO N*30 (I.E., NUCLEOTIDE RESIDUES
C          ARE NUMBERED IN INCREMENTS OF THIRTY).
C
C
C     2) SELECTIVE DISPLAY
C          MANY MODES OF SELECTIVE DISPLAY ARE AVAILABLE USING
C          SUPERTAB.
C
C          A) IN PROTEINS THE BACKBONE SEGMENTS OF ALL RESIDUES
C             IN THE STRUCTURE CAN BE VIEWED IN THE ABSENCE OF
C             THE R-GROUPS. SIMILARLY, IN NUCLEIC ACIDS, THE
C             SUGAR BACKBONE CAN BE VIEWED IN THE ABSENCE OF
C             THE BASE COMPONENTS.
C
C          B) IN PROTEINS, THE R-GROUPS CAN BE VIEWED INDEPENDENTLY
C             OF EACH OTHER AND OF THE BACKBONE SEGMENTS OF THE
C             RESIDUES. IN NUCLEIC ACIDS, THE DIFFERENT BASES CAN
C             BE VIEWED INDEPENDENTLY OF EACH OTHER AND OF THE SUGAR
C             SEGMENTS.
C
C          C) FOR PROTEINS THE NUMERIC LABELS ON THE RESIDUES ARE
C             DEFINED SO AS TO ALLOW THE USER TO EASILY DISPLAY ONLY
C             THOSE RESIDUES WHICH ARE EITHER HYDROPHOBIC, UNCHARGED
C             POLAR, OR POSITIVELY OR NEGATIVELY CHARGED AT PH 7.0.
C             THESE GROUPS OF ATOMS CAN BE DISPLAYED IN THE PRESENCE
C             OR ABSENCE OF THE BACKBONE.
C
C          D) ALPHA AND BETA CHAINS OF NUCLEIC ACIDS CAN BE VIEWED
C             INDEPENDENTLY OF EACH OTHER.
C
C
C     3) SECTIONING
C
C          THE SECTIONING FEATURE OF SUPERTAB ALLOWS THE USER TO VIEW
C          SPECIFIC SPATIAL REGIONS OF THE STRUCTURE. THE USER CAN
C          VIEW A 'SLICE' OF THE MOLECULE. THE POSITION AND THICKNESS
C          OF THIS SLICE ARE DEFINED BY THE USER. SIMILARLY A USER
C          DEFINED (HEIGHT AND WIDTH) VOLUME CAN BE VIEWED.
C          THESE CAPABILITIES GIVE THE USER A BETTER CONCEPTION OF
C          THE THREE-DIMENSIONALITY OF THE MOLECULAR STRUCTURE, AND OF
C          THE SPATIAL RELATIONSHIPS THAT EXIST BETWEEN THE GROUPS ON
C          THE INTERIOR OF THE MOLECULE.
C
C
C     4) PERSPECTIVE
C
C          PERSPECTIVE ALLOWS THE USER TO REALISTICALLY VIEW THE
C          MODEL FROM A SPECIFIC POINT IN SPACE (I.E., THE PARTS OF
C          THE STRUCTURE CLOSER TO THE VIEWER APPEAR LARGER THAN
C          THOSE FARTHER AWAY). THIS IS AN INFORMATIVE WAY TO VIEW,
C          FOR EXAMPLE, THE ACTIVE SITE OR BINDING SITE OF A PROTEIN.
C          USING PERSPECTIVE, THE USER CAN APPROACH THE SITE ALONG
C          A POSSIBLE PATH FOR THE INCOMING SUBSTRATE, AND SEE WHAT
C          THE SUBSTRATE 'SEES'. USING PERSPECTIVE, THE USER CAN
C          'TRAVEL THROUGH' THE INTERIOR OF THE MOLECULE.
C
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C
C     MAIN PROGRAM
C
C     VARIABLES USED;
C
C     DSN1      1 BYTE CHARACTER VARIABLE. INPUT BY USER FROM
C               TERMINAL. ALLOWS USER TO CHANGE FILE TYPE.
C
C     DSN2      1 BYTE CHARACTER VARIABLE. INPUT BY USER FROM
C               TERMINAL. ALLOWS USER TO REFORMAT ANOTHER FILE.
C
C     DSN3      1 BYTE CHARACTER VARIABLE. INPUT BY USER FROM
C               TERMINAL. ALLOWS USER TO CHANGE USER-ENTERED VALUE
C               FOR NUMBER OF RESIDUES IN FILE.
C
C     FLNM      5 BYTE CHARACTER VARIABLE. IDENTIFYING PART
C               OF INPUT AND OUTPUT FILENAMES. INPUT BY USER
C               FROM TERMINAL.
C
C     FILEIN    10 BYTE CHARACTER VARIABLE. INPUT FILE NAME,
C               OF THE FORM FLNM.DAT
C
C     FILOUT    10 BYTE CHARACTER VARIABLE. OUTPUT FILE NAME,
C               OF THE FORM FLNM2.DAT
C
C     LIMIT     INTEGER VARIABLE. COUNTS INPUT ERRORS. THE USER IS
C               ALLOWED TEN INPUT ERRORS BEFORE PROGRAM TERMINATION.
C
C     MOLTYP    1 BYTE CHARACTER VARIABLE. INPUT BY USER FROM
C               TERMINAL. SIGNIFIES WHICH TYPE OF FILE THE USER
C               WISHES TO REFORMAT. IF EQUAL TO 'T', PROGRAM IS
C               TERMINATED.
C
C     NMAX      INTEGER VARIABLE. NUMBER OF RESIDUES IN THE
C               CRYSTALLOGRAPHIC DATA FILE.
C
C     PAUSE     1 BYTE CHARACTER VARIABLE. ALLOWS USER TO PAUSE
C               TO READ INTRODUCTION TO PROGRAM.
C
C     CALLS TO;
C               MAJOR1
C               MAJOR2
C
C
C
C     ORDER OF PROGRAM SEGMENTS;
C
C               1. MAJOR1           9. MAJOR2
C               2. BKBONE          10. NUM
C               3. RCHAIN          11. REORDR
C               4. ELMNT1          12. ELMNT2
C               5. SWAP1           13. RD2
C               6. SWAP2
C               7. SWAP3
C               8. RD1
C
C     I/O SPECS
C
C     INPUT;    INFORMATION ABOUT FILE (NAME, TYPE, NUMBER OF
C               RESIDUES) ENTERED BY USER FROM TERMINAL.
C
C     OUTPUT;   REFORMATTED DATA FILE, READY TO BE INPUT TO SUPERTAB.
C
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C
C     INITIALIZE PROPERTY REFERENCE VALUES OF BONDS
C
      INTEGER*2 NPRALA(5)/1,2,3,4,5/,NPRARG(11)/1,2,3,4,5,6,6,7,8,9,9/,
     *NPRASN(8)/1,2,3,4,5,13,14,14/,NPRASP(8)/1,2,3,4,5,13,14,14/,
     *NPRCYS(6)/1,2,3,4,5,15/,NPRGLN(9)/1,2,3,4,5,6,10,11,12/,
     *NPRGLU(9)/1,2,3,4,5,6,13,14,14/,NPRGLY(4)/1,2,3,4/,NPRHIS(11)
     */1,2,3,4,5,16,17,18,19,20,21/,NPRILE(8)/1,2,3,4,22,23,24,25/,
     *NPRLEU(8)/1,2,3,4,5,22,25,25/,NPRLYS(9)/1,2,3,4,5,6,6,6,26/,
     *NPRMET(8)/1,2,3,4,5,6,27,28/
      INTEGER*2 NPRPHE(12)/1,2,3,4,5,29,30,30,30,30,30,30/,NPRSER(6)
     */1,2,3,4,31,32/,NPRTHR(7)/1,2,3,4,33,34,35/,NPRTRP(16)/1,2,3,4,5,
     *29,36,37,38,39,40,40,40,39,41,42/,NPRTYR(13)/1,2,3,4,5,29,43,43,
     *44,44,43,43,45/,NPRVAL(7)/1,2,3,4,22,25,25/,NPRPRO(8)/46,47,3,4,
     *48,49,50,51/,NPRHYP(9)/46,47,3,4,48,52,53,51,54/,NPRASX(8)/1,2,3,
     *4,5,55,56,56/,NPRGLX(9)/1,2,3,4,5,6,55,56,56/
      INTEGER*2 NPRRB2(11)/1,2,3,4,5,5,6,7,8,7,1/,NPRRB1(12)/1,2,3,4,
     *5,5,6,10,8,8,1,7/,NPRA(11)/10,10,11,12,13,14,14,11,12,15,11/,
     *NPRG(12)/10,10,11,12,21,14,14,11,12,16,11,17/,NPRC(8)/10,16,17,
     *17,18,19,20,15/,NPRT(9)/10,18,19,21,18,19,21,16,22/,NPRU(8)/10,
     *16,17,17,18,15,20,15/
C
      INTEGER*2 NPR(2500),NTR(2500),NN1(2500)/2500*0/,NN2(2500),M(2500),
     *NANN1(2500)/2500*0/,NANN2(2500),REF,FLAG,TYPE,TYPREF,NDES/0/,
     *NDIS/0/,NC/8/,NTC/1/,NMR/1/,NE/2/,K/1/
C
      INTEGER NMAX
C
      CHARACTER CODE*3,ATOM*3,NUC,CHAIN,MOLTYP,DSN1,DSN2,DSN3,
     *FMT1*18/'(T18,A3,T31,3F8.3)'/,FMT2*15/'(4I10,1P3E13.5)'/,
     *FMT3*6/'(7I10)'/,FMT4*32/'(T14,A3,T20,A1,T22,A1,T31,3F8.3)'/,
     *FMT7*7/'(4X,I2)'/,FLNM*5,FILEIN*10,FILOUT*10,OLDCD*3/'OOO'/,
     *PAUSE
C
      LOGICAL*1 SAMRES/.FALSE./,NOREAD/.FALSE./
C
      COMMON /PROTN/LIMN,TYPE,TYPREF/BOTH/J,NPR,NTR,NN1,NN2,M,NMR,
     *NC,NE,NTC,NDES,NDIS/NUCACD/I,REF,NN,N,K/FORMAT/FMT1,FMT4,
     *FMT2,FMT3,FMT7/NPRS/NPRALA,NPRARG,NPRASN,NPRASP,NPRCYS,
     *NPRGLN,NPRGLU,NPRGLY,NPRHIS,NPRILE,NPRLEU,NPRLYS,NPRMET,
     *NPRPHE,NPRSER,NPRTHR,NPRTRP,NPRTYR,NPRVAL,NPRPRO,NPRHYP,
     *NPRASX,NPRGLX/PCHAR/CODE,OLDCD/NCHAR/NUC,CHAIN/NPRNUC/NPRRB1,
     *NPRRB2,NPRA,NPRG,NPRC,NPRT,NPRU/LOGVAR/SAMRES,NOREAD
C
C
C
C     INTRODUCE PROGRAM TO USER
C
      WRITE(5,171)
171   FORMAT('1',2(36('**'),/,1X),'****',64(' '),'****',/,1X,
     *'**** SUPERTAB PREPROCESSOR FOR ',
     *'MOLECULAR MODELING',18(' '),'****',/,1X,'****',64(' '),'****',/,
     *1X,'****',64(' '),'****',/,1X,'****',12(' '),
     *'PROGRAM TO REFORMAT CRYSTALLOGRAPHIC DATA',11(' '),'****',/,
     *1X,'****',12(' '),'FOR DISPLAY OF STRUCTURE AND TO PREPARE',
     *13(' '),'****',/,1X,'****',12(' '),
     *'MODEL FOR STRUCTURAL ANALYSIS USING SUPERTAB',8(' '),'****',/,
     *1X,'****',12(' '),
     *'(STRUCTURAL DYNAMICS RESEARCH CORPORATION)',10(' '),'****',/,
     *1X,'****',64(' '),'****',/,1X,'****',3(' '),'PROGRAMMED BY',
     *48(' '),'****',/,1X,'****',12(' '),
     *'LORRAINE RELLICK AND JOSANN W. DUANE OF THE',9(' '),'****',/,
     *1X,'****',8(' '),
     *'BIOPHYSICS PROGRAM AND THE DEPARTMENT OF ENGINEERING',4(' '),
     *'****',/,1X,'****',8(' '),
     *'GRAPHICS OF THE OHIO STATE UNIVERSITY, COLUMBUS OHIO',4(' '),
     *'****',/,1X,'****',64(' '),'****',/,1X,2(36('**'),/,1X),///)
C
C
C     PAUSE TO GIVE READER A CHANCE TO READ INTRO
C
      WRITE(5,170)
170   FORMAT(' TO CONTINUE, TYPE IN C')
      READ(5,1)PAUSE
C
C     INFORM USER WHAT INFORMATION MUST BE INPUT FROM
C     THE TERMINAL
C
      WRITE(5,21)
21    FORMAT('1',10X,'TO USE THIS PREPROCESSOR YOU MUST INPUT',/,
     *10X,'THE FOLLOWING INFORMATION;',//,15X,
     *' THE NAME OF THE FILE YOU WISH TO USE',/,15X,
     *' THE TYPE OF FILE (PROTEIN OR NUCLEIC ACID)',/,15X,
     *' THE NUMBER OF RESIDUES IN THE MOLECULE',
     *//,10X,'MESSAGES WILL FOLLOW TO ASSIST YOU IN SUPPLYING',/,10X,
     *'THIS INFORMATION',/,10X,'NOTE; THE OUTPUT FILE YOU CREATE',/,
     *15X,'WILL HAVE A NAME OF THE FORM FILENAME2.DAT',/,15X,
     *'WHERE FILENAME IS THE NAME OF THE INPUT FILE'///)
C
C     ASK USER TO INPUT IDENTIFYING PART OF FILE NAMES
C
7     WRITE(5,11)
11    FORMAT('0',10X,'TYPE IN THE NAME OF THE FILE YOU WANT TO',/,
     *10X,'REFORMAT. THE NAME CAN BE UP TO FIVE CHARACTERS LONG, ',/,
     *10X,'BEGINNING WITH A LETTER.')
C
C     READ FIRST FIVE CHARACTERS OF FILENAME FROM TERMINAL
C
      READ(5,12)FLNM
12    FORMAT(A5)
C
C     COMPLETE THE FILENAMES
C
      FILEIN=FLNM//'.DAT'
      FILOUT=FLNM//'2.DAT'
C
C     OPEN FILES
C
      OPEN(UNIT=2,STATUS='UNKNOWN',NAME=FILEIN)
      OPEN(UNIT=3,STATUS='NEW',NAME=FILOUT)
C
C
C     INTERACTIVE CODE TO ASK USER WHICH TYPE OF FILE WILL BE
C     REFORMATTED
C
C
C     SET LIMIT. THE USER CAN MAKE UP TO 10 INPUT ERRORS BEFORE THE
C     PROGRAM IS TERMINATED. THE COUNTER IS ZEROED EXPLICITLY RATHER
C     THAN RELYING ON THE SYSTEM TO INITIALIZE IT.
C
      LIMIT=0
131   CONTINUE
C
      WRITE(5,10)
10    FORMAT('0',15X,'SPECIFY TYPE OF FILE YOU ARE USING BY TYPING',
     */10X,'IN EITHER A "P", OR AN "N".',//20X,
     *'A "P" SIGNIFIES A PROTEIN FILE',
     *//20X,'AN "N" SIGNIFIES A NUCLEIC ACID FILE.',
     *//20X,'TO TERMINATE THE PROGRAM,ENTER "T"')
C
      READ(5,1)MOLTYP
1     FORMAT(A1)
C
C
C
C     THE USER HAS THE OPTION OF TERMINATING THE PROGRAM BY ENTERING
C     "T".
C
      IF((LIMIT.EQ.10).OR.(MOLTYP.EQ.'T'))STOP
      IF((MOLTYP.NE.'P').AND.(MOLTYP.NE.'N'))THEN
         WRITE(5,201)
201      FORMAT('1',10X,'SORRY, INPUT ERROR')
         LIMIT=LIMIT+1
         GOTO 131
      ENDIF
C
      IF(MOLTYP.EQ.'P')THEN
         WRITE(5,301)
301      FORMAT('1',10X,'YOU HAVE CHOSEN TO USE A PROTEIN FILE.')
      ELSE
         WRITE(5,401)
401      FORMAT('1',10X,'YOU HAVE CHOSEN TO USE A NUCLEIC ACID FILE.')
      ENDIF
C
      WRITE(5,501)
501   FORMAT('0',20X,'DO YOU WISH TO CHANGE YOUR DECISION?',//,20X,
     *'(TYPE "Y" FOR YES, "N" FOR NO)')
C
      READ(5,1)DSN1
C
      IF((DSN1.NE.'Y').AND.(DSN1.NE.'N'))THEN
         WRITE(5,201)
         LIMIT=LIMIT+1
         GOTO 131
      ENDIF
C
C
C     IF USER WISHES TO CHANGE FILE TYPE, GO BACK TO BEGINNING OF
C     INTERACTIVE SECTION
C
      IF(DSN1.EQ.'Y')GOTO 131
C
C
C     MOLTYP HAS BEEN READ FROM THE TERMINAL. IF MOLTYP EQUALS "P"
C     (I.E. IF THE USER IS REFORMATTING A PROTEIN FILE), CONTROL WILL
C     BE PASSED TO THE SUBROUTINE MAJOR1. IF MOLTYP EQUALS "N"
C     (I.E. IF THE USER IS REFORMATTING A NUCLEIC ACID FILE),
C     THEN CONTROL WILL BE PASSED TO THE SUBROUTINE MAJOR2.
C
C
C     BEFORE CONTROL CAN BE TRANSFERRED, THE USER MUST INPUT THE NUMBER
C     OF RESIDUES IN THE FILE TO BE REFORMATTED
C
C
541   WRITE(5,551)
551   FORMAT('1',10X,'TYPE IN THE NUMBER OF RESIDUES',/,15X,
     *'IN THE FILE TO BE REFORMATTED',//,15X,
     *'ENTER ONLY THE NUMBER OF RESIDUES ',/,15X,
     *'APPEARING IN THE CRYSTALLOGRAPHIC DATA',/,15X,
     *'(THIS MAY BE DIFFERENT THAN THE ACTUAL NUMBER).')
C
      READ(5,561)NMAX
561   FORMAT(I10)
C
      WRITE(5,571)
571   FORMAT('0',10X,'DO YOU WISH TO CHANGE THE VALUE YOU HAVE INPUT?',
     */,15X,'(Y FOR YES, N FOR NO)')
      READ(5,1)DSN3
C
C
      IF((DSN3.NE.'Y').AND.(DSN3.NE.'N'))THEN
         WRITE(5,201)
         LIMIT=LIMIT+1
         IF(LIMIT.EQ.11)STOP
         GOTO 541
      ENDIF
C
      IF(DSN3.EQ.'Y')GOTO 541
C
      WRITE(3,FMT7)-1,-1,15
C
C
      IF(MOLTYP.EQ.'N')THEN
         CALL MAJOR2(NMAX)
      ELSE
         CALL MAJOR1(NMAX)
      ENDIF
C
C
C     GIVE THE USER THE OPTION OF REFORMATTING ANOTHER FILE
C
      WRITE(5,601)
601   FORMAT('1',10X,'DO YOU WISH TO REFORMAT ANOTHER FILE?',
     *//,20X,'TYPE "Y" FOR YES, "N" FOR NO.')
C
      READ(5,1)DSN2
      IF(DSN2.EQ.'Y')GOTO 7
C
C
      STOP
      END
C
C
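C
C     ILLUSTRATIVE SKETCH (NOT PART OF THE ORIGINAL PROGRAM).
C     THE NODE NUMBERING DESCRIBED IN THE DOCUMENTATION BLOCK CAN BE
C     WRITTEN AS A SMALL FUNCTION. THE NAME NODENO AND ITS ARGUMENTS
C     ARE HYPOTHETICAL; THE PROGRAM ITSELF COMPUTES THE SAME VALUE
C     IN LINE (FOR EXAMPLE N=(J-1)*20+I IN BKBONE AND
C     JJ=(LASTK-1)*30+J IN MAJOR2).
C
C     IRES      RESIDUE NUMBER, COUNTED FROM THE STARTING TERMINUS
C     IATOM     POSITION OF THE ATOM WITHIN THE RESIDUE
C     NSLOT     NODE NUMBERS RESERVED PER RESIDUE (20 FOR PROTEINS,
C               30 FOR NUCLEIC ACIDS)
C
C     WITH THESE ASSUMPTIONS, NODENO(3,1,20) IS 41, THE FIRST NODE
C     OF THE THIRD RESIDUE OF A PROTEIN.
C
      INTEGER FUNCTION NODENO(IRES,IATOM,NSLOT)
      INTEGER IRES,IATOM,NSLOT
      NODENO=(IRES-1)*NSLOT+IATOM
      RETURN
      END
C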
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C
C     SUBROUTINE TO REFORMAT PROTEIN FILES; MAJOR1
C
C     VARIABLE LIST
C
C     J         2 BYTE INTEGER VARIABLE. LOOP PARAMETER OVER
C               RESIDUE NUMBER
C
C     L         INTEGER VARIABLE. USED AS COUNTER OVER ALL
C               BONDS IN MOLECULE
C
C     M         2 BYTE INTEGER ARRAY. ELEMENT NUMBER.
C
C     NE,NC,NTC 2 BYTE INTEGER VARIABLES. VALUES NEEDED FOR
C               STRESS ANALYSIS. NOT PRESENTLY GIVEN A VALUE.
C
C     NMR       2 BYTE INTEGER VARIABLE. MATERIAL REFERENCE
C               NUMBER. NOT PRESENTLY USED.
C
C     NMAX      INTEGER VARIABLE. TOTAL NUMBER OF RESIDUES IN
C               PRESENT DATA FILE. ENTERED BY USER FROM
C               TERMINAL
C
C     NN1,NN2   2 BYTE INTEGER ARRAYS. BOND NODES (ATOMS
C               COMPRISING A BOND)
C
C     NPR       2 BYTE INTEGER ARRAY. PROPERTY REFERENCE VALUES.
C               USED TO DIFFERENTIATE BONDS. BONDS SO CODED CAN
C               BE SPECIFICALLY BLANKED OUT USING SUPERTAB
C
C     NTR       2 BYTE INTEGER ARRAY. TYPE REFERENCE NUMBER.
C               USED TO DIFFERENTIATE RESIDUES, AND R-GROUPS FROM
C               BACKBONE. ALLOWS USER TO BLANK OUT CERTAIN
C               R-GROUPS ETC. WHEN USING SUPERTAB.
C
C     X,Y,Z     REAL ARRAYS. X,Y AND Z COORDINATES OF ATOMS.
C               READ FROM PROTEIN DATA BANK FILES
C
C
C
C
C     CALLS TO;
C
C               BKBONE
C               RCHAIN
C
C     CALLED FROM;
C
C               MAIN
C
C     I/O SPECS
C
C     INPUT;    NMAX
C
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C
C
      SUBROUTINE MAJOR1(NMAX)
C
C
      INTEGER*2 NPR(2500),NTR(2500),NN1(2500),NN2(2500),M(2500),TYPE,
     *TYPREF,NDES,NDIS,NC,NTC,NMR,NE
C
      CHARACTER FMT1*18,FMT2*15,FMT3*6,FMT7*7,FMT4*32
C
      COMMON/PROTN/LIMN,TYPE,TYPREF/BOTH/J,NPR,NTR,NN1,NN2,M,NMR,
     *NC,NE,NTC/FORMAT/FMT1,FMT4,FMT2,FMT3,FMT7
C
      REAL X(20),Y(20),Z(20)
C
C     SET-UP LOOP OVER ALL RESIDUES
C
      DO 10 J=1,NMAX
C
C
C     CONTROL IS TRANSFERRED TO SUBROUTINE TO DETERMINE PARAMETERS
C     OF BACKBONE PORTION OF RESIDUE; BKBONE
C
         CALL BKBONE(X,Y,Z)
C
C     CONTROL IS TRANSFERRED TO SUBROUTINE TO DETERMINE PARAMETERS
C     OF SIDE CHAIN PORTION OF RESIDUE; RCHAIN
C
         CALL RCHAIN(X,Y,Z)
C
10    CONTINUE
C
C     WRITE INFORMATION TO UNIVERSAL FILE
C
      WRITE(3,FMT7)-1,-1,16
C
      DO 35 L=1,NMAX*20
         IF(NN1(L).NE.0)THEN
            WRITE(3,FMT3)L,NTC,NTR(L),NPR(L),NMR,NC,NE
            WRITE(3,FMT3)NN1(L),NN2(L)
         ENDIF
C
35    CONTINUE
C
      WRITE(3,FMT7)-1
C
C
C     CONTROL IS RETURNED TO MAIN PROGRAM
C
C
      RETURN
      END
C
C
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C
C     SUBROUTINE TO DETERMINE PARAMETERS OF BACKBONE PORTION
C     OF RESIDUES; BKBONE
C
C
C     VARIABLES;
C
C     J         RESIDUE NUMBER
C
C     NMAX      TOTAL NUMBER OF RESIDUES
C
C     X,Y,Z     (ARRAYS) ATOMIC SPATIAL COORDINATES
C
C     CODE      THREE LETTER CHARACTER VARIABLE HOLDING THE
C               AMINO ACID THREE LETTER CODE.
C
C     M         (ARRAY) ELEMENT NUMBER
C
C     NTR       (ARRAY) TYPE REFERENCE NUMBER OF ELEMENT.
C               SIGNIFIES TYPE OF BOND (I.E., WHETHER IT
C               OCCURS IN THE BACKBONE PORTION OF THE
C               MOLECULE OR IN A PARTICULAR SIDE CHAIN)
C               AND STORES THESE VALUES FOR ALL ELEMENTS
C               (BONDS) IN MODEL (MOLECULE).
C
C
C     NN1       (ARRAY) FIRST NODE(ATOM) OF AN ELEMENT
C               (BOND).
C
C     NN2       (ARRAY) SECOND NODE(ATOM) OF AN ELEMENT
C               (BOND).
C
C
C     CALLS TO;
C
C               NONE
C
C     CALLED BY;
C
C               MAJOR1
C
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C
C
      SUBROUTINE BKBONE(X,Y,Z)
C
C
      INTEGER*2 NPR(2500),NTR(2500),NN1(2500),NN2(2500),M(2500),TYPE,
     *TYPREF,NMR,NC,NE,NTC,NDES,NDIS
C
      CHARACTER FMT1*18,FMT2*15,FMT3*6,CODE*3,FMT4*32,OLDCD*3
C
      COMMON /PROTN/LIMN,TYPE,TYPREF/BOTH/J,NPR,NTR,NN1,NN2,M,
     *NMR,NC,NE,NTC,NDES,NDIS/FORMAT/FMT1,FMT4,FMT2,FMT3/PCHAR/
     *CODE,OLDCD
C
      REAL X(20),Y(20),Z(20)
C
C
C     SET-UP LOOP OVER BACKBONE SEGMENT OF RESIDUES
C
C
      DO 20 I=1,6
C
         N=(J-1)*20+I
         IF(I.EQ.5)GOTO 25
         CALL RD1(X,Y,Z,I,I)
         OLDCD=CODE
         WRITE(3,FMT2)N,NDES,NDIS,NC,X(I),Y(I),Z(I)
25       CONTINUE
C
C     SEE SPECIFICATIONS FOR UNIVERSAL FILE FOR EXPLANATION OF FORMAT
C     OF OUTPUT
C
         M(N)=N
C
C     DEFINITION OF ELEMENTS FOR THE BACKBONE PART OF THE RESIDUE
C
         IF(I.EQ.6)GOTO 20
         IF(I.EQ.4)THEN
            NTR(N)=1
            NN1(N)=N-1
            NN2(N)=N+17
            IF(CODE.EQ.'GLY')GOTO 22
         ELSEIF(I.NE.5)THEN
            NTR(N)=1
            NN1(N)=N
            NN2(N)=N+1
            IF(I.EQ.3)NTR(N)=2
         ELSE
            NN1(N)=N-3
            NN2(N)=N+1
         ENDIF
C
20    CONTINUE
C
C
22    CONTINUE
      RETURN
      END
C
C
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C
C     SUBROUTINE TO DETERMINE PARAMETERS OF SIDECHAIN PORTION OF
C     RESIDUES; RCHAIN
C
C
C     VARIABLES;
C
C     LIM       NUMBER OF ATOMS IN RESIDUE. APPLIES TO
C               RESIDUES WHICH HAVE RING SIDE CHAINS
C               (EXCEPT SPECIAL CASES).
C
C     LIMN      NUMBER OF ATOMS IN RESIDUES WHICH HAVE
C               NON-RING SIDE CHAINS.
C
C     TYPE      THE TYPE OF SIDE CHAIN ON THE RESIDUE
C
C                    1 = STRAIGHT CHAIN
C                    2 = BRANCHED CHAIN
C                         21 = NORMAL
C                         22 = ILE
C                    3 = RINGS
C                         31 = PHE
C                         32 = TYR
C                         33 = HIS
C                         34 = TRP
C                    4 = SPECIAL
C                         41 = PRO
C                         42 = HYP
C
C
C     TYPREF    THE TYPE REFERENCE ASSIGNED TO THE VARIABLE
C               NTR. REFERS TO THE PARTICULAR RESIDUE TYPE
C
C     NPRALA(ETC.) (ARRAY) STORES PROPERTY REFERENCE VALUES OF
C               ELEMENTS(BONDS). THESE VALUES ARE USED TO
C               DIFFERENTIATE THE DIFFERENT TYPES OF ELEMENTS
C               (E.G., A CARBON-CARBON BOND FROM AN OXYGEN-
C               CARBON BOND).
C
C     NPR       (ARRAY) STORES THE PROPERTY REFERENCE VALUE
C               FOR ALL THE NODES(ATOMS) IN THE MODEL(MOLECULE)
C
C     NTR,CODE,X,Y,Z (SEE VARIABLE LIST FOR SUBROUTINE BKBONE)
C
C
C     CALLS TO;
C
C               ELMNT1
C
C     CALLED BY;
C
C               MAJOR1
C
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C
      SUBROUTINE RCHAIN(X,Y,Z)
C
      INTEGER*2 NPR(2500),NTR(2500),NN1(2500),NN2(2500),M(2500),
     *NPRALA(5),NPRARG(11),NPRASN(8),NPRASP(8),NPRCYS(6),NPRGLN(9),
     *NPRGLU(9),NPRGLY(4),NPRHIS(11),NPRILE(8),NPRLEU(8),NPRLYS(9),
     *NPRMET(8),NPRPHE(12),NPRSER(6),NPRTHR(7),NPRTRP(16),NPRTYR(13),
     *NPRVAL(7),NPRPRO(8),NPRHYP(9),NPRASX(8),NPRGLX(9),TYPE,TYPREF
C
      CHARACTER CODE*3
C
      LOGICAL*1 SAMRES
C
      COMMON /PROTN/LIMN,TYPE,TYPREF/BOTH/J,NPR,NTR,NN1,NN2
     */NPRS/NPRALA,NPRARG,NPRASN,NPRASP,NPRCYS,NPRGLN,NPRGLU,NPRGLY,
     *NPRHIS,NPRILE,NPRLEU,NPRLYS,NPRMET,NPRPHE,NPRSER,NPRTHR,NPRTRP,
     *NPRTYR,NPRVAL,NPRPRO,NPRHYP,NPRASX,NPRGLX/PCHAR/CODE/LOGVAR/
     *SAMRES
C
      REAL X(20),Y(20),Z(20)
C
C
      SAMRES=.TRUE.
C
C     DEFAULTS; EACH BRANCH BELOW OVERRIDES TYPE AND LIM AS NEEDED
C
      TYPE=21
      LIM=8
      N=(J-1)*20
C
      IF(CODE.EQ.'ALA')THEN
         TYPREF=18
         LIM=5
         NTR(N+5)=TYPREF
         DO 100 NN=N+1,N+LIM
100      NPR(NN)=NPRALA(NN-N)
         GOTO 30
C
      ELSEIF(CODE.EQ.'ARG')THEN
         TYPREF=30
         LIM=11
         DO 101 NN=N+1,N+LIM
101      NPR(NN)=NPRARG(NN-N)
         GOTO 300
C
      ELSEIF(CODE.EQ.'ASN')THEN
         TYPREF=21
         DO 102 NN=N+1,N+LIM
102      NPR(NN)=NPRASN(NN-N)
         GOTO 300
C
      ELSEIF(CODE.EQ.'ASP')THEN
         TYPREF=27
         DO 103 NN=N+1,N+LIM
103      NPR(NN)=NPRASP(NN-N)
         GOTO 300
C
      ELSEIF(CODE.EQ.'CYS')THEN
         TYPREF=22
         LIM=6
         DO 104 NN=N+1,N+LIM
104      NPR(NN)=NPRCYS(NN-N)
         TYPE=1
         GOTO 300
C
      ELSEIF(CODE.EQ.'GLN')THEN
         TYPREF=23
         LIM=9
         DO 105 NN=N+1,N+LIM
105      NPR(NN)=NPRGLN(NN-N)
         GOTO 300
C
      ELSEIF(CODE.EQ.'GLU')THEN
         TYPREF=28
         LIM=9
         DO 106 NN=N+1,N+LIM
106      NPR(NN)=NPRGLU(NN-N)
         GOTO 300
C
      ELSEIF(CODE.EQ.'GLY')THEN
         TYPREF=19
         LIM=4
         DO 107 NN=N+1,N+LIM
107      NPR(NN)=NPRGLY(NN-N)
         GOTO 30
C
      ELSEIF(CODE.EQ.'HIS')THEN
         TYPREF=31
         LIM=11
         TYPE=33
         DO 108 NN=N+1,N+LIM
108      NPR(NN)=NPRHIS(NN-N)
         GOTO 300
C
      ELSEIF(CODE.EQ.'ILE')THEN
         TYPREF=16
         TYPE=22
         DO 109 NN=N+1,N+LIM
109      NPR(NN)=NPRILE(NN-N)
         GOTO 300
C
      ELSEIF(CODE.EQ.'LEU')THEN
         TYPREF=17
         DO 110 NN=N+1,N+LIM
110      NPR(NN)=NPRLEU(NN-N)
         GOTO 300
C
      ELSEIF(CODE.EQ.'LYS')THEN
         TYPREF=29
         LIM=9
         TYPE=1
         DO 111 NN=N+1,N+LIM
111      NPR(NN)=NPRLYS(NN-N)
         GOTO 300
C
      ELSEIF(CODE.EQ.'MET')THEN
         TYPREF=11
         TYPE=1
         DO 112 NN=N+1,N+LIM
112      NPR(NN)=NPRMET(NN-N)
         GOTO 300
C
      ELSEIF(CODE.EQ.'PHE')THEN
         TYPREF=13
         LIM=12
         TYPE=31
         DO 113 NN=N+1,N+LIM
113      NPR(NN)=NPRPHE(NN-N)
         GOTO 300
C
      ELSEIF(CODE.EQ.'SER')THEN
         TYPREF=20
         LIM=6
         TYPE=1
         DO 114 NN=N+1,N+LIM
114      NPR(NN)=NPRSER(NN-N)
         GOTO 300
C
      ELSEIF(CODE.EQ.'THR')THEN
         TYPREF=25
         LIM=7
         DO 115 NN=N+1,N+LIM
115      NPR(NN)=NPRTHR(NN-N)
         GOTO 300
C
      ELSEIF(CODE.EQ.'TRP')THEN
         TYPREF=12
         LIM=15
         TYPE=34
         DO 116 NN=N+1,N+LIM+1
116      NPR(NN)=NPRTRP(NN-N)
         GOTO 300
C
      ELSEIF(CODE.EQ.'TYR')THEN
         TYPREF=26
         LIM=12
         TYPE=32
         DO 117 NN=N+1,N+LIM+1
117      NPR(NN)=NPRTYR(NN-N)
         GOTO 300
C
      ELSEIF(CODE.EQ.'VAL')THEN
         TYPREF=15
         LIM=7
         DO 118 NN=N+1,N+LIM
118      NPR(NN)=NPRVAL(NN-N)
         GOTO 300
C
      ELSEIF(CODE.EQ.'PRO')THEN
         TYPREF=14
         TYPE=41
         DO 119 NN=N+1,N+LIM
119      NPR(NN)=NPRPRO(NN-N)
         GOTO 300
C
      ELSEIF(CODE.EQ.'HYP')THEN
         TYPREF=24
         TYPE=42
         DO 120 NN=N+1,N+LIM+1
120      NPR(NN)=NPRHYP(NN-N)
         GOTO 300
C
      ELSEIF(CODE.EQ.'ASX')THEN
         TYPREF=30
         DO 121 NN=N+1,N+LIM
121      NPR(NN)=NPRASX(NN-N)
         GOTO 300
C
      ELSEIF(CODE.EQ.'GLX')THEN
         TYPREF=31
         LIM=9
         DO 122 NN=N+1,N+LIM
122      NPR(NN)=NPRGLX(NN-N)
C
      ELSE
C
C     UNRECOGNIZED RESIDUE CODE; NO SIDE CHAIN IS PROCESSED
C
         GOTO 30
      ENDIF
C
C     LABELS 300 AND 30 ARE PLACED AFTER THE ENDIF SO THAT THE
C     GOTO STATEMENTS ABOVE DO NOT BRANCH INTO AN IF BLOCK, WHICH
C     FORTRAN 77 PROHIBITS. THE FLOW OF CONTROL IS UNCHANGED.
C
300   CONTINUE
      LIMN=LIM+1
C
C
      CALL ELMNT1(X,Y,Z)
30    CONTINUE
C
      SAMRES=.FALSE.
C
C
      RETURN
      END
C
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C
C     SUBROUTINE TO DEFINE ELEMENTS FOR PROTEIN FILES; ELMNT1
C
C     VARIABLES;
C
C     N         NODE(ATOM) NUMBER
C
C     X,Y,Z,NN1,NN2  (SEE VARIABLE LIST FOR SUBROUTINE BKBONE)
C
C     LIM,TYPE       (SEE VARIABLE LIST FOR SUBROUTINE RCHAIN)
C
C
C     CALLS TO;
C
C               SWAP1
C               SWAP2
C               SWAP3
C
C     CALLED BY;
C
C               RCHAIN
C
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C
C
      SUBROUTINE ELMNT1(X,Y,Z)
C
      INTEGER*2 NPR(2500),NTR(2500),NN1(2500),NN2(2500),M(2500),
     *NMR,TYPE,TYPREF,NC,NE,NTC,NDES,NDIS
C
      CHARACTER FMT1*18,FMT2*15,FMT3*6,FMT4*32,FMT7*7,CODE*3
C
      LOGICAL*1 SAMRES,NOREAD
C
      COMMON /PROTN/LIMN,TYPE,TYPREF/BOTH/J,NPR,NTR,NN1,NN2,M,
     *NMR,NC,NE,NTC,NDES,NDIS/FORMAT/FMT1,FMT4,FMT2,FMT3,FMT7
     */LOGVAR/SAMRES,NOREAD
C
      REAL X(LIMN),Y(LIMN),Z(LIMN)
C
C
C     READ IN COORDINATE DATA
C
C     DEPENDING ON THE TYPE OF THE RESIDUE, A DIFFERENT NUMBER OF
C     COORDINATE DATA POINTS MUST BE READ IN
C
C
      MARKER=LIMN
      I3=7
      IF((TYPE.EQ.1).OR.(TYPE.EQ.21).OR.(TYPE.EQ.22))THEN
         CALL RD1(X,Y,Z,I3,LIMN)
      ELSE
         CALL RD1(X,Y,Z,I3,LIMN-1)
         MARKER=LIMN-1
      ENDIF
C
C
C     ASSUME THE DATA CONTAINS EITHER THE WHOLE RESIDUE OR
C     JUST THE BACKBONE
C
C     IF NOREAD IS TRUE AT THIS POINT, THEN THE DATA FILE ONLY
C     HAS THE BACKBONE PORTION OF THE RESIDUE, AND IT IS NECESSARY
C     TO BEGIN READING IN COORDINATES OF A NEW RESIDUE
C
C
      IF(NOREAD)GOTO 50
C
C
C     CALL A REORDERING SUBROUTINE IF NECESSARY
C     THIS WILL BE NECESSARY FOR CERTAIN RESIDUES WHOSE
C     COORDINATE DATA WE HAVE CHOSEN TO REORDER TO OBTAIN
C     A MORE SEQUENTIAL PATTERN FOR ARRANGING THE NODES(ATOMS)
C     AND ELEMENTS(BONDS).
C
C
      IF((TYPE.EQ.31).OR.(TYPE.EQ.32))THEN
         CALL SWAP1(X)
         CALL SWAP1(Y)
         CALL SWAP1(Z)
      ELSEIF((TYPE.EQ.33).OR.(TYPE.EQ.34))THEN
         CALL SWAP2(X)
         CALL SWAP2(Y)
         CALL SWAP2(Z)
      ELSEIF(TYPE.EQ.22)THEN
         CALL SWAP3(X)
         CALL SWAP3(Y)
         CALL SWAP3(Z)
      ENDIF
C
C
      DO 5 K=7,MARKER
         N=(J-1)*20+K
5     WRITE(3,FMT2)N,NDES,NDIS,NC,X(K),Y(K),Z(K)
C
C     CONNECT THE NODES TO FORM THE ELEMENTS OF THE SIDE CHAINS
C
      LIM=LIMN-1
      DO 30 I=5,LIM
C
         N=(J-1)*20+I
         M(N)=N
         NTR(N)=TYPREF
C
         IF(I.EQ.5)GOTO 30
         IF(I.NE.LIM) THEN
            NN1(N)=N
            NN2(N)=N+1
         ELSEIF(TYPE.EQ.1)THEN
            NN1(N)=N
            NN2(N)=N+1
         ELSEIF(TYPE.EQ.21)THEN
            NN1(N)=N-1
            NN2(N)=N+1
         ELSEIF(TYPE.EQ.22)THEN
            NN1(N)=N-2
            NN2(N)=N+1
         ELSEIF((TYPE.EQ.31).OR.(TYPE.EQ.32).OR.(TYPE.EQ.33).OR.
     *   (TYPE.EQ.34))THEN
            NN1(N)=N
            NN2(N)=(J-1)*20+7
         ELSEIF((TYPE.EQ.41).OR.(TYPE.EQ.42))THEN
            NN1(N)=N
            NN2(N)=N-7
         ENDIF
C
30    CONTINUE
C
C     FINISH UP THE SPECIAL CASES; THE HYDROXYL GROUP ON TYROSINE
C     RESIDUES, THE SHARED BOND BETWEEN THE TWO RINGS IN TRYPTOPHAN
C     RESIDUES, AND THE HYDROXYL GROUP ON HYDROXYPROLINE RESIDUES.
C
      I=I+1
      N=N+1
C
C
      IF(TYPE.EQ.32)THEN
         CALL RD1(X,Y,Z,I,I)
         NN1(N)=N-3
         NN2(N)=N
         WRITE(3,FMT2)N,NDES,NDIS,NC,X(I),Y(I),Z(I)
         M(N)=N
         NTR(N)=TYPREF
      ELSEIF(TYPE.EQ.34)THEN
         NN1(N)=N-6
         NN2(N)=N-1
         M(N)=N
         NTR(N)=TYPREF
      ELSEIF(TYPE.EQ.42)THEN
         CALL RD1(X,Y,Z,I,I)
         NN1(N)=N-2
         NN2(N)=N
         WRITE(3,FMT2)N,NDES,NDIS,NC,X(I),Y(I),Z(I)
         M(N)=N
         NTR(N)=TYPREF
      ENDIF
C
50    CONTINUE
C
      RETURN
      END
C
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C
C     SUBROUTINE TO REORDER ATOMS IN PHE AND TYR; SWAP1
C
C     VARIABLES;
C
C     W         (ARRAY) STORES ATOMIC X,Y, OR Z COORDINATE.
C
C     WN        (ARRAY) USED AS DUMMY VARIABLES TO SWAP POSITION
C               OF MEMBERS OF W.
C
C     LIMN,TYPE (SEE VARIABLE LIST OF SUBROUTINE RCHAIN)
C
C
C     CALLS TO;
C
C               NONE
C
C     CALLED FROM;
C
C               ELMNT1
C
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C
C
      SUBROUTINE SWAP1(W)
C
      COMMON /PROTN/LIMN
      DIMENSION W(LIMN),WN(12)
C
C
      DO 15 I=9,12
15    WN(I)=W(I)
      W(9)=WN(10)
      W(10)=WN(12)
      W(12)=WN(9)
C
      RETURN
      END
C
C
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C
C     SUBROUTINE FOR REORDERING TRP AND HIS; SWAP2
C
C     VARIABLES;
C
C     W,WN      (SEE VARIABLE LIST FOR SUBROUTINE SWAP1)
C
C     LIMN,TYPE (SEE VARIABLE LIST FOR SUBROUTINE RCHAIN)
C
C
C     CALLS TO;
C
C               NONE
C
C     CALLED FROM;
C
C               ELMNT1
C
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C
C
      SUBROUTINE SWAP2(W)
C
      INTEGER*2 TYPE
      COMMON /PROTN/LIMN,TYPE
      DIMENSION W(LIMN),WN(15)
C
C
      DO 25 I=9,LIMN-1
25    WN(I)=W(I)
C
      W(9)=WN(10)
      W(10)=WN(11)
      IF(TYPE.EQ.33)THEN
         W(11)=WN(9)
      ELSE
         W(14)=WN(12)
         W(13)=WN(14)
         W(12)=WN(15)
         W(11)=WN(13)
         W(15)=WN(9)
      ENDIF
C
      RETURN
      END
C
C
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C
C     SUBROUTINE TO SWAP ATOMS IN ILE RESIDUES; SWAP3
C
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C
      SUBROUTINE SWAP3(W)
C
      COMMON /PROTN/LIMN
      REAL W(LIMN),WN(2)
      DO 10 I=8,9
10    WN(I-7)=W(I)
      W(8)=WN(2)
      W(9)=WN(1)
      RETURN
      END
C
C
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C
C     SUBROUTINE TO READ IN DATA FOR PROTEIN FILES; RD1
C
C     DESCRIPTION;
C
C     THIS SUBROUTINE TESTS INPUT TO SEE IF IT IS ACCEPTABLE AS DATA,
C     I.E., TO TELL IT APART FROM TEXT IN THE INPUT FILE. IT ALSO
C     TESTS TO SEE IF THE CRYSTALLOGRAPHIC DATA FOR A PARTICULAR
C     RESIDUE IS INCOMPLETE. IT IS ASSUMED THAT EITHER THE ENTIRE
C     RESIDUE IS PRESENT OR ONLY THE BACKBONE PORTION.
C
C     EACH LINE FROM THE INPUT FILE IS READ IN AS A CHARACTER
C     STRING AND TESTED TO SEE IF IT IS ACCEPTABLE AS DATA. IF IT IS
C     ACCEPTABLE, A CORE TO CORE INPUT OPERATION (AN INTERNAL READ)
C     IS PERFORMED TO REREAD IT INTO THE DESIRED FORMAT.
C
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C
C
      SUBROUTINE RD1(X,Y,Z,A,B)
C
      REAL X(20),Y(20),Z(20)
      CHARACTER BUFF*80,CODE*3,FMT1*18,FRST4*4,OLDCD*3
      INTEGER A,B
      LOGICAL*1 SAMRES,NOREAD,SAMCD
C
C     BUFF AND FRST4 MUST SURVIVE BETWEEN CALLS. WHEN NOREAD IS TRUE
C     THE RECORD READ ON THE PREVIOUS CALL IS PARSED AGAIN INSTEAD OF
C     READING A NEW ONE.
C
      SAVE BUFF,FRST4
C
      COMMON /FORMAT/FMT1/LOGVAR/SAMRES,NOREAD/PCHAR/CODE,OLDCD
C
      DO 30 I=A,B
      DO 20 K=1,1500
         IF(.NOT.NOREAD)THEN
            READ(2,100)BUFF
100         FORMAT(A80)
            FRST4=BUFF
         ENDIF
C
         IF((FRST4.EQ.'ATOM').OR.(NOREAD))THEN
            READ(BUFF,FMT1)CODE,X(I),Y(I),Z(I)
            SAMCD=(OLDCD.EQ.CODE)
            IF((SAMRES).AND.(.NOT.SAMCD))THEN
               NOREAD=.TRUE.
               RETURN
            ELSE
               NOREAD=.FALSE.
            ENDIF
            GOTO 30
         ENDIF
C
20    CONTINUE
30    CONTINUE
      RETURN
      END
C
C
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C
C     SUBROUTINE TO DO NUCLEIC ACID FILES; MAJOR2
C
C     VARIABLES;
C
C     NMAX      TOTAL NUMBER OF RESIDUES IN THE MOLECULE
C
C     X,Y,Z     (ARRAYS) ATOMIC COORDINATES
C
C     ATOM      3 BYTE CHARACTER VARIABLE USED TO IDENTIFY
C               MOLECULE AS HAVING DEOXYRIBOSE OR RIBOSE
C               AS ITS SUGAR.
C
C     NUC       1 BYTE CHARACTER VARIABLE. SIGNIFIES TYPE
C               OF NUCLEOTIDE RESIDUE (I.E., A,T,U,C,G)
C
C     CHAIN     1 BYTE CHARACTER VARIABLE. REPRESENTS THE
C               CHAIN (I.E., ALPHA(A), OR BETA(B)) IN WHICH
C               THE RESIDUE IS LOCATED.
C
C     FLAG      INTEGER VARIABLE USED TO DIFFERENTIATE RESIDUES
C               HAVING DEOXYRIBOSE SUGAR FROM THOSE THAT HAVE
C               RIBOSE.
C
C     N         SIGNIFIES NUMBER OF ATOMS IN PARTICULAR TYPE
C               OF RESIDUE
C
C     K         RESIDUE COUNTER
C
C     NN1,NN2   (ARRAYS) END NODES(ATOMS) OF ELEMENTS(BONDS)
C
C     NTR       (ARRAY) TYPE REFERENCE NUMBER. IDENTIFIES ATOMS
C               AS BELONGING TO A PARTICULAR TYPE OF RESIDUE.
C
C     REF       INTEGER VARIABLE. SIGNIFIES A OR B CHAIN.
C               ADDED TO TYPE REFERENCE NUMBER TO BE ABLE
C               TO DISTINGUISH RESIDUES ON DIFFERENT CHAINS.
C
C     NPR       (ARRAY) PROPERTY REFERENCE VALUE OF ELEMENTS
C               (BONDS). USED TO DIFFERENTIATE BETWEEN DIFFERENT
C               TYPES OF ELEMENTS (BONDS).
C
C
C     CALLS TO;
C
C               NUM,REORDR,ELMNT2
C
C     CALLED BY;
C
C               MAIN
C
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C
C
      SUBROUTINE MAJOR2(NMAX)
C
      INTEGER*2 NPR(2500),NTR(2500),NN1(2500),NN2(2500),M(2500),
     *NMR,NC,NE,NTC,NDES,NDIS,K,REF,FLAG,FLG/0/,LASTK/1/,A
C
      CHARACTER FMT1*18,FMT2*15,FMT3*6,FMT4*32,
     *NUC,CHAIN,ATOM*3,FMT7*7
C
C
      COMMON /NUCACD/I,REF,NN,N,K,/BOTH/J,NPR,NTR,NN1,
     *NN2,M,NMR,NC,NE,NTC,NDES,NDIS,/FORMAT/FMT1,FMT4,FMT2,FMT3,
     *FMT7/NCHAR/NUC,CHAIN
C
      REAL X(30),Y(30),Z(30)
C
C
C     READ ATOMIC COORDINATES FOR FIRST RESIDUE, TAKING INTO ACCOUNT
C     THE FACT THAT THE FIRST RESIDUE WILL NOT HAVE A PHOSPHATE GROUP.
C     THIS WILL ALSO BE THE CASE FOR THE FIRST RESIDUE IN THE BETA
C     CHAIN.
C
150   CALL RD2(ATOM,NUC,CHAIN,X,Y,Z,4,11,FLG)
C
C     FLAG IS SET TO 17 IF THE NUCLEIC ACID HAS DEOXYRIBOSE AS ITS
C     SUGAR. IF THE NUCLEIC ACID HAS RIBOSE AS ITS SUGAR, THERE IS
C     ONE MORE ATOM PER RESIDUE. FLAG IN THIS CASE IS 18. THIS IS
C     ONLY DONE FOR THE FIRST RESIDUE IN THE FILE.
C
      IF(FLG.EQ.0)THEN
         FLAG=17
         IF(ATOM.EQ.'O2*')FLAG=18
         I=FLAG
      ELSE
         LASTK=K
      ENDIF
C
C     USE FUNCTION SUBPROGRAM NUM TO DETERMINE THE TOTAL NUMBER OF
C     ATOMS IN THE PARTICULAR RESIDUE
C
      N=NUM()
C
C
C     READ THE REMAINDER OF THE ATOMS IN THE RESIDUE
C
      CALL RD2(ATOM,NUC,CHAIN,X,Y,Z,12,N,FLG)
C
C     REORDER THE ATOMIC COORDINATES TO OBTAIN A MORE SEQUENTIAL
C     NUMBERING PATTERN
C
      CALL REORDR(X)
      CALL REORDR(Y)
      CALL REORDR(Z)
C
      DO 30 J=3,N
         JJ=(LASTK-1)*30+J
         IF((I.EQ.17).AND.(J.EQ.11))GOTO 30
         IF((I.EQ.18).AND.(J.EQ.12))GOTO 30
         WRITE(3,FMT2)JJ,NDES,NDIS,NC,X(J),Y(J),Z(J)
30    CONTINUE
C
C     CALL ELMNT2 TO DETERMINE ELEMENTS
C
      CALL ELMNT2(FLG)
C
C     SET FLG BACK TO 0 (IT IS NECESSARY TO RESET FLG TO 0 AFTER
C     DEALING WITH THE END TERMINAL RESIDUE OF THE BETA CHAIN)
C
      FLG=0
C
C     SET-UP LOOP OVER ALL RESIDUES
C
      DO 40 K=LASTK+1,NMAX
         CALL RD2(ATOM,NUC,CHAIN,X,Y,Z,1,1,FLG)
         IF(FLG.EQ.1)GOTO 150
         N=NUM()
         CALL RD2(ATOM,NUC,CHAIN,X,Y,Z,2,N,FLG)
         CALL REORDR(X)
         CALL REORDR(Y)
         CALL REORDR(Z)
C
         DO 35 J=1,N
35       WRITE(3,FMT2)J+30*(K-1),NDES,NDIS,NC,X(J),Y(J),Z(J)
C
         CALL ELMNT2(FLG)
40    CONTINUE
C
C     WRITE INFORMATION TO UNIVERSAL FILE
C
      WRITE(3,FMT7)-1,-1,16
      DO 45 J=1,NMAX*30
         IF(NN1(J).NE.0)THEN
            WRITE(3,FMT3)J,NTC,NTR(J),NPR(J),NMR,NC,NE
            WRITE(3,FMT3)NN1(J),NN2(J)
         ENDIF
45    CONTINUE
      WRITE(3,FMT7)-1
C
C     CONTROL IS RETURNED TO MAIN PROGRAM
C
      RETURN
      END
C
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C
C     FUNCTION SUBPROGRAM TO DETERMINE NUMBER OF ATOMS IN
C     RESIDUE; NUM
C
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C
C
      FUNCTION NUM()
C
      CHARACTER NUC,CHAIN
      INTEGER*2 REF
      COMMON /NUCACD/I,REF/NCHAR/NUC,CHAIN
C
      NUM=(I+2)
      IF(NUC.EQ.'G')NUM=(I+5)
      IF(NUC.EQ.'A')NUM=(I+4)
      IF(NUC.EQ.'T')NUM=(I+3)
C
C     IF THE RESIDUE IS IN THE A CHAIN, REF IS 100.
C   OTHERWISE, REF IS 200
C
      REF=200
      IF(CHAIN.EQ.'A')REF=100
C
      RETURN
      END
C
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C
C   SUBROUTINE TO REORDER ATOMS IN NUCLEIC ACID RESIDUES;
C   REORDR
C
C   VARIABLES;
C
C        W  (ARRAY)   ATOMIC COORDINATE (EITHER X, Y, OR Z)
C
C        WN (ARRAY)   DUMMY VARIABLES FOR SWAP
C
C
C   CALLS TO;
C
C        NONE
C
C   CALLED BY;
C
C        MAJOR2
C
C
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C
      SUBROUTINE REORDR (W)
C
      CHARACTER NUC,CHAIN
      INTEGER*2 REF
      COMMON /NUCACD/I,REF,NN,N/NCHAR/NUC,CHAIN
      REAL W(N),WN(30)
C
      DO 10 J=1,N
   10 WN(J)=W(J)
C
C   REORDER SUGAR SEGMENT OF RESIDUE
C
      DO 20 J=1,10
         IF((J.EQ.1).OR.(J.EQ.3).OR.(J.EQ.4).OR.(J.EQ.5).OR.
     *(J.EQ.6))THEN
            W(J)=WN(J+1)
         ELSE
            W(J)=WN(J-1)
         ENDIF
   20 CONTINUE
C
      W(8)=WN(10)
      W(7)=WN(I-6)
      W(I-6)=WN(3)
C
C   REORDER REST OF RESIDUE
C
      IF(NUC.EQ.'A')THEN
         W(I+4)=WN(I)
         DO 30 J=I,I+3
   30    W(J)=WN(J+1)
      ELSEIF(NUC.EQ.'G')THEN
         W(I+2)=WN(I+4)
         W(I+3)=WN(I+5)
         W(I)=WN(I+1)
         W(I+1)=WN(I+2)
         W(I+4)=WN(I)
         W(I+5)=WN(I+3)
      ELSE
         W(I)=WN(I-4)
         W(I+1)=WN(I-3)
         W(I-1)=WN(I-2)
         W(I-2)=WN(I-1)
         W(I-3)=WN(I+1)
         W(I+2)=WN(I)
         IF(NUC.EQ.'T')THEN
            W(I+3)=WN(I+2)
            W(I-4)=WN(I+3)
         ELSE
            W(I-4)=WN(I+2)
         ENDIF
      ENDIF
C
      RETURN
      END
C
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C
C   SUBROUTINE TO DETERMINE ELEMENTS FOR NUCLEIC ACID FILES;
C   ELMNT2
C
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C
      SUBROUTINE ELMNT2(FLG)
C
C
      INTEGER*2 A,B,REF,NPR(2500),NTR(2500),NN1(2500),NN2(2500),
     *M(2500),NMR,NC,NE,NTC,NDES,NDIS,K,NPRRB1(12),NPRRB2(11),
     *NPRA(11),NPRG(12),NPRC(8),NPRT(9),NPRU(8),FLG
C
      CHARACTER NUC,CHAIN
C
      COMMON /BOTH/J,NPR,NTR,NN1,NN2,M,NMR,NC,NE,NTC,NDES,NDIS
     */NUCACD/I,REF,NN,N,K/NCHAR/NUC,CHAIN/NPRNUC/NPRRB1,
     *NPRRB2,NPRA,NPRG,NPRC,NPRT,NPRU
C
      KK=30*(K-1)
      II=KK+I
C
      IF((K.EQ.1).OR.(FLG.EQ.1))THEN
         A=3
         B=10
      ELSE
         A=1
         B=11
      ENDIF
C
C
C   DO RIBOSE PART OF MOLECULE FIRST
C
      DO 20 J=KK+A,KK+B
         IF(J.EQ.(10+KK))THEN
            NN1(J+15)=J
            NN2(J+15)=KK+32
            NTR(J+15)=REF+6
            IF(I.EQ.18)THEN
               NN1(J)=J-2
               NN2(J)=J+1
               NN1(J+2)=J-5
               NN2(J+2)=J-1
               NTR(J+2)=1+REF
            ELSE
               NN1(J)=J-5
               NN2(J)=J-1
            ENDIF
         ELSEIF(J.EQ.(11+KK))THEN
            NN1(J)=J-9
            NN2(J)=(II-6)
         ELSEIF(J.EQ.(7+KK))THEN
            NN1(J)=J
            NN2(J)=J+1
            NN1(J+17)=J
            NN2(J+17)=II-5
            NTR(J+17)=1+REF
         ELSE
            NN1(J)=J
            NN2(J)=J+1
         ENDIF
   20    NTR(J)=1+REF
C
C   NOW DO BASE
C
      IF((NUC.EQ.'A').OR.(NUC.EQ.'G'))THEN
         DO 40 J=II-5,II+5
C
            IF(J.EQ.II+5)THEN
               NN1(J)=J-10
               NN2(J)=J-2
               IF(NUC.EQ.'G')THEN
                  NN1(J+1)=J-4
                  NN2(J+1)=J
                  NTR(J+1)=3+REF
               ENDIF
            ELSEIF((J.EQ.(II+3)).OR.(J.EQ.(II+4)))THEN
               NN1(J)=J-5
               NN2(J)=J
            ELSE
               NN1(J)=J
               NN2(J)=J+1
            ENDIF
            IF(NUC.EQ.'A')NTR(J)=2+REF
   40    IF(NUC.EQ.'G')NTR(J)=3+REF
C
      ELSE
         DO 50 J=II-5,II+2
            IF(J.EQ.(II+1))THEN
               NN1(J)=J-6
               NN2(J)=J-1
            ELSEIF(J.EQ.(II+2))THEN
               NN1(J)=J-4
               NN2(J)=J
               IF(NUC.EQ.'T')THEN
                  NN1(J+1)=II-3
                  NN2(J+1)=II+3
                  NTR(J+1)=5+REF
               ENDIF
            ELSE
               NN1(J)=J
               NN2(J)=J+1
            ENDIF
C
C
            IF(NUC.EQ.'C')NTR(J)=4+REF
            IF(NUC.EQ.'T')NTR(J)=5+REF
            IF(NUC.EQ.'U')NTR(J)=6+REF
   50    CONTINUE
      ENDIF
C
C   ASSIGN NPR VALUES
C
      DO 60 IK=KK+1,KK+30
         A=IK-KK
         B=A-6
         IF(NN1(IK).NE.0)THEN
            IF(IK.EQ.24+KK)THEN
               NPR(IK)=9
            ELSEIF(IK.LE.(II-6))THEN
               IF(I.EQ.18)NPR(IK)=NPRRB1(A)
               IF(I.EQ.17)NPR(IK)=NPRRB2(A)
            ELSE
               IF(NUC.EQ.'A')NPR(IK)=NPRA(B)
               IF(NUC.EQ.'G')NPR(IK)=NPRG(B)
               IF(NUC.EQ.'C')NPR(IK)=NPRC(B)
               IF(NUC.EQ.'T')NPR(IK)=NPRT(B)
               IF(NUC.EQ.'U')NPR(IK)=NPRU(B)
            ENDIF
         ENDIF
   60 CONTINUE
C
      RETURN
      END
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C
C   SUBROUTINE TO READ DATA IN FOR NUCLEIC ACID FILES
C
      SUBROUTINE RD2(ATOM,NUC,CHAIN,X,Y,Z,A,B,FLG)
C
      REAL X(30),Y(30),Z(30)
      CHARACTER ATOM*3,NUC,CHAIN,BUFF*80,FMT4*32,FMT1*18,
     *FRST4*4
      INTEGER*2 A,B,FLG
C
      COMMON /FORMAT/FMT1,FMT4
C
      DO 30 I=A,B
         DO 10 K=1,1500
            READ(2,100)BUFF
  100       FORMAT(A80)
            FRST4=BUFF
            IF(FRST4.EQ.'ATOM')THEN
               READ(BUFF,FMT4)ATOM,NUC,CHAIN,X(I),Y(I),Z(I)
               GOTO 30
            ELSEIF(FRST4.EQ.'TER ')THEN
               FLG=1
               GOTO 40
            ENDIF
   10    CONTINUE
   30 CONTINUE
   40 CONTINUE
C
      RETURN
      END
C
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC