SPORADIC DIALOGUES
by Eugene Volokh, VESOFT
Q&A column published in INTERACT Magazine, 1987-1989.
Published in "Thoughts & Discourses on HP3000 Software", 4th ed.
Q: I have what you might think to be a strange question -- where is
the Command Interpreter?
On my MPE/XL machine, I see a program file called CI.PUB.SYS, so I
guess that that's the Command Interpreter program (the one that
implements :RUN, :PURGE, etc.) -- however, I couldn't find such a
program on my MPE/V computer.
:EDITOR, I know, is implemented through EDITOR.PUB.SYS; :FCOPY is
FCOPY.PUB.SYS; all the compilers have their own program files. Where
is the CI kept?
A: This is a very interesting question -- one that bothered me for a
while several years ago. The answer is somewhat surprising.
Virtually all of the operating system on MPE/V is kept in the system
SL (SL.PUB.SYS). This includes the file system, the memory manager,
etc.; it also includes the Command Interpreter. But, you may say,
each job or session has a Command Interpreter process, and each
process must have a program file attached to it. Nope. Each of YOUR
processes must have a program file attached to it; MPE itself faces
no such restrictions.
All that a process really has to have is a data segment (which MPE
builds for each CI process when you sign on) and code segments;
internally, there's no reason why those code segments can't be in the
system SL. MPE just uses an internal CREATEPROCESS-like procedure
that creates a process given not a program file name but an address
(plabel) of a system SL procedure; no program file needed.
On MPE/XL, the situation is somewhat different (as you've noticed);
there's a program file called CI.PUB.SYS that is actually what is run
for each CI process. However, if you do a :LISTF CI.PUB.SYS,2,
you'll find that it's actually quite small (186 records on my MPE/XL
1.1 system); since Native Mode program files usually take a lot more
records than MPE/V program files, those 186 records can't really do a
lot of work. CI.PUB.SYS is really just a shell that outputs the
prompt, reads the input, and executes it by calling MPE/XL system
library procedures. MPE/XL library procedures (including the ones
that CI.PUB.SYS calls, either directly, or indirectly) are kept in
the files NL.PUB.SYS, XL.PUB.SYS, and SL.PUB.SYS.
Thus, in both MPE/V and MPE/XL, virtually all of the smarts behind
command execution is kept in the system library (or libraries).
Notable exceptions include all the "process-handling" commands, like
EDITOR, FCOPY, STORE/RESTORE (which use the program STORE.PUB.SYS),
and even PREP (which actually goes through a program called
SEGPROC.PUB.SYS). However, :FILE, :PURGE, :BUILD, and so on, are all
handled by SL, XL, or NL code.
Q: A couple of months ago I read in this column that you can speed up
access to databases by having somebody else open them first. Is
there some easy way that I can take advantage of this without having
to write a special program? I guess what I'd have to do is have a
job that DBOPENs the database and then suspends with the database
open, but how can this be easily done?
A: Indeed, if a TurboIMAGE database is already DBOPENed by somebody,
then a subsequent DBOPEN will be faster than it would be if it were
the first DBOPEN. On my Micro XE, a DBOPEN of a database that nobody
else has open took about 2 seconds; a DBOPEN of a database that was
already opened by somebody else took only about 0.8 seconds. The
first DBGET/DBPUT/DBUPDATE against a dataset were also faster if the
database was already open (for the DBGET, about 50 milliseconds vs.
about 400-500 milliseconds).
How can you take advantage of this? Well, if the database is
constantly in use (e.g. it's opened by each of the many users who are
accessing your application system), then the time savings come
automatically -- the first user will spend 2 seconds on the DBOPEN
and all subsequent users will spend 0.8 seconds, as long as the
database is open by at least one other user.
The problem arises when your database is frequently opened but is
left open for only a short time; for instance, if it is used by some
short program that only needs to do a few database transactions
before it terminates. Then, you might have hundreds of DBOPENs every
day, the great majority of which happen when the database is NOT
already open. You'd like to have one job hanging out there keeping
the database open so that all subsequent DBOPENs will find the
database already opened by that job.
How can this be done without writing a special custom program? (I
agree that you should try to avoid writing custom programs whenever
possible -- if you wrote a special program, you'd have to keep track
of three files [the job stream, the program, and the source] rather
than just the job stream.) Well, let's take this one step at a time.
Having a job stream that opens your database is easy -- just say
!JOB OPENBASE,...
!RUN QUERY.PUB.SYS
BASE=MYBASE
<<password>>
1 <<the mode>>
Unfortunately, as soon as this job opens the database, it'll start to
execute all the subsequent QUERY commands; eventually (because of an
>EXIT or an end-of-file) QUERY will terminate and the database will
get closed. It would be nice if QUERY had some sort of >PAUSE
command (or, to be precise, a >PAUSE FOREVER command), but it
doesn't.
Or does it? What are the mechanisms that MPE gives you for
suspending a process? Can we use any of them from within QUERY?
Well, there's obviously the PAUSE intrinsic, which pauses for a given
amount of time. It's not quite what we want (since we want to pause
indefinitely), but more importantly there is really no way of calling
PAUSE from within QUERY (unless your QUERY is hooked with VESOFT's
MPEX). So much for that idea. There are also a few other intrinsics
-- for instance, SUSPEND and PRINTOPREPLY -- that suspend a process,
but they're not callable from QUERY either. You can't call
intrinsics from QUERY; you can't even do MPE commands (though in any
event there aren't any MPE commands that suspend a process).
One other thing, however, comes to mind -- message files. A read
against an empty message file is a good way to suspend a process;
however, QUERY doesn't have an ">FOPEN" or an ">FREAD" command,
either, does it?
Well, in a way it does. QUERY's >XEQ command lets you execute QUERY
commands from an external MPE file, and to do that it has to FOPEN it
and FREAD from it. You might therefore say:
!JOB OPENBASE,...
!BUILD TEMPMSG;MSG;TEMP;REC=,,,ASCII
!RUN QUERY.PUB.SYS
BASE=MYBASE
<<password>>
1 <<the mode>>
XEQ TEMPMSG
EXIT
!EOJ
The XEQ will start reading from this temporary message file, see that
it's empty, and hang, waiting for a record to be written to this
file. Since the file is a temporary file, nobody else will be able
to write to it, so the job will remain suspended until it's aborted.
There are a few other alternatives along the same lines. For one,
you could make the message file permanent -- that way, you can stop
the job by just writing a record to this file. Why would you want to
do this? Because you may want to have your backup job automatically
abort the OPENBASE job so that the database can get backed up.
If the message file were permanent, your backup job could then just
say:
!FCOPY FROM;TO=OPENMSG.JOB.SYS
<< just a blank line -- any text will do >>
...
The one thing that you'd have to watch out for is the security on the
OPENMSG.JOB.SYS file -- you wouldn't want to have just anybody be
able to write to this file because any commands that are written to
the OPENMSG file will be executed by the QUERY in the OPENBASE job
stream. (Remember the >XEQ command.)
If you don't use the permanent message file approach, you can still
have the backup job abort the OPENBASE job by saying
!ABORTJOB OPENBASE,user.account
However, for this, you'd either have to have :JOBSECURITY LOW and
have your backup job log on with SM capability, or have the :ABORTJOB
command be globally allowed (unless you have the contributed ALLOWME
program or SECURITY/3000's $ALLOW feature).
Another alternative is to >XEQ a file that requires a :REPLY rather
than a message file, e.g.
!JOB OPENBASE,...
!FILE NOREPLY;DEV=TAPE
!RUN QUERY.PUB.SYS
BASE=MYBASE
<<password>>
1 <<the mode>>
XEQ *NOREPLY
EXIT
!EOJ
The trouble with this solution is that the console operator might
then accidentally :REPLY to the "NOREPLY" request. (Also, the job
wouldn't quite work if you had an auto-:REPLY tape drive!)
In any event, though, one of the above solutions might be the thing
for you. It's not going to buy you too much time (it only saves time
for DBOPENs and the first DBGETs/DBPUTs/DBUPDATEs on each dataset),
it isn't necessary if the database is already likely to be opened all
the time, and it won't work on pre-TurboIMAGE or MPE/XL systems.
However, in some cases it could be a non-trivial (and cheap!)
performance improvement.
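To get a feel for how much time is at stake, here is a rough
back-of-the-envelope sketch in Python. It uses the two DBOPEN timings
measured above (about 2 seconds cold, about 0.8 seconds warm); the
function name and the 300-opens-a-day workload are made up purely for
illustration, and your own timings will certainly differ.

```python
def dbopen_seconds_saved(opens_per_day, cold_ms=2000, warm_ms=800):
    """Seconds saved per day if every DBOPEN finds the database already open.

    cold_ms and warm_ms default to the rough Micro XE timings quoted in
    the text; opens_per_day is whatever your own workload happens to be.
    """
    return opens_per_day * (cold_ms - warm_ms) / 1000


# Hypothetical workload: 300 short programs a day, each doing one DBOPEN.
print(dbopen_seconds_saved(300))
```

With these assumed numbers the saving is six minutes a day, not counting
the faster first DBGETs/DBPUTs/DBUPDATEs.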
One note to keep in mind (if you really look carefully at the way
IMAGE behaves): the BASE= in the job stream will only open the
database root file -- the individual datasets will remain closed.
HOWEVER, once the keep-the-database-open job starts, any open of any
dataset by any user will leave the dataset open for all the others;
even when the user closes the database, the datasets will still
remain open. If the user only reads the dataset (and doesn't write
to it), the dataset will only be opened for read access; however,
once any user tries to write to the dataset, the dataset will be
re-opened for both read and write access.
The upshot of this is that the database will start out with only the
root file opened, and then (as users start accessing datasets in it)
will slowly have more and more of its datasets opened. After each
dataset has been accessed (especially written to) at least once, no
more opens will be necessary until the last user of the database
(most likely the keep-open job itself) closes it.
Q: I recently tried to develop a way to periodically check my system
disk usage (free, used, and lost). I based it on information that
VESOFT once supplied in one of their articles.
My procedure was as follows:
1. Do a :REPORT XXXXXXXX.@ into a file to determine total disk
usage by account.
2. Subtotal the list (in my case, using QUIZ) to get a system disk
space usage total.
3. Run FREE5.PUB.SYS with ;STDLIST= redirected to a disk file.
4. Use QUIZ to combine the two output files and convert the totals
to Megabytes.
This way, I can show the total free space and total used space on one
report for easy examination. I can also add these two numbers, compare
the sum to the total disk capacity, and thus determine the 'lost' disk
space.
My question is in regard to the lost disk space. The first time I ran
the job, the total lost disk space came out to be approximately 19
Megabytes. After doing a Coolstart with 'Recover Lost disk Space', the
job again showed a total lost disk space of 19 Megabytes. Didn't the
Recover Lost disk Space save any space? Could there be something I'm
overlooking?
A: Unfortunately, there is. Paradoxical as it may seem, the sum of the
total used space and the total free space is NOT supposed to be the
same as the total disk space on the system; or, to be precise, there
is more "used space" on your system than just what the :REPORT command
shows you.
The :REPORT command shows you the total space occupied by PERMANENT
disk FILES. However, other objects on the system also use disk space:
- TEMPORARY FILES created by various jobs and sessions;
- "NEW" FILES, i.e. disk files that are opened by various processes
but not saved as either permanent or temporary;
- SPOOL FILES, both output and input (input spool files are
typically the $STDINs of jobs);
- THE SYSTEM DIRECTORY;
- VIRTUAL MEMORY;
- and various other objects, mostly very small ones (such as disk
labels, defective tracks, etc.).
Your job stream, for instance, certainly has a $STDLIST spool file and
a $STDIN file (both of which use disk space), and might use temporary
files (for instance, for the output of the :REPORT and FREE5); also,
if you weren't the only user on the system, any of the other jobs or
sessions might have had temporary files, new files, or spool files.
In other words, to get really precise "lost space" results, you ought
to:
* Change your job stream to take into account input and output
spool file space (available from the :SHOWIN JOB=@;SP and
:SHOWOUT JOB=@;SP commands). Since :SHOWIN and :SHOWOUT output
cannot normally be redirected to a disk file, you should
accomplish this by running a program (say, FCOPY) with its
;STDLIST= redirected and then making the program do the :SHOWIN
and :SHOWOUT, e.g. by saying
:RUN FCOPY.PUB.SYS;INFO=":SHOWOUT JOB=@;SP";STDLIST=...
* Consider the space used by the system directory and by virtual
memory (these values are available using a :SYSDUMP $NULL).
* Consider the space used by your own job's temporary files.
* Run the job when you're the only user on the system.
Your best bet would probably be to run the job once, immediately after
a recover lost disk space, with nobody else on the system; the total
'unaccounted for' disk space it shows you (i.e. the total space minus
the :REPORT sum minus the free disk space) will be, by definition, the
amount of space needed by the system and by your job stream. You can
call this post-recover-lost-disk-space value the 'baseline'
unaccounted-for total.
If your job stream ever shows you an 'unaccounted for' total that is
greater than the baseline, you'll know that there MAY be some disk
space lost. To be sure, you should
* Run the job when you're the only user on the system.
* Make sure that your session (the only one on the system) has no
temporary files or open new files while the job is running.
If you do all this, then comparing the 'unaccounted for' disk space
total against the baseline will tell you just how much space is really
lost.
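The arithmetic behind the baseline comparison can be sketched as
follows. This is only an illustration in Python (your job would do the
same sums in QUIZ or whatever you already use); the function names and
all of the sector counts are hypothetical.

```python
def unaccounted_for(total_capacity, report_total, free_total):
    """Space that is neither in permanent files (:REPORT) nor free (FREE5)."""
    return total_capacity - report_total - free_total


def lost_space(total_capacity, report_total, free_total, baseline):
    """Unaccounted-for space beyond the post-recovery baseline.

    baseline is the unaccounted-for figure measured once, right after a
    recover lost disk space, with nobody else on the system.
    """
    return unaccounted_for(total_capacity, report_total, free_total) - baseline


# Hypothetical figures, all in sectors.
baseline = unaccounted_for(1_000_000, 700_000, 225_000)    # right after recovery
print(lost_space(1_000_000, 710_000, 205_000, baseline))   # a later run
```

A result of zero means nothing has gone missing since the baseline was
taken; a positive result (measured under the same quiet conditions) is
the space that is really lost.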
Incidentally, note that you needn't rely on your guess as to the total
size of your disks -- even this can be found out exactly (though not
easily). The very end of the output of VINIT's ">PFSPACE ldev;ADDR"
command shows the total disk size as the "TOTAL VOLUME CAPACITY".
Thus, you could, in your job, :RUN PVINIT.PUB.SYS (the VINIT program
file) with ;STDLIST= redirected to a disk file and issue a ">PFSPACE
ldev;ADDR" command for each of your disk drives. (If you want to get
REALLY general-purpose, you can even avoid relying on the exact
logical device numbers of your disks by doing a :DSTAT ALL into a disk
file and then converting the output of this command into input to
VINIT).
Finally, it's important to remember that all this applies ONLY to
MPE/V. The rules of the game for MPE/XL are very, very different; in
any event, disk space on MPE/XL should (supposedly) never get lost.
Q: I want to "edit" a spool file -- make a few modifications to it
before printing it. SPOOK, of course, doesn't let you do this, so I
decided just to use SPOOK's >COPY command to copy the file to a disk
file, use EDITOR to edit it, and then print the disk file. However,
when I printed the file, the output pagination ended up a lot
different from the way it was in the original spool file! Is there
some special way in which I should print the file, or am I doing
something else wrong?
A: Well, first of all, there is a special way in which you must print
any file that used to have Carriage Control before you edited it with
EDITOR. When EDITOR /TEXTs in a CCTL file, it conveniently forgets
that the file had CCTL -- when it /KEEPs the file back, the file ends
up being a non-CCTL file (although the carriage control codes remain
as data in the first column of the file). To output the file as a CCTL
file, you should say
:FILE LP;DEV=LP
:FCOPY FROM=MYFILE;TO=*LP;CCTL
The ";CCTL" parameter tells FCOPY to interpret the first character of
each record as a carriage control code.
Actually, you probably already know this, since otherwise you would
have asked a slightly different question. You've done the :FCOPY
;CCTL, and you have still ended up with pagination that's different
from what you had to begin with. Why?
Unfortunately, not all the carriage control information from a spool
file gets copied to a disc file with the >COPY command. In particular:
* If your program sends carriage control codes to the printer using
FCONTROL-mode-1s instead of FWRITEs, these carriage control codes
will be lost with a >COPY.
* If the spool file you're copying is a job $STDLIST file, the
"FOPEN" records (which usually cause form-feeds every time a
program is run) will be lost.
* And, most importantly, if your program uses "PRE-SPACING"
carriage control rather than the default "POST-SPACING" mode, the
>COPYd spool file will not reflect this.
Statistically speaking, it is this third point that usually bites
people.
The MPE default is that when you write a record with a particular
carriage control code, the data will be output first and then the
carriage control (form-feed, line-feed, no-skip, etc.) will be
performed -- this is called POST-SPACING. However, some languages
(such as COBOL or FORTRAN) tell the system to switch to PRE-SPACING
(i.e. to do the carriage control operation before outputting the
data), and it is precisely this command -- the switch-to-pre-spacing
command -- that is getting lost with a >COPY.
Thus, output that was intended to come out with pre-spacing will
instead be printed (after the >COPY) with post-spacing; needless to
say, the output will end up looking very different from what it was
supposed to look like.
What can you do about this? Well, I recommend that you take the disc
file generated by the >COPY and add one record at the very beginning
that contains only a single letter "A" in its first character. This
"A" is the carriage control code for switch-to-pre-spacing, and it is
the very code that (if I've diagnosed your problem correctly) was
dropped by the >COPY command. By re-inserting this "A" code, you
should be able to return the output to its original, intended format.
Now, this is only a guess -- I'm just suggesting that you try putting
in this "A" line and seeing if the result is any better than what you
had before. There might be other carriage control codes being lost by
the >COPY; there might be, for instance, carriage control transmitted
using FCONTROLs, or your program might even switch back and forth
between pre- and post-spacing mode (which is fairly unlikely).
However, the lost switch-to-pre-spacing is, I believe, the most common
cause of outputs mangled by the SPOOK >COPY command.
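The suggested fix (one extra record at the front, with "A" in column 1)
is easy enough to do in EDITOR, but if you do it often, it might be
scripted along these lines. This is just a Python sketch operating on
the records as a list of strings; the function name is made up, and it
assumes the carriage control codes are still sitting in column 1 as
data, as described above.

```python
def restore_prespacing(records):
    """Return the records with a leading "A" (switch-to-pre-spacing) record.

    records is the >COPYd spool file as a list of lines, carriage
    control code in the first character. If the first record is already
    a lone "A", the list is returned unchanged, so running this twice
    is harmless.
    """
    if records and records[0].rstrip() == "A":
        return list(records)
    return ["A"] + list(records)


# Made-up sample records; the first character of each is the CCTL code.
copied = ["1FIRST PAGE HEADING", " a detail line", " another detail line"]
for line in restore_prespacing(copied):
    print(line)
```

The result would still have to be printed with :FCOPY ...;CCTL, as shown
earlier.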
Robert M. Green of Robelle Consulting Ltd. once wrote an excellent
paper called "Secrets of Carriage Control" (published, among other
places, in The HP3000 Bible, available from PCI Press at
512-250-5518); it explains a lot of little-known facts about carriage
control files, and may help you write programs that don't cause
"interesting" behavior like the one you've experienced. Carriage
control is a tricky matter, and Bob Green's paper discusses it very
well.
Q: I sometimes have to restore files from my system backups, which are
usually about seven reels. If the file I want is, say, on the sixth
reel (but I don't know it), I have to mount each one of reels 1
through 6 and wait for :RESTORE to read through each one in its
entirety before it finds my file.
I know we ought to keep the full SYSDUMP listing so that we can
tell which file is on which reel, but that can be a pretty big
printout (12,000 files = 200 pages). Isn't there some way for :SYSDUMP
to put some sort of directory on the first reel that maps every file
to its reel number?
A: Well, unfortunately things aren't that simple (are they ever?).
Let's look at how :SYSDUMP (or :STORE) works.
Say that you say :STORE @.@.@. At first, :STORE has no idea how
many reels this operation will consume (who knows -- maybe you have a
10,000' reel!).
The first thing :STORE writes to tape is a "directory" containing
the names of all the files being stored. Note that so far :STORE
doesn't know which reel each file goes on, so this directory can't
contain this information.
Then, :STORE begins to write the files to tape. If you're lucky,
all the files will fit on one reel; if not, at some point :STORE will
get an "end of tape" signal from the tape hardware, and will know that
a new reel needs to be started.
Now, it would be great if :STORE could then go back to the
directory and at least mark each of the files that couldn't fit on
this reel. That way, when you mount the reel, :RESTORE can instantly
figure out if the file is on it or not.
Unfortunately, a tape drive is not a "read-write" medium. You can't
just go and, say, update the 10th record of a file that's stored on
tape. You can write a tape from scratch, or you can read it, but you
can't take an already-written tape and modify it.
Thus, we now have a reel that contains a directory with the names
of ALL the files in the fileset but only SOME of the actual files.
Only by reading all the way to the end of the reel can :RESTORE detect
that a particular file is actually on the next reel.
We're done with reel #1 -- now to reel #2. At the beginning of reel
#2, :STORE also writes out the same directory as it did on reel #1,
but also writes to the tape label THE NUMBER (relative to the start of
the fileset) OF THE FIRST FILE THAT IS ACTUALLY ON THIS TAPE. Then, it
writes out the files themselves, again hoping that all of them will
fit on this reel. If they don't fit, :STORE attempts to put them on
reel #3, reel #4, etc. In any case, the rule is that
EVERY REEL CONTAINS THE DIRECTORY OF ALL THE FILES IN THE ENTIRE
FILESET, PLUS THE NUMBER OF THE FIRST FILE ACTUALLY STORED ON THIS
REEL.
This, again, is because:
1. There's no way for :STORE to know in advance that any of the
files won't fit on the reel;
2. Once the reel is written, there's no way for :STORE to update
the directory on that reel short of re-writing the entire tape.
Thus, the bad news is that :STORE doesn't keep any table that tells
which reel each file is on. The good news is that it's still possible
to restore a file quickly even if you don't know its reel number --
JUST GO THROUGH THE REELS BACKWARDS!
That's right -- backwards. If you first mount reel #1, then reel
#2, then reel #3, etc., :RESTORE has to read each entire reel from
beginning to end, since only by reaching the end of tape marker can it
tell that the file is not on this reel. However, if you first mount
reel #7, then reel #6, then reel #5, etc., :RESTORE will only have to
go through the DIRECTORY (a rather small chunk) of each reel to find the
reel that actually has the file.
In the example shown in the picture below, say that you try to :RESTORE
file E. First you mount reel 3, whose label and directory indicate that
it contains files F and G (since its first file number is 6, and F is
the 6th file in the directory). Without
having to read through the entire reel, :RESTORE will realize that
file E is on some previous reel; it will then ask you to mount reel
#2, on which the file does indeed exist.
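The backwards search can be modeled in a few lines of Python. This is a
toy sketch, not how :RESTORE is actually coded: it assumes every reel
carries the full directory plus a "first file number" in its label (as
described above), and it ignores files that span two reels. The
function and variable names are invented for the example.

```python
def find_reel_backwards(directory, first_file_numbers, wanted):
    """Return (reel number, reels mounted), mounting the last reel first.

    directory          -- file names in the order :STORE wrote them
    first_file_numbers -- the label value of each reel, reel #1 first
    wanted             -- the file to restore
    """
    position = directory.index(wanted) + 1   # 1-based file number
    mounted = 0
    for reel in range(len(first_file_numbers), 0, -1):
        mounted += 1
        # Only the label and directory are read here, never the file data.
        if first_file_numbers[reel - 1] <= position:
            return reel, mounted
    raise ValueError("no reel label covers file " + wanted)


# The three-reel example from the picture: files A through G.
directory = ["A", "B", "C", "D", "E", "F", "G"]
labels = [1, 4, 6]
print(find_reel_backwards(directory, labels, "E"))
```

For file E this mounts reel 3 and then reel 2, reading only the small
label-and-directory portion of each before reaching the reel that holds
the file.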
Of course, you might notice that :STORE *could* have put the reel
numbers INTO THE DIRECTORY OF THE LAST REEL, so that simply mounting
the last reel would automatically tell you exactly which reel number
the file you want is on. (By the time the last reel is generated,
:STORE naturally knows onto which reels all of the other files were
written.)
Unfortunately, the authors of MPE didn't choose to do this, much to
our disadvantage. What most likely happened is that they didn't think
of it on the first version of the system, and when they realized
afterwards that this would have been a good thing, it was too late.
:STORE/:RESTORE faces a greater burden of compatibility than other
aspects of MPE -- not only do tapes generated by OLD versions of
:STORE need to be readable by NEW versions of :RESTORE, but tapes
generated by NEW versions of :STORE must also be readable by OLD
versions of :RESTORE. Once HP put out one version of the system with a
directory format that had no room for the reel number, it was stuck.
:FILE T;DEV=TAPE
:STORE A,B,C,D,E,F,G;*T
Reel #1:
-------------------------------------
| tape label: first file number = 1 |
-------------------------------------
| directory: A B C D E F G |
-------------------------------------
| data of file A |
-------------------------------------
| data of file B |
-------------------------------------
| data of file C |
-------------------------------------
Reel #2:
-------------------------------------
| tape label: first file number = 4 |
-------------------------------------
| directory: A B C D E F G |
-------------------------------------
| data of file D |
-------------------------------------
| data of file E |
-------------------------------------
Reel #3:
-------------------------------------
| tape label: first file number = 6 |
-------------------------------------
| directory: A B C D E F G |
-------------------------------------
| data of file F |
-------------------------------------
| data of file G |
-------------------------------------
Q: I know that many commands like :EDITOR, :SYSDUMP, :COBOL, etc.
actually run program files in PUB.SYS called EDITOR.PUB.SYS,
SYSDUMP.PUB.SYS, COBOL.PUB.SYS, etc. What about :SEGMENTER? Obviously
there can't be a program SEGMENTER.PUB.SYS (file name too long) -- I've
heard somebody say that the :SEGMENTER program file is actually
SEGDVR.PUB.SYS, but when I do a :LISTF on it I see it has only 27
records! Could that be?
Also, :SEGMENTER has a -PREP command in it that seems just like the
MPE :PREP command -- does :SEGMENTER call MPE to execute the -PREP
command? I tried using the COMMAND intrinsic to do a :PREP from my
program, but I got a CI error 12, "COMMAND NOT PROGRAMATICALLY
EXECUTABLE". Perhaps it's MPE's :PREP command that calls SEGMENTER's
-PREP (I can't believe that HP would have written the same code in two
places); if so, can :ALLOCATEing SEGDVR.PUB.SYS make my :PREPs faster?
A: Consider, if you will: A humble HP3000 user trying to fathom the
intricacies of the MPE SEGMENTER. He believes that typing an MPE
command gets him into SEGDVR.PUB.SYS, but, in reality, that innocent
"-" prompt is just the first signpost of
THE TWILIGHT ZONE!
OK, let's try to straighten all this out.
First of all, typing :SEGMENTER is essentially equivalent to just
saying
:RUN SEGDVR.PUB.SYS
(just like :EDITOR = :RUN EDITOR.PUB.SYS). You're right so far.
Next, as you may have guessed, SEGDVR, with its 27 records, just
doesn't have what it takes to execute all the thirty-odd commands that
the SEGMENTER subsystem supports, from -PREP to -ADDSL to -COPY. Much
as I dislike judging a program by how big it is, 2500 or so words of
code aren't enough to do all that.
All that SEGDVR actually does is PARSE the commands typed at the
"-" prompt. Once it determines what command was typed and what the
parameters were, it passes it to a system-internal procedure called
SEGMENTER (not the command, but a procedure in the system SL). MPE's
:PREP, incidentally, also calls the SEGMENTER procedure, with exactly
the same parameters as :SEGMENTER's -PREP.
So, now things are simple. :SEGMENTER is just a parser -- the code
to actually do all the dirty work is in the system SL, right? Wrong.
The SEGMENTER procedure itself is only about 200-odd instructions.
All that it does is take its parameters -- things like the
command number, the procedure/segment being manipulated, the -PREP
capabilities, maxdata, etc. -- and stick them into a buffer which it
then sends (using the SENDMAIL intrinsic) to a program called
SEGPROC.PUB.SYS. The first command you execute from :SEGMENTER causes
this process to be created as a son of SEGDVR.PUB.SYS (that's why the
first command you type in :SEGMENTER is always slower than the
subsequent ones); similarly, when you do a :PREP in MPE,
SEGPROC.PUB.SYS is created as a son process of the CI. In either case,
the SEGMENTER procedure is used to communicate with the SEGPROC
process using the SENDMAIL and RECEIVEMAIL intrinsics.
Thus, if you want to make sure that your :PREPs go as fast as
possible, you should :ALLOCATE SEGPROC.PUB.SYS. To speed up your
:SEGMENTERs, you should :ALLOCATE both SEGPROC.PUB.SYS and
SEGDVR.PUB.SYS.
Finally, as you pointed out, although you can easily do a
programmatic :RUN (using the CREATEPROCESS intrinsic) and a
programmatic compile (by CREATEPROCESSing the appropriate compiler in
PUB.SYS), it's harder to do a programmatic :PREP. My suggestion would
be to put the PREP command into a file and then CREATEPROCESS
SEGDVR.PUB.SYS with its $STDIN redirected to that file. Alternatively,
you might call the SEGMENTER procedure directly (or even create
SEGPROC.PUB.SYS as a son process and communicate with it using
SENDMAIL and RECEIVEMAIL), but neither of these approaches is
documented by HP, and might change without notice.
   :SEGMENTER                      :PREP
       |                            /
       v                           /
 SEGDVR.PUB.SYS                   /
 (just a parser)                 /
          \                     /
           \                   /
            ---------\  /------
                      \/
                      v
        SEGMENTER procedure (system SL)
                      |
                      |
          (via SENDMAIL/RECEIVEMAIL)
                      |
                      |
                      v
               SEGPROC.PUB.SYS
               (the workhorse)
Q: Today I came across a strange phenomenon upon which you may be able
to shed some light.
I have two files on disc which contain identical data (I checked by
running FCOPY on them with the COMPARE option), so I would assume that
they would occupy the same number of sectors on disc. But I was WRONG!
As you can see from the following :LISTF, the "vital statistics" of
the files are the same:
FILENAME  ------------LOGICAL RECORD-----------  ----SPACE----
            SIZE  TYP        EOF      LIMIT R/B  SECTORS #X MX
TN3WS       140B  FA        3000       6000  20     3311  8  8
TN3WS86     140B  FA        3000       6000  20     1672  4  8
but the numbers of sectors are different!
What's up?
A: The point that you raise is a good one. Clearly the system needs
only 4 extents for your 3000 records; why is it that one of your files
has all 8 extents allocated?
There are two possible reasons for this:
* In the :BUILD command (and in the equivalent options of the FOPEN
intrinsic), you can specify both the MAXIMUM NUMBER OF EXTENTS A
FILE SHOULD HAVE (in your case, it's 8 for both files) and HOW
MANY OF THOSE EXTENTS ARE TO BE ALLOCATED INITIALLY. You might
say, for instance
:BUILD X;DISC=100,10,3
and have a file that looks like:
FILENAME  ------------LOGICAL RECORD-----------  ----SPACE----
            SIZE  TYP        EOF      LIMIT R/B  SECTORS #X MX
X           128W  FB           0        100   1       33  3 10
Although all the file HAS to have is 1 extent (to fit the file
label), your :BUILD command requested that 3 extents be initially
allocated. Similarly, saying
:BUILD TN3WS;REC=-140,20,F,ASCII;DISC=6000,8,8
will build the file
FILENAME  ------------LOGICAL RECORD-----------  ----SPACE----
            SIZE  TYP        EOF      LIMIT R/B  SECTORS #X MX
TN3WS       140B  FA           0       6000  20     3311  8  8
which has all 8 extents allocated.
Why would you want to initially allocate more extents than
absolutely necessary? Well, if you only allocate one extent to
start with, then as you're writing to the file, new extents will
be allocated as necessary. If you run out of disc space when an
FWRITE operation requires a new extent to be allocated, the
FWRITE will fail -- thus, it would be possible for your program
to fail halfway through its execution, having written 3000 of
6000 records but having no room on disc to write the remaining
3000. If you allocate all the extents initially, then any "OUT OF
DISC SPACE" condition will be reported when you :BUILD the file;
once the file is built, you're guaranteed that all 6000 of your
FWRITEs will succeed.
* The other reason why a file would have more extents than its EOF
seems to indicate is that although new extents are allocated when
needed, they are NOT deallocated when they're no longer
necessary.
Say that you fill up your 6000-record file, causing all 8 extents
to be allocated; then, you open it with OUT access, wiping out
all the file's records. The no-longer-needed 7 extents are not
deallocated by this operation. Thus, your file might have once
been full; then someone opened it with OUT access (throwing away
all the records in the file, but not throwing away the extents)
and wrote 3000 new records to it. The file's number of extents
thus reflects the maximum number that was ever needed for this
file, even though that many may no longer be needed now.
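Both effects can be seen in a small model. The following Python sketch is not MPE code. It assumes, for simplicity, that every extent holds the same number of records (the file limit divided by the maximum number of extents, rounded up), and it ignores the file label and blocking. The class and method names are invented for illustration.

```python
import math


class ExtentModel:
    """Toy model of MPE extent allocation (names are illustrative only)."""

    def __init__(self, limit, max_extents, initial_extents=1):
        self.limit = limit
        self.max_extents = max_extents
        # Assumption: all extents hold the same number of records.
        self.records_per_extent = math.ceil(limit / max_extents)
        self.allocated = initial_extents
        self.eof = 0

    def extents_needed(self, records):
        # Even an empty file needs one extent.
        return max(1, math.ceil(records / self.records_per_extent))

    def write(self, count):
        if self.eof + count > self.limit:
            raise ValueError("write would exceed the file limit")
        self.eof += count
        # Extents are allocated on demand as the EOF grows...
        self.allocated = max(self.allocated, self.extents_needed(self.eof))

    def open_out(self):
        # ...but OUT access only resets the EOF; extents stay allocated.
        self.eof = 0


f = ExtentModel(limit=6000, max_extents=8)
f.write(6000)      # fills the file: all 8 extents get allocated
f.open_out()       # records thrown away, extents kept
f.write(3000)      # 3000 records would fit in 4 extents
print(f.eof, f.allocated, f.extents_needed(f.eof))
```

In this model the file ends up holding 3000 records with all 8 extents still allocated, although 4 would be enough.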
Well, those are the two most likely explanations for you. If you want
to do something about the wasted space occupied by the empty extents,
you might rename the file, :FCOPY it, and then purge the old copy. The
new file will be built by :FCOPY to have only one extent to start
with; then, new extents will be allocated as needed, but never more
than necessary. Alternatively, if you use VESOFT's MPEX/3000, you
could just use the %ALTFILE command:
%ALTFILE TN3WS, EXTENTS=8
which will automatically rebuild the file, freeing all the extents
that are allocated but not used.
Q: My program DBOPENs a database that I know may be redirected with a
:FILE equation. How can I find out the TRUE filename of the database?
In other words, if my program opens database MYDB, and the user types
:FILE MYDB=SUPERB.TEST.PROD
how can my program figure out that it really is SUPERB.TEST.PROD that
it's accessing?
A: There are several ways of finding the fully qualified name of the
root file of the actual database being accessed by your program.
One solution -- the non-privileged one -- rests on the principle
that if there's a :FILE equation for your database root file, you can
get at it using LISTEQ2 (on MPE IV), LISTEQ5 (on MPE V), or the MPE
:LISTEQ command (which, I believe, exists starting with about T-MIT).
You can, for instance, use the COMMAND intrinsic to do a :LISTEQ into
a disc file, and then scan the disc file for a :FILE command that
applies to your database.
A second solution, which requires privileged mode, is based on the
fact that a root file is an MPE file that can be FOPENed just like any
other MPE file. To open a database file, all you need to do is:
* Be in privileged mode when you FOPEN it (AND when you do any
other file system operation, such as an FGETINFO or FCLOSE
against it).
* Specify the exact filecode of the file to be opened (for root
files, this is -400) in the FOPEN call.
Thus -- in SPL terminology -- you can say something like this:
GETPRIVMODE;
FNUM:=FOPEN (DBFILENAME, 1 << old >>,, ,,, ,,, ,,, -400);
IF FNUM<>0 THEN
BEGIN << successful FOPEN >>
FGETINFO (FNUM, REAL'FILENAME);
FCLOSE (FNUM, 0, 0);
END;
GETUSERMODE;
As you see, you FOPEN the root file using whatever filename your
program normally uses and then use FGETINFO to return you the REAL
filename of the file (fully-qualified, of course).
The advantage of this approach is that it's easier than doing
:LISTEQ into a file (or, on pre-T-MIT systems, running LISTEQ2 or
LISTEQ5 with output redirected) and then parsing the file. The
disadvantage is, of course, that it's privileged, and also that
THE USER CAN ISSUE A :FILE EQUATION SUCH AS
:FILE MYDB;DEL
THAT WILL CAUSE YOUR PROGRAM TO PURGE THE ROOT FILE!
The FOPEN will open the file MYDB but then fall under the influence of
the user's file equation, which will cause the FCLOSE to purge the
file! DBOPEN won't let the user do that (because it uses a system
internal privileged procedure to see exactly what parameters were
specified on the :FILE equation), but your ordinary unprotected FOPEN
won't.
Finally, there is a third solution, also privileged. Whenever you
DBOPEN a database, the root file is FOPENed on your behalf by the file
system. For any FOPENed file, you can call FGETINFO against its file
number and get its true filename (although for a privileged file, you
have to be in privileged mode to call FGETINFO). The trouble is that
to do this, you have to know the root file's file number, which DBOPEN
does not return to you. DBOPEN does set the first two bytes of the
database name to be the "database id", but that isn't related to the
file number.
What you can do, though, is scan ALL THE FILES THAT YOUR PROCESS
HAS OPENED until you come across one whose filecode is -400 (the code
of a root file). Then -- provided that your process has only opened
one database -- that will be the file number of the opened database's
root file.
Your program will then look something like this:
FOR FNUM:=3 << min file# >> UNTIL 255 << max file# >> DO
BEGIN
GETPRIVMODE; << FNUM may be privileged >>
FGETINFO (FNUM, FILENAME,,,,,,, FILECODE);
IF = << FNUM is a valid file number >> AND
FILECODE=-400 << A root file >> THEN
BEGIN
GETUSERMODE;
<< we have the right filename >>
...
END;
GETUSERMODE;
END;
As you see, you go through all valid file numbers, from 3 (the lowest
file number you can have) to 255 (the highest); if the file number is
invalid, FGETINFO will set the condition code to < -- if it's valid,
it will set the condition code to =, and then you can check the file
code to see if it's an IMAGE root file.
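The logic of the scan can also be modelled outside MPE. In the Python sketch below, the dictionary `open_files` stands in for the process's open files, and `find_root_files` is a hypothetical helper, not an MPE intrinsic; a missing key plays the role of FGETINFO returning condition code <. The file names and the second file code are made up for the example. Collecting every match, rather than stopping at the first, makes it easy to notice when more than one database is open and the answer is ambiguous.

```python
ROOT_FILE_CODE = -400          # filecode of an IMAGE root file
MIN_FNUM, MAX_FNUM = 3, 255    # range of file numbers scanned in the text


def find_root_files(open_files):
    """Return (file number, filename) for every open root file.

    open_files maps file number -> (filename, filecode); it is a
    stand-in for calling FGETINFO on each file number.
    """
    found = []
    for fnum in range(MIN_FNUM, MAX_FNUM + 1):
        info = open_files.get(fnum)
        if info is None:       # invalid file number: skip it
            continue
        filename, filecode = info
        if filecode == ROOT_FILE_CODE:
            found.append((fnum, filename))
    return found


# Illustrative data only.
open_files = {
    3: ("REPORT.PUB.PROD", 0),
    7: ("SUPERB.TEST.PROD", -400),
    8: ("SUPERB01.TEST.PROD", -401),   # some other privileged file
}
matches = find_root_files(open_files)
if len(matches) == 1:
    print("root file:", matches[0][1])
else:
    print("no database open, or more than one; cannot tell which")
```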
All things considered, the first approach (LISTEQ into a disc file)
is the most general (though quite cumbersome) -- it doesn't endanger
your database at all, and it works regardless of how many databases
you have open. The third approach (FGETINFOing all the file numbers)
works only if you have one database open. I would recommend against
the second approach (FOPENing the root file yourself), since then --
by accident or by design -- the root file could be injured or deleted.
Q: Help! I have several files that I just can't get at. I try to
copy them, :PURGE them, FOPEN them, and what-not; I always get
prompted for the lockword. If I go into LISTDIR5 and do a >LISTF
;PASS, LISTDIR5 shows them as having lockwords of "/"; however, if I
specify a "/" in response to the lockword prompt, MPE gives me a
LOCKWORD VIOLATION.
I thought that lockwords always had to be alphanumeric. How did
my file get this "/" lockword? How can I remove it?
A: You've fallen victim to a somewhat unusual MPE file system bug
that I've encountered a couple of times in the past. If you call the
FRENAME intrinsic and pass it a filename of, say, "NEWNAME/" instead
of just "NEWNAME", the FRENAME intrinsic creates the new file not
with an empty lockword, but with a lockword of "/". This will only
happen if you use the FRENAME intrinsic -- MPE's :RENAME command will
not allow you to specify a filename with a "/" but no lockword.
To open the file, you'll have to FOPEN it with a filename of
"NEWNAME/" -- again, a slash but no lockword. As before, MPE won't
let you specify such a filename, so you can't just say ":PURGE
NEWNAME/"; however, you can say
:FILE SALVAGE=NEWNAME/
and then say
:PURGE *SALVAGE
or
:RENAME *SALVAGE,NEWNAME
So, a strange bug but a simple workaround. Oddly enough, though,
there IS a very good use for a filename with a slash but no lockword.
Say you want to FOPEN a file but you DON'T want to prompt for a
lockword (perhaps your terminal is in block mode and a prompt would
only mess things up). If the file has a lockword, you'd much rather
have the system immediately return an FSERR 92 (lockword violation)
and not output anything to the terminal. To do this, just call FOPEN
with a filename of "FILE/.GROUP.ACCT". If the file has no lockword
(or a lockword of "/" caused by the bug mentioned above), the system
will open it; if the file does have a lockword, the system will
return an error but won't prompt the user for the lockword. You can
then, for instance, prompt for the lockword yourself using V/3000, or
perhaps look up the lockword in some data file or something.
Thus, to summarize:
                   File has      File has      File has bad ("/")
                   no lockword   lockword      lockword
                                 "XXX"
FOPEN A.B.C        OK            Prompts       Prompts, but any answer
                                               is rejected
FOPEN A/XXX.B.C    Error 92      OK            Error 92
FOPEN A/.B.C       OK            Error 92      OK
Opening a file with a "/" but no lockword is useful both for
opening files with "/" lockwords (caused by the MPE bug) and for
avoiding lockword prompts for files that have normal lockwords.
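For readers who prefer code to tables, the summary can be written as a small decision function. This Python sketch only restates the table; `fopen_outcome` and its argument conventions are invented here and have nothing to do with the real FOPEN parameter list.

```python
def fopen_outcome(supplied, actual):
    """Model the outcome of FOPEN according to the summary table.

    supplied: None for "A.B.C" (no slash), "" for "A/.B.C" (slash but
              no lockword), or the lockword given after the slash.
    actual:   None if the file has no lockword, "/" for the bad
              lockword, otherwise the file's real lockword.
    """
    if supplied is None:                    # FOPEN A.B.C
        if actual is None:
            return "OK"
        if actual == "/":
            return "Prompts, but any answer is rejected"
        return "Prompts"
    if supplied == "":                      # FOPEN A/.B.C
        return "OK" if actual in (None, "/") else "Error 92"
    # FOPEN A/XXX.B.C
    return "OK" if supplied == actual else "Error 92"


# Print the table row by row: no slash, "/XXX", then slash only.
for supplied in (None, "XXX", ""):
    print([fopen_outcome(supplied, actual) for actual in (None, "XXX", "/")])
```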
Q: What's the difference between $CONTROL PRIVILEGED, OPTION
PRIVILEGED, and OPTION UNCALLABLE? I always thought that $CONTROL
PRIVILEGED would make all the procedures in my program run in PM, but
it seems that it doesn't. What's going on?
A: First of all: what are OPTION PRIVILEGED and OPTION UNCALLABLE?
These are both SPL keywords that are placed in a procedure header,
and both of them have to do with PM capability; however, there is a
world of difference between them. Consider the following SPL
procedure:
PROCEDURE P;
OPTION UNCALLABLE;
BEGIN
...
END;
This simply means that ANYBODY WHO CALLS P MUST CALL IT FROM WITHIN
PRIVILEGED MODE. P is "UNCALLABLE" from normal user mode.
This can apply to procedures inside a program file but is more
often used in SL procedures. For instance, the system SL contains a
procedure called SUDDENDEATH, which deliberately causes a system failure.
Now the very point of system SL procedures is that they can be called
from user programs -- DBOPEN, FREAD, PRINT, etc. are all system SL
procedures; however, we certainly don't want user programs to call
the SUDDENDEATH SL procedure. Since SUDDENDEATH is declared as
OPTION UNCALLABLE, only code that is running in privileged mode --
whether in the system SL or in a program file -- can call it. Any
other calls would cause the calling program to abort with a PROGRAM
ERROR #17: STT UNCALLABLE.
OPTION PRIVILEGED, however, does almost the opposite. Instead of
indicating that the CALLER of this procedure must be privileged, it
indicates that the procedure itself is to always execute in
privileged mode, REGARDLESS OF WHETHER OR NOT ITS CALLER IS
PRIVILEGED. Thus, saying:
PROCEDURE P;
OPTION PRIVILEGED;
BEGIN
...
END;
means that regardless of whether P's caller is in privileged mode or
not, P's code will always execute in PM. What's more, MPE doesn't
really keep the PRIVILEGED/non-PRIVILEGED information on a
procedure-by-procedure basis (although the UNCALLABLE/non-UNCALLABLE
information IS kept for each procedure). An entire SEGMENT is either
permanently privileged (all its code executes in privileged mode) or
non-privileged (all its code executes in whatever mode its caller was
in). An OPTION PRIVILEGED procedure "taints" the entire segment it's
in, causing all procedures in the segment to always execute in
privileged mode (in which no bounds checking is done and system
failures are easy to cause).
At this point, we might also mention that what we're talking about
here is PERMANENTLY PRIVILEGED mode. Code that calls GETPRIVMODE and
GETUSERMODE WILL ONLY BE IN PRIVILEGED MODE BETWEEN THE "GETPRIVMODE"
AND THE "GETUSERMODE" CALLS. However, if we use OPTION PRIVILEGED
(or $CONTROL PRIVILEGED) instead of GETPRIVMODE/GETUSERMODE calls (as
is usually done for SL procedures), then entire segments will be
marked as permanently privileged.
$CONTROL PRIVILEGED is exactly like an OPTION PRIVILEGED for the
program's "outer block". Say that you have a program that has no
procedures in it -- just the main body. $CONTROL PRIVILEGED will
indicate that the main body itself will be permanently privileged,
just as a procedure's OPTION PRIVILEGED indicates that the procedure
will be permanently privileged. If your program has several
procedures in several segments, a $CONTROL PRIVILEGED will indicate
that THE SEGMENT THAT CONTAINS THE PROGRAM'S OUTER BLOCK is
permanently privileged; other segments need not be unless they have
OPTION PRIVILEGED procedures in them.
A $CONTROL PRIVILEGED thus does NOT apply to the entire source
file; it only applies to the outer block and, by extension, to the
segment that contains the outer block.
So, the important distinctions to remember are:
* OPTION PRIVILEGED (procedure executes in priv mode) vs. OPTION
UNCALLABLE (procedure can only be called from priv mode);
* permanently privileged (where the entire segment is always
privileged) vs. temporarily privileged (where you're only in
priv mode between the GETPRIVMODE call and the GETUSERMODE
call);
* and $CONTROL PRIVILEGED (which indicates that the segment
containing the outer block is privileged) vs. OPTION PRIVILEGED
(which indicates that the segment containing this procedure is
privileged).
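These rules are easy to mix up, so here is a small Python model of them. It is only a sketch of the rules as stated in this answer: the procedure and segment names are made up, and the three functions are hypothetical helpers, not anything MPE or the SPL compiler provides.

```python
def privileged_segments(procedures, outer_segment, control_privileged):
    """Return the set of permanently privileged segments.

    procedures: list of (name, segment, options) tuples, where options
                is a set such as {"PRIVILEGED"} or {"UNCALLABLE"}.
    """
    segments = set()
    if control_privileged:
        # $CONTROL PRIVILEGED marks only the outer block's segment.
        segments.add(outer_segment)
    for _name, segment, options in procedures:
        if "PRIVILEGED" in options:
            # One OPTION PRIVILEGED procedure taints its whole segment.
            segments.add(segment)
    return segments


def runs_privileged(segment, priv_segments, caller_privileged):
    # A non-privileged segment runs in whatever mode its caller is in.
    return segment in priv_segments or caller_privileged


def may_call(options, caller_privileged):
    # UNCALLABLE is tracked per procedure and checks the caller's mode.
    return caller_privileged or "UNCALLABLE" not in options


procedures = [
    ("OPENIT", "SEG1", {"PRIVILEGED"}),
    ("HELPER", "SEG1", set()),
    ("REPORT", "SEG2", set()),
    ("DANGER", "SEG2", {"UNCALLABLE"}),
]
priv = privileged_segments(procedures, "MAINSEG", control_privileged=True)
print(sorted(priv))
for name, segment, options in procedures:
    print(name,
          runs_privileged(segment, priv, caller_privileged=False),
          may_call(options, caller_privileged=False))
```

Note that HELPER, which has no options of its own, still runs privileged because it shares SEG1 with OPENIT, while REPORT in SEG2 is unaffected by the $CONTROL PRIVILEGED on the outer block.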
Finally, one important note: the Intrinsics Manual describes two
intrinsics (GETPRIVMODE and SWITCHDB) as "O-P", or "option
privileged" (see p. 2-1 of the FEB 1986 edition). This does NOT
really mean "option privileged"; most intrinsics (including FOPEN,
QUIT, etc.) are declared as OPTION PRIVILEGED because they must
execute in privileged mode. Rather, the Intrinsics Manual's "option
privileged" means two things:
- for SWITCHDB, it really means "OPTION UNCALLABLE"; if you try to
call SWITCHDB from user mode, your program will abort with an
STT UNCALLABLE error;
- for GETPRIVMODE, it really means neither OPTION UNCALLABLE nor
OPTION PRIVILEGED; rather, it means that the calling program
must have been :PREP'ED WITH CAP=PM, regardless of whether or
not it is actually in privileged mode at the time of the call
(which is the distinction that OPTION UNCALLABLE uses).
Keep this in mind and be aware that neither of these definitions is
the true meaning of OPTION PRIVILEGED.
Q: What is the "COLD LOAD ID" number that LISTDIR5 outputs with its
LISTF command? It seems to be the same as the "cold load identity
field" in the :LISTF ,-1 output, but I can't for the life of me figure
out what it means.
A: A file's file label contains a lot of data about the file -- the
file's name, its filecode, the locations of its extents on disc, etc.
As a general rule, pretty much everything that FOPEN might need to
know about a disc file before opening it is kept in its file label.
Say that you FOPEN a file for exclusive access. All subsequent
attempts to FOPEN the file should fail, since you have it open
exclusively -- but how is the system going to know this? Well,
whenever a file is accessed in any way (by FOPEN, by :RUN, by :STORE,
or by :RESTORE), the appropriate bit in the file label is set
indicating the type of access. Subsequent access attempts will look at
those file label bits to see if the file is being accessed in an
incompatible way. This is what prevents concurrent exclusive FOPENs,
:PURGEs of running programs, :STOREs of files that are being modified,
and so on. There are in fact several such fields in the file label --
fields that are used to indicate the way in which a file is being
accessed:
* The store bit, restore bit, load bit, exclusive bit, read bit,
and write bit (word 28);
* And, the so-called FCB vector (words 32-33) that indicates where
an FOPENed file's control block is located in memory -- this way,
two people who FOPEN the same file can easily share its file
control block.
Now, so far, all this has nothing to do with the coldload id.
However, ask yourself this: What if the system goes down while a file
is being accessed?
The file's access bits and FCB vector are still set to indicate the
file is in use; when you reboot the system, the file will apparently
be in the process of being accessed. Further attempts to FOPEN the
file exclusively will fail, shared FOPENs will try to share the FCB
pointed to by the (now entirely invalid) FCB vector, and so on.
Once upon a time there was a bug like this in IMAGE -- the IMAGE
root file had its Data Base Control Block's data segment number stored
in one of its records (to make it easy for subsequent openers of an
IMAGE database to find the DBCB and share it); when the system
crashed and was then rebooted, IMAGE still thought that the DBCB was
stored in that data segment, which now didn't exist or contained
entirely different data. Naturally, this caused a good deal of grief.
Therefore, there must be some way for the system to determine
whether the "transitory" data in the file label -- data that is set
when a file is open and is supposed to be reset when the file is
closed -- is really correct or just a carry-over from a previous
"incarnation" of the system. Since by the very definition of a system
failure the system can't close the files normally (as opposed to a
normal =SHUTDOWN, which will normally close the files), there must be
some other solution.
One idea that comes to mind is to, at system startup time, go
through all the file labels in the system and reset their "transitory"
data. Unfortunately, if a system has 30,000 files and can perform
about 30 disc I/O's per second, this would take 1000 seconds or
somewhat over 15 minutes. Not a totally unreasonable amount of time,
but not very quick, either. When a system has just crashed and you're
rebooting it, what you care about most is getting it back up as
quickly as possible, and not taking time going through the labels of
files which, for the most part, may have not been accessed in months.
This is where the cold load id comes in. It's not really a cold
load id, but rather a "restart id"; whenever a system is rebooted
(with a warmstart, coolstart, cold load, update, or reload), this
number is incremented; whenever a file is opened, the current cold
load id is put into the file's file label.
This way, whenever you FOPEN a file, the file system has a very
simple way to see if the transitory data in the file label is valid --
if the current system cold load id is equal to the cold load id in the
file label, then the data is valid; if it is different, this means the
file has not been accessed at all since the last system re-start, and
even if the file label indicates that the file is in use, this is only
a carry-over from a previous system incarnation which crashed while
the file was open.
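The check is simple enough to model in a few lines. The Python sketch below is a deliberately simplified picture, not the real file label layout: `FileLabel` keeps just one access bit, and `fopen_exclusive` is a made-up stand-in for what the file system does on an exclusive FOPEN.

```python
from dataclasses import dataclass


@dataclass
class FileLabel:
    cold_load_id: int = 0
    exclusive: bool = False    # one of the "transitory" access bits


def fopen_exclusive(label, system_cold_load_id):
    """Try to open the file exclusively; return True on success."""
    if label.cold_load_id != system_cold_load_id:
        # The label was last touched under a previous incarnation of
        # the system, so its access bits are a stale carry-over.
        label.exclusive = False
    if label.exclusive:
        return False           # genuinely in use since the last restart
    label.exclusive = True
    label.cold_load_id = system_cold_load_id
    return True


label = FileLabel()
print(fopen_exclusive(label, 41))   # first opener gets the file
print(fopen_exclusive(label, 41))   # second opener is refused
# The system crashes with the file open and is restarted with id 42.
print(fopen_exclusive(label, 42))   # the stale bit is ignored
```

No pass over all the file labels is needed at startup; each label is corrected lazily, the first time the file is opened after the restart.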
Thus, what you need to know about a cold load id is:
1) You should never really have to care about it, since everything
the system does with it is behind your back;
2) It is used for making sure that, when the system crashes with a
file open, subsequent incarnations of the system will not think
the file still open because of the access bits that may still be
set in the file label;
3) It should properly be called not a "cold load id" but a "restart
id", since it's incremented at every system re-boot, from a warmstart
on up.
Actually, there are circumstances in which you may be able to use
cold load id's yourself. Say that you have a program that keeps some
data in a file while the file is opened -- for instance, you want to
make it easy for people to see who's doing what to the file, so every
time your program opens the file, it writes the opening user's name
into some special place, and every time it closes the file, it deletes
the user's name.
What happens when the system crashes and is then re-booted? Why,
the file contains the names of all those people who were using the
file when the system crashed but are obviously no longer using it now.
You need some way of figuring out whether this data was written to the
file before or after the last system failure.
What you can do is, every time you put some of this "transitory
data" (in this case, the file user's name) into the file, also put in
the current cold load id. Then, when you look at this data, you can
check the current cold load id against the one stored in the file --
if it doesn't match, this means that the transitory data in the file
was written before the last system re-start and is thus obsolete.
The only other thing you need to know is how to get the current
system cold load id. There's no intrinsic that gets you this
information, but there is an (undocumented) non-privileged procedure
that can get it for you:
INTEGER PROCEDURE SYSGLOB (WORDNUM);
VALUE WORDNUM;
INTEGER WORDNUM;
OPTION EXTERNAL;
This procedure lets you get an arbitrary value from the system's
"SYSGLOB" area, and it so happens that word %75 (decimal 61) is the
cold load id. Thus, just saying (in SPL)
INTEGER PROCEDURE SYSGLOB (W); VALUE W; INTEGER W; OPTION EXTERNAL;
...
COLDLOADID:=SYSGLOB(%75);
or (in FORTRAN)
INTEGER SYSGLOB
...
ICOLDLOADID = SYSGLOB (\%75\)
or (in COBOLII)
CALL "SYSGLOB" USING \%75\ GIVING COLD-LOAD-ID.
will get you the cold-load id.
The following is a re-run of a Q&A that appeared in the March 1987
issue of Interact. I'd like to print it again because it describes a
rather mysterious-seeming situation that is all too easy to get into;
in fact, recently I've had some people call me and ask me about this
very problem.
Q: On several occasions I've had very difficult problems caused by
"garbage" characters (escapes, nulls, control characters, etc.).
Sometimes they're in my source files; sometimes they're in my data
files; in any case, they're very hard to see at first glance and even
DISPLAY FUNCTIONS doesn't show some of them (e.g. nulls).
I'd like to have a job stream that goes through a few of my most
important files and checks to see if there are any garbage characters
inside them. How can I -- preferably without writing a special
program -- have a job stream check a file for garbage characters?
A: This is a classic "MPE PROGRAMMING" question -- how to do a
seemingly complicated task without having to write a special program
to do it. As always, MPE programming problems require a good deal of
ingenuity and, to be frank, a rather twisted way of looking at
a problem.
As I'm sure you're aware, :FCOPY has an option called ;CHAR that
outputs the contents of the ;FROM= file REPLACING ALL GARBAGE
CHARACTERS BY DOTS. For instance, if your ;FROM= file line is "AB"
followed by a null (ascii 0) followed by "CD", then an
:FCOPY FROM=MYFILE;TO;CHAR
will output the line as
RECORD 0 (%0, #0)
00000: AB.CD
The null character was replaced by a ".".
Now, FCOPY may have this option, but of what good is it to us? If
FCOPY had, for instance, set some JCW to 1 if at least one garbage
character was replaced by a "." and to 0 if none were, then we'd be
home free -- all we'd have to do is do the :FCOPY ;CHAR, check the
JCW, and that's that.
Unfortunately, FCOPY has nothing like this; it just takes the
;FROM= file and copies it (with whatever transformations are
appropriate) to the ;TO= file.
Here is where we have to be tricky. First of all, we have to
remember that FCOPY has another (less well-known) parameter called
;NORECNUM. This merely says that (when you specify ;CHAR, ;OCTAL, or
;HEX) the output should NOT contain the "RECORD 0" or "00000:"
headers. Thus, an
:FCOPY FROM=MYFILE;TO;CHAR;NORECNUM
will output the line
AB.CD
without any record numbering information.
So, this is what we do. First we say:
:BUILD MYFILENG;REC=<<must be the same as MYFILE>>
:FCOPY FROM=MYFILE;TO=MYFILENG;CHAR;NORECNUM
"MYFILENG" is now EXACTLY the same as MYFILE except that all garbage
characters have been replaced with "."s. Now, we say
:FCOPY FROM=MYFILE;TO=MYFILENG;COMPARE
This will now compare MYFILE and MYFILENG to see if there are ANY
DIFFERENCES between the two files. Obviously, these differences will
ONLY exist if there were some garbage characters in MYFILE (since all
non-garbage characters are copied exactly). Fortunately, if the
:FCOPY ;COMPARE finds at least one difference, it sets JCW to be
equal to FATAL (in batch). Therefore, we can say:
:JOB FINDGARB,ROGER.DEV;OUTCLASS=,1
...
:BUILD MYFILENG;REC=<<must be the same as MYFILE>>
:FCOPY FROM=MYFILE;TO=MYFILENG;CHAR;NORECNUM
:SETJCW JCW=0
:FCOPY FROM=MYFILE;TO=MYFILENG;COMPARE
:IF JCW<>0 THEN
: TELL ROGER.DEV !!! MYFILE has garbage characters !!!
:ENDIF
:PURGE MYFILENG
:EOJ
In fact, by looking at the job stream $STDLIST, you can even find
out in what record and at what word in the record the first
discrepancy (i.e. the first garbage character) occurred -- FCOPY
prints this information for you. Note, however, that you shouldn't
try to find all the garbage characters by doing, say, an ":FCOPY
;COMPARE=100". If you specify a maximum number of compare errors
(100 in this case), FCOPY won't actually set JCW unless at least that
many errors were found.
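If it helps to see the trick spelled out, here is the same "copy with dots, then compare" idea as a short Python sketch. It is not a substitute for the job stream and does not call FCOPY; it treats any byte with a value of 0 to 31 or 127 to 255 as garbage, and it reports a byte offset within the record rather than the word number that FCOPY prints. All function names are invented for the example.

```python
def is_garbage(byte):
    # Garbage here means a byte value of 0 to 31, or 127 to 255.
    return byte < 32 or byte > 126


def char_copy(record):
    """Replace every garbage byte with '.', as ;CHAR is described to do."""
    return bytes(ord(".") if is_garbage(b) else b for b in record)


def first_garbage(records):
    """Return (record number, byte offset) of the first difference
    between a record and its dotted copy, or None if all records match."""
    for recnum, record in enumerate(records):
        cleaned = char_copy(record)
        for offset, (a, b) in enumerate(zip(record, cleaned)):
            if a != b:
                return recnum, offset
    return None


records = [b"HELLO", b"AB\x00CD"]
print(char_copy(records[1]))
print(first_garbage(records))
```

A genuine "." in the file is copied unchanged, so it never shows up as a difference; only bytes that were replaced do.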
Finally, note that the file you're examining in this way should
contain only ASCII data -- if it's a data file that contains binary
data (e.g. word 5 is viewed as an integer), that data may appear as
"garbage" to FCOPY, since it views the entire input record as ASCII.
Also note that "garbage" means to FCOPY any character with an ASCII
value of 0 to 31 or 127 to 255 -- this includes all the ASCII
characters with the parity bit set, and may also include various
"native language character set" characters.
Q: With the advent of Spectrum we are using PASCAL to write systems
level code that formerly would have been written in SPL.
In SPL there was a close correlation between program statements and
machine language instructions, but in PASCAL, it is often difficult
to tell how small and/or efficient the resulting code will be. The
PASCAL/3000 manual is of limited help regarding these concerns.
For example, there are four ways of appending string B to string A:
(1) A := A + B;
(2) STRAPPEND (A, B);
(3) STRMOVE (STRLEN(B), B, 1, A, STRLEN(A) + 1);
(4) STRWRITE (A, STRLEN(A) + 1, T, B);
All four of these statements should do the same thing (assuming that
there's no error, i.e. STRLEN(B) > 0 and STRLEN(A) + STRLEN(B) <=
STRMAX(A)). Which of them will generate the "best" object code;
which will generate the "worst"? Will the answer to this question be
any different on MPE/XL?
A: You raise a very interesting question. Indeed, in many high-level
languages a single simple-looking operation can actually do a vast
amount of work and take a long time to do it. In fact, you needn't
look as far as a third-generation programming language -- do you know
how much stuff a simple PCAL (a one-word instruction) has to do, and
how long it might take? If it needs to swap in the code segment from
disc, this innocent-looking instruction might easily take you 30
milliseconds or more!
There is one sure way to find out which of the above four
operations is the fastest -- try them and see. The program I hacked
up looks like this:
$STANDARD_LEVEL 'HP3000'$
PROGRAM TIMINGS (INPUT, OUTPUT);
VAR I, T, TIME: INTEGER;
A, B: STRING[256];
FUNCTION PROCTIME: INTEGER; INTRINSIC;
BEGIN
TIME:=PROCTIME;
FOR I:=1 TO 10000 DO
BEGIN
SETSTRLEN (A, 30);
SETSTRLEN (B, 30);
END;
TIME:=PROCTIME-TIME;
WRITELN ('CONTROL':20, TIME);
TIME:=PROCTIME;
FOR I:=1 TO 10000 DO
BEGIN
SETSTRLEN (A, 30);
SETSTRLEN (B, 30);
A:=A+B;
END;
TIME:=PROCTIME-TIME;
WRITELN ('A+B':20, TIME);
TIME:=PROCTIME;
FOR I:=1 TO 10000 DO
BEGIN
SETSTRLEN (A, 30);
SETSTRLEN (B, 30);
STRAPPEND (A, B);
END;
TIME:=PROCTIME-TIME;
WRITELN ('STRAPPEND':20, TIME);
TIME:=PROCTIME;
FOR I:=1 TO 10000 DO
BEGIN
SETSTRLEN (A, 30);
SETSTRLEN (B, 30);
STRMOVE (STRLEN(B), B, 1, A, STRLEN(A)+1);
END;
TIME:=PROCTIME-TIME;
WRITELN ('STRMOVE':20, TIME);
TIME:=PROCTIME;
FOR I:=1 TO 10000 DO
BEGIN
SETSTRLEN (A, 30);
SETSTRLEN (B, 30);
STRWRITE (A, STRLEN(A)+1, T, B);
END;
TIME:=PROCTIME-TIME;
WRITELN ('STRWRITE':20, TIME);
END.
As you see, we use the PROCTIME intrinsic to determine the number of
CPU milliseconds consumed by the program so far; a PROCTIME before and
after each operation will tell us exactly how much CPU time the
operation took.
Note also that I had to make an arbitrary assumption -- I assume
that both strings are 30 characters long. Quite possibly one of the
methods might have a longer "start-up" time but be faster for each
character; in that case, copying longer strings might make that
method more efficient, while copying shorter strings might make it
less efficient.
Finally, note that none of the loops includes JUST the operation
we need to test. For one, we have to reset both strings to their
initial values; for another, the FOR loop itself takes a non-trivial
amount of time (to increment the variable, compare, and branch).
This is why the first loop is a "control" loop intended to figure out
how long everything EXCEPT the concatenation actually takes.
So, what's the result? Well, before I ran the program, I decided
to do a little self-test -- I guessed what I thought the relative
rankings would be. You might want to try to do this, too.
Intuitively, my general idea is "the more general-purpose a
function, the longer it'll take". This is because a function that
does only one thing can make all sorts of efficiency-improving
assumptions that the general function can't make.
It seems that the function most uniquely adapted to this job is
STRAPPEND. Whatever the fastest way of appending two strings is,
there's no earthly reason why STRAPPEND couldn't use it. Thus, it
stood to reason that the STRAPPEND (method (2)) would be the best.
My guess for second place was the A := A + B; third place went to
the STRMOVE. Don't ask me why I didn't guess the other way around --
perhaps STRMOVE's five parameters scared me.
I was absolutely sure that the STRWRITE would be the slowest of
them all. This is probably because I've been biased by FORTRAN's
formatter (the dog to end all dogs). General-purpose formatting
facilities are very useful (although PASCAL's is quite ugly), but
they're almost never very fast.
So that was my "educated" guess. What were the results on my
MICRO/XE?
CONTROL 257
A+B 1791
STRAPPEND 1073
STRMOVE 2042
STRWRITE 15733
The numbers are in milliseconds, and indicate the time taken to do
each 10,000-iteration loop. Thus, each STRWRITE took about 1.6
milliseconds.
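To get the cost of the operation itself, the control loop's time has to be subtracted before dividing by the iteration count. The short Python sketch below does that arithmetic on the MICRO/XE figures quoted above; `net_ms_per_call` is just a helper name chosen for the example.

```python
ITERATIONS = 10000
CONTROL = 257      # control loop time in milliseconds

# Loop times in milliseconds, copied from the MICRO/XE run above.
timings = {
    "A+B": 1791,
    "STRAPPEND": 1073,
    "STRMOVE": 2042,
    "STRWRITE": 15733,
}


def net_ms_per_call(loop_ms, control_ms=CONTROL, iterations=ITERATIONS):
    """Per-call cost after subtracting the control loop's overhead."""
    return (loop_ms - control_ms) / iterations


for name, loop_ms in timings.items():
    print(f"{name:>10}: {net_ms_per_call(loop_ms):.4f} ms per call")
```

With the overhead removed, a STRWRITE costs about 1.55 milliseconds and a STRAPPEND about 0.08.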
Curiously enough, my intuition was rather confirmed by the test.
(I guessed before I saw the numbers -- scout's honor!) STRWRITE was
even slower than I thought; the other three were pretty even, but
STRAPPEND (the least general of them) was the champ.
At this point, I asked a Spectrum-possessing friend of mine to run
the same test on his machine. Since the one thing he's forbidden to
do on his machine is test Spectrum performance, I will protect him by
leaving him anonymous. Let's just call him "Deep Code".
The results with the Spectrum Native Mode PASCAL/XL compiler were
somewhat different:
CONTROL 1
A+B 11.8
STRAPPEND 20.1
STRMOVE 6.5
STRWRITE 58.8
(To prevent comparisons between Spectrum and MPE/V, all the times are
given relative to the control time -- no absolute timings should be
inferred.)
As you see, the cumbersome-seeming STRMOVE was actually faster
than either the A+B or STRAPPEND. STRWRITE was still way behind.
Finally, I decided to compile the program with range checking off
($RANGE OFF$) on MPE/V. The results were now:
CONTROL 256 (with $RANGE ON$: 257)
A+B 1737 (with $RANGE ON$: 1791)
STRAPPEND 1070 (with $RANGE ON$: 1073)
STRMOVE 875 (with $RANGE ON$: 2042)
STRWRITE 15697 (with $RANGE ON$: 15733)
As you can see, the results are mostly unchanged EXCEPT for STRMOVE,
which is now the fastest of the lot, more than twice as fast as
before!
So there's an answer -- STRAPPEND is fastest on MPE/V with $RANGE
ON$, STRMOVE is fastest with $RANGE OFF$ or on MPE/XL. This is an
answer, but I don't think that it's THE answer.
In this day and age, the most valuable commodity in a DP
department is not computer time -- it's programmer time. If all you
care about is efficiency, you might as well stick with assembly code.
Consider also that the efficiency of most of your code is largely
irrelevant; the overwhelming majority (90% or more) of your program's
time is spent in less than 10% of its code.
My general philosophy, then, is this:
* DESIGN FOR EFFICIENCY.
* CODE FOR SIMPLICITY.
* OPTIMIZE LATER.
When you're making fundamental decisions about your algorithm
(sequential access vs. direct access, B-trees vs. hashing, etc.),
efficiency should play a major role in your thinking -- once you've
committed to one fundamental approach that later proves too slow,
it's often too difficult to change to another approach.
However, when I write my code, I almost invariably choose the
simplest, most straightforward approach -- for reliability,
readability, and maintainability. It so happens that it's about 50%
faster to shift right by 1 bit than to divide by 2. However, if I
want to average two numbers, I'd much rather say:
AVERAGE:=(MAXIMUM+MINIMUM)/2;
than
AVERAGE:=(MAXIMUM+MINIMUM)&ASR(1);
Looking at the first statement, I instantly see what's going on -- it
documents itself. To understand the second statement, I really have
to think about it. Similarly, even if it were faster to say
something like
AVERAGE:=DIVIDE(ADD(MAXIMUM,MINIMUM),2);
(where DIVIDE and ADD are presumably super-efficient built-in
functions), I'd still rather say
AVERAGE:=(MAXIMUM+MINIMUM)/2;
The (MAXIMUM+MINIMUM)/2 is just a lot easier to read and understand.
What's more, chances are that any little optimization I might do
to this statement would be of virtually no consequence. As I said
before, the overwhelming majority of a program's time is spent in a
very few places. You can spend weeks writing all your code in the
most "efficient" manner (and months fixing the bugs caused by all the
extra complexity), when you could have just spent a few hours
optimizing those few statements that take the most time.
What's more, I've found that I ALMOST NEVER know what parts of the
program will actually be the most frequently used ones. I just write
everything in the simplest, cleanest way and then use HP's wonderful
APS/3000 program to find out where the "hot spots" are. APS/3000 can
tell you EXACTLY where your program is spending most of its time --
armed with this data, you can often change a half dozen lines and get
a two-fold performance improvement.
Therefore, my recommendation is to use whatever mechanism looks to
you to be the easiest and most understandable. I'd recommend that
you say
A := A + B;
because that seems to me to be the clearest solution. Similarly,
even though STRWRITE is so horribly slow, I'd still recommend that
you use it in cases where it best represents what you're doing (for
instance, if you want to use the ":fieldlength" syntax or you want to
concatenate numbers as well as strings).
Then, after you've written the program, run APS/3000. If you find
that this statement is in a particular tight loop and is taking a
large portion of the program's run time, then you can optimize it to
your heart's content. However, don't spend too much effort chasing
an instruction here and there. Your time is a lot more valuable than
the computer's.
First, a few words in response to some of the Letters to the Editor
in the February 1988 issue.
Mr. Gerstenhaber of Computation & Measurement Systems in Tel-Aviv
commented on avoiding EOF's on PASCAL READLNs in case the input starts
with a ":". He quite correctly points out that if you issue a file
equation :FILE INPUT=$STDINX (the "X" somehow dropped off the :FILE
equation in the printed copy of his letter), PASCAL will read from
$STDINX and will not get an error on ":" input.
My original answer recommended that you put in a "RESET (INPUT,
'$STDINX ');" statement into your PASCAL program -- this solution has
the advantage of making the program completely stand-alone, so it
doesn't require a :FILE equation to run properly. Mr. Gerstenhaber's
solution, on the other hand, has the advantage of not requiring any
source code changes (although the program won't run quite right
without a :FILE equation). Both solutions are very reasonable
alternatives.
Mr. van Herk of Mentor Graphics in the Netherlands inquired about
saving scheduled/waiting jobs across a coldload and about "letting a
session log itself off after a certain idle period". These issues were
apparently the subjects of previous Q&A questions (which were answered
by Mr. N. A. Hills).
The first question was originally asked in the June 1987 Q&A -- a
user did a coldload once a week (as recommended by HP), and found that
his scheduled jobs were deleted. He couldn't just re-submit them,
since they were originally submitted by users -- the system manager
doesn't know the original job stream filenames, and in any case the
original files might have already been modified or purged.
As Mr. van Herk quite correctly points out, the contributed library
JSPOOK program addresses this very problem. I'm not sure exactly how
much it preserves -- whether it works for WAITing jobs only, or
SCHEDuled ones as well; I suspect that there are newer and older
versions of JSPOOK in various places that do slightly different
things.
However, JSPOOK is definitely the place to start -- look for it on
the contributed library; it might very well do exactly what you want.
"Letting a session log itself off after a certain idle period" is a
different story. I guess that what the user wanted was to abort
sessions whose user has walked away from his terminal (thus causing a
potential security threat) or is just signed on and not doing anything
(which may, for instance, use precious ports on your port selector).
Q: I've always wondered how IMAGE calculates the maximum number of
extents for a dataset. Most of my datasets end up having 32 extents,
but some smaller datasets have fewer. Why is this? Is it some sort of
MPE limitation, or is it IMAGE's own choice?
A: My old IMAGE Manual (March 1983 version) has little to say on this
topic: "Each data file is physically constructed from one to 32
extents ..., as needed to meet the capacity requirements of the file,
subject to the constraints of the MPE file system" (p. 2-10).
To get the answer, I had to decompile the IMAGE code in the system
SL. As best I could tell from the machine instructions, the algorithm
was:
#extents = flimit / 64 + 1
In other words, if the file limit ends up being 920 records (the file
limit having been calculated from the capacity, the record size, and
the blocking factor), the number of extents will be
920/64 + 1 = 14 + 1 = 15 extents
(note that the division rounds down).
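The arithmetic above can be checked with a minimal Python sketch. The function name image_extents is made up for illustration, and the cap at 32 is an assumption taken from the manual's "one to 32 extents" wording, not something the decompiled formula itself shows.

```python
def image_extents(flimit, max_extents=32):
    """Number of extents per the formula above: flimit / 64 + 1.

    The division rounds down. The cap at max_extents is an assumption
    based on the manual's "one to 32 extents" wording.
    """
    return min(flimit // 64 + 1, max_extents)


print(image_extents(920))    # the worked example above: 15
print(image_extents(5000))   # a large dataset hits the assumed cap: 32
```

Under these assumptions, any dataset with a file limit of 1984 records or more ends up at the 32-extent ceiling, which would fit the observation that most datasets have 32 extents.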
The principle here seems to be to avoid really small extents. The
best reason for this is disc caching (although I think the above
algorithm might have been chosen before disc caching was implemented)
-- the most that MPE can cache is one extent; if extents are very
small, you'll lose much of the benefit of caching.
Usually, these numbers of extents should work quite well for you.
If for some reason you want to change them, though, you can't just
issue a :FILE equation at DBUTIL >>CREATE time (since >>CREATE ignores
file equations). One thing you might do is use MPEX's
%ALTFILE datasetfilename;EXTENTS=newnumextents
(%ALTFILE ;EXTENTS= lets you change the maximum number of extents for
any disc file). However, as I said, you'd rarely want to do this.
Q: I'm trying to call a PASCAL procedure from a FORTRAN program and I
get a loader error that says "INCOMPATIBLE FUNCTION FOR" and then the
name of the PASCAL procedure. I know that PASCAL is a strict
type-checking language, but my FORTRAN calling sequence seems to
exactly match the PASCAL procedure's formal parameter list! (The
PASCAL procedure returns an integer and takes several simple integer
by-reference parameters, which is exactly what I pass to it.) What's
going on here?
A: The compilers, the segmenter, and the loader support a little-known
mechanism called CHECK LEVELS. This allows MPE (when :PREPping
together separately compiled code or when calling an SL procedure) to
make sure that the caller and the callee both have the same idea of
how the procedure parameters should be passed.
The procedure that is called (the "callee") may have (in its USL,
RL, or SL) information describing its parameter types; the procedure
that calls it (the "caller") may have information describing the
parameter types that it expects. If the two don't match, an error
message is printed.
By default, SPL generates all its procedures (and all its procedure
calls) with OPTION CHECK 0, which indicates that NO CHECKING IS TO BE
DONE. Thus, if you try to call an average system SL procedure (almost
certainly written in SPL), no parameter checking will be done -- if
you mis-specify it in an OPTION EXTERNAL declaration, you're in deep
trouble.
Now, FORTRAN and PASCAL by default generate all their procedures
(and all procedure calls) with OPTION CHECK 3, which indicates that
the NUMBER OF PARAMETERS, THE RESULT TYPE, and EACH PARAMETER TYPE are
to be checked. If FORTRAN generates an OPTION CHECK 3 procedure call,
but the called procedure is OPTION CHECK 0, no checking will be done;
if SPL generates an OPTION CHECK 0 procedure call, but the called
procedure is OPTION CHECK 3, no checking will be done. However, if
both the procedure call and the procedure itself are OPTION CHECK 3,
full checking will be done to make sure that the caller's and callee's
procedure declarations are identical.
So, what's the big deal? After all, you say that the FORTRAN call's
parameters are perfectly compatible with the PASCAL procedure's header
-- even though there's an OPTION CHECK 3 test, it should be passed
with flying colors.
Well, if you look in chapter 9 of your System Tables Manual
(what??? you say you don't have a System Tables manual???), you'll see
the following paragraph:
"PASCAL: Pascal sets the high order bit in the parameter type
descriptor when it is generating hashed values. The remaining 15
bits are based on a hash of the types of the parameter. Only the
Pascal compiler can compute the value, and the SEGMENTER must
match the whole 16 bit value."
Since PASCAL has so many different types (RECORDs, enumerated types,
subranges, etc.), the PASCAL compiler generates a special "parameter
type entry" into the procedure's OPTION CHECK 3 descriptor, an entry
that is INCOMPATIBLE WITH THE ENTRIES GENERATED BY ANY OTHER COMPILER,
even if the parameter types referred to by PASCAL and the other
compiler are perfectly compatible.
In other words, you can't call (with OPTION CHECK 3) from a
non-PASCAL program any OPTION CHECK 3 PASCAL procedure. What you must
do is either:
* make sure that the PASCAL procedure is NOT compiled with OPTION
CHECK 3, or
* make sure that the calling program does not generate any
procedure calls with OPTION CHECK 3.
PASCAL has two compiler keywords:
* $CHECK_ACTUAL_PARM$, which indicates the checking level for
PROCEDURES YOU CALL, and
* $CHECK_FORMAL_PARM$, which indicates the checking level for
PROCEDURES YOU ARE DECLARING.
FORTRAN, on the other hand, has only one relevant compiler keyword:
* $CONTROL CHECK=, which indicates the checking level for
PROCEDURES YOU ARE DECLARING.
What this means is that FORTRAN will ALWAYS GENERATE ITS PROCEDURE
CALLS WITH OPTION CHECK 3, since there's no keyword to turn this off.
Therefore, you must change your PASCAL procedure to have a
$CHECK_FORMAL_PARM 0$.
What if you don't have the source code to the PASCAL procedure? In
this case, you're in trouble, since the procedure has check level 3
and FORTRAN will always generate external procedure calls with check
level 3. To avoid this, you have to write an SPL "gateway" procedure
which calls the PASCAL procedure and is called by the FORTRAN
procedure. If the SPL gateway procedure declares the PASCAL procedure
to be OPTION CHECK 0, EXTERNAL, this will work.
For the curious: Yes, there are check levels 1 and 2. OPTION CHECK
1 indicates checking of only the PROCEDURE RESULT TYPE; OPTION CHECK 2
indicates checking of the PROCEDURE RESULT TYPE and the NUMBER OF
PARAMETERS. OPTION CHECK 3, as I mentioned before, checks the
PROCEDURE RESULT TYPE, the NUMBER OF PARAMETERS, and the TYPES OF ALL
THE PROCEDURE PARAMETERS.
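The matching rules can be modeled with a small Python sketch. This is only a toy model: the Descriptor class, the INTEGER type code, and the function names are all hypothetical, and the rule that the check actually performed is the LOWER of the caller's and callee's levels is an assumption (the text above only spells out the 0-vs-3 and 3-vs-3 cases).

```python
from dataclasses import dataclass
from typing import Tuple

INTEGER = 1  # hypothetical type code, for illustration only


@dataclass(frozen=True)
class Descriptor:
    check_level: int              # 0..3, as in OPTION CHECK n
    result_type: int              # hypothetical type code
    param_types: Tuple[int, ...]  # hypothetical 16-bit type descriptors


def pascal_param(hash15: int) -> int:
    """Pascal-style descriptor: high-order bit set, 15-bit hash below it."""
    return 0x8000 | (hash15 & 0x7FFF)


def call_is_accepted(caller: Descriptor, callee: Descriptor) -> bool:
    # Assumption: the check performed is the lower of the two levels.
    level = min(caller.check_level, callee.check_level)
    if level >= 1 and caller.result_type != callee.result_type:
        return False
    if level >= 2 and len(caller.param_types) != len(callee.param_types):
        return False
    if level >= 3 and caller.param_types != callee.param_types:
        return False
    return True


fortran_call = Descriptor(3, INTEGER, (INTEGER, INTEGER))
pascal_proc = Descriptor(3, INTEGER,
                         (pascal_param(INTEGER), pascal_param(INTEGER)))
pascal_proc_unchecked = Descriptor(0, INTEGER, pascal_proc.param_types)

print(call_is_accepted(fortran_call, pascal_proc))            # False
print(call_is_accepted(fortran_call, pascal_proc_unchecked))  # True
```

The first call fails only because the Pascal-style descriptors have the high-order bit set, even though both sides mean "integer"; dropping the callee to check level 0 makes the same call go through.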
Q: I notice that MPE lets me allocate extra data segments that belong
to a process (call GETDSEG with id = 0) or are shared within a session
(GETDSEG with id <> 0). I want to have a global data segment that's
shared among many sessions. I'd like to be able to have one program
that loads this segment up from a file or from a database (say, at
the beginning of the day), and all the other ones will then read data
from the segment without having to go to disc. How can I do this?
A: Unfortunately, you can't share data segments among sessions
without doing some serious privileged mode work. However, are you
sure you really want to have an extra data segment?
Whenever people talk about accessing files, they think "disc I/O".
After all, files are stored on disc, right? Well, almost right. When
you read a disc file (in default, buffered mode), the file system
doesn't actually go out to disc for every FREAD; rather, it reads one
disc BLOCK at a time (however many RECORDS it may contain) into an
in-memory buffer. Whenever the record to be read is already in the
buffer, it's taken from the buffer rather than from disc. This is
then just a memory-to-memory move, rather like a DMOVIN.
Now, if you want to keep your data in a data segment, you must
have no more than 32K words of data (actually, slightly less).
Curiously enough, this also happens to be close to the maximum size
of a file system buffer. If you say
:BUILD X;REC=128,255
then each block in X will be built with 255 records of 128 words
each. When you first read the file, the file system will read all
32640 words of it into memory; any subsequent reads of the file (as
long as they fall within those 255 records) will require NO DISC I/O
AT ALL, since they'll be satisfied entirely from the in-memory
buffer.
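The block arithmetic for REC=128,255 can be sketched in a few lines of Python. The function names are invented for illustration; the only facts used are the record size (128 words) and the blocking factor (255 records per block) from the :BUILD command above.

```python
RECORD_WORDS = 128      # REC=128,...
BLOCKING_FACTOR = 255   # ...,255


def block_words(record_words=RECORD_WORDS, blocking_factor=BLOCKING_FACTOR):
    """Size of one block (and so of one buffer) in words."""
    return record_words * blocking_factor


def block_of(record_number, blocking_factor=BLOCKING_FACTOR):
    """Which block a given record number (counting from 0) falls in."""
    return record_number // blocking_factor


print(block_words())   # 32640
print(block_of(254))   # 0: still in the first block
print(block_of(255))   # 1: the first record that needs another block
```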
Thus, what you need to do is:
* Build the file with a sufficiently high block size (record size
times blocking factor) that the ENTIRE FILE WILL FIT IN ONE BLOCK.
* Always open the file SHR (shared) GMULTI -- the GMULTI means
that everybody will share THE SAME BUFFER (rather than having
one 32K data segment for each file accessor!).
Then, all of your reads will be (almost) as fast as data segment
accesses, since that's exactly what they'll be! The only difference
is that you'll be letting the file system take care of all the data
segment management behind your back.
In fact, using files in this case will give you a lot of other
advantages (besides inter-session sharing) over data segments:
* You can use your language's built-in file access mechanisms
instead of having to call DMOVIN, DMOVOUT, GETDSEG, FREEDSEG,
etc.
* If you want to look at your file to make sure that it's correct,
you can use FCOPY, EDITOR, etc.
* If people will be updating the data segment, you can use FLOCK
and FUNLOCK to coordinate the write access to it. (Make sure,
however, that nothing you do relies on the current record
pointer -- see below!)
* If the system goes down, the file will still remain around.
In general, files are just a lot easier to deal with than extra
data segments, and as long as you block them right (and stay within
32K, which is the limit for a data segment anyway), they can be as
fast or almost as fast!
The only thing you need to beware of is that when you open a file
GMULTI, not only the file buffers are shared among all the file
accessors, but so are the current record pointers! In other words, if
two people have a file opened GMULTI and both are reading it
sequentially, one will get about half of the file's records and the
other will get the rest! As soon as one reader reads record #0, the
current record pointer (which both readers share) will be incremented
to 1, and the other reader will read record #1 and never see record
#0.
Therefore, whenever you're reading (or writing) a GMULTI file,
DON'T DO SEQUENTIAL READS (unless you do some sort of locking, for
reads as well as writes). Do all your I/O using FREADDIR and
FWRITEDIR (or their equivalents in your language).
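The shared-pointer problem is easy to see in a toy Python model. The GmultiFile class below is purely hypothetical and does not call any MPE intrinsic; its fread and freaddir methods only mimic the behavior described above (one current record pointer shared by all accessors).

```python
class GmultiFile:
    """Toy model: one record pointer shared by every accessor."""

    def __init__(self, records):
        self.records = list(records)
        self.pointer = 0   # the single, shared current record pointer

    def fread(self):
        """Sequential read: uses and advances the shared pointer."""
        if self.pointer >= len(self.records):
            return None
        record = self.records[self.pointer]
        self.pointer += 1
        return record

    def freaddir(self, record_number):
        """Direct read: the result does not depend on the shared pointer."""
        return self.records[record_number]


shared = GmultiFile(range(6))
reader_a, reader_b = [], []
for _ in range(3):                      # two readers taking turns
    reader_a.append(shared.fread())
    reader_b.append(shared.fread())

print(reader_a)   # [0, 2, 4]
print(reader_b)   # [1, 3, 5]
print([shared.freaddir(n) for n in range(6)])   # [0, 1, 2, 3, 4, 5]
```

Each sequential reader sees only half the file, while a reader that addresses records by number gets all of them no matter what the other accessors are doing.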
Q: Sometimes, when I do a :LISTF of a file, I get a display like
this:
FILENAME CODE ------------LOGICAL RECORD----------- ----SPACE----
SIZE TYP EOF LIMIT R/B SECTORS #X MX
URLDAT USL 128W FB 141 959 1 60 2 32
The file has 141 records, but uses only 60 sectors! Is this some
super compression algorithm HP is using? Can I make all my files this
small?
A: Sorry, no. This file uses only 60 sectors because it has less
than 60 records' worth of actual data. Although there is a record
#140 (which is why the EOF is 141), some of the records between
record #0 and record #140 are missing!
How can this happen? Well, try it yourself. Build a file with a
command such as
:BUILD X;DISC=1023,8
-- a file with 1023 records and up to 8 extents. The first of these
extents will be allocated when you build the file (except on MPE/XL);
the others will be unallocated until you write to them.
Now, write a small program that does an FWRITEDIR of a record into
record #1022. Then, when you :LISTF the file, you'll see something
like:
FILENAME CODE ------------LOGICAL RECORD----------- ----SPACE----
SIZE TYP EOF LIMIT R/B SECTORS #X MX
X 128W FB 1023 1023 1 256 2 8
Only the first extent and the last extent (the one with record #1022)
will actually be allocated -- the others will remain empty until you
actually write to them.
Once you think about it, it's all pretty straightforward -- you
write a record to an extent and that extent gets allocated; if you
don't write a record to an extent, it won't get allocated. However,
when you first see this, it can be puzzling indeed!
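The numbers in the :LISTF of X can be reproduced with a short Python sketch. It rests on several assumptions that go beyond the text: that every extent holds CEILING(limit / max extents) records, that each 128-word record occupies exactly one sector, and that the first extent is allocated at build time (as on MPE/V). The function name is invented for illustration.

```python
import math


def allocated_extents(limit, max_extents, written_records,
                      first_preallocated=True):
    """Return (extent numbers in use, records per extent)."""
    # Assumption: all extents are the same size, rounded up.
    extent_records = math.ceil(limit / max_extents)
    extents = {record // extent_records for record in written_records}
    if first_preallocated:
        extents.add(0)   # extent 0 is allocated when the file is built
    return sorted(extents), extent_records


extents, extent_records = allocated_extents(1023, 8, [1022])
print(extents)                         # [0, 7]
print(len(extents) * extent_records)   # 256 sectors, one record per sector
```

Under these assumptions the sketch gives two extents and 256 sectors, matching the #X and SECTORS columns shown for X.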
In fact, this situation is quite rare, since most people either
write to files sequentially or entirely allocate the whole file to
start with (e.g. IMAGE databases, KSAM key files, and most data
files). The SEGMENTER is one of the few subsystems that regularly
creates "holey" files; it's quite common to find USL files like this.
Q: There's a program that I run that HAS to have a temporary file of
its own to do its work -- it just couldn't possibly work otherwise.
However, when I hit [BREAK] and do a :LISTFTEMP, MPE says there are
no temporary files; :LISTF doesn't show me any new permanent files
either. Where is the program keeping all its temporary data? It
doesn't have DS capability, so I know it's not in an extra data
segment. I'm not quite clear on the distinction between permanent
and temporary files, anyway. Are they just kept in two different
places?
A: If you want a file to be available to sessions other than its
creating one (either at the same time or later), you must make it a
PERMANENT file. The system will then keep its name and a pointer to
its data in the PERMANENT FILE DIRECTORY, a large chunk of disc space
reserved for this very purpose. :LISTF, of course, scans through the
directory, as does FOPEN when you ask it to access an existing
permanent file.
If you create a file within your own session and know that you'll
never need it outside your session, you can make it a TEMPORARY file.
Its name and data pointer will be kept in the TEMPORARY FILE
DIRECTORY, an extra data segment (actually part of the JDT) kept by
the system on your session's behalf. (Naturally, there is one
Temporary File Directory for each session in the system, but only one
Permanent File Directory for the entire computer [ignoring for a
moment private volumes].)
Temporary files have two major advantages: first, they go away
when your session dies (and their disc space is released) -- this can
be convenient, since otherwise you'd have to remember to purge them
before logging off (naturally, you'd always forget).
More importantly, since each session has its own Temporary File
Directory, you need not be afraid of interfering with another
session's temporary files. If your program needs to build a work
file (which you want to call WORKFILE) and two people are running it
in the same group, then you have the possibility of both processes
trying to create WORKFILE at the same time. If you keep WORKFILE as
a temporary file, each session will have its own WORKFILE file in its
own Temporary File Directory.
Temporary files also have two major disadvantages. One, of
course, is the very fact that they are temporary -- if the system
crashes or the job aborts, the files will be lost; there'll be no way
of seeing what was in them. If one of your job temporary files
contains important information, this may be a serious concern -- if
your job aborts, the file will be lost, and you won't be able to see
what was in the file (which might give you some important clues as to
why the job aborted).
The other problem is that when the system crashes, the space used
by the temporary files is lost. Since the Temporary File Directories
are all kept in extra data segments, the information in them
(especially the pointers to the file data) will be lost as well, and
thus the system will not be able to re-use the space occupied by the
temporary files until you do a RECOVER LOST DISC SPACE.
However, as your question points out, there are more things than
just permanent and temporary files. "Permanent" and "temporary"
indicate what DIRECTORY the file name and the pointer to the file
data are kept in -- what if they don't need to be kept anywhere?
What if a file will only be needed while a program is running, and as
soon as the program closes it, it can safely be discarded? In this
case, you wouldn't need to keep pointers to it in any permanent or
even semi-permanent place -- only in the file tables that MPE keeps
in your own stack.
When you call FOPEN (or when it's called on your behalf by your
language's compiler library), you can tell FOPEN which directory to
look for the file in. You can tell it to look in the PERMANENT FILE
DIRECTORY -- this means that the file must exist as a permanent file
at FOPEN time; you can tell it to look in the TEMPORARY FILE
DIRECTORY -- this means that the file must exist as a temporary file
at FOPEN time. Finally, if you don't expect the file to exist at all
but rather want to create a new file, you can indicate that in the
FOPEN call, too.
When you call FOPEN to open this "new file", the file will NOT
actually be cataloged in either directory -- this will only happen
when you FCLOSE the file (you can tell FCLOSE whether to save the
file as permanent or temporary). As long as the file is FOPENed but
not yet FCLOSEd, the file exists (space is allocated for it on disc),
but it's not pointed to by ANY directory. If the process FCLOSEs the
file without asking FCLOSE to save it, the file will go away;
similarly, if the process dies without FCLOSEing the file, it'll go
away as well. The file is thus a "process temporary file", in that
it is accessible only by the process that created it (unless the
process later saves the file) -- in the manuals, it's usually called
a "new file", meaning not just a recently created file but rather a
file that is not kept track of in either the Permanent or the
Temporary File Directory.
This is what is almost certainly happening in your mystery
program. It needs to store data somewhere, but sees no reason to
save it for access by other sessions or even by other processes in
the same session. It FOPENs the file as NEW, and then, when it's
done with the data, either FCLOSEs the file (without saving it) or
just terminates. Neither :LISTF (which looks in the Permanent File
Directory) nor :LISTFTEMP (which looks in the Temporary File
Directory) will show this file, because the file is utterly unknown
to anybody other than the creating process.
We can summarize all this with the following table:
                 PERMANENT FILE      TEMPORARY FILE     NEW FILE
                 --------- ----      --------- ----     --- ----
Available to     Anyone, any time    Only creating      Only creating
                                     session            process
Shown by         :LISTF              :LISTFTEMP         Nothing
If process       File remains        File remains       File destroyed,
aborts                                                  space recovered
If job aborts    File remains        File destroyed,    File destroyed,
                                     space recovered    space recovered
If system        File remains,       File destroyed,    File destroyed,
crashes          space not wasted    space lost         space lost
Q: I recently did a :SHOWME in one of my batch jobs, and I got some
rather unusual output:
USER: #J329,TEST.PROD,PUB (NOT IN BREAK)
MPE VERSION: HP32033G.B3.00. (BASE G.B3.00).
...
$STDIN LDEV: 4 $STDLIST LDEV: 5
I don't have ldevs 4 and 5! I expected it to say "$STDIN LDEV: 10"
(since 10 is my :STREAMS device) and "$STDLIST LDEV: 6" (since 6 is
the only device in my LP class), but it gave me those rather bizarre
values instead. What's going on here?
A: Actually, if you did have ldevs 4 and 5 configured, you would NOT
have gotten them in your :SHOWME output. In fact, a :SHOWME in a
spooled (i.e. :STREAMed to a spooled printer) batch job is GUARANTEED
to give you nonexistent $STDIN LDEV and $STDLIST LDEV numbers!
Why is this? Well, in the depths of the "System Operations and
Resource Management" manual (p. 7-92 of my January 1985 edition), you
can find an interesting statement:
"When a spool file is opened, MPE creates a 'virtual device' of
the required type by filling in an unused logical device entry
with the appropriate values."
What does this mean?
Well, say that you open a spool file on device class LP. At first
glance, you might think that the device number associated with this
file would be 6, the device of the line printer. Actually, this
clearly can't be so -- when you write a record to the spool file,
device 6 doesn't actually get written to; rather, the spool file on
disc is written to and is then (when the file is closed) copied to
device 6.
OK, you think, that's right -- the file is on disc; therefore, its
device number must be 1, 2, or 3 (the device numbers of my disc
drives), depending on which disc drive it happened to be built on.
Nope. For some reason, the operating system requires that the spool
file be more than just a simple disc file. Perhaps this is because
some MPE I/O (e.g. the CI's prompt for the command) goes through the
internal ATTACHIO procedure rather than through the file system; the
ATTACHIO call requires a logical device number.
In any event, whenever you open a spool file, the system assigns a
"virtual logical device number" to it, and most access to the file
then happens through this device; naturally, this virtual logical
device number must not be the same as any real device number. Since
all batch jobs (except those that are submitted from non-spooled
devices, such as tapes and card readers, or those that are sent to
non-spooled devices, such as hot printers) have spool files both for
their $STDINs and $STDLISTs, all batch :SHOWMEs and WHOs will return
these virtual (i.e. invalid) logical device numbers.
If you're curious about getting the REAL $STDIN and $STDLIST
devices, call the new JOBINFO intrinsic asking for items 9 (input
device) and 10 (output device). The :SHOWME command should really do
this, but it was written long before JOBINFO was implemented.
Q: My program opens a file with Input/Output access, does some reads
from it, some writes to it, and then closes it. When I run the
program it works well, but when one of my users runs it, nothing
happens. The program doesn't get a file open error, but the file
doesn't get updated, either. I thought it might be a security
violation, but the file open's succeeding, so that can't be it.
A: Well, maybe it can. One would think that if you don't have write
access to a file, an open for Input/Output would fail with an FSERR
93 (SECURITY VIOLATION); however, this is not so. If you have read
access but NOT write access and you try to open a file for
Input/Output (or Update), the FOPEN will SUCCEED; however, the file
will be opened for READ ACCESS ONLY.
Now, if you try to write to the file, the FWRITE will fail (with
an FSERR 40, OPERATION INCONSISTENT WITH ACCESS TYPE). If you
checked for a file system error after the FOPEN but you DIDN'T check
for a file system error after the FWRITE, everything will have looked
OK; but, since the FWRITE failed, the file won't have been updated.
What can you do? Well, you could (and should) check for an error
after the FWRITE call; if, however, you want to detect the error
condition at FOPEN time, you have to do something like this:
FNUM:=FOPEN (FILENAME, 1 << old >>, 4 << in/out >>);
IF <> THEN
   FILE'ERROR
ELSE
   BEGIN
   FGETINFO (FNUM, , , AOPTIONS);
   IF AOPTIONS.(12:4)<>4 THEN
      << you asked for IN/OUT access, but got something less >>
      ...
   END;
The FGETINFO call will get you the REAL aoptions that the file was
opened with -- then you can check to see if the access granted was
really IN/OUT.
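For readers who don't use SPL, the bit-field test AOPTIONS.(12:4) can be sketched in Python. SPL numbers the bits of a 16-bit word from 0 (most significant) to 15, so .(12:4) is the low-order four bits. The helper names below are invented for illustration; the only value taken from the text is 4 for IN/OUT access.

```python
def spl_bits(word, start, length):
    """Extract word.(start:length) from a 16-bit word.

    Bit 0 is the most significant bit, as in SPL.
    """
    shift = 16 - (start + length)
    return (word >> shift) & ((1 << length) - 1)


IN_OUT = 4   # the access type requested in the FOPEN call above


def got_in_out(aoptions):
    """True if the access actually granted was IN/OUT."""
    return spl_bits(aoptions, 12, 4) == IN_OUT


print(got_in_out(0x0004))   # True: IN/OUT was granted
print(got_in_out(0x0000))   # False: something less was granted
```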
Moral of the story (#1): ALWAYS check for error conditions.
Moral of the story (#2): Sometimes a successful result isn't
really successful.
Q: I have a program that runs just fine when I run it normally (with
$STDIN not redirected). However, when I try to redirect its $STDIN
to a disc file, it aborts on MPE with a tombstone that says "ERROR
NUMBER: 42". I looked it up, and the manual says it's "OPERATION
INCONSISTENT WITH DEVICE TYPE (FSERR 42)". What does that mean? The
program doesn't use any special devices (tapes, printers, etc.).
A: Well, actually the program DOES use a special device -- your
terminal. When you run your program with its $STDIN redirected to a
disc file, all of its file system operations on $STDIN become
operations on a disc file (rather than on a terminal). For normal
FREADs, that's OK -- you can read from a disc file just as well as
from a terminal. But what if the program does an FCONTROL mode 13 on
$STDIN to turn off echo? Or an FCONTROL mode 14 to turn off break?
These operations work when passed a terminal file; however, when you
pass them a disc file (even if it's your program's ;STDIN=), they'll
fail with the very error condition you indicated.
Oddly enough, if your program is doing, say, an FCONTROL mode 13
to turn off echo, then the file system error is actually no problem.
The FCONTROL 13 is only useful when $STDIN is a terminal; if it's a
disc file, the FCONTROL can't be done, but it isn't needed, either.
It may be that your program is seeing this error and aborting
although it would be perfectly OK for it to go on.
If that's what you're doing -- an FCONTROL mode 13, or perhaps one
of the other FCONTROLs that's only needed when you're really reading
from the terminal -- then you may want to check for an error
condition, and if it's FSERR 42, keep going as if nothing had
happened. Or -- dare I suggest it -- maybe not even check for the
FCONTROL error at all?
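The "check, but forgive error 42" approach might look like the following Python sketch. Nothing here calls the real file system: turn_off_echo is a hypothetical stand-in for an FCONTROL mode 13 that returns 0 on success or a file system error number, and prepare_input is an invented name.

```python
FSERR_DEVICE_TYPE = 42   # OPERATION INCONSISTENT WITH DEVICE TYPE


def turn_off_echo(stdin_is_terminal):
    """Hypothetical stand-in for FCONTROL mode 13 on $STDIN."""
    return 0 if stdin_is_terminal else FSERR_DEVICE_TYPE


def prepare_input(stdin_is_terminal):
    error = turn_off_echo(stdin_is_terminal)
    if error not in (0, FSERR_DEVICE_TYPE):
        # Anything other than "wrong device type" is still a real error.
        raise RuntimeError(f"unexpected file system error {error}")
    # Either echo is off (terminal) or it never mattered (disc file).
    return "echo off" if error == 0 else "echo control skipped"


print(prepare_input(True))    # echo off
print(prepare_input(False))   # echo control skipped
```

The design choice is to forgive only the one error that is known to be harmless here, so that any other failure still stops the program.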
Moral of the story: Sometimes an error really isn't an error.
Q: I'm doing a "SQUEEZE" of a disc file (i.e. releasing allocated but
unused disc space beyond the EOF) with an MPEX %ALTFILE ;SQUEEZE; I
understand that this just opens the file and closes it with
disposition 8. This works for all my fixed record length files and
some of my variable record length files, but some variable record
length files it leaves completely untouched. Their FLIMITs and disc
space usages remain exactly the same as before. What's happening?
A: Well, I was stumped until I looked at my trusty MPE source code.
After I saw the actual code that did the "squeeze", everything fell
into place.
Once upon a time (before MPE IV came out), FCLOSE mode 8 only
worked for fixed record length files. This is because its job is to
release all the unused blocks beyond the end of file -- for this, it
has to know what the last used block is. In a fixed record length
file, the last used block # (counting from block #0) is equal to
CEILING (eof/blockingfactor) - 1
because every block has exactly blockingfactor records.
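The formula can be tried out in Python with integer arithmetic. The function name is invented for illustration, and the sketch assumes the file has at least one record.

```python
def last_used_block(eof, blocking_factor):
    """CEILING(eof / blockingfactor) - 1, counting blocks from #0."""
    if eof < 1:
        raise ValueError("the file must contain at least one record")
    ceiling = -(-eof // blocking_factor)   # integer ceiling division
    return ceiling - 1


print(last_used_block(10, 4))   # 2: records 0-9 occupy blocks 0, 1 and 2
print(last_used_block(8, 4))    # 1: two exactly full blocks
```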
In a variable record length file, however, there can be any number
of records in a block. The EOF (the number of records in the file)
won't tell you what the last block # is; the only way to find the
last block (in MPE III) was to read the file sequentially until the
EOF was reached -- way too slow for FCLOSE 8 to use.
MPE IV allowed you to append to variable record length files
(perhaps because message files, which are always variable record
length, needed to be appended to). To do an append, the file system
also needs to know the last block #; for this, the file system
started keeping the last used block # in every variable record length
file's file label. This incidentally made FCLOSE mode 8s of variable
record length files possible. Thus, in MPE IV and later systems, an
FCLOSE mode 8 of a variable record length file simply goes to the
"end of file block number" and throws away all the blocks that go
after it.
So why do some FCLOSE 8s succeed and others do nothing? Well,
what if the file system sees an end of file block number of 0? It
could mean a file with exactly 1 block (block #0) -- OR, it could
mean a file that was first created on a pre-MPE IV system, when the
end of file block# field was ALWAYS set to 0! FCLOSE 8 can't just
throw away all blocks after block #0, since there MAY be (on a
pre-MPE IV file) a lot of data in those blocks.
Therefore, you have a bit of a paradox. Any variable record
length file that has at least two data blocks will get properly
squeezed; however, any file that's very small -- has only one data
block -- won't be modified. In this isolated case, the smaller files
may actually use more space (when squeezed) than the larger ones!
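The paradox can be shown with a toy Python model of the rule as described above. The function name and the idea of counting "allocated blocks" are simplifications made up for this sketch; the real FCLOSE works on the file label and extents.

```python
def blocks_after_squeeze(eof_block, allocated_blocks):
    """Blocks left after an FCLOSE mode 8 of a variable-length file."""
    if eof_block == 0:
        # Ambiguous: a one-block file, or a pre-MPE IV file whose field
        # was never set. The file is left untouched.
        return allocated_blocks
    # Otherwise everything after the end-of-file block is released.
    return min(allocated_blocks, eof_block + 1)


print(blocks_after_squeeze(0, 100))   # 100: the one-block file keeps it all
print(blocks_after_squeeze(1, 100))   # 2: the two-block file is squeezed
```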
Q: Is there an easy way to determine what language a program was
written in? Maybe some word in word 0 of the program file?
A: Nothing that direct, I'm afraid. Code is code -- a program file
is just a set of assembly instructions; there's no need for the
operating system to inquire into where they came from, so MPE doesn't
keep this information around.
However, if there isn't a direct way, there may well be an
indirect one. In fact, there are two -- neither is certain, but they
work for most programs.
The first method is based on the fact that all compilers -- except
SPL -- generate calls to so-called "compiler library" procedures.
For instance, when you do a WRITELN in a PASCAL program, the compiled
code won't actually contain all the instructions that do the I/O, nor
even a direct call to the FWRITE or PRINT intrinsic; rather, the code
will have a call to the P'WRITELN procedure, a system SL procedure
that actually does the PASCAL-file-to-MPE-file mapping, the I/O, the
error check, etc. All languages (except SPL) rely on such compiler
library procedures, typically to do I/O, but also to do bounds
checking, implement certain complicated language constructs, etc.
(Even SPL has a few "compiler library" procedures of its own that it
calls -- anybody know what they are?)
The point here is that each language has its own compiler library
routines -- PASCAL's typically start with "P'" (e.g. P'WRITELN),
COBOL's with "C'" (e.g. C'DISPLAY), BASIC's with "B'" (e.g.
B'PRINTSTR). You'd think that FORTRAN's would start with "F'", but
they don't -- the most common ones are FMTINIT', SIO' (and in general
xxxIO'), and BLANKFILL'.
Armed with this knowledge, you can just run the program with a
;LMAP parameter. This will show you all the external references,
which, in addition to those intrinsics that you explicitly call,
should include a whole bunch of compiler library procedures. From
the names of these procedures, you can figure out the source language
of the program (if there are no such procedures, it's probably in
SPL).
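The name-matching step can be sketched in a few lines of Python. This is only an illustration: it assumes you have already pulled the external reference names out of the ;LMAP listing by hand, the function name is hypothetical, and the only patterns it knows are the examples given above (which are typical, not exhaustive).

```python
def guess_language(external_names):
    """Guess a program's source language from its external reference names."""
    prefixes = {"P'": "PASCAL", "C'": "COBOL", "B'": "BASIC"}
    fortran_names = {"FMTINIT'", "BLANKFILL'"}
    for name in external_names:
        upper = name.upper()
        # Compiler library prefixes quoted in the text: P', C', B'.
        for prefix, language in prefixes.items():
            if upper.startswith(prefix):
                return language
        # FORTRAN has no F' prefix; look for FMTINIT', BLANKFILL', xxxIO'.
        if upper in fortran_names or upper.endswith("IO'"):
            return "FORTRAN"
    # No compiler library procedures at all: most likely SPL.
    return "probably SPL"
```

Ordinary intrinsics such as FOPEN or PRINT match none of the patterns, so a list containing only intrinsics falls through to "probably SPL".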
Note that it is possible, for instance, for a FORTRAN program NOT
to call any compiler library procedure (if it does no formatting and
nothing else that requires any compiler library help) -- however,
it's quite unlikely.
The other way of figuring out a program's source language is by
using the fact that all languages (except for SPL) do some sort of
automatic I/O error checking. If they encounter an unexpected error
condition, they will abort with a tombstone -- each language has its
own kind. The trick is to force an I/O error; for a program that's doing
character mode I/O, the easiest way is by typing a :EOD on $STDIN.
* Most FORTRAN programs that do an ACCEPT or READ (5,xxx) will
abort with "END OF FILE DETECTED ON UNIT # 05" and a
PRINTFILEINFO tombstone for file "FTN05".
* Most COBOL programs will print an error message such as "READ
ERROR on ACCEPT (COBERR 551)" and then do a QUIT. The "COB" in
the "COBERR" should clue you in.
* Most PASCAL programs will abort with an error message like "****
ATTEMPT TO READ PAST EOF (PASCERR 694)".
* Most BASIC programs will abort with an "**ERROR 92: I/O ERROR ON
INPUT FILE" and then a BASIC tombstone, which is similar to a
PRINTFILEINFO but different. The second line of the tombstone
will probably be "FNAME: BASIN".
This is a less reliable trick since it won't work for programs
that do no direct terminal input (e.g. ones that do no terminal input
at all or do it through V/3000); also, some smart programs may trap
input errors and handle them themselves, rather than just letting the
compiler library abort.
Finally, note that these techniques will probably work for HP
Business Basic, RPG, and maybe even C programs -- all of them will
probably have their own compiler libraries, and all of them (except,
perhaps, C) will have their own way of aborting on an input I/O
error.
Q: I have about 30 job streams that need to run over the weekend. I
want them all to run in sequence, not because each one depends on the
previous ones, but because each is so CPU-intensive that running two
at once would bring the system to its knees (and completely drown out
any other users that might be foolish enough to try to use the
computer at the time).
At first, I tried to set the job limit to 1. This proved
impractical in my environment, since other users may want to submit
their own jobs while the 30 sequential jobs are running. (The other
users' jobs can't use ;HIPRI to bypass the job limit since the users
don't have SM or OP.)
The next thing I tried was the old trick of having each job stream
submit the next job stream at the end of its execution -- for
instance, JOB1 might look like:
!JOB JOB1,USER.PROD;OUTCLASS=,1
...
!STREAM JOB2
!EOJ
JOB1 streams JOB2, JOB2 streams JOB3, etc. Unfortunately, if I do
this, then any job that aborts in the middle would prevent any of the
other jobs from running, which is unacceptable. I thought of putting
!CONTINUEs in front of every line in each job stream, but I WANT a job
stream error to flush the remainder of THAT job stream (but not the
others). Putting !CONTINUEs and then !IFs designed to prevent (in case
of error) the execution of everything EXCEPT for the final !STREAM was
also unreasonable -- my job streams are hundreds of lines long, and
would thus require hundreds of !CONTINUEs and nested !IFs.
Another idea I had was to have each job stream SCHEDULE (not just
stream) the next job at the BEGINNING (rather than the END) of the
job. Thus, JOB1 might look like:
!JOB JOB1,USER.PROD;OUTCLASS=,1
!STREAM JOB2;IN=,3
...
!EOJ
The first line in JOB1 schedules JOB2 to run in 3 hours; if JOB1
aborts, JOB2 would already have been scheduled.
This one almost worked; unfortunately, the run times of the jobs
vary greatly. If JOB1 runs more than the interval time (in this case,
3 hours), then JOB1 and JOB2 would have to run simultaneously, and the
system would grind to a halt. If, however, to counteract this I set
the interval time even higher, then the jobs might just take too long
to run -- even at 3 hours per job, the 30 jobs would take 90 hours
(almost 4 days!). I don't want to have any unused time between job
executions.
(Actually, I only mention this for completeness' sake -- I have an
old Series III and am running MPE V/R, which does not support job
scheduling.)
Finally, there's one other alternative I can think of. I can write
a program that wakes up every several minutes, does a :SHOWJOB into a
disc file, sees if one of my jobs is running, and if none are,
:STREAMs the next one (it would keep track of what the next one should
be). Then I would stream this "traffic-cop" program and it would
submit my 30 jobs.
Unfortunately, I don't relish the task of writing this program,
having it analyze the :SHOWJOB output, execute MPE commands, open
files, etc. I'd like to make do with the minimum possible programming.
Is there an "MPE Programming" solution to this problem that doesn't
require writing a custom program that I'd then have to rely on and
maintain?
A: Wow! You sure are tough to please! When I read your opening
paragraph, I thought of all four of the solutions that you proposed,
but it seems that none of them is good enough for you. Fortunately, I
did manage to come up with another solution, though it took some hard
thinking.
Let's look at what you're trying to achieve. You want JOB2 to start
up as soon as JOB1 finishes (no sooner and no later), whether JOB1
finished successfully or aborted. What mechanism currently exists in
MPE for doing this sort of thing?
Well, there's no way of doing EXACTLY this; however, there is a
similar feature not for jobs, but for files. If JOB1 has an empty
message file opened and JOB2 is waiting on an FREAD against this file,
then JOB2 will awaken as soon as JOB1 closes the file, whether the
close is done normally or as the result of a job abort. Thus, if JOB1
said
!JOB JOB1,...
!OPEN MSGFILE FOR WRITE ACCESS
!STREAM JOB2
....
!EOJ
and JOB2 said
!JOB JOB2,...
!READ MSGFILE
...
!EOJ
then JOB2 would try to read MSGFILE (presumably an empty message
file), see that it's empty, and wait until a record is written to it
OR until all of MSGFILE's writers close the file. JOB1 has MSGFILE
opened for write access as long as it's running; when JOB1 terminates,
MSGFILE will automatically be closed, and JOB2 will automatically wake
up.
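For readers who know Unix-style pipes better than MPE message files, the same wait-until-the-writer-goes-away behavior can be sketched in Python. This is only an analogy, not MPE code: a pipe stands in for MSGFILE, a thread stands in for JOB1, and every name here is made up for illustration.

```python
import os
import threading
import time

def run_chain_demo():
    """Show a reader waking up only when the sole writer closes its end."""
    read_fd, write_fd = os.pipe()       # stands in for the empty MSGFILE
    events = []

    def job1():
        events.append("JOB1 working")
        time.sleep(0.2)                 # JOB1's real work would go here
        events.append("JOB1 done")
        os.close(write_fd)              # the last writer goes away

    worker = threading.Thread(target=job1)
    worker.start()
    # Like JOB2's read: blocks until data arrives or all writers have closed.
    data = os.read(read_fd, 1)
    events.append("JOB2 starts")
    worker.join()
    os.close(read_fd)
    return data, events
```

The read returns an empty result (end of file) rather than a record, which is all the waiting job needs: it has learned that the writer is gone, however the writer came to stop.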
Sounds good? It's very much what we want, except for two things --
MPE has no command for opening a file or for reading one. The "!READ
MSGFILE" can actually be done by saying
!FCOPY FROM=MSGFILE;TO=$NULL
Unfortunately, there's really no documented way for your job to open a
message file and keep it open. A program run by the job can open the
message file, but as soon as the program terminates, the file will be
closed; what you want is to have the Command Interpreter process
itself open the file. But, since the CI has no !OPEN command, this
is impossible, right?
Of course, this (opening a file in the CI and keeping it opened) IS
possible. It just can't be done DIRECTLY, but must instead be done as
a SIDE EFFECT of something else.
The CI often opens files -- the :PURGE command opens the file to be
purged, :RENAME opens the file to be renamed, :LISTF and :SHOWCATALOG
open their list files. However, after opening the file, the CI (being
a well-written program) closes it. If we want to subvert a CI command
so that it will open a file but not close it, we have to somehow
inhibit the close operation.
One way that comes to mind is to somehow make the CI's FCLOSE fail
with some sort of file system error. If the FCLOSE fails, the file
remains opened. Our plan would be to issue some sort of :FILE equation
that would prevent the FCLOSE from succeeding.
There are very few ways in which the FCLOSE of an already-existing
file can fail. One little-known way is if you try to FCLOSE a
permanent file with disposition 2, which means "save the file as
temporary". If you try to do this, you'll get file system error 110,
ATTEMPT TO SAVE PERMANENT FILE AS TEMPORARY (FSERR 110)
If the FCLOSE fails, the file is not closed, and therefore remains
open. Thus, we can say:
:BUILD MSGFILE;MSG
:FILE MSGFILE,OLD;TEMP
:PURGE *MSGFILE
Sure enough, we get the error
ATTEMPT TO SAVE PERMANENT FILE AS TEMPORARY (FSERR 110)
UNABLE TO PURGE FILE *MSGFILE. (CIERR 385)
Now, we do a :LISTF of the file, and we see that... the file has no
asterisk ("*") after its name! Although the FCLOSE failed, the file is
NOT STILL OPENED. The :PURGE command was smart enough to see that,
since the FCLOSE failed, it should do a special FCLOSE that guarantees
that the file will get closed no matter what.
So, what now? Did I lead you all the way through this mess just to
let you down in the end? Hold off on those nasty letters! Try the
following:
:BUILD MSGFILE;MSG
:FILE MSGFILE,OLD;TEMP
:SHOWCATALOG *MSGFILE
Now, do a :LISTF MSGFILE,2 -- the file is still open! The :SHOWCATALOG
command (almost alone of all the MPE commands) does NOT force the
close of its list file; if the FCLOSE fails (as a result of the wicked
:FILE equation we set up), the list file will remain open until the
job or session that did it logs off!
Therefore, here's the solution:
!JOB JOB1,...
!BUILD MSGFILE;MSG
!FILE MSGFILE,OLD;TEMP
!SHOWCATALOG *MSGFILE
!STREAM JOB2
...
!EOJ
!JOB JOB2,...
!FCOPY FROM=MSGFILE;TO=$NULL
!FILE MSGFILE,OLD;TEMP
!SHOWCATALOG *MSGFILE
!STREAM JOB3
...
!EOJ
!JOB JOB3,...
!FCOPY FROM=MSGFILE;TO=$NULL
!FILE MSGFILE,OLD;TEMP
!SHOWCATALOG *MSGFILE
!STREAM JOB4
...
!EOJ
Oh, yes, don't forget to put a few !COMMENTs in those job streams
that explain what you're doing -- I wouldn't want to be the poor
innocent programmer who has to figure out what those !SHOWCATALOGs are
for!
In any event, this is just what you asked for -- an "MPE
programming" solution that meets all your criteria AND doesn't require
a specially-written program that does PAUSEs, SHOWJOBs, etc. The only
problem -- and it's potentially a very serious one -- is that you're
relying on an MPE BUG (:SHOWCATALOG's failure to close its list file
when the normal FCLOSE fails) which in theory could be fixed at any
time.
However, if you're using MPE V/R, you needn't worry about too many
operating system improvements. And, if you can ever ditch your Series
III and get a REAL MACHINE, you might not have such horrible CPU
resource problems.
In any case, this is yet another example of the amazing (and
somewhat disquieting) things that you can do with a bit of ingenuity.
I'll bet you never thought that the :SHOWCATALOG command could do
something like this!
Q: I have a temporary file that I'd like to /TEXT and /KEEP using
EDITOR. I wish that EDITOR had a /TEXT ,TEMP command and a /KEEP ,TEMP
command for doing this, but it doesn't. I tried saying
:FILE MYFILE;TEMP
and then saying
/TEXT *MYFILE
and
/KEEP *MYFILE
-- the /TEXT worked but the /KEEP didn't. I then tried saying
:FILE MYFILE,OLDTEMP
which also let the /TEXT work, but still didn't do the /KEEP right.
Why doesn't this work? What can I do?
A: A :FILE equation's job is to alter the way a file is opened or
closed. Therefore, to understand a :FILE equation's effects on a
particular program, you have to understand how the program opens and
closes the file to begin with.
First of all, what does EDITOR do when you say /TEXT MYFILE? It
FOPENs the file not as an OLD file (which would mean setting bits
(14:2) of the foptions parameter to 1) or as an OLDTEMP file
(foptions.(14:2) := 2), but with foptions.(14:2) set to 3. This
special FOPEN mode tries to look for the file first as a temporary
file, and then if the temporary file doesn't exist, as a permanent
file. When you say
/TEXT MYFILE
EDITOR will first try to text in the temporary file MYFILE, and then
(if there is no temporary MYFILE) the permanent file MYFILE; thus, you
don't have to do anything special to /TEXT in a temporary file. You
didn't need either a :FILE MYFILE,OLDTEMP or a :FILE MYFILE;TEMP.
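The foptions.(14:2) notation may be unfamiliar if you don't write SPL, so here is a small Python sketch of the bit-field extraction. It assumes the usual HP 3000 convention that bit 0 is the most significant bit of the 16-bit word, which makes (14:2) the two low-order bits. The function names are hypothetical, and the label for value 0 (a new file) is my assumption; only values 1 through 3 are described above.

```python
# Meaning of foptions.(14:2); the entry for 0 is an assumption.
DOMAIN = {
    0: "NEW",
    1: "OLD (permanent)",
    2: "OLDTEMP",
    3: "OLD (temporary first, then permanent)",
}

def bit_field(word, start, length):
    """Return word.(start:length) for a 16-bit word whose bit 0 is the MSB."""
    shift = 16 - (start + length)
    return (word >> shift) & ((1 << length) - 1)

def file_domain(foptions):
    """Decode the domain bits of an FOPEN foptions word."""
    return DOMAIN[bit_field(foptions, 14, 2)]
```

Under these assumptions, an foptions value whose two low-order bits are both set decodes to the "temporary first, then permanent" mode that EDITOR uses for /TEXT.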
Now, what happens when EDITOR does a /KEEP MYFILE? It actually does
four things:
1. It tries to FOPEN MYFILE with foptions.(14:2)=3 (look first for
a temporary file, then for a permanent file).
2. If this FOPEN succeeds (i.e. MYFILE already exists), it asks you
"PURGE OLD?", and if you say yes, FCLOSEs it with disposition
DELETE (to purge the old file).
3. It FOPENs MYFILE as a new file (and then writes the data to be
/KEEPed to it).
4. It FCLOSEs MYFILE with disposition SAVE (to save it as a
permanent file).
Assume, then, that there are no file equations in effect. If you don't
have a temporary file (or permanent file) called MYFILE, a /KEEP
MYFILE would:
1. Try to FOPEN the permanent or temporary file MYFILE -- and fail.
2. FOPEN a new file MYFILE and write data to it.
3. FCLOSE MYFILE with SAVE disposition (as a permanent file).
Thus, the /KEEP MYFILE would merely build a permanent file.
What if MYFILE already exists as a temporary file and you do a
/KEEP MYFILE (again without :FILE equations)? EDITOR would:
1. FOPEN the temporary file MYFILE.
2. FCLOSE it with DELETE disposition (thus purging it).
3. FOPEN a new file MYFILE and write data to it.
4. FCLOSE MYFILE with SAVE disposition.
Thus, you'd still build MYFILE as a permanent file, but you'd also
lose the temporary file MYFILE in the process! (Quite strange,
actually, since EDITOR could easily have saved MYFILE as a permanent
file without touching the temporary file.)
Now, what if there is BOTH a temporary file named MYFILE and a
permanent file named MYFILE? Here's what EDITOR would do:
1. FOPEN the temporary file MYFILE.
2. FCLOSE it with DELETE disposition (thus purging it).
3. FOPEN a new file MYFILE and write data to it.
4. FCLOSE MYFILE with SAVE disposition -- which would fail because
there's already a permanent file called MYFILE!
In this case, EDITOR purged the WRONG FILE -- purged the temporary
file INSTEAD of the permanent file; you've lost the temporary file and
still haven't done your /KEEP, since the permanent file still exists.
Confused yet? This is what happens WITHOUT a :FILE equation!
Now, let's say that you do a :FILE MYFILE,OLDTEMP when there is no
temporary (or permanent) file named MYFILE. This is what would happen:
1. EDITOR tries to open the file MYFILE (as a temporary file
because of the :FILE equation). The FOPEN fails because the file
doesn't exist, but that's no big deal, since EDITOR expected it
to fail if the file didn't exist.
2. EDITOR now tries to open the file MYFILE as a new file --
however, the :FILE equation overrides this, and the FOPEN
actually tries to open the file as an OLDTEMP file! This FOPEN
fails, and EDITOR prints "*41*FAILURE TO OPEN KEEP FILE (53) --
NONEXISTENT TEMPORARY FILE (FSERR 53)".
If a temporary file called MYFILE existed, you wouldn't have much more
luck, since by the time the second FOPEN took place, the file would
have been deleted by the first FCLOSE.
Finally, let's say that you do a :FILE MYFILE;TEMP when a temporary
file named MYFILE already exists. Then, when you /KEEP *MYFILE, EDITOR
will:
1. Try to FOPEN MYFILE as an old permanent or temporary file --
this FOPEN would succeed quite well, opening the old temporary
file.
2. Ask you whether you want to "PURGE OLD?" -- if you say YES, it
will try to FCLOSE the file with DELETE disposition. HOWEVER,
the ;TEMP on the :FILE equation (which says "close the file as a
temporary file") will override the DELETE -- the file will be
closed, but not deleted!
3. FOPEN MYFILE as a new file and write all the data to it.
4. Try to FCLOSE the file as a permanent file (SAVE disposition).
The :FILE equation's ;TEMP will override this, so the FCLOSE
will try to close the file as a temporary file, which is exactly
what you wanted. However, remember that the purge of the
temporary file that we tried to do in step #2 actually wasn't
done! Thus, the FCLOSE ;TEMP will fail with a "DUPLICATE
TEMPORARY FILE NAME (FSERR 101)". You said "YES" when asked
"PURGE OLD?", but that never happened because of the :FILE
equation.
Those are the problems you've been running into. If this all sounds
complicated, that's because it is. To understand it, you have to be
keenly aware of every FOPEN and every FCLOSE that EDITOR does. Since
these FOPENs and FCLOSEs actually aren't documented anywhere, you
really have to guess at what EDITOR must be doing.
What's the solution? Well, let's see what it is that we WANT each
FOPEN and FCLOSE to do:
     WHAT EDITOR DOES                  WHAT WE WANT IT TO DO
  1. Open MYFILE as old                Open MYFILE as old
     permanent or temporary            temporary (so if there's
                                       an old permanent MYFILE
                                       but no old temporary
                                       MYFILE, the old permanent
                                       MYFILE won't be purged)
  2. Close MYFILE with DELETE          Close MYFILE with DELETE
     disposition                       disposition
  3. Open MYFILE as a new file         Open MYFILE as a new file
  4. Close MYFILE with SAVE            Close MYFILE with TEMP
     (save as permanent file)          (save as temporary file)
     disposition                       disposition
In other words, we want OPEN #1 to be affected by a :FILE ,OLDTEMP
without having the same :FILE equation affect OPEN #3; and, we want
CLOSE #4 to be affected by a :FILE ;TEMP without having the same :FILE
equation affect CLOSE #2.
What single :FILE equation can we use to do this? The simple answer
is: there isn't one. Instead, we must do two things:
/:PURGE MYFILE,TEMP
to delete any old temporary file named MYFILE and then
/:FILE MYFILE;TEMP
/KEEP *MYFILE
to /KEEP MYFILE as a temporary file. Since the temporary file MYFILE
is already gone by then, the :FILE ;TEMP cannot interfere with FCLOSE #2
(the one that's supposed to purge the old temporary file): FOPEN #1
fails because the old temporary file no longer exists, so FCLOSE #2 is
never done. Instead, the :FILE equation will influence FCLOSE #4, which
will then save the newly-built /KEEP file as a temporary file rather
than as a permanent file.
Thus, that's your solution:
/TEXT MYFILE << no file equation needed >>
...
/:PURGE MYFILE,TEMP
/:FILE MYFILE;TEMP
/KEEP *MYFILE
The only problem is this: what happens if there is both a permanent
and a temporary file named MYFILE? Will the /KEEP then succeed? The
answer to this question is left for the reader...
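If you'd like to check the cases above without sitting at a terminal, the four /KEEP steps can be modeled in a few lines of Python. This is a toy model built only from the description in this answer, not from EDITOR's actual code: two sets stand in for the temporary and permanent file domains, the user is assumed to answer YES to "PURGE OLD?", and the function and result strings are invented for illustration. The one case left for the reader is deliberately left unmodeled.

```python
def keep(temp, perm, name, equation=None):
    """Model /KEEP name. equation is None, "OLDTEMP" or "TEMP".

    temp and perm are sets of file names and are modified in place.
    """
    # Step 1: FOPEN as old temporary-or-permanent (,OLDTEMP: temporary only).
    if name in temp:
        opened = "temp"
    elif name in perm and equation != "OLDTEMP":
        opened = "perm"
    else:
        opened = None

    # Step 2: FCLOSE with DELETE, unless ;TEMP overrides the disposition.
    if opened is not None:
        if equation == "TEMP":
            if opened == "perm":
                raise NotImplementedError("left for the reader")
            # Closed as a temporary file: nothing is purged.
        elif opened == "temp":
            temp.discard(name)
        else:
            perm.discard(name)

    # Step 3: FOPEN as new; ,OLDTEMP turns it into an old-temporary open,
    # which fails because any old temporary file was purged in step 2.
    if equation == "OLDTEMP":
        return "FSERR 53: nonexistent temporary file"

    # Step 4: FCLOSE with SAVE, or as temporary if ;TEMP is in effect.
    if equation == "TEMP":
        if name in temp:
            return "FSERR 101: duplicate temporary file name"
        temp.add(name)
        return "saved as temporary"
    if name in perm:
        return "duplicate permanent file name"
    perm.add(name)
    return "saved as permanent"
```

Calling keep with a temporary MYFILE present and equation="TEMP" reproduces the FSERR 101 failure; removing MYFILE from the temporary set first (the model's version of :PURGE MYFILE,TEMP) lets the same call save the file as temporary.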
Q: I'm trying to decide whether I should do my future development in C
or PASCAL. The one great advantage of PASCAL, I am told, is that it
does much more stringent type checking; I understand that this can be
more of a deficiency than an advantage, but I've heard that PASCAL/XL
lets you optionally waive type checking when needed. I think that type
checking (as long as it can be avoided when necessary) is a very
valuable thing; it's much better to let the compiler catch the bugs
than make the programmer find them all.
However, Draft ANSI Standard C seems to do the converse -- it seems
to bring C's type checking up to a reasonable level (just as PASCAL/XL
brought PASCAL's type checking DOWN to a reasonable level). Is this
true?
A: Like many good answers, the answer to this question is "yes and
no". First, though, let's talk a moment about "classic" (i.e.
pre-Draft ANSI Standard, also known as "Kernighan & Ritchie") C.
K&R C did virtually no type checking at all. For example, say that
the function "func" took two integer parameters. Your program could
easily say
func (10)
and thus pass "func" only one parameter; of course, the results would
be quite unpredictable (and almost certainly undesirable), but the
compiler wouldn't say a thing. Similarly, you could try to pass 3
parameters by saying
func (10, 20, 30)
Again, the compiler will be perfectly happy to do this, even though
it's a pretty obvious bug (one that will probably result in "func"
getting the wrong values and in garbage being left on the stack).
Finally, you might also say:
func (10.0, 20.0)
-- try to pass two floating-point numbers (rather than integers).
Since in most C implementations floating point numbers use twice as
much space as integers, this will again badly confuse both the caller
and the callee, but the compiler will not complain.
Other examples of this total lack of parameter type checking
abound. C requires you to pass all "by-reference" parameters by
specifically prefixing each one with an "&", e.g.
func (&refvara, &refvarb);
If you omit the "&" when it's needed (or specify it when it isn't),
the compiler won't catch the error; needless to say, if you try to
pass a record of type A when the procedure expects a record of type B,
the compiler won't catch this, either.
As you see, this can get pretty unpleasant. God knows, all of us
make mistakes; I, for one, would much rather have the compiler catch
them for me. This may be "protecting the programmer against himself",
but that's a good idea in my book. We programmers need all the
protection we can get, especially against ourselves.
Draft ANSI Standard C -- which more and more C compilers are
implementing -- is much better. It lets you define so-called "function
prototypes", which really are declarations of the parameter types that
each function expects. For instance, if you say
extern int FFF (int, int, float *);
then the compiler will know that FFF is a function that takes three
parameters -- two integers, and a float pointer (i.e. a real number by
reference) -- and returns an integer. Then, if you try to say
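float x;
int i, j;
i = FFF (10, 20);         /* too few parameters */
i = FFF (10, 20, &x, 30); /* too many parameters */
i = FFF (10, 20, x);      /* float passed by value, not by reference */
i = FFF (10, 20, &j);     /* pointer to int where a pointer to float is expected */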
then the compiler will catch all four errors (too few parameters, too
many, not-by-reference, and bad type). There are (fortunately) ways to
waive type checking for all calls to a particular function, for a
particular function parameter, or for a particular function parameter
in a particular function call, but in the overwhelming majority of
cases in which type checking is useful, it's available.
So, to answer your question -- is Draft ANSI Standard C type
checking now as good as PASCAL/XL's? In my opinion, I'm afraid it
isn't.
First of all, remember the old "equal sign" problem? One of the
things that has always bedeviled me about C is that C's assignment
symbol is "=" and its comparison symbol is "==". Thus, saying
if (i=5) ...
does NOT check to see if I is equal to 5 -- it assigns 5 to I and
checks to see if the result (the assigned value) is non-zero. Try
finding a bug like this some time -- it certainly isn't easy! People
have told me "oh, you'll get used to it"; I worked with C on and off
for year