The MFT processor

(Version 2.0, October 1989)
Contents (by section number)
§1 Introduction
§11 The character set
§19 Input and output
§29 Reporting errors to the user
§34 Inserting the changes
§50 Data structures
§63 Initializing the primitive tokens
§75 Inputting the next token
§86 Low-level output routines
§97 Translation
§112 The main program
§114 System-dependent changes
§115 Index
The preparation of this report was supported in part by the National Science Foundation under grants IST-8201926, MCS-8300984, and CCR-8610181, and by the System
Development Foundation. ‘TEX’ is a trademark of the American Mathematical Society.
‘METAFONT’ is a trademark of Addison-Wesley Publishing Company.
1. Introduction. This program converts a METAFONT source file to a TEX file. It was written by D. E.
Knuth in June, 1985; a somewhat similar SAIL program had been developed in January, 1980.
The general idea is to input a file called, say, foo.mf and to produce an output file called, say, foo.tex.
The latter file, when processed by TEX, will yield a “prettyprinted” representation of the input file.
Line breaks in the input are carried over into the output; moreover, blank spaces at the beginning of a
line are converted to quads of indentation in the output. Thus, the user has full control over the indentation
and line breaks. Each line of input is translated independently of the others.
A slight change to METAFONT’s comment convention allows further control. Namely, ‘%%’ indicates that
the remainder of an input line should be copied verbatim to the output; this interrupts the translation and
forces MFT to produce a certain result.
Furthermore, ‘%%% ⟨token_1⟩ . . . ⟨token_n⟩’ introduces a change in MFT’s formatting rules; all tokens after
the first will henceforth be translated according to the current conventions for ⟨token_1⟩. The tokens must
be symbolic (i.e., not numeric or string tokens). For example, the input line
%%% addto fill draw filldraw
says that the ‘fill’, ‘draw’, and ‘filldraw’ operations of plain METAFONT should be formatted as the
primitive token ‘addto’, i.e., in boldface type. (Without such reformatting commands, MFT would treat ‘fill’
like an ordinary tag or variable name. In fact, you need a reformatting command even to get parentheses to
act like delimiters!)
METAFONT comments, which follow a single % sign, should be valid TEX input. But METAFONT material
can be included in | . . . | within a comment; this will be translated by MFT as if it were not in a comment.
For example, a phrase like ‘make |x2r| zero’ will be translated into ‘make $x_{2r}$ zero’.
The rules just stated apply to lines that contain one, two, or three % signs in a row. Comments to MFT can
follow ‘%%%%’. Five or more % signs should not be used.
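The percent-sign conventions just described can be summarized in a small Python sketch. It is only an illustration, not MFT's actual scanner: the function name `classify_percent_run` and the category labels are hypothetical, and the sketch ignores the possibility of a % sign inside a string token.

```python
def classify_percent_run(line):
    """Classify a line by the length of its first run of % signs (simplified)."""
    start = line.find("%")
    if start < 0:
        return ("code", line)  # no % at all: ordinary METAFONT material
    end = start
    while end < len(line) and line[end] == "%":
        end += 1
    kinds = {
        1: "comment",      # TeX text, with |...| translated as METAFONT
        2: "verbatim",     # rest of the line is copied to the output
        3: "format",       # reformatting command for the listed tokens
        4: "mft-comment",  # comment addressed to MFT itself
    }
    # Five or more % signs should not be used, so they are flagged here.
    return (kinds.get(end - start, "invalid"), line[end:])
```

For instance, the line `%%% addto fill draw filldraw` would be classified as a formatting command whose remaining text lists the tokens.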
Besides the normal input file, MFT also looks for a change file (e.g., ‘foo.ch’), which allows substitutions
to be made in the translation. The change file follows the conventions of WEB, and it should be null if there
are no changes. (Changes usually contain verbatim instructions to compensate for the fact that MFT cannot
format everything in an optimum way.)
There’s also a third input file (e.g., ‘plain.mft’), which is input before the other two. This file normally
contains the ‘%%%’ formatting commands that are necessary to tune MFT to a particular style of METAFONT
code, so it is called the style file.
The output of MFT should be accompanied by the macros in a small package called mftmac.tex.
Caveat: This program is not as “bulletproof” as the other routines produced by Stanford’s TEX project.
It takes care of a great deal of tedious formatting, but it can produce strange output, because METAFONT
is an extremely general language. Users should proofread their output carefully.
2. MFT uses a few features of the local Pascal compiler that may need to be changed in other installations:
1) Case statements have a default.
2) Input-output routines may need to be adapted for use with a particular character set and/or for printing
messages on the user’s terminal.
These features are also present in the Pascal version of TEX, where they are used in a similar (but more
complex) way. System-dependent portions of MFT can be identified by looking at the entries for ‘system
dependencies’ in the index below.
The “banner line” defined here should be changed whenever MFT is modified.
define banner ≡ ´This is MFT, Version 2.0´
3. The program begins with a fairly normal header, made up of pieces that will mostly be filled in later.
The MF input comes from files mf_file, change_file, and style_file; the TEX output goes to file tex_file.
If it is necessary to abort the job because of a fatal error, the program calls the ‘jump_out’ procedure,
which goes to the label end_of_MFT.
define end_of_MFT = 9999 { go here to wrap it up }
⟨Compiler directives 4⟩
program MFT(mf_file, change_file, style_file, tex_file);
label end_of_MFT; { go here to finish }
const ⟨Constants in the outer block 8⟩
type ⟨Types in the outer block 12⟩
var ⟨Globals in the outer block 9⟩
⟨Error handling procedures 29⟩
procedure initialize;
var ⟨Local variables for initialization 14⟩
begin ⟨Set initial values 10⟩
end;
4. The Pascal compiler used to develop this system has “compiler directives” that can appear in comments
whose first character is a dollar sign. In our case these directives tell the compiler to detect things that are
out of range.
⟨Compiler directives 4⟩ ≡
@{@&$C+, A+, D−@} { range check, catch arithmetic overflow, no debug overhead }
This code is used in section 3.
5. Labels are given symbolic names by the following definitions. We insert the label ‘exit:’ just before
the ‘end’ of a procedure in which we have used the ‘return’ statement defined below; the label ‘restart’
is occasionally used at the very beginning of a procedure; and the label ‘reswitch’ is occasionally used just
prior to a case statement in which some cases change the conditions and we wish to branch to the newly
applicable case. Loops that are set up with the loop construction defined below are commonly exited by
going to ‘done’ or to ‘found’ or to ‘not_found’, and they are sometimes repeated by going to ‘continue’.
define exit = 10 { go here to leave a procedure }
define restart = 20 { go here to start a procedure again }
define reswitch = 21 { go here to start a case statement again }
define continue = 22 { go here to resume a loop }
define done = 30 { go here to exit a loop }
define found = 31 { go here when you’ve found it }
define not_found = 32 { go here when you’ve found something else }
6. Here are some macros for common programming idioms.
define incr(#) ≡ # ← # + 1 { increase a variable by unity }
define decr(#) ≡ # ← # − 1 { decrease a variable by unity }
define loop ≡ while true do { repeat over and over until a goto happens }
define do_nothing ≡ { empty statement }
define return ≡ goto exit { terminate a procedure call }
format return ≡ nil
format loop ≡ xclause
7. We assume that case statements may include a default case that applies if no matching label is found.
Thus, we shall use constructions like
case x of
1: ⟨code for x = 1⟩;
3: ⟨code for x = 3⟩;
othercases ⟨code for x ≠ 1 and x ≠ 3⟩
endcases
since most Pascal compilers have plugged this hole in the language by incorporating some sort of default
mechanism. For example, the compiler used to develop WEB and TEX allows ‘others :’ as a default label, and
other Pascals allow syntaxes like ‘else’ or ‘otherwise’ or ‘otherwise :’, etc. The definitions of othercases
and endcases should be changed to agree with local conventions. (Of course, if no default mechanism is
available, the case statements of this program must be extended by listing all remaining cases.)
define othercases ≡ others : { default for cases not listed explicitly }
define endcases ≡ end { follows the default case in an extended case statement }
format othercases ≡ else
format endcases ≡ end
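For readers more familiar with other languages, the effect of othercases can be sketched in Python as a table lookup with a fallback. This is only an analogy to the Pascal construction; the function name `dispatch` and the returned strings are hypothetical placeholders.

```python
def dispatch(x):
    """Mimic: case x of 1: ...; 3: ...; othercases ... endcases."""
    handlers = {
        1: lambda: "code for x = 1",
        3: lambda: "code for x = 3",
    }
    # The fallback plays the role of othercases: it runs when no label matches.
    default = lambda: "default code"
    return handlers.get(x, default)()
```

Without such a fallback, every remaining value of x would have to be listed explicitly, which is the situation the parenthetical remark above describes.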
8. The following parameters are set big enough to handle the Computer Modern fonts, so they should be
sufficient for most applications of MFT.
⟨Constants in the outer block 8⟩ ≡
max_bytes = 10000; { the number of bytes in tokens; must be less than 65536 }
max_names = 1000; { number of tokens }
hash_size = 353; { should be prime }
buf_size = 100; { maximum length of input line }
line_length = 80; { lines of TEX output have at most this many characters, should be less than 256 }
This code is used in section 3.
9. A global variable called history will contain one of four values at the end of every run: spotless means that
no unusual messages were printed; harmless_message means that a message of possible interest was printed
but no serious errors were detected; error_message means that at least one error was found; fatal_message
means that the program terminated abnormally. The value of history does not influence the behavior of the
program; it is simply computed for the convenience of systems that might want to use such information.
define spotless = 0 { history value for normal jobs }
define harmless_message = 1 { history value when non-serious info was printed }
define error_message = 2 { history value when an error was noted }
define fatal_message = 3 { history value when we had to stop prematurely }
define mark_harmless ≡ if history = spotless then history ← harmless_message
define mark_error ≡ history ← error_message
define mark_fatal ≡ history ← fatal_message
⟨Globals in the outer block 9⟩ ≡
history: spotless . . fatal_message; { how bad was this run? }
See also sections 15, 20, 23, 25, 27, 34, 36, 51, 53, 55, 72, 74, 75, 77, 78, and 86.
This code is used in section 3.
10. ⟨Set initial values 10⟩ ≡
history ← spotless;
See also sections 16, 17, 18, 21, 26, 54, 57, 76, 79, 88, and 90.
This code is used in section 3.
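The behavior of the three marking macros can be illustrated with a short Python sketch. The class name `History` is hypothetical; the constants and the marking rules follow the definitions in section 9, in particular the fact that mark_harmless changes history only when it is still spotless.

```python
SPOTLESS, HARMLESS_MESSAGE, ERROR_MESSAGE, FATAL_MESSAGE = range(4)

class History:
    """Tracks the worst kind of message seen so far in a run."""

    def __init__(self):
        self.value = SPOTLESS  # as in "Set initial values"

    def mark_harmless(self):
        # Only upgrades a spotless run; never overrides an error.
        if self.value == SPOTLESS:
            self.value = HARMLESS_MESSAGE

    def mark_error(self):
        self.value = ERROR_MESSAGE

    def mark_fatal(self):
        self.value = FATAL_MESSAGE
```

A caller such as an operating-system wrapper could inspect the final value to decide, for example, on a process exit status; that use is an assumption, since the text says only that the value is computed for the convenience of other systems.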
11. The character set. MFT works internally with ASCII codes, like all other programs associated with
TEX and METAFONT. The present section has been lifted almost verbatim from the METAFONT program.
12. Characters of text that have been converted to METAFONT’s internal form are said to be of type
ASCII_code, which is a subrange of the integers.
⟨Types in the outer block 12⟩ ≡
ASCII_code = 0 . . 255; { eight-bit numbers }
See also sections 13, 50, and 52.
This code is used in section 3.
13. The original Pascal compiler was designed in the late 60s, when six-bit character sets were common,
so it did not make provision for lowercase letters. Nowadays, of course, we need to deal with both capital
and small letters in a convenient way, especially in a program for font design; so the present specification
of MFT has been written under the assumption that the Pascal compiler and run-time system permit the
use of text files with more than 64 distinguishable characters. More precisely, we assume that the character
set contains at least the letters and symbols associated with ASCII codes ´40 through ´176 . If additional
characters are present, MFT can be configured to work with them too.
Since we are dealing with more characters than were present in the first Pascal compilers, we have to
decide what to call the associated data type. Some Pascals use the original name char for the characters in
text files, even though there now are more than 64 such characters, while other Pascals consider char to be
a 64-element subrange of a larger data type that has some other name.
In order to accommodate this difference, we shall use the name text_char to stand for the data type of
the characters that are converted to and from ASCII_code when they are input and output. We shall also
assume that text_char consists of the elements chr(first_text_char) through chr(last_text_char), inclusive.
The following definitions should be adjusted if necessary.
define text_char ≡ char { the data type of characters in text files }
define first_text_char = 0 { ordinal number of the smallest element of text_char }
define last_text_char = 255 { ordinal number of the largest element of text_char }
⟨Types in the outer block 12⟩ +≡
text_file = packed file of text_char;
14. ⟨Local variables for initialization 14⟩ ≡
i: 0 . . 255;
See also section 56.
This code is used in section 3.
15. The MFT processor converts between ASCII_code and the user’s external character set by means of
arrays xord and xchr that are analogous to Pascal’s ord and chr functions.
⟨Globals in the outer block 9⟩ +≡
xord: array [text_char] of ASCII_code; { specifies conversion of input characters }
xchr: array [ASCII_code] of text_char; { specifies conversion of output characters }
16. Since we are assuming that our Pascal system is able to read and write the visible characters of
standard ASCII (although not necessarily using the ASCII codes to represent them), the following assignment
statements initialize most of the xchr array properly, without needing any system-dependent changes. On
the other hand, it is possible to implement MFT with less complete character sets, and in such cases it will
be necessary to change something here.
⟨Set initial values 10⟩ +≡
xchr [´40 ] ← ´ ´; xchr [´41 ] ← ´!´; xchr [´42 ] ← ´"´; xchr [´43 ] ← ´#´; xchr [´44 ] ← ´$´;
xchr [´45 ] ← ´%´; xchr [´46 ] ← ´&´; xchr [´47 ] ← ´´´´;
xchr [´50 ] ← ´(´; xchr [´51 ] ← ´)´; xchr [´52 ] ← ´*´; xchr [´53 ] ← ´+´; xchr [´54 ] ← ´,´;
xchr [´55 ] ← ´−´; xchr [´56 ] ← ´.´; xchr [´57 ] ← ´/´;
xchr [´60 ] ← ´0´; xchr [´61 ] ← ´1´; xchr [´62 ] ← ´2´; xchr [´63 ] ← ´3´; xchr [´64 ] ← ´4´;
xchr [´65 ] ← ´5´; xchr [´66 ] ← ´6´; xchr [´67 ] ← ´7´;
xchr [´70 ] ← ´8´; xchr [´71 ] ← ´9´; xchr [´72 ] ← ´:´; xchr [´73 ] ← ´;´; xchr [´74 ] ← ´<´;
xchr [´75 ] ← ´=´; xchr [´76 ] ← ´>´; xchr [´77 ] ← ´?´;
xchr [´100 ] ← ´@´; xchr [´101 ] ← ´A´; xchr [´102 ] ← ´B´; xchr [´103 ] ← ´C´; xchr [´104 ] ← ´D´;
xchr [´105 ] ← ´E´; xchr [´106 ] ← ´F´; xchr [´107 ] ← ´G´;
xchr [´110 ] ← ´H´; xchr [´111 ] ← ´I´; xchr [´112 ] ← ´J´; xchr [´113 ] ← ´K´; xchr [´114 ] ← ´L´;
xchr [´115 ] ← ´M´; xchr [´116 ] ← ´N´; xchr [´117 ] ← ´O´;
xchr [´120 ] ← ´P´; xchr [´121 ] ← ´Q´; xchr [´122 ] ← ´R´; xchr [´123 ] ← ´S´; xchr [´124 ] ← ´T´;
xchr [´125 ] ← ´U´; xchr [´126 ] ← ´V´; xchr [´127 ] ← ´W´;
xchr [´130 ] ← ´X´; xchr [´131 ] ← ´Y´; xchr [´132 ] ← ´Z´; xchr [´133 ] ← ´[´; xchr [´134 ] ← ´\´;
xchr [´135 ] ← ´]´; xchr [´136 ] ← ´^´; xchr [´137 ] ← ´_´;
xchr [´140 ] ← ´`´; xchr [´141 ] ← ´a´; xchr [´142 ] ← ´b´; xchr [´143 ] ← ´c´; xchr [´144 ] ← ´d´;
xchr [´145 ] ← ´e´; xchr [´146 ] ← ´f´; xchr [´147 ] ← ´g´;
xchr [´150 ] ← ´h´; xchr [´151 ] ← ´i´; xchr [´152 ] ← ´j´; xchr [´153 ] ← ´k´; xchr [´154 ] ← ´l´;
xchr [´155 ] ← ´m´; xchr [´156 ] ← ´n´; xchr [´157 ] ← ´o´;
xchr [´160 ] ← ´p´; xchr [´161 ] ← ´q´; xchr [´162 ] ← ´r´; xchr [´163 ] ← ´s´; xchr [´164 ] ← ´t´;
xchr [´165 ] ← ´u´; xchr [´166 ] ← ´v´; xchr [´167 ] ← ´w´;
xchr [´170 ] ← ´x´; xchr [´171 ] ← ´y´; xchr [´172 ] ← ´z´; xchr [´173 ] ← ´{´; xchr [´174 ] ← ´|´;
xchr [´175 ] ← ´}´; xchr [´176 ] ← ´~´;
17. The ASCII code is “standard” only to a certain extent, since many computer installations have found
it advantageous to have ready access to more than 94 printing characters. If MFT is being used on a garden-variety Pascal for which only standard ASCII codes will appear in the input and output files, it doesn’t
really matter what codes are specified in xchr [0 . . ´37 ], but the safest policy is to blank everything out by
using the code shown below.
However, other settings of xchr will make MFT more friendly on computers that have an extended character
set, so that users can type things like ‘≠’ instead of ‘<>’, and so that MFT can echo the page breaks found
in its input. People with extended character sets can assign codes arbitrarily, giving an xchr equivalent to
whatever characters the users of MFT are allowed to have in their input files. Appropriate changes to MFT’s
char_class table should then be made. (Unlike TEX, each installation of METAFONT has a fixed assignment
of category codes, called the char_class.) Such changes make portability of programs more difficult, so they
should be introduced cautiously if at all.
⟨Set initial values 10⟩ +≡
for i ← 0 to ´37 do xchr [i] ← ´ ´;
for i ← ´177 to ´377 do xchr [i] ← ´ ´;
18. The following system-independent code makes the xord array contain a suitable inverse to the information in xchr . Note that if xchr [i] = xchr [j] where i < j < ´177 , the value of xord [xchr [i]] will turn out
to be j or more; hence, standard ASCII code numbers will be used instead of codes below ´40 in case there
is a coincidence.
⟨Set initial values 10⟩ +≡
for i ← first_text_char to last_text_char do xord[chr(i)] ← ´177;
for i ← ´200 to ´377 do xord [xchr [i]] ← i;
for i ← 1 to ´176 do xord [xchr [i]] ← i;
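The interplay of the xchr and xord initializations in sections 16 through 18 can be checked with a small Python sketch. It assumes the default setup described above (printable ASCII mapped to itself, everything else blanked out) and models the Pascal arrays with a list and a dictionary; those representation choices are illustrative only.

```python
# xchr: internal code -> external character. Blank out everything first,
# then fill in the visible ASCII characters (octal 40 through 176).
xchr = [" "] * 256
for i in range(0o40, 0o177):
    xchr[i] = chr(i)

# xord: external character -> internal code, built in the same order
# as the three loops of section 18.
xord = {chr(c): 0o177 for c in range(256)}   # default for unknown characters
for i in range(0o200, 0o400):
    xord[xchr[i]] = i
for i in range(1, 0o177):
    xord[xchr[i]] = i                        # later assignments win
```

Because the last loop runs upward, the blank character ends up with code ´40 rather than one of the codes below ´40 that also map to a blank in xchr, which is the coincidence rule stated in section 18.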
19. Input and output. The I/O conventions of this program are essentially identical to those of WEAVE.
Therefore people who need to make modifications should be able to do so without too many headaches.
20. Terminal output is done by writing on file term_out, which is assumed to consist of characters of type
text_char:
define print(#) ≡ write(term_out, #) { ‘print’ means write on the terminal }
define print_ln(#) ≡ write_ln(term_out, #) { ‘print’ and then start new line }
define new_line ≡ write_ln(term_out) { start new line on the terminal }
define print_nl(#) ≡ { print information starting on a new line }
  begin new_line; print(#);
  end
⟨Globals in the outer block 9⟩ +≡
term_out: text_file; { the terminal as an output file }
21. Different systems have different ways of specifying that the output on a certain file will appear on the
user’s terminal. Here is one way to do this on the Pascal system that was used in WEAVE’s initial development:
⟨Set initial values 10⟩ +≡
rewrite(term_out, ´TTY:´); { send term_out output to the terminal }
22. The update_terminal procedure is called when we want to make sure that everything we have output
to the terminal so far has actually left the computer’s internal buffers and been sent.
define update_terminal ≡ break(term_out) { empty the terminal output buffer }
23. The main input comes from mf_file; this input may be overridden by changes in change_file. (If
change_file is empty, there are no changes.) Furthermore the style_file is input first; it is unchangeable.
⟨Globals in the outer block 9⟩ +≡
mf_file: text_file; { primary input }
change_file: text_file; { updates }
style_file: text_file; { formatting bootstrap }
24. The following code opens the input files. Since these files were listed in the program header, we assume
that the Pascal runtime system has already checked that suitable file names have been given; therefore no
additional error checking needs to be done.
procedure open_input; { prepare to read the inputs }
begin reset(mf_file); reset(change_file); reset(style_file);
end;
25. The main output goes to tex_file.
⟨Globals in the outer block 9⟩ +≡
tex_file: text_file;
26. The following code opens tex_file. Since this file was listed in the program header, we assume that the
Pascal runtime system has checked that a suitable external file name has been given.
⟨Set initial values 10⟩ +≡
rewrite(tex_file);
27. Input goes into an array called buffer.
⟨Globals in the outer block 9⟩ +≡
buffer: array [0 . . buf_size] of ASCII_code;
28. The input_ln procedure brings the next line of input from the specified file into the buffer array and
returns the value true, unless the file has already been entirely read, in which case it returns false. The
conventions of TEX are followed; i.e., ASCII_code numbers representing the next line of the file are input
into buffer[0], buffer[1], . . . , buffer[limit − 1]; trailing blanks are ignored; and the global variable limit is set
to the length of the line. The value of limit must be strictly less than buf_size.
function input_ln(var f: text_file): boolean; { inputs a line or returns false }
var final_limit: 0 . . buf_size; { limit without trailing blanks }
begin limit ← 0; final_limit ← 0;
if eof(f) then input_ln ← false
else begin while ¬eoln(f) do
    begin buffer[limit] ← xord[f↑]; get(f); incr(limit);
    if buffer[limit − 1] ≠ " " then final_limit ← limit;
    if limit = buf_size then
      begin while ¬eoln(f) do get(f);
      decr(limit); { keep buffer[buf_size] empty }
      if final_limit > limit then final_limit ← limit;
      print_nl(´! Input line too long´); loc ← 0; error;
      end;
    end;
  read_ln(f); limit ← final_limit; input_ln ← true;
  end;
end;
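The two notable effects of input_ln, dropping trailing blanks and truncating overlong lines, can be sketched in Python as follows. This is a simplified model operating on one line of text at a time rather than on a Pascal file; the function name `read_line_into_buffer` and its return convention are hypothetical.

```python
def read_line_into_buffer(text_line, buf_size=100):
    """Return (kept_text, too_long) for one input line, without its newline."""
    buffer = []
    final_limit = 0          # length of the line without trailing blanks
    too_long = False
    for ch in text_line:
        buffer.append(ch)
        if ch != " ":
            final_limit = len(buffer)
        if len(buffer) == buf_size:
            # The rest of the line is skipped and one position is given
            # back, so that the last buffer slot stays empty.
            buffer.pop()
            final_limit = min(final_limit, len(buffer))
            too_long = True  # corresponds to '! Input line too long'
            break
    return "".join(buffer[:final_limit]), too_long
```

Under this model a line of exactly buf_size characters already counts as too long, which matches the requirement that limit stay strictly less than buf_size.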
29. Reporting errors to the user. The command ‘err_print(´! Error message´)’ will report a
syntax error to the user, by printing the error message at the beginning of a new line and then giving
an indication of where the error was spotted in the source file. Note that no period follows the error
message, since the error routine will automatically supply a period.
The actual error indications are provided by a procedure called error.
define err_print(#) ≡
  begin new_line; print(#); error;
  end
⟨Error handling procedures 29⟩ ≡
procedure error; { prints ‘.’ and location of error message }
var k, l: 0 . . buf_size; { indices into buffer }
begin ⟨Print error location based on input buffer 30⟩;
update_terminal; mark_error;
end;
See also section 31.
This code is used in section 3.
30. The error locations can be indicated by using the global variables loc, line, styling, and changing,
which tell respectively the first unlooked-at position in buffer, the current line number, and whether or not
the current line is from style_file or change_file or mf_file. This routine should be modified on systems whose
standard text editor has special line-numbering conventions.
⟨Print error location based on input buffer 30⟩ ≡
begin if styling then print(´. (style file ´)
else if changing then print(´. (change file ´) else print(´. (´);
print_ln(´l.´, line : 1, ´)´);
if loc ≥ limit then l ← limit
else l ← loc;
for k ← 1 to l do print(xchr[buffer[k − 1]]); { print the characters already read }
new_line;
for k ← 1 to l do print(´ ´); { space out the next line }
for k ← l + 1 to limit do print(xchr[buffer[k − 1]]); { print the part not yet read }
end
This code is used in section 29.
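The two-line error display produced by section 30 can be mimicked in Python. The sketch below returns the lines instead of printing them; the function name `error_context` and its parameters are hypothetical, and the `source` argument stands for the ‘style file ’ or ‘change file ’ prefix (empty for the main input).

```python
def error_context(buffer, loc, line_no, source=""):
    """Return the location line and the split display of the current input line."""
    limit = len(buffer)
    l = min(loc, limit)                   # never point past the end of the line
    header = f". ({source}l.{line_no})"
    already_read = buffer[:l]             # first display line
    not_yet_read = " " * l + buffer[l:]   # second line, spaced out to align
    return [header, already_read, not_yet_read]
```

The break between the two display lines marks the position at which the error was detected.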
31. The jump_out procedure just cuts across all active procedure levels and jumps out of the program.
This is the only non-local goto statement in MFT. It is used when no recovery from a particular error has
been provided.
Some Pascal compilers do not implement non-local goto statements. In such cases the code that appears
at label end_of_MFT should be copied into the jump_out procedure, followed by a call to a system procedure
that terminates the program.
define fatal_error(#) ≡
  begin new_line; print(#); error; mark_fatal; jump_out;
  end
⟨Error handling procedures 29⟩ +≡
procedure jump_out;
begin goto end_of_MFT;
end;
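In a language with exceptions, the non-local goto of jump_out is commonly replaced by raising an exception that is caught at the outermost level. The following Python sketch shows that substitution; it is an analogy rather than a description of any actual MFT port, and the names `JumpOut`, `fatal_error`, and `run` are hypothetical.

```python
FATAL_MESSAGE = 3  # history value when we had to stop prematurely

class JumpOut(Exception):
    """Stands in for the non-local goto to end_of_MFT."""

def fatal_error(state, message):
    state["messages"].append(message)   # print the message
    state["history"] = FATAL_MESSAGE    # mark_fatal
    raise JumpOut                       # jump_out

def run(job):
    state = {"history": 0, "messages": []}
    try:
        job(state)
    except JumpOut:
        pass  # continue with the wrap-up code, as at label end_of_MFT
    return state
```

Any code following a call of `fatal_error` inside `job` is skipped, just as the Pascal goto cuts across all active procedure levels.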
32. Sometimes the program’s behavior is far different from what it should be, and MFT prints an error
message that is really for the MFT maintenance person, not the user. In such cases the program says
confusion(´indication of where we are´).
define confusion(#) ≡ fatal_error(´! This can´´t happen (´, #, ´)´)
33. An overflow stop occurs if MFT’s tables aren’t large enough.
define overflow(#) ≡ fatal_error(´! Sorry, ´, #, ´ capacity exceeded´)
34. Inserting the changes. Let’s turn now to the low-level routine get_line that takes care of merging
change_file into mf_file. The get_line procedure also updates the line numbers for error messages. (This
routine was copied from WEAVE, but updated to include styling.)
⟨Globals in the outer block 9⟩ +≡
line: integer; { the number of the current line in the current file }
other_line: integer; { the number of the current line in the input file that is not currently being read }
temp_line: integer; { used when interchanging line with other_line }
limit: 0 . . buf_size; { the last character position occupied in the buffer }
loc: 0 . . buf_size; { the next character position to be read from the buffer }
input_has_ended: boolean; { if true, there is no more input }
changing: boolean; { if true, the current line is from change_file }
styling: boolean; { if true, the current line is from style_file }
35. As we change changing from true to false and back again, we must remember to swap the values of
line and other_line so that the err_print routine will be sure to report the correct line number.
define change_changing ≡ changing ← ¬changing; temp_line ← other_line; other_line ← line;
  line ← temp_line { line ↔ other_line }
36. When changing is false, the next line of change_file is kept in change_buffer[0 . . change_limit], for
purposes of comparison with the next line of mf_file. After the change file has been completely input, we set
change_limit ← 0, so that no further matches will be made.
⟨Globals in the outer block 9⟩ +≡
change_buffer: array [0 . . buf_size] of ASCII_code;
change_limit: 0 . . buf_size; { the last position occupied in change_buffer }
37. Here’s a simple function that checks if the two buffers are different.
function lines_dont_match: boolean;
label exit;
var k: 0 . . buf_size; { index into the buffers }
begin lines_dont_match ← true;
if change_limit ≠ limit then return;
if limit > 0 then
  for k ← 0 to limit − 1 do
    if change_buffer[k] ≠ buffer[k] then return;
lines_dont_match ← false;
exit: end;
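The comparison amounts to a length check followed by an element-by-element test of the occupied positions. A Python sketch, with the global buffers passed as explicit arguments (an assumption made only so that the function is self-contained):

```python
def lines_dont_match(buffer, limit, change_buffer, change_limit):
    """True if the two buffered lines differ in length or in any character."""
    if change_limit != limit:
        return True
    # Only positions 0 .. limit-1 are significant; anything beyond is stale.
    return buffer[:limit] != change_buffer[:limit]
```

Note that leftover contents beyond limit are ignored, so stale characters from an earlier, longer line cannot cause a spurious mismatch.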
38. Procedure prime_the_change_buffer sets change_buffer in preparation for the next matching operation.
Since blank lines in the change file are not used for matching, we have (change_limit = 0) ∧ ¬changing if
and only if the change file is exhausted. This procedure is called only when changing is true; hence error
messages will be reported correctly.
procedure prime_the_change_buffer;
label continue, done, exit;
var k: 0 . . buf_size; { index into the buffers }
begin change_limit ← 0; { this value will be used if the change file ends }
⟨Skip over comment lines in the change file; return if end of file 39⟩;
⟨Skip to the next nonblank line; return if end of file 40⟩;
⟨Move buffer and limit to change_buffer and change_limit 41⟩;
exit: end;
39. While looking for a line that begins with @x in the change file, we allow lines that begin with @, as
long as they don’t begin with @y or @z (which would probably indicate that the change file is fouled up).
⟨Skip over comment lines in the change file; return if end of file 39⟩ ≡
loop begin incr(line);
  if ¬input_ln(change_file) then return;
  if limit < 2 then goto continue;
  if buffer[0] ≠ "@" then goto continue;
  if (buffer[1] ≥ "X") ∧ (buffer[1] ≤ "Z") then buffer[1] ← buffer[1] + "z" − "Z"; { lowercasify }
  if buffer[1] = "x" then goto done;
  if (buffer[1] = "y") ∨ (buffer[1] = "z") then
    begin loc ← 2; err_print(´! Where is the matching @x?´);
    end;
continue: end;
done:
This code is used in section 38.
40. Here we are looking at lines following the @x.
⟨Skip to the next nonblank line; return if end of file 40⟩ ≡
repeat incr(line);
  if ¬input_ln(change_file) then
    begin err_print(´! Change file ended after @x´); return;
    end;
until limit > 0;
This code is used in section 38.
41. ⟨Move buffer and limit to change_buffer and change_limit 41⟩ ≡
begin change_limit ← limit;
if limit > 0 then
  for k ← 0 to limit − 1 do change_buffer[k] ← buffer[k];
end
This code is used in sections 38 and 42.
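Sections 38 through 41 together find the first line that a change entry expects to match. A Python sketch of that search follows; it works on an iterator of change-file lines and collects error messages in a list, both of which are assumptions made for the sake of a self-contained example (the real procedure uses the global buffer and err_print).

```python
def prime_the_change_buffer(lines, errors):
    """Return the first nonblank line after the next @x, or None at end of file.

    `lines` must be an iterator, so that the second loop continues where
    the first one stopped.
    """
    for line in lines:
        line = line.rstrip(" ")              # input_ln drops trailing blanks
        if len(line) < 2 or line[0] != "@":
            continue                         # comment line between changes
        code = line[1].lower()
        if code == "x":
            break                            # found the start of a change
        if code in ("y", "z"):
            errors.append("! Where is the matching @x?")
    else:
        return None                          # change file exhausted
    for line in lines:
        line = line.rstrip(" ")
        if line:
            return line                      # blank lines are not used for matching
    errors.append("! Change file ended after @x")
    return None
```

A return value of None plays the role of change_limit = 0, the state in which no further matches will be made.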
42. The following procedure is used to see if the next change entry should go into effect; it is called only
when changing is false. The idea is to test whether or not the current contents of buffer matches the current
contents of change_buffer. If not, there’s nothing more to do; but if so, a change is called for: All of the
text down to the @y is supposed to match. An error message is issued if any discrepancy is found. Then the
procedure prepares to read the next line from change_file.
procedure check_change; { switches to change_file if the buffers match }
label exit;
var n: integer; { the number of discrepancies found }
  k: 0 . . buf_size; { index into the buffers }
begin if lines_dont_match then return;
n ← 0;
loop begin change_changing; { now it’s true }
  incr(line);
  if ¬input_ln(change_file) then
    begin err_print(´! Change file ended before @y´); change_limit ← 0; change_changing;
      { false again }
    return;
    end;
  ⟨If the current line starts with @y, report any discrepancies and return 43⟩;
  ⟨Move buffer and limit to change_buffer and change_limit 41⟩;
  change_changing; { now it’s false }
  incr(line);
  if ¬input_ln(mf_file) then
    begin err_print(´! MF file ended during a change´); input_has_ended ← true; return;
    end;
  if lines_dont_match then incr(n);
  end;
exit: end;
43. ⟨If the current line starts with @y, report any discrepancies and return 43⟩ ≡
if limit > 1 then
  if buffer[0] = "@" then
    begin if (buffer[1] ≥ "X") ∧ (buffer[1] ≤ "Z") then buffer[1] ← buffer[1] + "z" − "Z";
      { lowercasify }
    if (buffer[1] = "x") ∨ (buffer[1] = "z") then
      begin loc ← 2; err_print(´! Where is the matching @y?´);
      end
    else if buffer[1] = "y" then
      begin if n > 0 then
        begin loc ← 2;
        err_print(´! Hmm... ´, n : 1, ´ of the preceding lines failed to match´);
        end;
      return;
      end;
    end
This code is used in section 42.
44. Here’s what we do to get the input rolling.
⟨Initialize the input system 44⟩ ≡
begin open_input; line ← 0; other_line ← 0;
changing ← true; prime_the_change_buffer; change_changing;
styling ← true; limit ← 0; loc ← 1; buffer[0] ← " "; input_has_ended ← false;
end
This code is used in section 112.
45. The get_line procedure is called when loc > limit; it puts the next line of merged input into the buffer
and updates the other variables appropriately.
procedure get_line; { inputs the next line }
label restart;
begin restart: if styling then ⟨Read from style_file and maybe turn off styling 47⟩;
if ¬styling then
  begin if changing then ⟨Read from change_file and maybe turn off changing 48⟩;
  if ¬changing then
    begin ⟨Read from mf_file and maybe turn on changing 46⟩;
    if changing then goto restart;
    end;
  end;
end;
46. h Read from mf file and maybe turn on changing 46 i ≡
begin incr (line );
if ¬input ln (mf file ) then input has ended ← true
else if limit = change limit then
if buffer [0] = change buffer [0] then
if change limit > 0 then check change ;
end
This code is used in section 45.
47. h Read from style file and maybe turn off styling 47 i ≡
begin incr (line );
if ¬input ln (style file ) then
begin styling ← false ; line ← 0;
end;
end
This code is used in section 45.
48. h Read from change file and maybe turn off changing 48 i ≡
begin incr (line );
if ¬input ln (change file ) then
begin err print (´! Change file ended without @z´); buffer [0] ← "@"; buffer [1] ← "z"; limit ← 2;
end;
if limit > 1 then { check if the change has ended }
if buffer [0] = "@" then
begin if (buffer [1] ≥ "X") ∧ (buffer [1] ≤ "Z") then buffer [1] ← buffer [1] + "z" − "Z";
{ lowercasify }
if (buffer [1] = "x") ∨ (buffer [1] = "y") then
begin loc ← 2; err print (´! Where is the matching @z?´);
end
else if buffer [1] = "z" then
begin prime the change buffer ; change changing ;
end;
end;
end
This code is used in section 45.
49. At the end of the program, we will tell the user if the change file had a line that didn’t match any
relevant line in mf file .
h Check that all changes have been read 49 i ≡
if change limit ≠ 0 then { changing is false }
begin for loc ← 0 to change limit do buffer [loc ] ← change buffer [loc ];
limit ← change limit ; changing ← true ; line ← other line ; loc ← change limit ;
err print (´! Change file entry did not match´);
end
This code is used in section 112.
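The effect of the change-file machinery can be modelled, in much simplified form, by the following Python sketch. It is an illustration under stated assumptions, not MFT's algorithm: the name apply_changes and the representation of a change as a pair (old lines, new lines) are hypothetical, and the sketch demands that the whole @x part match, whereas check change commits as soon as the first line matches and merely reports later discrepancies. The list of entries left over corresponds to the situation in which MFT prints "Change file entry did not match".

```python
def apply_changes(main_lines, changes):
    """Merge main_lines with a list of (old_lines, new_lines) changes.

    Each pair models the text between @x and @y (old) and between
    @y and @z (new).  Changes must occur in the same order as the
    lines they replace.  Returns (merged_lines, unmatched_changes).
    """
    merged = []
    pending = list(changes)
    i = 0
    while i < len(main_lines):
        if pending:
            old, new = pending[0]
            # Simplification: fire only when the entire old block matches.
            if old and main_lines[i:i + len(old)] == old:
                merged.extend(new)
                i += len(old)
                pending.pop(0)
                continue
        merged.append(main_lines[i])
        i += 1
    return merged, pending


merged, unmatched = apply_changes(
    ["beginchar;", "draw p;", "endchar;"],
    [(["draw p;"], ["fill p;", "draw q;"])],
)
print(merged)
print(unmatched)
```

Because the changes are consumed strictly in order, a change whose old lines never appear blocks all later ones, which is also why a single unmatched entry is worth reporting at the end of the run.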
50. Data structures. MFT puts token names into the large byte mem array, which is packed with eight-bit integers. Allocation is sequential, since names are never deleted.
An auxiliary array byte start is used as a directory for byte mem ; the link and ilk arrays give further
information about names. These auxiliary arrays consist of sixteen-bit items.
h Types in the outer block 12 i +≡
eight bits = 0 . . 255; { unsigned one-byte quantity }
sixteen bits = 0 . . 65535; { unsigned two-byte quantity }
51. MFT has been designed to avoid the need for indices that are more than sixteen bits wide, so that it
can be used on most computers.
h Globals in the outer block 9 i +≡
byte mem : packed array [0 . . max bytes ] of ASCII code ; { characters of names }
byte start : array [0 . . max names ] of sixteen bits ; { directory into byte mem }
link : array [0 . . max names ] of sixteen bits ; { hash table links }
ilk : array [0 . . max names ] of sixteen bits ; { type codes }
52. The names of tokens are found by computing a hash address h and then looking at strings of
bytes signified by hash [h], link [hash [h]], link [link [hash [h]]], . . . , until either finding the desired name or
encountering a zero.
A ‘name pointer ’ variable, which signifies a name, is an index into byte start . The actual sequence of
characters in the name pointed to by p appears in positions byte start [p] to byte start [p + 1] − 1, inclusive,
of byte mem .
We usually have byte start [name ptr ] = byte ptr , which is the starting position for the next name to be
stored in byte mem .
define length (#) ≡ byte start [# + 1] − byte start [#] { the length of a name }
h Types in the outer block 12 i +≡
name pointer = 0 . . max names ;
{ identifies a name }
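The directory scheme just described may be easier to see in a small Python sketch. The helper names add_name and name_of are hypothetical, and Python lists stand in for the packed arrays; only the roles of byte mem, byte start, and the length macro are taken from the text.

```python
byte_mem = []        # characters of all names, stored back to back
byte_start = [0, 0]  # name 0 has length zero; name_ptr is len(byte_start) - 1


def add_name(text):
    """Append a name to byte_mem and return its name pointer."""
    p = len(byte_start) - 1           # plays the role of name_ptr
    byte_mem.extend(text)             # advances byte_ptr implicitly
    byte_start.append(len(byte_mem))  # byte_start[name_ptr] = byte_ptr
    return p


def length(p):
    """The length macro: byte_start[p + 1] - byte_start[p]."""
    return byte_start[p + 1] - byte_start[p]


def name_of(p):
    """Characters byte_start[p] .. byte_start[p + 1] - 1 of byte_mem."""
    return "".join(byte_mem[byte_start[p]:byte_start[p + 1]])


a = add_name("addto")
b = add_name("fi")
print(a, length(a), name_of(a))
print(b, length(b), name_of(b))
```

Since names are never deleted, one directory entry per name suffices: the end of name p is simply the start of name p + 1.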
53. h Globals in the outer block 9 i +≡
name ptr : name pointer ; { first unused position in byte start }
byte ptr : 0 . . max bytes ; { first unused position in byte mem }
54. h Set initial values 10 i +≡
byte start [0] ← 0; byte ptr ← 0; byte start [1] ← 0; { this makes name 0 of length zero }
name ptr ← 1;
55. The hash table described above is updated by the lookup procedure, which finds a given name and
returns a pointer to its index in byte start . The token is supposed to match character by character. If it was
not already present, it is inserted into the table.
Because of the way MFT’s scanning mechanism works, it is most convenient to let lookup search for a token
that is present in the buffer array. Two other global variables specify its position in the buffer: the first
character is buffer [id first ], and the last is buffer [id loc − 1].
h Globals in the outer block 9 i +≡
id first : 0 . . buf size ; { where the current token begins in the buffer }
id loc : 0 . . buf size ; { just after the current token in the buffer }
hash : array [0 . . hash size ] of sixteen bits ; { heads of hash lists }
56. Initially all the hash lists are empty.
h Local variables for initialization 14 i +≡
h: 0 . . hash size ; { index into hash-head array }
57. h Set initial values 10 i +≡
for h ← 0 to hash size − 1 do hash [h] ← 0;
58. Here now is the main procedure for finding tokens.
function lookup : name pointer ; { finds current token }
label found ;
var i: 0 . . buf size ; { index into buffer }
h: 0 . . hash size ; { hash code }
k: 0 . . max bytes ; { index into byte mem }
l: 0 . . buf size ; { length of the given token }
p: name pointer ; { where the token is being sought }
begin l ← id loc − id first ; { compute the length }
h Compute the hash code h 59 i;
h Compute the name location p 60 i;
if p = name ptr then h Enter a new name into the table at position p 62 i;
lookup ← p;
end;
59. A simple hash code is used: If the sequence of ASCII codes is $c_1 c_2 \ldots c_n$, its hash value will be
$(2^{n-1} c_1 + 2^{n-2} c_2 + \cdots + c_n) \bmod \mathit{hash\_size}$.
h Compute the hash code h 59 i ≡
h ← buffer [id first ]; i ← id first + 1;
while i < id loc do
begin h ← (h + h + buffer [i]) mod hash size ; incr (i);
end
This code is used in section 58.
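As a cross-check, here is a minimal Python sketch of the same loop. The function name hash_code is hypothetical, and the table size 353 in the example is only an illustrative value, not one taken from MFT. For tokens of two or more characters the loop agrees with the closed form; for a one-character token the result is the character code itself, with no reduction, exactly as in the code above.

```python
def hash_code(token, hash_size):
    """Doubling hash over the character codes of a nonempty token."""
    codes = [ord(ch) for ch in token]
    h = codes[0]                      # h <- buffer[id_first]
    for c in codes[1:]:               # i runs from id_first + 1 to id_loc - 1
        h = (h + h + c) % hash_size
    return h


def closed_form(token, hash_size):
    """(2^(n-1) c_1 + 2^(n-2) c_2 + ... + c_n) mod hash_size."""
    n = len(token)
    total = sum(ord(ch) << (n - 1 - k) for k, ch in enumerate(token))
    return total % hash_size


for word in ("fi", "addto", "intersectiontimes"):
    print(word, hash_code(word, 353), closed_form(word, 353))
```

Reducing modulo hash size at every step keeps h small, so the computation never needs more than a few bits beyond the width of hash size.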
60. If the token is new, it will be placed in position p = name ptr , otherwise p will point to its existing
location.
h Compute the name location p 60 i ≡
p ← hash [h];
while p ≠ 0 do
begin if length (p) = l then h Compare name p with current token, goto found if equal 61 i;
p ← link [p];
end;
p ← name ptr ; { the current token is new }
link [p] ← hash [h]; hash [h] ← p; { insert p at beginning of hash list }
found :
This code is used in section 58.
61. h Compare name p with current token, goto found if equal 61 i ≡
begin i ← id first ; k ← byte start [p];
while (i < id loc ) ∧ (buffer [i] = byte mem [k]) do
begin incr (i); incr (k);
end;
if i = id loc then goto found ; { all characters agree }
end
This code is used in section 60.
62. When we begin the following segment of the program, p = name ptr .
h Enter a new name into the table at position p 62 i ≡
begin if byte ptr + l > max bytes then overflow (´byte memory´);
if name ptr + 1 > max names then overflow (´name´);
i ← id first ; { get ready to move the token into byte mem }
while i < id loc do
begin byte mem [byte ptr ] ← buffer [i]; incr (byte ptr ); incr (i);
end;
incr (name ptr ); byte start [name ptr ] ← byte ptr ; h Assign the default value to ilk [p] 63 i;
end
This code is used in section 58.
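The whole lookup procedure, with its chained hash lists, can be sketched in Python as follows. This is a simplified model and not a transcription: the class NameTable is hypothetical, names are kept as Python strings rather than in byte mem, overflow checks are omitted, and the table size 353 is an assumed value. What it does preserve is the use of 0 as the end-of-list marker and the insertion of a new name at the head of its hash list.

```python
class NameTable:
    """Hash table with chaining; index 0 means 'no entry'."""

    def __init__(self, hash_size=353):      # 353 is an assumed size
        self.hash_size = hash_size
        self.names = [""]                   # name 0 is the empty name
        self.link = [0]                     # next name on the same hash list
        self.ilk = [None]                   # type code of each name
        self.hash = [0] * hash_size         # heads of the hash lists

    def _hash(self, token):
        h = ord(token[0])
        for ch in token[1:]:
            h = (h + h + ord(ch)) % self.hash_size
        return h

    def lookup(self, token, default_ilk="tag"):
        """Return the index of token, entering it if it is new."""
        h = self._hash(token)
        p = self.hash[h]
        while p != 0:                       # walk the list for this hash code
            if self.names[p] == token:
                return p                    # found
            p = self.link[p]
        p = len(self.names)                 # p = name_ptr: the token is new
        self.names.append(token)
        self.link.append(self.hash[h])      # insert at the head of the list
        self.ilk.append(default_ilk)
        self.hash[h] = p
        return p


table = NameTable()
first = table.lookup("addto")
second = table.lookup("fi")
again = table.lookup("addto")
print(first, second, again)
```

Inserting at the head of the list costs nothing extra and means that recently entered names are found first.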
63. Initializing the primitive tokens. Each token read by MFT is recognized as belonging to one of
the following “types”:
define indentation = 0 { internal code for space at beginning of a line }
define end of line = 1 { internal code for hypothetical token at end of a line }
define end of file = 2 { internal code for hypothetical token at end of the input }
define verbatim = 3 { internal code for the token ‘%%’ }
define set format = 4 { internal code for the token ‘%%%’ }
define mft comment = 5 { internal code for the token ‘%%%%’ }
define min action type = 6 { smallest code for tokens that produce “real” output }
define numeric token = 6 { internal code for tokens like ‘3.14159’ }
define string token = 7 { internal code for tokens like ‘"pie"’ }
define min symbolic token = 8 { smallest internal code for a symbolic token }
define op = 8 { internal code for tokens like ‘sqrt’ }
define command = 9 { internal code for tokens like ‘addto’ }
define endit = 10 { internal code for tokens like ‘fi’ }
define binary = 11 { internal code for tokens like ‘and’ }
define abinary = 12 { internal code for tokens like ‘+’ }
define bbinary = 13 { internal code for tokens like ‘step’ }
define ampersand = 14 { internal code for the token ‘&’ }
define pyth sub = 15 { internal code for the token ‘+-+’ }
define as is = 16 { internal code for tokens like ‘]’ }
define bold = 17 { internal code for tokens like ‘nullpen’ }
define type name = 18 { internal code for tokens like ‘numeric’ }
define path join = 19 { internal code for the token ‘..’ }
define colon = 20 { internal code for the token ‘:’ }
define semicolon = 21 { internal code for the token ‘;’ }
define backslash = 22 { internal code for the token ‘\’ }
define double back = 23 { internal code for the token ‘\\’ }
define less or equal = 24 { internal code for the token ‘<=’ }
define greater or equal = 25 { internal code for the token ‘>=’ }
define not equal = 26 { internal code for the token ‘<>’ }
define sharp = 27 { internal code for the token ‘#’ }
define comment = 28 { internal code for the token ‘%’ }
define recomment = 29 { internal code used to resume a comment after ‘| . . . |’ }
define min suffix = 30 { smallest code for symbolic tokens in suffixes }
define internal = 30 { internal code for tokens like ‘pausing’ }
define input command = 31 { internal code for tokens like ‘input’ }
define special tag = 32 { internal code for tags that take at most one subscript }
define tag = 33 { internal code for nonprimitive tokens }
h Assign the default value to ilk [p] 63 i ≡
ilk [p] ← tag
This code is used in section 62.
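The codes are ordered so that whole groups of types can be recognized by a single comparison against one of the "min" constants. The Python sketch below illustrates the idea; the numeric values are those defined in this section, but the three helper functions are hypothetical and the way MFT itself uses these thresholds is not shown here.

```python
# Values copied from the definitions of this section.
VERBATIM = 3
MIN_ACTION_TYPE = 6
NUMERIC_TOKEN = 6
STRING_TOKEN = 7
MIN_SYMBOLIC_TOKEN = 8
OP = 8
MIN_SUFFIX = 30
INTERNAL = 30
TAG = 33


# Hypothetical helpers: each tests a range with one comparison.
def produces_real_output(code):
    return code >= MIN_ACTION_TYPE


def is_symbolic(code):
    return code >= MIN_SYMBOLIC_TOKEN


def allowed_in_suffix(code):
    return code >= MIN_SUFFIX


print(produces_real_output(NUMERIC_TOKEN), produces_real_output(VERBATIM))
print(is_symbolic(OP), is_symbolic(STRING_TOKEN))
print(allowed_in_suffix(TAG), allowed_in_suffix(OP))
```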
64. We have to get METAFONT’s primitives into the hash table, and the simplest way to do this is to insert
them every time MFT is run.
A few macros permit us to do the initialization with a compact program. We use the fact that the longest
primitive is intersectiontimes, which is 17 letters long.
define spr17 (#) ≡ buffer [17] ← #; cur tok ← lookup ; ilk [cur tok ] ←
define spr16 (#) ≡ buffer [16] ← #; spr17
define spr15 (#) ≡ buffer [15] ← #; spr16
define spr14 (#) ≡ buffer [14] ← #; spr15
define spr13 (#) ≡ buffer [13] ← #; spr14
define spr12 (#) ≡ buffer [12] ← #; spr13
define spr11 (#) ≡ buffer [11] ← #; spr12
define spr10 (#) ≡ buffer [10] ← #; spr11
define spr9 (#) ≡ buffer [9] ← #; spr10
define spr8 (#) ≡ buffer [8] ← #; spr9
define spr7 (#) ≡ buffer [7] ← #; spr8
define spr6 (#) ≡ buffer [6] ← #; spr7
define spr5 (#) ≡ buffer [5] ← #; spr6
define spr4 (#) ≡ buffer [4] ← #; spr5
define spr3 (#) ≡ buffer [3] ← #; spr4
define spr2 (#) ≡ buffer [2] ← #; spr3
define spr1 (#) ≡ buffer [1] ← #; spr2
define pr1 ≡ id first ← 17; spr17
define pr2 ≡ id first ← 16; spr16
define pr3 ≡ id first ← 15; spr15
define pr4 ≡ id first ← 14; spr14
define pr5 ≡ id first ← 13; spr13
define pr6 ≡ id first ← 12; spr12
define pr7 ≡ id first ← 11; spr11
define pr8 ≡ id first ← 10; spr10
define pr9 ≡ id first ← 9; spr9
define pr10 ≡ id first ← 8; spr8
define pr11 ≡ id first ← 7; spr7
define pr12 ≡ id first ← 6; spr6
define pr13 ≡ id first ← 5; spr5
define pr14 ≡ id first ← 4; spr4
define pr15 ≡ id first ← 3; spr3
define pr16 ≡ id first ← 2; spr2
define pr17 ≡ id first ← 1; spr1
65. The intended use of the macros above might not be immediately obvious, but the riddle is answered
by the following:
h Store all the primitives 65 i ≡
id loc ← 18;
pr2 (".")(".")(path join );
pr1 ("[")(as is );
pr1 ("]")(as is );
pr1 ("}")(as is );
pr1 ("{")(as is );
pr1 (":")(colon );
pr2 (":")(":")(colon );
pr3 ("|")("|")(":")(colon );
pr2 (":")("=")(as is );
pr1 (",")(as is );
pr1 (";")(semicolon );
pr1 ("\")(backslash );
pr2 ("\")("\")(double back );
pr5 ("a")("d")("d")("t")("o")(command );
pr2 ("a")("t")(bbinary );
pr7 ("a")("t")("l")("e")("a")("s")("t")(op );
pr10 ("b")("e")("g")("i")("n")("g")("r")("o")("u")("p")(command );
pr8 ("c")("o")("n")("t")("r")("o")("l")("s")(op );
pr4 ("c")("u")("l")("l")(command );
pr4 ("c")("u")("r")("l")(op );
pr10 ("d")("e")("l")("i")("m")("i")("t")("e")("r")("s")(command );
pr7 ("d")("i")("s")("p")("l")("a")("y")(command );
pr8 ("e")("n")("d")("g")("r")("o")("u")("p")(endit );
pr8 ("e")("v")("e")("r")("y")("j")("o")("b")(command );
pr6 ("e")("x")("i")("t")("i")("f")(command );
pr11 ("e")("x")("p")("a")("n")("d")("a")("f")("t")("e")("r")(command );
pr4 ("f")("r")("o")("m")(bbinary );
pr8 ("i")("n")("w")("i")("n")("d")("o")("w")(bbinary );
pr7 ("i")("n")("t")("e")("r")("i")("m")(command );
pr3 ("l")("e")("t")(command );
pr11 ("n")("e")("w")("i")("n")("t")("e")("r")("n")("a")("l")(command );
pr2 ("o")("f")(command );
pr10 ("o")("p")("e")("n")("w")("i")("n")("d")("o")("w")(command );
pr10 ("r")("a")("n")("d")("o")("m")("s")("e")("e")("d")(command );
pr4 ("s")("a")("v")("e")(command );
pr10 ("s")("c")("a")("n")("t")("o")("k")("e")("n")("s")(command );
pr7 ("s")("h")("i")("p")("o")("u")("t")(command );
pr4 ("s")("t")("e")("p")(bbinary );
pr3 ("s")("t")("r")(command );
pr7 ("t")("e")("n")("s")("i")("o")("n")(op );
pr2 ("t")("o")(bbinary );
pr5 ("u")("n")("t")("i")("l")(bbinary );
pr3 ("d")("e")("f")(command );
pr6 ("v")("a")("r")("d")("e")("f")(command );
See also sections 66, 67, 68, 69, 70, and 71.
This code is used in section 112.
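In other words, each pr macro right-justifies its characters in the buffer so that the token always ends at position 17, just before id loc = 18; only id first varies with the length. A minimal Python sketch of that placement follows. The function name place_primitive is hypothetical, and the sketch stops short of the call to lookup and the assignment to ilk that complete each macro.

```python
BUF_LAST = 17          # the last buffer position the pr macros fill
ID_LOC = BUF_LAST + 1  # id_loc is set to 18 before any primitive is stored


def place_primitive(name):
    """Return (buffer, id_first) as a pr macro of matching length leaves them."""
    if not 1 <= len(name) <= BUF_LAST:
        raise ValueError("a primitive must be 1 to 17 characters long")
    buffer = [" "] * (BUF_LAST + 1)
    id_first = ID_LOC - len(name)        # pr5 sets id_first to 13, and so on
    for offset, ch in enumerate(name):
        buffer[id_first + offset] = ch   # spr13 ... spr17 for a 5-letter name
    return buffer, id_first


buf, first = place_primitive("addto")
print(first, "".join(buf[first:ID_LOC]))
```

Fixing the right end rather than the left lets every macro finish with the same tail, spr17, which is where the lookup and the ilk assignment live.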
66. (There are so many primitives, it’s necessary to break this long initialization code up into pieces so as
not to overflow WEAVE’s capacity.)
h Store all the primitives 65 i +≡
pr10 ("p")("r")("i")("m")("a")("r")("y")("d")("e")("f")(command );
pr12 ("s")("e")("c")("o")("n")("d")("a")("r")("y")("d")("e")("f")(command );
pr11 ("t")("e")("r")("t")("i")("a")("r")("y")("d")("e")("f")(command );
pr6 ("e")("n")("d")("d")("e")("f")(endit );
pr3 ("f")("o")("r")(command );
pr11 ("f")("o")("r")("s")("u")("f")("f")("i")("x")("e")("s")(command );
pr7 ("f")("o")("r")("e")("v")("e")("r")(command );
pr6 ("e")("n")("d")("f")("o")("r")(endit );
pr5 ("q")("u")("o")("t")("e")(command );
pr4 ("e")("x")("p")("r")(command );
pr6 ("s")("u")("f")("f")("i")("x")(command );
pr4 ("t")("e")("x")("t")(command );
pr7 ("p")("r")("i")("m")("a")("r")("y")(command );
pr9 ("s")("e")("c")("o")("n")("d")("a")("r")("y")(command );
pr8 ("t")("e")("r")("t")("i")("a")("r")("y")(command );
pr5 ("i")("n")("p")("u")("t")(input command );
pr8 ("e")("n")("d")("i")("n")("p")("u")("t")(bold );
pr2 ("i")("f")(command );
pr2 ("f")("i")(endit );
pr4 ("e")("l")("s")("e")(command );
pr6 ("e")("l")("s")("e")("i")("f")(command );
pr4 ("t")("r")("u")("e")(bold );
pr5 ("f")("a")("l")("s")("e")(bold );
pr11 ("n")("u")("l")("l")("p")("i")("c")("t")("u")("r")("e")(bold );
pr7 ("n")("u")("l")("l")("p")("e")("n")(bold );
pr7 ("j")("o")("b")("n")("a")("m")("e")(bold );
pr10 ("r")("e")("a")("d")("s")("t")("r")("i")("n")("g")(bold );
pr9 ("p")("e")("n")("c")("i")("r")("c")("l")("e")(bold );
pr4 ("g")("o")("o")("d")(special tag );
pr2 ("=")(":")(as is );
pr3 ("=")(":")("|")(as is );
pr4 ("=")(":")("|")(">")(as is );
pr3 ("|")("=")(":")(as is );
pr4 ("|")("=")(":")(">")(as is );
pr4 ("|")("=")(":")("|")(as is );
pr5 ("|")("=")(":")("|")(">")(as is );
pr6 ("|")("=")(":")("|")(">")(">")(as is );
pr4 ("k")("e")("r")("n")(binary ); pr6 ("s")("k")("i")("p")("t")("o")(command );
67. (Does anybody out there remember the commercials that went LS−MFT?)
h Store all the primitives 65 i +≡
pr13 ("n")("o")("r")("m")("a")("l")("d")("e")("v")("i")("a")("t")("e")(op );
pr3 ("o")("d")("d")(op );
pr5 ("k")("n")("o")("w")("n")(op );
pr7 ("u")("n")("k")("n")("o")("w")("n")(op );
pr3 ("n")("o")("t")(op );
pr7 ("d")("e")("c")("i")("m")("a")("l")(op );
pr7 ("r")("e")("v")("e")("r")("s")("e")(op );
pr8 ("m")("a")("k")("e")("p")("a")("t")("h")(op );
pr7 ("m")("a")("k")("e")("p")("e")("n")(op );
pr11 ("t")("o")("t")("a")("l")("w")("e")("i")("g")("h")("t")(op );
pr3 ("o")("c")("t")(op );
pr3 ("h")("e")("x")(op );
pr5 ("A")("S")("C")("I")("I")(op );
pr4 ("c")("h")("a")("r")(op );
pr6 ("l")("e")("n")("g")("t")("h")(op );
pr13 ("t")("u")("r")("n")("i")("n")("g")("n")("u")("m")("b")("e")("r")(op );
pr5 ("x")("p")("a")("r")("t")(op );
pr5 ("y")("p")("a")("r")("t")(op );
pr6 ("x")("x")("p")("a")("r")("t")(op );
pr6 ("x")("y")("p")("a")("r")("t")(op );
pr6 ("y")("x")("p")("a")("r")("t")(op );
pr6 ("y")("y")("p")("a")("r")("t")(op );
pr4 ("s")("q")("r")("t")(op );
pr4 ("m")("e")("x")("p")(op );
pr4 ("m")("l")("o")("g")(op );
pr4 ("s")("i")("n")("d")(op );
pr4 ("c")("o")("s")("d")(op );
pr5 ("f")("l")("o")("o")("r")(op );
pr14 ("u")("n")("i")("f")("o")("r")("m")("d")("e")("v")("i")("a")("t")("e")(op );
pr10 ("c")("h")("a")("r")("e")("x")("i")("s")("t")("s")(op );
pr5 ("a")("n")("g")("l")("e")(op );
pr5 ("c")("y")("c")("l")("e")(op );
68. (If you think this WEB code is ugly, you should see the Pascal code it produces.)
h Store all the primitives 65 i +≡
pr13 ("t")("r")("a")("c")("i")("n")("g")("t")("i")("t")("l")("e")("s")(internal );
pr16 ("t")("r")("a")("c")("i")("n")("g")("e")("q")("u")("a")("t")("i")("o")("n")("s")(internal );
pr15 ("t")("r")("a")("c")("i")("n")("g")("c")("a")("p")("s")("u")("l")("e")("s")(internal );
pr14 ("t")("r")("a")("c")("i")("n")("g")("c")("h")("o")("i")("c")("e")("s")(internal );
pr12 ("t")("r")("a")("c")("i")("n")("g")("s")("p")("e")("c")("s")(internal );
pr11 ("t")("r")("a")("c")("i")("n")("g")("p")("e")("n")("s")(internal );
pr15 ("t")("r")("a")("c")("i")("n")("g")("c")("o")("m")("m")("a")("n")("d")("s")(internal );
pr13 ("t")("r")("a")("c")("i")("n")("g")("m")("a")("c")("r")("o")("s")(internal );
pr12 ("t")("r")("a")("c")("i")("n")("g")("e")("d")("g")("e")("s")(internal );
pr13 ("t")("r")("a")("c")("i")("n")("g")("o")("u")("t")("p")("u")("t")(internal );
pr12 ("t")("r")("a")("c")("i")("n")("g")("s")("t")("a")("t")("s")(internal );
pr13 ("t")("r")("a")("c")("i")("n")("g")("o")("n")("l")("i")("n")("e")(internal );
69. h Store all the primitives 65 i +≡
pr4 ("y")("e")("a")("r")(internal );
pr5 ("m")("o")("n")("t")("h")(internal );
pr3 ("d")("a")("y")(internal );
pr4 ("t")("i")("m")("e")(internal );
pr8 ("c")("h")("a")("r")("c")("o")("d")("e")(internal );
pr7 ("c")("h")("a")("r")("f")("a")("m")(internal );
pr6 ("c")("h")("a")("r")("w")("d")(internal );
pr6 ("c")("h")("a")("r")("h")("t")(internal );
pr6 ("c")("h")("a")("r")("d")("p")(internal );
pr6 ("c")("h")("a")("r")("i")("c")(internal );
pr6 ("c")("h")("a")("r")("d")("x")(internal );
pr6 ("c")("h")("a")("r")("d")("y")(internal );
pr10 ("d")("e")("s")("i")("g")("n")("s")("i")("z")("e")(internal );
pr4 ("h")("p")("p")("p")(internal );
pr4 ("v")("p")("p")("p")(internal );
pr7 ("x")("o")("f")("f")("s")("e")("t")(internal );
pr7 ("y")("o")("f")("f")("s")("e")("t")(internal );
pr7 ("p")("a")("u")("s")("i")("n")("g")(internal );
pr12 ("s")("h")("o")("w")("s")("t")("o")("p")("p")("i")("n")("g")(internal );
pr10 ("f")("o")("n")("t")("m")("a")("k")("i")("n")("g")(internal );
pr8 ("p")("r")("o")("o")("f")("i")("n")("g")(internal );
pr9 ("s")("m")("o")("o")("t")("h")("i")("n")("g")(internal );
pr12 ("a")("u")("t")("o")("r")("o")("u")("n")("d")("i")("n")("g")(internal );
pr11 ("g")("r")("a")("n")("u")("l")("a")("r")("i")("t")("y")(internal );
pr6 ("f")("i")("l")("l")("i")("n")(internal );
pr12 ("t")("u")("r")("n")("i")("n")("g")("c")("h")("e")("c")("k")(internal );
pr12 ("w")("a")("r")("n")("i")("n")("g")("c")("h")("e")("c")("k")(internal );
pr12 ("b")("o")("u")("n")("d")("a")("r")("y")("c")("h")("a")("r")(internal );
70. Still more.
h Store all the primitives 65 i +≡
pr1 ("+")(abinary );
pr1 ("-")(abinary );
pr1 ("*")(abinary );
pr1 ("/")(as is );
pr2 ("+")("+")(binary );
pr3 ("+")("-")("+")(pyth sub );
pr3 ("a")("n")("d")(binary );
pr2 ("o")("r")(binary );
pr1 ("<")(as is );
pr2 ("<")("=")(less or equal );
pr1 (">")(as is );
pr2 (">")("=")(greater or equal );
pr1 ("=")(as is );
pr2 ("<")(">")(not equal );
pr9 ("s")("u")("b")("s")("t")("r")("i")("n")("g")(command );
pr7 ("s")("u")("b")("p")("a")("t")("h")(command );
pr13 ("d")("i")("r")("e")("c")("t")("i")("o")("n")("t")("i")("m")("e")(command );
pr5 ("p")("o")("i")("n")("t")(command );
pr10 ("p")("r")("e")("c")("o")("n")("t")("r")("o")("l")(command );
pr11 ("p")("o")("s")("t")("c")("o")("n")("t")("r")("o")("l")(command );
pr9 ("p")("e")("n")("o")("f")("f")("s")("e")("t")(command );
pr1 ("&")(ampersand );
pr7 ("r")("o")("t")("a")("t")("e")("d")(binary );
pr7 ("s")("l")("a")("n")("t")("e")("d")(binary );
pr6 ("s")("c")("a")("l")("e")("d")(binary );
pr7 ("s")("h")("i")("f")("t")("e")("d")(binary );
pr11 ("t")("r")("a")("n")("s")("f")("o")("r")("m")("e")("d")(binary );
pr7 ("x")("s")("c")("a")("l")("e")("d")(binary );
pr7 ("y")("s")("c")("a")("l")("e")("d")(binary );
pr7 ("z")("s")("c")("a")("l")("e")("d")(binary );
pr17 ("i")("n")("t")("e")("r")("s")("e")("c")("t")("i")("o")("n")("t")("i")("m")("e")("s")(binary );
pr7 ("n")("u")("m")("e")("r")("i")("c")(type name );
pr6 ("s")("t")("r")("i")("n")("g")(type name );
pr7 ("b")("o")("o")("l")("e")("a")("n")(type name );
pr4 ("p")("a")("t")("h")(type name );
pr3 ("p")("e")("n")(type name );
pr7 ("p")("i")("c")("t")("u")("r")("e")(type name );
pr9 ("t")("r")("a")("n")("s")("f")("o")("r")("m")(type name );
pr4 ("p")("a")("i")("r")(type name );
71. At last we are done with the tedious initialization of primitives.
h Store all the primitives 65 i +≡
pr3 ("e")("n")("d")(endit );
pr4 ("d")("u")("m")("p")(endit );
pr9 ("b")("a")("t")("c")("h")("m")("o")("d")("e")(bold );
pr11 ("n")("o")("n")("s")("t")("o")("p")("m")("o")("d")("e")(bold );
pr10 ("s")("c")("r")("o")("l")("l")("m")("o")("d")("e")(bold );
pr13 ("e")("r")("r")("o")("r")("s")("t")("o")("p")("m")("o")("d")("e")(bold );
pr5 ("i")("n")("n")("e")("r")(command );
pr5 ("o")("u")("t")("e")("r")(command );
pr9 ("s")("h")("o")("w")("t")("o")("k")("e")("n")(command );
pr9 ("s")("h")("o")("w")("s")("t")("a")("t")("s")(bold );
pr4 ("s")("h")("o")("w")(command );
pr12 ("s")("h")("o")("w")("v")("a")("r")("i")("a")("b")("l")("e")(command );
pr16 ("s")("h")("o")("w")("d")("e")("p")("e")("n")("d")("e")("n")("c")("i")("e")("s")(bold );
pr7 ("c")("o")("n")("t")("o")("u")("r")(command );
pr10 ("d")("o")("u")("b")("l")("e")("p")("a")("t")("h")(command );
pr4 ("a")("l")("s")("o")(command );
pr7 ("w")("i")("t")("h")("p")("e")("n")(command );
pr10 ("w")("i")("t")("h")("w")("e")("i")("g")("h")("t")(command );
pr8 ("d")("r")("o")("p")("p")("i")("n")("g")(command );
pr7 ("k")("e")("e")("p")("i")("n")("g")(command );
pr7 ("m")("e")("s")("s")("a")("g")("e")(command );
pr10 ("e")("r")("r")("m")("e")("s")("s")("a")("g")("e")(command );
pr7 ("e")("r")("r")("h")("e")("l")("p")(command );
pr8 ("c")("h")("a")("r")("l")("i")("s")("t")(command );
pr8 ("l")("i")("g")("t")("a")("b")("l")("e")(command );
pr10 ("e")("x")("t")("e")("n")("s")("i")("b")("l")("e")(command );
pr10 ("h")("e")("a")("d")("e")("r")("b")("y")("t")("e")(command );
pr9 ("f")("o")("n")("t")("d")("i")("m")("e")("n")(command );
pr7 ("s")("p")("e")("c")("i")("a")("l")(command );
pr10 ("n")("u")("m")("s")("p")("e")("c")("i")("a")("l")(command );
pr1 ("%")(comment );
pr2 ("%")("%")(verbatim );
pr3 ("%")("%")("%")(set format );
pr4 ("%")("%")("%")("%")(mft comment );
pr1 ("#")(sharp );
72. We also want to store a few other strings of characters that are used in MFT’s translation to TeX code.
define ttr1 (#) ≡ byte mem [byte ptr − 1] ← #; cur tok ← name ptr ; incr (name ptr );
byte start [name ptr ] ← byte ptr
define ttr2 (#) ≡ byte mem [byte ptr − 2] ← #; ttr1
define ttr3 (#) ≡ byte mem [byte ptr − 3] ← #; ttr2
define ttr4 (#) ≡ byte mem [byte ptr − 4] ← #; ttr3
define ttr5 (#) ≡ byte mem [byte ptr − 5] ← #; ttr4
define tr1 ≡ incr (byte ptr ); ttr1
define tr2 ≡ byte ptr ← byte ptr + 2; ttr2
define tr3 ≡ byte ptr ← byte ptr + 3; ttr3
define tr4 ≡ byte ptr ← byte ptr + 4; ttr4
define tr5 ≡ byte ptr ← byte ptr + 5; ttr5
h Globals in the outer block 9 i +≡
translation : array [ASCII code ] of name pointer ;
i: ASCII code ; { index into translation }
73. h Store all the translations 73 i ≡
for i ← 0 to 255 do translation [i] ← 0;
tr2 ("\")("$"); translation ["$"] ← cur tok ;
tr2 ("\")("#"); translation ["#"] ← cur tok ;
tr2 ("\")("&"); translation ["&"] ← cur tok ;
tr2 ("\")("{"); translation ["{"] ← cur tok ;
tr2 ("\")("}"); translation ["}"] ← cur tok ;
tr2 ("\")("_"); translation ["_"] ← cur tok ;
tr2 ("\")("%"); translation ["%"] ← cur tok ;
tr4 ("\")("B")("S")(" "); translation ["\"] ← cur tok ;
tr4 ("\")("H")("A")(" "); translation ["^"] ← cur tok ;
tr4 ("\")("T")("I")(" "); translation ["~"] ← cur tok ;
tr5 ("\")("a")("s")("t")(" "); translation ["*"] ← cur tok ;
tr4 ("\")("A")("M")(" "); tr amp ← cur tok ;
tr4 ("\")("B")("L")(" "); tr skip ← cur tok ;
tr4 ("\")("S")("H")(" "); tr sharp ← cur tok ;
tr4 ("\")("P")("S")(" "); tr ps ← cur tok ;
tr4 ("\")("l")("e")(" "); tr le ← cur tok ;
tr4 ("\")("g")("e")(" "); tr ge ← cur tok ;
tr4 ("\")("n")("e")(" "); tr ne ← cur tok ;
tr5 ("\")("q")("u")("a")("d"); tr quad ← cur tok ;
This code is used in section 112.
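The first part of this table maps single characters that are special to TeX onto control sequences. A minimal Python sketch of such a mapping is given below. The strings are those stored by the tr macros above; the dictionary TRANSLATION and the function translate are hypothetical, and the way MFT actually consults the table during output belongs to later sections and is only assumed here (a character with no entry is passed through unchanged).

```python
# Character -> TeX replacement, as stored in section 73.
TRANSLATION = {
    "$": "\\$", "#": "\\#", "&": "\\&", "{": "\\{", "}": "\\}",
    "_": "\\_", "%": "\\%",
    "\\": "\\BS ", "^": "\\HA ", "~": "\\TI ", "*": "\\ast ",
}


def translate(text):
    """Replace each special character; leave all others alone (assumed use)."""
    return "".join(TRANSLATION.get(ch, ch) for ch in text)


print(translate("a_b"))
print(translate("x^2"))
```

The trailing space in entries such as `\BS ` ends the control word, so that a following letter is not absorbed into its name.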
74. h Globals in the outer block 9 i +≡
tr le , tr ge , tr ne , tr amp , tr sharp , tr skip , tr ps , tr quad : name pointer ; { special translations }
75. Inputting the next token. MFT’s lexical scanning routine is called get next . This procedure inputs
the next token of METAFONT input and puts its encoded meaning into two global variables, cur type and
cur tok .
h Globals in the outer block 9 i +≡
cur type : eight bits ; { type of token just scanned }
cur tok : integer ; { hash table or buffer location }
prev type : eight bits ; { previous value of cur type }
prev tok : integer ; { previous value of cur tok }
76. h Set initial values 10 i +≡
cur type ← end of line ; cur tok ← 0;
77. Two global state variables affect the behavior of get next : A space will be considered significant when
start of line is true , and the buffer will be considered devoid of information when empty buffer is true .
h Globals in the outer block 9 i +≡
start of line : boolean ; { has the current line had nothing but spaces so far? }
empty buffer : boolean ; { is it time to input a new line? }
78. The 256 ASCII code characters are grouped into classes by means of the char class table. Individual
class numbers have no semantic or syntactic significance, except in a few instances defined here. There’s
also max class , which can be used as a basis for additional class numbers in nonstandard extensions of
METAFONT.
define digit class = 0 { the class number of 0123456789 }
define period class = 1 { the class number of ‘.’ }
define space class = 2 { the class number of spaces and nonstandard characters }
define percent class = 3 { the class number of ‘%’ }
define string class = 4 { the class number of ‘"’ }
define right paren class = 8 { the class number of ‘)’ }
define isolated classes ≡ 5, 6, 7, 8 { characters that make length-one tokens only }
define letter class = 9 { letters and the underline character }
define left bracket class = 17 { ‘[’ }
define right bracket class = 18 { ‘]’ }
define invalid class = 20 { bad character in the input }
define end line class = 21 { end of an input line (MFT only) }
define max class = 21 { the largest class number }
h Globals in the outer block 9 i +≡
char class : array [ASCII code ] of 0 . . max class ; { the class numbers }
79. If changes are made to accommodate non-ASCII character sets, they should be essentially the same in
MFT as in METAFONT. However, MFT has an additional class number, the end line class , which is used only
for the special character carriage return that is placed at the end of the input buffer.
define carriage return = ´15
{ special code placed in buffer [limit ] }
h Set initial values 10 i +≡
for i ← "0" to "9" do char class [i] ← digit class ;
char class ["."] ← period class ; char class [" "] ← space class ; char class ["%"] ← percent class ;
char class [""""] ← string class ;
char class [","] ← 5; char class [";"] ← 6; char class ["("] ← 7; char class [")"] ← right paren class ;
for i ← "A" to "Z" do char class [i] ← letter class ;
for i ← "a" to "z" do char class [i] ← letter class ;
char class ["_"] ← letter class ;
char class ["<"] ← 10; char class ["="] ← 10; char class [">"] ← 10; char class [":"] ← 10;
char class ["|"] ← 10;
char class ["`"] ← 11; char class ["'"] ← 11;
char class ["+"] ← 12; char class ["-"] ← 12;
char class ["/"] ← 13; char class ["*"] ← 13; char class ["\"] ← 13;
char class ["!"] ← 14; char class ["?"] ← 14;
char class ["#"] ← 15; char class ["&"] ← 15; char class ["@"] ← 15; char class ["$"] ← 15;
char class ["^"] ← 16; char class ["~"] ← 16;
char class ["["] ← left bracket class ; char class ["]"] ← right bracket class ;
char class ["{"] ← 19; char class ["}"] ← 19;
for i ← 0 to " " − 1 do char class [i] ← invalid class ;
char class [carriage return ] ← end line class ;
for i ← 127 to 255 do char class [i] ← invalid class ;
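The same table can be built in a few lines of Python, which also makes it easy to confirm that every printable ASCII character receives a class other than invalid class. The function name build_char_class is hypothetical; the class numbers and character groups are those of the assignments above.

```python
import string


def build_char_class():
    """Return a 256-entry list of class numbers, following section 79."""
    invalid_class, end_line_class = 20, 21
    table = [invalid_class] * 256        # codes 0..31 and 127..255 stay invalid
    groups = {
        0: string.digits,                # digit_class
        1: ".",                          # period_class
        2: " ",                          # space_class
        3: "%",                          # percent_class
        4: '"',                          # string_class
        5: ",", 6: ";", 7: "(", 8: ")",  # the isolated classes
        9: string.ascii_letters + "_",   # letter_class
        10: "<=>:|", 11: "`'", 12: "+-", 13: "/*\\",
        14: "!?", 15: "#&@$", 16: "^~",
        17: "[", 18: "]", 19: "{}",
    }
    for cls, chars in groups.items():
        for ch in chars:
            table[ord(ch)] = cls
    table[0o15] = end_line_class         # carriage_return
    return table


char_class = build_char_class()
print(char_class[ord("7")], char_class[ord("a")], char_class[ord("|")])
```

Characters that share a class number greater than letter class may be run together into one symbolic token, which is why, for example, all of `<=>:|` carry the same number.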
80. And now we’re ready to take the plunge into get next itself.
define switch = 25 { a label in get next }
define pass digits = 85 { another }
define pass fraction = 86 { and still another, although goto is considered harmful }
procedure get next ; { sets cur type and cur tok to next token }
label switch , pass digits , pass fraction , done , found , exit ;
var c: ASCII code ; { the current character in the buffer }
class : ASCII code ; { its class number }
begin prev type ← cur type ; prev tok ← cur tok ;
if empty buffer then h Bring in a new line of input; return if the file has ended 85 i;
switch : c ← buffer [loc ]; id first ← loc ; incr (loc ); class ← char class [c]; h Branch on the class , scan the
token; return directly if the token is special, or goto found if it needs to be looked up 81 i;
found : id loc ← loc ; cur tok ← lookup ; cur type ← ilk [cur tok ];
exit : end;
81. define emit (#) ≡ begin cur type ← #; cur tok ← id first ; return; end
h Branch on the class , scan the token; return directly if the token is special, or goto found if it needs to
be looked up 81 i ≡
case class of
digit class : goto pass digits ;
period class : begin class ← char class [buffer [loc ]];
if class > period class then goto switch { ignore isolated ‘.’ }
else if class < period class then goto pass fraction ; { class = digit class }
end;
space class : if start of line then emit (indentation )
else goto switch ;
end line class : emit (end of line );
string class : h Get a string token and return 82 i;
isolated classes : goto found ;
invalid class : h Decry the invalid character and goto switch 84 i;
othercases do nothing { letters, etc. }
endcases;
while char class [buffer [loc ]] = class do incr (loc );
goto found ;
pass digits : while char class [buffer [loc ]] = digit class do incr (loc );
if buffer [loc ] ≠ "." then goto done ;
if char class [buffer [loc + 1]] ≠ digit class then goto done ;
incr (loc );
pass fraction : repeat incr (loc );
until char class [buffer [loc ]] ≠ digit class ;
done : emit (numeric token )
This code is used in section 80.
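The numeric-token part of this code is the least obvious, because a period belongs to the number only when a digit follows it. The Python sketch below isolates that rule. The function scan_number is hypothetical, and where get next silently skips an isolated period the sketch raises an error instead; the sentinel appended to the line plays the part of the carriage return stored in buffer [limit ].

```python
DIGITS = "0123456789"


def scan_number(line, start):
    """Return the index just past the numeric token beginning at line[start]."""
    buf = line + "\r"                  # sentinel: never a digit or a period
    loc = start
    if buf[loc] == ".":                # period_class followed by a digit
        if buf[loc + 1] not in DIGITS:
            raise ValueError("isolated period, not a number")
        loc += 1
    else:
        if buf[loc] not in DIGITS:
            raise ValueError("no number starts here")
        while buf[loc] in DIGITS:      # pass_digits
            loc += 1
        if buf[loc] != "." or buf[loc + 1] not in DIGITS:
            return loc                 # done: no fraction part
        loc += 1                       # step over the period
    while buf[loc] in DIGITS:          # pass_fraction
        loc += 1
    return loc


for text in ("3.14159;", "12..5", ".5x", "7."):
    end = scan_number(text, 0)
    print(repr(text[:end]))
```

The second test in the middle of the function is what makes `12..5` scan as the number 12 followed by a path join, rather than as a malformed fraction.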
82. h Get a string token and return 82 i ≡
loop begin if buffer [loc ] = """" then
begin incr (loc ); emit (string token );
end;
if loc = limit then h Decry the missing string delimiter and goto switch 83 i;
incr (loc );
end
This code is used in section 81.
83. h Decry the missing string delimiter and goto switch 83 i ≡
begin err print (´! Incomplete string will be ignored´); goto switch ;
end
This code is used in section 82.
84. ⟨ Decry the invalid character and goto switch 84 ⟩ ≡
begin err_print('! Invalid character will be ignored'); goto switch;
end
This code is used in section 81.
85. ⟨ Bring in a new line of input; return if the file has ended 85 ⟩ ≡
begin get_line;
if input_has_ended then emit(end_of_file);
buffer[limit] ← carriage_return; loc ← 0; start_of_line ← true; empty_buffer ← false;
end
This code is used in section 80.
86. Low-level output routines. The TeX output is supposed to appear in lines at most line_length
characters long, so we place it into an output buffer. During the output process, out_line will hold the
current line number of the line about to be output.
⟨ Globals in the outer block 9 ⟩ +≡
out_buf: array [0 .. line_length] of ASCII_code; { assembled characters }
out_ptr: 0 .. line_length; { number of characters in out_buf }
out_line: integer; { coordinates of next line to be output }
87. The flush_buffer routine empties the buffer up to a given breakpoint, and moves any remaining
characters to the beginning of the next line. If the per_cent parameter is true, a "%" is appended to
the line that is being output; in this case the breakpoint b should be strictly less than line_length. If the
per_cent parameter is false, trailing blanks are suppressed. The characters emptied from the buffer form a
new line of output.
procedure flush_buffer(b: eight_bits; per_cent: boolean); { outputs out_buf[1 .. b], where b ≤ out_ptr }
label done;
var j, k: 0 .. line_length;
begin j ← b;
if ¬per_cent then { remove trailing blanks }
loop begin if j = 0 then goto done;
if out_buf[j] ≠ " " then goto done;
decr(j);
end;
done: for k ← 1 to j do write(tex_file, xchr[out_buf[k]]);
if per_cent then write(tex_file, xchr["%"]);
write_ln(tex_file); incr(out_line);
if b < out_ptr then
for k ← b + 1 to out_ptr do out_buf[k − b] ← out_buf[k];
out_ptr ← out_ptr − b;
end;
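The effect of flush_buffer can be seen in a short Python sketch. The class OutputBuffer and its fields pending and lines are hypothetical stand-ins for out_buf[1 .. out_ptr] and tex_file; the sketch collects finished lines in a list instead of writing a file and does not track out_line.

```python
class OutputBuffer:
    """Hypothetical stand-in for out_buf, out_ptr, and tex_file."""

    def __init__(self):
        self.pending = []   # plays the role of out_buf[1 .. out_ptr]
        self.lines = []     # finished lines, instead of writes to tex_file

    def flush_buffer(self, b, per_cent):
        """Emit the first b pending characters as one output line."""
        head = "".join(self.pending[:b])
        if per_cent:
            head += "%"               # the line continues after the break
        else:
            head = head.rstrip(" ")   # suppress trailing blanks
        self.lines.append(head)
        # Move the remaining characters to the front of the buffer.
        self.pending = self.pending[b:]
```

If the buffer holds `abc  def` and we flush at b = 5 without a percent sign, the line `abc` is emitted and `def` stays behind for the next line.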
88. MFT calls flush_buffer(out_ptr, false) before it has input anything. We initialize the output variables
so that the first line of the output file will be ‘\input mftmac’.
⟨ Set initial values 10 ⟩ +≡
out_ptr ← 1; out_buf[1] ← " "; out_line ← 1; write(tex_file, '\input mftmac');
89. When we wish to append the character c to the output buffer, we write ‘out(c)’; this will cause the
buffer to be emptied if it was already full. Similarly, ‘out2(c1)(c2)’ appends a pair of characters. A line
break will occur at a space or after a single-nonletter TeX control sequence.
define oot(#) ≡
if out_ptr = line_length then break_out;
incr(out_ptr); out_buf[out_ptr] ← #;
define oot1(#) ≡ oot(#) end
define oot2(#) ≡ oot(#) oot1
define oot3(#) ≡ oot(#) oot2
define oot4(#) ≡ oot(#) oot3
define oot5(#) ≡ oot(#) oot4
define out ≡ begin oot1
define out2 ≡ begin oot2
define out3 ≡ begin oot3
define out4 ≡ begin oot4
define out5 ≡ begin oot5
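It may help to see how the chained definitions unfold. Each outN opens a begin, every argument contributes one copy of the oot body, and the final oot1 supplies the closing end. The Python sketch below merely prints that expansion as text; expand_out and OOT are hypothetical names, and the sketch models the result of the macro expansion, not TANGLE's actual mechanism.

```python
# One copy of the oot(#) body, with {arg} standing for the macro parameter.
OOT = ("if out_ptr = line_length then break_out; "
       "incr(out_ptr); out_buf[out_ptr] ← {arg}; ")

def expand_out(*args):
    """Return the Pascal text that out<n>(a1)...(an) expands to."""
    return "begin " + "".join(OOT.format(arg=a) for a in args) + "end"
```

For instance, `expand_out('"\\"', '"7"')` gives a begin ... end block containing two overflow checks and two assignments, which is what out2("\")("7") stands for.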
90. The break_out routine is called just before the output buffer is about to overflow. To make this routine
a little faster, we initialize position 0 of the output buffer to ‘\’; this character isn’t really output.
⟨ Set initial values 10 ⟩ +≡
out_buf[0] ← "\";
91. A long line is broken at a blank space or just before a backslash that isn’t preceded by another
backslash. In the latter case, a "%" is output at the break. (This policy has a known bug, in the rare
situation that the backslash was in a string constant that’s being output “verbatim.”)
procedure break_out; { finds a way to break the output line }
label exit;
var k: 0 .. line_length; { index into out_buf }
d: ASCII_code; { character from the buffer }
begin k ← out_ptr;
loop begin if k = 0 then ⟨ Print warning message, break the line, return 92 ⟩;
d ← out_buf[k];
if d = " " then
begin flush_buffer(k, false); return;
end;
if (d = "\") ∧ (out_buf[k − 1] ≠ "\") then { in this case k > 1 }
begin flush_buffer(k − 1, true); return;
end;
decr(k);
end;
exit: end;
92. We get to this module only in unusual cases that the entire output line consists of a string of backslashes
followed by a string of nonblank non-backslashes. In such cases it is almost always safe to break the line by
putting a "%" just before the last character.
⟨ Print warning message, break the line, return 92 ⟩ ≡
begin print_nl('! Line had to be broken (output l.', out_line : 1); print_ln('):');
for k ← 1 to out_ptr − 1 do print(xchr[out_buf[k]]);
new_line; mark_harmless; flush_buffer(out_ptr − 1, true); return;
end
This code is used in section 91.
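The backward search of sections 90 to 92 can be sketched as a function that only decides where to break. This is a Python illustration under stated assumptions: find_break is a hypothetical name, its argument holds the characters out_buf[1 .. out_ptr], and the returned pair (b, per_cent) gives the arguments that would be passed to flush_buffer.

```python
def find_break(pending):
    """Choose a breakpoint for a full output line.

    Returns (b, per_cent), the arguments one would hand to flush_buffer.
    """
    s = "\\" + pending        # position 0 holds the sentinel backslash
    k = len(pending)
    while k > 0:
        d = s[k]
        if d == " ":
            return k, False   # break at a blank
        if d == "\\" and s[k - 1] != "\\":
            return k - 1, True    # break just before a lone backslash
        k -= 1
    # Fallback of section 92: break before the last character.
    return len(pending) - 1, True
```

The sentinel at position 0 is what keeps a backslash in position 1 from being chosen, since breaking there would emit an empty line.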
93. To output a string of bytes from byte_mem, we call out_str.
procedure out_str(p: name_pointer); { outputs a string }
var k: 0 .. max_bytes; { index into byte_mem }
begin for k ← byte_start[p] to byte_start[p + 1] − 1 do out(byte_mem[k]);
end;
94. The out_name subroutine is used to output a symbolic token. Unusual characters are translated into
forms that won’t screw up.
procedure out_name(p: name_pointer); { outputs a name }
var k: 0 .. max_bytes; { index into byte_mem }
t: name_pointer; { translation of character being output, if any }
begin for k ← byte_start[p] to byte_start[p + 1] − 1 do
begin t ← translation[byte_mem[k]];
if t = 0 then out(byte_mem[k])
else out_str(t);
end;
end;
95. We often want to output a name after calling a numeric macro (e.g., ‘\1{foo}’).
procedure out_mac_and_name(n: ASCII_code; p: name_pointer);
begin out("\"); out(n);
if length(p) = 1 then out_name(p)
else begin out("{"); out_name(p); out("}");
end;
end;
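A sketch of the bracing rule in out_mac_and_name: a one-character name follows the macro directly, while a longer name is wrapped in braces. The Python function mac_and_name is a hypothetical illustration that works on plain strings and, as a simplifying assumption, skips the per-character translation performed by out_name.

```python
def mac_and_name(n, name):
    """Return the TeX text for macro digit n applied to a token name.

    Simplification: unusual characters are not translated here, whereas
    the real out_name consults the translation table for each character.
    """
    if len(name) == 1:
        return "\\" + n + name            # single character: no braces
    return "\\" + n + "{" + name + "}"    # longer name: braced argument
```

Thus `mac_and_name("1", "sqrt")` produces `\1{sqrt}`.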
96. Here’s a routine that simply copies from the input buffer to the output buffer.
procedure copy(first_loc: integer); { output buffer[first_loc .. loc − 1] }
var k: 0 .. buf_size; { buffer location being copied }
begin for k ← first_loc to loc − 1 do out(buffer[k]);
end;
97. Translation. The main work of MFT is accomplished by a routine that translates the tokens, one by
one, with a limited amount of lookahead/lookbehind. Automata theorists might loosely call this a “finite
state transducer,” because the flow of control is comparatively simple.
procedure do_the_translation;
label restart, reswitch, done, exit;
var k: 0 .. buf_size; { looks ahead in the buffer }
t: integer; { type that spreads to new tokens }
begin restart: if out_ptr > 0 then flush_buffer(out_ptr, false);
empty_buffer ← true;
loop begin get_next;
if start_of_line then ⟨ Do special actions at the start of a line 98 ⟩;
reswitch: case cur_type of
numeric_token: ⟨ Translate a numeric token or a fraction 105 ⟩;
string_token: ⟨ Translate a string token 99 ⟩;
indentation: out_str(tr_quad);
end_of_line, mft_comment: ⟨ Wind up a line of translation and goto restart, or finish a | ... | segment and goto reswitch 110 ⟩;
end_of_file: return;
⟨ Cases that translate primitive tokens 100 ⟩
comment, recomment: ⟨ Translate a comment and goto restart, unless there’s a | ... | segment 108 ⟩;
verbatim: ⟨ Copy the rest of the current input line to the output, then goto restart 109 ⟩;
set_format: ⟨ Change the translation format of tokens, and goto restart or reswitch 111 ⟩;
internal, special_tag, tag: ⟨ Translate a tag and possible subscript 106 ⟩;
end; { all cases have been listed }
end;
exit: end;
98. ⟨ Do special actions at the start of a line 98 ⟩ ≡
if cur_type ≥ min_action_type then
begin out("$"); start_of_line ← false;
case cur_type of
endit: out2("\")("!");
binary, abinary, bbinary, ampersand, pyth_sub: out2("{")("}");
othercases do_nothing
endcases;
end
else if cur_type = end_of_line then
begin out_str(tr_skip); goto restart;
end
else if cur_type = mft_comment then goto restart
This code is used in section 97.
99. Let’s start with some of the easier translations, so that the harder ones will also be easy when we get
to them. A string like "cat" comes out ‘\7"cat"’.
⟨ Translate a string token 99 ⟩ ≡
begin out2("\")("7"); copy(cur_tok);
end
This code is used in section 97.
100. Similarly, the translation of ‘sqrt’ is ‘\1{sqrt}’.
⟨ Cases that translate primitive tokens 100 ⟩ ≡
op: out_mac_and_name("1", cur_tok);
command: out_mac_and_name("2", cur_tok);
type_name: if prev_type = command then out_mac_and_name("1", cur_tok)
else out_mac_and_name("2", cur_tok);
endit: out_mac_and_name("3", cur_tok);
bbinary: out_mac_and_name("4", cur_tok);
bold: out_mac_and_name("5", cur_tok);
binary: out_mac_and_name("6", cur_tok);
path_join: out_mac_and_name("8", cur_tok);
colon: out_mac_and_name("?", cur_tok);
See also sections 101, 102, and 103.
This code is used in section 97.
101. Here are a few more easy cases.
⟨ Cases that translate primitive tokens 100 ⟩ +≡
as_is, sharp, abinary: out_name(cur_tok);
double_back: out2("\")(";");
semicolon: begin out_name(cur_tok); get_next;
if cur_type ≠ end_of_line then
if cur_type ≠ endit then out2("\")(" ");
goto reswitch;
end;
102. Some of the primitives have a fixed output (independent of cur_tok):
⟨ Cases that translate primitive tokens 100 ⟩ +≡
backslash: out_str(translation["\"]);
pyth_sub: out_str(tr_ps);
less_or_equal: out_str(tr_le);
greater_or_equal: out_str(tr_ge);
not_equal: out_str(tr_ne);
ampersand: out_str(tr_amp);
103. The remaining primitive is slightly special.
⟨ Cases that translate primitive tokens 100 ⟩ +≡
input_command: begin out_mac_and_name("2", cur_tok); out5("\")("h")("b")("o")("x");
⟨ Scan the file name and output it in typewriter type 104 ⟩;
end;
104. File names have different formats on different computers, so we don’t scan them with get_next. Here
we use a rule that probably covers most cases satisfactorily: We ignore leading blanks, then consider the file
name to consist of all subsequent characters up to the first blank, semicolon, comment, or end-of-line. (A
carriage_return appears at the end of the line.)
⟨ Scan the file name and output it in typewriter type 104 ⟩ ≡
while buffer[loc] = " " do incr(loc);
out5("{")("\")("t")("t")(" ");
while (buffer[loc] ≠ " ") ∧ (buffer[loc] ≠ "%") ∧ (buffer[loc] ≠ ";") ∧ (loc < limit) do
begin out(buffer[loc]); incr(loc);
end;
out("}")
This code is used in section 103.
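The file-name rule can be tried out with a small Python sketch. The name scan_file_name is hypothetical, the buffer is assumed to be a string whose character at index limit is the carriage-return sentinel, and the function returns the text that would be appended to the output together with the new value of loc.

```python
def scan_file_name(buffer, loc, limit):
    """Return (tex_text, new_loc) for a file name that starts at or after loc."""
    # Ignore leading blanks; the sentinel at buffer[limit] stops this loop.
    while buffer[loc] == " ":
        loc += 1
    start = loc
    # The name runs up to the first blank, comment sign, semicolon, or line end.
    while loc < limit and buffer[loc] not in " %;":
        loc += 1
    return "{\\tt " + buffer[start:loc] + "}", loc
```

Given the rest of a line reading `  cmbase; x`, the sketch yields `{\tt cmbase}` and leaves loc at the semicolon.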
105. ⟨ Translate a numeric token or a fraction 105 ⟩ ≡
if buffer[loc] = "/" then
if char_class[buffer[loc + 1]] = digit_class then { it’s a fraction }
begin out5("\")("f")("r")("a")("c"); copy(cur_tok); get_next; out2("/")("{"); get_next;
copy(cur_tok); out("}");
end
else copy(cur_tok)
else copy(cur_tok)
This code is used in section 97.
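The one-character lookahead for fractions can be illustrated in Python. The names translate_numeric and skip_digits are hypothetical, the buffer is a string ending in a carriage-return sentinel, and for brevity the sketch assumes that numbers are plain digit runs without decimal points.

```python
DIGITS = "0123456789"

def skip_digits(buffer, loc):
    """Advance past a run of digits (simplified numeric token)."""
    while buffer[loc] in DIGITS:
        loc += 1
    return loc

def translate_numeric(buffer, start):
    """Return (tex_text, new_loc) for the number that starts at buffer[start]."""
    loc = skip_digits(buffer, start)
    numerator = buffer[start:loc]
    # A slash makes a fraction only when a digit follows it immediately.
    if buffer[loc] == "/" and buffer[loc + 1] in DIGITS:
        end = skip_digits(buffer, loc + 1)
        return "\\frac" + numerator + "/{" + buffer[loc + 1:end] + "}", end
    return numerator, loc
```

So `1/2` becomes `\frac1/{2}`, while in `3/x` the 3 is copied unchanged and the slash is left for the next token.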
106. ⟨ Translate a tag and possible subscript 106 ⟩ ≡
begin if length(cur_tok) = 1 then out_name(cur_tok)
else out_mac_and_name("\", cur_tok);
get_next;
if byte_mem[byte_start[prev_tok]] = "'" then goto reswitch;
case prev_type of
internal: begin if (cur_type = numeric_token) ∨ (cur_type ≥ min_suffix) then out2("\")(",");
goto reswitch;
end;
special_tag: if cur_type < min_suffix then goto reswitch
else begin out("."); cur_type ← internal; goto reswitch;
end;
tag: begin if cur_type = tag then
if byte_mem[byte_start[cur_tok]] = "'" then goto reswitch;
{ a sequence of primes goes on the main line }
if (cur_type = numeric_token) ∨ (cur_type ≥ min_suffix) then ⟨ Translate a subscript 107 ⟩
else if cur_type = sharp then out_str(tr_sharp)
else goto reswitch;
end;
end; { there are no other cases }
end
This code is used in section 97.
107. ⟨ Translate a subscript 107 ⟩ ≡
begin out2("_")("{");
loop begin if cur_type ≥ min_suffix then out_name(cur_tok)
else copy(cur_tok);
if prev_type = special_tag then
begin get_next; goto done;
end;
get_next;
if cur_type < min_suffix then
if cur_type ≠ numeric_token then goto done;
if cur_type = prev_type then
if cur_type = numeric_token then out2("\")(",")
else if char_class[byte_mem[byte_start[cur_tok]]] = char_class[byte_mem[byte_start[prev_tok]]] then
if byte_mem[byte_start[prev_tok]] ≠ "." then out(".")
else out2("\")(",");
end;
done: out("}"); goto reswitch;
end
This code is used in section 106.
108. The tricky thing about comments is that they might contain | ... |. We scan ahead for this, and
replace the second ‘|’ by a carriage_return.
⟨ Translate a comment and goto restart, unless there’s a | ... | segment 108 ⟩ ≡
begin if cur_type = comment then out2("\")("9");
id_first ← loc;
while (loc < limit) ∧ (buffer[loc] ≠ "|") do incr(loc);
copy(id_first);
if loc < limit then
begin start_of_line ← true; incr(loc); k ← loc;
while (k < limit) ∧ (buffer[k] ≠ "|") do incr(k);
buffer[k] ← carriage_return;
end
else begin if out_buf[out_ptr] = "\" then out(" ");
out4("\")("p")("a")("r"); goto restart;
end;
end
This code is used in section 97.
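The trick of planting a carriage return on the second ‘|’ can be followed in a Python sketch. The name split_comment is hypothetical; the buffer is a mutable list of characters whose element at index limit is the sentinel, and the returned triple gives the comment text that was copied, the new loc, and whether a | ... | segment was found.

```python
def split_comment(buffer, loc, limit):
    """Copy comment text up to the first '|' and fence off a |...| segment.

    buffer is a list of characters and is modified in place, as in the
    original program. Returns (copied_text, new_loc, segment_found).
    """
    first = loc
    while loc < limit and buffer[loc] != "|":
        loc += 1
    text = "".join(buffer[first:loc])
    if loc < limit:
        loc += 1                      # step over the opening '|'
        k = loc
        while k < limit and buffer[k] != "|":
            k += 1
        buffer[k] = "\r"              # the closing '|' now reads as end of line
        return text, loc, True
    return text, loc, False
```

Because limit itself is left alone, the scanner later meets an end of line with loc < limit, which is how section 110 recognizes that the rest of the comment still has to be translated.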
109. ⟨ Copy the rest of the current input line to the output, then goto restart 109 ⟩ ≡
begin id_first ← loc; loc ← limit; copy(id_first);
if out_ptr = 0 then
begin out_ptr ← 1; out_buf[1] ← " ";
end;
goto restart;
end
This code is used in section 97.
110. ⟨ Wind up a line of translation and goto restart, or finish a | ... | segment and goto reswitch 110 ⟩ ≡
begin out("$");
if (loc < limit) ∧ (cur_type = end_of_line) then
begin cur_type ← recomment; goto reswitch;
end
else begin out4("\")("p")("a")("r"); goto restart;
end;
end
This code is used in section 97.
111. ⟨ Change the translation format of tokens, and goto restart or reswitch 111 ⟩ ≡
begin start_of_line ← false; get_next; t ← cur_type;
while cur_type ≥ min_symbolic_token do
begin get_next;
if cur_type ≥ min_symbolic_token then ilk[cur_tok] ← t;
end;
if cur_type ≠ end_of_line then
if cur_type ≠ mft_comment then
begin err_print('! Only symbolic tokens should appear after %%%'); goto reswitch;
end;
empty_buffer ← true; goto restart;
end
This code is used in section 97.
112. The main program. Let’s put it all together now: MFT starts and ends here.
begin initialize; { beginning of the main program }
print_ln(banner); { print a “banner line” }
⟨ Store all the primitives 65 ⟩;
⟨ Store all the translations 73 ⟩;
⟨ Initialize the input system 44 ⟩;
do_the_translation; ⟨ Check that all changes have been read 49 ⟩;
end_of_MFT: { here files should be closed if the operating system requires it }
⟨ Print the job history 113 ⟩;
end.
113. Some implementations may wish to pass the history value to the operating system so that it can be
used to govern whether or not other programs are started. Here we simply report the history to the user.
⟨ Print the job history 113 ⟩ ≡
case history of
spotless: print_nl('(No errors were found.)');
harmless_message: print_nl('(Did you see the warning message above?)');
error_message: print_nl('(Pardon me, but I think I spotted something wrong.)');
fatal_message: print_nl('(That was a fatal error, my friend.)');
end { there are no other cases }
This code is used in section 112.
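An installation that does want to hand the history to the operating system might pair each message with an exit status. The Python sketch below is only one possible convention: the numeric values 0 to 3 for the four history states and the rule that warnings still count as success are assumptions, not something this program prescribes.

```python
# Assumed encoding of the four history states, in increasing severity.
SPOTLESS, HARMLESS_MESSAGE, ERROR_MESSAGE, FATAL_MESSAGE = range(4)

MESSAGES = {
    SPOTLESS: "(No errors were found.)",
    HARMLESS_MESSAGE: "(Did you see the warning message above?)",
    ERROR_MESSAGE: "(Pardon me, but I think I spotted something wrong.)",
    FATAL_MESSAGE: "(That was a fatal error, my friend.)",
}

def job_history(history):
    """Return (message, exit_status); status 0 means later programs may run."""
    status = 0 if history <= HARMLESS_MESSAGE else 1
    return MESSAGES[history], status
```

A wrapper could print the message and then call `sys.exit(status)` so that a build script stops after a real error.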
114. System-dependent changes. This module should be replaced, if necessary, by changes to the
program that are necessary to make MFT work at a particular installation. It is usually best to design your
change file so that all changes to previous modules preserve the module numbering; then everybody’s version
will be consistent with the printed program. More extensive changes, which introduce new modules, can be
inserted here; then only the index itself will get a new module number.
115. Index.
\! : 98
\, : 106, 107
\; : 101
\? : 100
\\ : 106
\ : 101
\AM, etc : 73
\frac : 105
\input mftmac : 88
\par : 108, 110
\1 : 100
\2 : 100
\3 : 100
\4 : 100
\5 : 100
\6 : 100
\7 : 99
\8 : 100
\9 : 108
{} : 98
abinary : 63, 70, 98, 101
ampersand : 63, 70, 98, 102
as is : 63, 65, 66, 70, 101
ASCII code: 11
ASCII code : 12, 13, 15, 27, 28, 36, 51, 72, 78,
80, 86, 91, 95
b: 87
backslash : 63, 65, 102
banner : 2, 112
bbinary : 63, 65, 98, 100
binary : 63, 66, 70, 98, 100
bold : 63, 66, 71, 100
boolean : 28, 34, 37, 77, 87
break : 22
break out : 89, 90, 91
buf size : 8, 27, 28, 29, 34, 36, 37, 38, 42, 55,
58, 96, 97
buffer : 27, 28, 29, 30, 37, 39, 41, 42, 43, 44, 46,
48, 49, 55, 58, 59, 61, 62, 64, 79, 80, 81, 82,
85, 96, 104, 105, 108
byte mem : 50, 51, 52, 53, 58, 61, 62, 72, 93,
94, 106, 107
byte ptr : 52, 53, 54, 62, 72
byte start : 50, 51, 52, 53, 54, 55, 61, 62, 72,
93, 94, 106, 107
c: 80
carriage return : 79, 85, 104, 108
Change file ended... : 40, 42, 48
Change file entry did not match : 49
change buffer : 36, 37, 38, 41, 42, 46, 49
change changing : 35, 42, 44, 48
change file : 3, 23, 24, 30, 34, 36, 39, 40, 42, 48
change limit : 36, 37, 38, 41, 42, 46, 49
changing : 30, 34, 35, 36, 38, 42, 44, 45, 49
char : 13
char class : 17, 78, 79, 80, 81, 105, 107
character set dependencies: 17, 79
check change : 42, 46
chr : 13, 15, 18
class : 80, 81
colon : 63, 65, 100
command : 63, 65, 66, 70, 71, 100
comment : 63, 71, 97, 108
confusion : 32
continue : 5, 38, 39
copy : 96, 99, 105, 107, 108, 109
cur tok : 64, 72, 73, 75, 76, 80, 81, 99, 100, 101,
102, 103, 105, 106, 107, 111
cur type : 75, 76, 80, 81, 97, 98, 101, 106, 107,
108, 110, 111
d: 91
decr : 6, 28, 87, 91
digit class : 78, 79, 81, 105
do nothing : 6, 81, 98
do the translation : 97, 112
done : 5, 38, 39, 80, 81, 87, 97, 107
double back : 63, 65, 101
eight bits : 50, 75, 87
else: 7
emit : 81, 82, 85
empty buffer : 77, 80, 85, 97, 111
end: 7
end line class : 78, 79, 81
end of file : 63, 85, 97
end of line : 63, 76, 81, 97, 98, 101, 110, 111
end of MFT : 3, 31, 112
endcases: 7
endit : 63, 65, 66, 71, 98, 100, 101
eof : 28
eoln : 28
err print : 29, 35, 39, 40, 42, 43, 48, 49, 83, 84, 111
error : 28, 29, 31
error message : 9, 113
exit : 5, 6, 37, 38, 42, 80, 91, 97
f : 28
false : 28, 35, 36, 37, 42, 44, 47, 85, 87, 88,
91, 97, 98, 111
fatal error : 31, 32, 33
fatal message : 9, 113
final limit : 28
first loc : 96
first text char : 13, 18
flush buffer : 87, 88, 91, 92, 97
found : 5, 58, 60, 61, 80, 81
get : 28
get line : 34, 45, 85
get next : 75, 77, 80, 97, 101, 104, 105, 106,
107, 111
greater or equal : 63, 70, 102
h: 56, 58
harmless message : 9, 113
hash : 52, 55, 57, 60
hash size : 8, 55, 56, 57, 58, 59
history : 9, 10, 113
Hmm... n of the preceding... : 43
i: 14, 58, 72
id first : 55, 58, 59, 61, 62, 64, 80, 81, 108, 109
id loc : 55, 58, 59, 61, 62, 65, 80
ilk : 50, 51, 63, 64, 80, 111
Incomplete string... : 83
incr : 6, 28, 39, 40, 42, 46, 47, 48, 59, 61, 62, 72,
80, 81, 82, 87, 89, 104, 108
indentation : 63, 81, 97
initialize : 3, 112
Input line too long : 28
input command : 63, 66, 103
input has ended : 34, 42, 44, 46, 85
input ln : 28, 39, 40, 42, 46, 47, 48
integer : 34, 42, 75, 86, 96, 97
internal : 63, 68, 69, 97, 106
Invalid character... : 84
invalid class : 78, 79, 81
isolated classes : 78, 81
j: 87
jump out : 3, 31
k: 29, 37, 38, 42, 58, 87, 91, 93, 94, 96, 97
Knuth, Donald Ervin: 1
l: 29, 58
last text char : 13, 18
left bracket class : 78, 79
length : 52, 60, 95, 106
less or equal : 63, 70, 102
letter class : 78, 79
limit : 28, 30, 34, 37, 39, 40, 41, 43, 44, 45, 46, 48,
49, 79, 82, 85, 104, 108, 109, 110
line : 30, 34, 35, 39, 40, 42, 44, 46, 47, 48, 49
Line had to be broken : 92
line length : 8, 86, 87, 89, 91
lines dont match : 37, 42
link : 50, 51, 52, 60
loc : 28, 30, 34, 39, 43, 44, 45, 48, 49, 80, 81, 82,
85, 96, 104, 105, 108, 109, 110
lookup : 55, 58, 64, 80
loop: 6
mark error : 9, 29
mark fatal : 9, 31
mark harmless : 9, 92
max bytes : 8, 51, 53, 58, 62, 93, 94
max class : 78
max names : 8, 51, 52, 62
MF file ended... : 42
mf file : 3, 23, 24, 30, 34, 36, 42, 46, 49
MFT : 3
mft comment : 63, 71, 97, 98, 111
mftmac : 1, 88
min action type : 63, 98
min suffix : 63, 106, 107
min symbolic token : 63, 111
n: 42, 95
name pointer : 52, 53, 58, 72, 74, 93, 94, 95
name ptr : 52, 53, 54, 58, 60, 62, 72
new line : 20, 29, 30, 31, 92
nil: 6
not equal : 63, 70, 102
not found : 5
numeric token : 63, 81, 97, 106, 107
Only symbolic tokens... : 111
oot : 89
oot1 : 89
oot2 : 89
oot3 : 89
oot4 : 89
oot5 : 89
op : 63, 65, 67, 100
open input : 24, 44
ord : 15
other line : 34, 35, 44, 49
othercases: 7
others : 7
out : 89, 93, 94, 95, 96, 98, 104, 105, 106,
107, 108, 110
out buf : 86, 87, 88, 89, 90, 91, 92, 108, 109
out line : 86, 87, 88, 92
out mac and name : 95, 100, 103, 106
out name : 94, 95, 101, 106, 107
out ptr : 86, 87, 88, 89, 91, 92, 97, 108, 109
out str : 93, 94, 97, 98, 102, 106
out2 : 89, 98, 99, 101, 105, 106, 107, 108
out3 : 89
out4 : 89, 108, 110
out5 : 89, 103, 104, 105
overflow : 33, 62
p: 58, 93, 94, 95
pass digits : 80, 81
pass fraction : 80, 81
path join : 63, 65, 100
per cent : 87
percent class : 78, 79
period class : 78, 79, 81
prev tok : 75, 80, 106, 107
prev type : 75, 80, 100, 106, 107
prime the change buffer : 38, 44, 48
print : 20, 29, 30, 31, 92
print ln : 20, 30, 92, 112
print nl : 20, 28, 92, 113
pr1 : 64, 65, 70, 71
pr10 : 64, 65, 66, 67, 69, 70, 71
pr11 : 64, 65, 66, 67, 68, 69, 70, 71
pr12 : 64, 66, 68, 69, 71
pr13 : 64, 67, 68, 70, 71
pr14 : 64, 67, 68
pr15 : 64, 68
pr16 : 64, 68, 71
pr17 : 64, 70
pr2 : 64, 65, 66, 70, 71
pr3 : 64, 65, 66, 67, 69, 70, 71
pr4 : 64, 65, 66, 67, 69, 70, 71
pr5 : 64, 65, 66, 67, 69, 70, 71
pr6 : 64, 65, 66, 67, 69, 70
pr7 : 64, 65, 66, 67, 69, 70, 71
pr8 : 64, 65, 66, 67, 69, 71
pr9 : 64, 66, 69, 70, 71
pyth sub : 63, 70, 98, 102
read ln : 28
recomment : 63, 97, 110
reset : 24
restart : 5, 45, 97, 98, 108, 109, 110, 111
reswitch : 5, 97, 101, 106, 107, 110, 111
return: 5, 6
rewrite : 21, 26
right bracket class : 78, 79
right paren class : 78, 79
semicolon : 63, 65, 101
set format : 63, 71, 97
sharp : 63, 71, 101, 106
sixteen bits : 50, 51, 55
Sorry, x capacity exceeded : 33
space class : 78, 79, 81
special tag : 63, 66, 97, 106, 107
spotless : 9, 10, 113
spr1 : 64
spr10 : 64
spr11 : 64
spr12 : 64
spr13 : 64
spr14 : 64
spr15 : 64
spr16 : 64
spr17 : 64
spr2 : 64
spr3 : 64
spr4 : 64
spr5 : 64
spr6 : 64
spr7 : 64
spr8 : 64
spr9 : 64
start of line : 77, 81, 85, 97, 98, 108, 111
string class : 78, 79, 81
string token : 63, 82, 97
style file : 3, 23, 24, 30, 34, 47
styling : 30, 34, 44, 45, 47
switch : 80, 81, 83, 84
system dependencies: 2, 3, 4, 7, 13, 16, 17, 20, 21,
22, 24, 26, 28, 30, 31, 79, 112, 113, 114
t: 94, 97
tag : 63, 97, 106
temp line : 34, 35
term out : 20, 21, 22
tex file : 3, 25, 26, 87, 88
text char : 13, 15, 20
text file : 13, 20, 23, 25, 28
This can’t happen : 32
tr amp : 73, 74, 102
tr ge : 73, 74, 102
tr le : 73, 74, 102
tr ne : 73, 74, 102
tr ps : 73, 74, 102
tr quad : 73, 74, 97
tr sharp : 73, 74, 106
tr skip : 73, 74, 98
translation : 72, 73, 94, 102
true : 6, 28, 34, 35, 37, 42, 44, 46, 49, 77, 85,
87, 91, 92, 97, 108, 111
tr1 : 72
tr2 : 72, 73
tr3 : 72
tr4 : 72, 73
tr5 : 72, 73
ttr1 : 72
ttr2 : 72
ttr3 : 72
ttr4 : 72
ttr5 : 72
type name : 63, 70, 100
update terminal : 22, 29
user manual: 1
verbatim : 63, 71, 97
Where is the match... : 39, 43, 48
write : 20, 87, 88
write ln : 20, 87
xchr : 15, 16, 17, 18, 30, 87, 92
xclause: 6
xord : 15, 18, 28
NAMES OF THE SECTIONS
⟨ Assign the default value to ilk[p] 63 ⟩ Used in section 62.
⟨ Branch on the class, scan the token; return directly if the token is special, or goto found if it needs to be looked up 81 ⟩ Used in section 80.
⟨ Bring in a new line of input; return if the file has ended 85 ⟩ Used in section 80.
⟨ Cases that translate primitive tokens 100, 101, 102, 103 ⟩ Used in section 97.
⟨ Change the translation format of tokens, and goto restart or reswitch 111 ⟩ Used in section 97.
⟨ Check that all changes have been read 49 ⟩ Used in section 112.
⟨ Compare name p with current token, goto found if equal 61 ⟩ Used in section 60.
⟨ Compiler directives 4 ⟩ Used in section 3.
⟨ Compute the hash code h 59 ⟩ Used in section 58.
⟨ Compute the name location p 60 ⟩ Used in section 58.
⟨ Constants in the outer block 8 ⟩ Used in section 3.
⟨ Copy the rest of the current input line to the output, then goto restart 109 ⟩ Used in section 97.
⟨ Decry the invalid character and goto switch 84 ⟩ Used in section 81.
⟨ Decry the missing string delimiter and goto switch 83 ⟩ Used in section 82.
⟨ Do special actions at the start of a line 98 ⟩ Used in section 97.
⟨ Enter a new name into the table at position p 62 ⟩ Used in section 58.
⟨ Error handling procedures 29, 31 ⟩ Used in section 3.
⟨ Get a string token and return 82 ⟩ Used in section 81.
⟨ Globals in the outer block 9, 15, 20, 23, 25, 27, 34, 36, 51, 53, 55, 72, 74, 75, 77, 78, 86 ⟩ Used in section 3.
⟨ If the current line starts with @y, report any discrepancies and return 43 ⟩ Used in section 42.
⟨ Initialize the input system 44 ⟩ Used in section 112.
⟨ Local variables for initialization 14, 56 ⟩ Used in section 3.
⟨ Move buffer and limit to change_buffer and change_limit 41 ⟩ Used in sections 38 and 42.
⟨ Print error location based on input buffer 30 ⟩ Used in section 29.
⟨ Print the job history 113 ⟩ Used in section 112.
⟨ Print warning message, break the line, return 92 ⟩ Used in section 91.
⟨ Read from change_file and maybe turn off changing 48 ⟩ Used in section 45.
⟨ Read from mf_file and maybe turn on changing 46 ⟩ Used in section 45.
⟨ Read from style_file and maybe turn off styling 47 ⟩ Used in section 45.
⟨ Scan the file name and output it in typewriter type 104 ⟩ Used in section 103.
⟨ Set initial values 10, 16, 17, 18, 21, 26, 54, 57, 76, 79, 88, 90 ⟩ Used in section 3.
⟨ Skip over comment lines in the change file; return if end of file 39 ⟩ Used in section 38.
⟨ Skip to the next nonblank line; return if end of file 40 ⟩ Used in section 38.
⟨ Store all the primitives 65, 66, 67, 68, 69, 70, 71 ⟩ Used in section 112.
⟨ Store all the translations 73 ⟩ Used in section 112.
⟨ Translate a comment and goto restart, unless there’s a | ... | segment 108 ⟩ Used in section 97.
⟨ Translate a numeric token or a fraction 105 ⟩ Used in section 97.
⟨ Translate a string token 99 ⟩ Used in section 97.
⟨ Translate a subscript 107 ⟩ Used in section 106.
⟨ Translate a tag and possible subscript 106 ⟩ Used in section 97.
⟨ Types in the outer block 12, 13, 50, 52 ⟩ Used in section 3.
⟨ Wind up a line of translation and goto restart, or finish a | ... | segment and goto reswitch 110 ⟩ Used in section 97.