DECUS C LANGUAGE SYSTEM DECUS C Compiler Reference Manual

DECUS C LANGUAGE SYSTEM

DECUS C Compiler

Reference Manual

by

David G. Conroy

Edited by

Martin Minow, John D. Morton and Robert B. Denny

This document describes the CC compiler itself (including

implementation quirks and known bugs), along with procedures for

compiling and executing programs under a wide variety of Digital

operating systems.

DECUS Structured Languages SIG

Version of 15-Oct-81

NOTE

This software is made available without any support

whatsoever. The person responsible for an

implementation of this system should expect to have to

understand and modify the source code if any problems

are encountered in implementing or maintaining the

compiler or its run-time library. The DECUS

'Structured Languages Special Interest Group' is the

primary focus for communication among users of this

software.

UNIX is a trademark of Bell Telephone Laboratories. RSX,

RSTS/E, RT-11 and VMS are trademarks of Digital Equipment

Corporation.

CHAPTER 1

INTRODUCTION

CC is a multipass C compiler for the PDP-11 that runs under the

RSX-11, VMS (compatibility mode), RSTS/E, and/or RT-11 operating

systems. Except for the restrictions noted in a later section,

it compiles programs as per the description of C in the Unix

Seventh Edition documentation or the book The C Programming

Language by Brian Kernighan and Dennis Ritchie (Englewood

Cliffs, NJ: Prentice-Hall, ISBN 0-13-110163-3).

In general, the code produced by this compiler is quite well

optimized for the PDP-11. Quality of the generated code is,

however, dependent on the programmer's understanding of both the

language and the target machine (the PDP-11). In particular,

proper use of register variables and the prefix '--' and postfix

'++' operators with pointers can result in surprising reductions

in size and increases in speed. Experience is the best teacher.
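As an illustration of the point above, the following sketch uses register pointers with postfix '++' and prefix '--', the forms that correspond to the PDP-11 autoincrement and autodecrement addressing modes. The function names are hypothetical. The code is written with prototypes so that it can be checked with a present-day compiler; DECUS C itself would want the older parameter declaration style. Automatic variables are assigned rather than initialized, in keeping with the restrictions listed in Chapter 4.

```c
/*
 * Copy a NUL-terminated string using register pointers and
 * postfix '++'.  Returns the number of characters copied, not
 * counting the terminating NUL.
 */
int
copy_string(char *dst, char *src)
{
    register char *d;
    register char *s;

    d = dst;
    s = src;
    while ((*d++ = *s++) != '\0')
        ;
    return (int)(d - dst) - 1;
}

/*
 * Fill a buffer from its end towards its start using prefix '--'.
 */
void
fill_backwards(char *buf, int len, char value)
{
    register char *p;

    p = buf + len;
    while (p > buf)
        *--p = value;
}
```

Whether a given loop actually benefits is best checked by reading the generated .S file.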

CHAPTER 2

USING THE C COMPILER

Since the C compiler runs on so many operating systems, command

information is presented in individual sections for the various

operating system families, followed by a common section

describing usage and the switches needed to control compilation.

2.1 VMS, RSX-11, and RSTS/E RSX Emulation Mode

After the appropriate setup sequence (described in a later

section) has been executed, the compiler may be invoked as

follows:

        XCC file [-switches]

or      RUN C:XCC

        CC> [type command line here]

The specified file is compiled and the resulting assembly code

is placed in a file having the same name as the source file but

with a filetype of 'S'. The default filetype for source files

is 'C'. The file will be written to the user's current default

account. On RSTS, this is the account under which the user is

logged in.

Diagnostics are written to the standard output. The diagnostic

stream may be redirected by means of the '>' or '>>'

conventions: '>filename' writes diagnostics to the named file,

while '>>filename' appends diagnostics to the named file. This

is compatible with Unix usage.

Only a single file may be compiled at one time. Wildcards are

not legal in file names.

The resulting assembly language is assembled with AS as follows:

XAS file -d

The generated code should never have any assembly errors. The

'-d' switch deletes the input file ('file.s') unless an error is

detected.

Object files are built into executable images by using one of

the RSX-11M task builders. The simplest command sequences

possible on native RSX-11M are:

>FTB prog/CP=objects,[1,1]C/LB

or >TKB prog/CP=objects,[1,1]C/LB

Alternatively, on VMS, RSX-11, or RSTS/E RSX, the task builder

may be invoked explicitly:

TKB>prog/CP,map=objects,[1,1]C/LB (native RSX)

TKB>//

TKB>prog,map=objects,C:C/LB (VMS, RSTS/E RSX)

TKB>//

NOTE

On native RSX-11M, the C OTS is normally kept on UIC

[1,1], and cannot be referenced as "C:C/LB". On

RSX-11M PLUS, there is a 'libuic' on which the C

library would be kept, and may not be [1,1]. On the

other systems the library may be referenced as

"C:C/LB".

If a program uses large amounts of automatically-allocated

storage, the "STACK = number" option should be specified to the

task builder. A C program may be built with the 4K FCS resident

library FCSRES using the "LIBR = FCSRES:RO" option.

2.2 RT-11 or RSTS/E RT-11 Emulation Mode

After the setup sequence described in a later section has been

executed, the compiler may be run as follows:

        RUN C:CC file [/switches]       (native RT-11)

        CC file [/switches]             (RSTS/E)

or      RUN C:CC

        CC> file [/switches]

or      CC> file.s,file.tm1,file.tmp=file.c [/switches]

The latter case explicitly creates and saves the intermediate

code (.tm1) and expanded source (.tmp) files. Normally, these

are needed only when debugging the compiler. Note that if you

do not specify extensions for the intermediate files, they will

be given the default of '.tmp' for the expanded source and

'.tm1' for the intermediate code file.

The resulting assembly language is assembled with AS as follows:

        RUN C:AS file/d         (RT-11)

        AS file/d               (RSTS/E)

or      RUN C:AS

        AS> file/d

The generated code should never have any assembly errors. The

'/d' switch deletes the input file ('file.s') unless an error is

detected.

Object modules are linked into executable images by using the

RT-11 linker:

LINK/BOT:2000 prog,objects,C:(SUPORT,CLIB) (RT-11)

LINK save,map=objects,C:SUPORT,C:CLIB/B:2000 (RSTS/E)

The two library files contain the actual main program (in

SUPORT) and the RT-11 run-time support library. The start

address must be at least 2000 to allow for dynamic storage by

subroutines. If the '/BOTTOM' option or the '/b' switch is

omitted, executing printf() may cause the program to abort with

an 'M-trap to 4' message.

2.3 Compilation Notes

MACRO-11 may NOT be used to assemble the output of CC. CC

expects that its assembler can perform certain optimizations

(most notably branch adjustment) not performed by MACRO-11.

The title of the object file will be set to the first six

characters of the source file name. This is of interest only to

people who load overlaid programs off libraries.

The compiler writes on files 'file.TMP' and 'file.TM1'. It is,

therefore, unwise to keep important things in files with these

filetypes. The '.TMP' file contains the C source with #include

and #define statements processed. This is the input to the

compiler proper. The '.TM1' file contains the intermediate code

generated by the compiler parser. This is the input to the code

generator.

2.4 Switches

Under RSX modes, switches are given as single letters preceded

by a minus sign:

XCC test -w

(On certain releases of Vax VMS, the switch must follow the

filename to be recognized.)

Under RSTS/E or RT-11, switches are given as single letters

preceded by a slash:

RUN C:CC test/s (native RT-11)

Case is not significant. All switches are shown, although many

are of interest only to persons charged with maintaining the

compiler. The following switches are defined:

a This optional argument causes the compiler to chain to

the assembler, assembling the .S file. Note: it works

on native RSX-11M and RT-11 systems, but will not work

on RSX-11M emulated on VMS or RSTS/E. Also, on native

RT11, the switch may not be given in a command file.

The chain takes place only if no errors were detected by

the compiler. On RSX-11M, AS is invoked as if the "XAS

-D file" command were given. On RT-11, AS is invoked as

if "RUN C:AS file/D" were given.

d This optional argument causes the compiler to treat

floating-point according to C language specifications:

when calling a function, single-precision floating point

variables and constants are converted to

double-precision. Also, functions always return

double-precision results. This overrides the

installation default.

e This optional argument causes in-line code to be

generated for multiply, divide, xor, and shift

operations, using the PDP-11 extended instruction set

(EIS). Note: when CC is installed, this may be made

the default. The 'n' switch disables in-line generation

of EIS operations.

f This optional argument causes the compiler to pass

single-precision floating point variables and constants

to functions without extending them to double-precision.

It is thus incompatible with C language standards, but

is more efficient for certain applications. This may be

made the default when CC is installed.

l This optional argument causes internal code trees to be

written (as comments) to the .S output file. This

option is for compiler maintenance.

m This optional argument disables the preprocessor. Use it

when the source (.C) file has already been processed by the

mp macro preprocessor.

n This optional argument causes the compiler to call

subroutines for multiply, divide, xor, and shift

operations. When CC is installed, this may be made the

default.

p This optional argument causes profiling code to be

compiled (see the section on profiling).

v Ignored. In previous releases it caused the compiler to

echo the current line of the source onto the error

stream whenever an error is detected. (This is now

permanent). In most cases, the line echoed is not the

line containing the error, because the parser usually

has to read the next symbol of the source to determine

that an error exists. It will usually be within 1 line,

which should be close enough to locate the error.

w This optional argument suppresses the "variable was

defined but never referenced" warning message.

x This optional argument is for debugging the compiler.

It causes the compiler to retain intermediate files and

to print timings of each compiler pass.

z This optional argument causes the compiler to execute a

breakpoint trap when entering each overlay segment. It

is used only for debugging the compiler. It is listed

here only as a warning for the fumble-fingered

typist.

2.5 Setup of the Compiler

Before using the C compiler, it must be made known to the

operating system. This differs slightly for the various

systems.

2.5.1 Setup under VMS

The following setup (or something much like it) should be added

to your LOGIN.COM file:

$ ASSIGN DBA0:[PUBLIC] C

$ XCC :== $C:CC.EXE CC

$ XAS :== $C:AS.EXE AS

This setup enables the command sequences described above.

If your compiled C program is to make use of the

(Unix-compatible) startup sequence, you must proceed as follows:

$ XCC foo

$ XAS -d foo

$ MCR TKB foo,foo=foo,c:c/lb

Then, you must type:

$ FOOBAR :== "$DISK:[ACCOUNT]FOO.EXE"

$ FOOBAR Unix-style parameters

The '$' tells the VMS command interpreter that a command is

being defined. On VMS, the 'task name' will be passed to the

program as argv[0] when the program starts.

2.5.2 Setup under RSTS/E RSX emulation mode

Under RSTS/E, the system manager must define the XCC and XAS CCL

commands and the C: system-wide logical in a start control file

such as the following (the account may be chosen to meet the

system manager's needs):

RUN $UTILTY

? ADD LOGICAL SY:[5,2]C

? CCL XAS-=C:AS.TSK;0

? CCL XCC-=C:CC.TSK;0

? CCL MCR-=C:MCR.*;30000

? EXIT

2.5.3 Setup under RSX-11M

As it is assembled, the CC compiler looks for #include files of

the form '<file.h>' on logical device 'C:'. This will not work

on RSX-11M, so the distributed compiler build file does a

'GBLPAT' to the location labeled 'SYSINC' to change it to

"LB:[1,1]". On an RSX-11M PLUS system, you should change this

to your 'libuic' if necessary by editing MMAKCC.CMD.

Install CC and AS as MCR external commands '...XCC' and

'...XAS', respectively. The CC compiler MUST be installed

checkpointable in a mapped system to allow for task extension.

If you have an unmapped system, or do not have the 'extend task'

directive in your executive, install CC with at least

'INC=20000' (more if you get compiler aborts).

2.5.4 Setup under RT-11 and RSTS/E RT-11 mode

Under RT-11, setup consists of simply ASSIGNing a physical

device to the logical device "C:". The compiler and assembler

.SAV files, the SUPORT.OBJ module, and the library CLIB.OBJ

should be placed on device 'C:'. You can make the assignment of

device 'C:' as part of the startup command file, e.g.:

.ASSIGN RK0: C:

This compiler has been built and used under RT-11 V3B and V4.

It has run on a PDP-11/34, a PDP-11/05 and on PDT150 systems.

Under RSTS/E, the system manager must execute a startup control

file such as the following:

RUN $UTILTY

? ADD LOGICAL SY:[5,2]C

? CCL AS-=C:AS.SAV;8192

? CCL CC-=C:CC.SAV;8220

? CCL MCR-=C:MCR.*;30000

? EXIT

2.6 Invoking Compiled C Programs

When your program begins to execute and the startup module sees

that a command has been typed, a Unix C setup sequence is

emulated, including I/O redirection and command argument

processing. The startup module does not expand wild-card

filenames, however.

NOTE

On RSX-11M, this feature cannot be used unless your

program is installed as an MCR external command, i.e.

with a task name of '...xxx', and activated by typing

the "xxx". This requires that you be a privileged

user.

On RT-11, if no command line has been passed, the module prompts

"Argv: " and accepts a single line which is then parsed into

command arguments. This can be disabled by defining the $$narg

global symbol as described in the library documentation.

NOTE

On native RT-11, a command line passed via

"RUN prog ..." which has more than one 'token' or

'word' in it gets parsed by the RT-11 monitor before

it ever gets to the C program. See the documentation

in the RT-11 manual on the 'RUN' command. It causes

an "=" sign to get inserted, and the order of

arguments is shuffled.

To get around this, either use the "RUN prog" and answer the

"Argv: " prompt with the command line, or enclose the command

line in some delimiter plus a space, e.g.:

RUN C:prog [ command line ]

which tacks an extra token on to the command line that looks

like "]=[" for the case above. There is no problem on command

lines which have one token.

If you include an argument of the form '>file', standard output

will be written to the indicated file. If you include an

argument of the form '>>file', standard output will be appended

to the file (creating it if necessary). Append does not work in

RT-11 modes. If you include an argument of the form '<file',

the indicated file will be used for standard input.
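The redirection conventions just described can be sketched as a small classifier. This is only an illustration of the rules for '>file', '>>file' and '<file'; it is not the code of the DECUS startup module, and the names classify_arg and ARG_xxx are invented for the example. It is written with prototypes so that a present-day compiler accepts it.

```c
#include <stddef.h>

/* Kinds of command-line argument (illustrative names). */
#define ARG_PLAIN   0   /* ordinary argument, passed on in argv */
#define ARG_INPUT   1   /* "<file"  : standard input from file  */
#define ARG_OUTPUT  2   /* ">file"  : standard output to file   */
#define ARG_APPEND  3   /* ">>file" : append standard output    */

/*
 * Classify one argument.  For a redirection, *file is set to point
 * at the file name that follows the operator; otherwise it is NULL.
 * A bare "<", ">" or ">>" is treated as an ordinary argument.
 */
int
classify_arg(char *arg, char **file)
{
    *file = NULL;
    if (arg[0] == '<' && arg[1] != '\0') {
        *file = arg + 1;
        return ARG_INPUT;
    }
    if (arg[0] == '>' && arg[1] == '>' && arg[2] != '\0') {
        *file = arg + 2;
        return ARG_APPEND;
    }
    if (arg[0] == '>' && arg[1] != '\0' && arg[1] != '>') {
        *file = arg + 1;
        return ARG_OUTPUT;
    }
    return ARG_PLAIN;
}
```

A startup routine built on this idea would remove the redirection arguments from the list before counting argc, so that the program sees only the ordinary ones.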

When the C program is started, it will be entered with two

parameters.

argc This is the count of the number of arguments.

It will be at least 1.

argv This is an array of string pointers containing

the individual arguments. The first parameter,

argv[0], will be the name assigned to the

program, where appropriate:

o On RSX, this will be the name by which

the task was installed (TTnn if the

'RUN' command was used with no task

name).

o On VMS compatibility mode, this will be

the <name> parameter in the command

definition as shown above.

o On RSTS/E, this will be the CCL name or

the program name as passed to the MCR

program.

o On RT-11 (or on RSTS, by default, if no

name can be found), this will be the

string 'Argv: '.

For example:

/*
 * Echo arguments
 */
main(argc, argv)
int argc;
char *argv[];
{
    register int i;

    printf("Program \"%s\" has %d parameters\n",
        argv[0], argc);
    for (i = 0; i < argc; i++)
        printf("Argument %d = \"%s\"\n",
            i, argv[i]);
}

The above program is executed as follows on VMS:

$ ECHO abc "def ghi"

Program "ECHO" has 3 parameters

Argument 0 = "ECHO"

Argument 1 = "ABC"

Argument 2 = "def ghi"

Notice that unquoted arguments are converted to upper case by

the operating system.

Under RSTS/E, a C program may be installed as a CCL command or

the program may be started using the MCR CCL command which

emulates a CCL invocation for C programs.

2.7 Predefined Symbols

Before reading the program source file, the C compiler defines

several symbols (which may then be tested with '#ifdef'

statements):

decus       This is the Decus compiler.

nomacarg    This version does not allow macros with arguments.

pdp11       Generate code for the PDP-11.

rsx         The RSX compiler (or)

RT11        The RT-11 compiler

_DATE       The compilation date and time as a quoted string.
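A minimal sketch of testing these symbols follows. The function name target_name is invented for the example, and the code assumes that exactly one of the RSX and RT-11 symbols is defined when 'decus' is. It uses a prototype so that a present-day compiler accepts it; on such a compiler none of the symbols is defined and the last branch is taken.

```c
/*
 * Report which environment the program was compiled for.
 */
char *
target_name(void)
{
#ifdef decus
#ifdef rsx
    return "Decus C, RSX";
#else
    return "Decus C, RT-11";
#endif
#else
    return "other";
#endif
}
```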

2.8 Program Sections

Two directives, psect and dsect, have been added to the C

language syntax to permit programmer control over the program

sections generated. This was needed to permit C programs to be

configured for read-only memory systems and simplifies writing

RSTS/E run-time systems in C.

Warning

These directives are supported only on Decus C.

Programs using them are not transportable to other C

compilers. They can be "hidden" by suitable use of

the "#ifdef decus" pre-processor directive.

To change all default sections within a compilation, use the

psect directive as follows:

/*
 * program
 */
int normal;             /* goes into '.data.' section */

func() {                /* goes into '.prog.' section */
}

psect "xxx";            /* Name special sections */

int funny;              /* goes into 'xxxdat' section */

subr() {                /* goes into 'xxxcod' section */
}

psect "";               /* Null string means normal */

int norm;               /* goes into '.data.' section */

function() {            /* goes into '.prog.' section */
}

The dsect directive has the same syntax as the psect directive.

It affects only the allocation of global and static data. The

entire string is used.

int normal;             /* Goes into the .data. section */

dsect "mydata";         /* Switch to my section */

int pure1;              /* Goes into the mydata section */

func() {                /* Goes into the .prog. section */
    static int more;
}

dsect "";               /* Revert to default .data. */

The dsect directive uses the first six non-blank (and

non-control) characters of the quoted argument. Your program

will not compile correctly (and you will not be warned) if the

dsect argument matches any other program section. For example,

dsect ".strn.";

Will not work correctly.

The psect directive takes the first three non-blank (and

non-control) bytes of the quoted argument together with 'cod',

'dat', etc. to form program sections.

The psect and dsect directives may not be given within a

function.

If a null string (or one with no non-blank text) is given, the

compiler will revert to the standard program sections. (Dsect

changes only the data section, while psect changes all.)
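As suggested in the warning at the start of this section, the directives can be hidden from other compilers with "#ifdef decus". The following is a sketch only: the section name "rom" and the names table_size and get_table_size are invented for the example. On a compiler that does not define 'decus', the psect lines disappear and the rest compiles unchanged.

```c
#ifdef decus
psect "rom";            /* Decus C only: sections romcod, romdat, ... */
#endif

int table_size;

int
get_table_size(void)
{
    return table_size;
}

#ifdef decus
psect "";               /* back to the standard sections */
#endif
```

Both directives appear outside any function, as the rules above require.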

The compiler supports the following program sections:

Default     With psect "xxx"    Contents

.prog.      xxxcod              Executable code

.data.      xxxdat              Global and static data

.strn.      xxxstr              Strings

.mwcn.      xxxmwc              Multi-word constants

.prof.      xxxprr              Profile tables

If the psect string argument is less than three characters long,

it will be padded with '.'. Normally, .prog., .strn., and

.mwcn. sections may be allocated to read-only memory, while

.data. and .prof. sections should be allocated to read-write

memory.
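The naming rule for psect can be sketched in C as follows. This is an illustration of the rule as stated above, not the compiler's own code: the function name psect_name is invented, and padding on the right is an assumption, since the text says only that short names are padded with '.'. The null-string case (revert to the standard sections) is left to the caller.

```c
/*
 * Build a six-character section name from a psect argument:
 * the first three non-blank, non-control characters, padded
 * (on the right, by assumption) with '.', followed by a
 * three-letter suffix such as "cod" or "dat".
 * 'out' must have room for seven bytes.
 */
void
psect_name(char *arg, char *suffix, char *out)
{
    int n;
    int i;

    n = 0;
    for (; *arg != '\0' && n < 3; arg++) {
        if (*arg > ' ' && *arg != 0177)     /* skip blanks and controls */
            out[n++] = *arg;
    }
    while (n < 3)
        out[n++] = '.';                     /* pad short names */
    for (i = 0; i < 3 && suffix[i] != '\0'; i++)
        out[n++] = suffix[i];
    out[n] = '\0';
}
```

Under these assumptions, psect "xxx" with suffix "cod" gives "xxxcod", and a two-character argument "ab" with suffix "dat" gives "ab.dat".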

Note that the compiler will always generate references to the

standard program sections, even if a psect directive is the

first in the file. While Decus C cannot specify program section

attributes (read-only, etc.), these can be specified by

task-builder control files.

Strings, by default, are written to the .strn. program section.

String vectors, however, require two allocations: a pointer

value (in .data.) and a character string (in .strn.). If the

program must control allocation of both the pointer and the

string, the psect directive must be used. Your program will not

compile correctly if you allocate pointer and string values in

the same program section.

dsect "rodata";
char *entry[] = {
    "string1", "string2"
};

In the above, entry[0] will be in program section rodata, while

"string1", etc. will be in .strn.

psect "xxx";
dsect "rodata";
char *entry[] = {
    "string1", "string2"
};
psect "";

In the above, "string1" was allocated in program section

"xxxstr", while the entry[] pointers were allocated in program

section "rodata".

2.9 Profiling

The profiler permits the accumulation of function call

statistics during the execution of a program.

If any of the files comprising a program were compiled with the

profile option (and at least one of them has been called) then a

call profile, listing the function name and the number of calls,

will be written to file 'profil.out' when the program

terminates.

Also, if the program terminates because of a fatal error (such

as an illegal memory reference), a register dump and call trace

will be printed on the command terminal.

The run-time library contains several functions that can be

called to dynamically print flow trace information. For more

information, consult the C Runtime Library manual.

2.10 Diagnostics

There are two general classes of diagnostics: those that relate

to compiler conditions, and those that relate to errors in the

user's program.

The only compiler condition messages the user should see

are those of the form "Cannot open .... file". These mean

exactly what they say.

Other compiler condition messages are "Abort in phase x", "Abort

loading phase x" and "Trap type x", where "x" is replaced by

some small constant. Most likely, you are using a syntactic

construction (such as bit fields) that is supported by the

syntax analyser, but not by the code generator. If not, you

are the proud owner of a compiler bug. Report your find to a

guru. Remember the register dump and save your source file and

both temporary files. They are important.

Errors in the user's programs are reported in English, tagged by

the line number (which may be off by 1). Because of the nature

of the language, errors sometimes snowball. If you are greeted

by thousands of error messages, try fixing up the first few.

You may be pleasantly surprised.

The following are common sources of 'thousands of errors':

o If there is a missing right brace within a function, all

succeeding functions will miscompile. The error message

will include a tag of the form "within function xxxxx",

where "xxxxx" is the function with the missing brace.

o If there is a missing right parenthesis in an if or

while statement which is followed by a left brace, the

syntax analyser will 'lose' the brace, causing many

messages:

if ((foo = fopen("abc.def", "w") == NULL) {

...

o In general, if the error message is "illegal

expression", that is (probably) the current line. If

the message is "illegal statement", you should look at

the previous statement.
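For comparison with the missing-parenthesis case in the list above, here is the same test written with balanced parentheses. The function name can_create is invented for the example, and the code uses a prototype so that a present-day compiler accepts it.

```c
#include <stdio.h>

/*
 * Try to open a file for writing.  Note the parenthesis that
 * closes the assignment before the comparison with NULL.
 * Returns 1 on success, 0 on failure.
 */
int
can_create(char *name)
{
    FILE *fp;

    if ((fp = fopen(name, "w")) == NULL) {
        return 0;
    }
    fclose(fp);
    return 1;
}
```

Counting the parentheses on the line named in the first diagnostic is usually the quickest check when an if or while statement is followed by a flood of messages.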

CHAPTER 3

RUNTIME ENVIRONMENT

This description of the C runtime environment is sketchy. The

best reference is compiler generated code, and any question

regarding 'how does it ....' can usually be answered by

compiling a suitably contrived program.

3.1 Program Sections

The C compiler uses 5 program sections whose names may be

overridden by the programmer, as described previously.

.PROG. is used for all executable code.

.DATA. is used for all static (read-write) data. The compiler

issues the .even assembler directive to force

word-alignment when necessary.

.STRN. is used for the bodies of all literal strings.

.PROF. is used to hold the names of functions and reference

counts for the profiler. It contains read-write data.

.MWCN. is used for multi-word constants (long integers and

floating-point values), as well as for transfer tables

for the switch statement processor. It is read-only

data.

All code and constant data are 'pure'. However, the assembler

is not able to generate all the varieties of .PSECTs. Thus,

everything is read-write. This should be changed. Also, the

compiler does not write a symbol table as such, making debugging

a chore.

3.2 Register Usage

R5 is used as an environment frame pointer. It points to the

highest address of the stack frame of the current function. In

MACRO-11 programs, symbols C$PMTR and C$AUTO may be used to

refer to the first parameter and first automatic variable,

respectively. Thus, when writing a MACRO subroutine, the macro

program should contain:

MOV C$PMTR+<parameter_number * 2>(R5), Dst

to access parameters (the first parameter_number is 0). (This

cannot be done when using the AS assembler.)

To access automatic variables, the recommended sequence is:

MOV C$AUTO-<variable_number * 2>(R5), Dst

where the first variable_number is 1. (This cannot be

done when using the AS assembler.)

Registers R2, R3 and R4 are used as register variables. The

first register variable to be declared goes in R4, the second in

R3 and the third in R2. Any register not used as a register

variable can be used as a temporary.

Registers R0 and R1 are always scratch registers.

3.2.1 Calling Sequence

The first instructions in a C function are a 'JSR R5,CSV$' and a

subtract to claim stack space. The 'CSV$' routine points R5 at

the new stack frame and pushes registers R4, R3, and R2 onto the

stack. (Note that the character '$' in the CC/MACRO environment

is represented by '~' in the AS environment.)

R0, R1 and the floating point registers are NOT saved. This

means that if a C function is called asynchronously (i.e. from

an AST routine) the caller must arrange to save these registers

or be prepared to face the music.

Functions return via a 'JMP CRET$'. The return value is in R0

(for ints, chars and pointers), R0-R1 (for longs, high part in

R0) or AC0 (floats and doubles).

The caller passes control to a function by first pushing the

arguments (from right to left) onto the stack, calling the

function via a 'JSR PC,FUNCTION', and popping the arguments off

of the stack when the function returns.

All arguments are passed as ints, longs (push low part, then

push high part) or doubles. Characters are passed as integers;

floats are passed as doubles.
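The widening of arguments described above can be observed from C. The sketch below relies on the <stdarg.h> facility of present-day C, which is not part of Decus C and is used here only because standard C applies the same promotions to variable arguments: a char arrives as an int and a float arrives as a double. The function name sum_args and the 'kinds' string are invented for the example.

```c
#include <stdarg.h>

/*
 * Add up the arguments described by 'kinds': 'i' means the caller
 * passed an int (or a char, which arrives as an int) and 'd' means
 * it passed a double (or a float, which arrives as a double).
 */
double
sum_args(char *kinds, ...)
{
    va_list ap;
    double total;

    total = 0.0;
    va_start(ap, kinds);
    for (; *kinds != '\0'; kinds++) {
        if (*kinds == 'i')
            total += va_arg(ap, int);
        else if (*kinds == 'd')
            total += va_arg(ap, double);
    }
    va_end(ap);
    return total;
}
```

A called function therefore never needs to fetch a char-sized or float-sized argument from the stack; it always reads an int, a long or a double.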

3.3 Global Symbols Containing Radix-50 '$' and '.'

With this version of Decus C, it is possible to generate and

access global symbols which contain the Radix-50 '.' and '$'.

The compiler allows identifiers to contain the Ascii '$', which

becomes a Radix-50 '$' in the object code. The AS assembly code

shows this character as a tilde (~). The underscore character

(_) in a C program becomes a '.' in both the AS assembly

language and in the object code. Thus, in RSX-11M, it is

possible to say

extern int $dsw;

. . .

printf("Directive status = %06o\n", $dsw);

which will print the current contents of the task's directive

status word.

NOTE

Use of '$' in programs may not be transportable to

other C compilers.

Be careful about using global 'equates' in C. These

are NOT address labels. For example, if a program

declares "extern int is_suc;", where IS.SUC is

externally equated to 1, and then uses is_suc in an

expression, you will get the contents of location 1

(and probably an odd address trap!). It is possible

(but unbeautiful) to get around this by prefixing the

use of the equated symbol with the '&' operator, since

it means 'take this literally, not what it points to'.

Consider defining the symbols in a C header file

instead.

3.4 Virtual Addresses in C

When interacting with executives and MACRO-11 programs at the

low level made possible by C, it is likely that virtual

addresses (i.e., mapped memory addresses) will be manipulated

and used as pointers. This is particularly true when using the

RSX-11M interface library memory management functions. Also,

the C storage allocator functions return virtual addresses, not

C pointers. It is important to make this distinction, owing to

C's powerful address arithmetic capabilities. This is discussed

in The C Programming Language by Kernighan and Ritchie, sections

5.4 and 5.6.

While virtual addresses are represented internally as unsigned

integers, it would be wise for the programmer to adopt the

convention of defining them as character pointers. To make

things crystal clear, one might

#define ADDR char *

making ADDR synonymous with 'character pointer'.
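A short sketch of the convention in use follows. The function names addr_add and addr_diff are invented for the example, and the code uses prototypes so that a present-day compiler accepts it. Because ADDR is a character pointer, all arithmetic on it is in bytes.

```c
#define ADDR char *

/*
 * Advance a virtual address by a byte count.
 */
ADDR
addr_add(ADDR base, unsigned offset)
{
    return base + offset;
}

/*
 * Distance in bytes between two addresses in the same region.
 */
unsigned
addr_diff(ADDR high, ADDR low)
{
    return (unsigned)(high - low);
}
```

One caution with the #define form: a declaration such as "ADDR a, b;" expands to "char * a, b;", which makes b a plain char. Declaring one variable per line avoids the problem.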

3.5 Profiler

When a program is compiled with the 'p' option, the standard

function entrance sequence is replaced by a "JSR R5,PCSV$".

Immediately following the call is a pointer to a counter word

followed by the name of the function as a null terminated

string. The 'PCSV$' routine increments the counter word on every

call:

        .psect  .prog.
entry:  jsr     r5,pcsv$
        .word   prof
        .psect  .prof.
prof:   .word   0               ; Incremented at each call
        .asciz  /entry/         ; Function name
        .even
        .psect  .prog.
        ...

The printing of the profile is arranged by having 'PCSV$' stuff

a global cell '$$PROF' with a pointer to the profile print

routine. This routine (called automagically on exit) scans

through memory looking for "JSR R5,PCSV$" instructions, and

printing the statistics to the file 'profil.out' via 'fprintf'.
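The bookkeeping described above can be modelled in C. This is a simplified sketch, not the library's implementation: the real mechanism is in assembly language and finds its tables by scanning for the JSR instructions, whereas the model below simply names each table. All identifiers (struct prof, prof_enter, prof_report, alpha, beta) are invented for the example.

```c
#include <stdio.h>

/* One profile entry per function: a call counter and the name. */
struct prof {
    int count;
    char *name;
};

struct prof prof_alpha = { 0, "alpha" };
struct prof prof_beta = { 0, "beta" };

/* Stand-in for PCSV$: bump the counter of the function entered. */
void
prof_enter(struct prof *p)
{
    p->count++;
}

void
alpha(void)
{
    prof_enter(&prof_alpha);
}

void
beta(void)
{
    prof_enter(&prof_beta);
    alpha();
}

/* Write one "name count" line per function to the given file. */
void
prof_report(FILE *fp)
{
    fprintf(fp, "%s %d\n", prof_alpha.name, prof_alpha.count);
    fprintf(fp, "%s %d\n", prof_beta.name, prof_beta.count);
}
```

In the model, a program would call prof_report once just before exiting; in the real library the equivalent routine is reached through the '$$PROF' cell without any action by the program.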

Compiling a program with profiling has several additional

advantages:

o If the program fails because of an unexpected trap to

the operating system (and the profile collection code

was executed at least once), a register dump will be

printed on the command terminal and the program will

exit by calling error().

o If the function's execution would cause the stack

pointer to go below 600 octal, the program will be

aborted after printing an error message.

o It is possible to obtain a dynamic trace of the flow of

a program by assigning the file descriptor of an open

file to global variable '$$flow'. For example:

#include <stdio.h>
extern FILE *$$flow;

main ()
{
    $$flow = fopen("trace.out", "w");
    process();
}

Note that the program may execute

$$flow = stdout;

to write the trace to the command terminal. To turn off

tracing, close $$flow and set $$flow = NULL.

o The caller() function may be used to obtain the name of

a routine's caller:

main ()
{
    subr();
}

subr ()
{
    printf("%s\n", caller());
}

When subr() is executed, it will print "main".

o The calltr() function may be used to print a trace of

calls from main() to the function that called calltr():

main ()
{
    subr();
}

subr ()
{
    calltr(stdout);
}

When subr() is executed, it will print:

[ main subr ]

on the standard output file. If some routine in the

call trace was not compiled with profiling, the octal

address of the routine's entry point is printed. If the

routine gets confused (perhaps because the program is

exiting due to a trap), it prints "<bug at nnnnn>".

o If the program exits by calling error() and the profile

collection code was executed at least once, a call trace

will be printed on the command terminal.

3.5.1 Example

A function max(a, b), which returns the maximum value of its two

integer arguments may be written as follows:

max(arga, argb)
int arga;
int argb;
{
    return((arga > argb) ? arga : argb);
}

After compilation, the following .S code will be generated:

        max:    jsr     r5,csv$
                cmp     2(sp),4(sp)
                blt     .0
                mov     2(sp),r0
                br      .1
        .0:     mov     4(sp),r0
        .1:     jmp     cret$

CHAPTER 4

INCOMPATIBILITIES AND RESTRICTIONS

The language accepted by the compiler is the language described

in the Unix Seventh Edition documentation (and Kernighan and

Ritchie) with several exceptions. The file 'C:CBUGS.DOC'

contains a current list of bugs. These should be regarded as

restrictions -- anything that was easy to fix has been fixed.

4.1 Restrictions

o The AS assembler recognizes several pre-defined

variables. Consequently, the following names may not be

used as identifiers by a C program: r0, r1, r2, r3, r4,

r5, sp, and pc.

o Initialization of automatic and local static variables

is not supported.

o Enumerations are not supported.

o Bit fields do not work -- attempting to use bit fields

will cause the compiler to abort with a "missing code

table entry" error.

o Symbols defined as global may not be redefined as local

to a function.

o Variables may only be declared at function entrance.

The latest C language specification allows variable

declaration at any block entrance.

o Only FPU (11/45, 11/70) floating point is supported.

There is no code to support the FIS (11/40, 11/03)

hardware, nor is there code present to emulate floating

point.

o The compiler does not support 'old-style' assigned

binary operators. These will generally result in syntax

errors. One exception (which started the whole mess) is

"foo =- 6". This will be accepted by the compiler.

Unfortunately, it will generate "foo = (-6)" when the

program probably wanted "foo = foo - 6". You have been

warned.

o In order to ease conversion to VAX-11 C, variable

initialization must be written "int foo = 123;". The

'old-style' "int foo 123;" compiles correctly, but will

generate an annoying warning message.

o The include statement has two modes:

        #include "filename"     Includes the fully-qualified
                                file.

        #include <filename>     Includes the library file,
                                equivalent to:
                                #include "C:filename"
                                (LB:[1,1] on RSX)

o Macros with arguments (#define statements with arguments)

are not supported.

o As noted in the library documentation, the following

built-in function may be overridden by the C-program:

wrapup() Called when the program exits.
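
Most of the restrictions above can be worked around by mechanical
rewriting. The following is a minimal sketch of such rewrites, not
code taken from the compiler distribution. It is written in
present-day standard C so that it can be checked with a current
compiler; DECUS C itself would need the older parameter-declaration
style used in the max() example of section 3.5.1. All names
(COLOR_RED, FLAG_MASK, color_name(), get_flag(), set_flag(),
sum_to(), old_style_minus(), new_style_minus()) are hypothetical.

```c
#include <assert.h>

/* No enumerations: use parameterless #define constants instead. */
#define COLOR_RED   0
#define COLOR_GREEN 1
#define COLOR_BLUE  2

const char *color_name(int color)
{
    switch (color) {
    case COLOR_RED:   return "red";
    case COLOR_GREEN: return "green";
    case COLOR_BLUE:  return "blue";
    default:          return "unknown";
    }
}

/* No bit fields: keep a 3-bit field in an int with explicit masks.
 * The field position is an arbitrary choice for this sketch. */
#define FLAG_SHIFT 3
#define FLAG_MASK  (07 << FLAG_SHIFT)

int get_flag(int word)
{
    return (word & FLAG_MASK) >> FLAG_SHIFT;
}

int set_flag(int word, int value)
{
    assert(value >= 0 && value <= 07);   /* must fit in 3 bits */
    return (word & ~FLAG_MASK) | (value << FLAG_SHIFT);
}

/* No initialized automatics, and declarations only at function
 * entrance: declare first, then assign. */
int sum_to(int n)
{
    int i;
    int total;

    total = 0;
    for (i = 1; i <= n; i++)
        total += i;
    return total;
}

/* The old-style operator: this assigns (-6), it does not subtract. */
int old_style_minus(int foo)
{
    foo =- 6;
    return foo;
}

/* The current spelling, which subtracts 6 from foo. */
int new_style_minus(int foo)
{
    foo -= 6;
    return foo;
}
```

With foo equal to 10, old_style_minus() returns -6 and
new_style_minus() returns 4, which is the difference the warning
about "foo =- 6" refers to.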

4.2 Incompatibilities

There are several incompatibilities between the current DECUS

compiler and earlier versions which had been distributed by

various DECUS special-interest groups. The known

incompatibilities (and their implications) are:

o The RSX compiler's subroutine calling sequence has been

changed to match the RT-11 compiler's (and Unix's).

This means that all user-written assembly-language code

must be modified. The calling sequence appears to be

compatible with the Unix and Whitesmiths compilers,

although library names are different. Also, the

Whitesmiths compiler has several optimizations in its

subroutine calling sequence that are not present in this

compiler.

o The underscore character now generates a RAD50 dot,

instead of a dollar-sign. The compiler allows

dollar-signs in local and global variables. Thus, C

programs can now access all PDP-11 global symbols.

Because of the change of the meaning of underscore, all

user-written assembly-language code must be modified.

o I/O library conventions now generally follow the Unix V7

definitions. This has several implications, and all

C-language I/O calls should be examined. The major

problems are described below.

o fopen("filename", "openmode") follows the RSX-library

and Unix V7. This is incompatible with Unix V6 and the

old RT-11 library call.

o fgets(buffer, sizeof buffer, fd) requires the buffer size

as its second parameter, and does not remove the trailing

newline. This follows Unix V7 I/O conventions.

fgetss() is a new function, identical to fgets() except

that it removes the trailing newline. fgetss() is

compatible with the fgets() function in previous

versions of the Decus compiler.

o fputs(buffer, fd) does not append a newline to the

record. This follows Unix V7 I/O conventions.

fputss() is a new function, identical to fputs() except

that it appends a trailing newline.

o The "execute non-local goto" functions have been

renamed. Unix V6 reset() and setexit() (Unix V7

longjmp() and setjmp()) are called reset() and unwind()

in this release. Two new functions, envsave() and

envreset() are also present for this purpose.

o The ctime() function (returns the time of day in ASCII) does

not return a trailing newline. To get the time of day,

the program may execute ctime(0).
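
The newline conventions described above for fgets(), fgetss(),
fputs() and fputss() can be illustrated with a short sketch. It
assumes only the Unix V7-style behavior of fgets() and fputs() and
is written in present-day standard C. The helpers strip_newline(),
read_line_stripped() and write_line() are hypothetical stand-ins
that mimic the described behavior; they are not the library's own
fgetss() and fputss().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Remove one trailing newline, if present. Returns s. */
char *strip_newline(char *s)
{
    size_t len;

    assert(s != NULL);
    len = strlen(s);
    if (len > 0 && s[len - 1] == '\n')
        s[len - 1] = '\0';
    return s;
}

/* Read a line as fgets() does, then drop the trailing newline
 * (the behavior the manual describes for fgetss()). */
char *read_line_stripped(char *buffer, int size, FILE *fp)
{
    if (fgets(buffer, size, fp) == NULL)
        return NULL;
    return strip_newline(buffer);
}

/* Write a string as fputs() does, then append a newline
 * (the behavior the manual describes for fputss()).
 * Returns 0 on success, EOF on error. */
int write_line(const char *buffer, FILE *fp)
{
    if (fputs(buffer, fp) == EOF)
        return EOF;
    return (putc('\n', fp) == EOF) ? EOF : 0;
}
```

A program converted from an earlier release that relied on fgets()
removing the newline could call a helper of this kind (or fgetss()
itself) wherever it previously called fgets().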

4.2.1 Conversion from Unix

It is expected that many programs will be converted to the Decus

compiler from Unix. While trivial programs will require no work

whatsoever, most programs will require hand editing. Note the

following:

o Floating point requires floating-point hardware. Many

floating-point variables can be recoded as long integers

(large counters, for example). Anything else cannot be

converted at all.

o Unix V6 assigned binary operators must be converted to

the new format. Most of these will be caught by the

syntax analyser. Note, however, that "foo =- 6" will

parse, generating incorrect code.

o The Decus compiler has a 500-word expression stack.

This means that many complex expressions (especially

those with embedded conditional expressions) will cause

the compilation to abort. Such expressions must be

rewritten.

o The Decus compiler lacks macros with arguments. Many of

these can be rewritten as function calls. If the

program intentionally makes use of the fact that macros

are expanded in-line, hand-editing will be needed. Note

also that only one level of indirect (#include) file is

supported.

o The previous release of the compiler treated nested

comments as follows:

        /* begin comment
           /* nested comment */
           more comment
           comment ends here: */

The current version is compatible with other C

compilers:

        /* begin comment...
           /* this generates a warning
           comment ends here: */

o Unix V6 I/O is not supported. Thus, any program using

read(), write(), open(), or creat() will require

extensive modification. Note also that fopen() operates

quite differently in the standard I/O package than it

did in Unix V6. This requires rethinking but is fairly

straightforward. Also, note that only a limited file

random-access capability is present.

o Very large programs (which depend on Unix's ability to

generate programs with separate instruction and data

space) must be redone using the linker (task-builder)

overlay capability. This is a non-trivial task.

o Programs that use large amounts of local storage

(allocated on function entrance) must be linked with

enough stack space. When testing a program, it is

highly recommended that the program be compiled with

profiling as this enables a stack overflow check on

function entrance. Note that, on Unix, the runtime

stack and free storage (allocated by malloc()) compete

for the same memory, relieving the programmer of the

need to specify the maximum stack size. Unix programs

that exploit this fact may prove hard to convert to the

Decus compiler.
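
Two of the conversions listed above, replacing a macro with
arguments by a function and breaking up a dense expression, can be
sketched as follows. This is an illustration under stated
assumptions, not code from any Unix program: it is written in
present-day standard C, the names square(), clamp_dense() and
clamp_split() are hypothetical, and an expression this small would
not by itself exhaust the 500-word expression stack.

```c
#include <assert.h>

/* Unix original (macros with arguments are not accepted by the
 * Decus compiler):
 *     #define SQUARE(x) ((x) * (x))
 * Rewritten as an ordinary function. Unlike the macro, the
 * function evaluates its argument exactly once. */
int square(int x)
{
    return x * x;
}

/* A single expression with an embedded conditional. */
int clamp_dense(int x, int lo, int hi)
{
    return (x < lo) ? lo : ((x > hi) ? hi : x);
}

/* The same computation split into simple statements, using a
 * temporary declared at function entrance. */
int clamp_split(int x, int lo, int hi)
{
    int result;

    assert(lo <= hi);   /* precondition assumed by this sketch */
    result = x;
    if (result < lo)
        result = lo;
    else if (result > hi)
        result = hi;
    return result;
}
```

Code that depended on in-line expansion, for example by passing an
argument with side effects to the macro, will behave differently
once the macro becomes a function, so each call site should be
checked by hand.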

In general, the programmer should be alert to the minor

incompatibilities that do exist.

APPENDIX A

FILE CBUGS.DOC (17-Sep-80)

The following is a reproduction of the CBUGS.DOC file

distributed with the DECUS C system.
