DECUS C LANGUAGE SYSTEM DECUS C Compiler Reference Manual

DECUS C LANGUAGE SYSTEM

DECUS C Compiler

Reference Manual

by

David G. Conroy

Edited by

Martin Minow, John D. Morton and Robert B. Denny

This document describes the CC compiler itself (including

implementation quirks and known bugs), along with procedures for

compiling and executing programs under a wide variety of Digital

operating systems.

DECUS Structured Languages SIG

Version of 17-Sep-80

Copyright (C) 1980, DECUS

General permission to copy or modify, but not for

profit, is hereby granted, provided that the above

copyright notice is included and reference made to the

fact that reproduction privileges were granted by

DECUS.

The information in this document is subject to change

without notice and should not be construed as a

commitment by Digital Equipment Corporation or by

DECUS.

Neither Digital Equipment Corporation, DECUS, nor the

authors assume any responsibility for the use or

reliability of this document or the described

software.

This software is made available without any support

whatsoever. The person responsible for an

implementation of this system should expect to have to

understand and modify the source code if any problems

are encountered in implementing or maintaining the

compiler or its run-time library. The DECUS

'Structured Languages Special Interest Group' is the

primary focus for communication among users of this

software.

UNIX is a trademark of Bell Telephone Laboratories. RSX,

RSTS/E, RT11 and VMS are trademarks of Digital Equipment

Corporation.

CHAPTER 1

INTRODUCTION

CC is a multipass C compiler for the PDP-11 that runs under the

RSX-11, VMS (compatibility mode), RSTS/E, and/or RT11 operating

systems. Except for the restrictions noted in a later section,

it compiles programs as per the description of C in the Unix

Seventh Edition documentation or the book The C Programming

Language by Brian Kernighan and Dennis Ritchie (Englewood

Cliffs, NJ: Prentice-Hall, ISBN 0-13-110163-3).

In general, the code produced by this compiler is quite well

optimized for the PDP-11. Quality of the generated code is,

however, dependent on the programmer's understanding of both the

language and the target machine (the PDP-11). In particular,

proper use of register variables and the prefix '--' and postfix

'++' operators with pointers can result in surprising reductions

in size and increases in speed. Experience is the best teacher.
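
As a minimal sketch of the idiom mentioned above, the following
function copies a block of bytes through register pointers using
postfix '++'. On the PDP-11 this pattern corresponds closely to
autoincrement addressing. The name copy_bytes is hypothetical, and
the sketch is written in present-day C with prototypes; it would
need the older declaration style shown later in this manual before
CC would accept it.

```c
/* Hypothetical helper: copy n bytes from src to dst, return dst. */
char *copy_bytes(char *dst, const char *src, unsigned n)
{
    register char *d;           /* register variable candidates */
    register const char *s;

    d = dst;
    s = src;
    while (n-- != 0)
        *d++ = *s++;            /* postfix '++' on both pointers */
    return dst;
}
```

Declaring the two pointers as register variables is the part that
matters here; the same loop written with array subscripts is
likely to generate more code.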

CHAPTER 2

USING THE C COMPILER

Since the C compiler runs on so many operating systems, command

information is presented in individual sections for the various

operating system families, followed by a common section

describing usage and the switches needed to control compilation.

2.1 VMS, RSX-11, and RSTS/E RSX emulation mode

After the appropriate setup sequence (described in a later

section) has been executed, the compiler may be invoked as

follows:

XCC [-switches] file

or RUN C:XCC

CC> [type command line here]

The specified file is compiled and the resulting assembly code

is placed in a file having the same name as the source file but

with a filetype of 'S'. The default filetype for source files

is 'C'. The file will be written to the user's current default

account. On RSTS, this is the account under which the user is

logged in.

Diagnostics are written to the standard output. The diagnostic

stream may be redirected by means of the '>' or '>>'

conventions: '>filename' writes diagnostics to the named file,

while '>>filename' appends diagnostics to the named file. This

is compatible with Unix usage.

Only a single file may be compiled at one time. Wildcards are

not legal in names.

The resulting assembly language is assembled with AS as follows:

XAS -d file

The generated code should never have any assembly errors. The

'-d' switch deletes the input file ('file.s') unless an error is

detected. Note that it is not possible to RUN XAS.

Object files are built into executable images by using one of

the RSX-11M task builders. The simplest command sequences

possible on native RSX-11M are:

>FTB prog/CP=objects,[1,1]C/LB

>TKB prog/CP=objects,[1,1]C/LB

Alternatively, on VMS, RSX-11, or RSTS/E RSX, the task builder

may be invoked explicitly:

TKB>prog/CP,map=objects,[1,1]C/LB (native RSX)

TKB>//

TKB>prog,map=objects,C:C/LB (VMS, RSTS/E RSX)

TKB>//

NOTE

On native RSX-11M, the C OTS is normally kept on UIC

[1,1], and cannot be referenced as "C:C/LB". On

RSX-11M PLUS, there is a 'libuic' on which the C

library would be kept, and may not be [1,1]. On the

other systems the library may be referenced as

"C:C/LB".

If a program uses large amounts of automatically-allocated

storage, the "STACK = number" option should be specified to the

task builder. A C program may be built with the 4K FCS resident

library FCSRES using the "LIBR = FCSRES:RO" option.

2.2 RT11 or RSTS/E RT11 emulation mode

After the setup sequence described in a later section has been

executed, the compiler may be run as follows:

RUN C:CC file [/switches] (native RT-11)

CC file [/switches] (RSTS/E)

or RUN C:CC

CC> file [/switches]

or CC> file.s,file.tm1,file.tmp=file.c [/switches]

The latter case explicitly creates and saves the intermediate

code (.tm1) and expanded source (.tmp) files. Normally, these

are needed only when debugging the compiler. Note that if you

do not specify extensions for the intermediate files, they will

be given the default of '.tmp' for the expanded source and

'.tm1' for the intermediate code file.

The resulting assembly language is assembled with AS as follows:

RUN C:AS file/d (RT-11)

AS file/d (RSTS/E)

or

RUN C:AS

AS> file/d

The generated code should never have any assembly errors. The

'/d' switch deletes the input file ('file.s') unless an error is

detected.

Object modules are linked into executable images by using the

RT11 linker:

LINK/BOT:2000 prog,objects,C:(SUPORT,CLIB) (RT-11)

LINK save,map=objects,C:SUPORT,C:CLIB/B:2000 (RSTS/E)

The two library files contain the actual main program (in

SUPORT) and the RT11 run-time support library. The start

address must be at least 2000 to allow for dynamic storage by

subroutines. If the '/BOTTOM' option or the '/b' switch is

omitted, executing printf() may cause the program to abort with

an 'M-trap to 4' message.

2.3 Compilation notes

MACRO-11 may NOT be used to assemble the output of CC. CC

expects that its assembler can perform certain optimizations

(most notably branch adjustment) not performed by MACRO-11.

The title of the object file will be set to the first six

characters of the source file name. This is of interest only to

people who load overlaid programs off libraries.

The compiler writes on files 'file.TMP' and 'file.TM1'. It is,

therefore, unwise to keep important things in files with these

filetypes. The '.TMP' file contains the C source with #include

and #define statements processed. This is the input to the

compiler proper. The '.TM1' file contains the intermediate code

generated by the compiler parser. This is the input to the code

generator.

2.4 Switches

Under RSX modes, switches are given as single letters preceded

by a minus sign:

XCC -v -s test

Under RSTS/E or RT11, switches are given as single letters

preceded by a slash:

RUN C:CC test/v/s (native RT-11)

Case is not significant. The following switches are defined:

d This argument causes the compiler to execute a

breakpoint trap when entering each overlay segment. It

is used only for debugging the compiler.

e This optional argument causes in-line code to be

generated for multiply, divide, xor, and shift

operations. NOTE: the current compiler recognises this

switch, but does not generate in-line code.

f This optional argument causes in-line code to be

generated for floating-point operations. NOTE: the

current compiler recognises this switch, but does not

support floating-point. Any attempt to compile

floating-point operations will result in a fatal

compilation error.

i This optional argument causes the compiler to retain the

intermediate file (phase 1 to phase 2). This file is

normally deleted. This option is for compiler

maintenance.

l This optional argument causes internal code trees to be

written (as comments) to the .S output file. This

option is for compiler maintenance.

m This optional argument causes timings of each pass to be

printed. This option is only operative on RSX-11 modes.

It requires hardware EIS.

p This optional argument causes profiling code to be

compiled (see the section on profiling).

s This optional argument causes the compiler to retain the

expanded source file (phase 0 to phase 1). This file is

normally deleted. This option is for compiler

maintenance.

v This optional argument causes the compiler to echo the

current line of the source onto the error stream

whenever an error is detected. In most cases, this is

not the line containing the error, because the parser

usually has to read the next symbol of the source to

determine that an error exists. It will usually be

within 1 line, which should be close enough to locate

the error.

2.5 Setup of the compiler

Before using the C compiler, it must be made known to the

operating system. This differs slightly for the various

systems.

2.5.1 Setup under VMS

The following setup (or something much like it) should be added

to your LOGIN.COM file:

$ ASSIGN DBA0:[PUBLIC] C

$ XCC :== $C:CC.EXE CC

$ XAS :== $C:AS.EXE AS

These definitions enable the command sequences described above.

If your compiled C program is to make use of the

(Unix-compatible) startup sequence, you must proceed as follows:

$ XCC foo

$ XAS -d foo

$ MCR TKB foo,foo=foo,c:c/lb

Then, you must type:

$ FOOBAR :== "$DISK:[ACCOUNT]FOO.EXE <name>"

$ FOOBAR Unix-style parameters

The '$' tells the VMS command interpreter that a command is

being defined. Note that a dummy parameter must be specified.

This will become the 'task name' (argv[0]) when the program

starts.

2.5.2 Setup under RSTS/E RSX emulation mode

Under RSTS/E, the system manager must define the XCC and XAS CCL

commands and the C: system-wide logical in a start control file

such as the following (the account may be chosen to meet the

system manager's needs):

RUN $UTILTY

? ADD LOGICAL SY:[5,2]C

? CCL XAS-=C:AS.TSK;0

? CCL XCC-=C:CC.TSK;0

? CCL MCR-=C:MCR.*;30000

? EXIT

2.5.3 Setup under RSX-11M

As it is assembled, the CC compiler looks for #include files of

the form '<file.h>' on logical device 'C:'. This will not work

on RSX-11M, so the distributed compiler build file does a

'GBLPAT' to the location labeled 'SYSINC' to change it to

"LB:[1,1]". On an RSX-11M PLUS system, you should change this

to your 'libuic' if necessary by editing MMAKCC.CMD.

Install CC and AS as MCR external commands '...XCC' and

'...XAS', respectively. The CC compiler MUST be installed

checkpointable in a mapped system to allow for task extension.

If you have an unmapped system, or do not have the 'extend task'

directive in your executive, install CC with an 'INC=20000' at

least, more if you get compiler aborts.

2.5.4 Setup under RT11 and RSTS/E RT11 mode

Under RT11, setup consists of simply ASSIGNing a physical device

to the logical device "C:". The compiler and assembler .SAV

files, the SUPORT.OBJ module, and the library CLIB.OBJ should be

placed on device 'C:'. You can make the assignment of device

'C:' as part of the startup command file, e.g.:

.ASSIGN RK0: C:

This compiler has been built and used under RT-11 V3B and V4.

It has run on a PDP-11/34, a PDP-11/05 and on PDT150 systems.

Under RSTS/E, the system manager must execute a startup control

file such as the following:

RUN $UTILTY

? ADD LOGICAL SY:[5,2]C

? CCL AS-=C:AS.SAV;8192

? CCL CC-=C:CC.SAV;8220

? CCL MCR-=C:MCR.*;30000

? EXIT

2.6 Invoking compiled C programs

When your program begins to execute and the startup module sees

that a command has been typed, a Unix C setup sequence is

emulated, including I/O redirection and command argument

processing. The startup module does not expand wild-card

filenames, however.

NOTE

On RSX-11M, this feature cannot be used unless your

program is installed as an MCR external command, i.e.

with a task name of '...xxx', and activated by typing

the "xxx". This requires that you be a privileged

user.

On RT-11, if no command line has been passed, the module prompts

"Argv: " and accepts a single line which is then parsed into

command arguments. This can be disabled by defining the $$narg

global symbol as described in the library documentation.

NOTE

On native RT-11, a command line passed via

"RUN prog ..." which has more than one 'token' or

'word' in it gets parsed by the RT-11 monitor before

it ever gets to the C program. See the documentation

in the RT-11 manual on the 'RUN' command. It causes

an "=" sign to get inserted, and the order of

arguments is shuffled.

To get around this, either use the "RUN prog" and answer the

"Argv: " prompt with the command line, or enclose the command

line in some delimiter plus a space, e.g.:

RUN C:prog [ command line ]

which tacks an extra token on to the command line that looks

like "]=[" for the case above. There is no problem on command

lines which have one token.

If you include an argument of the form '>file', standard output

will be written to the indicated file. If you include an

argument of the form '>>file', standard output will be appended

to the file (creating it if necessary). Append does not work on

RT11-modes. If you include an argument of the form '<file', the

indicated file will be used for standard input.
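
The three argument forms can be told apart by their leading
characters. The following is a minimal sketch of such a test, not
the startup module's actual code; the function classify_arg and
the REDIR_ constants are hypothetical names introduced only for
illustration, and the sketch uses present-day C prototypes.

```c
#define REDIR_NONE   0      /* ordinary argument          */
#define REDIR_INPUT  1      /* '<file'  : standard input  */
#define REDIR_OUTPUT 2      /* '>file'  : standard output */
#define REDIR_APPEND 3      /* '>>file' : append output   */

/*
 * Hypothetical helper: classify one command argument.
 * On return, *name points at the file name part, or at the
 * whole argument when it is not a redirection.
 */
int classify_arg(const char *arg, const char **name)
{
    if (arg[0] == '<') {
        *name = arg + 1;
        return REDIR_INPUT;
    }
    if (arg[0] == '>' && arg[1] == '>') {   /* test '>>' before '>' */
        *name = arg + 2;
        return REDIR_APPEND;
    }
    if (arg[0] == '>') {
        *name = arg + 1;
        return REDIR_OUTPUT;
    }
    *name = arg;
    return REDIR_NONE;
}
```

Testing for '>>' before '>' matters, since every append argument
also begins with a single '>'.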

When the C program is started, it will be entered with two

parameters.

argc This is the count of the number of arguments.

It will be at least 1.

argv This is an array of string pointers containing

the individual arguments. The first parameter,

argv[0], will be the name assigned to the

program, where appropriate:

o On RSX, this will be the name by which

the task was installed (TTnn if the

'RUN' command was used with no task

name).

o On VMS compatibility mode, this will be

the <name> parameter in the command

definition as shown above.

o On RSTS/E, this will be the CCL name or

the program name as passed to the MCR

program.

o On RT11 (or on RSTS, by default, if no

name can be found), this will be the

string 'Argv: '.

For example:

    /*
     * Echo arguments
     */
    main(argc, argv)
    int argc;
    char *argv[];
    {
        register int i;

        printf("Program \"%s\" has %d parameters\n",
            argv[0], argc);
        for (i = 1; i < argc; i++)
            printf("Argument %d = \"%s\"\n",
                i, argv[i]);
    }

The above program is executed as follows on VMS:

$ ECHO abc "def ghi"

Program "ECHO" has 3 parameters

Argument 1 = "ABC"

Argument 2 = "def ghi"

Notice that unquoted arguments are converted to upper case by

the operating system.
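
Because unquoted arguments may arrive in upper case, a program
that compares arguments against fixed keywords may prefer a
comparison that ignores case. The following is one possible
sketch; the name same_ignoring_case is hypothetical, and the code
is written in present-day C rather than the declaration style CC
expects.

```c
#include <ctype.h>

/* Hypothetical helper: return 1 if a and b match ignoring case. */
int same_ignoring_case(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (toupper((unsigned char)*a) != toupper((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;    /* equal only if both strings end together */
}
```

With such a helper, the arguments "abc" and "ABC" are treated
alike whether or not the user quoted them.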

Under RSTS/E, a C program may be installed as a CCL command or

the program may be started using the MCR CCL command which

emulates a CCL invocation for C programs.

2.7 Predefined symbols

Before reading the program source file, the C compiler defines

several symbols (which may then be tested with '#ifdef'

statements):

decus This is the Decus compiler.

nofpu This version does not support floating-point.

nomacarg This version does not allow macros with

arguments.

pdp11 Generate code for the PDP-11.

rsx The RSX compiler (or)

rt11 The RT11 compiler
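
A program intended to compile under more than one compiler might
test these symbols as in the sketch below. HOST_NAME and
host_name are hypothetical names chosen for illustration, and
whether the preprocessor accepts nested '#ifdef' exactly as
written here is an assumption that should be checked against the
compiler in use.

```c
/*
 * Select a description of the host environment at compile time.
 * Only symbols from the list above are tested.
 */
#ifdef decus
#ifdef rsx
#define HOST_NAME "DECUS C, RSX"
#else
#define HOST_NAME "DECUS C, RT11"
#endif
#else
#define HOST_NAME "some other C compiler"
#endif

/* Hypothetical accessor so the choice can be printed or tested. */
const char *host_name(void)
{
    return HOST_NAME;
}
```

Under a compiler that defines none of the symbols, the last
branch is taken.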

2.8 Profiling

The profiler permits the accumulation of function call

statistics during the execution of a program.

If any of the files comprising a program were compiled with the

profile option (and at least one of them has been called) then a

call profile, listing the function name and the number of calls,

will be written to file 'profil.out' when the program

terminates.

Also, if the program terminates because of a fatal error (such

as an illegal memory reference), a register dump and call trace

will be printed on the command terminal.

The run-time library contains several functions that can be

called to dynamically print flow trace information. For more

information, consult the C Runtime Library manual.

2.9 Diagnostics

There are two general classes of diagnostics: those that relate

to compiler conditions, and those that relate to errors in the

user's program.

The only compiler condition messages the user should normally see

are those of the form "Cannot open .... file". These mean

exactly what they say.

Other compiler condition messages are "Abort in phase x", "Abort

loading phase x" and "Trap type x", where "x" is replaced by

some small constant. These are most likely attempts to use

floating-point operations. If not, you are the proud owner of a

compiler bug. Report your find to a guru. Remember the

register dump and save your source file and both temporary

files. They are important.

If you blunder into a missing code table the compiler aborts

with an error message.

Errors in the user's programs are reported in English, tagged by

the line number (which may be off by 1). Because of the nature

of the language, errors sometimes snowball. If you are greeted

by thousands of error messages, try fixing up the first few.

You may be pleasantly surprised.

The following are common sources of 'thousands of errors':

o If there is a missing right brace within a function, all

succeeding functions will miscompile. The error message

will include a tag of the form "within function xxxxx",

where "xxxxx" is the function with the missing brace.

o If there is a missing right parenthesis in an if or

while statement which is followed by a left brace, the

syntax analyser will 'lose' the brace, causing many

messages:

if ((foo = fopen("abc.def", "w") == NULL) {

...

o In general, if the error message is "illegal

expression", that is (probably) the current line. If

the message is "illegal statement", you should look at

the previous statement.
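
For comparison with the faulty 'if' statement in the list above,
the following sketch shows the balanced form, with the assignment
wrapped in its own parentheses before the comparison. The helper
name open_checked is hypothetical, and the sketch is written in
present-day C with prototypes.

```c
#include <stdio.h>

/*
 * Hypothetical helper: open a file, returning NULL on failure.
 * The inner parentheses close the assignment, so the comparison
 * with NULL applies to the pointer that was just assigned.
 */
FILE *open_checked(const char *name, const char *mode)
{
    FILE *fp;

    if ((fp = fopen(name, mode)) == NULL) {
        fprintf(stderr, "Cannot open %s\n", name);
        return NULL;
    }
    return fp;
}
```

Counting the parentheses on such lines is a quick first check
when a single statement produces a long run of messages.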

CHAPTER 3

RUNTIME ENVIRONMENT

This description of the C runtime environment is sketchy. The

best reference is compiler generated code, and any question

regarding 'how does it ....' can usually be answered by

compiling a suitably contrived program.

3.1 Program Sections

The C compiler uses 5 program sections. The '.PROG.' p-section

is used for all code. The '.DATA.' p-section is used for all

static data. The '.STRN.' p-section is used for the bodies of

all literal strings. The '.PROF.' p-section is used to hold the

names of functions for the profiler. The '.MWCN.' p-section is

used to hold multi-word (long and floating-point) constants.

All code is 'pure'. However, the assembler is not able to

generate all the varieties of .PSECTs. Thus, everything is

read-write. This should be changed. Also, the compiler does

not write a symbol table as such, making debugging a chore.

3.2 Register Usage

R5 is used as an environment frame pointer. It points to the

highest address of the stack frame of the current function. In

MACRO-11 programs, symbols C$PMTR and C$AUTO may be used to

refer to the first parameter and first automatic variable,

respectively. Thus, when writing a MACRO subroutine, you should

write:

MOV C$PMTR+<parameter_number * 2>(R5), Dst

to access parameters (the first parameter_number is 0). (This

cannot be done when using the AS assembler.)

To access automatic variables, you should write:

MOV C$AUTO-<variable_number * 2>(R5), Dst

where the first variable_number is numbered 1. (This cannot be

done when using the AS assembler.)

Registers R2, R3 and R4 are used as register variables. The

first register variable to be declared goes in R4, the second in

R3 and the third in R2. Any register not used as a register

variable can be used as a temporary.

Registers R0 and R1 are always scratch registers.

3.2.1 Calling Sequence

The first instructions in a C function are a 'JSR R5,CSV$' and a

subtract to claim stack space. The 'CSV$' routine points R5 at

the new stack frame and pushes registers R4, R3, and R2 onto the

stack. (Note that the character '$' in the CC/MACRO environment

is represented by '~' in the AS environment.)

R0, R1 and the floating point registers are NOT saved. This

means that if a C function is called asynchronously (i.e. from

an AST routine) the caller must arrange to save these registers

or be prepared to face the music.

Functions return via a 'JMP CRET$'. The return value is in R0

(for ints, chars and pointers), R0-R1 (for longs, high part in

R0) or AC0 (floats and doubles).

The caller passes control to a function by first pushing the

arguments (from right to left) onto the stack, calling the

function via a 'JSR PC,FUNCTION', and popping the arguments off

of the stack when the function returns.

All arguments are passed as ints, longs (push low part, then

push high part) or doubles. Characters are passed as integers;

floats are passed as doubles.
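
The division of a long into a high part and a low part can be
pictured with the following sketch, which models the R0-R1 return
pair as a structure. It illustrates the convention only and says
nothing about how the code generator is written; struct reg_pair
and split_long are hypothetical names, and the sketch uses
present-day C.

```c
/* Hypothetical model of a long returned in the R0-R1 pair. */
struct reg_pair {
    unsigned int r0;    /* high-order 16 bits */
    unsigned int r1;    /* low-order 16 bits  */
};

struct reg_pair split_long(unsigned long value)
{
    struct reg_pair regs;

    regs.r0 = (unsigned int)((value >> 16) & 0xFFFFUL);
    regs.r1 = (unsigned int)(value & 0xFFFFUL);
    return regs;
}
```

The same two halves are what a caller pushes when passing a long
argument: the low part first, then the high part.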

3.3 Global Symbols containing RAD50 '$' and '.'

With this version of C, it is possible to generate and access

global symbols which contain the Radix-50 '.' and '$'. The

compiler allows identifiers to contain the Ascii '$', which

becomes a Radix-50 '$' in the object code. The AS assembly code

shows this character as a tilde (~). The underscore character

(_) in a C program becomes a '.' in both the AS assembly language

and in the object code. Thus, in RSX-11M, it is possible to say

    extern int $dsw;
    . . .
    printf("Directive status = %06o\n", $dsw);

which will print the current contents of the task's directive

status word.

NOTE

Be careful about using global 'equates' in C. These

are NOT address labels. For example, if you declare

"extern int is_suc;", where IS.SUC is externally

equated to 1, and then use is_suc in an expression,

you will get the contents of location 1 (and probably

an odd address trap!). It is possible (but very

tacky) to get around this by prefixing the use of the

equated symbol with the '&' operator, since it means

'take this literally, not what it points to'.

Consider #define'ing the symbols in a C header file

instead.
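
A minimal sketch of the header-file approach follows. The value 1
for IS_SUC is taken from the example in the note above; the
function directive_ok is a hypothetical name used only to show
the constant in an expression, and the sketch is written in
present-day C.

```c
/*
 * Hypothetical header fragment: define the equated value as a
 * constant instead of declaring it "extern int".
 */
#define IS_SUC 1

/* Return 1 if a directive status word reports success. */
int directive_ok(int dsw)
{
    return dsw == IS_SUC;   /* compares with the value 1 itself */
}
```

Since the constant is substituted by the preprocessor, no memory
reference to location 1 is generated.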

3.4 Virtual Addresses in C

When interacting with executives and MACRO-11 programs at the

low level made possible by C, it is likely that virtual

addresses will be manipulated and used as pointers. This is

particularly true when using the memory management functions.

Also, the C storage allocator functions return virtual

addresses, not C pointers. It is important to make this

distinction, owing to C's powerful address arithmetic

capabilities. See The C Programming Language by Kernighan and

Ritchie, sections 5.4 and 5.6. It is a kluge to define a

virtual address as an integer. Since virtual addresses on the

PDP-11 are 'pointers to bytes (or characters)' it is wise to

adopt the convention of defining them as character pointers. To

make things crystal clear, one might declare "typedef char

*ADDR;", making ADDR synonymous with 'character pointer'.
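
The following sketch shows why the character-pointer convention
is convenient: arithmetic on an ADDR counts bytes, which matches
the way virtual addresses are counted on the PDP-11. The helper
addr_advance is a hypothetical name, and the sketch is written in
present-day C.

```c
typedef char *ADDR;     /* a virtual address is a pointer to bytes */

/* Hypothetical helper: advance a virtual address by nbytes bytes. */
ADDR addr_advance(ADDR base, unsigned nbytes)
{
    return base + nbytes;   /* char pointer arithmetic counts bytes */
}
```

Had the address been held in an int pointer, adding 6 would have
advanced it by six words rather than six bytes.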

3.5 Profiler

When a program is compiled with the 'p' option, the standard

save is replaced by a "JSR R5,PCSV$". Immediately following the

call is a pointer to a zero word (for the counter) followed by

the name of the function (in the '.PROF.' psection as a null

terminated string). The 'PCSV$' routine increments the zero

word on every call:

            .psect  .prog.
    entry:  jsr     r5,pcsv$
            .word   prof
            .psect  .prof.
    prof:   .word   0           ; Incremented at each call
            .asciz  /entry/     ; Function name
            .even
            .psect  .prog.
            ...

The printing of the profile is arranged by having 'PCSV$' stuff

a global cell '$$PROF' with a pointer to the profile print

routine. This routine (called automagically on exit) scans

through core looking for "JSR R5,PCSV$" instructions and

prints the statistics to the file 'profil.out' via 'fprintf'.

The trace module has several other attributes:

o If the program fails because of an unexpected trap to

the operating system (and the profile collection code

was executed at least once), a register dump will be

printed on the command terminal and the program will

exit by calling error().

o If the function's execution would cause the stack

pointer to go below 600 octal, the program will be

aborted after printing an error message.

o It is possible to obtain a dynamic trace of the flow of

a program by assigning the file descriptor of an open

file to global variable '$$flow'. For example:

    #include <stdio.h>
    extern FILE *$$flow;

    main ()
    {
        $$flow = fopen("trace.out", "w");
        process();
    }

Note that the program may execute

$$flow = stdout;

to write the trace to the command terminal. To turn off

tracing, close $$flow and set $$flow = NULL.

o The caller() function may be used to obtain the name of

a routine's caller:

    main ()
    {
        subr();
    }

    subr ()
    {
        printf("%s\n", caller());
    }

When subr() is executed, it will print "main".

o The calltr() function may be used to print a trace of

calls from main() to the function that called calltr():

    main ()
    {
        subr();
    }

    subr ()
    {
        calltr(stdout);
    }

When subr() is executed, it will print:

[ main subr ]

on the standard output file. If some routine in the

call trace was not compiled with profiling, the octal

address of the routine's entry point is printed. If the

routine gets confused (perhaps because the program is

exiting due to a trap), it prints "<bug at nnnnn>".

o If the program exits by calling error() and the profile

collection code was executed at least once, a call trace

will be printed on the command terminal.

3.5.1 Example

A function max(a, b), which returns the maximum value of its two

integer arguments may be written as follows:

    max(arga, argb)
    int arga;
    int argb;
    {
        return((arga > argb) ? arga : argb);
    }

After compilation, the following .S code will be generated:

    max:    jsr     r5,csv$
            cmp     2(sp),4(sp)
            blt     .0
            mov     2(sp),r0
            br      .1
    .0:     mov     4(sp),r0
    .1:     jmp     cret$

CHAPTER 4

INCOMPATIBILITIES AND RESTRICTIONS

The language accepted by the compiler is the language described

in the Unix Seventh Edition documentation (and Kernighan and

Ritchie) with several exceptions. The file 'C:CBUGS.DOC'

contains a current list of bugs. These should be regarded as

restrictions -- anything that was easy to fix has been fixed.

4.1 Restrictions

o The AS assembler recognizes several pre-defined

variables. Consequently, the following may not be used

by a C program: 'r0, r1, r2, r3, r4, r5, sp, and pc'.

o Initialization of automatic and local static variables

is not supported.

o Enumerations are not supported.

o Bit fields do not work -- attempting to use bit fields

will cause the compiler to abort with a 'missing code

table entry' error.

o Symbols defined as global may not be redefined as local

to a function.

o Variables may only be declared at function entrance.

The latest C language specification allows variable

declaration at any block entrance.

o Floating point is non-existent. If you attempt to

compile a program that uses floating-point, the compiler

will abort with a suitable message.

o The compiler does not support 'old-style' assigned

binary operators. These will generally result in syntax

errors. One exception (which started the whole mess) is

"foo =- 6". This will be accepted by the compiler.

Unfortunately, it will generate "foo = (-6)" when the

program probably wanted "foo = foo - 6". You have been

warned.

o The compiler allocates storage to character variables as

if they were integers (except if the character variable

is an array or part of a structure). If single-byte

allocation is necessary, the program should proceed as

follows:

    char char_a[1];         /* Declare one-byte character */
    #define A char_a[0]     /* Name first byte of char_a[] */

o The include statement has two modes:

    #include "filename"     Includes the fully-qualified file.

    #include <filename>     Includes the library file, equivalent to:

                            #include "C:filename"   (LB:[1,1] on RSX)

o Macros (#define statement with arguments) do not exist.

o As noted in the library documentation, the following

built-in function may be overridden by the C-program:

wrapup() Called when the program exits.
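
Several of the restrictions above can be respected at once by
writing functions in the shape sketched below: every local is
declared at function entrance, nothing is initialized in its
declaration, and subtraction is written with '-='. The function
sum_then_deduct is a hypothetical example; it is written with
present-day prototypes, so the parameter declarations would need
recasting in the older style before CC would accept it.

```c
/* Hypothetical example written within the restrictions above. */
int sum_then_deduct(int *values, int count, int deduction)
{
    int total;                  /* declared at function entrance */
    int i;

    total = 0;                  /* assigned, not initialized */
    for (i = 0; i < count; i++)
        total += values[i];
    total -= deduction;         /* never "total =- deduction" */
    return total;
}
```

Writing the assignment as a separate statement costs one line and
avoids depending on automatic initialization.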

4.2 Incompatibilities

There are several incompatibilities between the current DECUS

compiler and earlier versions which had been distributed by

various DECUS special-interest groups. Those known (and the

implications) are:

o The RSX compiler's subroutine calling sequence has been

changed to match the RT11 compiler's (and Unix's). This

means that all user-written assembly-language code must

be modified. The calling sequence appears to be

compatible with the Unix and Whitesmith compilers,

although library names are different. Also, the

Whitesmith compiler has several optimizations in its

subroutine calling sequence that are not present in this

compiler.

o The underscore character now generates a RAD50 dot,

instead of a dollar-sign. The compiler allows

dollar-signs in local and global variables. Thus, C

programs can now access all PDP-11 global symbols.

Because of the change of the meaning of underscore, all

user-written assembly-language code must be modified.

o I/O library conventions now generally follow the Unix V7

definitions. There are several implications. In

general, however, all C-language I/O calls should be

examined. The major problems are described below.

o fopen("filename", "openmode") follows the RSX-library

and Unix V7. This is incompatible with Unix V6 and the

old RT11-library call.

o fgets(buffer, sizeof buffer, fd) requires the second

buffer size parameter, and does not remove the trailing

newline. This follows Unix V7 I/O conventions.

fgetss() is a new function, identical to fgets() except

that it removes the trailing newline. fgetss() is

compatible with the fgets() function in previous

versions of the Decus compiler.

o fputs(buffer, fd) does not append a newline to the

record. This follows Unix V7 I/O conventions.

fputss() is a new function, identical to fputs() except

that it appends a trailing newline.

o The "execute non-local goto" functions have been

renamed. Unix V6 reset() and setexit() (Unix V7

longjmp() and setjmp()) are called reset() and unwind()

in this release. Two new functions, envsave() and

envreset() are also present for this purpose.

o The ctime() function (return time of day in Ascii) does

not return a trailing newline. To get the time of day,

the program may execute ctime(0).
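
A program that must also compile against a library lacking
fgetss() can remove the newline itself after calling fgets(). The
following is a minimal sketch of that step and is not a
description of how fgetss() is implemented; the name
strip_newline is hypothetical, and the code is written in
present-day C.

```c
#include <string.h>

/*
 * Hypothetical helper: remove one trailing newline left by
 * fgets(). Returns the same buffer for convenience.
 */
char *strip_newline(char *line)
{
    size_t len;

    len = strlen(line);
    if (len > 0 && line[len - 1] == '\n')
        line[len - 1] = '\0';
    return line;
}
```

A line that has no trailing newline, such as the last line of a
file, is returned unchanged.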

4.2.1 Conversion from Unix

It is expected that many programs will be converted to the Decus

compiler from Unix. While trivial programs will require no work

whatsoever, most programs will require hand editing. Note the

following:

o Floating point is non-existent. Many floating-point

variables can be recoded as long integers (large

counters, for example). Anything else cannot be

converted at all.

o Unix V6 assigned binary operators must be converted to

the new format. Most of these will be caught by the

syntax analyser. Note, however, that "foo =- 6" will

parse, generating incorrect code.

o The Decus compiler has a 500 word expression stack.

This means that many complex expressions (especially

those with embedded conditional statements) will cause

the compilation to abort. This requires rewriting.

o The Decus compiler lacks macros with arguments. Many of

these can be rewritten as function calls. If the

program intentionally makes use of the fact that macros

are expanded in-line, hand-editing will be needed. Note

also that only one level of indirect (#include) file is

supported.

o Unix V6 I/O is not supported. Thus, any program using

read(), write(), open(), or creat() will require

extensive modification. Note also that fopen() operates

quite differently in the standard I/O package than it

did in Unix V6. This requires rethinking but is fairly

straight-forward. Also, note that only a limited file

random-access capability is present.

o Very large programs (which depend on Unix's ability to

generate programs with separate instruction and data

space) must be redone using the linker (task-builder)

overlay capability. Non-trivial.

o Programs that use large amounts of local storage

(allocated on function entrance) must be linked with

enough stack space. When testing a program, it is

highly recommended that the program be compiled with

profiling as this enables a stack overflow check on

function entrance.

In general, the programmer should stay alert to the minor

incompatibilities that do exist.
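As an illustration of the macro item above, a macro with arguments can often be replaced by an ordinary function. The sketch below assumes a Unix source that used a MAX macro; the function name max_int() is hypothetical, and the code uses present-day parameter declarations, which may need adapting to the dialect this compiler accepts.

```c
/* Unix original (macro with arguments, not accepted by this compiler):
 *     #define MAX(a, b) ((a) > (b) ? (a) : (b))
 * Hypothetical replacement: an ordinary function. Unlike the macro,
 * it evaluates each argument exactly once and works on ints only. */
int max_int(int a, int b)
{
    return (a > b) ? a : b;
}
```

Code that relied on the macro accepting operands of any type, or on its in-line expansion, would still need the hand editing mentioned above.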

APPENDIX A

FILE CBUGS.DOC (17-Sep-80)

The following is a reproduction of the CBUGS.DOC file

distributed with the DECUS C system, as of 22-Sep-80:

** 03-May-80

The construction

return((c < 0) -1 : 0);

(with a missing "?") aborts in phase 2. It should yield a

syntax error message.

** 07-May-80

The compiler outputs a spurious error message if you follow

a declaration by an "extern" declaration:

int foo;

main()

{ ...

}

extern int foo;

The extern declaration is flagged as a "redeclaration".

** 13-May-80

The compiler doesn't always handle typedef's correctly. Note

the following:

typedef struct foo *FOOPTR;

struct foo {

...

};

FOOPTR foofun()

...

Foofun() gives a syntax error "declaration semantically

forbidden". However,

struct foo *foofun()

works correctly.

** 19-May-80

PDP-11 register definitions ("r0, r1, ... r5, sp, and pc")

are generated by the compiler to refer to the hardware

registers. Also, these are predefined by the AS assembler.

Thus, you cannot name a function sp(), etc.

** 09-Jun-80

Previous versions of RSX CC wrote intermediate files and the compiler

output (.s) file on the same disk/directory as the input file.

This has been changed so as to write all output files onto the

user's current directory. This is compatible with RT11 CC and

general PDP11 practice. For example, assuming RSTS/E:

Command             Old               New

xcc [100,100]foo    [100,100]foo.s    sy:foo.s

Note that RT11 CC does a "normal" CSI scan, thus allowing

placement of all files.

** 19-Jun-80

Certain constructions don't get registers set up properly.

For example, the following program crashes the compiler:

long atol(s)

char s[];

{

long n;

n = 10 * n + (s[0] - '0');

}

As a temporary fix for this sort of problem, you can break

the code into smaller units:

long atol(s)

char s[];

{

long n;

register int i;

i = s[0] - '0';

n = 10 * n + i;

}

** 23-Jul-80

The AS assembler does not always process relative branches

(br .+4) correctly. They should not be used.

(Code generated by CC no longer contains relative branches.)

** 24-Jul-80

Integer to long conversion is not always done the way one

might expect. For example:

longval = ((long) intvalue) ...

Does not convert the integer to long (with sign extension),

but rather uses a "garbage" high-order word. Also,

longval = intval * intval;

Converts the RESULT of the computation to a long, as if the

program executed:

inttemp = intval * intval;

longval = inttemp;

Moral: the correct way to proceed is:

longtemp = intval;

longval = longtemp ...

Sorry.

Note however that <number>L is a long constant. Thus,

longval = intval * 123L;

works properly.
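The workaround above can be packaged as a small function. This is a minimal sketch in present-day C; the name long_product() is hypothetical. On a compiler with 16-bit ints, such as the one described here, the int-by-int product would overflow before being widened, which is what the long temporary avoids.

```c
/* Widen one operand through a long temporary before multiplying, so
 * the product is computed in long arithmetic rather than int. */
long long_product(int a, int b)
{
    long longtemp;

    longtemp = a;           /* plain assignment does the sign extension */
    return longtemp * b;    /* b is promoted to long for the multiply */
}
```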

** 14-Aug-80

Note the following errors in the C parser:

char foo[][7] {

...

};

subroutine() {

extern char foo[][7];

...

The extern declaration is rejected.

Also,

if (...)

do {

...

} while (...);

else

...

Is rejected with the message "illegal else". Rewrite as:

if (...) {

do {

...

} while (...);

}

else ...

Sorry.

** 15-Aug-80

The C compiler really and truly does not support the old

assigned binary operators (=+, =-, etc.). In fact, the

statement "foo =- 6" is equivalent to "foo = (-6)"; it IS NOT

equivalent to "foo = foo - 6".
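The difference can be seen in a short sketch. The function names are hypothetical; the code is ordinary present-day C, in which "=-" is likewise read as an assignment followed by a unary minus.

```c
/* "x =- 6" is read as "x = (-6)": the old value of x is discarded. */
int assign_minus_six(int x)
{
    x =- 6;
    return x;
}

/* "x -= 6" is the current spelling of the old V6 "x =- 6". */
int subtract_six(int x)
{
    x -= 6;
    return x;
}
```

Starting from 10, the first function yields -6 and the second yields 4, so a V6 source converted without rewriting the operator compiles silently and computes the wrong value.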

** 18-Aug-80

The C compiler may reject structure definitions within the

body of a function:

foo() {

struct { int *bar; };

}

However, if the structure definition is moved outside the

function body, it will compile correctly:

struct { int *bar; };

foo() {

...

}

** 12-Sep-80

The RSX file services emulator library under RSTS/E contains

a global symbol "EOF". Consequently, C programs running

under this library may not define a global by this name. The

following will not task-build correctly on RSTS/E,

RSX-11 mode:

int eof;

main() {

...

}

Note that this restriction is due to the global's presence in

the operating-system library, not the C run-time library.

** 15-Sep-80

Fwild/fnext will not properly process versions ;0 and ;-1 on

native RSX systems that support FILES-11 disk directory

structures. The algorithm works correctly on ODS2 structures

(and thus on VMS compatibility mode). The algorithm will work

on native RSX if the directory is sorted (by a program such

as SRD) in order of decreasing version numbers.

** 15-Sep-80

The C compiler is restrictive as to the ordering of

definitions. For example, given the following structure

definition:

struct stack {

int maxindex;

int currentindex;

int *vector;

};

The compiler rejects the following sequence:

struct stack datum { DATUMMAX, 0, datum };

int datum[DATUMMAX];

However, it accepts the following sequence:

int datum[DATUMMAX];

struct stack datum { DATUMMAX, 0, datum };
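Note that the listings above give the array and the structure the same name, and use the older initializer syntax without "=". A self-contained sketch of the accepted ordering in present-day C might look like the following; the names datum_vector, datum_stack and stack_push(), and the value of DATUMMAX, are hypothetical and chosen only for illustration.

```c
#define DATUMMAX 8          /* hypothetical size; the manual gives none */

struct stack {
    int maxindex;
    int currentindex;
    int *vector;
};

/* The vector is defined first, so its name is already known when the
 * initializer of the structure refers to it. */
int datum_vector[DATUMMAX];
struct stack datum_stack = { DATUMMAX, 0, datum_vector };

/* Hypothetical helper: push a value, returning 0 on success and -1
 * when the stack is full. */
int stack_push(struct stack *sp, int value)
{
    if (sp->currentindex >= sp->maxindex)
        return -1;
    sp->vector[sp->currentindex++] = value;
    return 0;
}
```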
