This file documents the internals of the GNU compilers.
Copyright © 1988-2021 Free Software Foundation, Inc.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with the Invariant Sections being “Funding Free Software”, the Front-Cover Texts being (a) (see below), and with the Back-Cover Texts being (b) (see below). A copy of the license is included in the section entitled “GNU Free Documentation License”.
(a) The FSF’s Front-Cover Text is:
A GNU Manual
(b) The FSF’s Back-Cover Text is:
You have freedom to copy and modify this GNU Manual, like GNU software. Copies published by the Free Software Foundation raise funds for GNU development.
Introduction
This manual documents the internals of the GNU compilers, including how to port them to new targets and some information about how to write front ends for new languages. It corresponds to the compilers (GCC) version 11.3.0. The use of the GNU compilers is documented in a separate manual. See Introduction in Using the GNU Compiler Collection (GCC).
This manual is mainly a reference manual rather than a tutorial. It discusses how to contribute to GCC (see Contributing to GCC Development), the characteristics of the machines supported by GCC as hosts and targets (see GCC and Portability), how GCC relates to the ABIs on such systems (see Interfacing to GCC Output), and the characteristics of the languages for which GCC front ends are written (see Language Front Ends in GCC). It then describes the GCC source tree structure and build system, some of the interfaces to GCC front ends, and how support for a target system is implemented in GCC.
Additional tutorial information is linked to from http://gcc.gnu.org/readings.html.
Contributing to GCC Development
If you would like to help pretest GCC releases to assure they work well, current development sources are available via Git (see http://gcc.gnu.org/git.html). Source and binary snapshots are also available for FTP; see http://gcc.gnu.org/snapshots.html.
If you would like to work on improvements to GCC, please read the advice at these URLs:
for information on how to make useful contributions and avoid duplication of effort. Suggested projects are listed at http://gcc.gnu.org/projects/.
GCC and Portability
GCC itself aims to be portable to any machine where int is at least a 32-bit type. It aims to target machines with a flat (non-segmented) byte-addressed data address space (the code address space can be separate). Target ABIs may have an 8, 16, 32 or 64-bit int type. char can be wider than 8 bits.
GCC gets most of the information about the target machine from a machine description which gives an algebraic formula for each of the machine’s instructions. This is a very clean way to describe the target. But when the compiler needs information that is difficult to express in this fashion, ad-hoc parameters have been defined for machine descriptions. The purpose of portability is to reduce the total work needed on the compiler; it was not of interest for its own sake.
GCC does not contain machine-dependent code, but it does contain code that depends on machine parameters such as endianness (whether the most significant byte has the highest or lowest address of the bytes in a word) and the availability of autoincrement addressing. In the RTL-generation pass, it is often necessary to have multiple strategies for generating code for a particular kind of syntax tree, strategies that are usable for different combinations of parameters. Often, not all possible cases have been addressed, but only the common ones or only the ones that have been encountered. As a result, a new target may require additional strategies. You will know if this happens because the compiler will call abort. Fortunately, the new strategies can be added in a machine-independent fashion, and will affect only the target machines that need them.
Interfacing to GCC Output
GCC is normally configured to use the function calling convention normally in use on the target system. This is done with the machine-description macros and functions (see Target Description Macros and Functions).
However, returning of structure and union values is done differently on some target machines. As a result, functions compiled with PCC returning such types cannot be called from code compiled with GCC, and vice versa. This does not cause trouble often because few Unix library routines return structures or unions.
GCC code returns structures and unions that are 1, 2, 4 or 8 bytes long in the same registers used for int or double return values. (GCC typically allocates variables of such types in registers also.) Structures and unions of other sizes are returned by storing them into an address passed by the caller (usually in a register). The target hook TARGET_STRUCT_VALUE_RTX tells GCC where to pass this address.
By contrast, PCC on most target machines returns structures and unions of any size by copying the data into an area of static storage, and then returning the address of that storage as if it were a pointer value. The caller must copy the data from that memory area to the place where the value is wanted. This is slower than the method used by GCC, and fails to be reentrant.
On some target machines, such as RISC machines and the 80386, the standard system convention is to pass to the subroutine the address of where to return the value. On these machines, GCC has been configured to be compatible with the standard compiler, when this method is used. It may not be compatible for structures of 1, 2, 4 or 8 bytes.
GCC uses the system’s standard convention for passing arguments. On some machines, the first few arguments are passed in registers; in others, all are passed on the stack. It would be possible to use registers for argument passing on any machine, and this would probably result in a significant speedup. But the result would be complete incompatibility with code that follows the standard convention. So this change is practical only if you are switching to GCC as the sole C compiler for the system. We may implement register argument passing on certain machines once we have a complete GNU system so that we can compile the libraries with GCC.
On some machines (particularly the SPARC), certain types of arguments are passed “by invisible reference”. This means that the value is stored in memory, and the address of the memory location is passed to the subroutine.
If you use longjmp, beware of automatic variables. ISO C says that automatic variables that are not declared volatile have undefined values after a longjmp. And this is all GCC promises to do, because it is very difficult to restore register variables correctly, and one of GCC’s features is that it can put variables in registers without your asking it to.
The GCC low-level runtime library
GCC provides a low-level runtime library, libgcc.a or libgcc_s.so.1 on some platforms. GCC generates calls to routines in this library automatically, whenever it needs to perform some operation that is too complicated to emit inline code for.
Most of the routines in libgcc handle arithmetic operations that the target processor cannot perform directly. This includes integer multiply and divide on some machines, and all floating-point and fixed-point operations on other machines. libgcc also includes routines for exception handling, and a handful of miscellaneous operations.
Some of these routines can be defined in mostly machine-independent C. Others must be hand-written in assembly language for each processor that needs them.
GCC will also generate calls to C library routines, such as memcpy and memset, in some cases. The set of routines that GCC may possibly use is documented in Other Builtins in Using the GNU Compiler Collection (GCC).
These routines take arguments and return values of a specific machine mode, not a specific C type. See Machine Modes, for an explanation of this concept. For illustrative purposes, in this chapter the floating point type float is assumed to correspond to SFmode; double to DFmode; and long double to both TFmode and XFmode. Similarly, the integer types int and unsigned int correspond to SImode; long and unsigned long to DImode; and long long and unsigned long long to TImode.
Routines for integer arithmetic
The integer arithmetic routines are used on platforms that don’t provide hardware support for arithmetic operations on some modes.
These functions return the result of shifting a left by b bits.
These functions return the result of arithmetically shifting a right by b bits.
These functions return the quotient of the signed division of a and b.
These functions return the result of logically shifting a right by b bits.
These functions return the remainder of the signed division of a and b.
These functions return the product of a and b.
These functions return the negation of a.
These functions return the quotient of the unsigned division of a and b.
These functions calculate both the quotient and remainder of the unsigned division of a and b. The return value is the quotient, and the remainder is placed in variable pointed to by c.
These functions return the remainder of the unsigned division of a and b.
The following functions implement integral comparisons. These functions implement a low-level compare, upon which the higher level comparison operators (such as less than and greater than or equal to) can be constructed. The returned values lie in the range zero to two, to allow the high-level operators to be implemented by testing the returned result using either signed or unsigned comparison.
These functions perform a signed comparison of a and b. If a is less than b, they return 0; if a is greater than b, they return 2; and if a and b are equal they return 1.
These functions perform an unsigned comparison of a and b. If a is less than b, they return 0; if a is greater than b, they return 2; and if a and b are equal they return 1.
The following functions implement trapping arithmetic. These functions call the libc function abort upon signed arithmetic overflow.
These functions return the absolute value of a.
These functions return the sum of a and b; that is, a + b.
These functions return the product of a and b; that is, a * b.
These functions return the number of leading 0-bits in a, starting at the most significant bit position. If a is zero, the result is undefined.
These functions return the number of trailing 0-bits in a, starting at the least significant bit position. If a is zero, the result is undefined.
These functions return the index of the least significant 1-bit in a, or the value zero if a is zero. The least significant bit is index one.
These functions return the value zero if the number of bits set in a is even, and the value one otherwise.
These functions return the number of bits set in a.
These functions return a with its bytes swapped.
Routines for floating point emulation
The software floating point library is used on machines which do not have hardware support for floating point. It is also used whenever -msoft-float is used to disable generation of floating point instructions. (Not all targets support this switch.)
For compatibility with other compilers, the floating point emulation routines can be renamed with the DECLARE_LIBRARY_RENAMES macro (see Implicit Calls to Library Routines). In this section, the default names are used.
Presently the library does not support XFmode, which is used for long double on some architectures.
These functions return the sum of a and b.
These functions return the difference between b and a; that is, a - b.
These functions return the product of a and b.
These functions return the quotient of a and b; that is, a / b.
These functions return the negation of a. They simply flip the sign bit, so they can produce negative zero and negative NaN.
These functions extend a to the wider mode of their return type.
These functions truncate a to the narrower mode of their return type, rounding toward zero.
These functions convert a to a signed integer, rounding toward zero.
These functions convert a to a signed long, rounding toward zero.
These functions convert a to a signed long long, rounding toward zero.
These functions convert a to an unsigned integer, rounding toward zero. Negative values all become zero.
These functions convert a to an unsigned long, rounding toward zero. Negative values all become zero.
These functions convert a to an unsigned long long, rounding toward zero. Negative values all become zero.
These functions convert i, a signed integer, to floating point.
These functions convert i, a signed long, to floating point.
These functions convert i, a signed long long, to floating point.
These functions convert i, an unsigned integer, to floating point.
These functions convert i, an unsigned long, to floating point.
These functions convert i, an unsigned long long, to floating point.
There are two sets of basic comparison functions.
These functions calculate a <=> b. That is, if a is less than b, they return -1; if a is greater than b, they return 1; and if a and b are equal they return 0. If either argument is NaN they return 1, but you should not rely on this; if NaN is a possibility, use one of the higher-level comparison functions.
These functions return a nonzero value if either argument is NaN, otherwise 0.
There is also a complete group of higher level functions which correspond directly to comparison operators. They implement the ISO C semantics for floating-point comparisons, taking NaN into account. Pay careful attention to the return values defined for each set. Under the hood, all of these routines are implemented as
if (__unordXf2 (a, b)) return E; return __cmpXf2 (a, b);
where E is a constant chosen to give the proper behavior for NaN. Thus, the meaning of the return value is different for each set. Do not rely on this implementation; only the semantics documented below are guaranteed.
These functions return zero if neither argument is NaN, and a and b are equal.
These functions return a nonzero value if either argument is NaN, or if a and b are unequal.
These functions return a value greater than or equal to zero if neither argument is NaN, and a is greater than or equal to b.
These functions return a value less than zero if neither argument is NaN, and a is strictly less than b.
These functions raise a to the power b.
These functions return the product of a + ib and c + id, following the rules of C99 Annex G.
These functions return the quotient of a + ib and c + id (i.e., (a + ib) / (c + id)), following the rules of C99 Annex G.
Routines for decimal floating point emulation
The software decimal floating point library implements IEEE 754-2008 decimal floating point arithmetic and is only activated on selected targets.
The software decimal floating point library supports either DPD (Densely Packed Decimal) or BID (Binary Integer Decimal) encoding as selected at configure time.
These functions return the sum of a and b.
These functions return the difference between b and a; that is, a - b.
These functions return the product of a and b.
These functions return the quotient of a and b; that is, a / b.
These functions return the negation of a. They simply flip the sign bit, so they can produce negative zero and negative NaN.
These functions convert the value a from one decimal floating type to another.
These functions convert the value of a from a binary floating type to a decimal floating type of a different size.
These functions convert the value of a from a decimal floating type to a binary floating type of a different size.
These functions convert the value of a between decimal and binary floating types of the same size.
These functions convert a to a signed integer.
These functions convert a to a signed long.
These functions convert a to an unsigned integer. Negative values all become zero.
These functions convert a to an unsigned long. Negative values all become zero.
These functions convert i, a signed integer, to decimal floating point.
These functions convert i, a signed long, to decimal floating point.
These functions convert i, an unsigned integer, to decimal floating point.
These functions convert i, an unsigned long, to decimal floating point.
These functions return a nonzero value if either argument is NaN, otherwise 0.
There is also a complete group of higher level functions which correspond directly to comparison operators. They implement the ISO C semantics for floating-point comparisons, taking NaN into account. Pay careful attention to the return values defined for each set. Under the hood, all of these routines are implemented as
if (__bid_unordXd2 (a, b)) return E; return __bid_cmpXd2 (a, b);
where E is a constant chosen to give the proper behavior for NaN. Thus, the meaning of the return value is different for each set. Do not rely on this implementation; only the semantics documented below are guaranteed.
These functions return zero if neither argument is NaN, and a and b are equal.
These functions return a nonzero value if either argument is NaN, or if a and b are unequal.
These functions return a value greater than or equal to zero if neither argument is NaN, and a is greater than or equal to b.
These functions return a value less than zero if neither argument is NaN, and a is strictly less than b.
These functions return a value less than or equal to zero if neither argument is NaN, and a is less than or equal to b.
These functions return a value greater than zero if neither argument is NaN, and a is strictly greater than b.
Routines for fixed-point fractional emulation
The software fixed-point library implements fixed-point fractional arithmetic, and is only activated on selected targets.
For ease of comprehension fract is an alias for the _Fract type, accum an alias for _Accum, and sat an alias for _Sat.
For illustrative purposes, in this section the fixed-point fractional type short fract is assumed to correspond to machine mode QQmode; unsigned short fract to UQQmode; fract to HQmode; unsigned fract to UHQmode; long fract to SQmode; unsigned long fract to USQmode; long long fract to DQmode; and unsigned long long fract to UDQmode.
Similarly the fixed-point accumulator type short accum corresponds to HAmode; unsigned short accum to UHAmode; accum to SAmode; unsigned accum to USAmode; long accum to DAmode; unsigned long accum to UDAmode; long long accum to TAmode; and unsigned long long accum to UTAmode.
These functions return the sum of a and b.
These functions return the sum of a and b with signed saturation.
These functions return the sum of a and b with unsigned saturation.
These functions return the difference of a and b; that is, a - b.
These functions return the difference of a and b with signed saturation; that is, a - b.
These functions return the difference of a and b with unsigned saturation; that is, a - b.
These functions return the product of a and b.
These functions return the product of a and b with signed saturation.
These functions return the product of a and b with unsigned saturation.
These functions return the quotient of the signed division of a and b.
These functions return the quotient of the unsigned division of a and b.
These functions return the quotient of the signed division of a and b with signed saturation.
These functions return the quotient of the unsigned division of a and b with unsigned saturation.
These functions return the negation of a.
These functions return the negation of a with signed saturation.
These functions return the negation of a with unsigned saturation.
These functions return the result of shifting a left by b bits.
These functions return the result of arithmetically shifting a right by b bits.
These functions return the result of logically shifting a right by b bits.
These functions return the result of shifting a left by b bits with signed saturation.
These functions return the result of shifting a left by b bits with unsigned saturation.
The following functions implement fixed-point comparisons. These functions implement a low-level compare, upon which the higher level comparison operators (such as less than and greater than or equal to) can be constructed. The returned values lie in the range zero to two, to allow the high-level operators to be implemented by testing the returned result using either signed or unsigned comparison.
These functions perform a signed or unsigned comparison of a and b (depending on the selected machine mode). If a is less than b, they return 0; if a is greater than b, they return 2; and if a and b are equal they return 1.
These functions convert from fractional and signed non-fractionals to fractionals and signed non-fractionals, without saturation.
These functions convert from fractional and signed non-fractionals to fractionals, with saturation.
These functions convert from fractionals to unsigned non-fractionals; and from unsigned non-fractionals to fractionals, without saturation.
These functions convert from unsigned non-fractionals to fractionals, with saturation.
Language-independent routines for exception handling
document me!
_Unwind_DeleteException, _Unwind_Find_FDE, _Unwind_ForcedUnwind, _Unwind_GetGR, _Unwind_GetIP, _Unwind_GetLanguageSpecificData, _Unwind_GetRegionStart, _Unwind_GetTextRelBase, _Unwind_GetDataRelBase, _Unwind_RaiseException, _Unwind_Resume, _Unwind_SetGR, _Unwind_SetIP, _Unwind_FindEnclosingFunction, _Unwind_SjLj_Register, _Unwind_SjLj_Unregister, _Unwind_SjLj_RaiseException, _Unwind_SjLj_ForcedUnwind, _Unwind_SjLj_Resume, __deregister_frame, __deregister_frame_info, __deregister_frame_info_bases, __register_frame, __register_frame_info, __register_frame_info_bases, __register_frame_info_table, __register_frame_info_table_bases, __register_frame_table
Miscellaneous runtime library routines
This function clears the instruction cache between beg and end.
When using -fsplit-stack, this call may be used to iterate over the stack segments. It may be called like this:
void *next_segment = NULL;
void *next_sp = NULL;
void *initial_sp = NULL;
void *stack;
size_t stack_size;
while ((stack = __splitstack_find (next_segment, next_sp, &stack_size,
                                   &next_segment, &next_sp, &initial_sp))
       != NULL)
  {
    /* Stack segment starts at stack and is stack_size bytes long.  */
  }
There is no way to iterate over the stack segments of a different thread. However, what is permitted is for one thread to call this with the segment_arg and sp arguments NULL, to pass next_segment, next_sp, and initial_sp to a different thread, and then to suspend one way or another. A different thread may run the subsequent __splitstack_find iterations. Of course, this will only work if the first thread is suspended while the second thread is calling __splitstack_find. If not, the second thread could be looking at the stack while it is changing, and anything could happen.
Internal variables used by the -fsplit-stack implementation.
Language Front Ends in GCC
The interface to front ends for languages in GCC, and in particular the tree structure (see GENERIC), was initially designed for C, and many aspects of it are still somewhat biased towards C and C-like languages. It is, however, reasonably well suited to other procedural languages, and front ends for many such languages have been written for GCC.
Writing a compiler as a front end for GCC, rather than compiling directly to assembler or generating C code which is then compiled by GCC, has several advantages:
Because of the advantages of writing a compiler as a GCC front end, GCC front ends have also been created for languages very different from those for which GCC was designed, such as the declarative logic/functional language Mercury. For these reasons, it may also be useful to implement compilers created for specialized purposes (for example, as part of a research project) as GCC front ends.
Source Tree Structure and Build System
This chapter describes the structure of the GCC source tree, and how GCC is built. The user documentation for building and installing GCC is in a separate manual (http://gcc.gnu.org/install/), with which it is presumed that you are familiar.
The configure and build process has a long and colorful history, and can be confusing to anyone who doesn’t know why things are the way they are. While there are other documents which describe the configuration process in detail, here are a few things that everyone working on GCC should know.
There are three system names that the build knows about: the machine you are building on (build), the machine that you are building for (host), and the machine that GCC will produce code for (target). When you configure GCC, you specify these with --build=, --host=, and --target=.
Specifying the host without specifying the build should be avoided, as configure may (and once did) assume that the host you specify is also the build, which may not be true.
If build, host, and target are all the same, this is called a native. If build and host are the same but target is different, this is called a cross. If build, host, and target are all different this is called a canadian (for obscure reasons dealing with Canada’s political party and the background of the person working on the build at that time). If host and target are the same, but build is different, you are using a cross-compiler to build a native for a different system. Some people call this a host-x-host, crossed native, or cross-built native. If build and target are the same, but host is different, you are using a cross compiler to build a cross compiler that produces code for the machine you’re building on. This is rare, so there is no common way of describing it. There is a proposal to call this a crossback.
If build and host are the same, the GCC you are building will also be used to build the target libraries (like libstdc++). If build and host are different, you must have already built and installed a cross compiler that will be used to build the target libraries (if you configured with --target=foo-bar, this compiler will be called foo-bar-gcc).
In the case of target libraries, the machine you’re building for is the machine you specified with --target. So, build is the machine you’re building on (no change there), host is the machine you’re building for (the target libraries are built for the target, so host is the target you specified), and target doesn’t apply (because you’re not building a compiler, you’re building libraries). The configure/make process will adjust these variables as needed. It also sets $with_cross_host to the original --host value in case you need it.
The libiberty support library is built up to three times: once for the host, once for the target (even if they are the same), and once for the build if build and host are different. This allows it to be used by all programs which are generated in the course of the build process.
Top Level Source Directory
The top level source directory in a GCC distribution contains several files and directories that are shared with other software distributions such as that of GNU Binutils. It also contains several subdirectories that contain parts of GCC and its runtime libraries:
The Boehm conservative garbage collector, optionally used as part of the ObjC runtime library when configured with --enable-objc-gc.
Autoconf macros and Makefile fragments used throughout the tree.
Contributed scripts that may be found useful in conjunction with GCC. One of these, contrib/texi2pod.pl, is used to generate man pages from Texinfo manuals as part of the GCC build process.
The support for fixing system headers to work with GCC. See fixincludes/README for more information. The headers fixed by this mechanism are installed in libsubdir/include-fixed. Along with those headers, README-fixinc is also installed, as libsubdir/include-fixed/README.
The main sources of GCC itself (except for runtime libraries), including optimizers, support for different target architectures, language front ends, and testsuites. See The gcc Subdirectory, for details.
Support tools for GNAT.
Headers for the libiberty library.
GNU libintl, from GNU gettext, for systems which do not include it in libc.
The Ada runtime library.
The runtime support library for atomic operations (e.g. for __sync and __atomic).
The C preprocessor library.
The Decimal Float support library.
The libffi library, used as part of the Go runtime library.
The GCC runtime library.
The Fortran runtime library.
The Go runtime library. The bulk of this library is mirrored from the master Go repository.
The GNU Offloading and Multi Processing Runtime Library.
The libiberty library, used for portability and for some generally useful data structures and algorithms. See Introduction in GNU libiberty, for more information about this library.
The runtime support library for transactional memory.
The Objective-C and Objective-C++ runtime library.
The runtime support library for quad-precision math operations.
The D standard and runtime library. The bulk of this library is mirrored from the master D repositories.
The Stack protector runtime library.
The C++ runtime library.
Plugin used by the linker if link-time optimizations are enabled.
Scripts used by the gccadmin account on gcc.gnu.org.
The zlib compression library, used for compressing and uncompressing GCC’s intermediate language in LTO object files.
The build system in the top level directory, including how recursion into subdirectories works and how building runtime libraries for multilibs is handled, is documented in a separate manual, included with GNU Binutils. See GNU configure and build system in The GNU configure and build system, for details.
The gcc Subdirectory
The gcc directory contains many files that are part of the C sources of GCC, other files used as part of the configuration and build process, and subdirectories including documentation and a testsuite. The files that are sources of GCC are documented in a separate chapter. See Passes and Files of the Compiler.
The gcc directory contains the following subdirectories:
Subdirectories for various languages. Directories containing a file config-lang.in are language subdirectories. The contents of the subdirectories c (for C), cp (for C++), objc (for Objective-C), objcp (for Objective-C++), and lto (for LTO) are documented in this manual (see Passes and Files of the Compiler); those for other languages are not. See Anatomy of a Language Front End, for details of the files in these directories.
Source files shared between the compiler drivers (such as gcc) and the compilers proper (such as cc1). If an architecture defines target hooks shared between those places, it also has a subdirectory in common/config. See The Global targetm Variable.
Configuration files for supported architectures and operating systems. See Anatomy of a Target Back End, for details of the files in this directory.
Texinfo documentation for GCC, together with automatically generated man pages and support for converting the installation manual to HTML. See Building Documentation.
System headers installed by GCC, mainly those required by the C standard of freestanding implementations. See Headers Installed by GCC, for details of when these and other headers are installed.
Message catalogs with translations of messages produced by GCC into various languages, language.po. This directory also contains gcc.pot, the template for these message catalogs; exgettext, a wrapper around gettext to extract the messages from the GCC sources and create gcc.pot, which is run by ‘make gcc.pot’; and EXCLUDES, a list of files from which messages should not be extracted.
The GCC testsuites (except for those for runtime libraries). See Testsuites.
The gcc directory is configured with an Autoconf-generated script configure. The configure script is generated from configure.ac and aclocal.m4. From the files configure.ac and acconfig.h, Autoheader generates the file config.in. The file cstamp-h.in is used as a timestamp.
configure uses some other scripts to help in its work:
The config.build file contains specific rules for particular systems which GCC is built on. This should be used as rarely as possible, as the behavior of the build system can always be detected by autoconf.
The config.host file contains specific rules for particular systems which GCC will run on. This is rarely needed.
The config.gcc file contains specific rules for particular systems which GCC will generate code for. This is usually needed.
Each file has a list of the shell variables it sets, with descriptions, at the top of the file.
FIXME: document the contents of these files, and what variables should be set to control build, host and target configuration.
Here we spell out what files will be set up by configure in the gcc directory. Some other files are created as temporary files in the configuration process, and are not used in the subsequent build; these are not documented.
If a language’s config-lang.in file sets outputs, then the files listed in outputs there are also generated.
The following configuration headers are created from the Makefile, using mkconfig.sh, rather than directly by configure. config.h, bconfig.h and tconfig.h all contain the xm-machine.h header, if any, appropriate to the host, build and target machines respectively, the configuration headers for the target, and some definitions; for the host and build machines, these include the autoconfigured headers generated by configure. The other configuration headers are determined by config.gcc. They also contain the typedefs for rtx, rtvec and tree.
FIXME: describe the build system, including what is built in what stages. Also list the various source files that are used in the build process but aren’t source files of GCC itself and so aren’t documented below (see Passes and Files of the Compiler).
These targets are available from the ‘gcc’ directory:
all
This is the default target. Depending on what your build/host/target configuration is, it coordinates all the things that need to be built.
doc
Produce info-formatted documentation and man pages. Essentially it calls ‘make man’ and ‘make info’.
dvi
Produce DVI-formatted documentation.
pdf
Produce PDF-formatted documentation.
html
Produce HTML-formatted documentation.
man
Generate man pages.
info
Generate info-formatted pages.
mostlyclean
Delete the files made while building the compiler.
clean
That, and all the other files built by ‘make all’.
distclean
That, and all the files created by configure.
maintainer-clean
Distclean plus any file that can be generated from other files. Note that additional tools may be required beyond what is normally needed to build GCC.
srcextra
Generates files in the source directory that are not version-controlled but should go into a release tarball.
srcinfo
srcman
Copies the info-formatted and manpage documentation into the source directory, usually for the purpose of generating a release tarball.
install
Installs GCC.
uninstall
Deletes installed files, though this is not supported.
check
Run the testsuite. This creates a testsuite subdirectory that has various .sum and .log files containing the results of the testing. You can run subsets with, for example, ‘make check-gcc’. You can specify specific tests by setting RUNTESTFLAGS to be the name of the .exp file, optionally followed by (for some tests) an equals and a file wildcard, like:
make check-gcc RUNTESTFLAGS="execute.exp=19980413-*"
Note that running the testsuite may require additional tools be installed, such as Tcl or DejaGnu.
The toplevel tree from which you start GCC compilation is not the GCC directory, but rather a complex Makefile that coordinates the various steps of the build, including bootstrapping the compiler and using the new compiler to build target libraries.
When GCC is configured for a native configuration, the default action for make is to do a full three-stage bootstrap. This means that GCC is built three times: once with the native compiler, once with the native-built compiler it just built, and once with the compiler it built the second time. In theory, the last two should produce the same results, which ‘make compare’ can check. Each stage is configured separately and compiled into a separate directory, to minimize problems due to ABI incompatibilities between the native compiler and GCC.
If you make a change, rebuilding will also start from the first stage and “bubble” the change up through the three stages. Each stage is taken from its build directory (if it had been built previously), rebuilt, and copied to its subdirectory. This allows you to, for example, continue a bootstrap after fixing a bug which causes the stage2 build to crash. It does not provide as good coverage of the compiler as bootstrapping from scratch, but it ensures that the new code is syntactically correct (e.g., that you did not use GCC extensions by mistake), and avoids spurious bootstrap comparison failures.
Other targets available from the top level include:
bootstrap-lean
Like bootstrap, except that the various stages are removed once they’re no longer needed. This saves disk space.
bootstrap2
bootstrap2-lean
Performs only the first two stages of bootstrap. Unlike a three-stage bootstrap, this does not perform a comparison to test that the compiler is running properly. Note that the disk space required by a “lean” bootstrap is approximately independent of the number of stages.
stageN-bubble (N = 1…4, profile, feedback)
Rebuild all the stages up to N, with the appropriate flags, “bubbling” the changes as described above.
all-stageN (N = 1…4, profile, feedback)
Assuming that stage N has already been built, rebuild it with the appropriate flags. This is rarely needed.
cleanstrap
Remove everything (‘make clean’) and rebuild (‘make bootstrap’).
compare
Compares the results of stages 2 and 3. This ensures that the compiler is running properly, since it should produce the same object files regardless of how it itself was compiled.
profiledbootstrap
Builds a compiler with profiling feedback information. In this case, the second and third stages are named ‘profile’ and ‘feedback’, respectively. For more information, see the installation instructions.
restrap
Restart a bootstrap, so that everything that was not built with the system compiler is rebuilt.
stageN-start (N = 1…4, profile, feedback)
For each package that is bootstrapped, rename directories so that, for example, gcc points to the stageN GCC, compiled with the stageN-1 GCC.
You will invoke this target if you need to test or debug the stageN GCC. If you only need to execute GCC (but you need not run ‘make’ either to rebuild it or to run test suites), you should be able to work directly in the stageN-gcc directory. This makes it easier to debug multiple stages in parallel.
stageN-end (N = 1…4, profile, feedback)
For each package that is bootstrapped, relocate its build directory to indicate its stage. For example, if the gcc directory points to the stage2 GCC, after invoking this target it will be renamed to stage2-gcc.
If you wish to use non-default GCC flags when compiling the stage2 and stage3 compilers, set BOOT_CFLAGS on the command line when doing ‘make’.
Usually, the first stage only builds the languages that the compiler is written in: typically, C and maybe Ada. If you are debugging a miscompilation of a different stage2 front-end (for example, of the Fortran front-end), you may want to have front-ends for other languages in the first stage as well. To do so, set STAGE1_LANGUAGES on the command line when doing ‘make’.
For example, in the aforementioned scenario of debugging a Fortran front-end miscompilation caused by the stage1 compiler, you may need a command like
make stage2-bubble STAGE1_LANGUAGES=c,fortran
Alternatively, you can use per-language targets to build and test languages that are not enabled by default in stage1. For example, make f951 will build a Fortran compiler even in the stage1 build directory.
FIXME: list here, with explanation, all the C source files and headers under the gcc directory that aren’t built into the GCC executable but rather are part of runtime libraries and object files, such as crtstuff.c and unwind-dw2.c. See Headers Installed by GCC, for more information about the ginclude directory.
In general, GCC expects the system C library to provide most of the headers to be used with it. However, GCC will fix those headers if necessary to make them work with GCC, and will install some headers required of freestanding implementations. These headers are installed in libsubdir/include. Headers for non-C runtime libraries are also installed by GCC; these are not documented here. (FIXME: document them somewhere.)
Several of the headers GCC installs are in the ginclude directory. These headers, iso646.h, stdarg.h, stdbool.h, and stddef.h, are installed in libsubdir/include, unless the target Makefile fragment (see Target Makefile Fragments) overrides this by setting USER_H.
In addition to these headers and those generated by fixing system headers to work with GCC, some other headers may also be installed in libsubdir/include. config.gcc may set extra_headers; this specifies additional headers under config to be installed on some systems.
GCC installs its own version of <float.h>, from ginclude/float.h. This is done to cope with command-line options that change the representation of floating point numbers.
GCC also installs its own version of <limits.h>; this is generated from glimits.h, together with limitx.h and limity.h if the system also has its own version of <limits.h>. (GCC provides its own header because it is required of ISO C freestanding implementations, but needs to include the system header from its own header as well because other standards such as POSIX specify additional values to be defined in <limits.h>.) The system’s <limits.h> header is used via libsubdir/include/syslimits.h, which is copied from gsyslimits.h if it does not need fixing to work with GCC; if it needs fixing, syslimits.h is the fixed copy.
GCC can also install <tgmath.h>. It will do this when config.gcc sets use_gcc_tgmath to yes.
The main GCC documentation is in the form of manuals in Texinfo format. These are installed in Info format; DVI versions may be generated by ‘make dvi’, PDF versions by ‘make pdf’, and HTML versions by ‘make html’. In addition, some man pages are generated from the Texinfo manuals, there are some other text files with miscellaneous documentation, and runtime libraries have their own documentation outside the gcc directory. FIXME: document the documentation for runtime libraries somewhere.
The manuals for GCC as a whole, and the C and C++ front ends, are in files doc/*.texi. Other front ends have their own manuals in files language/*.texi. Common files doc/include/*.texi are provided which may be included in multiple manuals; the following files are in doc/include:
The GNU Free Documentation License.
The section “Funding Free Software”.
Common definitions for manuals.
The GNU General Public License.
A copy of texinfo.tex known to work with the GCC manuals.
DVI-formatted manuals are generated by ‘make dvi’, which uses texi2dvi (via the Makefile macro $(TEXI2DVI)). PDF-formatted manuals are generated by ‘make pdf’, which uses texi2pdf (via the Makefile macro $(TEXI2PDF)). HTML-formatted manuals are generated by ‘make html’. Info manuals are generated by ‘make info’ (which is run as part of a bootstrap); this generates the manuals in the source directory, using makeinfo via the Makefile macro $(MAKEINFO), and they are included in release distributions.
Manuals are also provided on the GCC web site, in both HTML and PostScript forms. This is done via the script maintainer-scripts/update_web_docs_git. Each manual to be provided online must be listed in the definition of MANUALS in that file; a file name.texi must only appear once in the source tree, and the output manual must have the same name as the source file. (However, other Texinfo files, included in manuals but not themselves the root files of manuals, may have names that appear more than once in the source tree.) The manual file name.texi should only include other files in its own directory or in doc/include. HTML manuals will be generated by ‘makeinfo --html’, PostScript manuals by texi2dvi and dvips, and PDF manuals by texi2pdf.
All Texinfo files that are parts of manuals must be version-controlled, even if they are generated files, for the generation of online manuals to work.
The installation manual, doc/install.texi, is also provided on the GCC web site. The HTML version is generated by the script doc/install.texi2html.
Because of user demand, in addition to full Texinfo manuals, man pages are provided which contain extracts from those manuals. These man pages are generated from the Texinfo manuals using contrib/texi2pod.pl and pod2man. (The man page for g++, cp/g++.1, just contains a ‘.so’ reference to gcc.1, but all the other man pages are generated from Texinfo manuals.)
Because many systems may not have the necessary tools installed to generate the man pages, they are only generated if the configure script detects that recent enough tools are installed, and the Makefiles allow generating man pages to fail without aborting the build. Man pages are also included in release distributions. They are generated in the source directory.
Magic comments in Texinfo files starting ‘@c man’ control what parts of a Texinfo file go into a man page. Only a subset of Texinfo is supported by texi2pod.pl, and it may be necessary to add support for more Texinfo features to this script when generating new man pages. To improve the man page output, some special Texinfo macros are provided in doc/include/gcc-common.texi which texi2pod.pl understands:
@gcctabopt
Use in the form ‘@table @gcctabopt’ for tables of options, where for printed output the effect of ‘@code’ is better than that of ‘@option’ but for man page output a different effect is wanted.
@gccoptlist
Use for summary lists of options in manuals.
@gol
Use at the end of each line inside ‘@gccoptlist’. This is necessary to avoid problems with differences in how the ‘@gccoptlist’ macro is handled by different Texinfo formatters.
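To illustrate how these macros fit together, here is a hedged sketch of manual source using them (the option names are invented for the example):

```texinfo
@table @gcctabopt
@item -fexample-option
A hypothetical option description.
@end table

@gccoptlist{-fexample-option @gol
-fanother-example-option}
```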
FIXME: describe the texi2pod.pl input language and magic comments in more detail.
In addition to the formal documentation that is installed by GCC, there are several other text files in the gcc subdirectory with miscellaneous documentation:
Notes on GCC’s Native Language Support. FIXME: this should be part of this manual rather than a separate file.
Notes on the Free Translation Project.
The GNU General Public License, Versions 2 and 3.
The GNU Lesser General Public License, Versions 2.1 and 3.
Change log files for various parts of GCC.
Details of a few changes to the GCC front-end interface. FIXME: the information in this file should be part of general documentation of the front-end interface in this manual.
Information about new features in old versions of GCC. (For recent versions, the information is on the GCC web site.)
Information about portability issues when writing code in GCC. FIXME: why isn’t this part of this manual or of the GCC Coding Conventions?
FIXME: document such files in subdirectories, at least config, c, cp, objc, testsuite.
A front end for a language in GCC has the following parts:
Entries in default_compilers in gcc.c for source file suffixes for that language.
If the front end is added to the official GCC source repository, the following are also necessary:
A front end language directory contains the source files of that front end (but not of any runtime libraries, which should be outside the gcc directory). This includes documentation, and possibly some subsidiary programs built alongside the front end. Certain files are special and other parts of the compiler depend on their names:
This file is required in all language subdirectories. See The Front End config-lang.in File, for details of its contents.
This file is required in all language subdirectories. See The Front End Make-lang.in File, for details of its contents.
This file registers the set of switches that the front end accepts on the command line, and their --help text. See Option specification files.
This file provides entries for default_compilers in gcc.c which override the default of giving an error that a compiler for that language is not installed.
This file, which need not exist, defines any language-specific tree codes.
Each language subdirectory contains a config-lang.in file. This file is a shell script that may define some variables describing the language:
language
This definition must be present, and gives the name of the language for some purposes such as arguments to --enable-languages.
lang_requires
If defined, this variable lists (space-separated) language front ends other than C that this front end requires to be enabled (with the names given being their language settings). For example, the Obj-C++ front end depends on the C++ and ObjC front ends, so sets ‘lang_requires="objc c++"’.
subdir_requires
If defined, this variable lists (space-separated) front end directories other than C that this front end requires to be present. For example, the Objective-C++ front end uses source files from the C++ and Objective-C front ends, so sets ‘subdir_requires="cp objc"’.
target_libs
If defined, this variable lists (space-separated) targets in the top level Makefile to build the runtime libraries for this language, such as target-libobjc.
lang_dirs
If defined, this variable lists (space-separated) top level directories (parallel to gcc), apart from the runtime libraries, that should not be configured if this front end is not built.
build_by_default
If defined to ‘no’, this language front end is not built unless enabled in a --enable-languages argument. Otherwise, front ends are built by default, subject to any special logic in configure.ac (as is present to disable the Ada front end if the Ada compiler is not already installed).
boot_language
If defined to ‘yes’, this front end is built in stage1 of the bootstrap. This is only relevant to front ends written in their own languages.
compilers
If defined, a space-separated list of compiler executables that will be run by the driver. The names here will each end with ‘\$(exeext)’.
outputs
If defined, a space-separated list of files that should be generated by configure substituting values in them. This mechanism can be used to create a file language/Makefile from language/Makefile.in, but this is deprecated; building everything from the single gcc/Makefile is preferred.
gtfiles
If defined, a space-separated list of files that should be scanned by gengtype.c to generate the garbage collection tables and routines for this language. This excludes the files that are common to all front ends. See Memory Management and Type Information.
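Putting several of these variables together, a hypothetical config-lang.in for an imaginary front end “foo” might look like the following sketch (every value here is invented for illustration):

```shell
# config-lang.in for a hypothetical front end "foo" (sketch only).
language="foo"                       # name used with --enable-languages
compilers="foo1\$(exeext)"           # compiler proper invoked by the driver
target_libs="target-libfoo"          # top level runtime-library target
build_by_default="no"                # only built when explicitly enabled
gtfiles="\$(srcdir)/foo/foo-lang.c"  # scanned by gengtype
```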
Each language subdirectory contains a Make-lang.in file. It contains targets lang.hook (where lang is the setting of language in config-lang.in) for the following values of hook, and any other Makefile rules required to build those targets (which may if necessary use other Makefiles specified in outputs in config-lang.in, although this is deprecated). It also adds any testsuite targets that can use the standard rule in gcc/Makefile.in to the variable lang_checks.
all.cross
start.encap
rest.encap
FIXME: exactly what goes in each of these targets?
tags
Build an etags TAGS file in the language subdirectory in the source tree.
info
Build info documentation for the front end, in the build directory. This target is only called by ‘make bootstrap’ if a suitable version of makeinfo is available, so does not need to check for this, and should fail if an error occurs.
dvi
Build DVI documentation for the front end, in the build directory. This should be done using $(TEXI2DVI), with appropriate -I arguments pointing to directories of included files.
pdf
Build PDF documentation for the front end, in the build directory. This should be done using $(TEXI2PDF), with appropriate -I arguments pointing to directories of included files.
html
Build HTML documentation for the front end, in the build directory.
man
Build generated man pages for the front end from Texinfo manuals (see Man Page Generation), in the build directory. This target is only called if the necessary tools are available, but should ignore errors so as not to stop the build if errors occur; man pages are optional and the tools involved may be installed in a broken way.
install-common
Install everything that is part of the front end, apart from the compiler executables listed in compilers in config-lang.in.
install-info
Install info documentation for the front end, if it is present in the source directory. This target should have dependencies on info files that should be installed.
install-man
Install man pages for the front end. This target should ignore errors.
install-plugin
Install headers needed for plugins.
srcextra
Copies its dependencies into the source directory. This generally should be used for generated files such as Bison output files which are not version-controlled, but should be included in any release tarballs. This target will be executed during a bootstrap if ‘--enable-generated-files-in-srcdir’ was specified as a configure option.
srcinfo
srcman
Copies its dependencies into the source directory. These targets will be executed during a bootstrap if ‘--enable-generated-files-in-srcdir’ was specified as a configure option.
uninstall
Uninstall files installed by installing the compiler. This is currently documented not to be supported, so the hook need not do anything.
mostlyclean
clean
distclean
maintainer-clean
The language parts of the standard GNU ‘*clean’ targets. See Standard Targets for Users in GNU Coding Standards, for details of the standard targets. For GCC, maintainer-clean should delete all generated files in the source directory that are not version-controlled, but should not delete anything that is.
Make-lang.in must also define a variable lang_OBJS to a list of host object files that are used by that language.
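As a hedged sketch of the hooks described above (the language “foo”, its file names, and the install path are all invented; a real Make-lang.in defines every listed hook), a skeletal Make-lang.in might contain:

```make
# Make-lang.in skeleton for a hypothetical front end "foo".
foo.all.cross:
foo.start.encap:
foo.rest.encap:
foo.info: doc/foo.info
foo.dvi: doc/foo.dvi
foo.install-common: installdirs
	$(INSTALL_PROGRAM) foo1$(exeext) $(DESTDIR)$(libexecsubdir)/foo1$(exeext)
foo.mostlyclean:
	-rm -f foo/*$(objext)

foo_OBJS = foo/foo-lang.o foo/foo-parse.o
lang_checks += check-foo
```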
A back end for a target architecture in GCC has the following parts:
Entries in the extra_options variable in config.gcc. See Option specification files.
Documentation for any target-specific attributes supported (via __attribute__), including where the same attribute is already supported on some targets, which are enumerated in the manual.
FIXME: the libstdc++ porting manual needs to be installed as info for this to work, or to be a chapter of this manual.
The machine.h header is included very early in GCC’s standard sequence of header files, while machine-protos.h is included late in the sequence. Thus machine-protos.h can include declarations referencing types that are not defined when machine.h is included, specifically including those from rtl.h and tree.h. Since both RTL and tree types may not be available in every context where machine-protos.h is included, in this file you should guard declarations using these types inside appropriate #ifdef RTX_CODE or #ifdef TREE_CODE conditional code segments.
If the backend uses shared data structures that require GTY markers for garbage collection (see Memory Management and Type Information), you must declare those in machine.h rather than machine-protos.h. Any definitions required for building libgcc must also go in machine.h.
GCC uses the macro IN_TARGET_CODE to distinguish between machine-specific .c and .cc files and machine-independent .c and .cc files. Machine-specific files should use the directive:
#define IN_TARGET_CODE 1
before including config.h.
If the back end is added to the official GCC source repository, the following are also necessary:
GCC contains several testsuites to help maintain compiler quality. Most of the runtime libraries and language front ends in GCC have testsuites. Currently only the C language testsuites are documented here; FIXME: document the others.
In general, C testcases have a trailing -n.c, starting with -1.c, in case other testcases with similar names are added later. If the test is a test of some well-defined feature, it should have a name referring to that feature such as feature-1.c. If it does not test a well-defined feature but just happens to exercise a bug somewhere in the compiler, and a bug report has been filed for this bug in the GCC bug database, prbug-number-1.c is the appropriate form of name. Otherwise (for miscellaneous bugs not filed in the GCC bug database), and previously more generally, test cases are named after the date on which they were added. This allows people to tell at a glance whether a test failure is because of a recently found bug that has not yet been fixed, or whether it may be a regression, but does not give any other information about the bug or where discussion of it may be found. Some other language testsuites follow similar conventions.
In the gcc.dg testsuite, it is often necessary to test that an error is indeed a hard error and not just a warning—for example, where it is a constraint violation in the C standard, which must become an error with -pedantic-errors. The following idiom, where the first line shown is line line of the file and the line that generates the error, is used for this:
/* { dg-bogus "warning" "warning in place of error" } */
/* { dg-error "regexp" "message" { target *-*-* } line } */
It may be necessary to check that an expression is an integer constant expression and has a certain value. To check that E has value V, an idiom similar to the following is used:
char x[((E) == (V) ? 1 : -1)];
In gcc.dg tests, __typeof__ is sometimes used to make assertions about the types of expressions. See, for example, gcc.dg/c99-condexpr-1.c. The more subtle uses depend on the exact rules for the types of conditional expressions in the C standard; see, for example, gcc.dg/c99-intconst-1.c.
It is useful to be able to test that optimizations are being made properly. This cannot be done in all cases, but it can be done where the optimization will lead to code being optimized away (for example, where flow analysis or alias analysis should show that certain code cannot be called) or to functions not being called because they have been expanded as built-in functions. Such tests go in gcc.c-torture/execute. Where code should be optimized away, a call to a nonexistent function such as link_failure () may be inserted; a definition
#ifndef __OPTIMIZE__
void link_failure (void) { abort (); }
#endif
will also be needed so that linking still succeeds when the test is run without optimization. When all calls to a built-in function should have been optimized and no calls to the non-built-in version of the function should remain, that function may be defined as static to call abort () (although redeclaring a function as static may not work on all targets).
All testcases must be portable. Target-specific testcases must have appropriate code to avoid causing failures on unsupported systems; unfortunately, the mechanisms for this differ by directory.
FIXME: discuss non-C testsuites here.
Test directives appear within comments in a test source file and begin with dg-. Some of these are defined within DejaGnu and others are local to the GCC testsuite.
The order in which test directives appear in a test can be important: directives local to GCC sometimes override information used by the DejaGnu directives, which know nothing about the GCC directives, so the DejaGnu directives must precede GCC directives.
Several test directives include selectors (see Selecting targets to which a test applies) which are usually preceded by the keyword target or xfail.
{ dg-do do-what-keyword [{ target/xfail selector }] }
do-what-keyword specifies how the test is compiled and whether it is executed. It is one of:
preprocess
Compile with -E to run only the preprocessor.
compile
Compile with -S to produce an assembly code file.
assemble
Compile with -c to produce a relocatable object file.
link
Compile, assemble, and link to produce an executable file.
run
Produce and run an executable file, which is expected to return an exit code of 0.
The default is compile. That can be overridden for a set of tests by redefining dg-do-what-default within the .exp file for those tests.
If the directive includes the optional ‘{ target selector }’ then the test is skipped unless the target system matches the selector.
If do-what-keyword is run and the directive includes the optional ‘{ xfail selector }’ and the selector is met then the test is expected to fail. The xfail clause is ignored for other values of do-what-keyword; those tests can use directive dg-xfail-if.
{ dg-options options [{ target selector }] }
This DejaGnu directive provides a list of compiler options, to be used if the target system matches selector, that replace the default options used for this set of tests.
{ dg-add-options feature … }
Add any compiler options that are needed to access certain features. This directive does nothing on targets that enable the features by default, or that don’t provide them at all. It must come after all dg-options directives. For supported values of feature see Features for dg-add-options.
{ dg-additional-options options [{ target selector }] }
This directive provides a list of compiler options, to be used if the target system matches selector, that are added to the default options used for this set of tests.
The normal timeout limit, in seconds, is found by searching the following in order:
a dg-timeout directive in the test
{ dg-timeout n [{target selector }] }
Set the time limit for the compilation and for the execution of the test to the specified number of seconds.
{ dg-timeout-factor x [{ target selector }] }
Multiply the normal time limit for compilation and execution of the test by the specified floating-point factor.
{ dg-skip-if comment { selector } [{ include-opts } [{ exclude-opts }]] }
Arguments include-opts and exclude-opts are lists in which each element is a string of zero or more GCC options. Skip the test if all of the following conditions are met:
For example, to skip a test if option -Os is present:
/* { dg-skip-if "" { *-*-* } { "-Os" } { "" } } */
To skip a test if both options -O2 and -g are present:
/* { dg-skip-if "" { *-*-* } { "-O2 -g" } { "" } } */
To skip a test if either -O2 or -O3 is present:
/* { dg-skip-if "" { *-*-* } { "-O2" "-O3" } { "" } } */
To skip a test unless option -Os is present:
/* { dg-skip-if "" { *-*-* } { "*" } { "-Os" } } */
To skip a test if either -O2 or -O3 is used with -g but not if -fpic is also present:
/* { dg-skip-if "" { *-*-* } { "-O2 -g" "-O3 -g" } { "-fpic" } } */
{ dg-require-effective-target keyword [{ target selector }] }
Skip the test if the test target, including current multilib flags, is not covered by the effective-target keyword. If the directive includes the optional ‘{ selector }’ then the effective-target test is only performed if the target system matches the selector. This directive must appear after any dg-do directive in the test and before any dg-additional-sources directive. See Keywords describing target attributes.
{ dg-require-support args }
Skip the test if the target does not provide the required support. These directives must appear after any dg-do directive in the test and before any dg-additional-sources directive. They require at least one argument, which can be an empty string if the specific procedure does not examine the argument. See Variants of dg-require-support, for a complete list of these directives.
{ dg-xfail-if comment { selector } [{ include-opts } [{ exclude-opts }]] }
Expect the test to fail if the conditions (which are the same as for dg-skip-if) are met. This does not affect the execute step.
{ dg-xfail-run-if comment { selector } [{ include-opts } [{ exclude-opts }]] }
Expect the execute step of a test to fail if the conditions (which are the same as for dg-skip-if) are met.
{ dg-ice comment [{ selector } [{ include-opts } [{ exclude-opts }]]] }
Expect the compiler to crash with an internal compiler error and return a nonzero exit status if the conditions (which are the same as for dg-skip-if) are met. Used for tests for bugs that have not yet been fixed.
{ dg-shouldfail comment [{ selector } [{ include-opts } [{ exclude-opts }]]] }
Expect the test executable to return a nonzero exit status if the conditions (which are the same as for dg-skip-if) are met.
Where line is an accepted argument for these commands, a value of ‘0’ can be used if there is no line associated with the message.
{ dg-error regexp [comment [{ target/xfail selector } [line] ]] }
This DejaGnu directive appears on a source line that is expected to get an error message, or else specifies the source line associated with the message. If there is no message for that line or if the text of that message is not matched by regexp then the check fails and comment is included in the FAIL message. The check does not look for the string ‘error’ unless it is part of regexp.
{ dg-warning regexp [comment [{ target/xfail selector } [line] ]] }
This DejaGnu directive appears on a source line that is expected to get a warning message, or else specifies the source line associated with the message. If there is no message for that line or if the text of that message is not matched by regexp then the check fails and comment is included in the FAIL message. The check does not look for the string ‘warning’ unless it is part of regexp.
{ dg-message regexp [comment [{ target/xfail selector } [line] ]] }
The line is expected to get a message other than an error or warning. If there is no message for that line or if the text of that message is not matched by regexp then the check fails and comment is included in the FAIL message.
{ dg-bogus regexp [comment [{ target/xfail selector } [line] ]] }
This DejaGnu directive appears on a source line that should not get a message matching regexp, or else specifies the source line associated with the bogus message. It is usually used with ‘xfail’ to indicate that the message is a known problem for a particular set of targets.
{ dg-line linenumvar }
This DejaGnu directive sets the variable linenumvar to the line number of the source line. The variable linenumvar can then be used in subsequent dg-error, dg-warning, dg-message and dg-bogus directives. For example:
int a;   /* { dg-line first_def_a } */
float a; /* { dg-error "conflicting types of" } */
  /* { dg-message "previous declaration of" "" { target *-*-* } first_def_a } */
{ dg-excess-errors comment [{ target/xfail selector }] }
This DejaGnu directive indicates that the test is expected to fail due to compiler messages that are not handled by ‘dg-error’, ‘dg-warning’ or ‘dg-bogus’. For this directive ‘xfail’ has the same effect as ‘target’.
{ dg-prune-output regexp }
Prune messages matching regexp from the test output.
{ dg-output regexp [{ target/xfail selector }] }
This DejaGnu directive compares regexp to the combined output that the test executable writes to stdout and stderr.
{ dg-set-compiler-env-var var_name "var_value" }
Specify that the environment variable var_name needs to be set to var_value before invoking the compiler on the test file.
{ dg-set-target-env-var var_name "var_value" }
Specify that the environment variable var_name needs to be set to var_value before execution of the program created by the test.
{ dg-additional-files "filelist" }
Specify additional files, other than source files, that must be copied to the system where the compiler runs.
{ dg-additional-sources "filelist" }
Specify additional source files to appear in the compile line following the main test file.
{ dg-final { local-directive } }
This DejaGnu directive is placed within a comment anywhere in the source file and is processed after the test has been compiled and run. Multiple ‘dg-final’ commands are processed in the order in which they appear in the source file. See Commands for use in dg-final, for a list of directives that can be used within dg-final.
Several test directives include selectors to limit the targets for which a test is run or to declare that a test is expected to fail on particular targets.
A selector is:
Depending on the context, the selector specifies whether a test is skipped and reported as unsupported or is expected to fail. A context that allows either ‘target’ or ‘xfail’ also allows ‘{ target selector1 xfail selector2 }’ to skip the test for targets that don’t match selector1 and the test to fail for targets that match selector2.
A selector expression appears within curly braces and uses a single logical operator: one of ‘!’, ‘&&’, or ‘||’. An operand is another selector expression, an effective-target keyword, a single target triplet, or a list of target triplets within quotes or curly braces. For example:
{ target { ! "hppa*-*-* ia64*-*-*" } } { target { powerpc*-*-* && lp64 } } { xfail { lp64 || vect_no_align } }
Effective-target keywords identify sets of targets that support particular functionality. They are used to limit tests to be run only for particular targets, or to specify that particular sets of targets are expected to fail some tests.
Effective-target keywords are defined in lib/target-supports.exp in the GCC testsuite, with the exception of those that are documented as being local to a particular test directory.
The ‘effective target’ takes into account all of the compiler options with which the test will be compiled, including the multilib options. By convention, keywords ending in _nocache can also include options specified for the particular test in an earlier dg-options or dg-add-options directive.
be
Target uses big-endian memory order for multi-byte and multi-word data.
le
Target uses little-endian memory order for multi-byte and multi-word data.
ilp32
Target has 32-bit int, long, and pointers.
lp64
Target has 32-bit int, 64-bit long and pointers.
llp64
Target has 32-bit int and long, 64-bit long long and pointers.
double64
Target has 64-bit double.
double64plus
Target has double that is 64 bits or longer.
longdouble128
Target has 128-bit long double.
int32plus
Target has int that is 32 bits or longer.
int16
Target has int that is 16 bits or shorter.
longlong64
Target has 64-bit long long.
long_neq_int
Target has int and long with different sizes.
short_eq_int
Target has short and int with the same size.
ptr_eq_short
Target has pointers (void *) and short with the same size.
int_eq_float
Target has int and float with the same size.
ptr_eq_long
Target has pointers (void *) and long with the same size.
large_double
Target supports double that is longer than float.
large_long_double
Target supports long double that is longer than double.
ptr32plus
Target has pointers that are 32 bits or longer.
size20plus
Target has a 20-bit or larger address space, so supports at least 16-bit array and structure sizes.
size24plus
Target has a 24-bit or larger address space, so supports at least 20-bit array and structure sizes.
size32plus
Target has a 32-bit or larger address space, so supports at least 24-bit array and structure sizes.
4byte_wchar_t
Target has wchar_t that is at least 4 bytes.
floatn
Target has the _Floatn type.
floatnx
Target has the _Floatnx type.
floatn_runtime
Target has the _Floatn type, including runtime support for any options added with dg-add-options.
floatnx_runtime
Target has the _Floatnx type, including runtime support for any options added with dg-add-options.
floatn_nx_runtime
Target has runtime support for any options added with dg-add-options for any _Floatn or _Floatnx type.
inf
Target supports floating point infinite (inf) for type double.
inff
Target supports floating point infinite (inf) for type float.
fortran_integer_16
Target supports Fortran integer that is 16 bytes or longer.
fortran_real_10
Target supports Fortran real that is 10 bytes or longer.
fortran_real_16
Target supports Fortran real that is 16 bytes or longer.
fortran_large_int
Target supports Fortran integer kinds larger than integer(8).
fortran_large_real
Target supports Fortran real kinds larger than real(8).
vect_align_stack_vars
The target’s ABI allows stack variables to be aligned to the preferred vector alignment.
vect_avg_qi
Target supports both signed and unsigned averaging operations on vectors of bytes.
vect_mulhrs_hi
Target supports both signed and unsigned multiply-high-with-round-and-scale operations on vectors of half-words.
vect_sdiv_pow2_si
Target supports signed division by constant power-of-2 operations on vectors of 4-byte integers.
vect_condition
Target supports vector conditional operations.
vect_cond_mixed
Target supports vector conditional operations where comparison operands have different type from the value operands.
vect_double
Target supports hardware vectors of double.
vect_double_cond_arith
Target supports conditional addition, subtraction, multiplication, division, minimum and maximum on vectors of double, via the cond_ optabs.
vect_element_align_preferred
The target’s preferred vector alignment is the same as the element alignment.
vect_float
Target supports hardware vectors of float when -funsafe-math-optimizations is in effect.
vect_float_strict
Target supports hardware vectors of float when -funsafe-math-optimizations is not in effect. This implies vect_float.
vect_int
Target supports hardware vectors of int.
vect_long
Target supports hardware vectors of long.
vect_long_long
Target supports hardware vectors of long long.
vect_check_ptrs
Target supports the check_raw_ptrs and check_war_ptrs optabs on vectors.
vect_fully_masked
Target supports fully-masked (also known as fully-predicated) loops, so that vector loops can handle partial as well as full vectors.
vect_masked_load
Target supports vector masked loads.
vect_masked_store
Target supports vector masked stores.
vect_scatter_store
Target supports vector scatter stores.
vect_aligned_arrays
Target aligns arrays to vector alignment boundary.
vect_hw_misalign
Target supports a vector misalign access.
vect_no_align
Target does not support a vector alignment mechanism.
vect_peeling_profitable
Target might need to peel loops for alignment purposes.
vect_no_int_min_max
Target does not support a vector min and max instruction on int.
vect_no_int_add
Target does not support a vector add instruction on int.
vect_no_bitwise
Target does not support vector bitwise instructions.
vect_bool_cmp
Target supports comparison of bool vectors for at least one vector length.
vect_char_add
Target supports addition of char vectors for at least one vector length.
vect_char_mult
Target supports vector char multiplication.
vect_short_mult
Target supports vector short multiplication.
vect_int_mult
Target supports vector int multiplication.
vect_long_mult
Target supports 64 bit vector long multiplication.
vect_extract_even_odd
Target supports vector even/odd element extraction.
vect_extract_even_odd_wide
Target supports vector even/odd element extraction of vectors with elements SImode or larger.
vect_interleave
Target supports vector interleaving.
vect_strided
Target supports vector interleaving and extract even/odd.
vect_strided_wide
Target supports vector interleaving and extract even/odd for wide element types.
vect_perm
Target supports vector permutation.
vect_perm_byte
Target supports permutation of vectors with 8-bit elements.
vect_perm_short
Target supports permutation of vectors with 16-bit elements.
vect_perm3_byte
Target supports permutation of vectors with 8-bit elements, and for the default vector length it is possible to permute:
{ a0, a1, a2, b0, b1, b2, … }
to:
{ a0, a0, a0, b0, b0, b0, … }
{ a1, a1, a1, b1, b1, b1, … }
{ a2, a2, a2, b2, b2, b2, … }
using only two-vector permutes, regardless of how long the sequence is.
vect_perm3_int
Like vect_perm3_byte, but for 32-bit elements.
vect_perm3_short
Like vect_perm3_byte, but for 16-bit elements.
vect_shift
Target supports a hardware vector shift operation.
vect_unaligned_possible
Target prefers vectors to have an alignment greater than element alignment, but also allows unaligned vector accesses in some circumstances.
vect_variable_length
Target has variable-length vectors.
vect_widen_sum_hi_to_si
Target supports a vector widening summation of short operands into int results, or can promote (unpack) from short to int.
vect_widen_sum_qi_to_hi
Target supports a vector widening summation of char operands into short results, or can promote (unpack) from char to short.
vect_widen_sum_qi_to_si
Target supports a vector widening summation of char operands into int results.
vect_widen_mult_qi_to_hi
Target supports a vector widening multiplication of char operands into short results, or can promote (unpack) from char to short and perform non-widening multiplication of short.
vect_widen_mult_hi_to_si
Target supports a vector widening multiplication of short operands into int results, or can promote (unpack) from short to int and perform non-widening multiplication of int.
vect_widen_mult_si_to_di_pattern
Target supports a vector widening multiplication of int operands into long results.
vect_sdot_qi
Target supports a vector dot-product of signed char.
vect_udot_qi
Target supports a vector dot-product of unsigned char.
vect_sdot_hi
Target supports a vector dot-product of signed short.
vect_udot_hi
Target supports a vector dot-product of unsigned short.
vect_pack_trunc
Target supports a vector demotion (packing) of short to char and from int to short using modulo arithmetic.
vect_unpack
Target supports a vector promotion (unpacking) of char to short and from char to int.
vect_intfloat_cvt
Target supports conversion from signed int to float.
vect_uintfloat_cvt
Target supports conversion from unsigned int to float.
vect_floatint_cvt
Target supports conversion from float to signed int.
vect_floatuint_cvt
Target supports conversion from float to unsigned int.
vect_intdouble_cvt
Target supports conversion from signed int to double.
vect_doubleint_cvt
Target supports conversion from double to signed int.
vect_max_reduc
Target supports max reduction for vectors.
vect_sizes_16B_8B
Target supports 16-byte and 8-byte vectors.
vect_sizes_32B_16B
Target supports 32-byte and 16-byte vectors.
vect_logical_reduc
Target supports AND, IOR and XOR reduction on vectors.
vect_fold_extract_last
Target supports the fold_extract_last optab.
vect_len_load_store
Target supports the len_load and len_store optabs.
vect_partial_vectors_usage_1
Target supports loop vectorization with partial vectors and vect-partial-vector-usage is set to 1.
vect_partial_vectors_usage_2
Target supports loop vectorization with partial vectors and vect-partial-vector-usage is set to 2.
vect_partial_vectors
Target supports loop vectorization with partial vectors and vect-partial-vector-usage is nonzero.
tls
Target supports thread-local storage.
tls_native
Target supports native (rather than emulated) thread-local storage.
tls_runtime
Test system supports executing TLS executables.
dfp
Target supports compiling the decimal floating point extension to C.
dfp_nocache
Including the options used to compile this particular test, the target supports compiling decimal floating point extension to C.
dfprt
Test system can execute decimal floating point tests.
dfprt_nocache
Including the options used to compile this particular test, the test system can execute decimal floating point tests.
hard_dfp
Target generates decimal floating point instructions with current options.
arm32
ARM target generates 32-bit code.
arm_little_endian
ARM target that generates little-endian code.
arm_eabi
ARM target adheres to the ABI for the ARM Architecture.
arm_fp_ok
ARM target defines __ARM_FP using -mfloat-abi=softfp or equivalent options. Some multilibs may be incompatible with these options.
arm_fp_dp_ok
ARM target defines __ARM_FP with double-precision support using -mfloat-abi=softfp or equivalent options. Some multilibs may be incompatible with these options.
arm_hf_eabi
ARM target adheres to the VFP and Advanced SIMD Register Arguments variant of the ABI for the ARM Architecture (as selected with -mfloat-abi=hard).
arm_softfloat
ARM target uses emulated floating point operations.
arm_hard_vfp_ok
ARM target supports -mfpu=vfp -mfloat-abi=hard. Some multilibs may be incompatible with these options.
arm_iwmmxt_ok
ARM target supports -mcpu=iwmmxt. Some multilibs may be incompatible with this option.
arm_neon
ARM target supports generating NEON instructions.
arm_tune_string_ops_prefer_neon
Test CPU tune supports inlining string operations with NEON instructions.
arm_neon_hw
Test system supports executing NEON instructions.
arm_neonv2_hw
Test system supports executing NEON v2 instructions.
arm_neon_ok
ARM Target supports -mfpu=neon -mfloat-abi=softfp or compatible options. Some multilibs may be incompatible with these options.
arm_neon_ok_no_float_abi
ARM Target supports NEON with -mfpu=neon, but without any -mfloat-abi= option. Some multilibs may be incompatible with this option.
arm_neonv2_ok
ARM Target supports -mfpu=neon-vfpv4 -mfloat-abi=softfp or compatible options. Some multilibs may be incompatible with these options.
arm_fp16_ok
Target supports options to generate VFP half-precision floating-point instructions. Some multilibs may be incompatible with these options. This test is valid for ARM only.
arm_fp16_hw
Target supports executing VFP half-precision floating-point instructions. This test is valid for ARM only.
arm_neon_fp16_ok
ARM Target supports -mfpu=neon-fp16 -mfloat-abi=softfp or compatible options, including -mfp16-format=ieee if necessary to obtain the __fp16 type. Some multilibs may be incompatible with these options.
arm_neon_fp16_hw
Test system supports executing Neon half-precision float instructions. (Implies previous.)
arm_fp16_alternative_ok
ARM target supports the ARM FP16 alternative format. Some multilibs may be incompatible with the options needed.
arm_fp16_none_ok
ARM target supports specifying none as the ARM FP16 format.
arm_thumb1_ok
ARM target generates Thumb-1 code for -mthumb.
arm_thumb2_ok
ARM target generates Thumb-2 code for -mthumb.
arm_nothumb
ARM target that is not using Thumb.
arm_vfp_ok
ARM target supports -mfpu=vfp -mfloat-abi=softfp. Some multilibs may be incompatible with these options.
arm_vfp3_ok
ARM target supports -mfpu=vfp3 -mfloat-abi=softfp. Some multilibs may be incompatible with these options.
arm_arch_v8a_hard_ok
The compiler is targeting arm*-*-* and can compile and assemble code using the options -march=armv8-a -mfpu=neon-fp-armv8 -mfloat-abi=hard. This is not enough to guarantee that linking works.
arm_arch_v8a_hard_multilib
The compiler is targeting arm*-*-* and can build programs using the options -march=armv8-a -mfpu=neon-fp-armv8 -mfloat-abi=hard. The target can also run the resulting binaries.
arm_v8_vfp_ok
ARM target supports -mfpu=fp-armv8 -mfloat-abi=softfp. Some multilibs may be incompatible with these options.
arm_v8_neon_ok
ARM target supports -mfpu=neon-fp-armv8 -mfloat-abi=softfp. Some multilibs may be incompatible with these options.
arm_v8_1a_neon_ok
ARM target supports options to generate ARMv8.1-A Adv.SIMD instructions. Some multilibs may be incompatible with these options.
arm_v8_1a_neon_hw
ARM target supports executing ARMv8.1-A Adv.SIMD instructions. Some multilibs may be incompatible with the options needed. Implies arm_v8_1a_neon_ok.
arm_acq_rel
ARM target supports acquire-release instructions.
arm_v8_2a_fp16_scalar_ok
ARM target supports options to generate instructions for ARMv8.2-A and scalar instructions from the FP16 extension. Some multilibs may be incompatible with these options.
arm_v8_2a_fp16_scalar_hw
ARM target supports executing instructions for ARMv8.2-A and scalar instructions from the FP16 extension. Some multilibs may be incompatible with these options. Implies arm_v8_2a_fp16_neon_ok.
arm_v8_2a_fp16_neon_ok
ARM target supports options to generate instructions from ARMv8.2-A with the FP16 extension. Some multilibs may be incompatible with these options. Implies arm_v8_2a_fp16_scalar_ok.
arm_v8_2a_fp16_neon_hw
ARM target supports executing instructions from ARMv8.2-A with the FP16 extension. Some multilibs may be incompatible with these options. Implies arm_v8_2a_fp16_neon_ok and arm_v8_2a_fp16_scalar_hw.
arm_v8_2a_dotprod_neon_ok
ARM target supports options to generate instructions from ARMv8.2-A with the Dot Product extension. Some multilibs may be incompatible with these options.
arm_v8_2a_dotprod_neon_hw
ARM target supports executing instructions from ARMv8.2-A with the Dot Product extension. Some multilibs may be incompatible with these options. Implies arm_v8_2a_dotprod_neon_ok.
arm_fp16fml_neon_ok
ARM target supports extensions to generate the VFMAL and VFMLS half-precision floating-point instructions available from ARMv8.2-A and onwards. Some multilibs may be incompatible with these options.
arm_v8_2a_bf16_neon_ok
ARM target supports options to generate instructions from ARMv8.2-A with the BFloat16 extension (bf16). Some multilibs may be incompatible with these options.
arm_v8_2a_i8mm_ok
ARM target supports options to generate instructions from ARMv8.2-A with the 8-Bit Integer Matrix Multiply extension (i8mm). Some multilibs may be incompatible with these options.
arm_v8_1m_mve_ok
ARM target supports options to generate instructions from ARMv8.1-M with the M-Profile Vector Extension (MVE). Some multilibs may be incompatible with these options.
arm_v8_1m_mve_fp_ok
ARM target supports options to generate instructions from ARMv8.1-M with the Half-precision floating-point instructions (HP), Floating-point Extension (FP) along with M-Profile Vector Extension (MVE). Some multilibs may be incompatible with these options.
arm_mve_hw
Test system supports executing MVE instructions.
arm_v8m_main_cde
ARM target supports options to generate instructions from ARMv8-M with the Custom Datapath Extension (CDE). Some multilibs may be incompatible with these options.
arm_v8m_main_cde_fp
ARM target supports options to generate instructions from ARMv8-M with the Custom Datapath Extension (CDE) and floating-point (VFP). Some multilibs may be incompatible with these options.
arm_v8_1m_main_cde_mve
ARM target supports options to generate instructions from ARMv8.1-M with the Custom Datapath Extension (CDE) and M-Profile Vector Extension (MVE). Some multilibs may be incompatible with these options.
arm_prefer_ldrd_strd
ARM target prefers LDRD and STRD instructions over LDM and STM instructions.
arm_thumb1_movt_ok
ARM target generates Thumb-1 code for -mthumb with MOVW and MOVT instructions available.
arm_thumb1_cbz_ok
ARM target generates Thumb-1 code for -mthumb with CBZ and CBNZ instructions available.
arm_divmod_simode
ARM target for which the divmod transform is disabled although it supports a hardware divide instruction.
arm_cmse_ok
ARM target supports ARMv8-M Security Extensions, enabled by the -mcmse option.
arm_coproc1_ok
ARM target supports the following coprocessor instructions: CDP, LDC, STC, MCR and MRC.
arm_coproc2_ok
ARM target supports all the coprocessor instructions also listed as supported in arm_coproc1_ok in addition to the following: CDP2, LDC2, LDC2l, STC2, STC2l, MCR2 and MRC2.
arm_coproc3_ok
ARM target supports all the coprocessor instructions also listed as supported in arm_coproc2_ok in addition to the following: MCRR and MRRC.
arm_coproc4_ok
ARM target supports all the coprocessor instructions also listed as supported in arm_coproc3_ok in addition to the following: MCRR2 and MRRC2.
arm_simd32_ok
ARM Target supports options suitable for accessing the SIMD32 intrinsics from arm_acle.h. Some multilibs may be incompatible with these options.
arm_qbit_ok
ARM Target supports options suitable for accessing the Q-bit manipulation intrinsics from arm_acle.h. Some multilibs may be incompatible with these options.
arm_dsp_ok
ARM Target supports options suitable for accessing the DSP intrinsics from arm_acle.h. Some multilibs may be incompatible with these options.
arm_softfp_ok
ARM target supports the -mfloat-abi=softfp option.
arm_hard_ok
ARM target supports the -mfloat-abi=hard option.
arm_v8_1_lob_ok
ARM Target supports executing the Armv8.1-M Mainline Low Overhead Loop instructions DLS and LE. Some multilibs may be incompatible with these options.
arm_thumb2_no_arm_v8_1_lob
ARM target where Thumb-2 is used without options but does not support executing the Armv8.1-M Mainline Low Overhead Loop instructions DLS and LE.
arm_thumb2_ok_no_arm_v8_1_lob
ARM target generates Thumb-2 code for -mthumb but does not support executing the Armv8.1-M Mainline Low Overhead Loop instructions DLS and LE.
aarch64_asm_<ext>_ok
AArch64 assembler supports the architecture extension ext via the .arch_extension pseudo-op.
aarch64_tiny
AArch64 target which generates instruction sequences for tiny memory model.
aarch64_small
AArch64 target which generates instruction sequences for small memory model.
aarch64_large
AArch64 target which generates instruction sequences for large memory model.
aarch64_little_endian
AArch64 target which generates instruction sequences for little endian.
aarch64_big_endian
AArch64 target which generates instruction sequences for big endian.
aarch64_small_fpic
Binutils installed on test system supports relocation types required by -fpic for AArch64 small memory model.
aarch64_sve_hw
AArch64 target that is able to generate and execute SVE code (regardless of whether it does so by default).
aarch64_sve128_hw
aarch64_sve256_hw
aarch64_sve512_hw
aarch64_sve1024_hw
aarch64_sve2048_hw
Like aarch64_sve_hw, but also test for an exact hardware vector length.
aarch64_fjcvtzs_hw
AArch64 target that is able to generate and execute armv8.3-a FJCVTZS instruction.
mips64
MIPS target supports 64-bit instructions.
nomips16
MIPS target does not produce MIPS16 code.
mips16_attribute
MIPS target can generate MIPS16 code.
mips_loongson
MIPS target is a Loongson-2E or -2F target using an ABI that supports the Loongson vector modes.
mips_msa
MIPS target supports -mmsa, MIPS SIMD Architecture (MSA).
mips_newabi_large_long_double
MIPS target supports long double larger than double when using the new ABI.
mpaired_single
MIPS target supports -mpaired-single.
msp430_small
MSP430 target has the small memory model enabled (-msmall).
msp430_large
MSP430 target has the large memory model enabled (-mlarge).
dfp_hw
PowerPC target supports executing hardware DFP instructions.
p8vector_hw
PowerPC target supports executing VSX instructions (ISA 2.07).
powerpc64
Test system supports executing 64-bit instructions.
powerpc_altivec
PowerPC target supports AltiVec.
powerpc_altivec_ok
PowerPC target supports -maltivec.
powerpc_eabi_ok
PowerPC target supports -meabi.
powerpc_elfv2
PowerPC target supports -mabi=elfv2.
powerpc_fprs
PowerPC target supports floating-point registers.
powerpc_hard_double
PowerPC target supports hardware double-precision floating-point.
powerpc_htm_ok
PowerPC target supports -mhtm.
powerpc_p8vector_ok
PowerPC target supports -mpower8-vector.
powerpc_popcntb_ok
PowerPC target supports the popcntb instruction, indicating that this target supports -mcpu=power5.
powerpc_ppu_ok
PowerPC target supports -mcpu=cell.
powerpc_spe
PowerPC target supports PowerPC SPE.
powerpc_spe_nocache
Including the options used to compile this particular test, the PowerPC target supports PowerPC SPE.
powerpc_spu
PowerPC target supports PowerPC SPU.
powerpc_vsx_ok
PowerPC target supports -mvsx.
powerpc_405_nocache
Including the options used to compile this particular test, the PowerPC target supports PowerPC 405.
ppc_recip_hw
PowerPC target supports executing reciprocal estimate instructions.
vmx_hw
PowerPC target supports executing AltiVec instructions.
vsx_hw
PowerPC target supports executing VSX instructions (ISA 2.06).
has_arch_pwr5
PowerPC target pre-defines macro _ARCH_PWR5, which means the -mcpu setting is Power5 or later.
has_arch_pwr6
PowerPC target pre-defines macro _ARCH_PWR6, which means the -mcpu setting is Power6 or later.
has_arch_pwr7
PowerPC target pre-defines macro _ARCH_PWR7, which means the -mcpu setting is Power7 or later.
has_arch_pwr8
PowerPC target pre-defines macro _ARCH_PWR8, which means the -mcpu setting is Power8 or later.
has_arch_pwr9
PowerPC target pre-defines macro _ARCH_PWR9, which means the -mcpu setting is Power9 or later.
autoincdec
Target supports autoincrement/decrement addressing.
avx
Target supports compiling avx instructions.
avx_runtime
Target supports the execution of avx instructions.
avx2
Target supports compiling avx2 instructions.
avx2_runtime
Target supports the execution of avx2 instructions.
avxvnni
Target supports the execution of avxvnni instructions.
avx512f
Target supports compiling avx512f instructions.
avx512f_runtime
Target supports the execution of avx512f instructions.
avx512vp2intersect
Target supports the execution of avx512vp2intersect instructions.
amx_tile
Target supports the execution of amx-tile instructions.
amx_int8
Target supports the execution of amx-int8 instructions.
amx_bf16
Target supports the execution of amx-bf16 instructions.
cell_hw
Test system can execute AltiVec and Cell PPU instructions.
coldfire_fpu
Target uses a ColdFire FPU.
divmod
Target supporting hardware divmod insn or divmod libcall.
divmod_simode
Target supporting hardware divmod insn or divmod libcall for SImode.
hard_float
Target supports FPU instructions.
non_strict_align
Target does not require strict alignment.
pie_copyreloc
The x86-64 target linker supports PIE with copy reloc.
rdrand
Target supports the x86 rdrand instruction.
sqrt_insn
Target has a square root instruction that the compiler can generate.
sse
Target supports compiling sse instructions.
sse_runtime
Target supports the execution of sse instructions.
sse2
Target supports compiling sse2 instructions.
sse2_runtime
Target supports the execution of sse2 instructions.
sync_char_short
Target supports atomic operations on char and short.
sync_int_long
Target supports atomic operations on int and long.
ultrasparc_hw
Test environment appears to run executables on a simulator that accepts only EM_SPARC executables and chokes on EM_SPARC32PLUS or EM_SPARCV9 executables.
vect_cmdline_needed
Target requires a command line argument to enable a SIMD instruction set.
xorsign
Target supports the xorsign optab expansion.
c
The language for the compiler under test is C.
c++
The language for the compiler under test is C++.
c99_runtime
Target provides a full C99 runtime.
correct_iso_cpp_string_wchar_protos
Target string.h and wchar.h headers provide the C++ required overloads for strchr etc. functions.
d_runtime
Target provides the D runtime.
d_runtime_has_std_library
Target provides the D standard library (Phobos).
dummy_wcsftime
Target uses a dummy wcsftime function that always returns zero.
fd_truncate
Target can truncate a file from a file descriptor, as used by libgfortran/io/unix.c:fd_truncate; i.e. ftruncate or chsize.
fenv
Target provides fenv.h include file.
fenv_exceptions
Target supports fenv.h with all the standard IEEE exceptions and floating-point exceptions are raised by arithmetic operations.
fenv_exceptions_dfp
Target supports fenv.h with all the standard IEEE exceptions and floating-point exceptions are raised by arithmetic operations for decimal floating point.
fileio
Target offers such file I/O library functions as fopen, fclose, tmpnam, and remove. This is a link-time requirement for the presence of the functions in the library; even if they fail at runtime, the requirement is still regarded as satisfied.
freestanding
Target is ‘freestanding’ as defined in section 4 of the C99 standard. Effectively, it is a target which supports no extra headers or libraries other than what is considered essential.
gettimeofday
Target supports gettimeofday.
init_priority
Target supports constructors with initialization priority arguments.
inttypes_types
Target has the basic signed and unsigned types in inttypes.h. This is for tests that GCC’s notions of these types agree with those in the header, as some systems have only inttypes.h.
lax_strtofp
Target might have errors of a few ULP in string to floating-point conversion functions and overflow is not always detected correctly by those functions.
mempcpy
Target provides the mempcpy function.
mmap
Target supports mmap.
newlib
Target supports Newlib.
newlib_nano_io
GCC was configured with --enable-newlib-nano-formatted-io, which reduces the code size of Newlib formatted I/O functions.
pow10
Target provides the pow10 function.
pthread
Target can compile using pthread.h with no errors or warnings.
pthread_h
Target has pthread.h.
run_expensive_tests
Expensive testcases (usually those that consume excessive amounts of CPU time) should be run on this target. This can be enabled by setting the GCC_TEST_RUN_EXPENSIVE environment variable to a non-empty string.
simulator
Test system runs executables on a simulator (i.e. slowly) rather than hardware (i.e. fast).
signal
Target has signal.h.
stabs
Target supports the stabs debugging format.
stdint_types
Target has the basic signed and unsigned C types in stdint.h. This will be obsolete when GCC ensures a working stdint.h for all targets.
stdint_types_mbig_endian
Target accepts the option -mbig-endian and stdint.h can be included without error when -mbig-endian is passed.
stpcpy
Target provides the stpcpy function.
sysconf
Target supports sysconf.
trampolines
Target supports trampolines.
uclibc
Target supports uClibc.
unwrapped
Target does not use a status wrapper.
vxworks_kernel
Target is a VxWorks kernel.
vxworks_rtp
Target is a VxWorks RTP.
wchar
Target supports wide characters.
R_flag_in_section
Target supports the ’R’ flag in .section directive in assembly inputs.
automatic_stack_alignment
Target supports automatic stack alignment.
branch_cost
Target supports -mbranch-cost=N.
cxa_atexit
Target uses __cxa_atexit.
default_packed
Target has packed layout of structure members by default.
exceptions
Target supports exceptions.
exceptions_enabled
Target supports exceptions and they are enabled in the current testing configuration.
fgraphite
Target supports Graphite optimizations.
fixed_point
Target supports fixed-point extension to C.
fopenacc
Target supports OpenACC via -fopenacc.
fopenmp
Target supports OpenMP via -fopenmp.
fpic
Target supports -fpic and -fPIC.
freorder
Target supports -freorder-blocks-and-partition.
fstack_protector
Target supports -fstack-protector.
gas
Target uses GNU as.
gc_sections
Target supports --gc-sections.
gld
Target uses GNU ld.
keeps_null_pointer_checks
Target keeps null pointer checks, either due to the use of -fno-delete-null-pointer-checks or hardwired into the target.
llvm_binutils
Target is using an LLVM assembler and/or linker, instead of GNU Binutils.
lra
Target supports local register allocator (LRA).
lto
Compiler has been configured to support link-time optimization (LTO).
lto_incremental
Compiler and linker support link-time optimization relocatable linking with -r and -flto options.
naked_functions
Target supports the naked function attribute.
named_sections
Target supports named sections.
natural_alignment_32
Target uses natural alignment (aligned to type size) for types of 32 bits or less.
natural_alignment_64
Target uses natural alignment (aligned to type size) for types of 64 bits or less.
noinit
Target supports the noinit variable attribute.
nonpic
Target does not generate PIC by default.
o_flag_in_section
Target supports the ’o’ flag in .section directive in assembly inputs.
offload_gcn
Target has been configured for OpenACC/OpenMP offloading on AMD GCN.
persistent
Target supports the persistent variable attribute.
pie_enabled
Target generates PIE by default.
pcc_bitfield_type_matters
Target defines PCC_BITFIELD_TYPE_MATTERS.
pe_aligned_commons
Target supports -mpe-aligned-commons.
pie
Target supports -pie, -fpie and -fPIE.
rdynamic
Target supports -rdynamic.
scalar_all_fma
Target supports all four fused multiply-add optabs for both float and double. These optabs are: fma_optab, fms_optab, fnma_optab and fnms_optab.
section_anchors
Target supports section anchors.
short_enums
Target defaults to short enums.
stack_size
Target has limited stack size. The stack size limit can be obtained using the STACK_SIZE macro defined by dg-add-options feature stack_size.
static
Target supports -static.
static_libgfortran
Target supports statically linking ‘libgfortran’.
string_merging
Target supports merging string constants at link time.
ucn
Target supports compiling and assembling UCN.
ucn_nocache
Including the options used to compile this particular test, the target supports compiling and assembling UCN.
unaligned_stack
Target does not guarantee that its STACK_BOUNDARY is greater than or equal to the required vector alignment.
vector_alignment_reachable
Vector alignment is reachable for types of 32 bits or less.
vector_alignment_reachable_for_64bit
Vector alignment is reachable for types of 64 bits or less.
wchar_t_char16_t_compatible
Target supports wchar_t that is compatible with char16_t.
wchar_t_char32_t_compatible
Target supports wchar_t that is compatible with char32_t.
comdat_group
Target uses comdat groups.
indirect_calls
Target supports indirect calls, i.e. calls where the target is not constant.
lgccjit
Target supports -lgccjit, i.e. libgccjit.so can be linked into jit tests.
gcc.target/i386
3dnow
Target supports compiling 3dnow instructions.
aes
Target supports compiling aes instructions.
fma4
Target supports compiling fma4 instructions.
mfentry
Target supports the -mfentry option that alters the position of profiling calls such that they precede the prologue.
ms_hook_prologue
Target supports attribute ms_hook_prologue.
pclmul
Target supports compiling pclmul instructions.
sse3
Target supports compiling sse3 instructions.
sse4
Target supports compiling sse4 instructions.
sse4a
Target supports compiling sse4a instructions.
ssse3
Target supports compiling ssse3 instructions.
vaes
Target supports compiling vaes instructions.
vpclmul
Target supports compiling vpclmul instructions.
xop
Target supports compiling xop instructions.
gcc.test-framework
no
Always returns 0.
yes
Always returns 1.
dg-add-options
The supported values of feature for directive dg-add-options are:
arm_fp
__ARM_FP definition. Only ARM targets support this feature, and only then in certain modes; see the arm_fp_ok effective target keyword.
arm_fp_dp
__ARM_FP definition with double-precision support. Only ARM targets support this feature, and only then in certain modes; see the arm_fp_dp_ok effective target keyword.
arm_neon
NEON support. Only ARM targets support this feature, and only then in certain modes; see the arm_neon_ok effective target keyword.
arm_fp16
VFP half-precision floating point support. This does not select the FP16 format; for that, use arm_fp16_ieee or arm_fp16_alternative instead. This feature is only supported by ARM targets and then only in certain modes; see the arm_fp16_ok effective target keyword.
arm_fp16_ieee
ARM IEEE 754-2008 format VFP half-precision floating point support. This feature is only supported by ARM targets and then only in certain modes; see the arm_fp16_ok effective target keyword.
arm_fp16_alternative
ARM Alternative format VFP half-precision floating point support. This feature is only supported by ARM targets and then only in certain modes; see the arm_fp16_ok effective target keyword.
arm_neon_fp16
NEON and half-precision floating point support. Only ARM targets support this feature, and only then in certain modes; see the arm_neon_fp16_ok effective target keyword.
arm_vfp3
arm vfp3 floating point support; see the arm_vfp3_ok effective target keyword.
arm_arch_v8a_hard
Add options for ARMv8-A and the hard-float variant of the AAPCS, if this is supported by the compiler; see the arm_arch_v8a_hard_ok effective target keyword.
arm_v8_1a_neon
Add options for ARMv8.1-A with Adv.SIMD support, if this is supported by the target; see the arm_v8_1a_neon_ok effective target keyword.
arm_v8_2a_fp16_scalar
Add options for ARMv8.2-A with scalar FP16 support, if this is supported by the target; see the arm_v8_2a_fp16_scalar_ok effective target keyword.
arm_v8_2a_fp16_neon
Add options for ARMv8.2-A with Adv.SIMD FP16 support, if this is supported by the target; see the arm_v8_2a_fp16_neon_ok effective target keyword.
arm_v8_2a_dotprod_neon
Add options for ARMv8.2-A with Adv.SIMD Dot Product support, if this is supported by the target; see the arm_v8_2a_dotprod_neon_ok effective target keyword.
arm_fp16fml_neon
Add options to enable generation of the VFMAL and VFMSL instructions, if this is supported by the target; see the arm_fp16fml_neon_ok effective target keyword.
arm_dsp
Add options for ARM DSP intrinsics support, if this is supported by the target; see the arm_dsp_ok effective target keyword.
bind_pic_locally
Add the target-specific flags needed to enable functions to bind locally when using pic/PIC passes in the testsuite.
floatn
Add the target-specific flags needed to use the _Floatn type.
floatnx
Add the target-specific flags needed to use the _Floatnx type.
ieee
Add the target-specific flags needed to enable full IEEE compliance mode.
mips16_attribute
mips16 function attributes. Only MIPS targets support this feature, and only then in certain modes.
stack_size
Add the flags needed to define macro STACK_SIZE and set it to the stack size limit associated with the stack_size effective target.
sqrt_insn
Add the target-specific flags needed to enable hardware square root instructions, if any.
tls
Add the target-specific flags needed to use thread-local storage.
dg-require-support
A few of the dg-require directives take arguments.
dg-require-iconv codeset
Skip the test if the target does not support iconv. codeset is the codeset to convert to.
dg-require-profiling profopt
Skip the test if the target does not support profiling with option profopt.
dg-require-stack-check check
Skip the test if the target does not support the -fstack-check option. If check is "", support for -fstack-check is checked, for -fstack-check=("check") otherwise.
dg-require-stack-size size
Skip the test if the target does not support a stack size of size.
dg-require-visibility vis
Skip the test if the target does not support the visibility attribute. If vis is "", support for visibility("hidden") is checked, for visibility("vis") otherwise.
The original dg-require directives were defined before there was support for effective-target keywords. The directives that do not take arguments could be replaced with effective-target keywords.
dg-require-alias ""
Skip the test if the target does not support the ‘alias’ attribute.
dg-require-ascii-locale ""
Skip the test if the host does not support an ASCII locale.
dg-require-compat-dfp ""
Skip this test unless both compilers in a compat testsuite support decimal floating point.
dg-require-cxa-atexit ""
Skip the test if the target does not support __cxa_atexit. This is equivalent to dg-require-effective-target cxa_atexit.
dg-require-dll ""
Skip the test if the target does not support DLL attributes.
dg-require-dot ""
Skip the test if the host does not have dot.
dg-require-fork ""
Skip the test if the target does not support fork.
dg-require-gc-sections ""
Skip the test if the target’s linker does not support the --gc-sections flag. This is equivalent to dg-require-effective-target gc-sections.
dg-require-host-local ""
Skip the test if the host is remote, rather than the same as the build system. Some tests are incompatible with DejaGnu’s handling of remote hosts, which involves copying the source file to the host and compiling it with a relative path and "-o a.out".
dg-require-mkfifo ""
Skip the test if the target does not support mkfifo.
dg-require-named-sections ""
Skip the test if the target does not support named sections. This is equivalent to dg-require-effective-target named_sections.
dg-require-weak ""
Skip the test if the target does not support weak symbols.
dg-require-weak-override ""
Skip the test if the target does not support overriding weak symbols.
dg-final
The GCC testsuite defines the following directives to be used within dg-final.
scan-file filename regexp [{ target/xfail selector }]
Passes if regexp matches text in filename.
scan-file-not filename regexp [{ target/xfail selector }]
Passes if regexp does not match text in filename.
scan-module module regexp [{ target/xfail selector }]
Passes if regexp matches in Fortran module module.
dg-check-dot filename
Passes if filename is a valid .dot file (by running dot -Tpng on it, and verifying the exit code is 0).
scan-assembler regex [{ target/xfail selector }]
Passes if regex matches text in the test’s assembler output.
scan-assembler-not regex [{ target/xfail selector }]
Passes if regex does not match text in the test’s assembler output.
scan-assembler-times regex num [{ target/xfail selector }]
Passes if regex is matched exactly num times in the test’s assembler output.
scan-assembler-dem regex [{ target/xfail selector }]
Passes if regex matches text in the test’s demangled assembler output.
scan-assembler-dem-not regex [{ target/xfail selector }]
Passes if regex does not match text in the test’s demangled assembler output.
scan-assembler-symbol-section functions section [{ target/xfail selector }]
Passes if functions are all in section. The caller needs to allow for USER_LABEL_PREFIX and different section name conventions.
scan-symbol-section filename functions section [{ target/xfail selector }]
Passes if functions are all in section in filename. The same caveats as for scan-assembler-symbol-section apply.
scan-hidden symbol [{ target/xfail selector }]
Passes if symbol is defined as a hidden symbol in the test’s assembly output.
scan-not-hidden symbol [{ target/xfail selector }]
Passes if symbol is not defined as a hidden symbol in the test’s assembly output.
check-function-bodies prefix terminator [options [{ target/xfail selector }]]
Looks through the source file for comments that give the expected assembly output for selected functions. Each line of expected output starts with the prefix string prefix and the expected output for a function as a whole is followed by a line that starts with the string terminator. Specifying an empty terminator is equivalent to specifying ‘"*/"’.
options, if specified, is a list of regular expressions, each of which matches a full command-line option. A non-empty list prevents the test from running unless all of the given options are present on the command line. This can help if a source file is compiled both with and without optimization, since it is rarely useful to check the full function body for unoptimized code.
The first line of the expected output for a function fn has the form:
prefix fn: [{ target/xfail selector }]
Subsequent lines of the expected output also start with prefix. In both cases, whitespace after prefix is not significant.
The test discards assembly directives such as .cfi_startproc and local label definitions such as .LFB0 from the compiler’s assembly output. It then matches the result against the expected output for a function as a single regular expression. This means that later lines can use backslashes to refer back to ‘(…)’ captures on earlier lines. For example:
/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
…
/*
** add_w0_s8_m:
**	mov	(z[0-9]+\.b), w0
**	add	z0\.b, p0/m, z0\.b, \1
**	ret
*/
svint8_t add_w0_s8_m (…) { … }
…
/*
** add_b0_s8_m:
**	mov	(z[0-9]+\.b), b0
**	add	z1\.b, p0/m, z1\.b, \1
**	ret
*/
svint8_t add_b0_s8_m (…) { … }
checks whether the implementations of add_w0_s8_m and add_b0_s8_m match the regular expressions given. The test only runs when ‘-DCHECK_ASM’ is passed on the command line.
It is possible to create non-capturing multi-line regular expression groups of the form ‘(a|b|…)’ by putting the ‘(’, ‘|’ and ‘)’ on separate lines (each still using prefix). For example:
/*
** cmple_f16_tied:
** (
**	fcmge	p0\.h, p0/z, z1\.h, z0\.h
** |
**	fcmle	p0\.h, p0/z, z0\.h, z1\.h
** )
**	ret
*/
svbool_t cmple_f16_tied (…) { … }
checks whether cmple_f16_tied is implemented by the fcmge instruction followed by ret or by the fcmle instruction followed by ret. The test is still a single regular expression.
A line containing just:
prefix ...
stands for zero or more unmatched lines; the whitespace after prefix is again not significant.
These commands are available for kind of tree, ltrans-tree, offload-tree, rtl, offload-rtl, ipa, and wpa-ipa.
scan-kind-dump regex suffix [{ target/xfail selector }]
Passes if regex matches text in the dump file with suffix suffix.
scan-kind-dump-not regex suffix [{ target/xfail selector }]
Passes if regex does not match text in the dump file with suffix suffix.
scan-kind-dump-times regex num suffix [{ target/xfail selector }]
Passes if regex is found exactly num times in the dump file with suffix suffix.
scan-kind-dump-dem regex suffix [{ target/xfail selector }]
Passes if regex matches demangled text in the dump file with suffix suffix.
scan-kind-dump-dem-not regex suffix [{ target/xfail selector }]
Passes if regex does not match demangled text in the dump file with suffix suffix.
The suffix argument which describes the dump file to be scanned may contain a glob pattern that must expand to exactly one file name. This is useful if, e.g., different pass instances are executed depending on torture testing command-line flags, producing dump files whose names differ only in their pass instance number suffix. For example, to scan instances 1, 2, 3 of a tree pass “mypass” for occurrences of the string “code has been optimized”, use:
/* { dg-options "-fdump-tree-mypass" } */
/* { dg-final { scan-tree-dump "code has been optimized" "mypass\[1-3\]" } } */
output-exists [{ target/xfail selector }]
Passes if compiler output file exists.
output-exists-not [{ target/xfail selector }]
Passes if compiler output file does not exist.
scan-symbol regexp [{ target/xfail selector }]
Passes if the pattern is present in the final executable.
scan-symbol-not regexp [{ target/xfail selector }]
Passes if the pattern is absent from the final executable.
run-gcov sourcefile
Check line counts in gcov tests.
run-gcov [branches] [calls] { opts sourcefile }
Check branch and/or call counts, in addition to line counts, in gcov tests.
run-gcov-pytest { sourcefile pytest_file }
Check output of gcov intermediate format with a pytest script.
Usually the test framework removes files that were generated during testing. If a testcase, for example, uses a dumping mechanism to inspect a pass’s dump file, the testsuite recognizes the dump option passed to the tool and schedules a final cleanup to remove these files.
There are, however, the following additional cleanup directives that can be used to annotate a testcase manually.
cleanup-coverage-files
Removes coverage data files generated for this test.
cleanup-modules "list-of-extra-modules"
Removes Fortran module files generated for this test, excluding the module names listed in keep-modules. Cleaning up module files is usually done automatically by the testsuite by looking at the source files and removing the modules after the test has been executed.
module MoD1
end module MoD1
module Mod2
end module Mod2
module moD3
end module moD3
module mod4
end module mod4

! { dg-final { cleanup-modules "mod1 mod2" } } ! redundant
! { dg-final { keep-modules "mod3 mod4" } }
keep-modules "list-of-modules-not-to-delete"
Whitespace separated list of module names that should not be deleted by cleanup-modules. If the list of modules is empty, all modules defined in this file are kept.
module maybe_unneeded
end module maybe_unneeded
module keep1
end module keep1
module keep2
end module keep2

! { dg-final { keep-modules "keep1 keep2" } } ! just keep these two
! { dg-final { keep-modules "" } }            ! keep all
dg-keep-saved-temps "list-of-suffixes-not-to-delete"
Whitespace separated list of suffixes that should not be deleted automatically in a testcase that uses -save-temps.
// { dg-options "-save-temps -fpch-preprocess -I." }
int main() { return 0; }
// { dg-keep-saved-temps ".s" }       // just keep assembler file
// { dg-keep-saved-temps ".s" ".i" }  // ... and .i
// { dg-keep-saved-temps ".ii" ".o" } // or just .ii and .o
cleanup-profile-file
Removes profiling files generated for this test.
The Ada testsuite includes executable tests from the ACATS testsuite, publicly available at http://www.ada-auth.org/acats.html.
These tests are integrated in the GCC testsuite in the ada/acats directory, and enabled automatically when running make check, assuming the Ada language has been enabled when configuring GCC.
You can also run the Ada testsuite independently, using make check-ada, or run a subset of the tests by specifying which chapter to run, e.g.:
$ make check-ada CHAPTERS="c3 c9"
The tests are organized by directory, each directory corresponding to a chapter of the Ada Reference Manual. So for example, c9 corresponds to chapter 9, which deals with tasking features of the language.
The tests are run using two sh scripts: run_acats and run_all.sh. To run the tests using a simulator or a cross target, see the small customization section at the top of run_all.sh.
These tests are run using the build tree: they can be run without doing a make install.
GCC contains the following C language testsuites, in the gcc/testsuite directory:
This contains tests of particular features of the C compiler, using the more modern ‘dg’ harness. Correctness tests for various compiler features should go here if possible.
Magic comments determine whether the file is preprocessed, compiled, linked or run. In these tests, error and warning message texts are compared against expected texts or regular expressions given in comments. These tests are run with the options ‘-ansi -pedantic’ unless other options are given in the test. Except as noted below they are not run with multiple optimization options.
This subdirectory contains tests for binary compatibility using lib/compat.exp, which in turn uses the language-independent support (see Support for testing binary compatibility).
This subdirectory contains tests of the preprocessor.
This subdirectory contains tests for debug formats. Tests in this subdirectory are run for each debug format that the compiler supports.
This subdirectory contains tests of the -Wformat format checking. Tests in this directory are run with and without -DWIDE.
This subdirectory contains tests of code that should not compile and does not need any special compilation options. They are run with multiple optimization options, since sometimes invalid code crashes the compiler with optimization.
FIXME: describe this.
This contains particular code fragments which have historically broken easily. These tests are run with multiple optimization options, so tests for features which only break at some optimization levels belong here. This also contains tests to check that certain optimizations occur. It might be worthwhile to separate the correctness tests cleanly from the code quality tests, but it hasn’t been done yet.
FIXME: describe this.
This directory should probably not be used for new tests.
This testsuite contains test cases that should compile, but do not need to link or run. These test cases are compiled with several different combinations of optimization options. All warnings are disabled for these test cases, so this directory is not suitable if you wish to test for the presence or absence of compiler warnings. While special options can be set, and tests disabled on specific platforms, by the use of .x files, mostly these test cases should not contain platform dependencies. FIXME: discuss how defines such as STACK_SIZE are used.
This testsuite contains test cases that should compile, link and run; otherwise the same comments as for gcc.c-torture/compile apply.
This contains tests which are specific to IEEE floating point.
FIXME: describe this.
This directory should probably not be used for new tests.
This directory contains C tests that require special handling. Some of these tests have individual expect files, and others share special-purpose expect files:
bprob*.c
Test -fbranch-probabilities using gcc.misc-tests/bprob.exp, which in turn uses the generic, language-independent framework (see Support for testing profile-directed optimizations).
gcov*.c
Test gcov output using gcov.exp, which in turn uses the language-independent support (see Support for testing gcov).
i386-pf-*.c
Test i386-specific support for data prefetch using i386-prefetch.exp.
dg-*.c
Test the testsuite itself using gcc.test-framework/test-framework.exp.
FIXME: merge in testsuite/README.gcc and discuss the format of test cases and magic comments more.
Tests for link-time optimizations usually require multiple source files that are compiled separately, perhaps with different sets of options. There are several special-purpose test directives used for these tests.
{ dg-lto-do do-what-keyword }
do-what-keyword specifies how the test is compiled and whether it is executed. It is one of:
assemble
Compile with -c to produce a relocatable object file.
link
Compile, assemble, and link to produce an executable file.
run
Produce and run an executable file, which is expected to return an exit code of 0.
The default is assemble. That can be overridden for a set of tests by redefining dg-do-what-default within the .exp file for those tests.
Unlike dg-do, dg-lto-do does not support an optional ‘target’ or ‘xfail’ list. Use dg-skip-if, dg-xfail-if, or dg-xfail-run-if.
{ dg-lto-options { { options } [{ options }] } [{ target selector }]}
This directive provides a list of one or more sets of compiler options to override LTO_OPTIONS. Each test will be compiled and run with each of these sets of options.
{ dg-extra-ld-options options [{ target selector }]}
This directive adds options to the linker options used.
{ dg-suppress-ld-options options [{ target selector }]}
This directive removes options from the set of linker options used.
Language-independent support for testing gcov, and for checking that branch profiling produces expected values, is provided by the expect file lib/gcov.exp. gcov tests also rely on procedures in lib/gcc-dg.exp to compile and run the test program. A typical gcov test contains the following DejaGnu commands within comments:
{ dg-options "--coverage" }
{ dg-do run { target native } }
{ dg-final { run-gcov sourcefile } }
Checks of gcov output can include line counts, branch percentages, and call return percentages. All of these checks are requested via commands that appear in comments in the test’s source file. Commands to check line counts are processed by default.
Commands to check branch percentages and call return percentages are processed if the run-gcov command has arguments branches or calls, respectively. For example, the following specifies checking both, as well as passing -b to gcov:
{ dg-final { run-gcov branches calls { -b sourcefile } } }
A line count command appears within a comment on the source line that is expected to get the specified count and has the form count(cnt). A test should only check line counts for lines that will get the same count for any architecture.
Commands to check branch percentages (branch) and call return percentages (returns) are very similar to each other. A beginning command appears on or before the first of a range of lines that will report the percentage, and the ending command follows that range of lines. The beginning command can include a list of percentages, all of which are expected to be found within the range. A range is terminated by the next command of the same kind. A command branch(end) or returns(end) marks the end of a range without starting a new one. For example:
if (i > 10 && j > i && j < 20)  /* branch(27 50 75) */
                                /* branch(end) */
  foo (i, j);
For a call return percentage, the value specified is the percentage of calls reported to return. For a branch percentage, the value is either the expected percentage or 100 minus that value, since the direction of a branch can differ depending on the target or the optimization level.
Not all branches and calls need to be checked. A test should not check for branches that might be optimized away or replaced with predicated instructions. Don’t check for calls inserted by the compiler or ones that might be inlined or optimized away.
A single test can check for combinations of line counts, branch percentages, and call return percentages. The command to check a line count must appear on the line that will report that count, but commands to check branch percentages and call return percentages can bracket the lines that report them.
Support for testing profile-directed optimizations
The file profopt.exp provides language-independent support for checking correct execution of a test built with profile-directed optimization. This testing requires that a test program be built and executed twice. The first time it is compiled to generate profile data, and the second time it is compiled to use the data that was generated during the first execution. The second execution is to verify that the test produces the expected results.
To check that the optimization actually generated better code, a test can be built and run a third time with normal optimizations to verify that the performance is better with the profile-directed optimizations. profopt.exp has the beginnings of this kind of support.
profopt.exp provides generic support for profile-directed optimizations. Each set of tests that uses it provides information about a specific optimization:
tool
tool being tested, e.g., gcc
profile_option
options used to generate profile data
feedback_option
options used to optimize using that profile data
prof_ext
suffix of profile data files
PROFOPT_OPTIONS
list of options with which to run each test, similar to the lists for torture tests
{ dg-final-generate { local-directive } }
This directive is similar to dg-final, but the local-directive is run after the generation of profile data.
{ dg-final-use { local-directive } }
The local-directive is run after the profile data have been used.
Support for testing binary compatibility
The file compat.exp provides language-independent support for binary compatibility testing. It supports testing interoperability of two compilers that follow the same ABI, or of multiple sets of compiler options that should not affect binary compatibility. It is intended to be used for testsuites that complement ABI testsuites.
A test supported by this framework has three parts, each in a separate source file: a main program and two pieces that interact with each other to split up the functionality being tested.
testname_main.suffix
Contains the main program, which calls a function in file testname_x.suffix.
testname_x.suffix
Contains at least one call to a function in testname_y.suffix.
testname_y.suffix
Shares data with, or gets arguments from, testname_x.suffix.
Within each test, the main program and one functional piece are compiled by the GCC under test. The other piece can be compiled by an alternate compiler. If no alternate compiler is specified, then all three source files are compiled by the GCC under test. You can specify pairs of sets of compiler options. The first element of such a pair specifies options used with the GCC under test, and the second element of the pair specifies options used with the alternate compiler. Each test is compiled with each pair of options.
compat.exp defines default pairs of compiler options. These can be overridden by defining the environment variable COMPAT_OPTIONS as:
COMPAT_OPTIONS="[list [list {tst1} {alt1}] …[list {tstn} {altn}]]"
where tsti and alti are lists of options, with tsti used by the compiler under test and alti used by the alternate compiler. For example, with [list [list {-g -O0} {-O3}] [list {-fpic} {-fPIC -O2}]], the test is first built with -g -O0 by the compiler under test and with -O3 by the alternate compiler. The test is built a second time using -fpic by the compiler under test and -fPIC -O2 by the alternate compiler.
An alternate compiler is specified by defining an environment variable to be the full pathname of an installed compiler; for C define ALT_CC_UNDER_TEST, and for C++ define ALT_CXX_UNDER_TEST. These will be written to the site.exp file used by DejaGnu. The default is to build each test with the compiler under test using the first of each pair of compiler options from COMPAT_OPTIONS. When ALT_CC_UNDER_TEST or ALT_CXX_UNDER_TEST is same, each test is built using the compiler under test but with combinations of the options from COMPAT_OPTIONS.
To run only the C++ compatibility suite using the compiler under test and another version of GCC using specific compiler options, do the following from objdir/gcc:
rm site.exp
make -k \
  ALT_CXX_UNDER_TEST=${alt_prefix}/bin/g++ \
  COMPAT_OPTIONS="lists as shown above" \
  check-c++ \
  RUNTESTFLAGS="compat.exp"
A test that fails when the source files are compiled with different compilers, but passes when the files are compiled with the same compiler, demonstrates incompatibility of the generated code or runtime support. A test that fails for the alternate compiler but passes for the compiler under test probably tests for a bug that was fixed in the compiler under test but is present in the alternate compiler.
The binary compatibility tests support a small number of test framework commands that appear within comments in a test file.
dg-require-*
These commands can be used in testname_main.suffix to skip the test if specific support is not available on the target.
dg-options
The specified options are used for compiling this particular source file, appended to the options from COMPAT_OPTIONS. When this command appears in testname_main.suffix the options are also used to link the test program.
dg-xfail-if
This command can be used in a secondary source file to specify that compilation is expected to fail for particular options on particular targets.
Support for torture testing using multiple options
Throughout the compiler testsuite there are several directories whose tests are run multiple times, each with a different set of options. These are known as torture tests. lib/torture-options.exp defines procedures to set up these lists:
torture-init
Initialize use of torture lists.
set-torture-options
Set lists of torture options to use for tests with and without loops. Optionally combine a set of torture options with a set of other options, as is done with Objective-C runtime options.
torture-finish
Finalize use of torture lists.
The .exp file for a set of tests that use torture options must include calls to these three procedures if:
It calls gcc-dg-runtest and overrides DG_TORTURE_OPTIONS.
It calls ${tool}-torture or ${tool}-torture-execute, where tool is c, fortran, or objc.
It calls dg-pch.
It is not necessary for a .exp file that calls gcc-dg-runtest to call the torture procedures if the tests should use the list in DG_TORTURE_OPTIONS defined in gcc-dg.exp.
Most uses of torture options can override the default lists by defining TORTURE_OPTIONS or add to the default list by defining ADDITIONAL_TORTURE_OPTIONS. Define these in a .dejagnurc file or add them to the site.exp file; for example:
set ADDITIONAL_TORTURE_OPTIONS [list \
  { -O2 -ftree-loop-linear } \
  { -O2 -fpeel-loops } ]
Support for testing GIMPLE passes
As of gcc 7, C functions can be tagged with __GIMPLE to indicate that the function body will be GIMPLE, rather than C. The compiler requires the option -fgimple to enable this functionality. For example:
/* { dg-do compile } */
/* { dg-options "-O -fgimple" } */

void __GIMPLE (startwith ("dse2"))
foo ()
{
  int a;
bb_2:
  if (a > 4)
    goto bb_3;
  else
    goto bb_4;
bb_3:
  a_2 = 10;
  goto bb_5;
bb_4:
  a_3 = 20;
bb_5:
  a_1 = __PHI (bb_3: a_2, bb_4: a_3);
  a_4 = a_1 + 4;
  return;
}
The startwith argument indicates at which pass to begin. Use the dump modifier -gimple (e.g. -fdump-tree-all-gimple) to make tree dumps more closely follow the format accepted by the GIMPLE parser.
Example DejaGnu tests of GIMPLE can be seen in the source tree at gcc/testsuite/gcc.dg/gimplefe-*.c.
The __GIMPLE parser is integrated with the C tokenizer and preprocessor, so it should be possible to use macros to build out test coverage.
Support for testing RTL passes
As of gcc 7, C functions can be tagged with __RTL to indicate that the function body will be RTL, rather than C. For example:
double __RTL (startwith ("ira")) test (struct foo *f, const struct bar *b)
{
  (function "test"
     [...snip; various directives go in here...]
  ) ;; function "test"
}
The startwith argument indicates at which pass to begin.
The parser expects the RTL body to be in the format emitted by this dumping function:
DEBUG_FUNCTION void print_rtx_function (FILE *outfile, function *fn, bool compact);
when "compact" is true. So you can capture RTL in the correct format from the debugger using:
(gdb) print_rtx_function (stderr, cfun, true);
and copy and paste the output into the body of the C function.
Example DejaGnu tests of RTL can be seen in the source tree under gcc/testsuite/gcc.dg/rtl.
The __RTL parser is not integrated with the C tokenizer or preprocessor, and works simply by reading the relevant lines within the braces. In particular, the RTL body must be on separate lines from the enclosing braces, and the preprocessor is not usable within it.
Option specification files
Most GCC command-line options are described by special option definition files, the names of which conventionally end in .opt. This chapter describes the format of these files.
Option file format
Option files are a simple list of records in which each field occupies its own line and in which the records themselves are separated by blank lines. Comments may appear on their own line anywhere within the file and are preceded by semicolons. Whitespace is allowed before the semicolon.
The files can contain the following types of record:
A language definition record. These records have two fields: the string ‘Language’ and the name of the language. Once a language has been declared in this way, it can be used as an option property; see Option properties.
A target-specific save record to save additional information. These records have two fields: the string ‘TargetSave’, and a declaration type to go in the cl_target_option structure.
A variable record to define a variable used to store option information. These records have two fields: the string ‘Variable’, and a declaration of the type and name of the variable, optionally with an initializer (but without any trailing ‘;’). These records may be used for variables used for many options whose initializer is determined by the options, or for variables set by options using the Var properties.
A variable record to define a variable used to store option information, using the string ‘TargetVariable’ instead of ‘Variable’. ‘TargetVariable’ is a combination of ‘Variable’ and ‘TargetSave’: the variable is defined in the gcc_options structure, but these variables are also stored in the cl_target_option structure. The variables are saved in the target save code and restored in the target restore code.
A record to name additional files that the options.h file should include. This is useful to provide enumeration or structure definitions needed for target variables. These records have two fields: the string ‘HeaderInclude’ and the name of the include file.
A record to name additional files that the options.c or options-save.c file should include. This is useful to provide inline functions needed for target variables and/or #ifdef sequences to properly set up the initialization. These records have two fields: the string ‘SourceInclude’ and the name of the include file.
An enumeration record to define a set of strings that may be used as arguments to an option or options. These records have three fields: the string ‘Enum’, a space-separated list of properties, and help text used to describe the set of strings in --help output. The properties are:
Name(name)
This property is required; name must be a name (suitable for use in C identifiers) used to identify the set of strings in Enum option properties.
Type(type)
This property is required; type is the C type for variables set by options using this enumeration together with Var.
UnknownError(message)
The message message will be used as an error message if the argument is invalid; for enumerations without UnknownError, a generic error message is used. message should contain a single ‘%qs’ format, which will be used to format the invalid argument.
An enumeration value record to define one of the strings in a set given in an ‘Enum’ record. These records have two fields: the string ‘EnumValue’ and a space-separated list of properties. The properties are:
Enum(name)
This property is required; name says which ‘Enum’ record this ‘EnumValue’ record corresponds to.
String(string)
This property is required; string is the string option argument being described by this record.
Value(value)
This property is required; it says what value (representable as int) should be used for the given string.
Canonical
This property is optional. If present, it says the present string is the canonical one among all those with the given value. Other strings yielding that value will be mapped to this one so specs do not need to handle them.
DriverOnly
This property is optional. If present, the present string will only be accepted by the driver. This is used for cases such as -march=native that are processed by the driver so that ‘gcc -v’ shows how the options chosen depended on the system on which the compiler was run.
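For illustration, a hypothetical ‘Enum’ record and its ‘EnumValue’ records might look like this in a .opt file (the names and values are invented, not taken from GCC’s option files):

```
; Hypothetical set of strings for a -fexample-mode= option.
Enum
Name(example_mode) Type(int) UnknownError(unknown example mode %qs)
Known example modes (for use with the -fexample-mode= option):

EnumValue
Enum(example_mode) String(fast) Value(0)

EnumValue
Enum(example_mode) String(safe) Value(1)
```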
An option definition record. These records have the following fields: the name of the option, with the leading “-” removed; a space-separated list of option properties (see Option properties); and the help text to use for --help (omitted if the second field contains the Undocumented property).
By default, all options beginning with “f”, “W” or “m” are implicitly assumed to take a “no-” form. This form should not be listed separately. If an option beginning with one of these letters does not have a “no-” form, you can use the RejectNegative property to reject it.
The help text is automatically line-wrapped before being displayed. Normally the name of the option is printed on the left-hand side of the output and the help text is printed on the right. However, if the help text contains a tab character, the text to the left of the tab is used instead of the option’s name and the text to the right of the tab forms the help text. This allows you to elaborate on what type of argument the option takes.
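A hypothetical option definition record showing the tab convention might look like this (option name, variable, and help text are invented):

```
; The text before the tab replaces the option name in --help output,
; which makes it possible to show the form of the argument.
fexample-limit=
Common Joined UInteger Var(flag_example_limit) Init(100)
-fexample-limit=<number>	Limit the example transformation to <number> candidates.
```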
A target mask record. These records have one field of the form ‘Mask(x)’. The options-processing script will allocate a bit in target_flags (see Run-time Target Specification) for each mask name x and set the macro MASK_x to the appropriate bitmask. It will also declare a TARGET_x macro that has the value 1 when bit MASK_x is set and 0 otherwise.
They are primarily intended to declare target masks that are not associated with user options, either because these masks represent internal switches or because the options are not available on all configurations and yet the masks always need to be defined.
Option properties
The second field of an option record can specify any of the following properties. When an option takes an argument, it is enclosed in parentheses following the option property name. The parser that handles option files is quite simplistic, and will be tricked by any nested parentheses within the argument text itself; in this case, the entire option argument can be wrapped in curly braces within the parentheses to demarcate it, e.g.:
Condition({defined (USE_CYGWIN_LIBSTDCXX_WRAPPERS)})
Common
The option is available for all languages and targets.
Target
The option is available for all languages but is target-specific.
Driver
The option is handled by the compiler driver using code not shared with the compilers proper (cc1 etc.).
language
The option is available when compiling for the given language. It is possible to specify several different languages for the same option. Each language must have been declared by an earlier Language record. See Option file format.
RejectDriver
The option is only handled by the compilers proper (cc1 etc.) and should not be accepted by the driver.
RejectNegative
The option does not have a “no-” form. All options beginning with “f”, “W” or “m” are assumed to have a “no-” form unless this property is used.
Negative(othername)
The option will turn off another option othername, which is the option name with the leading “-” removed. This chain action will propagate through the Negative property of the option to be turned off. The driver will prune options, removing those that are turned off by some later option. This pruning is not done for options with Joined or JoinedOrMissing properties, unless the options have the RejectNegative property or their Negative property mentions an option other than itself.
As a consequence, if you have a group of mutually-exclusive options, their Negative properties should form a circular chain. For example, if options -a, -b and -c are mutually exclusive, their respective Negative properties should be ‘Negative(b)’, ‘Negative(c)’ and ‘Negative(a)’.
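A circular chain for three invented, mutually-exclusive target options could be written as:

```
; Hypothetical records; each option's Negative property points to the
; next one, and the chain wraps around.
ma
Target RejectNegative Negative(mb)
Select variant A.

mb
Target RejectNegative Negative(mc)
Select variant B.

mc
Target RejectNegative Negative(ma)
Select variant C.
```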
Joined
Separate
The option takes a mandatory argument. Joined indicates that the option and argument can be included in the same argv entry (as with -mflush-func=name, for example). Separate indicates that the option and argument can be separate argv entries (as with -o). An option is allowed to have both of these properties.
JoinedOrMissing
The option takes an optional argument. If the argument is given, it will be part of the same argv entry as the option itself. This property cannot be used alongside Joined or Separate.
MissingArgError(message)
For an option marked Joined or Separate, the message message will be used as an error message if the mandatory argument is missing; for options without MissingArgError, a generic error message is used. message should contain a single ‘%qs’ format, which will be used to format the name of the option passed.
Args(n)
For an option marked Separate, indicate that it takes n arguments. The default is 1.
UInteger
The option’s argument is a non-negative integer consisting of either decimal or hexadecimal digits interpreted as int. Hexadecimal integers may optionally start with the 0x or 0X prefix. The option parser validates and converts the argument before passing it to the relevant option handler. UInteger should also be used with options like -falign-loops where both -falign-loops and -falign-loops=n are supported to make sure the saved options are given a full integer. Positive values of the argument in excess of INT_MAX wrap around zero.
Host_Wide_Int
The option’s argument is a non-negative integer consisting of either decimal or hexadecimal digits interpreted as the widest integer type on the host. As with a UInteger argument, hexadecimal integers may optionally start with the 0x or 0X prefix. The option parser validates and converts the argument before passing it to the relevant option handler. Host_Wide_Int should be used with options that need to accept very large values. Positive values of the argument in excess of HOST_WIDE_INT_M1U are assigned HOST_WIDE_INT_M1U.
IntegerRange(n, m)
The option’s arguments are integers of type int. The option’s parser validates that the value of an option integer argument is within the closed range [n, m].
ByteSize
A property applicable only to UInteger or Host_Wide_Int arguments. The option’s integer argument is interpreted as if in infinite precision using saturation arithmetic in the corresponding type. The argument may be followed by a ‘byte-size’ suffix designating a multiple of bytes such as kB and KiB for kilobyte and kibibyte, respectively, MB and MiB for megabyte and mebibyte, GB and GiB for gigabyte and gibibyte, and so on. ByteSize should be used with options that take a very large argument representing a size in bytes, such as -Wlarger-than=.
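An invented record combining UInteger and ByteSize might read:

```
; Hypothetical option accepting arguments such as 1024, 4MB or 2GiB.
Wexample-larger-than=
Common Joined UInteger ByteSize Warning Var(warn_example_larger_than)
Warn about example objects larger than the given number of bytes.
```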
ToLower
The option’s argument should be converted to lowercase as part of putting it in canonical form, and before comparing with the strings indicated by any Enum property.
NoDriverArg
For an option marked Separate, the option only takes an argument in the compiler proper, not in the driver. This is for compatibility with existing options that are used both directly and via -Wp,; new options should not have this property.
Var(var)
The state of this option should be stored in variable var (actually a macro for global_options.x_var). The way that the state is stored depends on the type of option:
If the option uses the Mask or InverseMask properties, var is the integer variable that contains the mask.
If the option has the UInteger property, var is an integer variable that stores the value of the argument.
If the option has the Enum property, var is a variable (type given in the Type property of the ‘Enum’ record whose Name property has the same argument as the Enum property of this option) that stores the value of the argument.
If the option has the Defer property, var is a pointer to a VEC(cl_deferred_option,heap) that stores the option for later processing. (var is declared with type void * and needs to be cast to VEC(cl_deferred_option,heap) before use.)
The option-processing script will usually zero-initialize var. You can modify this behavior using Init.
Var(var, set)
The option controls an integer variable var and is active when var equals set. The option parser will set var to set when the positive form of the option is used and !set when the “no-” form is used. var is declared in the same way as for the single-argument form described above.
WarnRemoved
The option is removed and every usage of such option will result in a warning. We use it for option backward compatibility.
Init(value)
The variable specified by the Var property should be statically initialized to value. If more than one option using the same variable specifies Init, all must specify the same initializer.
Mask(name)
The option is associated with a bit in the target_flags variable (see Run-time Target Specification) and is active when that bit is set. You may also specify Var to select a variable other than target_flags.
The options-processing script will automatically allocate a unique bit for the option. If the option is attached to ‘target_flags’, the script will set the macro MASK_name to the appropriate bitmask. It will also declare a TARGET_name macro that has the value 1 when the option is active and 0 otherwise. If you use Var to attach the option to a different variable, the bitmask macro will be called OPTION_MASK_name.
InverseMask(othername)
InverseMask(othername, thisname)
The option is the inverse of another option that has the Mask(othername) property. If thisname is given, the options-processing script will declare a TARGET_thisname macro that is 1 when the option is active and 0 otherwise.
Enum(name)
The option’s argument is a string from the set of strings associated with the corresponding ‘Enum’ record. The string is checked and converted to the integer specified in the corresponding ‘EnumValue’ record before being passed to option handlers.
Defer
The option should be stored in a vector, specified with Var, for later processing.
Alias(opt)
Alias(opt, arg)
Alias(opt, posarg, negarg)
The option is an alias for -opt (or the negative form of that option, depending on NegativeAlias). In the first form, any argument passed to the alias is considered to be passed to -opt, and -opt is considered to be negated if the alias is used in negated form. In the second form, the alias may not be negated or have an argument, and posarg is considered to be passed as an argument to -opt. In the third form, the alias may not have an argument; if the alias is used in the positive form then posarg is considered to be passed to -opt, and if the alias is used in the negative form then negarg is considered to be passed to -opt.
Aliases should not specify Var or Mask or UInteger. Aliases should normally specify the same languages as the target of the alias; the flags on the target will be used to determine any diagnostic for use of an option for the wrong language, while those on the alias will be used to identify what command-line text is the option and what text is any argument to that option.
When an Alias definition is used for an option, driver specs do not need to handle it and no ‘OPT_’ enumeration value is defined for it; only the canonical form of the option will be seen in those places.
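Two invented records illustrating the one-argument and three-argument forms:

```
; -fexample-old forwards any argument or negation to -fexample-new.
fexample-old
Common Alias(fexample-new)

; -mexample passes 2 to -fexample-level= when used positively and 0
; when negated (three-argument form, so no argument of its own).
mexample
Target Alias(fexample-level=, 2, 0)
```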
NegativeAlias
For an option marked with Alias(opt), the option is considered to be an alias for the positive form of -opt if negated and for the negative form of -opt if not negated. NegativeAlias may not be used with the forms of Alias taking more than one argument.
Ignore
This option is ignored apart from printing any warning specified using Warn. The option will not be seen by specs and no ‘OPT_’ enumeration value is defined for it.
SeparateAlias
For an option marked with Joined, Separate and Alias, the option only acts as an alias when passed a separate argument; with a joined argument it acts as a normal option, with an ‘OPT_’ enumeration value. This is for compatibility with the Java -d option and should not be used for new options.
Warn(message)
If this option is used, output the warning message. message is a format string, either taking a single operand with a ‘%qs’ format which is the option name, or not taking any operands, which is passed to the ‘warning’ function. If an alias is marked Warn, the target of the alias must not also be marked Warn.
Warning
This is a warning option and should be shown as such in --help output. This flag does not currently affect anything other than --help.
Optimization
This is an optimization option. It should be shown as such in --help output, and any associated variable named using Var should be saved and restored when the optimization level is changed with optimize attributes.
PerFunction
This is an option that can be overridden on a per-function basis. Optimization implies PerFunction, but options that do not affect executable code generation may use this flag instead, so that the option is not taken into account in ways that might affect executable code generation.
Param
This is an option that is a parameter.
Undocumented
The option is deliberately missing documentation and should not be included in the --help output.
Condition(cond)
The option should only be accepted if preprocessor condition cond is true. Note that any C declarations associated with the option will be present even if cond is false; cond simply controls whether the option is accepted and whether it is printed in the --help output.
Save
Build the cl_target_option structure to hold a copy of the option, add the functions cl_target_option_save and cl_target_option_restore to save and restore the options.
SetByCombined
The option may also be set by a combined option such as -ffast-math. This causes the gcc_options struct to have a field frontend_set_name, where name is the name of the field holding the value of this option (without the leading x_). This gives the front end a way to indicate that the value has been set explicitly and should not be changed by the combined option. For example, some front ends use this to prevent -ffast-math and -fno-fast-math from changing the value of -fmath-errno for languages that do not use errno.
EnabledBy(opt)
EnabledBy(opt || opt2)
EnabledBy(opt && opt2)
If not explicitly set, the option is set to the value of -opt; multiple options can be given, separated by ||. The third form using && specifies that the option is only set if both opt and opt2 are set. The options opt and opt2 must have the Common property; otherwise, use LangEnabledBy.
LangEnabledBy(language, opt)
LangEnabledBy(language, opt, posarg, negarg)
When compiling for the given language, the option is set to the value of -opt, if not explicitly set. opt can also be a list of || separated options. In the second form, if opt is used in the positive form then posarg is considered to be passed to the option, and if opt is used in the negative form then negarg is considered to be passed to the option. It is possible to specify several different languages. Each language must have been declared by an earlier Language record. See Option file format.
NoDWARFRecord
The option is omitted from the producer string written by -grecord-gcc-switches.
PchIgnore
Even if this is a target option, this option will not be recorded / compared to determine if a precompiled header file matches.
CPP(var)
The state of this option should be kept in sync with the preprocessor option var. If this property is set, then properties Var and Init must be set as well.
CppReason(CPP_W_Enum)
This warning option corresponds to cpplib.h warning reason code CPP_W_Enum. This should only be used for warning options of the C-family front-ends.
Passes and Files of the Compiler
This chapter is dedicated to giving an overview of the optimization and code generation passes of the compiler. In the process, it describes some of the language front end interface, though this description is nowhere near complete.
Parsing pass
The language front end is invoked only once, via lang_hooks.parse_file, to parse the entire input. The language front end may use any intermediate language representation deemed appropriate. The C front end uses GENERIC trees (see GENERIC), plus a double handful of language-specific tree codes defined in c-common.def. The Fortran front end uses a completely different private representation.
At some point the front end must translate the representation used in the front end to a representation understood by the language-independent portions of the compiler. Current practice takes one of two forms. The C front end manually invokes the gimplifier (see GIMPLE) on each function, and uses the gimplifier callbacks to convert the language-specific tree nodes directly to GIMPLE before passing the function off to be compiled. The Fortran front end converts from a private representation to GENERIC, which is later lowered to GIMPLE when the function is compiled. Which route to choose probably depends on how well GENERIC (plus extensions) can be made to match up with the source language and necessary parsing data structures.
BUG: Gimplification must occur before nested function lowering, and nested function lowering must be done by the front end before passing the data off to cgraph.
TODO: Cgraph should control nested function lowering. It would only be invoked when it is certain that the outer-most function is used.
TODO: Cgraph needs a gimplify_function callback. It should be invoked when (1) it is certain that the function is used, (2) warning flags specified by the user require some amount of compilation in order to honor, (3) the language indicates that semantic analysis is not complete until gimplification occurs. Hum… this sounds overly complicated. Perhaps we should just have the front end gimplify always; in most cases it’s only one function call.
The front end needs to pass all function definitions and top level declarations off to the middle-end so that they can be compiled and emitted to the object file. For a simple procedural language, it is usually most convenient to do this as each top level declaration or definition is seen. There is also a distinction to be made between generating functional code and generating complete debug information. The only thing that is absolutely required for functional code is that function and data definitions be passed to the middle-end. For complete debug information, function, data and type declarations should all be passed as well.
In any case, the front end needs each complete top-level function or data declaration, and each data definition should be passed to rest_of_decl_compilation. Each complete type definition should be passed to rest_of_type_compilation. Each function definition should be passed to cgraph_finalize_function.
TODO: I know rest_of_compilation currently has all sorts of RTL generation semantics. I plan to move all code generation bits (both Tree and RTL) to compile_function. Should we hide cgraph from the front ends and move back to rest_of_compilation as the official interface? Possibly we should rename all three interfaces such that the names match in some meaningful way and that is more descriptive than "rest_of".
The middle-end will, at its option, emit the function and data definitions immediately or queue them for later processing.
Gimplification is a whimsical term for the process of converting the intermediate representation of a function into the GIMPLE language (see GIMPLE). The term stuck, and so words like “gimplification”, “gimplify”, “gimplifier” and the like are sprinkled throughout this section of code.
While a front end may certainly choose to generate GIMPLE directly if it chooses, this can be a moderately complex process unless the intermediate language used by the front end is already fairly simple. Usually it is easier to generate GENERIC trees plus extensions and let the language-independent gimplifier do most of the work.
The main entry point to this pass is gimplify_function_tree located in gimplify.c. From here we process the entire function, gimplifying each statement in turn. The main workhorse for this pass is gimplify_expr. Approximately everything passes through here at least once, and it is from here that we invoke the lang_hooks.gimplify_expr callback.
The callback should examine the expression in question and return GS_UNHANDLED if the expression is not a language-specific construct that requires attention. Otherwise it should alter the expression in some way such that forward progress is made toward producing valid GIMPLE. If the callback is certain that the transformation is complete and the expression is valid GIMPLE, it should return GS_ALL_DONE. Otherwise it should return GS_OK, which will cause the expression to be processed again. If the callback encounters an error during the transformation (because the front end is relying on the gimplification process to finish semantic checks), it should return GS_ERROR.
The pass manager is located in passes.c, tree-optimize.c and tree-pass.h. It processes passes as described in passes.def. Its job is to run all of the individual passes in the correct order, and take care of standard bookkeeping that applies to every pass.
The theory of operation is that each pass defines a structure that represents everything we need to know about that pass—when it should be run, how it should be run, what intermediate language form or on-the-side data structures it needs. We register the pass to be run in some particular order, and the pass manager arranges for everything to happen in the correct order.
The actuality doesn’t completely live up to the theory at present.
Command-line switches and timevar_id_t enumerations must still be defined elsewhere. The pass manager validates constraints but does not attempt to (re-)generate data structures or lower intermediate language form based on the requirements of the next pass. Nevertheless, what is present is useful, and a far sight better than nothing at all.
Each pass should have a unique name. Each pass may have its own dump file (for GCC debugging purposes). Passes with a name starting with a star do not dump anything. Sometimes passes are supposed to share a dump file / option name. To still give these unique names, you can use a prefix that is delimited by a space from the part that is used for the dump file / option name. E.g., when the pass name is "ud dce", the name used for the dump file and options is "dce".
TODO: describe the global variables set up by the pass manager, and a brief description of how a new pass should use it. I need to look at what info RTL passes use first...
The inter-procedural optimization (IPA) passes use call graph information to perform transformations across function boundaries. IPA is a critical part of link-time optimization (LTO) and whole-program (WHOPR) optimization, and these passes are structured with the needs of LTO and WHOPR in mind by dividing their operations into stages. For detailed discussion of the LTO/WHOPR IPA pass stages and interfaces, see Using summary information in IPA passes.
The following briefly describes the inter-procedural optimization (IPA) passes, which are split into small IPA passes, regular IPA passes, and late IPA passes, according to the LTO/WHOPR processing model.
A small IPA pass is a pass derived from simple_ipa_opt_pass. As described in Using summary information in IPA passes, it does everything at once and defines only the Execute stage. During this stage it accesses and modifies the function bodies. No generate_summary, read_summary, or write_summary hooks are defined.
This pass frees resources that are used by the front end but are not needed once it is done. It is located in tree.c and is described by pass_ipa_free_lang_data.
This is a local function pass handling visibilities of all symbols. This happens before LTO streaming, so -fwhole-program should be ignored at this level. It is located in ipa-visibility.c and is described by pass_ipa_function_and_variable_visibility.
This pass performs reachability analysis and reclaims all unreachable nodes. It is located in passes.c and is described by pass_ipa_remove_symbols.
This is a pass group for OpenACC processing. It is located in tree-ssa-loop.c and is described by pass_ipa_oacc.
This is a tree-based points-to analysis pass. The idea behind this analyzer is to generate set constraints from the program, then solve the resulting constraints in order to generate the points-to sets. It is located in tree-ssa-structalias.c and is described by pass_ipa_pta.
This is a pass group for processing OpenACC kernels regions. It is a subpass of the IPA OpenACC pass group that runs on offloaded functions containing OpenACC kernels loops. It is located in tree-ssa-loop.c and is described by pass_ipa_oacc_kernels.
This is a pass for parsing functions with multiple target attributes. It is located in multiple_target.c and is described by pass_target_clone.
This pass uses AutoFDO profiling data to annotate the control flow graph. It is located in auto-profile.c and is described by pass_ipa_auto_profile.
This pass does profiling for all functions in the call graph. It calculates branch probabilities and basic block execution counts. It is located in tree-profile.c and is described by pass_ipa_tree_profile.
This pass is a small IPA pass when argument small_p is true. It releases inline function summaries and call summaries. It is located in ipa-fnsummary.c and is described by pass_ipa_free_fn_summary.
This pass increases the alignment of global arrays to improve vectorization. It is located in tree-vectorizer.c and is described by pass_ipa_increase_alignment.
This pass is for transactional memory support. It is located in trans-mem.c and is described by pass_ipa_tm.
This pass lowers thread-local storage (TLS) operations to emulation functions provided by libgcc. It is located in tree-emutls.c and is described by pass_ipa_lower_emutls.
A regular IPA pass is a pass derived from ipa_opt_pass_d that is executed in WHOPR compilation. Regular IPA passes may have summary hooks implemented in any of the LGEN, WPA or LTRANS stages (see Using summary information in IPA passes).
This pass performs various optimizations involving symbol visibility with -fwhole-program, including symbol privatization, discovering local functions, and dismantling comdat groups. It is located in ipa-visibility.c and is described by pass_ipa_whole_program_visibility.
The IPA profile pass propagates profiling frequencies across the call graph. It is located in ipa-profile.c and is described by pass_ipa_profile.
This is the inter-procedural identical code folding pass. The goal of this transformation is to discover functions and read-only variables that have exactly the same semantics. It is located in ipa-icf.c and is described by pass_ipa_icf.
This pass performs speculative devirtualization based on the type inheritance graph. When a polymorphic call has only one likely target in the unit, it is turned into a speculative call. It is located in ipa-devirt.c and is described by pass_ipa_devirt.
The goal of this pass is to discover functions that are always invoked with some arguments with the same known constant values and to modify the functions accordingly. It can also do partial specialization and type-based devirtualization. It is located in ipa-cp.c and is described by pass_ipa_cp.
This pass can replace an aggregate parameter with a set of other parameters representing part of the original, turning those passed by reference into new ones which pass the value directly. It also removes unused function return values and unused function parameters. This pass is located in ipa-sra.c and is described by pass_ipa_sra.
This pass merges multiple constructors and destructors for static objects into single functions. It’s only run at LTO time unless the target doesn’t support constructors and destructors natively. The pass is located in ipa.c and is described by pass_ipa_cdtor_merge.
This pass provides function analysis for inter-procedural passes. It collects estimates of function body size, execution time, and frame size for each function. It also estimates information about function calls: call statement size, time and how often the parameters change for each call. It is located in ipa-fnsummary.c and is described by pass_ipa_fn_summary.
The IPA inline pass handles function inlining with whole-program knowledge. Small functions that are candidates for inlining are ordered in increasing badness, bounded by unit growth parameters. Unreachable functions are removed from the call graph. Functions called once and not exported from the unit are inlined. This pass is located in ipa-inline.c and is described by pass_ipa_inline.
This pass marks functions as being either const (TREE_READONLY) or pure (DECL_PURE_P). The per-function information is produced by pure_const_generate_summary, then the global information is computed by performing a transitive closure over the call graph. It is located in ipa-pure-const.c and is described by pass_ipa_pure_const.
This pass is a regular IPA pass when argument small_p is false. It releases inline function summaries and call summaries. It is located in ipa-fnsummary.c and is described by pass_ipa_free_fn_summary.
This pass gathers information about how variables whose scope is confined to the compilation unit are used. It is located in ipa-reference.c and is described by pass_ipa_reference.
This pass checks whether variables are used by a single function. It is located in ipa.c and is described by pass_ipa_single_use.
This pass looks for static symbols that are used exclusively within one comdat group, and moves them into that comdat group. It is located in ipa-comdats.c and is described by pass_ipa_comdats.
Late IPA passes are simple IPA passes executed after the regular passes. In WHOPR mode the passes are executed after partitioning and thus see just parts of the compiled unit.
Once all functions of the compilation unit are in memory, this pass produces all clones and updates all calls. It is located in ipa.c and is described by pass_materialize_all_clones.
Points-to analysis; this is the same as the points-to analysis pass run with the small IPA passes (see Small IPA passes).
This is the OpenMP constructs’ SIMD clone pass. It creates the appropriate SIMD clones for functions tagged as elemental SIMD functions. It is located in omp-simd-clone.c and is described by pass_omp_simd_clone.
The following briefly describes the Tree optimization passes that are run after gimplification and what source files they are located in.
This pass is an extremely simple sweep across the gimple code in which we identify obviously dead code and remove it. Here we do things like simplify if statements with constant conditions, remove exception handling constructs surrounding code that obviously cannot throw, remove lexical bindings that contain no variables, and other assorted simplistic cleanups. The idea is to get rid of the obvious stuff quickly rather than wait until later when it’s more work to get rid of it. This pass is located in tree-cfg.c and is described by pass_remove_useless_stmts.
If OpenMP generation (-fopenmp) is enabled, this pass lowers OpenMP constructs into GIMPLE.
Lowering of OpenMP constructs involves creating replacement expressions for local variables that have been mapped using data sharing clauses, exposing the control flow of most synchronization directives and adding region markers to facilitate the creation of the control flow graph. The pass is located in omp-low.c and is described by pass_lower_omp.
If OpenMP generation (-fopenmp) is enabled, this pass expands parallel regions into their own functions to be invoked by the thread library. The pass is located in omp-low.c and is described by pass_expand_omp.
This pass flattens if statements (COND_EXPR) and moves lexical bindings (BIND_EXPR) out of line. After this pass, all if statements will have exactly two goto statements in their then and else arms. Lexical binding information for each statement will be found in TREE_BLOCK rather than being inferred from its position under a BIND_EXPR. This pass is found in gimple-low.c and is described by pass_lower_cf.
This pass decomposes high-level exception handling constructs (TRY_FINALLY_EXPR and TRY_CATCH_EXPR) into a form that explicitly represents the control flow involved. After this pass, lookup_stmt_eh_region will return a non-negative number for any statement that may have EH control flow semantics; examine tree_can_throw_internal or tree_can_throw_external for exact semantics. Exact control flow may be extracted from foreach_reachable_handler. The EH region nesting tree is defined in except.h and built in except.c. The lowering pass itself is in tree-eh.c and is described by pass_lower_eh.
This pass decomposes a function into basic blocks and creates all of the edges that connect them. It is located in tree-cfg.c and is described by pass_build_cfg.
This pass walks the entire function and collects an array of all variables referenced in the function, referenced_vars. The index at which a variable is found in the array is used as a UID for the variable within this function. This data is needed by the SSA rewriting routines. The pass is located in tree-dfa.c and is described by pass_referenced_vars.
This pass rewrites the function such that it is in SSA form. After this pass, all is_gimple_reg variables will be referenced by SSA_NAMEs, and all occurrences of other variables will be annotated with VDEFs and VUSEs; PHI nodes will have been inserted as necessary for each basic block. This pass is located in tree-ssa.c and is described by pass_build_ssa.
This pass scans the function for uses of SSA_NAMEs that are fed by a default definition. For non-parameter variables, such uses are uninitialized. The pass is run twice, before and after optimization (if turned on). In the first pass we only warn for uses that are positively uninitialized; in the second pass we warn for uses that are possibly uninitialized. The pass is located in tree-ssa.c and is defined by pass_early_warn_uninitialized and pass_late_warn_uninitialized.
This pass scans the function for statements without side effects whose result is unused. It does not do memory life analysis, so any value that is stored in memory is considered used. The pass is run multiple times throughout the optimization process. It is located in tree-ssa-dce.c and is described by pass_dce.
This pass performs trivial dominator-based copy and constant propagation, expression simplification, and jump threading. It is run multiple times throughout the optimization process. It is located in tree-ssa-dom.c and is described by pass_dominator.
This pass attempts to remove redundant computation by substituting variables that are used once into the expression that uses them and seeing if the result can be simplified. It is located in tree-ssa-forwprop.c and is described by pass_forwprop.
This pass attempts to change the name of compiler temporaries involved in copy operations such that SSA->normal can coalesce the copy away. When compiler temporaries are copies of user variables, it also renames the compiler temporary to the user variable, resulting in better use of user symbols. It is located in tree-ssa-copyrename.c and is described by pass_copyrename.
This pass recognizes forms of PHI inputs that can be represented as conditional expressions and rewrites them into straight line code. It is located in tree-ssa-phiopt.c and is described by pass_phiopt.
This pass performs a flow sensitive SSA-based points-to analysis. The resulting may-alias, must-alias, and escape analysis information is used to promote variables from in-memory addressable objects to non-aliased variables that can be renamed into SSA form. We also update the VDEF/VUSE memory tags for non-renameable aggregates so that we get fewer false kills. The pass is located in tree-ssa-alias.c and is described by pass_may_alias.
Interprocedural points-to information is located in tree-ssa-structalias.c and described by pass_ipa_pta.
This pass instruments the function in order to collect runtime block and value profiling data. Such data may be fed back into the compiler on a subsequent run so as to allow optimization based on expected execution frequencies. The pass is located in tree-profile.c and is described by pass_ipa_tree_profile.
This pass implements a series of heuristics to guess the probabilities of branches. The resulting predictions are turned into an edge profile by propagating branch probabilities across the control flow graph. The pass is located in tree-profile.c and is described by pass_profile.
This pass rewrites complex arithmetic operations into their component scalar arithmetic operations. The pass is located in tree-complex.c and is described by pass_lower_complex.
This pass rewrites suitable non-aliased local aggregate variables into a set of scalar variables. The resulting scalar variables are rewritten into SSA form, which allows subsequent optimization passes to do a significantly better job with them. The pass is located in tree-sra.c and is described by pass_sra.
This pass eliminates stores to memory that are subsequently overwritten by another store, without any intervening loads. The pass is located in tree-ssa-dse.c and is described by pass_dse.
This pass transforms tail recursion into a loop. It is located in tree-tailcall.c and is described by pass_tail_recursion.
This pass sinks stores and assignments down the flowgraph closer to their use point. The pass is located in tree-ssa-sink.c and is described by pass_sink_code.
This pass eliminates partially redundant computations, as well as performing load motion. The pass is located in tree-ssa-pre.c and is described by pass_pre.
Just before partial redundancy elimination, if -funsafe-math-optimizations is on, GCC tries to convert divisions to multiplications by the reciprocal. The pass is located in tree-ssa-math-opts.c and is described by pass_cse_reciprocals.
This is a simpler form of PRE that only eliminates redundancies that occur on all paths. It is located in tree-ssa-pre.c and described by pass_fre.
The main driver of the pass is placed in tree-ssa-loop.c and described by pass_loop.
The optimizations performed by this pass are:
Loop invariant motion. This pass moves only invariants that would be hard to handle on RTL level (function calls, operations that expand to nontrivial sequences of insns). With -funswitch-loops it also moves operands of conditions that are invariant out of the loop, so that we can use just trivial invariantness analysis in loop unswitching. The pass also includes store motion. The pass is implemented in tree-ssa-loop-im.c.
Canonical induction variable creation. This pass creates a simple counter for the number of iterations of the loop and replaces the exit condition of the loop with a test on it, in cases where a complicated analysis would otherwise be necessary to determine the number of iterations. Later optimizations may then determine the number easily. The pass is implemented in tree-ssa-loop-ivcanon.c.
Induction variable optimizations. This pass performs standard induction variable optimizations, including strength reduction, induction variable merging and induction variable elimination. The pass is implemented in tree-ssa-loop-ivopts.c.
Loop unswitching. This pass moves the conditional jumps that are invariant out of the loops. To achieve this, a duplicate of the loop is created for each possible outcome of conditional jump(s). The pass is implemented in tree-ssa-loop-unswitch.c.
Loop splitting. If a loop contains a conditional statement that is always true for one part of the iteration space and false for the other, this pass splits the loop into two, one dealing with one side, the other with the other, thereby removing one inner-loop conditional. The pass is implemented in tree-ssa-loop-split.c.
The optimizations also use various utility functions contained in tree-ssa-loop-manip.c, cfgloop.c, cfgloopanal.c and cfgloopmanip.c.
Vectorization. This pass transforms loops to operate on vector types instead of scalar types. Data parallelism across loop iterations is exploited to group data elements from consecutive iterations into a vector and operate on them in parallel. Depending on available target support the loop is conceptually unrolled by a factor VF (vectorization factor), which is the number of elements operated upon in parallel in each iteration, and the VF copies of each scalar operation are fused to form a vector operation. Additional loop transformations such as peeling and versioning may take place to align the number of iterations, and to align the memory accesses in the loop.
The pass is implemented in tree-vectorizer.c (the main driver), tree-vect-loop.c and tree-vect-loop-manip.c (loop specific parts and general loop utilities), tree-vect-slp.c (loop-aware SLP functionality), tree-vect-stmts.c, tree-vect-data-refs.c and tree-vect-slp-patterns.c containing the SLP pattern matcher. Analysis of data references is in tree-data-ref.c.
SLP Vectorization. This pass performs vectorization of straight-line code. The pass is implemented in tree-vectorizer.c (the main driver), tree-vect-slp.c, tree-vect-stmts.c and tree-vect-data-refs.c.
Autoparallelization. This pass splits the loop iteration space to run into several threads. The pass is implemented in tree-parloops.c.
Graphite is a loop transformation framework based on the polyhedral model. Graphite stands for Gimple Represented as Polyhedra. The internals of this infrastructure are documented in http://gcc.gnu.org/wiki/Graphite. The passes working on this representation are implemented in the various graphite-* files.
This pass applies if-conversion to simple loops to help the vectorizer. We identify if-convertible loops, if-convert statements and merge basic blocks into one big block. The idea is to present the loop in such a form that the vectorizer can have a one-to-one mapping between statements and available vector operations. This pass is located in tree-if-conv.c and is described by pass_if_conversion.
This pass relaxes a lattice of values in order to identify those that must be constant even in the presence of conditional branches. The pass is located in tree-ssa-ccp.c and is described by pass_ccp.
A related pass that works on memory loads and stores, and not just register values, is located in tree-ssa-ccp.c and described by pass_store_ccp.
This is similar to constant propagation but the lattice of values is the “copy-of” relation. It eliminates redundant copies from the code. The pass is located in tree-ssa-copy.c and described by pass_copy_prop.
A related pass that works on memory copies, and not just register copies, is located in tree-ssa-copy.c and described by pass_store_copy_prop.
This transformation is similar to constant propagation but instead of propagating single constant values, it propagates known value ranges. The implementation is based on Patterson’s range propagation algorithm (Accurate Static Branch Prediction by Value Range Propagation, J. R. C. Patterson, PLDI ’95). In contrast to Patterson’s algorithm, this implementation does not propagate branch probabilities, nor does it use more than a single range per SSA name. This means that the current implementation cannot be used for branch prediction (though adapting it would not be difficult). The pass is located in tree-vrp.c and is described by pass_vrp.
This pass simplifies built-in functions, as applicable, with constant arguments or with inferable string lengths. It is located in tree-ssa-ccp.c and is described by pass_fold_builtins.
This pass identifies critical edges and inserts empty basic blocks such that the edge is no longer critical. The pass is located in tree-cfg.c and is described by pass_split_crit_edges.
This pass is a stronger form of dead code elimination that can eliminate unnecessary control flow statements. It is located in tree-ssa-dce.c and is described by pass_cd_dce.
This pass identifies function calls that may be rewritten into jumps. No code transformation is actually applied here, but the data and control flow problem is solved. The code transformation requires target support, and so is delayed until RTL. In the meantime CALL_EXPR_TAILCALL is set indicating the possibility. The pass is located in tree-tailcall.c and is described by pass_tail_calls. The RTL transformation is handled by fixup_tail_calls in calls.c.
For non-void functions, this pass locates return statements that do not specify a value and issues a warning. Such a statement may have been injected by falling off the end of the function. This pass is run last so that we have as much time as possible to prove that the statement is not reachable. It is located in tree-cfg.c and is described by pass_warn_function_return.
This pass rewrites the function such that it is in normal form. At the same time, we eliminate as many single-use temporaries as possible, so the intermediate language is no longer GIMPLE, but GENERIC. The pass is located in tree-outof-ssa.c and is described by pass_del_ssa.
This is part of the CFG cleanup passes. It attempts to join PHI nodes from a forwarder CFG block into another block with PHI nodes. The pass is located in tree-cfgcleanup.c and is described by pass_merge_phi.
If a function always returns the same local variable, and that local variable is an aggregate type, then the variable is replaced with the return value for the function (i.e., the function’s DECL_RESULT). This is equivalent to the C++ named return value optimization applied to GIMPLE. The pass is located in tree-nrv.c and is described by pass_nrv.
If a function returns a memory object and is called as var = foo(), this pass tries to change the call so that the address of var is sent to the caller to avoid an extra memory copy. This pass is located in tree-nrv.c and is described by pass_return_slot.
This is a propagation pass similar to CCP that tries to remove calls to __builtin_object_size when the size of the object can be computed at compile time. This pass is located in tree-object-size.c and is described by pass_object_sizes.
This pass removes expensive loop-invariant computations out of loops. The pass is located in tree-ssa-loop.c and described by pass_lim.
This is a family of loop transformations that works on loop nests. It includes loop interchange, scaling, skewing and reversal, and they are all geared to the optimization of data locality in array traversals and the removal of dependencies that hamper optimizations such as loop parallelization and vectorization. The pass is located in tree-loop-linear.c and described by pass_linear_transform.
This pass removes loops with no code in them. The pass is located in tree-ssa-loop-ivcanon.c and described by pass_empty_loop.
This pass completely unrolls loops with few iterations. The pass is located in tree-ssa-loop-ivcanon.c and described by pass_complete_unroll.
This pass makes the code reuse the computations from the previous iterations of the loops, especially loads and stores to memory. It does so by storing the values of these computations in a bank of temporary variables that are rotated at the end of the loop. To avoid the need for this rotation, the loop is then unrolled and the copies of the loop body are rewritten to use the appropriate version of the temporary variable. This pass is located in tree-predcom.c and described by pass_predcom.
This pass issues prefetch instructions for array references inside loops. The pass is located in tree-ssa-loop-prefetch.c and described by pass_loop_prefetch.
This pass rewrites arithmetic expressions to enable optimizations that operate on them, like redundancy elimination and vectorization. The pass is located in tree-ssa-reassoc.c and described by pass_reassoc.
This pass tries to avoid the saving of register arguments into the stack on entry to stdarg functions. If the function doesn’t use any va_start macros, no registers need to be saved. If va_start macros are used and the va_list variables don’t escape the function, it is only necessary to save registers that will be used in va_arg macros. For instance, if va_arg is only used with integral types in the function, floating point registers don’t need to be saved. This pass is located in tree-stdarg.c and described by pass_stdarg.
The following briefly describes the RTL generation and optimization passes that are run after the Tree optimization passes.
The source files for RTL generation include stmt.c, calls.c, expr.c, explow.c, expmed.c, function.c, optabs.c and emit-rtl.c. Also, the file insn-emit.c, generated from the machine description by the program genemit, is used in this pass. The header file expr.h is used for communication within this pass.
The header files insn-flags.h and insn-codes.h, generated from the machine description by the programs genflags and gencodes, tell this pass which standard names are available for use and which patterns correspond to them.
This pass generates the glue that handles communication between the exception handling library routines and the exception handlers within the function. Entry points in the function that are invoked by the exception handling library are called landing pads. The code for this pass is located in except.c.
This pass removes unreachable code, simplifies jumps to the following instruction, jumps to jumps, jumps across jumps, etc. The pass is run multiple times. For historical reasons, it is occasionally referred to as the “jump optimization pass”. The bulk of the code for this pass is in cfgcleanup.c, and there are support routines in cfgrtl.c and jump.c.
This pass attempts to remove redundant computation by substituting variables that come from a single definition, and seeing if the result can be simplified. It performs copy propagation and addressing mode selection. The pass is run twice, with values being propagated into loops only on the second run. The code is located in fwprop.c.
This pass removes redundant computation within basic blocks, and optimizes addressing modes based on cost. The pass is run twice. The code for this pass is located in cse.c.
This pass performs two different types of GCSE depending on whether you are optimizing for size or not (LCM based GCSE tends to increase code size for a gain in speed, while Morel-Renvoise based GCSE does not). When optimizing for size, GCSE is done using Morel-Renvoise Partial Redundancy Elimination, with the exception that it does not try to move invariants out of loops—that is left to the loop optimization pass. If MR PRE GCSE is done, code hoisting (aka unification) is also done, as well as load motion. If you are optimizing for speed, LCM (lazy code motion) based GCSE is done. LCM is based on the work of Knoop, Ruthing, and Steffen. LCM based GCSE also does loop invariant code motion. We also perform load and store motion when optimizing for speed. Regardless of which type of GCSE is used, the GCSE pass also performs global constant and copy propagation. The source file for this pass is gcse.c, and the LCM routines are in lcm.c.
This pass performs several loop related optimizations. The source files cfgloopanal.c and cfgloopmanip.c contain generic loop analysis and manipulation code. Initialization and finalization of loop structures is handled by loop-init.c. A loop invariant motion pass is implemented in loop-invariant.c. Basic block level optimizations—unrolling, and peeling loops— are implemented in loop-unroll.c. Replacing of the exit condition of loops by special machine-dependent instructions is handled by loop-doloop.c.
This pass is an aggressive form of GCSE that transforms the control flow graph of a function by propagating constants into conditional branch instructions. The source file for this pass is gcse.c.
This pass attempts to replace conditional branches and surrounding assignments with arithmetic, boolean value producing comparison instructions, and conditional move instructions. In the very last invocation after reload/LRA, it will generate predicated instructions when supported by the target. The code is located in ifcvt.c.
This pass splits independent uses of each pseudo-register. This can improve the effect of other transformations, such as CSE or register allocation. The code for this pass is located in web.c.
This pass attempts to combine groups of two or three instructions that are related by data flow into single instructions. It combines the RTL expressions for the instructions by substitution, simplifies the result using algebra, and then attempts to match the result against the machine description. The code is located in combine.c.
This pass looks for instructions that require the processor to be in a specific “mode” and minimizes the number of mode changes required to satisfy all users. What these modes are, and what they apply to are completely target-specific. The code for this pass is located in mode-switching.c.
This pass looks at innermost loops and reorders their instructions by overlapping different iterations. Modulo scheduling is performed immediately before instruction scheduling. The code for this pass is located in modulo-sched.c.
This pass looks for instructions whose output will not be available by the time that it is used in subsequent instructions. Memory loads and floating point instructions often have this behavior on RISC machines. It re-orders instructions within a basic block to try to separate the definition and use of items that otherwise would cause pipeline stalls. This pass is performed twice, before and after register allocation. The code for this pass is located in haifa-sched.c, sched-deps.c, sched-ebb.c, sched-rgn.c and sched-vis.c.
These passes make sure that all occurrences of pseudo registers are eliminated, either by allocating them to a hard register, replacing them by an equivalent expression (e.g. a constant) or by placing them on the stack. This is done in several subpasses:
Source files of the allocator are ira.c, ira-build.c, ira-costs.c, ira-conflicts.c, ira-color.c, ira-emit.c and ira-lives.c, plus the header files ira.h and ira-int.h used for the communication between the allocator and the rest of the compiler and between the IRA files.
The reload pass also optionally eliminates the frame pointer and inserts instructions to save and restore call-clobbered registers around calls.
Source files are reload.c and reload1.c, plus the header reload.h used for communication between them.
Unlike the reload pass, intermediate LRA decisions are reflected in RTL as much as possible. This reduces the number of target-dependent macros and hooks, leaving instruction constraints as the primary source of control.
LRA is run on targets for which TARGET_LRA_P returns true.
This pass implements profile guided code positioning. If profile information is not available, various types of static analysis are performed to make the predictions normally coming from the profile feedback (i.e. execution frequency, branch probability, etc.). It is implemented in the file bb-reorder.c, and the various prediction routines are in predict.c.
This pass computes where the variables are stored at each position in code and generates notes describing the variable locations to RTL code. The location lists are then generated according to these notes to debug information if the debugging information format supports location lists. The code is located in var-tracking.c.
This optional pass attempts to find instructions that can go into the delay slots of other instructions, usually jumps and calls. The code for this pass is located in reorg.c.
On many RISC machines, branch instructions have a limited range. Thus, longer sequences of instructions must be used for long branches. In this pass, the compiler figures out how far each instruction will be from each other instruction, and therefore whether the usual instructions, or the longer sequences, must be used for each branch. The code for this pass is located in final.c.
Conversion from usage of some hard registers to usage of a register stack may be done at this point. Currently, this is supported only for the floating-point registers of the Intel 80387 coprocessor. The code for this pass is located in reg-stack.c.
This pass outputs the assembler code for the function. The source files are final.c plus insn-output.c; the latter is generated automatically from the machine description by the tool genoutput. The header file conditions.h is used for communication between these files.
This is run after final because it must output the stack slot offsets for pseudo registers that did not get hard registers. Source files are dbxout.c for DBX symbol table format, dwarfout.c for DWARF symbol table format, files dwarf2out.c and dwarf2asm.c for DWARF2 symbol table format, and vmsdbgout.c for VMS debug symbol table format.
Optimization info
This section describes the dump infrastructure, which is common to both pass dumps and optimization dumps. The goal of this infrastructure is to provide both GCC developers and users with detailed information about the various compiler transformations and optimizations.
Dump setup
A dump_manager class is defined in dumpfile.h. Various passes register dumping pass-specific information via dump_register in passes.c. During the registration, an optimization pass can select its optimization group (see Optimization groups). After that, optimization information corresponding to the entire group (presumably from multiple passes) can be output via command-line switches. Note that if a pass does not fit into any of the pre-defined groups, it can select OPTGROUP_NONE.

Note that in general, a pass need not know its dump output file name, whether certain flags are enabled, etc. However, for legacy reasons, passes could also call dump_begin, which returns a stream in case the particular pass has optimization dumps enabled. A pass could call dump_end when the dump has ended. These methods should go away once all the passes are converted to use the new dump infrastructure.

The recommended way to set up the dump output is via dump_start and dump_end.
Optimization groups
The optimization passes are grouped into several categories. Currently defined categories in dumpfile.h are:

OPTGROUP_IPA
IPA optimization passes. Enabled by -ipa.
OPTGROUP_LOOP
Loop optimization passes. Enabled by -loop.
OPTGROUP_INLINE
Inlining passes. Enabled by -inline.
OPTGROUP_OMP
OMP (Offloading and Multi Processing) passes. Enabled by -omp.
OPTGROUP_VEC
Vectorization passes. Enabled by -vec.
OPTGROUP_OTHER
All other optimization passes which do not fall into one of the above.
OPTGROUP_ALL
All optimization passes. Enabled by -optall.
By using groups a user could selectively enable optimization information only for a group of passes. By default, the optimization information for all the passes is dumped.
Dump files and streams
There are two separate output streams available for outputting optimization information from passes. Note that both these streams accept stderr and stdout as valid streams and thus it is possible to dump output to standard output or error. This is especially handy for outputting all available information in a single file by redirecting stderr.

pstream
This stream is for pass-specific dump output. For example, -fdump-tree-vect=foo.v dumps tree vectorization pass output into the given file name foo.v. If the file name is not provided, the default file name is based on the source file and pass number. Note that one could also use the special file names stdout and stderr for dumping to standard output and standard error respectively.

alt_stream
This stream is used for printing optimization-specific output in response to -fopt-info. Again a file name can be given. If the file name is not given, it defaults to stderr.
Dump output verbosity
The dump verbosity has the following options:

optimized
Print information when an optimization is successfully applied. It is up to a pass to decide which information is relevant. For example, the vectorizer passes print the source location of loops which got successfully vectorized.

missed
Print information about missed optimizations. Individual passes control which information to include in the output. For example,

gcc -O2 -ftree-vectorize -fopt-info-vec-missed

will print information about missed optimization opportunities from vectorization passes on stderr.

note
Print verbose information about optimizations, such as certain transformations, more detailed messages about decisions etc.

all
Print detailed optimization information. This includes optimized, missed, and note.
Dump types
dump_printf
This is a generic method for doing formatted output. It takes an additional argument dump_kind which signifies the type of dump. This method outputs information only when the dumps are enabled for this particular dump_kind. Note that the caller doesn’t need to know if the particular dump is enabled or not, or even the file name. The caller only needs to decide which dump output information is relevant, and under what conditions. This determines the associated flags.
Consider the following example from loop-unroll.c where an informative message about a loop (along with its location) is printed when any of the following flags is enabled:

int report_flags = MSG_OPTIMIZED_LOCATIONS | TDF_RTL | TDF_DETAILS;
dump_printf_loc (report_flags, insn,
                 "loop turned into non-loop; it never loops.\n");
dump_basic_block
Output basic block.
dump_generic_expr
Output generic expression.
dump_gimple_stmt
Output gimple statement.
Note that the above methods also have variants prefixed with _loc, such as dump_printf_loc, which are similar except they also output the source location information. The _loc variants take a const dump_location_t &. This class can be constructed from a gimple * or from a rtx_insn *, and so callers can pass a gimple * or a rtx_insn * as the _loc argument.

The dump_location_t constructor will extract the source location from the statement or instruction, along with the profile count, and the location in GCC’s own source code (or the plugin) from which the dump call was emitted. Only the source location is currently used.

There is also a dump_user_location_t class, capturing the source location and profile count, but not the dump emission location, so that locations in the user’s code can be passed around. This can also be constructed from a gimple * and from a rtx_insn *, and it too can be passed as the _loc argument.
Dump examples
gcc -O3 -fopt-info-missed=missed.all
outputs the missed optimization report from all the passes into missed.all.
As another example,
gcc -O3 -fopt-info-inline-optimized-missed=inline.txt
will output information about missed optimizations as well as optimized locations from all the inlining passes into inline.txt.
If the filename is provided, then the dumps from all the applicable optimizations are concatenated into the filename. Otherwise the dump is output onto stderr. If options is omitted, it defaults to optimized-optall, which means dump all information about successful optimizations from all the passes. In the following example, the optimization information is output on to stderr.
gcc -O3 -fopt-info
Note that -fopt-info-vec-missed behaves the same as -fopt-info-missed-vec. The order of the optimization group names and message types listed after -fopt-info does not matter.
As another example, consider
gcc -fopt-info-vec-missed=vec.miss -fopt-info-loop-optimized=loop.opt
Here the two output file names vec.miss and loop.opt are in conflict, since only one output file is allowed. In this case, only the first option takes effect and the subsequent options are ignored. Thus only vec.miss is produced, which contains dumps from the vectorizer about missed opportunities.
Sizes and offsets as runtime invariants
GCC allows the size of a hardware register to be a runtime invariant rather than a compile-time constant. This in turn means that various sizes and offsets must also be runtime invariants rather than compile-time constants, such as:
machine_mode (see Machine Modes);
mem rtx (see Registers and Memory); and
subreg rtx (see Registers and Memory).
The motivating example is the Arm SVE ISA, whose vector registers can be any multiple of 128 bits between 128 and 2048 inclusive. The compiler normally produces code that works for all SVE register sizes, with the actual size only being known at runtime.
GCC’s main representation of such runtime invariants is the poly_int class. This chapter describes what poly_int does, lists the available operations, and gives some general usage guidelines.
Overview of poly_int
We define indeterminates x1, …, xn whose values are only known at runtime and use polynomials of the form:
c0 + c1 * x1 + … + cn * xn
to represent a size or offset whose value might depend on some of these indeterminates. The coefficients c0, …, cn are always known at compile time, with the c0 term being the “constant” part that does not depend on any runtime value.
GCC uses the poly_int class to represent these coefficients. The class has two template parameters: the first specifies the number of coefficients (n + 1) and the second specifies the type of the coefficients. For example, ‘poly_int<2, unsigned short>’ represents a polynomial with two coefficients (and thus one indeterminate), with each coefficient having type unsigned short. When n is 0, the class degenerates to a single compile-time constant c0.

The number of coefficients needed for compilation is a fixed property of each target and is specified by the configuration macro NUM_POLY_INT_COEFFS. The default value is 1, since most targets do not have such runtime invariants. Targets that need a different value should #define the macro in their cpu-modes.def file. See Anatomy of a Target Back End.

poly_int makes the simplifying requirement that each indeterminate must be a nonnegative integer. An indeterminate value of 0 should usually represent the minimum possible runtime value, with c0 specifying the value in that case.
For example, when targeting the Arm SVE ISA, the single indeterminate represents the number of 128-bit blocks in a vector beyond the minimum length of 128 bits. Thus the number of 64-bit doublewords in a vector is 2 + 2 * x1. If an aggregate has a single SVE vector and 16 additional bytes, its total size is 32 + 16 * x1 bytes.
The header file poly-int-types.h provides typedefs for the most common forms of poly_int, all having NUM_POLY_INT_COEFFS coefficients:

poly_uint16
a ‘poly_int’ with unsigned short coefficients.
poly_int64
a ‘poly_int’ with HOST_WIDE_INT coefficients.
poly_uint64
a ‘poly_int’ with unsigned HOST_WIDE_INT coefficients.
poly_offset_int
a ‘poly_int’ with offset_int coefficients.
poly_wide_int
a ‘poly_int’ with wide_int coefficients.
poly_widest_int
a ‘poly_int’ with widest_int coefficients.

Since the main purpose of poly_int is to represent sizes and offsets, the last two typedefs are only rarely used.
Consequences of using poly_int
The two main consequences of using polynomial sizes and offsets are that:

some comparisons cannot be resolved at compile time, so there is no total ordering between poly_ints; and
some arithmetic operations might yield results that cannot be expressed as a poly_int.
For example, if x is a runtime invariant, we cannot tell at compile time whether:
3 + 4x <= 1 + 5x
since the condition is false when x <= 1 and true when x >= 2.
Similarly, poly_int cannot represent the result of:
(3 + 4x) * (1 + 5x)
since it cannot (and in practice does not need to) store powers greater than one. It also cannot represent the result of:
(3 + 4x) / (1 + 5x)
The following sections describe how we deal with these restrictions.
As described earlier, a poly_int<1, T> has no indeterminates and so degenerates to a compile-time constant of type T. It would be possible in that case to do all normal arithmetic on the T, and to compare the T using the normal C++ operators. We deliberately prevent target-independent code from doing this, since the compiler needs to support other poly_int<n, T> as well, regardless of the current target’s NUM_POLY_INT_COEFFS.

However, it would be very artificial to force target-specific code to follow these restrictions if the target has no runtime indeterminates. There is therefore an implicit conversion from poly_int<1, T> to T when compiling target-specific translation units.
Comparisons involving poly_int
In general we need to compare sizes and offsets in two situations: those in which the values need to be ordered, and those in which the values can be unordered. More loosely, the distinction is often between values that have a definite link (usually because they refer to the same underlying register or memory location) and values that have no definite link. An example of the former is the relationship between the inner and outer sizes of a subreg, where we must know at compile time whether the subreg is paradoxical, partial, or complete. An example of the latter is alias analysis: we might want to check whether two arbitrary memory references overlap.
Referring back to the examples in the previous section, it makes sense to ask whether a memory reference of size ‘3 + 4x’ overlaps one of size ‘1 + 5x’, but it does not make sense to have a subreg in which the outer mode has ‘3 + 4x’ bytes and the inner mode has ‘1 + 5x’ bytes (or vice versa). Such subregs are always invalid and should trigger an internal compiler error if formed.
The underlying operators are the same in both cases, but the distinction affects how they are used.
Comparison functions for poly_int

poly_int provides the following routines for checking whether a particular condition “may be” (might be) true:
The functions have their natural meaning:

maybe_lt (a, b)
Return true if a might be less than b.
maybe_le (a, b)
Return true if a might be less than or equal to b.
maybe_eq (a, b)
Return true if a might be equal to b.
maybe_ne (a, b)
Return true if a might not be equal to b.
maybe_ge (a, b)
Return true if a might be greater than or equal to b.
maybe_gt (a, b)
Return true if a might be greater than b.
For readability, poly_int also provides “known” inverses of these functions:

known_lt (a, b) == !maybe_ge (a, b)
known_le (a, b) == !maybe_gt (a, b)
known_eq (a, b) == !maybe_ne (a, b)
known_ge (a, b) == !maybe_lt (a, b)
known_gt (a, b) == !maybe_le (a, b)
known_ne (a, b) == !maybe_eq (a, b)
Properties of the poly_int comparisons

All “maybe” relations except maybe_ne are transitive, so for example:
maybe_lt (a, b) && maybe_lt (b, c) implies maybe_lt (a, c)
for all a, b and c. maybe_lt, maybe_gt and maybe_ne are irreflexive, so for example:
!maybe_lt (a, a)
is true for all a. maybe_le, maybe_eq and maybe_ge are reflexive, so for example:
maybe_le (a, a)
is true for all a. maybe_eq and maybe_ne are symmetric, so:

maybe_eq (a, b) == maybe_eq (b, a)
maybe_ne (a, b) == maybe_ne (b, a)
for all a and b. In addition:
maybe_le (a, b) == maybe_lt (a, b) || maybe_eq (a, b)
maybe_ge (a, b) == maybe_gt (a, b) || maybe_eq (a, b)
maybe_lt (a, b) == maybe_gt (b, a)
maybe_le (a, b) == maybe_ge (b, a)
However:
maybe_le (a, b) && maybe_le (b, a) does not imply !maybe_ne (a, b) [== known_eq (a, b)]
maybe_ge (a, b) && maybe_ge (b, a) does not imply !maybe_ne (a, b) [== known_eq (a, b)]
One example is again ‘a == 3 + 4x’ and ‘b == 1 + 5x’, where ‘maybe_le (a, b)’, ‘maybe_ge (a, b)’ and ‘maybe_ne (a, b)’ all hold. maybe_le and maybe_ge are therefore not antisymmetric and do not form a partial order.
From the above, it follows that:

All “known” relations except known_ne are transitive.
known_lt, known_ne and known_gt are irreflexive.
known_le, known_eq and known_ge are reflexive.
Also:
known_lt (a, b) == known_gt (b, a)
known_le (a, b) == known_ge (b, a)
known_lt (a, b) implies !known_lt (b, a)  [asymmetry]
known_gt (a, b) implies !known_gt (b, a)
known_le (a, b) && known_le (b, a) == known_eq (a, b) [== !maybe_ne (a, b)]
known_ge (a, b) && known_ge (b, a) == known_eq (a, b) [== !maybe_ne (a, b)]
known_le and known_ge are therefore antisymmetric and are partial orders. However:

known_le (a, b) does not imply known_lt (a, b) || known_eq (a, b)
known_ge (a, b) does not imply known_gt (a, b) || known_eq (a, b)
For example, ‘known_le (4, 4 + 4x)’ holds because the runtime indeterminate x is a nonnegative integer, but neither known_lt (4, 4 + 4x) nor known_eq (4, 4 + 4x) holds.
Comparing potentially-unordered poly_ints

In cases where there is no definite link between two poly_ints, we can usually make a conservatively-correct assumption. For example, the conservative assumption for alias analysis is that two references might alias.
One way of checking whether [begin1, end1) might overlap [begin2, end2) using the poly_int comparisons is:
maybe_gt (end1, begin2) && maybe_gt (end2, begin1)
and another (equivalent) way is:
!(known_le (end1, begin2) || known_le (end2, begin1))
However, in this particular example, it is better to use the range helper functions instead. See Range checks on poly_ints.
Comparing ordered poly_ints

In cases where there is a definite link between two poly_ints, such as the outer and inner sizes of subregs, we usually require the sizes to be ordered by the known_le partial order. poly_int provides the following utility functions for ordered values:
ordered_p (a, b)
Return true if a and b are ordered by the known_le partial order.

ordered_min (a, b)
Assert that a and b are ordered by known_le and return the minimum of the two. When using this function, please add a comment explaining why the values are known to be ordered.

ordered_max (a, b)
Assert that a and b are ordered by known_le and return the maximum of the two. When using this function, please add a comment explaining why the values are known to be ordered.
For example, if a subreg has an outer mode of size outer and an inner mode of size inner:

the subreg is complete if known_eq (inner, outer);
otherwise, the subreg is paradoxical if known_le (inner, outer);
otherwise, the subreg is partial if known_le (outer, inner);
otherwise, the subreg is ill-formed.

Thus the subreg is only valid if ‘ordered_p (outer, inner)’ is true. If this condition is already known to be true then:

the subreg is complete if known_eq (inner, outer);
the subreg is paradoxical if maybe_lt (inner, outer);
the subreg is partial if maybe_lt (outer, inner);

with the three conditions being mutually exclusive.
Code that checks whether a subreg is valid would therefore generally check whether ordered_p holds (in addition to whatever other checks are required for subreg validity). Code that is dealing with existing subregs can assert that ordered_p holds and use either of the classifications above.
Checking for a poly_int marker value

It is sometimes useful to have a special “marker value” that is not meant to be taken literally. For example, some code uses a size of -1 to represent an unknown size, rather than having to carry around a separate boolean to say whether the size is known.
The best way of checking whether something is a marker value is known_eq. Conversely the best way of checking whether something is not a marker value is maybe_ne.
Thus in the size example just mentioned, ‘known_eq (size, -1)’ would check for an unknown size and ‘maybe_ne (size, -1)’ would check for a known size.
Range checks on poly_ints

As well as the core comparisons (see Comparison functions for poly_int), poly_int provides utilities for various kinds of range check. In each case the range is represented by a start position and a size rather than a start position and an end position; this is because the former is used much more often than the latter in GCC. Also, the sizes can be -1 (or all ones for unsigned sizes) to indicate a range with a known start position but an unknown size. All other sizes must be nonnegative. A range of size 0 does not contain anything or overlap anything.
known_size_p (size)
Return true if size represents a known range size, false if it is -1 or all ones (for signed and unsigned types respectively).

ranges_maybe_overlap_p (pos1, size1, pos2, size2)
Return true if the range described by pos1 and size1 might overlap the range described by pos2 and size2 (in other words, return true if we cannot prove that the ranges are disjoint).

ranges_known_overlap_p (pos1, size1, pos2, size2)
Return true if the range described by pos1 and size1 is known to overlap the range described by pos2 and size2.

known_subrange_p (pos1, size1, pos2, size2)
Return true if the range described by pos1 and size1 is known to be contained in the range described by pos2 and size2.

maybe_in_range_p (value, pos, size)
Return true if value might be in the range described by pos and size (in other words, return true if we cannot prove that value is outside that range).

known_in_range_p (value, pos, size)
Return true if value is known to be in the range described by pos and size.

endpoint_representable_p (pos, size)
Return true if the range described by pos and size is open-ended or if the endpoint (pos + size) is representable in the same type as pos and size. The function returns false if adding size to pos makes conceptual sense but could overflow.
There is also a poly_int version of the IN_RANGE_P macro:

coeffs_in_range_p (x, lower, upper)
Return true if every coefficient of x is in the inclusive range [lower, upper]. This function can be useful when testing whether an operation would cause the values of coefficients to overflow.
Note that the function does not indicate whether x itself is in the given range. x can be either a constant or a poly_int.
Sorting poly_ints

poly_int provides the following routine for sorting:
compare_sizes_for_sort (a, b)
Compare a and b in reverse lexicographical order (that is, compare the highest-indexed coefficients first). This can be useful when sorting data structures, since it has the effect of separating constant and non-constant values. If all values are nonnegative, the constant values come first.
Note that the values do not necessarily end up in numerical order. For example, ‘1 + 1x’ would come after ‘100’ in the sort order, but may well be less than ‘100’ at run time.
Arithmetic on poly_ints

Addition, subtraction, negation and bit inversion all work normally for poly_ints. Multiplication by a constant multiplier and left shifting by a constant shift amount also work normally. General multiplication of two poly_ints is not supported and is not useful in practice.
Other operations are only conditionally supported: the operation might succeed or might fail, depending on the inputs.
This section describes both types of operation.
Using poly_int with C++ arithmetic operators

The following C++ expressions are supported, where p1 and p2 are poly_ints and where c1 and c2 are scalars:
-p1
~p1
p1 + p2    p1 + c2    c1 + p2
p1 - p2    p1 - c2    c1 - p2
c1 * p2    p1 * c2
p1 << c2
p1 += p2   p1 += c2
p1 -= p2   p1 -= c2
p1 *= c2
p1 <<= c2
These arithmetic operations handle integer ranks in a similar way to C++. The main difference is that every coefficient narrower than HOST_WIDE_INT promotes to HOST_WIDE_INT, whereas in C++ everything narrower than int promotes to int.
For example:
poly_uint16 + int       -> poly_int64
unsigned int + poly_uint16 -> poly_int64
poly_int64 + int        -> poly_int64
poly_int32 + poly_uint64 -> poly_uint64
uint64 + poly_int64     -> poly_uint64
poly_offset_int + int32 -> poly_offset_int
offset_int + poly_uint16 -> poly_offset_int
In the first two examples, both coefficients are narrower than HOST_WIDE_INT, so the result has coefficients of type HOST_WIDE_INT. In the other examples, the coefficient with the highest rank “wins”.
If one of the operands is wide_int or poly_wide_int, the rules are the same as for wide_int arithmetic.
wi arithmetic on poly_ints

As well as the C++ operators, poly_int supports the following wi routines:
wi::neg (p1, &overflow)
wi::add (p1, p2)
wi::add (p1, c2)
wi::add (c1, p1)
wi::add (p1, p2, sign, &overflow)
wi::sub (p1, p2)
wi::sub (p1, c2)
wi::sub (c1, p1)
wi::sub (p1, p2, sign, &overflow)
wi::mul (p1, c2)
wi::mul (c1, p1)
wi::mul (p1, c2, sign, &overflow)
wi::lshift (p1, c2)
These routines just check whether overflow occurs on any individual coefficient; it is not possible to know at compile time whether the final runtime value would overflow.
Division of poly_ints

Division of poly_ints is possible for certain inputs. The functions for division return true if the operation is possible and in most cases return the results by pointer. The routines are:
multiple_p (a, b, &quotient)
Return true if a is an exact multiple of b, storing the result in quotient if so. There are overloads for various combinations of polynomial and constant a, b and quotient.
constant_multiple_p (a, b, &quotient)
Like multiple_p, but also test whether the multiple is a compile-time constant.
can_div_trunc_p (a, b, &quotient, &remainder)
Return true if we can calculate ‘trunc (a / b)’ at compile time, storing the result in quotient and remainder if so.
can_div_away_from_zero_p (a, b, &quotient)
Return true if we can calculate ‘a / b’ at compile time, rounding away from zero. Store the result in quotient if so.

Note that this is true if and only if can_div_trunc_p is true. The only difference is in the rounding of the result.
There is also an asserting form of division:

exact_div (a, b)
Assert that a is a multiple of b and return ‘a / b’. The result is a poly_int if a is a poly_int.
Other poly_int arithmetic

There are tentative routines for other operations besides division:
can_ior_p (a, b, &result)
Return true if we can calculate ‘a | b’ at compile time, storing the result in result if so.
Also, ANDs with a value ‘(1 << y) - 1’ or its inverse can be treated as alignment operations. See Alignment of poly_ints.
In addition, the following miscellaneous routines are available:
coeff_gcd (a)
Return the greatest common divisor of all nonzero coefficients in a, or zero if a is known to be zero.
common_multiple (a, b)
Return a value that is a multiple of both a and b, where one value is a poly_int and the other is a scalar. The result will be the least common multiple for some indeterminate values but not necessarily for all.

force_common_multiple (a, b)
Return a value that is a multiple of both poly_int a and poly_int b, asserting that such a value exists. The result will be the least common multiple for some indeterminate values but not necessarily for all.
When using this routine, please add a comment explaining why the assertion is known to hold.
Please add any other operations that you find to be useful.
Alignment of poly_ints

poly_int provides various routines for aligning values and for querying misalignments. In each case the alignment must be a power of 2.
Return true if we can align value up or down to the nearest multiple of align at compile time. The answer is the same for both directions.
Return true if can_align_p; if so, set aligned to the greatest aligned value that is less than or equal to value.

Return true if can_align_p; if so, set aligned to the lowest aligned value that is greater than or equal to value.
Return true if we can align a and b down to the nearest align boundary at compile time and if the two results are equal.
Return true if we can align a and b up to the nearest align boundary at compile time and if the two results are equal.
Return a result that is no greater than value and that is aligned to align. The result will be the closest aligned value for some indeterminate values but not necessarily for all.
For example, suppose we are allocating an object of size bytes in a downward-growing stack whose current limit is given by limit. If the object requires align bytes of alignment, the new stack limit is given by:
aligned_lower_bound (limit - size, align)
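For a scalar, the rounding operation itself reduces to a bitmask. This sketch (invented helper names, assuming align is a power of 2) shows the mask form of aligning down and the stack-limit computation described above:

```cpp
#include <cassert>
#include <cstdint>

// Sketch: round value down to a multiple of align, where align is a
// power of 2.  In unsigned arithmetic, -align is the mask that clears
// the low bits.
inline uint64_t align_down (uint64_t value, uint64_t align)
{
  return value & -align;
}

// New limit for a downward-growing stack, as in the text:
// aligned_lower_bound (limit - size, align).
inline uint64_t new_stack_limit (uint64_t limit, uint64_t size, uint64_t align)
{
  return align_down (limit - size, align);
}
```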
Likewise return a result that is no less than value and that is aligned to align. This is the routine that would be used for upward-growing stacks in the scenario just described.
Return true if we can calculate the misalignment of value with respect to align at compile time, storing the result in misalign if so.
Return the minimum alignment that value is known to have (in other words, the largest alignment that can be guaranteed whatever the values of the indeterminates turn out to be). Return 0 if value is known to be 0.
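The idea behind this kind of guaranteed-alignment query can be sketched for a two-coefficient value (the poly2u type and helper name are invented): whatever the indeterminate turns out to be, the value is a multiple of the largest power of 2 dividing every coefficient, which is the lowest set bit of the OR of the coefficients.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Invented stand-in for poly_int<2, uint32_t>: value = coeffs[0] + coeffs[1] * x.
using poly2u = std::array<uint32_t, 2>;

// Sketch of known_alignment: the minimum alignment that holds for every
// value of x is the lowest set bit of c0 | c1; the result is 0 when the
// value is known to be zero.
inline uint64_t poly_known_alignment (const poly2u &a)
{
  uint64_t mask = uint64_t (a[0]) | a[1];
  return mask & -mask;
}
```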
Assert that value can be aligned down to align at compile time and return the result. When using this routine, please add a comment explaining why the assertion is known to hold.
Likewise, but aligning up.
Divide the result of force_align_down by align. Again, please add a comment explaining why the assertion in force_align_down is known to hold.

Likewise for force_align_up.
Assert that we can calculate the misalignment of value with respect to align at compile time and return the misalignment. When using this function, please add a comment explaining why the assertion is known to hold.
Computing bounds on poly_ints

poly_int also provides routines for calculating lower and upper bounds:
Assert that a is nonnegative and return the smallest value it can have.
Return the least value a can have, given that the context in which a appears guarantees that the answer is no less than b. In other words, the caller is asserting that a is greater than or equal to b even if ‘known_ge (a, b)’ doesn’t hold.
Return the greatest value a can have, given that the context in which a appears guarantees that the answer is no greater than b. In other words, the caller is asserting that a is less than or equal to b even if ‘known_le (a, b)’ doesn’t hold.
Return a value that is always less than or equal to both a and b. It will be the greatest such value for some indeterminate values but not necessarily for all.

Return a value that is always greater than or equal to both a and b. It will be the least such value for some indeterminate values but not necessarily for all.
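Assuming the indeterminates are nonnegative, a conservatively correct bound on two values can be sketched as a coefficient-wise minimum or maximum (the poly2 type and helper names are invented; the real routines also accept mixed polynomial and constant operands):

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>

// Invented stand-in for poly_int<2, int32_t>: value = coeffs[0] + coeffs[1] * x,
// with the indeterminate x assumed nonnegative.
using poly2 = std::array<int32_t, 2>;

// Sketch of lower_bound: taking the minimum of each coefficient yields a
// value that is <= both inputs for every x >= 0.
inline poly2 poly_lower_bound (const poly2 &a, const poly2 &b)
{
  return { std::min (a[0], b[0]), std::min (a[1], b[1]) };
}

// Sketch of upper_bound: likewise with the maximum.
inline poly2 poly_upper_bound (const poly2 &a, const poly2 &b)
{
  return { std::max (a[0], b[0]), std::max (a[1], b[1]) };
}
```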
Converting poly_ints

A poly_int<n, T> can be constructed from up to n individual T coefficients, with the remaining coefficients being implicitly zero. In particular, this means that every poly_int<n, T> can be constructed from a single scalar T, or something compatible with T.

Also, a poly_int<n, T> can be constructed from a poly_int<n, U> if T can be constructed from U.
The following functions provide other forms of conversion, or test whether such a conversion would succeed.
Return true if poly_int value is a compile-time constant.

Return true if poly_int value is a compile-time constant, storing it in c1 if so. c1 must be able to hold all constant values of value without loss of precision.
Assert that value is a compile-time constant and return its value. When using this function, please add a comment explaining why the condition is known to hold (for example, because an earlier phase of analysis rejected non-constants).
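The constancy test itself is simple in this model (invented poly2 type and helper name): a value is a compile-time constant exactly when every coefficient beyond the first is zero, in which case the constant is the first coefficient.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Invented stand-in for poly_int<2, int32_t>: value = coeffs[0] + coeffs[1] * x.
using poly2 = std::array<int32_t, 2>;

// Sketch of is_constant (&c): the value does not depend on x exactly
// when the x coefficient is zero.
inline bool poly_is_constant (const poly2 &a, int32_t *c)
{
  if (a[1] != 0)
    return false;
  *c = a[0];
  return true;
}
```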
Return true if ‘poly_int<N, T>’ value can be represented without loss of precision as a ‘poly_int<N, HOST_WIDE_INT>’, storing it in that form in p2 if so.

Return true if ‘poly_int<N, T>’ value can be represented without loss of precision as a ‘poly_int<N, unsigned HOST_WIDE_INT>’, storing it in that form in p2 if so.

Forcibly convert each coefficient of ‘poly_int<N, T>’ value to HOST_WIDE_INT, truncating any that are out of range. Return the result as a ‘poly_int<N, HOST_WIDE_INT>’.

Forcibly convert each coefficient of ‘poly_int<N, T>’ value to unsigned HOST_WIDE_INT, truncating any that are out of range. Return the result as a ‘poly_int<N, unsigned HOST_WIDE_INT>’.
Return a poly_int with the same value as value, but with the coefficients converted from HOST_WIDE_INT to wide_int. precision specifies the precision of the wide_int coefficients; if this is wider than a HOST_WIDE_INT, the coefficients of value will be sign-extended to fit.

Like wi::shwi, except that value has coefficients of type unsigned HOST_WIDE_INT. If precision is wider than a HOST_WIDE_INT, the coefficients of value will be zero-extended to fit.
Return a poly_int of the same type as value, sign-extending every coefficient from the low precision bits. This in effect applies wi::sext to each coefficient individually.

Like wi::sext, but for zero extension.

Convert value to a poly_wide_int in which each coefficient has precision bits. Extend the coefficients according to sign if the coefficients have fewer bits.

Convert value to a poly_offset_int, extending its coefficients according to sign if they have fewer bits than offset_int.

Convert value to a poly_widest_int, extending its coefficients according to sign if they have fewer bits than widest_int.
Miscellaneous poly_int routines

Print value to file as a decimal value, interpreting the coefficients according to sign. The final argument is optional if value has an inherent sign; for example, poly_int64 values print as signed by default and poly_uint64 values print as unsigned by default. This is simply a poly_int version of a wide-int routine.
Guidelines for using poly_int

One of the main design goals of poly_int was to make it easy to write target-independent code that handles variable-sized registers even when the current target has fixed-sized registers. There are two aspects to this:
- poly_int operations should be complete enough that the question in most cases becomes “Can we do this operation on these particular poly_int values? If not, bail out” rather than “Are these poly_int values constant? If so, do the operation, otherwise bail out”.

- If target-independent code compiles and runs correctly on a target with one value of NUM_POLY_INT_COEFFS, and if the code does not use asserting functions like to_constant, it is reasonable to assume that the code also works on targets with other values of NUM_POLY_INT_COEFFS. There is no need to check this during everyday development.
So the general principle is: if target-independent code is dealing with a poly_int value, it is better to operate on it as a poly_int if at all possible, choosing conservatively-correct behavior if a particular operation fails. For example, the following code handles an index pos into a sequence of vectors that each have nunits elements:
/* Calculate which vector contains the result, and which lane of
   that vector we need.  */
if (!can_div_trunc_p (pos, nunits, &vec_entry, &vec_index))
  {
    if (dump_enabled_p ())
      dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
                       "Cannot determine which vector holds the"
                       " final result.\n");
    return false;
  }
However, there are some contexts in which operating on a poly_int is not possible or does not make sense. One example is when handling static initializers, since no current target supports the concept of a variable-length static initializer. In these situations, a reasonable fallback is:
if (poly_value.is_constant (&const_value))
  {
    … /* Operate on const_value.  */ …
  }
else
  {
    … /* Conservatively correct fallback.  */ …
  }
poly_int also provides some asserting functions like to_constant. Please only use these functions if there is a good theoretical reason to believe that the assertion cannot fire. For example, if some work is divided into an analysis phase and an implementation phase, the analysis phase might reject inputs that are not is_constant, in which case the implementation phase can reasonably use to_constant on the remaining inputs. The assertions should not be used to discover whether a condition ever occurs “in the field”; in other words, they should not be used to restrict code to constants at first, with the intention of only implementing a poly_int version if a user hits the assertion.
If a particular asserting function like to_constant is needed more than once for the same reason, it is probably worth adding a helper function or macro for that situation, so that the justification only needs to be given once. For example:
/* Return the size of an element in a vector of size SIZE, given
   that the vector has NELTS elements.  The return value is in
   the same units as SIZE (either bits or bytes).

   to_constant () is safe in this situation because vector
   elements are always constant-sized scalars.  */
#define vector_element_size(SIZE, NELTS) \
  (exact_div (SIZE, NELTS).to_constant ())
Target-specific code in config/cpu only needs to handle non-constant poly_ints if NUM_POLY_INT_COEFFS is greater than one. For other targets, poly_int degenerates to a compile-time constant and is often interchangeable with a normal scalar integer. There are two main exceptions:

- An explicit conversion to a scalar integer is sometimes needed, such as to resolve ambiguity in a ?: expression, or when passing values through ... to things like print functions.

- Target-specific macros can either be converted to hooks or be given to_constant () calls where necessary. The previous option is preferable because it will help with any future conversion of the macro to a hook.
GENERIC
The purpose of GENERIC is simply to provide a language-independent way of representing an entire function in trees. To this end, it was necessary to add a few new tree codes to the back end, but almost everything was already there. If you can express it with the codes in gcc/tree.def, it’s GENERIC.
Early on, there was a great deal of debate about how to think about statements in a tree IL. In GENERIC, a statement is defined as any expression whose value, if any, is ignored. A statement will always have TREE_SIDE_EFFECTS set (or it will be discarded), but a non-statement expression may also have side effects. A CALL_EXPR, for instance.
It would be possible for some local optimizations to work on the GENERIC form of a function; indeed, the adapted tree inliner works fine on GENERIC, but the current compiler performs inlining after lowering to GIMPLE (a restricted form described in the next section). Indeed, currently the front ends perform this lowering before handing off to tree_rest_of_compilation, but this seems inelegant.
Deficiencies

There are many places in which this document is incomplet and incorrekt. It is, as of yet, only preliminary documentation.
Overview
The central data structure used by the internal representation is the tree. These nodes, while all of the C type tree, are of many varieties. A tree is a pointer type, but the object to which it points may be of a variety of types. From this point forward, we will refer to trees in ordinary type, rather than in this font, except when talking about the actual C type tree.
You can tell what kind of node a particular tree is by using the TREE_CODE macro. Many, many macros take trees as input and return trees as output. However, most macros require a certain kind of tree node as input. In other words, there is a type-system for trees, but it is not reflected in the C type-system.
For safety, it is useful to configure GCC with --enable-checking. Although this results in a significant performance penalty (since all tree types are checked at run-time), and is therefore inappropriate in a release version, it is extremely helpful during the development process.
Many macros behave as predicates. Many, although not all, of these predicates end in ‘_P’. Do not rely on the result type of these macros being of any particular type. You may, however, rely on the fact that the type can be compared to 0, so that statements like
if (TEST_P (t) && !TEST_P (y)) x = 1;
and
int i = (TEST_P (t) != 0);
are legal. Macros that return int values now may be changed to return tree values, or other pointers in the future. Even those that continue to return int may return multiple nonzero codes where previously they returned only zero and one. Therefore, you should not write code like

if (TEST_P (t) == 1)

as this code is not guaranteed to work correctly in the future.
You should not take the address of values returned by the macros or functions described here. In particular, no guarantee is given that the values are lvalues.
In general, the names of macros are all in uppercase, while the names of functions are entirely in lowercase. There are rare exceptions to this rule. You should assume that any macro or function whose name is made up entirely of uppercase letters may evaluate its arguments more than once. You may assume that a macro or function whose name is made up entirely of lowercase letters will evaluate its arguments only once.
The error_mark_node is a special tree. Its tree code is ERROR_MARK, but since there is only ever one node with that code, the usual practice is to compare the tree against error_mark_node. (This test is just a test for pointer equality.) If an error has occurred during front-end processing the flag errorcount will be set. If the front end has encountered code it cannot handle, it will issue a message to the user and set sorrycount. When these flags are set, any macro or function which normally returns a tree of a particular kind may instead return the error_mark_node. Thus, if you intend to do any processing of erroneous code, you must be prepared to deal with the error_mark_node.
Occasionally, a particular tree slot (like an operand to an expression, or a particular field in a declaration) will be referred to as “reserved for the back end”. These slots are used to store RTL when the tree is converted to RTL for use by the GCC back end. However, if that process is not taking place (e.g., if the front end is being hooked up to an intelligent editor), then those slots may be used by the back end presently in use.
If you encounter situations that do not match this documentation, such as tree nodes of types not mentioned here, or macros documented to return entities of a particular kind that instead return entities of some different kind, you have found a bug, either in the front end or in the documentation. Please report these bugs as you would any other bug.
Trees
All GENERIC trees have two fields in common. First, TREE_CHAIN is a pointer that can be used as a singly-linked list to other trees. The other is TREE_TYPE. Many trees store the type of an expression or declaration in this field.
These are some other functions for handling trees:
tree_size
Return the number of bytes a tree takes.

build0
build1
build2
build3
build4
build5
build6
These functions build a tree and supply values to put in each parameter. The basic signature is ‘code, type, [operands]’. code is the TREE_CODE, and type is a tree representing the TREE_TYPE. These are followed by the operands, each of which is also a tree.
Identifiers
An IDENTIFIER_NODE represents a slightly more general concept than the standard C or C++ concept of identifier. In particular, an IDENTIFIER_NODE may contain a ‘$’, or other extraordinary characters.

There are never two distinct IDENTIFIER_NODEs representing the same identifier. Therefore, you may use pointer equality to compare IDENTIFIER_NODEs, rather than using a routine like strcmp. Use get_identifier to obtain the unique IDENTIFIER_NODE for a supplied string.
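The uniqueness guarantee comes from interning: every spelling maps to exactly one node, so comparing nodes by pointer is equivalent to comparing the strings. A minimal sketch of the idea (not GCC's implementation; the ident_node type and intern_identifier name are invented):

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Invented identifier node; GCC's IDENTIFIER_NODE carries more fields.
struct ident_node
{
  std::string name;
};

// Sketch of get_identifier-style interning: return the unique node for a
// spelling, creating it on first use.  std::unordered_map never moves its
// elements on rehash, so the returned pointers stay valid, and pointer
// equality is enough to compare identifiers.
inline ident_node *intern_identifier (const std::string &s)
{
  static std::unordered_map<std::string, ident_node> table;
  return &table.emplace (s, ident_node{s}).first->second;
}
```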
You can use the following macros to access identifiers:
IDENTIFIER_POINTER
The string represented by the identifier, represented as a char*. This string is always NUL-terminated, and contains no embedded NUL characters.

IDENTIFIER_LENGTH
The length of the string returned by IDENTIFIER_POINTER, not including the trailing NUL. The value of IDENTIFIER_LENGTH (x) is always the same as strlen (IDENTIFIER_POINTER (x)).

IDENTIFIER_OPNAME_P
This predicate holds if the identifier represents the name of an overloaded operator. In this case, you should not depend on the contents of either the IDENTIFIER_POINTER or the IDENTIFIER_LENGTH.

IDENTIFIER_TYPENAME_P
This predicate holds if the identifier represents the name of a user-defined conversion operator. In this case, the TREE_TYPE of the IDENTIFIER_NODE holds the type to which the conversion operator converts.
Containers
Two common container data structures can be represented directly with tree nodes. A TREE_LIST is a singly linked list containing two trees per node. These are the TREE_PURPOSE and TREE_VALUE of each node. (Often, the TREE_PURPOSE contains some kind of tag, or additional information, while the TREE_VALUE contains the majority of the payload. In other cases, the TREE_PURPOSE is simply NULL_TREE, while in still others both the TREE_PURPOSE and TREE_VALUE are of equal stature.) Given one TREE_LIST node, the next node is found by following the TREE_CHAIN. If the TREE_CHAIN is NULL_TREE, then you have reached the end of the list.

A TREE_VEC is a simple vector. The TREE_VEC_LENGTH is an integer (not a tree) giving the number of nodes in the vector. The nodes themselves are accessed using the TREE_VEC_ELT macro, which takes two arguments. The first is the TREE_VEC in question; the second is an integer indicating which element in the vector is desired. The elements are indexed from zero.
Types
All types have corresponding tree nodes. However, you should not assume that there is exactly one tree node corresponding to each type. There are often multiple nodes corresponding to the same type.
For the most part, different kinds of types have different tree codes. (For example, pointer types use a POINTER_TYPE code while arrays use an ARRAY_TYPE code.) However, pointers to member functions use the RECORD_TYPE code. Therefore, when writing a switch statement that depends on the code associated with a particular type, you should take care to handle pointers to member functions under the RECORD_TYPE case label.
The following functions and macros deal with cv-qualification of types:
TYPE_MAIN_VARIANT
This macro returns the unqualified version of a type. It may be applied to an unqualified type, but it is not always the identity function in that case.
A few other macros and functions are usable with all types:
TYPE_SIZE
The number of bits required to represent the type, represented as an INTEGER_CST. For an incomplete type, TYPE_SIZE will be NULL_TREE.

TYPE_ALIGN
The alignment of the type, in bits, represented as an int.

TYPE_NAME
This macro returns a declaration (in the form of a TYPE_DECL) for the type. (Note this macro does not return an IDENTIFIER_NODE, as you might expect, given its name!) You can look at the DECL_NAME of the TYPE_DECL to obtain the actual name of the type. The TYPE_NAME will be NULL_TREE for a type that is not a built-in type, the result of a typedef, or a named class type.
TYPE_CANONICAL
This macro returns the “canonical” type for the given type node. Canonical types are used to improve performance in the C++ and Objective-C++ front ends by allowing efficient comparison between two type nodes in same_type_p: if the TYPE_CANONICAL values of the types are equal, the types are equivalent; otherwise, the types are not equivalent. The notion of equivalence for canonical types is the same as the notion of type equivalence in the language itself.

When TYPE_CANONICAL is NULL_TREE, there is no canonical type for the given type node. In this case, comparison between this type and any other type requires the compiler to perform a deep, “structural” comparison to see if the two type nodes have the same form and properties.
The canonical type for a node is always the most fundamental type in the equivalence class of types. For instance, int is its own canonical type. A typedef I of int will have int as its canonical type. Similarly, I* and a typedef IP (defined to I*) will have int* as their canonical type. When building a new type node, be sure to set TYPE_CANONICAL to the appropriate canonical type. If the new type is a compound type (built from other types), and any of those other types require structural equality, use SET_TYPE_STRUCTURAL_EQUALITY to ensure that the new type also requires structural equality. Finally, if for some reason you cannot guarantee that TYPE_CANONICAL will point to the canonical type, use SET_TYPE_STRUCTURAL_EQUALITY to make sure that the new type (and any type constructed based on it) requires structural equality. If you suspect that the canonical type system is miscomparing types, pass --param verify-canonical-types=1 to the compiler or configure with --enable-checking to force the compiler to verify its canonical-type comparisons against the structural comparisons; the compiler will then print any warnings if the canonical types miscompare.
TYPE_STRUCTURAL_EQUALITY_P
This predicate holds when the node requires structural equality checks, e.g., when TYPE_CANONICAL is NULL_TREE.

SET_TYPE_STRUCTURAL_EQUALITY
This macro states that the type node it is given requires structural equality checks, e.g., it sets TYPE_CANONICAL to NULL_TREE.

same_type_p
This predicate takes two types as input, and holds if they are the same type. For example, if one type is a typedef for the other, or both are typedefs for the same type. This predicate also holds if the two trees given as input are simply copies of one another; i.e., there is no difference between them at the source level, but, for whatever reason, a duplicate has been made in the representation. You should never use == (pointer equality) to compare types; always use same_type_p instead.
Detailed below are the various kinds of types, and the macros that can be used to access them. Although other kinds of types are used elsewhere in G++, the types described here are the only ones that you will encounter while examining the intermediate representation.
VOID_TYPE
Used to represent the void type.

INTEGER_TYPE
Used to represent the various integral types, including char, short, int, long, and long long. This code is not used for enumeration types, nor for the bool type. The TYPE_PRECISION is the number of bits used in the representation, represented as an unsigned int. (Note that in the general case this is not the same value as TYPE_SIZE; suppose that there were a 24-bit integer type, but that alignment requirements for the ABI required 32-bit alignment. Then, TYPE_SIZE would be an INTEGER_CST for 32, while TYPE_PRECISION would be 24.) The integer type is unsigned if TYPE_UNSIGNED holds; otherwise, it is signed.

The TYPE_MIN_VALUE is an INTEGER_CST for the smallest integer that may be represented by this type. Similarly, the TYPE_MAX_VALUE is an INTEGER_CST for the largest integer that may be represented by this type.
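The bounds implied by TYPE_PRECISION and TYPE_UNSIGNED can be computed directly. A sketch with invented helper names, valid for precisions between 1 and 62 bits:

```cpp
#include <cassert>
#include <cstdint>

// Sketch: smallest value representable in a type with the given
// precision (in bits) and signedness.
inline int64_t type_min_value (unsigned precision, bool unsigned_p)
{
  return unsigned_p ? 0 : -(int64_t (1) << (precision - 1));
}

// Sketch: largest representable value for the same type.
inline int64_t type_max_value (unsigned precision, bool unsigned_p)
{
  return unsigned_p ? (int64_t (1) << precision) - 1
                    : (int64_t (1) << (precision - 1)) - 1;
}
```

For the hypothetical 24-bit integer type from the text, the signed range would be -8388608 to 8388607 even though TYPE_SIZE is 32 bits.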
REAL_TYPE
Used to represent the float, double, and long double types. The number of bits in the floating-point representation is given by TYPE_PRECISION, as in the INTEGER_TYPE case.

FIXED_POINT_TYPE
Used to represent the short _Fract, _Fract, long _Fract, long long _Fract, short _Accum, _Accum, long _Accum, and long long _Accum types. The number of bits in the fixed-point representation is given by TYPE_PRECISION, as in the INTEGER_TYPE case. There may be padding bits, fractional bits and integral bits. The number of fractional bits is given by TYPE_FBIT, and the number of integral bits is given by TYPE_IBIT. The fixed-point type is unsigned if TYPE_UNSIGNED holds; otherwise, it is signed. The fixed-point type is saturating if TYPE_SATURATING holds; otherwise, it is not saturating.
COMPLEX_TYPE
Used to represent GCC built-in __complex__ data types. The TREE_TYPE is the type of the real and imaginary parts.

ENUMERAL_TYPE
Used to represent an enumeration type. The TYPE_PRECISION gives (as an int), the number of bits used to represent the type. If there are no negative enumeration constants, TYPE_UNSIGNED will hold. The minimum and maximum enumeration constants may be obtained with TYPE_MIN_VALUE and TYPE_MAX_VALUE, respectively; each of these macros returns an INTEGER_CST.

The actual enumeration constants themselves may be obtained by looking at the TYPE_VALUES. This macro will return a TREE_LIST, containing the constants. The TREE_PURPOSE of each node will be an IDENTIFIER_NODE giving the name of the constant; the TREE_VALUE will be an INTEGER_CST giving the value assigned to that constant. These constants will appear in the order in which they were declared. The TREE_TYPE of each of these constants will be the enumeration type itself.
OPAQUE_TYPE
Used for things that have a MODE_OPAQUE mode class in the backend. Opaque types have a size and precision, and can be held in memory or registers. They are used when we do not want the compiler to make assumptions about the availability of other operations as would happen with integer types.

BOOLEAN_TYPE
Used to represent the bool type.

POINTER_TYPE
Used to represent pointer types, and pointer to data member types. The TREE_TYPE gives the type to which this type points.

REFERENCE_TYPE
Used to represent reference types. The TREE_TYPE gives the type to which this type refers.
FUNCTION_TYPE
Used to represent the type of non-member functions and of static member functions. The TREE_TYPE gives the return type of the function. The TYPE_ARG_TYPES are a TREE_LIST of the argument types. The TREE_VALUE of each node in this list is the type of the corresponding argument; the TREE_PURPOSE is an expression for the default argument value, if any. If the last node in the list is void_list_node (a TREE_LIST node whose TREE_VALUE is the void_type_node), then functions of this type do not take variable arguments. Otherwise, they do take a variable number of arguments.

Note that in C (but not in C++) a function declared like void f() is an unprototyped function taking a variable number of arguments; the TYPE_ARG_TYPES of such a function will be NULL.

METHOD_TYPE
Used to represent the type of a non-static member function. Like a FUNCTION_TYPE, the return type is given by the TREE_TYPE. The type of *this, i.e., the class of which functions of this type are a member, is given by the TYPE_METHOD_BASETYPE. The TYPE_ARG_TYPES is the parameter list, as for a FUNCTION_TYPE, and includes the this argument.
ARRAY_TYPE
Used to represent array types. The TREE_TYPE gives the type of the elements in the array. If the array-bound is present in the type, the TYPE_DOMAIN is an INTEGER_TYPE whose TYPE_MIN_VALUE and TYPE_MAX_VALUE will be the lower and upper bounds of the array, respectively. The TYPE_MIN_VALUE will always be an INTEGER_CST for zero, while the TYPE_MAX_VALUE will be one less than the number of elements in the array, i.e., the highest value which may be used to index an element in the array.

RECORD_TYPE
Used to represent struct and class types, as well as pointers to member functions and similar constructs in other languages. TYPE_FIELDS contains the items contained in this type, each of which can be a FIELD_DECL, VAR_DECL, CONST_DECL, or TYPE_DECL. You may not make any assumptions about the ordering of the fields in the type or whether one or more of them overlap.
UNION_TYPE
Used to represent union types. Similar to RECORD_TYPE except that all FIELD_DECL nodes in TYPE_FIELDS start at bit position zero.

QUAL_UNION_TYPE
Used to represent part of a variant record in Ada. Similar to UNION_TYPE except that each FIELD_DECL has a DECL_QUALIFIER field, which contains a boolean expression that indicates whether the field is present in the object. The type will only have one field, so each field’s DECL_QUALIFIER is only evaluated if none of the expressions in the previous fields in TYPE_FIELDS are nonzero. Normally these expressions will reference a field in the outer object using a PLACEHOLDER_EXPR.
LANG_TYPE
This node is used to represent a language-specific type. The front end must handle it.
OFFSET_TYPE
This node is used to represent a pointer-to-data member. For a data member X::m the TYPE_OFFSET_BASETYPE is X and the TREE_TYPE is the type of m.
There are variables whose values represent some of the basic types. These include:
void_type_node
A node for void.

integer_type_node
A node for int.

unsigned_type_node
A node for unsigned int.

char_type_node
A node for char.
It may sometimes be useful to compare one of these variables with a type in hand, using same_type_p.
Declarations
This section covers the various kinds of declarations that appear in the internal representation, except for declarations of functions (represented by FUNCTION_DECL nodes), which are described in Functions.
Working with declarations
Some macros can be used with any kind of declaration. These include:
DECL_NAME
This macro returns an IDENTIFIER_NODE giving the name of the entity.

TREE_TYPE
This macro returns the type of the entity declared.

EXPR_FILENAME
This macro returns the name of the file in which the entity was declared, as a char*. For an entity declared implicitly by the compiler (like __builtin_memcpy), this will be the string "<internal>".

EXPR_LINENO
This macro returns the line number at which the entity was declared, as an int.

DECL_ARTIFICIAL
This predicate holds if the declaration was implicitly generated by the compiler. For example, this predicate will hold of an implicitly declared member function, or of the TYPE_DECL implicitly generated for a class type. Recall that in C++ code like:

struct S {};

is roughly equivalent to C code like:

struct S {};
typedef struct S S;

The implicitly generated typedef declaration is represented by a TYPE_DECL for which DECL_ARTIFICIAL holds.
The various kinds of declarations include:
LABEL_DECL
These nodes are used to represent labels in function bodies. For more information, see Functions. These nodes only appear in block scopes.
CONST_DECL
These nodes are used to represent enumeration constants. The value of the constant is given by DECL_INITIAL which will be an INTEGER_CST with the same type as the TREE_TYPE of the CONST_DECL, i.e., an ENUMERAL_TYPE.
RESULT_DECL
These nodes represent the value returned by a function. When a value is assigned to a RESULT_DECL, that indicates that the value should be returned, via bitwise copy, by the function. You can use DECL_SIZE and DECL_ALIGN on a RESULT_DECL, just as with a VAR_DECL.
TYPE_DECL
These nodes represent typedef declarations. The TREE_TYPE is the type declared to have the name given by DECL_NAME. In some cases, there is no associated name.
VAR_DECL
These nodes represent variables with namespace or block scope, as well as static data members. The DECL_SIZE and DECL_ALIGN are analogous to TYPE_SIZE and TYPE_ALIGN. For a declaration, you should always use the DECL_SIZE and DECL_ALIGN rather than the TYPE_SIZE and TYPE_ALIGN given by the TREE_TYPE, since special attributes may have been applied to the variable to give it a particular size and alignment. You may use the predicates DECL_THIS_STATIC or DECL_THIS_EXTERN to test whether the storage class specifiers static or extern were used to declare a variable.
If this variable is initialized (but does not require a constructor), the DECL_INITIAL will be an expression for the initializer. The initializer should be evaluated, and a bitwise copy into the variable performed. If the DECL_INITIAL is the error_mark_node, there is an initializer, but it is given by an explicit statement later in the code; no bitwise copy is required.
GCC provides an extension that allows either automatic variables, or global variables, to be placed in particular registers. This extension is being used for a particular VAR_DECL if DECL_REGISTER holds for the VAR_DECL, and if DECL_ASSEMBLER_NAME is not equal to DECL_NAME. In that case, DECL_ASSEMBLER_NAME is the name of the register into which the variable will be placed.
PARM_DECL
Used to represent a parameter to a function. Treat these nodes similarly to VAR_DECL nodes. These nodes only appear in the DECL_ARGUMENTS for a FUNCTION_DECL.
The DECL_ARG_TYPE for a PARM_DECL is the type that will actually be used when a value is passed to this function. It may be a wider type than the TREE_TYPE of the parameter; for example, the ordinary type might be short while the DECL_ARG_TYPE is int.
DEBUG_EXPR_DECL
Used to represent an anonymous debug-information temporary created to hold an expression as it is optimized away, so that its value can be referenced in debug bind statements.
FIELD_DECL
These nodes represent non-static data members. The DECL_SIZE and DECL_ALIGN behave as for VAR_DECL nodes.
The position of the field within the parent record is specified by a combination of three attributes. DECL_FIELD_OFFSET is the position, counting in bytes, of the DECL_OFFSET_ALIGN-bit sized word containing the bit of the field closest to the beginning of the structure. DECL_FIELD_BIT_OFFSET is the bit offset of the first bit of the field within this word; this may be nonzero even for fields that are not bit-fields, since DECL_OFFSET_ALIGN may be greater than the natural alignment of the field’s type.
If DECL_C_BIT_FIELD holds, this field is a bit-field. In a bit-field, DECL_BIT_FIELD_TYPE also contains the type that was originally specified for it, while DECL_TYPE may be a modified type with lesser precision, according to the size of the bit field.
NAMESPACE_DECL
Namespaces provide a name hierarchy for other declarations. They appear in the DECL_CONTEXT of other _DECL nodes.
DECL nodes are represented internally as a hierarchy of structures.
struct tree_decl_minimal
This is the minimal structure to inherit from in order for common DECL macros to work. The fields it contains are a unique ID, source location, context, and name.
struct tree_decl_common
This structure inherits from struct tree_decl_minimal. It contains fields that most DECL nodes need, such as a field to store alignment, machine mode, size, and attributes.
struct tree_field_decl
This structure inherits from struct tree_decl_common. It is used to represent FIELD_DECL.
struct tree_label_decl
This structure inherits from struct tree_decl_common. It is used to represent LABEL_DECL.
struct tree_translation_unit_decl
This structure inherits from struct tree_decl_common. It is used to represent TRANSLATION_UNIT_DECL.
struct tree_decl_with_rtl
This structure inherits from struct tree_decl_common. It contains a field to store the low-level RTL associated with a DECL node.
struct tree_result_decl
This structure inherits from struct tree_decl_with_rtl. It is used to represent RESULT_DECL.
struct tree_const_decl
This structure inherits from struct tree_decl_with_rtl. It is used to represent CONST_DECL.
struct tree_parm_decl
This structure inherits from struct tree_decl_with_rtl. It is used to represent PARM_DECL.
struct tree_decl_with_vis
This structure inherits from struct tree_decl_with_rtl. It contains fields necessary to store visibility information, as well as a section name and assembler name.
struct tree_var_decl
This structure inherits from struct tree_decl_with_vis. It is used to represent VAR_DECL.
struct tree_function_decl
This structure inherits from struct tree_decl_with_vis. It is used to represent FUNCTION_DECL.
Adding a new DECL tree consists of the following steps:
Add a new tree code for the DECL node
For language specific DECL nodes, there is a .def file in each frontend directory where the tree code should be added. For DECL nodes that are part of the middle-end, the code should be added to tree.def.
Create a new structure type for the DECL node
These structures should inherit from one of the existing structures in the language hierarchy by using that structure as the first member.
struct tree_foo_decl { struct tree_decl_with_vis common; };
would create a structure named tree_foo_decl that inherits from struct tree_decl_with_vis.
For language specific DECL nodes, this new structure type should go in the appropriate .h file. For DECL nodes that are part of the middle-end, the structure type should go in tree.h.
For garbage collection and dynamic checking purposes, each DECL node structure type is required to have a unique enumerator value specified with it. For language specific DECL nodes, this new enumerator value should go in the appropriate .def file. For DECL nodes that are part of the middle-end, the enumerator values are specified in treestruct.def.
union tree_node
In order to make your new structure type usable, it must be added to union tree_node. For language specific DECL nodes, a new entry should be added to the appropriate .h file of the form:
struct tree_foo_decl GTY ((tag ("TS_FOO_DECL"))) foo_decl;
For DECL nodes that are part of the middle-end, the additional member goes directly into union tree_node in tree.h.
In order to be able to check whether accessing a named portion of union tree_node is legal, and whether a certain DECL node contains one of the enumerated DECL node structures in the hierarchy, a simple lookup table is used. This lookup table needs to be kept up to date with the tree structure hierarchy, or else checking and containment macros will fail inappropriately.
For language specific DECL nodes, there is an init_ts function in an appropriate .c file, which initializes the lookup table. Code setting up the table for new DECL nodes should be added there.
For each DECL tree code and enumerator value representing a member of the inheritance hierarchy, the table should contain 1 if that tree code inherits (directly or indirectly) from that member. Thus, a FOO_DECL node derived from struct decl_with_rtl, and enumerator value TS_FOO_DECL, would be set up as follows:
tree_contains_struct[FOO_DECL][TS_FOO_DECL] = 1;
tree_contains_struct[FOO_DECL][TS_DECL_WRTL] = 1;
tree_contains_struct[FOO_DECL][TS_DECL_COMMON] = 1;
tree_contains_struct[FOO_DECL][TS_DECL_MINIMAL] = 1;
For DECL nodes that are part of the middle-end, the setup code goes into tree.c.
Each added field or flag should have a macro that is used to access it, that performs appropriate checking to ensure only the right type of DECL nodes access the field. These macros generally take the following form:
#define FOO_DECL_FIELDNAME(NODE) FOO_DECL_CHECK(NODE)->foo_decl.fieldname
However, if the structure is simply a base class for further structures, something like the following should be used:
#define BASE_STRUCT_CHECK(T) CONTAINS_STRUCT_CHECK(T, TS_BASE_STRUCT)
#define BASE_STRUCT_FIELDNAME(NODE) \
  (BASE_STRUCT_CHECK(NODE)->base_struct.fieldname)
During GCC’s build, gencheck.c reads the generated all-tree.def file (which in turn includes all the tree.def files) and generates the *_CHECK macros for all tree codes.
Attributes, as specified using the __attribute__ keyword, are represented internally as a TREE_LIST. The TREE_PURPOSE is the name of the attribute, as an IDENTIFIER_NODE. The TREE_VALUE is a TREE_LIST of the arguments of the attribute, if any, or NULL_TREE if there are no arguments; the arguments are stored as the TREE_VALUE of successive entries in the list, and may be identifiers or expressions. The TREE_CHAIN of the attribute is the next attribute in a list of attributes applying to the same declaration or type, or NULL_TREE if there are no further attributes in the list.
Attributes may be attached to declarations and to types; these attributes may be accessed with the following macros. All attributes are stored in this way, and many also cause other changes to the declaration or type or to other internal compiler data structures.
DECL_ATTRIBUTES
This macro returns the attributes on the declaration decl.
TYPE_ATTRIBUTES
This macro returns the attributes on the type type.
The internal representation for expressions is for the most part quite straightforward. However, there are a few facts that one must bear in mind. In particular, the expression “tree” is actually a directed acyclic graph. (For example there may be many references to the integer constant zero throughout the source program; many of these will be represented by the same expression node.) You should not rely on certain kinds of node being shared, nor should you rely on certain kinds of nodes being unshared.
The following macros can be used with all expression nodes:
TREE_TYPE
Returns the type of the expression. This value may not be precisely the same type that would be given the expression in the original program.
In what follows, some nodes that one might expect to always have type bool are documented to have either integral or boolean type. At some point in the future, the C front end may also make use of this same intermediate representation, and at this point these nodes will certainly have integral type. The previous sentence is not meant to imply that the C++ front end does not or will not give these nodes integral type.
Below, we list the various kinds of expression nodes. Except where noted otherwise, the operands to an expression are accessed using the TREE_OPERAND macro. For example, to access the first operand to a binary plus expression expr, use:
TREE_OPERAND (expr, 0)
As this example indicates, the operands are zero-indexed.
The table below begins with constants, moves on to unary expressions, then proceeds to binary expressions, and concludes with various other kinds of expressions:
INTEGER_CST
These nodes represent integer constants. Note that the type of these constants is obtained with TREE_TYPE; they are not always of type int. In particular, char constants are represented with INTEGER_CST nodes. The value of the integer constant e is represented in an array of HOST_WIDE_INT. There are enough elements in the array to represent the value without taking extra elements for redundant 0s or -1. The number of elements used to represent e is available via TREE_INT_CST_NUNITS. Element i can be extracted by using TREE_INT_CST_ELT (e, i). TREE_INT_CST_LOW is a shorthand for TREE_INT_CST_ELT (e, 0).
The functions tree_fits_shwi_p and tree_fits_uhwi_p can be used to tell if the value is small enough to fit in a signed HOST_WIDE_INT or an unsigned HOST_WIDE_INT respectively. The value can then be extracted using tree_to_shwi and tree_to_uhwi.
REAL_CST
FIXME: Talk about how to obtain representations of this constant, do comparisons, and so forth.
FIXED_CST
These nodes represent fixed-point constants. The type of these constants is obtained with TREE_TYPE. TREE_FIXED_CST_PTR points to a struct fixed_value; TREE_FIXED_CST returns the structure itself. struct fixed_value contains data with the size of two HOST_BITS_PER_WIDE_INT and mode as the associated fixed-point machine mode for data.
COMPLEX_CST
These nodes are used to represent complex number constants, that is a __complex__ whose parts are constant nodes. The TREE_REALPART and TREE_IMAGPART return the real and the imaginary parts respectively.
VECTOR_CST
These nodes are used to represent vector constants. Each vector constant v is treated as a specific instance of an arbitrary-length sequence that itself contains ‘VECTOR_CST_NPATTERNS (v)’ interleaved patterns. Each pattern has the form:
{ base0, base1, base1 + step, base1 + step * 2, … }
The first three elements in each pattern are enough to determine the values of the other elements. However, if all steps are zero, only the first two elements are needed. If in addition each base1 is equal to the corresponding base0, only the first element in each pattern is needed. The number of encoded elements per pattern is given by ‘VECTOR_CST_NELTS_PER_PATTERN (v)’.
For example, the constant:
{ 0, 1, 2, 6, 3, 8, 4, 10, 5, 12, 6, 14, 7, 16, 8, 18 }
is interpreted as an interleaving of the sequences:
{ 0, 2, 3, 4, 5, 6, 7, 8 }
{ 1, 6, 8, 10, 12, 14, 16, 18 }
where the sequences are represented by the following patterns:
base0 == 0, base1 == 2, step == 1
base0 == 1, base1 == 6, step == 2
In this case:
VECTOR_CST_NPATTERNS (v) == 2
VECTOR_CST_NELTS_PER_PATTERN (v) == 3
The vector is therefore encoded using the first 6 elements (‘{ 0, 1, 2, 6, 3, 8 }’), with the remaining 10 elements being implicit extensions of them.
Sometimes this scheme can create two possible encodings of the same vector. For example { 0, 1 } could be seen as two patterns with one element each or one pattern with two elements (base0 and base1). The canonical encoding is always the one with the fewest patterns or (if both encodings have the same number of patterns) the one with the fewest encoded elements.
‘vector_cst_encoding_nelts (v)’ gives the total number of encoded elements in v, which is 6 in the example above. VECTOR_CST_ENCODED_ELTS (v) gives a pointer to the elements encoded in v and VECTOR_CST_ENCODED_ELT (v, i) accesses the value of encoded element i.
‘VECTOR_CST_DUPLICATE_P (v)’ is true if v simply contains repeated instances of ‘VECTOR_CST_NPATTERNS (v)’ values. This is a shorthand for testing ‘VECTOR_CST_NELTS_PER_PATTERN (v) == 1’.
‘VECTOR_CST_STEPPED_P (v)’ is true if at least one pattern in v has a nonzero step. This is a shorthand for testing ‘VECTOR_CST_NELTS_PER_PATTERN (v) == 3’.
The utility function vector_cst_elt gives the value of an arbitrary index as a tree. vector_cst_int_elt gives the same value as a wide_int.
STRING_CST
These nodes represent string-constants. The TREE_STRING_LENGTH returns the length of the string, as an int. The TREE_STRING_POINTER is a char* containing the string itself. The string may not be NUL-terminated, and it may contain embedded NUL characters. Therefore, the TREE_STRING_LENGTH includes the trailing NUL if it is present.
For wide string constants, the TREE_STRING_LENGTH is the number of bytes in the string, and the TREE_STRING_POINTER points to an array of the bytes of the string, as represented on the target system (that is, as integers in the target endianness). Wide and non-wide string constants are distinguished only by the TREE_TYPE of the STRING_CST.
FIXME: The formats of string constants are not well-defined when the target system bytes are not the same width as host system bytes.
POLY_INT_CST
These nodes represent invariants that depend on some target-specific runtime parameters. They consist of NUM_POLY_INT_COEFFS coefficients, with the first coefficient being the constant term and the others being multipliers that are applied to the runtime parameters. POLY_INT_CST_ELT (x, i) references coefficient number i of POLY_INT_CST node x. Each coefficient is an INTEGER_CST.
ARRAY_REF
These nodes represent array accesses. The first operand is the array; the second is the index. To calculate the address of the memory accessed, you must scale the index by the size of the type of the array elements. The type of these expressions must be the type of a component of the array. The third and fourth operands are used after gimplification to represent the lower bound and component size but should not be used directly; call array_ref_low_bound and array_ref_element_size instead.
ARRAY_RANGE_REF
These nodes represent access to a range (or “slice”) of an array. The operands are the same as those for ARRAY_REF and have the same meanings. The type of these expressions must be an array whose component type is the same as that of the first operand. The range of that array type determines the amount of data these expressions access.
TARGET_MEM_REF
These nodes represent memory accesses whose address maps directly to an addressing mode of the target architecture. The first argument is TMR_SYMBOL and must be a VAR_DECL of an object with a fixed address. The second argument is TMR_BASE and the third one is TMR_INDEX. The fourth argument is TMR_STEP and must be an INTEGER_CST. The fifth argument is TMR_OFFSET and must be an INTEGER_CST. Any of the arguments may be NULL if the appropriate component does not appear in the address. The address of the TARGET_MEM_REF is determined in the following way:
&TMR_SYMBOL + TMR_BASE + TMR_INDEX * TMR_STEP + TMR_OFFSET
The sixth argument is the reference to the original memory access, which is preserved for the purposes of the RTL alias analysis. The seventh argument is a tag representing the results of tree level alias analysis.
ADDR_EXPR
These nodes are used to represent the address of an object. (These expressions will always have pointer or reference type.) The operand may be another expression, or it may be a declaration.
As an extension, GCC allows users to take the address of a label. In this case, the operand of the ADDR_EXPR will be a LABEL_DECL. The type of such an expression is void*.
If the object addressed is not an lvalue, a temporary is created, and the address of the temporary is used.
INDIRECT_REF
These nodes are used to represent the object pointed to by a pointer. The operand is the pointer being dereferenced; it will always have pointer or reference type.
MEM_REF
These nodes are used to represent the object pointed to by a pointer offset by a constant. The first operand is the pointer being dereferenced; it will always have pointer or reference type. The second operand is a pointer constant. Its type specifies the type to be used for type-based alias analysis.
COMPONENT_REF
These nodes represent non-static data member accesses. The first operand is the object (rather than a pointer to it); the second operand is the FIELD_DECL for the data member. The third operand represents the byte offset of the field, but should not be used directly; call component_ref_field_offset instead.
NEGATE_EXPR
These nodes represent unary negation of the single operand, for both integer and floating-point types. The type of negation can be determined by looking at the type of the expression.
The behavior of this operation on signed arithmetic overflow is controlled by the flag_wrapv and flag_trapv variables.
ABS_EXPR
These nodes represent the absolute value of the single operand, for both integer and floating-point types. This is typically used to implement the abs, labs and llabs builtins for integer types, and the fabs, fabsf and fabsl builtins for floating point types. The type of the abs operation can be determined by looking at the type of the expression.
This node is not used for complex types. To represent the modulus or complex abs of a complex value, use the BUILT_IN_CABS, BUILT_IN_CABSF or BUILT_IN_CABSL builtins, as used to implement the C99 cabs, cabsf and cabsl built-in functions.
ABSU_EXPR
These nodes represent the absolute value of the single operand in the equivalent unsigned type, such that ABSU_EXPR of TYPE_MIN is well defined.
BIT_NOT_EXPR
These nodes represent bitwise complement, and will always have integral type. The only operand is the value to be complemented.
TRUTH_NOT_EXPR
These nodes represent logical negation, and will always have integral (or boolean) type. The operand is the value being negated. The type of the operand and that of the result are always of BOOLEAN_TYPE or INTEGER_TYPE.
PREDECREMENT_EXPR
PREINCREMENT_EXPR
POSTDECREMENT_EXPR
POSTINCREMENT_EXPR
These nodes represent increment and decrement expressions. The value of the single operand is computed, and the operand incremented or decremented. In the case of PREDECREMENT_EXPR and PREINCREMENT_EXPR, the value of the expression is the value resulting after the increment or decrement; in the case of POSTDECREMENT_EXPR and POSTINCREMENT_EXPR, it is the value before the increment or decrement occurs. The type of the operand, like that of the result, will be either integral, boolean, or floating-point.
FIX_TRUNC_EXPR
These nodes represent conversion of a floating-point value to an integer. The single operand will have a floating-point type, while the complete expression will have an integral (or boolean) type. The operand is rounded towards zero.
FLOAT_EXPR
These nodes represent conversion of an integral (or boolean) value to a floating-point value. The single operand will have integral type, while the complete expression will have a floating-point type.
FIXME: How is the operand supposed to be rounded? Is this dependent on -mieee?
COMPLEX_EXPR
These nodes are used to represent complex numbers constructed from two expressions of the same (integer or real) type. The first operand is the real part and the second operand is the imaginary part.
CONJ_EXPR
These nodes represent the conjugate of their operand.
REALPART_EXPR
IMAGPART_EXPR
These nodes represent respectively the real and the imaginary parts of complex numbers (their sole argument).
NON_LVALUE_EXPR
These nodes indicate that their one and only operand is not an lvalue. A back end can treat these identically to the single operand.
NOP_EXPR
These nodes are used to represent conversions that do not require any code-generation. For example, conversion of a char* to an int* does not require any code be generated; such a conversion is represented by a NOP_EXPR. The single operand is the expression to be converted. The conversion from a pointer to a reference is also represented with a NOP_EXPR.
CONVERT_EXPR
These nodes are similar to NOP_EXPRs, but are used in those situations where code may need to be generated. For example, if an int* is converted to an int, code may need to be generated on some platforms. These nodes are never used for C++-specific conversions, like conversions between pointers to different classes in an inheritance hierarchy. Any adjustments that need to be made in such cases are always indicated explicitly. Similarly, a user-defined conversion is never represented by a CONVERT_EXPR; instead, the function calls are made explicit.
FIXED_CONVERT_EXPR
These nodes are used to represent conversions that involve fixed-point values. For example, from a fixed-point value to another fixed-point value, from an integer to a fixed-point value, from a fixed-point value to an integer, from a floating-point value to a fixed-point value, or from a fixed-point value to a floating-point value.
LSHIFT_EXPR
RSHIFT_EXPR
These nodes represent left and right shifts, respectively. The first operand is the value to shift; it will always be of integral type. The second operand is an expression for the number of bits by which to shift. Right shift should be treated as arithmetic, i.e., the high-order bits should be zero-filled when the expression has unsigned type and filled with the sign bit when the expression has signed type. Note that the result is undefined if the second operand is larger than or equal to the first operand’s type size. Unlike most nodes, these can have a vector as first operand and a scalar as second operand.
BIT_IOR_EXPR
BIT_XOR_EXPR
BIT_AND_EXPR
These nodes represent bitwise inclusive or, bitwise exclusive or, and bitwise and, respectively. Both operands will always have integral type.
TRUTH_ANDIF_EXPR
TRUTH_ORIF_EXPR
These nodes represent logical “and” and logical “or”, respectively. These operators are not strict; i.e., the second operand is evaluated only if the value of the expression is not determined by evaluation of the first operand. The type of the operands and that of the result are always of BOOLEAN_TYPE or INTEGER_TYPE.
TRUTH_AND_EXPR
TRUTH_OR_EXPR
TRUTH_XOR_EXPR
These nodes represent logical and, logical or, and logical exclusive or. They are strict; both arguments are always evaluated. There are no corresponding operators in C or C++, but the front end will sometimes generate these expressions anyhow, if it can tell that strictness does not matter. The type of the operands and that of the result are always of BOOLEAN_TYPE or INTEGER_TYPE.
POINTER_PLUS_EXPR
This node represents pointer arithmetic. The first operand is always a pointer/reference type. The second operand is always an unsigned integer type compatible with sizetype. This and POINTER_DIFF_EXPR are the only binary arithmetic operators that can operate on pointer types.
POINTER_DIFF_EXPR
This node represents pointer subtraction. The two operands always have pointer/reference type. It returns a signed integer of the same precision as the pointers. The behavior is undefined if the difference of the two pointers, seen as infinite precision non-negative integers, does not fit in the result type. The result does not depend on the pointer type, it is not divided by the size of the pointed-to type.
PLUS_EXPR
MINUS_EXPR
MULT_EXPR
These nodes represent various binary arithmetic operations. Respectively, these operations are addition, subtraction (of the second operand from the first) and multiplication. Their operands may have either integral or floating type, but there will never be a case in which one operand is of floating type and the other is of integral type.
The behavior of these operations on signed arithmetic overflow is controlled by the flag_wrapv and flag_trapv variables.
MULT_HIGHPART_EXPR
This node represents the “high-part” of a widening multiplication. For an integral type with b bits of precision, the result is the most significant b bits of the full 2b product.
RDIV_EXPR
This node represents a floating point division operation.
TRUNC_DIV_EXPR
FLOOR_DIV_EXPR
CEIL_DIV_EXPR
ROUND_DIV_EXPR
These nodes represent integer division operations that return an integer result. TRUNC_DIV_EXPR rounds towards zero, FLOOR_DIV_EXPR rounds towards negative infinity, CEIL_DIV_EXPR rounds towards positive infinity and ROUND_DIV_EXPR rounds to the closest integer. Integer division in C and C++ is truncating, i.e. TRUNC_DIV_EXPR.
The behavior of these operations on signed arithmetic overflow, when dividing the minimum signed integer by minus one, is controlled by the flag_wrapv and flag_trapv variables.
TRUNC_MOD_EXPR
FLOOR_MOD_EXPR
CEIL_MOD_EXPR
ROUND_MOD_EXPR
These nodes represent the integer remainder or modulus operation. The integer modulus of two operands a and b is defined as a - (a/b)*b, where the division is calculated using the corresponding division operator. Hence for TRUNC_MOD_EXPR this definition assumes division using truncation towards zero, i.e. TRUNC_DIV_EXPR. Integer remainder in C and C++ uses truncating division, i.e. TRUNC_MOD_EXPR.
EXACT_DIV_EXPR
The EXACT_DIV_EXPR code is used to represent integer divisions where the numerator is known to be an exact multiple of the denominator. This allows the backend to choose between the faster of TRUNC_DIV_EXPR, CEIL_DIV_EXPR and FLOOR_DIV_EXPR for the current target.
LT_EXPR
LE_EXPR
GT_EXPR
GE_EXPR
LTGT_EXPR
EQ_EXPR
NE_EXPR
These nodes represent the less than, less than or equal to, greater than, greater than or equal to, less or greater than, equal, and not equal comparison operators. The first and second operands will either be both of integral type, both of floating type or both of vector type, except for LTGT_EXPR where they will only be both of floating type. The result type of these expressions will always be of integral, boolean or signed integral vector type. These operations return the result type’s zero value for false, the result type’s one value for true, and a vector whose elements are zero (false) or minus one (true) for vectors.
For floating point comparisons, if we honor IEEE NaNs and either operand is NaN, then NE_EXPR always returns true and the remaining operators always return false. On some targets, comparisons against an IEEE NaN, other than equality and inequality, may generate a floating-point exception.
ORDERED_EXPR
UNORDERED_EXPR
These nodes represent non-trapping ordered and unordered comparison operators. These operations take two floating point operands and determine whether they are ordered or unordered relative to each other. If either operand is an IEEE NaN, their comparison is defined to be unordered, otherwise the comparison is defined to be ordered. The result type of these expressions will always be of integral or boolean type. These operations return the result type’s zero value for false, and the result type’s one value for true.
UNLT_EXPR
UNLE_EXPR
UNGT_EXPR
UNGE_EXPR
UNEQ_EXPR
These nodes represent the unordered comparison operators. These operations take two floating point operands and determine whether the operands are unordered or are less than, less than or equal to, greater than, greater than or equal to, or equal respectively. For example, UNLT_EXPR returns true if either operand is an IEEE NaN or the first operand is less than the second. All these operations are guaranteed not to generate a floating point exception. The result type of these expressions will always be of integral or boolean type. These operations return the result type’s zero value for false, and the result type’s one value for true.
MODIFY_EXPR
These nodes represent assignment. The left-hand side is the first operand; the right-hand side is the second operand. The left-hand side will be a VAR_DECL, INDIRECT_REF, COMPONENT_REF, or other lvalue.
These nodes are used to represent not only assignment with ‘=’ but also compound assignments (like ‘+=’), by reduction to ‘=’ assignment. In other words, the representation for ‘i += 3’ looks just like that for ‘i = i + 3’.
INIT_EXPR
These nodes are just like MODIFY_EXPR, but are used only when a variable is initialized, rather than assigned to subsequently. This means that we can assume that the target of the initialization is not used in computing its own value; any reference to the lhs in computing the rhs is undefined.
COMPOUND_EXPR
These nodes represent comma-expressions. The first operand is an expression whose value is computed and thrown away prior to the evaluation of the second operand. The value of the entire expression is the value of the second operand.
COND_EXPR
These nodes represent ?: expressions. The first operand
is of boolean or integral type. If it evaluates to a nonzero value,
the second operand should be evaluated, and returned as the value of the
expression. Otherwise, the third operand is evaluated, and returned as
the value of the expression.
The second operand must have the same type as the entire expression,
unless it unconditionally throws an exception or calls a noreturn
function, in which case it should have void type. The same constraints
apply to the third operand. This allows array bounds checks to be
represented conveniently as (i >= 0 && i < 10) ? i : abort().
As a GNU extension, the C language front ends allow the second
operand of the ?: operator to be omitted in the source.
For example, x ? : 3 is equivalent to x ? x : 3,
assuming that x is an expression without side effects.
In the tree representation, however, the second operand is always
present, possibly protected by SAVE_EXPR if the first
argument does cause side effects.
CALL_EXPR
These nodes are used to represent calls to functions, including
non-static member functions. CALL_EXPRs are implemented as
expression nodes with a variable number of operands. Rather than using
TREE_OPERAND to extract them, it is preferable to use the
specialized accessor macros and functions that operate specifically on
CALL_EXPR nodes.
CALL_EXPR_FN returns a pointer to the
function to call; it is always an expression whose type is a
POINTER_TYPE.
The number of arguments to the call is returned by call_expr_nargs,
while the arguments themselves can be accessed with the CALL_EXPR_ARG
macro. The arguments are zero-indexed and numbered left-to-right.
You can iterate over the arguments using FOR_EACH_CALL_EXPR_ARG, as in:

tree call, arg;
call_expr_arg_iterator iter;
FOR_EACH_CALL_EXPR_ARG (arg, iter, call)
  /* arg is bound to successive arguments of call.  */
  …;
For non-static
member functions, there will be an operand corresponding to the
this pointer. There will always be expressions corresponding to
all of the arguments, even if the function is declared with default
arguments and some arguments are not explicitly provided at the call
sites.
CALL_EXPRs also have a CALL_EXPR_STATIC_CHAIN operand that
is used to implement nested functions. This operand is otherwise null.
CLEANUP_POINT_EXPR
These nodes represent full-expressions. The single operand is an expression to evaluate. Any destructor calls engendered by the creation of temporaries during the evaluation of that expression should be performed immediately after the expression is evaluated.
CONSTRUCTOR
These nodes represent the brace-enclosed initializers for a structure or an
array. They contain a sequence of component values made out of a vector of
constructor_elt, which is a (INDEX, VALUE) pair.
If the TREE_TYPE of the CONSTRUCTOR is a RECORD_TYPE,
UNION_TYPE or QUAL_UNION_TYPE then the INDEX of each
node in the sequence will be a FIELD_DECL and the VALUE will
be the expression used to initialize that field.
If the TREE_TYPE of the CONSTRUCTOR is an ARRAY_TYPE,
then the INDEX of each node in the sequence will be an
INTEGER_CST or a RANGE_EXPR of two INTEGER_CSTs.
A single INTEGER_CST indicates which element of the array is being
assigned to. A RANGE_EXPR indicates an inclusive range of elements
to initialize. In both cases the VALUE is the corresponding
initializer. It is re-evaluated for each element of a
RANGE_EXPR. If the INDEX is NULL_TREE, then
the initializer is for the next available array element.
In the front end, you should not depend on the fields appearing in any particular order. However, in the middle end, fields must appear in declaration order. You should not assume that all fields will be represented. Unrepresented fields will be cleared (zeroed), unless the CONSTRUCTOR_NO_CLEARING flag is set, in which case their value becomes undefined.
COMPOUND_LITERAL_EXPR
These nodes represent ISO C99 compound literals. The
COMPOUND_LITERAL_EXPR_DECL_EXPR is a DECL_EXPR
containing an anonymous VAR_DECL for
the unnamed object represented by the compound literal; the
DECL_INITIAL of that VAR_DECL is a CONSTRUCTOR
representing the brace-enclosed list of initializers in the compound
literal. That anonymous VAR_DECL can also be accessed directly
by the COMPOUND_LITERAL_EXPR_DECL macro.
SAVE_EXPR
A SAVE_EXPR represents an expression (possibly involving
side effects) that is used more than once. The side effects should
occur only the first time the expression is evaluated. Subsequent uses
should just reuse the computed value. The first operand to the
SAVE_EXPR is the expression to evaluate. The side effects should
be executed where the SAVE_EXPR is first encountered in a
depth-first preorder traversal of the expression tree.
TARGET_EXPR
A TARGET_EXPR represents a temporary object. The first operand
is a VAR_DECL for the temporary variable. The second operand is
the initializer for the temporary. The initializer is evaluated and,
if non-void, copied (bitwise) into the temporary. If the initializer
is void, that means that it will perform the initialization itself.
Often, a TARGET_EXPR occurs on the right-hand side of an
assignment, or as the second operand to a comma-expression which is
itself the right-hand side of an assignment, etc. In this case, we say
that the TARGET_EXPR is “normal”; otherwise, we say it is
“orphaned”. For a normal TARGET_EXPR the temporary variable
should be treated as an alias for the left-hand side of the assignment,
rather than as a new temporary variable.
The third operand to the TARGET_EXPR, if present, is a
cleanup-expression (i.e., destructor call) for the temporary. If this
expression is orphaned, then this expression must be executed when the
statement containing this expression is complete. These cleanups must
always be executed in the order opposite to that in which they were
encountered. Note that if a temporary is created on one branch of a
conditional operator (i.e., in the second or third operand to a
COND_EXPR), the cleanup must be run only if that branch is
actually executed.
VA_ARG_EXPR
This node is used to implement support for the C/C++ variable argument-list
mechanism. It represents expressions like va_arg (ap, type).
Its TREE_TYPE yields the tree representation for type and
its sole argument yields the representation for ap.
ANNOTATE_EXPR
This node is used to attach markers to an expression. The first operand
is the annotated expression, the second is an INTEGER_CST with
a value from enum annot_expr_kind, the third is an INTEGER_CST.
VEC_DUPLICATE_EXPR
This node has a single operand and represents a vector in which every element is equal to that operand.
VEC_SERIES_EXPR
This node represents a vector formed from a scalar base and step, given as the first and second operands respectively. Element i of the result is equal to ‘base + i*step’.
This node is restricted to integral types, in order to avoid specifying the rounding behavior for floating-point types.
VEC_LSHIFT_EXPR
VEC_RSHIFT_EXPR
These nodes represent whole vector left and right shifts, respectively. The first operand is the vector to shift; it will always be of vector type. The second operand is an expression for the number of bits by which to shift. Note that the result is undefined if the second operand is larger than or equal to the first operand’s type size.
VEC_WIDEN_MULT_HI_EXPR
VEC_WIDEN_MULT_LO_EXPR
These nodes represent widening vector multiplication of the high and low
parts of the two input vectors, respectively. Their operands are vectors
that contain the same number of elements (N) of the same integral type.
The result is a vector that contains half as many elements, of an integral type
whose size is twice as wide. In the case of VEC_WIDEN_MULT_HI_EXPR the
high N/2 elements of the two vectors are multiplied to produce the
vector of N/2 products. In the case of VEC_WIDEN_MULT_LO_EXPR the
low N/2 elements of the two vectors are multiplied to produce the
vector of N/2 products.
VEC_WIDEN_PLUS_HI_EXPR
VEC_WIDEN_PLUS_LO_EXPR
These nodes represent widening vector addition of the high and low parts of
the two input vectors, respectively. Their operands are vectors that contain
the same number of elements (N) of the same integral type. The result
is a vector that contains half as many elements, of an integral type whose size
is twice as wide. In the case of VEC_WIDEN_PLUS_HI_EXPR the high
N/2 elements of the two vectors are added to produce the vector of
N/2 sums. In the case of VEC_WIDEN_PLUS_LO_EXPR the low
N/2 elements of the two vectors are added to produce the vector of
N/2 sums.
VEC_WIDEN_MINUS_HI_EXPR
VEC_WIDEN_MINUS_LO_EXPR
These nodes represent widening vector subtraction of the high and low parts of
the two input vectors, respectively. Their operands are vectors that contain
the same number of elements (N) of the same integral type. The high/low
elements of the second vector are subtracted from the high/low elements of the
first. The result is a vector that contains half as many elements, of an
integral type whose size is twice as wide. In the case of
VEC_WIDEN_MINUS_HI_EXPR the high N/2 elements of the second
vector are subtracted from the high N/2 of the first to produce the
vector of N/2 differences. In the case of
VEC_WIDEN_MINUS_LO_EXPR the low N/2 elements of the second
vector are subtracted from the low N/2 of the first to produce the
vector of N/2 differences.
VEC_UNPACK_HI_EXPR
VEC_UNPACK_LO_EXPR
These nodes represent unpacking of the high and low parts of the input vector,
respectively. The single operand is a vector that contains N elements
of the same integral or floating point type. The result is a vector
that contains half as many elements, of an integral or floating point type
whose size is twice as wide. In the case of VEC_UNPACK_HI_EXPR the
high N/2 elements of the vector are extracted and widened (promoted).
In the case of VEC_UNPACK_LO_EXPR the low N/2 elements of the
vector are extracted and widened (promoted).
VEC_UNPACK_FLOAT_HI_EXPR
VEC_UNPACK_FLOAT_LO_EXPR
These nodes represent unpacking of the high and low parts of the input vector,
where the values are converted from fixed point to floating point. The
single operand is a vector that contains N elements of the same
integral type. The result is a vector that contains half as many elements
of a floating point type whose size is twice as wide. In the case of
VEC_UNPACK_FLOAT_HI_EXPR the high N/2 elements of the vector are
extracted, converted and widened. In the case of VEC_UNPACK_FLOAT_LO_EXPR
the low N/2 elements of the vector are extracted, converted and widened.
VEC_UNPACK_FIX_TRUNC_HI_EXPR
VEC_UNPACK_FIX_TRUNC_LO_EXPR
These nodes represent unpacking of the high and low parts of the input vector,
where the values are truncated from floating point to fixed point. The
single operand is a vector that contains N elements of the same
floating point type. The result is a vector that contains half as many
elements of an integral type whose size is twice as wide. In the case of
VEC_UNPACK_FIX_TRUNC_HI_EXPR the high N/2 elements of the
vector are extracted and converted with truncation. In the case of
VEC_UNPACK_FIX_TRUNC_LO_EXPR the low N/2 elements of the
vector are extracted and converted with truncation.
VEC_PACK_TRUNC_EXPR
This node represents packing of truncated elements of the two input vectors into the output vector. Input operands are vectors that contain the same number of elements of the same integral or floating point type. The result is a vector that contains twice as many elements of an integral or floating point type whose size is half as wide. The elements of the two vectors are demoted and merged (concatenated) to form the output vector.
VEC_PACK_SAT_EXPR
This node represents packing of elements of the two input vectors into the output vector using saturation. Input operands are vectors that