This manual documents Guile version 2.2.7.
Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004, 2005, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019 Free Software Foundation, Inc.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled “GNU Free Documentation License.”
This manual describes how to use Guile, GNU’s Ubiquitous Intelligent Language for Extensions. It relates particularly to Guile version 2.2.7.
Like Guile itself, the Guile reference manual is a living entity, cared for by many people over a long period of time. As such, it is hard to identify individuals of whom to say “yes, this person, she wrote the manual.”
Still, among the many contributions, some caretakers stand out. First among them is Neil Jerram, who has been working on this document for ten years now. Neil’s attention both to detail and to the big picture has made a real difference in the understanding of a generation of Guile hackers.
Next we should note Marius Vollmer’s effect on this document. Marius maintained Guile during a period in which Guile’s API was clarified—put to the fire, so to speak—and he had the good sense to effect the same change on the manual.
Martin Grabmueller made substantial contributions throughout the manual in preparation for the Guile 1.6 release, including filling out a lot of the documentation of Scheme data types, control mechanisms and procedures. In addition, he wrote the documentation for Guile’s SRFI modules and modules associated with the Guile REPL.
Ludovic Courtès and Andy Wingo, the Guile maintainers at the time of this writing (late 2010), have also made their dent in the manual, writing documentation for new modules and subsystems in Guile 2.0. They are also responsible for ensuring that the existing text retains its relevance as Guile evolves. See Reporting Bugs, for more information on reporting problems in this manual.
The content for the first versions of this manual incorporated and was inspired by documents from Aubrey Jaffer, author of the SCM system on which Guile was based, and from Tom Lord, Guile’s first maintainer. Although most of this text has been rewritten, all of it was important, and some of the structure remains.
The manual for the first versions of Guile was largely written, edited, and compiled by Mark Galassi and Jim Blandy. In particular, Jim wrote the original tutorial on Guile’s data representation and the C API for accessing Guile objects.
Significant portions were also contributed by Thien-Thi Nguyen, Kevin Ryde, Mikael Djurfeldt, Christian Lynbech, Julian Graham, Gary Houston, Tim Pierce, and a few dozen more. You, reader, are most welcome to join their esteemed ranks. Visit Guile’s web site at http://www.gnu.org/software/guile/ to find out how to get involved.
Guile is Free Software. Guile is copyrighted, not public domain, and there are restrictions on its distribution or redistribution, but these restrictions are designed to permit everything a cooperating person would want to do.
C code linking to the Guile library is subject to the terms of that library. Basically such code may be published on any terms, provided users can re-link against a new or modified version of Guile.
C code linking to the Guile readline module is subject to the terms of that module. Basically such code must be published on Free terms.
Scheme level code written to be run by Guile (but not derived from Guile itself) is not restricted in any way, and may be published on any terms. We encourage authors to publish on Free terms.
You must be aware there is no warranty whatsoever for Guile. This is described in full in the licenses.
Guile is an implementation of the Scheme programming language. Scheme (http://schemers.org/) is an elegant and conceptually simple dialect of Lisp, originated by Guy Steele and Gerald Sussman, and since evolved by the series of reports known as RnRS (the Revised^n Reports on Scheme).
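Unlike, for example, Python or Perl, Scheme has no benevolent dictator. There are many Scheme implementations, with different characteristics and with communities and academic activities around them, and the language develops as a result of the interplay between these. Guile’s particular characteristics are that it is easy to combine with code written in C, it has a historical and continuing connection with the GNU Project, it emphasises interactive programming, and it supports several languages, not just Scheme.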
The next few sections explain what we mean by these points. The sections after that cover how you can obtain and install Guile, and the typographical conventions that we use in this manual.
Guile implements Scheme as described in the Revised^5 Report on the Algorithmic Language Scheme (usually known as R5RS), providing clean and general data and control structures. Guile goes beyond the rather austere language presented in R5RS, extending it with a module system, full access to POSIX system calls, networking support, multiple threads, dynamic linking, a foreign function call interface, powerful string processing, and many other features needed for programming in the real world.
The Scheme community has recently agreed on and published R6RS, the latest installment in the RnRS series. R6RS significantly expands the core Scheme language, and standardises many non-core functions that implementations, including Guile, have previously done in different ways. Guile has been updated to incorporate some of the features of R6RS, and to adjust some existing features to conform to the R6RS specification, but it is by no means a complete R6RS implementation. See R6RS Support.
Between R5RS and R6RS, the SRFI process (http://srfi.schemers.org/) standardised interfaces for many practical needs, such as multithreaded programming and multidimensional arrays. Guile supports many SRFIs, as documented in detail in SRFI Support Modules.
In summary, so far as relationship to the Scheme standards is concerned, Guile is an R5RS implementation with many extensions, some of which conform to SRFIs or to the relevant parts of R6RS.
Like a shell, Guile can run interactively—reading expressions from the user, evaluating them, and displaying the results—or as a script interpreter, reading and executing Scheme code from a file. Guile also provides an object library, libguile, that allows other applications to easily incorporate a complete Scheme interpreter. An application can then use Guile as an extension language, a clean and powerful configuration language, or as multi-purpose “glue”, connecting primitives provided by the application. It is easy to call Scheme code from C code and vice versa, giving the application designer full control of how and when to invoke the interpreter. Applications can add new functions, data types, control structures, and even syntax to Guile, creating a domain-specific language tailored to the task at hand, but based on a robust language design.
This kind of combination is helped by four aspects of Guile’s design and history. First is that Guile has always been targeted as an extension language. Hence its C API has always been of great importance, and has been developed accordingly. Second and third are rather technical points (that Guile uses conservative garbage collection, and that it implements the Scheme concept of continuations by copying and reinstating the C stack) whose practical consequence is that most existing C code can be glued into Guile as is, without needing modifications to cope with strange Scheme execution flows. Last is the module system, which helps extensions to coexist without stepping on each other’s toes.
Guile’s module system allows one to break up a large program into manageable sections with well-defined interfaces between them. Modules may contain a mixture of interpreted and compiled code; Guile can use either static or dynamic linking to incorporate compiled code. Modules also encourage developers to package up useful collections of routines for general distribution; as of this writing, one can find Emacs interfaces, database access routines, compilers, GUI toolkit interfaces, and HTTP client functions, among others.
Guile was conceived by the GNU Project following the fantastic success of Emacs Lisp as an extension language within Emacs. Just as Emacs Lisp allowed complete and unanticipated applications to be written within the Emacs environment, the idea was that Guile should do the same for other GNU Project applications. This remains true today.
The idea of extensibility is closely related to the GNU project’s primary goal, that of promoting software freedom. Software freedom means that people receiving a software package can modify or enhance it to their own desires, including in ways that may not have occurred at all to the software’s original developers. For programs written in a compiled language like C, this freedom covers modifying and rebuilding the C code; but if the program also provides an extension language, that is usually a much friendlier and lower-barrier-of-entry way for the user to start making their own changes.
Guile is now used by GNU project applications such as AutoGen, Lilypond, Denemo, Mailutils, TeXmacs and Gnucash, and we hope that there will be many more in future.
Non-free software has no interest in its users being able to see how it works. They are supposed to just accept it, or to report problems and hope that the source code owners will choose to work on them.
Free software aims to work reliably just as much as non-free software does, but it should also empower its users by making its workings available. This is useful for many reasons, including education, auditing and enhancements, as well as for debugging problems.
The ideal free software system achieves this by making it easy for interested users to see the source code for a feature that they are using, and to follow through that source code step-by-step, as it runs. In Emacs, good examples of this are the source code hyperlinks in the help system, and edebug.
Then, for bonus points and maximising the ability for the user to experiment quickly with code changes, the system should allow parts of the source code to be modified and reloaded into the running program, to take immediate effect.
Guile is designed for this kind of interactive programming, and this distinguishes it from many Scheme implementations that instead prioritise running a fixed Scheme program as fast as possible—because there are tradeoffs between performance and the ability to modify parts of an already running program. There are faster Schemes than Guile, but Guile is a GNU project and so prioritises the GNU vision of programming freedom and experimentation.
Since the 2.0 release, Guile’s architecture supports compiling any language to its core virtual machine bytecode, and Scheme is just one of the supported languages. Other supported languages are Emacs Lisp, ECMAScript (commonly known as JavaScript) and Brainfuck, and work is under discussion for Lua, Ruby and Python.
This means that users can program applications which use Guile in the language of their choice, rather than having the tastes of the application’s author imposed on them.
Guile can be obtained from the main GNU archive site ftp://ftp.gnu.org or any of its mirrors. The file will be named guile-version.tar.gz. The current version is 2.2.7, so the file you should grab is:
ftp://ftp.gnu.org/gnu/guile/guile-2.2.7.tar.gz
To unbundle Guile use the instruction
zcat guile-2.2.7.tar.gz | tar xvf -
which will create a directory called guile-2.2.7 with all the sources. You can look at the file INSTALL for detailed instructions on how to build and install Guile, but you should be able to just do
    cd guile-2.2.7
    ./configure
    make
    make install
This will install the Guile executable guile, the Guile library libguile and various associated header files and support libraries. It will also install the Guile reference manual.
Since this manual frequently refers to the Scheme “standard”, also known as R5RS, or the “Revised^5 Report on the Algorithmic Language Scheme”, we have included the report in the Guile distribution; see Introduction in Revised(5) Report on the Algorithmic Language Scheme. This will also be installed in your info directory.
The rest of this manual is organised into the following chapters.
A whirlwind tour shows how Guile can be used interactively and as a script interpreter, how to link Guile into your own applications, and how to write modules of interpreted and compiled code for use with Guile. Everything introduced here is documented again and in full by the later parts of the manual.
For readers new to Scheme, this chapter provides an introduction to the basic ideas of the Scheme language. This material would apply to any Scheme implementation and so does not make reference to anything Guile-specific.
Provides an overview of programming in Scheme with Guile. It covers how to invoke the guile program from the command-line and how to write scripts in Scheme. It also introduces the extensions that Guile offers beyond standard Scheme.
Provides an overview of how to use Guile in a C program. It discusses the fundamental concepts that you need to understand to access the features of Guile, such as dynamic types and the garbage collector. It explains in a tutorial-like manner how to define new data types and functions for use by Scheme programs.
This part of the manual documents the Guile API in functionality-based groups with the Scheme and C interfaces presented side by side.
Describes some important modules, distributed as part of the Guile distribution, that extend the functionality provided by the Guile Scheme core.
Describes GOOPS, an object oriented extension to Guile that provides classes, multiple inheritance and generic functions.
In examples and procedure descriptions and all other places where the evaluation of Scheme expressions is shown, we use some notation for denoting the output and evaluation results of expressions.
The symbol ‘⇒’ is used to tell which value is returned by an evaluation:
(+ 1 2) ⇒ 3
Some procedures produce some output besides returning a value. This is denoted by the symbol ‘-|’.
    (begin
      (display 1)
      (newline)
      'hooray)
    -| 1
    ⇒ hooray
As you can see, this code prints ‘1’ (denoted by ‘-|’), and returns hooray (denoted by ‘⇒’).
This chapter presents a quick tour of all the ways that Guile can be used. There are additional examples in the examples/ directory in the Guile source distribution. It also explains how best to report any problems that you find.
The following examples assume that Guile has been installed in /usr/local/.
In its simplest form, Guile acts as an interactive interpreter for the Scheme programming language, reading and evaluating Scheme expressions the user enters from the terminal. Here is a sample interaction between Guile and a user; the user’s input appears after the $ and scheme@(guile-user)> prompts:
    $ guile
    scheme@(guile-user)> (+ 1 2 3)                ; add some numbers
    $1 = 6
    scheme@(guile-user)> (define (factorial n)    ; define a function
                           (if (zero? n)
                               1
                               (* n (factorial (- n 1)))))
    scheme@(guile-user)> (factorial 20)
    $2 = 2432902008176640000
    scheme@(guile-user)> (getpwnam "root")        ; look in /etc/passwd
    $3 = #("root" "x" 0 0 "root" "/root" "/bin/bash")
    scheme@(guile-user)> C-d
    $
Like AWK, Perl, or any shell, Guile can interpret script files. A Guile script is simply a file of Scheme code with some extra information at the beginning which tells the operating system how to invoke Guile, and then tells Guile how to handle the Scheme code.
Here is a trivial Guile script. See Guile Scripting, for more details.
    #!/usr/local/bin/guile -s
    !#
    (display "Hello, world!")
    (newline)
The Guile interpreter is available as an object library, to be linked into applications using Scheme as a configuration or extension language.
Here is simple-guile.c, source code for a program that will produce a complete Guile interpreter. In addition to all usual functions provided by Guile, it will also offer the function my-hostname.
    #include <stdlib.h>
    #include <libguile.h>

    static SCM
    my_hostname (void)
    {
      char *s = getenv ("HOSTNAME");
      if (s == NULL)
        return SCM_BOOL_F;
      else
        return scm_from_locale_string (s);
    }

    static void
    inner_main (void *data, int argc, char **argv)
    {
      scm_c_define_gsubr ("my-hostname", 0, 0, 0, my_hostname);
      scm_shell (argc, argv);
    }

    int
    main (int argc, char **argv)
    {
      scm_boot_guile (argc, argv, inner_main, 0);
      return 0; /* never reached */
    }
When Guile is correctly installed on your system, the above program can be compiled and linked like this:
    $ gcc -o simple-guile simple-guile.c \
        `pkg-config --cflags --libs guile-2.2`
When it is run, it behaves just like the guile program except that you can also call the new my-hostname function.
    $ ./simple-guile
    scheme@(guile-user)> (+ 1 2 3)
    $1 = 6
    scheme@(guile-user)> (my-hostname)
    "burns"
You can link Guile into your program and make Scheme available to the users of your program. You can also link your library into Guile and make its functionality available to all users of Guile.
A library that is linked into Guile is called an extension, but it really just is an ordinary object library.
The following example shows how to write a simple extension for Guile that makes the j0 function available to Scheme code.
    #include <math.h>
    #include <libguile.h>

    SCM
    j0_wrapper (SCM x)
    {
      return scm_from_double (j0 (scm_to_double (x)));
    }

    void
    init_bessel ()
    {
      scm_c_define_gsubr ("j0", 1, 0, 0, j0_wrapper);
    }
This C source file needs to be compiled into a shared library. Here is how to do it on GNU/Linux:
    gcc `pkg-config --cflags guile-2.2` \
      -shared -o libguile-bessel.so -fPIC bessel.c
For creating shared libraries portably, we recommend the use of GNU Libtool (see Introduction in GNU Libtool).
A shared library can be loaded into a running Guile process with the function load-extension. The j0 function is then immediately available:
    $ guile
    scheme@(guile-user)> (load-extension "./libguile-bessel" "init_bessel")
    scheme@(guile-user)> (j0 2)
    $1 = 0.223890779141236
For more on how to install your extension, see Installing Site Packages.
Guile has support for dividing a program into modules. By using modules, you can group related code together and manage the composition of complete programs from largely independent parts.
For more details on the module system beyond this introductory material, see Modules.
Guile comes with a lot of useful modules, for example for string processing or command line parsing. Additionally, there exist many Guile modules written by other Guile hackers, but which have to be installed manually.
Here is a sample interactive session that shows how to use the (ice-9 popen) module which provides the means for communicating with other processes over pipes together with the (ice-9 rdelim) module that provides the function read-line.
    $ guile
    scheme@(guile-user)> (use-modules (ice-9 popen))
    scheme@(guile-user)> (use-modules (ice-9 rdelim))
    scheme@(guile-user)> (define p (open-input-pipe "ls -l"))
    scheme@(guile-user)> (read-line p)
    $1 = "total 30"
    scheme@(guile-user)> (read-line p)
    $2 = "drwxr-sr-x    2 mgrabmue mgrabmue     1024 Mar 29 19:57 CVS"
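The same two modules can also be used from a program rather than typed at the prompt. The following is a minimal sketch that collects every line a command prints and then closes the pipe with close-pipe. The procedure name command-lines and the echo command are our own choices for illustration; they are not part of Guile.

```scheme
(use-modules (ice-9 popen)    ; open-input-pipe, close-pipe
             (ice-9 rdelim))  ; read-line

;; Run COMMAND through the shell and return its output as a list of
;; strings, one per line.  `command-lines' is a hypothetical helper.
(define (command-lines command)
  (let ((port (open-input-pipe command)))
    (let loop ((line (read-line port))
               (lines '()))
      (if (eof-object? line)
          (begin
            (close-pipe port)          ; wait for the child process to finish
            (reverse lines))
          (loop (read-line port) (cons line lines))))))

(command-lines "echo one; echo two")   ; ⇒ ("one" "two")
```

read-line returns the end-of-file object once the command has produced all of its output, which is what ends the loop.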
You can create new modules using the syntactic form define-module. All definitions following this form until the next define-module are placed into the new module.
One module is usually placed into one file, and that file is installed in a location where Guile can automatically find it. The following session shows a simple example.
    $ cat /usr/local/share/guile/site/foo/bar.scm
    (define-module (foo bar)
      #:export (frob))
    (define (frob x) (* 2 x))
    $ guile
    scheme@(guile-user)> (use-modules (foo bar))
    scheme@(guile-user)> (frob 12)
    $1 = 24
For more on how to install your module, see Installing Site Packages.
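To illustrate the difference between exported and unexported definitions, here is a small sketch of a module file. The module name (demo shapes) and both procedure names are hypothetical, chosen only for this example.

```scheme
;; Contents of a hypothetical file demo/shapes.scm.
(define-module (demo shapes)
  #:export (square-area))      ; only square-area is visible to users

;; A helper that is not listed in #:export, so it stays private
;; to the module.
(define (square x)
  (* x x))

;; The exported procedure, which may freely use the private helper.
(define (square-area side)
  (square side))
```

If this file were saved as demo/shapes.scm in a directory on Guile’s load path, evaluating (use-modules (demo shapes)) elsewhere would make square-area available, while square would remain internal to the module.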
In addition to Scheme code you can also put things that are defined in C into a module.
You do this by writing a small Scheme file that defines the module and calls load-extension directly in the body of the module.
    $ cat /usr/local/share/guile/site/math/bessel.scm
    (define-module (math bessel)
      #:export (j0))
    (load-extension "libguile-bessel" "init_bessel")
    $ file /usr/local/lib/guile/2.2/extensions/libguile-bessel.so
    … ELF 32-bit LSB shared object …
    $ guile
    scheme@(guile-user)> (use-modules (math bessel))
    scheme@(guile-user)> (j0 2)
    $1 = 0.223890779141236
See Modules and Extensions, for more information.
Any problems with the installation should be reported to bug-guile@gnu.org.
If you find a bug in Guile, please report it to the Guile developers, so they can fix it. They may also be able to suggest workarounds when it is not possible for you to apply the bug-fix or install a new version of Guile yourself.
Before sending in bug reports, please check with the following list that you really have found a bug.
Before reporting the bug, check whether any programs you have loaded into Guile, including your .guile file, set any variables that may affect the functioning of Guile. Also, see whether the problem happens in a freshly started Guile without loading your .guile file (start Guile with the -q switch to prevent loading the init file). If the problem does not occur then, you must report the precise contents of any programs that you must load into Guile in order to cause the problem to occur.
When you write a bug report, please make sure to include as much as possible of the information described below. If you can’t figure out some of the items, it is not a problem, but the more information we get, the more likely we can diagnose and fix the bug.
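The version number of Guile, which you can get by calling (version) from within Guile.
Your machine type, as determined by the config.guess shell script. If you have a Guile checkout, this file is located in build-aux; otherwise you can fetch the latest version from http://git.savannah.gnu.org/gitweb/?p=config.git;a=blob_plain;f=config.guess;hb=HEAD.
    $ build-aux/config.guess
    x86_64-unknown-linux-gnu
If you installed Guile from a binary package, the version of that package. On systems that use RPM, use rpm -qa | grep guile. On systems that use DPKG, use dpkg -l | grep guile.
If you built Guile yourself, the build configuration that you used:
    $ ./config.status --config
    '--enable-error-on-warning' '--disable-deprecated'...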
If you have a Scheme program that produces the bug, please include it in the bug report. If your program is too big to include, please try to reduce your code to a minimal test case.
If you can reproduce your problem at the REPL, that is best. Give a transcript of the expressions you typed at the REPL.
If the manifestation of the bug is a Guile error message, it is important to report the precise text of the error message, and a backtrace showing how the Scheme program arrived at the error. This can be done using the ,backtrace command in Guile’s debugger.
If your bug causes Guile to crash, additional information from a low-level debugger such as GDB might be helpful. If you have built Guile yourself, you can run Guile under GDB via the meta/gdb-uninstalled-guile script. Instead of invoking Guile as usual, invoke the wrapper script, type run to start the process, then backtrace when the crash comes. Include that backtrace in your report.
In this chapter, we introduce the basic concepts that underpin the elegance and power of the Scheme language.
Readers who already possess a background knowledge of Scheme may happily skip this chapter. For the reader who is new to the language, however, the following discussions on data, procedures, expressions and closure are designed to provide a minimum level of Scheme understanding that is more or less assumed by the chapters that follow.
The style of this introductory material aims about halfway between the terse precision of R5RS and the discursiveness of existing Scheme tutorials. For pointers to useful Scheme resources on the web, please see Further Reading.
This section discusses the representation of data types and values, what it means for Scheme to be a latently typed language, and the role of variables. We conclude by introducing the Scheme syntaxes for defining a new variable, and for changing the value of an existing variable.
The term latent typing is used to describe a computer language, such as Scheme, for which you cannot, in general, simply look at a program’s source code and determine what type of data will be associated with a particular variable, or with the result of a particular expression.
Sometimes, of course, you can tell from the code what the type of an expression will be. If you have a line in your program that sets the variable x to the numeric value 1, you can be certain that, immediately after that line has executed (and in the absence of multiple threads), x has the numeric value 1. Or if you write a procedure that is designed to concatenate two strings, it is likely that the rest of your application will always invoke this procedure with two string parameters, and quite probable that the procedure would go wrong in some way if it was ever invoked with parameters that were not both strings.
Nevertheless, the point is that there is nothing in Scheme which requires the procedure parameters always to be strings, or x always to hold a numeric value, and there is no way of declaring in your program that such constraints should always be obeyed. In the same vein, there is no way to declare the expected type of a procedure’s return value.
Instead, the types of variables and expressions are only known – in general – at run time. If you need to check at some point that a value has the expected type, Scheme provides run time procedures that you can invoke to do so. But equally, it can be perfectly valid for two separate invocations of the same procedure to specify arguments with different types, and to return values with different types.
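As a small illustration of such run time checks, the sketch below uses the standard predicates number?, string?, procedure? and list?. The procedure name describe is our own invention for this example and is not provided by Guile.

```scheme
;; Return a short description of VALUE, decided entirely at run time.
;; `describe' is a hypothetical name used only for illustration.
(define (describe value)
  (cond ((number? value)    "a number")
        ((string? value)    "a string")
        ((procedure? value) "a procedure")
        ((list? value)      "a list")
        (else               "something else")))

(describe 42)            ; ⇒ "a number"
(describe "forty-two")   ; ⇒ "a string"
(describe car)           ; ⇒ "a procedure"
(describe '(4 2))        ; ⇒ "a list"
```

Note that the same procedure is invoked with an argument of a different type on each call, which is exactly what latent typing permits.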
The next subsection explains what this means in practice, for the ways that Scheme programs use data types, values and variables.
Scheme provides many data types that you can use to represent your data. Primitive types include characters, strings, numbers and procedures. Compound types, which allow a group of primitive and compound values to be stored together, include lists, pairs, vectors and multi-dimensional arrays. In addition, Guile allows applications to define their own data types, with the same status as the built-in standard Scheme types.
As a Scheme program runs, values of all types pop in and out of existence. Sometimes values are stored in variables, but more commonly they pass seamlessly from being the result of one computation to being one of the parameters for the next.
Consider an example. A string value is created because the interpreter reads in a literal string from your program’s source code. Then a numeric value is created as the result of calculating the length of the string. A second numeric value is created by doubling the calculated length. Finally the program creates a list with two elements – the doubled length and the original string itself – and stores this list in a program variable.
All of the values involved here – in fact, all values in Scheme – carry their type with them. In other words, every value “knows,” at runtime, what kind of value it is. A number, a string, a list, whatever.
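The example just described might be written as follows. This is only a sketch: the string contents and the names text and result are arbitrary choices of ours, and the let form used to name the string temporarily has not yet been introduced at this point in the manual.

```scheme
;; Build a two-element list: the doubled length of a string, and the
;; string itself.  The intermediate numbers are never stored in a
;; variable; they pass directly from one computation to the next.
(define result
  (let ((text "Hello, Guile!"))            ; a string value, 13 characters
    (list (* 2 (string-length text))       ; a number: twice the length
          text)))                          ; the original string

result                    ; ⇒ (26 "Hello, Guile!")

;; Each value carries its own type, which can be queried at run time.
(number? (car result))    ; ⇒ #t
(string? (cadr result))   ; ⇒ #t
(list? result)            ; ⇒ #t
```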
A variable, on the other hand, has no fixed type. A variable – x, say – is simply the name of a location – a box – in which you can store any kind of Scheme value. So the same variable in a program may hold a number at one moment, a list of procedures the next, and later a pair of strings. The “type” of a variable – insofar as the idea is meaningful at all – is simply the type of whatever value the variable happens to be storing at a particular moment.
To define a new variable, you use Scheme’s define syntax like this:
(define variable-name value)
This makes a new variable called variable-name and stores value in it as the variable’s initial value. For example:
    ;; Make a variable `x' with initial numeric value 1.
    (define x 1)
    ;; Make a variable `organization' with an initial string value.
    (define organization "Free Software Foundation")
(In Scheme, a semicolon marks the beginning of a comment that continues until the end of the line. So the lines beginning ;; are comments.)
Changing the value of an already existing variable is very similar, except that define is replaced by the Scheme syntax set!, like this:
(set! variable-name new-value)
Remember that variables do not have fixed types, so new-value may have a completely different type from whatever was previously stored in the location named by variable-name. Both of the following examples are therefore correct.
    ;; Change the value of `x' to 5.
    (set! x 5)
    ;; Change the value of `organization' to the FSF's street number.
    (set! organization 545)
In these examples, value and new-value are literal numeric or string values. In general, however, value and new-value can be any Scheme expression. Even though we have not yet covered the forms that Scheme expressions can take (see Expressions and Evaluation), you can probably guess what the following set! example does…
(set! x (+ x 1))
(Note: this is not a complete description of define and set!, because we need to introduce some other aspects of Scheme before the missing pieces can be filled in. If, however, you are already familiar with the structure of Scheme, you may like to read about those missing pieces immediately by jumping ahead to the following references.
An alternative form of the define syntax that can be used when defining new procedures.
An alternative form of the set! syntax that helps with changing a single value in the depths of a compound data structure.
The use of define other than at top level in a Scheme program, including a discussion of when it works to use define rather than set! to change the value of an existing variable.)
This section introduces the basics of using and creating Scheme procedures. It discusses the representation of procedures as just another kind of Scheme value, and shows how procedure invocation expressions are constructed. We then explain how lambda is used to create new procedures, and conclude by presenting the various shorthand forms of define that can be used instead of writing an explicit lambda expression.
One of the great simplifications of Scheme is that a procedure is just another type of value, and that procedure values can be passed around and stored in variables in exactly the same way as, for example, strings and lists. When we talk about a built-in standard Scheme procedure such as open-input-file, what we actually mean is that there is a pre-defined top level variable called open-input-file, whose value is a procedure that implements what R5RS says that open-input-file should do.
Note that this is quite different from many dialects of Lisp — including Emacs Lisp — in which a program can use the same name with two quite separate meanings: one meaning identifies a Lisp function, while the other meaning identifies a Lisp variable, whose value need have nothing to do with the function that is associated with the first meaning. In these dialects, functions and variables are said to live in different namespaces.
In Scheme, on the other hand, all names belong to a single unified namespace, and the variables that these names identify can hold any kind of Scheme value, including procedure values.
One consequence of the “procedures as values” idea is that, if you don’t happen to like the standard name for a Scheme procedure, you can change it.
For example, call-with-current-continuation is a very important standard Scheme procedure, but it also has a very long name! So, many programmers use the following definition to assign the same procedure value to the more convenient name call/cc.
(define call/cc call-with-current-continuation)
Let’s understand exactly how this works. The definition creates a new variable call/cc, and then sets its value to the value of the variable call-with-current-continuation; the latter value is a procedure that implements the behaviour that R5RS specifies under the name “call-with-current-continuation”. So call/cc ends up holding this value as well.
Now that call/cc holds the required procedure value, you could choose to use call-with-current-continuation for a completely different purpose, or just change its value so that you will get an error if you accidentally use call-with-current-continuation as a procedure in your program rather than call/cc. For example:
(set! call-with-current-continuation "Not a procedure any more!")
Or you could just leave call-with-current-continuation as it was. It’s perfectly fine for more than one variable to hold the same procedure value.
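The same idea can be tried with any standard procedure. The following is a minimal sketch; the name str-len is hypothetical and chosen only for illustration, it is not a standard Scheme or Guile name.

```scheme
;; `str-len' is a hypothetical alias, not a standard name.
(define str-len string-length)

;; Both variables now hold the very same procedure object.
(display (eq? str-len string-length))   ; prints #t
(newline)

;; The alias can be invoked exactly like the original.
(display (str-len "abc"))               ; prints 3
(newline)
```

Nothing about string-length itself has changed; there are simply two variables whose values are the same procedure.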
A procedure invocation in Scheme is written like this:
(procedure [arg1 [arg2 …]])
In this expression, procedure can be any Scheme expression whose value is a procedure. Most commonly, however, procedure is simply the name of a variable whose value is a procedure.
For example, string-append is a standard Scheme procedure whose behaviour is to concatenate together all the arguments, which are expected to be strings, that it is given. So the expression
(string-append "/home" "/" "andrew")
is a procedure invocation whose result is the string value "/home/andrew".
Similarly, string-length is a standard Scheme procedure that returns the length of a single string argument, so
(string-length "abc")
is a procedure invocation whose result is the numeric value 3.
Each of the parameters in a procedure invocation can itself be any Scheme expression. Since a procedure invocation is itself a type of expression, we can put these two examples together to get
(string-length (string-append "/home" "/" "andrew"))
— a procedure invocation whose result is the numeric value 12.
(You may be wondering what happens if the two examples are combined the other way round. If we do this, we can make a procedure invocation expression that is syntactically correct:
(string-append "/home" (string-length "abc"))
but when this expression is executed, it will cause an error, because the result of (string-length "abc") is a numeric value, and string-append is not designed to accept a numeric value as one of its arguments.)
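If the intention really is to combine the two the other way round, one possible approach is to convert the number to a string first. This sketch uses the standard procedure number->string, which has not been introduced yet at this point in the text; the variable name combined is hypothetical.

```scheme
;; number->string turns the numeric length into a string, so that
;; string-append receives only string arguments.
(define combined
  (string-append "/home" (number->string (string-length "abc"))))

(display combined)   ; prints /home3
(newline)
```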
Scheme has lots of standard procedures, and Guile provides all of these via predefined top level variables. All of these standard procedures are documented in the later chapters of this reference manual.
Before very long, though, you will want to create new procedures that encapsulate aspects of your own applications’ functionality. To do this, you can use the famous lambda syntax.
For example, the value of the following Scheme expression
(lambda (name address) expression …)
is a newly created procedure that takes two arguments: name and address. The behaviour of the new procedure is determined by the sequence of expressions in the body of the procedure definition. (Typically, these expressions would use the arguments in some way, or else there wouldn’t be any point in giving them to the procedure.) When invoked, the new procedure returns a value that is the value of the last expression in the procedure body.
To make things more concrete, let’s suppose that the two arguments are both strings, and that the purpose of this procedure is to form a combined string that includes these arguments. Then the full lambda expression might look like this:
(lambda (name address) (string-append "Name=" name ":Address=" address))
We noted in the previous subsection that the procedure part of a procedure invocation expression can be any Scheme expression whose value is a procedure. But that’s exactly what a lambda expression is! So we can use a lambda expression directly in a procedure invocation, like this:
((lambda (name address) (string-append "Name=" name ":Address=" address)) "FSF" "Cambridge")
This is a valid procedure invocation expression, and its result is the string:
"Name=FSF:Address=Cambridge"
It is more common, though, to store the procedure value in a variable —
(define make-combined-string (lambda (name address) (string-append "Name=" name ":Address=" address)))
— and then to use the variable name in the procedure invocation:
(make-combined-string "FSF" "Cambridge")
This has exactly the same result.
It’s important to note that procedures created using lambda have exactly the same status as the standard built in Scheme procedures, and can be invoked, passed around, and stored in variables in exactly the same ways.
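As a small illustration of "passed around", a lambda expression can be handed directly to another procedure as an argument. The sketch below assumes the standard list procedure map, which is described later in the manual; the variable name squares is hypothetical.

```scheme
;; The first argument to map is a procedure value created on the
;; spot by lambda; map calls it once for each list element.
(define squares
  (map (lambda (n) (* n n))
       '(1 2 3)))

(display squares)   ; prints (1 4 9)
(newline)
```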
Since it is so common in Scheme programs to want to create a procedure and then store it in a variable, there is an alternative form of the define syntax that allows you to do just that.
A define expression of the form
(define (name [arg1 [arg2 …]]) expression …)
is exactly equivalent to the longer form
(define name (lambda ([arg1 [arg2 …]]) expression …))
So, for example, the definition of make-combined-string in the previous subsection could equally be written:
(define (make-combined-string name address) (string-append "Name=" name ":Address=" address))
This kind of procedure definition creates a procedure that requires exactly the expected number of arguments. There are two further forms of the lambda expression, which create a procedure that can accept a variable number of arguments:
(lambda (arg1 … . args) expression …)
(lambda args expression …)
The corresponding forms of the alternative define syntax are:
(define (name arg1 … . args) expression …)
(define (name . args) expression …)
For details on how these forms work, see Lambda: Basic Procedure Creation.
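To give a rough feel for the variable-argument forms before reading the full description, here is a minimal sketch. The procedure names count-extra and sum-all are hypothetical; in both, the name after the dot is bound to a list of any remaining arguments.

```scheme
;; One required argument; any further arguments are collected
;; into the list `rest'.
(define (count-extra first . rest)
  (length rest))

;; No required arguments; all arguments are collected into `numbers'.
(define (sum-all . numbers)
  (apply + numbers))

(display (count-extra 'a 'b 'c))   ; prints 2
(newline)
(display (sum-all 1 2 3 4))        ; prints 10
(newline)
(display (sum-all))                ; prints 0
(newline)
```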
Prior to Guile 2.0, Guile provided an extension to define syntax that allowed you to nest the previous extension up to an arbitrary depth. This extension is no longer provided by default, and has instead been moved to Curried Definitions.
(It could be argued that the alternative define forms are rather confusing, especially for newcomers to the Scheme language, as they hide both the role of lambda and the fact that procedures are values that are stored in variables in the same way as any other kind of value. On the other hand, they are very convenient, and they are also a good example of another of Scheme’s powerful features: the ability to specify arbitrary syntactic transformations at run time, which can be applied to subsequently read input.)
So far, we have met expressions that do things, such as the define expressions that create and initialize new variables, and we have also talked about expressions that have values, for example the value of the procedure invocation expression:
(string-append "/home" "/" "andrew")
but we haven’t yet been precise about what causes an expression like this procedure invocation to be reduced to its “value”, or how the processing of such expressions relates to the execution of a Scheme program as a whole.
This section clarifies what we mean by an expression’s value, by introducing the idea of evaluation. It discusses the side effects that evaluation can have, explains how each of the various types of Scheme expression is evaluated, and describes the behaviour and use of the Guile REPL as a mechanism for exploring evaluation. The section concludes with a very brief summary of Scheme’s common syntactic expressions.
In Scheme, the process of executing an expression is known as evaluation. Evaluation has two kinds of result: the value of the evaluated expression, and the side effects of the evaluation, which consist of any effects of evaluating the expression that are not represented by the value.
Of the expressions that we have met so far, define and set! expressions have side effects (the creation or modification of a variable) but no value; lambda expressions have values (the newly constructed procedures) but no side effects; and procedure invocation expressions, in general, have either values, or side effects, or both.
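A procedure invocation that has both a value and a side effect can be sketched as follows. The names counter, bump!, first-result and second-result are hypothetical, introduced only for this illustration.

```scheme
(define counter 0)

;; Calling bump! has a side effect (it modifies `counter') and
;; also a value (the new count, from the last body expression).
(define (bump!)
  (set! counter (+ counter 1))
  counter)

(define first-result (bump!))    ; value 1; counter is now 1
(define second-result (bump!))   ; value 2; counter is now 2

(display (list first-result second-result counter))   ; prints (1 2 2)
(newline)
```

The two invocations are textually identical but produce different values, which is only possible because of the side effect on counter.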
It is tempting to try to define more intuitively what we mean by “value” and “side effects”, and what the difference between them is. In general, though, this is extremely difficult. It is also unnecessary; instead, we can quite happily define the behaviour of a Scheme program by specifying how Scheme executes a program as a whole, and then by describing the value and side effects of evaluation for each type of expression individually.
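So, some definitions…
A Scheme program consists of a sequence of expressions.
A Scheme interpreter executes the program by evaluating these expressions in order, one by one.
An expression can be a piece of literal data, such as a number 2.3 or a string "Hello world!"; a variable name; a procedure invocation expression; or one of Scheme’s special syntactic expressions.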
The following subsections describe how each of these types of expression is evaluated.
When a literal data expression is evaluated, the value of the expression is simply the value that the expression describes. The evaluation of a literal data expression has no side effects.
So, for example,
"abc" is the string value "abc"
3+4i is the complex number 3 + 4i
#(1 2 3) is a three-element vector containing the numeric values 1, 2 and 3.
For any data type which can be expressed literally like this, the syntax of the literal data expression for that data type — in other words, what you need to write in your code to indicate a literal value of that type — is known as the data type’s read syntax. This manual specifies the read syntax for each such data type in the section that describes that data type.
Some data types do not have a read syntax. Procedures, for example, cannot be expressed as literal data; they must be created using a lambda expression (see Creating and Using a New Procedure) or implicitly using the shorthand form of define (see Lambda Alternatives).
When an expression that consists simply of a variable name is evaluated, the value of the expression is the value of the named variable. The evaluation of a variable reference expression has no side effects.
So, after
(define key "Paul Evans")
the value of the expression key is the string value "Paul Evans". If key is then modified by
(set! key 3.74)
the value of the expression key is the numeric value 3.74.
If there is no variable with the specified name, evaluation of the variable reference expression signals an error.
This is where evaluation starts getting interesting! As already noted, a procedure invocation expression has the form
(procedure [arg1 [arg2 …]])
where procedure must be an expression whose value, when evaluated, is a procedure.
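The evaluation of a procedure invocation expression like this proceeds by evaluating individually the expressions procedure, arg1, arg2, and so on, and then calling the procedure that is the value of the procedure expression with the list of values obtained from the evaluations of arg1, arg2 etc. as its parameters.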
For a procedure defined in Scheme, “calling the procedure with the list of values as its parameters” means binding the values to the procedure’s formal parameters and then evaluating the sequence of expressions that make up the body of the procedure definition. The value of the procedure invocation expression is the value of the last evaluated expression in the procedure body. The side effects of calling the procedure are the combination of the side effects of the sequence of evaluations of expressions in the procedure body.
For a built-in procedure, the value and side-effects of calling the procedure are best described by that procedure’s documentation.
Note that the complete side effects of evaluating a procedure invocation expression consist not only of the side effects of the procedure call, but also of any side effects of the preceding evaluation of the expressions procedure, arg1, arg2, and so on.
To illustrate this, let’s look again at the procedure invocation expression:
(string-length (string-append "/home" "/" "andrew"))
In the outermost expression, procedure is string-length and arg1 is (string-append "/home" "/" "andrew").
Evaluation of string-length, which is a variable, gives a procedure value that implements the expected behaviour for “string-length”.
Evaluation of (string-append "/home" "/" "andrew"), which is another procedure invocation expression, means evaluating each of
  string-append, which gives a procedure value that implements the expected behaviour for “string-append”
  "/home", which gives the string value "/home"
  "/", which gives the string value "/"
  "andrew", which gives the string value "andrew"
and then invoking the procedure value with this list of string values as its arguments. The resulting value is a single string value that is the concatenation of all the arguments, namely "/home/andrew".
In the evaluation of the outermost expression, the interpreter can now invoke the procedure value obtained from procedure with the value obtained from arg1 as its arguments. The resulting value is a numeric value that is the length of the argument string, which is 12.
When a procedure invocation expression is evaluated, the procedure and all the argument expressions must be evaluated before the procedure can be invoked. Special syntactic expressions are special because they are able to manipulate their arguments in an unevaluated form, and can choose whether to evaluate any or all of the argument expressions.
Why is this needed? Consider a program fragment that asks the user whether or not to delete a file, and then deletes the file if the user answers yes.
(if (string=? (read-answer "Should I delete this file?") "yes") (delete-file file))
If the outermost (if …) expression here were a procedure invocation expression, the expression (delete-file file), whose side effect is to actually delete a file, would already have been evaluated before the if procedure even got invoked! Clearly this is no use: the whole point of an if expression is that the consequent expression is only evaluated if the condition of the if expression is “true”.
Therefore if must be special syntax, not a procedure. Other special syntaxes that we have already met are define, set! and lambda. define and set! are syntax because they need to know the variable name that is given as the first argument in a define or set! expression, not that variable’s value. lambda is syntax because it does not immediately evaluate the expressions that define the procedure body; instead it creates a procedure object that incorporates these expressions so that they can be evaluated in the future, when that procedure is invoked.
The rules for evaluating each special syntactic expression are specified individually for each special syntax. For a summary of standard special syntax, see Summary of Common Syntax.
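The difference between if as syntax and a hypothetical if-like procedure can be observed directly. In the sketch below, my-if, note!, evaluated, after-procedure and after-syntax are all hypothetical names; note! records that its argument expression was evaluated, standing in for a destructive operation such as delete-file.

```scheme
;; A list recording which branch expressions actually got evaluated.
(define evaluated '())

(define (note! tag)
  (set! evaluated (cons tag evaluated))
  tag)

;; An ordinary procedure that merely wraps if.  Being a procedure,
;; it receives all of its arguments already evaluated.
(define (my-if condition consequent alternative)
  (if condition consequent alternative))

;; Procedure call: both branch expressions are evaluated.
(my-if #t (note! 'consequent) (note! 'alternative))
(define after-procedure (length evaluated))

;; Special syntax: only the selected branch is evaluated.
(set! evaluated '())
(if #t (note! 'consequent) (note! 'alternative))
(define after-syntax (length evaluated))

(display (list after-procedure after-syntax))   ; prints (2 1)
(newline)
```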
Scheme is “properly tail recursive”, meaning that tail calls or recursions from certain contexts do not consume stack space or other resources and can therefore be used on arbitrarily large data or for an arbitrarily long calculation. Consider for example,
(define (foo n)
  (display n)
  (newline)
  (foo (1+ n)))

(foo 1)
-|
1
2
3
…
foo prints numbers infinitely, starting from the given n. It’s implemented by printing n then recursing to itself to print n+1 and so on. This recursion is a tail call: it’s the last thing done, and in Scheme such tail calls can be made without limit.
Or consider a case where a value is returned, a version of the SRFI-1 last function (see Selectors) returning the last element of a list,
(define (my-last lst)
  (if (null? (cdr lst))
      (car lst)
      (my-last (cdr lst))))

(my-last '(1 2 3)) ⇒ 3
If the list has more than one element, my-last applies itself to the cdr. This recursion is a tail call: there’s no code after it, and the return value is the return value from that call. In Scheme this can be used on an arbitrarily long list argument.
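A common way to put a computation into tail position is to carry the intermediate result along as an extra argument. The following is a minimal sketch of that idea; the procedure name sum-to and its accumulator argument total are hypothetical.

```scheme
;; Sum the integers from n down to 1.  The recursive call is the
;; last thing done in the "false" leg of the if, so it is a tail
;; call and the loop can run for as many iterations as needed.
(define (sum-to n total)
  (if (= n 0)
      total
      (sum-to (- n 1) (+ total n))))

(display (sum-to 1000000 0))   ; prints 500000500000
(newline)
```

Had the procedure been written as (+ n (sum-to (- n 1))), the addition would remain to be done after the recursive call returned, so that call would not be a tail call.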
A proper tail call is only available from certain contexts, namely the following special form positions,
and: last expression
begin: last expression
case: last expression in each clause
cond: last expression in each clause, and the call to a => procedure is a tail call
do: last result expression
if: “true” and “false” leg expressions
lambda: last expression in body
let, let*, letrec, let-syntax, letrec-syntax: last expression in body
or: last expression
The following core functions make tail calls,
apply: tail call to given procedure
call-with-current-continuation: tail call to the procedure receiving the new continuation
call-with-values: tail call to the values-receiving procedure
eval: tail call to evaluate the form
string-any, string-every: tail call to predicate on the last character (if that point is reached)
The above are just core functions and special forms. Tail calls in other modules are described with the relevant documentation, for example SRFI-1 any and every (see Searching).
It will be noted there are a lot of places which could potentially be tail calls, for instance the last call in a for-each, but only those explicitly described are guaranteed.
If you start Guile without specifying a particular program for it to execute, Guile enters its standard Read Evaluate Print Loop — or REPL for short. In this mode, Guile repeatedly reads in the next Scheme expression that the user types, evaluates it, and prints the resulting value.
The REPL is a useful mechanism for exploring the evaluation behaviour described in the previous subsection. If you type string-append, for example, the REPL replies #<primitive-procedure string-append>, illustrating the relationship between the variable string-append and the procedure value stored in that variable.
In this manual, the notation ⇒ is used to mean “evaluates to”. Wherever you see an example of the form
expression ⇒ result
feel free to try it out yourself by typing expression into the REPL and checking that it gives the expected result.
This subsection lists the most commonly used Scheme syntactic expressions, simply so that you will recognize common special syntax when you see it. For a full description of each of these syntaxes, follow the appropriate reference.
lambda (see Lambda: Basic Procedure Creation) is used to construct procedure objects.
define (see Top Level Variable Definitions) is used to create a new variable and set its initial value.
set! (see Top Level Variable Definitions) is used to modify an existing variable’s value.
let, let* and letrec (see Local Variable Bindings) create an inner lexical environment for the evaluation of a sequence of expressions, in which a specified set of local variables is bound to the values of a corresponding set of expressions. For an introduction to environments, see The Concept of Closure.
begin (see Sequencing and Splicing) executes a sequence of expressions in order and returns the value of the last expression. Note that this is not the same as a procedure which returns its last argument, because the evaluation of a procedure invocation expression does not guarantee to evaluate the arguments in order.
if and cond (see Simple Conditional Evaluation) provide conditional evaluation of argument expressions depending on whether one or more conditions evaluate to “true” or “false”.
case (see Simple Conditional Evaluation) provides conditional evaluation of argument expressions depending on whether a variable has one of a specified group of values.
and (see Conditional Evaluation of a Sequence of Expressions) executes a sequence of expressions in order until either there are no expressions left, or one of them evaluates to “false”.
or (see Conditional Evaluation of a Sequence of Expressions) executes a sequence of expressions in order until either there are no expressions left, or one of them evaluates to “true”.
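To help with recognizing these forms in practice, here is a short sketch that uses several of them together. All of the names it defines (classify, total, and-result, or-result) are hypothetical.

```scheme
;; cond: the first clause whose test is true selects the result.
(define (classify n)
  (cond ((< n 0) 'negative)
        ((= n 0) 'zero)
        (else 'positive)))

;; let*: each binding can refer to the ones before it.
(define total
  (let* ((x 2)
         (y (* x 3)))
    (+ x y)))

;; and: stops at the first false value, else gives the last value.
(define and-result (and 1 2 3))

;; or: stops at the first true value.
(define or-result (or #f 'a 'b))

(display (list (classify -5) total and-result or-result))
;; prints (negative 8 3 a)
(newline)
```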
The concept of closure is the idea that a lambda expression “captures” the variable bindings that are in lexical scope at the point where the lambda expression occurs. The procedure created by the lambda expression can refer to and mutate the captured bindings, and the values of those bindings persist between procedure calls.
This section explains and explores the various parts of this idea in more detail.
We said earlier that a variable name in a Scheme program is associated with a location in which any kind of Scheme value may be stored. (Incidentally, the term “vcell” is often used in Lisp and Scheme circles as an alternative to “location”.) Thus part of what we mean when we talk about “creating a variable” is in fact establishing an association between a name, or identifier, that is used by the Scheme program code, and the variable location to which that name refers. Although the value that is stored in that location may change, the location to which a given name refers is always the same.
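We can illustrate this by breaking down the operation of the define syntax into three parts: define creates a new location, establishes an association between that location and the name specified as the first argument of the define expression, and stores in that location the value obtained by evaluating the second argument of the define expression.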
A collection of associations between names and locations is called an environment. When you create a top level variable in a program using define, the name-location association for that variable is added to the “top level” environment. The “top level” environment also includes name-location associations for all the procedures that are supplied by standard Scheme.
It is also possible to create environments other than the top level one, and to create variable bindings, or name-location associations, in those environments. This ability is a key ingredient in the concept of closure; the next subsection shows how it is done.
We have seen how to create top level variables using the define syntax (see Defining and Setting Variables). It is often useful to create variables that are more limited in their scope, typically as part of a procedure body. In Scheme, this is done using the let syntax, or one of its modified forms let* and letrec. These syntaxes are described in full later in the manual (see Local Variable Bindings). Here our purpose is to illustrate their use just enough that we can see how local variables work.
For example, the following code uses a local variable s to simplify the computation of the area of a triangle given the lengths of its three sides.
(define a 5.3)
(define b 4.7)
(define c 2.8)

(define area
  (let ((s (/ (+ a b c) 2)))
    (sqrt (* s (- s a) (- s b) (- s c)))))
The effect of the let expression is to create a new environment and, within this environment, an association between the name s and a new location whose initial value is obtained by evaluating (/ (+ a b c) 2). The expressions in the body of the let, namely (sqrt (* s (- s a) (- s b) (- s c))), are then evaluated in the context of the new environment, and the value of the last expression evaluated becomes the value of the whole let expression, and therefore the value of the variable area.
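Since local variables are typically created as part of a procedure body, the same computation can be sketched as a procedure. The name triangle-area is hypothetical; here a, b and c are the procedure's parameters rather than top level variables.

```scheme
;; Each call creates a fresh local binding for `s', visible only
;; inside the body of the let.
(define (triangle-area a b c)
  (let ((s (/ (+ a b c) 2)))
    (sqrt (* s (- s a) (- s b) (- s c)))))

;; A 3-4-5 right triangle: s is 6, and 6 * 3 * 2 * 1 = 36.
(display (triangle-area 3 4 5))   ; prints 6
(newline)
```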
In the example of the previous subsection, we glossed over an important point. The body of the let expression in that example refers not only to the local variable s, but also to the top level variables a, b, c and sqrt. (sqrt is the standard Scheme procedure for calculating a square root.) If the body of the let expression is evaluated in the context of the local let environment, how does the evaluation get at the values of these top level variables?
The answer is that the local environment created by a let expression automatically has a reference to its containing environment (in this case the top level environment) and that the Scheme interpreter automatically looks for a variable binding in the containing environment if it doesn’t find one in the local environment. More generally, every environment except for the top level one has a reference to its containing environment, and the interpreter keeps searching back up the chain of environments (from most local to top level) until it either finds a variable binding for the required identifier or exhausts the chain.
This description also determines what happens when there is more than one variable binding with the same name. Suppose, continuing the example of the previous subsection, that there was also a pre-existing top level variable s created by the expression:
(define s "Some beans, my lord!")
Then both the top level environment and the local let environment would contain bindings for the name s. When evaluating code within the let body, the interpreter looks first in the local let environment, and so finds the binding for s created by the let syntax. Even though this environment has a reference to the top level environment, which also has a binding for s, the interpreter doesn’t get as far as looking there. When evaluating code outside the let body, the interpreter looks up variable names in the top level environment, so the name s refers to the top level variable.
Within the let body, the binding for s in the local environment is said to shadow the binding for s in the top level environment.
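Shadowing can be observed with a very small sketch. The variable inner-value and the local value 42 are hypothetical, chosen only to make the two bindings for s easy to tell apart.

```scheme
(define s "Some beans, my lord!")

;; Inside the let body, `s' refers to the local binding (42),
;; which shadows the top level string.
(define inner-value
  (let ((s 42))
    (* s 2)))

(display inner-value)   ; prints 84
(newline)

;; Outside the let body, `s' still refers to the top level
;; variable, which was never modified.
(display s)             ; prints Some beans, my lord!
(newline)
```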
The rules that we have just been describing are the details of how Scheme implements “lexical scoping”. This subsection takes a brief diversion to explain what lexical scope means in general and to present an example of non-lexical scoping.
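“Lexical scope” in general is the idea that an identifier at a particular place in a program always refers to the same variable location, where “always” means “every time that the containing expression is executed”, and that the variable location to which it refers can be determined by static examination of the source code context in which that identifier appears, without having to consider the flow of execution through the program as a whole.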
In practice, lexical scoping is the norm for most programming languages, and probably corresponds to what you would intuitively consider to be “normal”. You may even be wondering how the situation could possibly — and usefully — be otherwise. To demonstrate that another kind of scoping is possible, therefore, and to compare it against lexical scoping, the following subsection presents an example of non-lexical scoping and examines in detail how its behavior differs from the corresponding lexically scoped code.
To demonstrate that non-lexical scoping does exist and can be useful, we present the following example from Emacs Lisp, which is a “dynamically scoped” language.
(defvar currency-abbreviation "USD")

(defun currency-string (units hundredths)
  (concat currency-abbreviation
          (number-to-string units)
          "."
          (number-to-string hundredths)))

(defun french-currency-string (units hundredths)
  (let ((currency-abbreviation "FRF"))
    (currency-string units hundredths)))
The question to focus on here is: what does the identifier currency-abbreviation refer to in the currency-string function? The answer, in Emacs Lisp, is that all variable bindings go onto a single stack, and that currency-abbreviation refers to the topmost binding from that stack which has the name “currency-abbreviation”. The binding that is created by the defvar form, to the value "USD", is only relevant if none of the code that calls currency-string rebinds the name “currency-abbreviation” in the meanwhile.
The second function french-currency-string works precisely by taking advantage of this behaviour. It creates a new binding for the name “currency-abbreviation” which overrides the one established by the defvar form.
;; Note! This is Emacs Lisp evaluation, not Scheme!
(french-currency-string 33 44)
⇒ "FRF33.44"
Now let’s look at the corresponding, lexically scoped Scheme code:
(define currency-abbreviation "USD")

(define (currency-string units hundredths)
  (string-append currency-abbreviation
                 (number->string units)
                 "."
                 (number->string hundredths)))

(define (french-currency-string units hundredths)
  (let ((currency-abbreviation "FRF"))
    (currency-string units hundredths)))
According to the rules of lexical scoping, the currency-abbreviation in currency-string refers to the variable location in the innermost environment at that point in the code which has a binding for currency-abbreviation, which is the variable location in the top level environment created by the preceding (define currency-abbreviation …) expression.
In Scheme, therefore, the french-currency-string procedure does not work as intended. The variable binding that it creates for “currency-abbreviation” is purely local to the code that forms the body of the let expression. Since this code doesn’t directly use the name “currency-abbreviation” at all, the binding is pointless.
(french-currency-string 33 44) ⇒ "USD33.44"
This raises the question of how the Emacs Lisp behaviour can be implemented in Scheme. In general, this is a design question whose answer depends upon the problem that is being addressed. In this case, the best answer may be that currency-string should be redesigned so that it can take an optional third argument. This third argument, if supplied, is interpreted as a currency abbreviation that overrides the default.
It is possible to change french-currency-string so that it mostly works without changing currency-string, but the fix is inelegant, and susceptible to interrupts that could leave the currency-abbreviation variable in the wrong state:
(define (french-currency-string units hundredths) (set! currency-abbreviation "FRF") (let ((result (currency-string units hundredths))) (set! currency-abbreviation "USD") result))
The key point here is that the code does not create any local binding
for the identifier currency-abbreviation
, so all occurrences of
this identifier refer to the top level variable.
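Where dynamically scoped behaviour really is wanted, one alternative (not the approach taken in the text above) is to hold the abbreviation in a parameter object. The sketch below assumes Guile’s make-parameter and parameterize; the rebinding is undone when the parameterize body exits, so the value is not left in the wrong state.

```scheme
;; A parameter object holds a value that can be rebound dynamically.
(define currency-abbreviation (make-parameter "USD"))

(define (currency-string units hundredths)
  (string-append (currency-abbreviation)   ; call the parameter to read it
                 (number->string units)
                 "."
                 (number->string hundredths)))

(define (french-currency-string units hundredths)
  ;; "FRF" is in effect only for the dynamic extent of this body.
  (parameterize ((currency-abbreviation "FRF"))
    (currency-string units hundredths)))

(french-currency-string 33 44)   ; ⇒ "FRF33.44"
(currency-string 33 44)          ; ⇒ "USD33.44"
```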
Consider a let expression that doesn’t contain any lambdas:
(let ((s (/ (+ a b c) 2))) (sqrt (* s (- s a) (- s b) (- s c))))
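When the Scheme interpreter evaluates this, it
• creates a new environment with a reference to the environment that
  was current when it encountered the let
• creates a variable binding for s in the new environment, with
  value given by (/ (+ a b c) 2)
• evaluates the expression in the body of the let in the context of
  the new local environment, and remembers the value V
• forgets the local environment
• continues evaluating the expression that contained the let, using
  the value V as the value of the let expression, in the
  context of the containing environment.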
After the let
expression has been evaluated, the local
environment that was created is simply forgotten, and there is no longer
any way to access the binding that was created in this environment. If
the same code is evaluated again, it will follow the same steps again,
creating a second new local environment that has no connection with the
first, and then forgetting this one as well.
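As a small illustration of these steps, the let expression can be wrapped in a procedure and evaluated more than once. The procedure name triangle-area and the sample side lengths are assumptions made for this sketch; each call creates, uses and then forgets its own local environment for s.

```scheme
;; Hypothetical wrapper around the let expression from the text.
(define (triangle-area a b c)
  (let ((s (/ (+ a b c) 2)))             ; fresh binding for s on every call
    (sqrt (* s (- s a) (- s b) (- s c)))))

(triangle-area 3 4 5)     ; numerically equal to 6
(triangle-area 5 12 13)   ; numerically equal to 30
```

The second call cannot see the s computed by the first; nothing persists between the two evaluations.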
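If the let body contains a lambda expression, however, the
local environment is not forgotten. Instead, it becomes
associated with the procedure that is created by the lambda
expression, and is reinstated every time that that procedure is called.
In detail, this works as follows.
• When the Scheme interpreter evaluates a lambda expression, to
  create a procedure object, it stores the current environment as part of
  the procedure definition.
• Then, whenever that procedure is called, the interpreter reinstates
  the stored environment and evaluates the procedure body within the
  context of that environment.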
The result is that the procedure body is always evaluated in the context of the environment that was current when the procedure was created.
This is what is meant by closure. The next few subsections present examples that explore the usefulness of this concept.
This example uses closure to create a procedure with a variable binding that is private to the procedure, like a local variable, but whose value persists between procedure calls.
(define (make-serial-number-generator)
  (let ((current-serial-number 0))
    (lambda ()
      (set! current-serial-number (+ current-serial-number 1))
      current-serial-number)))

(define entry-sn-generator (make-serial-number-generator))

(entry-sn-generator)
⇒ 1

(entry-sn-generator)
⇒ 2
When make-serial-number-generator
is called, it creates a local
environment with a binding for current-serial-number
whose
initial value is 0, then, within this environment, creates a procedure.
The local environment is stored within the created procedure object and
so persists for the lifetime of the created procedure.
Every time the created procedure is invoked, it increments the value of
the current-serial-number
binding in the captured environment and
then returns the current value.
Note that make-serial-number-generator
can be called again to
create a second serial number generator that is independent of the
first. Every new invocation of make-serial-number-generator
creates a new local let
environment and returns a new procedure
object with an association to this environment.
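The independence of separately created generators can be checked directly. In the sketch below the second generator’s name, order-sn-generator, is hypothetical; the definition of make-serial-number-generator is repeated so that the block stands alone.

```scheme
(define (make-serial-number-generator)
  (let ((current-serial-number 0))
    (lambda ()
      (set! current-serial-number (+ current-serial-number 1))
      current-serial-number)))

;; Each call to make-serial-number-generator captures its own environment.
(define entry-sn-generator (make-serial-number-generator))
(define order-sn-generator (make-serial-number-generator))

(entry-sn-generator)   ; ⇒ 1
(entry-sn-generator)   ; ⇒ 2
(order-sn-generator)   ; ⇒ 1, unaffected by the calls above
(entry-sn-generator)   ; ⇒ 3
```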
A frequently used programming model for library code is to allow an application to register a callback function for the library to call when some particular event occurs. It is often useful for the application to make several such registrations using the same callback function, for example if several similar library events can be handled using the same application code. The need then arises to distinguish the callback function calls that are associated with one callback registration from those that are associated with different callback registrations.
In languages without the ability to create functions dynamically, this
problem is usually solved by passing a user_data
parameter on the
registration call, and including the value of this parameter as one of
the parameters on the callback function. Here is an example of
declarations using this solution in C:
typedef void (event_handler_t) (int event_type,
                                void *user_data);

void register_callback (int event_type,
                        event_handler_t *handler,
                        void *user_data);
In Scheme, closure can be used to achieve the same functionality without
requiring the library code to store a user-data for each callback
registration.
;; In the library:

(define (register-callback event-type handler-proc)
  …)

;; In the application:

(define (make-handler event-type user-data)
  (lambda ()
    …
    <code referencing event-type and user-data>
    …))

(register-callback event-type
                   (make-handler event-type …))
As far as the library is concerned, handler-proc is a procedure
with no arguments, and all the library has to do is call it when the
appropriate event occurs. From the application’s point of view, though,
the handler procedure has used closure to capture an environment that
includes all the context that the handler code needs (event-type
and user-data) to handle the event correctly.
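To make the outline above concrete, here is a minimal runnable sketch. The library side is an assumption made for illustration: it keeps handlers in an association list named registered-handlers and dispatches through a hypothetical signal-event procedure. The handlers simply record what they were called with in a list named handled.

```scheme
;; In the library (hypothetical minimal implementation): remember each
;; handler in an association list keyed by event type.
(define registered-handlers '())

(define (register-callback event-type handler-proc)
  (set! registered-handlers
        (cons (cons event-type handler-proc) registered-handlers)))

;; Hypothetical helper: call every handler registered for event-type.
(define (signal-event event-type)
  (for-each (lambda (entry)
              (when (eq? (car entry) event-type)
                ((cdr entry))))          ; handlers take no arguments
            registered-handlers))

;; In the application: each closure captures its own event-type and
;; user-data, so the library never has to store them.
(define handled '())

(define (make-handler event-type user-data)
  (lambda ()
    (set! handled (cons (list event-type user-data) handled))))

(register-callback 'key-press (make-handler 'key-press "window 1"))
(register-callback 'key-press (make-handler 'key-press "window 2"))
(signal-event 'key-press)
handled   ; ⇒ ((key-press "window 1") (key-press "window 2"))
```

Both registrations use the same handler code, yet each call can be told apart by the user-data captured in its closure.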
Closure is the capture of an environment, containing persistent variable bindings, within the definition of a procedure or a set of related procedures. This is rather similar to the idea in some object oriented languages of encapsulating a set of related data variables inside an “object”, together with a set of “methods” that operate on the encapsulated data. The following example shows how closure can be used to emulate the ideas of objects, methods and encapsulation in Scheme.
(define (make-account)
  (let ((balance 0))
    (define (get-balance)
      balance)
    (define (deposit amount)
      (set! balance (+ balance amount))
      balance)
    (define (withdraw amount)
      (deposit (- amount)))

    (lambda args
      (apply
        (case (car args)
          ((get-balance) get-balance)
          ((deposit) deposit)
          ((withdraw) withdraw)
          (else (error "Invalid method!")))
        (cdr args)))))
Each call to make-account creates and returns a new procedure,
created by the expression in the example code that begins “(lambda
args”.
(define my-account (make-account))

my-account
⇒ #<procedure args>
This procedure acts as an account object with methods
get-balance, deposit and withdraw. To apply one of
the methods to the account, you call the procedure with a symbol
indicating the required method as the first parameter, followed by any
other parameters that are required by that method.
(my-account 'get-balance)
⇒ 0

(my-account 'withdraw 5)
⇒ -5

(my-account 'deposit 396)
⇒ 391

(my-account 'get-balance)
⇒ 391
Note how, in this example, both the current balance and the helper
procedures get-balance, deposit and withdraw, used
to implement the guts of the account object’s methods, are all stored in
variable bindings within the private local environment captured by the
lambda expression that creates the account object procedure.
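Because every call to make-account captures a fresh environment, separate account objects do not share a balance. The following sketch repeats the definition so that it stands alone; the account names savings and checking are hypothetical.

```scheme
(define (make-account)
  (let ((balance 0))
    (define (get-balance) balance)
    (define (deposit amount)
      (set! balance (+ balance amount))
      balance)
    (define (withdraw amount)
      (deposit (- amount)))
    (lambda args
      (apply (case (car args)
               ((get-balance) get-balance)
               ((deposit) deposit)
               ((withdraw) withdraw)
               (else (error "Invalid method!")))
             (cdr args)))))

;; Two objects, two private environments.
(define savings (make-account))
(define checking (make-account))

(savings 'deposit 100)     ; ⇒ 100
(checking 'deposit 25)     ; ⇒ 25
(savings 'withdraw 30)     ; ⇒ 70
(checking 'get-balance)    ; ⇒ 25
```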
Guile’s core language is Scheme, and a lot can be achieved simply by using Guile to write and run Scheme programs — as opposed to having to dive into C code. In this part of the manual, we explain how to use Guile in this mode, and describe the tools that Guile provides to help you with script writing, debugging, and packaging your programs for distribution.
For detailed reference information on the variables, functions, and so on that make up Guile’s application programming interface (API), see API Reference.
Guile’s core language is Scheme, which is specified and described in the series of reports known as RnRS. RnRS is shorthand for the Revised^n Report on the Algorithmic Language Scheme. Guile complies fully with R5RS (see Introduction in R5RS), and implements some aspects of R6RS.
Guile also has many extensions that go beyond these reports. Some of the areas where Guile extends R5RS are:
Many features of Guile depend on and can be changed by information that the user provides either before or when Guile is started. Below is a description of what information to provide and how to provide it.
Here we describe Guile’s command-line processing in detail. Guile processes its arguments from left to right, recognizing the switches described below. For examples, see Scripting Examples.
script arg...
-s script arg...
By default, Guile will read a file named on the command line as a
script. Any command-line arguments arg... following script
become the script’s arguments; the command-line function returns
a list of strings of the form (script arg...).
It is possible to name a file using a leading hyphen, for example, -myfile.scm. In this case, the file name must be preceded by -s to tell Guile that a (script) file is being named.
Scripts are read and evaluated as Scheme source code just as the
load
function would. After loading script, Guile exits.
-c expr arg...
Evaluate expr as Scheme code, and then exit. Any command-line
arguments arg... following expr become command-line
arguments; the command-line function returns a list of strings of
the form (guile arg...), where guile is the
path of the Guile executable.
-- arg...
Run interactively, prompting the user for expressions and evaluating
them. Any command-line arguments arg... following the --
become command-line arguments for the interactive session; the
command-line function returns a list of strings of the form
(guile arg...), where guile is the path of the
Guile executable.
-L directory
Add directory to the front of Guile’s module load path. The given
directories are searched in the order given on the command line and
before any directories in the GUILE_LOAD_PATH
environment
variable. Paths added here are not in effect during execution of
the user’s .guile file.
-C directory
Like -L, but adjusts the load path for compiled files.
-x extension
Add extension to the front of Guile’s load extension list
(see %load-extensions). The specified extensions
are tried in the order given on the command line, and before the default
load extensions. Extensions added here are not in effect during
execution of the user’s .guile file.
-l file
Load Scheme source code from file, and continue processing the command line.
-e function
Make function the entry point of the script. After loading
the script file (with -s) or evaluating the expression (with
-c), apply function to a list containing the program name
and the command-line arguments—the list provided by the
command-line
function.
A -e switch can appear anywhere in the argument list, but Guile always invokes the function as the last action it performs. This is weird, but because of the way script invocation works under POSIX, the -s option must always come last in the list.
The function is most often a simple symbol that names a function
that is defined in the script. It can also be of the form (@
module-name symbol), and in that case, the symbol is
looked up in the module named module-name.
As a shorthand you can use the form (symbol ...), that is, a list
of only symbols that doesn’t start with @. It is equivalent to
(@ module-name main), where module-name is the
(symbol ...) form. See Using Guile Modules and Scripting Examples.
-ds
Treat a final -s option as if it occurred at this point in the command line; load the script here.
This switch is necessary because, although the POSIX script invocation mechanism effectively requires the -s option to appear last, the programmer may well want to run the script before other actions requested on the command line. For examples, see Scripting Examples.
\
Read more command-line arguments, starting from the second line of the script file. See The Meta Switch.
--use-srfi=list
The option --use-srfi expects a comma-separated list of numbers,
each representing a SRFI module to be loaded into the interpreter
before evaluating a script file or starting the REPL. Additionally,
the feature identifiers for the loaded SRFIs are recognized by
cond-expand when this option is used.
Here is an example that loads the modules SRFI-8 (‘receive’) and SRFI-13 (‘string library’) before the Guile interpreter is started:
guile --use-srfi=8,13
--debug
Start with the debugging virtual machine (VM) engine. Using the debugging VM will enable support for VM hooks, which are needed for tracing, breakpoints, and accurate call counts when profiling. The debugging VM is slower than the regular VM, though, by about ten percent. See VM Hooks, for more information.
By default, the debugging VM engine is only used when entering an interactive session. When executing a script with -s or -c, the normal, faster VM is used by default.
--no-debug
Do not use the debugging VM engine, even when entering an interactive session.
Note that, despite the name, Guile running with --no-debug does support the usual debugging facilities, such as printing a detailed backtrace upon error. The only difference from --debug is the lack of support for VM hooks and the facilities that build upon them (see above).
-q
Do not load the initialization file, .guile. This option only has an effect when running interactively; running scripts does not load the .guile file. See The Init File, ~/.guile.
--listen[=p]
While this program runs, listen on a local port or a path for REPL clients. If p starts with a number, it is assumed to be a local port on which to listen. If it starts with a forward slash, it is assumed to be the file name of a UNIX domain socket on which to listen.
If p is not given, the default is local port 37146. If you look at it upside down, it almost spells “Guile”. If you have netcat installed, you should be able to run nc localhost 37146 and get a Guile prompt. Alternatively, you can fire up Emacs and connect to the process; see Using Guile in Emacs for more details.
Note: Opening a port allows anyone who can connect to that port to do anything Guile can do, as the user that the Guile process is running as. Do not use --listen on multi-user machines. Of course, if you do not pass --listen to Guile, no port will be opened.
Guile protects against the HTTP inter-protocol exploitation attack, a scenario whereby an attacker can, via an HTML page, cause a web browser to send data to TCP servers listening on a loopback interface or private network. Nevertheless, you are advised to use UNIX domain sockets, as in --listen=/some/local/file, whenever possible.
That said, --listen is great for interactive debugging and development.
--auto-compile
Compile source files automatically (default behavior).
--fresh-auto-compile
Treat the auto-compilation cache as invalid, forcing recompilation.
--no-auto-compile
Disable automatic source file compilation.
--language=lang
For the remainder of the command line arguments, assume that files
mentioned with -l and expressions passed with -c are
written in lang. lang must be the name of one of the
languages supported by the compiler (see Compiler Tower). When run
interactively, set the REPL’s language to lang (see Using Guile Interactively).
The default language is scheme; other interesting values include
elisp (for Emacs Lisp), and ecmascript.
The example below shows the evaluation of expressions in Scheme, Emacs Lisp, and ECMAScript:
guile -c "(apply + '(1 2))"
guile --language=elisp -c "(= (funcall (symbol-function '+) 1 2) 3)"
guile --language=ecmascript -c '(function (x) { return x * x; })(2);'
To load a file written in Scheme and one written in Emacs Lisp, and then start a Scheme REPL, type:
guile -l foo.scm --language=elisp -l foo.el --language=scheme
-h, --help
Display help on invoking Guile, and then exit.
-v, --version
Display the current version of Guile, and then exit.
The environment is a feature of the operating system; it consists of a collection of variables with names and values. Each variable is called an environment variable (or, sometimes, a “shell variable”); environment variable names are case-sensitive, and it is conventional to use upper-case letters only. The values are all text strings, even those that are written as numerals. (Note that here we are referring to names and values that are defined in the operating system shell from which Guile is invoked. This is not the same as a Scheme environment that is defined within a running instance of Guile. For a description of Scheme environments, see Names, Locations, Values and Environments.)
How to set environment variables before starting Guile depends on the
operating system and, especially, the shell that you are using. For
example, here is how to tell Guile to provide detailed warning messages
about deprecated features by setting GUILE_WARN_DEPRECATED using
Bash:
$ export GUILE_WARN_DEPRECATED="detailed"
$ guile
Or, detailed warnings can be turned on for a single invocation using:
$ env GUILE_WARN_DEPRECATED="detailed" guile
If you wish to retrieve or change the value of the shell environment variables that affect the run-time behavior of Guile from within a running instance of Guile, see Runtime Environment.
Here are the environment variables that affect the run-time behavior of Guile:
GUILE_AUTO_COMPILE
This is a flag that can be used to tell Guile whether or not to compile Scheme source files automatically. Starting with Guile 2.0, Scheme source files will be compiled automatically, by default.
If a compiled (.go) file corresponding to a .scm file is not found or is not newer than the .scm file, the .scm file will be compiled on the fly, and the resulting .go file stored away. An advisory note will be printed on the console.
Compiled files will be stored in the directory
$XDG_CACHE_HOME/guile/ccache, where XDG_CACHE_HOME
defaults to the directory $HOME/.cache. This directory will be
created if it does not already exist.
Note that this mechanism depends on the timestamp of the .go file being newer than that of the .scm file; if the .scm or .go files are moved after installation, care should be taken to preserve their original timestamps.
Set GUILE_AUTO_COMPILE to zero (0) to prevent Scheme files from
being compiled automatically. Set this variable to “fresh” to tell
Guile to compile Scheme files whether they are newer than the compiled
files or not.
GUILE_HISTORY
This variable names the file that holds the Guile REPL command history. You can specify a different history file by setting this environment variable. By default, the history file is $HOME/.guile_history.
GUILE_INSTALL_LOCALE
This is a flag that can be used to tell Guile whether or not to install
the current locale at startup, via a call to (setlocale LC_ALL "").
See Locales, for more information on locales.
You may explicitly indicate that you do not want to install
the locale by setting GUILE_INSTALL_LOCALE to 0, or
explicitly enable it by setting the variable to 1.
Usually, installing the current locale is the right thing to do. It allows Guile to correctly parse and print strings with non-ASCII characters. Therefore, this option is on by default.
GUILE_STACK_SIZE
Guile currently has a limited stack size for Scheme computations. Attempting to call too many nested functions will signal an error. This is good for detecting infinite recursion, but sometimes the limit is reached for normal computations. This environment variable, if set to a positive integer, specifies the number of Scheme value slots to allocate for the stack.
In the future we will implement stacks that can grow and shrink, but for now this hack will have to do.
GUILE_LOAD_COMPILED_PATH
This variable may be used to augment the path that is searched for
compiled Scheme files (.go files) when loading. Its value should
be a colon-separated list of directories. If it contains the special
path component ... (ellipsis), then the default path is put in
place of the ellipsis, otherwise the default path is placed at the end.
The result is stored in %load-compiled-path (see Load Paths).
Here is an example using the Bash shell that adds the current directory,
., and the relative directory ../my-library to
%load-compiled-path:
$ export GUILE_LOAD_COMPILED_PATH=".:../my-library"
$ guile -c '(display %load-compiled-path) (newline)'
(. ../my-library /usr/local/lib/guile/2.2/ccache)
GUILE_LOAD_PATH
This variable may be used to augment the path that is searched for
Scheme files when loading. Its value should be a colon-separated list
of directories. If it contains the special path component ...
(ellipsis), then the default path is put in place of the ellipsis,
otherwise the default path is placed at the end. The result is stored
in %load-path (see Load Paths).
Here is an example using the Bash shell that prepends the current
directory to %load-path, and adds the relative directory
../srfi to the end:
$ env GUILE_LOAD_PATH=".:...:../srfi" \
guile -c '(display %load-path) (newline)'
(. /usr/local/share/guile/2.2 \
/usr/local/share/guile/site/2.2 \
/usr/local/share/guile/site \
/usr/local/share/guile \
../srfi)
(Note: The line breaks, above, are for documentation purposes only, and not required in the actual example.)
GUILE_WARN_DEPRECATED
As Guile evolves, some features will be eliminated or replaced by newer
features. To help users migrate their code as this evolution occurs,
Guile will issue warning messages about code that uses features that
have been marked for eventual elimination. GUILE_WARN_DEPRECATED
can be set to “no” to tell Guile not to display these warning
messages, or set to “detailed” to tell Guile to display more lengthy
messages describing the warning. See Deprecation.
HOME
Guile uses the environment variable HOME, the name of your home
directory, to locate various files, such as .guile or
.guile_history.
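The defaults described in this list can be mirrored from Scheme using getenv. The sketch below is illustrative only: the procedure names history-file, compiled-cache-directory and expand-search-path are hypothetical, and Guile’s own startup code may compute these values differently.

```scheme
(use-modules (srfi srfi-1))   ; for append-map

;; GUILE_HISTORY, falling back to $HOME/.guile_history.
(define (history-file)
  (or (getenv "GUILE_HISTORY")
      (string-append (or (getenv "HOME") "") "/.guile_history")))

;; $XDG_CACHE_HOME/guile/ccache, where XDG_CACHE_HOME defaults to
;; $HOME/.cache.
(define (compiled-cache-directory)
  (string-append (or (getenv "XDG_CACHE_HOME")
                     (string-append (or (getenv "HOME") "") "/.cache"))
                 "/guile/ccache"))

;; Splice default-path in place of a "..." component, or append it at
;; the end when no "..." component is present.
(define (expand-search-path value default-path)
  (let ((components (string-split value #\:)))
    (if (member "..." components)
        (append-map (lambda (component)
                      (if (string=? component "...")
                          default-path
                          (list component)))
                    components)
        (append components default-path))))

(expand-search-path ".:...:../srfi" '("/usr/local/share/guile/2.2"))
;; ⇒ ("." "/usr/local/share/guile/2.2" "../srfi")
(expand-search-path ".:../my-library" '("/usr/local/lib/guile/2.2/ccache"))
;; ⇒ ("." "../my-library" "/usr/local/lib/guile/2.2/ccache")
```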
Like AWK, Perl, or any shell, Guile can interpret script files. A Guile script is simply a file of Scheme code with some extra information at the beginning which tells the operating system how to invoke Guile, and then tells Guile how to handle the Scheme code.
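The first line of a Guile script must tell the operating system to use Guile to evaluate the script, and then tell Guile how to go about doing that. Here is the simplest case:
• The first two characters of the file must be ‘#!’.
  The operating system interprets this to mean that the rest of the line is the name of an executable that can interpret the script. Guile, however, interprets these characters as the beginning of a multi-line comment, terminated by the characters ‘!#’ on a line by themselves. (This is an extension to the syntax described in R5RS, added to support shell scripts.)
• Immediately after those two characters must come the full pathname of the Guile interpreter, followed by a space and the -s switch, which tells Guile to run the file as a script.
• The second line of the script should contain only the characters ‘!#’, which close the comment.
• To declare the character encoding of the source file, a declaration such as coding: utf-8 should appear in a comment
  somewhere in the first five lines of the file: see Character Encoding of Source Files.
• The rest of the file should be a Scheme program.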
Guile reads the program, evaluating expressions in the order that they appear. Upon reaching the end of the file, Guile exits.
Guile’s command-line switches allow the programmer to describe reasonably complicated actions in scripts. Unfortunately, the POSIX script invocation mechanism only allows one argument to appear on the ‘#!’ line after the path to the Guile executable, and imposes arbitrary limits on that argument’s length. Suppose you wrote a script starting like this:
#!/usr/local/bin/guile -e main -s
!#
(define (main args)
  (map (lambda (arg) (display arg) (display " "))
       (cdr args))
  (newline))
The intended meaning is clear: load the file, and then call main
on the command-line arguments. However, the system will treat
everything after the Guile path as a single argument, the string
"-e main -s", which is not what we want.
As a workaround, the meta switch \ allows the Guile programmer to
specify an arbitrary number of options without patching the kernel. If
the first argument to Guile is \, Guile will open the script file
whose name follows the \, parse arguments starting from the
file’s second line (according to rules described below), and substitute
them for the \ switch.
Working in concert with the meta switch, Guile treats the characters ‘#!’ as the beginning of a comment which extends through the next line containing only the characters ‘!#’. This sort of comment may appear anywhere in a Guile program, but it is most useful at the top of a file, meshing magically with the POSIX script invocation mechanism.
Thus, consider a script named /u/jimb/ekko which starts like this:
#!/usr/local/bin/guile \
-e main -s
!#
(define (main args)
  (map (lambda (arg) (display arg) (display " "))
       (cdr args))
  (newline))
Suppose a user invokes this script as follows:
$ /u/jimb/ekko a b c
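Here’s what happens:
• The operating system recognizes the ‘#!’ token at the top of the file, and rewrites the command line to:
  /usr/local/bin/guile \ /u/jimb/ekko a b c
  This is the usual behavior, prescribed by POSIX.
• When Guile sees the first two arguments, \ /u/jimb/ekko, it opens
  /u/jimb/ekko, parses the three arguments -e, main,
  and -s from it, and substitutes them for the \ switch.
  Thus, Guile’s command line now reads:
  /usr/local/bin/guile -e main -s /u/jimb/ekko a b c
• Guile then processes these switches: it loads /u/jimb/ekko as a file of Scheme code, and then applies main to the list of command-line arguments, as in
  (main '("/u/jimb/ekko" "a" "b" "c")).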
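When Guile sees the meta switch \, it parses command-line
arguments from the script file according to the following rules:
• Each space character terminates an argument. This means that two
  spaces in a row introduce an argument "".
• The tab character is not permitted (unless you quote it with the
  backslash character, as described below), to avoid confusion.
• The newline character terminates the sequence of arguments, and will
  also terminate a final non-empty argument. (However, a newline
  following a space will not introduce a final empty-string argument;
  it only terminates the argument list.)
• The backslash character is the escape character. It escapes
  backslash, space, tab, and newline. The ANSI C escape sequences
  like \n and \t are also supported. These produce argument
  constituents; the two-character combination \n doesn’t act like a
  terminating newline. The escape sequence \NNN for exactly three octal
  digits reads as the character whose ASCII code is NNN. As above,
  characters produced this way are argument constituents. Backslash
  followed by other characters is not allowed.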
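The following sketch shows one way such a parser could be written in Scheme. It is not Guile’s actual implementation, and the name parse-meta-arguments is hypothetical. It assumes these rules: a space ends an argument, a newline ends the argument list, an unescaped tab is an error, and backslash escapes backslash, space, tab and newline as well as introducing \n, \t and \NNN (three octal digits).

```scheme
(define (octal-digit? c)
  (and (char>=? c #\0) (char<=? c #\7)))

;; Parse TEXT (the script's second line, including its newline) into a
;; list of argument strings.
(define (parse-meta-arguments text)
  (let loop ((chars (string->list text))
             (current '())     ; characters of the argument so far, reversed
             (args '()))       ; completed arguments, reversed
    (define (finish-argument)
      (cons (list->string (reverse current)) args))
    (define (end-of-arguments)
      ;; A newline right after a space does not add an empty argument.
      (reverse (if (null? current) args (finish-argument))))
    (if (null? chars)
        (end-of-arguments)
        (let ((c (car chars)) (rest (cdr chars)))
          (cond
           ((char=? c #\newline) (end-of-arguments))
           ((char=? c #\space) (loop rest '() (finish-argument)))
           ((char=? c #\tab) (error "unescaped tab in arguments"))
           ((char=? c #\\)
            (cond
             ((null? rest) (error "backslash at end of input"))
             ((memv (car rest) '(#\\ #\space #\tab #\newline))
              (loop (cdr rest) (cons (car rest) current) args))
             ((char=? (car rest) #\n)
              (loop (cdr rest) (cons #\newline current) args))
             ((char=? (car rest) #\t)
              (loop (cdr rest) (cons #\tab current) args))
             ((and (>= (length rest) 3)
                   (octal-digit? (car rest))
                   (octal-digit? (cadr rest))
                   (octal-digit? (caddr rest)))
              (loop (cdddr rest)
                    (cons (integer->char
                           (string->number
                            (string (car rest) (cadr rest) (caddr rest))
                            8))
                          current)
                    args))
             (else (error "unsupported escape" (car rest)))))
           (else (loop rest (cons c current) args)))))))

(parse-meta-arguments "-e main -s\n")   ; ⇒ ("-e" "main" "-s")
(parse-meta-arguments "a\\ b  c\n")     ; ⇒ ("a b" "" "c")
```

In the second call the escaped space becomes part of the first argument, and the two consecutive unescaped spaces produce an empty argument.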
The ability to accept and handle command line arguments is very important when writing Guile scripts to solve particular problems, such as extracting information from text files or interfacing with existing command line applications. This chapter describes how Guile makes command line arguments available to a Guile script, and the utilities that Guile provides to help with the processing of command line arguments.
When a Guile script is invoked, Guile makes the command line arguments
accessible via the procedure command-line
, which returns the
arguments as a list of strings.
For example, if the script
#! /usr/local/bin/guile -s
!#
(write (command-line))
(newline)
is saved in a file cmdline-test.scm and invoked using the command
line ./cmdline-test.scm bar.txt -o foo -frumple grob, the output is
("./cmdline-test.scm" "bar.txt" "-o" "foo" "-frumple" "grob")
If the script invocation includes a -e option, specifying a
procedure to call after loading the script, Guile will call that
procedure with (command-line) as its argument. So a script that
uses -e doesn’t need to refer explicitly to command-line
in its code. For example, the script above would have identical
behaviour if it was written instead like this:
#! /usr/local/bin/guile \
-e main -s
!#
(define (main args)
  (write args)
  (newline))
(Note the use of the meta switch \ so that the script invocation
can include more than one Guile option: see The Meta Switch.)
These scripts use the #! POSIX convention so that they can be
executed using their own file names directly, as in the example command
line ./cmdline-test.scm bar.txt -o foo -frumple grob. But they
can also be executed by typing out the implied Guile command line in
full, as in:
$ guile -s ./cmdline-test.scm bar.txt -o foo -frumple grob
or
$ guile -e main -s ./cmdline-test2.scm bar.txt -o foo -frumple grob
Even when a script is invoked using this longer form, the arguments that
the script receives are the same as if it had been invoked using the
short form. Guile ensures that the (command-line)
or -e
arguments are independent of how the script is invoked, by stripping off
the arguments that Guile itself processes.
A script is free to parse and handle its command line arguments in any
way that it chooses. Where the set of possible options and arguments is
complex, however, it can get tricky to extract all the options, check
the validity of given arguments, and so on. This task can be greatly
simplified by taking advantage of the module (ice-9 getopt-long),
which is distributed with Guile; see The (ice-9 getopt-long) Module.
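As a brief sketch of what this looks like, the following uses getopt-long and option-ref from (ice-9 getopt-long). The option names help and output, and the helper summarize-args, are assumptions chosen for illustration; a real script would define whatever options it needs.

```scheme
(use-modules (ice-9 getopt-long))

;; Hypothetical option specification: -h/--help takes no value,
;; -o/--output requires one.
(define option-spec
  '((help   (single-char #\h) (value #f))
    (output (single-char #\o) (value #t))))

;; ARGS has the same shape as the result of (command-line): the
;; program name followed by the arguments.
(define (summarize-args args)
  (let* ((options (getopt-long args option-spec))
         (help?   (option-ref options 'help #f))
         (output  (option-ref options 'output "-"))
         (files   (option-ref options '() '())))   ; non-option arguments
    (list help? output files)))

(summarize-args '("cmdline-test.scm" "-o" "foo" "bar.txt"))
;; ⇒ (#f "foo" ("bar.txt"))
```

In a script, summarize-args would typically be called on (command-line) or on the list passed to the -e entry point.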
To start with, here are some examples of invoking Guile directly:
guile -- a b c
Run Guile interactively; (command-line) will return
("/usr/local/bin/guile" "a" "b" "c").
guile -s /u/jimb/ex2 a b c
Load the file /u/jimb/ex2; (command-line) will return
("/u/jimb/ex2" "a" "b" "c").
guile -c '(write %load-path) (newline)'
Write the value of the variable %load-path, print a newline,
and exit.
guile -e main -s /u/jimb/ex4 foo
Load the file /u/jimb/ex4, and then call the function
main, passing it the list ("/u/jimb/ex4" "foo").
guile -e '(ex4)' -s /u/jimb/ex4.scm foo
Load the file /u/jimb/ex4.scm, and then call the function
main from the module ‘(ex4)’, passing it the list
("/u/jimb/ex4.scm" "foo").
guile -l first -ds -l last -s script
Load the files first, script, and last, in that
order. The -ds switch says when to process the -s
switch. For a more motivated example, see the scripts below.
Here is a very simple Guile script:
#!/usr/local/bin/guile -s
!#
(display "Hello, world!")
(newline)
The first line marks the file as a Guile script. When the user invokes
it, the system runs /usr/local/bin/guile to interpret the script,
passing -s, the script’s filename, and any arguments given to the
script as command-line arguments. When Guile sees -s script, it
loads script. Thus, running this program produces the output:
Hello, world!
Here is a script which prints the factorial of its argument:
#!/usr/local/bin/guile -s
!#
(define (fact n)
  (if (zero? n) 1
    (* n (fact (- n 1)))))

(display (fact (string->number (cadr (command-line)))))
(newline)
In action:
$ ./fact 5
120
$
However, suppose we want to use the definition of fact in this
file from another script. We can’t simply load the script file,
and then use fact’s definition, because the script will try to
compute and display a factorial when we load it. To avoid this problem,
we might write the script this way:
#!/usr/local/bin/guile \
-e main -s
!#
(define (fact n)
  (if (zero? n) 1
    (* n (fact (- n 1)))))

(define (main args)
  (display (fact (string->number (cadr args))))
  (newline))
This version packages the actions the script should perform in a
function, main. This allows us to load the file purely for its
definitions, without any extraneous computation taking place. Then we
use the meta switch \ and the entry point switch -e to
tell Guile to call main after loading the script.
$ ./fact 50
30414093201713378043612608166064768844377641568960512000000000000
Suppose that we now want to write a script which computes the
choose function: given a set of m distinct objects,
(choose n m) is the number of distinct subsets
containing n objects each. It’s easy to write choose given
fact, so we might write the script this way:
#!/usr/local/bin/guile \
-l fact -e main -s
!#
(define (choose n m)
  (/ (fact m) (* (fact (- m n)) (fact n))))

(define (main args)
  (let ((n (string->number (cadr args)))
        (m (string->number (caddr args))))
    (display (choose n m))
    (newline)))
The command-line arguments here tell Guile to first load the file
fact, and then run the script, with main as the entry
point. In other words, the choose script can use definitions
made in the fact script. Here are some sample runs:
$ ./choose 0 4
1
$ ./choose 1 4
4
$ ./choose 2 4
6
$ ./choose 3 4
4
$ ./choose 4 4
1
$ ./choose 50 100
100891344545564193334812497256
To call a specific procedure from a given module, we can use the special form (@ (module) procedure):
    #!/usr/local/bin/guile \
    -l fact -e (@ (fac) main) -s
    !#
    (define-module (fac)
      #:export (main))

    (define (choose n m)
      (/ (fact m) (* (fact (- m n)) (fact n))))

    (define (main args)
      (let ((n (string->number (cadr args)))
            (m (string->number (caddr args))))
        (display (choose n m))
        (newline)))
We can use @@ to invoke non-exported procedures. For exported procedures, we can simplify this call with the shorthand (module):
    #!/usr/local/bin/guile \
    -l fact -e (fac) -s
    !#
    (define-module (fac)
      #:export (main))

    (define (choose n m)
      (/ (fact m) (* (fact (- m n)) (fact n))))

    (define (main args)
      (let ((n (string->number (cadr args)))
            (m (string->number (caddr args))))
        (display (choose n m))
        (newline)))
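The difference between @ and @@ can also be seen outside of the -e switch. The sketch below defines a throwaway module and then refers to one exported and one private binding from a second module. The module names (demo shapes) and (demo client) and the procedures area and half are invented for this example.

```scheme
;; A throwaway module, defined only for this example.
(define-module (demo shapes)
  #:export (area))

(define (half x) (/ x 2))            ; private helper, not exported

(define (area base height)           ; exported: area of a triangle
  (* base (half height)))

;; Switch to a second made-up module so that the references below
;; really cross a module boundary.
(define-module (demo client))

;; @ looks the name up in the public interface of the module.
(define exported-result ((@ (demo shapes) area) 6 4))

;; @@ reaches into the module itself, so it also sees private bindings.
(define private-result ((@@ (demo shapes) half) 10))
```

Reaching for @@ is best kept for debugging and scripts like these, since private bindings are not part of a module's interface.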
For maximum portability, we can instead use the shell to execute guile with specified command line arguments. Here we need to take care to quote the command arguments correctly:
    #!/usr/bin/env sh
    exec guile -l fact -e '(@ (fac) main)' -s "$0" "$@"
    !#
    (define-module (fac)
      #:export (main))

    (define (choose n m)
      (/ (fact m) (* (fact (- m n)) (fact n))))

    (define (main args)
      (let ((n (string->number (cadr args)))
            (m (string->number (caddr args))))
        (display (choose n m))
        (newline)))
Finally, seasoned scripters are probably missing a mention of subprocesses. In Bash, for example, most shell scripts run other programs like sed or the like to do the actual work.
In Guile it’s often possible to get everything done within Guile itself, so do give that a try first. But if you just need to run a program and wait for it to finish, use system*. If you need to run a sub-program and capture its output, or give it input, use open-pipe. See Processes and Pipes for more information.
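As a rough illustration of both cases, the sketch below runs one program for its exit status and another for its output. It assumes a POSIX system where true and echo are available, and it uses open-input-pipe from (ice-9 popen), a read-only convenience wrapper around open-pipe. The helper first-line-of is a name made up for this example.

```scheme
(use-modules (ice-9 popen)    ; open-input-pipe, close-pipe
             (ice-9 rdelim))  ; read-line

;; Run a program and wait for it.  system* takes the program and its
;; arguments as separate strings (no shell is involved) and returns
;; a wait status; status:exit-val extracts the exit code.
(define exit-code
  (status:exit-val (system* "true")))

;; Run a shell command and capture the first line of its output.
(define (first-line-of command)
  (let* ((port (open-input-pipe command))
         (line (read-line port)))
    (close-pipe port)                 ; waits for the child to exit
    line))

(define greeting (first-line-of "echo hello"))
```

Closing the pipe with close-pipe, rather than close-port, lets Guile reap the child process.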
When you start up Guile by typing just guile, without a -c argument or the name of a script to execute, you get an interactive interpreter where you can enter Scheme expressions, and Guile will evaluate them and print the results for you. Here are some simple examples.
    scheme@(guile-user)> (+ 3 4 5)
    $1 = 12
    scheme@(guile-user)> (display "Hello world!\n")
    Hello world!
    scheme@(guile-user)> (values 'a 'b)
    $2 = a
    $3 = b
This mode of use is called a REPL, which is short for “Read-Eval-Print Loop”, because the Guile interpreter first reads the expression that you have typed, then evaluates it, and then prints the result.
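The read, eval, print cycle itself is small enough to sketch in a few lines. The following toy loop is only an illustration of the idea, not how Guile’s REPL is implemented: it has no prompt, no value history and no error handling. The name tiny-repl is invented for this example.

```scheme
;; Read every expression from PORT, evaluate it in the current
;; module, print each result, and return the results as a list.
(define (tiny-repl port)
  (let loop ((results '()))
    (let ((form (read port)))                              ; read
      (if (eof-object? form)
          (reverse results)
          (let ((value (eval form (interaction-environment)))) ; eval
            (write value)                                  ; print
            (newline)
            (loop (cons value results)))))))               ; loop

;; Feed the loop two expressions from a string instead of a terminal.
(define results
  (call-with-input-string "(+ 3 4 5) (string-length \"Hello\")"
                          tiny-repl))
```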
The prompt shows you what language and module you are in. In this case, the current language is scheme, and the current module is (guile-user). See Support for Other Languages, for more information on Guile’s support for languages other than Scheme.
When run interactively, Guile will load a local initialization file from ~/.guile. This file should contain Scheme expressions for evaluation.
This facility lets the user customize their interactive Guile environment, pulling in extra modules or parameterizing the REPL implementation.
To run Guile without loading the init file, use the -q command-line option.
To make it easier for you to repeat and vary previously entered expressions, or to edit the expression that you’re typing in, Guile can use the GNU Readline library. This is not enabled by default for licensing reasons, but all you need to activate Readline is the following pair of lines.
    scheme@(guile-user)> (use-modules (ice-9 readline))
    scheme@(guile-user)> (activate-readline)
It’s a good idea to put these two lines (without the scheme@(guile-user)> prompts) in your .guile file. See The Init File, ~/.guile, for more on .guile.
Just as Readline helps you to reuse a previous input line, value history allows you to use the result of a previous evaluation in a new expression. When value history is enabled, each evaluation result is automatically assigned to the next in the sequence of variables $1, $2, …. You can then use these variables in subsequent expressions.
    scheme@(guile-user)> (iota 10)
    $1 = (0 1 2 3 4 5 6 7 8 9)
    scheme@(guile-user)> (apply * (cdr $1))
    $2 = 362880
    scheme@(guile-user)> (sqrt $2)
    $3 = 602.3952191045344
    scheme@(guile-user)> (cons $2 $1)
    $4 = (362880 0 1 2 3 4 5 6 7 8 9)
Value history is enabled by default, because Guile’s REPL imports the (ice-9 history) module. Value history may be turned off or on within the REPL, using the options interface:
    scheme@(guile-user)> ,option value-history #f
    scheme@(guile-user)> 'foo
    foo
    scheme@(guile-user)> ,option value-history #t
    scheme@(guile-user)> 'bar
    $5 = bar
Note that previously recorded values are still accessible, even if value history is off. In rare cases, these references to past computations can cause Guile to use too much memory. One may clear these values, possibly enabling garbage collection, via the clear-value-history! procedure, described below.
The programmatic interface to value history is in a module:
    (use-modules (ice-9 history))
Scheme Procedure: value-history-enabled?
Return true if value history is enabled, or false otherwise.
Scheme Procedure: enable-value-history!
Turn on value history, if it was off.
Scheme Procedure: disable-value-history!
Turn off value history, if it was on.
Scheme Procedure: clear-value-history!
Clear the value history. If the stored values are not captured by some other data structure or closure, they may then be reclaimed by the garbage collector.
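A short sketch of the programmatic interface follows. It assumes that (ice-9 history) exports value-history-enabled?, enable-value-history! and disable-value-history! alongside clear-value-history!, and it treats the result of value-history-enabled? only as true or false, since the exact true value is not specified here. The variables off? and on? are names made up for this example.

```scheme
(use-modules (ice-9 history))

;; Turn value history off and check that it is reported as disabled.
(disable-value-history!)
(define off? (not (value-history-enabled?)))

;; Turn it back on; normalize the (unspecified) true value to #t.
(enable-value-history!)
(define on? (and (value-history-enabled?) #t))

;; Drop the stored $N values so the collector may reclaim them.
(clear-value-history!)
```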
The REPL exists to read expressions, evaluate them, and then print their results. But sometimes one wants to tell the REPL to evaluate an expression in a different way, or to do something else altogether. A user can affect the way the REPL works with a REPL command.
The previous section had an example of a command, in the form of ,option.
    scheme@(guile-user)> ,option value-history #t
Commands are distinguished from expressions by their initial comma (‘,’). Since a comma cannot begin an expression in most languages, it is an effective indicator to the REPL that the following text forms a command, not an expression.
REPL commands are convenient because they are always there. Even if the current module doesn’t have a binding for pretty-print, one can always ,pretty-print.
The following sections document the various commands, grouped together by functionality. Many of the commands have abbreviations; see the online help (,help) for more information.
When Guile starts interactively, it notifies the user that help can be had by typing ‘,help’. Indeed, help is a command, and a particularly useful one, as it allows the user to discover the rest of the commands.
REPL Command: help [all | group | [-c] command]
Show help.
With one argument, tries to look up the argument as a group name, giving help on that group if successful. Otherwise tries to look up the argument as a command, giving help on the command.
If there is a command whose name is also a group name, use the ‘-c command’ form to give help on the command instead of the group.
Without any argument, a list of help commands and command groups is displayed.
REPL Command: show [topic]
Gives information about Guile.
With one argument, tries to show a particular piece of information; currently supported topics are ‘warranty’ (or ‘w’), ‘copying’ (or ‘c’), and ‘version’ (or ‘v’).
Without any argument, a list of topics is displayed.
REPL Command: apropos regexp
Find bindings/modules/packages.
REPL Command: describe obj
Show description/documentation.
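REPL Command: module [module]
Change modules / Show current module.
REPL Command: import module …
Import modules / List those imported.
REPL Command: load file
Load a file in the current module.
REPL Command: reload [module]
Reload the given module, or the current module if none was given.
REPL Command: binding
List current bindings.
REPL Command: in module expression
REPL Command: in module command arg …
Evaluate an expression, or alternatively, execute another meta-command in the context of a module. For example, ‘,in (foo bar) ,binding’ will show the bindings in the module (foo bar).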
REPL Command: language language
Change languages.
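REPL Command: compile exp
Generate compiled code.
REPL Command: compile-file file
Compile a file.
REPL Command: expand exp
Expand any macros in a form.
REPL Command: optimize exp
Run the optimizer on a piece of code and print the result.
REPL Command: disassemble exp
Disassemble a compiled procedure.
REPL Command: disassemble-file file
Disassemble a file.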
REPL Command: time exp
Time execution.
REPL Command: profile exp
Profile execution of an expression. This command compiles exp and then runs it within the statprof profiler, passing all keyword options to the statprof procedure. For more on statprof and on the options available to this command, see Statprof.
REPL Command: trace exp [#:width w] [#:max-indent i]
Trace execution.
By default, the trace will limit its width to the width of your terminal, or width if specified. Nested procedure invocations will be printed farther to the right, though if the width of the indentation passes the max-indent, the indentation is abbreviated.
These debugging commands are only available within a recursive REPL; they do not work at the top level.
REPL Command: backtrace [count] [#:width w] [#:full? f]
Print a backtrace.
Print a backtrace of all stack frames, or innermost count frames. If count is negative, the last count frames will be shown.
REPL Command: up [count]
Select a calling stack frame.
Select and print stack frames that called this one. An argument says how many frames up to go.
REPL Command: down [count]
Select a called stack frame.
Select and print stack frames called by this one. An argument says how many frames down to go.
REPL Command: frame [idx]
Show a frame.
Show the selected frame. With an argument, select a frame by index, then show it.
REPL Command: locals
Show local variables.
Show locally-bound variables in the selected frame.
REPL Command: error-message
Show error message.
Display the message associated with the error that started the current debugging REPL.
REPL Command: registers
Show the VM registers associated with the current frame.
See Stack Layout, for more information on VM stack frames.
REPL Command: width [cols]
Sets the number of display columns in the output of ,backtrace and ,locals to cols. If cols is not given, the width of the terminal is used.
The next 3 commands work at any REPL.
REPL Command: break proc
Set a breakpoint at proc.
REPL Command: break-at-source file line
Set a breakpoint at the given source location.
REPL Command: tracepoint proc
Set a tracepoint on the given procedure. This will cause all calls to the procedure to print out a tracing message. See Tracing Traps, for more information.
The rest of the commands in this subsection all apply only when the stack is continuable — in other words when it makes sense for the program that the stack comes from to continue running. Usually this means that the program stopped because of a trap or a breakpoint.
REPL Command: step
Tell the debugged program to step to the next source location.
REPL Command: next
Tell the debugged program to step to the next source location in the same frame. (See Traps for the details of how this works.)
REPL Command: finish
Tell the program being debugged to continue running until the completion of the current stack frame, and at that time to print the result and reenter the REPL.
REPL Command: inspect exp
Inspect the result(s) of evaluating exp.
REPL Command: pretty-print exp
Pretty-print the result(s) of evaluating exp.
REPL Command: gc
Garbage collection.
REPL Command: statistics
Display statistics.
REPL Command: option [name] [exp]
With no arguments, lists all options. With one argument, shows the current value of the name option. With two arguments, sets the name option to the result of evaluating the Scheme expression exp.
REPL Command: quit
Quit this session.
Current REPL options include:
compile-options
The options used when compiling expressions entered at the REPL. See Compiling Scheme Code, for more on compilation options.
interp
Whether to interpret or compile expressions given at the REPL, if such a choice is available. Off by default (indicating compilation).
prompt
A customized REPL prompt. #f by default, indicating the default prompt.
print
A procedure of two arguments used to print the result of evaluating each expression. The arguments are the current REPL and the value to print. By default, #f, to use the default procedure.
value-history
Whether value history is on or not. See Value History.
on-error
What to do when an error happens. By default, debug, meaning to enter the debugger. Other values include backtrace, to show a backtrace without entering the debugger, or report, to simply show a short error printout.
Default values for REPL options may be set using repl-default-option-set! from (system repl common):
Scheme Procedure: repl-default-option-set! key value
Set the default value of a REPL option. This function is particularly useful in a user’s init file. See The Init File, ~/.guile.
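For instance, a ~/.guile file might change the default error behavior as sketched below. This assumes that repl-default-option-set! takes the option name as a symbol followed by the new value, and that the on-error option accepts the symbol backtrace as described in the option list.

```scheme
;; Possible contents of ~/.guile (a sketch, not a recommendation).
(use-modules (system repl common))

;; Show a backtrace when an error happens, instead of entering the
;; debugger.  Applies to REPLs started after this point.
(repl-default-option-set! 'on-error 'backtrace)
```

The setting can still be overridden for a single session with ‘,option on-error debug’.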
When code being evaluated from the REPL hits an error, Guile enters a new prompt, allowing you to inspect the context of the error.
    scheme@(guile-user)> (map string-append '("a" "b") '("c" #\d))
    ERROR: In procedure string-append:
    ERROR: Wrong type (expecting string): #\d
    Entering a new prompt.  Type `,bt' for a backtrace or `,q' to continue.
    scheme@(guile-user) [1]>
The new prompt runs inside the old one, in the dynamic context of the error. It is a recursive REPL, augmented with a reified representation of the stack, ready for debugging.
,backtrace (abbreviated ,bt) displays the Scheme call stack at the point where the error occurred:
    scheme@(guile-user) [1]> ,bt
               1 (map #<procedure string-append _> ("a" "b") ("c" #\d))
               0 (string-append "b" #\d)
In the above example, the backtrace doesn’t have much source information, as map and string-append are both primitives. But in the general case, the space on the left of the backtrace indicates the line and column in which a given procedure calls another.
You can exit a recursive REPL in the same way that you exit any REPL: via ‘(quit)’, ‘,quit’ (abbreviated ‘,q’), or C-d, among other options.
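A recursive debugging REPL exposes a number of other meta-commands that inspect the state of the computation at the time of the error. These commands allow you to display the call stack, move up and down between its frames, and examine the local variables of each frame.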
See Debug Commands, for documentation of the individual commands. This section aims to give more of a walkthrough of a typical debugging session.
First, we’re going to need a good error. Let’s try to macroexpand the expression (unquote foo), outside of a quasiquote form, and see how the macroexpander reports this error.
    scheme@(guile-user)> (macroexpand '(unquote foo))
    ERROR: In procedure macroexpand:
    ERROR: unquote: expression not valid outside of quasiquote in (unquote foo)
    Entering a new prompt.  Type `,bt' for a backtrace or `,q' to continue.
    scheme@(guile-user) [1]>
The backtrace command, which can also be invoked as bt, displays the call stack (aka backtrace) at the point where the debugger was entered:
    scheme@(guile-user) [1]> ,bt
    In ice-9/psyntax.scm:
      1130:21  3 (chi-top (unquote foo) () ((top)) e (eval) (hygiene #))
      1071:30  2 (syntax-type (unquote foo) () ((top)) #f #f (# #) #f)
      1368:28  1 (chi-macro #<procedure de9360 at ice-9/psyntax.scm...> ...)
    In unknown file:
               0 (scm-error syntax-error macroexpand "~a: ~a in ~a" # #f)
A call stack consists of a sequence of stack frames, with each frame describing one procedure which is waiting to do something with the values returned by another. Here we see that there are four frames on the stack.
Note that macroexpand is not on the stack: it must have made a tail call to chi-top, as indeed we would find if we searched ice-9/psyntax.scm for its definition.
When you enter the debugger, the innermost frame is selected, which means that the commands for getting information about the “current” frame, or for evaluating expressions in the context of the current frame, will do so by default with respect to the innermost frame. To select a different frame, so that these operations will apply to it instead, use the up, down and frame commands like this:
    scheme@(guile-user) [1]> ,up
    In ice-9/psyntax.scm:
      1368:28  1 (chi-macro #<procedure de9360 at ice-9/psyntax.scm...> ...)
    scheme@(guile-user) [1]> ,frame 3
    In ice-9/psyntax.scm:
      1130:21  3 (chi-top (unquote foo) () ((top)) e (eval) (hygiene #))
    scheme@(guile-user) [1]> ,down
    In ice-9/psyntax.scm:
      1071:30  2 (syntax-type (unquote foo) () ((top)) #f #f (# #) #f)
Perhaps we’re interested in what’s going on in frame 2, so we take a look at its local variables:
    scheme@(guile-user) [1]> ,locals
      Local variables:
      $1 = e = (unquote foo)
      $2 = r = ()
      $3 = w = ((top))
      $4 = s = #f
      $5 = rib = #f
      $6 = mod = (hygiene guile-user)
      $7 = for-car? = #f
      $8 = first = unquote
      $9 = ftype = macro
      $10 = fval = #<procedure de9360 at ice-9/psyntax.scm:2817:2 (x)>
      $11 = fe = unquote
      $12 = fw = ((top))
      $13 = fs = #f
      $14 = fmod = (hygiene guile-user)
All of the values are accessible by their value-history names ($n):
    scheme@(guile-user) [1]> $10
    $15 = #<procedure de9360 at ice-9/psyntax.scm:2817:2 (x)>
We can even invoke the procedure at the REPL directly:
    scheme@(guile-user) [1]> ($10 'not-going-to-work)
    ERROR: In procedure macroexpand:
    ERROR: source expression failed to match any pattern in not-going-to-work
    Entering a new prompt.  Type `,bt' for a backtrace or `,q' to continue.
Well at this point we’ve caused an error within an error. Let’s just quit back to the top level:
    scheme@(guile-user) [2]> ,q
    scheme@(guile-user) [1]> ,q
    scheme@(guile-user)>
Finally, as a word to the wise: hackers close their REPL prompts with C-d.
Any text editor can edit Scheme, but some are better than others. Emacs is the best, of course, and not just because it is a fine text editor. Emacs has good support for Scheme out of the box, with sensible indentation rules, parenthesis-matching, syntax highlighting, and even a set of keybindings for structural editing, allowing navigation, cut-and-paste, and transposition operations that work on balanced S-expressions.
As good as it is, though, two things will vastly improve your experience with Emacs and Guile.
The first is Taylor Campbell’s Paredit. You should not code in any dialect of Lisp without Paredit. (They say that unopinionated writing is boring—hence this tone—but it’s the truth, regardless.) Paredit is the bee’s knees.
The second is José Antonio Ortega Ruiz’s Geiser. Geiser complements Emacs’ scheme-mode with tight integration to running Guile processes via a comint-mode REPL buffer.
Of course there are keybindings to switch to the REPL, and a good REPL environment, but Geiser goes beyond that, providing:
See Geiser’s web page at http://www.nongnu.org/geiser/, for more information.
Guile also comes with a growing number of command-line utilities: a compiler, a disassembler, some module inspectors, and in the future, a system to install Guile packages from the internet. These tools may be invoked using the guild program.
    $ guild compile -o foo.go foo.scm
    wrote `foo.go'
This program used to be called guile-tools up to Guile version 2.0.1, and for backward compatibility it still may be called as such. However we changed the name to guild, not only because it is pleasantly shorter and easier to read, but also because this tool will serve to bind Guile wizards together, by allowing hackers to share code with each other using a CPAN-like system.
See Compiling Scheme Code, for more on guild compile.
A complete list of guild scripts can be had by invoking guild list, or simply guild.
At some point, you will probably want to share your code with other people. To do so effectively, it is important to follow a set of common conventions, to make it easy for the user to install and use your package.
The first thing to do is to install your Scheme files where Guile can find them. When Guile goes to find a Scheme file, it will search a load path to find the file: first in Guile’s own path, then in paths for site packages. A site package is any Scheme code that is installed and not part of Guile itself. See Load Paths, for more on load paths.
There are several site paths, for historical reasons, but the one that should generally be used can be obtained by invoking the %site-dir procedure. See Configuration, Build and Installation. If Guile 2.2 is installed on your system in /usr/, then (%site-dir) will be /usr/share/guile/site/2.2. Scheme files should be installed there.
If you do not install compiled .go files, Guile will compile your modules and programs when they are first used, and cache them in the user’s home directory. See Compiling Scheme Code, for more on auto-compilation. However, it is better to compile the files before they are installed, and to just copy the files to a place where Guile can find them.
As with Scheme files, Guile searches a path to find compiled .go files, the %load-compiled-path. By default, this path has two entries: a path for Guile’s files, and a path for site packages. You should install your .go files into the latter directory, whose value is returned by invoking the %site-ccache-dir procedure. As in the previous example, if Guile 2.2 is installed on your system in /usr/, then (%site-ccache-dir) will be /usr/lib/guile/2.2/site-ccache.
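To see what these directories are on a particular installation, the procedures and variables named above can simply be printed. The helper show is a name invented for this sketch; the values printed depend on where Guile was installed.

```scheme
;; Hypothetical helper: print a label followed by a value.
(define (show label value)
  (display label)
  (display ": ")
  (write value)
  (newline))

(show "site dir for .scm files" (%site-dir))
(show "site dir for .go files" (%site-ccache-dir))
(show "source load path" %load-path)
(show "compiled load path" %load-compiled-path)
```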
Note that a .go file will only be loaded in preference to a .scm file if it is newer. For that reason, you should install your Scheme files first, and your compiled files second. See Load Paths, for more on the loading process.
Finally, although this section is only about Scheme, sometimes you need to install C extensions too. Shared libraries should be installed in the extensions dir. This value can be had from the build config (see Configuration, Build and Installation). Again, if Guile 2.2 is installed on your system in /usr/, then the extensions dir will be /usr/lib/guile/2.2/extensions.
This part of the manual explains the general concepts that you need to understand when interfacing to Guile from C. You will learn about how the latent typing of Scheme is embedded into the static typing of C, how the garbage collection of Guile is made available to C code, and how continuations influence the control flow in a C program.
This knowledge should make it straightforward to add new functions to Guile that can be called from Scheme. Adding new data types is also possible and is done by defining foreign objects.
The An Overview of Guile Programming section of this part contains general musings and guidelines about programming with Guile. It explores different ways to design a program around Guile, or how to embed Guile into existing programs.
For a pedagogical yet detailed explanation of how the data representation of Guile is implemented, see Data Representation. You don’t need to know the details given there to use Guile from C, but they are useful when you want to modify Guile itself or when you are just curious about how it is all done.
For detailed reference information on the variables, functions etc. that make up Guile’s application programming interface (API), see API Reference.
Guile provides strong API and ABI stability guarantees during stable series, so that if a user writes a program against Guile version 2.2.3, it will be compatible with some future version 2.2.7. We say in this case that 2.2 is the effective version, composed of the major and minor versions, in this case 2 and 2.
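The relationship between the full version and the effective version can be checked from Scheme. The sketch below uses the version procedures from Guile’s core; reconstructed is a variable name made up for this example, and the values printed depend on the Guile being run.

```scheme
;; Build the effective version by hand from its two components.
;; major-version and minor-version return strings.
(define reconstructed
  (string-append (major-version) "." (minor-version)))

(display "full version:      ") (display (version)) (newline)
(display "effective version: ") (display (effective-version)) (newline)
(display "reconstructed:     ") (display reconstructed) (newline)
```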
Users may install multiple effective versions of Guile, with each version’s headers, libraries, and Scheme files under their own directories. This provides the necessary stability guarantee for users, while also allowing Guile developers to evolve the language and its implementation.
However, parallel installability does have a down-side, in that users need to know which version of Guile to ask for, when they build against Guile. Guile solves this problem by installing a file to be read by the pkg-config utility, a tool to query installed packages by name. Guile encodes the version into its pkg-config name, so that users can ask for guile-2.0 or guile-2.2, as appropriate.
For effective version 2.2, for example, you would invoke pkg-config --cflags --libs guile-2.2 to get the compilation and linking flags necessary to link to version 2.2 of Guile. You would typically run pkg-config during the configuration phase of your program and use the obtained information in the Makefile.
Guile’s pkg-config file, guile-2.2.pc, defines additional useful variables:
sitedir
The default directory where Guile looks for Scheme source and compiled files (see %site-dir). Run pkg-config guile-2.2 --variable=sitedir to see its value. See GUILE_SITE_DIR, for more on how to use it from Autoconf.
extensiondir
The default directory where Guile looks for extensions, i.e., shared libraries providing additional features (see Modules and Extensions). Run pkg-config guile-2.2 --variable=extensiondir to see its value.
guile
guild
The absolute file name of the guile and guild commands. Run pkg-config guile-2.2 --variable=guile or --variable=guild to see their value.
These variables allow users to deal with program name transformations that may be specified when configuring Guile with --program-transform-name, --program-suffix, or --program-prefix (see Transformation Options in GNU Autoconf Manual).
See the pkg-config man page, for more information, or its web site, http://pkg-config.freedesktop.org/.
See Autoconf Support, for more on checking for Guile from within a configure.ac file.
This section covers the mechanics of linking your program with Guile on a typical POSIX system.
The header file <libguile.h> provides declarations for all of Guile’s functions and constants. You should #include it at the head of any C source file that uses identifiers described in this manual. Once you’ve compiled your source files, you need to link them against the Guile object code library, libguile.
As noted in the previous section, <libguile.h> is not in the default search path for headers. The following command lines give respectively the C compilation and link flags needed to build programs using Guile 2.2:
    pkg-config guile-2.2 --cflags
    pkg-config guile-2.2 --libs
To initialize Guile, you can use one of several functions. The first, scm_with_guile, is the most portable way to initialize Guile. It will initialize Guile when necessary and then call a function that you can specify. Multiple threads can call scm_with_guile concurrently and it can also be called more than once in a given thread. The global state of Guile will survive from one call of scm_with_guile to the next. Your function is called from within scm_with_guile since the garbage collector of Guile needs to know where the stack of each thread is.
A second function, scm_init_guile, initializes Guile for the current thread. When it returns, you can use the Guile API in the current thread. This function employs some non-portable magic to learn about stack bounds and might thus not be available on all platforms.
One common way to use Guile is to write a set of C functions which perform some useful task, make them callable from Scheme, and then link the program with Guile. This yields a Scheme interpreter just like guile, but augmented with extra functions for some specific application: a special-purpose scripting language.
In this situation, the application should probably process its command-line arguments in the same manner as the stock Guile interpreter. To make that straightforward, Guile provides the scm_boot_guile and scm_shell functions.
For more about these functions, see Initializing Guile.
Here is simple-guile.c, source code for a main and an inner_main function that will produce a complete Guile interpreter.
    /* simple-guile.c --- Start Guile from C.  */

    #include <libguile.h>

    static void
    inner_main (void *closure, int argc, char **argv)
    {
      /* preparation */
      scm_shell (argc, argv);
      /* after exit */
    }

    int
    main (int argc, char **argv)
    {
      scm_boot_guile (argc, argv, inner_main, 0);
      return 0; /* never reached, see inner_main */
    }
The main function calls scm_boot_guile to initialize Guile, passing it inner_main. Once scm_boot_guile is ready, it invokes inner_main, which calls scm_shell to process the command-line arguments in the usual way.
Here is a Makefile which you can use to compile the example program. It uses pkg-config to learn about the necessary compiler and linker flags.
    # Use GCC, if you have it installed.
    CC=gcc

    # Tell the C compiler where to find <libguile.h>
    CFLAGS=`pkg-config --cflags guile-2.2`

    # Tell the linker what libraries to use and where to find them.
    LIBS=`pkg-config --libs guile-2.2`

    simple-guile: simple-guile.o
    	${CC} simple-guile.o ${LIBS} -o simple-guile

    simple-guile.o: simple-guile.c
    	${CC} -c ${CFLAGS} simple-guile.c
If you are using the GNU Autoconf package to make your application more portable, Autoconf will settle many of the details in the Makefile automatically, making it much simpler and more portable; we recommend using Autoconf with Guile. Here is a configure.ac file for simple-guile that uses the standard PKG_CHECK_MODULES macro to check for Guile. Autoconf will process this file into a configure script. We recommend invoking Autoconf via the autoreconf utility.
    AC_INIT(simple-guile.c)

    # Find a C compiler.
    AC_PROG_CC

    # Check for Guile
    PKG_CHECK_MODULES([GUILE], [guile-2.2])

    # Generate a Makefile, based on the results.
    AC_OUTPUT(Makefile)
Run autoreconf -vif to generate configure.
Here is a Makefile.in template, from which the configure script produces a Makefile customized for the host system:
    # The configure script fills in these values.
    CC=@CC@
    CFLAGS=@GUILE_CFLAGS@
    LIBS=@GUILE_LIBS@

    simple-guile: simple-guile.o
    	${CC} simple-guile.o ${LIBS} -o simple-guile
    simple-guile.o: simple-guile.c
    	${CC} -c ${CFLAGS} simple-guile.c
The developer should use Autoconf to generate the configure script from the configure.ac template, and distribute configure with the application. Here’s how a user might go about building the application:
    $ ls
    Makefile.in     configure*      configure.ac    simple-guile.c
    $ ./configure
    checking for gcc... ccache gcc
    checking whether the C compiler works... yes
    checking for C compiler default output file name... a.out
    checking for suffix of executables...
    checking whether we are cross compiling... no
    checking for suffix of object files... o
    checking whether we are using the GNU C compiler... yes
    checking whether ccache gcc accepts -g... yes
    checking for ccache gcc option to accept ISO C89... none needed
    checking for pkg-config... /usr/bin/pkg-config
    checking pkg-config is at least version 0.9.0... yes
    checking for GUILE... yes
    configure: creating ./config.status
    config.status: creating Makefile
    $ make
    [...]
    $ ./simple-guile
    guile> (+ 1 2 3)
    6
    guile> (getpwnam "jimb")
    #("jimb" "83Z7d75W2tyJQ" 4008 10 "Jim Blandy" "/u/jimb"
      "/usr/local/bin/bash")
    guile> (exit)
    $
The previous section has briefly explained how to write programs that make use of an embedded Guile interpreter. But sometimes, all you want to do is make new primitive procedures and data types available to the Scheme programmer. Writing a new version of guile is inconvenient in this case and it would in fact make the life of the users of your new features needlessly hard.
For example, suppose that there is a program guile-db that is a version of Guile with additional features for accessing a database. People who want to write Scheme programs that use these features would have to use guile-db instead of the usual guile program.
Now suppose that there is also a program guile-gtk that extends
Guile with access to the popular Gtk+ toolkit for graphical user
interfaces. People who want to write GUIs in Scheme would have to use
guile-gtk. Now, what happens when you want to write a Scheme
application that uses a GUI to let the user access a database? You
would have to write a third program that incorporates both the
database stuff and the GUI stuff. This might not be easy (because
guile-gtk might be a quite obscure program, say), and taking this
example further makes it easy to see that this approach cannot work in
practice.
It would have been much better if both the database features and the GUI
features had been provided as libraries that can just be linked with
guile. Guile makes it easy to do just this, and we encourage you
to make your extensions to Guile available as libraries whenever
possible.
You write the new primitive procedures and data types in the normal fashion, and link them into a shared library instead of into a stand-alone program. The shared library can then be loaded dynamically by Guile.
This section explains how to make the Bessel functions of the C library
available to Scheme. First we need to write the appropriate glue code
to convert the arguments and return values of the functions from Scheme
to C and back. Additionally, we need a function that will add them to
the set of Guile primitives. Because this is just an example, we will
only implement this for the j0
function.
Consider the following file bessel.c.
    #include <math.h>
    #include <libguile.h>

    SCM
    j0_wrapper (SCM x)
    {
      return scm_from_double (j0 (scm_to_double (x)));
    }

    void
    init_bessel ()
    {
      scm_c_define_gsubr ("j0", 1, 0, 0, j0_wrapper);
    }
This C source file needs to be compiled into a shared library. Here is how to do it on GNU/Linux:
    gcc `pkg-config --cflags guile-2.2` \
      -shared -o libguile-bessel.so -fPIC bessel.c
For creating shared libraries portably, we recommend the use of GNU Libtool (see Introduction in GNU Libtool).
A shared library can be loaded into a running Guile process with the
function load-extension. In addition to the name of the
library to load, this function also expects the name of a function from
that library that will be called to initialize it. For our example,
we are going to call the function init_bessel which will make
j0_wrapper available to Scheme programs with the name j0. Note that
we do not specify a filename extension such as .so when invoking
load-extension. The right extension for the host platform will be
provided automatically.
    (load-extension "libguile-bessel" "init_bessel")
    (j0 2)
    ⇒ 0.223890779141236
For this to work, load-extension
must be able to find
libguile-bessel, of course. It will look in the places that
are usual for your operating system, and it will additionally look
into the directories listed in the LTDL_LIBRARY_PATH
environment variable.
To see how these Guile extensions via shared libraries relate to the module system, see Putting Extensions into Modules.
When you want to embed the Guile Scheme interpreter into your program or library, you need to link it against the libguile library (see Linking Programs With Guile). Once you have done this, your C code has access to a number of data types and functions that can be used to invoke the interpreter, or make new functions that you have written in C available to be called from Scheme code, among other things.
Scheme is different from C in a number of significant ways, and Guile tries to make the advantages of Scheme available to C as well. Thus, in addition to a Scheme interpreter, libguile also offers dynamic types, garbage collection, continuations, arithmetic on arbitrary sized numbers, and other things.
The two fundamental concepts are dynamic types and garbage collection. You need to understand how libguile offers them to C programs in order to use the rest of libguile. Also, the more general control flow of Scheme caused by continuations needs to be dealt with.
Asynchronous signal handlers and multi-threading are already familiar to C programmers, but a few additional rules apply when using them together with libguile.
Scheme is a dynamically-typed language; this means that the system cannot, in general, determine the type of a given expression at compile time. Types only become apparent at run time. Variables do not have fixed types; a variable may hold a pair at one point, an integer at the next, and a thousand-element vector later. Instead, values, not variables, have fixed types.
In order to implement standard Scheme functions like pair?
and
string?
and provide garbage collection, the representation of
every value must contain enough information to accurately determine its
type at run time. Often, Scheme systems also use this information to
determine whether a program has attempted to apply an operation to an
inappropriately typed value (such as taking the car
of a string).
Because variables, pairs, and vectors may hold values of any type, Scheme implementations use a uniform representation for values — a single type large enough to hold either a complete value or a pointer to a complete value, along with the necessary typing information.
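As a rough illustration of such a uniform representation, the following self-contained C sketch packs either a small integer or a pointer into one machine word, using the low bit as a type tag. The names (value_t, make_fixnum and so on) are hypothetical and the encoding is deliberately simplified; it is an assumption made for this example, not Guile's actual encoding, which is described in Data Representation.

```c
#include <stdint.h>

/* Hypothetical uniform value type: one machine word.  This is an
   illustrative encoding only, not the one Guile uses.  */
typedef uintptr_t value_t;

/* Low bit set: the word holds a small integer ("fixnum") directly.
   Low bit clear: the word holds a pointer to a heap object, which is
   assumed to be at least 2-byte aligned.  */
#define FIXNUM_TAG ((uintptr_t) 1)

/* Encode the integer N directly in the word.  */
value_t
make_fixnum (intptr_t n)
{
  return ((uintptr_t) n << 1) | FIXNUM_TAG;
}

/* Encode a pointer to a (suitably aligned) heap object.  */
value_t
make_pointer (void *p)
{
  return (uintptr_t) p;
}

/* The type test needs nothing but the word itself.  */
int
is_fixnum (value_t v)
{
  return (v & FIXNUM_TAG) != 0;
}

/* Recover the integer: clear the tag bit, then undo the shift.  */
intptr_t
fixnum_value (value_t v)
{
  return (intptr_t) (v - FIXNUM_TAG) / 2;
}

/* Recover the pointer; only meaningful when is_fixnum (v) is false.  */
void *
pointer_value (value_t v)
{
  return (void *) v;
}
```

Because the tag travels with the value, a predicate in the style of pair? or string? can be answered at run time without any information from the variable that holds the value.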
In Guile, this uniform representation of all Scheme values is the C type
SCM. This is an opaque type and its size is typically equivalent
to that of a pointer to void. Thus, SCM values can be
passed around efficiently and they take up reasonably little storage on
their own.
The most important rule is: You never access a SCM value
directly; you only pass it to functions or macros defined in libguile.
As an obvious example, although a SCM variable can contain
integers, you cannot, of course, compute the sum of two SCM values
by adding them with the C + operator. You must use the libguile
function scm_sum.
Less obvious and therefore more important to keep in mind is that you
also cannot directly test SCM
values for trueness. In Scheme,
the value #f
is considered false and of course a SCM
variable can represent that value. But there is no guarantee that the
SCM
representation of #f
looks false to C code as well.
You need to use scm_is_true
or scm_is_false
to test a
SCM
value for trueness or falseness, respectively.
You also cannot directly compare two SCM
values to find out
whether they are identical (that is, whether they are eq?
in
Scheme terms). You need to use scm_is_eq
for this.
The one exception is that you can directly assign a SCM
value to
a SCM
variable by using the C =
operator.
The following (contrived) example shows how to do it right. It implements a function of two arguments (a and flag) that returns a+1 if flag is true, else it returns a unchanged.
    SCM
    my_incrementing_function (SCM a, SCM flag)
    {
      SCM result;

      if (scm_is_true (flag))
        result = scm_sum (a, scm_from_int (1));
      else
        result = a;

      return result;
    }
Often, you need to convert between SCM
values and appropriate C
values. For example, we needed to convert the integer 1
to its
SCM
representation in order to add it to a. Libguile
provides many functions to do these conversions, both from C to
SCM
and from SCM
to C.
The conversion functions follow a common naming pattern: those that make
a SCM value from a C value have names of the form
scm_from_type (…) and those that convert a SCM
value to a C value use the form scm_to_type (…).
However, it is best to avoid converting values when you can. When you
must combine C values and SCM
values in a computation, it is
often better to convert the C values to SCM
values and do the
computation by using libguile functions than to do it the other way around
(converting SCM
to C and doing the computation some other way).
As a simple example, consider this version of
my_incrementing_function
from above:
    SCM
    my_other_incrementing_function (SCM a, SCM flag)
    {
      int result;

      if (scm_is_true (flag))
        result = scm_to_int (a) + 1;
      else
        result = scm_to_int (a);

      return scm_from_int (result);
    }
This version is much less general than the original one: it will only
work for values of a that fit into an int. The original
function will work for all values that Guile can represent and that
scm_sum can understand, including integers bigger than long long,
floating point numbers, complex numbers, and new numerical types
that have been added to Guile by third-party libraries.
Also, computing with SCM
is not necessarily inefficient. Small
integers will be encoded directly in the SCM
value, for example,
and do not need any additional memory on the heap. See Data Representation to find out the details.
Some special SCM
values are available to C code without needing
to convert them from C values:
Scheme value | C representation |
#f | SCM_BOOL_F |
#t | SCM_BOOL_T |
() | SCM_EOL |
In addition to SCM, Guile also defines the related type
scm_t_bits. This is an unsigned integral type of sufficient
size to hold all information that is directly contained in a
SCM value. The scm_t_bits type is used internally by
Guile to do all the bit twiddling explained in Data Representation, but
you will encounter it occasionally in low-level user code as well.
As explained above, the SCM
type can represent all Scheme values.
Some values fit entirely into a SCM
value (such as small
integers), but other values require additional storage in the heap (such
as strings and vectors). This additional storage is managed
automatically by Guile. You don’t need to explicitly deallocate it
when a SCM
value is no longer used.
Two things must be guaranteed so that Guile is able to manage the storage automatically: it must know about all blocks of memory that have ever been allocated for Scheme values, and it must know about all Scheme values that are still being used. Given this knowledge, Guile can periodically free all blocks that have been allocated but are not used by any active Scheme values. This activity is called garbage collection.
Guile’s garbage collector will automatically discover references to
SCM objects that originate in global variables, static data
sections, function arguments or local variables on the C and Scheme
stacks, and values in machine registers. Other references to SCM
objects, such as those in other random data structures in the C heap
that contain fields of type SCM, can be made visible to the
garbage collector by calling the functions scm_gc_protect_object
or scm_permanent_object. Collectively, these values form the “root
set” of garbage collection; any value on the heap that is referenced
directly or indirectly by a member of the root set is preserved, and all
other objects are eligible for reclamation.
In Guile, garbage collection has two logical phases: the mark
phase, in which the collector discovers the set of all live objects,
and the sweep phase, in which the collector reclaims the resources
associated with dead objects. The mark phase pauses the program and
traces all SCM
object references, starting with the root set.
The sweep phase actually runs concurrently with the main program,
incrementally reclaiming memory as needed by allocation.
In the mark phase, the garbage collector traces the Scheme stack and
heap precisely. Because the Scheme stack and heap are managed by
Guile, Guile can know precisely where in those data structures it might
find references to other heap objects. This is not the case,
unfortunately, for pointers on the C stack and static data segment.
Instead of requiring the user to inform Guile about all variables in C
that might point to heap objects, Guile traces the C stack and static
data segment conservatively. That is to say, Guile just treats
every word on the C stack and every C global variable as a potential
reference into the Scheme heap. Any value that looks like a pointer to a GC-managed
object is treated as such, whether it actually is a reference or not.
Thus, scanning the C stack and static data segment is guaranteed to find
all actual references, but it might also find words that only
accidentally look like references. These “false positives” might keep
SCM
objects alive that would otherwise be considered dead. While
this might waste memory, keeping an object around longer than it
strictly needs to is harmless. This is why this technique is called
“conservative garbage collection”. In practice, the wasted memory
seems to be no problem, as the static C root set is almost always finite
and small, given that the Scheme stack is separate from the C stack.
The stack of every thread is scanned in this way and the registers of the CPU and all other memory locations where local variables or function parameters might show up are included in this scan as well.
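The idea of conservative scanning can be sketched in a few lines of plain C. The example below is only a toy model under stated assumptions: the struct heap_range type and both function names are hypothetical, the heap is a single contiguous range with a fixed object alignment, and the scanned words are passed in as an array instead of being read from a real stack. It is not how Guile's collector is implemented.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of a GC-managed heap region.  */
struct heap_range
{
  uintptr_t start;      /* address of the first byte of the heap */
  uintptr_t end;        /* address one past the last byte */
  size_t alignment;     /* alignment of every heap object, in bytes */
};

/* Return nonzero if WORD could be a pointer to a heap object: it lies
   within the heap and is suitably aligned.  Whether the word really
   is a pointer cannot be known, so the answer is conservative.  */
int
looks_like_heap_pointer (const struct heap_range *heap, uintptr_t word)
{
  return word >= heap->start
    && word < heap->end
    && (word - heap->start) % heap->alignment == 0;
}

/* Scan N_WORDS words starting at WORDS (standing in for a C stack or
   a static data segment) and count the potential references.  A real
   collector would mark the referenced object instead of counting.  */
size_t
count_potential_references (const struct heap_range *heap,
                            const uintptr_t *words, size_t n_words)
{
  size_t i, count = 0;

  for (i = 0; i < n_words; i++)
    if (looks_like_heap_pointer (heap, words[i]))
      count++;

  return count;
}
```

An integer that happens to have the value of a valid heap address is counted as well; this is the kind of “false positive” that can keep an otherwise dead object alive.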
The consequence of the conservative scanning is that you can just
declare local variables and function parameters of type SCM
and
be sure that the garbage collector will not free the corresponding
objects.
However, a local variable or function parameter is only protected as
long as it is really on the stack (or in some register). As an
optimization, the C compiler might reuse its location for some other
value and the SCM
object would no longer be protected. Normally,
this leads to exactly the right behavior: the compiler will only
overwrite a reference when it is no longer needed and thus the object
becomes unprotected precisely when the reference disappears, just as
wanted.
There are situations, however, where a SCM
object needs to be
around longer than its reference from a local variable or function
parameter. This happens, for example, when you retrieve some pointer
from a foreign object and work with that pointer directly. The
reference to the SCM
foreign object might be dead after the
pointer has been retrieved, but the pointer itself (and the memory
pointed to) is still in use and thus the foreign object must be
protected. The compiler does not know about this connection and might
overwrite the SCM
reference too early.
To get around this problem, you can use scm_remember_upto_here_1
and its cousins. It will keep the compiler from overwriting the
reference. See Foreign Object Memory Management.
Scheme has a more general view of program flow than C, both locally and non-locally.
Controlling the local flow of control involves things like gotos, loops, calling functions and returning from them. Non-local control flow refers to situations where the program jumps across one or more levels of function activations without using the normal call or return operations.
The primitive means of C for local control flow is the goto
statement, together with if. Loops done with for, while or do
could in principle be rewritten with just goto and if. In Scheme,
the primitive means for local control flow is the function call
(together with if). Thus, the repetition of some computation in a
loop is ultimately implemented by a function that calls itself, that
is, by recursion.
This approach is theoretically very powerful since it is easier to reason formally about recursion than about gotos. In C, using recursion exclusively would not be practical, though, since it would eat up the stack very quickly. In Scheme, however, it is practical: function calls that appear in a tail position do not use any additional stack space (see Tail calls).
A function call is in a tail position when it is the last thing the
calling function does. The value returned by the called function is
immediately returned from the calling function. In the following
example, the call to bar-1
is in a tail position, while the
call to bar-2
is not. (The call to 1-
in foo-2
is in a tail position, though.)
    (define (foo-1 x)
      (bar-1 (1- x)))

    (define (foo-2 x)
      (1- (bar-2 x)))
Thus, when you take care to recurse only in tail positions, the recursion will only use constant stack space and will be as good as a loop constructed from gotos.
Scheme offers a few syntactic abstractions (do and named let) that
make writing loops slightly easier.
But only Scheme functions can call other functions in a tail position: C functions can not. This matters when you have, say, two functions that call each other recursively to form a common loop. The following (unrealistic) example shows how one might go about determining whether a non-negative integer n is even or odd.
    (define (my-even? n)
      (cond ((zero? n) #t)
            (else (my-odd? (1- n)))))

    (define (my-odd? n)
      (cond ((zero? n) #f)
            (else (my-even? (1- n)))))
Because the calls to my-even?
and my-odd?
are in tail
positions, these two procedures can be applied to arbitrarily large
integers without overflowing the stack. (They will still take a lot
of time, of course.)
However, if one or both of the two procedures were rewritten in C, they could no longer call their companion in a tail position (since C does not have this concept). You might need to take this consideration into account when deciding which parts of your program to write in Scheme and which in C.
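One way to keep such a pair of procedures in C without growing the stack is to merge them into a single loop that tracks which of the two would currently be running. The following is a minimal sketch in plain C; the names my_even_p and enum state are hypothetical, and it operates on unsigned long rather than on SCM values to stay self-contained.

```c
/* Which of the two mutually recursive procedures is "active".  */
enum state { IN_EVEN, IN_ODD };

/* Return 1 if N is even and 0 if it is odd, mirroring the Scheme
   definitions of my-even? and my-odd? step by step.  */
int
my_even_p (unsigned long n)
{
  enum state state = IN_EVEN;

  /* Each iteration corresponds to one tail call in the Scheme code:
     decrement N and hand over to the companion procedure.  */
  while (n != 0)
    {
      n--;
      state = (state == IN_EVEN) ? IN_ODD : IN_EVEN;
    }

  /* At zero, my-even? answers #t and my-odd? answers #f.  */
  return state == IN_EVEN;
}
```

The loop uses constant stack space, just as the tail-recursive Scheme version does, but the two procedures are no longer separate functions.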
In addition to calling functions and returning from them, a Scheme program can also exit non-locally from a function so that the control flow returns directly to an outer level. This means that some functions might not return at all.
Moreover, a Scheme program can not only jump to some outer level of control; it can also jump back into the middle of a function that has already exited. This might cause some functions to return more than once.
In general, these non-local jumps are done by invoking
continuations that have previously been captured using
call-with-current-continuation. Guile also offers a slightly
restricted set of functions, catch and throw, that can
only be used for non-local exits. This restriction makes them more
efficient. Error reporting (with the function error) is
implemented by invoking throw, for example. The functions
catch and throw belong to the topic of exceptions.
Since Scheme functions can call C functions and vice versa, C code can
experience the more general control flow of Scheme as well. It is
possible that a C function will not return at all, or will return more
than once. While C does offer setjmp
and longjmp
for
non-local exits, it is still an unusual thing for C code. In
contrast, non-local exits are very common in Scheme, mostly to report
errors.
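For readers who have not used setjmp and longjmp, here is a minimal sketch in standard C of a non-local exit that skips an intermediate function. All function names are hypothetical, and the example does not involve libguile; it only shows the kind of control transfer that C code called from Scheme has to be prepared for.

```c
#include <setjmp.h>

/* Jump target for the non-local exit (hypothetical name).  */
static jmp_buf error_exit;

/* Innermost function: on a negative input it exits non-locally and
   never returns to its immediate caller.  */
static int
checked_halve (int x)
{
  if (x < 0)
    longjmp (error_exit, 1);
  return x / 2;
}

/* Intermediate function: the addition after the call is skipped
   entirely when checked_halve exits non-locally.  */
static int
halve_and_increment (int x)
{
  return checked_halve (x) + 1;
}

/* Return halve_and_increment (X), or -1 if the computation was
   abandoned through longjmp.  */
int
run_guarded (int x)
{
  if (setjmp (error_exit) != 0)
    return -1;                  /* control arrived here via longjmp */

  return halve_and_increment (x);
}
```

Note that halve_and_increment gets no chance to run any cleanup code when the jump passes through it, which is the reason cleanup actions need special treatment in the presence of non-local exits.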
You need to be prepared for the non-local jumps in the control flow
whenever you use a function from libguile: it is best to assume
that any libguile function might signal an error or run a pending
signal handler (which in turn can do arbitrary things).
It is often necessary to take cleanup actions when the control leaves a
function non-locally. Also, when the control returns non-locally, some
setup actions might be called for. For example, the Scheme function
with-output-to-port needs to modify the global state so that
current-output-port returns the port passed to
with-output-to-port. The global output port needs to be reset to
its previous value when with-output-to-port returns normally or
when it is exited non-locally. Likewise, the port needs to be set again
when control enters non-locally.
Scheme code can use the dynamic-wind function to arrange for
the setting and resetting of the global state. C code can use the
corresponding scm_internal_dynamic_wind function, or a
scm_dynwind_begin/scm_dynwind_end pair together with
suitable “dynwind actions” (see Dynamic Wind).
Instead of coping with non-local control flow, you can also prevent it
by erecting a continuation barrier (see Continuation Barriers). The
function scm_c_with_continuation_barrier, for example, is guaranteed
to return exactly once.
You cannot call libguile functions from handlers for POSIX signals, but
you can register Scheme handlers for POSIX signals such as
SIGINT. These handlers do not run during the actual signal
delivery. Instead, they are run when the program (more precisely, the
thread that the handler has been registered for) reaches the next
safe point.
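The general technique of deferring signal work to safe points can be sketched in standard C as follows. This is a simplified, single-threaded model with hypothetical names (install_deferred_handler, safe_point, handled_count); it only illustrates the idea and is not Guile's implementation.

```c
#include <signal.h>

/* Set by the low-level handler; examined only at safe points.  */
static volatile sig_atomic_t interrupt_pending = 0;

/* Number of times the deferred handler has run (for illustration).  */
static int interrupts_handled = 0;

/* The low-level signal handler does nothing but record the signal.  */
static void
record_signal (int signum)
{
  (void) signum;
  interrupt_pending = 1;
}

/* Install the recording handler for SIGINT.  */
void
install_deferred_handler (void)
{
  signal (SIGINT, record_signal);
}

/* A safe point: run the deferred work if a signal has been recorded.
   The increment stands in for the user's real handler, which may do
   arbitrary things here because it runs in a normal context.  */
void
safe_point (void)
{
  if (interrupt_pending)
    {
      interrupt_pending = 0;
      interrupts_handled++;
    }
}

/* Report how often the deferred handler has run so far.  */
int
handled_count (void)
{
  return interrupts_handled;
}
```

Between the delivery of the signal and the next call to safe_point, nothing observable happens, which is why code that runs for a long time needs to reach safe points regularly.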
The libguile functions themselves have many such safe points.
Consequently, you must be prepared for arbitrary actions anytime you
call a libguile function. For example, even scm_cons
can contain
a safe point and when a signal handler is pending for your thread,
calling scm_cons
will run this handler and anything might happen,
including a non-local exit although scm_cons
would not ordinarily
do such a thing on its own.
If you do not want to allow the running of asynchronous signal handlers,
you can block them temporarily with scm_dynwind_block_asyncs, for
example. See Asynchronous Interrupts.
Since signal handling in Guile relies on safe points, you need to make sure that your functions offer enough of them. Normally, calling libguile functions in the normal course of action is all that is needed. But when a thread might spend a long time in a code section that calls no libguile function, it is good to include explicit safe points. This can allow the user to interrupt your code with C-c, for example.
You can do this with the macro SCM_TICK. This macro is
syntactically a statement. That is, you could use it like this:
    while (1)
      {
        SCM_TICK;
        do_some_work ();
      }
Frequent execution of a safe point is even more important in multi-threaded programs; see Multi-Threading.
Guile can be used in multi-threaded programs just as well as in single-threaded ones.
Each thread that wants to use functions from libguile must put itself into guile mode and must then follow a few rules. If it doesn’t want to honor these rules in certain situations, a thread can temporarily leave guile mode (but can no longer use libguile functions during that time, of course).
Threads enter guile mode by calling scm_with_guile,
scm_boot_guile, or scm_init_guile. As explained in the
reference documentation for these functions, Guile will then learn about
the stack bounds of the thread and can protect the SCM values
that are stored in local variables. When a thread puts itself into
guile mode for the first time, it gets a Scheme representation and is
listed by all-threads, for example.
Threads in guile mode can block (e.g., do blocking I/O) without causing
any problems; temporarily leaving guile mode with
scm_without_guile before blocking slightly improves GC
performance, though. For some common blocking operations, Guile
provides convenience functions. For example, if you want to lock a
pthread mutex while in guile mode, you might want to use
scm_pthread_mutex_lock which is just like pthread_mutex_lock
except that it leaves guile mode while blocking.
All libguile functions are (intended to be) robust in the face of multiple threads using them concurrently. This means that there is no risk of the internal data structures of libguile becoming corrupted in such a way that the process crashes.
A program might still produce nonsensical results, though. Taking hashtables as an example, Guile guarantees that you can use them from multiple threads concurrently and a hashtable will always remain a valid hashtable and Guile will not crash when you access it. It does not guarantee, however, that inserting into it concurrently from two threads will give useful results: only one insertion might actually happen, none might happen, or the table might in general be modified in a totally arbitrary manner. (It will still be a valid hashtable, but not the one that you might have expected.) Guile might also signal an error when it detects a harmful race condition.
Thus, you need to put in additional synchronizations when multiple threads want to use a single hashtable, or any other mutable Scheme object.
When writing C code for use with libguile, you should try to make it robust as well. An example that converts a list into a vector will help to illustrate. Here is a correct version:
    SCM
    my_list_to_vector (SCM list)
    {
      SCM vector = scm_make_vector (scm_length (list), SCM_UNDEFINED);
      size_t len, i;

      len = scm_c_vector_length (vector);
      i = 0;
      while (i < len && scm_is_pair (list))
        {
          scm_c_vector_set_x (vector, i, scm_car (list));
          list = scm_cdr (list);
          i++;
        }

      return vector;
    }
The first thing to note is that storing into a SCM
location
concurrently from multiple threads is guaranteed to be robust: you don’t
know which value wins but it will in any case be a valid SCM
value.
But there is no guarantee that the list referenced by list is not
modified in another thread while the loop iterates over it. Thus, while
copying its elements into the vector, the list might get longer or
shorter. For this reason, the loop must check both that it doesn’t
overrun the vector and that it doesn’t overrun the list. Otherwise,
scm_c_vector_set_x
would raise an error if the index is out of
range, and scm_car
and scm_cdr
would raise an error if the
value is not a pair.
It is safe to use scm_car
and scm_cdr
on the local
variable list once it is known that the variable contains a pair.
The contents of the pair might change spontaneously, but it will always
stay a valid pair (and a local variable will of course not spontaneously
point to a different Scheme object).
Likewise, a vector such as the one returned by scm_make_vector is
guaranteed to always stay the same length so that it is safe to only use
scm_c_vector_length once and store the result. (In the example,
vector is safe anyway since it is a fresh object that no other
thread can possibly know about until it is returned from
my_list_to_vector.)
Of course the behavior of my_list_to_vector
is suboptimal when
list does indeed get asynchronously lengthened or shortened in
another thread. But it is robust: it will always return a valid vector.
That vector might be shorter than expected, or its last elements might
be unspecified, but it is a valid vector and if a program wants to rule
out these cases, it must avoid modifying the list asynchronously.
Here is another version that is also correct:
    SCM
    my_pedantic_list_to_vector (SCM list)
    {
      SCM vector = scm_make_vector (scm_length (list), SCM_UNDEFINED);
      size_t len, i;

      len = scm_c_vector_length (vector);
      i = 0;
      while (i < len)
        {
          scm_c_vector_set_x (vector, i, scm_car (list));
          list = scm_cdr (list);
          i++;
        }

      return vector;
    }
This version relies on the error-checking behavior of scm_car and
scm_cdr. When the list is shortened (that is, when list
holds a non-pair), scm_car will throw an error. This might be
preferable to just returning a half-initialized vector.
The API for accessing vectors and arrays of various kinds from C takes a slightly different approach to thread-robustness. In order to get at the raw memory that stores the elements of an array, you need to reserve that array as long as you need the raw memory. During the time an array is reserved, its elements can still spontaneously change their values, but the memory itself and other things like the size of the array are guaranteed to stay fixed. Any operation that would change these parameters of an array that is currently reserved will signal an error. In order to avoid these errors, a program should of course put suitable synchronization mechanisms in place. As you can see, Guile itself is again only concerned about robustness, not about correctness: without proper synchronization, your program will likely not be correct, but the worst consequence is an error message.
Real thread-safety often requires that a critical section of code is executed in a certain restricted manner. A common requirement is that the code section is not entered a second time when it is already being executed. Locking a mutex while in that section ensures that no other thread will start executing it, blocking asyncs ensures that no asynchronous code enters the section again from the current thread, and the error checking of Guile mutexes guarantees that an error is signalled when the current thread accidentally reenters the critical section via recursive function calls.
Guile provides two mechanisms to support critical sections as outlined
above. You can either use the macros SCM_CRITICAL_SECTION_START
and SCM_CRITICAL_SECTION_END for very simple sections; or use a
dynwind context together with a call to scm_dynwind_critical_section.
The macros only work reliably for critical sections that are guaranteed to not cause a non-local exit. They also do not detect an accidental reentry by the current thread. Thus, you should probably only use them to delimit critical sections that do not contain calls to libguile functions or to other external functions that might do complicated things.
The function scm_dynwind_critical_section, on the other hand,
will correctly deal with non-local exits because it requires a dynwind
context. Also, by using a separate mutex for each critical section,
it can detect accidental reentries.
The foreign object type facility is Guile’s mechanism for
importing objects and types from C or other languages into Guile’s
system. If you have a C struct foo
type, for example, you can
define a corresponding Guile foreign object type that allows Scheme code
to handle struct foo *
objects.
To define a new foreign object type, the programmer provides Guile with some essential information about the type — what its name is, how many fields it has, and its finalizer (if any) — and Guile allocates a fresh type for it. Foreign objects can be accessed from Scheme or from C.
To create a new foreign object type from C, call
scm_make_foreign_object_type. It returns a value of type
SCM which identifies the new type.
Here is how one might declare a new type representing eight-bit gray-scale images:
    #include <libguile.h>

    struct image
    {
      int width, height;
      char *pixels;

      /* The name of this image */
      SCM name;

      /* A function to call when this image is
         modified, e.g., to update the screen,
         or SCM_BOOL_F if no action necessary */
      SCM update_func;
    };

    static SCM image_type;

    void
    init_image_type (void)
    {
      SCM name, slots;
      scm_t_struct_finalize finalizer;

      name = scm_from_utf8_symbol ("image");
      slots = scm_list_1 (scm_from_utf8_symbol ("data"));
      finalizer = NULL;

      image_type =
        scm_make_foreign_object_type (name, slots, finalizer);
    }
The result is an initialized image_type
value that identifies the
new foreign object type. The next section describes how to create
foreign objects and how to access their slots.
Foreign objects contain zero or more “slots” of data. A slot can hold
a pointer, an integer that fits into a size_t or ssize_t,
or a SCM value.
All objects of a given foreign type have the same number of slots. In
the example from the previous section, the image
type has one
slot, because the slots list passed to
scm_make_foreign_object_type
is of length one. (The actual names
given to slots are unimportant for most users of the C interface, but
can be used on the Scheme side to introspect on the foreign object.)
To construct a foreign object and initialize its first slot, call
scm_make_foreign_object_1 (type, first_slot_value).
There are similarly named constructors for initializing 0, 1, 2, or 3
slots, or initializing n slots via an array. See Foreign Objects, for
full details. Any fields that are not explicitly initialized are set
to 0.
To get or set the value of a slot by index, you can use the scm_foreign_object_ref and scm_foreign_object_set_x functions. These functions take and return values as void * pointers; there are corresponding convenience procedures such as scm_foreign_object_signed_ref and scm_foreign_object_unsigned_set_x for dealing with slots as signed or unsigned integers.
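To make the idea of an integer living in a pointer-sized slot concrete, here is a minimal plain-C model. It does not use libguile: struct demo_object and the demo_ functions are hypothetical stand-ins, and the sketch assumes a platform where an intptr_t survives a round trip through void *, as is the case on mainstream systems.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical one-slot object; a stand-in for illustration only,
   not a libguile type. */
struct demo_object
{
  void *slot0;
};

/* Store a signed integer in the pointer-sized slot. */
void
demo_signed_set (struct demo_object *obj, intptr_t value)
{
  assert (obj != NULL);
  obj->slot0 = (void *) value;
}

/* Read the slot back as a signed integer. */
intptr_t
demo_signed_ref (const struct demo_object *obj)
{
  assert (obj != NULL);
  return (intptr_t) obj->slot0;
}

/* Round-trip a value through the slot; return 1 if it survived.
   The integer/pointer conversion is implementation-defined in C. */
int
demo_round_trip_ok (intptr_t value)
{
  struct demo_object obj = { NULL };
  demo_signed_set (&obj, value);
  return demo_signed_ref (&obj) == value;
}
```

Calling demo_round_trip_ok (-1) exercises the case that matters for a slot holding a file descriptor, where a negative value marks the object as closed and so has to be read back as a signed integer.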
Foreign object fields that are pointers can be tricky to manage. If possible, it is best that all memory that is referenced by a foreign object be managed by the garbage collector. That way, the GC can automatically ensure that memory is accessible when it is needed, and freed when it becomes inaccessible. If this is not the case for your program (for example, if you are exposing an object to Scheme that was allocated by some other, Guile-unaware part of your program), then you will probably need to implement a finalizer. See Foreign Object Memory Management, for more.
Continuing the example from the previous section, if the global variable image_type contains the type returned by scm_make_foreign_object_type, here is how we could construct a foreign object whose “data” field contains a pointer to a freshly allocated struct image:
SCM
make_image (SCM name, SCM s_width, SCM s_height)
{
  struct image *image;
  int width = scm_to_int (s_width);
  int height = scm_to_int (s_height);

  /* Allocate the `struct image'.  Because we
     use scm_gc_malloc, this memory block will
     be automatically reclaimed when it becomes
     inaccessible, and its members will be traced
     by the garbage collector.  */
  image = (struct image *)
    scm_gc_malloc (sizeof (struct image), "image");

  image->width = width;
  image->height = height;

  /* Allocating the pixels with
     scm_gc_malloc_pointerless means that the
     pixels data is collectable by GC, but
     that GC shouldn't spend time tracing its
     contents for nested pointers because there
     aren't any.  */
  image->pixels =
    scm_gc_malloc_pointerless (width * height, "image pixels");

  image->name = name;
  image->update_func = SCM_BOOL_F;

  /* Now wrap the struct image* in a new foreign
     object, and return that object.  */
  return scm_make_foreign_object_1 (image_type, image);
}
We use scm_gc_malloc_pointerless for the pixel buffer to tell the garbage collector not to scan it for pointers. Calls to scm_gc_malloc, scm_make_foreign_object_1, and scm_gc_malloc_pointerless raise an exception in out-of-memory conditions; the garbage collector is able to reclaim previously allocated memory if that happens.
Functions that operate on foreign objects should check that the passed SCM value indeed is of the correct type before accessing its data. They can do this with scm_assert_foreign_object_type.
For example, here is a simple function that operates on an image object, and checks the type of its argument.
SCM
clear_image (SCM image_obj)
{
  int area;
  struct image *image;

  scm_assert_foreign_object_type (image_type, image_obj);

  image = scm_foreign_object_ref (image_obj, 0);
  area = image->width * image->height;
  memset (image->pixels, 0, area);

  /* Invoke the image's update function.  */
  if (scm_is_true (image->update_func))
    scm_call_0 (image->update_func);

  return SCM_UNSPECIFIED;
}
Once a foreign object has been released to the tender mercies of the Scheme system, it must be prepared to survive garbage collection. In the example above, all the memory associated with the foreign object is managed by the garbage collector because we used the scm_gc_ allocation functions. Thus, no special care must be taken: the garbage collector automatically scans them and reclaims any unused memory.
However, when data associated with a foreign object is managed in some other way (e.g., malloc’d memory or file descriptors), it is possible to specify a finalizer function to release those resources when the foreign object is reclaimed.
As discussed in Garbage Collection, Guile’s garbage collector will reclaim inaccessible memory as needed. This reclamation process runs concurrently with the main program. When Guile analyzes the heap and determines that an object’s memory can be reclaimed, that memory is put on a “free list” of objects that can be reclaimed. Usually that’s the end of it: the object is available for immediate re-use. However, some objects can have “finalizers” associated with them: functions that are called on reclaimable objects to effect any external cleanup actions.
Finalizers are tricky business and it is best to avoid them. They can be invoked at unexpected times, or not at all—for example, they are not invoked on process exit. They don’t help the garbage collector do its job; in fact, they are a hindrance. Furthermore, they perturb the garbage collector’s internal accounting. The GC decides to scan the heap when it thinks that it is necessary, after some amount of allocation. Finalizable objects almost always represent an amount of allocation that is invisible to the garbage collector. The effect can be that the actual resource usage of a system with finalizable objects is higher than what the GC thinks it should be.
All those caveats aside, some foreign object types will need finalizers. For example, if we had a foreign object type that wrapped file descriptors (and we aren’t suggesting this, as Guile already has ports), then you might define the type like this:
static SCM file_type;

static void
finalize_file (SCM file)
{
  int fd = scm_foreign_object_signed_ref (file, 0);
  if (fd >= 0)
    {
      /* Clear the slot first, so that finalizing the
         same object again does nothing.  */
      scm_foreign_object_signed_set_x (file, 0, -1);
      close (fd);
    }
}

static void
init_file_type (void)
{
  SCM name, slots;
  scm_t_struct_finalize finalizer;

  name = scm_from_utf8_symbol ("file");
  slots = scm_list_1 (scm_from_utf8_symbol ("fd"));
  finalizer = finalize_file;

  file_type =
    scm_make_foreign_object_type (name, slots, finalizer);
}

static SCM
make_file (int fd)
{
  return scm_make_foreign_object_1 (file_type, (void *) fd);
}
Note that the finalizer may be invoked in ways and at times you might not expect. In particular, if the user’s Guile is built with support for threads, the finalizer may be called from any thread that is running Guile. In Guile 2.0, finalizers are invoked via “asyncs”, which interleave them with running Scheme code; see Asynchronous Interrupts. In Guile 2.2 there is a dedicated finalization thread, to ensure that the finalization doesn’t run within the critical section of any other thread known to Guile.
In either case, finalizers run concurrently with the main program, and so they need to be async-safe and thread-safe. If for some reason this is impossible, perhaps because you are embedding Guile in some application that is not itself thread-safe, you have a few options. One is to use guardians instead of finalizers, and arrange to pump the guardians for finalizable objects. See Guardians, for more information. The other option is to disable automatic finalization entirely, and arrange to call scm_run_finalizers () at appropriate points. See Foreign Objects, for more on these interfaces.
Finalizers are allowed to allocate memory, access GC-managed memory, and in general can do anything any Guile user code can do. This was not the case in Guile 1.8, where finalizers were much more restricted. In particular, in Guile 2.0, finalizers can resuscitate objects. We do not recommend that users avail themselves of this possibility, however, as a resuscitated object can re-expose other finalizable objects that have been already finalized back to Scheme. These objects will not be finalized again, but they could cause use-after-free problems to code that handles objects of that particular foreign object type. To guard against this possibility, robust finalization routines should clear state from the foreign object, as in the above finalize_file example.
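The “clear state first” discipline can be sketched in plain C without libguile. In the model below, struct demo_file, demo_close and the other demo_ names are hypothetical stand-ins: demo_close only counts its calls instead of closing a real descriptor, so the effect of finalizing the same object twice is easy to observe.

```c
#include <assert.h>

/* Hypothetical stand-in for a one-field "file" object;
   not a libguile type. */
struct demo_file
{
  int fd;   /* -1 means "already finalized" */
};

/* Number of times the fake close function has run. */
int demo_close_calls = 0;

/* Fake close(): only counts calls, touches no real descriptor. */
void
demo_close (int fd)
{
  assert (fd >= 0);
  demo_close_calls++;
}

/* Clear the field first, then release the resource, so that a second
   call (for example on a resuscitated object) does nothing. */
void
demo_finalize_file (struct demo_file *file)
{
  int fd = file->fd;
  if (fd >= 0)
    {
      file->fd = -1;
      demo_close (fd);
    }
}

/* Finalize the same object twice and return how many times the
   descriptor was actually closed. */
int
demo_double_finalize (void)
{
  struct demo_file file = { 7 };
  demo_close_calls = 0;
  demo_finalize_file (&file);
  demo_finalize_file (&file);
  return demo_close_calls;
}
```

Because the field is reset before the resource is released, code that later sees the object finds an invalid descriptor rather than a stale one that may since have been re-used.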
One final caveat. Foreign object finalizers are associated with the lifetime of a foreign object, not of its fields. If you access a field of a finalizable foreign object, and do not arrange to keep a reference on the foreign object itself, it could be that the outer foreign object gets finalized while you are working with its field.
For example, consider a procedure to read some data from a file, from our example above.
SCM
read_bytes (SCM file, SCM n)
{
  int fd;
  SCM buf;
  size_t len, pos;

  scm_assert_foreign_object_type (file_type, file);

  fd = scm_foreign_object_signed_ref (file, 0);
  if (fd < 0)
    scm_wrong_type_arg_msg ("read-bytes", SCM_ARG1,
                            file, "open file");

  len = scm_to_size_t (n);
  /* buf is already declared above; assign, don't redeclare.  */
  buf = scm_c_make_bytevector (len);

  pos = 0;
  while (pos < len)
    {
      char *bytes = SCM_BYTEVECTOR_CONTENTS (buf);
      ssize_t count = read (fd, bytes + pos, len - pos);
      if (count < 0)
        scm_syserror ("read-bytes");
      if (count == 0)
        break;
      pos += count;
    }

  /* Keep file alive until all uses of fd are done.  */
  scm_remember_upto_here_1 (file);

  return scm_values (scm_list_2 (buf, scm_from_size_t (pos)));
}
After the prelude, only the fd value is used and the C compiler has no reason to keep the file object around. If scm_c_make_bytevector results in a garbage collection, file might not be on the stack or anywhere else and could be finalized, leaving read to read a closed (or, in a multi-threaded program, possibly re-used) file descriptor. The use of scm_remember_upto_here_1 prevents this, by creating a reference to file after all data accesses. See Function related to Garbage Collection.
scm_remember_upto_here_1 is only needed on finalizable objects, because garbage collection of other values is invisible to the program: it happens when needed, and is not observable. But if you can, save yourself the headache and build your program in such a way that it doesn’t need finalization.
It is also possible to create foreign objects and object types from Scheme, and to access fields of foreign objects from Scheme. For example, the file example from the last section could be equivalently expressed as:
(define-module (my-file)
  #:use-module (system foreign-object)
  #:use-module ((oop goops) #:select (make))
  #:export (make-file))

(define (finalize-file file)
  (let ((fd (struct-ref file 0)))
    (unless (< fd 0)
      (struct-set! file 0 -1)
      (close-fdes fd))))

(define <file>
  (make-foreign-object-type '<file> '(fd)
                            #:finalizer finalize-file))

(define (make-file fd)
  (make <file> #:fd fd))
Here we see that the result of make-foreign-object-type, which is the equivalent of scm_make_foreign_object_type, is a struct vtable. See Vtables, for more information. To instantiate the foreign object, which is really a Guile struct, we use make. (We could have used make-struct/no-tail, but as an implementation detail, finalizers are attached in the initialize method called by make.) To access the fields, we use struct-ref and struct-set!. See Structure Basics.
There is a convenience syntax, define-foreign-object-type, that defines a type along with a constructor, and getters for the fields. An appropriate invocation of define-foreign-object-type for the file object type could look like this:
(use-modules (system foreign-object))

(define-foreign-object-type <file>
  make-file
  (fd)
  #:finalizer finalize-file)
This defines the <file> type with one field, a make-file constructor, and a getter for the fd field, bound to fd.
Foreign object types are not only vtables but are actually GOOPS classes, as hinted at above. See GOOPS, for more on Guile’s object-oriented programming system. Thus one can define print and equality methods using GOOPS:
(use-modules (oop goops))

(define-method (write (file <file>) port)
  ;; Assuming existence of the `fd' getter
  (format port "#<<file> ~a>" (fd file)))

(define-method (equal? (a <file>) (b <file>))
  (eqv? (fd a) (fd b)))
One can even sub-class foreign types.
(define-class <named-file> (<file>)
  (name #:init-keyword #:name #:init-value #f #:accessor name))
The question arises of how to construct these values, given that make-file returns a plain old <file> object. It turns out that you can use the GOOPS construction interface, where every field of the foreign object has an associated initialization keyword argument.
(define* (my-open-file name #:optional (flags O_RDONLY))
  (make <named-file> #:fd (open-fdes name flags) #:name name))

(define-method (write (file <named-file>) port)
  (format port "#<<file> ~s ~a>" (name file) (fd file)))
See Foreign Objects, for full documentation on the Scheme interface to foreign objects. See GOOPS, for more on GOOPS.
As a final note, you might wonder how this system supports encapsulation of sensitive values. First, we have to recognize that some facilities are essentially unsafe and have global scope. For example, in C, the integrity and confidentiality of a part of a program is at the mercy of every other part of that program – because any part of the program can read and write anything in its address space. At the same time, principled access to structured data is organized in C on lexical boundaries; if you don’t expose accessors for your object, you trust other parts of the program not to work around that barrier.
The situation is not dissimilar in Scheme. Although Scheme’s unsafe constructs are fewer in number than in C, they do exist. The (system foreign) module can be used to violate confidentiality and integrity, and shouldn’t be exposed to untrusted code. Although struct-ref and struct-set! are less unsafe, they still have a cross-cutting capability of drilling through abstractions. Performing a struct-set! on a foreign object slot could cause unsafe foreign code to crash. Ultimately, structures in Scheme are capabilities for abstraction, and not abstractions themselves.
That leaves us with the lexical capabilities, like constructors and accessors. Here is where encapsulation lies: the practical degree to which the innards of your foreign objects are exposed is the degree to which their accessors are lexically available in user code. If you want to allow users to reference fields of your foreign object, provide them with a getter. Otherwise you should assume that the only access to your object may come from your code, which has the relevant authority, or via code with access to cross-cutting struct-ref and such, which also has the cross-cutting authority.
When writing C code for use with Guile, you typically define a set of C functions, and then make some of them visible to the Scheme world by calling scm_c_define_gsubr or related functions. If you have many functions to publish, it can sometimes be annoying to keep the list of calls to scm_c_define_gsubr in sync with the list of function definitions.
Guile provides the guile-snarf program to manage this problem. Using this tool, you can keep all the information needed to define the function alongside the function definition itself; guile-snarf will extract this information from your source code, and automatically generate a file of calls to scm_c_define_gsubr which you can #include into an initialization function.
The snarfing mechanism works for many kinds of initialization actions, not just for collecting calls to scm_c_define_gsubr. For a full list of what can be done, see Snarfing Macros.
The guile-snarf program is invoked like this:
guile-snarf [-o outfile] [cpp-args ...]
This command will extract initialization actions to outfile. When no outfile has been specified or when outfile is -, standard output will be used. The C preprocessor is called with cpp-args (which usually include an input file) and the output is filtered to extract the initialization actions.
If there are errors during processing, outfile is deleted and the program exits with non-zero status.
During snarfing, the pre-processor macro SCM_MAGIC_SNARFER is defined. You could use this to avoid including snarfer output files that don’t yet exist by writing code like this:
#ifndef SCM_MAGIC_SNARFER
#include "foo.x"
#endif
Here is how you might define the Scheme function clear-image, implemented by the C function clear_image:
#include <libguile.h>

SCM_DEFINE (clear_image, "clear-image", 1, 0, 0,
            (SCM image),
            "Clear the image.")
{
  /* C code to clear the image in image... */
}

void
init_image_type ()
{
#include "image-type.x"
}
The SCM_DEFINE declaration says that the C function clear_image implements a Scheme function called clear-image, which takes one required argument (of type SCM and named image), no optional arguments, and no rest argument. The string "Clear the image." provides a short help text for the function; it is called a docstring.
The SCM_DEFINE macro also defines a static array of characters initialized to the Scheme name of the function. In this case, s_clear_image is set to the C string "clear-image". You might want to use this symbol when generating error messages.
Assuming the text above lives in a file named image-type.c, you will need to execute the following command to prepare this file for compilation:
guile-snarf -o image-type.x image-type.c
This scans image-type.c for SCM_DEFINE declarations, and writes to image-type.x the output:
scm_c_define_gsubr ("clear-image", 1, 0, 0, (SCM (*)() ) clear_image);
When compiled normally, SCM_DEFINE is a macro which expands to the function header for clear_image.
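To see what “expands to the function header” amounts to, here is a simplified plain-C model. DEMO_DEFINE is a hypothetical stand-in written for this illustration and is not Guile’s actual definition of SCM_DEFINE: it ignores the argument counts and the docstring, and it uses int where Guile would use SCM, so that the sketch compiles without libguile.

```c
#include <assert.h>
#include <string.h>

/* DEMO_DEFINE is a hypothetical, simplified stand-in for SCM_DEFINE.
   In a normal compile it produces two things: a static character
   array s_<fname> holding the Scheme-level name, and the header of
   the C function.  The counts and docstring are simply dropped. */
#define DEMO_DEFINE(fname, scheme_name, req, opt, rest, arglist, doc) \
  static const char s_ ## fname[] = scheme_name;                      \
  int fname arglist

DEMO_DEFINE (clear_image, "clear-image", 1, 0, 0,
             (int image),
             "Clear the image.")
{
  /* s_clear_image is in scope here, e.g. for error messages. */
  assert (strcmp (s_clear_image, "clear-image") == 0);
  return image * 0;
}

/* Return the Scheme-level name recorded by the macro. */
const char *
demo_scheme_name (void)
{
  return s_clear_image;
}
```

The same source text thus serves two readers: the compiler sees an ordinary function definition, while the snarfer reads the macro arguments to produce the registration call shown above.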
Note that the output file name matches the #include from the input file. Also, you still need to provide all the same information you would if you were using scm_c_define_gsubr yourself, but you can place the information near the function definition itself, so it is less likely to become incorrect or out-of-date.
If you have many files that guile-snarf must process, you should consider using a fragment like the following in your Makefile:
snarfcppopts = $(DEFS) $(INCLUDES) $(CPPFLAGS) $(CFLAGS)
.SUFFIXES: .x
.c.x:
	guile-snarf -o $@ $< $(snarfcppopts)
This tells make to run guile-snarf to produce each needed .x file from the corresponding .c file.
The program guile-snarf passes its command-line arguments directly to the C preprocessor, which it uses to extract the information it needs from the source code. This means you can pass normal compilation flags to guile-snarf to define preprocessor symbols, add header file directories, and so on.
Guile is designed as an extension language interpreter that is straightforward to integrate with applications written in C (and C++). The big win here for the application developer is that Guile integration, as the Guile web page says, “lowers your project’s hacktivation energy.” Lowering the hacktivation energy means that you, as the application developer, and your users, reap the benefits that flow from being able to extend the application in a high level extension language rather than in plain old C.
In abstract terms, it’s difficult to explain what this really means and what the integration process involves, so instead let’s begin by jumping straight into an example of how you might integrate Guile into an existing program, and what you could expect to gain by so doing. With that example under our belts, we’ll then return to a more general analysis of the arguments involved and the range of programming options available.
Dia is a free software program for drawing schematic diagrams like flow charts and floor plans (http://www.gnome.org/projects/dia/). This section conducts the thought experiment of adding Guile to Dia. In so doing, it aims to illustrate several of the steps and considerations involved in adding Guile to applications in general.
First off, you should understand why you want to add Guile to Dia at all, and that means forming a picture of what Dia does and how it does it. So, what are the constituents of the Dia application?
(In other words, a textbook example of the model-view-controller paradigm.)
Next question: how will Dia benefit once the Guile integration is complete? Several (positive!) answers are possible here, and the choice is obviously up to the application developers. Still, one answer is that the main benefit will be the ability to manipulate Dia’s application domain objects from Scheme.
Suppose that Dia made a set of procedures available in Scheme, representing the most basic operations on objects such as shapes, connectors, and so on. Using Scheme, the application user could then write code that builds upon these basic operations to create more complex procedures. For example, given basic procedures to enumerate the objects on a page, to determine whether an object is a square, and to change the fill pattern of a single shape, the user can write a Scheme procedure to change the fill pattern of all squares on the current page:
(define (change-squares'-fill-pattern new-pattern)
  (for-each-shape current-page
    (lambda (shape)
      (if (square? shape)
          (change-fill-pattern shape new-pattern)))))
Assuming this objective, four steps are needed to achieve it.
First, you need a way of representing your application-specific objects (such as shape in the previous example) when they are passed into the Scheme world. Unless your objects are so simple that they map naturally into builtin Scheme data types like numbers and strings, you will probably want to use Guile’s foreign object interface to create a new Scheme data type for your objects.
Second, you need to write code for the basic operations like for-each-shape and square? such that they access and manipulate your existing data structures correctly, and then make these operations available as primitives on the Scheme level.
Third, you need to provide some mechanism within the Dia application that a user can hook into to cause arbitrary Scheme code to be evaluated.
Finally, you need to restructure your top-level application C code a little so that it initializes the Guile interpreter correctly and declares your foreign objects and primitives to the Scheme world.
The following subsections expand on these four points in turn.
For all but the most trivial applications, you will probably want to allow some representation of your domain objects to exist on the Scheme level. This is where foreign objects come in, and with them issues of lifetime management and garbage collection.
To get more concrete about this, let’s look again at the example we gave earlier of how application users can use Guile to build higher-level functions from the primitives that Dia itself provides.
(define (change-squares'-fill-pattern new-pattern)
  (for-each-shape current-page
    (lambda (shape)
      (if (square? shape)
          (change-fill-pattern shape new-pattern)))))
Consider what is stored here in the variable shape. For each shape on the current page, the for-each-shape primitive calls (lambda (shape) …) with an argument representing that shape. The question is: how is that argument represented on the Scheme level? The issues are as follows.
square? and change-fill-pattern primitives. In other words, a primitive like square? has somehow to be able to turn the value that it receives back into something that points to the underlying C structure describing a shape.
shape in a global variable, but then that shape is deleted (in a way that the Scheme code is not aware of), and later on some other Scheme code uses that global variable again in a call to, say, square??
shape argument passes transiently in and out of the Scheme world, it would be quite wrong to delete the underlying C shape just because the Scheme code has finished evaluation. How do we avoid this happening?
One resolution of these issues is for the Scheme-level representation of a shape to be a new, Scheme-specific C structure wrapped up as a foreign object. The foreign object is what is passed into and out of Scheme code, and the Scheme-specific C structure inside the foreign object points to Dia’s underlying C structure so that the code for primitives like square? can get at it.
To cope with an underlying shape being deleted while Scheme code is still holding onto a Scheme shape value, the underlying C structure should have a new field that points to the Scheme-specific foreign object. When a shape is deleted, the relevant code chains through to the Scheme-specific structure and sets its pointer back to the underlying structure to NULL. Thus the foreign object value for the shape continues to exist, but any primitive code that tries to use it will detect that the underlying shape has been deleted because the underlying structure pointer is NULL.
So, to summarize the steps involved in this resolution of the problem (and assuming that the underlying C structure for a shape is struct dia_shape):
struct dia_guile_shape { struct dia_shape * c_shape; /* NULL => deleted */ }
struct dia_shape
that points to its struct
dia_guile_shape
if it has one —
struct dia_shape { … struct dia_guile_shape * guile_shape; }
— so that C code can set guile_shape->c_shape
to NULL when the
underlying shape is deleted.
struct dia_guile_shape
as a foreign object type.
c_shape
field when decoding it, to find out whether the
underlying C shape is still there.
As far as memory management is concerned, the foreign object values and their Scheme-specific structures are under the control of the garbage collector, whereas the underlying C structures are explicitly managed in exactly the same way that Dia managed them before we thought of adding Guile.
When the garbage collector decides to free a shape foreign object value, it calls the finalizer function that was specified when defining the shape foreign object type. To maintain the correctness of the guile_shape field in the underlying C structure, this function should chain through to the underlying C structure (if it still exists) and set its guile_shape field to NULL.
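The two-way bookkeeping described above can be sketched in plain C, leaving Guile out of the picture entirely. The structures below are cut down to the two pointer fields discussed in the text, and the demo_ functions are hypothetical names introduced only for this illustration: one stands for Dia deleting a shape, the other for what the finalizer would do.

```c
#include <assert.h>
#include <stddef.h>

struct dia_guile_shape;

/* Simplified stand-in for Dia's shape structure; only the
   back-pointer described in the text is modelled. */
struct dia_shape
{
  struct dia_guile_shape *guile_shape;   /* NULL => no Scheme wrapper */
};

struct dia_guile_shape
{
  struct dia_shape *c_shape;             /* NULL => deleted */
};

/* Link a C shape and its Scheme-specific structure to each other. */
void
demo_link (struct dia_shape *shape, struct dia_guile_shape *wrapper)
{
  shape->guile_shape = wrapper;
  wrapper->c_shape = shape;
}

/* Called when Dia deletes a shape: tell the wrapper, if any. */
void
demo_shape_deleted (struct dia_shape *shape)
{
  if (shape->guile_shape != NULL)
    shape->guile_shape->c_shape = NULL;
  shape->guile_shape = NULL;
}

/* What the finalizer would do: tell the C shape, if it still exists. */
void
demo_wrapper_finalized (struct dia_guile_shape *wrapper)
{
  if (wrapper->c_shape != NULL)
    wrapper->c_shape->guile_shape = NULL;
  wrapper->c_shape = NULL;
}

/* A primitive may only use the wrapper while c_shape is non-NULL. */
int
demo_shape_is_live (const struct dia_guile_shape *wrapper)
{
  return wrapper->c_shape != NULL;
}

/* Walk through both orders of destruction; return 1 on success. */
int
demo_scenario (void)
{
  struct dia_shape shape = { NULL };
  struct dia_guile_shape wrapper = { NULL };

  demo_link (&shape, &wrapper);
  assert (demo_shape_is_live (&wrapper));

  demo_shape_deleted (&shape);           /* C side goes first */
  if (demo_shape_is_live (&wrapper))
    return 0;

  demo_link (&shape, &wrapper);
  demo_wrapper_finalized (&wrapper);     /* Scheme side goes first */
  return shape.guile_shape == NULL;
}
```

Whichever side disappears first clears the pointer held by the other, so neither structure is ever left pointing at freed memory.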
For full documentation on defining and using foreign object types, see Defining New Foreign Object Types.
Once the details of object representation are decided, writing the primitive function code that you need is usually straightforward.
A primitive is simply a C function whose arguments and return value are all of type SCM, and whose body does whatever you want it to do. As an example, here is a possible implementation of the square? primitive:
static SCM
square_p (SCM shape)
{
  struct dia_guile_shape * guile_shape;

  /* Check that arg is really a shape object. */
  scm_assert_foreign_object_type (shape_type, shape);

  /* Access Scheme-specific shape structure. */
  guile_shape = scm_foreign_object_ref (shape, 0);

  /* Find out if underlying shape exists and is a
     square; return answer as a Scheme boolean. */
  return scm_from_bool (guile_shape->c_shape &&
                        (guile_shape->c_shape->type == DIA_SQUARE));
}
Notice how easy it is to chain through from the SCM shape parameter that square_p receives (which is a foreign object) to the Scheme-specific structure inside the foreign object, and thence to the underlying C structure for the shape.
In this code, scm_assert_foreign_object_type, scm_foreign_object_ref, and scm_from_bool are from the standard Guile API. We assume that shape_type was given to us when we made the shape foreign object type, using scm_make_foreign_object_type. The call to scm_assert_foreign_object_type ensures that shape is indeed a shape. This is needed to guard against Scheme code using the square? procedure incorrectly, as in (square? "hello"); Scheme’s latent typing means that usage errors like this must be caught at run time.
Having written the C code for your primitives, you need to make them available as Scheme procedures by calling the scm_c_define_gsubr function. scm_c_define_gsubr (see Primitive Procedures) takes arguments that specify the Scheme-level name for the primitive and how many required, optional and rest arguments it can accept. The square? primitive always requires exactly one argument, so the call to make it available in Scheme reads like this:
scm_c_define_gsubr ("square?", 1, 0, 0, square_p);
For where to put this call, see the subsection after next on the structure of Guile-enabled code (see Top-level Structure of Guile-enabled Dia).
To make the Guile integration useful, you have to design some kind of hook into your application that application users can use to cause their Scheme code to be evaluated.
Technically, this is straightforward; you just have to decide on a mechanism that is appropriate for your application. Think of Emacs, for example: when you type ESC :, you get a prompt where you can type in any Elisp code, which Emacs will then evaluate. Or, again like Emacs, you could provide a mechanism (such as an init file) to allow Scheme code to be associated with a particular key sequence, and evaluate the code when that key sequence is entered.
In either case, once you have the Scheme code that you want to evaluate, as a null-terminated string, you can tell Guile to evaluate it by calling the scm_c_eval_string function.
Let’s assume that the pre-Guile Dia code looks structurally like this:
main ()
When you add Guile to a program, one (rather technical) requirement is that Guile’s garbage collector needs to know where the bottom of the C stack is. The easiest way to ensure this is to use scm_boot_guile like this:
main ()
scm_boot_guile (argc, argv, inner_main, NULL)
inner_main ()
scm_c_define_gsubr
In other words, you move the guts of what was previously in your main function into a new function called inner_main, and then add a scm_boot_guile call, with inner_main as a parameter, to the end of main.
Assuming that you are using foreign objects and have written primitive code as described in the preceding subsections, you also need to insert calls to declare your new foreign objects and export the primitives to Scheme. These declarations must happen inside the dynamic scope of the scm_boot_guile call, but also before any code is run that could possibly use them; the beginning of inner_main is an ideal place for this.
The steps described so far implement an initial Guile integration that already gives a lot of additional power to Dia application users. But there are further steps that you could take, and it’s interesting to consider a few of these.
In general, you could progressively move more of Dia’s source code from C into Scheme. This might make the code more maintainable and extensible, and it could open the door to new programming paradigms that are tricky to effect in C but straightforward in Scheme.
A specific example of this is that you could use the guile-gtk package, which provides Scheme-level procedures for most of the Gtk+ library, to move the code that lays out and displays Dia objects from C to Scheme.
As you follow this path, it naturally becomes less useful to maintain a distinction between Dia’s original non-Guile-related source code, and its later code implementing foreign objects and primitives for the Scheme world.
For example, suppose that the original source code had a dia_change_fill_pattern function:
void dia_change_fill_pattern (struct dia_shape * shape,
                              struct dia_pattern * pattern)
{
  /* real pattern change work */
}
During initial Guile integration, you add a change_fill_pattern primitive for Scheme purposes, which accesses the underlying structures from its foreign object values and uses dia_change_fill_pattern to do the real work:
SCM change_fill_pattern (SCM shape, SCM pattern)
{
  struct dia_shape * d_shape;
  struct dia_pattern * d_pattern;
  …
  dia_change_fill_pattern (d_shape, d_pattern);
  return SCM_UNSPECIFIED;
}
At this point, it makes sense to keep dia_change_fill_pattern and change_fill_pattern separate, because dia_change_fill_pattern can also be called without going through Scheme at all, say because the user clicks a button which causes a C-registered Gtk+ callback to be called.
But if the code for creating buttons and registering their callbacks is moved into Scheme (using guile-gtk), it may become true that dia_change_fill_pattern can no longer be called other than through Scheme. In that case, it makes sense to abolish it and move its contents directly into change_fill_pattern, like this:
SCM change_fill_pattern (SCM shape, SCM pattern)
{
  struct dia_shape * d_shape;
  struct dia_pattern * d_pattern;
  …
  /* real pattern change work */
  return SCM_UNSPECIFIED;
}
So further Guile integration progressively reduces the amount of functional C code that you have to maintain over the long term.
A similar argument applies to data representation. In the discussion of foreign objects earlier, issues arose because of the different memory management and lifetime models that normally apply to data structures in C and in Scheme. However, with further Guile integration, you can resolve this issue in a more radical way by allowing all your data structures to be under the control of the garbage collector, and kept alive by references from the Scheme world. Instead of maintaining an array or linked list of shapes in C, you would instead maintain a list in Scheme.
Rather like the coalescing of dia_change_fill_pattern and change_fill_pattern, the practical upshot of such a change is that you would no longer have to keep the dia_shape and dia_guile_shape structures separate, and so wouldn’t need to worry about the pointers between them. Instead, you could change the foreign object definition to wrap the dia_shape structure directly, and send dia_guile_shape off to the scrap yard. Cut out the middle man!
Finally, we come to the holy grail of Guile’s free software / extension language approach. Once you have a Scheme representation for interesting Dia data types like shapes, and a handy bunch of primitives for manipulating them, it suddenly becomes clear that you have a bundle of functionality that could have far-ranging use beyond Dia itself. In other words, the data types and primitives could now become a library, and Dia becomes just one of the many possible applications using that library — albeit, at this early stage, a rather important one!
In this model, Guile becomes just the glue that binds everything together. Imagine an application that usefully combined functionality from Dia, Gnumeric and GnuCash — it’s tricky right now, because no such application yet exists; but it’ll happen some day …
Underlying Guile’s value proposition is the assumption that programming in a high level language, specifically Guile’s implementation of Scheme, is necessarily better in some way than programming in C. What do we mean by this claim, and how can we be so sure?
One class of advantages applies not only to Scheme, but more generally to any interpretable, high level, scripting language, such as Emacs Lisp, Python, Ruby, or TeX’s macro language. Common features of all such languages, when compared to C, are that:
In the case of Scheme, particular features that make programming easier — and more fun! — are its powerful mechanisms for abstracting parts of programs (closures — see The Concept of Closure) and for iteration (see Iteration mechanisms).
The evidence in support of this argument is empirical: the huge amount of code that has been written in extension languages for applications that support this mechanism. Most notable are extensions written in Emacs Lisp for GNU Emacs, in TeX’s macro language for TeX, and in Script-Fu for the Gimp, but there is increasingly now a significant code eco-system for Guile-based applications as well, such as Lilypond and GnuCash. It is close to inconceivable that similar amounts of functionality could have been added to these applications just by writing new code in their base implementation languages.
As an example of what this means in practice, imagine writing a testbed for an application that is tested by submitting various requests (via a C interface) and validating the output received. Suppose further that the application keeps an idea of its current state, and that the “correct” output for a given request may depend on the current application state. A complete “white box” test plan for this application would aim to submit all possible requests in each distinguishable state, and validate the output for all request/state combinations.
To write all this test code in C would be very tedious. Suppose instead that the testbed code adds a single new C function, to submit an arbitrary request and return the response, and then uses Guile to export this function as a Scheme procedure. The rest of the testbed can then be written in Scheme, and so benefits from all the advantages of programming in Scheme that were described in the previous section.
(In this particular example, there is an additional benefit of writing most of the testbed in Scheme. A common problem for white box testing is that mistakes and mistaken assumptions in the application under test can easily be reproduced in the testbed code. It is more difficult to copy mistakes like this when the testbed is written in a different language from the application.)
The preceding arguments and example point to a model of Guile programming that is applicable in many cases. According to this model, Guile programming involves a balance between C and Scheme programming, with the aim being to extract the greatest possible Scheme level benefit from the least amount of C level work.
The C level work required in this model usually consists of packaging and exporting functions and application objects such that they can be seen and manipulated on the Scheme level. To help with this, Guile’s C language interface includes utility features that aim to make this kind of integration very easy for the application developer. These features are documented later in this part of the manual: see REFFIXME.
This model, though, is really just one of a range of possible programming options. If all of the functionality that you need is available from Scheme, you could choose instead to write your whole application in Scheme (or one of the other high level languages that Guile supports through translation), and simply use Guile as an interpreter for Scheme. (In the future, we hope that Guile will also be able to compile Scheme code, so lessening the performance gap between C and Scheme code.) Or, at the other end of the C–Scheme scale, you could write the majority of your application in C, and only call out to Guile occasionally for specific actions such as reading a configuration file or executing a user-specified extension. The choices boil down to two basic questions:
These are of course design questions, and the right design for any given application will always depend upon the particular requirements that you are trying to meet. In the context of Guile, however, there are some generally applicable considerations that can help you when designing your answers.
Suppose, for the sake of argument, that you would prefer to write your whole application in Scheme. Then the API available to you consists of:
A module in the last category can either be a pure Scheme module (in other words, a collection of utility procedures coded in Scheme) or a module that provides a Scheme interface to an extension library coded in C (in other words, a nice package where someone else has done the work of wrapping up some useful C code for you). The set of available modules is growing quickly and already includes such useful examples as (gtk gtk), which makes Gtk+ drawing functions available in Scheme, and (database postgres), which provides SQL access to a Postgres database.
Given the growing collection of pre-existing modules, it is quite feasible that your application could be implemented by combining a selection of these modules together with new application code written in Scheme.
If this approach is not enough, because the functionality that your application needs is not already available in this form, and it is impossible to write the new functionality in Scheme, you will need to write some C code. If the required function is already available in C (e.g. in a library), all you need is a little glue to connect it to the world of Guile. If not, you need both to write the basic code and to plumb it into Guile.
In either case, two general considerations are important. Firstly, what is the interface by which the functionality is presented to the Scheme world? Does the interface consist only of function calls (for example, a simple drawing interface), or does it need to include objects of some kind that can be passed between C and Scheme and manipulated by both worlds? Secondly, how do the lifetime and memory management of objects in the C code relate to the garbage-collection-governed approach of Scheme objects? In the case where the basic C code is not already written, most of the difficulties of memory management can be avoided by using Guile’s C interface features from the start.
For the full documentation on writing C code for Guile and connecting existing C code to the Guile world, see REFFIXME.
So far we have considered what Guile programming means for an application developer. But what if you are instead using an existing Guile-based application, and want to know what your options are for programming and extending this application?
The answer to this question varies from one application to another, because the options available depend inevitably on whether the application developer has provided any hooks for you to hang your own code on and, if there are such hooks, what they allow you to do. For example…
In the last two cases, what you can do is, by definition, restricted by the application, and you should refer to the application’s own manual to find out your options.
The most well known example of the first case is Emacs, with its extension language Emacs Lisp: as well as being a text editor, Emacs supports the loading and execution of arbitrary Emacs Lisp code. The result of such openness has been dramatic: Emacs now benefits from user-contributed Emacs Lisp libraries that extend the basic editing function to do everything from reading news to psychoanalysis and playing adventure games. The only limitation is that extensions are restricted to the functionality provided by Emacs’s built-in set of primitive operations. For example, you can interact and display data by manipulating the contents of an Emacs buffer, but you can’t pop-up and draw a window with a layout that is totally different to the Emacs standard.
The situation with a Guile application that supports the loading of arbitrary user code is similar, except perhaps even more so, because Guile also supports the loading of extension libraries written in C. This last point enables user code to add new primitive operations to Guile, and so to bypass the limitation present in Emacs Lisp.
At this point, the distinction between an application developer and an application user becomes rather blurred. Instead of seeing yourself as a user extending an application, you could equally well say that you are developing a new application of your own using some of the primitive functionality provided by the original application. As such, all the discussions of the preceding sections of this chapter are relevant to how you can proceed with developing your extension.
Autoconf, a part of the GNU build system, makes it easy for users to build your package. This section documents Guile’s Autoconf support.
As explained in the GNU Autoconf Manual, any package needs configuration at build-time (see Introduction in The GNU Autoconf Manual). If your package uses Guile (or uses a package that in turn uses Guile), you probably need to know what specific Guile features are available and details about them.
The way to do this is to write feature tests and arrange for their execution by the configure script, typically by adding the tests to configure.ac, and running autoconf to create configure. Users of your package then run configure in the normal way.
Macros are a way to make common feature tests easy to express. Autoconf provides a wide range of macros (see Existing Tests in The GNU Autoconf Manual), and Guile installation provides Guile-specific tests in the areas of: program detection, compilation flags reporting, and Scheme module checks.
As mentioned earlier in this chapter, Guile supports parallel installation, and uses pkg-config to let the user choose which version of Guile they are interested in. pkg-config has its own set of Autoconf macros that are probably installed on almost every development system. The most useful of these macros is PKG_CHECK_MODULES.
PKG_CHECK_MODULES([GUILE], [guile-2.2])
This example looks for Guile and sets the GUILE_CFLAGS and GUILE_LIBS variables accordingly, or prints an error and exits if Guile was not found.
Guile comes with additional Autoconf macros providing more information, installed as prefix/share/aclocal/guile.m4. Their names all begin with GUILE_.
The GUILE_PKG macro runs the pkg-config tool to find development files for an available version of Guile.
By default, this macro will search for the latest stable version of Guile (e.g. 2.2), falling back to the previous stable version (e.g. 2.0) if it is available. If no guile-VERSION.pc file is found, an error is signalled. The found version is stored in GUILE_EFFECTIVE_VERSION.
If GUILE_PROGS was already invoked, this macro ensures that the development files have the same effective version as the Guile program.
GUILE_EFFECTIVE_VERSION is marked for substitution, as by AC_SUBST.
The GUILE_FLAGS macro runs the pkg-config tool to find out how to compile and link programs against Guile. It sets four variables: GUILE_CFLAGS, GUILE_LDFLAGS, GUILE_LIBS, and GUILE_LTLIBS.
GUILE_CFLAGS: flags to pass to a C or C++ compiler to build code that uses Guile header files. This is almost always just one or more -I flags.
GUILE_LDFLAGS: flags to pass to the compiler to link a program against Guile. This includes -lguile-VERSION for the Guile library itself, and may also include one or more -L flags to tell the compiler where to find the libraries. But it does not include flags that influence the program’s runtime search path for libraries, and will therefore lead to a program that fails to start, unless all necessary libraries are installed in a standard location such as /usr/lib.
GUILE_LIBS and GUILE_LTLIBS: flags to pass to the compiler or to libtool, respectively, to link a program against Guile. It includes flags that augment the program’s runtime search path for libraries, so that shared libraries will be found at the location where they were during linking, even in non-standard locations. GUILE_LIBS is to be used when linking the program directly with the compiler, whereas GUILE_LTLIBS is to be used when linking the program is done through libtool.
The variables are marked for substitution, as by AC_SUBST.
The GUILE_SITE_DIR macro looks for Guile’s "site" directories. The variable GUILE_SITE will be set to Guile’s "site" directory for Scheme source files (usually something like PREFIX/share/guile/site). GUILE_SITE_CCACHE will be set to the directory for compiled Scheme files, also known as .go files (usually something like PREFIX/lib/guile/GUILE_EFFECTIVE_VERSION/site-ccache). GUILE_EXTENSION will be set to the directory for compiled C extensions (usually something like PREFIX/lib/guile/GUILE_EFFECTIVE_VERSION/extensions). The latter two are set to blank if the particular version of Guile does not support them. Note that this macro will run the macros GUILE_PKG and GUILE_PROGS if they have not already been run.
The variables are marked for substitution, as by AC_SUBST.
The GUILE_PROGS macro looks for programs guile and guild, setting variables GUILE and GUILD to their paths, respectively.
The macro will attempt to find guile with the suffix of -X.Y, followed by looking for it with the suffix X.Y, and then fall back to looking for guile with no suffix. If guile is still not found, an error is signalled. The suffix, if any, that was required to find guile will be used for guild as well.
By default, this macro will search for the latest stable version of Guile (e.g. 2.2). x.y or x.y.z versions can be specified. If an older version is found, the macro will signal an error.
The effective version of the found guile is set to GUILE_EFFECTIVE_VERSION. This macro ensures that the effective version is compatible with the result of a previous invocation of GUILE_FLAGS, if any.
As a legacy interface, it also looks for guile-config and guile-tools, setting GUILE_CONFIG and GUILE_TOOLS.
The variables are marked for substitution, as by AC_SUBST.
GUILE_CHECK (var, check): var is a shell variable name to be set to the return value. check is a Guile Scheme expression, evaluated with "$GUILE -c", and returning either 0 or non-#f to indicate the check passed. A non-0 number or #f indicates failure. Avoid using the character "#" since that confuses autoconf.
GUILE_MODULE_CHECK (var, module, featuretest, description): var is a shell variable name to be set to "yes" or "no". module is a list of symbols, like: (ice-9 common-list). featuretest is an expression acceptable to GUILE_CHECK, q.v. description is a present-tense verb phrase (passed to AC_MSG_CHECKING).
GUILE_MODULE_AVAILABLE (var, module): var is a shell variable name to be set to "yes" or "no". module is a list of symbols, like: (ice-9 common-list).
GUILE_MODULE_REQUIRED (symlist): symlist is a list of symbols, WITHOUT surrounding parens, like: ice-9 common-list.
GUILE_MODULE_EXPORTS (var, module, modvar): var is a shell variable to be set to "yes" or "no". module is a list of symbols, like: (ice-9 common-list). modvar is the Guile Scheme variable to check.
GUILE_MODULE_REQUIRED_EXPORT (module, modvar): module is a list of symbols, like: (ice-9 common-list). modvar is the Guile Scheme variable to check.
Using the autoconf macros is straightforward: add the macro "calls" (actually instantiations) to configure.ac, run aclocal, and finally, run autoconf. If your system doesn’t have guile.m4 installed, place the desired macro definitions (AC_DEFUN forms) in acinclude.m4, and aclocal will do the right thing.
Some of the macros can be used inside normal shell constructs: if foo ; then GUILE_BAZ ; fi, but this is not guaranteed. It’s probably a good idea to instantiate macros at top-level.
We now include two examples, one simple and one complicated.
The first example is for a package that uses libguile, and thus needs to know how to compile and link against it. So we use PKG_CHECK_MODULES to set the vars GUILE_CFLAGS and GUILE_LIBS, which are automatically substituted in the Makefile.
In configure.ac:
PKG_CHECK_MODULES([GUILE], [guile-2.2])
In Makefile.in:
GUILE_CFLAGS = @GUILE_CFLAGS@
GUILE_LIBS = @GUILE_LIBS@
myprog.o: myprog.c
	$(CC) -c -o $@ $(GUILE_CFLAGS) $<
myprog: myprog.o
	$(CC) -o $@ $< $(GUILE_LIBS)
The second example is for a package of Guile Scheme modules that uses an external program and other Guile Scheme modules (some might call this a "pure scheme" package). So we use the GUILE_SITE_DIR macro, a regular AC_PATH_PROG macro, and the GUILE_MODULE_AVAILABLE macro.
In configure.ac:
GUILE_SITE_DIR
probably_wont_work=""
# pgtype pgtable
GUILE_MODULE_AVAILABLE(have_guile_pg, (database postgres))
test $have_guile_pg = no &&
    probably_wont_work="(my pgtype) (my pgtable) $probably_wont_work"
# gpgutils
AC_PATH_PROG(GNUPG,gpg)
test x"$GNUPG" = x &&
    probably_wont_work="(my gpgutils) $probably_wont_work"
if test ! "$probably_wont_work" = "" ; then
    p=" ***"
    echo
    echo "$p"
    echo "$p NOTE:"
    echo "$p The following modules probably won't work:"
    echo "$p $probably_wont_work"
    echo "$p They can be installed anyway, and will work if their"
    echo "$p dependencies are installed later. Please see README."
    echo "$p"
    echo
fi
In Makefile.in:
instdir = @GUILE_SITE@/my
install:
	$(INSTALL) my/*.scm $(instdir)
Guile provides an application programming interface (API) to developers in two core languages: Scheme and C. This part of the manual contains reference documentation for all of the functionality that is available through both Scheme and C interfaces.
Guile’s application programming interface (API) makes functionality available that an application developer can use in either C or Scheme programming. The interface consists of elements that may be macros, functions or variables in C, and procedures, variables, syntax or other types of object in Scheme.
Many elements are available to both Scheme and C, in a form that is appropriate. For example, the assq Scheme procedure is also available as scm_assq to C code. These elements are documented only once, addressing both the Scheme and C aspects of them.
The Scheme name of an element is related to its C name in a regular way. Also, a C function takes its parameters in a systematic way.
Normally, the name of a C function can be derived given its Scheme name, using some simple textual transformations:
Replace - (hyphen) with _ (underscore).
Replace ? (question mark) with _p.
Replace ! (exclamation point) with _x.
Replace -> with _to_.
Replace <= (less than or equal) with _leq.
Replace >= (greater than or equal) with _geq.
Replace < (less than) with _less.
Replace > (greater than) with _gr.
Prepend scm_.
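The transformations listed above can be sketched as a small, self-contained C function. This is only an illustration of the naming convention, not part of Guile: scheme_to_c_name is a hypothetical helper, and because the rule holds only "normally", some real C names will not follow it. The two-character tokens are tried first so that -> and <= are not split into their single-character parts.

```c
#include <stdio.h>
#include <string.h>

/* Write the conventional C name for SCHEME_NAME into OUT, which has
   room for CAP bytes.  Returns 0 on success, -1 if OUT is too small.  */
static int
scheme_to_c_name (const char *scheme_name, char *out, size_t cap)
{
  /* Longer tokens come first so "->" wins over "-" and "<=" over "<".  */
  static const struct { const char *from; const char *to; } rules[] = {
    { "->", "_to_" }, { "<=", "_leq" }, { ">=", "_geq" },
    { "-", "_" }, { "?", "_p" }, { "!", "_x" },
    { "<", "_less" }, { ">", "_gr" },
  };
  const size_t n_rules = sizeof rules / sizeof rules[0];

  /* Start with the scm_ prefix.  */
  int written = snprintf (out, cap, "scm_");
  if (written < 0 || (size_t) written >= cap)
    return -1;
  size_t used = (size_t) written;

  for (const char *p = scheme_name; *p != '\0'; )
    {
      const char *piece = p;   /* default: copy one character as-is */
      size_t piece_len = 1;
      size_t consumed = 1;

      for (size_t i = 0; i < n_rules; i++)
        {
          size_t from_len = strlen (rules[i].from);
          if (strncmp (p, rules[i].from, from_len) == 0)
            {
              piece = rules[i].to;
              piece_len = strlen (piece);
              consumed = from_len;
              break;
            }
        }

      if (used + piece_len + 1 > cap)   /* keep room for the NUL */
        return -1;
      memcpy (out + used, piece, piece_len);
      used += piece_len;
      p += consumed;
    }

  out[used] = '\0';
  return 0;
}
```

For instance, passing "set-car!" yields scm_set_car_x, and "string->number" yields scm_string_to_number.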
A C function always takes a fixed number of arguments of type SCM, even when the corresponding Scheme function takes a variable number.
For some Scheme functions, some last arguments are optional; the corresponding C function must always be invoked with all optional arguments specified. To get the effect as if an argument has not been specified, pass SCM_UNDEFINED as its value. You cannot do this for an argument in the middle; when one argument is SCM_UNDEFINED all the ones following it must be SCM_UNDEFINED as well.
Some Scheme functions take an arbitrary number of rest arguments; the corresponding C function must be invoked with a list of all these arguments. This list is always the last argument of the C function.
These two variants can also be combined.
The type of the return value of a C function that corresponds to a Scheme function is always SCM. In the descriptions below, types are therefore often omitted, both for the return value and for the arguments.
From time to time functions and other features of Guile become obsolete. Guile’s deprecation is a mechanism that can help you cope with this.
When you use a feature that is deprecated, you will likely get a warning message at run-time. Also, if you have a new enough toolchain, using a deprecated function from libguile will cause a link-time warning.
The primary source for information about just what interfaces are deprecated in a given release is the file NEWS. That file also documents what you should use instead of the obsoleted things.
The file README contains instructions on how to control the inclusion or removal of the deprecated features from the public API of Guile, and how to control the deprecation warning messages.
The idea behind this mechanism is that normally all deprecated interfaces are available, but you get feedback when compiling and running code that uses them, so that you can migrate to the newer APIs at your leisure.
Guile represents all Scheme values with the single C type SCM. For an introduction to this topic, see Dynamic Types.
SCM is the user level abstract C type that is used to represent all of Guile’s Scheme objects, no matter what the Scheme object type is. No C operation except assignment is guaranteed to work with variables of type SCM, so you should only use macros and functions to work with SCM values. Values are converted between C data types and the SCM type with utility functions and macros.
scm_t_bits is an unsigned integral data type that is guaranteed to be large enough to hold all information that is required to represent any Scheme object. While this data type is mostly used to implement Guile’s internals, the use of this type is also necessary to write certain kinds of extensions to Guile.
scm_t_signed_bits is a signed integral type of the same size as scm_t_bits.
SCM_UNPACK (x) transforms the SCM value x into its representation as an integral type. Only after applying SCM_UNPACK is it possible to access the bits and contents of the SCM value.
SCM_PACK (x) takes a valid integral representation of a Scheme object and transforms it into its representation as a SCM value.
Each thread that wants to use functions from the Guile API needs to put itself into guile mode with either scm_with_guile or scm_init_guile. The global state of Guile is initialized automatically when the first thread enters guile mode.
When a thread wants to block outside of a Guile API function, it should leave guile mode temporarily with scm_without_guile; see Blocking in Guile Mode.
Threads that are created by call-with-new-thread or scm_spawn_thread start out in guile mode so you don’t need to initialize them.
Call func, passing it data and return what func returns. While func is running, the current thread is in guile mode and can thus use the Guile API.
When scm_with_guile is called from guile mode, the thread remains in guile mode when scm_with_guile returns.
Otherwise, it puts the current thread into guile mode and, if needed, gives it a Scheme representation that is contained in the list returned by all-threads, for example. This Scheme representation is not removed when scm_with_guile returns so that a given thread is always represented by the same Scheme value during its lifetime, if at all.
When this is the first thread that enters guile mode, the global state of Guile is initialized before calling func.
The function func is called via scm_with_continuation_barrier; thus, scm_with_guile returns exactly once.
When scm_with_guile returns, the thread is no longer in guile mode (except when scm_with_guile was called from guile mode, see above). Thus, only func can store SCM variables on the stack and be sure that they are protected from the garbage collector. See scm_init_guile for another approach to initializing Guile that does not have this restriction.
It is OK to call scm_with_guile while a thread has temporarily left guile mode via scm_without_guile. It will then simply temporarily enter guile mode again.
Arrange things so that all of the code in the current thread executes as if from within a call to scm_with_guile. That is, all functions called by the current thread can assume that SCM values on their stack frames are protected from the garbage collector (except when the thread has explicitly left guile mode, of course).
When scm_init_guile is called from a thread that already has been in guile mode once, nothing happens. This behavior matters when you call scm_init_guile while the thread has only temporarily left guile mode: in that case the thread will not be in guile mode after scm_init_guile returns. Thus, you should not use scm_init_guile in such a scenario.
When an uncaught throw happens in a thread that has been put into guile mode via scm_init_guile, a short message is printed to the current error port and the thread is exited via scm_pthread_exit (NULL). No restrictions are placed on continuations.
The function scm_init_guile might not be available on all platforms since it requires some stack-bounds-finding magic that might not have been ported to all platforms that Guile runs on. Thus, if you can, it is better to use scm_with_guile or its variation scm_boot_guile instead of this function.
Enter guile mode as with scm_with_guile and call main_func, passing it data, argc, and argv as indicated. When main_func returns, scm_boot_guile calls exit (0); scm_boot_guile never returns. If you want some other exit value, have main_func call exit itself. If you don’t want to exit at all, use scm_with_guile instead of scm_boot_guile.
The function scm_boot_guile arranges for the Scheme command-line function to return the strings given by argc and argv. If main_func modifies argc or argv, it should call scm_set_program_arguments with the final list, so Scheme code will know which arguments have been processed (see Runtime Environment).
scm_shell (argc, argv): Process command-line arguments in the manner of the guile executable. This includes loading the normal Guile initialization files, interacting with the user or running any scripts or expressions specified by -s or -e options, and then exiting. See Invoking Guile, for more details.
Since this function does not return, you must do all application-specific initialization before calling this function.
The following macros do two different things: when compiled normally, they expand in one way; when processed during snarfing, they cause the guile-snarf program to pick up some initialization code; see Function Snarfing.
The descriptions below use the term ‘normally’ to refer to the case when the code is compiled normally, and ‘while snarfing’ when the code is processed by guile-snarf.
Normally, SCM_SNARF_INIT expands to nothing; while snarfing, it causes code to be included in the initialization action file, followed by a semicolon.
This is the fundamental macro for snarfing initialization actions. The more specialized macros below use it internally.
Normally, the SCM_DEFINE macro expands into
static const char s_c_name[] = scheme_name;
SCM c_name arglist
While snarfing, it causes
scm_c_define_gsubr (s_c_name, req, opt, var, c_name);
to be added to the initialization actions. Thus, you can use it to declare a C function named c_name that will be made available to Scheme with the name scheme_name.
Note that the arglist argument must have parentheses around it.
Normally, these macros expand into
static SCM c_name
or
SCM c_name
respectively. While snarfing, they both expand into the initialization code
c_name = scm_permanent_object (scm_from_locale_symbol (scheme_name));
Thus, you can use them to declare a static or global variable of type
SCM
that will be initialized to the symbol named
scheme_name.
Normally, these macros expand into
static SCM c_name
or
SCM c_name
respectively. While snarfing, they both expand into the initialization code
c_name = scm_permanent_object (scm_c_make_keyword (scheme_name));
Thus, you can use them to declare a static or global variable of type
SCM
that will be initialized to the keyword named
scheme_name.
These macros are equivalent to SCM_VARIABLE_INIT
and
SCM_GLOBAL_VARIABLE_INIT
, respectively, with a value of
SCM_BOOL_F
.
Normally, these macros expand into
static SCM c_name
or
SCM c_name
respectively. While snarfing, they both expand into the initialization code
c_name = scm_permanent_object (scm_c_define (scheme_name, value));
Thus, you can use them to declare a static or global C variable of type
SCM
that will be initialized to the object representing the
Scheme variable named scheme_name in the current module. The
variable will be defined when it doesn’t already exist. It is always
set to value.
Guile’s data types form a powerful built-in library of representations and functionality that you can apply to your problem domain. This chapter surveys the data types built-in to Guile, from the simple to the complex.
The two boolean values are #t
for true and #f
for false.
They can also be written as #true
and #false
, as per R7RS.
Boolean values are returned by predicate procedures, such as the general
equality predicates eq?
, eqv?
and equal?
(see Equality) and numerical and string comparison operators like
string=?
(see String Comparison) and <=
(see Comparison Predicates).
(<= 3 8) ⇒ #t
(<= 3 -3) ⇒ #f
(equal? "house" "houses") ⇒ #f
(eq? #f #f) ⇒ #t
In test condition contexts like if
and cond
(see Simple Conditional Evaluation), where a group of subexpressions will be
evaluated only if a condition expression evaluates to “true”,
“true” means any value at all except #f
.
(if #t "yes" "no") ⇒ "yes"
(if 0 "yes" "no") ⇒ "yes"
(if #f "yes" "no") ⇒ "no"
A result of this asymmetry is that typical Scheme source code more often
uses #f
explicitly than #t
: #f
is necessary to
represent an if
or cond
false value, whereas #t
is
not necessary to represent an if
or cond
true value.
It is important to note that #f
is not equivalent to any
other Scheme value. In particular, #f
is not the same as the
number 0 (like in C and C++), and not the same as the “empty list”
(like in some Lisp dialects).
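The point that only #f counts as false can be checked directly. The following is a minimal sketch; truthy? is a hypothetical helper introduced here for illustration, not a Guile procedure.

```scheme
;; Hypothetical helper: collapse any value to the boolean it acts as
;; in a test position such as `if` or `cond`.
(define (truthy? x)
  (if x #t #f))

(truthy? #f)    ; ⇒ #f, the only false value
(truthy? 0)     ; ⇒ #t, unlike zero in C
(truthy? '())   ; ⇒ #t, unlike the empty list in some Lisp dialects
(truthy? "")    ; ⇒ #t
```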
In C, the two Scheme boolean values are available as the two constants
SCM_BOOL_T
for #t
and SCM_BOOL_F
for #f
.
Care must be taken with the false value SCM_BOOL_F
: it is not
false when used in C conditionals. In order to test for it, use
scm_is_false
or scm_is_true
.
Return #t
if obj is either #t
or #f
, else
return #f
.
The SCM
representation of the Scheme object #t
.
The SCM
representation of the Scheme object #f
.
Return 0
if obj is #f
, else return 1
.
Return 1
if obj is #f
, else return 0
.
Return 1
if obj is either #t
or #f
, else
return 0
.
Return #f
if val is 0
, else return #t
.
Return 1
if val is SCM_BOOL_T
, return 0
when val is SCM_BOOL_F
, else signal a ‘wrong type’ error.
You should probably use scm_is_true
instead of this function
when you just want to test a SCM
value for trueness.
Guile supports a rich “tower” of numerical types — integer, rational, real and complex — and provides an extensive set of mathematical and scientific functions for operating on numerical data. This section of the manual documents those types and functions.
You may also find it illuminating to read R5RS’s presentation of numbers in Scheme, which is particularly clear and accessible: see Numbers in R5RS.
Scheme’s numerical “tower” consists of the following categories of numbers:
integers
Whole numbers, positive or negative; e.g. -5, 0, 18.
rationals
The set of numbers that can be expressed as p/q where p and q are integers; e.g. 9/16 works, but pi (an irrational number) doesn’t. These include integers (n/1).
real numbers
The set of numbers that describes all possible positions along a one-dimensional line. This includes rationals as well as irrational numbers.
complex numbers
The set of numbers that describes all possible positions in a two-dimensional space. This includes real as well as imaginary numbers (a+bi, where a is the real part, b is the imaginary part, and i is the square root of -1).
It is called a tower because each category “sits on” the one that follows it, in the sense that every integer is also a rational, every rational is also real, and every real number is also a complex number (but with zero imaginary part).
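The nesting of the categories can be observed with the standard predicates. This is a small sketch; tower-membership is a hypothetical helper name chosen for the example.

```scheme
;; Hypothetical helper: report membership in each level of the tower,
;; in the order integer, rational, real, complex.
(define (tower-membership x)
  (list (integer? x) (rational? x) (real? x) (complex? x)))

(tower-membership 18)     ; ⇒ (#t #t #t #t)
(tower-membership 9/16)   ; ⇒ (#f #t #t #t)
(tower-membership 1+2i)   ; ⇒ (#f #f #f #t)
```

Every value that satisfies one predicate also satisfies all the predicates to its right, which is the sense in which each category sits on the next.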
In addition to the classification into integers, rationals, reals and
complex numbers, Scheme also distinguishes between whether a number is
represented exactly or not. For example, the result of
2*sin(pi/4) is exactly 2^(1/2), but Guile
can represent neither pi/4 nor 2^(1/2) exactly.
Instead, it stores an inexact approximation, using the C type
double
.
Guile can represent exact rationals of any magnitude, inexact
rationals that fit into a C double
, and inexact complex numbers
with double
real and imaginary parts.
The number?
predicate may be applied to any Scheme value to
discover whether the value is any of the supported numerical types.
Return #t
if obj is any kind of number, else #f
.
For example:
(number? 3) ⇒ #t
(number? "hello there!") ⇒ #f
(define pi 3.141592654)
(number? pi) ⇒ #t
This is equivalent to scm_is_true (scm_number_p (obj))
.
The next few subsections document each of Guile’s numerical data types in detail.
Integers are whole numbers, that is numbers with no fractional part, such as 2, 83, and -3789.
Integers in Guile can be arbitrarily big, as shown by the following example.
(define (factorial n)
  (let loop ((n n) (product 1))
    (if (= n 0)
        product
        (loop (- n 1) (* product n)))))

(factorial 3) ⇒ 6
(factorial 20) ⇒ 2432902008176640000
(- (factorial 45)) ⇒ -119622220865480194561963161495657715064383733760000000000
Readers whose background is in programming languages where integers are limited by the need to fit into just 4 or 8 bytes of memory may find this surprising, or suspect that Guile’s representation of integers is inefficient. In fact, Guile achieves a near optimal balance of convenience and efficiency by using the host computer’s native representation of integers where possible, and a more general representation where the required number does not fit in the native form. Conversion between these two representations is automatic and completely invisible to the Scheme level programmer.
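As a brief illustration of that invisible switch, the sketch below computes a value far beyond any 8-byte integer and shows that it stays exact. The variable name big is chosen only for this example.

```scheme
;; 2^100 does not fit in a native machine word, yet it is still exact.
(define big (expt 2 100))

big                 ; ⇒ 1267650600228229401496703205376
(exact? big)        ; ⇒ #t
(- (+ big 1) big)   ; ⇒ 1, no precision is lost
```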
C has a host of different integer types, and Guile offers a host of
functions to convert between them and the SCM
representation.
For example, a C int
can be handled with scm_to_int
and
scm_from_int
. Guile also defines a few C integer types of its
own, to help with differences between systems.
C integer types that are not covered can be handled with the generic
scm_to_signed_integer
and scm_from_signed_integer
for
signed types, or with scm_to_unsigned_integer
and
scm_from_unsigned_integer
for unsigned types.
Scheme integers can be exact and inexact. For example, a number
written as 3.0
with an explicit decimal-point is inexact, but
it is also an integer. The functions integer?
and
scm_is_integer
report true for such a number, but the functions
exact-integer?
, scm_is_exact_integer
,
scm_is_signed_integer
, and scm_is_unsigned_integer
only
allow exact integers and thus report false. Likewise, the conversion
functions like scm_to_signed_integer
only accept exact
integers.
The motivation for this behavior is that the inexactness of a number
should not be lost silently. If you want to allow inexact integers,
you can explicitly insert a call to inexact->exact
or to its C
equivalent scm_inexact_to_exact
. (Only inexact integers will
be converted by this call into exact integers; inexact non-integers
will become exact fractions.)
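The distinction can be sketched as follows. The helper to-exact-integer is hypothetical and shows one way a program might accept inexact integers explicitly.

```scheme
;; Hypothetical helper: accept exact or inexact integers and return an
;; exact integer; anything else is rejected with an error.
(define (to-exact-integer x)
  (if (integer? x)
      (inexact->exact x)
      (error "not an integer:" x)))

(integer? 3.0)            ; ⇒ #t
(exact-integer? 3.0)      ; ⇒ #f
(to-exact-integer 3.0)    ; ⇒ 3
(inexact->exact 2.5)      ; ⇒ 5/2, a non-integer becomes an exact fraction
```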
Return #t
if x is an exact or inexact integer number, else
return #f
.
(integer? 487) ⇒ #t
(integer? 3.0) ⇒ #t
(integer? -3.4) ⇒ #f
(integer? +inf.0) ⇒ #f
This is equivalent to scm_is_true (scm_integer_p (x))
.
Return #t
if x is an exact integer number, else
return #f
.
(exact-integer? 37) ⇒ #t
(exact-integer? 3.0) ⇒ #f
This is equivalent to scm_is_true (scm_exact_integer_p (x))
.
The C types are equivalent to the corresponding ISO C types but are
defined on all platforms, with the exception of scm_t_int64
and
scm_t_uint64
, which are only defined when a 64-bit type is
available. For example, scm_t_int8
is equivalent to
int8_t
.
You can regard these definitions as a stop-gap measure until all platforms provide these types. If you know that all the platforms that you are interested in already provide these types, it is better to use them directly instead of the types provided by Guile.
Return 1
when x represents an exact integer that is
between min and max, inclusive.
These functions can be used to check whether a SCM
value will
fit into a given range, such as the range of a given C integer type.
If you just want to convert a SCM
value to a given C integer
type, use one of the conversion functions directly.
When x represents an exact integer that is between min and max inclusive, return that integer. Else signal an error, either a ‘wrong-type’ error when x is not an exact integer, or an ‘out-of-range’ error when it doesn’t fit the given range.
Return the SCM
value that represents the integer x. This
function will always succeed and will always return an exact number.
When x represents an exact integer that fits into the indicated C type, return that integer. Else signal an error, either a ‘wrong-type’ error when x is not an exact integer, or an ‘out-of-range’ error when it doesn’t fit the given range.
The functions scm_to_long_long
, scm_to_ulong_long
,
scm_to_int64
, and scm_to_uint64
are only available when
the corresponding types are.
Return the SCM
value that represents the integer x.
These functions will always succeed and will always return an exact
number.
Assign val to the multiple precision integer rop.
val must be an exact integer, otherwise an error will be
signalled. rop must have been initialized with mpz_init
before this function is called. When rop is no longer needed
the occupied space must be freed with mpz_clear
.
See Initializing Integers in GNU MP Manual, for details.
Return the SCM
value that represents val.
Mathematically, the real numbers are the set of numbers that describe all possible points along a continuous, infinite, one-dimensional line. The rational numbers are the set of all numbers that can be written as fractions p/q, where p and q are integers. All rational numbers are also real, but there are real numbers that are not rational, for example the square root of 2, and pi.
Guile can represent both exact and inexact rational numbers, but it
cannot represent precise finite irrational numbers. Exact rationals are
represented by storing the numerator and denominator as two exact
integers. Inexact rationals are stored as floating point numbers using
the C type double
.
Exact rationals are written as a fraction of integers. There must be no whitespace around the slash:
1/2
-22/7
Even though the actual encoding of inexact rationals is in binary, it may be helpful to think of it as a decimal number with a limited number of significant figures and a decimal point somewhere, since this corresponds to the standard notation for non-whole numbers. For example:
0.34
-0.00000142857931198
-5648394822220000000000.0
4.0
The limited precision of Guile’s encoding means that any finite “real”
number in Guile can be written in a rational form, by multiplying and
then dividing by sufficient powers of 10 (or in fact, 2). For example,
‘-0.00000142857931198’ is the same as -142857931198 divided
by 100000000000000000. In Guile’s current incarnation, therefore, the
rational?
and real?
predicates are equivalent for finite
numbers.
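A short sketch of this equivalence, using a value that happens to be exactly representable in binary floating point:

```scheme
;; A finite inexact real is also rational, and it has an exact
;; rational form.
(rational? 0.375)        ; ⇒ #t
(real? 0.375)            ; ⇒ #t
(inexact->exact 0.375)   ; ⇒ 3/8
```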
Dividing by an exact zero leads to an error message, as one might expect. However, dividing by an inexact zero does not produce an error. Instead, the result of the division is either plus or minus infinity, depending on the sign of the divided number and the sign of the zero divisor (some platforms support signed zeroes ‘-0.0’ and ‘+0.0’; ‘0.0’ is the same as ‘+0.0’).
Dividing zero by an inexact zero yields a NaN (‘not a number’)
value, although NaN values are actually considered numbers by Scheme.
Attempts to compare a NaN value with any number (including
itself) using =, <, >, <= or >= always return #f. Although a NaN
value is not = to itself, it is both eqv? and equal? to itself
and other NaN values. However, the preferred way to test for
them is by using nan?.
The real NaN values and infinities are written ‘+nan.0’,
‘+inf.0’ and ‘-inf.0’. This syntax is also recognized by
read
as an extension to the usual Scheme syntax. These special
values are considered by Scheme to be inexact real numbers but not
rational. Note that non-real complex numbers may also contain
infinities or NaN values in their real or imaginary parts. To
test a real number to see if it is infinite, a NaN value, or
neither, use inf?
, nan?
, or finite?
, respectively.
Every real number in Scheme belongs to precisely one of those three
classes.
On platforms that follow IEEE 754 for their floating point
arithmetic, the ‘+inf.0’, ‘-inf.0’, and ‘+nan.0’ values
are implemented using the corresponding IEEE 754 values.
They behave in arithmetic operations as IEEE 754 describes;
for example, (= +nan.0 +nan.0) ⇒ #f.
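The behavior of the special values can be sketched as below, assuming a platform with IEEE 754 arithmetic. The variable names are chosen for this example only.

```scheme
;; Division by an inexact zero yields an infinity or a NaN, not an error.
(define pos-inf (/ 1 0.0))           ; +inf.0
(define neg-inf (/ -1 0.0))          ; -inf.0
(define not-a-number (/ 0.0 0.0))    ; a NaN value

(inf? pos-inf)                       ; ⇒ #t
(nan? not-a-number)                  ; ⇒ #t
(finite? 1.5)                        ; ⇒ #t
(= not-a-number not-a-number)        ; ⇒ #f
(eqv? not-a-number not-a-number)     ; ⇒ #t
(rational? pos-inf)                  ; ⇒ #f, real but not rational
```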
Return #t
if obj is a real number, else #f
. Note
that the sets of integer and rational values form subsets of the set
of real numbers, so the predicate will also be fulfilled if obj
is an integer number or a rational number.
Return #t
if x is a rational number, #f
otherwise.
Note that the set of integer values forms a subset of the set of
rational numbers, i.e. the predicate will also be fulfilled if
x is an integer number.
Returns the simplest rational number differing from x by no more than eps.
As required by R5RS, rationalize
only returns an
exact result when both its arguments are exact. Thus, you might need
to use inexact->exact
on the arguments.
(rationalize (inexact->exact 1.2) 1/100) ⇒ 6/5
Return #t
if the real number x is ‘+inf.0’ or
‘-inf.0’. Otherwise return #f
.
Return #t
if the real number x is ‘+nan.0’, or
#f
otherwise.
Return #t
if the real number x is neither infinite nor a
NaN, #f
otherwise.
Return the numerator of the rational number x.
Return the denominator of the rational number x.
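For example, exact rationals are kept in lowest terms, so numerator and denominator report the reduced fraction:

```scheme
;; 6/4 is read as the reduced fraction 3/2.
(numerator 6/4)      ; ⇒ 3
(denominator 6/4)    ; ⇒ 2
(denominator 7)      ; ⇒ 1, an integer n is the rational n/1
```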
Equivalent to scm_is_true (scm_real_p (val))
and
scm_is_true (scm_rational_p (val))
, respectively.
Returns the number closest to val that is representable as a
double
. Returns infinity for a val that is too large in
magnitude. The argument val must be a real number.
Return the SCM
value that represents val. The returned
value is inexact according to the predicate inexact?
, but it
will be exactly equal to val.
Complex numbers are the set of numbers that describe all possible points in a two-dimensional space. The two coordinates of a particular point in this space are known as the real and imaginary parts of the complex number that describes that point.
In Guile, complex numbers are written in rectangular form as the sum of
their real and imaginary parts, using the symbol i
to indicate
the imaginary part.
3+4i ⇒ 3.0+4.0i
(* 3-8i 2.3+0.3i) ⇒ 9.3-17.5i
Polar form can also be used, with an ‘@’ between magnitude and angle,
1@3.141592 ⇒ -1.0 (approx)
-1@1.57079 ⇒ 0.0-1.0i (approx)
Guile represents a complex number as a pair of inexact reals, so the real and imaginary parts of a complex number have the same properties of inexactness and limited precision as single inexact real numbers.
Note that each part of a complex number may contain any inexact real value, including the special values ‘+nan.0’, ‘+inf.0’ and ‘-inf.0’, as well as either of the signed zeroes ‘0.0’ or ‘-0.0’.
Return #t
if z is a complex number, #f
otherwise. Note that the sets of real, rational and integer
values form subsets of the set of complex numbers, i.e. the
predicate will also be fulfilled if z is a real,
rational or integer number.
Equivalent to scm_is_true (scm_complex_p (val))
.
R5RS requires that, with few exceptions, a calculation involving inexact
numbers always produces an inexact result. To meet this requirement,
Guile distinguishes between an exact integer value such as ‘5’ and
the corresponding inexact integer value which, to the limited precision
available, has no fractional part, and is printed as ‘5.0’. Guile
will only convert the latter value to the former when forced to do so by
an invocation of the inexact->exact
procedure.
The only exception to the above requirement is when the values of the
inexact numbers do not affect the result. For example (expt n 0)
is ‘1’ for any value of n
, therefore (expt 5.0 0)
is
permitted to return an exact ‘1’.
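A minimal sketch of how inexactness propagates through a calculation, and of the explicit conversion back:

```scheme
;; Exact operands give an exact result.
(+ 1/2 1/3)                        ; ⇒ 5/6
(exact? (+ 1/2 1/3))               ; ⇒ #t

;; A single inexact operand makes the result inexact, even when the
;; value has no fractional part.
(+ 1/2 0.5)                        ; ⇒ 1.0
(exact? (+ 1/2 0.5))               ; ⇒ #f

;; Only an explicit conversion turns it back into an exact integer.
(inexact->exact (+ 1/2 0.5))       ; ⇒ 1
```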
Return #t
if the number z is exact, #f
otherwise.
(exact? 2) ⇒ #t
(exact? 0.5) ⇒ #f
(exact? (/ 2)) ⇒ #t
Return a 1
if the number z is exact, and 0
otherwise. This is equivalent to scm_is_true (scm_exact_p (z))
.
An alternate approach to testing the exactness of a number is to
use scm_is_signed_integer or scm_is_unsigned_integer.
Return #t
if the number z is inexact, #f
else.
Return a 1
if the number z is inexact, and 0
otherwise. This is equivalent to scm_is_true (scm_inexact_p (z))
.
Return an exact number that is numerically closest to z, when there is one. For inexact rationals, Guile returns the exact rational that is numerically equal to the inexact rational. Inexact complex numbers with a non-zero imaginary part can not be made exact.
(inexact->exact 0.5) ⇒ 1/2
The following happens because 12/10 is not exactly representable as a
double
(on most platforms). However, when reading a decimal
number that has been marked exact with the “#e” prefix, Guile is
able to represent it correctly.
(inexact->exact 1.2) ⇒ 5404319552844595/4503599627370496
#e1.2 ⇒ 6/5
Convert the number z to its inexact representation.
The read syntax for integers is a string of digits, optionally preceded by a minus or plus character, a code indicating the base in which the integer is encoded, and a code indicating whether the number is exact or inexact. The supported base codes are:
#b
#B
the integer is written in binary (base 2)
#o
#O
the integer is written in octal (base 8)
#d
#D
the integer is written in decimal (base 10)
#x
#X
the integer is written in hexadecimal (base 16)
If the base code is omitted, the integer is assumed to be decimal. The following examples show how these base codes are used.
-13 ⇒ -13
#d-13 ⇒ -13
#x-13 ⇒ -19
#b+1101 ⇒ 13
#o377 ⇒ 255
The codes for indicating exactness (which can, incidentally, be applied to all numerical values) are:
#e
#E
the number is exact
#i
#I
the number is inexact.
If the exactness indicator is omitted, the number is exact unless it contains a radix point. Since Guile can not represent exact complex numbers, an error is signalled when asking for them.
(exact? 1.2) ⇒ #f
(exact? #e1.2) ⇒ #t
(exact? #e+1i)
ERROR: Wrong type argument
Guile also understands the syntax ‘+inf.0’ and ‘-inf.0’ for plus and minus infinity, respectively. The value must be written exactly as shown, that is, they always must have a sign and exactly one zero digit after the decimal point. It also understands ‘+nan.0’ and ‘-nan.0’ for the special ‘not-a-number’ value. The sign is ignored for ‘not-a-number’ and the value is always printed as ‘+nan.0’.
Return #t
if n is an odd number, #f
otherwise.
Return #t
if n is an even number, #f
otherwise.
Return the quotient or remainder from n divided by d. The quotient is rounded towards zero, and the remainder will have the same sign as n. In all cases quotient and remainder satisfy n = q*d + r.
(remainder 13 4) ⇒ 1
(remainder -13 4) ⇒ -1
See also truncate-quotient
, truncate-remainder
and
related operations in Arithmetic Functions.
Return the remainder from n divided by d, with the same sign as d.
(modulo 13 4) ⇒ 1
(modulo -13 4) ⇒ 3
(modulo 13 -4) ⇒ -3
(modulo -13 -4) ⇒ -1
See also floor-quotient
, floor-remainder
and
related operations in Arithmetic Functions.
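The relationship n = q*d + r between quotient and remainder, and the different sign convention of modulo, can be checked with a short sketch. The helper division-holds? is hypothetical.

```scheme
;; Hypothetical helper: verify n = q*d + r for quotient and remainder.
(define (division-holds? n d)
  (= n (+ (* (quotient n d) d)
          (remainder n d))))

(quotient -13 4)          ; ⇒ -3, rounded toward zero
(remainder -13 4)         ; ⇒ -1, same sign as n
(modulo -13 4)            ; ⇒ 3, same sign as d
(division-holds? -13 4)   ; ⇒ #t
```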
Return the greatest common divisor of all arguments. If called without arguments, 0 is returned.
The C function scm_gcd
always takes two arguments, while the
Scheme function can take an arbitrary number.
Return the least common multiple of the arguments. If called without arguments, 1 is returned.
The C function scm_lcm
always takes two arguments, while the
Scheme function can take an arbitrary number.
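A brief sketch of the Scheme-level procedures with varying numbers of arguments:

```scheme
(gcd 12 18)       ; ⇒ 6
(gcd 12 18 27)    ; ⇒ 3
(gcd)             ; ⇒ 0
(lcm 4 6)         ; ⇒ 12
(lcm 4 6 10)      ; ⇒ 60
(lcm)             ; ⇒ 1
```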
Return n raised to the integer exponent k, modulo m.
(modulo-expt 2 3 5) ⇒ 3
Return two exact non-negative integers s and r such that k = s^2 + r and s^2 <= k < (s + 1)^2. An error is raised if k is not an exact non-negative integer.
(exact-integer-sqrt 10) ⇒ 3 and 1
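Because exact-integer-sqrt returns two values, a caller needs a form such as call-with-values to receive them. A minimal sketch, with integer-sqrt-parts as a hypothetical helper name:

```scheme
;; Hypothetical helper: collect the two return values s and r in a list.
(define (integer-sqrt-parts k)
  (call-with-values
      (lambda () (exact-integer-sqrt k))
    list))

(integer-sqrt-parts 10)   ; ⇒ (3 1), since 10 = 3^2 + 1
(integer-sqrt-parts 16)   ; ⇒ (4 0), a perfect square
```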
The C comparison functions below always take two arguments, while the
Scheme functions can take an arbitrary number. Also keep in mind that
the C functions return one of the Scheme boolean values
SCM_BOOL_T or SCM_BOOL_F, which are both true as far as C
is concerned. Thus, always write scm_is_true (scm_num_eq_p (x, y))
when testing the two Scheme numbers x and y for equality, for example.
Return #t
if all parameters are numerically equal.
Return #t
if the list of parameters is monotonically
increasing.
Return #t
if the list of parameters is monotonically
decreasing.
Return #t
if the list of parameters is monotonically
non-decreasing.
Return #t
if the list of parameters is monotonically
non-increasing.
Return #t
if z is an exact or inexact number equal to
zero.
Return #t
if x is an exact or inexact number greater than
zero.
Return #t
if x is an exact or inexact number less than
zero.
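The multi-argument forms make range checks compact. In the sketch below, in-range? is a hypothetical helper built on <=.

```scheme
(< 1 2 3)        ; ⇒ #t
(< 1 3 2)        ; ⇒ #f
(<= 1 1 2)       ; ⇒ #t
(= 2 2.0 4/2)    ; ⇒ #t, exactness does not affect numerical equality
(zero? 0.0)      ; ⇒ #t
(negative? 0)    ; ⇒ #f

;; Hypothetical helper: test lo <= x <= hi in one comparison.
(define (in-range? lo x hi)
  (<= lo x hi))

(in-range? 0 5 10)    ; ⇒ #t
(in-range? 0 11 10)   ; ⇒ #f
```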
The following procedures read and write numbers according to their
external representation as defined by R5RS (see R5RS Lexical Structure in The Revised^5 Report on the Algorithmic
Language Scheme). See the (ice-9 i18n) module for locale-dependent number parsing.
Return a string holding the external representation of the number n in the given radix. If n is inexact, a radix of 10 will be used.
Return a number of the maximally precise representation
expressed by the given string. radix must be an
exact integer, either 2, 8, 10, or 16. If supplied, radix
is a default radix that may be overridden by an explicit radix
prefix in string (e.g. "#o177"). If radix is not
supplied, then the default radix is 10. If string is not a
syntactically valid notation for a number, then
string->number
returns #f
.
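The two conversions can be sketched as follows. The helper parse-number-or is hypothetical; it relies on string->number returning #f for invalid input.

```scheme
(number->string 255 16)          ; ⇒ "ff"
(number->string 10 2)            ; ⇒ "1010"
(string->number "ff" 16)         ; ⇒ 255
(string->number "#o177" 16)      ; ⇒ 127, the prefix overrides the radix
(string->number "12abc")         ; ⇒ #f

;; Hypothetical helper: fall back to a default when parsing fails.
;; This is safe even for "0", because 0 is a true value in Scheme.
(define (parse-number-or str default)
  (or (string->number str) default))

(parse-number-or "42" -1)        ; ⇒ 42
(parse-number-or "0" -1)         ; ⇒ 0
(parse-number-or "n/a" -1)       ; ⇒ -1
```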
As per string->number
above, but taking a C string, as pointer
and length. The string characters should be in the current locale
encoding (locale
in the name refers only to that, there’s no
locale-dependent parsing).
Return a complex number constructed of the given real-part and imaginary-part parts.
Return the complex number mag * e^(i * ang).
Return the real part of the number z.
Return the imaginary part of the number z.
Return the magnitude of the number z. This is the same as
abs
for real arguments, but also allows complex numbers.
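A short sketch of building a complex number and taking it apart again. The variable z is chosen for the example; the results are inexact because Guile stores both parts as inexact reals.

```scheme
(define z (make-rectangular 3 4))   ; 3.0+4.0i

(real-part z)     ; ⇒ 3.0
(imag-part z)     ; ⇒ 4.0
(magnitude z)     ; ⇒ 5.0
(magnitude -5)    ; ⇒ 5, same as abs for a real argument
```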
Like scm_make_rectangular
or scm_make_polar
,
respectively, but these functions take double
s as their
arguments.
Returns the real or imaginary part of z as a double
.
Returns the magnitude or angle of z as a double
.
The C arithmetic functions below always take two arguments, while the
Scheme functions can take an arbitrary number. When you need to
invoke them with just one argument, for example to compute the
equivalent of (- x), pass SCM_UNDEFINED as the second
one: scm_difference (x, SCM_UNDEFINED).
Return the sum of all parameter values. Return 0 if called without any parameters.
If called with one argument z1, -z1 is returned. Otherwise the sum of all but the first argument is subtracted from the first argument.
Return the product of all arguments. If called without arguments, 1 is returned.
Divide the first argument by the product of the remaining arguments. If called with one argument z1, 1/z1 is returned.
Return the absolute value of x.
x must be a number with zero imaginary part. To calculate the
magnitude of a complex number, use magnitude
instead.
Return the maximum of all parameter values.
Return the minimum of all parameter values.
Round the inexact number x towards zero.
Round the inexact number x to the nearest integer. When exactly halfway between two integers, round to the even one.
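The difference between the two rounding modes, including the round-to-even rule for ties, can be sketched as:

```scheme
(truncate 2.7)     ; ⇒ 2.0
(truncate -2.7)    ; ⇒ -2.0, toward zero

(round 2.5)        ; ⇒ 2.0, tie goes to the even integer
(round 3.5)        ; ⇒ 4.0
(round -2.5)       ; ⇒ -2.0

;; The results are inexact integers; convert explicitly if an exact
;; integer is needed.
(inexact->exact (round 3.5))   ; ⇒ 4
```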
Like scm_truncate_number
or scm_round_number
,
respectively, but these functions take and return double
values.
These procedures accept two real numbers x and y, where the
divisor y must be non-zero. euclidean-quotient
returns the
integer q and euclidean-remainder
returns the real number
r such that x = q*y + r and
0 <= r < |y|. euclidean/
returns both q and
r, and is more efficient than computing each separately. Note
that when y > 0, euclidean-quotient
returns
floor(x/y), otherwise it returns
ceiling(x/y).
Note that these operators are equivalent to the R6RS operators
div
, mod
, and div-and-mod
.
(euclidean-quotient 123 10) ⇒ 12
(euclidean-remainder 123 10) ⇒ 3
(euclidean/ 123 10) ⇒ 12 and 3
(euclidean/ 123 -10) ⇒ -12 and 3
(euclidean/ -123 10) ⇒ -13 and 7
(euclidean/ -123 -10) ⇒ 13 and 7
(euclidean/ -123.2 -63.5) ⇒ 2.0 and 3.8
(euclidean/ 16/3 -10/7) ⇒ -3 and 22/21
These procedures accept two real numbers x and y, where the
divisor y must be non-zero. floor-quotient
returns the
integer q and floor-remainder
returns the real number
r such that q = floor(x/y) and
x = q*y + r. floor/
returns
both q and r, and is more efficient than computing each
separately. Note that r, if non-zero, will have the same sign
as y.
When x and y are integers, floor-remainder
is
equivalent to the R5RS integer-only operator modulo
.
(floor-quotient 123 10) ⇒ 12
(floor-remainder 123 10) ⇒ 3
(floor/ 123 10) ⇒ 12 and 3
(floor/ 123 -10) ⇒ -13 and -7
(floor/ -123 10) ⇒ -13 and 7
(floor/ -123 -10) ⇒ 12 and -3
(floor/ -123.2 -63.5) ⇒ 1.0 and -59.7
(floor/ 16/3 -10/7) ⇒ -4 and -8/21
These procedures accept two real numbers x and y, where the
divisor y must be non-zero. ceiling-quotient
returns the
integer q and ceiling-remainder
returns the real number
r such that q = ceiling(x/y) and
x = q*y + r. ceiling/
returns
both q and r, and is more efficient than computing each
separately. Note that r, if non-zero, will have the opposite sign
of y.
(ceiling-quotient 123 10) ⇒ 13
(ceiling-remainder 123 10) ⇒ -7
(ceiling/ 123 10) ⇒ 13 and -7
(ceiling/ 123 -10) ⇒ -12 and 3
(ceiling/ -123 10) ⇒ -12 and -3
(ceiling/ -123 -10) ⇒ 13 and 7
(ceiling/ -123.2 -63.5) ⇒ 2.0 and 3.8
(ceiling/ 16/3 -10/7) ⇒ -3 and 22/21
These procedures accept two real numbers x and y, where the
divisor y must be non-zero. truncate-quotient
returns the
integer q and truncate-remainder
returns the real number
r such that q is x/y rounded toward zero,
and x = q*y + r. truncate/
returns
both q and r, and is more efficient than computing each
separately. Note that r, if non-zero, will have the same sign
as x.
When x and y are integers, these operators are
equivalent to the R5RS integer-only operators quotient
and
remainder
.
(truncate-quotient 123 10) ⇒ 12
(truncate-remainder 123 10) ⇒ 3
(truncate/ 123 10) ⇒ 12 and 3
(truncate/ 123 -10) ⇒ -12 and 3
(truncate/ -123 10) ⇒ -12 and -3
(truncate/ -123 -10) ⇒ 12 and -3
(truncate/ -123.2 -63.5) ⇒ 1.0 and -59.7
(truncate/ 16/3 -10/7) ⇒ -3 and 22/21
These procedures accept two real numbers x and y, where the
divisor y must be non-zero. centered-quotient
returns the
integer q and centered-remainder
returns the real number
r such that x = q*y + r and
-|y/2| <= r < |y/2|. centered/
returns both q and r, and is more efficient than computing
each separately.
Note that centered-quotient
returns x/y
rounded to the nearest integer. When x/y lies
exactly half-way between two integers, the tie is broken according to
the sign of y. If y > 0, ties are rounded toward
positive infinity, otherwise they are rounded toward negative infinity.
This is a consequence of the requirement that
-|y/2| <= r < |y/2|.
Note that these operators are equivalent to the R6RS operators
div0
, mod0
, and div0-and-mod0
.
(centered-quotient 123 10) ⇒ 12
(centered-remainder 123 10) ⇒ 3
(centered/ 123 10) ⇒ 12 and 3
(centered/ 123 -10) ⇒ -12 and 3
(centered/ -123 10) ⇒ -12 and -3
(centered/ -123 -10) ⇒ 12 and -3
(centered/ 125 10) ⇒ 13 and -5
(centered/ 127 10) ⇒ 13 and -3
(centered/ 135 10) ⇒ 14 and -5
(centered/ -123.2 -63.5) ⇒ 2.0 and 3.8
(centered/ 16/3 -10/7) ⇒ -4 and -8/21
These procedures accept two real numbers x and y, where the
divisor y must be non-zero. round-quotient
returns the
integer q and round-remainder
returns the real number
r such that x = q*y + r and
q is x/y rounded to the nearest integer,
with ties going to the nearest even integer. round/
returns both q and r, and is more efficient than computing
each separately.
Note that round/
and centered/
are almost equivalent, but
their behavior differs when x/y lies exactly half-way
between two integers. In this case, round/
chooses the nearest
even integer, whereas centered/
chooses in such a way to satisfy
the constraint -|y/2| <= r < |y/2|, which
is stronger than the corresponding constraint for round/
,
-|y/2| <= r <= |y/2|. In particular,
when x and y are integers, the number of possible remainders
returned by centered/
is |y|, whereas the number of
possible remainders returned by round/
is |y|+1 when
y is even.
(round-quotient 123 10) ⇒ 12
(round-remainder 123 10) ⇒ 3
(round/ 123 10) ⇒ 12 and 3
(round/ 123 -10) ⇒ -12 and 3
(round/ -123 10) ⇒ -12 and -3
(round/ -123 -10) ⇒ 12 and -3
(round/ 125 10) ⇒ 12 and 5
(round/ 127 10) ⇒ 13 and -3
(round/ 135 10) ⇒ 14 and -5
(round/ -123.2 -63.5) ⇒ 2.0 and 3.8
(round/ 16/3 -10/7) ⇒ -4 and -8/21
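All six operator families share the identity x = q*y + r and differ only in how q is chosen. The following sketch compares them side by side; the helper names are hypothetical, and exact arguments are used so that the identity can be tested with =.

```scheme
;; Hypothetical helper: check x = q*y + r for one two-valued operator.
(define (division-identity-holds? div/ x y)
  (call-with-values
      (lambda () (div/ x y))
    (lambda (q r)
      (= x (+ (* q y) r)))))

(define division-operators
  (list euclidean/ floor/ ceiling/ truncate/ centered/ round/))

;; Hypothetical helper: collect (q . r) for each operator in turn.
(define (quotients-and-remainders x y)
  (map (lambda (div/)
         (call-with-values (lambda () (div/ x y)) cons))
       division-operators))

(quotients-and-remainders -123 10)
;; ⇒ ((-13 . 7) (-13 . 7) (-12 . -3) (-12 . -3) (-12 . -3) (-12 . -3))
```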
The following procedures accept any kind of number as arguments, including complex numbers.
Return the square root of z. Of the two possible roots (positive and negative), the one with a positive real part is returned, or if that’s zero then a positive imaginary part. Thus,
(sqrt 9.0) ⇒ 3.0
(sqrt -9.0) ⇒ 0.0+3.0i
(sqrt 1.0+1.0i) ⇒ 1.09868411346781+0.455089860562227i
(sqrt -1.0-1.0i) ⇒ 0.455089860562227-1.09868411346781i
Return z1 raised to the power of z2.
Return the sine of z.
Return the cosine of z.
Return the tangent of z.
Return the arcsine of z.
Return the arccosine of z.
Return e to the power of z, where e is the base of natural logarithms (2.71828…).
Return the natural logarithm of z.
Return the base 10 logarithm of z.
Return the hyperbolic sine of z.
Return the hyperbolic cosine of z.
Return the hyperbolic tangent of z.
Return the hyperbolic arcsine of z.
Return the hyperbolic arccosine of z.
Return the hyperbolic arctangent of z.
For the following bitwise functions, negative numbers are treated as infinite precision twos-complements. For instance -6 is bits …111010, with infinitely many ones on the left. It can be seen that adding 6 (binary 110) to such a bit pattern gives all zeros.
Return the bitwise AND of the integer arguments.
(logand) ⇒ -1
(logand 7) ⇒ 7
(logand #b111 #b011 #b001) ⇒ 1
Return the bitwise OR of the integer arguments.
(logior) ⇒ 0
(logior 7) ⇒ 7
(logior #b000 #b001 #b011) ⇒ 3
Return the bitwise XOR of the integer arguments. A bit is set in the result if it is set in an odd number of arguments.
(logxor) ⇒ 0
(logxor 7) ⇒ 7
(logxor #b000 #b001 #b011) ⇒ 2
(logxor #b000 #b001 #b011 #b011) ⇒ 1
Return the integer which is the ones-complement of the integer argument, i.e. each 0 bit is changed to 1 and each 1 bit to 0.
(number->string (lognot #b10000000) 2) ⇒ "-10000001"
(number->string (lognot #b0) 2) ⇒ "-1"
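The infinite-precision twos-complement view can be checked directly. The following is a small sketch using only logand, lognot and number->string; the helper lognot-by-arithmetic is a hypothetical name used for illustration.

```scheme
;; Masking -6 with eight one-bits exposes the low byte of ...111010.
(number->string (logand -6 #xFF) 2)            ; ⇒ "11111010"

;; Hypothetical helper: flipping every bit of an integer n is the
;; same as computing -1 - n.
(define (lognot-by-arithmetic n)
  (- -1 n))

(= (lognot 128) (lognot-by-arithmetic 128))    ; ⇒ #t
(= (lognot -6) (lognot-by-arithmetic -6))      ; ⇒ #t
```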
Test whether j and k have any 1 bits in common. This is equivalent to (not (zero? (logand j k))), but without actually calculating the logand, just testing for non-zero.
(logtest #b0100 #b1011) ⇒ #f
(logtest #b0100 #b0111) ⇒ #t
Test whether bit number index in j is set. index starts from 0 for the least significant bit.
(logbit? 0 #b1101) ⇒ #t
(logbit? 1 #b1101) ⇒ #f
(logbit? 2 #b1101) ⇒ #t
(logbit? 3 #b1101) ⇒ #t
(logbit? 4 #b1101) ⇒ #f
Return floor(n * 2^count). n and count must be exact integers.
With n viewed as an infinite-precision twos-complement integer, ash means a left shift introducing zero bits when count is positive, or a right shift dropping bits when count is negative. This is an “arithmetic” shift.
(number->string (ash #b1 3) 2) ⇒ "1000"
(number->string (ash #b1010 -1) 2) ⇒ "101"
;; -23 is bits ...11101001, -6 is bits ...111010
(ash -23 -2) ⇒ -6
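Because a right shift with ash rounds toward negative infinity, it differs from truncating division for negative numbers. A brief sketch comparing it with quotient and floor-quotient:

```scheme
;; A right shift by 2 divides by 4 and takes the floor.
(ash -23 -2)             ; ⇒ -6
(floor-quotient -23 4)   ; ⇒ -6  same result
(quotient -23 4)         ; ⇒ -5  quotient truncates toward zero instead

;; A left shift by 3 multiplies by 8.
(ash 5 3)                ; ⇒ 40
```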
Return round(n * 2^count). n and count must be exact integers.
With n viewed as an infinite-precision twos-complement integer, round-ash means a left shift introducing zero bits when count is positive, or a right shift rounding to the nearest integer (with ties going to the nearest even integer) when count is negative. This is a rounded “arithmetic” shift.
(number->string (round-ash #b1 3) 2) ⇒ "1000"
(number->string (round-ash #b1010 -1) 2) ⇒ "101"
(number->string (round-ash #b1010 -2) 2) ⇒ "10"
(number->string (round-ash #b1011 -2) 2) ⇒ "11"
(number->string (round-ash #b1101 -2) 2) ⇒ "11"
(number->string (round-ash #b1110 -2) 2) ⇒ "100"
Return the number of bits in integer n. If n is positive, the 1-bits in its binary representation are counted. If negative, the 0-bits in its two’s-complement binary representation are counted. If zero, 0 is returned.
(logcount #b10101010) ⇒ 4
(logcount 0) ⇒ 0
(logcount -2) ⇒ 1
Return the number of bits necessary to represent n.
For positive n this is how many bits to the most significant one bit. For negative n it’s how many bits to the most significant zero bit in twos complement form.
(integer-length #b10101010) ⇒ 8
(integer-length #b1111) ⇒ 4
(integer-length 0) ⇒ 0
(integer-length -1) ⇒ 0
(integer-length -256) ⇒ 8
(integer-length -257) ⇒ 9
Return n raised to the power k. k must be an exact integer, n can be any number.
Negative k is supported, and results in 1/n^abs(k) in the usual way. n^0 is 1, as usual, and that includes 0^0, which is also 1.
(integer-expt 2 5) ⇒ 32
(integer-expt -3 3) ⇒ -27
(integer-expt 5 -3) ⇒ 1/125
(integer-expt 0 0) ⇒ 1
Return the integer composed of the start (inclusive) through end (exclusive) bits of n. The startth bit becomes the 0-th bit in the result.
(number->string (bit-extract #b1101101010 0 4) 2) ⇒ "1010"
(number->string (bit-extract #b1101101010 4 9) 2) ⇒ "10110"
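For a non-negative n, the effect of bit-extract can be sketched with a shift and a mask. The helper bit-extract-by-shift below is a hypothetical illustration, not how Guile implements bit-extract.

```scheme
;; Hypothetical helper: shift the start-th bit down to position 0,
;; then keep only the low (end - start) bits.
(define (bit-extract-by-shift n start end)
  (logand (ash n (- start))
          (- (ash 1 (- end start)) 1)))

(number->string (bit-extract-by-shift #b1101101010 4 9) 2)   ; ⇒ "10110"

(= (bit-extract #b1101101010 4 9)
   (bit-extract-by-shift #b1101101010 4 9))                  ; ⇒ #t
```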
Pseudo-random numbers are generated from a random state object, which can be created with seed->random-state or datum->random-state. An external representation (i.e. one which can be written with write and read with read) of a random state object can be obtained via random-state->datum. The state parameter to the various functions below is optional; it defaults to the state object in the *random-state* variable.
Return a copy of the random state state.
Return a number in [0, n).
Accepts a positive integer or real n and returns a number of the same type between zero (inclusive) and n (exclusive). The values returned have a uniform distribution.
Return an inexact real in an exponential distribution with mean 1. For an exponential distribution with mean u use (* u (random:exp)).
Fills vect with inexact real random numbers the sum of whose squares is equal to 1.0. Thinking of vect as coordinates in space of dimension n = (vector-length vect), the coordinates are uniformly distributed over the surface of the unit n-sphere.
Return an inexact real in a normal distribution. The distribution used has mean 0 and standard deviation 1. For a normal distribution with mean m and standard deviation d use (+ m (* d (random:normal))).
Fills vect with inexact real random numbers that are independent and standard normally distributed (i.e., with mean 0 and variance 1).
Fills vect with inexact real random numbers the sum of whose squares is less than 1.0. Thinking of vect as coordinates in space of dimension n = (vector-length vect), the coordinates are uniformly distributed within the unit n-sphere.
Return a uniformly distributed inexact real random number in [0,1).
Return a new random state using seed.
Return a new random state from datum, which should have been obtained by random-state->datum.
Return a datum representation of state that may be written out and read back with the Scheme reader.
Construct a new random state seeded from a platform-specific source of entropy, appropriate for use in non-security-critical applications. Currently /dev/urandom is tried first, or else the seed is based on the time, date, process ID, an address from a freshly allocated heap cell, an address from the local stack frame, and a high-resolution timer if available.
The global random state used by the above functions when the state parameter is not given.
Note that the initial value of *random-state* is the same every time Guile starts up. Therefore, if you don’t pass a state parameter to the above procedures, and you don’t set *random-state* to (seed->random-state your-seed), where your-seed is something that isn’t the same every time, you’ll get the same sequence of “random” numbers on every run.
For example, unless the relevant source code has changed, (map random (cdr (iota 19))), if the first use of random numbers since Guile started up, will always give:
(map random (cdr (iota 19))) ⇒ (0 1 1 2 2 2 1 2 6 7 10 0 5 3 12 5 5 12)
To seed the random state in a sensible way for non-security-critical applications, do this during initialization of your program:
(set! *random-state* (random-state-from-platform))
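When reproducible numbers are wanted, for instance in tests, an explicit state can be passed instead. The sketch below assumes an arbitrary seed of 42; the helper sample-sequence and the variable names are hypothetical.

```scheme
;; Hypothetical helper: draw five integers in [0, 100) from STATE.
(define (sample-sequence state)
  (map (lambda (i) (random 100 state)) (iota 5)))

;; Two states built from the same seed produce the same sequence.
(define state-a (seed->random-state 42))
(define state-b (seed->random-state 42))
(define same-seed-agrees?
  (equal? (sample-sequence state-a) (sample-sequence state-b)))

;; A state can be saved as a datum and restored later; the restored
;; state continues exactly where the saved one left off.
(define saved (random-state->datum state-a))
(define restored (datum->random-state saved))
(define restored-agrees?
  (equal? (sample-sequence state-a) (sample-sequence restored)))
```

Both same-seed-agrees? and restored-agrees? should be #t, since a random state fully determines the numbers drawn from it.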
In Scheme, there is a data type to describe a single character.
Defining what exactly a character is can be more complicated than it seems. Guile follows the advice of R6RS and uses The Unicode Standard to help define what a character is. So, for Guile, a character is anything in the Unicode Character Database.
The Unicode Character Database is basically a table of characters indexed using integers called ‘code points’. Valid code points are in the ranges 0 to #xD7FF inclusive or #xE000 to #x10FFFF inclusive, which is about 1.1 million code points.
Any code point that has been assigned to a character or that has otherwise been given a meaning by Unicode is called a ‘designated code point’. Most of the designated code points, about 200,000 of them, indicate characters, accents or other combining marks that modify other characters, symbols, whitespace, and control characters. Some are not characters but indicators that suggest how to format or display neighboring characters.
If a code point is not a designated code point (that is, if it has not been assigned to a character by The Unicode Standard), it is a ‘reserved code point’, meaning that it is reserved for future use. Most of the code points, about 800,000, are reserved code points.
By convention, a Unicode code point is written as “U+XXXX” where “XXXX” is a hexadecimal number. Please note that this convenient notation is not valid code. Guile does not interpret “U+XXXX” as a character.
In Scheme, a character literal is written as #\name where name is the name of the character that you want. Printable characters have their usual single character name; for example, #\a is a lower case a.
Some of the code points are ‘combining characters’ that are not meant to be printed by themselves but are instead meant to modify the appearance of the previous character. For combining characters, an alternate form of the character literal is #\ followed by U+25CC (a small, dotted circle), followed by the combining character. This allows the combining character to be drawn on the circle, not on the backslash of #\.
Many of the non-printing characters, such as whitespace characters and control characters, also have names.
The most commonly used non-printing characters have long character names, described in the table below.
Character Name | Codepoint |
#\nul | U+0000 |
#\alarm | U+0007 |
#\backspace | U+0008 |
#\tab | U+0009 |
#\linefeed | U+000A |
#\newline | U+000A |
#\vtab | U+000B |
#\page | U+000C |
#\return | U+000D |
#\esc | U+001B |
#\space | U+0020 |
#\delete | U+007F |
There are also short names for all of the “C0 control characters” (those with code points below 32). The following table lists the short name for each character.
0 = #\nul | 1 = #\soh | 2 = #\stx | 3 = #\etx |
4 = #\eot | 5 = #\enq | 6 = #\ack | 7 = #\bel |
8 = #\bs | 9 = #\ht | 10 = #\lf | 11 = #\vt |
12 = #\ff | 13 = #\cr | 14 = #\so | 15 = #\si |
16 = #\dle | 17 = #\dc1 | 18 = #\dc2 | 19 = #\dc3 |
20 = #\dc4 | 21 = #\nak | 22 = #\syn | 23 = #\etb |
24 = #\can | 25 = #\em | 26 = #\sub | 27 = #\esc |
28 = #\fs | 29 = #\gs | 30 = #\rs | 31 = #\us |
32 = #\sp |
The short name for the “delete” character (code point U+007F) is #\del.
The R7RS name for the “escape” character (code point U+001B) is #\escape.
There are also a few alternative names left over for compatibility with previous versions of Guile.
Alternate | Standard |
#\nl | #\newline |
#\np | #\page |
#\null | #\nul |
Characters may also be written using their code point values. They can be written as an octal number, such as #\10 for #\bs or #\177 for #\del.
If one prefers hex to octal, there is an additional syntax for character escapes: #\xHHHH, the letter ‘x’ followed by a hexadecimal number of one to eight digits.
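The different spellings of a character literal all denote the same character, which a few comparisons can illustrate. This is a small sketch using only the names listed in the tables above.

```scheme
;; Hex and octal spellings of the letter A (code point 65).
(char=? #\A #\x41)             ; ⇒ #t
(char=? #\A #\101)             ; ⇒ #t  (octal 101 = 65)

;; Long, short and alternative names for the same control characters.
(char=? #\newline #\linefeed)  ; ⇒ #t
(char=? #\newline #\lf)        ; ⇒ #t
(char=? #\newline #\nl)        ; ⇒ #t
(char=? #\esc #\escape)        ; ⇒ #t

(char->integer #\del)          ; ⇒ 127
```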
Fundamentally, the character comparison operations below are numeric comparisons of the character’s code points.
Return #t if the code point of x is equal to the code point of y, else #f.
Return #t if the code point of x is less than the code point of y, else #f.
Return #t if the code point of x is less than or equal to the code point of y, else #f.
Return #t if the code point of x is greater than the code point of y, else #f.
Return #t if the code point of x is greater than or equal to the code point of y, else #f.
Case-insensitive character comparisons use Unicode case folding. In case folding comparisons, if a character is lowercase and has an uppercase form that can be expressed as a single character, it is converted to uppercase before comparison. All other characters undergo no conversion before the comparison occurs. This includes the German sharp S (Eszett), which is not uppercased before comparison because its uppercase form has two characters. Unicode case folding is language independent: it uses rules that are generally true, but it cannot cover all cases for all languages.
Return #t if the case-folded code point of x is the same as the case-folded code point of y, else #f.
Return #t if the case-folded code point of x is less than the case-folded code point of y, else #f.
Return #t if the case-folded code point of x is less than or equal to the case-folded code point of y, else #f.
Return #t if the case-folded code point of x is greater than the case-folded code point of y, else #f.
Return #t if the case-folded code point of x is greater than or equal to the case-folded code point of y, else #f.
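The contrast between code point comparison and case-folded comparison can be sketched with the standard procedures char=?, char<?, char-ci=? and char-ci<?:

```scheme
;; Plain comparisons look only at code points: A is 65, B is 66, a is 97.
(char<? #\A #\a)      ; ⇒ #t
(char=? #\a #\A)      ; ⇒ #f
(char<? #\a #\B)      ; ⇒ #f  (97 is not less than 66)

;; Case-folded comparisons ignore the difference in case.
(char-ci=? #\a #\A)   ; ⇒ #t
(char-ci<? #\a #\B)   ; ⇒ #t
```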
Return #t if chr is alphabetic, else #f.
Return #t if chr is numeric, else #f.
Return #t if chr is whitespace, else #f.
Return #t if chr is uppercase, else #f.
Return #t if chr is lowercase, else #f.
Return #t if chr is either uppercase or lowercase, else #f.
Return a symbol giving the two-letter name of the Unicode general category assigned to chr or #f if no named category is assigned. The following table provides a list of category names along with their meanings.
Lu | Uppercase letter | Pf | Final quote punctuation |
Ll | Lowercase letter | Po | Other punctuation |
Lt | Titlecase letter | Sm | Math symbol |
Lm | Modifier letter | Sc | Currency symbol |
Lo | Other letter | Sk | Modifier symbol |
Mn | Non-spacing mark | So | Other symbol |
Mc | Combining spacing mark | Zs | Space separator |
Me | Enclosing mark | Zl | Line separator |
Nd | Decimal digit number | Zp | Paragraph separator |
Nl | Letter number | Cc | Control |
No | Other number | Cf | Format |
Pc | Connector punctuation | Cs | Surrogate |
Pd | Dash punctuation | Co | Private use |
Ps | Open punctuation | Cn | Unassigned |
Pe | Close punctuation | ||
Pi | Initial quote punctuation |
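As a quick illustration of the table, the following sketch queries the general category of a few ASCII characters with char-general-category:

```scheme
(char-general-category #\a)        ; ⇒ Ll  lowercase letter
(char-general-category #\A)        ; ⇒ Lu  uppercase letter
(char-general-category #\7)        ; ⇒ Nd  decimal digit number
(char-general-category #\space)    ; ⇒ Zs  space separator
(char-general-category #\newline)  ; ⇒ Cc  control
```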
Return the code point of chr.
Return the character that has code point n. The integer n must be a valid code point. Valid code points are in the ranges 0 to #xD7FF inclusive or #xE000 to #x10FFFF inclusive.
Return the uppercase character version of chr.
Return the lowercase character version of chr.
Return the titlecase character version of chr if one exists; otherwise return the uppercase version.
For most characters these will be the same, but the Unicode Standard includes certain digraph compatibility characters, such as U+01F3 “dz”, for which the uppercase and titlecase characters are different (U+01F1 “DZ” and U+01F2 “Dz” in this case, respectively).
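The digraph case mentioned above can be checked with char-upcase and char-titlecase. In this sketch the variable name dz is a hypothetical choice, and the character is built from its code point to avoid depending on the source file encoding.

```scheme
;; U+01F3, the lowercase "dz" digraph.
(define dz (integer->char #x01F3))

(= (char->integer (char-upcase dz)) #x01F1)      ; ⇒ #t  uppercase "DZ"
(= (char->integer (char-titlecase dz)) #x01F2)   ; ⇒ #t  titlecase "Dz"

;; For an ordinary letter the two conversions coincide.
(char=? (char-upcase #\a) (char-titlecase #\a))  ; ⇒ #t
```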
These C functions take an integer representation of a Unicode codepoint and return the codepoint corresponding to its uppercase, lowercase, and titlecase forms respectively. The type scm_t_wchar is a signed, 32-bit integer.
Characters also have “formal names”, which are defined by Unicode. These names can be accessed in Guile from the (ice-9 unicode) module:
(use-modules (ice-9 unicode))
Return the formal all-upper-case Unicode name of ch, as a string, or #f if the character has no name.
Return the character whose formal all-upper-case Unicode name is name, or #f if no such character is known.
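A short sketch of both directions, assuming the procedure names char->formal-name and formal-name->char exported by (ice-9 unicode):

```scheme
(use-modules (ice-9 unicode))

(char->formal-name #\a)                      ; ⇒ "LATIN SMALL LETTER A"
(formal-name->char "LATIN SMALL LETTER A")   ; ⇒ #\a

;; The two procedures are inverses for named characters.
(define (name-round-trips? ch)
  (char=? ch (formal-name->char (char->formal-name ch))))

(name-round-trips? #\Z)                      ; ⇒ #t
```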
The features described in this section correspond directly to SRFI-14.
The data type char-set implements sets of characters (see Characters). Because the internal representation of character sets is not visible to the user, many procedures are provided for handling them.
Character sets can be created, extended, tested for the membership of a character, and compared to other character sets.
Use these procedures for testing whether an object is a character set, or whether several character sets are equal or subsets of each other. char-set-hash can be used for calculating a hash value, for example for use in fast lookup procedures.
Return #t if obj is a character set, #f otherwise.
Return #t if all given character sets are equal.
Return #t if every character set char_seti is a subset of character set char_seti+1.
Compute a hash value for the character set cs. If bound is given and non-zero, it restricts the returned value to the range 0 … bound - 1.
Character set cursors are a means for iterating over the members of a character set. After creating a character set cursor with char-set-cursor, a cursor can be dereferenced with char-set-ref and advanced to the next member with char-set-cursor-next. Whether a cursor has moved past the last element of the set can be checked with end-of-char-set?.
Additionally, mapping and (un-)folding procedures for character sets are provided.
Return a cursor into the character set cs.
Return the character at the current cursor position cursor in the character set cs. It is an error to pass a cursor for which end-of-char-set? returns true.
Advance the character set cursor cursor to the next character in the character set cs. It is an error if the cursor given satisfies end-of-char-set?.
Return #t if cursor has reached the end of a character set, #f otherwise.
Fold the procedure kons over the character set cs, initializing it with knil.
char-set-unfold is a fundamental constructor for character sets.
char-set-unfold! is the corresponding fundamental constructor that adds the generated characters to an existing character set.
Apply proc to every character in the character set cs. The return value is not specified.
Map the procedure proc over every character in cs. proc must be a character -> character procedure.
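The cursor protocol and the folding and mapping procedures can be sketched as follows. The helper names char-set->list/cursor and char-set-size/fold, and the example set vowels, are hypothetical; Guile already provides char-set->list and char-set-size for real use.

```scheme
(define vowels (char-set #\a #\e #\i #\o #\u))

;; Hypothetical helper: walk a character set with a cursor.
;; The current character is read before the cursor is advanced, so the
;; loop does not depend on whether advancing modifies the cursor.
(define (char-set->list/cursor cs)
  (let loop ((cursor (char-set-cursor cs)) (acc '()))
    (if (end-of-char-set? cursor)
        (reverse acc)
        (let ((ch (char-set-ref cs cursor)))
          (loop (char-set-cursor-next cs cursor) (cons ch acc))))))

;; Hypothetical helper: count members by folding over the set.
(define (char-set-size/fold cs)
  (char-set-fold (lambda (ch count) (+ count 1)) 0 cs))

;; Map a character -> character procedure over the set.
(define upper-vowels (char-set-map char-upcase vowels))

(length (char-set->list/cursor vowels))   ; ⇒ 5
(char-set-size/fold vowels)               ; ⇒ 5
(char-set-contains? upper-vowels #\E)     ; ⇒ #t
```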
New character sets are produced with these procedures.
Return a newly allocated character set containing all characters in cs.
Return a character set containing all given characters.
Convert the character list list to a character set. If the character set base_cs is given, the characters in this set are also included in the result.
Convert the character list list to a character set. The characters are added to base_cs and base_cs is returned.
Convert the string str to a character set. If the character set base_cs is given, the characters in this set are also included in the result.
Convert the string str to a character set. The characters from the string are added to base_cs, and base_cs is returned.
Return a character set containing every character from cs so that it satisfies pred. If provided, the characters from base_cs are added to the result.
Return a character set containing every character from cs so that it satisfies pred. The characters are added to base_cs and base_cs is returned.
Return a character set containing all characters whose character codes lie in the half-open range [lower,upper).
If error is a true value, an error is signalled if the specified range contains characters which are not contained in the implemented character range. If error is #f, these characters are silently left out of the resulting character set.
The characters in base_cs are added to the result, if given.
Return a character set containing all characters whose character codes lie in the half-open range [lower,upper).
If error is a true value, an error is signalled if the specified range contains characters which are not contained in the implemented character range. If error is #f, these characters are silently left out of the resulting character set.
The characters are added to base_cs and base_cs is returned.
Coerces x into a char-set. x may be a string, character or char-set. A string is converted to the set of its constituent characters; a character is converted to a singleton set; a char-set is returned as-is.
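The constructors above can be compared side by side. This is a minimal sketch using the SRFI-14 names char-set, list->char-set, string->char-set, char-set-filter, ucs-range->char-set and ->char-set; the variable names are hypothetical.

```scheme
(define abc (char-set #\a #\b #\c))

;; Several routes to the same set.
(char-set= abc (list->char-set '(#\a #\b #\c)))   ; ⇒ #t
(char-set= abc (string->char-set "abc"))          ; ⇒ #t
(char-set= abc (ucs-range->char-set 97 100))      ; ⇒ #t  code points 97, 98, 99
(char-set= abc (->char-set "abc"))                ; ⇒ #t

;; Duplicates in the source string collapse: h, e, l, o.
(char-set-size (string->char-set "hello"))        ; ⇒ 4

;; Keep only the members that satisfy a predicate.
(define digits-only
  (char-set-filter char-numeric? (string->char-set "a1b2c3")))
(char-set= digits-only (string->char-set "123"))  ; ⇒ #t
```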
Access the elements and other information of a character set with these procedures.
Returns an association list containing debugging information for cs. The association list has the following entries.
char-set
The char-set itself
len
The number of groups of contiguous code points the char-set contains
ranges
A list of lists where each sublist is a range of code points and their associated characters
The return value of this function cannot be relied upon to be consistent between versions of Guile and should not be used in code.
Return the number of elements in character set cs.
Return the number of elements in the character set cs which satisfy the predicate pred.
Return a list containing the elements of the character set cs.
Return a string containing the elements of the character set cs. The order in which the characters are placed in the string is not defined.
Return #t if the character ch is contained in the character set cs, or #f otherwise.
Return a true value if every character in the character set cs satisfies the predicate pred.
Return a true value if any character in the character set cs satisfies the predicate pred.
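A brief sketch of the query procedures on a small example set (the variable name sample is hypothetical). Because the order of char-set->list is not relied upon here, the list is sorted before it is compared.

```scheme
(define sample (string->char-set "abc123"))

(char-set-size sample)                      ; ⇒ 6
(char-set-count char-numeric? sample)       ; ⇒ 3
(char-set-contains? sample #\b)             ; ⇒ #t
(char-set-contains? sample #\z)             ; ⇒ #f
(char-set-every char-alphabetic? sample)    ; ⇒ #f
(char-set-any char-numeric? sample)         ; ⇒ #t

(sort (char-set->list sample) char<?)       ; ⇒ (#\1 #\2 #\3 #\a #\b #\c)
```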
Character sets can be manipulated with the common set algebra operations, such as union, complement, and intersection. All of these procedures provide side-effecting variants, which modify their character set argument(s).
Add all character arguments to the first argument, which must be a character set.
Delete all character arguments from the first argument, which must be a character set.
Add all character arguments to the first argument, which must be a character set.
Delete all character arguments from the first argument, which must be a character set.
Return the complement of the character set cs.
Note that the complement of a character set is likely to contain many reserved code points (code points that are not associated with characters). It may be helpful to modify the output of char-set-complement by computing its intersection with the set of designated code points, char-set:designated.
Return the union of all argument character sets.
Return the intersection of all argument character sets.
Return the difference of all argument character sets.
Return the exclusive-or of all argument character sets.
Return the difference and the intersection of all argument character sets.
Return the complement of the character set cs.
Return the union of all argument character sets.
Return the intersection of all argument character sets.
Return the difference of all argument character sets.
Return the exclusive-or of all argument character sets.
Return the difference and the intersection of all argument character sets.
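The non-destructive algebra procedures can be sketched on two small overlapping sets. The variable names abc, bcd and not-abc are hypothetical; the last part follows the advice above about intersecting a complement with char-set:designated.

```scheme
(define abc (char-set #\a #\b #\c))
(define bcd (char-set #\b #\c #\d))

(char-set= (char-set-union abc bcd)        (string->char-set "abcd"))  ; ⇒ #t
(char-set= (char-set-intersection abc bcd) (string->char-set "bc"))    ; ⇒ #t
(char-set= (char-set-difference abc bcd)   (char-set #\a))             ; ⇒ #t
(char-set= (char-set-xor abc bcd)          (string->char-set "ad"))    ; ⇒ #t
(char-set= (char-set-adjoin abc #\d)       (string->char-set "abcd"))  ; ⇒ #t
(char-set= (char-set-delete abc #\a)       (string->char-set "bc"))    ; ⇒ #t

;; Complement restricted to designated code points.
(define not-abc
  (char-set-intersection (char-set-complement abc) char-set:designated))
(char-set-contains? not-abc #\d)   ; ⇒ #t
(char-set-contains? not-abc #\a)   ; ⇒ #f
```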
Several predefined character set variables exist to make the character set data type and its procedures convenient to use.
These character sets are locale independent and are not recomputed upon a setlocale call. They contain characters from the whole range of Unicode code points. For instance, char-set:letter contains about 100,000 characters.
All lower-case characters.
All upper-case characters.
All single characters that function as if they were an upper-case letter followed by a lower-case letter.
All letters. This includes char-set:lower-case, char-set:upper-case, char-set:title-case, and many letters that have no case at all. For example, Chinese and Japanese characters typically have no concept of case.
The union of char-set:letter and char-set:digit.
All characters which would put ink on the paper.
The union of char-set:graphic and char-set:whitespace.
All whitespace characters.
All horizontal whitespace characters, which notably includes #\space and #\tab.
The ISO control characters are the C0 control characters (U+0000 to U+001F), delete (U+007F), and the C1 control characters (U+0080 to U+009F).
All punctuation characters, such as the characters !"#%&'()*,-./:;?@[\]_{}
All symbol characters, such as the characters $+<=>^`|~.
The hexadecimal digits 0123456789abcdefABCDEF.
This character set contains all designated code points. This includes all the code points to which Unicode has assigned a character or other meaning.
This character set contains all possible code points. This includes both designated and reserved code points.
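A few membership checks illustrate how the predefined sets relate to each other. This sketch uses the SRFI-14 variable names char-set:hex-digit, char-set:blank, char-set:whitespace, char-set:digit and char-set:letter+digit.

```scheme
;; Twenty-two hexadecimal digits: 0-9, a-f and A-F.
(char-set-size char-set:hex-digit)                     ; ⇒ 22
(char-set-contains? char-set:hex-digit #\F)            ; ⇒ #t

;; Horizontal whitespace is a subset of all whitespace.
(char-set-contains? char-set:blank #\tab)              ; ⇒ #t
(char-set-contains? char-set:blank #\newline)          ; ⇒ #f
(char-set-contains? char-set:whitespace #\newline)     ; ⇒ #t
(char-set<= char-set:blank char-set:whitespace)        ; ⇒ #t

;; Digits are included in the union of letters and digits.
(char-set<= char-set:digit char-set:letter+digit)      ; ⇒ #t
```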
Strings are fixed-length sequences of characters. They can be created by calling constructor procedures, or entered literally at the REPL or in Scheme source files.
Strings always carry the information about how many characters they are composed of, so there is no special end-of-string character as in C. That means that Scheme strings can contain any character, even the ‘#\nul’ character ‘\0’.
To use strings efficiently, you need to know a bit about how Guile implements them. In Guile, a string consists of two parts, a head and the actual memory where the characters are stored. When a string (or a substring of it) is copied, only a new head gets created, the memory is usually not copied. The two heads start out pointing to the same memory.
When one of these two strings is modified, as with string-set!, their common memory does get copied so that each string has its own memory and modifying one does not accidentally modify the other as well. Thus, Guile’s strings are ‘copy on write’; the actual copying of their memory is delayed until one string is written to.
This implementation makes functions like substring very efficient in the common case that no modifications are done to the involved strings.
If you do know that your strings are getting modified right away, you can use substring/copy instead of substring. This function performs the copy immediately at the time of creation. This is more efficient, especially in a multi-threaded program. Also, substring/copy can avoid the problem that a short substring holds on to the memory of a very large original string that could otherwise be recycled.
If you want to avoid the copy altogether, so that modifications of one string show up in the other, you can use substring/shared. The strings created by this procedure are called mutation sharing substrings since the substring and the original string share modifications to each other.
If you want to prevent modifications, use substring/read-only.
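The visible difference between the substring variants can be sketched as follows. The variable names are hypothetical, and the example starts from string-copy so that the string being modified is not a literal.

```scheme
;; Start from a fresh, mutable string.
(define original (string-copy "hello world"))

(define cow    (substring original 0 5))          ; copy on write
(define copied (substring/copy original 0 5))     ; copied immediately
(define shared (substring/shared original 0 5))   ; mutation sharing

;; Modify the original after the substrings have been taken.
(string-set! original 0 #\J)

original   ; ⇒ "Jello world"
cow        ; ⇒ "hello"   unaffected: the memory was copied on write
copied     ; ⇒ "hello"   unaffected: the memory was copied at creation
shared     ; ⇒ "Jello"   the modification shows through
```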
Guile provides all procedures of SRFI-13 and a few more.
The read syntax for strings is an arbitrarily long sequence of characters enclosed in double quotes (").
Backslash is an escape character and can be used to insert the following special characters. \" and \\ are R5RS standard, \| is R7RS standard, the next seven are R6RS standard (notice they follow C syntax), and the remaining four are Guile extensions.
\\
Backslash character.
\"
Double quote character (an unescaped " is otherwise the end of the string).
\|
Vertical bar character.
\a
Bell character (ASCII 7).
\f
Formfeed character (ASCII 12).
\n
Newline character (ASCII 10).
\r
Carriage return character (ASCII 13).
\t
Tab character (ASCII 9).
\v
Vertical tab character (ASCII 11).
\b
Backspace character (ASCII 8).
\0
NUL character (ASCII 0).
\(
Open parenthesis. This is intended for use at the beginning of lines in multiline strings to avoid confusing Emacs lisp modes.
\ followed by newline (ASCII 10)
Nothing. This way if \ is the last character in a line, the string will continue with the first character from the next line, without a line break.
If the hungry-eol-escapes reader option is enabled, which is not the case by default, leading whitespace on the next line is discarded.
"foo\
 bar"
⇒ "foo bar"
(read-enable 'hungry-eol-escapes)
"foo\
 bar"
⇒ "foobar"
\xHH
Character code given by two hexadecimal digits. For example \x7f for an ASCII DEL (127).
\uHHHH
Character code given by four hexadecimal digits. For example \u0100 for a capital A with macron (U+0100).
\UHHHHHH
Character code given by six hexadecimal digits. For example \U010402.
The following are examples of string literals:
"foo"
"bar plonk"
"Hello World"
"\"Hi\", he said."
The three escape sequences \xHH, \uHHHH and \UHHHHHH were chosen to not break compatibility with code written for previous versions of Guile. The R6RS specification suggests a different, incompatible syntax for hex escapes: \xHHHH;, that is, \x followed by one to eight hexadecimal digits and terminated with a semicolon. If this escape format is desired instead, it can be enabled with the reader option r6rs-hex-escapes.
(read-enable 'r6rs-hex-escapes)
For more on reader options, see Reading Scheme Code.
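The effect of the escapes can be checked by inspecting the resulting strings. This sketch assumes the default reader options, that is, with r6rs-hex-escapes not enabled.

```scheme
;; Each escape sequence produces exactly one character.
(string-length "a\tb")                      ; ⇒ 3
(string-length "say \"hi\"")                ; ⇒ 8

;; Hexadecimal escapes of two, four and six digits.
(char->integer (string-ref "\x7f" 0))       ; ⇒ 127
(char->integer (string-ref "\u0100" 0))     ; ⇒ 256    (#x0100)
(char->integer (string-ref "\U010402" 0))   ; ⇒ 66562  (#x010402)
```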
The following procedures can be used to check whether a given string fulfills some specified property.
Return #t if obj is a string, else #f.
Returns 1 if obj is a string, 0 otherwise.
Return #t if str’s length is zero, and #f otherwise.
(string-null? "") ⇒ #t
y ⇒ "foo"
(string-null? y) ⇒ #f
Check if char_pred is true for any character in string s.
char_pred can be a character to check for any equal to that, or a character set (see Character Sets) to check for any in that set, or a predicate procedure to call.
For a procedure, calls (char_pred c) are made successively on the characters from start to end. If char_pred returns true (i.e. non-#f), string-any stops and that return value is the return from string-any. The call on the last character (i.e. at end-1), if that point is reached, is a tail call.
If there are no characters in s (i.e. start equals end) then the return is #f.
Check if char_pred is true for every character in string s.
char_pred can be a character to check for every character equal to that, or a character set (see Character Sets) to check for every character being in that set, or a predicate procedure to call.
For a procedure, calls (char_pred c) are made successively on the characters from start to end. If char_pred returns #f, string-every stops and returns #f. The call on the last character (i.e. at end-1), if that point is reached, is a tail call and the return from that call is the return from string-every.
If there are no characters in s (i.e. start equals end) then the return is #t.
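The three kinds of char_pred argument, and the behavior on empty strings, can be sketched as follows:

```scheme
;; A predicate procedure.
(string-any char-numeric? "abc1")          ; ⇒ #t
(string-every char-alphabetic? "abc1")     ; ⇒ #f

;; A character.
(string-any #\z "banana")                  ; ⇒ #f
(string-every #\a "aaa")                   ; ⇒ #t

;; A character set.
(string-every char-set:digit "2024")       ; ⇒ #t

;; string-any returns whatever true value the procedure returns.
(string-any (lambda (c) (and (char-numeric? c) c)) "ab7c9")   ; ⇒ #\7

;; Empty strings: no character satisfies, every character satisfies.
(string-any char-numeric? "")              ; ⇒ #f
(string-every char-numeric? "")            ; ⇒ #t
```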
The string constructor procedures create new string objects, possibly initializing them with some specified character data. See also String Selection, for ways to create strings from existing strings.
Return a newly allocated string made from the given character arguments.
(string #\x #\y #\z) ⇒ "xyz"
(string) ⇒ ""
Return a newly allocated string made from a list of characters.
(list->string '(#\a #\b #\c)) ⇒ "abc"
Return a newly allocated string made from a list of characters, in reverse order.
(reverse-list->string '(#\a #\B #\c)) ⇒ "cBa"
Return a newly allocated string of length k. If chr is given, then all elements of the string are initialized to chr, otherwise the contents of the string are unspecified.
Like scm_make_string, but expects the length as a size_t.
proc is an integer->char procedure. Construct a string of size len by applying proc to each index to produce the corresponding string element. The order in which proc is applied to the indices is not specified.
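A short sketch of make-string and string-tabulate; the helper nth-letter is a hypothetical integer->char procedure written for this example.

```scheme
;; A string of three identical characters.
(make-string 3 #\x)                 ; ⇒ "xxx"

;; Hypothetical helper: map index 0 to a, 1 to b, and so on.
(define (nth-letter i)
  (integer->char (+ i (char->integer #\a))))

;; Build a string by computing each element from its index.
(string-tabulate nth-letter 5)      ; ⇒ "abcde"
```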
Append the strings in the string list ls, using the string delimiter as a delimiter between the elements of ls. grammar is a symbol which specifies how the delimiter is placed between the strings, and defaults to the symbol infix.
infix
Insert the separator between list elements. An empty list will produce an empty string.
strict-infix
Like infix, but will raise an error if given the empty list.
suffix
Insert the separator after every list element.
prefix
Insert the separator before each list element.
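The grammars described above correspond to the procedure string-join; a minimal sketch of each (the variable name words is hypothetical):

```scheme
(define words '("a" "b" "c"))

(string-join words ", ")                ; ⇒ "a, b, c"    (infix is the default)
(string-join words ", " 'infix)         ; ⇒ "a, b, c"
(string-join words ", " 'strict-infix)  ; ⇒ "a, b, c"
(string-join words ", " 'suffix)        ; ⇒ "a, b, c, "
(string-join words ", " 'prefix)        ; ⇒ ", a, b, c"

;; With infix, an empty list gives an empty string.
(string-join '() ", ")                  ; ⇒ ""
```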
When processing strings, it is often convenient to first convert them into a list representation by using the procedure string->list, work with the resulting list, and then convert it back into a string. These procedures are useful for similar tasks.
Convert the string str into a list of characters.
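The round trip through a list might look like the following sketch. The helper name letters-only is hypothetical and exists only for this illustration.

```scheme
;; Keep only the alphabetic characters of a string by going through a list.
(define (letters-only str)
  (list->string (filter char-alphabetic? (string->list str))))

(letters-only "a1b2c3") ; ⇒ "abc"

;; reverse-list->string is handy when a list was accumulated back to front.
(define reversed (reverse-list->string (string->list "abc")))
reversed ; ⇒ "cba"
```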
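Split the string str into a list of substrings delimited by appearances of characters that match char_pred: characters equal to char_pred if it is a character, characters that satisfy char_pred if it is a predicate procedure, or characters that are members of char_pred if it is a character set.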
Note that an empty substring between separator characters will result in an empty string in the result list.
(string-split "root:x:0:0:root:/root:/bin/bash" #\:) ⇒ ("root" "x" "0" "0" "root" "/root" "/bin/bash")
(string-split "::" #\:) ⇒ ("" "" "")
(string-split "" #\:) ⇒ ("")
Portions of strings can be extracted by these procedures. string-ref delivers individual characters whereas substring can be used to extract substrings from longer strings.
Return the number of characters in string.
Return the number of characters in str as a size_t.
Return character k of str using zero-origin indexing. k must be a valid index of str.
Return character k of str using zero-origin indexing. k must be a valid index of str.
Return a copy of the given string str.
The returned string shares storage with str initially, but it is copied as soon as one of the two strings is modified.
Return a new string formed from the characters of str beginning with index start (inclusive) and ending with index end (exclusive). str must be a string, start and end must be exact integers satisfying: 0 <= start <= end <= (string-length str).
The returned string shares storage with str initially, but it is copied as soon as one of the two strings is modified.
Like substring, but the strings continue to share their storage even if they are modified. Thus, modifications to str show up in the new string, and vice versa.
Like substring, but the storage for the new string is copied immediately.
Like substring, but the resulting string cannot be modified.
Like scm_substring, etc. but the bounds are given as a size_t.
Return the n first characters of s.
Return all but the first n characters of s.
Return the n last characters of s.
Return all but the last n characters of s.
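The four prefix and suffix selectors just described can be tried together. This sketch assumes they are the SRFI-13 procedures string-take, string-drop, string-take-right and string-drop-right; the variable names are illustrative.

```scheme
(define greeting "hello world")

(define first-five    (string-take greeting 5))        ; ⇒ "hello"
(define after-six     (string-drop greeting 6))        ; ⇒ "world"
(define last-five     (string-take-right greeting 5))  ; ⇒ "world"
(define all-but-six   (string-drop-right greeting 6))  ; ⇒ "hello"
```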
Take characters start to end from the string s and either pad with chr or truncate them to give len characters.
string-pad pads or truncates on the left, so for example
(string-pad "x" 3) ⇒ "  x"
(string-pad "abcde" 3) ⇒ "cde"
string-pad-right pads or truncates on the right, so for example
(string-pad-right "x" 3) ⇒ "x  "
(string-pad-right "abcde" 3) ⇒ "abc"
Trim occurrences of char_pred from the ends of s.
string-trim trims char_pred characters from the left (start) of the string, string-trim-right trims them from the right (end) of the string, string-trim-both trims from both ends.
char_pred can be a character, a character set, or a predicate procedure to call on each character. If char_pred is not given the default is whitespace as per char-set:whitespace (see Standard Character Sets).
(string-trim " x ") ⇒ "x "
(string-trim-right "banana" #\a) ⇒ "banan"
(string-trim-both ".,xy:;" char-set:punctuation) ⇒ "xy"
(string-trim-both "xyzzy" (lambda (c) (or (eqv? c #\x) (eqv? c #\y)))) ⇒ "zz"
These procedures are for modifying strings in-place. This means that the result of the operation is not a new string; instead, the original string’s memory representation is modified.
Store chr in element k of str and return an unspecified value. k must be a valid index of str.
Like scm_string_set_x, but the index is given as a size_t.
Stores chr in every element of the given str and returns an unspecified value.
Change every character in str between start and end to fill.
(define y (string-copy "abcdefg"))
(substring-fill! y 1 3 #\r)
y ⇒ "arrdefg"
Copy the substring of str1 bounded by start1 and end1 into str2 beginning at position start2. str1 and str2 can be the same string.
Copy the sequence of characters from index range [start, end) in string s to string target, beginning at index tstart. The characters are copied left-to-right or right-to-left as needed – the copy is guaranteed to work, even if target and s are the same string. It is an error if the copy operation runs off the end of the target string.
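The overlap guarantee can be seen by copying a string onto itself. This is a sketch assuming the procedure is string-copy! with the argument order (string-copy! target tstart s start end); buf is an illustrative name.

```scheme
;; A mutable copy, since string literals may be read-only.
(define buf (string-copy "abcdefgh"))

;; Copy characters [0, 4) of buf ("abcd") onto buf itself, starting at
;; index 2. Source and destination overlap, yet the result is well defined.
(string-copy! buf 2 buf 0 4)
buf ; ⇒ "ababcdgh"
```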
The procedures in this section are similar to the character ordering predicates (see Characters), but are defined on character sequences.
The first set is specified in R5RS and has names that end in ?. The second set is specified in SRFI-13 and the names have no ending ?.
The predicates ending in -ci ignore the character case when comparing strings. For now, case-insensitive comparison is done using the R5RS rules, where every lower-case character that has a single character upper-case form is converted to uppercase before comparison. See the (ice-9 i18n) module for locale-dependent string comparison.
Lexicographic equality predicate; return #t if all strings are the same length and contain the same characters in the same positions, otherwise return #f.
The procedure string-ci=? treats upper and lower case letters as though they were the same character, but string=? treats upper and lower case as distinct characters.
Lexicographic ordering predicate; return #t if, for every pair of consecutive string arguments str_i and str_i+1, str_i is lexicographically less than str_i+1.
Lexicographic ordering predicate; return #t if, for every pair of consecutive string arguments str_i and str_i+1, str_i is lexicographically less than or equal to str_i+1.
Lexicographic ordering predicate; return #t if, for every pair of consecutive string arguments str_i and str_i+1, str_i is lexicographically greater than str_i+1.
Lexicographic ordering predicate; return #t if, for every pair of consecutive string arguments str_i and str_i+1, str_i is lexicographically greater than or equal to str_i+1.
Case-insensitive string equality predicate; return #t if all strings are the same length and their component characters match (ignoring case) at each position; otherwise return #f.
Case insensitive lexicographic ordering predicate; return #t if, for every pair of consecutive string arguments str_i and str_i+1, str_i is lexicographically less than str_i+1 regardless of case.
Case insensitive lexicographic ordering predicate; return #t if, for every pair of consecutive string arguments str_i and str_i+1, str_i is lexicographically less than or equal to str_i+1 regardless of case.
Case insensitive lexicographic ordering predicate; return #t if, for every pair of consecutive string arguments str_i and str_i+1, str_i is lexicographically greater than str_i+1 regardless of case.
Case insensitive lexicographic ordering predicate; return #t if, for every pair of consecutive string arguments str_i and str_i+1, str_i is lexicographically greater than or equal to str_i+1 regardless of case.
Apply proc_lt, proc_eq, proc_gt to the mismatch index, depending upon whether s1 is less than, equal to, or greater than s2. The mismatch index is the largest index i such that for every 0 <= j < i, s1[j] = s2[j] – that is, i is the first position that does not match.
Apply proc_lt, proc_eq, proc_gt to the mismatch index, depending upon whether s1 is less than, equal to, or greater than s2. The mismatch index is the largest index i such that for every 0 <= j < i, s1[j] = s2[j] – that is, i is the first position where the lowercased letters do not match.
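To make the mismatch index concrete, here is a small sketch that assumes the procedure is the SRFI-13 string-compare. The helper describe is hypothetical; it simply tags the index with the branch that was taken.

```scheme
;; Report which of the three continuations ran, and with which index.
(define (describe s1 s2)
  (string-compare s1 s2
                  (lambda (i) (list 'less i))
                  (lambda (i) (list 'equal i))
                  (lambda (i) (list 'greater i))))

(describe "abcdef" "abcxyz") ; ⇒ (less 3)     first difference is at index 3
(describe "abd" "abc")       ; ⇒ (greater 2)  first difference is at index 2

;; The R5RS-style predicates accept more than two arguments.
(define sorted? (string<? "apple" "banana" "cherry")) ; ⇒ #t
(define same-ci? (string-ci=? "Guile" "GUILE"))       ; ⇒ #t
```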
Return #f if s1 and s2 are not equal, a true value otherwise.
Return #f if s1 and s2 are equal, a true value otherwise.
Return #f if s1 is greater than or equal to s2, a true value otherwise.
Return #f if s1 is less than or equal to s2, a true value otherwise.
Return #f if s1 is greater than s2, a true value otherwise.
Return #f if s1 is less than s2, a true value otherwise.
Return #f if s1 and s2 are not equal, a true value otherwise. The character comparison is done case-insensitively.
Return #f if s1 and s2 are equal, a true value otherwise. The character comparison is done case-insensitively.
Return #f if s1 is greater than or equal to s2, a true value otherwise. The character comparison is done case-insensitively.
Return #f if s1 is less than or equal to s2, a true value otherwise. The character comparison is done case-insensitively.
Return #f if s1 is greater than s2, a true value otherwise. The character comparison is done case-insensitively.
Return #f if s1 is less than s2, a true value otherwise. The character comparison is done case-insensitively.
Compute a hash value for s. The optional argument bound is a non-negative exact integer specifying the range of the hash function. A positive value restricts the return value to the range [0,bound).
Compute a hash value for s. The optional argument bound is a non-negative exact integer specifying the range of the hash function. A positive value restricts the return value to the range [0,bound).
Because the same visual appearance of an abstract Unicode character can be obtained via multiple sequences of Unicode characters, even the case-insensitive string comparison functions described above may return #f when presented with strings containing different representations of the same character. For example, the Unicode character “LATIN SMALL LETTER S WITH DOT BELOW AND DOT ABOVE” can be represented with a single character (U+1E69) or by the character “LATIN SMALL LETTER S” (U+0073) followed by the combining marks “COMBINING DOT BELOW” (U+0323) and “COMBINING DOT ABOVE” (U+0307).
For this reason, it is often desirable to ensure that the strings to be compared are using a mutually consistent representation for every character. The Unicode standard defines two methods of normalizing the contents of strings: Decomposition, which breaks composite characters into a set of constituent characters with an ordering defined by the Unicode Standard; and composition, which performs the converse.
There are two decomposition operations. “Canonical decomposition” produces character sequences that share the same visual appearance as the original characters, while “compatibility decomposition” produces ones whose visual appearances may differ from the originals but which represent the same abstract character.
These operations are encapsulated in the following set of normalization forms:
NFD
Characters are decomposed to their canonical forms.
NFKD
Characters are decomposed to their compatibility forms.
NFC
Characters are decomposed to their canonical forms, then composed.
NFKC
Characters are decomposed to their compatibility forms, then composed.
The functions below put their arguments into one of the forms described above.
Return the NFD normalized form of s.
Return the NFKD normalized form of s.
Return the NFC normalized form of s.
Return the NFKC normalized form of s.
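The U+1E69 example from the text can be reproduced as follows. This sketch assumes the normalization procedures are named string-normalize-nfd and string-normalize-nfc, as in Guile's core; the variable names are illustrative.

```scheme
;; One precomposed code point versus base letter plus two combining marks.
(define composed   "\u1e69")
(define decomposed "\u0073\u0323\u0307")

;; Compared code point by code point, the two spellings differ.
(define raw-equal? (string=? composed decomposed))               ; ⇒ #f

;; After normalizing both to the same form, they compare equal.
(define nfc-equal?
  (string=? (string-normalize-nfc composed)
            (string-normalize-nfc decomposed)))                  ; ⇒ #t

;; NFD expands the precomposed character into three code points.
(define nfd-length (string-length (string-normalize-nfd composed))) ; ⇒ 3
```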
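Search through the string s from left to right, returning the index of the first occurrence of a character which matches char_pred: one that equals char_pred if it is a character, satisfies char_pred if it is a predicate procedure, or is a member of char_pred if it is a character set.
Return #f if no match is found.
Search through the string s from right to left, returning the index of the last occurrence of a character which matches char_pred: one that equals char_pred if it is a character, satisfies char_pred if it is a predicate procedure, or is a member of char_pred if it is a character set.
Return #f if no match is found.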
Return the length of the longest common prefix of the two strings.
Return the length of the longest common prefix of the two strings, ignoring character case.
Return the length of the longest common suffix of the two strings.
Return the length of the longest common suffix of the two strings, ignoring character case.
Is s1 a prefix of s2?
Is s1 a prefix of s2, ignoring character case?
Is s1 a suffix of s2?
Is s1 a suffix of s2, ignoring character case?
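Search through the string s from right to left, returning the index of the last occurrence of a character which matches char_pred: one that equals char_pred if it is a character, satisfies char_pred if it is a predicate procedure, or is a member of char_pred if it is a character set.
Return #f if no match is found.
Search through the string s from left to right, returning the index of the first occurrence of a character which does not match char_pred: one that does not equal char_pred if it is a character, does not satisfy char_pred if it is a predicate procedure, or is not a member of char_pred if it is a character set.
Search through the string s from right to left, returning the index of the last occurrence of a character which does not match char_pred: one that does not equal char_pred if it is a character, does not satisfy char_pred if it is a predicate procedure, or is not a member of char_pred if it is a character set.
Return the count of the number of characters in the string s which match char_pred: those that equal char_pred if it is a character, satisfy char_pred if it is a predicate procedure, or are members of char_pred if it is a character set.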
Does string s1 contain string s2? Return the index in s1 where s2 occurs as a substring, or false. The optional start/end indices restrict the operation to the indicated substrings.
Does string s1 contain string s2? Return the index in s1 where s2 occurs as a substring, or false. The optional start/end indices restrict the operation to the indicated substrings. Character comparison is done case-insensitively.
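A few representative searches are sketched below, assuming the SRFI-13 procedure names string-index, string-rindex, string-skip, string-count, string-prefix-length, string-prefix?, string-contains and string-contains-ci. The variable names are illustrative.

```scheme
(define text "hello world")

(define first-o   (string-index text #\o))                    ; ⇒ 4
(define last-o    (string-rindex text #\o))                   ; ⇒ 7
(define first-gap (string-index text char-set:whitespace))    ; ⇒ 5
(define no-z      (string-index text #\z))                    ; ⇒ #f

;; string-skip finds the first character that does NOT match.
(define first-ink (string-skip "   abc" char-set:whitespace)) ; ⇒ 3
(define a-count   (string-count "banana" #\a))                ; ⇒ 3

(define common    (string-prefix-length "hello" "help"))      ; ⇒ 3
(define prefix?   (string-prefix? "he" "hello"))              ; ⇒ #t

(define where     (string-contains text "wor"))               ; ⇒ 6
(define missing   (string-contains "hello" "xyz"))            ; ⇒ #f
(define where-ci  (string-contains-ci "Hello World" "WORLD")) ; ⇒ 6
```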
These are procedures for mapping strings to their upper- or lower-case equivalents, respectively, or for capitalizing strings.
They use the basic case mapping rules for Unicode characters. No special language or context rules are considered. The resulting strings are guaranteed to be the same length as the input strings.
See the (ice-9 i18n) module, for locale-dependent case conversions.
Upcase every character in str.
Destructively upcase every character in str.
(string-upcase! y) ⇒ "ARRDEFG"
y ⇒ "ARRDEFG"
Downcase every character in str.
Destructively downcase every character in str.
y ⇒ "ARRDEFG"
(string-downcase! y) ⇒ "arrdefg"
y ⇒ "arrdefg"
Return a freshly allocated string with the characters in str, where the first character of every word is capitalized.
Upcase the first character of every word in str destructively and return str.
y ⇒ "hello world"
(string-capitalize! y) ⇒ "Hello World"
y ⇒ "Hello World"
Titlecase every first character in a word in str.
Destructively titlecase every first character in a word in str.
Reverse the string str. The optional arguments start and end delimit the region of str to operate on.
Reverse the string str in-place. The optional arguments start and end delimit the region of str to operate on. The return value is unspecified.
Return a newly allocated string whose characters form the concatenation of the given strings, arg ....
(let ((h "hello ")) (string-append h "world")) ⇒ "hello world"
Like string-append, but the result may share memory with the argument strings.
Append the elements (which must be strings) of ls together into a single string. Guaranteed to return a freshly allocated string.
Without optional arguments, this procedure is equivalent to
(string-concatenate (reverse ls))
If the optional argument final_string is specified, it is consed onto the beginning of ls before performing the list-reverse and string-concatenate operations. If end is given, only the characters of final_string up to index end are used.
Guaranteed to return a freshly allocated string.
Like string-concatenate, but the result may share memory with the strings in the list ls.
Like string-concatenate-reverse, but the result may share memory with the strings in the ls arguments.
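The reversing and concatenating procedures can be exercised as follows. This is a sketch assuming the names string-reverse, string-concatenate and string-concatenate-reverse; the variable names are illustrative.

```scheme
(define backwards (string-reverse "hello"))                  ; ⇒ "olleh"
(define forwards  (string-concatenate '("a" "b" "c")))       ; ⇒ "abc"

;; The list is reversed before its elements are concatenated.
(define unreversed (string-concatenate-reverse '("c" "b" "a"))) ; ⇒ "abc"

;; With final_string "cdef" and end 1, only "c" is used; it is consed
;; onto the front of the list, giving ("c" "b" "a") before the reversal.
(define with-final
  (string-concatenate-reverse '("b" "a") "cdef" 1))          ; ⇒ "abc"
```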
proc is a char->char procedure; it is mapped over s. The order in which the procedure is applied to the string elements is not specified.
proc is a char->char procedure; it is mapped over s. The order in which the procedure is applied to the string elements is not specified. The string s is modified in-place; the return value is not specified.
proc is mapped over s in left-to-right order. The return value is not specified.
Call (proc i) for each index i in s, from left to right.
For example, to change characters to alternately upper and lower case,
(define str (string-copy "studly"))
(string-for-each-index
  (lambda (i)
    (string-set! str i
      ((if (even? i) char-upcase char-downcase)
       (string-ref str i))))
  str)
str ⇒ "StUdLy"
Fold kons over the characters of s, with knil as the terminating element, from left to right. kons must expect two arguments: the current character and the result of the previous application of kons.
Fold kons over the characters of s, with knil as the terminating element, from right to left. kons must expect two arguments: the current character and the result of the previous application of kons.
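The difference between the two fold directions shows up clearly when kons is cons. The sketch below assumes the SRFI-13 names string-fold, string-fold-right and string-map; count-vowels is a hypothetical helper written for this illustration.

```scheme
;; Folding left to right conses the last character on last, so the
;; resulting list is in reverse order.
(define left  (string-fold cons '() "abc"))        ; ⇒ (#\c #\b #\a)
(define right (string-fold-right cons '() "abc"))  ; ⇒ (#\a #\b #\c)

;; An accumulator other than a list: count the vowels in a string.
(define (count-vowels s)
  (string-fold (lambda (c n)
                 (if (memv c '(#\a #\e #\i #\o #\u)) (+ n 1) n))
               0
               s))
(count-vowels "guile scheme") ; ⇒ 5

(define shouted (string-map char-upcase "abc"))    ; ⇒ "ABC"
```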
(lambda (x) )
.
This is the extended substring procedure that implements replicated copying of a substring of some string.
s is a string, start and end are optional arguments that demarcate a substring of s, defaulting to 0 and the length of s. Replicate this substring up and down index space, in both the positive and negative directions. xsubstring returns the substring of this string beginning at index from, and ending at to, which defaults to from + (end - start).
Exactly the same as xsubstring, but the extracted text is written into the string target starting at index tstart. The operation is not defined if (eq? target s) or these arguments share storage – you cannot copy a string on top of itself.
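Replicated copying is easiest to see with short inputs. This sketch assumes the argument order (xsubstring s from to); the variable names are illustrative.

```scheme
;; from = 2, to defaults to 2 + 6 = 8, so the result wraps around once:
;; a rotation of the string by two positions.
(define rotated  (xsubstring "abcdef" 2))   ; ⇒ "cdefab"

;; Reading indices 0..6 of "abc" replicated: a b c a b c a.
(define repeated (xsubstring "abc" 0 7))    ; ⇒ "abcabca"

;; A negative from index reads from the replication to the left of index 0.
(define from-left (xsubstring "abcdef" -2))
```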
Return the string s1, but with the characters start1 … end1 replaced by the characters start2 … end2 from s2.
Split the string s into a list of substrings, where each substring is a maximal non-empty contiguous sequence of characters from the character set token_set, which defaults to char-set:graphic.
If start or end indices are provided, they restrict string-tokenize to operating on the indicated substring of s.
Filter the string s, retaining only those characters which satisfy char_pred.
If char_pred is a procedure, it is applied to each character as a predicate; if it is a character, each character is tested for equality with it; and if it is a character set, each character is tested for membership.
Delete characters satisfying char_pred from s.
If char_pred is a procedure, it is applied to each character as a predicate; if it is a character, each character is tested for equality with it; and if it is a character set, each character is tested for membership.
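The following sketch exercises the procedures in this group, assuming the names string-replace, string-tokenize, string-filter and string-delete, with char_pred passed before the string as in Guile 2.2. The variable names are illustrative.

```scheme
;; Replace characters [6, 11) of the first string with the second string.
(define replaced (string-replace "hello world" "there" 6 11)) ; ⇒ "hello there"

;; Default token set: runs of graphic (non-whitespace) characters.
(define tokens  (string-tokenize "hello,  world foo"))
tokens ; ⇒ ("hello," "world" "foo")

;; Restricting tokens to letters splits on everything else.
(define letters (string-tokenize "a1b22c" char-set:letter))
letters ; ⇒ ("a" "b" "c")

(define kept    (string-filter char-alphabetic? "a1b2c3"))    ; ⇒ "abc"
(define dropped (string-delete #\a "banana"))                 ; ⇒ "bnn"
```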
Out in the cold world outside of Guile, not all strings are treated in the same way. Out there there are only bytes, and there are many ways of representing a string (a sequence of characters) as binary data (a sequence of bytes).
As a user, usually you don’t have to think about this very much. When you type on your keyboard, your system encodes your keystrokes as bytes according to the locale that you have configured on your computer. Guile uses the locale to decode those bytes back into characters – hopefully the same characters that you typed in.
All is not so clear when dealing with a system with multiple users, such as a web server. Your web server might get a request from one user for data encoded in the ISO-8859-1 character set, and then another request from a different user for UTF-8 data.
Guile provides an iconv module for converting between strings and sequences of bytes. See Bytevectors, for more on how Guile represents raw byte sequences. This module gets its name from the common UNIX command of the same name.
Note that often it is sufficient to just read and write strings from ports instead of using these functions. To do this, specify the port encoding using set-port-encoding!. See Ports, for more on ports and character encodings.
Unlike the rest of the procedures in this section, you have to load the iconv module before having access to these procedures:
(use-modules (ice-9 iconv))
Encode string as a sequence of bytes.
The string will be encoded in the character set specified by the encoding string. If the string has characters that cannot be represented in the encoding, by default this procedure raises an encoding-error. Pass a conversion-strategy argument to specify other behaviors.
The return value is a bytevector. See Bytevectors, for more on bytevectors. See Ports, for more on character encodings and conversion strategies.
Decode bytevector into a string.
The bytes will be decoded from the character set specified by the encoding string. If the bytes do not form a valid encoding, by default this procedure raises a decoding-error. As with string->bytevector, pass the optional conversion-strategy argument to modify this behavior. See Ports, for more on character encodings and conversion strategies.
Like call-with-output-string, but instead of returning a string, returns an encoding of the string according to encoding, as a bytevector. This procedure can be more efficient than collecting a string and then converting it via string->bytevector.
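A round trip through two encodings might look like the sketch below. It assumes the third procedure described above is call-with-encoded-output-string, and it loads (rnrs bytevectors) only to measure the results; the variable names are illustrative.

```scheme
(use-modules (ice-9 iconv)
             (rnrs bytevectors))

;; "héllo", written with an escape so the source file encoding is irrelevant.
(define text "h\u00e9llo")

(define utf8-bytes   (string->bytevector text "UTF-8"))
(define latin1-bytes (string->bytevector text "ISO-8859-1"))

;; é takes two bytes in UTF-8 but only one in ISO-8859-1.
(bytevector-length utf8-bytes)   ; ⇒ 6
(bytevector-length latin1-bytes) ; ⇒ 5

;; Decoding with the matching encoding recovers the original string.
(define decoded (bytevector->string latin1-bytes "ISO-8859-1"))

;; Writing to a port and encoding in one step.
(define encoded
  (call-with-encoded-output-string "UTF-8"
    (lambda (port) (display text port))))
```

Decoding latin1-bytes as UTF-8 instead would, by default, raise a decoding-error, since the lone byte for é is not valid UTF-8.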
When creating a Scheme string from a C string or when converting a Scheme string to a C string, the concept of character encoding becomes important.
In C, a string is just a sequence of bytes, and the character encoding describes the relation between these bytes and the actual characters that make up the string. For Scheme strings, character encoding is not an issue (most of the time), since in Scheme you usually treat strings as character sequences, not byte sequences.
Converting to C and converting from C each have their own challenges.
When converting from C to Scheme, it is important that the sequence of bytes in the C string be valid with respect to its encoding. ASCII strings, for example, can’t have any bytes greater than 127. An ASCII byte greater than 127 is considered ill-formed and cannot be converted into a Scheme character.
Problems can occur in the reverse operation as well. Not all character encodings can hold all possible Scheme characters. Some encodings, like ASCII for example, can only describe a small subset of all possible characters. So, when converting to C, one must first decide what to do with Scheme characters that can’t be represented in the C string.
Converting a Scheme string to a C string will often allocate fresh memory to hold the result. You must take care that this memory is properly freed eventually. In many cases, this can be achieved by using scm_dynwind_free inside an appropriate dynwind context (see Dynamic Wind).
Creates a new Scheme string that has the same contents as str when interpreted in the character encoding of the current locale.
For scm_from_locale_string, str must be null-terminated.
For scm_from_locale_stringn, len specifies the length of str in bytes, and str does not need to be null-terminated. If len is (size_t)-1, then str does need to be null-terminated and the real length will be found with strlen.
If the C string is ill-formed, an error will be raised.
Note that these functions should not be used to convert C string constants, because there is no guarantee that the current locale will match that of the execution character set, used for string and character constants. Most modern C compilers use UTF-8 by default, so to convert C string constants we recommend scm_from_utf8_string.
Like scm_from_locale_string and scm_from_locale_stringn, respectively, but also frees str with free eventually. Thus, you can use this function when you would free str anyway immediately after creating the Scheme string. In certain cases, Guile can then use str directly as its internal representation.
Returns a C string with the same contents as str in the character encoding of the current locale. The C string must be freed with free eventually, maybe by using scm_dynwind_free (see Dynamic Wind).
For scm_to_locale_string, the returned string is null-terminated and an error is signalled when str contains #\nul characters.
For scm_to_locale_stringn and lenp not NULL, str might contain #\nul characters and the length of the returned string in bytes is stored in *lenp. The returned string will not be null-terminated in this case. If lenp is NULL, scm_to_locale_stringn behaves like scm_to_locale_string.
If a character in str cannot be represented in the character encoding of the current locale, the default port conversion strategy is used. See Ports, for more on conversion strategies.
If the conversion strategy is error, an error will be raised. If it is substitute, a replacement character, such as a question mark, will be inserted in its place. If it is escape, a hex escape will be inserted in its place.
Puts str as a C string in the current locale encoding into the memory pointed to by buf. The buffer at buf has room for max_len bytes and scm_to_locale_stringbuf will never store more than that. No terminating '\0' will be stored.
The return value of scm_to_locale_stringbuf is the number of bytes that are needed for all of str, regardless of whether buf was large enough to hold them. Thus, when the return value is larger than max_len, only max_len bytes have been stored and you probably need to try again with a larger buffer.
For most situations, string conversion should occur using the current locale, such as with the functions above. But there may be cases where one wants to convert strings from a character encoding other than the locale’s character encoding. For these cases, the lower-level functions scm_to_stringn and scm_from_stringn are provided. These functions should seldom be necessary if one is properly using locales.
This is an enumerated type that can take one of three values: SCM_FAILED_CONVERSION_ERROR, SCM_FAILED_CONVERSION_QUESTION_MARK, and SCM_FAILED_CONVERSION_ESCAPE_SEQUENCE. They are used to indicate a strategy for handling characters that cannot be converted to or from a given character encoding. SCM_FAILED_CONVERSION_ERROR indicates that a conversion should throw an error if some characters cannot be converted. SCM_FAILED_CONVERSION_QUESTION_MARK indicates that a conversion should replace unconvertible characters with the question mark character. And SCM_FAILED_CONVERSION_ESCAPE_SEQUENCE requests that a conversion should replace an unconvertible character with an escape sequence.
While all three strategies apply when converting Scheme strings to C, only SCM_FAILED_CONVERSION_ERROR and SCM_FAILED_CONVERSION_QUESTION_MARK can be used when converting C strings to Scheme.
This function returns a newly allocated C string from the Guile string str. The length of the returned string in bytes will be returned in lenp. The character encoding of the C string is passed as the ASCII, null-terminated C string encoding. The handler parameter gives a strategy for dealing with characters that cannot be converted into encoding.
If lenp is NULL, this function will return a null-terminated C string. It will throw an error if the string contains a null character.
The Scheme interface to this function is string->bytevector, from the ice-9 iconv module. See Representing Strings as Bytes.
This function returns a Scheme string from the C string str. The length in bytes of the C string is input as len. The encoding of the C string is passed as the ASCII, null-terminated C string encoding. The handler parameter suggests a strategy for dealing with unconvertible characters.
The Scheme interface to this function is bytevector->string. See Representing Strings as Bytes.
The following conversion functions are provided as a convenience for the most commonly used encodings.
Return a Scheme string from the null-terminated C string str, which is ISO-8859-1-, UTF-8-, or UTF-32-encoded. These functions should be used to convert hard-coded C string constants into Scheme strings.
Return a Scheme string from C string str, which is ISO-8859-1-, UTF-8-, or UTF-32-encoded, of length len. len is the number of bytes pointed to by str for scm_from_latin1_stringn and scm_from_utf8_stringn; it is the number of elements (code points) in str in the case of scm_from_utf32_stringn.
Return a newly allocated, ISO-8859-1-, UTF-8-, or UTF-32-encoded C string from Scheme string str. An error is thrown when str cannot be converted to the specified encoding. If lenp is NULL, the returned C string will be null terminated, and an error will be thrown if the C string would otherwise contain null characters. If lenp is not NULL, the string is not null terminated, and the length of the returned string is returned in lenp. The length returned is the number of bytes for scm_to_latin1_stringn and scm_to_utf8_stringn; it is the number of elements (code points) for scm_to_utf32_stringn.
It is not often the case, but sometimes when you are dealing with the implementation details of a port, you need to encode and decode strings according to the encoding and conversion strategy of the port. There are some convenience functions for that purpose as well.
Like scm_from_stringn and friends, except they take their encoding and conversion strategy from a given port object.
Guile stores each string in memory as a contiguous array of Unicode code points along with an associated set of attributes. If all of the code points of a string have an integer value between 0 and 255 inclusive, the code point array is stored as one byte per code point: it is stored as an ISO-8859-1 (aka Latin-1) string. If any of the code points of the string has an integer value greater than 255, the code point array is stored as four bytes per code point: it is stored as a UTF-32 string.
Conversion between the one-byte-per-code-point and four-bytes-per-code-point representations happens automatically as necessary.
No API is provided to set the internal representation of strings; however, there is a pair of procedures available to query it. These are debugging procedures. Using them in production code is discouraged, since the details of Guile’s internal representation of strings may change from release to release.
Return the number of bytes used to encode a Unicode code point in string str. The result is one or four.
Returns an association list containing debugging information for str. The association list has the following entries.
string
The string itself.
start
The start index of the string into its stringbuf
length
The length of the string
shared
If this string is a substring, it returns its parent string. Otherwise, it returns #f
read-only
#t if the string is read-only
stringbuf-chars
A new string containing this string’s stringbuf’s characters
stringbuf-length
The number of characters in this stringbuf
stringbuf-shared
#t if this stringbuf is shared
stringbuf-wide
#t if this stringbuf’s characters are stored in a 32-bit buffer, or #f if they are stored in an 8-bit buffer
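The narrow and wide representations can be observed at the REPL. This sketch assumes the two debugging procedures are named string-bytes-per-char and %string-dump, and that the dump is keyed by symbols; as noted above, these details may change between releases, so treat it as exploratory only.

```scheme
;; All code points fit in one byte: stored narrow.
(define narrow (string-bytes-per-char "abc"))     ; ⇒ 1

;; U+03BB (Greek small lambda) exceeds 255: stored wide.
(define wide   (string-bytes-per-char "\u03bb"))  ; ⇒ 4

;; Look up a single entry of the debugging association list
;; (assumption: the keys are symbols such as stringbuf-wide).
(define wide-flag (assq-ref (%string-dump "\u03bb") 'stringbuf-wide))
```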
Symbols in Scheme are widely used in three ways: as items of discrete data, as lookup keys for alists and hash tables, and to denote variable references.
A symbol is similar to a string in that it is defined by a sequence of characters. The sequence of characters is known as the symbol’s name. In the usual case — that is, where the symbol’s name doesn’t include any characters that could be confused with other elements of Scheme syntax — a symbol is written in a Scheme program by writing the sequence of characters that make up the name, without any quotation marks or other special syntax. For example, the symbol whose name is “multiply-by-2” is written, simply:
multiply-by-2
Notice how this differs from a string with contents “multiply-by-2”, which is written with double quotation marks, like this:
"multiply-by-2"
Looking beyond how they are written, symbols are different from strings in two important respects.
The first important difference is uniqueness. If the same-looking string is read twice from two different places in a program, the result is two different string objects whose contents just happen to be the same. If, on the other hand, the same-looking symbol is read twice from two different places in a program, the result is the same symbol object both times.
Given two read symbols, you can use eq? to test whether they are the same (that is, have the same name). eq? is the most efficient comparison operator in Scheme, and comparing two symbols like this is as fast as comparing, for example, two numbers. Given two strings, on the other hand, you must use equal? or string=?, which are much slower comparison operators, to determine whether the strings have the same contents.
(define sym1 (quote hello))
(define sym2 (quote hello))
(eq? sym1 sym2) ⇒ #t
(define str1 "hello")
(define str2 "hello")
(eq? str1 str2) ⇒ #f
(equal? str1 str2) ⇒ #t
The second important difference is that symbols, unlike strings, are not self-evaluating. This is why we need the (quote …)s in the example above: (quote hello) evaluates to the symbol named "hello" itself, whereas an unquoted hello is read as the symbol named "hello" and evaluated as a variable reference … about which more below (see Symbols as Denoting Variables).
Numbers and symbols are similar to the extent that they both lend themselves to eq? comparison. But symbols are more descriptive than numbers, because a symbol’s name can be used directly to describe the concept for which that symbol stands.
For example, imagine that you need to represent some colours in a computer program. Using numbers, you would have to choose arbitrarily some mapping between numbers and colours, and then take care to use that mapping consistently:
;; 1=red, 2=green, 3=purple

(if (eq? (colour-of vehicle) 1)
    ...)
You can make the mapping more explicit and the code more readable by defining constants:
(define red 1)
(define green 2)
(define purple 3)

(if (eq? (colour-of vehicle) red)
    ...)
But the simplest and clearest approach is not to use numbers at all, but symbols whose names specify the colours that they refer to:
(if (eq? (colour-of vehicle) 'red) ...)
The descriptive advantages of symbols over numbers increase as the set of concepts that you want to describe grows. Suppose that a car object can have other properties as well, such as whether it has or uses automatic or manual transmission, leaded or unleaded fuel, and power steering (or not).
Then a car’s combined property set could be naturally represented and manipulated as a list of symbols:
(properties-of vehicle1)
⇒
(red manual unleaded power-steering)

(if (memq 'power-steering (properties-of vehicle1))
    (display "Unfit people can drive this vehicle.\n")
    (display "You'll need strong arms to drive this vehicle!\n"))
-|
Unfit people can drive this vehicle.
Remember, the fundamental property of symbols that we are relying on here is that an occurrence of 'red in one part of a program is an indistinguishable symbol from an occurrence of 'red in another part of a program; this means that symbols can usefully be compared using eq?. At the same time, symbols have naturally descriptive names. This combination of efficiency and descriptive power makes them ideal for use as discrete data.
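As a minimal sketch of symbols used as discrete data, the procedure below dispatches on a colour symbol with case, which compares keys using eqv? (equivalent to eq? for symbols). The name colour->action and the particular colours are hypothetical, chosen only for illustration.

```scheme
;; colour->action is a hypothetical helper, not part of Guile.
(define (colour->action colour)
  (case colour
    ((red) 'stop)
    ((amber) 'wait)
    ((green) 'go)
    (else 'unknown)))

(colour->action 'green)   ; ⇒ go
(colour->action 'purple)  ; ⇒ unknown
```

Because every occurrence of 'green denotes the same object, the comparison inside case is a cheap identity test rather than a character-by-character one.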
Given their efficiency and descriptive power, it is natural to use symbols as the keys in an association list or hash table.
To illustrate this, consider a more structured representation of the car properties example from the preceding subsection. Rather than mixing all the properties up together in a flat list, we could use an association list like this:
(define car1-properties '((colour . red)
                          (transmission . manual)
                          (fuel . unleaded)
                          (steering . power-assisted)))
Notice how this structure is more explicit and extensible than the flat list. For example, it makes clear that manual refers to the transmission rather than, say, the windows or the locking of the car. It also allows further properties to use the same symbols among their possible values without becoming ambiguous:
(define car1-properties '((colour . red)
                          (transmission . manual)
                          (fuel . unleaded)
                          (steering . power-assisted)
                          (seat-colour . red)
                          (locking . manual)))
With a representation like this, it is easy to use the efficient assq-XXX family of procedures (see Association Lists) to extract or change individual pieces of information:
(assq-ref car1-properties 'fuel) ⇒ unleaded
(assq-ref car1-properties 'transmission) ⇒ manual

(assq-set! car1-properties 'seat-colour 'black)
⇒
((colour . red)
 (transmission . manual)
 (fuel . unleaded)
 (steering . power-assisted)
 (seat-colour . black)
 (locking . manual))
Hash tables also have keys, and exactly the same arguments apply to the use of symbols in hash tables as in association lists. The hash value that Guile uses to decide where to add a symbol-keyed entry to a hash table can be obtained by calling the symbol-hash procedure:
Return a hash value for symbol.
See Hash Tables for information about hash tables in general, and for why you might choose to use a hash table rather than an association list.
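The following sketch shows symbols as hash table keys, using Guile’s hashq-set! and hashq-ref, which compare keys with eq?. The table name car1-table and its contents are assumptions carried over from the car example for illustration.

```scheme
;; car1-table is a hypothetical table mirroring the alist example.
(define car1-table (make-hash-table))
(hashq-set! car1-table 'colour 'red)
(hashq-set! car1-table 'fuel 'unleaded)

(hashq-ref car1-table 'fuel)           ; ⇒ unleaded
(hashq-ref car1-table 'locking)        ; ⇒ #f   (no such key)
(hashq-ref car1-table 'locking 'none)  ; ⇒ none (explicit default)

;; symbol-hash returns an integer; the value itself is
;; implementation-specific, so no particular number is shown.
(integer? (symbol-hash 'colour))       ; ⇒ #t
```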
When an unquoted symbol in a Scheme program is evaluated, it is interpreted as a variable reference, and the result of the evaluation is the appropriate variable’s value.
For example, when the expression (string-length "abcd") is read and evaluated, the sequence of characters string-length is read as the symbol whose name is "string-length". This symbol is associated with a variable whose value is the procedure that implements string length calculation. Therefore evaluation of the string-length symbol results in that procedure.
The details of the connection between an unquoted symbol and the variable to which it refers are explained elsewhere. See Definitions and Variable Bindings, for how associations between symbols and variables are created, and Modules, for how those associations are affected by Guile’s module system.
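A short sketch of the difference between evaluating a symbol and quoting it. The variable name greeting is hypothetical and used only for illustration.

```scheme
(define greeting "hello")

greeting                    ; ⇒ "hello"   (variable reference)
'greeting                   ; ⇒ greeting  (the symbol itself)
(symbol? 'greeting)         ; ⇒ #t
(string? greeting)          ; ⇒ #t

;; Evaluating the unquoted symbol string-length yields a procedure.
(procedure? string-length)  ; ⇒ #t
(string-length "abcd")      ; ⇒ 4
```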
Given any Scheme value, you can determine whether it is a symbol using the symbol? primitive:
Return #t if obj is a symbol, otherwise return #f.
Equivalent to scm_is_true (scm_symbol_p (val)).
Once you know that you have a symbol, you can obtain its name as a string by calling symbol->string. Note that Guile differs by default from R5RS on the details of symbol->string as regards case-sensitivity:
Return the name of symbol s as a string. By default, Guile reads symbols case-sensitively, so the string returned will have the same case variation as the sequence of characters that caused s to be created.
If Guile is set to read symbols case-insensitively (as specified by R5RS), and s comes into being as part of a literal expression (see Literal expressions in The Revised^5 Report on Scheme) or by a call to the read or string-ci->symbol procedures, Guile converts any alphabetic characters in the symbol’s name to lower case before creating the symbol object, so the string returned here will be in lower case.
If s was created by string->symbol, the case of characters in the string returned will be the same as that in the string that was passed to string->symbol, regardless of Guile’s case-sensitivity setting at the time s was created.
It is an error to apply mutation procedures like string-set! to strings returned by this procedure.
Most symbols are created by writing them literally in code. However it is also possible to create symbols programmatically using the following procedures:
Return a newly allocated symbol made from the given character arguments.
(symbol #\x #\y #\z) ⇒ xyz
Return a newly allocated symbol made from a list of characters.
(list->symbol '(#\a #\b #\c)) ⇒ abc
Return a newly allocated symbol whose characters form the concatenation of the given symbols, arg ....
(let ((h 'hello)) (symbol-append h 'world)) ⇒ helloworld
Return the symbol whose name is string. This procedure can create symbols with names containing special characters or letters in the non-standard case, but it is usually a bad idea to create such symbols because in some implementations of Scheme they cannot be read as themselves.
Return the symbol whose name is str. If Guile is currently reading symbols case-insensitively, str is converted to lowercase before the returned symbol is looked up or created.
The following examples illustrate Guile’s detailed behaviour as regards the case-sensitivity of symbols:
(read-enable 'case-insensitive)   ; R5RS compliant behaviour

(symbol->string 'flying-fish) ⇒ "flying-fish"
(symbol->string 'Martin) ⇒ "martin"
(symbol->string
  (string->symbol "Malvina")) ⇒ "Malvina"

(eq? 'mISSISSIppi 'mississippi) ⇒ #t
(string->symbol "mISSISSIppi") ⇒ mISSISSIppi
(eq? 'bitBlt (string->symbol "bitBlt")) ⇒ #f
(eq? 'LolliPop
     (string->symbol (symbol->string 'LolliPop))) ⇒ #t
(string=? "K. Harper, M.D."
          (symbol->string
            (string->symbol "K. Harper, M.D."))) ⇒ #t

(read-disable 'case-insensitive)   ; Guile default behaviour

(symbol->string 'flying-fish) ⇒ "flying-fish"
(symbol->string 'Martin) ⇒ "Martin"
(symbol->string
  (string->symbol "Malvina")) ⇒ "Malvina"

(eq? 'mISSISSIppi 'mississippi) ⇒ #f
(string->symbol "mISSISSIppi") ⇒ mISSISSIppi
(eq? 'bitBlt (string->symbol "bitBlt")) ⇒ #t
(eq? 'LolliPop
     (string->symbol (symbol->string 'LolliPop))) ⇒ #t
(string=? "K. Harper, M.D."
          (symbol->string
            (string->symbol "K. Harper, M.D."))) ⇒ #t
From C, there are lower level functions that construct a Scheme symbol from a C string in the current locale encoding.
When you want to do more from C, you should convert between symbols and strings using scm_symbol_to_string and scm_string_to_symbol and work with the strings.
Construct and return a Scheme symbol whose name is specified by the null-terminated C string name. These are appropriate when the C string is hard-coded in the source code.
Construct and return a Scheme symbol whose name is specified by name. For scm_from_locale_symbol, name must be null terminated; for scm_from_locale_symboln the length of name is specified explicitly by len.
Note that these functions should not be used when name is a C string constant, because there is no guarantee that the current locale will match that of the execution character set, used for string and character constants. Most modern C compilers use UTF-8 by default, so in such cases we recommend scm_from_utf8_symbol.
Like scm_from_locale_symbol and scm_from_locale_symboln, respectively, but also frees str with free eventually. Thus, you can use this function when you would free str anyway immediately after creating the Scheme symbol. In certain cases, Guile can then use str directly as its internal representation.
The size of a symbol can also be obtained from C:
Return the number of characters in sym.
Finally, some applications, especially those that generate new Scheme code dynamically, need to generate symbols for use in the generated code. The gensym primitive meets this need:
Create a new symbol with a name constructed from a prefix and a counter value. The string prefix can be specified as an optional argument. Default prefix is ‘ g’. The counter is increased by 1 at each call. There is no provision for resetting the counter.
The symbols generated by gensym are likely to be unique, since their names begin with a space and it is only otherwise possible to generate such symbols if a programmer goes out of their way to do so. Uniqueness can be guaranteed by instead using uninterned symbols (see Uninterned Symbols), though they can’t be usefully written out and read back in.
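A brief sketch of gensym in use. The generated names depend on the internal counter, so they are not shown; only properties that hold regardless of the counter value are checked. The prefix "tmp" is an arbitrary choice for illustration.

```scheme
(define g1 (gensym))         ; default prefix
(define g2 (gensym "tmp"))   ; caller-supplied prefix

(symbol? g1)                                 ; ⇒ #t
(eq? g1 (gensym))                            ; ⇒ #f (the counter has advanced)
(string-prefix? "tmp" (symbol->string g2))   ; ⇒ #t
(symbol-interned? g1)                        ; ⇒ #t (gensym symbols are interned)
```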
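In traditional Lisp dialects, symbols are often understood as having three kinds of value at once: a variable value, a function value, and a property list value, the last being the one accessed through Lisp’s put or get functions.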
Although Scheme (as one of its simplifications with respect to Lisp) does away with the distinction between variable and function namespaces, Guile currently retains some elements of the traditional structure in case they turn out to be useful when implementing translators for other languages, in particular Emacs Lisp.
Specifically, Guile symbols have two extra slots, one for a symbol’s property list, and one for its “function value.” The following procedures are provided to access these slots.
Return the contents of symbol’s function slot.
Set the contents of symbol’s function slot to value.
Return the property list currently associated with symbol.
Set symbol’s property list to value.
From sym’s property list, return the value for property prop. The assumption is that sym’s property list is an association list whose keys are distinguished from each other using equal?; prop should be one of the keys in that list. If the property list has no entry for prop, symbol-property returns #f.
In sym’s property list, set the value for property prop to val, or add a new entry for prop, with value val, if none already exists. For the structure of the property list, see symbol-property.
From sym’s property list, remove the entry for property prop, if there is one. For the structure of the property list, see symbol-property.
Support for these extra slots may be removed in a future release, and it is probably better to avoid using them. For a more modern and Schemely approach to properties, see Object Properties.
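For completeness, here is a minimal sketch of the slot procedures described above, assuming a Guile version that still provides them. The symbol demo-symbol and the property names are hypothetical.

```scheme
(define sym 'demo-symbol)   ; an arbitrary symbol chosen for this sketch

;; Property list access.
(set-symbol-property! sym 'colour 'red)
(symbol-property sym 'colour)        ; ⇒ red
(symbol-property sym 'size)          ; ⇒ #f  (no such property)

(symbol-property-remove! sym 'colour)
(symbol-property sym 'colour)        ; ⇒ #f

;; The function slot is independent of any variable binding.
(symbol-fset! sym car)
((symbol-fref sym) '(1 2 3))         ; ⇒ 1
```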
The read syntax for a symbol is a sequence of letters, digits, and extended alphabetic characters, beginning with a character that cannot begin a number. In addition, the special cases of +, -, and ... are read as symbols even though numbers can begin with +, - or a dot (.).
Extended alphabetic characters may be used within identifiers as if they were letters. The set of extended alphabetic characters is:
! $ % & * + - . / : < = > ? @ ^ _ ~
In addition to the standard read syntax defined above (which is taken from R5RS (see Formal syntax in The Revised^5 Report on Scheme)), Guile provides an extended symbol read syntax that allows the inclusion of unusual characters such as space characters, newlines and parentheses. If (for whatever reason) you need to write a symbol containing characters not mentioned above, you can do so as follows.
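Begin the symbol with the characters #{, write the characters of the symbol, and finish the symbol with the characters }#.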
Here are a few examples of this form of read syntax. The first symbol needs to use extended syntax because it contains a space character, the second because it contains a line break, and the last because it looks like a number.
#{foo bar}#

#{what
ever}#

#{4242}#
Although Guile provides this extended read syntax for symbols, widespread usage of it is discouraged because it is not portable and not very readable.
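A short sketch of how symbols written in the extended syntax relate to their names, using only the standard symbol->string and string->symbol procedures:

```scheme
(symbol->string '#{foo bar}#)                  ; ⇒ "foo bar"
(eq? '#{foo bar}# (string->symbol "foo bar"))  ; ⇒ #t

;; This looks like a number but is read as a symbol.
(symbol? '#{4242}#)                            ; ⇒ #t
(number? '#{4242}#)                            ; ⇒ #f
```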
Alternatively, if you enable the r7rs-symbols read option (see Reading Scheme Code), you can write arbitrary symbols using the same notation used for strings, except delimited by vertical bars instead of double quotes.
|foo bar|
|\x3BB; is a greek lambda|
|\| is a vertical bar|
Note that there’s also an r7rs-symbols print option (see Writing Scheme Values). To enable the use of this notation, evaluate one or both of the following expressions:
(read-enable 'r7rs-symbols)
(print-enable 'r7rs-symbols)
What makes symbols useful is that they are automatically kept unique. There are no two symbols that are distinct objects but have the same name. But of course, there is no rule without exception. In addition to the normal symbols that have been discussed up to now, you can also create special uninterned symbols that behave slightly differently.
To understand what is different about them and why they might be useful, we look at how normal symbols are actually kept unique.
Whenever Guile wants to find the symbol with a specific name, for example during read or when executing string->symbol, it first looks into a table of all existing symbols to find out whether a symbol with the given name already exists. When this is the case, Guile just returns that symbol. When not, a new symbol with the name is created and entered into the table so that it can be found later.
Sometimes you might want to create a symbol that is guaranteed ‘fresh’, i.e. a symbol that did not exist previously. You might also want to somehow guarantee that no one else will ever unintentionally stumble across your symbol in the future. These properties of a symbol are often needed when generating code during macro expansion. When introducing new temporary variables, you want to guarantee that they don’t conflict with variables in other people’s code.
The simplest way to arrange for this is to create a new symbol but not enter it into the global table of all symbols. That way, no one will ever get access to your symbol by chance. Symbols that are not in the table are called uninterned. Of course, symbols that are in the table are called interned.
You create new uninterned symbols with the function make-symbol. You can test whether a symbol is interned or not with symbol-interned?.
Uninterned symbols break the rule that the name of a symbol uniquely identifies the symbol object. Because of this, they cannot be written out and read back in like interned symbols. Currently, Guile has no support for reading uninterned symbols. Note that the function gensym does not return uninterned symbols for this reason.
Return a new uninterned symbol with the name name. The returned symbol is guaranteed to be unique and future calls to string->symbol will not return it.
Return #t if symbol is interned, otherwise return #f.
For example:
(define foo-1 (string->symbol "foo"))
(define foo-2 (string->symbol "foo"))
(define foo-3 (make-symbol "foo"))
(define foo-4 (make-symbol "foo"))

(eq? foo-1 foo-2)
⇒ #t
; Two interned symbols with the same name are the same object,

(eq? foo-1 foo-3)
⇒ #f
; but a call to make-symbol with the same name returns a
; distinct object.

(eq? foo-3 foo-4)
⇒ #f
; A call to make-symbol always returns a new object, even for
; the same name.

foo-3
⇒ #<uninterned-symbol foo 8085290>
; Uninterned symbols print differently from interned symbols,

(symbol? foo-3)
⇒ #t
; but they are still symbols,

(symbol-interned? foo-3)
⇒ #f
; just not interned.
Keywords are self-evaluating objects with a convenient read syntax that makes them easy to type.
Guile’s keyword support conforms to R5RS, and adds a (switchable) read syntax extension to permit keywords to begin with : as well as #:, or to end with :.
Keywords are useful in contexts where a program or procedure wants to be able to accept a large number of optional arguments without making its interface unmanageable.
To illustrate this, consider a hypothetical make-window procedure, which creates a new window on the screen for drawing into using some graphical toolkit. There are many parameters that the caller might like to specify, but which could also be sensibly defaulted, for example the color depth, the background color, the width and the height of the window.
If make-window did not use keywords, the caller would have to pass in a value for each possible argument, remembering the correct argument order and using a special value to indicate the default value for that argument:
(make-window 'default              ;; Color depth
             'default              ;; Background color
             800                   ;; Width
             100                   ;; Height
             …)                    ;; More make-window arguments
With keywords, on the other hand, defaulted arguments are omitted, and non-default arguments are clearly tagged by the appropriate keyword. As a result, the invocation becomes much clearer:
(make-window #:width 800 #:height 100)
On the other hand, for a simpler procedure with few arguments, the use of keywords would be a hindrance rather than a help. The primitive procedure cons, for example, would not be improved if it had to be invoked as
(cons #:car x #:cdr y)
So the decision whether to use keywords or not is purely pragmatic: use them if they will clarify the procedure invocation at point of call.
If a procedure wants to support keywords, it should take a rest argument and then use whatever means is convenient to extract keywords and their corresponding arguments from the contents of that rest argument.
The following example illustrates the principle: the code for make-window uses a helper procedure called get-keyword-value to extract individual keyword arguments from the rest argument.
(define (get-keyword-value args keyword default)
  (let ((kv (memq keyword args)))
    (if (and kv (>= (length kv) 2))
        (cadr kv)
        default)))

(define (make-window . args)
  (let ((depth  (get-keyword-value args #:depth  screen-depth))
        (bg     (get-keyword-value args #:bg     "white"))
        (width  (get-keyword-value args #:width  800))
        (height (get-keyword-value args #:height 100))
        …)
    …))
But you don’t need to write get-keyword-value. The (ice-9 optargs) module provides a set of powerful macros that you can use to implement keyword-supporting procedures like this:
(use-modules (ice-9 optargs))

(define (make-window . args)
  (let-keywords args #f ((depth  screen-depth)
                         (bg     "white")
                         (width  800)
                         (height 100))
    ...))
Or, even more economically, like this:
(use-modules (ice-9 optargs))

(define* (make-window #:key (depth  screen-depth)
                            (bg     "white")
                            (width  800)
                            (height 100))
  ...)
For further details on let-keywords, define* and other facilities provided by the (ice-9 optargs) module, see Optional Arguments.
To handle keyword arguments from procedures implemented in C, use scm_c_bind_keyword_arguments (see Keyword Procedures).
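As a runnable sketch of the define* form, the stand-in make-window below simply returns its settings as a list instead of opening a window; that body, and the omission of the depth parameter (which depends on an undefined screen-depth), are assumptions made for illustration.

```scheme
(use-modules (ice-9 optargs))

;; A stand-in make-window that merely reports the values it received.
(define* (make-window #:key (bg "white") (width 800) (height 100))
  (list bg width height))

(make-window)                          ; ⇒ ("white" 800 100)
(make-window #:height 200)             ; ⇒ ("white" 800 200)
(make-window #:width 640 #:bg "grey")  ; ⇒ ("grey" 640 100)
```

Keyword arguments may be given in any order, and any that are omitted take the default written in the parameter list.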
Guile, by default, only recognizes a keyword syntax that is compatible with R5RS. A token of the form #:NAME, where NAME has the same syntax as a Scheme symbol (see Extended Read Syntax for Symbols), is the external representation of the keyword named NAME. Keyword objects print using this syntax as well, so values containing keyword objects can be read back into Guile. When used in an expression, keywords are self-quoting objects.
If the keywords read option is set to 'prefix, Guile also recognizes the alternative read syntax :NAME. Otherwise, tokens of the form :NAME are read as symbols, as required by R5RS.
If the keywords read option is set to 'postfix, Guile recognizes the SRFI-88 read syntax NAME: (see SRFI-88 Keyword Objects). Otherwise, tokens of this form are read as symbols.
To enable and disable the alternative non-R5RS keyword syntax, you use the read-set! procedure documented in Reading Scheme Code. Note that the prefix and postfix syntax are mutually exclusive.
(read-set! keywords 'prefix)

#:type ⇒ #:type
:type ⇒ #:type

(read-set! keywords 'postfix)

type: ⇒ #:type
:type ⇒ :type

(read-set! keywords #f)

#:type ⇒ #:type
:type
-|
ERROR: In expression :type:
ERROR: Unbound variable: :type
ABORT: (unbound-variable)
Return #t if the argument obj is a keyword, else #f.
Return the symbol with the same name as keyword.
Return the keyword with the same name as symbol.
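A brief sketch of the three keyword procedures working together; the keyword #:width is an arbitrary example.

```scheme
(keyword? #:width)                      ; ⇒ #t
(keyword? 'width)                       ; ⇒ #f
(keyword->symbol #:width)               ; ⇒ width
(symbol->keyword 'width)                ; ⇒ #:width

;; Converting a symbol to a keyword yields the same keyword object
;; that the reader produces for #:width.
(eq? #:width (symbol->keyword 'width))  ; ⇒ #t
```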
Equivalent to scm_is_true (scm_keyword_p (obj)).
Equivalent to scm_symbol_to_keyword (scm_from_locale_symbol (name)) and scm_symbol_to_keyword (scm_from_locale_symboln (name, len)), respectively.
Note that these functions should not be used when name is a C string constant, because there is no guarantee that the current locale will match that of the execution character set, used for string and character constants. Most modern C compilers use UTF-8 by default, so in such cases we recommend scm_from_utf8_keyword.
Equivalent to scm_symbol_to_keyword (scm_from_latin1_symbol (name)) and scm_symbol_to_keyword (scm_from_utf8_symbol (name)), respectively.
Extract the specified keyword arguments from rest, which is not modified. If the keyword argument keyword1 is present in rest with an associated value, that value is stored in the variable pointed to by argp1, otherwise the variable is left unchanged. Similarly for the other keywords and argument pointers up to keywordN and argpN. The argument list to scm_c_bind_keyword_arguments must be terminated by SCM_UNDEFINED.
Note that since the variables pointed to by argp1 through argpN are left unchanged if the associated keyword argument is not present, they should be initialized to their default values before calling scm_c_bind_keyword_arguments. Alternatively, you can initialize them to SCM_UNDEFINED before the call, and then use SCM_UNBNDP after the call to see which ones were provided.
If an unrecognized keyword argument is present in rest and flags does not contain SCM_ALLOW_OTHER_KEYS, or if non-keyword arguments are present and flags does not contain SCM_ALLOW_NON_KEYWORD_ARGUMENTS, an exception is raised. subr should be the name of the procedure receiving the keyword arguments, for purposes of error reporting.
For example:
SCM k_delimiter;
SCM k_grammar;
SCM sym_infix;

SCM my_string_join (SCM strings, SCM rest)
{
  SCM delimiter = SCM_UNDEFINED;
  SCM grammar   = sym_infix;

  scm_c_bind_keyword_arguments ("my-string-join", rest, 0,
                                k_delimiter, &delimiter,
                                k_grammar, &grammar,
                                SCM_UNDEFINED);

  if (SCM_UNBNDP (delimiter))
    delimiter = scm_from_utf8_string (" ");

  return scm_string_join (strings, delimiter, grammar);
}

void my_init ()
{
  k_delimiter = scm_from_utf8_keyword ("delimiter");
  k_grammar   = scm_from_utf8_keyword ("grammar");
  sym_infix   = scm_from_utf8_symbol ("infix");
  scm_c_define_gsubr ("my-string-join", 1, 0, 1, my_string_join);
}
Pairs are used to combine two Scheme objects into one compound object. Hence the name: A pair stores a pair of objects.
The data type pair is extremely important in Scheme, just like in any other Lisp dialect. The reason is that pairs are not only used to make two values available as one object, but that pairs are used for constructing lists of values. Because lists are so important in Scheme, they are described in a section of their own (see Lists).
Pairs can be entered literally in source code or at the REPL, in the so-called dotted list syntax. This syntax consists of an opening parenthesis, the first element of the pair, a dot, the second element and a closing parenthesis. The following example shows how a pair consisting of the two numbers 1 and 2, and a pair containing the symbols foo and bar can be entered. It is very important to write the whitespace before and after the dot, because otherwise the Scheme parser would not be able to figure out where to split the tokens.
(1 . 2)
(foo . bar)
But beware, if you want to try out these examples, you have to quote the expressions. More information about quotation is available in the section Expression Syntax. The correct way to try these examples is as follows.
'(1 . 2) ⇒ (1 . 2)
'(foo . bar) ⇒ (foo . bar)
A new pair is made by calling the procedure cons with two arguments. Then the argument values are stored into a newly allocated pair, and the pair is returned. The name cons stands for "construct". Use the procedure pair? to test whether a given Scheme object is a pair or not.
Return a newly allocated pair whose car is x and whose cdr is y. The pair is guaranteed to be different (in the sense of eq?) from every previously existing object.
Return #t if x is a pair; otherwise return #f.
Return 1 when x is a pair; otherwise return 0.
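A minimal sketch of cons and pair? at work; the variable name p is arbitrary.

```scheme
(define p (cons 1 2))

p                             ; ⇒ (1 . 2)
(pair? p)                     ; ⇒ #t
(pair? '(1 2 3))              ; ⇒ #t  (a non-empty list is a pair)
(pair? '())                   ; ⇒ #f  (the empty list is not)

;; Each call to cons allocates a fresh pair.
(eq? (cons 1 2) (cons 1 2))     ; ⇒ #f
(equal? (cons 1 2) (cons 1 2))  ; ⇒ #t
```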
The two parts of a pair are traditionally called car and cdr. They can be retrieved with procedures of the same name (car and cdr), and can be modified with the procedures set-car! and set-cdr!.
Since a very common operation in Scheme programs is to access the car of a car of a pair, or the car of the cdr of a pair, etc., the procedures called caar, cadr and so on are also predefined. However, using these procedures is often detrimental to readability, and error-prone. Thus, accessing the contents of a list is usually better achieved using pattern matching techniques (see Pattern Matching).
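To illustrate the readability point, the sketch below destructures a three-element list with match from the (ice-9 match) module and compares it with the equivalent accessor calls. The procedure name sum-triple is hypothetical.

```scheme
(use-modules (ice-9 match))

;; Destructure a three-element list instead of calling car, cadr and caddr.
(define (sum-triple lst)
  (match lst
    ((a b c) (+ a b c))))

(sum-triple '(1 2 3))   ; ⇒ 6

;; The same computation written with accessors:
(let ((lst '(1 2 3)))
  (+ (car lst) (cadr lst) (caddr lst)))   ; ⇒ 6
```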
Return the car or the cdr of pair, respectively.
These two macros are the fastest way to access the car or cdr of a pair; they can be thought of as compiling into a single memory reference.
These macros do no checking at all. The argument pair must be a valid pair.
These procedures are compositions of car and cdr, where for example caddr could be defined by
(define caddr (lambda (x) (car (cdr (cdr x)))))
cadr, caddr and cadddr pick out the second, third or fourth elements of a list, respectively. SRFI-1 provides the same under the names second, third and fourth (see Selectors).
Stores value in the car field of pair. The value returned by set-car! is unspecified.
Stores value in the cdr field of pair. The value returned by set-cdr! is unspecified.
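A short sketch of the accessors and mutators described above. The pair and list are built with cons and list rather than quoted, so that they are freshly allocated and safe to mutate.

```scheme
(define p (cons 'a 'b))
(set-car! p 'x)
(set-cdr! p 'y)
p              ; ⇒ (x . y)

(define lst (list 1 2 3 4))
(car lst)      ; ⇒ 1
(cdr lst)      ; ⇒ (2 3 4)
(cadr lst)     ; ⇒ 2
(caddr lst)    ; ⇒ 3
(cadddr lst)   ; ⇒ 4
```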
A very important data type in Scheme, as well as in all other Lisp dialects, is the data type list.
This is the short definition of what a list is: either the empty list (), or a pair which has a list in its cdr.
The syntax for lists is an opening parenthesis, then all the elements of the list (separated by whitespace) and finally a closing parenthesis.
(1 2 3)            ; a list of the numbers 1, 2 and 3
("foo" bar 3.1415) ; a string, a symbol and a real number
()                 ; the empty list
The last example needs a bit more explanation. A list with no elements, called the empty list, is special in some ways. It is used for terminating lists by storing it into the cdr of the last pair that makes up a list. An example will clear that up:
(car '(1)) ⇒ 1 (cdr '(1)) ⇒ ()
This example also shows that lists have to be quoted when written (see Expression Syntax), because they would otherwise be mistakenly taken as procedure applications (see Simple Procedure Invocation).
Often it is useful to test whether a given Scheme object is a list or not. List-processing procedures could use this information to test whether their input is valid, or they could do different things depending on the datatype of their arguments.
The predicate null? is often used in list-processing code to tell whether a given list has run out of elements. That is, a loop somehow deals with the elements of a list until the list satisfies null?. Then, the algorithm terminates.
Return 1 when x is the empty list; otherwise return 0.
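The loop pattern described above can be sketched as follows. The procedure name sum-list is hypothetical; list? and null? are the standard predicates.

```scheme
;; Walk down the list until null? signals that no elements remain.
(define (sum-list lst)
  (let loop ((rest lst) (total 0))
    (if (null? rest)
        total
        (loop (cdr rest) (+ total (car rest))))))

(sum-list '(1 2 3))   ; ⇒ 6
(sum-list '())        ; ⇒ 0

(list? '(1 2))        ; ⇒ #t
(list? '(1 . 2))      ; ⇒ #f  (a pair, but not a proper list)
(null? '())           ; ⇒ #t
```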
This section describes the procedures for constructing new lists.
list simply returns a list where the elements are the arguments; cons* is similar, but the last argument is stored in the cdr of the last pair of the list.
Return a new list containing elements elem ....
scm_list_n takes a variable number of arguments, terminated by the special SCM_UNDEFINED. That final SCM_UNDEFINED is not included in the list. None of elem … can themselves be SCM_UNDEFINED, or scm_list_n will terminate at that point.
Like list, but the last arg provides the tail of the constructed list, returning (cons arg1 (cons arg2 (cons … argn))). Requires at least one argument. If given one argument, that argument is returned as result. This function is called list* in some other Schemes and in Common LISP.
Return a (newly-created) copy of lst.
Create a list of n elements, where each element is initialized to init. init defaults to the empty list () if not given.
Note that list-copy only makes a copy of the pairs which make up the spine of the lists. The list elements are not copied, which means that modifying the elements of the new list also modifies the elements of the old list. On the other hand, applying procedures like set-cdr! or delv! to the new list will not alter the old list. If you also need to copy the list elements (making a deep copy), use the procedure copy-tree (see Copying Deep Structures).
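A small sketch of the constructors, including the shallow-copy behaviour of list-copy noted above. The names orig and copy are arbitrary.

```scheme
(list 1 2 3)          ; ⇒ (1 2 3)
(cons* 1 2 '(3 4))    ; ⇒ (1 2 3 4)
(cons* 1 2 3)         ; ⇒ (1 2 . 3)
(make-list 3 'x)      ; ⇒ (x x x)

;; list-copy copies the spine only; elements are shared.
(define orig (list (list 1 2) 3))
(define copy (list-copy orig))

(set-car! (car copy) 'changed)   ; mutates an element shared by both lists
(car orig)                       ; ⇒ (changed 2)

(set-cdr! copy '())              ; mutates only the copy's spine
(length copy)                    ; ⇒ 1
(length orig)                    ; ⇒ 2
```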
These procedures are used to get some information about a list, or to retrieve one or more elements of a list.
Return the number of elements in list lst.
Return the last pair in lst, signalling an error if lst is circular.
Return the kth element from list.
Return the "tail" of lst beginning with its kth element. The first element of the list is considered to be element 0.
list-tail and list-cdr-ref are identical. It may help to think of list-cdr-ref as accessing the kth cdr of the list, or returning the results of cdring k times down lst.
Copy the first k elements from lst into a new list, and return it.
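A brief sketch of the selection procedures described in this section, applied to one sample list (the name lst is illustrative):

```scheme
(define lst '(a b c d e))

(length lst)          ; ⇒ 5
(list-ref lst 2)      ; ⇒ c
(list-tail lst 2)     ; ⇒ (c d e)
(list-cdr-ref lst 2)  ; ⇒ (c d e)
(list-head lst 2)     ; ⇒ (a b)
(last-pair lst)       ; ⇒ (e)
```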
append and append! are used to concatenate two or more lists in order to form a new list. reverse and reverse! return lists with the same elements as their arguments, but in reverse order. The procedure variants with an ! directly modify the pairs which form the list, whereas the other procedures create new pairs. This is why you should be careful when using the side-effecting variants.
Return a list comprising all the elements of lists lst … obj. If called with no arguments, return the empty list.
(append '(x) '(y)) ⇒ (x y)
(append '(a) '(b c d)) ⇒ (a b c d)
(append '(a (b)) '((c))) ⇒ (a (b) (c))
The last argument obj may actually be any object; an improper list results if the last argument is not a proper list.
(append '(a b) '(c . d)) ⇒ (a b c . d)
(append '() 'a) ⇒ a
append doesn’t modify the given lists, but the return may share structure with the final obj. append! is permitted, but not required, to modify the given lists to form its return.
For scm_append and scm_append_x, lstlst is a list of the list operands lst … obj. That lstlst itself is not modified or used in the return.
Return a list comprising the elements of lst, in reverse order.
reverse constructs a new list. reverse! is permitted, but not required, to modify lst in constructing its return.
For reverse!, the optional newtail is appended to the result. newtail isn’t reversed; it simply becomes the list tail. For scm_reverse_x, the newtail parameter is mandatory, but can be SCM_EOL if no further tail is required.
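The sketch below shows structure sharing with append and the optional newtail argument of reverse!. The lists are built with list rather than quoted, since reverse! may modify its argument.

```scheme
(define a (list 1 2))
(define b (list 3 4))
(define ab (append a b))

ab                   ; ⇒ (1 2 3 4)
a                    ; ⇒ (1 2), append does not modify its arguments
;; The tail of the result is the final argument itself.
(eq? (cddr ab) b)    ; ⇒ #t

(reverse ab)         ; ⇒ (4 3 2 1)
;; newtail is attached after the reversed elements, unreversed.
(reverse! (list 1 2 3) '(x y))   ; ⇒ (3 2 1 x y)
```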
The following procedures modify an existing list, either by changing elements of the list, or by changing the list structure itself.
Set the kth element of list to val.
Set the kth cdr of list to val.
Return a newly-created copy of lst with elements eq? to item removed. This procedure mirrors memq: delq compares elements of lst against item with eq?.
Return a newly-created copy of lst with elements eqv? to item removed. This procedure mirrors memv: delv compares elements of lst against item with eqv?.
Return a newly-created copy of lst with elements equal? to item removed. This procedure mirrors member: delete compares elements of lst against item with equal?.
See also SRFI-1 which has an extended delete (Deleting), and also an lset-difference which can delete multiple items in one call (Set Operations on Lists).
These procedures are destructive versions of delq, delv and delete: they modify the pointers in the existing lst rather than creating a new list. Caveat evaluator: Like other destructive list functions, these functions cannot modify the binding of lst, and so cannot be used to delete the first element of lst destructively.
Like delq!, but only deletes the first occurrence of item from lst. Tests for equality using eq?. See also delv1! and delete1!.
Like delv!, but only deletes the first occurrence of item from lst. Tests for equality using eqv?. See also delq1! and delete1!.
Like delete!, but only deletes the first occurrence of item from lst. Tests for equality using equal?. See also delq1! and delv1!.
Return a list containing all elements from lst which satisfy the predicate pred. The elements in the result list have the same order as in lst. The order in which pred is applied to the list elements is not specified.
filter does not change lst, but the result may share a tail with it. filter! may modify lst to construct its return.
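Because the destructive procedures cannot change the binding of their argument, the usual idiom is to store the return value back into the variable. A minimal sketch, with lst as an illustrative name:

```scheme
(define lst (list 'a 'b 'a 'c))

;; Non-destructive: lst itself is unchanged.
(delq 'a lst)                  ; ⇒ (b c)
lst                            ; ⇒ (a b a c)

;; Destructive: always use the return value, since the first
;; pair of lst may be among those removed.
(set! lst (delq! 'a lst))
lst                            ; ⇒ (b c)

;; The "1" variants remove only the first occurrence.
(delq1! 'x (list 'x 'y 'x))    ; ⇒ (y x)

;; delete compares with equal?, so it works for strings.
(delete "b" (list "a" "b"))    ; ⇒ ("a")

(filter odd? '(1 2 3 4 5))     ; ⇒ (1 3 5)
```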
The following procedures search lists for particular elements. They use different comparison predicates for comparing list elements with the object to be searched. When they fail, they return #f, otherwise they return the sublist whose car is equal to the search object, where equality depends on the equality predicate used.
Return the first sublist of lst whose car is eq? to x where the sublists of lst are the non-empty lists returned by (list-tail lst k) for k less than the length of lst. If x does not occur in lst, then #f (not the empty list) is returned.
Return the first sublist of lst whose car is eqv? to x where the sublists of lst are the non-empty lists returned by (list-tail lst k) for k less than the length of lst. If x does not occur in lst, then #f (not the empty list) is returned.
Return the first sublist of lst whose car is equal? to x where the sublists of lst are the non-empty lists returned by (list-tail lst k) for k less than the length of lst. If x does not occur in lst, then #f (not the empty list) is returned.
See also SRFI-1 which has an extended member function (Searching).
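For instance, the three search procedures behave as follows on small sample lists:

```scheme
(memq 'c '(a b c d))              ; ⇒ (c d)
(memq 'z '(a b c d))              ; ⇒ #f
(memv 2.0 '(1.0 2.0 3.0))         ; ⇒ (2.0 3.0)
(member "b" (list "a" "b" "c"))   ; ⇒ ("b" "c")
```

Since a successful search returns a non-empty list, the result can be used directly as a true value in a conditional.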
List processing is very convenient in Scheme because the process of iterating over the elements of a list can be highly abstracted. The procedures in this section are the most basic iterating procedures for lists. They take a procedure and one or more lists as arguments, and apply the procedure to each element of the list. They differ in their return value.
Apply proc to each element of the list arg1 (if only two arguments are given), or to the corresponding elements of the argument lists (if more than two arguments are given). The result(s) of the procedure applications are saved and returned in a list. For map, the order of procedure applications is not specified; map-in-order applies the procedure from left to right to the list elements.
Like map, but the procedure is always applied from left to right, and the result(s) of the procedure applications are thrown away. The return value is not specified.
See also SRFI-1 which extends these functions to take lists of unequal lengths (Fold, Unfold & Map).
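A short sketch of both procedures; the accumulator acc is an illustrative name used to make the left-to-right order of for-each visible.

```scheme
;; map collects the results into a new list.
(map (lambda (x) (* x x)) '(1 2 3))   ; ⇒ (1 4 9)
(map + '(1 2 3) '(10 20 30))          ; ⇒ (11 22 33)

;; for-each is called for its side effects only.
(define acc '())
(for-each (lambda (x) (set! acc (cons x acc)))
          '(1 2 3))
acc                                   ; ⇒ (3 2 1)
```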
Vectors are sequences of Scheme objects. Unlike lists, the length of a vector, once the vector is created, cannot be changed. The advantage of vectors over lists is that the time required to access one element of a vector given its position (synonymous with index), a zero-origin number, is constant, whereas lists have an access time linear in the position of the accessed element in the list.
Vectors can contain any kind of Scheme object; it is even possible to have different types of objects in the same vector. For vectors containing vectors, you may wish to use arrays, instead. Note, too, that vectors are the special case of one dimensional non-uniform arrays and that most array procedures operate happily on vectors (see Arrays).
Also see SRFI-43 - Vector Library, for a comprehensive vector library.
Vectors can literally be entered in source code, just like strings, characters or some of the other data types. The read syntax for vectors is as follows: A sharp sign (#), followed by an opening parenthesis, all elements of the vector in their respective read syntax, and finally a closing parenthesis. Like strings, vectors do not have to be quoted.
The following are examples of the read syntax for vectors; where the first vector only contains numbers and the second three different object types: a string, a symbol and a number in hexadecimal notation.
#(1 2 3)
#("Hello" foo #xdeadbeef)
Instead of creating a vector implicitly by using the read syntax just described, you can create a vector dynamically by calling one of the vector and list->vector primitives with the list of Scheme values that you want to place into a vector. The size of the vector thus created is determined implicitly by the number of arguments given.
Return a newly allocated vector composed of the given arguments. Analogous to list.
(vector 'a 'b 'c) ⇒ #(a b c)
The inverse operation is vector->list:
Return a newly allocated list composed of the elements of v.
(vector->list #(dah dah didah)) ⇒ (dah dah didah)
(list->vector '(dididit dah)) ⇒ #(dididit dah)
To allocate a vector with an explicitly specified size, use make-vector. With this primitive you can also specify an initial value for the vector elements (the same value for all elements, that is):
Return a newly allocated vector of len elements. If a second argument is given, then each position is initialized to fill. Otherwise the initial contents of each position is unspecified.
Like scm_make_vector, but the length is given as a size_t.
To check whether an arbitrary Scheme value is a vector, use the vector? primitive:
Return #t if obj is a vector, otherwise return #f.
Return non-zero when obj is a vector, otherwise return zero.
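The constructors and the predicate described above can be tried out as follows:

```scheme
(make-vector 3 'x)                ; ⇒ #(x x x)
;; Without a fill value only the length is specified.
(vector-length (make-vector 5))   ; ⇒ 5
(list->vector '(1 2 3))           ; ⇒ #(1 2 3)
(vector? (vector 1 2 3))          ; ⇒ #t
(vector? '(1 2 3))                ; ⇒ #f
```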
vector-length and vector-ref return information about a given vector, respectively its size and the elements that are contained in the vector.
Return the number of elements in vector as an exact integer.
Return the number of elements in vec as a size_t.
Return the contents of position k of vec. k must be a valid index of vec.
(vector-ref #(1 1 2 3 5 8 13 21) 5) ⇒ 8
(vector-ref #(1 1 2 3 5 8 13 21)
            (let ((i (round (* 2 (acos -1)))))
              (if (inexact? i)
                  (inexact->exact i)
                  i)))
⇒ 13
Return the contents of position k (a size_t) of vec.
A vector created by one of the dynamic vector constructor procedures (see Dynamic Vector Creation and Validation) can be modified using the following procedures.
NOTE: According to R5RS, it is an error to use any of these procedures on a literally read vector, because such vectors should be considered as constants. Currently, however, Guile does not detect this error.
Store obj in position k of vec. k must be a valid index of vec. The value returned by ‘vector-set!’ is unspecified.
(let ((vec (vector 0 '(2 2 2 2) "Anna")))
  (vector-set! vec 1 '("Sue" "Sue"))
  vec)
⇒ #(0 ("Sue" "Sue") "Anna")
Store obj in position k (a size_t) of vec.
Store fill in every position of vec. The value returned by vector-fill! is unspecified.
Copy elements from vec1, positions start1 to end1, to vec2 starting at position start2. start1 and start2 are inclusive indices; end1 is exclusive.
vector-move-left! copies elements in leftmost order. Therefore, in the case where vec1 and vec2 refer to the same vector, vector-move-left! is usually appropriate when start1 is greater than start2.
Copy elements from vec1, positions start1 to end1, to vec2 starting at position start2. start1 and start2 are inclusive indices; end1 is exclusive.
vector-move-right! copies elements in rightmost order. Therefore, in the case where vec1 and vec2 refer to the same vector, vector-move-right! is usually appropriate when start1 is less than start2.
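The following sketch shows both procedures moving an overlapping range within a single vector. The vectors v and w are illustrative; the argument order is assumed to be vec1 start1 end1 vec2 start2, matching the descriptions above.

```scheme
;; Shift elements 1..4 one position to the left.
;; start1 (1) is greater than start2 (0), so copy leftmost first.
(define v (vector 1 2 3 4 5))
(vector-move-left! v 1 5 v 0)
v    ; ⇒ #(2 3 4 5 5)

;; Shift elements 0..3 one position to the right.
;; start1 (0) is less than start2 (1), so copy rightmost first.
(define w (vector 1 2 3 4 5))
(vector-move-right! w 0 4 w 1)
w    ; ⇒ #(1 1 2 3 4)
```

Choosing the other direction in either case would overwrite source elements before they had been copied.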
A vector can be read and modified from C with the functions scm_c_vector_ref and scm_c_vector_set_x, for example. In addition to these functions, there are two more ways to access vectors from C that might be more efficient in certain situations: you can restrict yourself to simple vectors and then use the very fast simple vector macros; or you can use the very general framework for accessing all kinds of arrays (see Accessing Arrays from C), which is more verbose, but can deal efficiently with all kinds of vectors (and arrays). For vectors, you can use the scm_vector_elements and scm_vector_writable_elements functions as shortcuts.
Return non-zero if obj is a simple vector, else return zero. A simple vector is a vector that can be used with the SCM_SIMPLE_* macros below.
The following functions are guaranteed to return simple vectors: scm_make_vector, scm_c_make_vector, scm_vector, scm_list_to_vector.
Evaluates to the length of the simple vector vec. No type checking is done.
Evaluates to the element at position idx in the simple vector vec. No type or range checking is done.
Sets the element at position idx in the simple vector vec to val. No type or range checking is done.
Acquire a handle for the vector vec and return a pointer to the elements of it. This pointer can only be used to read the elements of vec. When vec is not a vector, an error is signaled. The handle must eventually be released with scm_array_handle_release.
The variables pointed to by lenp and incp are filled with the number of elements of the vector and the increment (number of elements) between successive elements, respectively. Successive elements of vec need not be contiguous in their underlying “root vector” returned here; hence the increment is not necessarily equal to 1 and may well be negative too (see Shared Arrays).
The following example shows the typical way to use this function. It creates a list of all elements of vec (in reverse order).
scm_t_array_handle handle;
size_t i, len;
ssize_t inc;
const SCM *elt;
SCM list;

elt = scm_vector_elements (vec, &handle, &len, &inc);
list = SCM_EOL;
for (i = 0; i < len; i++, elt += inc)
  list = scm_cons (*elt, list);
scm_array_handle_release (&handle);
Like scm_vector_elements but the pointer can be used to modify the vector.
The following example shows the typical way to use this function. It fills a vector with #t.
scm_t_array_handle handle;
size_t i, len;
ssize_t inc;
SCM *elt;

elt = scm_vector_writable_elements (vec, &handle, &len, &inc);
for (i = 0; i < len; i++, elt += inc)
  *elt = SCM_BOOL_T;
scm_array_handle_release (&handle);
A uniform numeric vector is a vector whose elements are all of a single numeric type. Guile offers uniform numeric vectors for signed and unsigned 8-bit, 16-bit, 32-bit, and 64-bit integers, two sizes of floating point values, and complex floating-point numbers of these two sizes. See SRFI-4 - Homogeneous numeric vector datatypes, for more information.
For many purposes, bytevectors work just as well as uniform vectors, and have the advantage that they integrate well with binary input and output. See Bytevectors, for more information on bytevectors.
Bit vectors are zero-origin, one-dimensional arrays of booleans. They are displayed as a sequence of 0s and 1s prefixed by #*, e.g.,
(make-bitvector 8 #f) ⇒ #*00000000
Bit vectors are the special case of one dimensional bit arrays, and can thus be used with the array procedures (see Arrays).
Return #t when obj is a bitvector, else return #f.
Return 1 when obj is a bitvector, else return 0.
Create a new bitvector of length len and optionally initialize all elements to fill.
Like scm_make_bitvector, but the length is given as a size_t.
Create a new bitvector with the arguments as elements.
Return the length of the bitvector vec.
Like scm_bitvector_length, but the length is returned as a size_t.
Return the element at index idx of the bitvector vec.
Return the element at index idx of the bitvector vec.
Set the element at index idx of the bitvector vec when val is true, else clear it.
Set the element at index idx of the bitvector vec when val is true, else clear it.
Set all elements of the bitvector vec when val is true, else clear them.
Return a new bitvector initialized with the elements of list.
Return a new list initialized with the elements of the bitvector vec.
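The basic bitvector procedures can be combined as in this sketch, which assumes the Scheme names make-bitvector, bitvector-set!, bitvector-ref, bitvector-length, bitvector-fill!, bitvector->list and list->bitvector for the operations described above:

```scheme
(define bv (make-bitvector 4 #f))

(bitvector-set! bv 1 #t)
bv                               ; ⇒ #*0100
(bitvector-ref bv 1)             ; ⇒ #t
(bitvector-length bv)            ; ⇒ 4
(bitvector->list bv)             ; ⇒ (#f #t #f #f)

(list->bitvector '(#t #f #t))    ; ⇒ #*101

(bitvector-fill! bv #t)
(bitvector->list bv)             ; ⇒ (#t #t #t #t)
```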
Return a count of how many entries in bitvector are equal to bool. For example,
(bit-count #f #*000111000) ⇒ 6
Return the index of the first occurrence of bool in bitvector, starting from start. If there is no bool entry between start and the end of bitvector, then return #f. For example,
(bit-position #t #*000101 0) ⇒ 3
(bit-position #f #*0001111 3) ⇒ #f
Modify bitvector by replacing each element with its negation.
Set entries of bitvector to bool, with uvec selecting the entries to change. The return value is unspecified.
If uvec is a bit vector, then those entries where it has #t are the ones in bitvector which are set to bool. uvec and bitvector must be the same length. When bool is #t it’s like uvec is OR’ed into bitvector. Or when bool is #f it can be seen as an ANDNOT.
(define bv #*01000010)
(bit-set*! bv #*10010001 #t)
bv
⇒ #*11010011
If uvec is a uniform vector of unsigned long integers, then they’re indexes into bitvector which are set to bool.
(define bv #*01000010)
(bit-set*! bv #u32(5 2 7) #t)
bv
⇒ #*01100111
Return a count of how many entries in bitvector are equal to bool, with uvec selecting the entries to consider.
uvec is interpreted in the same way as for bit-set*! above. Namely, if uvec is a bit vector then entries which have #t there are considered in bitvector. Or if uvec is a uniform vector of unsigned long integers then it’s the indexes in bitvector to consider.
For example,
(bit-count* #*01110111 #*11001101 #t) ⇒ 3
(bit-count* #*01110111 #u32(7 0 4) #f) ⇒ 2
Like scm_vector_elements (see Vector Accessing from C), but for bitvectors. The variable pointed to by offp is set to the value returned by scm_array_handle_bit_elements_offset. See scm_array_handle_bit_elements for how to use the returned pointer and the offset.
Like scm_bitvector_elements, but the pointer is good for reading and writing.
A bytevector is a raw bit string. The (rnrs bytevectors) module provides the programming interface specified by the Revised^6 Report on the Algorithmic Language Scheme (R6RS). It contains procedures to manipulate bytevectors and interpret their contents in a number of ways: bytevector contents can be accessed as signed or unsigned integers of various sizes and endianness, as IEEE-754 floating point numbers, or as strings. It is a useful tool to encode and decode binary data.
The R6RS (Section 4.3.4) specifies an external representation for bytevectors, whereby the octets (integers in the range 0–255) contained in the bytevector are represented as a list prefixed by #vu8:
#vu8(1 53 204)
denotes a 3-byte bytevector containing the octets 1, 53, and 204. Like string literals, booleans, etc., bytevectors are “self-quoting”, i.e., they do not need to be quoted:
#vu8(1 53 204) ⇒ #vu8(1 53 204)
Bytevectors can be used with the binary input/output primitives (see Binary I/O).
Some of the following procedures take an endianness parameter. The endianness is defined as the order of bytes in multi-byte numbers: numbers encoded in big endian have their most significant bytes written first, whereas numbers encoded in little endian have their least significant bytes first.
Little-endian is the native endianness of the IA32 architecture and its derivatives, while big-endian is native to SPARC and PowerPC, among others. The native-endianness procedure returns the native endianness of the machine it runs on.
Return a value denoting the native endianness of the host machine.
Return an object denoting the endianness specified by symbol. If symbol is neither big nor little then an error is raised at expand-time.
The objects denoting big- and little-endianness, respectively.
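To make the effect of the endianness parameter concrete, this sketch decodes the same two bytes in both orders. It uses bytevector-u16-ref from the (rnrs bytevectors) module.

```scheme
(use-modules (rnrs bytevectors))

(define bv #vu8(1 2))

;; Most significant byte first: #x0102.
(bytevector-u16-ref bv 0 (endianness big))      ; ⇒ 258
;; Least significant byte first: #x0201.
(bytevector-u16-ref bv 0 (endianness little))   ; ⇒ 513

;; The result of this call depends on the host machine.
(native-endianness)
```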
Bytevectors can be created, copied, and analyzed with the following procedures and C functions.
Return a new bytevector of len bytes. Optionally, if fill is given, fill it with fill; fill must be in the range [-128,255].
Return true if obj is a bytevector.
Equivalent to scm_is_true (scm_bytevector_p (obj)).
Return the length in bytes of bytevector bv.
Likewise, return the length in bytes of bytevector bv.
Return true if bv1 equals bv2, i.e., if they have the same length and contents.
Fill bytevector bv with fill, a byte.
Copy len bytes from source into target, reading from source-start (a non-negative index within source) and writing at target-start. It is permitted for the source and target regions to overlap.
Return a newly allocated copy of bv.
Return the byte at index in bytevector bv.
Set the byte at index in bv to value.
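The following sketch exercises the procedures above, assuming their R6RS names from (rnrs bytevectors). Note the argument order of bytevector-copy!: source, source-start, target, target-start, len.

```scheme
(use-modules (rnrs bytevectors))

(define src (u8-list->bytevector '(1 2 3 4)))
(define dst (make-bytevector 4 0))

;; Copy two bytes, starting at index 1 of src, to the start of dst.
(bytevector-copy! src 1 dst 0 2)
dst                                        ; ⇒ #vu8(2 3 0 0)
(bytevector-length dst)                    ; ⇒ 4

;; A copy is a distinct object with the same contents.
(bytevector=? src (bytevector-copy src))   ; ⇒ #t

;; The same byte read as unsigned and as signed.
(bytevector-fill! dst 255)
(bytevector-u8-ref dst 3)                  ; ⇒ 255
(bytevector-s8-ref dst 3)                  ; ⇒ -1
```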
Low-level C macros are available. They do not perform any type-checking; as such they should be used with care.
Return the length in bytes of bytevector bv.
Return a pointer to the contents of bytevector bv.
The contents of a bytevector can be interpreted as a sequence of integers of any given size, sign, and endianness.
(let ((bv (make-bytevector 4)))
  (bytevector-u8-set! bv 0 #x12)
  (bytevector-u8-set! bv 1 #x34)
  (bytevector-u8-set! bv 2 #x56)
  (bytevector-u8-set! bv 3 #x78)
  (map (lambda (number)
         (number->string number 16))
       (list (bytevector-u8-ref bv 0)
             (bytevector-u16-ref bv 0 (endianness big))
             (bytevector-u32-ref bv 0 (endianness little)))))
⇒ ("12" "1234" "78563412")
The most generic procedures to interpret bytevector contents as integers are described below.
Return the size-byte long unsigned integer at index index in bv, decoded according to endianness.
Return the size-byte long signed integer at index index in bv, decoded according to endianness.
Set the size-byte long unsigned integer at index to value, encoded according to endianness.
Set the size-byte long signed integer at index to value, encoded according to endianness.
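The generic accessors accept any size, not only powers of two. A sketch with 3-byte integers, assuming the R6RS argument order (bv index endianness size) for the readers and (bv index value endianness size) for the writers:

```scheme
(use-modules (rnrs bytevectors))

;; #x010203 read as a 3-byte big-endian unsigned integer.
(bytevector-uint-ref #vu8(1 2 3) 0 (endianness big) 3)            ; ⇒ 66051

;; All bits set is -1 when read as a signed integer.
(bytevector-sint-ref #vu8(255 255 255) 0 (endianness little) 3)   ; ⇒ -1

;; Writing #xABCDEF in little-endian order stores #xEF first.
(define bv (make-bytevector 3 0))
(bytevector-uint-set! bv 0 #xABCDEF (endianness little) 3)
bv                                                                ; ⇒ #vu8(239 205 171)
```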
The following procedures are similar to the ones above, but specialized to a given integer size:
Return the unsigned (respectively signed) n-bit integer (where n is 8, 16, 32 or 64) from bv at index, decoded according to endianness.
Store value as an unsigned (respectively signed) n-bit integer (where n is 8, 16, 32 or 64) in bv at index, encoded according to endianness.
Finally, a variant specialized for the host’s endianness is available for each of these functions (with the exception of the u8 and s8 accessors, as endianness is about byte order and there is only 1 byte):
Return the unsigned (respectively signed) n-bit integer (where n is 8, 16, 32 or 64) from bv at index, decoded according to the host’s native endianness.
Store value as an unsigned (respectively signed) n-bit integer (where n is 8, 16, 32 or 64) in bv at index, encoded according to the host’s native endianness.
Bytevector contents can readily be converted to/from lists of signed or unsigned integers:
(bytevector->sint-list (u8-list->bytevector (make-list 4 255))
                       (endianness little) 2)
⇒ (-1 -1)
Return a newly allocated list of unsigned 8-bit integers from the contents of bv.
Return a newly allocated bytevector consisting of the unsigned 8-bit integers listed in lst.
Return a list of unsigned integers of size bytes representing the contents of bv, decoded according to endianness.
Return a list of signed integers of size bytes representing the contents of bv, decoded according to endianness.
Return a new bytevector containing the unsigned integers listed in lst and encoded on size bytes according to endianness.
Return a new bytevector containing the signed integers listed in lst and encoded on size bytes according to endianness.
Bytevector contents can also be accessed as IEEE-754 single- or double-precision floating point numbers (respectively 32 and 64-bit long) using the procedures described here.
Return the IEEE-754 single-precision floating point number from bv at index according to endianness.
Store real number value in bv at index according to endianness.
Specialized procedures are also available:
Return the IEEE-754 single-precision floating point number from bv at index according to the host’s native endianness.
Store real number value in bv at index according to the host’s native endianness.
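As an illustration, the sketch below stores the double-precision value 1.5 and inspects the resulting bytes. It assumes the R6RS names bytevector-ieee-double-set! and bytevector-ieee-double-ref; 1.5 is chosen because its IEEE-754 encoding, #x3FF8000000000000, is easy to recognize.

```scheme
(use-modules (rnrs bytevectors))

(define bv (make-bytevector 8 0))

(bytevector-ieee-double-set! bv 0 1.5 (endianness big))
;; #x3F = 63 and #xF8 = 248, followed by six zero bytes.
(bytevector->u8-list bv)                             ; ⇒ (63 248 0 0 0 0 0 0)
(bytevector-ieee-double-ref bv 0 (endianness big))   ; ⇒ 1.5
```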
Bytevector contents can also be interpreted as Unicode strings encoded in one of the most commonly available encoding formats. See Representing Strings as Bytes, for a more generic interface.
(utf8->string (u8-list->bytevector '(99 97 102 101)))
⇒ "cafe"
(string->utf8 "café") ;; SMALL LATIN LETTER E WITH ACUTE ACCENT
⇒ #vu8(99 97 102 195 169)
Return the number of bytes in the UTF-8 representation of str.
Return a newly allocated bytevector that contains the UTF-8, UTF-16, or UTF-32 (aka. UCS-4) encoding of str. For UTF-16 and UTF-32, endianness should be the symbol big or little; when omitted, it defaults to big endian.
Return a newly allocated string that contains the UTF-8-, UTF-16-, or UTF-32-decoded contents of bytevector utf. For UTF-16 and UTF-32, endianness should be the symbol big or little; when omitted, it defaults to big endian.
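The effect of the optional endianness argument can be seen with a one-character string. This sketch assumes the R6RS names string->utf16, utf16->string and string->utf32.

```scheme
(use-modules (rnrs bytevectors))

;; Big endian is the default.
(string->utf16 "A")                              ; ⇒ #vu8(0 65)
(string->utf16 "A" (endianness little))          ; ⇒ #vu8(65 0)

;; Decoding must use the same byte order as encoding.
(utf16->string #vu8(65 0) (endianness little))   ; ⇒ "A"

;; UTF-32 uses four bytes per character.
(bytevector-length (string->utf32 "hi"))         ; ⇒ 8
```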
As an extension to the R6RS, Guile allows bytevectors to be manipulated with the array procedures (see Arrays). When using these APIs, bytes are accessed one at a time as 8-bit unsigned integers:
(define bv #vu8(0 1 2 3))
(array? bv) ⇒ #t
(array-rank bv) ⇒ 1
(array-ref bv 2) ⇒ 2
;; Note the different argument order on array-set!.
(array-set! bv 77 2)
(array-ref bv 2) ⇒ 77
(array-type bv) ⇒ vu8
Bytevectors may also be accessed with the SRFI-4 API. See SRFI-4 - Relation to bytevectors, for more information.
Arrays are a collection of cells organized into an arbitrary number of dimensions. Each cell can be accessed in constant time by supplying an index for each dimension.
In the current implementation, an array uses a vector of some kind for the actual storage of its elements. Any kind of vector will do, so you can have arrays of uniform numeric values, arrays of characters, arrays of bits, and of course, arrays of arbitrary Scheme values. For example, arrays with an underlying c64vector might be nice for digital signal processing, while arrays made from a u8vector might be used to hold gray-scale images.
The number of dimensions of an array is called its rank. Thus, a matrix is an array of rank 2, while a vector has rank 1. When accessing an array element, you have to specify one exact integer for each dimension. These integers are called the indices of the element. An array specifies the allowed range of indices for each dimension via an inclusive lower and upper bound. These bounds can well be negative, but the upper bound must be greater than or equal to the lower bound minus one. When all lower bounds of an array are zero, it is called a zero-origin array.
Arrays can be of rank 0, which could be interpreted as a scalar. Thus, a zero-rank array can store exactly one object and the list of indices of this element is the empty list.
Arrays contain zero elements when one of their dimensions has a zero length. These empty arrays maintain information about their shape: a matrix with zero columns and 3 rows is different from a matrix with 3 columns and zero rows, which again is different from a vector of length zero.
The array procedures are all polymorphic, treating strings, uniform numeric vectors, bytevectors, bit vectors and ordinary vectors as one dimensional arrays.
An array is displayed as # followed by its rank, followed by a tag that describes the underlying vector, optionally followed by information about its shape, and finally followed by the cells, organized into dimensions using parentheses.
In more words, the array tag is of the form
#<rank><vectag><@lower><:len><@lower><:len>...
where <rank> is a positive integer in decimal giving the rank of the array. It is omitted when the rank is 1 and the array is non-shared and has zero-origin (see below). For shared arrays and for a non-zero origin, the rank is always printed even when it is 1 to distinguish them from ordinary vectors.
The <vectag> part is the tag for a uniform numeric vector, like u8, s16, etc, b for bitvectors, or a for strings. It is empty for ordinary vectors.
The <@lower> part is a ‘@’ character followed by a signed integer in decimal giving the lower bound of a dimension. There is one <@lower> for each dimension. When all lower bounds are zero, all <@lower> parts are omitted.
The <:len> part is a ‘:’ character followed by an unsigned integer in decimal giving the length of a dimension. As with the lower bounds, there is one <:len> for each dimension, and the <:len> part always follows the <@lower> part for a dimension. Lengths are printed only when they can’t be deduced from the nested lists of elements of the array literal, which can happen when at least one length is zero.
As a special case, an array of rank 0 is printed as #0<vectag>(<scalar>), where <scalar> is the result of printing the single element of the array.
Thus,
#(1 2 3)
is an ordinary array of rank 1 with lower bound 0 in dimension 0. (I.e., a regular vector.)
#@2(1 2 3)
is an ordinary array of rank 1 with lower bound 2 in dimension 0.
#2((1 2 3) (4 5 6))
is a non-uniform array of rank 2; a 2x3 matrix with index ranges 0..1 and 0..2.
#u8(0 1 2)
is a uniform u8 array of rank 1.
#2u32@2@3((1 2) (2 3))
is a uniform u32 array of rank 2 with index ranges 2..3 and 3..4.
#2()
is a two-dimensional array with index ranges 0..-1 and 0..-1, i.e. both dimensions have length zero.
#2:0:2()
is a two-dimensional array with index ranges 0..-1 and 0..1, i.e. the first dimension has length zero, but the second has length 2.
#0(12)
is a rank-zero array with contents 12.
In addition, bytevectors are also arrays, but use a different syntax (see Bytevectors):
#vu8(1 2 3)
is a 3-byte long bytevector, with contents 1, 2, 3.
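The properties encoded in these literals can be checked with the array procedures. The following sketch reads some of the literals listed above and inspects them with array-rank, array-shape, array-type and array-ref:

```scheme
(array-rank #2((1 2 3) (4 5 6)))        ; ⇒ 2
(array-shape #2u32@2@3((1 2) (2 3)))    ; ⇒ ((2 3) (3 4))
(array-type #u8(0 1 2))                 ; ⇒ u8

;; With lower bound 2, the first element is at index 2.
(array-ref #@2(1 2 3) 2)                ; ⇒ 1

;; A rank-zero array takes no indices.
(array-rank #0(12))                     ; ⇒ 0
(array-ref #0(12))                      ; ⇒ 12
```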
When an array is created, the range of each dimension must be specified, e.g., to create a 2x3 array with a zero-based index:
(make-array 'ho 2 3) ⇒ #2((ho ho ho) (ho ho ho))
The range of each dimension can also be given explicitly, e.g., another way to create the same array:
(make-array 'ho '(0 1) '(0 2)) ⇒ #2((ho ho ho) (ho ho ho))
The following procedures can be used with arrays (or vectors). An argument shown as idx… means one parameter for each dimension in the array. An idxlist argument means a list of such values, one for each dimension.
Return #t if the obj is an array, and #f if not.
The second argument to scm_array_p is there for historical reasons, but it is not used. You should always pass SCM_UNDEFINED as its value.
Return #t if the obj is an array of type type, and #f if not.
Return 1 if the obj is an array and 0 if not.
Return 1 if the obj is an array of type type, and 0 if not.
Equivalent to (make-typed-array #t fill bound ...).
Create and return an array that has as many dimensions as there are bounds and (maybe) fill it with fill.
The underlying storage vector is created according to type,
which must be a symbol whose name is the ‘vectag’ of the array as
explained above, or #t
for ordinary, non-specialized arrays.
For example, using the symbol f64
for type will create an
array that uses a f64vector
for storing its elements, and
a
will use a string.
When fill is not the special unspecified value, the new array is filled with fill. Otherwise, the initial contents of the array are unspecified. The special unspecified value is stored in the variable *unspecified*, so that, for example, (make-typed-array 'u32 *unspecified* 4) creates an uninitialized u32 vector of length 4.
Each bound may be a positive integer n, in which case the index for that dimension can range from 0 through n-1; or an explicit index range specifier in the form (lower upper), where both lower and upper are integers, possibly less than zero, and possibly the same number (however, lower cannot be greater than upper).
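The following sketch shows the two forms of bound together with typed storage. The names m, v and raw are illustrative only; the results in the comments follow from the description above.

```scheme
;; A 2x2 array of doubles, zero-based in both dimensions.
(define m (make-typed-array 'f64 0.0 2 2))
(array->list m)        ; ⇒ ((0.0 0.0) (0.0 0.0))

;; A signed 32-bit vector whose valid indices are 1 through 3.
(define v (make-typed-array 's32 7 '(1 3)))
(array-shape v)        ; ⇒ ((1 3))
(array-ref v 1)        ; ⇒ 7

;; Uninitialized storage: only the shape is known, not the contents.
(define raw (make-typed-array 'u32 *unspecified* 4))
(array-dimensions raw) ; ⇒ (4)
```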
Equivalent to (list->typed-array #t dimspec list).
Return an array of the type indicated by type with elements the same as those of list.
The argument dimspec determines the number of dimensions of the array and their lower bounds. When dimspec is an exact integer, it gives the number of dimensions directly and all lower bounds are zero. When it is a list of exact integers, then each element is the lower index bound of a dimension, and there will be as many dimensions as elements in the list.
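To illustrate both forms of dimspec, here is a short sketch. The variable names are illustrative only.

```scheme
;; dimspec is an integer: two dimensions, all lower bounds zero.
(define m (list->array 2 '((1 2 3) (4 5 6))))
(array-dimensions m)      ; ⇒ (2 3)
(array-ref m 1 2)         ; ⇒ 6

;; A typed array built from a flat list.
(define bytes (list->typed-array 'u8 1 '(1 2 3)))
(array-type bytes)        ; ⇒ u8

;; dimspec is a list: one dimension whose lower bound is 1.
(define one-based (list->array '(1) '(a b c)))
(array-shape one-based)   ; ⇒ ((1 3))
(array-ref one-based 1)   ; ⇒ a
```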
Return the type of array. This is the ‘vectag’ used for printing array (or #t for ordinary arrays) and can be used with make-typed-array to create an array of the same kind as array.
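One possible use of array-type is to build a fresh array of the same kind and shape as an existing one. The helper make-array-like below is hypothetical (it is not part of Guile) and is only a sketch of the idea.

```scheme
;; Hypothetical helper: a new array with the same type and bounds
;; as ARRAY, with every element set to FILL.
(define (make-array-like array fill)
  (apply make-typed-array (array-type array) fill (array-shape array)))

(define a (make-typed-array 'u8 0 3))
(define b (make-array-like a 1))
(array-type b)    ; ⇒ u8
(array->list b)   ; ⇒ (1 1 1)
```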
Return the element at (idx …) in array.
(define a (make-array 999 '(1 2) '(3 4)))
(array-ref a 2 4) ⇒ 999
Return #t if the given indices would be acceptable to array-ref.
(define a (make-array #f '(1 2) '(3 4)))
(array-in-bounds? a 2 3) ⇒ #t
(array-in-bounds? a 0 0) ⇒ #f
Set the element at (idx …) in array to obj. The return value is unspecified.
(define a (make-array #f '(0 1) '(0 1)))
(array-set! a #t 1 1)
a ⇒ #2((#f #f) (#f #t))
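These three procedures combine naturally when an index may fall outside the array. The helper array-ref/default below is hypothetical (not provided by Guile); it is a minimal sketch that checks the indices with array-in-bounds? before calling array-ref.

```scheme
;; Hypothetical helper: like array-ref, but return DEFAULT instead
;; of signalling an error when the indices are out of range.
(define (array-ref/default array default . idx)
  (if (apply array-in-bounds? array idx)
      (apply array-ref array idx)
      default))

(define a (make-array 999 '(1 2) '(3 4)))
(array-ref/default a 'none 2 4)   ; ⇒ 999
(array-ref/default a 'none 0 0)   ; ⇒ none
(array-set! a 5 1 3)
(array-ref/default a 'none 1 3)   ; ⇒ 5
```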
Return a list of the bounds for each dimension of array.
array-shape gives (lower upper) for each dimension. array-dimensions instead returns just upper+1 for dimensions with a 0 lower bound. Both are suitable as input to make-array.
For example,
(define a (make-array 'foo '(-1 3) 5))
(array-shape a) ⇒ ((-1 3) (0 4))
(array-dimensions a) ⇒ ((-1 3) 5)
Return the length of an array: its first dimension. It is an error to ask for the length of an array of rank 0.
Return the rank of array as a size_t.
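As a small sketch of rank and length, using the Scheme procedures array-rank and array-length:

```scheme
(define a (make-array 'foo '(-1 3) 5))
(array-rank a)               ; ⇒ 2
(array-length a)             ; ⇒ 5, the first dimension spans -1 through 3

;; With no bounds at all, make-array yields a rank-zero array.
(array-rank (make-array 0))  ; ⇒ 0
```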
Return a list consisting of all the elements, in order, of array.
Copy every element from vector or array src to the corresponding element of dst. dst must have the same rank as src, and be at least as large in each dimension. The return value is unspecified.
Store fill in every element of array. The value returned is unspecified.
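The following sketch exercises array->list, array-copy! and array-fill! together. Note that array-copy! takes the source first and the destination second; for arrays of rank greater than one, array->list returns nested lists, one level per dimension.

```scheme
(define src (list->array 2 '((1 2) (3 4))))
(define dst (make-array 0 2 2))

(array-copy! src dst)     ; copy src into dst
(array->list dst)         ; ⇒ ((1 2) (3 4))

(array-fill! dst 9)       ; overwrite every element of dst
(array->list dst)         ; ⇒ ((9 9) (9 9))
(array->list src)         ; ⇒ ((1 2) (3 4)), src is untouched
```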
Return #t if all arguments are arrays with the same shape and the same type, and with corresponding elements that are either equal? or array-equal?. This function differs from equal? (see Equality) in that all arguments must be arrays.
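A short sketch of array-equal? on arrays of matching and differing shape:

```scheme
(define a (list->array 2 '((1 2) (3 4))))
(define b (make-array 0 2 2))
(array-copy! a b)

(array-equal? a b)                ; ⇒ #t, same shape, type and elements
(array-set! b 0 1 1)
(array-equal? a b)                ; ⇒ #f, one element now differs
(array-equal? #(1 2) #(1 2 3))    ; ⇒ #f, shapes differ
```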
Set each element of the dst array to values obtained from calls to proc. The list of src arguments may be empty. The value returned is unspecified.
Each call is (proc elem …), where each elem is from the corresponding src array, at the dst index. array-map-in-order! makes the calls in row-major order; array-map! makes them in an unspecified order.
The src arrays must have the same number of dimensions as dst, and must have a range for each dimension which covers the range in dst. This ensures all dst indices are valid in each src.
Apply proc to each tuple of elements of src1 src2 …, in row-major order. The value returned is unspecified.
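As an illustration, the sketch below adds two vectors element-wise with array-map! and then sums the result with array-for-each. The variable names are illustrative only.

```scheme
(define a (list->array 1 '(1 2 3)))
(define b (list->array 1 '(10 20 30)))
(define sum (make-array 0 3))

;; dst comes first, then the procedure, then the source arrays.
(array-map! sum + a b)
(array->list sum)   ; ⇒ (11 22 33)

;; array-for-each is called for its side effects only.
(define total 0)
(array-for-each (lambda (x) (set! total (+ total x))) sum)
total               ; ⇒ 66
```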
Set each element of the dst array to values returned by calls to proc. The value returned is unspecified.
Each call is (proc i1 … iN), where i1…iN is the destination index, one parameter for each dimension. The order in which the calls are made is unspecified.
For example, to create a 4x4 matrix representing a cyclic group,
/ 0 1 2 3 \
| 1 2 3 0 |
| 2 3 0 1 |
\ 3 0 1 2 /
(define a (make-array #f 4 4))
(array-index-map! a (lambda (i j) (modulo (+ i j) 4)))
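In the same spirit, a 3x3 identity matrix can be sketched by comparing the two indices. Because the order of the calls is unspecified, the procedure should depend only on its index arguments, as it does here.

```scheme
(define id (make-array 0 3 3))
(array-index-map! id (lambda (i j) (if (= i j) 1 0)))
(array->list id)   ; ⇒ ((1 0 0) (0 1 0) (0 0 1))
```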
An additional array function is available in the module (ice-9 arrays). It can be used with:
(use-modules (ice-9 arrays))
Return a new array with the same elements, type and shape as src. However, the array increments may not be the same as those of src. In the current implementation, the returned array will be in row-major order, but that might change in the future. Use array-copy! on an array of known order if that is a concern.
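A minimal sketch of array-copy, showing that the copy has the same type as the original but does not share its storage:

```scheme
(use-modules (ice-9 arrays))

(define a (list->typed-array 'f64 2 '((1.0 2.0) (3.0 4.0))))
(define b (array-copy a))

(array-type b)          ; ⇒ f64
(array-set! b 0.0 0 0)  ; modify the copy only
(array-ref a 0 0)       ; ⇒ 1.0, the original is unchanged
(array-ref b 0 0)       ; ⇒ 0.0
```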