The less-familiar parts of Lisp for beginners — read-line

Now, we come to the read-line function.  For the C++ programmer, this is like std::getline.  This function reads in the text stream until the next newline, if any, and returns the content in a string.  It does no interpretation of the text beyond searching for the newline, and does not care about the syntax of the text contents.  This is the function that you would be most likely to use when reading and scanning data files or natural language text.

The optional second argument, eof-error-p, defaults to true.  Unless this argument is set to nil by the programmer, read-line will raise an error if it encounters the end of file before reading any characters.  The optional third argument, eof-value, is the value that will be returned by read-line if the end of file is encountered before any other characters are read.  A common construct, then, is to call read-line with the stream and then two nil arguments, causing it to return strings containing lines of text until none remain, at which point it returns nil instead of a string.
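As a minimal sketch of that construct, the loop below gathers every remaining line of a stream into a list.  The name collect-lines is my own, chosen for illustration, and a string stream stands in for a file so that the example is self-contained.

```lisp
(defun collect-lines (stream)
  "Return a list of the lines remaining in STREAM."
  ;; With eof-error-p and eof-value both nil, read-line returns nil
  ;; at end of file instead of signalling an error.
  (loop for line = (read-line stream nil nil)
        while line
        collect line))

(with-input-from-string (in (format nil "alpha~%beta~%gamma"))
  (collect-lines in))
;; => ("alpha" "beta" "gamma")
```

For a real file, the same function could be called inside with-open-file.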

The read-line function returns a second value, nil if the line read was terminated by a newline, and true if it was terminated by the end of the file.  Using this, the programmer can determine whether or not the last line in the file has a trailing newline.
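Here is one way the second return value might be used, as a sketch only.  The function name last-line-terminated-p is hypothetical, and I have arbitrarily chosen to treat an empty stream as properly terminated.

```lisp
(defun last-line-terminated-p (stream)
  "Return true if the final line of STREAM ends with a newline.
An empty stream counts as terminated."
  (loop with terminated = t
        do (multiple-value-bind (line missing-newline-p)
               (read-line stream nil nil)
             (if line
                 ;; Remember whether this line ended at a newline.
                 (setf terminated (not missing-newline-p))
                 ;; No more lines: report the status of the last one seen.
                 (return terminated)))))

(with-input-from-string (in (format nil "one~%two~%"))
  (last-line-terminated-p in))
;; => T

(with-input-from-string (in (format nil "one~%two"))
  (last-line-terminated-p in))
;; => NIL
```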

The less-familiar parts of Lisp for beginners — read-from-string

Continuing through our list of less familiar features of Lisp, next is read-from-string.  The same caveats I mentioned under read apply here.  This is not something that you would typically use for a general-purpose input parser.  Like read, it requires that the input be fairly strongly constrained to match valid input that the Lisp REPL might encounter.  It is easy for it to encounter a character that will raise a condition, so it tends to be used in places where the strings being interpreted are carefully generated in a *print-readably* format.
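If you do have to call read-from-string on text that might be malformed, one defensive pattern is to catch the condition and to disable the #. reader macro.  The sketch below assumes that returning nil on failure is acceptable for your application; the name safe-read-from-string is my own and not part of the standard.

```lisp
(defun safe-read-from-string (string)
  "Read one object from STRING.  Return the object and the position
where reading stopped, or (values nil nil) if the reader signals an error."
  (handler-case
      ;; Binding *read-eval* to nil makes #. signal an error
      ;; instead of evaluating code embedded in the input.
      (let ((*read-eval* nil))
        (read-from-string string))
    (error () (values nil nil))))

(safe-read-from-string "(1 2 3)")   ; first value is the list (1 2 3)
(safe-read-from-string "(1 2")      ; unterminated list: first value is NIL
```

Note that this cannot distinguish a failed read from a successful read of the object nil, so a real program would probably want a distinct failure marker.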

The less-familiar parts of Lisp for beginners — read-delimited-list

We now come across another Lisp feature, read-delimited-list.  This function is generally expected to be used in the construction of reader macros.  You might want to review the discussion of reader macros under gensym and get-dispatch-macro-character.  You would use it to build a reader macro that interprets a sequence of objects between bounding characters as being a list.  The standard reader macro for the #\( character could invoke (read-delimited-list #\)) to retrieve the list between the parentheses.

Because this function has no way to recover from malformed input (a missing closing delimiter, for example, simply signals an error), it is fairly brittle, and so I would tend to discourage its use in other contexts where the input stream format is less rigidly enforced.
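To make the reader macro use concrete, here is a small sketch that teaches a private copy of the readtable to read [ ... ] as a list.  The bracket syntax and the names read-bracketed-list, *bracket-readtable* and read-with-brackets are all invented for this illustration.

```lisp
(defun read-bracketed-list (stream char)
  "Reader macro function for #\[: read objects up to the matching #\]."
  (declare (ignore char))
  ;; The third argument, recursive-p, is true because we are being
  ;; called from inside the reader.
  (read-delimited-list #\] stream t))

(defparameter *bracket-readtable*
  (let ((rt (copy-readtable nil)))      ; copy of the standard readtable
    (set-macro-character #\[ #'read-bracketed-list nil rt)
    ;; Give #\] the same terminating behaviour as #\), so that "4]"
    ;; is read as the token 4 followed by the delimiter.
    (set-macro-character #\] (get-macro-character #\) nil) nil rt)
    rt))

(defun read-with-brackets (string)
  "Read one object from STRING using the bracket-aware readtable."
  (let ((*readtable* *bracket-readtable*))
    (read-from-string string)))

(read-with-brackets "[1 2 [3 4]]")
;; => (1 2 (3 4))
```

Working on a copy of the readtable keeps the new syntax from leaking into the rest of the Lisp image.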

The less-familiar parts of Lisp for beginners — read and read-preserving-whitespace

I’m going to take a brief stop here at read.  While that’s a fairly basic Lisp operation, a newcomer to Lisp might not have seen many examples where programs take input.  So, we’ll talk about read and read-preserving-whitespace a little, to point out what they are, and what they are not.

These two functions are very similar, differing only in how they treat trailing whitespace (read also discards a whitespace character that immediately follows the object, while read-preserving-whitespace leaves it in the stream).  It’s important to note, though, that they are not general-purpose input methods.  The read function consumes enough characters from its input stream to produce a single Lisp object.  It is assumed that the input stream consists of readable objects (by the definition of *print-readably*, see the earlier article on print).  These functions do not handle badly-formatted input well, raising an error if the input can’t be interpreted as a Lisp object.
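The difference is easiest to see by asking what character comes next after each call.  This is only a sketch; next-char-after is a helper name I made up, and the input is a string stream for convenience.

```lisp
(defun next-char-after (reader string)
  "Call READER (a function of one stream argument) on STRING,
then return the next character left in the stream."
  (with-input-from-string (in string)
    (funcall reader in)
    (read-char in)))

(next-char-after #'read "foo bar")
;; => #\b      (the space after FOO has been consumed)

(next-char-after #'read-preserving-whitespace "foo bar")
;; => #\Space  (the space is still in the stream)
```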

If you’re writing code that reads in data files, or parses English text, you will probably not be doing it with read.  You’ll use, instead, read-line, which returns a string of all the text up to the next newline or the end of file.  You will then use string parsing techniques to interpret each line.
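As a small illustration of that approach, suppose a data file holds lines of the form "key = value".  The format and the function name parse-key-value are assumptions made for this sketch; only standard string functions are used.

```lisp
(defun parse-key-value (line)
  "Split LINE at the first equals sign.  Return the key and the value
as two trimmed strings, or NIL if LINE contains no equals sign."
  (let ((pos (position #\= line)))
    (when pos
      (values (string-trim " " (subseq line 0 pos))
              (string-trim " " (subseq line (1+ pos)))))))

(with-input-from-string (in (format nil "count = 42~%name = widget"))
  (loop for line = (read-line in nil nil)
        while line
        collect (multiple-value-list (parse-key-value line))))
;; => (("count" "42") ("name" "widget"))
```

A value such as "42" can then be handed to parse-integer, which, unlike read, will not try to interpret arbitrary Lisp syntax.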

Places where I have used read are in client-server command protocols.  Rather than sending bare text sequences over the channel, I send lists which contain keywords and arguments related to the information exchanged.  Since I control both ends of the connection, any read failures are related to bugs in the sending program, which I must find and fix.
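A sketch of that style of protocol might look like the following.  The command layout and the names encode-command and decode-command are hypothetical, and strings stand in for the network channel.  Both ends use with-standard-io-syntax so that printer and reader settings agree, and the receiving end binds *read-eval* to nil as a precaution.

```lisp
(defun encode-command (command)
  "Return COMMAND, a list of keywords and arguments, as a readable string."
  (with-standard-io-syntax
    (prin1-to-string command)))

(defun decode-command (text)
  "Read a command list back from TEXT."
  (with-standard-io-syntax
    ;; with-standard-io-syntax sets *read-eval* to t, so turn it off
    ;; again: a command should never need to evaluate code on arrival.
    (let ((*read-eval* nil))
      (read-from-string text))))

(decode-command (encode-command '(:move :unit 7 :to (3 4) :note "north")))
;; => (:MOVE :UNIT 7 :TO (3 4) :NOTE "north")
```

Using keywords rather than ordinary symbols means that the two programs do not have to agree on a current package.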

The less-familiar parts of Lisp for beginners — random

We’ve been discussing features of Lisp that the newcomer to the language might not have encountered before, and now I’m going to talk a bit about random.  While the newcomer arriving from C++ is probably aware of this function, at least in the abstract, there are some dangers that have to be pointed out for the serious users of pseudo-random numbers.

I am going to assert that there are two categories of users of pseudo-random numbers.  First, there are those I’d call the “casual” users.  These users are interested in producing one or more pseudo-random numbers with no eye to recovering or replaying those specific numbers later.  People generating cryptographic keys, or shuffling a deck of cards for a game, or making a salt or other sequence number, would all fall under the category of casual users.  The second group are the “serious” users.  These are the Monte Carlo simulator people (among whose ranks I count myself).  These people would like to be able to run the same program with 1000 different random number sequences, and would like to be able to re-run specific instances for debugging purposes.

The Common Lisp standard defines its pseudo-random number behaviour in a way that accommodates the casual users, but not the serious users.  We’ll start by explaining what the standard does require.

First, the Common Lisp random function takes, as an optional argument, a random state object.  This is quite useful, as it allows the programmer, in effect, to produce multiple independent pseudo-random number streams.  This is unlike the BSD srandom() and random() functions, which modify a global state and don’t allow for easy manipulation of it.  It is, though, somewhat like the GNU libc extensions, srandom_r() and random_r(), which take a state vector as an additional argument.

That random state object used in the Lisp random function can be constructed using make-random-state, operating in one of three modes.  The make-random-state function can be used to create a copy of an existing random state, to create a copy of the default random state, or to build a new random state using an implementation-defined technique that is hoped to give a different result every time it is used.  None of these use cases allows what the programmer would think of as a “random number seed”, which the Monte Carlo modelers absolutely require for their work.  The Common Lisp standard also does not appear to require that the implementation supply a way to produce a readable representation of the current random state, so that it can be recorded and re-loaded from disc.
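The three modes, together with the optional state argument to random, can be sketched as follows.  The helper name draw is my own.

```lisp
(defun draw (n limit state)
  "Return a list of N pseudo-random integers below LIMIT, drawn from STATE."
  (loop repeat n collect (random limit state)))

(let* ((fresh        (make-random-state t))      ; new, implementation-seeded state
       (copy         (make-random-state fresh))  ; copy of an existing state
       (from-default (make-random-state nil)))   ; copy of the current *random-state*
  (list
   ;; A copy replays exactly the numbers its original will produce.
   (equal (draw 5 1000 fresh) (draw 5 1000 copy))
   (random-state-p from-default)))
;; => (T T)
```

Because each call to draw names its state explicitly, the streams do not disturb one another or the global *random-state*.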

EDIT #1: 2014-05-18

As pointed out by mch in the comments, the standard does, in fact, require that *random-state* have a *print-readably* representation, so a compliant implementation must be able to store the current random state in a form that can be re-loaded within the same implementation.  There is no guarantee, however, that this representation is meaningful to any other Lisp implementation.
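A sketch of saving and restoring a state through its printed representation follows.  The function names are hypothetical, and a string is used where a real program would write to a file.  Note that some implementations print random states using the #. reader macro, so *read-eval* must remain true when reading the state back (with-standard-io-syntax arranges this), which in turn means this should only be done with files you trust.

```lisp
(defun save-random-state (state)
  "Return a readable printed representation of STATE as a string."
  (with-standard-io-syntax
    (prin1-to-string state)))

(defun load-random-state (text)
  "Rebuild a random state from TEXT, as produced by save-random-state
in the same Lisp implementation."
  (with-standard-io-syntax
    (read-from-string text)))

(let* ((state    (make-random-state t))
       (restored (load-random-state (save-random-state state))))
  (equal (loop repeat 5 collect (random 1000 state))
         (loop repeat 5 collect (random 1000 restored))))
;; => T
```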

If you need to produce multiple reproducible pseudo-random number sequences, you really have two choices.  You can either write your own pseudo-random number generator, and use that for your work, or you can check the particular implementation of Lisp that you are using for any non-standard pseudo-random number extensions that may be supplied.  For instance, in SBCL there is a non-standard extension, sb-ext:seed-random-state, that allows the programmer to construct a random state variable based on a supplied seed, exactly what the Monte Carlo programmers require.
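Both choices can be sketched briefly.  The generator below is a plain 32-bit linear congruential generator, picked only because it is short; its statistical quality is poor, and a real simulation would want something much stronger.  All of the lcg names are my own inventions, and the final function is compiled only under SBCL.

```lisp
(defstruct lcg
  (state 0 :type (unsigned-byte 32)))

(defun make-seeded-lcg (seed)
  "Create a generator whose whole future sequence is determined by SEED."
  (make-lcg :state (mod seed (expt 2 32))))

(defun lcg-next (gen)
  "Advance GEN and return its new 32-bit state."
  (setf (lcg-state gen)
        (mod (+ (* 1664525 (lcg-state gen)) 1013904223)
             (expt 2 32))))

(defun lcg-random (limit gen)
  "Return an integer in [0, LIMIT), for LIMIT up to 2^24.
Drops the weak low-order bits; has a slight modulo bias."
  (mod (ash (lcg-next gen) -8) limit))

;; The same seed always replays the same sequence:
(let ((a (make-seeded-lcg 2014))
      (b (make-seeded-lcg 2014)))
  (equal (loop repeat 5 collect (lcg-random 100 a))
         (loop repeat 5 collect (lcg-random 100 b))))
;; => T

;; Under SBCL, the non-standard extension seeds the built-in generator.
#+sbcl
(defun seeded-draws (seed n limit)
  "Return N integers below LIMIT from a state built from SEED (SBCL only)."
  (let ((state (sb-ext:seed-random-state seed)))
    (loop repeat n collect (random limit state))))
```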