Tag Archives: lisp

The less-familiar parts of Lisp for beginners — defconstant

Next in the series of commands that the novice Lisp programmer arriving from C++ might overlook is defconstant.  Now, this macro is actually something that the programmer possibly started using, then discarded because of perceived inconvenience.

There’s a basic difference between C++ programs and Lisp programs.  C++ programs typically run in isolation.  While you can certainly write code to allow them to communicate with their environment, other programs, or with files, the conceptual model of a C++ program is something that you write in an editor, compile into a binary, and then run.  If it becomes necessary to modify the code, you go back to the editor, recompile the binary, terminate the running code and then run the modified version.

Lisp is, in typical use, a very different model.  Your Lisp program is really a collection of connected functions running on a platform.  If you take care to avoid namespace issues, you can easily load several different “programs” into a single Lisp image, and have them all running in separate threads.  You can edit a function in the Lisp image, and the running code will immediately start using the modified function.

So, you came to Lisp from C++.  In C++, constants were simple to understand.  You declared them constant, and everywhere the definition was in scope, the compiler likely knew, if the information was available at compile time, what the value would be.  Your linker might even have put the data into a read-only page.  You used constants instead of macros for reasons of visibility, and because you wanted to be able to pass around references or pointers, and wanted access to the symbol in the debugger.

Then, you got to Lisp.  You wrote some code with defconstant.  You compiled it, loaded it, tested it, and decided to change the constant to a different value.  Once again, you compiled it, but when you tried to load it, you got an error.  This isn’t like C++, where your execution environment starts from scratch when you run the code…your Lisp image doesn’t get reset just because you happened to recompile a file somewhere in your system, all it can tell is that a piece of code set the constant to one value, and then later on you loaded another piece of code that tried to set the same constant to a different value.  That’s an error.  So, you exited your Lisp image, restarted, and kept going, but after a while you decided that defparameter was just easier.

Your solution to this particular disappointment is to use the unintern function.  Here’s a transcript of a Lisp session trying to modify a constant.
 

CL-USER> (defconstant +abc+ 12345)
+ABC+
CL-USER> (defconstant +abc+ 12346)
; Evaluation aborted on #<DEFCONSTANT-UNEQL {10140BA103}>.
CL-USER> +abc+
12345
CL-USER> (unintern '+abc+)
T
CL-USER> (defconstant +abc+ 12346)
+ABC+
CL-USER> +abc+
12346

You don’t see here, because SLIME shows it in a separate window, the error that came up, with possible restarts, when the attempt was made to modify a constant value.  You’ll note also that I used the common Lisp convention for constants, that the name should be enclosed in ‘+’ characters.

Now, look at the earlier post about declare.  You, the C++ programmer, might say, “If I made those parameters in the code into constants, then I wouldn’t have to use declare, they’re in scope for the compiler and it would know to make those optimizations itself.”  If you try this, though, you’ll discover that it isn’t true.  Again, it the concept of the Lisp image that your code’s running in.  Your .lisp file isn’t like a C++ compilation unit, because after the code has been compiled and loaded, it’s still possible for other code to modify those constants with unintern.  The compiler, when operating on the function to produce the next random number, cannot assume anything about constants that might appear to you to be “in scope”, unless they are part of the defun itself.

OK, so now the new Lisp programmer has decided to make some use of constants, keeping in mind the limitations and restrictions I’ve outlined above.  And, the new Lisp programmer puts in what looks like a fairly straight-forward constant list.  Later, he or she modifies some other code in the file, and re-loads it, and is amazed that the defconstant now fails:
 

CL-USER> (unintern '+abc+)
T
CL-USER> (defconstant +abc+ '(1 2 3))
+ABC+
CL-USER> (defconstant +abc+ '(1 2 3))
; Evaluation aborted on #<DEFCONSTANT-UNEQL {1014B67A03}>.

The Lisp system refused to allow you to reload exactly the same line, without changes.  What happened?  Understanding this is actually important in avoiding a lot of mysterious behaviour in your code.  The defconstant form, once established, must be constant.  For scalars like numbers or symbols, that’s simple enough, but for lists, or objects, it means you have to use the same object.  Not another object that looks like it, but the same object.  It’s important to realize that two literal lists are different objects, even if they have the same contents.  Once more, unintern will be useful to you here, in allowing you to reload the offending .lisp file and continue your work.

The less-familiar parts of Lisp for beginners — declare

We’ve just discussed declaim, and now we go on to declare.  This is a way to provide guidance to the interpreter/compiler, but limited in scope to the form in which they declare directive appears.  While, syntactically, declare looks like a function or macro, it is a logically distinct entity.  In forms where declare is valid, it must appear before any of the expressions that “do something”.  The beginning Lisp programmer is likely to come across a few contexts where the declare is useful:

  • To set/change the optimization settings on a per-function basis
  • To provide hints to the optimizer, improving code speed
  • To suppress warnings

As I mentioned when discussing declaim, the scope of optimization settings assigned with declaim is not specified.  They compiler may or may not retain those settings across multiple compilation units.  In contrast, declare is defined to have a scope limited to the form in which it appears.

To go over some uses of declare, we’ll look at a simple random number generator, an old Linear Congruential Generator from glibc.  This can be coded as follows:
 

(defparameter *rand-state* 0)
(defparameter *rand-modulus* #x80000000)
(defparameter *rand-multiplier* 1103515245)
(defparameter *rand-increment* 12345)

(defun set-rand-state (val)
  (setf *rand-state* val))

(defun get-next-rand ()
  (setf *rand-state*
        (mod (+ (* *rand-multiplier* *rand-state*)
                *rand-increment*)
             *rand-modulus*)))

Now, if this code is part of a larger project, you might be debugging that project, and so want to have debugging-compatible optimizations.  Meanwhile, you’re confident of your random-number implementation, and it turns out to have a noticeable performance impact, so you want to optimize just that function.  Seems unlikely that such a simple function could affect runtime, I know, but it could happen.

Aside:  I remember in one project coming across code which tried to generate a random number with a Gaussian distribution, and did it by drawing 20 uniform random numbers in a row and averaging them, counting on the fact that the Bates distribution looks Gaussian when N gets large enough.  The RNG turned out to consume a noticeable fraction of the total runtime of the program.

OK, so you’ve decided to optimize the random number generator function, but leave all other functions in the source file to have their own optimization settings.  We change the get-next-rand function to read like this:

 
(defun get-next-rand ()
  (declare (optimize (speed 3) (debug 0) (safety 0)))
  (setf *rand-state*
        (mod (+ (* *rand-multiplier* *rand-state*)
                *rand-increment*)
             *rand-modulus*)))

Now, you compile the .lisp file in SBCL, and it produces output like this:
 

; file: opt.lisp
; in: DEFUN GET-NEXT-RAND
;     (* *RAND-MULTIPLIER* *RAND-STATE*)
; 
; note: unable to
;   optimize
; due to type uncertainty:
;   The first argument is a NUMBER, not a RATIONAL.
;   The second argument is a NUMBER, not a FLOAT.
; 
; note: unable to
;   optimize
; due to type uncertainty:
;   The first argument is a NUMBER, not a FLOAT.
;   The second argument is a NUMBER, not a RATIONAL.
; 
; note: unable to
;   optimize
; due to type uncertainty:
;   The first argument is a NUMBER, not a SINGLE-FLOAT.
;   The second argument is a NUMBER, not a DOUBLE-FLOAT.
; 
; note: unable to
;   optimize
; due to type uncertainty:
;   The first argument is a NUMBER, not a DOUBLE-FLOAT.
;   The second argument is a NUMBER, not a SINGLE-FLOAT.
; 
; note: unable to
;   convert x*2^k to shift
; due to type uncertainty:
;   The first argument is a NUMBER, not a INTEGER.
;   The second argument is a NUMBER, not a INTEGER.

;     (+ (* *RAND-MULTIPLIER* *RAND-STATE*) *RAND-INCREMENT*)
; 
; note: unable to
;   optimize
; due to type uncertainty:
;   The first argument is a NUMBER, not a RATIONAL.
;   The second argument is a NUMBER, not a FLOAT.
; 
; note: unable to
;   optimize
; due to type uncertainty:
;   The first argument is a NUMBER, not a FLOAT.
;   The second argument is a NUMBER, not a RATIONAL.
; 
; note: unable to
;   optimize
; due to type uncertainty:
;   The first argument is a NUMBER, not a SINGLE-FLOAT.
;   The second argument is a NUMBER, not a DOUBLE-FLOAT.
; 
; note: unable to
;   optimize
; due to type uncertainty:
;   The first argument is a NUMBER, not a DOUBLE-FLOAT.
;   The second argument is a NUMBER, not a SINGLE-FLOAT.

;     (MOD (+ (* *RAND-MULTIPLIER* *RAND-STATE*) *RAND-INCREMENT*) *RAND-MODULUS*)
; --> BLOCK LET IF AND IF NOT IF ZEROP 
; ==>
;   (= REM 0)
; 
; note: unable to
;   open-code FLOAT to RATIONAL comparison
; due to type uncertainty:
;   The first argument is a REAL, not a FLOAT.
; 
; note: unable to open code because: The operands might not be the same type.

; --> BLOCK LET IF AND IF AND THE IF MINUSP 
; ==>
;   (< SB-KERNEL::DIVISOR 0)
; 
; note: unable to
;   open-code FLOAT to RATIONAL comparison
; due to type uncertainty:
;   The first argument is a REAL, not a FLOAT.

; --> BLOCK LET IF AND IF AND THE IF PLUSP 
; ==>
;   (> NUMBER 0)
; 
; note: unable to
;   open-code FLOAT to RATIONAL comparison
; due to type uncertainty:
;   The first argument is a REAL, not a FLOAT.

; --> BLOCK LET IF AND IF AND THE IF MINUSP 
; ==>
;   (< NUMBER 0)
; 
; note: unable to
;   open-code FLOAT to RATIONAL comparison
; due to type uncertainty:
;   The first argument is a REAL, not a FLOAT.

; --> BLOCK LET IF 
; ==>
;   (+ REM SB-KERNEL::DIVISOR)
; 
; note: unable to
;   optimize
; due to type uncertainty:
;   The first argument is a REAL, not a RATIONAL.
;   The second argument is a REAL, not a FLOAT.
; 
; note: unable to
;   optimize
; due to type uncertainty:
;   The first argument is a REAL, not a FLOAT.
;   The second argument is a REAL, not a RATIONAL.
; 
; note: unable to
;   optimize
; due to type uncertainty:
;   The first argument is a REAL, not a SINGLE-FLOAT.
;   The second argument is a REAL, not a DOUBLE-FLOAT.
; 
; note: unable to
;   optimize
; due to type uncertainty:
;   The first argument is a REAL, not a DOUBLE-FLOAT.
;   The second argument is a REAL, not a SINGLE-FLOAT.

; --> BLOCK LET REM BLOCK MULTIPLE-VALUE-BIND MULTIPLE-VALUE-CALL 
; ==>
;   (TRUNCATE NUMBER SB-KERNEL::DIVISOR)
; 
; note: unable to
;   optimize
; due to type uncertainty:
;   The first argument is a REAL, not a SINGLE-FLOAT.
;   The second argument is a REAL, not a (OR SINGLE-FLOAT INTEGER).
; 
; note: unable to
;   optimize
; due to type uncertainty:
;   The first argument is a REAL, not a DOUBLE-FLOAT.
;   The second argument is a REAL, not a (OR DOUBLE-FLOAT SINGLE-FLOAT INTEGER).
; 
; note: unable to
;   convert division by 2^k to shift
; due to type uncertainty:
;   The first argument is a REAL, not a INTEGER.
;   The second argument is a REAL, not a INTEGER.

;     (* *RAND-MULTIPLIER* *RAND-STATE*)
; 
; note: forced to do GENERIC-* (cost 30)
;       unable to do inline float arithmetic (cost 4) because:
;       The first argument is a NUMBER, not a (COMPLEX SINGLE-FLOAT).
;       The second argument is a NUMBER, not a SINGLE-FLOAT.
;       The result is a (VALUES NUMBER &OPTIONAL), not a (VALUES
;                                                         (COMPLEX SINGLE-FLOAT)
;                                                         &REST T).
;       unable to do inline float arithmetic (cost 4) because:
;       The first argument is a NUMBER, not a SINGLE-FLOAT.
;       The second argument is a NUMBER, not a SINGLE-FLOAT.
;       The result is a (VALUES NUMBER &OPTIONAL), not a (VALUES SINGLE-FLOAT
;                                                                &REST T).
;       etc.

;     (+ (* *RAND-MULTIPLIER* *RAND-STATE*) *RAND-INCREMENT*)
; 
; note: forced to do GENERIC-+ (cost 10)
;       unable to do inline float arithmetic (cost 2) because:
;       The first argument is a NUMBER, not a DOUBLE-FLOAT.
;       The second argument is a NUMBER, not a DOUBLE-FLOAT.
;       The result is a (VALUES NUMBER &OPTIONAL), not a (VALUES DOUBLE-FLOAT
;                                                                &REST T).
;       unable to do inline float arithmetic (cost 2) because:
;       The first argument is a NUMBER, not a SINGLE-FLOAT.
;       The second argument is a NUMBER, not a SINGLE-FLOAT.
;       The result is a (VALUES NUMBER &OPTIONAL), not a (VALUES SINGLE-FLOAT
;                                                                &REST T).
;       etc.

;     (MOD (+ (* *RAND-MULTIPLIER* *RAND-STATE*) *RAND-INCREMENT*) *RAND-MODULUS*)
; --> BLOCK LET IF AND IF AND THE IF MINUSP 
; ==>
;   (< SB-KERNEL::DIVISOR 0)
; 
; note: forced to do GENERIC-< (cost 10)
;       unable to do inline fixnum comparison (cost 3) because:
;       The first argument is a REAL, not a FIXNUM.
;       unable to do inline fixnum comparison (cost 4) because:
;       The first argument is a REAL, not a FIXNUM.
;       etc.

; ==>
;   (< NUMBER 0)
; 
; note: forced to do GENERIC-< (cost 10)
;       unable to do inline fixnum comparison (cost 3) because:
;       The first argument is a REAL, not a FIXNUM.
;       unable to do inline fixnum comparison (cost 4) because:
;       The first argument is a REAL, not a FIXNUM.
;       etc.

; --> BLOCK LET IF AND IF AND THE IF PLUSP 
; ==>
;   (> NUMBER 0)
; 
; note: forced to do GENERIC-> (cost 10)
;       unable to do inline fixnum comparison (cost 3) because:
;       The first argument is a REAL, not a FIXNUM.
;       unable to do inline fixnum comparison (cost 4) because:
;       The first argument is a REAL, not a FIXNUM.
;       etc.

; --> BLOCK LET IF 
; ==>
;   (+ REM SB-KERNEL::DIVISOR)
; 
; note: forced to do GENERIC-+ (cost 10)
;       unable to do inline float arithmetic (cost 2) because:
;       The first argument is a REAL, not a DOUBLE-FLOAT.
;       The second argument is a REAL, not a DOUBLE-FLOAT.
;       The result is a (VALUES REAL &OPTIONAL), not a (VALUES DOUBLE-FLOAT &REST
;                                                              T).
;       unable to do inline float arithmetic (cost 2) because:
;       The first argument is a REAL, not a SINGLE-FLOAT.
;       The second argument is a REAL, not a SINGLE-FLOAT.
;       The result is a (VALUES REAL &OPTIONAL), not a (VALUES SINGLE-FLOAT &REST
;                                                              T).
;       etc.

The optimizer is not happy.  It’s pointing out that it does not have type knowledge for the parameters of the algorithm, so it has to use its most generic settings, and run-time typing.  As the C++ programmer coming to Lisp, you routinely declared the type of every single variable in your program, because of the strong typing in that language.  You would have declared these parameters to be of type uint_fast32_t, and trusted the compiler to optimize the code appropriately.  In Lisp, you are not forced to tell the compiler about every variable type, but in a situation like this, it helps to do so.  You can use declare to hint to the interpreter/compiler what the types are of certain variables in your expressions.  What we do is to tell the optimizer that all of those numbers are 31-bit integers, and some of them will never be zero.  Our code now looks like this:
 

(defun get-next-rand ()
  (declare (optimize (speed 3) (debug 0) (safety 0)))
  (declare ((integer 0 #x80000000) *rand-state*))
  (declare ((integer 1 #x80000000) *rand-multiplier*))
  (declare ((integer 1 #x80000000) *rand-increment*))
  (declare ((integer 1 #x80000000) *rand-modulus*))
  (setf *rand-state*
        (mod (+ (* *rand-multiplier* *rand-state*)
                *rand-increment*)
             *rand-modulus*)))

When this is compiled, the optimizer produces no diagnostic output at all.  We’ve given it everything it needs to generate optimized code for this case.

Now, you might ask, does it make a difference?  I touched on this in an earlier set of posts, but I’ll go over it again here.  Let’s look at the performance difference between the optimized code with the type declarations, and the optimized code without the type declarations. EDIT #2 on 2014-10-18 to add: For purposes of clarity while typing in the REPL, I will call this latest version of the code, with the five declare forms, get-next-rand-declared.
 

CL-USER> (time (dotimes (i 1000000) (get-next-rand)))
Evaluation took:
  0.026 seconds of real time
  0.028002 seconds of total run time (0.028002 user, 0.000000 system)
  107.69% CPU
  86,409,936 processor cycles
  0 bytes consed

NIL
CL-USER> (time (dotimes (i 1000000) (get-next-rand-declared)))
Evaluation took:
  0.015 seconds of real time
  0.020001 seconds of total run time (0.020001 user, 0.000000 system)
  133.33% CPU
  51,911,523 processor cycles
  0 bytes consed

NIL

Look just at the processor cycles lines.  The version with declared variable took 52 million cycles, vs. over 86 million for the type with unknown types.

If you’re interested, here’s the disassembly of the two functions:
 

CL-USER> (disassemble 'get-next-rand)
; disassembly for GET-NEXT-RAND
; 0587651F:       488B058AFFFFFF   MOV RAX, [RIP-118]         ; '*RAND-MULTIPLIER*
                                                              ; no-arg-parsing entry point
;      526:       488B5021         MOV RDX, [RAX+33]
;      52A:       498B1414         MOV RDX, [R12+RDX]
;      52E:       4883FA61         CMP RDX, 97
;      532:       7504             JNE L0
;      534:       488B50F9         MOV RDX, [RAX-7]
;      538: L0:   488B0579FFFFFF   MOV RAX, [RIP-135]         ; '*RAND-STATE*
;      53F:       488B7821         MOV RDI, [RAX+33]
;      543:       498B3C3C         MOV RDI, [R12+RDI]
;      547:       4883FF61         CMP RDI, 97
;      54B:       7504             JNE L1
;      54D:       488B78F9         MOV RDI, [RAX-7]
;      551: L1:   4C8D1C25B9020020 LEA R11, [#x200002B9]      ; GENERIC-*
;      559:       41FFD3           CALL R11
;      55C:       480F42E3         CMOVB RSP, RBX
;      560:       488B0559FFFFFF   MOV RAX, [RIP-167]         ; '*RAND-INCREMENT*
;      567:       488B7821         MOV RDI, [RAX+33]
;      56B:       498B3C3C         MOV RDI, [R12+RDI]
;      56F:       4883FF61         CMP RDI, 97
;      573:       7504             JNE L2
;      575:       488B78F9         MOV RDI, [RAX-7]
;      579: L2:   4C8D1C25E0010020 LEA R11, [#x200001E0]      ; GENERIC-+
;      581:       41FFD3           CALL R11
;      584:       480F42E3         CMOVB RSP, RBX
;      588:       488955F8         MOV [RBP-8], RDX
;      58C:       488B0D35FFFFFF   MOV RCX, [RIP-203]         ; '*RAND-MODULUS*
;      593:       488B4121         MOV RAX, [RCX+33]
;      597:       498B0404         MOV RAX, [R12+RAX]
;      59B:       4883F861         CMP RAX, 97
;      59F:       7504             JNE L3
;      5A1:       488B41F9         MOV RAX, [RCX-7]
;      5A5: L3:   488945E8         MOV [RBP-24], RAX
;      5A9:       488B55F8         MOV RDX, [RBP-8]
;      5AD:       488B7DE8         MOV RDI, [RBP-24]
;      5B1:       488D5C24F0       LEA RBX, [RSP-16]
;      5B6:       4883EC18         SUB RSP, 24
;      5BA:       488B050FFFFFFF   MOV RAX, [RIP-241]         ; #<FDEFINITION object for TRUNCATE>
;      5C1:       B904000000       MOV ECX, 4
;      5C6:       48892B           MOV [RBX], RBP
;      5C9:       488BEB           MOV RBP, RBX
;      5CC:       FF5009           CALL QWORD PTR [RAX+9]
;      5CF:       48897DF0         MOV [RBP-16], RDI
;      5D3:       488B55F0         MOV RDX, [RBP-16]
;      5D7:       31FF             XOR EDI, EDI
;      5D9:       488D0C2586040020 LEA RCX, [#x20000486]      ; GENERIC-=
;      5E1:       FFD1             CALL RCX
;      5E3:       7529             JNE L8
;      5E5: L4:   488B4DF0         MOV RCX, [RBP-16]
;      5E9: L5:   488B15C8FEFFFF   MOV RDX, [RIP-312]         ; '*RAND-STATE*
;      5F0:       488B4221         MOV RAX, [RDX+33]
;      5F4:       49833C0461       CMP QWORD PTR [R12+RAX], 97
;      5F9:       7406             JEQ L6
;      5FB:       49890C04         MOV [R12+RAX], RCX
;      5FF:       EB04             JMP L7
;      601: L6:   48894AF9         MOV [RDX-7], RCX
;      605: L7:   488BD1           MOV RDX, RCX
;      608:       488BE5           MOV RSP, RBP
;      60B:       F8               CLC
;      60C:       5D               POP RBP
;      60D:       C3               RET
;      60E: L8:   488B55E8         MOV RDX, [RBP-24]
;      612:       31FF             XOR EDI, EDI
;      614:       488D0C25E5030020 LEA RCX, [#x200003E5]      ; GENERIC-<
;      61C:       FFD1             CALL RCX
;      61E:       7D2E             JNL L10
;      620:       488B55F8         MOV RDX, [RBP-8]
;      624:       31FF             XOR EDI, EDI
;      626:       488D0C251B040020 LEA RCX, [#x2000041B]      ; GENERIC->
;      62E:       FFD1             CALL RCX
;      630:       7EB3             JLE L4
;      632: L9:   488B55F0         MOV RDX, [RBP-16]
;      636:       488B7DE8         MOV RDI, [RBP-24]
;      63A:       4C8D1C25E0010020 LEA R11, [#x200001E0]      ; GENERIC-+
;      642:       41FFD3           CALL R11
;      645:       480F42E3         CMOVB RSP, RBX
;      649:       488BCA           MOV RCX, RDX
;      64C:       EB9B             JMP L5
;      64E: L10:  488B55F8         MOV RDX, [RBP-8]
;      652:       31FF             XOR EDI, EDI
;      654:       488D0C25E5030020 LEA RCX, [#x200003E5]      ; GENERIC-<
;      65C:       FFD1             CALL RCX
;      65E:       7CD2             JL L9
;      660:       EB83             JMP L4
NIL
CL-USER> (disassemble 'get-next-rand-declared)
; disassembly for GET-NEXT-RAND-DECLARED
; 058B4AFF:       488B0D9AFFFFFF   MOV RCX, [RIP-102]         ; '*RAND-MULTIPLIER*
                                                              ; no-arg-parsing entry point
;      B06:       488B4121         MOV RAX, [RCX+33]
;      B0A:       498B0404         MOV RAX, [R12+RAX]
;      B0E:       4883F861         CMP RAX, 97
;      B12:       7504             JNE L0
;      B14:       488B41F9         MOV RAX, [RCX-7]
;      B18: L0:   488B1589FFFFFF   MOV RDX, [RIP-119]         ; '*RAND-STATE*
;      B1F:       488B4A21         MOV RCX, [RDX+33]
;      B23:       498B0C0C         MOV RCX, [R12+RCX]
;      B27:       4883F961         CMP RCX, 97
;      B2B:       7504             JNE L1
;      B2D:       488B4AF9         MOV RCX, [RDX-7]
;      B31: L1:   48D1F8           SAR RAX, 1
;      B34:       48D1F9           SAR RCX, 1
;      B37:       488BD0           MOV RDX, RAX
;      B3A:       480FAFD1         IMUL RDX, RCX
;      B3E:       488B0D6BFFFFFF   MOV RCX, [RIP-149]         ; '*RAND-INCREMENT*
;      B45:       488B4121         MOV RAX, [RCX+33]
;      B49:       498B0404         MOV RAX, [R12+RAX]
;      B4D:       4883F861         CMP RAX, 97
;      B51:       7504             JNE L2
;      B53:       488B41F9         MOV RAX, [RCX-7]
;      B57: L2:   488BC8           MOV RCX, RAX
;      B5A:       48D1F9           SAR RCX, 1
;      B5D:       488D040A         LEA RAX, [RDX+RCX]
;      B61:       488B1550FFFFFF   MOV RDX, [RIP-176]         ; '*RAND-MODULUS*
;      B68:       488B4A21         MOV RCX, [RDX+33]
;      B6C:       498B0C0C         MOV RCX, [R12+RCX]
;      B70:       4883F961         CMP RCX, 97
;      B74:       7504             JNE L3
;      B76:       488B4AF9         MOV RCX, [RDX-7]
;      B7A: L3:   48D1F9           SAR RCX, 1
;      B7D:       4885C9           TEST RCX, RCX
;      B80:       742A             JEQ L6
;      B82:       4899             CQO
;      B84:       48F7F9           IDIV RAX, RCX
;      B87:       48D1E2           SHL RDX, 1
;      B8A:       488B0D17FFFFFF   MOV RCX, [RIP-233]         ; '*RAND-STATE*
;      B91:       488B4121         MOV RAX, [RCX+33]
;      B95:       49833C0461       CMP QWORD PTR [R12+RAX], 97
;      B9A:       7406             JEQ L4
;      B9C:       49891404         MOV [R12+RAX], RDX
;      BA0:       EB04             JMP L5
;      BA2: L4:   488951F9         MOV [RCX-7], RDX
;      BA6: L5:   488BE5           MOV RSP, RBP
;      BA9:       F8               CLC
;      BAA:       5D               POP RBP
;      BAB:       C3               RET
;      BAC: L6:   CC0A             BREAK 10                   ; error trap
;      BAE:       03               BYTE #X03
;      BAF:       1E               BYTE #X1E                  ; DIVISION-BY-ZERO-ERROR
;      BB0:       18               BYTE #X18                  ; RAX
;      BB1:       58               BYTE #X58                  ; RCX
NIL

You’ll note that, with the declare directives, the math all takes place in the function definition, while without them, the function makes multiple calls into generic arithmetic functions that will have to determine the type at runtime and perform the appropriate operations.

OK, so that covers the compiler settings and optimizer hints.  I mentioned a third use, suppressing warnings.  As a programmer, I always want my code to compile without warnings.  It makes the build less noisy, it makes it easier to find real errors due to recent changes, and warnings can sometimes point out typos or erroneous code that, while syntactically correct, is suspicious enough to warrant investigation.  So, when might declare be used to suppress warnings? Using our random number generator example, maybe we changed the API. We used to pass the previous random number in to the function, so that we didn’t have to maintain a separate state variable, or so that we could run several independent streams. Later, we decided to eliminate this behaviour, but didn’t want to go to change all of the instances where the random number function was called.  So, the parameter suddenly became unused.  It looks like this now:
 

(defun get-next-rand (previous-number)
  (setf *rand-state*
        (mod (+ (* *rand-multiplier* *rand-state*)
                *rand-increment*)
             *rand-modulus*)))

However, when you compile this file,  you get a warning from SBCL:
 

; file: /home/neufeld/programming/lisp/blogging/opt.lisp
; in: DEFUN GET-NEXT-RAND
;     (DEFUN GET-NEXT-RAND (PREVIOUS-NUMBER)
;       (SETF *RAND-STATE* (MOD (+ # *RAND-INCREMENT*) *RAND-MODULUS*)))
; --> PROGN EVAL-WHEN 
; ==>
;   (SB-IMPL::%DEFUN 'GET-NEXT-RAND
;                    (SB-INT:NAMED-LAMBDA GET-NEXT-RAND
;                        (PREVIOUS-NUMBER)
;                      (BLOCK GET-NEXT-RAND (SETF *RAND-STATE* #)))
;                    NIL 'NIL (SB-C:SOURCE-LOCATION))
; 
; caught STYLE-WARNING:
;   The variable PREVIOUS-NUMBER is defined but never used.

; compiling (DEFUN GET-NEXT-RAND-DECLARED ...); 
                                              ; compilation unit finished
                                              ;   caught 1 STYLE-WARNING condition

The compiler is warning you of an unused variable.  To eliminate the warning, and get a nice clean screen on your compiles, you tell the compiler that yes, you know this variable is unused, that’s fine.
 

(defun get-next-rand (previous-number)
  (declare (ignore previous-number))
  (setf *rand-state*
        (mod (+ (* *rand-multiplier* *rand-state*)
                *rand-increment*)
             *rand-modulus*)))

Another possibility is to declare a variable as ignorable.  That is it might or might not be used in the form.  This can happen as you write specialized macros that can determine at compile-time whether or not to inline certain code, but this is already a long enough posting so I’m going to wrap it up here.

EDIT #1 on 2014-10-18 to fix an unbalanced “raw” tag in place of “/raw” in some quoted code.

The less-familiar parts of Lisp for beginners — declaim

The series of posts related to Lisp functions that the newcomer arriving from C++ might not have encountered continues with the declaim macro.

The declaim macro is generally used as a top-level form, and has the effect of modifying the behaviour of the compiler.  Common uses include setting optimization and debugging states, or declaring that a function ought to be inlined where it is invoked.  It might also be used to indicate that a top-level variable is to be made special.  You will have seen me using the optimization settings in some earlier code examples, for instance here.

The new programmer is likely to be interested in this primarily for the compiler optimizations.  Variables being declaimed as special have to do with dynamic vs. lexical binding, which might be a separate posting some time later.

It is important to note that it is unspecified whether the side-effects of declaim on compilation persist past what the C++ programmer would call a compilation unit.  That is, if you compile a file with certain optimization settings using declaim, and then later compile another file, one that does not declaim specific settings, it is implementation-dependent whether or not the impact of the declaim macro is visible in the second compilation unit.  This may be significant, particularly if you are writing library code.

The less familiar parts of Lisp for beginners — summary B & C

I’ve written a few posts about features of the Lisp language that a newcomer arriving from C++ might not have encountered.  I’m going through them alphabetically, so here is a summary page of the functions beginning with the letters ‘B’ and ‘C’:

block

byte

call-next-method

change-class

compute-applicable-methods

I’m skipping over several interesting functions because they’re more usefully described along with related functions that we haven’t yet encountered.  These include broadcast-stream-streams, compiler-macro-function, concatenated-stream-streams, copy-pprint-dispatch, copy-readtable, and copy-symbol.

The less-familiar parts of Lisp for beginners — compute-applicable-methods

Continuing with some less familiar parts of Lisp from the C++ programmer’s perspective, we come to compute-applicable-methods.  For the programmer, this is most commonly going to be used for insight into the method combination mechanism.  Recall our discussion of call-next-method, there is a set of rules that describes how the method to be invoked is chosen.  There is a default set of rules, but the programmer can also write his or her own combination rules for certain classes.  When the code encounters a generic function invocation, it will choose a method to execute on the arguments, using compute-applicable-methods.

The beginner programmer can use this function to understand how the method is chosen.  The function returns a list of methods, in precedence order.  Here is a demonstration of this use:
 

;; Demonstrate compute-applicable-methods


(defclass base-class ()
  ())

(defclass derived-class (base-class)
  ())

(defgeneric myprint (stream object)
  (:documentation "Prints to 'stream' some information about 'object'."))

(defmethod myprint (stream (object base-class))
  (format t "Entering base-class primary method.~%")
  (format t "Leaving base-class primary method.~%"))

(defmethod myprint :around (stream (object base-class))
  (format t "Entering base-class :around method.~%")
  (format t "Calling next method.~%")
  (call-next-method)
  (format t "Leaving base-class :around method.~%"))

(defmethod myprint (stream (object derived-class))
  (format t "Entering derived-class primary method.~%")
  (format t "Leaving derived-class primary method.~%"))

(defun demonstrate ()
  (let ((base (make-instance 'base-class))
        (derived (make-instance 'derived-class)))
    (format t "The applicable methods list on base-class for 'myprint is:~%")
    (format t "~A~%~%" (compute-applicable-methods #'myprint 
                                                   (list t base)))
    (format t "And now we run 'myprint on a base-class object:~%")
    (myprint t base)

    (format t "~%~%")

    (format t "The applicable methods list on derived-class for 'myprint is:~%")
    (format t "~A~%~%" (compute-applicable-methods #'myprint 
                                                   (list t derived)))
    (format t "And now we run 'myprint on a derived-class object:~%")
    (myprint t derived)))

Producing output as follows:
 
CL-USER> (demonstrate)
The applicable methods list on base-class for 'myprint is:
(#<STANDARD-METHOD MYPRINT :AROUND (T BASE-CLASS) {100CEC1243}>
 #<STANDARD-METHOD MYPRINT (T BASE-CLASS) {100CD87083}>)

And now we run 'myprint on a base-class object:
Entering base-class :around method.
Calling next method.
Entering base-class primary method.
Leaving base-class primary method.
Leaving base-class :around method.

The applicable methods list on derived-class for 'myprint is:
(#<STANDARD-METHOD MYPRINT (T DERIVED-CLASS) {100CF3AB63}>
 #<STANDARD-METHOD MYPRINT :AROUND (T BASE-CLASS) {100CEC1243}>
 #<STANDARD-METHOD MYPRINT (T BASE-CLASS) {100CD87083}>)

And now we run 'myprint on a derived-class object:
Entering base-class :around method.
Calling next method.
Entering derived-class primary method.
Leaving derived-class primary method.
Leaving base-class :around method.
NIL

You’ll notice that, for the derived class, it looks as if the order is wrong, because the base-class :around method was called, not the derived-class primary method.  Under method combination rules, if there are any :around methods, then the most specific :around method is invoked instead of any primary methods.  If that :around method invokes call-next-method, it then goes on through the rules we detailed before, under call-next-method.  So, the :around method, though second in the list of returned methods, overrides the more specific primary method.  Then, because of the call-next-method form, the more specific primary method is called.