Amb and Search

To extend Scheme to support nondeterminism, we introduce a new special form called amb.^{noteThe idea of amb for nondeterministic programming was first described in 1961 by John McCarthy (see McCarthy 1967).} The expression (amb <e₁> <e₂> ... <e_n>) returns the value of one of the n expressions <e_i> "ambiguously." For example, the expression

can have six possible values:

(1 a) (1 b) (2 a) (2 b) (3 a) (3 b)

Amb with a single choice produces an ordinary (single) value.

Amb with no choices -- the expression (amb) -- is an expression with no acceptable values. Operationally, we can think of (amb) as an expression that when evaluated causes the computation to "fail": The computation aborts and no value is produced. Using this idea, we can express the requirement that a particular predicate expression p must be true as follows:

With amb and require, we can implement the an-element-of procedure used above:

(define (an-element-of items) (require (not (null? items))) (amb (car items) (an-element-of (cdr items))))

An-element-of fails if the list is empty. Otherwise it ambiguously returns either the first element of the list or an element chosen from the rest of the list.

We can also express infinite ranges of choices. The following procedure potentially returns any integer greater than or equal to some given n:

(define (an-integer-starting-from n) (amb n (an-integer-starting-from (+ n 1))))

This is like the stream procedure integers-starting-from described in section 3.5.2, but with an important difference: The stream procedure returns an object that represents the sequence of all integers beginning with n, whereas the amb procedure returns a single integer.^{noteIn actuality, the distinction between nondeterministically returning a single choice and returning all choices depends somewhat on our point of view. From the perspective of the code that uses the value, the nondeterministic choice returns a single value. From the perspective of the programmer designing the code, the nondeterministic choice potentially returns all possible values, and the computation branches so that each value is investigated separately.}

Abstractly, we can imagine that evaluating an amb expression causes time to split into branches, where the computation continues on each branch with one of the possible values of the expression. We say that amb represents a nondeterministic choice point. If we had a machine with a sufficient number of processors that could be dynamically allocated, we could implement the search in a straightforward way. Execution would proceed as in a sequential machine, until an amb expression is encountered. At this point, more processors would be allocated and initialized to continue all of the parallel executions implied by the choice. Each processor would proceed sequentially as if it were the only choice, until it either terminates by encountering a failure, or it further subdivides, or it finishes.^{noteOne might object that this is a hopelessly inefficient mechanism. It might require millions of processors to solve some easily stated problem this way, and most of the time most of those processors would be idle. This objection should be taken in the context of history. Memory used to be considered just such an expensive commodity. In 1964 a megabyte of RAM cost about $400,000. Now every personal computer has many megabytes of RAM, and most of the time most of that RAM is unused. It is hard to underestimate the cost of mass-produced electronics.}

On the other hand, if we have a machine that can execute only one process (or a few concurrent processes), we must consider the alternatives sequentially. One could imagine modifying an evaluator to pick at random a branch to follow whenever it encounters a choice point. Random choice, however, can easily lead to failing values. We might try running the evaluator over and over, making random choices and hoping to find a non-failing value, but it is better to systematically search all possible execution paths. The amb evaluator that we will develop and work with in this section implements a systematic search as follows: When the evaluator encounters an application of amb, it initially selects the first alternative. This selection may itself lead to a further choice. The evaluator will always initially choose the first alternative at each choice point. If a choice results in a failure, then the evaluator automagically^{noteAutomagically: "Automatically, but in a way which, for some reason (typically because it is too complicated, or too ugly, or perhaps even too trivial), the speaker doesn't feel like explaining." (Steele 1983, Raymond 1993)} backtracks to the most recent choice point and tries the next alternative. If it runs out of alternatives at any choice point, the evaluator will back up to the previous choice point and resume from there. This process leads to a search strategy known as depth-first search or chronological backtracking.^{noteThe integration of automatic search strategies into programming languages has had a long and checkered history. The first suggestions that nondeterministic algorithms might be elegantly encoded in a programming language with search and automatic backtracking came from Robert Floyd (1967). Carl Hewitt (1969) invented a programming language called Planner that explicitly supported automatic chronological backtracking, providing for a built-in depth-first search strategy. Sussman, Winograd, and Charniak (1971) implemented a subset of this language, called MicroPlanner, which was used to support work in problem solving and robot planning. Similar ideas, arising from logic and theorem proving, led to the genesis in Edinburgh and Marseille of the elegant language Prolog (which we will discuss in section 4.4). After sufficient frustration with automatic search, McDermott and Sussman (1972) developed a language called Conniver, which included mechanisms for placing the search strategy under programmer control. This proved unwieldy, however, and Sussman and Stallman (1975) found a more tractable approach while investigating methods of symbolic analysis for electrical circuits. They developed a non-chronological backtracking scheme that was based on tracing out the logical dependencies connecting facts, a technique that has come to be known as dependency-directed backtracking. Although their method was complex, it produced reasonably efficient programs because it did little redundant search. Doyle (1979) and McAllester (1978, 1980) generalized and clarified the methods of Stallman and Sussman, developing a new paradigm for formulating search that is now called truth maintenance. Modern problem-solving systems all use some form of truth-maintenance system as a substrate. See Forbus and deKleer 1993 for a discussion of elegant ways to build truth-maintenance systems and applications using truth maintenance. Zabih, McAllester, and Chapman 1987 describes a nondeterministic extension to Scheme that is based on amb; it is similar to the interpreter described in this section, but more sophisticated, because it uses dependency-directed backtracking rather than chronological backtracking. Winston 1992 gives an introduction to both kinds of backtracking.}

Driver loop

The driver loop for the amb evaluator has some unusual properties. It reads an expression and prints the value of the first non-failing execution, as in the prime-sum-pair example shown above. If we want to see the value of the next successful execution, we can ask the interpreter to backtrack and attempt to generate a second non-failing execution. This is signaled by typing the symbol try-again. If any expression except try-again is given, the interpreter will start a new problem, discarding the unexplored alternatives in the previous problem. Here is a sample interaction:

;;; Amb-Eval input: (prime-sum-pair '(1 3 5 8) '(20 35 110)) ;;; Starting a new problem ;;; Amb-Eval value: (3 20) ;;; Amb-Eval input: try-again ;;; Amb-Eval value: (3 110) ;;; Amb-Eval input: try-again ;;; Amb-Eval value: (8 35) ;;; Amb-Eval input: try-again ;;; There are no more values of (prime-sum-pair (quote (1 3 5 8)) (quote (20 35 110))) ;;; Amb-Eval input: (prime-sum-pair '(19 27 30) '(11 36 58)) ;;; Starting a new problem ;;; Amb-Eval value: (30 11)

Exercise 4.35. Write a procedure an-integer-between that returns an integer between two given bounds. This can be used to implement a procedure that finds Pythagorean triples, i.e., triples of integers (i,j,k) between the given bounds such that i < j and i² + j² = k², as follows:

(define (a-pythagorean-triple-between low high) (let ((i (an-integer-between low high))) (let ((j (an-integer-between i high))) (let ((k (an-integer-between j high))) (require (= (+ (* i i) (* j j)) (* k k))) (list i j k)))))

Exercise 4.36. Exercise 3.69 discussed how to generate the stream of all Pythagorean triples, with no upper bound on the size of the integers to be searched. Explain why simply replacing an-integer-between by an-integer-starting-from in the procedure in exercise 4.35 is not an adequate way to generate arbitrary Pythagorean triples. Write a procedure that actually will accomplish this. (That is, write a procedure for which repeatedly typing try-again would in principle eventually generate all Pythagorean triples.)

Exercise 4.37. Ben Bitdiddle claims that the following method for generating Pythagorean triples is more efficient than the one in exercise 4.35. Is he correct? (Hint: Consider the number of possibilities that must be explored.)

(define (a-pythagorean-triple-between low high) (let ((i (an-integer-between low high)) (hsq (* high high))) (let ((j (an-integer-between i high))) (let ((ksq (+ (* i i) (* j j)))) (require (>= hsq ksq)) (let ((k (sqrt ksq))) (require (integer? k)) (list i j k))))))