Lambda calculus definition

The lambda calculus is a formal mathematical system based on constructing lambda terms and performing reduction operations on them. A lambda term is simply a variable, a lambda abstraction, or a function application, but a formal presentation can be somewhat lengthy. This article presents a complete definition of the lambda calculus, specifically the pure untyped lambda calculus without extensions, although a lambda calculus extended with numbers and arithmetic is used for explanatory purposes.

Lambda terms

The lambda calculus consists of a language of lambda terms, which are defined by a certain formal syntax. The syntax of the lambda calculus defines some expressions as valid lambda calculus expressions and some as invalid, just as some strings of characters are valid computer programs and some are not. A valid lambda calculus expression is called a "lambda term". In the simplest form of lambda calculus, terms are built using only the following three rules. These rules give an inductive definition that can be applied to build all syntactically valid lambda terms, and produce expressions such as: $(\lambda x.\lambda y.(\lambda z.(\lambda x.z\ x)\ (\lambda y.z\ y))(x\ y)).$ ^[1]

A variable ${\textstyle x}$ is a character or string representing a parameter, itself a valid lambda term.
A lambda abstraction ${\textstyle (\lambda x.M)}$ is a function definition, taking as input the bound variable $x$ (between the λ and the punctum/dot .) and returning the body ${\textstyle M}$ . The definition of a function with an abstraction merely "sets up" the function but does not invoke it. An abstraction denotes an anonymous function that takes a single input $x$ and returns $M$ . The syntax $(\lambda x.M)$ binds the variable $x$ in the term $M$ . For example, $\lambda x.(x^{2}+2)$ is an abstraction representing the anonymous function $x\mapsto x^{2}+2$ . More concretely, we might give this function the name $f$ , and then we could write $f(x)=x^{2}+2$ , although this name $f$ is superfluous when using the lambda calculus.
An application ${\textstyle (M\ N)}$ represents the application of a function ${\textstyle M}$ to an argument ${\textstyle N}$ . Both ${\textstyle M}$ and ${\textstyle N}$ are lambda terms. The application represents the act of calling function $M$ on input $N$ to produce $M(N)$ .

In Extended Backus-Naur Form, this might be summarized as $e::=v\mid (\lambda v.e)\mid (e\,e)$ , where the variables $v$ come from an infinite set $v_{1},v_{2},v_{3},\ldots$ , and the other symbols consist of lambda ' $\lambda$ ', dot '.', and parentheses '(' and ')'. A more formal and permissive presentation of the grammar might be as follows:

Name	BNF	Description
Expression	<expression> ::= <abstraction> | <application> | <variable> | <bracket>	A lambda term is either an abstraction, an application, a variable, or a bracketed expression.
Abstraction	<abstraction> ::= λ <variable-list> . <expression>	Anonymous function definition.
Variable list	<variable-list> ::= <variable> , <variable-list> | <variable>	A comma-separated list of variables.
Application	<application> ::= <expression> <expression>+	An application (function call) is two or more expressions in a row.
Variable	<variable> ::= <alpha> (<alpha> | <digit> | '_')*	A variable name, e.g. x, y, fact, sum, ...
Grouping	<bracket> ::= ( <expression> )	Expression bracketed with parentheses.

The set of lambda expressions is defined inductively, for example as a set $Λ$ , where the results of applying rules 1-3 are all and only the elements of $Λ$ . In the strictest sense, nothing else is a lambda term. That is, a lambda term is valid if and only if it can be obtained by repeated application of these three rules. Formally:

If x is a variable, then $x \in Λ.$
If x is a variable and $M \in Λ,$ then $(λ x . M) \in Λ.$
If $M, N \in Λ,$ then $(M N) \in Λ.$

Instances of rule 2 are known as abstractions and instances of rule 3 are known as applications.^[2]
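The three formation rules map directly onto a small algebraic data type. The following Python sketch is only an illustration: the class names `Var`, `Lam`, `App` and the helper `show` are assumptions introduced here, not part of the formal definition. `show` prints a term with the full parenthesisation of the grammar $e::=v\mid (\lambda v.e)\mid (e\,e)$ .

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Var:            # rule 1: a variable x
    name: str


@dataclass(frozen=True)
class Lam:            # rule 2: an abstraction (λx.M)
    param: str
    body: "Term"


@dataclass(frozen=True)
class App:            # rule 3: an application (M N)
    fn: "Term"
    arg: "Term"


Term = Union[Var, Lam, App]


def show(t: Term) -> str:
    """Render a term with every pair of parentheses written out."""
    if isinstance(t, Var):
        return t.name
    if isinstance(t, Lam):
        return f"(λ{t.param}.{show(t.body)})"
    return f"({show(t.fn)} {show(t.arg)})"


identity = Lam("x", Var("x"))
example = App(identity, Var("y"))
print(show(example))  # ((λx.x) y)
```

Because every value of this type is built from exactly these three constructors, anything that cannot be constructed this way is not a lambda term, which mirrors the "all and only" clause of the inductive definition.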

It is also common to extend the syntax presented here with additional operations, for example introducing terms for mathematical constants and operations, which allows making sense of terms such as $\lambda x.x^{2}.$ The untyped lambda calculus is flexible in that it does not distinguish between different kinds of data. For instance, there may be a function intended to operate on numbers. However, in the untyped lambda calculus, there is no way to prevent a function from being applied to truth values, strings, or other non-number objects. Depending on the encoding of the data, this may lead to nonsensical results, or work as intended.

Free and bound variables

Following the mathematical concepts of free variables and bound variables, the abstraction operator, $\lambda$ , is said to bind its variable wherever it occurs in the body of the abstraction. Variables that fall within the scope of an abstraction are said to be bound, and the part λx is often called the binder of x. Variables that are not bound are called free. For example, the function definition $f(x)=x+y$ could be represented as the lambda term $\lambda x.(x+y)$ , which contains two variables, $x$ and $y$ . The variable $x$ is bound by the lambda abstraction, while $y$ is free. The free variable $y$ has not been defined and is considered an unknown. The abstraction $\lambda x.(x+y)$ is a syntactically valid term and represents a function that adds its input to the yet-unknown $y$ . Also note that a variable is bound by its "nearest" abstraction. In the following example the single occurrence of $x$ in the expression is bound by the second lambda: $\lambda x.y(\lambda x.z\ x)$ . A variable may occur both free and bound in a term; for example $y$ in $y(\lambda y.y)$ .

More formally, the sets of free variables and bound variables of a lambda expression, $M$ , are denoted as $\operatorname {FV} (M)$ and $\operatorname {BV} (M)$ and can be defined by recursion on the structure of the terms, as follows:^[3]^[4]

$\operatorname {FV} (M)$ - Free Variable Set	Comment	$\operatorname {BV} (M)$ - Bound Variable Set	Comment
$\operatorname {FV} (x)=\{x\}$	where x is a variable. In words, the free variables of $x$ are just $x$ .	$\operatorname {BV} (x)=\emptyset$	where x is a variable
$\operatorname {FV} (\lambda x.M)=\operatorname {FV} (M)\setminus \{x\}$	Free variables of M, but with $x$ removed	$\operatorname {BV} (\lambda x.M)=\operatorname {BV} (M)\cup \{x\}$	Bound variables of M plus x.
$\operatorname {FV} (M\ N)=\operatorname {FV} (M)\cup \operatorname {FV} (N)$	Union of the free variables from the function and the parameter	$\operatorname {BV} (M\ N)=\operatorname {BV} (M)\cup \operatorname {BV} (N)$	Union of the bound variables from the function and the parameter

An expression that contains no free variables is said to be closed. Closed lambda expressions are also known as combinators and are equivalent to terms in combinatory logic. It is common to restrict discussion to only closed terms, and some presentations of the lambda calculus only consider closed terms. For example, the lambda term representing the identity $\lambda x.x$ has no free variables and is closed.
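The recursive definitions of $\operatorname {FV}$ and $\operatorname {BV}$ can be transcribed almost line by line. The sketch below assumes a hypothetical Python representation of terms (`Var`, `Lam`, `App`); the function names are illustrative only.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Lam:
    param: str
    body: "Term"


@dataclass(frozen=True)
class App:
    fn: "Term"
    arg: "Term"


Term = Union[Var, Lam, App]


def free_vars(t: Term) -> set:
    """FV(M), by recursion on the structure of the term."""
    if isinstance(t, Var):
        return {t.name}                          # FV(x) = {x}
    if isinstance(t, Lam):
        return free_vars(t.body) - {t.param}     # FV(λx.M) = FV(M) \ {x}
    return free_vars(t.fn) | free_vars(t.arg)    # FV(M N) = FV(M) ∪ FV(N)


def bound_vars(t: Term) -> set:
    """BV(M), by recursion on the structure of the term."""
    if isinstance(t, Var):
        return set()                             # BV(x) = ∅
    if isinstance(t, Lam):
        return bound_vars(t.body) | {t.param}    # BV(λx.M) = BV(M) ∪ {x}
    return bound_vars(t.fn) | bound_vars(t.arg)  # BV(M N) = BV(M) ∪ BV(N)


def is_closed(t: Term) -> bool:
    """A term is closed when it has no free variables."""
    return not free_vars(t)


# y (λy.y): y occurs both free and bound.
mixed = App(Var("y"), Lam("y", Var("y")))
print(free_vars(mixed), bound_vars(mixed), is_closed(Lam("x", Var("x"))))
```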

Notation

For convenience, parentheses can be dropped if the expression is unambiguous. For example, the outermost parentheses can always be dropped— $M\ N$ instead of $(M\ N)$ . However, not all parentheses can be eliminated. For example,

$\lambda x.((\lambda x.x)x)$ is of form $\lambda x.B$ and is therefore an abstraction, while
$(\lambda x.(\lambda x.x))x$ is of form $MN$ and is therefore an application.

The examples 1 and 2 denote different terms, differing only in where the parentheses are placed. They have different meanings: example 1 is a function definition, while example 2 is a function application. The lambda variable $x$ is a placeholder in both examples.

Here, example 1 defines a function $\lambda x.B$ , where $B$ is $(\lambda x.x)x$ , an anonymous function $(\lambda x.x)$ , with input $x$ ; while example 2, $M$ $N$ , is M applied to N, where $M$ is the lambda term $(\lambda x.(\lambda x.x))$ being applied to the input $N$ which is $x$ . Both examples 1 and 2 would evaluate to the identity function $\lambda x.x$ .

To allow further concision in these situations, the following conventions are usually applied:

Applications are assumed to be left-associative: $M\ N\ P$ may be written instead of $((M\ N)\ P)$ ^[5]
The body of an abstraction extends as far right as possible: $\lambda x.M\ N$ means $\lambda x.(M\ N)$ and not $(\lambda x.M)\ N$ . Said another way, a lambda abstraction has a lower precedence than an application.
A sequence of abstractions is contracted: $\lambda x.\lambda y.\lambda z.N$ is abbreviated as $\lambda xyz.N$ ^[6]^[7]^[5]
When all variables are single-letter, the space in applications may be omitted: MNP instead of M N P.^[8]
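These conventions can be illustrated with a small pretty-printer that emits only the parentheses the conventions require. This is a sketch under assumptions: the term classes and the function `pretty` are hypothetical, contracted binders are separated by spaces so that multi-letter variable names stay readable, and an abstraction in argument position is always parenthesised even where it could be omitted.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Lam:
    param: str
    body: "Term"


@dataclass(frozen=True)
class App:
    fn: "Term"
    arg: "Term"


Term = Union[Var, Lam, App]


def pretty(t: Term) -> str:
    """Print a term using the usual parenthesis-saving conventions."""
    if isinstance(t, Var):
        return t.name
    if isinstance(t, Lam):
        # Contract a sequence of abstractions: λx.λy.M becomes λx y.M.
        params = []
        while isinstance(t, Lam):
            params.append(t.param)
            t = t.body
        # The body extends as far right as possible, so no parentheses.
        return "λ" + " ".join(params) + "." + pretty(t)
    # Application is left-associative: only an abstraction in function
    # position, or a non-variable in argument position, needs parentheses.
    fn = pretty(t.fn)
    if isinstance(t.fn, Lam):
        fn = f"({fn})"
    arg = pretty(t.arg)
    if not isinstance(t.arg, Var):
        arg = f"({arg})"
    return f"{fn} {arg}"


x, y, z = Var("x"), Var("y"), Var("z")
print(pretty(Lam("x", Lam("y", Lam("z", App(App(x, y), z))))))  # λx y z.x y z
print(pretty(App(x, App(y, z))))                                # x (y z)
print(pretty(App(Lam("x", x), y)))                              # (λx.x) y
```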

Transformation and reduction

The meaning of lambda expressions is defined by how expressions can be transformed and reduced.^[9]

There are three kinds of transformation:

α-conversion: changing bound variables (alpha);
β-reduction: applying functions to their arguments (beta), calling functions;
η-reduction: which captures a notion of extensionality (eta).

We also speak of the resulting equivalences: two expressions are β-equivalent if they can be β-converted into the same expression; α-equivalence and η-equivalence are defined similarly.

The term redex, short for reducible expression, refers to subterms that can be reduced by one of the reduction rules. For example, $(\lambda x.M)\ N$ is a β-redex, expressing the substitution of $N$ for $x$ in $M$ ; if $x$ is not free in $M$ , $\lambda x.M\ x$ is an η-redex. The expression to which a redex reduces is called its reduct; using the previous example, the reducts of these expressions are respectively $M[x:=N]$ and $M$ .

α-conversion

α-conversion (alpha-conversion), sometimes known as α-renaming,^[10] allows bound variable names to be changed. For example, alpha-conversion of $\lambda x.x$ might yield $\lambda y.y$ . The terms $x$ and $y$ by themselves are not alpha-equivalent, because they are not bound in an abstraction. Terms that differ only by alpha-conversion are called α-equivalent, capturing the intuition that the particular choice of a bound variable, in an abstraction, does not (usually) matter. Frequently in uses of lambda calculus, α-equivalent terms are considered to be equivalent.

The precise rules for alpha-conversion are not completely trivial. First, when alpha-converting an abstraction, the only variable occurrences that are renamed are those that are bound by the same abstraction. For example, an alpha-conversion of $\lambda x.\lambda x.x$ could result in $\lambda y.\lambda x.x$ , but it could not result in $\lambda y.\lambda x.y$ . The latter has a different meaning from the original. This is analogous to the programming notion of variable shadowing.

Second, alpha-conversion is not possible if it would result in a variable getting captured by a different abstraction. For example, if we replace $x$ with $y$ in $\lambda x.\lambda y.x$ , we get $\lambda y.\lambda y.y$ , which is not at all the same. In the De Bruijn index notation, any two α-equivalent terms are syntactically identical, and confusion in this way cannot occur.

See example:

α-conversion	λ-expression	de Bruijn notation	Comment
	$\lambda z.\lambda y.(z\ y)$	$\lambda .\lambda .(21)$	Original expressions.
correctly rename y to k, (because k is not used in the body)	$\lambda z.\lambda k.(z\ k)$	$\lambda .\lambda .(21)$	No change to de Bruijn expression.
naively rename y to z, (wrong because z free in $\lambda y.(z\ y)$ )	$\lambda z.\lambda z.(z\ z)$	$\lambda .\lambda .({\color {Red}1}1)$	$z$ is captured.
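The table can be reproduced mechanically: converting a named term to De Bruijn notation replaces each bound variable by the number of binders between the occurrence and its binder, so α-equivalent terms become identical. The following sketch assumes a hypothetical Python term representation and keeps free variables by name; `to_de_bruijn` and `alpha_equivalent` are illustrative names.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Lam:
    param: str
    body: "Term"


@dataclass(frozen=True)
class App:
    fn: "Term"
    arg: "Term"


Term = Union[Var, Lam, App]


def to_de_bruijn(t: Term, env: tuple = ()) -> str:
    """Render t in De Bruijn notation; env lists binders, innermost first."""
    if isinstance(t, Var):
        if t.name in env:
            # index() finds the innermost (nearest) binder of this name.
            return str(env.index(t.name) + 1)
        return t.name  # free variable: left as a name
    if isinstance(t, Lam):
        return "(λ." + to_de_bruijn(t.body, (t.param,) + env) + ")"
    return "(" + to_de_bruijn(t.fn, env) + " " + to_de_bruijn(t.arg, env) + ")"


def alpha_equivalent(a: Term, b: Term) -> bool:
    """Two terms are α-equivalent iff their De Bruijn forms coincide."""
    return to_de_bruijn(a) == to_de_bruijn(b)


original = Lam("z", Lam("y", App(Var("z"), Var("y"))))
renamed = Lam("z", Lam("k", App(Var("z"), Var("k"))))   # y renamed to k
captured = Lam("z", Lam("z", App(Var("z"), Var("z"))))  # y naively renamed to z

print(to_de_bruijn(original))  # (λ.(λ.(2 1)))
print(to_de_bruijn(captured))  # (λ.(λ.(1 1)))
print(alpha_equivalent(original, renamed), alpha_equivalent(original, captured))
```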

Substitution

Substitution, written $E[V:=R]$ , is the process of replacing all free occurrences of the variable $V$ in the expression $E$ with expression $R$ . Substitution on terms of the lambda calculus is defined by recursion on the structure of terms, as follows (note: x and y are variables while M and N are any λ expression).

$x[x:=N]=N$ ; with $N$ substituted for $x$ , $x$ becomes $N$
$y[x:=N]=y$ if $x\neq y$ ; with $N$ substituted for $x$ , $y$ (which is not $x$ ) remains $y$
$(M_{1}\ M_{2})[x:=N]=(M_{1}[x:=N])(M_{2}[x:=N])$ ; substitution distributes to both sides of an application
$(\lambda x.M)[x:=N]=\lambda x.M$ ; a variable bound by an abstraction is not subject to substitution; substituting such a variable leaves the abstraction unchanged
$(\lambda y.M)[x:=N]=\lambda y.(M[x:=N])$ if $x\neq y$ and $y\notin FV(N)$ ; substituting a variable which is not bound by an abstraction proceeds in the abstraction's body, provided that the abstracted variable $y$ is "fresh" for the substitution term $N$ , meaning it does not appear among the free variables of $N.$

For example, $(\lambda x.x)[y:=y]=\lambda x.(x[y:=y])=\lambda x.x$ , and $((\lambda x.y)x)[x:=y]=((\lambda x.y)[x:=y])(x[x:=y])=(\lambda x.y)y$ .

The freshness condition (requiring that $y$ is not in the free variables of $N$ ) is crucial in order to ensure that substitution does not change the meaning of functions. The situation where the substituted $x$ was supposed to be free but ended up being bound is a situation known as capturing $x$ . To substitute into a lambda abstraction, it is sometimes necessary to α-convert the expression. For example, this substitution $(\lambda x.y)[y:=x]\neq \lambda x.(y[y:=x])=\lambda x.x$ is erroneous because it would turn the constant function $\lambda x.y$ into the identity $\lambda x.x$ . The correct substitution is to rename the bound variable using α-equivalence, in this case $(\lambda x.y)[y:=x]=(\lambda z.y)[y:=x]=\lambda z.(y[y:=x])=\lambda z.x$ .

In general, failure to meet the freshness condition can be remedied by alpha-renaming first, with a suitable fresh variable. Substitution is defined uniquely up to α-equivalence. Most implementations of substitution use alpha-conversion automatically to avoid capture during substitution, an operation called capture-avoiding substitution. In programming languages with static scope, capture-avoiding substitution can be used to implement name resolution by carefully handling variable shadowing in containing scopes. Another strategy is to require alpha renaming in the source program to make name resolution trivial. If De Bruijn indexing is used, then α-conversion is no longer required as there will be no name collisions. Similarly, variable names are not needed if using a universal lambda function, such as Iota and Jot, which can create any function behavior by calling it on itself in various combinations.
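A minimal sketch of capture-avoiding substitution follows, assuming a hypothetical Python term representation. The five cases of the recursive definition appear in order; when the freshness condition fails, the bound variable is α-renamed first. The naming scheme for fresh variables (appending a number, as in `x0`) is an arbitrary choice made here, so the result agrees with the text's $\lambda z.x$ only up to α-equivalence.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Lam:
    param: str
    body: "Term"


@dataclass(frozen=True)
class App:
    fn: "Term"
    arg: "Term"


Term = Union[Var, Lam, App]


def free_vars(t: Term) -> set:
    if isinstance(t, Var):
        return {t.name}
    if isinstance(t, Lam):
        return free_vars(t.body) - {t.param}
    return free_vars(t.fn) | free_vars(t.arg)


def fresh(base: str, avoid: set) -> str:
    """Pick a name derived from base that is not in avoid."""
    i = 0
    while f"{base}{i}" in avoid:
        i += 1
    return f"{base}{i}"


def subst(t: Term, x: str, n: Term) -> Term:
    """Return t[x := n], renaming bound variables where capture would occur."""
    if isinstance(t, Var):
        return n if t.name == x else t          # x[x:=N] = N,  y[x:=N] = y
    if isinstance(t, App):
        return App(subst(t.fn, x, n), subst(t.arg, x, n))
    if t.param == x:
        return t                                # (λx.M)[x:=N] = λx.M
    if t.param in free_vars(n):
        # Freshness condition fails: α-rename the binder before substituting.
        new = fresh(t.param, free_vars(n) | free_vars(t.body) | {x})
        body = subst(t.body, t.param, Var(new))
        return Lam(new, subst(body, x, n))
    return Lam(t.param, subst(t.body, x, n))    # (λy.M)[x:=N] = λy.(M[x:=N])


# (λx.y)[y := x] must stay a constant function, not become the identity.
print(subst(Lam("x", Var("y")), "y", Var("x")))
```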

β-reduction

β-reduction (beta reduction) captures the idea of function application (also called a function call), and implements the substitution of the actual parameter expression for the formal parameter variable. β-reduction is defined in terms of substitution: the β-reduction of $((\lambda V.E)\ E')$ is $E[V:=E']$ . Equivalently, the β-reduction rule states that an application of the form $(\lambda x.t)s$ reduces to the term $t[x:=s]$ . The notation $(\lambda x.t)s\to t[x:=s]$ is used to indicate that $(\lambda x.t)s$ β-reduces to $t[x:=s]$ . β-reduction can be seen to be the same as the concept of local reducibility in natural deduction, via the Curry–Howard isomorphism.

For example, for every $s$ , $(\lambda x.x)s\to x[x:=s]=s$ . This demonstrates that $\lambda x.x$ really is the identity. Similarly, $(\lambda x.y)s\to y[x:=s]=y$ , which demonstrates that $\lambda x.y$ is a constant function. Assuming some encoding of $2,7,\times$ , we have the following β-reduction: $((\lambda n.\ n\times 2)\ 7)\rightarrow 7\times 2$ .

More formally, β-reduction may be performed on the lambda abstraction without alpha renaming only if no variable names are free in the actual parameter and bound in the body:^[a]

$\operatorname {FV} (y)\cap \operatorname {BV} (b)=\{\}\to (\lambda x.b)\ y=b[x:=y]$

Alpha renaming may be used on $b$ to rename names that are free in $y$ but bound in $b$ , to meet the pre-condition for this transformation. See example:

β-reduction	λ-expression	de Bruijn notation	Comment
	$(\lambda x.\lambda y.(\lambda z.(\lambda x.z\ x)(\lambda y.z\ y))(x\ y))$	$(\lambda .\lambda .(\lambda .(\lambda .21)(\lambda .21))(21))$	Original expressions.
Naive beta 1	$(\lambda x.\lambda y.((\lambda x.(x\ y)x)(\lambda y.(x\ y)y)))$	Correct: $(\lambda .\lambda .((\lambda .({\color {Blue}3}2)1)(\lambda .(3{\color {Blue}2})1)))$ ; Incorrect: $(\lambda .\lambda .((\lambda .({\color {Red}1}2)1)(\lambda .(3{\color {Red}1})1)))$	x and y have been captured in the substitution.
Alpha rename inner, x → a, y → b	$(\lambda x.\lambda y.(\lambda z.(\lambda a.z\ a)(\lambda b.z\ b))(x\ y))$	$(\lambda .\lambda .(\lambda .(\lambda .21)(\lambda .21))(21))$	
Beta 2	$(\lambda x.\lambda y.((\lambda a.(x\ y)a)(\lambda b.(x\ y)b)))$	$(\lambda .\lambda .((\lambda .(32)1)(\lambda .(32)1)))$	x and y not captured.

Looking closer, the key substitutions are

${\begin{array}{r}((\lambda x.z\ x)(\lambda y.z\ y))[z:=(x\ y)]\ {\text{(blocked - will capture)}}\\((\lambda a.z\ a)(\lambda b.z\ b))[z:=(x\ y)]\ {\text{(allowed - no capture)}}\end{array}}$

In this example,

In the naive β-reduction,
1. The free variables of the actual parameter are $\operatorname {FV} (x\ y)=\{x,y\}$
2. The bound variables of the body are $\operatorname {BV} ((\lambda x.z\ x)(\lambda y.z\ y))=\{x,y\}$
The naive β-reduction changed the meaning of the expression because x and y from the actual parameter became captured when the expression was substituted into the inner abstractions.
The alpha renaming removed the problem by changing the names of x and y in the inner abstractions so that they are distinct from the names of x and y in the actual parameter. After renaming,
1. The free variables of the actual parameter are $\operatorname {FV} (x\ y)=\{x,y\}$
2. The bound variables of the body are $\operatorname {BV} ((\lambda a.z\ a)(\lambda b.z\ b))=\{a,b\}$
The β-reduction then proceeded with the intended meaning.
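The precondition $\operatorname {FV} (y)\cap \operatorname {BV} (b)=\{\}$ can be checked directly before contracting a redex. The sketch below assumes a hypothetical Python term representation; `needs_alpha_renaming` is an illustrative name. It tests the inner redex of the example before and after the renaming x → a, y → b. The check is conservative: it may request a renaming that a more precise analysis would show to be unnecessary.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Lam:
    param: str
    body: "Term"


@dataclass(frozen=True)
class App:
    fn: "Term"
    arg: "Term"


Term = Union[Var, Lam, App]


def free_vars(t: Term) -> set:
    if isinstance(t, Var):
        return {t.name}
    if isinstance(t, Lam):
        return free_vars(t.body) - {t.param}
    return free_vars(t.fn) | free_vars(t.arg)


def bound_vars(t: Term) -> set:
    if isinstance(t, Var):
        return set()
    if isinstance(t, Lam):
        return bound_vars(t.body) | {t.param}
    return bound_vars(t.fn) | bound_vars(t.arg)


def needs_alpha_renaming(redex: Term) -> bool:
    """For a β-redex (λx.b) y, report whether FV(y) ∩ BV(b) is non-empty."""
    if not (isinstance(redex, App) and isinstance(redex.fn, Lam)):
        raise ValueError("not a β-redex")
    return bool(free_vars(redex.arg) & bound_vars(redex.fn.body))


x, y, z, a, b = (Var(n) for n in "xyzab")
argument = App(x, y)  # the actual parameter (x y)
naive = App(Lam("z", App(Lam("x", App(z, x)), Lam("y", App(z, y)))), argument)
renamed = App(Lam("z", App(Lam("a", App(z, a)), Lam("b", App(z, b)))), argument)

print(needs_alpha_renaming(naive))    # True: x and y would be captured
print(needs_alpha_renaming(renamed))  # False: safe to substitute directly
```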

η-reduction

η-reduction (eta reduction) converts from $\lambda x.(fx)$ to $f$ , given that $x$ does not appear free in $f$ . The problem with applying η-reduction when $x$ appears free in $f$ is shown in this example:

Reduction	Lambda expression	β-reduction
	$(\lambda x.(\lambda y.y\,x)\,x)\,a$	$a\,a$
Naive η-reduction	$(\lambda y.y\,x)\,a$	$a\,x$

This improper use of η-reduction changes the meaning by leaving $x$ in $\lambda y.y\,x$ unsubstituted.

η-reduction is often paired with its inverse, η-expansion, which converts from $f$ to $\lambda x.(fx)$ , again given that $x$ does not appear free in $f$ . The two processes together are called η-conversion. η-conversion expresses the idea of extensionality,^[12] which in this context is that two functions are the same if and only if they give the same result for all arguments. η-conversion can be seen to be the same as the concept of local completeness in natural deduction, via the Curry–Howard isomorphism.
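The side condition on η-reduction is easy to enforce in code. The following sketch, using a hypothetical Python term representation and an illustrative function name `eta_reduce`, contracts $\lambda x.(f\,x)$ to $f$ only when $x$ is not free in $f$ , and otherwise returns the term unchanged.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Lam:
    param: str
    body: "Term"


@dataclass(frozen=True)
class App:
    fn: "Term"
    arg: "Term"


Term = Union[Var, Lam, App]


def free_vars(t: Term) -> set:
    if isinstance(t, Var):
        return {t.name}
    if isinstance(t, Lam):
        return free_vars(t.body) - {t.param}
    return free_vars(t.fn) | free_vars(t.arg)


def eta_reduce(t: Term) -> Term:
    """Contract λx.(f x) to f at the root, if x is not free in f."""
    if (
        isinstance(t, Lam)
        and isinstance(t.body, App)
        and t.body.arg == Var(t.param)
        and t.param not in free_vars(t.body.fn)
    ):
        return t.body.fn
    return t  # not an η-redex: leave the term as it is


# λx.(f x) is an η-redex and reduces to f.
ok = Lam("x", App(Var("f"), Var("x")))
# λx.((λy.y x) x) is not: x occurs free in λy.y x.
blocked = Lam("x", App(Lam("y", App(Var("y"), Var("x"))), Var("x")))

print(eta_reduce(ok))
print(eta_reduce(blocked) == blocked)
```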

Evaluation and normalization

The lambda calculus may be seen as an idealized version of a functional programming language, like Haskell or Standard ML. Under this view, β-reduction corresponds to a computational step. This step can be repeated by additional β-reductions until there are no more β-redexes left to reduce. If evaluation does terminate, the result is a lambda expression that cannot be reduced further by β-reduction. This is called the normal form of the expression. A lambda expression that cannot be reduced further by β-reductions is in beta normal form, and similarly if it also cannot be reduced by η-reductions it is in beta-eta normal form. All normal forms that can be converted into each other by α-conversion are defined to be equal. By the Church–Rosser theorem, the normal form is unique if it exists, regardless of the order in which the reductions are performed (the reduction strategy).

However, not all lambda expressions have a normal form. For example, consider the term $\Omega =(\lambda x.xx)(\lambda x.xx)$ . Here $(\lambda x.xx)(\lambda x.xx)\to (xx)[x:=\lambda x.xx]=(x[x:=\lambda x.xx])(x[x:=\lambda x.xx])=(\lambda x.xx)(\lambda x.xx)$ . That is, the term reduces to itself in a single β-reduction, and therefore the reduction process will never terminate. Typed lambda calculi, such as the simply typed lambda calculus, do not allow the construction of terms like $\Omega$ , and therefore all well-typed terms in these systems have a normal form.
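Repeated β-reduction can be sketched as a small evaluator. The code below is an illustration under assumptions: it uses a hypothetical Python term representation, a leftmost-outermost (normal-order) strategy, and a step limit `max_steps` to give up on terms such as $\Omega$ that never reach a normal form. None of these names come from the formal definition.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Lam:
    param: str
    body: "Term"


@dataclass(frozen=True)
class App:
    fn: "Term"
    arg: "Term"


Term = Union[Var, Lam, App]


def free_vars(t: Term) -> set:
    if isinstance(t, Var):
        return {t.name}
    if isinstance(t, Lam):
        return free_vars(t.body) - {t.param}
    return free_vars(t.fn) | free_vars(t.arg)


def subst(t: Term, x: str, n: Term) -> Term:
    """Capture-avoiding substitution t[x := n]."""
    if isinstance(t, Var):
        return n if t.name == x else t
    if isinstance(t, App):
        return App(subst(t.fn, x, n), subst(t.arg, x, n))
    if t.param == x:
        return t
    if t.param in free_vars(n):
        avoid = free_vars(n) | free_vars(t.body) | {x}
        i = 0
        while f"{t.param}{i}" in avoid:
            i += 1
        new = f"{t.param}{i}"
        return Lam(new, subst(subst(t.body, t.param, Var(new)), x, n))
    return Lam(t.param, subst(t.body, x, n))


def step(t: Term) -> Optional[Term]:
    """One leftmost-outermost β-step, or None if t is in β-normal form."""
    if isinstance(t, App):
        if isinstance(t.fn, Lam):                  # (λx.b) a  →  b[x := a]
            return subst(t.fn.body, t.fn.param, t.arg)
        fn = step(t.fn)
        if fn is not None:
            return App(fn, t.arg)
        arg = step(t.arg)
        return None if arg is None else App(t.fn, arg)
    if isinstance(t, Lam):
        body = step(t.body)
        return None if body is None else Lam(t.param, body)
    return None


def normalize(t: Term, max_steps: int = 100) -> Optional[Term]:
    """Return the β-normal form, or None if not reached within max_steps."""
    for _ in range(max_steps):
        nxt = step(t)
        if nxt is None:
            return t
        t = nxt
    return None


half = Lam("x", App(Var("x"), Var("x")))
omega = App(half, half)                            # Ω = (λx.xx)(λx.xx)
print(step(omega) == omega)                        # Ω reduces to itself
print(normalize(omega))                            # None: no normal form found
print(normalize(App(Lam("x", Var("y")), omega)))   # Var(name='y')
```

The last line shows why the strategy matters in practice: under normal order the constant function discards $\Omega$ without evaluating it, whereas a strategy that reduces the argument first would loop on the same term.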

Notes

^ Barendregt, Barendsen (2000) call this form axiom β: (λx.M[x]) N = M[N] , rewritten as (λx.M) N = M[x := N], "where M[x := N] denotes the substitution of N for every occurrence of x in M".^[3] (p. 7) Also denoted M[N/x], "the substitution of N for x in M".^[11]

References

^ Barendregt, Hendrik Pieter (1984), The Lambda Calculus: Its Syntax and Semantics, Studies in Logic and the Foundations of Mathematics, vol. 103 (Revised ed.), North Holland, Amsterdam., ISBN 978-0-444-87508-2, archived from the original on 2004-08-23 — Corrections
^ Barendregt, Hendrik Pieter (1984). The Lambda Calculus: Its Syntax and Semantics. Studies in Logic and the Foundations of Mathematics. Vol. 103 (Revised ed.). North Holland. ISBN 0-444-87508-5. (Corrections).
^ Barendregt, Henk; Barendsen, Erik (March 2000), Introduction to Lambda Calculus (PDF)
^ Barendregt, Henk (1985). The Lambda Calculus – Its Syntax and Semantics. Studies in Logic and the Foundations of Mathematics. Vol. 103. Amsterdam: North-Holland. ISBN 0444867481. Here: Def.2.1.6, p.24
^ "Example for Rules of Associativity". Lambda-bound.com. Retrieved 2012-06-18.
^ Selinger, Peter (2008), Lecture Notes on the Lambda Calculus (PDF), vol. 0804, Department of Mathematics and Statistics, University of Ottawa, p. 9, arXiv:0804.3434, Bibcode:2008arXiv0804.3434S
^ "Example for Rule of Associativity". Lambda-bound.com. Retrieved 2012-06-18.
^ "The Basic Grammar of Lambda Expressions". SoftOption. Some other systems use juxtaposition to mean application, so 'ab' means 'a@b'. This is fine except that it requires that variables have length one so that we know that 'ab' is two variables juxtaposed not one variable of length 2. But we want to labels like 'firstVariable' to mean a single variable, so we cannot use this juxtaposition convention.
^ de Queiroz, Ruy J. G. B. (1988). "A Proof-Theoretic Account of Programming and the Role of Reduction Rules". Dialectica. 42 (4): 265–282. doi:10.1111/j.1746-8361.1988.tb00919.x.
^ Turbak, Franklyn; Gifford, David (2008), Design concepts in programming languages, MIT press, p. 251, ISBN 978-0-262-20175-9
^ explicit substitution at the nLab
^ Luke Palmer (29 Dec 2010) Haskell-cafe: What's the motivation for η rules?