-
Notifications
You must be signed in to change notification settings - Fork 360
/
Copy pathpreliminaries.tex
2044 lines (1812 loc) · 135 KB
/
preliminaries.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
\chapter{Type theory}
\label{cha:typetheory}
\section{Type theory versus set theory}
\label{sec:types-vs-sets}
\label{sec:axioms}
\index{type theory}
Homotopy type theory is (among other things) a foundational language for mathematics, i.e., an alternative to Zermelo--Fraenkel\index{set theory!Zermelo--Fraenkel} set theory.
However, it behaves differently from set theory in several important ways, and that can take some getting used to.
Explaining these differences carefully requires us to be more formal here than we will be in the rest of the book.
As stated in the introduction, our goal is to write type theory \emph{informally}; but for a mathematician accustomed to set theory, more precision at the beginning can help avoid some common misconceptions and mistakes.
We note that a set-theoretic foundation has two ``layers'': the deductive system of first-order logic,\index{first-order!logic} and, formulated inside this system, the axioms of a particular theory, such as ZFC.
Thus, set theory is not only about sets, but rather about the interplay between sets (the objects of the second layer) and propositions (the objects of the first layer).
By contrast, type theory is its own deductive system: it need not be formulated inside any superstructure, such as first-order logic.
Instead of the two basic notions of set theory, sets and propositions, type theory has one basic notion: \emph{types}.
Propositions (statements which we can prove, disprove, assume, negate, and so on\footnote{Confusingly, it is also a common practice (dating
back to Euclid) to use the word ``proposition'' synonymously with ``theorem''.
We will confine ourselves to the logician's usage, according to which a \emph{proposition} is a statement \emph{susceptible to} proof, whereas a \emph{theorem}\indexfoot{theorem} (or ``lemma''\indexfoot{lemma} or ``corollary''\indexfoot{corollary}) is such a statement that \emph{has been} proven.
Thus ``$0=1$'' and its negation ``$\neg(0=1)$'' are both propositions, but only the latter is a theorem.}) are identified with particular types, via the correspondence shown in \cref{tab:pov} on page~\pageref{tab:pov}.
Thus, the mathematical activity of \emph{proving a theorem} is identified with a special case of the mathematical activity of \emph{constructing an object}---in this case, an inhabitant of a type that represents a proposition.
\index{deductive system}%
This leads us to another difference between type theory and set theory, but to explain it we must say a little about deductive systems in general.
Informally, a deductive system is a collection of \define{rules}
\indexdef{rule}%
for deriving things called \define{judgments}.
\indexdef{judgment}%
If we think of a deductive system as a formal game,
\index{game!deductive system as}%
then the judgments are the ``positions'' in the game which we reach by following the game rules.
We can also think of a deductive system as a sort of algebraic theory, in which case the judgments are the elements (like the elements of a group) and the deductive rules are the operations (like the group multiplication).
From a logical point of view, the judgments can be considered to be the ``external'' statements, living in the metatheory, as opposed to the ``internal'' statements of the theory itself.
In the deductive system of first-order logic (on which set theory is based), there is only one kind of judgment: that a given proposition has a proof.
That is, each proposition $A$ gives rise to a judgment ``$A$ has a proof'', and all judgments are of this form.
A rule of first-order logic such as ``from $A$ and $B$ infer $A\wedge B$'' is actually a rule of ``proof construction'' which says that given the judgments ``$A$ has a proof'' and ``$B$ has a proof'', we may deduce that ``$A\wedge B$ has a proof''.
Note that the judgment ``$A$ has a proof'' exists at a different level from the \emph{proposition} $A$ itself, which is an internal statement of the theory.
% In particular, we cannot manipulate it to construct propositions such as ``if $A$ has a proof, then $B$ does not have a proof''---unless we are using our set-theoretic foundation as a meta-theory with which to talk about some other axiomatic system.
The basic judgment of type theory, analogous to ``$A$ has a proof'', is written ``$a:A$'' and pronounced as ``the term $a$ has type $A$'', or more loosely ``$a$ is an element of $A$'' (or, in homotopy type theory, ``$a$ is a point of $A$'').
\indexdef{term}%
\indexdef{element}%
\indexdef{point!of a type}%
When $A$ is a type representing a proposition, then $a$ may be called a \emph{witness}\index{witness!to the truth of a proposition} to the provability of $A$, or \emph{evidence}\index{evidence, of the truth of a proposition} of the truth of $A$ (or even a \emph{proof}\index{proof} of $A$, but we will try to avoid this confusing terminology).
In this case, the judgment $a:A$ is derivable in type theory (for some $a$) precisely when the analogous judgment ``$A$ has a proof'' is derivable in first-order logic (modulo differences in the axioms assumed and in the encoding of mathematics, as we will discuss throughout the book).
On the other hand, if the type $A$ is being treated more like a set than like a proposition (although as we will see, the distinction can become blurry), then ``$a:A$'' may be regarded as analogous to the set-theoretic statement ``$a\in A$''.
However, there is an essential difference in that ``$a:A$'' is a \emph{judgment} whereas ``$a\in A$'' is a \emph{proposition}.
In particular, when working internally in type theory, we cannot make statements such as ``if $a:A$ then it is not the case that $b:B$'', nor can we ``disprove'' the judgment ``$a:A$''.
A good way to think about this is that in set theory, ``membership'' is a relation which may or may not hold between two pre-existing objects ``$a$'' and ``$A$'', while in type theory we cannot talk about an element ``$a$'' in isolation: every element \emph{by its very nature} is an element of some type, and that type is (generally speaking) uniquely determined.
Thus, when we say informally ``let $x$ be a natural number'', in set theory this is shorthand for ``let $x$ be a thing and assume that $x\in\nat$'', whereas in type theory ``let $x:\nat$'' is an atomic statement: we cannot introduce a variable without specifying its type.\index{membership}
At first glance, this may seem an uncomfortable restriction, but it is arguably closer to the intuitive mathematical meaning of ``let $x$ be a natural number''.
In practice, it seems that whenever we actually \emph{need} ``$a\in A$'' to be a proposition rather than a judgment, there is always an ambient set $B$ of which $a$ is known to be an element and $A$ is known to be a subset.
This situation is also easy to represent in type theory, by taking $a$ to be an element of the type $B$, and $A$ to be a predicate on $B$; see \cref{subsec:prop-subsets}.
A last difference between type theory and set theory is the treatment of equality.
The familiar notion of equality in mathematics is a proposition: e.g.\ we can disprove an equality or assume an equality as a hypothesis.
Since in type theory, propositions are types, this means that equality is a type: for elements $a,b:A$ (that is, both $a:A$ and $b:A$) we have a type ``$\id[A]ab$''.
(In \emph{homotopy} type theory, of course, this equality proposition can behave in unfamiliar ways: see \cref{sec:identity-types,cha:basics}, and the rest of the book).
When $\id[A]ab$ is inhabited, we say that $a$ and $b$ are \define{(propositionally) equal}.
\index{propositional!equality}%
\index{equality!propositional}%
However, in type theory there is also a need for an equality \emph{judgment}, existing at the same level as the judgment ``$x:A$''.\index{judgment}
\symlabel{defn:judgmental-equality}%
This is called \define{judgmental equality}
\indexdef{equality!judgmental}%
\indexdef{judgmental equality}%
or \define{definitional equality},
\indexdef{equality!definitional}%
\indexsee{definitional equality}{equality, definitional}%
and we write it as $a\jdeq b : A$ or simply $a \jdeq b$.
It is helpful to think of this as meaning ``equal by definition''.
For instance, if we define a function $f:\nat\to\nat$ by the equation $f(x)=x^2$, then the expression $f(3)$ is equal to $3^2$ \emph{by definition}.
Inside the theory, it does not make sense to negate or assume an equality-by-definition; we cannot say ``if $x$ is equal to $y$ by definition, then $z$ is not equal to $w$ by definition''.
Whether or not two expressions are equal by definition is just a matter of expanding out the definitions; in particular, it is algorithmically\index{algorithm} decidable (though the algorithm is necessarily meta-theoretic, not internal to the theory).\index{decidable!definitional equality}
As type theory becomes more complicated, judgmental equality can get more subtle than this, but it is a good intuition to start from.
Alternatively, if we regard a deductive system as an algebraic theory, then judgmental equality is simply the equality in that theory, analogous to the equality between elements of a group---the only potential for confusion is that there is \emph{also} an object \emph{inside} the deductive system of type theory (namely the type ``$a=b$'') which behaves internally as a notion of ``equality''.
The reason we \emph{want} a judgmental notion of equality is so that it can control the other form of judgment, ``$a:A$''.
For instance, suppose we have given a proof that $3^2=9$, i.e.\ we have derived the judgment $p:(3^2=9)$ for some $p$.
Then the same witness $p$ ought to count as a proof that $f(3)=9$, since $f(3)$ is $3^2$ \emph{by definition}.
The best way to represent this is with a rule saying that given the judgments $a:A$ and $A\jdeq B$, we may derive the judgment $a:B$.
Thus, for us, type theory will be a deductive system based on two forms of judgment:
\begin{center}
\medskip
\begin{tabular}{cl}
\toprule
Judgment & Meaning\\
\midrule
$a : A$ & ``$a$ is an object of type $A$''\\
$a \jdeq b : A$ & ``$a$ and $b$ are definitionally equal objects of type $A$''\\
\bottomrule
\end{tabular}
\medskip
\end{center}
%
\symlabel{defn:defeq}%
When introducing a definitional equality, i.e., defining one thing to be equal to another, we will use the symbol ``$\defeq$''.
Thus, the above definition of the function $f$ would be written as $f(x)\defeq x^2$.
Because judgments cannot be put together into more complicated statements, the symbols ``$:$'' and ``$\jdeq$'' bind more loosely than anything else.%
\footnote{In formalized\indexfoot{mathematics!formalized} type theory, commas and turnstiles can bind even more loosely.
For instance, $x:A,y:B\vdash c:C$ is parsed as $((x:A),(y:B))\vdash (c:C)$.
However, in this book we refrain from such notation until \cref{cha:rules}.}
Thus, for instance, ``$p:\id{x}{y}$'' should be parsed as ``$p:(\id{x}{y})$'', which makes sense since ``$\id{x}{y}$'' is a type, and not as ``$\id{(p:x)}{y}$'', which is senseless since ``$p:x$'' is a judgment and cannot be equal to anything.
Similarly, ``$A\jdeq \id{x}{y}$'' can only be parsed as ``$A\jdeq(\id{x}{y})$'', although in extreme cases such as this, one ought to add parentheses anyway to aid reading comprehension.
Moreover, later on we will fall into the common notation of chaining together equalities --- e.g.\ writing $a=b=c=d$ to mean ``$a=b$ and $b=c$ and $c=d$, hence $a=d$'' --- and we will also include judgmental equalities in such chains.
Context usually suffices to make the intent clear.
This is perhaps also an appropriate place to mention that the common mathematical notation ``$f:A\to B$'', expressing the fact that $f$ is a function from $A$ to $B$, can be regarded as a typing judgment, since we use ``$A\to B$'' as notation for the type of functions from $A$ to $B$ (as is standard practice in type theory; see \cref{sec:pi-types}).
\index{assumption|(defstyle}%
Judgments may depend on \emph{assumptions} of the form $x:A$, where $x$ is a variable
\indexdef{variable}%
and $A$ is a type.
For example, we may construct an object $m + n : \nat$ under the assumptions that $m,n : \nat$.
Another example is that assuming $A$ is a type, $x,y : A$, and $p : \id[A]{x}{y}$, we may construct an element $p^{-1} : \id[A]{y}{x}$.
The collection of all such assumptions is called the \define{context};%
\index{context}
from a topological point of view it may be thought of as a ``parameter\index{parameter!space} space''.
In fact, technically the context must be an ordered list of assumptions, since later assumptions may depend on previous ones: the assumption $x:A$ can only be made \emph{after} the assumptions of any variables appearing in the type $A$.
If the type $A$ in an assumption $x:A$ represents a proposition, then the assumption is a type-theoretic version of a \emph{hypothesis}:
\indexdef{hypothesis}%
we assume that the proposition $A$ holds.
When types are regarded as propositions, we may omit the names of their proofs.
Thus, in the second example above we may instead say that assuming $\id[A]{x}{y}$, we can prove $\id[A]{y}{x}$.
However, since we are doing ``proof-relevant'' mathematics,
\index{mathematics!proof-relevant}%
we will frequently refer back to proofs as objects.
In the example above, for instance, we may want to establish that $p^{-1}$ together with the proofs of transitivity and reflexivity behave like a groupoid; see \cref{cha:basics}.
Note that under this meaning of the word \emph{assumption}, we can assume a propositional equality (by assuming a variable $p:x=y$), but we cannot assume a judgmental equality $x\jdeq y$, since it is not a type that can have an element.
However, we can do something else which looks kind of like assuming a judgmental equality: if we have a type or an element which involves a variable $x:A$, then we can \emph{substitute} any particular element $a:A$ for $x$ to obtain a more specific type or element.
We will sometimes use language like ``now assume $x\jdeq a$'' to refer to this process of substitution, even though it is not an \emph{assumption} in the technical sense introduced above.
\index{assumption|)}%
By the same token, we cannot \emph{prove} a judgmental equality either, since it is not a type in which we can exhibit a witness.
Nevertheless, we will sometimes state judgmental equalities as part of a theorem, e.g.\ ``there exists $f:A\to B$ such that $f(x)\jdeq y$''.
This should be regarded as the making of two separate judgments: first we make the judgment $f:A\to B$ for some element $f$, then we make the additional judgment that $f(x)\jdeq y$.
In the rest of this chapter, we attempt to give an informal presentation of type theory, sufficient for the purposes of this book; we give a more formal account in \cref{cha:rules}.
Aside from some fairly obvious rules (such as the fact that judgmentally equal things can always be substituted\index{substitution} for each other), the rules of type theory can be grouped into \emph{type formers}.
Each type former consists of a way to construct types (possibly making use of previously constructed types), together with rules for the construction and behavior of elements of that type.
In most cases, these rules follow a fairly predictable pattern, but we will not attempt to make this precise here; see however the beginning of \cref{sec:finite-product-types} and also \cref{cha:induction}.\index{type theory!informal}
\index{axiom!versus rules}%
\index{rule!versus axioms}%
An important aspect of the type theory presented in this chapter is that it consists entirely of \emph{rules}, without any \emph{axioms}.
In the description of deductive systems in terms of judgments, the \emph{rules} are what allow us to conclude one judgment from a collection of others, while the \emph{axioms} are the judgments we are given at the outset.
If we think of a deductive system as a formal game, then the rules are the rules of the game, while the axioms are the starting position.
And if we think of a deductive system as an algebraic theory, then the rules are the operations of the theory, while the axioms are the \emph{generators} for some particular free model of that theory.
In set theory, the only rules are the rules of first-order logic (such as the rule allowing us to deduce ``$A\wedge B$ has a proof'' from ``$A$ has a proof'' and ``$B$ has a proof''): all the information about the behavior of sets is contained in the axioms.
By contrast, in type theory, it is usually the \emph{rules} which contain all the information, with no axioms being necessary.
For instance, in \cref{sec:finite-product-types} we will see that there is a rule allowing us to deduce the judgment ``$(a,b):A\times B$'' from ``$a:A$'' and ``$b:B$'', whereas in set theory the analogous statement would be (a consequence of) the pairing axiom.
The advantage of formulating type theory using only rules is that rules are ``procedural''.
In particular, this property is what makes possible (though it does not automatically ensure) the good computational properties of type theory, such as ``canonicity''.\index{canonicity}
However, while this style works for traditional type theories, we do not yet understand how to formulate everything we need for \emph{homotopy} type theory in this way.
In particular, in \cref{sec:compute-pi,sec:compute-universe,cha:hits} we will have to augment the rules of type theory presented in this chapter by introducing additional axioms, notably the \emph{univalence axiom}.
In this chapter, however, we confine ourselves to a traditional rule-based type theory.
\section{Function types}
\label{sec:function-types}
\index{type!function|(defstyle}%
\indexsee{function type}{type, function}%
Given types $A$ and $B$, we can construct the type $A \to B$ of \define{functions}
\index{function|(defstyle}%
\indexsee{map}{function}%
\indexsee{mapping}{function}%
with domain $A$ and codomain $B$.
We also sometimes refer to functions as \define{maps}.
\index{domain!of a function}%
\index{codomain, of a function}%
\index{function!domain of}%
\index{function!codomain of}%
\index{functional relation}%
Unlike in set theory, functions are not defined as
functional relations; rather they are a primitive concept in type theory.
We explain the function type by prescribing what we can do with functions,
how to construct them and what equalities they induce.
Given a function $f : A \to B$ and an element of the domain $a : A$, we
can \define{apply}
\indexdef{application!of function}%
\indexdef{function!application}%
\indexsee{evaluation}{application, of a function}
the function to obtain an element of the codomain $B$,
denoted $f(a)$ and called the \define{value} of $f$ at $a$.
\indexdef{value!of a function}%
It is common in type theory to omit the parentheses\index{parentheses} and denote $f(a)$ simply by $f\,a$, and we will sometimes do this as well.
But how can we construct elements of $A \to B$? There are two equivalent ways:
either by direct definition or by using
$\lambda$-abstraction. Introducing a function by definition
\indexdef{definition!of function, direct}%
means that
we introduce a function by giving it a name --- let's say, $f$ --- and saying
we define $f : A \to B$ by giving an equation
\begin{equation}
\label{eq:expldef}
f(x) \defeq \Phi
\end{equation}
where $x$ is a variable
\index{variable}%
and $\Phi$ is an expression which may use $x$.
In order for this to be valid, we have to check that $\Phi : B$ assuming $x:A$.
Now we can compute $f(a)$ by replacing the variable $x$ in $\Phi$ with
$a$. As an example, consider the function $f : \nat \to \nat$ which is
defined by $f(x) \defeq x+x$. (We will define $\nat$ and $+$ in \cref{sec:inductive-types}.)
Then $f(2)$ is judgmentally equal to $2+2$.
If we don't want to introduce a name for the function, we can use
\define{$\lambda$-abstraction}.
\index{lambda abstraction@$\lambda$-abstraction|defstyle}%
\indexsee{function!lambda abstraction@$\lambda$-abstraction}{$\lambda$-abstraction}%
\indexsee{abstraction!lambda-@$\lambda$-}{$\lambda$-abstraction}%
Given an expression $\Phi$ of type $B$ which may use $x:A$, as above, we write $\lam{x:A} \Phi$ to indicate the same function defined by~\eqref{eq:expldef}.
Thus, we have
\[ (\lamt{x:A}\Phi) : A \to B. \]
For the example in the previous paragraph, we have the typing judgment
\[ (\lam{x:\nat}x+x) : \nat \to \nat. \]
As another example, for any types $A$ and $B$ and any element $y:B$, we have a \define{constant function}
\indexdef{constant!function}%
\indexdef{function!constant}%
$(\lam{x:A} y): A\to B$.
We generally omit the type of the variable $x$ in a $\lambda$-abstraction and write $\lam{x}\Phi$, since the typing $x:A$ is inferable from the judgment that the function $\lam x \Phi$ has type $A\to B$.
By convention, the ``scope''
\indexdef{variable!scope of}%
\indexdef{scope}%
of the variable binding ``$\lam{x}$'' is the entire rest of the expression, unless delimited with parentheses\index{parentheses}.
Thus, for instance, $\lam{x} x+x$ should be parsed as $\lam{x} (x+x)$, not as $(\lam{x}x)+x$ (which would, in this case, be ill-typed anyway).
Another equivalent notation is
\symlabel{mapsto}%
\[ (x \mapsto \Phi) : A \to B. \]
\symlabel{blank}%
We may also sometimes use a blank ``$\blank$'' in the expression $\Phi$ in place of a variable, to denote an implicit $\lambda$-abstraction.
For instance, $g(x,\blank)$ is another way to write $\lam{y} g(x,y)$.
Now a $\lambda$-abstraction is a function, so we can apply it to an argument $a:A$.
We then have the following \define{computation rule}\indexdef{computation rule!for function types}\footnote{Use of this equality is often referred to as \define{$\beta$-conversion}
\indexsee{beta-conversion@$\beta $-conversion}{$\beta$-reduction}%
\indexsee{conversion!beta@$\beta $-}{$\beta$-reduction}%
or \define{$\beta$-reduction}.%
\index{beta-reduction@$\beta $-reduction|footstyle}%
\indexsee{reduction!beta@$\beta $-}{$\beta$-reduction}%
}, which is a definitional equality:
\[(\lamu{x:A}\Phi)(a) \jdeq \Phi'\]
where $\Phi'$ is the
expression $\Phi$ in which all occurrences of $x$ have been replaced by $a$.
Continuing the above example, we have
%
\[ (\lamu{x:\nat}x+x)(2) \jdeq 2+2. \]
%
Note that from any function $f:A\to B$, we can construct a lambda abstraction function $\lam{x} f(x)$.
Since this is by definition ``the function that applies $f$ to its argument'' we consider it to be definitionally equal to $f$:\footnote{Use of this equality is often referred to as \define{$\eta$-conversion}
\indexsee{eta-conversion@$\eta $-conversion}{$\eta$-expansion}%
\indexsee{conversion!eta@$\eta $-}{$\eta$-expansion}%
or \define{$\eta$-expansion.
\index{eta-expansion@$\eta $-expansion|footstyle}%
\indexsee{expansion, eta-@expansion, $\eta $-}{$\eta$-expansion}%
}}
\[ f \jdeq (\lam{x} f(x)). \]
This equality is the \define{uniqueness principle for function types}\indexdef{uniqueness!principle!for function types}, because it shows that $f$ is uniquely determined by its values.
The introduction of functions by definitions with explicit parameters can be reduced
to simple definitions by using $\lambda$-abstraction: i.e., we can read
a definition of $f: A\to B$ by
\[ f(x) \defeq \Phi \]
as
\[ f \defeq \lamu{x:A}\Phi.\]
When doing calculations involving variables, we have to be
careful when replacing a variable with an expression that also involves
variables, because we want to preserve the binding structure of
expressions. By the \emph{binding structure}\indexdef{binding structure} we mean the
invisible link generated by binders such as $\lambda$, $\Pi$ and
$\Sigma$ (the latter we are going to meet soon) between the place where the variable is introduced and where it is used. As an example, consider $f : \nat \to (\nat \to \nat)$
defined as
\[ f(x) \defeq \lamu{y:\nat} x + y. \]
Now if we have assumed somewhere that $y : \nat$, then what is $f(y)$? It would be wrong to just naively replace $x$ by $y$ everywhere in the expression ``$\lam{y}x+y$'' defining $f(x)$, obtaining $\lamu{y:\nat} y + y$, because this means that $y$ gets \define{captured}.
\indexdef{capture, of a variable}%
\indexdef{variable!captured}%
Previously, the substituted\index{substitution} $y$ was referring to our assumption, but now it is referring to the argument of the $\lambda$-abstraction. Hence, this naive substitution would destroy the binding structure, allowing us to perform calculations which are semantically unsound.
But what \emph{is} $f(y)$ in this example? Note that bound (or ``dummy'')
variables
\indexdef{variable!bound}%
\indexdef{variable!dummy}%
\indexsee{bound variable}{variable, bound}%
\indexsee{dummy variable}{variable, bound}%
such as $y$ in the expression $\lamu{y:\nat} x + y$
have only a local meaning, and can be consistently replaced by any
other variable, preserving the binding structure. Indeed, $\lamu{y:\nat} x + y$ is declared to be judgmentally equal\footnote{Use of this equality is often referred to as \define{$\alpha$-conversion.
\indexfoot{alpha-conversion@$\alpha $-conversion}
\indexsee{conversion!alpha@$\alpha$-}{$\alpha$-conversion}
}} to
$\lamu{z:\nat} x + z$. It follows that
$f(y)$ is judgmentally equal to $\lamu{z:\nat} y + z$, and that answers our question. (Instead of $z$,
any variable distinct from $y$ could have been used, yielding an equal result.)
Of course, this should all be familiar to any mathematician: it is the same phenomenon as the fact that if $f(x) \defeq \int_1^2 \frac{dt}{x-t}$, then $f(t)$ is not $\int_1^2 \frac{dt}{t-t}$ but rather $\int_1^2 \frac{ds}{t-s}$.
A $\lambda$-abstraction binds a dummy variable in exactly the same way that an integral does.
We have seen how to define functions in one variable. One
way to define functions in several variables would be to use the
cartesian product, which will be introduced later; a function with
parameters $A$ and $B$ and results in $C$ would be given the type
$f : A \times B \to C$. However, there is another choice that avoids
using product types, which is called \define{currying}
\indexdef{currying}%
\indexdef{function!currying of}%
(after the mathematician Haskell Curry).
\index{programming}%
The idea of currying is to represent a function of two inputs $a:A$ and $b:B$ as a function which takes \emph{one} input $a:A$ and returns \emph{another function}, which then takes a second input $b:B$ and returns the result.
That is, we consider two-variable functions to belong to an iterated function type, $f : A \to (B \to C)$.
We may also write this without the parentheses\index{parentheses}, as $f : A \to B \to C$, with
associativity\index{associativity!of function types} to the right as the default convention. Then given $a : A$ and $b : B$,
we can apply $f$ to $a$ and then apply the result to $b$, obtaining
$f(a)(b) : C$. To avoid the proliferation of parentheses, we allow ourselves to
write $f(a)(b)$ as $f(a,b)$ even though there are no products
involved.
When omitting parentheses around function arguments entirely, we write $f\,a\,b$ for $(f\,a)\,b$, with the default associativity now being to the left so that $f$ is applied to its arguments in the correct order.
Our notation for definitions with explicit parameters extends to
this situation: we can define a named function $f : A \to B \to C$ by
giving an equation
\[ f(x,y) \defeq \Phi\]
where $\Phi:C$ assuming $x:A$ and $y:B$. Using $\lambda$-abstraction\index{lambda abstraction@$\lambda$-abstraction} this
corresponds to
\[ f \defeq \lamu{x:A}{y:B} \Phi, \]
which may also be written as
\[ f \defeq x \mapsto y \mapsto \Phi. \]
We can also implicitly abstract over multiple variables by writing multiple blanks, e.g.\ $g(\blank,\blank)$ means $\lam{x}{y} g(x,y)$.
Currying a function of three or more arguments is a straightforward extension of what we have just described.
\index{type!function|)}%
\index{function|)}%
\section{Universes and families}
\label{sec:universes}
So far, we have been using the expression ``$A$ is a type'' informally. We
are going to make this more precise by introducing \define{universes}.
\index{type!universe|(defstyle}%
\indexsee{universe}{type, universe}%
A universe is a type whose elements are types. As in naive set theory,
we might wish for a universe of all types $\UU_\infty$ including itself
(that is, with $\UU_\infty : \UU_\infty$).
However, as in set
theory, this is unsound, i.e.\ we can deduce from it that every type,
including the empty type representing the proposition False (see \cref{sec:coproduct-types}), is inhabited.
For instance, using a
representation of sets as trees, we can directly encode Russell's
paradox\index{paradox} \cite{coquand:paradox}.
% or alternatively, in order to avoid the use of
% inductive types to define trees, we can follow Girard \cite{girard:paradox} and encode the Burali-Forti paradox,
% which shows that the collection of all ordinals cannot be an ordinal.
To avoid the paradox we introduce a hierarchy of universes
\indexsee{hierarchy!of universes}{type, universe}%
\[ \UU_0 : \UU_1 : \UU_2 : \cdots \]
where every universe $\UU_i$ is an element of the next universe
$\UU_{i+1}$. Moreover, we assume that our universes are
\define{cumulative},
\indexdef{type!universe!cumulative}%
\indexdef{cumulative!universes}%
that is that all the elements of the $i^{\mathrm{th}}$
universe are also elements of the $(i+1)^{\mathrm{st}}$ universe, i.e.\ if
$A:\UU_i$ then also $A:\UU_{i+1}$.
This is convenient, but has the slightly unpleasant consequence that elements no longer have unique types, and is a bit tricky in other ways that need not concern us here; see the Notes.
When we say that $A$ is a type, we mean that it inhabits some universe
$\UU_i$. We usually want to avoid mentioning the level
\indexdef{universe level}%
\indexsee{level}{universe level or $n$-type}%
\indexsee{type!universe!level}{universe level}%
$i$ explicitly,
and just assume that levels can be assigned in a consistent way; thus we
may write $A:\UU$ omitting the level. This way we can even write
$\UU:\UU$, which can be read as $\UU_i:\UU_{i+1}$, having left the
indices implicit. Writing universes in this style is referred to as
\define{typical ambiguity}.
\indexdef{typical ambiguity}%
It is convenient but a bit dangerous, since it allows us to write valid-looking proofs that reproduce the paradoxes of self-reference.
If there is any doubt about whether an argument is correct, the way to check it is to try to assign levels consistently to all universes appearing in it.
When some universe \UU is assumed, we may refer to types belonging to \UU as \define{small types}.
\indexdef{small!type}%
\indexdef{type!small}%
To model a collection of types varying over a given type $A$, we use functions $B : A \to \UU$ whose
codomain is a universe. These functions are called
\define{families of types} (or sometimes \emph{dependent types});
\indexsee{family!of types}{type, family of}%
\indexdef{type!family of}%
\indexsee{type!dependent}{type, family of}%
\indexsee{dependent!type}{type, family of}%
they correspond to families of sets as used in
set theory.
\symlabel{fin}%
An example of a type family is the family of finite sets $\Fin
: \nat \to \UU$, where $\Fin(n)$ is a type with exactly $n$ elements.
(We cannot \emph{define} the family $\Fin$ yet --- indeed, we have not even introduced its domain $\nat$ yet --- but we will be able to soon; see \cref{ex:fin}.)
We may denote the elements of $\Fin(n)$ by $0_n,1_n,\dots,(n-1)_n$, with subscripts to emphasize that the elements of $\Fin(n)$ are different from those of $\Fin(m)$ if $n$ is different from $m$, and all are different from the ordinary natural numbers (which we will introduce in \cref{sec:inductive-types}).
\index{finite!sets, family of}%
A more trivial (but very important) example of a type family is the \define{constant} type family
\indexdef{constant!type family}%
\indexdef{type!family of!constant}%
at a type $B:\UU$, which is of course the constant function $(\lam{x:A} B):A\to\UU$.
As a \emph{non}-example, in our version of type theory there is no type family ``$\lam{i:\nat} \UU_i$''.
Indeed, there is no universe large enough to be its codomain.
Moreover, we do not even identify the indices $i$ of the universes $\UU_i$ with the natural numbers \nat of type theory (the latter to be introduced in \cref{sec:inductive-types}).
\index{type!universe|)}%
\section{Dependent function types (\texorpdfstring{$\Pi$}{Π}-types)}
\label{sec:pi-types}
\index{type!dependent function|(defstyle}%
\index{function!dependent|(defstyle}%
\indexsee{dependent!function}{function, dependent}%
\indexsee{type!Pi-@$\Pi$-}{type, dependent function}%
\indexsee{Pi-type@$\Pi$-type}{type, dependent function}%
In type theory we often use a more general version of function
types, called a \define{$\Pi$-type} or \define{dependent function type}. The elements of
a $\Pi$-type are functions
whose codomain type can vary depending on the
element of the domain to which the function is applied, called \define{dependent functions}. The name ``$\Pi$-type''
is used because this type can also be regarded as the cartesian
product over a given type.
Given a type $A:\UU$ and a family $B:A \to \UU$, we may construct
the type of dependent functions $\prd{x:A}B(x) : \UU$.
There are many alternative notations for this type, such as
\[ \tprd{x:A} B(x) \qquad \dprd{x:A}B(x) \qquad \lprd{x:A} B(x). \]
If $B$ is a constant family, then the dependent product type is the ordinary function type:
\[\tprd{x:A} B \jdeq (A \to B).\]
Indeed, all the constructions of $\Pi$-types are generalizations of the corresponding constructions on ordinary function types.
\indexdef{definition!of function, direct}%
We can introduce dependent functions by explicit definitions: to
define $f : \prd{x:A}B(x)$, where $f$ is the name of a dependent function to be
defined, we need an expression $\Phi : B(x)$ possibly involving the variable $x:A$,
\index{variable}%
and we write
\[ f(x) \defeq \Phi \qquad \mbox{for $x:A$}.\]
Alternatively, we can use \define{$\lambda$-abstraction}%
\index{lambda abstraction@$\lambda$-abstraction|defstyle}%
\begin{equation}
\label{eq:lambda-abstraction}
\lamu{x:A} \Phi \ :\ \prd{x:A} B(x).
\end{equation}
\indexdef{application!of dependent function}%
\indexdef{function!dependent!application}%
As with non-dependent functions, we can \define{apply} a dependent function $f : \prd{x:A}B(x)$ to an argument $a:A$ to obtain an element $f(a):B(a)$.
The equalities are the same as for the ordinary function type, i.e.\ we have the computation rule
\index{computation rule!for dependent function types}%
given $a:A$ we have $f(a) \jdeq \Phi'$ and
$(\lamu{x:A} \Phi)(a) \jdeq \Phi'$, where $\Phi' $ is obtained by replacing all
occurrences of $x$ in $\Phi$ by $a$ (avoiding variable capture, as always).
Similarly, we have the uniqueness principle $f\jdeq (\lam{x} f(x))$ for any $f:\prd{x:A} B(x)$.
\index{uniqueness!principle!for dependent function types}%
As an example, recall from \cref{sec:universes} that there is a type family $\Fin:\nat\to\UU$ whose values are the standard finite sets, with elements $0_n,1_n,\dots,(n-1)_n : \Fin(n)$.
There is then a dependent function $\fmax : \prd{n:\nat} \Fin(n+1)$
which returns the ``largest'' element of each nonempty finite type, $\fmax(n) \defeq n_{n+1}$.
\index{finite!sets, family of}%
As was the case for $\Fin$ itself, we cannot define $\fmax$ yet, but we will be able to soon; see \cref{ex:fin}.
Another important class of dependent function types, which we can define now, are functions which are \define{polymorphic}
\indexdef{function!polymorphic}%
\indexdef{polymorphic function}%
over a given universe.
A polymorphic function is one which takes a type as one of its arguments, and then acts on elements of that type (or of other types constructed from it).
\symlabel{idfunc}%
\indexdef{function!identity}%
\indexdef{identity!function}%
An example is the polymorphic identity function $\idfunc : \prd{A:\UU} A \to A$, which we define by $\idfunc{} \defeq \lam{A:\type}{x:A} x$.
(Like $\lambda$-abstractions, $\Pi$s automatically scope\index{scope} over the rest of the expression unless delimited; thus $\idfunc : \prd{A:\UU} A \to A$ means $\idfunc : \prd{A:\UU} (A \to A)$.
This convention, though unusual in mathematics, is common in type theory.)
We sometimes write some arguments of a dependent function as subscripts.
For instance, we might equivalently define the polymorphic identity function by $\idfunc[A](x) \defeq x$.
Moreover, if an argument can be inferred from context, we may omit it altogether.
For instance, if $a:A$, then writing $\idfunc(a)$ is unambiguous, since $\idfunc$ must mean $\idfunc[A]$ in order for it to be applicable to $a$.
Another, less trivial, example of a polymorphic function is the ``swap'' operation that switches the order of the arguments of a (curried) two-argument function:
\[ \mathsf{swap} : \prd{A:\UU}{B:\UU}{C:\UU} (A\to B\to C) \to (B\to A \to C). \]
We can define this by
\[ \mathsf{swap}(A,B,C,g) \defeq \lam{b}{a} g(a)(b). \]
We might also equivalently write the type arguments as subscripts:
\[ \mathsf{swap}_{A,B,C}(g)(b,a) \defeq g(a,b). \]
Note that as we did for ordinary functions, we use currying to define dependent functions with
several arguments (such as $\mathsf{swap}$). However, in the dependent case the second domain may
depend on the first one, and the codomain may depend on both. That is,
given $A:\UU$ and type families $B : A \to \UU$ and $C : \prd{x:A}B(x) \to \UU$, we may construct
the type $\prd{x:A}{y : B(x)} C(x,y)$ of functions with two
arguments.
In the case when $B$ is constant and equal to $A$, we may condense the notation and write $\prd{x,y:A}$; for instance, the type of $\mathsf{swap}$ could also be written as
\[ \mathsf{swap} : \prd{A,B,C:\UU} (A\to B\to C) \to (B\to A \to C). \]
Finally, given $f:\prd{x:A}{y : B(x)} C(x,y)$ and arguments $a:A$ and $b:B(a)$, we have $f(a)(b) : C(a,b)$, which,
as before, we write as $f(a,b) : C(a,b)$.
\index{type!dependent function|)}%
\index{function!dependent|)}%
\section{Product types}
\label{sec:finite-product-types}
Given types $A,B:\UU$ we introduce the type $A\times B:\UU$, which we call their \define{cartesian product}.
\indexsee{cartesian product}{type, product}%
\indexsee{type!cartesian product}{type, product}%
\index{type!product|(defstyle}%
\indexsee{product!of types}{type, product}%
We also introduce a nullary product type, called the \define{unit type} $\unit : \UU$.
\indexsee{nullary!product}{type, unit}%
\indexsee{unit!type}{type, unit}%
\index{type!unit|(defstyle}%
We intend the elements of $A\times B$ to be pairs $\tup{a}{b} : A \times B$, where $a:A$ and $b:B$, and the only element of $\unit$ to be some particular object $\ttt : \unit$.
\indexdef{pair!ordered}%
However, unlike in set theory, where we define ordered pairs to be particular sets and then collect them all together into the cartesian product, in type theory, ordered pairs are a primitive concept, as are functions.
\begin{rmk}\label{rmk:introducing-new-concepts}
There is a general pattern for introduction of a new kind of type in type theory.
We have already seen this pattern in \cref{sec:function-types,sec:pi-types}\footnote{The description of universes above is an exception.}, so it is worth emphasizing the general form.
To specify a type, we specify:
\begin{enumerate}
\item how to form new types of this kind, via \define{formation rules}.
\indexdef{formation rule}%
\index{rule!formation}%
(For example, we can form the function type $A \to B$ when $A$ is a type and when $B$ is a type. We can form the dependent function type $\prd{x:A} B(x)$ when $A$ is a type and $B(x)$ is a type for $x:A$.)
\item how to construct elements of that type.
These are called the type's \define{constructors} or \define{introduction rules}.
\indexdef{constructor}%
\indexdef{rule!introduction}%
\indexdef{introduction rule}%
(For example, a function type has one constructor, $\lambda$-abstraction.
Recall that a direct definition like $f(x)\defeq 2x$ can equivalently be phrased
as a $\lambda$-abstraction $f\defeq \lam{x} 2x$.)
\item how to use elements of that type.
These are called the type's \define{eliminators} or \define{elimination rules}.
\indexsee{rule!elimination}{eliminator}%
\indexsee{elimination rule}{eliminator}%
\indexdef{eliminator}%
(For example, the function type has one eliminator, namely function application.)
\item
a \define{computation rule}\indexdef{computation rule}\footnote{also referred to as \define{$\beta$-reduction}\index{beta-reduction@$\beta $-reduction|footstyle}}, which expresses how an eliminator acts on a constructor.
(For example, for functions, the computation rule states that $(\lamu{x:A}\Phi)(a)$ is judgmentally equal to the substitution of $a$ for $x$ in $\Phi$.)
\item
an optional \define{uniqueness principle}\indexdef{uniqueness!principle}\footnote{also referred to as \define{$\eta$-expansion}\index{eta-expansion@$\eta $-expansion|footstyle}}, which expresses
uniqueness of maps into or out of that type.
For some types, the uniqueness principle characterizes maps into the type, by stating that
every element of the type is uniquely determined by the results of applying eliminators to it, and can be reconstructed from those results by applying a constructor---thus expressing how constructors act on eliminators, dually to the computation rule.
(For example, for functions, the uniqueness principle says that any function $f$ is judgmentally equal to the ``expanded'' function $\lamu{x} f(x)$, and thus is uniquely determined by its values.)
For other types, the uniqueness principle says that every map (function) \emph{from} that type is uniquely determined by some data. (An example is the coproduct type introduced in \cref{sec:coproduct-types}, whose uniqueness principle is mentioned in \cref{sec:universal-properties}.)
When the uniqueness principle is not taken as a rule of judgmental equality, it is often nevertheless provable as a \emph{propositional} equality from the other rules for the type.
In this case we call it a \define{propositional uniqueness principle}.
\indexdef{uniqueness!principle, propositional}%
\indexsee{propositional!uniqueness principle}{uniqueness principle, propositional}%
(In later chapters we will also occasionally encounter \emph{propositional computation rules}.)
\indexdef{computation rule!propositional}%
\end{enumerate}
The inference rules in \cref{sec:syntax-more-formally} are organized and named accordingly; see, for example, \cref{sec:more-formal-pi}, where each possibility is realized.
\end{rmk}
The way to construct pairs is obvious: given $a:A$ and $b:B$, we may form $(a,b):A\times B$.
Similarly, there is a unique way to construct elements of $\unit$, namely we have $\ttt:\unit$.
We expect that ``every element of $A\times B$ is a pair'', which is the uniqueness principle for products; we do not assert this as a rule of type theory, but we will prove it later on as a propositional equality.
Now, how can we \emph{use} pairs, i.e.\ how can we define functions out of a product type?
Let us first consider the definition of a non-dependent function $f : A\times B \to C$.
Since we intend the only elements of $A\times B$ to be pairs, we expect to be able to define such a function by prescribing the result
when $f$ is applied to a pair $\tup{a}{b}$.
We can prescribe these results by providing a function $g : A \to B \to C$.
Thus, we introduce a new rule (the elimination rule for products), which says that for any such $g$, we can define a function $f : A\times B \to C$ by
\[ f(\tup{a}{b}) \defeq g(a)(b). \]
We avoid writing $g(a,b)$ here, in order to emphasize that $g$ is not a function on a product.
(However, later on in the book we will often write $g(a,b)$ both for functions on a product and for curried functions of two variables.)
This defining equation is the computation rule for product types\index{computation rule!for product types}.
Note that in set theory, we would justify the above definition of $f$ by the fact that every element of $A\times B$ is an ordered pair, so that it suffices to define $f$ on such pairs.
By contrast, type theory reverses the situation: we assume that a function on $A\times B$ is well-defined as soon as we specify its values on pairs, and from this (or more precisely, from its more general version for dependent functions, below) we will be able to \emph{prove} that every element of $A\times B$ is a pair.
From a category-theoretic perspective, we can say that we define the product $A\times B$ to be left adjoint to the ``exponential'' $B\to C$, which we have already introduced.
As an example, we can derive the \define{projection}
\indexsee{function!projection}{projection}%
\indexsee{component, of a pair}{projection}%
\indexdef{projection!from cartesian product type}%
functions
\symlabel{defn:proj}%
\begin{align*}
\fst & : A \times B \to A \\
\snd & : A \times B \to B
\end{align*}
with the defining equations
\begin{align*}
\fst(\tup{a}{b}) & \defeq a \\
\snd(\tup{a}{b}) & \defeq b.
\end{align*}
%
\symlabel{defn:recursor-times}%
Rather than invoking this principle of function definition every time we want to define a function, an alternative approach is to invoke it once, in a universal case, and then simply apply the resulting function in all other cases.
That is, we may define a function of type
\begin{equation}
\rec{A\times B} : \prd{C:\UU}(A \to B \to C) \to A \times B \to C
\end{equation}
with the defining equation
\[\rec{A\times B}(C,g,\tup{a}{b}) \defeq g(a)(b). \]
Then instead of defining functions such as $\fst$ and $\snd$ directly by a defining equation, we could define
\begin{align*}
\fst &\defeq \rec{A\times B}(A, \lam{a}{b} a)\\
\snd &\defeq \rec{A\times B}(B, \lam{a}{b} b).
\end{align*}
We refer to the function $\rec{A\times B}$ as the \define{recursor}
\indexsee{recursor}{recursion principle}%
for product types. The name ``recursor'' is a bit unfortunate here, since no recursion is taking place. It comes from the fact that product types are a degenerate example of a general framework for inductive types, and for types such as the natural numbers, the recursor will actually be recursive. We may also speak of the \define{recursion principle} for cartesian products, meaning the fact that we can define a function $f:A\times B\to C$ as above by giving its value on pairs.
\index{recursion principle!for cartesian product}%
We leave it as a simple exercise to show that the recursor can be
derived from the projections and vice versa.
% Ex: Derive from projections
\symlabel{defn:recursor-unit}%
We also have a recursor for the unit type:
\[\rec{\unit} : \prd{C:\UU}C \to \unit \to C\]
with the defining equation
\[ \rec{\unit}(C,c,\ttt) \defeq c. \]
Although we include it to maintain the pattern of type definitions, the recursor for $\unit$ is completely useless,
because we could have defined such a function directly
by simply ignoring the argument of type $\unit$.
To be able to define \emph{dependent} functions over the product type, we have
to generalize the recursor. Given $C: A \times B \to \UU$, we may
define a function $f : \prd{x : A \times B} C(x)$ by providing a
function
\narrowequation{
g : \prd{x:A}\prd{y:B} C(\tup{x}{y})
}
with defining equation
\[ f(\tup x y) \defeq g(x)(y). \]
For example, in this way we can prove the propositional uniqueness principle, which says that every element of $A\times B$ is equal to a pair.
\index{uniqueness!principle, propositional!for product types}%
Specifically, we can construct a function
\[ \uniq{A\times B} : \prd{x:A \times B} (\id[A\times B]{\tup{\fst {(x)}}{\snd {(x)}}}{x}). \]
Here we are using the identity type, which we are going to introduce below in \cref{sec:identity-types}.
However, all we need to know now is that there is a reflexivity element $\refl{x} : \id[A]{x}{x}$ for any $x:A$.
Given this, we can define
\label{uniquenessproduct}
\[ \uniq{A\times B}(\tup{a}{b}) \defeq \refl{\tup{a}{b}}. \]
This construction works, because in the case that $x \defeq \tup{a}{b}$ we can
calculate
\[ \tup{\fst(\tup{a}{b})}{\snd{(\tup{a}{b})}} \jdeq \tup{a}{b} \]
using the defining equations for the projections. Therefore,
\[ \refl{\tup{a}{b}} : \id{\tup{\fst(\tup{a}{b})}{\snd{(\tup{a}{b})}}}{\tup{a}{b}} \]
is well-typed, since both sides of the equality are judgmentally equal.
More generally, the ability to define dependent functions in this way means that to prove a property for all elements of a product, it is enough
to prove it for its canonical elements, the ordered pairs.
When we come to inductive types such as the natural numbers, the analogous property will be the ability to write proofs by induction.
Thus, if we do as we did above and apply this principle once in the universal case, we call the resulting function \define{induction} for product types: given $A,B : \UU$ we have
\symlabel{defn:induction-times}%
\[ \ind{A\times B} : \prd{C:A \times B \to \UU}
\Parens{\prd{x:A}{y:B} C(\tup{x}{y})} \to \prd{x:A \times B} C(x) \]
with the defining equation
\[ \ind{A\times B}(C,g,\tup{a}{b}) \defeq g(a)(b). \]
Similarly, we may speak of a dependent function defined on pairs being obtained from the \define{induction principle}
\index{induction principle}%
\index{induction principle!for product}%
of the cartesian product.
It is easy to see that the recursor is just the special case of induction
in the case that the family $C$ is constant.
Because induction describes how to use an element of the product type, induction is also called the \define{(dependent) eliminator},
\indexsee{eliminator!of inductive type!dependent}{induction principle}%
and recursion the \define{non-dependent eliminator}.
\indexsee{eliminator!of inductive type!non-dependent}{recursion principle}%
\indexsee{non-dependent eliminator}{recursion principle}%
\indexsee{dependent eliminator}{induction principle}%
% We can read induction propositionally as saying that a property which
% is true for all pairs holds for all elements of the product type.
Induction for the unit type turns out to be more useful than the
recursor:
\symlabel{defn:induction-unit}%
\[ \ind{\unit} : \prd{C:\unit \to \UU} C(\ttt) \to \prd{x:\unit}C(x)\]
with the defining equation
\[ \ind{\unit}(C,c,\ttt) \defeq c. \]
Induction enables us to prove the propositional uniqueness principle for $\unit$, which asserts that its only inhabitant is $\ttt$.
That is, we can construct
\label{uniquenessunit}
\[\uniq{\unit} : \prd{x:\unit} \id{x}{\ttt} \]
by using the defining equations
\[\uniq{\unit}(\ttt) \defeq \refl{\ttt} \]
or equivalently by using induction:
\[\uniq{\unit} \defeq \ind{\unit}(\lamu{x:\unit} \id{x}{\ttt},\refl{\ttt}). \]
\index{type!product|)}%
\index{type!unit|)}%
\section{Dependent pair types (\texorpdfstring{$\Sigma$}{Σ}-types)}
\label{sec:sigma-types}
\index{type!dependent pair|(defstyle}%
\indexsee{type!dependent sum}{type, dependent pair}%
\indexsee{type!Sigma-@$\Sigma$-}{type, dependent pair}%
\indexsee{Sigma-type@$\Sigma$-type}{type, dependent pair}%
\indexsee{sum!dependent}{type, dependent pair}%
Just as we generalized function types (\cref{sec:function-types}) to dependent function types (\cref{sec:pi-types}), it is often useful to generalize the product types from \cref{sec:finite-product-types} to allow the type of
the second component of a pair to vary depending on the choice of the first
component. This is called a \define{dependent pair type}, or \define{$\Sigma$-type}, because in set theory it
corresponds to an indexed sum (in the sense of coproduct or
disjoint union) over a given type.
Given a type $A:\UU$ and a family $B : A \to \UU$, the dependent
pair type is written as $\sm{x:A} B(x) : \UU$.
Alternative notations are
\[ \tsm{x:A} B(x) \hspace{2cm} \dsm{x:A}B(x) \hspace{2cm} \lsm{x:A} B(x). \]
Like other binding constructs such as $\lambda$-abstractions and $\Pi$s, $\Sigma$s automatically scope\index{scope} over the rest of the expression unless delimited, so e.g.\ $\sm{x:A} B(x) \to C$ means $\sm{x:A} (B(x) \to C)$.
\symlabel{defn:dependent-pair}%
\indexdef{pair!dependent}%
The way to construct elements of a dependent pair type is by pairing: we have
$\tup{a}{b} : \sm{x:A} B(x)$ given $a:A$ and $b:B(a)$.
If $B$ is constant, then the dependent pair type is the
ordinary cartesian product type:
\[ \Parens{\sm{x:A} B} \jdeq (A \times B).\]
All the constructions on $\Sigma$-types arise as straightforward generalizations of the ones for product types, with dependent functions often replacing non-dependent ones.
For instance, the recursion principle%
\index{recursion principle!for dependent pair type}
says that to define a non-dependent function out of a $\Sigma$-type
$f : (\sm{x:A} B(x)) \to C$, we provide a function
$g : \prd{x:A} B(x) \to C$, and then we can define $f$ via the defining
equation
\[ f(\tup{a}{b}) \defeq g(a)(b). \]
\indexdef{projection!from dependent pair type}%
For instance, we can derive the first projection from a $\Sigma$-type:
\symlabel{defn:dependent-proj1}%
\begin{equation*}
\fst : \Parens{\sm{x : A}B(x)} \to A
\end{equation*}
by the defining equation
\begin{equation*}
\fst(\tup{a}{b}) \defeq a.
\end{equation*}
However, since the type of the second component of a pair
\narrowequation{
(a,b):\sm{x:A} B(x)
}
is $B(a)$, the second projection must be a \emph{dependent} function, whose type involves the first projection function:
\symlabel{defn:dependent-proj2}%
\[ \snd : \prd{p:\sm{x : A}B(x)}B(\fst(p)). \]
Thus we need the \emph{induction} principle%
\index{induction principle!for dependent pair type}
for $\Sigma$-types (the ``dependent eliminator'').
This says that to construct a dependent function out of a $\Sigma$-type into a family $C : (\sm{x:A} B(x)) \to \UU$, we need a function
\[ g : \prd{a:A}{b:B(a)} C(\tup{a}{b}). \]
We can then derive a function
\[ f : \prd{p : \sm{x:A}B(x)} C(p) \]
with defining equation\index{computation rule!for dependent pair type}
\[ f(\tup{a}{b}) \defeq g(a)(b).\]
Applying this with $C(p)\defeq B(\fst(p))$, we can define
\narrowequation{
\snd : \prd{p:\sm{x : A}B(x)}B(\fst(p))
}
with the obvious equation
\[ \snd(\tup{a}{b}) \defeq b. \]
To convince ourselves that this is correct, we note that $B (\fst(\tup{a}{b})) \jdeq B(a)$, using the defining equation for $\fst$, and
indeed $b : B(a)$.
We can package the recursion and induction principles into the recursor for $\Sigma$:
\symlabel{defn:recursor-sm}%
\[ \rec{\sm{x:A}B(x)} : \dprd{C:\UU}\Parens{\tprd{x:A} B(x) \to C} \to
\Parens{\tsm{x:A}B(x)} \to C \]
with the defining equation
\[ \rec{\sm{x:A}B(x)}(C,g,\tup{a}{b}) \defeq g(a)(b) \]
and the corresponding induction operator:
\symlabel{defn:induction-sm}%
\begin{narrowmultline*}
\ind{\sm{x:A}B(x)} : \narrowbreak
\dprd{C:(\sm{x:A} B(x)) \to \UU}
\Parens{\tprd{a:A}{b:B(a)} C(\tup{a}{b})}
\to \dprd{p : \sm{x:A}B(x)} C(p)
\end{narrowmultline*}
with the defining equation
\[ \ind{\sm{x:A}B(x)}(C,g,\tup{a}{b}) \defeq g(a)(b). \]
As before, the recursor is the special case of induction
when the family $C$ is constant.
As a further example, consider the following principle, where $A$ and $B$ are types and $R:A\to B\to \UU$:
\[ \ac : \Parens{\tprd{x:A} \tsm{y :B} R(x,y)} \to
\Parens{\tsm{f:A\to B} \tprd{x:A} R(x,f(x))}.
\]
We may regard $R$ as a ``proof-relevant relation''
\index{mathematics!proof-relevant}%
between $A$ and $B$, with $R(a,b)$ the type of witnesses for relatedness of $a:A$ and $b:B$.
Then $\ac$ says intuitively that if we have a dependent function $g$ assigning to every $a:A$ a dependent pair $(b,r)$ where $b:B$ and $r:R(a,b)$, then we have a function $f:A\to B$ and a dependent function assigning to every $a:A$ a witness that $R(a,f(a))$.
Our intuition tells us that we can just split up the values of $g$ into their components.
Indeed, using the projections we have just defined, we can define:
\[ \ac(g) \defeq \Parens{\lamu{x:A} \fst(g(x)),\, \lamu{x:A} \snd(g(x))}. \]
To verify that this is well-typed, note that if $g:\prd{x:A} \sm{y :B} R(x,y)$, we have
\begin{align*}
\lamu{x:A} \fst(g(x)) &: A \to B, \\
\lamu{x:A} \snd(g(x)) &: \tprd{x:A} R(x,\fst(g(x))).
\end{align*}
Moreover, the type $\prd{x:A} R(x,\fst(g(x)))$ is the result of applying the type family $\lamu{f:A\to B} \tprd{x:A} R(x,f(x))$ being summed over in the codomain of \ac to the function $\lamu{x:A} \fst(g(x))$:
\[ \tprd{x:A} R(x,\fst(g(x))) \jdeq
\Parens{\lamu{f:A\to B} \tprd{x:A} R(x,f(x))}\big(\lamu{x:A} \fst(g(x))\big). \]
Thus, we have
\[ \Parens{\lamu{x:A} \fst(g(x)),\, \lamu{x:A} \snd(g(x))} : \tsm{f:A\to B} \tprd{x:A} R(x,f(x))\]
as required.
If we read $\Pi$ as ``for all'' and $\Sigma$ as ``there exists'', then the type of the function $\ac$ expresses:
\emph{if for all $x:A$ there is a $y:B$ such that $R(x,y)$, then there is a function $f : A \to B$ such that for all $x:A$ we have $R(x,f(x))$}.
Since this sounds like a version of the axiom of choice, the function \ac has traditionally been called the \define{type-theoretic axiom of choice}, and as we have just shown, it can be proved directly from the rules of type theory, rather than having to be taken as an axiom.
\index{axiom!of choice!type-theoretic}%
However, note that no choice is actually involved, since the choices have already been given to us in the premise: all we have to do is take it apart into two functions: one representing the choice and the other its correctness.
In \cref{sec:axiom-choice} we will give another formulation of an ``axiom of choice'' which is closer to the usual one.
Dependent pair types are often used to define types of mathematical structures, which commonly consist of several dependent pieces of data.
To take a simple example, suppose we want to define a \define{magma}\indexdef{magma} to be a type $A$ together with a binary operation $m:A\to A\to A$.
The precise meaning of the phrase ``together with''\index{together with} (and the synonymous ``equipped with''\index{equipped with}) is that ``a magma'' is a \emph{pair} $(A,m)$ consisting of a type $A:\UU$ and an operation $m:A\to A\to A$.
Since the type $A\to A\to A$ of the second component $m$ of this pair depends on its first component $A$, such pairs belong to a dependent pair type.
Thus, the definition ``a magma is a type $A$ together with a binary operation $m:A\to A\to A$'' should be read as defining \emph{the type of magmas} to be
\[ \mathsf{Magma} \defeq \sm{A:\UU} (A\to A\to A). \]
Given a magma, we extract its underlying type (its ``carrier''\index{carrier}) with the first projection $\proj1$, and its operation with the second projection $\proj2$.
Of course, structures built from more than two pieces of data require iterated pair types, which may be only partially dependent; for instance the type of pointed magmas (magmas $(A,m)$ equipped with a basepoint $e:A$) is
\[ \mathsf{PointedMagma} \defeq \sm{A:\UU} (A\to A\to A) \times A. \]
We generally also want to impose axioms on such a structure, e.g.\ to make a pointed magma into a monoid or a group.
This can also be done using $\Sigma$-types; see \cref{sec:pat}.
In the rest of the book, we will sometimes make definitions of this sort explicit, but eventually we trust the reader to translate them from English into $\Sigma$-types.
We also generally follow the common mathematical practice of using the same letter for a structure of this sort and for its carrier (which amounts to leaving the appropriate projection function implicit in the notation): that is, we will speak of a magma $A$ with its operation $m:A\to A\to A$.
Note that the canonical elements of $\mathsf{PointedMagma}$ are of the form $(A,(m,e))$ where $A:\UU$, $m:A\to A\to A$, and $e:A$.
Because of the frequency with which iterated $\Sigma$-types of this sort arise, we use the usual notation of ordered triples, quadruples and so on to stand for nested pairs (possibly dependent) associating to the right.
That is, we have $(x,y,z) \defeq (x,(y,z))$ and $(x,y,z,w)\defeq (x,(y,(z,w)))$, etc.
\index{type!dependent pair|)}%
\section{Coproduct types}
\label{sec:coproduct-types}
Given $A,B:\UU$, we introduce their \define{coproduct} type $A+B:\UU$.
\indexsee{coproduct}{type, coproduct}%
\index{type!coproduct|(defstyle}%
\indexsee{disjoint!sum}{type, coproduct}%
\indexsee{disjoint!union}{type, coproduct}%
\indexsee{sum!disjoint}{type, coproduct}%
\indexsee{union!disjoint}{type, coproduct}%
This corresponds to the \emph{disjoint union} in set theory, and we may also use that name for it.
In type theory, as was the case with functions and products, the coproduct must be a fundamental construction, since there is no previously given notion of ``union of types''.
We also introduce a
nullary version: the \define{empty type $\emptyt:\UU$}.
\indexsee{nullary!coproduct}{type, empty}%
\indexsee{empty type}{type, empty}%
\index{type!empty|(defstyle}%
There are two ways to construct elements of $A+B$, either as $\inl(a) : A+B$ for $a:A$, or as
$\inr(b):A+B$ for $b:B$.
(The names $\inl$ and $\inr$ are short for ``left injection'' and ``right injection''.)
There are no ways to construct elements of the empty type.
\index{recursion principle!for coproduct}
To construct a non-dependent function $f : A+B \to C$, we need
functions $g_0 : A \to C$ and $g_1 : B \to C$. Then $f$ is defined
via the defining equations
\begin{align*}
f(\inl(a)) &\defeq g_0(a), \\
f(\inr(b)) &\defeq g_1(b).
\end{align*}
That is, the function $f$ is defined by \define{case analysis}.
\indexdef{case analysis}%
As before, we can derive the recursor:
\symlabel{defn:recursor-plus}%
\[ \rec{A+B} : \dprd{C:\UU}(A \to C) \to (B\to C) \to A+B \to C\]
with the defining equations
\begin{align*}
\rec{A+B}(C,g_0,g_1,\inl(a)) &\defeq g_0(a), \\
\rec{A+B}(C,g_0,g_1,\inr(b)) &\defeq g_1(b).
\end{align*}
\index{recursion principle!for empty type}
We can always construct a function $f : \emptyt \to C$ without
having to give any defining equations, because there are no elements of \emptyt on which to define $f$.
Thus, the recursor for $\emptyt$ is
\symlabel{defn:recursor-emptyt}%
\[\rec{\emptyt} : \tprd{C:\UU} \emptyt \to C,\]
which constructs the canonical function from the empty type to any other type.
Logically, it corresponds to the principle \textit{ex falso quodlibet}.
\index{ex falso quodlibet@\textit{ex falso quodlibet}}
\index{induction principle!for coproduct}
To construct a dependent function $f:\prd{x:A+B}C(x)$ out of a coproduct, we assume as given the family
$C: (A + B) \to \UU$, and
require
\begin{align*}
g_0 &: \prd{a:A} C(\inl(a)), \\
g_1 &: \prd{b:B} C(\inr(b)).
\end{align*}
This yields $f$ with the defining equations:\index{computation rule!for coproduct type}
\begin{align*}
f(\inl(a)) &\defeq g_0(a), \\
f(\inr(b)) &\defeq g_1(b).
\end{align*}
We package this scheme into the induction principle for coproducts:
\symlabel{defn:induction-plus}%
\begin{narrowmultline*}
\ind{A+B} :
\dprd{C: (A + B) \to \UU}
\Parens{\tprd{a:A} C(\inl(a))} \to \narrowbreak
\Parens{\tprd{b:B} C(\inr(b))} \to \tprd{x:A+B}C(x).
\end{narrowmultline*}
As before, the recursor arises in the case that the family $C$ is
constant.
\index{induction principle!for empty type}
The induction principle for the empty type
\symlabel{defn:induction-emptyt}%
\[ \ind{\emptyt} : \prd{C:\emptyt \to \UU}{z:\emptyt} C(z) \]
gives us a way to define a trivial dependent function out of the
empty type. % In the presence of $\eta$-equality it is derivable
% from the recursor.
% ex
\index{type!coproduct|)}%
\index{type!empty|)}%
\section{The type of booleans}
\label{sec:type-booleans}
\indexsee{boolean!type of}{type of booleans}%
\index{type!of booleans|(defstyle}%
The type of booleans $\bool:\UU$ is intended to have exactly two elements
$\bfalse,\btrue : \bool$. It is clear that we could construct this
type out of coproduct
% this one results in a warning message just because it's on the same page as the previous entry
% for {type!coproduct}, so it's not our fault
\index{type!coproduct}%
and unit\index{type!unit} types as $\unit + \unit$. However,
since it is used frequently, we give the explicit rules here.
Indeed, we are going to observe that we can also go the other way
and derive binary coproducts from $\Sigma$-types and $\bool$.
\index{recursion principle!for type of booleans}
To derive a function $f : \bool \to C$ we need $c_0,c_1 : C$ and
add the defining equations
\begin{align*}
f(\bfalse) &\defeq c_0, \\
f(\btrue) &\defeq c_1.
\end{align*}
The recursor corresponds to the if-then-else construct in
functional programming:
\symlabel{defn:recursor-bool}%
\[ \rec{\bool} : \prd{C:\UU} C \to C \to \bool \to C \]
with the defining equations
\begin{align*}
\rec{\bool}(C,c_0,c_1,\bfalse) &\defeq c_0, \\
\rec{\bool}(C,c_0,c_1,\btrue) &\defeq c_1.
\end{align*}
\index{induction principle!for type of booleans}
Given $C : \bool \to \UU$, to derive a dependent function
$f : \prd{x:\bool}C(x)$ we need $c_0:C(\bfalse)$ and $c_1 : C(\btrue)$, in which case we can give the defining equations
\begin{align*}
f(\bfalse) &\defeq c_0, \\
f(\btrue) &\defeq c_1.
\end{align*}
We package this up into the induction principle
\symlabel{defn:induction-bool}%