<!DOCTYPE html>
<meta charset=utf-8>
<title>Your first Python program - Dive Into Python 3</title>
<!--[if IE]><script src=j/html5.js></script><![endif]-->
<link rel=stylesheet href=dip3.css>
<style>
body{counter-reset:h1 1}
table{border:1px solid #bbb;border-collapse:collapse;margin:auto}
td,th{border:1px solid #bbb;padding:0 1.75em}
th{text-align:left}
mark{display:inline}
</style>
<link rel=stylesheet media='only screen and (max-device-width: 480px)' href=mobile.css>
<link rel=stylesheet media=print href=print.css>
<meta name=viewport content='initial-scale=1.0'>
<p id=level>Difficulty level: <span class=u title=novice>♦♢♢♢♢</span>
<h1>Your First Python Program</h1>
<blockquote class=q>
<p><span class=u>❝</span> Don’t bury your burden in saintly silence. You have a problem? Great. Rejoice, dive in, and investigate. <span class=u>❞</span><br>— <a href=http://en.wikiquote.org/wiki/Buddhism>Ven. Henepola Gunaratana</a>
</blockquote>
<p id=toc>
<h2 id=divingin>Diving In</h2>
<p class=f>Convention dictates that I should bore you with the fundamental building blocks of programming, so we can slowly work up to building something useful. Let’s skip all that. Here is a complete, working Python program. It probably makes absolutely no sense to you. Don’t worry about that, because you’re going to dissect it line by line. But read through it first and see what, if anything, you can make of it.
<p class=d>[<a href=examples/humansize.py>download <code>humansize.py</code></a>]
<pre class=pp><code>SUFFIXES = {1000: ['KB', 'MB', 'GB', 'TB', 'PB', 'EB', 'ZB', 'YB'],
            1024: ['KiB', 'MiB', 'GiB', 'TiB', 'PiB', 'EiB', 'ZiB', 'YiB']}

def approximate_size(size, a_kilobyte_is_1024_bytes=True):
    '''Convert a file size to human-readable form.

    Keyword arguments:
    size -- file size in bytes
    a_kilobyte_is_1024_bytes -- if True (default), use multiples of 1024
                                if False, use multiples of 1000

    Returns: string

    '''
    if size < 0:
        raise ValueError('number must be non-negative')

    multiple = 1024 if a_kilobyte_is_1024_bytes else 1000
    for suffix in SUFFIXES[multiple]:
        size /= multiple
        if size < multiple:
            return '{0:.1f} {1}'.format(size, suffix)

    raise ValueError('number too large')

if __name__ == '__main__':
    print(approximate_size(1000000000000, False))
    print(approximate_size(1000000000000))</code></pre>
<p>Now let’s run this program on the command line. On Windows, it will look something like this:
<pre class='nd screen'>
<samp class=p>c:\home\diveintopython3\examples> </samp><kbd>c:\python31\python.exe humansize.py</kbd>
<samp>1.0 TB
931.3 GiB</samp></pre>
<p>On Mac OS X or Linux, it would look something like this:
<pre class='nd screen cmdline'>
<samp class=p>you@localhost:~/diveintopython3/examples$ </samp><kbd>python3 humansize.py</kbd>
<samp>1.0 TB
931.3 GiB</samp></pre>
<p>What just happened? You executed your first Python program. You called the Python interpreter on the command line, and you passed the name of the script you wanted Python to execute. The script defines a single function, the <code>approximate_size()</code> function, which takes an exact file size in bytes and calculates a “pretty” (but approximate) size. (You’ve probably seen this in Windows Explorer, or the Mac OS X Finder, or Nautilus or Dolphin or Thunar on Linux. If you display a folder of documents as a multi-column list, it will display a table with the document icon, the document name, the size, type, last-modified date, and so on. If the folder contains a 1093-byte file named <code>TODO</code>, your file manager won’t display <code>TODO 1093 bytes</code>; it’ll say something like <code>TODO 1 KB</code> instead. That’s what the <code>approximate_size()</code> function does.)
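<p>If you’re curious where those two numbers come from, here is the arithmetic worked by hand. This is only a sketch of the division involved, not the function itself, and the variable names are invented for the illustration.

```python
# Illustration only: reproduce the two sample outputs by hand.
size = 1000000000000           # one trillion bytes

# Decimal units: divide by 1000 four times (KB, MB, GB, TB).
decimal = size / 1000 ** 4
decimal_text = '{0:.1f} {1}'.format(decimal, 'TB')

# Binary units: divide by 1024 three times (KiB, MiB, GiB).
binary = size / 1024 ** 3
binary_text = '{0:.1f} {1}'.format(binary, 'GiB')

print(decimal_text)   # 1.0 TB
print(binary_text)    # 931.3 GiB
```

<p>The same number of bytes looks smaller in binary units because each step divides by 1024 instead of 1000.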
<p>Look at the bottom of the script, and you’ll see two calls to <code>print(approximate_size(<var>arguments</var>))</code>. These are function calls — first calling the <code>approximate_size()</code> function and passing a number of arguments, then taking the return value and passing it straight on to the <code>print()</code> function. The <code>print()</code> function is built-in; you’ll never see an explicit declaration of it. You can just use it, anytime, anywhere. (There are lots of built-in functions, and lots more functions that are separated into <i>modules</i>. Patience, grasshopper.)
<p>So why does running the script on the command line give you the same output every time? We’ll get to that. First, let’s look at that <code>approximate_size()</code> function.
<p class=a>⁂
<h2 id=declaringfunctions>Declaring Functions</h2>
<p>Python has functions like most other languages, but it does not have separate header files like <abbr>C++</abbr> or <code>interface</code>/<code>implementation</code> sections like Pascal. When you need a function, just declare it, like this:
<pre class='nd pp'><code>def approximate_size(size, a_kilobyte_is_1024_bytes=True):</code></pre>
<aside>When you need a function, just declare it.</aside>
<p>The keyword <code>def</code> starts the function declaration, followed by the function name, followed by the arguments in parentheses. Multiple arguments are separated with commas.
<p>Also note that the function doesn’t define a return datatype. Python functions do not specify the datatype of their return value; they don’t even specify whether or not they return a value. (In fact, every Python function returns a value; if the function ever executes a <code>return</code> statement, it will return that value, otherwise it will return <code>None</code>, the Python null value.)
<blockquote class=note>
<p><span class=u>☞</span>In some languages, functions (that return a value) start with <code>function</code>, and subroutines (that do not return a value) start with <code>sub</code>. There are no subroutines in Python. Everything is a function, all functions return a value (even if it’s <code>None</code>), and all functions start with <code>def</code>.
</blockquote>
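<p>A quick way to convince yourself of this is to capture the return value of a function that has no <code>return</code> statement. The two functions below, <code>greet()</code> and <code>shout()</code>, are made up for this sketch; they are not part of <code>humansize.py</code>.

```python
def greet(name):
    # No return statement: the function still returns a value, None.
    print('Hello, ' + name)

def shout(name):
    # An explicit return statement hands back the value it names.
    return name.upper()

result_without_return = greet('world')
result_with_return = shout('world')

print(result_without_return is None)   # True
print(result_with_return)              # WORLD
```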
<p>The <code>approximate_size()</code> function takes two arguments, <var>size</var> and <var>a_kilobyte_is_1024_bytes</var>, but neither argument specifies a datatype. In Python, variables are never explicitly typed. Python figures out what type a variable is and keeps track of it internally.
<blockquote class='note compare java'>
<p><span class=u>☞</span>In Java and other statically-typed languages, you must specify the datatype of the function return value and each function argument. In Python, you never explicitly specify the datatype of anything. Based on what value you assign, Python keeps track of the datatype internally.
</blockquote>
<h3 id=optional-arguments>Optional and Named Arguments</h3>
<p>Python allows function arguments to have default values; if the function is called without the argument, the argument gets its default value. Furthermore, arguments can be specified in any order by using named arguments.
<p>Let’s take another look at that <code>approximate_size()</code> function declaration:
<pre class='nd pp'><code>def approximate_size(size, a_kilobyte_is_1024_bytes=True):</code></pre>
<p>The second argument, <var>a_kilobyte_is_1024_bytes</var>, specifies a default value of <code>True</code>. This means the argument is <i>optional</i>; you can call the function without it, and Python will act as if you had called it with <code>True</code> as a second parameter.
<p>Now look at the bottom of the script:
<pre class=pp><code>if __name__ == '__main__':
<a>    print(approximate_size(1000000000000, False))  <span class=u>①</span></a>
<a>    print(approximate_size(1000000000000))         <span class=u>②</span></a></code></pre>
<ol>
<li>This calls the <code>approximate_size()</code> function with two arguments. Within the <code>approximate_size()</code> function, <var>a_kilobyte_is_1024_bytes</var> will be <code>False</code>, since you explicitly passed <code>False</code> as the second argument.
<li>This calls the <code>approximate_size()</code> function with only one argument. But that’s OK, because the second argument is optional! Since the caller doesn’t specify, the second argument defaults to <code>True</code>, as defined by the function declaration.
</ol>
<p>You can also pass values into a function by name.
<pre class=screen>
<samp class=p>>>> </samp><kbd class=pp>from humansize import approximate_size</kbd>
<a><samp class=p>>>> </samp><kbd class=pp>approximate_size(4000, a_kilobyte_is_1024_bytes=False)</kbd> <span class=u>①</span></a>
<samp class=pp>'4.0 KB'</samp>
<a><samp class=p>>>> </samp><kbd class=pp>approximate_size(size=4000, a_kilobyte_is_1024_bytes=False)</kbd> <span class=u>②</span></a>
<samp class=pp>'4.0 KB'</samp>
<a><samp class=p>>>> </samp><kbd class=pp>approximate_size(a_kilobyte_is_1024_bytes=False, size=4000)</kbd> <span class=u>③</span></a>
<samp class=pp>'4.0 KB'</samp>
<a><samp class=p>>>> </samp><kbd class=pp>approximate_size(a_kilobyte_is_1024_bytes=False, 4000)</kbd> <span class=u>④</span></a>
<samp class=traceback>  File "&lt;stdin>", line 1
SyntaxError: non-keyword arg after keyword arg</samp>
<a><samp class=p>>>> </samp><kbd class=pp>approximate_size(size=4000, False)</kbd> <span class=u>⑤</span></a>
<samp class=traceback>  File "&lt;stdin>", line 1
SyntaxError: non-keyword arg after keyword arg</samp></pre>
<ol>
<li>This calls the <code>approximate_size()</code> function with <code>4000</code> for the first argument (<var>size</var>) and <code>False</code> for the argument named <var>a_kilobyte_is_1024_bytes</var>. (That happens to be the second argument, but that doesn’t matter, as you’ll see in a minute.)
<li>This calls the <code>approximate_size()</code> function with <code>4000</code> for the argument named <var>size</var> and <code>False</code> for the argument named <var>a_kilobyte_is_1024_bytes</var>. (These named arguments happen to be in the same order as the arguments are listed in the function declaration, but that doesn’t matter either.)
<li>This calls the <code>approximate_size()</code> function with <code>False</code> for the argument named <var>a_kilobyte_is_1024_bytes</var> and <code>4000</code> for the argument named <var>size</var>. (See? I told you the order didn’t matter.)
<li>This call fails, because you have a named argument followed by an unnamed (positional) argument, and that never works. Reading the argument list from left to right, once you have a single named argument, the rest of the arguments must also be named.
<li>This call fails too, for the same reason as the previous call. Is that surprising? After all, you passed <code>4000</code> for the argument named <code>size</code>, so “obviously” that <code>False</code> value was meant for the <var>a_kilobyte_is_1024_bytes</var> argument. But Python doesn’t work that way. As soon as you have a named argument, all arguments to the right of that need to be named arguments, too.
</ol>
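<p>The same rules can be tried out in a script rather than the interactive shell. This is a minimal sketch: <code>make_label()</code> is a hypothetical function invented for the example, and the built-in <code>compile()</code> function is used only so the illegal call can be checked without stopping the script.

```python
def make_label(text, uppercase=False):
    '''Hypothetical helper: return text, optionally upper-cased.'''
    return text.upper() if uppercase else text

# Positional, then named: allowed.
first = make_label('draft', uppercase=True)

# All named, in any order: allowed.
second = make_label(uppercase=True, text='draft')

# Named, then positional: rejected before the code ever runs.
try:
    compile("make_label(uppercase=True, 'draft')", '<example>', 'eval')
    rejected = False
except SyntaxError:
    rejected = True

print(first, second, rejected)   # DRAFT DRAFT True
```

<p>Note that the bad call is a <code>SyntaxError</code>, so Python refuses it when it reads the code, not when the function is called.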
<p class=a>⁂
<h2 id=readability>Writing Readable Code</h2>
<p>I won’t bore you with a long finger-wagging speech about the importance of documenting your code. Just know that code is written once but read many times, and the most important audience for your code is yourself, six months after writing it (<i>i.e.</i> after you’ve forgotten everything but need to fix something). Python makes it easy to write readable code, so take advantage of it. You’ll thank me in six months.
<h3 id=docstrings>Documentation Strings</h3>
<p>You can document a Python function by giving it a documentation string (<code>docstring</code> for short). In this program, the <code>approximate_size()</code> function has a <code>docstring</code>:
<pre class='nd pp'><code>def approximate_size(size, a_kilobyte_is_1024_bytes=True):
    '''Convert a file size to human-readable form.

    Keyword arguments:
    size -- file size in bytes
    a_kilobyte_is_1024_bytes -- if True (default), use multiples of 1024
                                if False, use multiples of 1000

    Returns: string

    '''</code></pre>
<aside>Every function deserves a decent docstring.</aside>
<p>Triple quotes signify a multi-line string. Everything between the start and end quotes is part of a single string, including line breaks, leading white space, and other quote characters. You can use them anywhere, but you’ll see them most often used when defining a <code>docstring</code>.
<blockquote class='note compare perl5'>
<p><span class=u>☞</span>Triple quotes are also an easy way to define a string with both single and double quotes, like <code>qq/.../</code> in Perl 5.
</blockquote>
<p>Everything between the triple quotes is the function’s <code>docstring</code>, which documents what the function does. A <code>docstring</code>, if it exists, must be the first thing defined in a function (that is, on the next line after the function declaration). You don’t technically need to give your function a <code>docstring</code>, but you always should. I know you’ve heard this in every programming class you’ve ever taken, but Python gives you an added incentive: the <code>docstring</code> is available at runtime as an attribute of the function.
<blockquote class=note>
<p><span class=u>☞</span>Many Python <abbr>IDE</abbr>s use the <code>docstring</code> to provide context-sensitive documentation, so that when you type a function name, its <code>docstring</code> appears as a tooltip. This can be incredibly helpful, but it’s only as good as the <code>docstring</code>s you write.
</blockquote>
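<p>Here is a small sketch of reading a <code>docstring</code> at runtime. The <code>area()</code> function is hypothetical, made up for this example. <code>inspect.getdoc()</code> is a standard library function that returns the same text with its indentation tidied up, which is roughly what documentation tools do before showing it to you.

```python
import inspect

def area(width, height):
    '''Return the area of a rectangle.

    Hypothetical example function, not part of humansize.py.
    '''
    return width * height

raw = area.__doc__               # the docstring attribute of the function
cleaned = inspect.getdoc(area)   # same text with indentation normalized

print(cleaned.splitlines()[0])   # Return the area of a rectangle.
```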
<p class=a>⁂
<h2 id=importsearchpath>The <code>import</code> Search Path</h2>
<p>Before this goes any further, I want to briefly mention the library search path. Python looks in several places when you try to import a module. Specifically, it looks in all the directories defined in <code>sys.path</code>. This is just a list, and you can easily view it or modify it with standard list methods. (You’ll learn more about lists in <a href=native-datatypes.html#lists>Native Datatypes</a>.)
<pre class=screen>
<a><samp class=p>>>> </samp><kbd class=pp>import sys</kbd> <span class=u>①</span></a>
<a><samp class=p>>>> </samp><kbd class=pp>sys.path</kbd> <span class=u>②</span></a>
<samp class=pp>['',
'/usr/lib/python31.zip',
'/usr/lib/python3.1',
'/usr/lib/python3.1/plat-linux2@EXTRAMACHDEPPATH@',
'/usr/lib/python3.1/lib-dynload',
'/usr/lib/python3.1/dist-packages',
'/usr/local/lib/python3.1/dist-packages']</samp>
<a><samp class=p>>>> </samp><kbd class=pp>sys</kbd> <span class=u>③</span></a>
<samp class=pp>&lt;module 'sys' (built-in)></samp>
<a><samp class=p>>>> </samp><kbd class=pp>sys.path.insert(0, '/home/mark/diveintopython3/examples')</kbd> <span class=u>④</span></a>
<a><samp class=p>>>> </samp><kbd class=pp>sys.path</kbd> <span class=u>⑤</span></a>
<samp class=pp>['/home/mark/diveintopython3/examples',
'',
'/usr/lib/python31.zip',
'/usr/lib/python3.1',
'/usr/lib/python3.1/plat-linux2@EXTRAMACHDEPPATH@',
'/usr/lib/python3.1/lib-dynload',
'/usr/lib/python3.1/dist-packages',
'/usr/local/lib/python3.1/dist-packages']</samp></pre>
<ol>
<li>Importing the <code>sys</code> module makes all of its functions and attributes available.
<li><code>sys.path</code> is a list of directory names that constitute the current search path. (Yours will look different, depending on your operating system, what version of Python you’re running, and where it was originally installed.) Python will look through these directories (in this order) for a <code>.py</code> file whose name matches what you’re trying to import.
<li>Actually, I lied; the truth is more complicated than that, because not all modules are stored as <code>.py</code> files. Some are <i>built-in modules</i>; they are actually baked right into Python itself. Built-in modules behave just like regular modules, but their Python source code is not available, because they are not written in Python! (Like Python itself, these built-in modules are written in C.)
<li>You can add a new directory to Python’s search path at runtime by adding the directory name to <code>sys.path</code>, and then Python will look in that directory as well, whenever you try to import a module. The effect lasts as long as Python is running.
<li>By using <code>sys.path.insert(0, <var>new_path</var>)</code>, you inserted a new directory as the first item of the <code>sys.path</code> list, and therefore at the beginning of Python’s search path. This is almost always what you want. In case of naming conflicts (for example, if Python ships with version 2 of a particular library but you want to use version 3), this ensures that your modules will be found and used instead of the modules that came with Python.
</ol>
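<p>In a script, the same idea might look like the sketch below. The directory name is a placeholder, so substitute wherever your own modules live. The membership check is my own addition; it just keeps the entry from piling up if the code runs more than once.

```python
import sys

# Hypothetical directory; substitute wherever your own modules live.
extra_dir = '/home/you/diveintopython3/examples'

# Avoid adding the same entry twice if this code runs more than once.
if extra_dir not in sys.path:
    sys.path.insert(0, extra_dir)

print(sys.path[0])
```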
<p class=a>⁂
<h2 id=everythingisanobject>Everything Is An Object</h2>
<p>In case you missed it, I just said that Python functions have attributes, and that those attributes are available at runtime. A function, like everything else in Python, is an object.
<p>Run the interactive Python shell and follow along:
<pre class=screen>
<a><samp class=p>>>> </samp><kbd class=pp>import humansize</kbd> <span class=u>①</span></a>
<a><samp class=p>>>> </samp><kbd class=pp>print(humansize.approximate_size(4096, True))</kbd> <span class=u>②</span></a>
<samp class=pp>4.0 KiB</samp>
<a><samp class=p>>>> </samp><kbd class=pp>print(humansize.approximate_size.__doc__)</kbd> <span class=u>③</span></a>
<samp>Convert a file size to human-readable form.

    Keyword arguments:
    size -- file size in bytes
    a_kilobyte_is_1024_bytes -- if True (default), use multiples of 1024
                                if False, use multiples of 1000

    Returns: string

</samp></pre>
<ol>
<li>The first line imports the <code>humansize</code> program as a module — a chunk of code that you can use interactively, or from a larger Python program. Once you import a module, you can reference any of its public functions, classes, or attributes. Modules can do this to access functionality in other modules, and you can do it in the Python interactive shell too. This is an important concept, and you’ll see a lot more of it throughout this book.
<li>When you want to use functions defined in imported modules, you need to include the module name. So you can’t just say <code>approximate_size</code>; it must be <code>humansize.approximate_size</code>. If you’ve used classes in Java, this should feel vaguely familiar.
<li>Instead of calling the function as you would expect to, you asked for one of the function’s attributes, <code>__doc__</code>.
</ol>
<blockquote class='note compare perl5'>
<p><span class=u>☞</span><code>import</code> in Python is like <code>require</code> in Perl. Once you <code>import</code> a Python module, you access its functions with <code><var>module</var>.<var>function</var></code>; once you <code>require</code> a Perl module, you access its functions with <code><var>module</var>::<var>function</var></code>.
</blockquote>
<h3 id=whatsanobject>What’s An Object?</h3>
<p>Everything in Python is an object, and everything can have attributes and methods. All functions have a built-in attribute <code>__doc__</code>, which returns the <var>docstring</var> defined in the function’s source code. The <code>sys</code> module is an object which has (among other things) an attribute called <var>path</var>. And so forth.
<p>Still, this doesn’t answer the more fundamental question: what is an object? Different programming languages define “object” in different ways. In some, it means that <em>all</em> objects <em>must</em> have attributes and methods; in others, it means that all objects are subclassable. In Python, the definition is looser. Some objects have neither attributes nor methods, <em>but they could</em>. Not all objects are subclassable. But everything is an object in the sense that it can be assigned to a variable or passed as an argument to a function.
<p>You may have heard the term “first-class object” in other programming contexts. In Python, functions are <i>first-class objects</i>. You can pass a function as an argument to another function. Modules are <i>first-class objects</i>. You can pass an entire module as an argument to a function. Classes are first-class objects, and individual instances of a class are also first-class objects.
<p>This is important, so I’m going to repeat it in case you missed it the first few times: <em>everything in Python is an object</em>. Strings are objects. Lists are objects. Functions are objects. Classes are objects. Class instances are objects. Even modules are objects.
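<p>To make that concrete, here is a minimal sketch that passes a function and a module around like any other value. The helper names (<code>double()</code>, <code>apply_twice()</code>, <code>public_names()</code>) are invented for the illustration; <code>math</code> is a standard library module.

```python
import math

def double(n):
    return n * 2

def apply_twice(function, value):
    '''Call function on value, then call it again on the result.'''
    return function(function(value))

def public_names(module):
    '''Return the names in a module that do not start with an underscore.'''
    return [name for name in dir(module) if not name.startswith('_')]

doubled_twice = apply_twice(double, 5)   # a function passed as an argument
math_names = public_names(math)          # a module passed as an argument
square_root = math.sqrt                  # a function assigned to a variable

print(doubled_twice)          # 20
print('sqrt' in math_names)   # True
print(square_root(16))        # 4.0
```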
<p class=a>⁂
<h2 id=indentingcode>Indenting Code</h2>
<p>Python functions have no explicit <code>begin</code> or <code>end</code>, and no curly braces to mark where the function code starts and stops. The only delimiter is a colon (<code>:</code>) and the indentation of the code itself.
<pre class=pp><code><a>def approximate_size(size, a_kilobyte_is_1024_bytes=True):  <span class=u>①</span></a>
<a>    if size < 0:                                            <span class=u>②</span></a>
<a>        raise ValueError('number must be non-negative')     <span class=u>③</span></a>
<a>                                                            <span class=u>④</span></a>
    multiple = 1024 if a_kilobyte_is_1024_bytes else 1000
<a>    for suffix in SUFFIXES[multiple]:                       <span class=u>⑤</span></a>
        size /= multiple
        if size < multiple:
            return '{0:.1f} {1}'.format(size, suffix)

    raise ValueError('number too large')</code></pre>
<ol>
<li>Code blocks are defined by their indentation. By “code block,” I mean functions, <code>if</code> statements, <code>for</code> loops, <code>while</code> loops, and so forth. Indenting starts a block and unindenting ends it. There are no explicit braces, brackets, or keywords. This means that whitespace is significant, and must be consistent. In this example, the function code is indented four spaces. It doesn’t need to be four spaces, it just needs to be consistent. The first line that is not indented marks the end of the function.
<li>In Python, an <code>if</code> statement is followed by a code block. If the <code>if</code> expression evaluates to true, the indented block is executed, otherwise it falls to the <code>else</code> block (if any). Note the lack of parentheses around the expression.
<li>This line is inside the <code>if</code> code block. This <code>raise</code> statement will raise an exception (of type <code>ValueError</code>), but only if <code>size < 0</code>.
<li>This is <em>not</em> the end of the function. Completely blank lines don’t count. They can make the code more readable, but they don’t count as code block delimiters. The function continues on the next line.
<li>The <code>for</code> loop also marks the start of a code block. Code blocks can contain multiple lines, as long as they are all indented the same amount. This <code>for</code> loop has three lines of code in it. There is no other special syntax for multi-line code blocks. Just indent and get on with your life.
</ol>
<p>After some initial protests and several snide analogies to Fortran, you will make peace with this and start seeing its benefits. One major benefit is that all Python programs look similar, since indentation is a language requirement and not a matter of style. This makes it easier to read and understand other people’s Python code.
<blockquote class='note compare java'>
<p><span class=u>☞</span>Python uses line breaks to separate statements and a colon and indentation to separate code blocks. <abbr>C++</abbr> and Java use semicolons to separate statements and curly braces to separate code blocks.
</blockquote>
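<p>If you want to see what “must be consistent” means in practice, the sketch below feeds two versions of a tiny function to the built-in <code>compile()</code> function. The function <code>check()</code> and both source strings are made up for this example; the only difference between them is the indentation of the last line.

```python
consistent = (
    'def check(n):\n'
    '    if n < 0:\n'
    '        return "negative"\n'
    '    return "ok"\n'
)

inconsistent = (
    'def check(n):\n'
    '    if n < 0:\n'
    '        return "negative"\n'
    '  return "ok"\n'          # dedents to a level that was never opened
)

compile(consistent, '<example>', 'exec')    # compiles without complaint

try:
    compile(inconsistent, '<example>', 'exec')
    error_name = None
except IndentationError as error:
    error_name = type(error).__name__

print(error_name)   # IndentationError
```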
<p class=a>⁂
<h2 id=exceptions>Exceptions</h2>
<p>Exceptions are everywhere in Python. Virtually every module in the standard Python library uses them, and Python itself will raise them in a lot of different circumstances. You’ll see them repeatedly throughout this book.
<p>What is an exception? Usually it’s an error, an indication that something went wrong. (Not all exceptions are errors, but never mind that for now.) Some programming languages encourage the use of error return codes, which you <em>check</em>. Python encourages the use of exceptions, which you <em>handle</em>.
<p>When an error occurs in the Python Shell, it prints out some details about the exception and how it happened, and that’s that. This is called an <em>unhandled</em> exception. When the exception was raised, there was no code to explicitly notice it and deal with it, so it bubbled its way back up to the top level of the Python Shell, which spits out some debugging information and calls it a day. In the shell, that’s no big deal, but if that happened while your actual Python program was running, the entire program would come to a screeching halt if nothing handles the exception. Maybe that’s what you want, maybe it isn’t.
<blockquote class='note compare java'>
<p><span class=u>☞</span>Unlike Java methods, Python functions don’t declare which exceptions they might raise. It’s up to you to determine what possible exceptions you need to catch.
</blockquote>
<p>An exception doesn’t need to result in a complete program crash, though. Exceptions can be <em>handled</em>. Sometimes an exception is really because you have a bug in your code (like accessing a variable that doesn’t exist), but sometimes an exception is something you can anticipate. If you’re opening a file, it might not exist. If you’re importing a module, it might not be installed. If you’re connecting to a database, it might be unavailable, or you might not have the correct security credentials to access it. If you know a line of code may raise an exception, you should handle the exception using a <code>try...except</code> block.
<blockquote class='note compare java'>
<p><span class=u>☞</span>Python uses <code>try...except</code> blocks to handle exceptions, and the <code>raise</code> statement to generate them. Java and <abbr>C++</abbr> use <code>try...catch</code> blocks to handle exceptions, and the <code>throw</code> statement to generate them.
</blockquote>
<p>The <code>approximate_size()</code> function raises exceptions in two different cases: if the given <var>size</var> is larger than the function is designed to handle, or if it’s less than zero.
<pre class='nd pp'><code>if size < 0:
    raise ValueError('number must be non-negative')</code></pre>
<p>The syntax for raising an exception is simple enough. Use the <code>raise</code> statement, followed by the exception name, and an optional human-readable string for debugging purposes. The syntax is reminiscent of calling a function. (In reality, exceptions are implemented as classes, and this <code>raise</code> statement is actually creating an instance of the <code>ValueError</code> class and passing the string <code>'number must be non-negative'</code> to its initialization method. But <a href=iterators.html#defining-classes>we’re getting ahead of ourselves</a>!)
<blockquote class=note>
<p><span class=u>☞</span>You don’t need to handle an exception in the function that raises it. If one function doesn’t handle it, the exception is passed to the calling function, then that function’s calling function, and so on “up the stack.” If the exception is never handled, your program will crash, Python will print a “traceback” to standard error, and that’s the end of that. Again, maybe that’s what you want; it depends on what your program does.
</blockquote>
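<p>Here is one way that might look: one function raises the exception, and its caller handles it. Both functions, <code>parse_size()</code> and <code>safe_parse_size()</code>, are hypothetical helpers written for this sketch, and falling back to a default value is just one possible policy.

```python
def parse_size(text):
    '''Hypothetical helper: convert text to a non-negative integer.'''
    number = int(text)      # int() raises ValueError for non-numeric text
    if number < 0:
        raise ValueError('number must be non-negative')
    return number

def safe_parse_size(text, fallback=0):
    '''Return the parsed size, or fallback if the text is unusable.'''
    try:
        return parse_size(text)
    except ValueError as error:
        print('ignoring bad size {0!r}: {1}'.format(text, error))
        return fallback

print(safe_parse_size('4096'))   # 4096
print(safe_parse_size('-1'))     # prints a message, then 0
print(safe_parse_size('lots'))   # prints a message, then 0
```

<p>Notice that <code>parse_size()</code> never handles anything itself. The exception travels up to <code>safe_parse_size()</code>, which is the first place with an <code>except</code> clause that matches it.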
<h3 id=importerror>Catching Import Errors</h3>
<p>One of Python’s built-in exceptions is <code>ImportError</code>, which is raised when you try to import a module and fail. This can happen for a variety of reasons, but the simplest case is when the module doesn’t exist in your <a href=#importsearchpath>import search path</a>. You can use this to include optional features in your program. For example, <a href=case-study-porting-chardet-to-python-3.html>the <code>chardet</code> library</a> provides character encoding auto-detection. Perhaps your program wants to use this library <em>if it exists</em>, but continue gracefully if the user hasn’t installed it. You can do this with a <code>try...except</code> block.
<pre class='nd pp'><code><mark>try</mark>:
    import chardet
<mark>except</mark> ImportError:
    chardet = None</code></pre>
<p>Later, you can check for the presence of the <code>chardet</code> module with a simple <code>if</code> statement:
<pre class='nd pp'><code>if chardet:
    pass  # do something
else:
    pass  # continue anyway</code></pre>
<p>Another common use of the <code>ImportError</code> exception is when two modules implement a common <abbr>API</abbr>, but one is more desirable than the other. (Maybe it’s faster, or it uses less memory.) You can try to import one module but fall back to a different module if the first import fails. For example, <a href=xml.html>the XML chapter</a> talks about two modules that implement a common <abbr>API</abbr>, called the <code>ElementTree</code> <abbr>API</abbr>. The first, <code>lxml</code>, is a third-party module that you need to download and install yourself. The second, <code>xml.etree.ElementTree</code>, is slower but is part of the Python 3 standard library.
<pre class='nd pp'><code>try:
    from lxml import etree
except ImportError:
    import xml.etree.ElementTree as etree</code></pre>
<p>By the end of this <code>try...except</code> block, you have imported <em>some</em> module and named it <var>etree</var>. Since both modules implement a common <abbr>API</abbr>, the rest of your code doesn’t need to keep checking which module got imported. And since the module that <em>did</em> get imported is always called <var>etree</var>, the rest of your code doesn’t need to be littered with <code>if</code> statements to call differently-named modules.
<p class=a>⁂
<h2 id=nameerror>Unbound Variables</h2>
<p>Take another look at this line of code from the <code>approximate_size()</code> function:
<pre class='nd pp'><code>multiple = 1024 if a_kilobyte_is_1024_bytes else 1000</code></pre>
<p>You never declare the variable <var>multiple</var>, you just assign a value to it. That’s OK, because Python lets you do that. What Python will <em>not</em> let you do is reference a variable that has never been assigned a value. Trying to do so will raise a <code>NameError</code> exception.
<pre class='nd screen'>
<samp class=p>>>> </samp><kbd class=pp>x</kbd>
<samp class=traceback>Traceback (most recent call last):
  File "&lt;stdin>", line 1, in &lt;module>
NameError: name 'x' is not defined</samp>
<samp class=p>>>> </samp><kbd class=pp>x = 1</kbd>
<samp class=p>>>> </samp><kbd class=pp>x</kbd>
<samp class=pp>1</samp></pre>
<p>You will thank Python for this one day.
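<p>One situation where this pays off is a misspelled variable name. The sketch below is illustrative only: the misspelling <var>mutiple</var> is deliberate, and the <code>try..except</code> block is there just so the example can show the error message instead of stopping.

```python
multiple = 1024

try:
    # Typo: 'mutiple' was never assigned, so Python refuses to guess
    size = 4096 / mutiple
except NameError as error:
    message = str(error)
    print(message)
```

<p>If Python silently created <var>mutiple</var> with some default value instead, the typo would produce a wrong answer rather than an error.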
<p class=a>⁂
<h2 id=case>Everything is Case-Sensitive</h2>
<p>All names in Python are case-sensitive: variable names, function names, class names, module names, exception names. If you can get it, set it, call it, construct it, import it, or raise it, it’s case-sensitive.
<pre class='nd screen'>
<samp class=p>>>> </samp><kbd class=pp>an_integer = 1</kbd>
<samp class=p>>>> </samp><kbd class=pp>an_integer</kbd>
<samp class=pp>1</samp>
<samp class=p>>>> </samp><kbd>AN_INTEGER</kbd>
<samp class=traceback>Traceback (most recent call last):
  File "&lt;stdin>", line 1, in &lt;module>
NameError: name 'AN_INTEGER' is not defined</samp>
<samp class=p>>>> </samp><kbd>An_Integer</kbd>
<samp class=traceback>Traceback (most recent call last):
  File "&lt;stdin>", line 1, in &lt;module>
NameError: name 'An_Integer' is not defined</samp>
<samp class=p>>>> </samp><kbd>an_inteGer</kbd>
<samp class=traceback>Traceback (most recent call last):
  File "&lt;stdin>", line 1, in &lt;module>
NameError: name 'an_inteGer' is not defined</samp>
</pre>
<p>And so on.
<p class=a>⁂
<h2 id=runningscripts>Running Scripts</h2>
<aside>Everything in Python is an object.</aside>
<p>Python modules are objects and have several useful attributes. You can use this to easily test your modules as you write them, by including a special block of code that executes when you run the Python file on the command line. Take the last few lines of <code>humansize.py</code>:
<pre class='nd pp'><code>
if __name__ == '__main__':
    print(approximate_size(1000000000000, False))
    print(approximate_size(1000000000000))</code></pre>
<blockquote class='note compare clang'>
<p><span class=u>☞</span>Like <abbr>C</abbr>, Python uses <code>==</code> for comparison and <code>=</code> for assignment. Unlike <abbr>C</abbr>, Python does not allow <code>=</code> inside an expression (Python 3.8 and later provide a separate <code>:=</code> operator for in-line assignment), so there’s no chance of accidentally assigning the value you thought you were comparing.
</blockquote>
<p>So what makes this <code>if</code> statement special? Well, modules are objects, and all modules have a built-in attribute <code>__name__</code>. A module’s <code>__name__</code> depends on how you’re using the module. If you <code>import</code> the module, then <code>__name__</code> is the module’s filename, without a directory path or file extension.
<pre class='nd screen'>
<samp class=p>>>> </samp><kbd class=pp>import humansize</kbd>
<samp class=p>>>> </samp><kbd class=pp>humansize.__name__</kbd>
<samp class=pp>'humansize'</samp></pre>
<p>But you can also run the module directly as a standalone program, in which case <code>__name__</code> will be a special default value, the string <code>'__main__'</code>. Python will evaluate this <code>if</code> statement, find a true expression, and execute the <code>if</code> code block. In this case, that block prints two values.
<pre class='nd screen'>
<samp class=p>c:\home\diveintopython3> </samp><kbd>c:\python31\python.exe humansize.py</kbd>
<samp>1.0 TB
931.3 GiB</samp></pre>
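<p>The same idea works for any module you write. The following is a minimal sketch, not code from <code>humansize.py</code>: the function <code>approximate_kilobytes()</code> is a hypothetical stand-in, and wrapping the test code in a <code>main()</code> function is one common convention rather than a requirement.

```python
def approximate_kilobytes(size):
    """Hypothetical stand-in for approximate_size(); not from humansize.py."""
    return '{0:.1f} KB'.format(size / 1000)

def main():
    # Quick self-test, meant to run only when the file is executed directly
    print(approximate_kilobytes(1500))

if __name__ == '__main__':
    # True when the file is run as a script; false when it is imported,
    # because then __name__ holds the module's own name instead
    main()
```

<p>Importing this file from another module defines <code>approximate_kilobytes()</code> without printing anything; running it from the command line prints the test value.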
<p>And that’s your first Python program!
<p class=a>⁂
<h2 id=furtherreading>Further Reading</h2>
<ul>
<li><a href=http://www.python.org/dev/peps/pep-0257/>PEP 257: Docstring Conventions</a> explains what distinguishes a good <code>docstring</code> from a great <code>docstring</code>.
<li><a href=http://docs.python.org/3.1/tutorial/controlflow.html#documentation-strings>Python Tutorial: Documentation Strings</a> also touches on the subject.
<li><a href=http://www.python.org/dev/peps/pep-0008/>PEP 8: Style Guide for Python Code</a> discusses good indentation style.
<li><a href=http://docs.python.org/3.1/reference/><cite>Python Reference Manual</cite></a> explains what it means to say that <a href=http://docs.python.org/3.1/reference/datamodel.html#objects-values-and-types>everything in Python is an object</a>, because some people are <a href=http://www.douglasadams.com/dna/pedants.html>pedants</a> and like to discuss that sort of thing at great length.
</ul>
<p class=v><a rel=prev href=installing-python.html title='back to “Installing Python”'><span class=u>☜</span></a> <a rel=next href=native-datatypes.html title='onward to “Native Datatypes”'><span class=u>☞</span></a>
<p class=c>© 2001–11 <a href=about.html>Mark Pilgrim</a>
<script src=j/jquery.js></script>
<script src=j/prettify.js></script>
<script src=j/dip3.js></script>