Working with files (open, read, write)
with open(r'./filename', ['r'|'r+'|'a+']) as <alias>:
    [operations with <alias>]
When working with text files you can read, write and perform other operations.
Though it is not strictly required, it is best practice to use a context manager,
which automatically closes the file once the with block ends.
Otherwise, you need to call file.close() manually.
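As a minimal sketch of the difference, the snippet below writes to a scratch file first with a manual close() and then with a context manager. The name demo_path and the use of tempfile are assumptions made only for illustration.

```python
import tempfile
from pathlib import Path

# Hypothetical scratch file used only for this demonstration.
demo_path = Path(tempfile.mkdtemp()) / "demo.txt"

# Manual handling: close() has to be called explicitly,
# ideally in a finally block so it also runs on errors.
manual_file = open(demo_path, "w")
try:
    manual_file.write("written manually\n")
finally:
    manual_file.close()

# Context manager: the file is closed when the with block exits.
with open(demo_path, "a") as managed_file:
    managed_file.write("written inside with\n")

print(manual_file.closed, managed_file.closed)
```

Both handles end up closed, but the with version needs no explicit cleanup code.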
There are two types of files you can work with:
- Text Files: Each line is terminated with a \n character
- Binary Files: There is no terminator for a line
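To illustrate the two file types, the following sketch reads the same file once in text mode and once in binary mode (by adding 'b' to the mode string). The file name sample.txt and its contents are assumptions made for the example.

```python
import tempfile
from pathlib import Path

# Hypothetical sample file created just for this comparison.
sample_path = Path(tempfile.mkdtemp()) / "sample.txt"
sample_path.write_bytes(b"first\nsecond\n")

# Text mode: the content is decoded and returned as str.
with open(sample_path, "r", encoding="utf-8") as text_file:
    text_data = text_file.read()

# Binary mode: the content is returned as raw bytes, with no decoding.
with open(sample_path, "rb") as binary_file:
    binary_data = binary_file.read()

print(type(text_data).__name__, type(binary_data).__name__)
```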
Access modes define how the file will be used once it's opened. They also define the position of the file handle, which indicates where data is read from or written to in the file
- Read Only ('r'):
  - Default mode of opening
  - Handle positioned at the beginning
  - Raises FileNotFoundError (a subclass of OSError) if the file doesn't exist
- Write Only ('w'):
  - Handle positioned at the beginning
  - For an existing file, the data is truncated and overwritten
  - The file is created if it doesn't exist
- Append Only ('a'):
  - The file is created if it doesn't exist
  - Handle positioned at the end of the file
- Read and Write ('r+'):
  - Handle positioned at the beginning
  - Raises FileNotFoundError (a subclass of OSError) if the file doesn't exist
- Write and Read ('w+'):
  - Handle positioned at the beginning
  - For an existing file, the data is truncated and overwritten
  - The file is created if it doesn't exist
- Append and Read ('a+'):
  - Open for reading and writing
  - Handle positioned at the end of the file
  - The file is created if it doesn't exist
  - Data written is added after the existing data
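The behavior of the modes listed above can be checked with a short experiment. This is only a sketch: the file name modes.txt and the variable names are assumptions, and the file is created in a temporary directory.

```python
import tempfile
from pathlib import Path

# Hypothetical scratch file; it does not exist yet.
mode_path = Path(tempfile.mkdtemp()) / "modes.txt"

# 'r' on a missing file raises FileNotFoundError (a subclass of OSError).
try:
    open(mode_path, "r")
    missing_raises = False
except FileNotFoundError:
    missing_raises = True

# 'w' creates the file, then truncates it on the second open.
with open(mode_path, "w") as f:
    f.write("one\n")
with open(mode_path, "w") as f:
    f.write("two\n")

# 'a' positions the handle at the end, so existing data is kept.
with open(mode_path, "a") as f:
    f.write("three\n")

# 'r+' starts at the beginning and overwrites in place without truncating.
with open(mode_path, "r+") as f:
    f.write("TWO")

# 'a+' appends, and can read after moving the handle back to the start.
with open(mode_path, "a+") as f:
    f.write("four\n")
    f.seek(0)
    final_content = f.read()

print(missing_raises)
print(final_content)
```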
A more Pythonic way of handling a file is to use the pathlib module
together with the logging module to keep a record of the operations,
and to wrap all of this in a try/except block to catch any errors
import pathlib
import logging

file_path = pathlib.Path("hello.txt")
try:
    with file_path.open(mode="w") as file:
        file.write("Hello, World!")
except OSError as error:
    logging.error("Writing to file %s failed due to: %s", file_path, error)
As you read from a file, the file handle moves forward from where it began, so
if you want to get back to the beginning of the file you need to
use the seek() method.
Here seek(n) moves the file handle to the nth byte from the beginning, in
this case 0
with open("myfile.txt", "r+") as file1:
    data = file1.read()  # the handle is now at the end of the file
    file1.seek(0)        # move the handle back to the beginning
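The effect of the handle position is easiest to see by reading the same file twice. The sketch below assumes a small scratch file (seek_demo.txt, a hypothetical name) and shows that a second read() returns an empty string until seek(0) rewinds the handle.

```python
import tempfile
from pathlib import Path

# Hypothetical scratch file with known content.
seek_path = Path(tempfile.mkdtemp()) / "seek_demo.txt"
seek_path.write_text("abcdef", encoding="utf-8")

with open(seek_path, "r", encoding="utf-8") as f:
    first_read = f.read()   # reads everything; the handle is now at the end
    second_read = f.read()  # nothing left to read from the current position
    f.seek(0)               # rewind to the beginning of the file
    third_read = f.read()   # the full content is available again

print(repr(first_read), repr(second_read), repr(third_read))
```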
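You can use the writelines()
method to write the elements of a list into a file, starting at the current
position of the file handle (the beginning of the file when it is opened with 'w').
It does not add line terminators, so each string must include its own \n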
L = ["This is Delhi \n", "This is Paris \n", "This is London \n"]
with open("myfile.txt", "w") as file1:
    file1.write("Hello \n")
    file1.writelines(L)
When you import a module and want to know the path of the file
it was loaded from, you can use the module's __file__
magic attribute
import re
print(re.__file__) # '/usr/lib/python3.9/re.py'
When you print the lines of a file one by one, the output is double spaced:
each line already ends with a \n character and print() adds another.
To get rid of the extra newline you can either use the rstrip() string
method or pass the end='' argument to the print function
with open('file.txt') as file_object:
    for line in file_object:
        print(line.rstrip())
        # print(line, end='')
When you read the lines of a file into a list, each line keeps its trailing \n
character, which then shows up in every element of the list.
If you want to avoid reading this character you can use:
lines = <alias>.read().splitlines()
Or use a list comprehension:
lines = [line.rstrip('\n') for line in <alias>]
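The approaches differ when the last line of the file has no trailing newline. The comparison below is a sketch using a hypothetical scratch file, lines.txt. It shows that slicing with line[:-1] removes a real character from the last line, while splitlines() and rstrip('\n') do not.

```python
import tempfile
from pathlib import Path

# Hypothetical file whose last line has no trailing newline.
lines_path = Path(tempfile.mkdtemp()) / "lines.txt"
lines_path.write_text("alpha\nbeta\ngamma", encoding="utf-8")

with open(lines_path, encoding="utf-8") as f:
    raw_lines = f.readlines()                          # newlines are kept

with open(lines_path, encoding="utf-8") as f:
    split_lines = f.read().splitlines()                # newlines are removed

with open(lines_path, encoding="utf-8") as f:
    stripped_lines = [line.rstrip("\n") for line in f]  # only '\n' is removed

with open(lines_path, encoding="utf-8") as f:
    sliced_lines = [line[:-1] for line in f]           # always drops one character

print(raw_lines)
print(split_lines)
print(stripped_lines)
print(sliced_lines)
```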
You can treat a file as an iterable where each element is a line of the file,
so you can access each line directly in a for loop
import pathlib

path = pathlib.Path.cwd() / 'test.md'
with open(path, mode='r') as fid:
    # keep only the lines that start with '#' (Markdown headers)
    headers = [line.strip() for line in fid if line.startswith('#')]
print('\n'.join(headers))
Sometimes you need to open and handle more than one file at the same time, for example to read from one file and write into another
d_path = 'dog_breeds.txt'
d_r_path = 'dog_breeds_reversed.txt'
with open(d_path, 'r') as reader, open(d_r_path, 'w') as writer:
    dog_breeds = reader.readlines()
    writer.writelines(reversed(dog_breeds))
Appending works the same as writing to a file, but uses the access mode 'a'
to position
the handle at the end of the file
with open('dog_breeds.txt', 'a') as a_writer:
    a_writer.write('\nBeagle')
There are many libraries designed to handle specific file types
- tarfile: read and write tar archive files
- zipfile: work with ZIP archives
- configparser: easily create and parse configuration files
- xml.etree.ElementTree: create or read XML based files
- PyPDF2: PDF toolkit
- xlwings: read and write Excel files
- pillow: image rendering and manipulation
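As one example from the list, the standard-library zipfile module follows the same open, write and read pattern as regular files. This is only a sketch: the archive name notes.zip and the member name hello.txt are assumptions made for illustration.

```python
import tempfile
import zipfile
from pathlib import Path

# Hypothetical archive created in a temporary directory.
archive_path = Path(tempfile.mkdtemp()) / "notes.zip"

# Write a single text member into a new ZIP archive.
with zipfile.ZipFile(archive_path, "w") as archive:
    archive.writestr("hello.txt", "Hello, World!")

# Reopen the archive, list its members and read one back.
with zipfile.ZipFile(archive_path, "r") as archive:
    member_names = archive.namelist()
    hello_text = archive.read("hello.txt").decode("utf-8")

print(member_names, hello_text)
```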
You can define a custom file handler by defining your own context manager:
a class that implements the magic methods
__enter__
and __exit__
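A minimal sketch of such a class could look like the following. The class name ManagedFile and the scratch file managed.txt are hypothetical and chosen only for this example.

```python
import tempfile
from pathlib import Path


class ManagedFile:
    """Hypothetical wrapper that opens a file on enter and closes it on exit."""

    def __init__(self, path, mode="r"):
        self.path = path
        self.mode = mode
        self.file = None

    def __enter__(self):
        # Called at the start of the with block; its return value
        # is bound to the name after "as".
        self.file = open(self.path, self.mode)
        return self.file

    def __exit__(self, exc_type, exc_value, traceback):
        # Called when the with block ends, also when an exception was raised.
        if self.file is not None:
            self.file.close()
        return False  # do not suppress exceptions from the block


managed_path = Path(tempfile.mkdtemp()) / "managed.txt"
with ManagedFile(managed_path, "w") as handle:
    handle.write("Hello, World!")

print(handle.closed)
```

Returning False from __exit__ lets any exception raised inside the with block propagate after the file has been closed.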