Skip to content

Latest commit

 

History

History
206 lines (146 loc) · 5.92 KB

7i8g.md

File metadata and controls

206 lines (146 loc) · 5.92 KB

lang.py.syntax.file

Working with file (open, read, write)

Synopsis

  with open(r'./filename', ['r'|'r+'|'a+']) as <alias>:
      [operations with <alias>]

Overview

When working with text files you can perform operations of reading, writing and more

Thou you don't necessarily need it's best practice to use a context manager to automatically handle the closing of the file once it's no longer used. Otherwise, you would need to manually call file.close()

There are two types of files you can work with:

  • Text Files: Each line is terminated with a \n character
  • Binary Files: There is no terminator for a line

Access Mode

How the file will be used once it's opened. They also define the File Handle which indicates from where the data has to be read or written in the file

  • Read Only ('r'):

    • Default mode of opening
    • Handle positioned at the beginning
    • Raises I/O error if file don't exist
  • Write Only ('w'):

    • Handle positioned at the beginning
    • For existing file, the data is truncated and over-written
  • Append Only ('a'):

    • The file is created if it don't exist
    • Handle positioned at the end of file
  • Read and Write ('r+'):

    • Handle positioned at the beginning
    • Raises I/O error if file don't exist
  • Write and Read ('w+'):

    • Handle positioned at the beginning
    • For existing file, the data is truncated and over-written
  • Append and Read ('a+'):

    • Open for reading and writing
    • Handle positioned at the end of file
    • The file is created if it don't exist
    • Data written will be inserted after existing one

Cookbook

Working with a file best practice

The more pythonic way of handling a file in Python is by using the pathlib module and a logging library to keep up with the operations. And group all of this in an exception block to catch any errors

  import pathlib
  import logging

  file_path = pathlib.Path("hello.txt")

  try:
      with file_path.open(mode="w") as file:
          file.write("Hello, World!")
  except OSError as error:
      logging.error("Writing to file %s failed due to: %s", file_path, error)

Go back to the beginning of a file

As you read from a file, the handler will be moved from where it began so if you want to get access back to the beginning of the file you will need to use the seek() method

Here seek(n) takes the file handle to the nth bite from the beginning, in this case 0

  file1 = open("myfile.txt", "r+")
  data = file1.read()
  file1.seek(0)

Write line from a list into a file

You can use the writelines() method to write elements from a list object into a file from the beginning of set file

  L = ["This is Delhi \n", "This is Paris \n", "This is London \n"]

  with open("myfile.txt", "w") as file1:
    file1.write("Hello \n")
    file1.writelines(L)

Get the path name of the file of the loaded module

When you import a module if you wanted to now the file path name of the loaded module you can use the __file__ magic attribute to get the relative path to the current running scripts

  import re

  print(re.__file__)    # '/usr/lib/python3.9/re.py'

Print lines of a file without the \n ending

When you print a line of a file, each line ends with a double \n character, to get rid of this character you can either use the rstirp() string method or pass the end='' argument to the print statement

  with open('file.txt') as file_object:
      for line in file_object:
          print(line.rstrip())
          # print(line, end='')

Read lines of a file without the \n ending

Sometimes lines in a file might end with \n which will be interpreted as a new line character and such treated as an extra element line

If you want to avoid reading this character you can use:

  lines = <alias>.read().splitlines()

Or use a list comprehension:

  lines = [line[:-1] for line in <alias>]

Operate on each line of a file

You can treat a file as an iterable where each element is represented by a line in the file

So you can access directly each line in a for loop

  path = pathlib.Path.cwd() / 'test.md'

  with open(path, mode='r') as fid:
      headers = [line.strip() for line in fid if line.startswith('#')]

  print('\n'.join(headers))

Working with multiple files at the same time

Sometimes you may need to open and handle more than one file at the same time, like in the case you may want to read from a file and write into another

  d_path = 'dog_breeds.txt'
  d_r_path = 'dog_breeds_reversed.txt'
  with open(d_path, 'r') as reader, open(d_r_path, 'w') as writer:
      dog_breeds = reader.readlines()
      writer.writelines(reversed(dog_breeds))

Append something to the end of a file

It is the same as writing to a file but use the access method a to position the handle at the end of the file

  with open('dog_breeds.txt', 'a') as a_writer:
      a_writer.write('\nBeagle')

Common file handling libraries

There are many libraries desigin to handle specific file types

  • tarfile: read and write tar archive files
  • zipfile: work with ZIP archives
  • configparser: easily create and parse configuration files
  • xml.etree.ElementTree: create or read XML based files
  • PyPDF2: PDF toolkit
  • xlwings: read and write Excel files
  • pillow: image rendering and manipulation

Create your own file handler

You can define a custom file handler by defining your own context manager with a special class that will have the magic methods __enter__ and __exit__}