Introduction to programming using Python

Session 8

Matthieu Choplin

http://mattchoplin.com/

Objectives

  • To open a file, read/write data from/to a file
  • To read data from a Web resource

Open a File

We create a file object with the following syntax:

file = open(filename, mode)
Mode  Description
r     Open a file for reading only
r+    Open a file for both reading and writing
w     Open a file for writing only
a     Open a file for appending data. Data are written to the end of the file
rb    Open a file for reading binary data
wb    Open a file for writing binary data
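
The binary modes are not used elsewhere in this session, so here is a minimal sketch of a round trip with "wb" and "rb". The function name binary_round_trip, the file name demo.bin and the use of a temporary directory are assumptions made for illustration only.

```python
import os
import tempfile

def binary_round_trip(data):
    """Write bytes with "wb", read them back with "rb", and return them."""
    # A temporary directory keeps the sketch from touching real files
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "demo.bin")  # hypothetical file name
        outfile = open(path, "wb")
        outfile.write(data)  # binary mode expects bytes, not str
        outfile.close()
        infile = open(path, "rb")
        result = infile.read()  # returns a bytes object
        infile.close()
    return result

print(binary_round_trip(b"\x00\x01Python"))
```

In binary mode the data is read and written as bytes, without any text decoding.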

Write to a File: Example

This program creates a file if it does not exist (an existing file with the same name is erased) and writes to it:

def main():
    # Open file for output
    outfile = open("Python_projects.txt", "w")
    # Write data to the file
    outfile.write("Django\n")
    outfile.write("Flask\n")
    outfile.write("Ansible")
    outfile.close() # Close the output file

main() # Call the main function

Testing File Existence

import os.path

if os.path.isfile("Python_projects.txt"):
    print("Python_projects.txt exists")
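
A check like this is typically used to decide whether a file can be read. The sketch below assumes a hypothetical helper named read_if_exists, which returns the content of the file, or None when the file is absent. The file name in the demo call is also made up.

```python
import os.path

def read_if_exists(filename):
    """Return the text of the file, or None when the file is absent."""
    if os.path.isfile(filename):
        infile = open(filename, "r")
        content = infile.read()
        infile.close()
        return content
    return None

# Assumption: no file with this name exists in the current directory
print(read_if_exists("no_such_file_for_this_demo.txt"))
```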

Read from a File: Example

After a file is opened for reading data, you can use:

  • the read(size) method to read a specified number of characters or all characters,
  • the readline() method to read the next line
  • the readlines() method to read all lines into a list.

Read from a File: Example with read()

def main():
    # Open file for input
    infile = open("Python_projects.txt", "r")
    print("Using read(): ")
    print(infile.read())
    infile.close() # Close the input file
main() # Call the main function

Read from a File: Example with read(size)

def main():
    infile = open("Python_projects.txt", "r")
    print("\nUsing read(number): ")
    s1 = infile.read(4)  # read the first 4 characters
    print(s1)
    s2 = infile.read(10)  # read the next 10 characters
    print(repr(s2))  # a newline (\n) also counts as a character
    infile.close()  # Close the input file

main()  # Call the main function

Read from a File: Example with readline()

def main():
    infile = open("Python_projects.txt", "r")
    print("\nUsing readline(): ")
    line1 = infile.readline()
    line2 = infile.readline()
    line3 = infile.readline()
    line4 = infile.readline()
    print(repr(line1))
    print(repr(line2))
    print(repr(line3))
    print(repr(line4))
    infile.close() # Close the input file

main() # Call the main function

Read from a File: Example with readlines()

def main():
    # Open file for input
    infile = open("Python_projects.txt", "r")
    print("\nUsing readlines(): ")
    print(infile.readlines())  # a list of lines
    infile.close()  # Close the input file

main() # Call the main function
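
A file object can also be used directly in a for loop, which yields one line per iteration. The sketch below assumes a hypothetical helper named read_stripped_lines and recreates the sample file in a temporary directory so that it runs on its own.

```python
import os
import tempfile

def read_stripped_lines(filename):
    """Return the lines of a text file without their trailing newline."""
    lines = []
    infile = open(filename, "r")
    for line in infile:  # the file object yields one line per iteration
        lines.append(line.rstrip("\n"))
    infile.close()
    return lines

# Self-contained demo: recreate the sample file in a temporary directory
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "Python_projects.txt")
    outfile = open(path, "w")
    outfile.write("Django\nFlask\nAnsible")
    outfile.close()
    print(read_stripped_lines(path))  # ['Django', 'Flask', 'Ansible']
```

Unlike readlines(), this loop does not need to hold the raw list of lines in memory before processing them.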

Append Data to a File

You can use the 'a' mode to open a file for appending data to an existing file.

def main():
    # Open file for appending data
    outfile = open("Info.txt", "a")
    outfile.write("\nPython is interpreted\n")
    outfile.close() # Close the output file

main() # Call the main function
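
To see the difference between the "w" and "a" modes, here is a minimal sketch. The helper name write_then_append and the strings written are assumptions. The file is created in a temporary directory so that no real file is modified.

```python
import os
import tempfile

def write_then_append(filename):
    """Show that "w" replaces the content while "a" adds to it."""
    for text in ("first\n", "second\n"):
        outfile = open(filename, "w")  # "w" erases any previous content
        outfile.write(text)
        outfile.close()
    outfile = open(filename, "a")  # "a" keeps the content and adds to the end
    outfile.write("third\n")
    outfile.close()
    infile = open(filename, "r")
    content = infile.read()
    infile.close()
    return content

with tempfile.TemporaryDirectory() as folder:
    print(repr(write_then_append(os.path.join(folder, "Info.txt"))))
    # 'second\nthird\n'
```

The line "first" is lost because the second open in "w" mode erased it, while "third" was added after "second".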

Exercise

Write a program that prompts the user to enter a file name and counts the number of occurrences of each letter in the file, regardless of case.

Only count the letters of the alphabet. You can get them as follows:

from string import ascii_lowercase
print(ascii_lowercase)  # abcdefghijklmnopqrstuvwxyz

Solution

from string import ascii_lowercase
from pprint import pprint

def main():
    filename = input("Enter a filename: ").strip()
    dict_of_letter = {}
    f = open(filename)
    for line in f:
        for letter in line.lower():
            if letter in ascii_lowercase:
                if letter in dict_of_letter:
                    dict_of_letter[letter] += 1
                else:
                    dict_of_letter[letter] = 1
    f.close()
    pprint(dict_of_letter)

main()

Case Studies: Occurrences of Words

  • In this case study, we write a program that counts the occurrences of words in a text file and displays the words and their counts in alphabetical order. The program uses a dictionary in which each entry consists of a word (the key) and its count (the value). For each word, the program checks whether it is already a key in the dictionary. If not, it adds an entry with the word as the key and 1 as the value. Otherwise, it increases the value stored for that word by 1.
  • See the program CountOccurenceOfWords.py
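
The algorithm described above can be sketched as follows. This is not the CountOccurenceOfWords.py program itself: the function name count_words, the punctuation stripping and the sample text are assumptions, and the text comes from a string rather than a file so that the sketch runs on its own.

```python
def count_words(text):
    """Return a dictionary mapping each word to its number of occurrences."""
    counts = {}
    for word in text.lower().split():
        # Assumption: strip common punctuation so "file." and "file" match
        word = word.strip(".,;:!?\"'()")
        if not word:
            continue
        if word in counts:
            counts[word] += 1  # known word: increase its count
        else:
            counts[word] = 1  # new word: add it with a count of 1
    return counts

def main():
    sample = "Open a file, read the file. Close the file!"  # made-up sample text
    counts = count_words(sample)
    for word in sorted(counts):  # alphabetical order of words
        print(word, counts[word])

main()
```

To count the words of a file instead, the string passed to count_words could be the result of read() on an open file.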

Retrieving Data from the Web

Using Python, you can write simple code to read data from a website. All you need to do is open a URL using the urlopen function as follows:

import urllib.request
infile = urllib.request.urlopen('http://example.org/')
html_page = infile.read().decode()
print(html_page)

The html_page variable holds the full HTML source of the page, just as a web browser would receive it.

Exercise

  • Count each letter from a web page (from the source code of the page)
  • You can reuse the code of the previous exercise and try to refactor it so that both programs use the same count_letter function

Solution

from pprint import pprint
from string import ascii_lowercase
import urllib.request

def main():
    url = input("Enter a URL for a file: ").strip()
    infile = urllib.request.urlopen(url)
    text = infile.read().decode()  # Read the content as a string
    dict_of_letter = {}
    # Iterating over a string yields one character at a time
    for letter in text.lower():
        if letter in ascii_lowercase:
            if letter in dict_of_letter:
                dict_of_letter[letter] += 1
            else:
                dict_of_letter[letter] = 1
    pprint(dict_of_letter)

main()
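
The exercise also suggests sharing one count_letter function between the file version and the web version. A possible sketch is shown below. The signature (a single string argument) is an assumption, as is the use of dict.get in place of the if/else test.

```python
from string import ascii_lowercase

def count_letter(text):
    """Count each letter of the alphabet in text, regardless of case."""
    dict_of_letter = {}
    for letter in text.lower():
        if letter in ascii_lowercase:
            # get() returns 0 the first time a letter is seen
            dict_of_letter[letter] = dict_of_letter.get(letter, 0) + 1
    return dict_of_letter

print(count_letter("Hello, World!"))
# {'h': 1, 'e': 1, 'l': 3, 'o': 2, 'w': 1, 'r': 1, 'd': 1}
```

Both programs can then build a string first (with read() on a file, or with read().decode() on the result of urlopen) and pass it to count_letter.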

Using the with statement

It is good practice to use the with keyword when dealing with file objects. This has the advantage that the file is properly closed after its suite finishes, even if an exception is raised on the way. It is also much shorter than writing equivalent try-finally blocks:

with open('Python_projects.txt', 'r') as f:
    read_data = f.read()

assert f.closed
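
For comparison, here is a sketch of the try-finally form next to the with form. The two function names are assumptions, and the demo writes its own file in a temporary directory so that it runs standalone.

```python
import os
import tempfile

def read_with_try_finally(filename):
    """Read a whole file, closing it even if read() raises an exception."""
    f = open(filename, "r")
    try:
        return f.read()
    finally:
        f.close()  # runs whether or not an exception occurred

def read_with_with(filename):
    """Same behaviour, written with the with statement."""
    with open(filename, "r") as f:
        return f.read()

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "Python_projects.txt")
    with open(path, "w") as outfile:
        outfile.write("Django\nFlask\nAnsible")
    print(read_with_try_finally(path) == read_with_with(path))  # True
```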

The JSON file format

  • JSON (JavaScript Object Notation) is a lightweight data interchange format with which you can:
    • dump data ("serialize")
    • load data ("deserialize")

import json
serialized_data = json.dumps(
    ['foo', {'bar': ('baz', None, 1.0, 2)}])
print(serialized_data)
deserialized_data = json.loads(serialized_data)
print(deserialized_data)
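
The json module also provides dump and load, which write to and read from a file object instead of a string. The sketch below stores the same data in a file. The file name data.json and the temporary directory are assumptions made for illustration.

```python
import json
import os
import tempfile

data = ['foo', {'bar': ('baz', None, 1.0, 2)}]

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "data.json")  # hypothetical file name
    with open(path, "w") as outfile:
        json.dump(data, outfile)  # serialize straight into the file
    with open(path, "r") as infile:
        raw_text = infile.read()  # the JSON text as stored on disk
    with open(path, "r") as infile:
        loaded = json.load(infile)  # deserialize from the file

print(raw_text)  # ["foo", {"bar": ["baz", null, 1.0, 2]}]
print(loaded)    # ['foo', {'bar': ['baz', None, 1.0, 2]}]
print(loaded == data)  # False: the tuple came back as a list
```

JSON has no tuple type, so the tuple is written as an array and comes back as a list. None is written as null.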

Example with a simple rest API (1)

How can we get the capital of each country?

import json
from urllib import request

infile = request.urlopen(
    'https://restcountries.eu/rest/v1/all')
content_as_python_obj = json.loads(infile.read().decode())
for country in content_as_python_obj:
    print(country['borders'])

Can you see what type of object "borders" is?

Example with a simple rest API (2)

import json
from urllib import request

infile = request.urlopen(
    'https://restcountries.eu/rest/v1/all')
content_as_python_obj = json.loads(infile.read().decode())
for country in content_as_python_obj:
    print(country['capital'])

API

  • In the previous examples, an API (Application Programming Interface) is simply a specification of the remote calls exposed to the API consumers
  • We use the API as a service simply by calling its available URLs (sending a GET request)
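
Many such URLs carry parameters in a query string. Here is a minimal sketch of how one can be built with urllib.parse.urlencode. The helper name build_url, the endpoint and the parameter names are hypothetical, and no request is sent.

```python
from urllib import parse

def build_url(base_url, params):
    """Append URL-encoded query parameters to a base URL ending in '?'."""
    return base_url + parse.urlencode(params)

# Hypothetical endpoint, used only to show the encoding; no request is sent
url = build_url('http://example.org/search?', {'q': 'London, UK', 'page': 1})
print(url)  # http://example.org/search?q=London%2C+UK&page=1
```

urlencode escapes characters that are not allowed in a URL: the comma becomes %2C and the space becomes +.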

Example with the Google Maps API

from urllib import parse, request
import json

serviceurl = 'http://maps.googleapis.com/maps/api/geocode/json?'

while True:
    address = input('Enter location (q to quit): ')
    if len(address) < 1 or address.lower() == 'q':  # sentinel value, press q to quit
        break
    url = serviceurl + parse.urlencode({'sensor': 'false', 'address': address})
    print('Retrieving', url)
    uh = request.urlopen(url)
    data = uh.read().decode('utf-8')
    print('Retrieved', len(data), 'characters')
    js = json.loads(data)
    if 'status' not in js or js['status'] != 'OK':
        print('==== Failure To Retrieve ====')
        print(data)
        continue
    lat = js["results"][0]["geometry"]["location"]["lat"]
    lng = js["results"][0]["geometry"]["location"]["lng"]
    print('lat', lat, 'lng', lng)
    location = js['results'][0]['formatted_address']
    print(location)

Example with the Twitter API using the client Tweepy

  1. Navigate to https://apps.twitter.com/
  2. Click the button to create a new application
  3. Enter dummy data
  4. Once the application is created, get the following:
    • consumer_key
    • consumer_secret
    • access_token
    • access_secret

Get tweets with #python

import tweepy

consumer_key = 'get_your_own'
consumer_secret = 'get_your_own'
access_token = 'get_your_own'
access_secret = 'get_your_own'

def main():
    auth = tweepy.auth.OAuthHandler(consumer_key,
                        consumer_secret)
    auth.set_access_token(access_token, access_secret)
    api = tweepy.API(auth)

    tweets = api.search(q='#python')
    for t in tweets:
        print(t.created_at, t.text, '\n')

main()