Web scraping with Python and Beautiful Soup

As part of my final project for my General Assembly Python course I needed to translate American state codes like NY, DE and NJ into the full state names. A quick Google search for a full list of the American states turned up a website that had them all in a table, so I decided to pull them down from the website using Python. Below is how I did it; it's really easy to do and only takes a few lines of code.

# Import the modules
import requests
from bs4 import BeautifulSoup

# Connect and download the web page
webpage = requests.get('https://www.infoplease.com/us/postal-information/state-abbreviations-and-state-postal-codes')

# Pass the content to soup, telling it which parser to use
soup = BeautifulSoup(webpage.text, 'html.parser')

# Find the right table and grab its rows
rows = soup.find("table", {"class": "sgmltable"}).find("tbody").find_all("tr")

# Make the dict to store the items
states = {}

# Loop through the rows in the table and add the key/value pairs to the dictionary
for row in rows:

    cells = row.find_all("td")

    # strip=True trims the stray whitespace that often surrounds table cell text
    full_state = cells[0].get_text(strip=True)
    abbrev = cells[2].get_text(strip=True)
    states[abbrev] = full_state

# Now you have a dictionary of the states and can do whatever you need with that data
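To give a feel for the result, here's a minimal sketch of what the dictionary looks like and how you can use it. The entries below are a hand-typed subset, not the actual scraped output:

```python
# A hand-typed subset of what the scraped dictionary looks like (assumption)
states = {
    "NY": "New York",
    "DE": "Delaware",
    "NJ": "New Jersey",
}

# Look up a full name from a postal code
full_name = states["NY"]  # "New York"

# Use .get() with a default so unknown codes don't raise a KeyError
unknown = states.get("ZZ", "Unknown")  # "Unknown"
```

Using `.get()` with a default is handy here, since scraped data sometimes contains codes you didn't expect.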

Once I had the above I could update my Pandas dataframe and get to work on producing some charts. More on how to do that in the next blog post!
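The dataframe step can be sketched roughly like this, using `Series.map` to look each code up in the dictionary. The column name `state_code` and the sample data are just assumptions for illustration:

```python
import pandas as pd

# A hand-typed subset of the scraped dictionary (assumption, not the full output)
states = {"NY": "New York", "DE": "Delaware", "NJ": "New Jersey"}

# A toy dataframe with a hypothetical 'state_code' column
df = pd.DataFrame({"state_code": ["NY", "NJ", "DE"]})

# Map each code to its full name in a new column
df["state_name"] = df["state_code"].map(states)
```

Any code missing from the dictionary would come through as NaN, which makes gaps easy to spot before charting.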
