_This notebook file is adapted from Melanie Walsh 2021 'Introduction to Cultural Analytics: Mapping' [https://melaniewalsh.github.io/Intro-Cultural-Analytics/Mapping/Mapping.html](https://melaniewalsh.github.io/Intro-Cultural-Analytics/Mapping/Mapping.html). You are encouraged to follow the remainder of her tutorial to learn how to add a custom map background, and how to publish your resulting map to the web._

# Mapping with Python

In this lesson, we're going to learn how to analyze and visualize geographic data.

# Geocoding with GeoPy

First, we're going to geocode data — aka get coordinates from addresses or place names — with the Python package [GeoPy](https://geopy.readthedocs.io/en/stable/#). GeoPy makes it easier to use a range of third-party [geocoding API services](https://geopy.readthedocs.io/en/stable/#), such as Google, Bing, ArcGIS, and OpenStreetMap.

Most of these services require an API key. Nominatim, which uses OpenStreetMap data, does not, so we're going to use it here.

### Install GeoPy

In [1]:
!pip install geopy

Collecting geopy
  Downloading geopy-2.1.0-py3-none-any.whl (112 kB)
Collecting geographiclib<2,>=1.49
  Downloading geographiclib-1.50-py3-none-any.whl (38 kB)
Installing collected packages: geographiclib, geopy
Successfully installed geographiclib-1.50 geopy-2.1.0


### Import Nominatim

From GeoPy's list of possible geocoding services, we're going to import Nominatim:

In [2]:
from geopy.geocoders import Nominatim

### Nominatim & OpenStreetMap

<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/b/b0/Openstreetmap_logo.svg/256px-Openstreetmap_logo.svg.png" border=2 >

Nominatim (which means "by name" in Latin) uses [OpenStreetMap data](https://www.openstreetmap.org/relation/174979) to match addresses with geographic coordinates. Though we don't need an API key to use Nominatim, we do need to create a unique [application name](https://operations.osmfoundation.org/policies/nominatim/).

Here we're initializing Nominatim as a variable called `geolocator`. Change the application name below to your own application name:

In [3]:
geolocator = Nominatim(user_agent="GIVE-A-NAME-HERE-app", timeout=2)

To geocode an address or location, we call the `.geocode()` method:

In [6]:
location = geolocator.geocode("Wellington Street Ottawa")

In [7]:
location

Location(Wellington Street, Centretown, Somerset, (Old) Ottawa, Ottawa, Eastern Ontario, Ontario, K1P 5M4, Canada, (45.4227971, -75.6995593, 0.0))

---

### An Alternative: Google Geocoding API

The Google Geocoding API often returns more accurate matches than Nominatim, but it requires an API key and more setup. To enable the Google Geocoding API and get an API key, see [Get Started with Google Maps Platform](https://developers.google.com/maps/gmp-get-started) and [Get Started with Geocoding API](https://developers.google.com/maps/documentation/geocoding/start).

**If you want to just continue without mucking about with Google, skip down to the heading 'Continue'**

In [5]:
# once you get an API Key from Google, you would uncomment the three lines below,
# inserting the API key into the appropriate spot in the second line

#from geopy.geocoders import GoogleV3
#google_geolocator = GoogleV3(api_key="YOUR-API-KEY HERE")
#google_geolocator.geocode("Wellington Street")

### Get Address

In [8]:
print(location.address)

Wellington Street, Centretown, Somerset, (Old) Ottawa, Ottawa, Eastern Ontario, Ontario, K1P 5M4, Canada


### Get Latitude and Longitude

In [9]:
print(location.latitude, location.longitude)

45.4227971 -75.6995593


### Get "Importance" Score

In [10]:
print(f"Importance: {location.raw['importance']}")

Importance: 0.4


### Get Class and Type

In [11]:
print(f"Class: {location.raw['class']} \nType: {location.raw['type']}")

Class: highway 
Type: residential


---

## Continue...

### Get Multiple Possible Matches

In [8]:
possible_locations = geolocator.geocode("Wellington Street", exactly_one=False)

for location in possible_locations:
    print(location.address)
    print(location.latitude, location.longitude)
    print(f"Importance: {location.raw['importance']}")

Wellington Street, Centretown, Somerset, (Old) Ottawa, Ottawa, Eastern Ontario, Ontario, K1P 5C7, Canada
45.4227971 -75.6995593
Importance: 0.5492354726566488
Wellington Street, Centretown, Somerset, (Old) Ottawa, Ottawa, Eastern Ontario, Ontario, K1A 0A3, Canada
45.4234495 -75.6980579
Importance: 0.5492354726566488
Wellington Street, Centretown, Somerset, (Old) Ottawa, Ottawa, Eastern Ontario, Ontario, K1P 5M4, Canada
45.4227971 -75.6995593
Importance: 0.5492354726566488
Wellington Street, Newtown, Widnes, Halton, North West England, England, WA8 0BF, United Kingdom
53.3568398 -2.7350504
Importance: 0.3
Wellington Street, Newtown, Widnes, Halton, North West England, England, WA8 0QD, United Kingdom
53.3565607 -2.7353103
Importance: 0.3
Wellington Street, Modelpark, Emalahleni Ward 24, eMalahleni, Emalahleni Local Municipality, Nkangala, Mpumalanga, South Africa
-25.8653092 29.25061099581582
Importance: 0.3
Wellington Street, Mascot, Sydney, Bayside Council, New South Wales, 2020, Aust

In [10]:
location = geolocator.geocode("Wellington Street, Ottawa Ontario")

print(location.address)
print(location.latitude, location.longitude)
print(f"Importance: {location.raw['importance']}")

Wellington Street, Centretown, Somerset, (Old) Ottawa, Ottawa, Eastern Ontario, Ontario, K1P 5M4, Canada
45.4227971 -75.6995593
Importance: 0.7692354726566488
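
When a query returns several candidates, one option is to rank them by their `importance` value rather than trusting the first hit. The sketch below is illustrative only: `best_match` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins for GeoPy `Location` results, using addresses and scores copied from the output above.

```python
from types import SimpleNamespace


def best_match(candidates, must_contain=None):
    """Return the candidate with the highest importance score, or None.

    `candidates` is expected to look like the list returned by
    geolocator.geocode(..., exactly_one=False), which is None when nothing matches.
    """
    if not candidates:
        return None
    if must_contain is not None:
        needle = must_contain.lower()
        candidates = [c for c in candidates if needle in c.address.lower()]
    # default=None covers the case where the filter removed every candidate
    return max(candidates, key=lambda c: c.raw.get("importance", 0.0), default=None)


# Stand-ins for GeoPy Location objects (values copied from the output above)
sample_results = [
    SimpleNamespace(
        address="Wellington Street, Newtown, Widnes, Halton, North West England, England, WA8 0BF, United Kingdom",
        raw={"importance": 0.3},
    ),
    SimpleNamespace(
        address="Wellington Street, Centretown, Somerset, (Old) Ottawa, Ottawa, Eastern Ontario, Ontario, K1P 5C7, Canada",
        raw={"importance": 0.5492354726566488},
    ),
]

top = best_match(sample_results)
uk_only = best_match(sample_results, must_contain="United Kingdom")
```

With real results, `best_match(possible_locations, must_contain="Canada")` would do the same job. Making the query itself more specific, as in the cell above, is usually the simpler fix.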


## Geocode with Pandas

To geocode every location in a CSV file, we can use Pandas, make a Python function, and `.apply()` it to every row in the CSV file.

In [18]:
# you might need to uncomment the next line and run it to install pandas
# pandas is a package that lets you work with tabular data etc

#!pip install pandas

import pandas as pd
pd.set_option("display.max_rows", 400)
pd.set_option("display.max_colwidth", 400)

Here we make a function with `geolocator.geocode()` and ask it to return the address, lat/lon, and importance score:

In [19]:
def find_location(row):
    # Look up the place name in this row with Nominatim
    place = row['place']
    location = geolocator.geocode(place)

    # geocode() returns None when nothing matches
    if location is not None:
        return location.address, location.latitude, location.longitude, location.raw['importance']
    else:
        return "Not Found", "Not Found", "Not Found", "Not Found"
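
Because `.apply()` calls the geocoder once per row, a long CSV can send many requests in quick succession, and Nominatim's usage policy asks for no more than one request per second. GeoPy provides `geopy.extra.rate_limiter.RateLimiter` for throttling. The sketch below shows the same idea with the standard library only. `make_polite_geocoder` is a hypothetical helper, not part of GeoPy, and it also adds a simple in-memory cache so that repeated place names are looked up only once.

```python
import time


def make_polite_geocoder(geocode, min_delay_seconds=1.0, sleep=time.sleep, clock=time.monotonic):
    """Wrap a geocode callable with a cache and a minimum delay between requests."""
    cache = {}
    last_request = None

    def polite_geocode(query):
        nonlocal last_request
        if query in cache:  # repeated place names cost no extra request
            return cache[query]
        if last_request is not None:
            wait = min_delay_seconds - (clock() - last_request)
            if wait > 0:
                sleep(wait)
        result = geocode(query)
        last_request = clock()
        cache[query] = result
        return result

    return polite_geocode
```

With the objects from this notebook, that might look like `polite = make_polite_geocoder(geolocator.geocode)`, with `find_location` calling `polite(place)` instead of `geolocator.geocode(place)`.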

To start exploring, let's read in a CSV file with a list of places in and around Ottawa.

In [23]:
ottawa_df = pd.read_csv("ottawa-places.csv")

In [24]:
ottawa_df

Unnamed: 0,place
0,Carleton University
1,University of Ottawa
2,Senate of Canada
3,Chateau Laurier
4,Algonquin College
5,Bayshore Mall
6,Britannia Beach Ottawa River


Now let's `.apply()` our function to this Pandas dataframe and see what results Nominatim's geocoding service spits out.

In [25]:
ottawa_df[['address', 'lat', 'lon', 'importance']] = ottawa_df.apply(find_location, axis="columns", result_type="expand")
ottawa_df

Unnamed: 0,place,address,lat,lon,importance
0,Carleton University,"Carleton University, 1125, Colonel By Drive, Capital, (Old) Ottawa, Ottawa, Eastern Ontario, Ontario, K1S 5B7, Canada",45.385858,-75.695004,0.698423
1,University of Ottawa,"University of Ottawa, 75, Laurier Avenue East, Sandy Hill, Rideau-Vanier, (Old) Ottawa, Ottawa, Eastern Ontario, Ontario, K1N 6N5, Canada",45.422527,-75.68339,0.822764
2,Senate of Canada,"Senate, Reno No. 51, Saskatchewan, Canada",49.273746,-109.705258,0.345
3,Chateau Laurier,"Fairmont Château Laurier, 1, Rideau Street, Byward Market, Lowertown, Rideau-Vanier, (Old) Ottawa, Ottawa, Eastern Ontario, Ontario, K1N 8W5, Canada",45.425601,-75.695253,0.448807
4,Algonquin College,"Algonquin College, 1385, Woodroffe Avenue, Centrepointe, College, Nepean, Ottawa, Eastern Ontario, Ontario, K2G 1V8, Canada",45.34952,-75.75572,0.577866
5,Bayshore Mall,"Bayshore Mall, 3300, Broadway Street, Broadway, Eureka, Humboldt County, California, 95502, United States",40.779649,-124.190233,0.559973
6,Britannia Beach Ottawa River,"Britannia Beach, Bay, Ottawa, (Old) Ottawa, Ottawa, Eastern Ontario, Ontario, Canada",45.365779,-75.801067,0.51


**What do you notice about these results?** ☝️☝️☝️
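
One way to check results like these automatically is to measure how far each one lies from the area you expect. The sketch below uses the haversine formula with the standard library. The helper names, the 50 km cutoff, and the reference point (the map centre used later in this lesson) are assumptions chosen for illustration; the coordinates are copied from the table above.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in kilometres."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def far_from(center, places, max_km=50):
    """Return the names of places farther than max_km from center."""
    return [name for name, lat, lon in places
            if haversine_km(center[0], center[1], lat, lon) > max_km]


ottawa_center = (45.42, -75.69)
geocoded = [
    ("Carleton University", 45.385858, -75.695004),
    ("Senate of Canada", 49.273746, -109.705258),
    ("Chateau Laurier", 45.425601, -75.695253),
    ("Bayshore Mall", 40.779649, -124.190233),
    ("Britannia Beach Ottawa River", 45.365779, -75.801067),
]

suspicious = far_from(ottawa_center, geocoded)
print(suspicious)
```

Any name that is flagged is a candidate for a more specific query, for example by adding "Ottawa" to the place name.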

# Making Interactive Maps

To map our geocoded coordinates, we're going to use the Python library [Folium](https://python-visualization.github.io/folium/). Folium is built on top of the popular JavaScript library [Leaflet](https://leafletjs.com/).

To install and import Folium, run the cells below:

In [11]:
!pip install folium

Collecting folium
  Downloading folium-0.12.1-py2.py3-none-any.whl (94 kB)
Collecting branca>=0.3.0
  Downloading branca-0.4.2-py3-none-any.whl (24 kB)
Collecting requests
  Using cached requests-2.25.1-py2.py3-none-any.whl (61 kB)
Collecting chardet<5,>=3.0.2
  Using cached chardet-4.0.0-py2.py3-none-any.whl (178 kB)
Collecting urllib3<1.27,>=1.21.1
  Downloading urllib3-1.26.4-py2.py3-none-any.whl (153 kB)
Collecting idna<3,>=2.5
  Using cached idna-2.10-py2.py3-none-any.whl (58 kB)
Installing collected packages: branca, chardet, urllib3, idna, requests, folium
Successfully installed branca-0.4.2 chardet-4.0.0 folium-0.12.1 idna-2.10 requests-2.25.1 urllib3-1.26.4


In [12]:
import folium

### Base Map

First, we need to establish a base map. This is where we'll map our geocoded Ottawa locations. To do so, we're going to call `folium.Map()` and enter the general latitude/longitude coordinates of the Ottawa area at a particular zoom level.

(To find latitude/longitude coordinates for a particular location, you can use Google Maps, [as described here](https://support.google.com/maps/answer/18539?co=GENIE.Platform%3DDesktop&hl=en).)

In [28]:
ottawa_map = folium.Map(location=[45.42, -75.69], zoom_start=12)
ottawa_map
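
Rather than typing the centre by hand, the starting view can also be derived from the geocoded points themselves. This is a sketch with a hypothetical helper, `center_and_bounds`; the coordinates are the Ottawa-area rows from the geocoding table earlier in this lesson. The bounds are returned as two `[lat, lon]` corners, southwest then northeast, which is the form Folium's `fit_bounds` method accepts.

```python
def center_and_bounds(points):
    """Return ([mean_lat, mean_lon], [[south, west], [north, east]]) for (lat, lon) pairs.

    A plain average is a reasonable centre for a small area such as one city.
    """
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    center = [sum(lats) / len(lats), sum(lons) / len(lons)]
    bounds = [[min(lats), min(lons)], [max(lats), max(lons)]]
    return center, bounds


ottawa_points = [
    (45.385858, -75.695004),  # Carleton University
    (45.422527, -75.68339),   # University of Ottawa
    (45.425601, -75.695253),  # Chateau Laurier
    (45.34952, -75.75572),    # Algonquin College
    (45.365779, -75.801067),  # Britannia Beach
]

center, bounds = center_and_bounds(ottawa_points)
```

`folium.Map(location=center, zoom_start=12)` would then start centred on the data, and `ottawa_map.fit_bounds(bounds)` would zoom the map to include every point.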

### Add a Marker

Adding a marker to a map is easy with Folium! We'll simply call `folium.Marker()` at a particular lat/lon, enter some text to display when the marker is clicked on, and then add it to our base map.

In [29]:
folium.Marker(location=[45.385858, -75.695004], popup="Crafting Digital History!").add_to(ottawa_map)
ottawa_map

### Add Markers From Pandas Data

To add markers for every location in our Pandas dataframe, we can make a Python function and `.apply()` it to every row in the dataframe.

In [30]:
def create_map_markers(row, map_name):
    folium.Marker(location=[row['lat'], row['lon']], popup=row['place']).add_to(map_name)

Before we apply this function to our dataframe, we're going to drop any locations that were "Not Found" (which would cause `folium.Marker()` to raise an error).

In [32]:
found_ottawa_locations = ottawa_df[ottawa_df['address'] != "Not Found"]

In [33]:
found_ottawa_locations.apply(create_map_markers, map_name=ottawa_map, axis='columns')
ottawa_map

### Save Map

In [26]:
ottawa_map.save("Ottawa-map.html")

## Torn Apart / Separados

The data in this section was drawn from [Torn Apart / Separados Project](https://github.com/xpmethod/torn-apart-open-data). It maps the locations of Immigration and Customs Enforcement (ICE) detention facilities, as featured in [Volume 1](http://xpmethod.plaintext.in/torn-apart/volume/1/).

Go to [https://github.com/melaniewalsh/Intro-Cultural-Analytics/tree/master/book/data](https://github.com/melaniewalsh/Intro-Cultural-Analytics/tree/master/book/data) to get the data files for this next section OR insert the following string pattern into the file names as appropriate to load directly from the web:

For example, where the code says `ICE_df = pd.read_csv("../data/ICE-facilities.csv")`

you'd change that to

`ICE_df = pd.read_csv("https://raw.githubusercontent.com/melaniewalsh/Intro-Cultural-Analytics/master/book/data/ICE-facilities.csv")`
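
The same substitution can be written as a small helper so that the base URL is typed only once. `raw_data_url` is a hypothetical name, the base URL is the one given above, and the sketch assumes each file sits directly in that `data` folder.

```python
RAW_DATA_BASE = "https://raw.githubusercontent.com/melaniewalsh/Intro-Cultural-Analytics/master/book/data/"


def raw_data_url(path):
    """Turn a local path such as '../data/ICE-facilities.csv' into its raw GitHub URL."""
    filename = path.rsplit("/", 1)[-1]  # keep only the file name
    return RAW_DATA_BASE + filename


ice_url = raw_data_url("../data/ICE-facilities.csv")
```

`pd.read_csv(ice_url)` would then load the file without a local copy.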

### Add a Circle Marker

There are a few [different kinds of markers](https://python-visualization.github.io/folium/quickstart.html#Markers) that we can add to a Folium map, including circles. To make a circle, we can call `folium.CircleMarker()` with a particular radius and the option to fill in the circle. You can explore more customization options in the [Folium documentation](https://python-visualization.github.io/folium/modules.html#folium.vector_layers.CircleMarker). We're also going to add a hover `tooltip` in addition to a `popup`.

In [38]:
def create_ICE_map_markers(row, map_name):
    # Draw one filled circle per facility, with the same text on hover and on click
    folium.CircleMarker(
        location=[row['lat'], row['lon']],
        radius=10,  # in pixels; this is also Folium's default size
        fill=True,
        popup=folium.Popup(f"{row['Name'].title()} <br> {row['City'].title()}, {row['State']}", max_width=200),
        tooltip=f"{row['Name'].title()} <br> {row['City'].title()}, {row['State']}"
    ).add_to(map_name)
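
The popup and the tooltip above build the same string twice, and `.title()` fails on a missing value because Pandas stores it as a float `NaN` rather than a string. A minimal sketch of a shared label helper follows; `facility_label` is a hypothetical name and the "Unknown" fallback is an assumption.

```python
def facility_label(name, city, state):
    """Build the HTML label used for both the popup and the tooltip."""
    def clean(value, transform=str.title):
        # Missing values from Pandas arrive as float NaN rather than str
        return transform(value) if isinstance(value, str) else "Unknown"

    return f"{clean(name)} <br> {clean(city)}, {clean(state, str.upper)}"


label = facility_label("STEWART DETENTION CENTER", "LUMPKIN", "GA")
```

Inside `create_ICE_map_markers`, `label = facility_label(row['Name'], row['City'], row['State'])` could then be passed to both `folium.Popup(label, max_width=200)` and `tooltip=label`.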

In [35]:
ICE_df = pd.read_csv("https://raw.githubusercontent.com/melaniewalsh/Intro-Cultural-Analytics/master/book/data/ICE-facilities.csv")
ICE_df

Unnamed: 0,lat,lon,adpSum,onWeb,Flags,fulladdr,DETLOC,Name,Address,City,...,ICE.Threat.Level.2,ICE.Threat.Level.3,No.ICE.Threat.Level,Facility.Operator,FY17.Calendar.Days.in.Use,FY17...of.Days.in.Use,FY17.Total.Mandays,FY17.Max.Pop.Count,geocodelat,geocodelon
0,28.895000,-99.121200,8391,28.8950,,566 VETERANS DRIVE PEARSALL TX 78061,STCDFTX,SOUTH TEXAS DETENTION COMPLEX,566 VETERANS DRIVE,PEARSALL,...,112,311,1187,GEO,372,1.02,598554,1854,28.896498,-99.116863
1,32.036600,-84.771800,8004,32.0366,,146 CCA ROAD LUMPKIN GA 31815,STWRTGA,STEWART DETENTION CENTER,146 CCA ROAD,LUMPKIN,...,344,365,671,CCA,372,1.02,671515,1992,32.037982,-84.772465
2,34.559200,-117.441000,7265,34.5592,,10250 RANCHO ROAD ADELANTO CA 92301,ADLNTCA,ADELANTO ICE PROCESSING CENTER,10250 RANCHO ROAD,ADELANTO,...,206,164,726,GEO,372,1.02,625414,1918,34.557721,-117.442524
3,32.817700,-111.520000,7096,32.8177,,1705 EAST HANNA RD. ELOY AZ 85131,EAZ,ELOY FEDERAL CONTRACT FACILITY,1705 EAST HANNA RD.,ELOY,...,154,232,785,CCA,372,1.02,502952,1489,32.821231,-111.549772
4,47.249100,-122.421000,6757,47.2491,,1623 E. J STREET TACOMA WA 98421,CSCNWWA,NORTHWEST DETENTION CENTER,1623 E. J STREET,TACOMA,...,166,174,693,GEO,372,1.02,519386,1563,47.250214,-122.422746
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
481,39.671492,-75.714329,1,0.0000,,970 BROAD STREET NEWARK NJ 7102,NEWHOLD,NEW/INS OS HOLD ROOM,970 BROAD STREET,NEWARK,...,0,0,1,FEDERAL,37,0.10,50,2,39.671492,-75.714329
482,26.204563,-98.270145,1,0.0000,,"BENTSEN TOWER, 1701 W BUS HWY 83 MCALLEN TX 78501",USMS3TX,"US MARSHALS (SOUTH DISTRICT, TEXAS)","BENTSEN TOWER, 1701 W BUS HWY 83",MCALLEN,...,0,0,0,FEDERAL,0,0.00,0,0,26.204563,-98.270145
483,41.528728,-73.363545,1,0.0000,,BRIDGEWATER STATE HOSPITAL BRIDGEWATER MA 2324,MABSHOS,BRIDGEWATER STATE HOSPITAL,BRIDGEWATER STATE HOSPITAL,BRIDGEWATER,...,0,0,0,HOSPITAL,0,0.00,0,0,41.528728,-73.363545
484,,,1,0.0000,,Redacted Redacted Redacted Redacted,Redacted,Redacted,Redacted,Redacted,...,0,0,0,ORR,17,0.05,17,1,,


In [40]:
US_map = folium.Map(location=[42, -102], zoom_start=4)
US_map

In [41]:
ICE_df = ICE_df.dropna(subset=['lat', 'lon'])

In [42]:
ICE_df.apply(create_ICE_map_markers, map_name=US_map, axis="columns")
US_map

## Choropleth Maps

```margin Choropleth Map
Choropleth map = a map where areas are shaded according to a value
```

The data in this section was drawn from [Torn Apart / Separados Project](https://github.com/xpmethod/torn-apart-open-data). This data maps the "cumulative ICE awards since 2014 to contractors by congressional district," as featured in [Volume 2](http://xpmethod.plaintext.in/torn-apart/volume/2/).

To create a choropleth map with Folium, we need to pair a "geo.json" file (which indicates which parts of the map to shade) with a CSV file (which includes the variable that we want to shade by).

The following data was drawn from [the Torn Apart / Separados project](https://github.com/xpmethod/torn-apart/tree/master/data/districts).

In [43]:
US_districts_geo_json = "../data/ICE_money_districts.geo.json"

In [44]:
US_districts_csv = pd.read_csv("../data/ICE_money_districts.csv")

In [45]:
US_districts_csv = US_districts_csv.dropna(subset=['districtName', 'representative'])

In [46]:
US_districts_csv

Unnamed: 0,id,id2,state,districtNumber,districtName,party,district_url,representative,representative_photo_url,total_awards
0,5001500US0101,101,Alabama,1,ta-ordinal-st-m,republican,https://en.wikipedia.org/wiki/Alabama%27s_1st_congressional_district,Bradley Byrne,https://upload.wikimedia.org/wikipedia/commons/7/71/Rep_Bradley_Byrne_%28cropped%29.jpg,0.00
1,5001500US0102,102,Alabama,2,ta-ordinal-nd-m,republican,https://en.wikipedia.org/wiki/Alabama%27s_2nd_congressional_district,Martha Roby,https://upload.wikimedia.org/wikipedia/commons/5/55/Martha_roby_113_congressional_portrait_%28cropped%29.jpg,38577.40
2,5001500US0103,103,Alabama,3,ta-ordinal-rd-m,republican,https://en.wikipedia.org/wiki/Alabama%27s_3rd_congressional_district,Mike Rogers,https://upload.wikimedia.org/wikipedia/commons/e/ee/Mike_Rogers_official_photo_%28cropped%29.jpg,0.00
3,5001500US0104,104,Alabama,4,ta-ordinal-th-m,republican,https://en.wikipedia.org/wiki/Alabama%27s_4th_congressional_district,Robert Aderholt,https://upload.wikimedia.org/wikipedia/commons/9/9f/Rep._Robert_B._Aderholt_%28cropped%29.jpg,171873.55
4,5001500US0105,105,Alabama,5,ta-ordinal-th-m,republican,https://en.wikipedia.org/wiki/Alabama%27s_5th_congressional_district,Mo Brooks,https://upload.wikimedia.org/wikipedia/commons/b/b6/Mo_Brooks_Portrait_%28cropped%29.jpg,40346.00
...,...,...,...,...,...,...,...,...,...,...
432,5001500US5506,5506,Wisconsin,6,ta-ordinal-th-m,republican,https://en.wikipedia.org/wiki/Wisconsin%27s_6th_congressional_district,Glenn Grothman,https://upload.wikimedia.org/wikipedia/commons/1/16/Glenn_Grothman_official_congressional_photo_%28cropped%29.jpg,3242401.61
433,5001500US5507,5507,Wisconsin,7,ta-ordinal-th-m,republican,https://en.wikipedia.org/wiki/Wisconsin%27s_7th_congressional_district,Sean Duffy,https://upload.wikimedia.org/wikipedia/commons/d/d7/Sean_Duffy_official_congressional_photo_%28cropped%29.jpg,32698.55
434,5001500US5508,5508,Wisconsin,8,ta-ordinal-th-m,republican,https://en.wikipedia.org/wiki/Wisconsin%27s_8th_congressional_district,Mike Gallagher,https://upload.wikimedia.org/wikipedia/commons/a/ad/Mike_Gallagher_Official_Portrait_2017_%28cropped%29.png,237392.73
435,5001500US5600,5600,Wyoming,0,ta-at-large-district,republican,https://en.wikipedia.org/wiki/Wyoming%27s_at-large_congressional_district,Liz Cheney,https://upload.wikimedia.org/wikipedia/commons/d/dd/Liz_Cheney_official_portrait.jpg,0.00


In [47]:
US_map = folium.Map(location=[42, -102], zoom_start=4)

folium.Choropleth(
    geo_data = US_districts_geo_json,
    name = 'choropleth',
    data = US_districts_csv,
    columns = ['districtName', 'total_awards'],
    key_on = 'feature.properties.districtName',
    fill_color = 'GnBu',
    line_opacity = 0.2,
    legend_name= 'Total ICE Money Received'
).add_to(US_map)

US_map
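
`key_on` is a path into each GeoJSON feature, and the value found there has to match an entry in the first column listed in `columns`; areas without a match cannot be shaded by value. The sketch below checks that pairing with the standard library. The tiny GeoJSON dictionary and CSV text are made-up stand-ins, not the project's data, and `unmatched_keys` is a hypothetical helper.

```python
import csv
import io


def unmatched_keys(geojson, csv_rows, property_name, column_name):
    """Return (keys only in the GeoJSON, keys only in the CSV), each sorted."""
    geo_keys = {feature["properties"][property_name] for feature in geojson["features"]}
    csv_keys = {row[column_name] for row in csv_rows}
    return sorted(geo_keys - csv_keys), sorted(csv_keys - geo_keys)


# Made-up miniature inputs for illustration only
toy_geojson = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"districtName": "A-1"}, "geometry": None},
        {"type": "Feature", "properties": {"districtName": "A-2"}, "geometry": None},
    ],
}
toy_csv = io.StringIO("districtName,total_awards\nA-1,100.0\nA-3,250.5\n")

only_geo, only_csv = unmatched_keys(
    toy_geojson, list(csv.DictReader(toy_csv)), "districtName", "districtName"
)
```

Here `only_geo` lists areas that have no data row and `only_csv` lists data rows that have no area. Running the same check on the real files, after loading the GeoJSON with `json.load`, is a quick way to spot key mismatches before drawing the map.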

### Add a Tooltip to Choropleth

In [48]:
tooltip = folium.features.GeoJson(
    US_districts_geo_json,
    tooltip=folium.features.GeoJsonTooltip(
        fields=['representative', 'state', 'party', 'total_value'],
        localize=True,
    ),
)
US_map.add_child(tooltip)
US_map