This short tutorial describes how I collected, managed, and visualized the City of Vancouver's crime data using React, Mapbox Studio, and Mapbox GL JS. I took on this project as a way to learn React, so the tutorial is aimed at readers who know the basics of React; the code is relatively simple, though, so beginners should be able to follow it. The code is accessible here.
In 2017, the City of Vancouver released new crime data on its open data catalogue. I wanted to learn React and found no existing interactive visualizations of Vancouver crime, so this was a perfect opportunity to build an interactive map with React and Mapbox GL JS. The following discusses how I collected, managed, and visualized the data.
Data Collection
I wanted to incorporate an interactive map into the data visualization, so I downloaded data that was explicitly georeferenced with coordinates from two sources:
- City of Vancouver: crime data 2003-2016 (14 shapefiles: crime_shp_[year].shp), local area boundaries (local_area_boundaries.shp), and parks (parks_shp.shp).
- OpenStreetMap
After downloading crime_shp, I reviewed the metadata to help brainstorm how I would visualize the data on an interactive map.
crime_shp stored geographic points with the following attributes/properties: neighbourhood of the crime, type of crime, address of the crime (without the address number to maintain privacy), year of crime, month of crime, day of crime, hour of crime, and minute of crime.
With these attributes, I wanted to incorporate filters by year and type of crime. I also wanted practice developing a choropleth map with Mapbox GL JS, so I decided to calculate the total number of crimes by neighbourhood. I needed the geographic locations of the Vancouver neighbourhoods, so I downloaded local_area_boundaries.
local_area_boundaries stored polygons with the geographic coordinates and names of each neighbourhood in Vancouver. I also downloaded parks_shp because local_area_boundaries did not include Stanley Park, which was considered a neighbourhood in crime_shp.
After brainstorming and sketching, I had a general idea of how I wanted to develop the interactive map. I wanted users to be able to control the year and type of crime visible on the map. I also wanted users to see a choropleth map of crime counts by Vancouver neighbourhood for the most recent year, 2016. With this in mind, I decided to make the interactive map the focal point and add a side panel to hold the filters users could interact with.
I have other plans for visualizing the data; for example, since the metadata stores the date of each crime, I would like to add a time slider that shows how the type of crime relates to time.
Data Management
For the crime data 2003-2016, I originally downloaded the JSON because I prefer to handle the data as GeoJSON, but I had a difficult time reprojecting the JSON with QGIS and ogr2ogr, so I downloaded all the shapefiles instead. When I tried to convert the JSON into a GeoJSON using the '... Save as' method in QGIS, the GeoJSON would state that the CRS was EPSG:4326 (WGS 84), but the coordinates were in metres, not decimal degrees.
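The mismatch described above can be caught with a quick range check on the exported file. The following is a minimal sketch, not the method I used at the time: the function names and the two sample coordinate pairs are made up for illustration, and the check is only a heuristic, since projected coordinates close to the origin would also pass.

```javascript
// Rough sanity check: does an [x, y] pair fall inside the valid
// longitude/latitude range for WGS 84?
function looksLikeLonLat([x, y]) {
  return Math.abs(x) <= 180 && Math.abs(y) <= 90;
}

// True only if every Point feature in the collection passes the range check.
// Features with other geometry types are ignored in this sketch.
function pointsLookLikeLonLat(featureCollection) {
  return featureCollection.features
    .filter((f) => f.geometry && f.geometry.type === 'Point')
    .every((f) => looksLikeLonLat(f.geometry.coordinates));
}

// Illustrative values only (not taken from the dataset): one pair in
// decimal degrees and one easting/northing pair in metres.
const inDegrees = {
  type: 'FeatureCollection',
  features: [
    { type: 'Feature', properties: {}, geometry: { type: 'Point', coordinates: [-123.12, 49.28] } },
  ],
};
const inMetres = {
  type: 'FeatureCollection',
  features: [
    { type: 'Feature', properties: {}, geometry: { type: 'Point', coordinates: [491000, 5458000] } },
  ],
};

console.log(pointsLookLikeLonLat(inDegrees)); // true
console.log(pointsLookLikeLonLat(inMetres)); // false
```

A file that claims EPSG:4326 but fails this check has most likely kept its projected coordinates and only had its CRS label changed.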
To avoid repeating each step for each shapefile, I merged all the shapefiles into one with ogr2ogr. In Terminal (I use a Mac), I entered the following command, which creates a new merge file from the first shapefile:
ogr2ogr -f 'ESRI Shapefile' merge.shp crime_shp_2003.shp
With a new merge shapefile (merge), I then ran the following command once for every remaining shapefile (crime_shp_2004.shp, crime_shp_2005.shp...) that I wanted to append to it:
ogr2ogr -f 'ESRI Shapefile' -update -append merge.shp crime_shp_[year].shp -nln merge
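Typing the append command thirteen times is error-prone, so the list can be generated instead. This is a minimal Node.js sketch that only builds and prints the commands; it does not run ogr2ogr. The function name buildAppendCommands is hypothetical, and the file names assume the crime_shp_[year].shp pattern listed under Data Collection.

```javascript
// Build one ogr2ogr append command per year. merge.shp is assumed to
// exist already, created from the 2003 shapefile.
function buildAppendCommands(firstYear, lastYear) {
  const commands = [];
  for (let year = firstYear; year <= lastYear; year += 1) {
    commands.push(
      `ogr2ogr -f "ESRI Shapefile" -update -append merge.shp crime_shp_${year}.shp -nln merge`
    );
  }
  return commands;
}

// 2004 through 2016: the years still to be appended after 2003.
const commands = buildAppendCommands(2004, 2016);
console.log(commands.join('\n'));
```

The printed lines can be pasted into Terminal or piped to a shell once they have been checked by eye.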
I initially wanted to use QGIS's merge data management tool, but it took 3 hours to complete only 50% of the process. The ogr2ogr method in Terminal took about 2 minutes.
With the merged shapefile, I then had to convert the shapefile into a GeoJSON so that I could use tippecanoe to convert the GeoJSON into MBTiles. I chose MBTiles over raw GeoJSON because a tileset is split into tiles, so the browser only loads the data for the area and zoom level in view, which generally renders faster for a large point dataset.
I converted the shapefile into a GeoJSON in QGIS. All you have to do is click '... Save as' and then select the file type (GeoJSON), the location where the GeoJSON should be saved, and the geographic coordinate system. tippecanoe expects its input in EPSG:4326 (WGS 84), so I made sure my GeoJSON used that CRS.
Once I had the GeoJSON (vancouver_crime), I then used tippecanoe in Terminal:
tippecanoe --layer vancouver_crime -o vancouver_crime.mbtiles --minimum-zoom=11 --maximum-zoom=20 < vancouver_crime.geojson
My output was an MBTiles file (vancouver_crime.mbtiles) that stored all the crime vector points from 2003 to 2016.
The next step was to create the data for the choropleth map, which I did in QGIS. First, I had to merge local_area_boundaries with a polygon of Stanley Park. I extracted the Stanley Park polygon from parks_shp by using the Filter tool ("NAME" = 'STANLEY PARK') and saving the output as a new layer. Then I used Vector > Data Management Tools > Merge Vector Layers to merge the extracted Stanley Park polygon with local_area_boundaries. Lastly, I used the Vector > Data Management Tools > Join Attributes by Location tool to calculate basic statistics, including the crime count, within each neighbourhood boundary. My inputs for 'Join Attributes by Location' were the vancouver_crime GeoJSON and local_area_boundaries.
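To make the count step of that spatial join concrete, here is a minimal sketch of counting points per polygon with a ray-casting point-in-polygon test. It is an illustration of the idea, not what QGIS runs internally: it handles only simple polygons given as a single outer ring (no holes, no multipolygons), and the neighbourhood names and coordinates are invented.

```javascript
// Ray casting: count how many ring edges a horizontal ray from the point
// crosses; an odd number of crossings means the point is inside.
function pointInRing([x, y], ring) {
  let inside = false;
  for (let i = 0, j = ring.length - 1; i < ring.length; j = i, i += 1) {
    const [xi, yi] = ring[i];
    const [xj, yj] = ring[j];
    const crosses =
      (yi > y) !== (yj > y) &&
      x < ((xj - xi) * (y - yi)) / (yj - yi) + xi;
    if (crosses) inside = !inside;
  }
  return inside;
}

// Count the points that fall inside each named polygon.
function countByNeighbourhood(points, neighbourhoods) {
  const counts = {};
  neighbourhoods.forEach(({ name, ring }) => {
    counts[name] = points.filter((p) => pointInRing(p, ring)).length;
  });
  return counts;
}

// Invented example: two adjacent squares and three points.
const neighbourhoods = [
  { name: 'A', ring: [[0, 0], [2, 0], [2, 2], [0, 2], [0, 0]] },
  { name: 'B', ring: [[2, 0], [4, 0], [4, 2], [2, 2], [2, 0]] },
];
const points = [[1, 1], [0.5, 1.5], [3, 1]];

console.log(countByNeighbourhood(points, neighbourhoods)); // { A: 2, B: 1 }
```

For the full dataset a GIS tool with a spatial index is the better choice, because this sketch tests every point against every polygon.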
Once I had the statistical crime data (count, mean, mode, median) for each neighbourhood represented in vancouver_crime, I followed the same steps as above to convert the shapefile into MBTiles (vancouver-crime-nhoods.mbtiles).
Data Visualization
To avoid slowing the browser with too much JavaScript, I decided to import the MBTiles (vancouver_crime.mbtiles and vancouver-crime-nhoods.mbtiles) into Mapbox Studio and then use Mapbox GL JS to interact with my custom style layer's data.
In Mapbox Studio, I imported vancouver_crime.mbtiles as a new layer on my custom map style and then created new layers from the vancouver_crime layer by filtering the data by year. In other words, I created one layer of crime data for each year between 2003 and 2016.
After creating and designing all the layers, the next step was to import my custom map style into React. I published the Mapbox style, copied the access token, and then reviewed the examples in the Mapbox react-examples repo and the POI blog post. These sources were useful for setting up the React infrastructure needed to use Mapbox GL JS with React.
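With one layer per year, the year and crime-type controls come down to two Mapbox GL JS calls: map.setLayoutProperty to toggle layer visibility and map.setFilter to narrow a layer by type. The sketch below shows one way to wire that up. The layer id scheme ("crime-2016"), the property names YEAR and TYPE, and the assumption that YEAR is stored as a number are all placeholders, so substitute the ids and fields from your own style.

```javascript
// Years covered by the dataset.
const YEARS = [];
for (let y = 2003; y <= 2016; y += 1) YEARS.push(y);

// Hypothetical naming scheme for the per-year layers in the Studio style.
function layerIdForYear(year) {
  return `crime-${year}`;
}

// Show only the selected year's layer and hide all the others.
function showYear(map, selectedYear) {
  YEARS.forEach((year) => {
    map.setLayoutProperty(
      layerIdForYear(year),
      'visibility',
      year === selectedYear ? 'visible' : 'none'
    );
  });
}

// setFilter replaces a layer's whole filter, so the year condition is
// rebuilt here alongside the optional crime-type condition.
// "YEAR" and "TYPE" are assumed property names.
function buildFilter(year, crimeType) {
  const yearFilter = ['==', ['get', 'YEAR'], year];
  if (crimeType === null) return yearFilter;
  return ['all', yearFilter, ['==', ['get', 'TYPE'], crimeType]];
}

// Apply the combined filter; pass null as crimeType to show every type.
function filterByType(map, year, crimeType) {
  map.setFilter(layerIdForYear(year), buildFilter(year, crimeType));
}
```

In the React component, these functions would be called from the side panel's change handlers, for example showYear(this.map, 2016) once the map's load event has fired.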
I then used this code as a frame (index.js) to develop the interactive map.
I first added the map to componentDidMount():
Then in return() I added the map container to store the map:
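As a rough illustration of how these two pieces fit together, here is a minimal sketch of a class component that creates the map in componentDidMount() and renders its container. It is not the project's actual index.js. It is written as a factory that receives the React and mapboxgl modules as arguments and uses React.createElement instead of JSX, so that it runs without a build step; the names createCrimeMap and styleUrl, the centre coordinates, the zoom level, and the inline styles are all assumptions.

```javascript
// Factory: pass in the React and mapbox-gl modules your app imports.
function createCrimeMap(React, mapboxgl) {
  return class CrimeMap extends React.Component {
    componentDidMount() {
      // The container <div> exists by now, so the map can be created.
      this.map = new mapboxgl.Map({
        container: this.mapContainer,
        style: this.props.styleUrl, // URL of the published Mapbox Studio style
        center: [-123.12, 49.25], // approximate lon/lat of Vancouver (assumed)
        zoom: 11, // assumed starting zoom
      });
    }

    componentWillUnmount() {
      // Release the map's resources when the component is removed.
      if (this.map) this.map.remove();
    }

    render() {
      // The ref callback stores the DOM node Mapbox GL JS will draw into.
      return React.createElement('div', {
        ref: (el) => {
          this.mapContainer = el;
        },
        style: { position: 'absolute', top: 0, bottom: 0, width: '100%' },
      });
    }
  };
}
```

In an app, mapboxgl.accessToken needs to be set to your access token before the component mounts, and the component is created with something like const CrimeMap = createCrimeMap(React, mapboxgl).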