Recently Tableau has made this much easier in version 2018.3 by enabling spacial joins within the tool. I will go through the steps to create a heatmap in the latest version of Tableau so you can create the below.
Chart 1: Uber Pickups NYC Sept 14 – Shapefile
Chart 2: Uber Pickups NYC Sept 14 – Density
The tools we will use are:
- Tableau 2018.3 – used for the visualization
- OGR2OGR – used for shape file manipulation
The data set we will use is:
- Uber NYC Pickup Data – we will use the September 14 CSV file
1. Download The Data Set
The first thing we will do is download the dataset. Navigate to the Kaggle data repository here. For this exercise we will just download the file ‘uber-raw-data-sept-14.csv’ file, which has approximately 1 million records.
Chart 3: Uber Pickups in New York City
The next thing we need is a shape file for New York City. We will use the 2010 New York City Census Tracts shape file here.
Chart 4: NYC Shapefile
You can open this shapefile in Tableau by clicking Data -> New Data Source -> Spacial File. Once open, drag ‘Geometry’ onto the canvas and you will see the NYC polygons defined in the shapefile.
Chart 5: NYC Shapefile in Tableau
2. Download OGR2OGR
Firstly, what the heck is this weird sounding tool? OGR stands for OpenGIS Simple Features Reference Implementation, and is an open source command line tool for translating and converting vector geospatial data. In other words it is a tool used to manipulate shape files.
There are two different ways to download OGR2OGR:
- Download the OSGeo4W package (more files but simpler)
- Download the files directly from here. (For this demo, I downloaded the file ‘release-1911-x64-gdal-2-3-2-mapserver-7-2-1.zip’)
For the OGR2OGR command line tool, you will need to:
- Set a PATH to that directory so the ogr2ogr.exe executable can be found.
- Set a system environment variable, GDAL_DATA, which points to the data directory in which GDAL support files gcs.csv and pcs.csv are saved. These files allow you to set custom spatial references by their EPSG code
Note, if you don’t want to use the command line interface of OGR2OGR, there is a graphical program you can use called OGR2GUI. The limitation of the GUI tool is you cannot use SQL Server and there are some other useful commands you cannot do.
3. Convert the CSV to a Shapefile
The first thing we need to do is convert our Uber pickup GPS data into a shape file. We need both the pickup points and the NYC geogaphic data both in shapefiles so we can join them within Tableau using a Spacial Join, which Tableau lists as ‘Intersects.’
We will use the OGR2OGR command line to help us. Navigate to the directory you saved the ‘uber-sep14.csv’ file and execute the below command. You will notice that the command looks for fields that start with ‘Lon’ for Longitude values and ‘Lat’ for Latitude values. If you have different names for these fields e.g. X and Y, you can rename these in the command too X* and Y*.
ogr2ogr -f "ESRI Shapefile" rides.shp uber-sep14.csv -oo X_POSSIBLE_NAMES=Lon* -oo Y_POSSIBLE_NAMES=Lat*
You should now have a shapefile called rides.shp. You can view this file in Tableau by going to Data -> New Data Source -> Spacial File. You will see a whole lot of points corresponding to the Uber pickup locations. Since there are so many data points, it is difficult to see what regions have the most pickups. Hence, we need to summarize this data into a heatmap.
Chart 6: Uber Pickup Points
4. Import Data to Tableau
Click on Data -> New Data Source -> Spacial file. Navigate to the rides.shp file we created and drag this onto the right corner. Next click on ‘Add’ next to connections on the left side of the screen.
Chart 7: Add Shapefile
Select ‘Spacial File’ and navigate to the NYC shape file ‘ nyu_2451_34505.shp’. Once again drag this file to the right hand side if this does not automatically add. You should now see a join between the file ‘rides’ and ‘nyu_2451_34505’. You may notice that the join comes up with an error. Click on the join and change the ‘=’ to ‘intersects.’ This will change the join from a regular table join to a spacial join. Note that a spatial join, combines the attributes of two layers based on the location features in the layer. Furthermore we use an ‘inner join’, this means that we will exclude data outside of the NYC polygons.
Chart 8: Change Join to Intersects (Spacial)
5. Count the Pickups in a Calculated Field
The next task is to create a calculated field to count the pickups within a NYC polygon. Click on Analysis -> Create Calculated field. Call this calculation ‘CountPt’ and type in ‘COUNT([BASE])’.
Chart 9: Count Pickup Points
6. CREATE THE SHAPE FILE View
We will next create our polygon based heat map. Open a new sheet and drag in the NYC shapefile field ‘Geometry.’ Next drag in ‘Tractid’ to the details section, this will aggregate the pickups according to the polygon id. Lastly drag the ‘CountPt’ calculated field onto colours. And there you have an awesome heatmap aggregated by the count of pickups in each polygon region of New York City!
Chart 10: Shapefile Polygon Heatmap
You can adjust the colours by clicking on ‘Colour’ and then change to Orange – Blue Diverging. If you open the advanced section you can set the center of where the colours diverge, in this case I set it to a count of 1,000 pickups.
Chart 11: Shapefile Polygon Colours
An alternative view is to create a ‘density cloud.’ Create a new sheet and this time drag the rides shape file’s ‘Geometry’ field onto the canvas. Next change the Marks drop-down from ‘Automatic’ to ‘Density’. Then add ‘Tractid’ to the details and once again add the ‘CountPt’ field to the colour.
Chart 12: Density Cloud Heatmap
Well done! You have successfully created an awesome looking heatmap in Tableau using shapefiles.
If you would like to donate to this post, please use the below Bitcoin address: