Naked Data Science – USA vs Australia Interest Rates

The train doors hissed close as I boarded the city loop train into Melbourne during rush hour. The carriage was tightly packed and I clung to an overhead bar as the train swayed northward. I overheard two men in blue suits in front of me arguing. There was a tall gentleman with a deep booming voice and shorter bespectacled man. They were having a discussion on how Australia’s current interest rates compared to the USA. The shorter of the two said, “Australia’s interest rates compared to the US are at unprecedented high levels given that interest rates were so low.” The taller gentlemen pulled his head backward and opened his eyes wide in a universal expression that said “you are smoking some good stuff.”

However, was there some merit in what the shorter man said? This got me thinking – how would you show the relative difference between the two countries interest rates, where differences in low numbers are more pronounced than differences of large differences? I was also interested in seeing how the two countries rates have changed over time to understand the context. I came up with a little data science solution in Python to help answer this question. I’ll work backwards from the solution to the methodology.

Change of perspective

I compared the two countries interest rates on two different scales. The one on the left is a standard comparison of rates from 1976 to today of the US Federal Reserve and the Reserve Bank of Australia’s interest rates. As expected, the plot shows that Australia’s interest rate movement follows the US, but the magnitude of the rates differ. Following the 1980’s and early 1990’s, the magnitude of Australia’s interest rates has decreased to levels closer to that of the US.

On the right, I plotted a logarithmic scale for interest rates. The logarithmic scale is non-linear and is used when there is a large range in quantities. What the right graph shows is that the shorter business man’s statement had merit. I.e. that relatively, the interest rate difference between Australia and the USA has in the last few years been very high compared with the last 30 years.

What does this all mean in economic terms? I am not sure. Maybe investors in Australian bonds should be happy with the rates they are currently getting. Perhaps interest rates are already relatively high in Australia and after the US Fed increases rates, there may be a short term delay (6-12 months) before the RBA follows.

What is the analytics behind the graphs?

In this section, I will show how I used Python to manipulate the data and graph the results. The point of using Python was to show some basic data science manipulations to import, manipulate and graph. The code here can be used as a framework in your own Python scripts. I have broken the scripts below into 4 section to show practical examples.

  1. Get the data
  2. How to import CSV and Excel files into data frames
  3. How to manipulate data in a data frame
  4. How to create basic line plots

Note that this part of the post assumes basic Python knowledge.

1. Obtain the Raw Data

First off I needed some data for the comparison. I went to the RBA  site and found historic monthly Interbank Overnight Cash Rates (http://www.rba.gov.au/statistics/historical-data.html). I then went to the US Federal Reserve site and got a hold of Monthly the Federal Funds Interest Rate (https://www.federalreserve.gov/datadownload/).

See the below link to download the raw data I used in this analysis:
Download

2. Use Python to Import the Data

I used Python to manipulate and clean the data and to graph the results.

The first step was to read in the CSV and the raw Excel file. Notice, that I used the pandas library to store all the data in data frames and that I limited the rows to get the same time period between the files, i.e. between May/1976 and Jan/2017. The pandas library uses pd.read_csv for CSV files and pd.read_excel for Excel files.

Let’s dive into some of the notation. To do a first data clean, I skipped the first few rows using skiprows=range(1,6) as part of the pd.read_csv function. To further trim my data set I then used iloc to specify the rows and columns I wanted to start from. The notation .iloc[76:,] means give me rows from row number 76 until the end  and all the columns (the format is ‘.iloc[row start:row end,column start:column end]’). In the CSV file I skipped the first 76 rows and in the excel file I limit the rows to 489 ‘.iloc[:489,: ]’ and take the first column and the third column.’.iloc[:, [0,2] ]’ Lastly, I renamed the columns by using the ‘.columns’ syntax.

 

3. Clean the Data

I now have two data sets, but I need them in one location to be able to manipulate them. I used the Pandas merge function, where I join on the ‘Date’ column. I then add a new column to calculate the difference between the 2 interest rates. Note that the rates were first multiplied by 100 to prevent negative log values.

As you muck around with your data, it is always good practice to print out the first few rows using head() to make sure you are getting the desired output.

 

4. Plot the data

Now we have a cleaned data set, how do we see the results? I created two line plots using the matplotlib library. Most of the code is to make nice labels on the graph, but the line that is actually doing the work is ‘ax.plot(x axis data, y axis data, linewidth)’. Also, when graphing, always be sure to include ‘plt.show()’ to actually see the charts, otherwise nothing will happen. Once you run the code, you will see that the first line plot is for the regular time series interest rates and the other for the logarithmic interest rates.

 

That’s it! You know have a framework to read in data from either a CSV or Excel file into a dataframe, do some basic data manipulation and then graph the results.

 

 

 

Share on LinkedInEmail this to someoneTweet about this on TwitterShare on FacebookPrint this page

Leave a Reply

Your email address will not be published. Required fields are marked *

*