Visualizing portfolio historical performance with Python

Hey there!

In this article we are going to check out how to examine and visualize the historical performance of a portfolio. This can be any portfolio but since we scraped prices for SEB funds in our previous article (Web scraping investment fund prices with Python) we will use those prices here as well. If You haven’t read that article it would be good to at least check out the full code used there as we will import it and use one variable from it. Also by doing so we will initiate the whole thing to run and update the prices for us here.

The set up

This is how we will do this:

  1. Define functions for gathering historical prices for our portfolio allocation
    • Gather prices per chosen funds
    • Calculate total portfolio price based on weights in allocation
  2. Define function that would get all the ISIN codes of our funds and run the price update
  3. Put together our portfolio return, SEB fund average returns for performance comparison

Pretty easy but I will still guide You through it. No worries about that.

The walk-through

The imports

We will need 3 things to be imported and one optional:

  • pandas – for data cleaning and manipulation.
  • matplotlib.pyplot for data visualization.
  • datetime to get today’s date.

Optional:

We could just read MS Excel in stead but I wanted to have this file updating the previous one. I know this could be done in a simpler way but whatever. What’s done, done.

Also we could say that matplotlib.pyplot is optional since it is possible the get the charts form pandas DataFrames as well, but whenever I run it from terminal I see only matplotlib and no pandas charts so I would still trust the main chart to matplotlib.

Function for loading prices form MS Excel

This function will read excel file we previously saved and retrieve an aggregated portfolio price based on given allocation and prices from sheet ‘DATA’. It requires ‘allocation’ as a pandas DataFrame with fund ISIN codes and allocation weights for our current portfolio.

In order to get that nice (an single) list of of prices we will need to go through these steps:

1. Create a list of column names (ISIN codes) for allocation

Initially I believed that I can just go with allocation.index as a list for filtering, but something in pandas proved me wrong. The issue was that ‘allocation‘ had ISIN codes that were not saved in excel. Reminder: We dropped funds that didn’t have at least 1000 prices/days when gathering prices.

Why we included ISINs for which we don’t have prices in allocation? It happens when we want to take average fund performance by giving each ISIN equal weights. This shouldn’t be a problem if You would use the same ISIN codes that we have for our portfolio.

Quick workaround – looping though the all ISIN codes saved in prices and then through allocation.index putting ISINs that matched in separate list – filterColumn.

2. Filter and adjust prices according to given allocation

This task calls for another loop!

But if You think of it we could have done it all together in previous loop… All the bright ideas come to me too late. I am not willing to change this setup (since it works well).

Anyway. We will do the following:

  1. Basic double-loop operation on fund prices DataFrame index and filteredColumn ISINS
  2. Append a temp list p for our portfolio price.
  3. Drop NA values everywhere to have only clean data

Here the parameter errors=’coerce’ means that if there would happen to be an error while converting DATE to date then it would deliver NA which we would drop on sight.

3. Group by date and deliver some stats

Lastly we want to use these neat and clean price dates as index as they are unique (at least should be) so we will use function group that would assign whatever we are grouping on as an index.

If for any reason there would be duplicate dates we would sum up the prices of our portfolio anyway and escape an error. But so fare there haven’t been any duplicates..

And as a cherry on the top, and this is optional, we will get some statistics on our prices. We could just go with pd.DataFrame.describe() but we don’t need all of that stuff. We will settle with STD, Mean return and Sharpe ratio:

It looks a bit ugly but I believe that using an extra variable when it’s not needed just puts some extra weight on memory so I went with raw portfolio.pct_change().dropna() in stead of assigning it to a variable.

To be perfectly honest with You, I will drop these print() lines of code as soon as this article gets published because I have some bigger plans for this in near future and I would prefer it not doing non-obligatory calculations. Meanwhile the current output is:

Before You continue I just wanted to say that .dropna() in case with pct_change() is not to drop erroneous entries. It is for the first record that’s always empty when You use .pct_change().

That being said, here’s the full code getPortPrices function:

Function for initiating SEBscraper and getting ISIN codes

This is a short one – it imports our previous project SEBscraper as seb and initiates price update from which we take DataFrame funds. As we noticed in previous function You have to be careful with this one because we didn’t take prices for all funds that we got from overview page. We only took the ones that had significant history (more than 1000 prices)

Putting everything together

Finally all the pre-requirements have been fulfilled and we can get our hands wet with the data.

we will start with the print() to let ourselves know that we are starting the process and initialize the price update. After prices are updated and we got the latest list of fund ISIN codes that SEB offers, we will read the prices out form the Excel file where they are saved.

We want to process them in two ways:

  1. Get our current protfolio allocation
  2. Get an average of SEB investment fund performance

Well, we will calculate the performance but in order to get to that, we will assign equal values to all funds. We can also add other allocations that we might like. I have prepared the split in BONDS, EQUITY and MIX based on their names. Here’s a reminder how it looks:

If You wanted to comment on my chosen allocation that it looks like I’m 60+ then I agree. Way too conservative and I expect to see it in charts as well.

Here’s where we start:

So now we have everything ready in DataFrame totalPrices. The code gives the following output:

AS we can see our bench a.k.a SEB investment fund average performance outperforms our allocation in returns and Sharpe ratio and our allocation beats it in capital preservation. But since I am not ancient, I would prefer to be taking more risks for bigger winnings…but more on choosing optimal allocations in next article.

The plot

So we get to the fun part – data visualization. First thing I want to stress here is that we are dealing with dates for which we have prices for. So I would assume that all of these are working days. In other words, picking last 360 days would not be equal to 1 year. In this code I have made an assumption that each month has equal amount of business days = 21. And that means that quarterly data is 63 days and annual – 250. This will be relevant real soon.

And by soon I meant right now. We will check standard deviation (STD) for past 250 days and also check price movements for the same period with moving average for a month.

For the STD that represents risk we see that our portfolio is 10x safer and more stable than the average:

But it has it’s trade-off in average gains:

At some points we would of course enjoy having the portfolio that we have here. Like, in all those price drops below 0, but in general we can see that average SEB fund enjoyed higher rise in price than our

allocation. Also what we can notice here is the fact that 250 days back is not equal one year. More like 3/4 of a year. We can note this in further work that’ s needed to be done. In this case for comparison reasons it’s fine as long as all date ranges are the same for single comparison.

Let’s take a peek at our price movement frequency in histogram:

Yeah..so it’s not all bad if we do not take into account inflation and stuff. I mean it’s sort of mostly positive…right next to zero. The fact that price changes were mostly upwards is actually really good if You thin about it. It just sucks that those are not significantly large movements upwards.

Let’s move on and check the price movements for last 63 dates and also add some extra stats with it:

We also get the following output:

Nothing to be proud of since we are in the read for last ~3 months. This just keeps sending us signals that there is a better allocation among these funds. If we assume we have a choice among only these funds we can be sure that we can find a more optimal allocation than the one we have currently. And this assumption is true for me since I have a life insurance linked to these funds and I have the privilege of setting the allocation. (another reason why we will definitely discuss asset allocation optimization)

One last plot to view is a scatter plot for STD/returns to see how well risk and return go hand in hand. Obviously we want to have the least risk for the most returns. But let’s see if this will be the case. From previous metrics we can already anticipate low risk scores and not so good looking returns.:

You might guess which one is our portfolio in there. It’s a clear indicator that we need to reallocate the portfolio for better performance.

Saving the summary figures to MS Excel file

So now we can save our results in an Excel spread sheets and close this topic so we can move on to the optimization. 🙂

As You might have noticed I added bit more periods to the summary so it has quarterly, annual and 5 year figures. Or at least what we consider as equivalents of those time periods within this project.

Full code:

As always, here’s a full code for this project (go to web scraping fund prices article to get the code for SEBscraper):