CSV, PANDAS and Python

We will analyze historical prices of SEB investment funds using two powerful Python modules: csv and pandas.

You might ask why I use SEB fund prices for this example instead of stock prices, which are the standard source data for this kind of thing. The answer is simple: I have a savings account linked to the performance of these funds, and I alone am responsible for setting up the fund/asset allocation.

In this article, we will take the first steps in deciding which funds we want to see in our portfolio and which we might want to avoid.

Large data tables that contain countless numerical values have to be stored somewhere. Data can of course be stored in a database, which would be the right choice if the data is to be regularly updated and kept for a long time. In this case, however, we will look at data in CSV format, which will be our raw material, and cover databases when we continue this topic.

If you are more interested in storing data in a database, don’t worry: at the very end I will show you how to save the data into a database using the sqlite3 module.

When I started this article, I was planning to write Python code that would automatically fetch the data for the offered investment funds from the SEB website, but then the article would be all about that, and not about working with CSV files (a much simpler topic) and analyzing them. We will get back to it when we cover obtaining data from websites.

What is CSV?

Comma-separated values (CSV) is a file format that is basically the same as the widely known TXT file. The main difference is that in a CSV file all entries are separated by a comma, a semicolon or any other separator, chosen by the author or determined by the regional settings. The most common convention is a comma as the separator and a dot “.” as the decimal mark.

Note: When working with CSV files, be careful with spaces used as thousand separators. You might have to delete them before converting the values to a numeric type.

Regardless of which separator is chosen, it is the same throughout the file, and the structure of any CSV file is a table, with or without headings. In our example we will use CSV files with the historical prices of SEB investment funds.
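
To make the point about separators concrete, here is a minimal sketch. It assumes a semicolon-separated file with a decimal comma and a space as the thousand separator; the sample values are made up and are not real fund prices.

```python
import csv
import io

# Made-up sample: semicolon separator, decimal comma, space as thousand separator.
raw = "Date;Value\n2018-01-02;1 234,56\n2018-01-03;1 240,10\n"

reader = csv.reader(io.StringIO(raw), delimiter=";")
header = next(reader)  # the first row holds the column names

prices = []
for date, value in reader:
    # Remove the thousand separator, then swap the decimal comma for a dot.
    cleaned = value.replace(" ", "").replace(",", ".")
    prices.append((date, float(cleaned)))

print(header)
print(prices)
```

Without the two replace() calls, float() would raise a ValueError on a value such as "1 234,56".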

Python library for CSV

As I mentioned earlier, I unfortunately cannot briefly show how to get this historical data automatically within this article, so I will take it as given. That topic is covered in a separate article.

I’m not really into manual work, so I saved all the CSV files under the ISIN codes listed on the homepage. It looks like this:

It would be helpful to get the list of these files in the Python code. I will not go into detail on this step, but note that here we get a list of ISIN codes without the .csv suffix. If you want to keep the suffix, simply use isin.append(i) instead of isin.append(i[:12]):
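
A minimal sketch of how such a list could be built, assuming all the files sit in one folder. The temporary folder and the ISIN-like file names are made up so that the example runs on its own.

```python
import os
import tempfile

# Stand-in for the download folder; the ISIN-like file names are made up.
folder = tempfile.mkdtemp()
for name in ("LU0000000001.csv", "LU0000000002.csv", "notes.txt"):
    open(os.path.join(folder, name), "w").close()

isin = []
for i in sorted(os.listdir(folder)):
    if i.endswith(".csv"):
        isin.append(i[:12])  # an ISIN code is 12 characters long

print(isin)
```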

Now let’s load one of the CSV files and see how far we have got:
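
Something along these lines should do. This is only a sketch: the file name and the line layout (date;price; on each line) are my assumptions here, and the example first writes a tiny made-up file so it can be run anywhere.

```python
import csv
import os
import tempfile

# Write a tiny made-up price file so the example runs on its own.
path = os.path.join(tempfile.mkdtemp(), "LU0000000001.csv")
with open(path, "w", newline="") as csvFile:
    csvFile.write("2018-01-02;12.345;\n2018-01-03;12.400;\n")

rows = []
with open(path, newline="") as csvFile:
    # With the default comma delimiter, each line comes back as a one-item list.
    for row in csv.reader(csvFile):
        rows.append(row)
        print(row)
```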

If you see something like that, then great!

CSV functions

  • csv.reader
  • csv.writer
  • csv.register_dialect
  • csv.unregister_dialect
  • csv.get_dialect
  • csv.list_dialects
  • csv.field_size_limit

In this article, though, we will discuss only the first two.

Working with CSV files

We have already seen an excellent example of csv.reader. It’s just that simple. In my code above, the call to csvFile.close() was superfluous: with open() closes the file automatically when the block ends, so the file does not stay open. Keep in mind that files should always be closed after use. Working with with open() is the safest way, because even if the code fails somewhere inside the block, the file will still be closed.

The next step is to create a new CSV file with csv.writer. But why settle for writing ‘Hi, the world!’ to a file with csv.writer and writerows() when we can do something more complicated, but more interesting, right away?
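
If you want to convince yourself, here is a small sketch that checks the closed attribute of the file object, both after a normal exit and after an error inside the block. The file name is made up.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.csv")

with open(path, "w") as csvFile:
    csvFile.write("a,b\n")
    print(csvFile.closed)  # False: we are still inside the block

print(csvFile.closed)  # True: the file was closed on leaving the block

# The file is closed even if an error interrupts the block.
try:
    with open(path) as csvFile:
        raise ValueError("something went wrong")
except ValueError:
    pass

closed_after_error = csvFile.closed
print(closed_after_error)
```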

Let’s do data aggregation

Things that we need to do:

  1. Create a new list (e.g. data = []);
  2. Collect data from all CSV files with ISIN codes in their names;
  3. Save the data to a new CSV file;
  4. Get acquainted with the PANDAS module 🙂

The first step is pretty simple, and the second one looks like this:
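
Here is a sketch of steps 1 and 2. It assumes that every line of a source file looks like date;price; and that the files are read with the default comma delimiter, so each row arrives as a single string. The folder, the ISIN-like codes and the prices are made up.

```python
import csv
import os
import tempfile

# Made-up sample files; the line layout (date;price;) is an assumption.
folder = tempfile.mkdtemp()
samples = {
    "LU0000000001": "2018-01-02;12.345;\n2018-01-03;12.400;\n",
    "LU0000000002": "2018-01-02;7.10;\n2018-01-03;7.25;\n",
}
for code, text in samples.items():
    with open(os.path.join(folder, code + ".csv"), "w", newline="") as f:
        f.write(text)

isin = sorted(samples)
data = []  # step 1: the new list

# Step 2: collect the rows from every file, tagging each with its ISIN code.
for code in isin:
    with open(os.path.join(folder, code + ".csv"), newline="") as csvFile:
        for row in csv.reader(csvFile):
            date = str(row[0])[:10]     # first 10 characters: YYYY-MM-DD
            value = str(row[0])[11:-1]  # the rest, minus the trailing separator
            data.append([code, date, value])

print(data)
```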

I hope I did not lose you at the for loop or at with open, because from here on everything gets simpler. Things only get really messy when we come to formatting the data. It might also have been easier to put the ISIN codes in the column names right away, but then I fear I would not only lose you, but confuse myself as well :D

Here we read each line of CSV data with str(row[0])[:10] and str(row[0])[11:-1] for two reasons:

  1. We do not want a list inside a list, so we read the contents of each row’s list;
  2. To continue successfully, we need two entries in place of a single entry holding both the date and the historical price: a date and a value.

Saving aggregated data to CSV file

Now that we have come this far, the biggest stumbling block with the csv module is almost behind us. All that is left is to save the data. In this case I would like to add column names, so let’s write one header line before we start writing all the other data:

Whew, the boring part is finally done. Now it is time to get started with PANDAS, or pd, as it is usually abbreviated to shorten the code.
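
A minimal sketch of the saving step with csv.writer. The column names ISIN, Date and Value and the output file name are my choice for this example, and the rows are made up.

```python
import csv
import os
import tempfile

# A few aggregated rows in the [ISIN, date, value] shape; values are made up.
data = [
    ["LU0000000001", "2018-01-02", "12.345"],
    ["LU0000000002", "2018-01-02", "7.10"],
]

out_path = os.path.join(tempfile.mkdtemp(), "funds.csv")
with open(out_path, "w", newline="") as csvFile:
    writer = csv.writer(csvFile)
    writer.writerow(["ISIN", "Date", "Value"])  # column names first
    writer.writerows(data)                      # then all the data rows

# Read the file back to check what was written.
with open(out_path, newline="") as csvFile:
    saved = list(csv.reader(csvFile))

print(saved)
```

Note the difference between writerow(), which writes a single row, and writerows(), which expects a list of rows.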


Let’s start by eliminating the shortcomings of our previous table: we will pivot the ISIN codes into columns against the dates and convert the date strings into real dates:

Converting the date column will slow the code down slightly, but it is a necessary step to continue. You can remove print(csvtab.dtypes) if you want, but it is useful at the start to make sure the data types in the table are correct.
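
One way to do this is sketched below, assuming the aggregated file has the columns ISIN, Date and Value. The in-memory sample stands in for that file, and the values are made up.

```python
import io

import pandas as pd

# Made-up aggregated data standing in for the CSV file described above.
raw = io.StringIO(
    "ISIN,Date,Value\n"
    "LU0000000001,2018-01-02,12.345\n"
    "LU0000000001,2018-01-03,12.400\n"
    "LU0000000002,2018-01-02,7.10\n"
    "LU0000000002,2018-01-03,7.25\n"
)

csvtab = pd.read_csv(raw)
csvtab["Date"] = pd.to_datetime(csvtab["Date"])  # text -> real dates
print(csvtab.dtypes)

# One row per date, one column per ISIN code.
table = csvtab.pivot(index="Date", columns="ISIN", values="Value")
print(table)
```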

Analysis of available data

The simplest analysis of all our data is possible with one very simple command: table.describe(). Since we pivoted the ISIN codes into column names, we get statistics for each fund. Here’s an excerpt from the results:

It may go without saying, but remember that after any data manipulation you should take a sample of the results and verify it. I’m not saying that we have made mistakes here, but I have had bad experiences with other widely used, user-friendly systems. 🙂
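
For reference, here is a small self-contained sketch of the call on a made-up table with one column per (hypothetical) ISIN code. The numbers are invented and are not the real results.

```python
import pandas as pd

# Made-up prices: one column per (hypothetical) ISIN code, one row per date.
table = pd.DataFrame(
    {"LU0000000001": [12.0, 12.5, 13.0], "LU0000000002": [7.0, 7.5, 8.0]},
    index=pd.to_datetime(["2018-01-02", "2018-01-03", "2018-01-04"]),
)

# count, mean, std, min, quartiles and max for every column
stats = table.describe()
print(stats)
```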

Means, medians, standard deviations and NumPy

Let’s examine what our resulting table looks like:

It does not look bad, but if we write this to a CSV file and want to continue processing and analyzing the data in Excel, or simply check our Python result in another environment, it would be difficult to match ISIN codes with dates. It might be more convenient to move the ISIN codes into the column names. You can easily do this with the .unstack() command:

Now the result is practically four tables (sums, standard deviations, means and medians) that are logically arranged and in a predictable order (by date).
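
A sketch of an aggregation that produces a table of this kind. Grouping by fund and calendar month is my assumption for the example, as are the column names and the made-up values. The string names passed to agg() are pandas’ built-in reductions, which I use here in place of NumPy function objects.

```python
import pandas as pd

# Made-up long-format data: one row per fund and date.
csvtab = pd.DataFrame({
    "ISIN": ["LU0000000001"] * 4 + ["LU0000000002"] * 4,
    "Date": pd.to_datetime(
        ["2018-01-02", "2018-01-03", "2018-02-01", "2018-02-02"] * 2
    ),
    "Value": [12.0, 13.0, 14.0, 16.0, 7.0, 8.0, 9.0, 9.0],
})

# Group by fund and calendar month (the grouping period is an assumption).
month = csvtab["Date"].dt.to_period("M").rename("Month")
result = csvtab.groupby(["ISIN", month])["Value"].agg(
    ["sum", "std", "mean", "median"]
)
print(result)
```

The rows of the result carry a two-level index (ISIN, then month), which is what makes it awkward to read in a spreadsheet.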
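
A sketch of that step, again on made-up data and with the same assumed grouping by fund and month. The block repeats the aggregation so that it can be run on its own.

```python
import pandas as pd

# Made-up long-format data: one row per fund and date.
csvtab = pd.DataFrame({
    "ISIN": ["LU0000000001"] * 4 + ["LU0000000002"] * 4,
    "Date": pd.to_datetime(
        ["2018-01-02", "2018-01-03", "2018-02-01", "2018-02-02"] * 2
    ),
    "Value": [12.0, 13.0, 14.0, 16.0, 7.0, 8.0, 9.0, 9.0],
})

month = csvtab["Date"].dt.to_period("M").rename("Month")
result = csvtab.groupby(["ISIN", month])["Value"].agg(
    ["sum", "std", "mean", "median"]
)

# Move the ISIN level of the row index into the columns.
wide = result.unstack("ISIN")
print(wide)
```

After unstacking, each statistic has one column per ISIN code and the rows are indexed by month only.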

Saving the data to MS Excel, CSV or SQLite database

I have to admit that exporting to MS Excel was surprisingly challenging. Perhaps this is because my computer currently runs only Linux Mint. But the good news is that I did manage it, and we will look at how to save the results to both an MS Excel file and two CSV files.

MS Excel:
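
A hedged sketch of the export with DataFrame.to_excel. pandas needs a separate Excel engine for .xlsx files (openpyxl is a common choice), so the example catches the import error instead of assuming the engine is installed. The table, the file name and the sheet name are made up.

```python
import os
import tempfile

import pandas as pd

# Made-up table: one column per (hypothetical) ISIN code.
table = pd.DataFrame(
    {"LU0000000001": [12.0, 12.5], "LU0000000002": [7.0, 7.5]},
    index=pd.Index(["2018-01-02", "2018-01-03"], name="Date"),
)

xlsx_path = os.path.join(tempfile.mkdtemp(), "funds.xlsx")
try:
    # Writing .xlsx requires an Excel engine such as openpyxl.
    table.to_excel(xlsx_path, sheet_name="Funds")
    exported = True
    print("Saved to", xlsx_path)
except ImportError:
    exported = False
    print("No Excel engine found; installing openpyxl should help")
```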

The print command is, of course, not obligatory, but I wanted to see confirmation that everything finally succeeded. 🙂


Exporting to txt and csv files has always been the easiest solution:
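
A minimal sketch with DataFrame.to_csv, writing one comma-separated file and one tab-separated text file. The table and the file names are made up.

```python
import os
import tempfile

import pandas as pd

# Made-up table: one column per (hypothetical) ISIN code.
table = pd.DataFrame(
    {"LU0000000001": [12.0, 12.5], "LU0000000002": [7.0, 7.5]},
    index=pd.Index(["2018-01-02", "2018-01-03"], name="Date"),
)

folder = tempfile.mkdtemp()
csv_path = os.path.join(folder, "funds.csv")
txt_path = os.path.join(folder, "funds.txt")

table.to_csv(csv_path)            # comma-separated
table.to_csv(txt_path, sep="\t")  # tab-separated text file

with open(csv_path) as f:
    first_line = f.readline().strip()
print(first_line)
```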


Since I wrote about the SQLite Python module last week, here is an example of how to export the data to a SQLite database, which would also be suitable for such a project:
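
A sketch using DataFrame.to_sql with a plain sqlite3 connection. The database file, the table name prices and the sample rows are made up, and the dates are kept as text to keep the example simple.

```python
import os
import sqlite3
import tempfile

import pandas as pd

# Made-up long-format data; dates are stored as plain text here.
csvtab = pd.DataFrame({
    "ISIN": ["LU0000000001", "LU0000000001", "LU0000000002"],
    "Date": ["2018-01-02", "2018-01-03", "2018-01-02"],
    "Value": [12.345, 12.4, 7.1],
})

db_path = os.path.join(tempfile.mkdtemp(), "funds.db")
conn = sqlite3.connect(db_path)
try:
    # "prices" is a hypothetical table name; replace it if it already exists.
    csvtab.to_sql("prices", conn, if_exists="replace", index=False)
    row_count = conn.execute("SELECT COUNT(*) FROM prices").fetchone()[0]
finally:
    conn.close()

print(row_count)
```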

This time we will stop here, but we will definitely return to Pandas and data analysis, as well as these investment funds.
