There are multiple ways to select and index rows and columns from Pandas DataFrames. I find that online tutorials focusing on advanced row and column selections are a little complex for my requirements. Selection Options: There are three main options for selecting and indexing in Pandas, which can be confusing. The three selection cases and methods covered in […]

Read More →

The most recent post on this site was an analysis of how often people cycling to work actually get rained on in different cities around the world. You can check it out here. The analysis was completed using data from the Wunderground weather website and Python, specifically the Pandas and Seaborn libraries. In this post, I will […]

Read More →

Self-Organising Maps (SOMs) are an unsupervised data visualisation technique that can be used to visualise high-dimensional data sets in lower-dimensional (typically two-dimensional) representations. In this post, we examine the use of R to create a SOM for customer segmentation. The figures shown here use the 2011 Irish Census information for the greater Dublin […]

Read More →

As a practicing data scientist, I regularly need to present and distribute the results of analyses, or to provide descriptive statistics on data sets. Ideally, these results can be presented in the form of interactive graphics, standalone applications, or continually updating dashboards. There is a range of excellent dashboard and visualisation building software […]

Read More →