data science

Python Pandas read_csv – Load Data from CSV Files

CSV (comma-separated values) files are a common format for transferring and storing data. The ability to read, manipulate, and write data to and from CSV files using Python is a key skill for any data scientist or business analyst. In this post, we’ll go over what CSV files are, how to read CSV files into Pandas DataFrames, and how to write DataFrames back to CSV files after analysis.
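As a rough illustration of the read, manipulate, and write cycle described above, here is a minimal sketch. The CSV content and column names are made up for the example, and an in-memory string stands in for a file on disk.

```python
import io

import pandas as pd

# Hypothetical CSV content; in practice this would usually be a file path.
csv_text = "name,city,sales\nAlice,Dublin,120\nBob,Galway,95\n"

# read_csv accepts a path or any file-like object and returns a DataFrame.
df = pd.read_csv(io.StringIO(csv_text))

# A simple manipulation step: add a derived column.
df["sales_doubled"] = df["sales"] * 2

# With no path given, to_csv returns the CSV as a string.
# index=False leaves the row index out of the output.
csv_out = df.to_csv(index=False)
print(csv_out)
```

Passing a filename to `read_csv` and `to_csv` instead of the in-memory objects works the same way.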

Word Embeddings in Python with Spacy and Gensim

This post shows how to load, use, and make your own word embeddings using Python. Use the Gensim and Spacy libraries to load pre-trained word vector models from Google and Facebook, or train custom models using your own data and the Word2Vec algorithm. This post is a direct follow-on from the introductory Word Embeddings post, and will show you how to get started using word vectors with your own models and systems.

Get Busy with Word Embeddings – An Introduction

This post provides an introduction to “word embeddings” or “word vectors”. Word embeddings are real-number vectors that represent words from a vocabulary, and have broad applications in the area of natural language processing (NLP). We examine the training, use, and properties of word embedding models, and look at how and why to prefer word embeddings over older bag-of-words techniques in your data science and language modelling tasks.
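To make the idea of words as real-number vectors concrete, here is a small sketch that compares vectors using cosine similarity. The three-dimensional vectors are invented for illustration; real embedding models typically use hundreds of dimensions learned from text.

```python
import math

# Toy vectors, invented for illustration only (not from a trained model).
toy_vectors = {
    "king": [0.8, 0.6, 0.1],
    "queen": [0.7, 0.7, 0.1],
    "apple": [0.1, 0.2, 0.9],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Words used in similar contexts should end up with a higher similarity.
royal = cosine_similarity(toy_vectors["king"], toy_vectors["queen"])
fruit = cosine_similarity(toy_vectors["king"], toy_vectors["apple"])
print(royal > fruit)
```

A bag-of-words representation has no comparable notion of closeness between two different words, which is one reason embeddings are often preferred.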

Batch CSV Geocoding in Python with Google Maps API

Geocode your addresses for free with Python and Google

For a recent project, I ported the “batch geocoding in R” script over to Python. The script allows geocoding of large numbers of string addresses to latitude and longitude values using the Google Maps Geocoding API. The Google Geocoding API is one of the most accurate geocoding …
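A single geocoding lookup of the kind the script batches up might look roughly like the sketch below. This is not the script from the post: the helper names, the placeholder API key, and the sample response are assumptions for illustration, and no network request is made.

```python
import json
from urllib.parse import urlencode

# JSON endpoint of the Google Maps Geocoding API.
GEOCODE_ENDPOINT = "https://maps.googleapis.com/maps/api/geocode/json"


def build_geocode_url(address, api_key):
    """Build the request URL for one address (hypothetical helper)."""
    return GEOCODE_ENDPOINT + "?" + urlencode({"address": address, "key": api_key})


def extract_lat_lng(response_text):
    """Return (lat, lng) from a response body, or None if there is no match."""
    payload = json.loads(response_text)
    if payload.get("status") != "OK" or not payload.get("results"):
        return None
    location = payload["results"][0]["geometry"]["location"]
    return location["lat"], location["lng"]


# Hand-written sample in the shape of a Geocoding API response (not real output).
sample_response = json.dumps(
    {
        "status": "OK",
        "results": [{"geometry": {"location": {"lat": 53.35, "lng": -6.26}}}],
    }
)

url = build_geocode_url("1 Main Street, Dublin", "YOUR_API_KEY")
coords = extract_lat_lng(sample_response)
print(url)
print(coords)
```

A batch run would loop over the addresses in a CSV file, fetch each URL, and store the extracted coordinates alongside the original rows.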


Using iloc, loc, & ix to select rows and columns in Pandas DataFrames

Pandas Data Selection

There are multiple ways to select and index rows and columns from Pandas DataFrames. I find tutorials online focusing on advanced selections of row and column choices a little complex for my requirements.

Selection Options

There are three main options for selecting and indexing in Pandas, which can be confusing. The three selection cases and …
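As a brief sketch of the two selectors that remain in current Pandas: `iloc` selects by integer position and `loc` selects by label or boolean condition. The `ix` indexer was deprecated in pandas 0.20 and removed in pandas 1.0, so it is left out here. The DataFrame contents are made up for the example.

```python
import pandas as pd

# Illustrative data with string row labels.
df = pd.DataFrame(
    {"city": ["Dublin", "Galway", "Cork"], "rain_days": [10, 25, 15]},
    index=["a", "b", "c"],
)

# iloc: integer position based. The end of a slice is excluded.
first_row = df.iloc[0]
subset = df.iloc[0:2, 0:1]

# loc: label based. The end of a label slice is included.
by_label = df.loc["b", "rain_days"]
label_slice = df.loc["a":"b", ["city"]]

# loc also accepts a boolean mask to filter rows.
wet = df.loc[df["rain_days"] > 12, "city"]
print(wet.tolist())
```

Note the asymmetry: `df.iloc[0:2]` and `df.loc["a":"b"]` both return two rows here, because position slices exclude the end point while label slices include it.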


How often do you actually get wet going to work? Using pandas, python, and some graphs, we find out.

How wet is a cycling commute in Ireland? Pretty dry!… if you don’t live in Galway.

How often do you get wet cycling to work?

Cycling in Ireland is taking off. The DublinBikes scheme is a massive success with over 10 million journeys, there are large increases in people cycling in Irish cities, there’s a good cyclist community, and infrastructure is slowly improving around the country. However, Ireland is a rainy place! It turns out that …
