data science | Shane Lynn


PostgreSQL: Find slow, long-running, and Blocked Queries

4 Comments / blog, data science / By Shane

If you run a PostgreSQL database, use pg_stat_activity to quickly find slow and blocked processes and queries, along with the query text and the responsible user. pg_blocking_pids and pg_locks will tell you what you need to know about database locks.
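
As a rough illustration of the idea, here is a minimal sketch of a query against pg_stat_activity that lists non-idle sessions running longer than a threshold, together with the PIDs blocking them. The function name `find_slow_queries`, the five-minute default, and the choice of columns are assumptions made for this example, and `conn` is assumed to be a psycopg2-style DB-API connection (with `%s` placeholders) to a PostgreSQL 9.6+ server.

```python
from datetime import timedelta

# Query text is an illustrative sketch, not the post's exact query.
# pg_blocking_pids(pid) returns the PIDs holding locks this session waits on.
SLOW_QUERY_SQL = """
SELECT pid,
       usename,
       state,
       now() - query_start AS duration,
       pg_blocking_pids(pid) AS blocked_by,
       query
FROM pg_stat_activity
WHERE state <> 'idle'
  AND now() - query_start > %s
ORDER BY duration DESC
"""


def find_slow_queries(conn, min_duration=timedelta(minutes=5)):
    """Return rows for active sessions running longer than min_duration.

    `conn` is assumed to be a DB-API connection using %s placeholders
    (for example, one created with psycopg2.connect).
    """
    cur = conn.cursor()
    try:
        # A timedelta parameter is assumed to be adapted to a PostgreSQL interval.
        cur.execute(SLOW_QUERY_SQL, (min_duration,))
        return cur.fetchall()
    finally:
        cur.close()
```

A session whose `blocked_by` array is non-empty is waiting on a lock held by the listed PIDs, which is usually the first thing to check when a query appears stuck rather than slow.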

Delete Rows & Columns in DataFrames Quickly using Pandas Drop

13 Comments / blog, data science, Pandas, python / By Shane

Learn how to drop or delete rows & columns from Python Pandas DataFrames using “pandas drop”. In this tutorial, we’ll load some sample data, and then look at deleting rows and columns by number, by index, and by boolean values.
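
A minimal sketch of the approaches mentioned above, using a small made-up DataFrame; the column names and values are illustrative and are not the tutorial's sample data.

```python
import pandas as pd

# Illustrative data only; not the sample data used in the tutorial.
df = pd.DataFrame(
    {"name": ["Ann", "Bob", "Cat"], "age": [31, 45, 27], "city": ["Cork", "Dublin", "Galway"]},
    index=["a", "b", "c"],
)

# Drop a column by name.
without_city = df.drop(columns=["city"])

# Drop a row by its index label.
without_b = df.drop(index=["b"])

# Drop a row by position: look up the label at position 0, then drop it.
without_first = df.drop(df.index[0])

# Drop rows by boolean condition: drop the labels where the condition holds.
over_30_only = df.drop(df[df["age"] <= 30].index)
```

Note that `drop` returns a new DataFrame by default, so `df` itself is unchanged after each of these calls.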


Plot your Fitbit data in Python (API v1.2)

3 Comments / blog, data science, Data Visualisation, python / By Shane

Sleeping and Python: two of my favourite things. When combined with the Python Fitbit library, Matplotlib, and Pandas, they can generate informative plots of your sleeping habits! This post explores how we can pull data from the Fitbit API, create a Pandas DataFrame, and then plot the results. In this tutorial, I’ve used Python …


Read CSV data quickly into Pandas DataFrames with read_csv

19 Comments / blog, data science, Pandas, python, Tutorials / By Shane

CSV (comma-separated value) files are a common file format for transferring and storing data. The ability to read, manipulate, and write data to and from CSV files using Python is a key skill to master for any data scientist or business analyst. In this post, we’ll go over what CSV files are, how to read CSV files into Pandas DataFrames, and how to write DataFrames back to CSV files post analysis.
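
A minimal sketch of the read and write round trip described above. To keep the example self-contained, it reads from an in-memory string rather than a file on disk; the CSV contents are made up for illustration.

```python
import io

import pandas as pd

# Illustrative CSV text; in practice you would pass a file path to read_csv.
csv_text = "name,age\nAnn,31\nBob,45\n"

# Read the CSV into a DataFrame; the first row is used as the header.
df = pd.read_csv(io.StringIO(csv_text))

# Do some analysis, then write the result back out as CSV.
df["age_next_year"] = df["age"] + 1
buffer = io.StringIO()
df.to_csv(buffer, index=False)  # index=False omits the row index column

# Reading the written text back gives an equal DataFrame.
round_trip = pd.read_csv(io.StringIO(buffer.getvalue()))
```

Passing `index=False` to `to_csv` is a common choice when the index is just the default row numbers, since otherwise it is written as an extra unnamed column.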

Word Embeddings in Python with Spacy and Gensim

9 Comments / blog, data science, Natural Language Processing, python, Tutorials / By Shane

This post shows how to load, use, and make your own word embeddings using Python. Use the Gensim and Spacy libraries to load pre-trained word vector models from Google and Facebook, or train custom models using your own data and the Word2Vec algorithm. This post is a direct follow-on from the introductory Word Embeddings post, and will show you how to get started using word vectors with your own models and systems.

An introduction to word embeddings for text analysis

14 Comments / blog, data science, python, Tutorials / By Shane

This post provides an introduction to “word embeddings” or “word vectors”. Word embeddings are real-number vectors that represent words from a vocabulary, and have broad applications in the area of natural language processing (NLP). We examine the training, use, and properties of word embedding models, and look at how and why you should use word embeddings over older bag-of-words techniques in your data science and language modelling tasks.
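
To make the contrast with bag-of-words concrete, here is a toy sketch in plain Python. The three-dimensional "embedding" values are hand-picked for illustration only and do not come from any trained model; real embeddings typically have hundreds of dimensions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Bag-of-words style one-hot vectors: every word gets its own dimension,
# so any two different words are orthogonal.
one_hot = {
    "cat": [1, 0, 0],
    "kitten": [0, 1, 0],
    "car": [0, 0, 1],
}

# Toy dense vectors (hand-picked, NOT from a trained model): related words
# are placed close together.
dense = {
    "cat": [0.9, 0.1, 0.3],
    "kitten": [0.8, 0.2, 0.35],
    "car": [0.1, 0.9, 0.2],
}

one_hot_sim = cosine_similarity(one_hot["cat"], one_hot["kitten"])
dense_related = cosine_similarity(dense["cat"], dense["kitten"])
dense_unrelated = cosine_similarity(dense["cat"], dense["car"])
```

With one-hot vectors, "cat" and "kitten" are no more similar than "cat" and "car"; with dense vectors, similarity between words can reflect how related they are.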