Yesterday marked four months since I joined GitHub. Back then, I had seen many GitHub repos pass by after joining Twitter a few weeks earlier, and I wanted to star some of them so I could find them again more easily. It took another 23 days before I actually committed something myself. Now, four months later, I use GitHub (private) and GitLab (work) almost daily, and I have inspired others to start using it.

Sometimes you dream big, but you just don’t have the data… I had planned a per-character text analysis of the Parks & Recreation series to celebrate Galentine’s Day, but getting the data was a struggle. Subtitles are available for pretty much every episode, but they don’t record which character said each line. I needed scripts, which are a bit harder to come by. On the web I found 6 usable episode scripts in PDF format, all from the first 3 seasons.

I went through the entire dplyr documentation for a talk about pipes last week, which resulted in a few “aha!” moments. I discovered and rediscovered a few useful functions, which I want to collect in a few blog posts so I can share them with others.

This first post covers ordering, naming and selecting columns: from the basics of selecting columns to more advanced functions like select_all() and select_if(), and shortcuts like everything().

I was working on a little side project on radio playlists when the news broke about the death of Dolores O’Riordan, lead singer of The Cranberries. And sure enough, a few hours later I could see The Cranberries starting to pop up in the playlists. Their second album, No Need to Argue, was one of my favourite CDs as a teenager, and so I wondered: how many times would they be played?

Suzan Baert