Crunching my Twitter Data

Bob Durie
6 min read · Mar 23, 2022


At the crossroads of quite a few backburner interests of mine, I decided to take a look at some of my historical internet usage data to see if I could learn anything about me, about the internet, or about data visualization.

Lo and behold, it was my 15th anniversary as a Twitter user last week. I’m not sure if I should celebrate or cry, but in the midst of the 2007 tech scene, I never would have imagined I’d still be using a service I was using back then. Wherefore art thou incumbent social network??

I thought it might be neat to look at tweets over time to see what they look like. It feels like I’m using Twitter more these days, or at least as much as I ever have, perhaps after a long lull of “not using it for years”. Is that the case?… the data will know!

I first assumed I’d just use the API to pull my tweet history and crunch it. No can do as far as I could tell; unless you’re an academic institution, you cannot access the full history of tweets, yours or anyone else’s.

Luckily for me, Twitter has a comprehensive “dump your data” feature. So I kicked it off and it took over a day to complete. When I got the notification I checked it out — here is what the embedded Web Experience looks like — super slick, and lots of interactive poking around possible:

After a few minutes I got to work parsing that data and importing it into Google Sheets. Boy am I going to learn some things!!
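For anyone wanting to try the same step, here is a minimal Python sketch of the parsing. It is not necessarily how the data above was processed. It assumes the archive keeps tweets in a JavaScript file that wraps a JSON array in a `window.YTD... = ` assignment, and that each entry carries `created_at`, `full_text` and `favorite_count` fields. Those names, and the helper names `load_tweets` and `to_csv`, are assumptions for illustration; check them against your own download.

```python
import csv
import io
import json
from datetime import datetime

# Assumed timestamp layout, e.g. "Wed Mar 16 14:02:11 +0000 2022".
# %a and %b expect English day/month names (the default C locale).
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def load_tweets(js_text):
    """Parse the text of an archive tweets file into a list of flat dicts."""
    # Drop the assumed "window.YTD... = " prefix so only the JSON array is left.
    payload = js_text[js_text.index("["):]
    rows = []
    for entry in json.loads(payload):
        tweet = entry.get("tweet", entry)  # tolerate entries not nested under "tweet"
        rows.append({
            "created_at": datetime.strptime(tweet["created_at"], CREATED_AT_FORMAT),
            "text": tweet["full_text"],
            "favorites": int(tweet.get("favorite_count", 0)),
        })
    return rows


def to_csv(rows):
    """Render rows as CSV text that a spreadsheet can import."""
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(["created_at", "chars", "favorites", "text"])
    for row in rows:
        writer.writerow([row["created_at"].isoformat(), len(row["text"]),
                         row["favorites"], row["text"]])
    return out.getvalue()
```

Writing the character count and favorite count as their own columns up front means every later chart is just a pivot or a scatter plot in the spreadsheet.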

First thing I checked — what does my tweet count by year history look like?

Ok! Wow. I guess 2013 was a banner year, then I slumped hard around 2017, and… have never gone back to anything quite like what I was doing before.

By quarter tho?

Not much more interesting. You can see how lame things started off, and then how 2017 actually had two very non-tweetworthy quarters. I think I was off living or something 🏝.
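The yearly and quarterly counts were done in a spreadsheet here, but the same grouping is only a few lines of Python. This is a sketch that assumes the tweet timestamps are already `datetime` objects; the function names and the sample dates are made up for illustration.

```python
from collections import Counter
from datetime import datetime


def count_by_year(timestamps):
    """Number of tweets per calendar year."""
    return Counter(ts.year for ts in timestamps)


def count_by_quarter(timestamps):
    """Number of tweets per (year, quarter), with quarter in 1..4."""
    return Counter((ts.year, (ts.month - 1) // 3 + 1) for ts in timestamps)


# Made-up timestamps, purely for illustration.
sample = [
    datetime(2013, 1, 5), datetime(2013, 4, 9), datetime(2013, 4, 30),
    datetime(2017, 11, 2),
]
for (year, quarter), n in sorted(count_by_quarter(sample).items()):
    print(f"{year} Q{quarter}: {n}")
```

One thing to watch: a quarter with no tweets simply has no key in the `Counter` (looking it up returns 0), so the quiet quarters need to be filled in explicitly before charting or they vanish from the axis.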

Ideally I could look at my engagement data to see whether maybe I was consuming and not producing. There is a ton of great data in the dump about all the gory ad impressions I’ve been served. It seems to cover only recent activity, though, and doesn’t go back that far… interesting nonetheless. I can see the number of times Cheerios ads have hit me up, e.g.:

"engagements" : [
  {
    "impressionAttributes" : {
      "deviceInfo" : {
        "osType" : "Desktop"
      },
      "displayLocation" : "Trends",
      "promotedTrendInfo" : {
        "trendId" : "85769",
        "name" : "#CheerWithCheerios",
        "description" : "Send your digital cheer today!"
      },
      "advertiserInfo" : {
        "advertiserName" : "Cheerios",
        "screenName" : "@cheerios"
      },
      "impressionTime" : "2022-02-04 17:55:07"
    },
    "engagementAttributes" : [
      {
        "engagementTime" : "2022-02-04 17:55:08",
        "engagementType" : "TrendView"
      },
      {
        "engagementTime" : "2022-02-04 17:55:08",
        "engagementType" : "TrendView"
      }
    ]
  },

I’m a bit sad they use ISO dates everywhere else but not in the above engagement data… minor jab.
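Counting how often each advertiser shows up, and reading those space-separated timestamps, could look roughly like the sketch below. The field names come straight from the excerpt above. How the `engagements` list is nested inside the full archive file is not shown there, so the functions just take a list of entries shaped like that one. The function names are made up.

```python
from collections import Counter
from datetime import datetime

# Layout of the timestamps in the excerpt: date and time separated by a space.
IMPRESSION_TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def impressions_per_advertiser(engagements):
    """Count entries per advertiserName, skipping entries without one."""
    counts = Counter()
    for item in engagements:
        info = item.get("impressionAttributes", {}).get("advertiserInfo", {})
        name = info.get("advertiserName")
        if name:
            counts[name] += 1
    return counts


def impression_time(item):
    """Parse an entry's impressionTime into a naive datetime (no time zone is given)."""
    return datetime.strptime(item["impressionAttributes"]["impressionTime"],
                             IMPRESSION_TIME_FORMAT)


# One entry trimmed down from the excerpt above.
sample = [{
    "impressionAttributes": {
        "advertiserInfo": {"advertiserName": "Cheerios", "screenName": "@cheerios"},
        "impressionTime": "2022-02-04 17:55:07",
    },
    "engagementAttributes": [],
}]
print(impressions_per_advertiser(sample).most_common(5))
```

The excerpt carries no time zone on these values, so the parsed datetimes are left naive rather than guessing at one.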

Anywho… OK, so I had some peaks and valleys. What else?

I wonder if one month is busier than others?

Well, October (i.e. the 10 on the x-axis of the above chart) seems to be slightly busier than the rest. Apple release time? Maybe…

What about hours of the day?

Not incredibly surprising. Also, clearly I don’t take a “break” from tweeting during work hours. Or do I?
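An hour-of-day chart only means something in local time, so it is worth being explicit about the conversion. The sketch below assumes the tweet timestamps are timezone-aware (for example UTC) and uses a fixed UTC offset, which ignores daylight saving time. The offset in the example is a placeholder, not anyone's real time zone, and the function names are made up.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def count_by_local_hour(timestamps, utc_offset_hours):
    """Count timezone-aware timestamps per hour of day (0..23) at a fixed UTC offset."""
    local = timezone(timedelta(hours=utc_offset_hours))
    return Counter(ts.astimezone(local).hour for ts in timestamps)


def share_in_work_hours(hour_counts, start=9, end=17):
    """Fraction of tweets whose local hour falls in [start, end)."""
    total = sum(hour_counts.values())
    if total == 0:
        return 0.0
    return sum(n for hour, n in hour_counts.items() if start <= hour < end) / total


# Made-up UTC timestamps and a placeholder offset of UTC-5.
sample = [
    datetime(2013, 5, 6, 14, 0, tzinfo=timezone.utc),
    datetime(2013, 5, 6, 22, 30, tzinfo=timezone.utc),
    datetime(2013, 5, 7, 3, 15, tzinfo=timezone.utc),
]
hours = count_by_local_hour(sample, utc_offset_hours=-5)
print(sorted(hours.items()), share_in_work_hours(hours))
```

A single number like the work-hours share answers the "or do I?" question more directly than eyeballing 24 bars.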

How about the above broken up by year?

Ohhh this is terrible

A total mishmash! Well, mostly… drilling in on particular years I can see trends and think back on some of those years... that top blue line is 2013. Seemingly by 5pm, all I could think of doing was tweeting!

What about day of week?

Ok, at least now I’m not dishonouring my work too much: Saturday is the busiest day. By year now though?

Hmmm can’t see much here. I know, we need trend lines!!!

Oh yuck, this is impossible. How about different buckets for time: not just single years, but 5-year chunks?

Ok, sorta neat. In the 2012–2016 stretch, when I was most prolific, my time spent on Twitter tended to ramp up as the week progressed. For sure though? Let’s throw a stacked bar chart at this!
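Here is a sketch of the 5-year bucketing in Python, under the assumption that buckets start in 2007 (the first year on Twitter) so that 2012–2016 comes out as one chunk. It reports each weekday's share of a bucket rather than raw counts, since the buckets hold very different numbers of tweets. All names and sample dates are made up.

```python
from collections import Counter, defaultdict
from datetime import date

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def bucket_label(year, first_year=2007, width=5):
    """Label the fixed-width bucket a year falls into, e.g. 2013 -> '2012-2016'."""
    start = first_year + ((year - first_year) // width) * width
    return f"{start}-{start + width - 1}"


def weekday_share_by_bucket(days, first_year=2007, width=5):
    """For each bucket, the fraction of that bucket's tweets posted on each weekday."""
    counts = defaultdict(Counter)
    for day in days:
        counts[bucket_label(day.year, first_year, width)][WEEKDAYS[day.weekday()]] += 1
    shares = {}
    for label, per_day in counts.items():
        total = sum(per_day.values())
        shares[label] = {name: per_day[name] / total for name in WEEKDAYS}
    return shares


# Made-up dates: two Saturdays and a Monday in 2013, one Monday in 2018.
sample = [date(2013, 6, 1), date(2013, 6, 8), date(2013, 6, 3), date(2018, 1, 1)]
for label, share in sorted(weekday_share_by_bucket(sample).items()):
    print(label, {day: round(value, 2) for day, value in share.items() if value})
```

Shares per bucket are also what a 100% stacked bar chart plots, so the output maps directly onto that chart type.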

Ok enough of this.

What about … all tweets, over time, by character count?

OK, this is actually pretty cool. You can see how it starts out pretty barren, with the hot n heavy 140 character count, then it opens up towards the end of 2018. Wait a sec, was it BECAUSE the character limit opened up that I came back?? Could a long-awaited feature really spark a whole new revolution?? Not for others; it looks like things didn’t really take off in 2018 (source):

I digress, back to the bubble chart! You can see some particularly heavy times in there. What is going on here:

A whole bunch of tweets repeated in a row, of very similar length? Ahhh yes…

A simple hashtag and a URL. And geez Twitter, can’t you embed the IG picture? Man, I remember when stuff like that was important.
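Streaks like that were spotted by eye on the chart here, but they can also be found programmatically. A rough sketch, assuming the character counts are in posting order: flag any stretch of consecutive tweets whose lengths each sit within a couple of characters of the previous one. The function name and the default thresholds are arbitrary choices for illustration.

```python
def find_length_runs(lengths, tolerance=2, min_run=3):
    """Return (start, end) index pairs (end exclusive) for runs of at least
    min_run consecutive values, each within tolerance of the one before it."""
    runs = []
    start = 0
    for i in range(1, len(lengths) + 1):
        # Close the current run at the end of the list or when the length jumps.
        if i == len(lengths) or abs(lengths[i] - lengths[i - 1]) > tolerance:
            if i - start >= min_run:
                runs.append((start, i))
            start = i
    return runs


# Made-up character counts: a streak of four near-identical tweets in the middle.
print(find_length_runs([120, 45, 46, 45, 44, 138, 30, 31]))
```

Because each tweet is compared only with its neighbour, a run can drift slowly in length. For auto-posted hashtag-plus-link tweets that is usually fine, since their lengths barely move.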

That tweet looks awfully lonely. I wonder if any of my tweets are getting likes?

ENTER the Bubble Chart

The above chart is the same scatter plot of all my tweets, but color coded AND sized for the number of favorites (that unlabeled legend at the right).

It’s neat to see the mostly blue eventually getting a red treatment and a bit of orange, and how things are a lot more random and well liked these days.

Well, if you made it this far, I’ll let you in on a little secret. I really just wanted to see if I could have a use for a bubble chart. This is a bit of a fudge, and if I could I’d shrink the size of the circles, but you get the gist!! I think it’s legit. Ok, the tweet character count doesn’t really contrast with the favorites. So what, it’s pretty! Bubble chart inspiration comes from this Hans Rosling video.
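On shrinking the circles: one common approach, if the charting tool accepts explicit sizes, is to compute each radius yourself so that the bubble's area rather than its radius tracks the favorite count, with a cap on the largest bubble. The sketch below is a generic illustration and not a Google Sheets feature; the function name and the default radii are made up.

```python
import math


def bubble_radius(favorites, max_favorites, max_radius=12.0, min_radius=1.5):
    """Radius for one bubble so that its area, not its radius, tracks the favorite count."""
    if max_favorites <= 0:
        return min_radius
    # Area scales with radius squared, so take the square root of the ratio.
    scaled = max_radius * math.sqrt(max(favorites, 0) / max_favorites)
    return max(min_radius, scaled)


# Made-up favorite counts; the most-liked tweet gets the capped radius.
favorites = [0, 1, 25, 100]
print([round(bubble_radius(f, max(favorites)), 2) for f in favorites])
```

The minimum radius keeps tweets with zero favorites visible, which matters here because they make up most of the early years.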

If anyone knows of any other great personal or social data sets to play with, I’d love to hear about them. On my backburner is a bit of a personal data lake project, so if any of this is of interest and you want to chat, please reach out!

BTW I’m @mobob on Twitter. Thanks for reading!! 🤓
