spicer2 OP t1_jc6fzop wrote on March 14, 2023 at 11:56 AM

Reply to [OC] The most common song titles in music by spicer2

Tools: Excel

Source: MusicBrainz (online music encyclopedia)

Methodology/FAQs: I've seen a few other people attempt to figure this out, but I've never been completely satisfied with their methodology. Music YouTuber David Bennett Piano has a video where he used Wikipedia disambiguation pages to arrive at a final list, which is clever but not robust enough for my tastes.

I had to do some careful filtering on the data to make it useful, so to be clear on what you’re seeing: this is the number of *original compositions* with that title. This means cover versions are excluded – if you don’t do that, the list is just full of Christmas songs that have been recorded many times over. I also filtered it so it is only *songs* – this means things like musicals, soundtracks, and classical albums are excluded (otherwise you get a load of “Preludes”). This is also why the figures are lower than some other sources you might have seen for this kind of data.

Some other interesting tidbits:

-“I Love You” is 28th in the all-time list;

-The most common animal is “Butterfly” (55th);

-The most common color is “Blue” (61st);

-“Untitled” (as in, an actual title called “Untitled”) is 63rd;

-The most commonly used place name is "California" (75th);

-If you exclude “Grace”, the most common name is “Maria”. And if you exclude that, it’s “Caroline” (unsweetened).

spicer2 OP t1_jb4hznt wrote on March 6, 2023 at 11:01 AM

Reply to [OC] The most dominant athletics world records by spicer2

Tools used: Python, Excel

Source: IAAF Toplists

Methodology/other bits:

A question I’ve wondered for a long time is if there’s a good way to measure how dominant athletics records are between events. I remember reading once years ago that, with the help of some statistical trickery, Paula Radcliffe’s (now-broken) marathon world record was considered to be one of the most impressive human feats, up there with Bob Beamon’s long jump, but I never tracked down the original source or methodology behind it.

I’m sure most people here know what a z-score is but I’ll show my working for full transparency. It essentially tells you how many standard deviations from the mean a given score is, i.e. how “exceptional” it is. It’s really handy for letting you make comparisons across categories that use different measurements.

To be clear about the sample I used - I took the 100 best competitors’ times, not the 100 best times overall. So Usain Bolt’s score is gauged against the best times of Asafa Powell, Tyson Gay etc., not all of those and also his own. The main reason I did that was because I was most interested in how dominant the individuals were, not the times themselves.
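For anyone who wants to see the idea in code, here is a minimal sketch of the approach described above, not the exact script behind the chart. The athlete labels and times are toy values made up for illustration, and the function names are hypothetical. It also assumes the record holder's own best stays in the sample and that the sample standard deviation is used, neither of which is stated above.

```python
from statistics import mean, stdev


def personal_bests(performances):
    """Keep only each athlete's fastest time (lower is better for timed events)."""
    bests = {}
    for athlete, time in performances:
        if athlete not in bests or time < bests[athlete]:
            bests[athlete] = time
    return bests


def z_score(value, sample):
    """Number of standard deviations that `value` sits from the sample mean."""
    return (value - mean(sample)) / stdev(sample)


# Toy data: athlete "A" appears twice, but only the faster time is kept.
performances = [("A", 9.0), ("A", 9.5), ("B", 10.0), ("C", 10.0), ("D", 11.0)]

bests = personal_bests(performances)
sample = list(bests.values())

# The "record" here is simply the fastest personal best in the sample.
record_z = z_score(min(sample), sample)
print(round(record_z, 3))
```

For timed events a more dominant record gives a more negative z-score, so the sign would need flipping before comparing against jumps and throws, where bigger is better.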

On that note, I really like this chart as it shows just how good Usain Bolt was. I also like that it confirms the 100/110m hurdles is such a tight and unpredictable event where you don’t really get athletes that consistently sweep the board for medals, as you do in others.

PS: I gathered the data for this in January but sat on it for a bit, so some of this may be slightly out-of-date.

spicer2 OP t1_jac9b7x wrote on February 28, 2023 at 11:57 AM

Reply to [OC] Logan Paul has great timing: energy drinks are now as popular with Gen Z as beer by spicer2

Tools: Datylon

Source: GWI Core, a consumer survey run worldwide every quarter (full disclosure, I work for GWI)

Methodology/other bits: This data is from 47 countries around the world (some countries excluded for obvious reasons). It's specifically among Gen Zers aged 21 and up, to make sure they're of legal drinking age in every country where we can ask about alcohol consumption.

Logan Paul isn't the main driver of this trend at a global level (though I'm sure he'd be pretty happy about these figures). Having said that, we've seen sizable spikes in the UK in the last year since Prime was released. The main factor driving all this, though, is that Gen Z are way more sober/"sober-curious" than other generations.

PS: if you're wondering why there's more volatility at the start of this time period, that's probably because the sample of 21+ Gen Zers was smaller.

spicer2 OP t1_j8d8gr5 wrote on February 13, 2023 at 1:22 PM

Reply to [OC] Rates of anxiety and depression have spiked since the start of the pandemic by spicer2

Tools used: Datylon

Source: GWI USA (Full disclosure, I work for GWI)

Methodology/other bits: GWI USA started in Q2 2020, so annoyingly we don't have a benchmark for pre-pandemic rates of these conditions. But the trend is pretty clear - and there has been another recent spike, likely off the back of inflation and cost-of-living worries.

Note this is self-reported data - this is based on respondents who declare whether or not they experience these conditions. In other words, the data in the chart is from the question "do you experience these currently", not "do you currently see a doctor about these".

spicer2 OP t1_j68f586 wrote on January 28, 2023 at 2:05 PM

Reply to [OC] Herd you liek Mudkips: the most (and least) memorable Pokemon by spicer2

Source: Various quizzes on Sporcle

Tools used: Excel

Methodology/other bits: Not very complicated, just did each quiz and took the top and bottom in the stats.

What fascinates me is that the least memorable ones are quite uniform? Most of them either fall into a category of "spiny sea thing", "mushroom thing" or "thing that looks like a bell". Not quite sure if they're the least cute, but I feel there's something there...

spicer2 OP t1_j4n0vbd wrote on January 16, 2023 at 9:34 PM

Reply to [OC] Hours of the day that appear most frequently in song titles by spicer2

Source: Spotify

Tools used: Python, Flourish (not banned for static charts, right??)

Methodology: This was a really interesting exercise but needs a bit of explaining. What you're looking at is the number of songs whose titles contain a time of day within that window - so 3:33AM would fit between 3-4 on the AM side, 7:34PM would go between 7-8 on the PM side, and so on.

This only looks at times formatted by the 12 hour clock, as there's no foolproof way to verify that something like 11:11 is intended as a time and not something else (usually Bible verses). So 12:51 by The Strokes, for example, isn't picked up on this. And it obviously doesn't include times of day that can be expressed with words, like midnight and noon.
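To make the bucketing concrete, here is a rough sketch of how titles could be matched and assigned to an hour window. This is not the actual script: the regex and the `hour_bucket` name are my own assumptions, and it treats a bare "3AM" as a valid time, which may or may not match how the chart was built.

```python
import re
from collections import Counter

# 12-hour clock only: hour 1-12, optional :MM, then AM/PM (also "a.m."/"p.m.").
# A bare "12:51" has no AM/PM marker, so it is deliberately not matched.
TIME_RE = re.compile(
    r"\b(1[0-2]|0?[1-9])(?::([0-5]\d))?\s*([AP])\.?M\b",
    re.IGNORECASE,
)


def hour_bucket(title):
    """Return (hour, 'AM' or 'PM') for the first time found in a title, else None."""
    match = TIME_RE.search(title)
    if match is None:
        return None
    return int(match.group(1)), match.group(3).upper() + "M"


titles = [
    "3:33AM",
    "7:34PM",
    "12:51",  # no AM/PM, so it is skipped
    "Sneaking To The Fridge To Get Beans At Exactly 3:42AM",
]

# Count how many titles land in each hour window.
counts = Counter(b for b in map(hour_bucket, titles) if b is not None)
print(counts)
```

A bucket of `(3, "AM")` corresponds to the 3-4 window on the AM side of the chart.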

Tidbits:

The day in music is like the inverse of a typical day. The general pattern is that when people are most awake there are fewer songs, and vice versa.

I was really surprised that 3:00AM was the most popular minute/time to name a song after. Although the word "midnight" isn't counted here, I assumed 12AM or 12PM would be the obvious winner. 3AM is like the ultimate shorthand for the middle of the night, when you can't sleep, you're thinking about someone and you want to write a song about it. I also have a lot of time for the person who wrote a song called "Sneaking To The Fridge To Get Beans At Exactly 3:42AM".

3:33AM is the most popular time not at the top of an hour.

Every minute between 1:16AM and 7:34AM has at least one song attached to it.

The longest consecutive period of time without an attached song is just 2 minutes (which happens on multiple occasions).

spicer2 OP t1_j3hcy3m wrote on January 8, 2023 at 4:18 PM

Reply to [OC] The most quoted verses in each book of the Bible by spicer2

Data source: COCA (Corpus of Contemporary American English)

Tools used: Excel

Methodology: I'm most interested in applying data viz/data journalism techniques to areas that don't traditionally receive them. I was curious to see if there was a way to measure the relative cultural impact of different parts of the Bible and this is the best I came up with.

I looked up the name of each book in COCA with a wildcard, then scrolled down to find the verse with the highest tally. Obviously it's only a sample of all published media (even if a big one) and doesn't include sermons afaik, but looking at this, I think it's a pretty good representation of the whole.

spicer2 OP t1_izik75l wrote on December 9, 2022 at 10:56 AM

Reply to [OC] Mexican beer and New Zealand’s selflessness: the most distinctive traits and behaviors of 50 countries around the world by spicer2

Data source: GWI Core (ongoing survey asked in 50 different countries)

Tools used: Adobe Illustrator

Full disclosure: I work for GWI. I normally post my personal work on here but thought this was cool and wanted to share.

We run a survey called Core which asks people about many aspects of their lives – what they think, what they buy, what they’re interested in, their online behavior, etc.

We recently hit a milestone of researching in 50 different countries, so we put together this visual to show where consumers in each country are most distinctive from each other - where they’re more likely to do or think something than any other country. So a higher % of people in the Czech Republic buy cheese than anywhere else in the 50, and so on.

There’s some really interesting cultural context behind these data points - feel free to check out a blog we put together that goes into more detail (did you know that the lead singer of A-ha helped EVs become a thing in Norway? Me neither!).

PS: I know the example for Australia is really boring and obvious but trust me, we sweated blood to find these stats and that was the only one we could find!