earnest_dad
earnest_dad OP t1_isy0rkv wrote
Reply to comment by sexy_wash_bucket in [OC] Female names that are composed of two "standalone" names (e.g. "Rosemary", "Annmarie", "Adalynn", "Emmalee"...). Turns out "Jo-" is super versatile [repost with light updates after comments] by earnest_dad
No worries! There are *so* many bugs that end up making their way in here -- good for me to do the double-check!
earnest_dad OP t1_isxvyzx wrote
Reply to comment by Disastrous-Year571 in [OC] Female names that are composed of two "standalone" names (e.g. "Rosemary", "Annmarie", "Adalynn", "Emmalee"...). Turns out "Jo-" is super versatile [repost with light updates after comments] by earnest_dad
This comment has been under-appreciated.
But I see you.
earnest_dad OP t1_isxvvrx wrote
Reply to comment by Ghrota in [OC] Female names that are composed of two "standalone" names (e.g. "Rosemary", "Annmarie", "Adalynn", "Emmalee"...). Turns out "Jo-" is super versatile [repost with light updates after comments] by earnest_dad
Sorry to disappoint. In case it eases the sorrow, there were 5 babies in 2007 named "Lilylynn", which is similarly difficult to say, but I agree -- not quite as exciting as the much-discussed but never actually documented "Lilly-lee".
earnest_dad OP t1_isxvmrp wrote
Reply to comment by magnesiumb in [OC] Female names that are composed of two "standalone" names (e.g. "Rosemary", "Annmarie", "Adalynn", "Emmalee"...). Turns out "Jo-" is super versatile [repost with light updates after comments] by earnest_dad
Interesting question! (Similarly, I'm more familiar with "ara" prefixes; from my identification strategy, though, "Ara" isn't a common standalone name in the way "Ada" is).
As it turns out, this isn't a bug --
"Adalynn" has been in use (with n>5 instances) since 1996, and has really been on the rise since ~2007. In 2017, there were 2,651 female sex babies given the name "Adalynn".
"Adamary" isn't given as often as "Adalynn", but we're seeing its use along a similar timeline -- the name first crossed the (n>5) threshold in 1998, and it has been used at least that many times every year since. While its usage has been declining recently, in the early 2000s it was typically given ~40 times per year.
Similar stories with "Adabelle" and "Adabella", though the timelines are different. "Adabell" is a *much* older name -- it was given to a handful of female sex babies starting in the early 20th century -- we see n>5 uses quite frequently from 1900 - 1931, then it falls off the radar until 2006.
"Adabella" looks more like "Adamary" -- wasn't really in vogue (if you can even say that about a VERY rare name) until 2008.
earnest_dad OP t1_isxugk0 wrote
Reply to comment by -KR- in [OC] Female names that are composed of two "standalone" names (e.g. "Rosemary", "Annmarie", "Adalynn", "Emmalee"...). Turns out "Jo-" is super versatile [repost with light updates after comments] by earnest_dad
I think you may be onto an interesting piece of this, but I do want to be precise about how the names here are identified (this is described carefully in the source / tools comment above). Note that syllabic count isn't a feature here.
I (personally) think it's interesting to identify names that can be sub-divided into standalone names. There's some complexity around whether names that can technically be subdivided (but that we do not think of as compound names) should be included. As an example that generated a lot of discussion in a previous version, think about something like "Elizabeth", which can technically be subdivided into the standalone names "Eliza" and "Beth", but which we don't really think of as a compound name in the sense you describe.
I think you raise an interesting question that gets at what the central interpretation of the plot is; there's a question here about whether this strategy maps cleanly onto true compound names, and I think you're right that there are some we'd want to hand-edit (or otherwise identify) if that's the main goal. To me, it's a tricky thing to decide.
earnest_dad OP t1_isxtgkg wrote
Reply to comment by sexy_wash_bucket in [OC] Female names that are composed of two "standalone" names (e.g. "Rosemary", "Annmarie", "Adalynn", "Emmalee"...). Turns out "Jo-" is super versatile [repost with light updates after comments] by earnest_dad
Can you say a bit more about the concern here? The code I used does the following:
(1) identifies the "maximal proportion" as the greatest share (of all female names in that year) a name receives in any year. Note that these maximal proportions are quite small -- the greatest value represented here is "Joanne" with 0.00420; the smallest values in this chart are less than 10^-5.
(2) converts to "1 name per..." by finding 1/maximal proportion. Note that by this measure, "Joanne" is roughly 1 per 238 names; the very uncommon names (e.g. "Lilylynn") are roughly 1 per 400,000.
(3) uses a log scale gradient to plot
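For anyone who wants to check the arithmetic, here is a minimal sketch of those three steps. It is written in Python rather than the R/tidyverse code actually used for the plot, and the toy rows are an assumption: only the 0.00420 peak for "Joanne" comes from the numbers above, and the other values and the function names are made up for illustration.

```python
import math

# Hypothetical toy data: (name, year, share of all female names that year).
# Only the 0.00420 peak for "Joanne" is taken from the comment; the rest is invented.
rows = [
    ("Joanne", 1940, 0.00310),
    ("Joanne", 1950, 0.00420),
    ("Lilylynn", 2007, 0.0000025),
]

def max_proportion(rows):
    """Step (1): the greatest single-year share each name ever reaches."""
    best = {}
    for name, _year, prop in rows:
        best[name] = max(prop, best.get(name, 0.0))
    return best

def one_per(prop):
    """Step (2): re-express a share as '1 name per N'."""
    return 1.0 / prop

def log_scale(value):
    """Step (3): the value a log10 color gradient would be mapped from."""
    return math.log10(value)

peaks = max_proportion(rows)
for name, prop in peaks.items():
    n = one_per(prop)
    print(f"{name}: 1 per {n:,.0f} (log10 = {log_scale(n):.2f})")
```

The log scale matters because the "1 per N" values span more than three orders of magnitude; on a linear gradient nearly every name would collapse to the same color.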
earnest_dad OP t1_ist1bda wrote
Reply to comment by Buck_Thorn in [OC] Female names that are composed of two "standalone" names (e.g. "Rosemary", "Annmarie", "Adalynn", "Emmalee"...). Turns out "Jo-" is super versatile [repost with light updates after comments] by earnest_dad
LOVE this idea. Oh man. Great call.
earnest_dad OP t1_issyqyv wrote
Reply to comment by jeffinRTP in [OC] Female names that are composed of two "standalone" names (e.g. "Rosemary", "Annmarie", "Adalynn", "Emmalee"...). Turns out "Jo-" is super versatile [repost with light updates after comments] by earnest_dad
These are *excellent* questions. With the data we have, it's much easier to examine the first question you posed. It seems totally doable to create an indicator for whether a name is a combination (like these), and look at the proportion of all names that satisfy this property over time. I'm guessing you're right that there's some regional variation, but unfortunately the babynames library I used doesn't connect names to distinct geographies. Would be very cool to examine that, though!
Thanks for the comment!
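A rough sketch of what that indicator could look like, in Python rather than R. Everything here is an assumption for illustration: the function name, the toy rows, and the choice to weight by number of births rather than by count of distinct names.

```python
from collections import defaultdict

def compound_share_by_year(rows, compound_names):
    """rows: iterable of (year, name, count).

    Returns {year: share of that year's births whose name is in compound_names}.
    """
    total = defaultdict(int)
    hits = defaultdict(int)
    for year, name, count in rows:
        total[year] += count
        if name in compound_names:  # the "is a combination" indicator
            hits[year] += count
    return {year: hits[year] / total[year] for year in sorted(total)}

# Hypothetical counts, not real SSA data.
toy_rows = [
    (2000, "Rosemary", 10),
    (2000, "Emma", 90),
    (2001, "Emma", 50),
]
print(compound_share_by_year(toy_rows, {"Rosemary", "Annmarie"}))
```

Plotting the resulting shares against year would show whether combination names as a group are becoming more or less common.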
earnest_dad OP t1_isswtbh wrote
Reply to [OC] Female names that are composed of two "standalone" names (e.g. "Rosemary", "Annmarie", "Adalynn", "Emmalee"...). Turns out "Jo-" is super versatile [repost with light updates after comments] by earnest_dad
Source: babynames library (R package): https://cran.r-project.org/web/packages/babynames/index.html
Note: this package draws data from the US Social Security Administration
Tools used: R
data preprocessing: tidyverse
visualization: ggplot2
Additional notes:
(1) identify "standalone" names by finding top 1000 female names
(2) identify names that are composed of two standalone names combined
(3) identify common "prefix" and "suffix" names by finding the maximum (annual) proportion of names from (2); restrict to instances where log(max frequency) > -8.5
(4) restrict attention to combined names composed of the names from (3)
(5) hand-edit (that is, remove) unusual prefixes and suffixes: (redditors objected to the inclusion of "eliza-beth" and "elisa-beth"; also hand-remove "ina" and "ora")
Note: an earlier draft of this plot did not filter to female names only, and so incidentally included the name "Josue", a male name which is composed of the common female standalone names, "Jo" and "Sue"
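For the curious, steps (2) and (5), plus the female-only filter from the note, can be sketched roughly as follows. This is Python, not the R code behind the plot, and it skips the frequency threshold in steps (3) and (4). The function names and the toy name lists are hypothetical; a real run would use the top-1000 list from step (1).

```python
def split_compounds(candidates, standalone):
    """Step (2): map each candidate to every (prefix, suffix) split
    in which both pieces are standalone names."""
    standalone = {s.lower() for s in standalone}
    found = {}
    for name in candidates:
        lower = name.lower()
        splits = [
            (lower[:i], lower[i:])
            for i in range(1, len(lower))
            if lower[:i] in standalone and lower[i:] in standalone
        ]
        if splits:
            found[name] = splits
    return found

def hand_edit(found, banned_parts, banned_splits):
    """Step (5): drop splits that were removed by hand."""
    kept = {}
    for name, splits in found.items():
        ok = [
            s for s in splits
            if s not in banned_splits
            and s[0] not in banned_parts
            and s[1] not in banned_parts
        ]
        if ok:
            kept[name] = ok
    return kept

# Hypothetical toy inputs: (name, sex) records and a tiny standalone list.
records = [("Rosemary", "F"), ("Annmarie", "F"), ("Adalynn", "F"),
           ("Elizabeth", "F"), ("Josue", "M")]
standalone = ["Rose", "Mary", "Ann", "Marie", "Ada", "Lynn",
              "Eliza", "Beth", "Jo", "Sue"]

female_only = [name for name, sex in records if sex == "F"]  # excludes "Josue"
found = split_compounds(female_only, standalone)
kept = hand_edit(found,
                 banned_parts={"ina", "ora"},
                 banned_splits={("eliza", "beth"), ("elisa", "beth")})
print(kept)
```

Without the sex filter, "Josue" would be split into "jo" + "sue", which is exactly the earlier-draft issue described in the note.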
earnest_dad t1_ivzub2g wrote
Reply to The effect of the First World War on names, in France [OC] by bjco
This is the cleanest regression discontinuity I've seen in my entire life.