tomiwa1a
tomiwa1a t1_j6bbdzn wrote
Reply to comment by EICONTRACT in [OC] Youtube has over 1 billion hours of videos, we Built an AI Search Engine that can find exact timestamps for anything on Youtube by simonezchen
Watch the demo. Youtube doesn't give matches this precise.
tomiwa1a t1_j6bb9zk wrote
Reply to comment by Thenerdy9 in [OC] Youtube has over 1 billion hours of videos, we Built an AI Search Engine that can find exact timestamps for anything on Youtube by simonezchen
You can try it here: https://atlas.atila.ca/
tomiwa1a t1_j6bb8e8 wrote
Reply to comment by Chramir in [OC] Youtube has over 1 billion hours of videos, we Built an AI Search Engine that can find exact timestamps for anything on Youtube by simonezchen
Exactly! This is how it works.
I agree it's not perfect, but remember that Youtube itself is not a library, so any comparison to real libraries requires some degree of approximation. You can think of it as a rough estimate or, my preferred term, a Fermi estimate.
tomiwa1a t1_j6bapiz wrote
Reply to comment by Purplekeyboard in [OC] Youtube has over 1 billion hours of videos, we Built an AI Search Engine that can find exact timestamps for anything on Youtube by simonezchen
That happens because, unless someone has previously submitted a Youtube video containing "I gotta have more cowbell", we won't have it in our index.
>The transcripts get added on demand when users request to search for a video. It wouldn't make sense to index the entire database given its large size. We're also able to get the transcripts pretty quickly, so there's no need to pre-cache a transcript if a user has never asked for it before. A more detailed overview of how it works can be found here:
tomiwa1a t1_j6baiw7 wrote
tomiwa1a t1_j6bagzz wrote
Reply to comment by MurdrWeaponRocketBra in [OC] Youtube has over 1 billion hours of videos, we Built an AI Search Engine that can find exact timestamps for anything on Youtube by simonezchen
Thanks! The transcripts get added on demand when users request to search for a video. It wouldn't make sense to index the entire database given its large size. We're also able to get the transcripts pretty quickly, so there's no need to pre-cache a transcript if a user has never asked for it before.
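As a rough illustration of the on-demand idea, here is a minimal Python sketch. It is not the actual Atlas code: the class name `OnDemandTranscriptIndex`, the `fetch_transcript` callback, and the plain substring match are all assumptions made for the example (the real search is AI-based, not a substring match).

```python
from typing import Callable, Dict, List, Tuple

# A transcript is assumed to be a list of (start_seconds, text) segments.
Transcript = List[Tuple[float, str]]


class OnDemandTranscriptIndex:
    """Hypothetical sketch: index a video's transcript only when it is first requested."""

    def __init__(self, fetch_transcript: Callable[[str], Transcript]) -> None:
        # fetch_transcript is a stand-in for whatever retrieves a transcript.
        self._fetch = fetch_transcript
        self._index: Dict[str, Transcript] = {}

    def search(self, video_id: str, query: str) -> List[float]:
        # Fetch and store the transcript the first time this video is asked for.
        if video_id not in self._index:
            self._index[video_id] = self._fetch(video_id)
        needle = query.lower()
        # Return the start time of every segment that contains the query.
        return [start for start, text in self._index[video_id] if needle in text.lower()]
```

A video nobody has asked about never enters `_index`, which is why a phrase can be missing from search results until someone submits a video that contains it.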
A more detailed overview of how it works can be found here:
tomiwa1a t1_j6ba59m wrote
Reply to comment by ZeusTheRecluse in [OC] Youtube has over 1 billion hours of videos, we Built an AI Search Engine that can find exact timestamps for anything on Youtube by simonezchen
- The other interesting piece is that the Library of Congress was founded in 1800 (though a fire caused it to restart its collection in 1815). So in just 17 years, Youtube has amassed a collection of information that is 57% the size of the world's largest library, which has been accumulating its collection for over 200 years.
- I'm also Canadian. Hadn't heard of it either until we did this report. We probably haven't heard of it because we're unlikely to need any of its resources. Public libraries already do a really good job for most of our day-to-day needs.
- Wikipedia's small size makes sense given that contributions are heavily restricted and held to such a high bar. Imagine if every Youtube video had to be approved by editors before being uploaded, or every author had to have their books approved by editors before publishing.
tomiwa1a t1_j6b8wnz wrote
Reply to comment by worriedshuffle in [OC] Youtube has over 1 billion hours of videos, we Built an AI Search Engine that can find exact timestamps for anything on Youtube by simonezchen
Can you please clarify what you mean by "it isn't clear how books on Youtube is calculated"?
If you check this range you can see how we arrived at our numbers:
- We calculated the number of hours of video uploaded to Youtube every minute from 2007-2022 (source: Statista)
- We found how many words are spoken per hour of human conversation (source: VirtualSpeech)
- We calculated the number of words in the average book (source: Jericho Writers)
Then we did some calculations with those numbers to arrive at 99,338,400 books on Youtube.
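The arithmetic behind those three steps can be sketched in a few lines of Python. This is a rough sketch, not the spreadsheet's actual formulas: the function names are made up for the example, and the inputs are round placeholders (1 billion hours from the post title, 9,000 words per hour and 90,000 words per book as assumed values), not the figures from the cited sources.

```python
from typing import Dict

MINUTES_PER_YEAR = 60 * 24 * 365  # ignores leap years, fine for a Fermi estimate


def total_hours_uploaded(hours_per_minute_by_year: Dict[int, float]) -> float:
    """Sum the hours uploaded over all years, given an upload rate (hours per minute) per year."""
    return sum(rate * MINUTES_PER_YEAR for rate in hours_per_minute_by_year.values())


def estimate_books(total_hours: float, words_per_hour: float, words_per_book: float) -> float:
    """Convert hours of speech into an equivalent number of average-length books."""
    return total_hours * words_per_hour / words_per_book


# Placeholder inputs only (assumptions, not the sourced values).
books = estimate_books(total_hours=1_000_000_000, words_per_hour=9_000, words_per_book=90_000)
print(f"{books:,.0f}")  # 100,000,000
```

With these round inputs the result lands in the same ballpark as the 99,338,400 figure, which is all a Fermi estimate is meant to show.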
tomiwa1a t1_j6b8gcr wrote
Reply to comment by NovaticFlame in [OC] Youtube has over 1 billion hours of videos, we Built an AI Search Engine that can find exact timestamps for anything on Youtube by simonezchen
The Y axis is the number of books. You're right though, the Y-axis label should definitely have been there.
You can see the details of those calculations here: https://docs.google.com/spreadsheets/d/1UbekWhTLJKQj6ZLipg1R269CQ8g0ACDbzPRDFN14inc/edit#gid=52223737
tomiwa1a t1_j6b80iu wrote
Reply to comment by DenL4242 in [OC] Youtube has over 1 billion hours of videos, we Built an AI Search Engine that can find exact timestamps for anything on Youtube by simonezchen
I don't think it's fair to say that comparing Youtube to a Library is like comparing Mt. Everest to a Cow. For one thing, there is actually a pretty clever way to estimate the amount of text on Youtube and compare it to the amount of text in a library.
Maybe if I explain how we made the graph, you'll see that it's more apples to apples than mountains to cows:
- We calculated the number of hours of video uploaded to Youtube every minute from 2007-2022 (source: Statista)
- We found how many words are spoken per hour of human conversation (source: VirtualSpeech)
- We calculated the number of words in the average book (source: Jericho Writers)
Then we did some calculations with those numbers to arrive at 99,338,400 books on Youtube.
You can see the details of those calculations here: https://docs.google.com/spreadsheets/d/1UbekWhTLJKQj6ZLipg1R269CQ8g0ACDbzPRDFN14inc/edit#gid=52223737
tomiwa1a t1_j69prfr wrote
Reply to comment by [deleted] in [OC] Youtube has over 1 billion hours of videos, we Built an AI Search Engine that can find exact timestamps for anything on Youtube by simonezchen
Good point, here's how we got this information.
- We calculated the number of hours of video uploaded to Youtube every minute from 2007-2022 (source: Statista)
- We found how many words are spoken per hour of human conversation (source: VirtualSpeech)
- We calculated the number of words in the average book (source: Jericho Writers)
Then we did some calculations with those numbers to arrive at 99,338,400 books on Youtube.
You can see the details of those calculations here: https://docs.google.com/spreadsheets/d/1UbekWhTLJKQj6ZLipg1R269CQ8g0ACDbzPRDFN14inc/edit#gid=52223737
Edit: I also have a question about the last thing you said: "there's so much more content than that though".
What other content is there?
tomiwa1a t1_j69illm wrote
Reply to comment by Lirlya in [OC] Youtube has over 1 billion hours of videos, we Built an AI Search Engine that can find exact timestamps for anything on Youtube by simonezchen
Which ones are missing?
tomiwa1a t1_j6bbj7o wrote
Reply to comment by insane9001 in [OC] Youtube has over 1 billion hours of videos, we Built an AI Search Engine that can find exact timestamps for anything on Youtube by simonezchen
The Y axis is the number of books. I agree with you though, that was an oversight on our part. I also don't like it when graphs don't have a labelled Y axis. Next time we'll add one.