Submitted by Ch1nada t3_10oauj5 in MachineLearning
I’ll try to go into detail about the implementation and the difficulties, in case it’s useful for anyone else trying to do something similar with an applied ML project. There’s a TLDR at the end if you’d like the short version/result.
At the end of last year I convinced myself to start 2023 by creating a side-project that I'd actually finish and deploy, and perhaps earn some “passive” income (spoiler: not so passive after all :P). After some brainstorming I settled on making an automated YouTube channel about finance news, since I had just gotten into investing. Shorts seemed more manageable, and their monetization is changing in February, so I went with that.
My rough initial idea was to fetch online articles, summarize them, make a basic compilation with some combination of moviepy, OpenCV and stock photos, and be done. I was pretty worried about the summarization, since in my ML day job I mainly work with vision or sensor data in manufacturing, not NLP. Also, I quickly realized that moviepy with still images and some overlaid text was not very attractive to viewers (starting with myself).
Fast-forward a few days, and after some research online I came across two things, Huggingface transformers (yep, I know I’ve been living under a rock :P) and After Effects scripting. From here, it became mainly about figuring out exactly which ML models I needed to fine-tune for finance / social media and for what, then putting it all together.
The entire workflow looks something like this: the bot fetches daily online news about a topic (stocks or crypto), runs sentiment analysis on each title, and summarizes the full text into a single sentence. I fine-tuned SBERT on ~1.5M posts from /r/worldnews (publicly available in Google Cloud BigQuery) so that it could predict a “social engagement” score, which is used to rank and filter the news that makes it into the video.
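The ranking-and-filtering step can be sketched in plain Python. This is a hypothetical illustration, not the author's code: the field names, the `rank_news` helper, and the threshold/top-k values are all made up, and the `engagement` values stand in for predictions from the fine-tuned SBERT regressor.

```python
# Hypothetical sketch: each fetched article carries a title, a one-sentence
# summary, a sentiment label, and a predicted "social engagement" score
# (here hard-coded; in the real pipeline it would come from the SBERT model).

def rank_news(items, top_k=5, min_score=0.4):
    """Keep the top_k articles whose predicted engagement clears a threshold."""
    eligible = [it for it in items if it["engagement"] >= min_score]
    return sorted(eligible, key=lambda it: it["engagement"], reverse=True)[:top_k]

news = [
    {"title": "Stock A soars", "summary": "...", "sentiment": "positive", "engagement": 0.91},
    {"title": "Quiet trading day", "summary": "...", "sentiment": "neutral", "engagement": 0.12},
    {"title": "Crypto dips", "summary": "...", "sentiment": "negative", "engagement": 0.67},
]

selected = rank_news(news, top_k=2)  # drops the low-engagement item, keeps the top 2
```

The threshold plus top-k cutoff keeps the video length bounded even on slow news days.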
Finally, all of this is combined into a single JSON object written into a .js file that another “content creator” script uses to render the video from a template with aerender in Python. The content of this template is generated dynamically from the .js file via AE Expressions. This module also uses a TTS library to generate voice-overs for the text, generates the title (using NLTK to identify the main subjects of each headline), and writes the video’s description. Pexels stock videos are used for the background.
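One plausible way to produce that .js file is to serialize the selected items with `json.dumps` and wrap the result in a variable assignment, since After Effects Expressions work with JavaScript-style data rather than raw JSON files. The variable name `videoData` and the field layout below are illustrative assumptions, not the author's actual schema:

```python
import json

def render_ae_data(news_items, var_name="videoData"):
    # Wrap the JSON payload in a JS variable assignment so an After Effects
    # template can consume it (variable name and schema are hypothetical).
    payload = json.dumps({"items": news_items}, indent=2, ensure_ascii=False)
    return f"var {var_name} = {payload};\n"

items = [{"title": "Stock A soars",
          "summary": "Shares of A jumped 10% after strong earnings.",
          "sentiment": "positive"}]
js_source = render_ae_data(items)

with open("videoData.js", "w", encoding="utf-8") as f:
    f.write(js_source)
```

Keeping the data in one file means the AE template and the TTS/title steps all read from a single source of truth per video.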
In principle the upload to YouTube could also be automated, but at this stage I’m handling it manually, since the JSON generation is not as robust as I’d like and the output file often needs to be tweaked and fixed before the video can be finalized and uploaded. An example is the summary being too short or vague when taken out of the context of the original article. If you increase the summarizer’s max_length to compensate, the text can easily become too long for the overlay to fit the pre-defined dimensions, or the total audio can exceed the maximum duration of a YouTube Short.
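Those two failure modes can at least be detected automatically before rendering. The sketch below is a guessed validation step, not part of the author's pipeline: the character budget approximates the overlay's fixed dimensions, and the speaking rate and per-item audio budget are invented figures used only to illustrate the check.

```python
def fits_constraints(summary, max_chars=180, words_per_sec=2.5, max_audio_sec=8.0):
    """Return True if a summary plausibly fits both the text overlay and the
    per-item audio budget of a Short (all limits here are illustrative)."""
    too_long_visually = len(summary) > max_chars        # overlay has fixed dimensions
    est_audio = len(summary.split()) / words_per_sec    # rough TTS duration estimate
    return not too_long_visually and est_audio <= max_audio_sec
```

A summarizer retry loop could shrink `max_length` and regenerate until this check passes, instead of fixing the file by hand.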
With some more work I’m confident the whole process can be automated further. For those interested, feel free to check the result here:
If you have any questions or suggestions I’d be happy to hear them.
TLDR: Coded an automated (not 100% yet, but I’ll get there) YouTube Shorts channel about finance news to create a passive income stream. It ended up way harder, more fun, and far less “passive” than I initially expected.
nTro314 t1_j6e5cbh wrote
I am impressed