Submitted by OnlineGrab t3_y3s4ar in MachineLearning
OnlineGrab OP t1_ishhqxx wrote
Reply to comment by master3243 in [P] Stable-Diffusion fine tuned on mechas from the anime franchise Gundam by OnlineGrab
Thanks! There are 1,565 images in the dataset. The original Pokemon project used an even smaller one (fewer than 1K images).
Each row is a Gundam image plus a text description. The original project used BLIP to auto-caption the images, but that didn't really work for this dataset, so instead I asked BLIP to describe only the colors and inserted them into a generic description: "A robot, humanoid, futuristic, <colors>". One could likely get better results with more fine-grained captions.
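For anyone wanting to try the same captioning trick, here is a rough sketch of the template step only. It assumes the color words have already been pulled out of BLIP's output somehow (that part is not shown), and the names `build_caption` and `build_rows`, the comma-joined color format, the lowercasing/deduplication, and the `image`/`text` row keys are all my own assumptions, not necessarily what was used for this dataset.

```python
# Generic description quoted in the comment; {colors} fills the "<colors>" slot.
TEMPLATE = "A robot, humanoid, futuristic, {colors}"
FALLBACK = "A robot, humanoid, futuristic"  # assumption: used when no colors were found


def build_caption(colors):
    """Insert a list of color words into the generic description."""
    cleaned = []
    for color in colors:
        color = color.strip().lower()
        # Drop empty strings and duplicates while keeping the original order.
        if color and color not in cleaned:
            cleaned.append(color)
    if not cleaned:
        return FALLBACK
    # Assumption: colors are joined with ", " to match the rest of the template.
    return TEMPLATE.format(colors=", ".join(cleaned))


def build_rows(image_colors):
    """Turn a {image_path: [colors]} mapping into image + text rows."""
    return [
        {"image": path, "text": build_caption(colors)}
        for path, colors in image_colors.items()
    ]


if __name__ == "__main__":
    # Hypothetical file names and colors, for illustration only.
    example = {"gundam_0001.png": ["White", "blue", "red"], "gundam_0002.png": []}
    for row in build_rows(example):
        print(row)
```

Keeping the template in one place makes it easy to swap in a more fine-grained description later without touching the rest of the pipeline.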