Submitted by evanthebouncy t3_zxef0f in MachineLearning

Foundation models can generate realistic images from prompts, but do these models understand their own drawings? Generating SVG (Scalable Vector Graphics) gives us a unique opportunity to ask this question. SVG is programmatic: a drawing is written out as primitives such as circles, rectangles, and lines. To draw in SVG, the model must therefore decompose the target object into meaningful parts, approximate each part with simple shapes, and then arrange the parts in a meaningful way.
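
To make the task concrete, here is a minimal sketch of the three steps (decompose, approximate, arrange) done by hand in Python. It is not taken from the blog: the tree example, the coordinates, and the helper names `circle`, `rect`, `line`, and `lollipop_tree` are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical helpers: each returns one SVG primitive as a string.
def circle(cx, cy, r, fill):
    return f'<circle cx="{cx}" cy="{cy}" r="{r}" fill="{fill}"/>'

def rect(x, y, w, h, fill):
    return f'<rect x="{x}" y="{y}" width="{w}" height="{h}" fill="{fill}"/>'

def line(x1, y1, x2, y2, stroke="black"):
    return f'<line x1="{x1}" y1="{y1}" x2="{x2}" y2="{y2}" stroke="{stroke}"/>'

def lollipop_tree():
    # 1. Decompose: a cartoon tree is a trunk, a crown, and the ground.
    # 2. Approximate: each part becomes one simple shape.
    # 3. Arrange: coordinates place the crown on top of the trunk,
    #    and the trunk on the ground line (y grows downward in SVG).
    parts = {
        "trunk": rect(45, 60, 10, 35, "saddlebrown"),
        "crown": circle(50, 40, 25, "green"),
        "ground": line(10, 95, 90, 95),
    }
    body = "\n  ".join(parts.values())
    return (
        '<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 100 100">\n'
        f"  {body}\n"
        "</svg>"
    )

svg = lollipop_tree()
ET.fromstring(svg)  # raises ParseError if the markup is not well-formed XML
print(svg)
```

Saving the printed output as a `.svg` file and opening it in a browser shows the result. Step 3 is where getting the numbers right matters: shift the crown's `cy` by 40 and the parts are all still there, but the drawing no longer reads as a tree.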

Check out the blog post (5-minute read) for the full report: https://medium.com/p/74ec9ca106b4

tl;dr:
GPT can symbolically decompose an object into parts, is okay at approximating the parts using SVG, is bad at putting the parts together, and is Egyptian.

Happy to take comments and Q&A here :D

--evan

35

Comments

slashdave t1_j1zwb0a wrote

SVG supports complex shapes via paths, text, images (photos) and complex fills including gradient fields.

−1

suspicious_Jackfruit t1_j21mu6v wrote

Didn't think about SVG. I got ChatGPT to draw in ASCII art instead. It drew itself as a human, but with a larger head.

3