I have spent way too many hours messing with getting this working, and now it's time for sleep...
Getting LLMs to draw complex objects, in this case a detailed turtle, using nothing but HTML code and no image files is REALLY hard. This is about as good as I could get tonight, and as simple and silly as it looks, it's miles ahead of the zero-shot results from any model that isn't using the workflow concepts I have been going on about recently.
Ironically, the results of very difficult visual programming tasks, like ASCII or HTML art production, are easier and quicker to score. When it looks like a turtle, it's definitely a win. Most zero-shot results don't look anything like a turtle, and devising a way to get anything turtle-like is a heavy lift. If you want to try this for yourself with any model of your choosing, here's the main single-line instruction I built my entire workflow around.
"Write html code to create a webpage that displays a highly detailed graphical visualization of a turtle viewed from the side, without using image files of any kind."
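If you try this yourself, it may help to confirm mechanically that a model really stuck to the "no image files" part before you judge the picture by eye. Here's a rough sketch of such a check in Python, using only the standard library. The function name find_image_use and the list of tags it flags are my own assumptions for illustration, not part of the workflow, and the check is deliberately crude (it ignores JavaScript that loads images at runtime, for example).

```python
import re
from html.parser import HTMLParser

# Assumption: these tags are treated as "using an image file".
IMAGE_TAGS = {"img", "image", "picture"}

# Matches CSS url(...) references, except same-document fragments such as
# url(#gradient), which SVG uses for gradients and patterns, not image files.
URL_REF = re.compile(r"url\((?!\s*['\"]?#)", re.IGNORECASE)


class ImageUseFinder(HTMLParser):
    """Collects anything in the page that looks like an image reference."""

    def __init__(self):
        super().__init__()
        self.violations = []
        self._in_style = False

    def handle_starttag(self, tag, attrs):
        self._in_style = tag == "style"
        if tag in IMAGE_TAGS:
            self.violations.append(f"<{tag}> element")
        for name, value in attrs:
            if value and URL_REF.search(value):
                self.violations.append(f"url() in {name} attribute of <{tag}>")

    def handle_endtag(self, tag):
        if tag == "style":
            self._in_style = False

    def handle_data(self, data):
        # Only inspect text that sits inside a <style> block.
        if self._in_style and URL_REF.search(data):
            self.violations.append("url() in <style> block")


def find_image_use(html: str) -> list:
    """Return a list of suspected image references (empty means none found)."""
    finder = ImageUseFinder()
    finder.feed(html)
    finder.close()
    return finder.violations
```

An empty list means nothing suspicious was found. It doesn't tell you whether the result looks like a turtle; that part is still a judgment call by eye.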
Also, I forgot to mention in the description: that's Claude 3 Opus with a 6-step workflow that generates HTML code using shapes to draw pictures of complex objects (no visual analysis was used… the model is blind for the whole execution)
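For anyone wondering what a blind multi-step workflow looks like in code, here's a minimal sketch. To be clear about the assumptions: the six step prompts below are hypothetical placeholders I made up for illustration, not the actual steps of the workflow, and call_model is a stand-in for whatever function sends a prompt to your model and returns its text reply. The only part taken from the post is the task instruction itself.

```python
from typing import Callable, List, Sequence

# The single-line instruction from the post.
TASK = (
    "Write html code to create a webpage that displays a highly detailed "
    "graphical visualization of a turtle viewed from the side, "
    "without using image files of any kind."
)

# Hypothetical placeholder prompts; the real workflow's steps will differ.
STEP_PROMPTS = [
    "List the body parts of the object and the basic shapes for each one.",
    "Decide the position, size, and color of every shape.",
    "Write the full HTML page from that plan.",
    "Reflect: describe in words what the page would look like when rendered.",
    "Reflect: list what would make it look less like the intended object.",
    "Rewrite the full HTML page with those problems fixed.",
]


def run_workflow(
    call_model: Callable[[str], str],
    task: str = TASK,
    steps: Sequence[str] = STEP_PROMPTS,
) -> List[str]:
    """Run each step in order, feeding all earlier text back in as context.

    The model only ever sees text (the task, the step prompts, and its own
    earlier replies), so it never looks at a rendered image.
    """
    replies = []
    context = task
    for step in steps:
        reply = call_model(f"{context}\n\n{step}")
        replies.append(reply)
        # Carry the step and its reply forward so later steps can build on it.
        context = f"{context}\n\n{step}\n{reply}"
    return replies
```

The last entry in the returned list is the final HTML attempt. Swapping in a different model only means passing a different call_model function.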
Here's a quick comparison I threw together this morning. It uses the same baseline final instruction step of the workflow, together with the revised primary step instructions that Claude 3 Opus generated last night for the original picture in the post. Using those with only the final reflective steps of the workflow, this is what four different models generated.
The obvious takeaway is that there seems to be a very direct correlation between a model's ability to draw a better-quality image initially and its ability to use reflection and CoT to improve its visualization.
Here's the link to my website where I posted the full workflow for the current version of the process I used: https://lnkd.in/gU_mJMGW