Athena

The App


Athena is a simple web app with a single purpose: allow users to easily obtain AI-assisted descriptions of their images. A lot of thought went into the user interface and experience, though not necessarily effort 😜. More on that later.

At its core, Athena allows:

👉 Easy upload of your images: drag and drop them into the file drop area

👉 Pasting files, or dragging most images straight from the web (e.g. Google Images)

🌿 Each time a file is uploaded, a card is created. Each card contains the image thumbnail, an editable prompt, a trigger button and a response area.

It’s really that simple. Give it a try!
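For the curious, here is a minimal sketch of how one of those cards could be put together. It is not Athena's actual code: the function names, the `response` class and the "Describe" button label are all made up for illustration, and the Bootstrap 5 class names are an assumption about the styling.

```javascript
// Escapes text so a user-edited prompt cannot inject markup into the card.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Returns the markup for one card: thumbnail, editable prompt,
// trigger button and an empty response area.
// (Hypothetical helper; class names assume Bootstrap 5.)
function buildCardHtml({ id, thumbnailUrl, prompt }) {
  return `
<div class="card" id="card-${escapeHtml(id)}">
  <img class="card-img-top" src="${escapeHtml(thumbnailUrl)}" alt="Uploaded image thumbnail">
  <div class="card-body">
    <textarea class="form-control">${escapeHtml(prompt)}</textarea>
    <button class="btn btn-primary" type="button">Describe</button>
    <div class="response"></div>
  </div>
</div>`.trim();
}
```

In a browser, the returned string could be appended to a container with `insertAdjacentHTML("beforeend", ...)` each time a file is added.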

⚠️ The app deployed on GitHub Pages is inert; you’ll need to clone the repo and adjust utils.js to point to your LLM API endpoint, along with your API key 🔐
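As a rough idea of what that adjustment might look like, here is a sketch of a config section for utils.js. The names `LLM_CONFIG`, `isInert` and `buildHeaders` are hypothetical, and the endpoint shown assumes an OpenAI-style API; the real file may be laid out quite differently.

```javascript
// Hypothetical settings block; replace both values with your own.
const LLM_CONFIG = {
  endpoint: "https://api.openai.com/v1/chat/completions", // assumed OpenAI-style endpoint
  apiKey: "YOUR_API_KEY", // placeholder; never commit a real key
};

// Simple guard: true while the config still holds the placeholder key.
function isInert(config) {
  return !config.apiKey || config.apiKey === "YOUR_API_KEY";
}

// Builds the HTTP headers for a bearer-token API.
function buildHeaders(config) {
  return {
    "Content-Type": "application/json",
    Authorization: `Bearer ${config.apiKey}`,
  };
}
```

Keep in mind that a key placed in a static page's JavaScript is visible to anyone who loads the page, so this setup suits a local clone better than a public deployment.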


Why make this?


Well, UI has always been a passion of mine, and UX too. In our internal platforms for LLMs, image interpretation can be a bit cumbersome. I wanted a streamlined way to do a task I perform frequently, at the drop of a hat 🏃🏻‍♂️💨

I also chose to build this as a static webpage because I thought a Shiny app would be overkill 😅.

Assurance


I didn't do the majority of the programming for Athena. GPT-4o and GPT-4o mini did. As a secondary objective, I was curious about two things:

❓ Could this app be generated completely by AI?
❓ Will I have a job in a few years 😂?

I’m happy to share that generative AI, or at least the models I employed, was not enough to generate this completely. Even after several attempts and prompting strategies 📝, it still required quite a bit of human-in-the-loop work to get it right 👫. Here are a few points on my experience:

✔️ AI was helpful in getting started from scratch, including scaffolding files in a structured way that made working on the app easier

✔️ Basic CSS and HTML tasks were doable with ease

⚠️ Advanced CSS and JS required frequent iteration and testing 🔄. This was frustrating and time ⌛ consuming.

✔️ Common tasks, like using the fetch API to access OpenAI’s endpoint, are well understood

⚠️ It would use named JS libraries, but only after I specifically prompted for them. It was really hard to get the file upload area working from first principles. However, I knew about dropzone.js, so I asked it to use that instead. It didn’t work 100% out of the box, so I still had to read the documentation and alter the generated code.
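To give a flavour of the "well understood" fetch task, here is a minimal sketch of sending a prompt plus one image to an OpenAI-style chat completions endpoint. This is not the generated code from Athena: `buildImageRequest` and `describeImage` are hypothetical names, the default model is an assumption, and `endpoint` and `apiKey` are whatever you configure yourself.

```javascript
// Builds an OpenAI-style chat completion payload that pairs a text prompt
// with one image, passed as a data URL (e.g. from FileReader.readAsDataURL).
function buildImageRequest(prompt, imageDataUrl, model = "gpt-4o-mini") {
  return {
    model,
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: prompt },
          { type: "image_url", image_url: { url: imageDataUrl } },
        ],
      },
    ],
  };
}

// Sends the payload and returns the model's text reply.
// `endpoint` and `apiKey` are supplied by the caller (hypothetical parameters).
async function describeImage({ endpoint, apiKey, prompt, imageDataUrl }) {
  const response = await fetch(endpoint, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify(buildImageRequest(prompt, imageDataUrl)),
  });
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  const data = await response.json();
  return data.choices[0].message.content;
}
```

Keeping the payload builder separate from the network call makes the request shape easy to check without hitting the API.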

In the end, I probably could have coded this myself more quickly than by asking AI to generate it. I realize there are a lot of things you need to be deeply aware of before you can ask AI to generate something as complex as a web app: for example, JS libraries, CSS frameworks (Bootstrap 5 class names), names (HTML elements, CSS selectors, function names), and even coding approaches for efficiency.

It’s not push-to-start. At least not with these models. I wonder how Strawberry would do 🤔

In sum, my job is safe for at least the next bit 🙆

Till next time!

🍻🌴

Matthew Kumar
Associate Director, Lead Computational Scientist