Athena

The App


Athena is a simple web app with a single purpose: allow users to easily obtain AI-assisted descriptions of their images. A lot of thought went into the user interface and experience, though not necessarily effort 😜. More on that later.

At its core, Athena allows:

👉 Easy upload of your images - drag and drop into the file drop area

👉 You can also paste files or drag in most images from the web (e.g. Google Images)

🌿 Each time a file is uploaded, a card is created. Each card contains the image thumbnail, an editable prompt, a trigger button, and a response area.

It’s really that simple. Give it a try!

⚠️ The app deployed on GitHub Pages (ghp) is inert; you’ll need to clone the repo and adjust utils.js to point to your LLM API endpoint, along with your API key 🔐
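
To give a rough idea of what that adjustment might look like, here is a minimal sketch. It is not the actual contents of utils.js: the names CONFIG, buildRequest, and describeImage are hypothetical, and the endpoint, model, and payload shape assume an OpenAI-style chat completions API. An internal LLM platform may well expect something different.

```javascript
// Hypothetical settings; the real utils.js may name and structure these differently.
const CONFIG = {
  endpoint: "https://api.openai.com/v1/chat/completions", // assumed endpoint
  apiKey: "YOUR_API_KEY", // placeholder, replace with your own key
  model: "gpt-4o-mini", // assumed model name
};

// Build the URL and fetch options for one image description request.
// imageDataUrl is expected to be a data URL (e.g. "data:image/png;base64,...").
function buildRequest(config, prompt, imageDataUrl) {
  return {
    url: config.endpoint,
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${config.apiKey}`,
      },
      body: JSON.stringify({
        model: config.model,
        messages: [
          {
            role: "user",
            content: [
              { type: "text", text: prompt },
              { type: "image_url", image_url: { url: imageDataUrl } },
            ],
          },
        ],
      }),
    },
  };
}

// Send the request and return the text of the first reply.
async function describeImage(config, prompt, imageDataUrl) {
  const { url, options } = buildRequest(config, prompt, imageDataUrl);
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  const data = await response.json();
  return data.choices[0].message.content;
}
```

Keeping the request building separate from the network call means you can check the payload without spending any tokens. A card's trigger button would then only need something like describeImage(CONFIG, promptText, thumbnailDataUrl).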


Why make this?


Well, UI has always been a passion of mine, and so has UX. On our internal LLM platforms, image interpretation can be a bit cumbersome. I wanted a streamlined way to do a task I perform frequently, at the drop of a hat 🏃🏻‍♂💨️

I also chose to do this as a static webpage because I thought a Shiny app would be overkill 😅.

Assurance


I didn't do the majority of the programming for Athena. GPT-4o and GPT-4o mini did. As a secondary objective, I was curious about two things:

❓ Could this app be generated completely by AI itself?
❓ Will I have a job in a few years 😂 ?

I’m happy to share that generative AI, or at least the models I employed, was not enough to generate this completely. Even with several prompting attempts and strategies 📝, it still required quite a bit of human-in-the-loop work to get it right 👫. Here are a few points on my experience:

✔️ AI was helpful in getting started from scratch. It scaffolded the files in a structured way that made working on the app easier

✔️ Basic CSS and HTML tasks were doable with ease

⚠️ Advanced CSS and JS required frequent iteration and testing 🔄. This was frustrating and time-consuming ⌛.

✔️ Common tasks like using the Fetch API to access OpenAI’s endpoint are well understood

⚠️ The models could work with named JS libraries, but only after I specifically prompted their use. It was really hard to get a working file upload area from first principles. However, I knew about dropzone.js, so I asked for that instead. It didn’t work 100% out of the box, so I still had to read the documentation and alter the generated code.
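
For a sense of what the dropzone.js route involves, here is a minimal sketch. It is not Athena's actual code: the "#drop-area" selector, the createCard and isImage helpers, and the card fields are all hypothetical. It assumes Dropzone is loaded on the page via a script tag, and it turns off Dropzone's own upload step because the files only need to be read in the browser.

```javascript
// Hypothetical card state: one object per uploaded file.
function createCard(file, defaultPrompt = "Describe this image.") {
  return { name: file.name, prompt: defaultPrompt, thumbnail: null, response: "" };
}

// Accept only files whose MIME type marks them as images.
function isImage(file) {
  return typeof file.type === "string" && file.type.startsWith("image/");
}

// Cards keyed by the File object Dropzone hands to its event callbacks.
const cards = new Map();

// Only wire things up when Dropzone is actually present (i.e. in the browser).
if (typeof Dropzone !== "undefined") {
  Dropzone.autoDiscover = false; // configure manually instead of auto-attaching

  const dropzone = new Dropzone("#drop-area", {
    url: "/", // Dropzone requires a URL; unused here since the queue is never processed
    autoProcessQueue: false, // keep files in the browser, do not upload them
    acceptedFiles: "image/*",
  });

  // Create a card as soon as a file is dropped or selected.
  dropzone.on("addedfile", (file) => {
    if (isImage(file)) {
      cards.set(file, createCard(file));
    }
  });

  // Dropzone generates a thumbnail data URL for image files; store it on the card.
  dropzone.on("thumbnail", (file, dataUrl) => {
    const card = cards.get(file);
    if (card) {
      card.thumbnail = dataUrl;
    }
  });
}
```

Rendering each card (thumbnail, prompt box, button, response area) is left out here; the point is only that the library takes care of the drag-and-drop plumbing that was hard to get from first principles.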

In the end, I probably could have coded this myself more quickly than by asking AI to generate it. I realize there are a lot of things you need to be deeply aware of before you can ask AI to generate something as complex as a web app: for example, JS libraries, CSS frameworks (Bootstrap 5 class names), names (HTML elements, CSS selectors, function names), and even coding approaches for efficiency.

It’s not push-button. At least not with these models. I wonder how Strawberry would do 🤔

In sum, my job is at least safe for the next bit 🙆

Till next time!

🍻🌴

Matthew Kumar
Associate Director, Lead Computational Scientist