Closing the loop
I recently asked a friend working at OpenAI what kind of startup he'd found if he were to build one today. His answer was simple: solve a specific problem where you can collect training data from users, use that data to fine-tune a model on top of OpenAI, which increases the quality of the product, which brings in more users, which gets you more training data, which lets you further fine-tune the model, and so on.
Midjourney is a staggering example of this. Its quality is still way superior to DALL·E 3. How is that possible? Part of it is focus, but part is that they constantly improve quality by collecting user feedback.
https://twitter.com/DrJimFan/status/1643279641065713665
By choosing which picture to upscale, you give Midjourney feedback on which option you prefer, which they can use to further train their models.
What would such a loop look like for LLM-based applications? What's the journey from a first prompt to a closed loop that can even tune itself?
Create a prompt using one of the prompt playgrounds out there, such as baserun.ai's, and test it against a few test cases you believe users will have
Deploy the prompt to production, get the first users
With the first users providing real input, collect that input and turn it into evaluation data (see the sketch after this list)
With this evaluation data, you can now do a couple of things:
Further improve the prompt
Fine-tune a model
Adjust your RAG approach (reordering retrieved context, etc.)
Deploy the changes
Get more user feedback, which adds more evaluation data
Profit
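To make the "collect input and turn it into evaluation data" and "fine-tune a model" steps less abstract, here's a minimal sketch of what they could look like: log each interaction together with a simple thumbs-up/down signal, export every interaction as an evaluation case you can replay against future prompt or model versions, and export the accepted ones in the chat-format JSONL that OpenAI's fine-tuning API expects. The Interaction shape, the thumbs-up signal, and the file names are my assumptions for illustration, not a prescribed schema.

```python
import json
from dataclasses import dataclass
from pathlib import Path

# Hypothetical record of one production interaction: the user's input,
# the model's answer, and a simple thumbs-up/down signal from the user.
@dataclass
class Interaction:
    user_input: str
    model_output: str
    thumbs_up: bool

def to_eval_case(interaction: Interaction) -> dict:
    """Turn a logged interaction into an evaluation case that can be
    replayed against new prompt or model versions."""
    return {
        "input": interaction.user_input,
        "reference_output": interaction.model_output,
        "accepted_by_user": interaction.thumbs_up,
    }

def to_finetune_example(interaction: Interaction, system_prompt: str) -> dict:
    """Turn an accepted interaction into a chat-format fine-tuning example
    (the JSONL shape OpenAI's fine-tuning API expects for chat models)."""
    return {
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": interaction.user_input},
            {"role": "assistant", "content": interaction.model_output},
        ]
    }

def export(interactions: list[Interaction], system_prompt: str, out_dir: Path) -> None:
    out_dir.mkdir(parents=True, exist_ok=True)
    # Every interaction becomes an eval case, good or bad.
    with (out_dir / "eval_cases.jsonl").open("w") as f:
        for it in interactions:
            f.write(json.dumps(to_eval_case(it)) + "\n")
    # Only interactions the user was happy with go into the fine-tuning set.
    with (out_dir / "finetune.jsonl").open("w") as f:
        for it in interactions:
            if it.thumbs_up:
                f.write(json.dumps(to_finetune_example(it, system_prompt)) + "\n")

if __name__ == "__main__":
    logged = [
        Interaction("Summarize this ticket: ...", "The customer reports ...", True),
        Interaction("Translate to German: hello", "Hallo", True),
        Interaction("What's our refund policy?", "I don't know.", False),
    ]
    export(logged, "You are a helpful support assistant.", Path("loop_data"))
```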
This is one of the ways to build a moat. By constantly improving on a hard problem, after a few iteration cycles you can already be miles ahead of where you started. What if there were a tool that could do this tuning for you automatically, even deploying the changes? We'll soon see this tooling, and I'm excited to see applications improve and become more useful for us humans!
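And to make the "tune itself" part a bit more concrete, here's a toy sketch of the replay step: run the collected evaluation cases against a couple of candidate prompts and score how often each one reproduces an output the user accepted. The exact-match scoring and the fake_complete stub are deliberate simplifications I've assumed so the sketch runs offline; in a real setup you'd plug in an actual model call and a proper judge or task-specific metric.

```python
import json
from pathlib import Path
from typing import Callable

def replay_eval_cases(
    eval_path: Path,
    candidate_prompts: dict[str, str],
    complete: Callable[[str, str], str],
) -> dict[str, float]:
    """Replay collected eval cases against each candidate prompt and return
    the fraction of user-accepted reference outputs each one reproduces.
    Exact-match scoring is only a placeholder for a real evaluation."""
    cases = [json.loads(line) for line in eval_path.read_text().splitlines()]
    accepted = [c for c in cases if c["accepted_by_user"]]
    scores = {}
    for name, prompt in candidate_prompts.items():
        hits = sum(
            complete(prompt, c["input"]).strip() == c["reference_output"].strip()
            for c in accepted
        )
        scores[name] = hits / len(accepted) if accepted else 0.0
    return scores

if __name__ == "__main__":
    # A fake completion function so the sketch runs offline; swap in a real
    # model call (e.g. the OpenAI client) when wiring this up for real.
    def fake_complete(prompt: str, user_input: str) -> str:
        return "Hallo" if "German" in user_input else "..."

    scores = replay_eval_cases(
        Path("loop_data/eval_cases.jsonl"),
        {"v1": "You are a helpful support assistant.",
         "v2": "You are a concise support assistant."},
        fake_complete,
    )
    print(scores)  # pick the winner, deploy, and the loop continues
```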