Local-First AI: what's missing
AI today is delivered primarily through the cloud. It is cloud-first: the application's source of truth lives in the cloud, and the browser is a mere interface. But what if it started with the browser and the device? What if we had local-first AI?
AI is becoming more accessible thanks to ongoing efforts to shrink models without sacrificing performance. OpenAI's GPT-4, for example, is estimated to have a massive 1.7 trillion parameters, requiring somewhere between 850 GB and 6,800 GB of GPU memory to run, depending on precision (roughly half a byte per parameter with 4-bit quantization, up to four bytes for 32-bit floats). These requirements are far beyond what home computers or smartphones can offer today, making cloud platforms the go-to solution for running such models.
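As a rough illustration of where that memory range comes from, the sketch below simply multiplies the rumored parameter count by the bytes per parameter at common precisions; the true figures for GPT-4 are not public, so treat the inputs as assumptions.

```ts
// Back-of-the-envelope weight-memory estimate, assuming the rumored
// 1.7 trillion parameters; the real figure for GPT-4 is not public.
const params = 1.7e12;

// Bytes per parameter at common precisions.
const precisions: Record<string, number> = {
  "4-bit quantized": 0.5,
  "16-bit floats": 2,
  "32-bit floats": 4,
};

for (const [name, bytesPerParam] of Object.entries(precisions)) {
  const gigabytes = (params * bytesPerParam) / 1e9;
  console.log(`${name}: ~${gigabytes} GB`); // 850, 3400, 6800 GB
}
```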
However, there is hope on the horizon. Recent models like those from Mistral AI have shown that competitive performance is possible with fewer parameters: their 7-billion-parameter model outperforms Meta's previously celebrated 13-billion-parameter Llama model.
While these smaller models have yet to reach the performance of GPT-3.5 or GPT-4, the pace of improvement is encouraging. The cloud offers vast scalability but raises privacy concerns, especially for regulated or sensitive applications. This is where on-device AI shines, with its promise of privacy and low latency, a potential recognized by tech giants like Apple, which is investing heavily in its M1 and M2 chips to advance on-device AI capabilities.
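For a concrete sense of what on-device inference can already look like, here is a minimal sketch using the transformers.js library (@xenova/transformers) to run a small text-generation model entirely in the browser; the model name and generation options are illustrative assumptions, not a recommendation.

```ts
// Minimal sketch: on-device text generation in the browser with transformers.js.
// Assumes the @xenova/transformers package and a small model hosted on the
// Hugging Face Hub (the model choice here is illustrative).
import { pipeline } from "@xenova/transformers";

// The weights are downloaded once, cached by the browser, and every
// subsequent generation runs locally with no server round-trip.
const generator = await pipeline("text-generation", "Xenova/distilgpt2");

const output = await generator("Local-first AI means", {
  max_new_tokens: 40,
});

console.log(output); // [{ generated_text: "Local-first AI means ..." }]
```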
To bring the rich, interactive experience that ChatGPT provides to devices without relying on the cloud, here is what's still missing:
AI Chip and Large Language Model (LLM) Development: Rapid advances in chip technology and in language model development are crucial so that more capable models can fit on devices without losing performance.
Internet Browsing: Our computers are already adept at browsing the internet, but a website, for instance, cannot make a request to google.com directly because of security restrictions like Cross-Origin Resource Sharing (CORS); see the first sketch after this list. As the use of language models and intelligent agents grows, these security design decisions in browsers may need to be revisited.
Code Interpretation: Platforms like Stackblitz have shown that running various programming languages in the browser is feasible, which bodes well for on-device AI; a minimal illustration follows below.
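To make the CORS point concrete, here is a small sketch of what happens when a page served from some other origin tries to fetch google.com directly; the surrounding origin is hypothetical, and the exact error message varies by browser.

```ts
// Running inside a page served from another origin (hypothetical example).
// google.com does not send Access-Control-Allow-Origin headers for arbitrary
// sites, so the browser blocks the response before script code can read it.
try {
  const response = await fetch("https://www.google.com/search?q=local-first+ai");
  console.log(await response.text());
} catch (error) {
  // CORS failures surface as a generic TypeError ("Failed to fetch" in
  // Chromium-based browsers); the page never sees the response body.
  console.error("Request blocked by the browser:", error);
}
```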
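And as a small illustration of in-browser code interpretation, the sketch below uses Pyodide, a WebAssembly build of CPython, to evaluate a snippet entirely on the client. This is one possible approach, not how Stackblitz itself works (its WebContainers run Node.js in the browser), and it assumes the "pyodide" npm package is available.

```ts
// Minimal sketch of in-browser code interpretation with Pyodide (CPython
// compiled to WebAssembly). Assumes the "pyodide" npm package; the runtime
// can also be loaded from a CDN <script> tag instead.
import { loadPyodide } from "pyodide";

// Boots the Python runtime inside the browser tab; no server is involved.
const pyodide = await loadPyodide();

// Evaluate a Python snippet and read the result back into JavaScript.
const result = pyodide.runPython(`
sum(i * i for i in range(10))
`);

console.log(result); // 285
```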
In summary, the strides being made in smaller models are bringing us closer to the day when on-device AI becomes a viable alternative for the masses. This shift could not only give users better privacy and lower latency, but also reshape the question of whether cloud-based platforms like OpenAI remain the preferred choice or open-source, on-device alternatives gain traction.