Almost anyone who works in technology, at any level, knows about generative AI. Businesses all over the world are leveraging it to make their workforces more efficient, and even more creative. One area that has been slow to adopt it is embedded systems. We're not referring to generating code examples or design ideas for embedded systems, but rather to implementing generative AI as a core feature in an embedded system.
We think this largely stems from the fact that generative AI is so closely associated with chat features. Embedded systems are not always used for chatting with humans, although they may deploy an application that includes this capability. Normally such a feature is accessed via the cloud, and it brings a human element into certain products. In embedded systems that do not need to mimic human interaction, generative AI can play other very interesting roles that are still being explored.
How to build embedded hardware with generative AI capabilities
As a general area of AI and computing, generative AI encompasses a range of possible features, all of which involve generating an output from an input prompt. The prompt-response workflow is easiest to understand in the context of human language, but any data type could be used to develop a prompt-response dataset. We have already seen this with the generation of images, music and deep fakes.
Because this field of AI is so general, hardware with a generative capability could be applied in many different areas. In a software-driven approach, general-purpose processors built for tensor math provide the required compute, which today is available in banks of GPUs. Embedded systems deployed at the edge could look very different and might not involve a GPU at all, at least not on the end device.
Here are some examples of what makes a generative AI-capable embedded system unique:
- A device deployed at the edge could leverage generative AI in the cloud via a connection to a data center.
- Some devices deployed at the edge will not have any connection to the cloud, but instead might be on a private network or operating alone.
- Assuming no HMI is required, the compute requirements could be much lower depending on the data being generated.
- Hyper-specific model development can lead to much smaller models that do not require cloud computing capabilities.
- It is well known that generative AI uses a lot of energy, but most standalone embedded systems are designed to conserve power.
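The trade-offs in the list above can be sketched as a simple placement decision: run a small, task-specific model on the device when it fits the memory and power budget, fall back to the cloud when a link exists, and otherwise report that the workload is not supported. This is only an illustrative sketch. The names `DeviceProfile`, `ModelProfile`, and `choose_backend`, along with all of their fields, are hypothetical and do not come from any particular toolkit.

```python
from dataclasses import dataclass
from enum import Enum


class Backend(Enum):
    CLOUD = "cloud"              # offload generation to a data center
    ON_DEVICE = "on_device"      # run a small, task-specific model locally
    UNSUPPORTED = "unsupported"  # no viable place to run the model


@dataclass
class DeviceProfile:
    has_cloud_link: bool         # can the device reach a data center?
    ram_bytes: int               # memory available for model weights
    power_budget_mw: float       # sustained power available for inference


@dataclass
class ModelProfile:
    weight_bytes: int            # storage needed for the model weights
    inference_power_mw: float    # assumed power draw while generating


def choose_backend(device: DeviceProfile, model: ModelProfile,
                   prefer_local: bool = True) -> Backend:
    """Pick where generation should run for this device and model."""
    fits = (model.weight_bytes <= device.ram_bytes
            and model.inference_power_mw <= device.power_budget_mw)
    # Run locally when the model fits and either we prefer local
    # inference or there is no cloud link to fall back on.
    if fits and (prefer_local or not device.has_cloud_link):
        return Backend.ON_DEVICE
    if device.has_cloud_link:
        return Backend.CLOUD
    return Backend.UNSUPPORTED
```

With these assumptions, an isolated device with a few megabytes of RAM would only ever accept a hyper-specific model, while a connected device could still hand larger models off to the data center.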
Because so much of the typical approach to embedded systems runs counter to what is done in the data center with generative AI, it may be difficult for systems designers to see how they can bring generative AI into end devices. It could be that we just move GPU deployments into the field in large enclosures. Remember the NVIDIA Jetson Xavier NX computer-on-module (COM)? This might be the type of component that delivers hyper-specific generative AI to embedded systems.
New processors can pave the way for on-device generative AI
Thankfully for embedded systems developers, the semiconductor industry is stepping up with new processors and developer toolkits to bring AI to the edge. Chipmakers have realized that deploying large GPUs is impractical in terms of thermal management, power, form factor and cost. Instead of going this route, some companies are focusing on smaller, more power-efficient devices that support hyper-specific generative AI models and provide inference at the edge.
To date, there have been several developments that enable alternative approaches to generative AI at the edge.
A fusion of the above hardware and software approaches is needed to deploy hyper-specific generative AI models in the field on small embedded systems. The more generalized the model becomes, the more power it consumes, which can quickly exceed the thermal and power budgets of a small device.
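One rough way to see why generality is expensive is to look at weight storage, which grows with parameter count and numeric precision. The sketch below estimates only the bytes needed to hold the weights. It does not model power directly, and it ignores activations and runtime overhead. The function names and the example sizes are illustrative assumptions, not figures for any specific model or chip.

```python
def weight_footprint_bytes(num_params: int, bits_per_weight: int) -> int:
    """Bytes needed to store the weights alone, rounded up to whole bytes."""
    if num_params < 0 or bits_per_weight <= 0:
        raise ValueError("num_params must be >= 0 and bits_per_weight > 0")
    return (num_params * bits_per_weight + 7) // 8


def fits_in_memory(num_params: int, bits_per_weight: int,
                   available_bytes: int) -> bool:
    """True if the weights alone fit in the given memory budget."""
    return weight_footprint_bytes(num_params, bits_per_weight) <= available_bytes


# Illustrative comparison against an assumed 8 MiB memory budget.
budget = 8 * 1024 * 1024
general = weight_footprint_bytes(7_000_000_000, 16)  # 14,000,000,000 bytes
specific = weight_footprint_bytes(5_000_000, 8)      # 5,000,000 bytes
print(general, fits_in_memory(7_000_000_000, 16, budget))
print(specific, fits_in_memory(5_000_000, 8, budget))
```

Under these assumed sizes, the general-purpose model needs gigabytes for its weights, while the small task-specific model fits in a few megabytes, which is the regime where on-device inference becomes plausible.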
If you're a developer who is interested in this area of embedded systems development, pay attention to what startups are doing in this space. Startup companies have been working toward AI at the edge with on-device inference since long before ChatGPT became publicly accessible. Some of these companies have now had to shift their approach and think more about how they can incorporate generative AI compute workloads into their systems. This is where we will start to see the most promising developments in generative AI for embedded systems.