Peel back the layers of hardware enabling the Internet of Things (IoT), and at the heart, you will find embedded processors. While chipmakers constantly push the limits of processor technology, some of the most important innovations aimed at providing the building blocks of the IoT can be seen in the way chip designers combine different types of processors.
The forces driving this innovation include, among others, the need to support pervasive sensing and more complex software architectures, and the nearly ubiquitous requirement for low power consumption. These demands tighten the constraints with which embedded system designers must contend and raise the bar on processor performance.
Figure 1. Embedded processors enable sensing and connectivity technologies to bring layers of intelligence to the full spectrum of devices making up the IoT. (Courtesy of Freescale Semiconductor.)
More Intelligence at the Edge
Until recently, an 8- or 16-bit microcontroller was enough to meet the processing demands of many of the nodes at the edge of the network. The IoT and economics, however, are changing that. With more and more data being collected by end nodes, economic considerations call for more local processing. “It is far more cost-effective to cook the data locally than to transmit raw data and have someone else do the processing,” says Richard York, Vice President for Marketing, Embedded Segment, ARM.
Figure 2. Economics dictate greater local data processing to support increasing data collection by IoT end nodes. (Courtesy of ARM.)
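The trade-off York describes can be illustrated with a small back-of-the-envelope sketch. It assumes a hypothetical end node that collects a window of 100 temperature readings and either uploads every reading or uploads only a locally computed summary. The window size, the 4-byte float encoding, and the function names are illustrative assumptions, not figures from ARM.

```python
import struct


def raw_payload(samples):
    """Pack every reading as a 4-byte little-endian float (raw upload)."""
    return struct.pack(f"<{len(samples)}f", *samples)


def summary_payload(samples):
    """Pack only the min, max, and mean of the window (processed locally)."""
    mean = sum(samples) / len(samples)
    return struct.pack("<3f", min(samples), max(samples), mean)


# Hypothetical window of 100 temperature readings, in degrees Celsius.
window = [20.0 + 0.01 * i for i in range(100)]

raw = raw_payload(window)
summary = summary_payload(window)
print(f"raw upload: {len(raw)} bytes, summary upload: {len(summary)} bytes")
```

In this sketch the raw upload is 400 bytes and the summary is 12 bytes. Whether a summary is an acceptable substitute for the raw data depends entirely on the application.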
In addition, because of demand for IP connectivity and more complex interfaces, embedded system designers must deploy processors that can support full-featured operating systems. The downside of this approach is that these operating systems are not well suited to handling interrupts or meeting the low latencies that certain critical tasks require. To support these functions, the embedded system also requires a processor running a real-time operating system (RTOS).
Challenges
To accommodate these disparate needs, embedded system makers are throwing out the “one size fits all” approach, changing the type and number of processors deployed, and adopting new architectures. Designers must tailor processor resources to meet the varied needs of the device. “You need multiple compute subsystems dedicated to particular activities,” says Ajith Dasari, Vice President of Platforms and Customer Engineering, Ineda Systems. “This means a processor dedicated to real-time sensing and data gathering and a microprocessor for application- and communication-related tasks. Real-time sensing is not necessarily compute-intensive, but application processing can be. These two distinct performance requirements call for different types of processors.”
The concept of a heterogeneous multicore system is not new. To avoid impacting the effectiveness of the overall system, however, designers need an architecture that enables and enforces the safe sharing of system resources like memory and peripherals. Such an architecture enables the system to support both compute-intensive functions and real-time responsiveness, using the most energy-efficient processor for the task at hand.
In addition to providing this kind of flexibility, the system must also be able to ensure that end nodes have enough processor performance to scale to meet growing application demands. To do this, embedded system designers use a variety of devices. These include microcontrollers (MCUs), hybrid MCUs/microprocessors, and integrated MCU devices that can deliver more processing power than 8- and 16-bit processors. “The IoT is making 32-bit processors the new minimum standard across the whole spectrum,” says ARM’s York. “Even the simplest devices tend to have fairly complex software.”
For Wearables and Beyond
An innovative approach providing a heterogeneous, multicore, single-chip system can be found in the offerings of Ineda Systems. The Silicon Valley-based startup is developing chips tailored for wearable devices and IoT systems. Its portfolio includes a prototype wearable processing unit (WPU) called Dhanush, a MIPS- and PowerVR-based system on a chip (SoC) built around a hierarchical computing architecture. The Dhanush comes in an assortment of SoC implementations that provide various grades of performance, from the Nano, which provides basic MCU functions, to the Advanced, which provides processor support for always-on functions and applications requiring a full-blown OS, such as Android or Linux.
Figure 3. Ineda Systems’ Dhanush processor is tailored to meet the unique demands of wearable and IoT devices. (Courtesy of Ineda Systems.)
The WPUs incorporate as many as three CPUs, each providing a different level of performance and power consumption. The Advanced WPU combines two MIPS CPUs, which handle sensor support, Bluetooth connectivity, basic functions, and ultra-low-power operation, with an application processor for compute-intensive functions and Internet access. The triad of processors provides always-on listening to enable contextual computing, along with an integrated sensor and connectivity hub. The power and performance micromanagement unit optimizes power consumption by turning the individual CPUs on and off to accommodate always-on functions and on-demand complex computing tasks.
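One way to picture this kind of power management is as a selection problem: for each task, wake the lowest-power CPU that can meet the task's demand and leave the others powered down. The sketch below is a simplified software model of that idea, not Ineda's implementation. The core names, capacity units, and power figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Core:
    name: str
    capacity: int     # relative compute capacity (hypothetical units)
    active_mw: float  # active power draw in milliwatts (hypothetical)


# Three tiers loosely mirroring the text: two small CPUs and one
# application processor. All numbers are illustrative only.
CORES = [
    Core("sensor_cpu", capacity=1, active_mw=0.5),
    Core("connectivity_cpu", capacity=10, active_mw=5.0),
    Core("application_cpu", capacity=100, active_mw=200.0),
]


def select_core(required_capacity, cores=CORES):
    """Return the lowest-power core that can meet the requested capacity."""
    candidates = [c for c in cores if c.capacity >= required_capacity]
    if not candidates:
        raise ValueError("no core can meet the requested capacity")
    return min(candidates, key=lambda c: c.active_mw)


def power_plan(required_capacity, cores=CORES):
    """Map each core name to True (powered on) or False (powered down)."""
    chosen = select_core(required_capacity, cores)
    return {c.name: c == chosen for c in cores}


print(power_plan(1))   # always-on sensing: only the smallest core runs
print(power_plan(50))  # heavy workload: only the application CPU runs
```

A real power manager would also weigh wake-up latency and let several cores run at once. This model deliberately ignores both.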
The chip’s hierarchical computing architecture allows the CPUs to operate independently or together, with each sharing on-chip peripherals and tiered memory. “Control of the peripherals is determined by the application software,” says Ineda’s Dasari. “Disputes among CPUs over control of the peripherals are resolved via software negotiations. Critical applications are given priority, and on-demand control is determined by the use case. The highest power-consuming processor remains in a powered-down state as much as possible, withholding access to the peripherals until absolutely necessary.”
Figure 4. The Dhanush wearable processing unit is built around a multicore hierarchical computing architecture that delegates tasks to processors based on performance requirements and energy efficiency. (Courtesy of Ineda Systems.)
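The software negotiation Dasari describes can be sketched as a small arbiter in which a critical request may take a peripheral away from a non-critical holder. This is only an illustrative model under that single assumed rule. The class, its method names, and the CPU and peripheral labels are hypothetical and do not reflect Ineda's software.

```python
class PeripheralArbiter:
    """Toy model of priority-based peripheral ownership among CPUs."""

    def __init__(self):
        # Maps peripheral name -> (owning cpu, whether the claim is critical).
        self._owners = {}

    def request(self, cpu, peripheral, critical=False):
        """Try to claim a peripheral; return True if the claim is granted."""
        holder = self._owners.get(peripheral)
        if holder is None:
            self._owners[peripheral] = (cpu, critical)
            return True
        held_by, held_critical = holder
        if held_by == cpu:
            # The owner may re-request; a critical claim stays critical.
            self._owners[peripheral] = (cpu, held_critical or critical)
            return True
        if critical and not held_critical:
            # Critical work preempts a non-critical holder.
            self._owners[peripheral] = (cpu, True)
            return True
        return False

    def release(self, cpu, peripheral):
        """Give up a peripheral, but only if this CPU currently owns it."""
        holder = self._owners.get(peripheral)
        if holder is not None and holder[0] == cpu:
            del self._owners[peripheral]

    def owner(self, peripheral):
        """Return the owning CPU name, or None if the peripheral is free."""
        holder = self._owners.get(peripheral)
        return holder[0] if holder else None


arbiter = PeripheralArbiter()
arbiter.request("application_cpu", "i2c0")
arbiter.request("sensor_cpu", "i2c0", critical=True)
print(arbiter.owner("i2c0"))
```

Here the application CPU claims the bus first, and the critical request from the sensor CPU then takes it over.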
The Best of Both Worlds
Another approach to multicore processing is Freescale Semiconductor’s expanded i.MX 6-architecture applications processor. This SoC combines ARM’s Cortex-A9 and Cortex-M4 processors, with each processor running its own operating system. The A9 applications processor runs a full-featured operating system and supports multiple-user applications. The M4 can run a real-time operating system or bare-metal firmware and includes digital signal processing and floating-point support, minimizing power consumption with integrated sleep modes. The SoC, which was released this year, takes aim at next-generation, connected, highly graphical, and system-aware devices.
Freescale built the SoC around an asymmetrical multicore architecture, which significantly reduces overhead and latency in interprocessor communications. This promotes real-time responsiveness in multimedia applications and provides power efficiency while maintaining system connectivity and always-on sensor monitoring.
The architecture partitions memory and peripherals into four independently controlled resource domains. This enables the SoC to support applications that simultaneously perform real-time processing and compute-intensive tasks.
The SoC provides access to memory and peripherals via a shared bus topology. “Freescale’s shared bus makes all chip resources accessible to either core,” says Amanda McGregor, Microcontroller Product Manager at Freescale Semiconductor. “The access permissions are established through the Resource Domain Controller, which allows independent setting of read and write access privileges for each memory region and peripheral in the system-on-chip. It also provides hardware-level enforcement of these access privileges.”
Figure 5. Freescale’s asymmetrical multicore architecture leverages a shared bus topology that makes chip resources like memory and peripherals accessible to all cores. (Courtesy of Freescale Semiconductor.)
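To make the idea of per-domain access privileges concrete, the sketch below models a permission table with four domains and independent read and write flags for each resource. It is a conceptual software model only. The actual Resource Domain Controller enforces these privileges in hardware, and the class, method names, and resource labels used here are hypothetical rather than Freescale's register interface.

```python
from enum import Flag, auto


class Access(Flag):
    NONE = 0
    READ = auto()
    WRITE = auto()


NUM_DOMAINS = 4  # the architecture described above uses four resource domains


class DomainPermissions:
    """Toy permission table: one Access value per (resource, domain) pair."""

    def __init__(self):
        self._table = {}  # resource name -> list of Access, indexed by domain

    @staticmethod
    def _check_domain(domain):
        if not 0 <= domain < NUM_DOMAINS:
            raise ValueError(f"domain must be in 0..{NUM_DOMAINS - 1}")

    def set_access(self, resource, domain, access):
        """Set the read/write privileges one domain has on one resource."""
        self._check_domain(domain)
        row = self._table.setdefault(resource, [Access.NONE] * NUM_DOMAINS)
        row[domain] = access

    def is_allowed(self, resource, domain, wanted):
        """Return True only if every requested privilege has been granted."""
        self._check_domain(domain)
        row = self._table.get(resource)
        if row is None:
            return False  # unknown resources are denied by default
        return (row[domain] & wanted) == wanted


perms = DomainPermissions()
perms.set_access("uart1", 0, Access.READ)                 # e.g. the A9 side
perms.set_access("uart1", 1, Access.READ | Access.WRITE)  # e.g. the M4 side
print(perms.is_allowed("uart1", 0, Access.WRITE))
print(perms.is_allowed("uart1", 1, Access.WRITE))
```

Which core maps to which domain is an assumption made for the example. Denying unknown resources by default is a design choice of this sketch, not a documented property of the chip.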
Freescale’s implementation reflects the chief characteristics sought in IoT edge devices. It provides low-power modes that significantly reduce standby power consumption, supports small-form-factor designs, delivers real-time responsiveness, and accommodates on-demand compute-intensive applications.
Blurring Distinctions
It is clear that SoC packaging is well suited to heterogeneous multicore systems and that, at least for now, it has a place in the pantheon of future technologies. But what does the ascent of the SoC mean for the tried-and-true microcontroller?
“We will continue to see mixed-mode implementations, where applications processors will provide additional real-time capabilities and low-power system monitoring through supporting the asymmetric multicore approach,” says McGregor. “Freescale is providing software enablement for these types of implementations. We also expect to continue to see discrete microcontrollers used in both high-end and simple embedded systems.”
The fact is that the roles of applications processors and microcontrollers are in flux, as shown by the rise of the new breed of super microcontrollers like ARM’s 32-bit Cortex-M7. The M7’s design significantly increases the processor’s compute and DSP performance, enabling it to execute two instructions in parallel, support 64-bit data transfer, and efficiently translate analog sensor data into digital information. Industry experts contend that it is more powerful than ARM’s Cortex-R real-time processor family.
Figure 6. ARM’s 32-bit Cortex-M7 processor boasts enhanced compute and DSP performance that enables it to execute two instructions in parallel, support 64-bit data transfer, and efficiently translate analog sensor data into digital information. (Courtesy of ARM.)
These improvements are blurring the lines between microcontrollers, real-time operating system processors, and application processors. “With the Cortex-M7, you are going to see that area start to grey, and it will begin to be a decision around whether the application will be better served by an RTOS or a full OS like Linux,” says Steve Tateosian, Global Product Marketing Manager at Freescale. “If the answer is a full OS, the customers will select Cortex-A type products [applications processors], but if everything that must be done could be done by an RTOS, then I suspect that the M7 is going to begin to give a lot of those lower-end A-type processors a run for their money.”
The Software Factor
When you look at what will be required to prepare embedded systems for the IoT, it is easy to over-emphasize the importance of hardware. Does a system have low power consumption? Can it collect sensor data effectively? Is there enough processing power? The truth is that the real revolution of the IoT will be in software. Software development represents the dominant cost in embedded systems, and the software demands of the IoT are only going to increase.
“Most people first decide what their software is going to be, and then they look around for hardware that will support it,” says ARM’s York. “They decide the type of software environment they need, and that will direct them to a particular processor family. Sophisticated software needs sophisticated hardware.”