Analysis: March 2021

Having spent over 18 years developing embedded software for wireless systems, I believe I have developed an intuition that lets me understand a design and its problems fairly quickly. This is an attempt to codify that intuition. I hope to keep this entry live for a few months and will update the last-updated date as it changes.

The definition and scope of embedded systems are changing rapidly. We can loosen the definition and settle for something that is "embedded" in a larger system, delivers its functions according to certain rules, has a limited set of specialized functions, has tighter constraints than a general-purpose system, and is, by extension, real-time. Even a general-purpose system may be viewed as a combination of several embedded systems. Take, for instance, a mobile phone, which is increasingly difficult to distinguish from, say, a laptop: the cellular modem inside the phone is very much real-time, performs a limited set of specialized functions, and operates under extremely tight resource constraints.

The first steps toward understanding an embedded system are to know its:

Processing Elements
Memories
Power Architecture
Interfaces

Processing Elements

Processing elements include both the cores that execute the SW we develop and the hardware blocks where the processing happens primarily in HW. Targeted software runs on ARM cores, x86 cores, microcontrollers, DSPs, and FPGAs. We also have various HW blocks that, over time, take over tasks that were earlier executed by SW. The two major motivations for moving a task to HW are speed and power consumption; the enabler is having enough confidence in a function to commit it to HW. For example, most high-end phones today implement camera processing, JPEG encoding, and similar functions in HW. Similarly, there is a lot of custom HW to accelerate neural network inference.

Clearly, when we look at processing elements, we need to understand the HW/SW split, how the hardware is programmed, and by whom. Also of interest are the speed at which each processing element operates and why there are so many processing elements that can do the same task. For example, in a mobile phone or a laptop, a CPU core (ARM/x86) can do whatever task a GPU can do. Yet GPUs are ubiquitous, because they handle highly parallel work far more efficiently. Likewise, early smartphones used GPUs for AI tasks, but as the demands increased we moved to DSP-based accelerators and AI accelerators. An easy way to build this understanding is to take a Product Requirement Document (PRD) and identify which processing element executes each of the product requirements.
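The PRD exercise can be kept as a simple table that is then inverted to see how much work lands on each processing element. The sketch below is only an illustration: the requirement names, the element names, and the assignment of audio playback to a DSP are assumptions, not taken from any real PRD.

```python
from collections import defaultdict

# Hypothetical requirement-to-element mapping; every entry is illustrative.
PRD_MAPPING = {
    "JPEG encoding": "camera HW block",
    "Neural network inference": "AI accelerator",
    "UI rendering": "GPU",
    "Application logic": "CPU core",
    "Audio playback": "DSP",  # assumption, not stated in the text above
}


def requirements_by_element(mapping):
    """Group requirements under the processing element that executes them."""
    grouped = defaultdict(list)
    for requirement, element in mapping.items():
        grouped[element].append(requirement)
    # Sort each list so the report is stable regardless of input order.
    return {element: sorted(reqs) for element, reqs in grouped.items()}


if __name__ == "__main__":
    report = requirements_by_element(PRD_MAPPING)
    for element in sorted(report):
        print(f"{element}: {', '.join(report[element])}")
```

An element that ends up with a long list, or a requirement that could sit under two elements, is usually where the interesting HW/SW split questions are.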

Memories

A typical system has access to several types of memory: on-chip memories such as Level 1, 2, and 3 caches and system caches, Non-Volatile Memory (whose contents are retained across reboots), Read Only Memory (ROM), RAM, secondary storage, EFPROM for fuses, and so on.

We need to understand the rationale for the presence and sizing of each memory. A lot of work goes into the system design phase to determine whether a processing block needs a particular allocation in the L1, L2, L3, and system caches. Latency increases as you move down the memory hierarchy, away from the core, but the cost per bit falls. System performance teams, if they have a base chip to compare against, can fairly accurately estimate the required memories from the feature delta relative to that base chip. A combination of increasing DDR speeds as we go from LPDDR to LPDDR5 and the efficiency of moving certain functionality from core-specific caches to system caches has changed the sizing of caches in the hierarchy.
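One common way to reason about this latency trade-off is the average memory access time: each level contributes its hit latency, plus its miss rate multiplied by the cost of going one level further out. The sketch below is a simplified model, and the latencies and miss rates in it are made-up illustrative numbers, not measurements from any chip.

```python
def average_access_time_ns(levels, memory_latency_ns):
    """Average memory access time for a cache hierarchy.

    levels: list of (hit_latency_ns, miss_rate) tuples, ordered from the
            level closest to the core outward (L1 first).
    memory_latency_ns: latency of the final backing memory (e.g. DDR).
    """
    # Start from the outermost memory and fold inward:
    # cost(level) = hit_latency + miss_rate * cost(next level out).
    cost = memory_latency_ns
    for hit_latency_ns, miss_rate in reversed(levels):
        cost = hit_latency_ns + miss_rate * cost
    return cost


if __name__ == "__main__":
    # Illustrative numbers only.
    baseline = [(1.0, 0.10), (10.0, 0.50)]
    bigger_l2 = [(1.0, 0.10), (10.0, 0.25)]  # assume a larger L2 halves its miss rate
    print(average_access_time_ns(baseline, 100.0))
    print(average_access_time_ns(bigger_l2, 100.0))
```

Comparing two configurations this way gives a rough feel for whether extra cache at one level is worth its cost, before any detailed performance modelling.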

Power Architecture

There was a time when power was considered important only for small battery-operated devices such as mobile phones. Increasing device complexity keeps pushing the power envelope higher, and with it the cost of cooling solutions, which motivates designers to optimize power even for use cases where it traditionally wasn't a concern. For example, a laptop that consumes less power can use a fanless design and a smaller battery for the same battery life, both of which affect the weight and cost of the laptop. Data centers, on the other hand, consume so much power that any saving there could mean a lot of CAPEX saved on cooling. In an industrial setting, when the system is in a remote area without access to mains power, there is a very strong motivation to make it run on a set of AA batteries for the life of the system.
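A back-of-the-envelope estimate shows why duty cycle dominates in the battery-powered case. The sketch below assumes a device that alternates between an active current and a sleep current, and it ignores battery self-discharge, temperature, and regulator losses. The capacity and current figures are assumptions chosen for illustration.

```python
HOURS_PER_YEAR = 24 * 365


def average_current_ma(active_ma, sleep_ma, active_fraction):
    """Time-weighted average current for a duty-cycled device."""
    if not 0.0 <= active_fraction <= 1.0:
        raise ValueError("active_fraction must be between 0 and 1")
    return active_ma * active_fraction + sleep_ma * (1.0 - active_fraction)


def battery_life_hours(capacity_mah, active_ma, sleep_ma, active_fraction):
    """Idealized battery life: capacity divided by average current."""
    return capacity_mah / average_current_ma(active_ma, sleep_ma, active_fraction)


if __name__ == "__main__":
    # Assumed figures: 2000 mAh usable capacity, 10 mA active,
    # 0.01 mA asleep, awake 1% of the time.
    hours = battery_life_hours(2000.0, 10.0, 0.01, 0.01)
    print(f"{hours / HOURS_PER_YEAR:.1f} years")
```

With numbers like these, the sleep current and the fraction of time spent awake matter far more than the peak active current.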

While overall power consumption is important to understand, it is also important to note which processing elements are powered and how they are power gated when idle. When a processing element is power gated, how are interrupts targeted at that core handled, and can they wake it up? How many levels of sleep are possible? What is the deepest level, and how much power does the system consume in it? Questions like these give some understanding of the system design.
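The sleep-level questions can be made concrete with a small model: each sleep state has a power draw, a wake-up latency, and a minimum time it must be held to be worth entering. The sketch below picks the lowest-power state that fits the expected idle time and the tolerable wake-up latency. The state names and all numbers are hypothetical and do not describe any particular SoC or power-management framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SleepState:
    name: str
    power_mw: float          # power drawn while in this state
    wake_latency_us: float   # time from interrupt to running code
    min_residency_us: float  # shortest idle period that makes entry worthwhile


def pick_sleep_state(states, expected_idle_us, max_wake_latency_us):
    """Return the lowest-power state that meets both constraints, or None."""
    candidates = [
        s for s in states
        if s.wake_latency_us <= max_wake_latency_us
        and s.min_residency_us <= expected_idle_us
    ]
    if not candidates:
        return None  # stay awake: no state pays off for this idle period
    return min(candidates, key=lambda s: s.power_mw)


# Hypothetical states, shallowest to deepest.
STATES = [
    SleepState("clock_gated", 5.0, 1.0, 10.0),
    SleepState("retention", 1.0, 50.0, 500.0),
    SleepState("power_collapsed", 0.05, 1000.0, 10000.0),
]

if __name__ == "__main__":
    choice = pick_sleep_state(STATES, expected_idle_us=2000.0, max_wake_latency_us=100.0)
    print(choice.name if choice else "stay awake")
```

Filling in a table like `STATES` for a real design is a quick way to answer how many sleep levels exist, what the deepest one costs, and what it takes to wake from it.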

Interfaces

When we think of interfaces for mobile phones, the first that come to mind are those the user communicates through: touch and other sensors such as the accelerometer, speech, on-screen keyboard input, key presses, and so on. Typically, a complex system such as a mobile phone has multiple chips apart from the main digital chip that takes care of most of the functionality. There are Power Management ICs that supply regulated power from the battery, charger chips that enable charging from the power source, external codecs and speakers for better audio performance, Radio Frequency chips for cellular, BT, FM, and WLAN technologies, cameras, and so on. Within the main silicon, too, there are typically multiple cores that perform targeted functions. How each of these cores communicates with the other cores within the system on chip (SoC), and with chips outside the SoC, is another area that provides a lot of perspective on the design.

Tuesday, March 30, 2021

Understanding / Designing Embedded Systems
