
TEB #10: Analyzing Buffer Zones

Saturday, January 9th, 2010

The Embedded Bridge – Issue #10

One of the more challenging aspects of embedded systems design is maximizing the time hardware and firmware can work independently while minimizing the time they spend waiting for each other. The proper use and sizing of buffers can help balance the load between hardware and firmware.

An I/O block that can hold only one byte at a time must interrupt firmware for every byte. Buffers in the I/O block allow several bytes to be transmitted or received between interruptions to firmware. The question is: how big should the buffers be? That depends on the application. Here are some general guidelines:

System Attribute – Guideline

  • Data packet size and burst size – The buffer should be big enough to hold all the bytes of a multi-byte packet or burst of bytes. The hardware block should be able to handle all bytes in a packet without requiring mid-packet intervention from the driver.
  • Quantity of data – For large quantities of data with high-speed I/O, increasing the buffer size reduces the frequency and overhead of driver interrupts. If the driver is taking too much time to handle the data, increasing the buffer size can improve performance.
  • Operating system – If there is no operating system or a very basic one, then a larger buffer size in hardware can help reduce firmware complexity. Conversely, firmware with proper operating system support is more likely to have separate drivers, each with its own threads and memory buffers, that can efficiently handle I/O traffic with smaller hardware buffers.
  • Space on the chip – If space availability on the die is constrained, the size of the buffer may be limited, requiring firmware to take a bigger load.
  • Synchronicity of data – If the protocol is such that more data cannot be received without first sending some sort of synchronization data, then the buffer needs to be only as big as the largest expected batch of synchronous data. If the data are asynchronous, the buffer should be sized bigger to accommodate multiple batches that may arrive before the driver has a chance to respond.
  • Buffer location – Memory for smaller buffer sizes can be accommodated on the chip itself but it is fixed in size. For large and flexible buffer sizes, consider adding a DMA to access external memory.

These and other system requirements may compete against each other in driving the buffer sizes and will require striking a proper balance.
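
As a rough illustration of how two of these guidelines interact, the sketch below estimates a minimum receive buffer depth for asynchronous data. It takes the larger of the biggest packet and the number of bytes that can arrive before the driver responds. The function name, its parameters, and the idea of combining the two guidelines this way are assumptions made for illustration; a real design would also weigh die space and the other attributes above.

```c
#include <stdint.h>

/* Hypothetical helper, not taken from any real driver.
 * bytes_per_sec  : sustained arrival rate on the interface
 * max_latency_us : worst-case time before the driver drains the buffer
 * packet_bytes   : largest packet that must fit without driver intervention
 */
static uint32_t min_rx_buffer_bytes(uint32_t bytes_per_sec,
                                    uint32_t max_latency_us,
                                    uint32_t packet_bytes)
{
    /* Bytes that can arrive during the latency window, rounded up.
     * 64-bit arithmetic keeps the product from overflowing. */
    uint64_t in_flight =
        ((uint64_t)bytes_per_sec * max_latency_us + 999999u) / 1000000u;

    /* Saturate rather than wrap if the estimate exceeds 32 bits. */
    uint32_t need = (in_flight > UINT32_MAX) ? UINT32_MAX : (uint32_t)in_flight;

    /* The buffer must hold at least one whole packet. */
    return (need > packet_bytes) ? need : packet_bytes;
}
```

For example, at 1,000,000 bytes per second with a worst-case driver latency of 100 us, about 100 bytes can arrive before the driver responds, so a 16-byte packet size would not be the limiting factor.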

Best Practice: Size receive and transmit buffers appropriately for efficient communication between hardware and firmware.

This month’s newsletter ran out of buffer space, so I will allocate space in next month’s newsletter for a continuation of this discussion.

Until the next buffered issue…

TEB #3: Balancing How Firmware Waits on Hardware

Saturday, January 2nd, 2010

The Embedded Bridge – Issue #3

A common question engineers often wrestle with is how long hardware will take to do a requested task so firmware can take the next step. Engineers implement different designs (both in hardware and firmware) depending on the length of time, and these designs have varying impacts on hardware and firmware complexity and overall system performance. Understanding their ramifications during the design phase helps balance the load between hardware and firmware.

Based on the hardware and firmware implementation required, we can group these designs into three categories:

  • No Delay – Hardware completes the task almost immediately. Firmware can assume the task is immediately completed and can safely take the next step.
  • Short Delay – Hardware completes the task after a short delay. Firmware must wait momentarily for the task to complete before taking the next step.
  • Long Delay – Hardware completes the task after a long delay. The wait time is long enough that firmware should do other processing while waiting for the task to complete so that it can take the next step.

Let’s take aborts in hardware as an example, since implementations exist in each of the three categories – no, short, and long delays. For some aborts there is no delay; it is a simple matter of returning to the home or idle state, clearing counters and buffers, and completing other activities that can be done quickly. Such an operation is so quick that hardware does not need extra logic for a status or interrupt bit. In these cases, firmware can initiate the abort and simply move on to the next step, which may be to set up the hardware for the next job. The key is for hardware to complete the abort before firmware tries to access it again.

Best Practice: When the task in hardware is fast enough to complete before the next firmware access, hardware does not need to implement a status or interrupt bit for task completion.

Some abort implementations can take several clock cycles to complete, which means that firmware must wait for completion before accessing the block again. If the delay is short, hardware should provide a status bit that firmware can poll, looping a few times until the task is done before moving on to the next step. If the delay is long, hardware should provide an interrupt bit that firmware enables, so that firmware can do other processing while waiting for the interrupt to occur. Setting up, waiting for, and responding to an interrupt requires several CPU cycles for task swaps, context switches and semaphore handling. Thus, for firmware, polling a status bit is preferable to managing an interrupt if the task will be done after a short delay.
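
The short-delay case can be sketched in C as a bounded polling loop. The register layout, bit names, and function name here are hypothetical and only stand in for whatever a real block's documentation defines. Returning the number of status reads is also an assumption, added because it makes it easy to measure how long the polling actually takes.

```c
#include <stdint.h>

/* Hypothetical register layout, for illustration only. */
typedef struct {
    volatile uint32_t ctrl;    /* write CTRL_ABORT_START to request an abort */
    volatile uint32_t status;  /* hardware sets STATUS_ABORT_DONE when finished */
} block_regs_t;

#define CTRL_ABORT_START   (1u << 0)
#define STATUS_ABORT_DONE  (1u << 0)

/* Start an abort and poll the status bit at most max_polls times.
 * Returns the number of status reads it took (1 or more),
 * or -1 if the bit was still clear after max_polls reads. */
static int abort_and_poll(block_regs_t *regs, int max_polls)
{
    regs->ctrl = CTRL_ABORT_START;

    for (int i = 0; i < max_polls; i++) {
        if (regs->status & STATUS_ABORT_DONE) {
            return i + 1;
        }
    }
    return -1; /* too slow for polling; caller must fall back */
}
```

The bound on the loop matters: without it, a block that never completes would hang firmware in the polling loop.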

Where the line between short and long delays falls must be determined on a case-by-case basis and depends on the hardware platform, operating system and performance requirements. The dividing line could even move dynamically depending on the current operating conditions of the product. To give engineers the flexibility of moving that dividing line, the hardware for short and long delays should be the same, implemented with both a status bit and a maskable interrupt. This flexibility lets engineers calculate or measure how many loops the polling takes and decide whether polling is acceptable or interrupts are needed.

Best Practice: Implement both a status bit and a maskable interrupt bit to indicate completion of hardware tasks that take time to complete, whether a short or a long time.

For some blocks, the abort time can vary from a short delay if the block is idle to a long delay if the block is busy and needs to terminate gracefully. Since firmware cannot know the current state, it must always assume the worst case. If firmware wants to take advantage of the shorter aborts when they do occur, it can poll for several loops in case the task completes quickly and, if it does not, enable the interrupt and switch to another task.
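
A minimal sketch of that poll-then-interrupt approach follows. All register names, bit names, and the result type are hypothetical. The sketch also assumes the interrupt is level-sensitive on the completion status, so a completion that lands between the last poll and the interrupt enable is not lost; hardware that behaves differently would need a re-check after enabling.

```c
#include <stdint.h>

/* Hypothetical register layout, for illustration only. */
typedef struct {
    volatile uint32_t ctrl;        /* write CTRL_ABORT_START to request an abort */
    volatile uint32_t status;      /* STATUS_ABORT_DONE set on completion */
    volatile uint32_t int_enable;  /* INT_ABORT_DONE unmasks the completion interrupt */
} hybrid_regs_t;

#define CTRL_ABORT_START   (1u << 0)
#define STATUS_ABORT_DONE  (1u << 0)
#define INT_ABORT_DONE     (1u << 0)

typedef enum {
    ABORT_COMPLETE,     /* finished during the quick polling phase */
    ABORT_WAIT_FOR_IRQ  /* interrupt enabled; caller should switch tasks */
} abort_result_t;

/* Poll briefly in case the block was idle, then fall back to the interrupt. */
static abort_result_t abort_poll_then_irq(hybrid_regs_t *regs, int quick_polls)
{
    regs->ctrl = CTRL_ABORT_START;

    for (int i = 0; i < quick_polls; i++) {
        if (regs->status & STATUS_ABORT_DONE) {
            return ABORT_COMPLETE;
        }
    }

    /* Assumed level-sensitive: fires even if completion already happened. */
    regs->int_enable |= INT_ABORT_DONE;
    return ABORT_WAIT_FOR_IRQ;
}
```

On `ABORT_WAIT_FOR_IRQ`, the caller would typically block on whatever synchronization primitive its operating system provides until the interrupt handler signals completion.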

To help engineers know how to implement the firmware, put in the block’s documentation the min and max abort times and the conditions under which they will occur. It could be something such as, “if the block is already idle, the abort will complete in 20 ns; otherwise it will take 2 to 3 us to complete.”

Best Practice: Document the min and max times that a hardware task will take, including the conditions and states that affect those times.
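
Documented times like these let firmware turn a delay into a polling budget. The sketch below uses the sample figures from the example sentence above (20 ns when idle, up to 3 us when busy). The macro names, the function name, and the per-poll cost are hypothetical; the per-poll cost would have to be measured on the actual platform.

```c
#include <stdint.h>

/* Sample timings from the example documentation sentence. */
#define ABORT_IDLE_NS      20u    /* block already idle */
#define ABORT_BUSY_MAX_NS  3000u  /* block busy: up to 3 us */

/* Number of status reads needed to cover wait_ns, rounded up.
 * ns_per_poll is the measured cost of one status read (hypothetical);
 * a value of zero is treated as 1 to avoid dividing by zero. */
static uint32_t polls_to_cover(uint32_t wait_ns, uint32_t ns_per_poll)
{
    if (ns_per_poll == 0u) {
        ns_per_poll = 1u;
    }
    return wait_ns / ns_per_poll + ((wait_ns % ns_per_poll) != 0u ? 1u : 0u);
}
```

If one status read costs 50 ns, a single poll covers the idle case, while the busy case needs 60 polls, which may be the point where an interrupt becomes the better choice.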

I used aborts for these examples, but the concepts apply to any firmware-initiated hardware task that could take time to complete. Implementing both status and interrupt bits for short- and long-delay hardware tasks allows firmware to balance the system load and performance by using polling loops or interrupts as appropriate.

Until the next interrupt (which will not occur for at least 2,000,000,000,000us)…