Altera’s HyperFlex™ Architecture Delivers a Hat Trick

System designers face three key challenges when embarking on any modern electronics design: increasing bandwidth while maintaining power levels, increasing functionality, and improving time to market.

Increase Bandwidth by Reducing Routing Delays

In previous-generation FPGA devices, routing delays dominate the delay characteristics of a design. In these architectures, the usual way to get higher throughput is to implement wider data buses, which increases both power and routing congestion. When routing congestion increases, signal lines get longer in order to bypass congested areas. This additional delay counters some of the performance gains that come from using wider data buses.

The HyperFlex architecture introduces major innovations to solve these performance problems. Hyper-Registers are added throughout all routing segments, and these registers can be selectively included or bypassed as needed to improve performance. With the HyperFlex architecture, there are now an order of magnitude more registers located right in the FPGA routing fabric than there are traditional ALM registers, which are located in the logic portion of the fabric. Hyper-Registers enable key design optimizations to reach double the performance of previous architectures without sacrificing power or logic efficiency. Additionally, Hyper-Aware software tools help designers to automatically target Hyper-Registers in their designs, dramatically simplifying the development process.

Figure 1: HyperFlex Architecture Has Ten Times as Many Registers as Traditional Architectures (Figure courtesy of Altera)

Registers Everywhere

As process geometries shrink, the speed through the logic modules in an FPGA increases. Unfortunately, the speed of the interconnect between the logic modules is not increasing as fast. As a result, routing delay is becoming the performance-limiting element in FPGA designs. Locating the Hyper-Registers in the interconnect routing, where they can best address this issue, is one of the key innovations of the HyperFlex architecture.

The “registers everywhere” in the interconnect routing, called Hyper-Registers, are distinct from the conventional registers that are contained within the adaptive logic modules (ALMs). A Hyper-Register is associated with each individual routing segment in the device. Hyper-Registers are also available at the inputs of all functional blocks such as ALMs, embedded memory (M20K) blocks, and digital signal processing (DSP) blocks. The Hyper-Registers are selectively bypassable, allowing the design tools to choose the optimal register locations automatically, after place-and-route, to maximize interconnect performance. Having Hyper-Registers throughout the interconnect means that performance tuning does not require additional ALM resources (unlike conventional architectures) and does not require changes or added complexity to the design’s place-and-route. Additionally, having Hyper-Registers built into the interconnect helps to reduce routing congestion, since wider buses are not needed to improve interconnect performance.

Enhanced Clocking Network 

It is important to make sure the other elements of the HyperFlex architecture can ‘keep up’ with the higher performance fabric. The clocking network, for example, needs to be implemented in such a way as to not become the limiting element in a design. The HyperFlex programmable clock tree network allows system designers to create localized clock trees, reducing skew and timing uncertainty to obtain maximum core clocking performance. This capability is a key feature that allows the HyperFlex architecture to reach twice the performance of traditional architectures. In addition, the core clocking uses intelligent branch-enables to reduce the dynamic power dissipation in the clock networks to dramatically improve power efficiency. 

Increasing Functionality with Efficient Logic Implementation

The HyperFlex architecture not only improves performance, it also makes it easier to implement your designs. Because the HyperFlex architecture allows the designer to more easily optimize performance, power, bandwidth, die size, and features, much less time is spent ‘tuning’ a design to achieve aggressive design goals. No longer do you need to spend the majority of your engineering time and budget trying to ‘shoehorn’ your design into a smaller device while meeting timing requirements and power budgets. The HyperFlex architecture and the associated design tools dramatically simplify, and in many cases automate, the optimization portion of the design. This allows the designer to explore architecture-level options and alternative implementations instead of spending a majority of the time optimizing low-level signal placement and delay times.

Hyper-Aware Design Flow Simplifies Implementation

Along with the HyperFlex architecture, Altera has developed a powerful set of new tools, integrated into the Quartus Prime design software, that help system designers take full advantage of the architecture and maximize their design productivity. The Hyper-Aware design flow includes these three key improvements:

• A Fast Forward Compile tool that allows performance exploration and guides the user to maximum design performance.
• A Hyper-Retimer step that supports performance optimization after place and route.
• Enhanced synthesis and place-and-route algorithms that use the Hyper-Registers.

To maximize the performance of a design using the HyperFlex architecture, designers use a three-step process that is based on familiar design techniques: register retiming, pipelining, and design optimization. The Hyper-Registers allow designers to use familiar design techniques to increase the performance of the design well beyond what is possible in conventional FPGA architectures. When using the Hyper-Aware Design Flow, these techniques are called Hyper-Retiming, Hyper-Pipelining, and Hyper-Optimization. Table 1 summarizes the typical performance gains achieved in each step and a more detailed description of each step follows.

Table 1: Typical Performance Gains for Each Step of the Hyper-Aware Design Flow

Hyper-Retiming

The Hyper-Retiming step uses the Hyper-Registers in the interconnect routing to reduce critical path delays. It accomplishes this by selectively moving registers out of the ALMs and into the interconnect, better balancing register-to-register delays. By balancing delays, longer delay paths are eliminated, allowing the design to run at a faster clock frequency. Figure 2, below, shows the results of using Hyper-Registers to improve a critical path delay from 3.5ns to 1.2ns.

Figure 2: Unbalanced Delays in a Conventional Architecture (Top) Compared to Balanced Delays in the HyperFlex Architecture (Bottom)

In contrast to Hyper-Retiming, conventional retiming techniques require additional FPGA logic and routing resources (reducing those available for use in actual product features) and require the design to be recompiled, refitted, and rerouted, often multiple times, to ‘tune’ the implementation. Hyper-Retiming does not use any additional FPGA logic or routing resources, leaving them free for additional product features. Hyper-Retiming is performed after place-and-route, providing a significant performance boost without the need for additional tuning by the designer. This process requires little to no user effort, yet it results in an average performance gain of 1.4 times for Stratix 10 devices compared to previous-generation high-performance FPGAs.
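
The balancing idea behind retiming can be illustrated with a small Python model. This is only a sketch under stated assumptions, not the algorithm used by the Hyper-Retimer: the segment delays are hypothetical numbers in picoseconds, the function names are invented for illustration, and the search simply tries every routing-segment boundary where a bypassable register could be enabled.

```python
from itertools import combinations

# Hypothetical delays (ps) of six consecutive segments on one path.
SEGMENT_DELAYS_PS = [400, 800, 600, 600, 800, 400]


def worst_stage_delay(segment_delays, register_positions):
    """Return the longest register-to-register delay on the path.

    A position p means a register is enabled after segment p - 1,
    so the path is cut between segments p - 1 and p.
    """
    worst = 0
    start = 0
    for cut in sorted(register_positions) + [len(segment_delays)]:
        worst = max(worst, sum(segment_delays[start:cut]))
        start = cut
    return worst


def best_placement(segment_delays, num_registers):
    """Brute-force the register positions that minimize the worst stage delay."""
    boundaries = range(1, len(segment_delays))
    return min(
        combinations(boundaries, num_registers),
        key=lambda positions: worst_stage_delay(segment_delays, positions),
    )


# Registers stuck near the ends of the path (an assumed unbalanced placement).
unbalanced = (1, 5)
balanced = best_placement(SEGMENT_DELAYS_PS, 2)

print(worst_stage_delay(SEGMENT_DELAYS_PS, unbalanced))          # 2800
print(balanced, worst_stage_delay(SEGMENT_DELAYS_PS, balanced))  # (2, 4) 1200
```

With the same two registers, moving them to balanced positions cuts the worst register-to-register delay on this toy path from 2800 ps to 1200 ps, which is what allows a shorter clock period.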
 

Hyper-Pipelining

The Hyper-Pipelining step uses Hyper-Registers to eliminate long routing delays by adding pipeline stages in the interconnect between the ALMs. Additional pipeline stages add cycles of latency, but in many data processing applications the extra latency is acceptable, and the resulting shorter clock cycle is more important for overall system performance. This technique requires minor user effort and results in an average performance gain of 1.6 times for Stratix 10 devices compared to previous-generation high-performance FPGAs. As with Hyper-Retiming, Hyper-Pipelining does not use additional FPGA logic and routing resources, and it is done after place-and-route.
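
The latency-versus-clock-rate trade-off described above can be put into rough numbers. The following Python sketch is a back-of-the-envelope model, not a timing analysis: it assumes a hypothetical 3000 ps path that splits evenly across stages and a hypothetical fixed 100 ps register overhead per stage, and the function name is invented for illustration.

```python
def pipeline_tradeoff(path_delay_ps, extra_stages, register_overhead_ps=100):
    """Estimate stage delay, fmax, and added latency for an evenly split path.

    Assumes the path divides evenly across the stages and that each stage
    pays a fixed register overhead (both are simplifying assumptions).
    """
    stages = extra_stages + 1
    stage_delay_ps = path_delay_ps / stages + register_overhead_ps
    fmax_mhz = 1_000_000 / stage_delay_ps  # 1e6 ps per microsecond gives MHz
    return stage_delay_ps, fmax_mhz, extra_stages


# Each extra stage shortens the clock period but adds one cycle of latency.
for extra in range(4):
    delay, fmax, latency = pipeline_tradeoff(3000, extra)
    print(f"{extra} extra stages: {delay:.0f} ps/stage, "
          f"{fmax:.0f} MHz, +{latency} cycles latency")
```

In this model the gain per added stage shrinks as the fixed register overhead starts to dominate the stage delay, which is one reason pipelining depth is a design decision rather than a free parameter.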

Hyper-Optimization

The Hyper-Optimization step uses Hyper-Registers to improve control-oriented designs with long feedback loops and complex state machines. In these types of designs, improving performance requires restructuring the affected logic sections. By restructuring the logic implementation, the design can use functionally equivalent feed-forward or pre-computed logic techniques to eliminate long combinatorial feedback paths. This method requires a bit more effort, depending on the design. It results in average performance gains of two times or more in Stratix 10 devices compared to previous-generation high-performance FPGAs. In a conventional architecture, this process is called design optimization. In the HyperFlex architecture, this process is called Hyper-Optimization because the Hyper-Registers apply the benefits of Hyper-Retiming and Hyper-Pipelining to the feed-forward or pre-compute paths.
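
One textbook way to restructure a feedback loop with pre-computed and feed-forward terms is loop look-ahead. The Python sketch below is an illustration chosen for this article, not a technique the source attributes to Altera: the recurrence y[n] = a*y[n-1] + x[n] (modulo a word size), the constants, and the function names are all assumptions. The look-ahead form depends on y[n-2] instead of y[n-1], so the feedback loop spans two clock cycles and leaves room for a register inside it, while the a*x[n-1] + x[n] term has no feedback and can be pipelined freely.

```python
def direct_loop(xs, a, modulus, y0=0):
    """Original recurrence: y[n] = a*y[n-1] + x[n] (mod modulus)."""
    out = []
    y = y0
    for x in xs:
        y = (a * y + x) % modulus
        out.append(y)
    return out


def lookahead_loop(xs, a, modulus, y0=0):
    """Look-ahead form: y[n] = a^2*y[n-2] + (a*x[n-1] + x[n]) (mod modulus)."""
    if not xs:
        return []
    a_squared = (a * a) % modulus           # pre-computed constant
    out = [(a * y0 + xs[0]) % modulus]      # first sample uses the direct form
    y_two_back = y0                         # y[n-2] for n = 1 is the initial state
    for n in range(1, len(xs)):
        feed_forward = (a * xs[n - 1] + xs[n]) % modulus  # no feedback involved
        out.append((a_squared * y_two_back + feed_forward) % modulus)
        y_two_back = out[n - 1]             # becomes y[n-2] for the next sample
    return out


samples = [5, 17, 200, 33, 8, 91]  # hypothetical input data
print(direct_loop(samples, 3, 256) == lookahead_loop(samples, 3, 256))  # True
```

The two functions produce identical outputs, which is the "functionally equivalent" property the restructuring has to preserve; the cost is the extra feed-forward arithmetic.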

Faster Time to Market

The HyperFlex architecture dramatically reduces the amount of time typically required to optimize an FPGA implementation. By simplifying and automating the optimization process, it reduces development time and so improves time to market. Alternatively, the time saved can be reallocated to implementing features and capabilities that differentiate the product from the competition. Often, the best choice is a mix of improving time to market and adding compelling new features. The HyperFlex architecture can finally deliver on the full time-to-market potential available from programmable devices.

Intel 14 nm Tri-Gate Transistor Technology Gets the Assist

Without a leading-edge process, it would not be possible to combine all the features used in the HyperFlex architecture while still meeting the performance, power, and functionality requirements. Stratix® 10 FPGAs and SoCs leverage Intel’s 14 nm Tri-Gate transistor technology (commonly referred to as FinFET) to implement the HyperFlex architecture optimally. The combination of the Intel 14 nm Tri-Gate process and the new HyperFlex high-performance architecture enables Stratix 10 devices to operate with twice the core performance, with up to 70 percent lower power for equivalent performance, and at five times the density compared to current high-end FPGAs, while still delivering the most comprehensive high-performance FPGA functional capabilities.

It is also important to note that the HyperFlex architecture enables significant power reductions. By running the core at twice the frequency of other implementations, designs can often be implemented using as little as half the logic required in other architectures. This approach, called HyperFolding, often results in a smaller device and, when combined with the advanced Intel 14 nm Tri-Gate (FinFET) process, results in a power reduction of up to 70 percent compared to Stratix V devices.
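
The arithmetic behind trading clock frequency for datapath width is simple: throughput is bus width times clock rate, so doubling the clock halves the width needed for the same throughput. The Python sketch below uses hypothetical numbers (a 100 Gb/s datapath at 400 MHz and 800 MHz) and an invented function name; real logic usage does not always scale linearly with bus width, so treat it as a first-order estimate only.

```python
def required_bus_width(throughput_mbps, clock_mhz):
    """Bits that must move per clock cycle to sustain the throughput (rounded up)."""
    return -(-throughput_mbps // clock_mhz)  # integer ceiling division


TARGET_MBPS = 100_000  # hypothetical 100 Gb/s datapath

for clock_mhz in (400, 800):
    width = required_bus_width(TARGET_MBPS, clock_mhz)
    print(f"{clock_mhz} MHz -> {width} bits wide")
```

For this example the required width drops from 250 bits at 400 MHz to 125 bits at 800 MHz.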


Conclusion

The Altera HyperFlex architecture is a breakthrough approach that fully delivers on the promise of programmable devices. By simultaneously improving performance, increasing functionality, and speeding time to market, all in a power-efficient way, HyperFlex scores a hat trick against all other FPGA architectures.
