Asynchronous logic has long looked promising, yet it has never become mainstream. Is there a fundamental obstacle, or has it simply been bad luck?
Even now, when power consumption has become a core design constraint, skepticism remains about whether asynchronous logic can play a substantial role. The design style offers significant advantages, but its practical benefits have yet to be convincingly demonstrated.
Synchronous design relies on a clock, and the clock frequency is limited by the longest, slowest path in the design, with extra margin added for variability introduced during manufacturing. At test, it is common practice to bin chips into performance grades; otherwise, any die that cannot meet the target frequency has to be discarded as out of spec.
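As a rough illustration of that timing budget (not taken from the article, and using hypothetical delay values), the minimum clock period of a synchronous design is the sum of the delays along its slowest register-to-register path plus a margin for skew:

```go
package main

import "fmt"

func main() {
	// Hypothetical delays (in ns) for the slowest register-to-register path.
	tClkToQ := 0.10     // launch flip-flop clock-to-Q delay
	tLogic := 0.55      // worst-case combinational delay on the critical path
	tSetup := 0.05      // capture flip-flop setup time
	tSkewMargin := 0.05 // margin reserved for clock skew and jitter

	// The clock period must cover the whole path plus the margin.
	tMin := tClkToQ + tLogic + tSetup + tSkewMargin
	fmt.Printf("minimum period: %.2f ns, max frequency: %.2f GHz\n", tMin, 1.0/tMin)
}
```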
Clock skew complicates matters further. Although the clock originates from a single point, it experiences different delays as it is distributed across the chip. Clock skew is this deviation of the clock from its intended arrival time, and it, too, is heavily affected by manufacturing variation.
To mitigate these issues, designers often resort to multiple clocks or other elaborate schemes. While these create domains that are only loosely coupled, they also give rise to a new class of problems: clock domain crossings.
Moreover, the clock is one of the main consumers of power. Because the clock has to be distributed across the entire chip, a large amount of capacitance accumulates on the clock lines. Every clock edge charges or discharges that capacitance, which both limits speed and burns considerable power. To keep the load on each buffer manageable, still more buffers are inserted, which adds yet more power.
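To put a number on that, dynamic switching power is commonly estimated as P = a*C*V^2*f, and a clock net switches every cycle, so its activity factor is as high as it gets. A minimal sketch with made-up values:

```go
package main

import "fmt"

// dynamicPower estimates switching power in watts: P = alpha * C * V^2 * f.
func dynamicPower(alpha, capF, volts, freqHz float64) float64 {
	return alpha * capF * volts * volts * freqHz
}

func main() {
	// Hypothetical values: 1 nF of total clock-tree capacitance, 0.8 V supply,
	// 2 GHz clock. A clock net switches every cycle, so its activity factor is
	// the maximum; typical data nets switch far less often.
	clock := dynamicPower(1.0, 1e-9, 0.8, 2e9)
	data := dynamicPower(0.1, 1e-9, 0.8, 2e9) // same capacitance, 10% activity
	fmt.Printf("clock net: %.2f W, comparable data net: %.2f W\n", clock, data)
}
```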
As chips push up against process limits, running an entire device off a single synchronous clock has become impractical. Michael Frank, a retired senior engineer at Arteris, pointed out: "If a signal cannot traverse the entire chip within a single clock cycle, the design has to be treated as a hybrid: locally synchronous, but asynchronous over long distances. That requires synchronizers, or it means borrowing a strategy from the early CPU era and building a clock grid with very low skew. The problem is that clocks consume a lot of power and require a massive re-buffering tree to drive a huge number of flip-flops."
Synchronous design is attractive because, once the longest path is determined, timing becomes a relatively secondary concern: all operations are broken down into a series of discrete steps. That matters enormously to design tools such as synthesis.

Rob Aitken, a senior engineer at Synopsys, said: "Asynchronous design is often seen as a technology full of promise but difficult to implement. Apart from a few specific cases, it is really hard to apply broadly. That is an overly general statement, but if you take a piece of RTL optimized for synchronous design and try to implement the same functionality asynchronously, you first have to adjust the RTL to fit the asynchronous environment. Then you do the asynchronous implementation and evaluate whether the change has brought any actual benefit. In the long run, the world will eventually find ways to benefit from asynchronous design, but for now fully synchronous design still dominates, because all the tools and technologies are optimized around it."
Before any revolutionary change, tool support is crucial. Marly Roncken, director of the Asynchronous Research Center at Portland State University, pointed out: "Asynchronous design faces the classic chicken-and-egg problem: without suitable tools there are no users, and without users the tools never get built. That makes large companies hesitate, leaving asynchronous design the preserve of startups and research centers. Synchronous designs also contain elements of asynchronous logic, even if synchronous designers pay little attention to them. What I hope to see is seamless integration in the tools, so that synchronous and asynchronous design can complement each other's strengths and each be used where it works best."
So, will there be a decisive moment that makes asynchronous design an indispensable choice? Rajit Manohar, a professor of Electrical Engineering and Computer Science at Yale University, believes: "If given a set of design goals that include power consumption and energy, using asynchronous methods may achieve these goals faster than synchronous methods. This is particularly significant when the goals are very challenging. As long as there is enough time and resources, engineers can optimize any design. Although I cannot assert that a specific performance point can never be achieved, with the right tools, support, and capabilities, engineers can be creative and optimize their designs."
Historical Attempts
Back in the 1980s and 1990s, many top system companies actively explored the potential of asynchronous design. However, limited by design tools that were only suitable for synchronous logic at the time, these companies tried various techniques but ultimately did not adopt asynchronous design practices.
In the 1980s, most design work still had to be done manually. Professor Manohar from Yale University pointed out: "Quality tools allow us to design more complex chips, and more efficient processors enable us to run more advanced tools, forming a virtuous cycle of development. Today, we have complex EDA tools to design highly complex synchronous chips. Unfortunately, asynchronous design methods are not mature enough to effectively integrate into this development cycle. The first synchronous processor was born in the 1970s, and the first asynchronous processor was late to the game, not appearing until 1989, with a significant time gap between the two."
One research paper detailed 10 different ways of describing asynchronous systems, along with their associated synthesis methods. Its author, Scott Hauck, wrote: "Making a comprehensive, in-depth comparison of each method, especially on core issues such as performance, area, and power, is a daunting task. Unfortunately, not enough real comparisons have been done. Worse, despite some impressive individual results, there is no conclusive evidence that asynchronous circuits hold a significant advantage over synchronous methods. Which approach performs best on performance, area, or power, and whether it is worth the extra effort of abandoning the widely used synchronous model, remain matters of dispute."
Notably, these asynchronous techniques do not include the approach most common in early experiments. Ron Lavallee, President of You Know Solutions, pointed out: "Existing asynchronous methods essentially convert a synchronous Turing machine into an asynchronous system. The ideal starting point, however, would be a stateless, asynchronous foundation, with matching circuits built on top of it. Developing asynchronous systems is already harder than conventional synchronous design, and converting a synchronous Turing-machine design into an asynchronous one is a difficult technical problem in its own right."
Asynchronous design has also long suffered from the lack of a settled design methodology. "Asynchronous design is prone to errors," said Manohar. "In synchronous design, I can open a VLSI textbook showing 50 different types of latches and numerous circuit styles, even if commercial tools do not support many of them. Synchronous logic has an established design style, and engineers know how to make it work efficiently and get the results they want. In asynchronous logic there are many different methods with very different outcomes, and for an outsider it is hard to tell which one is appropriate. Choose the wrong one and you are in trouble. That is part of the problem. The lack of regular asynchronous chip production, of people with the relevant expertise, and of reliable design tools and automated methodologies are all significant factors holding asynchronous design back."

The circuits best suited to asynchronous design are those whose operation time depends on the data. Their defining characteristic is that some results can be computed quickly while others take longer. If every computation has to finish within a fixed clock period, the design must be sized for the longest possible computation.
Professor Manohar further explains: "Multiplication is a simple example. Suppose I am writing a piece of software to calculate the product of two numbers. By analyzing the code, I find that the multiplier is the performance bottleneck. I notice that, in most cases, the value of variable X is zero. As a software engineer, I might add a conditional check: if X equals zero, then return zero directly; otherwise, perform the multiplication operation. However, in clock-driven synchronous designs, this is not a good idea. Because, in the worst case, adding this check condition could lead to a decrease in frequency. In asynchronous design, this is an optimization because, on average, our performance is improved. This is the kind of problem we need to think about from an algorithmic perspective."
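A minimal sketch of that idea in software terms (the function names and cycle counts below are purely illustrative, not from any actual design): the zero check shortens the common case, which improves average latency in a self-timed pipeline but would only lengthen the worst-case path under a fixed clock.

```go
package main

import "fmt"

// slowMultiply stands in for a full multiplier; the second return value
// models its latency as a fixed worst-case cost (illustrative numbers).
func slowMultiply(x, y int64) (int64, int) {
	return x * y, 10
}

// earlyOutMultiply adds the software-style check: if x is zero, answer at
// once. An asynchronous consumer can accept the result whenever it is ready;
// a synchronous design would have to budget for the slowest path anyway.
func earlyOutMultiply(x, y int64) (int64, int) {
	if x == 0 {
		return 0, 1 // fast path
	}
	r, lat := slowMultiply(x, y)
	return r, lat + 1 // slow path now also pays for the check
}

func main() {
	_, fast := earlyOutMultiply(0, 12345)
	_, slow := earlyOutMultiply(7, 12345)
	// If 90% of the inputs have x == 0, average latency drops sharply even
	// though the worst case got slightly worse.
	avg := 0.9*float64(fast) + 0.1*float64(slow)
	fmt.Printf("fast: %d, slow: %d, average: %.1f cycles (vs. 10 fixed)\n", fast, slow, avg)
}
```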
Language and Tools
Language support is a significant barrier, because every design language in mainstream EDA use today is deeply optimized for synchronous input.
"We initially tried to introduce decision flowcharts," Mr. Lavallee from You Know Solutions explains, "These flowcharts have been widely used in General Motors' powertrain systems, involving thousands of manufacturing systems. However, one of the main challenges we faced was how to guide people to understand these flowcharts with parallel thinking. Their working principle is to propagate multiple decision flowcharts simultaneously and trigger event functions during the propagation process. This propagation can occur in physical, biological, or chemical substrates. In short, flowcharts (as shown in Figure 1) consist of a series of events, actions, and tests. Some have complained that flowcharts can become as complex and intractable as spaghetti code.
"Years ago we solved this problem by turning the flowcharts into a true parallel programming language. You simply draw a separate flowchart for each task or function, which breaks one complex, sprawling flowchart into multiple manageable small ones. We also introduced the concept of objects. Objects let us encapsulate action and test structures into higher-level actions and tests, which further improves readability and maintainability, and you can carry that encapsulation through as many levels as you need."
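As a loose software analogy only (this is not Lavallee's actual tooling, and the event, test, and action names below are invented), each flowchart can be pictured as an independent process that waits for events, applies a test, and fires an action, with several flowcharts propagating concurrently:

```go
package main

import (
	"fmt"
	"sync"
)

// event is whatever a flowchart waits on; the "test" and "action" below are
// the illustrative building blocks mentioned in the text.
type event struct{ name string }

// runFlowchart sketches one flowchart: it propagates on its own, testing each
// incoming event and firing an action when the test passes.
func runFlowchart(id string, events <-chan event, wg *sync.WaitGroup) {
	defer wg.Done()
	for ev := range events {
		if ev.name == "start" { // test
			fmt.Printf("flowchart %s: action fired on %q\n", id, ev.name) // action
		}
	}
}

func main() {
	var wg sync.WaitGroup
	a, b := make(chan event), make(chan event)
	wg.Add(2)
	// Two small flowcharts propagate concurrently instead of one large one.
	go runFlowchart("A", a, &wg)
	go runFlowchart("B", b, &wg)
	a <- event{"start"}
	b <- event{"start"}
	close(a)
	close(b)
	wg.Wait()
}
```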
Yale University has developed its own hardware description language. "It is essentially a message-passing programming language, where messages handle the communication between components," Manohar explained. "It lets us describe dataflow designs with a clear syntax for loops and communication. The language is based on CSP, which Tony Hoare developed in 1978, but with semantic innovations."
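Yale's language itself is not shown here, but the CSP model it builds on is straightforward to sketch: components communicate only through channels, and data moves when both sender and receiver are ready, with no clock involved. Go's channels are directly CSP-inspired, so a toy dataflow stage (illustrative names only) looks like this:

```go
package main

import "fmt"

// double is one dataflow component: it repeatedly receives a token on its
// input channel, transforms it, and sends it on its output channel. There is
// no clock; the send/receive handshake provides the synchronization.
func double(in <-chan int, out chan<- int) {
	for v := range in {
		out <- 2 * v
	}
	close(out)
}

func main() {
	in, out := make(chan int), make(chan int)
	go double(in, out)
	go func() {
		for i := 1; i <= 3; i++ {
			in <- i
		}
		close(in)
	}()
	for v := range out {
		fmt.Println(v) // 2, 4, 6: in order, yet with no global clock
	}
}
```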
However, transitioning from synchronous languages like Verilog is challenging. "Many synchronous design tools do not support asynchronous operation in their core software," Roncken pointed out. "That is especially true of timing analysis and test tools, which designers depend on. Our research is closely tied to the asynchronous tools developed at Yale, which in turn draw on deep knowledge and experience in asynchronous design from the California Institute of Technology, Philips Electronics, the University of Manchester, Intel, and others."

Yale's research project has been funded by DARPA's Electronics Resurgence Initiative (ERI). "We have built a complete ASIC flow for asynchronous circuits," said Manohar. "We have developed a suite of tools aimed at designing asynchronous circuits with less effort than the complicated design of a synchronous chip requires. Our goal is to show that we can automate the design of high-quality chips, or at least significantly reduce the clock design workload."
The verification process, however, brings some unique challenges. One is reproducibility. Simulation is deterministic, in that the order of events is the same every time a given simulation is run, but coordinating multiple asynchronous activities is difficult, and that is a significant problem in live systems. Even in a simulation environment, capturing and understanding the system state can be extremely challenging. Small changes can produce very different results in an asynchronous design, whereas a synchronous design is largely insulated from such effects.
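A tiny illustration of why the real system is harder to pin down than the simulation (a hypothetical sketch, not from the article): once two activities share no common clock, their completion order can change from one run to the next, which is exactly what makes a failure hard to capture.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// worker finishes after an unpredictable delay, like a self-timed block whose
// completion time depends on the data and the silicon.
func worker(name string, done chan<- string) {
	time.Sleep(time.Duration(rand.Intn(5)) * time.Millisecond)
	done <- name
}

func main() {
	done := make(chan string, 2)
	go worker("A", done)
	go worker("B", done)
	// The order printed here can differ between runs; a deterministic
	// simulation of the same two blocks would always report the same order.
	fmt.Println(<-done, <-done)
}
```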
Additionally, problems can arise when using reference models for verification. Even though both models may be correct, their behavior patterns may differ, especially in the presence of asynchronous activities. Extra care must be taken to ensure that the reference model stays in sync with the design model.
"There are aspects where they are similar, and aspects where they are vastly different," said Professor Manohar, "We are leveraging formal methods and theorem provers to conduct research in this area to verify various properties of asynchronous designs. At a higher level of abstraction, we need different types of verification because we must check whether asynchronous computations are correctly implemented by gate circuits. We have developed some verification strategies that look more like strategies used for software verification."
Some aspects may be simpler. "The verification of clockless flowchart systems is easier because there is no need to verify every signal path," Mr. Lavallee pointed out, "Once the actions or test block structures used in the substrate have been thoroughly verified, there is no need to verify them again. Verification only involves the signal paths of the flow line and the overall behavior of the system."
Few designs can be entirely asynchronous, however, which means synchronous and asynchronous techniques must be combined. "Adaptive systems may be able to address variability," said Aitken. "Here you run into some more general problems. Suppose I build an asynchronous system and its performance improves; there are still two classic failure points for asynchronous design. One is test. Although this has changed recently, the standard way to test asynchronous circuits historically was to synchronize them and then run scan. The other is a technique often used in synchronous design: borrowing time across the clock cycle, as long as the integrity of the clock waveform does not degrade too far. Some things are neither entirely asynchronous nor entirely synchronous. These capabilities let synchronous designs capture some significant performance gains and power reductions, which means the advantage of a fully asynchronous system is not as large as it could be in theory."
Clock domain crossing (CDC) has become the bridge between the two worlds. "Subtle issues arise when you have two asynchronous clocks and the relative separation of their active edges changes dynamically," explained Prakash Narain, president and CEO of Real Intent. "At some point that separation becomes small enough that the path between the clocked flops cannot meet timing. To compensate, you must make sure the CDC crossings follow a specific set of logic design principles. For relatively slow chip-wide interconnect we have adopted the globally asynchronous, locally synchronous (GALS) approach. You create a clock domain, and within it you build islands of fully synchronous logic that close timing. Between the islands, because the clock tree is not balanced across them, the crossings are treated as asynchronous. They use the same source clock and frequency but may differ in phase. That could in principle bring some simplification, but usually it does not."
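A hedged sketch of the GALS idea (a software analogy, not Real Intent's tooling): two locally synchronous islands each pace themselves with their own local "clock", and the data crossing between them is a handshake that makes no assumption about their relative phase.

```go
package main

import (
	"fmt"
	"time"
)

// producer is one locally synchronous island: it advances on its own local
// ticker and pushes data into the crossing.
func producer(period time.Duration, crossing chan<- int) {
	tick := time.NewTicker(period)
	defer tick.Stop()
	for i := 0; i < 3; i++ {
		<-tick.C
		crossing <- i // blocks until the other island accepts (the "ack")
	}
	close(crossing)
}

// consumer is the second island, running at an unrelated rate; it samples the
// crossing on its own local ticks.
func consumer(period time.Duration, crossing <-chan int, done chan<- struct{}) {
	tick := time.NewTicker(period)
	defer tick.Stop()
	for v := range crossing {
		<-tick.C
		fmt.Println("island B received", v)
	}
	close(done)
}

func main() {
	crossing := make(chan int) // unbuffered: one request/acknowledge per item
	done := make(chan struct{})
	go producer(3*time.Millisecond, crossing)       // island A's local rate
	go consumer(5*time.Millisecond, crossing, done) // island B's slower rate
	<-done
}
```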
Simply carving a small piece out of a synchronous design and making it asynchronous does not work. "In my experience, the greatest advantage we gain from asynchronous design is that our approach to the entire problem is different," Manohar concluded. "We often come up with solutions that would not work well in a synchronous method, but because we adopted asynchronous logic, they turn out well."
Conclusion

So, does asynchronous design have a chance? "If what we understand today about asynchronous design, and about how to automate the various steps, had been known back in 1988, the situation might have been different," said Manohar. "We are at an interesting juncture where companies traditionally seen as software companies are now building silicon. That is an interesting opportunity, because there may be a group of people looking at chip design problems from a completely new perspective. That is an opportunity for asynchronous design."