# A Novel Static D-Flip-Flop Topology for Low Swing Clocking

Mallika Rathore, Marvell Semiconductor, Printer and Custom Solutions Business Unit, Boise, ID 83713, mrathore@marvell.com

Weicheng Liu and Emre Salman, Department of ECE, Stony Brook University, Stony Brook, NY 11794, weicheng.liu@stonybrook.edu

Can Sitik and Baris Taskin, Department of ECE, Drexel University, Philadelphia, PA 19104, as3577@drexel.edu, taskin@coe.drexel.edu

# ABSTRACT

Low swing clocking is a well-known technique to reduce the dynamic power consumption of a clock network. A novel static D flip-flop topology is proposed that can reliably operate with a low swing clock signal (down to 50% of $V_{DD}$) despite the full swing data and output signals. The proposed topology enables low swing signals within the entire clock network, thereby maximizing the power saved by low swing operation. The proposed flip-flop is compared with existing low swing flip-flops using a 45 nm technology node at a clock frequency of 1.5 GHz. The results demonstrate average reductions of 38.1% in power consumption and 44.4% in power-delay product. The sensitivity of each circuit to clock swing is investigated. The robustness of the proposed topology is also demonstrated by ensuring reliable operation at various process, voltage, and temperature corners.

# **Categories and Subject Descriptors**

B.7 [Integrated Circuits]: VLSI (very large scale integration)

# **General Terms**

Design

# **Keywords**

Flip-flop, low power, low swing, clock

# 1. INTRODUCTION

Reducing power consumption is a major objective for almost any application [1]. Voltage scaling is a common method to achieve quadratic savings in power consumption, at the cost of a considerable penalty in performance [2, 3]. Near-threshold computing has recently received attention as a way to widen the application scope of sub-threshold circuits due to its relatively more acceptable delay penalty [4]. Reducing the power supply voltage to the level of the threshold voltage, however, is not suitable for a majority of applications where performance is also a critical concern. Furthermore, the effect of process and environmental variations is exacerbated at near-threshold voltage levels [5].

GLSVLSI'15, May 20-22, 2015, Pittsburgh, PA, USA

Copyright 2015 ACM 978-1-4503-3474-7/15/05 ...\$15.00 http://dx.doi.org/10.1145/2742060.2742095.

Figure 1: Flip-flop that operates with a low swing clock signal whereas the data and output signals have full swing voltage.

[Figure 1 labels: DFF block with CLK and Q pins, full $V_{DD}$ supply; full swing data, low swing clock, full swing output.]

Another approach to reduce power consumption with negligible impact on performance is low swing clocking [6–8]. Low swing clock distribution networks have been investigated in existing literature since clock networks typically consume a significant portion of the overall power [9, 10]. Thus, low swing clock networks implemented either with a dedicated low voltage power grid or single voltage level-shifters can achieve considerable reduction in dynamic power while maintaining the performance.
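The potential saving can be estimated with the textbook first-order switching-power model, $P = \alpha C V_{supply} V_{swing} f$. The sketch below is not taken from the paper: it assumes a dedicated low-voltage supply (so $V_{supply} = V_{swing}$), ignores short-circuit and leakage power, and uses a hypothetical clock-net capacitance `C_CLK` purely for illustration.

```python
def clock_switching_power(c_load, v_supply, v_swing, freq, activity=1.0):
    """First-order switching power: activity * C * V_supply * V_swing * f."""
    return activity * c_load * v_supply * v_swing * freq

V_DD = 1.0      # nominal supply (V), matching the 45 nm example in the text
F_CLK = 1.5e9   # clock frequency (Hz), as used in the simulations
C_CLK = 1e-12   # ASSUMED clock-net capacitance (F); hypothetical value

# Reference: full swing clock driven from the full supply.
full_swing_power = clock_switching_power(C_CLK, V_DD, V_DD, F_CLK)

for swing in (1.0, 0.7, 0.5):
    # Dedicated low-voltage supply: the supply equals the swing.
    low = clock_switching_power(C_CLK, swing, swing, F_CLK)
    saving = 100 * (1 - low / full_swing_power)
    print(f"swing = {swing:.1f} V: {saving:.0f}% lower switching power")
```

Under these assumptions the switching power of the clock net scales with the square of the swing, so the capacitance value cancels out of the relative saving.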

Existing works on low swing clock networks, however, rely on full swing operation at the sinks (flip-flops) to maintain performance and the timing characteristics of the data paths [9, 10]. This approach significantly limits the power savings since the last stage of a clock network typically has large switching capacitance. Thus, it is desirable to maintain low swing clock signals even at the sink stages (*i.e.*, at the clock pins of the flip-flops). However, a typical flip-flop designed for full swing operation cannot reliably operate with a low swing clock signal. Furthermore, the data (D) and output (Q) signals of the flip-flop have full swing operation since the transistors along the D-to-Q path are connected to full power supply voltage, as illustrated in Fig. 1. A novel D flip-flop topology is therefore required to reliably operate with a low swing clock signal despite the full swing data and output signals.

The most commonly used static D flip-flop topology is redesigned in a novel fashion to accommodate low swing clock signals while ensuring reliable operation without any contention current. The proposed topology enables utilizing low swing clock signals at the clock pins of the flip-flops, thereby increasing the overall power savings that can be achieved via low swing clocking.

The rest of the paper is organized as follows. Background information is provided in Section 2. Previous work on low swing flip-flops is summarized in Section 3. The proposed topology is described in Section 4. Simulation results including a comparative analysis with existing works are presented in Section 5, demonstrating an average reduction of 38.1% and 44.4% in, respectively, power consumption and power-delay product. The effect of process and environmental variations is also investigated. Finally, the paper is concluded in Section 6.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Permissions@acm.org.

Figure 2: Conventional, transmission gate based D flip-flop driven by a low swing clock signal.

# 2. BACKGROUND

In a typical flip-flop, clock signals drive both NMOS and PMOS transistors (as in transmission gate based and tri-state inverter based flip-flops). If the same flip-flop is used with a low swing clock signal, the PMOS transistors driven by the clock signal fail to completely turn off when the clock signal is high. For example, consider a 45 nm technology with a nominal $V_{DD}$ of 1 V. If the clock swing is reduced to $0.7 \times V_{DD}$, the gate-to-source voltage of these PMOS transistors is -0.3 V when the clock is high, since the data signal is at full swing and the inverters within the flip-flop are connected to full $V_{DD}$. This value of -0.3 V is sufficiently close to the threshold voltage of PMOS transistors in this technology. This behavior significantly affects the operation reliability of a traditional flip-flop driven by a low swing clock signal. As an example, consider a rising-edge triggered master-slave flip-flop. When the clock signal is high, the master latch should be turned off. However, due to the low swing clock signal, the transmission gate (or tri-state inverter) within the master latch cannot completely turn off. If the data signal is in a different state than the stored data within the master latch, a race condition occurs which can possibly generate a metastable state.
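The gate-to-source arithmetic in this example can be sketched for several clock swings. The snippet assumes the PMOS source sits at full $V_{DD}$ while the clock is high, as described above; the threshold magnitude `V_TP_MAG` is an illustrative assumption, not a value reported in the paper.

```python
V_DD = 1.0       # nominal supply (V), from the 45 nm example in the text
V_TP_MAG = 0.4   # ASSUMED |V_tp| (V); illustrative only, not given in the source

def pmos_vgs_clock_high(v_dd, clock_swing):
    """V_GS of a clocked PMOS while the clock is high.

    The gate sits at the clock high level (clock_swing) and the source
    is assumed to sit at full v_dd.
    """
    return clock_swing - v_dd

for swing in (1.0, 0.8, 0.7, 0.6, 0.5):
    vgs = pmos_vgs_clock_high(V_DD, swing)
    # Positive margin: |V_GS| is still below the assumed threshold magnitude.
    margin = V_TP_MAG - abs(vgs)
    print(f"swing = {swing:.1f} V  V_GS = {vgs:+.2f} V  margin = {margin:+.2f} V")
```

With a full swing clock the PMOS sees zero gate-to-source voltage; every reduction in swing moves $V_{GS}$ closer to (or past) the assumed threshold, which is the mechanism behind the incomplete turn-off.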

To better illustrate the unreliability of conventional flip-flops operating with a low swing clock signal, a traditional transmission gate based D flip-flop, as shown in Fig. 2, is simulated with a 45 nm technology node when the clock swing is 0.7 V. The clock signal and inverted clock signal are internally generated by using two inverters. This circuit is referred to as the clock sub-circuit in this paper, as also depicted in Fig. 2. The inverters within the clock sub-circuit are connected to a low supply voltage to provide low swing clock signals. Since the PMOS transistors driven by the clock signals are not completely turned off, internal nodes experience a glitch as high as 400 mV and the clock-to-Q delay increases by more than 15%. Furthermore, in the slow corner, the flip-flop fails to correctly latch the data signal. Thus, a new flip-flop topology is required that can reliably operate with a low swing clock signal.

# 3. PREVIOUS WORK

Existing flip-flop topologies developed for a low swing clock signal are summarized in this section. The strengths and weaknesses of each topology are discussed.

A flip-flop topology for a low swing clock signal based on the clocked CMOS method (C<sup>2</sup>MOS) and a sense amplifier (SA) has been proposed in [11], as illustrated in Fig. 3(a). This circuit, referred to as L-C<sup>2</sup>MOS-SA, reduces the charge-discharge capacitance and implements the conditional pre-charge and discharge technique to achieve low power consumption. The circuit is area efficient and a considerable reduction in leakage current is also obtained. The original version of this topology utilizes diode-connected PMOS transistors within the clock sub-circuit to reduce voltage swing, as depicted in Fig. 3(a). Diode-connected PMOS transistors, however, significantly degrade clock slew due to reduced supply voltage in stacked PMOS transistors, making this topology impractical for industrial circuits. Thus, to achieve a fair comparison, this topology is modified in this work such that the clock sub-circuit has a second power supply voltage for low swing operation rather than diode-connected PMOS transistors. This modified version is referred to as L-C<sup>2</sup>MOS-SA-2. Also note that this topology requires a full swing clock signal at the slave stage (transistor N5), which defeats our primary objective of having only a low swing clock signal throughout the entire clock network.

Another flip-flop topology has been proposed in [12] for low swing operation. This topology, referred to as reduced clock swing flip-flop (RCSFF), is depicted in Fig. 3(b). As shown in this figure, this design utilizes an additional low supply voltage within the clock sub-circuit to provide a low swing clock signal, similar to the proposed topology in this paper. However, in [12], the low swing clock signal is used to drive PMOS transistors (P2 and P4) that are connected to a higher (full) supply voltage. As mentioned earlier, these transistors cannot completely turn off, producing functionality and reliability issues in addition to significantly increasing both short-circuit and leakage current. To alleviate this issue, the authors of [12] utilize the well-known bulk biasing technique. Specifically, the bulk nodes of P2 and P4 are connected to a separate well biased at a greater voltage, thereby increasing the threshold voltage of these PMOS transistors. An additional well, however, not only increases the physical area and complexity of the design, but also requires a triple-well process that is not common in standard digital CMOS technologies. Furthermore, in the slowest corner, this issue is exacerbated despite the use of well biasing.

The NAND-type keeper flip-flop topology proposed in [13], referred to as NDKFF, is illustrated in Fig. 3(c). As opposed to the previous topology, this circuit does not require a separate well at the expense of excessive leakage current that flows through transistors P2, N1-N3 when node X is at logic low. Furthermore, a contention occurs at node X since the level-keeping transistors, *i.e.*, P2, N4, N5 and I1-I2 have a race condition when node X transitions from logic low to logic high, thereby increasing the transition time and clock-to-Q delay of the output. This issue is exacerbated during the worst-case delay analysis of the circuit, which can be partially controlled by carefully sizing the transistors.

In [14], the authors have proposed a contention reduced flip-flop, referred to as CRFF and depicted in Fig. 3(d). This circuit utilizes a pulsed clock signal to provide a short transparency window during which the output is discharged through the NMOS transistors N1-N4. During this transparency window, the clocked transistors P5 and P6 disconnect the latch (I1-I2), thereby reducing contention current. Transistors P1 and P2 are controlled by input D through P3 and P4, which further reduces the contention current. However, the low swing clock signal is used to drive PMOS transistors P5 and P6, so the circuit suffers from the aforementioned issues of functionality and reliability.

# 4. PROPOSED D FLIP-FLOP TOPOLOGY

As observed in some of the previously proposed topologies, if a low swing clock signal drives PMOS transistors, functionality and reliability are compromised, particularly in nanoscale technologies where the supply voltage is in the range of 0.8 to 1.1 V and the threshold voltage is in the range of 0.3 to 0.5 V. Thus, in the proposed topology, as depicted in Fig. 4, the low swing clock signal drives only NMOS transistors.

Figure 3: Existing low swing flip-flop topologies that are compared with the proposed topology in this work: (a) C<sup>2</sup>MOS and sense amplifier based low swing flip-flop, L-C<sup>2</sup>MOS-SA [11], (b) reduced clock swing flip-flop, RCSFF [12], (c) NAND-type keeper flip-flop, NDKFF [13], and (d) contention reduced flip-flop, CRFF [14].

The proposed topology is based on the most commonly used, static D flip-flop shown in Fig. 2. However, rather than using transmission gates, pass gates with NMOS transistors (N2, N4, N13, and N14) are utilized as the switches in both master and slave latches. Thus, when the low swing clock signal is at logic high, N2 and N13 can completely turn off. Replacing the transmission gates with pass gates, however, introduces another issue since the pass gates cannot transfer a full voltage to the output. This issue is critical since the incoming data signal operates at full swing. Thus, node Y cannot reach a full $V_{DD}$, which increases the short-circuit and leakage current in the following stages in addition to increasing clock-to-Q delay. Furthermore, pass transistors are known to be less robust to process variations. To alleviate these issues, a pull-up network consisting of two PMOS transistors is added to both master and slave latches (P4 to P7). When the master node (input of N4) transitions to logic low, P4 turns on. If the data signal D is also at logic low, then node Y is pulled to full $V_{DD}$ through P4 and P6. Note that P6 (and P7 in the slave latch) is added to prevent contention current (and therefore reduce power consumption) when the data signal D is at logic high and the clock signal is at logic low. In this situation, N2 is on and node Y is discharged through N2 and N1. Without P6, a race condition occurs at node Y: N2 and N1 would need to overpower P4, which pulls node Y toward full $V_{DD}$. Finally, a pull-down logic is added to both master and slave latches to enhance clock-to-Q delay and setup time (N6 to N9). Specifically, when the data and clock signals are at logic low, the pull-down logic is active and pulls the master node (input of N4) to ground, triggering P4. Thus, node Y quickly reaches full $V_{DD}$. Note that the master node does not need to wait for node Y to rise through a weak pass transistor and turn on N3. Instead, the pull-down logic completes this transition relatively faster. Also note that the clock sub-circuit is not shown in Fig. 4, but is identical to the sub-circuit shown in Fig. 2.
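The need for the pull-up network can be illustrated with a first-order estimate of the high level that an NMOS pass gate delivers, roughly the gate voltage minus the NMOS threshold voltage. This is a simplified sketch rather than the authors' analysis: it ignores body effect and subthreshold conduction, and the threshold `V_TN` is an assumed, illustrative value.

```python
V_DD = 1.0         # nominal supply (V), from the text
CLOCK_SWING = 0.7  # low swing clock high level (V), from the text
V_TN = 0.3         # ASSUMED NMOS threshold (V); illustrative only

def nmos_pass_high_level(v_in, v_gate, v_tn):
    """Highest voltage an NMOS pass gate can pass (first-order, no body effect)."""
    return max(0.0, min(v_in, v_gate - v_tn))

def node_y_high_level(v_dd, v_gate, v_tn, pull_up_active):
    """High level at node Y with and without the PMOS pull-up path."""
    if pull_up_active:
        # The pull-up path described above restores node Y to the full supply.
        return v_dd
    return nmos_pass_high_level(v_dd, v_gate, v_tn)

without_pull_up = node_y_high_level(V_DD, CLOCK_SWING, V_TN, pull_up_active=False)
with_pull_up = node_y_high_level(V_DD, CLOCK_SWING, V_TN, pull_up_active=True)
print(f"node Y high level without pull-up: {without_pull_up:.2f} V")
print(f"node Y high level with pull-up:    {with_pull_up:.2f} V")
```

Under these assumptions the pass gate alone leaves node Y well below $V_{DD}$, which is why a restoring path to the full supply is required.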

The operation of the proposed flip-flop is depicted in Fig. 5 in a 45 nm technology node where the nominal  $V_{DD}$  is 1 V. The clock voltage swing is 0.7 V. The flip-flop successfully latches the full swing data and produces a full swing output while operating with a low swing clock signal.

The proposed topology is relatively less complex than the previously proposed flip-flops described in Section 3 since the proposed topology is static and does not require a separate well (bulk biasing) or sense amplifier. Furthermore, the proposed topology outperforms existing flip-flops, as discussed in the following section.

Figure 4: Proposed flip-flop topology where the low swing clock signal drives only NMOS transistors.

Figure 5: Operation of the proposed flip-flop in a 45 nm technology node where the nominal $V_{DD}$ is 1 V. Clock swing is set to 0.7 V.

# 5. SIMULATION RESULTS

The simulation setup is described in Section 5.1 and the simulation results including a comparative analysis with existing work are presented in Section 5.2. Robustness of each topology to process, voltage, and temperature variations is investigated in Section 5.3.

## 5.1 Simulation Setup

The proposed flip-flop topology and the previous circuits in existing work (L-C<sup>2</sup>MOS-SA [11], L-C<sup>2</sup>MOS-SA-2 [11], RCSFF [12], NDKFF [13], and CRFF [14]) are designed using a 45 nm technology with a nominal supply voltage of 1 V and all of the simulations are performed using Spectre [15]. The clock signal has a reduced swing of 0.7 V. The clock and data frequencies are, respectively, 1.5 GHz and 150 MHz. Each flip-flop drives an output load capacitance of 5 fF. To achieve a fair comparison, all of the flip-flops are sized to produce approximately equal clock-to-Q delay.

## 5.2 Comparative Analysis

The simulation results are listed in Table 1, comparing clock-to-Q delay, power consumption, power-delay product (PDP), leakage power, overall transistor size, and setup and hold times of all of the flip-flops. Note that the leakage power listed in this table is obtained by averaging the leakage power obtained from four possible static combinations of the data and clock signals.

As listed in Table 1, the proposed topology achieves, on average, 38.1% and 44.4% reduction in, respectively, dynamic power and power-delay product while exhibiting similar clock-to-Q delay. L-C<sup>2</sup>MOS-SA [11] achieves the least leakage power that is approximately 45% less than the proposed topology. L-C<sup>2</sup>MOS-SA [11], however, exhibits degraded behavior at the worst-case corners, as described in Section 5.3. The proposed topology exhibits the second lowest leakage power, achieving significant reduction, particularly as compared to NDKFF [13] and CRFF [14]. Overall transistor width of the proposed topology is less than the other topologies except L-C<sup>2</sup>MOS-SA [11]. The setup-hold time characterization of each topology is also performed, as listed in the last two columns [16]. Similar to some other topologies, the proposed flip-flop exhibits a negative setup time.
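The averages quoted above can be checked against the delay and power columns of Table 1. The sketch below assumes that "average reduction" means the mean of the per-topology relative reductions; that definition is an assumption, not stated in the paper. Because the tabulated values are rounded, the power-delay product average computed this way may differ slightly from the reported 44.4%.

```python
from statistics import mean

# (CLK-to-Q delay in ps, overall power in µW), copied from Table 1.
existing = {
    "L-C2MOS-SA [11]":   (69.7, 8.0),
    "L-C2MOS-SA-2 [11]": (71.3, 9.3),
    "RCSFF [12]":        (70.6, 17.2),
    "NDKFF [13]":        (69.4, 12.0),
    "CRFF [14]":         (70.3, 10.5),
}
proposed = (64.1, 6.6)

def pdp_fws(delay_ps, power_uw):
    """Power-delay product in fW·s (1 ps x 1 µW = 1e-3 fW·s)."""
    return delay_ps * power_uw * 1e-3

def mean_reduction_pct(ours, others):
    """Mean of the per-topology relative reductions, in percent (assumed definition)."""
    return 100 * mean(1 - ours / other for other in others)

power_red = mean_reduction_pct(proposed[1], [p for _, p in existing.values()])
pdp_red = mean_reduction_pct(
    pdp_fws(*proposed), [pdp_fws(d, p) for d, p in existing.values()]
)
print(f"average power reduction: {power_red:.1f}%")
print(f"average PDP reduction:   {pdp_red:.1f}%")
```

The same helper reproduces the PDP column itself; for example, `pdp_fws(69.7, 8.0)` rounds to the 0.56 fW·s listed for L-C<sup>2</sup>MOS-SA.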

The effect of clock swing voltage level on clock-to-Q delay and power consumption is also investigated for each flip-flop, as depicted in Fig. 6. According to Fig. 6(a), clock-to-Q delay increases as the clock swing is reduced in each topology. For L-C<sup>2</sup>MOS-SA [11], NDKFF [13], and CRFF [14], clock swing cannot be reduced below 0.6 V since these circuits fail to latch the input data at clock swings lower than 0.6 V. Note that the clock-to-Q delay is highly sensitive to voltage swing for NDKFF [13]. The proposed topology can reliably latch the input data for a clock swing as low as 0.5 V (half $V_{DD}$). Furthermore, the proposed topology exhibits relatively low sensitivity to voltage swing. RCSFF [12] is the only topology that can work with a clock swing as low as 0.4 V. However, the clock-to-Q delay significantly increases below 0.5 V, making this operating point impractical. Furthermore, RCSFF [12] consumes significantly more power than the proposed topology, as listed in Table 1.

Table 1: Comparison of the proposed topology with existing work under nominal operating conditions with a clock voltage swing of $0.7 \times V_{DD}$. Each topology is sized to achieve approximately equal clock-to-Q delay.

| Flip-flop topology             | CLK-to-Q delay (ps) | Overall power (µW) | PDP (fW·s) | Leakage power (nW) | Overall NMOS width (nm) | Overall PMOS width (nm) | Setup time (ps) | Hold time (ps) |
|--------------------------------|---------------------|--------------------|------------|--------------------|-------------------------|-------------------------|-----------------|----------------|
| L-C<sup>2</sup>MOS-SA [11]     | 69.7                | 8.0                | 0.56       | 37.2               | 2395                    | 2950                    | 27.9            | 0.9            |
| L-C<sup>2</sup>MOS-SA-2 [11]   | 71.3                | 9.3                | 0.67       | 34.5               | 4195                    | 3550                    | 17.5            | -9.2           |
| RCSFF [12]                     | 70.6                | 17.2               | 1.22       | 82.0               | 6840                    | 5500                    | -26.9           | 42.6           |
| NDKFF [13]                     | 69.4                | 12.0               | 0.83       | 196.6              | 5050                    | 3900                    | -6.0            | -62.7          |
| CRFF [14]                      | 70.3                | 10.5               | 0.74       | 201.5              | 4650                    | 5300                    | -20.2           | 92.9           |
| This work                      | 64.1                | 6.6                | 0.42       | 68.1               | 2650                    | 3450                    | -1.7            | 17.8           |

Figure 6: Effect of clock swing voltage level on clock-to-Q delay and power consumption for each flip-flop topology: (a) Clock-to-Q delay vs. clock voltage swing, (b) power consumption vs. clock voltage swing.

The dependence of power dissipation on clock voltage swing is shown in Fig. 6(b). According to this figure, the proposed topology exhibits the lowest power consumption at each clock swing voltage. Note that from 0.7 V to 0.6 V, the overall power is slightly reduced whereas from 0.6 V to 0.5 V, there is a slight increase. The overall effect of clock swing on power depends upon two factors: 1) partial reduction in power since the clock sub-circuit consumes less power with a lower swing, 2) partial increase in power due to a greater contention current with a lower clock swing. If the first factor outweighs the second factor, overall power is reduced as the clock swing is reduced. Note that CRFF [14] is the only topology where power consumption significantly increases with a lower clock swing, indicating the dominance of the second factor.

## 5.3 Robustness to Variations

A critical challenge in nanoscale ICs is the variations incurred during fabrication and fluctuations in operating voltage and temperature. The response of a circuit to these variations must be evaluated to assess its overall robustness. To investigate this issue, each flip-flop topology is simulated in the worst-case corner for delay, transient power, and leakage power. The results are listed in Table 2. Note that L-C<sup>2</sup>MOS-SA [11] and RCSFF [12] fail to latch the input data at the worst-case corner for delay, determined by the slow process models for both NMOS and PMOS transistors, a $V_{DD}$ of 0.9 V (90% of the nominal $V_{DD}$), and a temperature of 165°C.

The proposed topology exhibits the lowest clock-to-Q delay (approximately 20% lower, on average) in the worst-case corner even though each topology exhibits similar delays in the nominal case (see Table 1). This trend demonstrates that the clock-to-Q delay of the proposed topology exhibits the least sensitivity to process and environmental variations.
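The "approximately 20%" figure can be reproduced from the delay column of Table 2. The sketch assumes that the comparison is against the mean delay of the three existing topologies that do not fail at this corner; that interpretation is an assumption, since the paper does not spell out how the average is taken.

```python
from statistics import mean

# Worst-case CLK-to-Q delay (ps) at SS-0.9V-165°C, copied from Table 2.
# None marks a topology that fails to latch the data at this corner.
worst_case_delay_ps = {
    "L-C2MOS-SA [11]":   None,
    "L-C2MOS-SA-2 [11]": 178.8,
    "RCSFF [12]":        None,
    "NDKFF [13]":        210.0,
    "CRFF [14]":         175.7,
}
proposed_delay_ps = 150.8

# Only topologies that still operate at this corner enter the average.
working = [d for d in worst_case_delay_ps.values() if d is not None]
reduction_pct = 100 * (1 - proposed_delay_ps / mean(working))
print(f"{len(working)} working topologies, "
      f"proposed delay is {reduction_pct:.1f}% lower than their mean")
```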

Similar to the nominal case, the proposed topology consumes the least transient power in the worst case, as determined by the fast process models for both NMOS and PMOS transistors and a $V_{DD}$ of 1.1 V (110% of the nominal $V_{DD}$). Note that the temperature that corresponds to the worst-case corner for overall power depends upon the topology due to inverted temperature dependence [17]. Specifically, in RCSFF [12] and the proposed topology, the worst-case transient power occurs at the lowest temperature whereas the other topologies consume the largest power at the highest temperature. Inverted temperature dependence also applies to the worst-case delay analysis and therefore each topology is simulated with both the lowest and highest temperatures. However, the largest clock-to-Q delay occurs at the highest temperature in each topology, as shown by the second column in Table 2. Finally, the worst-case leakage power, determined by the fast process models, a $V_{DD}$ of 1.1 V, and the highest temperature, is provided in the last column. The trend is similar to the nominal results where the proposed topology consumes the second least leakage power, after L-C<sup>2</sup>MOS-SA [11].

The variation analysis provided in this section demonstrates that the proposed topology can reliably operate at the worst-case delay and power corners, unlike some of the existing topologies that fail at the worst-case delay corner. Furthermore, the proposed topology consumes the least worst-case power and exhibits the least sensitivity at the worst-case delay corner.

Table 2: Comparison of the proposed topology with existing work under worst-case operating conditions for clock-to-Q delay, overall transient power, and leakage power. FF and SS correspond, respectively, to fast and slow models for both NMOS and PMOS transistors.

| Flip-flop topology             | CLK-to-Q delay (ps) at SS-0.9V-165°C | Overall transient power (µW) at FF-1.1V | Leakage power (µW) at FF-1.1V-165°C |
|--------------------------------|--------------------------------------|-----------------------------------------|-------------------------------------|
| L-C<sup>2</sup>MOS-SA [11]     | Failure                              | 10.8 (T=165°C)                          | 0.6                                 |
| L-C<sup>2</sup>MOS-SA-2 [11]   | 178.8                                | 11.3 (T=165°C)                          | 0.6                                 |
| RCSFF [12]                     | Failure                              | 21.5 (T=-40°C)                          | 1.5                                 |
| NDKFF [13]                     | 210.0                                | 17.2 (T=165°C)                          | 2.5                                 |
| CRFF [14]                      | 175.7                                | 17.7 (T=165°C)                          | 3.8                                 |
| This work                      | 150.8                                | 8.9 (T=-40°C)                           | 1.1                                 |

# 6. CONCLUSIONS

In existing low swing clocking approaches, the clock signal is restored to full swing before arriving at the clock pins of the flip-flops since traditional flip-flops cannot reliably operate with a low swing clock signal. This approach, however, significantly limits the power savings since the last stage of a clock network typically has the largest capacitance and therefore benefits the most from low swing clocking. A novel static D flip-flop topology is proposed in this paper that can reliably operate with a low swing clock signal, thereby enabling a *true* low swing clocking methodology. The proposed topology is compared with existing low swing flip-flops, demonstrating average reductions of 38.1% in power consumption and 44.4% in power-delay product. It is shown that the proposed topology is less sensitive to clock voltage swing than existing circuits. Furthermore, worst-case corner analysis for clock-to-Q delay and power consumption demonstrates that the proposed flip-flop operates robustly.

# 7. ACKNOWLEDGMENTS

This research is supported by Semiconductor Research Corporation (SRC) under contract Nos. 2013-TJ-2449 and 2013-TJ-2450.

# 8. REFERENCES

- [1] E. Salman and E. G. Friedman, *High Performance Integrated Circuit Design*. McGraw-Hill, 2012.
- [2] R. Gonzalez, B. M. Gordon, and M. A. Horowitz, "Supply and Threshold Voltage Scaling for Low Power CMOS," *IEEE Journal of Solid-State Circuits*, Vol. 32, No. 8, pp. 1210-1216, August 1997.
- [3] J.-M. Chang and M. Pedram, "Energy Minimization Using Multiple Supply Voltages," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, Vol. 5, No. 4, pp. 436-443, December 1997.
- [4] R. Dreslinski *et al.*, "Near-Threshold Computing: Reclaiming Moore's Law Through Energy Efficient Integrated Circuits," *Proceedings of the IEEE*, Vol. 98, No. 2, pp. 253-266, 2010.
- [5] F. Chang *et al.*, "Practical Strategies for Power-Efficient Computing Technologies," *Proceedings of the IEEE*, Vol. 98, No. 2, pp. 215-236, 2010.
- [6] H. Zhang, G. Varghese, and J. M. Rabaey, "Low-Swing On-Chip Signaling Techniques: Effectiveness and Robustness," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, Vol. 8, No. 3, pp. 264-272, June 2000.
- [7] C. Sitik, E. Salman, L. Filippini, S. J. Yoon, and B. Taskin, "FinFET-Based Low Swing Clocking," *ACM Journal on Emerging Technologies in Computing Systems*, in press.
- [8] C. Sitik, L. Filippini, E. Salman, and B. Taskin, "High Performance Low Swing Clock Tree Synthesis with Custom D Flip-Flop Design," *Proceedings of the IEEE Computer Society Annual Symposium on VLSI*, pp. 498-503, July 2014.
- [9] J. Pangjun and S. Sapatnekar, "Low-Power Clock Distribution Using Multiple Voltages and Reduced Swings," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, Vol. 10, No. 3, pp. 309-318, June 2002.
- [10] F. H. A. Asgari and M. Sachdev, "A Low-Power Reduced Swing Global Clocking Methodology," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, Vol. 12, No. 5, pp. 538-545, May 2004.
- [11] Z. Jianjun and S. Yihe, "A low clock swing, power saving and generic technology based D flip-flop with single power supply," *IEEE International Conference on ASIC*, pp. 142-144, October 2007.
- [12] H. Kawaguchi and T. Sakurai, "A reduced clock-swing flip-flop (RCSFF) for 63% power reduction," *IEEE Journal of Solid-State Circuits*, Vol. 33, No. 5, pp. 807-811, 1998.
- [13] M. Tokumasu *et al.*, "A new reduced clock-swing flip-flop: NAND-type keeper flip-flop (NDKFF)," *Proceedings of the IEEE Custom Integrated Circuits Conference*, pp. 129-132, 2002.
- [14] D. Levacq *et al.*, "Half V<sub>DD</sub> Clock-Swing Flip-Flop with Reduced Contention for up to 60% Power Saving in Clock Distribution," *Proceedings of the IEEE European Solid State Circuits Conference*, pp. 190-193, 2007.
- [15] Cadence. Spectre. http://www.cadence.com.
- [16] E. Salman, A. Dasdan, F. Taraporevala, K. Kucukcakar, and E. G. Friedman, "Exploiting Setup-Hold Time Interdependence In Static Timing Analysis," *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, Vol. 26, No. 6, pp. 1114-1125, June 2007.
- [17] A. Dasdan and I. Hom, "Handling Inverted Temperature Dependence in Static Timing Analysis," *ACM Transactions on Design Automation of Electronic Systems*, Vol. 11, No. 2, pp. 306-324, April 2006.