Preface

Welcome to Volume 3, Number 1 of the International Journal of Design, Analysis and Tools for Integrated Circuits and Systems (IJDATICS). This issue comprises enhanced and extended versions of research papers from the International DATICS Workshops in 2011.

The DATICS Workshops were created by a network of researchers and engineers from both academia and industry in the areas of i) Design, Analysis and Tools for Integrated Circuits and Systems and ii) Communication, Computer Science, Software Engineering and Information Technology. The main aim of the DATICS Workshops is to bring together software/hardware engineering researchers, computer scientists, practitioners and people from industry to exchange theories, ideas, techniques and experiences.

This IJDATICS issue presents four high quality academic papers. This mix provides a well-rounded snapshot of current research in the field and a springboard for future work and discussion. The four papers presented in this volume are summarized as follows:

• Hardware/Software Testing: Philemon and Chiraz investigate how advanced hardware/software testing methodologies can be used to engineer more efficient and flexible embedded hardware circuits and systems.
• Formal Methods: Krilavičius applies formal methods to address several crucial problems in a radiation therapy system.
• Analog Circuit: Wey presents a precise and high linearity power supply noise monitor circuit design.

We are beholden to all of the authors for their contributions to the DATICS Workshops in 2011. We would also like to thank the IJDATICS editorial team.

Editors:
Ka Lok Man, Xi’an Jiaotong-Liverpool University, China, Myongji University, South Korea and Baltic Institute of Advanced Technology (BPTI), Lithuania
Chi-Un Lei, University of Hong Kong, Hong Kong
Kaiyu Wan, Xi’an Jiaotong-Liverpool University, China
Chi-Hua Chen, National Chiao Tung University, Taiwan

Table of Contents
Vol. 3, No. 1, March 2012

Preface
Table of Contents
1. Reconfigurable Test Architecture for Online Concurrent Fault Detection, Diagnosis and Repair
   Philemon Daniel, Rajeevan Chandel
2. Specification and Verification of Radiation Therapy System with Respiratory Compensation using Uppaal
   Tomas Krilavičius, Kaiyu Wan, Kevin Lee, Ka Lok Man
3. A Fault Tolerant Adder Based On Alternative Computation
   Chiraz Khedhiri, Mouna Karmani, Belgacem Hamdi, Ka Lok Man
4. A Precise and High Linearity Power Supply Noise Monitor Circuit
   I-Chyn Wey, Chien-Chang Peng, Yu-Jiang Liao, Yu-Sheng Yang

INTERNATIONAL JOURNAL OF DESIGN, ANALYSIS AND TOOLS FOR INTEGRATED CIRCUITS AND SYSTEMS, VOL. 3, NO. 1, MARCH 2012

Reconfigurable Test Architecture for Online Concurrent Fault Detection, Diagnosis and Repair

Philemon Daniel and Rajeevan Chandel

Philemon Daniel and Rajeevan Chandel are with the Electronics & Communication Engineering Department, National Institute of Technology Hamirpur 177 005, Himachal Pradesh, India. E-mail: {phil_dani, rchandel}@nith.ac.in.
The authors duly acknowledge technical and financial support from DIT, MoCIT, Govt. of India, New Delhi, through SMDP-II Project at NIT Hamirpur HP, India.

Abstract: A complete and versatile online test solution based on a reconfigurable test architecture is presented in this paper. The reconfigurable test architecture works alongside the controllers for online concurrent fault detection. The output vectors of the controllers are concurrently monitored and any fault present is detected within a few cycles of the sensitization of the fault. The architecture is then reprogrammed to a similar set of diagnostic hardware to locate the sub block which is the cause of the fault. The same architecture is then reprogrammed to replace the faulty block, thereby completing repair. The test architecture is designed based on configurable logic blocks. The design has several advantages viz. (i) it works well for critical VLSI controllers where shutting down or suspending the operation of a controller for testing is not possible and where the fault needs to be detected at the earliest, during the run time of the system, (ii) after a fault is detected, diagnosis can be performed online, (iii) once a faulty block is located, repair is also done online. Since fault detection, diagnosis and repair are completed online with one test hardware, the effective hardware overhead is negligible and the system can resume its function within a brief period. The applicability of the architecture is demonstrated for the control blocks in OC8051.

Index Terms: concurrent test, multiple controllers, online test, output vector monitoring, programmable architecture

I. INTRODUCTION

In most embedded system applications, real-time computing is used. Operations execute within strict constraints called system deadlines. The anti-lock brakes on a car are a simple example of a real-time computing system. The controllers that manage anti-lock brakes must continue to perform their intended real-time function during their entire lifetime. The controllers, which are very large scale integration (VLSI) circuits, can become faulty in the course of their operation. Some faults that arise later in the lifetime, because of electro-migration, stress, time-dependent dielectric breakdown or thermal cycling, make estimation of mean-time-to-failure (MTTF) very difficult at design time and consequently call for failure detection at runtime [1]. The faults that occur during the lifetime of an integrated circuit (IC) can be classified as follows [2]:

1) Permanent faults are those which remain indefinitely in the system. Most of these are manufacturing faults or design errors. A few of these are caused by major environmental changes or physical damage of the chip. Their effect may remain for the entire lifetime.
2) Intermittent faults may appear for a brief period of time, then disappear and reappear after a relatively longer period. As there are no fixed parameters that cause these faults, predicting them seems to be an impossible task. The two major reasons for their occurrence are marginal dimensions in manufacturing and tight constraints during the design. Since the system works well most of the time and under most conditions, testing for and diagnosing these faults is a major concern.
3) Transient faults mostly appear for a much shorter period and disappear quickly. Their appearances are rare and are mainly caused by environmental variations.

One of the common methods of testing circuits after they are placed in the field is using built-in self-tests (BISTs). These are both effective and practical for most offline tests. The advantages of BIST include the capability of performing at-speed testing, high fault coverage, elimination of test generation effort and less reliance on expensive external testing equipment for applying and monitoring test patterns [3]. Programmable BISTs offer much larger flexibility for deeply embedded components [4]. Owing to these advantages, BIST offers a very cost effective test package.

The test methods are further divided into offline and online BISTs. Tests in which the system is shut down completely, or the circuit is detached from the field, and the inputs/outputs are captured by the BIST circuitry are called offline BIST. Online BIST is where the operation of the circuit might be temporarily suspended to change the mode into test mode and run a BIST. Here a test generator (TG) applies the test vectors to the circuit either in a random or in a deterministic way. A response verifier (RV) verifies the captured output. The compiled response is finally analysed to determine if a fault is present. There are several proposals to tackle both online and offline BIST. However, these methods are not sufficient to handle concurrent online testing.

In the present work a complete online test solution is presented. It concurrently detects faults in controllers, locates the design block that is faulty and replaces the faulty block with a fault free one. The entire test hardware is built from configurable logic blocks (CLBs). Concurrent online testing is carried out by simultaneously monitoring the outputs of multiple controller blocks and dynamically generating signatures, which are in turn monitored for faults. For the purpose of diagnosis and repair, each of the controllers is divided into blocks of similar gate counts. The hardware used for fault detection is rerouted to target each block and diagnosis is performed. Once a faulty block is detected, the function of the block is programmed on the CLBs which have been used for test, which replaces the faulty block, thus completing repair.

The paper is organized as follows. In Section II, the various challenges for concurrent online testing and diagnosis are discussed. In Section III, the principle behind the design of the complete online test solution, i.e. the Reconfigurable Architecture for Online - Detection, Diagnosis and Repair (RAO-DDR), is presented. Section IV presents the Concurrent Online Test Architecture for Multiple Controllers (COTA-MC). In Section V the online fault diagnosis process is given. Section VI deals with the fault repair procedure. In Section VII, the proposed design is validated by implementing it for the control blocks of OC8051. Finally, conclusions are drawn in Section VIII.

II. CHALLENGES FOR CONCURRENT ONLINE TEST AND DIAGNOSIS

Online testing addresses the detection of faults that emerge during the operation of the system. These are mainly the intermittent and transient faults. Online testing is especially important for critical applications and those applications which are in high demand. These systems are not expected to fail without warning. Online testing provides an option to avoid catastrophe if a system fails. Once the test detects an error, the system performs one or a few of the following tasks to adjust to the error: i) it saves the critical data, ii) issues a warning or switches to a different module, iii) steps down the performance of the system, iv) starts a repair sequence, v) starts a reconfiguration mechanism and/or vi) shuts down the system.

Online test can be done with a setup outside the system, either with the help of software or with hardware alone. But the external setup does not have sufficient external pins to monitor the entire complex hardware within. Also, not all internal faults show up on the pins, and external monitoring is expensive. Internal online testing is the alternative method to test ICs on the system. Testing is internal if it takes place on the same substrate as the design-under-test (DUT) within the system-on-chip (SoC).

Online testing can be further divided into concurrent and non-concurrent testing. In non-concurrent testing the DUT is tested while the normal operation of the circuit is temporarily suspended or during the shutdown or boot sequence. For critical applications where the operation of the circuit cannot be suspended, testing has to be carried out during the normal functioning of the circuit. This kind of testing is called concurrent online testing. Normal online testing methods do not work for concurrent testing, nor do the external online testing schemes. The major parameters to be considered while designing a concurrent online test scheme are:

Concurrent Test Latency (CTL): It is the number of normal functional inputs that must be applied to the circuit under test (CUT) while the CUT operates normally in order to complete the concurrent test process. This is an important factor as it determines how quickly the functional vectors achieve the expected fault coverage. If this measure is high, then the CUT probably has to wait for a higher number of cycles for all the targeted faults to be covered [13].

Fault Latency (FL): It is the time taken for the concurrent test to detect the fault from the time it actually appeared. A related parameter is error latency (EL), defined as the time taken to detect an error from the time the error gets activated by the input vector. FL is the most important factor since it gives information about the number of cycles that go by without the fault being detected. EL helps to assess and design the monitor circuit in a better way to capture the effect of the fault as soon as possible after it gets activated.

Fault Coverage (FC): It is the fraction of the targeted faults for a particular CUT that are detected by a specific test or a test set. Circuits that are critical require very high fault coverage in each of the fault categories.

Area Overhead (AO): It is the number of gates that are needed to complete online testing over and above the gate count of the original design. Even though area overhead is not a major factor, it affects scalability. If area scales proportionally then area overhead becomes important.

Concurrent online testing was initially carried out by using watchdog timers [5]. Watchdog timers alone proved to be inefficient, because they only confirm whether control flow traverses properly. Later, redundancy was introduced. One scheme is duplication with comparison (DWC) [6], in which the outputs of two copies of the same circuit operating in tandem are compared. This can detect only a single error, at 100% area overhead, and is still inefficient because it is difficult to synchronize the two copies. The method has been further improved by comparing the outputs of three identical circuits receiving the same inputs. The result is interpreted based on majority. However, the area overhead increases to 200%. Another method is to do the same operation twice and compare the two results. This method is called double-execution or retry. Transient faults are likely to be detected. Although this technique is area efficient, it introduces a lot of time redundancy. For applications where run time is critical, these methods cannot be used. But even otherwise, the time penalty it imposes is too high. Another similar method is recomputing with shifted operands (RESO). Here, a coding based method of parity checking is used, especially for detecting memory and data transmission errors [6].

The initial work on vector monitoring concurrent BIST (C-BIST) was reported by Saluja [7]. The test generator of C-BIST is a linear feedback shift register (LFSR) and the active test set consists of exactly one active test vector, i.e. the current value of the LFSR. C-BIST has low hardware overhead but very high CTL. This is because in every clock cycle the input vector is compared with only one active test vector. To drive down the CTL, four techniques have so far been reported in the literature viz. i) Multiple Hardware Signature Analysis Technique (MHSAT) [8], ii) Order Independent Signature Analysis Technique (OISAT) [9], iii) windowed-Comparative Concurrent BIST (w-CBIST) [10] and iv) RAM-based Concurrent BIST (R-CBIST) [11]. These decrease the CTL by increasing the number of active test vectors.

Built-In Concurrent Self-Test (BICST) has been proposed by Sharma and Saluja [12]. When BICST is applied to an n-input m-output combinational CUT that can be tested with T vectors, it utilizes a T-line × (n+m)-column PLA. In [13], an input vector monitoring concurrent BIST technique based on a precomputed test set (MICSET) is given. This scheme is based on a test set stored in a mapping logic module which can be implemented with either random logic or a ROM whose address inputs are driven by a subset of the input bits of the CUT. This scheme again suffers from a very high CTL and hence very high fault latency. Since the hardware overhead scales along with the size of the CUT, this scheme is not a workable solution. For systems whose continuous functioning is of utmost importance, online concurrent testing with minimum area overhead and minimum fault latency, as presented here, is the best solution.

Diagnosis has been almost completely offline, since online diagnosis is expensive and the online diagnostic resolution achieved has been very low. Moreover, there is not much one can do after online diagnosis because repair of logic blocks has again been an almost impossible task. One method of diagnosis is using external hardware or a reconfigurable FPGA connected to the chip, which is a very long and tedious process [19]. As mentioned earlier, after completing diagnosis there are no efficient repair methods to replace logic.

An efficient diagnosis and repair method is also proposed which solves most of the problems mentioned above.

III. RECONFIGURABLE ARCHITECTURE FOR ONLINE - DETECTION, DIAGNOSIS AND REPAIR

In order to facilitate a complete test solution, the test hardware is reconfigurable logic, namely lookup table (LUT) based configurable logic blocks (CLBs). A CLB can be made up of sub-components called slices, and each slice can have one or two 6-input LUTs, a full adder (FA) and one or two D flip-flops. A test architecture of 15 CLBs is considered.

The proposed test architecture is configured first for concurrent fault detection of either one or multiple controllers. This is achieved by routing the outputs of the controller to the COTA-MC. The outputs are monitored and dynamic signatures are generated. When an unexpected signature is detected, a fail signal is asserted. Fault detection is explained in detail in the next section.

In order to make diagnosis possible, the controller under test is segmented into a number of smaller blocks, each with approximately half the gate count of one CLB. A controller like the one shown in Fig. 1 is divided into sub-blocks. Each block has a switch matrix at its input and output. The inputs go through the switch matrix and the outputs come through the switch matrix. A switch matrix consists of a few pass transistors and has high speed. The states of the pass transistors are set by programming SRAM cells. These switch matrices create a tap to the inputs and outputs of each of the blocks. They are also capable of disassociating a block from its inputs and outputs. The taps to the outputs help in diagnosis. The option of disassociating a block from its inputs and outputs helps in repair of the block.

Fig. 1. Sub blocks of a module connected with RAO-DDR through switch matrices

IV. CONCURRENT ONLINE TEST ARCHITECTURE FOR MULTIPLE CONTROLLERS

The Concurrent Online Test Architecture for Multiple Controllers (COTA-MC) [18] is used here for fault detection in RAO-DDR. Its effectiveness has been established based on its implementation on the controllers of OC8051 [17] as shown in Fig. 2. OC8051 is chosen because in most embedded system applications at least a single microcontroller is used. The method of testing used is non-intrusive and adds no performance overhead to the circuit under test. The normal program execution is the only input required for the CUT. Furthermore, unlike most other BIST methods, there is no requirement for an extra test pattern generator. The outputs of the controllers are passed through a set of scramblers and then through a series of XOR gates. The result is then fed to two compactors viz. the multiple input signature register (MISR) and the accumulator based compactor (ABC). These are further processed by a set of rule sets implemented on a PLA which gives the final Pass/Fail signal. The entire COTA-MC is housed on the CLBs of RAO-DDR.

Fig. 2. COTA-MC

The registers of COTA-MC are reset when either the system reset is given, or the program counter (PC) reaches a particular address, or COTA-MC’s reset is provided. For many embedded systems and systems with critical applications there are two facts which are exploited in the present architecture. One is that there is a specific program cycle that gets executed repeatedly. The other is that the program is loaded once in the system and is not changed unless the normal operation of the system is suspended. Therefore, it is a safe assumption that during the normal operation of a circuit, it is sufficient if the system works fine for the current program that is loaded into the system. A reset signal is generated when PC reaches a pre-programmed value. COTA-MC’s reset can be multiplexed with an external input pin.

To demonstrate the architecture and its ability to simultaneously test multiple controllers, two of the controllers of OC8051 are chosen. One is the decoder and the other is the universal asynchronous receiver transmitter (UART). The outputs of the decoder and the UART each pass through a two-stage scrambler. The function of a scrambler is to shuffle the data bits in a predetermined manner. One way to do this is to shuffle them based on the XOR-ed output of two constantly changing bits. For example, two bits of the opcode have been used here. A typical scrambler is shown in Fig. 3. The 4-bit inputs A, B, C and D are shuffled based on the parity (Y) generated by the two opcode bits B(0) and B(1). If Y=0, the outputs of the scrambler A’, B’, C’ and D’ will be A, C, B and D; and if Y=1, the outputs will be B, D, C and A. This shuffling based on the opcode bits increases the probability of error detection. A two-stage scrambler is used to thoroughly shuffle the data bits. Both the decoder and the UART have their respective two-stage scramblers.

Fig. 3. Scrambler

The data bits are preserved at the output of the scrambler, but they appear shuffled. In the next stage, the data bits from the decoder are XOR-ed with the data bits from the UART. This is done for two reasons. Firstly, it reduces the hardware overhead of having separate signatures generated. Secondly, an error in one of the bits will generate a different output at the XOR gate.

The XORed bits are then fed to two different compactors, the ABC and the MISR. ABC [14, 15] is chosen since it is proven to have negligible aliasing, provides extremely high fault coverage, has very little hardware overhead and can work effectively for a very large number of cycles with little or no error cancellation. The ABC used is shown in Fig. 4. Each output word of the scrambler (n bits) is added to the contents of the register of the accumulator (m bits; m>n) and the result is in turn added to the next word of the output and so on. The ABC is usually used to produce a unique signature. MISR is the other well known compactor, chosen because of its small area, and acts as an additional signature register along with ABC. The probability of an error escaping both signature registers is extremely low.

Fig. 4. Accumulator Based Compactor

COTA-MC does not wait until the entire program cycle is over to read out the signature registers. The outputs of the signature registers are supplied to the next stage, where a set of rules is implemented using either a PLA structure or one of the CLBs. It is noticed that some combinations of the output bits of the ABC and MISR do not occur during the fault free execution of the program cycle. Those bit combinations are chosen as rule sets and are checked in every valid program cycle. The rule sets are combinational functions that check the cumulative validity of the signatures up to the previous program cycle. Three rule functions for each of the signature registers are chosen in the present work and the six rule function outputs in total are ORed together to generate the Fail function. If a violation is found, then the fail signal is asserted. If there is a fault in either the decoder or the UART, the signatures in both ABC and MISR change and violate at least one rule set in the next few cycles.

The present COTA-MC can be used for both online and offline testing and is fully customizable according to the test requirements of the CUT. Online testing is completed while the CUT is performing its normal operation. All the faults in the false paths, i.e. the non-functional paths, are naturally excluded. Since it is based on output vector monitoring, the architecture is scalable and hence hardware overhead becomes negligible for larger controllers.

V. ONLINE FAULT DIAGNOSIS

For online fault diagnosis, the controller has to temporarily suspend operation. As explained earlier, the controller is divided into several smaller sub blocks for easy isolation of a faulty block. The objective of diagnosis is not to locate the gate or transistor that is faulty. Rather, the objective is to isolate the block that is faulty and replace that block with a fault free one. A flow chart explaining the online diagnosis process is given in Fig. 5. The controller that is to be diagnosed is considered one sub block at a time. For each sub block, the controller is run for a regular cycle. The switch matrices are used to tap the outputs of the block under consideration. The outputs are fed to the COTA-MC as explained in the previous section. The COTA-MC is configured into the reconfigurable test architecture. In each cycle, the dynamic signatures are watched to decide if the fault occurs in this particular sub block. When a fault occurs, the fail signal is asserted. A slightly simpler option can also be adopted for diagnosis. Instead of watching the dynamic signatures, the final signatures can be compared after the full run of the instruction sequence. In both cases the faulty block is located, but in the latter case the detection will only happen when the test cycle ends. If no fault is detected in this sub block, then this sub block is released and the next sub block’s output is similarly connected to the COTA-MC through the switch matrices. The controller is executed for a new test cycle. This process is repeated for each of the blocks until the faulty block is detected. If no fault is detected, then there is no fault in any of the sub blocks. Thus diagnosis is completed with the chip on board.

Fig. 5. Flow chart for online diagnosis

VI. ONLINE FAULT REPAIR

Once a faulty sub block is identified, the corresponding switch matrices are programmed to isolate the sub block from the controller and the inputs are rerouted to the test architecture. The test architecture is reconfigured to perform the same function as that of the faulty sub block that needs replacement. The outputs of this reconfigured block are routed back through the output switch matrix of the faulty sub block. So the faulty sub block is replaced with the reconfigured test architecture. This repair is feasible only when timing closure for the faulty block is not very tight. As the pass transistors are fast, their delay can be neglected. The delay from the CLBs proves to be a few extra gate delays. In the worst case, if the clock frequency can be lowered a little, this repair becomes better. There is a small compromise on performance but the system can continue to work. An example is shown in Fig. 6, where sub block 5 is identified to be faulty. Sub block 5 is isolated, the inputs are routed through the input switch matrix of sub block 5 to the test architecture, which performs the function of sub block 5, and the outputs are routed back to the output switch matrix of sub block 5.

Fig. 6. Repair for the faulty sub block 5

VII. DESIGN VALIDATION

The RAO-DDR scheme is implemented within the OC8051 microcontroller. A set of 15 CLBs is used to implement the COTA-MC for online concurrent fault detection and fault diagnosis. The CLBs are then reprogrammed to replace the faulty block as explained. The hardware is the same for fault detection, diagnosis and repair. There is no extra area overhead. The same architecture works for all controllers whose sub-blocks can fit within the 15 CLBs and whose outputs are limited to the allocated CLB inputs. Each CLB is approximately equivalent to 230 gates. The architecture is scalable for larger circuits. It only depends on the number of output lines and the size of the sub-blocks. The minimum overhead required for the test architecture comes from the COTA-MC. The hardware calculation is given in Table I.

TABLE I
AREA OVERHEAD CALCULATION
Block | Hardware overhead | Gates
Scrambler1 | 2n Muxes + 1 XOR gate | 6n+1
Scrambler2 | 2k Muxes + 1 XOR gate | 6k+1
XORs | n XOR gates | n
ABC | < (5 x m) gates + m x DFF | m x 13
MISR | < n XOR gates + n x DFF | n x 9
PLA rule-set | 4 inputs x 18 AND gates + 7 inputs x 3 OR gates + 1 | 4 x 18 + 7 x 3 + 1
Total Gates | | 6n+1 + 6k+1 + n + m x 13 + n x 9 + 94

TABLE II
PERFORMANCE ANALYSIS OF THE BENCHMARK MICSET AND THE PROPOSED RAO-DDR SCHEMES
Parameters | MICSET [13] | RAO-DDR (Proposed)
Fault models | Stuck-at only | All fault models whose fault effect propagates to the output of the CUT
Circuits | Good only for small combinational circuits | Works well for any combinational and sequential circuit
Fault coverage (FC) | Good, but covers faults in false paths too | Good. Does not cover faults in false paths
Fault Latency (FL) | Low | Low
Concurrent Test Latency (CTL) | Extremely high. 5.37 × 10^18 cycles for “c880” circuit with 383 gates | Well within reasonable limits. ~276,300 for the current DUT
Hardware area overhead (AO) | 2005% of the DUT (increases proportionally with increase in inputs/outputs) | 50% of the DUT (increases only marginally with increase in inputs/outputs)
Hardware dependency | Increases as the number of inputs increases | None
Aliasing | Error cancellation and error leakage | Nil
Scalability | Not scalable | Very much scalable because of its generic structure
Multi block support | Nil | Yes (e.g. Decoder + UART)
Diagnosis | Not attempted | Yes (identification of the faulty sub block)
Repair | Not attempted | Yes (replacement of the faulty sub block)
Practical significance | None, because of unrealistic FL, CTL and very high area overhead | Very good, particularly for controllers

A comparative analysis of the proposed architecture is also carried out with the benchmark MICSET [13], which also attempts online concurrent BIST, and is shown in Table II. It is seen from Table II that RAO-DDR is better in most aspects. Both schemes are implemented for the decoder with 13 inputs and 32 outputs and a UART with 20 inputs and 12 outputs of OC8051, with a total gate count of 3279. The total number of test vectors needed to achieve sufficient coverage is 3699. As MICSET depends on the number of input vectors, there is a huge increase for pipelined circuits. In contrast, to implement COTA-MC as part of RAO-DDR, the hardware overhead is within reasonable limits.

The benchmark is not good for pipelined circuits and hence cannot be used efficiently with software based self test (SBST) methods. Its FC is good, but it covers a large number of faults in the false paths [16]. Consequently, CTL increases unrealistically. COTA-MC is good in all these aspects, especially with the CTL as it does not cover false paths. The online diagnosis and online repair schemes are unique to RAO-DDR. The proposed RAO-DDR is very good for practical applications, mainly for controllers and controller-like circuits.

VIII. CONCLUSION

In the present work, an all-inclusive test architecture is presented and shown to be feasible for controller-like modules. The proposed RAO-DDR is capable of concurrently detecting faults during system operation. Once a fault is detected, it is able to locate the fault up to a sub-block and repair the faulty sub-block. All these are achieved without any additional hardware overhead because of the reprogrammable logic used for the test architecture. The practical effectiveness of the method is demonstrated by applying the scheme to the controllers of OC8051, and it is compared with the concurrent online fault detection methods. RAO-DDR proves to be far better in most aspects. It is a complete online test solution and is one of its kind. Methods to improve diagnostic resolution and efficiency in repair methods are being investigated as further scope of this research work.

REFERENCES

[1] J. Srinivasan, S. V. Adve, P. Bose, and J. A. Rivers, “Exploiting structural duplication for lifetime reliability enhancement,” Proc. 32nd Int. Sym. Comput. Arch. (ISCA), 2005, pp. 520-531.
[2] H. Al-Asaad and M. Shringi, “On-Line Built-In Self-Test for Operational Faults,” Proc. of Conf. Systems Readiness Technology, 2000, pp. 168-174.
[3] M. Abramovici, M. Breuer, and A. Friedman, Digital Systems Testing and Testable Design. Computer Science Press, 1990.
[4] P. Philemon Daniel and Rajeevan Chandel, “A Flexible Programmable Memory BIST Architecture,” IETE Journal of Education, vol. 51, pp. 67-74, Dec. 2010.
[5] A. Mahmood and E. McCluskey, “Concurrent error detection using watchdog processors-A survey,” IEEE Trans. on Computers, vol. C-37, no. 2, pp. 160-174, Feb. 1988.
[6] B. W. Johnson, Design and Analysis of Fault Tolerant Digital Systems, Addison-Wesley, Reading, Massachusetts, 1989.
[7] K.K. Saluja, R. Sharma, and C.R. Kime, “A Concurrent Testing Technique for Digital Circuits,” IEEE Trans. Computer-Aided Design, vol. 7, no. 12, pp. 1250-1260, Dec. 1988.
[8] K.K. Saluja, R. Sharma, and C.R. Kime, “Concurrent Comparative Testing Using BIST Resources,” Proc. Int. Conf. Computer Aided Design, Nov. 1987, pp. 336-339.
[9] K.K. Saluja, R. Sharma, and C.R. Kime, “Concurrent Comparative Built-In Testing of Digital Circuits,” Technical Report ECE-8711, Dept. of Electrical and Computer Eng., Univ. of Wisconsin, 1986.
[10] I. Voyiatzis and C. Halatsis, “A Low Cost Concurrent BIST Scheme for Increased Dependability,” IEEE Trans. Dependable and Secure Computing, vol. 2, no. 2, pp. 150-156, April-June 2005.
[11] I. Voyiatzis, A. Paschalis, D. Gizopoulos, N. Kranitis, and C. Halatsis, “A Concurrent Built-In Self Test Architecture Based on a Self-Testing RAM,” IEEE Trans. Reliability, vol. 54, no. 1, pp. 69-78, Mar. 2005.
[12] R. Sharma and K.K. Saluja, “Theory, Analysis and Implementation of an On-Line BIST Technique,” VLSI Design, vol. 1, no. 1, pp. 9-22, 1993.
[13] I. Voyiatzis, A. Paschalis, D. Gizopoulos, C. Halatsis, F.S. Makri, and M. Hatzimihail, “An Input Vector Monitoring Concurrent BIST Architecture Based on a Precomputed Test Set,” IEEE Trans. on Computers, vol. 57, no. 8, pp. 1012-1022, Aug. 2008.
[14] J. Rajski and J. Tyszer, “Test Responses Compaction in Accumulators with Rotate Carry Adders,” IEEE Trans. Computer-Aided Design of Integrated Circuits and Systems, vol. 12, no. 4, pp. 531-539, Apr. 1993.
[15] K. Chakrabarty and J.P. Hayes, “On the Quality of Accumulator-Based Compaction of Test Responses,” IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems, vol. 16, no. 8, pp. 916-922, Aug. 1997.
[16] R. Ernst and W. Ye, “Embedded Program Timing Analysis based on Path Clustering and Architecture Classification,” Int. Conf. on Computer-Aided Design, IEEE/ACM Digest of Technical Papers, pp. 598-604, 9-13 Nov. 1997.
[17] 8051, 2010. Available: http://opencores.org/project
[18] P. Daniel and R. Chandel, “Concurrent Online Test Architecture for

Philemon Daniel received his B.E. degree in Electronics & Communication Engineering from JJCET, Bharadidasan University in 2001. He received his M.Tech degree in VLSI Design from Vellore Institute of Technology in 2005. He worked with Sasken Communications Ltd(I) Bangalore, before joining as a Faculty Engineer under SMDP-II in 2006 at NIT Hamirpur. Presently he is working as an Assistant Professor in the E&CE Department, National Institute of Technology, Hamirpur HP India. His research interests include processor based self test, programmable BIST, online testing and advanced logic design. He is currently pursuing his Ph.D on VLSI Self Test at NIT Hamirpur. He is an IEEE member.

Rajeevan Chandel received her B.E. degree in Electronics & Communication Engineering from Thapar University, Patiala, India in 1990. She is a double gold medalist of Himachal Pradesh University, Shimla, India, in Pre-University and Pre-Engineering in 1985 and 1986 respectively. She did her M.Tech.
in Integrated Multiple Controller Blocks with Minimum Fault Latency," Ninth IEEE Electronics and Circuits, from Indian Institute of International Symposium on Parallel and Distributed Processing with Technology (IIT), Delhi India in 1997. She was awarded Applications Workshops, pp.45-49, 26-28 May 2011. Ph.D. Degree from IIT Roorkee, India under QIP scheme of Govt. of India in [19] I. Mandjavidze and T. Romanteau, "Embedding Online Test and 2005. Dr. Chandel joined Department of Electronics & Communication Monitoring Features in Real Time Hardware Systems," 17th Engineering, NIT Hamirpur HP India, as Lecturer in 1990, where presently she IEEE-NPSS Real Time Conference, pp.1-8, 24-28 May 2010. is working as an Associate Professor and has been the Head of the Department from 2006 to 2009. She has five MHRD, MCIT sponsored projects to her credit from Govt. of India. She has 35 research papers in international and national journals of repute and over 75 in conferences. Her research interest is in Electronics circuit modeling and low power VLSI design. She is a life member of IETE(I) and ISTE(I) and member of VSI.. INTERNATIONAL JOURNAL OF DESIGN, ANALYSIS AND TOOLS FOR INTEGRATED CIRCUITS AND SYSTEMS, VOL. 3, NO. 1, MARCH 2012 8 Specification and Verification of Radiation Therapy System with Respiratory Compensation using Uppaal Tomas Krilavičius, Kaiyu Wan, Kevin Lee, and Ka Lok Man Abstract—The goal of radiation therapy is to give as much and analysis of diverse systems. The main reasons of the dose as possible to the target volume of tissue and avoid giving formal methods’ popularity are the following. any dose to a healthy tissue. Advances of the digital control allow performing accurate plans and treatments. Unfortunately, motion compensation during the treatment remains a considerable prob- • Unambiguous models. Formal modeling languages al- lem. 
Currently, a combination of the different techniques, such low defining systems unambiguously, because syntax as gating (restricting movement of patient) and periodic emission and semantics are defined formally, and those languages are used to avoid damaging healthy tissue. This paper focuses on include means to define non deterministic and stochastic systems that completely compensate respiratory movement (up to behavior precisely, too. Moreover, for the same reasons, certain limit) and start by investigating adequacy of the existing hardware and software platform. unambiguous refinement and code generation techniques In this paper a radiation therapy system consisting of a can be applied. HexaPOD couch with 6-degrees movement, a tracking camera, a • Strict analysis techniques. Because models are defined marker (markers) and a controller is modeled. A formal un-timed using languages with strict semantics, rigorous reasoning model was evaluated and found to be insufficient to completely about models is possible. E.g., model checking, theorem determine adequacy of the system to compensate respiratory motion. Therefore, un-timed model was extended to include time proving and specifically designed algorithms can be used. and investigated. It provides more information than un-timed model, but does not answer all interesting question. Therefore, Quite a few techniques and tools were defined over the based on the results further research directions are sketched. years, e.g. process algebras [6]–[11], timed automaton [12], Index Terms—simulation, verification, formal methods, radia- hybrid automaton [13], SPIN [14] and Uppaal [15] tools and tion treatment, quality assurance a lot more, see [16], [17] for a wider overview. Successful application of formal techniques is reported in different ar- eas, e.g. automotive industry [18], electronics [19], industrial I. I NTRODUCTION devices control [20] and other. 
T HE goals of the radiation therapy is to give as much dose as possible to the target volume of tissue and avoid giving any dose to a normal tissue. Advances of the computer- This paper investigates applicability of timed automaton [12] and Uppaal tool [15] for the design and functional analysis of a radiation therapy system consisting of a Hexa- based control allow planning and performing accurate plans POD couch with 6-degrees movement, a tracking camera, a and treatments, however motion compensation during treat- marker (markers) and a controller. Uppaal is an integrated ment remains a considerable problem. Different techniques tool environment for modeling, validation and verification of to cope with such problem are analyzed in [1]. Usage of real-time systems modeled as networks of timed automata, gating combined with external surrogates is overviewed in extended with data types and other convenient constructions [2]. However, most of the research models and try to predict [15]. In [21] an un-timed version of the model was presented. movement of the tumor, e.g. [3]–[5]. This paper, on the other However, the model is to abstract to determine adequacy of hand, is interested in modeling hardware and software, which the system for a respiratory motion compensation task. There- is supposed to conform to the requirements, i.e. process images fore, it was extended to include some timing properties and and move precisely and fast. Formal methods are used for such analyze some functional properties in [21]. In this paper timed analysis, because they provide means for rigorous modeling model and timing aspects are presented in detail. Moreover, functional properties, i.e. absence of deadlocks, liveness and T. Krilavičius is with Faculty of Informatics, Vytautas Magnus University, safety, are analyzed. Informatics fac., Kaunas, Lithuania, and Baltic Institute of Advanced Tech- nology. E-mail: (see http://www.surface.lt/krilaviciust). 
Kaiyu Wan is with Xi’an Jiaotong-Liverpool University, Dept. of Computer Science and Software Engineering, 111 Ren’ai Road, Suzhou, Jiangsu 215123. E-mail: kaiyu.wan@xjtlu.edu.cn
K. Lee is with School of I.T, Murdoch University, Australia. E-mail: kevin.lee@murdoch.edu.au
K.L. Man is with Xi’an Jiaotong-Liverpool University, Dept. of Computer Science and Software Engineering, 111 Ren’ai Road, Suzhou, Jiangsu 215123, and Myongji University, South Korea. E-mail: ka.man@xjtlu.edu.cn

In Section II a detailed description of the radiation treatment system is provided. Then Uppaal and timed automaton are concisely introduced in Section III. In Section IV an Uppaal model of the radiation treatment system is presented, some of its properties are checked, and its applicability to further analysis is discussed. Future plans and conclusions are discussed in Section VI.

II. RADIATION TREATMENT SYSTEM

The radiation treatment system under analysis¹, depicted in Fig. 1, consists of the following components:

• Patient Setup Couch is used to position the patient for the treatment, in our case the HexaPOD couch [22], [23].
• External Radiation Beam Source, usually produced by a medical linear accelerator, in short, linac. In the current stage of our study it is not important, because the behavior of the couch, the tracking device and the controller is analyzed.
• Tracking Device provides information about the position of the patient. Different means and techniques can be used to perform it, see [1] for the details. A system with a stereo camera is modeled. In this paper hardcoded trajectories are used instead of dynamic input, and therefore it is omitted.
• Controller is a system that controls the treatment process; in our case the controller uses information provided by the treatment plan and the HexaPOD response to control it.

Fig. 1. Radiation Treatment System.

¹ There is a diversity of radiation treatment systems, see [1] for an overview of the systems relevant to this study. However, here we define just a selected setup.

A. Experiments with HexaPOD Couch

Technical documentation of the HexaPOD device does not provide a detailed description of its behavior when it is used continuously, not just to move a patient into a specified position. Usually, when a new position is provided, it starts to move towards it, accelerating at 5.5 mm/s² until it reaches 7.6 mm/s (instead of the stated 8 mm/s velocity). Then, when 5 mm are left to the target, it starts decelerating at 5.5 mm/s². When the distance to the target is less than 5 mm, HexaPOD accelerates until the middle of the interval and then decelerates until it reaches the target and stops.

Based on these experiments and the expected breath movement, timing properties of the model and testing trajectories can be defined.

III. TIMED AUTOMATON AND UPPAAL

Timed automaton [12] is one of the most popular techniques for modeling and analysis of real-time systems. A version of automata used in Uppaal [15] is presented. Uppaal is an integrated tool environment for the modeling, simulation and verification of (complex) real-time systems. It is well-suited for systems that can be modeled as a collection of non-deterministic processes with finite control structure and real-valued clocks, communicating through channels or shared variables.

Definition 1. Let $C = \{x, y, z, \ldots\}$ be a set of clocks and let $B(C)$ be the set of clock restrictions of the form $g, g_1, g_2 ::= x \bowtie c \mid x - y \bowtie c \mid g_1 \wedge g_2$ with $x, y \in C$, $c \in \mathbb{N}$ and $\bowtie \in \{\le, <, =, >, \ge\}$.

Definition 2. A timed automaton is a finite directed graph $A = (L, l_0, A, E, I)$ over $C$ and $B(C)$, where
• $L$ is a finite set of locations;
• $l_0 \in L$ is the initial location;
• $A$ is a finite set of action names;
• $E \subseteq L \times B(C) \times A \times 2^C \times L$ is a finite set of edges, and
• $I : L \to B(C)$ assigns invariants to locations.
$l \xrightarrow{g,a,r} l'$ is written instead of $(l, g, a, r, l') \in E$. $l$ is called the source location of the edge, $g$ is the guard, $a$ is the action, $r$ is the set of clocks to be reset and $l'$ is the target location.

Timed automata can be represented as in Fig. 2. Locations are depicted as nodes of the graph, and the initial location is usually marked with a double circle. Transitions are depicted by arrows.

Fig. 2. Timed Automaton.

Definition 3. Let $A = (L, l_0, A, E, I)$ be a timed automaton over a set of clocks $C$. The timed transition system $T(A)$ generated by $A$ is defined as $T(A) = (S, Act, \xrightarrow{tr})$, where:
• $S = L \times (C \to \mathbb{R}_{\ge 0})$ is a set of states $(l, v)$, where $l$ is a location of the timed automaton and $v$ is a clock valuation that satisfies the invariant of $l$;
• $Act = A \cup \mathbb{R}_{\ge 0}$ is the set of labels;
• two types of transitions are defined:
  – action transitions $(l, v) \xrightarrow{a} (l', v')$ such that there exists an edge $(l \xrightarrow{g,a,r} l') \in E$ where $v$ satisfies $g$, $v'$ satisfies $v[r]$ and $v'$ satisfies $I(l')$, where $v[r]$ denotes $v$ with the clocks in $r$ reset to 0;
  – delay transitions $(l, v) \xrightarrow{d} (l, v + d)$ if for all $d' \in [0, d]$, $v + d'$ satisfies $I(l)$.

Let $v_0$ denote the valuation such that $v_0(x) = 0$ for all $x \in C$. If $v_0$ satisfies the invariant of the initial location $l_0$, $(l_0, v_0)$ is called the initial state of $T(A)$.

Timed automata are composed into a network of timed automata consisting of $n$ timed automata $A_i = (L_i, l_i^0, A, E_i, I_i)$, $i = 1 \ldots n$, over a set of clocks $C$. Let $l = (l_1, \ldots, l_n)$ be a location of the network; then invariants are composed using conjunction, $I(l) = \bigwedge_{i=1}^{n} I_i(l_i)$.

Definition 4. Let $A_i = (L_i, l_i^0, A, C, E_i, I_i)$, $i = 1 \ldots n$, be a network of $n$ timed automata. Let $l^0 = (l_1^0, \ldots, l_n^0)$ be the initial location vector. Then the semantics is defined as a transition system $(S, s_0, \to)$, where $S = (L_1 \times \ldots \times L_n) \times \mathbb{R}^C$ is the set of states, $s_0 = (l^0, v_0)$ is the initial state and the transition relation contains three types of transitions:
• time flow transitions $(l, v) \xrightarrow{d} (l, v + d)$, if for all $d' \in [0, d]$, $v + d' \models I(l)$;
• discrete transitions
  – synchronized: $((l_1, \ldots, l_i, \ldots, l_j, \ldots, l_n), v) \xrightarrow{\tau} ((l_1, \ldots, l_i', \ldots, l_j', \ldots, l_n), v')$ if $\exists i \ne j$, $\exists\, l_i \xrightarrow{a!, g_i, r_i} l_i' \in E_i$, $\exists\, l_j \xrightarrow{a?, g_j, r_j} l_j' \in E_j$, $v \models g_i \wedge g_j$, $v' \models v[r_i \cup r_j]$ and $v' \models I_i(l_i') \wedge I_j(l_j') \wedge \bigwedge_{k \ne i,j} I_k(l_k)$;
  – asynchronous: $((l_1, \ldots, l_i, \ldots, l_n), v) \xrightarrow{\tau} ((l_1, \ldots, l_i', \ldots, l_n), v')$ if $\exists\, l_i \xrightarrow{a!, g_i, r_i} l_i' \in E_i$, $v \models g_i$, $v' \models v[r_i]$ and $v' \models I_i(l_i') \wedge \bigwedge_{k \ne i} I_k(l_k)$.

There are many tools for designing real-time systems based on the theory of timed automata. For example, KRONOS performs model-checking of TCTL formulas with respect to timed safety automata [24]. The Hybrid Technology tool (HYTECH) is for analysis of embedded systems. It computes the condition under which a linear hybrid system satisfies a temporal requirement. Since timed automata are particular hybrid systems, they can be verified with this tool [25]. The State Graph Manipulator tool (SGM) is for real-time system specification and verification. It uses various sophisticated verification techniques developed in previous years [26].

The model-checker Uppaal is based on the theory of timed automata as well; however, its modeling language offers additional features such as bounded integer variables and urgency. The query language of Uppaal, used to specify properties to be checked, is a subset of Real Time CTL (computation tree logic) [12], [15], [27]:
• A[] property: invariant, property always holds in all paths;
• A<> property: eventually, property holds in all paths at some moment;
• E<> property: possibly, property eventually holds at some state, at least in one path;
• E[] property: potentially always, property eventually holds from some state, at least in one path;
• p --> q: leads to, whenever p holds eventually q will hold;
• deadlock: true, if deadlock state is reachable;
• P.state: certain properties hold in the selected state.

IV. UPPAAL MODEL OF THE RADIATION TREATMENT SYSTEM

A work in progress is presented, a simplified version of the radiation treatment system defined in sect. II. The model presented in [21] is extended with timing aspects. The Uppaal model consists of the following components:
• Controller that, based on its state, a treatment plan and the input from the tracking system, i.e. stereo camera, controls movement of the HexaPOD;
• HexaPOD moves according to its physical limitations and following the commands sent from the Controller.
• HexaPOD Buffer that models asynchronous communication and latency between the controller and the HexaPOD.
• Tracker, in this case an abstraction of a tracking device (e.g., stereo camera), observes tracker placed on the HexaPOD (or patient), calculates position of the tracker, and provides it to the controller. In this model we use predefined inputs and ignore it.

V. GLOBAL DEFINITIONS, VARIABLES AND DESCRIPTION OF COMPLETE SYSTEM

Global definitions and variables are used all over the model. We provide them below.

    const int X_MAX = x_coord_max;
    const int Y_MAX = y_coord_max;
    const int Z_MAX = z_coord_max;
    const int HP_LATENCY_MAX = hp_latency_max;
    const int HP_LATENCY_MIN = hp_latency_min;
    const int HP_STEP = hp_step_duration;
    typedef struct {
        int x;
        int y;
        int z;
    } POSITION;
    chan move_to;                        // Controller->HexaPODBuffer
    POSITION set_target_pos = {0, 0, 0};
    urgent chan get_move;                // Buffer->HexaPOD

System definition just instantiates all templates and merges them into a complete model.

    Controller = Controller_();
    HexaPOD = HexaPOD_();
    HexaPODBufferLat = HexaPODBufferLat_();
    system HexaPOD, HexaPODBufferLat, Controller;

A. HexaPOD

An Uppaal model of the HexaPOD is depicted in Fig. 3. It is modeled as a one-point device with discrete movement in three directions: x, y and z. We abstract from acceleration and rotation. Instead of continuous behavior, discrete steps on the grid with constant velocity are defined. This allows investigating the impact of the latency and the general design of HexaPOD control. The automaton consists of three locations:

• Idle: HexaPOD waits for a command move_to. With this action it receives a target, and changes to Move location.
• Move: HexaPOD stepwise moves towards the target, taking steps in the predefined direction of the predefined length at a constant speed. After each step it checks for a new target, and updates the current one, if necessary. When the target is reached, it changes to TargetReached location.
• TargetReached: is a committed location (a special type of location, which should be left at the next step), which is used for diagnostic reasons, see sect. III.

Fig. 3. HexaPOD in Uppaal. [Diagram not reproduced. Locations: Idle, Move, TargetReached. Labels visible in the figure include get_move?, step(), t <= HP_STEP, t == HP_STEP, target_pos == current_pos, target_pos != current_pos, target_pos = set_target_pos and t = 0.]

Description of the model is defined as follows.

    clock t;
    POSITION target_pos = {0, 0, 0};
    POSITION current_pos = {0, 0, 0};

    void step() // make step: move one grid unit per axis toward the target
    {
        if (current_pos.x < target_pos.x)
            current_pos.x++;
        else if (current_pos.x > target_pos.x)
            current_pos.x--;
        if (current_pos.y < target_pos.y)
            current_pos.y++;
        else if (current_pos.y > target_pos.y)
            current_pos.y--;
        if (current_pos.z < target_pos.z)
            current_pos.z++;
        else if (current_pos.z > target_pos.z)
            current_pos.z--;
    }

B. HexaPODBuffer

HexaPODBuffer, depicted in Fig. 4, models asynchronous communication and latency. It consists of the following three locations.

• Empty location denotes an empty buffer; it awaits an input from the Controller, i.e. the move_to command and the target, and then changes to Latency location.
• Latency location is used to model delays in the system, i.e. after receiving the new target the buffer delays for a while before making it available to the HexaPOD. However, the new target can be provided to the buffer anytime.
• Ready: when the buffer is ready, the target can be acquired by the HexaPOD using get_move command (action), and location is changed to Empty.

Fig. 4. HexaPODBuffer in Uppaal. [Diagram not reproduced. Locations: Empty, Latency, Ready. Labels visible in the figure include move_to?, get_move!, t <= HP_LATENCY_MAX, t >= HP_LATENCY_MIN and t = 0.]

Description of the buffer (just clock) is provided below.

    clock t;

C. Controller

In the current model the controller provides control commands to the HexaPOD. It consists of three locations:

• Start: start of the treatment program (plan).
• Move: the control program is in progress, control inputs provided by an array are sent to the HexaPOD at the predefined time moment.
• Finished: denotes that the control program was completed successfully.

Fig. 5. Controller in Uppaal. [Diagram not reproduced. Locations: Start, Move, Finished. Labels visible in the figure include move_to!, step < STEPS, step == STEPS, step < (STEPS-1) && path[step].tstamp == t, step == (STEPS-1) && path[step].tstamp == t, t <= path[step].tstamp, set_target_pos = path[step].pos, ++step and t = 0.]

    const int STEPS = 5;
    int step = 0;
    clock t;
    typedef struct {
        POSITION pos; // position
        int tstamp;   // timestamp
    } PATH;
    // Test trajectory
    const PATH path[STEPS] = {
        {{ 2, 3, 4 }, 0},
        {{ 3, 3, 4 }, 30},
        ...
        {{ 3, 3, 4 }, 10},
        {{ 4, 3, 3 }, 20},
        {{ 3, 3, 3 }, 20}
    };
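The discrete movement rule of the HexaPOD template can also be tried out outside Uppaal. The following is a minimal Python sketch, not part of the authors' model: it ports `step()` from the HexaPOD declarations and replays the five entries listed in the test trajectory (the `...` in the listing is skipped). The names `Position`, `steps_to_reach`, `TEST_PATH`, `legs` and `fits`, and the value chosen for `HP_STEP`, are assumptions made for illustration. Buffer latency is not modelled, and reading each `tstamp` as the wait before the next command is an interpretation of the Controller figure labels.

```python
from dataclasses import dataclass


@dataclass
class Position:
    """Grid position, mirroring the POSITION struct of the model."""
    x: int = 0
    y: int = 0
    z: int = 0


def step(current: Position, target: Position) -> None:
    """Move each axis by at most one grid unit toward the target (port of step())."""
    for axis in ("x", "y", "z"):
        c, t = getattr(current, axis), getattr(target, axis)
        if c < t:
            setattr(current, axis, c + 1)
        elif c > t:
            setattr(current, axis, c - 1)


def steps_to_reach(start: Position, target: Position) -> int:
    """Count the calls to step() needed to get from start to target."""
    current = Position(start.x, start.y, start.z)  # copy, so start is not modified
    count = 0
    while current != target:
        step(current, target)
        count += 1
    return count


# The five (position, tstamp) entries listed in the Controller declarations.
TEST_PATH = [
    (Position(2, 3, 4), 0),
    (Position(3, 3, 4), 30),
    (Position(3, 3, 4), 10),
    (Position(4, 3, 3), 20),
    (Position(3, 3, 3), 20),
]

HP_STEP = 5  # hypothetical step duration; the model leaves hp_step_duration open

# Steps needed for each leg, starting from the initial position {0, 0, 0}.
legs = []
position = Position(0, 0, 0)
for target, _ in TEST_PATH:
    legs.append(steps_to_reach(position, target))
    position = target

# Assumption: the wait before the next command is the tstamp of the next entry.
# A leg "fits" if all of its steps complete within that wait (latency ignored).
fits = [n * HP_STEP <= nxt[1] for n, nxt in zip(legs, TEST_PATH[1:])]

print(legs)  # [4, 1, 0, 1, 1]
print(fits)  # [True, True, True, True]
```

Because all three axes advance in the same step, each leg takes as many steps as its largest per-axis distance. With a larger `HP_STEP`, or with latency added, some legs would no longer fit, which is the kind of timing question the Uppaal model is meant to answer exhaustively.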
D. Simulation and Analysis

Stepwise timed simulation gives an insight into the model behavior. However, Uppaal allows more, i.e. the conformance of the system to the selected properties can be verified. The following properties are used to analyze the model:

1) E<> Controller.Finished
   This property checks whether there exists a path that allows the Controller to reach its final location. This property holds for the model under analysis.
2) A<> Controller.Finished
   This property checks whether the Controller reaches the Finished location in all evolutions. Verification shows that the property holds.
3) E<> Controller.Finished and Controller.step == Controller.STEPS
   There exists a state such that the Controller finishes when all control commands were sent. The property holds for the model as well.
4) E<> Controller.Finished and Controller.step != Controller.STEPS
   This property checks whether all control steps were performed before reaching the final state of the Controller, i.e. the property would hold if there exists at least one state in one path where the Controller reaches the Finished state, but not all control commands were sent. It is formulated in such a way that when the tool returns a negative answer, the system works as expected. The property does not hold.
   It can be reformulated in a different manner:
   A<> Controller.Finished and Controller.step == Controller.STEPS-1
   i.e. we can check whether in all paths the state where the Controller has finished and has made all steps is eventually reached. It holds.
5) E<> HexaPOD.TargetReached and HexaPOD.current_pos == Controller.path[Controller.STEPS-1].pos
   There exists a state such that HexaPOD reaches the target and its position coincides with the target position set by the Controller.
6) E<> HexaPOD.TargetReached and HexaPOD.current_pos != Controller.path[Controller.STEPS-1].pos
   There exists a state such that HexaPOD reaches the target and its position does not coincide with the target position set by the Controller. Again, the property is formulated in such a way that when the tool returns a negative answer, the system is correct. As expected, the property does not hold.
   Again, it can be reformulated in the following manner:
   A<> HexaPOD.TargetReached and HexaPOD.current_pos == Controller.path[Controller.STEPS-1].pos
   i.e. we check whether in all paths the state where HexaPOD has reached the target, and it coincides with the last target set by the Controller, is eventually reached.

The provided properties allow checking different characteristics of the system and producing diverse diagnostic traces. The traces can be compared to the required trajectories, and the control properties of the HexaPOD as well as of the Controller can be estimated. More properties can be added. Moreover, traces can be exported and the difference between the target and the HexaPOD position calculated.

However, as was already mentioned in [28], [29], it is easy to see that an average distance between the position of HexaPOD and its target should be found, and therefore exact durations are needed, as well as the change of the distance over time. Moreover, more realistic respiratory movement inputs are necessary. Therefore, hybrid models are required to estimate all durations and time scales.

ACKNOWLEDGMENT

The authors would like to thank UAB Rubedo and in particular Gabrielius Čaplinskas for experiments with the HexaPOD couch and comments on its behavior.

VI. CONCLUSIONS AND FUTURE PLANS

A work in progress, the model of the radiation treatment system in Uppaal, is discussed. It is an abstract model that includes selected elements of the complete system. It allows some useful characteristics of the system to be obtained. Moreover, it shows certain limitations of the approach: time scales and corresponding distances (grid) should be chosen carefully to get accurate results. In addition, the current system does not allow providing realistic respiratory movement input, so it is impossible to calculate the average exposure of the healthy tissue.

Our conclusion is that such a model is insufficient to answer all the interesting questions, and therefore it should be combined with a hybrid model to get more information about the behavior of the system. Our future plans are as follows:
• Extensions of the Uppaal model:
  – model of HexaPOD with acceleration;
  – model of the targeting component;
  – implementation of the different control approaches.
• Continuous model of the HexaPOD, which would allow building a more exact discrete model, or generating discrete paths for the timed model.
• Semi-formal control model in OpenModelica [30] (see [29] for the first attempt).
• Combination of real respiratory movement trajectories and the (formal) model to investigate the system's adequacy to compensate it.

Moreover, hybrid model results should be used to modify existing models, namely distances and timing. Hybrid and timed simulation results should be compared to validate both models, and different control strategies should be analyzed with the models.

REFERENCES

[1] P. J. Keall, G. S. Mageras, J. M. Balter, R. S. Emery, K. M. Forster, S. B. Jiang, J. M. Kapatoes, D. A. Low, M. J. Murphy, B. R. Murray, C. R. Ramsey, M. B. V. Herk, S. S. Vedam, J. W. Wong, and E. Yorke, “The management of respiratory motion in radiation oncology report of aapm task group 76,” Medical Physics, vol. 33, no. 10, pp. 3874–3900, 2006. [Online]. Available: http://link.aip.org/link/MPH/33/3874/1
[2] R. I. Berbeco, S. Nishioka, H. Shirato, G. T. Y. Chen, and S. B. Jiang, “Residual motion of lung tumours in gated radiotherapy with external respiratory surrogates,” Physics in Medicine and Biology, vol. 50, no. 16, pp. 3655–3667, 2005. [Online]. Available: http://stacks.iop.org/0031-9155/50/i=16/a=001
[3] D. Ruan, J. A. Fessler, and J. M. Balter, “Real-time prediction of respiratory motion based on local regression methods,” Physics in Medicine and Biology, vol. 52, no. 23, pp. 7137–7152, 2007. [Online]. Available: http://stacks.iop.org/0031-9155/52/i=23/a=024
[4] A. Kalet, G. Sandison, H. Wu, and R. Schmitz, “A state-based probabilistic model for tumor respiratory motion prediction,” Physics in Medicine and Biology, vol. 55, no. 24, pp. 7615–7631, 2010. [Online]. Available: http://stacks.iop.org/0031-9155/55/i=24/a=015
[5] H. Wu, G. C. Sharp, B. Salzberg, D. Kaeli, H. Shirato, and S. B. Jiang, “A finite state model for respiratory motion analysis in image guided radiation therapy,” Physics in Medicine and Biology, vol. 49, no. 23, p. 5357, 2004. [Online]. Available: http://stacks.iop.org/0031-9155/49/i=23/a=012
[6] R. Milner, Communication and Concurrency. Prentice-Hall, 1989.
[7] C. A. R. Hoare, Communicating Sequential Processes. Prentice-Hall, 1985.
[8] E. Brinksma, T. Krilavičius, and Y. S. Usenko, “Process Algebraic Approach to Hybrid Systems,” in Proc. of 16th IFAC World Congress, Prague, Jul. 2005, pp. 1–6.
[9] T. Krilavičius, “Hybrid techniques for hybrid systems,” Ph.D. dissertation, Enschede, 2006. [Online]. Available: http://doc.utwente.nl/57124/
[10] D. A. van Beek, K. L. Man, M. A. Reniers, J. E. Rooda, and R. R. H. Schiffelers, “Syntax and Consistent Equation Semantics of Hybrid Chi,” JLAP, vol. 68, no. 1-2, pp. 129–210, 2006.
[11] T. Krilavičius and K. Man, Intelligent Automation and Computer Engineering. Springer, 2009, ch. Behavioural Hybrid Process Calculus for Modelling and Analysis of Hybrid and Electronic Systems.
[12] R. Alur and D. Dill, “The theory of timed automata,” in Real-Time: Theory in Practice, ser. Lecture Notes in Computer Science, J. de Bakker, C. Huizing, W. de Roever, and G. Rozenberg, Eds. Springer Berlin / Heidelberg, 1992, vol. 600, pp. 45–73, 10.1007/BFb0031987. [Online]. Available: http://dx.doi.org/10.1007/BFb0031987
[13] R. Alur, C. Courcoubetis, N. Halbwachs, T. A. Henzinger, P. H. Ho, X. Nicollin, A. Olivero, J. Sifakis, and S. Yovine, “The algorithmic analysis of hybrid systems,” TCS, vol. 138, no. 1, pp. 3–34, 1995.
[14] G. J. Holzmann, The SPIN Model Checker: Primer and Reference Manual. Addison-Wesley, Sep. 2003. [Online]. Available: http://www.amazon.com/exec/obidos/redirect?tag=citeulike07-20&path=ASIN/0321228626
[15] G. Behrmann, A. David, and K. G. Larsen, “A tutorial on UPPAAL,” in Formal Methods for the Design of Real-Time Systems: 4th Int. School on Formal Methods for the Design of Computer, Communication, and Software Systems, SFM-RT 2004, ser. LNCS, M. Bernardo and F. Corradini, Eds., no. 3185, Sep. 2004, pp. 200–236. [Online]. Available: http://www.cs.auc.dk/~adavid/publications/21-tutorial.pdf
[16] J. P. Bowen and M. G. Hinchey, “Ten commandments of formal methods,” IEEE COMPUTER, vol. 28, pp. 56–63, 1994.
[17] K. Wan, D. Hughes, K. Man, T. Krilavičius, and S. Zou, “Investigation on composition mechanisms for cyber physical systems,” vol. 2, no. 1, pp. 30–40, 2010.
[18] B. Gebremichael, T. T. Krilavičius, and Y. S. Usenko, “A formal model of a car periphery supervision system in UPPAAL,” in Proc. of Workshop on Discrete Event Systems (WODES’04), Reims, France, Sep. 2004, pp. 433–438.
[19] K. Man, T. Krilavičius, C. Chen, and H. Leung, “Application of bhave toolset for systems control and mixed-signal design,” in Proc. of the Int. MultiConf. of Engineers and Computer Scientists (IMECS), Hong Kong, March 2010.
[20] T. Krilavičius and V. Miliukas, “Functional modelling and analysis of a distributed truck lifting system,” in The 5th Int. Conf. on Electrical and Control Technologies (ECT 2010), Kaunas, Lithuania, 2010, p. 6.
[21] K. Man, T. Krilavičius, K. Wan, H. D., and K. Lee, “Modeling and analysis of radiation therapy system with respiratory compensation using Uppaal,” in Proc. of the 9th IEEE Int. Symp. on Parallel and Distributed Processing with Application (ISPA 2011), May 2011.
vol. 14, pp. 1945–1947, 10.1007/978-3-540-36841-0_485. [Online]. Available: http://dx.doi.org/10.1007/978-3-540-36841-0_485
[23] Elekta, “Elekta Synergy® S with HexaPOD™,” Online, 2006. [Online]. Available: http://www.elekta.com/assets/Elekta-Oncology/Stereotactic-Radiation-Therapy/case_studies/0601%2003-06%20HexaPOD.pdf
[24] S. C. Daws, A. Olivero and S. Yovine, “The tool kronos,” in Hybrid Systems III, Verification and Control 1996, 1996.
[25] P. T. A. Henzinger and H. Wong-Toi, “Hytech: A model checker for hybrid system,” vol. 1, no. 1, pp. 110–122, 1997.
[26] F. Wang and P.-A. Hsiung, “Efficient and user-friendly verification,” vol. 51, no. 1, pp. 61–83, 2002.
[27] E. Clarke, O. Grumberg, and D. Peled, Model Checking. MIT, 2001.
[28] T. Krilavičius and K. Man, “Timed model of the radiation therapy system with respiratory motion compensation,” in Proc. of the 6th Int. Conf. on Electrical and Control Technologies (ECT 2011), 2011, p. 6.
[29] T. Krilavičius, D. Vitkutė-Adžgauskienė, and K. Šidlauskas, “Simulation of the radiation therapy system for respiratory movement compensation,” in Proc. of Mechatronic Systems and Materials (MSM 2011), Kaunas, Lithuania, 2011, p. 6.
[30] OpenModelica System website, “OpenModelica System,” 2009. [Online]. Available: http://www.ida.liu.se/~pelab/modelica/OpenModelica.html

Tomas Krilavičius is an associate professor at Vytautas Magnus University and a Senior researcher at the Baltic Institute of Advanced Technologies. He has published a number of papers. Tomas is an honorary member of the Lithuanian Society of Young Researchers and a member of the Lithuanian Scientific Society. He is a PC member of DATICS workshops and the EURAS conference. Tomas' research interests are in the area of formal modeling and analysis of Cyber-physical and Hybrid systems and application of Natural Language Processing for Information retrieval. Tomas holds a Ph.D. from the University of Twente, The Netherlands.

Kaiyu Wan is a Lecturer at Xi’an Jiaotong-Liverpool University. She has published more than 25 academic papers. Kaiyu’s research interests include Software Systems (Context-Aware Systems, Cyber Physical System, Service-oriented Architectures and Web service), and Programming Languages (Intensional Programming Language, Language for Agent Communication and Collaboration, Compiler and Development Frameworks). Kaiyu holds a Ph.D. in Computer Science from Concordia University, Canada.

Kevin Lee is a Lecturer at Murdoch University in Perth, Australia. He received his Ph.D. from Lancaster University, and was a Research Associate at the University of Manchester and a postdoctoral fellow at the University of Mannheim. He has published more than 40 refereed academic papers in international conferences and journals. His research interests focus on adaptive and autonomic systems in the areas of High-speed Networking, Sensor Networks, Scientific Workflow Processing, Physiological Computing and Peer-to-Peer networks.

Ka Lok Man holds a Dr. Eng. degree in Electronic Engineering from Politecnico di Torino, Italy, and a Ph.D. degree in Computer Science from Technische Universiteit Eindhoven, The Netherlands. Currently, he is a Senior Lecturer with the Department of Computer Science and Software Engineering, Xi’an Jiaotong-Liverpool University, China and a research and engineering consultant for Solari, Hong Kong. His research interests include logic synthesis, simulation, formal verification and low power design methodologies for integrated circuits and systems, formalization of SystemC and SystemVerilog
design, formal methods, process algebras, wireless sensor networks, specification and analysis of distributed, real-time, hybrid systems, embedded systems and physical cyber systems. On the above-mentioned topics, he has authored or co-authored more than 100 refereed publications.

A Fault Tolerant Adder Based On Alternative Computation

Chiraz Khedhiri, Mouna Karmani, Belgacem Hamdi, and Ka Lok Man

Chiraz Khedhiri is with the Electronic & Microelectronics Laboratory, Monastir, Tunisia. E-mail: chirazkhedhiri@yahoo.fr
Mouna Karmani is with the Electronic & Microelectronics Laboratory, Monastir, Tunisia. E-mail: mouna.karmani@yahoo.fr
Belgacem Hamdi is with the Electronic & Microelectronics Laboratory, Monastir, Tunisia. E-mail: Belgacem.Hamdi@issatgb.tn
Ka Lok Man is with the Department of Computer Science and Software Engineering, Xi'an Jiaotong-Liverpool University, Suzhou, China.

Abstract: This paper presents a concurrent error correcting adder design that achieves fault tolerance by duplicating a bit slice of a full adder based on alternative computation. The duplicated module computes the sum and carry bits in two alternative ways, so that faults are detected by comparing the results (Sum and Carry out) obtained from the two computing paths. Redundancy is used to provide fail-operational functionality: if one hardware component goes down, one of the redundant components can be brought in to continue operation of the system. The proposed method is simulated in standard 32nm CMOS technology and provides an 11.11% saving in transistor count compared to a QMR (Quadruple Modular Redundancy) style design.

Index Terms: adder, concurrent error detection, concurrent error correction, alternative computation, fault tolerance, quadruple modular redundancy, voter

I. INTRODUCTION

According to Moore's Law, fast-developing Integrated Circuit (IC) technology will provide the industry with billions of transistors on a single chip in a few years [1]. This continuous scaling of microelectronic technology poses several challenges to the designers of electronic systems, particularly from the reliability point of view.

Fault tolerance is usually employed to satisfy the high reliability constraints imposed by an increasing number of safety-critical applications [2], such as space-based applications, process control systems, missile guidance systems and medical applications. It is achieved by the inclusion of redundancy within a circuit. A fault tolerant system can perform its specified tasks in the presence of hardware faults and software errors. Fault tolerance tries to prevent negative effects of these faults on the system operation.

This work proposes a fault tolerant adder based on the duplication of a concurrent error detection adder. The basic idea of this concurrent error detection adder is that the same hardware is used twice in differing ways, such that a comparison of the results obtained from the two executions allows error detection. To avoid the problem of extra delay, we propagate the result as soon as the first computation is finished, so that the dependent computation can commence execution as soon as possible. Redundancy is used to provide fail-operational functionality: by duplicating the full adder based on alternative computation, we obtain four copies of each output, which are compared using a voter. This technique is compared to quadruple modular redundancy. To prove the reliability of the proposed design, a bit slice is implemented in 32nm CMOS technology and the layout is simulated.

This paper is organized as follows. Section II presents previous work on this topic. Section III describes the proposed design. In Section IV, we present the simulation results. Conclusions are given in Section V.

II. PREVIOUS WORK

Over the past decades, Complementary Metal Oxide Semiconductor (CMOS) technology scaling has been a primary driver of the electronics industry and has provided denser and faster integration [3, 4]. The need for more performance and integration has accelerated the scaling trends in almost every device. In addition, integrated circuit design and testing have become a real challenge in ensuring the functionality and quality of the product, especially for safety-critical applications.

In fact, safety-critical systems have to function correctly even in the presence of faults, because they could cause injury or loss of human life if they fail or encounter errors. Automobile, aerospace, medical, nuclear and military systems are examples of extremely safety-critical applications [5]. Safety-critical applications have strict time and cost constraints, which means that not only do faults have to be tolerated, but the constraints must also be satisfied. Hence, efficient system design approaches that take fault tolerance into consideration are required [5].

Fault tolerance is usually based on some form of redundancy to extend system reliability through the invocation of alternative resources. The redundancy may be in hardware, information or time.

In the hardware redundancy approach, function modules in a system are duplicated [6], triplicated (Fig. 1), or quadrupled, so that comparison or majority voting can be performed to detect or correct errors. The hardware overhead of this approach is very high, while the extra delay is minimal.
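The difference between duplication with comparison and triplication with majority voting can be sketched behaviourally in Python. This is only an illustrative model under stated assumptions, not the circuit discussed in this paper: the helper names (`stuck_sum_module`, `duplicate_and_compare`, `tmr`) and the stuck-at fault model are hypothetical and introduced here for the example.

```python
from itertools import product

def full_adder(a, b, cin):
    """Fault-free 1-bit full adder; returns (sum, carry_out)."""
    total = a + b + cin
    return total & 1, total >> 1

def stuck_sum_module(stuck_value):
    """Hypothetical faulty copy: its sum output is stuck at `stuck_value`."""
    def module(a, b, cin):
        _, carry = full_adder(a, b, cin)
        return stuck_value, carry
    return module

def duplicate_and_compare(module_1, module_2, a, b, cin):
    """Duplication: a mismatch flags an error but cannot tell which copy is wrong."""
    out_1, out_2 = module_1(a, b, cin), module_2(a, b, cin)
    return out_1, out_1 != out_2

def majority3(x, y, z):
    """Bitwise 2-of-3 majority function."""
    return (x & y) | (y & z) | (x & z)

def tmr(modules, a, b, cin):
    """Triplication: majority voting masks a fault in any single copy."""
    (s1, c1), (s2, c2), (s3, c3) = (m(a, b, cin) for m in modules)
    return majority3(s1, s2, s3), majority3(c1, c2, c3)

if __name__ == "__main__":
    faulty = stuck_sum_module(0)
    # With one faulty copy out of three, the voted result matches the
    # fault-free adder for every input combination.
    for a, b, cin in product((0, 1), repeat=3):
        assert tmr((full_adder, faulty, full_adder), a, b, cin) == full_adder(a, b, cin)
    # Duplication only detects the mismatch for this input.
    print(duplicate_and_compare(full_adder, faulty, 1, 0, 0))
```

In this sketch, duplication raises an error flag when the two copies disagree, while triplication returns the correct sum and carry despite the single faulty copy, which is the detect-versus-correct distinction drawn above.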
Ka Lok Man, e-mail: ka.man@xjtlu.edu.cn

Fig. 1. Triple modular redundancy

Coding techniques are used in the information redundancy approach. However, both the hardware complexity and the delay can be very high.

On the other hand, time redundancy achieves fault tolerance without introducing much hardware overhead, at the expense of more computation time. The basic concept of this method is to repeat computations one or more times (Fig. 2). In many applications, additional time may be much more affordable than extra hardware [7].

Fig. 2. Time redundancy technique [8]

III. PROPOSED DESIGN

Many techniques for concurrent error detection in adders have been proposed in the literature [9-14]. In this paper, we are interested in full adders. The full adder is the fundamental unit in circuits used for performing arithmetic operations, such as multipliers, compressors, large adders, comparators and parity checkers [15]. This paper describes a fault tolerant technique for a full adder based on the duplication of an alternative computation.

A. The Alternative Computation

As proposed in [16], the alternative computation involves computing the sum and carry bits in two alternative ways. We therefore use two different paths in the same hardware to calculate duplicate outputs (sum and carry). The repeated computations are performed differently by using (1) and (2), as illustrated in Fig. 3(c).

$$S_{i1} = a_i \oplus b_i \oplus c_{in}, \qquad Cout_{i1} = (P_i + a_i) + (P_i \cdot c_{in}) \qquad (1)$$

$$S_{i2} = a_i \oplus b_i \oplus c_{in}, \qquad Cout_{i2} = (P_i + b_i) + (P_i \cdot c_{in}) \qquad (2)$$

Fig. 3. Full adder with duplicate computation: (a) first computation, (b) second computation, (c) the complete computation (combination of the two computations)

In order to realise the complete computation, two paths are used. The first one is selected when H = 0, using (1) (Fig. 3(a)), and the second path is selected when H = 1, using (2) (Fig. 3(b)).

As the first execution and the redundant one are shifted by a certain delay, a transfer gate is used to synchronise the duplicated outputs and to prepare them to be compared (Fig. 4).

B. The Fault Tolerant Technique

In the nanoelectronic environment, the massive occurrence of online faults makes aggressive fault tolerance approaches a fundamental requirement for the implementation of any functional system [9].

A fault-tolerant system provides continuous safe operation in the presence of faults. Essentially, any fault tolerance approach relies on a certain amount of redundancy. In all cases, redundancy is introduced to detect or correct the result of failures in the system.

Fig. 4. On-line detection of faults after shifting the outputs [16] (the first outputs Si1 / Couti1 are propagated under control of H and compared with Si2 / Couti2 by a comparator that produces the error signal)

In this paper, the fault tolerant technique consists of duplicating the concurrent error detection adder presented in Fig. 3(c). Thus we obtain four copies of each output. The general structure of the fault tolerant technique is presented in Fig. 5.

Fig. 5. The fault tolerant technique (two copies of the adder produce Si1 to Si4 and Couti1 to Couti4; each output passes through a transfer gate + inverter stage into the sum voter and the carry voter)

One purpose of redundancy is to provide fail-operational functionality. If one hardware component goes down, then one of the redundant components can be brought in to continue operation of the system. Identical hardware running identical software should logically produce the same outputs, so by comparing the outputs and checking whether they agree, a voter can determine whether there has been a fault in one of the systems. The truth table of a conventional bit-by-bit majority voter logic circuit is given in Table I. We suppose that only one fault exists and ignore the case where there is more than one fault (case Not Considered).

TABLE I
MAJORITY VOTER TRUTH TABLE

x  y  z  w   Output
0  0  0  0   0
0  0  0  1   0
0  0  1  0   0
0  0  1  1   Not considered
0  1  0  0   0
0  1  0  1   Not considered
0  1  1  0   Not considered
0  1  1  1   1
1  0  0  0   0
1  0  0  1   Not considered
1  0  1  0   Not considered
1  0  1  1   1
1  1  0  0   Not considered
1  1  0  1   1
1  1  1  0   1
1  1  1  1   1

IV. SIMULATION RESULTS

The concurrent error correction adder is implemented in 32nm CMOS technology; only the multiplexers and demultiplexers are implemented with pass transistor technology [17]. This is done in order to decrease the number of transistors and to avoid signal degradation. The layout of the fault tolerant full adder is shown in Fig. 6.

Fig. 6. Layout of the adder in 32nm CMOS technology without faults (dimension labels in the figure: 16.4 µm and 8.6 µm)

Fig. 7 illustrates a SPICE simulation of the circuit of Fig. 6 with shifted outputs and voters. The simulation shows that the outputs are similar, thus the circuit is fault free. The quadruplicated outputs Sum and Carry verify (1) and (2).

Fig. 7. SPICE simulation of the adder in 32nm CMOS technology without faults

A. Fault Detection

We now simulate the concurrent error detection adder in the presence of faults. Faults are deliberately and manually injected into the physical layout of the circuit. Fig. 8 gives an example of these simulations. In this simulation, a transient fault was injected in the primary input cin, which made the outputs sum and carry not complementary. This fault is detected by the comparator(s) (the XORSum indicates a non-valid code ‘0’).

Fig. 8. SPICE simulation of the concurrent error detection adder in 32nm CMOS technology with insertion of primary fault

B. Fault Tolerance

We now inject the same transient fault as in Fig. 8 into the fault tolerant adder of Fig. 5. When the transient fault is injected, it is first detected by the comparators used in the concurrent error detection adder, as shown in Fig. 8. By duplicating this concurrent error detection adder we obtain a fault tolerant system. In fact, in Fig. 9 we note that when the fault is injected in the primary input cin, the output S2 does not stay similar to the other outputs Si, while the votersum provides the correct output.

Fig. 9. SPICE simulation of the adder in 32nm CMOS technology with insertion of primary fault

As the voter is the majority function and we assume that only one fault is injected, we always get a correct output from the voters.

A fault-tolerant system provides continuous safe operation in the presence of faults. It can improve a system’s reliability by keeping the system operational when hardware failures and software errors occur.

C. Overhead

We now compare the proposed concurrent error correction adder with the adder using the quadruple technique (see Table II).

TABLE II
NUMBER OF TRANSISTORS IN DIFFERENT CASES

                        Quadruple modular redundancy   Quadruple modular redundancy with alternative computation
Number of transistors   180                            152

The proposed fault tolerant design allows for a decrease in the number of transistors. We save 11.11% on the transistor number overhead if we compare the proposed fault tolerant design with the quadruple based adder.

V. CONCLUSION

This paper presents a novel approach to realising a fault tolerant system by the duplication of a concurrent error detection adder based on duplicate computation. This technique involves a hardware reduction of 11.11% compared to the quadruple modular redundancy scheme. The fault-tolerant adder provides continuous safe operation in the presence of faults, and it can improve a system’s reliability by keeping the system operational when hardware failures and software errors occur.

REFERENCES

[1] Alexander Wei Yin, Liang Guang, Pasi Liljeberg, Pekka Rantala, Jouni Isoaho, and Hannu Tenhunen, “Hierarchical Agent Based NoC with DVFS Techniques,” International Journal of Design, Analysis and Tools for Circuits and Systems, Vol. 1, No. 1, June 2011.
[2] José Manuel Cazeaux, Daniele Rossi and Cecilia Metra, “Self-Checking Voter for High Speed TMR Systems,” Journal of Electronic Testing: Theory and Applications, Vol. 21, pp. 377–389, 2005.
[3] C. Mead, “Fundamental limitations in microelectronics - I. MOS technology,” Solid State Electronics, Vol. 15, pp. 819–829, 1972.
[4] R. Puri, T. Karnik and Joshi, “Technology Impacts on sub-90nm CMOS Circuit Design & Design Methodologies,” Proceedings of the 19th International Conference on VLSI Design, 2006.
[12] M. Nicolaidis, “Carry Checking/Parity Prediction Adders and ALUs,” IEEE Transactions on VLSI Systems, Vol. 11, No. 1, pp. 121–128, Jan 2003.
[13] D. P. Vasudevan and P. K. Lala, “A Technique for Modular Design of Self-Checking Carry-Select Adder,” in DFT, pp. 325–333, 2005.
[14] B. K. Kumar and P. K. Lala, “On-line Detection of Faults in Carry-Select Adders,” in ITC, pp. 912–918, 2003.
[15] H. T. Bui, Y. Wang and Y. Jiang, “Design and analysis of low-power 10-transistor full adders using novel XOR-XNOR gates,” IEEE Trans. Circuits Systems-II: Analog Digit. Signal Process, Vol. 49, No. 1, pp. 25–30, 2002.
[16] Chiraz Khedhiri, Mouna Karmani, Belgacem Hamdi, Ka Lok Man, “Concurrent Error Detection Adder Based On Two Path Output Computation,” Journal of Convergence, Vol. 2, No. 1, June 2011.
[5] V. Izosimov, “Scheduling and Optimisation of Fault-Tolerant Distributed Embedded Systems,” PhD thesis, Linköping University, 2006.
[6] C. Dumortier and J. M. DeHaene, “Test en ligne,” ELE6303. Test des systèmes électroniques, 2004.
[7] B. W. Johnson, J. H. Aylor, and H. H. Hana, “Efficient use of time and hardware redundancy for concurrent error detection in a 32-bit VLSI adder,” IEEE Journal of Solid-State Circuits, Vol. 23, No. 1, February 1988.
[8] M. Nicolaidis, “Time redundancy based soft-error tolerance to rescue nanometer technologies,” in IEEE VLSI Test Symposium, 1999.
[9] International Technology Roadmap for Semiconductors Emerging Research Devices, 2006.
[10] R. J. Sellers, M. Hsiao and L. W. Bearnson, Error Detecting Logic for Digital Computer, McGraw-Hill, 1968.
[11] J. G. G. Langdon and C. K. Tang, “Concurrent Error Detection for Group Look-ahead Binary Adders,” IBM J. Res. Develop, pp. 563–573, Sep 1970.
[17] E. Sicard, “Microwind and Dsch version 3.1,” Toulouse: INSA, ISBN 2-87649-050-1, December 2006.

Chiraz Khedhiri is with the Electronic & Microelectronics Laboratory, Monastir, Tunisia.

Mouna Karmani is with the Electronic & Microelectronics Laboratory, Monastir, Tunisia.

Belgacem Hamdi is with the Electronic & Microelectronics Laboratory, Monastir, Tunisia.

Ka Lok Man is with the Department of Computer Science and Software Engineering, Xi'an Jiaotong-Liverpool University, Suzhou, China.

A Precise and High Linearity Power Supply Noise Monitor Circuit

I-Chyn Wey, Chien-Chang Peng, Yu-Jiang Liao, and Yu-Sheng Yang

I-Chyn Wey is with the Department of Electrical Engineering, Graduate Institute of Electrical Engineering and Green Technology Research Center, Chang-Gung University, Taiwan. E-mail: ichynwey@gmail.com.
Chien-Chang Peng is with the Graduate Institute of Electrical Engineering, Chang-Gung University, Taiwan. E-mail: dillon73peng@gmail.com.
Yu-Jiang Liao and Yu-Sheng Yang are with the Department of Electrical Engineering, Chang-Gung University, Taiwan.
This work was supported by the research project of Chang-Gung University, Taiwan [Grant number: UERPD290141]. The chip implementation was supported by National Chip Implementation Center, Taiwan [Chip number: T18-98C-148].

Abstract: In this paper, we propose a new power supply noise monitor with high linearity in a 0.18um CMOS process. We remove the precharging capacitor from the power supply noise monitor circuit and add an independent, stable charging voltage source to enhance the linearity of power supply noise detection. We set the detection path to a higher supply voltage of 2.5V so that the detection circuitry turns on immediately when the supply voltage drops. In this way, the noise detection range is enlarged and the noise detection accuracy is improved by 4.94 times. By removing the precharging capacitor, we save approximately 40% of the area compared with the original design.

Index Terms: power supply noise (PSN), power supply noise monitor, high linearity, high precision

I. INTRODUCTION

In recent years, with the progress of CMOS process technology, not only has the transistor size been scaled down, but the supply voltage and the threshold voltage have also been lowered. Lowering the power supply voltage leads to lower power consumption, but unfortunately noise does not decrease accordingly. Furthermore, higher transistor density and higher complexity in VLSI circuits also lead to more serious noise interference.

Noise can be classified as switching impulse noise, power supply noise (PSN), and substrate noise [1],[2]. Switching impulse noise is generated by the parasitic capacitance. Power supply noise comes from signal transitions: transistors turn on and off frequently, creating current spikes that are transformed into voltage bounces at the power supply terminal. Substrate noise is caused by substrate capacitive coupling. As it is transmitted through the low-resistance substrate, it may affect sensitive analog circuits. For different noise sources, we must apply different techniques to deal with them [1]-[10]. In this paper, we focus our attention on PSN issues.

To solve the noise issues in VLSI designs, we must first detect the behavior, distribution, and size of the noise in order to build an understanding of noise interference. The detected noise parameters can be applied to electronic design automation (EDA) tool development, which can in turn be used to predict noise in terms of amplitude, timing, and location. The more precise the noise model constructed in the EDA tools, the higher the reliability and the better the performance that can be achieved in modern VLSI design. Therefore, many noise monitor related studies have recently been published [1]-[10]. For different noise sources, we must apply different techniques to detect them. In this paper, we focus on PSN detection.

Power supply noise may lead to unexpected delay and extra power consumption. In sensitive circuits such as dynamic circuits and analog circuits, it may even lead to malfunction. The bandwidth of the noise is wide and the noise signal amplitude is random, so the noise detection circuit should be designed with high linearity and high accuracy.

Several built-in noise detection circuits [3]-[6] have been proposed to measure the distribution of noise signals in a VLSI chip. Instantaneous noise changes can also be measured with an oscilloscope, but when connecting to the oscilloscope, off-chip noise (due to the connecting wires) is also incorporated into the measurement, which degrades detection accuracy.

The built-in self noise detection circuits in [3],[4] are constructed from a source follower and a transconductance amplifier. These designs can detect the noise in real time, and the noise detection architectures are simple with small area. However, for different noise amplitudes, the source follower operates in different regions (sub-threshold, triode, and saturation), which leads to non-linear noise detection results. The detected results of the transconductance amplifier are therefore non-linear.

The concurrent power supply noise detection technique [5] can provide an output error message upon the occurrence of power supply noise. It can also detect noise of various amplitudes by tuning the transistor size. However, because its output is digital, this circuit can only show whether noise is detected or not; detailed noise information such as amplitude and duration cannot be obtained. In addition, the noise detection range is limited because the noise amplitude peak value is not detected until the transistor is turned on.

For the power supply noise monitoring circuit [6], the detected signals are analog, so they can reveal the detected noise voltage peak values, and the circuit is simple. Nevertheless, the noise detection range is limited because the noise amplitude peak value cannot be detected before the charging transistor is turned on. For different noise amplitudes, the detection transistor operates in different regions, which also leads to non-linear noise detection output because the detected voltage value is affected by the variance of the charging voltage.
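To make the notion of detector linearity concrete, the following Python sketch scores a monitor's transfer curve (detected output versus noise amplitude) by its largest deviation from a least-squares straight line, normalised to the output span. This metric and both sample curves are assumptions chosen for illustration; they are not measurements or definitions taken from this paper. The dead-zone curve loosely mimics a detector that does not respond until its transistor turns on.

```python
def best_fit_line(xs, ys):
    """Ordinary least-squares line through the points; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def linearity_error(xs, ys):
    """Largest deviation from the best-fit line, as a fraction of the output span."""
    slope, intercept = best_fit_line(xs, ys)
    span = max(ys) - min(ys)
    worst = max(abs(y - (slope * x + intercept)) for x, y in zip(xs, ys))
    return worst / span

if __name__ == "__main__":
    # Hypothetical noise amplitudes in volts (0.0 V to 1.8 V in 0.2 V steps).
    noise = [0.2 * i for i in range(10)]
    # Hypothetical ideal detector: output proportional to the noise amplitude.
    ideal = [0.5 * v + 0.1 for v in noise]
    # Hypothetical detector with a dead zone: no response below 0.5 V.
    dead_zone = [max(0.0, v - 0.5) for v in noise]
    print("ideal:", linearity_error(noise, ideal))
    print("dead zone:", linearity_error(noise, dead_zone))
```

A curve with a turn-on dead zone scores clearly worse than a straight-line response, which is the kind of non-linearity attributed above to detectors whose transistors change operating region with the noise amplitude.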
To remove the unstable charging voltage source in [6], we replace the baseline precharging capacitor with an independent power supply. In this way, we can enhance detection linearity and widen the noise detection range. To further enhance noise detection linearity, we set the detection path to a higher supply voltage of 2.5V so that the detection circuitry turns on immediately when the supply voltage drops, and the detection transistor operates only in the triode and saturation regions. As a result, the output dynamic range is enlarged 5.04 times and the noise detection accuracy is improved by 4.94 times.

The rest of this paper is organized as follows. In Section II, we first review the existing noise detectors. In Section III, we propose our new power supply noise detector design. In Section IV, the experimental results are presented. Finally, Section V concludes the paper.

II. THE EXISTING POWER SUPPLY NOISE DETECTION CIRCUIT

The built-in power supply noise detecting circuit [3],[4], illustrated in Fig. 1, consists of a source follower (SF) that senses the power supply noise voltage and a transconductance amplifier (Gm) that converts the SF's output voltage to a current signal Iout. The benefit of the built-in detecting circuit lies in its small size, its real-time output, and the fact that the detected noise voltage or current value can be observed directly. However, the detection results produced by the transconductance amplifier and source follower are non-linear.

Fig. 1. Built-in PSN detector constructed by SF + Gm [3], [4]

The concurrent power supply noise detection technique [5] is illustrated in Fig. 2. Its self-checking scheme can concurrently monitor a signal of the system clock distribution network. It can provide an output error message upon the occurrence of power supply noise. In this circuit, NOT4 and the transmission gates are designed to make D and D’ complementary. When CK = 0, (D, D’) = (0,1); when CK = 1, (D, D’) = (1,0). When there is no noise, the output signals of gates e1 and e2 arrive at the same time, so (ERR1, ERR2) will be (01) or (10). When noise occurs, it results in different delay times between e1 and e2, and if the delay time is longer than the feedback time, the latch circuit is locked and (ERR1, ERR2) will be (00) or (11). The function of RS and RS’ is to reset the circuit. The PSN monitor can be triggered under different noise conditions by adjusting the size of the transistors. The benefit of the concurrent detection technique lies in its digital output, which is hardly affected by interference and can be stored in memory. The shortcoming of this circuit is that the accurate noise voltage level cannot be detected. Besides, with various noise voltage amplitudes the delay time is not the same, which can cause the latch to operate in a wrong state. When the delay time is shorter than the feedback time, it will also lead to malfunction in the latch circuit.

Fig. 2. The concurrent power supply noise detection circuit [5]

The power supply monitoring circuit [6], illustrated in Fig. 3, measures the power supply noise through its effect on the propagation delay of an inverter chain. When CTRL/CLK is set to logic low, the monitoring circuit is triggered. For CLK = 0, MP1 turns on and Cr charges to the high value of VDD (Cr must not be too small, so that Vr changes only marginally). At the same time, MN1 turns on and Cx discharges to ground level, MP3 turns off and MP2 turns on. When CLK goes high (CLK = 1), MP1 turns off and no longer charges Cr. Meanwhile, MN1 turns off, Cx stops discharging and MP3 turns on. Because the signal is delayed by the inverter chain, MP2 and MP3 are both on for a short period of time. During this period, a current flows from capacitor Cr to Cx and raises the voltage Vx. This current charges the capacitor Cx until the output signal of the inverter chain changes and MP2 turns off. Thus, the total charge flowing to Cx is proportional to the propagation delay of the inverter chain. The propagation delay of the inverter chain in turn depends on the supply voltage, which is affected by the PSN. In other words, the larger the voltage peak of the PSN, the longer the delay time in the inverter chain; therefore, the charging time of Cx is longer and the output Vx is higher, and vice versa.

Fig. 3. The power supply monitoring circuit [6]

In the previous noise detection circuit designs, the detection results of the built-in detecting circuits are non-linear. The concurrent power supply noise detection technique can only reveal a digital output instead of the peak value of the noise. The monitoring circuit provides analog outputs, measured as different values of Vx, so it can identify the peak value of the noise, but the detection output is non-linear and the detection range is limited. Our proposed power supply noise detection circuit design is based on the monitoring circuit, but with higher linearity and higher precision.

III. THE PROPOSED POWER SUPPLY NOISE MONITOR

Keeping the relationship between the PSN monitor output and the input PSN linear is a direct and effective way to detect the noise amplitude precisely. However, the noise detection range and PSN detection precision are limited in [6] because the gradation of ΔVx is small. Therefore, we must enlarge the voltage difference ΔVx to enhance PSN detection accuracy.

As mentioned above, the charging voltage of Cx in the power supply monitoring circuit is determined by the charging voltage across the capacitor Cr. Once PSN occurs, the voltage across Cr changes with the noise amplitude and Vr drops. Such voltage variance on Vr results in non-linearity in the variance of Vx, as illustrated in Fig. 5(a). In order to provide a stable voltage source for charging Cx, we remove the capacitor Cr and connect capacitor Cx directly to the separate supply voltage to achieve higher linearity in the detection output. To enlarge the linear detection range, we apply a higher voltage of 2.5V to this separate independent supply, so that the transistor MP3 can be turned on immediately once PSN occurs. In this way, the detected ΔVx is enlarged and the PSN detection linearity is improved, as illustrated in Fig. 5(b).

Fig. 4. The analysis of charging voltage variation of Cr along with various PSN amplitudes (axes: Noise (V) versus Voltage of Cr (V); curves: the ideal charging voltage of Cr, and the charging voltage of Cr as affected by noise)

In the proposed PSN detection circuit, illustrated in Fig. 6, MP1 and Cr are removed and replaced by an independent power supply connected directly to the charging capacitor Cx. The circuit operates as follows. When CTRL/CLK is logic low, the transistor MN1 turns on, the capacitor Cx begins to discharge and the transistor MP3 turns off. When CLK is logic high (1.8V), MN1 turns off and Cx stops discharging. Due to the path difference in the inverter delay line, MP2 and MP3 are both on for a short period of time. Meanwhile, Cx is recharged by VDDH, and the charging period is set by the delay time of the inverter chain.

In our proposed PSN detection circuit, VDDH is independent of the original power supply VDD. Note that the current drawn from VDDH is much smaller than that drawn from VDD; therefore, the PSN on VDDH is much smaller than that on VDD. Consequently, the PSN detection accuracy can be maintained. Even though a separate VDDH supply needs slightly more circuit area, it provides a stable voltage source for charging Cx, which enhances the detection linearity. Overall, our circuit area is still much smaller than that of the conventional design [6] because one capacitor is removed.

We can also separately adjust the value of VDDH and the capacitance of Cx to enhance the linearity of the detected output. Moreover, VDDH is set higher than VDD to turn on the transistors MP2 and MP3 as soon as the supply voltage drops. In other words, by using a separate supply with a higher voltage value, not only is the drop in the original charging voltage avoided, but the linearity of the output is also improved. Moreover, the linear PSN detection range can be much wider because the non-linear operation region has been removed. At the same time, the dynamic range of the detected PSN output (ΔVx) is also greatly enlarged; therefore, the noise detection accuracy is improved.

IV. EXPERIMENTAL RESULTS

In this paper, the proposed precise and high linearity PSN monitor is realised with a stable and independent charging voltage source. By connecting the transistor MP2 to a separate charging path, even with the same supply voltage of 1.8V, we can improve the detection range and achieve higher accuracy due to the wider Vx variation range, as illustrated in Fig. 7. By using a separate independent charging voltage source to charge the Cx capacitor, we achieve higher detection linearity and also overcome the charging voltage instability issue existing in [6]. As illustrated in Fig. 7, the slope of the curve for the proposed PSN monitor is steeper than that of the conventional one, so the average output value is about 2 times larger than in the conventional design.

In order to further detect noise below 0.5V, we raise the supply voltage value of VDDH to improve the output dynamic range. As illustrated in Fig.
8, a higher VDDH results in more linear PSN detection. However, when the charging voltage increases, △Vx will reach saturation. The linearity of PSN detection can no longer be held once the PSN is larger than the saturation value.

Fig. 5. Comparison of voltage across Vr and its effect on the detected △Vx in the PSN monitor in [6] and the proposed PSN monitor. (a) The case in the PSN monitor [6]. (b) The case in the proposed PSN monitor.

Fig. 6. The proposed power supply noise detection circuit

Fig. 7. Comparison of Vx variation under various power supply noise pulse amplitudes in the proposed PSN detection circuit versus the state-of-the-art PSN monitor in [6]

Fig. 8. Analysis of various charging supply voltages for higher PSN detection linearity (x-axis: Noise (V); y-axis: Vx (V); linearity per charging voltage: 1.8V: 0.9243; 2.3V: 0.9946; 2.4V: 0.9980; 2.5V: 0.9986; 2.6V: 0.9970)

To find the highest PSN detection linearity, we compare the PSN detection linearity among different charging supply voltages, as illustrated in Fig. 8. To evaluate the detection linearity of the PSN monitor, we define the relevance between the ideal expected PSN value and the real detected PSN value as the PSN detection linearity, given in equation (1).

$$\mathrm{DetectionLinearity} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \cdot \sum_{i=1}^{n} (y_i - \bar{y})^2}} \qquad (1)$$

A higher relevance, that is, a higher correlation, means a higher PSN detection linearity. As illustrated in Fig. 8, we achieve the highest PSN detection linearity of 0.9986 by selecting a charging voltage of 2.5V. As a result, the VDDH value is set to 2.5V to acquire the higher PSN detection linearity. The size of output capacitor also affects the PSN
detection linearity. As the size of the output capacitor is reduced, the output PSN detection voltage reaches saturation easily, and if the output capacitor size is further reduced, the detected output value loses its linearity. If we increase the capacitor size, the output change becomes less pronounced. Therefore, we tune the capacitor so that the output achieves the highest detection linearity. Based on the analytical results in Fig. 9, we select a capacitance of 0.4pF for the output capacitor, with a PSN detection linearity of 0.9986.

Fig. 9. Analysis of output capacitor value for higher PSN detection linearity (x-axis: Noise (V); y-axis: Vx (V); linearity per capacitance: 0.3p: 0.9950; 0.4p: 0.9986; 0.5p: 0.9984; 0.6p: 0.9983; 0.7p: 0.9983)

By providing a separate and stable charging path, charging with a higher supply voltage, and adjusting the output capacitor, we can greatly improve the PSN detection linearity and the detection dynamic range of the power supply monitoring circuit. As illustrated in Fig. 10, the PSN can be read clearly, with a 5.04 times larger average detected PSN voltage difference compared to the conventional design in [6]. The linearity is also enhanced from 0.9291 to 0.9986. Moreover, the larger detected voltage difference leads to better PSN detection accuracy. To evaluate the PSN detection accuracy, we define the PSN detection error as:

$$\mathrm{DetectionError} = \left| \frac{V_{X,\mathrm{detected}} - V_{X,\mathrm{ideal}}}{V_{X,\mathrm{ideal}}} \right| \qquad (2)$$

Fig. 10. Comparison of detected output voltage difference in the state-of-the-art PSN monitor in [6] and the proposed PSN monitor (x-axis: Noise (V); y-axis: Delta Vx (V))

Fig. 11. Comparison of PSN detection error in the state-of-the-art PSN monitor in [6] and the proposed PSN monitor (x-axis: the order of delta Vx (0V~1.8V); y-axis: Relative Error (%))

As illustrated in Fig. 11, the PSN detection resolution in [6] is limited to around 100mV and the average PSN detection error is 75.265%. With the same 100mV PSN detection resolution, the PSN detection error is lowered to 15.246% in our proposed PSN detection circuit. The improvement is effective because we raise the charging voltage to enhance the PSN detection range and provide an independent, stable charging path to improve the PSN detection linearity.

Finally, the proposed power supply noise monitor with detailed transistor sizes is shown in Fig. 12. The NOT gate and the NOR gate are designed to have nearly the same rise and fall times. MN1 is sized so that Vx can be discharged within the PSN detection period. MP2 and MP3 are sized to achieve better linearity while remaining small. The comparison of transistor sizes in the state-of-the-art PSN monitor in [6] and the proposed PSN monitor is summarized in Table 1. Overall, the transistor sizes are nearly the same, but MP1 and Cr are removed in the proposed PSN monitor, so the chip area can be reduced by approximately 40%.

Fig. 12. The proposed power supply noise monitor with transistor sizing

TABLE I
COMPARISON OF TRANSISTOR SIZE IN THE STATE-OF-THE-ART PSN MONITOR IN [6] AND THE PROPOSED PSN MONITOR

              State-of-Art [6]           Proposed PSN Monitor
Technology    TSMC 0.18um                TSMC 0.18um
MP2 (PMOS)    W=2u                       W=2u
MP3 (PMOS)    W=8u                       W=8u
MN1 (NMOS)    W=1u                       W=1u
MP1 (PMOS)    W=2.41u                    none
Cx            0.4p                       0.4p
Cr            0.4p                       none
NOR           PMOS W=10u, NMOS W=1u      PMOS W=10u, NMOS W=1u
INV           PMOS W=2.41u, NMOS W=1u    PMOS W=2.41u, NMOS W=1u

To evaluate the PSN detection performance, we summarize the comparison results in Table 2. In the proposed PSN monitor, the PSN detection linearity is improved from 0.9291 to 0.9986 in terms of relevance. The detected output variance is enlarged from 15.82mV to 79.75mV. Therefore, the output dynamic range is enlarged 5.04 times. Moreover, the PSN detection error is reduced from 75.27% to 15.25%. As a result, the noise detection accuracy is improved by 4.94 times.

TABLE II
COMPARISON OF PSN DETECTION PERFORMANCE

                                     PSN Monitor [6]    Proposed PSN Monitor
Transistor Count                     14                 13
Capacitor                            2                  1
Detection Range                      0.8V ∼ 1.8V        0V ∼ 1.8V
Linearity (Relevance)                0.9291             0.9986
Detected Variance (per 100mV PSN)    15.82mV            79.75mV
Detection Error                      75.27%             15.25%

In order to verify the function and performance in silicon, we realized the design in the TSMC 0.18um CMOS process. The chip die photo is shown in Fig. 13. Because the area is dominated mainly by the capacitors, removing the capacitor Cr from the power supply noise monitor circuit in [6] saves about 40% of the silicon area. The silicon area of the proposed PSN monitor circuit is 15.93um × 45.93um.

Fig. 13. Proposed PSN monitor chip die photo

In our measurement environment, we set the power supply voltage VDD to 1.8V using a Tektronix AFG-3252 arbitrary function generator and set VDDH to 2.5V using an Agilent E3631A DC power supply. Our CLK signal is generated from a quartz oscillator and the CLK frequency is set to 50MHz. Through a Tektronix DPO-7254 oscilloscope, we observe the output signal Vx of the PSN monitor to analyze its PSN detection range, accuracy, and linearity.

Initially, CTRL is set to logic low. Once CLK changes from logic low to logic high, due to the path difference in the inverter delay line, MP2 and MP3 both turn on to charge the capacitor Cx, and the voltage Vx increases. When CLK changes from logic high to logic low, MN1 turns on to discharge the capacitor Cx, and the voltage Vx is reset to logic low. Under various PSN peak values, we measure the corresponding Vx value. When the PSN is 1.8V and 0.6V, the measured Vx is 786mV and 114mV, respectively. Due to inherent capacitor mismatch and capacitance estimation mismatch in both the PAD and the bonding wire, there may be some difference between post-layout simulation and chip measurement. From Fig. 14, we can see that the post-layout simulation shows a PSN detection linearity of 0.9986 and the measurement result shows a PSN detection linearity of 0.9977. There is only a 0.09% difference. The PSN detection performance is nearly the same before and after the silicon implementation. Finally, we summarize the performance specification of our proposed PSN detection circuit in Table 3. The PSN detection linearity of our proposed design is 0.9977, the PSN detection variance is 73.56mV per 100mV PSN, and the PSN detection error is 17.83%.

TABLE III
PERFORMANCE SUMMARY OF THE PROPOSED PSN MONITOR
(columns: Pre-Sim, Post-Sim, Measurement)

Area: 15.93 × 45.89um2
Transistor Count: 13
Power Dissipation: 244.5uW, 240.7uW
Linearity (Relevance): 0.9986, 0.9987, 0.9977
Detected Variance (per 100mV PSN): 79.75mV, 73.31mV, 73.56mV
Detection Error: 15.25%, 13.13%, 17.83%

Fig. 14. PSN detection measurement results (x-axis: Noise (V); y-axis: delta Vx (V); curves: post-layout simulation and chip measurement of PSN detection linearity)

V. CONCLUSION

In this paper, we proposed a noise monitor with high PSN detection precision and high linearity in a CMOS 0.18um process. The silicon area of the proposed PSN monitor circuit is 15.93um×45.93um, which saves 40% compared with the PSN monitor in [6]. By replacing the baseline precharging capacitor with an independent 2.5V supply voltage, the PSN detection linearity is enhanced to 0.9977 and the noise detection accuracy is improved by 4.94 times.

REFERENCES

[1] T. Chen, "On the Impact of On-Chip Inductance on Signal Nets under the Influence of Power Grid Noise," IEEE Transactions on VLSI Systems, Vol. 13, pp. 339-348, Mar. 2005.
[2] P. Heydari and M. Pedram, "Capacitive Coupling Noise in High-Speed VLSI Circuits," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Vol. 24, pp. 478-488, Mar. 2005.
[3] T. Okumoto, M. Nagata, and K. Taki, "A built-in Technique for Probing Power-Supply Noise Distribution Within Large-Scale Digital Integrated Circuits," in Symposium on VLSI Circuits, pp. 98-101, Jun. 2004.
[4] M. Nagata, T. Okumoto, and K. Taki, "A Built-in Technique for Probing Power Supply and Ground Noise Distribution Within Large-Scale Digital Integrated Circuits," IEEE Journal of Solid-State Circuits, Vol. 40, Issue 4, pp. 813-819, Apr. 2005.
[5] C. Metra and L. Schiano, "Concurrent Detection of Power Supply Noise," IEEE Transactions on Reliability, Vol. 52, Issue 4, pp. 469-475, Dec. 2003.
[6] J. R. Vazquez and J. P. de Gyvez, "Power Supply Noise Monitor for Signal Integrity Faults," in Proceeding of Design, Automation and Test in Europe Conference, Vol. 2, pp. 1406-1407, Feb. 2004.
[7] A. Sehgal, P. Song, and K. A. Jenkins, "On-chip Real-Time Power Supply Noise Detector," in Proceeding of Solid-State Circuits Conference, pp. 380-383, Sep. 2006.
[8] S. Q. Yuan and Y. H. Tan, "Difference-type Noise Detector for Adaptive Median Filter," Electronics Letters, Vol. 42, Issue 8, pp. 454-455, Apr. 2006.
[9] H. C. Chow and Z. H. Hor, "A high performance peak detector sample and hold circuit for detecting power supply noise," in Proceeding of IEEE Asia Pacific Conference on Circuits and Systems, pp. 672-675, Dec. 2008.
[10] M. Fukazawa, K. Noguchi, M. Nagata, and K. Taki, "A built-in power supply noise probe for digital LSIs," in Proceeding of Asia and South Pacific Conference on Design Automation, pp. 24-27, Jan. 2006.
[11] Y. Q. Dong, R. H. Chan, and S. F. Xu, "A Detection Statistic for Random-Valued Impulse Noise," IEEE Transactions on Image Processing, Vol. 16, Issue 4, pp. 1112-1120, Apr. 2007.

I-Chyn Wey was born in Taipei, Taiwan, in 1979. He received the B.S. and M.S. degrees in electronics engineering from Chang Gung University, Taoyuan, Taiwan, in 2001 and 2003, respectively, and the Ph.D. degree in electronics engineering from National Taiwan University, Taipei, in 2008. He is currently an Assistant Professor with Chang Gung University. His research interests include VLSI CMOS circuits design, noise-tolerant CMOS circuits design, soft-error-tolerant CMOS circuits design, and ultralow-power CMOS circuits design.

Chien-Chang Peng was born in Taichung, Taiwan in 1984. He received the B.S. and M.S. degrees in electrical engineering from Chung Yuan Christian University, Taoyuan, Taiwan and Chang Gung University, Taoyuan, Taiwan in 2008 and 2010, respectively. He is currently working toward the Ph.D. degree in the Department of Electrical Engineering, Chang Gung University, Taoyuan, Taiwan. His research interests include VLSI CMOS circuits design, soft-error-tolerant CMOS circuits design, and noise detection circuits design.

Yu-Jiang Liao received the B.S. degree in electrical engineering from Chang Gung University, Taoyuan, Taiwan in 2010. His research interests are in VLSI CMOS circuits design.

Yu-Shang Yang received the B.S. degree in electrical engineering from Chang Gung University, Taoyuan, Taiwan in 2010. His research interests are in VLSI CMOS circuits design.

INTERNATIONAL JOURNAL OF DESIGN, ANALYSIS AND TOOLS FOR INTEGRATED CIRCUITS AND SYSTEMS

The International Journal of Design, Analysis and Tools for Integrated Circuits and Systems (IJDATICS) was created by a network of researchers and engineers both from academia and industry.
IJDATICS is an international journal intended for professionals and researchers in all fields of design, analysis and tools for integrated circuits and systems. The objective of the IJDATICS is to serve a better understanding between the community of researchers and practitioners both from academia and industry. Editor-In-Chief Ka Lok Man Xi'an Jiaotong-Liverpool University, China, Myongji University, South Korea, and Baltic Institute of Advanced Technology, Lithuania Co-Editor-In-Chief Chi-Un Lei University of Hong Kong, Hong Kong Managing Editor Michele Mercaldi Tomas Krilavicius EnvEve, Switzerland Vytautas Magnus University, Lithuania Baltic Institute of Advanced Technology, Lithuania Kaiyu Wan Taikyeong Jeong Xi'an Jiaotong-Liverpool University, China Myongji University, South Korea Journal Secretary Treasurer Jun Wang Woonkian Chong Fujitsu Laboratories of America, Inc., USA Xi'an Jiaotong-Liverpool University, China Assistant Editor-In-Chief Chi-Hua Chen Amir-Mohammad Rahmani National Chiao Tung University, Taiwan University of Turku, Finland Publishing Manager Nan Zhang Xi'an Jiaotong-Liverpool University, China, Editorial Board Vladimir Hahanov Oscar Valero Enggee Lim Kharkov National University of Radio Electronics, Ukraine University of Balearic Islands, Spain Xi'an Jiaotong-Liverpool University, China Paolo Prinetto Yang Yi Kevin Lee Politecnico di Torino, Italy Sun Yat-Sen University, China Murdoch University, Australia Massimo Poncino Damien Woods Prabhat Mahanti Politecnico di Torino, Italy University of Seville, Spain University of New Brunswick, Saint John, Canada Alberto Macii Franck Vedrine Tammam Tillo Politecnico di Torino, Italy CEA LIST, France Xi'an Jiaotong-Liverpool University, China Joongho Choi Bruno Monsuez Yanyan Wu University of Seoul, South Korea ENSTA, France Xi'an Jiaotong-Liverpool University, China Wei Li Kang Yen Wen Chang Huang Fudan University, China Florida International University, USA Kun Shan University, Taiwan Michel 
Schellekens Takenobu Matsuura Masahiro Sasaki University College Cork, Ireland Tokai University, Japan The University of Tokyo, Japan Emanuel Popovici R. Timothy Edwards Vineet Sahula University College Cork, Ireland MultiGiG, Inc., USA Malaviya National Institute of Technology, India Jong-Kug Seon Olga Tveretina D. Boolchandani LS Industrial Systems R&D Center, South Korea Karlsruhe University, Germany Malaviya National Institute of Technology, India Umberto Rossi Maria Helena Fino Zhao Wang STMicroelectronics, Italy Universidade Nova De Lisboa, Portugal Xi'an Jiaotong-Liverpool University, China Franco Fummi Adrian Patrick ORiordan Shishir K. Shandilya University of Verona, Italy University College Cork, Ireland NRI Institute of Information Science & Technology, India Graziano Pravadelli Grzegorz Labiak J.P.M. Voeten University of Verona, Italy University of Zielona Gora, Poland Eindhoven University of Technology, The Netherlands Vladimir PavLov Jian Chang Wichian Sittiprapaporn Intl. Software and Productivity Engineering Institute, USA Texas Instruments Inc, USA Mahasarakham University, Thailand Ajay Patel Yeh-Ching Chung Aseem Gupta Intelligent Support Ltd, United Kingdom National Tsing-Hua University, Taiwan Freescale Semiconductor Inc., USA Thierry Vallee Anna Derezinska Kevin Marquet Georgia Southern University, USA Warsaw University of Technology, Poland Verimag Laboratory, France Menouer Boubekeur Kyoung-Rok Cho Matthieu Moy University College Cork, Ireland Chungbuk National University, South Korea Verimag Laboratory, France Monica Donno Yong Zhang Ramy Iskander Minteos, Italy Shenzhen University, China LIP6 Laboratory, France Jun-Dong Cho R. Liutkevicius Suryaprasad Jayadevappa Sung Kyun Kwan University, South Korea Vytautas Magnus University, Lithuania PES School of Engineering, India AHM Zahirul Alam Yuanyuan Zeng S. Hariharan International Islamic University Malaysia, Malaysia University College Cork, Ireland B. S. 
Abdur Rahman University, India Gregory Provan D.P. Vasudevan Chung-Ho Chen University College Cork, Ireland University College Cork, Ireland National Cheng-Kung University, Taiwan Miroslav N. Velev Arkadiusz Bukowiec Kyung Ki Kim Aries Design Automation, USA University of Zielona Gora, Poland Daegu University, South Korea M. Nasir Uddin Maziar Goudarzi Shiho Kim Lakehead University, Canada University College Cork, Ireland Chungbuk National University, South Korea Dragan Bosnacki Jin Song Dong Hi Seok Kim Eindhoven University of Technology, The Netherlands National University of Singapore, Singapore Cheongju University, South Korea Dave Hickey Dhamin Al-Khalili Siamak Mohammadi University College Cork, Ireland Royal Military College of Canada, Canada University of Tehran, Iran Maria OKeeffe Zainalabedin Navabi Brian Logan University College Cork, Ireland University of Tehran, Iran University of Nottingham, UK Milan Pastrnak Lyudmila Zinchenko Ben Kwang-Mong Sim Siemens IT Solutions and Services, Slovakia Bauman Moscow State Technical University, Russia Gwangju Institute of Science & Technology, South Korea John Herbert Muhammad Almas Anjum Asoke Nath University College Cork, Ireland National University of Sciences and Technology, Pakistan St. Xavier's College, India Zhe-Ming Lu Deepak Laxmi Narasimha Tharwon Arunuphaptrairong Sun Yat-Sen University, China University of Malaya, Malaysia Chulalongkorn University, Thailand Jeng-Shyang Pan Danny Hughes Shin-Ya Takahasi National Kaohsiung University of Applied Sciences, Taiwan Xi'an Jiaotong-Liverpool University, China Fukuoka University, Japan Chin-Chen Chang Jun Wang Cheng C. Liu Feng Chia University, Taiwan Fujitsu Laboratories of America, Inc., USA University of Wisconsin at Stout, USA Mong-Fong Horng A.P. Sathish Kumar Farhan Siddiqui Shu-Te University, Taiwan PSG Institute of Advanced Studies, India Walden University, Minneapolis, USA Liang Chen N. 
Jaisankar Yui Fai Lam University of Northern British Columbia, Canada VIT University. India Hong Kong University of Science & Technology, Hong Kong Chee-Peng Lim Atif Mansoor Jinfeng Huang University of Science Malaysia, Malaysia National University of Sciences and Technology, Pakistan Philips & LiteOn Digital Solutions, The Netherlands Ngo Quoc Tao Steven Hollands Abhilash Goyal Vietnamese Academy of Science and Technology, Vietnam Synopsys, Ireland Oracle (SunMicrosystems), USA Salah Merniz Felipe Klein Katsumi Wasaki Mentouri University, Algeria State University of Campinas, Brazil Shinshu University, Japan Pankaj Gupta Yue Yang Yeo Kiat Seng Microsoft Corporation, USA EJITEC, China Nanyang Technological University, Singapore Masoud Daneshtalab Boguslaw Cyganek Youngmin Kim University of Turku, Finland AGH University of Science and Technology, Poland UNIST Academy-Industry Research Corporation, South Korea Publisher Cooperation Name : Solari Co., Hong Kong Address : Unit 1-5, 20/F, Midas Plaza, 1 Tai Yau Street, San Po Kong, Kowloon, Hong Kong Phone : (852) 3966-2536 ISSN: 2071-2987 (online version), 2223-523X (print version) INTERNATIONAL JOURNAL OF DESIGN, ANALYSIS AND TOOLS FOR INTEGRATED CIRCUITS AND SYSTEMS http://ijdatics.distributedthought.com/
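The two figures of merit used in the power supply noise (PSN) monitor paper in this issue, the detection linearity of its equation (1) and the detection error of its equation (2), can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' evaluation code: the function names `detection_linearity` and `detection_error` are hypothetical, the sample response curves are made up for illustration and are not measurements from the paper, and reading equation (2) as a relative error taken against the ideal value is an assumption. The last two lines only recompute the improvement ratios quoted in the text from the Table II values.

```python
import math


def detection_linearity(expected, detected):
    """Equation (1): Pearson correlation between expected and detected values."""
    if len(expected) != len(detected) or len(expected) < 2:
        raise ValueError("need two equally long sequences with at least 2 points")
    mean_x = sum(expected) / len(expected)
    mean_y = sum(detected) / len(detected)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(expected, detected))
    spread_x = sum((x - mean_x) ** 2 for x in expected)
    spread_y = sum((y - mean_y) ** 2 for y in detected)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("a constant sequence has no defined correlation")
    return covariance / math.sqrt(spread_x * spread_y)


def detection_error(detected, ideal):
    """Equation (2), read as relative error against the ideal value (assumption)."""
    if ideal == 0:
        raise ValueError("ideal value must be non-zero")
    return abs((detected - ideal) / ideal)


if __name__ == "__main__":
    # Hypothetical sample points, NOT measurements from the paper.
    noise = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
    linear_vx = [0.08 * n for n in noise]                 # straight-line response
    saturating_vx = [min(0.08 * n, 0.05) for n in noise]  # response that clips early
    print("linear response:    ", detection_linearity(noise, linear_vx))
    print("saturating response:", detection_linearity(noise, saturating_vx))

    # Ratios quoted in the text, recomputed from the Table II values.
    print("detected variance ratio:", round(79.75 / 15.82, 2))
    print("detection error ratio:  ", round(75.27 / 15.25, 2))
```

A response that saturates early scores a lower correlation than a straight-line response, which is the behaviour the paper relies on when it compares charging voltages and output capacitor sizes by their linearity values.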