Preface

Welcome to Volume 4, Number 1 of the International Journal of Design, Analysis and Tools for Integrated Circuits and Systems (IJDATICS). This issue comprises i) enhanced and extended versions of research papers from the International DATICS Workshops in 2012 and 2013, and ii) regular manuscript submissions in 2012 and 2013. The DATICS Workshops were created by a network of researchers and engineers, from both academia and industry, in the areas of i) design, analysis and tools for integrated circuits and systems and ii) communication, computer science, software engineering and information technology. The main aim of the DATICS Workshops is to bring together software/hardware engineering researchers, computer scientists, practitioners and people from industry to exchange theories, ideas, techniques and experiences.

This IJDATICS issue presents five high-quality academic papers. This mix provides a well-rounded snapshot of current research in the field and a springboard for driving future work and discussion. The five papers presented in this volume are summarized as follows:

• Image Processing: Liutkevičius and Davidsonas model a structured surface from an unstructured cloud of points.
• Integrated Circuits: Rao and Srinivasulu propose an approach to implementing a 16-bit RCA using a current sink restorer structure. Meanwhile, Jamali, Ahmadi and Fathabadi conduct a chipless RFID case study of an RF/digital co-design approach based on VHDL-AMS modeling.
• Software Engineering: Arnuphaptrairong performs an empirical validation study on early stage software effort estimation via Function Point Analysis.
• Sensor Networks: Imran, Khursheed, Ahmad, Waheed, O'Nils and Lawal propose an architecture for a wireless visual sensor node with region of interest coding.

We are grateful to all of the authors for their contributions to Volume 4, Number 1 of IJDATICS. We would also like to thank the IJDATICS editorial team.
Editors:
Ka Lok Man, Xi'an Jiaotong-Liverpool University, China, and Baltic Institute of Advanced Technology (BPTI), Lithuania
Chi-Un Lei, University of Hong Kong, Hong Kong
Amir-Mohammad Rahmani, University of Turku, Finland
Nan Zhang, Xi'an Jiaotong-Liverpool University, China
David Afolabi, Xi'an Jiaotong-Liverpool University, China

Table of Contents, Vol. 4, No. 1, December 2013

Preface
Table of Contents
1. Modeling of Structured Surface from Unstructured Cloud of Points — R. Liutkevičius and A. Davidsonas
2. 16-BIT RCA Implementation Using Current Sink Restorer Structure — Tirumalasetty Venkata Rao and Avireni Srinivasulu
3. Early Stage Software Effort Estimation Using Function Point Analysis: An Empirical Validation — Tharwon Arnuphaptrairong
4. RF/Digital Co-Design Approach Based on VHDL-AMS Modeling: A Chipless RFID Case Study — Arash Jamali, Arash Ahmadi and Omid Sadeghi Fathabadi
5. Architecture of Wireless Visual Sensor Node with Region of Interest Coding — Muhammad Imran, Khursheed Khursheed, Naeem Ahmad, Malik A. Waheed, Mattias O'Nils and Najeem Lawal

INTERNATIONAL JOURNAL OF DESIGN, ANALYSIS AND TOOLS FOR INTEGRATED CIRCUITS AND SYSTEMS, VOL. 4, NO. 1, DECEMBER 2013

Modeling of Structured Surface from Unstructured Cloud of Points
R. Liutkevičius, A. Davidsonas

Abstract—This paper presents a new approach for creating a model of a structured surface from an unstructured cloud of points, where the points are measured from a surface containing a centre-line such that all perpendicular rays to that line intersect the surface no more than once. The presented algorithm analyses a cloud of points generated by a point-based 3D scanner and calculates the parameters of a single non-uniform 3D B-spline mesh. The calculated model can be used directly in a 3D graphics API such as OpenGL, or converted to a structured triangle mesh.

Index Terms—3D scanning and visualization, 3D surface reconstruction, B-spline, cloud of points approximation, computational geometry.

I. INTRODUCTION

Surface reconstruction from a cloud of points is an increasingly important problem in computer graphics, computer-aided design, quality control, medical imaging, etc. Laser range scanners, hand-held digitizers or imaging systems are often used to capture surface points from real objects, and a wide variety of algorithms have been developed for building surface models from the captured data.

A very early solution to the reconstruction problem was suggested by Boissonnat [6], who proposed sculpting of the Delaunay triangulation. Later, Edelsbrunner and Mücke [12] suggested a more refined sculpting strategy named α-shapes. Another parametric surface reconstruction algorithm was proposed by Bajaj et al. [4]. The algorithm uses α-shapes and cubic Bernstein-Bézier (BB) surfaces to create a surface-approximating triangulation. Bernardini et al. [5] also proposed an algorithm based on α-shapes, but suggested a solution for avoiding the expensive computation of the Delaunay triangulation. The algorithm was named the Ball Pivoting Algorithm (BPA). A major drawback of using α-shapes for surface reconstruction is that such algorithms require an experimentally chosen parameter α that depends on the sampling density. The sampling density may vary over different parts of the surface, so there may be no optimal value of α.

A different approach to the surface reconstruction problem was proposed by Hoppe et al. [14]. In this algorithm an approximating surface is created from the set of values for which signed distance functions, defined by the input point samples, are equal to zero. The algorithm was named zero-set and was later improved by Curless and Levoy [8] and Turk and Levoy [17]. Zero-set based algorithms expect, along with the point coordinates, the normal vectors and other information such as errors. This can be a problem if a 3D scanner can capture only surface points.

The first theoretically proven reconstruction algorithm was proposed by Amenta, Bern and Kamvysselis [1]. This algorithm, named Crust, builds Voronoi diagrams out of the acquired data and uses them to reconstruct a surface. The Crust algorithm was greatly improved by Amenta, Choi, Dey and Leekha [2] and is now known as the Cocone algorithm. The Cocone and Crust algorithms produce holes and other artifacts if the data contain undersampling or noise, or if the reconstructed surface has sharp corners. To overcome these problems, Amenta, Choi and Kolluri [3] introduced the Power Crust algorithm, which extends Crust with an additional power shape structure. However, Power Crust adds a comparatively large number of extra data points to the output. Dey and Goswami [10] extended the Cocone algorithm to handle surfaces containing sharp corners. Their version, called Tight Cocone, performs undersampling detection to avoid the formation of surface holes in undersampled areas. Compared to Power Crust, Tight Cocone adds no extra points to the output. Nevertheless, surface holes are avoided only if the undersampling is local and detectable, and the algorithm is sensitive to noise.

Dey and Goswami [11] modified the Tight Cocone algorithm to make reconstruction possible from noisy data. Unfortunately, the result of this algorithm, named Robust Cocone, is usually a very rough surface approximation.

All the algorithms mentioned above produce piecewise linear approximations of a surface in the form of an unstructured triangle mesh. In the presence of undersampling, noise and sharp corners, these algorithms often produce errors that usually result in structural flaws and holes on a model's surface. This paper presents a surface reconstruction algorithm that generates a water-tight (hole-free) and well-structured surface out of noisy and unstructured input data. These very important characteristics of the surface are guaranteed thanks to non-uniform B-splines. The use of B-splines makes the proposed algorithm different from the above-mentioned algorithms while keeping its quality at a highly competitive level.

Unstructured data usually do not meet the necessary and sufficient conditions, and a direct approximation using B-splines is not possible [16]. However, an approximation of unstructured data using B-splines is possible. To overcome the problem of unstructured data, an approach that transforms the input data W into two spaces D(1) ∈ ℝ² and D(2) ∈ ℝ² and then calculates the unknown B-spline parameters from the transformed data is proposed, implemented and experimentally evaluated in this paper, to demonstrate the algorithm's advantage over other well-known algorithms.

This paper is organized as follows. Definitions and the structure of the input data are described in the next section. The proposed surface modeling algorithm is comprehensively presented in Section III, through subsections A-E. Practical usage of the algorithm results is discussed in Section IV. Section V covers implementation and experimental results. Finally, Section VI concludes the paper.

Authors are associated with the Department of System Analysis, Vytautas Magnus University, Lithuania. Authors can be reached by e-mail at r.liutkevicius@live.vdu.lt and andrius@live.vdu.lt, respectively.

II. SOURCE DATA AND DEFINITIONS

The presented algorithm takes as input a three-dimensional point set

W = {w_i = (x_i, y_i, z_i)}, 1 ≤ i ≤ γ. (1)

Here x, y and z represent a point's coordinates in the Cartesian coordinate system and γ is the number of acquired surface points. It is assumed that this point set has been sampled from a surface S ∈ ℝ³ that has a centre-line I, such that all perpendicular rays to that line intersect the surface not more than once.

An approximating surface is formed by converting the cloud of points into a parametric non-uniform B-spline surface [15] that is defined as

P(u, v) = Σ_{c=0..r} Σ_{d=0..s} N_{c,p}(u) N_{d,q}(v) P_{c,d}, (2)

u ∈ [u_0, u_{r+p+1}], v ∈ [v_0, v_{s+q+1}]. (3)

Here p is the surface degree in the u direction and q is the surface degree in the v direction; U = {u_c | 0 ≤ c ≤ r + p + 1} is the B-spline knot vector in the u direction; V = {v_d | 0 ≤ d ≤ s + q + 1} is the B-spline knot vector in the v direction; N_{c,p}(u) and N_{d,q}(v) are the B-spline blending functions of degree p and q, respectively; P_{c,d} are the B-spline control (de Boor) points [13]; u and v are parameterization intervals (infinite sets of points in a predefined range) and should not be confused with single values. The parameters U, V, N_{c,p}(u), N_{d,q}(v) and P_{c,d} are described in more detail in the following sections.

Fig. 1. The process of parametric surface creation.

III. SURFACE MODELING

A. Overview of the algorithm

Out of the unstructured input data the algorithm computes the unknown parameters U, V, N_{c,p}(u), N_{d,q}(v) and P_{c,d} of a parametric B-spline surface P(u, v), defined in (2), so that it approximates the surface S. The steps for calculating these unknown parameters are shown in Fig. 1. A key element is the source data transformation, described in more detail in Section III-B, which is applied in the first step of the algorithm (Fig. 1). In the next step the transformed data D(b), b = 1, 2 are subdivided into separate data blocks, which represent patches of the B-spline surface. The data blocks are used to form the B-spline knot vectors U and V. In the third step the B-spline blending functions N_{c,p}(u) and N_{d,q}(v) are calculated from the knot vectors and D(b). In the fourth step, the B-spline control points P_{c,d} are calculated from the blending functions and the other calculated data by applying optimization in the least-squares sense. The calculated parameters U, V and P_{c,d} are finally used to form a B-spline surface that approximates the surface S.

B. Transformation of input data

Schoenberg and Whitney [16] proved that data must be structured for a direct parametric surface interpolation and approximation to be possible. An example of structured data is shown in Fig. 2. 3D scanners usually output unstructured data. Fig. 3 shows data acquired with an experimental scanner built at Vytautas Magnus University [9]. It can be noticed from Fig. 3 that there is no simple way to restructure the data into a structured form, so a parametric surface cannot be approximated from such data directly. To solve the problem, the input data W are transformed into the spaces D(b) ∈ ℝ², b = 1, 2.
The steps of the transformation process are:
1. Positioning of the cloud of points W as shown in Fig. 4;
2. Sorting of the cloud of points by the z coordinate in ascending order;
3. Grouping of the points W along the center axis I;
4. Sorting of the points in each group by their position relative to the center axis I and storing them in the matrices D(b);
5. Filling of the empty entries of the matrices D(b) with values calculated using linear interpolation between neighboring non-empty entries.

Fig. 2. Example of the structured data.

A cloud of points must be properly aligned with respect to the coordinate system for the transformation algorithm to work properly. This is done during the first step by translating and rotating the cloud of points W so that the surface's center axis I coincides with the z axis and starts from z = 0, as shown in Fig. 4. The positioning of the surface is done by multiplying the cloud of points W by appropriate translation and rotation matrices [13]. The transformation matrices can be saved and used to position the surface back in its original position after the reconstruction, if the original surface position is important.

Fig. 4. Positioning of the surface.

During the second step the positioned points are sorted in ascending order by the z coordinate. The next step is the grouping of the points by the z coordinate. The idea of the grouping is to label each point w_i with a label g_i and save the labeling information in a separate vector Zgrp.
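Steps 1-3 above are straightforward to sketch in code. The following is a minimal illustration (not the authors' MATLAB implementation) that assumes the centre axis is already parallel to the z axis, so positioning reduces to a translation; `dz` plays the role of the grouping parameter Δz:

```python
import numpy as np

def group_points(points, dz=5.0):
    """Sketch of transformation steps 1-3: position the cloud so the
    centre axis starts at z = 0, sort the points by z, and label each
    point with a group index g_i = floor(z_i / dz) along the axis.

    `points` is a (gamma, 3) array of (x, y, z) samples; a full
    implementation would first rotate the cloud so that its centre-line
    coincides with the z axis (step 1)."""
    pts = points.copy()
    pts[:, 2] -= pts[:, 2].min()                 # start the axis at z = 0
    pts = pts[np.argsort(pts[:, 2])]             # step 2: ascending z
    zgrp = np.floor(pts[:, 2] / dz).astype(int)  # step 3: labels g_i
    groups = np.unique(zgrp)                     # unique labels h_1 < ... < h_m
    return pts, zgrp, groups
```

Step 4 would then sort each group by the angular parameter of (7) and write the values into the matrices D(b).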
Zgrp = [g_1, g_2, …, g_i, …, g_γ],
g_i = ⌊z_i / Δz⌋, z_1 ≤ z_2 ≤ … ≤ z_i ≤ … ≤ z_γ, i = 1, …, γ. (4)

Here ⌊·⌋ is the round-down operator. The vector Zgrp is then used to group the points into m unique groups:

H = {g_i | g_i ≠ g_j, i ≠ j, i, j = 1, …, γ}. (5)

Each group is denoted h_i, as shown below:

H = [h_1, h_2, …, h_i, …, h_m], H ⊆ Zgrp,
h_1 < h_2 < … < h_i < … < h_m. (6)

The vector H contains the m (m ≤ γ) unique elements of the vector Zgrp. The parameter Δz depends on the input data and is selected experimentally. Usually values from 5 to 10 give sufficient results. A value of 1 should be used when the points are already distributed in groups. Higher values of Δz decrease the number of groups and increase the number of points in the groups.

Fig. 3. Example of the unstructured data.

During the fourth step the points in each group h_i are sorted again, but this time by a parameter α_i that takes into account a point's angular position around the center axis I. Based on this parameter, the points are written into the matrices D(b). α_i is calculated using the equation

α_i = ⌊((360 + s_i · arccos(x_i / ρ_i)) mod 360) / Δa⌉,
s_i = y_i / |y_i| if y_i ≠ 0, s_i = 1 if y_i = 0,
ρ_i = √(x_i² + y_i²), 0 < Δa ≤ 1, i = 1, …, γ. (7)

Here ⌊·⌉ is the round-to-nearest operator. The parameter Δa depends on the input data and is selected experimentally. Δa affects the surface approximation: the lower the value of Δa, the more detailed the surface approximation is. If the data contain noise, higher values of Δa give better results.

For the sorting of the data, a new vector A is used:

A = [a_1, a_2, …, a_j, …, a_w], a_j = Δa × (j − 1),
a_1 ≤ a_j ≤ a_w, w = ⌊360 / Δa⌋, 1 ≤ j ≤ w, 0 < Δa ≤ 1. (8)

The sorting is performed using the equation

D̃(b)(h_i, a_j) = z_k for b = 1, ρ_k for b = 2,
where w_k is the point whose group label g_k equals h_i and whose angular index α_k equals a_j,
h_1 < h_2 < … < h_m, a_1 < a_2 < … < a_w. (9)

The result of this equation is written into the matrices D(b):

D(b)_{i,j} = D̃(b)(h_i, a_j), b = 1, 2, i = 1, …, m, j = 1, …, w. (10)

If the cloud of points is unstructured, the matrices D̃(b) will most likely contain empty entries that must be calculated before proceeding. In the presented algorithm, a simple linear interpolation between the neighboring non-empty entries is used. Further calculations are performed on the modified matrices D̃(b) that do not have empty entries.

C. Formation of knot vectors

The B-spline knot vectors U and V are calculated by subdividing D̃(b) into R × W data blocks, as shown in Fig. 5. The endpoints of each data block are marked as u_c and v_d.

Fig. 5. Subdivision of D̃(b).

The subdivision can be non-uniform, although the endpoints have to be selected according to the following requirements:

u_0 ≤ h_i ≤ u_R, 1 ≤ i ≤ m, 0 ≤ c ≤ R, 1 ≤ R ≤ m − p, (11)
v_0 ≤ a_j ≤ v_W, 1 ≤ j ≤ w, 0 ≤ d ≤ W, 1 ≤ W ≤ w − q. (12)

Here p and q are the B-spline degrees in the directions u and v, respectively. In B-spline theory the cubic spline is the most common type, sufficient to approximate most types of shapes, so a cubic B-spline surface, i.e. of order 4 (degree 3), is used throughout this paper too. The parameters R and W are selected experimentally. A B-spline surface approximates the acquired data more precisely if higher R and W values are used. The requirements in (11) and (12) ensure that the Schoenberg-Whitney conditions are met, i.e. that it is possible to create a valid parametric surface that approximates the given data set [16].

The knots u_c and v_d are calculated using the equations

u_c = h_1 if c < p, h_1 + (c − p)(h_m − h_1)/(m − 1) if p ≤ c ≤ R + p − 2, h_m if c > R + p − 2, 0 ≤ c ≤ R + 2p, (13)
v_d = a_1 if d < q, a_1 + (d − q)(a_w − a_1)/(w − 1) if q ≤ d ≤ W + q − 2, a_w if d > W + q − 2, 0 ≤ d ≤ W + 2q. (14)
= " |0 ≤ & ≤ ' + ) + 1* and + = " |0 ≤ , ≤ + = , - + . + 1* are B-spline knot vectors. The parameters , ℎ (15) and , :dE > are calculated according to the equations, (16) 0 ≤ & ≤ Y, 1 ≤ Y ≤ n − ), described in details in Section III D. The parameters , are k 0 ≤ , ≤ o, 1 ≤ o ≤ h − .. the control points of the surfaces P(b)(u,v). They are unknown in Since , in (19) does not depend on the index d, the advance but can be calculated using an equation (24). D. Calculation of B-spline blending functions Blending functions define the type of the splines used to form equation can be rewritten as: into parameters , and , of the (2) equation and a parametric surface. In this paper these functions are encoded j,E = , ℎ z , :dE >, {. k k (20) are used to calculate B-spline’s control points Pc,d. The parameters , ℎ and , :dE > are stored in the The relation between the parameterization intervals u and v in matrices , and E, respectively: the expressions (3) and the values hi and aj in the expressions (11)-(12) are used to calculate B-spline’s blending functions. non-zero coefficients , ℎ , ℎ ∈ and , :dE > , dE ∈ }, },; ⋯ }, As a result blending functions are defined by p+1 and q+1 } };,; ⋯ };, , ℎ = , = | ;, , ⋮ ⋮ ⋱ ⋮ respectively. These coefficients are calculated using Cox-de }H, }H,; ⋯ }H, (21) Boor recurrences [7]: 1, ℎ ∈ s, > }, },; ⋯ }, Initialize , ℎ = ` , 0, otherwise };, };,; ⋯ };, , :dE > = E, =| , 1, d ∈ s , > ⋮ ⋮ ⋱ ⋮ , :dE > = ` E , (22) 0, otherwise }e, }e,; ⋯ }e, 0 ≤ ≤ ', 0 ≤ ≤ -, ,X ℎ , ℎ = ℎ − + 1 ≤ ≤ n, 1 ≤ F ≤ h. − ,X ℎ The inner sum of the equation (20) can be denoted as ,E : k +: − ℎ > , − (17) j,E = , ,E , k k ,X :dE > , :dE > = :dE − > + − (23) ,X :dE > ,E = E, , . k k +: − dE > , (18) − (24) ≤ ℎ ≤ , 1 ≤ ≤ n, There is a possibility, that there are no such , for which k ≤ dE ≤ , 1 ≤ F ≤ h. both equations (23) and (24) are valid. So a minimization is Here ! = " |0 ≤ & ≤ ' + ) + 1* and + = " |0 ≤ , ≤ - + used to solve these equations. . 
The equation (23) hides several over-determined systems of linear equations, one for each column index j. The unknown coefficients Q(b)_{c,j} can be calculated by minimizing the sums

min Σ_{i=1..m} | D(b)_{i,j} − Σ_{c=0..r} N_{c,p}(h_i) Q(b)_{c,j} |²,
0 ≤ c ≤ r, 1 ≤ j ≤ w, b = 1, 2. (25)

Here |·| is the length of a vector. The equation (24) likewise hides one over-determined system per index c. The unknown control point coefficients P(b)_{c,d} can be calculated by minimizing the sums

min Σ_{j=1..w} | Q(b)_{c,j} − Σ_{d=0..s} N_{d,q}(a_j) P(b)_{c,d} |²,
0 ≤ c ≤ r, 0 ≤ d ≤ s, b = 1, 2. (26)
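Because (23)-(24) are separable, the fit reduces to two passes of ordinary linear least squares. A minimal dense sketch using NumPy's solver (the paper solves the systems with MATLAB's QR implementation; the matrix names follow (21)-(22), and the exact staging here is an assumption about the authors' procedure):

```python
import numpy as np

def fit_control_points(D, E, F):
    """Two-stage least-squares solve sketching (25)-(26).

    D : m x w matrix of transformed data D(b)
    E : m x (r+1) matrix, E[i, c] = N_{c,p}(h_i)
    F : w x (s+1) matrix, F[j, d] = N_{d,q}(a_j)

    Returns P with P[c, d] approximating the control coefficients,
    so that E @ P @ F.T is close to D in the least-squares sense."""
    Q, *_ = np.linalg.lstsq(E, D, rcond=None)     # stage (25): E Q ~ D
    Pt, *_ = np.linalg.lstsq(F, Q.T, rcond=None)  # stage (26): F P^T ~ Q^T
    return Pt.T
```

One call per b = 1, 2 yields the two coordinate fields combined in (27).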
The control points P_{c,d} are points in ℝ³; their three coordinates are defined as the tensor

P_{c,d} = (P(1)_{c,d}, P(2)_{c,d}, ū_c), 0 ≤ c ≤ r, 0 ≤ d ≤ s. (27)

Here ū_c is a value that stores the average of p successive B-spline knots starting from u_{c+1} and is calculated using the equation

ū_c = (u_{c+1} + ⋯ + u_{c+p}) / p. (28)

Here the parameter p is the degree of the B-spline surface in the u direction.

IV. VISUALIZATION OF PARAMETRIC SURFACE

The calculated parameters can be used directly to visualize a surface with the OpenGL graphics library¹. The control points P_{c,d} and the two knot vectors U and V are accepted as parameters by the GLU function gluNurbsSurface() and are sufficient to create and visualize a surface. Additionally, the parametric surface can be converted to a triangulated mesh by evaluating surface points from equation (2) and then using the algorithm of [9] to pair them into triangles.
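The mesh conversion just described amounts to evaluating (2) on a regular (u, v) grid and pairing neighbouring samples into triangles. A generic sketch of the pairing step, two triangles per grid quad (an illustration only, not the specific pairing algorithm of [9]):

```python
def grid_to_triangles(rows, cols):
    """Pair the vertices of a rows x cols grid of evaluated surface
    samples into triangles, two per grid quad.  Vertices are indexed
    row-major, so vertex (i, j) has flat index i * cols + j."""
    tris = []
    for i in range(rows - 1):
        for j in range(cols - 1):
            a = i * cols + j          # corners of one grid quad:
            b = a + 1                 #   a --- b
            c = a + cols              #   |     |
            d = c + 1                 #   c --- d
            tris.append((a, b, c))
            tris.append((b, d, c))
    return tris
```

The resulting index list, together with the evaluated vertex positions, is exactly what a triangle-mesh file format such as OFF stores.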
V. EXPERIMENTAL RESULTS

For the experiments a cloud of points representing a head sculpture, shown in Fig. 6(a), is used. The cloud consists of 80589 unstructured points, as shown in Fig. 6(b). The results of the surface reconstruction algorithm presented in this paper are shown in Figs. 6(c)-6(g). The algorithm is implemented using MATLAB² and custom-built code. The exception is the calculation of the control points: MATLAB's implementation of the QR algorithm³ is used to solve the systems of linear equations (23) and (24).

The parametric surface 6(c) is created using 35728 control points, 180 knots in the u direction and 207 knots in the v direction. A fragment of the parametric surface's internal structure is shown in Fig. 6(e). The parametric surface can be converted into a triangle mesh of a desired level of detail. As an example, the parametric surface 6(c) is converted into the triangle mesh 6(d), which contains 171570 triangles. Like the parametric surface, the triangle mesh is well structured, as shown in Fig. 6(f).

In order to demonstrate that the proposed algorithm is capable of creating parametric models of other shapes of surfaces, a few other objects were scanned. The calculated models of these items are shown in Fig. 6(g).

In order to evaluate the accuracy of the approximation of the presented algorithm, vectors connecting the input data points of the transformed surface and the corresponding points on the B-spline surface were calculated. The mean of the vector lengths is 0.0281 mm and the standard deviation is 0.0885 mm. These numbers are influenced by possible measurement errors present in the input data. The calculated mean and standard deviation show that the approximation errors are quite small and hardly visually noticeable. The other algorithms mentioned in the introduction usually do not produce approximation errors, because the surface points are interpolated, meaning that measurement errors and noise are interpolated too.

The approximation of the head sculpture using the presented algorithm took 24.742 s to complete (transformation: 15.9161 s, calculation of control points: 8.099 s, other steps: 0.7269 s) on a PC with a 2.4 GHz Core 2 Duo CPU and 4 GB of memory. The 3D surface was visualized using the OpenGL library with hardware-accelerated graphics enabled. The parameters of the B-spline surface were stored in a custom format in a text file of size 743 KB. The triangulation results were stored in a plain-text object file format (OFF) file of size 5.12 MB.

VI. CONCLUSIONS

In this paper a new algorithm for surface reconstruction from an unstructured cloud of points is presented. The algorithm is guaranteed to generate a hole-free surface, as it creates a single B-spline surface in the shape of the cloud of points. The generated model is well structured and simple to modify, because a parametric B-spline surface preserves C2 continuity when its parameters are changed. This is much harder to accomplish with triangulated surfaces. The combination of all three features mentioned above makes the presented algorithm different from other well-known algorithms such as Tight Cocone [10] and Power Crust [3].

The experimental results presented in this paper show that the proposed algorithm can be successfully used for creating a model of a structured surface from an unstructured cloud of points, where the points are measured from a surface containing a centre-line such that all perpendicular rays to that line intersect the surface no more than once. It was also experimentally shown that the suggested algorithm has a small average processing time and that its output uses about 7 times less space on a hard disk than the output of triangulation-based algorithms.
¹ http://www.opengl.org/
² http://www.mathworks.se/products/matlab/
³ http://www.mathworks.se/help/matlab/ref/qr.html

Fig. 6. Input data: (a) test cloud of points, (b) distribution of the points in the cloud. Surface reconstruction results using the proposed algorithm: (c) parametric surface, (d) triangulated surface, (e) structure of the parametric surface, (f) structure of the triangulated surface, (g) examples of the other parametric reconstructions.

REFERENCES

[1] N. Amenta, M. Bern, M. Kamvysselis. A new Voronoi-based surface reconstruction algorithm. SIGGRAPH '98, pp. 415-421, 1998.
[2] N. Amenta, S. Choi, T. K. Dey, N. Leekha. A simple algorithm for homeomorphic surface reconstruction. International Journal of Computational Geometry and Applications (IJCGA), vol. 12, no. 1-2, pp. 125-141, 2002.
[3] N. Amenta, S. Choi, R. K. Kolluri. The Power Crust. Proceedings of the 6th Annual Symposium on Solid Modeling and Applications, pp. 249-260, 2001.
[4] C. L. Bajaj, F. Bernardini, G. Xu. Automatic reconstruction of surfaces and scalar fields from 3D scans. SIGGRAPH '95, pp. 109-118, 1995.
[5] F. Bernardini, J. Mittleman, H. E. Rushmeier, C. T. Silva, G. Taubin. The ball-pivoting algorithm for surface reconstruction. IEEE Transactions on Visualization and Computer Graphics (TVCG), vol. 5, no. 4, pp. 349-359, 1999.
[6] J. D. Boissonnat. Geometric structures for three-dimensional shape representation. ACM Transactions on Graphics (TOG), vol. 3, no. 4, pp. 266-286, 1984.
[7] C. de Boor. On calculating with B-splines. Journal of Approximation Theory, vol. 6, pp. 50-62, 1972.
[8] B. Curless, M. Levoy. A volumetric method for building complex models from range images. SIGGRAPH '96, pp. 303-312, 1996.
[9] A. Davidsonas, R. Liutkevičius. A PLC application for data scanning and visualization. Proceedings of the International Conference "Electrical and Control Technologies – 2008", Kaunas, pp. 144-149, 2008.
[10] T. K. Dey, S. Goswami. Tight Cocone: a water-tight surface reconstructor. Symposium on Solid Modeling and Applications (SMA), pp. 127-134, 2003.
[11] T. K. Dey, S. Goswami. Provable surface reconstruction from noisy samples. Computational Geometry: Theory and Applications (COMGEO), vol. 35, no. 1-2, pp. 124-141, 2006.
[12] H. Edelsbrunner, E. P. Mücke. Three-dimensional alpha shapes. ACM Transactions on Graphics (TOG), vol. 13, no. 1, pp. 43-72, 1994.
[13] E. Lengyel. Mathematics for 3D Game Programming and Computer Graphics, Third Edition. Cengage Learning, 2011.
[14] H. Hoppe, T. DeRose, T. Duchamp, J. A. McDonald, W. Stuetzle. Surface reconstruction from unorganized points. ACM SIGGRAPH Computer Graphics, vol. 26, no. 2, pp. 71-78, 1992.
[15] L. Piegl, W. Tiller. The NURBS Book, Second Edition. Springer-Verlag, 1997.
[16] I. J. Schoenberg, A. Whitney. On Pólya frequency functions. III. The positivity of translation determinants with an application to the interpolation problem by spline curves. Transactions of the American Mathematical Society, vol. 74, pp. 246-259, 1953.
[17] G. Turk, M. Levoy. Zippered polygon meshes from range images. SIGGRAPH '94, pp. 311-318, 1994.

Raimundas Liutkevičius was born in Kaunas, Lithuania, in 1974. He received his Bachelor, Master and PhD degrees in computer science from Vytautas Magnus University. His research areas include computer graphics, system modeling and learning control.

Andrius Davidsonas was born in Kaunas, Lithuania, in 1982. He received the Bachelor degree in Applied Informatics from Vytautas Magnus University, Kaunas, in 2005 and the Master degree in Informatics from Vytautas Magnus University in 2007. At present he is a Ph.D. student at Vytautas Magnus University. His research areas are computer graphics, surface reconstruction algorithms, 3D scanners and 3D printers.

16-BIT RCA Implementation Using Current Sink Restorer Structure
Tirumalasetty Venkata Rao, Avireni Srinivasulu, Member, IEEE

Abstract—This paper presents the design of a 16-bit Ripple Carry Adder (RCA) in Branch-Based Logic and Pass-Transistor logic (BBL-PT), a static design style that minimizes the internal node capacitances. This feature is used to lower the dynamic power dissipation while maintaining good speed performance. We propose a modified level restorer using a current sink restorer structure for the branch-based logic and pass-transistor (BBL-PT) full adder [1]. The BBL-PT full adder has a drawback, namely the existence of a voltage step, which is eliminated in the proposed logic by using the current sink restorer. The proposed ripple carry adder is compared with the conventional static CMOS logic and BBL-PT RCAs and demonstrates good delay performance. The performance of the 16-bit RCA based on the proposed BBL-PT cell with the current sink restorer structure and of other conventional RCA structures is examined using PSPICE and the model parameters of a 0.13 µm CMOS process.
Index Terms—CMOS digital integrated circuits, current sink restorer, full adder, high-speed, performance analysis, ripple carry adder.

I. INTRODUCTION

The full adder is the core element of complex arithmetic circuits for addition, multiplication, division, exponentiation, etc. Adders play an important part in the final phase of signal processing in some advanced architectures of high-speed analog-to-digital converters. Adders are important components in applications such as digital signal processor (DSP) architectures and microprocessors. In addition to its main task, which is adding two numbers, the adder participates in many other useful operations such as subtraction, multiplication, division and address calculation. In most of these systems the adder lies in the critical path that affects the overall speed of the system. Accordingly, extensive research is being conducted to develop novel architectures, circuit configurations, layouts, design styles and design methodologies with the aim of improving adder speed and energy efficiency [1]-[3].

The performance parameters of a full adder are delay, power consumption and power-delay product. The logic style used in the logic gates basically influences the speed, size, power dissipation and wiring complexity of a circuit. The circuit delay is determined by the number of inversion levels, the number of transistors in series, the transistor sizes (i.e. channel widths) and the intra-cell wiring capacitances. Circuit size depends on the number of transistors, their sizes and their wiring complexity. Some designs use one logic style for the whole full adder, whereas others use more than one logic style for their implementation [4]-[7].

The ever-increasing demand for mobile products, working with a high throughput capability and a limited source of power, makes the design of low-power adder cells another significant goal to be attained. The design of a full adder having low power consumption and low propagation delay is of great interest for the implementation of modern digital systems [8]-[9].

The full adders TG-CMOS (Transmission Gate based Complementary Metal Oxide Semiconductor), TFA (Transmission Function full Adder) and 14T have the advantage of a lower transistor count and fewer intermediate nodes, and provide better performance than CMOS and CPL (Complementary Pass Transistor Logic) designs. TG-CMOS, TFA, 14T, CMOS and CPL show good behavior when implementing a 1-bit full adder cell, but they may show performance degradation when used to implement more complex structures. Recently, building low-power VLSI systems has gained momentum because of the fast growth of technologies in mobile communication and computation. However, battery technology does not grow as fast as microelectronics technology, so there is a limited amount of power available for mobile systems. Designers are therefore faced with more constraints, such as high speed, high throughput, small silicon area and, at the same time, low power consumption. So building low-power, high-performance adder cells is an important factor in today's growing VLSI technologies [10]-[12].

Adder performance can be improved by efficiently implementing the carry propagation chain. This can be addressed either by improving the structure of the 1-bit full adder, which is one of the basic cells in adders such as the carry select or carry skip adder as well as the building block of the ripple carry adder (RCA), since an n-bit RCA is formed by n 1-bit full adders, or by using improved fast adder architectures such as conditional sum adders (CSAs) or carry look-ahead (CLA) adders [1]. With n-bit ripple carry adders, the speed of operation is mainly limited by the time taken to generate and propagate a carry from input to output in each one-bit cell involved [13].

Authors are associated with the Department of Electronics and Communications, Vignan University, Vadlamudi-522 213, Guntur, A.P, India. Authors can be reached by e-mail at venkatarao.srp@gmail.com and avireni@ieee.org, respectively.
Circuit size depends Authors can be reached by e-mail venkatarao.srp@gmail.com and on the number of transistors, their sizes and on their wiring avireni@ieee.org, respectively. complexity. Some of them use one logic style for the whole full INTERNATIONAL JOURNAL OF DESIGN, ANALYSIS AND TOOLS FOR INTERGRATED CIRCUITS AND SYSTEMS, VOL. 4, NO. 1, DECEMBER 2013 10 adder where as others use more than one logic style for their A. 1-Bit FA Based on Current Sink Restorer implementation. The current sink restorer structure (within the circle shown In this paper, we proposed a modified level restorer using in fig. 1) in which current sink load is used. The current sink is a current sink restore structure (within the circle shown in fig. 1) common gate configuration using an n-channel transistor with for BBL-PT full adder [1]. The proposed circuit eliminates the gate connected to a fixed bias supply; on account of this, the voltage step in the existing full adder and having good delay NMOS is always in ON Condition; while the PMOS transistor performance i.e. high operating speed. The remaining section of serves as a pull-up network. This inverter achieves higher voltage gain than active load inverters. the paper is organized as follows. In section II, proposed full adder structure implementation. The implementation of a 16-bit In order to prevent the voltage step that appears in 0→1 RCA based on the proposed full adder and their simulation transition on the sum output signal by using the current sink results is described in section III and finally in section IV ended restorer in the proposed full adder circuit as shown in Fig. 1(a). up drawing with conclusion. The sum can be implemented using pass transistor logic i.e. NMOS transistors are used. In general, NMOS transistor has II. PROPOSED FULL ADDER IMPLEMENTATION stronger 0's & weaker 1's, when input is applied. 
The carry can A Full Adder (FA) cell is a three-input and two-output block be implemented [1] with branch structure, using the in which the outputs are obviously the addition of three inputs. simplification method given [14] as shown in Fig. 1(b). This has three 1-bit inputs (A, B, and Cin) and two 1-bit outputs (sum and carry). The relations between the inputs and the outputs are expressed Sum = ABCin + ABCin + ABCin + ABCin (1) Carry = AB + BCin + ACin (2) For the low power design, a cell schematic must be implemented with few transistors and intra-cell node connections as far as possible. The branch-based design (a) technique meets these requirements while ensuring robustness with respect to voltage and device scaling. Here, logic cells are designed exclusively with branches composed of transistors in series connected between a supply line and the gate output. In branch-based design, the networks are composed only of branches, i.e., series connections of transistors between the output node and the supply rail. The advantages of transistor branches are higher layout regularity and simpler characterization (i.e., branch instead of gate modeling) [14]. Branch based design having the advantages of no diffusion interruption, common drain for two branches, a minimal number of contacts and few metal connections. Since a lower capacitance is switched during each transition, this results in less dynamic power consumption and lower delays. Fig.1. Proposed full adder circuit with current sink restorer structure (within the dashed circle). The disadvantage of BBL-PT full adder implementation lies in the discharge of weak high output level in pass- transistors used in the sum block that can be realized by the feedback When logic “1” is passed through the NMOS network, the pull-up PMOS transistor. In order to restore the weak logic “1” node “Sout” is to be charged to a weak logic “1”. When the (i.e. 
VddVtn) caused by the pass transistors, it has provided voltage at node “Sout” is < Vdd/2, the PMOS transistor in the sufficient drive to the successive stages. However, the level current sink restorer reaches switching threshold, and turned restoration implemented this way causes a voltage step at the ON and when the output of inverter is logic “1” then, the pull-up output node “Sout” during a 0 → 1 transition. This voltage step PMOS is turned OFF, and the node “Sout” is charged with an is due to the threshold voltage drop in the pass transistors and effective drive current that equals the current of the NMOS the delay needed by the level restorer to restore the weak logic network. “1” level. If the voltage step exists, the ON time period (tON) and When the voltage at node “Sout” reaches a Vdd/2, the PMOS OFF time period (tOFF) will not be equal. To make it equal we transistor in the current sink inverter is turned OFF. Then, the proposed a level restorer using current sink inverter structure. P-transistor is turned OFF while the N-transistor is always kept ON condition irrespective of input, thus it reduces the delay INTERNATIONAL JOURNAL OF DESIGN, ANALYSIS AND TOOLS FOR INTERGRATED CIRCUITS AND SYSTEMS, VOL. 4, NO. 1, DECEMBER 2013 11 during the 0 → 1 transition. Finally, the output of the current of the carry bits rippling through the carry chain. Therefore, a sink inverter is logic “0”, it is applied to the input of pull-up fast carry-out response becomes essential for the overall PMOS. Then, the pull-up PMOS is turned ON, and the effective performance of the adder chain. drive current charging the capacitance at node “Sout” becomes A0 B0 A1 B1 A2 B2 A15 B15 the sum of the current flowing through the NMOS network and the pull-up PMOS current. 
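The sum and carry relations of Eqs. (1) and (2) can be checked exhaustively in a few lines. The sketch below (plain Python, at the behavioural level only, independent of any transistor-level detail in the paper) verifies that the sum-of-products forms agree with ordinary binary addition:

```python
from itertools import product

def fa_sum(a, b, cin):
    # Eq. (1): the four minterms with an odd number of 1s among the inputs
    return (((not a) & (not b) & cin) | ((not a) & b & (not cin)) |
            (a & (not b) & (not cin)) | (a & b & cin))

def fa_carry(a, b, cin):
    # Eq. (2): majority function of the three inputs
    return (a & b) | (b & cin) | (a & cin)

# Exhaustive check against integer addition: a + b + cin = 2*carry + sum
for a, b, cin in product([0, 1], repeat=3):
    s, c = int(fa_sum(a, b, cin)), fa_carry(a, b, cin)
    assert a + b + cin == 2 * c + s
```

Since there are only eight input combinations, the loop proves the two equations equivalent to the usual XOR/majority definition of a full adder.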
The SUM block could also be implemented with branch-based logic, but a SUM block implemented with branches alone is not advantageous, since it requires 24 transistors for the 1-bit SUM generation. Thus, an implementation with pass transistors was used for the sum block, as shown in Fig. 1(a). The carry is implemented [1] with a branch structure, using the simplification method given in [14], as shown in Fig. 1(b). The branch equations for the NMOS and PMOS networks are given by

CN = A Cin + B Cin + A B   (3)
CP = A Cin + B Cin + A B   (4)

where the PMOS branches receive the same input signals but conduct when those inputs are low, realizing the complementary function.

B. Transistor Sizes

One way to increase the switching speed, and thus reduce the delay, is to increase the W/L ratios of all transistors in the circuit. However, increasing the transistor W/L ratios also increases the gate, source and drain areas and, consequently, the parasitic capacitances loading the logic gates [15]. Transistor sizing is therefore an iterative process for improving the performance of the full adder cell. The transistor sizes (in micrometers) for the proposed full adder and the conventional full adders (CMOS and CPL) are:

W = 0.4 µm, L = 0.13 µm for PMOS
W = 0.2 µm, L = 0.13 µm for NMOS

The pull-up PMOS transistor in the proposed level restorer is given minimum W/L ratios, i.e., W = 0.18 µm and L = 0.18 µm. For this purpose, the SPICE level 3 model of the MOS transistors, with Vtn = 0.5 V and Vtp = −0.6 V, was used. The pull-up PMOS transistor must have a high ON resistance in order to restore the high logic level without affecting the low logic levels at the output node "Sout".

The branch-based logic, in combination with pass-transistor logic, allows a simple implementation of the full adder gate, namely the BBL-PT full adder, with only 23 transistors. This compares favorably with the 28-transistor static CMOS full adder. The proposed full adder occupies the minimum silicon area on chip amongst the adders reported in this paper (except the BBL-PT FA), which makes it potentially useful for building compact VLSI circuits on a small chip area. The proposed circuit is designed with a combination of two logic styles and offers high speed, low static power consumption and a low transistor count.

III. 16-BIT RCA IMPLEMENTATION

To confirm whether cells that display good performance in 1-bit operation have sufficient driving capability to also perform well when cascaded in larger circuits, we chose a 16-bit RCA as a benchmark implementation to fairly compare the three 1-bit full adders.

The full adder circuits designed in this paper can be used as the basic building block of a 16-bit ripple carry adder, which accepts two 16-bit binary numbers as input and produces the binary sum and carry at the output. The simplest such adder is constructed by a cascade connection of sixteen full adders, where each adder stage adds two corresponding input bits plus the incoming carry, produces the corresponding sum bit, and passes its carry output on to the next stage. Hence, this cascade-connected adder configuration is called the ripple carry adder, as shown in Fig. 2. The overall speed of the ripple carry adder is obviously limited by the delay of the carry bits rippling through the carry chain; therefore, a fast carry-out response becomes essential for the overall performance of the adder chain.

Fig. 2. 16-bit RCA based on the proposed full adder cell.

The proposed full adder and the other conventional adders (CMOS and BBL-PT) were simulated using PSPICE and the model parameters of a 0.13 µm CMOS process. The simulations were carried out with a supply voltage Vdd = 1.2 V and a frequency of 100 MHz. A comparison with existing designs, showing the delay advantage of the proposed design, is given in Table I.

TABLE I
PERFORMANCE COMPARISON OF THE PROPOSED AND ALTERNATIVE IMPLEMENTATIONS OF FULL ADDERS WITH Vdd = 1.2 V AND f = 100 MHz

Design                   | Delay (ps)              | Device count
                         | 1-Bit | 8-Bit | 16-Bit  | (1-Bit)
CMOS FA                  |  124  |  833  |  1020   |   28
BBL-PT FA                |  110  |  845  |  1091   |   23
Current sink restorer FA |   85  |  514  |   654   |   23

In the BBL-PT 16-bit adder there exists a voltage step in the SUM output waveform during the 0 → 1 transition, as shown in Fig. 3; it is eliminated in the proposed full adder design using the modified current sink restorer, as shown in Fig. 4. The voltage step in BBL-PT is due to the delay needed by the level restorer to restore the weak logic "1" level; the proposed level restorer does not require this delay. Consistent with Table I, the BBL-PT 16-bit RCA shows a slightly higher delay than the CMOS 16-bit RCA, and both exhibit a considerably higher delay than the proposed 16-bit RCA.

For each transition, the delay is measured from 50% of the input voltage swing to 50% of the output voltage swing, and the maximum delay is taken as the cell delay. It is apparent that, amongst the conventional full adders examined, the proposed one has the lowest delay.

Fig. 3. 16-bit BBL-PT RCA output waveforms.

The simulated input and output waveforms of the existing (BBL-PT) 16-bit RCA are shown in Fig. 3 and those of the proposed 16-bit RCA in Fig. 4. The proposed 16-bit ripple carry adder shows lower delay as the supply voltage is increased; the results are plotted in Fig. 5.
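The ripple-carry organization of Fig. 2 is easy to express at the behavioural level. The sketch below (Python; the 1-bit cell is the generic full-adder logic, not the transistor-level BBL-PT cell) chains sixteen 1-bit stages and accumulates a worst-case delay from a hypothetical uniform per-stage carry delay; dividing the 654 ps 16-bit figure of Table I by 16 gives roughly 40.9 ps per stage, an assumption of this sketch rather than a number stated in the paper:

```python
def full_adder(a, b, cin):
    """Behavioural 1-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (b & cin) | (a & cin)
    return s, cout

def ripple_carry_add(x, y, n=16, stage_delay_ps=40.9):
    """Add two n-bit numbers bit by bit, rippling the carry.

    stage_delay_ps is a hypothetical per-stage carry delay used only to
    illustrate how the carry chain dominates the n-bit delay.
    """
    carry, total, delay = 0, 0, 0.0
    for i in range(n):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        total |= s << i
        delay += stage_delay_ps   # worst case: carry traverses every stage
    return total | (carry << n), delay

value, worst_delay = ripple_carry_add(0xFFFF, 0x0001)  # forces a full ripple
assert value == 0x10000
```

The operand pair 0xFFFF + 1 propagates a carry through all sixteen stages, which is exactly the worst case the 50%-to-50% delay measurement in the paper captures.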
It can thus be seen that the proposed current sink restorer based 16-bit RCA shows better delay performance than the CMOS and BBL-PT based 16-bit RCAs. The results are shown in Fig. 6.

Fig. 4. Simulated input and output waveforms of the proposed 16-bit RCA.

Fig. 5. Delay vs. supply voltage comparison of 16-bit ripple carry adders.

Fig. 6. Delay vs. sum bit position comparison for the proposed RCA based on the current sink restorer and other 16-bit ripple carry adders.

IV. CONCLUSION

In this paper, we have presented a modified level restorer using a current sink restorer structure for the BBL-PT full adder. The proposed circuit provides a regular and compact layout structure and reduces the diffusion capacitances (since it eases diffusion sharing). The proposed level restorer eliminates the voltage step of the existing full adder design and achieves good delay performance. The 16-bit RCA based on the proposed cell achieves about 40% lower delay than the static CMOS and BBL-PT based RCAs. Finally, in the implementation of high-performance complex structures such as multipliers, ALUs (arithmetic logic units), counters, memories, finite state machines, microprocessors, microcontrollers, micro-electromechanical sensors and mobile communications, the intrinsic benefits of the proposed cell could be fully exploited.

REFERENCES

[1] I. Hassoune, D. Flandre, I. O'Connor, J.-D. Legat, "ULPFA: A new efficient design of a power-aware full adder," IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 57, no. 8, pp. 2066-2074, 2010.
[2] I. S. Abu-Khater, A. Bellaouar, M. I. Elmasry, "Circuit techniques for CMOS low-power high-performance multipliers," IEEE J. Solid-State Circuits, vol. 31, no. 10, pp. 1535-1546, Oct. 1996.
[3] R. Zimmermann, W. Fichtner, "Low-power logic styles: CMOS versus pass-transistor logic," IEEE J. Solid-State Circuits, vol. 32, pp. 1079-1090, 1997.
[4] H. Lee, G. E. Sobelman, "A new low-voltage full adder circuit," in Proc. 7th Great Lakes Symp. VLSI, 1997, pp. 88-92.
[5] R. X. Gu, M.-I. Elmasry, "Power dissipation analysis and optimization of deep submicron CMOS digital circuits," IEEE J. Solid-State Circuits, vol. 31, no. 5, pp. 707-713, May 1996.
[6] N. Zhuang, H. Wu, "A new design of the CMOS full adder," IEEE J. Solid-State Circuits, vol. 27, pp. 840-844, 1992.
[7] A. M. Shams, T. Darwish, M. Bayoumi, "Performance analysis of low-power 1-bit CMOS full adder cells," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 10, no. 1, pp. 20-29, 2002.
[8] C.-H. Chang, J. Gu, M. Zhang, "A review of 0.18-µm full adder performances for tree structured arithmetic circuits," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 13, no. 6, pp. 686-694, 2005.
[9] S. Goel, S. Gollamudi, A. Kumar, M. Bayoumi, "On the design of low-energy hybrid CMOS 1-bit full adder cells," in Proc. 2004 Midwest Symp. Circuits and Systems, vol. 2, pp. 209-212.
[10] N. Weste, K. Eshraghian, Principles of CMOS VLSI Design: A Systems Perspective. Reading, MA: Addison-Wesley, 1988.
[11] V. V. Shubin, "New CMOS circuit implementation of a one-bit full adder cell," Russian Microelectronics, vol. 40, no. 2, pp. 130-139, 2011.
[12] K. Chu, D. Pulfrey, "A comparison of CMOS circuit techniques: Differential cascode voltage switch logic versus conventional logic," IEEE J. Solid-State Circuits, vol. 22, pp. 528-532, 1987.
[13] J. M. Rabaey, A. Chandrakasan, B. Nikolic, Digital Integrated Circuits: A Design Perspective, 2nd ed. Englewood Cliffs, NJ: Prentice Hall, 2002.
[14] C. Piguet, "Chapter 7: Logic circuits and standard cells," in C. Piguet, Ed., Low-Power Electronics Design. Boca Raton, FL: CRC Press, 2004.
[15] S. Kang, Y. Leblebici, CMOS Digital Integrated Circuits: Analysis and Design. McGraw-Hill, 2003.

Venkata Rao Tirumalasetty was born in Sriranga Puram, Guntur (A.P.), India, in 1987. He received the B.Tech. degree in Electronics and Communication Engineering from J.N.T. University, Kakinada, in 2009 and the Master of Technology (M.Tech.) in VLSI Design from Vignan University, Vadlamudi, Guntur, India, in 2012. He is currently working as an Assistant Professor in the Department of Electronics and Communication Engineering, Vignan's Nirula Institute of Technology and Science for Women, Guntur, India.

Srinivasulu Avireni was born in Thurimella (A.P.), India, in 1963. He received the B.Tech. degree in Electronics and Communication Engineering from Sri Venkateswara University, Tirupati, in 1986, the M.E. degree in Power Electronics Engineering from Gulbarga University, Gulbarga, in 1991, the M.S. degree in Software Systems from Birla Institute of Technology and Science (BITS), Pilani, in 1998, and the Ph.D. degree in Electronics & Communication Engineering (VLSI Design) from Birla Institute of Technology, Mesra, India, in 2010. He has worked as a Lecturer, Assistant Professor, Reader and Associate Professor, and is currently working as a Professor in the Department of Electronics & Communication Engineering, at T.G.L.G. Polytechnic, Adoni; Guru Nanak Dev Polytechnic, Bidar; K.S.R.M. College of Engineering, Kadapa; Defence University Engineering College, Ethiopia; Kigali Institute of Science, Technology & Management, Rwanda; Birla Institute of Technology, Mesra, Ranchi; and Vignan University, Vadlamudi, Guntur, India. He has 24 years of teaching and 14 years of research experience in Electronics & Communication Engineering. Dr. A. Srinivasulu is a member of IEEE, a life member of I.S.T.E. and a member of the Institution of Engineers (India).
He has published over 28 articles in international journals and international conference proceedings; his main research areas are microelectronics, VLSI design and analog ASICs.

INTERNATIONAL JOURNAL OF DESIGN, ANALYSIS AND TOOLS FOR INTEGRATED CIRCUITS AND SYSTEMS, VOL. 4, NO. 1, DECEMBER 2013

Early Stage Software Effort Estimation Using Function Point Analysis: An Empirical Validation

Tharwon Arnuphaptrairong

Abstract—Software effort and cost estimation are necessary at the early stage of the software development life cycle for the project manager to be able to successfully plan the software project. Unfortunately, most estimation models depend on details that only become available at a later stage of the development process. This paper proposes to use Function Point Analysis applied to the data flow diagram to solve this timing problem. The proposed methodology was validated on graduate student software projects at the Chulalongkorn University Business School. The results show the high potential of its applicability, along with some interesting insights that are worth looking into.

Index Terms—Software effort estimation, early stage software effort estimation, early stage Function Point Analysis, software effort empirical evidence.

I. INTRODUCTION

Software effort and cost estimation are necessary at the early stage of the software development life cycle for the project manager to be able to successfully plan the software project. Unfortunately, most estimation models depend on details that only become available at a later stage of the development process. For example, the object-oriented estimation models depend on the UML models (use cases, class diagrams and so on), which are not available until the design stage. This situation, where information is needed at an early stage but only becomes available later, is referred to as the software estimation paradox [1]. This paper proposes to use the data flow diagram (DFD) to solve this timing problem. At the requirement stage, the DFD depicts the functionality of the software system, and the information available in the DFD can be used to obtain Function Points and serve as the basis for software effort estimation.

This article is organized as follows. Section II gives an overview of the software effort estimation methods related to the proposed methodology, i.e., Function Point Analysis, early Function Points, the Function Points estimation from data flow diagram method, and the COCOMO cost estimation model. Section III describes the proposed methodology. Section IV presents the empirical results. The discussion and the conclusions of this research are presented in Sections V and VI, respectively.

T. Arnuphaptrairong is with the Department of Statistics, Chulalongkorn Business School, Chulalongkorn University, Bangkok 10250, Thailand (e-mail: Tharwon@acc.chula.ac.th).

II. OVERVIEW OF RELATED LITERATURE

This section reviews the software effort and cost estimation methods related to the proposed methodology: Function Point Analysis, early Function Points, the Function Points estimation from data flow diagram method, and the COCOMO cost estimation model.

A. Function Point Analysis

Function Point Analysis (FPA) originated in 1979 and is widely accepted, with many variants from both academics and practitioners [2]. Research in this area is also known as Functional Size Measurement (FSM). FPA or FSM can be classified into FP counting and FP estimation [3].

Function Point Analysis was introduced by Albrecht [4]. The concept is based on the idea that the functionality of the delivered software drives the size of the software (lines of code): the more functions delivered, the more lines of code. The functional size is measured in terms of Function Points (FP).

FPA assumes that a software program comprises functions or processes. In turn, each function or process consists of five unique components or function types, as shown in Fig. 1: External Input (EI), External Output (EO), External Inquiry (EQ), Internal Logical File (ILF) and External Interface File (EIF).

Each of these five function types is individually assessed for complexity and given a Function Point value, or weight, which varies from 3 (for simple external inputs) to 15 (for complex internal files), as shown in Table I. The Function Point values are based on the complexity of the feature being counted.

The low, average and high complexity levels of ILF and EIF are based on the number of Record Element Types (RET) and Data Element Types (DET). A Record Element Type (RET) is a subgroup of the data elements (records) of an ILF or EIF; a Data Element Type (DET) is a unique non-repeated data field. The complexity levels of EI, EO and EQ are based on the number of File Types Referenced (FTR) and Data Element Types (DET), where a File Type Referenced (FTR) is an ILF or EIF.

Fig. 1. The Albrecht five function types.

TABLE I
THE FUNCTION POINT WEIGHTS

Function Type           | Low | Average | High
External Input          |  3  |    4    |   6
External Output         |  4  |    5    |   7
External Inquiry        |  3  |    4    |   6
Internal Logical File   |  7  |   10    |  15
External Interface File |  5  |    7    |  10

The Unadjusted Function Points (UFP), or Unadjusted Function Point Count (UFC), is calculated as [4]:

UFP = Σ(i=1..5) Σ(j=1..3) Nij Wij   (1)

where Nij is the number of occurrences of function type i (of the five types) with complexity j, and Wij is the corresponding Function Point weight of complexity level j (low, average or high) for function type i. The sum over all occurrences is computed by multiplying each function count (N) by its Function Point weight (W) in Table I and adding up all the values.

The Function Point value obtained can be used directly for estimating the software project effort and cost, but in some cases it may need further adjustment with software development environment factors. To find the adjusted FP, the UFP is multiplied by a technical complexity factor (TCF), which is calculated by the formula

TCF = 0.65 + (sum of factors) / 100   (2)

There are 14 technical complexity factors: data communications, performance, heavily used configuration, transaction rate, online data entry, end-user efficiency, online update, complex processing, reusability, installation ease, operational ease, multiple sites, facilitation of change, and distributed functions. Each complexity factor is rated on the basis of its degree of influence, from no influence (0) to very influential (5). The adjusted Function Points (FP), or Function Point Count (FC), is then derived as follows:

FP = UFP × TCF   (3)

The International Function Point Users Group (IFPUG) is the organization that establishes the standards for Function Point size measurement, to ensure that function point counts are consistent and comparable across organizations. The counting manual can be found at http://www.ifpug.org. The International Organization for Standardization (ISO) established a common standard in 1996 in order to support consistency and promote the use of Functional Size Measurement (FSM), and updated versions are maintained. Besides the IFPUG FPA, three other FPA variants are also ISO-certified methods: Mk II, NESMA and COSMIC FFP.

B. Early Function Points

Early Function Points (EFP) and Extended Function Points (XFP) were proposed by Meli [5] to meet the need for a software size estimate at the early stage of the development life cycle. The method requires the estimator to put in knowledge at different levels of detail of a particular application. Functionalities are classified as macrofunction, function, microfunction and functional primitive, and each type of functionality is assigned a set of FP values (minimum, average and maximum). The EFP and XFP methods are, however, considered not very easy to use.

C. Function Points Estimation from the Data Flow Diagram

Functionality is the heart of FPA. One stream of research proposes that functionalities can be retrieved from Structured Analysis (SA) artifacts, expressed in the form of the Data Flow Diagram (DFD) for process modeling and the Entity Relationship Diagram (ERD) for data modeling. The DFD was proposed as the input for FPA by a number of papers, using either the DFD alone or together with the ERD [6]-[12].

Rask [6, 7] introduced an algorithm for counting Function Points using the specification from the DFD and the ERD data model; an automated system was also built. O'Brien and Jones [8] proposed a set of counting rules to incorporate the Structured Systems Analysis and Design Method (SSADM) into Function Point Analysis; the DFD, together with the I/O structure diagram, the Enquiry Access Path (EAP) and the Effect Correspondence Diagram (ECD), was applied to the counting rules of the Mark II FPA.

Shoval and Feldman [9] applied Mark II Function Points to the Architectural Design of Information Systems Based on Structural Analysis (ADISSA). The proposed method counts the attributes of all inputs and outputs from the DFD of the system to be built, and all of the relations in the database from the database design process, and then plugs these numbers into the Mark II model. The DFD was also proposed to be used together with the ERD in Gramantieri et al. [10]. To solve the problem of counting errors, Lamma et al. [11] built a system for automating the counting, called FUN (FUNction point measurement). The
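The counting scheme of Eqs. (1)-(3) is mechanical and is easily sketched in code. The Python below uses the IFPUG weights of Table I; the example counts at the bottom are made up for illustration and are not from the paper's data:

```python
# Table I weights: function type -> (low, average, high)
WEIGHTS = {
    "EI":  (3, 4, 6),
    "EO":  (4, 5, 7),
    "EQ":  (3, 4, 6),
    "ILF": (7, 10, 15),
    "EIF": (5, 7, 10),
}
LEVELS = {"low": 0, "average": 1, "high": 2}

def unadjusted_fp(counts):
    """Eq. (1): sum of N_ij * W_ij over the 5 types and 3 levels."""
    return sum(n * WEIGHTS[ftype][LEVELS[level]]
               for (ftype, level), n in counts.items())

def adjusted_fp(ufp, influence_ratings):
    """Eqs. (2)-(3): TCF = 0.65 + sum(factors)/100, FP = UFP * TCF."""
    assert len(influence_ratings) == 14   # the 14 technical factors, rated 0-5
    tcf = 0.65 + sum(influence_ratings) / 100
    return ufp * tcf

# Hypothetical example: 2 low EIs, 1 average EO, 1 low ILF
counts = {("EI", "low"): 2, ("EO", "average"): 1, ("ILF", "low"): 1}
ufp = unadjusted_fp(counts)       # 2*3 + 1*5 + 1*7 = 18
fp = adjusted_fp(ufp, [3] * 14)   # TCF = 0.65 + 42/100 = 1.07
```

Note that when all 14 factors are rated at the middle value, the TCF is close to 1 and the adjusted FP stays near the unadjusted count, which is why the UFP alone is often usable at the early stage.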
The proposed method counts the attributes transaction rate, online data entry, end user efficiency, online of all inputs and outputs from the DFD of the system to be built update, complex processing, reusability, installation ease, and all of the relations in the database from the database design operations ease, multiple sites, facilitate change, distributed process, and then plugs in all the numbers in the Mark II model. functions. Each complexity factor is rated on the basis of its DFD was found also proposed to be used together with ERD degree of influence from no influence (0) to very influential (5). in Gramantieri et al. [10]. Lamma et al. [11] to solve the The adjusted Function Points (FP) or Function Point Counts problem of counting error, a system for automating the counting (FC) is then derived as follows: is built and called FUN (FUNction point measurement). The INTERNATIONAL JOURNAL OF DESIGN, ANALYSIS AND TOOLS FOR INTERGRATED CIRCUITS AND SYSTEMS, VOL. 4, NO. 1, DECEMBER 2013 17 system used the specification of a software system from Entity 3 3 Relationship Diagram and DFD to estimate software Function FP = ∑∑ N W i =1 j=1 ij ij (5) Points. Later, the system was automated by Grammantieri et al. [12]. FP = (1×3 + 1×4 + 3×7) + (1×3 + 1×4 + 4×7) + (1×3 + 3×4 + 1×7) D. COCOMO Cost Estimation Model = 28 + 35 + 22 COCOMO (Constructive Cost Model) was originated by = 85 Boehm [13] in 1985. The model was based statistical analysis of data of 63 software development projects. By performing Next, the size of the software is attained by multiplying the regression analysis on the of 63 software development projects, FP counts with the average number of line of codes per function the following basic software development effort estimation point of the programming language used [16] to transform the model was derived: FP counts to number of line of codes (LOC). 
Suppose that this software is to be implemented with C++, the average number of k Effort = c (size) (4) line of codes per function point for C++ is 53. The number of line of codes (LOC) for this software is then calculated as Where: Effort was measured in person month (pm) or the below: number of person months, a person month is of 152 person hours, Size is the estimated size of the software, measured in LOC = 85 × 53 (6) Kilo Delivered Source Instructions (KDSI) and c and k are = 4,505 constants. And the required effort is then estimated using COCOMO III. THE PROPOSED METHODOLOGY cost estimation model as follows: One of the problems associated with FPA is the need for estimation at the early stage of the software development life k Effort = c × (size) (7) cycle [5, 14]. According to IFPUG standard counting rules, 1.05 = 2.4 × (4.505) function specification should already be clear at least from 15 to = 11.66 person months 40% before FP can be obtained. Otherwise it would not be possible to identify EI, EO, EQ, ILF and EIF [5, 15]. This research proposes to handle this problem by utilizing IV. EMPIRICAL VALIDATION functional requirements available in the DFD at the requirement This section describes the experiment and the validation determination stage — early stage of the development life cycle. process of this study. Using DFD is not new. At least two algorithms using DFD had been proposed for FP counts [6, 9]. To attain the FP counts A. The Proposed Function Point Analysis using DFD, the proposed method adapted the method proposed The proposed methodology was validated with graduate by [6, 9]. student projects in the master program in Business Software The proposed method is to count for only EI, EO and ILF Development at the Department of Statistics, CBS Business where EI is the data flow from external entity into the system, School, Chulalongkorn University. 
The graduate students are EO is data flow from the system to external entity, and the ILF is required to have at least one year of experience in the software the file used inside the system. The DFD of the sales order industry for the admission. After finishing 36 credits of course system shown in Fig. 2 will be used to demonstrate how the works, the students are required to have a 6 credit master project proposed method works. to develop a business software package. There are 3 functions, --Fill Order, Create Invoice and Apply A batch of the 25 graduating students of the master program Payment in the sales order system. was asked to participate in the experiment using the proposed The Fill Order function consists of 1 EI (order from methodology described in section III. When the students passed customer), 1 EO (packing list), and 3 ILF (customer file, the project proposal, a questionnaire was distributed to each product file, and order file) student to ask for the following information: student The Create Invoice function consists of 1 EI (completed identification, name, programming experience, project name, orders from warehouse), 1 EO (invoice), and 4 ILF (customer start date, number of functions appeared in the DFD, languages file, product file, order file, and accounts receivable file) and tools used, and the estimated function points. Finished date The Handle Payment function consists of 1 EI (payment from and actual effort in man hours used for the software projects customer), 3 EO (bank deposit, commission and cash receipt) were filled out in the same form by the students again when the and 1 ILF (accounts receivable file) projects were completed. The numbers of unadjusted Function Point counts of this software are then achieved by applying corresponding FP weights to the EI, EO and ILF as follows: INTERNATIONAL JOURNAL OF DESIGN, ANALYSIS AND TOOLS FOR INTERGRATED CIRCUITS AND SYSTEMS, VOL. 4, NO. 
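The Section III worked example (Eqs. (5)-(7)) can be replayed directly. The sketch below reproduces the 85 FP count for the three sales-order functions, the FP-to-LOC conversion for C++ (53 LOC per FP), and the basic COCOMO estimate; all numbers come from the example itself, only the code structure is mine:

```python
# Low-complexity weights used in the example: EI = 3, EO = 4, ILF = 7
W = {"EI": 3, "EO": 4, "ILF": 7}

# (EI, EO, ILF) counts per function, read off the sales-order DFD
functions = {
    "Fill Order":     {"EI": 1, "EO": 1, "ILF": 3},
    "Create Invoice": {"EI": 1, "EO": 1, "ILF": 4},
    "Handle Payment": {"EI": 1, "EO": 3, "ILF": 1},
}

# Eq. (5): sum N * W over the three functions and three counted types
fp = sum(n * W[t] for counts in functions.values() for t, n in counts.items())
# fp == 28 + 35 + 22 == 85

loc = fp * 53                     # Eq. (6): 53 LOC per FP for C++ -> 4505 LOC
kdsi = loc / 1000                 # COCOMO size is in kilo instructions
effort_pm = 2.4 * kdsi ** 1.05    # Eq. (7): basic COCOMO, c = 2.4, k = 1.05
# effort_pm is roughly 11.66 person months (152 person hours each)
```

Running the arithmetic confirms the figures quoted in the text: 85 function points, 4,505 lines of code, and an estimate of about 11.66 person months.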
Fig. 2. First level DFD of a Sales Order System.

Fifteen students returned the final results. Table II shows the background data of the projects: the names of the software developed, the languages and tools used, the number of functions in the software, and the actual man-hours spent in developing the software. Estimated FP is the estimated unadjusted Function Point count from the DFD, obtained with the algorithm explained in Section III. On average there are 8 functions, 305.4 Function Points and 534 man-hours per project. Of the 15 questionnaires, two students did not return the final actual effort used for the projects. This resulted in only 13 usable project data sets, and the analysis was carried out with these 13 projects.

From the data gathered in Table II, the software size in source lines of code (LOC) was calculated by multiplying by the language factors (58 LOC per FP for C#, 28 for VB.net and 56 for PHP). Then the COCOMO Cost Estimation Model was used to estimate the effort needed to develop the software, with c = 2.4 and k = 1.05 (Basic COCOMO Model). The estimated effort obtained in person-months was then multiplied by 152 to get man-hours. The accuracy of the estimation was measured using the Magnitude of Relative Error (MRE), which is the absolute value of (estimated man-hours - actual man-hours) / actual man-hours. The results are shown in Table III. They show very high MRE, with an average of 1624.31%, and overestimates of the effort for all projects. Two projects (3 and 13) are obvious outliers; with the two outlier projects removed, the average MRE improved to 517.84%.

B. The Validation Analysis

Rollo [17] affirmed that using the wrong LOC-per-FP figures as input to the COCOMO model can lead to considerable error in the estimates, and encouraged the use of proper productivity rates instead. Therefore, productivity rate data from the literature were reviewed to test the proposed FPA method. Six studies were found: Albrecht & Gaffney (1983) [18], Behrens (1983) [19], Moser & Nierstrasz (1996) [20], Lokan (2000) [21], Maxwell & Forselius (2000) [22], and Jeffery et al. (2001) [23]. Table IV shows how the effort is derived from the productivity rates reviewed. Table IV also includes the proposed FPA method using the COCOMO cost estimation model with the programming language factors, and the average productivity rate calculated from the 13-project sample, which is 1.09 work-hours per Function Point.

To evaluate the usability of the proposed method, the effort estimates for the 13 projects were calculated using the productivity rates reviewed in Table IV, as shown in Table V. Table VI shows the Mean Magnitude of Relative Error (MMRE) of the effort estimates. Table VI shows that the effort estimates from Lokan's productivity rate have the lowest MMRE of 0.82, followed by the estimates using the average productivity (MMRE = 1.08). After removing the two outlier projects (3 and 13), the performance of the effort estimates is much better: the estimates from Jeffery et al. show the lowest MMRE of 0.45, followed by the average productivity method (MMRE = 0.64). The effort estimates from Behrens's productivity rate show the worst MMRE of 18.94, followed by the estimates from Albrecht & Gaffney (MMRE = 17.47). After removing the two outlier projects, these two estimates still perform the worst (MMRE of 6.26 and 5.73). This may be because the two productivity rates were established about 30 years ago; the rates were very low (18.3 and 16.81 work-hours per function point), which is far from present-day productivity.

TABLE II: PROJECT BACKGROUND DATA

No. | Software | Language and Tools Used | No. of Functions | Estimated FP | Man-hours (Actual Effort)
1 | Tap Water Production Maintenance and Service Using GPS System | C#.net, VS-Studio | 5 | 207 | 400
2 | Investment Support System | VB.net, ASP.NET, SQL Server | 12 | 564 | 1016
3 | Software Inspection Processing Compliance Support System | C#.net, VS-Studio | 9 | 1321 | 560
4 | Thai Language Data Mining for Marketing Research | PHP, SQL Server | 4 | 83 | 640
5 | Vegetable Box Project Management Software | VB.net, SQL Server | 8 | 220 | 240
6 | Personal Loan Follow-up Management System | ASP.NET | 9 | 139 | N/A
7 | Intelligent Room Assignment Dormitory System | PHP, SQL Server | 5 | 65 | 400
8 | Software Supporting Buffet Business Via Web | C#.net, VS-Studio | 9 | 254 | 458
9 | Visual Challenge Library Support System | C#.net, ASP.NET, VS-Studio | 7 | 186 | 384
10 | Beauty Business Information System | VB.net, SQL Server | 12 | 112 | 550
11 | Gold Retailing Business Information System Using RFID | VB.net, SQL Server | 11 | 110 | 600
12 | Appraisal System | VB.net, C++, SQL Server | 5 | 268 | 520
13 | Primary School Teaching Support System | C#.net, VS-Studio | 7 | 154 | 1080
14 | Electronic Menu Restaurant Support System | VB.net, JavaScript | 8 | 707 | 95
15 | Software for Visual Challenge Person Travelling with Public Transportation | C#, Java, VB.net | 9 | 191 | N/A
Average | | | 8 | 305.4 | 534

TABLE III: THE RESULTS FROM THE ANALYSIS
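The review-and-compare step above can be sketched as follows. The formulas follow Table IV and the (FP, actual-effort) pair is project 1 of Table II; one assumption is flagged in the code: Lokan's rate is read as 21 + FP^0.826, which is the reading that reproduces the Table V values (about 102.8 man-hours for FP = 207). This is an illustrative sketch, not the authors' tooling.

```python
# Effort estimates (man-hours) from reviewed productivity rates, plus the
# MRE/MMRE accuracy measures used in the validation. Illustrative sketch.
RATES = {
    "Behrens (1983)":             lambda fp: 18.3 * fp,
    "Moser & Nierstrasz (1996)":  lambda fp: 6.667 * fp,
    "Lokan (2000)":               lambda fp: 21 + fp ** 0.826,  # reading assumed
    "Maxwell & Forselius (2000)": lambda fp: fp / 0.337,
    "Jeffery et al. (2001)":      lambda fp: 2.2 * fp,
}

def mre(estimated, actual):
    # Magnitude of Relative Error: |estimated - actual| / actual
    return abs(estimated - actual) / actual

def mmre(pairs):
    # Mean MRE over a set of (estimated, actual) pairs
    return sum(mre(e, a) for e, a in pairs) / len(pairs)

fp, actual = 207, 400.0  # project 1 of Table II
estimates = {name: rate(fp) for name, rate in RATES.items()}
errors = {name: mre(est, actual) for name, est in estimates.items()}
```

For project 1 this reproduces, for example, the Behrens estimate of 3788.1 and the Jeffery et al. estimate of 455.4 man-hours from Table V.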
No. | Language | FP | LOC per FP | Estimated KLOC | Estimated Man-months | Estimated Man-hours | Actual Effort (man-hours) | MRE (%)
1 | C#.net | 207 | 58.00 | 12.01 | 32.63 | 4959.33 | 400.00 | 1139.83
2 | VB.net | 564 | 28.00 | 15.79 | 43.51 | 6613.23 | 1016.00 | 550.91
3 | C#.net | 1321 | 58.00 | 76.62 | 228.43 | 34721.80 | 560.00 | 6100.32
4 | PHP | 83 | 56.00 | 4.65 | 12.05 | 1830.98 | 640.00 | 186.09
5 | VB.net | 220 | 28.00 | 6.16 | 16.19 | 2461.02 | 240.00 | 925.42
6 | PHP | 65 | 56.00 | 3.64 | 9.32 | 1416.48 | 400.00 | 254.12
7 | C#.net | 254 | 58.00 | 14.73 | 40.45 | 6147.94 | 458.00 | 1242.34
8 | C#.net | 186 | 58.00 | 10.79 | 29.16 | 4432.44 | 384.00 | 1054.28
9 | VB.net | 112 | 28.00 | 3.14 | 7.97 | 1211.29 | 550.00 | 120.24
10 | VB.net | 110 | 28.00 | 3.08 | 7.82 | 1188.59 | 600.00 | 98.10
11 | VB.net | 268 | 28.00 | 7.50 | 19.92 | 3027.70 | 520.00 | 482.25
12 | C#.net | 154 | 58.00 | 8.93 | 23.92 | 3635.39 | 1080.00 | 236.61
13 | VB.net | 707 | 28.00 | 19.80 | 55.16 | 8384.19 | 95.00 | 8725.46
Average | | | | | | | | 1624.31

The lowest MMRE of 0.82 (13 samples) or 0.45 (11 samples) indicates that the proposed method of obtaining Function Points from the DFD may be applicable with an appropriate productivity rate, such as the productivity rate of Jeffery et al. (2001). However, the proposition to use the COCOMO Cost Estimation Model in combination with the obtained Function Points may not be recommended, since it performed badly (MMRE = 12.30).

V. DISCUSSION

The disappointing results of high MRE may not be surprising. The high MRE percentages are similar to the work of Kemerer, "An Empirical Validation of Software Cost Estimation Models" [24]. With the proposition to use the COCOMO Cost Estimation Model for the effort estimation, the high MRE percentages may be attributed to many factors, including the following: the COCOMO Cost Estimation Model and its parameters, and the programming language factors.

It may be hypothesized that the parameters of the COCOMO Cost Estimation Model (i.e., the values of c and k) did not fit the experimental environment well. This is probably because the parameters of the COCOMO model were obtained by performing regression analysis on software project data gathered in the USA, which indicates the need for localization. The programming language factors used to convert Function Points to numbers of source lines of code are another question: the programming language conversion table by Capers Jones [16] was also produced using software project data gathered in the States. This is consistent with the findings of Rollo in [17].

Another observation is that both outlier projects (3 and 13) are the bigger projects, with 1321 and 707 Function Points. This is consistent with the observation of Yang et al. [25] that larger projects are more prone to cost and schedule overruns.

The generalization of this research may suffer from the use of one-person student projects and from the small sample size. The software projects used in this research were one-person graduate student projects, which differ from real-world projects in many aspects, especially the number of team members. The small sample size of 13 (or 11 when the outliers were removed) is probably the bigger problem, since a small sample limits probing into the above speculations. Another factor that may contribute to the high MRE is the type and complexity of the software application. To handle this problem, one may need to adjust the estimates with the Technical Complexity Factors (TCF) [3].

VI. CONCLUSIONS

This paper proposes to use Function Point Analysis with the DFD to solve the critical timing problem at the early stage of the software development process. At the requirement stage, the DFD can be used to depict the functionality of the software system. The information available in the DFD can be used to obtain Function Points and serve as the basis for software effort estimation. The empirical data show that using the DFD to obtain Function Points may be applicable but needs an appropriate productivity rate. However, the proposition to use the COCOMO Cost Estimation Model in combination with the obtained Function Points may not be recommended. The findings are similar to prior work, for example the work of Kemerer [24] and Miyazaki and Mori [26]. The results reveal the potential to explore many issues, for example the COCOMO Cost Estimation Model and its parameters, and the programming language factors. The implication of this research is probably that an organization should maintain and calibrate its own software project data [27, 28] to attain appropriate productivity rates. And to reduce the variation due to the COCOMO Cost Estimation Model parameters and the programming language factors, an organization may need to localize its own parameters for a specific programming language and use them to estimate the effort and cost needed, instead of using the industry parameters.

REFERENCES

[1] B.W. Boehm, "Software engineering economics," IEEE Transactions on Software Engineering, vol. 10, no. 1, pp. 4-21, 1984.
[2] C. Gencel and O. Demirors, "Functional size measurement revisited," ACM Transactions on Software Engineering and Methodology, vol. 17, no. 3, pp. 15.1-15.36, June 2008.
[3] R. Meli and L. Santillo, "Function point estimation methods: a comparative overview," in FESMA '99 Conference Proceedings, Amsterdam, 4-8 October, 1999.
[4] A. J. Albrecht, "Measuring application development productivity," in Proceedings of the IBM Applications Development Symposium, California, October 14-17, 1979, pp. 83-92.
[5] R. Meli, "Early and extended function point: a new method for function points estimation," IFPUG Fall Conference, Arizona, September 15-19, 1997.
[6] R. Rask, "Algorithm for counting unadjusted function points from dataflow diagram," Technical Report, University of Joensuu, 1991.
[7] R. Rask, "Counting function points from SA descriptions," The Papers of the Third Annual Oregon Workshop on Software Metrics (Ed. W. Harrison), Oregon, March 17-19, 1991.
[8] S. J. Obrien and D. A. Jones, "Function points in SSADM," Software Quality Journal, vol. 2, no. 1, pp. 1-11, 1993.
[9] P. Shoval and O. Feldman, "Combining function points estimation model with ADISSA methodology for system analysis and design," in Proceedings of ICCSSE'96, 1996, pp. 3-8.
[10] F. Gramantieri, E. Lamma, P. Mello, and F. Riguzzi, "A system for measuring function points from specification," Technical Report, Universita di Bologna, 1997.
[11] E. Lamma, P. Mello, and F. Riguzzi, "A system for measuring function points from an ER-DFD specification," The Computer Journal, vol. 47, no. 3, pp. 358-372, 2004.
[12] F. Gramantieri, E. Lamma, P. Mello, and F. Riguzzi, "A system for measuring function points from specifications," DEIS, Universita di Bologna, Bologna, and Dipartimento di Ingegneria, Ferrara, Tech. Rep. DEIS-LIA-97-006, 1997.
[13] B.W. Boehm, Software Estimation with COCOMO II, Upper Saddle River, NJ: Prentice Hall, 2002.
[14] J. Wu and X. Cai, "A software size measurement model for large-scale business applications," in Proceedings of the 2008 International Conference on Computer Science and Software Engineering, 2008, pp. 39-42.
[15] R. Meli and L. Santillo, "Function point estimation methods: a comparative overview," in FESMA'99 Conference Proceedings, Amsterdam, 1999.
[16] C. Jones, Applied Software Measurement: Assuring Productivity and Quality, 2nd ed., McGraw-Hill, 1997.
[17] A. L. Rollo, "Functional size measurement and COCOMO: a synergistic approach," in Proceedings of the Software Measurement European Forum, 2006.
[18] A.J. Albrecht and J.E. Gaffney, Jr., "Software function, source lines of code, and development effort prediction: a software science validation," IEEE Transactions on Software Engineering, vol. SE-9, no. 6, pp. 639-648, 1983.
[19] C. Behrens, "Measuring the productivity of computer systems development activities with function points," IEEE Transactions on Software Engineering, vol. SE-9, no. 6, pp. 648-652, 1983.
[20] S. Moser and O. Nierstrasz, "The effect of object-oriented frameworks on developer productivity," IEEE Computer, vol. 29, no. 9, pp. 45-51, 1996.
[21] C.J. Lokan, "An empirical analysis of function point adjustment factors," Information and Software Technology, vol. 42, no. 1, pp. 649-659, 2000.
[22] K. Maxwell and P. Forselius, "Benchmarking software development productivity," IEEE Software, January/February, pp. 80-88, 2000.
[23] R. Jeffery, M. Ruhe, and L. Wieczorek, "Using public domain metrics to estimate software development effort," in Proceedings of the International Software Metrics Symposium (Metrics'01), 2001, pp. 16-27.
[24] C.F. Kemerer, "An empirical validation of software cost estimation models," Communications of the ACM, vol. 30, no. 3, pp. 416-429, 1987.
[25] D. Yang, Q. Wang, M. Li, Y. Ye, and J. Du, "A survey on software cost estimation in the Chinese software industry," in Proceedings of ESEM'08, Kaiserslautern, Germany, 2008.
[26] Y. Miyazaki and K. Mori, "COCOMO evaluation and tailoring," in Proceedings of ICSE'85, the 8th International Conference on Software Engineering, 1985, pp. 292-299.
[27] M. Aguiar, "COCOMO II local calibration using function points," TI Metricas. Available: http://csse.usc.edu/events/2005/COCOMO/presentations/CIILocalCalibrationPaper.pdf
[28] B. Clark, S. Devnani-Chulani, and B. Boehm, "Calibrating the COCOMO II Post-Architecture model," in Proceedings of ICSE'98, the 20th International Conference on Software Engineering, 1998, pp. 477-480.

Tharwon Arnuphaptrairong is an Assistant Professor in Business Information Technology at the Department of Statistics, Faculty of Commerce and Accountancy, Chulalongkorn University, Thailand. He received a B.Sc. degree in Statistics from Chulalongkorn University, an M.Sc. in Computer Applications from the Asian Institute of Technology, Bangkok, Thailand, and a Ph.D. degree in Management Sciences from the University of Waterloo, Canada. His research interests include Software Project Management, Software Risk Management, Software Cost Estimation and Empirical Software Engineering.

TABLE IV: FP PRODUCTIVITY RATES

Author | Effort in Man-hours
Albrecht & Gaffney (1983) | Effort = FP / 0.0595, or 16.81 × FP
Behrens (1983) | Effort = 18.3 × FP
Moser & Nierstrasz (1996) | Effort = 6.667 × FP
Lokan (2000) | Effort = 21 + FP^0.826
Maxwell & Forselius (2000) | Effort = FP / 0.337, or 2.97 × FP
Jeffery et al. (2001) | Effort = 2.2 × FP
Average productivity | Effort = FP / 1.09, or 0.917 × FP
COCOMO | Effort = 2.4 × (size)^1.05

TABLE VI: MMRE OF THE ESTIMATES

Method | 13 Samples | 11 Samples
Albrecht & Gaffney (1983) | 17.47 | 5.73
Behrens (1983) | 18.94 | 6.26
Moser & Nierstrasz (1996) | 6.29 | 1.68
Lokan (2000) | 0.82 | 0.80
Maxwell & Forselius (2000) | 2.63 | 0.64
Jeffery et al. (2001) | 1.89 | 0.45
Average productivity | 1.08 | 0.64
COCOMO | 12.30 | 3.87

TABLE V: ESTIMATES FROM DIFFERENT PRODUCTIVITY RATES (man-hours)

No. | Actual Effort | Albrecht & Gaffney | Behrens | Moser & Nierstrasz | Lokan | Maxwell & Forselius | Jeffery et al. | Average Productivity | COCOMO
1 | 400 | 3508.47 | 3788.10 | 1380.07 | 102.84 | 614.24 | 455.4 | 455.4 | 4959.33
2 | 1016 | 9559.32 | 10321.20 | 3760.19 | 208.31 | 1673.59 | 1240.8 | 1240.8 | 6613.23
3 | 560 | 22389.83 | 24174.30 | 8807.11 | 399.33 | 3919.88 | 2906.2 | 2906.2 | 34721.80
4 | 640 | 1406.78 | 1518.90 | 553.36 | 59.47 | 246.29 | 182.6 | 182.6 | 1830.98
5 | 240 | 3728.81 | 4026.00 | 1466.74 | 107.07 | 652.82 | 484 | 484 | 2461.02
6 | 400 | 1101.69 | 1189.50 | 433.36 | 52.44 | 192.88 | 143 | 143 | 1416.48
7 | 458 | 4305.08 | 4648.20 | 1693.42 | 117.92 | 753.71 | 558.8 | 558.8 | 6147.94
8 | 384 | 3152.54 | 3403.80 | 1240.06 | 95.92 | 551.93 | 409.2 | 409.2 | 4432.44
9 | 550 | 1898.31 | 2049.60 | 746.70 | 70.28 | 332.34 | 246.4 | 246.4 | 1211.29
10 | 600 | 1864.41 | 2013.00 | 733.37 | 69.55 | 326.41 | 242 | 242 | 1188.59
11 | 520 | 4542.37 | 4904.40 | 1786.76 | 122.31 | 795.25 | 589.6 | 589.6 | 3027.70
12 | 1080 | 2610.17 | 2818.20 | 1026.72 | 85.11 | 456.97 | 338.8 | 338.8 | 3635.39
13 | 95 | 11983.05 | 12938.10 | 4713.57 | 246.75 | 2097.92 | 1555.4 | 1555.4 | 8384.19
Mean | 534.08 | 5542.37 | 5984.10 | 2180.11 | 133.64 | 970.33 | 719.40 | 719.40 | 2785.13

RF/Digital Co-Design Approach Based on VHDL-AMS Modeling: A Chipless RFID Case Study

Arash Jamali, Arash Ahmadi, Omid Sadeghi Fathabadi

Abstract—This paper presents a multi-domain design approach combining High Frequency (HF) and digital design. Simulated or measured S-parameters of a designed HF network are converted to a VHDL-AMS model, which can be combined with the digital part of the system in an RF-digital co-design environment. A chipless RFID system is studied as a case study. Comparing simulation results produced by the ADS and AMS models validates the capability of the proposed method. This method can be used to combine a variety of disciplines with HF design environments.

Index Terms—Chipless RFID, high frequency design, mixed-mode design, scattering parameters, VHDL-AMS modeling.

I. INTRODUCTION

Advances in electronic design technology and market demands are pushing the industry, and consequently the research field, towards more inclusive and efficient multidisciplinary system-level design tools and methods. System-level design approaches normally require design combinations between different disciplines over a diverse range of specifications. Although system-level design methods have been able to fill the gaps between some design fields, engineers often need to be involved with specification transfers between different Computer Aided Design (CAD) tools, which might require a deep understanding of the different fields.

High Frequency (HF) systems, in combination with low-power/high-speed electronic circuit design, have found a wide range of applications. Although both are well developed in terms of theory, tools and applications, it is not a trivial task to co-design, simulate or optimize a system considering both fields at the same time, due to their different natures. Therefore, we either need to focus on new tools for modeling such mixed-mode circuits, ignoring very well developed older CAD tools, or utilize current tools and try to fill the gap using mixed-mode languages. Accordingly, most attempts present methods to automatically generate a physically plausible Analog Mixed Signal (AMS) description of all or part of the mixed-mode system.

There are similar works in which solutions are presented to combine different design fields. Frank et al. in [1] present a solution which tries to link SPICE and VHDL-AMS simulators via an interface. Synchronization and data exchange between the simulators is the main challenge in this method, which can decrease the speed of the simulation and also the generality of the method. Another approach, which tries to convert VHDL-AMS models to models that can be defined in traditional CAD tools, is presented in [2]; but since traditional tools are not designed to simulate mixed-mode systems efficiently, this method cannot be very effective when the system complexity increases. Similarly, automatic generation of AMS models for power electronics systems [3], [4], Analog to Digital Converters (ADCs) [5] and MATLAB/SIMULINK ADC models [6] are examples of this general approach. There are other works trying to present more general approaches for automated system and fault modeling [7], [8]. Each of the mentioned works has proposed a different CAD tool in its special field, but to the best of our knowledge there is no work on AMS modeling of RF systems using Scattering parameters (S-parameters).

The Advanced Design System (ADS) of Agilent Technologies is a popular and powerful tool for HF designers, with the ability to be used for a variety of designs. This paper proposes a design method that can help digital and HF designers to co-design a mixed-mode system. The final system is described in VHDL-AMS, where the digital part can be easily extended to MEMS, power electrical, mechanical and any other system that can be described in VHDL-AMS, which is a multi-domain description language and does not have the limitations of tools such as ADS. This approach can lead to a productivity gain by letting engineers design the digital and HF components together as these components would exist in the final system. This higher level of abstraction gives the design team a fundamental understanding, early in the design process, of the intricacies and interactions of the entire system, and enables better system tradeoffs, better and earlier verification, and overall productivity gains through reuse of early system models as executable specifications.

In the proposed method, the HF circuit is first designed in ADS to satisfy the design constraints. Then the simulated S-parameters of the circuit are used to generate the VHDL-AMS model.
The generated model is then delivered to a digital designer, who uses the model as a black box in the final designs and completes the system model. This method is implemented as an open-access tool (called the HF/Digital converter) for the community, for further investigation, research and improvement.

(The authors are with Razi University, Kermanshah, 67149, Iran. Phone: +98(0)831-4283261; fax: +98(0)831-4283261; e-mail: aahmadi@razi.ac.ir.)

The paper is organized as follows. Section II introduces the proposed method, its usage, considerations and limits. In Section III, the design procedure and operating fundamentals of a chipless RFID system are discussed; the different parts of the modeled system are described and simulated to show the validity of the model and the performance of the method. The paper is concluded in Section IV.

II. PROPOSED METHOD

The proposed method starts with the design constraints of a mixed-mode system consisting of digital and HF parts. Initially, the HF and digital designs are examined for their limitations and requirements. The way the HF and digital parts connect and communicate with each other is also decided in this step. Then the HF and digital circuit designers start to work independently. The design flow of the proposed method is presented in Fig. 1.

Fig. 1. Design flow of the proposed method.

As can be seen in Fig. 1, the HF designer may design a new circuit or use an existing design, but in both cases the HF circuit should be described by its scattering parameters in a text file, which can be generated by an HF simulator such as ADS or by any other tool. The VHDL-AMS model of the HF circuit will be created from this file. The block diagram of Fig. 2 shows the behavioral specification of the model generated by this tool. The proposed model is based on the S-parameters (S11, S12, S21 and S22) of the HF design. These parameters are transformed into a matrix representation of the magnitude and phase for the different frequency responses. This matrix is used to generate the VHDL-AMS model of the HF network. The model of each HF network is a behavioral, frequency-domain model with three input and two output ports. The input ports take the frequency, magnitude and phase of the input signal, and the output ports produce the magnitude and phase of the output signal, all as real numbers. This model, in combination with the models of the other parts (digital, MEMS, energy harvesters, etc.), is used to verify the design process.

As seen in Fig. 2, this method produces VHDL-AMS code from the original HF description, i.e. the S-parameters, given from ADS or other tools in the form of lookup tables, and utilizes them directly to produce the behavioral model.

Fig. 2. Block diagram of the generated model.

Another way, which might be even shorter and more optimal, is the following: the input data charts of the HF description are converted to analytical functions describing the HF behavior, which are then used to produce the code that models the system. In this case, there is a function for the magnitude and a function for the phase of each S-parameter. The analytical functions are extracted by a curve fitting algorithm. The block diagram of Fig. 3 shows the behavioral specification of the model generated in this approach using fitted functions of the S-parameters.

Fig. 3. Block diagram of the generated model using fitted functions.

The general form of the proposed functions for the S-parameters is presented in (1), in which F(x) is a function of frequency:

F(x) = Σ_{i=1}^{k} f_i(x) / g_i(x) = Σ_{i=1}^{k} ( Σ_{j=1}^{n} a_ij x^(j-1) ) / ( Σ_{j=1}^{m} b_ij x^(j-1) )    (1)

where x represents frequency and the coefficients of F(x) are expressed in the matrix forms presented in (2a) and (2b):

Aij = [ a11 ... a1n ; ... ; ak1 ... akn ]    (2a)
Bij = [ b11 ... b1m ; ... ; bk1 ... bkm ]    (2b)

Aij are the numerator coefficients and Bij are the denominator coefficients in (1). The terms in (1) can be calculated as in (3a)-(3d). If

Xn = [ x^0 ; x^1 ; ... ; x^n ],  Xm = [ x^0 ; x^1 ; ... ; x^m ]    (3a)

and

Vf = [ f1(x) ; f2(x) ; ... ; fk(x) ],  Vg = [ g1(x) ; g2(x) ; ... ; gk(x) ]    (3b)

then

Vf = Aij × Xn    (3c)
Vg = Bij × Xm    (3d)

Therefore, each S-parameter description of the HF system is written in the form of generalized functions with different coefficients in compact matrix form. The following section provides more details of the method using a case study.

III. THE CASE STUDY: A CHIPLESS RFID

Radio Frequency Identification (RFID) is a technology that utilizes radio frequency signals to identify objects. In this technology every object is marked by a tag, and every tag has a unique response to a particular request signal. By sending the signal for a tag and analyzing its response, objects can be identified. The unit which generates the spectrum, receives the responses and analyzes them is called the "Reader". The block diagram of a typical RFID system is shown in Fig. 4.

Fig. 4. Block diagram of a typical RFID system.

Depending on the type of the tags, RFID systems can be categorized into three groups. The traditional and first one is called Active-RFID [9]. In this type of system there is a power supply and a chip in every tag. The chip is responsible for generating a digital code as the ID of the tag; this digital code is then modulated and sent via an antenna. The second type of RFID system is the passive-RFID system [10]. These systems are like active ones, but there is no power supply in the tags, and the chip of each tag is supplied by the power of the rectified RF signal received by the input antenna. The third type of RFID system is called chipless-RFID [11]-[14], in which there is no chip or power supply in the tags, and the unique ID of each tag is encoded in the magnitude or phase [15] of the wave reflected from the tag, or even both. Based on this mechanism, different responses to a unique input spectrum are produced for every individual chipless tag. Chipless tags are cheap and simple, and can even be printed on different objects with conducting inks [16]-[18]. Therefore, this technology can be widely used for low-cost and mass-production applications.

One of the features of every RFID system is that there is a mixed-mode circuit in its reader to decode the backscattered wave. This is the reason that encourages us to describe an RFID system using a high-level specification language such as VHDL-AMS. To do so, we need more understanding of the frequency specification of the system.

A. The Chipless RFID system to be modeled

The chipless system of [11] is our case to be studied. In this system a chirp of frequencies is generated by the "Reader" and is sent by an antenna. The signals of this chirp have constant magnitude and phase. After propagation in space, this spectrum is received by the tag and passes through a transmission line near which a multi-resonator is placed. This multi-resonator determines the ID of each tag: each resonator generates a minimum in the magnitude and a ripple in the phase at its corresponding resonance frequency. Therefore, to encode six data bits, six frequencies are required. Because each frequency can carry two states, the placing or termination of the resonator for each frequency determines its corresponding logic value. The received spectrum is changed according to the tag ID and then backscattered to the reader base, where the modified spectrum will be decoded. The ID can be derived from both magnitude and phase. The RX (Receive) and TX (Transmit) antennas of both tag and reader are cross-polarized to prevent interference, so the signal that is sent by the TX antenna of one side can only be received by the RX antenna of the other side. Although all kinds of nonlinearities can be modeled in AMS, in this study, for the sake of generality, the antennas and propagation medium are assumed ideal, so they make no change or attenuation in the identification spectrum.

Fig. 5 shows the layout and simulated S21 of two chipless tags and their corresponding encoded spectra and IDs [19].

Fig. 5. (a) Layout of a tag with ID = 111111. (b) Simulated magnitude of S21 for the first tag. (c) Simulated phase of S21 for the first tag. (d) Layout of a tag with ID = 010101. (e) Simulated magnitude of S21 for the second tag. (f) Simulated phase of S21 for the second tag.

The simulated S-parameters of these two tags are used to automatically generate their VHDL-AMS models with the HF/Digital converter, to be used in combination with the digital design process.
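The lookup-table variant of the generated model (Fig. 2) can be imitated in a few lines of ordinary code: tabulated magnitude and phase of an S-parameter are interpolated against frequency, the magnitude multiplying and the phase adding to those of the input signal. This is a behavioral sketch only; the table values are invented for illustration, and the real tool emits VHDL-AMS, not Python.

```python
# Behavioral stand-in for the generated S-parameter model: linear
# interpolation in a frequency -> (magnitude, phase) lookup table,
# as in the lookup-table variant of Fig. 2. Table values are
# illustrative, not taken from the paper.
import bisect

class SParamModel:
    def __init__(self, freqs_hz, mags, phases_rad):
        self.f, self.m, self.p = freqs_hz, mags, phases_rad

    def _interp(self, table, f):
        # clamp outside the table, linearly interpolate inside it
        i = bisect.bisect_left(self.f, f)
        if i == 0:
            return table[0]
        if i >= len(self.f):
            return table[-1]
        f0, f1 = self.f[i - 1], self.f[i]
        t = (f - f0) / (f1 - f0)
        return table[i - 1] * (1 - t) + table[i] * t

    def output(self, f, in_mag, in_phase):
        # linear two-port behavior: magnitude multiplies, phase adds
        return (in_mag * self._interp(self.m, f),
                in_phase + self._interp(self.p, f))

tag = SParamModel([1e9, 2e9, 3e9], [0.9, 0.2, 0.9], [0.0, -1.0, 0.0])
mag, ph = tag.output(1.5e9, 1.0, 0.0)   # halfway between the first two rows
```

This mirrors the generated model's port structure: frequency, input magnitude and input phase go in; output magnitude and phase come out, all as real numbers.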
(6a) j =1 i =1 ∑b x ij j −1 0.01698 − 0.1339 0.006565 1.032 1 j =1 Bij = , (4c) 0 Equation (7a), (7b), (7c) show the fitted function and 1.772 − 2.662 1 0 coefficients matrix of tag.2 with "111111" ID phase spectrum respectively. Equations (7b), (7c) are given at the bottom of the Equations (5a), (5b), (5c) show the fitted function and page. coefficients matrix of tag.1 with "010101" ID phase spectrum 6 respectively. Equation (5b) is given at the bottom of the page. 4 ∑a ij x j −1 F4 ( x) = ∑ 6 j =1 ∑a ij x j −1 5 . (7a) ∑b x 2 j −1 F2 ( x ) = ∑ j =1 i =1 5 . (5a) ij ∑b x i =1 j −1 j =1 ij j =1 0.011052 - 0.08776 0.004164 0.676 0.6552 (4b) Aij = , 0.6044 - 0.9084 0.342 - 0.0004836 0.0001576 0.2404 − 2.138 0.537 18.07 19.96 3.54 (5b) Aij = 158.2 162 . 5 87.57 137 .7 35.57 0.00414 , − − − 0.2991 1.051 1.174 0.4235 0 3573 2 . 317 e + 004 3. 206 e 004 − 1 . 867 e + 004 0 (6b) Aij = − 0.005686 0.007592 0.004529 0.002537 − 0.01211, − 0.004342 − 0.419 1.536 − 1.824 0.7085 − 0.6 − 0.6204 − 0.002245 0.08179 − 0.01059 1.185 2.177 1 0 0 3763 2 . 427 e + 004 3 . 342 e + 004 − 1 .966 e + 004 1 (6c) Bij = 0.3635 − 0.3011 − 0.7537 − 0.06529 1 , 1.744 − 2.641 1 0 0 0.01752 − 0.1363 0.003262 1.033 1 36.11 − 56.5 − 63.44 94.76 33.02 − 34.97 − 2.898e + 006 3.953e + 006 − 5.701e + 005 − 8.54e + 005 1.789e + 005 2.353e + 004 (7b) Aij = , − 0.1267 0.8403 − 0.2978 − 4.14 3.87 8.854 − 4.353 − 29.61 − 44.28 22.51 3.436 − 0.05575 0.6126 − 0.5807 − 1.427 0.7421 1 − 2.894e + 004 3.175e + 004 1731 − 6953 1 (7c) Bij = , 0.01744 − 0.1359 0.002692 1.032 1 0.08114 0.5693 1 0 0 INTERNATIONAL JOURNAL OF DESIGN, ANALYSIS AND TOOLS FOR INTERGRATED CIRCUITS AND SYSTEMS, VOL. 4, NO. 1, DECEMBER 2013 27 B. Simulation results To complete the procedure of modeling, "Reader" should be modeled as well. This model can also be used as a test bench for the automatically generated models. The "Reader" is considered to have a structure as shown in Fig. 6. Fig. 6. 
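Numerically, the sum-of-rationals form shared by (4a)-(7a) can be sketched as follows. The coefficient rows below are invented for illustration and are not the paper's fitted matrices (4b)-(7c); the variable x stands for the (possibly normalized) frequency.

```python
# Sketch of the fitted-function form F(x) = sum_i P_i(x)/Q_i(x), where P_i and
# Q_i are low-order polynomials. Coefficient values are made up for illustration.

def poly(coeffs, x):
    """Evaluate sum_j coeffs[j] * x**j, with coeffs[0] the constant term
    (the paper's a_ij x^(j-1) with j starting at 1)."""
    return sum(c * x**j for j, c in enumerate(coeffs))

def rational_sum(A, B, x):
    """F(x) = sum over rows i of poly(A[i], x) / poly(B[i], x)."""
    return sum(poly(a, x) / poly(b, x) for a, b in zip(A, B))

# Two rational terms with degree-4 polynomials (5 coefficients each),
# matching the shape of (4a); the numbers are illustrative only.
A = [[0.011, -0.088, 0.004, 0.676, 0.655],
     [0.604, -0.908, 0.342, -0.0005, 0.0002]]
B = [[0.017, -0.134, 0.007, 1.032, 1.0],
     [1.772, -2.662, 1.0, 0.0, 0.5]]

s21_mag = rational_sum(A, B, x=0.5)  # fitted |S21| at one frequency point
```

A fitting tool would choose the A and B rows to minimize the error between this form and the simulated S-parameter samples.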
In this structure, the "spectrum generator" block generates the spectrum that is needed to detect tags. In this generated spectrum, the magnitude and phase of all signals are considered to be "1.0" and "0.0" respectively. The "Reader decoder" block extracts the digital code that is hidden in the backscattered spectrum and converts it to a quantized value. Each quantized value represents a logic state at a specific frequency. Finally, every quantized value is translated to a digital value on one of the "Output Code" pins; these pins are latched and keep their values until the whole spectrum is decoded. There is therefore a transient time for the output code during which the code is not valid.

Simulation results of the generated model are shown in Fig. 7. Please note that here, "time" is used to represent the independent variable (frequency) in the traces in order to summarize the code; the equivalent frequency of each time point used in the model is presented in Fig. 7 (e), and the value of every dot in the output traces should be considered with its corresponding frequency, not with time.

Fig. 7. (a) Simulated magnitude of the output signal of the RX antenna for the tag with ID "111111". (b) Simulated phase of the output signal of the RX antenna for the tag with ID "111111". (c) Simulated magnitude of the output signal of the RX antenna for the tag with ID "010101". (d) Simulated phase of the output signal of the RX antenna for the tag with ID "010101". (e) Equivalent frequency of each time point. (f) Simulated digital code of the tag with ID "111111" and (g) with ID "010101".

Fig. 7 (a), (b) respectively show the magnitude and phase of the output signal of the RX antenna in the "Reader" block for the tag of Fig. 5 (a) with "111111" ID, and Fig. 7 (c), (d) show the same parameters for the other tag. These traces result from the simulation of the VHDL-AMS model and are in good agreement with the simulated results of ADS, as depicted in Fig. 5. Small differences between the simulated results of the generated model and the ADS results are due to the different interpolation methods that each tool uses to approximate the values of the S-parameters at frequencies for which the S-parameters are not exactly defined. These differences decrease as the number of simulated points used to generate the model increases. The IDs of the studied tags, correctly decoded by the "reader" model, are shown in Fig. 7 (f), (g). The simulated traces and decoded IDs verify the validity of the system model and confirm that the proposed modeling procedure is successful for this type of system.

IV. CONCLUSION AND FUTURE WORKS

The proposed method to generate VHDL-AMS models of mixed-mode systems with HF networks was used to model a chipless RFID system, and the simulation results confirm its validity. This work shows that the proposed CAD tool can effectively model a matched two-port HF network in VHDL-AMS. Using the proposed design method, digital and HF sections can be co-designed in the form of a mixed-mode system with minimum required knowledge about each field. The proposed method can be easily extended to the design of mixed-mode systems with more sections, such as MEMS, power electrical and mechanical systems, because of the flexibility of VHDL-AMS. Further, this is the first attempt to model a chipless RFID system. In the modeling of this system, the antennas and the propagation medium are assumed ideal, and the HF networks modeled by the proposed tool are assumed to be two-port networks with matched I/O ports; consideration of multi-port HF networks that are not matched at their ports, as well as non-ideal modeling of the propagation medium and antennas, are subjects of our future work.

APPENDIX
This appendix presents a shortened VHDL-AMS model of the under-study case, generated by the proposed tool, together with the models of some other required blocks, such as the Reader source, Reader decoder and Digitizer, needed to complete the simulation.

entity two_port is
  port ( quantity inspecmag, inspecphase, freq : in real;
         quantity outspecmag, outspecphase : out real );
end entity two_port;

architecture curve_fit of two_port is
  quantity s21m, s21p : real;
  constant p1 : real := 1.638;
  constant p2 : real := 1.69;
  constant p3 : real := 0.01041;
  constant p4 : real := -0.2194;
  constant p5 : real := 0.02763;
  -- ... continues like this ...
  constant p10 : real := 1.511;
begin
  s21m == Function_1(freq);
  s21p == Function_2(freq);
  outspecmag == inspecmag * s21m;
  outspecphase == inspecphase + s21p;
end architecture curve_fit;

architecture table of two_port is
  quantity count, step, freq0 : real;
  quantity s21m, s21p, s11m, s11p, s12m, s12p, s22m, s22p : real;
begin
  freq0 == 1500000000.000; step == 4000000.000;
  count == (freq - freq0) / step;
  IF (count >= 1.0 AND count < 2.0) USE
    s21m == 000.997; s21p == 152.265;
    s11m == 000.007; s11p == 046.211;
    s12m == 000.997; s12p == 152.265;
    s22m == 000.007; s22p == 078.346;
  ELSIF (count >= 2.0 AND count < 3.0) USE
    s21m == 000.997; s21p == 151.699;
    s11m == 000.007; s11p == 044.767;
    s12m == 000.997; s12p == 151.699;
    s22m == 000.007; s22p == 078.611;
  ELSIF (count >= 3.0 AND count < 4.0) USE
    -- ... continues like this ...
  ELSIF (count >= 201.0 AND count < 202.0) USE
    s21m == 000.995; s21p == 044.987;
    s11m == 000.032; s11p == 097.095;
    s12m == 000.995; s12p == 044.988;
    s22m == 000.030; s22p == -156.140;
  ELSE
    s21m == 0.0; s21p == 0.0; s11m == 0.0; s11p == 0.0;
    s12m == 0.0; s12p == 0.0; s22m == 0.0; s22p == 0.0;
  END USE;
  outspecmag == inspecmag * s21m;
  outspecphase == inspecphase + s21p;
end architecture table;

architecture behav of dig is
  signal code_sig : bit_vector(5 downto 0) := "000000";
  constant period : time := 10 us;
begin
  p0 : process
    variable ana_var : real;
  begin
    ana_var := anain;
    if ana_var = 32.0 then
      code_sig(5) <= '1';
    elsif ana_var = 16.0 then
      code_sig(4) <= '1';
    elsif ana_var = 8.0 then
      code_sig(3) <= '1';
    elsif ana_var = 4.0 then
      code_sig(2) <= '1';
    elsif ana_var = 2.0 then
      code_sig(1) <= '1';
    elsif ana_var = 1.0 then
      code_sig(0) <= '1';
    else
      code_sig <= code_sig;
    end if;
    wait for period;
  end process p0;
  code <= code_sig;
end architecture behav;

architecture behav of reader_decoder is
begin
  IF (freq > 1620000000.0 and freq < 1670000000.0 and inmag < 0.8) USE
    code == 32.0;
  ELSIF (freq > 1710000000.0 and freq < 1760000000.0 and inmag < 0.8) USE
    code == 16.0;
  ELSIF (freq > 1830000000.0 and freq < 1840000000.0 and inmag < 0.8) USE
    code == 8.0;
  ELSIF (freq > 1940000000.0 and freq < 1950000000.0 and inmag < 0.8) USE
    code == 4.0;
  ELSIF (freq > 2060000000.0 and freq < 2080000000.0 and inmag < 0.8) USE
    code == 2.0;
  ELSIF (freq > 2180000000.0 and freq < 2220000000.0 and inmag < 0.8) USE
    code == 1.0;
  ELSE
    code == 0.0;
  END USE;
end architecture behav;

architecture behav of reader_source is
begin
  outmag == 1.0;
  outphase == 0.0;
  freqq == 1500000000.000 + 160000000000.000 * now;
end architecture behav;

REFERENCES

[1] F. Frank and R. Weigel, "Co-Simulation of SPICE Netlists and VHDL-AMS Models via a Simulator Interface," in Proc. International Symposium on Signals, Systems and Electronics, ISSSE '07, Montreal, July 30-Aug. 2 2007, pp. 75-78.
[2] M. Zorzi, F. Franze, N. Speciale and G. Masetti, "A Tool for the integration of new VHDL-AMS models in SPICE," in Proc. 2004 International Symposium on Circuits and Systems, ISCAS '04, 23-26 May 2004, vol. 4, pp. 637-640.
[3] A. Merdassi, L. Gerbaud and S. Bacha, "Automatic Generation of Average Models for Power Electronics Systems in VHDL-AMS and Modelica Modeling Languages," Journal of Modelling and Simulation of Systems, vol. 1, pp. 176-186, 2010.
[4] A. Merdassi, L. Gerbaud and S. Bacha, "Automatic global modeling of static converters for power electronics systems: taking into account of causality aspects for model coupling," in Proc. 13th European Conf. on Power Electronics and Applications, Barcelona, 8-10 Sept. 2009, pp. 1-10.
[5] B. Babba, G. Barret and F. Poullet, "Virtual test with VHDL-AMS for a generator of analog and mixed signal virtual components," in Proc. Fall VIUF Workshop, Orlando, Oct. 1999, pp. 88-93.
[6] A. Cesar, I. Grout, J. Ryan and T. O'Shea, "Generating VHDL-AMS Models of Digital-to-Analogue Converters from MATLAB/SIMULINK," presented at International Conf. on Thermal, Mechanical and Multi-Physics Simulation Experiments in Microelectronics and Micro-Systems, London, April 2007, pp. 1-7.
[7] A. Al-Kashef, M. Zaky, M. Dessouky and H. El-Ghitani, "A Case-Based Reasoning Approach for the Automatic Generation of VHDL-AMS Models," in Proc. IEEE International Behavioral Modeling and Simulation Workshop, BMAS 2008, San Jose, CA, 25-26 Sept. 2008, pp. 100-105.
[8] X. Likun, I. M. Bell and A. J. Wilkinson, "Automated Model Generation Algorithm for High-Level Fault Modeling," IEEE Trans. Computer-Aided Design of Integrated Circuits and Systems, vol. 29, pp. 1140-1145, July 2010.
[9] W. Yoon, S. Chung, S. Lee and Y. Moon, "Design and implementation of an active RFID system for fast tag collection," in Proc. 7th IEEE Int. Conf. on Computer and Information Technology (CIT 2007), Aizu-Wakamatsu, Fukushima, 16-19 Oct. 2007, pp. 961-966.
[10] A. Man, E. Zhang, V. Lau, C. Tsui and H. Luong, "Low power VLSI design for a RFID passive tag baseband system enhanced with an AES cryptography engine," in Proc. 1st Annual RFID Eurasia, Istanbul, 5-6 Sept. 2007, pp. 1-6.
[11] S. Preradovic, I. Balbin, N. C. Karmakar and G. F. Swiegers, "Multiresonator-Based Chipless RFID System for Low-Cost Item Tracking," IEEE Trans. Microwave Theory and Techniques, vol. 57, no. 5, pp. 1411-1419, May 2009.
[12] S. Preradovic and N. C. Karmakar, "Design of Chipless RFID Tag for Operation on Flexible Laminates," IEEE Antennas and Wireless Propagation Letters, vol. 9, pp. 207-210, 18 March 2010.
[13] S. Preradovic, I. Balbin, N. C. Karmakar and G. F. Swiegers, "Chipless Frequency Signature Based RFID Transponders," in Proc. 1st European Wireless Technology Conference, Amsterdam, 27-28 Oct. 2008, pp. 302-305.
[14] S. Hu, Y. Zhou, C. L. Law and W. Dou, "Study of a Uniplanar Monopole Antenna for Passive Chipless UWB-RFID Localization System," IEEE Trans. Antennas and Propagation, vol. 58, no. 2, pp. 271-278, Feb. 2010.
[15] I. Balbin and N. C. Karmakar, "Phase-Encoded Chipless RFID Transponder for Large-Scale Low-Cost Applications," IEEE Microwave and Wireless Components Letters, vol. 19, no. 8, pp. 509-511, Aug. 2009.
[16] S. Preradovic and N. C. Karmakar, "Design of Fully Printable Planar Chipless RFID Transponder with 35-bit Data Capacity," in Proc. 39th European Microwave Conference, Rome, Sept. 29-Oct. 1 2009, pp. 13-16.
[17] L. Zheng, S. Rodriguez, L. Zhang, B. Shao and L. R. Zheng, "Design and Implementation of a Fully Reconfigurable Chipless RFID Tag Using Inkjet Printing Technology," presented at the IEEE International Symposium on Circuits and Systems, Seattle, WA, 18-21 May 2008, pp. 1524-1527.
[18] V. Derbek, C. Steger, S. Kajtazovic, J. Preishuber-Pfluegl and M. Pistauer, "Behavioral Model of UHF RFID Tag for System and Application Level Simulation," in Proc. 2005 IEEE International Behavioral Modeling and Simulation Workshop, BMAS 2005, 22-23 Sept. 2005, pp. 60-63.
[19] S. A. Chakra, U. O. Farrukh, E. Colin and A. Moretto, "Ultra High Frequency-Radio Frequency Identification Tags Modeling," presented at International Conference on Advances in Computational Tools for Engineering Applications, ACTEA '09, Zouk Mosbeh, 15-17 July 2009, pp. 265-268.
Arash Jamali received the B.S. degree in electrical engineering from Razi University, Kermanshah, Iran, in 2010, and is currently working towards the M.S. degree in the Department of Electrical Engineering, Razi University. His research interests include mixed-mode design, high frequency design, energy harvesting, and VLSI design.

Arash Ahmadi received the BSc and MSc degrees in Electronics Engineering from Sharif University of Technology and Tarbiat Modares University, Tehran, Iran, in 1993 and 1997 respectively. Soon after, he joined the Electrical Engineering Department of Razi University, Kermanshah, Iran, as a faculty member. He received his PhD degree in Electronics from the University of Southampton, United Kingdom, in 2008. From 2008 to 2010, he was a Fellow Researcher with the University of Southampton. He is currently an Assistant Professor in the Electrical Engineering Department, Razi University. His current research interests include hardware implementation of signal processing systems, high-level synthesis and bio-inspired computing.

Omid Sadeghi Fathabadi received his BSc and MSc in electrical engineering from Razi University, Kermanshah, Iran, in 2009 and 2011. He is currently a PhD student at the University of Tasmania, Hobart, Australia. His current research interests are automated modeling of high frequency circuits, biomedical signal processing and modeling, automatic control, and system identification.

Architecture of Wireless Visual Sensor Node with Region of Interest Coding

Muhammad Imran, Khursheed Khursheed, Naeem Ahmad, Malik A. Waheed, Mattias O'Nils and Najeem Lawal

Abstract—A Wireless Vision Sensor Network (WVSN) is an emerging field with a number of potential applications such as surveillance, smart homes, and environmental monitoring. A WVSN consists of a number of nodes which are referred to as wireless Vision Sensor Nodes (VSNs). Each VSN is expected to perform vision processing using limited resources such as power, memory, processing capability and wireless bandwidth. The major challenges in a VSN include the reduction of processing and communication energy consumption, in order to maximize the lifetime. To meet this challenge, our goal is to propose a VSN architecture which has reduced processing and communication energy consumption and small design complexity on a hardware platform. A number of different processing strategies are investigated to realize a VSN with these characteristics. A VSN with a suitable strategy is then implemented and energy values are measured on real hardware. In this strategy, the processing energy consumption is reduced by implementing lightweight vision tasks on the VSN using a hardware platform and moving complex tasks to a server. The communication energy consumption is reduced with Region Of Interest coding together with an ITU-T G4 compression scheme. The implemented system is compared with a previously published system. The comparison shows that the proposed VSN consumes up to 34 percent less energy and, depending on the sample period, can achieve approximately 50 percent greater lifetime as compared to the published system.

Index Terms—Architecture, region of interest coding, smart camera, wireless multimedia sensor networks, wireless vision sensor node, wireless visual sensor networks.

I. INTRODUCTION

Wireless Vision Sensor Networks (WVSNs) consist of a number of sensor nodes, which are usually referred to as wireless Vision Sensor Nodes (VSNs). A number of potential WVSN applications include industrial process monitoring, elderly home care, smart homes, environmental monitoring, and surveillance [1][2][4][5]. The VSNs are often characterized by limited resources such as battery or alternative energy source, memory, processing capability and communication bandwidth. The WVSNs are often exposed to a number of requirements such as varying density of node deployment and possibly hazardous environmental situations [6]. In the presence of these challenges, the WVSNs are expected to provide autonomous and continuous monitoring of the events in the field of view for a long time without frequent replacement of batteries. A VSN can be implemented on a software platform, i.e. a microcontroller, and/or a hardware platform, i.e. a Field Programmable Gate Array (FPGA). A VSN implemented with the currently available software platforms consumes more energy as compared to a hardware implemented VSN [1][2]. However, the design and development time on a software platform is smaller because of the availability of ready-to-use image processing libraries. On the other hand, a hardware platform, the FPGA, offers programmable and energy efficient solutions by virtue of parallel computing [3]. This makes the FPGA a good choice for a VSN [6].

Generally, researchers employ two approaches for VSN implementation [7][5][11]. In the first approach, all vision tasks are performed on the VSN and then the final features are transmitted to the server for analysis. This approach consumes a small amount of communication energy because the transmission data is small, as the features are represented by means of a few bytes. However, it consumes greater processing energy and has a high design complexity. In the second approach, no local processing is performed on the VSN and raw data is transmitted to the server for processing. This approach consumes greater communication energy but a smaller amount of processing energy.

In comparison to the aforementioned approaches, the balanced strategy is to partition the tasks between the VSN and the server, as shown in Fig. 1. In many WVSN applications, such as meter reading, the monitoring of industrial machines, environmental monitoring and people counting [2][1][4][9], the images contain few objects with distinctive backgrounds. In such applications, the pre-processing task and a simple segmentation can be performed on the VSN in order to clearly identify the objects from the background. The segmentation will convert the image into a binary format. The processing on the binary data will assist the use of lighter algorithms which have reduced latency and hardware resource requirements [22]. The binary image can be processed further on the VSN or it can be transmitted to the server for further processing. It is preferable to move the complex tasks, i.e. labelling, feature extraction and classification, to the server. This strategy will reduce both the processing and design complexity because the complex tasks are moved to a generalized platform, namely the server, which has reduced constraints.
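The balanced partitioning can be sketched in a few lines. This is an illustrative Python model with made-up frame data and threshold, not the FPGA implementation: the node performs background subtraction and thresholding, and the server performs the heavier connected-component labelling.

```python
# Minimal sketch of task partitioning: the VSN reduces a greyscale frame to a
# binary mask; the server labels connected objects in the received mask.

def node_side(frame, background, threshold=30):
    """VSN side: background subtraction followed by simple thresholding."""
    return [[1 if abs(p - b) > threshold else 0
             for p, b in zip(frow, brow)]
            for frow, brow in zip(frame, background)]

def server_side(binary):
    """Server side: 4-connected component labelling on the binary mask."""
    rows, cols = len(binary), len(binary[0])
    labels = [[0] * cols for _ in range(rows)]
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] and not labels[r][c]:
                next_label += 1
                stack = [(r, c)]
                while stack:
                    y, x = stack.pop()
                    if 0 <= y < rows and 0 <= x < cols \
                            and binary[y][x] and not labels[y][x]:
                        labels[y][x] = next_label
                        stack += [(y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)]
    return next_label, labels

background = [[10] * 6 for _ in range(4)]
frame = [row[:] for row in background]
frame[1][1] = frame[1][2] = 200   # one object
frame[3][4] = 200                 # another object
count, _ = server_side(node_side(frame, background))
# count == 2: two separate objects survive segmentation and are labelled
```

Only the binary mask crosses the wireless link in this split; the labelling loop, the costliest step here, runs on the unconstrained server.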
This work was supported in part by the STC research program at Mid Sweden University, Sweden, and the Higher Education Commission, Pakistan. The authors are with the Faculty of Science, Technology and Media, Mid Sweden University, Sweden. Phone: 0046700905858; Fax: 004660148456; E-mail: muhammad.imran@miun.se.

Fig. 1. VSN with tasks partitioning.

Fig. 2. VSN with tasks partitioning and ROI coding.

The transmission of a complete binary image consumes greater communication energy and, generally, the communication process is considered to consume more energy than the other processes, i.e. sensing and processing [5][7]. Therefore, it is necessary to select a suitable bi-level compression scheme. The selected compression scheme is expected to have a small processing energy consumption, a low memory requirement, a good compression ratio and a small design complexity. The image compression will reduce the data before transmission, which will result in a reduction of the communication energy consumption. For bi-level compression, seven well known image compression schemes, including ITU-T G4, ITU-T Group 3, JBIG2, Rectangular, GZIP, GZIP_pack and JPEG_LS, were investigated for resource constrained environments [10]. We have analyzed these compression schemes for a hardware implemented VSN and concluded that ITU-T G4 and JBIG2 are suitable candidates in relation to the hardware platform. The JBIG2 has a high processing energy, higher memory requirements [21] and a high design complexity as compared to that of the G4. Therefore, G4 is selected in this work.

In WVSN applications, there is a chance to reduce the data beyond the compression limit. Therefore, in this work, we have investigated the fact that the compressed data can be further reduced by employing Region of Interest (ROI) coding before the compression. This strategy is shown in Fig. 2. This approach will not only reduce the processing energy and design complexity on a hardware implemented VSN but will also reduce the communication energy consumption. To prove this concept, we have investigated different processing strategies for a hardware implemented VSN. After investigation, a suitable strategy is implemented and energy values are measured on real hardware. The implemented system is compared with a previously published system. The comparison results show that a significant reduction in energy consumption can be achieved by employing ROI coding before compression.

II. RELATED WORK

There are a number of systems in which a VSN has been designed and implemented on software and/or hardware platforms. The software implementation in this work refers to implementation by using a micro-controller, and the hardware implementation refers to implementation on FPGAs. In [2], the design principles for the video node are presented in the context of a long lifetime. The imager used is a 128x64-pixel binary contrast based sensor. However, for a number of applications, a binary image sensor and this small resolution are not feasible. Sánchez et al. [24] proposed a software and hardware based VSN system. In the system, two Blackfin Digital Signal Processor (DSP) processors are used for image processing and an FPGA is used for configuring the hardware outline. The authors claimed to have shown a number of implementation challenges while designing such a type of system. Bakkali et al. [4] proposed a system which processed the tasks by using a regular computational flow on an array of processing elements. The tasks with a small amount of data are processed on a 32-bit NIOS-II RISC processor. The authors defined the ROI by tracing a bounding box around the area in which a significant amount of motion was detected. The Stanford MeshEye node [8] uses two kilopixel imagers for low resolution images and one high resolution camera module for capturing detailed object snapshots. It is claimed that the Stanford MeshEye Mote quantifies the reduction in energy consumption through the usage of a hybrid-resolution vision system. However, the hybrid vision is computationally more efficient and consumes less energy for smaller objects within its field of view [5]. Kerhet et al. [9] have proposed a wireless video node, MicrelEye, for cooperative video processing applications. The authors used an FPGA for the initial processing, stored the ROI in an on-chip memory and then performed feature extraction and classification by using a micro-controller. The storage of the ROI and the performing of complex vision tasks by using a micro-controller will increase the power consumption of the node. Anastasi et al. [23] concluded that it is not always the communication module which accounts for the greater energy consumption; depending on the application, sometimes the processing module will account for the greater energy consumption. The authors provided a taxonomy for energy conservation in the wireless sensor networks context.

Our approach, shown in Fig. 2, is unique in relation to the aforementioned systems because our focus is to propose a VSN architecture which requires small processing and communication energy consumption and requires small design effort on a hardware platform. Following this, the experimental setup is described.

III. EXPERIMENTAL SETUP

In this section, different parameters of the experimental setup, including the target application, processing platforms and power consumption, are discussed.
A. Application description

The target applications in this work are industrial machine monitoring [11] and eagle detection [12]. The experimental results provided here are for the monitoring of an industrial machine, in order to remove redundancy. In industry, hydraulic machines wear out with ageing and this can create losses for the industry due to accidental stoppages. Engineers frequently stop the machines in order to check their health by looking at the particles detached from the engine. In this work, an autonomous machine surveillance system is envisioned and implemented to automatically detect magnetic particles, measure the size of the particles and transmit the information to the user. This does not require the stopping of the machine, which will increase productivity. It is worth mentioning that the application area for the analysis of this work is limited to systems for which the greyscale/colour data is reduced to binary after some pre-processing techniques on the VSN.

Fig. 3. Flow of vision tasks on the VSN.

Fig. 4. Flow of vision tasks on the server.

TABLE 1
DEVICE UTILIZATION, POWER AND PERFORMANCE PARAMETERS OF VISION FUNCTIONS

Vision Tasks        Logics  BRAMs  Power (mW)  Latency (clock cycles)
Image capture (A)   329     0      0.78        1
Pre-processing (B)  373     0      1           4
Segmentation (C)    3       0      0.13        1
Morphology (D)      683     4      1.14        645
ROI (E)             342     0      1.18        641
Compression (F)     3059    3      3.16        647

B. Processing platforms and vision tasks

To realize this system, the vision algorithms are implemented on a software and a hardware platform. The software platform used in this work is SENTIO32 [1], which has an AVR32, a 32-bit micro-controller [17], embedded with an IEEE 802.15.4 compliant transceiver [18], and the hardware platform is a Xilinx Spartan6 FPGA [14].
The vision tasks are shown in Fig. 3 and Fig. 4. The vision tasks implemented on the VSN are shown in Fig. 3, in which each individual task is represented by a symbol: image capturing is represented by A, pre-processing by B, etc. The vision tasks performed on the server are shown in Fig. 4. These tasks are represented by their respective symbols. The dashed lines in Fig. 3 show the different processing strategies on the VSN; for example, in one of the strategies, the vision tasks image capturing (A), pre-processing (B) and segmentation (C) could be performed on the VSN. The data can then be transmitted to the server, where the remaining vision tasks, such as morphology (I), low pass filtering (J), labelling (K), feature extraction (L) and classification (M), can be performed. In Fig. 4, there is an alternative path for some of the vision tasks to show that, when a specific task is performed on the VSN, it is not necessary to process it on the server, i.e. morphology. The different processing strategies are shown in Table 3 and discussed in the previous sections.

The processing time for each strategy on the hardware platform is calculated by using Eq. 1:

T = (R × (C + Ls) + Lt) / f  (sec)   (1)

where R represents the rows, C represents the columns, Ls represents the low line sync, Lt is the latency of each task, and f is the frequency. In this case, R is 400, C is 640, Ls is 32 and f is 27 MHz. The time spent on transmitting the results to the server is calculated by using Eq. 2:

T_IEEE = (X + 19) × 32 × 10^-6 + 192 × 10^-6  (sec)   (2)

where X is the number of bytes being transmitted, 19 is the number of overhead bytes due to the header information, 32 µsec is the processing time of one byte, and 192 µsec is the settling time of the radio transceiver. The energy consumption of the external flash light, used to achieve a sufficiently high signal to noise ratio, is 48 µJ. The power consumption of the IEEE 802.15.4 compliant transceiver is 132 mW, while that of the AVR32 is 77.55 mW, when operating. The total energy spent on sending data over the wireless link is the combination of the individual energy consumptions of the IEEE 802.15.4 transceiver and the software platform, because both are running when the data is being transmitted to the server. The energy consumption of the CMOS camera [20] for processing one image is 2.3 mJ.

C. Resource utilization, power and performance parameters

The power consumption, logic and latency of an individual vision task are shown in Table 1. The Xilinx power measuring tool, called the XPower Analyzer [14], was used for the power analysis. For real time power measurement on the FPGA, the Digilent tool called Adept is used [15]. In relation to the power analysis, the sleep power consumption of the ACTEL FLASH based FPGA was used in the lifetime calculation because it has a small sleep power consumption of 5 µW [16]. The designs in this work are sufficiently large to fit on the available ACTEL platform. For small designs, this has been proved in work [1]. Following this, the target architecture is presented.
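Eq. 1 and Eq. 2 can be checked with a short worked sketch. This is our back-of-the-envelope illustration: the 647-cycle latency is the compression figure from Table 1, and the 3200-byte payload is the Strategy 1 output size quoted in Section V.

```python
# Worked sketch of Eq. 1 and Eq. 2 with the constants quoted in the text
# (R=400, C=640, Ls=32, f=27 MHz; 19 overhead bytes, 32 us per byte,
# 192 us transceiver settling time).

def processing_time(lt_cycles, rows=400, cols=640, line_sync=32, f_clk=27e6):
    """Eq. 1: T = (R*(C+Ls) + Lt) / f, in seconds."""
    return (rows * (cols + line_sync) + lt_cycles) / f_clk

def transmission_time(x_bytes):
    """Eq. 2: T_IEEE = (X+19)*32e-6 + 192e-6, in seconds."""
    return (x_bytes + 19) * 32e-6 + 192e-6

def transmission_energy(x_bytes, p_radio=0.132, p_avr32=0.07755):
    """The radio and the AVR32 both run while transmitting, so their
    powers (132 mW and 77.55 mW) add over the transmission time."""
    return (p_radio + p_avr32) * transmission_time(x_bytes)

t_proc = processing_time(lt_cycles=647)   # ~10 ms per frame
t_tx = transmission_time(3200)            # ~103 ms for 3200 bytes
e_tx = transmission_energy(3200)          # ~21.6 mJ per transmission
```

The sketch makes the trade-off of Section I concrete: at these figures, transmitting an uncompressed binary frame costs far more energy than processing it, which is why ROI coding and compression on the node pay off.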
decompression, reconstruction image from the ROI image with After the row (Line sync) is completed, a flag is generated to objects placed at original locations, low pass filtering/bubble either retained or discard the row as shown in Fig. 6. The row is remover, labelling, feature extraction and classification. The retained by generating a new row signal, ROI_sync, when it has individual tasks implemented on VSN are described here. an object information. When the row contains all zeros, it is discarded and no ROI_sync is generated. For the reconstruction A. Image capture of the original image, the rows having objects are represented by Images are captured by using CMOS camera [20]. The 1 and the rows having no objects are represented by 0. camera is programmed via I2C for a resolution of 640×400. Generally, in WVSN applications, there are a few objects B. pre-processing with a limited number of pixels values and the background In this work, the pre-processing includes background information occurs in long runs. This makes the run length subtraction. Depending on the application filtering can also be coding efficient for compressing the long runs of 1 and 0. The performed for smoothing and noise removal. For background run length can be used in two ways. In one of the ways, true run subtraction, a background image is stored in the FLASH length coding can be performed which requires a run value and memory [19] via a Serial Peripheral Interface (SPI) at the initial run count. The second way is to have a special zero-to-end stage. During processing current image, the background image symbol which indicates that the remainders of the symbols to the is accessed from the FLASH memory and then the current image end of the block are zero [22]. For run length coding, we have is subtracted from the background image. used a variant of the zero-to-end symbol coding technique. In this technique the first value of the pixel is represented by a C. 
Segmentation count and a symbol as shown in Table 2. The first black run with In segmentation, a simple thresholding technique is applied a value of 200 has a count value of 200 and symbol value of 0. to partition the image into mutually exclusive connected binary After this, the remaining count values alone are used for the regions. This technique is suitable for applications in which the counting and representation of symbols. The count values can objects are relatively distinguished from the background. be represented by number of bits including 8 or greater than 8. Following on from this stage, the binary data can be processed. In any specific number of bits, when the run value is greater than The binary operations will reduce the processing load and will the maximum range, 0s are appended which shows maximum assist in the reduction of hardware resources. range. The 0s are then followed by a remaining count value. For example in Table 2, for 8 bits when a count value is 500, it is D. Morphology represented by 0 which shows 255 and then 0 is followed by After segmentation, binary morphology is applied to remove remaining count value of 245. The run length codes the noise. In binary morphology, erosion followed by dilation, representing the presence of objects in the rows are transmitted with a mask of 3×3 is applied. During the erosion and dilation, along with the compressed Huffman codes of ROI image to the two complete rows are stored in the line buffers in order to have server. the necessary neighbourhood information for the operation. F. ITU-T G4 compression E. ROI coding The G4 [13], implemented in this work, includes two stages. The ROI coding is performed in order to transmit binary image regions which have objects. The ROI coding employed in this work removes the rows without objects and retains the columns as shown in Fig. 7 (b). 
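As a behavioural sketch, the row-discarding scheme described above can be written as follows. The function names `roi_encode` and `roi_decode` are ours for illustration; the real design streams one row at a time through a line buffer in hardware rather than holding whole images in memory:

```python
def roi_encode(image):
    """Drop all-zero rows of a binary image; keep a presence bit per
    row (1 = row kept, i.e. an ROI_sync would be generated)."""
    kept, row_flags = [], []
    for row in image:
        flag = 1 if any(row) else 0
        row_flags.append(flag)
        if flag:
            kept.append(row)  # row contains object pixels
    return kept, row_flags


def roi_decode(kept, row_flags, width):
    """Reconstruct the original image by re-inserting all-zero rows
    at the positions marked 0 in row_flags."""
    rows = iter(kept)
    return [next(rows) if f else [0] * width for f in row_flags]
```

The `row_flags` stream, whose 1s and 0s occur in long runs, is what is later run-length coded as the ROI row runs, while the kept rows are handed to the G4 compressor.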
Fig. 6. Signal behavior with ROI coding.

Fig. 7. (a) Original image with objects. (b) Image after ROI coding.

TABLE 2
RUN LENGTH CODING OF THE ROI CODING

  Run   Colour   Code (8 bits)   Code (9 bits)
  200   Black    200, 0          200, 0
  100   White    100             100
  280   Black    0, 25           280
  150   White    150             150
  514   Black    0, 0, 4         0, 3
  500   White    0, 245          500
  656   Black    0, 0, 146       0, 145

In stage 1, the three encoding modes, namely the pass mode, the vertical mode and the horizontal mode, are identified. In addition, the black and white runs for the horizontal mode are calculated at this stage. In stage 2, the pass, vertical and horizontal modes are assigned their Huffman codes. The Huffman codes of the pass and vertical modes are assigned at runtime because there are few Huffman codes in these two modes. The Huffman codes for the horizontal mode are stored in the on-chip distributed memory of the platform. In the implementation, three line buffers are used at stage 1 in order to have the information relating to the coding and reference lines. Two of these buffers are used for the storage of the reference line and the coding line, while the third line buffer is used for saving the current row data. The G4 is implemented without the header stripes because the sender and the receiver have knowledge about the compressed data. This saves approximately 100 bytes.

G. VSN data format

The VSN data format is shown in Fig. 8, where the ROI Huffman codes hold the compressed data of the ROI image, the ROI row runs hold the object information of the rows, and the ROI bytes count gives the number of bytes of the ROI row runs. This VSN data format enables the reconstruction of the original image at the receiving side. In relation to the ROI row runs, it is essential to select a suitable number of bits to represent the runs. After analyzing images of dimensions 640×400 and 3000×2000 with different numbers and types of objects, we have concluded that representing the ROI row runs with 9 bits gives the smallest number of bytes.

Fig. 8. VSN data format with ROI coding.

V. RESULTS AND DISCUSSION

The different VSN processing strategies in Table 3 are represented by dashed lines in Fig. 3. These strategies are combinations of vision tasks, each represented by its respective symbol in Fig. 3: image capturing is represented by A, pre-processing by B, segmentation by C, etc. In Strategy 1, the vision tasks image capturing (A), pre-processing (B) and segmentation (C) are performed on the VSN and the data (32000 bytes) is transmitted to the server, where the remaining vision tasks, such as morphology (I), low pass filtering (J), labelling (K), feature extraction (L) and classification (M), are performed. In a similar manner, there are eight strategies with different combinations of vision tasks. The energy consumption, resource utilization and output data of each strategy are given in Table 3. In Table 3, E_Proc is the processing energy consumption and E_Comm is the communication energy consumption, whereas E_Tot is the total energy consumption of the VSN modules, which includes lighting, camera, processing and communication.

A. Selection of the VSN implementation strategy

By analyzing all the strategies in Table 3, it is evident that Strategy 8, which includes the vision tasks image capturing (A), pre-processing (B), segmentation (C), morphology (D), ROI coding (E) and G4 compression (F), has reduced energy consumption and output data as compared to the other strategies. The output images of Strategy 8 are shown in Fig. 9(d), whereas the input images are shown in Fig. 9(a). The comparison of the energy consumption of Strategy 8 with the other strategies shows that the introduction of ROI coding in Strategy 8 results in a reduction of the communication energy at the cost of a small processing energy consumption. Strategy 7 in Table 3 shows that ROI coding, together with pre-processing, segmentation and compression, can achieve results comparable to Strategy 8. It is important to note that in the worst cases, when there is significant noise or there are objects in every alternate row, ROI coding without appropriate illumination and/or morphological filtering will not offer good results. The reason is that the amount of data required to represent the ROI row runs increases when long runs cannot be expected. Fig. 9(b) shows that, without morphology, there is noise as compared to Fig. 9(d); the amount of data generated for Fig. 9(b) is 505 bytes as compared to 259 bytes for Fig. 9(d). By analyzing the different strategies in Table 3, it is evident that Strategy 8, which includes pre-processing, segmentation, morphology and ROI coding together with compression, offers better results than the other strategies.

B. Implementation and comparison

Strategy 8, which is selected as a suitable strategy for the VSN, is implemented on the hardware platform.
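The zero-to-end count scheme of Table 2 can be sketched in a few lines. This is our reading of the table (a count of 0 stands for the maximum range of the chosen bit width and continues the run; the first run additionally carries an explicit colour symbol, after which the colours alternate), and the function names are ours:

```python
def encode_run(run, bits=9):
    """Encode one run length in the zero-to-end style of Table 2.
    A count of 0 means "maximum range" (2**bits - 1) and that the run
    continues; the first non-zero count terminates the run."""
    max_range = (1 << bits) - 1
    codes = []
    while run > max_range:
        codes.append(0)       # 0 stands for one full max_range chunk
        run -= max_range
    codes.append(run)
    return codes


def decode_runs(codes, bits=9):
    """Inverse of encode_run, applied to a stream of counts."""
    max_range = (1 << bits) - 1
    runs, acc = [], 0
    for c in codes:
        if c == 0:
            acc += max_range  # run continues into the next count
        else:
            runs.append(acc + c)
            acc = 0
    return runs
```

With 8 bits this reproduces the table entries, e.g. 280 → 0, 25; 500 → 0, 245; 514 → 0, 0, 4; with 9 bits, 514 → 0, 3 and 656 → 0, 145.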
TABLE 3
ESTIMATED ENERGY VALUES, RESOURCE UTILIZATION AND OUTPUT DATA OF THE DIFFERENT VSN PROCESSING STRATEGIES

  Strategy   VSN tasks   Server tasks   E_Proc+E_Comm (mJ)   E_Tot (mJ)   Logic cells   FPGA brams   Data sent (bytes)
  1          ABC         IJKLM          214                  217          705           N.A.         32000
  2          ABCF        GIJKLM         4.90                 7.28         3764          3            680
  3          ABCD        GKLM           214                  217          1388          4            32000
  4          ABCDF       GJKLM          3.70                 6.09         4447          7            500
  5          ABCE        HIJKLM         64.9                 67.3         1047          0            9635
  6          ABCDE       HJKLM          48.7                 51.1         1730          4            7217
  7          ABCEF       GHIJKLM        3.74                 6.12         4106          3            505
  8          ABCDEF      GHJKLM         2.10                 4.49         4789          7            259

TABLE 4
DIFFERENT PARAMETERS OF THE PUBLISHED AND THE PROPOSED SYSTEM

  System          E_Tot (mJ)   Logic cells   Max frequency (f/s)   Data sent (bytes)   Lifetime [months] (period 2 s)
  Published [1]   9.57         3949          59.5                  500                 3
  Proposed        6.31         4339          99.73                 259                 4.5

Fig. 9. Sample images: (a) after segmentation, (b) ROI coding of image (a), (c) after segmentation and morphology, (d) ROI coding of image (c).

Fig. 10. Lifetime of the VSN for different sample periods, predicted by using the measured energy values.

The power is measured on the hardware platform by using the Digilent real-time power measuring tool Adept [15].
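The E_Tot column of Table 3 can be cross-checked against its E_Proc+E_Comm column: for the compressed strategies, the difference is an almost constant 2.39 mJ, consistent with adding the fixed per-frame camera energy of 2.3 mJ (Section III) plus a small lighting term. The 0.09 mJ lighting figure below is our inference from the table, not a number stated in the paper, and strategies 1 and 3 match only to the three significant figures quoted:

```python
# Strategy: (E_Proc + E_Comm, E_Tot) in mJ, from Table 3
strategies = {
    2: (4.90, 7.28),
    4: (3.70, 6.09),
    5: (64.9, 67.3),
    6: (48.7, 51.1),
    7: (3.74, 6.12),
    8: (2.10, 4.49),
}

E_CAMERA = 2.30  # mJ per image, stated in Section III
E_LIGHT = 0.09   # mJ, inferred so that the sums match Table 3

for s, (e_pc, e_tot) in strategies.items():
    # E_Tot = processing + communication + camera + lighting
    assert abs((e_pc + E_CAMERA + E_LIGHT) - e_tot) < 0.02, s
```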
It is worth mentioning that for simulation a smaller Spartan-6 FPGA device, namely the XC6SLX9L-CSG225 [14], was used, but for the real hardware implementation the Atlys board from Digilent [15] was used, which includes a larger Spartan-6 FPGA device, the XC6SLX45-CSG324 [14]. The implemented VSN is then compared with the previously published system [1]. The comparison results of the two systems are given in Table 4, which shows that the introduction of ROI coding reduced the overall energy consumption of the proposed system by 34 percent as compared to the previously published system. It is worth mentioning that the VSN in [1] was implemented on a smaller FLASH-based FPGA device, as compared to the larger SRAM-based FPGA device used in this work; on the same device, the difference in energy consumption between the VSNs is expected to be significant. The frame rate achieved by the implemented system is approximately 2 times greater and the output data is reduced by a factor of approximately 2 as compared to the published system. This improvement is achieved at the cost of a small difference in hardware resources.

The lifetime of the two systems for a sample period of 2 s is shown in Table 4. The lifetime calculation is based on the assumption that the batteries offer constant performance and zero leakage during the operating period. The lifetime of the VSN is predicted by using an AA battery having 37.44 kJ of energy. The predicted lifetime of the VSN is compared with the published system in Fig. 10, which shows that, depending on the sample period, the proposed system can achieve up to 50 percent longer lifetime as compared to the published system. It is worth mentioning that after a certain time it is not useful to keep the node in sleep mode, as the sleep energy starts to dominate and no further improvement in lifetime can be achieved. Fig. 10 shows that after a sample period of 130 minutes the lifetime is approximately constant, because the sleep power consumption is dominant at this stage.
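The lifetime predictions above can be reproduced with a simple model, assuming the measured per-frame E_Tot values of Table 4, the 5 µW sleep power of [16] and an ideal 37.44 kJ AA battery with zero leakage, as the paper does. This is our own sketch of the calculation, not the authors' tool:

```python
def lifetime_months(e_frame_j, period_s, p_sleep_w=5e-6, e_batt_j=37.44e3):
    """Average power is one frame's energy per sample period plus the
    sleep floor; the battery drains at that constant rate."""
    p_avg = e_frame_j / period_s + p_sleep_w
    return e_batt_j / p_avg / (30 * 24 * 3600)  # seconds -> months


# Table 4, sample period of 2 s:
proposed = lifetime_months(6.31e-3, 2)   # roughly 4.5 months
published = lifetime_months(9.57e-3, 2)  # roughly 3 months
```

Beyond a sample period of about 130 minutes, the per-frame term of the proposed node (6.31 mJ / 7800 s ≈ 0.8 µW) drops below the 5 µW sleep floor, which is why the lifetime curve in Fig. 10 flattens.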
VI. CONCLUSION

In this work, Wireless Visual Sensor Node (VSN) implementation strategies are investigated with the objective of reducing the overall energy consumption. The strategy which consumes the least energy is implemented on a hardware platform and is then compared with a previously published system. In the implemented system, the processing energy consumption is reduced by partitioning the vision tasks between the VSN and the server, as well as by executing the lightweight vision tasks, i.e. pre-processing, segmentation and binary morphology, on the hardware platform. The complex vision tasks, such as labelling, feature extraction and classification, are performed on the server. The amount of data being transmitted from the VSN to the server is reduced by introducing a simplified Region Of Interest (ROI) coding technique together with the G4 compression scheme. This approach reduced the amount of data by a factor of approximately 2 and the overall energy consumption by approximately 34 percent as compared to the previously published system. It is worth mentioning that it is necessary to apply appropriate illumination and/or morphological filtering in order to reduce the noise. It is concluded that, depending on the sample time, a VSN architecture with ROI coding and the G4 scheme can achieve a 50 percent longer lifetime than a VSN without ROI coding. The proposed approach reduces the overall energy consumption of the VSN and moves the design complexity from the embedded platform to the more generalized platform, namely the server.

REFERENCES

[1] M. Imran, K. Khursheed, N. Lawal, M. O'Nils, and N. Ahmad, "Implementation of Wireless Vision Sensor Node for Characterization of Particles in Fluids", IEEE Trans. on Circuits and Systems for Video Technology, vol. PP, 2012.
[2] L. Gasparini, R. Manduchi, M. Gottardi, D. Petri, "An Ultralow-Power Wireless Camera Node: Development and Performance Analysis", IEEE Trans. on Instr. and Meas., vol. PP, pp. 1-9, Dec. 2011.
[3] W. J. MacLean, "An Evaluation of the Suitability of FPGAs for Embedded Vision Systems", IEEE Comp. Society Conf. on Comp. Vision and Pattern Recog., pp. 25-25, Jun. 2005.
[4] M. Bakkali, R. Carmona-Galán, A. Rodríguez-Vázquez, "A prototype node for wireless vision sensor network applications development", 5th Intl. Sym. on I/V Comm. and Mobile Network (ISVC), Sep. 2010.
[5] S. Soro and W. Heinzelman, "A survey of visual sensor networks", Advances in Multimedia, vol. 2009, May 2009.
[6] I. Dietrich and F. Dressler, "On the lifetime of wireless sensor networks", ACM Trans. on Sen. Net., vol. 5, pp. 1-38, Feb. 2009.
[7] L. Ferrigno, S. Marano, V. Paciello, and A. Pietrosanto, "Balancing computational and transmission power consumption in wireless image sensor networks", Proc. of the IEEE Intl. Conf. on Virtual Env., Human-Computer Interfaces, and Measurement Systems, pp. 61-66, July 2005.
[8] S. Hengstler, D. Prashanth, F. Sufen, H. Aghajan, "MeshEye: A Hybrid-Resolution Smart Camera Mote for Applications in Distributed Intelligent Surveillance", 6th Intl. Symposium on Information Processing in Sensor Networks, 2007.
[9] A. Kerhet, M. Magno, F. Leonardi, A. Boni and L. Benini, "A low-power wireless video sensor node for distributed object detection", Journal of Real-Time Image Proc., vol. 2, Oct. 2007.
[10] K. Khursheed, M. Imran, N. Ahmad, M. O'Nils, "Selection of bilevel image compression methods for reduction of communication energy in wireless visual sensor networks", Proc. of SPIE, Real-Time Image and Video Processing, April 2012.
[11] M. Imran, K. Khursheed, M. O'Nils, N. Lawal, "Exploration of Target Architecture for a Wireless Camera Based Sensor Node", IEEE Norchip Conf., Finland, Nov. 15-16, 2010.
[12] N. Ahmad, N. Lawal, M. O'Nils, B. Oelmann, M. Imran, K. Khursheed, "Model and Placement Optimization of a Sky Surveillance Visual Sensor Network", Intl. Conf. on Broad. and Wireless Comp., Comm. and Appl., Oct. 2011.
[13] TIFF, Revision 6.0 (1992), Available: http://www.itu.int/itudoc/itu-t/com16/tiff-fx/docs/tiff6.pdf
[14] Xilinx, Inc., Spartan-6 family and Xilinx power tools tutorial (2010), Available: http://www.xilinx.com/
[15] Digilent, Adept 2.10.2 (2011), Available: http://www.digilentinc.com/
[16] Microsemi Corporation, IGLOO video kit (2009), Available: http://www.actel.com/
[17] Atmel Corporation, AT32UC3B0256, AVR32 (2010), Available: http://www.atmel.com/
[18] Texas Instruments Inc., CC2520 transceiver (2007), Available: http://www.ti.com/
[19] Micron, Numonyx Serial Flash Memory (2007), Available: http://www.micron.com/
[20] Aptina Imaging Corporation, MT9V011 CMOS Image Sensor (2009), Available: http://www.aptina.com/
[21] C. Chun-Chia, C. Yu-Wei, F. Hung-Chi, C. Liang-Gee, "Analysis and architecture for memory efficient JBIG2 arithmetic encoder", 48th Midwest Symposium on Circuits and Systems, pp. 1191-1194, 2005.
[22] D. G. Bailey, Design for Embedded Image Processing on FPGAs, John Wiley & Sons, Asia, pp. 199-231, 2011.
[23] G. Anastasi, M. Conti, M. Di Francesco, A. Passarella, "Energy conservation in wireless sensor networks: A survey", Ad Hoc Networks, 7(3), pp. 537-568, May 2009.
[24] J. Sánchez, G. Benet, J. E. Simó, "Video sensor architecture for surveillance applications", Sensors, 12(2), pp. 1509-1528, Feb. 2012.

Muhammad Imran received the B.S. degree in computer engineering in 2006, the Master's degree in electrical engineering (System-on-Chip) in 2009 and the Licentiate degree in electronics design in 2011. He is currently pursuing PhD studies in electronics system design at Mid Sweden University, Sweden. His research interests include embedded video processing systems, particularly the investigation of programmable and energy efficient architectures for wireless smart camera systems, hardware/software partitioning of imaging tasks, and the synthesis and implementation of embedded vision systems.

Khursheed Khursheed received his bachelor's degree in Computer Engineering from the COMSATS Institute of Information Technology, Abbottabad, Pakistan in 2006 and obtained one year of teaching experience at the same institute until August 2007. He completed his PhD studies in 2013 at the Electronics Design Division of Mid Sweden University, Sweden. He is working as an Assistant Professor at Abasyn University, Peshawar, Pakistan. His research focus is on the exploration of methods for the reduction of the energy consumption of the visual sensor node through intelligent partitioning and data reduction.

Naeem Ahmad received the M.Sc. degree in electronics from Quaid-e-Azam University, Pakistan, the Master's degree in electrical engineering (System-on-Chip) from Linköping University, Sweden, and the Licentiate of Electronics degree from Mid Sweden University, Sweden. Currently, he is pursuing his PhD studies in surveillance system design and visual sensor networks at Mid Sweden University, Sweden. His main research interests include the modelling and optimization of visual sensor networks and the design of vision based surveillance systems.

Malik Abdul Waheed received his B.Sc. in computer engineering in 2005, his Master's degree in electrical engineering in 2009, and his Licentiate in electronics design in 2012. He is conducting his PhD on optical real-time measurement systems for position and orientation in at most six degrees of freedom. These are machine vision systems which are suitable for aggressive parallelization and implementation on Field Programmable Gate Arrays (FPGA).

Mattias O'Nils received his B.Sc. electrical engineering degree from Mid Sweden University in 1993 and his Licentiate and PhD degrees in electronics system design from the Royal Institute of Technology in Stockholm, Sweden, in 1996 and 1999, respectively. He is Professor and prefect of the department and leads a research group in embedded systems design at Mid Sweden University. His research interests include hardware/software partitioning, methods for performance optimization by migration of software functionality to hardware, specification, design and verification of hardware/software interfaces, low-power optimization of FSMs, compiler technology, and memory optimization and synthesis for real-time video systems.

Najeem Lawal received his B.Sc. degree in Electrical Engineering from the University of Lagos, Nigeria in 1998. He received his M.Sc. and PhD in electronics system design from Mid Sweden University, Sundsvall, Sweden in 2004 and 2009, respectively. He is an Assistant Professor in the electronics design division at Mid Sweden University. His research involves machine vision, where a combination of state-of-the-art developments in software and hardware is employed in implementing the systems. In particular, Najeem focuses on the power consumption of vision system algorithms and implementations, and works on how it can be optimized through algorithm optimization and implementation platform exploration.