

### DESIGN AND IMPLEMENTATION OF ADDERS WITH SINGLE LUT FOR AERIAL IMAGE PROCESSING APPLICATIONS

D.Shofia Priyadharshini<sup>1</sup>, K.Aishwarya<sup>2</sup>, R.Ranjith<sup>3</sup>, V.Vinay Kumar<sup>3</sup>, Manne Mohith<sup>3</sup>

<sup>1,2</sup>Assistant Professor, Department of Electronics and Communication Engineering, Vel Tech High tech Dr.Rangarjan Dr.Sakunthala Engineering College, Chennai, India

<sup>3</sup>UG Scholar, Department of Electronics and Communication Engineering, Vel Tech High tech Dr.Rangarjan Dr.Sakunthala Engineering College, Chennai, India.

#### Abstract:

In VLSI design, adders play an important role in both the design and the implementation of an application. This paper compares three adders: the approximate adder, the carry select adder and the ripple carry adder. The approximate adder sacrifices accuracy for performance; it uses a simplified circuit to produce its output quickly, which reduces the time and energy required. The carry select adder uses a set of precomputed carry bits to speed up the computation of the output. The ripple carry adder produces its output through a series of full-adder operations. The three adders are compared, and the best one is selected on the basis of its results and applied to the DCT as an image processing application.

Key Words: VLSI, Approximate Adder, Ripple Carry Adder, Carry Select Adder, DCT.

#### **1. Introduction:**

The open architecture of an FPGA sets it apart from an ASIC. In ASICs, processing is implemented using logic gates. FPGAs, on the other hand, implement logic using Configurable Logic Blocks (CLBs). The CLB, whose slices come in two varieties called SLICEL and SLICEM (logic and memory slices), is connected to a switch matrix in order to reach the general routing matrix. Each slice is made up of Look-Up Tables (LUTs) and flip-flops. The total numbers of flip-flops and LUTs vary between FPGA devices.

To minimise silicon area and power consumption, ASIC designs are optimised at the level of logic gates. However, because the fundamental design principles differ, ASIC-focused optimisations cannot easily be mapped onto the FPGA architecture.

Because Look-Up Tables (LUTs) are the basic building blocks of FPGAs, LUT-specific computation methods are necessary. Even so, FPGAs (field-programmable gate arrays) are favoured for the hardware implementation of some critical and real-time tasks (such as UAVs) because of their run-time reconfigurability, rapid prototyping (shorter time to market) and low cost.

#### **16-bit Ripple Carry Adder:**

A ripple-carry adder (shown in fig 1) is a basic adder architecture that uses a series of full-adder stages to perform the addition operation. This type of adder is relatively simple to implement and requires minimal circuitry, making it a popular choice for small-scale applications. However, in the context of DCT, ripple-carry adders can be slow and inefficient, especially for large data sets, due to the need to propagate carries through multiple stages.



Fig 1: 16 bit Ripple Carry Adder
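The carry propagation described above can be illustrated with a short behavioural sketch. It is not the authors' hardware description; the function names `full_adder` and `ripple_carry_add` are chosen here for illustration only.

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (sum_bit, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def ripple_carry_add(a, b, width=16, cin=0):
    """Add two unsigned integers one bit at a time.

    The carry of stage i feeds stage i + 1, so the worst-case delay
    grows with the number of stages. Returns (sum, carry_out).
    """
    carry = cin
    total = 0
    for i in range(width):
        s, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        total |= s << i
    return total, carry
```

Calling `ripple_carry_add(0xFFFF, 1)` exercises the longest carry chain, because the carry generated at bit 0 has to travel through all 16 stages.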

#### **Carry Select Adder:**

A carry select adder (shown in fig 2) is a type of adder that uses precomputed results to speed up the addition operation: each block computes its sum for both possible carry-in values (0 and 1), and the actual carry-in then selects the correct one. This type of adder is particularly useful for computing the DCT because it shortens the chain of full-adder stages that a carry has to ripple through, leading to a reduction in the overall delay of the circuit. However, carry-select adders are generally more complex than ripple-carry adders, which can lead to higher power consumption and area requirements.



Fig 2: Carry Select Adder
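A minimal behavioural sketch of the carry-select idea follows. The block width of 4 bits and the names `ripple_block` and `carry_select_add` are assumptions made for illustration; the paper does not state the block partitioning it used.

```python
def ripple_block(a, b, cin, width):
    """Exact addition of one width-bit block; returns (sum, carry_out)."""
    total = a + b + cin
    return total & ((1 << width) - 1), total >> width


def carry_select_add(a, b, width=16, block=4):
    """Carry-select addition: every block is evaluated for carry-in 0 and 1,
    and the carry arriving from the previous block picks one of the two."""
    result, carry, pos = 0, 0, 0
    while pos < width:
        w = min(block, width - pos)
        mask = (1 << w) - 1
        a_blk, b_blk = (a >> pos) & mask, (b >> pos) & mask
        sum0, c0 = ripple_block(a_blk, b_blk, 0, w)  # speculated carry-in 0
        sum1, c1 = ripple_block(a_blk, b_blk, 1, w)  # speculated carry-in 1
        s, carry = (sum1, c1) if carry else (sum0, c0)  # multiplexer
        result |= s << pos
        pos += w
    return result, carry
```

In hardware the two speculative sums of all blocks are computed at the same time, so only the multiplexer selection ripples from block to block; the duplicated adders are the source of the extra area and power mentioned above.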

#### **Approximate Adder:**

An approximate adder (shown in fig 3) is a type of adder that sacrifices accuracy for performance. It uses simplified circuits to perform the addition operation, which reduces the amount of time and energy required to perform the operation. However, the resulting output may not be exact. In the context of DCT, using an approximate adder can reduce the precision of the computed DCT coefficients, which degrades the quality of the reconstructed signal.





FPGA logic blocks are built from LUTs. A "k-input LUT" is a small SRAM that can store a truth table with k inputs and 1 output. For example, a 3-input LUT takes three 1-bit inputs and stores $2^3 = 8$ possible single-bit results, mapping each distinct input combination to its stored output.
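The truth-table view of a LUT can be sketched in a few lines. The helper names `make_lut` and `read_lut` are hypothetical and only model the behaviour; they do not correspond to any vendor primitive.

```python
def make_lut(k, func):
    """Fill a k-input LUT: one stored output bit per input combination."""
    table = []
    for addr in range(1 << k):
        inputs = [(addr >> i) & 1 for i in range(k)]
        table.append(func(*inputs) & 1)
    return table


def read_lut(lut, *inputs):
    """Look up the stored bit; the input bits form the table address."""
    addr = sum(bit << i for i, bit in enumerate(inputs))
    return lut[addr]


# Two 3-input LUTs that together behave like a one-bit full adder.
sum_lut = make_lut(3, lambda a, b, cin: a ^ b ^ cin)
carry_lut = make_lut(3, lambda a, b, cin: (a & b) | (cin & (a ^ b)))
```

Each of the two tables holds exactly eight entries, one for every combination of the three input bits.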

The primary goal of this project is to demonstrate a novel approximate adder design method for FPGA-based devices that improves size, weight and power (SWaP) while keeping accuracy within reasonable limits. The proposed design technique exploits the Look-Up Table (LUT) architecture that is specific to FPGAs: it introduces approximation by dividing the carry chain into LUT-based sub-adders, with configurable overlap, so that the accuracy of the adder can be adjusted and the overall delay is that of a single LUT.

The approximate adders are designed in the Xilinx 14.7 design suite using sub-adder-based topologies that split the carry chain and propagate selected sub-adder results, which improves power, area and delay. The error probability is then determined using a mathematical formula. Finally, the adder design is introduced into the DCT module in MATLAB for image compression.
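The idea of cutting the carry chain into overlapping sub-adders can be explored with a small behavioural model. This is only a sketch under stated assumptions: the parameter names `size` and `overlap`, and the convention that each sub-adder looks at a window of at most `size` bits of which the lowest `overlap` bits serve only to estimate the incoming carry, are choices made here for illustration and may differ from the exact design used in this work. The error rate is measured by exhaustive enumeration rather than by the analytical formula.

```python
def approx_add(a, b, width=16, size=3, overlap=1):
    """Segmented approximate addition (illustrative model, not the paper's RTL).

    Each sub-adder sees a window of at most `size` bits per operand.
    Carries from below the window are ignored, which cuts the carry chain.
    """
    step = size - overlap  # new result bits produced per sub-adder
    if step <= 0:
        raise ValueError("overlap must be smaller than size")
    result, carry_out = 0, 0
    for pos in range(0, width, step):
        lo = max(0, pos - overlap)   # window start, including overlap bits
        hi = min(pos + step, width)  # window end (exclusive)
        mask = (1 << (hi - lo)) - 1
        window_sum = ((a >> lo) & mask) + ((b >> lo) & mask)
        bits = (window_sum >> (pos - lo)) & ((1 << (hi - pos)) - 1)
        result |= bits << pos
        carry_out = window_sum >> (hi - lo)  # carry of the top window
    return result | (carry_out << width)


def error_rate(width, size, overlap):
    """Fraction of all operand pairs for which the result is not exact."""
    pairs = 1 << (2 * width)
    wrong = sum(
        approx_add(a, b, width, size, overlap) != a + b
        for a in range(1 << width)
        for b in range(1 << width)
    )
    return wrong / pairs
```

For example, with 4-bit operands, `approx_add(7, 1, width=4, size=3, overlap=1)` returns 4 instead of 8, because the carry generated at bit 0 lies outside the window that computes bits 2 and 3. Increasing `overlap` lets each window see more lower bits, which trades a wider sub-adder for a lower error rate.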



#### 2. Related Work:

The researchers used a variety of adders, including ripple carry, carry select and Brent Kung, to design and analyse an eight-bit Vedic multiplier. Some of their designs employed the Ling carry select (CS) adder and the Binary to Excess-1 Converter (BEC). The study compared the speed and area of several carry select adder designs within the Vedic multiplier. In terms of both area and latency, the researchers found that an eight-bit multiplier employing the Brent Kung (BK) adder with the Carry Select Adder (CSL) offered the best results[1]. A 16-bit Square Root Carry-Select Adder (SQRT CSA) was developed to use less area and power than a standard CSA. The SQRT CSA replaces the Cin=1 block of the traditional CSA with a binary-to-excess-1 converter (BEC). Additional implementations include RCA and BEC with various bit widths. Transistor-level schematics are created in Mentor Graphics Design Architect, and the design is simulated in Eldo using TSMC 0.35 µm CMOS technology and a 3.3 V supply voltage[2]. Exact and approximate adder cells and ripple carry adders are evaluated in this study using the frequency upscaling method. With this technique, inputs are applied to an adder cell at a higher frequency than it can handle, which results in addition errors. The findings demonstrate that the inexact adder can operate at a higher frequency and with lower energy dissipation than the exact adder, and that the inexact and exact RCAs have very close normalised mean error distances and mean relative error distances[3].

An efficient reverse converter from the redundant binary (RB) format to two's complement format is presented in this study. Compared against competing designs, the proposed reverse converter completes a 64-bit conversion in 829 ps and dissipates just 5.84 mW at a data rate of 1 GHz and a supply voltage of 1.8 V in TSMC 0.18-µm CMOS technology[4]. To increase energy efficiency, this paper investigates the use of various parallel-prefix adder topologies in the precise block of approximate adders. In contrast to most comparable papers, which employ only one topology, this study investigates many topologies and evaluates their performance using the Sobel image application. Results demonstrate that switching from the Ripple Carry Adder to faster adder topologies significantly reduces energy usage and dynamic power[5]. The novel approximate adder proposed in this study uses longer carry chains to provide configurable accuracy levels and includes a Carry Select Module for improved delay. The design is flexible and can accommodate different degrees of precision. Experiments demonstrate that this design has better latency and accuracy than previously proposed approximate adders[6].

This research proposes exploring various Parallel-Prefix Adder (PPA) topologies in the precise part of approximate adders for better performance and power efficiency. The majority of earlier work on approximate adders employed Ripple Carry Adders (RCA) for the precise part. The findings demonstrate that PPA adders can substantially increase the clock frequency with a low energy overhead[7]. To develop Carry Select Adder (CSA) architectures, this study proposes using a parallel prefix adder, specifically a Brent Kung adder, instead of a Ripple Carry Adder (RCA). The Modified SQRT BK CSA was found to be the best in terms of power, with a small speed penalty. The designs were created with the Tanner EDA tool at the 45 nm node[8]. The methods presented in this study can be used to estimate the probability of error and the Probability Mass Function (PMF) of the error for a certain class of approximate adders, which can serve as performance measures for comparison. The study illustrates how the proposed methodology can accurately predict how various approximate adders would perform in real-world image processing applications[9].

This research uses mathematical modelling and simulation to compare several 16-bit adders with respect to their latency. Based on the investigation, new carry bypass and carry select adders are proposed; these adders show a considerable reduction in propagation delay compared with existing adders[10]. To improve the area and power efficiency of FIR filters implemented in CMOS, the synthesis of approximate adders is presented in this research, with results expressed as energy per test. The design strategy reduces hardware area in the filters and lowers cost, and the approximate adder approach improves area and power efficiency in CMOS VLSI filters[11]. The goal of this study is to find ways to save energy in portable multimedia systems that employ signal processing techniques. The proposed method employs imprecise full adder cells with lower transistor-level complexity than accurate full adder cells, which leads to shorter critical paths and permits voltage scaling. The study assesses the effectiveness of this strategy by designing architectures for video and image compression algorithms and deriving mathematical models for error and power consumption. Simulation results show power savings of up to 69% compared with existing accurate adder implementations[12].

The use of field-programmable gate arrays (FPGAs) for image processing in Intelligence, Surveillance and Reconnaissance (ISR) applications is discussed in this study, along with the difficulties of implementing such algorithms on resource-constrained platforms. The authors propose a novel approximate adder design technique that offers considerable performance gains for error-tolerant applications in exchange for some loss of accuracy in the processed outputs. The most accurate design performs at least 9.9% better in terms of energy consumption than current approximate adders, indicating that the proposed technique has the potential to significantly improve the SWaP index for computationally intensive UAV applications[13]. The brief proposes a low-energy block-based adder that divides the adder into non-overlapping summation blocks and predicts the carry output from the input operands of the block in question and of the block after it. This results in a shorter carry chain and a lower average delay. To improve accuracy and lower the output error rate, an error detection and recovery scheme is also presented. The proposed adder outperforms state-of-the-art approximate adders in terms of energy, latency, area and output quality[14]. A new technique for constructing approximate circuits, which relaxes timing constraints through the use of false timing paths while keeping behavioural change to a minimum, is presented in this study. The Carry Cut-Back Adder (CCBA), which improves performance and energy efficiency while guaranteeing low worst-case errors, is used to illustrate this strategy. The CCBA outperforms state-of-the-art and truncated adders for high-accuracy and low-power circuits, achieving high accuracy with large circuit savings[15].

### **3** Proposed Methodology

#### **Approximate Adder**

Two binary integers are added together by an adder. The ripple-carry adder (RCA) and the carry look-ahead adder (CLA), both shown in fig 5, are two common adders. Because the carry of each full adder (FA) propagates to the subsequent FA in an n-bit RCA, the delay and circuit complexity grow linearly with n (that is, O(n)). An n-bit CLA uses generate ($g_i = a_i b_i$) and propagate ($p_i = a_i + b_i$) signals to create the lookahead carries. These signals are computed in parallel to obtain the sum. Compared with the RCA, the delay of the CLA is logarithmic in n (that is, O(log(n))), making it much shorter.

The shorter delay of a CLA comes at the cost of a larger circuit area (O(n log(n))) and higher power dissipation. Several approximation approaches have been developed that lower the hardware complexity and shorten the critical path of an accurate adder. An early approach is based on speculative operation. In an n-bit speculative adder, each sum bit is predicted from the k LSBs that precede it (k < n). Because the carry chain is shorter than n, a speculative adder is faster than a traditional design. A segmented adder is built from a number of smaller adders running in parallel; carry select adders use the same kind of segmentation. The critical path delay and power dissipation can also be reduced by approximating a full adder.
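The generate and propagate signals can be made concrete with a short behavioural sketch. The function name `cla_add` is hypothetical, and the loop below evaluates the carry recurrence $c_{i+1} = g_i + p_i c_i$ one step at a time; a hardware CLA expands this recurrence so that all carries are produced in parallel.

```python
def cla_add(a, b, width=4, cin=0):
    """Carry look-ahead addition expressed with generate/propagate signals.

    g[i] = a_i AND b_i : stage i creates a carry on its own
    p[i] = a_i OR  b_i : stage i passes an incoming carry on
    Returns (sum, carry_out).
    """
    a_bits = [(a >> i) & 1 for i in range(width)]
    b_bits = [(b >> i) & 1 for i in range(width)]
    g = [x & y for x, y in zip(a_bits, b_bits)]
    p = [x | y for x, y in zip(a_bits, b_bits)]

    carries = [cin]
    for i in range(width):
        carries.append(g[i] | (p[i] & carries[i]))

    total = 0
    for i in range(width):
        total |= (a_bits[i] ^ b_bits[i] ^ carries[i]) << i
    return total, carries[width]
```

Because every carry depends only on the `g`, `p` and `cin` values, the carries can be written as flat Boolean expressions of the inputs, which is what allows the logarithmic delay at the cost of extra logic.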



Fig 5: Ripple-carry adder & carry look-ahead adder

#### **2-Bit & 3-Bit Sub-Adder Based Designs**

LUTs are used in FPGAs to implement designs. Adding two operands of three bits each requires a computation over six input bits to produce three output bits (ignoring the carry in/out). This can be accomplished with three six-input LUTs, each of which maps the six inputs to one output bit. The result is therefore computed within a single LUT delay. With LUTs limited to six inputs, the only possible sub-adder sizes are 2 or 3 bits. The proposed sub-adder designs with sub-adder sizes of 2 and 3 bits, each with a single LUT delay, are shown in Fig 6.
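The mapping of a 3-bit addition onto three six-input LUTs can be sketched as follows. The helper names `build_sub_adder_luts` and `sub_adder` are hypothetical, and the address layout (operand a in the upper bits, operand b in the lower bits) is an arbitrary choice for this illustration.

```python
def build_sub_adder_luts(bits=3):
    """Build one truth table per sum bit of a `bits`-wide sub-adder.

    Each table has 2**(2 * bits) entries, i.e. 64 entries for bits=3,
    which is the capacity of a single six-input LUT.
    """
    luts = [[0] * (1 << (2 * bits)) for _ in range(bits)]
    for a in range(1 << bits):
        for b in range(1 << bits):
            addr = (a << bits) | b
            total = a + b  # carry-out is ignored, as in the text
            for i in range(bits):
                luts[i][addr] = (total >> i) & 1
    return luts


def sub_adder(luts, a, b, bits=3):
    """Read all sum bits in parallel: one LUT access per output bit."""
    addr = (a << bits) | b
    return sum(luts[i][addr] << i for i in range(bits))
```

All three tables are addressed by the same six input bits and are read at the same time, so the sub-adder result is available after one LUT delay regardless of how the carry would have rippled inside the block.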



Fig 6: 2-Bit & 3-Bit Sub-Adder Based Designs

#### **5-Bit Sub-Adder Based Designs**

As seen in Fig. 7(a), modern FPGAs (such as the Virtex-7) provide pairs of 5-input LUTs that can act as a 6-input LUT with a common input. Consequently, the logic in the two 5-input LUTs can be configured so that one of them computes the result for a carry-in of "1" while the other computes the result for a carry-in of "0", as shown in Fig. 7(b). This LUT-specific sub-adder design is inspired by the accurate Carry Select Adder [15]. This arrangement increases the size of the sub-adder to 5 bits while adding one built-in FPGA multiplexer as delay overhead.



Fig 7: Proposed 5 bit sub-adder

#### **Discrete Cosine Transform**

An image is stored or transmitted as pixel values. It can be compressed by reducing the amount of data used to represent each of its pixels. There are two main forms of image compression:

#### **Lossless compression:**

With this type of compression, image quality is unaffected because the recovered image is exactly the same as it was before compression.

#### **Lossy compression:**

With this type of compression, image quality is reduced because the original data cannot be recovered exactly after decompression. However, this kind of compression achieves very high compression ratios and is very useful when sending images across a network.

The discrete cosine transform (DCT) represents an image as a sum of sinusoids of different magnitudes and frequencies. In MATLAB, the dct2 function computes the two-dimensional DCT of an image. A characteristic of the DCT is that, for a typical image, most of the visually important information is concentrated in just a few DCT coefficients. For this reason, the DCT is frequently used in image compression applications. For instance, the DCT is at the core of JPEG, the international standard lossy image compression method. JPEG is named after the working group that developed the standard, the Joint Photographic Experts Group.

The two-dimensional DCT of a square M-by-M matrix A can be computed as $B = T A T^{T}$, where the M-by-M DCT transform matrix T is defined as

$$T_{pq} = \begin{cases} \dfrac{1}{\sqrt{M}}, & p = 0,\ 0 \leq q \leq M-1 \\ \sqrt{\dfrac{2}{M}} \cos\dfrac{\pi(2q+1)p}{2M}, & 1 \leq p \leq M-1,\ 0 \leq q \leq M-1 \end{cases}$$

To perform a discrete cosine transform (DCT) on an image, we first read the image data (pixel values expressed as integers in the range 0 to 255) and divide it into blocks of $8 \times 8$ pixels. The sub-adder design is used for the additions in the compression step. The inverse DCT is then applied to produce the reconstructed image.
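The block-wise transform can be sketched in plain Python using the standard matrix form of the DCT, in which an M-by-M block A is transformed as T A Tᵀ and recovered as Tᵀ B T. This is not the MATLAB code used in this work: the function names are hypothetical, and the additions inside `matmul` use exact arithmetic, whereas a hardware implementation would perform them with the adder under study.

```python
import math


def dct_matrix(m=8):
    """Build the M-by-M DCT transform matrix T."""
    t = [[0.0] * m for _ in range(m)]
    for p in range(m):
        for q in range(m):
            if p == 0:
                t[p][q] = 1.0 / math.sqrt(m)
            else:
                t[p][q] = math.sqrt(2.0 / m) * math.cos(
                    math.pi * (2 * q + 1) * p / (2 * m)
                )
    return t


def matmul(x, y):
    """Plain matrix product; its sums are where an adder design would sit."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
        for i in range(len(x))
    ]


def transpose(x):
    return [list(row) for row in zip(*x)]


def dct2_block(block):
    """Two-dimensional DCT of one square block: T * A * T'."""
    t = dct_matrix(len(block))
    return matmul(matmul(t, block), transpose(t))


def idct2_block(coeffs):
    """Inverse two-dimensional DCT of one square block: T' * B * T."""
    t = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(t), coeffs), t)
```

For a flat block in which every pixel has the same value, all of the energy ends up in the single coefficient at position (0, 0), which illustrates why a few coefficients can carry most of the visually important information.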



#### **4** Results And Discussion:

| Adder Model                | Time      | Power   | Area  |
|----------------------------|-----------|---------|-------|
| Approximate Adder          | 7.595 ns  | 0.44 mW | 58 nm |
| Carry Select Adder         | 17.501 ns | 0.63 mW | 45 nm |
| 16-Bit Ripple Carry Adder  | 24.686 ns | 0.92 mW | 38 nm |
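One way to combine the time and power columns of the table into a single figure is the power-delay product. This metric is not reported in the paper; the short sketch below is a derived illustration that only uses the time and power values listed in the table, and the names `results` and `power_delay_product` are chosen here for convenience.

```python
# (time in ns, power in mW), copied from the results table
results = {
    "Approximate Adder": (7.595, 0.44),
    "Carry Select Adder": (17.501, 0.63),
    "16-Bit Ripple Carry Adder": (24.686, 0.92),
}


def power_delay_product(delay_ns, power_mw):
    """Power-delay product; ns multiplied by mW gives picojoules."""
    return delay_ns * power_mw


pdp = {name: power_delay_product(*values) for name, values in results.items()}
best = min(pdp, key=pdp.get)

for name, value in sorted(pdp.items(), key=lambda item: item[1]):
    print(f"{name}: {value:.3f} pJ")
```

A lower product means less energy per addition, so sorting by this value ranks the three adders on combined speed and power; area is not part of this metric.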

### Time and power consumption:







Performance results for proposed design:

### DCT implementation of Accurate adder:



Fig: Input image and DCT image for the accurate adder

## Vol 12 Issue 03 2023 ISSN NO: 2230-5807

### Implementation of DCT algorithm for sa3ov1:



Fig: Input image and DCT image for sa3ov1

### **DCT implementation for sa5ov3:**



Fig: Input image and DCT image for sa5ov3


#### 5. Conclusion

The total area of each adder is obtained using a suitable technology library or synthesis tool; the adder with the smallest area has the advantage in terms of area. Power analysis tools are used to compute the power consumption of each adder; the adder with the lowest power consumption has the advantage in terms of power. The delay of each adder is obtained using suitable timing analysis tools; the adder with the smallest delay has the advantage in terms of speed. When the performance, cost and efficiency of the three types of adders are taken into account, the approximate adder is superior to the carry select adder and the 16-bit ripple carry adder. In the results reported here, the approximate adder performs addition with the lowest power consumption (0.44 mW) and the shortest delay (7.595 ns) of the three, although its reported area is the largest. It is also faster and more efficient than the conventional adders while still achieving acceptable levels of accuracy for a variety of applications. As a result, many designers may prefer to use the approximate adder as a solution for improving the performance of arithmetic circuits.

#### **References:**

[1] Yaswanth D, S. Nagaraj, R. Vishnu Vijeth - Design and analysis of high speed and low area vedic multiplier using carry select adder, 2020.

[2] Shamim Akhter, Saurabh Chaturvedi, Kilari Pardhasardi - CMOS Implementation of Efficient 16-Bit Square Root Carry-Select Adder, 2015.

[3] Junqi H, T. Nandha Kumar, Haider Abbas, Fabrizio Lombardi - Simulation-Based Evaluation of Frequency Upscaled Operation of Exact/Approximate Ripple Carry Adders, 2017.

[4] Yajuan He, Chip-Hong Chang - A Power-Delay Efficient Hybrid Carry Lookahead/Carry-Select Based Redundant Binary to Two's Complement Converter, 2008.

[5] Leonardo B, Morgana M, Claudio M, Eduardo A - Exploring Power-Performance-Quality Tradeoff of Approximate Adders for Energy Efficient Sobel Filtering, 2018.

[6] Alish Kanani, Jigar Mehta, Neeraj Goel - ACA-CSU: A Carry Selection Based Accuracy Configurable Approximate Adder Design, 2020.

[7] Morgana Macedo, Leonardo Soares, Bianca Silveira, Claudio M. Diniz, Eduardo A. C. da Costa - Exploring the Use of Parallel Prefix Adder Topologies into Approximate Adder Circuits, 2017.

[8] Pallavi Saxena - Design of Low Power and High Speed Carry Select Adder Using Brent Kung Adder, 2015.

[9] Sana Mazahir, Osman Hasan, Rehan Hafiz, Muhammad Shafique, and Jorg Henkel - Probabilistic Error Modeling for Approximate Adders, 2016.

[10] Mary Christina Joy, Ansa Jimmy, Tony C. Thomas, Manju I. Kollannur - Modified 16 bit Carry Select and Carry Bypass Adder Architectures for High Speed Operations, 2020.

[11] Leonardo Bandeira Soares, Sergio Bampi - Approximate Adder Synthesis for Area- and Energy-Efficient FIR Filters in CMOS VLSI, 2015.

[12] Vaibhav Gupta, Debabrata Mohapatra, Anand Raghunathan - Low-Power Digital Signal Processing Using Approximate Adders, 2013.

[13] Tuaha Nomani, Mujahid Mohsin, Zahid Pervaiz, and Muhammad Shafique - xUAVs: Towards Efficient Approximate Computing for UAVs - Low Power Approximate Adders With Single LUT Delay for FPGA-Based Aerial Imaging Optimization, 2020.

[14] Farhad Ebrahimi-Azandaryani, Omid Akbari, Mehdi Kamal, Ali Afzali-Kusha, and Massoud Pedram - Block-Based Carry Speculative Approximate Adder for Energy-Efficient Applications, 2020.

[15] Vincent Camus, Mattia Cacciotti, Jeremy Schlachter and Christian Enz - Design of Approximate Circuits by Fabrication of False Timing Paths: The Carry Cut-Back Adder, 2018.