Hardware Acceleration of EDA Algorithms- P10

Single-threaded software applications have ceased to see significant gains in performance on a general-purpose CPU, even with further scaling in very large scale integration (VLSI) technology. This is a significant problem for electronic design automation (EDA) applications, since the design complexity of VLSI integrated circuits (ICs) is continuously growing. In this research monograph, we evaluate custom ICs, field-programmable gate arrays (FPGAs), and graphics processors as platforms for accelerating EDA algorithms, instead of the general-purpose single-threaded CPU.

Table: Speedup for circuit simulation

Ckt name      # Trans.   Total # eval.   OmegaSIM (s), CPU-alone   AuSIM (s), GPU+CPU   Speedup
Industrial_1  324
Industrial_2  1,098
Industrial_3  1,098
Buf_1         500
Buf_2         1,000
Buf_3         2,000
ClockTree_1   1,922
ClockTree_2   7,682
Avg

The table compares the runtime of AuSIM, which is OmegaSIM with our approach integrated and runs partly on the GPU and partly on the CPU, against the original OmegaSIM running on the CPU alone. Columns 1 and 2 report the circuit name and the number of transistors in the circuit, respectively. The number of evaluations required for full circuit simulation is reported in column 3. Columns 4 and 5 report the CPU-alone and GPU+CPU runtimes (in seconds), respectively. The speedups are reported in column 6.

The circuits Industrial_1, Industrial_2, and Industrial_3 perform the functionality of an LFSR. Circuits Buf_1, Buf_2, and Buf_3 are buffer insertion instances for buses of three different sizes. Circuits ClockTree_1 and ClockTree_2 are symmetrical H-tree clock distribution networks.

These results show that, on average, a speedup is achieved over a variety of circuits. Also note that the speedup is higher for circuits with more transistors, because GPU memory latencies can be better hidden when more device evaluations are issued in parallel.
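
The speedup reported in column 6 is the ratio of the CPU-alone runtime to the GPU+CPU runtime. A minimal Python sketch of that arithmetic follows. The function names are hypothetical, the runtimes in the usage lines are placeholders rather than values from the table, and the use of an arithmetic mean for the average is an assumption, since the text does not say how the average was computed.

```python
from statistics import mean


def speedup(cpu_alone_s: float, gpu_cpu_s: float) -> float:
    """Ratio of CPU-alone runtime to GPU+CPU runtime (both in seconds)."""
    if cpu_alone_s <= 0 or gpu_cpu_s <= 0:
        raise ValueError("runtimes must be positive")
    return cpu_alone_s / gpu_cpu_s


def average_speedup(runtimes):
    """Arithmetic mean of per-circuit speedups (assumed averaging method)."""
    return mean(speedup(cpu, gpu) for cpu, gpu in runtimes)


# Hypothetical (CPU-alone s, GPU+CPU s) pairs, for illustration only.
example = [(10.0, 5.0), (30.0, 10.0)]
print(average_speedup(example))
```

A mean of per-circuit ratios weights every circuit equally; a ratio of total runtimes would instead be dominated by the longest-running circuits.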
The NVIDIA 8800 GPU device supports IEEE 754 single-precision floating-point operations. However, the BSIM3 model code uses IEEE 754 double-precision floating-point computations. We therefore first converted all the double-precision computations in the BSIM3 code into single precision before modifying it for use on the GPU, and then determined the error incurred in this process. We found that the ...
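
One way to estimate the error introduced by such a double-to-single conversion is to evaluate the same expression in both precisions and compare the results. The Python sketch below does this for a toy square-law saturation current, which is only a stand-in and is not the BSIM3 model. The helper names, the parameter values (`k`, `vth`), and the bias point are all hypothetical. Single precision is emulated by rounding every intermediate result to a 32-bit float with the standard `struct` module.

```python
import struct


def to_f32(x: float) -> float:
    """Round a Python float (IEEE 754 double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def drain_current(vgs, k=2e-4, vth=0.7, rnd=float):
    """Toy square-law saturation current 0.5*k*(vgs - vth)^2 (NOT BSIM3).

    rnd is applied to every input and intermediate result: pass to_f32 to
    emulate single-precision arithmetic, or float to keep double precision.
    """
    vov = rnd(rnd(vgs) - rnd(vth))          # overdrive voltage
    if vov <= 0:
        return 0.0                          # device is off in this toy model
    half_k = rnd(rnd(0.5) * rnd(k))
    return rnd(half_k * rnd(vov * vov))


def relative_error(vgs):
    """Relative difference between single- and double-precision evaluation."""
    ref = drain_current(vgs, rnd=float)
    approx = drain_current(vgs, rnd=to_f32)
    return abs(approx - ref) / abs(ref)


# Hypothetical bias point, for illustration only.
print(relative_error(1.5))
```

For a real device model the comparison would be repeated across the full range of bias points and model parameters, since cancellation in some operating regions can amplify rounding error well beyond the single-precision unit roundoff.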
