International Science Index

48
10007795
CPU Architecture Based on Static Hardware Scheduler Engine and Multiple Pipeline Registers
Abstract:

The development of CPUs and of real-time systems based on them made it possible to use time at increasingly low resolutions. Together with the scheduling methods and algorithms, time organizing has been improved so as to respond positively to the need for optimization and to the way in which the CPU is used. This presentation contains both a detailed theoretical description and the results obtained from research on improving the performances of the nMPRA (Multi Pipeline Register Architecture) processor by implementing specific functions in hardware. The proposed CPU architecture has been developed, simulated and validated by using the FPGA Virtex-7 circuit, via a SoC project. Although the nMPRA processor hardware structure with five pipeline stages is very complex, the present paper presents and analyzes the tests dedicated to the implementation of the CPU and of the memory on-chip for instructions and data. In order to practically implement and test the entire SoC project, various tests have been performed. These tests have been performed in order to verify the drivers for peripherals and the boot module named Bootloader.

Paper Detail
7
downloads
47
10007738
Electrode Engineering for On-Chip Liquid Driving by Using Electrokinetic Effect
Abstract:

High lamination in microchannel is one of the main challenges in on-chip components like micro total analyzer systems and lab-on-a-chips. Electro-osmotic force is highly effective in chip-scale. This research proposes a microfluidic-based micropump for low ionic strength solutions. Narrow microchannels are designed to generate an efficient electroosmotic flow near the walls. Microelectrodes are embedded in the lateral sides and actuated by low electric potential to generate pumping effect inside the channel. Based on the simulation study, the fluid velocity increases by increasing the electric potential amplitude. We achieve a net flow velocity of 100 µm/s, by applying +/- 2 V to the electrode structures. Our proposed low voltage design is of interest in conventional lab-on-a-chip applications.

Paper Detail
30
downloads
46
10007068
A Study of Recent Contribution on Simulation Tools for Network-on-Chip
Abstract:
The growth in the number of Intellectual Properties (IPs) or the number of cores on the same chip becomes a critical issue in System-on-Chip (SoC) due to the intra-communication problem between the chip elements. As a result, Network-on-Chip (NoC) has emerged as a system architecture to overcome intra-communication issues. This paper presents a study of recent contributions on simulation tools for NoC. Furthermore, an overview of NoC is covered as well as a comparison between some NoC simulators to help facilitate research in on-chip communication.
Paper Detail
87
downloads
45
10006573
On-Chip Aging Sensor Circuit Based on Phase Locked Loop Circuit
Abstract:

In sub micrometer technology, the aging phenomenon starts to have a significant impact on the reliability of integrated circuits by bringing performance degradation. For that reason, it is important to have a capability to evaluate the aging effects accurately. This paper presents an accurate aging measurement approach based on phase-locked loop (PLL) and voltage-controlled oscillator (VCO) circuit. The architecture is rejecting the circuit self-aging effect from the characteristics of PLL, which is generating the frequency without any aging phenomena affects. The aging monitor is implemented in low power 32 nm CMOS technology, and occupies a pretty small area. Aging simulation results show that the proposed aging measurement circuit improves accuracy by about 2.8% at high temperature and 19.6% at high voltage.

Paper Detail
128
downloads
44
10004936
Analysis of Performance of 3T1D Dynamic Random-Access Memory Cell
Abstract:
On-chip memories consume a significant portion of the overall die space and power in modern microprocessors. On-chip caches depend on Static Random-Access Memory (SRAM) cells and scaling of technology occurring as per Moore’s law. Unfortunately, the scaling is affecting stability, performance, and leakage power which will become major problems for future SRAMs in aggressive nanoscale technologies due to increasing device mismatch and variations. 3T1D Dynamic Random-Access Memory (DRAM) cell is a non-destructive read DRAM cell with three transistors and a gated diode. In 3T1D DRAM cell gated diode (D1) acts as a storage device and also as an amplifier, which leads to fast read access. Due to its high tolerance to process variation, high density, and low cost of memory as compared to 6T SRAM cell, it is universally used by the advanced microprocessor for on chip data and program memory. In the present paper, it has been shown that 3T1D DRAM cell can perform better in terms of fast read access as compared to 6T, 4T, 3T SRAM cells, respectively.
Paper Detail
408
downloads
43
10003784
A High-Crosstalk Silicon Photonic Arrayed Waveguide Grating
Abstract:

In this paper, we demonstrated a 1 × 4 silicon photonic cascaded arrayed waveguide grating, which is fabricated on a SOI wafer with a 220 nm top Si layer and a 2µm buried oxide layer. The measured on-chip transmission loss of this cascaded arrayed waveguide grating is ~ 5.6 dB, including the fiber-to-waveguide coupling loss. The adjacent crosstalk is 33.2 dB. Compared to the normal single silicon photonic arrayed waveguide grating with a crosstalk of ~ 12.5 dB, the crosstalk of this device has been dramatically increased.

Paper Detail
679
downloads
42
10003912
Design of a Pulse Generator Based on a Programmable System-on-Chip (PSoC) for Ultrasonic Applications
Abstract:
This paper describes the design of a pulse generator based on the Programmable System-on-Chip (PSoC) module. In this module, using programmable logic is possible to implement different pulses which are required for ultrasonic applications, either in a single channel or multiple channels. This module can operate with programmable frequencies from 3-74 MHz; its programming may be versatile covering a wide range of ultrasonic applications. It is ideal for low-power ultrasonic applications where PZT or PVDF transducers are used.
Paper Detail
692
downloads
41
10000942
The Effect of CPU Location in Total Immersion of Microelectronics
Abstract:

Meeting the growth in demand for digital services such as social media, telecommunications, and business and cloud services requires large scale data centres, which has led to an increase in their end use energy demand. Generally, over 30% of data centre power is consumed by the necessary cooling overhead. Thus energy can be reduced by improving the cooling efficiency. Air and liquid can both be used as cooling media for the data centre. Traditional data centre cooling systems use air, however liquid is recognised as a promising method that can handle the more densely packed data centres. Liquid cooling can be classified into three methods; rack heat exchanger, on-chip heat exchanger and full immersion of the microelectronics. This study quantifies the improvements of heat transfer specifically for the case of immersed microelectronics by varying the CPU and heat sink location. Immersion of the server is achieved by filling the gap between the microelectronics and a water jacket with a dielectric liquid which convects the heat from the CPU to the water jacket on the opposite side. Heat transfer is governed by two physical mechanisms, which is natural convection for the fixed enclosure filled with dielectric liquid and forced convection for the water that is pumped through the water jacket. The model in this study is validated with published numerical and experimental work and shows good agreement with previous work. The results show that the heat transfer performance and Nusselt number (Nu) is improved by 89% by placing the CPU and heat sink on the bottom of the microelectronics enclosure.

Paper Detail
1370
downloads
40
10002446
Test Data Compression Using a Hybrid of Bitmask Dictionary and 2n Pattern Runlength Coding Methods
Abstract:
In VLSI, testing plays an important role. Major problem in testing are test data volume and test power. The important solution to reduce test data volume and test time is test data compression. The Proposed technique combines the bit maskdictionary and 2n pattern run length-coding method and provides a substantial improvement in the compression efficiency without introducing any additional decompression penalty. This method has been implemented using Mat lab and HDL Language to reduce test data volume and memory requirements. This method is applied on various benchmark test sets and compared the results with other existing methods. The proposed technique can achieve a compression ratio up to 86%.
Paper Detail
1012
downloads
39
10001108
A Superior Delay Estimation Model for VLSI Interconnect in Current Mode Signaling
Abstract:

Today’s VLSI networks demands for high speed. And in this work the compact form mathematical model for current mode signalling in VLSI interconnects is presented.RLC interconnect line is modelled using characteristic impedance of transmission line and inductive effect. The on-chip inductance effect is dominant at lower technology node is emulated into an equivalent resistance. First order transfer function is designed using finite difference equation, Laplace transform and by applying the boundary conditions at the source and load termination. It has been observed that the dominant pole determines system response and delay in the proposed model. The novel proposed current mode model shows superior performance as compared to voltage mode signalling. Analysis shows that current mode signalling in VLSI interconnects provides 2.8 times better delay performance than voltage mode. Secondly the damping factor of a lumped RLC circuit is shown to be a useful figure of merit.

Paper Detail
1486
downloads
38
10002850
Dynamic Power Reduction in Sequential Circuits Using Look Ahead Clock Gating Technique
Abstract:
In this paper, a novel Linear Feedback Shift Register (LFSR) with Look Ahead Clock Gating (LACG) technique is presented to reduce the power consumption in modern processors and System-on-Chip. Clock gating is a predominant technique used to reduce unwanted switching of clock signals. Several clock gating techniques to reduce the dynamic power have been developed, of which LACG is predominant. LACG computes the clock enabling signals of each flip-flop (FF) one cycle ahead of time, based on the present cycle data of the flip-flops on which it depends. It overcomes the timing problems in the existing clock gating methods like datadriven clock gating and Auto-Gated flip-flops (AGFF) by allotting a full clock cycle for the determination of the clock enabling signals. Further to reduce the power consumption in LACG technique, FFs can be grouped so that they share a common clock enabling signal. Simulation results show that the novel grouped LFSR with LACG achieves 15.03% power savings than conventional LFSR with LACG and 44.87% than data-driven clock gating.
Keywords:
Paper Detail
698
downloads
37
9999932
Heuristic for Accelerating Run-Time Task Mapping in NoC-Based Heterogeneous MPSoCs
Abstract:

In this paper, we propose a new packing strategy to find a free resource for run-time mapping of application tasks to NoC-based Heterogeneous MPSoC. The proposed strategy minimizes the task mapping time in addition to placing the communicating tasks close to each other. To evaluate our approach, a comparative study is carried out for a platform containing single task supported PEs. Experiments show that our strategy provides better results when compared to latest dynamic mapping strategies reported in the literature.

Paper Detail
1453
downloads
36
10001273
A High Level Implementation of a High Performance Data Transfer Interface for NoC
Abstract:
The distribution of a single global clock across a chip has become the major design bottleneck for high performance VLSI systems owing to the power dissipation, process variability and multicycle cross-chip signaling. A Network-on-Chip (NoC) architecture partitioned into several synchronous blocks has become a promising approach for attaining fine-grain power management at the system level. In a NoC architecture the communication between the blocks is handled asynchronously. To interface these blocks on a chip operating at different frequencies, an asynchronous FIFO interface is inevitable. However, these asynchronous FIFOs are not required if adjacent blocks belong to the same clock domain. In this paper, we have designed and analyzed a 16-bit asynchronous micropipelined FIFO of depth four, with the awareness of place and route on an FPGA device. We have used a commercially available Spartan 3 device and designed a high speed implementation of the asynchronous 4-phase micropipeline. The asynchronous FIFO implemented on the FPGA device shows 76 Mb/s throughput and a handshake cycle of 109 ns for write and 101.3 ns for read at the simulation under the worst case operating conditions (voltage = 0.95V) on a working chip at the room temperature.
Paper Detail
1225
downloads
35
9998716
Formation of Round Channel for Microfluidic Applications
Abstract:

PDMS (Polydimethylsiloxane) polymer is a suitable material for biological and MEMS (Microelectromechanical systems) designers, because of its biocompatibility, transparency and high resistance under plasma treatment. PDMS round channel is always been of great interest due to its ability to confine the liquid with membrane type micro valves. In this paper we are presenting a very simple way to form round shapemicrofluidic channel, which is based on reflow of positive photoresist AZ® 40 XT. With this method, it is possible to obtain channel of different height simply by varying the spin coating parameters of photoresist.

Paper Detail
1967
downloads
34
9998077
A Novel FIFO Design for Data Transfer in Mixed Timing Systems
Abstract:

In the current scenario, with the increasing integration densities, most system-on-chip designs are partitioned into multiple clock domains. In this paper, an asynchronous FIFO (First-in First-out pipeline) design is employed as a data transfer interface between two independent clock domains. Since the clocks on the either sides of the FIFO run at a different speed, the task to ensure the correct data transmission through this FIFO is manually performed. Firstly an existing asynchronous FIFO design is discussed and simulated. Gate-level simulation results depicted the flaw in existing design. In order to solve this problem, a novel modified asynchronous FIFO design is proposed. The results obtained from proposed design are in perfect accordance with theoretical expectations. The proposed asynchronous FIFO design outperforms the existing design in terms of accuracy and speed. In order to evaluate the performance of the FIFO designs presented in this paper, the circuits were implemented in 0.24µ TSMC CMOS technology and simulated at 2.5V using HSpice (© Avant! Corporation). The layout design of the proposed FIFO is also presented.

Paper Detail
2169
downloads
33
16681
CMOS-Compatible Plasmonic Nanocircuits for On-Chip Integration
Abstract:

Silicon photonics is merging as a unified platform for driving photonic based telecommunications and for local photonic based interconnect but it suffers from large footprint as compared with the nanoelectronics. Plasmonics is an attractive alternative for nanophotonics. In this work, two CMOS compatible plasmonic waveguide platforms are compared. One is the horizontal metal-insulator-Si-insulator-metal nanoplasmonic waveguide and the other is metal-insulator-Si hybrid plasmonic waveguide. Various passive and active photonic devices have been experimentally demonstrated based on these two plasmonic waveguide platforms.

Paper Detail
1431
downloads
32
16097
Long-Term On-Chip Storage and Release of Liquid Reagents for Diagnostic Lab-on-a-Chip Applications
Abstract:

A new concept for long-term reagent storage for Labon- a-Chip (LoC) devices is described. Here we present a polymer multilayer stack with integrated stick packs for long-term storage of several liquid reagents, which are necessary for many diagnostic applications. Stick packs are widely used in packaging industry for storing solids and liquids for long time. The storage concept fulfills two main requirements: First, a long-term storage of reagents in stick packs without significant losses and interaction with surroundings, second, on demand releasing of liquids, which is realized by pushing a membrane against the stick pack through pneumatic pressure. This concept enables long-term on-chip storage of liquid reagents at room temperature and allows an easy implementation in different LoC devices.

Paper Detail
2067
downloads
31
11659
Independent Spanning Trees on Systems-on-chip Hypercubes Routing
Abstract:

Independent spanning trees (ISTs) provide a number of advantages in data broadcasting. One can cite the use in fault tolerance network protocols for distributed computing and bandwidth. However, the problem of constructing multiple ISTs is considered hard for arbitrary graphs. In this paper we present an efficient algorithm to construct ISTs on hypercubes that requires minimum resources to be performed.

Paper Detail
1172
downloads
30
17240
FPGA Hardware Implementation and Evaluation of a Micro-Network Architecture for Multi-Core Systems
Abstract:

This paper presents the design, implementation and evaluation of a micro-network, or Network-on-Chip (NoC), based on a generic pipeline router architecture. The router is designed to efficiently support traffic generated by multimedia applications on embedded multi-core systems. It employs a simplest routing mechanism and implements the round-robin scheduling strategy to resolve output port contentions and minimize latency. A virtual channel flow control is applied to avoid the head-of-line blocking problem and enhance performance in the NoC. The hardware design of the router architecture has been implemented at the register transfer level; its functionality is evaluated in the case of the two dimensional Mesh/Torus topology, and performance results are derived from ModelSim simulator and Xilinx ISE 9.2i synthesis tool. An example of a multi-core image processing system utilizing the NoC structure has been implemented and validated to demonstrate the capability of the proposed micro-network architecture. To reduce complexity of the image compression and decompression architecture, the system use image processing algorithm based on classical discrete cosine transform with an efficient zonal processing approach. The experimental results have confirmed that both the proposed image compression scheme and NoC architecture can achieve a reasonable image quality with lower processing time.

Paper Detail
2231
downloads
29
6693
CMOS-Compatible Silicon Nanoplasmonics for On-Chip Integration
Abstract:

Although silicon photonic devices provide a significantly larger bandwidth and dissipate a substantially less power than the electronic devices, they suffer from a large size due to the fundamental diffraction limit and the weak optical response of Si. A potential solution is to exploit Si plasmonics, which may not only miniaturize the photonic device far beyond the diffraction limit, but also enhance the optical response in Si due to the electromagnetic field confinement. In this paper, we discuss and summarize the recently developed metal-insulator-Si-insulator-metal nanoplasmonic waveguide as well as various passive and active plasmonic components based on this waveguide, including coupler, bend, power splitter, ring resonator, MZI, modulator, detector, etc. All these plasmonic components are CMOS compatible and could be integrated with electronic and conventional dielectric photonic devices on the same SOI chip. More potential plasmonic devices as well as plasmonic nanocircuits with complex functionalities are also addressed.

Paper Detail
1167
downloads
28
10161
Silicon-Waveguide Based Silicide Schottky- Barrier Infrared Detector for on-Chip Applications
Abstract:

We prove detailed analysis of a waveguide-based Schottky barrier photodetector (SBPD) where a thin silicide film is put on the top of a silicon-on-insulator (SOI) channel waveguide to absorb light propagating along the waveguide. Taking both the confinement factor of light absorption and the wall scanning induced gain of the photoexcited carriers into account, an optimized silicide thickness is extracted to maximize the effective gain, thereby the responsivity. For typical lengths of the thin silicide film (10-20 Ðçm), the optimized thickness is estimated to be in the range of 1-2 nm, and only about 50-80% light power is absorbed to reach the maximum responsivity. Resonant waveguide-based SBPDs are proposed, which consist of a microloop, microdisc, or microring waveguide structure to allow light multiply propagating along the circular Si waveguide beneath the thin silicide film. Simulation results suggest that such resonant waveguide-based SBPDs have much higher repsonsivity at the resonant wavelengths as compared to the straight waveguidebased detectors. Some experimental results about Si waveguide-based SBPD are also reported.

Paper Detail
1439
downloads
27
3067
An On-chip LDO Voltage Regulator with Improved Current Buffer Compensation
Abstract:

A fully on-chip low drop-out (LDO) voltage regulator with 100pF output load capacitor is presented. A novel frequency compensation scheme using current buffer is adopted to realize single dominant pole within the unit gain frequency of the regulation loop, the phase margin (PM) is at least 50 degree under the full range of the load current, and the power supply rejection (PSR) character is improved compared with conventional Miller compensation. Besides, the differentiator provides a high speed path during the load current transient. Implemented in 0.18μm CMOS technology, the LDO voltage regulator provides 100mA load current with a stable 1.8V output voltage consuming 80μA quiescent current.

Paper Detail
3183
downloads
26
8667
Extended Low Power Bus Binding Combined with Data Sequence Reordering
Abstract:
In this paper, we address the problem of reducing the switching activity (SA) in on-chip buses through the use of a bus binding technique in high-level synthesis. While many binding techniques to reduce the SA exist, we present yet another technique for further reducing the switching activity. Our proposed method combines bus binding and data sequence reordering to explore a wider solution space. The problem is formulated as a multiple traveling salesman problem and solved using simulated annealing technique. The experimental results revealed that a binding solution obtained with the proposed method reduces 5.6-27.2% (18.0% on average) and 2.6-12.7% (6.8% on average) of the switching activity when compared with conventional binding-only and hybrid binding-encoding methods, respectively.
Paper Detail
732
downloads
25
13190
A Novel Implementation of Application Specific Instruction-set Processor (ASIP) using Verilog
Abstract:
The general purpose processors that are used in embedded systems must support constraints like execution time, power consumption, code size and so on. On the other hand an Application Specific Instruction-set Processor (ASIP) has advantages in terms of power consumption, performance and flexibility. In this paper, a 16-bit Application Specific Instruction-set processor for the sensor data transfer is proposed. The designed processor architecture consists of on-chip transmitter and receiver modules along with the processing and controlling units to enable the data transmission and reception on a single die. The data transfer is accomplished with less number of instructions as compared with the general purpose processor. The ASIP core operates at a maximum clock frequency of 1.132GHz with a delay of 0.883ns and consumes 569.63mW power at an operating voltage of 1.2V. The ASIP is implemented in Verilog HDL using the Xilinx platform on Virtex4.
Paper Detail
1314
downloads
24
5164
Accurate Crosstalk Analysis for RLC On-Chip VLSI Interconnect
Abstract:

This work proposes an accurate crosstalk noise estimation method in the presence of multiple RLC lines for the use in design automation tools. This method correctly models the loading effects of non switching aggressors and aggressor tree branches using resistive shielding effect and realistic exponential input waveforms. Noise peak and width expressions have been derived. The results obtained are at good agreement with SPICE results. Results show that average error for noise peak is 4.7% and for the width is 6.15% while allowing a very fast analysis.

Paper Detail
968
downloads
23
15803
Phase Error Accumulation Methodology for On-Chip Cell Characterization
Abstract:
This paper describes the design of new method of propagation delay measurement in micro and nanostructures during characterization of ASIC standard library cell. Providing more accuracy timing information about library cell to the design team we can improve a quality of timing analysis inside of ASIC design flow process. Also, this information could be very useful for semiconductor foundry team to make correction in technology process. By comparison of the propagation delay in the CMOS element and result of analog SPICE simulation. It was implemented as digital IP core for semiconductor manufacturing process. Specialized method helps to observe the propagation time delay in one element of the standard-cell library with up-to picoseconds accuracy and less. Thus, the special useful solutions for VLSI schematic to parameters extraction, basic cell layout verification, design simulation and verification are announced.
Paper Detail
886
downloads
22
1360
Speedup of Data Vortex Network Architecture
Authors:
Abstract:
In this paper, 3X3 routing nodes are proposed to provide speedup and parallel processing capability in Data Vortex network architectures. The new design not only significantly improves network throughput and latency, but also eliminates the need for distributive traffic control mechanism originally embedded among nodes and the need for nodal buffering. The cost effectiveness is studied by a comparison study with the previously proposed 2- input buffered networks, and considerable performance enhancement can be achieved with similar or lower cost of hardware. Unlike previous implementation, the network leaves small probability of contention, therefore, the packet drop rate must be kept low for such implementation to be feasible and attractive, and it can be achieved with proper choice of operation conditions.
Paper Detail
831
downloads
21
9
Closed form Delay Model for on-Chip VLSIRLCG Interconnects for Ramp Input for Different Damping Conditions
Abstract:
Fast delay estimation methods, as opposed to simulation techniques, are needed for incremental performance driven layout synthesis. On-chip inductive effects are becoming predominant in deep submicron interconnects due to increasing clock speed and circuit complexity. Inductance causes noise in signal waveforms, which can adversely affect the performance of the circuit and signal integrity. Several approaches have been put forward which consider the inductance for on-chip interconnect modelling. But for even much higher frequency, of the order of few GHz, the shunt dielectric lossy component has become comparable to that of other electrical parameters for high speed VLSI design. In order to cope up with this effect, on-chip interconnect has to be modelled as distributed RLCG line. Elmore delay based methods, although efficient, cannot accurately estimate the delay for RLCG interconnect line. In this paper, an accurate analytical delay model has been derived, based on first and second moments of RLCG interconnection lines. The proposed model considers both the effect of inductance and conductance matrices. We have performed the simulation in 0.18μm technology node and an error of as low as less as 5% has been achieved with the proposed model when compared to SPICE. The importance of the conductance matrices in interconnect modelling has also been discussed and it is shown that if G is neglected for interconnect line modelling, then it will result an delay error of as high as 6% when compared to SPICE.
Paper Detail
1297
downloads
20
7464
Explicit Delay and Power Estimation Method for CMOS Inverter Driving on-Chip RLC Interconnect Load
Abstract:
The resistive-inductive-capacitive behavior of long interconnects which are driven by CMOS gates are presented in this paper. The analysis is based on the ¤Ç-model of a RLC load and is developed for submicron devices. Accurate and analytical expressions for the output load voltage, the propagation delay and the short circuit power dissipation have been proposed after solving a system of differential equations which accurately describe the behavior of the circuit. The effect of coupling capacitance between input and output and the short circuit current on these performance parameters are also incorporated in the proposed model. The estimated proposed delay and short circuit power dissipation are in very good agreement with the SPICE simulation with average relative error less than 6%.
Paper Detail
879
downloads
19
943
Low Power Bus Binding Based on Dynamic Bit Reordering
Abstract:
In this paper, the problem of reducing switching activity in on-chip buses at the stage of high-level synthesis is considered, and a high-level low power bus binding based on dynamic bit reordering is proposed. Whereas conventional methods use a fixed bit ordering between variables within a bus, the proposed method switches a bit ordering dynamically to obtain a switching activity reduction. As a result, the proposed method finds a binding solution with a smaller value of total switching activity (TSA). Experimental result shows that the proposed method obtains a binding solution having 12.0-34.9% smaller TSA compared with the conventional methods.
Paper Detail
698
downloads