Friday 10 January 2014

Library

What are library files in ASIC design?

- Every foundry has its own gates, specified for a given technology, and designers build their designs using the foundry's library.

- A library is a file that lists these gates and cells. With it, a design written as Verilog code (in behavioral form) can be represented in structural form.

Using the library, you convert your design from code into its physically implementable form, that is, into cells and gates.

The same design code can be implemented in various technology nodes.
Example: a MUX design can be implemented in various ways, with different performance, power and area, just by changing the technology library file to one for a different node such as 130 nm, 90 nm, 45 nm or 16 nm.

Here, cells are the library representations of gates. You use a cell from a library to create a gate in your design.
One such file, used by the synthesis tool, is called the technology library.
The different views (logical, physical) of the gates and cells are stored in different files, and together these views are called a technology library.

A library file consists of two sections,
1) Header and 2) Body

The header contains the following:
  • General attributes
  • Documentation attributes
  • Unit attributes
  • Operating conditions
  • Threshold and default definitions
  • Templates
  • Voltage 
  • Wire load Definitions
The body contains the following:

Routing


  • Routing is the process of establishing physical connections between the various components in the design.
  • The different stages in routing are:
      1. Global routing
      2. Track assignment
      3. Detail routing

For routing, the chip is divided into small blocks called routing bins, also known as g-cells or global routing cells (GRCs). Each g-cell has a finite number of horizontal and vertical tracks.

  • Global Routing: The first stage of routing is global routing, which is already performed during the placement stage. The core area is divided into GRCs, and at this stage only routing resources are allocated: global routing assigns nets to specific GRCs but does not define the specific tracks for each of them.


  • Track Assignment: Tracks are assigned on all metal layers through each GRC. The routes are laid out, but pin-to-pin connections are not yet made.


  • Detail Routing: In detail routing the actual routing takes place and the physical connections are made, i.e. the actual metal and via connections are created.
The checks to be performed after routing are crosstalk analysis, DRC, LVS, ERC, EM and antenna violations.

Clock Tree Synthesis




  • Clock Tree Synthesis (CTS) is the process that makes sure the clock is distributed evenly to all sequential elements in a design.
  • The goal of CTS is to minimize skew and latency.
  • The placement data is given as input to CTS, along with the clock tree constraints.
  • The clock tree constraints are latency, skew, maximum transition, maximum capacitance, maximum fan-out, the list of buffers and inverters to use, etc.
  • Clock tree synthesis consists of clock tree building and clock tree balancing.
  • The clock tree can be built with clock tree inverters so as to maintain the exact transition (duty cycle), and clock tree balancing is done with clock tree buffers (CTBs) to meet the skew and latency requirements.
  • As few clock tree inverters and buffers as possible should be used, to meet the area and power constraints.
  • There can be several structures for a clock tree:
    - H-Tree
    - X-Tree
    - Multi-level clock tree
    - Fish bone
  • Once CTS is done, we have to check the timing again.
  • The outputs of clock tree synthesis are the Design Exchange Format (DEF) file, the Standard Parasitic Exchange Format (SPEF) file, the netlist, etc.
NOTES:
  • Normal inverters and buffers are not used for building and balancing the clock tree, because clock buffers provide better slew and better drive capability than normal buffers, and clock inverters provide better balance between rise and fall times and hence maintain the 50% duty cycle.
  • Effects of CTS: many clock buffers are added, congestion may increase, and crosstalk noise and crosstalk delay may appear.
  • Clock tree optimizations: these are achieved by buffer sizing, gate sizing, HFN synthesis and buffer relocation.
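The two quantities CTS tries to minimize can be illustrated with a minimal Python sketch. It assumes that the clock arrival time at every sink (flop clock pin) is already known; the sink names and arrival times are hypothetical, and real tools derive them from the extracted clock network.

```python
def clock_latency_and_skew(sink_arrivals_ns):
    """Return (max latency, global skew) for a dict of sink -> clock arrival time in ns."""
    latencies = list(sink_arrivals_ns.values())
    max_latency = max(latencies)
    # Global skew: difference between the latest and the earliest clock arrival.
    skew = max_latency - min(latencies)
    return max_latency, skew


# Hypothetical arrival times from the clock root to each flop's clock pin.
arrivals = {"ff_a/CK": 0.50, "ff_b/CK": 0.75, "ff_c/CK": 0.625}
latency, skew = clock_latency_and_skew(arrivals)
print(f"max latency = {latency} ns, global skew = {skew} ns")
```

Balancing the tree amounts to adding delay on the early branches until the skew is within the constraint, without pushing the maximum latency past its own limit.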
Setup Fixing:
  1. Upsize the cells (increase the drive strength) in the data path to decrease the delay through them.
  2. Pull the launch clock.
  3. Push the capture clock.
  4. Remove buffers from the data path.
  5. Replace a buffer with two inverters placed farther apart so that the delay can be adjusted.
  6. Reduce any larger-than-normal capacitance on a cell output pin.
  7. Swap in LVT (low threshold voltage) cells, which are faster.
Hold Fixing:
Hold slack improves when the data path has more delay, so hold violations are fixed by adding delay to the data path.
  1. Downsize cells (decrease the drive strength) in the data path. It is better to downsize cells close to the capture flip-flop, because there is less chance of affecting other paths and causing new violations.
  2. Pull the capture clock.
  3. Push the launch clock.
  4. Add buffers, inverter pairs or delay cells to the data path.
  5. Increase the wire load on the data path, which also adds delay.
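Why pushing or pulling a clock helps one check and hurts the other can be seen from the slack equations. The sketch below is a simplified single-cycle model that ignores clock uncertainty and folds the clock-to-Q delay into the data delay; the function names and all numbers (in ps) are hypothetical.

```python
def setup_slack(launch_latency, max_data_delay, capture_latency, period, t_setup):
    """Setup slack: data must arrive before the next capture edge minus the setup time."""
    arrival = launch_latency + max_data_delay
    required = period + capture_latency - t_setup
    return required - arrival


def hold_slack(launch_latency, min_data_delay, capture_latency, t_hold):
    """Hold slack: data must not change until the hold time after the same capture edge."""
    arrival = launch_latency + min_data_delay
    required = capture_latency + t_hold
    return arrival - required


# Setup violation of 50 ps, fixed by pushing the capture clock or pulling the launch clock.
print(setup_slack(100, 950, 100, 1000, 100))   # violating path
print(setup_slack(100, 950, 150, 1000, 100))   # capture clock pushed by 50 ps
print(setup_slack(50, 950, 100, 1000, 100))    # launch clock pulled by 50 ps

# Hold violation of 30 ps, fixed by adding data-path delay or pulling the capture clock.
print(hold_slack(100, 20, 100, 50))            # violating path
print(hold_slack(100, 50, 100, 50))            # 30 ps of buffering added
print(hold_slack(100, 20, 70, 50))             # capture clock pulled by 30 ps
```

Because the capture latency appears with opposite signs in the two equations, a clock adjustment that fixes setup eats into hold slack and vice versa, which is why timing is rechecked after every such fix.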
Transition violation
If a signal takes too long to transition from one logic level to another, a transition violation is caused. The violation can be due to the node resistance and capacitance. It can be fixed by:
  1. Upsizing the driver cell.
  2. Decreasing the net length by moving cells closer together or shortening a long routed net.
  3. Adding buffers.
  4. Increasing the width of the route at the violating instance pin. This decreases the resistance of the route and fixes the transition violation.
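The effect of fixes 1 and 4 can be sketched with a lumped RC model in Python. This is only a first-order approximation: it treats the driver and the wire as one resistor charging one capacitor, uses the 10% to 90% transition time of such a stage (ln 9 times RC), and ignores that a wider wire also has more capacitance. All values are hypothetical.

```python
import math


def wire_resistance(sheet_res_ohm_per_sq, length_um, width_um):
    """Resistance of a rectangular wire: sheet resistance times the number of squares."""
    return sheet_res_ohm_per_sq * length_um / width_um


def transition_time(r_driver_ohm, r_wire_ohm, c_load_f):
    """10%-90% transition time of a single lumped RC stage: ln(9) * R * C."""
    return math.log(9) * (r_driver_ohm + r_wire_ohm) * c_load_f


# Hypothetical numbers, not taken from any real library or technology file.
c_load = 100e-15                                   # 100 fF of load
narrow = wire_resistance(0.1, 1000, 0.1)           # 1 mm long, 0.1 um wide
wide = wire_resistance(0.1, 1000, 0.2)             # same wire at double width

base = transition_time(2000, narrow, c_load)
upsized = transition_time(1000, narrow, c_load)    # stronger driver, lower resistance
widened = transition_time(2000, wide, c_load)      # wider route, lower resistance

print(f"base {base:.3e} s, upsized driver {upsized:.3e} s, wider route {widened:.3e} s")
```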
Cap violation
The capacitance on a node is a combination of the fan-out of the output pin and the capacitance of the net. This check ensures that a device does not drive more capacitance than it is characterized for. The violation can be removed by:
  1. Increasing the drive strength of the cell.
  2. Buffering some of the fan-out paths to reduce the capacitance seen by the output pin.
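The check itself is a simple sum compared against a limit, as in this minimal Python sketch. The capacitance values (in fF), the limit and the way the net capacitance splits after buffering are all hypothetical.

```python
def total_load_cap(net_cap_ff, sink_pin_caps_ff):
    """Capacitance seen by a driver: net capacitance plus all fan-out pin capacitances."""
    return net_cap_ff + sum(sink_pin_caps_ff)


def max_cap_excess(net_cap_ff, sink_pin_caps_ff, max_cap_ff):
    """Amount by which the load exceeds the driver's characterized limit (0 if none)."""
    return max(0, total_load_cap(net_cap_ff, sink_pin_caps_ff) - max_cap_ff)


# A driver characterized up to 60 fF drives ten 5 fF pins over a 20 fF net.
before = max_cap_excess(20, [5] * 10, 60)

# Fix 1: a larger driver, assumed here to be characterized up to 120 fF.
upsized = max_cap_excess(20, [5] * 10, 120)

# Fix 2: six sinks move behind a buffer with a 4 fF input pin; the driver's
# share of the net capacitance is assumed to drop to 12 fF.
buffered = max_cap_excess(12, [5, 5, 5, 5, 4], 60)

print(before, upsized, buffered)
```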


Placement



  • Placement is the step in the physical implementation process in which the standard cells are placed in standard cell rows so as to meet timing, congestion and utilization targets.
  • The input to placement is the floorplan database or DEF.
  • For placement, the complete standard cell area is divided into pieces known as bins (or buckets). The size of a bin may vary from design to design.
  • There are two steps in placement:
      1. Global placement
      2. Detail placement
Global Placement: In global placement all the standard cells are placed in the standard cell rows, but some standard cells may overlap.
Detail Placement: All standard cells in the standard cell rows are legalized and refined, and there will not be any overlaps.
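The difference between the two steps can be sketched with a toy legalizer for a single row. This is a simple greedy left-to-right pass written for illustration, not the algorithm any particular tool uses; coordinates and widths are integers in arbitrary units, and the cell names are hypothetical.

```python
def legalize_row(cells, site_width, row_width):
    """Remove overlaps in one row.

    cells: list of (name, desired_x, width) from global placement.
    Each cell is snapped up to the site grid and pushed right of the previous cell.
    """
    placed = []
    cursor = 0  # first free x position in the row
    for name, x, width in sorted(cells, key=lambda c: c[1]):
        snapped = -(-x // site_width) * site_width  # round up to the next site
        start = max(snapped, cursor)
        if start + width > row_width:
            raise ValueError(f"{name} does not fit in the row")
        placed.append((name, start, width))
        cursor = start + width
    return placed


# Global placement result with overlaps: u2 overlaps u1, u3 overlaps u2.
global_result = [("u1", 0, 4), ("u2", 3, 4), ("u3", 5, 2)]
legal = legalize_row(global_result, site_width=2, row_width=20)
print(legal)
```

A real detail placer also tries to keep each cell close to its global position and considers neighbouring rows; the sketch only shows the legality condition (on-grid, no overlap, inside the row).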

  • Once placement is done, we have to check timing as well as congestion.
  • The outputs of placement are the netlist, DEF and SPEF.


NOTES:
Standard Cell Row Utilization: It is defined as the ratio of the area of the standard cells to the area of the chip minus the area of the macros and the area of the blockages.
                Area (Standard Cells)
-----------------------------------------------------
Area (Chip) - Area (Macros) - Area (Region Blockages)

Congestion: If the number of routing tracks available for routing is less than the required tracks then it is known as congestion.
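This definition can be expressed as an overflow count per routing bin. The Python sketch below assumes a hypothetical map from g-cell coordinates to (required tracks, available tracks); real tools report congestion per layer and direction.

```python
def gcell_overflow(required_tracks, available_tracks):
    """Tracks needed beyond those available in one g-cell (0 means no congestion)."""
    return max(0, required_tracks - available_tracks)


def congestion_report(gcells):
    """Return only the congested g-cells, mapped to their overflow."""
    return {
        xy: gcell_overflow(required, available)
        for xy, (required, available) in gcells.items()
        if required > available
    }


# Hypothetical 2x2 grid: (column, row) -> (required tracks, available tracks).
grid = {
    (0, 0): (8, 10),
    (0, 1): (12, 10),
    (1, 0): (10, 10),
    (1, 1): (15, 10),
}
report = congestion_report(grid)
print(report)
```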



  • Checks and fixes to perform after placement: congestion issues, HFN synthesis, capacitance fixing, transition fixing and setup fixing.

  • Based on timing and congestion, the tool places the standard cells optimally. While doing so, if the scan chains are detached, the tool can break the chain ordering and reorder the chain to optimize it, while maintaining the number of flops in the chain.
  • During placement, optimization may make the scan chain difficult to route because of congestion, so the tool reorders the chain to reduce congestion. This sometimes increases hold time problems in the chain; to overcome them, buffers may have to be inserted into the scan path. The tool may not be able to maintain the scan chain length exactly, and it cannot swap cells between different clock domains. After scan chain reordering, patterns generated earlier are of no use.
  • Different types of special cells are added at the placement stage, such as spare cells, end cap cells and tie cells.


Standard cells: The designer uses predesigned logic cells such as AND gates, NOR gates, etc. These gates are called standard cells. The advantage of standard cell ASICs is that designers save time and money and reduce risk by using a predesigned and pre-tested standard cell library.
Tie cells: Tie cells are used to connect a floating input to either VDD or VSS without any change in the logic functionality of the circuit.
Spare cells: Spare cells are used whenever a functional ECO (engineering change order) is required. They are extra cells left floating in an ASIC design. They are also part of the standard cell library, and they let you add some functionality after the base tape-out of the chip.
End cap cells: End caps are placed at the ends of cell rows and handle end-of-row well tie-off requirements. They are used to connect power and ground rails across an area, and also to ensure that gaps do not occur between well or implant layers, which could cause design rule violations.
The prerequisites of CTS include ensuring that the design is placed and optimized, that the clock tree can be routed (i.e. congestion issues are taken care of), and that the power and ground nets are pre-routed.
The inputs to CTS are the placement database or the DEF file from the placement stage, and the clock tree constraints.

Power Planning

Let Power = 100 mW and Voltage = 1 V.
Recall the formula
P = VI  =>  I = P/V
= 100 mW / 1 V
= 100 mA
Assume there are four pads each for VDD and VSS.
Then each pad carries 25 mA of current.
The technology LEF file gives the current-carrying capacity of each metal layer.
For example, for M8 a wire of 1 µ width can carry 10 mA.
So carrying 25 mA requires a width of 25/10 = 2.5 µ; this is the width of the power ring.
Spacing = 25 µ
If the core width of the floorplan is 1 mm
No. of vertical straps = core width / spacing
= 1 mm / 25 µ = 40 straps
Width of each strap = width of the power ring / no. of vertical straps
= 2.5 / 40 = 0.0625 µ
Spacing = 25 µ
If the core height of the floorplan is 1 mm
No. of horizontal straps = core height / spacing
= 1 mm / 25 µ = 40 straps
Width of each strap = width of the power ring / no. of horizontal straps
= 2.5 / 40 = 0.0625 µ
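The same calculation can be written as a small Python function so that the inputs are easy to vary. It follows the formulas used above; the function and parameter names are made up for this sketch, and it does not check the result against the minimum metal width allowed by the technology.

```python
def power_plan(power_mw, voltage_v, pads_per_supply, ma_per_um, core_um, spacing_um):
    """First-order power ring and strap sizing, following the steps in the text."""
    total_ma = power_mw / voltage_v              # I = P / V
    per_pad_ma = total_ma / pads_per_supply      # current shared equally by the pads
    ring_width_um = per_pad_ma / ma_per_um       # width needed to carry one pad's current
    straps = int(core_um // spacing_um)          # straps across the core at this spacing
    strap_width_um = ring_width_um / straps      # ring width divided among the straps
    return {
        "total_ma": total_ma,
        "per_pad_ma": per_pad_ma,
        "ring_width_um": ring_width_um,
        "straps": straps,
        "strap_width_um": strap_width_um,
    }


# Inputs from the example: 100 mW at 1 V, four pads, M8 carrying 10 mA per um,
# a 1 mm (1000 um) core dimension and 25 um strap spacing.
plan = power_plan(100, 1, 4, 10, 1000, 25)
for key, value in plan.items():
    print(key, value)
```

Since the core width and height are both 1 mm here, one call covers the vertical and the horizontal straps; for a rectangular core the function would be called once per direction.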

Floorplanning

Building a chip is in many ways the same as building an apartment.
[Images: Apartment | Chip]

Comparison:
> Both are built in layers from the ground up
> Apartments have proper parking for cars; chips have I/O pads
> Apartments have rooms; chips have macros
> For connectivity, apartments have staircases; chips have interconnects
> Bricks, cement and metal rods in the case of apartments; silicon atoms, dopants and metals in the case of chips
> Both are built using a floorplan:
    - every room has its own function in the case of apartments
    - every module has its own function in the case of chips

Floorplanning is the process of arranging structures (macros, complex cells, I/O cells, etc.) close together in such a way that they meet timing (delay) and occupy less area.
> The main objective of floorplanning is to arrive at the optimum shape and the smallest size that meets the specification, to reduce the cost of the chip.
> Nowadays, in deep submicron technology, signal integrity issues such as crosstalk, EM (electromigration) and hot electron effects have become more important, and they also influence the die size.
> Crosstalk issues arise from the metal pitch, EM from the metal layers, and hot electron effects from voltage scaling problems at lower nodes; all of these issues affect the die size.
  • Inputs for floorplanning:
    - Gate-level netlist
    - Libraries (.lef & .lib)
    - Constraints
  • Output of floorplanning: the floorplan of the design in the form of a DEF file
  • During floorplanning, the following steps are to be done:
  1. Initialization of a floorplan of appropriate dimensions
  2. Placement of I/O pins
  3. Macro placement, considering the communication between macros as shown by the fly lines
  4. Creation of power straps
  5. Applying appropriate placement blockages near the macros, near the I/O pins, in densely packed cell areas, etc.
  6. Pre-placement of tap cells, switch cells, ESD cells and isolation cells for low power design (LPD)
  7. Creation of multiple voltage and power domains for LPD
  8. Clustering of level shifters between the different power domains
  9. etc.
Notes:
  • The final timing and quality of the chip depend on the floorplan design.
Macros (Memories):
Storing information in sequential elements takes up a lot of area: a single flip-flop can take 15 to 20 transistors to store one bit. Therefore special memory elements are used, which store the data efficiently and occupy comparatively little space on the chip. These memory cells are called macros.

Floorplanning control parameters such as aspect ratio and core utilization are defined as follows:
  • Aspect ratio (AR): It is defined as the ratio of the width to the height of the chip. The aspect ratio should take into account the routing resources available. If there are more horizontal layers, the rectangle should be long and its width small, and vice versa if there are more vertical layers in the design.
  • Concept of Rows: The standard cells in the design are placed in rows. All the rows have equal height and equal spacing between them. The width of the rows can vary. The standard cells in the rows get their power and ground connections from the VDD and VSS rails, which are placed on either side of the cell rows. Sometimes the technology allows the rows to be flipped or abutted so that they can share the power and ground rails.
  • Core: The core is defined as the inner block, which contains the standard cells and macros. There is another, outer block which covers the inner block. The I/O pins are placed on the outer block.
  • Power Planning: Signals flow into and out of the chip, and for the chip to work we need to supply power. A power ring is designed around the core. The power ring contains both the VDD and the VSS rings. Once the ring is placed, a power mesh is designed so that power reaches all the cells easily. The power mesh is nothing but horizontal and vertical lines (straps) on the chip. You need to assign the metal layers through which the power is to be routed. During power planning, the VDD and VSS rails also have to be defined.
  • I/O Placement: There are two types of I/Os.
Chip I/O: The chip contains I/O pins. The chip consists of the core, which contains all the standard cells and blocks. Chip I/O placement consists of the placement of the I/O pins and also the I/O pads. The placement of these I/O pads depends on the type of packaging.
Block I/O: The core contains several blocks. Each block has block I/O pins which communicate with other blocks and cells in the chip. The placement of these pins can be optimized.
  • Pin Placement: Pin placement is an important step in floorplanning. It can be done based on the timing, congestion and utilization of the chip.
Pin Placement in Macros: Macros use up the M3 layer most of the time, so a macro needs to be placed logically. The logical way is to put the macros near the boundary. If there is no connectivity between the macro pins and the boundary, move the macro to another location.
  • Concept of Utilization: Utilization is defined as the percentage of the chip area that has been utilized. In the initial stages of floorplan design, if the size of the chip is unknown, utilization is the starting point of the floorplan design.
There are three different kinds of utilization.
Chip Level Utilization: It is the ratio of the area of the standard cells, macros and pad cells to the area of the chip.
Area (Standard Cells) + Area (Macros) + Area (Pad Cells)
--------------------------------------------------------
                     Area (Chip)

Floorplan Utilization: It is defined as the ratio of the area of the standard cells, macros and pad cells to the area of the chip minus the area of the sub floorplan.
Area (Standard Cells) + Area (Macros) + Area (Pad Cells)
--------------------------------------------------------
           Area (Chip) - Area (Sub Floorplan)

Standard Cell Row Utilization: It is defined as the ratio of the area of the standard cells to the area of the chip minus the area of the macros and the area of the blockages.
                Area (Standard Cells)
-----------------------------------------------------
Area (Chip) - Area (Macros) - Area (Region Blockages)
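The three ratios translate directly into code. The Python sketch below uses made-up areas (all in the same unit, for example mm²) purely to show how the three numbers differ for one design.

```python
def chip_utilization(std_cells, macros, pads, chip):
    """(standard cells + macros + pad cells) / chip area."""
    return (std_cells + macros + pads) / chip


def floorplan_utilization(std_cells, macros, pads, chip, sub_floorplan):
    """(standard cells + macros + pad cells) / (chip area - sub floorplan area)."""
    return (std_cells + macros + pads) / (chip - sub_floorplan)


def row_utilization(std_cells, chip, macros, blockages):
    """standard cells / (chip area - macro area - blockage area)."""
    return std_cells / (chip - macros - blockages)


# Hypothetical areas: chip 100, standard cells 30, macros 20, pad cells 10,
# sub floorplan 20, region blockages 5.
chip_u = chip_utilization(30, 20, 10, 100)
floorplan_u = floorplan_utilization(30, 20, 10, 100, 20)
row_u = row_utilization(30, 100, 20, 5)
print(f"chip {chip_u:.0%}, floorplan {floorplan_u:.0%}, rows {row_u:.0%}")
```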
  • Macro Placement: As part of floorplanning, the initial placement of the macros in the core is performed. The tool places the standard cells in the core depending on how the macros are placed. If two macros are close together, it is advisable to put placement blockages in the area between them. This prevents the tool from putting standard cells in the small spaces between the macros and so avoids congestion.
  • A few of the different kinds of placement blockages are:
a. Standard Cell Blockage: The tool does not put any standard cells in the area specified by the standard cell blockage.
b. Non Buffer Blockage: The tool can place only buffers in the area specified by the non buffer blockage.
c. Blockages below power lines: It is advisable to create blockages under power lines so that they do not cause congestion problems later.
After routing, if you see an area in the design with a lot of DRC violations, place small chunks of placement blockage there to ease congestion. After floorplanning is complete, check for DRC (design rule check) violations. Most of the pre-route violations are not removed by the tool; they have to be fixed manually.
  • The factors to be considered during macro placement:
  1. First check the fly lines, i.e. the net connections from macro to macro and from macro to standard cells. If macros have many connections to each other, place them near each other, preferably near the core boundaries.
  2. If a macro is connected to an input pin, it is better to place it near that pin or pad.
  3. If a macro has many connections to standard cells, spread the macros inside the core.
  4. Avoid criss-cross placement of macros.
  5. Avoid bottlenecks.
  6. Use soft or hard blockages to guide the placement engine.
  • I/O cells in the floorplan: The I/O cells are the ones that interface between the blocks outside the chip and the internal blocks of the chip. In the floorplan these I/O cells are placed between the inner ring (core) and the outer ring (chip boundary). These I/O cells are responsible for providing voltage to the cells in the core. For example, the voltage inside the chip for 90 nm technology is about 1.2 V, while the regulator supplies a higher voltage to the chip (normally around 5.5 V, 3.3 V, etc.).
The next question that comes to mind is: why is the voltage outside higher than the voltage inside the chip?
The regulator is placed on the board and supplies voltage to several other chips on the board. There are a lot of resistances and capacitances on the board, so the voltage needs to be higher. If the voltage outside were exactly what the chip needs inside, the standard cells inside the chip would get less voltage than they actually need and the chip might not run at all.
So the next question is: how can chips communicate across different voltages? The answer lies in the I/O cells. These I/O cells are essentially level shifters, which convert a voltage from one level to another. The input I/O cells reduce the voltage coming from outside to the voltage needed inside the chip, and the output I/O cells raise it to the voltage needed outside the chip. An I/O cell thus acts as a buffer as well as a level shifter.
  • Add the tap cells / switch cells: At the floorplan stage we have to add the tap cells, with some spacing, in a staggered pattern. Tap cells are used to connect VDD and VSS to the wells and the substrate. Using tap cells helps prevent latch-up problems.
  • Core Limited & I/O Limited Designs: In a core limited design the core is densely packed, and in an I/O limited design the I/Os are densely packed; in each case this is what limits the size of the chip.
  • Complex cells: Complex cells are made of groups of standard cells, based on functionality requirements. Their height is greater than that of a standard cell and less than that of a macro.

Design Partitioning

To make it easier to design a complex system, it is normally broken down into several subsystems. The functionality of these subsystems should match the specifications. At this point, the relationships among the subsystems and with the top-level system are also defined. This is known as design partitioning.
Design partitioning can be done based on:
  1. Functionality
  2. Timing criticality
  3. Area
  4. Tool limitations
  5. Reusability