Tuesday, November 24, 2015

Performing Clock Tree Optimization

Clock tree optimization improves the clock skew and clock insertion delay by applying additional optimization iterations. Clock tree optimization is performed during the clock_opt process and can also be run as a standalone process before clock routing, after clock tree routing, or after detail routing. Typically, you would perform standalone clock tree optimization when timing optimization or incremental placement disturbs the clock skew or clock insertion delay.

To perform standalone clock tree optimization, use the optimize_clock_tree command or choose Clock > Optimize Clock Tree in the GUI. You can specify a list of clock trees, ports, or pins, but not hierarchical pins, as starting points of the clock network by using the -clock_trees option.
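
As a minimal sketch, a standalone run restricted to two clock trees might look like the following. The clock names CLK_CORE and CLK_IO are hypothetical placeholders, not names from any real design.
icc_shell> # CLK_CORE and CLK_IO are hypothetical clock names
icc_shell> optimize_clock_tree -clock_trees {CLK_CORE CLK_IO}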

ICC provides the following incremental optimization capabilities:
1) Buffer relocation by using the -buffer_relocation option.
2) Buffer sizing by using the -buffer_sizing option.
3) Delay insertion by using the -delay_insertion option.
4) Gate relocation by using the -gate_relocation option.
5) Gate sizing by using the -gate_sizing option.

Note:
During clock tree optimization, ICC ignores the dont_touch attribute on cells and nets. To prevent sizing of cells during clock tree optimization, use the set_clock_tree_exceptions -dont_size_cells command.
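
For instance, to keep a single clock gate at its current size, something like the following could be used. The instance name U_top/clk_gate_0 is hypothetical, and passing a get_cells collection is an assumption about how the cell list is supplied.
icc_shell> # U_top/clk_gate_0 is a hypothetical cell instance name
icc_shell> set_clock_tree_exceptions -dont_size_cells [get_cells U_top/clk_gate_0]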

By default, the optimize_clock_tree command assumes that the clock trees in your design are not routed. For unrouted clock trees, the optimize_clock_tree command can perform any of the incremental optimization capabilities. The default behavior is to perform all incremental optimizations.

If your clock trees are routed, you must explicitly specify the routing stage of the clock trees by using the -routed_clock_stage option of the optimize_clock_tree command.

For routed clock trees, optimize_clock_tree can perform only the sizing optimizations (the default behavior is to perform both buffer sizing and gate sizing).
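
A sketch for a detail-routed design follows. The value detail for -routed_clock_stage is an assumption, since the accepted values are not listed here, so check the optimize_clock_tree man page for your ICC version. With no optimization options given, this should run both sizing optimizations.
icc_shell> # "detail" is an assumed value for -routed_clock_stage
icc_shell> optimize_clock_tree -routed_clock_stage detail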

To run a subset of the available optimizations, you must explicitly specify the optimizations that you want. If you specify options that are not compatible with the routing status of your design, ICC generates an error message.
For example, to perform only gate sizing on a routed design, enter the following command:
icc_shell> optimize_clock_tree -gate_sizing

After optimizing postroute clock trees, the optimize_clock_tree command performs ECO routing and extraction. The type of ECO routing performed depends on the routing stage of the clock trees.
1) For global routed clock trees, ICC performs incremental global routing.
2) For track assigned clock trees, ICC performs detail routing (utilizing dangling wires).
3) For detail routed clock trees, ICC performs detail routing (utilizing dangling wires) and performs two search-and-repair loops. To change the number of search-and-repair loops, use the -search_repair_loop option of the optimize_clock_tree command.

To disable ECO routing during optimize_clock_tree, specify the -no_clock_eco_route option.
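
The two ECO routing controls could be used as sketched here. The loop count of 5 is an arbitrary example, and the detail value for -routed_clock_stage is an assumption that is not confirmed by the text above.
icc_shell> # raise the search-and-repair loop count from the default of 2 to 5 (arbitrary example value)
icc_shell> optimize_clock_tree -routed_clock_stage detail -search_repair_loop 5
icc_shell> # alternatively, skip ECO routing after optimization ("detail" is an assumed value)
icc_shell> optimize_clock_tree -routed_clock_stage detail -no_clock_eco_route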

By default, clock tree optimization uses the integrated clock global router to estimate the wire delay and capacitance for better correlation with postroute timing. For best QoR, you should use the integrated clock global router and specify the -clock_arnoldi option whenever possible.
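
The text does not say which command takes the -clock_arnoldi option. Assuming it belongs to set_delay_calculation (the command this post later names for choosing between the Elmore and Arnoldi models), enabling it before optimization might look like this:
icc_shell> # assumption: -clock_arnoldi is an option of set_delay_calculation
icc_shell> set_delay_calculation -clock_arnoldi
icc_shell> optimize_clock_tree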

To run multicorner clock tree optimization, use the -enable_multicorner option of the set_clock_tree_optimization_options command to specify at least one corner before running either the clock_opt or optimize_clock_tree command. ICC automatically sets the -clock_arnoldi option.
icc_shell> set_clock_tree_optimization_options -enable_multicorner all
icc_shell> clock_opt

or

icc_shell> set_clock_tree_optimization_options -enable_multicorner all
icc_shell> optimize_clock_tree

Fixing DRC Violations
The optimize_clock_tree command can automatically fix DRC violations in the clock network that are not fixed by the compile_clock_tree command because of heuristic limitations. You enable this capability by setting the cto_enable_drc_fixing variable to true, changing it from its default of false.
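
A minimal sketch of turning this on before a standalone run, assuming the variable is set with the plain Tcl set command:
icc_shell> # enable DRC fixing during clock tree optimization (default is false)
icc_shell> set cto_enable_drc_fixing true
icc_shell> optimize_clock_tree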

Fixing DRC violations during clock tree optimization improves the correlation between the following two stages:
1) Preroute, which uses virtual routing and the integrated clock global router.
The compile_clock_tree command uses virtual routing to estimate the wire delay and capacitance, whereas the optimize_clock_tree command invokes the integrated clock global router to perform global routing of clock nets.
2) Postroute, which uses the integrated clock global router and detail router.

To fix DRC violations in multicorner designs, you must also specify the maximum corner.

Running Interclock Delay Balancing
Interclock delay balancing balances the skew between a group of clock trees, either as part of the clock_opt process or as a standalone process.
By default, interclock delay balancing uses the integrated clock global router to estimate the wire delay and capacitance for better correlation with postroute timing.
To run standalone interclock delay balancing, use the balance_inter_clock_delay command or choose Clock > Balance Interclock Delay in the GUI.

Regardless of which delay calculation model you set by using the set_delay_calculation command, the balance_inter_clock_delay command uses the Elmore delay model by default. If you set the use_improved_icdb variable to true, changing it from its default of false, the balance_inter_clock_delay command honors the Elmore or Arnoldi delay model that is set by the set_delay_calculation command.
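
As a sketch, making standalone balancing honor the configured delay model might look like the following. It assumes the delay model and any interclock balancing setup have already been defined earlier in the session.
icc_shell> # honor the model chosen with set_delay_calculation (default is false)
icc_shell> set use_improved_icdb true
icc_shell> balance_inter_clock_delay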

Adjusting the I/O Timing
After implementing the clock trees, ICC can update the input and output delays to reflect the actual clock arrival times. When you adjust the I/O timing, ICC calculates the median insertion delay for each clock tree and applies these values as the clock latency. The Milkyway database and SDC constraints are automatically updated, so you can easily export this data to PrimeTime for detailed timing analysis.

To adjust the I/O timing, do one of the following:
1) Run the update_clock_latency command.
2) Specify the -update_clock_latency option when you run the clock_opt command.
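
The clock_opt variant, shown with no other options for brevity (a real flow would normally pass additional clock_opt options), might be run as:
icc_shell> # adjust the I/O timing as part of clock_opt
icc_shell> clock_opt -update_clock_latency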

ICC adjusts the I/O timing to keep the clock latency accurate and to prevent false timing violations on I/O paths after CTS in the following ways:
1) For synthesized generated clocks, the network latency of a clock object is updated, but the source latency of the clock object is updated when its master clock is synthesized.
2) For synthesized clocks, network latency is computed by using the median value of the clock propagation delay, that is, the arrival time relative to the clock root at all boundary registers.
3) For virtual clocks defined with the same create_clock command, network latency is calculated using the clock propagation delay of the boundary registers clocked by the individual virtual clocks.

To adjust the I/O timing for virtual clocks, you must define the relationships between the virtual clocks and the real clocks before you adjust the I/O timing as follows:
icc_shell> set_latency_adjustment_options \
      -to_clock my_virtual_clock -from_clock my_real_clock
icc_shell> update_clock_latency

When you save your design in Milkyway format, the relationships defined by the set_latency_adjustment_options command are stored in the Milkyway design library.

When adjusting the I/O timing based on virtual clocks, the update_clock_latency command defines the clock latency for both the real clock and its associated virtual clocks as the median insertion delay of the real clock.

You can report the virtual clock definitions by using the report_latency_adjustment_options command. You can remove the virtual clock definitions by using the reset_latency_adjustment_options command.
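
A sketch of checking and then clearing the definitions follows. Both commands are shown without arguments, which is an assumption; the man pages may list options that are not covered here.
icc_shell> # list the current virtual-to-real clock relationships (no arguments assumed)
icc_shell> report_latency_adjustment_options
icc_shell> # remove the relationships (no arguments assumed)
icc_shell> reset_latency_adjustment_options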















