FPGA Pentimenti TCL
These are a collection of scripts I developed for the placement and routing for the pentimenti attack of our ASPLOS 2024 paper. This work relied on exact placement of our time-to-digital converter sensor with respect to some target route. These target routes previously held sensitive data which an attacker aims to recover. There are two experimental conditions we consider: 1) the generation of artificial routes from a predefined start and end node, and 2) the extraction of the maximum sub-route within a high fanout route from an existing design. The details behind these two experiments are left to the paper.
Table of Contents
1. Artificial Route Length Generation
In our initial pentimenti experiments we wanted to carefully control the delay, the source, and the destination of the targeted route. This allowed us to study the relationship of route length to observable burn-in as measured by our time-to-digital converter. The ability to specify the source and destination of a route simplified the integration with the time-to-digital converter as we could parameterize the route to start at the pulsegenerator and end at the input to the carry chain.
Assuming that the source and destination for some \(\texttt{net}\) has been placed, which begins at \(\texttt{pin}\), we can construct a route within a bounded range of \(\texttt{net_delay}\). The size of the range will determine how quickly the router converges to some valid routing path. I found that \(\pm 500\) worked pretty well. The tcl script to accomplish this is relatively simple.
proc generate_timed_route { pin net net_delay } { route_design -timing_summary -pins [get_pins $pin] -min_delay $net_delay -max_delay [expr $net_delay + 500] set_property is_route_fixed 1 [get_nets net] }
This is however an unrealistic construction as we are artificially constraining the route in source, destination, and delay. While this provided compelling and motivating results, we needed to perform our attack on a routing constraint extracted from a design with no intervention. This proved be significantly greater challenge.
2. Real Design Route Extraction
If given a set of routes within a design compiled with no intervention, we needed to be able to extract the routing constraint of the longest sub-path within that route, then place both that route and a corresponding time-to-digital converter in a new attacker design for measurement. This is accomplished in the following stages:
2.1. Extracting Longest Sub-Route
Real design's routes generally have a single source and fanout to multiple destinations. This could be because that signal really has multiple destinations within the logic, or because some logic duplication step was performed by the compiler to meet timing closures. Regardless, the first step in extracting our target route from a net name is to extract the longest sub-path within that route. If given some net name \texttt{net} we can invoke \texttt{getlongestroute{$net}} to return the destination pin of the longest route.
proc get_longest_route { net } { set delays [get_net_delays -of $net] set longest_route_dest "" set longest_delay 0 foreach dly $delays { set dest [get_property TO_PIN $dly] set del [get_property SLOW_MAX $dly] if { $del > $longest_delay } { set longest_route_dest $dest set longest_delay $del } } return $longest_route_dest }
This acts as the basis for our extraction but is far from complete. Significantly, this is just a destination pin, and not a set of routing constraints that we could use to reconstruct this route in a new design. This destination pin needs to be mapped back to node within the net's routing constraint before longest sub-route extraction can begin. It is trivial to retrieve the routing constraint for this net but has two major limitations. The first is that this routing constraint includes all destinations of the fanout which means this routing constraint cannot be copied wholesale into a new design without leaving antennas. The second is that the routing constraint is relative in that all specified nodes are respective of the initial pin of the route.
To remove the route destinations that are not our longest route pin we need to know which node in the constraint corresponds to this longest route. From the longest route destination pin we can work backwards to extract the node which leads to it with the following command:
set dest_node [get_node -of_objects [get_site_pins -of_objects [get_bel_pins -of_objects [get_pins $longest_route_dest]]]]
This however cannot be used to lookup a position in the routing constraint because the routing constraint is relative. Searching for the \texttt{dest_node} in the routing constraint will return multiple possible matches because node names are geometrically reused within the FPGA (similar and repetitive structures throughout will share the same name frequently). To fix this, we need to canonize the routing constraint and then augment the destination node with its associated slice index.
To canonize the route, we will begin at the root node (as determined by the source pin of the net) and traverse from node to node using the pips which connect them. The routing constraint is a hierarchical structure where open brackets define a subtree and close brackets terminate the subtree. This lends it self well to the following recursive algorithm. If given some \texttt{route}, the \texttt{last_node} we explored, and the current \(\texttt{token_index}\) our exploration has reached within the route, we can generate a canonized constraint with the slices prepended to each node.
proc canonize_route {route last_node token_index } { set route [get_property ROUTE $net] set canonized_route [] # token starts as the first element in the closure set next_token [lindex $route $token_index] while {$next_token != "\}"} { if {$next_token != ""} { if {$next_token == "\{"} { set canonized_child [canonize_route $route $last_node [expr $token_index + 1]] set child_route [lindex $canonized_child 0] set token_index [lindex $canonized_child 1] set new_token [lindex $route $token_index] set canonized_route [concat [concat $canonized_route [list "\{"]] [concat $child_route [list "\}"]]] } else { # find the pip that leads to $next_token set pips [get_pips -of_object $last_node] foreach pip $pips { set pip_dest [get_nodes -downhill [get_pips $pip] -of_objects [get_pips $pip]] set pip_src [get_nodes -uphill [get_pips $pip] -of_objects [get_pips $pip]] set next_hop_node [lindex [split $pip_dest "/"] 0] set next_hop_cnstr [lindex [split $pip_dest "/"] 1] if {$pip_src == $last_node && $next_hop_cnstr == $next_token} { set last_node $pip_dest lappend canonized_route $next_hop_node/$next_token break } } } } # move to the next token incr token_index # grab the next token set next_token [lindex $route $token_index] } # Where should the parent pick up reading again incr token_index return [list $canonized_route $token_index] }
This algorithm functions by finding the only possible pip that could connect the current node to the next node in the routing constraint. We know the canonized routing constraint of the first node because we have the initial source pin which provides us the slice within the device. From here we can begin scanning through the tokens in the routing constraint. From the current node we can grab all of the pips connected to it. A pip defines a programmable path between two nodes, and so, there should exist a pip from our current node to the next node in the routing constraint. The next node in the constraint is not necessarily uniquely named in terms of all connected nodes, but it is uniquely named for all connected nodes in the outgoing direction. And so, we can specify we want the nodes downhill and uphill from our current node, and then use that direction to determine which of the nodes corresponds to the relative constraint in the routing path. When we encounter a nested tree, we can call the canonize function on that sub-tree.
After canonizing the route we can begin the process of extracting the routing constraint of the longest sub-path. To accomplish this we need to exploit the hierarchical structure of the routing constraint. As outputted by the tool, an open bracket indicates the start of a sub-tree while a close bracket indicates the end of that sub-tree. By beginning at the destination node of the longest path (which we know because we extracted the canonized end node from the destination pin in the first step and can related that to the canonized routing constraint we generated in the previous step) we can work backwards through the tokens in the routing constraint. By excluding everything within sub-trees encountered through the traversal and including everything else, we are left with the routing constraint of the longest sub-path. The first step in this process is to identify the token in the canonized route that corresponds to the end node of the longest sub-path, which can be accomplished with the following:
set count 0 foreach token $canonized_route { if { $token == $dest_node } { set target_index $count } incr count }
With the token corresponding to the end node we can traverse backwards while ignoring sub-trees with the following script.
set direct_route [] set disable 0 for { set i $target_index } {$i >= 0} {incr i -1} { set curr_token [lindex $canonized_route $i] # is double bracket the same if { [string first "\}" $curr_token] != -1 } { incr disable 1 } if { [string first "\{" $curr_token] != -1 } { if {$disable != 0} { incr disable -1 } } if { [string first "\{" $curr_token] == -1 && [string first "\}" $curr_token] == -1 && $disable <= 0 } { lappend direct_route $curr_token } } set direct_route [lreverse $direct_route]
We are now left with the routing constraint from the source to destination for the longest sub-path.
3. Jumper Routes
The extracted routes can be from any source and destination pin within the FPGA. This is determined by however the compiler decided to place the design, which is outside the control of the attacker. These routes however need to be linked to our time-to-digital converter for timing measurement. To measure timing delay through this route, the output of the time-to-digital converter's \(\texttt{pulse generator}\) needs to be routed into the source of the route under test. The output of the route under test needs to be routed into the first input of the \(\texttt{carry chain}\). One option would be to place the \(\texttt{pulse generator}\) at the source of the target route and the \(\texttt{carry chain}\) at the destination of the target route. The issue with this is the \(\texttt{carry chain}\) takes up 64 vertical slices (for the 256 bit version we deployed). If you are trying to measure hundreds of routes simultaneously on the same FPGA you will very quickly run into placement conflicts. The approach we took instead was to place the chains geometrically in successive columns, then create small jumper wires from the \(\texttt{pulse generator}\) to the target route source and from the target route destination to the \(\texttt{carry chain}\).
We must be ensure that the source and destination of the target route are reachable by the router. This means docking a few entries from the head and tail, as those entries lead down an inescapable path to the pin. Going through a pin requires logic which the router does not generate.
The next thing to be careful of is to collect the nodes from all target routes into a single list for exclusion from the routing path. This is necessary because otherwise the router will see nodes available for routing which are in fact used by one of the target routes.
All this can roughly be accomplished with the following commands, provided that all utilized nodes are compiled into \(\texttt{reduced_nodes}\).
set jumper_to [find_routing_path -from [get_nodes $pulse_node] -to [get_nodes $burn_start] -exclude_nodes $excluded_nodes] set jumper_from [find_routing_path -from [get_nodes $burn_end] -to [get_nodes $carry_node] -exclude_nodes $excluded_nodes] set full_route [concat $jumper_to $direct_route $jumper_from]
This approach worked to extract 256 routes from an AES core (the output of the key memory) and construct a corresponding measurement image for the attacker.