# Dynamic Programming

### From View

## Contents |

### Description

Dynamic programming is an algorithmic technique that compute solutions by solving simpler overlapping subproblems. It is particularly applicable for optimization problems where the optimal result for a problem is built up from the optimal result for the subproblems. Dynamic programming techniques are used in a variety of industrial size problems, such as technology mapping of logic netlists to gate libraries in integrated circuit design, longest common subsequence matching of DNA strands, and finding the most likely sequence of hidden states with the Viterbi algorithm.

Problems amenable to dynamic programming techniques have optimal substructures. A problem exhibits optimal substructure if an optimal solution to the problem contains within it optimal solutions to subproblems [1]. In other words, the optimal solution can be constructed from optimal subproblems solutions.

Optimal substructure has two attributes that varies among different problems:

1. the number of subproblems used in an optimal solution to the original problem 2. the number of choices in determining which subproblem(s) to use in an optimal solution

These two attributes could be constant according to the problem structure, or of variable values dependent on problem input data.

#### Structured Dynamic Programming Example: The 0-1 Knapsack Problem

##### Problem Description

The input to the 0-1 knapsack problem is a set of N books and a bag. Each book has an integer weight and an integer profit, and the bag has an integer capacity, C. The objective is to choose a subset of the books whose total weight is no greater than the bag capacity while maximizing the profit. Dynamic programming can be used to heuristically solve this problem in pseudo-polynomial time.

##### Optimal Substructure

A C × N table is constructed, where each entry is a subproblem that provides the maximum profit achievable after choosing a subset of the total books for a portion of the bag capacity. To find the actual books that are chosen, a backtrack procedure is carried out after the entire table has been filled in.

Each entry in the table draws on exactly one subproblem - maximize profit given two scenarios where the book is chosen or not chosen, that is one subproblem for optimal substructure attribute 1 and a fixed two choices for attribute 2.

#### Unstructured Dynamic Programming Example: The Technology Mapping Problem

##### Problem Description

Technology mapping [2] optimally binds a logic netlist to gates in a fabrication library according to a detailed technology-specific gate cost model. The input is a logic netlist (or subject DAG) in normal form containing only 2-input NAND gates and inverters, as well as a library of gates, each with a cost and a set of logically equivalent normal forms (or primitive DAGs) for the function it represents. The objective is to find a minimum cost covering of the subject DAG by the primitive DAGs.

##### Optimal Substructure:

The subject DAG is partitioned into a forest of trees, where each tree is solved optimally using tree covering. For each node in the trees, a list of candidate primitive DAG matches is generated. The optimal cover for a tree consists of a best match of the primitive DAG at the root of the tree plus the optimal cover for the sub-trees starting at each input of the match.

At the root of each tree, the optimal solution draws on all M matching primitive DAGs to determine which of the M matches and their respective input subtrees has the lowest total costs. This gives M choices for optimal substructure attribute 2, and for each choice there is an S number of input subtree as the subproblems for attribute 1. Note M is dependent on input subject DAG structure, and S is dependent on the choice of the primitive DAG matched.

### Uniprocessor Implementation

Many of dynamic programming base algorithms assume a fast store and look up time for subproblem solutions. This assumption is being challenged with memory hierarchies in modern computer architectures and with communication delays in parallel implementations.

In an uniprocessor implementation, each subproblem is solved once and stored in a table. For the structured dynamic programming applications, the data access pattern is highly regular. Many of the techniques for structured grid, such as special subproblem visit ordering to enhance data spatial locality and temporal locality (i.e. caching) can be applied. The optimization of execution order, though just as applicable to the unstructured dynamic programming applications, is much more difficult.

#### Autotuning

The subproblems of the knapsack problem can be represented as a table with books as rows and capacity as columns. The computation of these subproblems can be grouped into book blocks and capacity blocks. Experiments with various block sizes indicate that certain execution orders based on blocks can boost performance by 1.4x over a sequential implementation.

### Parallel Implementation

The subproblems in dynamic programming are overlapping by nature. If each overlapping subproblem is required to be solved only once, the overlap creates a data dependency. In a parallel implementation, the communication vs. computation trade-off determines how the subproblems can be solved. A subproblem could be solved once and the solution communicated, or it could be resolved on multiple PEs to reduce communication cost.

In dynamic programming, the number of subproblems (one for the knapsack problem, and a variable number for the technology mapping problem) determines the communication needs. The number of choices in a problem (2 for the knapsack problem, S for the technology mapping problem) and the optimization objective complexity determines the computation needs.

#### Structured Dynamic Programming Problems

Assuming there is negligible communication delay between subproblems within the same PE, the subproblems in a structured dynamic programming problem can be grouped into blocks to increase computational granularity.

Using the Knapsack problem as an example, given the table of subproblems to solve, both data and computation can be distributed to multiple PEs. As shown in figure, each block is labeled with its starting time. The subproblem dependencies are resolved by communicated the last row in each book block to the next processor. The speedup obtainable over a single processor implementation is shown in the graph.

#### Unstructured Dynamic Programming Problems

The key to parallelizing the unstructured dynamic programming problem is the assignment of subject DAG nodes and their associated subproblems to parallel PEs. The technology mapping problem for example, can be parallelized by solving independent subtrees of the subject DAG. Given the various matching primitive DAGs at a node, a "ghost zone" the edge of the subtrees can be established, where subproblems are buffered such that they are only communicated once. However, exploring this communication vs. computation trade-off is a non-trivial task in a netlist data structure.

### Elaboration

### Reference

[1] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein, Introduction to Algorithms, MIT Press and McGraw-Hill. [2] K. Keutzer, DAGON: technology binding and local optimization by DAG matching, Proceedings of the 24th ACM/IEEE conference on Design automation, p341-347, 1987.

### Credits

Original text by Jike Chong and Kurt Keutzer.