Dynamic Programming

Example of Bellman's optimality principle of dynamic programming

The idea of dynamic programming, or Bellman's optimality principle of dynamic programming, can be captured very easily with a very simple example which illustrates the basic idea.

Suppose we are interested in finding the shortest auto route between Los Angeles and New York City as illustrated in the figure below. Further suppose that you know that the shortest route between LA and New York goes through Chicago. Then Bellman's optimality principle states the obvious fact that in this case, the Chicago to New York leg of the shortest journey from LA to New York will be identical to the shortest auto route between Chicago and New York, i.e. to the shortest route on a trip that starts at Chicago and ends in New York. Why is this obvious observation useful? Because it can result in a lot of computational savings in finding the best path from LA to New York: if we find the best path from LA to Chicago, then we only need to add on the shortest auto distance between Chicago and New York, if we we already know the answer to that.

Figure:Illustration of dynamic programming. gif

Sophisticated applications of this basic principle can lead to fast optimal algorithms for a variety of problems of interest. A popular incarnation of the above principle in signal processing and communications involves the omniscient Viterbi Algorithm, which uses the dynamic programming principle illustrated above in finding the ``best path'' through a trellis induced by a finite-state machine. The cities in the example above are analogous to the ``states'' in the Viterbi Algorithm at various time stages of the trellis. Recall that a state decouples the future from the past; that is, given the state, future decisions are not influenced by past ones that led to that state. Thus, if two paths merge at a system state, then the costlier of the two paths can be pruned. The analogy to the above example can be captured as follows: suppose there are two separate paths from LA to Chicago, one through St. Louis and the other through Denver. If the LA-Denver-Chicago route is longer than the LA-St. Louis-Chicago route, then the former can be thrown out because the best route from LA to New York passing through Chicago can never go through Denver. This principle is true for every state in the system, and is the guiding principle behind the popular Viterbi Algorithm. As described in the main body, the principle of dynamic programming finds application in a variety of scenarios in image and video coding based on rate-distortion considerations like buffer-constrained video compression

To, demonstrate the above idea we have developed two JAVA applets. The idea behind the the applets is that there are (N) blocks of information and (Q) quantizers. For, simplicity we have assumed that the rate goes from 1 to Q. The data has been generated using a scalar quantizer and random input. The first applet is an example where we fix the number of blocks to 4 and the number of quantizer to 4. The rate-constraint that the algorithm has to achieve is 6. We explain the steps of the D.P. algorithm as the applet works it way towards optimality.
In the second applet, the user can choose the number of block, number of quantizers and the Rate constraint(Rc) he wants to achieve. Some of the data is from scalar quantizer and some from a random non-increasing data set.
If, the applet window does not fit in your screen, please try this lower version of the above applets. If, you continue to have problems please let us know.

Rate Distortion Theory Homepage

Raghavendra Singh