Abstract
Finding the maximum a posteriori (MAP) assignment of a discrete-state distribution specified by a graphical model requires solving an integer program. The max-product algorithm, also known as the max-plus or min-sum algorithm, is an iterative method for (approximately) solving such a problem on graphs with cycles. We provide a novel perspective on the algorithm, which is based on the idea of reparameterizing the distribution in terms of so-called pseudo-max-marginals on nodes and edges of the graph. This viewpoint provides conceptual insight into the max-product algorithm in application to graphs with cycles. First, we prove the existence of max-product fixed points for positive distributions on arbitrary graphs. Next, we show that the approximate max-marginals computed by max-product are guaranteed to be consistent, in a suitable sense to be defined, over every tree of the graph. We then turn to characterizing the nature of the approximation to the MAP assignment computed by max-product. We generalize previous work by showing that for any graph, the max-product assignment satisfies a particular optimality condition with respect to any subgraph containing at most one cycle per connected component. We use this optimality condition to derive upper bounds on the difference between the log probability of the true MAP assignment, and the log probability of a max-product assignment. Finally, we consider extensions of the max-product algorithm that operate over higher-order cliques, and show how our reparameterization analysis extends in a natural manner.
Similar content being viewed by others
References
Aji S.M., Horn G., McEliece R.J., and Xu M. 1998. Iterative min-sum decoding of tail-biting codes. In: Proc. IEEE Information Theory Workshop, Killarney Ireland, pp. 68–69.
Aji S.M. and McEliece R.J. 2000. The generalized distributive law. IEEE Trans. Info. Theory 46: 325–343.
Barahona F. 1982. On the computational computational complexity of the Ising model. Journal of Physics A: Mathematical and General 15: 3241–3253.
Benedetto S., Montorsi G., Divsalar D., and Pollara F. 1996. Soft-output decoding algorithms in iterative decoding of turbo codes. Technical report, Jet Propulsion Laboratory.
Berge C. 1989. Hypergraphs. North-Holland Publishing Company, Amsterdam.
Bertsekas D.P. 1995. Dynamic Programming and Stochastic Control, vol. 1. Athena Scientific, Belmont, MA.
Besag J. 1986. On the statistical analysis of dirty pictures. Journal of the Royal Statistical Society, Series B 48(3): 259–279.
Bodlaender H. 1993. A tourist guide through treewidth. Acta Cybernetica 11: 1–21.
Bollobás B. 1998. Modern Graph Theory. Springer-Verlag, New York.
Boykov Y., Veksler O., and Zabih R. 2001. Fast approximate energy minimization via graph cuts. IEEE. Trans. PAMI 23: 1222–1238.
Brémaud P. 1991. Markov Chains, Gibbs Fields, Monte Carlo Simulation, and Queues. Springer.
Censor Y. and Zenios S.A. 1988. Parallel Optimization: Theory, Algorithms, and Applications. Numerical Mathematics and Scientific Computation. Oxford University Press.
Cowell R.G., Dawid A.P., Lauritzen S.L., and Spiegelhalter D.J. 1999. Probabilistic Networks and Expert Systems. Statistics for Engineering and Information Science. Springer-Verlag.
Dawid A.P. 1992. Applications of a general propagation algorithm for probabilistic expert systems. Statistics and Computing 2: 25–36.
Forney G.D., Jr. 1973. The Viterbi algorithm. Proc. IEEE 61: 268–277.
Forney G.D., Jr., Kschischang F.R., Marcus B., and Tuncel S. 2001a. Iterative decoding of tail-biting trellises and connections with symbolic dynamics. In: Codes, Systems and Graphical Models. Springer, pp. 239–264.
Forney G.D., Jr., Koetter R., Kschischang F.R., and Reznick A. 2001b. On the effective weights of pseudocodewords for codes defined on graphs with cycles. In: Codes, Systems and Graphical Models. Springer, pp. 101–112.
Freeman W.T. and Pasztor E.C. 1999. Learning to estimate scenes from images. In: NIPS, vol. 11.
Freeman W.T. and Weiss Y. 2001. On the optimality of solutions of the max-product belief propagation algorithm in arbitrary graphs. IEEE Trans. Info. Theory 47: 736–744.
Frey B.J. and Koetter R. 2000. Exact inference using the attenuated maxproduct algorithm. In: Advanced Mean Field Methods: Theory and Practice. MIT Press.
Frey B.J., Koetter R., and Vardy A. 2001. Signal-space characterization of iterative decoding. IEEE Trans. Info. Theory. 47: 766–781.
Fridman A. 2002.Topology-corrected belief revision in the ising model. Presented at Snowbird Learning workshop, Snowbird, UT.
Gallager R.G. 1963. Low-Density Parity Check Codes. MIT Press, Cambridge, MA.
Geman S. and Geman D. 1984. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Pat. Anal. Mach. Intell. 6: 721–741.
Goemans M.X. and Williamson D.P. 1995. Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming. Journal of the ACM 42: 1115–1145.
Grötschel M., Lovász L., and Schrijver A. 1993. Geometric algorithms and combinatorial optimization. Springer-Verlag, Berlin, Germany.
Horn G. 1999. Iterative decoding and pseudocodewords. PhD thesis, California Institute of Technology.
Jordan M. (Ed.) 1999. Learning in graphical models. MIT Press, Cambridge, MA.
Kleinberg J. and Tardos E. 1999. Approximation algorithms for classification problems with pairwise relationships: Metric labeling and Markov random fields. In: IEEE Symp. Found. Comp. Science, pp. 14–24.
Lauritzen S.L. 1996. Graphical models. Oxford University Press, Oxford.
McEliece R.J. and Yildirim M. 2002. Belief propagation on partially ordered sets. In: Gilliam D. and Rosenthal J. (Eds.), Mathematical Theory of Systems and Networks. Institute for Mathematics and its Applications.
Nilsson D. 1998. An efficient algorithm for finding the M most probable configurations in a probabilistic expert system. Statistics and Computing 8: 159–173.
Ortega J.M. and Rheinboldt W.C. 2000. Iterative solution of nonlinear equations in several variables. Classics in Applied Mathematics. SIAM, New York.
Pakzad P. and Anantharam V. 2002. Iterative algorithms and free energy minimization. In: Conference on Information Sciences and Systems.
Pearl J. 1988. Probabilistic reasoning in intelligent systems. Morgan Kaufman, San Mateo.
Stanley R.P. 1997. Enumerative combinatorics, vol. 1. Cambridge University Press, Cambridge, UK.
Verdú S. and Poor H.V. 1987. Abstract dynamic programming models under commutativity conditions. SIAM J. Control and Optimization 25(4): 990–1006.
Viterbi A. 1967. Error bounds for convolutional codes and an asymptotically optimal decoding algorithm. IEEE Trans. Info. Theory IT-13: 260–269.
Wainwright M.J., Jaakkola T.S., and Willsky A.S. 2003. Tree-based reparameterization framework for analysis of sum-product and related algorithms. IEEE Trans. Info. Theory 49(5): 1120–1146.
Wainwright M.J. 2002. Stochastic processes on graphs with cycles: Geometric and variational approaches. PhD thesis, MIT.
Weiss. Y. 2000. Correctness of local probability propagation in graphical models with loops. Neural Computation 12: 1–41.
Wiberg N. 1996. Codes and decoding on general graphs. PhD thesis, University of Linkoping, Sweden.
Yedidia J.S., Freeman W.T., and Weiss Y. 2001. Generalized belief propagation. In: NIPS 13, MIT Press, pp. 689–695.
Author information
Authors and Affiliations
Rights and permissions
About this article
Cite this article
Wainwright, M., Jaakkola, T. & Willsky, A. Tree consistency and bounds on the performance of the max-product algorithm and its generalizations. Statistics and Computing 14, 143–166 (2004). https://doi.org/10.1023/B:STCO.0000021412.33763.d5
Issue Date:
DOI: https://doi.org/10.1023/B:STCO.0000021412.33763.d5