Skip to main content
Log in

Tree consistency and bounds on the performance of the max-product algorithm and its generalizations

  • Published:
Statistics and Computing Aims and scope Submit manuscript

Abstract

Finding the maximum a posteriori (MAP) assignment of a discrete-state distribution specified by a graphical model requires solving an integer program. The max-product algorithm, also known as the max-plus or min-sum algorithm, is an iterative method for (approximately) solving such a problem on graphs with cycles. We provide a novel perspective on the algorithm, which is based on the idea of reparameterizing the distribution in terms of so-called pseudo-max-marginals on nodes and edges of the graph. This viewpoint provides conceptual insight into the max-product algorithm in application to graphs with cycles. First, we prove the existence of max-product fixed points for positive distributions on arbitrary graphs. Next, we show that the approximate max-marginals computed by max-product are guaranteed to be consistent, in a suitable sense to be defined, over every tree of the graph. We then turn to characterizing the nature of the approximation to the MAP assignment computed by max-product. We generalize previous work by showing that for any graph, the max-product assignment satisfies a particular optimality condition with respect to any subgraph containing at most one cycle per connected component. We use this optimality condition to derive upper bounds on the difference between the log probability of the true MAP assignment, and the log probability of a max-product assignment. Finally, we consider extensions of the max-product algorithm that operate over higher-order cliques, and show how our reparameterization analysis extends in a natural manner.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Similar content being viewed by others

References

  • Aji S.M., Horn G., McEliece R.J., and Xu M. 1998. Iterative min-sum decoding of tail-biting codes. In: Proc. IEEE Information Theory Workshop, Killarney Ireland, pp. 68–69.

    Google Scholar 

  • Aji S.M. and McEliece R.J. 2000. The generalized distributive law. IEEE Trans. Info. Theory 46: 325–343.

    Google Scholar 

  • Barahona F. 1982. On the computational computational complexity of the Ising model. Journal of Physics A: Mathematical and General 15: 3241–3253.

    Google Scholar 

  • Benedetto S., Montorsi G., Divsalar D., and Pollara F. 1996. Soft-output decoding algorithms in iterative decoding of turbo codes. Technical report, Jet Propulsion Laboratory.

  • Berge C. 1989. Hypergraphs. North-Holland Publishing Company, Amsterdam.

    Google Scholar 

  • Bertsekas D.P. 1995. Dynamic Programming and Stochastic Control, vol. 1. Athena Scientific, Belmont, MA.

    Google Scholar 

  • Besag J. 1986. On the statistical analysis of dirty pictures. Journal of the Royal Statistical Society, Series B 48(3): 259–279.

    Google Scholar 

  • Bodlaender H. 1993. A tourist guide through treewidth. Acta Cybernetica 11: 1–21.

    Google Scholar 

  • Bollobás B. 1998. Modern Graph Theory. Springer-Verlag, New York.

    Google Scholar 

  • Boykov Y., Veksler O., and Zabih R. 2001. Fast approximate energy minimization via graph cuts. IEEE. Trans. PAMI 23: 1222–1238.

    Google Scholar 

  • Brémaud P. 1991. Markov Chains, Gibbs Fields, Monte Carlo Simulation, and Queues. Springer.

  • Censor Y. and Zenios S.A. 1988. Parallel Optimization: Theory, Algorithms, and Applications. Numerical Mathematics and Scientific Computation. Oxford University Press.

  • Cowell R.G., Dawid A.P., Lauritzen S.L., and Spiegelhalter D.J. 1999. Probabilistic Networks and Expert Systems. Statistics for Engineering and Information Science. Springer-Verlag.

  • Dawid A.P. 1992. Applications of a general propagation algorithm for probabilistic expert systems. Statistics and Computing 2: 25–36.

    Google Scholar 

  • Forney G.D., Jr. 1973. The Viterbi algorithm. Proc. IEEE 61: 268–277.

    Google Scholar 

  • Forney G.D., Jr., Kschischang F.R., Marcus B., and Tuncel S. 2001a. Iterative decoding of tail-biting trellises and connections with symbolic dynamics. In: Codes, Systems and Graphical Models. Springer, pp. 239–264.

  • Forney G.D., Jr., Koetter R., Kschischang F.R., and Reznick A. 2001b. On the effective weights of pseudocodewords for codes defined on graphs with cycles. In: Codes, Systems and Graphical Models. Springer, pp. 101–112.

  • Freeman W.T. and Pasztor E.C. 1999. Learning to estimate scenes from images. In: NIPS, vol. 11.

  • Freeman W.T. and Weiss Y. 2001. On the optimality of solutions of the max-product belief propagation algorithm in arbitrary graphs. IEEE Trans. Info. Theory 47: 736–744.

    Google Scholar 

  • Frey B.J. and Koetter R. 2000. Exact inference using the attenuated maxproduct algorithm. In: Advanced Mean Field Methods: Theory and Practice. MIT Press.

  • Frey B.J., Koetter R., and Vardy A. 2001. Signal-space characterization of iterative decoding. IEEE Trans. Info. Theory. 47: 766–781.

    Google Scholar 

  • Fridman A. 2002.Topology-corrected belief revision in the ising model. Presented at Snowbird Learning workshop, Snowbird, UT.

  • Gallager R.G. 1963. Low-Density Parity Check Codes. MIT Press, Cambridge, MA.

    Google Scholar 

  • Geman S. and Geman D. 1984. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Pat. Anal. Mach. Intell. 6: 721–741.

    Google Scholar 

  • Goemans M.X. and Williamson D.P. 1995. Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming. Journal of the ACM 42: 1115–1145.

    Google Scholar 

  • Grötschel M., Lovász L., and Schrijver A. 1993. Geometric algorithms and combinatorial optimization. Springer-Verlag, Berlin, Germany.

    Google Scholar 

  • Horn G. 1999. Iterative decoding and pseudocodewords. PhD thesis, California Institute of Technology.

  • Jordan M. (Ed.) 1999. Learning in graphical models. MIT Press, Cambridge, MA.

    Google Scholar 

  • Kleinberg J. and Tardos E. 1999. Approximation algorithms for classification problems with pairwise relationships: Metric labeling and Markov random fields. In: IEEE Symp. Found. Comp. Science, pp. 14–24.

  • Lauritzen S.L. 1996. Graphical models. Oxford University Press, Oxford.

    Google Scholar 

  • McEliece R.J. and Yildirim M. 2002. Belief propagation on partially ordered sets. In: Gilliam D. and Rosenthal J. (Eds.), Mathematical Theory of Systems and Networks. Institute for Mathematics and its Applications.

  • Nilsson D. 1998. An efficient algorithm for finding the M most probable configurations in a probabilistic expert system. Statistics and Computing 8: 159–173.

    Google Scholar 

  • Ortega J.M. and Rheinboldt W.C. 2000. Iterative solution of nonlinear equations in several variables. Classics in Applied Mathematics. SIAM, New York.

    Google Scholar 

  • Pakzad P. and Anantharam V. 2002. Iterative algorithms and free energy minimization. In: Conference on Information Sciences and Systems.

  • Pearl J. 1988. Probabilistic reasoning in intelligent systems. Morgan Kaufman, San Mateo.

    Google Scholar 

  • Stanley R.P. 1997. Enumerative combinatorics, vol. 1. Cambridge University Press, Cambridge, UK.

    Google Scholar 

  • Verdú S. and Poor H.V. 1987. Abstract dynamic programming models under commutativity conditions. SIAM J. Control and Optimization 25(4): 990–1006.

    Google Scholar 

  • Viterbi A. 1967. Error bounds for convolutional codes and an asymptotically optimal decoding algorithm. IEEE Trans. Info. Theory IT-13: 260–269.

    Google Scholar 

  • Wainwright M.J., Jaakkola T.S., and Willsky A.S. 2003. Tree-based reparameterization framework for analysis of sum-product and related algorithms. IEEE Trans. Info. Theory 49(5): 1120–1146.

    Google Scholar 

  • Wainwright M.J. 2002. Stochastic processes on graphs with cycles: Geometric and variational approaches. PhD thesis, MIT.

  • Weiss. Y. 2000. Correctness of local probability propagation in graphical models with loops. Neural Computation 12: 1–41.

    Google Scholar 

  • Wiberg N. 1996. Codes and decoding on general graphs. PhD thesis, University of Linkoping, Sweden.

    Google Scholar 

  • Yedidia J.S., Freeman W.T., and Weiss Y. 2001. Generalized belief propagation. In: NIPS 13, MIT Press, pp. 689–695.

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Rights and permissions

Reprints and permissions

About this article

Cite this article

Wainwright, M., Jaakkola, T. & Willsky, A. Tree consistency and bounds on the performance of the max-product algorithm and its generalizations. Statistics and Computing 14, 143–166 (2004). https://doi.org/10.1023/B:STCO.0000021412.33763.d5

Download citation

  • Issue Date:

  • DOI: https://doi.org/10.1023/B:STCO.0000021412.33763.d5

Navigation