Variational approximations for categorical causal modeling with latent variables

Humphreys, K.; Titterington, D. M.

doi:10.1007/BF02294734

Article
Published: September 2003

Psychometrika, Volume 68, pages 391–412 (2003)

K. Humphreys¹ & D. M. Titterington²

Abstract

Latent class models in the social and behavioral sciences have remained structurally simple. One reason for this is that inference in statistical models can be computationally difficult. Methods for approximate inference, known as variational approximations, which have been developed in the machine learning, graphical modeling and statistical physics literatures, can be used to alleviate the computational difficulties of inference for latent variable models. The aim of the present article is to set these methods alongside some social and behavioral science literature to which they are relevant, and in particular to consider their potential for “categorical causal modeling”, using latent class analysis. We have collated a number of popular categorical-data models with latent variables and causal structure, typically incorporating a Markovian structure. The efficacy of the approximation methods has been demonstrated through simulations related to an important behavioral science model.
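
As a rough illustration of what a variational approximation does, the following Python sketch applies a fully factorized (mean-field) approximation to two discrete latent variables and compares the resulting lower bound with the exact log-likelihood obtained by enumeration. It is not taken from the article: the function names `softmax` and `mean_field_bound` and the numbers in the example table are assumptions chosen purely for illustration.

```python
import math


def softmax(scores):
    """Normalize log-scale scores into a probability vector."""
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def mean_field_bound(joint, n_iter=100):
    """Mean-field lower bound on log p(x) for two discrete latent variables.

    joint[i][j] is assumed to hold p(z1=i, z2=j, x) for the observed x,
    with every entry strictly positive (hypothetical input format).
    Returns the bound and the two factors q1(z1), q2(z2).
    """
    rows, cols = len(joint), len(joint[0])
    log_joint = [[math.log(p) for p in row] for row in joint]
    q1 = [1.0 / rows] * rows
    q2 = [1.0 / cols] * cols
    for _ in range(n_iter):
        # Coordinate ascent: each factor is proportional to the exponential of
        # the expected log joint under the other factor.
        q1 = softmax([sum(q2[j] * log_joint[i][j] for j in range(cols))
                      for i in range(rows)])
        q2 = softmax([sum(q1[i] * log_joint[i][j] for i in range(rows))
                      for j in range(cols)])
    expected_log_joint = sum(q1[i] * q2[j] * log_joint[i][j]
                             for i in range(rows) for j in range(cols))
    # Entropy of a product distribution is the sum of the factor entropies.
    entropy = -sum(q * math.log(q) for q in q1 + q2 if q > 0)
    return expected_log_joint + entropy, q1, q2


# Illustrative table of p(z1, z2, x) for one observed x (made-up numbers).
joint = [[0.10, 0.02],
         [0.03, 0.15]]
exact = math.log(sum(sum(row) for row in joint))  # exact log p(x)
bound, q1, q2 = mean_field_bound(joint)
print("exact log-likelihood:", exact)
print("mean-field lower bound:", bound)
```

With only two latent variables the exact sum is trivial, so the sketch only shows the mechanics. The approach becomes useful when the number of joint latent configurations is too large to enumerate, at the price of a bound that is tight only when the latent variables are independent given the data.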

Author information

Authors and Affiliations

Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, P.O. Box 281, 171 77, Stockholm, Sweden
K. Humphreys
Department of Statistics, University of Glasgow, UK
D. M. Titterington

Corresponding author

Correspondence to K. Humphreys.

Additional information

Research was supported by a grant from the UK Engineering and Physical Sciences Research Council. The authors would like to thank anonymous reviewers and the Associate Editor for their very helpful comments on earlier versions of the manuscript.

About this article

Cite this article

Humphreys, K., Titterington, D.M. Variational approximations for categorical causal modeling with latent variables. Psychometrika 68, 391–412 (2003). https://doi.org/10.1007/BF02294734

Received: 09 December 1999
Revised: 10 August 2002
Issue Date: September 2003
DOI: https://doi.org/10.1007/BF02294734
