Skip to main content
Log in

Admissible stochastic complexity models for classification problems

  • Papers
  • Published:
Statistics and Computing Aims and scope Submit manuscript

Abstract

In this paper we investigate the application of stochastic complexity theory to classification problems. In particular, we define the notion of admissible models as a function of problem complexity, the number of data pointsN, and prior belief. This allows us to derive general bounds relating classifier complexity with data-dependent parameters such as sample size, class entropy and the optimal Bayes error rate. We discuss the application of these results to a variety of problems, including decision tree classifiers, Markov models for image segmentation, and feedforward multilayer neural network classifiers.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

References

  • Barron, A. and Barron, R. (1988) Statistical learning networks: a unifying view, presented at the 1988 Symposium on the Interface: Statistics and Computing Science, Virginia.

  • Barron, A. R. and Cover, T. M. (1991) Minimum complexity density estimation.IEEE Transactions on Information Theory,37, 1034–1054.

    Google Scholar 

  • Buntine, W. L. (1992) Learning classification trees.Statistics and Computing,2, 63–73.

    Google Scholar 

  • Buntine, W. L. and Weigend, A. S. (1991) Bayesian back-propagation. Technical Report FIA-91-22, RIACS and NASA Ames Research Center, Moffett Field, CA.

    Google Scholar 

  • Cybenko, G. (1990) Complexity theory of neural networks and classification problems, inNeural Networks, L. Almeida (ed.), Springer Lecture Notes in Computer Science, pp. 26–45.

  • Gish, H. (1993) Maximum likelihood training of neural networks, inArtificial Intelligence Frontiers in Statistics, Hand, D. J. (ed.), Chapman & Hall, London.

    Google Scholar 

  • Goodman, R. M., Symth, P. Higgins, C. and Miller, J. W. (1992) Rule-based networks for classification and probability estimation, to appear inNeural Computation.

  • Kovalevsky, V. A. (1980)Image Pattern Recognition, translated by A. Brown, Springer-Verlag, New York, p. 79.

    Google Scholar 

  • Mackay, D. (1992) A practical Bayesian framework for backprop networks, to appear inNeural Computation.

  • Mehra, P., Rendell, L. A. and Wah, B. W. (1989) Principled constructive induction, inProceedings of IJCAI 1989, Morgan Kaufmann, San Mateo, CA, pp. 651–656.

    Google Scholar 

  • Quinlan, J. R. and Rivest, R. (1989) Inferring decision trees using the minimum description length principle.Information and Computations,80, 227–248.

    Google Scholar 

  • Rissanen, J. (1984) Universal coding, information, prediction, and estimation.IEEE Transactions on Information Theory,30, 629–636.

    Google Scholar 

  • Rissanen, J. (1989)Stochastic Complexity in Statistical Inquiry, Teaneck, New Jersey: World Scientific.

    Google Scholar 

  • Shrager, J. and Langley, P. (eds) (1990)Computational Models of Scientific Discovery and Theory Formation, Morgan Kaufmann, San Mateo, CA.

    Google Scholar 

  • Smith, K. R. and Miller, M. I. (1990) A Bayesian approach incorporating Rissanen complexity for learning Markov random field texture models, inProceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, IEEE Press, New York.

    Google Scholar 

  • Smyth, P. (1991) On stochastic complexity and admissible models for neural network classifiers, inAdvances in Normal Information Processing 3, Touretzky, D., Lippman, R. and Moody, J. (eds), Morgan Kaufmann, San Mateo, CA, pp. 818–824.

    Google Scholar 

  • Wallace, C. S. and Freeman, P. R. (1987) Estimation and inference by compact coding.Journal of the Royal Statistical Society B,49, 240–251.

    Google Scholar 

  • Wallace, C. S. and Patrick, J. D. (1990) Coding decision trees. Preprint.

  • Wolberg, W. H. and Mangasarian, O. L. (1990) Multisurface method of pattern separation for medical diagnosis applied to breast cytology,Proceedings of the National Academy of Sciences, U.S.A.,87, 9193–9196.

    Google Scholar 

  • White, H. (1989) Learning in artificial neural networks: a statistical perspective.Neural Computation,1, 425–464.

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Rights and permissions

Reprints and permissions

About this article

Cite this article

Smyth, P. Admissible stochastic complexity models for classification problems. Stat Comput 2, 97–104 (1992). https://doi.org/10.1007/BF01889588

Download citation

  • Received:

  • Accepted:

  • Issue Date:

  • DOI: https://doi.org/10.1007/BF01889588

Keywords

Navigation