Presentation of Census Results by Interactive Statistical Models

Method of Models

Advantages:
  • original data not included
  • perfect privacy protection
  • unlimited distribution
  • reasonable model accuracy
  • applicable to incomplete data

Real application:
  • applied to data from the 2001 Czech census

Idea of the Method

In the case of a census, the growing demand for disseminating and sharing statistical information is strongly limited by the need to protect the privacy of respondents. To address this problem, we estimate the joint probability distribution of the original microdata in the form of a multivariate distribution mixture, using the EM algorithm. The estimated mixture can be used directly as the knowledge base of a probabilistic expert system. By means of the probabilistic inference mechanism, we can derive conditional distributions of arbitrary variables interactively, without any further access to the source database. Because the statistical model does not contain the original data, it can be distributed without confidentiality concerns. The resulting interactive software offers flexibility and user comfort comparable to those of anonymized microdata sets.
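The idea can be illustrated with a minimal sketch. It is not the implementation used for the census data; it assumes that all variables are categorical and coded as integers, and that each mixture component is a product of independent categorical distributions (the text above says only "multivariate distribution mixture"). The names fit_mixture and conditional, the smoothing constant, and the toy records are hypothetical and chosen for illustration. Missing values are written as None and are simply skipped, which is one common way to handle incomplete records in a product mixture.

```python
import random


def fit_mixture(data, n_values, n_components, n_iter=200, seed=0):
    """EM for a mixture of products of categorical distributions.

    data: list of tuples of category indices (None marks a missing value).
    n_values[j]: number of categories of variable j (assumed known).
    Returns (weights, theta), theta[m][j][v] = P(x_j = v | component m).
    """
    rng = random.Random(seed)
    weights = [1.0 / n_components] * n_components
    theta = []
    for _ in range(n_components):
        comp = []
        for k in n_values:
            # Random positive start, so that the components differ.
            raw = [rng.random() + 0.5 for _ in range(k)]
            comp.append([r / sum(raw) for r in raw])
        theta.append(comp)

    for _ in range(n_iter):
        resp_sum = [0.0] * n_components
        # A small constant keeps every probability strictly positive.
        counts = [[[1e-6] * k for k in n_values] for _ in range(n_components)]
        for x in data:
            # E-step: posterior weight of each component for record x.
            joint = []
            for m in range(n_components):
                p = weights[m]
                for j, v in enumerate(x):
                    if v is not None:
                        p *= theta[m][j][v]
                joint.append(p)
            total = sum(joint)
            for m in range(n_components):
                q = joint[m] / total
                resp_sum[m] += q
                for j, v in enumerate(x):
                    if v is not None:
                        counts[m][j][v] += q
        # M-step: re-estimate parameters from the weighted counts.
        weights = [r / len(data) for r in resp_sum]
        theta = [[[c / sum(row) for c in row] for row in comp]
                 for comp in counts]
    return weights, theta


def conditional(weights, theta, target, evidence):
    """P(x_target | evidence), with evidence given as {variable: value}.

    Only the model parameters are used; the records are not needed.
    """
    n_target = len(theta[0][target])
    scores = [0.0] * n_target
    for w, comp in zip(weights, theta):
        p_evidence = w
        for j, v in evidence.items():
            p_evidence *= comp[j][v]
        for v in range(n_target):
            scores[v] += p_evidence * comp[target][v]
    total = sum(scores)
    return [s / total for s in scores]


# Toy records with two dependent binary variables (made-up data).
records = [(0, 0)] * 40 + [(1, 1)] * 40 + [(0, 1)] * 10 + [(1, 0)] * 10
weights, theta = fit_mixture(records, n_values=[2, 2], n_components=2)

# Queries are answered from (weights, theta) alone.
print(conditional(weights, theta, target=1, evidence={0: 1}))
print(conditional(weights, theta, target=1, evidence={}))
```

Once the fit has finished, only weights and theta need to be shipped; conditional never touches records. With an empty evidence set the query returns the model's marginal distribution of the target variable, which for complete data should reproduce the observed frequencies up to the small smoothing constant.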