Previous grants:

  1. Cryptology and Data-Mining (CRYDAMI)

    Duration: 3.5 years (2004 - 2007).

    Funded by the Finnish Academy of Sciences.

  2. Privacy-Preserving Data Mining: Cryptographic Methods

    Duration: 2 years (2006 - 2008).

    Funded by the Estonian Science Foundation. Grant number 6848 (to Cybernetica AS).

Abstract (as in the grant application, with only minor modifications):

The primary task of data-mining is to develop models of aggregated data. It has close ties with areas such as statistics, machine learning and database theory. The main question of privacy-preserving data-mining (PPDM), on the other hand, is whether we can develop accurate models without access to precise information in individual data records. Privacy-preserving data-mining is important in practice: for example, without guarantees that their privacy is protected, many customers are unwilling to submit their data to data-miners, or deliberately submit false data. As another important example, different companies may want to build models of their joint databases without revealing the contents of those databases to the other parties. For these and many other reasons, while privacy is usually seen as irrelevant and even undesirable in data-mining (and in statistics in general), it may be hard to achieve the main goals of data-mining without preserving privacy.

The research field that deals with privacy (and secure computing in general) is called cryptology. Within it, cryptography deals with the construction of new primitives and protocols, and cryptanalysis deals with the analysis and attacking of existing primitives and protocols. One of the important results of cryptology is that all efficiently computable functions are also efficiently computable in a secure, privacy-preserving way. Therefore, it is natural to try to use cryptographic methods to solve the privacy problems in data-mining. However, the main problem of the cryptographic approach is that even if all functions can be computed securely, the resulting protocols are not very efficient. This is well known in many application areas, such as electronic auctions, but data-mining poses even more difficulties due to the huge amount of data involved.

As an example, a concrete yet very important question of privacy-preserving data-mining is how to guarantee that a client gets exactly one record from a commercial database without the database maintainer learning which record was obtained. If the database is relatively small, one can use oblivious transfer for this purpose. If the database is huge (as in most applications), one must use private information retrieval (PIR), where one only cares about the privacy of the client, or some alternative technique.
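
To make the PIR setting concrete, the following is a minimal sketch of the classic two-server XOR-based scheme, which hides the requested index from each server as long as the two servers do not collude. It is an illustration only and not the protocols studied in this project: it assumes the database is replicated on two servers, it sends one bit per record to each server, and all names (make_queries, answer, retrieve) are hypothetical. As in the description above, only the client's privacy is protected; nothing here stops the client from learning more than one record.

```python
import secrets


def make_queries(n, index):
    """Client side: build one query bit vector per server."""
    # q1 is uniformly random, so on its own it reveals nothing about index.
    q1 = [secrets.randbelow(2) for _ in range(n)]
    # q2 differs from q1 only at the wanted position; on its own it is
    # also uniformly random.
    q2 = list(q1)
    q2[index] ^= 1
    return q1, q2


def answer(database, query):
    """Server side: XOR together the records selected by the query bits."""
    acc = 0
    for record, bit in zip(database, query):
        if bit:
            acc ^= record
    return acc


def retrieve(database, index):
    """Run the whole exchange locally (both servers hold the same database)."""
    q1, q2 = make_queries(len(database), index)
    a1 = answer(database, q1)  # would be computed by server 1
    a2 = answer(database, q2)  # would be computed by server 2
    # Every record selected by both queries cancels out; only the wanted
    # record, selected by exactly one of them, survives the XOR.
    return a1 ^ a2


# Toy database of integer records (made-up values).
database = [17, 42, 99, 7, 63]
assert retrieve(database, 2) == 99
```

The queries in this sketch are as long as the database itself, which is exactly the kind of cost that becomes prohibitive for huge databases and motivates more communication-efficient PIR protocols.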

Related publications up to now:

Visit of Sven Laur (Feb-May 2003). Informal weekly seminars with Prof. Heikki Mannila, Prof. Helger Lipmaa, Sven Laur and Jouni Seppänen.

Visit of Matthias Fischmann (August 2003).

A weekly seminar on PPDM: T-79.514 Special Course on Cryptology, Autumn 2004, topic: PPDM.

A grant on PPDM by the Finnish Academy of Sciences (2004--2007), funding for Sven Laur's PhD studies.

A weekly seminar on PPDM: T-79.515 Special Course on Cryptology, Spring 2004, topic: PPDM.

Invitation of Benny Pinkas to Estonian Winter School in Computer Science 2005.


