Research

PPDM

Previous grants:

  1. Cryptology and Data-Mining (CRYDAMI)

    Duration: 3.5 years (2004 - 2007).

    Funded by the Finnish Academy of Sciences

  2. Privacy-Preserving Data Mining: Cryptographic Methods

    Duration: 2 years (2006 - 2008).

    Funded by the Estonian Science Foundation. Grant number 6848 (to Cybernetica AS).

Abstract (as in the grant application, with only minor modifications):

The primary task of data-mining is to develop models about aggregated data. It has close ties with areas like statistics, machine learning and database theory. On the other hand, the main question of privacy-preserving data-mining (PPDM) is, can we develop accurate models without access to precise information in individual data records? Privacy-preserving data-mining is important in practice, since for example, without guarantees that their privacy is being protected, many customers are unwilling to submit their data to the data-miners, or are consciously lying. As another important example, different companies may want to establish models of their joint databases, without giving away the content of the databases to another parties. Because of these and many other reasons, while privacy is usually seen as unrelevant and even undesirable in data-mining (and statistics in general), without preserving the privacy it may be hard to achieve the main goals of data-mining.

The research field that deals with privacy (and secure computing in general) is called cryptology, with cryptography dealing with the construction of new primitives and protocols, and cryptanalysis deals with the analysis and attacking of existing primitives/protocols. One of the important results of cryptology is that all efficiently computable functions are also efficiently computable in a secure, privacy-preserving, way . Therefore, it is natural to try to use cryptographic methods to solve the privacy problems in data-mining. However, the main problem of the cryptographic approach is that even if all functions can be computed securely, the resulting protocols are not very efficient. This is well-known in many application areas like electronic auctions , but data-mining poses even more difficulties due to the huge amount of data involved.

As an example, a concrete yet very important question of privacy-preserving data-mining is how to guarantee that the client will get exactly one record from the commercial database, without the database maintainer knowing, which item was obtained. If the database is relatively small, one can use oblivious transfer for this purpose. If the database is huge (as in most of the applications), one must use private information retrieval (PIR) where only cares about the privacy of the client or some other, alternative techniques.

Related publications up to now:

[LAN02] (useful subprotocols for PPDM)
Helger Lipmaa, N. Asokan, Valtteri Niemi, ``Secure Vickrey Auctions without Threshold Trust.'' In Matt Blaze, editor, Financial Cryptography 2002, volume 2357 of Lecture Notes in Computer Science, Southampton Beach, Bermuda, 11--14 March 2002. Springer-Verlag.
[Lip03a] (useful subprotocols for PPDM)
Helger Lipmaa. On Diophantine Complexity and Statistical Zero-Knowledge Arguments. In Chi Sung Laih, editor, Advances on Cryptology --- ASIACRYPT 2003, volume 2894 of Lecture Notes in Computer Science, pages 398--415, Taipei, Taiwan, November 30--December 4 2003. Springer-Verlag.
[Lip03b] (a fast verifiable oblivious transfer protocol)
Helger Lipmaa. Verifiable Homomorphic Oblivious Transfer and Private Equality Test. In Chi Sung Laih, editor, Advances on Cryptology --- ASIACRYPT 2003, volume 2894 of Lecture Notes in Computer Science, pages 416--433, Taipei, Taiwan, November 30--December 4 2003. Springer-Verlag.
[AJL04] (first cryptographic randomized response technique protocols)
Andris Ambainis, Markus Jakobsson, Helger Lipmaa. Cryptographic Randomized Response Techniques. In Feng Bao, editor, Public Key Cryptography 2004, volume 2947 of Lecture Notes in Computer Science, pages 425--438, Singapore, March 1--4 2004. Springer-Verlag.
[LL04] (attacks on some proposed private similarity search protocols)
Sven Laur and Helger Lipmaa. On Private Similarity Search Protocols. In Sanna Liimatainen and Teemupekka Virtanen, editors, Proceedings of the 9th Nordic Workshop on Secure IT Systems (NordSec 2004), pages 73--77, Espoo, Finland, November 4--5, 2004. ISBN 951-22-7348-9.
[GLLM04] (attacks on some proposed private scalar product protocols; description of a secure protocol)
Bart Goethals, Sven Laur, Helger Lipmaa and Taneli Mielikäinen. On Private Scalar Product Computation for Privacy-Preserving Data Mining. In Choonsik Park and Seongtaek Chee, editors, The 7th Annual International Conference in Information Security and Cryptology (ICISC 2004), volume 3506 of Lecture Notes in Computer Science, pages 104--120, Seoul, Korea, December 2--3, 2004. Springer-Verlag.
[Lip05] (the most efficient oblivious transfer protocol at this time)
Helger Lipmaa. An Oblivious Transfer Protocol with Log-Squared Communication. In Jianying Zhou and Javier Lopez, editors, The 8th Information Security Conference (ISC'05), volume 3650 of Lecture Notes in Computer Science, pages 314--328, Singapore, September 20--23, 2005. Springer-Verlag.
[LLM05] (study of private itemset counting algorithms)
Sven Laur, Helger Lipmaa and Taneli Mielikäinen. Private Itemset Support Counting. In Sihan Qing, Wenbo Mao, Javier Lopez and Guilin Wang, editors, Information and Communications Security: 7th International Conference, ICICS 2005, volume 3783 of Lecture Notes in Computer Science, pages 97--111, Beijing, China, December 10--13, 2005. Springer-Verlag.
[LLM06] (cryptographically private versions of perceptron/svm)
Sven Laur, Helger Lipmaa and Taneli Mielikäinen. Cryptographically Private Support Vector Machines. In Mark Craven and Dimitrios Gunopulos, editors, The Twelfth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2006, volume ? of ?, pages ?--?, Philadelphia, USA, August 20--23, 2006. ACM.

Activities

Visit of Sven Laur (Feb-May 2003). Informal weekly seminars with Prof. Heikki Mannila, Prof. Helger Lipmaa, Sven Laur and Jouni Seppänen.

Visit of Matthias Fischmann (August 2003).

A weekly seminar on the PPDM: T-79.514 Special Course on Cryptology, Autumn 2004, topic: PPDM.

A grant on PPDM by Finnish Academy of Sciences (2004--2007), funding for Sven Laur's PhD studies.

A weekly seminar on the PPDM: T-79.515 Special Course on Cryptology, Spring 2004, topic: PPDM.

Invitation of Benny Pinkas to Estonian Winter School in Computer Science 2005.

Links

First, PPDM linkfarm by Helger Lipmaa. Crypto group at the University of Tartu.

Data-mining links in Finland

Research groups:

Courses:

Data-mining groups in Estonia

Research groups:

Latest update: 31 March 2006. Helger Lipmaa.