2nd Workshop on
Privacy Preserving Data Mining (PPDM)

Melbourne, Florida, USA, November 19, 2003
In conjunction with
ICDM'03: The Third IEEE International Conference on Data Mining 2003

Call for Papers


In light of developments in technology for analyzing personal data, public concern about privacy is rising. While some believe that statistical and Knowledge Discovery and Data Mining (KDDM) research is detached from this issue, the debate is clearly gaining momentum as KDDM and statistical tools are more widely adopted by public and private organizations that host large databases of personal records. One of the key requirements of a data mining project is access to the relevant data. Privacy and security concerns can constrain such access, threatening to derail data mining projects. This workshop will bring together researchers and practitioners to identify problems and solutions where data mining interferes with privacy and security.

The purpose of this workshop is to discuss these issues and to highlight the achievements of researchers in the area. We want to bring together experts, both researchers and practitioners, in privacy, data mining and its applications, and statistical database security.


There are many data mining situations where these privacy and security issues arise. A few examples are:

  • Sensitive data collection for data mining. In many situations the data contain sensitive information. Collecting such data is difficult because privacy concerns limit not only individuals' willingness to disclose truthful information, but also the willingness of data custodians (e.g., hospitals, insurance companies, government agencies) to share data. Are there privacy-preserving techniques that can perturb, obscure, sanitize, or anonymize data before they are collected, while still keeping the data useful for data mining and knowledge discovery?
  • Collaboration. Success in many endeavors is achieved through collaboration, team efforts, or partnerships, which usually call for extensive data sharing. For example, suppose two companies, each holding a large set of transaction data, want to mine their joint data set for mutual benefit; the confidential information in each data set, such as trade secrets, prevents extensive data sharing. Can they mine the joint data set while learning as little as possible about each other's confidential information?
  • Multi-national corporations. An individual country's legal system may prevent sharing of customer data between a subsidiary and its parent.
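The perturbation question raised in the first bullet can be illustrated with a classical technique, randomized response: each respondent reports the truth only with some known probability, so no individual record is trustworthy, yet the population rate can still be recovered. The probabilities and helper names below are illustrative assumptions for this sketch, not part of this call:

```python
import random

def randomized_response(true_answer: bool, p_truth: float = 0.75) -> bool:
    """Report the true answer with probability p_truth, otherwise flip it."""
    return true_answer if random.random() < p_truth else not true_answer

def estimate_true_rate(reports: list[bool], p_truth: float = 0.75) -> float:
    """Invert the perturbation: E[observed] = p*pi + (1-p)*(1-pi)."""
    observed = sum(reports) / len(reports)
    return (observed - (1 - p_truth)) / (2 * p_truth - 1)

# Simulate 100,000 respondents, 30% of whom hold the sensitive attribute.
random.seed(0)
truths = [random.random() < 0.30 for _ in range(100_000)]
reports = [randomized_response(t) for t in truths]
print(round(estimate_true_rate(reports), 2))  # close to 0.30
```

No single reported answer reveals an individual's truth with certainty, yet the aggregate estimate remains useful for mining, which is exactly the trade-off the bullet asks about.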

Workshop Content and Format

We plan a full day workshop, opening with a presentation by an invited speaker to set the stage. The rest of the day will consist of paper sessions with ample time for questions and breaks for discussion. The goal is to bring participants up to speed on the issues and solutions in this area, outline key research problems, and encourage collaborations to address these problems. Accepted papers will be published in ICDM workshop proceedings.

Topics of Interest

Papers are solicited that identify and propose technical solutions to such problems. Sample topics (by no means an exhaustive list) include:

  • The meaning and measurement of "privacy" in privacy-preserving data mining.
  • Learning from perturbed/obscured data.
  • Techniques for protecting confidentiality of sensitive information, including work on statistical databases, and obscuring or restricting data access to prevent violation of privacy and security policies.
  • Learning from distributed data sets with limits on sharing of information.
  • Hiding knowledge in data sets.
  • Underlying methods and techniques to support data mining while respecting privacy and security (e.g., secure multi-party computation).
  • The relationship between privacy and knowledge discovery, and algorithms for balancing privacy and knowledge discovery.
  • Use of data mining results to reconstruct private information, and corporate security when competitors apply KDDM and statistical tools to public data.
  • Use of anonymity techniques to protect privacy in data mining.
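One of the listed underlying methods, secure multi-party computation, can be sketched with a toy protocol: a secure sum via additive secret sharing, in which the parties learn the joint total but not each other's private inputs. All names and the modulus below are assumptions for this illustration only:

```python
import random

MODULUS = 2**61 - 1  # large prime; shares are uniform modulo this

def make_shares(value: int, n_parties: int) -> list[int]:
    """Split value into n additive shares that sum to value mod MODULUS."""
    shares = [random.randrange(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares

def secure_sum(private_values: list[int]) -> int:
    """Each party shares its value; only the total of all shares is revealed."""
    n = len(private_values)
    all_shares = [make_shares(v, n) for v in private_values]
    # Party i collects the i-th share from every party and publishes the subtotal;
    # each subtotal alone looks uniformly random.
    subtotals = [sum(shares[i] for shares in all_shares) % MODULUS
                 for i in range(n)]
    return sum(subtotals) % MODULUS

print(secure_sum([17, 4, 21]))  # prints 42; individual inputs stay hidden
```

Real privacy-preserving data mining protocols build far richer computations (joins, classifiers, association rules) on primitives of this kind, which is why secure multi-party computation appears among the topics above.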


Attendance is not limited to paper authors; we strongly encourage other interested parties to attend. One objective of the workshop is to promote interaction between researchers and those who have encountered security and privacy constraints on data mining in practice.

What and How to Submit

Papers should be at most 12 pages long in single-column format, 12-point font, with at least 1-inch margins on all sides. Please send them electronically (PDF or PostScript files) to wedu@ecs.syr.edu on or before August 29, 2003 (extended to September 5).

Important Dates

Intent to submit (appreciated, not required): August 22, 2003
Paper submission: September 5, 2003 (extended)
Notification of acceptance: September 26, 2003
Camera-ready papers: October 10, 2003
Workshop date: November 19, 2003

Organizers

  • Wenliang (Kevin) Du (Chair), Syracuse University
    Department of Electrical Engineering and Computer Science
    Syracuse, NY 13244 USA
    +1 315-443-9180, Fax: +1 315-443-1122
  • Chris Clifton (Co-Chair), Purdue University,
    Department of Computer Sciences
    West Lafayette, Indiana 47907-1398 USA
    +1 765-494-6005, Fax: +1 765-494-0739

Program Committee

  • Wesley Chu, University of California, Los Angeles
  • George Cybenko, Dartmouth College
  • Vladimir Estivill-Castro, Griffith University
  • Johannes Gehrke, Cornell University
  • Tom Johnsten, University of South Alabama
  • Hillol Kargupta, University of Maryland Baltimore County
  • Stanley R. M. Oliveira, Embrapa Information Technology
  • Benny Pinkas, Trusted Systems Lab, HP Labs
  • Vijay V. Raghavan, University of Louisiana Lafayette