Privacy Preserving Data Mining: Privacy Preserving Clustering by Data Transformation
| Content Provider | Semantic Scholar |
|---|---|
| Author | Plunkett, Noirin |
| Copyright Year | 2006 |
| Abstract | Data mining techniques raise a number of ethical issues, particularly regarding knowledge gained from the secondary use of data (use of data for a purpose other than that for which it was originally gathered). However, this knowledge has great potential if it can be gathered without the current fears of privacy breaches. Regardless of the difficulties in obtaining consent, data subjects increasingly have a legal right to have their privacy respected. Even where they don't, public opinion is generally unfavourable where privacy is regarded as having been breached. Therefore, data which has not been collected for the express purpose of research or data mining must be adapted in some way so that the privacy of the data subjects is maintained whilst the data is made available for research. In order to protect the privacy of an individual whilst retaining the ability to usefully mine the data available, two steps are needed, according to Oliveira and Zaiane [1]: anonymity (the removal of identifying information) and data transformation (to protect sensitive data which is not, on its own, identifying). Anonymity on its own is insufficient, because of the danger that non-identifying data from the anonymised source could be combined with other, non-anonymised sources to re-identify individuals. Data transformation prevents this, ensuring that the raw anonymised data cannot simply be mapped onto another data source. Data transformation methods which are suitable for statistical databases often do not perform well when combined with data clustering. In order to generate the same clusters from the original and the transformed data, new techniques are required which preserve the inter- and intra-cluster relations. Previous techniques have several issues. The data swapping techniques suggested by Estivill-Castro and Brankovic [2] involve a deliberate trade-off between privacy and precision, accepting partial disclosure in order to allow mining of detailed data. Partial disclosure occurs when substantial information is discovered about a confidential attribute: the attribute lies within or outside of a certain range, does not have a certain value, or has certain properties relative to another confidential attribute. Positive disclosure occurs when a user discovers the exact value of a confidential attribute, and is prevented by this model. |
| File Format | PDF HTM / HTML |
| Alternate Webpage(s) | http://www.nerdchic.net/ppdm/Summary.pdf |
| Language | English |
| Access Restriction | Open |
| Content Type | Text |
| Resource Type | Article |
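
The abstract's requirement that a transformation preserve inter- and intra-cluster relations can be illustrated with a small sketch. It assumes, purely for illustration, that the transformation is a 2-D rotation; the record above does not say which transformations are used, and the sample points and the helper names `rotate`, `pairwise_distances`, and `nearest_neighbours` are hypothetical. Because a rotation leaves every pairwise Euclidean distance unchanged, a distance-based clustering algorithm would see the same structure in both versions, even though the attribute values themselves differ.

```python
import math
from itertools import combinations


def rotate(points, theta):
    """Rotate 2-D points about the origin by theta radians (an isometry)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y, s * x + c * y) for x, y in points]


def pairwise_distances(points):
    """Euclidean distance for every unordered pair, in a fixed order."""
    return [math.dist(p, q) for p, q in combinations(points, 2)]


def nearest_neighbours(points):
    """For each point, the index of its nearest other point."""
    result = []
    for i, p in enumerate(points):
        others = [(math.dist(p, q), j) for j, q in enumerate(points) if j != i]
        result.append(min(others)[1])
    return result


# Hypothetical records: two numeric attributes per data subject (assumption).
original = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 8.5)]

# Illustrative transformation only: a 40-degree rotation.
transformed = rotate(original, math.radians(40))

# Every pairwise distance survives the transformation (up to rounding error).
distances_match = all(
    math.isclose(a, b, abs_tol=1e-9)
    for a, b in zip(pairwise_distances(original), pairwise_distances(transformed))
)

print(distances_match)
print(nearest_neighbours(original) == nearest_neighbours(transformed))
```

The sketch only demonstrates that cluster structure is preserved; it makes no claim about how much privacy a particular transformation provides.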