Statistical disclosure control for microdata / Matthias Templ

By: Templ, Matthias
Material type: Book
Publisher: Cham, Switzerland : Springer, c2017.
Description: xix, 287 p. : ill. ; 25 cm.
ISBN: 9783319502700
Subject(s): Mathematical statistics -- Data processing | R (Computer program language) | MATHEMATICS / Applied | MATHEMATICS / Probability & Statistics / General
DDC classification: 519.5 TE ST
Summary:
This book on statistical disclosure control presents the theory, applications and software implementation of the traditional approach to (micro)data anonymization, including data perturbation methods, disclosure risk, data utility, information loss and methods for simulating synthetic data. Introducing readers to the R packages sdcMicro and simPop, the book also features numerous examples and exercises with solutions, as well as case studies with real-world data, accompanied by the underlying R code to allow readers to reproduce all results.
Item type: REGULAR
Home library: University of Wollongong in Dubai (Main Collection)
Call number: 519.5 TE ST
Status: Available
Barcode: T0056683
Total holds: 0

Preface; Overview of the Book; Acknowledgements; Contents; Acronyms
1 Software; 1.1 Prerequisites; 1.1.1 Installation and Updates; 1.1.2 Install sdcMicro and Its Browser-Based Point-and-Click App; 1.1.3 Updating the SDC Tools; 1.1.4 Help; 1.1.5 The R Workspace and the Working Directory; 1.1.6 Data Types; 1.1.7 Generic Functions, Methods and Classes; 1.2 Brief Overview on SDC Software Tools; 1.3 Differences Between SDC Tools; 1.4 Working with sdcMicro; 1.4.1 General Information About sdcMicro; 1.4.2 S4 Class Structure of the sdcMicro Package; 1.4.3 Utility Functions; 1.4.4 Reporting Facilities; 1.5 The Point-and-Click App sdcApp; 1.6 The simPop Package; References
2 Basic Concepts; 2.1 Types of Variables; 2.1.1 Non-confidential Variables; 2.1.2 Identifying Variables; 2.1.3 Sensitive Variables; 2.1.4 Linked Variables; 2.1.5 Sampling Weights; 2.1.6 Hierarchies, Clusters and Strata; 2.1.7 Categorical Versus Continuous Variables; 2.2 Types of Disclosure; 2.2.1 Identity Disclosure; 2.2.2 Attribute Disclosure; 2.2.3 Inferential Disclosure; 2.3 Disclosure Risk Versus Information Loss and Data Utility; 2.4 Release Types; 2.4.1 Public Use Files (PUF); 2.4.2 Scientific Use Files (SUF); 2.4.3 Controlled Research Data Center; 2.4.4 Remote Execution; 2.4.5 Remote Access; References
3 Disclosure Risk; 3.1 Introduction; 3.2 Frequency Counts; 3.2.1 The Number of Cells of Equal Size; 3.2.2 Frequency Counts with Missing Values; 3.2.3 Sample Frequencies in sdcMicro; 3.3 Principles of k-anonymity and l-diversity; 3.3.1 Simplified Estimation of Population Frequency Counts; 3.4 Special Uniques Detection Algorithm (SUDA); 3.4.1 Minimal Sample Uniqueness; 3.4.2 SUDA Scores; 3.4.3 SUDA DIS Scores; 3.4.4 SUDA in sdcMicro; 3.5 The Individual Risk Approach; 3.5.1 The Benedetti-Franconi Model for Risk Estimation; 3.6 Disclosure Risks for Hierarchical Data; 3.7 Measuring Global Risks; 3.7.1 Measuring the Global Risk Using Log-Linear Models; 3.7.2 Standard Log-Linear Model; 3.7.3 Clogg and Eliason Method; 3.7.4 Pseudo Maximum Likelihood Method; 3.7.5 Weighted Log-Linear Model; 3.8 Application of the Log-Linear Models; 3.9 Global Risk Measures; 3.10 Quality of the Risk Measures Under Different Sampling Designs; 3.11 Disclosure Risk for Continuous Variables; 3.12 Special Treatment of Outliers When Calculating Disclosure Risks; References
4 Methods for Data Perturbation; 4.1 Kind of Methods; 4.2 Methods for Categorical Key Variables; 4.2.1 Recoding; 4.2.2 Local Suppression; 4.2.3 Post-randomization Method (PRAM); 4.3 Methods for Continuous Key Variables; 4.3.1 Microaggregation; 4.3.2 Noise Addition; 4.3.3 Shuffling; References
5 Data Utility and Information Loss; 5.1 Element-Wise Comparisons; 5.1.1 Comparing Missing Values; 5.1.2 Comparing Aggregated Information; 5.2 Element-Wise Measures for Continuous Variables; 5.2.1 Element-Wise Comparisons of Mixed Scaled Variables; 5.3 Entropy; 5.4 Propensity Score Methods
