LIBRARY OF CONGRESS CONTROL NUMBER |
LC control number |
2017937274 |
INTERNATIONAL STANDARD BOOK NUMBER |
International Standard Book Number |
9783319502700 |
DEWEY DECIMAL CLASSIFICATION NUMBER |
Call number |
519.5 TE ST |
MAIN ENTRY--PERSONAL NAME |
Authors |
Templ, Matthias |
TITLE STATEMENT |
Title |
Statistical disclosure control for microdata : |
Subtitle |
methods and applications in R |
Statement of responsibility, etc |
Matthias Templ |
PUBLICATION, DISTRIBUTION, ETC. (IMPRINT) |
Place of publication |
Cham, Switzerland : |
Publisher |
Springer, |
Date |
c2017. |
PHYSICAL DESCRIPTION |
Extent |
xix, 287 p. : |
Other Details |
ill. ; |
Size |
25 cm. |
CONTENTS |
Contents |
Preface; Overview of the Book; Acknowledgements; Contents; Acronyms; 1 Software; 1.1 Prerequisites; 1.1.1 Installation and Updates; 1.1.2 Install sdcMicro and Its Browser-Based Point-and-Click App; 1.1.3 Updating the SDC Tools; 1.1.4 Help; 1.1.5 The R Workspace and the Working Directory; 1.1.6 Data Types; 1.1.7 Generic Functions, Methods and Classes; 1.2 Brief Overview on SDC Software Tools; 1.3 Differences Between SDC Tools; 1.4 Working with sdcMicro; 1.4.1 General Information About sdcMicro; 1.4.2 S4 Class Structure of the sdcMicro Package; 1.4.3 Utility Functions; 1.4.4 Reporting Facilities; 1.5 The Point-and-Click App sdcApp; 1.6 The simPop package; References; 2 Basic Concepts; 2.1 Types of Variables; 2.1.1 Non-confidential Variables; 2.1.2 Identifying Variables; 2.1.3 Sensitive Variables; 2.1.4 Linked Variables; 2.1.5 Sampling Weights; 2.1.6 Hierarchies, Clusters and Strata; 2.1.7 Categorical Versus Continuous Variables; 2.2 Types of Disclosure; 2.2.1 Identity Disclosure; 2.2.2 Attribute Disclosure; 2.2.3 Inferential Disclosure; 2.3 Disclosure Risk Versus Information Loss and Data Utility; 2.4 Release Types; 2.4.1 Public Use Files (PUF); 2.4.2 Scientific Use Files (SUF); 2.4.3 Controlled Research Data Center; 2.4.4 Remote Execution; 2.4.5 Remote Access; References; 3 Disclosure Risk; 3.1 Introduction; 3.2 Frequency Counts; 3.2.1 The Number of Cells of Equal Size; 3.2.2 Frequency Counts with Missing Values; 3.2.3 Sample Frequencies in sdcMicro; 3.3 Principles of k-anonymity and l-diversity; 3.3.1 Simplified Estimation of Population Frequency Counts; 3.4 Special Uniques Detection Algorithm (SUDA); 3.4.1 Minimal Sample Uniqueness; 3.4.2 SUDA Scores; 3.4.3 SUDA DIS Scores; 3.4.4 SUDA in sdcMicro; 3.5 The Individual Risk Approach; 3.5.1 The Benedetti-Franconi Model for Risk Estimation; 3.6 Disclosure Risks for Hierarchical Data; 3.7 Measuring Global Risks; 3.7.1 Measuring the Global Risk Using Log-Linear Models; 3.7.2 Standard Log-Linear Model; 3.7.3 Clogg and Eliason Method; 3.7.4 Pseudo Maximum Likelihood Method; 3.7.5 Weighted Log-Linear Model; 3.8 Application of the Log-Linear Models; 3.9 Global Risk Measures; 3.10 Quality of the Risk Measures Under Different Sampling Designs; 3.11 Disclosure Risk for Continuous Variables; 3.12 Special Treatment of Outliers When Calculating Disclosure Risks; References; 4 Methods for Data Perturbation; 4.1 Kind of Methods; 4.2 Methods for Categorical Key Variables; 4.2.1 Recoding; 4.2.2 Local Suppression; 4.2.3 Post-randomization Method (PRAM); 4.3 Methods for Continuous Key Variables; 4.3.1 Microaggregation; 4.3.2 Noise Addition; 4.3.3 Shuffling; References; 5 Data Utility and Information Loss; 5.1 Element-Wise Comparisons; 5.1.1 Comparing Missing Values; 5.1.2 Comparing Aggregated Information; 5.2 Element-Wise Measures for Continuous Variables; 5.2.1 Element-Wise Comparisons of Mixed Scaled Variables; 5.3 Entropy; 5.4 Propensity Score Methods |
SUMMARY |
Summary |
This book on statistical disclosure control presents the theory, applications and software implementation of the traditional approach to (micro)data anonymization, including data perturbation methods, disclosure risk, data utility, information loss and methods for simulating synthetic data. Introducing readers to the R packages sdcMicro and simPop, the book also features numerous examples and exercises with solutions, as well as case studies with real-world data, accompanied by the underlying R code to allow readers to reproduce all results. |
SUBJECT ADDED ENTRY--TOPICAL TERM |
Topical Heading |
Mathematical statistics |
General subdivision |
Data processing |
SUBJECT ADDED ENTRY--TOPICAL TERM |
Topical Heading |
R (Computer program language) |
SUBJECT ADDED ENTRY--TOPICAL TERM |
Topical Heading |
MATHEMATICS / Applied |
SUBJECT ADDED ENTRY--TOPICAL TERM |
Topical Heading |
MATHEMATICS / Probability & Statistics / General |
ELECTRONIC LOCATION AND ACCESS |
Uniform Resource Identifier |
https://uowd.box.com/s/w3nr0q6amitab7ji3655n4lf5ia4tlic |
Public note |
Location Map |