This book provides a solid practical guidance to summarize, visualize and interpret the most important information in a large multivariate data sets, using principal component methods (PCMs) in R. The visualization is based on the factoextra R package that we developed for creating easily beautiful ggplot2-based graphs from the output of PCMs. This book contains 4 parts. Part I provides a quick introduction to R and presents the key features of FactoMineR and factoextra. Part II describes classical principal component methods to analyze data sets containing, predominantly, either continuous or categorical variables. These methods include: Principal Component Analysis (PCA, for continuous variables), simple correspondence analysis (CA, for large contingency tables formed by two categorical variables) and Multiple CA (MCA, for a data set with more than 2 categorical variables). In part III, you’ll learn advanced methods for analyzing a data set containing a mix of variables (continuous and categorical) structured or not into groups: Factor Analysis of Mixed Data (FAMD) and Multiple Factor Analysis (MFA). Part IV covers hierarchical clustering on principal components (HCPC), which is useful for performing clustering with a data set containing only categorical variables or with a mixed data of categorical and continuous variables.
Advertisements