Create profiles of observed variables using two-step cluster analysis
create_profiles_cluster(df, ..., n_profiles, to_center = FALSE, to_scale = FALSE, distance_metric = "squared_euclidean", linkage = "complete")
Argument | Description |
---|---|
df | A data.frame with two or more columns of continuous variables |
... | unquoted variable names separated by commas |
n_profiles | The specified number of profiles to be found for the clustering solution |
to_center | Boolean (TRUE or FALSE) for whether to center the raw data so that each variable has M = 0 |
to_scale | Boolean (TRUE or FALSE) for whether to scale the raw data so that each variable has SD = 1 |
distance_metric | Distance metric to use for hierarchical clustering; "squared_euclidean" is the default, but more options are available (see ?dist) |
linkage | Linkage method to use for hierarchical clustering; "complete" is the default, but more options are available (see ?hclust) |
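The preprocessing and clustering arguments above can be related to base R functions. The following is a minimal sketch, not the package's own code: it uses the built-in `iris` data as a stand-in, and it assumes that `"squared_euclidean"` means squaring the Euclidean distances returned by `dist()`.

```r
# Illustrative only: base R counterparts of the arguments described above.
# How create_profiles_cluster() maps onto these calls internally is an assumption.
x <- as.matrix(iris[, 1:4])   # built-in data with four continuous columns

# to_center = TRUE and to_scale = TRUE would correspond to z-scoring each column
x_std <- scale(x, center = TRUE, scale = TRUE)

# Assumed meaning of "squared_euclidean": Euclidean distances, squared
d_sq <- dist(x_std, method = "euclidean")^2

# linkage = "complete" corresponds to the "complete" agglomeration method
hc <- hclust(d_sq, method = "complete")

round(colMeans(x_std), 10)    # column means after centering
apply(x_std, 2, sd)           # column standard deviations after scaling
```

Other distance measures are listed under `method` in `?dist`, and other agglomeration methods under `method` in `?hclust`.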
A list containing the prepared data, the output from the hierarchical and k-means cluster analysis, the r-squared value, raw clustered data, processed clustered data of cluster centroids, and a ggplot object.
Function to create a specified number of profiles of observed variables using a two-step (hierarchical and k-means) cluster analysis.
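The two-step idea can be sketched in base R as follows. This is a hedged illustration under stated assumptions rather than the function's actual implementation: it uses the built-in `iris` data, takes the centroids of the hierarchical solution as starting centers for k-means, and computes r-squared as between-cluster over total sum of squares, which is one common definition and may differ from what the package reports.

```r
# Two-step clustering sketch in base R (not the package's actual code).
x <- scale(as.matrix(iris[, 1:4]))   # standardized stand-in data
k <- 3                               # plays the role of n_profiles

# Step 1: hierarchical clustering, cut into k groups
hc <- hclust(dist(x)^2, method = "complete")
groups <- cutree(hc, k = k)

# Starting centroids: per-group column means (a k x 4 matrix)
start_centers <- apply(x, 2, function(col) tapply(col, groups, mean))

# Step 2: k-means initialized at the hierarchical centroids
km <- kmeans(x, centers = start_centers)

# Proportion of variance explained by the cluster solution (assumed definition)
r_squared <- km$betweenss / km$totss

table(km$cluster)   # cluster sizes
r_squared
```

Seeding k-means with the hierarchical centroids makes the result deterministic for a given data set, unlike k-means with random starting centers.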
d <- pisaUSA15
m3 <- create_profiles_cluster(d, broad_interest, enjoyment, instrumental_mot, self_efficacy, n_profiles = 3)
summary(m3)
#>  broad_interest    enjoyment     instrumental_mot self_efficacy
#>  Min.   :1.000   Min.   :1.000   Min.   :1.000    Min.   :1.000
#>  1st Qu.:2.200   1st Qu.:2.400   1st Qu.:1.500    1st Qu.:1.625
#>  Median :2.800   Median :3.000   Median :2.000    Median :2.000
#>  Mean   :2.655   Mean   :2.782   Mean   :2.072    Mean   :2.134
#>  3rd Qu.:3.200   3rd Qu.:3.000   3rd Qu.:2.500    3rd Qu.:2.500
#>  Max.   :5.000   Max.   :4.000   Max.   :4.000    Max.   :4.000
#>     cluster
#>  Min.   :1.000
#>  1st Qu.:1.000
#>  Median :2.000
#>  Mean   :1.784
#>  3rd Qu.:2.000
#>  Max.   :3.000