Create profiles of observed variables using two-step cluster analysis

create_profiles_cluster(df, ..., n_profiles, to_center = FALSE,
  to_scale = FALSE, distance_metric = "squared_euclidean",
  linkage = "complete")

Arguments

df

A data frame with two or more columns of continuous variables

...

Unquoted variable names, separated by commas

n_profiles

The specified number of profiles to be found for the clustering solution

to_center

Boolean (TRUE or FALSE) for whether to center the raw data so that each variable has M = 0

to_scale

Boolean (TRUE or FALSE) for whether to scale the raw data so that each variable has SD = 1

distance_metric

Distance metric to use for hierarchical clustering; "squared_euclidean" is the default, but more options are available (see ?dist)

linkage

Linkage method to use for hierarchical clustering; "complete" is the default, but more options are available (see ?hclust)
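
As a rough illustration of what to_center and to_scale describe, the sketch below uses base R's scale() on the built-in mtcars data. The dataset and the variable names x, centered, and standardized are assumptions chosen for this example; it does not show the package's internal code.

```r
# Illustrative only: base R scale() on a built-in dataset,
# not the internals of create_profiles_cluster()
x <- as.matrix(mtcars[, c("mpg", "hp", "wt")])

# Centering only: each column has M = 0, original SD is kept
centered <- scale(x, center = TRUE, scale = FALSE)

# Centering and scaling: each column has M = 0 and SD = 1
standardized <- scale(x, center = TRUE, scale = TRUE)

round(colMeans(centered), 10)   # column means after centering
apply(standardized, 2, sd)      # column SDs after standardizing
```

Standardizing is usually worth considering when the variables are measured on different scales, because distance-based clustering otherwise gives more weight to variables with larger variances.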

Value

A list containing the prepared data, the output from the hierarchical and k-means cluster analyses, the R-squared value, the raw clustered data, the processed clustered data of cluster centroids, and a ggplot object.

Details

Function to create a specified number of profiles of observed variables using a two-step (hierarchical and k-means) cluster analysis.
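
The general two-step idea can be sketched in base R as follows. This is a minimal sketch under stated assumptions, not the function's actual implementation: it uses the built-in iris data, the object names are hypothetical, and the R-squared shown is the common between-cluster over total sum-of-squares ratio, which may or may not match how the package computes its reported value.

```r
# Sketch of a two-step (hierarchical, then k-means) cluster analysis.
# Assumption: mirrors the general technique, not the package's internals.
x <- scale(as.matrix(iris[, 1:4]))  # built-in data, standardized
k <- 3                              # plays the role of n_profiles

# Step 1: hierarchical clustering on squared Euclidean distances
d <- dist(x, method = "euclidean")^2
hc <- hclust(d, method = "complete")
start_groups <- cutree(hc, k = k)

# Centroids of the hierarchical solution (k rows, one column per variable)
start_centers <- apply(x, 2, function(col) tapply(col, start_groups, mean))

# Step 2: k-means refines the assignment, starting from those centroids
km <- kmeans(x, centers = start_centers)

# Proportion of total variance accounted for by the cluster solution
r_squared <- km$betweenss / km$totss

table(km$cluster)
round(r_squared, 3)
```

Seeding k-means with the hierarchical centroids makes the result deterministic for a given dataset, which avoids the run-to-run variation of random starting points.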

Examples

d <- pisaUSA15
m3 <- create_profiles_cluster(d, broad_interest, enjoyment, instrumental_mot, self_efficacy, n_profiles = 3)
#> Prepared data: Removed 354 incomplete cases
#> Hierarchical clustering carried out on: 5358 cases
#> K-means algorithm converged: 5 iterations
#> Clustered data: Using a 3 cluster solution
#> Calculated statistics: R-squared = 0.424
summary(m3)
#>  broad_interest    enjoyment     instrumental_mot self_efficacy
#>  Min.   :1.000   Min.   :1.000   Min.   :1.000    Min.   :1.000
#>  1st Qu.:2.200   1st Qu.:2.400   1st Qu.:1.500    1st Qu.:1.625
#>  Median :2.800   Median :3.000   Median :2.000    Median :2.000
#>  Mean   :2.655   Mean   :2.782   Mean   :2.072    Mean   :2.134
#>  3rd Qu.:3.200   3rd Qu.:3.000   3rd Qu.:2.500    3rd Qu.:2.500
#>  Max.   :5.000   Max.   :4.000   Max.   :4.000    Max.   :4.000
#>     cluster
#>  Min.   :1.000
#>  1st Qu.:1.000
#>  Median :2.000
#>  Mean   :1.784
#>  3rd Qu.:2.000
#>  Max.   :3.000