Create profiles of observed variables using two-step cluster analysis

create_profiles_cluster(df, ..., n_profiles, to_center = FALSE,
  to_scale = FALSE, distance_metric = "squared_euclidean",
  linkage = "complete")

Arguments

df	data.frame with two or more columns of continuous variables
...	unquoted variable names separated by commas
n_profiles	The specified number of profiles to be found for the clustering solution
to_center	Boolean (TRUE or FALSE) for whether to center the raw data to M = 0
to_scale	Boolean (TRUE or FALSE) for whether to scale the raw data to SD = 1
distance_metric	Distance metric to use for hierarchical clustering; "squared_euclidean" is the default, but more options are available (see ?dist)
linkage	Linkage method to use for hierarchical clustering; "complete" is the default, but more options are available (see ?hclust)
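
As a rough illustration of what the to_center, to_scale, distance_metric, and linkage arguments describe, the sketch below uses only base R functions (scale(), dist(), hclust()) on the built-in iris data. It is an assumption about the equivalent base R operations, not the function's internal code, and the object names are hypothetical.

```r
# Illustration only: base R equivalents of the preprocessing and clustering options.
x <- iris[, 1:4]                               # four continuous variables

# to_center = TRUE and to_scale = TRUE roughly correspond to standardizing:
x_std <- scale(x, center = TRUE, scale = TRUE) # each column now has M = 0, SD = 1

# A squared Euclidean distance can be obtained by squaring dist() output.
d_sq <- dist(x_std, method = "euclidean")^2

# Complete linkage is one of the methods documented in ?hclust.
hc <- hclust(d_sq, method = "complete")
```

Other values accepted by dist(method = ) and hclust(method = ) could be swapped in the same way.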

Value

A list containing the prepared data, the output from the hierarchical and k-means cluster analysis, the r-squared value, raw clustered data, processed clustered data of cluster centroids, and a ggplot object.
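
The page does not say how the r-squared value is computed. One common reading, assumed here, is the proportion of total variance accounted for by cluster membership (between-cluster sum of squares divided by total sum of squares). A minimal sketch with base R's kmeans() on the built-in iris data:

```r
# Illustration only: R-squared as between-cluster SS / total SS (an assumption).
x <- scale(iris[, 1:4])

set.seed(1)                                # k-means uses random starts here
km <- kmeans(x, centers = 3, nstart = 10)

r_squared <- km$betweenss / km$totss       # share of variance explained by clusters
round(r_squared, 3)
```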

Details

Creates a specified number of profiles of observed variables using a two-step cluster analysis: hierarchical clustering followed by k-means.
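
The two steps can be sketched in base R as follows. This assumes the usual form of the procedure, in which the hierarchical solution supplies starting centroids for k-means; the function's actual implementation may differ, and all object names here are hypothetical.

```r
# Illustration only: a two-step cluster analysis using base R and built-in data.
x <- scale(iris[, 1:4])
n_profiles <- 3

# Step 1: hierarchical clustering, cut into n_profiles groups.
hc <- hclust(dist(x)^2, method = "complete")
start_groups <- cutree(hc, k = n_profiles)

# Centroids of the hierarchical groups: one row per group, one column per variable.
start_centers <- apply(x, 2, function(col) tapply(col, start_groups, mean))

# Step 2: k-means, started from those centroids instead of random points.
km <- kmeans(x, centers = start_centers)

table(km$cluster)   # number of cases assigned to each profile
km$centers          # final profile centroids
```

Starting k-means from the hierarchical centroids makes the result deterministic for a given dataset, which is one reason the two methods are often combined.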

Examples

d <- pisaUSA15
m3 <- create_profiles_cluster(d,
                              broad_interest, enjoyment, instrumental_mot, self_efficacy,
                              n_profiles = 3)
#> Prepared data: Removed 354 incomplete cases
#> Hierarchical clustering carried out on: 5358 cases
#> K-means algorithm converged: 5 iterations
#> Clustered data: Using a 3 cluster solution
#> Calculated statistics: R-squared = 0.424
summary(m3)
#>  broad_interest    enjoyment     instrumental_mot self_efficacy  
#>  Min.   :1.000   Min.   :1.000   Min.   :1.000    Min.   :1.000  
#>  1st Qu.:2.200   1st Qu.:2.400   1st Qu.:1.500    1st Qu.:1.625  
#>  Median :2.800   Median :3.000   Median :2.000    Median :2.000  
#>  Mean   :2.655   Mean   :2.782   Mean   :2.072    Mean   :2.134  
#>  3rd Qu.:3.200   3rd Qu.:3.000   3rd Qu.:2.500    3rd Qu.:2.500  
#>  Max.   :5.000   Max.   :4.000   Max.   :4.000    Max.   :4.000  
#>     cluster     
#>  Min.   :1.000  
#>  1st Qu.:1.000  
#>  Median :2.000  
#>  Mean   :1.784  
#>  3rd Qu.:2.000  
#>  Max.   :3.000
