Estimate parameters for profiles for a specific solution

estimate_profiles(df, ..., n_profiles, variances = "equal",
  covariances = "zero", to_return = "tibble", model = NULL,
  center_raw_data = FALSE, scale_raw_data = FALSE,
  return_posterior_probs = TRUE, return_orig_df = FALSE,
  prior_control = FALSE, print_which_stats = "some")

Arguments

df

data.frame with two or more columns with continuous variables

...

unquoted variable names separated by commas

n_profiles

the number of profiles (or mixture components) to be estimated

variances

how the variable variances are estimated; defaults to "equal" (to be constant across profiles); other option is "varying" (to be varying across profiles)

covariances

how the variable covariances are estimated; defaults to "zero" (to not be estimated, i.e. for the covariance matrix to be diagonal); other options are "varying" (to be varying across profiles) and "equal" (to be constant across profiles)

to_return

character string, either "tibble" (or "data.frame") or "mclust"; if "tibble" is selected, the data with a column for profiles is returned; if "mclust" is selected, output of class mclust is returned; defaults to "tibble"

model

which model to estimate (DEPRECATED; use variances and covariances instead)

center_raw_data

logical for whether to center (M = 0) the raw data (before clustering); defaults to FALSE

scale_raw_data

logical for whether to scale (SD = 1) the raw data (before clustering); defaults to FALSE

return_posterior_probs

TRUE or FALSE (only applicable if to_return == "tibble"); whether to include posterior probabilities in addition to the posterior profile classification; defaults to TRUE

return_orig_df

TRUE or FALSE; if TRUE, the entire data.frame is returned; if FALSE, only the variables used in the model are returned; defaults to FALSE

prior_control

logical for whether to include a regularizing prior; defaults to FALSE

print_which_stats

if set to "some", prints (as a message) the log-likelihood, BIC, and entropy; if set to "all", prints (as a message) all information criteria and other statistics about the model; if set to any other value, nothing is printed
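
To make the center_raw_data and scale_raw_data options concrete, the sketch below uses base R's scale() on the iris measurements. It is an assumption that estimate_profiles() preprocesses the data in an equivalent way; the code only illustrates what centering (mean of 0) and scaling (SD of 1) mean for the raw data.

```r
# Illustration only: base R preprocessing, not the internals of estimate_profiles().
x <- iris[, c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")]

# Centering subtracts each column mean, so every column mean becomes 0.
centered <- scale(x, center = TRUE, scale = FALSE)

# Centering and scaling together also divide by each column SD,
# so every column SD becomes 1.
standardized <- scale(x, center = TRUE, scale = TRUE)

round(colMeans(centered), 10)
round(apply(standardized, 2, sd), 10)
```

Standardizing in this way puts all variables on a common metric, which can matter when the variables are measured in different units.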

Value

either a tibble (the data with a column for the profile classification and, optionally, the posterior probabilities) or an object of class mclust, depending on to_return

Details

Creates profiles (or estimates of the mixture components) for a specific mclust model, defined by the number of mixture components (n_profiles) and the structure of the residual covariance matrix (variances and covariances).
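
As a rough illustration of what the variances and covariances settings imply for the residual covariance matrix, the sketch below builds example matrices by hand for two variables. All numbers and object names (sigma_shared, sigma_by_profile, sigma_full) are hypothetical and are not estimates from any model.

```r
# Hypothetical values for two variables; not output from estimate_profiles().

# variances = "equal", covariances = "zero":
# one diagonal matrix shared by every profile.
sigma_shared <- diag(c(0.25, 0.10))

# variances = "varying", covariances = "zero":
# still diagonal, but each profile has its own variances.
sigma_by_profile <- list(
  profile_1 = diag(c(0.25, 0.10)),
  profile_2 = diag(c(0.60, 0.30))
)

# covariances = "equal" or "varying":
# the off-diagonal (covariance) terms are estimated instead of fixed to 0.
sigma_full <- matrix(c(0.25, 0.05,
                       0.05, 0.10), nrow = 2, byrow = TRUE)

sigma_shared
sigma_full
```

Freeing more of these terms makes the model more flexible but increases the number of parameters to estimate.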

Examples

estimate_profiles(iris, Sepal.Length, Sepal.Width, Petal.Length, Petal.Width, n_profiles = 3)
#> Fit Equal variances and covariances fixed to 0 (model 1) model with 3 profiles.
#> LogLik is 361.429
#> BIC is 813.05
#> Entropy is 0.979
#> # A tibble: 150 x 6
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width profile posterior_prob
#>           <dbl>       <dbl>        <dbl>       <dbl> <fct>            <dbl>
#>  1          5.1         3.5          1.4         0.2 1                    1
#>  2          4.9         3            1.4         0.2 1                    1
#>  3          4.7         3.2          1.3         0.2 1                    1
#>  4          4.6         3.1          1.5         0.2 1                    1
#>  5          5           3.6          1.4         0.2 1                    1
#>  6          5.4         3.9          1.7         0.4 1                    1
#>  7          4.6         3.4          1.4         0.3 1                    1
#>  8          5           3.4          1.5         0.2 1                    1
#>  9          4.4         2.9          1.4         0.2 1                    1
#> 10          4.9         3.1          1.5         0.1 1                    1
#> # ... with 140 more rows