statistical modelling (cluster analysis) project index statistical modelling (diagnostics)

English or languish - Probing the ramifications
of Hong Kong's language policy

CLUSTER ANALYSIS
Proximity Measures

Euclidean Distance Measures

The Euclidean distance between any two observations j and k, each described by n variables, is given by

$$\Delta_{jk} = \left[ \sum_{i=1}^{n} (X_{ij} - X_{ik})^2 \right]^{1/2}$$

where i = 1 to n
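
As a minimal sketch of the formula above, the distance can be computed directly from two lists of values, one per observation. The function name `euclidean_distance` and the sample observations are illustrative assumptions, not part of the original notes.

```python
import math


def euclidean_distance(obs_j, obs_k):
    """Euclidean distance between two observations measured on the same n variables."""
    if len(obs_j) != len(obs_k):
        raise ValueError("both observations must have the same number of variables")
    # Sum the squared differences over the variables i = 1..n, then take the square root.
    return math.sqrt(sum((x_ij - x_ik) ** 2 for x_ij, x_ik in zip(obs_j, obs_k)))


# Hypothetical observations j and k, each described by n = 3 variables.
obs_j = [1.0, 2.0, 3.0]
obs_k = [4.0, 6.0, 3.0]
print(euclidean_distance(obs_j, obs_k))  # 5.0
```

Here the squared differences are 9, 16 and 0, so the distance is the square root of 25.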


Correlation measures

Inverted factor analysis, or factor Q-analysis, is used to group observations according to their shared variance across the characteristic variable set. Although it is a popular clustering technique, factor Q-analysis suffers from several important shortcomings:


Similarity measures
Similarity measures are often used to group observations when the characteristic variables are non-metric. There are two standard approaches to overcoming the absence of metric data:



Mixed scaling - The best way to avoid problems of mixed units is to design the original experiment in such a way that all variables of the characteristic variable set are measured in the same way. Short of this, non-conforming elements of the same variable set must be transformed.
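
The notes do not say which transformation to apply. One common choice, assumed here purely for illustration, is to standardise each variable to z-scores (mean 0, standard deviation 1) so that variables recorded in different units contribute comparably to a distance measure. The function name `standardise` and the two sample variables are hypothetical.

```python
import math


def standardise(values):
    """Rescale one variable to mean 0 and standard deviation 1 (z-scores)."""
    n = len(values)
    mean = sum(values) / n
    # Population standard deviation (divide by n); this is an assumption of the sketch.
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if sd == 0:
        raise ValueError("a constant variable cannot be standardised")
    return [(v - mean) / sd for v in values]


# Hypothetical variables measured on very different scales.
income = [10000.0, 20000.0, 30000.0]
years_of_english = [2.0, 4.0, 6.0]
print(standardise(income))
print(standardise(years_of_english))
```

After standardisation the two variables take the same values up to rounding, so neither dominates the distance simply because of its units.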
