Hi all,

I've learned SPSS on my own, and don't have an extensive stats background, but I'm familiar with the basics.

Problem: I'm trying to create customer segmentation models based on survey data. I have 3 sets of data. I'm treating the cluster model built on Data Set A as the existing model and applying it to Data Sets B and C, but the results I'm getting don't make sense given the historical data we've collected.

Here's what I did:

1) Identified the optimal number of clusters using Ward's method in SPSS: plotted the scree plot and identified the elbow point. The optimal number of clusters is 4.

2) Prepped Data Set A for SPSS. Identified statistically significant variables to use for analysis. Ran K-Means clustering on Data Set A to classify cases into clusters. This worked: cases were clustered into 4 different groups.

3) Prepped Data Sets B and C for SPSS. Matched the variables so they align with Data Set A. Ran K-Means clustering on Data Sets B and C using the saved Data Set A cluster model. This is where it gets weird: all cases in Data Sets B and C are assigned to a single cluster. Based on historical data and Excel analysis, it doesn't make sense for every case in both data sets to belong to one cluster.
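
To make the three steps above concrete, here is a rough sketch of what I understand them to be doing, written in plain Python rather than SPSS syntax. Everything in it is an assumption for illustration: the data are made-up two-variable blobs (not my survey data), the function names are hypothetical, the "elbow" is picked by a crude largest-relative-jump rule instead of eyeballing a plot, and the farthest-first seeding in the K-means step is my own choice, not necessarily what SPSS does internally.

```python
import math
import random
from collections import Counter

def ward_merge_costs(points):
    """Step 1: agglomerate with Ward's criterion; return the cost of each merge, in order."""
    clusters = [(1, tuple(p)) for p in points]  # each cluster is (size, centroid)
    costs = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                (na, ca), (nb, cb) = clusters[i], clusters[j]
                # Increase in within-cluster sum of squares if i and j were merged.
                cost = na * nb / (na + nb) * math.dist(ca, cb) ** 2
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        cost, i, j = best
        (na, ca), (nb, cb) = clusters[i], clusters[j]
        merged = (na + nb, tuple((na * x + nb * y) / (na + nb) for x, y in zip(ca, cb)))
        clusters = [c for m, c in enumerate(clusters) if m not in (i, j)] + [merged]
        costs.append(cost)
    return costs

def elbow_k(costs, max_k=8):
    """Crude stand-in for reading a scree plot: largest relative jump in merge cost."""
    desc = costs[::-1][:max_k]  # desc[m] = cost of going from m + 2 clusters to m + 1
    ratios = [desc[m] / desc[m + 1] for m in range(len(desc) - 1)]
    return ratios.index(max(ratios)) + 2

def nearest(case, centers):
    """Index of the center closest to the case (Euclidean distance)."""
    return min(range(len(centers)), key=lambda j: math.dist(case, centers[j]))

def kmeans(points, k, max_iter=100):
    """Step 2: plain Lloyd's K-means with farthest-first seeding (an assumption of this sketch)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = None
    for _ in range(max_iter):
        new_labels = [nearest(p, centers) for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster ends up empty
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return centers, labels

def make_blobs(rng, n_per_blob):
    """Made-up 2-variable scores scattered around four made-up segment profiles."""
    profiles = [(0, 0), (10, 0), (0, 10), (10, 10)]
    return [(rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
            for cx, cy in profiles for _ in range(n_per_blob)]

rng = random.Random(0)
data_a = make_blobs(rng, 15)  # stands in for Data Set A
data_b = make_blobs(rng, 10)  # stands in for Data Set B, same variables and scale as A

k = elbow_k(ward_merge_costs(data_a))
centers_a, labels_a = kmeans(data_a, k)
# Step 3: classify B against the saved A centers without updating the centers.
labels_b = [nearest(case, centers_a) for case in data_b]

print("k from elbow:", k)
print("cluster sizes in A:", sorted(Counter(labels_a).values()))
print("cluster sizes in B:", sorted(Counter(labels_b).values()))
```

On this toy data the stand-in for Data Set B spreads across all four clusters, which is the kind of spread I expected from the real Data Sets B and C, rather than every case landing in one cluster.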

I've tried cleaning up the data and eliminating outliers that could skew the results. I've rerun the analysis multiple times and conferred with multiple people, and the process seems solid.

What am I overlooking? Is it just that the data isn't prepped well enough to run K-means clustering in SPSS?