K-Means Clustering Issues on SPSS


whatwouldyoudo
Posts: 2
Joined: Tue Mar 21, 2017 3:27 am

K-Means Clustering Issues on SPSS

Postby whatwouldyoudo » Tue Mar 21, 2017 3:33 am

Hi all,

I've learned SPSS on my own, and don't have an extensive stats background, but I'm familiar with the basics.

Problem: I'm trying to create customer segmentation models based on survey data. I have 3 sets of data. I'm using Data Set A to build the cluster model and then applying that model to Data Sets B and C, but the results I'm getting don't make sense given the historical data we've collected.

Here's what I did:
1) Identified the optimal # of clusters using Ward's method in SPSS, plotting the scree diagram, and identifying the elbow point. Optimal # of clusters is 4.

2) Prepped Data Set A for SPSS. Identified statistically significant variables to use for analysis. Ran K-Means clustering on Data Set A to classify cases into each cluster. This is successful. Cases are clustered into 4 different groups.

3) Prepped Data Sets B and C for SPSS. Matched variables so they align with Data Set A. Ran K-Means clustering on Data Sets B and C using the saved Data Set A cluster model. This is where it gets weird. All cases in Data Sets B and C are clustered into 1 group. Based on historical data + Excel analysis, it just doesn't make sense for every case in both data sets to belong to 1 cluster.
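To make step 3 concrete, here is a minimal Python sketch of what "classify only" against saved centers amounts to: each new case goes to the nearest saved center, and the centers never move. This is not SPSS syntax, and the centers and cases are made-up illustration values, not the poster's data. It shows how every case can land in one cluster when the new data sits in a single region of the space the centers were built on.

```python
import math
from collections import Counter

def nearest_center(case, centers):
    """Return the index of the center closest to `case` (Euclidean distance)."""
    distances = [math.dist(case, c) for c in centers]
    return distances.index(min(distances))

def classify(cases, centers):
    """Assign every case to its nearest fixed center; centers are not updated."""
    return [nearest_center(case, centers) for case in cases]

# Hypothetical final centers from Data Set A (two variables, four clusters).
centers_a = [(1.0, 1.0), (1.0, 5.0), (5.0, 1.0), (5.0, 5.0)]

# Hypothetical Data Set B cases that all sit in one corner of A's space.
cases_b = [(4.6, 4.8), (5.2, 4.9), (4.9, 5.3), (5.1, 4.4)]

labels_b = classify(cases_b, centers_a)
cluster_sizes = Counter(labels_b)
print(cluster_sizes)
```

If a tally like this shows a single cluster, it is worth checking whether the new data is on the same scale and coding as the data the centers came from before suspecting the procedure itself.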

I've tried cleaning up the data and eliminating outliers that could skew the data. I've rerun the analysis multiple times. I've conferred with multiple people. It seems the process is solid.

What am I overlooking? Is it just because this data isn't prepped well enough to run K-means clustering on SPSS?
statman
Administrator
Posts: 2696
Joined: Tue Jun 12, 2007 12:08 pm
Location: Florida, USA

Re: K-Means Clustering Issues on SPSS

Postby statman » Tue Mar 21, 2017 12:51 pm

I am "guessing" sets B & C are of the same design as set A, and if so, have you run them through a new model just to see what happens? If you still get 1 cluster with the same set-up as used with set A, then I agree, all seems sound, but ..... Can you
See the note below

NOTE: Please read the Posting Guidelines and always tell us your OS, the SPSS version and information about your study and data!

Statman
Statistical Services
whatwouldyoudo
Posts: 2
Joined: Tue Mar 21, 2017 3:27 am

Re: K-Means Clustering Issues on SPSS

Postby whatwouldyoudo » Tue Mar 21, 2017 5:21 pm

statman wrote:
Tue Mar 21, 2017 12:51 pm
I am "guessing" sets B & C are of the same design as set A, and if so, have you run them through a new model just to see what happens? If you still get 1 cluster with the same set-up as used with set A, then I agree, all seems sound, but ..... Can you
Hi statman,

I did try using Data Set B as the cluster model, for example, and running K-means clustering on Data Sets A & C through that model, and it worked! The other data sets were clustered into the 4 groups. I'm really at a loss as to why using Data Set A isn't working. For reference, Data Set A is based on a survey of the general population, and Data Sets B and C are based on survey results from our customers.
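One quick check on the general-population vs. customer difference is to see how far the customer means sit from the general-population means, measured in general-population standard deviations. The Python sketch below does that with the standard library; the variable names and the numbers are hypothetical, not taken from the actual surveys.

```python
import statistics

def standardized_shift(reference, other):
    """Per-variable difference in means between `other` and `reference`,
    expressed in units of the reference standard deviation."""
    shifts = []
    for ref_col, other_col in zip(zip(*reference), zip(*other)):
        ref_mean = statistics.mean(ref_col)
        ref_sd = statistics.stdev(ref_col)
        shifts.append((statistics.mean(other_col) - ref_mean) / ref_sd)
    return shifts

# Hypothetical rows: (satisfaction, spend) scores on a 1-5 scale.
general_population = [(1, 2), (2, 1), (3, 3), (4, 5), (5, 4)]
customers = [(4, 5), (5, 4), (5, 5), (4, 4)]

shifts = standardized_shift(general_population, customers)
print([round(s, 2) for s in shifts])
```

Large shifts in the same direction on most variables would suggest the customers sit in one region of the general-population space, which would be consistent with all of them falling into a single Data Set A cluster.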

FYI - I'm using SPSS version 24 on Mac.
statman
Administrator
Posts: 2696
Joined: Tue Jun 12, 2007 12:08 pm
Location: Florida, USA

Re: K-Means Clustering Issues on SPSS

Postby statman » Tue Mar 21, 2017 7:10 pm

Data Set A is based on a survey of the general population. And Data Sets B and C are based on survey results from our customers
As simple as it might be, this difference might be the issue. Are sets B & C weighted? (I don't recall if CA uses the weights, so if they are, check on that. If they are and CA can apply the weights, try that.)
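For illustration only, this is how case weights would change a cluster center if a procedure honored them. It is a generic weighted mean in Python, with made-up cases and weights; it makes no claim about what SPSS actually does with weights.

```python
def weighted_centroid(cases, weights):
    """Weighted mean of the cases, one coordinate per variable."""
    total = sum(weights)
    n_vars = len(cases[0])
    return tuple(
        sum(w * case[j] for case, w in zip(cases, weights)) / total
        for j in range(n_vars)
    )

# Hypothetical cases with two variables each.
cases = [(1.0, 2.0), (3.0, 4.0), (5.0, 6.0)]

unweighted = weighted_centroid(cases, [1, 1, 1])  # every case counts equally
weighted = weighted_centroid(cases, [1, 1, 4])    # last case counts four times

print(unweighted, weighted)
```

If the weighted and unweighted centers differ noticeably on the real data, whether the weights are applied could change which cluster a case is nearest to.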

Mac - That's probably the real "odd ball" reason :D
