Binning scatter plot in SPSS 22

Post by puser » Mon Sep 14, 2015 10:44 am

Hello,

I've been trying to bin a scatter plot in SPSS 22.

Here are the details. I have a questionnaire data set from about 500 participants. The scatter plot shows the participants' age against their motivation (rated on a scale from 0 to 16). Say three of my participants are students of the same age who all have a motivation level of 14. Those three points then fall on top of each other on the plot and look like a single point, which misrepresents the data. To avoid that, I would like to color the points according to the count: points corresponding to a single respondent would be grey, points corresponding to two respondents a darker shade of grey, and so on.

I have found an SPSS v. 12 tutorial that explains how to do this (via the Points Bin tab in the Properties window). However, I cannot figure out how to do it in SPSS 22; the Points Bin tab is apparently no longer available.

Please help!

Nikolai

Re: Binning scatter plot in SPSS 22

Post by JonPedersen » Thu Sep 17, 2015 8:10 am

Hi,
I'm not sure it can be done directly. There are two indirect ways.
First: jittering.
Normally, the SPSS graph command generates GPL code like the second block below (assuming two variables, a and b, with the contents shown in the first block):

Code:

a       b
1,00	1,00
1,00	2,00
1,00	3,00
2,00	2,00
2,00	2,00
2,00	2,00
4,00	4,00
4,00	4,00
5,00	5,00
6,00	7,00
7,00	8,00
7,00	8,00
7,00	9,00

Code:

GGRAPH
  /GRAPHDATASET NAME="graphdataset" VARIABLES=a b MISSING=LISTWISE REPORTMISSING=NO
  /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
  SOURCE: s=userSource(id("graphdataset"))
  DATA: a=col(source(s), name("a"), unit.category())
  DATA: b=col(source(s), name("b"), unit.category())
  GUIDE: axis(dim(1), label("a"))
  GUIDE: axis(dim(2), label("b"))
  ELEMENT: point(position(a*b))
END GPL.
In this code, change ELEMENT: point(position(a*b)) to
ELEMENT: point.jitter(position(a*b))
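For anyone who wants to try the jittered version without retyping anything, here is a self-contained sketch. The DATA LIST step is an addition for illustration only (it recreates the a and b values listed above, typed without decimals); the GGRAPH part is the code above with just the ELEMENT line changed. The jitter offsets are random, so the plot will look slightly different on each run.

```spss
* Recreate the example data (same a and b values as listed above).
DATA LIST FREE / a b.
BEGIN DATA
1 1
1 2
1 3
2 2
2 2
2 2
4 4
4 4
5 5
6 7
7 8
7 8
7 9
END DATA.

* Same chart as above, but point.jitter spreads tied points apart.
GGRAPH
  /GRAPHDATASET NAME="graphdataset" VARIABLES=a b MISSING=LISTWISE REPORTMISSING=NO
  /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
  SOURCE: s=userSource(id("graphdataset"))
  DATA: a=col(source(s), name("a"), unit.category())
  DATA: b=col(source(s), name("b"), unit.category())
  GUIDE: axis(dim(1), label("a"))
  GUIDE: axis(dim(2), label("b"))
  ELEMENT: point.jitter(position(a*b))
END GPL.
```

With this data, the three cases at a=2, b=2 should show up as three nearby points instead of one.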

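The second way is to aggregate the file first, so that each distinct combination of a and b becomes one case with a count of how many original cases it represents. This gives you more or less what you want: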

Code:

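* Count the cases at each distinct combination of a and b.
* Note: MODE=REPLACE overwrites the active dataset with the aggregated file.
COMPUTE num=1.
AGGREGATE
  /OUTFILE=* MODE=REPLACE
  /BREAK=a b
  /ties=SUM(num).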


GGRAPH
  /GRAPHDATASET NAME="graphdataset" VARIABLES=a b ties[LEVEL=ORDINAL] MISSING=LISTWISE 
    REPORTMISSING=NO
  /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
  SOURCE: s=userSource(id("graphdataset"))
  DATA: a=col(source(s), name("a"), unit.category())
  DATA: b=col(source(s), name("b"), unit.category())
  DATA: ties=col(source(s), name("ties"), unit.category())
  GUIDE: axis(dim(1), label("a"))
  GUIDE: axis(dim(2), label("b"))
  GUIDE: legend(aesthetic(aesthetic.color.exterior), label("ties"))
  ELEMENT: point(position(a*b), color.exterior(ties))
END GPL.
Note that you will have to change the colors to get shades of grey. You can probably fix that with a chart template or, if you feel like it, by editing the GPL directly (using the color.brightness function).
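As a rough, untested sketch of the GPL route: the version below is the aggregate approach with color.exterior(ties) swapped for color.brightness(ties), and with the legend GUIDE line dropped so the default legend is used. The DATA LIST step is added only to make the example self-contained. Treat the result as an assumption to check: which part of the marker gets shaded, the base color, and whether more ties come out darker or lighter all depend on defaults that may need adjusting in the Chart Editor or a template.

```spss
* Recreate the example data (same a and b values as listed above).
DATA LIST FREE / a b.
BEGIN DATA
1 1  1 2  1 3  2 2  2 2  2 2  4 4
4 4  5 5  6 7  7 8  7 8  7 9
END DATA.

* One case per distinct combination of a and b, with ties as the case count.
* Note: MODE=REPLACE overwrites the active dataset.
COMPUTE num=1.
AGGREGATE
  /OUTFILE=* MODE=REPLACE
  /BREAK=a b
  /ties=SUM(num).

* Assumption: color.brightness maps the ordered ties categories to brightness steps.
GGRAPH
  /GRAPHDATASET NAME="graphdataset" VARIABLES=a b ties[LEVEL=ORDINAL] MISSING=LISTWISE
    REPORTMISSING=NO
  /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
  SOURCE: s=userSource(id("graphdataset"))
  DATA: a=col(source(s), name("a"), unit.category())
  DATA: b=col(source(s), name("b"), unit.category())
  DATA: ties=col(source(s), name("ties"), unit.category())
  GUIDE: axis(dim(1), label("a"))
  GUIDE: axis(dim(2), label("b"))
  ELEMENT: point(position(a*b), color.brightness(ties))
END GPL.
```

After the AGGREGATE step the example file has 9 cases, with ties=3 for a=2, b=2 and ties=2 for both a=4, b=4 and a=7, b=8.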
hth
Jon
