Ranking cases based on count of occurrences

Post by gandalf » Mon Apr 16, 2012 1:49 pm

Hello everyone,

I am trying to create a rank variable based on the order in which a variable's values occur, not on the values themselves. Specifically, I have a dataset from a survey of groups visiting shops. A survey_id variable identifies each group. The issue is that some groups split up and visit different shops, but the resulting subgroups are still assigned the same survey_id. A second variable, visit_id, keeps track of the visit sequence, but it does not distinguish the subgroups either: it simply starts again for each one. For example, for survey_id 360 the visit_id goes from 1 to 3, then from 1 to 2, and then from 1 to 2 again. This means that group 360 has split into 3 subgroups, and each subgroup visited shops in the sequence shown under visit_id.

I would like to compute a Rank variable as follows. For the first 3 visits the rank should be 1. For the visits made by the second subgroup the rank should be 2, and for the visits made by the third subgroup it should be 3. Since survey_id 361 has no subgroups, its rank should be 1. Please suggest a way to do this.

I have attached a sample of the survey as an example along with the required output.
survey.jpg

Post by Penguin_Knight » Mon Apr 16, 2012 4:00 pm

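Not elegant, but it gets the job done. Go to File > New > Syntax, copy the following code into the window, and then go to Run > All. The cases need to be in their original order, with each subgroup's visits in consecutive rows.

* Number the cases in their current file order.
compute seq = $casenum.
execute.

* The first case in the file always starts at rank 1.
if (seq = 1) rank = 1.
execute.

* Same group and visit_id still increasing: same subgroup, so keep the rank.
if (sysmis(rank) &
    survey_id = lag(survey_id) &
    visit_id > lag(visit_id))
    rank = lag(rank).

* Same group but visit_id restarted: a new subgroup, so add 1 to the rank.
if (sysmis(rank) &
    survey_id = lag(survey_id) &
    visit_id <= lag(visit_id))
    rank = lag(rank) + 1.

* A new survey_id starts again at rank 1.
if (sysmis(rank) &
    survey_id ~= lag(survey_id))
    rank = 1.

execute.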
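
For anyone who wants to try the logic before running it on the real file, here is a minimal self-contained sketch of the same idea. It is a condensed variant, not the exact syntax above: it starts every case at rank 1 and then only adjusts cases that continue the previous case's survey_id. The visit_id pattern for 360 is taken from the question; the two rows for 361 are an assumption, since the attachment is not visible. It relies on LAG returning the previous case's already-transformed value, and on the rows being in their original order.

```spss
* Hypothetical sample data. The visit_id values for 360 follow the question.
* The two rows for 361 are assumed, since the attachment is not visible.
data list free / survey_id visit_id.
begin data
360 1
360 2
360 3
360 1
360 2
360 1
360 2
361 1
361 2
end data.

* Start every case at rank 1, then adjust cases that continue a group.
compute rank = 1.
* visit_id still increasing within the group: same subgroup as the previous case.
if (survey_id = lag(survey_id) and visit_id > lag(visit_id)) rank = lag(rank).
* visit_id restarted within the group: a new subgroup begins.
if (survey_id = lag(survey_id) and visit_id <= lag(visit_id)) rank = lag(rank) + 1.
formats survey_id visit_id rank (f4.0).
list.
```

The target from the question is rank 1, 1, 1, 2, 2, 3, 3 for survey_id 360 and rank 1 for every row of 361, so the LIST output can be checked against that before applying the syntax to the full dataset.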

Post by gandalf » Tue Apr 17, 2012 12:05 am

Thanks Penguin. I will try this.
