Restructuring/aggregating data file with string variables


jazzjune
Posts: 1
Joined: Thu Dec 12, 2013 8:41 pm

Postby jazzjune » Thu Dec 12, 2013 8:48 pm

I need some major help figuring out how to restructure my data file. I have a student data file with four variables:

1. Student ID (numeric - 1,000+ possible values)
2. Year (numeric - 5 possible values)
3. Name of Fall Math Course (string - 30+ possible values)
4. Name of Spring Math Course (string - 30+ possible values)

Students may have taken several different math courses over the years, or they may have repeated a course (for example, after failing it). Student IDs can therefore appear on more than one row, and the same course name (like Algebra 1A) can appear more than once for a student.

Here are my research questions

1. How many Math classes has each student taken? (I could do an Aggregate command, but I'd prefer to exclude classes that students have repeated. Is there a way I can do that?)

2. How many advanced Math classes has each student taken? (I'd need to figure out a way to count 5 or so courses. And again, I'd like to exclude courses that were repeated.)


Any help will be much appreciated!!!
GerineL
Moderator
Posts: 1477
Joined: Tue Jun 10, 2008 4:50 pm

Re: Restructuring/aggregating data file with string variable

Postby GerineL » Fri Dec 13, 2013 2:23 pm

Not exactly sure, but hopefully this helps:

- Make one file with all IDs and courses in rows, instead of separate variables for the fall and spring courses. You can always add a variable to indicate whether a given row is a spring or fall course.

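- You can then sort on ID, then course name, and use this syntax to number each occurrence of a course within a student (ID and coursename are placeholder variable names):
* Sort so that repeats of the same course sit on consecutive rows.
sort cases by ID coursename.
* Start each row at 1, then count up while ID and course match the previous row.
compute number=1.
if ( (ID = lag(ID)) & (coursename = lag(coursename)) ) number = (lag(number) +1).
execute.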

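- Select only the cases where number = 1, so that each student/course combination is kept once.

Then run your aggregate by ID to count the courses per student.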
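
To make the logic of these steps concrete, here is a minimal sketch in plain Python rather than SPSS syntax. It is not the syntax suggested above, only an illustration of the same idea: stack fall and spring courses into one list per student, drop repeats, then count. The sample rows, the ADVANCED set, and the function names are all made up for the example; the real list of advanced courses would have to come from your own data.

```python
from collections import defaultdict

# Hypothetical sample rows: (student_id, year, fall_course, spring_course).
rows = [
    (101, 2009, "Algebra 1A", "Algebra 1B"),
    (101, 2010, "Algebra 1A", "Geometry"),   # Algebra 1A is repeated
    (101, 2011, "Calculus", ""),             # no spring course recorded
    (102, 2009, "Geometry", "Pre-Calculus"),
    (102, 2010, "Calculus", "Statistics"),
]

# Hypothetical set of course names treated as "advanced".
ADVANCED = {"Pre-Calculus", "Calculus", "Statistics"}


def distinct_courses(rows):
    """Map each student ID to the set of distinct course names taken."""
    courses = defaultdict(set)
    for student_id, _year, fall, spring in rows:
        # Wide-to-long step: fall and spring become separate course records.
        for name in (fall, spring):
            name = name.strip()
            if name:  # skip blank course names
                courses[student_id].add(name)  # a set keeps each name once
    return courses


def course_counts(rows, advanced=ADVANCED):
    """Return {student_id: (n_distinct_courses, n_distinct_advanced)}."""
    return {
        sid: (len(names), len(names & advanced))
        for sid, names in distinct_courses(rows).items()
    }


counts = course_counts(rows)
for sid in sorted(counts):
    print(sid, counts[sid])
```

Note that the matching is on the exact course name, so spelling variants such as "Algebra 1a" and "Algebra 1A" would be counted as two different courses. The same caveat applies to comparing string variables in SPSS, so it may be worth cleaning the course names first.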
