Transposing a variable with multiple sequences

Moderators: statman, Analyst Techy, andris, Fierce, GerineL, Smash

uwilliams
Posts: 1
Joined: Sat Dec 03, 2011 3:02 pm

Transposing a variable with multiple sequences

Postby uwilliams » Sat Dec 03, 2011 3:27 pm

I have two databases.

One (Data1) lists the patients as:

Sequence Number Age Gender
1 28 F
2 35 M
3 23 F
4 42 F


the other (Hosp1) list the patients as:

Sequence Number Diagnosis Code
1
1
2
2 6.7
2 3.5
3
3 6.7
4
4 2.1
4 8.9
4 4.3
5
5


These are the same patients coded by sequence number. I am trying to combine the two databases with having only one sequence number for each patient like in Data1. The trouble is that there are multiple Diagnosis Codes for each patient, so I am trying to transpose these so it looks like this:

Sequence Number Diagnosis Code1 Diagnosis Code 2 Diagnosis Code 3
1
1
2
2 6.7 3.5
2
3
3 6.7
4
4
4 2.1 8.9 4.3
4
5
5

So ultimately, I am trying to transpose the Diagnosis Code variable BY each patient and sorted by their first, second, third diagnosis.

I would then aggregate the cases and merge the files (right???)

I am really looking forward to the help I can get, I've tried multiple things and its over my head. I don't want to even think about sorting it manually because there are 40,0000 patients in the database!!! Thanks everyone!!!
JonPedersen
Posts: 119
Joined: Wed May 25, 2011 7:07 am
Contact:

Re: Transposing a variable with multiple sequences

Postby JonPedersen » Fri Dec 09, 2011 9:37 pm

Hi,
Actually, no, although everything depends on what you want to do after having massaged the files. In general, having several instances of the same variable on the same record is bad practice.
I would probably start with merging the files so that you have age and gender info and one diagnosis on each record. This will make it much easier to deal with the fact that you have several diagnoses for each patient.
So

Code: Select all

Match files  /table=agegenderdatabase
/file=diagnosesdatabase
/by ID.
and then you can aggregate, if you need number of diagnoses per patient or something like that. Otherwise it is much easier to get variances, frailty or similar stuff right with a (quasi) normalized data structure.
best
Jon

Who is online

Users browsing this forum: No registered users and 2 guests

cron