Restructuring data

Torvon
Posts: 34
Joined: Sun Nov 13, 2011 8:41 pm

Restructuring data

Postby Torvon » Tue Apr 17, 2012 3:02 pm

Hello.

I have 5 measurement points (time). y is my dependent variable and x is a dichotomous critical life event (time-varying covariate, yes vs. no). Most x = 0, meaning that at most measurement points subjects report no life event.

My data are currently in the format:

subject - time - y - x
1 ------- 1 -- 2 - 0
1 ------- 2 -- 4 - 1
1 ------- 3 -- 4 - 0
1 ------- 4 -- 5 - 0
1 ------- 5 -- 4 - 0
2 ------- 1 -- 0 - 0
2 ------- 2 -- 0 - 0
2 ------- 3 -- 1 - 0
2 ------- 4 -- 5 - 1
2 ------- 5 -- 3 - 0

In this example, subject 1 had a life event measured at time 2 (meaning the person reported having experienced a life event between measurements 1 and 2), and as you can see the dependent variable y goes up from 2 to 4.
Subject 2 had a life event at time 4, and y increased from 1 to 5.

Now, I am not interested in the other measurement points without life events, or in the time variable per se. I want to transform this into a cross-sectional data format. All I want to keep are the two measurements around the life event: the one just before it and the one at which it was reported (measurement points 1 and 2 for subject 1).

subject - y before life event - y after life event - time of life event
1 --------------2---------------4---------------2
2 --------------1---------------5---------------4

Is it possible to do this with SPSS syntax? I can't get it done, and the dataset is huge (N > 1000) so I don't want to do it manually ;)

Thank you
E.
Penguin_Knight
Posts: 473
Joined: Thu Apr 05, 2012 5:58 pm

Re: Restructuring data

Postby Penguin_Knight » Tue Apr 17, 2012 5:40 pm

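* Assumes the file is sorted by subject and time.
* "before" is the previous row's y, taken only if that row is the same subject.
IF (x = 1 AND subject = LAG(subject)) before = LAG(y).
IF (x = 1) after = y.
IF (x = 1) eventtime = time.
EXECUTE.

* Keep only the rows on which a life event was reported.
FILTER OFF.
USE ALL.
SELECT IF (x = 1).
EXECUTE.

DELETE VARIABLES time y x.
EXECUTE.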
Note that I put "subject = lag(subject)" to safeguard against cases in which the life changing event happened at time 1. If you're sure they all happened at time 2 or later, then you don't need that condition in the first line.
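
For anyone who wants to check the logic outside SPSS, here is a minimal Python sketch of the same idea, using the example rows from the first post. It is only an illustration: the function name `event_rows` and the tuple layout are made up for this example, and `prev` stands in for what LAG() does in the syntax above.

```python
# Long-format rows from the first post: (subject, time, y, x).
rows = [
    (1, 1, 2, 0), (1, 2, 4, 1), (1, 3, 4, 0), (1, 4, 5, 0), (1, 5, 4, 0),
    (2, 1, 0, 0), (2, 2, 0, 0), (2, 3, 1, 0), (2, 4, 5, 1), (2, 5, 3, 0),
]


def event_rows(rows):
    """Return one (subject, before, after, eventtime) tuple per event row."""
    out = []
    prev = None  # hypothetical stand-in for LAG(): the previous row in file order
    for subject, time, y, x in sorted(rows, key=lambda r: (r[0], r[1])):
        if x == 1:
            # "before" stays None when the previous row is a different subject,
            # which is the case the subject = LAG(subject) condition guards against.
            same_subject = prev is not None and prev[0] == subject
            before = prev[2] if same_subject else None
            out.append((subject, before, y, time))
        prev = (subject, time, y, x)
    return out


result = event_rows(rows)
for line in result:
    print(line)
```

With the example data this gives one row per subject: y before the event, y after it, and the time of the event.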
Torvon
Posts: 34
Joined: Sun Nov 13, 2011 8:41 pm

Re: Restructuring data

Postby Torvon » Fri Apr 20, 2012 10:51 am

Penguin_Knight, your help is very much appreciated.

My problem is a bit more complex, and I thought I could post the simplified version of the problem and then enhance the syntax myself to work for the complicated dataset, but I can't get things working.

Could you give me another hint as to how to solve this?

I have several X and several Y.

subject - time - y1 - y2 - y3 - x1 - x2 - x3
1 ------- 1 -- 2 -- 2 -- 1 -- 1 -- 0 -- 1
1 ------- 2 -- 4 -- 1 -- 1 -- 0 -- 0 -- 0
1 ------- 3 -- 4 -- 1 -- 2 -- 0 -- 1 -- 0
1 ------- 4 -- 5 -- 4 -- 3 -- 0 -- 1 -- 0
1 ------- 5 -- 4 -- 3 -- 0 -- 0 -- 0 -- 0
2 ------- 1 -- 0 -- 4 -- 0 -- 0 -- 0 -- 0
2 ------- 2 -- 0 -- 1 -- 0 -- 0 -- 1 -- 0
2 ------- 3 -- 1 -- 2 -- 2 -- 1 -- 0 -- 1
2 ------- 4 -- 5 -- 3 -- 1 -- 0 -- 1 -- 0
2 ------- 5 -- 3 -- 0 -- 1 -- 2 -- 1 -- 0

So a person can have different life events at the same time, and there are multiple Ys as response variables that interest me.

The output would look like this (all variables in one line, I'll put it into a column for better overview)

subject
---
y1 before x1
y1 after x1
y2 before x1
y2 after x1
y3 before x1
y3 after x1
time x1
---
y1 before x2
y1 after x2
y2 before x2
y2 after x2
y3 before x2
y3 after x2
time x2
---
y1 before x3
y1 after x3
y2 before x3
y2 after x3
y3 before x3
y3 after x3
time x3

etc.

Thank you so much, tried for a couple of hours now but I just cannot get it done. This new dataset might seem complex, but you can imagine that it will greatly simplify my design compared to the earlier version with 5 measurement points (this new one will be more or less cross-sectional with the possibility to control for time).

E.
Penguin_Knight
Posts: 473
Joined: Thu Apr 05, 2012 5:58 pm

Re: Restructuring data

Postby Penguin_Knight » Fri Apr 20, 2012 7:23 pm

In that case it's better to just restructure the data:
SORT CASES BY subject time.
CASESTOVARS
/ID=subject
/INDEX=time
/GROUPBY=VARIABLE.
Use "time" as Index, and then you can easily group variables for analysis.

For instance, if x1.2 = 1, you can then check y1.1 (before) and y1.2 (after), y2.1 (before) and y2.2 (after), and y3.1 (before) and y3.2 (after). The ".1", ".2", etc. are suffixes taken from the values of the variable "time".
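
To make the wide layout concrete, here is a rough Python sketch of what the restructure produces, using the first two time points of the example data from the post above. This only mimics the naming scheme (variable name, a dot, then the time value); the function `cases_to_vars` and its arguments are invented for this illustration and are not an SPSS API.

```python
columns = ("subject", "time", "y1", "y2", "y3", "x1", "x2", "x3")
long_rows = [
    dict(zip(columns, values))
    for values in [
        (1, 1, 2, 2, 1, 1, 0, 1),
        (1, 2, 4, 1, 1, 0, 0, 0),
        (2, 1, 0, 4, 0, 0, 0, 0),
        (2, 2, 0, 1, 0, 0, 1, 0),
    ]
]


def cases_to_vars(rows, id_key, index_key):
    """Collapse long rows into one dict per id, with 'name.index' keys."""
    wide = {}
    for row in rows:
        rec = wide.setdefault(row[id_key], {id_key: row[id_key]})
        for name, value in row.items():
            if name not in (id_key, index_key):
                # e.g. y1 measured at time 2 becomes "y1.2"
                rec[f"{name}.{row[index_key]}"] = value
    return list(wide.values())


wide = cases_to_vars(long_rows, "subject", "time")
second = wide[1]  # subject 2
if second["x2.2"] == 1:
    print("y2 before:", second["y2.1"], "y2 after:", second["y2.2"])
```

Once every subject is a single row, "before" and "after" are just two columns picked by their time suffix, so no LAG() is needed.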
Torvon
Posts: 34
Joined: Sun Nov 13, 2011 8:41 pm

Re: Restructuring data

Postby Torvon » Sat Apr 21, 2012 4:03 pm

I'm very sorry, but I don't get it. I read some tutorials on VARSTOCASE yesterday and today but they are not nearly as complicated as what I'm trying to do. I don't understand what the "VARIABLE" in this case should be, and I find no information in the tutorials about the "GROUPBY" command at all.

Do you mean I need to combine the VARSTOCASE with some kind of IF THEN syntax like the one you posted in your first reply?

Sorry ...
T.
Torvon
Posts: 34
Joined: Sun Nov 13, 2011 8:41 pm

Re: Restructuring data

Postby Torvon » Sat Apr 21, 2012 4:15 pm

Never mind, I can just solve it with your first idea. Long and tedious but it works :)

IF (x1=1 AND subject=LAG(subject)) y1before1=LAG(y1).
IF (x1=1 AND subject=LAG(subject)) y2before1=LAG(y2).
...
IF (x1=1) y1after1=y1.
IF (x1=1) y2after1=y2.
...

rinse and repeat

Thank you!
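
The "rinse and repeat" pattern can also be written as a loop. Below is a small Python sketch of that logic on the example data from this thread, not a replacement for the SPSS syntax. Two things are assumptions made only for this illustration: the output names (`y1before1`, `y1after1`, `time1`, ...) and the rule that, when the same event type is reported more than once for a subject, only the first report is kept.

```python
columns = ("subject", "time", "y1", "y2", "y3", "x1", "x2", "x3")
data = [
    dict(zip(columns, values))
    for values in [
        (1, 1, 2, 2, 1, 1, 0, 1), (1, 2, 4, 1, 1, 0, 0, 0),
        (1, 3, 4, 1, 2, 0, 1, 0), (1, 4, 5, 4, 3, 0, 1, 0),
        (1, 5, 4, 3, 0, 0, 0, 0), (2, 1, 0, 4, 0, 0, 0, 0),
        (2, 2, 0, 1, 0, 0, 1, 0), (2, 3, 1, 2, 2, 1, 0, 1),
        (2, 4, 5, 3, 1, 0, 1, 0), (2, 5, 3, 0, 1, 2, 1, 0),
    ]
]


def before_after(rows, n_x=3, n_y=3):
    """One dict per subject with y{j}before{i}, y{j}after{i} and time{i}."""
    out = {}
    prev = None  # previous row in (subject, time) order, like LAG()
    for row in sorted(rows, key=lambda r: (r["subject"], r["time"])):
        rec = out.setdefault(row["subject"], {"subject": row["subject"]})
        same_subject = prev is not None and prev["subject"] == row["subject"]
        for i in range(1, n_x + 1):
            # Assumption: only a value of 1 counts as an event, and only the
            # first report of event x{i} per subject is kept.
            if row[f"x{i}"] == 1 and f"time{i}" not in rec:
                rec[f"time{i}"] = row["time"]
                for j in range(1, n_y + 1):
                    rec[f"y{j}after{i}"] = row[f"y{j}"]
                    # No "before" value if the event is on the subject's first row.
                    rec[f"y{j}before{i}"] = prev[f"y{j}"] if same_subject else None
        prev = row
    return list(out.values())


wide = before_after(data)
for rec in wide:
    print(rec["subject"], rec.get("time1"), rec.get("y1before1"), rec.get("y1after1"))
```

Subject 1 reports x1 at time 1, so there is no earlier measurement and the "before" values stay empty, which is the case the subject = LAG(subject) safeguard covers in the SPSS version.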
