kot4x wrote:
Hi!
I uploaded a sample of the data in my previous post. I had to copy it into Excel and then save it as a picture in order to upload it. Do you want to see the Variable View from the dataset?
Thanks!
No, I would need the real data to get your exact list of variables, but that doesn't matter much here. Here is the syntax:
Code:
* Restructure from wide to long: one case per original row and diagnosis variable.
VARSTOCASES
/ID=id1
/MAKE trans1 FROM Diagnosis11 TO Diagnosis631
/INDEX=Index1(trans1)
/KEEP=id
/NULL=KEEP.
You may need to modify the syntax a bit. I used Diagnosis11 as your first diagnosis variable and Diagnosis631 as your last one (probably Diagnosis631; I can't be sure because I don't have a data set to check). It is also important that all 186 of these variables are contiguous, meaning they sit next to each other in your data set with no other variables mixed in between, because TO selects variables by their position in the file.
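One way to check the contiguity requirement before running VARSTOCASES is to ask SPSS to list the block that TO would pick up. This is only a sketch, and it assumes the first and last variables really are named Diagnosis11 and Diagnosis631:

```spss
* List every variable that TO selects between the two endpoints, in file order.
DISPLAY VARIABLES
/VARIABLES=Diagnosis11 TO Diagnosis631.
```

If the output lists exactly 186 variables and every name is a diagnosis variable, the block is contiguous. Any other name that shows up in the list would be treated as part of the block by /MAKE.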
Remember to save a backup before restructuring.
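A minimal way to make that backup from syntax might look like this. The file path and the dataset name are placeholders, not something taken from your data:

```spss
* Write the wide file to disk before restructuring (the path is a placeholder).
SAVE OUTFILE='C:\temp\diagnoses_wide_backup.sav'.
* Or keep an in-session copy under a hypothetical dataset name.
DATASET COPY wide_backup.
```

Either line on its own is enough; the saved file survives closing SPSS, while the dataset copy only lasts for the session.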
Once the data is restructured, you can remove the empty cases and the duplicates with this:
Code:
FILTER OFF.
USE ALL.
SELECT IF (trans1 ~= "").
EXECUTE.
* Identify Duplicate Cases.
SORT CASES BY id1(A) trans1(A).
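* Flag the first and last case within each id1 and trans1 combination.
* If PrimaryFirst is left over from an earlier run, add /DROP=PrimaryFirst before /FIRST.
MATCH FILES
/FILE=*
/BY id1 trans1
/FIRST=PrimaryFirst
/LAST=PrimaryLast.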
DO IF (PrimaryFirst).
COMPUTE MatchSequence=1-PrimaryLast.
ELSE.
COMPUTE MatchSequence=MatchSequence+1.
END IF.
LEAVE MatchSequence.
FORMATS MatchSequence (f7).
COMPUTE InDupGrp=MatchSequence>0.
SORT CASES InDupGrp(D).
MATCH FILES
/FILE=*
/DROP=PrimaryLast InDupGrp MatchSequence.
VARIABLE LABELS PrimaryFirst 'Indicator of each first matching case as Primary'.
VALUE LABELS PrimaryFirst 0 'Duplicate Case' 1 'Primary Case'.
VARIABLE LEVEL PrimaryFirst (ORDINAL).
FREQUENCIES VARIABLES=PrimaryFirst.
EXECUTE.
FILTER OFF.
USE ALL.
SELECT IF (PrimaryFirst = 1).
EXECUTE.
If you prefer the menus, you can get the same results with Data > Select Cases followed by Data > Identify Duplicate Cases.
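If the pasted duplicate-identification syntax feels heavy, a shorter sketch of the same idea is to sort the file and compare each case with the one before it. This is not what the menu generates. It assumes the data has already been restructured and the empty cases removed, with id1 and trans1 named as above, and is_first is a hypothetical variable name:

```spss
* Sort so that identical id1 and trans1 pairs sit next to each other.
SORT CASES BY id1(A) trans1(A).
* Assume every case is the first of its pair, then unflag the repeats.
COMPUTE is_first = 1.
IF (id1 = LAG(id1) AND trans1 = LAG(trans1)) is_first = 0.
EXECUTE.
* Keep only the first case of each pair.
SELECT IF (is_first = 1).
EXECUTE.
```

The flag is computed in its own step before SELECT IF so that LAG always looks at the immediately preceding case in the sorted file.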