merge issue

kea1011
Posts: 12
Joined: Sun Nov 03, 2013 5:11 pm

merge issue

Postby kea1011 » Sun Nov 03, 2013 5:32 pm

hello,
i am using spss version 21. i have merged multiple files from the Health and Retirement Study without issue so far. the last variable i needed was from a different file set (still from HRS). after merging, i received no warning or error message, but was concerned because i only had 80 valid responses (i expected upwards of 500-600). when i looked at the data, i realized that not only did i have only 80 responses, but they were for my FIRST EIGHTY cases. my data set is N=686. i have tried merging multiple times and in multiple ways to try to 'trick' the program. but i keep coming up with 80.

i have gone into the original data set and i can SEE matches that should have merged but did NOT. i am sooo stuck.
i guess my question is: is there a REASON why the program would 'cut me off' at 80 like this? it has not done so with past files, and i can not figure out what the problem is.

thanks for ANY thoughts.
-kristen
pythonforspss.org
Posts: 116
Joined: Sat Oct 06, 2012 6:21 am

Re: merge issue

Postby pythonforspss.org » Sun Nov 03, 2013 8:04 pm

Please post the full syntax you used for this. It's kinda hard to troubleshoot this without it.
Kind regards,

Ruben Geert van den Berg
http://www.spss-tutorials.com

Re: merge issue

Postby kea1011 » Sun Nov 03, 2013 8:35 pm

'SRHONLY' contains the new variable i want. 'drivenow6' contains my main data set. i sorted each by two identifier variables, hhid & pn, and matched on these key variables as well. i use the 'clicking' method for merging, but this is the syntax the output creates. hope this helps???
the real puzzle to me is that i can go into both files and SEE cases that match by hhid and pn. yet for some cases, the SRHONLY data did come over .....
* Open the file holding the new variable and name it DataSet2.
GET
  FILE='C:\Documents and Settings\Kea\Desktop\spss\old dataset\SRHONLY.sav'.
DATASET NAME DataSet2 WINDOW=FRONT.

* Sort the main data set by the key variables and save it.
DATASET ACTIVATE DataSet1.
SORT CASES BY HHID(A) PN(A).
SAVE OUTFILE='C:\Documents and Settings\Kea\Desktop\drivenow6.sav'
  /COMPRESSED.

* Sort the lookup data set by the same keys and save it.
DATASET ACTIVATE DataSet2.
SORT CASES BY HHID(A) PN(A).
SAVE OUTFILE='C:\Documents and Settings\Kea\Desktop\spss\old dataset\SRHONLY.sav'
  /COMPRESSED.

* Merge HC001 from DataSet2 into the main data set.
DATASET ACTIVATE DataSet1.
STAR JOIN
/SELECT t0.ADAMSSID, t0.nowDRIVE, t0.RACE2, t0.Degree2, t0.Gender2, t0.dementiatype, t0.dxbroad, t0.trailB, t0.trailA, t0.CDRcomplete, t0.CDRtotal, t0.CDRmemory, t0.CDRorient, t0.CDRjudgem, t0.CDRsocint, t0.CDRhomeact, t0.CDRperscare, t0.mmsetot, t0.animalfluencytotal, t0.bostonnamingtotal, t0.praxistotal, t0.praxdelaytotal, t0.praxrecogtotal, t0.logmem1, t0.logmem2, t0.fuldtotal, t0.bentonviscorrect, t0.bentonviserrors, t0.trailAcomp, t0.trailAerrors, t0.trailBcomp, t0.trailBerrors, t0.digitspanfrwrd,
t0.digitspanback, t0.digitspantotal, t0.shipleytotal, t0.cowatotal, t0.symboldigittotal, t0.symboldigiterrors, t0.ANIMMCP, t0.wordlist1, t0.wordlist2, t0.wordlist3, t0.ANDELCP, t0.WLdelayed, t0.ANRECCP, t0.WLrecyes, t0.WLrecno, t0.DEGREE, t0.GENDER, t0.SCHLYRS, t0.Diagnosis, t0.AAGE, t0.everDRIVE, t0.BIRTHMO, t0.BIRTHYR, t0.agewave, t0.intrvwdate, t0.agebracket, t0.driveNOW2, t0.CDRtotal2, t0.mmseRANGE, t0.mmseCUTOFF, t0.selfhlth, t0.AGQ30A, t0.AGQ30B, t0.AGQ30C, t0.AGQ30D, t0.AGQ30E, t0.AGQ30F, t0.AGQ30G,
t0.AGQ30H, t0.AGQ30I, t0.AGQ30J, t0.AGQ30K, t1.HC001
/FROM * AS t0
/JOIN 'DataSet2' AS t1
ON t0.HHID=t1.HHID
AND t0.PN=t1.PN
/OUTFILE FILE=*.

Re: merge issue

Postby pythonforspss.org » Sun Nov 03, 2013 8:53 pm

"...i can go into both files and SEE cases..." => mind you, what you see is not always what you get in SPSS. Perhaps the key variables have decimals that are not shown? Specify some more decimals to find out: http://www.spss-tutorials.com/changing- ... -decimals/.
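
A minimal sketch of that check, assuming HHID and PN are stored as numeric variables in both datasets (FORMATS does not apply to string variables) and that the dataset names are the ones from the syntax posted above:

```spss
* Show four decimals on the key variables in both open datasets.
* This changes only the display format, not the stored values.
DATASET ACTIVATE DataSet1.
FORMATS HHID PN (F12.4).
DATASET ACTIVATE DataSet2.
FORMATS HHID PN (F12.4).
```

If any key then shows something other than zeros after the decimal point, the values are not the whole numbers they appeared to be.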

How many cases are in either data file?
Kind regards,

Ruben Geert van den Berg
http://www.spss-tutorials.com

Re: merge issue

Postby kea1011 » Sun Nov 03, 2013 9:49 pm

the SRHONLY is from the 'core' so it has 18,167 cases. the drivenow6 set is the data set i have been creating to work with. therefore, variables have been merged in, renamed, recoded, dropped if missing values, etc (N=696). i wanted to merge in a variable from SRHONLY into my current set.

i fear i am not sure what you mean about the 'decimals'. but i will check out the link, none the less.
the hhid and pn are two variables used throughout all data files by the Health and Retirement Study so that data can be matched up via interview year.

Re: merge issue

Postby pythonforspss.org » Mon Nov 04, 2013 9:17 am

"i fear i am not sure what you mean about the 'decimals'."

Well, if a cell contains a value 1.234 and you set its format to f1.0 (zero decimals, that is), you'll only see 1. But the 1 you see is not exactly what you have because you have 1.234. If a case from a different file really has value 1, it will obviously not be matched by 1.234.
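
To make this concrete, here is a small sketch on made-up data (not HRS data; the variable names id and is_whole are purely illustrative). Under an F1.0 format both values should display as 1, while the flag computed afterwards tells them apart:

```spss
* Toy data, not from HRS, with two values that look identical under F1.0.
DATA LIST FREE / id.
BEGIN DATA
1.234
1
END DATA.
FORMATS id (F1.0).
LIST.

* Flag values that are whole numbers (1 = whole, 0 = has decimals).
COMPUTE is_whole = (id = TRUNC(id)).
FORMATS is_whole (F1.0).
LIST.
```

The same COMPUTE could be pointed at a numeric key variable to find cases whose stored value is not the integer it appears to be.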

This is one of the main reasons why MATCH FILES or RECODE sometimes shows unexpected behavior.
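
One way to see which cases fail to match, sketched under some assumptions: the dataset names DataSet1 and DataSet2 and the keys HHID and PN are taken from the syntax posted above; diag, in_main and in_srh are made-up names; and each HHID/PN combination is assumed to occur at most once per file.

```spss
* Work on a copy so the real working dataset is left untouched.
DATASET ACTIVATE DataSet1.
DATASET COPY diag.
DATASET ACTIVATE DataSet2.
SORT CASES BY HHID (A) PN (A).
DATASET ACTIVATE diag.
SORT CASES BY HHID (A) PN (A).

* Full match on both files with a 0/1 source flag per file.
MATCH FILES /FILE=* /IN=in_main
  /FILE='DataSet2' /IN=in_srh
  /BY HHID PN.

* Cases with in_main = 1 and in_srh = 0 found no partner in SRHONLY.
CROSSTABS /TABLES=in_main BY in_srh.
```

This is a diagnostic rather than the final merge: the diag copy ends up holding every case from both files, so it is meant to be inspected and then closed, not saved over the working file.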

P.s. are these public data? If so (and file sizes permitting) I'd like to have a look at them.
Kind regards,

Ruben Geert van den Berg
http://www.spss-tutorials.com

Re: merge issue

Postby kea1011 » Mon Nov 04, 2013 2:28 pm

the data is public but you must be a registered user with HRS. there is a huge amount of data available for use:
http://hrsonline.isr.umich.edu/index.php?p=data

i am currently using the data for a doctoral dissertation out of UMass Boston. a Special Access File, ADAMS Wave A (still public), is where 99% of my data is coming from. the dreaded self-reported health variable comes from the 2002 HRS Core data.

the variable i am currently struggling with is categorical. i can see from a 'codebook' HRS supplies that the only responses available are "poor", "fair", "good", "very good", and "excellent" (coded 1-5 respectively, no decimals).

Re: merge issue

Postby pythonforspss.org » Mon Nov 04, 2013 7:49 pm

If the data are public, could you somehow provide me with the two exact data files you're trying to merge? I'd like to run your syntax on them and see what's going on.

I'll PM you my email address. If they're too big to attach, we can perhaps use Dropbox or FTP.
Kind regards,

Ruben Geert van den Berg
http://www.spss-tutorials.com

Re: merge issue

Postby kea1011 » Mon Nov 04, 2013 8:03 pm

i received a reply about 3 mins ago from the 'hrs help line'. it appears it may be a data issue, not an spss issue, which i suspected from the start. they are directing me to a different 'core' in order to access a variable that can be merged correctly. not sure what it all means, but im going to try and follow their lead.
thank you so much for your time and efforts.
believe me, im in the beginning stages of this thing .....i'll be back!
--kristen

Re: merge issue

Postby pythonforspss.org » Tue Nov 05, 2013 6:50 am

"...it may be a data issue, not an spss issue..." => Sure, but that doesn't imply you can't track down and solve the problem in SPSS. There are no "hidden components" in SPSS data. Theoretically one could even drill down to the very bits (ones and zeroes) that make it up. There's nothing unusual about working with messy data in SPSS, you just gotta know what you're doing.

But anyway: good luck with the project and do get back if there are any other issues!
Kind regards,

Ruben Geert van den Berg
http://www.spss-tutorials.com
