Replace missing ONLY if 1 out of 9 items (not 2+) is missing

Torvon
Posts: 34
Joined: Sun Nov 13, 2011 8:41 pm

Postby Torvon » Tue Nov 22, 2011 3:16 am

Hey.

Short question: for some reason, a big subset of my sample is missing one of nine items in a questionnaire (responses 0-3). The sample size is huge, so I need a proper way to do this.

What I want to do is this:
IF only one out of the nine items is missing, replace this missing value with the mean of the other items in the scale for the same person (alternatively with the mean of this item for other people).
IF more than 1 are missing, do NOT replace.

I don't know how to tell SPSS this.

Thanks
--T
GerineL
Moderator
Posts: 1477
Joined: Tue Jun 10, 2008 4:50 pm

Postby GerineL » Thu Nov 24, 2011 9:30 am

if you just want a variable with a mean value, you can use:

compute YOURVARNAME = mean.8(item1,item2 ..., item9).
execute.

That says: compute the mean of items 1 through 9 if at least 8 of them have valid values.

If you really want the individual items, not a scale score, you can use the count function (Transform -> Count Values within Cases) to count the number of missing values per case, and use an IF command to replace the missing ones.
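
A minimal sketch of the count-and-replace approach described above, assuming the nine items are called item1 to item9 and sit next to each other in the file (nmiss and pmean are made-up helper names). It fills a gap with that person's mean of the other eight items, and only when exactly one item is missing:

```spss
* Count how many of the nine items are missing for each case.
COUNT nmiss = item1 TO item9 (MISSING).
* Person mean across the items, computed only if at least 8 are valid.
COMPUTE pmean = MEAN.8(item1 TO item9).
* Fill the gap only for cases with exactly one missing item.
DO REPEAT it = item1 TO item9.
  IF (nmiss = 1 AND MISSING(it)) it = pmean.
END REPEAT.
EXECUTE.
```

The filled-in value will usually not be a whole number, and cases with two or more missing items are left untouched.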

Postby Torvon » Thu Nov 24, 2011 5:37 pm

Gutnre, thank you, that is very helpful.

I don't really know what way I should go about replacing missing values.
I have a large pool of subjects (around 1k), and about 60% of them scored all 9 items.
Then there is another 10% with 1 missing, so that's a lot of information I would miss out on if I just ignored these.

I am running complex analyses with the 9 items as predictors, so the more information the better.

A problem is that the 9 items have very different means (between 2 and 0.4 on a 0-3 scale), so maybe I should z-score them and then replace missings when there is only 1/9 missing?

What would you suggest?

Thanks
--Torvon

Postby GerineL » Mon Nov 28, 2011 12:15 pm

Are the 9 items from different scales or from the same scale?

e.g. depression scale with 9 items?

Postby Torvon » Mon Nov 28, 2011 2:52 pm

They are different items from the same scale (the depression scale PHQ-9 has 9 items covering the 9 depressive symptoms like sleep problems, eating problems, loss of interest etc.).
The items are therefore intercorrelated (to different degrees).

thanks!

Postby GerineL » Mon Nov 28, 2011 3:59 pm

check reliability just in case (and/or factor analysis)

How do you explain the differences in means? Are you sure you don't need to rescale some items?

If there is nothing weird going on, you are just fine to take the mean.
Also, why would you only take the mean if only 1 item is missing? Do you have reason to assume that some items are better at describing the scale than others? Why not use all the information you have?
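
For the reliability check mentioned above, a rough sketch in syntax, assuming the items are named item1 to item9 and are adjacent in the file (the scale label PHQ9 is just a label chosen here):

```spss
* Cronbach alpha for the nine items, with item-total statistics.
RELIABILITY
  /VARIABLES=item1 TO item9
  /SCALE('PHQ9') ALL
  /MODEL=ALPHA
  /SUMMARY=TOTAL.
```

With several measurement points, the same command would be run once per wave on that wave's item variables.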

Postby Torvon » Mon Nov 28, 2011 4:29 pm

"check reliability just in case (and/or factor analysis)"
I have 5 measurement points, each 3 months apart, with a huge set of data. So I can see if PCA and reliability stay stable over time, which is pretty neat.
PCA: quite messy, one factor with an eigenvalue around 3, two factors with eigenvalues around 1, the rest lower. But the messy solution is stable over time.
Reliability: I didn't check this so far (SPSS license just ran out and I'm waiting for University to provide me with an updated one).

"How do you explain the differences in means? are you sure you don't need to rescale some items?"
People think of depression as something binary, you have it or you don't. Which, of course, is nonsense, it's a dimensional thing. Also, 10 people with "depression" most probably have 10 very different problems, life situations, etc.
A self-report questionnaire with 9 items can't possibly capture that.
I am not sure about rescaling. What would you propose? I think I would need to z-transform them, possibly, before entering them into regressions and other analyses.

My goal is to look at the 9 items individually (the instrument is a very very very commonly used instrument - but as all other self-report tools in clinical settings, I think it's pretty bad). I don't want to build a sum score, and don't want to "destroy" differences in responses to single items by replacing missing values with averages of the other items, because these differences are exactly what I am interested in. The goal is to predict these 9 individual scores with different variables and covariates to make the point that there are different patterns or response profiles depending on e.g. gender or other variables.

Means are indeed very different for the 9 items, some are around 2 (0-3 scale), some around 0.2.

Not quite sure what to do, never been in this situation before. Thank you so much for bearing with me

--T

Postby GerineL » Mon Nov 28, 2011 5:17 pm

Ok that last post makes things a lot clearer.
It is always hard to guess if people posting questions know what they are doing and actually need the thing they ask for or if they should do something else.

about the rescaling: I meant that if you have "I feel down" and "usually I am quite happy" in one scale, you need to recode it. That is not the case for your scale. Also, I can imagine that there are different means for the different items, since there is a question about suicidal ideation and about concentration. The concentration question probably has a higher mean than the suicidal ideation question.


If I understand correctly, you have a depression scale which has 9 items about 9 symptoms.
You measured this scale at 5 timepoints.
Usually people actually use a mean or some other total scale score, but you want to look at the items separately.
So actually your paper is not on depression but on depressive symptoms.
Your RQ is whether several variables predict specific depressive symptoms.
If your paper is indeed on depressive symptoms rather than depression, and each of these symptoms (/ items) is a dependent variable, in a way a missing item = a missing scale.

Most important question is: do you have an idea about why some of the values are missing?
Run Little's MCAR test to see whether they are missing completely at random (MCAR).
If they are not MCAR, you can think about the multiple imputation option in SPSS.
With this procedure you impute the data, but it is done 5 times (i.e., you get 5 different datasets), and if you subsequently run a regression (it sounds like that's what you want to do) you get a pooled result for those 5 datasets.
I just followed a workshop on this, it was very interesting. To be fair, I always thought that imputing data was worse than just deleting cases with missing data, that deleting those cases would be more conservative, but it's not.
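
A hedged sketch of both steps in syntax. It assumes the Missing Values add-on module is licensed, that the items are named item1 to item9 and are adjacent in the file, and that gender stands in for whatever predictors go into the model (the dataset name imputed is also made up):

```spss
* Little MCAR test: reported in a footnote under the EM estimate tables.
MVA VARIABLES=item1 TO item9
  /EM.

* Create 5 imputed datasets, stacked in a new dataset called imputed.
DATASET DECLARE imputed.
MULTIPLE IMPUTATION item1 TO item9 gender
  /IMPUTE METHOD=AUTO NIMPUTATIONS=5
  /OUTFILE IMPUTATIONS=imputed.

* Analyses split by Imputation_ also report pooled estimates.
DATASET ACTIVATE imputed.
SPLIT FILE LAYERED BY Imputation_.
REGRESSION
  /DEPENDENT item1
  /METHOD=ENTER gender.
```

The imputation model should generally contain the variables that will appear in the later analyses, so the variable list here is only a placeholder.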

Postby Torvon » Mon Nov 28, 2011 5:35 pm

Gutnre wrote: "Also, I can imagine that there are different means for the different items, since there is a question about suicidal ideation and about concentration.
...
Usually people actually use a mean or some other total scale score, but you want to look at the items separately.
So actually your paper is not on depression but on depressive symptoms.
Your RQ is whether several variables predict specific depressive symptoms.
...
If I understand correctly, you have a depression scale which has 9 items about 9 symptoms.
You measured this scale at 5 timepoints."

Yes, you understood the RQ and the problems perfectly.

Gutnre wrote: "Most important question is: Do you have an idea about why some of the missings are missings?
Run Little's MCAR test to see if they are not MCAR."

That was the test to see whether the missing values are missing at random? I think I ran it a while ago and they were not missing randomly (I guess because later measurement points have more missings). I will rerun it as soon as my license is up (tomorrow).

Gutnre wrote: "If they are not MCAR, you can think about the multiple imputation option in SPSS.
With this procedure you can impute data, but it is done 5 times (i.e., you get 5 different datasets), and if you subsequently run a regression (it sounds like that's what you want to do) you get a pooled result for those 5 datasets.
I just followed a workshop on this, it was very interesting. To be fair, I always thought that imputing data was worse than just deleting cases with missing data, that deleting those cases would be more conservative, but it's not."

Okay, I didn't understand that 100%, but I'm sure that I will be able to read up on this with the keywords you mention.

Thanks!
-T

Postby GerineL » Tue Nov 29, 2011 10:48 am

Send me a PM and I can send you a book chapter if you'd like.

Postby Torvon » Tue Nov 29, 2011 5:48 pm

I sent you one yesterday with an additional question already. I hope it got through?
Book chapter would be lovely.

-T
berryla
Posts: 3
Joined: Sat Dec 24, 2011 7:15 am

Postby berryla » Sat Dec 24, 2011 7:24 am

Compute the mean of items 1 through 9 if there are at least 8.
