## Distinguishing variables based on their label (?)


Jordi van der Torre
Posts: 9
Joined: Tue Jul 08, 2014 4:21 pm

### Distinguishing variables based on their label (?)

I recently took an introductory SPSS course at my university, with Andy Field in hand. Thanks to that course I can now somewhat manage SPSS. However, now that I'm no longer following the preset steps my teacher used to give me, it's gotten more difficult. Simply put, sometimes I'm not quite sure which procedure to use to get my results. That was my introduction; the details follow below.

I made a questionnaire that contains 12 sentences. These "question" sentences are all followed by two "answer" sentences, from which the respondent needs to pick one. A question sentence describes a situation to which both answer sentences are a possible continuation. To keep things simple, one answer sentence is "positive" and the other is "negative". Half of the question sentences (basically the variable labels) contain a specific word (a preposition) that is my independent variable, and the other half contain another. What I need is a way to distinguish the question sentences that contain one word from those that contain the other.
I think I need to use Recode into Different Variables or something to do this, but I'm not quite sure. Since it's taking me too long trying to figure it out by myself, I was hoping someone here could help me out. I need this for my bachelor's thesis, so I'm kind of anxious to get it done (right).
RubenGeert
Posts: 100
Joined: Mon May 19, 2014 6:06 am

### Re: Distinguishing variables based on their label (?)

Hi Jordi!

After identifying variables with/without this word or expression in their labels, what do you want to do with them? I wrote a tutorial (long ago) that computes mean scores over only variables with a certain pattern in their labels (http://www.spss-tutorials.com/select-va ... le-labels/) but this is probably not what you need.

Could you provide some more details on what you'd like to accomplish and the steps you're having in mind? "RECODE..." is used in many different situations (see http://www.spss-tutorials.com/recode/) so that doesn't really tell us much. Besides, it may not even be the best approach, given where you're heading.

Uploading a (possibly anonymized) sample of your data may also help us since it makes clear the exact structure we're dealing with.

Kind regards,

Ruben Geert van den Berg
www.spss-tutorials.com
Jordi van der Torre
Posts: 9
Joined: Tue Jul 08, 2014 4:21 pm

### Re: Distinguishing variables based on their label (?)

Thank you for your reply, Ruben. I've tried attaching a sample, but the website says the attachment quota has been reached. Which seems odd for a 32 KB file.
I'll try to clarify. Say for instance that I want to use Split File to divide my output based on a specific variable (male or female), or I want to put an independent variable as a factor in an ANOVA to compare the means of two groups. This is part of what I want, but I also want to make that distinction between entire variables (based not on their values but on their labels, so to speak), to see if one group of variables has gotten different results than the other.
How can I select and compare these two types of variables? Even manually would do. I just stumbled on something called Multiple Response Sets, but I'm not sure what it does and if it's useful to me. Same goes for a Custom Attribute.
RubenGeert
Posts: 100
Joined: Mon May 19, 2014 6:06 am

### Re: Distinguishing variables based on their label (?)

Bummer about the upload limit. Since I'm not active behind the scenes of this website I've no way of telling what the problem could be. But you could also use Dropbox to make the file publicly available via a link.

Still not clear what the structure of your data is but it kinda sounds as if you could take mean scores over both sets of variables and compare these with a paired samples T test. But that's no more than a guess. I don't think MR Sets or Custom Attributes are going to help you out here.
Jordi van der Torre
Posts: 9
Joined: Tue Jul 08, 2014 4:21 pm

### Re: Distinguishing variables based on their label (?)

It worked when I put it in a zip file. Hope this helps.
The prepositions that I need to use to distinguish the variables are "door" and "naar" by the way. For example:

'De dames en heren ploeterden zich een weg naar boven.' (naar)
'De zangeres baande zich een weg door haar publiek.' (door)
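To illustrate the kind of grouping I'm after, here is a minimal sketch in Python rather than SPSS syntax. It assumes the variable labels are available as plain text; the variable names `v1` and `v2` are hypothetical, and the two labels are simply the example sentences above.

```python
import re

# Hypothetical variable names mapped to their labels (the two example sentences).
labels = {
    "v1": "De dames en heren ploeterden zich een weg naar boven.",
    "v2": "De zangeres baande zich een weg door haar publiek.",
}

def group_by_preposition(labels, prepositions=("door", "naar")):
    """Return {preposition: [names of variables whose label contains it as a whole word]}."""
    groups = {p: [] for p in prepositions}
    for name, label in labels.items():
        for p in prepositions:
            # \b keeps "door" from matching inside longer words such as "doordat".
            if re.search(rf"\b{re.escape(p)}\b", label, flags=re.IGNORECASE):
                groups[p].append(name)
    return groups

groups = group_by_preposition(labels)
print(groups)
```

A label that happens to contain both words would end up in both groups, so the result is worth checking by eye.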
RubenGeert
Posts: 100
Joined: Mon May 19, 2014 6:06 am

### Re: Distinguishing variables based on their label (?)

Sure does help. Honestly, your data look a bit weird. Seems you swapped values and value labels. But no worries, that's fixable.

However, what's the fundamental difference between the "0" and "1" answers? And what are you trying to figure out anyway?

Best,

Ruben
Jordi van der Torre
Posts: 9
Joined: Tue Jul 08, 2014 4:21 pm

### Re: Distinguishing variables based on their label (?)

Oh, right. Thanks for pointing that out.
All of the "question" sentences contain a construction known as the "way-construction" (to ... his way to/through ...). This is the object of my research. Basically, what the "answer" sentences say is whether or not the subject of the sentence made it to its destination. So the 1 stands for "destination reached" and the 0 for "destination not reached". After having figured out if one interpretation is dominant overall or not (I believe with a single sample t-test), I want to know whether or not the use of a different preposition (door/naar) makes a difference to the interpretation. If it were just gender differences I'm interested in I wouldn't be bothering you, but I don't know what to look for here.

I'm also planning on checking the reliability by comparing the scores of the individual respondents, but this is where I got stuck so I haven't gotten around to that yet.
RubenGeert
Posts: 100
Joined: Mon May 19, 2014 6:06 am

### Re: Distinguishing variables based on their label (?)

OK, why don't you do this:

1) Swap your value labels and values.

2) Rename your variables into "door_1" to "door_6" and "naar_1" to "naar_6".

3) Run tests for one proportion (=.5) for the 6 "door" and the 6 "naar" variables. Check out the pattern in the percentages (are the percentages for "door" variables lower/higher than those for "naar" variables? Is the difference consistent?).

4) Perhaps do that for men and women separately.
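
For readers who want to see what step 3 computes, here is a rough sketch in Python rather than SPSS. It assumes an exact two-sided binomial test against a proportion of .5. The counts of "1" answers per variable are made up for illustration (only N = 44 comes from this thread), and the variable names follow the renaming proposed in step 2.

```python
from math import comb

def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value: total probability of all
    outcomes that are no more likely than the observed count k."""
    probs = [comb(n, i) * p**i * (1 - p)**(n - i) for i in range(n + 1)]
    observed = probs[k]
    # Small tolerance so that ties with the observed probability are included.
    return min(1.0, sum(q for q in probs if q <= observed * (1 + 1e-9)))

# Hypothetical counts of "1" answers out of 44 respondents per variable.
n_respondents = 44
ones_per_variable = {"door_1": 22, "door_2": 25, "naar_1": 36, "naar_2": 30}

for name, ones in ones_per_variable.items():
    p_value = binomial_two_sided_p(ones, n_respondents)
    print(f"{name}: proportion = {ones / n_respondents:.2f}, p = {p_value:.4f}")
```

Comparing the printed proportions for the "door" and "naar" variables side by side is the pattern check described in step 3.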
Jordi van der Torre
Posts: 9
Joined: Tue Jul 08, 2014 4:21 pm

### Re: Distinguishing variables based on their label (?)

Alright, I see where you're going. But...
RubenGeert wrote: 3) Run tests for one proportion (=.5) for the 6 "door" and the 6 "naar" variables. Check out the pattern in the percentages (are the percentages for "door" variables lower/higher than those for "naar" variables? Is the difference consistent?).
How exactly do I run tests for one proportion? And is there, ideally, a way to get a p-value out of it (my teacher suggested a paired samples t-test)?
Also, I would like to get output on the proportion of people that chose one type of sentence over the other, only without the door/naar distinction (so I wouldn't have an independent variable). I'm just not sure which test to use exactly. My teacher suggested a single sample t-test, even though Andy Field's diagram says Chi-square.
As you can see, still mightily confused.
RubenGeert
Posts: 100
Joined: Mon May 19, 2014 6:06 am

### Re: Distinguishing variables based on their label (?)

I believe you can technically use

1) One sample chi square test
2) One sample T test
3) Test for one proportion

for each variable separately but I also read that 3) should be the method of choice here. It says so in this book: www.managementboek.nl/boek/978905352705 ... -den-brink. It's in Dutch but I believe you're Dutch as well so that shouldn't be a problem.

But anyway, at this point you should perhaps read up on statistical tests or consult with your supervisor. I'm happy to answer some brief questions here but the entire analysis of your data is a different thing.

Kind regards,

Ruben Geert van den Berg
www.spss-tutorials.com
Jordi van der Torre
Posts: 9
Joined: Tue Jul 08, 2014 4:21 pm

### Re: Distinguishing variables based on their label (?)

I'm sorry, I failed to google my first question before asking it. After having done that, though, I realized I just know this method as "calculating the probability using a z-score", which I've done before. So now that third point makes more sense. Comparing the percentages for the two preposition groups "by hand", however, is that the only possible solution? Is there perhaps a way to bundle up the variables into two groups, so as to compare them more easily?
After getting sidetracked a bit, I think we're back at my original question.
RubenGeert
Posts: 100
Joined: Mon May 19, 2014 6:06 am

### Re: Distinguishing variables based on their label (?)

The test for one proportion is under "Analyze" => "Nonparametric Tests" => "Legacy Dialogs" => "Binomial".

I can see you'd rather use a single test for the entire hypothesis than separate ones for the twelve relevant variables. However, that raises the question of which "naar" variables should be compared to which "door" variables. You could consider computing means over both sets of 6 variables and using a paired samples T test to compare the composites. But even if you do so, I'd certainly ALSO inspect the 12 variables separately.

Best,

Ruben
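
The composite approach can be sketched as follows, in Python rather than SPSS and with standard library functions only. The four respondents and their answers are invented for illustration. The sketch stops at the t statistic and its degrees of freedom; the p-value would still have to come from a t distribution (SPSS reports it directly).

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical data: each respondent answered 6 "door" and 6 "naar" items (0/1).
respondents = [
    {"door": [1, 0, 1, 0, 0, 1], "naar": [1, 1, 1, 0, 1, 1]},
    {"door": [0, 0, 1, 0, 1, 0], "naar": [1, 0, 1, 1, 0, 1]},
    {"door": [1, 1, 0, 1, 0, 1], "naar": [1, 1, 1, 1, 0, 1]},
    {"door": [0, 1, 0, 0, 1, 0], "naar": [0, 1, 0, 1, 0, 0]},
]

def paired_t(x, y):
    """Paired samples t statistic for x versus y, plus degrees of freedom."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1

# One composite (mean) score per respondent for each set of items.
door_means = [mean(r["door"]) for r in respondents]
naar_means = [mean(r["naar"]) for r in respondents]

t, df = paired_t(naar_means, door_means)
print(f"t({df}) = {t:.2f}")
```

A positive t here means the "naar" composites are higher on average than the "door" composites.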
Jordi van der Torre
Posts: 9
Joined: Tue Jul 08, 2014 4:21 pm

### Re: Distinguishing variables based on their label (?)

Please don't get me wrong, I was very much intending to compare the output from the individual variables (my teacher has always emphasized the importance of this). So I can see the use of the binomial test, and I thank you for pointing it out to me. It's also true that the One Sample t-test does not work very well for me in this case, seeing as it tests for a difference from a specified whole number, and now that I've recoded my variables into 0 and 1, it won't let me specify a test value of 0,5 (which would signify an even distribution). That leaves me wondering whether it would be of any use to change the 0 to -1. Test values of 0 and 1 both give me p< .001 for every variable, so that's not very useful.

When looking at the individual variables it is striking that the standard deviations are quite large (from .390 to .506) for data with a minimum value of 0 and a maximum value of 1. I'm guessing this is due to having such a small test group (N= 44).
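
A side note on those standard deviations, as a small Python sketch: for a variable that only takes the values 0 and 1, the sample standard deviation follows directly from the proportion of 1s and N, so values around .5 may simply reflect proportions near one half rather than the sample size. The counts 22 and 36 below are illustrative assumptions, not read from the data file.

```python
from math import sqrt

def binary_sample_sd(ones, n):
    """Sample standard deviation (n - 1 denominator) of a 0/1 variable
    with the given number of 1s among n cases."""
    p = ones / n
    return sqrt(n / (n - 1) * p * (1 - p))

print(round(binary_sample_sd(22, 44), 3))  # 0.506 (an even split)
print(round(binary_sample_sd(36, 44), 3))  # 0.39 (a lopsided split)
```

With N = 44, an even split gives the largest possible value, and more lopsided splits give smaller ones.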
The actual results for the binomial test show a wide variety in the proportions as well, so, based on that, I think neither the "positive" outcome nor the "negative" outcome is more likely to occur. However, when doing the binomial test on the entire sample (0= 213 (.44), 1= 271 (.56)), it shows that the chances of getting these scores (when assuming a probability parameter of .50) are quite small (p< .05). Given the amount of variation between the individual variables, though, I'm forced to give more credit to their individual analyses. After all, the overall p-value is nice but it does not (apparently) reflect the scores of highly comparable variables and cannot, as such, be taken for granted.
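
The overall figure can be checked by hand with the z-score method mentioned earlier. The Python sketch below uses the counts quoted above (271 of 484 answers were "1") and a normal approximation to the binomial. It treats all 484 answers as independent, which is an assumption, since each respondent answered several items.

```python
from math import sqrt, erfc

def one_proportion_z(k, n, p0=0.5):
    """z statistic and two-sided p-value (normal approximation)
    for observing k successes in n trials when the true proportion is p0."""
    z = (k - n * p0) / sqrt(n * p0 * (1 - p0))
    p_two_sided = erfc(abs(z) / sqrt(2))
    return z, p_two_sided

z, p = one_proportion_z(271, 484)
print(f"z = {z:.2f}, two-sided p = {p:.4f}")  # z = 2.64; p is below .05
```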

The Paired Samples t-test, then, shows a slightly negative but not significant correlation between the means of the door and naar groups (r= -.062). The standard deviations are, luckily, much smaller than in the case of the individual variables. It seems that the individual respondents are more consistent than the group as a whole, which is probably the least I can ask for.
So, it seems that on average my response group showed a significant preference for the "positive" outcome of the "answer" sentences for those "question" sentences that contain "naar" (M= .61, SE= .03), as compared to those that contain "door" (M= .50, SE= .03), t(43)= 2.63, p< .05, r= .37.
This could explain the difference in overall proportion that I've noted above. In response to this result I also did the binomial test for the door and naar groups separately, which gave me the same results. Having a medium effect size also contributes to this result. Of course, there's still the individual analyses to do, but that's too lengthy to get into here, I think.
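
For reference, the effect size above can be reproduced from the t statistic with the conversion r = sqrt(t² / (t² + df)), as given in Andy Field's book. A minimal check in Python, using only the values reported above:

```python
from math import sqrt

def effect_size_r(t, df):
    """Convert a t statistic and its degrees of freedom to the effect size r."""
    return sqrt(t * t / (t * t + df))

print(round(effect_size_r(2.63, 43), 2))  # 0.37
```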

Well, this has taken me some time to figure out and I'm sure that I've done at least something wrong, so I would very much appreciate your opinion on these results. For reference, here's a link to the relevant data: http://snk.to/f-ct9yxya3
RubenGeert
Posts: 100
Joined: Mon May 19, 2014 6:06 am

### Re: Distinguishing variables based on their label (?)

I'm more than happy to assist with some brief questions but I can't supervise the entire data analysis (not unpaid at least). You're asking too much at this point.

Best,

Ruben
Jordi van der Torre
Posts: 9
Joined: Tue Jul 08, 2014 4:21 pm

### Re: Distinguishing variables based on their label (?)

Oh no, that's fine, really. I would mostly like to know if what I've done is in accordance with what you suggested I should do. It is entirely at your leisure to decide in what way you should check this, of course. My data is there, should you wish to view it, but a read-through is all I ask. I would ask someone else, but there's a short (if at all existent) supply of people familiar with statistics in my social circle, and, of course, my teacher is on holiday. I do suppose your hourly fee is quite considerable?
