Bug in SELECT IF?


Post by mwulfe » Thu Aug 15, 2013 9:32 am

I recently ran a syntax file in version 20 with this line of code:

select if (vara <> N and varb <> M).

Based on the help system, this should remove any case where BOTH vara = N and varb = M. Instead, it removed all cases where vara = N OR varb = M. I only discovered this later when running freqs on varb and noticed there were no cases where varb = M.

Has anyone else found this? Is it a bug?

Marty

Re: Bug in SELECT IF?

Post by apeape » Thu Aug 15, 2013 1:40 pm

There's no bug. Although it is confusing, the software did what it was instructed to do: it kept only cases where vara did not equal N and varb did not equal M. By the logic of that statement, any case that contains an N, an M, or both is removed.

The command you should have used is:

Code:

select if not(vara = 'N' and varb = 'M').
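
To see the difference between the two conditions outside SPSS, here is a minimal Python sketch (not SPSS syntax). The four sample cases and the function names are made up for illustration, and it assumes vara and varb are plain string values with no missing data.

```python
# Hypothetical sample cases as (vara, varb) pairs, made up for illustration.
cases = [("N", "M"), ("N", "X"), ("Y", "M"), ("Y", "X")]

def keep_original(vara, varb):
    # Mirrors: select if (vara <> 'N' and varb <> 'M').
    return vara != "N" and varb != "M"

def keep_intended(vara, varb):
    # Mirrors: select if not(vara = 'N' and varb = 'M').
    return not (vara == "N" and varb == "M")

kept_original = [c for c in cases if keep_original(*c)]
kept_intended = [c for c in cases if keep_intended(*c)]

print(kept_original)  # [('Y', 'X')]
print(kept_intended)  # [('N', 'X'), ('Y', 'M'), ('Y', 'X')]
```

The original condition keeps only the case with neither value, while the negated conjunction drops only the case that has both.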

Re: Bug in SELECT IF?

Post by mwulfe » Thu Aug 15, 2013 2:20 pm

In the SPSS help system, the explanation of "&" says, "Cases that meet both conditions are included in subsequent analyses." It did not keep only those cases where both vara <> N AND varb <> M; it kept only those cases where vara <> N OR varb <> M. So if I have a case where vara <> N and varb = M, it should be selected.

If you're correct, what's the difference between using "and" and "or"?

Re: Bug in SELECT IF?

Post by apeape » Thu Aug 15, 2013 3:30 pm

This can definitely get confusing, especially if you aren't used to Boolean logic. The condition in your original query is known as a 'joint denial' or 'logical NOR'. Put simply, X is true only when *both* A and B are false.
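
The equivalence behind those names is De Morgan's law: "not A and not B" is the same as "not (A or B)", and "not A or not B" is the same as "not (A and B)". A small Python sketch (not SPSS syntax; the function names are only illustrative) checks both identities over every combination of truth values:

```python
from itertools import product

def joint_denial(a, b):
    # True only when both inputs are false.
    return (not a) and (not b)

def nor(a, b):
    return not (a or b)

def alternative_denial(a, b):
    # True unless both inputs are true.
    return (not a) or (not b)

def nand(a, b):
    return not (a and b)

# All four combinations of two truth values.
combos = list(product([True, False], repeat=2))

nor_matches = all(joint_denial(a, b) == nor(a, b) for a, b in combos)
nand_matches = all(alternative_denial(a, b) == nand(a, b) for a, b in combos)

print(nor_matches, nand_matches)  # True True
```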

Read up on Truth Functions (https://en.wikipedia.org/wiki/Truth_function).

Perhaps you'll get a better understanding if we simulate the logical gates in SPSS with the following code and look at the resultant truth tables:

Code:

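* Read the four possible combinations of P and Q as one-character strings.
data list list / P Q (2a1).
begin data.
T F
T T
F F
F T
end data.

* Each logical expression evaluates to 1 (true) or 0 (false).
compute conjunction = (P = 'T' and Q = 'T').
compute disjunction = (P = 'T' or Q = 'T').
compute jointdenial = (P <> 'T' and Q <> 'T').
compute altdenial = (P <> 'T' or Q <> 'T').
execute.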
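
For anyone without SPSS at hand, the same four columns can be tabulated with a short Python sketch. It is an illustration of the logic only, not SPSS syntax; the helper name truth_row is hypothetical, and the rows are in the same order as the data above.

```python
# Same four (P, Q) rows, in the same order as the SPSS data block.
rows = [("T", "F"), ("T", "T"), ("F", "F"), ("F", "T")]

def truth_row(p, q):
    # Each value is 1 (true) or 0 (false), like the SPSS computed variables.
    conjunction = int(p == "T" and q == "T")
    disjunction = int(p == "T" or q == "T")
    jointdenial = int(p != "T" and q != "T")
    altdenial = int(p != "T" or q != "T")
    return (conjunction, disjunction, jointdenial, altdenial)

table = [truth_row(p, q) for p, q in rows]

print("P Q conj disj joint alt")
for (p, q), values in zip(rows, table):
    print(p, q, *values)
```

Only the third row (P = F, Q = F) has a joint denial of 1.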


Re: Bug in SELECT IF?

Post by mwulfe » Thu Aug 15, 2013 5:41 pm

First, I want to thank you for your responses. I appreciate the help!

I am in fact familiar with Boolean logic; I've been using statistics and programming for decades. But I am always open to learning something new.

What I am using in my code is what you have defined as the joint denial, as you mentioned. I ran your code, and the only case where the joint denial was true (1) was the third case, where both P and Q are false. When I used a select if (P <> "T" and Q <> "T"), it correctly retained only that one case.

Applying that to my original example, the only cases that should have been kept were those where *both* conditions, vara <> N and varb <> M, are true, analogous to both P and Q being F. But it was de-selecting those cases where varb = M regardless of the value of vara.

By the way, it seems that the If statement (without select) does this the way I would expect.
