Multiple IF statements with “COMPUTE”

Moderators: statman, Analyst Techy, andris, Fierce, GerineL, Smash

iara
Posts: 11
Joined: Tue Oct 20, 2015 3:22 pm

Multiple IF statements with “COMPUTE”

Postby iara » Sat Jan 16, 2016 3:53 pm

I am using COMPUTE and If statements that combine place of birth and ancestry data from the census into a new variable.

If I use the filter I get the same answer as the individual statement, but if I combine all of the formats into one (see “Multiple” syntax below) some of the countries are corrected while others will give a varying answer.

Why would this occur? What did I do wrong when combining all of the criteria into one statement?


I would be happy to use the individual and then combine them later, but I was having trouble combining the individual variables using DO IF (see last set of syntax). Each individual variable I created had a unique code so I thought If I summed them the codes would remain the same. For example if respondents indicated that either their place of birth is Albania or have Albanian first or second ancestry they would get a code of 1 all other responses are 0. Therefore when summing them, I thought for each row of data there would be a number or a 0 so the sum would be the code. Is this the wrong assumption?

I have had success using COMPUTE and DO IF when including a few criteria, but when having 57 statements something went wrong and I am not sure what. Any insight is greatly appreciated. I know it is a lot of syntax, but I thought showing exactly what I did would better illustrate where I went wrong.

Thanks!


_____________
Syntax

Filter (Results match those of the Individual new variable):

USE ALL.
COMPUTE filter_$=(CITY=4610 AND (BPLD=43000 or ANCESTR1D=1000 OR ANCESTR2D=1000)).
VARIABLE LABELS filter_$ '(CITY=4610 AND (BPLD=43000 or ANCESTR1D=1000 OR ANCESTR2D=1000)) (FILTER)'.
VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'.
FORMATS filter_$ (f1.0).
FILTER BY filter_$.
EXECUTE.


Individual:
COMPUTE Country1= 0.
If (BPLD=43000 | ANCESTR1D=1000 | ANCESTR2D=1000) Country1 =1.
VALUE LABELS Country1
1 Country1.
Formats Individual (F5.0).
EXECUTE.

DO IF (used to combine all of the individual variables into one new variable):

DO IF
(Country1 = 1 or Country2 = 2 or Country3 = 3….)
Compute MUSLIM = sum(Country1+ Country2+ Country3…)
ELSE.
COMPUTE NewVar = 0.
End IF.


Multiple (Creating new variable based on multiple criteria):

COMPUTE NEWVAR= 0.
If (BPLD=43000 | ANCESTR1D=1000 | ANCESTR2D=1000) NEWVAR=1.
If (BPLD=43330 | ANCESTR1D=1300 | ANCESTR2D=1300) NEWVAR=2.
If (BPLD=45740 | ANCESTR1D=1522 | ANCESTR2D=1522) NEWVAR=3.
If (BPLD=45790) NEWVAR=4.
If (BPLD=46541 | ANCESTR1D=1010 | ANCESTR2D=1010) NEWVAR=5.
If (BPLD=46543) NEWVAR=6.
If (BPLD=46544 | ANCESTR1D=1083 | ANCESTR2D=1083) NEWVAR=7.
If (BPLD=46545 | ANCESTR1D=1655 | ANCESTR2D=1655) NEWVAR=8.
If (BPLD=46546) NEWVAR=9.
If (BPLD=46547 | ANCESTR1D=1690 | ANCESTR2D=1690) NEWVAR=10.
If (BPLD=51000) NEWVAR=11.
If (BPLD=51200 | ANCESTR1D=7301 | ANCESTR2D=7301) NEWVAR=12.
If (BPLD=51400 | ANCESTR1D=7700 | ANCESTR2D=7700) NEWVAR=13.
If (BPLD=52000 | ANCESTR1D=6000 | ANCESTR2D=6000) NEWVAR=14.
If (BPLD=52110 | ANCESTR1D=6031 | ANCESTR2D=6031 | ANCESTR1D=6032 | ANCESTR2D=6032 | RACED=664) NEWVAR=15.
If (BPLD=52140 | ANCESTR1D=6801 | ANCESTR2D=6801 | RACED=669) NEWVAR=16.
If (BPLD=52200 | ANCESTR1D=4160 | ANCESTR2D=4160) NEWVAR=17.
If (BPLD=52300 | ANCESTR1D=6950 | ANCESTR2D=6950) NEWVAR=18.
If (BPLD=53000 | ANCESTR1D=4150 | ANCESTR2D=4150) NEWVAR=19.
If (BPLD=53200 | ANCESTR1D=4170 | ANCESTR2D=4170) NEWVAR=20.
If (BPLD=53420 | ANCESTR1D=4650 | ANCESTR2D=4650) NEWVAR=21.
If (BPLD=53500 | ANCESTR1D=4210 | ANCESTR2D=4210) NEWVAR=22.
If (BPLD=53600 | ANCESTR1D=4230 | ANCESTR2D=4230) NEWVAR=23.
If (BPLD=53700 | ANCESTR1D=4250 | ANCESTR2D=4250) NEWVAR=24.
If (BPLD=53800 | ANCESTR1D=4360 | ANCESTR2D=4360) NEWVAR=25.
If (BPLD=53900 | ANCESTR1D=4390 | ANCESTR2D=4390) NEWVAR=26.
If (BPLD=54000 | ANCESTR1D=4270 | ANCESTR2D=4270) NEWVAR=27.
If (BPLD=54100 | ANCESTR1D=4290 | ANCESTR2D=4290) NEWVAR=28.
If (BPLD=54200 | ANCESTR1D=4340 | ANCESTR2D=4340) NEWVAR=29.
If (BPLD=54300 | ANCESTR1D=4800 | ANCESTR2D=4800) NEWVAR=30.
If (BPLD=54400 | ANCESTR1D=4350 | ANCESTR2D=4350) NEWVAR=31.
If (BPLD=60011 | ANCESTR1D=4000 | ANCESTR2D=4000) NEWVAR=32.
If (BPLD= 60012 | ANCESTR1D=4020 | ANCESTR2D=4020) NEWVAR=33.
If (BPLD=60013 | ANCESTR1D=4040 | ANCESTR2D=4040) NEWVAR=34.
If (BPLD=60014 | ANCESTR1D=4060 | ANCESTR2D=4060) NEWVAR=35.
If (BPLD =60016 | ANCESTR1D=4080 | ANCESTR2D=4080) NEWVAR=37.
If (BPLD=60017) NEWVAR=38.
If (BPLD=60021) NEWVAR=39.
If (BPLD=60022 | ANCESTR1D=5270 | ANCESTR2D=5270) NEWVAR=40.
If (BPLD=60024 | ANCESTR1D=5300 | ANCESTR2D=5300) NEWVAR=41.
If (BPLD=60025 | ANCESTR1D=5310 | ANCESTR2D=5310) NEWVAR=42.
If (BPLD=60026 | ANCESTR1D=5320 | ANCESTR2D=5320) NEWVAR=43.
If (BPLD=60028 | ANCESTR1D=5460 | ANCESTR2D=5460) NEWVAR=44.
If (BPLD=60029 | ANCESTR1D=5470 | ANCESTR2D=5470) NEWVAR=45.
If (BPLD=60030 | ANCESTR1D=5510 | ANCESTR2D=5510) NEWVAR=46.
If (BPLD=60032 | ANCESTR1D=5640 | ANCESTR2D=5640) NEWVAR=48.
If (BPLD=60033 | ANCESTR1D=5660 | ANCESTR2D=5660) NEWVAR=49.
If (BPLD=60042) NEWVAR=50.
If (BPLD=60043 | ANCESTR1D=5190 | ANCESTR2D=5190) NEWVAR=51.
If (BPLD=60053 | ANCESTR1D=5680 | ANCESTR2D=5680) NEWVAR=53.
If (BPLD=60054 | ANCESTR1D=5820 | ANCESTR2D=5820) NEWVAR=54.
If (BPLD=60062) NEWVAR=55.
If (BPLD=60065 | ANCESTR1D=5230 | ANCESTR2D=5230) NEWVAR=56.
If (BPLD=60074 | ANCESTR1D=5130 | ANCESTR2D=5130) NEWVAR=57.
EXECUTE.
GerineL
Moderator
Posts: 1477
Joined: Tue Jun 10, 2008 4:50 pm

Re: Multiple IF statements with “COMPUTE”

Postby GerineL » Mon Jan 18, 2016 9:22 am

what do you mean when you say "some of the countries are corrected while others will give a varying answer." ?

Just a note: are you aware that | stands for "or" , not for "and"?

It is not really clear what you have done, so it is difficult to understand the quesiton.
iara
Posts: 11
Joined: Tue Oct 20, 2015 3:22 pm

Re: Multiple IF statements with “COMPUTE”

Postby iara » Mon Jan 18, 2016 3:24 pm

Thanks for replying, GerineL.

I am trying to get an estimate for groups using a combination of their place of birth and ancestry using Nested If Statements. My question is how can I compute a NEWVAR with only the selected countries rather than 57 separate variables for the countries. NEWVAR =What all of the groups have in common.

I intentionally used OR. What am doing is creating new country categories so for example if the place of birth is Albania OR either their first OR second ancestry is Albania then in the new variable they will receive a code 1 (Value label = Albania). Therefore, if Albania shows up at all in their response they will receive a code for Albania. The same applies to the other countries in my list.

If I compute each variable individually then I get what I assume is the correct answer. For example, The total for Albania is 30,428. However, when computed in a group, the total is 28,572. The same for other countries. It should not make a difference.

Albania computed using this command:

COMPUTE Country1= 0.
If (BPLD=43000 | ANCESTR1D=1000 | ANCESTR2D=1000) Country1 =1.
VALUE LABELS Country1
1 Country1.
Formats Individual (F5.0).
EXECUTE.

Should give the same result when it is a group such as using Nested If statements:
If (BPLD=43000 | ANCESTR1D=1000 | ANCESTR2D=1000) NEWVAR=1.
If (BPLD=43330 | ANCESTR1D=1300 | ANCESTR2D=1300) NEWVAR=2.
:
:


I tried using a DO IF and the same occurred. Where the total for one country is correct the total for the other is not.

Thanks!
iara
Posts: 11
Joined: Tue Oct 20, 2015 3:22 pm

Re: Multiple IF statements with “COMPUTE”

Postby iara » Mon Jan 18, 2016 5:54 pm

I think I am getting the incorrect values because the IF Statement overwriting the results of the previous ones.

What is the best command to use when the conditions involves two or more variables (POB, ANCESTRY1, ANCESTRY2, etc) AND want all conditions to prevail for the new variable?

Therefore if a respondent selects POB as Albania and his or her ancestry as Albanian they will get a 1, but if another person selects POB as Kosovo but ancestry as Albanian they will not be coded.

Thanks.

ETA:

I believe this explanation explains why my answers are different: http://www-01.ibm.com/support/docview.w ... wg21476060.

Problem remains how do I recode all of this variables into a new variable if they meet my specified conditions. Is there a way to combine all 57 variables in such a way that I can use my value statements as is (i.e., if I combine the variables, 1 will remain Albania and so forth)?

Currently the data are not mutually exclusive so I cannot sum the results. Is there a way to make them MUTU?

Count
Var2 Total
Var2
Var1 0 22550 22550
Var1 22 22
Total 22572 22572



Count
Var2 Total
Var2
Var3 0 22557 22557
Var3 15 15
Total 22572 22572
GerineL
Moderator
Posts: 1477
Joined: Tue Jun 10, 2008 4:50 pm

Re: Multiple IF statements with “COMPUTE”

Postby GerineL » Tue Jan 19, 2016 9:31 am

I think you are right: With each new if-statement, the old value overwrites.
What is important now is that you think about what country you want to end up with.
So indeed, if a person has albania for question 1, kosovo for question 2 and germany for question 3, what do you want the final variable to look like?
iara
Posts: 11
Joined: Tue Oct 20, 2015 3:22 pm

Re: Multiple IF statements with “COMPUTE”

Postby iara » Fri Jan 22, 2016 1:30 pm

In the end I created two sets of variables: one using Compute for the individual countries and another using Do If for a mutually exclusive total.
My problem was I was conflating these two tasks. In other words I was trying to get a mutually exclusive country list, but that was contingent upon the order in which the countries appeared. Of course the results would be inaccurate.

Thanks for helping me work through this.

Who is online

Users browsing this forum: No registered users and 2 guests

cron