Multivariate linear regression analysis


p_s_
Posts: 11
Joined: Mon Aug 04, 2014 4:15 am

Multivariate linear regression analysis

Postby p_s_ » Mon Aug 04, 2014 4:19 am

I am trying to determine why (and how many) people with health insurance do not fully use all of its benefits (like free flu vaccines). I am using a sample of 400 people, with age, income, and education as dependent variables and having health insurance as the independent variable. I glanced at the information in http://www-01.ibm.com/support/docview.w ... wg21476743 and followed the steps mentioned there.

I got some results like

Multivariate Tests (Design: Intercept + haveinsure)

Effect                           Value  F          Hypothesis df  Error df  Sig.
Intercept   Pillai's Trace       .053   11.361(b)  3.000          470.000   .000
            Wilks' Lambda        .827   11.361(b)  3.000          470.000   .000
            Hotelling's Trace    .069   11.361(b)  3.000          470.000   .000
            Roy's Largest Root   .083   11.361(b)  3.000          470.000   .000
haveinsure  Pillai's Trace       .138   4.570      12.000         1420.000  .000
            Wilks' Lambda        .877   4.797      12.000         1248.086  .000
            Hotelling's Trace    .151   4.998      12.000         1410.000  .000
            Roy's Largest Root   .141   16.101(c)  4.000          473.000   .000

b. Exact statistic
c. The statistic is an upper bound on F that yields a lower bound on the significance level.







Tests of Between-Subjects Effects

Source           Dependent Variable  Type III Sum of Squares  df   Mean Square  F       Sig.
Corrected Model  age                 37.546(a)                4    9.637        3.893   .004
                 education           10.619(b)                4    2.655        .477    .752
                 income              334.245(c)               4    84.061       16.766  .000
Intercept        age                 32.173                   1    34.173       13.805  .000
                 education           141.268                  1    143.268      25.752  .000
                 income              30.201                   1    30.201       6.024   .014
haveinsure       age                 37.546                   4    9.637        3.893   .004
                 education           10.619                   4    2.655        .477    .752
                 income              335.245                  4    84.061       16.766  .000
Error            age                 1171.320                 474  2.475
                 education           2636.013                 474  5.563
                 income              2375.494                 474  5.014
Total            age                 3150.000                 479
                 education           12315.000                479
                 income              6289.000                 479
Corrected Total  age                 1210.866                 478
                 education           2646.633                 478
                 income              2711.739                 478

a. R Squared = .032 (Adjusted R Squared = .024)
b. R Squared = .004 (Adjusted R Squared = -.004)
c. R Squared = .124 (Adjusted R Squared = .117)





Dependent Variable  Parameter       B      Std. Error  t      Sig.   95% Confidence Interval
                                                                     Lower Bound  Upper Bound
age                 Intercept       1      1.573       0.637  0.525  -2.092       4.092
                    [haveinsure=1]  1.173  1.576       0.745  0.456  -1.923       4.268
                    [haveinsure=2]  0.589  1.578       0.373  0.708  -2.514       3.693
education           Intercept       4      2.358       1.697  0.091  -0.635       8.636
                    [haveinsure=1]  0.578  2.362       0.245  0.808  -4.063       5.219
                    [haveinsure=2]  0.388  2.367       0.164  0.87   -4.265       5.04
income              Intercept       1      2.238       0.448  0.659  -3.4         5.4
                    [haveinsure=1]  2.289  2.242       1.021  0.309  -2.118       6.696
                    [haveinsure=2]  0.419  2.245       0.188  0.852  -3.999       4.837

1. Am I approaching the problem in a proper way? I mean, am I doing the right analysis in SPSS for multivariate linear regression analysis (multiple dependent variables, one independent variable)?

2. Which method (Pillai's Trace, Wilks' Lambda, Hotelling's Trace, Roy's Largest Root) should be used for a case like mine?

3. Why is Type III Sum of Squares error 1171.320 for age, education and income?

4. I am new to Multivariate linear regression analysis and also a beginner with SPSS and statistics. How can I interpret and learn more about the output SPSS generated?

Any suggestions would be appreciated.

Thanks
GerineL
Moderator
Posts: 1477
Joined: Tue Jun 10, 2008 4:50 pm

Re: Multivariate linear regression analysis

Postby GerineL » Mon Aug 11, 2014 2:18 pm

Please provide the syntax you used, and upload a file with your results, because they are quite hard to read now ;-)
p_s_
Posts: 11
Joined: Mon Aug 04, 2014 4:15 am

Re: Multivariate linear regression analysis

Postby p_s_ » Tue Aug 12, 2014 2:52 am

In SPSS 22, I chose Analyze->Regression->Binary Logistic, then chose as the dependent variable people who are not using free insurance services (like preventive services), as independent variables age, education, and income, and Enter as the Method. I did not choose anything in Save, Options, Bootstrap, or Style.


LOGISTIC REGRESSION VARIABLES Notusing_free_insurance_services
/METHOD=ENTER Age Gender Education Income
/CONTRAST (Age)=Indicator
/CONTRAST (Gender)=Indicator
/CRITERIA=PIN(.05) POUT(.10) ITERATE(20) CUT(.5).


The results are jumbled up since I copied them from the spv file (SPSS output).
I got

Block 0: Beginning Block
Classification Table(a,b)
|------|----------------------------------|-----------------------------------------------------|
| |Observed |Predicted |
| | |----------------------------------|------------------|
| | |Notusing_free_insurance_services |Percentage Correct|
| | |------------------------------|---| |
| | |.0 |1.0| |
|------|------------------------------|---|------------------------------|---|------------------|
|Step 0|Notusing_free_insurance_services|.0 |293 |0 |100.0 |
| | |---|------------------------------|---|------------------|
| | |1.0|48 |0 |.0 |
| |----------------------------------|------------------------------|---|------------------|
| |Overall Percentage | | |85.9 |
|-----------------------------------------------------------------------------------------------|
a Constant is included in the model.
b The cut value is .500

1. I can understand this, but I need to know what the cut value is.


Variables in the Equation
|---------------|------|----|-------|--|----|------|
| |B |S.E.|Wald |df|Sig.|Exp(B)|
|------|--------|------|----|-------|--|----|------|
|Step 0|Constant|-1.809|.156|134.964|1 |.000|.164 |
|--------------------------------------------------|

2. I need to read up on B, S.E., Wald, df, Sig., and Exp(B), but what should I interpret from these?

Block 1: Method = Enter
Omnibus Tests of Model Coefficients
|------------|----------|--|----|
| |Chi-square|df|Sig.|
|------|-----|----------|--|----|
|Step 1|Step |37.079 |10|.000|
| |-----|----------|--|----|
| |Block|37.079 |10|.000|
| |-----|----------|--|----|
| |Model|37.079 |10|.000|
|-------------------------------|

Model Summary
|----|-----------------|--------------------|-------------------|
|Step|-2 Log likelihood|Cox & Snell R Square|Nagelkerke R Square|
|----|-----------------|--------------------|-------------------|
|1 |240.048a |.103 |.185 |
|---------------------------------------------------------------|

a Estimation terminated at iteration number 20 because maximum iterations has been reached. Final solution cannot be found.

3. Does that mean I need to ignore -2 Log likelihood method's results?

Classification Table(a)
|------|----------------------------------|-----------------------------------------------------|
| |Observed |Predicted |
| | |----------------------------------|------------------|
| | |Notusing_free_insurance_services |Percentage Correct|
| | |------------------------------|---| |
| | |.0 |1.0| |
|------|------------------------------|---|------------------------------|---|------------------|
|Step 1|Notusing_free_insurance_services|.0 |293 |0 |100.0 |
| | |---|------------------------------|---|------------------|
| | |1.0|47 |1 |2.1 |
| |----------------------------------|------------------------------|---|------------------|
| |Overall Percentage | | |86.2 |
|-----------------------------------------------------------------------------------------------|
a The cut value is .500


Variables in the Equation
|---------------|------|----|-------|--|----|------|
| |B |S.E.|Wald |df|Sig.|Exp(B)|
|------|--------|------|----|-------|--|----|------|
|Step 0|Constant|-1.809|.156|134.964|1 |.000|.164 |
|--------------------------------------------------|

4. I should have taken some statistics classes so that I could understand the output, but does the output indicate what I estimated earlier (that people with relatively low income and education, and those who are younger (18 to 24), use free insurance services less than other groups)?

I tried to attach a text file with this output, but got the message "Sorry, the board attachment quota has been reached."

Thank you for helping me out.
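
On the cut value asked about in this post: a minimal sketch, assuming the usual convention that a case is predicted as 1 when its estimated probability exceeds the cut value (.500 here) and as 0 otherwise. The function names are made up for illustration. Using only the counts from the Block 0 classification table (293 cases coded 0, 48 coded 1), it reproduces the 85.9 overall percentage and the Step 0 constant of -1.809.

```python
import math

def classify(probabilities, cut=0.5):
    """Predict 1 when the estimated probability exceeds the cut value, else 0."""
    return [1 if p > cut else 0 for p in probabilities]

def percent_correct(observed, predicted):
    """Share of cases whose predicted class equals the observed class, in percent."""
    hits = sum(1 for o, p in zip(observed, predicted) if o == p)
    return 100.0 * hits / len(observed)

# Counts taken from the Block 0 classification table in the post.
n0, n1 = 293, 48
observed = [0] * n0 + [1] * n1

# With only a constant in the model, every case gets the same estimated
# probability: the sample share of cases coded 1 (about .14, below the cut).
p_constant = n1 / (n0 + n1)
predicted = classify([p_constant] * (n0 + n1), cut=0.5)

block0_accuracy = percent_correct(observed, predicted)

# The Step 0 constant is the log odds of the outcome coded 1.
constant_b = math.log(n1 / n0)

print(round(block0_accuracy, 1))  # 85.9
print(round(constant_b, 3))       # -1.809
```

Because the constant-only probability is below the cut, every case is predicted as 0, which is why the table shows 100.0 percent correct for the 0 row and .0 for the 1 row.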
GerineL
Moderator
Posts: 1477
Joined: Tue Jun 10, 2008 4:50 pm

Re: Multivariate linear regression analysis

Postby GerineL » Tue Aug 12, 2014 8:31 am

Could you try to upload your output file as an attachment?
Furthermore, in order to answer your question we need to know more about your variables.
E.g., income is a scale variable, with lower values indicating a lower income.
p_s_
Posts: 11
Joined: Mon Aug 04, 2014 4:15 am

Re: Multivariate linear regression analysis

Postby p_s_ » Wed Aug 13, 2014 3:00 am

Thanks Gerinel,
GerineL wrote:could you try to upload your output file as an attachment?
I tried to upload it as MS Word or RTF (Rich Text Format), but those types of files are not allowed. Then I tried to upload a PDF and got the message "Sorry, the board attachment quota has been reached." How can I upload my output?

I have pasted again as "code"

Code: Select all

Block 0: Beginning Block
Classification Table(a,b)

|      |Observed                          |Predicted                                            |
|      |                                  |----------------------------------|------------------|
|      |                                  |Notusing_free_insurance_services    |Percentage Correct|
|      |                                  |------------------------------|---|                  |
|      |                                  |.0                            |1.0|                  |
|------|------------------------------|---|------------------------------|---|------------------|
|Step 0|Notusing_free_insurance_services|.0 |293                           |0  |100.0             |
|      |                              |---|------------------------------|---|------------------|
|      |                              |1.0|48                            |0  |.0                |
|      |----------------------------------|------------------------------|---|------------------|
|      |Overall Percentage                |                              |   |85.9              |
|-----------------------------------------------------------------------------------------------|
 a Constant is included in the model.
 b The cut value is .500




Variables in the Equation
|---------------|------|----|-------|--|----|------|
|               |B     |S.E.|Wald   |df|Sig.|Exp(B)|
|------|--------|------|----|-------|--|----|------|
|Step 0|Constant|-1.809|.156|134.964|1 |.000|.164  |
|--------------------------------------------------|

Variables not in the Equation
|-----------------------------------|------|--|----|
|                                   |Score |df|Sig.|
|------|------------------|---------|------|--|----|
|Step 0|Variables         |Age      |12.310|7 |.091|
|      |                  |---------|------|--|----|
|      |                  |Age(1)   |7.595 |1 |.006|
|      |                  |---------|------|--|----|
|      |                  |Age(2)   |.104  |1 |.747|
|      |                  |---------|------|--|----|
|      |                  |Age(3)   |.213  |1 |.644|
|      |                  |---------|------|--|----|
|      |                  |Age(4)   |.948  |1 |.330|
|      |                  |---------|------|--|----|
|      |                  |Age(5)   |4.997 |1 |.025|
|      |                  |---------|------|--|----|
|      |                  |Age(6)   |2.392 |1 |.122|
|      |                  |---------|------|--|----|
|      |                  |Age(7)   |.164  |1 |.685|
|      |                  |---------|------|--|----|
|      |                  |Gender(1)|9.617 |1 |.002|
|      |                  |---------|------|--|----|
|      |                  |Education|.258  |1 |.612|
|      |                  |---------|------|--|----|
|      |                  |Income   |8.708 |1 |.003|
|      |----------------------------|------|--|----|
|      |Overall Statistics          |29.755|10|.001|
|--------------------------------------------------|


Block 1: Method = Enter
Omnibus Tests of Model Coefficients
|------------|----------|--|----|
|            |Chi-square|df|Sig.|
|------|-----|----------|--|----|
|Step 1|Step |37.079    |10|.000|
|      |-----|----------|--|----|
|      |Block|37.079    |10|.000|
|      |-----|----------|--|----|
|      |Model|37.079    |10|.000|
|-------------------------------|

Model Summary
|----|-----------------|--------------------|-------------------|
|Step|-2 Log likelihood|Cox & Snell R Square|Nagelkerke R Square|
|----|-----------------|--------------------|-------------------|
|1   |240.048(a)       |.103                |.185               |
|---------------------------------------------------------------|
 a Estimation terminated at iteration number 20 because maximum iterations has been reached. Final solution cannot be found.


Classification Table(a)
|------|----------------------------------|-----------------------------------------------------|
|      |Observed                          |Predicted                                            |
|      |                                  |----------------------------------|------------------|
|      |                                  |Notusing_free_insurance_services    |Percentage Correct|
|      |                                  |------------------------------|---|                  |
|      |                                  |.0                            |1.0|                  |
|------|------------------------------|---|------------------------------|---|------------------|
|Step 1|Notusing_free_insurance_services|.0 |293                           |0  |100.0             |
|      |                              |---|------------------------------|---|------------------|
|      |                              |1.0|47                            |1  |2.1               |
|      |----------------------------------|------------------------------|---|------------------|
|      |Overall Percentage                |                              |   |86.2              |
|-----------------------------------------------------------------------------------------------|
 a The cut value is .500

Variables in the Equation
|-----------------|-------|---------|-----|--|-----|-------------|
|                 |B      |S.E.     |Wald |df|Sig. |Exp(B)       |
|-------|---------|-----------------|-----|--|-----|-------------|
|Step 1a|Age      |                 |1.578|7 |.979 |             |
|       |---------|-------|---------|-----|--|-----|-------------|
|       |Age(1)   |19.509 |16141.860|.000 |1 |.999 |297074329.512|
|       |---------|-------|---------|-----|--|-----|-------------|
|       |Age(2)   |19.068 |16141.860|.000 |1 |.999 |191083831.399|
|       |---------|-------|---------|-----|--|-----|-------------|
|       |Age(3)   |18.916 |16141.860|.000 |1 |.999 |164082825.458|
|       |---------|-------|---------|-----|--|-----|-------------|
|       |Age(4)   |18.642 |16141.860|.000 |1 |.999 |124807540.072|
|       |---------|-------|---------|-----|--|-----|-------------|
|       |Age(5)   |.329   |17766.143|.000 |1 |1.000|1.389        |
|       |---------|-------|---------|-----|--|-----|-------------|
|       |Age(6)   |.386   |19221.419|.000 |1 |1.000|1.471        |
|       |---------|-------|---------|-----|--|-----|-------------|
|       |Age(7)   |.711   |43313.214|.000 |1 |1.000|2.035        |
|       |---------|-------|---------|-----|--|-----|-------------|
|       |Gender(1)|-1.062 |.338     |9.899|1 |.002 |.346         |
|       |---------|-------|---------|-----|--|-----|-------------|
|       |Education|.327   |.153     |4.579|1 |.032 |1.386        |
|       |---------|-------|---------|-----|--|-----|-------------|
|       |Income   |-.215  |.094     |5.255|1 |.022 |.807         |
|       |---------|-------|---------|-----|--|-----|-------------|
|       |Constant |-21.401|16141.860|.000 |1 |.999 |.000         |
|----------------------------------------------------------------|
 a Variable(s) entered on step 1: Age, Gender, Education, Income.



GerineL wrote: Furthermore, in order to answer your question we need to know more about your variables.
E.g., income is a scale variable, with lower values indicating a lower income.
Yes, income, age, education are such variables. Now, can you please advise on my questions?

I appreciate your assistance and time.
GerineL
Moderator
Posts: 1477
Joined: Tue Jun 10, 2008 4:50 pm

Re: Multivariate linear regression analysis

Postby GerineL » Wed Aug 13, 2014 7:59 am

Sorry again, we keep coming back :-)

I don't think all your variables are like that. For instance, if I look at age, it seems to consist of many variables.
Hence, it is not a normal scale variable.

So please indicate - for each variable - what it consists of.
p_s_
Posts: 11
Joined: Mon Aug 04, 2014 4:15 am

Re: Multivariate linear regression analysis

Postby p_s_ » Thu Aug 14, 2014 2:54 am

GerineL wrote:Sorry again, we keep coming back :-)
Can I post an attachment?
GerineL wrote: I don't think all your variables are like that. For instance, if I look at age, it seems to consist of many variables.
Hence, it is not a normal scale variable.

So please indicate - for each variable - what it consists of.
I removed age as an independent variable.

In SPSS 22, I chose Analyze->Regression->Binary Logistic, then chose as the dependent variable people having insurance who are not using free insurance services (like preventive services), coded as 1 (people having insurance and using free insurance services are coded as 0); as independent variables education and income; and Enter as the Method. I did not choose anything in Save, Options, Bootstrap, or Style.

I have pasted the "variables in equation", "step 1".

Code: Select all


Variables in the Equation
                    B       S.E.  Wald    df  Sig.  Exp(B)
Step 1a  Gender(1)  -1.076  .330  10.643  1   .001  .341
         Education  .295    .129  5.224   1   .022  1.343
         Income     -.315   .090  12.175  1   .000  .730
         Constant   -1.789  .558  10.284  1   .001  .167
a Variable(s) entered on step 1: Gender, Education, Income.
 
1. I need to read up on B, S.E., Wald, df, Sig., and Exp(B), but what should I interpret from these?

2. I should have taken some statistics classes so that I could understand the output, but does the output indicate what I estimated earlier (that people with relatively low income and education, and males, use free insurance services less than other groups)?


Thank you for your time and assistance.
GerineL
Moderator
Posts: 1477
Joined: Tue Jun 10, 2008 4:50 pm

Re: Multivariate linear regression analysis

Postby GerineL » Fri Aug 15, 2014 8:42 am

http://www.ats.ucla.edu/stat/spss/output/logistic.htm

On this website, you see a step-by-step explanation of how to interpret these outcomes.
In your case, it seems like education will increase the likelihood of having insurance, income will decrease it.
Gender is hard to say, because you did not explain how this one is coded.

Again, it would be very helpful if you could explain how your data are coded, because interpretation of the results depends on that!!
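
To make the direction of each effect concrete, here is a small sketch that recomputes Exp(B) from the B column of the three-predictor table posted above and turns the coefficients into a predicted probability of the outcome coded 1. The respondent values (Gender(1) = 1, Education = 3, Income = 2) are hypothetical, and the thread does not say what the category codes mean, so the result only illustrates the arithmetic.

```python
import math

# B values copied from the "Variables in the Equation" table (Step 1).
coefficients = {
    "Gender(1)": -1.076,
    "Education": 0.295,
    "Income": -0.315,
    "Constant": -1.789,
}

def odds_ratio(b):
    """Exp(B): multiplicative change in the odds per one-unit increase."""
    return math.exp(b)

def predicted_probability(coefs, values):
    """Logistic prediction; `values` maps predictor name -> value (no Constant)."""
    logit = coefs["Constant"] + sum(coefs[name] * x for name, x in values.items())
    return 1.0 / (1.0 + math.exp(-logit))

odds_ratios = {name: round(odds_ratio(b), 3) for name, b in coefficients.items()}
print(odds_ratios)

# Hypothetical respondent; the codes are assumptions made for illustration.
example = predicted_probability(
    coefficients, {"Gender(1)": 1, "Education": 3, "Income": 2}
)
print(round(example, 3))
```

An Exp(B) above 1 (Education) means the odds of the outcome coded 1 go up as the predictor increases; an Exp(B) below 1 (Income, Gender(1)) means they go down.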
p_s_
Posts: 11
Joined: Mon Aug 04, 2014 4:15 am

Re: Multivariate linear regression analysis

Postby p_s_ » Sat Aug 16, 2014 5:41 am

Thanks GerineL,
GerineL wrote:http://www.ats.ucla.edu/stat/spss/output/logistic.htm

On this website, you see a step-by-step explanation of how to interpret these outcomes.
In your case, it seems like education will increase the likelihood of having insurance, income will decrease it.
Gender is hard to say, because you did not explain how this one is coded.

Again, it would be very helpful if you could explain how your data are coded, because interpretation of the results depends on that!!

I re-estimated the model with education as categorical variable.

Code: Select all

Variables in the Equation
                       B        S.E.       Wald   df  Sig.  Exp(B)
Step 1a  Education                         4.249  7   .751
         Education(1)  -19.877  28133.640  .000   1   .999  .000
         Education(2)  -19.993  11220.675  .000   1   .999  .000
         Education(3)  -1.042   .996       1.095  1   .295  .353
         Education(4)  -.415    .887       .219   1   .640  .660
         Education(5)  .232     1.048      .049   1   .825  1.261
         Education(6)  -.111    .891       .015   1   .901  .895
         Education(7)  .539     1.110      .236   1   .627  1.714
         Income        -.274    .093       8.610  1   .003  .760
         Gender(1)     -1.039   .333       9.703  1   .002  .354
         Constant      -.172    .973       .031   1   .859  .842
a Variable(s) entered on step 1: Education, Income, Gender.

The categorical variables were coded in a way unclear to me by SPSS 22.
Female was given 1 and Male 0 though there were around 200 females and about 160 males.

Do you think they were coded properly?

I appreciate your time and assistance.
GerineL
Moderator
Posts: 1477
Joined: Tue Jun 10, 2008 4:50 pm

Re: Multivariate linear regression analysis

Postby GerineL » Tue Aug 19, 2014 8:44 am

I don't know if using Education like this makes sense; basically, you now compare each of the seven conditions with one other condition.

Could you give us something like this?

Gender: dummycoded (1 = female 1 = male)
education: Ordinal (ranging from 1 = no education to 7 = masters degree)

etc.
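
The "compare each condition with one other condition" point can be sketched as follows: with indicator (dummy) coding, every level except one reference level gets its own 0/1 column, so each coefficient such as Education(1) is a contrast against that reference level. The level codes and the choice of reference below are assumptions for illustration; the thread does not state which level SPSS used as the reference.

```python
def indicator_code(values, reference):
    """Return (column_levels, rows) of 0/1 indicators, omitting the reference level."""
    levels = sorted(set(values))
    if reference not in levels:
        raise ValueError("reference level does not occur in the data")
    columns = [level for level in levels if level != reference]
    rows = [[1 if v == level else 0 for level in columns] for v in values]
    return columns, rows

# Hypothetical education codes for five respondents; level 8 is the reference.
education = [1, 2, 3, 3, 8]
columns, rows = indicator_code(education, reference=8)

print(columns)  # [1, 2, 3]
for value, row in zip(education, rows):
    print(value, row)
```

A respondent at the reference level gets all zeros, which is why eight education levels would produce only seven coefficients.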
p_s_
Posts: 11
Joined: Mon Aug 04, 2014 4:15 am

Re: Multivariate linear regression analysis

Postby p_s_ » Wed Aug 20, 2014 3:44 am

Thanks GerineL,
GerineL wrote:I don't know if using Education like this makes sense; basically, you now compare each of the seven conditions with one other condition.
I did not get this. I thought education was categorized as 7 levels Education(1) to Education(7). Can you please explain this more?
GerineL wrote: Could you give us something like this?

Gender: dummycoded (1 = female 1 = male)
education: Ordinal (ranging from 1 = no education to 7 = masters degree)

etc.
I will post it.
Does Sig in SPSS output refer to p?

1. I realize p is the proportion (or probability) having an insurance at the given value of the explanatory variables. But is the Sig. in the SPSS output the same as p, meaning that if Sig. is less than 0.05 for, say, age, then age significantly affects usage of preventive services? Am I understanding correctly?

2. Sorry if this is naive, but does SE mean Standard Error, df stand for Degrees of freedom, and Wald refer to the Wald test (http://en.wikipedia.org/wiki/Wald_test)?

I appreciate your time and assistance.
GerineL
Moderator
Posts: 1477
Joined: Tue Jun 10, 2008 4:50 pm

Re: Multivariate linear regression analysis

Postby GerineL » Wed Aug 20, 2014 8:33 am

p_s_ wrote: Does Sig in SPSS output refer to p?
Yes.
1. I realize p is the proportion (or probability) having an insurance at the given value of the explanatory variables. But is the Sig. in the SPSS output the same as p, meaning that if Sig. is less than 0.05 for, say, age, then age significantly affects usage of preventive services? Am I understanding correctly?
Yes, you are understanding correctly. Be careful with words like "affect" though: as you probably did not experimentally manipulate anything, you cannot draw conclusions about causality.
2. Sorry if this is naive, but does SE mean Standard Error, df stand for Degrees of freedom, and Wald refer to the Wald test (http://en.wikipedia.org/wiki/Wald_test)?
Yes, you are correct.


If I were you, I would look at some case-studies using logistic regression, such as this one:
http://core.ecu.edu/psyc/wuenschk/MV/Mu ... c-SPSS.PDF
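
As a rough illustration of how B, S.E., Wald, and Sig. fit together for a single-df term: the Wald statistic is (B / S.E.) squared, and Sig. is the upper-tail probability of that value under a chi-square distribution with 1 degree of freedom. The sketch below uses the rounded Gender(1) row posted earlier in the thread (B = -1.076, S.E. = .330), so it matches the printed Wald of 10.643 only approximately. The function names are made up for this example.

```python
import math

def wald_test(b, se):
    """Wald chi-square (1 df) and its p-value for a single coefficient."""
    wald = (b / se) ** 2
    # Upper tail of chi-square with 1 df: P(Z^2 > wald) = erfc(sqrt(wald / 2)).
    p_value = math.erfc(math.sqrt(wald / 2.0))
    return wald, p_value

def exp_b_interval(b, se, z=1.96):
    """Approximate 95% confidence interval for Exp(B)."""
    return math.exp(b - z * se), math.exp(b + z * se)

# Rounded values from the Gender(1) row of the three-predictor model.
wald, p_value = wald_test(-1.076, 0.330)
lower, upper = exp_b_interval(-1.076, 0.330)

print(round(wald, 2), round(p_value, 3))
print(round(lower, 3), round(upper, 3))
```

If the interval for Exp(B) excludes 1, the corresponding Sig. is below .05, which is the same conclusion read from a different column.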
p_s_
Posts: 11
Joined: Mon Aug 04, 2014 4:15 am

Re: Multivariate linear regression analysis

Postby p_s_ » Thu Aug 21, 2014 3:21 am

Thanks Gerinel,

That means something is flawed with my analysis. I have posted results with sig values most of which are close to 1.

Code: Select all

Variables in the Equation
                       B        S.E.       Wald   df  Sig.   Exp(B)
Step 1a  Age                               2.196  6   .901
         Age(1)        19.097   13651.701  .000   1   .999   196747485.390
         Age(2)        18.727   13651.701  .000   1   .999   135881344.086
         Age(3)        18.585   13651.701  .000   1   .999   117886225.083
         Age(4)        17.635   13651.701  .000   1   .999   45569434.410
         Age(5)        -.511    15322.555  .000   1   1.000  .600
         Age(6)        .181     17048.756  .000   1   1.000  1.198
         Gender(1)     -1.058   .352       9.021  1   .003   .347
         Education                         6.994  7   .430
         Education(1)  -2.226   30163.899  .000   1   1.000  .108
         Education(2)  -19.990  9907.915   .000   1   .998   .000
         Education(3)  -2.017   1.181      2.916  1   .088   .133
         Education(4)  -1.561   1.098      2.021  1   .155   .210
         Education(5)  .543     1.234      .194   1   .660   1.721
         Education(6)  -1.015   1.055      .924   1   .336   .363
         Education(7)  -.603    1.233      .239   1   .625   .547
         Income                            7.259  8   .509
         Income(1)     .631     .862       .536   1   .464   1.880
         Income(2)     1.089    .905       1.449  1   .229   2.972
         Income(3)     -.282    1.127      .063   1   .802   .754
         Income(4)     -1.021   1.161      .773   1   .379   .360
         Income(5)     -.699    1.427      .240   1   .624   .497
         Income(6)     .587     .996       .348   1   .555   1.799
         Income(7)     -19.720  9801.456   .000   1   .998   .000
         Income(8)     -19.130  10488.552  .000   1   .999   .000
         Constant      -19.042  13651.701  .000   1   .999   .000
a Variable(s) entered on step 1: Age, Gender, Education, Income.



1. Could separating income (0 to 15999, then 16000 to 25999, and so on), education (middle school, high school, and so on), and age into categories have caused this?

2. How can I trace and fix the error?

3. For a regression analysis like mine, do I need to worry about SE (Standard Error), df (Degrees of freedom), and Wald (the Wald test, http://en.wikipedia.org/wiki/Wald_test)?


I appreciate all your assistance and time.
GerineL
Moderator
Posts: 1477
Joined: Tue Jun 10, 2008 4:50 pm

Re: Multivariate linear regression analysis

Postby GerineL » Thu Aug 21, 2014 9:58 am

p_s_ wrote:
1. Could separating income (0 to 15999, then 16000 to 25999, and so on), education (middle school, high school, and so on), and age into categories have caused this?
Yes. If you separate income into groups like you indicated, this basically means that you assume that there will be no difference between an income of 1 or 15000, but there will be a difference between incomes of 25999 and 26000.
There are some cases in which it would make sense to separate into categories like you did, but usually using the variable as a continuous variable is the way to go. I assume that in your case, using the continuous variable would make much more sense. Can you indicate why you chose to make categories?
2. How can I trace and fix the error?
First of all, it could be that there just is no relation. That's why you perform the test, to see if there is a relation between your variables.
Second, I think it will help a lot to use the variables continuously instead of creating categories.
3. For a regression analysis like mine, do I need to worry about SE (Standard Error), df (Degrees of freedom), and Wald (the Wald test, http://en.wikipedia.org/wiki/Wald_test)?
These are all elements of the analysis, which are used to come to the conclusion about significance. I am not sure what you mean by "worry about these". It is common to report them though.
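
One way to trace the problem, offered as a sketch rather than a diagnosis: coefficients near plus or minus 19 with standard errors in the thousands, as in the tables above, are a pattern often seen when some category contains only 0s or only 1s on the outcome, so the estimate for that category cannot settle. Whether that is what happened in this data set is an assumption; the thread does not confirm it. The check is a cross-tabulation of each categorical predictor against the outcome, looking for empty cells. The data below are hypothetical.

```python
from collections import defaultdict

def outcome_counts_by_category(categories, outcomes):
    """Cross-tabulate a categorical predictor against a 0/1 outcome."""
    counts = defaultdict(lambda: [0, 0])  # category -> [count of 0s, count of 1s]
    for category, y in zip(categories, outcomes):
        counts[category][y] += 1
    return dict(counts)

def categories_without_variation(categories, outcomes):
    """Categories in which the outcome is always 0 or always 1."""
    counts = outcome_counts_by_category(categories, outcomes)
    return sorted(c for c, (n0, n1) in counts.items() if n0 == 0 or n1 == 0)

# Hypothetical data: income bracket 8 has no case with outcome 1.
income_bracket = [1, 1, 2, 2, 3, 3, 8, 8, 8]
not_using = [0, 1, 1, 0, 0, 1, 0, 0, 0]

print(outcome_counts_by_category(income_bracket, not_using))
print(categories_without_variation(income_bracket, not_using))  # [8]
```

If such categories turn up, merging sparse brackets or entering the variable as continuous, as suggested above, removes the empty cells.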
p_s_
Posts: 11
Joined: Mon Aug 04, 2014 4:15 am

Re: Multivariate linear regression analysis

Postby p_s_ » Fri Aug 22, 2014 1:23 am

Thanks Gerinel,
GerineL wrote:
1. Could separating income (0 to 15999, then 16000 to 25999, and so on), education (middle school, high school, and so on), and age into categories have caused this?

Yes. If you separate income into groups like you indicated, this basically means that you assume that there will be no difference between an income of 1 or 15000, but there will be a difference between incomes of 25999 and 26000.
There are some cases in which it would make sense to separate into categories like you did, but usually using the variable as a continuous variable is the way to go. I assume that in your case, using the continuous variable would make much more sense. Can you indicate why you chose to make categories?
I chose categories because I thought they cannot be differentiated from each other using a mathematical method. But, now I think only gender and education are categorical. Income and age can be continuous. Do you think this is proper?
GerineL wrote: 2. How can I trace and fix the error?

First of all, it could be that there just is no relation. That's why you perform the test, to see if there is a relation between your variables.
Second, I think it will help a lot to use the variables continuously instead of creating categories.
Did you mean I should use all the variables continuously or only some like income, age?
GerineL wrote: 3. For a regression analysis like mine, do I need to worry about SE (Standard Error), df (Degrees of freedom), and Wald (the Wald test, http://en.wikipedia.org/wiki/Wald_test)?

These are all elements of the analysis, which are used to come to the conclusion about significance. I am not sure what you mean by "worry about these". It is common to report them though.
By "worry about these", I wanted to know: can I ignore them, or do I need to know what they indicate for my analysis and explain the reasons for it?

I appreciate your assistance and time.
