## how to analyse multiple variables with the same scope at case-level

### how to analyse multiple variables with the same scope at case-level

Hi people,

I have a question. I am analysing some data that I received as an existing database, and I want to analyse it at case level in SPSS. I am now facing a difficulty.
At case level, I have ten risk factors and I want to analyse the correlation between the different risk factors and the (nominal) dependent variable. How do I do this in a scientifically sound way?

For example:

In case 1:
riskfactor 1 is 'aaa', riskfactor 2 is 'ccc' and the dependent variable is 'zzz' (yes/no).

In case 2:
riskfactor 1 is 'ccc', riskfactor 2 is 'bbb', riskfactor 3 is 'aaa' and the dependent variable is 'zzz' (yes/no).

### Re: how to analyse multiple variables with the same scope at case-level

so how are your data organized? is it 1 variable that indicates what the risk factors are? Is there a 1/0 for each risk factor? Are all risk factors scale level, and in your example, risk is 0 for risk factor bbb for person 1? Does it matter what riskfactor "1" vs "2" is? etc.
### Re: how to analyse multiple variables with the same scope at case-level

Hi GerineL,

In total there are about 300 different risk factors, but the maximum number of risk factors per case is 10. So there are 10 different variables (numeric) with the same categories of risk factors. All those risk factors are important to measure. The risk factors are at scale level, and I think that it matters what riskfactor '1' vs riskfactor '2' is, because I have to analyse the correlation between the different risk factors and the dependent variable.

Do you have any suggestions?

Thanks
### Re: how to analyse multiple variables with the same scope at case-level

not exactly clear how your data are organized yet. You say: there are 10 different variables (numeric) with the same categories of risk factors.
What do you mean by that?

Can you maybe give an example of 2 / 3 cases to illustrate how your data are organized?

another question: what is your exact research question? are you interested in the effect of specific risk factors, the total amount of risk, or...?
### Re: how to analyse multiple variables with the same scope at case-level

Hi GerineL,

My main goal is to analyse whether there is a correlation between the risk factors and the dependent variable (committing a crime or not).
The next step I will take is to build a prediction model using regression analysis. I think the best option is to create nominal (dummy) variables per case for the risk factors.
In the attachment you can find some examples. Because of different interests I can't show you more information in the attachments.

### Re: how to analyse multiple variables with the same scope at case-level

Almost there

So, you have 10 risk factors.
These each have a value.
Then there is also a possible range from 0 to ...? for each risk factor. This indicates number of times risk factor was present? or ...?

Then regarding your goal: you say you want to analyze the correlation between the risk factors and the dependent variable.
But do you want to know whether the level of each of the specific risks is associated with the outcome (i.e., 10 analyses)?
Or do you want to know whether the total number of risks is associated with the outcome?
or something else?
i.e., can you frame your actual research question in words, rather than use the word correlation?
### Re: how to analyse multiple variables with the same scope at case-level

Yes, I have 10 risk factors in my dataset. In theory there are more existing risk factors, so every variable (riskfactor 1, riskfactor 2 etc.) contains the same categories (range 0 to 30). This means that each variable "riskfactor x" contains other categories, which are the real risk factors.

I am testing the efficiency of a detection tool. In practice no case has more than 10 risk factors, which is why the dataset contains ten "riskfactor" variables. -99 means that the variable is not applicable. For each category (real risk factor) I want to analyse the correlation with the dependent variable, so I think that I have to create dummies of these categories. The design of the database makes this very hard for me.
### Re: how to analyse multiple variables with the same scope at case-level

I'm very sorry but I don't seem to be able to understand how your dataset is set up.

How I understand it now:

For every participant, there are 10 variables that contain information about risk factors.
Each variable can have a value between 1 and 30, indicating which risk factor (if any) is present.
Variable 1 contains risk factors 1 thru 30, variable 2 contains risk factors 31 thru 60 etc.
In total, there are 300 possible risk factors, but because all participants have at most 1 risk factor in each category (e.g., in variable 1 they have no risk factors or 1 of the 30 possible risk factors), their maximum total number of risk factors is 10.

Is that correct?
### Re: how to analyse multiple variables with the same scope at case-level

Hi GerineL,

That is correct!
### Re: how to analyse multiple variables with the same scope at case-level

yay!!!
one more thing regarding the risk factors: you indicated that they are at scale level, but with my description, they are actually nominal (present / not present). correct?

Are you interested in:

- whether or not a risk factor is present in each of the 10 categories predicts the outcome (e.g., if in variable 1 there is a risk factor, is outcome more likely, independent of whether it is for instance risk factor 11 or 23)?
- which of the total 300 risk factors is related to the outcome?
- if altogether, the risk factors increase the likelihood of the outcome (e.g., if any risk factor, doesn't matter which one, is present, is outcome more likely)?
- if altogether, the total number of risk factors (doesn't matter which ones) increases the likelihood of the outcome?

or something else?
### Re: how to analyse multiple variables with the same scope at case-level

hi gerinel,

Yes:D!

I am interested in the following analyses:

"whether or not a risk factor is present in each of the 10 categories predicts the outcome (e.g., if in variable 1 there is a risk factor, is outcome more likely, independent of whether it is for instance risk factor 11 or 23)?"

"which of the total 300 risk factors is related to the outcome?"

and:

"if altogether, the risk factors increase the likelihood of the outcome (e.g., if any risk factor, doesn't matter which one, is present, is outcome more likely)"

Thank you very much!
### Re: how to analyse multiple variables with the same scope at case-level

Good! Okay!!!

Now per question:
"whether or not a risk factor is present in each of the 10 categories predicts the outcome (e.g., if in variable 1 there is a risk factor, is outcome more likely, independent of whether it is for instance risk factor 11 or 23)?"
compute dummy variables (present / not present) for each of the 10 categories.
"which of the total 300 risk factors is related to the outcome?"
compute dummy variables for each of the risk factors (unless you have an extremely large dataset, you probably won't find significant results)
"if altogether, the risk factors increase the likelihood of the outcome (e.g., if any risk factor, doesn't matter which one, is present, is outcome more likely)"
create 1 dummy risk / no risk and relate to outcome.
personally, I would also create a dummy that indicates how many risk factors are present and relate to outcome.
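
A minimal Python sketch of the recoding steps above, in case it helps to see them spelled out (the same logic could be written in SPSS syntax). It assumes the layout confirmed earlier in the thread: each case has a row of "riskfactor" variables, variable 1 holds risk factors 1 thru 30, variable 2 holds 31 thru 60 etc., and -99 means not applicable. The function names and the three-variable example cases are hypothetical and only there for illustration.

```python
NOT_APPLICABLE = -99  # missing code mentioned in the thread


def slot_dummies(case):
    """One present / not present dummy per "riskfactor" variable."""
    return [0 if value == NOT_APPLICABLE else 1 for value in case]


def factor_dummies(case, per_slot=30):
    """Dummies keyed by overall risk factor number (1..300 with ten variables).

    Assumption: variable k (0-based) holds a value 1..per_slot, so the
    overall risk factor number is k * per_slot + value.
    Only factors that are present get an entry; all others count as 0.
    """
    present = {}
    for k, value in enumerate(case):
        if value != NOT_APPLICABLE:
            present[k * per_slot + value] = 1
    return present


def any_risk(case):
    """Single risk / no risk dummy: 1 if at least one factor is present."""
    return int(any(value != NOT_APPLICABLE for value in case))


def risk_count(case):
    """Number of risk factors present in the case."""
    return sum(slot_dummies(case))


# Hypothetical cases with three variables instead of ten, to keep it short.
cases = [
    [5, -99, 12],
    [-99, -99, -99],
    [30, 1, -99],
]
for case in cases:
    print(slot_dummies(case), factor_dummies(case), any_risk(case), risk_count(case))
```

Whether a value of 0 should also count as "not present" depends on how the database codes it; the sketch only treats -99 as absent.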

Each of these analyses could be done with a chi-square test or Fisher's exact test.
see here for more information; in each analysis you have 1 categorical IV (the risk dummy) and 1 categorical DV (the outcome)

http://www.ats.ucla.edu/stat/mult_pkg/whatstat/
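
To make the two tests concrete, here is a rough standard-library Python sketch for a single 2x2 table (dummy present / not present by outcome yes / no). SPSS produces both from a crosstab, so this is only meant to show what is being computed. The function names and the example counts are made up; the chi-square is the plain Pearson statistic without continuity correction, and the Fisher p-value is the usual two-sided version that sums all tables at most as likely as the observed one.

```python
from math import comb


def crosstab_2x2(dummy, outcome):
    """Count the four cells of the dummy-by-outcome table as (a, b, c, d)."""
    a = sum(1 for x, y in zip(dummy, outcome) if x == 1 and y == 1)
    b = sum(1 for x, y in zip(dummy, outcome) if x == 1 and y == 0)
    c = sum(1 for x, y in zip(dummy, outcome) if x == 0 and y == 1)
    d = sum(1 for x, y in zip(dummy, outcome) if x == 0 and y == 0)
    return a, b, c, d


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (1 df) for the table [[a, b], [c, d]].

    Fails with a division by zero if a whole row or column is empty.
    """
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher exact p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    # Small tolerance so ties with the observed table are included.
    return sum(prob(x) for x in range(low, high + 1)
               if prob(x) <= p_observed * (1 + 1e-9))


# Made-up example: 8 cases, risk dummy vs. outcome.
dummy = [1, 1, 1, 1, 0, 0, 0, 0]
outcome = [1, 1, 1, 0, 1, 0, 0, 0]
table = crosstab_2x2(dummy, outcome)
print(table, chi_square_2x2(*table), fisher_exact_2x2(*table))
```

With cell counts this small the chi-square approximation is not reliable, which is the situation where Fisher's exact test is normally preferred.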

p.s.
I know that there are also some data mining analyses that might be useful to you.
I know of the existence of these techniques, but I don't know anything about how they work etc. You might want to look into that as well.
I know that Johns Hopkins offers courses on data mining techniques on Coursera; depending on how much time you have, that might be something to check out!
### Re: how to analyse multiple variables with the same scope at case-level

Thank you very much!

I have one last question regarding the following sentence:
"if altogether, the risk factors increase the likelihood of the outcome (e.g., if any risk factor, doesn't matter which one, is present, is outcome more likely)"

you said: "create 1 dummy risk / no risk and relate to outcome.
personally, I would also create a dummy that indicates how many risk factors are present and relate to outcome."

What do you mean by "risk / no risk" and "relate to outcome"? Do you mean a logistic regression analysis with dummy variables for the risk factors?

I will take a look at coursera.
### Re: how to analyse multiple variables with the same scope at case-level

I would create a variable just indicating how many risk factors are present (0 - 10), and use that as a predictor in logistic regression.
I would not use logistic regression in any of the other analyses: you don't have continuous variables there, so that would be overly complicated.
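
For illustration only, a small standard-library Python sketch of a logistic regression with the number of risk factors as the single predictor, fitted with Newton-Raphson steps. In SPSS you would simply run a binary logistic regression with the count variable as covariate. The function name and the data below are invented, and the sketch assumes the outcome is not perfectly separated by the count (otherwise the estimates do not converge).

```python
from math import exp


def fit_logistic(x, y, iterations=25):
    """Fit P(y = 1) = 1 / (1 + exp(-(b0 + b1 * x))) by Newton-Raphson.

    x: number of risk factors per case, y: outcome coded 0 / 1.
    Returns the intercept b0 and the slope b1.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            # Gradient of the log-likelihood.
            g0 += yi - p
            g1 += (yi - p) * xi
            # Entries of the (negative) Hessian.
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system by hand.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Invented data: count of risk factors and outcome (1 = yes, 0 = no).
counts = [0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
outcome = [0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 1, 1]

b0, b1 = fit_logistic(counts, outcome)
print("intercept:", b0, "slope:", b1, "odds ratio per extra risk factor:", exp(b1))
```

The odds ratio `exp(b1)` is the number to look at: a value above 1 would mean that each additional risk factor goes together with higher odds of the outcome.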

