Dear Superior Statistical Beings,
I am working on a large dataset of over 4 million data points and around 52 variables. The dependent variables are continuous and represent gas and electricity consumption per year, with a separate variable for each year. I also have variables for total and average gas and electricity consumption across all years. The independent variables are mostly categorical (coded either 0/1 or 1 to 5), with one being continuous.
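In case it helps to picture the layout, here is a rough sketch of a few rows in Python. Everything in it is made up for illustration: the column names (`flag_a`, `band_b`, `cont_c`, `gas_2005` and so on), the three years and the value ranges are assumptions, not the real variables or data.

```python
import random

random.seed(0)  # reproducible toy data

# Hypothetical years; the real dataset covers its own set of years.
YEARS = [2005, 2006, 2007]


def make_row():
    """Build one synthetic record mirroring the layout described above."""
    row = {
        "flag_a": random.randint(0, 1),      # hypothetical 0/1 categorical predictor
        "band_b": random.randint(1, 5),      # hypothetical 1-to-5 categorical predictor
        "cont_c": random.uniform(0.0, 1.0),  # hypothetical continuous predictor
    }
    for fuel in ("gas", "elec"):
        # One continuous consumption value per year (made-up range).
        yearly = [random.uniform(1000.0, 20000.0) for _ in YEARS]
        for year, value in zip(YEARS, yearly):
            row[f"{fuel}_{year}"] = value
        # Total and average consumption across all years.
        row[f"{fuel}_total"] = sum(yearly)
        row[f"{fuel}_mean"] = sum(yearly) / len(yearly)
    return row


rows = [make_row() for _ in range(5)]
print(sorted(rows[0].keys()))
```

The real data has the same shape, just with many more predictors and rows.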
What I am hoping to analyse is whether any of the independent variables influence the dependent variable(s), and especially whether there are any relationships between these variables. The main issue I am having is working out which test would be most appropriate. I have mostly done logistic regressions on previous projects and ideally would want to use the same test here, but as my dependent variables are continuous I didn't think that was possible.
I do not know if it would be possible or even advisable to transform the continuous dependent variables into categorical variables, but as there are no clear groups in the data I felt that this would not be appropriate (though I may be wrong).
I appreciate that I have not given much information on the dataset but if you require any more information to help then please ask.
Any help would be much appreciated.