I am from Belgium and I will try to do my best in English.
I started using SPSS (version 21) last week because I have to analyse the sales price of used cars with multiple regression. But I am having a few problems.
My dependent variable is the sales price, and the independent variables include mileage, age of the car, power, fuel type, transmission, country of origin of the manufacturer, and so on.
1) The first problem concerns the variable "country of origin of the manufacturer". My database includes makes from Germany, France, Italy, Japan, the USA and other countries. I want to analyse whether there is a price difference between the makes/countries. For example: is a German car more expensive than a Japanese car? How can I do this in SPSS? I was thinking about creating multiple dummy variables, but I don't know how to code my data or how to enter it in SPSS.
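To make clear what I mean by dummy variables, here is a minimal sketch of the coding I have in mind, written in Python only because it is easy to paste here (it is not SPSS syntax). The category labels, the choice of Germany as the reference category and the variable names like `d_japan` are my own assumptions for illustration.

```python
# Hypothetical category labels; "Other" collects all remaining countries.
COUNTRIES = ["Germany", "France", "Italy", "Japan", "USA", "Other"]

# Assumption: Germany is the reference (baseline) category and gets no dummy,
# so 6 categories give 5 dummy variables.
REFERENCE = "Germany"


def dummy_code(country):
    """Return 0/1 indicators for one car, one per non-reference country."""
    if country not in COUNTRIES:
        raise ValueError(f"unknown country: {country}")
    return {
        f"d_{c.lower()}": int(country == c)
        for c in COUNTRIES
        if c != REFERENCE
    }


# Made-up example rows, not my real data.
for origin in ["Germany", "Japan", "France"]:
    print(origin, dummy_code(origin))
```

With this coding, a German car has all dummies equal to 0, and every other car has exactly one dummy equal to 1. What I don't know is how to set up the equivalent in SPSS.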
2) After analysing the data, I decided to apply a logarithmic transformation to the sales price, but how should I interpret a scatterplot, for example? Let's say there is a negative relation between sales price and mileage, so higher mileage means lower ln(sales price). What does this mean in terms of the actual sales price?
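Written out, the relation I mean would be something like the following, where b_0 and b_1 are just placeholder names for the intercept and the slope (my own notation, not SPSS output):

```latex
\ln(\text{price}) = b_0 + b_1 \cdot \text{mileage}
\quad\Longleftrightarrow\quad
\text{price} = e^{b_0} \cdot e^{b_1 \cdot \text{mileage}}
```

If I read this correctly, a negative b_1 would mean that each extra unit of mileage multiplies the price by the constant factor e^{b_1} < 1, so the price drops by a roughly fixed percentage rather than by a fixed amount of money. I am not sure this is the right way to interpret it.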
3) Some variables, like "power", show a higher R² when the sales price is not transformed. When I make a scatterplot of sales price against power, R² is 0.384. When I make a scatterplot of ln(sales price) against power, R² is 0.316. On the other hand, variables like mileage and age of the car show a noticeably higher R² when the logarithmic transformation of the sales price is applied. So how do you know whether a transformation is really necessary and whether it will improve the regression model? For me, it would of course be much easier to understand and interpret the output without the logarithmic transformation.
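To show what I am comparing, this is roughly the calculation behind the two R² values, again sketched in Python rather than SPSS. The numbers are made up for illustration and are not my real data, so the printed values will not match the 0.384 and 0.316 above.

```python
import math


def r_squared(x, y):
    """R² of the least-squares line y = a + b*x.

    For a single predictor this equals the squared Pearson correlation.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy ** 2 / (sxx * syy)


# Made-up illustrative values (hypothetical power and price per car).
power = [55, 66, 77, 85, 103, 110, 132]
price = [6500, 7200, 9900, 8800, 14500, 13000, 21000]
log_price = [math.log(p) for p in price]

print("R² of price vs power:    ", round(r_squared(power, price), 3))
print("R² of ln(price) vs power:", round(r_squared(power, log_price), 3))
```

The first R² is measured on the price scale and the second on the ln(price) scale, so I am also unsure whether the two numbers can be compared directly.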
Any help would be very much appreciated.
Thanks in advance!