I am developing a logit model for credit risk rating.
The dependent variable is binary: non-performing loan (1) vs. performing loan (0).
The population has 10,000 observations.
I want to cross-validate the model, so I randomly set aside 30% of the observations as a test sample and fitted a logit model on the remaining 70%. I repeated this process several times with different random splits and used the ROC curve on each test sample to measure discriminatory power. All of the models discriminated well. I am now planning to implement the logit model.
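To make the procedure concrete, here is a minimal sketch of what I mean. It is not my actual code or data: the synthetic data set, the helper names (`predict`, `fit_logit`, `roc_auc`, `repeated_holdout`) and the simple gradient-descent fit are all assumptions for illustration, written in plain Python so that it runs without any statistics package. It produces both candidates I am asking about: the models fitted on the 70% samples, and a model fitted on all observations.

```python
import math
import random


def predict(w, x):
    """Probability of class 1 (non-performing) for weights w = [intercept, b1, b2, ...]."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def fit_logit(X, y, lr=0.5, epochs=200):
    """Fit a logit model by batch gradient descent on the mean log-loss."""
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def roc_auc(y_true, scores):
    """Area under the ROC curve: share of (bad, good) pairs ranked correctly."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def repeated_holdout(X, y, repeats=5, test_frac=0.3, seed=1):
    """Repeat the random 70/30 split; return the fitted models and test AUCs."""
    rng = random.Random(seed)
    idx = list(range(len(X)))
    models, aucs = [], []
    for _ in range(repeats):
        rng.shuffle(idx)
        cut = int(len(idx) * test_frac)
        test, train = idx[:cut], idx[cut:]
        w = fit_logit([X[i] for i in train], [y[i] for i in train])
        scores = [predict(w, X[i]) for i in test]
        models.append(w)
        aucs.append(roc_auc([y[i] for i in test], scores))
    return models, aucs


# Synthetic stand-in for the loan data (assumption: two standardized predictors,
# 600 rows instead of 10,000 so that the pure-Python fit stays fast).
data_rng = random.Random(0)
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(600)]
y = [1 if data_rng.random() < predict([-1.0, 1.5, -1.0], x) else 0 for x in X]

# Candidate A: the models fitted on the 70% samples, with their test AUCs.
holdout_models, aucs = repeated_holdout(X, y)

# Candidate B: one model fitted on the whole population (no test sample left over).
full_model = fit_logit(X, y)

print("test AUC per split:", [round(a, 3) for a in aucs])
print("coefficients, full population:", [round(c, 3) for c in full_model])
```

In this sketch each 70% model has its own test AUC, while the full-population model has no held-out observations of its own to be scored on.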
My question is: should I implement the model fitted on the whole population, or one of the models fitted on a 70% random sample?