pythonforspss.org wrote: I'm not sure whether your current approach is the fastest one. Could you please share with us:

1) How many combinations are there altogether?

2) How many different values / formulas does each of the two outcome variables have?

3) Where are the rules that map these on the combinations of conditions? Are they in an ordered file?

Even better if you could share these rules and an (anonymized) sample of your data.

I suspected there was probably a more efficient way to evaluate all of these various conditions. (That was my next challenge to explore!)

1) There are 11 combinations to evaluate. Each one uses all three variables (EndDate, StartDate, and Reason), connected with AND, to determine what to do. Each date variable is evaluated as either missing or not missing, and Reason as greater than 0, equal to 0, or missing:

missing(StartDate)

not(missing(StartDate))

missing(EndDate)

not(missing(EndDate))

Reason gt 0

Reason = 0

missing(Reason)
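As a cross-check on the count, the states listed above can be crossed in a few lines of Python. This is only a sketch: the state labels are illustrative shorthand for the conditions, not values from the actual data. The full cross of 2 x 2 x 3 states has 12 cells, one more than the 11 combinations mentioned, so it may be worth confirming whether one cell is intentionally left without a rule.

```python
from itertools import product

# Labels are illustrative shorthand for the conditions listed above,
# not values taken from the actual data.
START_STATES = ("missing", "present")
END_STATES = ("missing", "present")
REASON_STATES = ("gt 0", "eq 0", "missing")

# Every (StartDate, EndDate, Reason) state combination.
combinations = list(product(START_STATES, END_STATES, REASON_STATES))

for start, end, reason in combinations:
    print(f"StartDate {start:8} | EndDate {end:8} | Reason {reason}")

print(len(combinations), "combinations in the full cross")
```

Listing the cells this way makes it easy to tick off which rule handles each one.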

2) LengthOfStay has three formulas. They are:

LengthOfStay = datediff(EndDate,StartDate,"days"). /* but when Group = 2, MidDate should be used instead of EndDate */.

LengthOfStay = 888. /* still in care */.

LengthOfStay = 999. /* can't calculate - EndDate, StartDate, or both are missing */.
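For illustration, the first formula could be mirrored in Python roughly as follows. This is a minimal sketch, not the actual syntax in use: the function name `days_in_care` is hypothetical, and it assumes the dates it needs are present. It covers only the date-difference case, because the conditions that select 888 versus 999 are not spelled out here.

```python
from datetime import date
from typing import Optional


def days_in_care(start: date, end: Optional[date],
                 mid: Optional[date], group: int) -> int:
    """Whole days from StartDate to the applicable end date.

    Hypothetical helper: Group 2 uses MidDate in place of EndDate,
    as described in the comment on the first formula.
    """
    stop = mid if group == 2 else end
    if stop is None:
        # Assumption: missing dates are handled by the 888/999 rules,
        # so this helper simply refuses to guess.
        raise ValueError("the end date required for this group is missing")
    return (stop - start).days


# Group 1 uses EndDate; Group 2 uses MidDate.
print(days_in_care(date(2020, 1, 1), date(2020, 1, 31), None, 1))
print(days_in_care(date(2020, 1, 1), date(2020, 1, 31), date(2020, 1, 11), 2))
```

Subtracting two `date` objects gives a `timedelta`, and its `.days` attribute is the whole-day difference, which plays the role of datediff with "days" for date-only values.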

ReasonNew has four formulas:

ReasonNew = Reason

ReasonNew = 777 /* Reason is invalid */.

ReasonNew = 888 /* still in care */.

ReasonNew = 999 /* Reason is missing */.

Also, if (for example) LengthOfStay = 999, this doesn't necessarily mean ReasonNew = 999.

3) I'm not sure what you mean about the location of the rules. Each DO IF block represents a rule (i.e., a combination to evaluate and a value to return). The rules are not stored anywhere else; they exist only in the syntax.