A colleague of mine sent me an interesting technical interview question. It goes something like the following. I’ve changed the vocabulary a lot so that I don’t expose the question to an Internet search. (The actual interview question involved a thief and a mailroom — I changed to a magical genie and a cave).
Each day at 8:00 in the morning, for five days, a good genie delivers treasure with a specified value to a cave. At 11:00 each morning, you have the option of paying a certain amount and taking all treasure in the cave, or doing nothing and waiting until the next day. At 2:00 each afternoon, a bad genie might come to the cave (with some probability p) or not. If the bad genie comes, he takes away all treasure in the cave.
N = 5 # number days in week T = [10,2,8,6,4] # value of treasure delivered each day C = 3 # cost of taking (same for all days) P = 0.15 # probability bad genie comes each day
The question is, what actions should you take in order to maximize your average gain? The question is tricky because you sometimes want to defer paying C and taking the accumulated treasure, but sometimes you want to grab the treasure immediately.
To get a grip on the problem, I wrote a simulation that examines all 32 possible sequences of 5 actions. For example, [0,1,0,0,1] means pay and take the treasure on day  and day  but don’t do anything on the other three days.
For the parameters above, my simulation computes that the best set of actions is [1,0,0,0,1] when the expected gain is 20.11221400 for the week.
loop many weeks loop each day in week val_in_cave += treasure[day] if action[day] == 1 then pay cost, take treasure, val_in_cave = 0 end-if p = random(0,1) if p < P then # bad genie comes val_in_cave = 0 end-if end-loop accumulate weekly gain end-loop return avg gain per week
The simulation gave me good insights. A simulation approach won’t work if the time span is long — factorial(5 days) is manageable but factorial(100 days) is not.
According to the interview question my friend sent me, the true answer for the parameters above is 20.10825000 — about 0.0197% different from my simulation value of 20.11221400.
I spent some time trying to figure out a non-simulation algorithm for the interview question, but my first attempt did not give correct results. The problem is stuck in my head now and so I’m sure I’ll revisit the problem when I have some free time.
Update: The moment I finished writing this blog post, I realized what I did wrong in my non-simulation algorithm. I’ll write a follow-up post in a few days.
There are dozens of movies that feature genies of all shapes and sizes. Left: “The Thief of Bagdad” (1940) won Academy Awards for Cinematography and Special Effects. Center: “The Adventures of Prince Achmed” (1926) is the oldest surviving animated feature film. A beautiful work of art. Right: In “Miracle Beach” (1992) a female genie helps a beach bum try to win the heart of his selfish swimsuit-model neighbor. Not a work of art but has some heart and was better than expected.
# genies_simulation.py import numpy as np N = 5 # number days in week T = [10,2,8,6,4] # value of treasure each day C = 3 # cost of taking P = 0.15 # probability bad genie takes in evening def avg_value_of_actions(actions, n_weeks): # actions like [0,1,0,1,0] grand_sum = 0 # gain for all weeks for j in range(n_weeks): val_obtained_for_week = 0 val_in_cave = 0 for i in range(5): # each day val_in_cave += T[i] # good genie delivers treasure if actions[i] == 1: # pay cost and take treasure val_obtained_for_week -= C val_obtained_for_week += val_in_cave val_in_cave = 0 p = np.random.random() if p "lt" P: # bad genie came, takes all treasure val_in_cave = 0 grand_sum += val_obtained_for_week return grand_sum / n_weeks def main(): print("\nBegin good genie / bad genie simulation ") n_weeks = 1000000 print("\nSetting n_weeks = %d \n " % n_weeks) np.random.seed(1) all_actions = \ [[0,0,0,0,0], [0,0,0,0,1], [0,0,0,1,0], [0,0,1,0,0], [0,1,0,0,0], [1,0,0,0,0], [0,0,0,1,1], [0,0,1,0,1], [0,1,0,0,1], [1,0,0,0,1], [0,0,1,1,0], [0,1,0,1,0], [1,0,0,1,0], [0,1,1,0,0], [1,0,1,0,0], [1,1,0,0,0], [0,0,1,1,1], [0,1,0,1,1], [1,0,0,1,1], [0,1,1,0,1], [1,0,1,0,1], [1,1,0,0,1], [0,1,1,1,0], [1,0,1,1,0], [1,1,0,1,0], [1,1,1,0,0], [0,1,1,1,1], [1,0,1,1,1], [1,1,0,1,1], [1,1,1,0,1], [1,1,1,1,0], [1,1,1,1,1] ] best_gain = 0.0 best_actions = [0,0,0,0,0] for i in range(len(all_actions)): actions = all_actions[i] avg_gain = avg_value_of_actions(actions, n_weeks) print(str(actions) + " : avg_gain = %12.8f " % avg_gain) # print("%12.8f " % avg_gain) if avg_gain "gt" best_gain: best_gain = avg_gain best_actions = actions print("\nBest found: ") print(best_actions) print("%12.8f " % best_gain) print("\nEnd ") if __name__ == "__main__": main()