Reinforcement Learning door