Decentralised reinforcement learning in Markov Games