Energy efficiency is a key factor in next-generation wireless communication systems. Implementing sleep modes in multi-tier 5G networks has proven to be an effective approach for improving energy efficiency. In this paper, we propose a novel reinforcement-learning-based decision-making algorithm for implementing sleep modes in the base stations (BSs) of multi-tier 5G networks. The algorithm is based on a Markov decision process (MDP) and switches a BS between three different power consumption modes to improve the energy efficiency of the network. It selects the BS state based on the offered traffic whilst maintaining a prescribed minimum channel rate per user. Our results show a significant gain in energy efficiency when the proposed MDP algorithm is used together with three-state BSs. We also present the energy-delay tradeoff, which can be used to design a delay-aware network.
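As an illustration of the kind of MDP-based mode switching described above, the following minimal sketch solves a toy three-mode BS model by value iteration. It is not the algorithm of this paper: the mode names, traffic levels, power figures, wake-up costs, transition probabilities, and the penalty used as a stand-in for the minimum-rate constraint are all hypothetical values chosen for illustration, and value iteration with a known model is used in place of a learning procedure.

```python
import itertools

# All names and numbers below are illustrative assumptions, not values from the paper.
MODES = ["active", "light_sleep", "deep_sleep"]   # hypothetical names for the three power modes
TRAFFIC = ["low", "high"]                         # hypothetical discretised offered traffic

POWER = {"active": 10.0, "light_sleep": 4.0, "deep_sleep": 1.0}     # assumed power per step
WAKE_COST = {"active": 0.0, "light_sleep": 2.0, "deep_sleep": 8.0}  # assumed cost of waking up
RATE_PENALTY = 50.0  # assumed penalty standing in for violating the per-user rate constraint
P_TRAFFIC = {        # assumed traffic transition probabilities
    "low": {"low": 0.8, "high": 0.2},
    "high": {"low": 0.3, "high": 0.7},
}
GAMMA = 0.9          # discount factor


def step_cost(mode, traffic, action):
    """Cost of operating in mode `action` for one step, starting from `mode`."""
    cost = POWER[action]
    if action == "active" and mode != "active":
        cost += WAKE_COST[mode]          # pay to wake up from a sleep mode
    if traffic == "high" and action != "active":
        cost += RATE_PENALTY             # sleeping under high traffic breaks the rate target
    return cost


def q_value(V, mode, traffic, action):
    """One-step cost plus discounted expected cost-to-go."""
    expected_next = sum(p * V[(action, t2)] for t2, p in P_TRAFFIC[traffic].items())
    return step_cost(mode, traffic, action) + GAMMA * expected_next


def value_iteration(tol=1e-9, max_iter=10_000):
    """Return the cost-to-go table and the greedy (cost-minimising) policy."""
    states = list(itertools.product(MODES, TRAFFIC))
    V = {s: 0.0 for s in states}
    for _ in range(max_iter):
        new_V = {(m, t): min(q_value(V, m, t, a) for a in MODES) for m, t in states}
        delta = max(abs(new_V[s] - V[s]) for s in states)
        V = new_V
        if delta < tol:
            break
    policy = {
        (m, t): min(MODES, key=lambda a: q_value(V, m, t, a)) for m, t in states
    }
    return V, policy


if __name__ == "__main__":
    V, policy = value_iteration()
    for state, action in sorted(policy.items()):
        print(state, "->", action)
```

Under these assumed costs, the resulting policy keeps the BS active whenever the offered traffic is high and moves it to one of the sleep modes when the traffic is low. Which sleep mode is preferred depends on the assumed wake-up costs and traffic statistics.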