Abstract: In device-to-device (D2D) communication, when the distance between the source node and the destination node is too large, a relay node can be introduced to improve communication quality. When the link degrades severely and a single relay cannot improve communication quality effectively, multi-relay communication is needed. For multi-relay D2D communication, this paper proposes a multi-relay selection mechanism based on Q-learning, a machine learning algorithm. First, the mechanism determines whether cooperative relaying is needed for communication between the source node and the destination node. Second, the reward value of the Q-function in the Q-learning algorithm is defined in terms of the communication energy consumption in the D2D network. Finally, a satisfaction function is obtained from the transmission distance of the D2D link and the signal-to-noise ratio (SNR) at the receiver. The cooperative relay set is then selected by jointly considering the reward value and the satisfaction. Simulation results show that multi-relay cooperation based on the Q-learning algorithm can significantly reduce transmission delay and balance network load.
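The selection steps outlined above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the one-hop episode model, the form of `satisfaction`, the weighted-sum score, the threshold, and all names, units, and parameter values are assumptions made only for illustration. The reward is taken as the negative energy cost of a relay, and satisfaction grows as distance shrinks and SNR rises.

```python
import math
import random


def satisfaction(distance_m, snr_db, max_distance_m=100.0, snr_mid_db=10.0):
    """Hypothetical satisfaction in [0, 1]: short links with high SNR score higher."""
    distance_term = max(0.0, 1.0 - distance_m / max_distance_m)
    snr_term = 1.0 / (1.0 + math.exp(-(snr_db - snr_mid_db)))  # logistic in SNR
    return distance_term * snr_term


def learn_q_values(energy_cost, episodes=500, alpha=0.1, gamma=0.9,
                   epsilon=0.2, seed=0):
    """Tabular Q-learning over one state (the source) with one action per relay.

    Assumption: each episode is a single hop, so the next state is terminal
    and its value is 0. The reward is the negative energy cost of the relay.
    """
    rng = random.Random(seed)
    relays = list(energy_cost)
    q = {relay: 0.0 for relay in relays}
    for _ in range(episodes):
        # Epsilon-greedy exploration over candidate relays.
        if rng.random() < epsilon:
            relay = rng.choice(relays)
        else:
            relay = max(relays, key=q.get)
        reward = -energy_cost[relay]
        next_value = 0.0  # terminal state after the single hop
        q[relay] += alpha * (reward + gamma * next_value - q[relay])
    return q


def select_relay_set(candidates, q_weight=0.5, threshold=0.0):
    """Keep relays whose combined score exceeds a (hypothetical) threshold."""
    q = learn_q_values({name: c["energy_j"] for name, c in candidates.items()})
    chosen = []
    for name, c in candidates.items():
        score = (q_weight * q[name]
                 + (1.0 - q_weight) * satisfaction(c["distance_m"], c["snr_db"]))
        if score > threshold:
            chosen.append(name)
    return sorted(chosen)


# Illustrative candidate relays; the numbers are made up, not simulation data.
candidates = {
    "r1": {"energy_j": 0.2, "distance_m": 30.0, "snr_db": 15.0},
    "r2": {"energy_j": 2.0, "distance_m": 90.0, "snr_db": 2.0},
    "r3": {"energy_j": 0.3, "distance_m": 50.0, "snr_db": 12.0},
}
relay_set = select_relay_set(candidates)
print(relay_set)
```

In this sketch a relay with high energy cost and a long, noisy link (`r2`) is excluded from the cooperative set, while the low-cost relays with good links are kept. A multi-hop variant would replace the terminal next-state value with the maximum Q-value at the next node.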