Traffic prediction plays an essential role in many real-world applications, ranging from route planning to vehicular communications. Making accurate predictions is challenging because the spatial-temporal correlations in a traffic network are dynamic and diverse. Numerous methods have been proposed to capture these correlations, but two major challenges remain: i) synchronously capturing dynamic and diverse spatial-temporal correlations has not been fully explored, and ii) existing methods lack an adaptive mechanism for modeling the different influences of different external factors. To overcome these challenges, we propose a unified spatial-temporal attention network named USTAN. First, it constructs a spatial neighbor graph and a temporal neighbor array to describe the spatial and temporal correlations, respectively. By expanding the spatial neighbors along the temporal dimension according to the temporal neighbor array, all spatial-temporal neighbors are gathered together. Second, a single unified attention component is applied to all the spatial-temporal neighbors to capture the spatial-temporal correlations all at once. In addition, a gated fusion module is designed to fuse the external factors adaptively and reduce prediction error. Extensive experiments on real-world traffic prediction tasks demonstrate the superiority of the proposed framework.
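
As a rough illustration of the three steps described above (gathering spatial-temporal neighbors, one attention pass over all of them, and gated fusion of external factors), the following NumPy sketch may help. It is not the USTAN implementation: the function names (`gather_st_neighbors`, `unified_attention`, `gated_fusion`), the scaled dot-product scoring, the sigmoid gate, the choice of query, and the random toy data are all assumptions made for illustration.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def gather_st_neighbors(spatial_neighbors, temporal_offsets, node, t):
    # Expand the spatial neighbors of `node` across every past step listed in
    # the temporal neighbor array, giving (node, time) pairs.
    return [(n, t - dt)
            for dt in temporal_offsets
            for n in spatial_neighbors[node]
            if t - dt >= 0]

def unified_attention(features, query, neighbors):
    # A single attention pass over all spatial-temporal neighbors at once.
    # Assumed form: scaled dot-product scores, values equal to keys.
    keys = np.stack([features[tt, n] for n, tt in neighbors])  # (K, d)
    weights = softmax(keys @ query / np.sqrt(query.shape[0]))  # (K,)
    return weights @ keys, weights

def gated_fusion(h, ext, W_h, W_e):
    # Assumed gate: an element-wise sigmoid decides how much of the traffic
    # representation `h` versus the external-factor embedding `ext` to keep.
    gate = 1.0 / (1.0 + np.exp(-(h @ W_h + ext @ W_e)))
    return gate * h + (1.0 - gate) * ext

# Toy data (hypothetical): 6 time steps, 4 sensors, 8 features per sensor.
rng = np.random.default_rng(0)
T, N, d = 6, 4, 8
features = rng.normal(size=(T, N, d))
spatial_neighbors = {0: [0, 1, 2], 1: [0, 1, 3], 2: [0, 2], 3: [1, 3]}
temporal_offsets = [1, 2, 3]  # temporal neighbor array: look back 1 to 3 steps

neighbors = gather_st_neighbors(spatial_neighbors, temporal_offsets, node=0, t=5)
context, weights = unified_attention(features, features[4, 0], neighbors)

external = rng.normal(size=d)  # stand-in for an external-factor embedding
W_h, W_e = rng.normal(size=(d, d)), rng.normal(size=(d, d))
fused = gated_fusion(context, external, W_h, W_e)
```

In this sketch the query is simply the latest observation of the target sensor, and the gate weights are random rather than learned; a trained model would learn the projections and gate parameters end to end.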