Two major statistical issues confronting comparative analyses of hospital outcomes are adequacy of case-mix adjustment and proper accounting for random variation. Hierarchical modeling has been proposed to improve precision and reduce the impact of random variation but becomes difficult to implement when there are numerous case-mix factors to control. In this paper we formulate the problem of hospital performance comparisons within the framework of potential outcomes and illustrate an approach to hospital comparisons which combines multiple category propensity score methods for the control of case-mix variations with hierarchical Bayesian modeling of case-mix adjusted summaries. The approach is similar to that proposed by Huang et al. (Health Serv Res 40:253–278, 2005) but extends their approach by using a Bayesian model to accommodate hospital level attributes and to facilitate joint modeling of performance for multiple outcomes. The analytical approach is illustrated by a comparison of 30 day post admission mortality risks for patients treated for acute myocardial infarction, pneumonia or stroke in 34 New Zealand public hospitals. In a small simulation study, reported in electronic supplementary material, hierarchical models outperformed non-hierarchical models, achieving both better credible interval coverage and shorter average interval lengths for measures of between hospital variation based on contrasts between the 90th and 10th percentiles of the mortality risk distribution. Simulation performance of hierarchical and non-hierarchical models in detecting unusual performance was similar.