In most real-world recommender systems, the observed rating data are subject to selection bias and are therefore missing-not-at-random. Learning a recommender from such biased feedback is challenging, as naive approaches that ignore selection bias are widely known to yield suboptimal results. A well-established solution is propensity scoring. The propensity score is the probability that a given data point is observed, and weighting each observed data point by the inverse of its propensity enables unbiased performance estimation. However, the performance of propensity-based unbiased estimation is often sensitive to the choice of the propensity estimation model and suffers from high variance. To overcome these limitations, we propose a model-agnostic meta-learning method inspired by the asymmetric tri-training framework for unsupervised domain adaptation. The proposed method uses two predictors to generate data with reliable pseudo-ratings and a third predictor to make the final predictions. In a theoretical analysis, we derive a propensity-independent upper bound of the true performance metric and show that the proposed method can minimize this bound. We conduct comprehensive experiments on public real-world datasets. The results suggest that previous propensity-based methods are strongly affected by the choice of propensity model and by the variance caused by inverse propensity weighting. Moreover, we show that the proposed meta-learning method is robust to these issues and can facilitate the development of effective recommendations from biased explicit feedback.
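
To make the inverse propensity weighting idea concrete, the following is a minimal sketch, not the implementation used in this work. It compares a naive average over observed entries with an inverse-propensity-scored (IPS) average on synthetic data. The function names (`naive_estimate`, `ips_estimate`, `simulate`), the uniform per-entry losses, and the linear observation model are all illustrative assumptions, and the true propensities are assumed to be known, which is rarely the case in practice.

```python
import random


def naive_estimate(losses, observed):
    """Average loss over observed entries only; biased when data are MNAR."""
    seen = [loss for loss, o in zip(losses, observed) if o]
    return sum(seen) / len(seen)


def ips_estimate(losses, observed, propensities):
    """IPS estimate of the mean loss over ALL entries.

    Each observed loss is weighted by 1 / propensity, and the sum is
    divided by the total number of entries (observed or not).
    """
    weighted = sum(
        loss / p
        for loss, o, p in zip(losses, observed, propensities)
        if o
    )
    return weighted / len(losses)


def simulate(n=100_000, seed=0):
    """Synthetic MNAR data (assumed model): low-loss entries are observed more often."""
    rng = random.Random(seed)
    losses, observed, propensities = [], [], []
    for _ in range(n):
        loss = rng.random()             # hypothetical per-entry loss in [0, 1)
        p = 0.1 + 0.8 * (1.0 - loss)    # assumed true propensity in (0.1, 0.9]
        losses.append(loss)
        propensities.append(p)
        observed.append(rng.random() < p)
    true_mean = sum(losses) / n         # target: mean loss over all entries
    return (
        true_mean,
        naive_estimate(losses, observed),
        ips_estimate(losses, observed, propensities),
    )


if __name__ == "__main__":
    true_mean, naive, ips = simulate()
    print(f"true={true_mean:.3f} naive={naive:.3f} ips={ips:.3f}")
```

In this sketch the naive average underestimates the true mean loss because low-loss entries are over-represented among the observations, whereas the IPS average stays close to it. The `1 / p` weights also grow large as propensities approach zero, which is the source of the variance problem noted above.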