Recent studies suggest that machine learning can be applied to develop good automatic evaluation metrics for machine-translated sentences. This paper further analyzes aspects of the learning setup that impact metric performance. We argue that the previously proposed approach of training a Human-Likeness classifier does not correlate as well with human judgments of translation quality, and that regression-based learning produces more reliable metrics.
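As a rough illustration of the distinction drawn above, and not the experimental setup used in this paper, the sketch below contrasts the two formulations on the same hypothetical sentence-level features: a regressor is trained directly against graded human quality judgments, while a Human-Likeness classifier is trained only to separate human from machine output. All data, feature dimensions, and model choices here are assumptions for illustration.

```python
# Minimal sketch (illustrative only): regression vs. Human-Likeness
# classification for learning a sentence-level MT evaluation metric.
import numpy as np
from sklearn.svm import SVR, SVC

rng = np.random.default_rng(0)

# Hypothetical data: each row holds reference-based feature scores
# (e.g., n-gram precisions, edit-distance measures) for one MT output.
X = rng.random((100, 5))

# Regression formulation: predict graded human judgments (e.g., adequacy
# on a 1-5 scale), so the learned metric itself outputs a quality score.
human_scores = rng.uniform(1, 5, size=100)
regressor = SVR(kernel="rbf").fit(X, human_scores)
predicted_quality = regressor.predict(X)

# Classification formulation: only distinguish human translations from
# machine outputs; the binary decision is a weaker proxy for quality.
is_human = rng.integers(0, 2, size=100)
classifier = SVC(kernel="rbf").fit(X, is_human)
predicted_label = classifier.predict(X)
```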