These severe problems may come as a surprise, since the sample covariance matrix has appealing properties, such as being the maximum likelihood estimator under normality. But this is to forget what maximum likelihood means. It means the most likely parameter values given the data. In other words: let the data speak (and only the data). This is a sound principle, provided that there is enough data to trust the data. Indeed, maximum likelihood is justified asymptotically, as the number of observations per variable goes to infinity. It is a general drawback of maximum likelihood that it can perform poorly in small samples. For the covariance matrix, small sample problems.
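The dependence on the number of observations per variable can be illustrated with a small simulation. The sketch below is not taken from the text: it assumes i.i.d. standard normal data, so the true covariance matrix is the identity and its condition number is exactly 1, and it uses the hypothetical helper names `sample_cov_condition_number` and `condition_numbers`. It computes the maximum likelihood covariance estimate for several ratios of observations to variables and reports how far the eigenvalues of the estimate spread apart.

```python
import numpy as np


def sample_cov_condition_number(n_vars, n_obs, seed=0):
    """Condition number of the ML sample covariance of i.i.d. N(0, I) data.

    Assumption for illustration: the true covariance is the identity,
    so any condition number above 1 is pure estimation error.
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n_obs, n_vars))   # rows = observations
    S = np.cov(X, rowvar=False, bias=True)     # bias=True divides by n_obs (ML estimate)
    eigenvalues = np.linalg.eigvalsh(S)        # returned in ascending order
    return eigenvalues[-1] / eigenvalues[0]


def condition_numbers(n_vars=20, ratios=(2, 10, 100)):
    """Map each observations-per-variable ratio to the resulting condition number."""
    return {
        ratio: sample_cov_condition_number(n_vars, ratio * n_vars)
        for ratio in ratios
    }


if __name__ == "__main__":
    for ratio, cond in condition_numbers().items():
        print(f"observations per variable = {ratio:>3}: condition number = {cond:.2f}")
```

Under these assumptions, the condition number should move toward 1 as the ratio grows, which is the asymptotic regime in which maximum likelihood is justified. With only a few observations per variable, the estimate's eigenvalues are far more dispersed than those of the true matrix.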