Introduction to Law Test - 3

Domain Adaptation by Constraining Inter-Domain Variability of Latent Feature Representation

Ivan Titov
Saarland University, Saarbruecken, Germany
titov@

Abstract

We consider a semi-supervised setting for domain adaptation where only unlabeled data is available for the target domain. One way to tackle this problem is to train a generative model with latent variables on the mixture of data from the source and target domains. Such a model would cluster features in both domains and ensure that at least some of the latent variables are predictive of the label on the source domain. The danger is that these predictive clusters will consist of features specific to the source domain only and, consequently, a classifier relying on such clusters would perform badly on the target domain. We introduce a constraint enforcing that the marginal distributions of each cluster (i.e., each latent variable) do not vary significantly across domains. We show that this constraint is effective on the sentiment classification task (Pang et al., 2002), resulting in scores similar to the ones obtained by the structural correspondence methods (Blitzer et al., 2007) without the need to engineer auxiliary tasks.

1 Introduction

Supervised learning methods have become a standard tool in natural language processing, and large training sets have been annotated for a wide variety of tasks. However, most learning algorithms operate under the assumption that the training data originates from the same distribution as the test data, though in practice this assumption is often violated. This difference in the data distributions normally results in a significant drop in accuracy. To address this problem, a number of domain-adaptation methods have recently been proposed (see e.g., Daume and Marcu, 2006; Blitzer et al., 2006; Bickel et al., 2007). In addition to the labeled data from the source domain, they also exploit small amounts of labeled data and/or unlabeled data from the target domain to estimate a more .
