Regularized Mixtures of Experts in High-Dimensional Data

Bibliographic Details
Main Authors: Faicel Chamroukhi, Huỳnh Bảo Tuyên
Format: Conference paper
Language: English
Published: 2023
Online Access: https://scholar.dlu.edu.vn/handle/123456789/2335
Holding Library: Thư viện Trường Đại học Đà Lạt (Dalat University Library)
Description
Abstract: Mixture of experts (MoE) models are successful neural-network architectures for modeling heterogeneous data in many machine learning problems, including regression, clustering, and classification. Model learning is generally performed by maximum likelihood estimation (MLE). For high-dimensional data, regularization is needed to avoid possible degeneracies or infeasibility of the MLE caused by redundant and correlated features. Regularized maximum likelihood estimation allows a relevant subset of features to be selected for prediction and thus encourages sparse solutions. Variable selection is challenging in the modeling of heterogeneous data, including with MoE models. We consider MoE models for heterogeneous regression data and propose a regularized maximum likelihood estimation with possibly high-dimensional features, based on a dedicated EM algorithm that integrates coordinate-ascent updates of the parameters. Unlike state-of-the-art regularized MLE for MoE, the proposed modeling does not require an approximation of the regularization. The proposed algorithm obtains sparse solutions automatically, without thresholding, and its coordinate-ascent updates avoid matrix inversion, making it scalable. An experimental study shows the good performance of the algorithm in recovering the actual sparse solutions, in parameter estimation, and in clustering of heterogeneous regression data.
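
To make the algorithmic ideas concrete, the following is a minimal NumPy sketch, not the authors' implementation, of an EM algorithm for an l1-penalized mixture of Gaussian regression experts. It illustrates the two properties the abstract emphasizes: each coefficient is updated by a closed-form coordinate-ascent step, so no matrix inversion is needed, and sparsity arises directly from the soft-thresholded coordinate update rather than from post hoc thresholding of small estimates. The penalized objective sketched here is the familiar log-likelihood minus lambda * sum_k ||beta_k||_1; for simplicity the gating network is reduced to plain mixing proportions, whereas the paper's MoE uses covariate-dependent softmax gating with analogous coordinate updates. The function name em_sparse_moe, the parameter lam, and all defaults are illustrative assumptions.

import numpy as np

def soft_threshold(z, t):
    # Closed-form solution of the univariate lasso problem.
    # This is what sets coordinates exactly to zero: sparsity
    # comes from the update itself, with no ad hoc thresholding.
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def em_sparse_moe(X, y, K=2, lam=1.0, n_em=50, n_cd=10, seed=0):
    # EM for an l1-penalized mixture of Gaussian regression experts.
    # The M-step updates each expert coefficient one coordinate at a
    # time, so no matrix inversion is required. Gating is simplified
    # to mixing proportions (the paper uses softmax gating).
    rng = np.random.default_rng(seed)
    n, p = X.shape
    beta = rng.normal(scale=0.1, size=(K, p))  # expert slopes
    b0 = rng.normal(scale=0.1, size=K)         # expert intercepts
    sigma2 = np.ones(K)                        # noise variances
    pi = np.full(K, 1.0 / K)                   # mixing weights
    for _ in range(n_em):
        # E-step: tau[i, k] proportional to pi_k * N(y_i; b0_k + x_i'beta_k, sigma2_k)
        resid = y[:, None] - (X @ beta.T + b0)
        logp = (np.log(pi) - 0.5 * np.log(2 * np.pi * sigma2)
                - 0.5 * resid**2 / sigma2)
        logp -= logp.max(axis=1, keepdims=True)  # numerical stabilization
        tau = np.exp(logp)
        tau /= tau.sum(axis=1, keepdims=True)
        # M-step: weighted, penalized updates per expert
        pi = tau.mean(axis=0)
        for k in range(K):
            w = tau[:, k]
            for _ in range(n_cd):  # coordinate-ascent sweeps
                b0[k] = np.sum(w * (y - X @ beta[k])) / np.sum(w)
                for j in range(p):
                    # partial residual with coordinate j removed
                    r = y - b0[k] - X @ beta[k] + X[:, j] * beta[k, j]
                    num = np.sum(w * X[:, j] * r)
                    den = np.sum(w * X[:, j] ** 2) + 1e-12
                    # closed-form soft-thresholded coordinate update
                    beta[k, j] = soft_threshold(num, lam) / den
            r = y - b0[k] - X @ beta[k]
            sigma2[k] = np.sum(w * r**2) / np.sum(w)
    return pi, b0, beta, sigma2, tau

if __name__ == "__main__":
    # Illustrative check in the spirit of the experimental study:
    # simulate two sparse regression components and verify that most
    # irrelevant coefficients are driven exactly to zero.
    rng = np.random.default_rng(1)
    X = rng.normal(size=(500, 20))
    z = rng.integers(0, 2, size=500)
    true = np.zeros((2, 20)); true[0, :3] = 2.0; true[1, 3:6] = -2.0
    y = np.einsum('ij,ij->i', X, true[z]) + 0.5 * rng.normal(size=500)
    pi, b0, beta, sigma2, tau = em_sparse_moe(X, y, K=2, lam=5.0)
    print("nonzero coefficients per expert:", (np.abs(beta) > 0).sum(axis=1))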