Navigating Challenges in Nonparametric Classification and Outlier Detection: a Remedy Based on Semi-parametric Density Ratio Models
Host: 周叶青
Published: 2024-05-06

Title: Navigating Challenges in Nonparametric Classification and Outlier Detection: a Remedy Based on Semi-parametric Density Ratio Models

Speaker: Prof. 刘玉坤 (East China Normal University)

Venue: Room 108, 致远楼

Time: May 13, 2024, 15:30-17:00

Abstract:

The goal of classification is to assign categorical labels to unlabeled test data based on patterns and relationships learned from a labeled training dataset. This task becomes challenging when the training data and the test data exhibit distributional mismatches. The unlabeled test data follow a finite mixture model, which is not identifiable without further model assumptions. In this paper, we propose to model the test data by a finite semi-parametric mixture model under a density ratio model, and we construct a semi-parametric likelihood prediction set (SPLPS) for the labels in the test data. Our approach aims to optimize out-of-sample performance, seeking to include the correct class and to detect outliers as often as possible. It has the potential to enhance the robustness and effectiveness of classification models when the training and test distributions differ. Our method circumvents a stringent separation assumption between training data and outliers, which is required by Guan and Tibshirani (2022) but is often violated by commonly used distributions. We prove the consistency and asymptotic normality of our parameter estimators and the asymptotic optimality of the proposed SPLPS. We illustrate our methods by analyzing four real-world datasets.

All are welcome!