Approximate calculation of Tukey’s depth and median with high-dimensional data

Yugoslav Journal of Operations Research 28 (2018), Number 4, 475–499
DOI:

Milica BOGIĆEVIĆ
School of Electrical Engineering, University of Belgrade, Belgrade, Serbia
antomripmuk@

Milan MERKLE
School of Electrical Engineering, University of Belgrade, Belgrade, Serbia
emerkle@

Received: May 2018 / Accepted: August 2018

Abstract: We present a new fast approximate algorithm for Tukey (halfspace) depth level sets and its implementation, ABCDepth. Given a d-dimensional data set for any d ≥ 1, the algorithm is based on a representation of level sets as intersections of balls in R^d. Our approach does not need to compute projections of sample points onto directions. This novel idea enables the calculation of approximate level sets in very high dimensions with complexity that is linear in d, which is a great advantage over all other approximate algorithms. Using different versions of this algorithm, we demonstrate approximate calculations of the deepest set of points ("Tukey median") and of Tukey's depth of a sample point or an out-of-sample point, all with complexity linear in d. An additional theoretical advantage of this approach is that the data points are not assumed to be in "general position". Examples with real and synthetic data show that, in high dimensions, the execution time of the algorithm in all the mentioned versions is much smaller than that of other implemented algorithms. Our algorithms can also be used with thousands of multidimensional observations.

Keywords: Big Data, Multivariate Medians, Depth Functions, Computing Tukey's Depth.
MSC: 62G15, 68U05.

1. INTRODUCTION

Although this paper is about multivariate medians and related notions, we start from the univariate case, both for completeness and to clarify the rationale for the multivariate setup. In terms of .
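As a point of reference for the univariate case, the following is a minimal sketch of the standard sample version of Tukey's depth on the real line: the depth of a point x is the smaller of the fractions of observations lying in the two closed half-lines bounded by x, and the points of maximal depth are medians. This is not the ABCDepth algorithm described in the abstract; the function names `tukey_depth_1d` and `deepest_points_1d` are hypothetical and chosen only for illustration.

```python
from bisect import bisect_left, bisect_right


def tukey_depth_1d(x, sample):
    """Univariate Tukey depth of x with respect to a sample.

    Returns min(#{x_i <= x}, #{x_i >= x}) / n, i.e. the smaller fraction
    of observations in the two closed half-lines bounded by x.
    """
    s = sorted(sample)
    n = len(s)
    if n == 0:
        raise ValueError("sample must be non-empty")
    at_or_below = bisect_right(s, x)      # number of observations <= x
    at_or_above = n - bisect_left(s, x)   # number of observations >= x
    return min(at_or_below, at_or_above) / n


def deepest_points_1d(sample):
    """Sample points that attain the maximal depth (sample medians)."""
    depths = {p: tukey_depth_1d(p, sample) for p in set(sample)}
    best = max(depths.values())
    return sorted(p for p, d in depths.items() if d == best)
```

For the sample 1, 2, 3, 4, 5 the point 3 has depth 3/5 and is the unique deepest sample point, while any point outside the range of the data has depth 0. The sketch sorts the sample on every call for simplicity; sorting once and reusing the result would be the natural choice when many points are queried.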
