On Data Integration Problems with Manifolds

Friday, December 1, 2017
Dr. Kenneth Ryan, West Virginia University



Data integration problems arise when the variables belong to predefined groupings (sometimes referred to as data views). Accounting for such data views during analysis adds practical value in terms of both interpretation and predictive performance. Existing approaches that handle multi-view data tend to rely on view agreement principles, strong smoothness assumptions, or regularization penalties that account for the groupings. The first two kinds of approach tend to be quite sensitive to even modest noise in the response or the feature data, while the third is linear and can usually be outperformed. An empirical approach is proposed to account for several key trade-offs, including the bias/variance trade-off of prediction error on available testing cases, the possibility that the data may be fully viewed or have no appreciable view relationships, and the use of sparse anchor point methods for detecting manifolds within views. Theoretical results highlight the expected performance of the new technique. The end result of this work is a computationally efficient approach whose effectiveness is demonstrated on real-data applications.