Manfred K. Warmuth
A standard problem in computational biology is the
visualization of high-dimensional data for making medical decisions.
This is typically done by embedding the high-dimensional data into
two-dimensional space.
t-SNE is the standard method for doing this. We show on artificial and
natural data that t-SNE has a number of problems; notably,
it does not exhibit basic invariances that
any good embedding method should have.
We describe a new method called t-ETE for finding a low-dimensional embedding.
We formulate the embedding problem as a joint ranking problem over a set of triplets, where each triplet captures the relative similarities between three objects in the set.
Using recent advances in robust ranking, t-ETE
produces high-quality embeddings even in the presence of
a significant amount of noise and outliers, and it
better preserves local scale and basic invariance properties.
In particular, our method produces significantly better results than t-SNE
on a wide range of signature datasets while also being faster to compute.
Joint work with Ehsan Amid, Nikos Vlassis and John Vivian.
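
The triplet formulation mentioned in the abstract can be illustrated with a small sketch. This is not the t-ETE objective, whose robust ranking loss the abstract does not specify; it is a generic log-ratio triplet loss with an assumed heavy-tailed kernel 1 / (1 + d^2), chosen only for illustration. The function names (triplet_satisfaction, triplet_loss, triplets_from_points) are hypothetical. Each triplet (i, j, k) encodes the statement "object i is more similar to j than to k".

```python
import numpy as np

def triplet_satisfaction(Y, triplets):
    """Per-triplet score in (0, 1): how strongly the embedding Y places
    i closer to j than to k. Uses an assumed kernel 1 / (1 + d^2)."""
    i, j, k = triplets[:, 0], triplets[:, 1], triplets[:, 2]
    d_ij = np.sum((Y[i] - Y[j]) ** 2, axis=1)  # squared distance i-j
    d_ik = np.sum((Y[i] - Y[k]) ** 2, axis=1)  # squared distance i-k
    s_ij = 1.0 / (1.0 + d_ij)
    s_ik = 1.0 / (1.0 + d_ik)
    return s_ij / (s_ij + s_ik)

def triplet_loss(Y, triplets):
    """Negative log-likelihood summed over all triplets (illustrative,
    non-robust loss; lower means more triplets are respected)."""
    return -np.sum(np.log(triplet_satisfaction(Y, triplets)))

def triplets_from_points(X, n_triplets, rng):
    """Sample triplets (i, j, k) such that i is closer to j than to k
    in the original space X (squared Euclidean distance)."""
    n = X.shape[0]
    out = []
    while len(out) < n_triplets:
        i, j, k = rng.choice(n, size=3, replace=False)
        d_ij = np.sum((X[i] - X[j]) ** 2)
        d_ik = np.sum((X[i] - X[k]) ** 2)
        if d_ij == d_ik:
            continue  # skip ties, which carry no ordering information
        out.append((i, j, k) if d_ij < d_ik else (i, k, j))
    return np.array(out)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(30, 10))          # synthetic high-dimensional data
    T = triplets_from_points(X, 200, rng)  # relative-similarity constraints
    Y = rng.normal(size=(30, 2))           # a random 2-D embedding
    print("loss of random embedding:", triplet_loss(Y, T))
```

An embedding method in this family would then adjust Y (for example by gradient descent) to reduce the loss over all triplets jointly. Because the loss depends only on relative comparisons, it needs no absolute distances from the original space.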