Manifold Characteristics That Predict Downstream Task Performance
Abstract
Pretraining methods are typically compared by evaluating the accuracy of linear classifiers, transfer learning performance, or visually inspecting lower-dimensional projections of the representation manifold (RM). We show that the differences between methods can be understood more clearly by investigating the RM directly, which allows for a more detailed comparison. To this end, we propose a framework and a new metric to measure and compare different RMs. We also investigate and report on the RM characteristics of various pretraining methods. These characteristics are measured by applying successively larger local alterations to the input data, using white noise injections and Projected Gradient Descent (PGD) adversarial attacks, and then tracking each datapoint. We calculate the total distance moved for each datapoint and the relative change in distance between successive alterations. We show that self-supervised methods learn an RM where alterations lead to large but constant-size changes, indicating a smoother RM than that of fully supervised methods. We then combine these measurements into a single metric, the Representation Manifold Quality Metric (RMQM), where larger values indicate larger and less variable step sizes, and show that RMQM correlates positively with performance on downstream tasks.
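To make the measurement procedure concrete, the following is a minimal sketch of the white-noise variant described above: embed a datapoint under successively larger noise injections, record the distance its representation moves at each step, and combine the step sizes into a score. The encoder, the noise schedule, and the `rmqm_like_score` combination are illustrative assumptions; the paper's actual RMQM formula and the PGD-based variant are not reproduced here.

```python
import numpy as np

def rm_step_sizes(encode, x, noise_levels, rng):
    """Embed a datapoint under successively larger white-noise
    alterations and return the distance moved between successive
    representations (the "step sizes" described in the abstract)."""
    reps = [encode(x)]
    for sigma in noise_levels:
        x_alt = x + rng.normal(0.0, sigma, size=x.shape)  # white noise injection
        reps.append(encode(x_alt))
    reps = np.stack(reps)
    # Euclidean distance between representations of successive alterations;
    # step_sizes.sum() gives the total distance moved by this datapoint.
    return np.linalg.norm(np.diff(reps, axis=0), axis=1)

def rmqm_like_score(step_sizes, eps=1e-8):
    # Hypothetical combination, not the paper's formula: larger and less
    # variable step sizes yield a larger score, matching RMQM's stated direction.
    return step_sizes.mean() / (step_sizes.std() + eps)

# Toy usage: a random linear map stands in for a pretrained encoder.
rng = np.random.default_rng(0)
W = rng.normal(size=(16, 64))
encode = lambda v: W @ v
x = rng.normal(size=64)
steps = rm_step_sizes(encode, x, noise_levels=np.linspace(0.05, 0.5, 10), rng=rng)
print(rmqm_like_score(steps))
```

Under this reading, a smooth RM in the abstract's sense is one where the printed step sizes are large but nearly constant across noise levels, which drives the score up.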