The authors draw the intuition for their new loss from the way histopathologists work with images. Because histopathology images are enormous, they are stored in a pyramid-like structure with multiple zoom levels, and researchers tend to zoom in and out many times while examining an image. The paper proposes two new things: 1) a self-supervised pre-training loss and 2) a semi-supervised teacher-student training paradigm to further train the network in the low-data regime.
The three-step pipeline proposed in this paper
The authors propose a pipeline consisting of three steps:
1. Self-supervised pre-training of the backbone with the novel loss.
2. Supervised fine-tuning on the available labelled data.
3. Semi-supervised teacher-student training on the remaining unlabelled data.
Step 2 is standard, so we will dive into steps 1 and 3.
The authors propose a novel self-supervised loss, along with a specific model architecture to train with it: an additional pairwise feature-extraction head is placed on top of the backbone to make training with this loss easier.
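The paper's figure only indicates that a pairwise feature-extraction head sits on top of the backbone; the exact layers are not given here, so the following PyTorch sketch is an assumption for illustration (the class name `PairwiseHead`, the feature dimension, and the hidden size are all placeholders).

```python
import torch
import torch.nn as nn

class PairwiseHead(nn.Module):
    """Hypothetical pairwise feature-extraction head (layer sizes are assumptions)."""

    def __init__(self, feat_dim: int = 128, hidden_dim: int = 256):
        super().__init__()
        # A small MLP applied to the concatenated features of each ordered pair of patches.
        self.mlp = nn.Sequential(nn.Linear(2 * feat_dim, hidden_dim), nn.ReLU())

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (batch, K, feat_dim) -- one feature vector per magnification patch.
        _, k, _ = feats.shape
        pair_embeddings = [
            self.mlp(torch.cat([feats[:, i], feats[:, j]], dim=-1))
            for i in range(k) for j in range(k) if i != j
        ]
        # Concatenate the K*(K-1) pairwise embeddings into a single vector per sample.
        return torch.cat(pair_embeddings, dim=-1)
```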
Suppose we have $K$ possible magnification factors of the image ($K$ itself is arbitrary, though the authors use $K=3$). We then enumerate all $K!$ possible permutations of the magnification factors and denote them $\{\pi_i\}_{i \in [1, K!]}$.
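For concreteness, here is how the $K!$ permutation classes could be enumerated; the concrete magnification values are placeholders, not taken from the paper.

```python
from itertools import permutations

K = 3                                   # number of magnification levels (as in the paper)
zoom_levels = [5, 10, 20]               # placeholder magnification factors, not from the paper

# Every ordering of the K levels; the index of an ordering is the pretext-task label k.
perms = list(permutations(range(K)))
assert len(perms) == 6                  # K! = 3! = 6 classes for the permutation classifier
print(perms[4])                         # e.g. (2, 0, 1): show patches at zoom 20x, 5x, 10x
```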
The training procedure is then:
1. Sample a permutation index $k$ uniformly at random.
2. Extract patches $\{s_i\}$ of the same region at the magnifications ordered according to $\pi_k$.
3. Pass each patch through the backbone $f_{\theta}$, combine the resulting features with the pairwise head $f_{\phi}$, and predict the permutation index with the classification head $f_{\xi}$.
All three networks are then trained jointly by minimising the cross-entropy loss $\mathrm{CE}(f_{\theta,\phi,\xi}(\{s_i\}), k)$.
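A minimal, self-contained PyTorch sketch of one pre-training step under this loss might look as follows; the backbone, feature dimension, patch size, and optimiser settings are stand-ins rather than the authors' actual choices.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from itertools import permutations

K, FEAT, PATCH = 3, 128, 64
perms = list(permutations(range(K)))                     # the K! orderings pi_i

# Toy stand-ins for f_theta (shared backbone), f_phi (pairwise head), f_xi (classifier).
backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * PATCH * PATCH, FEAT))
pairwise_head = nn.Linear(2 * FEAT, FEAT)
classifier = nn.Linear(K * (K - 1) * FEAT, len(perms))
params = [*backbone.parameters(), *pairwise_head.parameters(), *classifier.parameters()]
optimiser = torch.optim.Adam(params, lr=1e-4)

def pretraining_step(patches: torch.Tensor) -> torch.Tensor:
    """patches: (batch, K, 3, PATCH, PATCH) -- the same region at K magnifications."""
    batch = patches.size(0)
    k = torch.randint(len(perms), (batch,))              # sample a permutation index per sample
    # Reorder each sample's magnification patches according to pi_k: these are the inputs {s_i}.
    shuffled = torch.stack([patches[b, list(perms[int(k[b])])] for b in range(batch)])
    feats = backbone(shuffled.flatten(0, 1)).view(batch, K, FEAT)   # f_theta applied per patch
    # f_phi: combine every ordered pair of patch embeddings (cf. the pairwise head sketched above).
    pair_feats = torch.cat(
        [pairwise_head(torch.cat([feats[:, i], feats[:, j]], dim=-1))
         for i in range(K) for j in range(K) if i != j],
        dim=-1,
    )
    logits = classifier(pair_feats)                      # f_xi predicts which permutation was used
    loss = F.cross_entropy(logits, k)                    # CE(f_{theta,phi,xi}({s_i}), k)
    optimiser.zero_grad()
    loss.backward()
    optimiser.step()
    return loss.detach()
```

In the paper the backbone would be a real CNN and the patches would come from the image pyramid; the linear backbone here only keeps the sketch short and runnable.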
The authors argue that, in order to recover the ordering of the magnification factors, $f_{\theta}$ has to learn meaningful representations.