Are Labels Necessary for Neural Architecture Search
Author: Chenxi Liu, ..., Kaiming He (Facebook AI)
Findings:
- The architecture rankings produced with and without labels are highly correlated, i.e., the NAS and UnNAS results agree closely.
- Using unlabeled images from a large dataset may be a more promising approach: labels are not necessary, and the search can be done in an unsupervised way.
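The correlation between the two rankings can be quantified with a rank-correlation statistic. A minimal sketch with made-up accuracy numbers (the data and the choice of Spearman's rho here are illustrative, not the paper's exact setup):

```python
# Sketch: agreement between architecture rankings from supervised (NAS) and
# unsupervised (UnNAS) search, measured by Spearman rank correlation.
# The accuracy numbers below are made up for illustration.

def spearman_rho(xs, ys):
    """Spearman rank correlation for score lists without ties."""
    n = len(xs)
    rank = lambda vs: {v: i for i, v in enumerate(sorted(vs))}
    rx, ry = rank(xs), rank(ys)
    d2 = sum((rx[a] - ry[b]) ** 2 for a, b in zip(xs, ys))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Hypothetical proxy accuracies of five architectures under the two settings.
supervised   = [0.72, 0.75, 0.70, 0.78, 0.74]
unsupervised = [0.61, 0.66, 0.60, 0.69, 0.64]

rho = spearman_rho(supervised, unsupervised)
print(rho)  # the two orderings agree perfectly here, so rho = 1.0
```

A rho near 1 means an unsupervised proxy would select nearly the same top architectures as a supervised search.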
My impression: a good pretext task is important for unsupervised search. See:
Self-supervised Learning: Generative or Contrastive(2020)
Author: Xiao Liu (Tsinghua University)
Key: surveys recent progress in self-supervised learning, covering GAN-based generative methods and contrastive methods centered on instance discrimination.
Note:
- Mutual information is only loosely related to the success of several MI-based methods; the sampling strategies and architecture design may count more.
- There is an essential gap between pre-training and downstream tasks (another paper also discussed this gap). A possible solution: automatically design pre-training tasks for a specific downstream task.
- The process of selecting pretext tasks seems too heuristic and tricky, with no patterns to follow.
- For MI-based methods, the main open problem is exploring the potential of sampling strategies, e.g., leveraging super large amounts of negative samples and augmented positive samples (MoCo, SimCLR).
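The point about large pools of negatives can be made concrete with the InfoNCE objective these contrastive methods optimize. A minimal NumPy sketch; the dimensions, temperature, and data are illustrative, not any paper's exact setup:

```python
import numpy as np

# Sketch of the InfoNCE objective used by MoCo/SimCLR-style methods: one
# positive pair per anchor contrasted against a large pool of negatives.

def info_nce(query, positive, negatives, tau=0.1):
    """query: (d,), positive: (d,), negatives: (n, d); all L2-normalized."""
    logits = np.concatenate(([query @ positive], negatives @ query)) / tau
    logits -= logits.max()                      # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return -np.log(probs[0])                    # positive pair is index 0

rng = np.random.default_rng(0)
d, n = 8, 1024                                  # feature dim, negative pool size
q = rng.normal(size=d); q /= np.linalg.norm(q)
pos = q + 0.05 * rng.normal(size=d); pos /= np.linalg.norm(pos)
negs = rng.normal(size=(n, d)); negs /= np.linalg.norm(negs, axis=1, keepdims=True)

loss = info_nce(q, pos, negs)
print(float(loss))
```

The loss decreases as the positive pair becomes more similar than every negative; a larger negative pool makes the task harder and the learned features sharper.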
Big Self-Supervised Models are Strong Semi-Supervised Learners (2020)
Author: Ting Chen, ..., Geoffrey Hinton (Google)
Key: proposes a semi-supervised learning method, SimCLRv2, with three steps:
- Unsupervised or self-supervised pretraining, here with SimCLRv2. Backbone: SimCLR used ResNet-50 (4×); SimCLRv2 uses a 152-layer ResNet with 3× wider channels and selective kernels (SK). Two further changes:
  - Increase the capacity of the non-linear network g(·) (a.k.a. projection head): 2 layers in SimCLR, 3 in SimCLRv2.
  - "We also incorporate the memory mechanism from MoCo [20], which designates a memory network (with a moving average of weights for stabilization) whose output will be buffered as negative examples."
- Supervised fine-tuning, in which the projection head g(·) is discarded.
- Distillation using unlabeled data (into a smaller network):
  - Use the fine-tuned network as a teacher to impute labels for training a student network on the unlabeled data.
  - Teacher: fixed; student: trained.
  - The student model can be the same as the teacher or a smaller one.
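The distillation step above can be sketched as minimizing cross-entropy between temperature-softened teacher and student outputs; the logits and temperature below are illustrative, not the paper's settings:

```python
import numpy as np

# Sketch of distillation on unlabeled data: the fine-tuned teacher imputes
# soft labels, and the student minimizes cross-entropy against them.

def softmax(z, tau=1.0):
    z = np.asarray(z, dtype=float) / tau
    z -= z.max()                              # numerical stability
    e = np.exp(z)
    return e / e.sum()

def distill_loss(teacher_logits, student_logits, tau=2.0):
    """Cross-entropy between temperature-softened teacher and student outputs."""
    p = softmax(teacher_logits, tau)          # fixed teacher targets
    q = softmax(student_logits, tau)          # trainable student predictions
    return -(p * np.log(q)).sum()

teacher = [4.0, 1.0, 0.5]                     # teacher is confident in class 0
good_student = [3.8, 1.1, 0.4]                # student that mimics the teacher
bad_student = [0.5, 1.0, 4.0]                 # student that disagrees

print(distill_loss(teacher, good_student), distill_loss(teacher, bad_student))
```

Minimizing this loss over unlabeled images transfers the teacher's task-specific knowledge into a (possibly smaller) student.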
Findings:
- Although big models are important for learning general (visual) representations, the extra capacity may not be necessary when a specific target task is concerned (this motivates distillation).
- A deeper projection head not only improves the representation quality measured by linear evaluation, but also improves semi-supervised performance when fine-tuning from a middle layer of the projection head. (The projection head is the part after the conv layers in SimCLR.)
- Distillation yields better performance on the target task.
Region-of-interest guided Supervoxel Inpainting for Self-supervision (MICCAI2020)
Author: Subhradeep Kayal, ...
Key: generates images to inpaint by using supervoxel-based masking instead of random masking, and by focusing on the area to be segmented in the primary task.
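A toy 2D sketch of the masking idea, assuming a precomputed supervoxel (here superpixel) label map and a known ROI; the image, block labels, and ROI below are made up for illustration:

```python
import numpy as np

# Sketch: instead of masking random pixels, mask whole supervoxels, restricted
# to those that overlap the ROI of the downstream segmentation task. The real
# method operates on 3D volumes; this toy uses 2x2 "superpixel" blocks.

image = np.arange(1, 17, dtype=float).reshape(4, 4)

# Toy supervoxel map: four 2x2 blocks labeled 0..3.
labels = np.array([[0, 0, 1, 1],
                   [0, 0, 1, 1],
                   [2, 2, 3, 3],
                   [2, 2, 3, 3]])

# Toy ROI: only the top-left pixel belongs to the structure of interest.
roi = np.zeros((4, 4), dtype=bool)
roi[0, 0] = True

# Mask every supervoxel that intersects the ROI (here, block 0 only).
to_mask = np.unique(labels[roi])
masked = image.copy()
masked[np.isin(labels, to_mask)] = 0.0   # the inpainting task fills these back in

print(masked)
```

Masking coherent regions near the ROI forces the inpainting network to model exactly the anatomy the primary task cares about, rather than easy-to-fill random pixels.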
Unsupervised Learning of Visual Features by Contrasting Cluster Assignments(2020)
Author: Mathilde Caron (Facebook AI)
Key: We learn features by Swapping Assignments between multiple Views of the same image (SwAV). The features and the codes are learned online, allowing our method to scale to potentially unlimited amounts of data. In addition, SwAV works with small and large batch sizes and does not need a large memory bank
Details of the idea:
- an online clustering-based self-supervised method:we compute a code from an augmented version of the image and predict this code from other augmented versions of the same image.
- introduce the multi-crop strategy to increase the number of views of an image
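A minimal sketch of the swapped prediction: real SwAV obtains codes with a Sinkhorn-Knopp equal-partition step and learns the prototypes online, while here a sharper softmax stands in for the code computation and all sizes and values are illustrative:

```python
import numpy as np

# Sketch of SwAV's swapped prediction: compute a code q from one augmented
# view and predict it from the other view's features via shared prototypes.

def softmax(z, tau):
    z = z / tau - (z / tau).max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
d, k = 16, 8                              # feature dim, number of prototypes
prototypes = rng.normal(size=(k, d))
prototypes /= np.linalg.norm(prototypes, axis=1, keepdims=True)

z1 = rng.normal(size=d); z1 /= np.linalg.norm(z1)             # view 1 features
z2 = z1 + 0.1 * rng.normal(size=d); z2 /= np.linalg.norm(z2)  # view 2 features

q1 = softmax(prototypes @ z1, tau=0.05)   # code from view 1 (Sinkhorn in the paper)
p2 = softmax(prototypes @ z2, tau=0.1)    # prediction from view 2

swapped_loss = -(q1 * np.log(p2)).sum()   # predict view 1's code from view 2
print(swapped_loss)
```

Because only codes (cluster assignments) are compared rather than features directly, no large memory bank of negatives is needed, which is what lets SwAV work at small batch sizes.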
Comparing to Learn: Surpassing ImageNet Pretraining on Radiographs By Comparing Image Representations(MICCAI2020)
Author: Hong-Yu Zhou, ..., Yefeng Zheng
Key:
- Proposes a new self-supervised method, Comparing to Learn (C2L).
- Pretrains 2D deep models for radiograph-related tasks from massive unannotated data.
- A momentum-based teacher-student architecture is proposed for the contrastive learning (contrast at the feature level).
- Outperforms ImageNet pretraining.
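The momentum-based teacher can be sketched as an exponential moving average (EMA) of the student's weights, as in MoCo; the parameter shapes, drift, and momentum value below are illustrative:

```python
import numpy as np

# Sketch of a momentum teacher-student update: the teacher's weights are an
# exponential moving average of the student's, which stabilizes the
# contrastive targets (the teacher receives no gradients).

def ema_update(teacher_w, student_w, m=0.999):
    """teacher <- m * teacher + (1 - m) * student."""
    return m * teacher_w + (1.0 - m) * student_w

rng = np.random.default_rng(0)
teacher = rng.normal(size=4)
student = teacher.copy()

# Simulate training: the student drifts and the teacher slowly trails it.
for _ in range(100):
    student = student + 0.01          # stand-in for a gradient step
    teacher = ema_update(teacher, student)

print(teacher, student)
```

With m close to 1, the teacher changes much more slowly than the student, so the features it produces as contrastive targets stay consistent across training steps.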