This is a list of academic papers that the SnowCloud.ai AI Research Lab considers very important and must-read.
A paper may be selected for this list for any of the following reasons:
- The paper brought a paradigm shift in its own domain.
- The paper contained vital parts that led to the appearance of papers of the first kind.
- The paper may cause a paradigm shift within 5 years.
After each subdomain, we propose several ideas that may inspire work that could itself qualify for this list.
SnowSelected is all you need.
- Long Short-Term Memory : An original idea for processing long sentences, inspired by human neural information processing.
- Recurrent neural network based language model : An original idea of introducing an RNN-like structure into the language model (LM).
- GRU : A simple yet effective model for RNN-like structure. A large number of effective, high-precision models are based on this architecture.
- Connectionist temporal classification : Inspired by dynamic programming and dynamic time warping (DTW) when dealing with time-warped sequences like audio data.
- Learning Longer Memory in RNN : Formulated Recursive Neural Network which can be applied on sequences recursively by only using a single compact model.
- Learning phrase representations using RNN encoder-decoder for statistical machine translation : "Cho Model" for NMT.
- Seq2Seq: "Sutskever Model" for NMT, an advanced version.
- A Convolutional Neural Network for Modelling Sentences : Conv model for NLP. More efficient on AI chips.
- CNN on Sentence Classification : Conv model for NLP.
- Very Deep Convolutional Networks for Text Classification : Conv model for NLP.
- Neural Machine Translation by Jointly Learning to Align and Translate : First introduced the attention mechanism to the NLP field.
- Soft And Hard Attention : Introduced the choice of soft and hard attention along features.
- Global And Local Attention : Introduced attention along data.
- Character-Aware Neural Language Models: Character level Conv model for NLP.
- Attention is All You Need : First transduction model relying entirely on self-attention to compute representations of its input and output, without using RNNs or convolution, only attention and position-wise fully connected layers. Introduced positional encoding and multi-head scaled dot-product attention.
- BERT: Bidirectional. Pre-trained by masking 15% of the input tokens. Optimized for downstream tasks.
- Attentive Neural Processes
- Transformer-XL: Introduced relative positional encoding. State reuse resolves the problems that may be caused by excessively long sentences.
- Focused Attention Networks
- XLNet : Combined AR and AE models. Introduced DAG while learning AR parameters in sentence segments.
- Generating Long Sequences with Sparse Transformers : Simplified structure of the XLNet AR part, and BERT for CV. (Addresses item 3 in "So what is NEXT?" below.)
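The self-attention and positional encoding ideas from the "Attention is All You Need" entry can be sketched in a few lines. This is a minimal single-head illustration in plain Python, not the paper's implementation: the function names (`positional_encoding`, `softmax`, `self_attention`), the toy embeddings and the identity projections (Q = K = V) are all assumptions made for brevity.

```python
import math


def positional_encoding(seq_len, d_model):
    """Sinusoidal positional encoding: sin on even dimensions, cos on odd ones."""
    pe = []
    for pos in range(seq_len):
        row = []
        for j in range(d_model):
            # Dimensions 2i and 2i+1 share the same frequency 1 / 10000^(2i / d_model).
            angle = pos / 10000 ** ((j - j % 2) / d_model)
            row.append(math.sin(angle) if j % 2 == 0 else math.cos(angle))
        pe.append(row)
    return pe


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        all_weights.append(weights)
        outputs.append([sum(w * v[j] for w, v in zip(weights, values))
                        for j in range(len(values[0]))])
    return outputs, all_weights


# Toy input (assumed): three 4-dimensional token embeddings plus positional encodings.
tokens = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
pe = positional_encoding(len(tokens), 4)
x = [[t + p for t, p in zip(tok, enc)] for tok, enc in zip(tokens, pe)]
# Identity projections are assumed here, so Q = K = V = x.
out, weights = self_attention(x, x, x)
```

Each row of `weights` sums to 1 and tells how much one position attends to every other position. A real model would add learned projections for Q, K and V and run several such heads in parallel.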
So what is NEXT?
- Better sampling to keep locally complete information of data.
- Better relative positional encoding beyond "learned from position".
- Simplified structure of XLNet AR part.
- AlexNet : The beginning of deep learning for CV. Achieved a new record in ImageNet classification.
- VGG : Deeper (19 layers at most) Conv3x3 models.
- Google Series
- GoogLeNet : Combinations of different kernel sizes.
- Inception v3
- Inception v4
- First Attention Solution :
- Convolutional Implementation of Sliding Windows : The core idea is that a CNN must recognize things in a way that is invariant to position shifts.
- 1 x 1 Convolution : Introduced inplace inter-channel information exchange.
- Triplet Loss : Combined differential learning and hard example mining.
- Highway Networks : Must read before ResNet. Introduced branching schemes to accelerate deep learning training process.
- Dilated Convolution: Introduced a more effective method for enlarging the receptive field.
- ResNet : Branching scheme with standardized implementation (18/34/50/101), combinations of Conv3x3 and Conv1x1
- DenseNet : Introduced distillation idea in Conv Neural Networks.
- ResNeXt : A tradeoff between a sparse MobileNet and a dense ResNet.
- MobileNets : Efficient on some mobile devices. Introduced the Depthwise Separable Conv, which is very sparse. Saves space for model parameters to the extreme, but does not reduce the size of inference-time feature maps.
- SqueezeNet : Introduced attention mechanism vertical to image.
- Wide ResNet : Ablation study for changing channel sizes.
- R-FCN : Introduced 3x3 pixel shuffler.
- Deformable Convolutional Networks
- Deep Neural Networks for Object Detection
- Glow : Introduced Invertible 1x1 Convolutions to save parameters in Encoder/Decoder, relying on PixelShuffler.
- R-CNN Original
- Kaiming He Series
- SPP Net : Introduced a pyramid like that of conventional SIFT.
- Improved R-CNN
- Faster R-CNN
- Mask R-CNN : Introduced segmentation after ROI-Align. Not efficient on AI chip.
- TensorMask
- Jia Deng Series
- Stacked Hourglass Networks : Recombination of ResNet. Achieved SOTA using hourglass104.
- CornerNet
- YOLO Series
- Segmentation is All You Need : Introduced Segmentation methodology for detection task.
- Fully Convolutional Networks : Pixelwise classification as Segmentation.
- UNet : Introduced spatial feature extraction and restoration. Backbone of many works such as image compression, imputation and segmentation. Ideas might be inspired by MPEG-4 Part 10, i.e. H.264.
- Pixel Shuffler
- DeepLab, DeepLab v2 and DeepLab v3
- FPN
- STS and STS++
- FlowNet and FlowNet2.0 : Introduced temporal feature extraction. Backbone of many works based on video understanding. Ideas might be inspired by MPEG-4 Part 10, i.e. H.264.
- SelFlow
- ArcFace : A definitive human face recognition paper that combines the SphereFace idea with loss margins of different orders (orders 0, 1 and 2 are hyperparameters).
- Convolutional Pose Machines :
- OpenPose + PAF : The core idea is to predict directed vectors between keypoints to form a feature map (PAF), so that keypoints can be joined into different instances in a bottom-up way.
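The bottom-up grouping idea in the OpenPose + PAF entry can be illustrated with a small scoring function: sample the vector field along the segment between two candidate keypoints and average how well the field aligns with the segment direction. This is only a sketch under stated assumptions, not the OpenPose code. The name `paf_score`, the field layout `paf[y][x] = (vx, vy)`, the nearest-pixel sampling and the toy keypoints are all hypothetical.

```python
import math


def paf_score(paf, kp_a, kp_b, num_samples=10):
    """Average alignment between the field and the segment kp_a -> kp_b.

    paf[y][x] is assumed to hold a 2-D vector (vx, vy); keypoints are (x, y).
    num_samples is assumed to be at least 2.
    """
    dx, dy = kp_b[0] - kp_a[0], kp_b[1] - kp_a[1]
    length = math.hypot(dx, dy)
    if length == 0:
        return 0.0
    ux, uy = dx / length, dy / length  # unit direction of the candidate limb
    total = 0.0
    for s in range(num_samples):
        t = s / (num_samples - 1)
        # Nearest-pixel sampling along the segment (an assumed simplification).
        x = int(round(kp_a[0] + t * dx))
        y = int(round(kp_a[1] + t * dy))
        vx, vy = paf[y][x]
        total += vx * ux + vy * uy
    return total / num_samples


# Toy 5x5 field: a horizontal limb along row 2 pointing in +x, zero elsewhere.
paf = [[(1.0, 0.0) if y == 2 else (0.0, 0.0) for x in range(5)] for y in range(5)]
elbow = (0, 2)
wrist_same_person = (4, 2)
wrist_other_person = (4, 4)
good = paf_score(paf, elbow, wrist_same_person)
bad = paf_score(paf, elbow, wrist_other_person)
```

Here `good` is higher than `bad`, so the elbow would be paired with the wrist lying along the field. A full pipeline would score every candidate pair and then solve a matching problem per limb type.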
So what is NEXT?
- A much more robust way to deal with larger/smaller objects.
- Beyond invariance to shift/mirroring, a much more elegant way to implement invariance to rotation.
- A "1-for-all" attention mechanism.
- VAE
- GAN
- conditional GAN
- Generalized Denoising Auto-Encoders as Generative Models
- LAPGAN
- GAN for Combinatorial Optimization
- A note on the evaluation of generative models
- DCGAN
- SRGAN
- Pix2Pix
- WGAN and WGAN-GP
- CycleGAN
- XGAN
- StarGAN
- JMMD
- Adaptation regularization
- Feature Ensemble Plus Sample Selection: Domain Adaptation for Sentiment Classification
- Net2Net
- Dropout
- Batch Normalization : Deals with the large dynamic range of features.
- No More Pesky Learning Rates
- Bag of Tricks for CV
- Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour
- LARS
- SNIPER: Efficient Multi-Scale Training
- Learning Data Augmentation Strategies for Object Detection
- Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization
- Learning to Optimize
- Neural Architecture Search with Reinforcement Learning
- AMC: AutoML for Model Compression and Acceleration on Mobile Devices
- TVM: An Automated End-to-End Optimizing Compiler for Deep Learning
- Horovod
- A Meta Learning-Based Framework for Automated Selection and Hyperparameter Tuning for Machine Learning Algorithms
- DARTS: Differentiable Architecture Search