This repository also contains my notes on advanced efficient ML:
- EIE: Efficient Inference Engine on Compressed Deep Neural Network
- Learning both Weights and Connections for Efficient Neural Networks
- A White Paper on Neural Network Quantization
- A Perceptually-Motivated Approach for Low-Complexity, Real-Time Enhancement of Fullband Speech
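The sanity-check tip that follows can be sketched in plain Python. This is only an illustration: `normalize_prune_ratio` and its `n_layers` argument are hypothetical names that do not come from lab1.ipynb, and the accepted range `[0, 1)` is an assumption. The idea is to fail fast on a bad `prune_ratio` and to copy the input rather than mutate it.

```python
import copy
from typing import List, Union


def normalize_prune_ratio(prune_ratio: Union[List, float], n_layers: int) -> List[float]:
    """Hypothetical helper: validate prune_ratio and expand it to one ratio per layer."""
    # Fail fast on the wrong type. Note that an int such as 0 or 1 is rejected too.
    assert isinstance(prune_ratio, (float, list)), "prune_ratio must be a float or a list"
    if isinstance(prune_ratio, float):
        # A single float is applied to every layer.
        ratios = [prune_ratio] * n_layers
    else:
        # Deep-copy so the caller's list is never modified.
        ratios = copy.deepcopy(prune_ratio)
    assert len(ratios) == n_layers, "need exactly one ratio per layer"
    # Assumed valid range: each ratio is a float in [0, 1).
    assert all(isinstance(r, float) and 0.0 <= r < 1.0 for r in ratios), \
        "each ratio must be a float in [0, 1)"
    return ratios


ratios = normalize_prune_ratio(0.3, n_layers=3)
print(ratios)  # [0.3, 0.3, 0.3]
```

The same reasoning applies to the model itself: working on a `copy.deepcopy` of the model keeps the original weights available for comparison after pruning.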
- Use sanity checks:
  - Python: `assert isinstance(prune_ratio, (float, list))`
  - C++: `static_assert(std::is_floating_point<T>::value, "T must be floating point");`
  - Take a look at `def channel_prune(model: nn.Module, prune_ratio: Union[List, float]) -> nn.Module` in lab1.ipynb - deepcopy ;)

# GPU-MODE