Inference & Model Compression

WIP
deep-learning, inference, quantization, pruning, distillation

Latency, throughput, quantization, pruning, and distillation

Latency
Bandwidth
Throughput
Model Compression
Quantization
Pruning
Distillation

TODO: Add content