Automatic mixed precision (AMP) training is now natively supported and a stable feature.

Created by: Lornatang

🚀 Feature

AMP allows users to easily enable automatic mixed precision training, which can deliver higher performance and memory savings of up to 50% on Tensor Core GPUs. Using the natively supported torch.cuda.amp API, AMP provides convenience methods for mixed precision, where some operations use the torch.float32 (float) datatype and others use torch.float16 (half). Some ops, like linear layers and convolutions, are much faster in float16, while others, like reductions, often require the dynamic range of float32. Mixed precision tries to match each op to its appropriate datatype.

  • Official example
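
A minimal sketch of the torch.cuda.amp pattern described above (the model, optimizer, loss, and data loader here are placeholders, not code from this repository):

```python
import torch

model = torch.nn.Linear(512, 10).cuda()                  # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # placeholder optimizer
criterion = torch.nn.CrossEntropyLoss()
scaler = torch.cuda.amp.GradScaler()                      # scales the loss to avoid float16 gradient underflow

for inputs, targets in loader:                            # `loader` is a placeholder DataLoader
    inputs, targets = inputs.cuda(), targets.cuda()
    optimizer.zero_grad()

    # Forward pass under autocast: eligible ops (e.g. convolutions, linear layers)
    # run in float16, while precision-sensitive ops stay in float32.
    with torch.cuda.amp.autocast():
        outputs = model(inputs)
        loss = criterion(outputs, targets)

    # Backward pass on the scaled loss; step() unscales gradients before updating.
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
```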

Motivation

Mixed precision computation is integrated natively in PyTorch 1.6, so there is no longer a need to install the NVIDIA/apex library.

Pitch

Update the training code to use torch.cuda.amp and remove the apex dependency; see the sketch below.
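
One convenient property of the native API when removing apex: both autocast and GradScaler accept an enabled flag, so a single code path can serve full-precision and mixed-precision runs behind one option. A minimal sketch, assuming a hypothetical use_amp flag (not this repository's actual argument):

```python
import torch

use_amp = True  # e.g. driven by a command-line flag; illustrative name

model = torch.nn.Linear(512, 10).cuda()                 # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = torch.nn.CrossEntropyLoss()
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)     # behaves as a no-op when disabled

inputs = torch.randn(8, 512, device="cuda")             # dummy batch for illustration
targets = torch.randint(0, 10, (8,), device="cuda")

optimizer.zero_grad()
with torch.cuda.amp.autocast(enabled=use_amp):          # falls back to float32 when disabled
    loss = criterion(model(inputs), targets)
scaler.scale(loss).backward()
scaler.step(optimizer)
scaler.update()
```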

Alternatives

Keep the existing apex-based implementation unchanged.

Additional context

Refer to my recently updated PR.
