PyTorch tutorials

PyTorch online tutorial

  • The tutorial has links to Jupyter notebooks and Google Colab notebooks!
  • If running in Colab, be sure to enable the GPU (Runtime > Change runtime type)

Summaries / key takeaways

What is PyTorch?

Make sure you know how to do the following (a quick sketch follows the list):

  • Construct empty, random, all-ones, and all-zeros tensors
  • Specify a tensor's datatype
  • Create a tensor with the same shape as another tensor, but a different datatype
  • Look up tensor operations in the documentation
  • Add, subtract, and multiply tensors
  • Convert between NumPy arrays and tensors
  • Move tensors between CPU and GPU
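
A minimal sketch of these basics (the shapes and values here are arbitrary examples):

    import numpy as np
    import torch

    a = torch.empty(2, 3)                    # uninitialized memory
    b = torch.rand(2, 3)                     # uniform random in [0, 1)
    c = torch.ones(2, 3)
    d = torch.zeros(2, 3, dtype=torch.long)  # specify the datatype

    e = torch.randn_like(c, dtype=torch.double)  # c's shape, different dtype

    f = b + c                                # also torch.add(b, c)
    g = b - c
    h = b * c                                # elementwise multiply

    arr = f.numpy()                          # tensor -> NumPy (shares memory on CPU)
    back = torch.from_numpy(np.ones(3))      # NumPy -> tensor

    if torch.cuda.is_available():
        f = f.to("cuda")                     # move to GPU
        f = f.to("cpu")                      # and back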

Autograd

  • Useful takeaways from this tutorial (a short sketch follows this list):
    • with torch.no_grad(): disables gradient tracking inside the block (useful at inference time)
    • requires_grad determines whether operations on a tensor are tracked for gradient computation.
  • Not sure how important the rest is; Part 3 will cover how weight updates and the like are done in practice.
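
A short sketch of both takeaways (the tensors are toy examples):

    import torch

    x = torch.ones(2, 2, requires_grad=True)  # operations on x are now tracked
    y = (x * 3).sum()
    y.backward()                               # compute dy/dx
    print(x.grad)                              # all 3s, since dy/dx_i = 3

    with torch.no_grad():                      # tracking disabled inside this block
        z = x * 2
    print(z.requires_grad)                     # False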

Neural Networks

  • Notice how they define the network as a class (a subclass of nn.Module); a stripped-down sketch follows the training loop below.
  • Dense (fully connected) layers are called “Linear” (nn.Linear)
  • What the network actually does lives in “forward”
    • Layers that don’t need weights (e.g. max pooling) aren’t members of the class
    • E.g. F.max_pool2d is used in forward but never set up in __init__
  • .parameters() returns the network’s learnable weights
  • Calling zero_grad() at the start of each training iteration is important, since gradients otherwise accumulate. I copied the training loop below from the end of the tutorial; it’s worth remembering.
    optimizer.zero_grad()            # zero the gradient buffers (they accumulate otherwise)
    output = net(input)              # forward pass
    loss = criterion(output, target) # compute loss
    loss.backward()                  # backprop
    optimizer.step()                 # update the weights
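
Here is a stripped-down sketch of the class pattern, assuming a 1×32×32 input (the layer sizes are made up for illustration):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class Net(nn.Module):
        def __init__(self):
            super().__init__()
            # layers with weights are members of the class
            self.conv1 = nn.Conv2d(1, 6, 5)        # 1 input channel, 6 output, 5x5 kernel
            self.fc1 = nn.Linear(6 * 14 * 14, 10)  # "Linear" = dense layer

        def forward(self, x):
            # max pooling has no weights, so it comes from F, not __init__
            x = F.max_pool2d(F.relu(self.conv1(x)), 2)
            x = torch.flatten(x, 1)
            return self.fc1(x)

    net = Net()
    out = net(torch.randn(1, 1, 32, 32))  # forward pass on a dummy batch
    params = list(net.parameters())       # the learnable weights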
    

Training a Classifier

Datasets and Parallelization

  • Dataset and DataLoader: create a Dataset that provides the samples, and wrap it in a DataLoader to get shuffling and batching (see the sketch below).
  • DataParallel splits each batch and distributes the pieces across GPUs. This can be useful since CSUA allows 2 GPUs per person by default!
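
A minimal sketch of the Dataset/DataLoader pattern (the toy data and the ToyDataset name are made up):

    import torch
    from torch.utils.data import Dataset, DataLoader

    class ToyDataset(Dataset):
        # a Dataset just needs __len__ and __getitem__
        def __init__(self):
            self.x = torch.randn(100, 8)          # 100 fake samples, 8 features
            self.y = torch.randint(0, 2, (100,))  # fake binary labels

        def __len__(self):
            return len(self.x)

        def __getitem__(self, idx):
            return self.x[idx], self.y[idx]

    loader = DataLoader(ToyDataset(), batch_size=16, shuffle=True)
    for xb, yb in loader:   # DataLoader handles batching and shuffling
        pass                # training step would go here

    # DataParallel wraps a model so each batch is split across available GPUs
    # (assumes some nn.Module instance called net):
    # net = torch.nn.DataParallel(net)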