PyTorch online tutorial
- The tutorial has links to Jupyter notebooks and Google Colab notebooks!
- If running in Colab, be sure to enable the GPU (Runtime > Change runtime type)
Summaries / key takeaways
What is PyTorch?
Make sure you know how to do the following (a quick code sketch follows the list):
- How to construct empty, random, all-ones, and all-zeros tensors
- How to specify tensor datatype
- Create a tensor with the same shape as another tensor, but a different datatype
- Look up tensor operations
- Add, subtract and multiply tensors
- Convert between NumPy arrays and tensors
- Move tensors between CPU and GPU
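Here's a minimal sketch covering the items above (all standard torch/NumPy calls; the shapes are arbitrary):

```python
import numpy as np
import torch

# Empty (uninitialized), random, all-ones, and all-zeros tensors
x = torch.empty(2, 3)
r = torch.rand(2, 3)
ones = torch.ones(2, 3)
zeros = torch.zeros(2, 3, dtype=torch.float64)  # specify the datatype

# Same shape as x, but a different datatype
y = torch.zeros_like(x, dtype=torch.int64)

# Add, subtract, and multiply (elementwise)
s = ones + r
d = ones - r
p = ones * r

# Convert between NumPy arrays and tensors (they share memory)
a = np.ones(3)
t = torch.from_numpy(a)
b = t.numpy()

# Move tensors between CPU and GPU
if torch.cuda.is_available():
    g = ones.to("cuda")
    back = g.to("cpu")
```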
Autograd
- Useful takeaways from this tutorial:
  - `with torch.no_grad():` disables gradient tracking inside the block (e.g. during evaluation).
  - `requires_grad` determines whether the gradient is computed for a tensor.
- Don’t know how important the other stuff is. Part 3 will go into how weight updates etc. are done in practice.
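A minimal sketch of both takeaways (the values are illustrative):

```python
import torch

# requires_grad=True makes autograd track operations on x
x = torch.ones(2, 2, requires_grad=True)
y = (x * 3).sum()
y.backward()       # populates x.grad with dy/dx
print(x.grad)      # a 2x2 tensor of 3s

# torch.no_grad() suspends tracking, e.g. during evaluation
with torch.no_grad():
    z = x * 3      # z.requires_grad is False
```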
Neural Networks
- Notice how they define the network as a class.
- Dense layers are called `Linear` (`nn.Linear`)
- What the network actually does is in `forward`
- Layers that don’t need weights (e.g. max_pool) aren’t members of the class. E.g. `F.max_pool2d` is used in `forward` but not defined in `__init__`.
- `.parameters()` gets weights.
- Using `zero_grad` is important at the start of each training loop. I copied the training loop from the end of the document; it’s worth remembering:

```python
optimizer.zero_grad()             # Zero gradients
output = net(input)
loss = criterion(output, target)  # Compute loss
loss.backward()                   # Backprop
optimizer.step()                  # Update step
```
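For reference, a minimal sketch of the class pattern described above (layer sizes assume a 28x28 single-channel input; the names are illustrative, not copied from the tutorial):

```python
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        # Layers with weights are members of the class
        self.conv1 = nn.Conv2d(1, 6, 5)        # assumes 1 input channel
        self.fc1 = nn.Linear(6 * 12 * 12, 10)  # dense layer is "Linear"

    def forward(self, x):
        # What the network does; weightless ops like max_pool2d live here
        x = F.max_pool2d(F.relu(self.conv1(x)), 2)
        x = x.flatten(1)
        return self.fc1(x)
```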
Training a Classifier
- Saving and loading models: `torch.save` the `state_dict`, then `load_state_dict` into a fresh model (see the sketch after this list)
- torchvision is pretty good for computer vision: it ships datasets, transforms, and models
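A minimal save/load sketch in the tutorial's `state_dict` style (reusing the `Net` sketch above; the file name is arbitrary):

```python
import torch

# Save only the learned parameters
torch.save(net.state_dict(), "model.pth")  # "model.pth" is an arbitrary path

# Restore into a fresh instance of the same class
net2 = Net()
net2.load_state_dict(torch.load("model.pth"))
net2.eval()  # switch to inference mode before evaluating
```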
Datasets and Parallelization
- DataLoader and Dataset. Basically, create a Dataset that provides the data, and DataLoader handles shuffling and batching on top of it (see the first sketch below).
- DataParallel splits batches and distributes them among GPUs (second sketch below). This can be useful since CSUA allows 2 GPUs per person by default!
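A minimal Dataset/DataLoader sketch (the toy dataset is made up for illustration):

```python
import torch
from torch.utils.data import Dataset, DataLoader

# A Dataset just needs __len__ and __getitem__
class SquaresDataset(Dataset):
    def __init__(self, n=100):
        self.x = torch.arange(n, dtype=torch.float32)

    def __len__(self):
        return len(self.x)

    def __getitem__(self, idx):
        return self.x[idx], self.x[idx] ** 2  # (input, label)

# DataLoader layers shuffling and batching on top of the Dataset
loader = DataLoader(SquaresDataset(), batch_size=8, shuffle=True)
for xb, yb in loader:
    ...  # xb and yb arrive in shuffled batches of 8
```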
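And a minimal DataParallel sketch (the model is a placeholder):

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 2)  # placeholder model
if torch.cuda.device_count() > 1:
    # Each input batch gets split across the available GPUs
    model = nn.DataParallel(model)
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
```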