- Training Set: 60,000 training images & labels
- Test Set: 10,000 test images & labels
- Handwritten single digits ranging 0-9
- Each 28 x 28 pixel image is represented as a 28 x 28 array of grayscale (single color channel) values ranging from 0 (white) to 255 (black).
- Pixel values are commonly normalized to the range 0-1 by dividing by 255.
- With 60,000 images, each sized 28 x 28 x 1 (a single channel), the full training set forms a 4D array:
- (60,000, 28, 28, 1)
- The '1' denotes a single color channel: grayscale. Color images would have a value of '3'.
- When the 28 x 28 array is flattened, it becomes a 1D vector of 784 units. Note that flattening discards the spatial relationships between adjacent pixels, but a CNN can otherwise account for these. This results in an overall training tensor of shape (60,000, 784).
- Labels are one-hot encoded into a single array per image; the digit is identified by the index position of the 1.
- Example label for a digit of 4: [0, 0, 0, 0, 1, 0, 0, 0, 0, 0]
- With labels identifying 1 of the 10 available values, the training labels are a (60,000, 10) 2D array.
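The shapes and encodings above can be sketched in NumPy. This is a minimal illustration using zero-filled placeholder arrays in place of the real MNIST pixel data; the array names are hypothetical:

```python
import numpy as np

# Placeholder standing in for the raw MNIST training images (shapes match the notes).
num_images = 60000
images = np.zeros((num_images, 28, 28), dtype=np.uint8)  # raw 0-255 grayscale

# Add the single grayscale channel axis and normalize 0-255 -> 0-1:
# (60000, 28, 28) -> (60000, 28, 28, 1)
images_4d = images[..., np.newaxis].astype(np.float32) / 255.0

# Flatten each 28 x 28 image into a 784-unit vector: (60000, 784)
images_flat = images_4d.reshape(num_images, 784)

# One-hot encode a label: digit 4 -> index 4 set to 1
label = 4
one_hot = np.eye(10, dtype=np.int64)[label]
print(images_4d.shape, images_flat.shape, one_hot)
# (60000, 28, 28, 1) (60000, 784) [0 0 0 0 1 0 0 0 0 0]
```

Stacking 60,000 such one-hot rows gives the (60,000, 10) training-label array described above.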
The following is a sample of the data for a single image depicting a handwritten number '5'.
MNIST handwritten digit database, by Yann LeCun (Courant Institute, NYU), Corinna Cortes (Google Labs, New York), and Christopher J.C. Burges (Microsoft Research, Redmond): "The MNIST database of handwritten digits has a training set of 60,000 examples, and a test set of 10,000 examples. It is a subset of a larger set available from NIST."
TF MNIST Dataset
Distributed Training with MNIST Dataset
Distributed training with Keras | TensorFlow Core
The tf.distribute.Strategy API provides an abstraction for distributing your training across multiple processing units. The goal is to allow users to enable distributed training using existing models and training code, with minimal changes. This tutorial uses tf.distribute.MirroredStrategy, which does in-graph replication with synchronous training on multiple GPUs on one machine.
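A minimal sketch of MirroredStrategy with a small Keras classifier, assuming TensorFlow is installed. On a machine with no extra GPUs the strategy simply runs with one replica; the random batch standing in for MNIST data is purely illustrative:

```python
import numpy as np
import tensorflow as tf

# MirroredStrategy replicates the model across all visible GPUs on one machine
# and keeps the replicas in sync; with no GPUs it falls back to a single replica.
strategy = tf.distribute.MirroredStrategy()
print("Replicas in sync:", strategy.num_replicas_in_sync)

# Model and optimizer variables must be created inside the strategy scope
# so they are mirrored across devices.
with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(28, 28, 1)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])

# Tiny random batch shaped like MNIST (hypothetical data, not real digits).
x = np.random.rand(64, 28, 28, 1).astype("float32")
y = np.eye(10, dtype="float32")[np.random.randint(0, 10, size=64)]

# model.fit distributes each batch across the replicas automatically.
model.fit(x, y, epochs=1, batch_size=32, verbose=0)
```

The only distribution-specific changes relative to single-device code are creating the strategy and building the model inside `strategy.scope()`; the `fit` call itself is unchanged.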