Last updated: 2009-11-04 (Wed) 11:30:11

MNIST

The MNIST dataset, one of the most famous in digit recognition, is derived from the NIST dataset and was created by Yann LeCun [2]. The digits from NIST were downscaled to 20x20 pixels and centered in a 28x28 pixel bitmap by placing the center of gravity of the black pixels at the center of the bitmap. It has 60,000 training and 10,000 test samples. We have output our digits in this format using Mitchell filter downsampling, again with blur=0.5. The center of gravity was computed before downsampling and scaled accordingly. Figure 6 shows samples from MNIST and from our reformatted dataset. You can see that MNIST has some segmentation errors (e.g. column 4, row 4 is a badly segmented four), possibly as much as 1%. For our dataset, we checked each sample manually for segmentation errors, so there should be none.
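The centering step described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the procedure actually used for MNIST or for the reformatted dataset: the function names `center_of_gravity` and `center_in_canvas` are hypothetical, the shift is rounded to whole pixels, and the center of gravity is computed on the already downscaled digit rather than before downsampling as the text describes.

```python
def center_of_gravity(img):
    """Return the intensity-weighted (row, col) centroid of a 2D list of ink values."""
    total = sum(sum(row) for row in img)
    if total == 0:
        raise ValueError("image contains no ink")
    cy = sum(r * sum(row) for r, row in enumerate(img)) / total
    cx = sum(c * v for row in img for c, v in enumerate(row)) / total
    return cy, cx


def center_in_canvas(img, size=28):
    """Paste img into a size x size canvas so its centroid lands near the canvas center.

    Assumption: the shift is rounded to whole pixels, so the centroid ends up
    within half a pixel of the center. Pixels shifted outside the canvas are dropped.
    """
    cy, cx = center_of_gravity(img)
    mid = (size - 1) / 2.0          # center of the canvas in pixel-index coordinates
    dy = int(round(mid - cy))       # whole-pixel row shift
    dx = int(round(mid - cx))       # whole-pixel column shift
    canvas = [[0] * size for _ in range(size)]
    for r, row in enumerate(img):
        for c, v in enumerate(row):
            rr, cc = r + dy, c + dx
            if 0 <= rr < size and 0 <= cc < size:
                canvas[rr][cc] = v
    return canvas


# Toy 20x20 "digit": a 4x4 block of ink in the top-left corner.
digit = [[1 if r < 4 and c < 4 else 0 for c in range(20)] for r in range(20)]
canvas = center_in_canvas(digit)
print(center_of_gravity(canvas))
```

Here the ink values play the role of the "black pixels" in the text: larger values mean more ink. A sub-pixel shift with interpolation would center the digit more precisely, at the cost of blurring the strokes slightly.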

http://yann.lecun.com/exdb/mnist/