description.txt

DATASET DESCRIPTION

This dataset is a subset of the data provided as part of the Netflix
Prize. TrainingRatings.txt and TestingRatings.txt are respectively the
training set and test set. Each of them has lines having the format:
MovieID,UserID,Rating.

Each row represents a rating of a movie by some user. The dataset
contains 1821 movies and 28978 users in all. Ratings are integers from
1 to 5. The training set has 3.25 million ratings, and the test set
has 100,000. Use the training set to predict the ratings provided in
the test set. Note that your predicted ratings do not have to be
integers. You must look to achieve best performance on the Mean
Absolute Error and the Root Mean Squared Error metrics.

The files movie_titles.txt and README are from the original Netflix
dataset. The former has rows having the format:
MovieID,YearOfRelease,Title.
Note that it contains many more movies than have been used in our
subset.  The README file has been included to conform to the terms of
use. It will tell you more about the original, complete dataset.