Netflix Prize for Dummies [II]

Posted in Netflix, databases, operations research by Francisco Marco-Serrano @ Jul 18, 2007

Next is the database. In this example I’m going to use MySQL, although you could use PostgreSQL, or MS SQL, for example. I’m in a Windows OS.

2) Creating the database and dumping the data into it.

a. Create the database:

CREATE DATABASE netflix;

b. Create the tables:

“training_ser” is the table where I’m going to dump the training_set data, made up of the movie ids, user ids, the rating, and the date.

CREATE TABLE `netflix`.`training_set` (
`idmovie` INTEGER UNSIGNED NOT NULL,
`iduser` INTEGER UNSIGNED NOT NULL,
`rating` INTEGER UNSIGNED NOT NULL,
`date` VARCHAR(10) NOT NULL,
PRIMARY KEY USING BTREE(`idmovie`, `iduser`);
)
ENGINE = MyISAM
COMMENT = ‘User ratings’;

“movies” is the table where I’ll dump the information from the movies file, made up of movie ids, release date, and title.

CREATE TABLE `netflix`.`movies` (
`idmovie` INTEGER UNSIGNED NOT NULL,
`release` INTEGER UNSIGNED NOT NULL,
`title` VARCHAR(150) NOT NULL,
PRIMARY KEY (`idmovie`)
)
ENGINE = MyISAM
COMMENT = ‘Movies List’;

Try doing the same for “probe” and “qualifying”.

c. Dump the data into the tables:

LOAD DATA LOCAL INFILE “C:/Netflix/training_set.txt”
REPLACE INTO TABLE netflix.training_set
FIELDS TERMINATED BY ‘,’
LINES TERMINATED BY ‘\n’ STARTING BY ”
(idmovie, iduser, rating, date);

LOAD DATA LOCAL INFILE “C:/Netflix/movie_titles.txt”
REPLACE INTO TABLE netflix.movies
FIELDS TERMINATED BY ‘,’
LINES TERMINATED BY ‘\n’ STARTING BY ”;

Try doing the same for “probe” and “qualifying”.

Any problems?.

Oops! It seems we have found nothing related.

1 Comment »

  1. Install the ODBC connector (http://dev.mysql.com/downloads/connector/odbc/3.51.html) as well, so you will be able to access the data from external applications, such as spreadsheets.

    Comment by Administrator — July 20, 2007 @ 2:02 pm

RSS feed for comments on this post.

Leave a comment

This is a captcha-picture. It is used to prevent mass-access by robots. (see: www.captcha.net)

You must read and type the 5 chars within 0..9 and A..F, and submit the form.

  

Oh no, I cannot read this. Please, generate a