Blog

Simple recommendation system

2008-05-22 20:56:51 by Martynas Jusevičius

I have built a simple recommendation for one of the websites I work with. It is only based on social relationships and does not take into account other parameters. The basic idea is if I like an item, select other items liked by people who also like this item and it can be used to build lists of similar items as seen on Amazon or YouTube. If we use as an example people who like movies (tables Person, Movie and many-to-many relationship PersonMovie), this can be expressed in SQL using several joins:

SELECT DISTINCT Movie.*, COUNT(Person.ID) AS PersonCount
FROM Movie
JOIN PersonMovie ON Movie.ID = PersonMovie.MovieID
JOIN Person ON PersonMovie.PersonID = Person.ID
JOIN PersonMovie AS PersonMovie1 ON Person.ID = PersonMovie1.PersonID
JOIN Movie AS Movie1 ON PersonMovie1.MovieID = Movie1.ID
WHERE Movie1.ID = 1 AND Movie.ID != Movie1.ID
GROUP BY Movie.ID
ORDER BY PersonCount DESC

First we join movies to people and then backwards to the specific movie with ID = 1 that we want to get similar movies for. PersonCount is the relevance factor — the higher it is, the more people like the movie.

The algorithm can also be implemented in PHP using Propel ORM, for example, as a method in the Movie class:

public function getSimilarMovies(Criteria $c = null)
{
	if ($c === null) $c = new Criteria();
	elseif ($c instanceof Criteria) $c = clone $c;
	
	MoviePeer::addSelectColumns($c);

	$c->addAlias("PersonMovie1", PersonMoviePeer::TABLE_NAME);
	$c->addAlias("Movie1", MoviePeer::TABLE_NAME);
	$c->addAsColumn("PersonCount", "COUNT(Person.ID)");
	$c->addJoin(MoviePeer::ID, PersonMoviePeer::MovieID);
	$c->addJoin(PersonMoviePeer::PERSONID, PersonPeer::ID);
	$c->addJoin(PersonPeer::ID, "PersonMovie1.PersonID");
	$c->addJoin("PersonMovie1.MovieID", "Movie1.ID");
	$c->add("Movie1.ID", $this->getID());
	$c->add(MoviePeer::ID, $this->getID(), Criteria::NOT_EQUAL);
	$c->addGroupByColumn(MoviePeer::ID);
	$c->addDescendingOrderByColumn("PersonCount");

	return MoviePeer::doSelect($c);
}

Digg Digg this! del.icio.us del.icio.us!

Comments (8)

Expanding to take into account ratings

2008-05-23 11:14:16 by Rob Young

Woah there, slow down on the unnecessary joins and distinct
SELECT Movie.*, COUNT(PersonMovie.PersonID) AS PersonCount
FROM Movie
JOIN PersonMovie ON Movie.ID = PersonMovie.MovieID
JOIN PersonMovie AS PersonMovie1 ON PersonMovie.PersonID = PersonMovie1.PersonID
WHERE PersonMovie1.MovieID = 1 AND Movie.ID != Movie1.ID
GROUP BY Movie.ID
ORDER BY PersonCount DESC

Also, you can easily extend the idea to support weighted relationships (ratings).
SELECT Movie.*, SUM(PersonMovie.Rating) AS RatingAggregate
FROM Movie
JOIN PersonMovie ON Movie.ID = PersonMovie.MovieID
JOIN PersonMovie AS PersonMovie1 ON PersonMovie.PersonID = PersonMovie1.PersonID
WHERE PersonMovie1.MovieID = 1 AND Movie.ID != Movie1.ID
GROUP BY Movie.ID
ORDER BY RatingAggregate DESC

Rob,

2008-05-23 12:52:30 by Martynas

True, I didn't think about optimization. Person table is not necessary at all. Thanks :)

2008-05-28 13:35:06 by Martynas

COUNT needs to be distinct though, if we want precise results:

COUNT(DISTINCT PersonMovie.PersonID)

isbpfmqt

2009-05-30 04:24:17 by isbpfmqt

hH9Abj <a href="http://hlhfsztcnbea.com/">hlhfsztcnbea</a>, [url=http://twjesitodpbv.com/]twjesitodpbv[/url], [link=http://fdvuqedbkoer.com/]fdvuqedbkoer[/link], http://muhxphrxhnqc.com/

hehe

2010-07-26 11:41:24 by ed hardy jeans

They have a wonderful day planned and I’m SO excited to be a part of their wedding

Great article

2010-08-05 05:20:22 by cheap Jordans Shoes

Thank you for your insight,the article was worth every minute reading it.However mean your authentic jordan shoes life is,meet it and live it,do not shun it and call it cheap jordans shoes hard names.It is not so bad as you cheap nike shox are.It looks poorest when you are cheap nike shoes richest.The fault-finder will find faults in http://www.nikejordanshoes2sell.com/ paradise.

cheap jerseys

2010-08-20 12:33:19 by cheap jerseys

love it

2010-08-24 11:50:06 by yjb

Thanks to the upcoming World Basketball Festival, we now get a “USA” Air Jordan 2010 Team. It seems as if more people like the Air Jordan 2010 Team than the original Air Jordan 2010 because of the windowless side panels. I’m not one of those people who likes the team better; I think the original
<a href="http://www.nikeshoescompany.com/">Nike Shoes</a> 2010 Shoes looks much better.Since this <a href="http://www.nikeshoescompany.com/">jordan Shoes</a> Team is made for the USA team, the colorway should be clear. White can be seen on the side panels, toe, shoe laces, tongue, part of the midsole and the entire outsole. Navy blue covers the toe, heel, inner lining and above the midsole. Red accents appear on the tongue, toe, heel, lace panels and the midsole. The sneaker is constructed of perforated white leather with larger perforations placed on the side panels.

New comment






No HTML allowed.