Sean Harnett

Stochastic gradient descent versus non-linear conjugate gradient for the Netflix Prize

5-10-2012

I compare two common techniques to compute matrix factorizations for recommender systems, specifically using the Netflix prize data set. Accuracy, run-time, and scalability are discussed for stochastic gradient descent and non-linear conjugate gradient. pdf

Naive Bayes for multiclass text classification

3-26-2012

This was a part of a homework assignment from Jake Hofman's Data-driven Modeling course. Task: given the text of an article from the New York Times, predict the section to which the article belongs. First step was registering for the NYT developer API and using it to get articles. The classifier was able to produce greater than 90% accuracy on a withheld test data set. The assignment. My answer.

TODO: code on github

Optimal Power Grid Attack

5-11-2011

I study a continuous formulation of the so-called N-k problem: given a power grid, find a small set of lines whose failure will cause maximum damage. Instead of removing lines from the grid, I consider increasing their impedance continuously, and attempt to maximize the resulting disruption to stable voltages. For all test cases, I show that a fast and simple first-order optimization method finds the same set of lines as more sophisticated techniques. pdf

Crime PDE

12-17-2010

The evolution in space and time of crime risk and criminal density is modeled by a coupled system of partial differential equations, and solved with a semi-implicit method using finite differences. Under appropriate conditions, isolated stationary hot spots of high criminal activity form. pdf

UPDATE 12-19-2010: I did more stuff

I was playing with this more over the weekend and implemented the spectral methods we said we'd try next. They work well, and about three times faster than the finite differences we used before. I then found parameters to get different size hot spots like in the Short paper.

We didn't mention this in our paper, but when we coded it up, we used the notation from the other Short paper which we didn't reference. It's equivalent but looks a little different in a couple spots.

Here are animations:

code (matlab)

The Netflix Prize: Alternating Least Squares in MPI

3-5-2010

I solve the netflix problem on a distributed memory machine using MPI and alternating least squares. pdf