Yes please! :) And some quick questions: What network topology do you use, and o...

nsthorat · on Oct 4, 2017

We're using SqueezeNet (https://github.com/DeepScale/SqueezeNet), which is similar to Inception (trained on the same ImageNet dataset) but is much smaller - 5MB instead of inception's 100MB - and inference is much much quicker.

The application takes webcam frames and infers through SqueezeNet, producing a 1000D logits vector for each frame. These can be thought of as unnormalized probabilities for each of ImageNet's 1000 classes.

During the collection phase, we collect these vectors for each class in browser memory, and during inference we pass the frame through SqueezeNet and do k-nearest neighbors to find the class with the most similar logits vector. KNN is quick because we vectorize it as one large matrix multiplication.

I'll go deeper in a blog post soon :)

eggie5 · on Oct 5, 2017

So you're doing nearest neighbour search on the images features from the CNN. This is alluded to in Figure 4 of the DeCaf paper: https://twitter.com/eggie5/status/907120374575505408

eggie5 · on Oct 5, 2017

alexnet paper not decaf paper!

amelius · on Oct 4, 2017

Interesting!

I'm curious why you've used a different classification algorithm on top of a neural network. I would expect that a neural network on top of a pretrained network could give similar results, with the benefit of simpler code. Is performance the reason?

Anyway, I'm looking forward to your blog post.

nsthorat · on Oct 4, 2017

Training a neural network on top would require a "proper" training phase, and finding the right hyperparameters that work everywhere turned out to be tricky. Actually, this is what we did originally, in the blog post we'll try to show demos of each of the approaches and explain why they don't work.

KNN also makes training "instant", and the code much much simpler.

amelius · on Oct 4, 2017

That makes sense.

By the way, I think your software could become very popular on the Raspberry Pi, because it would be very cheap and fun to use it for all sorts of applications (e.g. home automation).

nsthorat · on Oct 4, 2017

https://github.com/PAIR-code/deeplearnjs/issues/158

make3 · on Oct 4, 2017

Basically, read this paper: https://www.cs.cmu.edu/~rsalakhu/papers/oneshot1.pdf