Hm. And here I was thinking I'd seen most of the accelerated deep learning imple...

hannes-brt · on Dec 5, 2013

Yes, Hebel doesn't have a ton of features and a kitchen-sink of different models yet, but I hope that's going to change. There are lots of things that are quite easy to implement in the current framework, such as: - Neural net regression - Autoencoders - Restricted Boltzman machines

There's a lot of interest for convolutional networks and the best way to implement it will be to wrap Alex Krizhevsky's cuda-convnet, like DeepNet and PyLearn2 have, but this will require a bit more effort.

With respect to other deep learning packages, Hebel doesn't necessarily do everything differently, but depending on your needs it may be the best choice for a particular job.

PyLearn2 is big and monumental and although I haven't used it much personally, it seems to excellent. But as you mentioned, it's not necessarily easy to use and if you want to extend it, you have to learn the Theano development model, which takes some time to grok.

DeepNet is quite similar to Hebel in its approach (even though it offers more models right now). However, DeepNet is based on cudamat and gnumpy, which I have found to often be quite unstable and slow. Hebel is based on PyCUDA which is very stable and according to some preliminary tests I did runs about twice as fast as cudamat.

So, the idea of Hebel is that it should make it easy to train the most important deep learning models without much setup or having to write much code. It is also supposed to make it easy to implement new models through a modular design that lets you subclass existing layers or models to implement variations of them.

kkjkok · on Dec 5, 2013

Question: do you/will you plan to support converting GPU nets to CPU, perhaps by keeping weights and architecture definition separate from PyCUDA dependent structures during serialization?

I have found that using a trained net for preprocessing can be accomplished using very limited resources (read: Core 2 Duo laptop). This is one of the very nice features of DeCAF, which could allow for some interesting applications on embedded devices.

Great work by the way - I look forward to testing it out soon!

hannes-brt · on Dec 6, 2013

That would be possible, but since Hebel is mainly meant to be used in research I don't think it's a big priority now. The most important reason to do this would be to allow development on laptops and workstations without NVIDIA cards and to run the finished model on CUDA hardware later.

As far as embedded devices go (I assume you're talking about ARM cpus etc), they are probably too underpowered to run Neural nets anyway, or models would have to be written in highly specialized C.

kyzyl · on Dec 5, 2013

Yup, I didn't mean to belittle Hebel. I actually just meant that the lack of features is likely why I hadn't heard of it. From the looks of things it's on a nice path. You philisophy about what Hebel should be sound similar what's been done with MORB for making RBMs, and that is one of the reasons I've always like that library. Although MORB still does incur the 'working with theano' conceptual overhead.

hannes-brt · on Dec 6, 2013

The reason why haven't heard of it is probably because I only put the code on Github less than two weeks ago ;) - I've really been blown away by the response though.

kkjkok · on Dec 5, 2013

There is also DeCAF, which actually includes a way to load a pretrained ImageNet network based on cuda-convnet. I have had pretty recent success using this blob as preprocessing for image classification, ala http://arxiv.org/abs/1310.1531.

Github is here:

https://github.com/UCB-ICSI-Vision-Group/decaf-release/

My current code as an example of combining sklearn/pylearn2 with DeCAF preprocessing (under the decaf folder, sklearn usage is under previous commits):

https://github.com/kastnerkyle/kaggle-dogs-vs-cats

jdonahue · on Dec 5, 2013

Thanks for the DeCAF plug! Here's a demo of the classifier with the pre-trained ImageNet weights in action: http://decaf.berkeleyvision.org/

I also have to take the opportunity to plug Caffe [1] - Yangqing's replacement for DeCAF which he actually open sourced just a few hours ago. All the heavy processing (e.g., forward/backprop) can be run either on your (CUDA-enabled) GPU or on the CPU, and the GPU implementation is actually a bit faster than cuda-convnet. The entire core is (imo) very well-engineered and written in clean lovely C++, but it also comes with Python and Matlab wrappers. I've personally been hacking around inside the core for about a month and it has really been a pleasure to work with.

[1] http://daggerfs.com/caffe/

nrmn · on Dec 5, 2013

"Decaf (CPU) is 2 times faster than Theano with GPU"

Have you managed to reproduce this? Thats awesome if it is true! I thought Theano was already very fast.

nrmn · on Dec 5, 2013

Quick question about your code for the conv net, why do you resize the images down to 32x32? I thought one of the big features of conv nets was the fact that they input does not have to be the same, it just slides a window around the image. Am I complete wrong with this one?

Would you be willing to maybe print out the weights for each layer? I'd be interested to see what features your conv net is capturing.

kkjkok · on Dec 6, 2013

I was (and still am) trying to use an already trained CIFAR10 net in a similar manner to DeCAF/ImageNet. Because CIFAR10 operates on 32x32 color images, I did the same thing for the input of the DeCAF experiment. As far as I know, the inputs to the network need to be identical between train/test sets , though they can be 0-padded/color filled to make the dimensions match, it may affect results - haven't tried anything but scaling personally. I am pretty sure there are 2 sets of scaling happening for my DeCAF experiment: down to 32x32 with convert, then UP to 512x512, then the center 256x256 is pulled out. I think this may affect my results a little :)

The plan is to operate on 32x32 data for now, then try scaling up the input images or just scaling to 512x512 to see how input data size/resolution affects the DeCAF/pylearn2 classification result, either positively or negatively.

As far as network weights, I haven't tried to print/plot the DeCAF weights yet (though there are images in the DeCAF paper itself). For pure pylearn2 networks, there is a neat utility called show_weights.py in pylearn2/scripts.

Another method, which does do "chopping" is http://www.stanford.edu/~acoates/papers/coatesng_nntot2012.p... - which is a little different than what I am currently trying.

dave_sullivan · on Dec 5, 2013

There's also Ersatz (http://www.ersatz1.com) -- deep learning as a service, provides a web interface that allows you to train different neural net architectures and then run them via an API, the networks are GPU backed, been in beta since January. I'm the founder and at NIPS now if anyone wants to ask me about it in person.

kyzyl · on Dec 6, 2013

I was wondering when you were going to pop up here. Ersatz has a lot of potential, but as it stands now it's more or less a web UI for ensembling the pre-implemented models (correct me if I'm wrong). So if you don't want an autoencoder, convnet or a RNN, you're sort of out of luck, no?

I know the original website advertised 'custom architectures', but it's not entirely clear to me (... not that it necessarily should be) what the route for Ersatz's current implementation to something like that is. Comments?

dave_sullivan · on Dec 6, 2013

haha, I guess I do have a way of popping up anytime somebody's talking about deep learning on the internet...

But yeah, fair points re: ersatz. We've got RNNs, autoencoders, conv nets, and deep feed forward nets w/ dropout, different types of nonlinearities, etc etc. I think these represent a pretty flexible set of architectures--but you're right, if you're looking for an RBM, you're out of luck for now. From there, it's a web interface and API that make it pretty straightforward to get started with these types of architectures. Which is still pretty damned cool, if I do say so myself...

I think of it like this:

* Use theano if you want maximum flexibility (and maximum difficulty in getting to results)

* Use pylearn2 if you want a really fair amount of flexibility and pre-built implementations of neural networks. It is, however, difficult to get started with. Otherwise it's awesome.

* Use Ersatz if you want to use neural networks without knowing how to build them--but also know that you're giving up some flexibility and Ersatz is a bit opinionated--which, honestly, i'm not convinced is a bad thing for the type of market we're trying to target (non-ML researchers, really)

Very different offerings for different needs.

Re: custom architectures, yeah, you're right--bottom line is allocation of resources--what should our team spend time on? Because we're bootstrapped, the answer to that is whatever people are asking for (and--pretty importantly--willing to pay for). So far, lack of model types hasn't been a deal breaker for us so we've been spending time improving the API, getting it to run faster, deal better with larger and larger amounts of data, etc. etc. etc. I do have some ideas on how "custom architectures" could work, but we're focusing on polishing the current offering for now.

So yes, I agree, Ersatz is not yet living up to its full potential. But that will come, one step at a time. If theano and pylearn2 seem too complicated, try Ersatz, it's getting better every day.

hannes-brt · on Dec 6, 2013

This is awesome and I'm sure it will be a boon to app developers who want to include Machine Learning capabilities in their apps.

It looks to me as though Ersatz's focus is on providing a limited range of relatively standard models, but make them highly accessible, stable, fast, and suitable for production, whereas most available frameworks like Theano, PyLearn2, etc are more geared to the tinkering researchers and less to be used in actual products.

ma2rten · on Dec 6, 2013

What about Cuda Conv Net?

https://code.google.com/p/cuda-convnet/