Missed a post last week due to the Thanksgiving long weekend :-). We had gone to San Francisco to see the city and try out a couple of hikes). Just FYI – strolling around SF is also as much a hike as any of the real trails at Mt Sutro – with all the uphill & downhill roads! As for Robotics, I am currently on Week 3 of the Mobility course, which is more of physics than ‘computer science’; its a welcome change of pace from all the ML/CS stuff I usually do.
In this article, Numenta‘s cofounder discusses what we would need to push current AI systems towards general intelligence. He points out that many industry experts (including Jeff Bezos & Geoffrey Hinton) have opined that it would take far more than scaling up current intelligent systems, to achieve the next ‘big leap’.
Numenta’s goal as such is to take inspiration from the human brain (especially the neocortex) to design the next generation of machine intelligence. The article describes how the neocortex uses abstract ‘locations’ to understand sensory input and form mental representations. To read more of Numenta’s research, visit this page.
This article, though not presenting any ‘new findings’, is a fun-to-read introduction to Transfer Learning. It focusses on the different ways TL can be applied in the context of Neural Networks.
It provides examples of how pre-trained networks can be ‘retrained’ over new data by freezing/unfreezing certain layers during backpropagation. The blogpost also provides a bunch of useful links, such as this discussion on Stanford CS231.
This article motivates the need for embedding vectors in Deep Learning. One of the challenges of using SQL-ish data for deep learning, is the involvement of categorical attributes. The usual ways of dealing with such variables in ML is to use one-hot encodings, or find an integer representation for each possible value.
However, 1) one-hot encodings increase the memory footprint of a NN & 2) assigning integers to ordinal values implies a wrong meaning to neural networks, which are inherently continuous/numeric in nature. For example, Sunday=1 & Saturday=7 for a ‘week’ enum might lead the NN to believe that Sundays and Saturdays are very far apart, which is not usually true.
Hence, learning vectorial embeddings for ordinal attributes is perhaps the right way to go for most applications. While we usually know embeddings in the context of words (Word2Vec, LDA, etc), similar techniques can be used to other enum-style values as well.
This blog-post by Deepmind presents a novel approach to coming up with the hyperparameters for Neural-Network training. It essentially brings in the methodology of Genetic Algorithms for designing optimal network architectures.
While standard hyperparameter-tuning methods perform some kind of random search, Population-based training (PBT) allows each candidate ‘worker’ to take inspiration from the best candidates in the current population (similar to mating in GAs) while allowing for random perturbations in parameters for exploration (a.la. GA mutations.)