Weekly Review: 12/23/2017

Happy Holidays, people! If you live in the Bay Area, the coming week is probably your time off, so I hope you have fun and enjoy the holiday season! As for Robotics, I just finished Week 2 of Perception, and will probably kick off Week 3 in 2018. I am excited about the last ‘real’ course (Estimation & Learning), and then building my own robot as part of the ‘Capstone’ project after that :-D.

This week’s articles:

XGBoost

I only recently came across XGBoost (eXtreme Gradient Boosting), an improvement over standard Gradient Boosting – which is actually a shame, considering how popular this method is in Data Science. If you are rusty on ensemble learning, take a look at this article on bagging/Random Forests, and my own intro to Boosting.

XGBoost is one of the most efficient versions of Gradient Boosting, and apparently works really well on structured/tabular data. It also provides features such as sparsity-awareness (being able to handle missing values), and the ability to update an already-trained model with ‘continued training’. Its effectiveness on tabular data has made it very popular with Kaggle winners, one of whom is quoted as saying: “When in doubt, use xgboost”!
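
If you want to try it out, the scikit-learn-style wrapper gets you going in a few lines. Here is a minimal sketch – the toy dataset and hyperparameter values are placeholders, not recommendations:

# Minimal sketch: training an XGBoost classifier via its scikit-learn wrapper.
# The data and hyperparameters below are purely illustrative.
import numpy as np
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

X = np.random.rand(1000, 20)                 # toy tabular data
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)    # toy labels
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

model = XGBClassifier(n_estimators=100, max_depth=4, learning_rate=0.1)
model.fit(X_train, y_train)                  # np.nan inputs are handled natively
print("Test accuracy:", model.score(X_test, y_test))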

Take a look at the original paper to dig deeper.

Quantum Computing + Machine Learning

A lot of companies, such as Google and Microsoft, have recently shown interest in the domain of Quantum Computing. Rigetti is a startup that aims to rival these juggernauts with its cloud Quantum Computing platform (called Forest). They even have their own Python integration!

The article in question details their efforts to prototype simple clustering with quantum computing. It is still pretty crude, and is by no means a replacement for traditional systems – for now. One of the major points of criticism is: “Applying Quantum Computing to Machine Learning will only make a black-box system more difficult to understand”. This is in fact true, but the author suggests that ML could, in turn, help us understand the behavior of Quantum Computers by modelling them!

Breaking a CAPTCHA with ML

A simple, easy-to-read, fun article on how you could break the simplest CAPTCHA algorithms with CV+Deep Learning.

Learning Indexing Structures with ML

Indexing structures are essentially data structures meant for efficient data access. For example, a B-Tree Index is used for efficient range-queries, a Hash-table is used for fast key-based access, etc. However, all of these data structures are pretty rigid in their behavior – they do not fine-tune/change their parameters based on the structure of the data.

This paper (which includes the Google legend Jeff Dean as an author) explores the possibility of using Neural Networks (in fact, a hierarchy of them) as indexing structures. Basically, you would use a Neural Network to compute the function – f: data -> hash/position.

Some key takeaways from the paper:

  1. Range Index models essentially ‘learn’ a cumulative distribution function (a toy sketch follows this list).
  2. The overall ‘learned index’ proposed by this paper is a hierarchy of models (but not a tree, since two models at a certain layer can point to the same model in the next layer).
    1. As you go down the layers, the models deal with smaller and smaller subsets of the data.
  3. Unlike a B-Tree, there is no ‘search’ involved while descending the hierarchy, since each model directly predicts which model to use at the next level.
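
To make the CDF idea concrete, here is a toy, single-model sketch (my own illustration, not the paper’s hierarchical Recursive Model Index): a tiny regressor predicts a key’s position in a sorted array, and a bounded local search around the prediction corrects any error.

# Toy "learned range index": a model predicts a key's position in a sorted array,
# and a search restricted to the model's worst-case error window fixes the guess.
import numpy as np
from sklearn.linear_model import LinearRegression

keys = np.sort(np.random.lognormal(size=10_000))
positions = np.arange(len(keys))

model = LinearRegression().fit(keys.reshape(-1, 1), positions)

# Worst-case prediction error, stored alongside the model (the paper keeps
# similar min/max error bounds per model).
pred = model.predict(keys.reshape(-1, 1)).round().astype(int)
max_err = int(np.abs(pred - positions).max()) + 1

def lookup(key):
    guess = int(round(model.predict([[key]])[0]))
    lo, hi = max(0, guess - max_err), min(len(keys), guess + max_err)
    return lo + int(np.searchsorted(keys[lo:hi], key))  # search only the window

i = 1234
assert lookup(keys[i]) == i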

Tacotron 2

This post on the Google Research blog details the development of Tacotron 2, a text-to-speech system that pairs a sequence-to-sequence network with a WaveNet-style vocoder to generate human speech from text.


Weekly Review: 12/10/2017

The Mobility Robotics course is finally done, and I just started Perception. It seems to be way more concept-heavy than any of the other courses, but I like the content from Week 1 so far! I did not like Mobility as much, since it focused exclusively on theory, and the content assumed a fair amount of comfort with kinematics/dynamics (which I don’t have anymore). Anyway, off to the articles for this week:

AI & the Blockchain

This article gives a quick introduction to Blockchain technologies, and then delves into the relationship between Artificial Intelligence and cryptocurrencies.

It discusses the various ways in which AI could transform blockchain tech, such as:

  1. Improving the energy efficiency of mining centers (like DeepMind’s algorithms do for Google).
  2. Increasing scalability using Federated Learning.
  3. Predicting which nodes could solve a particular block, so as to ‘free up’ the others.

Federated Learning

Coming across the mention of Federated Learning made me realise that I did not remember what it was, so I revisited the old(ish) post on Google’s Research blog.

Federated Learning works by decentralizing the training process for ML models (unlike most other on-device ML, which only runs inference on the end devices). This is useful in cases where continuously communicating data from devices causes bandwidth and latency issues for the user/training server.

It works like this: every device downloads the latest version of a model from the central server. Then, as it sees more data in deployment, it trains the local model to compute small ‘focused’ updates based on the user. All these small updates (and not the raw data that created them) are then sent to the central server, which aggregates them using the FederatedAveraging algorithm. Privacy is ensured primarily by retraining the central model only after receiving a certain number of these smaller updates.
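
Here is a bare-bones sketch of just the server-side averaging step, assuming each device reports its updated weights along with how many examples it trained on (this ignores client sampling, compression and the secure aggregation machinery):

# Sketch of the server-side FederatedAveraging step: combine per-device weights
# into a new central model, weighting each device by how much data it saw.
import numpy as np

def federated_average(client_weights, client_num_examples):
    """client_weights: one list of numpy arrays (layers) per device."""
    total = sum(client_num_examples)
    new_weights = []
    for layer_idx in range(len(client_weights[0])):
        layer = sum(w[layer_idx] * (n / total)
                    for w, n in zip(client_weights, client_num_examples))
        new_weights.append(layer)
    return new_weights

# Toy example: three devices, each with a 2x2 weight matrix and a bias vector.
clients = [[np.ones((2, 2)) * i, np.ones(2) * i] for i in (1.0, 2.0, 3.0)]
counts = [10, 30, 60]
print(federated_average(clients, counts)[0])  # weighted mean of the matrices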

AlphaZero Chess

Some time back, DeepMind unveiled AlphaGo Zero, an algorithm that learned to play Go by playing only against itself (given just the basic laws of the game). They then went on to try the same MCTS-based algorithm on chess, and it seems to be working really well! The AlphaZero algorithm apparently defeated Stockfish (the reigning computer chess champion) 28 wins to none (and a bunch of draws).

Of course, the superior hardware that AlphaZero uses does make a huge difference, but the very fact that such powerful computers can be used to ‘meta-learn’ so effectively is in itself a game-changer. Do read the original paper to get an idea of their method (especially the section on the input/output representations fed to the deep network).

DeepVariant

High-Throughput Sequencing (HTS) is a method used in genome sequencing. HTS produces multiple reads of an individual’s genome, which are then compared to some ‘reference’ to explore variations.

To achieve this, it is necessary to properly align the reads with the reference genome, and also account for errors in measurement. Essentially, every nucleotide position that does not match the reference could either be a genuine variant or an error in measurement. This is determined using data from all the reads produced by the method – this is called the ‘Variant Calling Problem’.

DeepVariant, an algorithm co-developed by Google Brain & Verily, converts the variant-calling problem into an image classification problem to achieve state-of-the-art results. It was unveiled at NIPS-2017, and they have open-sourced the code.

Funny Programming Jargon

This is not really an ‘article’, but more of comic relief :-). It lists various programming terms invented by real developers that mock the software engineering pitfalls of a typical workplace. Do read if you appreciate programming humor!

Weekly Review: 12/03/2017

Missed a post last week due to the Thanksgiving long weekend :-). We had gone to San Francisco to see the city and try out a couple of hikes. Just FYI – strolling around SF is as much a hike as any of the real trails at Mt Sutro, with all the uphill & downhill roads! As for Robotics, I am currently on Week 3 of the Mobility course, which is more physics than ‘computer science’; it’s a welcome change of pace from all the ML/CS stuff I usually do.

Numenta – Secret to Strong AI

In this article, Numenta‘s cofounder discusses what we would need to push current AI systems towards general intelligence. He points out that many industry experts (including Jeff Bezos & Geoffrey Hinton) have opined that it would take far more than scaling up current intelligent systems to achieve the next ‘big leap’.

Numenta’s goal as such is to take inspiration from the human brain (especially the neocortex) to design the next generation of machine intelligence. The article describes how the neocortex uses abstract ‘locations’ to understand sensory input and form mental representations. To read more of Numenta’s research, visit this page.

Transfer Learning

This article, though not presenting any ‘new findings’, is a fun-to-read introduction to Transfer Learning. It focuses on the different ways TL can be applied in the context of Neural Networks.

It provides examples of how pre-trained networks can be ‘retrained’ over new data by freezing/unfreezing certain layers during backpropagation. The blog post also provides a bunch of useful links, such as this discussion on Stanford CS231.
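
As a rough illustration (a sketch, not the article’s own code), this is what the freeze-the-base, retrain-the-head recipe looks like in Keras; the architecture, layer counts and learning rates are arbitrary:

# Sketch of feature-extraction-style transfer learning in Keras:
# freeze a pre-trained convolutional base and train only a new classifier head.
from tensorflow import keras

base = keras.applications.VGG16(weights="imagenet", include_top=False,
                                input_shape=(224, 224, 3))
base.trainable = False  # freeze all convolutional layers

model = keras.Sequential([
    base,
    keras.layers.GlobalAveragePooling2D(),
    keras.layers.Dense(256, activation="relu"),
    keras.layers.Dense(10, activation="softmax"),  # 10 classes in the new task
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
# model.fit(new_images, new_labels, epochs=5)

# Optional fine-tuning: unfreeze the last few layers and retrain with a low LR.
for layer in base.layers[-4:]:
    layer.trainable = True
model.compile(optimizer=keras.optimizers.Adam(1e-5),
              loss="sparse_categorical_crossentropy")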

Structured Deep Learning

This article motivates the need for embedding vectors in Deep Learning. One of the challenges of using SQL-ish data for deep learning is the presence of categorical attributes. The usual ways of dealing with such variables in ML are to use one-hot encodings, or to find an integer representation for each possible value.

However, 1) one-hot encodings increase the memory footprint of a NN, and 2) assigning integers to categorical values imposes a meaning that is not really there, since neural networks are inherently continuous/numeric in nature. For example, Sunday=1 & Saturday=7 for a ‘week’ enum might lead the NN to believe that Sundays and Saturdays are very far apart, which is not usually true.

Hence, learning vector embeddings for categorical attributes is perhaps the right way to go for most applications. While we usually encounter embeddings in the context of words (Word2Vec, LDA, etc.), similar techniques can be applied to other enum-style values as well.
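
For instance, giving a ‘day of week’ column its own learned 3-dimensional embedding (instead of a 7-dimensional one-hot vector) looks roughly like this in Keras – a sketch, with purely illustrative sizes:

# Sketch: learn a dense embedding for a categorical column (e.g. day-of-week)
# instead of one-hot encoding it, and mix it with the other tabular features.
from tensorflow import keras

day_input = keras.Input(shape=(1,), dtype="int32")    # 0..6 for Mon..Sun
numeric_input = keras.Input(shape=(8,))               # other tabular features

day_embedding = keras.layers.Embedding(input_dim=7, output_dim=3)(day_input)
day_embedding = keras.layers.Flatten()(day_embedding)

x = keras.layers.Concatenate()([day_embedding, numeric_input])
x = keras.layers.Dense(32, activation="relu")(x)
output = keras.layers.Dense(1)(x)

model = keras.Model([day_input, numeric_input], output)
model.compile(optimizer="adam", loss="mse")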

Population-based Training

This blog post by DeepMind presents a novel approach to coming up with the hyperparameters for Neural-Network training. It essentially brings the methodology of Genetic Algorithms to hyperparameter optimization.

While standard hyperparameter-tuning methods perform some kind of random search, Population-Based Training (PBT) allows each candidate ‘worker’ to take inspiration from the best candidates in the current population (similar to mating in GAs), while allowing for random perturbations in parameters for exploration (à la GA mutations).
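
A bare-bones sketch of the exploit/explore step is below (my own simplification: no parallel training, and a worker’s ‘score’ stands in for its validation performance):

# Bare-bones sketch of Population Based Training's exploit/explore step.
# Weak workers copy a strong worker's hyperparameters (exploit) and then
# randomly perturb them (explore).
import random, copy

def exploit_and_explore(population, perturb=0.2):
    population.sort(key=lambda w: w["score"], reverse=True)
    cutoff = max(1, len(population) // 5)
    top, bottom = population[:cutoff], population[-cutoff:]
    for worker in bottom:
        parent = random.choice(top)
        worker["hparams"] = copy.deepcopy(parent["hparams"])     # exploit
        for name in worker["hparams"]:                           # explore
            worker["hparams"][name] *= random.choice([1 - perturb, 1 + perturb])
    return population

population = [{"hparams": {"lr": 10 ** random.uniform(-4, -1)},
               "score": random.random()} for _ in range(10)]
population = exploit_and_explore(population)
print(population[0]["hparams"])   # hyperparameters of the current best worker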


Weekly Review: 11/18/2017

I finished the Motion Planning course from Robotics this week. That was expected, since the material was quite in line with the data structures and algorithms that I studied during my undergrad. The next one, Mobility, seems to be a notch tougher than Aerial Robotics, mainly because of the focus on calculus and physics (neither of which I have touched heavily in years).

Here are the articles for this week:

Neural Networks: Software 2.0

In this article from Medium, the Director of AI at Tesla gives a fresh perspective on NNs. He refers to the set of weights in a Neural Network as a program which is learnt, as opposed to coded in by a human. This line of thought is justified by the fact that many decisions in Robotics, Search, etc. are taken by parametric ML systems. He also compares it to traditional ‘Software 1.0’, and points out the benefits of each.

Baselines in Machine Learning

In this article, a senior Research Scientist from Salesforce points out that we need to pay greater attention to baselines in Machine Learning. A baseline is any meaningful ‘benchmark’ algorithm that you compare your own algorithm against. The actual reference point would depend on your task – random/stratified predictors for classification (a trivial example of which is sketched after the list below), state-of-the-art CNNs for image processing, etc. Read Neal’s answer to this Quora question for a deeper understanding.

The article ends with a couple of helpful tips, such as:

  1. Use meaningful baselines, instead of comparing against deliberately crude code. The better your baseline, the more meaningful your results.
  2. Start off by optimizing the baseline itself. Tune its weights, etc. if you have to – this gives you a good base to build your own work on.
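
For classification, the ‘simplest meaningful baseline’ can literally be scikit-learn’s DummyClassifier – a sketch:

# Sketch: scikit-learn's DummyClassifier as a sanity-check baseline.
# Any "real" model should comfortably beat this before its numbers mean much.
from sklearn.datasets import load_iris
from sklearn.dummy import DummyClassifier
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(*load_iris(return_X_y=True),
                                                    test_size=0.3, random_state=0)
baseline = DummyClassifier(strategy="stratified")   # predicts by label frequency
baseline.fit(X_train, y_train)
print("Baseline accuracy:", baseline.score(X_test, y_test))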

TensorFlow Lite

TensorFlow Lite is now in Developer Preview mode. It is a lightweight platform for inference (not training) with ML models on mobile/embedded devices. Google calls it an ‘evolution of TensorFlow Mobile’. While the latter is still the system you should use in production, TensorFlow Lite appears to perform better on many benchmarks (differences here). Some of the major plus-points of this new platform are smaller binaries, and support for custom ML-focused hardware accelerators via the Android Neural Networks API.

Flatbuffers

Reading up on TensorFlow Lite also brought me to FlatBuffers, which are a ‘lite-r’ version of Protobufs. FlatBuffers is a data serialization library for performance-critical applications. It provides the benefits of a smaller memory footprint and less generated code, mainly by skipping the parsing/unpacking step. Here’s the GitHub repo.

Adversarial Attacks

This YCombinator article gives a nice overview of adversarial attacks on ML models – attacks that feed ‘noisy’ data inputs to intelligent systems in order to get a ‘wrong’ output. The author points out how gradient descent can be used to reverse-engineer spurious noise that gets data ‘misclassified’ by a neural network. The article also shows examples of such faulty inputs, and they are surprisingly indistinguishable from the original data!
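
One classic way to craft such noise (not necessarily the exact method in the article) is the Fast Gradient Sign Method: compute the gradient of the loss with respect to the input, rather than the weights, and nudge every pixel a tiny step in the direction that increases the loss. A sketch in TensorFlow – model is assumed to be any trained Keras image classifier, and the epsilon value is illustrative:

# Sketch of the Fast Gradient Sign Method (FGSM): perturb the input in the
# direction that increases the model's loss, producing an adversarial example.
import tensorflow as tf

def fgsm(model, image, true_label, epsilon=0.01):
    image = tf.convert_to_tensor(image)
    with tf.GradientTape() as tape:
        tape.watch(image)
        prediction = model(image)
        loss = tf.keras.losses.sparse_categorical_crossentropy(true_label, prediction)
    gradient = tape.gradient(loss, image)           # gradient w.r.t. the *input*
    adversarial = image + epsilon * tf.sign(gradient)
    return tf.clip_by_value(adversarial, 0.0, 1.0)  # keep pixels in a valid range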


Weekly Review: 11/11/2017

The Motion Planning course is going faster than I expected. I completed 2 weeks within 5 days. That’s good, I guess, since it means I might get to the Capstone project before I take a vacation to India.

Here’s the stuff from this week:

Graphcore and the Intelligent Processing Unit (IPU)

Graphcore aims to disrupt the world of ML-focused computing devices. In an interesting blog post, they visualize neuron connections in different CNN architectures, and talk about how they compare to the human brain.

If you are curious about how IPUs differ from CPUs and GPUs, this NextPlatform article gives a few hints: mind you, IPUs are yet to be ‘released’, so there’s no concrete information out yet. If you want to brush up on why memory is so important for neural network training (more than for inference), this is a good place to start.

Overview of Different CNN architectures

This article on the CV-Tricks blog gives a high-level overview of the major CNN architectures so far: AlexNet, VGG, Inception, ResNets, etc. It’s a good reference to go back to if you ever forget what one of them did differently.

On that note, this blog post by Adit Deshpande goes into the ‘Brief History of Deep Learning’, marking out all the main research papers of importance.

Meta-learning and AutoML

The New York Times posted an article about AI systems that can build other AI systems, thus leading to what they call ‘Meta-learning’ (Learning how to learn/build systems that learn).

Google has been dabbling in meta-learning with a project called AutoML. AutoML basically consists of a ‘Generator’ network that comes up with various NN architectures, which are then evaluated by a ‘Scorer’ that trains them and computes their accuracy. The gradients with respect to these scores are passed back to the Generator, in order to improve the output architectures. This is their original paper, in case you want to take a look.

The AutoML team recently wrote another post about large-scale object detection using their algorithms.

Tangent

People from Google recently open-sourced their library for computing gradients of Python functions. Tangent works directly on your Python source code (rather than viewing it as a black box), and comes up with a derivative function to compute its gradient. This is useful in cases where you might want to debug how/why some NN architecture is not getting trained the way it’s supposed to. Here’s their GitHub repo.

Reconstructing films with Neural Networks

This blog post talks about the use of Autoencoders and GANs to reconstruct films using NNs trained on those same films. They also venture into reconstructing films using NNs trained on other, stylized films (like A Scanner Darkly). The results are pretty interesting.

Weekly Review: 11/04/2017

A busy week. I finished my Aerial Robotics course! The next one in the Specialization is Computational Motion Planning, which I am more excited about – mainly because the curriculum is closer to my areas of expertise. Aerial Robotics was challenging primarily because it involved a lot of physics/calculus that I had not attempted in a long time.

Onto the articles for this week:

Colab is now public!

Google made Colaboratory, a previously internal tool, public. ‘Colab’ is a document-collaboration tool, with the added benefit of being able to run script-sized pieces of code. This is especially useful if you want to prototype small proofs-of-concept, which can then be shared with documentation and demo-able output. I had previously used it within Google to tinker with TensorFlow, and to write small scripts for database queries.

Visual Guide to Evolution Strategies

The above link is a great introduction to Evolution Strategies such as GAs and CMA-ES. It shows a visual representation of how each of these algorithms converges on the optima, from the first iteration to the last, on simple problems. It’s pretty interesting to see how each algorithm ‘broadens’ or ‘focuses’ the domain of its candidate solutions as the iterations go by.
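
In the spirit of the simple algorithms the guide visualizes, here is a toy (mu, lambda)-style evolution strategy on a 2-D quadratic bowl (my own sketch, not code from the article):

# Toy evolution strategy: sample a population around the current mean, keep the
# best candidates, and re-fit the mean and spread from them each generation.
import numpy as np

def objective(p):                        # maximum at (3, -2)
    return -((p[0] - 3) ** 2 + (p[1] + 2) ** 2)

mean, sigma = np.zeros(2), 2.0
for generation in range(50):
    population = mean + sigma * np.random.randn(64, 2)   # sample candidates
    scores = np.array([objective(p) for p in population])
    elite = population[np.argsort(scores)[-8:]]           # keep the best 8
    mean = elite.mean(axis=0)                              # re-centre
    sigma = elite.std(axis=0).mean() + 1e-3                # shrink/expand spread
print(mean)   # should approach [3, -2]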

Baidu’s Deep Voice

In a 2-part series (Part 1 & Part 2), the author discusses the architecture of Baidu’s Text-to-Speech system (Deep Voice). Take a look if you have never read about/worked on such systems and want to have a general idea of how they are trained and deployed.

Capsule Networks

Geoff Hinton and his team at Google recently discussed the idea of Capsule Networks, which try to remedy the rigidity of the usual CNNs – by defining groups of specialized neurons called ‘capsules’, whose contribution to higher-level neurons is decided by the agreement between their outputs. Here’s a small intro to Capsule Networks, or the original paper if you want to delve deeper.
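
One concrete piece of the paper that fits in a few lines is its ‘squash’ non-linearity, which shrinks a capsule’s output vector to a length between 0 and 1 while preserving its direction (the length is then read as the probability that the entity the capsule represents is present):

# The "squash" non-linearity from the Capsule Networks paper: scales a capsule's
# output vector to have length in (0, 1) while keeping its direction.
import numpy as np

def squash(s, eps=1e-8):
    norm = np.linalg.norm(s)
    return (norm ** 2 / (1.0 + norm ** 2)) * (s / (norm + eps))

print(np.linalg.norm(squash(np.array([0.1, 0.2]))))    # short vector -> near 0
print(np.linalg.norm(squash(np.array([10.0, 20.0]))))  # long vector  -> near 1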

Nexar Challenge Results

Nexar released the results of its Deep-Learning challenge on image segmentation – the problem of ‘boxing’ and ‘tagging’ objects in pictures with multiple entities present. This is especially useful in their own AI dashcam apps, which need to be quite accurate to prevent possible collisions in deployment.

As further reading, you could also check out this article on the history of CNNs in Image Segmentation, another one on Region-of-Interest Pooling in CNNs, and Deformable Neural Networks. (All of these concepts are mentioned in the main Nexar article)

Weekly Review: 10/28/2017

This was a pretty busy week with a lot going on, but I finally seem to be settling into my new role!

The study for Aerial Robotics is almost over, with a week to go. There hasn’t been much coding in this course, but that was to be expected, since it was more about PID control theory and quadrotor dynamics. I am particularly interested in the Capstone/’final’ project for this course, which would involve building an autonomous robot on a Raspberry Pi.

Anyway, on to the interesting tidbits from this week:

AlphaGo Zero

Google’s DeepMind recently announced a new version of their AI-based Go player, AlphaGo Zero. What makes this one so special is that it breaks the common notion of intelligent systems requiring a LOT of data to produce decent results. AlphaGo Zero was only provided the basic rules of Go, and it performed the rest of the learning all by playing against itself. Oh, and BTW, AlphaGo Zero beats AlphaGo, the previous champion of the game. This is indeed a landmark in demonstrating the power of good old RL.

Read this article for a basic overview, and their paper in Nature for a detailed explanation. Brushing up on Monte Carlo Tree Search would certainly help.
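
If you do brush up on MCTS, the piece specific to AlphaGo Zero is the PUCT selection rule used inside the tree search: each candidate move is scored by its value estimate plus an exploration bonus weighted by the policy network’s prior. A small sketch with made-up numbers:

# The PUCT rule AlphaGo Zero uses to pick which branch of the search tree to
# expand next: value estimate + exploration bonus driven by the network prior.
import math

def puct_score(q, prior, child_visits, parent_visits, c_puct=1.5):
    exploration = c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)
    return q + exploration

# Each child move: (value estimate, network prior, visit count) -- made-up values.
children = {"e4": (0.31, 0.40, 120), "d4": (0.30, 0.35, 90), "c4": (0.05, 0.25, 10)}
parent_visits = sum(v for _, _, v in children.values())
best = max(children, key=lambda m: puct_score(*children[m],
                                              parent_visits=parent_visits))
print(best)   # the rarely-visited move wins on its exploration bonus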

Word Mover’s Distance

Given an excellent embedding of words such as Word2Vec, it is not very difficult to compute the semantic distance between individual terms. However, when it comes to big blocks of text, a simple ‘average’ over term-embeddings isn’t good enough for computing their relative distances.

In such cases, the Word Mover’s Distance, inspired by the Earth Mover’s Distance, provides a better solution. It figures out the semantically closest term(s) from one document for each term in the other, and then computes the minimum total effort required to ‘rephrase’ one text in the words of the other. Click on the article link for a detailed explanation.
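
If you want to play with it, gensim exposes WMD directly on its word-vector models. A sketch – the path to the pre-trained word2vec file is a placeholder, and the two sentences are the classic example from the WMD paper:

# Sketch: Word Mover's Distance between two short documents using gensim.
# Assumes a pre-trained word2vec file is available locally (path is a placeholder).
from gensim.models import KeyedVectors

vectors = KeyedVectors.load_word2vec_format("GoogleNews-vectors-negative300.bin",
                                            binary=True)
doc1 = "obama speaks to the media in illinois".split()
doc2 = "the president greets the press in chicago".split()
print(vectors.wmdistance(doc1, doc2))   # smaller = semantically closer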

Robots generalizing from simulations

OpenAI posted a blog article about how they trained a robot entirely in simulation. This means that the robot received no data from real-world sensors during the training phase, but was able to perform basic tasks in deployment after some calibration.

During the simulations, they used dynamics randomization to alter basic traits of the environment. This data was then fed to an LSTM to understand the settings and goals. A key insight from this work is Hindsight Experience Replay. Quoting the article: “Hindsight Experience Replay (HER) allows agents to learn from a binary reward by pretending that a failure was what they wanted to do all along and learning from it accordingly. (By analogy, imagine looking for a gas station but ending up at a pizza shop. You still don’t know where to get gas, but you’ve now learned where to get pizza.)”

Concurrency in Go

If you are a Go Programmer, take a look at this old (but good) talk on concurrency patterns and constructs in the language.

Generalization Bounds in Machine Learning

The Generalization Gap for an ML system is defined as the difference between the training error and the generalization error. The Generalization Bound tries to put a bound on this value, based on probability theory. Read this article for a detailed mathematical explanation.
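
As one classic illustration (not from the article itself): for a finite hypothesis class, Hoeffding’s inequality plus a union bound says that, with probability at least 1 − delta, the gap is at most sqrt((ln|H| + ln(2/delta)) / 2N). In code:

# Illustration: the classic Hoeffding + union-bound result for a *finite*
# hypothesis class H. With probability >= 1 - delta over N i.i.d. samples,
#   generalization_error <= training_error + sqrt((ln|H| + ln(2/delta)) / (2N))
import math

def generalization_gap_bound(hypothesis_count, n_samples, delta=0.05):
    return math.sqrt((math.log(hypothesis_count) + math.log(2 / delta))
                     / (2 * n_samples))

print(generalization_gap_bound(hypothesis_count=10**6, n_samples=50_000))  # ~0.013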

Weekly Review: 10/21/2017

It’s been a long while since I last posted, but for good reason! I was busy shifting base from Google’s Hyderabad office to their new location in Sunnyvale. This is my first time in the USA, so there is a lot to take in and process!

Anyway, I am now working on Google’s Social-Search and Ranking team. At the same time, I am also doing Coursera’s Robotics Specialization to learn a subject I have never really touched upon. Be warned if you ever decide to give it a try: the very first course, titled Aerial Robotics, involves a lot of linear algebra and physics. Since I last did all this in my freshman year of college, I am just about getting the weeks done!

I already have my plate full with a lot of ToDos, but I also feel bad about not posting, so I found a middle ground: I will try, to the best of my ability, to post one article each weekend about all the random/new interesting articles I read over the course of the week. This is partly for my own reference later on, since I have found myself going back to my posts quite a few times to revisit a concept I wrote about. So here goes:

Eigenvectors & Eigenvalues

Anything ‘eigen’ has confused me for a while now, mainly because I never understood the intuition behind the concept. The highest-rated answer to this Math StackExchange question did the job: every square matrix represents a linear transformation. The corresponding eigenvectors roughly describe how the transformation orients the results (the directions of maximum change), while the corresponding eigenvalues describe the distortion caused along those directions.
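
A quick numerical way to see this intuition: applying the matrix to one of its eigenvectors only scales it, by the corresponding eigenvalue.

# For a square matrix A, each eigenvector v satisfies A v = lambda v:
# the transformation only stretches v, it does not rotate it.
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
eigenvalues, eigenvectors = np.linalg.eig(A)

for lam, v in zip(eigenvalues, eigenvectors.T):   # eigenvectors are the columns
    print(np.allclose(A @ v, lam * v))            # prints True for each pair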

Transfer Learning

Machine Learning currently specializes in utilizing data from a certain {Task, Domain} combo (e.g., Task: recognize dogs in photos; Domain: photos of dogs) to learn a function. However, when this same function/model is used on a different but related task (recognize foxes in photos) or a different domain (photos of dogs taken at night), it performs poorly. This article discusses Transfer Learning, a method to apply knowledge learned in one setting to problems in different ones.

Dynamic Filters

The filters used in Convolutional Neural Network layers usually have fixed weights at a certain layer, for a given feature map. This paper from the NIPS conference discusses the idea of layers that change their filter weights depending on the input. The intuition is this: even though a filter is trained to look for a specialized feature within a given image, the orientation/shape/size of that feature might change with the image itself. This is especially true while analysing data such as moving objects within videos. A dynamic filter will then be able to adapt to the incoming data, and efficiently recognise the intended features in spite of distortions.


The magic behind Attribute Access in Python

Most people know just one thing when it comes to attribute access – the dot ‘.’ (as in x.some_attribute). In simple terms, attribute access is the way you retrieve an object linked to the one you already have. To someone who uses Python without delving too much into the details, it may seem pretty straightforward. However, under the hood, there’s a lot that goes on to accomplish this seemingly trivial task.

Let’s look at each of the components one by one.

The __dict__ attribute

Every object in Python has an attribute denoted by __dict__. This dictionary (or dictionary-like – I will explain this shortly) object contains all the attributes defined on the object itself. It maps each attribute name to its value.

Here’s an example:


>>> class C(object):
	x = 4

>>> c = C()
>>> c.y = 5
>>> c.__dict__
{'y': 5}

Notice how 'x' is not in c.__dict__. The reason for this is simple enough. While y was defined for the object c, x was defined for its class (C). Therefore, it will appear in the __dict__ of C. In fact, C‘s __dict__ contains a lot of other keys too (including '__dict__'):


>>> c.__class__.__dict__['x']
4
>>> c.__class__.__dict__
dict_proxy({'__dict__': <attribute '__dict__' of 'C' objects>, 'x': 4, 
'__module__': '__main__', '__weakref__': <attribute '__weakref__' of 'C' objects>, 
'__doc__': None})

We will look at what dictproxy means soon.

The __dict__ of an object is simple enough to understand. It behaves like a Python dict, and is one too.


>>> c.__dict__
{'y': 5}
>>> c.__dict__.__class__
<type 'dict'>
>>> c.__dict__ = {}
>>> c.y

Traceback (most recent call last):
  File "<pyshell#81>", line 1, in <module>
    c.y
AttributeError: 'C' object has no attribute 'y'
>>> c.__dict__['y'] = 5
>>> c.y
5

The __dict__ of a class, however, is not that straightforward. It’s actually an object of a class called dictproxy. dictproxy is a special class whose objects behave like normal dicts, but differ in some key behaviours.


>>> C.__dict__
dict_proxy({'__dict__': <attribute '__dict__' of 'C' objects>, 'x': 4, '__module__': '__main__', '__weakref__': <attribute '__weakref__' of 'C' objects>, '__doc__': None})
>>> C.__dict__.__class__
<type 'dictproxy'>
>>> C.__dict__['x']
4
>>> C.__dict__['x'] = 6

Traceback (most recent call last):
  File "<pyshell#87>", line 1, in <module>
    C.__dict__['x'] = 6
TypeError: 'dictproxy' object does not support item assignment
>>> C.x = 6
>>> C.__dict__ = {}

Traceback (most recent call last):
  File "<pyshell#89>", line 1, in <module>
    C.__dict__ = {}
AttributeError: attribute '__dict__' of 'type' objects is not writable

Notice how you cannot set a key in a dictproxy directly (C.__dict__['x'] = 6 does not work). You can accomplish the same using C.x = 6 however, since the internal behaviour is then different. Also notice how you cannot set the __dict__ attribute itself either (C.__dict__ = {} does not work).

There’s a reason behind this weird implementation. If you don’t want to get into the details, just know that it’s there to keep the Python interpreter working properly, and to enforce some optimizations. If you want a more detailed explanation, have a look at Scott H’s answer to this StackOverflow question.

Descriptors

A descriptor is an object that has at least one of the following magic methods among its attributes: __get__, __set__ or __delete__ (remember, methods are ultimately objects in Python). Mind you, it’s the object we are talking about – its class may or may not have implemented them.

Descriptors can help you define the behaviour of an object’s attribute in Python. With each of the magic methods just mentioned, you implement how the attribute (‘described’ by the descriptor) will be retrieved, set and deleted in the object respectively. There are two types of descriptors – Data Descriptors, and Non-Data Descriptors.

Non-Data Descriptors only have __get__ defined. All others are Data Descriptors. You might naturally wonder why the two types are named this way. The answer is intuitive. Usually, it’s data-related attributes that we tend to ‘set’ or ‘delete’ on an object. Other attributes, like methods, we don’t. So their descriptors are called Non-Data Descriptors. As with a lot of other things in Python, this is not a hard-and-fast rule, but a convention. You could just as well describe a method with a Data Descriptor. But then its __get__ should return a function.

Here’s an example of two classes whose instances act as data and non-data descriptor objects respectively:


class DataDesc(object):
    def __init__(self, name):
        self._name = name

    def __get__(self, obj, objclass):
        try:
            print("Retrieving attr " + self._name + " from " +
                  str(obj) + "...")
            return objclass.x + " + " + obj.y
        except:
            raise AttributeError("Attr " + self._name + " could not be " +
                                 "retrieved from " + str(obj))
    
    def __set__(self, obj, value):
        raise AttributeError("Attr " + self._name + " cannot be " +
                             "set in " + str(obj))

    def __delete__(self, obj):
        raise AttributeError("Attr " + self._name + " cannot be " +
                             "deleted in " + str(obj))

class NonDataDesc(object):
    def __init__(self, name):
        self._name = name

    def __get__(self, obj, objclass):
        try:
            print("Retrieving attr " + self._name + " from " +
                  str(obj) + "...")
            return objclass.x + " + " + obj.y
        except:
            raise AttributeError("Attr " + self._name + " could not be " +
                                 "retrieved from " + str(obj))

Notice how the __get__ function takes in an object obj and (its) class objclass. Similarly, setting the value requires obj and some candidate value. Deletion just needs obj. Taking these parameters in (along with the initializer __init__) helps you differentiate between objects of the same descriptor class. Mind you, it’s the objects that are intended to be the descriptors.
(P.S. If you don’t define the __get__ method for a descriptor, the descriptor object itself will get returned).

Let’s use these classes in some code.


class ParentClass(object):
    x = "x1"
    y = "y1"
    data_attr_parent = DataDesc("desc1")
    data_attr_child = DataDesc("desc2")

class ChildClass(ParentClass):
    x = "x2"
    y = "y2"
    data_attr_child = DataDesc("desc3")
    non_data_attr_child = NonDataDesc("desc4")

some_object = ChildClass()

That’s it! You can access the ‘described’ attributes as usual in Python.


>>> some_object.data_attr_parent
Retrieving attr desc1 from <__main__.ChildClass object at 0x1062c5790>...
'x2 + y2'

Descriptors are used for a lot of attribute and method related functionality in Python, including static methods, class methods and properties. Using descriptors, you can gain better control over how attributes and methods of a class/its objects are accessed – including defining some ‘behind the scenes’ functionality like logging.

Now let’s look at the high-level rules governing attribute access in Python.

The Rules

Quoting Shalabh Chaturvedi’s book verbatim, the workflow is as follows:

  1. If attrname is a special (i.e. Python-provided) attribute for objectname, return it.
  2. Check objectname.__class__.__dict__ for attrname. If it exists and is a data-descriptor, return the descriptor result. Search all bases of objectname.__class__ for the same case.
  3. Check objectname.__dict__ for attrname, and return if found. If objectname is a class, search its bases too. If it is a class and a descriptor exists in it or its bases, return the descriptor result.
  4. Check objectname.__class__.__dict__ for attrname. If it exists and is a non-data descriptor, return the descriptor result. If it exists, and is not a descriptor, just return it. If it exists and is a data descriptor, we shouldn’t be here, because we would have returned at point 2. Search all bases of objectname.__class__ for the same case.
  5. Raise AttributeError


To make things clearer, here’s some tinkering using the code we wrote in the Descriptors section (have a look at it again just to be clear about things):

data_attr_child is a data descriptor in some_object‘s class, so you can’t write over it through the instance. Also, the version in ChildClass (‘desc3’) is used, not the one in ParentClass.


>>> some_object.data_attr_child
Retrieving attr desc3 from <__main__.ChildClass object at 0x1110c9790>...
'x2 + y2'
>>> some_object.data_attr_child = 'xyz'

Traceback (most recent call last):
  File "<pyshell#112>", line 1, in <module>
    some_object.data_attr_child = 'xyz'
  File "/Users/srjoglekar/metaclasses.py", line 16, in __set__
    "set in " + str(obj))
AttributeError: Attr desc3 cannot be set in <__main__.ChildClass object at 0x10883f790>

In fact, even if you make an appropriate entry in some_object‘s __dict__, it still won’t matter (as per Rule 2, the data descriptor on the class wins).


>>> some_object.__dict__['data_attr_child'] = 'xyz'
>>> some_object.data_attr_child
Retrieving attr desc3 from <__main__.ChildClass object at 0x10883f790>...
'x2 + y2'

The Non-Data Descriptor attribute, on the other hand, can be easily overwritten.


>>> some_object.non_data_attr_child
Retrieving attr desc4 from <__main__.ChildClass object at 0x10883f790>...
'x2 + y2'
>>> some_object.non_data_attr_child = 'xyz'
>>> some_object.non_data_attr_child
'xyz'
>>> some_object.__dict__
{'data_attr_child': 'xyz', 'non_data_attr_child': 'xyz'}

You can, however, change the behaviour of data_attr_child if you replace the descriptor on some_object‘s class itself (through normal attribute assignment on the class, since the class’s dictproxy cannot be written to directly).


>>> some_object.__class__.data_attr_child = 'abc'
>>> some_object.data_attr_child
'xyz'

Notice how the moment you replace the Data-Descriptor in the class with some non-data descriptor (or some object like a String in this case), the entry that we initially made in some_object‘s __dict__ comes into play. Therefore, some_object.data_attr_child returns 'xyz', not 'abc'.

The data_attr_parent attribute behaves similar to data_attr_child.


>>> some_object.data_attr_parent
Retrieving attr desc1 from <__main__.ChildClass object at 0x10883f790>...
'x2 + y2'
>>> some_object.data_attr_parent = 'xyz'

Traceback (most recent call last):
  File "<pyshell#127>", line 1, in <module>
    some_object.data_attr_parent = 'xyz'
  File "/Users/srjoglekar/metaclasses.py", line 16, in __set__
    "set in " + str(obj))
AttributeError: Attr desc1 cannot be set in <__main__.ChildClass object at 0x10883f790>
>>> some_object.__class__.data_attr_parent = 'xyz'
>>> some_object.__class__.data_attr_parent
'xyz'

Notice how you can ‘write over’ data_attr_parent in ChildClass itself – the data descriptor lives on ParentClass, so the assignment on ChildClass simply shadows it. Once you do that, a lookup goes through Rules 1-2-3 and stops at 4, giving the result 'xyz'.

Rules for Setting Attributes

Way simpler than the rules for ‘getting’ them. Quoting Shalabh’s book again,

  1. Check objectname.__class__.__dict__ for attrname. If it exists and is a data-descriptor, use the descriptor to set the value. Search all bases of objectname.__class__ for the same case.
  2. Insert something into objectname.__dict__ for key "attrname".

That’s it! :-).

__slots__

To put it concisely, __slots__ is a way to disallow objects from having their own __dict__ in Python. This means that if you define __slots__ in a class, you cannot set arbitrary attributes (apart from the ones mentioned in the ‘slots’) on its objects.

Here’s an example of such a class:


class SomeClass(object):
    __slots__ = ['x', 'y']

obj = SomeClass()

Now see how this behaves:


>>> obj.x = 4
>>> obj.y = 5
>>> obj.x
4
>>> obj.y
5
>>> obj.z = 6

Traceback (most recent call last):
  File "<pyshell#135>", line 1, in <module>
    obj.z = 6
AttributeError: 'SomeClass' object has no attribute 'z'

You can of course do this:


>>> obj.__class__.z = 6
>>> obj.z
6

But then, remember you have now defined z in SomeClass‘s __dict__, not in obj‘s.

As Guido van Rossum himself mentions in his blog post, __slots__ was implemented in Python to introduce efficiency, not ‘stricter’ attribute-setting. The basic intuition is this: suppose you have a class whose objects you intend to construct in large numbers. You don’t really need the flexibility of ‘dynamic’ attributes on the objects themselves, but you do want efficiency. Since __slots__ essentially eliminates the __dict__ attribute on each of the objects, you get a lot of memory savings this way.

Interestingly, slots are implemented using descriptors in Python.


Further Reading

Have a look at this book I have already quoted in the post. It goes into a lot of detail regarding attribute access in Python, including method resolutions.

That’s all for now. Cheers!

Joined Google :-D

Hello people! I haven’t really blogged in quite some time, and I kind of feel guilty about it :-). Truth is, I have been busy starting a new job/life at Google, as a Software Engineer at the Hyderabad office. I joined the Google Apps for Work team, and I work on the analytics part of things. I was pretty swamped with setting things up, getting the formalities done, finding a place to live, blah blah – basically learning to be an adult! Things have finally settled down now, and I have (I think) found my groove when it comes to my overall routine – so I will probably start blogging as usual in the coming weeks.

To answer the obvious question: life at Google is …well… pretty awesome. Let me write out some points about my experience at the company that’s supposed to be one of the best employers in the world. And as per my usual style, they will be bullet points, since putting down a coherent, easy-flowing train of thought is just beyond my abilities as a writer.

I. The cool work culture!! I don’t know how many other companies do this (since this is my first real job), but it’s definitely not all of them. Google doesn’t just talk about an open workspace, it actually follows through on it. My team’s lead engineer sits at a desk that’s exactly like the one I use (except for a lot of swag he has that I don’t), and I can just drop by if I have any doubts/issues regarding pretty much anything (even non-work-related). If my mentor needs to talk to me, he will just come by, sit on a bean-bag lying around my desk, and discuss things like a friend at college. You literally won’t be able to tell the tech levels of people at Google, and that’s something I feel is really nice about the office culture.

II. Getting intimidated by the people and the technology. This is a big one. I don’t think I have ever felt so… small… in front of the people around me, at any time before in my life. To put it honestly, I felt like an idiot around my team for the first week or two. Not because they acted in any such way – in fact I had to stalk them online to learn more about them – but because all of them are basically smartasses. And the other aspect is, obviously, that you get overwhelmed by the internal infrastructure at a company like Google. It’s just so vast, and there are so many parts and bits and pieces working together, that it’s difficult to wrap your head around all of them at first. It’s all humbling.

III. The perks!! I don’t really think I need to explain this one (since it’s covered extensively in articles online). We at Hyderabad don’t have the perks that Mountain View does, but it’s still pretty darn amazing. Free transport, bunker rooms, free food (oh, the damn awesome food :-D), microkitchens, games rooms on all floors, an amazing gym, a massage centre, free internet at home, matching your charity (if you do any), a Techstop for chargers and stuff… the list goes on. Who knows, once we have the campus, the list might expand even further!

Trust me, I know I sound like a freaking fanboy throughout the post. But well… I am barely one year out of my college term, and this is pretty much paradise for a luxury-loving dork like me. I hope to have a great life here at Google, and justify the whole process by making as much of an impact as I can. Cheers!