Circular References in Python

One of the more convenient aspects of writing code in interpreted languages such as Python or Ruby is that you can normally avoid dealing with memory management. However, one known case where Python will leak memory (in Python 2, and in Python 3 before version 3.4) is when your objects refer to each other in a cycle and one of their classes implements a custom __del__ destructor method. For instance, consider the following example:

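class A(object):
    def __init__(self, b_instance):
        # A keeps a reference back to the B instance that created it.
        self.b = b_instance

class B(object):
    def __init__(self):
        # B keeps a reference to A, completing the cycle B -> A -> B.
        self.a = A(self)
    def __del__(self):
        print "die"

def test():
    b = B()

test()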

When the function test() is invoked, it creates an instance of B, which passes itself to A, which in turn stores a reference to B, resulting in a circular reference. Normally Python's garbage collector, which exists to detect these cyclic references, would reclaim the objects. However, because of the custom destructor (the __del__ method), the collector marks the objects as "uncollectable". By design, it does not know a safe order in which to destroy them, so it leaves them alone (see Python's garbage collection documentation for more background). You can verify this by forcing the garbage collector to run and inspecting the contents of the gc.garbage list:

import gc
gc.collect()
print gc.garbage
[<__main__.B object at 0x7f59f57c98d0>]

You can also see these circular references visually by using the objgraph library, which relies on Python's gc module to inspect the references to your Python objects. Note that the objgraph library also deliberately draws objects with custom __del__ methods in red to highlight a possible issue.

You would just need to make a call to objgraph.show_backrefs() to show this issue:
def test():
    b = B()

    import objgraph
    objgraph.show_backrefs([b, b.a], refcounts=True)

Although this behavior is fairly well known among Python developers, the problem still shows up often. Circular references commonly appear in connection pooling, where a number of network connections are collected and reused by a process. In these cases, a connection pool object holds references to its connection objects, and each connection must be able to report status information back to the pool, so it holds a reference to the pool as well. Since a network connection must often be terminated gracefully by closing sockets and/or file descriptors, developers often implement their own __del__ destructor methods, which can create a memory leak.

To avoid circular references, you can use weak references, which tell the interpreter that an object's memory may be reclaimed once only weak references to it remain. Alternatively, you can use context managers and the with statement to perform cleanup explicitly instead of relying on __del__ (for an example of this latter approach, see how it was solved for the happybase library).
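Both approaches can be combined in a small sketch. The ConnectionPool and Connection classes below are hypothetical stand-ins, not code from happybase or any other library. Each connection holds only a weak reference to its pool, and the pool closes its connections when the with block exits, so no __del__ method is needed.

```python
import weakref


class Connection(object):
    """Hypothetical connection that reports status back to its pool."""

    def __init__(self, pool):
        # A weak reference does not keep the pool alive, so no strong
        # pool -> connection -> pool cycle is created.
        self._pool = weakref.ref(pool)
        self.closed = False

    def report(self, status):
        pool = self._pool()  # returns None once the pool has been reclaimed
        if pool is not None:
            pool.statuses.append(status)

    def close(self):
        # Stand-in for closing a socket or file descriptor.
        self.closed = True


class ConnectionPool(object):
    """Hypothetical pool that owns its connections."""

    def __init__(self, size):
        self.statuses = []
        self.connections = [Connection(self) for _ in range(size)]

    def close(self):
        for conn in self.connections:
            conn.close()

    # Context manager protocol: cleanup runs when the with block exits.
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()
        return False  # do not swallow exceptions


def demo():
    with ConnectionPool(size=2) as pool:
        conns = list(pool.connections)
        conns[0].report("ok")
        statuses = list(pool.statuses)
    pool_ref = weakref.ref(pool)
    del pool  # drop the last strong reference to the pool
    closed = [conn.closed for conn in conns]
    return statuses, closed, pool_ref() is None
```

On CPython, the last value returned by demo() indicates whether the pool was freed by reference counting alone, without waiting for the cycle detector. The trade-off is that a connection must handle the case where its pool no longer exists.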

Also, if you think you've not been guilty of introducing circular references, think again! If you've ever captured a stack trace within a function by assigning the result of sys.exc_info() (or its third value, the traceback) to a local variable, you have created a cycle: the traceback refers to the stack frame, and the stack frame refers to the local variable.

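import sys

def main():
    try:
        raise Exception('here')
    except:
        pass

    # The traceback in exc_info[2] refers to this frame, and this frame's
    # local variable exc_info refers back to the traceback.
    exc_info = sys.exc_info()
    import objgraph
    objgraph.show_backrefs([exc_info, main])

main()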

Again, this cycle can be shown visually:

In this specific case, the memory can normally be reclaimed, since cyclic references alone do not cause leaks. However, reclaiming it depends on the cycle detector rather than on reference counting, so it is best to break the cycle yourself. In this example, you should add a finally: clause and delete the local variable.

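import sys

def main():
    try:
        raise Exception('here')
    except:
        pass

    exc_info = sys.exc_info()
    try:
        pass  # work with exc_info here
    finally:
        # Deleting the local variable breaks the frame -> traceback -> frame cycle.
        del exc_info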
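The effect of deleting the local variable can be observed without objgraph. The following sketch is written for Python 3 on CPython, where sys.exc_info() has to be called inside the except block. Marker, keeps_cycle, breaks_cycle, and frame_survives are hypothetical names introduced for illustration. It relies on CPython freeing a frame's locals as soon as nothing refers to the frame any more.

```python
import gc
import sys
import weakref


class Marker(object):
    """Hypothetical local object used to observe whether a frame was freed."""


def keeps_cycle():
    marker = Marker()
    try:
        raise ValueError("here")
    except ValueError:
        # traceback -> frame -> local variable exc_info -> traceback
        exc_info = sys.exc_info()
    return weakref.ref(marker)


def breaks_cycle():
    marker = Marker()
    try:
        raise ValueError("here")
    except ValueError:
        exc_info = sys.exc_info()
        try:
            pass  # inspect or log exc_info here
        finally:
            del exc_info  # drop the frame's reference to the traceback
    return weakref.ref(marker)


def frame_survives(func):
    """Return True if func's locals are still alive right after it returns."""
    was_enabled = gc.isenabled()
    gc.disable()  # keep the cycle detector out of the measurement
    try:
        ref = func()
        return ref() is not None
    finally:
        gc.collect()  # clean up any cycle that was left behind
        if was_enabled:
            gc.enable()
```

Comparing frame_survives(keeps_cycle) with frame_survives(breaks_cycle) shows whether the frame's locals outlive the function call. In the first case they are only released once the cycle detector runs.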

Hearsay Social Hosts San Francisco PyData Meetup

On Thursday, May 2, 2013, Hearsay Social welcomed members of the San Francisco PyData meetup to our living room. Wes McKinney, the founder of the Python pandas library, spoke about the tool and how it makes data analysis easy.

Hearsay Social is a longtime supporter of the Python community and numerous open source libraries. Along with sponsoring the annual PyCon conference and offering space for meetups, Hearsay Social engineers are encouraged to contribute to the libraries and frameworks we use every day: Django, Celery, Chef, etc.

For his talk, Wes walked through the process of analyzing a GitHub repository using pandas. In real time he pulled data from the API, organized it using pandas, and began looking for answers to questions about who was contributing the most, how quickly bugs were being fixed, and how activity changed around major releases. Everything was done using an IPython notebook, which is available for download along with the slides here:

https://www.dropbox.com/sh/05d16q8zm5uozkl/dD2mHWqhhI

Hackbright Academy visits Hearsay Social

Last Friday, Hearsay Social hosted a panel discussion for students from Hackbright Academy. Hackbright Academy offers a 10-week Programming Fellowship in Silicon Valley that is designed to help women from all backgrounds become adept programmers. Hearsay Social's engineering team is lucky to have a number of stellar female engineers, and they were excited to share stories, advice, and thoughts on the industry.

"The first time I learned how to code is also the story of how I got asked to prom," recalled Bansi Shah, a Product Manager at Hearsay Social.

Ruchi Varshney has been a Generalist Engineer at Hearsay Social for over a year. She's worked on projects ranging from site internationalization to mobile development. She advised, "Find a company that allows you to work the whole stack. Keep on learning new skills."

Megan Anctil started her career at Hearsay Social as a Customer Support Associate. She learned to code on the job with help and encouragement from the engineering team. "I was motivated to fix issues rather than forward them to the engineering team. It gave me a great platform to learn." She then went on to show off the traditional Hearsay Social Engineering Barbie. She mentioned, "This is like a rite of passage for women who check in code at Hearsay Social."

Hearsay Social co-founder and CEO, Clara Shih, offered her perspective on finding the right company. "Seek out a growing industry, find the experts of that industry, and work with them to build a team you can learn from," she advised.

As the afternoon wrapped up, a Hackbright student remarked, "We've visited a lot of companies with brilliant female engineers and at Hearsay Social it really feels like you girls are good friends first."

Posted on April 24, 2013