Monday, January 13, 2020

Ned Batchelder: Bug #915: solved!

Yesterday I pleaded, Bug #915: please help! It got posted to Hacker News, where Robert Xiao (nneonneo) did some impressive debugging and found the answer.

The user’s code used mocks to simulate an OSError when trying to make temporary files (source):

with patch('tempfile._TemporaryFileWrapper') as mock_ntf:
    mock_ntf.side_effect = OSError()

Inside tempfile.NamedTemporaryFile, the error handling misses the possibility that _TemporaryFileWrapper will fail (source):

(fd, name) = _mkstemp_inner(dir, prefix, suffix, flags, output_type)
try:
    file = _io.open(fd, mode, buffering=buffering,
                    newline=newline, encoding=encoding, errors=errors)

    return _TemporaryFileWrapper(file, name, delete)
except BaseException:
    _os.unlink(name)
    _os.close(fd)
    raise

If _TemporaryFileWrapper fails, the file descriptor fd is closed, but the file object referencing it still exists. Eventually, it will be garbage collected, and the file descriptor it references will be closed again.

But file descriptors are just small integers which will be reused. The failure in bug 915 is that the file descriptor did get reused, by SQLite. When the garbage collector eventually reclaimed the file object leaked by NamedTemporaryFile, it closed a file descriptor that SQLite was using. Boom.

There are two improvements to be made here. First, the user code should be mocking public functions, not internal details of the Python stdlib. In fact, the variable is already named mock_ntf as if it had been a mock of NamedTemporaryFile at some point.

NamedTemporaryFile would be a better mock because that is the function being used by the user’s code. Mocking _TemporaryFileWrapper is relying on an internal detail of the standard library.

The other improvement is to close the leak in NamedTemporaryFile. That request is now bpo39318. As it happens, the leak had also been reported as bpo21058 and bpo26385.

Lessons learned:

  • Hacker News can be helpful, in spite of the tangents about shell redirection, authorship attribution, and GitHub monoculture.
  • There are always people more skilled at debugging. I had no idea you could script gdb.
  • Error handling is hard to get right. Edge cases can be really subtle. Bugs can linger for years.

I named Robert Xiao at the top, but lots of people chipped in effort to help get to the bottom of this. ikanobori posted it to Hacker News in the first place. Chris Caron reported the original #915 and stuck with the process as it dragged on. Thanks everybody.



from Planet Python
via read more

No comments:

Post a Comment

TestDriven.io: Working with Static and Media Files in Django

This article looks at how to work with static and media files in a Django project, locally and in production. from Planet Python via read...