Python deque is a double-ended queue. You can append to both ends and pop from both ends. The complexity of those operations amortizes to constant time.
We are going to look at the Python 3 internal implementation of deques. It uses a linked list of blocks of 64 pointers to objects. This reduces memory overhead since there are fewer previous and next links.
Let’s create an empty deque and see what happens.
>>> import collections >>> d = collections.deque() >>> d deque([])
A linked list with a single block of 64 pointers (set to null) is created. The right index is initialized to the center of the block (31) and is used to refer to the last object in the deque. The left index is initialized to the center of the block plus one (32) and is used to refer to the first object in the deque. As of now, we did not push anything to our deque so all the pointers are pointing to nothing.
The deque internal C structure also has a pointer to the left most block and a pointer to the right most block. Both are currently pointing to this single block above. There is also an optional max length attribute that can be used for bounded queues.
typedef struct { ... block *leftblock; block *rightblock; Py_ssize_t leftindex; Py_ssize_t rightindex; ... Py_ssize_t maxlen; ... } dequeobject;
A block has a left and right link. It also has an array of 64 pointers to objects.
typedef struct BLOCK { struct BLOCK *leftlink; PyObject *data[BLOCKLEN]; struct BLOCK *rightlink; } block;
Let’s append two objects to our deque. We are going to use strings here labelled e1, e2, … The append method really is append-right. There is also append-left.
>>> d.append('e1') >>> d.append('e2') >>> d deque(['e1', 'e2'])
The right index got incremented by two (33) and still points to the last object of the deque. The left index did not move and still points to the first object of the deque. The diagram below shows the string objects inside the deque for simplification, while they are in reality pointed to.
The following C function is called internally. It takes the deque object and the object to append. The size of the deque is increased by one. The size of the deque is what is returned when you use the Python len function.
static inline int deque_append_internal(dequeobject *deque, PyObject *item, ...) { ... Py_SET_SIZE(deque, Py_SIZE(deque) + 1); deque->rightindex++; deque->rightblock->data[deque->rightindex] = item; ... }
Let’s append strings e3 to e32
>>> for i in range(3, 33): d.append('e%d' % i) >>> d deque(['e1', 'e2', 'e3', 'e4', 'e5', 'e6', 'e7', 'e8', 'e9', 'e10', 'e11', 'e12', 'e13', 'e14', 'e15', 'e16', 'e17', 'e18', 'e19', 'e20', 'e21', 'e22', 'e23', 'e24', 'e25', 'e26', 'e27', 'e28', 'e29', 'e30', 'e31', 'e32'])
The right index is now pointing to the maximum index in our first block.
What happens if we append one more string?
>>> d.append('e33') >>> d deque(['e1', 'e2', 'e3', 'e4', 'e5', 'e6', 'e7', 'e8', 'e9', 'e10', 'e11', 'e12', 'e13', 'e14', 'e15', 'e16', 'e17', 'e18', 'e19', 'e20', 'e21', 'e22', 'e23', 'e24', 'e25', 'e26', 'e27', 'e28', 'e29', 'e30', 'e31', 'e32', 'e33'])
A new block is allocated. The deque right index now points to the first index of the new block. The left block pointer continues to point to the first block. The right block pointer now points to the new block.
The same internal function is called as above but an extra piece of code is executed to allocate a new block, update the block links and the deque right block pointer.
static inline int deque_append_internal(dequeobject *deque, PyObject *item, Py_ssize_t maxlen) { if (deque->rightindex == BLOCKLEN - 1) { block *b = newblock(); if (b == NULL) return -1; b->leftlink = deque->rightblock; deque->rightblock->rightlink = b; deque->rightblock = b; deque->rightindex = -1; } Py_SET_SIZE(deque, Py_SIZE(deque) + 1); deque->rightindex++; deque->rightblock->data[deque->rightindex] = item; ... }
Now that we looked at the method append, we can move to the method popleft. We can pop the first object in our deque:
>>> d.popleft() 'e1'
The left index points to the first object in our deque. It moved up by one and is now pointing to index 33. The string e1 has been returned.
We can see in the popleft internal function that the size of the deque is also decreased by one.
static PyObject * deque_popleft(dequeobject *deque, PyObject *unused) { PyObject *item; ... item = deque->leftblock->data[deque->leftindex]; deque->leftindex++; Py_SET_SIZE(deque, Py_SIZE(deque) - 1); ... return item; }
Let’s continue and pop all the objects in the deque left’s block. Strings e2 to e32 have been returned. The left index is not showed here and is now equal to 64.
The left index being equal to the block length (64), the left block can be deallocated to save some memory. The left index is set to zero. The deque left block and right block pointers are now pointing to the same block.
static PyObject * deque_popleft(dequeobject *deque, PyObject *unused) { ... if (deque->leftindex == BLOCKLEN) { if (Py_SIZE(deque)) { prevblock = deque->leftblock->rightlink; freeblock(deque->leftblock); deque->leftblock = prevblock; deque->leftindex = 0; } ... } ... }
We looked at what happens when we append to the end of the queue (append right) and when we pop from the beginning of the queue (pop left). The methods appendleft and pop (pop right) have similar internal mechanisms.
I hope you enjoyed the article. Don’t hesitate to comment.
from Planet Python
via read more
No comments:
Post a Comment