Monday, August 31, 2020

PSF GSoC students blogs: Outro

Note: This article's HTML source is exported from reST.  Without necessary CSS, some part might look hideous.  Please consider viewing on my personal blog.

The Look

At the time of writing, implementation-wise parallel download is ready:

Does this mean I’ve finished everything just-in-time? This sounds to good to be true! And how does it perform? Welp…

The Benchmark

Here comes the bad news: under a decent connection to the package index, using fast-deps does not make pip faster. For best comparison, I will time pip download on the following cases:

Average Distribution

For convenience purposes, let’s refer to the commands to be used as follows

legacy-resolver

pip --no-cache-dir download {requirement}

2020-resolver

pip --use-feature=2020-resolver --no-cache-dir download {requirement}

fast-deps

pip --use-feature=2020-resolver --use-feature=fast-deps --no-cache-dir download {requirement}

In the first test, I used axuy and obtained the following results

legacy-resolver

2020-resolver

fast-deps

7.709s

7.888s

10.993s

7.068s

7.127s

11.103s

8.556s

6.972s

10.496s

Funny enough, running pip download with fast-deps in a directory with downloaded files already took around 7-8 seconds. This is because to lazily download a wheel, pip has to make many requests which are apparently more expensive than actual data transmission on my network.

Note

With unstable connection to PyPI (for some reason I am not confident enough to state), this is what I got

2020-resolver

fast-deps

1m16.134s

0m54.894s

1m0.384s

0m40.753s

0m50.102s

0m41.988s

As the connection was unstable and that the majority of pip networking is performed as CI/CD with large and stable bandwidth, I am unsure what this result is supposed to tell (-;

Large Distribution

In this test, I used TensorFlow as the requirement and obtained the following figures:

legacy-resolver

2020-resolver

fast-deps

0m52.135s

0m58.809s

1m5.649s

0m50.641s

1m14.896s

1m28.168s

0m49.691s

1m5.633s

1m22.131s

Distribution with Conflicting Dependencies

Some requirement that will trigger a decent amount of backtracking by the current implementation of the new resolver oslo-utils==1.4.0:

2020-resolver

fast-deps

14.497s

24.010s

17.680s

28.884s

16.541s

26.333s

What Now?

I don’t know, to be honest. At this point I’m feeling I’ve failed my own (and that of other stakeholders of pip) expectation and wasted the time and effort of pip’s maintainers reviewing dozens of PRs I’ve made in the last three months.

On the bright side, this has been an opportunity for me to explore the codebase of package manager and discovered various edge cases where the new resolver has yet to cover (e.g. I’ve just noticed that pip download would save to-be-discarded distributions, I’ll file an issue on that soon). Plus I got to know many new and cool people and idea, which make me a more helpful individual to work on Python packaging in the future, I hope.



from Planet Python
via read more

No comments:

Post a Comment

TestDriven.io: Working with Static and Media Files in Django

This article looks at how to work with static and media files in a Django project, locally and in production. from Planet Python via read...