Thursday, December 9, 2021

Python Bytes: #262 So many bots up in your documentation

<p><strong>Watch the live stream:</strong></p> <a href='https://www.youtube.com/watch?v=_CVmzukm050' style='font-weight: bold;'>Watch on YouTube</a><br> <br> <p><strong>About the show</strong></p> <p>Sponsored by <strong>us:</strong></p> <ul> <li>Check out the <a href="https://training.talkpython.fm/courses/all"><strong>courses over at Talk Python</strong></a></li> <li>And <a href="https://pythontest.com/pytest-book/"><strong>Brian’s book too</strong></a>!</li> </ul> <p>Special guest: <a href="https://twitter.com/leahecole"><strong>Leah Cole</strong></a></p> <p><strong>Brian #1:</strong> <a href="https://docs.pytest.org/en/7.0.x/announce/release-7.0.0rc1.html"><strong>pytest 7.0.0rc1</strong></a></p> <ul> <li>Question: Does <a href="https://pragprog.com/titles/bopytest2/python-testing-with-pytest-second-edition/">the new pytest book</a> work with pytest 7? <ul> <li>Answer: Yes! I’ve been working with pytest 7 during final review of all code, and many pytest core developers have been technical reviewers of the book.</li> <li>A few changes in pytest 7 are also the result of me writing the 2nd edition and suggesting (and in one case implementing) improvements. </li> </ul></li> <li><a href="https://twitter.com/the_compiler/status/1468144014889209858?s=20">Florian Bruhin’s announcement on Twitter</a> <ul> <li>“I'm happy to announce that I just released <a href="https://twitter.com/hashtag/pytest?src=hashtag_click">#pytest</a> 7.0.0rc1! After many tricky deprecations, some internal changes, and months of delay due to various issues, it looks like we could finally get a new non-bugfix release this year! (6.2.0 was released in December 2020).”</li> <li>“We invite everyone to test the #pytest prerelease and report any issues - there is <em>a lot</em> that happened, and chances are we broke something we didn't find yet (we broke a lot of stuff we already fixed 😅).
See the release announcement for details: https://docs.pytest.org/en/7.0.x/announce/release-7.0.0rc1.html”</li> </ul></li> <li>Try it out with <code>pip install pytest==7.0.0rc1</code></li> <li>For those of you following along at home (we covered <code>pip index</code> briefly in <a href="https://pythonbytes.fm/episodes/show/259">episode 259</a>) <ul> <li>to see rc releases with <code>pip index versions</code>, add <code>--pre</code></li> <li>ex: <code>pip index versions --pre pytest</code> will include <code>Available versions: 7.0.0rc1, 6.2.5, 6.2.4,</code> and let you know if there’s a newer rc available.</li> </ul></li> <li>Highlights from the <a href="https://docs.pytest.org/en/7.0.x/changelog.html">7.0.0rc1 changelog</a> <ul> <li><code>pytest.approx()</code> now works on <code>Decimal</code> within mappings/dicts and sequences/lists.</li> <li>Improvements to <code>approx()</code> with sequences of numbers. Example:</li> </ul></li> </ul>
<pre><code>&gt;       assert [1, 2, 3, 4] == pytest.approx([1, 3, 3, 5])
E       assert comparison failed for 2 values:
E         Index | Obtained | Expected
E         1     | 2        | 3 +- 3.0e-06
E         3     | 4        | 5 +- 5.0e-06
</code></pre>
<ul> <li>pytest invocations with <code>--fixtures-per-test</code> and <code>--fixtures</code> have been enriched with: <ul> <li>Fixture location path printed with the fixture name.</li> <li>First section of the fixture’s docstring printed under the fixture name.</li> <li>Whole of fixture’s docstring printed under the fixture name using <code>--verbose</code> option.</li> <li><em>Never again wonder where a fixture’s definition is</em></li> </ul></li> <li><code>RunResult</code> method <code>assert_outcomes</code> now accepts <code>warnings</code> and <code>deselected</code> arguments to assert the total number of warnings captured and the total number of deselected tests. <em>Helpful for plugin testing.</em></li> <li>Added <code>pythonpath</code> setting that adds listed paths to <code>sys.path</code> for the duration of the test session. 
<em>Nice for using pytest for applications, and for including test helper libraries.</em></li> <li>Improved documentation, including <ul> <li>an <a href="https://docs.pytest.org/en/7.0.x/reference/plugin_list.html#plugin-list">auto-generated list of plugins</a>. There were 963 this morning.</li> </ul></li> </ul> <p><strong>Michael #2:</strong> <a href="https://twitter.com/davidouglasmit/status/1467522704542683139"><strong>PandasTutor</strong></a></p> <ul> <li>via <a href="https://twitter.com/davidouglasmit"><strong>David Smit</strong></a></li> <li><strong>Why use this tool?</strong> Let's say you're trying to explain what this <code>pandas</code> code does:</li> </ul>
<pre><code>(dogs[dogs['size'] == 'medium']
 .sort_values('type')
 .groupby('type').median()
)
</code></pre>
<ul> <li>But this doesn't tell you what's going on behind the scenes.</li> <li><strong>What did this code just do?</strong> This single code expression has 4 steps (filtering, sorting, grouping, and aggregating), but only the final output is shown.</li> <li><strong>Where were the medium-sized dogs?</strong> This code filters for dogs with size <code>"medium"</code>, but none of those dogs appear in the original table display (on the left) because they were buried in the middle rows.</li> <li><strong>How were the rows grouped?</strong> The output doesn't show which rows were grouped and aggregated together. 
(Note that printing a <code>pandas.GroupBy</code> object won't display this information either.)</li> <li>If you <a href="https://pandastutor.com/vis.html#trace=example-code/py_dogs.json"><strong>run this same code</strong></a> in Pandas Tutor, you can teach students exactly what's going on, step by step</li> </ul> <p><strong>Leah #3: Apache Airflow</strong></p> <ul> <li>Workflow orchestration tool that originated at Airbnb and is now part of the Apache Software Foundation</li> <li>Author workflows as directed acyclic graphs (DAGs) of tasks</li> <li>Airflow works best with workflows that are mostly static and slowly changing. When DAG structure is similar from one run to the next, it allows for clarity around unit of work and continuity.</li> <li>A typical data analytics workflow is the Extract, Transform, Load (ETL) workflow - I have data somewhere that I need to get (extract), I do something to it (transform), and I put that result somewhere else (load)</li> <li>Airflow has "Operators" and connectors which enable you to perform common tasks in popular libraries and cloud providers</li> <li>Let's talk about a sample - I work on GCP so my sample will be GCP based because that's what I use most. One common workflow I see is running Spark jobs in ephemeral Dataproc clusters. I'm actually writing a tutorial demonstrating this now - literally in progress in another tab <ul> <li>BigQuery -> Create Dataproc cluster -> Run PySpark Dataproc job -> Store results in GCS -> delete Dataproc cluster</li> </ul></li> <li>Airflow has a really wonderful, active community. Please join us. </li> </ul> <p><strong>Brian #4:</strong> <a href="https://docs.python.org/3/library/textwrap.html#textwrap.dedent"><strong>textwrap.dedent</strong></a></p> <ul> <li><a href="https://twitter.com/mrvkino/status/1468306098235006981?s=20">Suggested by Michel Rogers-Vallée</a></li> <li>Small utility but super useful. 
Also, built into the Python standard library.</li> <li>BTW, the <code>textwrap</code> module has other cool tools you probably didn’t know Python offered right out of the box. It’s worth <a href="https://docs.python.org/3/library/textwrap.html">reading the docs</a>.</li> <li><code>dedent</code> takes a multiline string (the ones with triple quotes).</li> <li>Removes the leading whitespace common to all lines.</li> <li>This allows you to have multi-line strings defined in functions without mucking up your indenting.</li> <li>Example from docs:</li> </ul>
<pre><code>def test():
    # end first line with \ to avoid the empty line!
    s = '''\
    hello
      world
    '''
    print(repr(s))          # prints '    hello\n      world\n    '
    print(repr(dedent(s)))  # prints 'hello\n  world\n'
</code></pre>
<p>Better example:</p>
<pre><code>from textwrap import dedent

def multiline_hello_world():
    print("hello")
    print("  world")

def test_multiline_hello_world(capsys):
    expected = dedent('''\
        hello
          world
        ''')
    multiline_hello_world()
    actual = capsys.readouterr().out
    assert actual == expected
</code></pre>
<p><strong>Michael #5:</strong> <a href="https://github.com/trailofbits/pip-audit"><strong>pip-audit</strong></a></p> <ul> <li>via Dan Bader (from Real Python)</li> <li>Audits Python environments and dependency trees for known vulnerabilities</li> <li>Do your dependencies contain security issues?</li> <li>What about their dependencies, the ones you forgot to list in your requirements files or pin? 
</li> <li>Just run pip-audit on your requirements file(s)</li> <li>Perfect candidate for pipx</li> </ul> <p><strong>Leah #6 - Using bots to manage samples</strong></p> <ul> <li>Another part of my job is working with other software engineers in GCP to oversee the maintenance of our Python samples</li> <li>We have thousands of samples in hundreds of repos that are part of GCP documentation</li> <li>To ensure consistency, and so that this wonderful group of DevRel Engineers has time to get their work done and also function as humans, we use a lot of automation</li> <li>Bots do things like keep our dependencies up to date, check for license headers, auto-assign PRs and issues to code-owners, sync repositories with a centralized config, and more</li> <li>The GCP DevRel GitHub automation team has an <a href="https://github.com/googleapis/repo-automation-bots">open source repo</a> with some of the bots they have developed that we use every day, and we use <a href="https://github.com/renovatebot/renovate">whitesource renovatebot to manage our dependencies</a> and keep them up to date</li> </ul> <p><strong>Extras</strong></p> <p>Michael:</p> <ul> <li>GitHub CMD/CTRL+K command palette</li> <li>Python 3.10.1 <a href="https://www.python.org/downloads/release/python-3101/"><strong>is out</strong></a></li> </ul> <p><strong>Joke:</strong></p> <ul> <li><a href="https://twitter.com/mkennedy/status/1466849030047035392"><strong>HTTP status code meanings</strong></a></li> <li><a href="https://http.cat/"><strong>http.cat</strong></a></li> </ul>
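<p>Going back to Brian's <code>textwrap</code> item: the notes mention that the module has other tools beyond <code>dedent</code>. Here is a minimal sketch of a few of them, using only the standard library. The function name <code>build_message</code> and the sample strings are made up for illustration and are not from the show.</p>

```python
from textwrap import dedent, fill, indent, shorten


def build_message():
    # The literal stays indented with the surrounding code; dedent strips
    # the leading whitespace that every line has in common.
    return dedent('''\
        hello
          world
        ''')


message = build_message()
print(repr(message))

# indent() adds a prefix to each non-blank line, handy for quoting text.
quoted = indent(message, "> ")
print(quoted)

# shorten() collapses whitespace and truncates to fit the given width.
summary = shorten("Hello   world, this is a long sentence", width=20)
print(summary)

# fill() wraps a paragraph and returns a single newline-joined string.
wrapped = fill("one two three four five six", width=12)
print(wrapped)
```

<p>Note that <code>dedent</code> only removes the indentation shared by all lines, so the extra two spaces in front of <code>world</code> survive.</p>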

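<p>Going back to Leah's Airflow item: the idea of a workflow as a directed acyclic graph of tasks can be sketched with the standard library's <code>graphlib</code> (Python 3.9+). This is not Airflow code and uses no Airflow API. The task names below are hypothetical labels for the BigQuery-to-Dataproc steps listed in the notes, and the sketch only shows how dependencies determine run order.</p>

```python
from graphlib import TopologicalSorter

# Each key is a task; its value is the set of tasks that must finish first.
# Task names are illustrative labels, not real Airflow operators.
pipeline = {
    "create_dataproc_cluster": {"read_from_bigquery"},
    "run_pyspark_job": {"create_dataproc_cluster"},
    "store_results_in_gcs": {"run_pyspark_job"},
    "delete_dataproc_cluster": {"store_results_in_gcs"},
}

# static_order() yields every task after all of its prerequisites.
run_order = list(TopologicalSorter(pipeline).static_order())

for step in run_order:
    print(step)
```

<p>A real orchestrator adds scheduling, retries, and operators that talk to the cloud services, but the dependency graph is the part that stays stable from run to run.</p>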
from Planet Python
