Wednesday, January 6, 2021

Python Bytes: #215 A Visual Introduction to NumPy

<p>Sponsored by us! Support our work through:</p> <ul> <li>Our <a href="https://training.talkpython.fm/"><strong>courses at Talk Python Training</strong></a></li> <li><a href="https://pragprog.com/titles/bopytest/python-testing-with-pytest/"><strong>pytest book</strong></a></li> <li><a href="https://www.patreon.com/pythonbytes"><strong>Patreon Supporters</strong></a></li> </ul> <p>Special guest: <a href="https://twitter.com/codemouse92"><strong>Jason McDonald</strong></a></p> <p><strong>Michael #1:</strong> <a href="https://www.youtube.com/watch?v=bxWrXhLFN2s"><strong>5 ways I use code as an astrophysicist</strong></a></p> <ul> <li>Video by <a href="https://twitter.com/drbecky_"><strong>Dr. Becky</strong></a> (i.e. Dr Becky Smethurst, an astrophysicist at the University of Oxford)</li> <li>She has a <a href="https://www.youtube.com/channel/UCYNbYGl89UUowy8oXkipC-Q">great YouTube channel</a> to check out.</li> <li>#1: Image Processing (of galaxies from telescopes) <ul> <li>Noise removal</li> </ul></li> <li>#2: Data analysis <ul> <li>Image features (brightness, etc)</li> <li>One example: 600k “rows” of galaxy properties</li> </ul></li> <li>#3: Model fitting <ul> <li>e.g. linear fit (visually as well through jupyter)</li> <li>e.g. Galaxies and their black holes grow in mass together</li> <li>Color of galaxies &amp; relative star formation</li> </ul></li> <li>#4: Data visualization</li> <li>#5: Simulations <ul> <li>Beautiful example of galaxies colliding</li> <li>Star meets black hole</li> </ul></li> </ul> <p><strong>Brian #2:</strong> <a href="http://jalammar.github.io/visual-numpy/"><strong>A Visual Intro to NumPy and Data Representation</strong></a></p> <ul> <li><a href="https://twitter.com/JayAlammar">Jay Alammar</a></li> <li>I’ve started using numpy more frequently in my own work.</li> <li>Problem: I think of np.array like a Python list. 
But that’s not right.</li> <li>This visualization guide helped me think of them differently.</li> <li>Covers: <ul> <li>arrays <ul> <li>creating arrays (I didn’t know about np.ones(), np.zeros(), or np.random.random(), so cool)</li> <li>array arithmetic</li> <li>indexing and slicing</li> <li>aggregation with min, max, sum, mean, prod, etc.</li> </ul></li> <li>matrices: 2D arrays <ul> <li>matrix arithmetic</li> <li>dot product (with visuals, it takes seconds to understand)</li> <li>matrix indexing and slicing</li> <li>matrix aggregation (over all entries, or per column or row with the axis parameter)</li> <li>transposing and reshaping</li> </ul></li> <li>ndarray: n-dimensional arrays</li> <li>transforming mathematical formulas to numpy syntax</li> <li>data representation</li> </ul></li> <li>All with excellent drawings to help visualize the concept.</li> </ul> <p><strong>Jason #3:</strong> <strong>Qt 6 release (including PySide6)</strong></p> <ul> <li>Qt 6.0 released on December 8: <a href="https://www.qt.io/blog/qt-6.0-released">https://www.qt.io/blog/qt-6.0-released</a> <ul> <li>3D Graphics abstraction layer called RHI (Rendering Hardware Interface), eliminating hard dependency on OpenGL, and adding support for DirectX, Vulkan, and Metal. Uses native 3D graphics on each device by default.</li> <li>Property bindings: <a href="https://www.qt.io/blog/property-bindings-in-qt-6">https://www.qt.io/blog/property-bindings-in-qt-6</a></li> <li>A bunch of refactoring to improve performance.</li> <li>QtQuick styling</li> <li>CAUTION: Many Qt 5 add-ons not yet supported!! 
They plan to support by 6.2 (end of September 2021).</li> <li>Pay attention to your 5.15 deprecation warnings; those things have now been removed in 6.0.</li> </ul></li> <li>PySide6/Shiboken6 released December 10: <a href="https://www.qt.io/blog/qt-for-python-6-released">https://www.qt.io/blog/qt-for-python-6-released</a> <ul> <li>New minimum version is Python 3.6, supported up to 3.9.</li> <li>Uses properties instead of (icky) getters/setters now. (Combine with snake_case support from 5.15.2)</li> </ul></li> </ul> <pre><code> from __feature__ import snake_case, true_property </code></pre> <ul> <li>PyQt6 also just released, if you prefer Riverbank’s flavor. (I prefer official.)</li> </ul> <p><strong>Michael #4:</strong> <a href="https://twitter.com/mkennedy/status/1339657542591336448"><strong>Is your GC hyper active? Tame it!</strong></a></p> <ul> <li>Let’s think about <code>gc.get_threshold()</code>.</li> <li>Returns <code>(700, 10, 10)</code> by default. That’s read roughly as: <ul> <li>For every net 700 allocations of a collection type, a gen 0 collection runs.</li> <li>Every 10th gen 0 collection is upgraded to a gen 1 collection.</li> <li>Every 10th gen 1 collection is upgraded to gen 2, i.e. every 100th gen 0 collection becomes a gen 2 collection.</li> </ul></li> <li>Now consider this:</li> </ul> <pre><code>query = PageView.objects(created__gte=yesterday).all()
data = list(query)  # len(data) = 1,500
</code></pre> <ul> <li>That’s multiple GC runs. We’ve allocated at least 1,500 custom objects. Yet never ever will any be garbage.</li> <li>But we can adjust this. Observe with <code>gc.set_debug(gc.DEBUG_STATS)</code> and consider this ONCE at startup:</li> </ul> <pre><code>import gc

# Clean up what might be garbage
gc.collect(2)
# Exclude current items from future GC.
gc.freeze()

allocs, gen1, gen2 = gc.get_threshold()
allocs = 50_000  # Start the GC sequence every 50K, not 700, net allocations.
gc.set_threshold(allocs, gen1, gen2)
print(f"GC threshold set to: {allocs:,}, {gen1}, {gen2}.")
</code></pre> <ul> <li>May be better, may be worse. But our pytest integration tests over <a href="https://training.talkpython.fm/">at Talk Python Training</a> run 10-12% faster and are a decent stand in for overall speed perf.</li> <li>Our sitemap was doing 77 GCs for a single page view (77!), now it’s 1-2.</li> </ul> <p><strong>Brian #5:</strong> <a href="https://tryolabs.com/blog/2020/12/21/top-10-python-libraries-of-2020/"><strong>Top 10 Python libraries of 2020</strong></a></p> <ul> <li>tryolabs</li> <li>criteria <ul> <li>They were launched or popularized in 2020.</li> <li>They are well maintained and have been since their launch date.</li> <li>They are outright cool, and you should check them out.</li> </ul></li> </ul> <p>General interest:</p> <ol> <li><a href="https://github.com/tiangolo/typer">Typer</a> : FastAPI for CLI applications <ul> <li>I can’t believe the first commit was right before 2020.</li> <li>Just about a year after the introduction of FastAPI, if you can believe it.</li> <li><a href="https://github.com/tiangolo">Sebastián Ramírez</a> is on 🔥 </li> </ul></li> <li><a href="https://github.com/willmcgugan/rich">Rich</a> : rich text and beautiful formatting in the terminal. <ul> <li><a href="https://github.com/willmcgugan">Will McGugan</a></li> <li>yep. 
showed up in 2020, amazing.</li> </ul></li> <li><a href="https://github.com/hoffstadt/DearPyGui">Dear PyGui</a> : Python port of the popular <a href="https://github.com/ocornut/imgui"><strong>Dear ImGui</strong></a> C++ project.</li> <li><a href="https://github.com/onelivesleft/PrettyErrors">PrettyErrors</a> : transforms stack traces into color coded, well spaced, easier to read stack traces.</li> <li><a href="https://github.com/mingrammer/diagrams">Diagrams</a> : lets you draw the cloud system architecture without any design tools, directly in Python code.</li> </ol> <p>Machine Learning:</p> <ol> <li><a href="https://hydra.cc/">Hydra</a> and <a href="https://github.com/omry/omegaconf">OmegaConf</a></li> <li><a href="https://github.com/PyTorchLightning/PyTorch-lightning">PyTorch Lightning</a></li> <li><a href="https://github.com/microsoft/hummingbird">Hummingbird</a></li> <li><a href="https://github.com/facebookresearch/hiplot">HiPlot</a> : plotting high dimensional data</li> </ol> <p>Also general</p> <ol> <li><a href="https://github.com/emeryberger/scalene">Scalene</a> : CPU and memory profiler for Python scripts capable of correctly handling multi-threaded code and distinguishing between time spent running Python vs. native code, without having to modify your code to use it.</li> </ol> <p><strong>Jason #6:</strong> <strong><a href="https://github.com/carlosperate/awesome-pyproject/">Adoption of pyproject.toml — why is this so darned controversial?</a></strong></p> <p>The goal of this file is to have a single standard place for all Python tool configurations. It was introduced in PEP 518, but the community seems divided over standardizing.</p> <p>A bunch of tools are lagging behind others in implementing. 
Tracked in this repo.</p> <p>A few of the bigger “sticking points”:</p> <ul> <li>setuptools is working on it: <a href="https://github.com/pypa/setuptools/issues/1688">https://github.com/pypa/setuptools/issues/1688</a></li> <li>MyPy: GVR says it “doesn’t solve anything” and closed the PR. <a href="https://github.com/python/mypy/issues/5205">https://github.com/python/mypy/issues/5205</a></li> <li>Flake8 objections: <a href="https://gitlab.com/pycqa/flake8/-/issues/428#note_251982786">https://gitlab.com/pycqa/flake8/-/issues/428#note_251982786</a> <ul> <li>Lack of standard TOML parser.</li> <li>“pip to change its behavior so mere presence of the file does not change functionality”</li> <li>Flake9 already implemented what Flake8 wouldn’t. Is this political?</li> </ul></li> <li>Bandit is sitting on a PR since 2019: <a href="https://github.com/PyCQA/bandit/issues/550">https://github.com/PyCQA/bandit/issues/550</a></li> <li>ReadTheDocs: It’s too much work? — <a href="https://github.com/readthedocs/readthedocs.org/issues/7065">https://github.com/readthedocs/readthedocs.org/issues/7065</a></li> <li>PyOxidizer (shockingly), silent on the topic since 2019: <a href="https://github.com/indygreg/PyOxidizer/issues/93">https://github.com/indygreg/PyOxidizer/issues/93</a></li> </ul> <p>Extras:</p> <p>Michael:</p> <ul> <li><a href="https://www.pyxll.com/">PyXLL for Excel people,</a> including <a href="https://www.pyxll.com/blog/python-jupyter-notebooks-in-excel">Python Jupyter Notebooks in Excel</a>.</li> <li>Django 3.1.5 Released</li> <li>Python 3.10.0a4 Is Now Available for Testing</li> <li>SciPy 1.6.0 Released</li> <li>M1 + PyCharm fast? <a href="https://ift.tt/3hSKR1c">https://ift.tt/3hSKR1c</a></li> <li>Flying solo with the M1 too - apparently 75% is shutdown time for my MBP!</li> </ul> <p>Joke</p> <p>“Why did the programmer always refuse to check his code into the repository? He was afraid to commit.”</p>
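<p>Appendix to Brian #2: a minimal sketch of why an np.array is not a Python list, plus a few of the operations the visual guide covers (creation, arithmetic, aggregation along an axis, dot product, transposing and reshaping). The variable names and sample values are illustrative assumptions, not taken from the article's own listings, and the sketch assumes NumPy is installed.</p>

```python
import numpy as np

# A list repeats under "*"; an array multiplies element by element.
as_list = [1, 2] * 2                # list repetition
as_array = np.array([1, 2]) * 2     # element-wise arithmetic

# Creation helpers mentioned in the notes.
ones = np.ones(3)
zeros = np.zeros(3)
noise = np.random.random(3)         # three floats in [0.0, 1.0)

# A 3x2 matrix (2D array) and aggregation with the axis parameter.
matrix = np.array([[1, 2], [3, 4], [5, 6]])
total = matrix.sum()                # over all entries
column_max = matrix.max(axis=0)     # one value per column
row_sum = matrix.sum(axis=1)        # one value per row

# Dot product of a length-3 vector with a 3x2 matrix gives a length-2 vector.
vector = np.array([1, 2, 3])
weights = np.array([[1, 10], [100, 1000], [10000, 100000]])
dotted = vector @ weights

# Transposing swaps the axes; reshaping keeps the data in row-major order.
transposed = matrix.T
reshaped = matrix.reshape(2, 3)

print(as_list, as_array, column_max, row_sum, dotted)
```

<p>The first two lines are the mental-model fix from the notes: the same operator means repetition for a list and arithmetic for an array.</p>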
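<p>Appendix to Michael #4: a rough way to see the effect of the gen 0 threshold for yourself. This sketch counts how many collections run while 1,500 long-lived objects are allocated, first with a threshold of 700 and then with 50,000. The <code>PageView</code> class here is a hypothetical stand-in for the ORM model in the notes, and the helper names are assumptions made for illustration. Exact counts vary by Python version, so none are claimed here.</p>

```python
import gc


class PageView:
    """Hypothetical stand-in for the ORM model used in the notes."""

    def __init__(self, number):
        self.number = number
        self.tags = []  # a container, so each row adds tracked allocations


def build_rows(count=1_500):
    """Allocate `count` long-lived objects, roughly like list(query)."""
    return [PageView(i) for i in range(count)]


def total_collections():
    """Collections run so far, summed over all generations."""
    return sum(stats["collections"] for stats in gc.get_stats())


def collections_during(threshold0):
    """Count GC runs while building rows under the given gen 0 threshold."""
    original = gc.get_threshold()
    gc.set_threshold(threshold0, original[1], original[2])
    try:
        before = total_collections()
        rows = build_rows()
        after = total_collections()
    finally:
        gc.set_threshold(*original)  # leave the interpreter as we found it
    return after - before, len(rows)


default_runs, _ = collections_during(700)
tuned_runs, _ = collections_during(50_000)
print(f"GC runs at 700: {default_runs}, at 50,000: {tuned_runs}")
```

<p>Unlike the startup snippet in the notes, this restores the original thresholds afterwards, so it is safe to paste into a REPL as an experiment.</p>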

from Planet Python