Friday, July 31, 2020

Codementor: I Wrote an Online Escape Game

All about the puzzle game I wrote with lots of web tech for maximum entertainment value!

from Planet Python
via read more

NumFOCUS: Dask Life Sciences Fellow [Open Job]

Dask is an open-source library for parallel computing in Python that interoperates with existing Python data science libraries like NumPy, Pandas, Scikit-Learn, and Jupyter. Dask is used today across many different scientific domains. Recently, we’ve observed an increase in use in a few life sciences applications: large-scale imaging in microscopy, single-cell analysis, genomics […]

The post Dask Life Sciences Fellow [Open Job] appeared first on NumFOCUS.



from Planet Python
via read more

Mike Driscoll: Real Python Podcast Interview

I am on the latest Real Python podcast where I talk about my ReportLab book, wxPython, and lots more.

The podcast episode that I take part in is called Episode 20: Building PDFs in Python with ReportLab. Check it out and feel free to ask questions in the comments.

The post Real Python Podcast Interview appeared first on The Mouse Vs. The Python.



from Planet Python
via read more

Catalin George Festila: Python 3.8.5 : PyEphem astronomy library for Python - part 001.

You can find this Python package on the official website. PyEphem provides an ephem Python package for performing high-precision astronomy computations. The underlying numeric routines are coded in C and are the same ones that drive the popular XEphem astronomy application, whose author, Elwood Charles Downey, generously gave permission for their use in PyEphem. The name ephem is

from Planet Python
via read more

Python⇒Speed: A tableau of crimes and misfortunes: the ever-useful `docker history`

If you want to understand a Docker image, there is no more useful tool than the docker history command. Whether it’s telling you why your image is so large, or helping you understand how a base image was constructed, the history command will let you peer into the innards of any image, allowing you to see the good, the bad, and the ugly.

Let’s see what this command does, what it can teach us about the construction of Docker images, and some examples of why it’s so useful.

Read more...

from Planet Python
via read more

This Week in Machine Learning: Should We Be Afraid of AI, SER, Disney, and More

Machine learning is fascinating. New things happen every second while we’re busy performing our daily tasks. If you want to know what […]

The post This Week in Machine Learning: Should We Be Afraid of AI, SER, Disney, and More appeared first on neptune.ai.



from Planet SciPy
read more

PSF GSoC students blogs: Week 5 Blog Post

I am not feeling well this week and have asked my mentors for leave. I will catch up with my plan this weekend or next week.



from Planet Python
via read more

Real Python: The Real Python Podcast – Episode #20: Building PDFs in Python with ReportLab

Have you wanted to generate advanced reports as PDFs using Python? Maybe you want to build documents with tables, images, or fillable forms. This week on the show we have Mike Driscoll to talk about his book "ReportLab - PDF Processing with Python."





from Planet Python
via read more

Learn PyQt: Creating multiple windows in PyQt5/PySide2

In an earlier tutorial we've already covered how to open dialog windows. These are special windows which (by default) grab the focus of the user, and run their own event loop, effectively blocking the execution of the rest of your app.

However, quite often you will want to open a second window in an application, without interrupting the main window -- for example, to show the output of some long-running process, or display graphs or other visualizations. Alternatively, you may want to create an application that allows you to work on multiple documents at once, in their own windows.

It's relatively straightforward to open new windows but there are a few things to keep in mind to make sure they work well. In this tutorial we'll step through how to create a new window, and how to show and hide external windows on demand.

Creating a new window

In Qt any widget without a parent is a window. This means, to show a new window you just need to create a new instance of a widget. This can be any widget type (technically any subclass of QWidget) including another QMainWindow if you prefer.

There is no restriction on the number of QMainWindow instances you can have. If you need toolbars or menus on your second window you will have to use a QMainWindow to achieve this. This can get confusing for users however, so make sure it's necessary.

As with your main window, creating a window is not sufficient; you must also show it.

python
from PyQt5.QtWidgets import QApplication, QMainWindow, QPushButton, QLabel, QVBoxLayout, QWidget

import sys


class AnotherWindow(QWidget):
    """
    This "window" is a QWidget. If it has no parent, it 
    will appear as a free-floating window as we want.
    """
    def __init__(self):
        super().__init__()
        layout = QVBoxLayout()
        self.label = QLabel("Another Window")
        layout.addWidget(self.label)
        self.setLayout(layout)


class MainWindow(QMainWindow):
    
    def __init__(self):
        super().__init__()
        self.button = QPushButton("Push for Window")
        self.button.clicked.connect(self.show_new_window)
        self.setCentralWidget(self.button)
        
    def show_new_window(self, checked):
        w = AnotherWindow()
        w.show()


app = QApplication(sys.argv)
w = MainWindow()
w.show()
app.exec_()

The same example using PySide2:

python
from PySide2.QtWidgets import QApplication, QMainWindow, QPushButton, QLabel, QVBoxLayout, QWidget

import sys


class AnotherWindow(QWidget):
    """
    This "window" is a QWidget. If it has no parent, it 
    will appear as a free-floating window as we want.
    """
    def __init__(self):
        super().__init__()
        layout = QVBoxLayout()
        self.label = QLabel("Another Window")
        layout.addWidget(self.label)
        self.setLayout(layout)


class MainWindow(QMainWindow):
    
    def __init__(self):
        super().__init__()
        self.button = QPushButton("Push for Window")
        self.button.clicked.connect(self.show_new_window)
        self.setCentralWidget(self.button)
        
    def show_new_window(self, checked):
        w = AnotherWindow()
        w.show()


app = QApplication(sys.argv)
w = MainWindow()
w.show()
app.exec_()

If you run this, you'll see the main window. Clicking the button may show the second window, but if you do see it, it will only be visible for a fraction of a second. What's happening?

python
    def show_new_window(self, checked):
        w = AnotherWindow()
        w.show()

Inside this method, we are creating our window (widget) object, storing it in the variable w and showing it. However, w is a local variable: once we leave the method, nothing refers to the window object any more, so it will be cleaned up and the window destroyed. To fix this we need to keep a reference to the window somewhere, for example on the self object.

python
    def show_new_window(self, checked):
        self.w = AnotherWindow()
        self.w.show()

Now, when you click the button to show the new window, it will persist.
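
The reference behaviour described above can be reproduced without Qt. The following is a minimal sketch in plain Python, assuming CPython's reference counting; FakeWindow and Launcher are hypothetical stand-ins and are not part of PyQt5 or PySide2.

```python
import weakref


class FakeWindow:
    """Hypothetical stand-in for a QWidget; it has no Qt behaviour."""


class Launcher:
    def open_local(self):
        w = FakeWindow()        # the only reference is the local name `w`
        return weakref.ref(w)   # a weak reference does not keep `w` alive

    def open_kept(self):
        self.w = FakeWindow()   # the instance attribute keeps the object alive
        return weakref.ref(self.w)


launcher = Launcher()
local_ref = launcher.open_local()
kept_ref = launcher.open_kept()

# A dead weak reference returns None when called.
print("local window freed:", local_ref() is None)
print("kept window freed:", kept_ref() is None)
```

On CPython the first line should report that the object was freed as soon as the method returned, while the object stored on self is still alive. A real Qt window behaves the same way, with the extra effect that the window disappears from the screen.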

However, what happens if you click the button again? The window will be re-created! This new window will replace the old in the self.w variable, and – because there is now no reference to it – the previous window will be destroyed.

You can see this in action if you change the window definition to show a random number in the label each time it is created.

python
from random import randint


class AnotherWindow(QWidget):
    """
    This "window" is a QWidget. If it has no parent, it 
    will appear as a free-floating window as we want.
    """
    def __init__(self):
        super().__init__()
        layout = QVBoxLayout()
        self.label = QLabel("Another Window % d" % randint(0,100))
        layout.addWidget(self.label)
        self.setLayout(layout)

The __init__ block is only run when creating the window. If you keep clicking the button the number will change, showing that the window is being re-created.

One solution is to simply check whether the window has already been created before creating it. The example below shows this in action.

python
from PyQt5.QtWidgets import QApplication, QMainWindow, QPushButton, QLabel, QVBoxLayout, QWidget

import sys

from random import randint


class AnotherWindow(QWidget):
    """
    This "window" is a QWidget. If it has no parent, it 
    will appear as a free-floating window as we want.
    """
    def __init__(self):
        super().__init__()
        layout = QVBoxLayout()
        self.label = QLabel("Another Window % d" % randint(0,100))
        layout.addWidget(self.label)
        self.setLayout(layout)


class MainWindow(QMainWindow):
    
    def __init__(self):
        super().__init__()
        self.w = None  # No external window yet.
        self.button = QPushButton("Push for Window")
        self.button.clicked.connect(self.show_new_window)
        self.setCentralWidget(self.button)
        
    def show_new_window(self, checked):
        if self.w is None:
            self.w = AnotherWindow()
        self.w.show()


app = QApplication(sys.argv)
w = MainWindow()
w.show()
app.exec_()

The same example using PySide2:

python
from PySide2.QtWidgets import QApplication, QMainWindow, QPushButton, QLabel, QVBoxLayout, QWidget

import sys

from random import randint


class AnotherWindow(QWidget):
    """
    This "window" is a QWidget. If it has no parent, it 
    will appear as a free-floating window as we want.
    """
    def __init__(self):
        super().__init__()
        layout = QVBoxLayout()
        self.label = QLabel("Another Window % d" % randint(0,100))
        layout.addWidget(self.label)
        self.setLayout(layout)


class MainWindow(QMainWindow):
    
    def __init__(self):
        super().__init__()
        self.w = None  # No external window yet.
        self.button = QPushButton("Push for Window")
        self.button.clicked.connect(self.show_new_window)
        self.setCentralWidget(self.button)
        
    def show_new_window(self, checked):
        if self.w is None:
            self.w = AnotherWindow()
        self.w.show()


app = QApplication(sys.argv)
w = MainWindow()
w.show()
app.exec_()

Using the button you can pop up the window, and use the window controls to close it. If you click the button again, the same window will re-appear.

This approach is fine for windows that you create temporarily – for example if you want to pop up a window to show a particular plot, or log output. However, for many applications you have a number of standard windows that you want to be able to show/hide on demand.

In the next part we'll look at how to work with these types of windows.

Toggling a window

Often you'll want to toggle the display of a window using an action on a toolbar or in a menu. As we saw previously, if no reference to a window is kept, it will be discarded (and closed). We can use this behaviour to close a window, replacing the show_new_window method from the previous example with the following:

python
    def show_new_window(self, checked):
        if self.w is None:
            self.w = AnotherWindow()
            self.w.show()

        else:
            self.w = None  # Discard reference, close window.

By setting self.w to None the reference to the window will be lost, and the window will close.

If we set it to any other value than None the window will still close, but the if self.w is None test will not pass the next time we click the button and so we will not be able to recreate a window.

This will only work if you have not kept a reference to this window somewhere else. To make sure the window closes regardless, you may want to explicitly call .close() on it. The full example is shown below.

python
from PyQt5.QtWidgets import QApplication, QMainWindow, QPushButton, QLabel, QVBoxLayout, QWidget

import sys

from random import randint


class AnotherWindow(QWidget):
    """
    This "window" is a QWidget. If it has no parent, it 
    will appear as a free-floating window as we want.
    """
    def __init__(self):
        super().__init__()
        layout = QVBoxLayout()
        self.label = QLabel("Another Window % d" % randint(0,100))
        layout.addWidget(self.label)
        self.setLayout(layout)


class MainWindow(QMainWindow):
    
    def __init__(self):
        super().__init__()
        self.w = None  # No external window yet.
        self.button = QPushButton("Push for Window")
        self.button.clicked.connect(self.show_new_window)
        self.setCentralWidget(self.button)
        
    def show_new_window(self, checked):
        if self.w is None:
            self.w = AnotherWindow()
            self.w.show()
        
        else:
            self.w.close()  # Close window.
            self.w = None  # Discard reference.


app = QApplication(sys.argv)
w = MainWindow()
w.show()
app.exec_()

The same example using PySide2:

python
from PySide2.QtWidgets import QApplication, QMainWindow, QPushButton, QLabel, QVBoxLayout, QWidget

import sys

from random import randint


class AnotherWindow(QWidget):
    """
    This "window" is a QWidget. If it has no parent, it 
    will appear as a free-floating window as we want.
    """
    def __init__(self):
        super().__init__()
        layout = QVBoxLayout()
        self.label = QLabel("Another Window % d" % randint(0,100))
        layout.addWidget(self.label)
        self.setLayout(layout)


class MainWindow(QMainWindow):
    
    def __init__(self):
        super().__init__()
        self.w = None  # No external window yet.
        self.button = QPushButton("Push for Window")
        self.button.clicked.connect(self.show_new_window)
        self.setCentralWidget(self.button)
        
    def show_new_window(self, checked):
        if self.w is None:
            self.w = AnotherWindow()
            self.w.show()
        
        else:
            self.w.close()  # Close window.
            self.w = None  # Discard reference.


app = QApplication(sys.argv)
w = MainWindow()
w.show()
app.exec_()
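
The caveat about other references is easy to see in plain Python. This is a sketch only, assuming CPython's reference counting; FakeWindow, Main and the recent list are hypothetical names used for illustration and do not come from Qt.

```python
import weakref


class FakeWindow:
    """Hypothetical stand-in for a QWidget; it has no Qt behaviour."""


class Main:
    def __init__(self):
        self.w = FakeWindow()


main = Main()
probe = weakref.ref(main.w)  # observes the window without keeping it alive
recent = [main.w]            # a second reference held somewhere else

main.w = None                # discard the main reference
alive_after_discard = probe() is not None

recent.clear()               # drop the remaining reference
alive_after_clear = probe() is not None

print(alive_after_discard, alive_after_clear)
```

Setting self.w to None is not enough while another reference exists, which is why calling .close() explicitly, as in the full example, is the more dependable way to make the window go away.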

Persistent windows

So far we've looked at how to create new windows on demand. However, sometimes you have a number of standard application windows. In this case rather than create the windows when you want to show them, it can often make more sense to create them at start-up, then use .show() to display them when needed.

In the following example we create our external window in the __init__ block for the main window, and then our show_new_window method simply calls self.w.show() to display it.

python
from PyQt5.QtWidgets import QApplication, QMainWindow, QPushButton, QLabel, QVBoxLayout, QWidget

import sys

from random import randint


class AnotherWindow(QWidget):
    """
    This "window" is a QWidget. If it has no parent, it 
    will appear as a free-floating window as we want.
    """
    def __init__(self):
        super().__init__()
        layout = QVBoxLayout()
        self.label = QLabel("Another Window % d" % randint(0,100))
        layout.addWidget(self.label)
        self.setLayout(layout)


class MainWindow(QMainWindow):
    
    def __init__(self):
        super().__init__()
        self.w = AnotherWindow()
        self.button = QPushButton("Push for Window")
        self.button.clicked.connect(self.show_new_window)
        self.setCentralWidget(self.button)
        
    def show_new_window(self, checked):
        self.w.show()


app = QApplication(sys.argv)
w = MainWindow()
w.show()
app.exec_()

The same example using PySide2:

python
from PySide2.QtWidgets import QApplication, QMainWindow, QPushButton, QLabel, QVBoxLayout, QWidget

import sys

from random import randint


class AnotherWindow(QWidget):
    """
    This "window" is a QWidget. If it has no parent, it 
    will appear as a free-floating window as we want.
    """
    def __init__(self):
        super().__init__()
        layout = QVBoxLayout()
        self.label = QLabel("Another Window % d" % randint(0,100))
        layout.addWidget(self.label)
        self.setLayout(layout)


class MainWindow(QMainWindow):
    
    def __init__(self):
        super().__init__()
        self.w = AnotherWindow()
        self.button = QPushButton("Push for Window")
        self.button.clicked.connect(self.show_new_window)
        self.setCentralWidget(self.button)
        
    def show_new_window(self, checked):
        self.w.show()


app = QApplication(sys.argv)
w = MainWindow()
w.show()
app.exec_()

If you run this, clicking on the button will show the window as before. However, note that the window is only created once and calling .show() on an already visible window has no effect.

Showing & hiding persistent windows

Once you have created a persistent window you can show and hide it without recreating it. Once hidden, the window still exists, but will not be visible or accept mouse or other input. However, you can continue to call methods on the window and update its state, including changing its appearance. Once re-shown, any changes will be visible.

Below we update our main window to create a toggle_window method which uses .isVisible() to check whether the window is currently visible. If it is not, it is shown using .show(); if it is already visible, we hide it with .hide().

python
class MainWindow(QMainWindow):

    def __init__(self):
        super().__init__()
        self.w = AnotherWindow()
        self.button = QPushButton("Push for Window")
        self.button.clicked.connect(self.toggle_window)
        self.setCentralWidget(self.button)

    def toggle_window(self, checked):
        if self.w.isVisible():
            self.w.hide()

        else:
            self.w.show()

The complete working example of this persistent window and toggling the show/hide state is shown below.

python
from PyQt5.QtWidgets import QApplication, QMainWindow, QPushButton, QLabel, QVBoxLayout, QWidget

import sys

from random import randint


class AnotherWindow(QWidget):
    """
    This "window" is a QWidget. If it has no parent, it 
    will appear as a free-floating window as we want.
    """
    def __init__(self):
        super().__init__()
        layout = QVBoxLayout()
        self.label = QLabel("Another Window % d" % randint(0,100))
        layout.addWidget(self.label)
        self.setLayout(layout)


class MainWindow(QMainWindow):
    def __init__(self):
        super().__init__()
        self.w = AnotherWindow()
        self.button = QPushButton("Push for Window")
        self.button.clicked.connect(self.toggle_window)
        self.setCentralWidget(self.button)

    def toggle_window(self, checked):
        if self.w.isVisible():
            self.w.hide()

        else:
            self.w.show()


app = QApplication(sys.argv)
w = MainWindow()
w.show()
app.exec_()

The same example using PySide2:

python
from PySide2.QtWidgets import QApplication, QMainWindow, QPushButton, QLabel, QVBoxLayout, QWidget

import sys

from random import randint


class AnotherWindow(QWidget):
    """
    This "window" is a QWidget. If it has no parent, it 
    will appear as a free-floating window as we want.
    """
    def __init__(self):
        super().__init__()
        layout = QVBoxLayout()
        self.label = QLabel("Another Window % d" % randint(0,100))
        layout.addWidget(self.label)
        self.setLayout(layout)


class MainWindow(QMainWindow):
    def __init__(self):
        super().__init__()
        self.w = AnotherWindow()
        self.button = QPushButton("Push for Window")
        self.button.clicked.connect(self.toggle_window)
        self.setCentralWidget(self.button)

    def toggle_window(self, checked):
        if self.w.isVisible():
            self.w.hide()

        else:
            self.w.show()


app = QApplication(sys.argv)
w = MainWindow()
w.show()
app.exec_()

Note that, again, the window is only created once -- the window's __init__ block is not re-run (so the number in the label does not change) each time the window is re-shown.

Multiple windows

You can use the same principle for creating multiple windows -- as long as you keep a reference to the window, things will work as expected. The simplest approach is to create a separate method to toggle the display of each of the windows.

python
import sys
from random import randint

from PyQt5.QtWidgets import (
    QApplication,
    QLabel,
    QMainWindow,
    QPushButton,
    QVBoxLayout,
    QWidget,
)


class AnotherWindow(QWidget):
    """
    This "window" is a QWidget. If it has no parent,
    it will appear as a free-floating window.
    """

    def __init__(self):
        super().__init__()
        layout = QVBoxLayout()
        self.label = QLabel("Another Window % d" % randint(0, 100))
        layout.addWidget(self.label)
        self.setLayout(layout)


class MainWindow(QMainWindow):
    def __init__(self):
        super().__init__()
        self.window1 = AnotherWindow()
        self.window2 = AnotherWindow()

        l = QVBoxLayout()
        button1 = QPushButton("Push for Window 1")
        button1.clicked.connect(self.toggle_window1)
        l.addWidget(button1)

        button2 = QPushButton("Push for Window 2")
        button2.clicked.connect(self.toggle_window2)
        l.addWidget(button2)

        w = QWidget()
        w.setLayout(l)
        self.setCentralWidget(w)

    def toggle_window1(self, checked):
        if self.window1.isVisible():
            self.window1.hide()

        else:
            self.window1.show()

    def toggle_window2(self, checked):
        if self.window2.isVisible():
            self.window2.hide()

        else:
            self.window2.show()


app = QApplication(sys.argv)
w = MainWindow()
w.show()
app.exec_()

The same example using PySide2:

python
import sys
from random import randint

from PySide2.QtWidgets import (
    QApplication,
    QLabel,
    QMainWindow,
    QPushButton,
    QVBoxLayout,
    QWidget,
)


class AnotherWindow(QWidget):
    """
    This "window" is a QWidget. If it has no parent,
    it will appear as a free-floating window.
    """

    def __init__(self):
        super().__init__()
        layout = QVBoxLayout()
        self.label = QLabel("Another Window % d" % randint(0, 100))
        layout.addWidget(self.label)
        self.setLayout(layout)


class MainWindow(QMainWindow):
    def __init__(self):
        super().__init__()
        self.window1 = AnotherWindow()
        self.window2 = AnotherWindow()

        l = QVBoxLayout()
        button1 = QPushButton("Push for Window 1")
        button1.clicked.connect(self.toggle_window1)
        l.addWidget(button1)

        button2 = QPushButton("Push for Window 2")
        button2.clicked.connect(self.toggle_window2)
        l.addWidget(button2)

        w = QWidget()
        w.setLayout(l)
        self.setCentralWidget(w)

    def toggle_window1(self, checked):
        if self.window1.isVisible():
            self.window1.hide()

        else:
            self.window1.show()

    def toggle_window2(self, checked):
        if self.window2.isVisible():
            self.window2.hide()

        else:
            self.window2.show()


app = QApplication(sys.argv)
w = MainWindow()
w.show()
app.exec_()

However, you can also create a generic method which handles toggling for all windows -- see transmitting extra data with Qt signals for a detailed explanation of how this works. The example below shows that in action, using a lambda function to intercept the signal from each button and pass through the appropriate window. We can also discard the checked value since we aren't using it.

python
import sys
from random import randint

from PyQt5.QtWidgets import (
    QApplication,
    QLabel,
    QMainWindow,
    QPushButton,
    QVBoxLayout,
    QWidget,
)


class AnotherWindow(QWidget):
    """
    This "window" is a QWidget. If it has no parent,
    it will appear as a free-floating window.
    """

    def __init__(self):
        super().__init__()
        layout = QVBoxLayout()
        self.label = QLabel("Another Window % d" % randint(0, 100))
        layout.addWidget(self.label)
        self.setLayout(layout)


class MainWindow(QMainWindow):
    def __init__(self):
        super().__init__()
        self.window1 = AnotherWindow()
        self.window2 = AnotherWindow()

        l = QVBoxLayout()
        button1 = QPushButton("Push for Window 1")
        button1.clicked.connect(
            lambda checked: self.toggle_window(self.window1)
        )
        l.addWidget(button1)

        button2 = QPushButton("Push for Window 2")
        button2.clicked.connect(
            lambda checked: self.toggle_window(self.window2)
        )
        l.addWidget(button2)

        w = QWidget()
        w.setLayout(l)
        self.setCentralWidget(w)

    def toggle_window(self, window):
        if window.isVisible():
            window.hide()

        else:
            window.show()


app = QApplication(sys.argv)
w = MainWindow()
w.show()
app.exec_()

The same example using PySide2:

python
import sys
from random import randint

from PySide2.QtWidgets import (
    QApplication,
    QLabel,
    QMainWindow,
    QPushButton,
    QVBoxLayout,
    QWidget,
)


class AnotherWindow(QWidget):
    """
    This "window" is a QWidget. If it has no parent,
    it will appear as a free-floating window.
    """

    def __init__(self):
        super().__init__()
        layout = QVBoxLayout()
        self.label = QLabel("Another Window % d" % randint(0, 100))
        layout.addWidget(self.label)
        self.setLayout(layout)


class MainWindow(QMainWindow):
    def __init__(self):
        super().__init__()
        self.window1 = AnotherWindow()
        self.window2 = AnotherWindow()

        l = QVBoxLayout()
        button1 = QPushButton("Push for Window 1")
        button1.clicked.connect(
            lambda checked: self.toggle_window(self.window1)
        )
        l.addWidget(button1)

        button2 = QPushButton("Push for Window 2")
        button2.clicked.connect(
            lambda checked: self.toggle_window(self.window2)
        )
        l.addWidget(button2)

        w = QWidget()
        w.setLayout(l)
        self.setCentralWidget(w)

    def toggle_window(self, window):
        if window.isVisible():
            window.hide()

        else:
            window.show()


app = QApplication(sys.argv)
w = MainWindow()
w.show()
app.exec_()
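
If you connect the buttons in a loop rather than one by one, the lambda needs some care. The sketch below uses a hypothetical FakeButton class in place of QPushButton so that it runs without Qt. It illustrates standard Python closure behaviour: a lambda looks up a loop variable when it is called, unless the value is bound as a default argument when the lambda is created.

```python
class FakeButton:
    """Hypothetical stand-in for a QPushButton and its clicked signal."""

    def __init__(self):
        self._slots = []

    def connect(self, slot):
        self._slots.append(slot)

    def click(self):
        for slot in self._slots:
            slot(False)  # like Qt, pass the `checked` state to the slot


toggled = []
windows = ["window1", "window2", "window3"]
late = [FakeButton() for _ in windows]
bound = [FakeButton() for _ in windows]

for button_late, button_bound, window in zip(late, bound, windows):
    # Late binding: `window` is looked up when the lambda is called.
    button_late.connect(lambda checked: toggled.append(("late", window)))
    # Default argument: `window` is captured when the lambda is created.
    button_bound.connect(
        lambda checked, window=window: toggled.append(("bound", window))
    )

late[0].click()
bound[0].click()
print(toggled)  # [('late', 'window3'), ('bound', 'window1')]
```

The first button of the late-bound set reports the last window, because the loop has finished by the time the button is clicked. The examples above avoid the problem by naming self.window1 and self.window2 explicitly in each lambda.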


from Planet Python
via read more

PSF GSoC students blogs: Week 8

Just a brief check-in. Setting up a MySQL environment for manual testing on a Mac mini and then moving on to tickets #362 and #363.



from Planet Python
via read more

Thursday, July 30, 2020

Python Insider: Upgrade to pip 20.2, plus, changes coming in 20.3

On behalf of the Python Packaging Authority, I am pleased to announce that we have just released pip 20.2, a new version of pip. You can install it by running python -m pip install --upgrade pip.

The highlights for this release are:

- The beta of the next-generation dependency resolver is available -- please test
- Faster installations from wheel files
- Improved handling of wheels containing non-ASCII file contents
- Faster pip list using parallelized network operations
- Installed packages now contain metadata about whether they were directly requested by the user (PEP 376’s REQUESTED file)

The new dependency resolver is off by default because it is in beta and not yet ready for everyday use. The new dependency resolver is significantly stricter and more consistent when it receives incompatible instructions, and reduces support for certain kinds of constraints files, so some workarounds and workflows may break. Please test it with the --use-feature=2020-resolver flag. Please see our guide on how to test and migrate, how to report issues, and context for the change.

Thanks to all who tested the alpha of the new resolver in pip 20.1 for feedback that helped us get it to the beta stage.

We are preparing to change the default dependency resolution behavior and make the new resolver the default in pip 20.3 (in October 2020).

This release also partially optimizes pip’s network usage during installation (as part of a Google Summer of Code project by McSinyx). Please test it with pip install --use-feature=2020-resolver --use-feature=fast-deps and report bugs to the issue tracker. This functionality is still experimental and not ready for everyday use.
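
For scripted testing, the two opt-in flags mentioned above can be assembled programmatically. This is only a sketch: the helper name pip_install_command and the package name example-package are placeholders for illustration, and the code builds the command lines without running them.

```python
import shlex
import sys


def pip_install_command(package, features=()):
    """Build a `python -m pip install` command with --use-feature flags."""
    command = [sys.executable, "-m", "pip", "install"]
    command += [f"--use-feature={name}" for name in features]
    command.append(package)
    return command


# "example-package" is a placeholder, not a real requirement.
resolver_only = pip_install_command("example-package", ["2020-resolver"])
with_fast_deps = pip_install_command(
    "example-package", ["2020-resolver", "fast-deps"]
)

# Print the arguments after the interpreter path for easy copying.
print(shlex.join(resolver_only[1:]))
print(shlex.join(with_fast_deps[1:]))

# To actually run one of them:
#   import subprocess
#   subprocess.run(resolver_only, check=True)
```

Running the install through sys.executable keeps it tied to the interpreter that launched the script, in the same way as the python -m pip form used in the announcement.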

You can find more details (including deprecations and removals) in the changelog.

As with all pip releases, a significant amount of the work was contributed by pip’s user community. Huge thanks to all who have contributed, whether through code, documentation, issue reports and/or discussion. Your help keeps pip improving, and is hugely appreciated. Specific thanks go to Mozilla (through its Mozilla Open Source Support Awards) and to the Chan Zuckerberg Initiative DAF, an advised fund of Silicon Valley Community Foundation, for their funding that enabled substantial work on the new resolver.




from Planet Python
via read more

Matt Layman: Docs, Bugs, and Reports - Building SaaS #66

In this episode, I created documentation for anyone interested in trying out the application. After documenting the setup, I moved on to fixing a bug with the scheduling display of courses. In the latter half of the stream, we focused on creating a new reports section to show progress reports for students. One of my patrons requested some documentation to explain how to get started with the project. We updated the README.

from Planet Python
via read more

Paolo Amoroso: Reading Impractical Python Projects

If you experienced the home and personal computing revolution of the early 1980s, you may have read some books that got you hooked on programming. These books led you through the intellectual adventure of using computing to explore interesting problem domains.

I got a recent book that brought back that fascination and excitement with programming, Impractical Python Projects: Playful Programming Activities to Make You Smarter by Lee Vaughan.

The cover of the book Impractical Python Projects in the Google Play Books app on my Pixel 2 XL phone.

The book is not a Python tutorial or guide. Instead, it presents stimulating coding projects for non-programmers who want to use Python to do experiments, test theories, or simulate natural phenomena. This includes professionals who are not software developers but use programming to solve problems in science and engineering.

Exploring and understanding the problem domain is an integral part of the book’s projects along with the coding. This is unlike typical programming books where the examples are often trivial, have little or no domain depth, and are stripped of everything but the essentials.

The science and engineering Impractical Python Projects covers include some great ones that match my interest in astronomy and space such as estimating alien civilizations with the Fermi Paradox, simulating a volcano on Jupiter’s moon Io, simulating orbital maneuvers, and stacking planetary images.

The sample code is straightforward and clear. Since the book is not a language tutorial, it focuses on prototyping and exploration rather than building large and maintainable systems.

This book alone is worth the price of the Humble Bundle of No Starch Press Python programming books I purchased it with.

from Planet Python
via read more

Janusworx: A Hundred Days of Code, Day 022 - Getting into the Groove

Did the same time as yesterday.
Only about an hour.
Was much more productive though.

Getting the hang of how to sit and program and work through things I do not know.
Gaining a bit of experience with the workflow now.
I have the basics in hand. I know what I want to look up.
So check problem, work a bit, look up, try, fail, repeat, gain incremental success, work some more.
Love the immediate feedback loop.
With other stuff I try, I have to wait days, weeks, months.
Here, it’s immediate.

Beginning to love the work, as I get more familiar with it.
Tomorrow is another day :)



from Planet Python
via read more

Wednesday, July 29, 2020

PSF GSoC students blogs: Weekly Check In - 8

What did I do till now?

Last week I added tests for H2Agent and H2DownloaderHandler

What's coming up next?

Next week I plan to continue working on ScrapyTunnelingH2Agent.

Did I get stuck anywhere?

Yes. I got stuck for a long time while setting up the testing environment for H2DownloaderHandler. The problem was a bit weird: until now, Scrapy had been using Twisted's WrappingFactory class to wrap the Site instance, which (for unknown reasons) only allows up to HTTP/1.1, and it took me a long time to realize this. After removing the WrappingFactory, the test environment was set up as required. Apart from this, another hurdle I'm still facing is the CONNECT protocol in HTTP/2.0; I couldn't find many blogs or articles on it to get a better idea. I plan to look at how some open-source libraries implement HTTP/2.0 CONNECT now.



from Planet Python
via read more

PSF GSoC students blogs: Weekly Check-in #9

What did I do this week?

I added support for Immediate response in the HTTP server. I also added a new command-line option, so that we can run dataflow without the need for Sources.

What's next?

I'll be adding tests for the same and updating the documentation to use the new features.

Did I get stuck somewhere?

No.



from Planet Python
via read more

Israel Fruchter: How much fun was EuroPython 2020

#pyconil got canceled

This year I finally worked up the courage and the will to make 2 submissions to #pyconil. COVID-19 had other plans, and #pyconil was canceled.

I told @ultrabug (Alexys Jacob, Numberly's CTO) about this. A few weeks later he surprised me by telling me he was going to present scylla-driver, the shard-aware driver we had been working on for the last 6 months, at europython2020.

At the time it was neither ready nor published. (I also found out that Numberly had been sponsoring europython for years.) It took me a few seconds to figure out that he had just set me a deadline without my consent…

Fast-forward a bit, and the initial release of scylla-driver came out: https://ift.tt/2P5275s

And my tickets for europython2020 were booked…

Discord is fun

A few days before the date, I got an email with instructions for connecting to the europython2020 Discord. Since it was my first COVID-19 online conference, I registered and logged in straight away.

It was really nice to start talking with people and meet them a few days before the conference.

Each track got its own chat room, backed by a Zoom webinar and a YouTube live stream. Each talk also got its own chat room, so speakers could promote their talks and answer questions before and after them. Sponsors had rooms too, and so did all the sprints (we’ll get to that later on).

Talk about timing: the day before europython2020, this press release came out: https://ift.tt/2ZNsW4u

Talks — Day 1

The opening keynote was cancelled because of technical difficulties: the organizers lost touch with the person who was supposed to talk. (It was rescheduled to the next day.)

Social — the missing e in my whisky

Throughout the day I played cat and mouse with people, trying to bring them into the open track for a face-to-face meeting. Only later, after all the sessions ended, did some people come in, most of them part of the organizing team. Everyone was showing the beer or whisky they were drinking. Keith Gaughan laughed at the ice in my whisky, and also taught me that a real whisky is spelled whiskey, with an e in it.

Marc-Andre Lemburg was talking about the challenges they have as organizers. One thing led to another, and I asked how I could help. He said they needed a hand with the Twitter account, and that he’d hook me up the next day with the person who handles it.

After a few more rounds of whisky, I called it a day.

Talks — Day 2

TODO:

Sprints

  • packaging -
  • hypothesis -
  • terminusdb-client -


from Planet Python
via read more

Codementor: Face Mask Detection using Yolo V3

Want to implement real-time face mask detection? In this post you will see hands-on training of YOLO v3 using Google Colab to detect whether a person is wearing a mask.

from Planet Python
via read more

Real Python: Namespaces and Scope in Python

This tutorial covers Python namespaces, the structures used to organize the symbolic names assigned to objects in a Python program.

The previous tutorials in this series have emphasized the importance of objects in Python. Objects are everywhere! Virtually everything that your Python program creates or acts on is an object.

An assignment statement creates a symbolic name that you can use to reference an object. The statement x = 'foo' creates a symbolic name x that refers to the string object 'foo'.

In a program of any complexity, you’ll create hundreds or thousands of such names, each pointing to a specific object. How does Python keep track of all these names so that they don’t interfere with one another?

In this tutorial, you’ll learn:

  • How Python organizes symbolic names and objects in namespaces
  • When Python creates a new namespace
  • How namespaces are implemented
  • How variable scope determines symbolic name visibility

Namespaces in Python

A namespace is a collection of currently defined symbolic names along with information about the object that each name references. You can think of a namespace as a dictionary in which the keys are the object names and the values are the objects themselves. Each key-value pair maps a name to its corresponding object.
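
The dictionary analogy can be checked directly. The following is a minimal sketch rather than anything from the tutorial itself: it builds a throwaway module named demo (a made-up name) with the standard library's types.ModuleType, runs the assignment x = 'foo' inside it, and then looks at the module's namespace through vars(), which returns the dictionary backing it.

```python
import types

# Hypothetical empty module, created only for this illustration.
demo = types.ModuleType("demo")

# Run an assignment inside the module's namespace.
exec("x = 'foo'", demo.__dict__)

# vars() returns the dictionary that backs the module's namespace.
namespace = vars(demo)

# The symbolic name is a key; the object it refers to is the value.
print("x" in namespace)          # True
print(namespace["x"])            # foo
print(namespace["x"] is demo.x)  # True
```

Looking a name up in the dictionary and accessing it as an attribute of the module reach the same object, which is what "each key-value pair maps a name to its corresponding object" means in practice.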

Namespaces are one honking great idea—let’s do more of those!

The Zen of Python, by Tim Peters

As Tim Peters suggests, namespaces aren’t just great. They’re honking great, and Python uses them extensively. In a Python program, there are four types of namespaces:

  1. Built-In
  2. Global
  3. Enclosing
  4. Local

These have differing lifetimes. As Python executes a program, it creates namespaces as necessary and deletes them when they’re no longer needed. Typically, many namespaces will exist at any given time.

The Built-In Namespace

The built-in namespace contains the names of all of Python’s built-in objects. These are available at all times when Python is running. You can list the objects in the built-in namespace with the following command:

>>> dir(__builtins__)
['ArithmeticError', 'AssertionError', 'AttributeError',
 'BaseException','BlockingIOError', 'BrokenPipeError', 'BufferError',
 'BytesWarning', 'ChildProcessError', 'ConnectionAbortedError',
 'ConnectionError', 'ConnectionRefusedError', 'ConnectionResetError',
 'DeprecationWarning', 'EOFError', 'Ellipsis', 'EnvironmentError',
 'Exception', 'False', 'FileExistsError', 'FileNotFoundError',
 'FloatingPointError', 'FutureWarning', 'GeneratorExit', 'IOError',
 'ImportError', 'ImportWarning', 'IndentationError', 'IndexError',
 'InterruptedError', 'IsADirectoryError', 'KeyError', 'KeyboardInterrupt',
 'LookupError', 'MemoryError', 'ModuleNotFoundError', 'NameError', 'None',
 'NotADirectoryError', 'NotImplemented', 'NotImplementedError', 'OSError',
 'OverflowError', 'PendingDeprecationWarning', 'PermissionError',
 'ProcessLookupError', 'RecursionError', 'ReferenceError', 'ResourceWarning',
 'RuntimeError', 'RuntimeWarning', 'StopAsyncIteration', 'StopIteration',
 'SyntaxError', 'SyntaxWarning', 'SystemError', 'SystemExit', 'TabError',
 'TimeoutError', 'True', 'TypeError', 'UnboundLocalError',
 'UnicodeDecodeError', 'UnicodeEncodeError', 'UnicodeError',
 'UnicodeTranslateError', 'UnicodeWarning', 'UserWarning', 'ValueError',
 'Warning', 'ZeroDivisionError', '_', '__build_class__', '__debug__',
 '__doc__', '__import__', '__loader__', '__name__', '__package__',
 '__spec__', 'abs', 'all', 'any', 'ascii', 'bin', 'bool', 'bytearray',
 'bytes', 'callable', 'chr', 'classmethod', 'compile', 'complex',
 'copyright', 'credits', 'delattr', 'dict', 'dir', 'divmod', 'enumerate',
 'eval', 'exec', 'exit', 'filter', 'float', 'format', 'frozenset',
 'getattr', 'globals', 'hasattr', 'hash', 'help', 'hex', 'id', 'input',
 'int', 'isinstance', 'issubclass', 'iter', 'len', 'license', 'list',
 'locals', 'map', 'max', 'memoryview', 'min', 'next', 'object', 'oct',
 'open', 'ord', 'pow', 'print', 'property', 'quit', 'range', 'repr',
 'reversed', 'round', 'set', 'setattr', 'slice', 'sorted', 'staticmethod',
 'str', 'sum', 'super', 'tuple', 'type', 'vars', 'zip']

You’ll see some objects here that you may recognize from previous tutorials—for example, the StopIteration exception, built-in functions like max() and len(), and object types like int and str.

The Python interpreter creates the built-in namespace when it starts up. This namespace remains in existence until the interpreter terminates.

The Global Namespace

The global namespace contains any names defined at the level of the main program. Python creates the global namespace when the main program body starts, and it remains in existence until the interpreter terminates.

Strictly speaking, this may not be the only global namespace that exists. The interpreter also creates a global namespace for any module that your program loads with the import statement.

You’ll explore modules in more detail in a future tutorial in this series. For the moment, when you see the term global namespace, think of the one belonging to the main program.

The Local and Enclosing Namespaces

As you learned in the previous tutorial on functions, the interpreter creates a new namespace whenever a function executes. That namespace is local to the function and remains in existence until the function terminates.
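
As a small sketch of that behavior (the function describe and its variable names are invented for illustration), each call below gets its own fresh local namespace, which the built-in locals() exposes as a dictionary:

```python
def describe(x):
    # Hypothetical function: returns a snapshot of its own local namespace.
    y = x * 2
    return dict(locals())

# Each call creates a new local namespace with its own x and y.
first = describe(1)
second = describe(10)

print(first)   # {'x': 1, 'y': 2}
print(second)  # {'x': 10, 'y': 20}
```

The two snapshots are independent, and neither x nor y exists once the call has returned.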

Read the full article at https://realpython.com/python-namespaces-scope/ »



from Planet Python
via read more

Janusworx: A Hundred Days of Code, Day 021 - Swing and a miss

Only did about an hour of distracted work and exercises today.
I’ll still count it though.

Tomorrow is another day :)



from Planet Python
via read more

Django Weblog: Django Developers Community Survey 2020

We're conducting a seventeen-question survey to assess how the community feels about the current Django development process. This was last done in 2015.

Please take a few minutes to complete the 2020 survey. Your feedback will help guide future efforts.



from Planet Python
via read more

PyCharm: PyCharm 2020.2 Out Now!

Complete the full Pull Request workflow, quickly catch exceptions, and apply project-wide refactorings. All without leaving your IDE. Download the new version now, or upgrade from within PyCharm.

New in PyCharm

  • New pull request dedicated view: You no longer need to switch between the browser and your IDE to manage your GitHub Pull Request workflow. Do it all in PyCharm!
  • Smart in-editor exceptions preview: Don’t spend time browsing your code after exceptions. PyCharm now automatically finds it for you and displays a preview of the problem directly in your editor.
  • In-place signature-change refactoring: Simply add, remove, or edit your method signature in-place and use context actions (Alt+Enter) or the new gutter-icon to preview the changes and apply the refactoring.
  • Support for Django configuration constants completion in settings.py: Stop typing the same Django configuration variables in settings.py over and over again. Speed up your flow and let PyCharm autocomplete documented Django settings for you.

These and a lot more!
Read about them all on our What’s New page or check the release notes.



from Planet Python
via read more

Stack Abuse: Deep Learning in Keras - Data Preprocessing

Introduction

Deep learning is one of the most interesting and promising areas of artificial intelligence (AI) and machine learning currently. With great advances in technology and algorithms in recent years, deep learning has opened the door to a new era of AI applications.

In many of these applications, deep learning algorithms have performed on par with human experts and have sometimes surpassed them.

Python has become the go-to language for Machine Learning and many of the most popular and powerful deep learning libraries and frameworks like TensorFlow, Keras, and PyTorch are built in Python.

In this series, we'll be using Keras to perform Exploratory Data Analysis (EDA), Data Preprocessing and finally, build a Deep Learning Model and evaluate it.

If you haven't already, check out our first article - Deep Learning Models in Keras - Exploratory Data Analysis (EDA).

Data Preprocessing

In the preprocessing stage, we'll prepare the data to be fed to the Keras model. The first step is clearing the dataset of null values. Then, we'll use one-hot encoding to convert categorical variables to numerical variables. Neural Nets work with numerical data, not categorical.

We'll also split the data into a training and testing set. Finally, we'll scale the data/standardize it so that each variable has a mean of 0 and a standard deviation of 1. This standardization helps the model train better and converge more easily.

Dealing with Missing Values

Let's find out the number and percentage of missing values in each variable in the dataset:

missing_values = pd.DataFrame({
    'Column': df.columns.values,
    '# of missing values': df.isna().sum().values,
    '% of missing values': 100 * df.isna().sum().values / len(df),
})

missing_values = missing_values[missing_values['# of missing values'] > 0]
print(missing_values.sort_values(by='# of missing values', 
                                 ascending=False
                                ).reset_index(drop=True))

This code will produce the following table which shows us variables that contain missing values and how many missing values they contain:

    Column          # of missing values  % of missing values
0   Pool QC         2917                 99.5563
1   Misc Feature    2824                 96.3823
2   Alley           2732                 93.2423
3   Fence           2358                 80.4778
4   Fireplace Qu    1422                 48.5324
5   Lot Frontage    490                  16.7235
6   Garage Cond     159                  5.42662
7   Garage Qual     159                  5.42662
8   Garage Finish   159                  5.42662
9   Garage Yr Blt   159                  5.42662
10  Garage Type     157                  5.35836
11  Bsmt Exposure   83                   2.83276
12  BsmtFin Type 2  81                   2.76451
13  BsmtFin Type 1  80                   2.73038
14  Bsmt Qual       80                   2.73038
15  Bsmt Cond       80                   2.73038
16  Mas Vnr Area    23                   0.784983
17  Mas Vnr Type    23                   0.784983
18  Bsmt Half Bath  2                    0.0682594
19  Bsmt Full Bath  2                    0.0682594
20  Total Bsmt SF   1                    0.0341297

Since Pool QC, Misc Feature, Alley, Fence, and Fireplace Qu variables contain a high percentage of missing values as shown in the table, we will simply remove them as they probably won't affect the results much at all:

df.drop(['Pool QC', 'Misc Feature', 'Alley', 'Fence', 'Fireplace Qu'], 
        axis=1, inplace=True)

For other variables that contain missing values, we will replace these missing values depending on the data type of the variable: whether it is numerical or categorical.

If it is numerical, we will replace missing values with the variable mean. If it is categorical, we will replace the missing values with the variable mode. This fills the gaps in a neutral way, without the bias that arbitrary placeholder values would introduce.

To know which variables are numerical and which are categorical, we will print out 5 unique items for each of the variables that contain missing values using this code:

cols_with_missing_values = df.columns[df.isna().sum() > 0]
for col in cols_with_missing_values:
    print(col)
    print(df[col].unique()[:5])
    print('*'*30)

And we get the following results:

Lot Frontage
[141.  80.  81.  93.  74.]
******************************
Mas Vnr Type
['Stone' 'None' 'BrkFace' nan 'BrkCmn']
******************************
...

Let's replace the missing values of the numerical variables with the mean:

num_with_missing = ['Lot Frontage', 'Mas Vnr Area', 'BsmtFin SF 1', 'BsmtFin SF 2', 
                    'Bsmt Unf SF', 'Total Bsmt SF', 'Bsmt Full Bath', 'Bsmt Half Bath', 
                    'Garage Yr Blt', 'Garage Cars', 'Garage Area']

for n_col in num_with_missing:
    df[n_col] = df[n_col].fillna(df[n_col].mean())

Here, we just put them all in a list and assigned new values to them. Next, let's replace missing values for categorical variables:

cat_with_missing = [x for x in cols_with_missing_values if x not in num_with_missing]

for c_col in cat_with_missing:
    df[c_col] = df[c_col].fillna(df[c_col].mode().to_numpy()[0])

After this step, our dataset will have no missing values in it.
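
To see both imputation rules in isolation, here is a small sketch on a made-up DataFrame (the toy frame and its area and kind columns are assumptions for illustration, not part of the housing dataset):

```python
import numpy as np
import pandas as pd

# Toy data, made up for illustration; not the housing dataset.
toy = pd.DataFrame({
    "area": [100.0, np.nan, 300.0],
    "kind": ["brick", "brick", np.nan],
})

# Numerical column: fill gaps with the column mean (mean() skips NaN).
toy["area"] = toy["area"].fillna(toy["area"].mean())

# Categorical column: fill gaps with the most frequent value.
toy["kind"] = toy["kind"].fillna(toy["kind"].mode()[0])

print(toy["area"].tolist())  # [100.0, 200.0, 300.0]
print(toy["kind"].tolist())  # ['brick', 'brick', 'brick']
```

The missing area becomes 200.0, the mean of the two known values, and the missing kind becomes the most common category.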

One-Hot Encoding of Categorical Variables

Keras models, like all machine learning models, fundamentally work with numerical data. Categorical data has no meaning to a computer, but it does to us. We need to convert these categorical variables into numerical representations in order for the dataset to be usable.

The technique that we will use to do that conversion is One-Hot Encoding. Pandas provides us with a simple way to automatically perform One-Hot encoding on all categorical variables in the data.

Before that though, we must ensure that no categorical variable in our data is represented as a numerical variable by accident.

Checking Variables Data Types

When we read a CSV dataset using Pandas as we did, Pandas automatically tries to determine the type of each variable in the dataset.

Sometimes, Pandas can determine this incorrectly - if a categorical variable is represented with numbers, it can wrongfully infer that it's a numerical variable.

Let's check if there are any data type discrepancies in the DataFrame:

data_types = pd.DataFrame({
    'Column': df.select_dtypes(exclude='object').columns.values,
    'Data type': df.select_dtypes(exclude='object').dtypes.values
})

print(data_types)

    Column          Data type
0   MS SubClass     int64
1   Lot Frontage    float64
2   Lot Area        int64
3   Overall Qual    int64
4   Overall Cond    int64
5   Year Built      int64
6   Year Remod/Add  int64

Based on this table and the variable descriptions from Kaggle, we can notice which variables were falsely considered numerical by Pandas.

For example, MS SubClass was detected as a numerical variable with a data type of int64. However, based on the description of this variable, it specifies the type of the unit being sold.

If we take a look at the unique values of this variable:

df['MS SubClass'].unique().tolist()

We get this output:

[20, 60, 120, 50, 85, 160, 80, 30, 90, 190, 45, 70, 75, 40, 180, 150]

This variable represents different unit types as numbers like 20 (one story dwellings built in 1946 and newer), 60 (2 story dwellings built in 1946 and newer), etc.

This actually isn't a numerical variable but a categorical one. Let's convert it back into a categorical variable by reassigning it as a string:

df['MS SubClass'] = df['MS SubClass'].astype(str)
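
The same situation can be reproduced on a tiny made-up frame. In this sketch, unit_code and lot_area are hypothetical column names; the point is only that a column of integer codes is inferred as numeric until it is explicitly cast to strings:

```python
import pandas as pd

# Toy data: unit_code holds category codes that merely look like numbers.
toy = pd.DataFrame({
    "unit_code": [20, 60, 20],
    "lot_area": [8450, 9600, 11250],
})

# Pandas infers an integer dtype for the codes.
inferred = str(toy["unit_code"].dtype)

# Casting to str marks the column as non-numeric.
toy["unit_code"] = toy["unit_code"].astype(str)

# Only the genuinely numerical column is still selected as numeric.
numeric_cols = toy.select_dtypes(include="number").columns.tolist()

print(inferred)
print(numeric_cols)  # ['lot_area']
```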

Performing One-Hot Encoding

Before performing One-Hot Encoding, we want to select a subset of the features from our data to use from now on. We'll want to do so because our dataset contains 2,930 records and 75 features.

Many of these features are categorical. So if we keep all the features and perform One-Hot Encoding, the resulting number of features will be large and the model might suffer from the curse of dimensionality as a result.

Let's make a list of the variables we want to keep in a subset and trim the DataFrame so we only use these:

selected_vars = ['MS SubClass', 'MS Zoning', 'Lot Frontage', 'Lot Area',
                 'Neighborhood', 'Overall Qual', 'Overall Cond',
                 'Year Built', 'Total Bsmt SF', '1st Flr SF', '2nd Flr SF',
                 'Gr Liv Area', 'Full Bath', 'Half Bath', 'Bedroom AbvGr', 
                 'Kitchen AbvGr', 'TotRms AbvGrd', 'Garage Area', 
                 'Pool Area', 'SalePrice']

df = df[selected_vars]

Now we can perform One-Hot Encoding easily by using Pandas' get_dummies() function:

df = pd.get_dummies(df)

After one-hot encoding, the dataset will have 67 variables. Here are the first few rows, truncated to the first few columns:

   Lot Frontage  Lot Area  Overall Qual  Overall Cond  Year Built  Total Bsmt SF  1st Flr SF  2nd Flr SF  Gr Liv Area
0  141           31770     6             5             1960        1080           1656        0           1656
1  80            11622     5             6             1961        882            896         0           896
2  81            14267     6             6             1958        1329           1329        0           1329
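
To make the effect of get_dummies() concrete, here is a minimal sketch on a toy frame (the zoning and lot_area columns and their values are made up for illustration). Numerical columns pass through unchanged, while each category of a text column becomes its own indicator column:

```python
import pandas as pd

# Toy data, made up for illustration.
toy = pd.DataFrame({
    "zoning": ["RL", "RH", "RL"],
    "lot_area": [8450, 9600, 11250],
})

# Text columns are expanded into one indicator column per category.
encoded = pd.get_dummies(toy)

print(encoded.columns.tolist())  # ['lot_area', 'zoning_RH', 'zoning_RL']

# Indicator columns may be boolean or 0/1 depending on the pandas version.
print(encoded["zoning_RL"].astype(int).tolist())  # [1, 0, 1]
```

One column with two categories became two columns, which is why the number of variables grows quickly when many categorical features are kept.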

Splitting Data into Training and Testing Sets

One of the last steps in data preprocessing is to split the data into a training and a testing subset. We'll be training the model on the training subset, and evaluating it with an unseen test set.

We will split the data randomly so that the training set will have 80% of the data and the testing set will have 20% of the data. Typically, the training set holds 70-80% of the data, while the remaining 20-30% is held out for evaluation.

This is made really simple with Pandas' sample() and drop() functions:

train_df = df.sample(frac=0.8, random_state=9)
test_df = df.drop(train_df.index)

Now train_df holds our training data and test_df holds our testing data.

Next, we will store the target variable SalePrice separately for each of the training and testing sets:

train_labels = train_df.pop('SalePrice')
test_labels = test_df.pop('SalePrice')

We're removing the SalePrice value because, well, we want to predict it. There's no point predicting something we already know and have fed to the model. We'll be using the actual values to verify if our predictions are correct.

After this step, train_df will contain the predictor variables of our training data (i.e. all variables excluding the target variable), and train_labels will contain the target variable values for train_df. The same applies to test_df and test_labels.

We perform this operation to prepare for the next step of data scaling.

Note that Pandas' pop() function returns the specified column (in our case, SalePrice) and removes that column from the dataframe (train_df, for example).
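
The split-then-pop pattern can be tried on a toy frame first. This is a sketch with made-up data (ten rows, hypothetical area and price columns), using the same sample(), drop(), and pop() calls:

```python
import pandas as pd

# Toy data: 10 rows, with price playing the role of the target.
toy = pd.DataFrame({"area": range(10), "price": range(100, 110)})

# 80% of the rows, chosen at random, form the training set.
train = toy.sample(frac=0.8, random_state=9)

# The rows that were not sampled form the testing set.
test = toy.drop(train.index)

# pop() returns the target column and removes it from the frame.
train_labels = train.pop("price")
test_labels = test.pop("price")

print(len(train), len(test))      # 8 2
print("price" in train.columns)   # False
```

Because the testing set is built by dropping the sampled index, no row can end up in both sets.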

At the end of this step, here are the number of records (rows) and features (columns) for each of train_df and test_df:

Set       Number of records  Number of features
train_df  2344               67
test_df   586                67

Moreover, train_labels has 2,344 labels for the 2,344 records of train_df and test_labels has 586 labels for the 586 records in test_df.

Without preprocessing this data, we would have a much messier dataset to work with.

Data Scaling: Standardization

Finally, we will standardize each variable - except the target variable, of course - in our data.

For training data which is stored now in train_df, we will calculate the mean and standard deviation of each variable. After that, we will subtract the mean from the values of each variable and then divide the resulting values by the standard deviation.

For testing data, we will subtract the training data mean from the values of each variable and then divide the resulting values by the training data standard deviation.

If you'd like to read up on Calculating Mean, Median and Mode in Python or Calculating Variance and Standard Deviation in Python, we've got you covered!

We use values calculated from the training data because of a general principle: anything the model learns must be learned from its training data. Everything in the test dataset should be completely unknown to the model before testing.

Let's perform the standardization now:

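# Names of the predictor columns (identical in train_df and test_df)
predictor_vars = train_df.columns

for col in predictor_vars:
    # Calculate the variable's mean and std from the training data only
    col_mean = train_df[col].mean()
    col_std = train_df[col].std()
    # Guard against constant columns, whose std is 0
    if col_std == 0:
        col_std = 1e-20
    # Scale both sets with the training statistics
    train_df[col] = (train_df[col] - col_mean) / col_std
    test_df[col] = (test_df[col] - col_mean) / col_std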

In this code, we first get the names of the predictor variables in our data. These names are the same for training and testing sets because these two sets contain the same variables but different data values.

Then for each predictor variable, we calculate the mean and standard deviation using the training data (train_df), subtract the calculated mean and divide by the calculated standard deviation.

Note that the standard deviation is sometimes equal to 0 for a variable, which happens when all of its training values are identical. In that case, we set the standard deviation to an extremely small amount, because dividing by 0 would fill the column with NaN or infinite values instead of usable numbers.
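As an alternative sketch, the same standardization can be written without an explicit loop. This is not the article's code: the `train` and `test` dataframes are toy data, and the version below swaps the 1e-20 placeholder for 1.0, on the assumption that a constant training column should simply map to 0 without inflating any test value that happens to differ from it.

```python
import pandas as pd

# Toy data: "const" has zero variance in the training set.
train = pd.DataFrame({"a": [1.0, 2.0, 3.0], "const": [5.0, 5.0, 5.0]})
test = pd.DataFrame({"a": [2.0, 4.0], "const": [5.0, 6.0]})

# Statistics come from the training data only.
means = train.mean()
stds = train.std()  # sample standard deviation (ddof=1), as in the loop above

# Swapped-in choice: replace zero standard deviations with 1.0 instead of 1e-20.
safe_stds = stds.replace(0, 1.0)

# Column-wise arithmetic: pandas aligns the Series index with the column names.
train_scaled = (train - means) / safe_stds
test_scaled = (test - means) / safe_stds

print(train_scaled)
print(test_scaled)
```

With a divisor of 1e-20, the test value 6.0 in the `const` column would be scaled to roughly 1e20; with 1.0 it becomes 1.0. Which behaviour is preferable depends on how the downstream model should treat such columns, and dropping constant columns altogether is another option.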

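This nets us standardized data: each predictor variable in the training set now has a mean of 0 and a standard deviation of 1. Individual values are not confined to a fixed range, and outliers can fall well outside -1 and 1.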
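A small check shows what standardization does and does not guarantee. The series below is made-up data with one deliberate outlier; after scaling, its mean is 0 and its standard deviation is 1, yet the largest value lies above 1.

```python
import pandas as pd

# Hypothetical values with one large outlier.
values = pd.Series([1.0, 2.0, 3.0, 4.0, 100.0])

# Standardize with the series' own mean and sample standard deviation.
scaled = (values - values.mean()) / values.std()

print(round(scaled.mean(), 6), round(scaled.std(), 6))
print(scaled.max() > 1)  # the outlier ends up outside the interval [-1, 1]
```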

With that done, our dataset is ready to be used to train and evaluate a model. We'll be building a deep neural network in the next article.

Conclusion

Data preprocessing is a crucial step in a Machine Learning pipeline. Without dropping certain variables, dealing with missing values, encoding categorical values and standardizing the data, we'd be feeding a messy (or unusable) dataset into a model.

The model will only be as good as the data we feed it, and in this article we've prepped a dataset to fit a model.




Andre Roberge: HackInScience: friendly Python learning

A short while ago I discovered HackInScience, a fantastic site for learning Python by doing exercises. It currently includes 68 programming exercises of increasing difficulty.
I learned about it via an issue filed for Friendly-traceback: yes, HackInScience does use Friendly-traceback to provide feedback to users when their code raises Python exceptions. These real-life experiences have resulted in additional cases being covered by Friendly-traceback: there are now 128 different test cases, each providing a more helpful explanation of what went wrong than the one offered by Python. Python versions 3.6 to 3.9 inclusive are supported.

Previously, I thought I would get feedback about missing cases from teachers or beginners using either Mu or Thonny - both of which can make use of Friendly-traceback. However, this has not been the case yet, and this makes me extremely grateful for the feedback received from HackInScience.

While Friendly-traceback can provide feedback in either English or French [1], HackInScience only uses the English version, in spite of the fact that it was created by four French programmers. I suspect that it is only a matter of time until they make a French version of their site.

One excellent additional feature provided by HackInScience is the addition of formatting (including some colour) in the output provided by Friendly-traceback.



The additional cases provided by Julien Palard from HackInScience have motivated me to clear out the accumulated backlog of test cases I had identified on my own. Now, there is only one (new) issue: enabling coloured output from Friendly-traceback's console.

Please, feel free to interrupt my work on this new issue by submitting new cases that are not covered by Friendly-traceback! ;-)

[1] Anyone interested in providing translations in other languages is definitely welcome!


TestDriven.io: Working with Static and Media Files in Django

This article looks at how to work with static and media files in a Django project, locally and in production.