Sunday, May 31, 2020

PSF GSoC students blogs: Weekly Check-in #01

Hey all!! I'm Aghin Shah, a 3rd Year CS undergrad from IIT-Madras. I'll be working with DFFML, a sub-org under Python Software Foundation during GSoC on Implementing Distributed Orchestrator and Adding DataFlow tutorials.

What did I do this week?

I worked on setting up a DataFlow for continuous deployment of Docker containers. With the new flow, you can push your changes to GitHub, and it'll automatically pull and redeploy the defined containers. I've also been working on adding additional features to the CLI command for creating the DataFlow. We had meetings in the community (on Tuesdays and Fridays), where everyone was updated on the changes in the codebase. I also had an individual meeting with the mentor, where we discussed the possible ways to go about the project.

What is coming up next?

I'll be finishing patches for a couple of issues which I've been working on. I'll also start working on adding basic tutorials for DataFlow.

Did you get stuck anywhere?

There were a few places where I was confused, but my mentor is very active, so most of it was cleared by the day.

from Planet Python
via read more

PSF GSoC students blogs: Weekly Check-in #01 (Week #01)

>>> import Introduction
>>> Introduction.display()
Hello World! My name is Saksham Arora. I'm a 2nd year undergraduate student from India pursuing B. Tech in Information Technology. This is my blog for GSoC 2020 @ PSF!

Over the summer, I'll be working with DFFML under the umbrella of Python Software Foundation. My project for the summer is to Integrate Image Processing into DFFML!

GSoC 2020 Weekly Check-in #01 (Week #01 - 01/06/2020)

What did you do this week?

Since it was the last week of the Community Bonding period, I worked on finishing a few pending issues assigned to me, and researched and brushed up on important topics related to my GSoC project. In the community bonding period, we had virtual meetings (also called Weekly Syncs) twice a week where I interacted with the mentors to discuss new features and enhancements for the project. Also, I went through a few videos and documentation on asynchronous functions in Python, which were recommended by one of the mentors as a part of understanding the codebase better!

What is coming up next?

I will be adding the capability to normalize the MNIST dataset and pre-process any image provided for prediction on the dataset using the CLI before feeding it to a machine learning model. I will be discussing with the mentors on the best approach to get started on my project and start working on wrapping the OpenCV library with DFFML this week!

Did you get stuck anywhere?

I briefly got stuck on a unit testing error where it was trying to create a test class out of a decorator function, which it shouldn't be doing. I was eventually able to figure it out after thoroughly going through the importing section of the Python documentation.

I'm very excited to get started on this journey. I hope everyone does great this summer!
Thank you for reading!

PSF GSoC students blogs: Week 1 check-in

Hello

Welcome to my blog. I am participating in this year's GSoC program for Panda3D, a sub-organization under PSF. Today is the start of the coding period. It's 7:00 am here in India and I am starting this memorable day by writing my first blog on this forum. I have been assigned the task of integrating the Recast & Detour tools into the Panda3D game engine. Already excited by the project idea, I started playing with the tools of Panda3D during the community bonding period. I went through a lot of blogs and articles about "recastnavigation", which is the GitHub repository that provides the Recast and Detour tools. This was pretty much what I did in the previous month, but now the actual coding period starts. I plan to start by planning the classes and functions required to bring Recast into the Panda3D world.

So yeah, I am very much excited to let you guys know how all of it unfolds.

Let's start the journey!

PSF GSoC students blogs: Weekly Check In - 0

Hello, I am Aditya Kumar. I will be contributing to Scrapy during GSoC'20. This is my first blog of the series.

What did I do till now?

I had two meetings with my mentors to discuss the project goals and deadlines.
I looked into how various libraries implement an HTTP/2 client to get a better picture.

What's coming up next?

Next week, I would work on implementing a simple HTTP/2 Client which can handle GET, POST & HEAD requests.

Did I get stuck anywhere?

Last week, I was mainly working on tested code functioning as tutorials. So I didn't come across any bugs.

PSF GSoC students blogs: test

PSF GSoC students blogs: Weekly Check-In #1 - Community Bonding ( 4th May - 31st May )

Hi, I am Arnav Kapoor, a 3rd year undergraduate student from IIIT-Hyderabad, and I will be working with the Scrapinghub sub-org this summer. The project goal is to create a number-parser library to parse numbers in natural language and incorporate the same with existing libraries.

What did you do this week ?
The community bonding phase mostly involved researching more into the existing solutions, understanding the pros and cons of each. I also got to know the mentors and we have set up weekly meetings for the duration of the program.

Did you get stuck anywhere ?
No there weren't any hurdles as such.

What is coming up next ?
Begin coding and face the challenges as and when they come.

Codementor: How I learnt Django

This is a quick summary of how I learnt Django and tips on how you as a beginner can get through the learning process also.

Python Morsels: Duck Typing

If it looks like a duck and quacks like a duck

Duck typing is the idea that instead of checking the type of something in Python, we tend to check what behavior it supports (often by attempting to use the behavior and catching an exception if it doesn't work).

For example, we might test whether something is an integer by attempting to convert it to an integer:

try:
    x = int(input("Enter your favorite integer: "))
except ValueError:
    print("Uh oh. That's not a number. Try again.")
else:
    print(f"I like the number {x}")

We say "if it looks like a duck and quacks like a duck, then we consider it a duck". We don't have to check the duck's DNA to see whether it's a duck, we just observe its behavior.

This concept is deeply embedded in Python, so much so that it's really all over the place. In Python we very often assume the behavior of objects instead of checking the types of those objects.

How's the water?

Duck typing is so prevalent in Python that it's like water to a fish: we don't even think about it.

Duck typing is a description we use for the way Python sees just about every part of Python.

Let's try to get a grasp for what duck typing is by looking at a number of examples of duck typing.

Duck typing by example

In Python we use words like sequence, iterable, callable, and mapping to describe the behavior of an object rather than describing its type (type meaning the class of an object, which you can get from the built-in type function).

Behavior-oriented words are important because duck typing is all about behavior: we don't care what an object is, we care what it can do.

The below duck typing examples focus on the following behavior-driven nouns:

Sequence
Iterable
Callable
Mapping
File-like object
Context manager
Iterator
Decorator

Sequences: is it like a list?

Sequences consist of two main behaviors: they have a length and they can be indexed from 0 up until one less than the length of the sequence. They can also be looped over.

Strings, tuples, and lists are all sequences:

>>> s = "hello"
>>> t = (1, 2, 3)
>>> l = ['a', 'b', 'c']
>>> s[0]
'h'
>>> t[0]
1
>>> l[0]
'a'

Strings and tuples are immutable sequences (we can't change them) and lists are mutable sequences.

Sequences typically have a few more behaviors though. They can usually be indexed in reverse with negative indexes, they can be sliced, they can usually be compared for equality with other sequences of the same type, and they usually have an index and count method.

If you're trying to invent your own sequence, I'd look into the collections.abc.Sequence and collections.abc.MutableSequence abstract base classes and consider inheriting from them.
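As a rough sketch of what inheriting from collections.abc.Sequence buys you, here is a small made-up class (Countdown is a hypothetical name, not something from the standard library). It only defines __len__ and __getitem__, and the abstract base class fills in looping, containment checks, index, count, and reversed:

```python
from collections.abc import Sequence


class Countdown(Sequence):
    """Hypothetical read-only sequence counting down from start to 1."""

    def __init__(self, start):
        self.start = start

    def __len__(self):
        return self.start

    def __getitem__(self, index):
        # Only integer indexes are handled (no slices), to keep this short
        if isinstance(index, slice):
            raise TypeError("slicing is not supported in this sketch")
        if index < 0:
            index += len(self)
        if not 0 <= index < len(self):
            raise IndexError("Countdown index out of range")
        return self.start - index


c = Countdown(3)
print(list(c))            # [3, 2, 1]
print(c[-1])              # 1
print(2 in c)             # True
print(c.index(2))         # 1
print(list(reversed(c)))  # [1, 2, 3]
```

Slicing is deliberately left out here; a fuller sequence would probably want to support it.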

Iterables: can we loop over it?

Iterables are a more general notion than sequences. Anything that you can loop over with a for loop is an iterable. Put another way anything you're able to iterate over is an iter-able.

Lists, strings, tuples, sets, dictionaries, files, generators, range objects, zip objects, enumerate objects, and many other things in Python are iterables.

Callables: is it a function?

If you can put parentheses after something to call it, it's a callable. Functions are callables and classes are callables. Anything with a __call__ method is also a callable.

You can think of callables as function-like things. Many of the built-in functions are actually classes. But we call them functions because they're callable, which is the one behavior that functions have, so they may as well be functions.
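To make the __call__ point concrete, here is a minimal sketch with a made-up Multiplier class (the name and behavior are just for illustration). Its instances aren't functions, but they can be called and passed around like functions:

```python
class Multiplier:
    """Hypothetical class whose instances can be called like functions."""

    def __init__(self, factor):
        self.factor = factor

    def __call__(self, value):
        return value * self.factor


double = Multiplier(2)
print(double(21))                   # 42
print(callable(double))             # True
print(callable(len), callable(int), callable("hello"))  # True True False
print(list(map(double, [1, 2, 3])))  # [2, 4, 6]
```

The built-in callable function is one way to ask "can I put parentheses after this?" without caring what type the object is.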

For more on callables see my article, Is it a class or a function? It's a callable!

Mappings: is it a dictionary?

We use the word "mapping" in Python to refer to dictionary-like objects.

You might wonder, what is a dictionary-like object? It depends on what you mean by that question.

If you mean "can you assign key/value pairs to it using the [...] syntax" then all you need is __getitem__/__setitem__/__delitem__ methods:

>>> class A:
...     def __getitem__(self, key):
...         return self.__dict__[key]
...     def __setitem__(self, key, value):
...         self.__dict__[key] = value
...     def __delitem__(self, key):
...         del self.__dict__[key]
...
>>> a = A()
>>> a['a'] = 4
>>> a['a']
4

If instead you mean "does it work with the ** syntax" then you'll need a keys method and a __getitem__ method:

>>> class A:
...     def keys(self):
...         return ['a', 'b', 'c']
...     def __getitem__(self, key):
...         return 4
...
>>> {**A()}
{'a': 4, 'b': 4, 'c': 4}

I'd recommend taking guidance from the collections.abc.Mapping and collections.abc.MutableMapping abstract base classes to help guide your thinking on what belongs in a "mapping".
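Here's a small sketch of leaning on collections.abc.Mapping (Squares is a hypothetical class invented for this example). It defines only __getitem__, __iter__, and __len__, and the abstract base class supplies keys, items, values, get, and containment checks:

```python
from collections.abc import Mapping


class Squares(Mapping):
    """Hypothetical read-only mapping of the numbers 1..n to their squares."""

    def __init__(self, n):
        self.n = n

    def __getitem__(self, key):
        if isinstance(key, int) and 1 <= key <= self.n:
            return key ** 2
        raise KeyError(key)

    def __iter__(self):
        return iter(range(1, self.n + 1))

    def __len__(self):
        return self.n


squares = Squares(3)
print(squares[2])                  # 4
print(list(squares.items()))       # [(1, 1), (2, 4), (3, 9)]
print(squares.get(10, "missing"))  # missing
print({**squares})                 # {1: 1, 2: 4, 3: 9}
```

Because Mapping provides a keys method, this object also works with the ** syntax discussed above.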

Files and file-like objects

You can get file objects in Python by using the built-in open function which will open a file and return a file object for working with that file.

Is sys.stdout a file? It has a write method like files do, as well as writable and readable methods which return True and False (as they should with write-only files).

What about io.StringIO? StringIO objects are basically in-memory files. They implement all the methods that files are supposed to have but they just store their "contents" inside the current Python process (they don't write anything to disk). So they "quack like a file".

The gzip.open function in the gzip module also returns file-like objects. These objects have all the methods that files have, except they do a bit of compressing or decompressing when reading/writing data to gzipped files.

Files are a great example of duck typing in Python. If you can make an object that acts like a file (often by inheriting from one of the abstract classes in the io module) then from Python's perspective, your object "is" a file.
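A quick sketch of why this matters in practice: the write_report function below (a made-up helper, not from any library) only assumes its file argument has a write method, so it happily accepts sys.stdout or an in-memory io.StringIO object.

```python
import io
import sys


def write_report(lines, file):
    """Write each line to any object that has a write method."""
    for line in lines:
        file.write(line + "\n")


lines = ["first", "second"]

write_report(lines, sys.stdout)  # the terminal, via a file-like object

buffer = io.StringIO()           # an in-memory "file"
write_report(lines, buffer)
print(repr(buffer.getvalue()))   # 'first\nsecond\n'
```

This is also a handy trick for testing code that writes to files: pass in a StringIO object and inspect what was written.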

Context managers

A context manager is any object that works with Python's with block, like this:

with my_context_manager:
    pass  # do something here

When the with block is entered, the __enter__ method will be called on the context manager object and when the block is exited, the __exit__ method will be called.

File objects are an example of this.

>>> with open('my_file.txt') as f:
...     print(f.closed)
...
False
>>> print(f.closed)
True

We can use the file object we get back from that open call in a with block, which means it must have __enter__ and __exit__ methods:

>>> f = open('my_file.txt')
>>> f.__enter__()
>>> f.closed
False
>>> f.__exit__()
>>> f.closed
True

Python practices duck typing in its with blocks. The Python interpreter doesn't check the type of the objects used in a with block: it only checks whether they implement __enter__ and __exit__ methods. Any class with a __enter__ method and a __exit__ method works in a with block.
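As a minimal sketch, here's a made-up Tracker class that isn't related to files at all. It works in a with block purely because it has __enter__ and __exit__ methods:

```python
class Tracker:
    """Hypothetical context manager that records when it is entered/exited."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("enter")
        return self  # this is what "as" binds to

    def __exit__(self, exc_type, exc_value, traceback):
        self.events.append("exit")
        return False  # don't suppress exceptions raised in the block


with Tracker() as tracker:
    tracker.events.append("inside")

print(tracker.events)  # ['enter', 'inside', 'exit']
```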

For more on context managers, see the context managers page.

Iterators

Iterators are objects which have a __iter__ method that returns self (making them an iterable) and a __next__ method which returns the next item within them.

This is duck typing again. Python doesn't care about the types of iterators, just whether they have these two methods.
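Here's a sketch of a hand-written iterator (CountUpTo is a hypothetical class for illustration). It has the two methods described above and nothing else, and that's enough for next, list, and for loops to work with it:

```python
class CountUpTo:
    """Hypothetical iterator that gives 1, 2, ..., stop."""

    def __init__(self, stop):
        self.stop = stop
        self.current = 0

    def __iter__(self):
        return self  # iterators return themselves from __iter__

    def __next__(self):
        if self.current >= self.stop:
            raise StopIteration
        self.current += 1
        return self.current


counter = CountUpTo(3)
print(iter(counter) is counter)  # True
print(next(counter))             # 1
print(list(counter))             # [2, 3]
print(list(counter))             # [] (the iterator is now exhausted)
```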

Decorators

Python's decorator syntax is all about duck typing too.

Usually I describe decorators as "functions that accept functions and return functions". More generally, a decorator is a callable which accepts a function (or class) and returns another object.

This:

@my_decorator
def my_function():
    print("Hi")

Is the same as this:

def my_function():
    print("Hi")

my_function = my_decorator(my_function)

Which means any function you pass another function to and get something back can be used with that @-based decorator syntax.

Even silly things like this, which replaces my_silly_function by a string:

>>> @str
... def my_silly_function():
...     print("I'm silly")
...
>>> my_silly_function
'<function my_silly_function at 0x7f525ae2ebf8>'

Other examples

This idea of caring about behavior over types is all over the place.

The built-in sum function accepts any iterable of things it can add together. It works with anything that supports the + sign, even things like lists and tuples:

>>> sum([(1, 2), (3, 4)], ())
(1, 2, 3, 4)

The string join method also works with any iterable of strings, not just lists of strings:

>>> words = ["words", "in", "a", "list"]
>>> numbers = [1, 2, 3, 4]
>>> generator_of_strings = (str(n) for n in numbers)
>>> " ".join(words)
'words in a list'
>>> ", ".join(generator_of_strings)
'1, 2, 3, 4'

The built-in zip and enumerate functions accept any iterable (not just lists or sequences, any iterable)!

>>> list(zip([1, 2, 3], (4, 5, 6), range(3)))
[(1, 4, 0), (2, 5, 1), (3, 6, 2)]

The csv.reader class works on all file-like objects, but it also works on any iterable that will give back delimited rows of lines as we loop over it (so it'd even accept a list of strings):

>>> rows = ['a,b,c', '1,2,3']
>>> import csv
>>> list(csv.reader(rows))
[['a', 'b', 'c'], ['1', '2', '3']]

Dunder methods are for duck typing

Duck typing is all about behaviors. What behavior does this object support? Does it quack like a duck would? Does it walk like a duck would?

Dunder methods ("double underscore methods") are the way that class creators customize instances of their class to support certain behaviors supported by Python.

For example the __add__ (aka "dunder add") method is what makes something "support addition":

>>> class Thing:
...     def __init__(self, value):
...         self.value = value
...     def __add__(self, other):
...         return Thing(self.value + other.value)
...
>>> thing = Thing(4) + Thing(5)
>>> thing.value
9

Dunder methods are all about duck typing.

Where duck typing isn't

So duck typing is just about everywhere in Python. Is there anywhere we don't see duck typing?

Yes!

Python's exception handling relies on strict type checking. If we want our exception to "be a ValueError", we have to inherit from the ValueError type:

>>> class MyError(ValueError):
...     pass
...
>>> try:
...     raise MyError("Example error being raised")
... except ValueError:
...     print("A value error was caught!")
...
A value error was caught!

Dunder methods also often rely on strict type checking. Typically methods like __add__ will return NotImplemented if given an object type it doesn't know how to work with, which signals to Python that it should try other ways of adding that object (calling __radd__ on the right-hand object for example). So a better implementation of the Thing class above would be:

class Thing:
    def __init__(self, value):
        self.value = value
    def __add__(self, other):
        if not isinstance(other, Thing):
            return NotImplemented
        return Thing(self.value + other.value)

This will ensure an appropriate error is raised for objects that Thing doesn't know how to add itself to. Instead of this:

>>> Thing(4) + 5
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 5, in __add__
AttributeError: 'int' object has no attribute 'value'

We'll get this:

>>> Thing(4) + 5
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'Thing' and 'int'
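To sketch the "try other ways of adding" part: when __add__ returns NotImplemented, Python gives the right-hand object a chance through __radd__. The Wrapper class below is made up for this example; it isn't a Thing, but it knows how to be added to one.

```python
class Thing:
    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        if not isinstance(other, Thing):
            return NotImplemented
        return Thing(self.value + other.value)


class Wrapper:
    """Hypothetical class that knows how to add itself to a Thing."""

    def __init__(self, value):
        self.value = value

    def __radd__(self, other):
        # Called for `other + self` after other.__add__ returned NotImplemented
        if not isinstance(other, Thing):
            return NotImplemented
        return Thing(other.value + self.value)


result = Thing(4) + Wrapper(5)
print(result.value)  # 9
```

With the first version of Thing (the one without the isinstance check), this particular example would happen to work only because Wrapper also has a value attribute; Wrapper's __radd__ method would never be consulted.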

Many functions in Python also require strings and "string" is defined as "an object which inherits from the str class".

For example the string join method accepts iterables of strings, not just iterables of any type of object:

>>> class MyString:
...     def __init__(self, value):
...         self.value = str(value)
...
>>> ", ".join([MyString(4), MyString(5)])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: sequence item 0: expected str instance, MyString found

If MyString inherits from str instead, this code will work:

>>> class MyString(str):
...     pass
...
>>> ", ".join([MyString(4), MyString(5)])
'4, 5'

So duck typing is all over the place in Python, but there are definitely places where strict type checking is used.

If you see isinstance or issubclass used in code, someone is not practicing duck typing. That's not necessarily bad, but it's rare.

Python programmers tend to practice duck typing in most of our Python code and only rarely rely on strict type checking.

Okay but what's the point?

If duck typing is everywhere, what's the point of knowing about it?

This is mostly about mindset. If you've already been embracing duck typing without knowing the term, that's great. If not, I'd consider asking yourself these questions while writing Python code:

Could the function I'm writing accept a more general kind of object (a less specialized duck) than the one I'm expecting? For example could I accept an iterable instead of assuming I'm getting a sequence?
Does the return type of my function have to be the type I'm using or would a different type work just as well or even better?
What does the function I'm calling expect me to pass it? Does it have to be a list/file/etc or could it be something else that might be more convenient (might require less type conversions from me)?
Am I type-checking where I shouldn't be? Could I check for (or assume) behavior instead?

Python Morsels: The Iterator Protocol

Iterators are all over the place in Python. You can often get away without knowing and understanding the word "iterator", but understanding this term will help you understand how you can expect various iterator-powered utilities in Python to actually work.

Iterables

From our perspective as Python programmers, an iterable is anything that you can loop over.

Python's definition of an iterable is much simpler though.

From Python's perspective, an iterable is anything that you can pass to the built-in iter function without having a TypeError being raised.

So numbers and booleans are not iterables:

>>> iter(4)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'int' object is not iterable
>>> iter(True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'bool' object is not iterable

But strings and lists are iterables:

>>> iter("hello")
<str_iterator object at 0x7f0ecaaa0dc0>
>>> iter([1, 2, 3])
<list_iterator object at 0x7f0ecaa74a00>

When you pass an iterable to the built-in iter function, an iterator will be returned.
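That definition translates fairly directly into code. Here's a sketch of a helper (is_iterable is a made-up name, not a built-in) that asks the same question Python does: does iter accept this object?

```python
def is_iterable(obj):
    """Return True if iter() accepts obj, mirroring Python's own definition."""
    try:
        iter(obj)
    except TypeError:
        return False
    return True


for value in [4, True, "hello", [1, 2, 3], {"a": 1}, range(3)]:
    print(repr(value), is_iterable(value))
```

The first two values should be reported as not iterable and the rest as iterable.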

Iterators

An iterator is the thing you get when you pass any iterable to the iter function:

>>> iter({1, 2, 3})
<set_iterator object at 0x7f0ecaa19a40>
>>> iter((1, 2, 3))
<tuple_iterator object at 0x7f0ecaa74a00>

Once you have an iterator, you can call next on it to repeatedly get the next item from it:

>>> s = "Hi!"
>>> i = iter(s)
>>> next(i)
'H'
>>> next(i)
'i'
>>> next(i)
'!'
>>> next(i)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

Iterators are consumed as you ask them for items. Once there are no more items left in an iterator, calling next on it will raise a StopIteration exception. Iterators that have been fully consumed are sometimes called exhausted.

Iterators are iterables

The strangest fact about iterators is that they are also iterables.

Remember that from Python's perspective an iterable is something that you can pass to the iter function to get an iterator from it.

When you pass an iterator to the iter function it'll return itself back:

>>> s = "Hi!"
>>> i = iter(s)
>>> i
<str_iterator object at 0x7f0ecaaa0dc0>
>>> j = iter(i)
>>> j
<str_iterator object at 0x7f0ecaaa0dc0>
>>> i is j
True
>>> next(i)
'H'
>>> next(j)
'i'
>>> next(i)
'!'
>>> next(j)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

The Iterator Protocol

Python's iterator protocol boils down to the terms iterable and iterator:

An iterable is anything that you can get an iterator from using iter
An iterator is an iterable that you can loop over using next

Along with a few rules that dictate how iterables and iterators work:

An iterator is "exhausted" (completed) if calling next raises a StopIteration exception
When you use iter on an iterator, you'll get the same iterator back
Not all iterators can be exhausted (they can keep giving next values forever if they want)

How for loops work

The iterator protocol is how Python's for loops work under the hood.

Python's for loops do not rely on indexes. They rely on iterators.

We can use the rules of the iterator protocol to re-implement a for loop using a while loop, essentially recreating the work that Python does whenever it evaluates a for loop.

This function:

def print_each(iterable):
    for item in iterable:
        print(item)

Is equivalent to this function:

def print_each(iterable):
    iterator = iter(iterable)
    while True:
        try:
            item = next(iterator)
        except StopIteration:
            break  # Iterator exhausted: stop the loop
        else:
            print(item)

You can see that the while loop will go on forever unless the iterator we got from the input iterable ends (and StopIteration is raised). It is possible to make infinitely long iterables, so it's possible this loop will go on forever.
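For a feel of what an endless iterable looks like, here's a small sketch using itertools.count from the standard library, which never raises StopIteration. Anything that loops over it needs its own way to stop:

```python
from itertools import count, islice

evens = count(0, 2)  # 0, 2, 4, ... with no end

# list(evens) would never finish; islice stops after the first five items
first_five = list(islice(evens, 5))
print(first_five)  # [0, 2, 4, 6, 8]

# A for loop over an endless iterator needs its own exit condition
rest = []
for n in evens:
    if n > 14:
        break
    rest.append(n)
print(rest)  # [10, 12, 14]
```

Note that the second loop picks up where islice left off, because the iterator was partially consumed by then.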

All looping is iterator-powered

Iterators power for loops but they also power many other forms of iteration over iterables.

Comprehensions rely on the iterator protocol:

>>> numbers = [1, 2, 3]
>>> [n**2 for n in numbers]
[1, 4, 9]

So does tuple unpacking:

>>> a, b, c = numbers
>>> a
1
>>> c
3

And iterable unpacking when calling a function:

>>> print(*numbers)
1 2 3

Iterators are everywhere

Iterators are all over the place in Python.

For example the built-in enumerate, zip, and reversed functions all return iterators.

>>> enumerate("hey")
<enumerate object at 0x7f016721ca00>
>>> reversed("hey")
<reversed object at 0x7f01672da250>
>>> zip("hey", (4, 5, 6))
<zip object at 0x7f016721cb80>

You can test whether an iterable is an iterator by seeing whether it works with the next function (you'll get a TypeError for non-iterators):

>>> next(enumerate("hey"))
(0, 'h')

Or by calling iter on it and seeing whether it returns itself:

>>> z = zip("hey", (4, 5, 6))
>>> iter(z)
<zip object at 0x7f016721cd00>
>>> iter(z) is z
True

Files (opened in read mode) are also iterators in Python:

>>> f = open('my_file.txt', mode='wt')
>>> f.write('This is line 1\nThis is line 2\nThis is the end\n')
46
>>> f.close()
>>> f = open('my_file.txt', mode='rt')
>>> next(f)
'This is line 1\n'
>>> list(f)
['This is line 2\n', 'This is the end\n']

Making your own iterators

We can make our own iterators by making generator functions or generator expressions.

Generators allow us to practice lazy looping, which is a technique for wrapping iterators around iterators and delaying the data processing work on your iterators until the very last moment.
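Here's a rough sketch of what that delaying looks like (squares is a made-up generator function for this example). The print call inside it shows when the work actually happens:

```python
def squares(numbers):
    """Generator function: computes one square at a time, on demand."""
    for n in numbers:
        print(f"squaring {n}")
        yield n ** 2


numbers = [1, 2, 3]
lazy = squares(numbers)          # nothing is printed yet: no work is done
doubled = (s * 2 for s in lazy)  # a generator expression wrapped around it

first = next(doubled)            # only now is "squaring 1" printed
print(first)                     # 2
remaining = list(doubled)        # the remaining squares are computed here
print(remaining)                 # [8, 18]
```

Both lazy and doubled are iterators, so they follow all the rules of the iterator protocol described above.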

If you're interested in lazy looping you might want to start with:

There are also a lot of Python Morsels exercises on lazy looping and working with iterators. I recommend signing up to Python Morsels to get regular hands-on experience working with iterators.

Saturday, May 30, 2020

Test and Code: 115: Catching up with Nina Zakharenko

One of the great things about attending in person coding conferences, such as PyCon, is the hallway track, where you can catch up with people you haven't seen for possibly a year, or maybe even the first time you've met in person.

Nina is starting something like the hallway track, online, on twitch, and it's already going, so check out the first episode of Python Tea.

Interesting coincidence is that this episode is kind of like a hallway track discussion between Nina and Brian.

We've had Nina on the show a couple times before, but it's been a while.

In 2018, we talked about Mentoring on episode 44.
In 2019, we talked about giving Memorable Tech Talks in episode 71.

In this episode, we catch up with Nina, find out what she's doing, and talk about a bunch of stuff, including:

Live Coding
Online Conferences
Microsoft Python team
Python Tea, an online hallway track
Q&A with Python for VS Code team
Python on hardware
Adafruit
Device Simulator Express
CircuitPython
Tricking out your command prompt
Zsh and Oh My Zsh
Emacs vs vi key bindings for shells
Working from home

Special Guest: Nina Zakharenko.

Codementor: Splitwise Telegram Bot

SplitwizeBot is a chat based bot to list, create and settle the expenses of Splitwise application from within Telegram 🤖

PSF GSoC students blogs: Weekly Blog #1

Welcome to my GSoC Blog!!!

Hello everyone, this is Soham Biswas, currently in my 2nd year pursuing a Bachelor's (B.Tech) degree in Computer Science & Engineering at the Institute of Engineering & Management, Kolkata. I have been selected for GSoC'20 at sub-org FURY under the umbrella organisation of the Python Software Foundation. I will be working on building sci-fi-like 2D and 3D interfaces and providing physics engine integration under the project titled "Create new UI widgets & Physics Engine Integration".

What did you do during the Community Bonding Period?

Due to the pandemic outbreak and the countrywide lockdown in India, many places including my university were closed, and therefore I decided to get a head start and begin the project early. During the community bonding period, we had video conference meetings with our mentors and the project's core team. We interacted with each other and discussed the implementation details and their respective deadlines for the entire event period. We will be having such meetings every week on Wednesday in order to update ourselves about the progress of our respective tasks.

I completed the remaining Pull Requests that I had pending before the GSoC students announcement. I also reviewed other issues and pull requests to make sure everything remains up-to-date.

What is coming up next week?

Currently, I am focusing on building the ComboBox2D UI element. I will try to get the skeleton model, required sub-components and their respective default callbacks done by next week.

Did you get stuck anywhere?

While working on my previous PR related to Double Click callbacks, I faced an issue where I was unable to create and implement UserEvents properly in VTK. Thankfully, my mentors helped me out and I was able to implement double click callbacks for all three mouse buttons successfully.

See you next week, cheers!!

Weekly Python StackOverflow Report: (ccxxx) stackoverflow python report

These are the ten most rated questions at Stack Overflow last week.
Between brackets: [question score / answers count]
Build date: 2020-05-30 12:45:10 GMT

Codementor: How to handle bulk data insertion SQLite + python

SQLite Python: Inserting Data

PSF GSoC students blogs: GSoC Blog : Week 1

Hey everyone!
This is my blog for this summer’s GSoC @ PSF
I am Lenix Lobo, an undergraduate student from India, and this summer I will be working with project Fury under the umbrella of the Python Software foundation.

What did you do during the community bonding period?

Since most places, including my university, are closed due to the pandemic outbreak, I decided to get a head start and begin the project early. During the community bonding period, I had video conference meetings with my mentors scheduled every week on Wednesday. During these meetings I interacted with the mentors to get a coherent understanding of how the project design and implementation will be managed over the course of the entire period.
Since my project involves a lot of theoretical understanding of concepts such as ray marching, I spent the period going through the theory of each topic. This week also involved going through the documentation for shaders used in VTK.

What is coming up next ?
The next task assigned to me is to go through the theory of geometry shaders and to create an example using the same.

Did you get stuck anywhere?
Since some of the documentation on VTK shaders was not up to date, I had to go through example implementations to understand the nomenclature of variables and their respective usage.

PSF GSoC students blogs: First Blog GSoC 2020

PSF GSoC students blogs: Weekly Check-in #1

Hi everyone, I am Abhay, a CSE undergrad from India. I got selected by the TERN sub-org to work with this summer. Tern is a container analysis tool.

You can read my blog post here to learn more about it.

This is the first of many blog posts to come. So let's get started.

What did you do this week?

I started working on creating a UI for the JSON reports generated by Tern. At first I used the json2html library, but that gave very ugly results :p. After that I decided to make a tree view for the JSON data. I started writing recursive functions to generate the tree-view HTML report. I've opened a PR and am making the changes requested by the mentors. I also had my first meeting with my mentors; they are very helpful and supportive.

Did you get stuck anywhere?

While writing the recursive code I struggled as the drop-down buttons didn't work. Turns out a <ul> tag cannot have another <ul> as its child. I found this with the help of an HTML validator tool.

What is coming up next?

I have another meeting with my mentors.
Make changes to the PR requested by mentors.
Start working on golang modules metadata extraction.

Thanks for reading my blog.

Friday, May 29, 2020

Brett Cannon: The many ways to pass code to Python from the terminal

For the Python extension for VS Code, I wrote a simple script for generating our changelog (think Towncrier, but simpler, Markdown-specific, and tailored to our needs). As part of our release process we have a step where you are supposed to run python news which points Python at the news directory in our repository. A co-worker the other day asked how that worked since everyone on my team knows to use -m (see my post on using -m with pip as to why)? That made me realize that other people probably don't know the myriad of ways you can point python at code to execute, hence this blog post.

Via `stdin` and piping

Since how you pipe things into a process is shell-specific, I'm not going to dive too deep into this. Needless to say, you can pipe code into Python.

echo "print('hi')" | python

Piping text into python

This obviously also works if you redirect a file into Python.

python < spam.py

Redirecting a file into python

Nothing really surprising here thanks to Python's UNIX heritage.

A string via `-c`

If you just need to quickly check something, passing the code in as a string on the command-line works.

python -c "print('hi')"

Using the -c flag with python

I personally use this when I need to check something that's only a line or two of code instead of launching the REPL.
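Both approaches so far can be reproduced from inside Python, which is handy for checking how they behave. The sketch below uses `subprocess` and `sys.executable` (my choice, not something the post relies on) to feed the same one-liner over stdin and via -c.

```python
import subprocess
import sys

code = "print('hi')"

# Equivalent of: echo "print('hi')" | python
piped = subprocess.run(
    [sys.executable], input=code, capture_output=True, text=True, check=True
)

# Equivalent of: python -c "print('hi')"
inline = subprocess.run(
    [sys.executable, "-c", code], capture_output=True, text=True, check=True
)

print(piped.stdout.strip(), inline.stdout.strip())
```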

A path to a file

Probably the most well-known way to pass code to python is via a file path.

python spam.py

Specifying a file path for python

The key thing to realize about this is the directory containing the file is put at the front of sys.path. This is so that all of your imports continue to work. But this is also why you can't/shouldn't pass in the path to a module that's contained within a package. Since sys.path won't have the directory that contains the package, all your imports will be relative to a different directory than you expect for your package.
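A small experiment makes the sys.path behaviour visible. This sketch writes a throwaway spam.py into a temporary directory (the directory layout is made up for the demo), runs it from a different working directory, and compares what the script reports as sys.path[0] with the directory holding the file.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    script_dir = Path(tmp, "scripts")
    script_dir.mkdir()
    script = script_dir / "spam.py"
    script.write_text("import sys; print(sys.path[0])")

    # Run the script from a different working directory on purpose.
    result = subprocess.run(
        [sys.executable, str(script)],
        cwd=tmp, capture_output=True, text=True, check=True,
    )
    reported = Path(result.stdout.strip())
    # Resolve both sides so symlinked temp directories compare equal.
    same_dir = reported.resolve() == script_dir.resolve()

print(same_dir)
```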

Using `-m` with packages

The proper way to execute a package is by using -m and specifying the package you want to run.

python -m spam

Using -m with python

This uses runpy under the hood. To make this work with your project all you need to do is specify a __main__.py inside your package and it will get executed as __main__. And the submodule can be imported like any other module, so you can test it and everything. I know some people like having a main submodule in their package and then make their __main__.py be:

from . import main

if __name__ == "__main__":
    main.main()

Personally, I don't bother with the separate main submodule and put all the relevant code directly in __main__.py as the module names feel redundant to me.
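To see the mechanism end to end, here is a sketch that builds a tiny throwaway package named spam in a temporary directory and runs it with -m. The package contents are invented for the demo; the working directory is set to the parent of the package so that the import can be resolved.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    package = Path(tmp, "spam")
    package.mkdir()
    (package / "__init__.py").write_text("")
    # __main__.py is the file that `python -m spam` executes.
    (package / "__main__.py").write_text("print(__name__, __package__)\n")

    result = subprocess.run(
        [sys.executable, "-m", "spam"],
        cwd=tmp, capture_output=True, text=True, check=True,
    )
    output = result.stdout.strip()

print(output)
```

The module runs under the name __main__ while still knowing which package it belongs to, which is what keeps relative imports such as `from . import main` working.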

A directory

Defining a __main__.py can extend to a directory as well. If you look at my example that instigated this blog post, python news works because the news directory has a __main__.py file. Python executes that like a file path. Now you might be asking, "why don't you just specify the file path then?" Well, it's honestly one less thing to know about a path. 😄 I could have just as easily written out instructions in our release process to run python news/announce.py, but there is no real reason to when this mechanism exists. Plus I can change the file name later on and no one would notice. Plus I knew the code was going to have ancillary files with it, so it made sense to put it in a directory versus as a single file on its own. And yes, I could have made it a package to use -m, but there was no point as the script is so simple I knew it was going to stay a single, self-contained file (it's less than 200 lines and the test module is about the same length).

Besides, the __main__.py file is extremely simple.

import runpy
# Change 'announce' to whatever module you want to run.
runpy.run_module('announce', run_name='__main__', alter_sys=True)

a __main__.py for when you point python at a directory

Now obviously there's having to deal with dependencies, but if your script just uses the stdlib or you place the dependencies next to the __main__.py then you are good to go!
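The directory layout described above can be tried out in isolation. In this sketch the announce.py body is a stand-in (the real script is not shown in this post); only the __main__.py matches the snippet above.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    news = Path(tmp, "news")
    news.mkdir()
    # Stand-in for the real changelog script.
    (news / "announce.py").write_text(
        "if __name__ == '__main__':\n"
        "    print('changelog generated')\n"
    )
    (news / "__main__.py").write_text(
        "import runpy\n"
        "runpy.run_module('announce', run_name='__main__', alter_sys=True)\n"
    )

    # Equivalent of: python news
    result = subprocess.run(
        [sys.executable, str(news)], capture_output=True, text=True, check=True
    )
    output = result.stdout.strip()

print(output)
```

Because the directory itself lands at the front of sys.path, `announce` is importable by its bare name without any packaging.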

Executing a zip file

When you do have multiple files and/or dependencies and you want to ship your code as a single unit, you can place it in a zip file with a __main__.py and Python will run that file on your behalf with the zip file placed on sys.path.

python app.pyz

Passing a zip file to python

Now traditionally people name such zip files with a .pyz file extension, but that's purely tradition and does not affect anything; you can just as easily use the .zip file extension.

To help facilitate creating such executable zip files, the stdlib has the zipapp module. It will generate the __main__.py for you and add a shebang line so you don't even need to specify python if you don't want to on UNIX. If you are wanting to move around a bunch of pure Python code it's a nice way to do it.
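As a rough illustration, zipapp can also be driven from Python via `zipapp.create_archive()`. The app module and its `main` function below are invented for the demo. Passing `main="app:main"` asks zipapp to generate the __main__.py, and the shebang line is only written because an `interpreter` is given.

```python
import subprocess
import sys
import tempfile
import zipapp
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp, "app")
    source.mkdir()
    (source / "app.py").write_text(
        "def main():\n"
        "    print('hello from a zip file')\n"
    )

    target = Path(tmp, "app.pyz")
    zipapp.create_archive(
        source,
        target,
        interpreter="/usr/bin/env python3",  # written as the shebang line
        main="app:main",                     # zipapp generates __main__.py
    )

    first_line = target.read_bytes().splitlines()[0]
    # Equivalent of: python app.pyz
    result = subprocess.run(
        [sys.executable, str(target)], capture_output=True, text=True, check=True
    )
    output = result.stdout.strip()

print(first_line, output)
```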

Unfortunately using a zip file like this only works when all the code the zip file contains is pure Python. Executing zip files as-is doesn't work for extension modules (this is why setuptools has a zip_safe flag). To load an extension module Python has to call the dlopen() function and it takes a file path which obviously doesn't work when that file path is contained within a zip file. I know at least one person who talked to the glibc team about adding support for passing in a memory buffer so Python could read an extension module into memory and pass that in, but if memory serves the glibc team didn't go for it.

But not all hope is lost! You can use a project like shiv which will bundle your code and then provide a __main__.py that will handle extracting the zip file, caching it, and then executing the code for you. While not as ideal as the pure Python solution, it does work and is about as elegant as one can get in this situation.

Talk Python to Me: #266 Refactoring your code, like magic with Sourcery

Refactoring your code is a fundamental step on the path to professional and maintainable software. We rarely have the perfect picture of what we need to build when we start writing code, and attempts to over-plan and over-design software often lead to analysis paralysis rather than ideal outcomes. Join me as I discuss refactoring with Brendan Maginnis and Nick Thapen as well as their tool, Sourcery, to automate refactoring in the popular Python editors.

Links from the show

Guests
Brendan Maginnis: @brendan_m6s (https://twitter.com/brendan_m6s)
Nick Thapen: @nthapen (https://twitter.com/nthapen)

Sourcery
Sourcery: sourcery.ai (https://sourcery.ai/)
Sourcery on Twitter: @sourceryai (https://twitter.com/sourceryai)
VS Code and PyCharm Plugins: sourcery.ai/editor (https://ift.tt/3gBl8ct)
GitHub Bot: sourcery.ai/github (https://ift.tt/2TTU7Hp)
For an instant demo ⭐ this repo, and Sourcery will refactor your most popular Python repo: github.com/sourcery-ai/sourcery (https://ift.tt/2Bh5yCH)
Python Refactorings article: sourcery.ai/blog (https://ift.tt/2YYXVL3)

Nuitka Talk Python episode: talkpython.fm (https://ift.tt/2vaI0JJ)
Nuitka site: github.com (https://ift.tt/2XfApIr)
Gilded Rose Kata: github.com (https://ift.tt/1gbobHB)

Sponsors
https://ift.tt/33twppf
https://ift.tt/3aBjB2k
Python Training (https://ift.tt/2PVc9qH)

Python Bytes: #183 Need a beautiful database editor? Look to the Bees!

Sponsored by DigitalOcean: pythonbytes.fm/digitalocean (http://pythonbytes.fm/digitalocean)

Special guest: Calvin Hendryx-Parker, @calvinhp (https://twitter.com/calvinhp)

Brian #1: fastpages: An easy to use blogging platform, with enhanced support for Jupyter Notebooks. (https://github.com/fastai/fastpages)

Uses GH actions to Jekyll blog posts on GitHub Pages.
Create posts with code, output of code, formatted text, directly from Jupyter Notebooks.
Altair interactive visualizations
Collapsible code cells that can be open or closed by default.
Metadata like title, summary, in special markdown cells.
Twitter cards and YouTube videos
Tags support
Support for pure markdown posts
And even MS Word docs for posts. (But really, don’t.)
Documentation and introduction written in fastpages itself, https://fastpages.fast.ai/

Michael #2: BeeKeeper Studio Open Source SQL Editor and Database Manager (https://www.beekeeperstudio.io)

Use Beekeeper Studio to query and manage your relational databases, like MySQL, Postgres, SQLite, and SQL Server.
Runs on all the things (Windows, Linux, macOS)
Features:
  Autocomplete SQL query editor with syntax highlighting
  Tabbed interface, so you can multitask
  Sort and filter table data to find just what you need
  Sensible keyboard-shortcuts
  Save queries for later
  Query run-history, so you can find that one query you got working 3 days ago
  Default dark theme
Connect: Alongside normal connections you can encrypt your connection with SSL, or tunnel through SSH. Save a connection password and Beekeeper Studio will make sure to encrypt it to keep it safe.
SQL Auto Completion: Built-in editor provides syntax highlighting and auto-complete suggestions for your tables so you can work quickly and easily.
Open Lots of Tabs: Open dozens of tabs so you can write multiple queries and tables in tandem without having to switch windows.
Save queries
View Table Data: Tables get their own tabs too! Use our table view to sort and filter results by column.

Calvin #3: 2nd Annual Python Web Conference (https://2020.pythonwebconf.com/)

The most in-depth Python conference for web developers
  Targeted at production users of Python
  Talks on Django, Flask, Twisted, Testing, SQLAlchemy, Containers, Deployment and more
June 17th-19th: One day of tutorials and two days of talks in 3 tracks
Keynote talks by
  Lorena Mesa
  Hynek Schlawack
  Russell Keith-Magee
  Steve Flanders
Fireside Chat with Carl Meyer about Instagram’s infrastructure, best practices
Participate in 40+ presentations and 6 tutorials
Fun will be had and connections made
  Virtual cocktails
  Online gaming
  Board game night
Tickets are $199 and $99 for Students
  As a bonus, for every Professional ticket purchased, we'll donate a ticket to an attendee in a developing country (https://unstats.un.org/unsd/methodology/m49/).
  As a Python Bytes listener you can get a 20% discount with the code PB20

Brian #4: Mimesis - Fake Data Generator (https://mimesis.name/)

“…helps generate big volumes of fake data for a variety of purposes in a variety of languages.”
Custom and generic data providers
>33 locales
Lots of locale dependent providers, like address, Food, Person, …
Locale independent providers.
Super fast. Benchmarking with 10k full names was like 60x faster than Faker (https://mimesis.name/foreword.html#advantages).
Data generation by schema. Very cool

>>> from mimesis.schema import Field, Schema
>>> _ = Field('en')
>>> description = (
...     lambda: {
...         'id': _('uuid'),
...         'name': _('text.word'),
...         'version': _('version', pre_release=True),
...         'timestamp': _('timestamp', posix=False),
...         'owner': {
...             'email': _('person.email', domains=['test.com'], key=str.lower),
...             'token': _('token_hex'),
...             'creator': _('full_name'),
...         },
...     }
... )
>>> schema = Schema(schema=description)
>>> schema.create(iterations=1)

Output:

[
  {
    "owner": {
      "email": "aisling2032@test.com",
      "token": "cc8450298958f8b95891d90200f189ef591cf2c27e66e5c8f362f839fcc01370",
      "creator": "Veronika Dyer"
    },
    "name": "widget",
    "version": "4.3.1-rc.5",
    "id": "33abf08a-77fd-1d78-86ae-04d88443d0e0",
    "timestamp": "2018-07-29T15:25:02Z"
  }
]

Michael #5: Schemathesis (https://github.com/kiwicom/schemathesis)

A tool for testing your web applications built with Open API / Swagger specifications.
Supported specification versions:
  Swagger 2.0
  Open API 3.0.x
Built with:
  hypothesis (https://hypothesis.works/)
  hypothesis_jsonschema (https://github.com/Zac-HD/hypothesis-jsonschema)
  pytest (http://pytest.org/en/latest/)
It reads the application schema and generates test cases which will ensure that your application is compliant with its schema.
Use: There are two basic ways to use Schemathesis:
  Command Line Interface (https://github.com/kiwicom/schemathesis#command-line-interface)
  Writing tests in Python (https://github.com/kiwicom/schemathesis#in-code)
CLI supports passing options to hypothesis.settings.
To speed up the testing process Schemathesis provides the -w/--workers option for concurrent test execution
If you'd like to test your web app (Flask or AioHTTP for example) then there is the --app option for you
Schemathesis CLI is also available as a docker image
Code example:

import requests
import schemathesis

schema = schemathesis.from_uri("http://0.0.0.0:8080/swagger.json")

@schema.parametrize()
def test_no_server_errors(case):
    # `requests` will make an appropriate call under the hood
    response = case.call()  # use `call_wsgi` if you used `schemathesis.from_wsgi`
    # You could use built-in checks
    case.validate_response(response)
    # Or assert the response manually
    assert response.status_code < 500

Calvin #6: Finding secrets by decompiling Python bytecode in public repositories (https://blog.jse.li/posts/pyc/)

Jesse’s initial research revealed that thousands of GitHub repositories contain secrets hidden inside their bytecode.
It has been common practice to store secrets in Python files that are typically ignored such as settings.py, config.py or secrets.py, but this is potentially insecure
Includes a nice crash course on Python byte code and cached source
This post comes with a small capture-the-flag style lab for you to try out this style of attack yourself.
  You can find it at https://github.com/veggiedefender/pyc-secret-lab/
Look through your repositories for loose .pyc files, and delete them
If you have .pyc files and they contain secrets, then revoke and rotate your secrets
Use a standard gitignore (https://github.com/github/gitignore/blob/master/Python.gitignore) to prevent checking in .pyc files
Use JSON files or environment variables for configuration

Extras:

Michael:
Python 3.9.0b1 Is Now Available for Testing (https://pycoders.com/link/4164/yrq2q8ogch)
Python 3.8.3 Is Now Available (https://pycoders.com/link/4141/yrq2q8ogch)
Ventilators and Python: Some particle physicists put some of their free time to design and build a low-cost ventilator for covid-19 patients for use in hospitals. https://ift.tt/3gCmLXe Search of the PDF for Python:
  "Target computing platform: Raspberry Pi 4 (any memory size), chosen as a trade-off between its computing power over power consumption ratio and its wide availability on the market; • Target operating: Raspbian version 2020-02-13; • Target programming language: Python 3.5; • Target PyQt5: version 5.11.3."
  "The MVM GUI is a Python3 software, written using the PyQt5 toolkit, that allows steering and monitoring the MVM equipment."

Brian:
Call for Volunteers! Python GitHub Migration Work Group (https://pyfound.blogspot.com/2020/05/call-for-volunteers-python-github.html)
  migration from bugs.python.org to GitHub

Calvin:
Learn Python Humble Bundle (https://www.humblebundle.com/books/learn-you-some-python-no-starch-press-books)
  Pay $15+ and get an amazing set of Python books to start learning at all levels
  Book Industry Charitable Foundation
  The No Starch Press Foundation

Joke: More O’Really book covers
https://ift.tt/2yOk4Ru
https://ift.tt/2Ap3HLK
https://ift.tt/3gDqVOu
https://ift.tt/3gzVJ2F

Red Hat Developers: Red Hat Software Collections 3.5 brings updates for Red Hat Enterprise Linux 7

Red Hat Software Collections 3.5 and Red Hat Developer Toolset 9.1 are now available for Red Hat Enterprise Linux 7. Here’s what that means for developers.

Red Hat Software Collections (RHSCL) is how we distribute the latest stable versions of various runtimes and languages through Red Hat Enterprise Linux (RHEL) 7, with some components available in RHEL 6. RHSCL also contains the Red Hat Developer Toolset, which is the set of tools we curate for C/C++ and Fortran. These components are supported for up to five years, which helps you build apps that have a long lifecycle as well.

What changed?

Updated collections in RHSCL 3.5 include:

Python 3.8, which introduces assignment expressions and several optimizations that make it run faster than previous versions, while staying compatible with previous versions to ease upgrade strategies.
Ruby 2.7, which offers a large number of new features such as pattern matching, Read-Eval-Print-Loop (REPL) improvements, and compaction garbage collection (GC) for fragmented memory spaces.
Perl 5.30, which adds new features for developers such as the limited variable-length lookbehinds, Unicode 12.1, faster string interpolation, and other performance improvements.
Apache httpd 2.4 (update), which fixes a number of bugs and includes an updated version of mod_md to support ACMEv2.
Varnish 6, which updates Varnish Cache to version 6.0.6, the latest bi-annual fresh release with numerous bug fixes and performance improvements.
Java Mission Control 7.1, which updates JDK Mission Control to version 7.1.1 and fixes numerous bugs. It also adds key enhancements, including multiple rule optimizations, a new JOverflow view based on Standard Widget Toolkit (SWT), a new flame graph view, and a new latency visualization using the High Dynamic Range (HDR) Histogram.
HAProxy 1.8.24, which provides multiple bug and security fixes.
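For readers who have not yet tried the assignment expressions mentioned in the Python 3.8 entry above, here is a minimal sketch. The helper name `first_version` is hypothetical and the example needs Python 3.8 or later.

```python
import re


def first_version(text):
    """Return the first X.Y version number found in text, or None."""
    # The := operator binds `match` and tests it in a single expression.
    if (match := re.search(r"\d+\.\d+", text)) is not None:
        return match.group()
    return None


print(first_version("Red Hat Software Collections 3.5"))  # 3.5
```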

The last, but certainly not least, update to Red Hat Software Collections 3.5 is Red Hat Developer Toolset (DTS) version 9.1, which is the set of tools we curate for C/C++ and Fortran. For DTS, we updated the compilers, debuggers, and performance monitoring tools to ensure the best experience for software developers using these languages. At the center of DTS 9.1 is GCC 9.3, which brings a huge number of improvements including improved diagnostics and usability. The full list of tools that we updated in DTS 9.1 is available in the release notes, as always.

How do I get this great stuff?

With a Red Hat Developer Subscription, you have access to Red Hat Enterprise Linux 7, where you can update these packages. If you have already enabled Red Hat Software Collections in the subscription manager, follow the instructions below for either a specific software collection or a container image. If you haven’t already enabled RHSCLs, please follow the instructions in our online documentation.

To install a specific software collection, type the following into your command line as root:

$ yum install software_collection…

Replace software_collection with a space-separated list of the software collections you want to install. For example, to install rh-php72 and rh-mariadb102, type as root:

$ yum install rh-php72 rh-mariadb102

Doing this installs the main meta-package for the selected software collection and a set of required packages as its dependencies. For information on how to install other packages such as additional modules, see Section 2.2.2, “Installing Optional Packages.”

Another option, of course, is to start with our container images for these packages, which make it easier to build and deploy mission-critical applications that use these components for Red Hat Enterprise Linux and Red Hat OpenShift platforms.

The full release notes for Red Hat Software Collections 3.5 and Red Hat Developer Toolset 9.1 are available in the customer portal.

What about Red Hat Enterprise Linux 8?

Software Collections are for Red Hat Enterprise Linux 7. Red Hat Enterprise Linux 8 is managed in a different way through Application Streams, and you can find updated RHEL 8 packages in the RHEL8 appstream repository. The updates for these packages might not be the same for Red Hat Enterprise Linux 8 Application Streams, so please check on the Application Streams Life Cycle page.

The post Red Hat Software Collections 3.5 brings updates for Red Hat Enterprise Linux 7 appeared first on Red Hat Developer.

PSF GSoC students blogs: Community Bonding Check-in

What did you do during this period?

I had an onboarding meeting with my mentors where we got to know each other a bit better. They advised me to play around with uarray and unumpy without any goal in mind, which I found to be very good advice. I played a bit with special methods by implementing a simple Vector2D class and used the code in this notebook with some print statements to better understand the protocols and how they are called. I wanted to get an early start on my project, so I took over a PR from one of my mentors which adds multimethods for the linalg module.

What is coming up next?

I'm going to continue the PR that I have been working on since it still isn't finished, and I will also follow the proposed timeline and start adding multimethods for other routines like checking class equality in array elements. Some mathematical constants and their aliases are also missing, so I will be adding these too and probably refactoring the existing ones into classes. This week marks the end of my college classes, but I still have some assignments and exams coming up in the following weeks, so there's a lot of work ahead of me to properly balance both university studies and GSoC, but I wouldn't have it any other way.

Did you get stuck anywhere?

I consider the PR that I started working on during this period to be a challenging one, since some mathematical intuition is needed to translate Linear Algebra routines into proper functions. Things like decomposing a matrix into eigenvalues and eigenvectors to calculate its n-th power are something I'm not too familiar with, especially in a programming context. That said, there hasn't been a roadblock for me up until now, and usually I can wrap my head around these concepts in half a day. It should be noted that mentor help plays a huge part in this, as they frequently give me very good advice. Although I usually think a lot before making a commit to be sure that what I'm pushing is correct, I notice that I still can't avoid some mistakes, even ones that should be obvious to me. I guess these mistakes are normal, and they are corrected soon after so no harm is done, but I'm training myself to make them less often.

EuroPython: EuroPython 2020: Schedule published

We are very excited to announce the first version of our EuroPython 2020 schedule:

EuroPython 2020 Schedule

More sessions than we ever dreamed of

After the 2nd CFP, we found that we had so many good talk submissions that we were able to open a fourth track. Together with the newly added slots for the Indian / Asian / Pacific and Americas time zones, we now have a fully packed program, with:

more than 110 sessions,
more than 110 speakers from around the world,
4 brilliant keynotes, two of which are already published,
2 exciting lightning talk blocks,
4 all-day tracks, with a whole track dedicated to data science topics,
a poster track, which we’ll announce next week,
a virtual social event,
an after party,
and lots of socializing on our conference platform.

We are proud to have reached almost the size of our in-person event with the online version of EuroPython 2020.

Never miss a talk

All talks will be made available to the attendees as live Webinars, with easy switching between tracks, as well as online streams, which will allow rewinding to watch talks you may have missed during the day.

Conference Tickets

Conference tickets are available on our registration page. We have simplified and greatly reduced the prices for the EuroPython 2020 online edition.

As always, all proceeds from the conference will go into our grants budget, which we use to fund financial aid for the next EuroPython edition, special workshops and other European conferences and projects:

EuroPython Society Grants Program

We hope to see lots of you at the conference in July. Rest assured that we’ll make this a great event again — even within the limitations of running the conference online.

Sprints

On Saturday and Sunday, we will have sprints/hackathons on a variety of topics. Registration of sprint topics has already started. If you would like to run a sprint, please add your sprint topic to the wiki page we have available for this:

EuroPython 2020 Sprints Listing

If registrations continue as they currently do, we will have a few hundred people waiting to participate in your sprint projects, so this is the perfect chance for you to promote your project and find new contributors.

Participation in the sprints is free, but does require registration. We will provide the necessary collaboration tools in the form of dedicated Jitsi or Zoom virtual rooms and text channels on our Discord server.

EuroPython is your conference

EuroPython has always been a completely volunteer-based effort. The organizers work hundreds of hours to make the event happen and will try very hard to create an inspiring and exciting event.

However, we can only provide the setting. You, as our attendees, are the ones who fill it with life and creativity.

We are very much looking forward to having you at the conference!

Enjoy,
–
EuroPython 2020 Team
https://ep2020.europython.eu/
https://www.europython-society.org/

Real Python: The Real Python Podcast – Episode #11: Advice on Getting Started With Testing in Python

Have you wanted to get started with testing in Python? Maybe you feel a little nervous about diving in deeper than just confirming your code runs. What are the tools needed and what would be the next steps to level up your Python testing? This week on the show we have Anthony Shaw to discuss his article on this subject. Anthony is a member of the Real Python team and has written several articles for the site.

Zato Blog: Backing up and restoring Zato Single Sign-On data

This article presents a procedure for backing up all of Zato Single Sign-On (SSO) data and restoring it later on.

A single Zato server with SQLite is used for simplicity, but the same principles hold regardless of the size of one's environment or the SQL database used.

Overview

There are two data sources that SSO uses:

Run-time information, such as users, groups, and all the other objects, is stored in an SQL database in tables prefixed with zato_sso_, e.g. zato_sso_user or zato_sso_group.
Encryption keys are kept in a file called secrets.conf; the same file is shared by all servers in a cluster.

Thus, to make a backup:

Connect to an existing server via SSH
Dump the SQL contents of SSO tables and related objects such as indexes
Make a copy of secrets.conf
Save everything in a safe place

Conversely, to restore a backup:

Load the backup from the safe place
Connect to a new Zato server via SSH
Move the contents of the SQL dump to the database
Replace the server's secrets.conf with the one copied over earlier during backup

Backing up SQL data

Assuming that a Zato server is in a directory called /home/zato/sso1/server, here is how to back up an SQLite database:

$ cd /home/zato/sso1/server
$ sqlite3 zato.db ".dump zato_sso_%" > zato-sso-backup.sql

This will create a file called zato-sso-backup.sql whose contents are the schema and rows of all the SSO objects.

To make it easier to restore it later, open the file and add the following commands right after "BEGIN TRANSACTION;"

DROP TABLE IF EXISTS zato_sso_group;
DROP TABLE IF EXISTS zato_sso_user;
DROP TABLE IF EXISTS zato_sso_user_group;
DROP TABLE IF EXISTS zato_sso_session;
DROP TABLE IF EXISTS zato_sso_attr;
DROP TABLE IF EXISTS zato_sso_linked_auth;

The idea with the DROP statements is that when you are restoring SSO from a backup, these tables, albeit empty, will already exist, so we can just drop them to silence any SQLite warnings.
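If you would rather not edit the dump by hand, the same step can be scripted. The following is only a sketch under assumptions: the helper name `add_drop_statements` is hypothetical, the one-table sample dump is not the real Zato schema, and an in-memory SQLite database stands in for a fresh server's zato.db.

```python
import sqlite3

SSO_TABLES = [
    "zato_sso_group",
    "zato_sso_user",
    "zato_sso_user_group",
    "zato_sso_session",
    "zato_sso_attr",
    "zato_sso_linked_auth",
]


def add_drop_statements(dump_sql, tables):
    """Insert DROP TABLE IF EXISTS lines right after BEGIN TRANSACTION;"""
    marker = "BEGIN TRANSACTION;"
    head, found, tail = dump_sql.partition(marker)
    if not found:
        raise ValueError("dump does not contain " + marker)
    drops = "\n".join(f"DROP TABLE IF EXISTS {name};" for name in tables)
    return f"{head}{marker}\n{drops}{tail}"


# Simplified stand-in for the contents of zato-sso-backup.sql.
sample_dump = (
    "PRAGMA foreign_keys=OFF;\n"
    "BEGIN TRANSACTION;\n"
    "CREATE TABLE zato_sso_user (id INTEGER);\n"
    "COMMIT;\n"
)
patched = add_drop_statements(sample_dump, SSO_TABLES)

# A new server already has the (empty) table, so a plain CREATE would clash.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE zato_sso_user (id INTEGER)")
conn.executescript(patched)
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
)]
print(tables)
```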

Backing up secrets.conf

Again, if the server is in /home/zato/sso1/server, the full path to secrets.conf is /home/zato/sso1/server/config/repo/secrets.conf - simply copy the whole file to a location of choice.

Just to confirm it, the contents should be akin to this:

[secret_keys]
key1=P8ViJZs8hM...

[zato]
well_known_data=gAAAAABe0LcDT...
server_conf.kvdb.password=gAAAAA...
server_conf.main.token=gAAAAABe0LcDPy...
server_conf.misc.jwt_secret=gAAAAABe0Lc...
server_conf.odb.password=gAAAAABe0LcD2MqLa...
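A quick sanity check of the copied file can be scripted too. This sketch assumes that secrets.conf is plain INI, as the excerpt above suggests, and that checking for the two section names is enough; the helper name `missing_sections` is hypothetical and the sample keeps the truncated values exactly as shown.

```python
import configparser

# Shortened sample in the same shape as the excerpt above.
sample = """\
[secret_keys]
key1=P8ViJZs8hM...

[zato]
well_known_data=gAAAAABe0LcDT...
server_conf.odb.password=gAAAAABe0LcD2MqLa...
"""


def missing_sections(text, required=("secret_keys", "zato")):
    """Return the required section names that are absent from the text."""
    # Interpolation is disabled so a '%' in an encrypted value is harmless.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return [name for name in required if not parser.has_section(name)]


print(missing_sections(sample))
```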

Creating a new server

We assume that a new server will be created in a directory named /home/zato/sso2/server.

Note that it should be a completely new instance in a new cluster. Do not start the server yet.

Restoring SQL data

Move the zato-sso-backup.sql file to /home/zato/sso2/server and run the commands below:

$ cd /home/zato/sso2/server
$ sqlite3 zato.db < zato-sso-backup.sql
$ echo $?
0

Exit code 0 should be printed, indicating a successful operation.
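Beyond the exit code, you may want to confirm that the SSO tables really are in the database. Here is a minimal sketch using Python's sqlite3 module; the in-memory database and its three tables are stand-ins for the restored zato.db, and the helper name `sso_tables` is hypothetical.

```python
import sqlite3


def sso_tables(conn):
    """List tables whose names start with the zato_sso_ prefix."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master "
        # '_' is a wildcard in LIKE, so it is escaped to match literally.
        "WHERE type = 'table' AND name LIKE 'zato!_sso!_%' ESCAPE '!' "
        "ORDER BY name"
    )
    return [name for (name,) in rows]


# Stand-in for sqlite3.connect("zato.db") on the new server.
conn = sqlite3.connect(":memory:")
conn.executescript(
    "CREATE TABLE zato_sso_user (id INTEGER);"
    "CREATE TABLE zato_sso_group (id INTEGER);"
    "CREATE TABLE unrelated (id INTEGER);"
)
print(sso_tables(conn))
```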

Restoring secrets.conf

The file backed up previously needs to be saved to /home/zato/sso2/server/config/repo/secrets.conf, as below:

$ cd /home/zato/sso2/server/config/repo
$ mv ./secrets.conf ./secrets.conf.bak # Just in case
$ cp /path/to/backup/secrets.conf .

Confirming it all

Now the server can be started and we can confirm that the SSO data can be accessed by logging in to the system as one of its users, as below (output was reformatted for clarity):

$ zato sso login /home/zato/sso2/server my.user
User logged in {
  'username': 'my.user',
  'user_id': 'zusr6htg...',
  'ust': 'gAAAAABe0M_Pf8cdBa6bimnjfVUt5CF...',
  'creation_time': '2020-05-29T09:03:11.459337',
  'expiration_time': '2020-05-29T10:03:11.459337',
  'has_w_about_to_exp': False
}
$

That concludes the process - the SSO data is now restored and the server can be fully used, just like the original one.

Stefan Scherfke: Attrs, Dataclasses and Pydantic

I’ve been using attrs for a long time now and I am really liking it. It is very flexible, has a nice API, is well documented and maintained, and has no runtime requirements.

The main idea behind attrs was to make writing classes with lots of data attributes (“data classes”) easier. Instead of this:

class Data:
    def __init__(self, spam, eggs):
        self.spam = spam
        self.eggs = eggs

It lets you write this:

@attr.s
class Data:
    spam = attr.ib()
    eggs = attr.ib()

Attrs also adds a nice string representation, comparison methods, optional validation and lots of other stuff to your classes, if you want to. You can also opt out of everything; attrs is very flexible.

Attrs became so popular that, since Python 3.7, we also have the dataclasses module in the standard library. It is predominantly inspired by attrs (the attrs team was involved in the design of data classes) but has a smaller feature set and will evolve a lot slower. But you can use it out of the box without adding a new requirement to your package.

Pydantic’s development roughly started during Python 3.7 development, too. Its main focus is on data validation, settings management and JSON (de)serialisation, therefore it is located at a higher level of abstraction. Out of the box, it will recursively validate and convert all data that you pour into your model:

>>> from datetime import datetime
>>> from pydantic import BaseModel
>>>
>>> class Child(BaseModel):
...     x: int
...     y: int
...     d: datetime
...
>>> class Parent(BaseModel):
...     name: str
...     child: Child
...
>>> data = {
...     'name': 'spam',
...     'child': {
...         'x': 23,
...         'y': '42',  # sic!
...         'd': '2020-05-04T13:37:00',
...     },
... }
>>> Parent(**data)
Parent(name='spam', child=Child(x=23, y=42, d=datetime.datetime(2020, 5, 4, 13, 37)))

I only learned about pydantic when I started to work with FastAPI. FastAPI is a fast, asynchronous web framework specifically designed for building REST APIs. It uses pydantic for schema definition and data validation.

Since then, I asked myself: Why not attrs? What’s the benefit of pydantic over the widely used and mature attrs? Can or should it replace attrs?

As I begin to write this article, I still don’t know the answer to these questions. So let’s explore attrs, data classes and pydantic!

Simple class definition

Originally, attrs classes were created by using the @attr.s() (or @attr.attrs()) class decorator. Fields had to be created via the attr.ib() (or attr.attrib()) factory function. By now, you can also create them nearly like data classes.

The recommended way for creating pydantic models is to subclass pydantic.BaseModel. This means that in contrast to data classes, all models inherit some “public” methods (e.g., for JSON serialization) which you need to be aware of. However, pydantic allows you to create stdlib data classes extended with validation, too.

Here are some very simple examples for data classes / models:

>>> import attr
>>> import dataclasses
>>> import pydantic
>>>
>>>
>>> # Simple data classes are supported by all libraries:
>>> @attr.dataclass
... # @dataclasses.dataclass
... # @pydantic.dataclasses.dataclass
... class Data:
...     name: str
...     value: float
...
>>> Data('Spam', 3.14)
Data(name='Spam', value=3.14)
>>>
>>>
>>> @attr.s
... class Data:
...     name = attr.ib()
...     value = attr.ib()
...
>>> Data('Spam', 3.14)
Data(name='Spam', value=3.14)
>>>
>>>
>>> class Data(pydantic.BaseModel):
...     name: str
...     value: float
...
>>> Data(name='Spam', value=3.14)
Data(name='Spam', value=3.14)

Pydantic models enforce keyword-only arguments when creating new instances. This is a bit tedious for classes with only a few attributes, but with larger models you’re likely going to use keyword arguments anyway. The benefit of kw-only arguments is that it doesn’t matter if you list attributes with a default before ones without a default.

Data classes support positional as well as keyword arguments. Passing values by position is very convenient for smaller classes but that also means that you must define all fields without a default value first and the ones with a default value afterwards. This may prevent you from grouping similar attributes, when only some of them have a default value.
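
The ordering rule for data classes can be illustrated with a small sketch using only the stdlib dataclasses module. The class names and the helper define_badly_ordered_class() are made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Point:
    x: float
    y: float
    label: str = ""  # fields with a default have to come last


# Positional and keyword arguments can be mixed freely
p = Point(1.0, y=2.0)


def define_badly_ordered_class():
    """Try to declare a field without a default after one with a default."""
    try:
        @dataclass
        class Bad:
            label: str = ""
            x: float  # no default, but follows a field that has one
    except TypeError as exc:
        return str(exc)  # the decorator rejects the class definition
    return None
```

The error is raised when the class is defined, not when it is instantiated, so a badly ordered class never gets as far as being used.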

Attrs supports both ways. The default is to allow positional and keyword arguments like data classes. You can enable kw-only arguments by passing kw_only=True to the class decorator.

Another major difference is that Pydantic always validates and converts all attribute values, but more on that later.

Class and attribute customisation

All three libraries let you customize the created fields as well as the class itself.

In data classes and attrs, you can customize your class by passing additional arguments to the class decorator. Pydantic models can define a nested Config class for the same purpose.

Attributes can be customized via special factory functions. Instead of specifying an attribute like this: name: type [= default], you do: name: type = field_factory(). This function is named [attr.]ib()/attrib() in attrs, field() with data classes and Field() in pydantic.

Using these functions, you can specify default values, validators, meta data and other attributes. The following tables let you compare the customisation features that each library provides:

Attribute / field settings
	attrs	data classes	pydantic
Explicit no default	`NOTHING` ¹	`MISSING` ¹	`...` ¹
Default factory	yes	yes	yes
Validators	yes ²	no	no ²,³
Constraints	no	no	const, regex, length, number range, …
Converters	yes ²	no	no ²,³
Exclude field from	repr, eq, order, hash, init	repr, compare, hash, init	ø
Add arbitrary metadata	yes	yes	yes
Additional docs	no	no	title, description

footnotes

¹ Passing no default is optional in attrs and data classes, but mandatory in pydantic.

² Validators and converters can also be defined as decorated methods of your class.

³ Pydantic always performs basic validation and conversion for the attribute’s data type (e.g., int('42')).

Class customisation and instantiation
	attrs	data classes	pydantic
Auto create methods for	str, repr, equality, ordering, hash, init	repr, equality, ordering, hash, init	str, repr, equality, init
Keyword args only	optional	no	yes
Faux immutability / Freezing	yes	yes	yes
Slots	yes	no	no
Safe to subclass exceptions	yes	no	no
Dynamic creation	yes	yes	yes
Instantiate from dict	yes	yes	yes, recursively
Instantiate from objects	no	no	optional, recursively
Instantiate from JSON	no	no	yes, recursively
Instantiate from env. vars.	no	no	yes, recursively

Generated methods

All libraries create useful “dunder” methods (like __init__() or __str__()). Attrs can generate the most methods, followed by data classes and Pydantic. Attrs and data classes also allow you to selectively disable the generation of certain methods.

Attrs is the only library that generates __slots__ and is also the only one that has explicit support for subclassing exceptions.
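
To make the generated methods a bit more tangible, here is a short sketch with stdlib data classes. It only shows the decorator switches for ordering, repr and equality; the class names are illustrative.

```python
from dataclasses import dataclass


@dataclass(order=True)  # additionally generates __lt__, __le__, __gt__, __ge__
class Version:
    major: int
    minor: int


@dataclass(repr=False, eq=False)  # opt out of __repr__ and __eq__
class Plain:
    value: int


# Generated comparison methods compare the fields like a tuple
is_older = Version(1, 2) < Version(1, 10)

# Without a generated __eq__, instances fall back to identity comparison
same = Plain(1) == Plain(1)
```

Here is_older is True because (1, 2) < (1, 10), while same is False because the two Plain instances are distinct objects.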

Default values

Without a field factory, default values for fields are simply assigned to the field name, e.g., value: int = 42. When you use a field factory, you can/need to pass a default value as argument to that function. In pydantic, the default value is always passed as first positional argument. In order to express “this attribute has no default”, you use the ellipsis literal (...). Data classes use the optional keyword argument default instead. Attrs lets you choose - you can pass a default value by position or as keyword argument.

Another difference is that pydantic allows you to use mutable objects like lists or dicts as default values. Attrs and data classes prohibit this for good reason. To prevent bugs with mutable defaults, pydantic deep-copies the default value for each new instance.

You can specify factories for default values with all libraries.
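
For data classes, a default factory and the rejection of mutable defaults look roughly like the sketch below. Basket and define_with_mutable_default() are illustrative names only.

```python
from dataclasses import dataclass, field


@dataclass
class Basket:
    owner: str
    # The factory is called once per instance, so lists are never shared
    items: list = field(default_factory=list)


def define_with_mutable_default():
    """Try to use a plain list as a default value in a data class."""
    try:
        @dataclass
        class Bad:
            items: list = []  # rejected: would be shared by all instances
    except ValueError as exc:
        return str(exc)
    return None
```

Two Basket instances each get their own list, while the attempt to use [] directly as a default fails as soon as the class is defined.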

Freezing and functional programming

You can create pseudo immutable classes with all libraries. Immutable/frozen instances prevent you from changing attribute values. This helps when you aim to program in a more functional style. However, if attributes themselves are mutable (like lists or dicts), you can still change these!

In attrs and data classes, you pass frozen=True to the class decorator. In pydantic, you set allow_mutation = False in the nested Config class.
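
The following sketch shows both halves of this with a stdlib data class: assigning to an attribute of a frozen instance fails, but a mutable attribute value can still be modified in place. The names Config and try_rename() are made up for the example.

```python
from dataclasses import FrozenInstanceError, dataclass, field


@dataclass(frozen=True)
class Config:
    name: str
    tags: list = field(default_factory=list)


cfg = Config("spam")


def try_rename(instance):
    """Return True if the attribute could be changed, False if it is frozen."""
    try:
        instance.name = "eggs"
    except FrozenInstanceError:
        return False
    return True


# Freezing only guards attribute assignment; the list itself stays mutable
cfg.tags.append("still-mutable")
```

If you need instances that cannot change at all, using immutable attribute types such as tuples is one way to get closer to that.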

Attrs and data classes only generate dunder protocol methods, so your classes are “clean”. Having struct-like, frozen instances makes it relatively easy to write purely functional code that can be more robust and easier to test than code with a lot of side effects.

Pydantic models, on the other hand, use inheritance and always have some methods, e.g., for converting an instance from or to JSON. This facilitates a more object-oriented programming style, which can be a bit more convenient in some situations.

Instantiation, validation and conversion

The main differentiating features of pydantic are its abilities to create, validate and serialize classes.

You can instantiate pydantic models not only from dicts/keyword arguments but also from other data classes (ORM mode), from environment variables, and raw JSON. Pydantic will then not only validate/convert basic data types but also more advanced types like datetimes. On top of that, it will recursively create nested model instances, as shown in the example above.

Model instances can directly be exported to dicts and JSON via the .dict()/.json() methods.

To achieve something similar in attrs or data classes, you need to install an extension package like, for example, cattrs. And even then, Pydantic has a far better user experience.

Apart from that, all libraries allow you to define custom validator and converter functions. You can either pass these functions to the field factories or define decorated methods in your class.
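
To give an idea of what recursive instantiation and export involve when you only have stdlib data classes, here is a hand-rolled sketch. It is neither how pydantic nor how cattrs works internally; from_dict() and to_json() are hypothetical helpers, and the sketch assumes that annotations are real types rather than strings.

```python
import dataclasses
import json
from datetime import datetime


@dataclasses.dataclass
class Child:
    x: int
    y: int
    d: datetime


@dataclasses.dataclass
class Parent:
    name: str
    child: Child


def from_dict(cls, data):
    """Build a data class instance from a dict, converting nested values."""
    kwargs = {}
    for f in dataclasses.fields(cls):
        value = data[f.name]
        if dataclasses.is_dataclass(f.type):
            value = from_dict(f.type, value)  # recurse into nested classes
        elif f.type is datetime:
            value = datetime.fromisoformat(value)
        else:
            value = f.type(value)  # basic conversion, e.g. int('42')
        kwargs[f.name] = value
    return cls(**kwargs)


def to_json(instance):
    """Serialize a data class instance, writing datetimes as ISO strings."""
    return json.dumps(dataclasses.asdict(instance), default=lambda o: o.isoformat())
```

Using the same input as in the pydantic example from the introduction, from_dict(Parent, data) yields a Parent with a nested Child whose y is the integer 42. Everything beyond that (error reporting, optional fields, containers) is what the libraries discussed here add on top.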

Metadata and schema generation

Pydantic can not only serialize model instances but also the schema of the model classes themselves. This is, for example, used by FastAPI to generate the OpenAPI spec for an API.

To aid the documentation of the generated schemas, every field can have a title and a description attribute. These are not used for docstrings, though.

Documentation

In a way, the documentation of all three projects mirrors their feature sets.

It is of high quality in all cases, but technically and in terms of content very different.

The data classes documentation is part of Python’s stdlib documentation and the briefest of all candidates, but it covers everything you need to know. It contains a direct link to the source code that also has many helpful comments.

The attrs docs contain example based guides, in-depth discussions of certain features and design decisions as well as an exhaustive API reference. It uses Sphinx and, for the API reference, the autodoc extension. It provides an objects inventory which allows you to cross-reference attrs classes and functions from your own documentation via intersphinx.

The pydantic documentation is also very well written and contains many good examples that explain almost all functionality in great detail. However, it follows the unfortunate trend of using MkDocs as a documentation system. I assume that this is easier to set up than Sphinx and allows you to use Markdown instead of reStructuredText, but it is also lacking lots of important features and I also don’t like its UX. It has two navigation menus – one on the left for the whole document’s TOC and one on the right for the current page. More serious, however, is the absence of an API reference. There is also no cross-referencing (e.g., links from class and function names to their section in the API reference) and thus no objects inventory that can be used for inter-project cross-referencing via Sphinx’ intersphinx extension. Even pydantic’s source code barely includes any docstrings or other comments. This can be a hindrance when you want to solve more advanced problems.

Note

An alternative to MkDocs might be MyST, which is an extended Markdown parser that can be used with Sphinx.

Performance

For most use cases, the performance of a data classes library can be neglected. Performance differences only become noticeable when you create thousands or even millions of instances in a short amount of time.

However, the pydantic docs contain some benchmarks that suggest that pydantic is slightly ahead of attrs + cattrs in mean validation time. I was curious why pydantic, despite its larger feature set, was so fast, so I made my own benchmarks.

I briefly evaluated the attrs extension packages. The only one that offers reasonably convenient means for input validation and (de)serialization as well as good performance is cattrs.

I created benchmarks for three different use cases:

Simple classes and no need for extensive validation
Deserialization and validation of (more) complex (nested) classes
Serialization of complex (nested) classes

I calculated the time and memory consumption for handling 1 million instances of different variants of attrs and data classes as well as pydantic models.
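
The actual benchmark code lives in the accompanying repository. Purely as an illustration of how time and memory for instance creation could be measured with the standard library, here is a sketch that uses timeit and tracemalloc; measure() is a hypothetical helper and not taken from the benchmarks.

```python
import timeit
import tracemalloc
from dataclasses import dataclass


@dataclass
class Data:
    name: str
    value: float


def measure(factory, n=100_000):
    """Return (seconds, peak_bytes) for creating n instances via factory."""
    # Time n instantiations without keeping the objects around
    seconds = timeit.timeit(lambda: factory("spam", 3.14), number=n)

    # Track the peak memory while n instances are alive at the same time
    tracemalloc.start()
    instances = [factory("spam", 3.14) for _ in range(n)]
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    del instances

    return seconds, peak
```

Running measure() for each class variant and dividing by the result for a reference class gives relative numbers. Absolute values depend heavily on the machine and Python version.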

Unsurprisingly, attrs and data classes are much faster than pydantic when no validation is needed. They also use a lot less memory.

Relative time and memory consumption for basic class instantiation

I was expecting that the results would be much closer when it comes to validation/conversion and serialization, but even there, pydantic was a lot slower than attrs + cattrs.

Relative time and memory consumption for loading more complex, nested classes

Relative time and memory consumption for dumping more complex, nested classes

I wondered why my benchmarks were so clearly in favor of attrs when the pydantic docs state that it is 1.4x faster than attrs + cattrs. I tried running the pydantic benchmarks myself and indeed I could reproduce these results. Wondering why the results differed so much, I took a closer look at the benchmark’s source code. It turned out that the attrs + cattrs example used python-dateutil for parsing datetimes while pydantic uses its own implementation. I replaced dateutil.parser.parse() with the stdlib datetime.fromisoformat() and the attrs + cattrs example suddenly became 6–7 times faster. (Note: fromisoformat() is not a general purpose parser!)
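
The caveat about fromisoformat() is worth a small illustration. The sketch below uses the stdlib datetime.fromisoformat(); parse_iso() is just a wrapper made up for this example.

```python
from datetime import datetime


def parse_iso(text):
    """Parse an ISO 8601 timestamp; return None for any other format."""
    try:
        return datetime.fromisoformat(text)
    except ValueError:
        return None


iso_result = parse_iso("2020-05-04T13:37:00")  # ISO format, parses fine
free_form = parse_iso("4 May 2020, 13:37")     # free-form text, not accepted
```

The first call returns a datetime for 4 May 2020, 13:37, while the second returns None. A general purpose parser also has to handle inputs like the second one, which is part of why it is slower.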

In defense of pydantic: The attrs + cattrs (de)serializers were specifically designed and implemented for this benchmark while Pydantic ships everything out-of-the box. Pydantic’s UX for these use cases is also more pleasant than that of attrs + cattrs.

You can find the source of all benchmarks in the accompanying repository.

Summary

Attrs, data classes and pydantic seem very similar at first glance, but they are very different when you take a closer look.

All three projects are of high quality, well documented and generally pleasant to use. Furthermore, they are different enough that each of them has its niche where it really shines.

The stdlib’s data classes module provides all you need for simple use cases and it does not add a new requirement to your project. Since it does not do any data validation, it is also quite fast.

When you need more features (more control over the generated class or data validation and conversion), you should use attrs. It is as fast as data classes are but its memory footprint is even smaller when you enable __slots__.

If you need extended input validation and data conversion, Pydantic is the tool of choice. The price of its many features and nice UX is a comparatively bad performance, though.

If you want to cut back on UX instead, the combination of attrs and cattrs might also be an alternative.

You can take a look at the benchmarks to get a feel for how the libraries can be used for different use cases and how they differ from each other.

I myself will stay with attrs as long as it can provide what I need. Otherwise I’ll use Pydantic.

Epilogue

I wish there was a library like mattrs (magic attrs) that combined Pydantic’s (de)serialization UX with attrs’ niceness and performance:

>>> from datetime import datetime
>>> from mattrs import dataclass, field, asjson
>>>
>>> @dataclass()
... class Child:
...     x: int = field(ge=0, le=100)
...     y: int = field(ge=0, le=100)
...     d: datetime
...
>>> @dataclass()
... class Parent:
...     name: str = field(re=r'^[a-z0-9-]+$')
...     child: Child
...
>>> data = {'name': 'spam', 'child': {'x': 23, 'y': '42', 'd': '2020-05-04T13:37:00'}}
>>> Parent(**data)
Parent(name='spam', child=Child(x=23, y=42, d=datetime.datetime(2020, 5, 4, 13, 37)))
>>> asjson(_)
{"name": "spam", "child": {"x": 23, "y": 42, "d": "2020-05-04T13:37:00"}}

Maybe it’s time for another side project? 🙊

