Sunday, January 31, 2021

Python Pool: Matplotlib Zorder Explained with Examples

Hello geeks, and welcome. In this article, we will cover Matplotlib's zorder. Along with that, we will look at its syntax and the difference it makes to a graph. To do so, we will look at a couple of examples. But first, let us build a general overview of the feature.

Matplotlib is a plotting library for the Python programming language. The zorder attribute of the Matplotlib module helps us improve the overall presentation of our plot. This property determines how close a plot or set of points is drawn to the viewer: the higher the zorder value, the closer to the front it is drawn. Things will become clearer as we move ahead in this article.

Zorder Implementation in Matplotlib

In this section, we will learn how to implement zorder. We will also see what difference adding a zorder makes.

Let us look at a simple example

import matplotlib.pyplot as plt
plt.plot([2,4],[4,4])
plt.plot([1,5],[2,6])
plt.show()
Matplotlib Zorder

Here we can see a simple plot. We have considered two different pairs of points, each joined to form a straight line. To execute this program, we just imported Matplotlib, declared the points, and finally called plt.show() to get the output.

But in the above graph, the straight line from the point (1,2) to (5,6) appears over the line from (2,4) to (4,4). We wish to reverse that order. Let us see how we can do that:

import matplotlib.pyplot as plt
plt.plot([2,4],[4,4],zorder=2,linewidth=20)
plt.plot([1,5],[2,6],zorder=1,linewidth=20)
plt.show()
Zorder

Here we can see that, using zorder, we are able to achieve the desired output: the line with the higher zorder is drawn on top. To get a clearer view, I have also increased the line width to 20. A zorder can be assigned to any number of lines.

Now let us look at how it works for a graph with five straight lines.

import matplotlib.pyplot as plt
plt.plot([2,4],[4,4],zorder=5,linewidth=10)
plt.plot([1,5],[2,6],zorder=1,linewidth=10)
plt.plot([1,4],[2,4],zorder=3,linewidth=10)
plt.plot([2,5],[1,3.5],zorder=4,linewidth=10)
plt.plot([3,1],[5,1],zorder=2,linewidth=10)
plt.show()
Multi Zorder

Here we can see that zorder works fine even for this program. Even if no zorder is declared, Matplotlib itself follows a default order by element type:

Type                Default zorder
Patch collection    1
Line collection     2
Text                3

Let us verify this through an example

import matplotlib.pyplot as plt
import numpy as np
x = np.array([1, 2, 5, 4])
y = np.array([ 2.65, 2.65, 4, 2 ])
m, b = np.polyfit(x, y, 1)
plt.xlabel("X-axis")
plt.ylabel("y-axis")

plt.plot(x, y, 'o')
plt.plot(x, m*x + b)
plt.show()
scatter plot

Above, we have plotted scatter points and a best-fit straight line through them. To do so, we require the NumPy module along with Matplotlib. I hope it is now clear how to implement zorder and how it works. In the next section, we will discuss some general questions related to this feature.
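We can also check these defaults directly. The following sketch (assuming a standard Matplotlib installation; the Agg backend is chosen here only so that no display is required) queries the zorder that each artist type receives when none is declared:

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; no display needed
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
line, = ax.plot([1, 2], [1, 2])    # plotted lines are Line2D artists
text = ax.text(1.5, 1.5, "label")  # a Text artist

print(line.get_zorder())  # 2
print(text.get_zorder())  # 3 -> text is drawn above lines by default
```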

General Questions

1. How do I change the color of the lines in the plot?

Ans. To change the color of the lines in the plot, all you have to do is add the "color" parameter to the call and specify the color. You can use a hex code, an RGBA tuple, or simply write the color's name.

2. How to change the width of the lines of the plot?

Ans. To do so, you need to use the "linewidth" parameter, after which you can specify the width.

You can see the implementation of both of these below:

import matplotlib.pyplot as plt

plt.plot([2, 4], [4, 4], zorder=2, linewidth=20,color="#7aebc1")
plt.plot([1, 5], [2, 6], zorder=1, linewidth=15,color=(.5,.95,1,1))
plt.show()
Colour change

Above, we have successfully implemented both changes. Note that when using RGBA, instead of the 0-255 pattern such as (240, 240, 240, 1), each channel value should be between 0 and 1.

Conclusion

In this article, we covered Matplotlib's zorder. Besides that, we also looked at its syntax and application. For a better understanding, we looked at a couple of examples, varying the syntax and examining the output in each case. In the end, we can conclude that Matplotlib's zorder is used to draw plot elements in a particular order, as requested by the user.

I hope this article was able to clear all your doubts. In case you have any unsolved queries, feel free to write them below in the comment section. Done reading this? Why not read about the Random Uniform Function next?

The post Matplotlib Zorder Explained with Examples appeared first on Python Pool.



from Planet Python
via read more

Zato Blog: How to invoke REST APIs from Zato microservices

This Zato article is a companion to an earlier post - previously we covered accepting REST API calls and now we look at how Zato services can invoke external REST endpoints.

Outgoing connections

Just like channels are responsible for granting access to your services via REST or other communication means, it is outgoing connections (outconns, as an abbreviation) that let the services access resources external to Zato, including REST APIs.

Here is a sample definition of a REST outgoing connection:

Zato web-admin outgoing connections
Zato web-admin creating a REST outgoing connection

The Python implementation will follow soon but, for now, let's observe that keeping the two separate has a couple of prominent advantages:

  • The same outgoing connection can be used by multiple services

  • Configuration is maintained in one place only - any change is immediately reflected on all servers and services can make use of the new configuration without any interruptions

Most of the options of REST outconns have the same meaning as with channels, but TLS CA certs may require particular attention. This option dictates what happens if a REST endpoint is invoked using HTTPS rather than HTTP, that is, how the remote end's TLS certificate is checked.

The option can be one of:

  • Default bundle - a built-in bundle of CA certificates will be used for validation. This is the same bundle that Mozilla uses and is a good choice if the API you are invoking is a well-known, public one with endpoints signed by one of the public certificate authorities.
  • If you upload your own CA certificates, they can be used for validation of external REST APIs - for instance, your company or a business partner may have their own internal CAs
  • Skip validation - no validation will be performed at all; any TLS certificate will be accepted, including self-signed ones. Usually, this option should not be used for non-development purposes.

Python code

A sample service making use of the outgoing connection is below.

# -*- coding: utf-8 -*-

# Zato
from zato.server.service import Service

class GetUserDetails(Service):
    """ Returns details of a user by the person's name.
    """
    name = 'api.user.get-details'

    def handle(self):

        # In practice, this would be an input parameter
        user_name = 'john.doe'

        # Name of the connection to use
        conn_name = 'My REST Endpoint'

        # Get a connection object
        conn = self.out.rest[conn_name].conn

        # A single dictionary with all the parameters,
        # Zato will know where each one goes to.
        params = {
            'name': user_name,
            'app_id': 'Zato',
            'app_version': '3.2'
        }

        # We are responsible for catching exceptions
        try:

            # Invoke the endpoint
            response = conn.get(self.cid, params)

            # Log the response
            self.logger.info('Response `%s`', response.data)

        # We caught an exception - log its details
        except Exception as e:
            self.logger.warn('Endpoint could not be invoked due to `%s`', e.args[0])

First, we obtain a connection handle, then the endpoint is invoked and a response is processed, which in this case means merely outputting its contents to server logs.

In the example, we use a constant user name and query string but in practice they would likely be produced based on user input.

Note the 'params' dictionary - when you invoke this service, Zato will know that 'name' should go to the URL path but all the remaining parameters will go to the request's query string. Again, this gives you additional flexibility, e.g. if the endpoint's URL changes from path parameters to query string, the service will continue to work without any changes.

Observe, too, that we are responsible for catching and handling potential exceptions arising from invoking REST endpoints.

Finally - because the outgoing connection's data format is JSON, you are not required to de-/serialise it yourself, the contents of 'response.data' is already a Python dict read from a JSON response.

At this point, the service is ready to be invoked - let's say, through REST, AMQP or from the scheduler - and when you do it, here is the output that will be seen in server logs:

INFO - Response `{'user_id': 123, 'username': 'john.doe', 'display_name': 'John Doe'}`

Now, you can extend it to invoke other systems, get data from an SQL database or integrate with other APIs.

Next steps

  • Start the tutorial to learn more technical details about Zato, including its architecture, installation and usage. After completing it, you will have a multi-protocol service representing a sample scenario often seen in banking systems with several applications cooperating to provide a single, consistent API to its callers.

  • Visit the support page if you would like to discuss anything about Zato with its creators




Zero to Mastery: Python Monthly šŸ’»šŸ January 2021

14th issue of Python Monthly! Read by 20,000+ Python developers every month. This monthly Python newsletter is focused on keeping you up to date with the industry and keeping your skills sharp, without wasting your valuable time.


Python Pool: Only Size 1 Arrays Can Be Converted To Python Scalars Error Solved

Python has helped thousands of communities to create solutions for their real-life problems. With thousands of useful modules, Python has proved to be one of the top versatile languages in the coding world. In addition, writing in Python is similar to writing in English, and using it is even simpler. With simple single command module installs, you can install almost every dependency to control your problems. Numpy is one of those important modules, and Only Size 1 Arrays Can Be Converted To Python Scalars Error appears while using this module.

Only Size 1 Arrays Can Be Converted To Python Scalars is a typical error that appears as a TypeError in the terminal. Its main cause is passing an array to a parameter that accepts only a scalar value. Various numpy methods accept only scalar parameters; hence, if you pass a one-dimensional or multidimensional array to such a method, it'll throw this error. Since so many methods accept only a single scalar, you can expect this error to appear often.

This post will guide you through the causes of the error and its solutions.

What is Only Size 1 Arrays Can Be Converted To Python Scalars Error?

Only Size 1 Arrays Error is a TypeError that gets triggered when you pass an array as a parameter to a function or method that accepts a single scalar value. Many functions have inbuilt error handling to validate their inputs and avoid crashing the program. Without such validation, the Python program would crash immediately, which can cause issues.

numpy.int() and numpy.float() show this error because they take single-valued parameters. As a TypeError originates from an invalid data type, you can trigger this error by sending an array as a parameter. There are different ways to avoid this error, which we'll discuss in the bottom section of the post.

Why do I get Only Size 1 Arrays Can Be Converted To Python Scalars Error?

Errors are an integral part of programming, and they should be handled properly. With proper error handling, your program can not only avoid harmful vulnerabilities but also perform better. One error you can get while using numpy is the Only Size 1 Arrays error. The module's developers have categorized it as a TypeError, which indicates that you supplied the wrong data type to the function or method.

Causes of Only Size 1 Arrays Can Be Converted To Python Scalars Error

There are several ways the error can appear while using numpy, and all of them are solvable by one means or another. This is a user-side error, and passing appropriate parameters can prevent it. Following are the causes of this error –

Incorrect Datatype

In Python, every data type has its own methods and attributes, and each has different uses. Many numpy methods require a single value as their primary parameter. If you pass a numpy array to such a method, the Only Size 1 Arrays Can Be Converted To Python Scalars error can appear.

Example:

import numpy as np

x = np.array([1, 2, 3, 4])
x = np.int(x)

Output:

TypeError: only size-1 arrays can be converted to Python scalars

Explanation:

In the program, x is a simple array with integer elements. If you try to convert this array to an int, it throws an error because np.int() accepts only a single value.

Using Single Conversion Function

Single Conversion Functions are the functions that accept a single-valued datatype and convert it to another data type. For example, converting a string to int is a single-valued conversion. In numpy, these functions accept the single numpy element and change its datatype from within. If you pass a numpy array as a parameter, you’ll get an error in such methods.

Example –

import numpy as np

x = np.array([1, 2, 3, 4])
x = np.float(x)

Output –

TypeError: only size-1 arrays can be converted to Python scalars

Explanation –

In the above example, we wanted to convert all the integers in the numpy array to float values. np.float() throws a TypeError because it accepts only single-valued parameters.

Solutions for Only Size 1 Arrays Can Be Converted To Python Scalars Error

There are multiple ways of solving this TypeError in Python. Most importantly, the numpy module provides some inbuilt functions that you can use to create a suitable datatype before passing it to a method. Following are the solutions to this error –

1. Using Numpy Vectorize Function

In layman's terms, to vectorize means to apply an algorithm to a whole set of values rather than to a single value. As the TypeError occurs when a single-value method is used on a set of values, you can place numpy.vectorize() between the algorithm and the method. It acts like Python's map function over a numpy array.

Code –

import numpy as np

vector = np.vectorize(np.float)
x = np.array([1, 2, 3])
x = vector(x)
print(x)

Output –

[1. 2. 3.]

Explanation –

First, we created vector(), a vectorized wrapper that accepts np.float as a parameter; we use it to apply the method to all the numpy array elements. Next, we created a numpy array and used vector() to apply np.float over all its values. This approach avoids the TypeError and converts all the values to float.

2. Using Map() Function

map() is a basic inbuilt Python function that applies a function over all elements of an iterable. It accepts two major parameters: the first is the function you need to apply over the set of values, and the second is the iterable you want to transform. Let's jump to an example –

Code –

import numpy as np

x = np.array([1, 2, 3])
x = np.array(list(map(np.float, x)))
print(x)

Output –

[1. 2. 3.]

Explanation –

First, we created a simple integer array and used map(np.float, x) to convert all the elements of the numpy array to float. As the map function returns a map object, we need to convert it back to a list and then a numpy array to restore its datatype. By using this method, you can avoid getting a TypeError.

3. Using Loops

Loops are the most brute-force method for applying a function over a set of values, but they give us control over each element and can be used to customize elements as needed.

Code –

import numpy as np

x = np.array([1, 2, 3])
y = np.array([None]*3)
for i in range(3):
    y[i] = np.float(x[i])
print(y)

Output –

[1.0 2.0 3.0]

Explanation –

In the above example, we used indexing to fetch each integer from the numpy array, then applied the np.float() method to convert it from int to float. Furthermore, we created a placeholder numpy array y, which stores the float values after conversion.

4. Using apply_along_axis

apply_along_axis is a numpy method that allows users to apply a function over a numpy array along a specific axis. As numpy loops along the given axis number, you can use it to apply a function over a set of values.

Code –

import numpy as np

x = np.array([1, 2, 3])
app = lambda y: [np.float(i) for i in y]
x = np.apply_along_axis(app, 0, x)
print(x)

Output –

[1. 2. 3.]

Explanation –

In this example, we use a lambda function to create a vectorized version of the function, then use np.apply_along_axis to apply it to the numpy array. You can also specify the axis along which the function should be applied.

Conclusion

The numpy module provides thousands of useful methods that easily solve hard problems. These methods have their own sets of instructions and parameters that we need to follow properly. The Only Size 1 Arrays Can Be Converted To Python Scalars error appears when you pass an array to a function that accepts only a scalar value. By following the solutions and alternatives given in this post, you can solve this error in no time!

However, if you have any doubts or questions, do let me know in the comment section below. I will try to help you as soon as possible.

Happy Pythoning!

The post Only Size 1 Arrays Can Be Converted To Python Scalars Error Solved appeared first on Python Pool.




Juri Pakaste: Alfred Script Filter with find and jq

Looks like this is a jq blog now, so here's another one.

I work on an iOS repository that's used to create a large number of apps and a few frameworks. Each app has a directory with configuration and a script that regenerates the associated Xcode project with XcodeGen.

You can run the script from the shell, or from Finder. Both of these require that you navigate to the appropriate directory or find a window that's already there. Both approaches work, and both are unsatisfactory.

I use Alfred for launching apps and all sorts of other things on macOS. One of the things it allows is workflows, a sort of Automator-like thing where, after typing in a keyword, Alfred will prompt you for input, execute things, and so on. I built a workflow to help with launching those regenerate scripts. Alfred's workflow sharing isn't great, as it creates hard-to-inspect zip files, and besides, my specific circumstances probably aren't relevant to many people. I'll explain here in prose how it works. Adapt it to your needs as necessary.

The repository contains publisher folders. Inside the publisher folders are app folders. In each app folder is a script called regenerate-project.command. The hierarchy looks something like this:

├── publisher1
│   ├── app1
│   │   └── regenerate-project.command
│   └── app2
│       └── regenerate-project.command
└── publisher2
    └── app1
        └── regenerate-project.command

We want Alfred to ask us which one of the scripts to run after we've typed a keyword.

Let's see how we can make it happen. First, to get a list of those files we can run find in the terminal:

find . -maxdepth 3 -mindepth 3 -name regenerate-project.command -print

This gives us a list of files, one per line, like:

./publisher1/app1/regenerate-project.command
./publisher1/app2/regenerate-project.command
./publisher2/app1/regenerate-project.command

etc.[1]

Now, looking at Alfred's documentation, looks like we need to create a document in the Script Filter JSON Format. It should look like this:

{
    "items": [
        {
            "uid": "publisher1/app1",
            "type": "file",
            "title": "publisher1/app1",
            "arg": "publisher1/app1",
            "match": "publisher1: app1"
        }
    ]
}

And so on. The one thing that breaks the monotony of identical keys is the match value. Its purpose there is to make Alfred give better completions. Alfred has a "word boundary" matching logic, but apparently / doesn't count as a word boundary.

What do we do when we need to handle JSON on the command line? We reach for jq.

Jq has a number of parameters that modify how it handles input. To get it to ingest the list of strings produced by find, what seemed to work was using a combination of the --raw-input/-R and --null-input/-n flags, and the inputs builtin function. So the first thing to do is to build the wrapping object.

find . -maxdepth 3 -mindepth 3 -name regenerate-project.command -print | jq -nR '{ "items": [inputs] }'

Running that produces output like this:

{
  "items": [
    "./publisher1/app2/regenerate-project.command",
    "./publisher1/app1/regenerate-project.command",
    "./publisher2/app1/regenerate-project.command"
  ]
}

You could pipe find through sort or you could use jq's sort function, but the order doesn't matter as Alfred will reorder the choices by usage anyway, which is nice.

Next, just because we're careful developers, let's filter out empty entries, just in case we're ever using this with some other source of data:

find … | jq -nR '{ "items": [inputs | select(length>0)] }'

When you're running this with find, it shouldn't affect the output, but if you ever end up feeding it a text file it might be a different story.

Next, drop the extra bits from the lines. We don't care about the leading ./ or the script name; they're the same on all the lines. To lose them, split each line into path components, take the two central elements, and recombine them:

find … | jq -nR '{
    "items": [
        inputs |
        select(length>0) |
        split("/")[1:3] |
        join("/")
    ]
}'
{
  "items": [
    "publisher1/app2",
    "publisher1/app1",
    "publisher2/app1"
  ]
}

One thing we have to do before we can build the object literals is capture the values — both the parts array and the combined string — in variables. This is a slightly longer version of the above jq snippet. It produces exactly the same output, but it defines the variables we need in the next step:

find … | jq -nR '{
    "items": [
        inputs |
        select(length>0) |
        split("/")[1:3] as $parts |
        $parts |
        join("/") as $file |
        $file
    ]
}'

OK, good. Now we have the two folders as an array in $parts and as a string in $file. Then we just replace the last bit that produces the array elements with an object literal.

find … | jq -nR '{
    "items": [
        inputs |
        select(length>0) |
        split("/")[1:3] as $parts |
        $parts |
        join("/") as $file |
        {
            "uid": $file,
            "type": "file",
            "title": $file,
            "arg": $file,
            match: $parts | join(": ")
        }
    ]
}'

That's a whole lot of $file and one special element that produces the value for the match field. Now the output looks like this:

{
  "items": [
    {
      "uid": "publisher1/app2",
      "type": "file",
      "title": "publisher1/app2",
      "arg": "publisher1/app2",
      "match": "publisher1: app2"
    },
    {
      "uid": "publisher1/app1",
      "type": "file",
      "title": "publisher1/app1",
      "arg": "publisher1/app1",
      "match": "publisher1: app1"
    },
    {
      "uid": "publisher2/app1",
      "type": "file",
      "title": "publisher2/app1",
      "arg": "publisher2/app1",
      "match": "publisher2: app1"
    }
  ]
}

All right, that's what we were after! Now we need to glue things together. In Alfred's Preferences, go to Workflows and create a new blank workflow. First tap on the "[x]" button to set up variables. You'll need at least one, to specify where your project lives. Call it root, specify your folder as the value, and uncheck "Don't Export" as you want it as an environment variable in your script.

Next ctrl-click in the workflow background to get the context menu and select Inputs > Script Filter. In the filter configuration panel, give your workflow a keyword — I call mine regenios, this is how I invoke it in Alfred — uncheck "with space", and select "Argument Required". Select /bin/bash as the script language, and as text add this:

cd $root
find . -maxdepth 3 -mindepth 3 -name regenerate-project.command -print | jq -nR '{
    "items": [
        inputs |
        select(length>0) |
        split("/")[1:3] as $parts |
        $parts |
        join("/") as $file |
        {
            "uid": $file,
            "type": "file",
            "title": $file,
            "arg": $file,
            match: $parts | join(": ")
        }
    ]
}'

Now click Save to save your Script Filter. Then ctrl-click in the workflow background again and this time select Actions > Terminal Command. Insert the following as the terminal command:

{var:root}/{query}/regenerate-project.command && exit

Again click save. Finally in the workflow editor drag a connection from the Script Filter shape to the Terminal Command box and you're done.

Now when you open the Alfred command window and type regenios and two spaces, you should get a full list of all the items your script produced. If you start typing after the first space, Alfred will match the beginning of each word of the match field of the JSON objects we produced and give a list of the matching items.

As I said at the start of this article, this probably isn't of much use to you as is. But it might be useful as inspiration.

[1] Yes, I'm aware of -print0, but it seems jq isn't.




Saturday, January 30, 2021

The Three of Wands: On structured and unstructured data, or the case for cattrs

If you've ever gone through the Mypy docs, you might have seen the section on TypedDict. The section goes on to introduce the feature by stating:

Python programs often use dictionaries with string keys to represent objects. [...] you can use a TypedDict to give a precise type for objects like movie, where the type of each dictionary value depends on the key:

from typing_extensions import TypedDict

Movie = TypedDict('Movie', {'name': str, 'year': int})

movie = {'name': 'Blade Runner', 'year': 1982}  # type: Movie

In other words, TypedDict exists to make dictionaries a little more like classes (in the eyes of Mypy, in this particular case), and is only one example of a growing menagerie of similar efforts to make dictionaries classes.

In this post, I maintain that in modern Python classes already exist, are fit-for-purpose and dictionaries should just be left to be dictionaries.

Value Objects

Pretty much every application and every API has a notion of data models on some level. These are prime examples of structured data - pieces of information with a defined shape (usually the names and types of subfields). The TypedDict example from the introduction defines a data model with two fields. Let's call these pieces of data value objects. Value objects come in a million flavors on many different abstraction layers; they can range from a Django model to a class you define in a one-liner to be able to return multiple values from a function, to just a dictionary. Value objects usually don't have a lot of business logic attached to them so it might be a stretch calling some of these value objects, but let's roll with it here.

In Python, the most natural way of modeling value objects is a class; since an instance of a class is just that - a piece of structured data.

When the TypedDict docs claim that Python programs often use dictionaries to model value objects, they aren't incorrect. The reason for this is, however, that historically Python has not had good tools for using classes for value objects, not that dictionaries are actually good or desirable for this purpose. Let's look at why this is the case.

JSON Value Objects

One of the biggest reasons, I believe, is JSON, probably the most popular serialization format of our time. Python has great tools for converting a piece of JSON into unstructured data (Python primitives, lists and dictionaries) - there's a JSON library included in Python's standard library, and very robust, well-known and performant third-party JSON libraries. Pretty much all Python HTTP libraries (client and server) have special cases for easy handling of JSON payloads.

Now, take into account that the most straightforward way to model a value object in JSON is simply a JSON object with fields corresponding to the value object's fields. So, parsing the JSON payload {"name": "Blade Runner", "year": 1982} into a dictionary is extremely easy, while converting it into a proper Python value object is much less so.
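To make this asymmetry concrete, here is a minimal sketch using only the standard library (the hand-written Movie class here is purely illustrative, not the attrs-based version discussed later):

```python
import json

# A hand-written value object class
class Movie:
    def __init__(self, name: str, year: int):
        self.name = name
        self.year = year

# Parsing JSON into unstructured data takes one line...
payload = json.loads('{"name": "Blade Runner", "year": 1982}')
print(payload["year"])  # 1982

# ...but structuring it means writing the class above by hand and
# wiring the fields yourself, with no checking of names or types:
movie = Movie(**payload)
print(movie.name)  # Blade Runner
```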

Modern Python Value Objects

Historically, creating Python value object classes and populating them with data from somewhere (like a JSON payload) has been very cumbersome. There have been three recent developments in the broader Python ecosystem that make this much better.

attrs

We now have attrs. attrs is a Python library for declaratively defining Python classes, and is particularly amazing for modeling value objects. attrs itself has excellent docs and makes a great case against manually writing classes (which it whimsically calls artisanal classes) here. The example nicely illustrates the amount of code needed for a well-behaved value object. No wonder the Python ecosystem used to default to dictionaries.

A small note on dataclasses: the dataclasses module is basically a subset clone of attrs present in the Python standard library. In my opinion, the only use of dataclasses is if you don't have access to third-party libraries (i.e. attrs), for example if you're creating simple scripts that don't require a virtual environment or are writing code for the standard library. If you can use pip you should be using attrs instead, since it's just better.

Field-level type annotations

We now (since Python 3.6) have field-level type annotations in classes (aka PEP 526).

This makes it possible to define a value object thusly:

@attr.define
class Movie:
    name: str
    year: int

The most important part of this PEP is that the type information for the value object fields is available at runtime. (Classes like this were possible before this PEP using type comments, but those are not usable at runtime.)

The field type information is necessary for handling structured data; especially any kind of nested structured data.
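A quick sketch of what "available at runtime" means in practice, using only the standard library (attrs is not needed just to read the annotations back):

```python
import typing

class Movie:
    name: str
    year: int

# PEP 526 field annotations are stored on the class and can be
# retrieved at runtime, which is what structuring libraries rely on:
hints = typing.get_type_hints(Movie)
print(hints)  # {'name': <class 'str'>, 'year': <class 'int'>}
```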

cattrs

We now have cattrs. cattrs is my library for efficiently converting between unstructured and structured Python data. To simplify, cattrs ingests dictionaries and spits out classes, and ingests classes and spits out dictionaries. attrs classes are supported out of the box, but anything can be structured and unstructured. For example, the usage docs show how to convert Pendulum DateTime instances to strings, which can then be embedded in JSON.

cattrs uses converters to perform the actual transformations, so the un/structuring logic is not on the value objects themselves. This keeps the value objects leaner and allows you to use different rules for the same value object, depending on the context.

So cattrs is the missing layer between our existing unstructured infrastructure (our JSON/msgpack/bson/whatever libraries) and the rich attrs ecosystem, and the Python type system in general. (cattrs goes to efforts to support higher-level Python type concepts, like enumerations and unions.)

I believe this functionality is sufficiently complex for it to have a layer of its own and that it doesn't really make sense for lower-level infrastructure (like JSON libraries) to implement it itself, since the conversion rules between higher-level components (like Pendulum DateTimes) and their serialized representations need to be very customizable. (In other words, there's a million ways of dumping DateTimes to JSON.)

Also, if the unstructured layer only concerns itself with creating unstructured data, the structuring logic can be in one place. In other words, if you use ujson + cattrs, you can easily switch to msgpack + cattrs later (or at the same time).

Putting it all to use

Let's try putting this to use. Let's say we want to load a movie from a JSON HTTP endpoint.

First, define our value object in code. This serves as documentation, runtime information for cattrs, and type information for Mypy.

import attr

@attr.frozen
class Movie:
    name: str
    year: int

Second, grab the unstructured JSON payload.

>>> payload = httpx.get('http://my-movie-url.com/movie').json()

Third, structure the data into our value object (this will throw exceptions if the data is not the shape we expect). If our data is not exotic and doesn't require manual customization, we can just import structure from cattr and use that.

>>> movie = cattr.structure(payload, Movie)

Done!

Addendum: What should dictionaries actually be used for?

The attrs docs already have a great section on what dictionaries should be, so I'll be short in adding my two cents.

If the value type of your dictionary is any sort of union, it's not really a dictionary but a value object in disguise. For the movie example, the type of the dictionary would be dict[str, Union[str, int]], and that's a tell-tale sign something's off (and the raison d'etre for TypedDict). A true dictionary would, for example, be a mapping of IDs to Movies (if movies had IDs), the type of which would be dict[int, Movie]. There's no way to turn this kind of data into a class.
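To make the distinction concrete, here is a small sketch (the movie data is made up) contrasting the two shapes:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Movie:
    name: str
    year: int

# A value object in disguise: fixed keys, heterogeneous value types
movie_as_dict = {"name": "Blade Runner", "year": 1982}

# A true dictionary: a homogeneous mapping of IDs to Movies
movies = {
    1: Movie("Blade Runner", 1982),
    2: Movie("Alien", 1979),
}
```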



from Planet Python
via read more

Matt Layman: Are Django and Flask Similar?

Maybe you’re new to web development in Python and have encountered the two most popular Python web frameworks, Django and Flask, and have questions about which one you should use. Are Django and Flask similar tools for building web applications? Yes, Django and Flask share many similarities and can both make great websites, but they have some different development philosophies which will attract different types of developers. What do I know about this?

from Planet Python
via read more

How to pad arrays in NumPy: Basic usage of np.pad() with examples

The np.pad() function has a complex, powerful API. But basic usage is very simple and complex usage is achievable!

from Planet SciPy
read more

John Cook: Python triple quote strings and regular expressions

There are several ways to quote strings in Python. Triple quotes let strings span multiple lines. Line breaks in your source file become line break characters in your string. A triple-quoted string in Python acts something like a “here doc” in other languages.
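For example:

```python
s = """one
two
three"""
# The two line breaks in the source become two "\n" characters
```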

However, Python’s indentation rules complicate matters because the indentation becomes part of the quoted string. For example, suppose you have the following code outside of a function.

x = """\
abc
def
ghi
"""

Then you move this into a function foo and change its name to y.

def foo():
    y = """\
    abc
    def
    ghi
    """

Now x and y are different strings! The former begins with a and the latter begins with four spaces. (The backslash after the opening triple quote prevents the following newline from being part of the quoted string. Otherwise x and y would begin with a newline.) The string y also has four spaces in front of def and four spaces in front of ghi. You could technically push the string contents to the left margin, since string literal contents are exempt from indentation rules, but that ruins the visual structure of the function.
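To see the difference directly, here is a runnable version of the two snippets above:

```python
x = """\
abc
def
ghi
"""

def foo():
    y = """\
    abc
    def
    ghi
    """
    return y

y = foo()
# x starts with "a"; y starts with four spaces and keeps the
# indentation in front of every line, including the closing quote's line.
```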

We now give three solutions to this problem.

Solution 1: textwrap.dedent

There is a function in the Python standard library that will strip the unwanted space out of the string y.

import textwrap 

def foo():
    y = """\
    abc
    def
    ghi
    """
    y = textwrap.dedent(y)

This works, but in my opinion a better approach is to use regular expressions [1].

Solution 2: Regular expression with a flag

We want to remove white space, and the regular expression for a white space character is \s. We want to remove one or more white spaces so we add a + on the end. But in general we don’t want to remove all white space, just white space at the beginning of a line, so we stick ^ on the front to say we want to match white space at the beginning of a line.

import re 

def foo():
    y = """\
    abc
    def
    ghi
    """
    y = re.sub(r"^\s+", "", y)

Unfortunately this doesn’t work. By default ^ only matches the beginning of a string, not the beginning of a line. So it will only remove the white space in front of the first line; there will still be white space in front of the following lines.

One solution is to add the flag re.MULTILINE to the substitution function. This will signal that we want ^ to match the beginning of every line in our multi-line string.

    y = re.sub(r"^\s+", "", y, re.MULTILINE)

Unfortunately that doesn’t quite work either! The fourth positional argument to re.sub is a count of how many substitutions to make. It defaults to 0, which actually means infinity, i.e. replace all occurrences. You could set count to 1 to replace only the first occurrence, for example. If we’re not going to specify count, we have to set flags by name rather than by position, i.e. the line above should be

    y = re.sub(r"^\s+", "", y, flags=re.MULTILINE)

That works.

You could also abbreviate re.MULTILINE to re.M. The former is more explicit and the latter is more compact. To each his own. There’s more than one way to do it. [2]

Solution 3: Regular expression with a modifier

In my opinion, it is better to modify the regular expression itself than to pass in a flag. The modifier (?m) specifies that in the rest of the regular expression the ^ character should match the beginning of each line.

    y = re.sub(r"(?m)^\s+", "", y)

One reason I believe this is better is that moves information from a language-specific implementation of regular expressions into a regular expression syntax that is supported in many programming languages.

For example, the regular expression

    (?m)^\s+

would have the same meaning in Perl and Python. The two languages have the same way of expressing modifiers [3], but different ways of expressing flags. In Perl you paste an m on the end of a match operator to accomplish what Python does with setting flags=re.MULTILINE.

One of the most commonly used modifiers is (?i) to indicate that a regular expression should match in a case-insensitive manner. Perl and Python (and other languages) accept (?i) in a regular expression, but each language has its own way of adding modifiers. Perl adds an i after the match operator, and Python uses

    flags=re.IGNORECASE

or

    flags=re.I

as a function argument.
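Putting the three solutions side by side: all of them produce the same left-aligned string, while the positional mistake from Solution 2 (re.MULTILINE silently landing in the count slot) leaves the later lines indented:

```python
import re
import textwrap

def foo():
    # The same indented triple-quoted string as above
    y = """\
    abc
    def
    ghi
    """
    return y

y = foo()

dedented = textwrap.dedent(y)                  # Solution 1
flagged = re.sub(r"^\s+", "", y, flags=re.M)   # Solution 2
modified = re.sub(r"(?m)^\s+", "", y)          # Solution 3

# The mistake: re.M is taken as count, and without the flag
# ^ still only matches the start of the whole string.
broken = re.sub(r"^\s+", "", y, re.M)
```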

More on regular expressions

[1] Yes, I’ve heard the quip about two problems. It’s funny, but it’s not a universal law.

[2] “There’s more than one way to do it” is a mantra of Perl and contradicts The Zen of Python. I use the line here as a good-natured jab at Python. Despite its stated ideals, Python has more in common with Perl than it would like to admit and continues to adopt ideas from Perl.

[3] Python’s re module doesn’t support every regular expression modifier that Perl supports. I don’t know about Python’s regex module.

The post Python triple quote strings and regular expressions first appeared on John D. Cook.

from Planet Python
via read more

Weekly Python StackOverflow Report: (cclxi) stackoverflow python report

These are the ten most rated questions at Stack Overflow last week.
Between brackets: [question score / answers count]
Build date: 2021-01-30 15:03:53 GMT


  1. pip install failing on python2 - [21/3]
  2. Installing pip is not working in bitbucket CI - [17/2]
  3. Python Pip broken wiith sys.stderr.write(f"ERROR: {exc}") - [16/2]
  4. JSON serialized object gives error with multiprocessing calls - TypeError: XXX objects not callable error - [7/0]
  5. Efficient way of making time increment strings? - [6/5]
  6. How to calculate the distance between two points on lines in python - [6/4]
  7. How to format to n decimal places in Python - [6/3]
  8. Read .pptx file from s3 - [5/1]
  9. Why is iterating over a dict so slow? - [5/1]
  10. Why is the output not 1 and 1? - [5/1]


from Planet Python
via read more

Friday, January 29, 2021

Iterating over rows in Pandas

When you absolutely have to iterate over rows in a Pandas DataFrame, use the .itertuples() method.

from Planet SciPy
read more

PyCharm: The Transition to Apple Silicon

In June of last year, Apple announced that the Mac would transition to Apple’s own chips, called Apple Silicon. Here at PyCharm, this would mean major changes to the way we build our software. A change of this magnitude has not happened since the move from PowerPC to Intel’s x86 architecture.

Although performance was somewhat acceptable on Rosetta 2, Apple’s new translation engine that translates the x86 instruction set to the M1’s ARM-based instruction set, it was not good enough for our IDEs.

In general, if you have a simple program then Rosetta 2 should be able to translate your program without significant overhead. However, our IDEs are built on top of our own custom Java Runtime Environment, and that is in no way a simple program.

JetBrains Runtime

Up until 2010, Apple bundled their own version of Java with their operating system. This meant that every time a new version of Java was released, Apple would need to patch it for their own operating system so that it did not have any security vulnerabilities. With the deprecation of Java on the Mac, certain things such as font rendering on retina screens became more difficult using the version of Java that Oracle released. To remedy this, JetBrains forked the OpenJDK project to gain better control over how the IDEs looked on Macs as well as on other HiDPI screens; the JetBrains Runtime was born, and we have bundled it with our IDEs since 2014.

The JetBrains Runtime ships with all our IDEs, and although this gives us more control, it also means that we need a large team to maintain the codebase. Furthermore, there are many facets of the runtime, and we do not know every little crevice of it; rather, we focus on the part of the code that handles the rendering of the UI on screens.

M1 Enters the Chat

The change to Apple Silicon meant that we’d need to re-write a lot of JetBrains Runtime, to make sure that we had adequate performance. Although we had been experimenting with running applications on Raspberry Pi computers, this was a completely different issue; the M1 meant that ARM-based computers would soon become mainstream. Our IDEs couldn’t just run adequately on the M1, they had to run well on them.

To this end, we began to investigate how we could handle this transition with grace. It soon turned out that we had to re-write a lot of the JIT system, a core component of the JVM itself, which was something we had little to no experience in.

Eventually, we did manage to solve this issue with the help of Azul Systems. To hear the whole story, listen to the podcast, where I talk to Konstantin Bulenkov, who had to weather the storm of this fundamental change.



from Planet Python
via read more

Best Alternatives to MLFlow Model Registry

In the Spark + AI Summit in Amsterdam, Mlflow Model Registry, a brand new feature in the MLflow platform, was announced. Mlflow...

The post Best Alternatives to MLFlow Model Registry appeared first on neptune.ai.



from Planet SciPy
read more

Stack Abuse: How to Format Number as Currency String in Python

Introduction

Having to manually format a number as a currency string can be a tedious process. A few modifications are manageable by hand; however, when we need to do a fair number of conversions, it becomes very tedious.

The first step to automating these kind of tasks will require a function. In this article, we'll be going over a few methods you can use to format numbers as currency strings in Python.

Methods for Formatting Numbers

We'll be going over three alternate libraries and functions which allow us to convert numbers into currency strings:

  • The locale module.
  • The Babel module.
  • The str.format() function.

The locale module is already included in Python, though we'll have to install Babel separately.

Format Number as Currency String with Locale

The locale module comes preinstalled with Python.

This package allows developers to localize their applications: they don't have to know in which region their software will be run, and can instead write a universal codebase that dynamically adapts to the region of use.

Initializing the Locale

To begin using the locale module you first need to set the locale:

import locale 

# To use the user's default settings, pass an empty string as the second argument
print(locale.setlocale(locale.LC_ALL, ''))

# To use a specific locale (Great Britain's locale in this case)
print(locale.setlocale(locale.LC_ALL, 'en_GB'))

The code above will produce the following output:

English_United States.1252
en_GB

To get the list of available locales, you can look it up on MS-LCID. Alternatively, you can print it out:

# For the Windows operating system
for lang in locale.windows_locale.values():
    print(lang)

# For other operating systems
for lang in locale.locale_alias.values():
    print(lang)

Running any of the above variants will yield something similar to:

en_GB
af_ZA
sq_AL
gsw_FR
am_ET
ar_SA
ar_IQ
ar_EG
ar_LY
ar_DZ
...

Formatting Numbers with Locale

With your preferred locale set, you can easily format number strings:

locale.setlocale(locale.LC_ALL, '')

# If you'd like grouping, set grouping to True; otherwise set it to False or leave it out completely
print(locale.currency(12345.67, grouping=True))
print(locale.currency(12345.67))

Running the code above we get the following output:

$12,345.67
$12345.67

Using the str.format() method

The next method we'll be covering is the str.format() method, which has the advantage of being the most straightforward one:

number = 340020.8
# This portion is responsible for grouping the number
number_commas_only = "{:,}".format(number)
print(number_commas_only)

# To ensure we have two decimal places
number_two_decimal = "{:.2f}".format(number)
print(number_two_decimal)

# Both combined, along with the currency symbol (in this case $)
currency_string = "${:,.2f}".format(number)
print(currency_string)

Running the code above we get the following output:

340,020.8
340020.80
$340,020.80

This approach, though, is hard-coded, unlike the previous one, which can localize the formatting dynamically.
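As a side note, the same format specifications also work in f-strings, which can read a bit more naturally (the variable name here is illustrative):

```python
number = 340020.8

# Identical format specs, two spellings
grouped = f"{number:,}"
currency = f"${number:,.2f}"
```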

Format Number as Currency String with Babel

Using Babel is perhaps one of the lesser-known methods; however, it's very user-friendly and intuitive. It comes with number and currency formatting, as well as other internationalization tasks.

Unlike Python's locale module, you don't have to worry about making adjustments on a global scale.

To install Babel via pip, run the following command:

$ pip install Babel
...
Successfully installed Babel-2.9.0

Once installed, to achieve the same results as the two other methods listed above, you can simply call format_currency() on a string:

import babel.numbers
number = 340020.8

# The three needed arguments are the number, the currency and the locale
babel.numbers.format_currency(number, "USD", locale='en_US')

Running the code above we get the following output:

$340,020.80

To get the full list of locales available:

avail_loc = babel.localedata.locale_identifiers()
print(avail_loc)

Which looks something like this:

['af', 'af_NA', 'af_ZA', 'agq', 'agq_CM', 'ak', 'ak_GH', 'am', 'am_ET',...]

Searching For Numbers in Strings and Formatting as Currency

Sometimes, you don't work with direct numerical input, such as the input from a user. You might be working with a sentence, or a larger, unclean corpus. We can use the re module to filter through different types of input, find numerical values and format them.

Let's use all three of the approaches above to format the currency in a sentence:

import re
import locale
import babel.numbers
locale.setlocale(locale.LC_ALL, 'en_US')

Next we come up with the regex pattern needed to match the number strings:

# This pattern is used to match any number string
pattern = r'\d+(\.\d{1,2})?'

Next we apply the three methods we've learned to the string variable message:

message = "Our current budget is 180000, we'll need 25000.67 to cover rent, then 23400.4 for food."

# re.sub() is used to substitute substrings that match a certain pattern
# with another string, in our case the return value of a lambda function
# which will return a matching currency string.
new_message_locale = re.sub(
    pattern, lambda x: locale.currency(float(x.group()), grouping=True), message
)
new_message_str = re.sub(
    pattern, lambda x: "${:,.2f}".format(float(x.group())), message
)
new_message_babel = re.sub(
    pattern,
    lambda x: babel.numbers.format_currency(float(x.group()), "USD", locale="en_US"),
    message,
)

Let's compare the original output with the output from all three methods:

print(message)
print(new_message_locale)
print(new_message_str)
print(new_message_babel)

Our current budget is 180000, we'll need 25000.67 to cover rent, then 23400.4 for food.
Our current budget is $180,000.00, we'll need $25,000.67 to cover rent, then $23,400.40 for food.
Our current budget is $180,000.00, we'll need $25,000.67 to cover rent, then $23,400.40 for food.
Our current budget is $180,000.00, we'll need $25,000.67 to cover rent, then $23,400.40 for food.

Depending on the method you prefer, the length of this script can be reduced. As you may have noticed, there are certain limitations: the script as written is unable to differentiate between the number strings you'd like to format and those you'd rather leave alone. However, it can easily be adapted to your needs and use cases.
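As one hedged sketch of such an adaptation (the pattern and message here are illustrative, not from the script above), negative lookbehind and lookahead assertions can skip numbers that are already part of a formatted amount:

```python
import re

message = "Rent is $1,200.00 but we budgeted 1500 for it."

# Skip digits preceded by "$", another digit, "," or "." so that
# already-formatted amounts are left untouched.
pattern = r"(?<![$\d,.])\d+(?:\.\d{1,2})?(?![\d,.])"

result = re.sub(
    pattern, lambda m: "${:,.2f}".format(float(m.group())), message
)
```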

Conclusion

In this article we took a look at a couple of ways of converting numbers into proper currency strings. We've covered the str.format() method, as well as the locale and babel modules.

Finally, we combined these methods with Python's regular expression module to achieve a wider range of uses. I hope you were able to learn something new that can help save you time.



from Planet Python
via read more

Real Python: The Real Python Podcast – Episode #45: Processing Images in Python With Pillow

Are you interested in processing images in Python? Do you need to load and modify images for your Flask or Django website or CMS? Then you most likely will be working with Pillow, the friendly fork of PIL, the Python imaging library. This week on the show, we have Mike Driscoll, who is writing a new book about image processing in Python.




from Planet Python
via read more

Lucas Cimon: fpdf2.3.0 Unbreakable! and PDF quines

Unbreakable movie poster

Today, I am happy to announce version 2.3.0 of fpdf2, code name: Unbreakable!

GitHub: https://github.com/pyfpdf/fpdf2/ | Docs: https://pyfpdf.github.io/fpdf2/

Why Unbreakable?

  • As a tribute to M. Night Shyamalan movie
  • Because using fpdf2, your Python code can never break!
    ...
    Just kidding, I would be …

Permalink



from Planet Python
via read more

Talk Python to Me: #301 Deploying and running Django web apps in 2021

Have you been learning Django and now want to get your site online? Not sure of the best way to host it, or the trade-offs between the various options? Maybe you want to make sure your Django site is secure. On this episode, I'm joined by two Django experts, Will Vincent and Carlton Gibson, to talk about deploying and running Django in production, along with recent updates in Django 3.2 and beyond.

Links from the show:

  • Will Vincent: wsvincent.com
  • Carlton Gibson: @carltongibson
  • Watch the live stream: youtube.com
  • Give me back my monolith: craigkerstiens.com
  • Carlton's Button hosting platform: btn.dev
  • Django Software Foundation: djangoproject.com
  • Django News newsletter: django-news.com
  • Deployment Checklist: djangoproject.com
  • Environs 3rd party package for environment variables: github.com
  • Django Static Files & Templates: learndjango.com
  • Learn Django: LearnDjango.com
  • Configuring uWSGI for Production Deployment @ Bloomberg: techatbloomberg.com

from Planet Python
via read more

Thursday, January 28, 2021

Python⇒Speed: Speed up pip downloads in Docker with BuildKit's new caching

Docker uses layer caching to speed up builds, but layer caching isn’t always enough. When you’re rapidly developing your Python application and therefore frequently changing the list of dependencies, you’re going to end up downloading the same packages.

Over and over and over again.

This is no fun when you depend on small packages. It’s extra no fun when you’re downloading machine learning libraries that take hundreds of megabytes.

With the release of a stable Docker BuildKit, Docker now supports a new caching mechanism that can cache these downloads.

Read more...

from Planet Python
via read more

PyCharm: PyCharm 2021.1 EAP starts now!

We are starting our new Early Access Program (EAP) for PyCharm 2021.1. You can now try the pre-release build of the upcoming v2021.1. It delivers enhanced support for Cython as well as UI and usability updates.

In short, our EAP allows anyone to try out the features we are implementing. Follow this series of EAP blog posts to get the latest information about the changes coming in PyCharm 2021.1.

We encourage you to join the program to try out the new and improved features. By testing these updates and giving us feedback, you can help us make PyCharm better for you. As always, you can download the new EAP from our website, get it from the free Toolbox App, or update using snap if you’re an Ubuntu user.

DOWNLOAD PYCHARM 2021.1 EAP

Important! PyCharm EAP builds are not fully tested and might be unstable.

In this post, we’ll take a look at the most notable updates from week one of the EAP.

Improved Cython type checker

PyCharm provides improved Cython support. In this EAP we have improved the type checker for Cython, which you can already try. In the next EAP releases we are planning to fix a number of Cython-related bugs. Check out our help page for more information on this.

VCS

Built-in Space

The Space plugin is now available. This means that you can connect your IDE to your organization in JetBrains Space to view and clone project repositories, write complex scripts that use Space APIs, and review your teammates’ code. To log in to Space, click the Get from VCS button on the Welcome screen, select Space on the left, and enter your organization URL in the dedicated field. It is also possible to log in via Tools | Space | Log in to Space.

Once logged in, you can clone the desired repository and open it in PyCharm. When you open it, Space Code Reviews will appear on the left-hand pane. From there, you can see a list of issues that contain your changes or require your attention. For example, if you are a reviewer, you can open an issue to see its author, look at the timeline, add comments inside a diff view, and more.

Configure a profile for pre-commit inspections

We’ve added the possibility to choose a code inspection profile before committing changes to VCS. To access this feature, click the gear icon to show commit options, select the Analyze code checkbox, click Configure, and choose the desired profile. Profiles can be created in Preferences / Settings | Editor | Inspections. The IDE will use the selected profile when inspecting your code before the commit.

User experience

Built-in HTML preview

We’ve added a new built-in browser preview to PyCharm that allows you to quickly preview HTML files. Any changes you make to HTML files in the IDE, as well as in the linked CSS and JavaScript files, will be immediately saved and the preview will update on the fly.

To open the preview, click on the icon with the PyCharm logo in the widget in the top-right side of the editor.

Collaborative Development

Code With Me: audio and video calls are enabled

Code With Me allows you to do collaborative development like pair programming even if your peer doesn’t have an IDE. To make the experience even better, the product team has now included the ability to make audio and video calls in Code With Me. A detailed guide can be found here.

Community contributions

  • The correct code insight for None in TypedDict is now provided, thanks to Morgan Bartholomew [PY-44714]
  • Thanks to Bernat Gabor, an important debugging case in tox is now covered: Python isolated mode no longer affects subprocess debugging. [PY-45659]

Notable bug fixes

  • Code insight logic is improved for ContextManager and the following other cases: function parameter annotated with built-in function type matches the passed function argument; functions that take modules as a parameter. [PY-29891] [PY-36062] [PY-43841]
  • For PyTorch tensors, TensorFlow tensors, and Pandas GeoDataFrames (basically, for all classes that have a shape attribute, provided the attribute is non-callable and iterable), you can now see the shapes in the variable pane. [PY-19764]

That’s it for week one! You can find the rest of the changes for this EAP build in the release notes. Stay tuned for more updates, and be sure to share your feedback in the comments below, on Twitter, or via our issue tracker.

Ready to join the EAP?

Some ground rules

  • EAP builds are free to use and expire 30 days after the build date.
  • You can install an EAP build side by side with your stable PyCharm version.
  • These builds are not fully tested and can be unstable.
  • Your feedback is always welcome. Please use our issue tracker and make sure to mention your build version.

How to download

Download this EAP from our website. Alternatively, you can use the JetBrains Toolbox App to stay up to date throughout the entire EAP. If you’re on Ubuntu 16.04 or later, you can use snap to get PyCharm EAP and stay up to date.

The PyCharm team



from Planet Python
via read more

Python Morsels: Assigning to Global Variables

Transcript:

Let's talk about assigning to global variables in Python.

Variable assignments are local

Let's take a global variable (a variable defined outside of any function) called message:

>>> message = "Hello world"

And let's define a function called set_message that assigns to the message variable:

>>> def set_message(name):
...     message = f"Hello {name}"
...

If we call the set_message function with the name Trey, it will assign message to Hello Trey.

>>> set_message("Trey")

But if we read message now, it's not actually Hello Trey, it's Hello world:

>>> message
'Hello world'

This happens because assignments within a function only assign to local variables. So when we're inside the set_message function, we're always assigning to a local variable.

So that message variable in the set_message function is a local variable. When the function returns, that local variable disappears.

So calling set_message doesn't actually change our global variable message, it just shadowed the global variable message (meaning we made a local variable with the same name as a global variable).

The global variable promise: "I promise I will not assign to global variables from within a function"

It is actually possible to write to global variables in Python.

I'm going to show you how to write to global variables, but you must promise me that you won't actually ever do this.

Promised? Great! Read on...

Assigning to global variables

I've assigned to global variables (from within a function) very infrequently in my own Python code and you probably shouldn't do this in your code either. But Python does allow you to do this even though you often shouldn't.

The trick to writing a global variable is to use the global statement:

>>> def set_message(name):
...     global message
...     message = f"Hello {name}"
...

When we call this new set_message function you'll see that it actually updates the global message variable:

>>> set_message("Trey")
>>> message
'Hello Trey'

If we call set_message with a different name, message will change again:

>>> set_message("Guido")
>>> message
'Hello Guido'

Normally, all assignments assign to local variables, and a variable is either local or global within a given scope. But you can think of the global statement as an escape hatch: it's a way of declaring that the message variable is a global variable within the set_message function.

Best practices with global variables

With that declaration in place, every time the function reads from message and (more importantly) every time it writes to it, it reads from and writes to the global variable message.

You don't need the global statement if you're just reading. Whenever you read a variable, Python will look for a local variable with that name and, if it doesn't find it, it will look for a global variable with the same name.

You only need global when you're writing to a global variable. And, you probably shouldn't be writing to global variables from within a function.
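A short runnable recap of both rules: reading needs no declaration, while a bare assignment creates a new local:

```python
message = "Hello world"

def read_message():
    # Reading falls back to the global scope: no declaration needed
    return message.upper()

def shadow_message():
    message = "Hello local"  # assignment creates a local variable
    return message
```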

Functions are typically seen as having inputs (which usually come from their arguments) and outputs (usually their return value).

Global state usually shouldn't be changed simply by calling a function. Functions usually provide information through their return value.

It's a common practice to make a function like this:

>>> def get_message(name):
...     return f"Hello {name}"
...

This get_message function doesn't assign to a message variable; it just returns a string which represents the message we actually want:

>>> get_message("Trey")
'Hello Trey'

This get_message function doesn't change our global variable message:

>>> message
'Hello world'

But if we did want to change our global message variable, as long as we're in the global scope when we're calling the function, we can assign to that global variable directly:

>>> message = get_message("Trey")

Assigning to message while at the module-level (outside of any function), changes the global message variable directly:

>>> message
'Hello Trey'

If we wanted to change it to something else, we'd pass in a different argument, and continue to assign to the message variable:

>>> message = get_message("Guido")
>>> message
'Hello Guido'

Typically in Python we embrace the fact that assignments are local, unless we're in the global scope (outside of any function), where assignment statements assign to the global scope.

Summary

All assignments assign to local variables in Python, unless you use the global statement which is kind of our escape hatch for assigning to a global variable. Don't do it though! You don't usually want that escape hatch.

Typically instead of assigning to a global variable, you should return from your function and leave it up to the caller of your function to assign to a global variable, if they would like to.



from Planet Python
via read more

Martin Fitzpatrick: SAM Coupé SCREEN$ Converter — Interrupt optimizing image converter

Ever found yourself wondering how to convert an image to SAM Coupé MODE 4 SCREEN$ format? No probably not, but I'm going to tell you anyway.

The SAM Coupé was a British 8-bit home computer that was pitched as a successor to the ZX Spectrum.

The high-color MODE4 mode could manage 256x192 resolution graphics, with 16 colors from a choice of 127. Each pixel can be set individually, rather than using PEN/PAPER attributes as on the Spectrum. But there's more. The SAM also supports line interrupts which allowed palette entries to be changed on particular scan lines: so a single palette entry can actually be used to display multiple colors.

The limitation that color can only be changed per line means it's not really useful for games, or other moving graphics. But it does allow you to use a completely separate palette for "off screen" elements like panels. For static images, such as photos, it's more useful - assuming that the distribution of color in the image is favorable1.

tip: If you just want the converter, you can get it here. It is written in Python, using Pillow for image color conversion.

First, a quick look at the SAM Coupé screen modes to see what we're dealing with.

SAM Coupé Screen Modes

There are 4 screen modes on the SAM Coupé.

  • MODE 1 is the ZX Spectrum compatible mode, with 8x8 blocks which can contain 2 colors PAPER (background) and PEN (foreground). The framebuffer in MODE 1 is non-linear, in that line 1 is followed by line 8.
  • MODE 2 also uses attributes, with PAPER and PEN, but the cells are 8x1 pixels and the framebuffer is linear. This MODE wasn't used a great deal.
  • MODE 3 is high resolution, with double the X pixels but only 4 colours -- making it good for reading text.
  • MODE 4 is the high color mode, with 256x192 and independent coloring of every pixel from a palette of 16. Most games/software used this mode.
Mode  Dimensions  Framebuffer  bpp  Colors  Size     Notes
4     256×192     linear       4    16      24 KB    High color
3     512×192     linear       2    4       24 KB    High resolution
2     256×192     linear       1    16      12 KB    Color attributes for each 8×1 block
1     256×192     non-linear   1    16      6.75 KB  Color attributes for each 8×8 block; matches ZX Spectrum

Most SAM Coupé SCREEN$ were in MODE 4, so that's what we'll be targeting. It would be relatively easy to support MODE 3 on top of this2.

The SCREEN$ format

The format itself is fairly simple, consisting of the following bytes.

Bytes     Content
24576     Pixel data. MODE 4 (4bpp): 1 byte = 2 pixels; MODE 3 (2bpp): 1 byte = 4 pixels
16        MODE 4 Palette A
4         MODE 3 Palette A store
16        MODE 4 Palette B
4         MODE 3 Palette B store
Variable  Line interrupts, 4 bytes per interrupt (see below)
1         FF termination byte
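Putting the table together, the expected size of a MODE 4 SCREEN$ file is easy to sanity-check; this small sketch (the function name is mine, not part of the format) just sums the fixed regions plus 4 bytes per interrupt:

```python
def screen_file_size(n_interrupts):
    # 24 KB pixel data + two 16-byte MODE 4 palettes
    # + two 4-byte MODE 3 stores + 4 bytes per line interrupt
    # + the final 0xFF termination byte.
    return 24576 + 16 + 4 + 16 + 4 + 4 * n_interrupts + 1
```

So a screen with no interrupts weighs in at 24,617 bytes, and each interrupt adds 4 more.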

In MODE 4 the pixel data is 4bpp, that is 1 byte = 2 pixels (16 possible colors). To handle this we can create our image as 16 colors and bit-shift the values before packing adjacent pixels into a single byte.

Palette A & B

As shown in the table above the SAM actually supports two simultaneous palettes (here marked A & B). These are full palettes which are alternated between, by default 3 times per second, to create flashing effects. The entire palette is switched, but you can opt to only change a single color. The rate of flashing is configurable with:

POKE &5A08, <value>

The <value> is the time between swaps of alternate palettes, in 50ths of a second. This is only generally useful for creating flashing cursor effects 3. For converting to SAM SCREEN$ we'll be ignoring this and just duplicating the palette.
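As a quick sanity check on the units, a hypothetical helper (mine, not from the article) converting the POKE value to seconds:

```python
def flash_interval_seconds(poke_value):
    # POKE &5A08 takes the time between palette swaps in 50ths of a second.
    return poke_value / 50
```

A value of 25 gives a swap every half second; a value of 1 gives the headache-inducing 1/50th-of-a-second flicker mentioned in the footnotes.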

note: The exporter supports palette flash for GIF export.

MODE 3 Store

The palettes of MODE 3 and MODE 4 are separate, but palette operations act on the same CLUT. When switching between MODE 3 and MODE 4, 4 colors are set aside to a temporary store and restored when switching back. These values are also saved when saving SCREEN$ files (see the "store" entries above), so you can replace the MODE 3 palette by loading a MODE 4 screen. It's a bit odd.

We can ignore this for our conversions and just write a default set of bytes.

Interrupts

Interrupts define locations on the screen where a given palette entry (0-15) changes to a different color from the 127 system palette. They are encoded with 4 bytes per interrupt, with multiple interrupts appended one after another.

Bytes  Content
1      Y position, stored as 172-y (see below)
1      Color to change
1      Palette A
1      Palette B

Interrupt coordinates set from BASIC are calculated from -18 up to 172 at the top of the screen. The plot range in BASIC is actually 0..173, but interrupts can't affect the first pixel (which makes sense, since this is handled through the main palette).

When stored in the file, line interrupts are stored as 172-y. For example, a line interrupt at 150 is stored in the file as 22. The line interrupt nearest the top of the screen (1st row down, interrupt position 172) would be stored as 172-172=0.

This sounds complicated, but actually means that to get our interrupt Y byte we can just subtract 1 from the Y coordinate in the image.
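A minimal sketch of that conversion (the helper name is mine), assuming the top usable image row is 1, since row 0 can't carry an interrupt:

```python
def interrupt_y_byte(image_y):
    """Convert an image row (counted from the top, starting at 1) to
    the byte stored in the SCREEN$ interrupt record."""
    # BASIC counts interrupt positions downward from 172 at the top,
    # so image row 1 corresponds to BASIC position 172...
    basic_y = 173 - image_y
    # ...and the file stores 172 - y, which collapses to image_y - 1.
    return 172 - basic_y
```

Image row 1 maps to stored byte 0, and the BASIC-position-150 example above corresponds to image row 23, stored as 22.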

Converting Image to SCREEN$

We now have all the information we need to convert an image into a SCREEN$ format. The tricky bit (and what takes most of the work) is optimising the placement of the interrupts to maximise the number of colors in the image.

Pre-processing

Processing is done using the Pillow package for Python. Input images are resized and cropped to fit, using the ImageOps.fit() method, with centering.

SAM_COUPE_MODE4 = (256, 192, 16)
WIDTH, HEIGHT, MAX_COLORS = SAM_COUPE_MODE4

im = Image.open(fn)

# Resize with crop to fit.
im = ImageOps.fit(im, (WIDTH, HEIGHT), Image.ANTIALIAS, 0, (0.5, 0.5))

If the above crop is bad, you can adjust it by pre-cropping/sizing the image beforehand. There isn't the option to shrink without cropping as any border area would waste a palette entry to fill the blank space.

Interrupts

This is the bulk of the process for generating optimized images: the optimize function is shown below -- it shows the high-level steps taken to reach the optimal number of colors, using interrupts to compress colors.

def optimize(im, max_colors, total_n_colors):
    """
    Attempts to optimize the number of colors in the screen using interrupts. The
    result is a dictionary of color regions, keyed by color number
    """
    optimal_n_colors = max_colors
    optimal_color_regions = {}
    optimal_total_interrupts = 0

    for n_colors in range(max_colors, total_n_colors+1):
        # Identify color regions.
        color_regions = calculate_color_regions(im, n_colors)

        # Compress non-overlapping colors together.
        color_regions = compress_non_overlapping(color_regions)

        # Simplify our color regions.
        color_regions = simplify(color_regions)

        total_colors = len(color_regions)

    # Calculate how many interrupts we're using, after dropping the initial colors.
        _, interrupts = split_initial_colors(color_regions)
        total_interrupts = n_interrupts(interrupts)

        print("- trying %d colors, with interrupts uses %d colors & %d interrupts" % (n_colors, total_colors, total_interrupts))

        if total_colors <= max_colors and total_interrupts <= MAX_INTERRUPTS:
            optimal_n_colors = n_colors
            optimal_color_regions = color_regions
            optimal_total_interrupts = total_interrupts
            continue
        break

    print("Optimized to %d colors with %d interrupts (using %d palette slots)" % (optimal_n_colors, optimal_total_interrupts, len(optimal_color_regions)))
    return optimal_n_colors, optimal_color_regions

The function accepts the image to compress and a max_colors argument, which is the number of colors supported by the screen mode (16). This is the lower bound: the minimum number of colors we should be able to get in the image. The total_n_colors argument contains the total number of colors in the image, capped at 127 -- the number of colors in the SAM palette. This is the upper bound: the maximum number of colors we can use. If total_n_colors < 16 we skip optimization entirely.

Each optimization round is as follows -

  • calculate_color_regions generates a dictionary of color regions in the image. Each region is a (start, end) tuple of y positions in the image where a particular color is found. Each color will usually have many blocks.
  • compress_non_overlapping takes colors with few blocks and tries to combine them with other colors with no overlapping regions: transitions between colors will be handled by interrupts
  • simplify takes the resulting color regions and tries to simplify them further, grouping blocks back with their own colors if they can and then combining adjacent blocks
  • total_colors: the length of color_regions is now the number of colors used
  • split_initial_colors removes the first block of each color, leaving only the transitions, to get the total number of interrupts

note: The compress_non_overlapping algorithm makes no effort to find the best compression of regions - I experimented with this a bit and it just explodes the number of interrupts for little real gain in image quality.
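As an illustrative building block for the compress_non_overlapping step, overlap between two (start, end) regions can be tested like this (a sketch of the idea, not the converter's actual code):

```python
def regions_overlap(a, b):
    # Two (start, end) ranges overlap unless one ends
    # strictly before the other begins.
    return not (a[1] < b[0] or b[1] < a[0])
```

Two colors whose regions never overlap can share a palette entry, with an interrupt switching the color at each transition.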

The optimization process is brute force - step forward, increase the number of colors by 1 and perform the optimization steps above. If the number of colors > 16 we've gone too far: we return the last successful result, with colors <= 16.

SAM Coupé Palette

Once we have the colors for the image we map the image over to the SAM Coupé palette. Every pixel in the image must have a value between 0-15 -- pixels for colors controlled by interrupts are mapped to their "parent" color. Finally, all the colors are mapped across from their RGB values to the nearest SAM palette number equivalent.

note: This is sub-optimal, since the choice of colors should really be informed by the colors available. But I couldn't find a way to get Pillow to quantize to a fixed palette without dithering.

The mapping is done by calculating the distance in RGB space for each color to each color in the SAM 127 color palette, using the usual RGB color distance algorithm.
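The distance helper isn't shown in the listing below; a plausible minimal version using plain Euclidean distance in RGB space would be:

```python
import math

def distance(c1, c2):
    # Euclidean distance between two (r, g, b) tuples -- the simple,
    # common RGB color distance metric.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(c1, c2)))
```

This treats RGB as a flat 3D space, which is perceptually crude but cheap, and good enough for snapping 16 colors to the nearest of 127 palette entries.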

def convert_image_to_sam_palette(image, colors=16):
    new_palette = []
    rgb = image.getpalette()[:colors*3]
    for r, g, b in zip(rgb[::3], rgb[1::3], rgb[2::3]):

        def distance_to_color(o):
            return distance(o, (r, g, b))

        spalette = sorted(SAM_PALETTE, key=distance_to_color)
        new_palette.append(spalette[0])

    palette = [c for i in new_palette for c in i]
    image.putpalette(palette)
    return image

Packing bits

Now that our image contains pixel values 0-15 we can pack the bits and export the data. We iterate through the flattened data in steps of 2 and pack each pair into a single byte:

pixels = np.array(image16)

image_data = []
pixel_data = pixels.flatten()
# Generate bytestream and palette; pack to 2 pixels/byte.
for a, b in zip(pixel_data[::2], pixel_data[1::2]):
    byte = (a << 4) | b
    image_data.append(byte)

image_data = bytearray(image_data)

The operation a << 4 shifts the bits of integer a left by 4, so 15 (00001111) becomes 240 (11110000), while | ORs the result with b. If a = 0100 and b = 0011 the result would be 01000011 with both values packed into a single byte.
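Going the other way (as the sam2img converter mentioned later must), unpacking is just the reverse shifts; a quick sketch with a name of my own choosing:

```python
def unpack_byte(byte):
    # Reverse of (a << 4) | b: high nibble first, then low nibble.
    return byte >> 4, byte & 0x0F
```

For example, the packed byte 0b01000011 unpacks back into the pixel values 4 and 3.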

Writing the SCREEN$

Finally, the image data is written out, along with the palette data and line interrupts.

        # Additional 4 bytes 0, 17, 34, 127; mode 3 temporary store.
        bytes4 = b'\x00\x11\x22\x7F'

        with open(outfile, 'wb') as f:
            f.write(image_data)
            # Write palette.
            f.write(palette)

            # Write extra bytes (4 bytes, 2nd palette, 4 bytes)
            f.write(bytes4)
            f.write(palette)
            f.write(bytes4)

            # Write line interrupts
            f.write(interrupts)

            # Write final byte.
            f.write(b'\xff')

To actually view the result, I recommend the SAM Coupé Advanced Disk Manager.

You can see the source code for the img2sam converter on Github.

Examples

Below are some example images, converted from PNG/JPG source images to SAM Coupé MODE 4 SCREEN$ and then back into PNGs for display. The palette of each image is restricted to the SAM Coupé's 127 colors and colors are modified using interrupts.

Pool Pool 16 colors, no interrupts

Pool Pool 24 colors, 12 interrupts (compare gradients)

This image pair shows the effect of line interrupts on an image without dither. The separation between the differently colored pool balls makes this a good candidate.

Leia Leia 26 colors, 15 interrupts

Tully Tully 22 colors, 15 interrupts

The separation between the helmet (blue, yellow components) and horizontal line in the background make this work out nicely. Same for the second image of Tully below.

Isla Isla 18 colors, 6 interrupts

Tully (2) Tully (2) 18 colors, 5 interrupts

Dana Dana 17 colors, 2 interrupts

Lots of images don't compress well because the same shades are used throughout the image. This is made worse by the conversion to the SAM's limited palette of 127.

Interstellar Interstellar 17 colors, 3 interrupts

Blade Runner Blade Runner 16 colors (11 used), 18 interrupts

This last image doesn't manage to squeeze more than 16 colors out of the image, but does reduce the number of colors used for those 16 to just 11. This gives you 5 spare colors to add something else to the image.

Converting SCREEN$ to Image

Included in the scrimage package is the sam2img converter, which will take a SAM MODE4 SCREEN$ and convert it to an image. The conversion process respects interrupts and when exporting to GIF will export flashing palettes as animations.

The images above were all created using sam2img on SCREEN$ created with img2sam. The following two GIFs are examples of export from SAM Coupé SCREEN$ with flashing palettes.

Flashing palette Flashing palette

Flashing palette Flashing palette and flashing Line interrupts

You can see the source code for the sam2img converter on Github.


  1. An ideal image either has gradients down the image, or regions of isolated non-overlapping color. But it's hard to predict as conversion to the SAM palette can run some colors together. 

  2. I experimented a bit with converting to MODE 3, but only 4 colors meant not very exciting results. 

  3. With faster flash speeds (1 50th/second) you can use it to sort of merge nearby colors to create additional shades, while giving yourself a headache. 




TestDriven.io: Working with Static and Media Files in Django

This article looks at how to work with static and media files in a Django project, locally and in production.