Friday, November 30, 2018

gamingdirectional: Create a game over scene for pygame project

In this article we are going to create a game over scene for the pygame project. We will reuse the start scene class we created previously to render the game over scene, slightly modifying it for multi-scene use. Here is the modified version of the start scene class; as you can see, we have passed a state variable into the start scene class's draw method, which will be used...

Source



from Planet Python
via read more

Shannon -jj Behrens: JJ's Mostly Adequate Summary of the Django Meetup: When *Not* To Use the ORM & Goodbye REST: Building GraphQL APIs with Django

The Django meetup was at Prezi. They have a great space. They are big Django users.

Goodbye REST: Building APIs with Django and GraphQL

Jaden Windle, @jaydenwindle, lead engineer at Jetpack.

https://github.com/jaydenwindle/goodbye-rest-talk

They moved from Django REST Framework to GraphQL.

It sounds like a small app.

They're using Django, React, and React Native.

I think he said they used Reason and moved away from it, but I could be wrong.

They had to do a rebuild anyway, so they used GraphQL.

He said not to switch to GraphQL just because it's cool, and in general, avoid hype-driven development.

GraphQL is a query language, spec, and collection of tools designed to operate over a single endpoint via HTTP, optimizing for performance and flexibility.

Key features:

  • Query for only the data you need.
  • Easily query for multiple resources in a single request.
  • Great front end tooling for handling caching, loading / error states, and updates.

(I wonder if he's talking about Apollo, not just GraphQL. I found out later that they are using Apollo.)

GraphQL Schema:

  • Types
  • Queries
  • Mutations
  • *Subscriptions

Graphene is the go-to framework for GraphQL in Django.

He showed an example app.

You create a type and connect it to a model. It's like a serializer in DRF.

(He's using VS Code.)

It knows how to understand relationships between types based on the relationships in the models.

He showed the query syntax.

He showed how Graphene connects to the Django model. You're returning raw Django model objects, and it takes care of serialization.
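As a rough sketch of what this looks like (the Book model and field names are my own invention, not from the talk), a Graphene-Django type plus a query can be as small as this:

import graphene
from graphene_django import DjangoObjectType

from books.models import Book  # hypothetical Django model


class BookType(DjangoObjectType):
    class Meta:
        model = Book  # fields are derived from the model, much like a DRF serializer


class Query(graphene.ObjectType):
    books = graphene.List(BookType)

    def resolve_books(self, info):
        # Return raw Django model objects; Graphene takes care of serialization.
        return Book.objects.all()


schema = graphene.Schema(query=Query)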

There's a really nice UI where you can type in a query, and it queries the server. It has autocomplete. I can't tell if this is from Apollo, Graphene, or some other GraphQL tool.

You only pass what you need across the wire.

When you do a mutation, you can also specify what data it should give back to you.

There is some Mutation class that you subclass.

The code looks a lot like DRF.
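A hedged sketch of such a mutation (reusing the hypothetical BookType and Book model from the sketch above):

class CreateBook(graphene.Mutation):
    class Arguments:
        title = graphene.String(required=True)

    # The client can ask for this field in the mutation's response payload.
    book = graphene.Field(BookType)

    def mutate(self, info, title):
        book = Book.objects.create(title=title)
        return CreateBook(book=book)


class Mutation(graphene.ObjectType):
    create_book = CreateBook.Field()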

Subscriptions aren't fully implemented in Graphene. His company is working on implementing and open sourcing something. There are a bunch of other possible real-time options--http://graphql-python/graphql-ws is one.

There's a way to do file uploads.

Important: There's this thing called graphene-django-extras. There's even something to connect to DRF automatically.

Pros:

  • Dramatically improves front end DX
  • Flexible types allow for quick iteration
  • Always up-to-date documentation
  • Only send needed data over the wire

Cons:

  • Graphene isn't as mature as some other GraphQL implementations (for instance, in JS and Ruby)
  • Logging is different when using a single GraphQL endpoint
  • REST is currently better at server-side caching (E-Tag, etc.)

Graphene 3 is coming.

In the Q&A, they said they do use Apollo.

They're not yet at a scale where they have to worry about performance.

He's not entirely sure whether it's prone to the N+1 queries problem, but there are GitHub issues related to that.

You can do raw ORM or SQL queries if you need to. Otherwise, he's not sure what it's doing behind the scenes.

You can add permissions to the models. There's also a way to tie into Django's auth model.

Their API isn't a public API. It's only for use by their own client.

The ORM, and When Not To Use It

Christophe Pettus from PostgreSQL Experts.

He thinks the ORM is great.

The first ORM he wrote was written before a lot of the audience was born.

Sometimes, writing code for the ORM is hard.

Database agnosticism isn't as important as you think it is. For instance, you don't make the paint on your house color-agnostic.

Libraries have to be DB agnostic. Your app probably doesn't need to be.

Reasons you might want to avoid the query language:

  • Queries that generate sub-optimal SQL
  • Queries more easily expressed in SQL
  • SQL features not available via the ORM

Django's ORM's SQL is much better than it used to be.

Don't use __in with very large lists. 100 is about the longest list you should use.

Avoid very deep joins.

It's not hard to chain together a bunch of stuff that ends up generating SQL that's horrible.

The query generator doesn't do a separate optimization pass that makes the query better.

It's better to express .filter().exclude().filter() in SQL.

There are so many powerful functions and operators in PostgreSQL!

SomeModel.objects.raw() is your friend!

You can write raw SQL, and yet still have it integrate with Django models.

You can even get stuff back from the database that isn't in the model definition.
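For instance, a minimal sketch (the Person model, table, and column names are invented for illustration):

from myapp.models import Person  # hypothetical model backed by table myapp_person

people = Person.objects.raw(
    """
    SELECT id, first_name, last_name,
           first_name || ' ' || last_name AS full_name  -- not a field on the model
    FROM myapp_person
    WHERE last_name = %s
    """,
    ["Smith"],
)

for p in people:
    print(p.full_name)  # the extra column comes back as an attribute on the instance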

There's some WITH RECURSIVE thing in PostgreSQL that would be prohibitively hard to do with the Django ORM. It's not really recursive--it's iterative.

You can also do queries without using the model framework.

The model framework is very powerful, but it's not cheap.

Interesting: The data has to be converted into Python data and then again into model data. If you're just going to serialize it into JSON, why create the model objects? You can even create the JSON directly in the database and hand it back directly with PostgreSQL. But make sure the database driver doesn't convert the JSON back to Python ;) Return it as raw text.
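A sketch of that idea using Django's raw DB cursor (the table and columns are invented); casting to text stops the driver from parsing the JSON back into Python objects:

from django.db import connection

def things_as_json():
    with connection.cursor() as cur:
        # Build the JSON payload inside PostgreSQL and return it as raw text.
        cur.execute(
            "SELECT coalesce(json_agg(t), '[]')::text "
            "FROM (SELECT id, name FROM myapp_thing) AS t"
        )
        return cur.fetchone()[0]  # a JSON string, ready to hand straight to the client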

There are also tables that Django can't treat as model tables. For instance, there are logging tables that lack a primary key. Sometimes, you have weird tables with non-Django-able primary keys.

The ORM is great, though. For instance, it's great for basic CRUD.

Interfaces that require building queries in steps are better done with SQL--for instance, an advanced search function.

Summary:

  • Don't be afraid to step outside the ORM.
  • SQL isn't a bug. It's a feature. It's code like everything else.
  • Do use the ORM for operations that it makes easier.
  • Don't hesitate to use the full power of SQL.

Q&A:

Whether you put your SQL in model methods or managers is a matter of taste. Having all the code for a particular model in one place (i.e. the model or manager) is useful.

Write tests. Use CI.

Use parameter substitution in order to avoid SQL injection attacks. Remember, all external input is hostile. You can't use parameter substitution if the table name is dynamic--just be careful with what data you allow.
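A minimal example with the Django cursor (auth_user is Django's default user table); the driver does the quoting, so hostile input can't break out of the query:

from django.db import connection

def find_user_ids(email):
    with connection.cursor() as cur:
        # %s is filled in by the database driver, never by Python string formatting.
        cur.execute("SELECT id FROM auth_user WHERE email = %s", [email])
        return [row[0] for row in cur.fetchall()]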

If you're using PostGIS, write your own queries for sure. It's really easy to shoot yourself in the foot with the GIS package.

 

 



from Planet Python
via read more

Shannon -jj Behrens: PyCon Notes: PostgreSQL Proficiency for Python People

In summary, this tutorial was fantastic! I learned more in three hours than I would have learned if I had read a whole book!

Here's the video. Here are the slides. Here are my notes:

Christophe Pettus was the speaker. He's from PostgreSQL Experts.

PostgreSQL is a rich environment.

It's fully ACID compliant.

It has the richest set of features of any modern, production RDBMS. It has even more features than Oracle.

PostgreSQL focuses on quality, security, and spec compliance.

It's capable of very high performance: tens of thousands of transactions per second, petabyte-sized data sets, etc.

To install it, just use your package management system (apt, yum, etc.). Those systems will usually take care of initialization.

There are many options for OS X. Heroku even built a Postgres.app that runs more like a foreground app.

A "cluster" is a single PostgreSQL server (which can manage multiple databases).

initdb creates the basic file structure for a new cluster; you run it before starting the server. PostgreSQL has to be up and running before you can create databases.

To create a database:

sudo su - postgres
psql

create database this_new_database;

To drop a database:

drop database this_new_database;

Debian runs initdb for you. Red Hat does not.

Debian has a cluster management system. Use it. See, for instance, pg_createcluster.

Always create databases as UTF-8. Once you've created it, you can't change it.

Don't use SQLASCII. It's a nightmare. Don't use "C locale".

pg_ctl is a built-in command to start and stop PostgreSQL:

cd POSTGRES_DIRECTORY
pg_ctl -D . start

Usually, pg_ctl is wrapped by something provided by your platform.

On Ubuntu, start PostgreSQL via:

service postgresql start

Always use "-m fast" when stopping.

Postgres puts its own data in a top-level directory. Let's call it $PGDATA.

Don't monkey around with that data.

pg_clog and pg_xlog are important. Don't mess with them.

On most systems, configuration lives in $PGDATA.

postgresql.conf contains server configuration.

pg_hba.conf contains authentication settings.

postgresql.conf can feel very overwhelming.

Avoid making a lot of changes to postgresql.conf. Instead, add the following to it:

include "postgresql.conf.include"

Then, mess with "postgresql.conf.include".

The important parameters fall into these categories: logging, memory, checkpoints, and the planner.

Logging:

Be generous with logging. It has a very low impact on the system. It's your best source of info for diagnosing problems.

You can log to syslog or log CSV to files. He showed his typical logging configuration.

He showed his guidelines / heuristics for all the settings, including how to finetune things. They're really good! See his slides.

As of version 9.3, you don't need to tweak Linux kernel parameters anymore.

Do not mess with fsync or  synchronous_commit.

Most settings require a server reload to take effect. Some things require a server restart. Some can be set on a per-session basis. Here's how to do that. This is also an example of how to use a transaction:

begin;
set local random_page_cost = 2.5;
show random_page_cost;
abort;

Postgres has users and roles. Roles are like groups. They form a hierarchy.

A user is just a role with login privs.

Don't use the "postgres" superuser for anything application-related.

Sadly, you probably will have to grant schema-modification privs to your app user if you use migrations, but if you don't have to, don't.

By default, DB traffic is not encrypted. Turn on SSL if you are running in a cloud provider.

In pg_hba.conf, "trust" means if they can log into the server, they can access Postgres too. "peer" means they can have a Postgres user that matches their username. "md5" means password authentication using an MD5 hash.

It's a good idea to restrict the IP addresses allowed to talk to the server fairly tightly.

The WAL

The Write-Ahead Log is key to many Postgres operations. It's the basis for replication, crash recovery, etc.

When each transaction is committed, it is logged to the write-ahead log.

Changes in the transaction are flushed to disk.

If the system crashes, the WAL is "replayed" to bring the DB to a consistent state.

It's a continuous record of changes since the last checkpoint.

The WAL is stored in 16MB segments in the pg_xlog directory.

Never delete anything from pg_xlog.

archive_command is a way to move the WAL segments to someplace safe (like a different system).

By default, synchronous_commit is on, which means that commits do not return until the WAL flush is done. If you turn it off, they'll return when the WAL flush is queued. You might lose transactions in the case of a crash, but there's no risk of database corruption.

Backup and Recovery

Experience has shown that 20% of the time, your EBS volumes will not reattach when you reboot in AWS.

pg_dump is a built-in dump/restore tool.

It takes a logical snapshot of the database.

It doesn't lock the database or prevent writes to disk.

pg_restore restores the database. It's not fast.

It's great for simple backups but not suitable for fast recovery from major failures.

pg_bench is the built in benchmarking tool.

pg_dump -Fc --verbose example > example.dump

Without the -Fc, it dumps SQL commands instead of its custom format.

pg_restore --dbname=example_restored --verbose example.dump

pg_restore takes a long time because it has to recreate indexes.

pg_dumpall --globals-only

Back up each database with pg_dump using --format=custom.

To do a parallel restore, use --jobs= with the number of parallel jobs.

If you have a large database, pg_dump may not be appropriate.

A disk snapshot + every WAL segment is enough to recreate the database.

To start a PITR (point in time recovery) backup:

select pg_start_backup(...);

Copy the disk image and any WAL files that are created.

select pg_stop_backup();

Make sure you have all the WAL segments.

The disk image + all the WAL segments are enough to create the DB.

See also github.com/wal-e/wal-e. It's highly recommended.

It automates backups to S3.

He explained how to do a PITR.

With PITR, you can rollback to a particular point in time. You don't have to replay everything.

This is super handy for application failures.

RDS is something that scripts all this stuff for you.

Replication

Send the WAL to another server.

Keep the server up to date with the primary server.

That's how PostgreSQL replication works.

The old way was called "WAL Archiving". Each 16MB segment was sent to the secondary when complete. Use rsync, WAL-E, etc., not scp.

The new way is Streaming Replication.

The secondary gets changes as they happen.

It's all setup via recovery.conf in your $PGDATA.

He showed a recovery.conf for a secondary machine, and showed how to let it become the master.

Always have a disaster recovery strategy.

pg_basebackup is a utility for doing a snapshot of a running server. It's the easiest way to take a snapshot to start a new secondary. It's also useful for archival backups. It's not the fastest thing, but it's pretty foolproof.

Replication:

The good:

Easy to setup.

Schema changes are replicated.

Secondaries can handle read-only queries for load balancing.

It either works or it complains loudly.

The bad:

You get the entire DB cluster or none of it.

No writes of any kind to the secondary, not even temporary tables.

Some things aren't replicated like temporary tables and unlogged tables.

His advice is to start with WAL-E. The README tells you everything. It fixes a ton of problems.

The biggest problem with WAL-E is that writing to S3 can be slow.

Another way to do funky things is trigger-based replication. There's a bunch of third-party packages to do this.

Bucardo is one that lets you do multi-master setups.

However, they're fiddly and complex to set up. They can also fail quietly.

Transactions, MVCC, and Vacuum

BEGIN;
INSERT ...;
INSERT ...;
COMMIT;

By the way, no bank works this way ;)

Everything runs inside of a transaction.

If there is no explicit transaction, each statement is wrapped in one for you.

Everything that modifies the database is transactional, even schema changes.

\d shows you all your tables.

With a transaction, you can even rollback a table drop.

South (the Django migration tool) runs the whole migration in a single transaction.

Many resources are held until the end of a transaction. Keep your transactions brief and to the point.

Beware of "IDLE IN TRANSACTION" sessions. This is a problem for Django apps.

A tuple in Postgres is the same thing as a row.

Postgres uses Multi-Version Concurrency Control. Each transaction sees its own version of the database.

Writers only block writers to the same tuple. Nothing else causes blocking.

Postgres will not allow two snapshots to "fork" the database. If two people try to write to the same tuple, Postgres will block one of them.

There are higher isolation modes. His description of them was really interesting.

He suggested that new apps use SERIALIZABLE. This will help you find the concurrency errors in your app.
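A sketch of requesting that from psycopg2 (the connection string is a placeholder); be prepared to retry on serialization failures, which is how the conflicts surface:

import psycopg2

conn = psycopg2.connect("dbname=example")
# Ask for the strictest isolation level for every transaction on this connection.
conn.set_session(isolation_level="SERIALIZABLE")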

Deleted tuples are not usually immediately freed.

Vacuum's primary job is to scavenge tuples that are no longer visible to any transaction.

autovacuum generally handles this problem for you without intervention (since version 8).

Run analyze after a major database change to help the planner out.

If someone tells you "vacuum's not working", they're probably wrong.

The DB generally stabilizes at 20% to 50% bloat. That's acceptable.

The problem might be that there are long-running transactions or idle-in-transaction sessions. They'll block vacuuming. So will manual table locking.

He talked about vacuum issues for rare situations.

Schema Design

Normalization is important, but don't obsess about it.

Pick "entities". Make sure that no entity-level info gets pushed into the subsidiary items.

Pick a naming scheme and stick with it.

Plural or singular? DB people tend to like plural. ORMs tend to like singular.

You probably want lower_case to avoid quoting.

Calculated denormalization can sometimes be useful; copied denormalization is almost never useful.

Joins are good.

PostgreSQL executes joins very efficiently. Don't be afraid of them.

Don't worry about large tables joined with small tables.

Use the typing system. It has a rich set of types.

Use domains to create custom types.

A domain is a core type + a constraint.

Don't use polymorphic fields (fields whose interpretation is dependent on another field).

Don't use strings to store multiple types.

Use constraints. They're cheap and fast.

You can create constraints across multiple columns.

Avoid Entity-Attribute-Value schemas. They cause great pain. They're very inefficient. They make reports very difficult.

Consider using UUIDs instead of serials as synthetic keys.

The problem with serials for keys is that merging tables can be hard.

Don't have "Thing" tables like "Object" tables.

If a table has a few frequently-updated fields and a few slowly-updated fields, consider splitting the table. Split the fast-moving stuff out into a separate 1-to-1 table.

Arrays are a first-class type in PostgreSQL. They're a good substitute for a subsidiary table.

A list of tags is a good fit for arrays.
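For example, with Django's PostgreSQL support, a tag list can live directly on the row (the model is hypothetical):

from django.contrib.postgres.fields import ArrayField
from django.db import models


class Article(models.Model):
    title = models.CharField(max_length=200)
    # A PostgreSQL text[] column instead of a separate Tag table.
    tags = ArrayField(models.CharField(max_length=50), default=list)


# Articles tagged "postgres":
# Article.objects.filter(tags__contains=["postgres"])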

He talked about hstore. It's much better than Entity-Attribute-Value. It's great for optional, variable attributes. It's like a hash. It can be indexed, searched, etc. It lets you add attributes to tables for users. Don't use it as a way to avoid all table modifications.

json is now a built in type.

There's also jsonb.

Avoid indexes on big things, like 10k character strings.

NULL is a total pain in the neck.

Only use it to mean "missing value".

Never use it to represent a meaningful value.

Let's call anything 1MB or more a "very large object". Store them in files. Store the metadata in the database. The database API is just not a good fit for this.

Many-to-many tables can get extremely large. Consider replacing them with array fields (either one way or both directions). You can use a trigger to maintain integrity.

You don't want more than about 250k entries in an array.

Use UTF-8. Period.

Always use TIMESTAMPTZ (which Django uses by default). Don't use TIMESTAMP. TIMESTAMPTZ is a timestamp converted to UTC.

Index types:

B-Tree

Use a B-Tree on a column if you frequently query on that column, use one of the comparison operators, only get back 10-15% of the rows, and run that query frequently.

It won't use the index if you're going to get back more than 15% of the rows, because it's faster to scan the table than to scan the index.

Use a partial index if you can ignore most of the rows.

The entire tuple has to be copied into the index.

GiST

It's a framework to create indexes.

KNN indexes are the K-nearest neighbors.

GIN

Generalized inverted index. Used for full-text search.

The other index types are either not good or very specific.

Why isn't it using my index?

Use explain analyze to look at the query.

If it thinks it's going to require most of the rows, it'll do a table scan.

If it's wrong, use analyze to update the planner stats.

Sometimes, it can't use the index.

Two ways to create an index:

create index

create index concurrently

reindex rebuilds an index from scratch.

pg_stat_user_indexes tells you about how your indexes are being used.

What do you do if a query is slow:

Use explain or explain analyze.

explain doesn't actually run the query.

"Cost" is measured in arbitrary units. Traditionally, they have been "disk fetches". Costs are inclusive of subnodes.

I think explain analyze actually runs the query.

Things that are bad:

Joins between 2 large tables.

Cross joins (cartesian products). These often happen by accident.

Sequential scans on large tables.

select count(*) is slow because it results in a full table scan, since you have to see if the tuples are alive or dead.

offset / limit. These actually run the query and then throw away that many rows. Beware that GoogleBot is relentless. Use other keys.

If the database is slow:

Look at pg_stat_activity:

select * from pg_stat_activity;

tail -f the logs.

Too much I/O? iostat 5.

If the database isn't responding:

Try connecting with it using psql.

pg_stat_activity

pg_locks

Python Particulars

psycopg2 is the only real option in Python 2.

The result set of a query is loaded into client memory when the query completes. If there are a ton of rows, you could run out of memory. If you want to scroll through the results, use a "named" cursor. Be sure to dispose of it properly.
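A sketch of a server-side ("named") cursor in psycopg2; the connection string, table, and process() helper are placeholders:

import psycopg2

conn = psycopg2.connect("dbname=example")

# Naming the cursor makes it server-side: rows are fetched in batches
# instead of the whole result set being loaded into client memory.
cur = conn.cursor(name="big_scan")
cur.itersize = 2000  # rows fetched per round trip

cur.execute("SELECT id, payload FROM big_table")
for row in cur:
    process(row)  # placeholder for real work

cur.close()  # dispose of the server-side cursor properly
conn.close()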

The Python 3 situation is not so great. There's py-postgresql. It's pure Python.

If you are using Django 1.6+, use the @atomic decorator.

Cluster all your writes into small transactions. Leave read operations outside.

Do all your writes at the very end of the view function.
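A minimal sketch of that pattern (the models and helpers are invented): reads and validation happen first, and the writes are clustered in one small atomic block at the end of the view.

from django.db import transaction
from django.http import JsonResponse

def place_order(request):
    # Reads and validation, outside any transaction.
    cart = load_cart(request)        # hypothetical helper
    total = compute_total(cart)      # hypothetical helper

    # All writes clustered into one short transaction at the end.
    with transaction.atomic():
        order = Order.objects.create(user=request.user, total=total)
        OrderLine.objects.bulk_create([OrderLine(order=order, item=i) for i in cart])

    return JsonResponse({"order_id": order.id})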

Multi-database works very nicely with hot standby.

Point the writes at the primary, and the reads at the secondary.

For Django 1.5, use the @xact decorator.

Sloppy transaction management can cause the dreaded Django idle-in-transaction problem.

Use South for database migration. South is getting merged into Django in version 1.7 of Django.

You can use manual migrations for stuff the Django ORM can't specify.

Special Situations

Upgrade to 9.3.4. Upgrade minor versions promptly.

Major version upgrades require more planning. pg_upgrade has to be run when the database is not running.

A full pg_dump / pg_restore is always the safest, although not the most practical.

Always read the release notes.

All parts of a replication set must be upgraded at once (for major versions).

Use copy, not insert, for bulk loading data. psycopg2 has a nice interface. Do a vacuum afterwards.
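A sketch of bulk loading with psycopg2's COPY support (connection string, file, and table are placeholders):

import psycopg2

conn = psycopg2.connect("dbname=example")

with conn, conn.cursor() as cur, open("events.csv") as f:
    # COPY streams the whole file into the table, far faster than per-row INSERTs.
    cur.copy_from(f, "events", sep=",", columns=("occurred_at", "kind", "payload"))

# VACUUM can't run inside a transaction block, so do it afterwards in autocommit mode.
conn.autocommit = True
with conn.cursor() as cur:
    cur.execute("VACUUM ANALYZE events")

conn.close()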

AWS

Instances can disappear and come back up without instance storage.

EBS can fail to reattach after reboot.

PIOPS are useful (but pricey) if you are using EBS.

Script everything, instance creation, PostgreSQL, etc. Use Salt. Use a VPC.

Scale up and down as required to meet load. If you're just using them to rent a server, it's really expensive.

PostgreSQL RDS is a managed database instance. Big plus: automatic failover! Big minus: you can't read from the secondary. It's expensive. It's a good place to start.

Sharding

Eventually, you'll run out of write capacity on your master.

postgres-xc is an open source fork of PostgreSQL.

Bucardo provides multi-master write capability.

He talked about custom sharding.

Instagram wrote a nice article about it.

Pooling

Opening a connection is expensive. Use a pooler.

pgbouncer is a pooler.

pgPool II can even do query analysis. However, it has higher overhead and is more complex to configure.

Tools

Monitor everything.

check_postgres.pl is a plugin to monitor PostgreSQL.

pgAdmin III and Navicat are nice clients.

pgbadger is for log analysis. So is pg_stat_statements.

Closing

MVCC works by each tuple having a range of transaction IDs that can see that tuple.

Failover is annoying to do in the real world. People use HAProxy, some pooler, etc. with some scripting, or they have a human do the failover.

HandyRep is a server-based tool designed to allow you to manage a PostgreSQL "replication cluster", defined as a master and one or more replicas on the same network.


from Planet Python
via read more

Understanding Conda and Pip

Conda and pip are often considered as being nearly identical. Although some of the functionality of these two tools overlap, they were designed and should be used for different purposes. Pip is the Python Packaging Authority’s recommended tool for installing packages from the Python Package Index, PyPI. Pip installs Python software packaged as wheels or …
Read more →

The post Understanding Conda and Pip appeared first on Anaconda.



from Planet SciPy
read more

Stack Abuse: Handling Unix Signals in Python

UNIX/Linux systems offer special mechanisms to communicate between individual processes. One of these mechanisms is signals, which belong to the different methods of communication between processes (Inter-Process Communication, abbreviated IPC).

In short, signals are software interrupts that are sent to the program (or the process) to notify the program of significant events or requests to the program in order to run a special code sequence. A program that receives a signal either stops or continues the execution of its instructions, terminates either with or without a memory dump, or even simply ignores the signal.

Although it is defined in the POSIX standard, the reaction actually depends on how the developer wrote the script and implemented the handling of signals.

In this article we explain what signals are, and show you how to send a signal to another process from the command line as well as how to process a received signal. Among other modules, the program code is mainly based on the signal module. This module connects the corresponding C headers of your operating system with the Python world.

An Introduction to Signals

On UNIX-based systems, there are three categories of signals:

  • System signals (hardware and system errors): SIGILL, SIGTRAP, SIGBUS, SIGFPE, SIGKILL, SIGSEGV, SIGXCPU, SIGXFSZ, SIGIO

  • Device signals: SIGHUP, SIGINT, SIGPIPE, SIGALRM, SIGCHLD, SIGCONT, SIGSTOP, SIGTTIN, SIGTTOU, SIGURG, SIGWINCH, SIGIO

  • User-defined signals: SIGQUIT, SIGABRT, SIGUSR1, SIGUSR2, SIGTERM

Each signal is represented by an integer value, and the list of signals that are available is comparatively long and not consistent between the different UNIX/Linux variants. On a Debian GNU/Linux system, the command kill -l displays the list of signals as follows:

$ kill -l
 1) SIGHUP   2) SIGINT   3) SIGQUIT  4) SIGILL   5) SIGTRAP
 6) SIGABRT  7) SIGBUS   8) SIGFPE   9) SIGKILL 10) SIGUSR1
11) SIGSEGV 12) SIGUSR2 13) SIGPIPE 14) SIGALRM 15) SIGTERM  
16) SIGSTKFLT   17) SIGCHLD 18) SIGCONT 19) SIGSTOP 20) SIGTSTP  
21) SIGTTIN 22) SIGTTOU 23) SIGURG  24) SIGXCPU 25) SIGXFSZ  
26) SIGVTALRM   27) SIGPROF 28) SIGWINCH    29) SIGIO   30) SIGPWR  
31) SIGSYS  34) SIGRTMIN    35) SIGRTMIN+1  36) SIGRTMIN+2  37) SIGRTMIN+3  
38) SIGRTMIN+4  39) SIGRTMIN+5  40) SIGRTMIN+6  41) SIGRTMIN+7  42) SIGRTMIN+8  
43) SIGRTMIN+9  44) SIGRTMIN+10 45) SIGRTMIN+11 46) SIGRTMIN+12 47) SIGRTMIN+13  
48) SIGRTMIN+14 49) SIGRTMIN+15 50) SIGRTMAX-14 51) SIGRTMAX-13 52) SIGRTMAX-12  
53) SIGRTMAX-11 54) SIGRTMAX-10 55) SIGRTMAX-9  56) SIGRTMAX-8  57) SIGRTMAX-7  
58) SIGRTMAX-6  59) SIGRTMAX-5  60) SIGRTMAX-4  61) SIGRTMAX-3  62) SIGRTMAX-2  
63) SIGRTMAX-1  64) SIGRTMAX  

The signals 1 to 15 are roughly standardized, and have the following meaning on most of the Linux systems:

  • 1 (SIGHUP): terminate a connection, or reload the configuration for daemons
  • 2 (SIGINT): interrupt the session from the dialogue station
  • 3 (SIGQUIT): terminate the session from the dialogue station
  • 4 (SIGILL): illegal instruction was executed
  • 5 (SIGTRAP): do a single instruction (trap)
  • 6 (SIGABRT): abnormal termination
  • 7 (SIGBUS): error on the system bus
  • 8 (SIGFPE): floating point error
  • 9 (SIGKILL): immediately terminate the process
  • 10 (SIGUSR1): user-defined signal
  • 11 (SIGSEGV): segmentation fault due to illegal access of a memory segment
  • 12 (SIGUSR2): user-defined signal
  • 13 (SIGPIPE): writing into a pipe, and nobody is reading from it
  • 14 (SIGALRM): the timer terminated (alarm)
  • 15 (SIGTERM): terminate the process in a soft way

In order to send a signal to a process in a Linux terminal you invoke the kill command with both the signal number (or signal name) from the list above and the id of the process (pid). The following example command sends the signal 15 (SIGTERM) to the process that has the pid 12345:

$ kill -15 12345

An equivalent way is to use the signal name instead of its number:

$ kill -SIGTERM 12345

Which way you choose depends on what is more convenient for you. Both ways have the same effect. As a result the process receives the signal SIGTERM, and terminates immediately.

Using the Python signal Library

Since Python 1.4, the signal library is a regular component of every Python release. In order to use the signal library, import the library into your Python program as follows, first:

import signal  

Capturing and reacting properly to a received signal is done by a callback function - a so-called signal handler. A rather simple signal handler named receiveSignal() can be written as follows:

def receiveSignal(signalNumber, frame):  
    print('Received:', signalNumber)
    return

This signal handler does nothing other than report the number of the received signal. The next step is registering the signals that are caught by the signal handler. For Python programs, all the signals (except 9, SIGKILL) can be caught in your script:

if __name__ == '__main__':  
    # register the signals to be caught
    signal.signal(signal.SIGHUP, receiveSignal)
    signal.signal(signal.SIGINT, receiveSignal)
    signal.signal(signal.SIGQUIT, receiveSignal)
    signal.signal(signal.SIGILL, receiveSignal)
    signal.signal(signal.SIGTRAP, receiveSignal)
    signal.signal(signal.SIGABRT, receiveSignal)
    signal.signal(signal.SIGBUS, receiveSignal)
    signal.signal(signal.SIGFPE, receiveSignal)
    #signal.signal(signal.SIGKILL, receiveSignal)
    signal.signal(signal.SIGUSR1, receiveSignal)
    signal.signal(signal.SIGSEGV, receiveSignal)
    signal.signal(signal.SIGUSR2, receiveSignal)
    signal.signal(signal.SIGPIPE, receiveSignal)
    signal.signal(signal.SIGALRM, receiveSignal)
    signal.signal(signal.SIGTERM, receiveSignal)

Next, we add the process information for the current process, and detect the process id using the method getpid() from the os module. In an endless while loop we wait for incoming signals. We implement this using two more Python modules - os and time. We import them at the beginning of our Python script, too:

import os  
import time  

In the while loop of our main program the print statement outputs "Waiting...". The time.sleep() function call makes the program wait for three seconds.

    # output current process id
    print('My PID is:', os.getpid())

    # wait in an endless loop for signals 
    while True:
        print('Waiting...')
        time.sleep(3)

Finally, we have to test our script. Having saved the script as signal-handling.py we can invoke it in a terminal as follows:

$ python3 signal-handling.py 
My PID is: 5746  
Waiting...  
...

In a second terminal window we send a signal to the process. We identify our first process - the Python script - by the process id as printed on screen, above.

$ kill -1 5746

The signal event handler in our Python program receives the signal we have sent to the process. It reacts accordingly, and simply confirms the received signal:

...
Received: 1  
...

Ignoring Signals

The signal module defines ways to ignore received signals. In order to do that the signal has to be connected with the predefined function signal.SIG_IGN. The example below demonstrates that, and as a result the Python program cannot be interrupted by CTRL+C anymore. To stop the Python script an alternative way has been implemented in the example script - the signal SIGUSR1 terminates the Python script. Furthermore, instead of an endless loop we use the method signal.pause(). It just waits for a signal to be received.

import signal  
import os  
import time

def receiveSignal(signalNumber, frame):  
    print('Received:', signalNumber)
    raise SystemExit('Exiting')
    return

if __name__ == '__main__':  
    # register the signal to be caught
    signal.signal(signal.SIGUSR1, receiveSignal)

    # register the signal to be ignored
    signal.signal(signal.SIGINT, signal.SIG_IGN)

    # output current process id
    print('My PID is:', os.getpid())

    signal.pause()

Handling Signals Properly

The signal handler we have used up to now is rather simple, and just reports a received signal. This shows us that the interface of our Python script is working fine. Let's improve it.

Catching the signal is already a good basis but requires some improvement to comply with the rules of the POSIX standard. For higher accuracy, each signal needs a proper reaction (see the list above). This means that the signal handler in our Python script needs to be extended with a specific routine per signal. This works best if we understand what a signal does, and what a common reaction is. A process that receives signal 1, 2, 9 or 15 terminates. In any other case it is expected to write a core dump, too.

Up to now we have implemented a single routine that covers all the signals, and handles them in the same way. The next step is to implement an individual routine per signal. The following example code demonstrates this for the signals 1 (SIGHUP) and 15 (SIGTERM).

def readConfiguration(signalNumber, frame):  
    print ('(SIGHUP) reading configuration')
    return

def terminateProcess(signalNumber, frame):  
    print ('(SIGTERM) terminating the process')
    sys.exit()

The two functions above are connected with the signals as follows:

    signal.signal(signal.SIGHUP, readConfiguration)
    signal.signal(signal.SIGTERM, terminateProcess)

Running the Python script, and sending the signal 1 (SIGHUP) followed by a signal 15 (SIGTERM) by the UNIX commands kill -1 16640 and kill -15 16640 results in the following output:

$ python3 daemon.py
My PID is: 16640  
Waiting...  
Waiting...  
(SIGHUP) reading configuration
Waiting...  
Waiting...  
(SIGTERM) terminating the process

The script receives the signals, and handles them properly. For clarity, this is the entire script:

import signal  
import os  
import time  
import sys

def readConfiguration(signalNumber, frame):  
    print ('(SIGHUP) reading configuration')
    return

def terminateProcess(signalNumber, frame):  
    print ('(SIGTERM) terminating the process')
    sys.exit()

def receiveSignal(signalNumber, frame):  
    print('Received:', signalNumber)
    return

if __name__ == '__main__':  
    # register the signals to be caught
    signal.signal(signal.SIGHUP, readConfiguration)
    signal.signal(signal.SIGINT, receiveSignal)
    signal.signal(signal.SIGQUIT, receiveSignal)
    signal.signal(signal.SIGILL, receiveSignal)
    signal.signal(signal.SIGTRAP, receiveSignal)
    signal.signal(signal.SIGABRT, receiveSignal)
    signal.signal(signal.SIGBUS, receiveSignal)
    signal.signal(signal.SIGFPE, receiveSignal)
    #signal.signal(signal.SIGKILL, receiveSignal)
    signal.signal(signal.SIGUSR1, receiveSignal)
    signal.signal(signal.SIGSEGV, receiveSignal)
    signal.signal(signal.SIGUSR2, receiveSignal)
    signal.signal(signal.SIGPIPE, receiveSignal)
    signal.signal(signal.SIGALRM, receiveSignal)
    signal.signal(signal.SIGTERM, terminateProcess)

    # output current process id
    print('My PID is:', os.getpid())

    # wait in an endless loop for signals 
    while True:
        print('Waiting...')
        time.sleep(3)

Further Reading

Using the signal module and an appropriate event handler, it is relatively easy to catch signals. Knowing the meaning of the different signals, and reacting properly as defined in the POSIX standard, is the next step. It requires that the event handler distinguishes between the different signals and has a separate routine for each of them.



from Planet Python
via read more

Codementor: Subtleties of Python

A good software engineer understands how crucial attention to detail is; minute details, if overlooked, can make a world of difference between a working unit and a disaster. That’s why writing...

from Planet Python
via read more

PyBites: 3 Cool Things You Can do With the dateutil Module

In this short article I will show you how to use dateutil's parse, relativedelta and rrule to make it easier to work with datetimes in Python.

First some necessary imports:

>>> from datetime import date, datetime
>>> from dateutil.parser import parse
>>> from dateutil.relativedelta import relativedelta
>>> from dateutil.rrule import rrule, WEEKLY, WE

1. Parse a datetime from a string

This is actually what made me look into dateutil to start with. Camaz shared this technique in the forum for Bite 7, Parsing dates from logs.

Imagine you have this log line:

>>> log_line = 'INFO 2014-07-03T23:27:51 supybot Shutdown complete.'

Up until recently I used datetime's strptime like so:

>>> date_str = '%Y-%m-%dT%H:%M:%S'
>>> datetime.strptime(log_line.split()[1], date_str)
datetime.datetime(2014, 7, 3, 23, 27, 51)

More string manipulation and you have to know the format string syntax. dateutil's parse takes this complexity away:

>>> timestamp = parse(log_line, fuzzy=True)
>>> print(timestamp)
2014-07-03 23:27:51
>>> print(type(timestamp))
<class 'datetime.datetime'>

2. Get a timedelta in months

A limitation of datetime's timedelta is that it does not show the number of months:

>>> today = date.today()
>>> pybites_born = date(year=2016, month=12, day=19)
>>> (today-pybites_born).days
711

So far so good. However this does not work:

>>> (today-pybites_born).years
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'datetime.timedelta' object has no attribute 'years'

Nor this:

>>> (today-pybites_born).months
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'datetime.timedelta' object has no attribute 'months'

relativedelta to the rescue:

>>> diff = relativedelta(today, pybites_born)
>>> diff.years
1
>>> diff.months
11

When you need months, use relativedelta. And yes, we can almost celebrate two years of PyBites!

Another use case of this appeared in my previous article, How to Test Your Django App with Selenium and pytest, where I used it to get the last 3 months for our new Platform Coding Streak feature:

>>> def _make_3char_monthname(dt):
...     return dt.strftime('%b').upper()
...
>>> this_month = _make_3char_monthname(today)
>>> last_month = _make_3char_monthname(today-relativedelta(months=+1))
>>> two_months_ago = _make_3char_monthname(today-relativedelta(months=+2))
>>> for month in (this_month, last_month, two_months_ago):
...     print(f'{month} {today.year}')
...
NOV 2018
OCT 2018
SEP 2018

Let's get next Wednesday for the next example:

>>> next_wednesday = today+relativedelta(weekday=WE(+1))
>>> next_wednesday
datetime.date(2018, 12, 5)

3. Make a range of dates

Say I want to schedule my next batch of Italian lessons, each Wednesday for the coming 10 weeks. Easy:

>>> rrule(WEEKLY, count=10, dtstart=next_wednesday)
<dateutil.rrule.rrule object at 0x1033ef898>

As this will return an iterator and it does not show up vertically, let's materialize it in a list and pass it to pprint:

>>> from pprint import pprint as pp
>>> pp(list(rrule(WEEKLY, count=10, dtstart=next_wednesday)))
[datetime.datetime(2018, 12, 5, 0, 0),
datetime.datetime(2018, 12, 12, 0, 0),
datetime.datetime(2018, 12, 19, 0, 0),
datetime.datetime(2018, 12, 26, 0, 0),
datetime.datetime(2019, 1, 2, 0, 0),
datetime.datetime(2019, 1, 9, 0, 0),
datetime.datetime(2019, 1, 16, 0, 0),
datetime.datetime(2019, 1, 23, 0, 0),
datetime.datetime(2019, 1, 30, 0, 0),
datetime.datetime(2019, 2, 6, 0, 0)]

Double-check with Unix cal

$ cal 12 2018
December 2018
Su Mo Tu We Th Fr Sa
                   1
 2  3  4  5  6  7  8
 9 10 11 12 13 14 15
16 17 18 19 20 21 22
23 24 25 26 27 28 29
30 31

$ cal 1 2019
    January 2019
Su Mo Tu We Th Fr Sa
       1  2  3  4  5
 6  7  8  9 10 11 12
13 14 15 16 17 18 19
20 21 22 23 24 25 26
27 28 29 30 31

$ cal 2 2019
February 2019
Su Mo Tu We Th Fr Sa
                1  2
 3  4  5  6  7  8  9
...

We added an exercise to our platform to create a #100DaysOfCode planning, skipping weekend days. rrule made this relatively easy.


And that's it, my favorite use cases of dateutil so far. There is some timezone functionality in dateutil as well, but I have mostly used pytz for that.

Learn more? Check out this nice dateutil examples page and feel free to share your favorite snippets in the comments below.

Don't forget this is an external library (pip install python-dateutil), for most basic operations datetime would suffice. Another nice stdlib module worth checking out is calendar.


Keep Calm and Code in Python!

-- Bob



from Planet Python
via read more

Reinout van Rees: Amsterdam Python meetup, november 2018

My summary of the 28 november python meetup at the Byte office. I myself also gave a talk (about cookiecutter) but I obviously haven't made a summary of that. I'll try to summarize that one later :-)

Project Auger - Chris Laffra

One of Chris' pet projects is auger, automated unittest generation. He wrote it when lying in bed with a broken ankle and thought about what he hated most: writing tests.

Auger? Automated Unittest GEneRator. It works by running a tracer.

The project's idea is:

  • Write code as always
  • Don't worry about tests
  • Run the auger tracer to record function parameter values and function results.
  • After recording, you can generate mocks and assertions.

"But this breaks test driven development"!!! Actually, not quite. It can be useful if you have to start working on an existing code base without any tests: you can generate a basic set of tests to start from.

So: it records what you did once and uses that as a starting point for your tests. It makes sure that what once worked keeps on working.

It works with a "context manager". A context manager normally has __enter__() and __exit__(). But you can add more interesting things. If in the __enter__() you call sys.settrace(self.trace), you can add a def trace(self, frame, event, arg) method, which is then fired upon everything that happens within the context manager. You can use it for coverage tracking or logging or visualization of what happens in your code. He used the last for algorithm visualizations on http://pyalgoviz.appspot.com/

So... this sys.settrace() magic is used to figure out which functions get called with which parameters.
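A minimal sketch of that trick (not Auger's actual code): a context manager that records every Python function call made inside the with block.

import sys


class CallTracer:
    """Record (function name, locals-at-entry) for calls inside the with block."""

    def __init__(self):
        self.calls = []

    def __enter__(self):
        sys.settrace(self.trace)
        return self

    def __exit__(self, *exc):
        sys.settrace(None)

    def trace(self, frame, event, arg):
        if event == "call":
            self.calls.append((frame.f_code.co_name, dict(frame.f_locals)))
        return self.trace  # keep tracing nested calls too


def add(a, b):
    return a + b


with CallTracer() as tracer:
    add(2, 3)

print(tracer.calls)  # includes ('add', {'a': 2, 'b': 3})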

Functions and classes in the modules you want to check are tested, classes from other modules are partially mocked.

Python LED animation system BiblioPixel - Tom Ritchford

Bibliopixel (https://github.com/ManiacalLabs/BiblioPixel) is his pet project. It is a Python 3 program that runs on basically everything (Raspberry Pi, Linux, OS X, Windows). What does it do? It controls large numbers of lights in real-time without programming.

There are lots of output drivers, from LED strips and Philips Hue to an OpenGL in-browser renderer. There are also lots of different ways to steer it. Here is the documentation.

He actually started on a lot of programs having to do with audio and lights and so on. Starting with a PDP-11 (which only produced beeps), then Amiga, Macintosh (something that actually worked and was used for real), Java, JavaScript, Python + C++. And now Python.

The long-term goal is to programmatically control lights and other hardware in real time. And... he wants to define the project in text files. The actual light "program" should not be in code. Ideally, bits of projects ought to be reusable. And any input ought to be connectable to any output.

Bibliopixel started with the AllPixel LED controller, which had a successful Kickstarter campaign (he got involved two years later).

An "animation" talks to a "layout" and the layout talks to one or more drivers (one could be a debug visualization on your laptop and the other the real physical installation). Animations can be nested.

Above it all is the "Project". A YAML (or json) file that defines the project and configures everything.

Bibliopixel is quite forgiving about inputs. It accepts all sorts of colors (red, #ff0000, etc). Capitalization, missing spaces, extraneous spaces: all fine. Likewise about "controls": a control receives a "raw" message and then tries to convert it into something it understands.

Bibliopixel is very reliable. Lots of automated tests. Hardware test boards to test the code with the eight most common types of hardware. Solid error handling and readable error messages help a lot.

There are some weak points. The biggest is lack of developers. Another problem is that it only supports three colors (RGB), so you can't handle RGBW (RGB plus white) and other such newer combinations. He hopes to move the code over completely to numpy; that would help a lot. Numpy is already supported, but the existing legacy implementation still has to keep working as well.

He showed some nice demos at the end.



from Planet Python
via read more

Bletchley-Completed-kind of!

Bletchley Completed

But It Doesn’t End Well!

It’s been a very long week or so for my Mastermind type game, and my brain.

I made such a mess of the code, I got lost in a tangle of global statements (over 60, no joking, see later in this post about how I solved that) and spaghetti code of the likes never seen before, by even the lowest of BASIC noob programmers.  I am thoroughly ashamed of the code, BUT, it bloody works!

python-mastermind-gui-game-bletchley-screenshot

Question: If the compiled program runs and plays smoothly, and works, does it matter how it was created?

Well, it’s something to think about.  I don’t suppose a non-programmer end user would give two flying figurines about how badly it’s coded, as long as it worked.

Coder, there’s a bug in my spaghetti!

But it still riles me thinking about the days and nights wasted on trying to hunt down simple bugs in the spaghetti and trying to follow the control flow of the program.   It’s so bad that to add even a simple feature or change could (and I’ve tried) make the whole kit and caboodle come crashing down.

For example. I spent the best part of a week on one routine, the game logic where it reports to the user if a peg is in the correct position and colour, correct colour and wrong position or neither, should be simple, it wasn’t, not for me anyway.

I wrote a routine to do those three tasks, but when there were duplicate colours either in the users selection or the secret code, or both, the game always reported these outcomes incorrectly.  The game would be useless in this state.

I tried so hard for five days, but I could not get my head around how to solve the duplicate problem.  In the end, I asked for help on Reddit learn python, and two very nice guys, props to both “MikeTheWatchGuy” and “efmccurdy” who tried to help me.  Without people like these it would be even harder to learn Python, I’m grateful to everyone that even tries to help me.  I do try to give back by helping noobs that are not even at my level yet, but they are few and far  between, ha!

Even with help and advice I couldn’t work out what they were telling me, or how to integrate their code into the alphabetti-spaghetti monster that I had created.

Incidentally, Mike was quite rightly appalled at the length of my code; at one stage it was nearly 3,000 lines long, LOL. To be fair, a large chunk of that was Tkinter GUI stuff and a lot of comments. I have now trimmed it to 2,500 lines, still far too long I know, especially as a non-GUI, no-frills Mastermind game can be written in 50 to 100 lines of code!

python-spaghetti code

Finally, in desperation, I found the source code to someone else’s (non-GUI) Mastermind game to see how they did it.  Thank you Mr F. Grant, his Mastermind code here, I didn’t really understand how it worked exactly, but I understood what inputs were needed (the players entered pegs and the secret code) and how the output of the function had to be used.  I realised it was either make his bit of code work in Bletchley, or the game would have to be shelved.

Note: It looks like Mr F updated and improved the code since I used it.  Also, while I was searching for the original source of his code I came across a couple of coders with the same problem as this on Reddit (I wish I had just searched "mastermind" on Reddit when I first had this problem). Still, it heartens me a little that I am not the only dumb ass around here LOL.

I had written my code to use integers for the colours, like 1=red, 2=blue etc, and then at the required time I would convert each integer to a string like "red".  I thought this would be the smart way to do it, maybe it is, but just my luck, the code I was "borrowing" worked on strings only, goddammit ha-ha, it just made things that little bit more confusing and frustrating.
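For the record, one common way to score a guess that copes with duplicate colours looks roughly like this; it's a generic sketch (it works on strings or integers alike), not Mr Grant's code and not the routine in Bletchley:

from collections import Counter

def score_guess(secret, guess):
    """Return (exact, colour_only) hint counts for a Mastermind guess."""
    # Pegs that match both colour and position.
    exact = sum(s == g for s, g in zip(secret, guess))

    # Colours present in both lists regardless of position,
    # then subtract the exact matches so duplicates aren't double-counted.
    secret_counts = Counter(secret)
    guess_counts = Counter(guess)
    common = sum(min(secret_counts[c], guess_counts[c]) for c in guess_counts)

    return exact, common - exact

# Example with duplicates:
# score_guess(['red', 'red', 'blue', 'green'], ['red', 'blue', 'red', 'red'])
# -> (1, 2)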

I lack many qualities as a human, but I do have pretty good determination most of the time.  I might say ‘I give up‘, a lot, but I still come back after a rest, (well usually).

It was complicated, but I managed it in about four hours, linking up\changing variables, moving\deleting\adding chunks of code, adding more globals to get at his variables, you get the picture, this did not feel like my happiest day of coding though I tell thee.

I set out to make a Mastermind clone with a GUI, and in the end I succeeded, well almost.

I know you are dying to hear what doesn’t work

I’m sorry to report dear reader, that because of my disgraceful planning and design (I warned myself about this a few posts ago), I am unable, at this stage, to get a proper game over\start new game routine to work.

The game is still totally playable, but after each go you have to reload the program, yep that sucks ass, I hate the idea as much as anyone. The program does not even exit gracefully, or at all, unless you use the “x” option in the window frame, or the quit option in the drop-down menu.  Daft thing is they use the same code as the game over routine.

In the end I needed to stop work on this game today, and start writing this post because to be honest I couldn’t face another day of staring at my code, getting a migraine and then getting nowhere.

I tried to get the start new game routine to work, but it was like a leaking boat, as soon as I plugged one leak, two more leaks would appear, ad infinitum, most disheartening.

Not a happy ending, said the masseuse to the old bloke

What to do?  I had a working game with a bad ending?  The game is actually quite good, and a bit of a challenge to get it in six tries.

Look at the first screenshot on this page for a clue as to how I can crack the code about 7 out of 10 tries.  I’ve played so many games on it I think I’ve sussed it out now.

There is also no scoring system, but I don’t think that’s too important, the original game did not have a scoring system anyway.

You can get the 10mb .exe of Bletchley from my Dropbox “Bletchley” folder.  If you want the optional music, then get the “music” folder as well, and place that in the same directory as the .exe file.  Because I am not posting the source code (probably never will) this will only be for Windows I’m afraid.

My forte, at least for the foreseeable future, appears to be making small apps where I can just about get away with bad structure and design.  Because the code is short it doesn’t notice so much or really matter.  That’s no solution of course, but I will get there sooner or later, as I learn more.

hearing frequency sound tester updated v023

Talking of my other small projects. I tried updating TIM to use the Pygame internal mp3 player, rather than an external system player but it keep coming up with “file in use” crashes, so I abandoned that idea, like I said, not a great week or two in my little Python world.

Global to classes, it’s not that hard actually

At the beginning of this stupidly long and rambling post, I mentioned about the 60+ global statements I felt I had to use in Bletchley.  I had mentioned in another post that I had to have a look at classes to help with the global problem, so I did.

I had previously tried to understand classes when I first started out learning Python but I couldn’t make head nor tail of them, so I avoided them completely.  I took another look at classes for the sake of Bletchley.

I went through all the YouTube videos and articles I could find nothing much sunk in, until I found the perfect little article on classes that all beginners that are scared of classes should start with.

That article is concise, plainly laid out, and keeps it simple, which is just how I need it.  Of course I know there must be a lot more to classes than this, and most likely, I am using them incorrectly, but they served their purpose.

I took all my global statements and put them into simple classes, the variables were now global without using the global statement, awesome.

Why aren’t beginners shown this I wonder?  Probably because it’s not ‘Pythonic‘ is my guess.  As this is so easy to do, I suspect the Python experts won’t like it.  Just like global, it’s not complex and confusing enough.

Here is a simple class I used for a set of five globals:

class outcome:
    '''Build a string of the hint pegs (holds what used to be five globals).'''
    def __init__(self, out_come, out_come1, out_come2, out_come3, out_come4):
        self.out_come = out_come
        self.out_come1 = out_come1
        self.out_come2 = out_come2
        self.out_come3 = out_come3
        self.out_come4 = out_come4

# initialise

r1 = outcome(0, 0, 0, 0, 0)

Now if I need to access any of those variables anywhere in my code, including inside functions, I can refer to them as "r1.out_come1", "r1.out_come2" etc.  Note that indentation inside a class body works just like it does inside a function; PEP 8 recommends four spaces for both.

So if you are a beginner and don’t understand classes or are scared of their complexity, like I was, then start with the article I linked above.

 

My excuses for being a bad programmer

The following is a lot of self indulgent crap that you will probably want to skip, or get the violins out.  Up to you.

python violins

Making Bletchley and HI-Lo has made me realise where I am at in Python programming, not nearly as good as I thought I was, and that’s saying something, because I am my own harshest critic.

However, I do have some caveats that I have to allow myself some give on.

I have been ADHD (called “Hyperkinetic Disorder” by my doctor for some reason), since I was a nipper.  I was diagnosed in my 40s.  It’s not the wibbly wobbly, running up walls, physical version [now], but an internal thing, affecting my mind in many ways that I won’t bore you with, apart from to tell you that my mind appears to work quite differently to most other people.  Sometimes this is helpful, with creativity for example, but most of the time it is a pain in the smegging arse to be honest.

I still have an old box of Dex

I’ve managed to overcome, dampen, or put up with, most of the effects of ADD (as they call the adult version now) without medication.  In the UK it is very difficult to get the treatment, especially since I was on Dexamphetamine and then I stopped it after a year because of the side-effects.  It can take years to get back on it again, or even a different drug, because it is a class A drug.

Because of my ADHD behaviour as a child (I had the wibbly wobbly running up walls thing from about 5 years old to roughly 15) and the condition not being understood in the UK in the 1960s\70s, and therefore no treatment available, I missed out on a proper education. Which meant mister thicko here didn’t get to Uni, or even college.  Most of what I know is self-taught.

Add to that, I am nearly 60 years old (I can hear you gasping at the back there; I'm really only 58 and a quarter, in fact), and maybe my mind isn't as sharp as it once was.

Also, I may not have many years left.  I had a heart attack and surgery a few years back.  I have a dodgy ticker that stops at random in my sleep and I probably won’t make 65-70, so you can see why time is so important to me.  I’m in a rush.

Sorry for this self-indulgence, but sometimes I feel like I am up against spectacularly bad odds of achieving anything worthwhile on planet Earth before I go. Boo hoo, blub.

Using Python V3.6.5(32 bit) on Windows 7, (64 bit).

Previous post: Python GUI Mastermind Game



from Python Coder
via read more

Thursday, November 29, 2018

Matt Layman: Deciphering Python: How to use Abstract Syntax Trees (AST) to understand code

Let’s get a little “meta” about programming. How does the Python program (better known as the interpreter) “know” how to run your code? If you’re new to programming, it may seem like magic. In fact, it still seems like magic to me after being a professional for more than a decade. The Python interpreter is not magic (sorry to disappoint you). It follows a predictable set of steps to translate your code into instructions that a machine can run.

from Planet Python
via read more

Python Engineering at Microsoft: Python in Visual Studio Code – November 2018 Release

We are pleased to announce that the November 2018 release of the Python Extension for Visual Studio Code is now available. You can download the Python extension from the Marketplace, or install it directly from the extension gallery in Visual Studio Code. You can learn more about Python support in Visual Studio Code in the documentation.

This was a quality-focused release: we closed a total of 28 issues, improving startup performance and fixing various bugs related to interpreter detection and Jupyter support. Keep on reading to learn more!

Improved Python Extension Load Time

We have started using webpack to bundle the TypeScript files in the extension for faster load times; this has significantly improved the extension's download size, installation time, and load time.  You can see the startup time of the extension by running the Developer: Startup Performance command. Below are the before-and-after extension load times (measured in milliseconds):

One downside to this approach is that reporting & troubleshooting issues with the extension is harder, as the call stacks output by the Python extension are minified. To address this, we have added the Python: Enable source map support for extension debugging command. This command will load source maps for better error log output. This slows down the load time of the extension, so we provide a helpful reminder to disable it every time the extension loads with source maps enabled:

These download, install, and startup performance improvements will help you get to writing your Python code faster, and we have even more improvements planned for future releases.

Other Changes and Enhancements

We have also added small enhancements and fixed issues requested by users that should improve your experience working with Python in Visual Studio Code. The full list of improvements is listed in our changelog; some notable changes include:

  • Update Jedi to 0.13.1 and parso 0.3.1. (#2667)
  • Make diagnostic message actionable when opening a workspace with no currently selected Python interpreter. (#2983)
  • Fix problems with virtual environments not matching the loaded python when running cells. (#3294)
  • Make nbconvert in an installation not prevent notebooks from starting. (#3343)

Be sure to download the Python extension for Visual Studio Code now to try out the above improvements. If you run into any problems, please file an issue on the Python VS Code GitHub page.



from Planet Python
via read more

Catalin George Festila: Python Qt5 - submenu example.

Using my old example, I will create a submenu with PyQt5.
First, you need to know that a submenu works just like a menu.
Let's see the result:

The source code is very simple:
# -*- coding: utf-8 -*-
"""
@author: catafest
"""
import sys
from PyQt5.QtWidgets import QMainWindow, QAction, qApp, QApplication, QDesktopWidget, QMenu
from PyQt5.QtGui import QIcon

class Example(QMainWindow):
    # init the Example class to draw the window application
    def __init__(self):
        super().__init__()
        self.initUI()

    # create the center def to place the window at the center of the screen
    def center(self):
        # geometry of the main window
        qr = self.frameGeometry()
        # center point of screen
        cp = QDesktopWidget().availableGeometry().center()
        # move rectangle's center point to screen's center point
        qr.moveCenter(cp)
        # top left of rectangle becomes top left of window, centering it
        self.move(qr.topLeft())

    # create the initUI def to draw the application
    def initUI(self):
        # create the action for the exit application with shortcut and icon
        # you can add new actions for the File menu and any actions you need
        exitAct = QAction(QIcon('exit.png'), '&Exit', self)
        exitAct.setShortcut('Ctrl+Q')
        exitAct.setStatusTip('Exit application')
        exitAct.triggered.connect(qApp.quit)
        # create the status bar for the menu
        self.statusBar()
        # create the menu with the text File, add the exit action
        # you can add many items to the menu with actions for each item
        menubar = self.menuBar()
        fileMenu = menubar.addMenu('&File')
        fileMenu.addAction(exitAct)

        # add submenu to menu
        submenu = QMenu('Submenu', self)

        # some dummy actions
        submenu.addAction('Submenu 1')
        submenu.addAction('Submenu 2')

        # add to the top menu
        menubar.addMenu(submenu)
        # resize the window application
        self.resize(640, 480)
        # draw on center of the screen
        self.center()
        # add title on the window application
        self.setWindowTitle('Simple menu')
        # show the application
        self.show()
        # close the UI class

if __name__ == '__main__':
    # create the application
    app = QApplication(sys.argv)
    # use the UI with new class
    ex = Example()
    # run the UI
    sys.exit(app.exec_())
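The two submenu entries above are dummy actions, so clicking them does nothing yet. As a small aside that is not part of the original example (the handler below is purely hypothetical), a submenu entry can be wired up the same way as the Exit action, by creating a QAction and connecting its triggered signal. This snippet belongs inside initUI, where self and submenu are available:

# hypothetical extra action for the submenu
firstAct = QAction('Submenu 1', self)
firstAct.setStatusTip('First submenu entry')
# show a message in the status bar when the entry is clicked
firstAct.triggered.connect(lambda: self.statusBar().showMessage('Submenu 1 clicked'))
submenu.addAction(firstAct)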


from Planet Python
via read more

Hynek Schlawack: Python Application Dependency Management in 2018

We have more ways to manage dependencies in Python applications than ever. But how do they fare in production? Unfortunately this topic turned out to be quite polarizing and was at the center of a lot of heated debates. This is my attempt at an opinionated review through a DevOps lens.



from Planet Python
via read more

Stack Abuse: Python Data Visualization with Matplotlib

Introduction


Visualizing data trends is one of the most important tasks in data science and machine learning. The choice of data mining and machine learning algorithms depends heavily on the patterns identified in the dataset during the data visualization phase. In this article, we will see how we can perform different types of data visualizations in Python. We will use Python's Matplotlib library, which is the de facto standard for data visualization in Python.

The article A Brief Introduction to Matplotlib for Data Visualization provides a very high-level introduction to the Matplotlib library and explains how to draw scatter plots, bar plots, histograms, etc. In this article, we will explore more Matplotlib functionalities.

Changing Default Plot Size

The first thing we will do is change the default plot size. By default, the size of the Matplotlib plots is 6 x 4 inches. The default size of the plots can be checked using this command:

import matplotlib.pyplot as plt

print(plt.rcParams.get('figure.figsize'))  

For a better view, you may need to change the default size of the Matplotlib graph. To do so, you can use the following script:

fig_size = plt.rcParams["figure.figsize"]  
fig_size[0] = 10  
fig_size[1] = 8  
plt.rcParams["figure.figsize"] = fig_size  

The above script changes the default size of the Matplotlib plots to 10 x 8 inches.
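As an aside (not part of the original article), the same change can be made in a single line by assigning a new list to the rcParams entry:

import matplotlib.pyplot as plt

# one-line equivalent of the snippet above
plt.rcParams["figure.figsize"] = [10, 8]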

Let's start our discussion with a simple line plot.

Line Plot

The line plot is the most basic plot in Matplotlib. It can be used to plot any function. Let's plot a line plot for the cube function. Take a look at the following script:

import matplotlib.pyplot as plt  
import numpy as np

x = np.linspace(-10, 9, 20)

y = x ** 3

plt.plot(x, y, 'b')  
plt.xlabel('X axis')  
plt.ylabel('Y axis')  
plt.title('Cube Function')  
plt.show()  

In the script above we first import the pyplot module from the Matplotlib library. We have two numpy arrays, x and y, in our script. We used the linspace method of the numpy library to create a list of 20 numbers between -10 and 9. We then take the cube of all the numbers and assign the result to the variable y. To plot two numpy arrays, you can simply pass them to the plot function of the pyplot module. You can use the xlabel, ylabel and title functions of the pyplot module to label the x axis, the y axis and the plot title. The output of the script above looks like this:

Output:

(plot image)

Creating Multiple Plots

You can actually create more than one plot on one canvas using Matplotlib. To do so, you have to use the subplot function, which specifies the location and the plot number. Take a look at the following example:

import matplotlib.pyplot as plt  
import numpy as np  
x = np.linspace(-10, 9, 20)

y = x ** 3

plt.subplot(2,2,1)  
plt.plot(x, y, 'b*-')  
plt.subplot(2,2,2)  
plt.plot(x, y, 'y--')  
plt.subplot(2,2,3)  
plt.plot(x, y, 'b*-')  
plt.subplot(2,2,4)  
plt.plot(x, y, 'y--')  

The first argument to the subplot function is the number of rows that the subplots will have and the second parameter specifies the number of columns. A value of 2, 2 specifies that there will be four graphs. The third argument is the position at which the graph will be displayed. The positions start from the top-left. The plot with position 1 will be displayed in the first row and first column. Similarly, the plot with position 2 will be displayed in the first row and second column.

Take a look at the third argument of the plot function. This format-string argument defines the color, marker shape and line style used on the graph.

Output:

(plot image)
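For reference, here is a short breakdown of my own (not from the article) of what those format strings mean; x and y are the same arrays as in the script above:

plt.plot(x, y, 'b*-')   # b = blue, * = star markers, - = solid line
plt.plot(x, y, 'y--')   # y = yellow, no markers, -- = dashed line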

Plotting in Object-Oriented Way

In the previous section we used the plot function of the pyplot module and passed it values for the x and y coordinates along with the labels. However, the same plot can be drawn in an object-oriented way. Take a look at the following script:

import matplotlib.pyplot as plt  
import numpy as np

x = np.linspace(-10, 9, 20)

y = x ** 3

figure = plt.figure()

axes = figure.add_axes([0.2, 0.2, 0.8, 0.8])  

The figure function of the pyplot module returns a figure object. You can call the add_axes method on this object. The parameters passed to the add_axes method are the distance from the left and bottom of the figure and the width and height of the axes, respectively. These values should be given as fractions of the figure size. Executing the above script creates an empty set of axes as shown in the following figure:

The output of the script above looks like this:

(plot image)

We have our axes; now we can add data and labels to them. To add the data, we need to call the plot function and pass it our data. Similarly, to create labels for the x-axis, the y-axis and the title, we can use the set_xlabel, set_ylabel and set_title functions as shown below:

import matplotlib.pyplot as plt  
import numpy as np

x = np.linspace(-10, 9, 20)

y = x ** 3

figure = plt.figure()

axes = figure.add_axes([0.2, 0.2, 0.8, 0.8])

axes.plot(x, y, 'b')  
axes.set_xlabel('X Axis')  
axes.set_ylabel('Y Axis')  
axes.set_title('Cube function')  

(plot image)

You can see that the output is similar to the one we got in the last section but this time we used the object-oriented approach.

You can add as many axes as you want on one plot using the add_axes method. Take a look at the following example:

import matplotlib.pyplot as plt  
import numpy as np

x = np.linspace(-10, 9, 20)

y = x ** 3

z = x ** 2

figure = plt.figure()

axes = figure.add_axes([0.0, 0.0, 0.9, 0.9])  
axes2 = figure.add_axes([0.07, 0.55, 0.35, 0.3]) # inset axes

axes.plot(x, y, 'b')  
axes.set_xlabel('X Axis')  
axes.set_ylabel('Y Axis')  
axes.set_title('Cube function')

axes2.plot(x, z, 'r')  
axes2.set_xlabel('X Axis')  
axes2.set_ylabel('Y Axis')  
axes2.set_title('Square function')  

Take a careful look at the script above; it defines two axes. The first axes contain the graph of the cube of the input, while the second axes draw the graph of the square of the same data inside the cube graph.

In this example, you will better understand the role of the left, bottom, width and height parameters. For the first axes, the values for left and bottom are set to zero, while width and height are set to 0.9, which means that our outer axes will have 90% of the width and height of the figure.

For the second axes, left is set to 0.07 and bottom to 0.55, while width and height are 0.35 and 0.3 respectively. If you execute the script above, you will see a big graph for the cube function with a small graph for the square function lying inside it. The output looks like this:

(plot image)

Subplots

Another way to create more than one plot at a time is to use the subplots method. You need to pass values for the nrows and ncols parameters. The total number of plots generated will be nrows x ncols. Let's take a look at a simple example. Execute the following script:

import matplotlib.pyplot as plt  
import numpy as np

x = np.linspace(-10, 9, 20)

y = x ** 3

z = x ** 2

fig, axes = plt.subplots(nrows=2, ncols=3)  

In the output you will see 6 plots in 2 rows and 3 columns as shown below:

(plot image)

Next, we will use a loop to add the output of the square function to each of these graphs. Take a look at the following script:

import matplotlib.pyplot as plt  
import numpy as np  
x = np.linspace(-10, 9, 20)

z = x ** 2

figure, axes = plt.subplots(nrows=2, ncols=3)

for rows in axes:  
    for ax1 in rows:
        ax1.plot(x, z, 'b')
        ax1.set_xlabel('X - axis')
        ax1.set_ylabel('Y - axis')
        ax1.set_title('Square Function')

In the script above, we iterate over the axes returned by the subplots function and display the output of the square function on each one. Remember, since we have axes in 2 rows and 3 columns, we have to use a nested loop to iterate through all of them. The outer for loop iterates over the rows, while the inner for loop iterates over the axes in each row. The output of the script above looks like this:

(plot image)

In the output, you can see all six plots showing the square function.
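As an aside that is not part of the original article, the nested loop can also be written with the flat iterator of the axes array, which visits every subplot in row-major order:

import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(-10, 9, 20)
z = x ** 2

figure, axes = plt.subplots(nrows=2, ncols=3)

# axes is a 2 x 3 numpy array, so axes.flat yields each subplot in turn
for ax1 in axes.flat:
    ax1.plot(x, z, 'b')
    ax1.set_title('Square Function')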

Changing Figure Size for a Plot

In addition to changing the default size of the graph, you can also change the figure size for specific graphs. To do so, you need to pass a value for the figsize parameter of the subplots function. The value for the figsize parameter should be passed in the form of a tuple where the first value corresponds to the width while the second value corresponds to the height of the graph. Look at the following example to see how to change the size of a specific plot:

import matplotlib.pyplot as plt  
import numpy as np  
x = np.linspace(-10, 9, 20)

y = x ** 3

z = x ** 2

figure, axes = plt.subplots(figsize = (6,8))

axes.plot(x, z, 'r')  
axes.set_xlabel('X-Axis')  
axes.set_ylabel('Y-Axis')  
axes.set_title('Square Function')  

The script above draws a plot for the square function that is 6 inches wide and 8 inches high. The output looks like this:

(plot image)

Adding Legends

Adding legends to a plot is very straightforward with the Matplotlib library. All you have to do is pass a value for the label parameter of the plot function. Then, after calling the plot function, you just need to call the legend function. Take a look at the following example:

import matplotlib.pyplot as plt  
import numpy as np

x = np.linspace(-10, 9, 20)

y = x ** 3

z = x ** 2

figure = plt.figure()

axes = figure.add_axes([0,0,1,1])

axes.plot(x, z, label="Square Function")  
axes.plot(x, y, label="Cube Function")  
axes.legend()  

In the script above we define two functions, square and cube, using the x, y and z variables. We first plot the square function and pass the value Square Function for the label parameter; this is the text displayed in the legend for the square function. Next, we plot the cube function and pass Cube Function as the value for the label parameter. The output looks like this:

(plot image)

In the output, you can see a legend at the top left corner.

The position of the legend can be changed by passing a value for the loc parameter of the legend function. The possible values are 1 (top right corner), 2 (top left corner), 3 (bottom left corner) and 4 (bottom right corner). Let's draw a legend at the bottom right corner of the plot. Execute the following script:

import matplotlib.pyplot as plt  
import numpy as np

x = np.linspace(-10, 9, 20)

y = x ** 3

z = x ** 2

figure = plt.figure()

axes = figure.add_axes([0,0,1,1])

axes.plot(x, z, label="Square Function")  
axes.plot(x, y, label="Cube Function")  
axes.legend(loc=4)  

Output:

(plot image)
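As an aside (not covered in the article), the loc parameter also accepts the more readable string names, so you do not have to remember the numeric codes:

axes.legend(loc='lower right')   # same position as loc=4
axes.legend(loc='upper left')    # same position as loc=2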

Color Options

There are several options to change the color and styles of the plots. The simplest way is to pass the first letter of the color as the third argument as shown in the following script:

import matplotlib.pyplot as plt  
import numpy as np

x = np.linspace(-10, 9, 20)

y = x ** 3

z = x ** 2

figure = plt.figure()

axes = figure.add_axes([0,0,1,1])

axes.plot(x, z, "r" ,label="Square Function")  
axes.plot(x, y, "g", label="Cube Function")  
axes.legend(loc=4)  

In the script above, the string "r" has been passed as the third parameter for the first plot. For the second plot, the string "g" has been passed as the third parameter. In the output, the first plot will be drawn with a solid red line while the second plot will be drawn with a solid green line, as shown below:

(plot image)

Another way to change the color of the plot is to make use of the color parameter. You can pass the name of the color or the hexadecimal value of the color to the color parameter. Take a look at the following example:

import matplotlib.pyplot as plt  
import numpy as np

x = np.linspace(-10, 9, 20)

y = x ** 3

z = x ** 2

figure = plt.figure()

axes = figure.add_axes([0,0,1,1])

axes.plot(x, z, color = "purple" ,label="Square Function")  
axes.plot(x, y, color = "#FF0000", label="Cube Function")  
axes.legend(loc=4)  

Output:

(plot image)

Stack Plot

A stack plot is an extension of the bar chart or line chart; it breaks data down into different categories and stacks them together so that the values from different categories can easily be compared.

Suppose you want to compare the goals scored per year by three different football players over the course of the last 8 years; you can create a stack plot in Matplotlib with the following script:

import matplotlib.pyplot as plt

year = [2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018]

player1 = [8,10,17,15,23,18,24,29]  
player2 = [10,14,19,16,25,20,26,32]  
player3 = [12,17,21,19,26,22,28,35]

plt.plot([],[], color='y', label = 'player1')  
plt.plot([],[], color='r', label = 'player2')  
plt.plot([],[], color='b', label = 'player3 ')

plt.stackplot(year, player1, player2, player3, colors = ['y','r','b'])  
plt.legend()  
plt.title('Goals by three players')  
plt.xlabel('year')  
plt.ylabel('Goals')  
plt.show()  

Output:

(plot image)

To create a stack plot in Python, you can simply use the stackplot function of the Matplotlib library. The values for the horizontal axis are passed as the first parameter, and the series to be stacked on top of each other are passed as the second, third and subsequent parameters. You can also set the color for each category using the colors parameter.
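As a small aside of my own (not from the article), the three empty plt.plot([], []) calls in the script exist only to give the legend something to show; stackplot also accepts a labels parameter directly, so the script can be shortened to:

plt.stackplot(year, player1, player2, player3,
              labels=['player1', 'player2', 'player3'],
              colors=['y', 'r', 'b'])
plt.legend(loc='upper left')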

Pie Chart

A pie chart is a circular chart in which different categories are marked as parts of the circle. The larger the share of a category, the larger the portion it occupies on the chart.

Let's draw a simple pie chart of the goals scored by a football team from free kicks, penalties and field goals. Take a look at the following script:

import matplotlib.pyplot as plt

goal_types = 'Penalties', 'Field Goals', 'Free Kicks'

goals = [12,38,7]  
colors = ['y','r','b']

plt.pie(goals, labels = goal_types, colors=colors ,shadow = True, explode = (0.05, 0.05, 0.05), autopct = '%1.1f%%')  
plt.axis('equal')

plt.show()  

Output:

(plot image)

To create a pie chart in Matplotlib, the pie function is used. The first parameter is the list of numbers for each category. A list of category names is passed as the argument to the labels parameter, and a list of colors for each category is passed to the colors parameter. If set to True, the shadow parameter draws a shadow around the wedges of the pie chart. Finally, the explode parameter offsets each wedge slightly, breaking the pie chart into individual parts.

It is important to mention here that you do not have to pass the percentage for each category; rather, you just pass the values, and the percentages for the pie chart will be calculated automatically.

Saving a Graph

Saving a graph is very easy in Matplotlib. All you have to do is call the savefig method of the figure object and pass it the path of the file that you want your graph saved to. Take a look at the following example:

import matplotlib.pyplot as plt  
import numpy as np  
x = np.linspace(-10, 9, 20)

y = x ** 3

z = x ** 2

figure, axes = plt.subplots(figsize = (6,8))

axes.plot(x, z, 'r')  
axes.set_xlabel('X-Axis')  
axes.set_ylabel('Y-Axis')  
axes.set_title('Square Function')

figure.savefig(r'E:/fig1.jpg')  

The above script will save the figure as fig1.jpg in the root of the E: drive.
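As an aside (not part of the article), savefig infers the file format from the extension and accepts a few useful options, for example a resolution and a tight bounding box:

# save the same figure as a 150 dpi PNG and trim surplus whitespace
figure.savefig('fig1.png', dpi=150, bbox_inches='tight')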

Conclusion

Matplotlib is one of the most commonly used Python libraries for data visualization and plotting. The article explains some of the most frequently used Matplotlib functions with the help of different examples. Though the article covers most of the basic stuff, this is just the tip of the iceberg. I would suggest that you explore the official documentation for the Matplotlib library and see what more you can do with this amazing library.



from Planet Python
via read more

TestDriven.io: Working with Static and Media Files in Django

This article looks at how to work with static and media files in a Django project, locally and in production.

from Planet Python
via read more