
Learning to Learn: Python


Python is a relatively simple language compared to others such as C++. Despite its simplicity, Python really shines because of its robust standard library, extensive 3rd party library ecosystem (especially for statistics and data analysis), and intuitiveness. This post tracks my process of learning how to program well in Python.

Order of Topics #

The listing of topics here is arbitrary and represents the path that I took to learn Python roughly organized by topic. My advice is that with Python you choose topics that are useful to you and focus on them.

Beginning Python #

Everyone needs to start somewhere, but there are few current, comprehensive, and free documents that give an overview of the language.

I would recommend starting in one of two ways:

  1. Read “Dive into Python 3” by Mark Pilgrim. It used to be that this book was all you needed to read to get started. However the author of this book has since retired and Python has grown since version 3.0. To supplement this reading, I would recommend reading up on the following newer topics from the Python Release Notes:
    1. Pathlib – the newer and more user friendly file interface introduced in Python 3.4.
    2. f-strings – a newer way of writing formatted strings, introduced in Python 3.6, which cuts down on verbosity.
    3. asyncio – Python 3.4 added the asyncio library and Python 3.5 added the async and await keywords, giving Python a robust set of facilities for asynchronous programming. While you probably won’t start out writing async programs, knowing what they are and how they work will make reading them less surprising.
    4. Type Hints – as of Python 3.5, Python supports type annotations for function arguments and return values. The annotations are only hints and are not enforced at runtime, but they can be checked statically by tools such as mypy.
  2. Read the “Python Tutorial” which is part of the standard documentation. This resource is a bit terse for non-programmers, but provides a well written and more modern introduction to the language.
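
To make the newer topics from the first option concrete, here is a minimal sketch that touches pathlib, f-strings, type hints, and asyncio in one place. The function names write_greeting and read_later are made up for illustration, and asyncio.run assumes Python 3.7 or newer.

```python
import asyncio
import tempfile
from pathlib import Path


def write_greeting(directory: Path, name: str) -> Path:
    """Write a greeting into `directory` and return the file's path."""
    target = directory / "greeting.txt"      # pathlib: "/" joins path components
    target.write_text(f"Hello, {name}!\n")   # f-string: expression inside braces
    return target


async def read_later(path: Path, delay: float = 0.01) -> str:
    """Coroutine that waits without blocking the event loop, then reads the file."""
    await asyncio.sleep(delay)
    return path.read_text().strip()


# Use a temporary directory so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    greeting_path = write_greeting(Path(tmp), "world")
    message = asyncio.run(read_later(greeting_path))

print(message)
```

The type hints here do nothing at runtime; a static checker such as mypy is what would flag a call like write_greeting("some/dir", 42).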

I would also read “The Zen of Python” by Tim Peters. You can find it by typing import this in a Python interpreter. It is a poem that describes some of the design philosophy behind Python. Keep these lines in mind as you watch and read the following items in this post.

I would also watch these videos by Raymond Hettinger, one of the Python core developers, which give an overview of the design philosophy behind Python:

Tooling and Libraries #

Python is famous for the set of libraries and tools that it supports and that are built using it. In this section I give a limited overview of the libraries and tools that I use most often.

Standard Library #

Python’s claim to fame is that it has one of the best “batteries included” standard libraries. Learning to use it well is essential to any budding Python programmer. For these tools, I recommend the web documentation on docs.python.org. These docs are written so that they can be read as an introduction, unlike the output of the help() function, which is intended to be a reference. Here are some of the parts of the standard library that I use most.

| Name | Use |
|---|---|
| datetime | Parsing, formatting, and calculating dates |
| collections | A set of useful data structures |
| random | Robust but easy to use random number generation |
| pathlib | Modern, easy to use filesystem library |
| os | OS generic facilities |
| sqlite3 | Built-in SQLite support |
| csv | CSV parsing |
| json | JSON parsing |
| logging | A robust, easy to use logging library |
| argparse | All but the most complex command line parsing |
| threading | Spawning and using threads |
| multiprocessing | Spawning and using multiple processes |
| concurrent.futures | Modern futures-based multithreading/multiprocessing |
| subprocess | Run other programs and get their results |
| asyncio | Asynchronous IO library |
| typing | Type hint support |
| itertools | Tools for working with iterators |
| functools | Tools for higher level functions |
| unittest | Built-in unit testing framework |
| doctest | Another unit testing framework that uses doc strings |
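
To give a feel for how little code these modules need, here is a small sketch that combines a few of them. The word list and the date are arbitrary example data, not taken from any real program.

```python
import json
from collections import Counter
from datetime import date, timedelta
from itertools import islice

# collections: count how often each word appears
words = "the quick brown fox jumps over the lazy dog the end".split()
counts = Counter(words)
top_word, top_count = counts.most_common(1)[0]

# datetime: date arithmetic with timedelta
start = date(2008, 12, 3)                 # an arbitrary example date
one_week_later = start + timedelta(days=7)

# json: round-trip a dictionary through a string
encoded = json.dumps({"word": top_word, "count": top_count}, sort_keys=True)
decoded = json.loads(encoded)

# itertools: take the first three items of any iterable
first_three = list(islice(words, 3))

print(top_word, top_count, one_week_later.isoformat(), encoded, first_three)
```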

3rd Party Libraries #

In addition to the fantastic standard library, Python has a number of famous 3rd party libraries that are considered the best in class across all programming languages. While I list these specifically later, I would be remiss in failing to mention up front the NumFOCUS libraries, which include Matplotlib, NumPy, SciPy, pandas, and others, as some of the best open source (arguably the best overall) libraries for statistical analysis and numerical calculation that exist. Unfortunately, the quality of the documentation for these tools varies widely. One alternative to strictly reading the documentation is to look at the unit tests for the library. Because the built-in unit testing frameworks are quite good, Python libraries often have extensive test suites.

| Name | Use | Description |
|---|---|---|
| pandas | data analysis | Data analysis framework for tabular data |
| scipy | data analysis | Fast scientific functions |
| numpy | data analysis | Fast numeric framework famous for arrays |
| matplotlib | plotting | De facto plotting library for Python |
| seaborn | plotting | Ease of use layer for matplotlib |
| bokeh | plotting | Web-first plotting library |
| plotly | plotting | Interactive plotting library |
| beautiful soup | scraping | Sane HTML parsing |
| scrapy | scraping | Web scraping framework |
| requests | scraping/web requests | Vastly superior HTTP library |
| django | web app | Opinionated, easy to use web framework |
| flask | web app | Flexible web app framework |
| selenium | testing | Automated web browser controller |
| psycopg2 | database | PostgreSQL bindings |
| simpy | simulation | Simulation library |
| mpi4py | HPC | Distributed memory programming executor |
| tensorflow | machine learning | Faster but more complex machine learning |
| scikit-learn | machine learning | Easy to use but slower machine learning |
| networkx | graphs | Easy to use but slow graph library |
| networkit | graphs | Slightly harder to use but faster graphs |

Tools for developing Python #

Due to the popularity of Python, there are a number of tools that exist to make it easier to work with. A number of these tools have great documentation. The one key exception is setuptools. For setuptools I would read this guide on packaging Python libraries.

| Name | Use |
|---|---|
| pdb | Python debugger |
| pip | Tool for installing packaged libraries |
| setuptools | Python library/tool for packaging libraries |
| flake8 | Linter for Python that respects PEP 8 |
| jedi | Code completion for Python |
| mypy | Static type checker for Python |
| IPython | Friendly interactive shell |
| Jupyter | Notebook software commonly used for science |
| python-language-server | Implementation of the language server protocol – I pair this with vim for my development environment |
| PyCharm/Spyder | More fully featured alternative IDEs |

Other tools that work well with Python #

Another reason to learn Python is to be able to use it as part of other tools. Most of the time, the libraries are fairly straightforward to learn and use, with good project level documentation. Generally speaking, these tools come in one of two flavors: applications that are written in Python and are easy to extend, and those that are written in C/C++ and need a wrapper library.

| Name | Purpose |
|---|---|
| gdb | Well supported native executable debugger |
| lldb | Easy to use native executable debugger |
| ansible | Easy to use system orchestration tool |
| saltstack | More “Pythonic” system orchestration tool |

GDB and LLDB are C/C++ applications that use a wrapper library. There are two good tools for generating Python bindings for lower level C/C++ applications:

  • SWIG – easy to use, but can have significant overhead for some applications. Supports other languages.
  • Boost.Python – harder to use, but lower overhead

Both have extensive examples of how to get started with these tools online and pretty good project documentation. The biggest hiccup that people find when they go to use these tools is that they require the use of the development libraries for Python which are often installed separately from the Python interpreter.

Generally, the recommended strategy for using these tools is to create a class or module that exposes the functionality you would like to configure from Python and to apply these tools to just that module. As an added benefit, you get a façade for your library that also makes it easier to use from the lower level code.

Watch this video from CppCon that describes why you may want to do this and how to do it well.

Functional Programming in Python #

One common misconception about Python is that it is primarily an object oriented programming language. Rather, Python is primarily a functional language that supports object orientation where it makes sense. In functional programming, programmers avoid the use of state and rely on so called “higher level programming” facilities: functions that accept functions as arguments and return functions as results. You should watch this video on why object oriented Python is overused.

The functional capabilities of Python are best seen in the following features of the language:

  • Functions as first class types.
  • List, Generator, and Dictionary Comprehensions.
  • Easy to write Iterators.
  • Great library support for iterators, and higher-level programming.
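
The features in this list can be sketched in a few lines. The names compose, countdown, and double_then_increment below are made up for this example; partial, reduce, takewhile, and mul come from the standard library.

```python
from functools import partial, reduce
from itertools import takewhile
from operator import mul


def compose(f, g):
    """Functions are values: build and return a new function that applies g, then f."""
    return lambda x: f(g(x))


def countdown(start):
    """A generator function: a complete iterator in four lines."""
    while start > 0:
        yield start
        start -= 1


def increment(x):
    return x + 1


double = partial(mul, 2)                          # fix the first argument of mul
double_then_increment = compose(increment, double)

squares = [n * n for n in range(5)]               # list comprehension
square_of = {n: n * n for n in range(3)}          # dictionary comprehension
even_total = sum(n for n in range(10) if n % 2 == 0)  # generator expression

product = reduce(mul, countdown(4))               # 4 * 3 * 2 * 1
small = list(takewhile(lambda n: n < 10, squares))

print(double_then_increment(5), squares, square_of, even_total, product, small)
```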

I would read the appropriate sections of the Python docs, especially the introduction to functional programming in Python, as well as the itertools and functools docs.

One common question that comes up when considering generators versus list comprehensions is which to use. Generally, you should prefer generators because they use less memory and can be converted to lists later if multiple passes or random access are required. This video is on Python 2, but shows why you should prefer generators over lists.
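
A quick way to see the trade-off is to compare the two forms directly. This is only a rough sketch: sys.getsizeof reports the size of the container object itself, not of the integers it refers to, so the real gap is larger than the numbers suggest.

```python
import sys

N = 100_000

squares_list = [n * n for n in range(N)]   # every value exists in memory at once
squares_gen = (n * n for n in range(N))    # values are produced one at a time

list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)

total = sum(squares_gen)          # a single pass consumes the generator
second_pass = sum(squares_gen)    # the generator is exhausted, nothing is left

# When multiple passes or random access are needed, convert to a list.
as_list = list(n * n for n in range(3))

print(list_size, gen_size, total, second_pass, as_list)
```

The exhausted second pass is the main pitfall: a generator can only be iterated once, which is exactly when converting to a list makes sense.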

Object Oriented Programming in Python #

That said, objects in Python are not considered second class citizens. If you have not read the key text on object oriented design, “Design Patterns: Elements of Reusable Object-Oriented Software”, you should read that first. It explains why you would want to use object oriented design in the first place. However, much of what is in this book is a language feature of Python. I would read this site which shows that many of the classical patterns of object oriented design are just features of Python. I would also watch/read the following resources to better understand object oriented Python, starting with Python’s Class Development Toolkit (https://youtu.be/HTLu2DFOdTg):
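
As a small illustration of patterns becoming language features, here is a sketch of two classic ones. The Playlist class and by_length function are hypothetical names invented for this example.

```python
class Playlist:
    """Iterator pattern: defining __iter__ is all a class needs to work in a for loop."""

    def __init__(self, titles):
        self._titles = list(titles)

    def __iter__(self):
        # Delegating to a generator replaces a hand-written iterator class.
        yield from self._titles

    def __len__(self):
        return len(self._titles)


def by_length(title):
    """Strategy pattern: the interchangeable 'strategy' is just a function."""
    return len(title)


playlist = Playlist(["Intro", "Interlude", "Finale"])

alphabetical = sorted(playlist)                   # default ordering
shortest_first = sorted(playlist, key=by_length)  # pass in a different strategy

print(len(playlist), alphabetical, shortest_first)
```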

Other helpful topics #

There are a number of other useful topics that I would commend to your consideration when you have finished reading about the topics above.

  • Coroutines/Asynchronous Programming are a really easy way to write iterators and an easy way to abstract asynchronous programming in Python in general. Read the standard library docs on asyncio to get an overview of why and how to use these.
  • Decorators are a great way to do aspect oriented programming in Python. I would read this paper on decorators and I would read about how Flask uses them to get the high points.
  • Metaclasses are a powerful way to change how classes are constructed. If you are in doubt about whether you need metaclasses, you probably don’t. However, they are how Django built its Model class and how much of the standard library is built. I would read this Stack Overflow answer about how they work if you need to learn how to use them.
  • Performance: Python is generally slower than compiled languages such as C/C++. You can look into tools like embedded C/C++ modules, Cython, PyPy, or Numba if things are too slow. Understanding profiling tools is also helpful; consider watching this video as a starting place.
  • Parallel and Distributed Programming: Python has some of the best parallel and distributed programming features out there. I would read/watch the following resources:
    • Read the docs for multiprocessing, threading, and concurrent.futures in the standard library. The concurrent.futures docs describe the executor abstraction, which is core to parallel and distributed programming in Python.
    • Watch this video from Raymond Hettinger on parallel programming which describes some common pitfalls and tools for parallel programming.
    • mpi4py documentation – mpi4py provides Python bindings for MPI implementations such as Open MPI, with substantial ease of use benefits. If you want to run on an HPC cluster, mpi4py is the way to do it.
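
Two of the topics above, decorators and the executor abstraction, fit together in a short sketch. The logged decorator, the call_log list, and the square function are hypothetical names for illustration; ThreadPoolExecutor and functools.wraps are from the standard library.

```python
import functools
from concurrent.futures import ThreadPoolExecutor

call_log = []


def logged(func):
    """Decorator: record the name of the wrapped function on every call."""
    @functools.wraps(func)            # keep the original name and docstring
    def wrapper(*args, **kwargs):
        call_log.append(func.__name__)
        return func(*args, **kwargs)
    return wrapper


@logged
def square(n):
    """Return n squared."""
    return n * n


# The executor hides thread management; map returns results in input order.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(square, range(5)))

print(results, len(call_log))
```

The logging concern lives entirely in the decorator, which is the aspect oriented idea: square itself knows nothing about it.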

What’s Next #

Once you’ve done the above, I would consider the following resources for further reading:

  • PyCon – PyCon is a set of related conferences about Python programming. Some of the keynotes and presentations are quite well done. Many are posted online.
  • Read the standard library code for some advanced examples of good Python code. As a reminder, Python code is almost always distributed as source code which means you can read it.
  • Read the Python Enhancement Proposals (PEPs) to get an idea about where the language is going.

Suggestions to Learn the Language #

I have one key suggestion for learning Python: refuse the temptation to write C in Python. Python has great functional and object oriented facilities; use them.

  • Pick a project that is useful to you: Python has libraries for almost everything, some much better than others. I would suggest picking a project and then doing some quick research about what libraries people use to do that kind of work.
  • After you have written the code, re-write it after watching videos like the ones from Raymond Hettinger or considering the Zen of Python. This will help you understand Pythonic style more and write more natural code.
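
To show what “writing C in Python” can look like, here is a sketch of the same task written both ways. The names and scores are made-up example data.

```python
names = ["ada", "grace", "linus"]
scores = [90, 85, 70]

# C style: manual index bookkeeping and string concatenation
report_c_style = []
i = 0
while i < len(names):
    report_c_style.append(names[i].title() + ": " + str(scores[i]))
    i += 1

# Pythonic: iterate over the pairs directly and format with an f-string
report_pythonic = [
    f"{name.title()}: {score}" for name, score in zip(names, scores)
]

# Built-ins such as max accept a key function instead of a hand-written loop
best_name, best_score = max(zip(names, scores), key=lambda pair: pair[1])

print(report_pythonic, best_name, best_score)
```

Both versions produce the same list; the second has no index to get wrong and says more directly what it does.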

Hope this helps!

Author
Robert Underwood
Robert is an Assistant Computer Scientist in the Mathematics and Computer Science Division at Argonne National Laboratory focusing on data and I/O for large-scale scientific applications, including AI for Science, using techniques of lossy compression and data management. He currently co-leads the AuroraGPT Data Team with Ian Foster. In addition to AI, Robert’s library LibPressio, which allows users to experiment with and adopt advanced compressors quickly, has over 200 average unique monthly downloads and is used in over 17 institutions worldwide. He is also a contributor to the R&D100 winning SZ family of compressors and other compression libraries. He regularly mentors students and is the early career ambassador for Argonne to the Joint Laboratory for Extreme Scale Computing.