Living Code
David Parker famously said that texts are living: once they leave the pen of the author they have a life of their own, and you never know where a text will end up or how it will be modified. For Python code that is even more true.
The beauty of Python is that you can write code fast, share code and modify code. For this to work, your code needs to be readable. Writing code is easy; reading other people's code is much harder, as is reading your own code after a few months or years have passed.
Therefore the aim is to make code as readable as possible, even if it causes a little more work when you write it. The way to make your Python code most readable is to keep to the Style Guide for Python Code, also known as PEP8.
Pylint for the Win
It is far easier to keep your code compliant with PEP8 as you go along than to try to move a large codebase to PEP8 at the end. I recommend the use of a tool called pylint.
Pylint is available from all Linux distributions' package managers (e.g. apt-get install pylint or emerge pylint). Here are some instructions for Windows.
If you have ever made a webpage you probably know about HTML-tidy or the online W3C Validator tool. These tell you everything wrong with your HTML.
Pylint is similar: it goes through your code and reports both outright errors and the places where your code differs from the PEP8 standard.
There are some corner cases in which you will need to give pylint the finger, but doing it consciously for a good reason is better than doing it because you are sloppy.
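As a rough sketch of what overriding pylint consciously can look like: pylint lets you switch off a named message for a single line with a trailing "# pylint: disable=..." comment. The plug-in loader and the names below are made up for illustration, and invalid-name is the message pylint normally uses for naming-style complaints; whether it fires at all depends on your pylint version and configuration.

```python
"""Settings read by a hypothetical plug-in loader that expects camelCase."""

# The loader (an assumption for this example) looks for this exact name, so
# the naming check is switched off on purpose, and only on this one line.
pluginName = "hello"  # pylint: disable=invalid-name


def describe_plugin():
    """Return a short description of the plug-in."""
    return "plugin: " + pluginName


print(describe_plugin())
```

The point is that the exception is visible and explained right where it happens, rather than being silent sloppiness.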
PEP8 is better than your crappy style
People often don't use PEP8. This is for a variety of (bad) reasons.
Firstly, sometimes people are tourists from another programming language; they do not know any better, so they write their Python code as if it were Java or C code.
Secondly, sometimes people think their (cl)own style is better than PEP8 in some technical way. Well, that does not matter. I might have a better way to design a plug socket, but if I implemented my better plug socket, I would not be able to buy any electrical devices.
There can only be one standard, and PEP8 is that standard. If you want to change that standard then bribe, sleep with or kill Guido van Rossum.
Not following the standard makes your code less readable to others, which prevents the quick reuse that Python is designed for (see above).
If you are a free-software/open-source project, then you particularly should be ashamed if you write hard-to-read code, because allowing other people to read, understand and modify your code is the whole point.
Lastly, some people don't use PEP8 because the document is too circular and verbose for them to remember. I feel your pain; below are the main points in 12 easy rules.
The 12 commandments
Guido, who brought you out of the land of Visual Basic, out of the land of slavery, spake all these words to thee:
- Module names should be in all lowercase - hello.py.
- Class names should be in CamelCase.
- Methods and functions should be in lower_with_underscores.
- Implementation-specific 'private' methods take a _single_underscore_prefix.
- Class-private methods, which subclasses should not override by accident, take a __double_underscore_prefix (Python mangles such names with the class name).
- Top level constants (i.e. those that are not in a function or class) should be in BLOCKCAPITALS. Overuse of these constants may make your code less reusable.
- If a variable inside a function or method is so temporary and disposable that you cannot give it a name, then use i for the first one, j for the second and k for third.
- Indentation is four spaces per level. No tabs. If you break this rule then you must be stoned in the village square.
- Lines are never more than 80 characters wide. Tip: break lines with a backward slash (\). You do not need to do this if there are parentheses, brackets or braces. Don't add extra parentheses just to break lines; use \ instead.
- Spaces after commas: (green, eggs, and, ham)
- Spaces around operators: i = i + 1
- Write docstrings for all public modules, functions, classes, and methods. Python is an international community, so use English for docstrings, object names and comments. If you want to provide local translations then use a proper localisation library.
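To make the rules above concrete, here is a small sketch that tries to follow all of them at once. The module, class and function names (greeter.py, PoliteGreeter, greet_all and so on) are invented for this example; they do not come from PEP8 itself.

```python
"""Greet people by name (a made-up module, saved as greeter.py)."""

# Top-level constant: BLOCKCAPITALS.
DEFAULT_GREETING = "Hello"

# Outside brackets, a backslash continues the line.
RULE_COUNT = 4 + 4 \
    + 4


class PoliteGreeter:
    """Build numbered greetings for a list of names."""

    def __init__(self, greeting=DEFAULT_GREETING):
        # Implementation detail: single leading underscore.
        self._greeting = greeting

    def greet_all(self, names):
        """Return one numbered greeting per name."""
        # i is a throwaway counter; the brackets let the line wrap.
        return [self._format_line(i + 1, name)
                for i, name in enumerate(names)]

    def _format_line(self, number, name):
        """Format a single line; 'private' by convention only."""
        return "{0}. {1}, {2}!".format(number, self._greeting, name)


print(PoliteGreeter().greet_all(["green", "eggs", "ham"]))
```

Every level of indentation is four spaces, there is a space after each comma and around each operator, and every public module, class and method has an English docstring.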
<p>Nice set of guidelines. I try my best to keep to PEP8 but I must confess I
do stray from time to time.</p>
<p>The most annoying guideline I find is keeping lines to 80 characters. That's
one I regularly forget.</p>
<p>I'd not come across PyLint before; time to check how much I deviate from the
PEP, I suppose.</p>
<p>Hi Jon, thanks for the comment.</p>
<p>Keeping to 80 characters is important because that is the default size of
terminals, so making code wider than that will annoy many command-line geeks.</p>
<p>Some text editors allow you to put a vertical line at 80 characters across.</p>
<p>I agree with all except the 4 space indentation rule. I've never understood
or agreed with people who prefer spaces to tabs. Why bicker over the
"proper" number of spaces when the tab character can allow people to view
code as they like to see it and still follow indentation rules?</p>
<p>Of course, if you're Linus tabs are <em>always</em> 8 spaces and you have to choose
an arbitrary number of spaces for indentation. Proof positive that smart
people sometimes hold onto things that don't really matter, and, in Linus's
case, are factually incorrect.</p>
<p>Hi Brendan,</p>
<p>I suppose my point is that in Python, the benefits of having your own style
are always outweighed by the cost of being less compatible with the rest of
Python-using humanity.</p>
<p>If you want to grab a method from one module and throw it together with
another, having to go through and redo all the indentation is slightly
annoying.</p>
<p>As for spaces, Python is quite highly indented, so if you use 8 spaces you
are going to hit the 80 characters maximum width pretty fast.</p>
<p>I suppose tabs are not favoured because in some senses they are binary data,
rather than plain text. They can also do weird things when you move them
across platforms.</p>
<p>If you are Linus then you only use C, and Python is not C. Indeed the
<a class="reference external" href="http://www.python.org/dev/peps/pep-0007/">guidelines for C code for the underlying Python implementation</a> have tabs
that expand to 8 spaces.</p>
<p>Hey there! Good summary :-)</p>
<p>I just made a quick search about spaces vs tabs... I guess if you're going to
keep your code internal (for example, for a 'private' app) you can simply use
tabs, since that code isn't going to be shared with more people.</p>
<p>I personally prefer tabs, and while I haven't looked into pylint yet, I guess
it should be possible to convert from spaces to tabs and the inverse with
software like that.</p>
<p>Don't you think it's a bit stupid to keep arguing about spacing and tabbing
at this point? It's 2008 already! :-)</p>
<p>The only problem I can think of with an automated conversion is when the tabs
are used for keeping things aligned with a certain position (i.e. a tab!).
Open the same file with a different editor/settings and it's going to look
funny!</p>
<p>/quote
The only problem I can think of with an automated conversion is when the tabs
are used for keeping things aligned with a certain position (i.e. a tab!).
Open the same file with a different editor/settings and it's going to look
funny!
/end quote
Yes, that's why tabs should be used as <strong>indentation</strong>. As in Changing where
the text starts. Adjusting the left margin.
I agree that position of broken up line should be done with spaces, because
then everything is preserved.
<tab>mylong line needs to
<tab> be broken up
You should not use two tabs in line 2 because this line belongs to the same
indentation. If people grasped this simple concept, "tabbed" code would
not be a problem.</p>
<p>I find it really annoying to change the indentation width in my editor all
the time because some numnut decided that 3 is the perfect indentation level,
and used spaces.</p>
<p>The ONLY reason to use spaces over tabs is to ensure that line lengths are < 80
columns. A better solution would be to make a standard saying that when
tabsize = 4 is used, code should fit in 80 columns.</p>
<p><a class="reference external" href="http://xahlee.org/UnixResource_dir/writ/tabs_vs_spaces.html">http://xahlee.org/UnixResource_dir/writ/tabs_vs_spaces.html</a></p>
<p>> Keeping to 80 characters is important because that is the default
> size of terminals, so making code wider than that will annoy
> many command-line geeks.</p>
<p>No command line geek (that cares about python line length) has used an 80
char wide terminal in the last 15 years.</p>
<p>You can't be serious, quoting noted internet kook Xah Lee as an example of
why tabs are "proper", can you? Unless you're using him as a counter-
example, perhaps. (But then, you did it 3 times, so maybe you ARE Xah?)</p>
<p>At least use the canonical reference: <a class="reference external" href="http://www.jwz.org/doc/tabs-vs-spaces.html">http://www.jwz.org/doc/tabs-vs-spaces.html</a></p>
<p>If you're going to elevate PEP 8 to a holy text [ed - that bit is a joke]
then at least get it right.</p>
<p>> Module names should be in all lowercase - hello.py.</p>
<blockquote>
<p>for Python 3.0 made this basically a rule which you need a strong
readability exception to override.</p>
</blockquote>
<p>> Top level variables (variables that are not in a function or class) should
be in BLOCKCAPITALS.</p>
<blockquote>
This isn't even in PEP 8, but it shouldn't be used for objects intended
to be mutable. (Hey, like classes.) Use it for constants only.</blockquote>
<p>> Especially private non-subclassable methods __double_underscore_prefix</p>
<blockquote>
Don't use __double on anything. This rule isn't even worth mentioning.</blockquote>
<p>> Methods and functions should be in lower_with_underscores</p>
<blockquote>
PEP8 makes an exception in cases where it's "already the prevailing
style" to use mixedCase. There is a great deal of disagreement on this
point in the Python community's code right now; bit premature to make it a
rule.</blockquote>
<p>Hi Cory, I am not saying that PEP8 is a particularly good or particularly bad
style standard, just that it is the only standard we have, so we should stick
to it to make our code as readable as possible.</p>
<p>> This isn't even in PEP 8, but it shouldn't be used for objects intended to
be mutable. (Hey, like classes.) Use it for constants only.</p>
<p>Yeah, "constants" would have been a better term. To be honest, one should not
need to use (m)any of these in any program of any reasonable size. However,
constants are quite common in simple scripts.</p>
<p>commandment #13</p>
<p>switch to Ruby</p>
<p>I'm a command line geek, and my windows are always 80x25 (sometimes 80x24.)
But that's not why it's important to keep our lines limited to 80 columns
(although it's most likely why 80 columns was chosen.)</p>
<p>The reason it's important here is because we need some maximum width,
otherwise you'll be able to tell who worked on a file (or a particular piece
of the file) based on how wide the lines are. The whole point of a style
guide is to make the code as uniform as possible. Setting a limit on how long
a line can be (as well as setting a soft limit on how many lines a particular
function can be, which most good projects I've worked on have done) is just
another way to ensure code consistency so it's easier for everyone on the
team to work.</p>
<p>Perhaps it's time to upgrade 80 to 120?</p>
<p>The idea behind a standardised code style is that it is very easy to write
your code but very hard to read other people's. So optimising the look of
your code for readers is the way to make programming more efficient.</p>
<p>@Alexander, take a look at a <a class="reference external" href="http://upload.wikimedia.org/wikipedia/commons/7/75/NYTimes-Page1-11-11-1918.jpg">newspaper</a>: for hundreds of years they have
had very thin columns, 20 characters or less. Why? Because shorter line
lengths are easier to read.</p>
<p>Getting down to 80 characters may require a bit of discipline to start with,
but it is not that hard.</p>
<p>What code needs to be longer than 80 columns? None.</p>
<p>PEP8 doesn't mandate spaces instead of tabs. It recommends spaces over tabs,
but certainly leaves room for devs to make that decision themselves.</p>
<p>So, as far as using your (cl)own standard, tabs are fine. And I think only a
clown would dictate otherwise.</p>
<p>@Clown,</p>
<p>Say you work for a team that produces code using tabs, and then later,
unexpectedly, your work then gets combined with another team that uses
spaces. You then want to combine the code together in interesting ways.
Someone has to go through and convert the tabs to spaces. Not the most
annoying thing compared to some of the other rules, but still an inefficient
(mis)use of time.</p>
<p>For those who are arguing technical reasons why something is better, that
does not matter. The point is the cost of not being part of the standards
almost always outweighs whatever small perceived benefit you gain by making
up your own system.</p>
<p>It is only worth breaking the standard when you gain some extra advantage; in
the case of tabs I just can't see that.</p>
<p>Why all the problems with tabs vs spaces? Surely you all put Vim mode lines
at the top of your files so that whoever edits them later gets their Vim
session set to whatever the file states it should use?</p>
<p>What do you mean not everyone codes in Vim?</p>
<p>Spaces are error prone and those easily made errors are hard to see. TAB
maintains visualization flexibility and reduces file size (cough-parsing-
cough). Guido, among others, fell into the spaces pitfall and has not
escaped. That he did so in a format dependent language is doubly troubling.
Please, someone sleep with Guido before P3k ships!</p>
<p>There is pretty much complete consensus in collaborative Python projects
(i.e., projects with more than one developer) to use 4 space indentation.
The only projects that use tabs are single-developer projects, and even then
the large majority uses spaces. This is past being a point of debate: the
public has made its choice. And that choice isn't to allow developer
preference either: the choice is to use strictly spaces!</p>
<p>PEP 8 clearly says that double underscore is for class-private variables.
Many people don't know what "class-private variable" means. Those people should
then not use them. It's really not worth explaining. Most people who know
what it means don't use them -- mostly it's people who think __ means "really
private" that use them.</p>
<p>I believe \ continuation might be removed in Python 3? Guido certainly
prefers adding parentheses to do line continuations.</p>
<p>There's no localization of docstrings. If you want to document your code in
French, document it in French, localization libraries don't really handle
that.</p>
<p>It's a wonderful post and it highlights one of the great things about python.
The one thing I'd disagree with is not adding extra parentheses to get
implicit line continuations.</p>
<p>According to PEP8:
"The preferred way of wrapping long lines is by using Python's implied line
continuation inside parentheses, brackets and braces. If necessary, you can
add an extra pair of parentheses around an expression, but sometimes using a
backslash looks better."</p>
<p>Spaces vs. Tabs</p>
<p>Probably the greatest weakness of Python is that characters that are not
visibly distinct have a different meaning. It's the same horrid thing that
happened in syslog.conf and in make. It's like somebody somewhere forgot
that humans might read the code.</p>
<p>When I'm reading a file (or a single line) containing tabs and spaces, they
don't look different. If it didn't matter which was present, that wouldn't
be a huge deal. However, if it matters (and it matters in python,
syslog.conf, and make) it is a huge deal.</p>
<p>In python, you are going to use spaces to do proper alignment of long lines
that you wrapped. Yes, you are going to be wrapping your long lines and then
making things line up so your code doesn't look like crap. Therefore you are
already using spaces for at least some of your leading whitespace.</p>
<p>If you can guarantee that no human being will ever read the code, then by all
means, mix spaces and tabs to your heart's content. If it turns out that at
least one human might read your code (and I'm looking at you buddy), then
don't intentionally do "the wrong thing" and mix spaces and tabs.</p>
<p>That's why the only tabs in a python file had better be in a quoted string
and represented as "\t" or some unicode equivalent.</p>
<p>I have come to really enjoy writing in python over the past year or so, and
this brokenness still annoys me.</p>
<p>Tabs are teh EVIL.</p>
<p>The killer for me is that once you allow tabs in a file, every tool that processes that file has to know about tabs.</p>
<p>diff old.py new.py</p>
<p>Displays misleading diffs if there are tabs present.</p>
<p>Using pr requires that you specify where your non-standard tabs are supposed to go (otherwise it truncates your dumb longer than 80-character lines at the wrong place).</p>
<p>"cut" won't work until you run the source through expand. enscript needs to know about your tab placement. The list goes on.</p>
<p>It's amusing that the suggestion that one should follow a shared standard is at all contentious.</p>
<p>Ah, "Tabs vs. Spaces"... :></p>
<p>First, Thanks! for this fun little post. Amusing that similar things always devolve to the same points... :D</p>
<p>@Zeth:</p>
<p>"Say you work for a team that produces code using tabs, and then later, unexpectedly, your work then gets combined with another team that uses spaces. You then want to combine the code together in interesting ways. Someone has to go through and convert the tabs to spaces."</p>
<p>Q: Why??</p>
<p>Sure, if one copies/pastes lines WITHIN blocks to other (different) blocks, then TvS comes into play. But shouldn't one/one's editor take care of the formatting?</p>
<p>Or, perhaps, SIMPLY WRITE A QUICK UTIL TO DO THE CONVERSION FOR YOU, LIKE, SAY, <strong>IN PYTHON</strong>! :></p>
<p>(Of course, this assumes there's not already utils for this. But we digress... ;)</p>
<p>@David Jones:</p>
<p>That various utilities choke on Tabs isn't a valid reason to not use Tabs. It certainly IS a valid reason to -fix- the -broken- utils, though.</p>
<p>However, please note that I'm not misguided; I understand that They Are What They Are and we Mere Mortals must deal with them. That you have to is tragic. But I understand.</p>
<p>Still, aren't there utils to convert T->S first?</p>
<p>And such things as "scripts" to enable one to not have to re-type such filters every time?</p>
<p>Even a "language" one might use to whip something up quickly to help with such things... Hmmm... Might be something to that concept... :></p>
<p>@World:</p>
<p>Tabs for indentation. It's not hard.</p>
<p>Spaces for alignment. It's not hard.</p>
<p>This "argument" reminds me of the aversion that C-like lang users have to "significant whitespace"! :P Get over it. It's not hard.</p>
<p>I'm NOT saying "I'm right, You're wrong!" Simply that there's no Here here! Put code through a filter to YOUR specs/desires. Don't try to foist YOUR preference upon World+Dog and "justify" it with the problem cases. <em>I</em> use Tabs, and have never had a problem.</p>
<p>Heck, MY biggest problem is that the HTML def makes whitespace "disappear"! (Handy for HTML source [I like indentation! SURPRISE! ;], but HORRIBLE for -presentation-.) So, without CONSCIENTIOUS EFFORT, posted Python code is "broken".</p>
<p>Funny how that's one of the "arguments" that "Leading Whitespace Sucks!"-people throw out there for why Python's "broken".</p>
<p>Nee! <em>I</em> contest that it's <em>HTML</em> that's broken! (Not unlike the aforementioned tools, above. ;)</p>
<p>Anyway, just up late and thought I'd throw a couple cents into the kitty... :)</p>
<p>Cheers,
-L</p>
<p>tabs are too evil for words.</p>
<p>And, no, here is the difficulty with "fixing" "broken" tools: there's no consensus for what a tab actually represents. (That's kind of the advantage of tab, right?) This actually presents a real problem.</p>
<p>Should a tool interpret a single tab character as eight spaces? Perhaps the tool should do a regex replacement through the entire file on the fly -- being sure not to replace tabs that are embedded in strings... a single space (i.e., that's what it actually exists as in a byte stream)? four spaces?</p>
<p>Only the Shadow knows, and he doesn't work on code.</p>
<p>Spaces are right for two reasons:</p>
<ul class="simple">
<li>They're a standard</li>
<li>They're cross-platform/editor/etc</li>
<li>They make sense to normal text-munging tools</li>
<li>You can still use Tab to generate them. I use these settings for Vim, a similar thing is possible with Emacs:</li>
</ul>
<blockquote>
<p>set expandtab " Do not insert tab when <Tab> was pressed - insert a number of spaces</p>
<p>set shiftwidth=4 " Number of spaces to use for each step of (auto)indent</p>
<p>set tabstop=4 " Number of spaces that a <Tab> in the file counts for</p>
<p>set softtabstop=4 " less than 4 may result in breaking of lists</p>
</blockquote>
<ul class="simple">
<li>Spaces have a single, universal meaning</li>
<li>Most editors don't visually show a difference between tabs and spaces unless you turn on a "show all the ugly characters" mode..</li>
</ul>
<p>I've actually downloaded (and reformatted) Python code that contained a mixture of Tabs and spaces throughout. In some blocks the author used tabs, and in others spaces. I had to reformat the entire thing using regex.</p>
<p>In closing, I have one thing to say:</p>
<pre class="literal-block">
:%s/\t/    /gc
</pre>
<p>A space is .25 of a level???</p>
<p>Instead of tabs or spaces, or tabs being so many spaces, there should be a "level" character, where one character equals one level of nesting. Then, to go one level deeper, you add this character once. Your editor can display it as whatever your preferred indentation-per-level is.</p>
<p>Some editors like scintilla, EMACS, and vi already use the tab this way, but obviously, by the above discussion, this is very difficult for programmers to use or convert to their preferences. So what is needed is a new character meaning "one additional indentation level" instead of "indent to next tab stop". Just put it on the programmers block of the control-character plane of Unicode, right?</p>
<p>Of course, for pure simplicity and portability it's hard to beat one space per level. For readability, either run it through sed or write a little python script to expand the leading indentation to your own habitual coding convention.</p>
<p>I buy 99% of PEP8, except:
Opposin..-.. inside a function or method is s... then use i for the first one, j for ...rd."
Argument: if no relevant, just use the sourceexpression of the variable. If it has even a minimal "caching" relevancy, name it so</p>
<p>I buy 99% of PEP8, except: I don't like the line spacing rules... I can't read the code when it's too close together - it looks congested and I can't see where one fn ends and another starts when scrolling down quickly.</p>
<p>One blank line between fns is not enough IMHO. I use one blank line for intra-function spacing, two between methods and global fns, and three between classes.</p>
<p>Maybe I'm just getting old.</p>
<p>PS. Tabs suck.. ;) I once knew a (very good) coder who edited all his stuff in proportional fonts!!</p>
<p>As for spaces ....</p>
<p>I do the 80 character width and I find that 4 spaces per indent make me run out of space very quickly.
I like to use 2.
Quite readable to me.</p>