[How Coding in Python Might Be Bad For You]

Java C++ ActionScript Added on 10/04/2016

With the growing power of computers, scientists across all fields are starting to appreciate computer modelling as a valuable tool for their research. A particularly favoured language is Python, as it is easier to get into than statically typed, compiled languages like C++. Because of Python's popularity, there is a great number of code libraries and tools one can use for free. I use Python myself for analysis and visualisation of simulation results (although I write my simulations in C++). I have also used Python on a couple of commercial projects.

The momentum that Python has these days, both in science and in the software development industry, worries me. I have seen many programs and have worked with many programmers during my career as a software developer. I often struggle to explain to fellow scientists and programmers why I think Python is bad. Today, I would like to share the following analogy:

Imagine that you are writing an essay as coursework during your degree. There are good and bad courses and there are good and bad teachers. I'd like to argue that coding in Python is like writing a really bad essay and getting away with it. Here is why:

1. No structure

You are allowed to write your essay with any structure you want. You skip an introduction and go straight to your main arguments. You then mention a couple of references at the end. You get a B, which in your mind validates this approach to writing essays.

In Python, it is very easy for your code to become a mess, especially if you are accustomed to the fluidity that Python allows. There is no prescribed structure to a program: you can write in-line code, or you can put it into functions or into classes. What can end up happening in bigger projects is that there is a mixture of everything. This makes it really hard to read and maintain complex programs.

2. You are allowed to redefine words and say things that are not true

In your essay on African animals, you say that termites are social insects and that rhinos are mammals. A few pages later, you claim that ants are really stones and that rhinos can fly. You get an A - what a creative way of thinking about reality!

In Python, a variable can change its data type during run time. Likewise, a parameter to a function can take on any data type, determined at run time. Consider the following code snippet:

def whichWeekDayIsIt(date):
   """
   A function that returns name of the week
   day, based on a date.
   """

It is simple enough to understand the function signature, right? Yes, until someone changes the body of the function to cover cases where date is not a single date but a list of dates. You can then end up calling the function as


myDay = whichWeekDayIsIt(date=["01022016", "02022016"])

The parameter name 'date' loses its original meaning (a similar thing would happen if you changed the parameter name to 'dates' and then wanted to call it with a single date, not a list). Because variables have no declared data types, you basically either have to name variables in a very generic way, defeating the purpose of naming them, or name them in a way that is only partially true.
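
As a rough sketch of how this can happen, here is one hypothetical way the function body could grow to accept both a single date and a list. The use of datetime.date objects, the WEEKDAY_NAMES table and the isinstance check are assumptions made for illustration, not code from a real project:

```python
import datetime

# Hypothetical lookup table; date.weekday() returns 0 for Monday
WEEKDAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
                 "Friday", "Saturday", "Sunday"]


def whichWeekDayIsIt(date):
    """
    Returns the name of the week day for a date, or a list of
    names if 'date' happens to be a list of dates.
    """
    if isinstance(date, list):
        return [WEEKDAY_NAMES[d.weekday()] for d in date]
    return WEEKDAY_NAMES[date.weekday()]


# Same function, same parameter name, two different kinds of argument
oneDay = whichWeekDayIsIt(date=datetime.date(2016, 2, 1))
manyDays = whichWeekDayIsIt(date=[datetime.date(2016, 2, 1),
                                  datetime.date(2016, 2, 2)])
```

Both calls run without complaint, yet the first returns a string and the second returns a list, so neither the parameter name nor the function name tells the whole truth any more.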

The second problem with having no data type is that when it comes to passing variables into functions, you have to rely purely on documentation of that function to determine what you should pass in (which, more times than I am comfortable with, is missing in open-source libraries). Alternatively, you have to look into the body of that function to see how a particular parameter is used and to deduce what data type your variable should have. This is sometimes really difficult.
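
One partial remedy, assuming Python 3.5 or later, is to write the expected types into the signature as type hints. The sketch below uses hypothetical function names and keeps the single-date and list cases apart. Note that the interpreter does not enforce these hints; an external checker such as mypy has to be run separately, so this only helps in projects that adopt such a tool.

```python
import datetime
from typing import List

# Hypothetical lookup table; date.weekday() returns 0 for Monday
WEEKDAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
                 "Friday", "Saturday", "Sunday"]


def which_week_day(day: datetime.date) -> str:
    """Return the week day name for exactly one date."""
    return WEEKDAY_NAMES[day.weekday()]


def which_week_days(days: List[datetime.date]) -> List[str]:
    """Return the week day names for a list of dates."""
    return [which_week_day(day) for day in days]


single_name = which_week_day(datetime.date(2016, 2, 1))
several_names = which_week_days([datetime.date(2016, 2, 1),
                                 datetime.date(2016, 2, 2)])
```

With the hints in place, a reader can tell what to pass in from the signature alone, without opening the function body.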

3. The meaning of words changes based on the amount of spaces in front of them

Everyone makes mistakes during typing. But in the essay for this course, the word " dog" that has one space in front of it means a dog, while the word "  dog" (two spaces) means a flying saucer. Unfortunately, your word processor does not highlight multiple spaces as an error, which means you end up writing about UFOs when what you really wanted to do was describe your pet dog.

This is one of the most annoying things about Python for me: indentation, i.e., the use of white space at the beginning of code lines, defines what part of the program a line belongs to. It isn't just the fact that it slows coding down during "dirty-editing" of new code. This reliance on indentation alone is potentially dangerous, as it allows you to create logical bugs that are difficult to spot. Consider the following code snippet:

numOfDrivers = 0
for car in cars:
   print("car {}".format(car))
   numOfDrivers += 1

The number of drivers will be equal to the number of cars. Now imagine you are editing the code and you make the following indentation mistake:

numOfDrivers = 0
for car in cars:
   print("car {}".format(car))
   print("something else")
numOfDrivers += 1

The number of drivers will now be 1. It is possible that you will not realise this mistake until your code delivers unexpected results. In the worst-case scenario, you will never go back to your code and you will draw incorrect conclusions from your results. For example: "For every ten cars, there is one driver, we must have self-driving cars on the road!"
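
To see the difference in numbers, the two snippets can be wrapped in functions and run on the same input. The function names and the ten-element list are made up for this illustration, and the print calls are left out to keep the output quiet:

```python
def count_drivers(cars):
    """Increment inside the loop: one driver per car."""
    num_of_drivers = 0
    for car in cars:
        num_of_drivers += 1
    return num_of_drivers


def count_drivers_misindented(cars):
    """Same statements, but the increment has slipped out of the loop."""
    num_of_drivers = 0
    for car in cars:
        pass  # stands in for the print calls in the snippets above
    num_of_drivers += 1
    return num_of_drivers


cars = ["car{}".format(i) for i in range(10)]
correct_count = count_drivers(cars)            # 10
wrong_count = count_drivers_misindented(cars)  # 1
```

Both functions are valid Python and run without any warning; only the result differs.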

Let's compare this to what happens in other languages, like C++, Java, etc.:

int numOfDrivers = 0;
for (std::size_t i = 0; i < cars.size(); i++) {
   std::cout << "car " << cars[i] << std::endl;
   std::cout << "something else" << std::endl;
numOfDrivers += 1;
}

Here, the same mistake of unindenting the line that counts the number of drivers was made. However, because the code has a stronger structure and uses the curly brackets { } to identify the code block in the for loop, the resulting number of drivers will still be correct.

4. The only feedback you get is that you failed or passed your course

You cannot ask your course coordinator about your research for the essay, because he is too lazy to answer his emails. You spend weeks writing your essay, submit it and fail the course when it's too late to change anything.

Python has no separate compile step. You can simply run a Python program without first turning the code into an executable file. Yes, it may be faster to develop (small) applications this way. But in the long run, it actually makes you less productive. I have lost count of how many times my program failed after an hour or so of running because of some simple typo that a compiler could have picked up straight away.
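
A minimal sketch of this kind of failure, using a hypothetical function with a deliberately misspelled variable name. Python accepts the definition without complaint and only raises an error when the faulty line is actually reached:

```python
def long_simulation(steps):
    """Hypothetical stand-in for a computation that runs for a long time."""
    total = 0
    for step in range(steps):
        total += step
    # Typo: 'totl' was never defined, but Python only notices
    # when execution gets to this line, after all the work is done
    return totl


try:
    long_simulation(1000)
    error_message = None
except NameError as err:
    error_message = str(err)
```

A static checker such as pyflakes can flag an undefined name like this before the program runs, but it is a separate tool that you have to remember to use.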

5. Lack of control over volume

You need to print your essay to hand it in and you are allowed any formatting you like. Because you have absolutely no idea about how printing works, your essay ends up being triple spaced, with headings that span across pages. When you print it, it takes up a whole room.

Python leaks memory badly. Have you ever tried running a program that takes a few hours to execute and produces a lot of graphs using matplotlib? Python, like other languages that use a garbage collector, was designed based on the philosophy that it is the computer, not the programmer, that should decide how memory is allocated and when it is freed up.

What this means in practice is that it's very hard to optimise memory usage. Sure, things run smoothly for simple programs that print a Hello world message and maybe calculate a few things. For a scientific application that requires complex computations and uses big data, though, this can become a problem very quickly.
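
If you want at least some visibility into what the interpreter is doing, the standard library's tracemalloc module can report how much memory Python has allocated. The sketch below is an illustration under assumptions: it relies on CPython's reference counting to release the list as soon as its last reference is deleted, and the list size is arbitrary.

```python
import tracemalloc

tracemalloc.start()

# Allocate a reasonably large list of floats (size chosen arbitrarily)
data = [float(i) for i in range(200000)]
bytes_with_data, _ = tracemalloc.get_traced_memory()

# Drop the only reference; CPython frees the list straight away
del data
bytes_after_del, _ = tracemalloc.get_traced_memory()

tracemalloc.stop()

bytes_released = bytes_with_data - bytes_after_del
```

This only tells you what was allocated and released; you still cannot choose where or when memory is allocated, which is the limitation described above.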

6. Each sheet of paper is from a different manufacturer. Your printer dies horribly.

Because you do not produce your own printing paper, and because paper is really hard to find in bulk, you have to go on the Internet and buy individual sheets (so you can print out your thousand-page essay). Some paper manufacturers are very bad at their job but no one knows about it. One particular sheet had stone fragments in it and broke your printer.

It is very common in Python to use libraries, mainly so that your application production is faster, or because some people truly know better how to write code. However, the curse of open-source applies to program libraries too. Not everyone knows how to code efficiently. Almost no one, from my experience, knows how to document things properly.

A Python application may be using 20 other libraries to do what it's supposed to (because, let's face it, Python by itself doesn't support much). You end up installing libraries all the time, especially when you pick up someone else's project. Libraries often depend on other libraries that depend on other libraries. My laptop has become a mess since I started coding in Python; I must have installed at least 30 libraries to finally get some code running.

Secondly, most libraries are made by people like us, who are simply trying to get by writing code for their own application. Some of those people are kind enough to want to share their code, but they often do not go the extra mile to make their code usable in general. Poor documentation, lack of naming conventions, and other bad coding practices are very common.

7. You are in a community of false novel writers

Your fellow classmates also have to write their essays. As do many other people on the Internet. It feels to them like they are writing a novel, or even a trilogy, and they often truly believe that they are professional novelists capable of producing masterpieces. In reality, some of them have never even seen a novel.

The Python community feels very strange to me personally. Python claims to be a language to fit all purposes: it is functional, it is object-oriented, it is for simple utility scripts and it is used for working with big data. I have particular experience in Object-Oriented programming (OO) and I have seen OO in Python, C++, Java, Objective-C, ActionScript, JavaScript, PHP, etc. As in PHP, OO in Python is not fully implemented. For example, enforced encapsulation, one of the basic principles in OO, is non-existent in Python. The fact that you can create objects in a language does not make it a proper OO language.
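
A minimal sketch of the encapsulation point, using a made-up Account class. A leading underscore marks an attribute as private by convention only, and a double underscore merely renames the attribute rather than protecting it:

```python
class Account:
    """Hypothetical class used only for this illustration."""

    def __init__(self, balance, pin):
        self._balance = balance  # single underscore: private by convention only
        self.__pin = pin         # double underscore: stored as _Account__pin


account = Account(balance=100, pin=1234)

# Nothing stops outside code from overwriting the "private" balance
account._balance = -1

# The name-mangled attribute is still reachable under its mangled name
leakedPin = account._Account__pin
```

In C++ or Java, the equivalent accesses to private members would be rejected by the compiler.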

And that's ok, when everyone understands what a language is suitable for. PHP programmers will not claim that their language is for anything other than coding web sites. Python programmers, however, seem to think that their language is the answer to everything. If they remain within their niche community, they will never learn aspects of coding that are not present in Python, but are important in general.

Conclusion

Just like any other programming language, there are things that Python is good at and things it is bad at. I find it a great tool for writing initial exploration code, or small scripts to maintain large numbers of files and directories. Also, I use matplotlib extensively as it can produce some really nice graphs with relative ease.

The reason I have a problem with Python in general is how it is used. I do not find it efficient, and I therefore find it difficult to understand why people want to teach it to data scientists. I do not find it well structured and, for the reasons I list above, I think it encourages bad coding practices. That's fine if you are coding small applications alone. But for a big project, Python code can become a big mess very quickly, unless there is an experienced programmer to guide the development process.

In science, I think it is up to the people with a richer software engineering background to encourage others to learn good coding practice and to use languages more appropriate for big applications (like C++, Java, C, etc.). Yes, it is hard to get started with those languages. It is hard to understand memory management and it takes years to build up a sense of what is good and what is bad coding practice. But in the long run, more structured languages provide a safer and more robust environment for turning ideas into computer programs. Let's not write bad essays and pretend they are best-selling novels. Let's build up our skills slowly from the ground up, so that when we finally want to write those big novels, we can do so properly.


Comments

praneet
[11/11/2018]
>>> "If uncaught, e.g. in an educational setting, these bad habits propagates to one's professional work and make life harder to anyone else who tries to use their code." <<< No truer words were spoken. That is why I also firmly state that "starting" with python is bad. It gives you a false sense of confidence. But I also believe, that in the right circumstances, with the right ideas behind the development, it can be quite powerful. There is no one-size-fits-all. We are not doing server apps in C (at least I hope not) and we are certainly not doing realtime using python (not too sure about micropy on the STM boards though). Anyway, to each his own. I will definitely check out your fast data analysis cpp/python work. :)
Lenka
[09/11/2018]


Hi Praneet, thanks for your comments. Indeed, what you say is true, Python is easy to adopt which is why it's gaining popularity. This article (and I would say most of the comments) are not about hating Python. At the end of the text, I mentioned that I use Python myself for plotting data analysis results etc.

What I mainly wanted to point out, and I think what people here have mentioned as well, is that Python enables bad habits. If uncaught, e.g. in an educational setting, these bad habits propagate to one's professional work and make life harder for anyone else who tries to use their code. Yes, you can make bad code in other languages. But Python, in my opinion, is particularly suitable.

You also mention run time, and again, if we are talking about a simple desktop / server application, you are correct that it doesn't really matter if the code is optimal. But there are many uses in science, data mining and engineering where this is not the case. Using Python in these instances just because it's "easy" is, in my opinion, simply bad practice. I am a scientist, which is why I find this problem important to discuss. What I found very useful, and what I am doing currently, is creating Python modules in C++ and then using Python simply to plot processed data. This allowed me to speed up my data analysis program over 10 times (have a look here if you are interested in how I do this: http://lenkaspace.net/code/dataScience/fastDataAnalysisCppPython ) - and such a speed up is simply invaluable when your work revolves around analysing lots and lots of data.
praneet
[09/11/2018]
I went thru your article with great interest, while searching for the best way to create a non-gui linux application. More interesting than the article itself, are the comments !! :D

Seems like there are quite a few of python-haters out there. I am from an embedded background and understand the hatred for python, that I can see here. So no doubt, I advocate the learning of static languages before dynamic ones. But does that make Python a bad language? I can show you examples of "struct inside struct inside struct .... " in C, that would make your brain boil. At least, I hated that myself. So, essentially, the "badness" of the language depends on the "badness" of the user behind it.

Anyway , the above opinion differs from user to user. But one point, that I would like you all to think about is "creativity". X has an idea that he wants to bring to the world. He has bare programming skills. He lands on certain py modules, that he quickly cobbles together to give life to his idea. He knows enough about systems, to realize that, given the growing power/size of CPU/RAM, wasting time to do minute optimizations is redoubtable, at least at the application level. He knows he is not looking for real-time responses from any part of his application. So, is he justified in his choice, or should he take longer to learn a static language, while his competitor releases a dynamic language version much earlier? Maybe the answer differs from requirement to requirement.

Maybe we are feeling a bit challenged, because python has many adopters. But don't forget, at least it's getting people interested in programming and getting them to release their ideas. THIS is good. And truly, there is no way to tell how many of those programmers are good, bad or ugly. "Hip" or not, this language is here to stay. Unless you don't believe the graphs on github octoverse.

In the end, our code is a language used to "meet some requirement". I do not feel a language, that can help us to "meet" those "requirements", can be altogether a bad language. If that was really the case, we should all just go back to assembly.
Lenka
[08/10/2018]


Thanks for your thoughts Sandalo. I do think that some problems pointed out in the article are inherent to Python. However, I agree with you that how a program in the end turns out very much depends on the person who uses the language.
Sandalo
[08/10/2018]
Python is one of my favorite languages, but that's not why I didn't like most of the points you raised here. Most of them are valid, but the problem is not the language itself and what it offers. It has to do mostly with best practices and such. You can easily make a mess in Java and other typed languages. Python is even more strongly typed than some static languages. Python never coerces a type on an interpreter's whim like JS or C. Your points apply generally to interpreted languages, but Python is one of the best interpreted languages. Its standard library also offers so much; it might not be as rich as Java's, but it sure is one of the best. Anyways, this essay is an interesting read and has really pushed me to make a response!

