How To Start Programming Like a Pro: Mindset and Tools
May 23, 2017
My fundamental goal for RC is to learn how to think and work like a programmer in a professional environment.
My programming background has been coding for a few research projects in SageMath, which is basically Python with pure math library functions, and a couple of foundational CS courses. I’ve never worked in industry or managed a large codebase.
Today I had the chance to learn about what constitutes a good programming setup from Saul Pwanson, who’s programmed his whole life (that’s over 30 years).
- How My Day Was Structured
- Setup and Tools
- Shortcuts To Avoid Rewriting Commands In Terminal
- Steps to Writing Good Code
- Why Use a Debugger (vs Printing)
- Programming Practices and Shattered Assumptions
- Test-Driven Development
- Saul’s Recommendation
How My Day Was Structured
Morning: Saul and I pair programmed on a Python program to freely reduce in groups. It was really an excuse for me to pick his brain about programming.
Afternoon and Evening: I wrote down what I learned before I had to go to lunch, then pair programmed a bit with Rudi Chen to figure out an optimal solution ($O(n)$ time and $O(1)$ space) to my freely reduce program. Rudi had to leave, so I continued blogging. We will resume tomorrow.
Setup and Tools
Saul and I share roughly the same setup, which is code on one side of the screen and command line on the other side. This makes code editing interactive.
Saul uses vim and does everything from the command line. This makes it easy for him to do things like split the screen and pair program remotely. He also had many shortcuts which made code editing blindingly fast to me. I had no idea what was going on at times. His gear was a giant monitor, an old mechanical keyboard, and an ergonomic mouse.
For contrast, Rudi and I use SublimeText and Terminal and program on our laptops.
Setting up tools like Saul requires time, but seeing him move on the computer, I had a feeling it’s well worth it. Saul doesn’t upgrade on a daily basis, but he once took a month of personal project time to upgrade.
The tools will change as you progress through your programming career. A good way to upgrade tools gradually is to do it whenever you are working on a personal project and get frustrated. Instead of banging through things, look up how you can upgrade your skill set or toolset. Over time, you will build an infrastructure customized to you. That will make you powerful.
Shortcuts To Avoid Rewriting Commands In Terminal
- Press the up arrow to browse recent commands.
- Press ctrl+r to search previous commands.
- Type ‘!!’ to repeat the previous command.
- Press tab to complete a command.
Steps to Writing Good Code
Writing good code requires many drafts (say 3-4 for a short algorithm) plus a final draft that is polished and easy to read. The priorities should be (in order):
- Write something that works.
- Write something that is correct (test). Related: Test-Driven Development.
- Write something that’s fast (optimize).
- Write a final version that is easy to understand and thus maintain. Related: Clean Code.
Why Use a Debugger (vs Printing)
Essentially, a debugger allows you to access states in your program in an interactive way. For example, it can access the value of a variable at any point in time, with a level of detail that is within your control.
A debugger gives access to the kind of information printing extracts, without your having to know ahead of time what you want to print or having to modify the code to get the information.
A debugger is useful because it allows you to troubleshoot your code like you’d do with printing, without having to modify your code or restart the program.
For a large codebase, this is especially useful because it may take hours to run your code to get to a bug. Debugging becomes a huge time-saver.
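To make this concrete, here is a minimal sketch using pdb, the debugger that ships with Python's standard library. The script name count_matches.py and the function count_matches are made up for illustration. The point is that the file contains no print calls and no debugging code at all; the inspection happens from the debugger prompt.

```python
# Hypothetical script, saved as count_matches.py for this sketch.
# Run it under the debugger without editing it:
#     python -m pdb count_matches.py
# Then, at the (Pdb) prompt:
#     b count_matches   set a breakpoint at the function
#     c                 continue until the breakpoint is hit
#     n                 run the next line
#     p count           print the current value of count

def count_matches(items, target):
    """Count how many entries of items equal target."""
    count = 0
    for item in items:
        if item == target:
            count += 1
    return count

result = count_matches(['a', 'b', 'a', 'c'], 'a')
print(result)
```

With printing, I would have had to decide in advance that count was the variable worth looking at. Here I can decide once the program is paused.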
Since both Saul and I were uninitiated with the Python debugger, I had the opportunity to watch a seasoned programmer get acquainted with a new tool.
For Python, we ended up going for PuDB, which was second on the list of debuggers on Wikipedia (the first one being the default one). We just assumed the second one was the most highly recommended, and that was it. We did a pip search, then a pip install. The installation was no problem, but figuring out the right commands to run PuDB took a while. It was mostly a lot of googling and ctrl+f, knowing what to look for. No surprises here.
What I realized from the experience is that getting acquainted with the debugger for a certain language and choosing a suitable debugger takes a certain time investment even for someone very advanced. I am under the impression that this becomes a worthwhile investment professionally if you’re at the point where you’re working on serious projects in a particular language.
Programming Practices and Shattered Assumptions
The biggest surprise I had today is realizing that programming like a programmer is about writing logic that people can read and maintain effectively – not cool engineering hacks (unless they are called for). As a pure mathematician, I thought I’d have more of a culture shock.
Let me give you two simple but illustrative examples of what I mean.
Example 1: Writing a Traversal in Reverse Order
Suppose you have a list L which you want to read in reverse order. This is how I would write it.
n = len(L)
for i in range(n):
    L[n-1-i]
This is how Saul writes it.
for i in range(len(L)-1, -1, -1):
    L[i]
Recall that len(L)-1 is the start index, the first -1 is the stop index (non-inclusive: to finish on index 0, the stop must be -1), and the second -1 is the step (negative means we are traversing in reverse order).
The first line tells you explicitly we are going to traverse in reverse order. It allows i to be exactly the index of traversal we expect as a result. This frees up mental load.
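The two styles can be checked against each other with a short sketch. The sample list is made up, and the third loop, which uses the built-in reversed(), is an extra option that did not come up in the session; I include it only for comparison.

```python
L = ['x', 'y', 'z']

# Style 1: count up and compute the real index in the loop body.
first = []
n = len(L)
for i in range(n):
    first.append(L[n - 1 - i])

# Style 2: let range() count down, so i is the index being read.
second = []
for i in range(len(L) - 1, -1, -1):
    second.append(L[i])

# Style 3 (extra option, not from the session): reversed() needs no index.
third = []
for item in reversed(L):
    third.append(item)

print(first == second == third)  # all three visit 'z', 'y', 'x'
```

All three produce the same order. The difference is only in how much arithmetic the reader has to do to see it.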
Example 2: Optimal vs Easy to Use (Scale Later)
Suppose again we have the list L = ['a','b','a','c'] of strings in a large codebase (which you would find in a professional environment). We want to process it by deleting every entry equal to the string 'a'.
The optimal thing to do space-wise is to mutate the list, which would use no extra memory. We proceed in reverse order to eliminate the need to update indices.
def deleteA(L):
    for i in range(len(L)-1, -1, -1):
        if L[i] == 'a':
            del L[i]
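To see why the reverse order matters, here is a small sketch of what can go wrong when deleting while walking forward. The function names delete_a_forward and delete_a_reverse and the sample lists are my own, chosen for illustration.

```python
def delete_a_forward(L):
    """Buggy on purpose: deletes while walking forward by index."""
    for i in range(len(L)):      # the range is fixed at the original length
        if L[i] == 'a':
            del L[i]             # everything after i shifts left by one

def delete_a_reverse(L):
    """Same loop as deleteA: deleting at i never disturbs indices below i."""
    for i in range(len(L) - 1, -1, -1):
        if L[i] == 'a':
            del L[i]

forward = ['a', 'a', 'b']
try:
    delete_a_forward(forward)
    outcome = 'no error'
except IndexError:
    outcome = 'IndexError'
print(outcome, forward)          # the second 'a' was skipped, then i ran off the end

backward = ['a', 'a', 'b']
delete_a_reverse(backward)
print(backward)
```

Going forward, the loop skips the element that slides into the deleted slot and eventually indexes past the end of the shortened list. Going backward, the indices still to be visited are never affected by a deletion.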
However, it doesn’t always pay in practice to optimize. Saul reminded me that we were playing on a 16 GB machine (his computer). Given how small my expected input was, storing a second list doesn’t make a noticeable difference, even though the extra memory is $O(n)$.
Mutating the list is a global change which requires documentation. People don’t read documentation. Being people, I don’t! The next developer will probably forget that you are mutating a list, causing all kinds of errors.
Returning a new list is much safer.
def deleteA(L):
    Lcopy = []
    # Walk forward so the kept entries stay in their original order.
    for item in L:
        if item != 'a':
            Lcopy.append(item)
    return Lcopy
It’s worthwhile to mutate the list when everyone on the team understands that you are constrained in space. This can happen, for example, when your database becomes very large over time. That’s ok: rewrite the code. Scale later.
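The kind of surprise that mutation causes can be sketched in a few lines. The names delete_a_in_place, delete_a_copy, original, and alias are hypothetical; the second function uses a list comprehension, which is my shorthand for the copy-building loop.

```python
def delete_a_in_place(items):
    """Remove every 'a' from items by mutating the caller's list."""
    for i in range(len(items) - 1, -1, -1):
        if items[i] == 'a':
            del items[i]

def delete_a_copy(items):
    """Return a new list without 'a'; the input is left untouched."""
    return [item for item in items if item != 'a']

original = ['a', 'b', 'a', 'c']
alias = original              # a second name for the same list object

delete_a_in_place(original)
print(alias)                  # ['b', 'c'] (the alias changed too)

original = ['a', 'b', 'a', 'c']
cleaned = delete_a_copy(original)
print(original)               # ['a', 'b', 'a', 'c']
print(cleaned)                # ['b', 'c']
```

Any other part of the codebase holding a reference to the list sees the in-place change, whether or not it expected one. With the copy, only the code that asked for the cleaned list gets it.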
Test-Driven Development
Another advantageous mindset to have when developing is to write tests for your code as you figure out edge cases. These tests can be part of an associated test function, test(), built from assert statements. An assert statement raises an AssertionError if its argument is false. Test-driven development allows you to encode your insights as tests while you solve the problem.
For example, let a function reduce(L) process a list using pairs of neighbours, L[i], L[i+1]. We want the function to return L as is when len(L) < 2.
Let the input be the one-element list of strings L = ['a']. The testing code, including the test function test(), would look like this.
def test_reduce(input_list, output_list):
    ret = reduce(input_list)
    # On failure, show the input and the value actually returned.
    assert ret == output_list, '%s\n%s' % (input_list, ret)
    print("pass")

def test():
    test_reduce(['a'], ['a'])
    print("tests passed")
This test environment ensures that the one-element list ['a'] stays unmodified when passed through the reduce function.
Here, ret is the return value of the function reduce with input input_list. We assert that ret should be equal to a designated output, entered as output_list.
Suppose our function reduce is wrong, and reduce(['a']) returns ['b']. Then test() will display (input_list, ret) in the %s\n%s format we wrote after the assert statement.
AssertionError: ['a']
['b']
If no AssertionError gets raised, then the helper prints pass for the case it checked and test() prints tests passed.
pass
tests passed
In summary, test-driven development goes like this:
- Figure out a new feature/edge case to add to your function.
- Write a test for it, and look for the function to fail.
- Write a solution to pass the test.
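As a rough sketch of where a few rounds of this cycle might end up, here is a small stand-in for my program. It is not the code Saul and I wrote: the name free_reduce, the helper check, and the convention that an uppercase letter is the inverse of its lowercase letter are all assumptions made for this example. It also uses a stack, so it is not the $O(1)$ space solution.

```python
def free_reduce(word):
    """Cancel adjacent inverse pairs, e.g. 'a' next to 'A' (assumed convention)."""
    stack = []
    for letter in word:
        # swapcase() maps 'a' <-> 'A', our stand-in for taking an inverse.
        if stack and stack[-1] == letter.swapcase():
            stack.pop()           # the pair cancels
        else:
            stack.append(letter)
    return stack

def check(input_list, expected):
    result = free_reduce(input_list)
    # On failure, show the input and the value actually returned.
    assert result == expected, '%s\n%s' % (input_list, result)

def test():
    # Each line was a new edge case, written before the code that handles it.
    check([], [])
    check(['a'], ['a'])
    check(['a', 'A'], [])
    check(['a', 'b', 'B', 'A', 'c'], ['c'])
    print("tests passed")

test()
```

Each check line records one insight about the problem. If a later rewrite breaks one of them, test() says so immediately.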
Note: A nuance from Rudi: test-driving can sometimes slow down problem-solving if we are working on something tricky, because you are focusing on the result (given this input, I want my program to have this output).
Saul’s Recommendation
Look up the Python Virtual Machine.
That’s all for today! I want to thank Saul Pwanson again for investing so much time in me and being super helpful.
Special thanks to Scott Emmons for proofreading this and providing useful feedback.