Every once in a while I read something that is so insightful, so clearly written and so well documented that it enters my own personal pantheon of “Best Ever” documents. I recently added a new, simply divine article titled “Best Practices for Scientific Computing”, and I hope that everyone reading this post also takes the time to read it. I’m including the outline here only to encourage you to read the article in its entirety. It is extremely well written.
- Write programs for people, not computers.
  - a program should not require its readers to hold more than a handful of facts in memory at once
  - names should be consistent, distinctive and meaningful
  - code style and formatting should be consistent
  - all aspects of software development should be broken down into tasks roughly an hour long
- Automate repetitive tasks.
  - rely on the computer to repeat tasks
  - save recent commands in a file for re-use
  - use a build tool to automate scientific workflows
- Use the computer to record history.
  - software tools should be used to track computational work automatically
- Make incremental changes.
  - work in small steps with frequent feedback and course correction
- Use version control.
  - use a version control system
  - everything that has been created manually should be put in version control
- Don’t repeat yourself (or others).
  - every piece of data must have a single authoritative representation in the system
  - code should be modularized rather than copied and pasted
  - re-use code instead of rewriting it
- Plan for mistakes.
  - add assertions to programs to check their operation
  - use an off-the-shelf unit testing library
  - use all available oracles when testing programs
  - turn bugs into test cases
  - use a symbolic debugger
- Optimize software only after it works correctly.
  - use a profiler to identify bottlenecks
  - write code in the highest-level language possible
- Document design and purpose, not mechanics.
  - document interfaces and reasons, not implementations
  - refactor code instead of explaining how it works
  - embed the documentation for a piece of software in that software
- use pre-merge code reviews
- use pair programming when bringing someone new up to speed and when tackling particularly tricky problems
The only item I would have added is:
11. Maintain and update older code.
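To give a flavor of what the “Plan for mistakes” items can look like in practice, here is a minimal sketch in Python. It is my own illustration, not code from the article: the `rescale` function, its divide-by-zero bug and the test names are all hypothetical. It combines three of the sub-items: assertions that check the program's operation, an off-the-shelf unit testing library (Python's built-in `unittest`), and a bug turned into a test case.

```python
import unittest


def rescale(values):
    """Linearly rescale a list of numbers to the range [0, 1].

    Hypothetical example function, used only for illustration.
    """
    # Precondition: an assertion documents and checks what we expect.
    assert len(values) > 0, "need at least one value"

    lo, hi = min(values), max(values)
    if hi == lo:
        # Hypothetical bug fix: constant input used to divide by zero.
        return [0.0 for _ in values]

    result = [(v - lo) / (hi - lo) for v in values]

    # Postcondition: every rescaled value must lie in [0, 1].
    assert all(0.0 <= r <= 1.0 for r in result)
    return result


class TestRescale(unittest.TestCase):
    def test_known_answer(self):
        # A hand-computed answer serves as a simple oracle.
        self.assertEqual(rescale([0, 5, 10]), [0.0, 0.5, 1.0])

    def test_constant_input_regression(self):
        # The (hypothetical) bug, turned into a permanent test case.
        self.assertEqual(rescale([3, 3, 3]), [0.0, 0.0, 0.0])

    def test_empty_input_is_rejected(self):
        # Relies on assertions being enabled (the default without -O).
        with self.assertRaises(AssertionError):
            rescale([])
```

Saved in a file, the tests can be run with `python -m unittest`. The regression test is the important part: once a bug has been written down as a test, it cannot quietly come back.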
If you are still hesitant to go to the original article, go there for the 67 references to other books and articles that discuss scientific computing. Like I said, this article is a “Best Ever”.
A previous version of this article originally appeared in 2013 at WorkingwithData.