Thursday, June 05, 2008

Hobbyist programming advice

I've programmed quite a bit, in a variety of languages, for work, for fun and for learning. Quite often I've chosen how I do it, and at other times it's been dictated, but I've found that I unconsciously program to a set of rules. These mostly apply to "hobby" coding (games, utilities, etc. that you are mostly making for yourself) but can also apply to school or work if you have flexible working regimes:

1) Program in whatever language you like.

Honestly. It really doesn't matter for "casual" use. Don't program in C just because everyone else is, or because you've heard that it's better. Program in a language that you're comfortable with, even if that's BASIC and people tell you it's rubbish. You're the programmer, you should work to what you know. If you want to learn new languages, do so, but don't write every project in the current-fad language just because.

This also means that if you are confident in several languages, you should program in the most appropriate or your favourite language. It doesn't matter which! The most appropriate will make things easier (e.g. using something like Perl for handling a lot of text, or using Java for an object-oriented idea) but your favourite is likely to be the one you know best and you can make it do whatever handstands you need it to despite its shortcomings.

If someone tells you that you should start a program in C because you might need the performance later, test their theory first. Knock up a worst-case in your favourite language and see if it's worth the effort of learning a new language or coding in an unfamiliar one before you start. Chances are, for most things, just about any language will do.
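As a rough sketch of what "knocking up a worst-case" might look like, here is a small timing harness in Python. The workload function is a made-up stand-in (swap in whatever your program's heaviest job really is) and the one-million figure is only an assumption for illustration:

```python
import time

def worst_case_workload(n):
    """Hypothetical stand-in for the heaviest thing the program will do."""
    total = 0
    for i in range(n):
        total += (i * i) % 7
    return total

def time_call(func, *args):
    """Run func once and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(*args)
    elapsed = time.perf_counter() - start
    return result, elapsed

if __name__ == "__main__":
    # One million iterations is an arbitrary "worst case" for this sketch.
    result, elapsed = time_call(worst_case_workload, 1_000_000)
    print(f"worst case took {elapsed:.3f} s")
```

If the number that comes out is comfortably small on your own machine, the argument for switching language is probably weaker than it sounded.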

There's nothing worse than being told that you "should" be programming in C when you're knocking up a twenty-line batch file (yes, batch files are programming, of a sort!). However, knowing whether you want an interpreted vs compiled language can be a help when you start - the chances are that it won't matter though.

2) Ignore coding style.

If you're writing for yourself, you can ignore supposed "best" coding styles and work your own way. Even if you're working as part of a small team what matters is clarity. Whether you space or double-space or crunch your arguments together with their commas doesn't matter so long as it can be read and understood. Stylising the code can come later if it's necessary. And if you're the only person to ever read it, it doesn't matter that people aren't used to your coding style.

I would argue that programming teams should have a system where they all use their own coding styles and have "convertors" so that when they put code back into the global pool, the style gets modified to the "team" style and when they check it out, it gets customised to that particular programmer's style. This way, everyone works comfortably but the central repository stays consistent.

If you're a single developer, spending too much time on worrying about the style wastes the "programming mood". You could write a thousand lines of new code with new features in the time it takes to style your code.

3) Document the code.

Documenting your code is important, even for yourself. You will not get to the end of any significant project and remember why every line is in there or how they all work. But you don't need flow-charts, invariant lists, hundred-page specifications and all the other rubbish that programmers are taught to do if you're just working for yourself.

A simple comment above each block of code is more than sufficient but what's ABSOLUTELY necessary are warnings to yourself:


REM Don't remove this, because the program crashes without it.

// Don't pass different flags here because they are ignored.

;This line isn't redundant, it's deliberate. Without it X happens.

/* I know it's already supposed to be initialised but
without this, it can pass uninitialised through
the compiler unnoticed and cause problems.*/


For big projects, you might find a brief Changelog and a Todo come in handy, but everything else should be inside the actual code. It takes long enough to write user documentation without having to document every single thing you do for yourself.

4) Backup

Yes, I know, but you really need to, especially if you're programming. Programs can crash, file accesses can run amok, source code will be heavily edited to fix a basic problem that you'll track down to something else, etc. Ideally make a copy of files you intend to change just before you start a session, and after any major breakthrough (which may be as simple as finding the 1-character cause of a serious bug... you will kick yourself if you forget which of the hundred lines that mention "variableX" it was on and have to go through hours of debugging to find it again).

Regularly copy those files to other computers, or a website somewhere. SVN and other source code management programs are useful for this if you do it often but even just a simple ZIP and upload is better than nothing. THIS INCLUDES MAKEFILES AND COMPILE SCRIPTS!

I work on the following basis: At the start of a session, I backup the files I'm likely to change the most (e.g. graphics.c if I'm planning on revamping the graphics), to a dated copy in the same folder (e.g. graphics-2008-01-01.c). If I make significant advances (add a new feature, fix an old bug, etc.), I will copy that file again to the same folder (e.g. graphics-2008-01-01-added-double-buffering.c). At the end of a particular session, I copy all changed files to an SVN folder, which I may or may not decide to "commit" to a remote website. Every time I think I've made enough advances to see that the program obviously works differently, or which would be a real pain to replicate, I copy them to a couple of personal websites/SVN repositories.
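The dated-copy habit is easy to automate. This is only a minimal sketch of one way to do it in Python; the function name `dated_backup` and its `note` and `today` parameters are my own invention for illustration, not part of any tool mentioned above:

```python
import datetime
import shutil
from pathlib import Path

def dated_backup(path, note="", today=None):
    """Copy a file to a dated sibling, e.g. graphics.c -> graphics-2008-01-01.c.

    An optional note is appended, e.g. "added double buffering" gives
    graphics-2008-01-01-added-double-buffering.c. An existing backup
    with the same name is overwritten.
    """
    src = Path(path)
    today = today or datetime.date.today()
    suffix = f"-{today.isoformat()}"
    if note:
        suffix += "-" + note.strip().lower().replace(" ", "-")
    dest = src.with_name(f"{src.stem}{suffix}{src.suffix}")
    shutil.copy2(src, dest)  # copy2 also tries to keep the file's timestamps
    return dest
```

Calling `dated_backup("graphics.c")` at the start of a session, and again with a note after each breakthrough, gives roughly the folder layout described above. It does nothing about off-machine copies, so the SVN or upload step still has to happen separately.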

Every few months or so, backup all that stuff to a CD-R or something. You'll be glad of it in a few years' time. "I used to have code that did something similar, let me dig it out for you", etc.

5) Develop on a slow PC.

Seriously. Find an old junker, a laptop if possible because programming is one of those "I'm in the mood" hobbies that can take you any time and it's a shame to waste it. Don't make it so ridiculously bad that you can't use it comfortably, but a machine with less CPU and less RAM will show you the second you start to implement code that is detrimental to performance.

And it will also encourage good practice. Any program that's slow on your development PC will be unnecessarily slow if you ever do intend to scale it up somewhere else. So when that script to rename your MP3's starts struggling with 1000 files on your old laptop, you know that you should fix it before you convert it to rename 10,000 user accounts on the server in work. And when that game starts running out of memory with only a 10x10 board on your laptop, you know exactly how much it'll take on a modern PC before it starts exhibiting problems, when normally you wouldn't have given it a second thought.

This is especially important if you are developing for embedded systems, handheld games consoles, homebrew circuits, anything that's underpowered. You don't need to program on something of the same MHz or less of your target machine, but you need to be able to get an idea of where your program is slowing down while still using your development machine.

Using an old machine will also deter you from using your main desktop to program on. This can be important for a number of reasons. First, the source code (and associated backups) of any prolific programmer will get out of hand very quickly and you don't want it spread across your system. Second, you can backup to a different machine easily as part of the "routine" (e.g. a good system is to develop on a PC which is tricky to connect to the Internet - my old laptop has wireless but for simplicity and security, I set it up so that it proxies through my desktop machine for web access. This means that anything like FTP, SVN, etc. I have to deliberately enact or do from another machine, and it's slower to do so, which also provides the incentive to backup the same files to other machines while I'm waiting). Third, it removes the risk of damage to your everyday machine. This is especially important if you are using pointer-based or low-level languages that can potentially crash or corrupt a machine.

6) Test

If you intend to distribute the end-binaries to anyone, test them thoroughly. Good feedback is significantly hindered if your early versions have lots of basic problems or dangerous bugs. And get lots of people to test it for you. Announcing version 1.0 suggests that you've had lots of people testing it, believe it or not, so make it clear it's a test/0.9/beta/prerelease version and then improve on it to make the magic 1.0 something special that works first time for everyone.

Testing is multi-stage. You need to test it thoroughly yourself. This usually means running the end program an awful lot of times, with different inputs. You don't need to generate a program to automate a test suite, you just need to use the end program a lot. This is especially good for games - just play the game a lot - until you realise that actually it IS possible for both players in the fighting game to hit 0 health on the same move, and you should add a "Round Draw" sound/routine/graphic.
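To make the fighting-game case concrete, here is a tiny Python sketch of an end-of-round check that treats the double knockout as its own outcome. The function name and the result strings are hypothetical, chosen only for illustration:

```python
def round_result(health_a, health_b):
    """Decide a round after both players' hits for this move have been applied."""
    a_down = health_a <= 0
    b_down = health_b <= 0
    if a_down and b_down:
        return "draw"           # the case that only shows up after lots of play
    if a_down:
        return "player B wins"
    if b_down:
        return "player A wins"
    return "continue"
```

The point is the order of the checks: if you test player A first and return immediately, the simultaneous knockout quietly becomes a win for player B.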

Another stage of testing is theoretical. As you plough through the code, look for potential problems. "Oh, bugger, I don't test if someone asks for a -1 sized game!". Users are dumb - don't trust users, they are "rm -f *"'ing all the time, so they will find a way to break your program/their computer if it exists. Such analysis is hard to do if you set out specifically to do it but you'll catch a lot of bugs this way. This is best done when you spot a particular bug or as part of your general looking through the code for functions to tweak.
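The "-1 sized game" case is the sort of thing a few lines of input checking will catch. A minimal sketch in Python, assuming the size arrives as text typed by the user; the limits of 2 and 100 are made up for the example:

```python
MIN_SIZE = 2    # assumed limits, chosen only for illustration
MAX_SIZE = 100

def parse_board_size(text):
    """Turn user input into a usable board size, or raise ValueError."""
    try:
        size = int(text.strip())
    except ValueError:
        raise ValueError(f"board size must be a whole number, got {text!r}") from None
    if not MIN_SIZE <= size <= MAX_SIZE:
        raise ValueError(
            f"board size must be between {MIN_SIZE} and {MAX_SIZE}, got {size}"
        )
    return size
```

With this in place, `parse_board_size("-1")` raises a `ValueError` with a readable message instead of letting a negative size wander deeper into the program.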

The best stage of testing is undoubtedly giving the program to other people and asking them to test. They won't have the same file structure, hardware, software, way of using the keyboard/mouse/joystick. They won't be aware of or use the little foibles and habits you've got into ("Oh, I only ever run the game from a script within Windows, I thought it would just work without it", "I don't touch the mouse when it's loading up", etc.). And the less experienced the person is, the better. You'll get hundreds of people to test a game but only one or two would accidentally click outside the main menu, or tell you that the program says it can't find a common library.

And people who don't know how the program works will be clicking around in a non-obvious way so they will find all sorts of bugs as they try to figure out how it works, whereas an experienced person will just be dragging/dropping and completely ignoring the menus or buttons, for instance. My wife is the best bug-finder in the world. I guarantee that I can put a program that's worked successfully for months for a thousand people onto her Palm and within an hour she'll have crashed it somehow.

7) Have a nice development environment.

This doesn't mean a flashy GUI, it doesn't mean having a multitude of debuggers for multiple architectures. What it does mean is that there are as few distractions as possible when you're coding. You can compile with one command or click. You can place files into a clean test area with similar ease. You have all the tools you need most of the time. You don't need perfect Makefiles, or fantastically complicated scripts to allow it to compile on every architecture in the world, but you do need to have an easy way to do things for yourself.

For my projects, I often have Makefiles but use them sparingly or in a special way. And a lot of the time, if I'm working on a particular part of the code, I'll have a simple script that compiles, links and runs just the part that I need. I don't intend anybody else to ever use that script, I don't even distribute it, but it's a convenience when I'm programming.

If you're developing for another machine, you should have a way to simulate or access that machine quickly. For instance, I develop for the GP2X, so when programming for it, I usually have it close to hand. That doesn't mean I have to plug it in every session but it's easily accessible. I can plug it in and test stuff in it. As there are two main types of GP2X, I have a way to "emulate" the other type so that I can spot potential problems etc. I also have telnet and VNC access to the machine so that I can leave it running and test multiple programs in many ways in one session without having to keep re-connecting and running the machine.

For some people, this might also mean artificially limiting the memory/CPU available to their programs when they want to see how it would run on their target machine. Or running a software emulator. Or connecting the machine itself, transferring the program over and seeing how it works. Especially in the last case, you want to make it as simple as possible.

Debuggers are useful but you should only be using them relatively rarely. Simple problems are much easier to solve with code inspection. You'll usually know exactly where it crashed and a quick glance will give you the why. Tricky problems can be isolated with a couple of printf's or their equivalent. And difficult problems don't NEED the facilities of a debugger, they need your brain the most, but a debugger can then be a big help.

Have easy access to manuals and function references. Visual Basic used to come with a massive function reference manual that was incredibly useful. Sometimes just flicking through it would give you ideas about how to solve problems because you found functions you never knew existed that could do things in a different way for you. The TI-85 calculator was the same, and even the ZX Spectrum came with a BASIC manual. Even a "For Dummies" book can be an incredible reference for a seasoned programmer, especially if you switch between languages often ("Is it Select Case or switch() in this language?").

If you're programming with a particular library, say SDL, have access to a complete reference to it, with examples. This is immensely helpful, even if it's just a bookmark of a good site. I use one for SDL that lists the C prototype of functions and structures (which I copy/paste straight into a program and then edit to turn it into an actual function call), a brief description with lots of hints on what is and isn't possible (e.g. don't run SDL functions inside SDL threads etc.), and an example or two of using each function. It's amazing how much more helpful a simple example can be over a documented manual. Even a Google for a function name will turn up uses and tricks that you'd never have thought of, plus a variety of sites publishing "gotcha's" for using it.

8) Don't worry about the "proper way" of doing things.

You can't use GOTO. You should write performance-critical code in C. You shouldn't declare global variables. You should get structures down to their minimum required size. You should optimise all your functions. You should consolidate duplicated code.

Rubbish. Program for operation FIRST. You're not a NASA shuttle engineer, you're not dealing with 1000's of financial transactions. There are not going to be people sniffing over your code with disgust because you saved yourself having to restructure a hundred lines of code loops by using a jump instead.

If you intend to teach a programming course, or you're entering code into a critical programming project, then you should follow quite a lot of that advice. But if it's for a quick game that only you or a handful of people will ever want to see the code for, don't worry about it. A few of those items will help make you a more structured programmer, a few will reduce the risk of you leaving an accidental bug in your code, but the majority of them are pedantic ramblings of people who learned to program on punch cards and consider every cycle sacred. Don't get me wrong, I hate unoptimised programs with shoddy algorithms that bloat memory and slow the CPU, I can describe a bucketful of pathfinding, tree and sorting algorithms and their big-O scaling, and there was a time when I would audit Z80 assembly and literally save hundreds or thousands of cycles in even the smallest listing, but for casual programming all that sort of stuff can come later.

9) Think.

This is the best bit of programming. Getting your head around why the 100% logical machine in front of you isn't doing what you ask. Invariably, it's because you've done something wrong but finding it is the best part. Programming requires you to think not only logically but creatively ("I haven't got a way to use floating-point numbers on this machine, how can I get around that to draw a circle?", "I can't use this function from inside a thread, so how do I get this thread to do what I want?").
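For the circle question, one well-known answer is the midpoint circle algorithm, which gets by on integer addition, subtraction and comparison alone. The Python below is only a sketch of that technique (the function name is mine, and it collects points in a set rather than drawing to a real screen), not necessarily what you would end up with on a given machine:

```python
def circle_points(cx, cy, radius):
    """Return the integer (x, y) pixels of a circle using integer maths only."""
    points = set()
    x, y = radius, 0
    err = 1 - radius  # integer stand-in for the midpoint decision value
    while x >= y:
        # Work out one octant and mirror it into the other seven.
        for dx, dy in ((x, y), (y, x), (-y, x), (-x, y),
                       (-x, -y), (-y, -x), (y, -x), (x, -y)):
            points.add((cx + dx, cy + dy))
        y += 1
        if err < 0:
            err += 2 * y + 1          # midpoint inside the circle: keep x
        else:
            x -= 1                    # midpoint outside: step x inwards
            err += 2 * (y - x) + 1
    return points
```

No square roots, no sines, no floats: the error term just tracks whether the next pixel has drifted outside the circle. Working that out on paper first is exactly the sort of thinking this section is about.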

The best part, although it's shocking, is that you do NOT need a computer to program. Argh! What is he suggesting? No, I don't mean that you should write a hundred lines of C on paper perfectly first time, but that you can program in your head or on paper just as easily. You should be able to craft the basic structure of a program in your head before you start to write it, you'll be able to spot pitfalls, estimate which way is better or takes less RAM or less CPU.

You can program on a busy train. You can program in a quiet room. You can program in the garden. And you don't need a wireless laptop to do it. When I was learning to program, the best part was to get so struck with an idea that you dug out some paper to get your idea down before you'd even sat at a computer to try it.

Sometimes this even involved primitive flow charts but most of the time it meant just sketching out some pseudocode and working out some primitive structures.

I would use other people's programs or operating systems and work out how certain things functioned ("This must be a doubly-linked list", "They must be spinning in a loop waiting for me to press a key", "The AI in that is watching to see when you move more than X units into a small space before it considers it an attack", etc.). And a lot of the time I found myself stumped and sat down to work out how things worked. Or a way to improve them. Most of my ideas never made it onto an actual computer but it's the best part of programming - to stimulate the brain.

10) Enjoy it, and make it fun

Programming is like a puzzle. The aim of the puzzle is to finish whatever program you started (and sometimes just getting the answer hidden in the yellow squares isn't enough and you have to go and finish off the rest of the crossword). The intellectual challenge is to do that and to debug it when it doesn't work (and some people subject themselves to such intellectual challenges all the time, even when the answer is only a calculator tap away). The point of doing it all is to have fun (once you do the one-millionth wordsearch, or you've spent eight months on the same clue, the puzzles lose their appeal).

Why do you want to subject yourself to decrypting the Enigma code, when you could just as happily do a tabloid crossword? Do things that are within your capabilities, are fun, aren't over-stretching your brain, and that you enjoy doing. That's why people program. That's why there are billions of lines of code just floating about out there that people are offering for free... they don't care that it was hard to do, it was fun for them to do it. Fun to prove it could be done, or that it could be done better, or that they could do it, or to get recognition from someone else for their hard work.

This is especially true if you're programming games. If you can't enjoy the game at the end, there was little point in writing it, you would think, but actually even the worst of games can be fun to program.