Interregional Software Development Week, Day 4: File Exchange

Being able to exchange files is an absolute necessity when developing software. There are many ways to do it, of which version control systems are probably the most helpful.

A couple of ways to exchange files:
* Send them through e-mail: works, but you end up with a big mess of e-mails and attachments containing different versions of files. It’s not very friendly to you as a user. Also, only the people you send the file to have access to it.
* FTP: set up an FTP account somewhere and let everybody store his/her files there. This works, but older versions of files disappear if you’re not making backups.
* Forum attachments: some forum systems have the ability to attach files to messages. This has the same disadvantages as e-mail, but the advantage is that everybody who can access the forum has access to the files.
* Version control systems: the option I want to talk about today.

Version control systems are not used nearly enough. They’re not only useful in big-ass million-people projects, but also when you’re working on something alone. How often have you removed a piece of code and saved the file, only to remember later that you needed that piece of code for another purpose? If you use a version control system, it’s easy to retrieve an older version of the file with the removed code still in it.

Within interregional projects, version control systems are not only useful for versioning purposes, but also for distribution purposes. But before I get into that, I’ll first explain how a normal client-server version control system works. Let’s start with a picture:

(Picture: Version Control System)

In the middle is the server. A version control server can serve multiple so-called repositories. A repository is just a tree of directories and files. Usually you use one repository per project, but there are reasons to use more. In each repository all the current versions of files are stored, but also previous versions, so you can always request an older version.
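To make the idea of a repository that keeps old versions a bit more concrete, here is a minimal sketch in Python. It is only a toy model, not how any real version control server is implemented: the @Repository@ class, its method names and the per-file version numbers are all assumptions I made up for illustration.

```python
class Repository:
    """Toy in-memory repository (hypothetical, for illustration only).

    It keeps every stored version of every file, so older versions
    can always be requested again.
    """

    def __init__(self):
        # Maps a file path to the list of its contents, oldest first.
        self._history = {}

    def store(self, path, content):
        """Store a new version of a file and return its 1-based version number."""
        versions = self._history.setdefault(path, [])
        versions.append(content)
        return len(versions)

    def latest(self, path):
        """Return the current (newest) version of a file."""
        return self._history[path][-1]

    def version(self, path, number):
        """Return an older version of a file by its 1-based version number."""
        return self._history[path][number - 1]


repo = Repository()
repo.store("src/main.c", "int main() { helper(); return 0; }")
repo.store("src/main.c", "int main() { return 0; }")  # helper() call removed

print(repo.latest("src/main.c"))      # the current version
print(repo.version("src/main.c", 1))  # the older version, removed code still in
```

The last line is the "I removed that piece of code but need it back" situation: the old version is still there, you just ask for it by number.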

Clients have a copy of a repository stored locally. This copy can be obtained with a so-called check-out. A check-out is an initial download of all directories and files to your local disk. After that you can keep your repository copy up-to-date with the update command. It’s important to realize that this is a local copy: you’re not editing files directly on the version control server.
Once you’ve got a copy of the repository, you can edit the files, add new ones or remove some. When you’re done, or you think it’s a good idea to store the changes in a safe place, you synchronize the changes you made with the version control server. This is called a check-in (or commit). The version control client can see which files have changed and will submit new revisions of those files to the server.
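The check-out, edit, check-in cycle described above can be sketched roughly like this. Again, this is a toy model under my own assumptions, not the way Subversion or any other system really works: the @Server@ and @WorkingCopy@ classes are hypothetical, every revision is stored as a full snapshot, and @update@ simply assumes you have no uncommitted local edits.

```python
class Server:
    """Toy version control server holding a single repository (hypothetical)."""

    def __init__(self, files):
        # Every revision is a full snapshot {path: content}; revision 0 is the start.
        self.revisions = [dict(files)]

    def head(self):
        """Return a copy of the newest revision."""
        return dict(self.revisions[-1])

    def commit(self, changed, removed):
        """Apply changed and removed files on top of the newest revision."""
        snapshot = self.head()
        snapshot.update(changed)
        for path in removed:
            snapshot.pop(path, None)
        self.revisions.append(snapshot)
        return len(self.revisions) - 1  # the new revision number


class WorkingCopy:
    """Local copy of the repository; edits happen here, not on the server."""

    def __init__(self, server):
        self.server = server
        self.base = {}   # files as they were at the last check-out/update/commit
        self.files = {}  # the files you actually edit

    def checkout(self):
        """Initial download of all files."""
        self.base = self.server.head()
        self.files = dict(self.base)

    def update(self):
        """Fetch the newest revision. Sketch only: assumes no uncommitted edits."""
        self.checkout()

    def commit(self):
        """Work out what changed locally and send only that to the server."""
        changed = {p: c for p, c in self.files.items() if self.base.get(p) != c}
        removed = [p for p in self.base if p not in self.files]
        revision = self.server.commit(changed, removed)
        self.base = dict(self.files)
        return revision, sorted(changed), sorted(removed)


server = Server({"README.txt": "hello", "old.txt": "obsolete"})
alice, bob = WorkingCopy(server), WorkingCopy(server)
alice.checkout()
bob.checkout()

alice.files["README.txt"] = "hello world"  # edit a file
alice.files["notes.txt"] = "todo"          # add a new one
del alice.files["old.txt"]                 # remove one
revision, changed, removed = alice.commit()
print(revision, changed, removed)

bob.update()              # bob now sees alice's changes
print(sorted(bob.files))
```

Note that the old snapshot is still in @server.revisions[0]@, so nothing alice removed is really gone.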

When multiple people are working on the same file, problems can occur. If you’re working with text files, many of those problems can be fixed automatically. For example, if person A is working on one subroutine and person B is working on another subroutine in the same file, and both check in their changes, these changes can often be merged. If the changes don’t conflict, they can both be applied. Note, however, that in the systems I know this only works on text files. It doesn’t work on images, UML diagrams or Word documents.
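To give a feeling for how such an automatic merge can work, here is a deliberately simplified line-by-line three-way merge in Python. The big assumption is that both people only changed lines in place, so all versions have the same number of lines; real tools use proper diff algorithms that also cope with inserted and removed lines. The function name @merge_lines@ and the example file contents are made up for this sketch.

```python
def merge_lines(base, mine, theirs):
    """Simplified three-way merge, line by line.

    Assumption: all three versions have the same number of lines
    (edits only replace lines in place).
    Returns (merged_lines, conflicting_line_numbers).
    """
    if not (len(base) == len(mine) == len(theirs)):
        raise ValueError("this sketch only handles in-place line edits")

    merged, conflicts = [], []
    for number, (b, m, t) in enumerate(zip(base, mine, theirs), start=1):
        if m == t:
            merged.append(m)          # both agree (same change, or no change)
        elif m == b:
            merged.append(t)          # only the other person changed this line
        elif t == b:
            merged.append(m)          # only I changed this line
        else:
            merged.append(b)          # both changed it differently: keep the
            conflicts.append(number)  # original line and flag it as a conflict
    return merged, conflicts


base     = ["sub a: return 1",  "sub b: return 2"]
person_a = ["sub a: return 10", "sub b: return 2"]   # A edits subroutine a
person_b = ["sub a: return 1",  "sub b: return 20"]  # B edits subroutine b

merged, conflicts = merge_lines(base, person_a, person_b)
print(merged)     # both changes applied
print(conflicts)  # empty list: nothing conflicts
```

If both persons had changed the same line in different ways, that line number would end up in @conflicts@ and a human would have to decide which version wins.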

Now, for which files should version control be used and for which shouldn’t it? Personally I’m in favour of using it for all kinds of files, both source code and (Word) documents. People argue that, because Word documents can’t be merged, it’s not very useful, but I beg to differ. Indeed, Word documents can’t be merged, so you have to figure out a mechanism to prevent two people from working on the same document at the same time, but it still has many advantages:
* Version control: that’s why we were considering this in the first place, wasn’t it? Old versions of documents are still retrievable, even if you can’t see the differences between the versions in as pretty a fashion as with text files.
* It’s a convenient way for the distribution of files. Version control systems are easier to use than e-mail, forum attachments or FTP.

The best way I’ve seen yet to prevent two people from working on a document simultaneously is just to have a “What are you working on?” topic on your forum. If someone’s going to work on a document, let him or her post a message stating the status of this work. When you want to edit a document, you first check that nobody else is already working on it. It’s not ideal, but it works.

*Software to use*
OK, you’ve decided to use version control. Great choice! Now you still need software to accomplish this. Personally, and I’m not at all alone in this, I’m very fond of “Subversion”:http://subversion.tigris.org. Subversion is a successor to the well-known CVS(Concurrent Versions System) with some of its issues fixed. There are both servers and clients available for most platforms. For Windows there’s a very easy-to-use client, called “TortoiseSVN”:http://tortoisesvn.tigris.org, that integrates nicely into Windows Explorer.