It’s still a work in progress, but it makes my coffee every single morning 🙂 This is the very definition of a safety critical system – imagine if it failed to work one morning (it doesn’t bear thinking about…)
I wanted to move an existing Subversion repository to Google Code, preserving all the history. This is possible, but (as with most things) turned out to be more complicated than expected.
To start with, you need to dump the existing repository. If you can’t log into the server directly, it’s possible to do this for a remote server using the svnrdump command:
svnrdump dump https://svnserver.com/trunk > remote.dump
This will probably contain the full history of several projects, so the next step is to extract the one project you need. Unfortunately, svndumpfilter won’t read the output of svnrdump (it doesn’t seem to like the dump file format version).
One workaround is to temporarily create a local repository, load the dump file to that repository, and then dump it back out again. The new dump file is then accepted by svndumpfilter. Not very convenient, but it works:
svnadmin create temp
svnadmin load temp < remote.dump
svnadmin dump temp > local.dump
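The round trip above can be wrapped in a small function so that the scratch repository is cleaned up afterwards. This is only a sketch: the function name normalize_dump and the file names are made up for illustration, and it assumes svnadmin and mktemp are available on your PATH.

```shell
# Hypothetical helper: round-trip a dump through a throwaway local repository
# so that svndumpfilter will accept the result.
# Usage: normalize_dump remote.dump local.dump
normalize_dump() {
    in_dump=$1
    out_dump=$2
    # Scratch directory that will hold the temporary repository
    tmp_dir=$(mktemp -d) || return 1
    status=0
    svnadmin create "$tmp_dir/repo" &&
        svnadmin load --quiet "$tmp_dir/repo" < "$in_dump" &&
        svnadmin dump --quiet "$tmp_dir/repo" > "$out_dump" || status=$?
    # Remove the scratch repository whether or not the steps succeeded
    rm -rf "$tmp_dir"
    return "$status"
}
```

Calling normalize_dump remote.dump local.dump should leave you with the same local.dump as the three manual commands, without a leftover temp directory.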
Then (theoretically) you can extract the specific project path from the dump file using svndumpfilter. Again, this caused me some problems because it leaves lots of empty commits (i.e. log messages related to other projects, with no file changes).
svndumpfilter --drop-empty-revs --renumber-revs include /your/path/ < input.dump > output.dump
There is apparently a newer version of svndumpfilter which accepts a new option --drop-all-empty-revs. However, updating SVN didn’t get me this version, and I couldn’t be bothered to build it from source.
The solution was to use the much more capable svndumpsanitizer. Unlike svndumpfilter, this actually seems to work, and is a single C file with no exotic dependencies, so very simple to build.
svndumpsanitizer --infile input.dump --drop-empty --outfile output.dump --include /your/path
I did find one bug with svndumpsanitizer: it doesn’t work if your dump file contains a partial set of revisions. For example: I had a dump file which started from revision 88, and this caused it to index out of bounds of an array and output nonsensical revision numbers like 1638472. I only discovered this by adding an assertion in svndumpsanitizer to check array bounds. After that, I fed it a complete dump file containing all revisions, and it worked fine.
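A quick way to spot a partial dump before feeding it to svndumpsanitizer is to look at the first Revision-number header in the file. The sketch below assumes a grep that supports -a and -m (GNU and BSD grep both do); the function names are my own invention, and it relies on a dump of a whole repository starting at revision 0.

```shell
# Print the first revision number recorded in a dump file.
first_revision() {
    # -a treats the dump as text even if it contains binary file data;
    # -m 1 stops after the first matching line.
    LC_ALL=C grep -a -m 1 '^Revision-number: ' "$1" | cut -d ' ' -f 2
}

# Succeed only if the dump starts at revision 0, i.e. it is not a partial dump.
is_complete_dump() {
    [ "$(first_revision "$1")" = "0" ]
}
```

For example, is_complete_dump input.dump || echo "partial dump, starts at r$(first_revision input.dump)" would have flagged my dump that started at revision 88.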
In my case, I also wanted to replace the username recorded as the author of each commit. This can be done using svndumptool as follows:
svndumptool.py transform-revprop svn:author OldAuthor NewAuthor input.dump output.dump
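To check that the rename did what you expected, you can list the author names that appear in a dump file before and after the transform. This is a rough sketch rather than a proper dump parser: list_authors is a made-up name, and it simply assumes that each svn:author key is followed by a "V <length>" line and then the value, so a versioned file that happens to contain a line reading exactly svn:author would confuse it.

```shell
# List each svn:author value found in a dump file, with an occurrence count.
list_authors() {
    LC_ALL=C awk '
        seen == 2 { print; seen = 0; next }   # line after "V <length>" is the author
        seen == 1 { seen = 2; next }          # skip the "V <length>" line itself
        $0 == "svn:author" { seen = 1 }       # property key found
    ' "$1" | sort | uniq -c
}
```

Running list_authors output.dump should then show NewAuthor and no remaining OldAuthor entries.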
Then you need to create a local repository and load the dump file.
svnadmin create repo
svnadmin load repo < final.dump
If this doesn’t work, you might need to manually create some of the higher level folders (e.g. trunk) before loading the dump.
svn mkdir -m "Create trunk" file:///full/path/to/repo/trunk
At this point you need to create your Google Code repo. Then use Administer > Source > “Reset this project’s repository” to ensure it’s empty.
Finally, you can initialise the remote repo and sync the local repo to the remote Google Code repo.
svnsync init https://project.googlecode.com/svn/ file:///full/path/to/repo/ --username email@example.com
svnsync sync https://project.googlecode.com/svn/ file:///full/path/to/repo/ --username email@example.com
If all goes to plan, you should now have a copy of your project on Google Code, with full history available. If not, you can reset the repo on Google, filter the dump files again, and retry.
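One simple sanity check is to compare the youngest revision of the local repository with that of the remote one, since svnsync keeps revision numbers identical. The sketch below assumes a Subversion client new enough to support svn info --show-item (1.9 or later); check_sync is a hypothetical name, and the path and URL in the usage line are the placeholders from the commands above.

```shell
# Compare the youngest revision of a local repository path with a remote URL.
# Usage: check_sync /full/path/to/repo https://project.googlecode.com/svn/
check_sync() {
    local_rev=$(svnlook youngest "$1") || return 1
    remote_rev=$(svn info --show-item revision "$2") || return 1
    echo "local r$local_rev, remote r$remote_rev"
    # Succeed only when both sides report the same revision
    [ "$local_rev" = "$remote_rev" ]
}
```

Matching revision numbers do not prove that the contents are identical, but a mismatch is a clear sign that the sync stopped early.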
This took quite a bit of trial and error and searching to find, so I hope it helps save you (and me) some pain in future.