How-To: Extract Files Not Under Version Control

We recently got stuck supporting an outdated and nasty third-party shopping cart application. You know the type: procedural code soup made up of hundreds of nested functions, global variables and poorly chosen names. Complicating matters, the client also required a comprehensive front-end redesign on a very aggressive schedule.

Being only marginally familiar with the shopping cart app, we decided to make life easy on ourselves by initially placing only those files that needed UI changes under version control. This did ease the pain of launching the redesign, since our repository was essentially a list of the files that needed to be updated on the production server. All files left outside version control were given read-only permissions, preventing our developers from inadvertently changing something outside this primary scope.

So today, when I needed to quickly duplicate the code base for another developer in our office, I was suddenly tasked with extracting all files not under version control, so that she could check out only the UI-related files for further updates. This is a marginally complicated task since you can't just copy the entire directory over: the new developer needed to commit with her own SVN credentials, and any hidden .svn directories littered throughout the copy would wreak havoc on a clean checkout.

The solution: I first ran the following command to get a list of all paths not currently under version control...

svn status | grep '^?' | sed 's/^? *//' > /tmp/files-not-under-svn.txt

Then, using that list, I appended each item to a tarball, giving me a final archive of all files not under version control. Problem solved!

while IFS= read -r line; do tar rvf /tmp/files-not-under-svn.tar "$line"; done < /tmp/files-not-under-svn.txt
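
If you find yourself doing this more than once, the two steps can probably be folded into a small script. What follows is only a sketch under a few assumptions: the function names unversioned_paths and archive_unversioned are made up for illustration, your tar needs to support reading member names from a file with -T (GNU tar and bsdtar both do), and the tarball path should be absolute. Keep in mind that svn status hides ignored files unless you pass --no-ignore, so anything matched by svn:ignore will not end up in the archive.

```shell
# Hypothetical helper: reads `svn status` output on stdin and prints
# only the unversioned paths (lines flagged with "?"), one per line.
# Stripping the flag with sed keeps paths containing spaces intact.
unversioned_paths() {
    sed -n 's/^? *//p'
}

# Hypothetical helper: archives every unversioned path in the working
# copy $1 into the tarball $2 (use an absolute path for $2).
archive_unversioned() {
    wc_dir=$1
    tarball=$2
    list=$(mktemp) || return 1

    # Collect the paths relative to the working copy root.
    (cd "$wc_dir" && svn status | unversioned_paths) > "$list"

    # -C switches into the working copy, -T reads member names from the list.
    tar -cf "$tarball" -C "$wc_dir" -T "$list"
    status=$?

    rm -f "$list"
    return "$status"
}

# Example call (paths are placeholders):
#   archive_unversioned /path/to/working-copy /tmp/files-not-under-svn.tar
```

Creating the archive in one tar call, rather than appending inside a loop, also means a leftover tarball from an earlier run is overwritten instead of accumulating duplicate entries.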