For a long time I have been searching for the ultimate efficiency in the filing and retrieval of documents. Apart from the day-to-day document storage needs of the average family, I have a number of tax-related investments, which result in the accumulation of a large amount of paperwork.
In reality only a small percentage of these documents need to be maintained in paper form so I went searching for the best way to digitise the data. The dual benefit of this approach is that it minimises storage needs while making retrieval much easier.
The following is the system that I have developed:

ScanSnap S1500M
Hardware
The first step is of course to digitise the documents, which requires a scanner. I already have a Brother Multi Function Centre that can scan documents, but it is very slow and does not do duplex. When I researched the options on the net, the Fujitsu ScanSnaps were mentioned time and time again. The consensus seems to be that these are the preferred devices for paperless office fanatics. I settled on the ScanSnap S1500M. I sourced mine via eBay from the US. It was delivered within 2 weeks and I saved about $400 compared to local prices.
Software
Apart from the ScanSnap Manager software, I use Yep for file tagging and viewing, Hazel for file renaming and filing, and ChronoSync for file syncing.
Storage
The documents are stored in a directory on my MacBook. This is backed up automatically via Time Machine and periodically to an external drive for archiving. Having learnt the hard way, I know it pays to have multiple backups of your important files. I also sync a copy of each file to my MobileMe account, which means that I can access the documents wherever I am. For accessing the documents on the iPhone I use ezShare, as it has a search function (among many other extras) that is missing from Apple's iDisk app.
Process
Generally I scan in batches. I organise my documents into 3 piles: single-page, double-sided and multiple-page documents. This makes scanning quicker, as ScanSnap can be set up to automatically create a separate document after x number of pages. For the first two piles, this lets me scan multiple documents in one batch and ScanSnap will automatically separate them. I don't usually perform OCR, but I do have additional ScanSnap profiles to handle these occasions. The documents are output as PDFs and saved to a central location.
I then open up Yep to view and tag the documents. I chose Yep partly based on its features and partly based on the fact that it does not store the scans in some sort of proprietary database. Yep 1.8 is a great app that is purely for the management of PDF files. Currently there is a version 2 in beta. It is more complex in functionality and can view a greater range of file types. It has switched to the OpenMeta tagging format, which is used by a number of developers at present. There is controversy surrounding the OpenMeta format, and it would be prudent not to rely on it alone for the storage and retrieval of your documents.
I tag each document by category (Document, Certificate, Ebook or Manual), purpose (Car, Food, Electricity, Warranty etc., or a person or investment name) and supplier (e.g. Amazon, Apple, Acme Inc). I will also add a year and month if it is a document that I receive periodically, such as a bank statement, or if I want to allocate it to a specific period, e.g. 2008 Tax Return.
I use the tags that I have allocated to the documents in Yep to rename and file them automatically with Hazel. Hazel renames each file with the date and time (down to the second) the document was created, followed by the tags that have been allocated. For example, a document created on 1/7/2009 that related to an Apple purchase would be renamed as:
- 20090701-123015-Receipt, Warranty, Macbook, Apple.pdf
By renaming this way I can find them via Spotlight even if the OpenMeta tags get corrupted.
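For anyone who wants to reproduce the naming scheme outside Hazel, here is a minimal Python sketch of the same idea. It is not how Hazel works internally. The function name build_filename and its parameters are made up for illustration, and it assumes the creation time and the tags have already been read from the file by some other means.

```python
from datetime import datetime


def build_filename(created, tags, extension="pdf"):
    """Build a name like 'YYYYMMDD-HHMMSS-Tag1, Tag2.pdf'.

    created: datetime the document was created (assumed already known)
    tags: list of tag strings in the order they should appear
    """
    # Date and time down to the second, e.g. 20090701-123015
    stamp = created.strftime("%Y%m%d-%H%M%S")
    # Tags are joined with a comma and a space, as in the example above
    tag_part = ", ".join(tags)
    return f"{stamp}-{tag_part}.{extension}"


if __name__ == "__main__":
    name = build_filename(
        datetime(2009, 7, 1, 12, 30, 15),
        ["Receipt", "Warranty", "Macbook", "Apple"],
    )
    print(name)
```

Because the tags end up in the file name itself, a plain file name search still finds the document if the tag metadata is ever lost.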
Hazel will then move the documents to a final storage location in directories based on the category tag. Any document that doesn’t have a Document, Certificate, Ebook or Manual tag will be stored in the default Receipts directory. Over time I may add more categories but these suit me fine for now. If the number of documents in a folder gets too large I will split them up into financial year directories.
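The filing rule can be sketched in a few lines of Python as well. Again, this is only an illustration of the logic, not Hazel's own rule format. The name target_directory and the "Archive" root folder are hypothetical, and it assumes each category directory is simply named after its tag.

```python
from pathlib import PurePosixPath

# Category tags that get their own directory; anything else goes to Receipts
CATEGORY_TAGS = ("Document", "Certificate", "Ebook", "Manual")
DEFAULT_DIRECTORY = "Receipts"


def target_directory(tags, archive_root="Archive"):
    """Return the directory a document should be filed in, based on its tags.

    archive_root is a placeholder; substitute the real storage location.
    """
    for tag in tags:
        if tag in CATEGORY_TAGS:
            # Assumption: the directory is named after the category tag
            return PurePosixPath(archive_root) / tag
    # No category tag found, so fall back to the default directory
    return PurePosixPath(archive_root) / DEFAULT_DIRECTORY


if __name__ == "__main__":
    print(target_directory(["Manual", "Car", "Acme Inc"]))
    print(target_directory(["Receipt", "Warranty", "Macbook", "Apple"]))
```

Adding a new category later would only mean adding its tag to CATEGORY_TAGS.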
ChronoSync then runs once per day and syncs the files to the external drive and MobileMe.
Conclusion
So far I am up to 500+ documents and find this system great. The time this process takes is probably about the same as the old manual filing system, but it is much quicker to retrieve, view and forward the documents as required. The additional benefit is that I estimate I will reduce my current filing needs from two 4-drawer cabinets to only 1 or 2 drawers in total.