Infinite Disk, integrated file migration/backup
by Roedy Green ©1996-2008 Canadian Mind Products
This essay is about a suggested student project in Java programming. This essay gives a rough overview of how it might work. It does not describe an actual complete program. I have no source, object, specifications, file layouts or anything else useful to implementing this project. Everything I have to say to help you with this project is written below. I am not prepared to help you implement it; I have too many other projects of my own.

I do contract work for a living, which could include writing a program such as this. However, I don’t do people’s homework for them. That just robs them of an education.

You have my full permission to implement this project any way you please.

You write a program that backs up rarely used files over the Internet, using an ADSL connection to a server. Whenever local disk space gets low, it frees space by deleting the local copies of files that have already been backed up. When the user goes to open one of those files, there is a pause while it is retrieved from the server.

This scheme also acts as backup. All files can be backed up. To speed backup, files can be super compressed before they are sent to the server, and only the changes sent, as deltas.
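One crude way to find the changes is to compare per-block hashes of the new version against hashes remembered from the last backup. The sketch below is only an illustration under my own assumptions: the class name BlockDelta, the fixed block size and the choice of SHA-256 are all hypothetical. Fixed blocks cope poorly with bytes inserted near the start of a file, since every later block shifts; a real implementation would probably want rolling checksums in the style of rsync.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: finds which fixed-size blocks of a file changed. */
final class BlockDelta {
    /** Hashes each blockSize-byte block of data (the last block may be short). */
    static List<byte[]> blockHashes(byte[] data, int blockSize) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            List<byte[]> hashes = new ArrayList<>();
            for (int off = 0; off < data.length; off += blockSize) {
                int len = Math.min(blockSize, data.length - off);
                md.update(data, off, len);
                hashes.add(md.digest()); // digest() also resets md for the next block
            }
            return hashes;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is required on every Java platform
        }
    }

    /** Indexes of blocks in newData that differ from, or lie beyond, the old version. */
    static List<Integer> changedBlocks(List<byte[]> oldHashes, byte[] newData, int blockSize) {
        List<byte[]> newHashes = blockHashes(newData, blockSize);
        List<Integer> changed = new ArrayList<>();
        for (int i = 0; i < newHashes.size(); i++) {
            if (i >= oldHashes.size() || !Arrays.equals(oldHashes.get(i), newHashes.get(i))) {
                changed.add(i);
            }
        }
        return changed;
    }
}
```

Only the blocks listed by changedBlocks, plus the new file length, would need to travel to the server. The new length matters because a file that merely got shorter produces no changed blocks at all.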

The backup server can conserve disk space by noticing that, for example, two customers both have identical copies of MS Word For Windows DLLs installed. The server only needs to keep one copy of each DLL. It has to be careful. Customers may have identically named files that are not identical.
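A minimal sketch of that idea, assuming the server keys stored content by a digest of the bytes rather than by file name. DedupStore, its in-memory maps and the use of SHA-256 are my own illustrative assumptions, not part of any existing design.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for the server's file store. */
final class DedupStore {
    private final Map<String, byte[]> blobsByDigest = new HashMap<>();
    private final Map<String, String> digestByOwnerPath = new HashMap<>();

    /** Records content for customer/path; returns true only if new bytes had to be kept. */
    boolean put(String customer, String path, byte[] content) {
        String digest = sha256Hex(content);
        digestByOwnerPath.put(customer + "|" + path, digest);
        return blobsByDigest.putIfAbsent(digest, content.clone()) == null;
    }

    /** Returns a copy of the content recorded for customer/path, or null if unknown. */
    byte[] get(String customer, String path) {
        String digest = digestByOwnerPath.get(customer + "|" + path);
        return digest == null ? null : blobsByDigest.get(digest).clone();
    }

    int storedBlobCount() {
        return blobsByDigest.size();
    }

    static String sha256Hex(byte[] content) {
        try {
            byte[] d = MessageDigest.getInstance("SHA-256").digest(content);
            StringBuilder sb = new StringBuilder();
            for (byte b : d) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is required on every Java platform
        }
    }
}
```

Because the key is derived from the content, two customers with the same DLL share one stored copy, while identically named files with different bytes get separate entries. A cautious server could also compare the bytes themselves before deciding two files are the same.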

This is a very old idea. Univac 1106 mainframes and DEC PDP-10s with far less than a megabyte of RAM used to integrate file migration with backup to tape.

Before you can restore, you need to free up space. This is done by dropping infrequently used files already backed up. Happily, you don’t have to take time out to back them up.
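Choosing what to drop might look something like the following sketch. LocalFile and EvictionPlanner are hypothetical names, and I am assuming you track a last-access time and a backed-up flag for each file. It picks the least recently used files that are already backed up until enough bytes would be freed.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical record of what the migration program knows about one local file. */
final class LocalFile {
    final String path;
    final long sizeBytes;
    final long lastAccessMillis;
    final boolean backedUp;

    LocalFile(String path, long sizeBytes, long lastAccessMillis, boolean backedUp) {
        this.path = path;
        this.sizeBytes = sizeBytes;
        this.lastAccessMillis = lastAccessMillis;
        this.backedUp = backedUp;
    }
}

final class EvictionPlanner {
    /** Picks backed-up files, least recently used first, until bytesNeeded would be freed. */
    static List<LocalFile> chooseVictims(List<LocalFile> files, long bytesNeeded) {
        List<LocalFile> candidates = new ArrayList<>();
        for (LocalFile f : files) {
            if (f.backedUp) {
                candidates.add(f); // never drop a file that has no copy on the server
            }
        }
        candidates.sort(Comparator.comparingLong((LocalFile f) -> f.lastAccessMillis));

        List<LocalFile> victims = new ArrayList<>();
        long freed = 0;
        for (LocalFile f : candidates) {
            if (freed >= bytesNeeded) {
                break;
            }
            victims.add(f);
            freed += f.sizeBytes;
        }
        return victims;
    }
}
```

If the returned list frees less than bytesNeeded, there simply are not enough backed-up files yet, and the caller would have to wait for more backups to complete.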

You could also implement this with backup to a CD-ROM burner. To consolidate backups, so you don’t have to shuffle a zillion discs, you may need to periodically re-backup files that have not changed (which could entail temporarily restoring them).

You probably would want to use 64-bit checksums for end-to-end assurance that files were backed up and restored correctly. These also act as cookies for almost uniquely identifying files.
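Java has no 64-bit checksum in its standard library, so here is one possible sketch using the FNV-1a 64-bit hash. The choice of FNV-1a and the class name Checksum64 are my assumptions; any decent 64-bit function would serve. FNV-1a is not cryptographic, so it guards against accidents, not against deliberate tampering.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical 64-bit file checksum using the FNV-1a hash. */
final class Checksum64 {
    private static final long OFFSET_BASIS = 0xcbf29ce484222325L;
    private static final long PRIME = 0x100000001b3L;

    /** Folds the first len bytes of buf into a running hash. */
    static long update(long hash, byte[] buf, int len) {
        for (int i = 0; i < len; i++) {
            hash ^= (buf[i] & 0xff); // treat the byte as unsigned
            hash *= PRIME;           // wraps modulo 2^64, as FNV-1a requires
        }
        return hash;
    }

    /** Checksum of an in-memory byte array. */
    static long of(byte[] data) {
        return update(OFFSET_BASIS, data, data.length);
    }

    /** Checksum of a file, read in chunks so large files need little memory. */
    static long ofFile(Path file) throws IOException {
        long hash = OFFSET_BASIS;
        byte[] buf = new byte[8192];
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            while ((n = in.read(buf)) != -1) {
                hash = update(hash, buf, n);
            }
        }
        return hash;
    }
}
```

You would compute the checksum before upload, store it with the file on the server, and recompute it after restore. A mismatch means something was damaged along the way.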

The only tricky technical challenge is the hook into the operating system. It only has to intercept file open. You don’t have to intercept directory reads or close. You leave tiny stub files behind in the directory as proxies for the files. I would tackle NT first. The hooks are much more formal, with no ways around them. By the time you are ready, Windows 2000 will be out with, I hope, similarly bulletproof hooks. You could then leave Win95 and Win98 on the trash heap of history. I must admit I have not studied the open hooks available, but I have studied the defragger interface. It is quite bulletproof and stable. I would hope the open hooks would be done similarly. In contrast, in Win95/98 there are many ways around the hooks since low-level sector I/O is not restricted. You can cannibalise the Filemon VxD.
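The open hook itself is beyond what pure Java can do, but what goes inside a stub file is easy to sketch. The layout below is entirely my own invention for illustration: a magic marker, the file's 64-bit cookie and the original size. The name Stub and the marker "INFDSTUB" are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical layout of the tiny proxy file left behind after migration. */
final class Stub {
    static final byte[] MAGIC = "INFDSTUB".getBytes(StandardCharsets.US_ASCII);
    static final int LENGTH = MAGIC.length + Long.BYTES + Long.BYTES; // 24 bytes

    final long cookie;       // 64-bit checksum identifying the file on the server
    final long originalSize; // size of the real file, so tools can still report it

    Stub(long cookie, long originalSize) {
        this.cookie = cookie;
        this.originalSize = originalSize;
    }

    /** Serialises the stub as magic, cookie, size (big-endian). */
    byte[] encode() {
        return ByteBuffer.allocate(LENGTH)
                .put(MAGIC)
                .putLong(cookie)
                .putLong(originalSize)
                .array();
    }

    /** Returns the decoded stub, or null if the bytes do not look like a stub. */
    static Stub decode(byte[] fileBytes) {
        if (fileBytes.length != LENGTH) {
            return null;
        }
        ByteBuffer buf = ByteBuffer.wrap(fileBytes);
        byte[] magic = new byte[MAGIC.length];
        buf.get(magic);
        if (!Arrays.equals(magic, MAGIC)) {
            return null;
        }
        long cookie = buf.getLong();
        long size = buf.getLong();
        return new Stub(cookie, size);
    }
}
```

On an intercepted open, the hook would read the file, call decode, and if it gets a stub back, fetch the real content by cookie before letting the open proceed. An ordinary file that happens to be 24 bytes long and start with the marker would be misread, so a real design would want a stronger way to mark stubs, such as a file attribute.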

Backup should be imperceptible, a low priority background process. It might even stop completely when the computer or Internet connection is busy. You might simply monitor throughput and, if it is slower than some threshold, shut down temporarily. Backup would have relatively low overhead on the CPU, disk and Internet connection. It soaks up the upload side of the Internet channel, which is not normally very busy. Unfortunately, with ADSL the upload side is typically not nearly as fast as the download side. However, that works to your advantage when it comes time to restore.
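The throughput test could be as simple as the following sketch. BackupThrottle and its threshold parameter are hypothetical. The idea is that a slow upload rate suggests something else is using the line, so backup should back off.

```java
/** Hypothetical monitor that decides when the background backup should back off. */
final class BackupThrottle {
    private final double minBytesPerSecond;

    BackupThrottle(double minBytesPerSecond) {
        this.minBytesPerSecond = minBytesPerSecond;
    }

    /** True when the upload rate measured over the last interval is below the threshold. */
    boolean shouldPause(long bytesSent, long elapsedMillis) {
        if (elapsedMillis <= 0) {
            return false; // nothing measured yet, so there is no reason to pause
        }
        double bytesPerSecond = bytesSent * 1000.0 / elapsedMillis;
        return bytesPerSecond < minBytesPerSecond;
    }
}
```

The backup loop would send a chunk, time it, and sleep for a while whenever shouldPause returns true before trying again. Picking a sensible threshold takes some experimenting, since the achievable upload rate varies from one connection to the next.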

You would do a manual recall on a directory to revive a cold project so that all files are ready to go rather than dribbling in as you open them. You could even do this on a different machine if you knew the password. See how this project evolves into a scheme where you can sit down at any machine in the world, and your logical desktop is sitting there ready for you.

For a simplified initial version, you could handle migration but not backup. You keep at most one backup of each file on the server. The ISP then needs to track files only by cookie.

For the full backup version, the ISP needs to maintain a list of what backups exist on mass backup organised in a directory structure matching the customer’s, so the user can selectively restore individual files by date. Further, the user should be able to say, "Put this directory back the way it was as of March 26 1999".
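Here is a rough sketch of the lookup behind an "as of" restore. I am assuming the server keeps, for each path, a history of backup times mapped to cookies. BackupCatalog and its method names are hypothetical.

```java
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical catalog of which version of each file existed at which time. */
final class BackupCatalog {
    // path -> (time the version was backed up -> cookie identifying its content)
    private final Map<String, TreeMap<Instant, Long>> versions = new HashMap<>();

    /** Notes that the file at path was backed up at the given time with the given cookie. */
    void record(String path, Instant backedUpAt, long cookie) {
        versions.computeIfAbsent(path, p -> new TreeMap<>()).put(backedUpAt, cookie);
    }

    /** Cookie of the newest version backed up at or before asOf, or null if none exists. */
    Long cookieAsOf(String path, Instant asOf) {
        TreeMap<Instant, Long> history = versions.get(path);
        if (history == null) {
            return null;
        }
        Map.Entry<Instant, Long> entry = history.floorEntry(asOf);
        return entry == null ? null : entry.getValue();
    }
}
```

Restoring a whole directory as of a date would repeat this lookup for every path under that directory. The sketch does not record deletions, so a file deleted before the chosen date would wrongly come back. A fuller catalog would need some kind of deletion marker in each history.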

This scheme is not going to work without a 24-hour ADSL or faster connection. Restore would be too slow. You want backups to be scheduled at any time at the convenience of the ISP. It is typically backing up the least recently used files. If a user wanted to use one of these files, the backup of that file would be automatically abandoned. You thus have very little interaction between the backup and the files the user is working on.

What about a purely local version where the user is responsible for making the backups? You can get about a gig per CD with compression. ZIP drives are too tiny to bother with. Tape drives are too slow to search to bring back files.

I think it wisest to tackle this first as an Internet service. That gets rid of many headaches.

Stavros Macrakis macrakis@alum.mit.edu hopes to find someone to implement this for under $50000.00 USD. See the Automatic File Update project for hints on implementation details.

