The place of Bittorrent Sync in file sharing

First, a happy Teachers' Day to all my teachers who taught me so much. Thank you. You are the greatest.

File sharing yesterday

From the earliest computer networks, we have had some form of file sharing. I first started using computers in 1989 (a PC-XT with 360K floppy drives) and back then file sharing essentially meant copying files onto floppy disks. Then I encountered the BBS scene when I got my first modem (a Motorola Lifestyle 28.8K bought from another country; at the time no one was selling more than 14.4K in India). I would gleefully download a game or utility from a BBS and despair at the download ratio restriction, since I did not really have anything to share back. On the Internet I recall using FTP servers and Archie, Veronica (still no Betty?), Jughead and Gopher back in 1995. These were later superseded by web-based downloads, and sites such as download.com and tucows were a bookmark away. On the college intranet we had Novell NetWare file shares and later Windows file sharing. When I first encountered Linux in 1995, I came to know about NFS but I got to use it only when I went to Columbia in 1998. By that time, an email attachment was an acceptable way to share small files. It still is. So nearly 20 years have passed since my first encounters with file sharing but even today, sharing files (especially big ones) is a hassle. Why?

In most older file sharing protocols, it is assumed that one (powerful) computer will be hosting the files and other computers will be accessing those files. Therefore all the other computers would be acting as clients to the server. There were permissions on who could and could not access those files. So login IDs were distributed and passwords were employed. Permissions were also granted on what the eligible users could do with the files. If the files are not meant to be edited in any fashion, the access to those files is Read-Only or RO. If on the other hand, users are expected to change the files, or add new files or remove files from the share, the access is Read-Write or RW. Clearly, sharing a file is a complicated endeavour. But it worked. Both in an intranet and on the Internet.
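To make the login-plus-permission idea concrete, here is a minimal sketch in Python. It is not taken from any particular file server: the `PERMISSIONS` table, the user names and the `is_allowed` function are all made up for illustration.

```python
from enum import Enum


class Access(Enum):
    RO = "read-only"
    RW = "read-write"


# Hypothetical permission table kept by the server: login ID -> access level.
PERMISSIONS = {"alice": Access.RW, "bob": Access.RO}


def is_allowed(user: str, operation: str) -> bool:
    """Return True if `user` may perform `operation` ('read' or 'write')."""
    access = PERMISSIONS.get(user)
    if access is None:
        return False  # no login on this server, no access at all
    if operation == "read":
        return True  # both RO and RW users may read
    if operation == "write":
        return access is Access.RW  # only RW users may change the share
    return False  # anything else is refused
```

In this sketch `is_allowed("bob", "write")` is refused because bob only holds RO access, while a user with no entry in the table cannot even read.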

File sharing today

With the advent of camera phones and widespread Internet access through those phones, everyday users of these devices create a large number of large files that they would like to share with family, friends, foes and the Internet in general. So how do they do it? The mechanism is the same, only the interface has changed. Create an account with a hosting service (Facebook, Instagram, Vine, WhatsApp etc.) and assign permissions that decide which other users of that exact same service can see what you share. When the user uploads the file, the hosting service provides a nice web page or app so that the other users (family, friends, anyone) can see it. It may even go ahead and notify those other users that a new image/video is available. But what if you wanted to share documents and spreadsheets? There are other service providers like Google Drive, Microsoft SkyDrive etc. that will help you do that too. All these services follow a similar design principle: logins and permissions.

Then there are services like Dropbox which provide continuous, incremental synchronization of your files. In these services, a small program constantly monitors specific folder(s) and if you change anything in them (add/modify/delete files), it repeats the same change everywhere else you have the same service running. A sort of master-slave philosophy wherein the slaves follow the master and duplicate the actions on the master. I say sort-of, because in the case of synchronization services, all computers running the service are masters and all are slaves. So let's say you wanted to share some files with team members, a la Windows file share: you would copy them to a folder that is being monitored. Soon, every other computer running the same service (with the same login ID and password) will add the newly copied file. If you deleted a file from the monitored folder on any of the computers, the service would delete that file on all the other computers as well. Thus this type of file sharing is called file synchronization. The service is able to do this because it connects all these computers to a file hosting server, which keeps track of the files and also keeps a copy of them. It is worth noting that a lot of people use these synchronization services as a backup mechanism so as to

  1. avoid carrying USB pen drives from work to home
  2. back up important files on different computers
  3. have a copy of the file on a mobile phone and on a desktop computer
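The folder monitoring described before this list can be sketched as "take a snapshot, take another one later, compare them". The Python below is only an illustration of that idea and not how Dropbox or any other service actually works; the function names `snapshot` and `diff` are my own, and real services rely on operating system notifications rather than rescanning the folder.

```python
import hashlib
from pathlib import Path


def snapshot(folder: Path) -> dict[str, str]:
    """Map each file's path (relative to `folder`) to a SHA-256 digest of its contents."""
    result = {}
    for path in sorted(folder.rglob("*")):
        if path.is_file():
            rel = path.relative_to(folder).as_posix()
            result[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return result


def diff(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Classify what changed between two snapshots of the same folder."""
    return {
        "added": sorted(p for p in new if p not in old),
        "deleted": sorted(p for p in old if p not in new),
        "modified": sorted(p for p in new if p in old and new[p] != old[p]),
    }
```

A synchronization program would then replay the "added", "deleted" and "modified" entries on every other computer, which is exactly why a deletion on one machine propagates to all of them.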

If you observe carefully, every single one of the above mechanisms relies on a computer/device that hosts the files and is responsible for sharing them.

What if you wanted to share files with friends who are on your home/office network? What if that intranet is not connected to the Internet? What if you wanted to back up your files onto a server at home? Or on the Internet? What if you wanted to back up your files to multiple computers all at once? What if your central file server was unavailable? What if you wanted to synchronize your files without using a third party? What if you were a speaker at a seminar who wanted to share some documents impromptu with an audience who are not all users of the same service? What if you did not want the NSA to snoop on what you are sharing and with whom? One option is to install a file sharing server on your intranet which is also accessible from the Internet. Another option is to set up a file sharing service on every computer/device that needs the files, one that works regardless of the network you and others are connected to. And that is what Bittorrent Sync does.

Bittorrent Sync

In Bittorrent Sync, all computers that are sharing files are considered peers. Why? Because each of them is responsible for distributing the files irrespective of whether it is the origin of the file or not. This is aligned with the original Bittorrent idea wherein all computers copying a file (i.e. the peers) are also responsible for distributing that file. Obviously, until the entire file is available to a peer, it will share only those parts of the file that it has downloaded or copied. As soon as the complete file has been copied, the peer will share the entire file. So how do we address the fundamental questions of file sharing: who can access the files, and how do we assign RO or RW permission?
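The "share whatever parts you already have" idea can be sketched in a few lines of Python. This is a toy model and not the actual Bittorrent wire protocol: the `Peer` class, the tiny `PIECE_SIZE` and the in-memory "network" are assumptions made purely for illustration.

```python
PIECE_SIZE = 4  # bytes; tiny on purpose, real systems use far larger pieces


def split_into_pieces(data: bytes, piece_size: int = PIECE_SIZE) -> list[bytes]:
    """Cut a file's contents into fixed-size pieces (the last one may be shorter)."""
    return [data[i:i + piece_size] for i in range(0, len(data), piece_size)]


class Peer:
    """A toy peer that holds some (or all) pieces of one file."""

    def __init__(self, name: str, total_pieces: int):
        self.name = name
        self.total_pieces = total_pieces
        self.pieces: dict[int, bytes] = {}  # piece index -> piece contents

    def is_complete(self) -> bool:
        return len(self.pieces) == self.total_pieces

    def fetch_missing_from(self, others: list["Peer"]) -> int:
        """Copy each missing piece from any peer that has it; return how many were fetched."""
        fetched = 0
        for index in range(self.total_pieces):
            if index in self.pieces:
                continue
            for other in others:
                if index in other.pieces:
                    self.pieces[index] = other.pieces[index]
                    fetched += 1
                    break
        return fetched

    def assemble(self) -> bytes:
        """Rebuild the file; only valid once the peer is complete."""
        return b"".join(self.pieces[i] for i in range(self.total_pieces))
```

In this model a new peer can rebuild the whole file from two peers that each hold only half of it, even if the origin is switched off.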

Bittorrent Sync handles both these scenarios well. A long series of random letters and digits constitutes a key. Each key is related to a folder on your computer or device. Some keys allow RO, others RW. Knowledge of a key is akin to having a login for that folder. Generation of keys is in your hands and is done locally (i.e. on your computer). Keys can be changed as frequently as you like. In fact, Bittorrent Sync has a few more interesting options for file synchronization, such as time-limited sharing and a limit on the number of users who get to copy the files. File transfers are encrypted. It works over LAN and WAN. It is almost always faster than other file sharing mechanisms (because all peers contribute to your download speed, not just the origin). So what are the drawbacks? Plenty.
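Since the implementation is closed, I cannot show how Bittorrent Sync really builds its keys. The Python below is only a sketch of the general idea under my own assumptions: a key is a locally generated random string, and a read-only key can be derived from a read-write key with a one-way hash so that it cannot be turned back into the RW key. The "RW"/"RO" prefixes, the key length and both function names are hypothetical.

```python
import base64
import hashlib
import secrets


def new_rw_key(num_bytes: int = 20) -> str:
    """Generate a fresh random read-write key locally (hypothetical format)."""
    raw = secrets.token_bytes(num_bytes)  # cryptographically strong randomness
    return "RW" + base64.b32encode(raw).decode("ascii").rstrip("=")


def derive_ro_key(rw_key: str) -> str:
    """Derive a read-only key from a read-write key with a one-way hash.

    Anyone holding the RW key can compute the RO key, but the RO key
    does not reveal the RW key. This is an assumed scheme for illustration.
    """
    digest = hashlib.sha256(rw_key.encode("ascii")).digest()[:20]
    return "RO" + base64.b32encode(digest).decode("ascii").rstrip("=")
```

Changing a key then simply means calling `new_rw_key()` again and handing the new key (or its derived RO key) to the people you still want to share with.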

Drawbacks in Bittorrent Sync

  1. Bittorrent Sync is an underlying protocol for file synchronization and therefore is not the easiest to use or understand, even with a GUI.
  2. Sharing the keys is definitely not as easy as adding a friend on Facebook or following someone on Twitter.
  3. Intranet-Internet sharing may be slow unless you have UPnP enabled or punch a hole in your intranet firewall. (This should be less of a problem with wide IPv6 adoption.)
  4. Since it is a file synchronization mechanism, it cannot be used directly as a backup mechanism (I will discuss this in a follow-up journal entry).
  5. The underlying implementation is not open source so we cannot be absolutely clear as to what's happening inside.

File sharing of the future?

Nevertheless, I am still excited. Why? In the future, it is likely that it (or another similar protocol) will become the underpinning protocol for file sharing services. So it is likely that someone like Facebook will not need to store millions of photos each day for its users. It is likely that your selfie will not travel to a datacenter halfway around the world before it becomes available to your sister sitting on the sofa next to you. It is likely that file sharing will become an invisible process for files of any size, and that is a magic worth being excited about.