Network Investigation Methodology for BitTorrent Sync:
A Peer-to-Peer Based File Synchronisation Service
Mark Scanlon, Jason Farina, M-Tahar Kechadi
School of Computer Science and Informatics,
University College Dublin, Belfield, Dublin 4, Ireland
Final accepted version available: http://dx.doi.org/10.1016/j.cose.2015.05.003
High availability is no longer just a business continuity concern.
Users are increasingly dependent on devices that consume and produce data in ever-increasing volumes. A popular solution is to have a central repository which each device accesses after centrally managed authentication. This model of use is facilitated by cloud-based file synchronisation services such as Dropbox, OneDrive,
Google Drive and Apple iCloud. Cloud architecture allows the provisioning of storage space with “always-on” access. Recent concerns over unauthorised access to third party systems and large scale exposure of private data have made an alternative solution desirable. These events have caused users to assess their own security practices and the level of trust placed in third party storage services.
One option is BitTorrent Sync, a cloudless synchronisation utility that provides data availability and redundancy. This utility replicates files stored in shares to remote peers, with access controlled by keys and permissions. While BitTorrent Sync lacks the economies brought about by scale, the complete control over data access that it offers has made it a popular solution. The ability to replicate data without oversight introduces risk of abuse by users as well as difficulties for forensic investigators. This paper suggests a methodology for investigation and analysis of the protocol to assist in the control of data flow across security perimeters.
Keywords: BitTorrent Sync, Distributed Storage, Peer-to-Peer, Network
Traffic Analysis, Remote Evidence Acquisition
Preprint submitted to Computers and Security June 3, 2015
Applications such as Evernote and Dropbox leverage the decreasing cost of hard disk storage seen in Infrastructure as a Service providers, e.g., Amazon S3, to provide data storage on the cloud to home users and businesses alike. The main advantage of services such as Dropbox, Google Drive, Microsoft OneDrive
(formerly SkyDrive) and Apple iCloud to the end user is that their data is stored in a virtual extension of their local machine with no direct user interaction required after installation. It is also backed up by a fully distributed datacentre architecture that would be completely outside the financial reach of the average consumer. Their data is available anywhere with Internet access and is usually machine agnostic, so the same data can be accessed on multiple devices without any need to re-format partitions or waste space by creating multiple copies of the same file for each device. Some services, such as Dropbox, also have offline client applications that allow for synchronisation of data to a local folder for offline access.
As Internet accessibility continues to become more commonplace and allows for increasingly faster access, it is not unexpected that many utilities that are intended for general use will aid in the perpetration of some variety of cybercrime.
One attribute that is highly desirable to those contemplating illegal activities is the notion of anonymity and data security – especially the ability to keep data transfers secure from inspection while in transit. BitTorrent Sync
(also referred to as BTSync, BitSync and bsync) is a file replication utility that would seem to serve exactly this function for the user. Designed to be server agnostic, the protocol is built on already popular and widespread technologies that would not seem out of place in any network activity log.
Each of the aforementioned consumer focused services can be categorised as cloud synchronisation services. This means that while the data is synchronised between user machines, a copy of the data is also stored remotely in the cloud.
According to recent headline news, much of this data is easily available to governmental agencies without the need for a warrant or just cause. BTSync provides the same synchronisation functionality (without the cloud storage aspect) and provides a similar level of data availability. The service has numerous desirable attributes:
• Compatibility and Availability – Clients are built for most common desktop and mobile operating systems, e.g., Windows, Mac OS, Linux, BSD,
Android and iOS.
• Synchronisation Options – Users can choose whether to sync their content over a local network or over the Internet to remote machines with no requirement for scripting or schedule management making this an accessible technology compared to existing options such as RSYNC.
• No Limitations or Cost – Most cloud synchronisation services provide a free tier offering a small amount of storage and subsequently charge when the user outgrows the available space. BTSync eliminates these limitations and costs. The only limits on the volume of storage and the speed of the service are those of the synchronised users’ machines.
• Automated Backup – Like most competing products, once the initial install and configuration is complete, the data contained within specified folders is automatically synchronised between machines.
• Decentralised Technology – All data transmission and synchronisation takes place solely in a Peer-to-Peer (P2P) fashion, based on the BitTorrent file sharing protocol.
• Encrypted Data Transmission – While synchronising data between computers, the data is encrypted using RSA encryption. Under the BTSync
could result in users storing their data on untrusted remote locations for the purposes of data redundancy and secure off site backup.
• Proprietary Technology – The precise protocol and operation of the technology is not documented by the developer. There is debate over whether security through obscurity or peer code evaluation, i.e., open source, is better. Some enterprise security policies prohibit the use of open source applications as a result of the source code being open to inspection by those looking for flaws in the implementation. From the point of view of the consumer, BitTorrent Inc. have stated that they will not give access to traffic to any law enforcement agency (LEA) without due process and the bespoke protocol makes casual eavesdropping or crawling less likely.
As a result of these attributes, BTSync has grown to become a popular alternative to cloud based synchronisation services. Less than a year after its release, the active user base had grown to over one million by November 2013,
undoubtedly be of interest to both law enforcement officers and digital forensics investigators in future investigations. Like many other file distribution technologies, this interest may be centred around recovery of the data itself, proof of the modification of data or evidence of data distribution and enumeration of the recipients.
While BTSync is based on the same technology as BitTorrent for the transfer of files, the intention of the application is quite different. This results in a change of users’ behaviours, as well as a necessary change in the assumptions an investigator should make. BitTorrent is designed to be a one-to-many data dissemination utility. The uploader usually does not care about the identity of the downloader and a single seeder can deliver data to a large number of unique peers over the life of the torrent file. Data integrity and transfer speed take precedence over privacy of data in transit. BTSync, on the other hand, is designed to be a secure data replication protocol for making a faithful replica of a data set on a remote machine. Data integrity is still highly prized but data privacy is now the top priority and speed-through-dispersion is sacrificed
as a result. The files can only be read by users specifically given access to the repository. The advertisement of data availability is completely scalable by the owner with options ranging from restricting access to known IP addresses through to registration with a centralised tracker.
Given the nature of the application, users are much more likely to know the operator of the remote site
(this does not apply to secrets advertised online though that could be a point of commonality that would not necessarily have existed for pure BitTorrent clients).
1.1. Aim and Contribution of this Work
The aim of this work is to provide a reference for digital investigators discovering the use of BitTorrent Sync in an active investigation. However, it is hoped that the analysis presented may be of use to security personnel looking to detect and control the use of this protocol within their perimeter.
To accommodate these goals this work presents an analysis of the protocol and its network interaction. Activities undertaken to perform a synchronisation are presented and described at the packet level in order to facilitate both post mortem traffic analysis and to enable the development of feature based detection rules and deep packet inspection for Network Intrusion Detection Systems
(NIDS) or firewall appliances.
The contribution of this work is a suggested network investigation methodology.
This methodology includes recommendations for the investigation of a number of hypothetical scenarios where BTSync could be used to aid in criminal or illicit activities.
Legitimate usage of the system, e.g., backup and synchronisation, group modification, data transfer between systems, etc., may itself be of interest to an investigation. However, the technology may also be suitable in the aid of a number of potential scenarios of interest such as industrial espionage, copyright infringement, sharing of illicit images of children, etc., outlined in greater detail
received during regular operation of BTSync. Finally, the results from two dig-
In order to gain an understanding of how BTSync functions, one must first understand the technologies upon which it is built. The application is a product built by BitTorrent Inc. (the creators and maintainers of the eponymous file-sharing protocol). As a result, the technologies used by the regular BitTorrent protocol and BTSync are developed using a similar premise. This section provides a brief overview of the required background information and outlines the key differences between the two applications.
2.1. BitTorrent File Sharing Protocol
The BitTorrent protocol is designed to easily facilitate the distribution of files to a large number of downloaders with minimal load on the original file
parts of the entire file to other downloaders. A BitTorrent swarm is made up of both seeders (peers with complete copies of the content shared in the swarm), and leechers (peers who are downloading the content and may have none or some of the content). Due to BitTorrent’s ease of use and minimal bandwidth requirements, it lends itself as an ideal platform for the unauthorised distribution of copyrighted material. The unauthorised distribution of copyrighted material typically commences with a single original source sharing large sized files to many downloaders.
Bencoding is a method of notation for storing data in an array list. The main advantage of bencoding is that it avoids the pitfalls of system-byte order requirements (such as big-endian or little-endian), which can cause issues for cross platform communication between applications. The datagram packet can
easily be converted to a human readable UTF-8 encoded sequence of key:value
The value for any pair is stored as a sequence of bytes, with the exception of integer values. For integers, bencoding uses a lowercase “i” to indicate the start of the value, which is terminated with a lowercase “e”.
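The notation described above can be illustrated with a minimal decoder sketch. The list (`l…e`) and dictionary (`d…e`) delimiters are taken from the public BitTorrent specification rather than from the text above, and the function names are illustrative only; this is not the parser used by any BitTorrent or BTSync client.

```python
def bdecode(data: bytes):
    """Decode one complete bencoded value and return it as a Python object."""
    value, end = _decode_at(data, 0)
    if end != len(data):
        raise ValueError("trailing data after bencoded value")
    return value


def _decode_at(data: bytes, pos: int):
    """Decode the value starting at pos; return (value, position after it)."""
    marker = data[pos:pos + 1]
    if marker == b"i":                      # integer: i<digits>e
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if marker == b"l":                      # list: l<items>e
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = _decode_at(data, pos)
            items.append(item)
        return items, pos + 1
    if marker == b"d":                      # dictionary: d<key><value>...e
        pos += 1
        result = {}
        while data[pos:pos + 1] != b"e":
            key, pos = _decode_at(data, pos)
            value, pos = _decode_at(data, pos)
            result[key] = value
        return result, pos + 1
    if marker.isdigit():                    # byte string: <length>:<bytes>
        colon = data.index(b":", pos)
        length = int(data[pos:colon])
        start = colon + 1
        return data[start:start + length], start + length
    raise ValueError(f"unexpected byte at offset {pos}")
```

For instance, `bdecode(b"d1:ai42e1:bl3:fooee")` yields a dictionary with the byte-string keys `a` and `b`. Values are left as raw bytes so that binary fields, such as hashes, are not corrupted by an attempted text decoding.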
2.1.2. Active Peer Discovery
Each BitTorrent client must be able to identify a list of active peers in the same swarm who have at least one piece of the content and are willing to share it, i.e., identify peers that have an available open connection and the bandwidth available to upload. By the nature of the implementation of the protocol, any peer that wishes to partake in a swarm must be able to communicate and share files with other active peers. A BitTorrent client can use a number of methods, outlined below, to discover new peers in the swarm:
1. Tracker Communication – BitTorrent trackers maintain a list of seeders
Each BitTorrent client will contact the tracker intermittently throughout the download of a particular piece of content to report that they are still alive on the network and to download a short list of new peers on the network.
2. Peer Exchange (PEX) – As set out in the standard BitTorrent specification, there is no intercommunication between peers of different BitTorrent swarms besides data transmission. Peer Exchange is a BitTorrent Enhancement Proposal (BEP) whereby when two peers are communicating
(sharing the data referenced by a torrent file), a subset of their respective peer lists is shared during the communication.
3. Distributed Hash Tables (DHT) – Many BitTorrent clients, such as Vuze and µTorrent, contain implementations of a common distributed hash table as part of the standard client features. The common DHT maintains a list of each active peer using the corresponding clients and enables cross-swarm communication between peers. Each known peer active in swarms with DHT contributors is added to the DHT. The mainline BitTorrent
DHT protocol (also used by BTSync), is based on the Kademlia protocol. Regular BitTorrent file-sharing users and BTSync users contribute to the update and maintenance of the DHT. The DHT provides an entirely decentralised approach aiding in the discovery of new peers sharing particular pieces of content. The Kademlia DHT structures its ID space as a
or” (xor). Each user in the DHT generates a unique key that is used for identification when connecting to the DHT. The piece of the DHT that each peer stores is related to this xor calculation. Those peer IDs that are closest to the key, e.g., a torrent’s info_hash, are responsible for facilitating lookups for those keys. The same DHT responsible for regular
BitTorrent file-sharing is also responsible for maintaining a lookup for BTSync shared content. In this scenario, the key used is based on the public
read-only key generated for each shared folder in BTSync.
While a DHT’s decentralised nature results in a much more resilient service compared to a server-based tracker, it also results in it being vulnerable to
4. Local Peer Discovery (LPD) – This is enabled by checking the “Search
LAN” option in most BitTorrent clients’ application preferences. When enabled, the application will announce its availability to potential local peers using multicast packets. Once a client on the network receives a multicast packet, that client will check its current list of shares to see if a match is found. If a match is found, that peer will respond to the origin of the request offering to synchronise the content.
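The xor-based closeness used by the Kademlia DHT in item 3 can be sketched as follows. This is a simplified illustration under stated assumptions: identifiers are treated as equal-length byte strings, and the function names and the default of eight results are hypothetical choices for this sketch, not values taken from the BTSync or BitTorrent implementations.

```python
def xor_distance(id_a: bytes, id_b: bytes) -> int:
    """Kademlia distance: the XOR of two equal-length IDs read as an integer."""
    if len(id_a) != len(id_b):
        raise ValueError("IDs must have the same length")
    return int.from_bytes(id_a, "big") ^ int.from_bytes(id_b, "big")


def closest_peers(key: bytes, peer_ids: list[bytes], count: int = 8) -> list[bytes]:
    """Return the peer IDs nearest to key, i.e. those expected to serve lookups for it."""
    return sorted(peer_ids, key=lambda peer_id: xor_distance(peer_id, key))[:count]
```

An investigator enumerating the peers responsible for a given key (for example a torrent's info_hash) would apply the same ordering to the node IDs returned by successive DHT queries.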
2.1.3. Downloading of Content through BitTorrent
To commence the download of the content in a particular BitTorrent swarm, a metadata .torrent file or a corresponding magnet uniform resource identifier (URI) must be acquired from a BitTorrent indexing website. This file/URI is then opened using a BitTorrent client, which proceeds to identify other active peers sharing the specific content required. The client application then attempts to connect to several active members and downloads the content piece by piece.
Each BitTorrent swarm is built around a single piece of content which is determined through a unique identifier based on a SHA-1 hash of the file information contained in this UTF-8 encoded metadata file/URI, e.g., name, piece length, piece hash values, length and path.
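The derivation of the swarm identifier can be sketched as a SHA-1 digest over the bencoded file information. The encoder below is a minimal illustration written for this example; in practice the digest is taken over the exact bytes of the info dictionary as they appear in the .torrent file, so re-encoding is only equivalent when the original was encoded canonically.

```python
import hashlib


def bencode(value) -> bytes:
    """Bencode ints, byte strings, text strings, lists and dicts."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and are written in sorted order.
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda pair: pair[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def info_hash(info: dict) -> str:
    """Hex SHA-1 digest of the bencoded info dictionary."""
    return hashlib.sha1(bencode(info)).hexdigest()
```

Because the digest covers the name, piece length and piece hashes, any change to the shared content produces a different identifier and therefore a different swarm.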
2.2. BitTorrent Sync
BTSync is a file replication utility created by BitTorrent Inc. and released
necessarily intended as any form of off-site storage. Any data transferred using
BTSync resides in whole files on at least one of the synchronised devices. This makes the detection of data much simpler for digital forensic purposes as there is no distributed file system, redundant data block algorithms or need to contact a cloud storage provider to get a list of all traffic to or from a container using discovered credentials. The investigation remains an examination of the local suspect machine. However, because BTSync uses DHT to transfer data there is also no central authority to manage authentication or log data access attempts.
Figure 1: Keys (formerly secrets) are generated at share provision. The ability to view the keys is not available in v2.0
A suspect file found on a system may have been downloaded from one or many sources and may have been uploaded to one or more recipients. Additionally, while the paid services offer up to 1TB of storage (Amazon S3 paid storage plan), the free versions, which are much more popular with home users, cap at approximately 10GB. BTSync is limited only by the size of the folder being set as a share. Another concern for any investigation into BTSync folders is that unless the system being examined is the owner/originator of the folder being shared, it is quite possible that any files present were downloaded without prior knowledge of their content or nature. Before v2.0, BTSync had no built-in content preview facility in its protocol; it merely synchronised blindly from host to target without any selection process available to the user. In v2.0, an option was added to the preferences for each folder that allows the user to only synchronise file titles as a zero byte place holder file. If the file is selected, the content of the file is downloaded. An update to the link descriptor in v1.4 allows users to get an approximation of the share size at the time of joining.
The “secrets” used as part of the original release of BTSync were renamed as “keys” in v1.4. The structure has not been changed however and still consists of a 33 character human readable string consisting of a Base32 encoded string generated when the folder was first provisioned. This Base32 pattern is then prepended with a single letter indicating its nature. Keys are the unique identifiers used by the BTSync service to differentiate between shared folders.
In order for the 20-byte keys to be human readable, they are displayed using
secrets for the sharing of data contained within specific folders, as can be seen
The initial Read & Write (RW) key is still generated using CryptoApi on
Windows based systems (this is downloaded as part of the installation process if it is not installed already). This RW key is the equivalent of the original
“master secret” in that, if it is shared then the receiving party has an equal level of access to the share as the original owner including the ability to delete content and add new content that will be replicated to any synchronising peer whether downstream or of equal rank.
From this initial RW key, a Read Only Key (RO) is generated automatically
available to the user. However, these are not the only keys available for use.
BTSync defines six standard keys of which three can be generated using the default installation of the desktop client. These keys are identified by their prepended letter as follows:
• [A] This is the RW key generated at the time the share is provisioned.
This key gives the user full control over the share contents.
• [B] This identifies the Read Only key and can be used to create a child, or downstream, peer that can only replicate share contents from another peer.
Any changes made to share contents, including deletion, will invalidate the file changed and prevent any further replication actions for that particular file in the future, or until the share is re-provisioned on that client (or the share’s *.db file is altered but this may cause the entire share to be deemed invalid).
• [C] The C type key is a read only one-use key that is discarded after its first use. This key can be generated from either type A or type B keys and is used primarily in the distribution of other keys.
• [D] Generated through the use of the Sync API, this type of key allows read & write access to encrypted shares.
• [E] A read only key capable of replicating data from type D encrypted shares and decrypting the contents. This key is calculated from the type D key and so is not possible using the standard BTSync v1.4 or v2.0
• [F] Encrypted Read Only key capable of replicating data from an encrypted share but unable to decrypt the share contents. This type of key can be used to store data in an encrypted state on a remote, untrusted, system and still provide authenticity and availability.
Older versions of these, such as the ‘R’ prepended read only key of v0.x
are still usable but are no longer generated by the application. As with the earlier BTSync versions, a user may also generate his or her own key that has been Base64 encoded. As a result, these default prepended identification letters cannot always be taken as a definite indicator of the access level granted by a key before it has been applied.
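A recovered key string can be given a first-pass triage along the lines described above. The sketch below assumes that a standard key is one type letter followed by 32 Base32 characters (which decode to the 20 bytes mentioned earlier) and that the standard RFC 4648 Base32 alphabet is used; both are readings of the text rather than documented facts, and `describe_key` is a hypothetical helper name. As noted, the letter is only a nominal indicator, so user-generated keys will simply not match.

```python
import base64

# Nominal meaning of each prepended letter, as listed above.
KEY_TYPES = {
    "A": "read & write",
    "B": "read only",
    "C": "one-use read only",
    "D": "read & write (encrypted share)",
    "E": "read only (encrypted share, can decrypt)",
    "F": "encrypted read only (cannot decrypt)",
    "R": "read only (v0.x)",
}


def describe_key(key: str):
    """Return (nominal type, decoded bytes) for a standard-looking key, else None."""
    if len(key) != 33 or key[0] not in KEY_TYPES:
        return None
    try:
        # Assumption: RFC 4648 Base32 alphabet; 32 characters decode to 20 bytes.
        raw = base64.b32decode(key[1:])
    except ValueError:
        return None
    return KEY_TYPES[key[0]], raw
```

A `None` result does not mean the string is not a key; it only means the string does not follow the default generated format and should be examined manually.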
The Keys outlined above need never necessarily be shared publicly, i.e., any user can create a number of keys solely for his personal use across his different machines. Depending on the level of access the user wishes to give to a third-party, he can give the corresponding key to any other user through regular one-to-one communication methods (e-mail, instant messaging, social networking, SMS, etc.). If public distribution is desirable, there are a number of public online avenues for BTSync users to share secrets with each other (e.g.
www.btsynckeys.com, http://www.reddit.com/r/btsecrets, among others).
Figure 2: Key Sharing is Now Managed from within the Application with Optional Restrictions
Version 1.4 presents a change to the method of sharing a link with a peer that has been modified further in v2.0. In v1.4, a user can still view the RW and RO key of a share and can copy this key and send it via any medium to the remote device. Using this method, the remote device user adds a new share and inputs the key causing the share to automatically query a tracker (if this option is left enabled) for the location of remote peers hosting a share matching the applied key. An alternative to this method was added to the client and works as follows:
In the application the user that currently has access to the share (the owner) can select the option to provision the share to another user (a peer which can be a different person or a remote system under the control of the owner), as
presented as options.
• Read Only (default)
• Read & Write
• Invited participants must be approved – the owner will receive notification in the application that a peer wishes to share the resource. The Device ID
of the remote peer will be presented and the owner can accept or reject their membership. This option is enabled by default.
• Expiration date – the link to the share will only remain active for a set number of days from the time it is generated. This option is enabled by default and the time limit is set to three days, but can be changed to any number of days the owner inputs.
• Number of uses – this option allows the owner to limit the number of times a link can be used to join a share. This is set to off by default.
The link generated by this process is presented as https://link.getsync.com/[URLoptions], where the URL options are each separated by an ampersand. For example, a link shared from v1.4 for a folder called winhex with no expiry or usage limitation would present as https://link.getsync.com/#f=winhex&sz=35E5&s=XIQSFD2MCDPS2QKITWKJROJ2VUSV2YNA&i=CKKR3V2BBM7MXIOTPU3XWK55JBUFWG3EY&p=CALSNMDGCZZAUQXBXEIR6Q57UMTVOSFI&e=1431277452 where:
• #f=(folder name of the share in plain text)
• sz=(approximate size of the share contents)
• s=(the shareID of the folder encoded in Base32)
• i=(a one time key used to provide access to the real key, this changes every time the link for the folder is generated)
• p=(PeerID of the peer performing the server role in the upcoming key exchange)
• e=(the expiry timestamp of the link if it is set, if it is not set this item will not be present in the link)
• v=(the version of the client. This is only present in the v2.0 client and is not optional)
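A link recovered from a browser history, e-mail or chat log can be split into the fields listed above with standard URL parsing. The following is a minimal sketch: the descriptive field names and the function name are illustrative, and treating the `e` value as a Unix timestamp in seconds is an assumption based on the magnitude of the example value.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlsplit

# Descriptive names for the single-letter fields (names are illustrative).
FIELD_NAMES = {
    "f": "folder",
    "sz": "approx_size",
    "s": "share_id",
    "i": "one_time_key",
    "p": "peer_id",
    "e": "expires",
    "v": "client_version",
}


def parse_sync_link(link: str) -> dict:
    """Split the fragment of a link.getsync.com URL into named fields."""
    fragment = urlsplit(link).fragment
    fields = {}
    for short, values in parse_qs(fragment).items():
        fields[FIELD_NAMES.get(short, short)] = values[0]
    if "expires" in fields:
        # Assumption: the expiry value is a Unix timestamp in seconds.
        fields["expires"] = datetime.fromtimestamp(int(fields["expires"]), tz=timezone.utc)
    return fields
```

Optional fields that are absent from the link, such as the version in a v1.4 link, are simply missing from the returned dictionary, which itself can help to date the client that generated the link.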
Figure 3: A received link can be shortened and still be resolved to a share by the server
This URL can be copied to the system clipboard, sent via email (the email option will open the default mail application on the system) or converted to a
QR code for scanning by a mobile device.
At a minimum, the link must contain the folder, shareID and one time key fields to resolve to a share if entered directly into a browser. However, removing the version may cause the actual replication to fail if the remote version is incompatible with the version adding the share. An example how this stripped
link is converted into a URL that can be opened by the locally installed client if the client satisfies all of the requirements such as version number.
An alternative to opening the link in a web browser is to enter the link in
version is not correct the replication will fail and, if authorisation is required, the request will never be sent to the owner.
Figure 4: A received link can also be added in the section to manually add a share
Figure 5: Requests for access can be verified (redacted) and share members can be reviewed
The process of joining a share has also been changed in v1.4 and v2.0, using the X.509 security certificates and public/private key pairs stored in the sync.dat file in the BitTorrent Sync folder. Once a host address is retrieved, a connection is made and a request for the RO or RW key is sent using the One-Time-Key (i in the optional data) along with the peer’s public key, generated the first time a link is received or generated. The user and device name set at this time will be the user and device name that the owner will see if they check the identity of the peer requesting access. The device name will also be present
authorised, the requesting peer receives a copy of the required key encrypted with their public key which they then decrypt and apply to the share on their end of the connection. Once complete the process of synchronisation can begin and the new peer will be registered on the tracker if that option is left enabled.
2.3. Potential Scenarios Pertinent to Digital Forensic Investigation
2.3.1. Industrial Espionage
Many companies are aware of the dangers of allowing BitTorrent traffic on their networks. However, quite often corporate IT departments enforce a blocking of the technology through protocol blocking rules on their perimeter firewalls. This has the effect of cutting off any BitTorrent clients installed on the
LAN from the outside world. In addition to Deep Packet Inspection (DPI) to investigate the data portion of a network packet passing the inspection point, basic blocking of known torrent tracker sites using firewall rule sets can be used. BTSync does use BitTorrent as the protocol for file transfer but once the transfer session is established using the BTSync protocol all traffic is encrypted using AES and may not be open to inspection by a firewall. It also does not follow the current known patterns that would identify an encrypted BitTorrent stream as the target-source profile is different. Blocking t.usyncapp.com and r.usyncapp.com will stop the tracker and relay options from being used but
BTSync can operate quite well without those services. Local peer discovery can use multicast or direct “known peer” configuration where a known IP:Port combination is used to identify a specific machine allowed to participate in the share. This specificity would negate the issue of multicast packets usually not
being routed beyond the current network segment. A scenario where BTSync can be used to transfer files within a LAN would be to transfer data to a machine with lower security protocols in place such as the capability to write to a USB device or perhaps even unmonitored access to the Internet (and the BitTorrent protocol) through a designated guest LAN.
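As a simple illustration of the hostname-based control mentioned above, lookups for the tracker and relay hosts can be flagged in a DNS query log. The input format, a sequence of (client address, queried hostname) pairs, is a hypothetical one chosen for this sketch. As the text notes, BTSync can operate without these services, so the absence of such lookups does not indicate the absence of BTSync.

```python
# Hostnames named in the text above; the role labels are descriptive only.
BTSYNC_HOSTS = {
    "t.usyncapp.com": "tracker",
    "r.usyncapp.com": "relay",
}


def flag_btsync_lookups(dns_queries):
    """Yield (client, hostname, role) for queries that name BTSync infrastructure.

    dns_queries is an iterable of (client_address, queried_hostname) pairs,
    a hypothetical log format chosen for this sketch.
    """
    for client, hostname in dns_queries:
        # Normalise case and any trailing root dot before comparing.
        role = BTSYNC_HOSTS.get(hostname.rstrip(".").lower())
        if role is not None:
            yield client, hostname, role
```

A positive match identifies an internal host worth examining further, while hosts using only multicast or known peer configuration would need to be found through traffic inspection instead.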
2.3.2. Cloudless Backup
By synchronising between two or more machines accessible to the user, data can be stored in multiple locations as a form of backup. The secondary copies of a file would be stored using a read only key so that only changes on the primary system will ever be replicated. A feature of BTSync that is enabled by default, but can be disabled in the configuration file, is the use of the .SyncArchive folder that stores a copy of any file deleted or changed for a preset period of time, allowing for a form of file recovery or versioning.
2.3.3. Encrypted Remote P2P Backup
tion secret”. Through the use of encryption secrets, a BTSync user has the ability to remotely store encrypted data, e.g., personal, sensitive or illegal, on one or more remote machines. These remote machines do not have the ability to decrypt the information stored. The data could then be securely wiped off the original machine and easily recovered at a later stage.
2.3.4. Dead Drop
Due to BTSync’s intended use as a file replication utility, it is assumed that a person receiving a copy of a shared directory is aware of the contents of the folder. As a result, no method was included to gather details of the contents of
a node configured correctly with an API key will return a folder or file listing when queried.
2.3.5. Secure P2P Messaging
For example, the proof of concept found at http://missiv.es/. The application currently operates by saving messages to an “outbox” folder that has a read only key shared to the person you want to receive the message. They in turn send you a read only key to their outbox. One to many can be achieved by sharing the read only key with more than one person but no testing has been done with synchronisation timing issues yet and key management may become an issue as a new outbox would be needed for each private conversation required.
– BitTorrent, like any other P2P technology, was designed for one-to-many distribution of large content and has become almost synonymous with piracy.
BTSync was not necessarily intended to be a one-to-many distribution utility.
However, it does allow for a group of users to set one another as “known peers” so that they can communicate directly through encrypted channels. Websites such as http://btsynckeys.com/ have examples of users posting keys publicly and advertising the content as being copyrighted material.
2.3.7. Serverless Website Hosting
– This involves the creation of static websites served through a BTSync shared folder. These websites could be directly viewed on each user’s local machine. The local copies of the website could receive updates from the webmaster automatically through the synchronisation of the content associated with a read only secret.
2.3.8. Malicious Software Distribution
– Due to the lack of any trust level being associated with any publicly shared secret, the synchronised files may contain infected executables.
For each of the above scenarios, an added dimension can be created by the
BTSync user: time. Due to the ability to create “throw away” or temporary secrets for any piece of content, the timeframe where evidence may be recovered from remote sharing peers might be very short.
3. Related Work
This paper is focused on the network communication protocol employed by
BTSync and the investigation thereof. The work presented as part of this paper
of the BTSync client application on a host machine. This paper outlines the procedures for identifying a current or previous install of the BTSync application and the extraction of secrets by gaining physical access to a machine’s hard drive and performing a regular digital forensic investigation on its image. At the time of publication, there are no other academic publications focusing on BTSync.
However, seeing as BTSync shares a number of attributes and functionalities with cloud synchronisation services, e.g., Dropbox, Google Drive, etc., and is largely based on the BitTorrent protocol, this section outlines a number of related case studies and investigative techniques for these technologies.
3.1. BitTorrent Forensics
Numerous investigations have been made into identifying the peer information of those involved in BitTorrent swarms. Most of these publications focus on the investigation of the unauthorised distribution of copyrighted material
tion may be recorded for a particular piece of material under investigation or a larger landscape view of the peer activity across numerous pieces of content.
3.2. Client-side Synchronisation Tool Forensics
Forensics of cloud storage utilities can prove challenging, as presented by
plete local synchronisation has been performed, the data can be stored across various distributed locations. For example, it may only reside in temporary local files, volatile storage (such as the system’s RAM) or dispersed across multiple datacentres of the service provider’s cloud storage facility. Any digital forensic examination of these systems must pay particular attention to the method of access, usually the Internet browser connecting to the service provider’s storage
access page (https://www.dropbox.com/login for Dropbox for example). This temporary access serves to highlight the importance of live forensic techniques when investigating a suspect machine as a “pull out the plug” anti-forensic technique would not only lose access to any currently opened documents but may also lose any currently stored sessions or other authentication tokens that are stored in RAM.
In 2013, Martini and Choo published the results of a cloud storage forensics investigation on the ownCloud service from both the perspective of the client
on both the client machine and on the server facilitating the identification of files stored by different users. The module client application was found to store authentication and file metadata relating to files stored on the device itself and on files only stored on the server. Using the client artefacts, the authors were able to decrypt the associated files stored on the server instance.
3.3. Extension of the Digital Evidence Acquisition Window
In 2014, Scanlon et al., outlined a case study on BTSync whereby the remote recovery of evidence from a BTSync shared folder can enable the recovery of
may have been securely deleted, corrupted or overwritten on the local device or viewed (not stored) on a mobile device using the BitTorrent Sync app. The paper outlines a number of entry points from the local machine into the investigation and the remote recovery of such evidence including local and network sources.
4. BitTorrent Sync Network Protocol Analysis
Starting with the beta release of v1.4, BTSync changed its protocol to more closely resemble the underlying BitTorrent protocol. In addition to changes to the directory structure and the introduction of public/private key storage for shares, the network traffic profile of the protocol changed dramatically through the use of the Micro Transport Protocol (µTP), as outlined in BitTorrent Enhancement Proposal (BEP) 29, which is officially specified here: http://www.bittorrent.org/beps/bep_0029.html. This protocol was already used by BitTorrent once actual file transfer was initiated, but BTSync has now adapted its communications to use µTP signalling, resulting in a smaller overall usage of bandwidth but a more noticeable footprint.
The initial release of BTSync used custom packets that all started with the header BSYNC or BSync. This purely cosmetic identifying header was replaced with the µTP DATA version 1 (01) header for all request and transfer packets, and STATE (21) is used to perform the function of the original PING, which updated peer availability and provided connection details and data.
As with the original µTP protocol, the connection management packets and headers used by BTSync v1.4 and onwards are:
SYN: Initiates the two-way µTP handshake to establish a connection with the remote peer. This packet has its type indicator set to 4.
STATE: The most common packet in µTP, this “ACK” replaces the BTSync response to PING and serves as both the keep-alive and the response to the handshake initiation. This packet is identified by the type value of 2.
DATA: Used to carry messages such as the peer request message sent to the tracker or the peer list sent in response. This packet has a type value of 0.
RST: As with TCP, the RST packet is used to reset the connection in the event of an error in transmission. This packet has its type identifier set to 3.
FIN: Indicates the end of a connection and is denoted by the type value of 1.
Figure 6: A newly created share will have some preferences set by default that can be toggled by the user
The µTP message headers have a similar layout that is formatted as follows:
Timestamp: [AB CD EF GH]
Timestamp Difference: [AB CD EF GH]
Window Size: [AB CD EF GH]
Sequence Number: [AB CD]
Ack Number: [AB CD]
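The header fields above can be decoded mechanically from a captured UDP payload. The following is a minimal Python sketch, assuming the 20-byte version 1 header of BEP 29, in which a type/version byte, an extension byte and a 16-bit connection ID precede the timestamp fields listed above. The function and field names are illustrative and are not taken from the BTSync client.

```python
import struct
from collections import namedtuple

# Packet type values as listed above.
UTP_TYPES = {0: "DATA", 1: "FIN", 2: "STATE", 3: "RST", 4: "SYN"}

UtpHeader = namedtuple(
    "UtpHeader",
    "type_name version extension connection_id timestamp "
    "timestamp_diff window_size seq_nr ack_nr",
)


def parse_utp_header(datagram: bytes) -> UtpHeader:
    """Decode the fixed 20-byte µTP header at the start of a UDP payload."""
    if len(datagram) < 20:
        raise ValueError("datagram too short for a µTP header")
    (type_ver, extension, conn_id, ts, ts_diff,
     wnd, seq_nr, ack_nr) = struct.unpack(">BBHIIIHH", datagram[:20])
    packet_type = type_ver >> 4   # high nibble: packet type
    version = type_ver & 0x0F     # low nibble: protocol version
    return UtpHeader(UTP_TYPES.get(packet_type, "UNKNOWN"), version,
                     extension, conn_id, ts, ts_diff, wnd, seq_nr, ack_nr)


# Synthetic example: a first byte of 0x21 marks a version 1 STATE packet.
sample = bytes([0x21, 0x00]) + struct.pack(">HIIIHH", 0x1234, 1, 2, 3, 4, 5)
print(parse_utp_header(sample))
```

Splitting the first byte into two nibbles is what makes the (01) and (21) values mentioned earlier readable as DATA and STATE packets of protocol version 1.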
On provision of a new share several options are enabled automatically by
enabled by the user at any time to customise the network behaviour of the local repository being edited. These changes can also be managed through direct editing of the application configuration files. The default behaviour for BTSync is to utilise the tracker server at t.usyncapp.com. The DNS request resolves to three IP addresses: 22.214.171.124, 126.96.36.199 and 188.8.131.52. These three IP addresses are servers hosted on Amazon’s EC2 cloud service. This is the BTSync tracker server, which facilitates peer discovery for clients looking to synchronise data. One peer request message is sent for each share stored on the local machine and the act of requesting a peer lookup also serves to register the requesting client as a source for that share.
Packets sent from the client to the tracker server contain registration details and get_peers message requests (when a new share is created it registers the share with the tracker using a get_peers packet). A get_peers packet takes the form of:
Version 1.4:
Header type: 0 d2:la
6:[6 byte local IP:port]
20:[20 byte peer ID]
20:[20 byte ShareID] e
Version 2.0:
Header type: d2:la
6:[6 byte local IP:Port]
2:lp [local port integer]
[20 byte peer ID]
32:[32 byte ShareID] e
This packet is initially sent to the tracker server via TCP and UDP to test connectivity. If both protocols succeed, UDP is the preferred method of communication. Tracker updates are performed at a rate of once every 600 seconds or if a change is made to the share data, in which case the timer is reset. A separate packet is sent for each share present on the local machine. It is noteworthy that, even when a new share is created, the first packet advertising that share to the server uses a message type of get_peers. Depending on the bandwidth usage, it is entirely possible for a single peer to simultaneously contact and register with multiple tracker server addresses. Each share will have its own Connection ID value in the µTP header for that get_peers packet and each request will prompt a separate type 2 (ACK) response from the tracker server followed by a separate response to the request itself.
Table 2: Sample Tracker Packet
Field              Description
(µTP header)       The µTP data header that signifies
d                  Start of the dictionary of key:value pairs
2:la               Local address label identifier, which consists of 6 bytes: the first 4 are the IP, the last two are the port
2:lp               Local port in integer form
1:m                Message label identifier
9:get_peers        Message type value
4:peer             Local peer label
5:share            Local ShareID label
20:[ShareID]       The 20 character ShareID, a transform of the secret used, which can be found in the .SyncID file
32:[ShareID]       A share ID based on some transform of the 20 byte ShareID, the local IP address and local port
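The fields in Table 2 can be assembled with a small bencoding routine. The sketch below is illustrative only: it builds the dictionary body of a v1.4-style get_peers message and omits the µTP header and the local port field. It assumes the port inside the 6-byte address is in network byte order. `bencode` and `build_get_peers` are hypothetical helper names, and the peer and share IDs are placeholders.

```python
import socket
import struct


def bencode(value) -> bytes:
    """Encode int, str, bytes, list and dict values using bencoding."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("ascii")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Bencoded dictionaries store their keys in sorted order.
        items = sorted((k.encode("ascii") if isinstance(k, str) else k, v)
                       for k, v in value.items())
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def build_get_peers(local_ip: str, local_port: int,
                    peer_id: bytes, share_id: bytes) -> bytes:
    """Build the bencoded body of a get_peers request (µTP header omitted)."""
    # Assumption: 4 bytes of IPv4 address followed by a big-endian port.
    local_addr = socket.inet_aton(local_ip) + struct.pack(">H", local_port)
    return bencode({"la": local_addr, "m": "get_peers",
                    "peer": peer_id, "share": share_id})


# Placeholder identifiers, not real PeerID or ShareID values.
body = build_get_peers("192.168.1.10", 8080, b"P" * 20, b"S" * 20)
print(body)
```

Because bencoded dictionaries sort their keys, the body always begins with `d2:la`, which matches the start of the packet layout shown earlier.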
The receiving tracker will respond to the requesting client with the same protocol used in the get_peers message. This has the consequence that if TCP and UDP are successful on the first request, the first response will be a set of duplicate TCP and UDP packets in the form:
Version 1.4 and 2.0:
Header type: 0 d2:ea
6:[requester external IP:port]
5:peers l[peer list starts] e[peer list ends]
20:[20 byte ShareID]
Peer Entry in peer list d
6:[external IP:Port value]
2:la[local address key]
20:[Peer ID] e[end of Peer dictionary]
entry for each peer currently in contact with the tracker through get_peers requests. The current requesting peer will be included in this list, so the peers message will always have at least one entry in the peers list.
One unusual feature of the peers response is the inclusion of a peer’s local, non-routable, IP address and Port. This is so that, if the local IP matches the local subnet of the requesting peer, the requesting peer can attempt to communicate directly over the LAN using the local address provided. If the tracker server option is disabled then the local client will have to use a different method to find peers local to it.
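The LAN check described above can be illustrated in a few lines of Python. This is a sketch under stated assumptions rather than the client's actual logic: the 6-byte address is taken to be 4 bytes of IPv4 address followed by a big-endian port, and the requesting peer's subnet is supplied by the caller in CIDR notation. Both function names are hypothetical.

```python
import ipaddress
import socket
import struct


def decode_compact_addr(raw: bytes):
    """Split a 6-byte value into a dotted IPv4 string and a port number."""
    if len(raw) != 6:
        raise ValueError("expected 4 bytes of IP followed by 2 bytes of port")
    ip = socket.inet_ntoa(raw[:4])
    (port,) = struct.unpack(">H", raw[4:])
    return ip, port


def reachable_on_lan(peer_local_addr: bytes, own_network: str) -> bool:
    """Return True if the peer's local address lies inside our own subnet."""
    ip, _port = decode_compact_addr(peer_local_addr)
    return ipaddress.ip_address(ip) in ipaddress.ip_network(own_network)


# Example with private-range addresses.
peer_la = bytes([192, 168, 1, 20, 0x1F, 0x90])
print(decode_compact_addr(peer_la))
print(reachable_on_lan(peer_la, "192.168.1.0/24"))
```

For an investigator, the same decoding step turns the local address field of each peer entry into a value that can be matched against DHCP records.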
Table 3: Multicast Ping Packet
Field              Description
BSYNC              The BTSync header
d                  Start of the dictionary of key:value pairs
1:m                Message label identifier
4:PING             The message type
4:peer             Local peer label
20:[PeerID]        PeerID of the multicasting peer
5:share            Local ShareID label
32:[ShareID]       The Share32 ID that matches that used in the v2.0 get_peers
4.1. Local Peer Discovery
When the option to search the LAN is enabled (the default behaviour), the application will start sending out multicast packets to port 3838 across the LAN.
The multicast packets are BTSync bencoded packets with the following format:
BSYNC d1:m4:ping4:peer20:[20-byte Peer ID]
4:port[i Integer e]
5:share32:[32-byte content ShareID] e
The format of these packets has not changed since the original pre-v1.4 BTSync. Once LAN discovery is enabled, the local neighbouring peers will respond to the multicast message with the “BSYNC” TCP packet detailed below.
Once a peer receives a multicast message that contains a ShareID that it possesses, the peer responds with the content:
BSYNC d1:m4:ping4:peer20:[20-byte PeerID]
4:port[i Integer e]
5:share20:[20-byte ShareID] e
of the ShareID being the more familiar 20 byte version.
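The ping packets shown above can be parsed with a small bencode decoder. The following Python sketch assumes the payload is the literal header BSYNC followed by a bencoded dictionary, possibly separated by NUL or space padding (an assumption, since the exact separator is not stated here). `bdecode` and `parse_lpd_ping` are hypothetical names.

```python
def bdecode(data: bytes, pos: int = 0):
    """Decode one bencoded value starting at pos; return (value, next_pos)."""
    lead = data[pos:pos + 1]
    if lead == b"i":
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if lead in (b"l", b"d"):
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = bdecode(data, pos)
            items.append(item)
        if lead == b"l":
            return items, pos + 1
        # Dictionaries are flat key, value, key, value sequences.
        return dict(zip(items[0::2], items[1::2])), pos + 1
    colon = data.index(b":", pos)
    length = int(data[pos:colon])
    start = colon + 1
    return data[start:start + length], start + length


def parse_lpd_ping(payload: bytes) -> dict:
    """Extract the PeerID, port and ShareID from a BSYNC ping payload."""
    if not payload.startswith(b"BSYNC"):
        raise ValueError("not a BSYNC packet")
    body = payload[5:].lstrip(b"\x00 ")  # assumed optional padding
    message, _ = bdecode(body)
    return {
        "type": message[b"m"].decode("ascii"),
        "peer_id": message[b"peer"].hex(),
        "port": message[b"port"],
        "share_id": message[b"share"].hex(),
    }


# Synthetic packet with placeholder identifiers.
packet = (b"BSYNC" + b"d1:m4:ping4:peer20:" + b"A" * 20 +
          b"4:porti41234e5:share32:" + b"B" * 32 + b"e")
print(parse_lpd_ping(packet))
```

The length of the decoded share value (32 or 20 bytes) distinguishes the multicast request from the response described above.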
Once the Ping has been sent the peers perform a BTSync session negotiation
the synchronisation takes place over TCP/IP and the µTP traffic runs alongside over UDP. The synchronisation process is signed off with a µTP type 1 (FIN) packet. After this, there are regular µTP type 2 (STATE) messages to check for changes.
4.2. BTSync Relay Server
When BTSync finds that it needs to communicate directly between two firewalled peers, the application may make use of a relay server.
The “Use Relay Server if required” option is enabled by default on share creation. The relay server is located through a DNS request for r.usyncapp.com, which resolves to the following IP addresses: 184.108.40.206 and 220.127.116.11.
These are the IP addresses of the relay servers, contactable on remote port 3000. Each peer contacts the relay server using an outbound connection that should bypass any firewall rule preventing unauthorised inbound connections.
Once the server handshake has taken place, the negotiation to set up a secure connection between the two peers begins. The following sequence of events is observable:
1. The peer contacts the relay server to initiate contact with the remote peer.
0022 | CounterA | BSYNC 0x000000 [20 byte remote peerID]
CounterB | peer20 | 20 byte local peerID
2. The relay server responds to the peer using a standard TCP ACK packet.
3. The peer contacts the server to arrange transfer of the data, to supply the nonce for encrypted traffic and to provide a status ID.
0066 | CounterA | BSYNC 0x00(4) :d5:nonce16:[nonce value for key share]5:share20:[20 byte shareID]e
4. The relay contacts the peer to initiate the session counters.
0022 | CounterA | [20 byte remote peerID] remote Peer IP:Port | Counter B
5. The relay server confirms the SID status and supplies the remote nonce to complete the bridge for encrypted data transfer.
0022 | CounterB | remote peerID | 0066 | remote Peer ID | CounterA
BSYNC 00x4 | :d5:Nonce16:[nonce value]5:share20:[20 byte ShareID]e
6. The relay server contacts the local peer to deliver the remote public key.
7. The local peer delivers its public key to the relay server.
8. Encrypted bidirectional traffic transfer commences, with the relay server acting as the router delivering packets to each peer.
4.3. BTSync Data Transfer
The transfer of data during a BTSync synchronisation operates in a similar
A unique magnet URI is created for each file contained within the shared folder and this is used for requesting chunks of the entire file from known peers sharing this content.
4.4. Differentiation from Regular BitTorrent Traffic
While much of the network topology of BTSync is shared with regular BitTorrent, the request and response packets differ from those employed by regular BitTorrent file-sharing traffic. The most obvious addition is the BSYNC header attached to each datagram transmitted on the network. In addition, the introduction of µTP causes an increased and recognisable volume of traffic, even though µTP results in lower overall bandwidth usage. Besides that addition, the active peer list that is returned also contains additional information over the regular BitTorrent file-sharing protocol, namely the inclusion of the local IP:port address pairs for each peer. From an investigative perspective, this extra information could prove useful in identifying the particular machine involved in the BTSync network as opposed to merely resolving the WAN IP address back to a router with potentially hundreds of LAN users. The local DHCP records could be used to resolve the MAC address (and often the hostname) of the individual machine identified during the network investigation.
In addition to the regular BitTorrent peer discovery methods outline in Sec-
dresses to the local cache of peers. BTSync facilitates this through the option to add “Predefined Hosts” to the configuration or application options. These are hardcoded IP address and port entries that are saved in order of preference. BTSync will contact these peers directly, without any requirement for a multicast (LPD) or sending a get_peers request to an online tracker.
5. Investigation Methodology
This section outlines a reproducible methodology for the network investigation of BTSync. Depending on which of the scenarios outlined above applies, the methodology may branch according to the desired outcome. Figure 7 outlines the steps involved in performing a BTSync network investigation (these steps are described in greater detail below).
5.1. Identification of Content
Depending on the scenario that motivates the BTSync network investigation, there are a number of avenues through which the forensic investigator may find the secrets (and corresponding hash values) needed for investigation:
5.1.1. Web Discovery
– As soon as BTSync was released as a public alpha, publicly accessible shar-
Figure 7: Steps Involved in Performing a BTSync Network Investigation
and numerous websites and blogs were created to set up an online “dead drop” secret share, for example http://www.12char.com and http://www.btsynckeys.com.
It is also feasible that an investigator could come across an online community that shares secrets in a private forum for the purposes of trading data and material without third-party involvement. Keys to shares discovered in this manner that possess a timestamp component will need to be checked to determine whether the link has expired.
5.1.2. Local Discovery
– An investigator could, in the course of an investigation, find evidence of BTSync having been used to transfer material to the suspect machine. BTSync may still be installed, with the folder listed among the shares stored in the configuration file, the web UI or the hidden .Sync folder, or referenced in the BTSync log files (/.sync/sync.log). If BTSync is not present (uninstalled), there could still be .SyncID files remaining in folders that were synchronised from remote peers. A hexdump of the .SyncID file or, more conveniently, the names of the *.db files found in the .Sync folder will give the SHA1 encoded share ID that the investigator needs to find other peers actively sharing that content.
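Collecting candidate share IDs from the *.db file names can be scripted. The sketch below assumes that each share database is named with the 40-character hexadecimal form of the 20-byte SHA1 share ID. That naming pattern is an assumption made for illustration and should be confirmed against the image under examination. `candidate_share_ids` is a hypothetical helper.

```python
import re
from pathlib import Path

# 20 bytes of SHA1 output written as 40 hexadecimal characters (assumed).
HEX40 = re.compile(r"^[0-9A-Fa-f]{40}$")


def candidate_share_ids(sync_dir):
    """Return the stems of *.db files whose names look like SHA1 hex digests."""
    return sorted(p.stem.upper()
                  for p in Path(sync_dir).glob("*.db")
                  if HEX40.match(p.stem))


if __name__ == "__main__":
    # Lists matching files in the current directory; point this at the
    # mounted image's .Sync folder in practice.
    print(candidate_share_ids("."))
```

Filtering on the name pattern keeps unrelated database files in the same folder out of the result.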
5.1.3. LAN traffic
– Many companies configure their edge firewalls to block torrent traffic for general users. If the company uses torrents for some other business purpose, this will usually be accounted for and allowed from or to a particular server or subnet. However, BTSync allows all external communication beyond the LAN to be turned off (in the configuration file or in the settings dialogue, the options for “Use DHT”, “Use Tracker” and “Use Relay Server” can be disabled), leaving only the settings for LAN discovery or known peers. A security review of the router logs may find active torrent traffic within the LAN, or system administrators may discover evidence of torrent applications being run.
5.2. Identification of Lookup Hash
Requesting a list of peers through any of the peer discovery methods outlined above requires a unique lookup hash. This hash is used by the tracker, DHT, PEX and LPD in the association of known peers with a particular piece of content.
5.3. Crawl the Network to Identify Peer Information
Each of the peer discovery methods outlined above should be queried for a list of known active nodes sharing that content. Because the user can configure which services are enabled in the BTSync client, complete node enumeration/identification requires that the results from each of the peer discovery methods be combined to form the final set of collected information.
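Combining the results amounts to a de-duplicating merge keyed on PeerID. The following Python sketch assumes each discovery method yields records with `peer_id` and `address` fields. These field names and the sample values are hypothetical and do not reflect the output format of any particular tool.

```python
def merge_peer_lists(results_by_method):
    """Combine peer records from several discovery methods, keyed by PeerID."""
    merged = {}
    for method, peers in results_by_method.items():
        for peer in peers:
            entry = merged.setdefault(
                peer["peer_id"], {"addresses": set(), "seen_via": set()})
            entry["addresses"].add(peer["address"])
            entry["seen_via"].add(method)
    return merged


# Illustrative input using documentation-range addresses.
results = {
    "tracker": [{"peer_id": "aa01", "address": "203.0.113.5:41000"},
                {"peer_id": "bb02", "address": "198.51.100.7:52000"}],
    "lpd": [{"peer_id": "aa01", "address": "192.168.1.20:41000"}],
}
for peer_id, info in sorted(merge_peer_lists(results).items()):
    print(peer_id, sorted(info["addresses"]), sorted(info["seen_via"]))
```

Recording which method reported each peer also shows which discovery services that peer appears to have enabled.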
5.4. Downloading and Verification of Content
Depending on the scenario being investigated, it may be necessary to download a copy of the content stored remotely for investigation or verification. In order to accomplish this, a regular BitTorrent download can be started for each of the files contained within the shared folder. If the investigation’s goal is to attempt to recreate content deliberately deleted from a suspect’s machine, the data can only be entirely recovered if there is a complete copy of the data stored remotely. However, this does not mean that any single node needs to have 100% of the content. The original data can be recombined so long as a complete copy exists split among the distributed nodes actively sharing the content. An obstacle to this stage of the investigation would be the use of limited use keys. The link descriptor for a key has no component to indicate a restricted number of uses. A further obstacle would be the option to require authorisation before a peer can access a share. This is unlikely to be the case for links discovered on a public platform.
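Whether a complete copy exists across the distributed nodes can be checked before any download begins. The sketch below assumes the investigator already knows how many pieces a file has and which piece indices each peer advertises. The function name and data layout are illustrative.

```python
def plan_recovery(total_pieces, pieces_by_peer):
    """Assign each piece to one peer that holds it and list unrecoverable ones."""
    plan = {}
    for peer, pieces in pieces_by_peer.items():
        for index in pieces:
            plan.setdefault(index, peer)  # keep the first peer seen per piece
    missing = [i for i in range(total_pieces) if i not in plan]
    return plan, missing


# No single peer holds everything, and piece 3 is held by nobody.
availability = {"peerA": {0, 1}, "peerB": {1, 2}}
plan, missing = plan_recovery(4, availability)
print(plan)
print(missing)
```

An empty `missing` list indicates that the content can be fully recombined even though no individual node holds all of it.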
6. Proof of Concept
In order to begin proof of concept testing for the investigation methodology, a bespoke BTSync crawling application was first designed and developed. This application was built to emulate regular BTSync client usage, as outlined above, and recorded the necessary results for analysis.
To demonstrate the functionality of the application, an investigation was conducted on a known publicly accessible BTSync secret. One of the public BTSync online secret sharing sites was used (http://www.btsynckeys.com/) to acquire a secret likely to have active peers sharing the corresponding content. The secret selected was advertised with the description “45 GB Movie Collection [Movies] [R]” and the read-only secret BKV273YUFMWILMESLRDVLI5NHMWO3OCS7 was supplied. It is important to note that there is no certainty that the description accurately advertises the content within the share. There is no method of verifying any of the shared content until the syncing process begins and temporary files are created in the shared folder. Even at that point, the user can merely see the file names of the content.
As part of the peer identification process, a number of active peers were returned to the investigative application. These peers were recorded for later analysis. During the first snapshot taken for this investigation, 21 peers were identified as sharing the specific content and 20 were identified on the second.
A snapshot accounts for all of the peers identified sharing the specific content at the same instant in time.
Figure 8: Daily Snapshot Comparison for Investigated Secret (Public IP Addresses Partially Redacted)
Two peers (differentiated by PeerID) of particular interest are listed as the
Comparing their peer ID and local IP:Port address pairing, it is clear that these two peers are referring to the same individual node. Between the two snapshots taken of this shared content, their IP address changed from one IP address range
to another. However, both of these IP address ranges are associated with the ISP
“Telefonica” in the same postal zip code in Berlin, Germany (data gathered from
sometime between the two snapshots as opposed to the use of a VPN or other IP address masking system. The two peers share the same external IP address but have different external ports and local IP:port pairs, indicating that the BTSync installs on these nodes are accessing the Internet through a router employing Network Address Translation (NAT).
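Peers that sit behind the same NAT router can be flagged automatically by grouping snapshot records on their external IP address. This is a minimal sketch with hypothetical field names and documentation-range addresses, not data from the investigation.

```python
from collections import defaultdict


def group_by_external_ip(peers):
    """Return external IPs that front more than one distinct local endpoint."""
    groups = defaultdict(set)
    for peer in peers:
        groups[peer["external_ip"]].add((peer["local_ip"], peer["local_port"]))
    return {ip: endpoints for ip, endpoints in groups.items()
            if len(endpoints) > 1}


snapshot = [
    {"external_ip": "203.0.113.9", "local_ip": "192.168.0.2", "local_port": 41000},
    {"external_ip": "203.0.113.9", "local_ip": "192.168.0.7", "local_port": 52000},
    {"external_ip": "198.51.100.4", "local_ip": "10.0.0.3", "local_port": 41000},
]
print(group_by_external_ip(snapshot))
```

Each flagged address is a candidate for follow-up through the local DHCP records, as discussed in Section 4.4.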
6.3. Churn Rate
While the example investigation outlined as part of this paper focuses on a single secret over a 24 hour window, the low churn rate of just 7% remains inter-
the assumption that most users are active on the network while downloading some content and disconnect upon completion. BTSync is designed to be a tool that functions in a similar manner to cloud file synchronisation services like
Dropbox or Google Drive. These tools largely operate on an “install and forget” approach whereby synchronisation and updating between the cloud and potentially multiple client machines does not require any direct user input. BTSync uses a similar approach and as a result, low churn rates would be expected.
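A churn figure can be derived from two snapshots once peers are matched by PeerID, since IP addresses may change between observations. The definition below (peers that joined or left, divided by all peers seen in either snapshot) is an assumption chosen for illustration. It is not necessarily the definition behind the figure reported in this paper.

```python
def churn_rate(previous_peer_ids, current_peer_ids):
    """Fraction of all observed peers that joined or left between snapshots."""
    previous, current = set(previous_peer_ids), set(current_peer_ids)
    observed = previous | current
    if not observed:
        return 0.0
    changed = previous ^ current  # peers present in exactly one snapshot
    return len(changed) / len(observed)


# Synthetic PeerIDs: one peer left and one joined out of five observed.
first = {"p1", "p2", "p3", "p4"}
second = {"p1", "p2", "p3", "p5"}
print(churn_rate(first, second))
```

Matching on PeerID rather than IP address avoids counting a peer that merely changed address as both a departure and an arrival.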
the investigation. While the total number of peers identified with this proof of concept investigation is quite low, the data remains consistent with regular
most popular continents involved.
7. Example Investigation
Figure 9: Geolocation of Discovered IP Addresses
In late August 2014, the iCloud accounts of numerous celebrities were hacked and compromising photos and videos were posted online without their consent in what has gained notoriety in the media and among Internet users as
the globe with the help of Internet forums, such as http://4chan.org and http://reddit.com. At the time, there were concerns that iCloud itself had been hacked and that these leaks were merely a subset of the information stolen from Apple’s servers; however, an investigation into the attack found that the pass-
7.1. Entry Point
The entry point to this investigation first involved verifying that this content was being shared using BTSync. On the public BTSync secret sharing
“subreddit” http://reddit.com/r/btsecrets, a number of public read-only secrets were shared containing collections of the leaked content. For the purposes of this investigation, one share of leaked content was investigated using the aforementioned BTSync investigative application. The secret investigated was bb63eb5b61969956e71273026f00a1deca464413. The investigation took place one week after the leak occurred.
Figure 10: Network-based Entry Point into Investigation (ShareID Highlighted in Blue)
to expedite the network analysis process. This dissector can identify the various packets pertinent to the decentralised service in the Wireshark traffic capture, as can be seen in Fig-
application was launched and the ShareID supplied.
7.2. Peer Discovery
Figure 11: IP Addresses Discovered Sharing the Content
Wireshark Dissector is downloadable from http://www.markscanlon.co/bittorrent-sync
Using the gathered ShareID, the application was able to gather information
each of the peer discovery methods outlined above.
Figure 12: Geolocation of Discovered Peers
The IP addresses detected during the investigation were geolocated and
7.4. Data Recoverable from Remote Peers
Figure 13: Evidence Recovery from Remote Peers
Some of the evidence recoverable from remote peers in this particular BT-
the time of the investigation (v1.4), did not have selective sync functionality.
As a result, each member of the secret must download all of the shared content.
Because selective syncing is absent, each peer identified will eventually hold all of the content in the share. This makes evidence recovery from such popular shares more efficient for digital investigators, as each node is a potential source of the pertinent evidence. With the advent of v2.0 of the application, selective sync means that each peer must be communicated with individually to identify which of the identified active machines holds which data.
This paper documented the protocol used in BitTorrent Sync during the discovery of peers and the synchronisation of data. While BTSync is not necessarily intended to replace BitTorrent as a file dissemination utility, it will likely be used for this purpose. This is already facilitated through websites providing
ers describe the tool as an end-to-end encrypted method of transferring files without the use of a third party staging area, which ensures that the content and personal details remain hidden from unauthorised access. Analysis of the network communication procedure produced unique identifiable information on peers including their unique PeerID, their external and local IP addresses and port numbers. In combination with traditional digital forensic methods, once a secret is identified, it is possible to discover other nodes on the network who are also sharing this data. Deleted data from a local shared folder could be downloaded from the network and recombined for forensic investigation. From an investigative perspective, the decentralised nature of BTSync will always leave an avenue of gathering information and identifying nodes sharing particular content open to the forensic investigator.
BitTorrent Inc., BitTorrent Sync User Manual (2013).
BitTorrent Inc., BitTorrent Sync Developer API (2013).
BitTorrent Inc., BitTorrent Sync Article (2013).
BitTorrent Inc., Introducing BitTorrent Sync 1.4: An Easier Way to Share Large Files (2014).
B. Cohen, The BitTorrent Protocol Specification (2008).
B. Cohen, Incentives build robustness in BitTorrent, in: Proceedings of the Workshop on Economics of Peer-to-Peer Systems, Vol. 6, 2003, pp. 68–72.
J. Li, J. Stribling, T. M. Gil, R. Morris, M. F. Kaashoek, Comparing the Performance of Distributed Hash Tables under Churn, in: Peer-to-Peer Systems III, Springer, 2005, pp. 87–99.
E. Sit, R. Morris, Security Considerations for Peer-to-Peer Distributed Hash Tables, in: Peer-to-Peer Systems, Springer, 2002, pp. 261–269.
J. Farina, M. Scanlon, M.-T. Kechadi, BitTorrent Sync: First Impressions and Forensic Implications, in: Digital Forensic Research Workshop EU (DFRWS EU 2014), 2014.
R. Layton, P. Watters, Investigation into the extent of infringing content on BitTorrent networks, Internet Commerce Security Laboratory.
M. Scanlon, A. Hannaway, M.-T. Kechadi, A Week in the Life of the Most Popular BitTorrent Swarms, 5th Annual Symposium on Information Assurance (ASIA’10).
S. Le Blond, A. Legout, F. Lefessant, W. Dabbous, M. A. Kaafar, Spying the World from your Laptop: Identifying and Profiling Content Providers and Big Downloaders in BitTorrent, in: Proceedings of the 3rd USENIX conference on Large-scale exploits and emergent threats: botnets, spyware, worms, and more, USENIX Association, 2010, pp. 4–4.
H. Chung, J. Park, S. Lee, C. Kang, Digital Forensic Investigation of Cloud Storage Services, Digital Investigation 9 (2) (2012) 81–95.
B. Martini, K.-K. R. Choo, Cloud storage forensics: ownCloud as a case study, Digital Investigation 10 (4) (2013) 287–299.
M. Scanlon, J. Farina, T. Kechadi, Leveraging Decentralization to Extend the Digital Evidence Acquisition Window: Case Study on BitTorrent Sync, Journal of Digital Forensics, Security and Law 9 (2) (2014) 85–100.
Reddit, Btsecrets, http://www.reddit.com/r/btsecrets (2014).
Maxmind Inc., Geolite country database (Jul. 2014).
O. Herrera, T. Znati, Modeling churn in P2P networks, in: Simulation Symposium, 2007. ANSS’07. 40th Annual, IEEE, 2007, pp. 33–40.
K. Bora, Apple Knew About iCloud Flaw 6 Months Before ’The Fappening’ Hit Celebrity Photos, Report Claims (Sep. 2014).
K. T. Muth, Googlestroika: Five years later, NCJL & Tech. 16 (2015)