This page documents rough-and-ready speed tests I performed to investigate how best to copy files between two Linux computers across a fast network. The conclusions are that scp and rsync leave a lot to be desired, that sshfs and sftp are the slowest of the bunch by a factor of up to 16 for lots of small files, and that transfers up to 8 times faster than with scp can be obtained using tar over ssh while still retaining a secure connection.
For details of the newer tests which are not limited by hard drive speed, please see below.
Details of the older tests
Tests were performed transferring data between two computers running Ubuntu 10.04.4 LTS (Lucid Lynx). The two computers contained dual core Intel Pentium D 3.2 GHz processors and 4 GB of memory, and were connected using Intel PRO/1000 ethernet adaptors over a gigabit ethernet connection through an old but high performance Cisco switch. The hard drive on each machine had a write speed of approximately 50 MB/s; this is rather slow, so efficient non-blocking I/O and buffering are an important factor in these speed tests.
Two data sets were tested. An incompressible data set consisted of 200 files each of 10 MB of random data from /dev/urandom. A compressible data set consisted of the uncompressed GCC 4.6.3 source tree (610 MB).
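A data set like the incompressible one can be generated along the following lines. This is a minimal sketch rather than the script actually used: the function name, the file naming and the choice of 1048576-byte blocks are all assumptions, since the text does not say whether "10 MB" means 10^6 or 2^20 bytes.

```shell
# Create COUNT files, each SIZE_MB blocks of 1048576 random bytes, in DIR.
# The function name and file names are illustrative, not from the original tests.
make_random_files() (
    dir=$1
    count=$2
    size_mb=$3
    mkdir -p "$dir" || exit 1
    i=1
    while [ "$i" -le "$count" ]; do
        # /dev/urandom yields data that gzip and bzip2 cannot shrink
        dd if=/dev/urandom of="$dir/random_$i.bin" bs=1048576 count="$size_mb" 2>/dev/null || exit 1
        i=$((i + 1))
    done
)

# Example (writes roughly 2 GB): make_random_files incompressible 200 10
```

The body runs in a subshell, so its variables do not leak into the calling shell.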
All measurements are given as average ± standard deviation, where standard deviation is not provided if it is zero to the stated precision. These measurements are taken from a number (usually 5) of independent transfers.
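For reference, the "average ± standard deviation" summary can be computed with a few lines of awk. The sketch below assumes the sample standard deviation (n - 1 in the denominator); the text does not say which definition was used, and the function name is hypothetical.

```shell
# Read one measurement per line on stdin and print "mean±sd" to one decimal place.
# Assumption: sample standard deviation (n - 1 denominator).
mean_sd() {
    awk '
        NF { n++; sum += $1; sumsq += $1 * $1 }
        END {
            if (n == 0) exit 1
            mean = sum / n
            # With a single value the spread is reported as zero
            var = (n > 1) ? (sumsq - n * mean * mean) / (n - 1) : 0
            if (var < 0) var = 0   # guard against floating-point rounding
            printf "%.1f±%.1f\n", mean, sqrt(var)
        }'
}
```

With made-up inputs, `printf '33.0\n34.0\n35.0\n' | mean_sd` prints `34.0±1.0`.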
Several methods of transfer were considered. These are listed below for transferring from src to dest on computer remote:
- scp -r src remote:dest
- rsync -r src remote:dest, using rsync version 3.0.7 protocol version 30.
- tar over ssh: tar -cf- src | ssh -q remote tar -xf- -Cdest
- tar over netcat: ssh -f remote "nc -l 7111 | tar -xf- -Cdest" ; tar -cf- src | nc remote 7111. In practice this suffered some problems, with ssh not backgrounding properly and netcat not correctly listening or correctly parsing end of file codes.
- sshfs remote:dest mnt ; cp -r src mnt ; fusermount -u mnt, using sshfs version 2.2, fuse library version 2.8.1, and fuse kernel interface version 7.12.
In all cases, the version of ssh used was OpenSSH_5.3p1 Debian-3ubuntu7, OpenSSL 0.9.8k 25 Mar 2009, and the version of (GNU) tar used was 1.22. All tests were run using the default options unless specified. For the version of ssh used, the default cipher is aes128-ctr and the default MAC is hmac-md5.
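Each of the methods above can be wrapped in a small timing harness to obtain a figure in MB/s. The following is only a sketch of one way to do it, not the harness used for these measurements: the function names are hypothetical, 1 MB is assumed to be 10^6 bytes, and `date +%s.%N` relies on GNU date.

```shell
# Convert a byte count and an elapsed time in seconds to MB/s.
# Assumption: 1 MB = 10^6 bytes.
mb_per_s() {
    awk -v bytes="$1" -v secs="$2" 'BEGIN {
        if (secs <= 0) exit 1
        printf "%.1f\n", bytes / secs / 1000000
    }'
}

# Run a command, time it, and print the throughput for a payload of BYTES bytes.
# Usage: time_transfer BYTES command [args...]
time_transfer() (
    bytes=$1
    shift
    start=$(date +%s.%N)       # %N (nanoseconds) is a GNU date extension
    "$@" >/dev/null || exit 1
    end=$(date +%s.%N)
    elapsed=$(awk -v a="$start" -v b="$end" 'BEGIN { printf "%.6f", b - a }')
    mb_per_s "$bytes" "$elapsed"
)
```

A pipeline has to be passed through a shell, for example `time_transfer 2000000000 sh -c 'tar -cf- src | ssh -q remote tar -xf- -Cdest'`. The timing stops when the command returns, so data still sitting in the remote write cache is not accounted for.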
Initial tests and ssh ciphers
The first tests aimed to find the fastest ssh ciphers and compare them with the other methods. All tests here are on incompressible data.
|scp||33.7±0.7 MB/s||33.6±0.1 MB/s||33.7±0.1 MB/s||33.8±0.1 MB/s||31.1±0.1 MB/s||34.4±0.7 MB/s||33.8±0.1 MB/s|
|tar over ssh||53.3 MB/s||53.4 MB/s||64.9 MB/s||52.9 MB/s||33.9 MB/s||85.3±0.1 MB/s||85.8±0.3 MB/s|
|rsync||32.4±0.2 MB/s||33.5±0.5 MB/s||33.7±0.6 MB/s||33.7±0.7 MB/s||33.3±0.3 MB/s||34.5±0.3 MB/s||33.8±0.5 MB/s|
|tar over netcat||58.5±2.0 MB/s|
I conclude that:
- tar over either ssh or netcat is far faster than the other methods, possibly because tar and the transfer run in parallel, getting around the hard disk speed bottleneck.
- arcfour is the fastest cipher, and aes128-cbc is faster than the default aes128-ctr. Note that blowfish-cbc is not particularly fast.
- ssh is faster than netcat using either the aes128-cbc or arcfour ciphers.
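For each method, the cipher can be chosen on the command line. The wrappers below are a sketch with hypothetical function names; they rely only on the standard `-c` option of ssh and scp and the `-e` option of rsync, and are not necessarily the exact invocations used in these tests.

```shell
# Each function copies SRC to REMOTE:DEST using the given ssh cipher.
# Usage: <function> CIPHER SRC REMOTE DEST
scp_with_cipher()   { scp -r -c "$1" "$2" "$3:$4"; }
rsync_with_cipher() { rsync -r -e "ssh -c $1" "$2" "$3:$4"; }
tar_with_cipher()   { tar -cf- "$2" | ssh -q -c "$1" "$3" tar -xf- -C"$4"; }

# Example: tar_with_cipher arcfour src remote dest
```

Newer OpenSSH releases no longer offer arcfour; on versions that support it, `ssh -Q cipher` lists the ciphers the local client provides. The destination path is passed to the remote shell unquoted, so paths containing spaces need extra care.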
ssh Message Authentication Codes (MACs)
The second tests aimed to find the fastest ssh Message Authentication Codes (MACs) and compare them with the other methods. All tests here are on incompressible data using the arcfour cipher found above to be the fastest in this case.
|scp||34.0±0.1 MB/s||33.9±0.4 MB/s||34.0±0.4 MB/s||34.0±0.1 MB/s||33.9±0.3 MB/s||34.0±0.1 MB/s|
|tar over ssh||85.7±0.2 MB/s||83.9±0.3 MB/s||73.2±0.2 MB/s||73.8±0.1 MB/s||62.1±0.1 MB/s||96.7±0.2 MB/s|
|rsync||33.6±0.7 MB/s||34.0±1.3 MB/s||34.1±0.4 MB/s||33.4±0.7 MB/s||33.8±0.6 MB/s||34.0±0.8 MB/s|
I conclude that:
- For the scp and rsync methods, the bottleneck is elsewhere and the MAC makes little difference.
- For the tar over ssh method, the MAC makes a significant difference, with the new umac-64@openssh.com outperforming everything, and the default hmac-md5 coming a not-too-distant second.
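The MAC is selected with the `-m` option of ssh, or with `-o MACs=...` for scp. A comparison over several MACs could be scripted roughly as follows; this is a sketch with a hypothetical function name, and a real benchmark would also time each run and clear the destination between runs.

```shell
# Copy SRC to REMOTE:DEST by tar over ssh, once per MAC in the list.
# Usage: tar_over_ssh_each_mac CIPHER SRC REMOTE DEST MAC...
tar_over_ssh_each_mac() (
    cipher=$1 src=$2 remote=$3 dest=$4
    shift 4
    for mac in "$@"; do
        echo "MAC: $mac"
        tar -cf- "$src" | ssh -q -c "$cipher" -m "$mac" "$remote" tar -xf- -C"$dest" || exit 1
    done
)

# Example: tar_over_ssh_each_mac arcfour src remote dest hmac-md5 umac-64@openssh.com
```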
Compression
The third tests aimed to find the best way to deal with compressibility. In addition to ssh's -C flag, which enables compression within ssh, rsync offers the -z flag and tar offers both the -z and -j flags. The -j flag uses bzip2 for compression, while all the others use gzip.
All tests were performed using the arcfour cipher and umac-64@openssh.com MAC.
|scp||compressible||9.5±0.3 MB/s||6.5±0.1 MB/s|
|tar over ssh||compressible||100.5±0.9 MB/s||16.7±0.1 MB/s||17.3±0.2 MB/s||5.0 MB/s|
|rsync||compressible||12.5±0.2 MB/s||11.7±0.2 MB/s||11.4±0.1 MB/s|
|sshfs||compressible||4.4 MB/s||3.3 MB/s|
|scp||incompressible||34.7±0.3 MB/s||13.8 MB/s|
|tar over ssh||incompressible||102.7±1.8 MB/s||13.7±0.1 MB/s||13.3±0.2 MB/s||2.5 MB/s|
|rsync||incompressible||34.5±0.5 MB/s||33.7±0.5 MB/s||33.7±0.5 MB/s|
|sshfs||incompressible||33.2±0.5 MB/s||10.9 MB/s|
I conclude that:
- On these computers, where the connection is fast enough not to be the limiting factor, not using compression beats using compression every time, even when the data being transferred is compressible.
- At least on these computers, tar over ssh (without compression) is significantly faster than every other method. Again, this may be because the two separate processes get around the limitations of the slow hard drive.
- sshfs is terrible at transferring a large number of small files (as seen from the compressible case).
- bzip2 is too slow to be used for on-the-fly compression.
- The speed of rsync appears to be independent of whether and what type of compression is used.
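For tar over ssh, the compression variants compared here correspond to command lines along these lines. This is a sketch under the assumption that tar's -z and -j flags were given on both the sending and the receiving side; the function name and mode names are hypothetical.

```shell
# Copy SRC to REMOTE:DEST by tar over ssh with the chosen compression mode.
# Usage: tar_over_ssh_compressed MODE SRC REMOTE DEST
# MODE is one of: none, ssh (ssh -C), gzip (tar -z), bzip2 (tar -j)
tar_over_ssh_compressed() (
    mode=$1 src=$2 remote=$3 dest=$4
    case $mode in
        none)  tar -cf-  "$src" | ssh -q    "$remote" tar -xf-  -C"$dest" ;;
        ssh)   tar -cf-  "$src" | ssh -q -C "$remote" tar -xf-  -C"$dest" ;;
        gzip)  tar -czf- "$src" | ssh -q    "$remote" tar -xzf- -C"$dest" ;;
        bzip2) tar -cjf- "$src" | ssh -q    "$remote" tar -xjf- -C"$dest" ;;
        *)     echo "unknown mode: $mode" >&2; exit 2 ;;
    esac
)
```

For rsync the equivalents are `rsync -r -z` for its own compression and `rsync -r -e "ssh -C"` for compression inside ssh.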
Details of the newer tests
These newer tests were performed on a newer cluster, so are not directly comparable with the ones given above. The operating system this time was Ubuntu 12.04.3 LTS, running on dual-core AMD Opteron 270 2.0 GHz processors linked by a gigabit ethernet connection through an old but high performance Cisco switch. This time, RAM disks were used at either end to avoid the hard drive bottleneck; in the fastest tests, speeds of up to 0.9 Gb/s were reached, showing that the network link is now the bottleneck.
The same data sets as before were used: one with lots of small compressible files, and one with 200 × 10 MB incompressible files. All tests used the Ciphers=arcfour128 and MACs=umac-64@openssh.com ssh options. This time, connection multiplexing was also tested. A master connection was set up using ssh -M -S mux -fN <target>, and was then used by passing the -S mux option to ssh. The results were:
|program||compressible small files, no multiplexing||compressible small files, multiplexing||incompressible large files, no multiplexing||incompressible large files, multiplexing|
|sshfs||3.4±0.0 MB/s||3.4±0.0 MB/s||75.8±0.6 MB/s||84.4±0.5 MB/s|
|sftp||4.2±0.0 MB/s||4.3±0.0 MB/s||110.6±0.1 MB/s||110.9±0.0 MB/s|
|scp||8.0±0.0 MB/s||8.0±0.0 MB/s||112.6±0.0 MB/s||113.1±0.0 MB/s|
|rsync||38.4±0.5 MB/s||38.8±0.3 MB/s||102.2±0.3 MB/s||106.9±3.2 MB/s|
|tar over ssh||69.2±0.2 MB/s||70.2±0.1 MB/s||113.5±0.0 MB/s||114.1±0.0 MB/s|
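The multiplexed runs in the table above could be scripted roughly as follows. The `-M -S mux -fN` and `-S mux` options are the ones described in the text; the function names and the use of `-O exit` to shut down the master connection are additions for illustration.

```shell
# Open a background master connection to TARGET using control socket SOCK.
# Usage: mux_open TARGET SOCK
mux_open()  { ssh -M -S "$2" -fN "$1"; }

# Copy SRC to TARGET:DEST by tar over ssh, reusing the master connection.
# Usage: mux_tar TARGET SOCK SRC DEST
mux_tar()   { tar -cf- "$3" | ssh -q -S "$2" "$1" tar -xf- -C"$4"; }

# Ask the master connection to exit once the transfers are done.
# Usage: mux_close TARGET SOCK
mux_close() { ssh -S "$2" -O exit "$1"; }

# Example: mux_open remote mux; mux_tar remote mux src dest; mux_close remote mux
```

Multiplexing saves the cost of setting up and authenticating a new connection for each command, which fits the observation that it changes little for a single long transfer.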
I conclude that:
- Multiplexing does not offer much of an advantage for any of these methods.
- tar over ssh is still the fastest: by a wide margin for the small files, where rsync comes second, and narrowly for the large files.
- sshfs, sftp and to some extent scp are terrible at transferring a large number of small files. sshfs is also poor at transferring larger files.
- The results of this test are broadly in agreement with the previous results, now that the bandwidth limit imposed by the hard drive is removed. The only inconsistency is for tar over ssh for the compressible files, with the new result being more believable.
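The headline ratios can be checked directly from the figures in the table of newer results. The helper names below are hypothetical, and the conversion to Gb/s assumes 1 MB = 10^6 bytes and 1 Gb = 10^9 bits.

```shell
# Print how many times faster speed A is than speed B (same units for both).
speedup() { awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'; }

# Convert MB/s to Gb/s, assuming 1 MB = 10^6 bytes and 1 Gb = 10^9 bits.
mb_s_to_gb_s() { awk -v x="$1" 'BEGIN { printf "%.2f\n", x * 8 / 1000 }'; }

speedup 69.2 8.0      # tar over ssh vs scp, small files:  prints 8.65
speedup 69.2 4.2      # tar over ssh vs sftp, small files: prints 16.48
mb_s_to_gb_s 114.1    # fastest large-file transfer:       prints 0.91
```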