[Cialug] rsnapshot problem with compression
Zachary Kotlarek
zach at kotlarek.com
Mon Jun 14 23:43:59 CDT 2010
On Jun 14, 2010, at 9:35 PM, Tom Pohl wrote:
> Do you have the same version of rsync on both sides of the connection?
That would be my guess too. Either your rsync can't negotiate the compression due to some protocol limitation, or one end is missing the external dependencies rsync needs to support compression. Additional verbosity would probably help find either issue.
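A quick way to compare the two ends is the first line of `rsync --version` on each machine, which reports both the program version and the protocol version. Below is a minimal Python sketch for comparing two such banners. It assumes the banner looks like `rsync  version 3.0.7  protocol version 30`; the function names and sample strings are made up for illustration.

```python
import re

# Matches the first line of `rsync --version`, e.g.
# "rsync  version 3.0.7  protocol version 30" (assumed banner format;
# an optional leading "v" on the version number is tolerated).
_BANNER = re.compile(r"rsync\s+version\s+v?(\S+)\s+protocol version\s+(\d+)")


def parse_rsync_banner(text):
    """Return (program_version, protocol_version), or None if not recognized."""
    match = _BANNER.search(text)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def same_protocol(local_banner, remote_banner):
    """True only if both banners parse and report the same protocol number."""
    local = parse_rsync_banner(local_banner)
    remote = parse_rsync_banner(remote_banner)
    return local is not None and remote is not None and local[1] == remote[1]
```

A protocol mismatch isn't necessarily fatal, since rsync normally falls back to the older protocol, but it is a hint about where to look.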
> I've run into issues with mismatched versions of rsync. The other thing I've found is that even with really fast CPUs on both ends, unless you have a relatively small pipe between the two servers, you'll slow down the transfer by turning on compression.
It really depends on where your bottleneck falls (and your sensitivity to latency, which is presumably very low on a week-long transfer); a small pipe can be that bottleneck, but with multi-core/multi-CPU machines that bottleneck can be "single core performance" even if the machine isn't very busy overall. Since SSH's encryption is not multi-threaded you can have a machine that's got CPU power to spare for new processes, but is maxed out on a single core doing encryption.
For example, given an SSH link across 1000 Mbps Ethernet I find it relatively easy to tie up one core with SSH for a sustained throughput of only ~11 MBps instead of 30+ MBps with netcat. That's with aes128, and you can do a little better if you pick less-demanding algorithms, but it's CPU-bound. I've seen the same sort of problem with dm-crypt, which is currently limited to one thread per volume, and which maxes out for me around 100 MBps on disks that will do 3 times that without encryption.
In both cases compression that runs on another core can increase effective throughput by reducing the amount of data to be encrypted. The actual speedup is related to the sparsity (compressibility) of the data being transmitted, but gzip is pretty lightweight, so if the bottleneck isn't "overall CPU load" or "overall I/O load" it's probably worth a shot.
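As a rough back-of-the-envelope model of that argument, assume the compressor really does run on its own core. The cipher and the link then only see compressed bytes, and the slowest stage sets the pace. The function name and all numbers other than the ~11 MBps cipher limit are hypothetical.

```python
def effective_throughput(cipher_mbps, link_mbps, compressor_mbps, ratio):
    """Estimate uncompressed MB/s through compress -> encrypt -> link.

    ratio is original_size / compressed_size (2.0 means the data halves).
    The cipher and the link only carry compressed bytes, so each of their
    limits is worth `ratio` times as much uncompressed data. The compressor
    limit is measured on its uncompressed input.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return min(compressor_mbps, cipher_mbps * ratio, link_mbps * ratio)


# Cipher-bound at 11 MB/s, gigabit link (~125 MB/s), hypothetical compressor
# that handles 60 MB/s of input and halves the data:
print(effective_throughput(11, 125, 60, 2.0))  # 22.0
# Same setup with a slow compressor (8 MB/s): compression now hurts.
print(effective_throughput(11, 125, 8, 2.0))   # 8
```

The second call is the case where turning compression on slows the transfer: the compressor itself becomes the bottleneck.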
That being said, I'm not sure that the compression in OpenSSH actually uses another thread (in fact I'm guessing it doesn't, and that's probably why you saw decreased performance). But adding your favorite compressor in front of your data stream, or in the current case, using rsync's compression instead of SSH's, would avoid that issue and take advantage of multiple cores.
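To make "adding a compressor in front of your data stream" concrete, here is a minimal Python sketch using the standard `zlib` module. The helper names are hypothetical. In practice the compressed chunks would be written to the stdin of a separate ssh process with its own compression turned off, and decompressed on the far side; this only shows the streaming part.

```python
import zlib


def compress_stream(chunks, level=6):
    """Yield zlib-compressed bytes for an iterable of byte chunks."""
    compressor = zlib.compressobj(level)
    for chunk in chunks:
        out = compressor.compress(chunk)
        if out:  # the compressor may buffer and return nothing yet
            yield out
    yield compressor.flush()  # emit whatever is still buffered


def decompress_stream(chunks):
    """Inverse of compress_stream: yield the original bytes back."""
    decompressor = zlib.decompressobj()
    for chunk in chunks:
        out = decompressor.decompress(chunk)
        if out:
            yield out
    tail = decompressor.flush()
    if tail:
        yield tail
```

For the compression to land on a different core than the encryption, it has to run in a different process than ssh, which is what piping through an external compressor (or using rsync's own compression) gets you.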
On a more general note, if you want to solve the single-threaded SSH issue there are some patches from PSC to do exactly that. They also provide a patch for the "none" cipher if you want to use SSH to set up an authenticated-but-not-encrypted TCP stream (auth is still encrypted), which can be handy for internal file transfers, etc.:
http://www.psc.edu/networking/projects/hpn-ssh/
Zach