ASCII or Binary data for network IO

Lots of protocols I've been using recently seem to like sending data as "binary" blobs. Of course the strings are almost always just strings, so the main difference is how numbers travel across the wire. I'm pretty sure at least some of this is just people copying and pasting ideas from elsewhere, without thinking ... much like how we now have more than a few threaded applications.

So I've decided to show some statistics. In the following table, Bin 1, Bin 2, Bin 4 and Bin 8 are the constant size raw data formats (1, 2, 4 and 8 bytes) that are human unreadable; Netstring is a variable length format used in a few places like qmail and postfix; HEX 1 to HEX 8 are a constant size format, two hex digits per byte, used most notably in cpio; and Compact HEX is just a simple variable length format which is a single digit base 36 number giving the length, followed by a variable length hex number of that many digits. Each cell is the encoded size in bytes, with the encoded text in parentheses for the variable length formats.

Value           Bin 1  Bin 2  Bin 4  Bin 8  Netstring                           HEX 1  HEX 2  HEX 4  HEX 8  Compact HEX
0               1      2      4      8      3 (0:,)                             2      4      8      16     1 (0)
1               1      2      4      8      4 (1:1,)                            2      4      8      16     2 (11)
15              1      2      4      8      5 (2:15,)                           2      4      8      16     2 (1F)
101             1      2      4      8      6 (3:101,)                          2      4      8      16     3 (265)
256             N/A    2      4      8      6 (3:256,)                          N/A    4      8      16     4 (3100)
3855            N/A    2      4      8      7 (4:3855,)                         N/A    4      8      16     4 (3F0F)
65535           N/A    2      4      8      8 (5:65535,)                        N/A    4      8      16     5 (4FFFF)
986895          N/A    N/A    4      8      9 (6:986895,)                       N/A    N/A    8      16     6 (5F0F0F)
(2^32 - 1)      N/A    N/A    4      8      14 (10:4294967295,)                 N/A    N/A    8      16     9 (8FFFFFFFF)
(2^64 - 1)      N/A    N/A    N/A    8      24 (20:18446744073709551615,)       N/A    N/A    N/A    16     17 (GFFFFFFFFFFFFFFFF)
(2^(35*4) - 1)  N/A    N/A    N/A    N/A    47 (43:1393796574908163946345...,)  N/A    N/A    N/A    N/A    36 (ZFFFFFFFFFFFFFFFFF...)
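
To make the three ASCII formats concrete, here is a rough Python sketch of how each could be produced. The function names are mine, and a few details are assumptions read off the examples in the table rather than taken from any spec: values are non-negative, hex digits are upper case, zero is sent as an empty digit string, and the base 36 length digit uses 0-9 followed by A-Z.

```python
import string

# Assumed alphabet for the single base 36 length digit: 0-9 then A-Z.
BASE36 = string.digits + string.ascii_uppercase


def netstring(value: int) -> str:
    """Decimal digits wrapped as <length>:<digits>, with 0 sent as the empty string."""
    digits = str(value) if value else ""
    return f"{len(digits)}:{digits},"


def fixed_hex(value: int, nbytes: int) -> str:
    """Constant size hex: two zero padded hex digits per byte."""
    if value >= 1 << (8 * nbytes):
        raise ValueError(f"{value} does not fit in {nbytes} byte(s)")
    return f"{value:0{2 * nbytes}X}"


def compact_hex(value: int) -> str:
    """One base 36 digit giving the number of hex digits, then the hex digits."""
    digits = f"{value:X}" if value else ""
    if len(digits) >= len(BASE36):
        raise ValueError("one base 36 digit can describe at most 35 hex digits")
    return BASE36[len(digits)] + digits


# Print a few of the values from the table in each ASCII format.
for v in (0, 1, 15, 101, 256, 3855, 65535):
    print(v, netstring(v), fixed_hex(v, 4), compact_hex(v))
```

The single length digit is what caps Compact HEX at 35 hex digits, which is why the last row of the table stops at 2^(35*4) - 1.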

As you can see, the Compact HEX ASCII representation is equal to or better than the 4 byte binary representation up to 4,095, and netstrings aren't far behind. Also, while Compact HEX is only equal to or better than the 2 byte binary representation up to 15, the 2 byte representation can't represent anything above 65535 (DNS suffers due to this).
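
Those break-even points are easy to check by brute force. The sketch below is only an illustration under the same assumptions as the table (zero is sent with no digits, one byte per ASCII character); the helper names are hypothetical.

```python
def compact_hex_len(value: int) -> int:
    """Size of the assumed Compact HEX form: one length digit plus the hex digits."""
    return 1 + (len(f"{value:X}") if value else 0)


def netstring_len(value: int) -> int:
    """Size of <length>:<digits>, with 0 sent as the empty string."""
    ndigits = len(str(value)) if value else 0
    return len(str(ndigits)) + 1 + ndigits + 1


def largest_within(size_of, budget: int, limit: int = 1 << 17) -> int:
    """Largest value below limit whose encoding fits in budget bytes."""
    return max(v for v in range(limit) if size_of(v) <= budget)


print(largest_within(compact_hex_len, 4))  # 4095: last value no bigger than 4 byte binary
print(largest_within(compact_hex_len, 2))  # 15: last value no bigger than 2 byte binary
print(largest_within(netstring_len, 4))    # 9: last value where a netstring fits in 4 bytes
```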


James Antill
Last modified: Fri May 27 12:25:03 EDT 2005