[Mnet-devel] erasure code optimization project

Bryce Wilcox-O'Hearn zooko at zooko.com
Thu Aug 5 18:53:48 BST 2004


On 2004, Aug 05, , at 13:02, Lenny G Arbage wrote:
> Decoding is just as dramatic:
> real    0m0.148s
> time ~/client/fileDecoder 84 170 x
> recovered file after 110 blocks.
> real    0m0.011s
> ************ ~= 13x faster!  *****************
>   Notice that the decode step for LDGP requires
> "slightly more than k" blocks to succeed.
[...]
> There is a paper
> at the website that goes into some depth about
> "inefficiency," and shows that values of about k*1.05
> are achievable.

Is it this one?

``Design, Evaluation and Comparison of Four Large Block FEC Codecs, 
LDPC, LDGM, LDGM Staircase and LDGM Triangle, plus a Reed-Solomon Small 
Block FEC Codec'',
  V. Roca, C. Neumann

I haven't read it yet.

This paper:

``On the Practical Use of LDPC Erasure Codes for Distributed Storage 
Applications''
by James S. Plank and Michael G. Thomason

http://www.cs.utk.edu/~plank/plank/papers/CS-03-510.html

Seems to indicate that the "1.05" number is an asymptotic estimate, and 
that for many practical applications the "1.31" that you observed is 
more likely!

Is this too much?  Let's assume for the moment that we would have to 
download 1.31 times as many blocks with a LDPC than with Reed-Solomon, 
but that the LDPC code would decode 5x as fast.  Then if the time to 
download the file were less than 3/5 of the time to decode the file 
(with Reed-Solomon), then switching to LDPC would be a win.  So 
approximately it is better to use LDPC if Reed-Solomon takes half-again 
as long to decode the file as you have to wait to download the file.

However, consider that you can download and decode in parallel, so in 
practice Reed-Solomon might continue to be faster even if it takes 
twice as long to perform the Reed-Solomon decoding as it does to 
download the file.

Unfortunately, the person who encoded and published the file in the 
first place didn't know the ratio between download speed and decode 
speed on your computer, so we need to think of a good strategy in 
advance.

One possible approach would be to fix the block size at 64 KB for every 
block, and fix the expansion ratio to 3.  (M=3*K)

Then if filesize / 64 KB were greater than 84, which means that we 
could no longer fit into the GF(2^8) Reed-Solomon, we could switch to 
LDPC instead.

I'm not sure that is better than our current plan of keeping GF(2^8) 
Reed-Solomon and just making the blocks bigger.

Regards,

Zooko



More information about the Mnet-devel mailing list