├── .gitignore ├── LICENSE.txt ├── MANIFEST.in ├── README.md ├── bin ├── decoder └── encoder ├── lt ├── __init__.py ├── decode │ ├── __init__.py │ └── __main__.py ├── encode │ ├── __init__.py │ └── __main__.py └── sampler.py ├── setup.py └── tests ├── data └── README.txt ├── out └── decoded └── test.sh /.gitignore: -------------------------------------------------------------------------------- 1 | # Following section ignores files for python <<<<<<<<< 2 | # Byte-compiled / optimized / DLL files 3 | __pycache__/ 4 | *.py[cod] 5 | *$py.class 6 | 7 | # C extensions 8 | *.so 9 | 10 | # Distribution / packaging 11 | .Python 12 | env/ 13 | build/ 14 | develop-eggs/ 15 | dist/ 16 | downloads/ 17 | eggs/ 18 | .eggs/ 19 | lib/ 20 | lib64/ 21 | parts/ 22 | sdist/ 23 | var/ 24 | *.egg-info/ 25 | .installed.cfg 26 | *.egg 27 | 28 | # PyInstaller 29 | # Usually these files are written by a python script from a template 30 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 
31 | *.manifest 32 | *.spec 33 | 34 | # Installer logs 35 | pip-log.txt 36 | pip-delete-this-directory.txt 37 | 38 | # Unit test / coverage reports 39 | htmlcov/ 40 | .tox/ 41 | .coverage 42 | .coverage.* 43 | .cache 44 | nosetests.xml 45 | coverage.xml 46 | *,cover 47 | 48 | # Translations 49 | *.mo 50 | *.pot 51 | 52 | # Django stuff: 53 | *.log 54 | 55 | # Sphinx documentation 56 | docs/_build/ 57 | 58 | # PyBuilder 59 | target/ 60 | 61 | # End ignores for python >>>>>>>>>>>> 62 | -------------------------------------------------------------------------------- /LICENSE.txt: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2015 Anson Rosenthal 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE.
22 | -------------------------------------------------------------------------------- /MANIFEST.in: -------------------------------------------------------------------------------- 1 | include LICENSE.txt 2 | include README.md 3 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | LT-code 2 | ======= 3 | 4 | This is an implementation of a Luby Transform code in Python, consisting of two executables, one each for encoding and decoding files. These are thin wrappers around a core stream/file API. 5 | 6 | See [_D.J.C. MacKay, 'Information theory, inference, and learning algorithms'. Cambridge University Press, 2003_](http://www.inference.org.uk/itprnn/book.pdf) for reference on the algorithms. 7 | 8 | ## Encoding 9 | 10 | The encoding algorithm follows the given spec, so no innovations there. A few optimizations are made, however. First, the CDF of the degree distribution, M(d), is precomputed for all degrees d = 1, ..., K. This CDF is represented as an array mapping index d => M(d), so sampling from the degree distribution mu(d) becomes a linear search through the CDF array for the bucket our random number on \[0, 1) landed in. This random number is generated, as specified, by the linear congruential generator. 11 | 12 | Second, the integer representation of all blocks is held in RAM for maximum speed in block sample generation. This limits the size of file that can practically be encoded on most machines, but the decision does not reach far into other parts of the design, and it can easily be revisited if better memory scalability is needed.
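The CDF-array sampling described above is inverse-transform sampling by linear search. A minimal sketch of the idea (the toy CDF below and the use of Python's `random` module are illustrative stand-ins for the repository's precomputed Robust Soliton CDF and its linear congruential generator in `lt/sampler.py`):

```python
import random

def sample_degree(cdf):
    """Walk the precomputed CDF array mapping d => M(d) until we find
    the bucket that a uniform draw on [0, 1) lands in; buckets are
    1-indexed degrees."""
    p = random.random()
    for d, threshold in enumerate(cdf, start=1):
        if p < threshold:
            return d
    return len(cdf)  # guard against floating-point round-off at the top bucket

# Toy CDF standing in for the real Robust Soliton CDF over degrees 1..4
toy_cdf = [0.4, 0.7, 0.9, 1.0]
print(sample_degree(toy_cdf))  # some degree between 1 and 4
```

The linear search is O(K) per sample; a binary search (`bisect.bisect_right`) over the same array would be a drop-in O(log K) alternative.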
13 | 14 | ```python 15 | from sys import stdout 16 | from lt import encode 17 | 18 | # Stream a fountain of 1024B blocks to stdout 19 | block_size = 1024 20 | with open('file.txt', 'rb') as f: 21 | for block in encode.encoder(f, block_size): 22 | stdout.buffer.write(block) 23 | ``` 24 | 25 | ## Decoding 26 | 27 | The decoder reads the header, then the body, of each incoming block, and conducts every step of the belief propagation algorithm that becomes possible given the new check node, on a representation of the source node/check node graph. This is an online algorithm: the appropriate messages are computed incrementally and passed eagerly as the values of source nodes are resolved. Thus, the decoder finishes as soon as it has read only as many blocks from the stream as are necessary to decode the file, and it seems to scale well as the file size and block size increase. 28 | 29 | ```python 30 | from sys import stdin, stdout 31 | from lt import decode 32 | 33 | # Usage 1: Blocking 34 | # Blocks until decoding is complete, returns bytes 35 | data = decode.decode(stdin.buffer) 36 | 37 | 38 | # Usage 2: Incremental 39 | # Consume blocks in a loop, breaking when finished 40 | decoder = decode.LtDecoder() 41 | for block in decode.read_blocks(stdin.buffer): 42 | decoder.consume_block(block) 43 | if decoder.is_done(): 44 | break 45 | 46 | # You can collect the decoded transmission as bytes 47 | data = decoder.bytes_dump() 48 | 49 | # Or you can write the output directly to another stream 50 | decoder.stream_dump(stdout.buffer) 51 | 52 | ``` 53 | ## Commandline Usage 54 | 55 | To run the encoder, invoke the following from the shell 56 | ``` 57 | $ ./bin/encoder <file> <block-size> [seed] [c] [delta] 58 | ``` 59 | 60 | For example, the following streams the encoding of `file.txt` to stdout in 64B blocks.
61 | ``` 62 | $ ./bin/encoder ./file.txt 64 63 | ``` 64 | 65 | To run the decoder on stdin, run the following 66 | ``` 67 | $ ./bin/decoder 68 | ``` 69 | -------------------------------------------------------------------------------- /bin/decoder: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | python3 -m lt.decode "$@" 3 | -------------------------------------------------------------------------------- /bin/encoder: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | python3 -m lt.encode "$@" 3 | -------------------------------------------------------------------------------- /lt/__init__.py: -------------------------------------------------------------------------------- 1 | from lt import encode, decode 2 | -------------------------------------------------------------------------------- /lt/decode/__init__.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | import io 3 | import sys 4 | 5 | from struct import unpack, error 6 | from random import random 7 | from collections import defaultdict 8 | 9 | from math import ceil 10 | from lt import sampler 11 | 12 | # Check node in graph 13 | class CheckNode(object): 14 | 15 | def __init__(self, src_nodes, check): 16 | self.check = check 17 | self.src_nodes = src_nodes 18 | 19 | class BlockGraph(object): 20 | """Graph on which we run Belief Propagation to resolve 21 | source node data 22 | """ 23 | 24 | def __init__(self, num_blocks): 25 | self.checks = defaultdict(list) 26 | self.num_blocks = num_blocks 27 | self.eliminated = {} 28 | 29 | def add_block(self, nodes, data): 30 | """Adds a new check node and edges between that node and all 31 | source nodes it connects, resolving all message passes that 32 | become possible as a result. 
33 | """ 34 | 35 | # We can eliminate this source node 36 | if len(nodes) == 1: 37 | to_eliminate = list(self.eliminate(next(iter(nodes)), data)) 38 | 39 | # Recursively eliminate all nodes that can now be resolved 40 | while len(to_eliminate): 41 | other, check = to_eliminate.pop() 42 | to_eliminate.extend(self.eliminate(other, check)) 43 | else: 44 | 45 | # Pass messages from already-resolved source nodes 46 | for node in list(nodes): 47 | if node in self.eliminated: 48 | nodes.remove(node) 49 | data ^= self.eliminated[node] 50 | 51 | # Resolve if we are left with a single non-resolved source node 52 | if len(nodes) == 1: 53 | return self.add_block(nodes, data) 54 | else: 55 | 56 | # Add edges for all remaining nodes to this check 57 | check = CheckNode(nodes, data) 58 | for node in nodes: 59 | self.checks[node].append(check) 60 | 61 | # Are we done yet? 62 | return len(self.eliminated) >= self.num_blocks 63 | 64 | def eliminate(self, node, data): 65 | """Resolves a source node, passing the message to all associated checks 66 | """ 67 | 68 | # Cache resolved value 69 | self.eliminated[node] = data 70 | others = self.checks[node] 71 | del self.checks[node] 72 | 73 | # Pass messages to all associated checks 74 | for check in others: 75 | check.check ^= data 76 | check.src_nodes.remove(node) 77 | 78 | # Yield all nodes that can now be resolved 79 | if len(check.src_nodes) == 1: 80 | yield (next(iter(check.src_nodes)), check.check) 81 | 82 | class LtDecoder(object): 83 | 84 | def __init__(self, c=sampler.DEFAULT_C, delta=sampler.DEFAULT_DELTA): 85 | self.c = c 86 | self.delta = delta 87 | self.K = 0 88 | self.filesize = 0 89 | self.blocksize = 0 90 | 91 | self.block_graph = None 92 | self.prng = None 93 | self.initialized = False 94 | 95 | def is_done(self): 96 | return self.done 97 | 98 | def consume_block(self, lt_block): 99 | (filesize, blocksize, blockseed), block = lt_block 100 | 101 | # first time around, init things 102 | if not self.initialized: 103 | 
self.filesize = filesize 104 | self.blocksize = blocksize 105 | 106 | self.K = ceil(filesize/blocksize) 107 | self.block_graph = BlockGraph(self.K) 108 | self.prng = sampler.PRNG(params=(self.K, self.delta, self.c)) 109 | self.initialized = True 110 | 111 | # Run PRNG with given seed to figure out which blocks were XORed to make received data 112 | _, _, src_blocks = self.prng.get_src_blocks(seed=blockseed) 113 | 114 | # If BP is done, stop 115 | self.done = self._handle_block(src_blocks, block) 116 | return self.done 117 | 118 | def bytes_dump(self): 119 | buffer = io.BytesIO() 120 | self.stream_dump(buffer) 121 | return buffer.getvalue() 122 | 123 | def stream_dump(self, out_stream): 124 | 125 | # Iterate through blocks, stopping before padding junk 126 | for ix, block_bytes in enumerate(map(lambda p: int.to_bytes(p[1], self.blocksize, 'big'), 127 | sorted(self.block_graph.eliminated.items(), key = lambda p:p[0]))): 128 | if ix < self.K-1 or self.filesize % self.blocksize == 0: 129 | out_stream.write(block_bytes) 130 | else: 131 | out_stream.write(block_bytes[:self.filesize%self.blocksize]) 132 | 133 | def _handle_block(self, src_blocks, block): 134 | """What to do with new block: add check and pass 135 | messages in graph 136 | """ 137 | return self.block_graph.add_block(src_blocks, block) 138 | 139 | def _read_header(stream): 140 | """Read block header from network 141 | """ 142 | header_bytes = stream.read(12) 143 | return unpack('!III', header_bytes) 144 | 145 | def _read_block(blocksize, stream): 146 | """Read block data from network into integer type 147 | """ 148 | blockdata = stream.read(blocksize) 149 | return int.from_bytes(blockdata, 'big') 150 | 151 | def read_blocks(stream): 152 | """Generate parsed blocks from input stream 153 | """ 154 | while True: 155 | header = _read_header(stream) 156 | block = _read_block(header[1], stream) 157 | yield (header, block) 158 | 159 | # TODO: NO validation here that the bytes consist of a *single* block 160 | def 
block_from_bytes(bts): 161 | return next(read_blocks(io.BytesIO(bts))) 162 | 163 | def decode(in_stream, out_stream=None, **kwargs): 164 | 165 | decoder = LtDecoder(**kwargs) 166 | 167 | # Consume blocks until decoding is complete 168 | for lt_block in read_blocks(in_stream): 169 | decoder.consume_block(lt_block) 170 | if decoder.is_done(): 171 | break 172 | 173 | if out_stream: 174 | decoder.stream_dump(out_stream) 175 | else: 176 | return decoder.bytes_dump() 177 | -------------------------------------------------------------------------------- /lt/decode/__main__.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | import argparse 3 | import fileinput 4 | import sys 5 | import time 6 | from struct import unpack, error 7 | from random import random 8 | from ctypes import c_int 9 | from collections import defaultdict 10 | from math import ceil 11 | 12 | from lt import decode 13 | 14 | def run(stream=sys.stdin.buffer): 15 | """Reads from stream, applying the LT decoding algorithm 16 | to incoming encoded blocks until sufficiently many blocks 17 | have been received to reconstruct the entire file. 18 | """ 19 | payload = decode.decode(stream) 20 | sys.stdout.buffer.write(payload) 21 | 22 | if __name__ == '__main__': 23 | parser = argparse.ArgumentParser("decoder") 24 | try: 25 | run(sys.stdin.buffer) 26 | except error: 27 | print("Decoder got some invalid data.
Try again.", file=sys.stderr) 28 | -------------------------------------------------------------------------------- /lt/encode/__init__.py: -------------------------------------------------------------------------------- 1 | import sys 2 | from random import randint 3 | from struct import pack 4 | 5 | from lt import sampler 6 | 7 | def _split_file(f, blocksize): 8 | """Block file byte contents into blocksize chunks, padding last one if necessary 9 | """ 10 | 11 | f_bytes = f.read() 12 | blocks = [int.from_bytes(f_bytes[i:i+blocksize].ljust(blocksize, b'0'), sys.byteorder) 13 | for i in range(0, len(f_bytes), blocksize)] 14 | return len(f_bytes), blocks 15 | 16 | 17 | def encoder(f, blocksize, seed=None, c=sampler.DEFAULT_C, delta=sampler.DEFAULT_DELTA): 18 | """Generates an infinite sequence of blocks to transmit 19 | to the receiver 20 | """ 21 | 22 | # Generate seed if not provided 23 | if seed is None: 24 | seed = randint(1, (1 << 31) - 2) # a nonzero LCG state below the modulus 25 | 26 | # get file blocks 27 | filesize, blocks = _split_file(f, blocksize) 28 | 29 | # init stream vars 30 | K = len(blocks) 31 | prng = sampler.PRNG(params=(K, delta, c)) 32 | prng.set_seed(seed) 33 | 34 | # block generation loop 35 | while True: 36 | blockseed, d, ix_samples = prng.get_src_blocks() 37 | block_data = 0 38 | for ix in ix_samples: 39 | block_data ^= blocks[ix] 40 | 41 | # Generate blocks of XORed data in network byte order 42 | block = (filesize, blocksize, blockseed, int.to_bytes(block_data, blocksize, sys.byteorder)) 43 | yield pack('!III%ss'%blocksize, *block) 44 | -------------------------------------------------------------------------------- /lt/encode/__main__.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | """Implementation of a Luby Transform encoder.
3 | 4 | This is a type of fountain code, which deals with lossy channels by 5 | sending an infinite stream of statistically correlated packets generated 6 | from a set of blocks into which the source data is divided. In this way, 7 | expensive retransmissions are unnecessary, as the receiver will be able 8 | to reconstruct the file with high probability after receiving only 9 | slightly more blocks than one would have to transmit sending the raw 10 | blocks over a lossless channel. 11 | 12 | See 13 | 14 | D.J.C. MacKay, 'Information theory, inference, and learning algorithms'. 15 | Cambridge University Press, 2003 16 | 17 | for reference. 18 | """ 19 | import os.path 20 | import argparse 21 | import sys 22 | import time 23 | 24 | from lt import encode, sampler 25 | 26 | def run(fn, blocksize, seed, c, delta): 27 | """Run the encoder until the channel is broken, signalling that the 28 | receiver has successfully reconstructed the file 29 | """ 30 | 31 | with open(fn, 'rb') as f: 32 | for block in encode.encoder(f, blocksize, seed, c, delta): 33 | sys.stdout.buffer.write(block) 34 | 35 | if __name__ == '__main__': 36 | parser = argparse.ArgumentParser("encoder") 37 | parser.add_argument('file', help='the source file to encode') 38 | parser.add_argument('blocksize', metavar='block-size', 39 | type=int, 40 | help='the size of each encoded block, in bytes') 41 | parser.add_argument('seed', type=int, 42 | nargs="?", 43 | default=2067261, 44 | help='the initial seed for the random number generator') 45 | parser.add_argument('c', type=float, 46 | nargs="?", 47 | default=sampler.DEFAULT_C, 48 | help='degree sampling distribution tuning parameter') 49 | parser.add_argument('delta', type=float, 50 | nargs="?", 51 | default=sampler.DEFAULT_DELTA, 52 | help='degree sampling distribution tuning parameter') 53 | args = parser.parse_args() 54 | 55 | if not os.path.exists(args.file): 56 | print("File %s doesn't exist. Try again."
% args.file, file=sys.stderr) 57 | sys.exit(1) 58 | 59 | try: 60 | run(args.file, args.blocksize, args.seed, args.c, args.delta) 61 | except (GeneratorExit, IOError): 62 | print("Decoder has cut off transmission. Fountain closed.", file=sys.stderr) 63 | sys.stdout.write = lambda s:None # swallow any further writes to the broken pipe 64 | sys.stdout.flush = lambda:None 65 | sys.exit(0) 66 | -------------------------------------------------------------------------------- /lt/sampler.py: -------------------------------------------------------------------------------- 1 | """Implementation of a sampler for the Robust Soliton Distribution. 2 | 3 | This is the distribution on the `degree` of blocks encoded in the 4 | Luby Transform code. Blocks of data transmitted are generated by 5 | sampling degree `d` from the Robust Soliton Distribution, then 6 | sampling `d` blocks uniformly from the sequence of blocks in the 7 | file to be transmitted. These are XOR'ed together, and the result 8 | is transmitted. 9 | 10 | Critically, the state of the PRNG when the degree of a block was 11 | sampled is transmitted with the block as metadata, so the 12 | receiver can reconstruct the sampling of source blocks given the 13 | same PRNG parameters below.
14 | """ 15 | from math import log, floor, sqrt 16 | 17 | DEFAULT_C = 0.1 18 | DEFAULT_DELTA = 0.5 19 | 20 | # Parameters for Pseudorandom Number Generator 21 | PRNG_A = 16807 22 | PRNG_M = (1 << 31) - 1 23 | PRNG_MAX_RAND = PRNG_M - 1 24 | 25 | def gen_tau(S, K, delta): 26 | """The Robust part of the RSD, we precompute an 27 | array for speed 28 | """ 29 | pivot = floor(K/S) 30 | return [S/K * 1/d for d in range(1, pivot)] \ 31 | + [S/K * log(S/delta)] \ 32 | + [0 for d in range(pivot, K)] 33 | 34 | def gen_rho(K): 35 | """The Ideal Soliton Distribution, we precompute 36 | an array for speed 37 | """ 38 | return [1/K] + [1/(d*(d-1)) for d in range(2, K+1)] 39 | 40 | def gen_mu(K, delta, c): 41 | """The Robust Soliton Distribution on the degree of 42 | transmitted blocks 43 | """ 44 | 45 | S = c * log(K/delta) * sqrt(K) 46 | tau = gen_tau(S, K, delta) 47 | rho = gen_rho(K) 48 | normalizer = sum(rho) + sum(tau) 49 | return [(rho[d] + tau[d])/normalizer for d in range(K)] 50 | 51 | def gen_rsd_cdf(K, delta, c): 52 | """The CDF of the RSD on block degree, precomputed for 53 | sampling speed""" 54 | 55 | mu = gen_mu(K, delta, c) 56 | return [sum(mu[:d+1]) for d in range(K)] 57 | 58 | 59 | class PRNG(object): 60 | """A Pseudorandom Number Generator that yields samples 61 | from the set of source blocks using the RSD degree 62 | distribution described above. 
63 | """ 64 | 65 | def __init__(self, params): 66 | """Provide RSD parameters on construction 67 | """ 68 | 69 | self.state = None # Seed is set by interfacing code using set_seed 70 | K, delta, c = params 71 | self.K = K 72 | self.cdf = gen_rsd_cdf(K, delta, c) 73 | 74 | def _get_next(self): 75 | """Executes the next iteration of the PRNG 76 | evolution process, and returns the result 77 | """ 78 | 79 | self.state = PRNG_A * self.state % PRNG_M 80 | return self.state 81 | 82 | def _sample_d(self): 83 | """Samples degree given the precomputed 84 | distributions above and the linear PRNG output 85 | """ 86 | 87 | p = self._get_next() / PRNG_MAX_RAND 88 | for ix, v in enumerate(self.cdf): 89 | if v > p: 90 | return ix + 1 91 | return ix + 1 92 | 93 | def set_seed(self, seed): 94 | """Reset the state of the PRNG to the 95 | given seed 96 | """ 97 | 98 | self.state = seed 99 | 100 | 101 | def get_src_blocks(self, seed=None): 102 | """Returns the indices of a set of `d` source blocks 103 | sampled from indices i = 0, ..., K-1 uniformly, where 104 | `d` is sampled from the RSD described above. 105 | """ 106 | 107 | if seed: 108 | self.state = seed 109 | 110 | blockseed = self.state 111 | d = self._sample_d() 112 | have = 0 113 | nums = set() 114 | while have < d: 115 | num = self._get_next() % self.K 116 | if num not in nums: 117 | nums.add(num) 118 | have += 1 119 | return blockseed, d, nums 120 | 121 | -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- 1 | try: 2 | from setuptools import setup, find_packages 3 | except ImportError: 4 | from distutils.core import setup 5 | 6 | config = { 7 | 'description': 'An implementation of an Encoder and Decoder for the Luby Transform Fountain code.
Useful for transmitting data over very lossy channels where retry-based transmission protocols struggle.', 8 | 'author': 'Anson Rosenthal', 9 | 'url': 'https://github.com/anrosent/LT-Code', 10 | 'author_email': 'anson.rosenthal@gmail.com', 11 | 'version': '0.3.3', 12 | 'packages': ['lt', 'lt.encode', 'lt.decode'], 13 | 'scripts': [], 14 | 'name': 'lt-code' 15 | } 16 | 17 | setup(**config) 18 | -------------------------------------------------------------------------------- /tests/data/README.txt: -------------------------------------------------------------------------------- 1 | _ _____ ____ _ 2 | | | |_ _| / ___|___ __| | ___ ___ 3 | | | | | | | / _ \ / _` |/ _ \/ __| 4 | | |___| | | |__| (_) | (_| | __/\__ \ 5 | |_____|_| \____\___/ \__,_|\___||___/ 6 | =========== 7 | anrosent 8 | Completed for Brown University course cs168: Computer Networks. 9 | 10 | This is an implementation of a Luby Transform code in Python, consisting of two executables, one for each encoding and decoding files. The sampling code is pulled into a library shared between the two executables. 11 | 12 | Encoding 13 | ----------- 14 | 15 | The encoding algorithm follows the given spec, so no innovations there. A few optimizations are made however. First, the CDF of the degree distribution, M(d), is precomputed for all degrees d = 1, ..., K. This CDF is represented as an array mapping index d => M(d), so sampling from the degree distribution mu(d) becomes a linear search through the CDF array looking for the bucket our random number on [0, 1) landed in. This random number is generated as specified using the linear congruential generator. 16 | Second, the integer representation of all blocks is held in RAM for maximum speed in block sample generation. This is a limitation on the size of the file practically encoded on most computers, but this decision does not reach far into other parts of the design, and it can be easily addressed if necessary for better memory scalability. 
17 | 18 | Decoding 19 | ----------- 20 | 21 | The decoder is essentially a loop that reads the header, then the body, of each incoming block and conducts all possible steps in the belief propagation algorithm on a representation of the source node/check node graph that become possible given the new check node. This is done using an online algorithm, which computes the appropriate messages incrementally and passes them eagerly as the value of source nodes is resolved. Thus, the program will terminate once it has read only as many blocks is necessary in the stream to decode the file, and it seems to scale well as the file size, block size, and drop rate increase. 22 | 23 | Usage 24 | ------------ 25 | 26 | To run the encoder, invoke the following from the shell 27 | $ ./encoder <file> <blocksize> [c] [delta] 28 | 29 | To run the decoder, run the following 30 | $ ./decoder <drop-rate> 31 | where <drop-rate> is written as a decimal probability of a block being dropped in transmission 32 | -------------------------------------------------------------------------------- /tests/out/decoded: -------------------------------------------------------------------------------- 1 | _ _____ ____ _ 2 | | | |_ _| / ___|___ __| | ___ ___ 3 | | | | | | | / _ \ / _` |/ _ \/ __| 4 | | |___| | | |__| (_) | (_| | __/\__ \ 5 | |_____|_| \____\___/ \__,_|\___||___/ 6 | =========== 7 | anrosent 8 | Completed for Brown University course cs168: Computer Networks. 9 | 10 | This is an implementation of a Luby Transform code in Python, consisting of two executables, one for each encoding and decoding files. The sampling code is pulled into a library shared between the two executables. 11 | 12 | Encoding 13 | ----------- 14 | 15 | The encoding algorithm follows the given spec, so no innovations there. A few optimizations are made however. First, the CDF of the degree distribution, M(d), is precomputed for all degrees d = 1, ..., K.
This CDF is represented as an array mapping index d => M(d), so sampling from the degree distribution mu(d) becomes a linear search through the CDF array looking for the bucket our random number on [0, 1) landed in. This random number is generated as specified using the linear congruential generator. 16 | Second, the integer representation of all blocks is held in RAM for maximum speed in block sample generation. This is a limitation on the size of the file practically encoded on most computers, but this decision does not reach far into other parts of the design, and it can be easily addressed if necessary for better memory scalability. 17 | 18 | Decoding 19 | ----------- 20 | 21 | The decoder is essentially a loop that reads the header, then the body, of each incoming block and conducts all possible steps in the belief propagation algorithm on a representation of the source node/check node graph that become possible given the new check node. This is done using an online algorithm, which computes the appropriate messages incrementally and passes them eagerly as the value of source nodes is resolved. Thus, the program will terminate once it has read only as many blocks is necessary in the stream to decode the file, and it seems to scale well as the file size, block size, and drop rate increase. 
22 | 23 | Usage 24 | ------------ 25 | 26 | To run the encoder, invoke the following from the shell 27 | $ ./encoder <file> <blocksize> [c] [delta] 28 | 29 | To run the decoder, run the following 30 | $ ./decoder <drop-rate> 31 | where <drop-rate> is written as a decimal probability of a block being dropped in transmission 32 | -------------------------------------------------------------------------------- /tests/test.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | REALDIR=$(dirname $0) 3 | DATA=$REALDIR/data/README.txt 4 | OUT=$REALDIR/out 5 | ENCODER=$REALDIR/../bin/encoder 6 | DECODER=$REALDIR/../bin/decoder 7 | 8 | BLOCK_SIZE=64 9 | DROP_RATE=0 10 | 11 | if [ ! -d $OUT ]; 12 | then 13 | mkdir -p $OUT; 14 | else 15 | rm $OUT/* 16 | fi 17 | 18 | 19 | echo "Encoding file $DATA" 20 | $ENCODER $DATA $BLOCK_SIZE | $DECODER > $OUT/decoded 21 | echo "Verifying data <=> decoded" 22 | 23 | if [[ -z $(diff $DATA $OUT/decoded) ]]; 24 | then 25 | echo "Test passed!" 26 | else 27 | echo "Test Failed!" 28 | fi 29 | --------------------------------------------------------------------------------