├── .gitignore ├── LICENSE.txt ├── MANIFEST.in ├── README.md ├── bin ├── decoder └── encoder ├── lt ├── __init__.py ├── decode │ ├── __init__.py │ └── __main__.py ├── encode │ ├── __init__.py │ └── __main__.py └── sampler.py ├── setup.py └── tests ├── data └── README.txt ├── out └── decoded └── test.sh /.gitignore: -------------------------------------------------------------------------------- 1 | # Following section ignores files for python <<<<<<<<< 2 | # Byte-compiled / optimized / DLL files 3 | __pycache__/ 4 | *.py[cod] 5 | *$py.class 6 | 7 | # C extensions 8 | *.so 9 | 10 | # Distribution / packaging 11 | .Python 12 | env/ 13 | build/ 14 | develop-eggs/ 15 | dist/ 16 | downloads/ 17 | eggs/ 18 | .eggs/ 19 | lib/ 20 | lib64/ 21 | parts/ 22 | sdist/ 23 | var/ 24 | *.egg-info/ 25 | .installed.cfg 26 | *.egg 27 | 28 | # PyInstaller 29 | # Usually these files are written by a python script from a template 30 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 
31 | *.manifest 32 | *.spec 33 | 34 | # Installer logs 35 | pip-log.txt 36 | pip-delete-this-directory.txt 37 | 38 | # Unit test / coverage reports 39 | htmlcov/ 40 | .tox/ 41 | .coverage 42 | .coverage.* 43 | .cache 44 | nosetests.xml 45 | coverage.xml 46 | *,cover 47 | 48 | # Translations 49 | *.mo 50 | *.pot 51 | 52 | # Django stuff: 53 | *.log 54 | 55 | # Sphinx documentation 56 | docs/_build/ 57 | 58 | # PyBuilder 59 | target/ 60 | 61 | # End ignores for python >>>>>>>>>>>> 62 | -------------------------------------------------------------------------------- /LICENSE.txt: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2015 Anson Rosenthal 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE.
22 | -------------------------------------------------------------------------------- /MANIFEST.in: -------------------------------------------------------------------------------- 1 | include LICENSE.txt 2 | include README.md 3 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | LT-code 2 | ======= 3 | 4 | This is an implementation of a Luby Transform code in Python, consisting of two executables, one each for encoding and decoding files. These are thin wrappers around a core stream/file API. 5 | 6 | See [_D.J.C. MacKay, 'Information theory, inference, and learning algorithms'. Cambridge University Press, 2003_](http://www.inference.org.uk/itprnn/book.pdf) for reference on the algorithms. 7 | 8 | ## Encoding 9 | 10 | The encoding algorithm follows the given spec, so no innovations there. A few optimizations are made, however. First, the CDF of the degree distribution, M(d), is precomputed for all degrees d = 1, ..., K. This CDF is represented as an array mapping index d => M(d), so sampling from the degree distribution mu(d) becomes a linear search through the CDF array for the bucket our random number on \[0, 1) landed in. This random number is generated, as specified, by the linear congruential generator. 11 | 12 | Second, the integer representation of all blocks is held in RAM for maximum speed in block sample generation. This limits the size of file that can practically be encoded on most machines, but the decision does not reach far into other parts of the design, and it can easily be revisited if better memory scalability is needed.
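The CDF-array sampling described above is inverse-transform sampling by linear search. A minimal sketch of the idea (the toy CDF below and the use of Python's `random` module are illustrative stand-ins for the repository's precomputed Robust Soliton CDF and its linear congruential generator in `lt/sampler.py`):

```python
import random

def sample_degree(cdf):
    """Walk the precomputed CDF array mapping d => M(d) until we find
    the bucket that a uniform draw on [0, 1) lands in; buckets are
    1-indexed degrees."""
    p = random.random()
    for d, threshold in enumerate(cdf, start=1):
        if p < threshold:
            return d
    return len(cdf)  # guard against floating-point round-off at the top bucket

# Toy CDF standing in for the real Robust Soliton CDF over degrees 1..4
toy_cdf = [0.4, 0.7, 0.9, 1.0]
print(sample_degree(toy_cdf))  # some degree between 1 and 4
```

The linear search is O(K) per sample; a binary search (`bisect.bisect_right`) over the same array would be a drop-in O(log K) alternative.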
13 | 14 | ```python 15 | from sys import stdout 16 | from lt import encode 17 | 18 | # Stream a fountain of 1024B blocks to stdout 19 | block_size = 1024 20 | with open('file.txt', 'rb') as f: 21 | for block in encode.encoder(f, block_size): 22 | stdout.buffer.write(block) 23 | ``` 24 | 25 | ## Decoding 26 | 27 | The decoder reads the header, then the body, of each incoming block, and conducts every step of the belief propagation algorithm that becomes possible given the new check node, on a representation of the source node/check node graph. This is an online algorithm: the appropriate messages are computed incrementally and passed eagerly as the values of source nodes are resolved. Thus, the decoder finishes as soon as it has read only as many blocks from the stream as are necessary to decode the file, and it seems to scale well as the file size and block size increase. 28 | 29 | ```python 30 | from sys import stdin, stdout 31 | from lt import decode 32 | 33 | # Usage 1: Blocking 34 | # Blocks until decoding is complete, returns bytes 35 | data = decode.decode(stdin.buffer) 36 | 37 | 38 | # Usage 2: Incremental 39 | # Consume blocks in a loop, breaking when finished 40 | decoder = decode.LtDecoder() 41 | for block in decode.read_blocks(stdin.buffer): 42 | decoder.consume_block(block) 43 | if decoder.is_done(): 44 | break 45 | 46 | # You can collect the decoded transmission as bytes 47 | data = decoder.bytes_dump() 48 | 49 | # Or you can write the output directly to another stream 50 | decoder.stream_dump(stdout.buffer) 51 | 52 | ``` 53 | ## Commandline Usage 54 | 55 | To run the encoder, invoke the following from the shell 56 | ``` 57 | $ ./bin/encoder <file> <block-size> [seed] [c] [delta] 58 | ``` 59 | 60 | For example, the following streams the encoding of `file.txt` to stdout in 64B blocks.
61 | ``` 62 | $ ./bin/encoder ./file.txt 64 63 | ``` 64 | 65 | To run the decoder on stdin, run the following 66 | ``` 67 | $ ./bin/decoder 68 | ``` 69 | -------------------------------------------------------------------------------- /bin/decoder: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | python3 -m lt.decode "$@" 3 | -------------------------------------------------------------------------------- /bin/encoder: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | python3 -m lt.encode "$@" 3 | -------------------------------------------------------------------------------- /lt/__init__.py: -------------------------------------------------------------------------------- 1 | from lt import encode, decode 2 | -------------------------------------------------------------------------------- /lt/decode/__init__.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | import io 3 | import sys 4 | 5 | from struct import unpack, error 6 | from random import random 7 | from collections import defaultdict 8 | 9 | from math import ceil 10 | from lt import sampler 11 | 12 | # Check node in graph 13 | class CheckNode(object): 14 | 15 | def __init__(self, src_nodes, check): 16 | self.check = check 17 | self.src_nodes = src_nodes 18 | 19 | class BlockGraph(object): 20 | """Graph on which we run Belief Propagation to resolve 21 | source node data 22 | """ 23 | 24 | def __init__(self, num_blocks): 25 | self.checks = defaultdict(list) 26 | self.num_blocks = num_blocks 27 | self.eliminated = {} 28 | 29 | def add_block(self, nodes, data): 30 | """Adds a new check node and edges between that node and all 31 | source nodes it connects, resolving all message passes that 32 | become possible as a result. 
33 | """ 34 | 35 | # We can eliminate this source node 36 | if len(nodes) == 1: 37 | to_eliminate = list(self.eliminate(next(iter(nodes)), data)) 38 | 39 | # Recursively eliminate all nodes that can now be resolved 40 | while len(to_eliminate): 41 | other, check = to_eliminate.pop() 42 | to_eliminate.extend(self.eliminate(other, check)) 43 | else: 44 | 45 | # Pass messages from already-resolved source nodes 46 | for node in list(nodes): 47 | if node in self.eliminated: 48 | nodes.remove(node) 49 | data ^= self.eliminated[node] 50 | 51 | # Resolve if we are left with a single non-resolved source node 52 | if len(nodes) == 1: 53 | return self.add_block(nodes, data) 54 | else: 55 | 56 | # Add edges for all remaining nodes to this check 57 | check = CheckNode(nodes, data) 58 | for node in nodes: 59 | self.checks[node].append(check) 60 | 61 | # Are we done yet? 62 | return len(self.eliminated) >= self.num_blocks 63 | 64 | def eliminate(self, node, data): 65 | """Resolves a source node, passing the message to all associated checks 66 | """ 67 | 68 | # Cache resolved value 69 | self.eliminated[node] = data 70 | others = self.checks[node] 71 | del self.checks[node] 72 | 73 | # Pass messages to all associated checks 74 | for check in others: 75 | check.check ^= data 76 | check.src_nodes.remove(node) 77 | 78 | # Yield all nodes that can now be resolved 79 | if len(check.src_nodes) == 1: 80 | yield (next(iter(check.src_nodes)), check.check) 81 | 82 | class LtDecoder(object): 83 | 84 | def __init__(self, c=sampler.DEFAULT_C, delta=sampler.DEFAULT_DELTA): 85 | self.c = c 86 | self.delta = delta 87 | self.K = 0 88 | self.filesize = 0 89 | self.blocksize = 0 90 | 91 | self.block_graph = None 92 | self.prng = None 93 | self.initialized = False 94 | 95 | def is_done(self): 96 | return self.done 97 | 98 | def consume_block(self, lt_block): 99 | (filesize, blocksize, blockseed), block = lt_block 100 | 101 | # first time around, init things 102 | if not self.initialized: 103 | 
self.filesize = filesize 104 | self.blocksize = blocksize 105 | 106 | self.K = ceil(filesize/blocksize) 107 | self.block_graph = BlockGraph(self.K) 108 | self.prng = sampler.PRNG(params=(self.K, self.delta, self.c)) 109 | self.initialized = True 110 | 111 | # Run PRNG with given seed to figure out which blocks were XORed to make received data 112 | _, _, src_blocks = self.prng.get_src_blocks(seed=blockseed) 113 | 114 | # If BP is done, stop 115 | self.done = self._handle_block(src_blocks, block) 116 | return self.done 117 | 118 | def bytes_dump(self): 119 | buffer = io.BytesIO() 120 | self.stream_dump(buffer) 121 | return buffer.getvalue() 122 | 123 | def stream_dump(self, out_stream): 124 | 125 | # Iterate through blocks, stopping before padding junk 126 | for ix, block_bytes in enumerate(map(lambda p: int.to_bytes(p[1], self.blocksize, 'big'), 127 | sorted(self.block_graph.eliminated.items(), key = lambda p:p[0]))): 128 | if ix < self.K-1 or self.filesize % self.blocksize == 0: 129 | out_stream.write(block_bytes) 130 | else: 131 | out_stream.write(block_bytes[:self.filesize%self.blocksize]) 132 | 133 | def _handle_block(self, src_blocks, block): 134 | """What to do with new block: add check and pass 135 | messages in graph 136 | """ 137 | return self.block_graph.add_block(src_blocks, block) 138 | 139 | def _read_header(stream): 140 | """Read block header from network 141 | """ 142 | header_bytes = stream.read(12) 143 | return unpack('!III', header_bytes) 144 | 145 | def _read_block(blocksize, stream): 146 | """Read block data from network into integer type 147 | """ 148 | blockdata = stream.read(blocksize) 149 | return int.from_bytes(blockdata, 'big') 150 | 151 | def read_blocks(stream): 152 | """Generate parsed blocks from input stream 153 | """ 154 | while True: 155 | header = _read_header(stream) 156 | block = _read_block(header[1], stream) 157 | yield (header, block) 158 | 159 | # TODO: NO validation here that the bytes consist of a *single* block 160 | def 
block_from_bytes(bts): 161 | return next(read_blocks(io.BytesIO(bts))) 162 | 163 | def decode(in_stream, out_stream=None, **kwargs): 164 | 165 | decoder = LtDecoder(**kwargs) 166 | 167 | # Consume blocks until decoding is complete 168 | for lt_block in read_blocks(in_stream): 169 | decoder.consume_block(lt_block) 170 | if decoder.is_done(): 171 | break 172 | 173 | if out_stream: 174 | decoder.stream_dump(out_stream) 175 | else: 176 | return decoder.bytes_dump() 177 | -------------------------------------------------------------------------------- /lt/decode/__main__.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | import argparse 3 | import fileinput 4 | import sys 5 | import time 6 | from struct import unpack, error 7 | from random import random 8 | from ctypes import c_int 9 | from collections import defaultdict 10 | from math import ceil 11 | 12 | from lt import decode 13 | 14 | def run(stream=sys.stdin.buffer): 15 | """Reads from stream, applying the LT decoding algorithm 16 | to incoming encoded blocks until sufficiently many blocks 17 | have been received to reconstruct the entire file. 18 | """ 19 | payload = decode.decode(stream) 20 | sys.stdout.buffer.write(payload) 21 | 22 | if __name__ == '__main__': 23 | parser = argparse.ArgumentParser("decoder") 24 | try: 25 | run(sys.stdin.buffer) 26 | except error: 27 | print("Decoder got some invalid data.
Try again.", file=sys.stderr) 28 | -------------------------------------------------------------------------------- /lt/encode/__init__.py: -------------------------------------------------------------------------------- 1 | import sys 2 | from random import randint 3 | from struct import pack 4 | 5 | from lt import sampler 6 | 7 | def _split_file(f, blocksize): 8 | """Block file byte contents into blocksize chunks, padding last one if necessary 9 | """ 10 | 11 | f_bytes = f.read() 12 | blocks = [int.from_bytes(f_bytes[i:i+blocksize].ljust(blocksize, b'0'), sys.byteorder) 13 | for i in range(0, len(f_bytes), blocksize)] 14 | return len(f_bytes), blocks 15 | 16 | 17 | def encoder(f, blocksize, seed=None, c=sampler.DEFAULT_C, delta=sampler.DEFAULT_DELTA): 18 | """Generates an infinite sequence of blocks to transmit 19 | to the receiver 20 | """ 21 | 22 | # Generate seed if not provided 23 | if seed is None: 24 | seed = randint(1, (1 << 31) - 2) # a nonzero LCG state below the modulus 25 | 26 | # get file blocks 27 | filesize, blocks = _split_file(f, blocksize) 28 | 29 | # init stream vars 30 | K = len(blocks) 31 | prng = sampler.PRNG(params=(K, delta, c)) 32 | prng.set_seed(seed) 33 | 34 | # block generation loop 35 | while True: 36 | blockseed, d, ix_samples = prng.get_src_blocks() 37 | block_data = 0 38 | for ix in ix_samples: 39 | block_data ^= blocks[ix] 40 | 41 | # Generate blocks of XORed data in network byte order 42 | block = (filesize, blocksize, blockseed, int.to_bytes(block_data, blocksize, sys.byteorder)) 43 | yield pack('!III%ss'%blocksize, *block) 44 | -------------------------------------------------------------------------------- /lt/encode/__main__.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | """Implementation of a Luby Transform encoder.
3 | 4 | This is a type of fountain code, which deals with lossy channels by 5 | sending an infinite stream of statistically correlated packets generated 6 | from a set of blocks into which the source data is divided. In this way, 7 | expensive retransmissions are unnecessary, as the receiver will be able 8 | to reconstruct the file with high probability after receiving only 9 | slightly more blocks than one would have to transmit sending the raw 10 | blocks over a lossless channel. 11 | 12 | See 13 | 14 | D.J.C. MacKay, 'Information theory, inference, and learning algorithms'. 15 | Cambridge University Press, 2003 16 | 17 | for reference. 18 | """ 19 | import os.path 20 | import argparse 21 | import sys 22 | import time 23 | 24 | from lt import encode, sampler 25 | 26 | def run(fn, blocksize, seed, c, delta): 27 | """Run the encoder until the channel is broken, signalling that the 28 | receiver has successfully reconstructed the file 29 | """ 30 | 31 | with open(fn, 'rb') as f: 32 | for block in encode.encoder(f, blocksize, seed, c, delta): 33 | sys.stdout.buffer.write(block) 34 | 35 | if __name__ == '__main__': 36 | parser = argparse.ArgumentParser("encoder") 37 | parser.add_argument('file', help='the source file to encode') 38 | parser.add_argument('blocksize', metavar='block-size', 39 | type=int, 40 | help='the size of each encoded block, in bytes') 41 | parser.add_argument('seed', type=int, 42 | nargs="?", 43 | default=2067261, 44 | help='the initial seed for the random number generator') 45 | parser.add_argument('c', type=float, 46 | nargs="?", 47 | default=sampler.DEFAULT_C, 48 | help='degree sampling distribution tuning parameter') 49 | parser.add_argument('delta', type=float, 50 | nargs="?", 51 | default=sampler.DEFAULT_DELTA, 52 | help='degree sampling distribution tuning parameter') 53 | args = parser.parse_args() 54 | 55 | if not os.path.exists(args.file): 56 | print("File %s doesn't exist. Try again."
% args.file, file=sys.stderr) 57 | sys.exit(1) 58 | 59 | try: 60 | run(args.file, args.blocksize, args.seed, args.c, args.delta) 61 | except (GeneratorExit, IOError): 62 | print("Decoder has cut off transmission. Fountain closed.", file=sys.stderr) 63 | sys.stdout.write = lambda s:None # swallow any further writes to the broken pipe 64 | sys.stdout.flush = lambda:None 65 | sys.exit(0) 66 | -------------------------------------------------------------------------------- /lt/sampler.py: -------------------------------------------------------------------------------- 1 | """Implementation of a sampler for the Robust Soliton Distribution. 2 | 3 | This is the distribution on the `degree` of blocks encoded in the 4 | Luby Transform code. Blocks of data transmitted are generated by 5 | sampling degree `d` from the Robust Soliton Distribution, then 6 | sampling `d` blocks uniformly from the sequence of blocks in the 7 | file to be transmitted. These are XOR'ed together, and the result 8 | is transmitted. 9 | 10 | Critically, the state of the PRNG when the degree of a block was 11 | sampled is transmitted with the block as metadata, so the 12 | receiver can reconstruct the sampling of source blocks given the 13 | same PRNG parameters below.
14 | """ 15 | from math import log, floor, sqrt 16 | 17 | DEFAULT_C = 0.1 18 | DEFAULT_DELTA = 0.5 19 | 20 | # Parameters for Pseudorandom Number Generator 21 | PRNG_A = 16807 22 | PRNG_M = (1 << 31) - 1 23 | PRNG_MAX_RAND = PRNG_M - 1 24 | 25 | def gen_tau(S, K, delta): 26 | """The Robust part of the RSD, we precompute an 27 | array for speed 28 | """ 29 | pivot = floor(K/S) 30 | return [S/K * 1/d for d in range(1, pivot)] \ 31 | + [S/K * log(S/delta)] \ 32 | + [0 for d in range(pivot, K)] 33 | 34 | def gen_rho(K): 35 | """The Ideal Soliton Distribution, we precompute 36 | an array for speed 37 | """ 38 | return [1/K] + [1/(d*(d-1)) for d in range(2, K+1)] 39 | 40 | def gen_mu(K, delta, c): 41 | """The Robust Soliton Distribution on the degree of 42 | transmitted blocks 43 | """ 44 | 45 | S = c * log(K/delta) * sqrt(K) 46 | tau = gen_tau(S, K, delta) 47 | rho = gen_rho(K) 48 | normalizer = sum(rho) + sum(tau) 49 | return [(rho[d] + tau[d])/normalizer for d in range(K)] 50 | 51 | def gen_rsd_cdf(K, delta, c): 52 | """The CDF of the RSD on block degree, precomputed for 53 | sampling speed""" 54 | 55 | mu = gen_mu(K, delta, c) 56 | return [sum(mu[:d+1]) for d in range(K)] 57 | 58 | 59 | class PRNG(object): 60 | """A Pseudorandom Number Generator that yields samples 61 | from the set of source blocks using the RSD degree 62 | distribution described above. 
63 | """ 64 | 65 | def __init__(self, params): 66 | """Provide RSD parameters on construction 67 | """ 68 | 69 | self.state = None # Seed is set by interfacing code using set_seed 70 | K, delta, c = params 71 | self.K = K 72 | self.cdf = gen_rsd_cdf(K, delta, c) 73 | 74 | def _get_next(self): 75 | """Executes the next iteration of the PRNG 76 | evolution process, and returns the result 77 | """ 78 | 79 | self.state = PRNG_A * self.state % PRNG_M 80 | return self.state 81 | 82 | def _sample_d(self): 83 | """Samples degree given the precomputed 84 | distributions above and the linear PRNG output 85 | """ 86 | 87 | p = self._get_next() / PRNG_MAX_RAND 88 | for ix, v in enumerate(self.cdf): 89 | if v > p: 90 | return ix + 1 91 | return ix + 1 92 | 93 | def set_seed(self, seed): 94 | """Reset the state of the PRNG to the 95 | given seed 96 | """ 97 | 98 | self.state = seed 99 | 100 | 101 | def get_src_blocks(self, seed=None): 102 | """Returns the indices of a set of `d` source blocks 103 | sampled from indices i = 0, ..., K-1 uniformly, where 104 | `d` is sampled from the RSD described above. 105 | """ 106 | 107 | if seed: 108 | self.state = seed 109 | 110 | blockseed = self.state 111 | d = self._sample_d() 112 | have = 0 113 | nums = set() 114 | while have < d: 115 | num = self._get_next() % self.K 116 | if num not in nums: 117 | nums.add(num) 118 | have += 1 119 | return blockseed, d, nums 120 | 121 | -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- 1 | try: 2 | from setuptools import setup, find_packages 3 | except ImportError: 4 | from distutils.core import setup 5 | 6 | config = { 7 | 'description': 'An implementation of an Encoder and Decoder for the Luby Transform Fountain code.
Useful for transmitting data over very lossy channels where retry-based transmission protocols struggle.', 8 | 'author': 'Anson Rosenthal', 9 | 'url': 'https://github.com/anrosent/LT-Code', 10 | 'author_email': 'anson.rosenthal@gmail.com', 11 | 'version': '0.3.3', 12 | 'packages': ['lt', 'lt.encode', 'lt.decode'], 13 | 'scripts': [], 14 | 'name': 'lt-code' 15 | } 16 | 17 | setup(**config) 18 | -------------------------------------------------------------------------------- /tests/data/README.txt: -------------------------------------------------------------------------------- 1 | _ _____ ____ _ 2 | | | |_ _| / ___|___ __| | ___ ___ 3 | | | | | | | / _ \ / _` |/ _ \/ __| 4 | | |___| | | |__| (_) | (_| | __/\__ \ 5 | |_____|_| \____\___/ \__,_|\___||___/ 6 | =========== 7 | anrosent 8 | Completed for Brown University course cs168: Computer Networks. 9 | 10 | This is an implementation of a Luby Transform code in Python, consisting of two executables, one for each encoding and decoding files. The sampling code is pulled into a library shared between the two executables. 11 | 12 | Encoding 13 | ----------- 14 | 15 | The encoding algorithm follows the given spec, so no innovations there. A few optimizations are made however. First, the CDF of the degree distribution, M(d), is precomputed for all degrees d = 1, ..., K. This CDF is represented as an array mapping index d => M(d), so sampling from the degree distribution mu(d) becomes a linear search through the CDF array looking for the bucket our random number on [0, 1) landed in. This random number is generated as specified using the linear congruential generator. 16 | Second, the integer representation of all blocks is held in RAM for maximum speed in block sample generation. This is a limitation on the size of the file practically encoded on most computers, but this decision does not reach far into other parts of the design, and it can be easily addressed if necessary for better memory scalability. 
17 | 18 | Decoding 19 | ----------- 20 | 21 | The decoder is essentially a loop that reads the header, then the body, of each incoming block and conducts all possible steps in the belief propagation algorithm on a representation of the source node/check node graph that become possible given the new check node. This is done using an online algorithm, which computes the appropriate messages incrementally and passes them eagerly as the value of source nodes is resolved. Thus, the program will terminate once it has read only as many blocks is necessary in the stream to decode the file, and it seems to scale well as the file size, block size, and drop rate increase. 22 | 23 | Usage 24 | ------------ 25 | 26 | To run the encoder, invoke the following from the shell 27 | $ ./encoder <file> <blocksize> [c] [delta] 28 | 29 | To run the decoder, run the following 30 | $ ./decoder <drop-rate> 31 | where <drop-rate> is written as a decimal probability of a block being dropped in transmission 32 | -------------------------------------------------------------------------------- /tests/out/decoded: -------------------------------------------------------------------------------- 1 | _ _____ ____ _ 2 | | | |_ _| / ___|___ __| | ___ ___ 3 | | | | | | | / _ \ / _` |/ _ \/ __| 4 | | |___| | | |__| (_) | (_| | __/\__ \ 5 | |_____|_| \____\___/ \__,_|\___||___/ 6 | =========== 7 | anrosent 8 | Completed for Brown University course cs168: Computer Networks. 9 | 10 | This is an implementation of a Luby Transform code in Python, consisting of two executables, one for each encoding and decoding files. The sampling code is pulled into a library shared between the two executables. 11 | 12 | Encoding 13 | ----------- 14 | 15 | The encoding algorithm follows the given spec, so no innovations there. A few optimizations are made however. First, the CDF of the degree distribution, M(d), is precomputed for all degrees d = 1, ..., K.
This CDF is represented as an array mapping index d => M(d), so sampling from the degree distribution mu(d) becomes a linear search through the CDF array looking for the bucket our random number on [0, 1) landed in. This random number is generated as specified using the linear congruential generator. 16 | Second, the integer representation of all blocks is held in RAM for maximum speed in block sample generation. This is a limitation on the size of the file practically encoded on most computers, but this decision does not reach far into other parts of the design, and it can be easily addressed if necessary for better memory scalability. 17 | 18 | Decoding 19 | ----------- 20 | 21 | The decoder is essentially a loop that reads the header, then the body, of each incoming block and conducts all possible steps in the belief propagation algorithm on a representation of the source node/check node graph that become possible given the new check node. This is done using an online algorithm, which computes the appropriate messages incrementally and passes them eagerly as the value of source nodes is resolved. Thus, the program will terminate once it has read only as many blocks is necessary in the stream to decode the file, and it seems to scale well as the file size, block size, and drop rate increase. 
22 | 23 | Usage 24 | ------------ 25 | 26 | To run the encoder, invoke the following from the shell 27 | $ ./encoder <file> <blocksize> [c] [delta] 28 | 29 | To run the decoder, run the following 30 | $ ./decoder <drop-rate> 31 | where <drop-rate> is written as a decimal probability of a block being dropped in transmission 32 | -------------------------------------------------------------------------------- /tests/test.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | REALDIR=$(dirname $0) 3 | DATA=$REALDIR/data/README.txt 4 | OUT=$REALDIR/out 5 | ENCODER=$REALDIR/../bin/encoder 6 | DECODER=$REALDIR/../bin/decoder 7 | 8 | BLOCK_SIZE=64 9 | DROP_RATE=0 10 | 11 | if [ ! -d $OUT ]; 12 | then 13 | mkdir -p $OUT; 14 | else 15 | rm $OUT/* 16 | fi 17 | 18 | 19 | echo "Encoding file $DATA" 20 | $ENCODER $DATA $BLOCK_SIZE | $DECODER > $OUT/decoded 21 | echo "Verifying data <=> decoded" 22 | 23 | if [[ -z $(diff $DATA $OUT/decoded) ]]; 24 | then 25 | echo "Test passed!" 26 | else 27 | echo "Test Failed!" 28 | fi 29 | --------------------------------------------------------------------------------