├── .no-jekyll
├── .gitignore
├── tests
    ├── simple.md
    ├── references.md
    ├── two_chunks.md
    ├── some_yes_some_no.md
    └── complex_references.md
├── Cargo.toml
├── makefile
├── index.html
├── install.sh
├── LICENSE
├── .github
    └── workflows
    │   └── release.yml
├── docs
    ├── tests.md
    └── illiterate.md
├── README.md
├── Cargo.lock
└── src
    └── main.rs


/.no-jekyll:
--------------------------------------------------------------------------------
1 | 


--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | /target
2 | 


--------------------------------------------------------------------------------
/tests/simple.md:
--------------------------------------------------------------------------------
 1 | # A simple markdown
 2 | 
 3 | Here is a simple chunk of code.
 4 | 
 5 | ```rust {export}
 6 | fn main() {
 7 |     println!("Hello, world!");
 8 | }
 9 | ```
10 | 
11 | Let's see how this works.
12 | 


--------------------------------------------------------------------------------
/Cargo.toml:
--------------------------------------------------------------------------------
 1 | [package]
 2 | name = "illiterate"
 3 | version = "0.3.1"
 4 | edition = "2024"
 5 | 
 6 | [dependencies]
 7 | clap = { version="4.5.40", features = ["derive"]}
 8 | pulldown-cmark = "0.13.0"
 9 | regex = "1.11.1"
10 | 


--------------------------------------------------------------------------------
/tests/references.md:
--------------------------------------------------------------------------------
 1 | # Block references
 2 | 
 3 | This is a chunk with a reference to another chunk.
 4 | 
 5 | ```rust {export}
 6 | fn main() {
 7 |     <<main_content>>
 8 | }
 9 | ```
10 | 
11 | And here is the content:
12 | 
13 | ```rust {name=main_content}
14 | println!("Hello World");
15 | ```
16 | 


--------------------------------------------------------------------------------
/tests/two_chunks.md:
--------------------------------------------------------------------------------
 1 | # Two chunks
 2 | 
 3 | This example has two chunks.
 4 | 
 5 | ```python {name=hello_world}
 6 | print("Hello World")
 7 | ```
 8 | And a second chunk in another language:
 9 | 
10 | ```rust {export=main.rs}
11 | fn main() {
12 |     // A first chunk
13 | }
14 | ```
15 | 
16 | Let's see how this goes.
17 | 


--------------------------------------------------------------------------------
/makefile:
--------------------------------------------------------------------------------
 1 | .PHONY: build test dev self
 2 | 
 3 | build:
 4 | 	cargo build
 5 | 
 6 | self:
 7 | 	./target/debug/illiterate docs/illiterate.md docs/tests.md --dir src
 8 | 	make test
 9 | 
10 | check:
11 | 	find docs -name "*.md" | entr ./target/debug/illiterate docs/*.md --dir src --test
12 | 
13 | test:
14 | 	cargo test
15 | 
16 | dev:
17 | 	find docs -name "*.md" | entr bash -c "./target/debug/illiterate docs/*.md --dir src && make test"
18 | 


--------------------------------------------------------------------------------
/tests/some_yes_some_no.md:
--------------------------------------------------------------------------------
 1 | # Some yes, some no
 2 | 
 3 | Let's try to parse some chunks that are correct like
 4 | 
 5 | ```rust {export}
 6 | fn main() {
 7 |     // This one goes
 8 | }
 9 | ```
10 | 
11 | And another like:
12 | 
13 | ```rust {name=chunk1}
14 | fn another() {
15 |     return 0;
16 | }
17 | ```
18 | 
19 | But then some chunks like:
20 | 
21 | ```python
22 | print("Hello World")
23 | ```
24 | 
25 | Should not be parsed.
26 | 


--------------------------------------------------------------------------------
/tests/complex_references.md:
--------------------------------------------------------------------------------
 1 | # A file with complex references
 2 | 
 3 | This is a method we will need later:
 4 | 
 5 | ```rust {name=helper}
 6 | fn helper() {
 7 |     <<helper_content>>
 8 |     return 42;
 9 | }
10 | ```
11 | 
12 | This is our main method:
13 | 
14 | ```rust {export}
15 | fn main() {
16 |     <<main_content>>
17 |     helper();
18 | }
19 | 
20 | <<helper>>
21 | ```
22 | 
23 | This is the main content:
24 | 
25 | ```rust {name=main_content}
26 | println!("Hello World");
27 | ```
28 | 
29 | And finally this is what goes inside the helper:
30 | 
31 | ```rust {name=helper_content}
32 | let a = 0;
33 | ```
34 | 
35 | Let's see how this goes.
36 | 


--------------------------------------------------------------------------------
/index.html:
--------------------------------------------------------------------------------
 1 | <!-- index.html -->
 2 | 
 3 | <!DOCTYPE html>
 4 | <html>
 5 |   <head>
 6 |     <meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1" />
 7 |     <meta name="viewport" content="width=device-width,initial-scale=1" />
 8 |     <meta charset="UTF-8" />
 9 |     <link rel="stylesheet" href="//cdn.jsdelivr.net/npm/docsify/themes/vue.css" />
10 |   </head>
11 |   <body>
12 |     <div id="app"></div>
13 |     <script>
14 |       window.$docsify = {
15 |         //...
16 |       };
17 |     </script>
18 |     <script src="//cdn.jsdelivr.net/npm/docsify@4"></script>
19 | 
20 |     <script src="//cdn.jsdelivr.net/npm/prismjs@1/components/prism-rust.min.js"></script>
21 |     <script>
22 | 
23 |     </script>
24 |   </body>
25 | </html>
26 | 


--------------------------------------------------------------------------------
/install.sh:
--------------------------------------------------------------------------------
 1 | set -e
 2 | 
 3 | # Fetch the latest release version tag (e.g., v0.1.0)
 4 | LATEST_RELEASE=$(curl -s "https://api.github.com/repos/apiad/illiterate/releases/latest" | grep '"tag_name":' | sed -E 's/.*"tag_name": "(.*)",/\1/')
 5 | 
 6 | echo "Downloading illiterate version ${LATEST_RELEASE}..."
 7 | 
 8 | # Construct the download URL for the Linux binary
 9 | DOWNLOAD_URL="https://github.com/apiad/illiterate/releases/download/${LATEST_RELEASE}/illiterate-${LATEST_RELEASE}-linux-x86_64.tar.gz"
10 | 
11 | # Download and extract the binary into /usr/local/bin
12 | # This may require sudo if you don't have write permissions to the target directory.
13 | wget "${DOWNLOAD_URL}" -O illiterate.tar.gz
14 | 
15 | echo "Done. Enter your password now to install."
16 | 
17 | sudo tar -xvzf illiterate.tar.gz -C /usr/local/bin illiterate
18 | rm illiterate.tar.gz
19 | 
20 | echo "Installation complete. You can now run 'illiterate' from your terminal."
21 | 


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
 1 | MIT License
 2 | 
 3 | Copyright (c) 2025 Alejandro Piad
 4 | 
 5 | Permission is hereby granted, free of charge, to any person obtaining a copy
 6 | of this software and associated documentation files (the "Software"), to deal
 7 | in the Software without restriction, including without limitation the rights
 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 


--------------------------------------------------------------------------------
/.github/workflows/release.yml:
--------------------------------------------------------------------------------
 1 | # .github/workflows/release.yml
 2 | 
 3 | name: Release Illiterate (Linux)
 4 | 
 5 | # This workflow runs when a new tag is pushed that starts with "v"
 6 | on:
 7 |   push:
 8 |     tags:
 9 |       - 'v[0-9]+.[0-9]+.[0-9]+*'
10 | 
11 | jobs:
12 |   # This job creates the GitHub Release entry
13 |   create-release:
14 |     name: Create Release
15 |     runs-on: ubuntu-latest
16 |     outputs:
17 |       upload_url: ${{ steps.create_release.outputs.upload_url }}
18 |     steps:
19 |       - name: Create Release
20 |         id: create_release
21 |         uses: actions/create-release@v1
22 |         env:
23 |           GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
24 |         with:
25 |           tag_name: ${{ github.ref }}
26 |           release_name: Release ${{ github.ref }}
27 |           body: |
28 |             Linux binary release. See CHANGELOG.md for details.
29 |           draft: false
30 |           prerelease: false
31 | 
32 |   # This job builds and uploads the single Linux binary
33 |   build-and-upload:
34 |     name: Build and Upload Linux Binary
35 |     needs: create-release
36 |     runs-on: ubuntu-latest
37 |     steps:
38 |       - name: Checkout repository
39 |         uses: actions/checkout@v4
40 | 
41 |       - name: Install Rust toolchain
42 |         uses: dtolnay/rust-toolchain@stable
43 |         with:
44 |           targets: x86_64-unknown-linux-musl
45 | 
46 |       - name: Build static Linux binary
47 |         run: cargo build --release --target x86_64-unknown-linux-musl
48 |         env:
49 |           RUSTFLAGS: -C target-feature=+crt-static
50 | 
51 |       - name: Package binary for release
52 |         run: |
53 |           # Create a directory for packaging
54 |           mkdir -p staging
55 |           # Copy the binary, README, and LICENSE
56 |           cp target/x86_64-unknown-linux-musl/release/illiterate staging/
57 |           cp README.md staging/
58 |           cp LICENSE staging/
59 |           # Create the tarball
60 |           cd staging
61 |           tar -czvf ../illiterate-${{ github.ref_name }}-linux-x86_64.tar.gz *
62 |           cd ..
63 | 
64 |       - name: Upload Release Asset
65 |         uses: actions/upload-release-asset@v1
66 |         env:
67 |           GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
68 |         with:
69 |           upload_url: ${{ needs.create-release.outputs.upload_url }}
70 |           asset_path: ./illiterate-${{ github.ref_name }}-linux-x86_64.tar.gz
71 |           asset_name: illiterate-${{ github.ref_name }}-linux-x86_64.tar.gz
72 |           asset_content_type: application/gzip
73 | 


--------------------------------------------------------------------------------
/docs/tests.md:
--------------------------------------------------------------------------------
  1 | # Tests
  2 | 
  3 | Here we briefly explain the unit tests for `illiterate`.
  4 | 
  5 | ```rust {name=tests}
  6 | #[cfg(test)]
  7 | mod tests {
  8 |     use super::*;
  9 | 
 10 |     <<tests_extract_chunks>>
 11 |     <<tests_info_extraction>>
 12 |     <<tests_build_chunk_map>>
 13 |     <<tests_expand_chunk>>
 14 | }
 15 | ```
 16 | 
 17 | This one is for building the named chunk map.
 18 | 
 19 | ```rust {name=build_chunk_map}
 20 | #[test]
 21 | fn test_build_chunk_map() {
 22 |     let chunks = extract_chunks("tests/references.md");
 23 |     let chunk_map = create_named_chunk_map(&chunks);
 24 | 
 25 |     assert!(chunk_map.len() == 1);
 26 |     assert!(chunk_map.contains_key("main_content"));
 27 | }
 28 | ```
 29 | 
 30 | These tests check that we can extract chunks from a markdown file with different structures.
 31 | 
 32 | ```rust {name=tests_extract_chunks}
 33 | #[test]
 34 | fn test_extract_simple_chunk() {
 35 |     let chunks = extract_chunks("tests/simple.md");
 36 | 
 37 |     assert!(chunks.len() == 1);
 38 | 
 39 |     let chunk0 = &chunks[0];
 40 |     assert!(chunk0.info.lang == "rust");
 41 |     assert_eq!(chunk0.info.path, Some("simple.rs".to_string()));
 42 | }
 43 | 
 44 | #[test]
 45 | fn test_extract_two_chunks() {
 46 |     let chunks = extract_chunks("tests/two_chunks.md");
 47 | 
 48 |     assert!(chunks.len() == 2);
 49 | 
 50 |     let chunk0 = &chunks[0];
 51 |     assert!(chunk0.info.lang == "python");
 52 |     assert!(chunk0.content == "print(\"Hello World\")\n");
 53 |     assert_eq!(chunk0.info.name, Some("hello_world".to_string()));
 54 | 
 55 |     let chunk1 = &chunks[1];
 56 |     assert!(chunk1.info.lang == "rust");
 57 |     assert!(chunk1.content == "fn main() {\n    // A first chunk\n}\n");
 58 |     assert_eq!(chunk1.info.path, Some("main.rs".to_string()));
 59 | }
 60 | 
 61 | #[test]
 62 | fn some_yes_some_no() {
 63 |     let chunks = extract_chunks("tests/some_yes_some_no.md");
 64 |     assert!(chunks.len() == 2);
 65 | }
 66 | ```
 67 | 
 68 | These tests are for the `parse_info_string` function. They check that the function correctly parses the information string into a `ChunkInfo` struct for a bunch of combinations of the information string.
 69 | 
 70 | ```rust {name=tests_info_extraction}
 71 | #[test]
 72 | fn test_full_string_parsing() {
 73 |     let info = "rust {export=src/main.rs} {name=chunk_1}";
 74 |     let expected = ChunkInfo {
 75 |         lang: "rust".to_string(),
 76 |         path: Some("src/main.rs".to_string()),
 77 |         name: Some("chunk_1".to_string()),
 78 |         export: true,
 79 |     };
 80 |     assert_eq!(parse_info_string(info), Some(expected));
 81 | }
 82 | 
 83 | #[test]
 84 | fn test_language_and_name() {
 85 |     let info = "python {name=hello_world}";
 86 |     let expected = ChunkInfo {
 87 |         lang: "python".to_string(),
 88 |         path: None,
 89 |         name: Some("hello_world".to_string()),
 90 |         export: false,
 91 |     };
 92 |     assert_eq!(parse_info_string(info), Some(expected));
 93 | }
 94 | 
 95 | #[test]
 96 | fn test_attributes_in_different_order() {
 97 |     let info = "rust {name=chunk_1} {export=src/main.rs}";
 98 |     let expected = ChunkInfo {
 99 |         lang: "rust".to_string(),
100 |         path: Some("src/main.rs".to_string()),
101 |         name: Some("chunk_1".to_string()),
102 |         export: true,
103 |     };
104 |     assert_eq!(parse_info_string(info), Some(expected));
105 | }
106 | 
107 | #[test]
108 | fn test_language_only() {
109 |     let info = "python";
110 |     assert_eq!(parse_info_string(info), None);
111 | }
112 | 
113 | #[test]
114 | fn test_with_export_only() {
115 |     let info = "javascript {export=app.js}";
116 |     let expected = ChunkInfo {
117 |         lang: "javascript".to_string(),
118 |         path: Some("app.js".to_string()),
119 |         name: None,
120 |         export: true,
121 |     };
122 |     assert_eq!(parse_info_string(info), Some(expected));
123 | }
124 | 
125 | #[test]
126 | fn test_with_headless_export_only() {
127 |     let info = "rust {export}";
128 |     let expected = ChunkInfo {
129 |         lang: "rust".to_string(),
130 |         path: None,
131 |         name: None,
132 |         export: true,
133 |     };
134 |     assert_eq!(parse_info_string(info), Some(expected));
135 | }
136 | 
137 | #[test]
138 | fn test_headless_export_with_name() {
139 |     let info = "rust {name=my_frag} {export}";
140 |     let expected = ChunkInfo {
141 |         lang: "rust".to_string(),
142 |         path: None,
143 |         name: Some("my_frag".to_string()),
144 |         export: true,
145 |     };
146 |     assert_eq!(parse_info_string(info), Some(expected));
147 | }
148 | 
149 | #[test]
150 | fn test_with_name_only() {
151 |     let info = "rust {name=my_fragment}";
152 |     let expected = ChunkInfo {
153 |         lang: "rust".to_string(),
154 |         path: None,
155 |         name: Some("my_fragment".to_string()),
156 |         export: false,
157 |     };
158 |     assert_eq!(parse_info_string(info), Some(expected));
159 | }
160 | 
161 | #[test]
162 | fn test_with_extra_whitespace() {
163 |     let info = "  bash   {export=run.sh}  ";
164 |     let expected = ChunkInfo {
165 |         lang: "bash".to_string(),
166 |         path: Some("run.sh".to_string()),
167 |         name: None,
168 |         export: true,
169 |     };
170 |     assert_eq!(parse_info_string(info), Some(expected));
171 | }
172 | 
173 | #[test]
174 | fn test_no_match_for_invalid_format() {
175 |     let info = "{invalid_format}";
176 |     assert_eq!(parse_info_string(info), None);
177 | }
178 | 
179 | #[test]
180 | fn test_empty_string() {
181 |     let info = "";
182 |     assert_eq!(parse_info_string(info), None);
183 | }
184 | ```
185 | 
186 | These are tests for the recursive chunk expansion logic.
187 | 
188 | ```rust {name=tests_expand_chunks}
189 | #[test]
190 | fn expand_simple_chunk() {
191 |     let chunks = extract_chunks("tests/references.md");
192 |     let chunk_map = create_named_chunk_map(&chunks);
193 | 
194 |     // The exportable chunk is the first one
195 |     let main_chunk = chunks.iter().find(|c| c.info.export).unwrap();
196 |     let expanded = main_chunk.expand(&chunk_map);
197 | 
198 |     assert_eq!(expanded, "fn main() {\n    println!(\"Hello World\");\n\n}\n");
199 | }
200 | 
201 | #[test]
202 | fn test_expand_complex() {
203 |     let chunks = extract_chunks("tests/complex_references.md");
204 |     let chunk_map = create_named_chunk_map(&chunks);
205 | 
206 |     let main_chunk = chunks.iter().find(|c| c.info.name == Some("helper".to_string())).unwrap();
207 |     let content = main_chunk.expand(&chunk_map);
208 | 
209 |     assert!(content.contains("let a = 0;"));
210 | }
211 | ```
212 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
  1 | # illiterate
  2 | 
  3 | A fast, zero-config, programmer-first literate programming tool. illiterate exports source code from Markdown files, allowing you to keep your code and documentation in one place, perfectly in sync. It's written in Rust, distributed as a single static binary, and designed to be simple, powerful, and language-agnostic.
  4 | 
  5 | `illiterate` is bootstrapped. The best way to understand how to use it, is to read the [annotated source code](docs/illiterate.md) to learn how it works.
  6 | 
  7 | ## Philosophy
  8 | 
  9 | * **Markdown as the Source of Truth:** Your documentation isn't just *about* the code; it *is* the code.
 10 | * **Zero-Config by Default:** No illiterate.toml or other config files needed. Everything is controlled from the command line or within the Markdown itself.
 11 | * **Programmer-First:** The primary goal is to generate clean, compilable source code. Beautiful documentation is a happy side effect.
 12 | * **Language Agnostic:** illiterate works with any programming language because it simply treats code blocks as text.
 13 | 
 14 | ## Installation
 15 | 
 16 | `illiterate` is distributed as a single static binary for Linux (and Windows Subsystem for Linux).
 17 | 
 18 | You can install it by downloading the latest pre-compiled binary from the [GitHub Releases page](https://github.com/apiad/illiterate/releases/latest) and placing it in a directory on your PATH.
 19 | 
 20 | The following command will install the latest version directly into `/usr/local/bin`:
 21 | 
 22 | ```bash
 23 | curl https://raw.githubusercontent.com/apiad/illiterate/refs/heads/main/install.sh | sh
 24 | ```
 25 | 
 26 | That's it! Run `illiterate --help` to get started.
 27 | 
 28 | ## Quick Start
 29 | 
 30 | Create a Markdown file named `my_app.md`:
 31 | 
 32 |     # My Awesome Application
 33 | 
 34 |     This is the main entry point for our program.
 35 | 
 36 |     ```rust {export=src/main.rs}
 37 |     fn main() {
 38 |         println!("Hello, Literate World!");
 39 |         <<add_a_goodbye_message>>
 40 |     }
 41 |     ```
 42 | 
 43 |     And here's a reusable code fragment that we'll inject into `main`.
 44 | 
 45 |     ```rust {name=add_a_goodbye_message}
 46 |     println!("Goodbye!");
 47 |     ```
 48 | 
 49 | Run `illiterate` to export the file:
 50 | 
 51 | ```bash
 52 | illiterate my_app.md
 53 | ```
 54 | 
 55 | A new file, `src/main.rs`, has been created with the following content:
 56 | 
 57 | ```rust
 58 | fn main() {
 59 |     println!("Hello, Literate World!");
 60 |     println!("Goodbye!");
 61 | }
 62 | ```
 63 | 
 64 | ## Core Concepts
 65 | 
 66 | `illiterate` works by parsing special attributes inside your fenced code blocks.
 67 | 
 68 | ### 1. Export Blocks ({export=...})
 69 | 
 70 | A code block marked with `{export=path/to/file.ext}` will have its contents extracted and appended to the specified file. All blocks targeting the same file are concatenated in the order they appear.
 71 | 
 72 |     ```python {export=app/main.py}
 73 |     import utils
 74 |     ```
 75 | 
 76 | ### 2. Named Fragments & Includes ({name=...} and <<...>>)
 77 | 
 78 | A code block can be given a name with `{name=my_fragment}`. This block is not exported directly but can be included elsewhere using the `<<my_fragment>>` syntax. This allows you to explain code in logical chunks, out of order, and assemble it correctly later.
 79 | 
 80 |     ```rust {name=setup_database}
 81 |     // Logic to connect to the database...
 82 |     ```
 83 | 
 84 |     ```rust {export=src/main.rs}
 85 |     fn main() {
 86 |         // Setup the database first
 87 |         <<setup_database>>
 88 |     }
 89 |     ```
 90 | 
 91 | ### 3. "Magic" Headless Exporting ({export})
 92 | 
 93 | For simple cases where one Markdown file corresponds to one source file, you can use a headless `{export}` attribute. illiterate will automatically generate the filename based on the Markdown file's name and the code block's language.
 94 | 
 95 | Given a file named my_module.md:
 96 | 
 97 |     ```rust {export}
 98 |     pub fn public_function() {
 99 |         // ...
100 |     }
101 |     ```
102 | 
103 | Running illiterate my_module.md will create the file my_module.rs.
104 | 
105 | ## Command-Line Usage
106 | 
107 | illiterate [OPTIONS] [FILES...]
108 | 
109 | * **[FILES...]**: One or more Markdown files to process.
110 | * **--dir <DIRECTORY>**: Sets the root output directory for all exported files. Defaults to the current directory.
111 | * **--test**: Tests the output against generated files, without generating anything. Exists with zero if the generated files would not change. Useful for CI/CD.
112 | 
113 | ## Changelog
114 | 
115 | ### v0.3.1
116 | 
117 | - Minor improvements and bug fixes.
118 | 
119 | ### v0.3.0
120 | 
121 | - Add flag `--test` to test the output against generated files.
122 | 
123 | ### v0.2.0
124 | 
125 | - First version with full support for named references and headless exporting.
126 | 
127 | ### v0.1.0
128 | 
129 | - Basic markdown parsing and extraction of simple blocks.
130 | 
131 | ## Why not...?
132 | 
133 | There are many features you could conceivably expect from a literate programming tool that I purposefully avoided in favor of simplicity. Some of them are listed below with possible alternatives.
134 | 
135 | Most of these are achievable with a proper build system, e.g., using `makefile` or any other build utility.
136 | 
137 | - **Conditional inclusions**: If you want to include some code chunks based on some condition (like, optional features), you can always achieve it by separating them into different source (markdown) files, and then conditionally including them in your call to `illiterate`.
138 | - **Replacing instead of appending**: If you want that some named chunks are treated as a replacement instead of an addition to the same named chunk, you can instead refactor your source such that source A or source B implements one version and source B another, and then use conditional compilation (see above) to decide which to include.
139 | - **Backpatching literate sources**: In literate programming, it can be tempting to make small fixes in the tangled code that you would later want to incorporate in the literate source. However, this creates more headaches than it solves, including keeping track of where each piece of tangled code came from. Combining recursive expansion and implicit indentation of code chunks, with the language-agnostic nature of `illiterate`, it is almost impossible to do this robustly because we would need to know how to annotate (e.g., how to put comments) in any possible language to achieve this. Instead, I encourage you to fully embrace literate programming and treat tangled source the same way you treat compiled bytecode.
140 | 
141 | ## Contribution
142 | 
143 | At the moment, I consider `illiterate` feature-full and don't plan on adding anything other than to improve performance or refine some corner cases I might have missed.
144 | 
145 | That being said, there are some features that I do believe would make the whole experience of working with literate sources much better, and I would happily accept PRs in this direction, including but not limited to:
146 | 
147 | - **Linting literate sources**: If you have a language with type checking or any other static checks, it would be awesome to be able to see those linting errors in the literate source. However, this is extremely hard to achieve with `illiterate` for the same reasoning we don't want backpatching (see above), and literate programming coupled with robust testing already pretty much solves the need for very complicated static analysis.  However, I do think it's doable by creating intermediate mapping files that explain which line of tangled source comes from which line of literate source, though I'm unconvinced whether the added complexity justifies the feature.
148 | - **Support for editor X**: In the same veing, if you want to add editor support for literate sources (e.g., cross-referencing and navigation between named chunks), I'm happy to accept PRs that include editor-specific config files or even short editor extensions. These would not become part of the `illiterate` program itself, but rather would be additional tools we could add or link to in the main `illiterate` repository.
149 | 
150 | I will also happily accept PRs that improve `illiterate` in the following ways:
151 | 
152 | - Improving performance.
153 | - Improving the literate source explanations.
154 | - Fixes and/or tests for existing bugs and corner cases.
155 | - Github workflows to compile for other platforms.
156 | 
157 | If you want to fork `illiterate` and modify it for your own use cases, you're absolutely welcome to. `illiterate` is fully open source and will forever be free as in free beer and free speech.
158 | 
159 | ### Building from Source
160 | 
161 | This project is self-hosting! The source code for illiterate lives in illiterate.md and is exported to the src/ directory, which is checked into version control.
162 | 
163 | The workflow for contributors is simple:
164 | 
165 | 1. **Clone the repository:**
166 | 
167 | ```bash
168 | git clone https://github.com/apiad/illiterate.git
169 | cd illiterate
170 | ```
171 | 
172 | 2. **Build the initial version:** The src/ directory contains pre-exported source code, so you can build it immediately.
173 | 
174 | ```bash
175 | cargo build
176 | ```
177 | 
178 | This creates a binary at `target/debug/illiterate`.
179 | 
180 | 3. **Make your changes:** Edit the "source of truth" file, `illiterate.md`.
181 | 
182 | 4. **Re-export the source code:** Use the binary you just built to update the `src/` directory with your changes. This will run tests as well.
183 | 
184 | ```bash
185 | make self
186 | ```
187 | 
188 | 5. **Rinse and repeat** until done. Then push and send a PR.
189 | 
190 | ## License
191 | 
192 | `illiterate` is licensed under the **MIT License**. See the [LICENSE](LICENSE) file for details.
193 | 


--------------------------------------------------------------------------------
/Cargo.lock:
--------------------------------------------------------------------------------
  1 | # This file is automatically @generated by Cargo.
  2 | # It is not intended for manual editing.
  3 | version = 4
  4 | 
  5 | [[package]]
  6 | name = "aho-corasick"
  7 | version = "1.1.3"
  8 | source = "registry+https://github.com/rust-lang/crates.io-index"
  9 | checksum = "8e60d3430d3a69478ad0993f19238d2df97c507009a52b3c10addcd7f6bcb916"
 10 | dependencies = [
 11 |  "memchr",
 12 | ]
 13 | 
 14 | [[package]]
 15 | name = "anstream"
 16 | version = "0.6.19"
 17 | source = "registry+https://github.com/rust-lang/crates.io-index"
 18 | checksum = "301af1932e46185686725e0fad2f8f2aa7da69dd70bf6ecc44d6b703844a3933"
 19 | dependencies = [
 20 |  "anstyle",
 21 |  "anstyle-parse",
 22 |  "anstyle-query",
 23 |  "anstyle-wincon",
 24 |  "colorchoice",
 25 |  "is_terminal_polyfill",
 26 |  "utf8parse",
 27 | ]
 28 | 
 29 | [[package]]
 30 | name = "anstyle"
 31 | version = "1.0.11"
 32 | source = "registry+https://github.com/rust-lang/crates.io-index"
 33 | checksum = "862ed96ca487e809f1c8e5a8447f6ee2cf102f846893800b20cebdf541fc6bbd"
 34 | 
 35 | [[package]]
 36 | name = "anstyle-parse"
 37 | version = "0.2.7"
 38 | source = "registry+https://github.com/rust-lang/crates.io-index"
 39 | checksum = "4e7644824f0aa2c7b9384579234ef10eb7efb6a0deb83f9630a49594dd9c15c2"
 40 | dependencies = [
 41 |  "utf8parse",
 42 | ]
 43 | 
 44 | [[package]]
 45 | name = "anstyle-query"
 46 | version = "1.1.3"
 47 | source = "registry+https://github.com/rust-lang/crates.io-index"
 48 | checksum = "6c8bdeb6047d8983be085bab0ba1472e6dc604e7041dbf6fcd5e71523014fae9"
 49 | dependencies = [
 50 |  "windows-sys",
 51 | ]
 52 | 
 53 | [[package]]
 54 | name = "anstyle-wincon"
 55 | version = "3.0.9"
 56 | source = "registry+https://github.com/rust-lang/crates.io-index"
 57 | checksum = "403f75924867bb1033c59fbf0797484329750cfbe3c4325cd33127941fabc882"
 58 | dependencies = [
 59 |  "anstyle",
 60 |  "once_cell_polyfill",
 61 |  "windows-sys",
 62 | ]
 63 | 
 64 | [[package]]
 65 | name = "bitflags"
 66 | version = "2.9.1"
 67 | source = "registry+https://github.com/rust-lang/crates.io-index"
 68 | checksum = "1b8e56985ec62d17e9c1001dc89c88ecd7dc08e47eba5ec7c29c7b5eeecde967"
 69 | 
 70 | [[package]]
 71 | name = "clap"
 72 | version = "4.5.40"
 73 | source = "registry+https://github.com/rust-lang/crates.io-index"
 74 | checksum = "40b6887a1d8685cebccf115538db5c0efe625ccac9696ad45c409d96566e910f"
 75 | dependencies = [
 76 |  "clap_builder",
 77 |  "clap_derive",
 78 | ]
 79 | 
 80 | [[package]]
 81 | name = "clap_builder"
 82 | version = "4.5.40"
 83 | source = "registry+https://github.com/rust-lang/crates.io-index"
 84 | checksum = "e0c66c08ce9f0c698cbce5c0279d0bb6ac936d8674174fe48f736533b964f59e"
 85 | dependencies = [
 86 |  "anstream",
 87 |  "anstyle",
 88 |  "clap_lex",
 89 |  "strsim",
 90 | ]
 91 | 
 92 | [[package]]
 93 | name = "clap_derive"
 94 | version = "4.5.40"
 95 | source = "registry+https://github.com/rust-lang/crates.io-index"
 96 | checksum = "d2c7947ae4cc3d851207c1adb5b5e260ff0cca11446b1d6d1423788e442257ce"
 97 | dependencies = [
 98 |  "heck",
 99 |  "proc-macro2",
100 |  "quote",
101 |  "syn",
102 | ]
103 | 
104 | [[package]]
105 | name = "clap_lex"
106 | version = "0.7.5"
107 | source = "registry+https://github.com/rust-lang/crates.io-index"
108 | checksum = "b94f61472cee1439c0b966b47e3aca9ae07e45d070759512cd390ea2bebc6675"
109 | 
110 | [[package]]
111 | name = "colorchoice"
112 | version = "1.0.4"
113 | source = "registry+https://github.com/rust-lang/crates.io-index"
114 | checksum = "b05b61dc5112cbb17e4b6cd61790d9845d13888356391624cbe7e41efeac1e75"
115 | 
116 | [[package]]
117 | name = "getopts"
118 | version = "0.2.23"
119 | source = "registry+https://github.com/rust-lang/crates.io-index"
120 | checksum = "cba6ae63eb948698e300f645f87c70f76630d505f23b8907cf1e193ee85048c1"
121 | dependencies = [
122 |  "unicode-width",
123 | ]
124 | 
125 | [[package]]
126 | name = "heck"
127 | version = "0.5.0"
128 | source = "registry+https://github.com/rust-lang/crates.io-index"
129 | checksum = "2304e00983f87ffb38b55b444b5e3b60a884b5d30c0fca7d82fe33449bbe55ea"
130 | 
131 | [[package]]
132 | name = "illiterate"
133 | version = "0.3.1"
134 | dependencies = [
135 |  "clap",
136 |  "pulldown-cmark",
137 |  "regex",
138 | ]
139 | 
140 | [[package]]
141 | name = "is_terminal_polyfill"
142 | version = "1.70.1"
143 | source = "registry+https://github.com/rust-lang/crates.io-index"
144 | checksum = "7943c866cc5cd64cbc25b2e01621d07fa8eb2a1a23160ee81ce38704e97b8ecf"
145 | 
146 | [[package]]
147 | name = "memchr"
148 | version = "2.7.5"
149 | source = "registry+https://github.com/rust-lang/crates.io-index"
150 | checksum = "32a282da65faaf38286cf3be983213fcf1d2e2a58700e808f83f4ea9a4804bc0"
151 | 
152 | [[package]]
153 | name = "once_cell_polyfill"
154 | version = "1.70.1"
155 | source = "registry+https://github.com/rust-lang/crates.io-index"
156 | checksum = "a4895175b425cb1f87721b59f0f286c2092bd4af812243672510e1ac53e2e0ad"
157 | 
158 | [[package]]
159 | name = "proc-macro2"
160 | version = "1.0.95"
161 | source = "registry+https://github.com/rust-lang/crates.io-index"
162 | checksum = "02b3e5e68a3a1a02aad3ec490a98007cbc13c37cbe84a3cd7b8e406d76e7f778"
163 | dependencies = [
164 |  "unicode-ident",
165 | ]
166 | 
167 | [[package]]
168 | name = "pulldown-cmark"
169 | version = "0.13.0"
170 | source = "registry+https://github.com/rust-lang/crates.io-index"
171 | checksum = "1e8bbe1a966bd2f362681a44f6edce3c2310ac21e4d5067a6e7ec396297a6ea0"
172 | dependencies = [
173 |  "bitflags",
174 |  "getopts",
175 |  "memchr",
176 |  "pulldown-cmark-escape",
177 |  "unicase",
178 | ]
179 | 
180 | [[package]]
181 | name = "pulldown-cmark-escape"
182 | version = "0.11.0"
183 | source = "registry+https://github.com/rust-lang/crates.io-index"
184 | checksum = "007d8adb5ddab6f8e3f491ac63566a7d5002cc7ed73901f72057943fa71ae1ae"
185 | 
186 | [[package]]
187 | name = "quote"
188 | version = "1.0.40"
189 | source = "registry+https://github.com/rust-lang/crates.io-index"
190 | checksum = "1885c039570dc00dcb4ff087a89e185fd56bae234ddc7f056a945bf36467248d"
191 | dependencies = [
192 |  "proc-macro2",
193 | ]
194 | 
195 | [[package]]
196 | name = "regex"
197 | version = "1.11.1"
198 | source = "registry+https://github.com/rust-lang/crates.io-index"
199 | checksum = "b544ef1b4eac5dc2db33ea63606ae9ffcfac26c1416a2806ae0bf5f56b201191"
200 | dependencies = [
201 |  "aho-corasick",
202 |  "memchr",
203 |  "regex-automata",
204 |  "regex-syntax",
205 | ]
206 | 
207 | [[package]]
208 | name = "regex-automata"
209 | version = "0.4.9"
210 | source = "registry+https://github.com/rust-lang/crates.io-index"
211 | checksum = "809e8dc61f6de73b46c85f4c96486310fe304c434cfa43669d7b40f711150908"
212 | dependencies = [
213 |  "aho-corasick",
214 |  "memchr",
215 |  "regex-syntax",
216 | ]
217 | 
218 | [[package]]
219 | name = "regex-syntax"
220 | version = "0.8.5"
221 | source = "registry+https://github.com/rust-lang/crates.io-index"
222 | checksum = "2b15c43186be67a4fd63bee50d0303afffcef381492ebe2c5d87f324e1b8815c"
223 | 
224 | [[package]]
225 | name = "strsim"
226 | version = "0.11.1"
227 | source = "registry+https://github.com/rust-lang/crates.io-index"
228 | checksum = "7da8b5736845d9f2fcb837ea5d9e2628564b3b043a70948a3f0b778838c5fb4f"
229 | 
230 | [[package]]
231 | name = "syn"
232 | version = "2.0.104"
233 | source = "registry+https://github.com/rust-lang/crates.io-index"
234 | checksum = "17b6f705963418cdb9927482fa304bc562ece2fdd4f616084c50b7023b435a40"
235 | dependencies = [
236 |  "proc-macro2",
237 |  "quote",
238 |  "unicode-ident",
239 | ]
240 | 
241 | [[package]]
242 | name = "unicase"
243 | version = "2.8.1"
244 | source = "registry+https://github.com/rust-lang/crates.io-index"
245 | checksum = "75b844d17643ee918803943289730bec8aac480150456169e647ed0b576ba539"
246 | 
247 | [[package]]
248 | name = "unicode-ident"
249 | version = "1.0.18"
250 | source = "registry+https://github.com/rust-lang/crates.io-index"
251 | checksum = "5a5f39404a5da50712a4c1eecf25e90dd62b613502b7e925fd4e4d19b5c96512"
252 | 
253 | [[package]]
254 | name = "unicode-width"
255 | version = "0.2.1"
256 | source = "registry+https://github.com/rust-lang/crates.io-index"
257 | checksum = "4a1a07cc7db3810833284e8d372ccdc6da29741639ecc70c9ec107df0fa6154c"
258 | 
259 | [[package]]
260 | name = "utf8parse"
261 | version = "0.2.2"
262 | source = "registry+https://github.com/rust-lang/crates.io-index"
263 | checksum = "06abde3611657adf66d383f00b093d7faecc7fa57071cce2578660c9f1010821"
264 | 
265 | [[package]]
266 | name = "windows-sys"
267 | version = "0.59.0"
268 | source = "registry+https://github.com/rust-lang/crates.io-index"
269 | checksum = "1e38bc4d79ed67fd075bcc251a1c39b32a1776bbe92e5bef1f0bf1f8c531853b"
270 | dependencies = [
271 |  "windows-targets",
272 | ]
273 | 
274 | [[package]]
275 | name = "windows-targets"
276 | version = "0.52.6"
277 | source = "registry+https://github.com/rust-lang/crates.io-index"
278 | checksum = "9b724f72796e036ab90c1021d4780d4d3d648aca59e491e6b98e725b84e99973"
279 | dependencies = [
280 |  "windows_aarch64_gnullvm",
281 |  "windows_aarch64_msvc",
282 |  "windows_i686_gnu",
283 |  "windows_i686_gnullvm",
284 |  "windows_i686_msvc",
285 |  "windows_x86_64_gnu",
286 |  "windows_x86_64_gnullvm",
287 |  "windows_x86_64_msvc",
288 | ]
289 | 
290 | [[package]]
291 | name = "windows_aarch64_gnullvm"
292 | version = "0.52.6"
293 | source = "registry+https://github.com/rust-lang/crates.io-index"
294 | checksum = "32a4622180e7a0ec044bb555404c800bc9fd9ec262ec147edd5989ccd0c02cd3"
295 | 
296 | [[package]]
297 | name = "windows_aarch64_msvc"
298 | version = "0.52.6"
299 | source = "registry+https://github.com/rust-lang/crates.io-index"
300 | checksum = "09ec2a7bb152e2252b53fa7803150007879548bc709c039df7627cabbd05d469"
301 | 
302 | [[package]]
303 | name = "windows_i686_gnu"
304 | version = "0.52.6"
305 | source = "registry+https://github.com/rust-lang/crates.io-index"
306 | checksum = "8e9b5ad5ab802e97eb8e295ac6720e509ee4c243f69d781394014ebfe8bbfa0b"
307 | 
308 | [[package]]
309 | name = "windows_i686_gnullvm"
310 | version = "0.52.6"
311 | source = "registry+https://github.com/rust-lang/crates.io-index"
312 | checksum = "0eee52d38c090b3caa76c563b86c3a4bd71ef1a819287c19d586d7334ae8ed66"
313 | 
314 | [[package]]
315 | name = "windows_i686_msvc"
316 | version = "0.52.6"
317 | source = "registry+https://github.com/rust-lang/crates.io-index"
318 | checksum = "240948bc05c5e7c6dabba28bf89d89ffce3e303022809e73deaefe4f6ec56c66"
319 | 
320 | [[package]]
321 | name = "windows_x86_64_gnu"
322 | version = "0.52.6"
323 | source = "registry+https://github.com/rust-lang/crates.io-index"
324 | checksum = "147a5c80aabfbf0c7d901cb5895d1de30ef2907eb21fbbab29ca94c5b08b1a78"
325 | 
326 | [[package]]
327 | name = "windows_x86_64_gnullvm"
328 | version = "0.52.6"
329 | source = "registry+https://github.com/rust-lang/crates.io-index"
330 | checksum = "24d5b23dc417412679681396f2b49f3de8c1473deb516bd34410872eff51ed0d"
331 | 
332 | [[package]]
333 | name = "windows_x86_64_msvc"
334 | version = "0.52.6"
335 | source = "registry+https://github.com/rust-lang/crates.io-index"
336 | checksum = "589f6da84c646204747d1270a2a5661ea66ed1cced2631d546fdfb155959f9ec"
337 | 


--------------------------------------------------------------------------------
/src/main.rs:
--------------------------------------------------------------------------------
  1 | // All necessary packages
  2 | use clap::Parser;
  3 | use pulldown_cmark::{CodeBlockKind, Event, Parser as MarkdownParser, Tag};
  4 | use regex::Regex;
  5 | use std::{
  6 |     collections::HashMap,
  7 |     env,
  8 |     fs::{self},
  9 |     io,
 10 |     path::{Path, PathBuf},
 11 | };
 12 | 
 13 | 
 14 | // Utility methods
 15 | /// A fast, zero-config, programmer-first literate programming tool.
 16 | #[derive(Parser, Debug)]
 17 | #[command(author, version, about, long_about = None)]
 18 | struct Cli {
 19 |     /// One or more Markdown files to process
 20 |     #[arg(required = true)]
 21 |     files: Vec<PathBuf>,
 22 | 
 23 |     /// Sets the root output directory for all exported files
 24 |     #[arg(long, short, value_name = "DIRECTORY")]
 25 |     dir: Option<PathBuf>,
 26 | 
 27 |     /// Run in test mode. Compares generated files with their
 28 |     /// on-disk counterparts and reports differences.
 29 |     #[arg(long, short)]
 30 |     test: bool,
 31 | }
 32 | /// Processes a list of markdown files and builds an in-memory map of the
 33 | /// files to be generated, without writing anything to disk.
 34 | fn generate_output_map(
 35 |     paths: &[PathBuf],
 36 |     root_dir: Option<&PathBuf>,
 37 | ) -> HashMap<PathBuf, String> {
 38 |     // First, we'll store all raw chunks in this vector
 39 |     let mut all_chunks = Vec::new();
 40 |     
 41 |     for path in paths.iter() {
 42 |         all_chunks.extend(extract_chunks(path.to_str().unwrap()));
 43 |     }
 44 | 
 45 | 
 46 |     // Next, we create two data structures with all chunks
 47 |     // A map of all named chunks for easy lookup during expansion.
 48 |     let named_chunks_map = create_named_chunk_map(&all_chunks);
 49 |     
 50 |     // A list of the chunks that are marked for export.
 51 |     let exportable_chunks = all_chunks.iter().filter(|chunk| chunk.info.export);
 52 | 
 53 | 
 54 |     // And finally create the final sources
 55 |     // This map will hold the final content for each output file.
 56 |     let mut source_map: HashMap<PathBuf, String> = HashMap::new();
 57 |     
 58 |     // Define the base dir (or use default)
 59 |     let base_dir = root_dir.cloned().unwrap_or_else(|| env::current_dir().unwrap());
 60 |     
 61 |     // Expand all exportable chunks and collect their content into the output map.
 62 |     for chunk in exportable_chunks {
 63 |         // This is the where the recursive magic happens
 64 |         let content = chunk.expand(&named_chunks_map);
 65 |         // This is the path to the file where the chunk should be written.
 66 |         let file_path = base_dir.join(chunk.info.path.as_ref().unwrap());
 67 |         // Append the expanded content of the current chunk to the appropriate file's content in the map.
 68 |         source_map.entry(file_path).or_default().push_str(&content);
 69 |     }
 70 | 
 71 | 
 72 |     return source_map;
 73 | }
 74 | #[derive(Debug, PartialEq)]
 75 | struct Chunk {
 76 |     info: ChunkInfo,
 77 |     content: String,
 78 | }
 79 | 
 80 | #[derive(Debug, PartialEq, Clone)]
 81 | struct ChunkInfo {
 82 |     lang: String,
 83 |     path: Option<String>,
 84 |     name: Option<String>,
 85 |     export: bool,
 86 | }
 87 | fn extract_chunks(file_path: &str) -> Vec<Chunk> {
 88 |     let mut chunks = Vec::new();
 89 |     let content = std::fs::read_to_string(file_path).unwrap();
 90 |     let parser = MarkdownParser::new(&content);
 91 |     let mut in_chunk = false;
 92 | 
 93 |     // list of common language extensions (e.g., .py, .rs, .cpp)
 94 |     let lang_ext = language_extensions();
 95 | 
 96 |     for event in parser {
 97 |         match event {
 98 |             Event::Start(Tag::CodeBlock(kind)) => {
 99 |                 if let CodeBlockKind::Fenced(info_str) = kind {
100 |                     if let Some(info) = parse_info_string(&info_str) {
101 |                         let mut chunk = Chunk {
102 |                             info,
103 |                             content: String::new(),
104 |                         };
105 |                 
106 |                         // For empty export directives, we generate a default path
107 |                         if chunk.info.export && chunk.info.path.is_none() {
108 |                             let file_stem = Path::new(file_path).file_stem().unwrap().to_str().unwrap();
109 |                             let extension = lang_ext.get(chunk.info.lang.as_str()).unwrap_or(&"txt");
110 |                             let default_path = format!("{}.{}", file_stem, extension);
111 |                             chunk.info.path = Some(default_path);
112 |                 
113 |                         }
114 |                 
115 |                         chunks.push(chunk);
116 |                         in_chunk = true;
117 |                     }
118 |                 }
119 | 
120 |             }
121 |             Event::Text(text) => {
122 |                 // If we're inside a code chunk, just take every text
123 |                 // and append it to the last chunk (the one we're inside of)
124 |                 if in_chunk {
125 |                     if let Some(last_chunk) = chunks.last_mut() {
126 |                         last_chunk.content.push_str(&text);
127 |                     }
128 |                 }
129 |             }
130 |             Event::End(_) => {
131 |                 in_chunk = false;
132 |             }
133 |             _ => {} // Nothing else matters
134 |         }
135 |     }
136 | 
137 |     return chunks;
138 | }
139 | fn language_extensions() -> HashMap<&'static str, &'static str> {
140 |     let mut map = HashMap::new();
141 | 
142 |     map.insert("python", "py");
143 |     map.insert("javascript", "js");
144 |     map.insert("java", "java");
145 |     map.insert("csharp", "cs");
146 |     map.insert("cpp", "cpp");
147 |     map.insert("c", "c");
148 |     map.insert("typescript", "ts");
149 |     map.insert("php", "php");
150 |     map.insert("swift", "swift");
151 |     map.insert("ruby", "rb");
152 |     map.insert("go", "go");
153 |     map.insert("kotlin", "kt");
154 |     map.insert("rust", "rs");
155 |     map.insert("r", "r");
156 |     map.insert("matlab", "m");
157 |     map.insert("perl", "pl");
158 |     map.insert("scala", "scala");
159 |     map.insert("objc", "m");
160 |     map.insert("lua", "lua");
161 |     map.insert("dart", "dart");
162 |     map.insert("haskell", "hs");
163 |     map.insert("groovy", "groovy");
164 |     map.insert("elixir", "ex");
165 |     map.insert("julia", "jl");
166 |     map.insert("fsharp", "fs");
167 |     map.insert("clojure", "clj");
168 |     map.insert("erlang", "erl");
169 |     map.insert("assembly", "asm");
170 |     map.insert("sql", "sql");
171 |     map.insert("bash", "sh");
172 | 
173 |     return map;
174 | }
175 | fn parse_info_string(info_string: &str) -> Option<ChunkInfo> {
176 |     // First, capture the language and the rest of the attributes string.
177 |     let lang_re = Regex::new(r"^\s*(?P<lang>\w+)\s*(?P<attrs>.*)$").unwrap();
178 |     let caps = lang_re.captures(info_string)?;
179 |     let lang = caps.name("lang").unwrap().as_str().to_string();
180 |     let attrs_str = caps.name("attrs").unwrap().as_str();
181 | 
182 |     // This regex finds all attributes, handling both `{key=value}` and `{key}` formats.
183 |     let attr_re = Regex::new(r"\{(?P<key>[a-zA-Z\d_]+)(=(?P<value>[^}]+))?\}").unwrap();
184 | 
185 |     let mut path = None;
186 |     let mut name = None;
187 |     let mut export = false;
188 | 
189 |     // Iterate over all attribute matches found in the string.
190 |     for attr_caps in attr_re.captures_iter(attrs_str) {
191 |         let key = attr_caps.name("key").unwrap().as_str();
192 |         // The "value" capture group is now optional.
193 |         let value = attr_caps.name("value");
194 | 
195 |         match key {
196 |             "export" => {
197 |                 export = true;
198 |                 if let Some(val_match) = value {
199 |                     // This was {export=...}, so we have a path.
200 |                     path = Some(val_match.as_str().to_string());
201 |                 }
202 |             }
203 |             "name" => {
204 |                 if let Some(val_match) = value {
205 |                     name = Some(val_match.as_str().to_string());
206 |                 }
207 |                 // Note: {name} without a value is ignored.
208 |             }
209 |             _ => {} // Ignore unknown attributes
210 |         }
211 | 
212 |     }
213 | 
214 |     if path.is_none() && name.is_none() && !export {
215 |         return None;
216 |     }
217 | 
218 |     return Some(ChunkInfo {
219 |         lang: lang,
220 |         path: path,
221 |         name: name,
222 |         export: export,
223 |     });
224 | }
225 | fn create_named_chunk_map(chunks: &[Chunk]) -> HashMap<String, Vec<&Chunk>> {
226 |     let mut chunk_map: HashMap<String, Vec<&Chunk>> = HashMap::new();
227 | 
228 |     for chunk in chunks.iter() {
229 |         if let Some(name) = &chunk.info.name {
230 |             chunk_map.entry(name.clone()).or_default().push(chunk);
231 |         }
232 |     }
233 |     chunk_map
234 | }
235 | impl Chunk {
236 |     /// Public method to start the expansion process.
237 |     /// It initializes the tracking stack for circular dependency checks.
238 |     pub fn expand(&self, named_chunks: &HashMap<String, Vec<&Chunk>>) -> String {
239 |         let mut expansion_stack = Vec::new();
240 |         self.expand_recursive(named_chunks, &mut expansion_stack)
241 |     }
242 | 
243 |     /// Recursively expands the content of this chunk by replacing `<<...>>` references.
244 |     fn expand_recursive(
245 |         &self,
246 |         named_chunks: &HashMap<String, Vec<&Chunk>>,
247 |         expansion_stack: &mut Vec<String>,
248 |     ) -> String {
249 |         // Check for circular dependencies.
250 |         if let Some(name) = &self.info.name {
251 |             if expansion_stack.contains(name) {
252 |                 let error_msg = format!(
253 |                     "\n// ERROR: Circular reference detected for chunk '{}'\n",
254 |                     name
255 |                 );
256 |                 return error_msg;
257 |             }
258 |             expansion_stack.push(name.clone());
259 |         }
260 |     
261 |     
262 |         // This will hold the final expanded chunk
263 |         let mut final_content = String::new();
264 |         // This regex matches lines with a named reference in the form <<...>>
265 |         let include_re = Regex::new(r"^(?P<indent>\s*)<<(?P<name>[\w_.-]+)>>\s*$").unwrap();
266 |     
267 |         for line in self.content.lines() {
268 |             if let Some(caps) = include_re.captures(line) {
269 |                 // This line contains a named reference.
270 |                 let indent_str = caps.name("indent").unwrap().as_str();
271 |                 let name_to_include = caps.name("name").unwrap().as_str();
272 |     
273 |                 match named_chunks.get(name_to_include) {
274 |                     Some(chunks_to_include) => {
275 |                         for chunk in chunks_to_include {
276 |                             // Recursively expand the included chunk.
277 |                             let expanded_include = chunk.expand_recursive(named_chunks, expansion_stack);
278 |                             // Add the captured indentation to each line of the expanded content.
279 |                             for expanded_line in expanded_include.lines() {
280 |                                 final_content.push_str(indent_str);
281 |                                 final_content.push_str(expanded_line);
282 |                                 final_content.push('\n');
283 |                             }
284 |                             final_content.push('\n');
285 |                         }
286 |                     }
287 |                     None => {
288 |                         // Handle missing chunk reference
289 |                         panic!("ERROR: Chunk '{}' not found", name_to_include);
290 |                     }
291 |                 }
292 |     
293 |             } else {
294 |                 // This line doesn't, so add it as is.
295 |                 final_content.push_str(line);
296 |                 final_content.push('\n');
297 |             }
298 |         }
299 |     
300 |         // Some post-process we will need to make circular checks work
301 |         if let Some(name) = &self.info.name {
302 |             if expansion_stack.last() == Some(name) {
303 |                 expansion_stack.pop();
304 |             }
305 |         }
306 |     
307 |     
308 |         return final_content;
309 |     }
310 | 
311 | }
312 | /// Writes the in-memory file map to the disk, overwriting existing files.
313 | fn write_output_to_disk(output_map: &HashMap<PathBuf, String>) -> io::Result<()> {
314 |     for (path, content) in output_map {
315 |         if let Some(parent) = path.parent() {
316 |             fs::create_dir_all(parent)?;
317 |         }
318 |         fs::write(path, content)?;
319 |     }
320 |     Ok(())
321 | }
322 | /// Compares the in-memory file map with files on disk and reports differences.
323 | fn run_test_comparison(output_map: &HashMap<PathBuf, String>) -> bool {
324 |     let mut differences = Vec::new();
325 | 
326 |     for (path, generated_content) in output_map {
327 |         match fs::read_to_string(path) {
328 |             Ok(disk_content) => {
329 |                 if &disk_content != generated_content {
330 |                     differences.push(format!("Content mismatch in {}", path.display()));
331 |                 }
332 |             }
333 |             Err(e) if e.kind() == io::ErrorKind::NotFound => {
334 |                 differences.push(format!("Missing expected file on disk: {}", path.display()));
335 |             }
336 |             Err(e) => {
337 |                 differences.push(format!("Could not read file {}: {}", path.display(), e));
338 |             }
339 |         }
340 | 
341 |     }
342 | 
343 |     // Also check for any files on disk that shouldn't be there (optional but good practice)
344 |     // For now, we'll stick to the core requirement.
345 | 
346 |     if differences.is_empty() {
347 |         println!("✅ All {} generated files are in sync with the disk.", output_map.len());
348 |         return true;
349 |     } else {
350 |         println!("❌ Found {} differences:", differences.len());
351 |         for diff in differences {
352 |             println!("  - {}", diff);
353 |         }
354 |         return false;
355 |     }
356 | }
357 | 
358 | 
359 | fn main() {
360 |     let args = Cli::parse();
361 | 
362 |     // 1. Generate the complete output in memory
363 |     let output_map = generate_output_map(&args.files, args.dir.as_ref());
364 | 
365 |     if args.test {
366 |         // 2a. Run the test logic
367 |         if !run_test_comparison(&output_map) {
368 |             // Exit with a non-zero code to indicate test failure
369 |             std::process::exit(1);
370 |         }
371 | 
372 |     } else {
373 |         // 2b. Run the file writing logic
374 |         match write_output_to_disk(&output_map) {
375 |             Ok(_) => println!("✅ Successfully exported {} file(s).", output_map.len()),
376 |             Err(e) => {
377 |                 eprintln!("🔥 Error writing files to disk: {}", e);
378 |                 std::process::exit(1);
379 |             }
380 |         }
381 | 
382 |     }
383 | }
384 | 
385 | // Lots and lots of unit tests
386 | #[cfg(test)]
387 | mod tests {
388 |     use super::*;
389 | 
390 |     #[test]
391 |     fn test_extract_simple_chunk() {
392 |         let chunks = extract_chunks("tests/simple.md");
393 |     
394 |         assert!(chunks.len() == 1);
395 |     
396 |         let chunk0 = &chunks[0];
397 |         assert!(chunk0.info.lang == "rust");
398 |         assert_eq!(chunk0.info.path, Some("simple.rs".to_string()));
399 |     }
400 |     
401 |     #[test]
402 |     fn test_extract_two_chunks() {
403 |         let chunks = extract_chunks("tests/two_chunks.md");
404 |     
405 |         assert!(chunks.len() == 2);
406 |     
407 |         let chunk0 = &chunks[0];
408 |         assert!(chunk0.info.lang == "python");
409 |         assert!(chunk0.content == "print(\"Hello World\")\n");
410 |         assert_eq!(chunk0.info.name, Some("hello_world".to_string()));
411 |     
412 |         let chunk1 = &chunks[1];
413 |         assert!(chunk1.info.lang == "rust");
414 |         assert!(chunk1.content == "fn main() {\n    // A first chunk\n}\n");
415 |         assert_eq!(chunk1.info.path, Some("main.rs".to_string()));
416 |     }
417 |     
418 |     #[test]
419 |     fn some_yes_some_no() {
420 |         let chunks = extract_chunks("tests/some_yes_some_no.md");
421 |         assert!(chunks.len() == 2);
422 |     }
423 | 
424 |     #[test]
425 |     fn test_full_string_parsing() {
426 |         let info = "rust {export=src/main.rs} {name=chunk_1}";
427 |         let expected = ChunkInfo {
428 |             lang: "rust".to_string(),
429 |             path: Some("src/main.rs".to_string()),
430 |             name: Some("chunk_1".to_string()),
431 |             export: true,
432 |         };
433 |         assert_eq!(parse_info_string(info), Some(expected));
434 |     }
435 |     
436 |     #[test]
437 |     fn test_language_and_name() {
438 |         let info = "python {name=hello_world}";
439 |         let expected = ChunkInfo {
440 |             lang: "python".to_string(),
441 |             path: None,
442 |             name: Some("hello_world".to_string()),
443 |             export: false,
444 |         };
445 |         assert_eq!(parse_info_string(info), Some(expected));
446 |     }
447 |     
448 |     #[test]
449 |     fn test_attributes_in_different_order() {
450 |         let info = "rust {name=chunk_1} {export=src/main.rs}";
451 |         let expected = ChunkInfo {
452 |             lang: "rust".to_string(),
453 |             path: Some("src/main.rs".to_string()),
454 |             name: Some("chunk_1".to_string()),
455 |             export: true,
456 |         };
457 |         assert_eq!(parse_info_string(info), Some(expected));
458 |     }
459 |     
460 |     #[test]
461 |     fn test_language_only() {
462 |         let info = "python";
463 |         assert_eq!(parse_info_string(info), None);
464 |     }
465 |     
466 |     #[test]
467 |     fn test_with_export_only() {
468 |         let info = "javascript {export=app.js}";
469 |         let expected = ChunkInfo {
470 |             lang: "javascript".to_string(),
471 |             path: Some("app.js".to_string()),
472 |             name: None,
473 |             export: true,
474 |         };
475 |         assert_eq!(parse_info_string(info), Some(expected));
476 |     }
477 |     
478 |     #[test]
479 |     fn test_with_headless_export_only() {
480 |         let info = "rust {export}";
481 |         let expected = ChunkInfo {
482 |             lang: "rust".to_string(),
483 |             path: None,
484 |             name: None,
485 |             export: true,
486 |         };
487 |         assert_eq!(parse_info_string(info), Some(expected));
488 |     }
489 |     
490 |     #[test]
491 |     fn test_headless_export_with_name() {
492 |         let info = "rust {name=my_frag} {export}";
493 |         let expected = ChunkInfo {
494 |             lang: "rust".to_string(),
495 |             path: None,
496 |             name: Some("my_frag".to_string()),
497 |             export: true,
498 |         };
499 |         assert_eq!(parse_info_string(info), Some(expected));
500 |     }
501 |     
502 |     #[test]
503 |     fn test_with_name_only() {
504 |         let info = "rust {name=my_fragment}";
505 |         let expected = ChunkInfo {
506 |             lang: "rust".to_string(),
507 |             path: None,
508 |             name: Some("my_fragment".to_string()),
509 |             export: false,
510 |         };
511 |         assert_eq!(parse_info_string(info), Some(expected));
512 |     }
513 |     
514 |     #[test]
515 |     fn test_with_extra_whitespace() {
516 |         let info = "  bash   {export=run.sh}  ";
517 |         let expected = ChunkInfo {
518 |             lang: "bash".to_string(),
519 |             path: Some("run.sh".to_string()),
520 |             name: None,
521 |             export: true,
522 |         };
523 |         assert_eq!(parse_info_string(info), Some(expected));
524 |     }
525 |     
526 |     #[test]
527 |     fn test_no_match_for_invalid_format() {
528 |         let info = "{invalid_format}";
529 |         assert_eq!(parse_info_string(info), None);
530 |     }
531 |     
532 |     #[test]
533 |     fn test_empty_string() {
534 |         let info = "";
535 |         assert_eq!(parse_info_string(info), None);
536 |     }
537 | 
538 |     // ERROR: Chunk 'tests_build_chunk_map' not found
539 |     // ERROR: Chunk 'tests_expand_chunk' not found
540 | }
541 | 


--------------------------------------------------------------------------------
/docs/illiterate.md:
--------------------------------------------------------------------------------
  1 | # Illiterate
  2 | 
  3 | `illiterate` is a tool to enable literate programming.
  4 | It extracts code from Markdown files and exports it to source files.
  5 | 
  6 | The core of `illiterate` is a simple parser that reads Markdown files and extracts code blocks, building a web of interconnected chunks that reference other chunks.
  7 | 
  8 | Here is the main loop. First we parse the command line arguments, then we generate the complete output map (that contains all parsed chunks and relations), and finally we either run the test logic or the file writing logic.
  9 | 
 10 | ```rust {export=main.rs}
 11 | // All necessary packages
 12 | <<packages>>
 13 | 
 14 | // Utility methods
 15 | <<utilities>>
 16 | 
 17 | fn main() {
 18 |     let args = Cli::parse();
 19 | 
 20 |     // 1. Generate the complete output in memory
 21 |     let output_map = generate_output_map(&args.files, args.dir.as_ref());
 22 | 
 23 |     if args.test {
 24 |         // 2a. Run the test logic
 25 |         <<test_source>>
 26 |     } else {
 27 |         // 2b. Run the file writing logic
 28 |         <<generate_source>>
 29 |     }
 30 | }
 31 | 
 32 | // Lots and lots of unit tests
 33 | <<tests>>
 34 | ```
 35 | 
 36 | The comparison logic is quite simple. We just compare the output map with the expected output map. This will print all differences between the two maps and exit with a non-zero output if there is any difference.
 37 | 
 38 | ```rust {name=test_source}
 39 | if !run_test_comparison(&output_map) {
 40 |     // Exit with a non-zero code to indicate test failure
 41 |     std::process::exit(1);
 42 | }
 43 | ```
 44 | 
 45 | The writing logic is also quite simple. We just write all files to disk and print a success message. If there is an error, we print an error message and exit with a non-zero output.
 46 | 
 47 | ```rust {name=generate_source}
 48 | match write_output_to_disk(&output_map) {
 49 |     Ok(_) => println!("✅ Successfully exported {} file(s).", output_map.len()),
 50 |     Err(e) => {
 51 |         eprintln!("🔥 Error writing files to disk: {}", e);
 52 |         std::process::exit(1);
 53 |     }
 54 | }
 55 | ```
 56 | 
 57 | Now let's get into the meat of the logic.
 58 | 
 59 | ## The CLI
 60 | 
 61 | First, let's build the command line interface. We will use the `clap` crate for this. First, we need to include it in the packages.
 62 | 
 63 | ```rust {name=packages}
 64 | use clap::Parser;
 65 | ```
 66 | 
 67 | And then we define the `Cli` struct that represents the input. Here we are using Rust's powerful macro system to construct the CLI directly from type annotations.
 68 | 
 69 | ```rust {name=utilities}
 70 | /// A fast, zero-config, programmer-first literate programming tool.
 71 | #[derive(Parser, Debug)]
 72 | #[command(author, version, about, long_about = None)]
 73 | struct Cli {
 74 |     /// One or more Markdown files to process
 75 |     #[arg(required = true)]
 76 |     files: Vec<PathBuf>,
 77 | 
 78 |     /// Sets the root output directory for all exported files
 79 |     #[arg(long, short, value_name = "DIRECTORY")]
 80 |     dir: Option<PathBuf>,
 81 | 
 82 |     /// Run in test mode. Compares generated files with their
 83 |     /// on-disk counterparts and reports differences.
 84 |     #[arg(long, short)]
 85 |     test: bool,
 86 | }
 87 | ```
 88 | 
 89 | ## The Main Loop
 90 | 
 91 | The core functionality in `illiterate` can be split into two big tasks:
 92 | 
 93 | 1. building a map of all the code chunks in the input files, and;
 94 | 2. using that map to construct the output files.
 95 | 
 96 | The first part is relatively straightforward. We will need a Markdown parser to read the input files, and a way to store the code chunks. The second part will be slightly more complicated because chunks can reference other chunks, so the expansion of a chunk into actual code requires a recursive traversal of the chunk graph.
 97 | 
 98 | The top-level functionality that encapsulates all this process in the method `generate_output_map`. Let's take a look at the big picture, and then dive into the actual implementation.
 99 | 
100 | ```rust {name=utilities}
101 | /// Processes a list of markdown files and builds an in-memory map of the
102 | /// files to be generated, without writing anything to disk.
103 | fn generate_output_map(
104 |     paths: &[PathBuf],
105 |     root_dir: Option<&PathBuf>,
106 | ) -> HashMap<PathBuf, String> {
107 |     // First, we'll store all raw chunks in this vector
108 |     <<extract_all_chunks>>
109 | 
110 |     // Next, we create two data structures with all chunks
111 |     <<create_data_structures>>
112 | 
113 |     // And finally create the final sources
114 |     <<synthesize_sources>>
115 | 
116 |     return source_map;
117 | }
118 | ```
119 | 
120 | The process to extract all code chunks requires iterating over each input file, performing the extraction of chunks in that file, and collecting all chunks in a global vector. We will look at the `extract_chunks` function in short, which is where the core of the work happens.
121 | 
122 | ```rust {name=extract_all_chunks}
123 | let mut all_chunks = Vec::new();
124 | 
125 | for path in paths.iter() {
126 |     all_chunks.extend(extract_chunks(path.to_str().unwrap()));
127 | }
128 | ```
129 | 
130 | Once we have all chunks, we need two data structures. The first is a hash-based map of all named chunks (those that have a `name` attribute). In this structure we will associated to a single name all the chunks declared with that name, so we can later expand them in sequence.
131 | 
132 | The second structure is a simple list of all chunks that have an `export` directive, which will be processed also in sequence.
133 | 
134 | ```rust {name=create_data_structures}
135 | // A map of all named chunks for easy lookup during expansion.
136 | let named_chunks_map = create_named_chunk_map(&all_chunks);
137 | 
138 | // A list of the chunks that are marked for export.
139 | let exportable_chunks = all_chunks.iter().filter(|chunk| chunk.info.export);
140 | ```
141 | 
142 | The final part is to synthesize the tangled source files. This is done by iterating over all exportable chunks and expanding them. The expanded content is then appended to the appropriate file's content in the map. This creates an in-memory representation of the tangled source, that will later be either compared to filesystem source files (if `--test` is passed) or written to disk.
143 | 
144 | ```rust {name=synthesize_sources}
145 | // This map will hold the final content for each output file.
146 | let mut source_map: HashMap<PathBuf, String> = HashMap::new();
147 | 
148 | // Define the base dir (or use default)
149 | let base_dir = root_dir.cloned().unwrap_or_else(|| env::current_dir().unwrap());
150 | 
151 | // Expand all exportable chunks and collect their content into the output map.
152 | for chunk in exportable_chunks {
153 |     // This is the where the recursive magic happens
154 |     let content = chunk.expand(&named_chunks_map);
155 |     // This is the path to the file where the chunk should be written.
156 |     let file_path = base_dir.join(chunk.info.path.as_ref().unwrap());
157 |     // Append the expanded content of the current chunk to the appropriate file's content in the map.
158 |     source_map.entry(file_path).or_default().push_str(&content);
159 | }
160 | ```
161 | 
162 | Now that we have the big picture ready, let's take a look at the specifics.
163 | 
164 | ## Extracting chunks
165 | 
166 | The first step in the main loop is to find all code chunks and build the two data structures explained before. Let's look at that code next.
167 | 
168 | We will begin by examining the code that extracts all chunks from a single source file.
169 | 
170 | We can use the markdown parser from `pulldown-cmark` to easily identify all relevant code snippets in a given source file. For this, we will need to first import relevant packages (remember we already have a `Parser` from `clap` imported).
171 | 
172 | ```rust {name=packages}
173 | use pulldown_cmark::{CodeBlockKind, Event, Parser as MarkdownParser, Tag};
174 | ```
175 | 
176 | The most important type here is `Parser`, which we rename as `MarkdownParser`. For the parsing method itself, we take advantage of `pulldown-cmark` event-based parsing. The core of the method is to iterate over all parsing events, and identify the ones we need to process.
177 | 
178 | ### The Chunk Struct
179 | 
180 | Here is the `Chunk` structure we want to produce, and the related `ChunkInfo`, which represent all the information we need about a code chunk.
181 | 
182 | ```rust {name=utilities}
183 | #[derive(Debug, PartialEq)]
184 | struct Chunk {
185 |     info: ChunkInfo,
186 |     content: String,
187 | }
188 | 
189 | #[derive(Debug, PartialEq, Clone)]
190 | struct ChunkInfo {
191 |     lang: String,
192 |     path: Option<String>,
193 |     name: Option<String>,
194 |     export: bool,
195 | }
196 | ```
197 | 
198 | ### The Extract Chunks Method
199 | 
200 | And here is, finally, the method that extracts all chunks from a single markdown file.
201 | 
202 | This method only cares when a code block begins and ends, to gather the necessary info (stored in a `ChunkInfo`) all the text in-between. The begining of a code block is rather complicated because we need to parse all attributes, so we'll leave that out for a minute, and take a look at the high-level code.
203 | 
204 | ```rust {name=utilities}
205 | fn extract_chunks(file_path: &str) -> Vec<Chunk> {
206 |     let mut chunks = Vec::new();
207 |     let content = std::fs::read_to_string(file_path).unwrap();
208 |     let parser = MarkdownParser::new(&content);
209 |     let mut in_chunk = false;
210 | 
211 |     // list of common language extensions (e.g., .py, .rs, .cpp)
212 |     let lang_ext = language_extensions();
213 | 
214 |     for event in parser {
215 |         match event {
216 |             Event::Start(Tag::CodeBlock(kind)) => {
217 |                 <<process_start_code_block>>
218 |             }
219 |             Event::Text(text) => {
220 |                 // If we're inside a code chunk, just take every text
221 |                 // and append it to the last chunk (the one we're inside of)
222 |                 if in_chunk {
223 |                     if let Some(last_chunk) = chunks.last_mut() {
224 |                         last_chunk.content.push_str(&text);
225 |                     }
226 |                 }
227 |             }
228 |             Event::End(_) => {
229 |                 in_chunk = false;
230 |             }
231 |             _ => {} // Nothing else matters
232 |         }
233 |     }
234 | 
235 |     return chunks;
236 | }
237 | ```
238 | 
239 | The thing we care most about at the begining of a code block is the info string which contains the language name (e.g., rust) and all possible attributes, like `export` and `name`.
240 | 
241 | The parsing of this info string is performed in the aptly named `parse_info_string`, which we'll look next. But with that in place, the only significant complexity remaining is to generate a default path for empty `{export}` directives.
242 | 
243 | ```rust {name=process_start_code_block}
244 | if let CodeBlockKind::Fenced(info_str) = kind {
245 |     if let Some(info) = parse_info_string(&info_str) {
246 |         let mut chunk = Chunk {
247 |             info,
248 |             content: String::new(),
249 |         };
250 | 
251 |         // For empty export directives, we generate a default path
252 |         if chunk.info.export && chunk.info.path.is_none() {
253 |             <<generate_default_path>>
254 |         }
255 | 
256 |         chunks.push(chunk);
257 |         in_chunk = true;
258 |     }
259 | }
260 | ```
261 | 
262 | Generating a default path is basically a bit of Rust golf to find the stem of the current markdown file (the part of the filename without the extension), and append the corresponding from a predefined hash map we hard-coded.
263 | 
264 | ```rust {name=generate_default_path}
265 | let file_stem = Path::new(file_path).file_stem().unwrap().to_str().unwrap();
266 | let extension = lang_ext.get(chunk.info.lang.as_str()).unwrap_or(&"txt");
267 | let default_path = format!("{}.{}", file_stem, extension);
268 | chunk.info.path = Some(default_path);
269 | ```
270 | 
271 | And just for completion, here is the hard-coded hash map.
272 | 
273 | ```rust {name=utilities}
274 | fn language_extensions() -> HashMap<&'static str, &'static str> {
275 |     let mut map = HashMap::new();
276 | 
277 |     map.insert("python", "py");
278 |     map.insert("javascript", "js");
279 |     map.insert("java", "java");
280 |     map.insert("csharp", "cs");
281 |     map.insert("cpp", "cpp");
282 |     map.insert("c", "c");
283 |     map.insert("typescript", "ts");
284 |     map.insert("php", "php");
285 |     map.insert("swift", "swift");
286 |     map.insert("ruby", "rb");
287 |     map.insert("go", "go");
288 |     map.insert("kotlin", "kt");
289 |     map.insert("rust", "rs");
290 |     map.insert("r", "r");
291 |     map.insert("matlab", "m");
292 |     map.insert("perl", "pl");
293 |     map.insert("scala", "scala");
294 |     map.insert("objc", "m");
295 |     map.insert("lua", "lua");
296 |     map.insert("dart", "dart");
297 |     map.insert("haskell", "hs");
298 |     map.insert("groovy", "groovy");
299 |     map.insert("elixir", "ex");
300 |     map.insert("julia", "jl");
301 |     map.insert("fsharp", "fs");
302 |     map.insert("clojure", "clj");
303 |     map.insert("erlang", "erl");
304 |     map.insert("assembly", "asm");
305 |     map.insert("sql", "sql");
306 |     map.insert("bash", "sh");
307 | 
308 |     return map;
309 | }
310 | ```
311 | 
312 | ### The Parse Info String Method
313 | 
314 | To finish off this part of the functionality, we'll end by looking at the `parse_info_string` method.
315 | 
316 | The core of this method is using regular expressions to first, split the info string into pieces, and then parsing each of those pieces. For example, consider a code block like the following:
317 | 
318 |     ```rust {export=main.rs}
319 |     ```
320 | 
321 | The info string will be `rust {export=main.rs}`. Our method needs to split it into `rust` and `export=main.rs`, to later extract the `main.rs` string from the `export` attribute.
322 | 
323 | There is no much else we can say about this rather than showing the entire method, so buckle up.
324 | 
325 | ```rust {name=utilities}
326 | fn parse_info_string(info_string: &str) -> Option<ChunkInfo> {
327 |     // First, capture the language and the rest of the attributes string.
328 |     let lang_re = Regex::new(r"^\s*(?P<lang>\w+)\s*(?P<attrs>.*)$").unwrap();
329 |     let caps = lang_re.captures(info_string)?;
330 |     let lang = caps.name("lang").unwrap().as_str().to_string();
331 |     let attrs_str = caps.name("attrs").unwrap().as_str();
332 | 
333 |     // This regex finds all attributes, handling both `{key=value}` and `{key}` formats.
334 |     let attr_re = Regex::new(r"\{(?P<key>[a-zA-Z\d_]+)(=(?P<value>[^}]+))?\}").unwrap();
335 | 
336 |     let mut path = None;
337 |     let mut name = None;
338 |     let mut export = false;
339 | 
340 |     // Iterate over all attribute matches found in the string.
341 |     for attr_caps in attr_re.captures_iter(attrs_str) {
342 |         let key = attr_caps.name("key").unwrap().as_str();
343 |         // The "value" capture group is now optional.
344 |         let value = attr_caps.name("value");
345 | 
346 |         <<extract_attribute_values>>
347 |     }
348 | 
349 |     if path.is_none() && name.is_none() && !export {
350 |         return None;
351 |     }
352 | 
353 |     return Some(ChunkInfo {
354 |         lang: lang,
355 |         path: path,
356 |         name: name,
357 |         export: export,
358 |     });
359 | }
360 | ```
361 | 
362 | And here is the missing part to extract attribute values. For now, we only accept the `export` and `name` attributes.
363 | 
364 | ```rust {name=extract_attribute_values}
365 | match key {
366 |     "export" => {
367 |         export = true;
368 |         if let Some(val_match) = value {
369 |             // This was {export=...}, so we have a path.
370 |             path = Some(val_match.as_str().to_string());
371 |         }
372 |     }
373 |     "name" => {
374 |         if let Some(val_match) = value {
375 |             name = Some(val_match.as_str().to_string());
376 |         }
377 |         // Note: {name} without a value is ignored.
378 |     }
379 |     _ => {} // Ignore unknown attributes
380 | }
381 | ```
382 | 
383 | Of course, we can't forget to import the `regex` package.
384 | 
385 | ```rust {name=packages}
386 | use regex::Regex;
387 | ```
388 | 
389 | Putting all of the above together, we now have the ability to extract code chunks from all markdown files, and store them in a huge list of `Chunk` objects. Let's move to sorting them.
390 | 
391 | ### Sorting Chunks
392 | 
393 | Once we have all code chunks, we need to separate them into two lists. One is simply the list of chunks with an `export` directive, which will already did. The other is what I call a `named_chunk_map`, which is a hash map of strings to the list of all chunks named with that key. This is necessary because we later want to concatenate all those equally-named chunks in a single blob.
394 | 
395 | This is a straightforward method that simply iterates over all chunks with a `name=...` directive and appends them to the corresponding key.
396 | 
397 | ```rust {name=utilities}
398 | fn create_named_chunk_map(chunks: &[Chunk]) -> HashMap<String, Vec<&Chunk>> {
399 |     let mut chunk_map: HashMap<String, Vec<&Chunk>> = HashMap::new();
400 | 
401 |     for chunk in chunks.iter() {
402 |         if let Some(name) = &chunk.info.name {
403 |             chunk_map.entry(name.clone()).or_default().push(chunk);
404 |         }
405 |     }
406 |     chunk_map
407 | }
408 | ```
409 | 
410 | ## Expanding Chunks
411 | 
412 | Once we have all chunks correctly sorted out (the exportable chunks in a flat list, and the named chunks in a map grouped by name), we can start to tangle the output files.
413 | 
414 | The basic idea, as we already saw, is to go over all exportable chunks, and build the corresponding output files by expanding all references to named chunks, recursively. This will create an in-memory representation of all output files with the flat, tangled content that ultimately must either be written to disk or compared with the existing sources in disk.
415 | 
416 | The only piece left is the recursive expansion of named chunks. The main method is shown below:
417 | 
418 | 
419 | ```rust {name=utilities}
420 | impl Chunk {
421 |     /// Public method to start the expansion process.
422 |     /// It initializes the tracking stack for circular dependency checks.
423 |     pub fn expand(&self, named_chunks: &HashMap<String, Vec<&Chunk>>) -> String {
424 |         let mut expansion_stack = Vec::new();
425 |         self.expand_recursive(named_chunks, &mut expansion_stack)
426 |     }
427 | 
428 |     <<expand_recursive>>
429 | }
430 | ```
431 | 
432 | And here is the recursive implementation:
433 | 
434 | ```rust {name=expand_recursive}
435 | /// Recursively expands the content of this chunk by replacing `<<...>>` references.
436 | fn expand_recursive(
437 |     &self,
438 |     named_chunks: &HashMap<String, Vec<&Chunk>>,
439 |     expansion_stack: &mut Vec<String>,
440 | ) -> String {
441 |     // Check for circular dependencies.
442 |     <<check_circular_dependencies_in>>
443 | 
444 |     // This will hold the final expanded chunk
445 |     let mut final_content = String::new();
446 |     // This regex matches lines with a named reference in the form <<...>>
447 |     let include_re = Regex::new(r"^(?P<indent>\s*)<<(?P<name>[\w_.-]+)>>\s*$").unwrap();
448 | 
449 |     for line in self.content.lines() {
450 |         if let Some(caps) = include_re.captures(line) {
451 |             // This line contains a named reference.
452 |             let indent_str = caps.name("indent").unwrap().as_str();
453 |             let name_to_include = caps.name("name").unwrap().as_str();
454 | 
455 |             <<process_named_reference>>
456 |         } else {
457 |             // This line doesn't, so add it as is.
458 |             final_content.push_str(line);
459 |             final_content.push('\n');
460 |         }
461 |     }
462 | 
463 |     // Some post-process we will need to make circular checks work
464 |     <<check_circular_dependencies_out>>
465 | 
466 |     return final_content;
467 | }
468 | ```
469 | 
470 | The recursive expansion itself is not that complicated once we have everything in place. We just need to iterate over all named chunks associated with the corresponding name and call their `expand_recursive` method.
471 | 
472 | The `expansion_stack` parameter is what helps us keep track of the current call stack, so we never enter a loop. We will see that next.
473 | 
474 | ```rust {name=process_named_reference}
475 | match named_chunks.get(name_to_include) {
476 |     Some(chunks_to_include) => {
477 |         for chunk in chunks_to_include {
478 |             // Recursively expand the included chunk.
479 |             let expanded_include = chunk.expand_recursive(named_chunks, expansion_stack);
480 |             // Add the captured indentation to each line of the expanded content.
481 |             for expanded_line in expanded_include.lines() {
482 |                 final_content.push_str(indent_str);
483 |                 final_content.push_str(expanded_line);
484 |                 final_content.push('\n');
485 |             }
486 |             final_content.push('\n');
487 |         }
488 |     }
489 |     None => {
490 |         // Handle missing chunk reference
491 |         panic!("ERROR: Chunk '{}' not found", name_to_include);
492 |     }
493 | }
494 | ```
495 | 
496 | So the only piece left is to check for circular dependencies. We keep a stack of the current call stack and check if we are trying to expand a chunk that is already in the stack. If so, we panic.
497 | 
498 | ```rust {name=check_circular_dependencies_in}
499 | if let Some(name) = &self.info.name {
500 |     if expansion_stack.contains(name) {
501 |         let error_msg = format!(
502 |             "\n// ERROR: Circular reference detected for chunk '{}'\n",
503 |             name
504 |         );
505 |         return error_msg;
506 |     }
507 |     expansion_stack.push(name.clone());
508 | }
509 | ```
510 | 
511 | And at the end of the method, we need to add this piece of code to pop the chunk from the stack.
512 | 
513 | ```rust {name=check_circular_dependencies_out}
514 | if let Some(name) = &self.info.name {
515 |     if expansion_stack.last() == Some(name) {
516 |         expansion_stack.pop();
517 |     }
518 | }
519 | ```
520 | 
521 | With this method in place, we are basically done. All that's left is a couple utility methods we called from `main`.
522 | 
523 | ## Final Utilities
524 | 
525 | The first such utility method is simply to write the in-memory file map to the disk.
526 | 
527 | ```rust {name=utilities}
528 | /// Writes the in-memory file map to the disk, overwriting existing files.
529 | fn write_output_to_disk(output_map: &HashMap<PathBuf, String>) -> io::Result<()> {
530 |     for (path, content) in output_map {
531 |         if let Some(parent) = path.parent() {
532 |             fs::create_dir_all(parent)?;
533 |         }
534 |         fs::write(path, content)?;
535 |     }
536 |     Ok(())
537 | }
538 | ```
539 | 
540 | And the other missing method is the one that compares the in-memory file map with the files on disk. This is useful for CI/CD to make sure we don't commit markdown files that are out of sync with the tangled source code.
541 | 
542 | ```rust {name=utilities}
543 | /// Compares the in-memory file map with files on disk and reports differences.
544 | fn run_test_comparison(output_map: &HashMap<PathBuf, String>) -> bool {
545 |     let mut differences = Vec::new();
546 | 
547 |     for (path, generated_content) in output_map {
548 |         <<check_generated_content>>
549 |     }
550 | 
551 |     // Also check for any files on disk that shouldn't be there (optional but good practice)
552 |     // For now, we'll stick to the core requirement.
553 | 
554 |     if differences.is_empty() {
555 |         println!("✅ All {} generated files are in sync with the disk.", output_map.len());
556 |         return true;
557 |     } else {
558 |         println!("❌ Found {} differences:", differences.len());
559 |         for diff in differences {
560 |             println!("  - {}", diff);
561 |         }
562 |         return false;
563 |     }
564 | }
565 | ```
566 | 
567 | The core of this method is of course checking the generated content against the on-disk source. THere three posibilities: the file is missing, the file is present but the content is different, or the file is present and the content is the same. We need to handle all three cases.
568 | 
569 | ```rust {name=check_generated_content}
570 | match fs::read_to_string(path) {
571 |     Ok(disk_content) => {
572 |         if &disk_content != generated_content {
573 |             differences.push(format!("Content mismatch in {}", path.display()));
574 |         }
575 |     }
576 |     Err(e) if e.kind() == io::ErrorKind::NotFound => {
577 |         differences.push(format!("Missing expected file on disk: {}", path.display()));
578 |     }
579 |     Err(e) => {
580 |         differences.push(format!("Could not read file {}: {}", path.display(), e));
581 |     }
582 | }
583 | ```
584 | 
585 | And finally, here are some missing packages we've been assuming were imported.
586 | 
587 | 
588 | ```rust {name=packages}
589 | use std::{
590 |     collections::HashMap,
591 |     env,
592 |     fs::{self},
593 |     io,
594 |     path::{Path, PathBuf},
595 | };
596 | ```
597 | 
598 | And that's it. The whole of `illiterate` in close to 600 lines of literate programming. Enjoy!
599 | 


--------------------------------------------------------------------------------