├── .gitignore
├── README.md
├── c-embed.c
├── c-embed.h
├── examples
│   └── 0_fgets
│       ├── data
│       │   ├── data.txt
│       │   └── data2
│       │       └── data2.txt
│       ├── main
│       ├── main.c
│       └── makefile
└── makefile
/.gitignore:
--------------------------------------------------------------------------------
1 | **/c-embed.o
2 | **/main
3 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # c-embed
2 |
3 | naturally embed read-only file/filesystem snapshots into any C99 program with a single header, **zero dependencies** and **zero modifications** to your code at build time.
4 |
5 | works with `C99` to `C++20` compilers
6 |
7 | ## usage
8 |
9 | `c-embed` allows you to embed static, read-only snapshots of full filesystems into your C executable with almost zero effort.
10 |
11 | `c-embed` builds an object file containing the static file system image using the embedder binary `c-embed`, which outputs the name of the embedded filesystem file (default: `c-embed.o`):
12 |
13 | c-embed myfile.txt
...
14 | > c-embed.o
15 |
16 | *Note: Specifying a directory will recursively add all files (including in subdirectories) to the virtual file system.*
17 |
18 | The single `c-embed.h` header exposes a `stdio.h`-style interface for accessing the embedded file system, which you simply link as an object file:
19 |
20 | /* main.c */
21 |
22 | #include <c-embed.h>
23 |
24 | int main(){
25 |
26 | EFILE * pFile = eopen ("myfile.txt" , "r");
27 |
28 | //...
29 |
30 | eclose(pFile);
31 | return 0;
32 |
33 | }
34 |
35 | **build**
36 |
37 | gcc main.c c-embed.o -o main
38 |
39 | *Note: All files in the virtual filesystem are located relative to the working directory where the `c-embed` binary is executed.*
40 |
41 | ### building / installation
42 |
43 | Build the `c-embed` executable and install the single header:
44 |
45 | sudo make all
46 |
47 | ### c-embed stdio.h interface
48 |
49 | The `stdio.h` style interface for virtual filesystems implemented in `c-embed` contains the following:
50 |
51 | EFILE - embedded file pointer
52 | eopen - open embedded file
53 | eclose - close embedded file pointer
54 | eeof - end of file check
55 | egets - get string until eof or new line
56 | egetc - get current stream char promoted to int
57 | eerror - prints a descriptive error
58 | eread - read to memory address
59 | eseek - seek a position in the file
60 | etell - get the position in the file
61 | erewind - reset the position to the start of the file
62 |
63 | in full analogy to their regular [`stdio.h` counterparts](https://cplusplus.com/reference/cstdio/).
64 |
65 | ### zero-modification embedding
66 |
67 | If you use `stdio.h` for your filesystem-io, then your code requires **zero** modifications to use `c-embed` for embedding virtual file system snapshots. To facilitate this, `c-embed` only requires a minor modification to your build process:
68 |
69 | #include <stdio.h>
70 |
71 | int main(int argc, char* args[]){
72 |
73 | FILE* eFile = fopen("data/data.txt", "r");
74 |
75 | char buffer [100] = {' '};
76 |
77 | if (eFile == NULL)
78 | perror ("Error opening file");
79 |
80 | else while(!feof(eFile)){
81 | if( fgets(buffer, 100, eFile) == NULL ) break;
82 | fputs (buffer , stdout);
83 | }
84 |
85 | }
86 |
87 | Makefile - build with live file system:
88 |
89 | all:
90 | gcc main.c -o main
91 |
92 | Makefile - build with embedded virtual file system:
93 |
94 | DAT = data_path_1 data_path_2 # files to embed
95 | CEF = -DCEMBED_TRANSLATE -include /usr/local/include/c-embed.h
96 |
97 | all:
98 | gcc main.c $(shell c-embed $(DAT)) $(CEF) -o main
99 |
100 | This allows for seamless transitioning between development and deployment builds without needing to modify the code at all.
101 |
102 | *Note: The `CEMBED_TRANSLATE` preprocessor definition translates the `stdio.h` interface into the `c-embed` interface.*
103 |
104 | ## how it works
105 |
106 | The `c-embed` binary builds two files, containing a filesystem indexing structure and a concatenation of the files we wish to embed.
107 |
108 | The indexing structure uses a simple hash function to turn path strings into keys and store positions of binary data in the concatenated filesystem file.
109 |
110 | The binary then uses `objcopy` to turn these two files into accessible objects, which we can easily link with our main executable and access via defined symbols.
111 |
112 | The `c-embed.h` header then includes these symbols (one set of symbols for the indexing structure, one set for the file system) and implements the `stdio.h` style interface to interpret the data and access the embedded files.
113 |
114 | ### other details
115 |
116 | The filesystem remains static in the object file, meaning that the file system need not exist as long as `c-embed.o` exists.
117 |
118 | You can generate a static file system in one place and copy the `c-embed.o` file somewhere else and still have accessors given by the relative paths from the place where it was embedded.
119 |
120 | The `c-embed.o` file represents the filesystem snapshot, and compiling your executable by linking this file embeds the snapshot.
121 |
122 | ## Motivation
123 |
124 | Shipping data with programs is difficult when the goal is to distribute a single executable. For instance, shader programs for graphical applications are linked and compiled at runtime and therefore the data needs to be loaded in live.
125 |
126 | A number of strategies exist to circumvent this problem, each with their own set of drawbacks which `c-embed` avoids.
127 |
128 | ### Strategies
129 |
130 | 1. Embed as string / const char* literal
131 |
132 | This is trash for obvious reasons. It requires the manual declaration, tracking and working with symbols. The contents have no syntax highlighting and are difficult to edit and swap out without a complex toolchain.
133 |
134 | 2. Base64-Encode and Embed as String Literal
135 |
136 | A number of other projects on GitHub use this approach for binary data. It is identical to the previous strategy.
137 |
138 | Both of these solutions require modification / cluttering of your code with additional `#include` directives or symbols.
139 |
140 | 3. ASCII Files: `#include` as string literal
141 |
142 | const std::string file_content =
143 | #include "file"
144 | ;
145 |
146 | This is more flexible as it allows the data to remain in a separate file, albeit with the raw string delimiters `R"(` and `)"` added to that file, which is ugly. This does not work for binary data. Note that here we are actually also managing file names, so changes to names or relative paths require modification of code. Additionally, *relative file paths must exist at compile time*.
147 |
148 | ### Solution
149 |
150 | The goal of this tiny, single-header library is to leverage the C preprocessor to move the entire file system configuration to the build process and out of your code.
151 |
152 | The `c-embed` binary uses objcopy to generate a single object file containing the relevant symbols for accessing the binary-encoded data.
153 |
154 | Finally, an abstract `stdio.h` style interface is provided for retrieving the data with proper error handling.
155 |
156 | The result is that `c-embed` does not strictly require that the file system itself exists at compile time or even relative to the build system; only at some point, somewhere.
157 |
158 | Files stay as the files which they are, and can thus be manipulated appropriately by your editor of choice, with the embedding occurring at build time.
159 |
160 | ## Future Work
161 |
162 | It would be interesting to consider if the file system could be made (temporarily) writeable in RAM. But this is beyond the purpose of this library.
163 |
164 | Other necessary improvements include:
165 | - active hash collision detection during embedding
166 | - somehow make it possible to link multiple file systems simultaneously
167 | - add an embed file inspection tool (this requires including metadata in the future)
168 |
169 | ## License
170 |
171 | MIT License
172 |
--------------------------------------------------------------------------------
/c-embed.c:
--------------------------------------------------------------------------------
1 | /*
2 | # c-embed
3 | # embed virtual file systems into a C program
4 | # - at build time
5 | # - with zero dependencies
6 | # - with zero code modifications
7 | # - with zero clutter in your program
8 | # author: nicholas mcdonald 2022
9 | */
10 |
11 | #define CEMBED_BUILD
12 |
13 | #include "c-embed.h"
14 | #include <dirent.h>
15 | #include <string.h>
16 |
17 | #define CEMBED_FILE "c-embed.o" // Output File
18 | #define CEMBED_TMPDIR "cembed_tmp" // Temporary Directory
19 | #define CEMBED_ARCH "elf64-x86-64" // Target Architecture
20 |
21 | FILE* ms = NULL; // Mapping Structure
22 | FILE* fs = NULL; // Virtual Filesystem
23 | FILE* file = NULL; // Embed Target File Pointer
24 | u_int32_t pos = 0; // Current Position
25 |
26 | void cembed(char* filename){
27 |
28 | file = fopen(filename, "rb"); // Open the Embed Target File
29 | if(file == NULL){
30 | fprintf(stderr, "Failed to open file %s.\n", filename); // stdout is reserved for the output file name
31 | return;
32 | }
33 | fseek(file, 0, SEEK_END); // Define Map
34 | EMAP map = {hash(filename), pos, (u_int32_t)ftell(file)};
35 | rewind (file);
36 |
37 | char* buf = malloc(sizeof(char)*(map.size));
38 | if(buf == NULL){
39 | fprintf(stderr, "Memory error for file %s.\n", filename);
40 | fclose(file); // Do not leak the file handle
41 | return;
42 | }
43 | u_int32_t result = fread(buf, 1, map.size, file);
44 | if(result != map.size){
45 | fprintf(stderr, "Read error for file %s.\n", filename);
46 | free(buf); fclose(file); // Release resources on failure
47 | return;
48 | }
49 |
50 | fwrite(&map, sizeof map, 1, ms); // Write Mapping Structure
51 | fwrite(buf, map.size, 1, fs); // Write Virtual Filesystem
52 |
53 | free(buf); // Free Buffer
54 | fclose(file); // Close the File
55 | file = NULL; // Reset the Pointer
56 | pos += map.size; // Shift the Index Position
57 |
58 | }
59 |
60 | #define CEMBED_DIRENT_FILE 8
61 | #define CEMBED_DIRENT_DIR 4
62 | #define CEMBED_MAXPATH 512
63 |
64 | void iterdir(char* d){
65 |
66 | char* fullpath = (char*)malloc(CEMBED_MAXPATH*sizeof(char));
67 |
68 | DIR *dir;
69 | struct dirent *ent;
70 |
71 | if ((dir = opendir(d)) != NULL) {
72 |
73 | while ((ent = readdir(dir)) != NULL) {
74 |
75 | if(strcmp(ent->d_name, ".") == 0) continue;
76 | if(strcmp(ent->d_name, "..") == 0) continue;
77 |
78 | if(ent->d_type == CEMBED_DIRENT_FILE){
79 | strcpy(fullpath, d);
80 | strcat(fullpath, "/");
81 | strcat(fullpath, ent->d_name);
82 | cembed(fullpath);
83 | }
84 |
85 | else if(ent->d_type == CEMBED_DIRENT_DIR){
86 | strcpy(fullpath, d);
87 | strcat(fullpath, "/");
88 | strcat(fullpath, ent->d_name);
89 | iterdir(fullpath);
90 | }
91 |
92 | }
93 |
94 | closedir(dir);
95 |
96 | }
97 |
98 | else {
99 |
100 | strcpy(fullpath, d);
101 | cembed(fullpath);
102 |
103 | }
104 |
105 | free(fullpath);
106 |
107 | }
108 |
109 | int main(int argc, char* argv[]){
110 |
111 | char fmt[CEMBED_MAXPATH];
112 |
113 | if(argc <= 1)
114 | return 0;
115 |
116 | sprintf(fmt, "if [ ! -d %s ]; then mkdir %s; fi;", CEMBED_TMPDIR, CEMBED_TMPDIR);
117 | system(fmt);
118 |
119 | // Build the Mapping Structure and Virtual File System
120 |
121 | ms = fopen("cembed.map", "wb");
122 | fs = fopen("cembed.fs", "wb");
123 |
124 | if(ms == NULL || fs == NULL){
125 | fprintf(stderr, "Failed to initialize map and filesystem. Check permissions.\n");
126 | return 1;
127 | }
128 |
129 | for(int i = 1; i < argc; i++)
130 | iterdir(argv[i]);
131 |
132 | fclose(ms);
133 | fclose(fs);
134 |
135 | // Convert to Embeddable Symbols
136 |
137 | sprintf(fmt, "objcopy -I binary -O %s "\
138 | "--redefine-sym _binary_cembed_map_start=cembed_map_start "\
139 | "--redefine-sym _binary_cembed_map_end=cembed_map_end "\
140 | "--redefine-sym _binary_cembed_map_size=cembed_map_size "\
141 | "cembed.map cembed.map.o", CEMBED_ARCH);
142 | system(fmt);
143 |
144 | sprintf(fmt, "mv cembed.map.o %s/cembed.map.o", CEMBED_TMPDIR);
145 | system(fmt);
146 | system("rm cembed.map");
147 |
148 | sprintf(fmt, "objcopy -I binary -O %s "\
149 | "--redefine-sym _binary_cembed_fs_start=cembed_fs_start "\
150 | "--redefine-sym _binary_cembed_fs_end=cembed_fs_end "\
151 | "--redefine-sym _binary_cembed_fs_size=cembed_fs_size "\
152 | "cembed.fs cembed.fs.o", CEMBED_ARCH);
153 | system(fmt);
154 |
155 | sprintf(fmt, "mv cembed.fs.o %s/cembed.fs.o", CEMBED_TMPDIR);
156 | system(fmt);
157 | system("rm cembed.fs");
158 |
159 | sprintf(fmt, "ld -relocatable %s/*.o -o %s", CEMBED_TMPDIR, CEMBED_FILE);
160 | system(fmt);
161 |
162 | sprintf(fmt, "rm -rf %s", CEMBED_TMPDIR);
163 | system(fmt);
164 |
165 | printf("%s", CEMBED_FILE);
166 |
167 | return 0;
168 |
169 | }
170 |
--------------------------------------------------------------------------------
/c-embed.h:
--------------------------------------------------------------------------------
1 | /*
2 | # c-embed
3 | # embed virtual file systems into a C program
4 | # - at build time
5 | # - with zero dependencies
6 | # - with zero code modifications
7 | # - with zero clutter in your program
8 | # author: nicholas mcdonald 2022
9 | */
10 |
11 | #ifndef CEMBED
12 | #define CEMBED
13 |
14 | #include <stdio.h>
15 | #include <stdlib.h>
16 | #include <string.h>
17 | #include <stdbool.h>
18 | #include <sys/types.h> // u_int32_t
19 |
20 | u_int32_t hash(char * key){ // Hash Function: MurmurOAAT32 (32-bit state)
21 | u_int32_t h = 3323198485ul;
22 | for (;*key;++key) {
23 | h ^= *key;
24 | h *= 0x5bd1e995;
25 | h ^= h >> 15;
26 | }
27 | return h;
28 | }
29 |
30 | typedef size_t epos_t;
31 |
32 | struct EMAP_S { // Map Indexing Struct
33 | u_int32_t hash;
34 | u_int32_t pos;
35 | u_int32_t size;
36 | };
37 | typedef struct EMAP_S EMAP;
38 |
39 | struct EFILE_S { // Virtual File Stream
40 | char* pos;
41 | char* end;
42 | size_t size;
43 | int err;
44 | };
45 | typedef struct EFILE_S EFILE;
46 |
47 | // Error Handling
48 |
49 | __thread int eerrcode = 0;
50 | #define ethrow(err) { (eerrcode = (err)); return NULL; }
51 | #define eerrno (eerrcode)
52 |
53 | #define EERRCODE_SUCCESS 0
54 | #define EERRCODE_NOFILE 1
55 | #define EERRCODE_NOMAP 2
56 | #define EERRCODE_NULLSTREAM 3
57 | #define EERRCODE_OOBSTREAMPOS 4
58 |
59 | const char* eerrstr(int e){
60 | switch(e){
61 | case 0: return "Success.";
62 | case 1: return "No file found.";
63 | case 2: return "Mapping structure error.";
64 | case 3: return "File stream pointer is NULL.";
65 | case 4: return "File stream pointer is out-of-bounds.";
66 | default: return "Unknown cembed error code.";
67 | }
68 | }
69 |
70 | #define eerror(c) printf("%s: (%d) %s\n", c, eerrcode, eerrstr(eerrcode))
71 |
72 | // File Usage
73 |
74 | #ifndef CEMBED_BUILD
75 |
76 | extern char cembed_map_start; // Embedded Indexing Structure
77 | extern char cembed_map_end;
78 | extern char cembed_map_size;
79 |
80 | extern char cembed_fs_start; // Embedded Virtual File System
81 | extern char cembed_fs_end;
82 | extern char cembed_fs_size;
83 |
84 | EFILE* eopen(const char* file, const char* mode){
85 |
86 | EMAP* map = (EMAP*)(&cembed_map_start);
87 | const char* end = &cembed_map_end;
88 |
89 | if( map == NULL || end == NULL )
90 | ethrow(EERRCODE_NOMAP);
91 |
92 | const u_int32_t key = hash((char*)file);
93 | while( ((char*)map != end) && (map->hash != key) )
94 | map++;
95 |
96 | if((char*)map == end) // No match: never dereference past the end of the index
97 | ethrow(EERRCODE_NOFILE);
98 | EFILE* e = (EFILE*)malloc(sizeof *e);
99 | if(e == NULL) return NULL; // Allocation failed
100 | e->pos = (&cembed_fs_start + map->pos);
101 | e->end = (&cembed_fs_start + map->pos + map->size);
102 | e->size = map->size;
103 | e->err = 0;
104 | return e;
105 |
106 | }
107 |
108 | void eclose(EFILE* e){
109 | free(e);
110 | e = NULL;
111 | }
112 |
113 | bool eeof(EFILE* e){
114 | if(e == NULL){
115 | (eerrcode = (EERRCODE_NULLSTREAM));
116 | return true;
117 | }
118 | if(e->end < e->pos){
119 | (eerrcode = (EERRCODE_OOBSTREAMPOS));
120 | return true;
121 | }
122 | if((size_t)(e->end - e->pos) > e->size){ // Position lies before the start of the file
123 | (eerrcode = (EERRCODE_OOBSTREAMPOS));
124 | return true;
125 | }
126 | return (e->end == e->pos);
127 | }
128 |
129 | size_t eread(void* ptr, size_t size, size_t count, EFILE* stream){
130 |
131 | if(stream->end - stream->pos < size*count){
132 | size_t scount = stream->end - stream->pos;
133 | memcpy(ptr, (void*)stream->pos, scount);
134 | stream->pos = stream->end;
135 | return (scount/size);
136 | }
137 | memcpy(ptr, (void*)stream->pos, size*count);
138 | stream->pos += size*count; // Advance the stream, as fread does
139 | return count;
140 |
141 | }
142 |
143 | int egetpos(EFILE* e, epos_t* pos){
144 |
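145 | if(e->end < e->pos){ // Stream position is past the end of the file
146 | eerrcode = EERRCODE_OOBSTREAMPOS;
147 | return 1;
148 | }
149 |
150 | *pos = (epos_t)(e->size - (size_t)(e->end - e->pos)); // Offset from the start of the file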
151 | return 0;
152 |
153 | }
154 |
155 | char* egets ( char* str, int num, EFILE* stream ){
156 |
157 | if(num <= 0 || eeof(stream)) return NULL;
158 |
159 | int i = 0;
160 | while(i < num - 1 && !eeof(stream)){ // Leave room for the terminator
161 | if((str[i++] = *(stream->pos++)) == '\n') break; // Stop after a newline, like fgets
162 | }
163 | str[i] = '\0'; // Always null-terminate
164 | return str;
165 | }
166 |
167 | int egetc ( EFILE* stream ){
168 | if(eeof(stream))
169 | return EOF;
170 | return (int)(unsigned char)(*(stream->pos++)); // fgetc semantics: a data byte is never negative
171 | }
172 |
173 | long int etell(EFILE* e){
174 | return (long int)e->size - (long int)(e->end - e->pos); // Bytes consumed so far
175 | }
176 |
177 | void erewind(EFILE* e){ // Not named rewind: stdio.h already declares rewind(FILE*)
178 | e->pos = (e->end - e->size);
179 | }
180 |
181 | int eseek ( EFILE* stream, long int offset, int origin ){
182 |
183 | if(origin == SEEK_SET)
184 | stream->pos = stream->end - stream->size + offset;
185 | if(origin == SEEK_CUR)
186 | stream->pos += offset;
187 | if(origin == SEEK_END)
188 | stream->pos = stream->end + offset;
189 |
190 | if(stream->pos > stream->end || stream->pos < stream->end - stream->size){
191 | (eerrcode = (EERRCODE_OOBSTREAMPOS));
192 | return 1;
193 | }
194 |
195 | return 0;
196 |
197 | }
198 |
199 | // Preprocessor Translation
200 |
201 | #ifdef CEMBED_TRANSLATE
202 | #define FILE EFILE
203 | #define fopen eopen
204 | #define fclose eclose
205 | #define feof eeof
206 | #define fgets egets
207 | #define fgetc egetc
208 | #define perror eerror
209 | #define fread eread
210 | #define fseek eseek
211 | #define ftell etell
212 | #endif
213 |
214 | #endif
215 | #endif
216 |
--------------------------------------------------------------------------------
/examples/0_fgets/data/data.txt:
--------------------------------------------------------------------------------
1 | Hello World
2 |
--------------------------------------------------------------------------------
/examples/0_fgets/data/data2/data2.txt:
--------------------------------------------------------------------------------
1 | Other Data
2 |
--------------------------------------------------------------------------------
/examples/0_fgets/main:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/weigert/c-embed/bc672128bd1c798c0cd641eb240ac3dafc1bb1f5/examples/0_fgets/main
--------------------------------------------------------------------------------
/examples/0_fgets/main.c:
--------------------------------------------------------------------------------
1 | #include <stdio.h>
2 |
3 | int main(int argc, char* args[]){
4 |
5 | FILE* eFile = fopen("data/data2/data2.txt", "r");
6 |
7 | char buffer [100] = {' '};
8 |
9 | if (eFile == NULL)
10 | perror ("Error opening file");
11 |
12 | else while(!feof(eFile)){
13 | if( fgets(buffer, 100, eFile) == NULL ) break;
14 | fputs (buffer , stdout);
15 | }
16 |
17 | if (eFile != NULL) fclose(eFile);
18 |
19 | }
20 |
--------------------------------------------------------------------------------
/examples/0_fgets/makefile:
--------------------------------------------------------------------------------
1 | # c-embed build system
2 |
3 | # data directory to embed
4 | DAT = data
5 |
6 | # c-embed configuration flags
7 | CEF = -include /usr/local/include/c-embed.h -DCEMBED_TRANSLATE
8 |
9 | # build rules
10 | .PHONY: embedded relative
11 |
12 | embedded: CF = $(shell c-embed $(DAT)) $(CEF)
13 | embedded: all
14 |
15 | relative: CF =
16 | relative: all
17 |
18 | build:
19 | gcc main.c $(CF) -o main
20 |
21 | all: build
22 |
--------------------------------------------------------------------------------
/makefile:
--------------------------------------------------------------------------------
1 | # c-embed
2 | # embed virtual file systems into a C program
3 | # - at build time
4 | # - with zero dependencies
5 | # - with zero code modifications
6 | # - with zero clutter in your program
7 | # author: nicholas mcdonald 2022
8 |
9 | INSTALL_DIR = /usr/local/bin
10 | INCLUDE_DIR = /usr/local/include
11 |
12 | build:
13 | gcc -g c-embed.c -o c-embed
14 |
15 | install:
16 | mv c-embed $(INSTALL_DIR)/c-embed
17 | cp c-embed.h $(INCLUDE_DIR)/c-embed.h
18 |
19 | all: build install
20 |
--------------------------------------------------------------------------------