├── .gitignore
├── README.md
├── c-embed.c
├── c-embed.h
├── examples
│   └── 0_fgets
│       ├── data
│       │   ├── data.txt
│       │   └── data2
│       │       └── data2.txt
│       ├── main
│       ├── main.c
│       └── makefile
└── makefile


/.gitignore:
--------------------------------------------------------------------------------
**/c-embed.o
**/main

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# c-embed

naturally embed read-only file/filesystem snapshots into any C99 program with a single header, **zero dependencies** and **zero modifications** to your code at build time.

works with `C99` to `C++20` compilers

## usage

`c-embed` allows you to embed static, read-only snapshots of full filesystems into your C executable with almost zero effort.

`c-embed` builds an object file containing the static file system image using the embedder binary `c-embed`, which outputs the name of the embedded filesystem file (default: `c-embed.o`):

    c-embed myfile.txt ...
    > c-embed.o

*Note: Specifying a directory will recursively add all files (including in subdirectories) to the virtual file system.*

The single `c-embed.h` header exposes a `stdio.h`-style interface for accessing the embedded file system, which you simply link as an object file:

    /* main.c */

    #include <c-embed.h>

    int main(){

      EFILE * pFile = eopen ("myfile.txt" , "r");

      //...

      eclose(pFile);
      return 0;

    }

**build**

    gcc main.c c-embed.o -o main

*Note: All files in the virtual filesystem are located relative to the working directory where the `c-embed` binary is executed.*

### building / installation

Build the `c-embed` executable and install the single header:

    sudo make all

### c-embed stdio.h interface

The `stdio.h` style interface for virtual filesystems implemented in `c-embed` contains the following:

    EFILE   - embedded file pointer
    eopen   - open embedded file
    eclose  - close embedded file pointer
    eeof    - end of file check
    egets   - get string until eof or new line
    egetc   - get current stream char promoted to int
    eerror  - prints a descriptive error
    eread   - read to memory address
    eseek   - seek a position in the file
    etell   - get the position in the file
    rewind  - reset the position to the start of the file

in full analogy to their regular [`stdio.h` counterparts](https://cplusplus.com/reference/cstdio/).

### zero-modification embedding

If you use `stdio.h` for your filesystem-io, then your code requires **zero** modifications to use `c-embed` for embedding virtual file system snapshots.
To facilitate this, `c-embed` only requires a minor modification to your build process:

    #include <stdio.h>

    int main(int argc, char* args[]){

      FILE* eFile = fopen("data/data.txt", "r");

      char buffer [100] = {' '};

      if (eFile == NULL)
        perror ("Error opening file");

      else while(!feof(eFile)){
        if( fgets(buffer, 100, eFile) == NULL ) break;
        fputs (buffer , stdout);
      }

    }

Makefile - build with live file system:

    all:
      gcc main.c -o main

Makefile - build with embedded virtual file system:

    DAT = data_path_1 data_path_2 # files to embed
    CEF = -DCEMBED_TRANSLATE -include /usr/local/include/c-embed.h

    all:
      gcc main.c $(shell c-embed $(DAT)) $(CEF) -o main

This allows for seamless transitioning between development and deployment builds without needing to modify the code at all.

*Note: The `CEMBED_TRANSLATE` preprocessor definition translates the `stdio.h` interface into the `c-embed` interface.*

## how it works

The `c-embed` binary builds two files, containing a filesystem indexing structure and a concatenation of the files we wish to embed.

The indexing structure uses a simple hash function to turn path strings into keys and store positions of binary data in the concatenated filesystem file.

The binary then uses `objcopy` to turn these two files into accessible objects, which we can easily link with our main executable and access via defined symbols.

The `c-embed.h` header then includes these symbols (one set of symbols for the indexing structure, one set for the file system) and implements the `stdio.h` style interface to interpret the data and access the embedded files.

### other details

The filesystem remains static in the object file, meaning that the file system need not exist as long as `c-embed.o` exists.

You can generate a static file system in one place and copy the `c-embed.o` file somewhere else and still have accessors given by the relative paths from the place where it was embedded.

The `c-embed.o` file represents the filesystem snapshot, and compiling your executable by linking this file embeds the snapshot.

## Motivation

Shipping data with programs is difficult when the goal is to distribute a single executable. For instance, shader programs for graphical applications are linked and compiled at runtime and therefore the data needs to be loaded in live.

A number of strategies exist to circumvent this problem, each with their own set of drawbacks which `c-embed` avoids.

### Strategies

1. Embed as string / const char* literal

This is trash for obvious reasons. It requires the manual declaration, tracking and working with symbols. The contents have no syntax highlighting and are difficult to edit and swap out without a complex toolchain.

2. Base64-Encode and Embed as String Literal

A number of other projects on GitHub use this approach for binary data. It is identical to the previous strategy.

Both of these solutions require modification / cluttering of your code with additional `#include` directives or symbols.

3. ASCII Files: `#include` as string literal

    const std::string file_content =
    #include "file"
    ;

This is more flexible as it allows the data to remain in a separate file, albeit with the addition of the delimiters `R""(` and `)""`, which is ugly. This does not work for binary data. Note that here we are actually also managing file names, so changes to names or relative paths require modification of code. Additionally, *relative file paths must exist at compile time*.

### Solution

The goal of this tiny, single-header library is to leverage the C preprocessor to move the entire file system configuration to the build process and out of your code.

The `c-embed` binary uses `objcopy` to generate a single object file containing the relevant symbols for accessing the binary-encoded data.

Finally, an abstract `stdio.h` style interface is provided for retrieving the data with proper error handling.

The result is that `c-embed` does not strictly require that the file system itself exists at compile time or even relative to the build system; only at some point, somewhere.

Files stay as the files which they are, and can thus be manipulated appropriately by your editor of choice, with the embedding occurring at build time.

## Future Work

It would be interesting to consider if the file system could be made (temporarily) writeable in RAM. But this is beyond the purpose of this library.

Other necessary improvements include:
- active hash collision detection during embedding
- somehow make it possible to link multiple file systems simultaneously
- add an embed file inspection tool (this requires including metadata in the future)

## License

MIT License

--------------------------------------------------------------------------------
/c-embed.c:
--------------------------------------------------------------------------------
/*
# c-embed
# embed virtual file systems into a C program
# - at build time
# - with zero dependencies
# - with zero code modifications
# - with zero clutter in your program
# author: nicholas mcdonald 2022
*/

#define CEMBED_BUILD

#include "c-embed.h"
#include <dirent.h>   // opendir, readdir, closedir
#include <string.h>   // string helpers (strcmp, ...)

#define CEMBED_FILE "c-embed.o"     // Output File
#define CEMBED_TMPDIR "cembed_tmp"  // Temporary Directory
#define CEMBED_ARCH "elf64-x86-64"  // Target Architecture

FILE* ms = NULL;    // Mapping Structure
FILE* fs = NULL;    // Virtual Filesystem
FILE* file = NULL;  // Embed Target File Pointer
u_int32_t pos = 0;  // Current Position

void cembed(char* filename){

  file = fopen(filename, "rb");  // Open the Embed Target File
  if(file == NULL){
    printf("Failed to open file %s.", filename);
    return;
  }

  fseek(file, 0, SEEK_END);  // Define Map
  EMAP map = {hash(filename), pos, (u_int32_t)ftell(file)};
  rewind (file);

  char* buf = malloc(sizeof(char)*(map.size));
  if(buf == NULL){
    printf("Memory error for file %s.", filename);
    fclose(file);  // Do not leak the file handle on failure
    file = NULL;
    return;
  }

  u_int32_t result = fread(buf, 1, map.size, file);
  if(result != map.size){
    printf("Read error for file %s.", filename);
    free(buf);     // Release the buffer and file handle on failure
    fclose(file);
    file = NULL;
    return;
  }

  fwrite(&map, sizeof map, 1, ms);  // Write Mapping Structure
  fwrite(buf, map.size, 1, fs);     // Write Virtual Filesystem

  free(buf);        // Free Buffer
  fclose(file);     // Close the File
  file = NULL;      // Reset the Pointer
  pos += map.size;  // Shift the Index Position

}

#define CEMBED_DIRENT_FILE 8  // d_type of a regular file (DT_REG on Linux)
#define CEMBED_DIRENT_DIR 4   // d_type of a directory (DT_DIR on Linux)
#define CEMBED_MAXPATH 512

void iterdir(char* d){

  char* fullpath = (char*)malloc(CEMBED_MAXPATH*sizeof(char));
  if(fullpath == NULL)
    return;

  DIR *dir;
  struct dirent *ent;

  if ((dir = opendir(d)) != NULL) {

    while ((ent = readdir(dir)) != NULL) {

      if(strcmp(ent->d_name, ".") == 0) continue;
      if(strcmp(ent->d_name, "..") == 0) continue;

      // Build "<dir>/<entry>" without writing past the buffer
      snprintf(fullpath, CEMBED_MAXPATH, "%s/%s", d, ent->d_name);

      if(ent->d_type == CEMBED_DIRENT_FILE)
        cembed(fullpath);

      else if(ent->d_type == CEMBED_DIRENT_DIR)
        iterdir(fullpath);

    }

    closedir(dir);

  }

  else {

    // Not a directory: treat the argument as a single file
    snprintf(fullpath, CEMBED_MAXPATH, "%s", d);
    cembed(fullpath);

  }

  free(fullpath);

}

int main(int argc, char* argv[]){

  char fmt[CEMBED_MAXPATH];

  if(argc <= 1)
    return 0;

  sprintf(fmt, "if [ ! -d %s ]; then mkdir %s; fi;", CEMBED_TMPDIR, CEMBED_TMPDIR);
  system(fmt);

  // Build the Mapping Structure and Virtual File System

  ms = fopen("cembed.map", "wb");
  fs = fopen("cembed.fs", "wb");

  if(ms == NULL || fs == NULL){
    printf("Failed to initialize map and filesystem.
Check permissions.");
    return 0;
  }

  for(int i = 1; i < argc; i++)
    iterdir(argv[i]);

  fclose(ms);
  fclose(fs);

  // Convert to Embeddable Symbols

  sprintf(fmt, "objcopy -I binary -O %s "\
  "--redefine-sym _binary_cembed_map_start=cembed_map_start "\
  "--redefine-sym _binary_cembed_map_end=cembed_map_end "\
  "--redefine-sym _binary_cembed_map_size=cembed_map_size "\
  "cembed.map cembed.map.o", CEMBED_ARCH);
  system(fmt);

  sprintf(fmt, "mv cembed.map.o %s/cembed.map.o", CEMBED_TMPDIR);
  system(fmt);
  system("rm cembed.map");

  sprintf(fmt, "objcopy -I binary -O %s "\
  "--redefine-sym _binary_cembed_fs_start=cembed_fs_start "\
  "--redefine-sym _binary_cembed_fs_end=cembed_fs_end "\
  "--redefine-sym _binary_cembed_fs_size=cembed_fs_size "\
  "cembed.fs cembed.fs.o", CEMBED_ARCH);
  system(fmt);

  sprintf(fmt, "mv cembed.fs.o %s/cembed.fs.o", CEMBED_TMPDIR);
  system(fmt);
  system("rm cembed.fs");

  // Link both objects from the temporary directory into the output file
  sprintf(fmt, "ld -relocatable %s/*.o -o %s", CEMBED_TMPDIR, CEMBED_FILE);
  system(fmt);

  sprintf(fmt, "rm -rf %s", CEMBED_TMPDIR);
  system(fmt);

  printf("%s", CEMBED_FILE);

  return 0;

}

--------------------------------------------------------------------------------
/c-embed.h:
--------------------------------------------------------------------------------
/*
# c-embed
# embed virtual file systems into a C program
# - at build time
# - with zero dependencies
# - with zero code modifications
# - with zero clutter in your program
# author: nicholas mcdonald 2022
*/

#ifndef CEMBED
#define CEMBED

#include <stdio.h>      // printf, FILE, SEEK_SET
#include <stdlib.h>     // malloc, free
#include <stdbool.h>    // bool
#include <string.h>     // memcpy
#include <sys/types.h>  // u_int32_t

u_int32_t hash(char * key){ // Hash Function: MurmurOAAT32
  u_int32_t h = 3323198485ul;
  for (;*key;++key) {
    h ^=
*key;
    h *= 0x5bd1e995;
    h ^= h >> 15;
  }
  return h;
}

typedef size_t epos_t;

struct EMAP_S { // Map Indexing Struct
  u_int32_t hash;
  u_int32_t pos;
  u_int32_t size;
};
typedef struct EMAP_S EMAP;

struct EFILE_S { // Virtual File Stream
  char* pos;
  char* end;
  size_t size;
  int err;
};
typedef struct EFILE_S EFILE;

// Error Handling

__thread int eerrcode = 0;
#define ethrow(err) { (eerrcode = (err)); return NULL; }
#define eerrno (eerrcode)

#define EERRCODE_SUCCESS 0
#define EERRCODE_NOFILE 1
#define EERRCODE_NOMAP 2
#define EERRCODE_NULLSTREAM 3
#define EERRCODE_OOBSTREAMPOS 4

const char* eerrstr(int e){
  switch(e){
    case 0: return "Success.";
    case 1: return "No file found.";
    case 2: return "Mapping structure error.";
    case 3: return "File stream pointer is NULL.";
    case 4: return "File stream pointer is out-of-bounds.";
    default: return "Unknown cembed error code.";
  }
}

#define eerror(c) printf("%s: (%u) %s\n", c, eerrcode, eerrstr(eerrcode))

// File Usage

#ifndef CEMBED_BUILD

extern char cembed_map_start; // Embedded Indexing Structure
extern char cembed_map_end;
extern char cembed_map_size;

extern char cembed_fs_start; // Embedded Virtual File System
extern char cembed_fs_end;
extern char cembed_fs_size;

EFILE* eopen(const char* file, const char* mode){

  EMAP* map = (EMAP*)(&cembed_map_start);
  const char* end = &cembed_map_end;

  if( map == NULL || end == NULL )
    ethrow(EERRCODE_NOMAP);

  const u_int32_t key = hash((char*)file);
  while( ((char*)map != end) && (map->hash != key) )
    map++;

  if((char*)map == end) // Reached the end of the index without a match
    ethrow(EERRCODE_NOFILE);

  EFILE* e = (EFILE*)malloc(sizeof *e);
  e->pos = (&cembed_fs_start +
map->pos);
  e->end = (&cembed_fs_start + map->pos + map->size);
  e->size = map->size;

  return e;

}

void eclose(EFILE* e){
  free(e);
}

bool eeof(EFILE* e){
  if(e == NULL){
    (eerrcode = (EERRCODE_NULLSTREAM));
    return true;
  }
  if(e->end < e->pos){ // Position lies past the end of the file
    (eerrcode = (EERRCODE_OOBSTREAMPOS));
    return true;
  }
  if((size_t)(e->end - e->pos) > e->size){ // Position lies before the start of the file
    (eerrcode = (EERRCODE_OOBSTREAMPOS));
    return true;
  }
  return (e->end == e->pos);
}

size_t eread(void* ptr, size_t size, size_t count, EFILE* stream){

  if(size == 0 || count == 0)
    return 0;

  size_t avail = (size_t)(stream->end - stream->pos); // Bytes left in the stream
  if(avail < size*count){
    memcpy(ptr, (void*)stream->pos, avail);
    stream->pos = stream->end;
    return (avail/size);
  }

  memcpy(ptr, (void*)stream->pos, size*count);
  stream->pos += size*count; // Advance past the bytes just read
  return count;

}

int egetpos(EFILE* e, epos_t* pos){

  if(e->end < e->pos)
    return 1;

  // Offset from the start of the file
  *pos = (epos_t)(e->size - (size_t)(e->end - e->pos));
  return 0;

}

char* egets ( char* str, int num, EFILE* stream ){

  if(num <= 0 || eeof(stream))
    return NULL;

  // Like fgets: copy at most num-1 chars, stop after a newline, then terminate
  int i = 0;
  while(i < num - 1 && !eeof(stream)){
    char c = *(stream->pos++);
    str[i++] = c;
    if(c == '\n') break;
  }
  str[i] = '\0';

  return str;

}

int egetc ( EFILE* stream ){
  if(eeof(stream))
    return EOF;
  return (int)(unsigned char)(*(stream->pos++)); // Never negative for valid bytes
}

long int etell(EFILE* e){
  // Offset from the start of the file
  return (long int)e->size - (long int)(e->end - e->pos);
}

void rewind(EFILE* e){
  e->pos = (e->end - e->size);
}

int eseek ( EFILE* stream, long int offset, int origin ){

  if(origin == SEEK_SET)
    stream->pos = stream->end - stream->size + offset;
  if(origin == SEEK_CUR)
    stream->pos += offset;
  if(origin == SEEK_END)
    stream->pos
= stream->end + offset;

  if(stream->end < stream->pos || etell(stream) < 0){
    (eerrcode = (EERRCODE_OOBSTREAMPOS));
    return -1; // Nonzero signals failure, as with fseek
  }

  return 0;

}

// Preprocessor Translation

#ifdef CEMBED_TRANSLATE
#define FILE EFILE
#define fopen eopen
#define fclose eclose
#define feof eeof
#define fgets egets
#define fgetc egetc
#define perror eerror
#define fread eread
#define fseek eseek
#define ftell etell
#endif

#endif
#endif

--------------------------------------------------------------------------------
/examples/0_fgets/data/data.txt:
--------------------------------------------------------------------------------
Hello World

--------------------------------------------------------------------------------
/examples/0_fgets/data/data2/data2.txt:
--------------------------------------------------------------------------------
Other Data

--------------------------------------------------------------------------------
/examples/0_fgets/main:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/weigert/c-embed/bc672128bd1c798c0cd641eb240ac3dafc1bb1f5/examples/0_fgets/main

--------------------------------------------------------------------------------
/examples/0_fgets/main.c:
--------------------------------------------------------------------------------
#include <stdio.h>

int main(int argc, char* args[]){

  FILE* eFile = fopen("data/data2/data2.txt", "r");

  char buffer [100] = {' '};

  if (eFile == NULL){
    perror ("Error opening file");
    return 1; // Nothing to close if the open failed
  }

  while(!feof(eFile)){
    if( fgets(buffer, 100, eFile) == NULL ) break;
    fputs (buffer , stdout);
  }

  fclose(eFile);
  return 0;

}

--------------------------------------------------------------------------------
/examples/0_fgets/makefile:
--------------------------------------------------------------------------------
# c-embed build system

# data directory to embed
DAT = data

# c-embed configuration flags
CEF = -include /usr/local/include/c-embed.h -DCEMBED_TRANSLATE

# build rules
.PHONY: embedded relative

embedded: CF = $(shell c-embed $(DAT)) $(CEF)
embedded: all

relative: CF =
relative: all

build:
	gcc main.c $(CF) -o main

all: build

--------------------------------------------------------------------------------
/makefile:
--------------------------------------------------------------------------------
# c-embed
# embed virtual file systems into a C program
# - at build time
# - with zero dependencies
# - with zero code modifications
# - with zero clutter in your program
# author: nicholas mcdonald 2022

INSTALL_DIR = /usr/local/bin
INCLUDE_DIR = /usr/local/include

build:
	gcc -g c-embed.c -o c-embed

install:
	mv c-embed $(INSTALL_DIR)/c-embed
	cp c-embed.h $(INCLUDE_DIR)/c-embed.h

all: build install

--------------------------------------------------------------------------------