├── .gitignore ├── LICENSE ├── README.md ├── build.sh └── src ├── main.c ├── matrix ├── matrix.c └── matrix.h ├── model ├── rnn.c └── rnn.h └── vocabulary ├── vocabulary.c └── vocabulary.h /.gitignore: -------------------------------------------------------------------------------- 1 | /dist/ 2 | /.vscode/ 3 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2025 io-eric 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 
22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # RNN Text Generation Model in C 2 | 3 | This repository contains a simple Recurrent Neural Network (RNN) implemented from scratch in C, designed to generate text based on a given input word. The model is trained on a small dataset of sentences and learns to predict the next word in a sequence. For example, given the input word "Matrix," the model might predict "dimensions" as the next word. 4 | 5 | ## Overview 6 | This project implements a basic RNN model in C to generate text. The model is trained on a small dataset of sentences and learns to predict the next word in a sequence. The implementation includes: 7 | - Vocabulary creation and word indexing. 8 | - One-hot encoding for input and target vectors. 9 | - Forward and backward propagation for training. 10 | - Text generation based on a given input word. 11 | 12 | ## Features 13 | - From Scratch Implementation: The RNN is implemented entirely in C without relying on external machine learning libraries. 14 | - Text Generation: Given an input word, the model predicts the next word in the sequence. 15 | - Customizable Training: Adjustable parameters such as learning rate, hidden layer size, and number of epochs. 16 | 17 | ## Requirements 18 | - C Compiler: GCC or any C99-compatible compiler. 19 | - Basic C Libraries: Standard libraries like stdio.h, stdlib.h, and string.h. 20 | 21 | ## Installation 22 | 1. Clone the repository: 23 | ``` 24 | git clone https://github.com/io-eric/C-RNN-Text-Generation.git 25 | cd C-RNN-Text-Generation 26 | ``` 27 | 2. Compile and run the project: 28 | ``` 29 | ./build.sh 30 | ``` 31 | ## Usage 32 | After compiling and running the program, the model will: 33 | 1. Train on the provided dataset. 34 | 2. Generate text based on the input word "Matrix." 35 | 3. Print the generated text to the console. 
36 | 37 | To modify the input word or the number of generated words, edit the following lines in main.c: 38 | ````c 39 | char *input_text = "Matrix"; // Change the input word 40 | char *next_word_predictions = rnn_generate_text(v, rnn, input_text, 5); // Change the number of words to generate 41 | ```` 42 | 43 | ## Training Data 44 | The model is trained on a small dataset of sentences: 45 | ```c 46 | const char *training_data[] = { 47 | "Matrix dimensions don’t match? Shocking.", 48 | "Rain on the window? Wow, never seen that before.", 49 | "Starting is the hardest part? Groundbreaking insight.", 50 | "Traveling? Because who wouldn’t want to get lost in a new place?", 51 | "Books? Oh yeah, they’re just full of ideas or whatever.", 52 | "Time’s too short for pointless stuff... unless it’s procrastination." 53 | }; 54 | ``` 55 | 56 | You can replace this dataset with your own text data for custom training. 57 | 58 | ## Results 59 | After training, the model generates text based on the input word. For example: 60 | ```c 61 | $ ./build.sh 62 | Compilation successful! 63 | Epoch 0, Average Loss: 1.657553 64 | Epoch 1000, Average Loss: 0.044296 65 | Epoch 2000, Average Loss: 0.012500 66 | Epoch 3000, Average Loss: 0.012825 67 | Epoch 4000, Average Loss: 0.008907 68 | Input text: Matrix 69 | Next word predictions: dimensions dimensions never never full 70 | ``` 71 | 72 | The model's performance is limited by the small dataset and simple architecture, but it demonstrates the basic principles of RNNs and text generation. 73 | 74 | ## Limitations 75 | - Small Dataset: The model is trained on a very small dataset, which limits its ability to generalize. 76 | - Basic Architecture: The RNN is a simple implementation and may struggle with long-term dependencies. 77 | - Overfitting: Due to the small dataset, the model may overfit and repeat words. 78 | 79 | ## Future Improvements 80 | - Larger Dataset: Train the model on a larger and more diverse dataset. 
81 | - Advanced Architectures: Implement more advanced RNN variants like LSTMs or GRUs. 82 | - Better Text Generation: Improve the text generation logic to produce more coherent and varied outputs. 83 | - User Interface: Add a command-line interface for easier interaction with the model. 84 | 85 | ## License 86 | This project is licensed under the MIT License. See the [LICENSE](./LICENSE) file for details. 87 | -------------------------------------------------------------------------------- /build.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | # Set the output name for the executable 4 | OUTPUT="dist/rnn" 5 | 6 | # Define your source files 7 | SOURCES="src/main.c src/matrix/matrix.c src/vocabulary/vocabulary.c src/model/rnn.c" 8 | 9 | # Define any compiler flags if needed (e.g., for debugging) 10 | CFLAGS="-Wall -g" 11 | 12 | # Make sure the dist directory exists 13 | mkdir -p dist 14 | 15 | # Compile the source files into an executable 16 | gcc $CFLAGS $SOURCES -o $OUTPUT -lm 17 | 18 | # Check if the compilation was successful 19 | if [ $? -eq 0 ]; then 20 | echo "Compilation successful!" 21 | 22 | # Run the program 23 | ./$OUTPUT 24 | else 25 | echo "Compilation failed." 26 | fi 27 | -------------------------------------------------------------------------------- /src/main.c: -------------------------------------------------------------------------------- 1 | #include <stdio.h> 2 | #include <stdlib.h> 3 | #include <string.h> 4 | #include "model/rnn.h" 5 | #include "vocabulary/vocabulary.h" 6 | 7 | int main() 8 | { 9 | // Initialize vocabulary 10 | Vocabulary *v = vocabulary_create(100); 11 | 12 | const char *training_data[] = { 13 | "Matrix dimensions don’t match? Shocking.", 14 | "Rain on the window? Wow, never seen that before.", 15 | "Starting is the hardest part? Groundbreaking insight.", 16 | "Traveling? Because who wouldn’t want to get lost in a new place?", 17 | "Books? 
Oh yeah, they’re just full of ideas or whatever.", 18 | "Time’s too short for pointless stuff... unless it’s procrastination."}; 19 | 20 | int num_samples = sizeof(training_data) / sizeof(training_data[0]); 21 | 22 | // Add words to the vocabulary 23 | for (int i = 0; i < num_samples; i++) 24 | { 25 | char *sentence = strdup(training_data[i]); 26 | char *word = strtok(sentence, " "); 27 | while (word != NULL) 28 | { 29 | vocabulary_add_word(v, word); 30 | word = strtok(NULL, " "); 31 | } 32 | free(sentence); 33 | } 34 | 35 | // Parameters 36 | int input_size = v->size; 37 | int hidden_size = 100; 38 | int output_size = v->size; 39 | double learning_rate = 0.01; 40 | 41 | RNN *rnn = rnn_init(input_size, hidden_size, output_size, learning_rate); 42 | 43 | // Prepare training data: input-target pairs 44 | Matrix **input_vectors = (Matrix **)malloc(num_samples * sizeof(Matrix *)); 45 | Matrix **target_vectors = (Matrix **)malloc(num_samples * sizeof(Matrix *)); 46 | 47 | if (input_vectors == NULL || target_vectors == NULL) 48 | { 49 | fprintf(stderr, "Memory allocation failed\n"); 50 | return 1; 51 | } 52 | 53 | // Convert sentences to sequences of one-hot encoded vectors 54 | for (int i = 0; i < num_samples; i++) 55 | { 56 | char *sentence = strdup(training_data[i]); 57 | char *word = strtok(sentence, " "); 58 | input_vectors[i] = create_one_hot_vector(v, word); 59 | 60 | word = strtok(NULL, " "); 61 | target_vectors[i] = create_one_hot_vector(v, word); 62 | 63 | free(sentence); 64 | } 65 | 66 | // Train the RNN 67 | int epochs = 2000; 68 | for (int epoch = 0; epoch < epochs; epoch++) 69 | { 70 | double epoch_loss = 0.0; 71 | for (int i = 0; i < num_samples; i++) 72 | { 73 | Matrix *output = rnn_forward(rnn, input_vectors[i]); 74 | rnn_backward(rnn, input_vectors[i], target_vectors[i]); 75 | 76 | double loss = matrix_mean_square_error(output, target_vectors[i]); 77 | epoch_loss += loss; 78 | 79 | matrix_free(output); 80 | } 81 | 82 | double avg_epoch_loss = epoch_loss 
/ num_samples; 83 | if (epoch % 1000 == 0) // Print loss every 1000 epochs 84 | { 85 | printf("Epoch %d, Average Loss: %f\n", epoch, avg_epoch_loss); 86 | } 87 | } 88 | 89 | // Generate text after training 90 | char *input_text = "Matrix"; 91 | char *next_word_predictions = rnn_generate_text(v, rnn, input_text, 5); 92 | printf("Input text: %s\n", input_text); 93 | printf("Next word predictions: %s\n", next_word_predictions); 94 | 95 | // Clean up 96 | free(next_word_predictions); 97 | for (int i = 0; i < num_samples; i++) 98 | { 99 | matrix_free(input_vectors[i]); 100 | matrix_free(target_vectors[i]); 101 | } 102 | free(input_vectors); 103 | free(target_vectors); 104 | rnn_free(rnn); 105 | vocabulary_free(v); 106 | 107 | return 0; 108 | } -------------------------------------------------------------------------------- /src/matrix/matrix.c: -------------------------------------------------------------------------------- 1 | #include <stdio.h> 2 | #include <stdlib.h> 3 | #include <string.h> 4 | #include <math.h> 5 | #include <time.h> 6 | #include "matrix.h" 7 | 8 | #define MAXCHAR 100 9 | 10 | Matrix* matrix_create(int rows, int cols) { 11 | // Allocate memory for the matrix structure 12 | Matrix* matrix = malloc(sizeof(Matrix)); 13 | if (!matrix) return NULL; // Memory allocation failure check 14 | 15 | matrix->rows = rows; 16 | matrix->cols = cols; 17 | 18 | // Allocate memory for the matrix entries (array of row pointers) 19 | matrix->entries = malloc(rows * sizeof(double*)); 20 | if (!matrix->entries) { 21 | free(matrix); // Free matrix structure if allocation fails 22 | return NULL; 23 | } 24 | 25 | // Allocate memory for each row (array of doubles) 26 | for (int i = 0; i < rows; i++) { 27 | matrix->entries[i] = malloc(cols * sizeof(double)); 28 | if (!matrix->entries[i]) { 29 | // Free all previously allocated memory if row allocation fails 30 | for (int j = 0; j < i; j++) { 31 | free(matrix->entries[j]); 32 | } 33 | free(matrix->entries); 34 | free(matrix); 35 | return NULL; 36 | } 37 | } 38 | 39 | return
matrix; // Return the created matrix 40 | } 41 | 42 | // Function to free the matrix memory 43 | void matrix_free(Matrix* matrix) { 44 | if (matrix) { 45 | // Free each row 46 | for (int i = 0; i < matrix->rows; i++) { 47 | free(matrix->entries[i]); 48 | } 49 | // Free the row pointers array 50 | free(matrix->entries); 51 | // Free the matrix structure itself 52 | free(matrix); 53 | } 54 | } 55 | 56 | void matrix_fill(Matrix *m, double n) 57 | { 58 | for (int i = 0; i < m->rows; i++) 59 | { 60 | for (int j = 0; j < m->cols; j++) 61 | { 62 | m->entries[i][j] = n; 63 | } 64 | } 65 | } 66 | 67 | Matrix* matrix_zero(int rows, int cols) { 68 | Matrix* m = matrix_create(rows, cols); 69 | matrix_fill(m, 0.0); 70 | return m; 71 | } 72 | 73 | Matrix *matrix_copy(Matrix *m) 74 | { 75 | Matrix *mat = matrix_create(m->rows, m->cols); 76 | for (int i = 0; i < m->rows; i++) 77 | { 78 | for (int j = 0; j < m->cols; j++) 79 | { 80 | mat->entries[i][j] = m->entries[i][j]; 81 | } 82 | } 83 | return mat; 84 | } 85 | 86 | void matrix_print(Matrix *m) 87 | { 88 | printf("Rows: %d Columns: %d\n", m->rows, m->cols); 89 | for (int i = 0; i < m->rows; i++) 90 | { 91 | for (int j = 0; j < m->cols; j++) 92 | { 93 | printf("%1.3f ", m->entries[i][j]); 94 | } 95 | printf("\n"); 96 | } 97 | } 98 | 99 | void matrix_print_dimensions(Matrix *m) 100 | { 101 | printf("Rows: %d Columns: %d\n", m->rows, m->cols); 102 | } 103 | 104 | void matrix_save(Matrix *m, FILE *file) 105 | { 106 | fprintf(file, "%d\n", m->rows); 107 | fprintf(file, "%d\n", m->cols); 108 | for (int i = 0; i < m->rows; i++) 109 | { 110 | for (int j = 0; j < m->cols; j++) 111 | { 112 | fprintf(file, "%.6f\n", m->entries[i][j]); 113 | } 114 | } 115 | printf("Successfully saved matrix to file\n"); 116 | } 117 | 118 | Matrix *matrix_load(FILE *file) 119 | { 120 | char entry[MAXCHAR]; 121 | 122 | // Read matrix dimensions 123 | if (!fgets(entry, MAXCHAR, file)) 124 | return NULL; 125 | int rows = atoi(entry); 126 | if (!fgets(entry, 
MAXCHAR, file)) 127 | return NULL; 128 | int cols = atoi(entry); 129 | 130 | // Create matrix 131 | Matrix *m = matrix_create(rows, cols); 132 | if (!m) 133 | return NULL; // Memory allocation failed 134 | 135 | // Read matrix data 136 | for (int i = 0; i < m->rows; i++) 137 | { 138 | for (int j = 0; j < m->cols; j++) 139 | { 140 | if (!fgets(entry, MAXCHAR, file)) 141 | { matrix_free(m); return NULL; } // Avoid leaking m on a truncated file 142 | m->entries[i][j] = strtod(entry, NULL); 143 | } 144 | } 145 | return m; 146 | } 147 | 148 | int matrix_argmax(Matrix *m) 149 | { 150 | // Expects an Mx1 matrix 151 | double max_score = m->entries[0][0]; // Start from the first entry so all-negative outputs are handled 152 | int max_idx = 0; 153 | for (int i = 1; i < m->rows; i++) 154 | { 155 | if (m->entries[i][0] > max_score) 156 | { 157 | max_score = m->entries[i][0]; 158 | max_idx = i; 159 | } 160 | } 161 | return max_idx; 162 | } 163 | 164 | Matrix* matrix_row(Matrix* m, int row_index) { 165 | if (row_index >= m->rows || row_index < 0) { 166 | // Invalid row index 167 | return NULL; 168 | } 169 | 170 | // Create a new matrix with one row and the same number of columns 171 | Matrix* row = matrix_create(1, m->cols); 172 | 173 | // Copy the elements of the specified row into the new matrix 174 | for (int col = 0; col < m->cols; col++) { 175 | row->entries[0][col] = m->entries[row_index][col]; 176 | } 177 | 178 | return row; 179 | } 180 | 181 | double matrix_mean_square_error(Matrix *output, Matrix *target) { 182 | // Ensure the matrices have the same dimensions 183 | if (output->rows != target->rows || output->cols != target->cols) { 184 | fprintf(stderr, "Error: Matrices must have the same dimensions for MSE calculation.\n"); 185 | exit(EXIT_FAILURE); 186 | } 187 | 188 | double mse = 0.0; 189 | int total_elements = output->rows * output->cols; 190 | 191 | // Calculate the sum of squared differences 192 | for (int i = 0; i < output->rows; i++) { 193 | for (int j = 0; j < output->cols; j++) { 194 | double error = output->entries[i][j] - target->entries[i][j]; 195 | mse += pow(error, 2); // Square the error and
add to the total 196 | } 197 | } 198 | 199 | // Divide by the total number of elements to get the mean 200 | mse /= total_elements; 201 | 202 | return mse; 203 | } 204 | 205 | int matrix_check_dimensions(Matrix *m1, Matrix *m2) 206 | { 207 | return m1->rows == m2->rows && m1->cols == m2->cols; 208 | } 209 | 210 | void matrix_randomize(Matrix *m, double min, double max) 211 | { 212 | static int seeded = 0; 213 | if (!seeded) 214 | { 215 | srand((unsigned int)time(NULL)); // Seed random number generator once 216 | seeded = 1; 217 | } 218 | 219 | for (int i = 0; i < m->rows; i++) 220 | { 221 | for (int j = 0; j < m->cols; j++) 222 | { 223 | // Generate random numbers in the range [min, max) 224 | m->entries[i][j] = min + ((double)rand() / RAND_MAX) * (max - min); 225 | } 226 | } 227 | } 228 | 229 | void matrix_xavier_randomize(Matrix *m, int input_size, int output_size) 230 | { 231 | static int seeded = 0; 232 | if (!seeded) 233 | { 234 | srand((unsigned int)time(NULL)); // Seed random number generator once 235 | seeded = 1; 236 | } 237 | 238 | double limit = sqrt(6.0 / (input_size + output_size)); // Xavier limit 239 | 240 | for (int i = 0; i < m->rows; i++) 241 | { 242 | for (int j = 0; j < m->cols; j++) 243 | { 244 | // Initialize weights with values from the uniform distribution within [-limit, limit] 245 | m->entries[i][j] = (2.0 * ((double)rand() / RAND_MAX) - 1.0) * limit; 246 | } 247 | } 248 | } 249 | 250 | double matrix_sum_elements(Matrix *m) 251 | { 252 | double sum = 0.0; 253 | for (int i = 0; i < m->rows; i++) 254 | { 255 | for (int j = 0; j < m->cols; j++) 256 | { 257 | sum += m->entries[i][j]; 258 | } 259 | } 260 | return sum; 261 | } 262 | 263 | Matrix *matrix_add(Matrix *m1, Matrix *m2) 264 | { 265 | if (matrix_check_dimensions(m1, m2)) 266 | { 267 | Matrix *m = matrix_create(m1->rows, m1->cols); 268 | for (int i = 0; i < m1->rows; i++) 269 | { 270 | for (int j = 0; j < m2->cols; j++) 271 | { 272 | m->entries[i][j] = m1->entries[i][j] + 
m2->entries[i][j]; 273 | } 274 | } 275 | return m; 276 | } 277 | else 278 | { 279 | printf("(matrix_add) Dimensions mismatch add: %dx%d %dx%d\n", m1->rows, m1->cols, m2->rows, m2->cols); 280 | exit(EXIT_FAILURE); 281 | } 282 | } 283 | Matrix *matrix_subtract(Matrix *m1, Matrix *m2) 284 | { 285 | if (matrix_check_dimensions(m1, m2)) 286 | { 287 | Matrix *m = matrix_create(m1->rows, m1->cols); 288 | for (int i = 0; i < m1->rows; i++) 289 | { 290 | for (int j = 0; j < m2->cols; j++) 291 | { 292 | m->entries[i][j] = m1->entries[i][j] - m2->entries[i][j]; 293 | } 294 | } 295 | return m; 296 | } 297 | else 298 | { 299 | printf("(matrix_subtract) Dimensions mismatch subtract: %dx%d %dx%d\n", m1->rows, m1->cols, m2->rows, m2->cols); 300 | exit(EXIT_FAILURE); 301 | } 302 | } 303 | 304 | Matrix *matrix_multiply(Matrix *m1, Matrix *m2) 305 | { 306 | if (m1->cols == m2->rows) 307 | { 308 | // Create a result matrix with appropriate dimensions 309 | Matrix *result = matrix_create(m1->rows, m2->cols); 310 | 311 | // Perform matrix multiplication 312 | for (int i = 0; i < m1->rows; i++) 313 | { 314 | for (int j = 0; j < m2->cols; j++) 315 | { 316 | result->entries[i][j] = 0; 317 | for (int k = 0; k < m1->cols; k++) 318 | { 319 | result->entries[i][j] += m1->entries[i][k] * m2->entries[k][j]; 320 | } 321 | } 322 | } 323 | return result; 324 | } 325 | else 326 | { 327 | printf("(matrix_multiply) Dimensions mismatch multiply: %dx%d %dx%d\n", m1->rows, m1->cols, m2->rows, m2->cols); 328 | exit(EXIT_FAILURE); 329 | } 330 | } 331 | 332 | Matrix *matrix_dot(Matrix *m1, Matrix *m2) 333 | { 334 | if (m1->cols == m2->rows) 335 | { 336 | Matrix *m = matrix_create(m1->rows, m2->cols); 337 | for (int i = 0; i < m1->rows; i++) 338 | { 339 | for (int j = 0; j < m2->cols; j++) 340 | { 341 | double sum = 0; 342 | for (int k = 0; k < m2->rows; k++) 343 | { 344 | sum += m1->entries[i][k] * m2->entries[k][j]; 345 | } 346 | m->entries[i][j] = sum; 347 | } 348 | } 349 | return m; 350 | } 351 | else 
352 | { 353 | printf("(matrix_dot) Dimensions mismatch dot: %dx%d %dx%d\n", m1->rows, m1->cols, m2->rows, m2->cols); 354 | exit(EXIT_FAILURE); 355 | } 356 | } 357 | 358 | Matrix *matrix_apply(double (*func)(double), Matrix *m) 359 | { 360 | Matrix *mat = matrix_copy(m); 361 | for (int i = 0; i < m->rows; i++) 362 | { 363 | for (int j = 0; j < m->cols; j++) 364 | { 365 | mat->entries[i][j] = (*func)(m->entries[i][j]); 366 | } 367 | } 368 | return mat; 369 | } 370 | 371 | Matrix *matrix_scale(double n, Matrix *m) 372 | { 373 | Matrix *mat = matrix_copy(m); 374 | for (int i = 0; i < m->rows; i++) 375 | { 376 | for (int j = 0; j < m->cols; j++) 377 | { 378 | mat->entries[i][j] *= n; 379 | } 380 | } 381 | return mat; 382 | } 383 | 384 | Matrix *matrix_addScalar(double n, Matrix *m) 385 | { 386 | Matrix *mat = matrix_copy(m); 387 | for (int i = 0; i < m->rows; i++) 388 | { 389 | for (int j = 0; j < m->cols; j++) 390 | { 391 | mat->entries[i][j] += n; 392 | } 393 | } 394 | return mat; 395 | } 396 | 397 | Matrix *matrix_transpose(Matrix *m) 398 | { 399 | Matrix *mat = matrix_create(m->cols, m->rows); 400 | for (int i = 0; i < m->rows; i++) 401 | { 402 | for (int j = 0; j < m->cols; j++) 403 | { 404 | mat->entries[j][i] = m->entries[i][j]; 405 | } 406 | } 407 | return mat; 408 | } 409 | -------------------------------------------------------------------------------- /src/matrix/matrix.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | #include <stdio.h> 4 | 5 | typedef struct 6 | { 7 | double **entries; 8 | int rows; 9 | int cols; 10 | } Matrix; 11 | 12 | // Matrix Creation, Management, and Basic Utilities 13 | Matrix *matrix_create(int row, int col); 14 | void matrix_free(Matrix *m); 15 | void matrix_fill(Matrix *m, double n); 16 | Matrix *matrix_zero(int row, int col); 17 | Matrix *matrix_copy(Matrix *m); 18 | void matrix_print(Matrix *m); 19 | void matrix_print_dimensions(Matrix *m); 20 | 21 | // File Operations 22 | void
matrix_save(Matrix *m, FILE *file); 23 | Matrix *matrix_load(FILE *file); 24 | 25 | // Matrix Queries 26 | int matrix_argmax(Matrix *m); 27 | int matrix_check_dimensions(Matrix *m1, Matrix *m2); 28 | Matrix *matrix_row(Matrix *m, int row_index); 29 | double matrix_mean_square_error(Matrix *output, Matrix *target); 30 | 31 | // Matrix Operations 32 | void matrix_randomize(Matrix *m, double min, double max); 33 | void matrix_xavier_randomize(Matrix *m, int input_size, int output_size); 34 | double matrix_sum_elements(Matrix *m); 35 | Matrix *matrix_add(Matrix *m1, Matrix *m2); 36 | Matrix *matrix_subtract(Matrix *m1, Matrix *m2); 37 | Matrix *matrix_multiply(Matrix *m1, Matrix *m2); 38 | Matrix *matrix_dot(Matrix *m1, Matrix *m2); 39 | Matrix *matrix_apply(double (*func)(double), Matrix *m); 40 | Matrix *matrix_scale(double n, Matrix *m); 41 | Matrix *matrix_addScalar(double n, Matrix *m); 42 | Matrix *matrix_transpose(Matrix *m); -------------------------------------------------------------------------------- /src/model/rnn.c: -------------------------------------------------------------------------------- 1 | #include <stdio.h> 2 | #include <stdlib.h> 3 | #include <string.h> 4 | #include <math.h> 5 | #include <time.h> 6 | 7 | #include "rnn.h" 8 | #include "../matrix/matrix.h" 9 | #include "../vocabulary/vocabulary.h" 10 | 11 | #define MAX_WORD_LENGTH 64 12 | 13 | RNN *rnn_init(int input_size, int hidden_size, int output_size, double learning_rate) 14 | { 15 | RNN *rnn = (RNN *)malloc(sizeof(RNN)); 16 | if (!rnn) 17 | { 18 | fprintf(stderr, "Error: Unable to allocate memory for RNN\n"); 19 | exit(EXIT_FAILURE); 20 | } 21 | 22 | rnn->input_size = input_size; 23 | rnn->hidden_size = hidden_size; 24 | rnn->output_size = output_size; 25 | rnn->learning_rate = learning_rate; 26 | 27 | // Initialize hidden weights (input to hidden) 28 | rnn->hidden_weights = matrix_create(hidden_size, input_size); 29 | matrix_xavier_randomize(rnn->hidden_weights, input_size, hidden_size); 30 | 31 | // Initialize output weights (hidden to
output) 32 | rnn->output_weights = matrix_create(output_size, hidden_size); 33 | matrix_xavier_randomize(rnn->output_weights, hidden_size, output_size); 34 | 35 | // Initialize hidden state to zeros 36 | rnn->hidden_state = matrix_zero(hidden_size, 1); 37 | 38 | return rnn; 39 | } 40 | 41 | void rnn_free(RNN *rnn) 42 | { 43 | if (rnn) 44 | { 45 | matrix_free(rnn->hidden_weights); 46 | matrix_free(rnn->output_weights); 47 | matrix_free(rnn->hidden_state); 48 | free(rnn); 49 | } 50 | } 51 | 52 | Matrix *rnn_forward(RNN *rnn, Matrix *input) 53 | { 54 | // Update hidden state: hidden_state = tanh(hidden_weights * input + hidden_state) 55 | Matrix *hidden_input = matrix_dot(rnn->hidden_weights, input); 56 | Matrix *new_hidden_state = matrix_add(hidden_input, rnn->hidden_state); 57 | matrix_free(hidden_input); 58 | 59 | // Apply tanh activation to the hidden state 60 | Matrix *activated_hidden = matrix_apply(tanh, new_hidden_state); 61 | matrix_free(new_hidden_state); 62 | 63 | // Update the RNN's hidden state 64 | matrix_free(rnn->hidden_state); 65 | rnn->hidden_state = matrix_copy(activated_hidden); 66 | 67 | // Compute output: output = output_weights * hidden_state 68 | Matrix *output = matrix_dot(rnn->output_weights, activated_hidden); 69 | matrix_free(activated_hidden); 70 | 71 | return output; 72 | } 73 | 74 | double square(double x) 75 | { 76 | return x * x; 77 | } 78 | 79 | void rnn_backward(RNN *rnn, Matrix *input, Matrix *target) 80 | { 81 | // Perform forward pass to get the output and hidden state 82 | Matrix *output = rnn_forward(rnn, input); 83 | 84 | // Compute the error in the output layer: output_error = output - target 85 | Matrix *output_error = matrix_subtract(output, target); 86 | 87 | // Compute the gradient of the loss with respect to the output weights 88 | // output_weights_gradient = output_error * hidden_state^T 89 | Matrix *hidden_state_transpose = matrix_transpose(rnn->hidden_state); 90 | Matrix *output_weights_gradient =
matrix_dot(output_error, hidden_state_transpose); 91 | matrix_free(hidden_state_transpose); 92 | 93 | // Update the output weights: output_weights -= learning_rate * output_weights_gradient 94 | Matrix *scaled_output_weights_gradient = matrix_scale(rnn->learning_rate, output_weights_gradient); 95 | Matrix *updated_output_weights = matrix_subtract(rnn->output_weights, scaled_output_weights_gradient); 96 | matrix_free(rnn->output_weights); 97 | rnn->output_weights = updated_output_weights; 98 | matrix_free(scaled_output_weights_gradient); 99 | matrix_free(output_weights_gradient); 100 | 101 | // Compute the gradient of the loss with respect to the hidden state 102 | // hidden_error = output_weights^T * output_error 103 | Matrix *output_weights_transpose = matrix_transpose(rnn->output_weights); 104 | Matrix *hidden_error = matrix_dot(output_weights_transpose, output_error); 105 | matrix_free(output_weights_transpose); 106 | 107 | // Compute the gradient of the loss with respect to the hidden weights 108 | // hidden_weights_gradient = hidden_error * input^T 109 | Matrix *input_transpose = matrix_transpose(input); 110 | Matrix *hidden_weights_gradient = matrix_dot(hidden_error, input_transpose); 111 | matrix_free(input_transpose); 112 | 113 | // Update the hidden weights: hidden_weights -= learning_rate * hidden_weights_gradient 114 | Matrix *scaled_hidden_weights_gradient = matrix_scale(rnn->learning_rate, hidden_weights_gradient); 115 | Matrix *updated_hidden_weights = matrix_subtract(rnn->hidden_weights, scaled_hidden_weights_gradient); 116 | matrix_free(rnn->hidden_weights); 117 | rnn->hidden_weights = updated_hidden_weights; 118 | matrix_free(scaled_hidden_weights_gradient); 119 | matrix_free(hidden_weights_gradient); 120 | 121 | // Update the hidden state for the next iteration 122 | matrix_free(rnn->hidden_state); 123 | rnn->hidden_state = matrix_copy(hidden_error); 124 | 125 | // Free memory 126 | matrix_free(output); 127 | matrix_free(output_error); 128 | 
matrix_free(hidden_error); 129 | } 130 | 131 | // Generate text using the RNN 132 | char *rnn_generate_text(Vocabulary* v, RNN *rnn, char *initial_input, int length) 133 | { 134 | // Convert initial_input to a matrix (one-hot encoded or otherwise) 135 | Matrix *input = matrix_create(rnn->input_size, 1); 136 | 137 | // Initialize the generated text buffer 138 | // Allocate enough space for the generated words and spaces between them 139 | char *generated_text = (char *)malloc((length * (MAX_WORD_LENGTH + 1)) * sizeof(char)); 140 | if (!generated_text) 141 | { 142 | fprintf(stderr, "Error: Unable to allocate memory for generated text\n"); 143 | exit(1); 144 | } 145 | generated_text[0] = '\0'; // Initialize as an empty string 146 | 147 | // Convert the initial input into a one-hot encoded vector and set it as the input 148 | Matrix *initial_input_vector = create_one_hot_vector(v, initial_input); 149 | Matrix *input_copy = matrix_copy(initial_input_vector); 150 | matrix_free(initial_input_vector); 151 | 152 | // Copy the input to the matrix input 153 | for (int i = 0; i < rnn->input_size; i++) { 154 | input->entries[i][0] = input_copy->entries[i][0]; 155 | } 156 | matrix_free(input_copy); 157 | 158 | // Generate text 159 | for (int i = 0; i < length; i++) 160 | { 161 | // Perform forward pass 162 | Matrix *output = rnn_forward(rnn, input); 163 | 164 | // Find the index of the word with the highest probability 165 | int predicted_word_index = matrix_argmax(output); 166 | 167 | // Map the predicted index to the actual word in the vocabulary 168 | const char *predicted_word = vocabulary_get_word(v, predicted_word_index); 169 | 170 | // Append the predicted word to the generated text 171 | strcat(generated_text, predicted_word); 172 | if (i < length - 1) { 173 | strcat(generated_text, " "); // Add a space between words 174 | } 175 | 176 | // Free the output matrix 177 | matrix_free(output); 178 | 179 | // Update input for the next step 180 | matrix_fill(input, 0.0); 181 | 
input->entries[predicted_word_index][0] = 1.0; // One-hot encoding of predicted word 182 | } 183 | 184 | // Free the input matrix 185 | matrix_free(input); 186 | 187 | return generated_text; 188 | } 189 | 190 | // Save the RNN model to a file 191 | void rnn_save(RNN *rnn, const char *filename) 192 | { 193 | FILE *file = fopen(filename, "wb"); 194 | if (!file) 195 | { 196 | fprintf(stderr, "Error: Unable to open file for saving RNN\n"); 197 | exit(1); 198 | } 199 | 200 | // Save RNN metadata 201 | fwrite(&rnn->input_size, sizeof(int), 1, file); 202 | fwrite(&rnn->hidden_size, sizeof(int), 1, file); 203 | fwrite(&rnn->output_size, sizeof(int), 1, file); 204 | fwrite(&rnn->learning_rate, sizeof(double), 1, file); 205 | 206 | // Save matrices 207 | matrix_save(rnn->hidden_weights, file); 208 | matrix_save(rnn->output_weights, file); 209 | matrix_save(rnn->hidden_state, file); 210 | 211 | fclose(file); 212 | } 213 | 214 | // Load the RNN model from a file 215 | RNN *rnn_load(const char *filename) 216 | { 217 | FILE *file = fopen(filename, "rb"); 218 | if (!file) 219 | { 220 | fprintf(stderr, "Error: Unable to open file for loading RNN\n"); 221 | exit(EXIT_FAILURE); 222 | } 223 | 224 | // Load RNN metadata 225 | int input_size, hidden_size, output_size; 226 | double learning_rate; 227 | fread(&input_size, sizeof(int), 1, file); 228 | fread(&hidden_size, sizeof(int), 1, file); 229 | fread(&output_size, sizeof(int), 1, file); 230 | fread(&learning_rate, sizeof(double), 1, file); 231 | 232 | // Initialize RNN 233 | RNN *rnn = rnn_init(input_size, hidden_size, output_size, learning_rate); 234 | 235 | // Load matrices 236 | matrix_free(rnn->hidden_weights); 237 | matrix_free(rnn->output_weights); 238 | matrix_free(rnn->hidden_state); 239 | rnn->hidden_weights = matrix_load(file); 240 | rnn->output_weights = matrix_load(file); 241 | rnn->hidden_state = matrix_load(file); 242 | 243 | fclose(file); 244 | return rnn; 245 | } 
-------------------------------------------------------------------------------- /src/model/rnn.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | #include "../matrix/matrix.h" 4 | #include "../vocabulary/vocabulary.h" 5 | 6 | typedef struct 7 | { 8 | int input_size; // Size of the input vector (e.g., vocabulary size) 9 | int hidden_size; // Size of the hidden state vector 10 | int output_size; // Size of the output vector (e.g., vocabulary size) 11 | double learning_rate; // Learning rate for training 12 | Matrix *hidden_weights; // Weights for the hidden state (input to hidden) 13 | Matrix *output_weights; // Weights for the output (hidden to output) 14 | Matrix *hidden_state; // Current hidden state of the RNN 15 | } RNN; 16 | 17 | RNN *rnn_init(int input_size, int hidden_size, int output_size, double learning_rate); 18 | void rnn_free(RNN *rnn); 19 | Matrix *rnn_forward(RNN *rnn, Matrix *input); // Forward pass 20 | void rnn_backward(RNN *rnn, Matrix *input, Matrix *target); // Backward pass (single-step update; no unrolling through time) 21 | char *rnn_generate_text(Vocabulary *v, RNN *rnn, char *initial_input, int length); 22 | void rnn_save(RNN *rnn, const char *filename); 23 | RNN *rnn_load(const char *filename); -------------------------------------------------------------------------------- /src/vocabulary/vocabulary.c: -------------------------------------------------------------------------------- 1 | #include "vocabulary.h" 2 | #include <stdio.h> 3 | #include <stdlib.h> 4 | #include <string.h> 5 | 6 | unsigned int hash_word(const char *word) 7 | { 8 | unsigned int hash = 0; 9 | while (*word) 10 | { 11 | hash = (hash * 31) + *word++; 12 | } 13 | return hash; 14 | } 15 | 16 | Vocabulary *vocabulary_create(int initial_capacity) 17 | { 18 | Vocabulary *v = (Vocabulary *)malloc(sizeof(Vocabulary)); 19 | if (!v) 20 | return NULL; // Memory allocation failure 21 | 22 | initial_capacity += VOCAB_SPECIAL_COUNT; 23 | 24 | v->table =
(VocabularyEntry **)calloc(initial_capacity, sizeof(VocabularyEntry *)); 25 | if (!v->table) 26 | { 27 | perror("Failed to allocate memory for hash table"); 28 | free(v); 29 | return NULL; // Memory allocation failure 30 | } 31 | v->size = 0; 32 | v->capacity = initial_capacity; 33 | 34 | // Add special tokens with predefined IDs 35 | vocabulary_add_word(v, "<UNK>"); // ID 0 36 | vocabulary_add_word(v, "<PAD>"); // ID 1 37 | vocabulary_add_word(v, "<BOS>"); // ID 2 38 | vocabulary_add_word(v, "<EOS>"); // ID 3 39 | 40 | return v; 41 | } 42 | 43 | void vocabulary_free(Vocabulary *v) 44 | { 45 | if (!v) 46 | return; 47 | for (int i = 0; i < v->capacity; i++) 48 | { 49 | VocabularyEntry *entry = v->table[i]; 50 | while (entry) 51 | { 52 | VocabularyEntry *next = entry->next; 53 | free(entry->word); 54 | free(entry); 55 | entry = next; 56 | } 57 | } 58 | free(v->table); 59 | free(v); 60 | } 61 | 62 | int vocabulary_add_word(Vocabulary *v, const char *word) 63 | { 64 | if (v->size == v->capacity) 65 | { 66 | fprintf(stderr, "Vocabulary already full\n"); // Not an errno condition, so perror would be misleading here 67 | return -1; // Vocabulary full 68 | } 69 | unsigned int hash = hash_word(word) % v->capacity; 70 | VocabularyEntry *entry = v->table[hash]; 71 | while (entry) 72 | { 73 | if (strcmp(entry->word, word) == 0) 74 | { 75 | return entry->id; 76 | } 77 | entry = entry->next; 78 | } 79 | 80 | VocabularyEntry *new_entry = (VocabularyEntry *)malloc(sizeof(VocabularyEntry)); 81 | if (!new_entry) 82 | { 83 | perror("Failed to allocate memory for new entry"); 84 | return -1; // Memory allocation failure 85 | } 86 | 87 | new_entry->word = strdup(word); 88 | if (!new_entry->word) 89 | { 90 | perror("Failed to allocate memory for word"); 91 | free(new_entry); // Free the entry structure 92 | return -1; 93 | } 94 | 95 | new_entry->id = v->size; 96 | new_entry->next = v->table[hash]; 97 | v->table[hash] = new_entry; 98 | 99 | v->size++; 100 | return new_entry->id; 101 | } 102 | 103 | int vocabulary_get_index(const Vocabulary *v, const char *word) 104 | { 105 | if
(!v || !word) 106 | return -1; 107 | 108 | unsigned int hash = hash_word(word) % v->capacity; 109 | VocabularyEntry *entry = v->table[hash]; 110 | 111 | while (entry) 112 | { 113 | if (strcmp(entry->word, word) == 0) 114 | { 115 | return entry->id; 116 | } 117 | entry = entry->next; 118 | } 119 | return -1; // Word not found 120 | } 121 | 122 | char *vocabulary_get_word(Vocabulary *v, int index) { 123 | for (int i = 0; i < v->capacity; i++) { 124 | VocabularyEntry *entry = v->table[i]; 125 | while (entry) { 126 | if (entry->id == index) { 127 | return entry->word; 128 | } 129 | entry = entry->next; 130 | } 131 | } 132 | return NULL; // Word not found 133 | } 134 | 135 | 136 | void vocabulary_print(const Vocabulary *v) 137 | { 138 | if (!v) 139 | { 140 | fprintf(stderr, "Vocabulary is NULL\n"); 141 | return; 142 | } 143 | 144 | for (int i = 0; i < v->capacity; i++) 145 | { 146 | VocabularyEntry *entry = v->table[i]; 147 | while (entry) 148 | { 149 | printf("Word: %s, ID: %d\n", entry->word, entry->id); 150 | entry = entry->next; 151 | } 152 | } 153 | } 154 | 155 | // Function to create a one-hot encoded vector for a given word 156 | Matrix *create_one_hot_vector(Vocabulary *v, char *word) 157 | { 158 | int index = vocabulary_get_index(v, word); 159 | if (index == -1) 160 | { 161 | fprintf(stderr, "Word '%s' not found in vocabulary\n", word); 162 | exit(1); 163 | } 164 | Matrix *one_hot = matrix_create(v->size, 1); 165 | for (int i = 0; i < v->size; i++) 166 | { 167 | one_hot->entries[i][0] = (i == index) ? 
1.0 : 0.0; 168 | } 169 | return one_hot; 170 | } -------------------------------------------------------------------------------- /src/vocabulary/vocabulary.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | #include "../matrix/matrix.h" 3 | 4 | // Enum for special token IDs 5 | typedef enum 6 | { 7 | VOCAB_UNK = 0, // <UNK> 8 | VOCAB_PAD, // <PAD> 9 | VOCAB_BOS, // <BOS> 10 | VOCAB_EOS, // <EOS> 11 | VOCAB_SPECIAL_COUNT // Total number of special tokens 12 | } SpecialTokens; 13 | 14 | typedef struct VocabularyEntry 15 | { 16 | char *word; 17 | int id; 18 | struct VocabularyEntry *next; // For chaining (handling collisions) 19 | } VocabularyEntry; 20 | 21 | typedef struct 22 | { 23 | VocabularyEntry **table; 24 | int size; // Number of words in the vocabulary 25 | int capacity; // Capacity of the hash table 26 | } Vocabulary; 27 | 28 | Vocabulary *vocabulary_create(int initial_capacity); 29 | void vocabulary_free(Vocabulary *v); 30 | int vocabulary_add_word(Vocabulary *v, const char *word); 31 | int vocabulary_get_index(const Vocabulary *v, const char *word); 32 | char *vocabulary_get_word(Vocabulary *v, int index); 33 | void vocabulary_print(const Vocabulary *v); 34 | Matrix *create_one_hot_vector(Vocabulary *v, char *word); --------------------------------------------------------------------------------