├── .gitignore ├── LICENSE ├── README.md ├── build.sh └── src ├── main.c ├── matrix ├── matrix.c └── matrix.h ├── model ├── rnn.c └── rnn.h └── vocabulary ├── vocabulary.c └── vocabulary.h /.gitignore: -------------------------------------------------------------------------------- 1 | /dist/ 2 | /.vscode/ 3 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2025 io-eric 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 
22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # RNN Text Generation Model in C 2 | 3 | This repository contains a simple Recurrent Neural Network (RNN) implemented from scratch in C, designed to generate text based on a given input word. The model is trained on a small dataset of sentences and learns to predict the next word in a sequence. For example, given the input word "Matrix," the model might predict "dimensions" as the next word. 4 | 5 | ## Overview 6 | This project implements a basic RNN model in C to generate text. The model is trained on a small dataset of sentences and learns to predict the next word in a sequence. The implementation includes: 7 | - Vocabulary creation and word indexing. 8 | - One-hot encoding for input and target vectors. 9 | - Forward and backward propagation for training. 10 | - Text generation based on a given input word. 11 | 12 | ## Features 13 | - From Scratch Implementation: The RNN is implemented entirely in C without relying on external machine learning libraries. 14 | - Text Generation: Given an input word, the model predicts the next word in the sequence. 15 | - Customizable Training: Adjustable parameters such as learning rate, hidden layer size, and number of epochs. 16 | 17 | ## Requirements 18 | - C Compiler: GCC or any C99-compatible compiler. 19 | - Basic C Libraries: Standard libraries like stdio.h, stdlib.h, and string.h. 20 | 21 | ## Installation 22 | 1. Clone the repository: 23 | ``` 24 | git clone https://github.com/io-eric/C-RNN-Text-Generation.git 25 | cd C-RNN-Text-Generation 26 | ``` 27 | 2. Compile and run the project: 28 | ``` 29 | ./build.sh 30 | ``` 31 | ## Usage 32 | After compiling and running the program, the model will: 33 | 1. Train on the provided dataset. 34 | 2. Generate text based on the input word "Matrix." 35 | 3. Print the generated text to the console. 
36 | 37 | To modify the input word or the number of generated words, edit the following lines in main.c: 38 | ````c 39 | char *input_text = "Matrix"; // Change the input word 40 | char *next_word_predictions = rnn_generate_text(v, rnn, input_text, 5); // Change the number of words to generate 41 | ```` 42 | 43 | ## Training Data 44 | The model is trained on a small dataset of sentences: 45 | ```c 46 | const char *training_data[] = { 47 | "Matrix dimensions don’t match? Shocking.", 48 | "Rain on the window? Wow, never seen that before.", 49 | "Starting is the hardest part? Groundbreaking insight.", 50 | "Traveling? Because who wouldn’t want to get lost in a new place?", 51 | "Books? Oh yeah, they’re just full of ideas or whatever.", 52 | "Time’s too short for pointless stuff... unless it’s procrastination." 53 | }; 54 | ``` 55 | 56 | You can replace this dataset with your own text data for custom training. 57 | 58 | ## Results 59 | After training, the model generates text based on the input word. For example: 60 | ```c 61 | $ ./build.sh 62 | Compilation successful! 63 | Epoch 0, Average Loss: 1.657553 64 | Epoch 1000, Average Loss: 0.044296 65 | Epoch 2000, Average Loss: 0.012500 66 | Epoch 3000, Average Loss: 0.012825 67 | Epoch 4000, Average Loss: 0.008907 68 | Input text: Matrix 69 | Next word predictions: dimensions dimensions never never full 70 | ``` 71 | 72 | The model's performance is limited by the small dataset and simple architecture, but it demonstrates the basic principles of RNNs and text generation. 73 | 74 | ## Limitations 75 | - Small Dataset: The model is trained on a very small dataset, which limits its ability to generalize. 76 | - Basic Architecture: The RNN is a simple implementation and may struggle with long-term dependencies. 77 | - Overfitting: Due to the small dataset, the model may overfit and repeat words. 78 | 79 | ## Future Improvements 80 | - Larger Dataset: Train the model on a larger and more diverse dataset. 
81 | - Advanced Architectures: Implement more advanced RNN variants like LSTMs or GRUs. 82 | - Better Text Generation: Improve the text generation logic to produce more coherent and varied outputs. 83 | - User Interface: Add a command-line interface for easier interaction with the model. 84 | 85 | ## License 86 | This project is licensed under the MIT License. See the [LICENSE](./LICENSE) file for details. 87 | -------------------------------------------------------------------------------- /build.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | # Set the output name for the executable 4 | OUTPUT="dist/rnn" 5 | 6 | # Define your source files 7 | SOURCES="src/main.c src/matrix/matrix.c src/vocabulary/vocabulary.c src/model/rnn.c" 8 | 9 | # Define any compiler flags if needed (e.g., for debugging) 10 | CFLAGS="-Wall -g" 11 | 12 | # Make sure the dist directory exists 13 | mkdir -p dist 14 | 15 | # Compile the source files into an executable 16 | gcc $CFLAGS $SOURCES -o $OUTPUT -lm 17 | 18 | # Check if the compilation was successful 19 | if [ $? -eq 0 ]; then 20 | echo "Compilation successful!" 21 | 22 | # Run the program 23 | ./$OUTPUT 24 | else 25 | echo "Compilation failed." 26 | fi 27 | -------------------------------------------------------------------------------- /src/main.c: -------------------------------------------------------------------------------- 1 | #include <stdio.h> 2 | #include <stdlib.h> 3 | #include <string.h> 4 | #include "model/rnn.h" 5 | #include "vocabulary/vocabulary.h" 6 | 7 | int main() 8 | { 9 | // Initialize vocabulary 10 | Vocabulary *v = vocabulary_create(100); 11 | 12 | const char *training_data[] = { 13 | "Matrix dimensions don’t match? Shocking.", 14 | "Rain on the window? Wow, never seen that before.", 15 | "Starting is the hardest part? Groundbreaking insight.", 16 | "Traveling? Because who wouldn’t want to get lost in a new place?", 17 | "Books? 
Oh yeah, they’re just full of ideas or whatever.", 18 | "Time’s too short for pointless stuff... unless it’s procrastination."}; 19 | 20 | int num_samples = sizeof(training_data) / sizeof(training_data[0]); 21 | 22 | // Add words to the vocabulary 23 | for (int i = 0; i < num_samples; i++) 24 | { 25 | char *sentence = strdup(training_data[i]); 26 | char *word = strtok(sentence, " "); 27 | while (word != NULL) 28 | { 29 | vocabulary_add_word(v, word); 30 | word = strtok(NULL, " "); 31 | } 32 | free(sentence); 33 | } 34 | 35 | // Parameters 36 | int input_size = v->size; 37 | int hidden_size = 100; 38 | int output_size = v->size; 39 | double learning_rate = 0.01; 40 | 41 | RNN *rnn = rnn_init(input_size, hidden_size, output_size, learning_rate); 42 | 43 | // Prepare training data: input-target pairs 44 | Matrix **input_vectors = (Matrix **)malloc(num_samples * sizeof(Matrix *)); 45 | Matrix **target_vectors = (Matrix **)malloc(num_samples * sizeof(Matrix *)); 46 | 47 | if (input_vectors == NULL || target_vectors == NULL) 48 | { 49 | fprintf(stderr, "Memory allocation failed\n"); 50 | return 1; 51 | } 52 | 53 | // Convert sentences to sequences of one-hot encoded vectors 54 | for (int i = 0; i < num_samples; i++) 55 | { 56 | char *sentence = strdup(training_data[i]); 57 | char *word = strtok(sentence, " "); 58 | input_vectors[i] = create_one_hot_vector(v, word); 59 | 60 | word = strtok(NULL, " "); 61 | target_vectors[i] = create_one_hot_vector(v, word); 62 | 63 | free(sentence); 64 | } 65 | 66 | // Train the RNN 67 | int epochs = 2000; 68 | for (int epoch = 0; epoch < epochs; epoch++) 69 | { 70 | double epoch_loss = 0.0; 71 | for (int i = 0; i < num_samples; i++) 72 | { 73 | Matrix *output = rnn_forward(rnn, input_vectors[i]); 74 | rnn_backward(rnn, input_vectors[i], target_vectors[i]); 75 | 76 | double loss = matrix_mean_square_error(output, target_vectors[i]); 77 | epoch_loss += loss; 78 | 79 | matrix_free(output); 80 | } 81 | 82 | double avg_epoch_loss = epoch_loss 
/ num_samples; 83 | if (epoch % 1000 == 0) // Print loss every 1000 epochs 84 | { 85 | printf("Epoch %d, Average Loss: %f\n", epoch, avg_epoch_loss); 86 | } 87 | } 88 | 89 | // Generate text after training 90 | char *input_text = "Matrix"; 91 | char *next_word_predictions = rnn_generate_text(v, rnn, input_text, 5); 92 | printf("Input text: %s\n", input_text); 93 | printf("Next word predictions: %s\n", next_word_predictions); 94 | 95 | // Clean up 96 | free(next_word_predictions); 97 | for (int i = 0; i < num_samples; i++) 98 | { 99 | matrix_free(input_vectors[i]); 100 | matrix_free(target_vectors[i]); 101 | } 102 | free(input_vectors); 103 | free(target_vectors); 104 | rnn_free(rnn); 105 | vocabulary_free(v); 106 | 107 | return 0; 108 | } -------------------------------------------------------------------------------- /src/matrix/matrix.c: -------------------------------------------------------------------------------- 1 | #include <stdio.h> 2 | #include <stdlib.h> 3 | #include <string.h> 4 | #include <math.h> 5 | #include <time.h> 6 | #include "matrix.h" 7 | 8 | #define MAXCHAR 100 9 | 10 | Matrix* matrix_create(int rows, int cols) { 11 | // Allocate memory for the matrix structure 12 | Matrix* matrix = malloc(sizeof(Matrix)); 13 | if (!matrix) return NULL; // Memory allocation failure check 14 | 15 | matrix->rows = rows; 16 | matrix->cols = cols; 17 | 18 | // Allocate memory for the matrix entries (array of row pointers) 19 | matrix->entries = malloc(rows * sizeof(double*)); 20 | if (!matrix->entries) { 21 | free(matrix); // Free matrix structure if allocation fails 22 | return NULL; 23 | } 24 | 25 | // Allocate memory for each row (array of doubles) 26 | for (int i = 0; i < rows; i++) { 27 | matrix->entries[i] = malloc(cols * sizeof(double)); 28 | if (!matrix->entries[i]) { 29 | // Free all previously allocated memory if row allocation fails 30 | for (int j = 0; j < i; j++) { 31 | free(matrix->entries[j]); 32 | } 33 | free(matrix->entries); 34 | free(matrix); 35 | return NULL; 36 | } 37 | } 38 | 39 | return
matrix; // Return the created matrix 40 | } 41 | 42 | // Function to free the matrix memory 43 | void matrix_free(Matrix* matrix) { 44 | if (matrix) { 45 | // Free each row 46 | for (int i = 0; i < matrix->rows; i++) { 47 | free(matrix->entries[i]); 48 | } 49 | // Free the row pointers array 50 | free(matrix->entries); 51 | // Free the matrix structure itself 52 | free(matrix); 53 | } 54 | } 55 | 56 | void matrix_fill(Matrix *m, double n) 57 | { 58 | for (int i = 0; i < m->rows; i++) 59 | { 60 | for (int j = 0; j < m->cols; j++) 61 | { 62 | m->entries[i][j] = n; 63 | } 64 | } 65 | } 66 | 67 | Matrix* matrix_zero(int rows, int cols) { 68 | Matrix* m = matrix_create(rows, cols); 69 | matrix_fill(m, 0.0); 70 | return m; 71 | } 72 | 73 | Matrix *matrix_copy(Matrix *m) 74 | { 75 | Matrix *mat = matrix_create(m->rows, m->cols); 76 | for (int i = 0; i < m->rows; i++) 77 | { 78 | for (int j = 0; j < m->cols; j++) 79 | { 80 | mat->entries[i][j] = m->entries[i][j]; 81 | } 82 | } 83 | return mat; 84 | } 85 | 86 | void matrix_print(Matrix *m) 87 | { 88 | printf("Rows: %d Columns: %d\n", m->rows, m->cols); 89 | for (int i = 0; i < m->rows; i++) 90 | { 91 | for (int j = 0; j < m->cols; j++) 92 | { 93 | printf("%1.3f ", m->entries[i][j]); 94 | } 95 | printf("\n"); 96 | } 97 | } 98 | 99 | void matrix_print_dimensions(Matrix *m) 100 | { 101 | printf("Rows: %d Columns: %d\n", m->rows, m->cols); 102 | } 103 | 104 | void matrix_save(Matrix *m, FILE *file) 105 | { 106 | fprintf(file, "%d\n", m->rows); 107 | fprintf(file, "%d\n", m->cols); 108 | for (int i = 0; i < m->rows; i++) 109 | { 110 | for (int j = 0; j < m->cols; j++) 111 | { 112 | fprintf(file, "%.6f\n", m->entries[i][j]); 113 | } 114 | } 115 | printf("Successfully saved matrix to file\n"); 116 | } 117 | 118 | Matrix *matrix_load(FILE *file) 119 | { 120 | char entry[MAXCHAR]; 121 | 122 | // Read matrix dimensions 123 | if (!fgets(entry, MAXCHAR, file)) 124 | return NULL; 125 | int rows = atoi(entry); 126 | if (!fgets(entry, 
MAXCHAR, file)) 127 | return NULL; 128 | int cols = atoi(entry); 129 | 130 | // Create matrix 131 | Matrix *m = matrix_create(rows, cols); 132 | if (!m) 133 | return NULL; // Memory allocation failed 134 | 135 | // Read matrix data 136 | for (int i = 0; i < m->rows; i++) 137 | { 138 | for (int j = 0; j < m->cols; j++) 139 | { 140 | if (!fgets(entry, MAXCHAR, file)) 141 | { matrix_free(m); return NULL; } // Avoid leaking m on a truncated file 142 | m->entries[i][j] = strtod(entry, NULL); 143 | } 144 | } 145 | return m; 146 | } 147 | 148 | int matrix_argmax(Matrix *m) 149 | { 150 | // Expects an Mx1 matrix 151 | double max_score = m->entries[0][0]; // Start from the first entry so all-negative outputs are handled 152 | int max_idx = 0; 153 | for (int i = 1; i < m->rows; i++) 154 | { 155 | if (m->entries[i][0] > max_score) 156 | { 157 | max_score = m->entries[i][0]; 158 | max_idx = i; 159 | } 160 | } 161 | return max_idx; 162 | } 163 | 164 | Matrix* matrix_row(Matrix* m, int row_index) { 165 | if (row_index >= m->rows || row_index < 0) { 166 | // Invalid row index 167 | return NULL; 168 | } 169 | 170 | // Create a new matrix with one row and the same number of columns 171 | Matrix* row = matrix_create(1, m->cols); 172 | 173 | // Copy the elements of the specified row into the new matrix 174 | for (int col = 0; col < m->cols; col++) { 175 | row->entries[0][col] = m->entries[row_index][col]; 176 | } 177 | 178 | return row; 179 | } 180 | 181 | double matrix_mean_square_error(Matrix *output, Matrix *target) { 182 | // Ensure the matrices have the same dimensions 183 | if (output->rows != target->rows || output->cols != target->cols) { 184 | fprintf(stderr, "Error: Matrices must have the same dimensions for MSE calculation.\n"); 185 | exit(EXIT_FAILURE); 186 | } 187 | 188 | double mse = 0.0; 189 | int total_elements = output->rows * output->cols; 190 | 191 | // Calculate the sum of squared differences 192 | for (int i = 0; i < output->rows; i++) { 193 | for (int j = 0; j < output->cols; j++) { 194 | double error = output->entries[i][j] - target->entries[i][j]; 195 | mse += pow(error, 2); // Square the error and
add to the total 196 | } 197 | } 198 | 199 | // Divide by the total number of elements to get the mean 200 | mse /= total_elements; 201 | 202 | return mse; 203 | } 204 | 205 | int matrix_check_dimensions(Matrix *m1, Matrix *m2) 206 | { 207 | return m1->rows == m2->rows && m1->cols == m2->cols; 208 | } 209 | 210 | void matrix_randomize(Matrix *m, double min, double max) 211 | { 212 | static int seeded = 0; 213 | if (!seeded) 214 | { 215 | srand((unsigned int)time(NULL)); // Seed random number generator once 216 | seeded = 1; 217 | } 218 | 219 | for (int i = 0; i < m->rows; i++) 220 | { 221 | for (int j = 0; j < m->cols; j++) 222 | { 223 | // Generate random numbers in the range [min, max) 224 | m->entries[i][j] = min + ((double)rand() / RAND_MAX) * (max - min); 225 | } 226 | } 227 | } 228 | 229 | void matrix_xavier_randomize(Matrix *m, int input_size, int output_size) 230 | { 231 | static int seeded = 0; 232 | if (!seeded) 233 | { 234 | srand((unsigned int)time(NULL)); // Seed random number generator once 235 | seeded = 1; 236 | } 237 | 238 | double limit = sqrt(6.0 / (input_size + output_size)); // Xavier limit 239 | 240 | for (int i = 0; i < m->rows; i++) 241 | { 242 | for (int j = 0; j < m->cols; j++) 243 | { 244 | // Initialize weights with values from the uniform distribution within [-limit, limit] 245 | m->entries[i][j] = (2.0 * ((double)rand() / RAND_MAX) - 1.0) * limit; 246 | } 247 | } 248 | } 249 | 250 | double matrix_sum_elements(Matrix *m) 251 | { 252 | double sum = 0.0; 253 | for (int i = 0; i < m->rows; i++) 254 | { 255 | for (int j = 0; j < m->cols; j++) 256 | { 257 | sum += m->entries[i][j]; 258 | } 259 | } 260 | return sum; 261 | } 262 | 263 | Matrix *matrix_add(Matrix *m1, Matrix *m2) 264 | { 265 | if (matrix_check_dimensions(m1, m2)) 266 | { 267 | Matrix *m = matrix_create(m1->rows, m1->cols); 268 | for (int i = 0; i < m1->rows; i++) 269 | { 270 | for (int j = 0; j < m2->cols; j++) 271 | { 272 | m->entries[i][j] = m1->entries[i][j] + 
m2->entries[i][j]; 273 | } 274 | } 275 | return m; 276 | } 277 | else 278 | { 279 | printf("(matrix_add) Dimensions mismatch add: %dx%d %dx%d\n", m1->rows, m1->cols, m2->rows, m2->cols); 280 | exit(EXIT_FAILURE); 281 | } 282 | } 283 | Matrix *matrix_subtract(Matrix *m1, Matrix *m2) 284 | { 285 | if (matrix_check_dimensions(m1, m2)) 286 | { 287 | Matrix *m = matrix_create(m1->rows, m1->cols); 288 | for (int i = 0; i < m1->rows; i++) 289 | { 290 | for (int j = 0; j < m2->cols; j++) 291 | { 292 | m->entries[i][j] = m1->entries[i][j] - m2->entries[i][j]; 293 | } 294 | } 295 | return m; 296 | } 297 | else 298 | { 299 | printf("(matrix_subtract) Dimensions mismatch subtract: %dx%d %dx%d\n", m1->rows, m1->cols, m2->rows, m2->cols); 300 | exit(EXIT_FAILURE); 301 | } 302 | } 303 | 304 | Matrix *matrix_multiply(Matrix *m1, Matrix *m2) 305 | { 306 | if (m1->cols == m2->rows) 307 | { 308 | // Create a result matrix with appropriate dimensions 309 | Matrix *result = matrix_create(m1->rows, m2->cols); 310 | 311 | // Perform matrix multiplication 312 | for (int i = 0; i < m1->rows; i++) 313 | { 314 | for (int j = 0; j < m2->cols; j++) 315 | { 316 | result->entries[i][j] = 0; 317 | for (int k = 0; k < m1->cols; k++) 318 | { 319 | result->entries[i][j] += m1->entries[i][k] * m2->entries[k][j]; 320 | } 321 | } 322 | } 323 | return result; 324 | } 325 | else 326 | { 327 | printf("(matrix_multiply) Dimensions mismatch multiply: %dx%d %dx%d\n", m1->rows, m1->cols, m2->rows, m2->cols); 328 | exit(EXIT_FAILURE); 329 | } 330 | } 331 | 332 | Matrix *matrix_dot(Matrix *m1, Matrix *m2) 333 | { 334 | if (m1->cols == m2->rows) 335 | { 336 | Matrix *m = matrix_create(m1->rows, m2->cols); 337 | for (int i = 0; i < m1->rows; i++) 338 | { 339 | for (int j = 0; j < m2->cols; j++) 340 | { 341 | double sum = 0; 342 | for (int k = 0; k < m2->rows; k++) 343 | { 344 | sum += m1->entries[i][k] * m2->entries[k][j]; 345 | } 346 | m->entries[i][j] = sum; 347 | } 348 | } 349 | return m; 350 | } 351 | else 
352 | { 353 | printf("(matrix_dot) Dimensions mismatch dot: %dx%d %dx%d\n", m1->rows, m1->cols, m2->rows, m2->cols); 354 | exit(EXIT_FAILURE); 355 | } 356 | } 357 | 358 | Matrix *matrix_apply(double (*func)(double), Matrix *m) 359 | { 360 | Matrix *mat = matrix_copy(m); 361 | for (int i = 0; i < m->rows; i++) 362 | { 363 | for (int j = 0; j < m->cols; j++) 364 | { 365 | mat->entries[i][j] = (*func)(m->entries[i][j]); 366 | } 367 | } 368 | return mat; 369 | } 370 | 371 | Matrix *matrix_scale(double n, Matrix *m) 372 | { 373 | Matrix *mat = matrix_copy(m); 374 | for (int i = 0; i < m->rows; i++) 375 | { 376 | for (int j = 0; j < m->cols; j++) 377 | { 378 | mat->entries[i][j] *= n; 379 | } 380 | } 381 | return mat; 382 | } 383 | 384 | Matrix *matrix_addScalar(double n, Matrix *m) 385 | { 386 | Matrix *mat = matrix_copy(m); 387 | for (int i = 0; i < m->rows; i++) 388 | { 389 | for (int j = 0; j < m->cols; j++) 390 | { 391 | mat->entries[i][j] += n; 392 | } 393 | } 394 | return mat; 395 | } 396 | 397 | Matrix *matrix_transpose(Matrix *m) 398 | { 399 | Matrix *mat = matrix_create(m->cols, m->rows); 400 | for (int i = 0; i < m->rows; i++) 401 | { 402 | for (int j = 0; j < m->cols; j++) 403 | { 404 | mat->entries[j][i] = m->entries[i][j]; 405 | } 406 | } 407 | return mat; 408 | } 409 | -------------------------------------------------------------------------------- /src/matrix/matrix.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | #include <stdio.h> 4 | 5 | typedef struct 6 | { 7 | double **entries; 8 | int rows; 9 | int cols; 10 | } Matrix; 11 | 12 | // Matrix Creation, Management, and Basic Utilities 13 | Matrix *matrix_create(int row, int col); 14 | void matrix_free(Matrix *m); 15 | void matrix_fill(Matrix *m, double n); 16 | Matrix *matrix_zero(int row, int col); 17 | Matrix *matrix_copy(Matrix *m); 18 | void matrix_print(Matrix *m); 19 | void matrix_print_dimensions(Matrix *m); 20 | 21 | // File Operations 22 | void
matrix_save(Matrix *m, FILE *file); 23 | Matrix *matrix_load(FILE *file); 24 | 25 | // Matrix Queries 26 | int matrix_argmax(Matrix *m); 27 | int matrix_check_dimensions(Matrix *m1, Matrix *m2); 28 | Matrix *matrix_row(Matrix *m, int row_index); 29 | double matrix_mean_square_error(Matrix *output, Matrix *target); 30 | 31 | // Matrix Operations 32 | void matrix_randomize(Matrix *m, double min, double max); 33 | void matrix_xavier_randomize(Matrix *m, int input_size, int output_size); 34 | double matrix_sum_elements(Matrix *m); 35 | Matrix *matrix_add(Matrix *m1, Matrix *m2); 36 | Matrix *matrix_subtract(Matrix *m1, Matrix *m2); 37 | Matrix *matrix_multiply(Matrix *m1, Matrix *m2); 38 | Matrix *matrix_dot(Matrix *m1, Matrix *m2); 39 | Matrix *matrix_apply(double (*func)(double), Matrix *m); 40 | Matrix *matrix_scale(double n, Matrix *m); 41 | Matrix *matrix_addScalar(double n, Matrix *m); 42 | Matrix *matrix_transpose(Matrix *m); -------------------------------------------------------------------------------- /src/model/rnn.c: -------------------------------------------------------------------------------- 1 | #include <stdio.h> 2 | #include <stdlib.h> 3 | #include <string.h> 4 | #include <math.h> 5 | #include <time.h> 6 | 7 | #include "rnn.h" 8 | #include "../matrix/matrix.h" 9 | #include "../vocabulary/vocabulary.h" 10 | 11 | #define MAX_WORD_LENGTH 64 12 | 13 | RNN *rnn_init(int input_size, int hidden_size, int output_size, double learning_rate) 14 | { 15 | RNN *rnn = (RNN *)malloc(sizeof(RNN)); 16 | if (!rnn) 17 | { 18 | fprintf(stderr, "Error: Unable to allocate memory for RNN\n"); 19 | exit(EXIT_FAILURE); 20 | } 21 | 22 | rnn->input_size = input_size; 23 | rnn->hidden_size = hidden_size; 24 | rnn->output_size = output_size; 25 | rnn->learning_rate = learning_rate; 26 | 27 | // Initialize hidden weights (input to hidden) 28 | rnn->hidden_weights = matrix_create(hidden_size, input_size); 29 | matrix_xavier_randomize(rnn->hidden_weights, input_size, hidden_size); 30 | 31 | // Initialize output weights (hidden to
output) 32 | rnn->output_weights = matrix_create(output_size, hidden_size); 33 | matrix_xavier_randomize(rnn->output_weights, hidden_size, output_size); 34 | 35 | // Initialize hidden state to zeros 36 | rnn->hidden_state = matrix_zero(hidden_size, 1); 37 | 38 | return rnn; 39 | } 40 | 41 | void rnn_free(RNN *rnn) 42 | { 43 | if (rnn) 44 | { 45 | matrix_free(rnn->hidden_weights); 46 | matrix_free(rnn->output_weights); 47 | matrix_free(rnn->hidden_state); 48 | free(rnn); 49 | } 50 | } 51 | 52 | Matrix *rnn_forward(RNN *rnn, Matrix *input) 53 | { 54 | // Update hidden state: hidden_state = tanh(hidden_weights * input + hidden_state) 55 | Matrix *hidden_input = matrix_dot(rnn->hidden_weights, input); 56 | Matrix *new_hidden_state = matrix_add(hidden_input, rnn->hidden_state); 57 | matrix_free(hidden_input); 58 | 59 | // Apply tanh activation to the hidden state 60 | Matrix *activated_hidden = matrix_apply(tanh, new_hidden_state); 61 | matrix_free(new_hidden_state); 62 | 63 | // Update the RNN's hidden state 64 | matrix_free(rnn->hidden_state); 65 | rnn->hidden_state = matrix_copy(activated_hidden); 66 | 67 | // Compute output: output = output_weights * hidden_state 68 | Matrix *output = matrix_dot(rnn->output_weights, activated_hidden); 69 | matrix_free(activated_hidden); 70 | 71 | return output; 72 | } 73 | 74 | double square(double x) 75 | { 76 | return x * x; 77 | } 78 | 79 | void rnn_backward(RNN *rnn, Matrix *input, Matrix *target) 80 | { 81 | // Perform forward pass to get the output and hidden state 82 | Matrix *output = rnn_forward(rnn, input); 83 | 84 | // Compute the error in the output layer: output_error = output - target 85 | Matrix *output_error = matrix_subtract(output, target); 86 | 87 | // Compute the gradient of the loss with respect to the output weights 88 | // output_weights_gradient = output_error * hidden_state^T 89 | Matrix *hidden_state_transpose = matrix_transpose(rnn->hidden_state); 90 | Matrix *output_weights_gradient =
matrix_dot(output_error, hidden_state_transpose); 91 | matrix_free(hidden_state_transpose); 92 | 93 | // Update the output weights: output_weights -= learning_rate * output_weights_gradient 94 | Matrix *scaled_output_weights_gradient = matrix_scale(rnn->learning_rate, output_weights_gradient); 95 | Matrix *updated_output_weights = matrix_subtract(rnn->output_weights, scaled_output_weights_gradient); 96 | matrix_free(rnn->output_weights); 97 | rnn->output_weights = updated_output_weights; 98 | matrix_free(scaled_output_weights_gradient); 99 | matrix_free(output_weights_gradient); 100 | 101 | // Compute the gradient of the loss with respect to the hidden state 102 | // hidden_error = output_weights^T * output_error 103 | Matrix *output_weights_transpose = matrix_transpose(rnn->output_weights); 104 | Matrix *hidden_error = matrix_dot(output_weights_transpose, output_error); 105 | matrix_free(output_weights_transpose); 106 | 107 | // Compute the gradient of the loss with respect to the hidden weights 108 | // hidden_weights_gradient = hidden_error * input^T 109 | Matrix *input_transpose = matrix_transpose(input); 110 | Matrix *hidden_weights_gradient = matrix_dot(hidden_error, input_transpose); 111 | matrix_free(input_transpose); 112 | 113 | // Update the hidden weights: hidden_weights -= learning_rate * hidden_weights_gradient 114 | Matrix *scaled_hidden_weights_gradient = matrix_scale(rnn->learning_rate, hidden_weights_gradient); 115 | Matrix *updated_hidden_weights = matrix_subtract(rnn->hidden_weights, scaled_hidden_weights_gradient); 116 | matrix_free(rnn->hidden_weights); 117 | rnn->hidden_weights = updated_hidden_weights; 118 | matrix_free(scaled_hidden_weights_gradient); 119 | matrix_free(hidden_weights_gradient); 120 | 121 | // Update the hidden state for the next iteration 122 | matrix_free(rnn->hidden_state); 123 | rnn->hidden_state = matrix_copy(hidden_error); 124 | 125 | // Free memory 126 | matrix_free(output); 127 | matrix_free(output_error); 128 | 
matrix_free(hidden_error); 129 | } 130 | 131 | // Generate text using the RNN 132 | char *rnn_generate_text(Vocabulary* v, RNN *rnn, char *initial_input, int length) 133 | { 134 | // Convert initial_input to a matrix (one-hot encoded or otherwise) 135 | Matrix *input = matrix_create(rnn->input_size, 1); 136 | 137 | // Initialize the generated text buffer 138 | // Allocate enough space for the generated words and spaces between them 139 | char *generated_text = (char *)malloc((length * (MAX_WORD_LENGTH + 1)) * sizeof(char)); 140 | if (!generated_text) 141 | { 142 | fprintf(stderr, "Error: Unable to allocate memory for generated text\n"); 143 | exit(1); 144 | } 145 | generated_text[0] = '\0'; // Initialize as an empty string 146 | 147 | // Convert the initial input into a one-hot encoded vector and set it as the input 148 | Matrix *initial_input_vector = create_one_hot_vector(v, initial_input); 149 | Matrix *input_copy = matrix_copy(initial_input_vector); 150 | matrix_free(initial_input_vector); 151 | 152 | // Copy the input to the matrix input 153 | for (int i = 0; i < rnn->input_size; i++) { 154 | input->entries[i][0] = input_copy->entries[i][0]; 155 | } 156 | matrix_free(input_copy); 157 | 158 | // Generate text 159 | for (int i = 0; i < length; i++) 160 | { 161 | // Perform forward pass 162 | Matrix *output = rnn_forward(rnn, input); 163 | 164 | // Find the index of the word with the highest probability 165 | int predicted_word_index = matrix_argmax(output); 166 | 167 | // Map the predicted index to the actual word in the vocabulary 168 | const char *predicted_word = vocabulary_get_word(v, predicted_word_index); 169 | 170 | // Append the predicted word to the generated text 171 | strcat(generated_text, predicted_word); 172 | if (i < length - 1) { 173 | strcat(generated_text, " "); // Add a space between words 174 | } 175 | 176 | // Free the output matrix 177 | matrix_free(output); 178 | 179 | // Update input for the next step 180 | matrix_fill(input, 0.0); 181 | 
input->entries[predicted_word_index][0] = 1.0; // One-hot encoding of predicted word 182 | } 183 | 184 | // Free the input matrix 185 | matrix_free(input); 186 | 187 | return generated_text; 188 | } 189 | 190 | // Save the RNN model to a file 191 | void rnn_save(RNN *rnn, const char *filename) 192 | { 193 | FILE *file = fopen(filename, "wb"); 194 | if (!file) 195 | { 196 | fprintf(stderr, "Error: Unable to open file for saving RNN\n"); 197 | exit(1); 198 | } 199 | 200 | // Save RNN metadata 201 | fwrite(&rnn->input_size, sizeof(int), 1, file); 202 | fwrite(&rnn->hidden_size, sizeof(int), 1, file); 203 | fwrite(&rnn->output_size, sizeof(int), 1, file); 204 | fwrite(&rnn->learning_rate, sizeof(double), 1, file); 205 | 206 | // Save matrices 207 | matrix_save(rnn->hidden_weights, file); 208 | matrix_save(rnn->output_weights, file); 209 | matrix_save(rnn->hidden_state, file); 210 | 211 | fclose(file); 212 | } 213 | 214 | // Load the RNN model from a file 215 | RNN *rnn_load(const char *filename) 216 | { 217 | FILE *file = fopen(filename, "rb"); 218 | if (!file) 219 | { 220 | fprintf(stderr, "Error: Unable to open file for loading RNN\n"); 221 | exit(EXIT_FAILURE); 222 | } 223 | 224 | // Load RNN metadata 225 | int input_size, hidden_size, output_size; 226 | double learning_rate; 227 | fread(&input_size, sizeof(int), 1, file); 228 | fread(&hidden_size, sizeof(int), 1, file); 229 | fread(&output_size, sizeof(int), 1, file); 230 | fread(&learning_rate, sizeof(double), 1, file); 231 | 232 | // Initialize RNN 233 | RNN *rnn = rnn_init(input_size, hidden_size, output_size, learning_rate); 234 | 235 | // Load matrices 236 | matrix_free(rnn->hidden_weights); 237 | matrix_free(rnn->output_weights); 238 | matrix_free(rnn->hidden_state); 239 | rnn->hidden_weights = matrix_load(file); 240 | rnn->output_weights = matrix_load(file); 241 | rnn->hidden_state = matrix_load(file); 242 | 243 | fclose(file); 244 | return rnn; 245 | } 
-------------------------------------------------------------------------------- /src/model/rnn.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | #include "../matrix/matrix.h" 4 | #include "../vocabulary/vocabulary.h" 5 | 6 | typedef struct 7 | { 8 | int input_size; // Size of the input vector (e.g., vocabulary size) 9 | int hidden_size; // Size of the hidden state vector 10 | int output_size; // Size of the output vector (e.g., vocabulary size) 11 | double learning_rate; // Learning rate for training 12 | Matrix *hidden_weights; // Weights for the hidden state (input to hidden) 13 | Matrix *output_weights; // Weights for the output (hidden to output) 14 | Matrix *hidden_state; // Current hidden state of the RNN 15 | } RNN; 16 | 17 | RNN *rnn_init(int input_size, int hidden_size, int output_size, double learning_rate); 18 | void rnn_free(RNN *rnn); 19 | Matrix *rnn_forward(RNN *rnn, Matrix *input); // Forward pass 20 | void rnn_backward(RNN *rnn, Matrix *input, Matrix *target); // Backward pass (single-step update; no unrolling through time) 21 | char *rnn_generate_text(Vocabulary *v, RNN *rnn, char *initial_input, int length); 22 | void rnn_save(RNN *rnn, const char *filename); 23 | RNN *rnn_load(const char *filename); -------------------------------------------------------------------------------- /src/vocabulary/vocabulary.c: -------------------------------------------------------------------------------- 1 | #include "vocabulary.h" 2 | #include <stdio.h> 3 | #include <stdlib.h> 4 | #include <string.h> 5 | 6 | unsigned int hash_word(const char *word) 7 | { 8 | unsigned int hash = 0; 9 | while (*word) 10 | { 11 | hash = (hash * 31) + *word++; 12 | } 13 | return hash; 14 | } 15 | 16 | Vocabulary *vocabulary_create(int initial_capacity) 17 | { 18 | Vocabulary *v = (Vocabulary *)malloc(sizeof(Vocabulary)); 19 | if (!v) 20 | return NULL; // Memory allocation failure 21 | 22 | initial_capacity += VOCAB_SPECIAL_COUNT; 23 | 24 | v->table =
(VocabularyEntry **)calloc(initial_capacity, sizeof(VocabularyEntry *)); 25 | if (!v->table) 26 | { 27 | perror("Failed to allocate memory for hash table"); 28 | free(v); 29 | return NULL; // Memory allocation failure 30 | } 31 | v->size = 0; 32 | v->capacity = initial_capacity; 33 | 34 | // Add special tokens with predefined IDs 35 | vocabulary_add_word(v, "<UNK>"); // ID 0 36 | vocabulary_add_word(v, "<PAD>"); // ID 1 37 | vocabulary_add_word(v, "<BOS>"); // ID 2 38 | vocabulary_add_word(v, "<EOS>"); // ID 3 39 | 40 | return v; 41 | } 42 | 43 | void vocabulary_free(Vocabulary *v) 44 | { 45 | if (!v) 46 | return; 47 | for (int i = 0; i < v->capacity; i++) 48 | { 49 | VocabularyEntry *entry = v->table[i]; 50 | while (entry) 51 | { 52 | VocabularyEntry *next = entry->next; 53 | free(entry->word); 54 | free(entry); 55 | entry = next; 56 | } 57 | } 58 | free(v->table); 59 | free(v); 60 | } 61 | 62 | int vocabulary_add_word(Vocabulary *v, const char *word) 63 | { 64 | if (v->size == v->capacity) 65 | { 66 | fprintf(stderr, "Vocabulary already full\n"); // Not an errno condition, so perror would be misleading here 67 | return -1; // Vocabulary full 68 | } 69 | unsigned int hash = hash_word(word) % v->capacity; 70 | VocabularyEntry *entry = v->table[hash]; 71 | while (entry) 72 | { 73 | if (strcmp(entry->word, word) == 0) 74 | { 75 | return entry->id; 76 | } 77 | entry = entry->next; 78 | } 79 | 80 | VocabularyEntry *new_entry = (VocabularyEntry *)malloc(sizeof(VocabularyEntry)); 81 | if (!new_entry) 82 | { 83 | perror("Failed to allocate memory for new entry"); 84 | return -1; // Memory allocation failure 85 | } 86 | 87 | new_entry->word = strdup(word); 88 | if (!new_entry->word) 89 | { 90 | perror("Failed to allocate memory for word"); 91 | free(new_entry); // Free the entry structure 92 | return -1; 93 | } 94 | 95 | new_entry->id = v->size; 96 | new_entry->next = v->table[hash]; 97 | v->table[hash] = new_entry; 98 | 99 | v->size++; 100 | return new_entry->id; 101 | } 102 | 103 | int vocabulary_get_index(const Vocabulary *v, const char *word) 104 | { 105 | if
(!v || !word) 106 | return -1; 107 | 108 | unsigned int hash = hash_word(word) % v->capacity; 109 | VocabularyEntry *entry = v->table[hash]; 110 | 111 | while (entry) 112 | { 113 | if (strcmp(entry->word, word) == 0) 114 | { 115 | return entry->id; 116 | } 117 | entry = entry->next; 118 | } 119 | return -1; // Word not found 120 | } 121 | 122 | char *vocabulary_get_word(Vocabulary *v, int index) { 123 | for (int i = 0; i < v->capacity; i++) { 124 | VocabularyEntry *entry = v->table[i]; 125 | while (entry) { 126 | if (entry->id == index) { 127 | return entry->word; 128 | } 129 | entry = entry->next; 130 | } 131 | } 132 | return NULL; // Word not found 133 | } 134 | 135 | 136 | void vocabulary_print(const Vocabulary *v) 137 | { 138 | if (!v) 139 | { 140 | fprintf(stderr, "Vocabulary is NULL\n"); 141 | return; 142 | } 143 | 144 | for (int i = 0; i < v->capacity; i++) 145 | { 146 | VocabularyEntry *entry = v->table[i]; 147 | while (entry) 148 | { 149 | printf("Word: %s, ID: %d\n", entry->word, entry->id); 150 | entry = entry->next; 151 | } 152 | } 153 | } 154 | 155 | // Function to create a one-hot encoded vector for a given word 156 | Matrix *create_one_hot_vector(Vocabulary *v, char *word) 157 | { 158 | int index = vocabulary_get_index(v, word); 159 | if (index == -1) 160 | { 161 | fprintf(stderr, "Word '%s' not found in vocabulary\n", word); 162 | exit(1); 163 | } 164 | Matrix *one_hot = matrix_create(v->size, 1); 165 | for (int i = 0; i < v->size; i++) 166 | { 167 | one_hot->entries[i][0] = (i == index) ? 
1.0 : 0.0; 168 | } 169 | return one_hot; 170 | } -------------------------------------------------------------------------------- /src/vocabulary/vocabulary.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | #include "../matrix/matrix.h" 3 | 4 | // Enum for special token IDs 5 | typedef enum 6 | { 7 | VOCAB_UNK = 0, // <UNK> 8 | VOCAB_PAD, // <PAD> 9 | VOCAB_BOS, // <BOS> 10 | VOCAB_EOS, // <EOS> 11 | VOCAB_SPECIAL_COUNT // Total number of special tokens 12 | } SpecialTokens; 13 | 14 | typedef struct VocabularyEntry 15 | { 16 | char *word; 17 | int id; 18 | struct VocabularyEntry *next; // For chaining (handling collisions) 19 | } VocabularyEntry; 20 | 21 | typedef struct 22 | { 23 | VocabularyEntry **table; 24 | int size; // Number of words in the vocabulary 25 | int capacity; // Capacity of the hash table 26 | } Vocabulary; 27 | 28 | Vocabulary *vocabulary_create(int initial_capacity); 29 | void vocabulary_free(Vocabulary *v); 30 | int vocabulary_add_word(Vocabulary *v, const char *word); 31 | int vocabulary_get_index(const Vocabulary *v, const char *word); 32 | char *vocabulary_get_word(Vocabulary *v, int index); 33 | void vocabulary_print(const Vocabulary *v); 34 | Matrix *create_one_hot_vector(Vocabulary *v, char *word); --------------------------------------------------------------------------------