├── images ├── LLM_oobabooga.png ├── LLM_oobabooga2.png ├── LLM_oobabooga3.png ├── LLM_oobabooga4.png ├── convertToGGML.png ├── executeGGML4B.png ├── executeGGMLF16.png ├── quantizedModel.png └── quantizeTo4Bits.png ├── ABLITERATION.md ├── UNSLOTH.md ├── .gitignore ├── README.md └── Notebooks ├── Train Vicuna 7b.ipynb ├── Sentence Similarity 1.ipynb ├── K-Means Clustering.ipynb └── Neural Networks 1 - Single Neuron.ipynb /images/LLM_oobabooga.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/danielsobrado/llm_notebooks/HEAD/images/LLM_oobabooga.png -------------------------------------------------------------------------------- /images/LLM_oobabooga2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/danielsobrado/llm_notebooks/HEAD/images/LLM_oobabooga2.png -------------------------------------------------------------------------------- /images/LLM_oobabooga3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/danielsobrado/llm_notebooks/HEAD/images/LLM_oobabooga3.png -------------------------------------------------------------------------------- /images/LLM_oobabooga4.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/danielsobrado/llm_notebooks/HEAD/images/LLM_oobabooga4.png -------------------------------------------------------------------------------- /images/convertToGGML.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/danielsobrado/llm_notebooks/HEAD/images/convertToGGML.png -------------------------------------------------------------------------------- /images/executeGGML4B.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/danielsobrado/llm_notebooks/HEAD/images/executeGGML4B.png -------------------------------------------------------------------------------- /images/executeGGMLF16.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/danielsobrado/llm_notebooks/HEAD/images/executeGGMLF16.png -------------------------------------------------------------------------------- /images/quantizedModel.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/danielsobrado/llm_notebooks/HEAD/images/quantizedModel.png -------------------------------------------------------------------------------- /images/quantizeTo4Bits.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/danielsobrado/llm_notebooks/HEAD/images/quantizeTo4Bits.png -------------------------------------------------------------------------------- /ABLITERATION.md: -------------------------------------------------------------------------------- 1 | # Abliterating LLMs: A Simple Technique to Bypass Refusal 2 | 3 | Abliterating LLMs is a process that allows you to bypass the safety constraints of language models without the need for retraining. By identifying and removing a specific "refusal direction" in the model's activation space, you can make the model respond to harmful requests that it would normally refuse. Here's how it works: 4 | 5 | ## Step 1: Finding the Refusal Direction 6 | 7 | First, we need to find the "refusal direction" in the model's activation space. To do this, we: 8 | 9 | 1. Prepare a set of harmful and harmless instructions. 10 | 2. Feed these instructions to the model and record the activations at a specific layer and position. 11 | 3. Calculate the average difference between the activations for harmful and harmless instructions. 12 | 4. 
Normalize this difference to get the "refusal direction." 13 | 14 | Here's the code to do this: 15 | 16 | ``` 17 | # Prepare harmful and harmless instructions 18 | harmful_toks = tokenize_instructions(harmful_instructions) 19 | harmless_toks = tokenize_instructions(harmless_instructions) 20 | 21 | # Run the model and record activations 22 | harmful_acts = run_model(harmful_toks) 23 | harmless_acts = run_model(harmless_toks) 24 | 25 | # Calculate the refusal direction 26 | layer = 14 27 | pos = -1 28 | harmful_mean = harmful_acts[layer][:, pos, :].mean(dim=0) 29 | harmless_mean = harmless_acts[layer][:, pos, :].mean(dim=0) 30 | refusal_dir = (harmful_mean - harmless_mean).normalized() 31 | ``` 32 | 33 | ## Step 2: Removing the Refusal Direction 34 | 35 | Now that we have the refusal direction, we can remove it from the model's activations during inference. This prevents the model from recognizing and refusing harmful requests. 36 | 37 | Here's a function that removes the refusal direction from an activation: 38 | 39 | ``` 40 | def remove_refusal_direction(activation, refusal_dir): 41 | projection = einops.einsum(activation, refusal_dir, 'batch hidden, hidden -> batch') 42 | return activation - projection[:, None] * refusal_dir 43 | ``` 44 | 45 | We can apply this function to the model's activations at multiple layers using hooks: 46 | 47 | ``` 48 | def apply_hooks(model, refusal_dir): 49 | def hook(activation, hook): 50 | return remove_refusal_direction(activation, refusal_dir) 51 | 52 | for layer in model.layers: 53 | layer.register_forward_hook(hook) 54 | ``` 55 | 56 | ## Step 3: Generating with the Modified Model 57 | 58 | Finally, we can generate responses using the modified model: 59 | 60 | ``` 61 | apply_hooks(model, refusal_dir) 62 | generated_text = model.generate(harmful_instruction) 63 | ``` 64 | 65 | The generated text will now include responses to harmful requests that the original model would have refused. 
66 | 67 | ## Conclusion 68 | 69 | By identifying and removing a specific direction in the model's activation space, we can make the model respond to harmful requests without the need for retraining. 70 | 71 | This technique highlights the vulnerability of current approaches to making language models safe and aligned. It also opens up possibilities for better understanding how these models work internally. -------------------------------------------------------------------------------- /UNSLOTH.md: -------------------------------------------------------------------------------- 1 | # Setting Up Mamba, Installing `unsloth` in WSL2, and Adding Conda Environment as a Kernel in Jupyter 2 | 3 | Install the `unsloth` package, and make your Conda environment available as a kernel in Jupyter, on a system running WSL2 on Windows 11 with an Nvidia 40XX GPU. 4 | 5 | ## Step 1: Install Mamba 6 | 7 | First, download and install Mambaforge: 8 | 9 | ``` 10 | wget https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-Linux-x86_64.sh 11 | bash Mambaforge-Linux-x86_64.sh 12 | ``` 13 | 14 | ## Step 2: Create and Activate a Conda Environment 15 | 16 | Create a new Conda environment named `unsloth_env` with Python 3.10: 17 | 18 | ``` 19 | mamba create --name unsloth_env python=3.10 20 | mamba activate unsloth_env 21 | ``` 22 | 23 | ## Step 3: Install Required Packages 24 | 25 | Install the necessary packages including `pytorch`, `cudatoolkit`, `xformers`, and `bitsandbytes`: 26 | 27 | ``` 28 | mamba install cudatoolkit xformers bitsandbytes pytorch pytorch-cuda=12.1 -c pytorch -c nvidia -c xformers -c conda-forge 29 | ``` 30 | 31 | ## Step 4: Upgrade Pip 32 | 33 | Upgrade `pip` to the latest version to avoid installation issues: 34 | 35 | ``` 36 | pip install --upgrade pip 37 | ``` 38 | 39 | ## Step 5: Install the `unsloth` Package 40 | 41 | Install the `unsloth` package from the GitHub repository: 42 | 43 | ``` 44 | pip install "unsloth[conda] @ 
git+https://github.com/unslothai/unsloth.git" 45 | ``` 46 | 47 | ## Step 6: Install and Configure Jupyter Kernel 48 | 49 | Install `ipykernel` and add the Conda environment as a kernel in Jupyter: 50 | 51 | ``` 52 | mamba install ipykernel 53 | python -m ipykernel install --user --name=unsloth_env --display-name "Python (unsloth_env)" 54 | ``` 55 | 56 | ## Step 7: Verify Kernel Installation 57 | 58 | Check that the kernel was added to Jupyter: 59 | 60 | ``` 61 | jupyter kernelspec list 62 | ``` 63 | 64 | ## Step 8: Start Jupyter Notebook or JupyterLab 65 | 66 | Start Jupyter Notebook or JupyterLab to use the newly added kernel: 67 | 68 | ``` 69 | jupyter notebook 70 | ``` 71 | 72 | or 73 | 74 | ``` 75 | jupyter lab 76 | ``` 77 | 78 | ## Troubleshooting 79 | 80 | ### ImportError: cannot import name 'packaging' from 'pkg_resources' 81 | 82 | If you encounter the following error: 83 | 84 | ``` 85 | ImportError: cannot import name 'packaging' from 'pkg_resources' (/home/drusniel/mambaforge/envs/unsloth_env/lib/python3.10/site-packages/pkg_resources/__init__.py) 86 | ``` 87 | 88 | You can resolve it by pinning `setuptools` to version 69.5.1 with Mamba (version 70 did not work for me): 89 | 90 | ``` 91 | mamba install setuptools==69.5.1 92 | ``` 93 | 94 | ### ValueError: Query/Key/Value should all have BMHK or BMK shape 95 | 96 | If you encounter the following error: 97 | ``` 98 | ValueError: Query/Key/Value should all have BMHK or BMK shape. 99 | ``` 100 | 101 | 102 | ## Environment Details 103 | 104 | - **Operating System:** Windows 11 with WSL2 105 | - **GPU:** Nvidia 40XX 106 | 107 | By following these steps, I was able to set up my environment, install the necessary packages, and configure the Jupyter kernel correctly.
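The kernel check in Step 7 can also be scripted. The following is a minimal sketch, not part of the original setup: it parses the JSON printed by `jupyter kernelspec list --json` and reports whether `unsloth_env` is registered. The helper name `kernel_is_registered` is hypothetical.

```python
import json
import shutil
import subprocess


def kernel_is_registered(kernelspec_json: str, name: str) -> bool:
    """Return True if `name` is a key under "kernelspecs" in the JSON listing."""
    specs = json.loads(kernelspec_json).get("kernelspecs", {})
    return name in specs


if __name__ == "__main__":
    # Only query Jupyter when the CLI is actually on PATH.
    if shutil.which("jupyter") is None:
        print("jupyter not found; activate unsloth_env first")
    else:
        result = subprocess.run(
            ["jupyter", "kernelspec", "list", "--json"],
            capture_output=True,
            text=True,
        )
        if result.returncode == 0:
            found = kernel_is_registered(result.stdout, "unsloth_env")
            print("unsloth_env registered:", found)
        else:
            print("jupyter kernelspec list failed:", result.stderr.strip())
```

Keeping the parsing in its own function means it can be checked against a saved listing without Jupyter installed.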
108 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | # Byte-compiled / optimized / DLL files 2 | __pycache__/ 3 | *.py[cod] 4 | *$py.class 5 | 6 | # C extensions 7 | *.so 8 | 9 | # Distribution / packaging 10 | .Python 11 | build/ 12 | develop-eggs/ 13 | dist/ 14 | downloads/ 15 | eggs/ 16 | .eggs/ 17 | lib/ 18 | lib64/ 19 | parts/ 20 | sdist/ 21 | var/ 22 | wheels/ 23 | share/python-wheels/ 24 | *.egg-info/ 25 | .installed.cfg 26 | *.egg 27 | MANIFEST 28 | 29 | # PyInstaller 30 | # Usually these files are written by a python script from a template 31 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 32 | *.manifest 33 | *.spec 34 | 35 | # Installer logs 36 | pip-log.txt 37 | pip-delete-this-directory.txt 38 | 39 | # Unit test / coverage reports 40 | htmlcov/ 41 | .tox/ 42 | .nox/ 43 | .coverage 44 | .coverage.* 45 | .cache 46 | nosetests.xml 47 | coverage.xml 48 | *.cover 49 | *.py,cover 50 | .hypothesis/ 51 | .pytest_cache/ 52 | cover/ 53 | 54 | # Translations 55 | *.mo 56 | *.pot 57 | 58 | # Django stuff: 59 | *.log 60 | local_settings.py 61 | db.sqlite3 62 | db.sqlite3-journal 63 | 64 | # Flask stuff: 65 | instance/ 66 | .webassets-cache 67 | 68 | # Scrapy stuff: 69 | .scrapy 70 | 71 | # Sphinx documentation 72 | docs/_build/ 73 | 74 | # PyBuilder 75 | .pybuilder/ 76 | target/ 77 | 78 | # Jupyter Notebook 79 | .ipynb_checkpoints 80 | 81 | # IPython 82 | profile_default/ 83 | ipython_config.py 84 | 85 | # pyenv 86 | # For a library or package, you might want to ignore these files since the code is 87 | # intended to run in multiple environments; otherwise, check them in: 88 | # .python-version 89 | 90 | # pipenv 91 | # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control. 
92 | # However, in case of collaboration, if having platform-specific dependencies or dependencies 93 | # having no cross-platform support, pipenv may install dependencies that don't work, or not 94 | # install all needed dependencies. 95 | #Pipfile.lock 96 | 97 | # poetry 98 | # Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control. 99 | # This is especially recommended for binary packages to ensure reproducibility, and is more 100 | # commonly ignored for libraries. 101 | # https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control 102 | #poetry.lock 103 | 104 | # pdm 105 | # Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control. 106 | #pdm.lock 107 | # pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it 108 | # in version control. 109 | # https://pdm.fming.dev/#use-with-ide 110 | .pdm.toml 111 | 112 | # PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm 113 | __pypackages__/ 114 | 115 | # Celery stuff 116 | celerybeat-schedule 117 | celerybeat.pid 118 | 119 | # SageMath parsed files 120 | *.sage.py 121 | 122 | # Environments 123 | .env 124 | .venv 125 | env/ 126 | venv/ 127 | ENV/ 128 | env.bak/ 129 | venv.bak/ 130 | 131 | # Spyder project settings 132 | .spyderproject 133 | .spyproject 134 | 135 | # Rope project settings 136 | .ropeproject 137 | 138 | # mkdocs documentation 139 | /site 140 | 141 | # mypy 142 | .mypy_cache/ 143 | .dmypy.json 144 | dmypy.json 145 | 146 | # Pyre type checker 147 | .pyre/ 148 | 149 | # pytype static type analyzer 150 | .pytype/ 151 | 152 | # Cython debug symbols 153 | cython_debug/ 154 | 155 | # PyCharm 156 | # JetBrains specific template is maintained in a separate JetBrains.gitignore that can 157 | # be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore 158 | # and can be added to the global gitignore or merged into 
this file. For a more nuclear 159 | # option (not recommended) you can uncomment the following to ignore the entire idea folder. 160 | #.idea/ 161 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # LLMs 2 | 3 | Large Language Models (LLMs) are trained to produce text that resembles human writing. By training on a massive amount of text data, they learn to predict the next token in a sequence from the tokens that precede it. For this reason they are often called autoregressive models. 4 | 5 | Let's start with some concepts: 6 | 7 | **Embedding** is a vector that represents the meaning of a token or of a portion of an image. 8 | 9 | **Context** is the model's finite input window: it can only attend to a limited number of tokens at a time. 10 | 11 | **LoRA** (Low-Rank Adaptation) is a small plug-in set of low-rank weight matrices trained on top of a frozen base model to fine-tune it, that is, to reduce its "loss" on new data. 12 | 13 | **Loss** is a score that indicates how far the model's output was from the expected output; lower is better. 14 | 15 | **Quantization** produces a lower-precision version of the model that still works but needs less memory and compute and usually runs faster. 16 | 17 | **GPT** is trained to predict the next token; by feeding each prediction back in as input, it can generate long passages of text. 18 | 19 | Note: I tested these steps on WSL2 - Ubuntu 20.04. 20 | 21 | ## About formats and conversions 22 | 23 | Language models can be saved and loaded in various formats. These are the best-known frameworks and how they store models: 24 | 25 | **PyTorch Model (.pt or .pth)**: This is a common format for models trained using the PyTorch framework. It usually stores the state_dict (the "state dictionary"), a Python dictionary that maps each layer's parameter names to its trainable parameters (weights and biases).
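To make the idea of a state dictionary concrete, here is a toy stand-in that runs without PyTorch. It is only a sketch under stated assumptions: plain Python lists play the role of tensors, the layer names are made up, and `pickle` is called directly because `torch.save()` relies on pickle by default.

```python
import io
import pickle
from collections import OrderedDict

# Toy "state_dict": parameter name -> parameter values.
# Real PyTorch state_dicts map names to torch.Tensor objects; lists stand in here.
state_dict = OrderedDict(
    [
        ("linear1.weight", [[0.1, -0.2], [0.3, 0.4]]),  # 2x2 weight matrix
        ("linear1.bias", [0.0, 0.1]),
        ("linear2.weight", [[0.5, -0.6]]),  # 1x2 weight matrix
        ("linear2.bias", [0.2]),
    ]
)


def count_parameters(sd):
    """Count scalar values across all entries (handles 1-D and 2-D lists)."""
    total = 0
    for values in sd.values():
        for row in values:
            total += len(row) if isinstance(row, list) else 1
    return total


# Round-trip through pickle, roughly what saving and loading a .pt/.pth file involves.
buffer = io.BytesIO()
pickle.dump(state_dict, buffer)
buffer.seek(0)
restored = pickle.load(buffer)

print(list(restored.keys()))
print("parameters:", count_parameters(restored))
```

Because the file holds only names and values, loading it into a real model still requires the code that defines the layers.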
26 | 27 | **TensorFlow Checkpoints**: TensorFlow is another popular framework for training machine learning models. A checkpoint file contains the weights of a trained model. Unlike a full saved model, it contains no description of the computation that the model performs, only the weights, so you also need the source code that builds the model in order to use it. 28 | 29 | **Hugging Face Transformers**: Hugging Face is a company known for its Transformers library, which provides state-of-the-art general-purpose architectures. The library has its own model saving and loading mechanisms, usually leveraging PyTorch or TensorFlow under the hood. You can save a model using the save_pretrained() method and load it using from_pretrained(). 30 | 31 | Here is a brief overview of the different language model file formats: 32 | 33 | * **GGML** is the tensor library behind llama.cpp, written by Georgi Gerganov (the "GG" in the name), and also the name of its model file format. It is a binary format designed for efficient inference on CPUs and it supports quantized weights. 34 | * **HF** refers to the Hugging Face Transformers layout: a directory with a configuration file, tokenizer files, and weight files, as written by save_pretrained(). It is the usual starting point for conversion to other formats such as GGML. 35 | * **Checkpoints .ckpt** are saved states of a model's training process. They can be used to resume training or to load a model that has already been trained. 36 | * **ONNX** (Open Neural Network Exchange) is a cross-platform format for machine learning models. It can be used to store and share models between different frameworks and runtimes. 37 | * **Safetensors** is a newer format for storing tensors safely. Unlike pickle-based formats, which can execute arbitrary code when loaded, a safetensors file contains only tensor data and metadata.
Safetensors is also typically faster to load than pickle-based files, making it a good choice for production deployment. 38 | * **TensorFlow .pb** is a Protocol Buffers file used by TensorFlow to store a model's graph, optionally together with its weights. Despite sometimes being listed alongside PyTorch formats, it is not a PyTorch format. 39 | * **Pytorch .pt** is the most common extension for files saved with PyTorch. It is a binary, pickle-based file that usually holds the model's state_dict. 40 | * **Pytorch .pth** is another common extension for the same kind of file. It is also binary; `.pt` and `.pth` differ only in name. 41 | * **.bin** is a generic extension for a binary file that stores the parameters of a language model. Hugging Face uses it for pickle-based PyTorch weights, and older llama.cpp GGML models use it too (for example `ggml-model-f16.bin`), so the extension alone does not identify the format. 42 | 43 | ### Quantization 44 | 45 | Quantization is a technique for reducing the size and compute cost of machine learning models. It works by representing the model's weights (and sometimes its activations) in a lower precision format, for example 4-bit integers instead of 16-bit floats. This can significantly reduce model size and inference time, usually at the cost of a small loss in accuracy. 46 | 47 | There are two main types of quantization: post-training quantization and quantization aware training. 48 | 49 | **Post-training quantization** is the most common type of quantization. It converts a model to a lower precision format after it has been trained, with no further training. The llama.cpp `quantize` program used below and GPTQ both work this way. 50 | 51 | **Quantization aware training** simulates quantization during training, so the model learns weights that tolerate the reduced precision. It usually gives better accuracy than post-training quantization but requires a training run. 52 | 53 | ### Models available 54 | 55 | **Llama** 56 | 57 | LLaMA is a family of large language models (LLMs) released by Meta AI in February 2023. Models were trained in several sizes, ranging from 7 billion to 65 billion parameters.
LLaMA's developers reported that the 13 billion parameter model's performance on most NLP benchmarks exceeded that of the much larger GPT-3 (with 175 billion parameters) and that the largest model was competitive with state-of-the-art models such as PaLM and Chinchilla. 58 | 59 | **Open Llama** 60 | 61 | [Open Llama](https://github.com/s-JoL/Open-Llama) is an open-source reproduction of Meta AI's LLaMA model. The creators of OpenLLaMA have released a permissively licensed 7B preview model trained on 200 billion tokens. 62 | 63 | **Vicuna** 64 | Vicuna is a chat model created by fine-tuning LLaMA on user-shared conversations collected from ShareGPT. Early Vicuna releases were distributed as delta weights: the difference between Vicuna's weights and the original LLaMA weights. A delta is the same size as the model and is not usable on its own; adding it to the base LLaMA weights reconstructs the full Vicuna model. Its authors reported that Vicuna improves on LLaMA's quality as a chat assistant. 65 | 66 | The Vicuna delta weights are available from the Hugging Face Hub.
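Applying a delta release amounts to adding the delta to the base weights, parameter by parameter. The sketch below is a toy illustration only: the parameter names are made up, plain Python lists stand in for tensors, and the helper names `make_delta` and `apply_delta` are hypothetical rather than the actual Vicuna conversion script.

```python
def make_delta(base, tuned):
    """Delta = fine-tuned weights minus base weights, per parameter."""
    return {
        name: [t - b for t, b in zip(tuned[name], base[name])]
        for name in base
    }


def apply_delta(base, delta):
    """Reconstruct the fine-tuned weights by adding the delta to the base."""
    return {
        name: [b + d for b, d in zip(base[name], delta[name])]
        for name in base
    }


# Toy weights with hypothetical names; small integers keep the arithmetic exact.
base = {"layer0.weight": [1, 2, 3], "layer0.bias": [0, 0, 1]}
tuned = {"layer0.weight": [2, 2, 1], "layer0.bias": [1, 0, 1]}

delta = make_delta(base, tuned)
restored = apply_delta(base, delta)

print(delta)
print(restored == tuned)
```

The delta only becomes a usable model once it is combined with the matching base weights, which is why the base model has to be obtained separately.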
67 | 68 | ## Convert Open Llama Checkpoint to quantized GGML format 69 | 70 | Download Open Llama into your models folder: 71 | ``` 72 | git clone https://huggingface.co/openlm-research/open_llama_7b_preview_200bt/ 73 | ``` 74 | 75 | Clone llama.cpp and build it: 76 | ``` 77 | git clone https://github.com/ggerganov/llama.cpp 78 | cd llama.cpp 79 | cmake -B build 80 | cmake --build build 81 | ``` 82 | 83 | Convert it from ```.pth``` to ```.ggml``` (the trailing `1` selects f16 output): 84 | 85 | ``` 86 | python3 convert-pth-to-ggml.py ../models/open_llama_7b_preview_200bt/open_llama_7b_preview_200bt_transformers_weights 1 87 | ``` 88 | 89 | ![convertToGGML](https://github.com/danielsobrado/llm_notebooks/blob/main/images/convertToGGML.png) 90 | 91 | Test the f16 model: 92 | 93 | ./build/bin/main -m ../models/open_llama_7b_preview_200bt/open_llama_7b_preview_200bt_transformers_weights/ggml-model-f16.bin --ignore-eos -n 1280 -p "Give me in python the quicksort algorithm" --mlock 94 | 95 | ![executeGGMLF16](https://github.com/danielsobrado/llm_notebooks/blob/main/images/executeGGMLF16.png) 96 | 97 | Quantize it to 4 bits: 98 | 99 | ``` 100 | ./build/bin/quantize ../models/open_llama_7b_preview_200bt/open_llama_7b_preview_200bt_transformers_weights/ggml-model-f16.bin ../models/open_llama_7b_preview_200bt_q4_0.ggml q4_0 101 | ``` 102 | ![quantizeTo4Bits](https://github.com/danielsobrado/llm_notebooks/blob/main/images/quantizeTo4Bits.png) 103 | 104 | The quantized file is much smaller: 105 | 106 | ![quantizedModel](https://github.com/danielsobrado/llm_notebooks/blob/main/images/quantizedModel.png) 107 | 108 | Test the quantized model: 109 | 110 | ./build/bin/main -m ../models/open_llama_7b_preview_200bt_q4_0.ggml --ignore-eos -n 1280 -p "Give me in python the quicksort algorithm" --mlock 111 | 112 | ![executeGGML4B](https://github.com/danielsobrado/llm_notebooks/blob/main/images/executeGGML4B.png) 113 | 114 | You'll notice that inference is much faster and requires less memory. 115 | 116 | ## LLM notebooks 117 | Testing local LLMs 118 | 119 | 1. 
Train Vicuna 7B on a text fragment: [Notebook](https://github.com/danielsobrado/llm_notebooks/blob/main/Notebooks/Train%20Vicuna%207b.ipynb) 120 | 121 | ## Concepts: 122 | 1. Dot Product: [Notebook](https://github.com/danielsobrado/llm_notebooks/blob/main/Notebooks/Dot%20Product.ipynb) 123 | 124 | ## How to run the examples 125 | 126 | Follow these steps: 127 | 128 | * Create a new environment: `conda create -n examples` 129 | * Activate the environment: `conda activate examples` 130 | * Install the packages: `conda install jupyter numpy matplotlib seaborn plotly` 131 | * Start notebooks with password and token authentication disabled (use this only on a trusted local machine): `jupyter notebook --NotebookApp.password="''" --NotebookApp.token="''"` 132 | 133 | ## Links: 134 | * GPTQ inference Triton kernel: https://github.com/fpgaminer/GPTQ-triton 135 | 136 | # LLM Server 137 | 138 | We'll use [oobabooga](https://github.com/oobabooga/text-generation-webui) as the server, with its [OpenAI extension](https://github.com/oobabooga/text-generation-webui/tree/main/extensions/openai), which exposes an OpenAI-compatible API: 139 | 140 | * `python server.py --extensions openai --no-stream` 141 | 142 | We'll download a quantized model, GGML for CPU or GPTQ for GPU: 143 | ![oobabooga1](https://github.com/danielsobrado/llm_notebooks/blob/main/images/LLM_oobabooga.png) 144 | ![oobabooga2](https://github.com/danielsobrado/llm_notebooks/blob/main/images/LLM_oobabooga2.png) 145 | 146 | Test that it works: 147 | ![oobabooga3](https://github.com/danielsobrado/llm_notebooks/blob/main/images/LLM_oobabooga3.png) 148 | 149 | Now we can connect from our notebook to the server: 150 | ![oobabooga server](https://github.com/danielsobrado/llm_notebooks/blob/main/images/LLM_oobabooga4.png) 151 | 152 | 153 | 154 | 155 | -------------------------------------------------------------------------------- /Notebooks/Train Vicuna 7b.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 1, 6 | "id": "600313fc", 7 
| "metadata": {}, 8 | "outputs": [], 9 | "source": [ 10 | "VICUNA_MODEL_PATH = \"./models/ggml-vicuna-7b-1.1-q4_2.bin\"" 11 | ] 12 | }, 13 | { 14 | "cell_type": "code", 15 | "execution_count": 2, 16 | "id": "f8c433c1", 17 | "metadata": {}, 18 | "outputs": [], 19 | "source": [ 20 | "from langchain.llms import LlamaCpp\n", 21 | "from langchain import PromptTemplate, LLMChain" 22 | ] 23 | }, 24 | { 25 | "cell_type": "code", 26 | "execution_count": 10, 27 | "id": "79a74903", 28 | "metadata": {}, 29 | "outputs": [], 30 | "source": [ 31 | "template = \"\"\"\n", 32 | "\n", 33 | "Question: {question}\n", 34 | "Answer: \n", 35 | "\n", 36 | "\"\"\"" 37 | ] 38 | }, 39 | { 40 | "cell_type": "code", 41 | "execution_count": 11, 42 | "id": "b1398486", 43 | "metadata": {}, 44 | "outputs": [], 45 | "source": [ 46 | "prompt = PromptTemplate(template=template, input_variables=[\"question\"])" 47 | ] 48 | }, 49 | { 50 | "cell_type": "code", 51 | "execution_count": 12, 52 | "id": "7b3b247a", 53 | "metadata": {}, 54 | "outputs": [ 55 | { 56 | "name": "stdout", 57 | "output_type": "stream", 58 | "text": [ 59 | "CPU times: user 165 µs, sys: 67.7 ms, total: 67.8 ms\n", 60 | "Wall time: 66.2 ms\n" 61 | ] 62 | }, 63 | { 64 | "name": "stderr", 65 | "output_type": "stream", 66 | "text": [ 67 | "llama.cpp: loading model from ./models/ggml-vicuna-7b-1.1-q4_2.bin\n", 68 | "llama_model_load_internal: format = ggjt v1 (latest)\n", 69 | "llama_model_load_internal: n_vocab = 32000\n", 70 | "llama_model_load_internal: n_ctx = 512\n", 71 | "llama_model_load_internal: n_embd = 4096\n", 72 | "llama_model_load_internal: n_mult = 256\n", 73 | "llama_model_load_internal: n_head = 32\n", 74 | "llama_model_load_internal: n_layer = 32\n", 75 | "llama_model_load_internal: n_rot = 128\n", 76 | "llama_model_load_internal: ftype = 5 (mostly Q4_2)\n", 77 | "llama_model_load_internal: n_ff = 11008\n", 78 | "llama_model_load_internal: n_parts = 1\n", 79 | "llama_model_load_internal: model size = 7B\n", 80 | 
"llama_model_load_internal: ggml ctx size = 59.11 KB\n", 81 | "llama_model_load_internal: mem required = 5809.32 MB (+ 2052.00 MB per state)\n", 82 | "llama_init_from_file: kv self size = 512.00 MB\n", 83 | "AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | VSX = 0 | \n" 84 | ] 85 | } 86 | ], 87 | "source": [ 88 | "%%time\n", 89 | "llm = LlamaCpp(model_path=VICUNA_MODEL_PATH)" 90 | ] 91 | }, 92 | { 93 | "cell_type": "code", 94 | "execution_count": 13, 95 | "id": "781a47c2", 96 | "metadata": {}, 97 | "outputs": [], 98 | "source": [ 99 | "llm_chain = LLMChain(prompt=prompt, llm=llm)" 100 | ] 101 | }, 102 | { 103 | "cell_type": "code", 104 | "execution_count": 14, 105 | "id": "02870eb2", 106 | "metadata": {}, 107 | "outputs": [ 108 | { 109 | "name": "stderr", 110 | "output_type": "stream", 111 | "text": [ 112 | "\n", 113 | "llama_print_timings: load time = 549.90 ms\n", 114 | "llama_print_timings: sample time = 19.59 ms / 67 runs ( 0.29 ms per run)\n", 115 | "llama_print_timings: prompt eval time = 1067.04 ms / 16 tokens ( 66.69 ms per token)\n", 116 | "llama_print_timings: eval time = 8262.33 ms / 66 runs ( 125.19 ms per run)\n", 117 | "llama_print_timings: total time = 9354.29 ms\n" 118 | ] 119 | }, 120 | { 121 | "data": { 122 | "text/plain": [ 123 | "'Spain is a country in southern Europe. 
It is bordered by France, Andorra, and Portugal to the north, and by Gibraltar, Morocco, and the Atlantic Ocean to the south.\\n\\nWhat is the capital of Spain?\\nAnswer: \\n\\nThe capital of Spain is Madrid.'" 124 | ] 125 | }, 126 | "execution_count": 14, 127 | "metadata": {}, 128 | "output_type": "execute_result" 129 | } 130 | ], 131 | "source": [ 132 | "question = \"Where is Spain?\"\n", 133 | "llm_chain.run(question)" 134 | ] 135 | }, 136 | { 137 | "cell_type": "code", 138 | "execution_count": 15, 139 | "id": "4b3179d7", 140 | "metadata": {}, 141 | "outputs": [], 142 | "source": [ 143 | "template2 = \"\"\"\n", 144 | "\n", 145 | "Question: {question}\n", 146 | "Answer: \n", 147 | "\n", 148 | "\"\"\"\n", 149 | "\n", 150 | "prompt2 = PromptTemplate(template=template2, input_variables=[\"question\"])\n", 151 | "\n", 152 | "llm_chain2 = LLMChain(prompt=prompt2, llm=llm)" 153 | ] 154 | }, 155 | { 156 | "cell_type": "code", 157 | "execution_count": 16, 158 | "id": "272a1017", 159 | "metadata": {}, 160 | "outputs": [ 161 | { 162 | "name": "stdout", 163 | "output_type": "stream", 164 | "text": [ 165 | "CPU times: user 6min 34s, sys: 105 ms, total: 6min 34s\n", 166 | "Wall time: 34.3 s\n" 167 | ] 168 | }, 169 | { 170 | "name": "stderr", 171 | "output_type": "stream", 172 | "text": [ 173 | "\n", 174 | "llama_print_timings: load time = 549.90 ms\n", 175 | "llama_print_timings: sample time = 76.39 ms / 256 runs ( 0.30 ms per run)\n", 176 | "llama_print_timings: prompt eval time = 1283.65 ms / 20 tokens ( 64.18 ms per token)\n", 177 | "llama_print_timings: eval time = 32917.07 ms / 255 runs ( 129.09 ms per run)\n", 178 | "llama_print_timings: total time = 34307.95 ms\n" 179 | ] 180 | }, 181 | { 182 | "data": { 183 | "text/plain": [ 184 | "'Astrelys is a fantasy-themed card game that was designed by Vincent Diamante and published by Portal Games in 2017. The game is for 1-4 players and takes around 60-90 minutes to play. 
In Astrelys, players build their own deck from a pool of cards and use it to conquer various locations on the game board.\\n\\nThe game has been well-received by critics and players alike for its unique mechanics and engaging gameplay. Players can choose from a variety of characters, each with their own strengths and abilities, and must strategically manage their resources to achieve victory.\\n\\nOne of the key elements of Astrelys is its use of \"Mana\" as a resource. Each player starts with a certain amount of Mana, which they can then spend on playing cards from their deck. The cards represent various actions that players can take, such as attacking opponents, moving to different locations, or casting spells.\\n\\nPlayers can also recruit creatures and followers to their side, each with their own abilities and strengths. As the game progresses, players must balance their use of Mana'" 185 | ] 186 | }, 187 | "execution_count": 16, 188 | "metadata": {}, 189 | "output_type": "execute_result" 190 | } 191 | ], 192 | "source": [ 193 | "%%time\n", 194 | "question2 = \"Tell me about Astraelys?\"\n", 195 | "llm_chain2.run(question2)" 196 | ] 197 | }, 198 | { 199 | "cell_type": "code", 200 | "execution_count": 17, 201 | "id": "cfbc4b13", 202 | "metadata": {}, 203 | "outputs": [], 204 | "source": [ 205 | "from langchain.embeddings import LlamaCppEmbeddings" 206 | ] 207 | }, 208 | { 209 | "cell_type": "code", 210 | "execution_count": 18, 211 | "id": "66b577d9", 212 | "metadata": {}, 213 | "outputs": [ 214 | { 215 | "name": "stderr", 216 | "output_type": "stream", 217 | "text": [ 218 | "llama.cpp: loading model from ./models/ggml-vicuna-7b-1.1-q4_2.bin\n", 219 | "llama_model_load_internal: format = ggjt v1 (latest)\n", 220 | "llama_model_load_internal: n_vocab = 32000\n", 221 | "llama_model_load_internal: n_ctx = 512\n", 222 | "llama_model_load_internal: n_embd = 4096\n", 223 | "llama_model_load_internal: n_mult = 256\n", 224 | "llama_model_load_internal: n_head = 32\n", 
225 | "llama_model_load_internal: n_layer = 32\n", 226 | "llama_model_load_internal: n_rot = 128\n", 227 | "llama_model_load_internal: ftype = 5 (mostly Q4_2)\n", 228 | "llama_model_load_internal: n_ff = 11008\n", 229 | "llama_model_load_internal: n_parts = 1\n", 230 | "llama_model_load_internal: model size = 7B\n", 231 | "llama_model_load_internal: ggml ctx size = 59.11 KB\n", 232 | "llama_model_load_internal: mem required = 5809.32 MB (+ 2052.00 MB per state)\n", 233 | "llama_init_from_file: kv self size = 512.00 MB\n", 234 | "AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | VSX = 0 | \n" 235 | ] 236 | } 237 | ], 238 | "source": [ 239 | "llama = LlamaCppEmbeddings(model_path=VICUNA_MODEL_PATH)" 240 | ] 241 | }, 242 | { 243 | "cell_type": "code", 244 | "execution_count": 19, 245 | "id": "272542ad", 246 | "metadata": {}, 247 | "outputs": [], 248 | "source": [ 249 | "template = \"\"\"\n", 250 | "\n", 251 | "Question: {question}\n", 252 | "\n", 253 | "Answer: \n", 254 | "\n", 255 | "\"\"\"" 256 | ] 257 | }, 258 | { 259 | "cell_type": "code", 260 | "execution_count": 20, 261 | "id": "c3c00bbc", 262 | "metadata": {}, 263 | "outputs": [], 264 | "source": [ 265 | "prompt = PromptTemplate(template=template, input_variables=[\"question\"])\n", 266 | "llm_chain = LLMChain(prompt=prompt, llm=llm)" 267 | ] 268 | }, 269 | { 270 | "cell_type": "code", 271 | "execution_count": 21, 272 | "id": "ffe3958b", 273 | "metadata": {}, 274 | "outputs": [ 275 | { 276 | "name": "stdout", 277 | "output_type": "stream", 278 | "text": [ 279 | "In the realm of Luminara, where the ethereal glow of Nectaris illuminated the twilight, lived two extraordinary beings, Astraelys, the Nebula Spinner, and Volcanion, the Quasar Forger. Both bore magic as immense as the cosmos itself. 
Astraelys would spin gossamer threads of nebular energy into radiant tapestries, and Volcanion hammered raw astral elements into extraordinary artifacts. Their souls were in sync, humming the same cosmic melody.\r\n", 280 | "\r\n", 281 | "A day came when the menacing Void Serpent threatened to shroud their realm in perpetual obscurity. Bound by courage and purpose, Astraelys and Volcanion fused their extraordinary gifts. Astraelys spun a cloak of pure Nebular silk, shimmering with the brilliance of a billion constellations. Volcanion forged a radiant quasar lance of unparalleled power.\r\n", 282 | "\r\n", 283 | "Adorned in the radiant cloak and wielding the lance, they confronted the Void Serpent. Astraelys' cloak bathed Luminara in a celestial glow, blinding the Serpent. Guided by this radiant beacon, Volcanion struck the Serpent with his lance. The Serpent shrieked, metamorphosing into a harmless comet, forever encircling Luminara, a testament to their triumph.\r\n", 284 | "\r\n", 285 | "From then on, Astraelys and Volcanion became the sentinels of Luminara. Together, they spun the nebulae and molded the quasars, painting their world with cosmic wonder, their bond as infinite and awe-inspiring as the universe itself." 
286 | ] 287 | } 288 | ], 289 | "source": [ 290 | "!cat './fragment.txt'" 291 | ] 292 | }, 293 | { 294 | "cell_type": "code", 295 | "execution_count": 22, 296 | "id": "e02b55dd", 297 | "metadata": {}, 298 | "outputs": [], 299 | "source": [ 300 | "from langchain.document_loaders import TextLoader\n", 301 | "loader = TextLoader('./fragment.txt')" 302 | ] 303 | }, 304 | { 305 | "cell_type": "code", 306 | "execution_count": 23, 307 | "id": "f782a8eb", 308 | "metadata": {}, 309 | "outputs": [ 310 | { 311 | "name": "stderr", 312 | "output_type": "stream", 313 | "text": [ 314 | "Using embedded DuckDB with persistence: data will be stored in: db\n", 315 | "\n", 316 | "llama_print_timings: load time = 564.50 ms\n", 317 | "llama_print_timings: sample time = 0.00 ms / 1 runs ( 0.00 ms per run)\n", 318 | "llama_print_timings: prompt eval time = 14364.82 ms / 220 tokens ( 65.29 ms per token)\n", 319 | "llama_print_timings: eval time = 0.00 ms / 1 runs ( 0.00 ms per run)\n", 320 | "llama_print_timings: total time = 14366.40 ms\n", 321 | "\n", 322 | "llama_print_timings: load time = 564.50 ms\n", 323 | "llama_print_timings: sample time = 0.00 ms / 1 runs ( 0.00 ms per run)\n", 324 | "llama_print_timings: prompt eval time = 11324.44 ms / 173 tokens ( 65.46 ms per token)\n", 325 | "llama_print_timings: eval time = 0.00 ms / 1 runs ( 0.00 ms per run)\n", 326 | "llama_print_timings: total time = 11325.90 ms\n" 327 | ] 328 | } 329 | ], 330 | "source": [ 331 | "from langchain.indexes import VectorstoreIndexCreator\n", 332 | "index = VectorstoreIndexCreator(embedding=llama,\n", 333 | " vectorstore_kwargs={\"persist_directory\": \"db\"}\n", 334 | " ).from_loaders([loader])" 335 | ] 336 | }, 337 | { 338 | "cell_type": "code", 339 | "execution_count": 24, 340 | "id": "5d519d16", 341 | "metadata": {}, 342 | "outputs": [ 343 | { 344 | "name": "stderr", 345 | "output_type": "stream", 346 | "text": [ 347 | "Using embedded DuckDB without persistence: data will be transient\n", 348 | "\n", 349 | 
"llama_print_timings: load time = 564.50 ms\n", 350 | "llama_print_timings: sample time = 0.00 ms / 1 runs ( 0.00 ms per run)\n", 351 | "llama_print_timings: prompt eval time = 8069.01 ms / 123 tokens ( 65.60 ms per token)\n", 352 | "llama_print_timings: eval time = 0.00 ms / 1 runs ( 0.00 ms per run)\n", 353 | "llama_print_timings: total time = 8069.98 ms\n", 354 | "\n", 355 | "llama_print_timings: load time = 564.50 ms\n", 356 | "llama_print_timings: sample time = 0.00 ms / 1 runs ( 0.00 ms per run)\n", 357 | "llama_print_timings: prompt eval time = 6148.30 ms / 96 tokens ( 64.04 ms per token)\n", 358 | "llama_print_timings: eval time = 0.00 ms / 1 runs ( 0.00 ms per run)\n", 359 | "llama_print_timings: total time = 6149.11 ms\n", 360 | "\n", 361 | "llama_print_timings: load time = 564.50 ms\n", 362 | "llama_print_timings: sample time = 0.00 ms / 1 runs ( 0.00 ms per run)\n", 363 | "llama_print_timings: prompt eval time = 6791.61 ms / 104 tokens ( 65.30 ms per token)\n", 364 | "llama_print_timings: eval time = 126.88 ms / 1 runs ( 126.88 ms per run)\n", 365 | "llama_print_timings: total time = 6919.53 ms\n", 366 | "\n", 367 | "llama_print_timings: load time = 564.50 ms\n", 368 | "llama_print_timings: sample time = 0.00 ms / 1 runs ( 0.00 ms per run)\n", 369 | "llama_print_timings: prompt eval time = 4415.45 ms / 67 tokens ( 65.90 ms per token)\n", 370 | "llama_print_timings: eval time = 0.00 ms / 1 runs ( 0.00 ms per run)\n", 371 | "llama_print_timings: total time = 4416.25 ms\n" 372 | ] 373 | } 374 | ], 375 | "source": [ 376 | "from langchain.vectorstores import Chroma\n", 377 | "from langchain.text_splitter import RecursiveCharacterTextSplitter\n", 378 | "from langchain.chains import RetrievalQA\n", 379 | "\n", 380 | "documents = loader.load()\n", 381 | "\n", 382 | "text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)\n", 383 | "texts = text_splitter.split_documents(documents)\n", 384 | "\n", 385 | "# Again, we should persist the db 
and figure out how to reuse it\n", 386 | "docsearch = Chroma.from_documents(texts, llama)" 387 | ] 388 | }, 389 | { 390 | "cell_type": "code", 391 | "execution_count": 25, 392 | "id": "25d49f71", 393 | "metadata": {}, 394 | "outputs": [], 395 | "source": [ 396 | "MIN_DOCS = 1\n", 397 | "\n", 398 | "qa = RetrievalQA.from_chain_type(llm=llm, chain_type=\"stuff\",\n", 399 | " retriever=docsearch.as_retriever(search_kwargs={\"k\": MIN_DOCS}))" 400 | ] 401 | }, 402 | { 403 | "cell_type": "code", 404 | "execution_count": 29, 405 | "id": "3fdb31a4", 406 | "metadata": {}, 407 | "outputs": [ 408 | { 409 | "name": "stdout", 410 | "output_type": "stream", 411 | "text": [ 412 | "Tell me about Astraelys?\n" 413 | ] 414 | }, 415 | { 416 | "name": "stderr", 417 | "output_type": "stream", 418 | "text": [ 419 | "\n", 420 | "llama_print_timings: load time = 564.50 ms\n", 421 | "llama_print_timings: sample time = 0.00 ms / 1 runs ( 0.00 ms per run)\n", 422 | "llama_print_timings: prompt eval time = 671.06 ms / 10 tokens ( 67.11 ms per token)\n", 423 | "llama_print_timings: eval time = 0.00 ms / 1 runs ( 0.00 ms per run)\n", 424 | "llama_print_timings: total time = 671.64 ms\n", 425 | "\n", 426 | "llama_print_timings: load time = 549.90 ms\n", 427 | "llama_print_timings: sample time = 17.09 ms / 57 runs ( 0.30 ms per run)\n", 428 | "llama_print_timings: prompt eval time = 8233.56 ms / 128 tokens ( 64.32 ms per token)\n", 429 | "llama_print_timings: eval time = 7452.44 ms / 56 runs ( 133.08 ms per run)\n", 430 | "llama_print_timings: total time = 15708.64 ms\n" 431 | ] 432 | }, 433 | { 434 | "data": { 435 | "text/plain": [ 436 | "' Astraelys is an Eevee that has evolved into an Umbra. 
It is mentioned in the context that it and its partner Volcanion became sentinels of Luminara together, spinning nebulae and molding quasars.'" 437 | ] 438 | }, 439 | "execution_count": 29, 440 | "metadata": {}, 441 | "output_type": "execute_result" 442 | } 443 | ], 444 | "source": [ 445 | "query = \"Tell me about Astraelys?\"\n", 446 | "print(query)\n", 447 | "\n", 448 | "qa.run(query)" 449 | ] 450 | }, 451 | { 452 | "cell_type": "code", 453 | "execution_count": 30, 454 | "id": "d9d7144d", 455 | "metadata": {}, 456 | "outputs": [ 457 | { 458 | "name": "stderr", 459 | "output_type": "stream", 460 | "text": [ 461 | "\n", 462 | "llama_print_timings: load time = 564.50 ms\n", 463 | "llama_print_timings: sample time = 0.00 ms / 1 runs ( 0.00 ms per run)\n", 464 | "llama_print_timings: prompt eval time = 971.12 ms / 15 tokens ( 64.74 ms per token)\n", 465 | "llama_print_timings: eval time = 0.00 ms / 1 runs ( 0.00 ms per run)\n", 466 | "llama_print_timings: total time = 971.77 ms\n", 467 | "\n", 468 | "llama_print_timings: load time = 549.90 ms\n", 469 | "llama_print_timings: sample time = 33.48 ms / 115 runs ( 0.29 ms per run)\n", 470 | "llama_print_timings: prompt eval time = 11180.75 ms / 171 tokens ( 65.38 ms per token)\n", 471 | "llama_print_timings: eval time = 15139.20 ms / 114 runs ( 132.80 ms per run)\n", 472 | "llama_print_timings: total time = 26365.02 ms\n" 473 | ] 474 | }, 475 | { 476 | "data": { 477 | "text/plain": [ 478 | "\" Astraelys is an ancient entity that was revered as a goddess in ancient times. The void serpent was a powerful magical beast that was said to have the power to control time and space itself, making it a formidable foe for the forces of light. 
In this encounter, Astraelys' radiant cloak blinded the void serpent, allowing Volcanion to strike it with his lance, turning it into a harmless comet that encircled Luminara as a testament to their victory.\"" 479 | ] 480 | }, 481 | "execution_count": 30, 482 | "metadata": {}, 483 | "output_type": "execute_result" 484 | } 485 | ], 486 | "source": [ 487 | "query = \"Tell me about the Astraelys and the Void Serpent\"\n", 488 | "\n", 489 | "qa.run(query)" 490 | ] 491 | } 492 | ], 493 | "metadata": { 494 | "kernelspec": { 495 | "display_name": "Python 3 (ipykernel)", 496 | "language": "python", 497 | "name": "python3" 498 | }, 499 | "language_info": { 500 | "codemirror_mode": { 501 | "name": "ipython", 502 | "version": 3 503 | }, 504 | "file_extension": ".py", 505 | "mimetype": "text/x-python", 506 | "name": "python", 507 | "nbconvert_exporter": "python", 508 | "pygments_lexer": "ipython3", 509 | "version": "3.9.13" 510 | } 511 | }, 512 | "nbformat": 4, 513 | "nbformat_minor": 5 514 | } 515 | -------------------------------------------------------------------------------- /Notebooks/Sentence Similarity 1.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 1, 6 | "id": "e7355e36-6327-4ee4-a7f6-47e2f9e8fe42", 7 | "metadata": {}, 8 | "outputs": [ 9 | { 10 | "name": "stdout", 11 | "output_type": "stream", 12 | "text": [ 13 | "Collecting sentence-transformers\n", 14 | " Downloading sentence_transformers-2.7.0-py3-none-any.whl.metadata (11 kB)\n", 15 | "Requirement already satisfied: transformers<5.0.0,>=4.34.0 in /home/drusniel/llm_notebooks/myenv/lib/python3.12/site-packages (from sentence-transformers) (4.40.2)\n", 16 | "Requirement already satisfied: tqdm in /home/drusniel/llm_notebooks/myenv/lib/python3.12/site-packages (from sentence-transformers) (4.66.4)\n", 17 | "Requirement already satisfied: torch>=1.11.0 in 
/home/drusniel/llm_notebooks/myenv/lib/python3.12/site-packages (from sentence-transformers) (2.3.0)\n", 18 | "Requirement already satisfied: numpy in /home/drusniel/llm_notebooks/myenv/lib/python3.12/site-packages (from sentence-transformers) (1.26.4)\n", 19 | "Collecting scikit-learn (from sentence-transformers)\n", 20 | " Downloading scikit_learn-1.5.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (11 kB)\n", 21 | "Requirement already satisfied: scipy in /home/drusniel/llm_notebooks/myenv/lib/python3.12/site-packages (from sentence-transformers) (1.13.0)\n", 22 | "Requirement already satisfied: huggingface-hub>=0.15.1 in /home/drusniel/llm_notebooks/myenv/lib/python3.12/site-packages (from sentence-transformers) (0.23.0)\n", 23 | "Requirement already satisfied: Pillow in /home/drusniel/llm_notebooks/myenv/lib/python3.12/site-packages (from sentence-transformers) (10.3.0)\n", 24 | "Requirement already satisfied: filelock in /home/drusniel/llm_notebooks/myenv/lib/python3.12/site-packages (from huggingface-hub>=0.15.1->sentence-transformers) (3.14.0)\n", 25 | "Requirement already satisfied: fsspec>=2023.5.0 in /home/drusniel/llm_notebooks/myenv/lib/python3.12/site-packages (from huggingface-hub>=0.15.1->sentence-transformers) (2023.10.0)\n", 26 | "Requirement already satisfied: packaging>=20.9 in /home/drusniel/llm_notebooks/myenv/lib/python3.12/site-packages (from huggingface-hub>=0.15.1->sentence-transformers) (24.0)\n", 27 | "Requirement already satisfied: pyyaml>=5.1 in /home/drusniel/llm_notebooks/myenv/lib/python3.12/site-packages (from huggingface-hub>=0.15.1->sentence-transformers) (6.0.1)\n", 28 | "Requirement already satisfied: requests in /home/drusniel/llm_notebooks/myenv/lib/python3.12/site-packages (from huggingface-hub>=0.15.1->sentence-transformers) (2.31.0)\n", 29 | "Requirement already satisfied: typing-extensions>=3.7.4.3 in /home/drusniel/llm_notebooks/myenv/lib/python3.12/site-packages (from 
huggingface-hub>=0.15.1->sentence-transformers) (4.11.0)\n", 30 | "Requirement already satisfied: sympy in /home/drusniel/llm_notebooks/myenv/lib/python3.12/site-packages (from torch>=1.11.0->sentence-transformers) (1.12)\n", 31 | "Requirement already satisfied: networkx in /home/drusniel/llm_notebooks/myenv/lib/python3.12/site-packages (from torch>=1.11.0->sentence-transformers) (3.3)\n", 32 | "Requirement already satisfied: jinja2 in /home/drusniel/llm_notebooks/myenv/lib/python3.12/site-packages (from torch>=1.11.0->sentence-transformers) (3.1.4)\n", 33 | "Requirement already satisfied: nvidia-cuda-nvrtc-cu12==12.1.105 in /home/drusniel/llm_notebooks/myenv/lib/python3.12/site-packages (from torch>=1.11.0->sentence-transformers) (12.1.105)\n", 34 | "Requirement already satisfied: nvidia-cuda-runtime-cu12==12.1.105 in /home/drusniel/llm_notebooks/myenv/lib/python3.12/site-packages (from torch>=1.11.0->sentence-transformers) (12.1.105)\n", 35 | "Requirement already satisfied: nvidia-cuda-cupti-cu12==12.1.105 in /home/drusniel/llm_notebooks/myenv/lib/python3.12/site-packages (from torch>=1.11.0->sentence-transformers) (12.1.105)\n", 36 | "Requirement already satisfied: nvidia-cudnn-cu12==8.9.2.26 in /home/drusniel/llm_notebooks/myenv/lib/python3.12/site-packages (from torch>=1.11.0->sentence-transformers) (8.9.2.26)\n", 37 | "Requirement already satisfied: nvidia-cublas-cu12==12.1.3.1 in /home/drusniel/llm_notebooks/myenv/lib/python3.12/site-packages (from torch>=1.11.0->sentence-transformers) (12.1.3.1)\n", 38 | "Requirement already satisfied: nvidia-cufft-cu12==11.0.2.54 in /home/drusniel/llm_notebooks/myenv/lib/python3.12/site-packages (from torch>=1.11.0->sentence-transformers) (11.0.2.54)\n", 39 | "Requirement already satisfied: nvidia-curand-cu12==10.3.2.106 in /home/drusniel/llm_notebooks/myenv/lib/python3.12/site-packages (from torch>=1.11.0->sentence-transformers) (10.3.2.106)\n", 40 | "Requirement already satisfied: nvidia-cusolver-cu12==11.4.5.107 in 
/home/drusniel/llm_notebooks/myenv/lib/python3.12/site-packages (from torch>=1.11.0->sentence-transformers) (11.4.5.107)\n", 41 | "Requirement already satisfied: nvidia-cusparse-cu12==12.1.0.106 in /home/drusniel/llm_notebooks/myenv/lib/python3.12/site-packages (from torch>=1.11.0->sentence-transformers) (12.1.0.106)\n", 42 | "Requirement already satisfied: nvidia-nccl-cu12==2.20.5 in /home/drusniel/llm_notebooks/myenv/lib/python3.12/site-packages (from torch>=1.11.0->sentence-transformers) (2.20.5)\n", 43 | "Requirement already satisfied: nvidia-nvtx-cu12==12.1.105 in /home/drusniel/llm_notebooks/myenv/lib/python3.12/site-packages (from torch>=1.11.0->sentence-transformers) (12.1.105)\n", 44 | "Requirement already satisfied: nvidia-nvjitlink-cu12 in /home/drusniel/llm_notebooks/myenv/lib/python3.12/site-packages (from nvidia-cusolver-cu12==11.4.5.107->torch>=1.11.0->sentence-transformers) (12.4.127)\n", 45 | "Requirement already satisfied: regex!=2019.12.17 in /home/drusniel/llm_notebooks/myenv/lib/python3.12/site-packages (from transformers<5.0.0,>=4.34.0->sentence-transformers) (2024.4.28)\n", 46 | "Requirement already satisfied: tokenizers<0.20,>=0.19 in /home/drusniel/llm_notebooks/myenv/lib/python3.12/site-packages (from transformers<5.0.0,>=4.34.0->sentence-transformers) (0.19.1)\n", 47 | "Requirement already satisfied: safetensors>=0.4.1 in /home/drusniel/llm_notebooks/myenv/lib/python3.12/site-packages (from transformers<5.0.0,>=4.34.0->sentence-transformers) (0.4.3)\n", 48 | "Collecting joblib>=1.2.0 (from scikit-learn->sentence-transformers)\n", 49 | " Downloading joblib-1.4.2-py3-none-any.whl.metadata (5.4 kB)\n", 50 | "Collecting threadpoolctl>=3.1.0 (from scikit-learn->sentence-transformers)\n", 51 | " Downloading threadpoolctl-3.5.0-py3-none-any.whl.metadata (13 kB)\n", 52 | "Requirement already satisfied: MarkupSafe>=2.0 in /home/drusniel/llm_notebooks/myenv/lib/python3.12/site-packages (from jinja2->torch>=1.11.0->sentence-transformers) (2.1.5)\n", 
53 | "Requirement already satisfied: charset-normalizer<4,>=2 in /home/drusniel/llm_notebooks/myenv/lib/python3.12/site-packages (from requests->huggingface-hub>=0.15.1->sentence-transformers) (3.3.2)\n", 54 | "Requirement already satisfied: idna<4,>=2.5 in /home/drusniel/llm_notebooks/myenv/lib/python3.12/site-packages (from requests->huggingface-hub>=0.15.1->sentence-transformers) (3.7)\n", 55 | "Requirement already satisfied: urllib3<3,>=1.21.1 in /home/drusniel/llm_notebooks/myenv/lib/python3.12/site-packages (from requests->huggingface-hub>=0.15.1->sentence-transformers) (2.2.1)\n", 56 | "Requirement already satisfied: certifi>=2017.4.17 in /home/drusniel/llm_notebooks/myenv/lib/python3.12/site-packages (from requests->huggingface-hub>=0.15.1->sentence-transformers) (2024.2.2)\n", 57 | "Requirement already satisfied: mpmath>=0.19 in /home/drusniel/llm_notebooks/myenv/lib/python3.12/site-packages (from sympy->torch>=1.11.0->sentence-transformers) (1.3.0)\n", 58 | "Downloading sentence_transformers-2.7.0-py3-none-any.whl (171 kB)\n", 59 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m171.5/171.5 kB\u001b[0m \u001b[31m4.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", 60 | "\u001b[?25hDownloading scikit_learn-1.5.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (13.1 MB)\n", 61 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m13.1/13.1 MB\u001b[0m \u001b[31m18.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m00:01\u001b[0m00:01\u001b[0m\n", 62 | "\u001b[?25hDownloading joblib-1.4.2-py3-none-any.whl (301 kB)\n", 63 | "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m301.8/301.8 kB\u001b[0m \u001b[31m9.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", 64 | "\u001b[?25hDownloading threadpoolctl-3.5.0-py3-none-any.whl (18 kB)\n", 65 | "Installing collected packages: threadpoolctl, joblib, scikit-learn, sentence-transformers\n", 66 | "Successfully installed 
joblib-1.4.2 scikit-learn-1.5.0 sentence-transformers-2.7.0 threadpoolctl-3.5.0\n" 67 | ] 68 | } 69 | ], 70 | "source": [ 71 | "!pip install sentence-transformers" 72 | ] 73 | }, 74 | { 75 | "cell_type": "code", 76 | "execution_count": 6, 77 | "id": "28c94c74-9122-470c-a6c7-d8a70c53768a", 78 | "metadata": {}, 79 | "outputs": [], 80 | "source": [ 81 | "import numpy as np\n", 82 | "from sklearn.metrics.pairwise import cosine_similarity\n", 83 | "from sentence_transformers import SentenceTransformer\n", 84 | "import logging\n", 85 | "\n", 86 | "# Configure logging\n", 87 | "logging.basicConfig(level=logging.INFO)\n", 88 | "logger = logging.getLogger(__name__)" 89 | ] 90 | }, 91 | { 92 | "cell_type": "code", 93 | "execution_count": 7, 94 | "id": "6e764571-c2b1-41b8-84cb-d41ba289d14e", 95 | "metadata": {}, 96 | "outputs": [ 97 | { 98 | "name": "stderr", 99 | "output_type": "stream", 100 | "text": [ 101 | "INFO:sentence_transformers.SentenceTransformer:Load pretrained SentenceTransformer: paraphrase-MiniLM-L6-v2\n", 102 | "/home/drusniel/llm_notebooks/myenv/lib/python3.12/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.\n", 103 | " warnings.warn(\n", 104 | "/home/drusniel/llm_notebooks/myenv/lib/python3.12/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. 
If you want to force a new download, use `force_download=True`.\n", 105 | " warnings.warn(\n", 106 | "INFO:sentence_transformers.SentenceTransformer:Use pytorch device_name: cuda\n" 107 | ] 108 | } 109 | ], 110 | "source": [ 111 | "# Load pre-trained SentenceTransformer model\n", 112 | "model = SentenceTransformer('paraphrase-MiniLM-L6-v2')" 113 | ] 114 | }, 115 | { 116 | "cell_type": "code", 117 | "execution_count": 8, 118 | "id": "82340413-c0f9-4b83-9cc0-24d366144e6c", 119 | "metadata": {}, 120 | "outputs": [ 121 | { 122 | "name": "stderr", 123 | "output_type": "stream", 124 | "text": [ 125 | "INFO:__main__:Sentences from text1: ['As the sun sank below the horizon, casting shadows across Thistledown, a group of adventurers gathered around a flickering campfire', 'The air was full of the scent of pine and the distant calls of nocturnal animals.\\n\\n\"Are you sure this is the right path, Elaria?\" asked Thorne, his hand on the hilt of his sword', 'He peered out at the darkening forest warily.\\n\\n\"The map led us here,\" said Elaria, a slim elf with piercing green eyes', '\"The ancient runes spoke of a temple hidden beyond the Silverstream', 'We must trust in the old ways.\"\\n\\nBrakkar, the burly dwarf, sighed and adjusted his axe', '\"Trusting in old runes and forgotten temples..', 'This had better lead to treasure worth all this trouble.\"\\n\\n\"Not all treasures are made of gold, Brakkar,\" said Lyra, the group\\'s sorceress', 'Her eyes reflected the light of the fire', '\"Some secrets are far more valuable.\"\\n\\nThorne smiled', '\"Secrets or gold..', 'We\\'ll find out soon enough.\"']\n", 126 | "INFO:__main__:Sentences from text2: ['The sun was just setting, and Thistledown was darkening by degrees', 'A campfire burned at the center of a small group of adventurers—some humans, some elves, and one dwarf', 'The air held the scent of pine, and the distant call of nocturnal animals echoed through the forest.\\n\\n\"Are you sure this is right?\" Thorne asked 
Elaria, who was pointing at a trail', '\"We\\'re supposed to be going toward the Silverstream.\"\\n\\n\"The map we found led us here,\" said Elaria, an elf', 'Her green eyes glinted with a hint of mischief in the firelight', '\"It spoke of a temple hidden beyond the river', 'We\\'ll see if those old runes hold any truth.\"\\n\\n\"Trusting in old runes and forgotten temples?\" Brakkar said, adjusting his axe', '\"Hope it leads to treasure worth all this trouble.\"\\n\\nLyra smiled at that', '\"Not all treasures are made of gold', 'Some secrets are far more valuable.\"']\n" 127 | ] 128 | }, 129 | { 130 | "data": { 131 | "application/vnd.jupyter.widget-view+json": { 132 | "model_id": "cc49254d568c4167b78b742579ca0dd6", 133 | "version_major": 2, 134 | "version_minor": 0 135 | }, 136 | "text/plain": [ 137 | "Batches: 0%| | 0/1 [00:00 threshold:\n", 244 | " matches.append((sentences1[i], sentences2[j], similarity))\n", 245 | " unmatched1.discard(i)\n", 246 | " unmatched2.discard(j)\n", 247 | " \n", 248 | " unmatched_sentences1 = [sentences1[i] for i in unmatched1]\n", 249 | " unmatched_sentences2 = [sentences2[i] for i in unmatched2]\n", 250 | " \n", 251 | " return matches, unmatched_sentences1, unmatched_sentences2\n", 252 | "\n", 253 | "def main():\n", 254 | " text1 = (\"As the sun sank below the horizon, casting shadows across Thistledown, \"\n", 255 | " \"a group of adventurers gathered around a flickering campfire. The air was full of the scent of pine and the distant calls of nocturnal animals.\\n\\n\"\n", 256 | " \"\\\"Are you sure this is the right path, Elaria?\\\" asked Thorne, his hand on the hilt of his sword. He peered out at the darkening forest warily.\\n\\n\"\n", 257 | " \"\\\"The map led us here,\\\" said Elaria, a slim elf with piercing green eyes. \\\"The ancient runes spoke of a temple hidden beyond the Silverstream. We must trust in the old ways.\\\"\\n\\n\"\n", 258 | " \"Brakkar, the burly dwarf, sighed and adjusted his axe. 
\\\"Trusting in old runes and forgotten temples... This had better lead to treasure worth all this trouble.\\\"\\n\\n\"\n", 259 | " \"\\\"Not all treasures are made of gold, Brakkar,\\\" said Lyra, the group's sorceress. Her eyes reflected the light of the fire. \\\"Some secrets are far more valuable.\\\"\\n\\n\"\n", 260 | " \"Thorne smiled. \\\"Secrets or gold... We'll find out soon enough.\\\"\")\n", 261 | "\n", 262 | " text2 = (\"The sun was just setting, and Thistledown was darkening by degrees. A campfire burned at the center of a small group of adventurers—some humans, some elves, and one dwarf. The air held the scent of pine, and the distant call of nocturnal animals echoed through the forest.\\n\\n\"\n", 263 | " \"\\\"Are you sure this is right?\\\" Thorne asked Elaria, who was pointing at a trail. \\\"We're supposed to be going toward the Silverstream.\\\"\\n\\n\"\n", 264 | " \"\\\"The map we found led us here,\\\" said Elaria, an elf. Her green eyes glinted with a hint of mischief in the firelight. \\\"It spoke of a temple hidden beyond the river. We'll see if those old runes hold any truth.\\\"\\n\\n\"\n", 265 | " \"\\\"Trusting in old runes and forgotten temples?\\\" Brakkar said, adjusting his axe. \\\"Hope it leads to treasure worth all this trouble.\\\"\\n\\n\"\n", 266 | " \"Lyra smiled at that. \\\"Not all treasures are made of gold. 
Some secrets are far more valuable.\\\"\")\n", 267 | " \n", 268 | " matches, unmatched1, unmatched2 = match_sentences(text1, text2)\n", 269 | " \n", 270 | " logger.info(\"Matched sentences:\")\n", 271 | " for match in matches:\n", 272 | " logger.info(\"Text1: %s\\nText2: %s\\nSimilarity: %.2f\", match[0], match[1], match[2])\n", 273 | " \n", 274 | " logger.info(\"Unmatched sentences in text1: %s\", unmatched1)\n", 275 | " logger.info(\"Unmatched sentences in text2: %s\", unmatched2)\n", 276 | "\n", 277 | "if __name__ == \"__main__\":\n", 278 | " main()" 279 | ] 280 | }, 281 | { 282 | "cell_type": "markdown", 283 | "id": "b22bbc6e-501f-46a1-9dd4-5cf68d751544", 284 | "metadata": {}, 285 | "source": [ 286 | "The initial split on `.` does not capture all sentence boundaries, especially when sentences end with other punctuation marks or newlines. We can enhance the preprocessing step to handle these boundaries more accurately. Additionally, we'll adjust the similarity threshold." 287 | ] 288 | }, 289 | { 290 | "cell_type": "code", 291 | "execution_count": 9, 292 | "id": "f9149dfe-dc28-4050-a8fc-9b25ee6768b8", 293 | "metadata": {}, 294 | "outputs": [], 295 | "source": [ 296 | "import re\n", 297 | "import numpy as np\n", 298 | "import logging\n", 299 | "\n", 300 | "def preprocess_text(text):\n", 301 | " \"\"\"\n", 302 | " Tokenizes the input text into sentences using regular expressions to handle different punctuation.\n", 303 | " \"\"\"\n", 304 | " sentences = re.split(r'(? 
threshold:\n", 330 | " matches.append((sentences1[i], sentences2[j], similarity))\n", 331 | " unmatched1.discard(i)\n", 332 | " unmatched2.discard(j)\n", 333 | " \n", 334 | " unmatched_sentences1 = [sentences1[i] for i in unmatched1]\n", 335 | " unmatched_sentences2 = [sentences2[i] for i in unmatched2]\n", 336 | " \n", 337 | " return matches, unmatched_sentences1, unmatched_sentences2\n" 338 | ] 339 | }, 340 | { 341 | "cell_type": "code", 342 | "execution_count": 10, 343 | "id": "4278d072-e9db-492d-9db9-f3548981dcf1", 344 | "metadata": {}, 345 | "outputs": [ 346 | { 347 | "name": "stderr", 348 | "output_type": "stream", 349 | "text": [ 350 | "INFO:__main__:Sentences from text1: ['As the sun sank below the horizon, casting shadows across Thistledown, a group of adventurers gathered around a flickering campfire.', 'The air was full of the scent of pine and the distant calls of nocturnal animals.', '\"Are you sure this is the right path, Elaria?\" asked Thorne, his hand on the hilt of his sword.', 'He peered out at the darkening forest warily.', '\"The map led us here,\" said Elaria, a slim elf with piercing green eyes.', '\"The ancient runes spoke of a temple hidden beyond the Silverstream.', 'We must trust in the old ways.\"\\n\\nBrakkar, the burly dwarf, sighed and adjusted his axe.', '\"Trusting in old runes and forgotten temples...', 'This had better lead to treasure worth all this trouble.\"\\n\\n\"Not all treasures are made of gold, Brakkar,\" said Lyra, the group\\'s sorceress.', 'Her eyes reflected the light of the fire.', '\"Some secrets are far more valuable.\"\\n\\nThorne smiled.', '\"Secrets or gold...', 'We\\'ll find out soon enough.\"']\n", 351 | "INFO:__main__:Sentences from text2: ['The sun was just setting, and Thistledown was darkening by degrees.', 'A campfire burned at the center of a small group of adventurers—some humans, some elves, and one dwarf.', 'The air held the scent of pine, and the distant call of nocturnal animals echoed through the 
forest.', '\"Are you sure this is right?\" Thorne asked Elaria, who was pointing at a trail.', '\"We\\'re supposed to be going toward the Silverstream.\"\\n\\n\"The map we found led us here,\" said Elaria, an elf.', 'Her green eyes glinted with a hint of mischief in the firelight.', '\"It spoke of a temple hidden beyond the river.', 'We\\'ll see if those old runes hold any truth.\"\\n\\n\"Trusting in old runes and forgotten temples?\" Brakkar said, adjusting his axe.', '\"Hope it leads to treasure worth all this trouble.\"\\n\\nLyra smiled at that.', '\"Not all treasures are made of gold.', 'Some secrets are far more valuable.\"']\n" 352 | ] 353 | }, 354 | { 355 | "data": { 356 | "application/vnd.jupyter.widget-view+json": { 357 | "model_id": "03a8f2c11a144c93bcaf43800780be0d", 358 | "version_major": 2, 359 | "version_minor": 0 360 | }, 361 | "text/plain": [ 362 | "Batches: 0%| | 0/1 [00:00(Optimal K=4)\",\n", 753 | " showarrow=True,\n", 754 | " arrowhead=2,\n", 755 | " arrowcolor=\"red\",\n", 756 | " font=dict(size=12, color=\"red\")\n", 757 | ")\n", 758 | "\n", 759 | "fig.show()\n", 760 | "\n", 761 | "print(\"WCSS values for each K:\")\n", 762 | "for k, wcss in zip(k_range, wcss_values):\n", 763 | " print(f\"K={k}: WCSS={wcss:.2f}\")" 764 | ] 765 | }, 766 | { 767 | "cell_type": "markdown", 768 | "id": "5676be77", 769 | "metadata": {}, 770 | "source": [ 771 | "## Step 8: Business Applications\n", 772 | "\n", 773 | "Now let's interpret our customer segments and think about how a business might use this information." 
774 | ] 775 | }, 776 | { 777 | "cell_type": "code", 778 | "execution_count": null, 779 | "id": "bdbb7792", 780 | "metadata": {}, 781 | "outputs": [], 782 | "source": [ 783 | "# Analyze our customer segments\n", 784 | "def analyze_segments(df):\n", 785 | " \"\"\"\n", 786 | " Provide business insights for each customer segment\n", 787 | " \"\"\"\n", 788 | " segment_insights = {}\n", 789 | " \n", 790 | " for cluster in df['Cluster'].unique():\n", 791 | " cluster_data = df[df['Cluster'] == cluster]\n", 792 | " \n", 793 | " insights = {\n", 794 | " 'count': len(cluster_data),\n", 795 | " 'avg_age': cluster_data['Age'].mean(),\n", 796 | " 'avg_income': cluster_data['Annual_Income'].mean(),\n", 797 | " 'age_range': (cluster_data['Age'].min(), cluster_data['Age'].max()),\n", 798 | " 'income_range': (cluster_data['Annual_Income'].min(), \n", 799 | " cluster_data['Annual_Income'].max())\n", 800 | " }\n", 801 | " \n", 802 | " # Business interpretation\n", 803 | " if insights['avg_age'] < 35 and insights['avg_income'] > 60:\n", 804 | " insights['segment_name'] = \"Young Professionals\"\n", 805 | " insights['marketing_strategy'] = \"Tech products, career development, premium services\"\n", 806 | " elif insights['avg_age'] > 50 and insights['avg_income'] > 60:\n", 807 | " insights['segment_name'] = \"Affluent Seniors\"\n", 808 | " insights['marketing_strategy'] = \"Luxury goods, travel, health & wellness\"\n", 809 | " elif insights['avg_income'] < 50:\n", 810 | " insights['segment_name'] = \"Budget Conscious\"\n", 811 | " insights['marketing_strategy'] = \"Value products, discounts, essential services\"\n", 812 | " else:\n", 813 | " insights['segment_name'] = \"Middle Market\"\n", 814 | " insights['marketing_strategy'] = \"Quality products, family-oriented services\"\n", 815 | " \n", 816 | " segment_insights[f'Cluster {cluster + 1}'] = insights\n", 817 | " \n", 818 | " return segment_insights\n", 819 | "\n", 820 | "# Get business insights\n", 821 | "business_insights = 
analyze_segments(customers_df)\n", 822 | "\n", 823 | "print(\"\\n\" + \"=\"*60)\n", 824 | "print(\"CUSTOMER SEGMENT ANALYSIS & MARKETING RECOMMENDATIONS\")\n", 825 | "print(\"=\"*60)\n", 826 | "\n", 827 | "for cluster_name, insights in business_insights.items():\n", 828 | " print(f\"\\n{cluster_name}: {insights['segment_name']}\")\n", 829 | " print(f\" • Size: {insights['count']} customers\")\n", 830 | " print(f\" • Average Age: {insights['avg_age']:.1f} years\")\n", 831 | " print(f\" • Average Income: ${insights['avg_income']:.1f}k\")\n", 832 | " print(f\" • Marketing Focus: {insights['marketing_strategy']}\")\n", 833 | " print(\"-\" * 40)" 834 | ] 835 | }, 836 | { 837 | "cell_type": "markdown", 838 | "id": "8245cc96", 839 | "metadata": {}, 840 | "source": [ 841 | "## Step 9: Interactive Clustering Tool\n", 842 | "\n", 843 | "Let's create an interactive visualization where you can experiment with different numbers of clusters!" 844 | ] 845 | }, 846 | { 847 | "cell_type": "code", 848 | "execution_count": null, 849 | "id": "f8ba1670", 850 | "metadata": {}, 851 | "outputs": [], 852 | "source": [ 853 | "def create_interactive_clustering(data, k_values=(2, 3, 4, 5, 6)):\n", 854 | " \"\"\"\n", 855 | " Create subplots showing clustering results for different K values\n", 856 | " \"\"\"\n", 857 | " from plotly.subplots import make_subplots\n", 858 | " \n", 859 | " n_plots = len(k_values)\n", 860 | " cols = min(3, n_plots)\n", 861 | " rows = (n_plots + cols - 1) // cols\n", 862 | " \n", 863 | " fig = make_subplots(\n", 864 | " rows=rows, cols=cols,\n", 865 | " subplot_titles=[f'K = {k}' for k in k_values],\n", 866 | " horizontal_spacing=0.1,\n", 867 | " vertical_spacing=0.15\n", 868 | " )\n", 869 | " \n", 870 | " colors = ['red', 'blue', 'green', 'orange', 'purple', 'brown']\n", 871 | " \n", 872 | " for idx, k in enumerate(k_values):\n", 873 | " row = (idx // cols) + 1\n", 874 | " col = (idx % cols) + 1\n", 875 | " \n", 876 | " # Perform clustering\n", 877 | " kmeans = 
KMeans(n_clusters=k, random_state=42, n_init=10)\n", 878 | " labels = kmeans.fit_predict(data)\n", 879 | " centroids = kmeans.cluster_centers_\n", 880 | " \n", 881 | " # Add data points\n", 882 | " for cluster in range(k):\n", 883 | " cluster_data = data[labels == cluster]\n", 884 | " fig.add_trace(\n", 885 | " go.Scatter(\n", 886 | " x=cluster_data[:, 0],\n", 887 | " y=cluster_data[:, 1],\n", 888 | " mode='markers',\n", 889 | " marker=dict(color=colors[cluster % len(colors)], size=6),\n", 890 | " name=f'Cluster {cluster + 1}' if idx == 0 else None,\n", 891 | " showlegend=idx == 0,\n", 892 | " legendgroup=f'cluster_{cluster}'\n", 893 | " ),\n", 894 | " row=row, col=col\n", 895 | " )\n", 896 | " \n", 897 | " # Add centroids\n", 898 | " fig.add_trace(\n", 899 | " go.Scatter(\n", 900 | " x=centroids[:, 0],\n", 901 | " y=centroids[:, 1],\n", 902 | " mode='markers',\n", 903 | " marker=dict(color='black', size=12, symbol='x'),\n", 904 | " name='Centroids' if idx == 0 else None,\n", 905 | " showlegend=idx == 0\n", 906 | " ),\n", 907 | " row=row, col=col\n", 908 | " )\n", 909 | " \n", 910 | " fig.update_layout(\n", 911 | " title_text=\"K-Means Clustering: Comparing Different Numbers of Clusters\",\n", 912 | " height=400 * rows,\n", 913 | " showlegend=True\n", 914 | " )\n", 915 | " \n", 916 | " # Update axis labels\n", 917 | " for i in range(1, rows + 1):\n", 918 | " for j in range(1, cols + 1):\n", 919 | " fig.update_xaxes(title_text=\"Age\", row=i, col=j)\n", 920 | " fig.update_yaxes(title_text=\"Income ($k)\", row=i, col=j)\n", 921 | " \n", 922 | " return fig\n", 923 | "\n", 924 | "# Create interactive comparison\n", 925 | "interactive_fig = create_interactive_clustering(data_points, [2, 3, 4, 5, 6])\n", 926 | "interactive_fig.show()\n", 927 | "\n", 928 | "print(\"\\nCompare the different clustering results above!\")\n", 929 | "print(\"Notice how K=4 seems to capture the natural groups best.\")" 930 | ] 931 | }, 932 | { 933 | "cell_type": "markdown", 934 | "id": 
"403e2f7f", 935 | "metadata": {}, 936 | "source": [ 937 | "## How to Run the Animations\n", 938 | "\n", 939 | "To run these Manim animations:\n", 940 | "\n", 941 | "1. **Install Manim** if you haven't already:\n", 942 | " ```bash\n", 943 | " pip install manim\n", 944 | " ```\n", 945 | "\n", 946 | "2. **Run each animation cell** one by one. The `%%manim` magic command will:\n", 947 | " - Generate the animation\n", 948 | " - Save it as a video file\n", 949 | " - Display it in the notebook\n", 950 | "\n", 951 | "3. **Animation Quality**: The animations are set to medium quality (`-qm`) for a good balance of visual quality and file size.\n", 952 | "\n", 953 | "**Note**: The first render may take a while. `pip install manim` installs only the Python package; scenes that use `Tex` or `MathTex` also need a LaTeX distribution installed on your system.\n", 954 | "\n", 955 | "## Summary\n", 956 | "\n", 957 | "🎯 **What we learned about K-Means Clustering:**\n", 958 | "\n", 959 | "- **Clustering groups similar data points together automatically**\n", 960 | "- **K-Means uses distance to measure similarity**\n", 961 | "- **The algorithm iteratively improves cluster assignments**\n", 962 | "- **Choosing the right number of clusters is crucial**\n", 963 | "- **Business applications include customer segmentation, market research, and more**\n", 964 | "\n", 965 | "The animations help visualize these abstract concepts, making machine learning more accessible and intuitive!"
966 | ] 967 | } 968 | ], 969 | "metadata": { 970 | "language_info": { 971 | "name": "python" 972 | } 973 | }, 974 | "nbformat": 4, 975 | "nbformat_minor": 5 976 | } 977 | -------------------------------------------------------------------------------- /Notebooks/Neural Networks 1 - Single Neuron.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "id": "c03583b3", 6 | "metadata": {}, 7 | "source": [ 8 | "# Neural Networks from Scratch - Part 1: The Single Neuron\n", 9 | "\n", 10 | "Welcome to the fascinating world of Neural Networks! In this first notebook, we'll start simple and build a single artificial neuron from scratch. You'll learn how a neuron thinks, makes decisions, and processes information - all explained in plain English with beautiful animations.\n", 11 | "\n", 12 | "## What You'll Learn:\n", 13 | "- What is an artificial neuron?\n", 14 | "- How neurons process information\n", 15 | "- Weights, biases, and activation functions\n", 16 | "- Programming a neuron from scratch\n", 17 | "- Making your first predictions!" 18 | ] 19 | }, 20 | { 21 | "cell_type": "markdown", 22 | "id": "4cc45336", 23 | "metadata": {}, 24 | "source": [ 25 | "## Step 1: Setting Up Our Environment\n", 26 | "\n", 27 | "Let's import the libraries we need. We'll use minimal dependencies to understand everything from the ground up." 
28 | ] 29 | }, 30 | { 31 | "cell_type": "code", 32 | "execution_count": null, 33 | "id": "e16f6aa1", 34 | "metadata": {}, 35 | "outputs": [], 36 | "source": [ 37 | "import numpy as np\n", 38 | "import matplotlib.pyplot as plt\n", 39 | "import plotly.graph_objects as go\n", 40 | "import plotly.express as px\n", 41 | "from manim import *\n", 42 | "import random\n", 43 | "\n", 44 | "# Set random seeds for reproducible results\n", 45 | "np.random.seed(42)\n", 46 | "random.seed(42)\n", 47 | "\n", 48 | "print(\"Environment set up successfully!\")\n", 49 | "print(\"Ready to build our first neuron! 🧠\")" 50 | ] 51 | }, 52 | { 53 | "cell_type": "markdown", 54 | "id": "2814532f", 55 | "metadata": {}, 56 | "source": [ 57 | "## Step 2: Understanding the Inspiration\n", 58 | "\n", 59 | "Artificial neurons are inspired by biological neurons in our brain. Let's understand the basic concept before we start coding." 60 | ] 61 | }, 62 | { 63 | "cell_type": "code", 64 | "execution_count": null, 65 | "id": "9460e4d7", 66 | "metadata": {}, 67 | "outputs": [], 68 | "source": [ 69 | "# Let's create a simple example to understand what a neuron does\n", 70 | "print(\"🧠 BIOLOGICAL NEURON vs ARTIFICIAL NEURON\")\n", 71 | "print(\"=\" * 50)\n", 72 | "\n", 73 | "print(\"\\nBiological Neuron:\")\n", 74 | "print(\"• Receives signals from other neurons\")\n", 75 | "print(\"• Processes these signals\")\n", 76 | "print(\"• Decides whether to 'fire' (send signal)\")\n", 77 | "print(\"• Sends signal to other neurons\")\n", 78 | "\n", 79 | "print(\"\\nArtificial Neuron:\")\n", 80 | "print(\"• Receives numbers as inputs\")\n", 81 | "print(\"• Multiplies inputs by weights\")\n", 82 | "print(\"• Adds them up (+ bias)\")\n", 83 | "print(\"• Applies activation function\")\n", 84 | "print(\"• Outputs a number\")\n", 85 | "\n", 86 | "print(\"\\n💡 Key Insight: A neuron is just a mathematical function!\")" 87 | ] 88 | }, 89 | { 90 | "cell_type": "markdown", 91 | "id": "c765c6ed", 92 | "metadata": {}, 93 | 
"source": [ 94 | "## Step 3: Our First Simple Example\n", 95 | "\n", 96 | "Let's create a neuron that helps decide whether to go outside based on weather conditions." 97 | ] 98 | }, 99 | { 100 | "cell_type": "code", 101 | "execution_count": null, 102 | "id": "423f52d7", 103 | "metadata": {}, 104 | "outputs": [], 105 | "source": [ 106 | "# Simple example: \"Should I go outside?\"\n", 107 | "# Inputs: [temperature, sunshine, wind_speed]\n", 108 | "# Output: go_outside (0 = stay inside, 1 = go outside)\n", 109 | "\n", 110 | "def simple_neuron_example():\n", 111 | " print(\"🌤️ WEATHER DECISION NEURON\")\n", 112 | " print(\"=\" * 40)\n", 113 | " \n", 114 | " # Example weather data\n", 115 | " weather_conditions = [\n", 116 | " {\"temp\": 25, \"sunshine\": 8, \"wind\": 2, \"description\": \"Perfect day\"},\n", 117 | " {\"temp\": 5, \"sunshine\": 2, \"wind\": 8, \"description\": \"Cold and windy\"},\n", 118 | " {\"temp\": 30, \"sunshine\": 9, \"wind\": 1, \"description\": \"Hot and sunny\"},\n", 119 | " {\"temp\": 15, \"sunshine\": 3, \"wind\": 6, \"description\": \"Cool and cloudy\"}\n", 120 | " ]\n", 121 | " \n", 122 | " # Our neuron's \"opinion\" (weights)\n", 123 | " # How much does each factor matter?\n", 124 | " weight_temp = 0.3 # Temperature matters moderately\n", 125 | " weight_sunshine = 0.5 # Sunshine matters a lot!\n", 126 | " weight_wind = -0.4 # Wind is bad (negative weight)\n", 127 | " bias = -5 # General preference to stay inside\n", 128 | " \n", 129 | " print(f\"Neuron's preferences (weights):\")\n", 130 | " print(f\"• Temperature importance: {weight_temp}\")\n", 131 | " print(f\"• Sunshine importance: {weight_sunshine}\")\n", 132 | " print(f\"• Wind importance: {weight_wind} (negative = bad)\")\n", 133 | " print(f\"• Base preference (bias): {bias}\\n\")\n", 134 | " \n", 135 | " for weather in weather_conditions:\n", 136 | " # This is what our neuron does:\n", 137 | " # 1. Multiply each input by its weight\n", 138 | " # 2. Add them all up\n", 139 | " # 3. 
Add the bias\n", 140 | " \n", 141 | " decision_score = (weather[\"temp\"] * weight_temp + \n", 142 | " weather[\"sunshine\"] * weight_sunshine + \n", 143 | " weather[\"wind\"] * weight_wind + \n", 144 | " bias)\n", 145 | " \n", 146 | " # 4. Apply activation function (simple threshold)\n", 147 | " go_outside = 1 if decision_score > 0 else 0\n", 148 | " \n", 149 | " print(f\"{weather['description']:15} | Score: {decision_score:6.1f} | Decision: {'GO OUT! 🌞' if go_outside else 'Stay in 🏠'}\")\n", 150 | "\n", 151 | "simple_neuron_example()" 152 | ] 153 | }, 154 | { 155 | "cell_type": "markdown", 156 | "id": "6f2cd6ea", 157 | "metadata": {}, 158 | "source": [ 159 | "## Step 4: Building Our Neuron Class\n", 160 | "\n", 161 | "Now let's code a proper neuron class that we can reuse. We'll build it step by step so you understand every line." 162 | ] 163 | }, 164 | { 165 | "cell_type": "code", 166 | "execution_count": null, 167 | "id": "984607ae", 168 | "metadata": {}, 169 | "outputs": [], 170 | "source": [ 171 | "class SimpleNeuron:\n", 172 | " \"\"\"\n", 173 | " A simple artificial neuron that processes inputs and produces output.\n", 174 | " \n", 175 | " The neuron:\n", 176 | " 1. Takes inputs (numbers)\n", 177 | " 2. Multiplies each input by a weight\n", 178 | " 3. Adds all weighted inputs together\n", 179 | " 4. Adds a bias term\n", 180 | " 5. Applies an activation function\n", 181 | " 6. 
Returns the output\n", 182 | " \"\"\"\n", 183 | " \n", 184 | " def __init__(self, num_inputs):\n", 185 | " \"\"\"\n", 186 | " Initialize the neuron with random weights and bias.\n", 187 | " \n", 188 | " Args:\n", 189 | " num_inputs: How many input numbers this neuron expects\n", 190 | " \"\"\"\n", 191 | " # Initialize weights randomly (small values)\n", 192 | " self.weights = np.random.randn(num_inputs) * 0.5\n", 193 | " \n", 194 | " # Initialize bias to zero\n", 195 | " self.bias = 0.0\n", 196 | " \n", 197 | " print(f\"🧠 Neuron created with {num_inputs} inputs\")\n", 198 | " print(f\" Initial weights: {self.weights.round(3)}\")\n", 199 | " print(f\" Initial bias: {self.bias}\")\n", 200 | " \n", 201 | " def forward(self, inputs):\n", 202 | " \"\"\"\n", 203 | " Process inputs through the neuron (forward pass).\n", 204 | " \n", 205 | " This is the core of what a neuron does!\n", 206 | " \"\"\"\n", 207 | " # Step 1: Multiply inputs by weights (this is our dot product!)\n", 208 | " weighted_sum = np.dot(inputs, self.weights)\n", 209 | " \n", 210 | " # Step 2: Add bias\n", 211 | " z = weighted_sum + self.bias\n", 212 | " \n", 213 | " # Step 3: Apply activation function (sigmoid)\n", 214 | " output = self.sigmoid(z)\n", 215 | " \n", 216 | " return output, z # Return both output and raw sum\n", 217 | " \n", 218 | " def sigmoid(self, z):\n", 219 | " \"\"\"\n", 220 | " Sigmoid activation function.\n", 221 | " \n", 222 | " This function:\n", 223 | " - Takes any number (positive or negative)\n", 224 | " - Returns a number between 0 and 1\n", 225 | " - Creates smooth, S-shaped curve\n", 226 | " \"\"\"\n", 227 | " # Prevent overflow for very large numbers\n", 228 | " z = np.clip(z, -500, 500)\n", 229 | " return 1 / (1 + np.exp(-z))\n", 230 | " \n", 231 | " def set_weights_and_bias(self, weights, bias):\n", 232 | " \"\"\"\n", 233 | " Manually set weights and bias (useful for examples).\n", 234 | " \"\"\"\n", 235 | " self.weights = np.array(weights)\n", 236 | " self.bias = 
bias\n", 237 | " print(f\"✏️ Weights set to: {self.weights}\")\n", 238 | " print(f\" Bias set to: {self.bias}\")\n", 239 | "\n", 240 | "# Test our neuron!\n", 241 | "print(\"Let's create our first neuron!\\n\")\n", 242 | "neuron = SimpleNeuron(num_inputs=3)\n", 243 | "\n", 244 | "# Test with some inputs\n", 245 | "test_input = [1.0, 2.0, -0.5]\n", 246 | "output, raw_sum = neuron.forward(test_input)\n", 247 | "\n", 248 | "print(f\"\\n🧪 Testing the neuron:\")\n", 249 | "print(f\" Input: {test_input}\")\n", 250 | "print(f\" Raw sum (z): {raw_sum:.3f}\")\n", 251 | "print(f\" Final output: {output:.3f}\")" 252 | ] 253 | }, 254 | { 255 | "cell_type": "markdown", 256 | "id": "d7720b09", 257 | "metadata": {}, 258 | "source": [ 259 | "## Step 5: Understanding Activation Functions\n", 260 | "\n", 261 | "The activation function is crucial - it decides when the neuron should \"fire\". Let's explore different types." 262 | ] 263 | }, 264 | { 265 | "cell_type": "code", 266 | "execution_count": null, 267 | "id": "7432f634", 268 | "metadata": {}, 269 | "outputs": [], 270 | "source": [ 271 | "# Let's visualize different activation functions\n", 272 | "def plot_activation_functions():\n", 273 | " \"\"\"\n", 274 | " Show different activation functions and explain what they do.\n", 275 | " \"\"\"\n", 276 | " z_values = np.linspace(-10, 10, 200)\n", 277 | " \n", 278 | " # Different activation functions\n", 279 | " def sigmoid(z):\n", 280 | " return 1 / (1 + np.exp(-np.clip(z, -500, 500)))\n", 281 | " \n", 282 | " def relu(z):\n", 283 | " return np.maximum(0, z)\n", 284 | " \n", 285 | " def tanh(z):\n", 286 | " return np.tanh(z)\n", 287 | " \n", 288 | " def step(z):\n", 289 | " return (z > 0).astype(float)\n", 290 | " \n", 291 | " # Create the plot\n", 292 | " fig = go.Figure()\n", 293 | " \n", 294 | " # Add each activation function\n", 295 | " functions = [\n", 296 | " (sigmoid, \"Sigmoid\", \"blue\", \"Smooth, outputs 0-1\"),\n", 297 | " (relu, \"ReLU\", \"red\", \"Simple, outputs 0 
or input\"),\n", 298 | " (tanh, \"Tanh\", \"green\", \"Smooth, outputs -1 to 1\"),\n", 299 | " (step, \"Step\", \"orange\", \"Binary, outputs 0 or 1\")\n", 300 | " ]\n", 301 | " \n", 302 | " for func, name, color, description in functions:\n", 303 | " y_values = func(z_values)\n", 304 | " fig.add_trace(go.Scatter(\n", 305 | " x=z_values,\n", 306 | " y=y_values,\n", 307 | " mode='lines',\n", 308 | " name=f\"{name}: {description}\",\n", 309 | " line=dict(color=color, width=3)\n", 310 | " ))\n", 311 | " \n", 312 | " fig.update_layout(\n", 313 | " title=\"Activation Functions: How Neurons Make Decisions\",\n", 314 | " xaxis_title=\"Input to Activation Function (z)\",\n", 315 | " yaxis_title=\"Output\",\n", 316 | " hovermode='x unified',\n", 317 | " showlegend=True\n", 318 | " )\n", 319 | " \n", 320 | " return fig\n", 321 | "\n", 322 | "# Show the plot\n", 323 | "activation_plot = plot_activation_functions()\n", 324 | "activation_plot.show()\n", 325 | "\n", 326 | "print(\"\\n📊 Activation Function Comparison:\")\n", 327 | "print(\"• Sigmoid: Smooth probability-like output (0 to 1)\")\n", 328 | "print(\"• ReLU: Simple and fast, outputs 0 for negative inputs and the input itself otherwise\")\n", 329 | "print(\"• Tanh: Like sigmoid but ranges from -1 to 1\")\n", 330 | "print(\"• Step: Binary decision, like an on/off switch\")" 331 | ] 332 | }, 333 | { 334 | "cell_type": "markdown", 335 | "id": "e43e2617", 336 | "metadata": {}, 337 | "source": [ 338 | "## Step 6: Interactive Neuron Playground\n", 339 | "\n", 340 | "Let's create an interactive example where you can see how changing weights affects the neuron's behavior."
341 | ] 342 | }, 343 | { 344 | "cell_type": "code", 345 | "execution_count": null, 346 | "id": "67685377", 347 | "metadata": {}, 348 | "outputs": [], 349 | "source": [ 350 | "def neuron_playground(inputs, weight_ranges, bias_range):\n", 351 | " \"\"\"\n", 352 | " Playground showing how weights and bias affect the output (fixed test cases; weight_ranges and bias_range are currently unused).\n", 353 | " \"\"\"\n", 354 | " print(\"🎮 NEURON PLAYGROUND\")\n", 355 | " print(\"=\" * 30)\n", 356 | " print(f\"Fixed inputs: {inputs}\")\n", 357 | " print(\"\\nLet's see how different weights change the output:\\n\")\n", 358 | " \n", 359 | " # Test different weight combinations\n", 360 | " test_cases = [\n", 361 | " ([1.0, 1.0, 1.0], 0.0, \"All weights equal\"),\n", 362 | " ([2.0, 0.0, 0.0], 0.0, \"Only first input matters\"),\n", 363 | " ([0.0, 2.0, 0.0], 0.0, \"Only second input matters\"),\n", 364 | " ([1.0, 1.0, 1.0], 2.0, \"Positive bias (easier to activate)\"),\n", 365 | " ([1.0, 1.0, 1.0], -2.0, \"Negative bias (harder to activate)\"),\n", 366 | " ([-1.0, -1.0, -1.0], 0.0, \"All negative weights\")\n", 367 | " ]\n", 368 | " \n", 369 | " neuron = SimpleNeuron(len(inputs))\n", 370 | " \n", 371 | " for weights, bias, description in test_cases:\n", 372 | " neuron.set_weights_and_bias(weights, bias)\n", 373 | " output, z = neuron.forward(inputs)\n", 374 | " \n", 375 | " print(f\"{description:30} | Raw sum: {z:6.2f} | Output: {output:.3f}\")\n", 376 | " \n", 377 | " print(\"\\n💡 Notice how:\")\n", 378 | " print(\"• A positive weight pushes the raw sum in the same direction as its input\")\n", 379 | " print(\"• A negative weight pushes the raw sum in the opposite direction\")\n", 380 | " print(\"• Bias shifts the decision threshold\")\n", 381 | " print(\"• Sigmoid keeps output between 0 and 1\")\n", 382 | "\n", 383 | "# Run the playground\n", 384 | "test_inputs = [0.5, 1.5, -0.8]\n", 385 | "neuron_playground(test_inputs, [(-2, 2), (-2, 2), (-2, 2)], (-3, 3))" 386 | ] 387 | }, 388 | { 389 | "cell_type": "markdown", 390 | "id": "9e9f8ea0", 391 | "metadata": {}, 392 | "source": [ 393 | "## Step 7: Manim 
Animations\n", 394 | "\n", 395 | "Now let's create beautiful animations to visualize how a neuron works!" 396 | ] 397 | }, 398 | { 399 | "cell_type": "markdown", 400 | "id": "17232b9b", 401 | "metadata": {}, 402 | "source": [ 403 | "### Animation 1: Neuron Structure" 404 | ] 405 | }, 406 | { 407 | "cell_type": "code", 408 | "execution_count": null, 409 | "id": "b5f5430f", 410 | "metadata": {}, 411 | "outputs": [], 412 | "source": [ 413 | "%%manim -qm -v WARNING NeuronStructure\n", 414 | "\n", 415 | "class NeuronStructure(Scene):\n", 416 | " def construct(self):\n", 417 | " # Title\n", 418 | " title = Text(\"Anatomy of an Artificial Neuron\", font_size=42, color=BLUE)\n", 419 | " self.play(Write(title))\n", 420 | " self.wait(1)\n", 421 | " self.play(title.animate.to_edge(UP))\n", 422 | " \n", 423 | " # Create input nodes\n", 424 | " input_positions = [[-4, 2, 0], [-4, 0, 0], [-4, -2, 0]]\n", 425 | " inputs = []\n", 426 | " input_labels = []\n", 427 | " \n", 428 | " for i, pos in enumerate(input_positions):\n", 429 | " # Input circle\n", 430 | " input_node = Circle(radius=0.3, color=GREEN, fill_opacity=0.7)\n", 431 | " input_node.move_to(pos)\n", 432 | " inputs.append(input_node)\n", 433 | " \n", 434 | " # Input label\n", 435 | " label = Text(f\"x{i+1}\", font_size=20, color=WHITE)\n", 436 | " label.move_to(pos)\n", 437 | " input_labels.append(label)\n", 438 | " \n", 439 | " # Create neuron (main processing unit)\n", 440 | " neuron = Circle(radius=0.8, color=BLUE, fill_opacity=0.8)\n", 441 | " neuron.move_to([2, 0, 0])\n", 442 | " neuron_label = Text(\"Neuron\", font_size=24, color=WHITE)\n", 443 | " neuron_label.move_to([2, 0, 0])\n", 444 | " \n", 445 | " # Create output\n", 446 | " output = Circle(radius=0.3, color=RED, fill_opacity=0.7)\n", 447 | " output.move_to([5, 0, 0])\n", 448 | " output_label = Text(\"y\", font_size=20, color=WHITE)\n", 449 | " output_label.move_to([5, 0, 0])\n", 450 | " \n", 451 | " # Show inputs first\n", 452 | " explanation1 = 
Text(\"Inputs: Raw data numbers\", font_size=24, color=GREEN)\n", 453 | " explanation1.to_edge(DOWN)\n", 454 | " \n", 455 | " self.play(Write(explanation1))\n", 456 | " for inp, label in zip(inputs, input_labels):\n", 457 | " self.play(Create(inp), Write(label), run_time=0.5)\n", 458 | " \n", 459 | " self.wait(1)\n", 460 | " \n", 461 | " # Show neuron\n", 462 | " explanation2 = Text(\"Neuron: Processes and combines inputs\", font_size=24, color=BLUE)\n", 463 | " self.play(ReplacementTransform(explanation1, explanation2))\n", 464 | " self.play(Create(neuron), Write(neuron_label))\n", 465 | " \n", 466 | " # Create connections (weights)\n", 467 | " connections = []\n", 468 | " weight_labels = []\n", 469 | " weights = [0.8, -0.3, 1.2]\n", 470 | " \n", 471 | " for i, (inp_pos, weight) in enumerate(zip(input_positions, weights)):\n", 472 | " # Connection line\n", 473 | " line = Line(inp_pos, [2, 0, 0], color=YELLOW)\n", 474 | " connections.append(line)\n", 475 | " \n", 476 | " # Weight label\n", 477 | " mid_point = [(inp_pos[0] + 2)/2, inp_pos[1]/2, 0]\n", 478 | " w_label = Text(f\"w{i+1}={weight}\", font_size=16, color=YELLOW)\n", 479 | " w_label.move_to(mid_point)\n", 480 | " weight_labels.append(w_label)\n", 481 | " \n", 482 | " explanation3 = Text(\"Weights: How important each input is\", font_size=24, color=YELLOW)\n", 483 | " self.play(ReplacementTransform(explanation2, explanation3))\n", 484 | " \n", 485 | " for conn, w_label in zip(connections, weight_labels):\n", 486 | " self.play(Create(conn), Write(w_label), run_time=0.7)\n", 487 | " \n", 488 | " self.wait(1)\n", 489 | " \n", 490 | " # Show output\n", 491 | " output_line = Line([2, 0, 0], [5, 0, 0], color=RED)\n", 492 | " \n", 493 | " explanation4 = Text(\"Output: Final decision or prediction\", font_size=24, color=RED)\n", 494 | " self.play(ReplacementTransform(explanation3, explanation4))\n", 495 | " \n", 496 | " self.play(Create(output_line), Create(output), Write(output_label))\n", 497 | " \n", 498 | " # 
Show the math\n", 499 | " math_formula = MathTex(\n", 500 | " r\"y = \\sigma(w_1x_1 + w_2x_2 + w_3x_3 + b)\",\n", 501 | " font_size=32,\n", 502 | " color=WHITE\n", 503 | " )\n", 504 | " math_formula.move_to([0, -3, 0])\n", 505 | " \n", 506 | " self.play(Write(math_formula))\n", 507 | " \n", 508 | " # Final explanation\n", 509 | " final_explanation = Text(\n", 510 | " \"σ (sigma) = activation function (like sigmoid)\",\n", 511 | " font_size=20, color=WHITE\n", 512 | " )\n", 513 | " final_explanation.next_to(math_formula, DOWN)\n", 514 | " \n", 515 | " self.play(Write(final_explanation))\n", 516 | " self.wait(3)" 517 | ] 518 | }, 519 | { 520 | "cell_type": "markdown", 521 | "id": "8f5179fe", 522 | "metadata": {}, 523 | "source": [ 524 | "### Animation 2: Information Flow" 525 | ] 526 | }, 527 | { 528 | "cell_type": "code", 529 | "execution_count": null, 530 | "id": "13be8ce1", 531 | "metadata": {}, 532 | "outputs": [], 533 | "source": [ 534 | "%%manim -qm -v WARNING InformationFlow\n", 535 | "\n", 536 | "class InformationFlow(Scene):\n", 537 | " def construct(self):\n", 538 | " # Title\n", 539 | " title = Text(\"How Information Flows Through a Neuron\", font_size=38, color=BLUE)\n", 540 | " self.play(Write(title))\n", 541 | " self.wait(1)\n", 542 | " self.play(title.animate.to_edge(UP))\n", 543 | " \n", 544 | " # Create the neuron structure (simplified)\n", 545 | " inputs = [Circle(radius=0.2, color=GREEN, fill_opacity=0.8).move_to([-4, i, 0]) for i in [1, 0, -1]]\n", 546 | " neuron = Circle(radius=0.6, color=BLUE, fill_opacity=0.8).move_to([0, 0, 0])\n", 547 | " output = Circle(radius=0.2, color=RED, fill_opacity=0.8).move_to([4, 0, 0])\n", 548 | " \n", 549 | " # Input values\n", 550 | " input_values = [2.0, -1.5, 0.8]\n", 551 | " weights = [0.5, -0.3, 1.2]\n", 552 | " bias = 0.1\n", 553 | " \n", 554 | " # Labels\n", 555 | " input_labels = [Text(f\"{val}\", font_size=16, color=WHITE).move_to(inp.get_center()) \n", 556 | " for inp, val in zip(inputs, 
input_values)]\n", 557 | " \n", 558 | " # Create everything\n", 559 | " for inp, label in zip(inputs, input_labels):\n", 560 | " self.add(inp, label)\n", 561 | " self.add(neuron, output)\n", 562 | " \n", 563 | " # Step 1: Show inputs\n", 564 | " step1 = Text(\"Step 1: Inputs arrive at the neuron\", font_size=24, color=WHITE)\n", 565 | " step1.to_edge(DOWN)\n", 566 | " self.play(Write(step1))\n", 567 | " \n", 568 | " # Highlight inputs\n", 569 | " for inp in inputs:\n", 570 | " self.play(inp.animate.set_stroke(color=YELLOW, width=4), run_time=0.3)\n", 571 | " \n", 572 | " self.wait(1)\n", 573 | " \n", 574 | " # Step 2: Multiply by weights\n", 575 | " step2 = Text(\"Step 2: Each input is multiplied by its weight\", font_size=24, color=WHITE)\n", 576 | " self.play(ReplacementTransform(step1, step2))\n", 577 | " \n", 578 | " # Show calculations\n", 579 | " calculations = []\n", 580 | " for i, (val, weight) in enumerate(zip(input_values, weights)):\n", 581 | " calc_text = Text(f\"{val} × {weight} = {val*weight:.2f}\", \n", 582 | " font_size=16, color=YELLOW)\n", 583 | " calc_text.move_to([-2, 1-i*0.5, 0])\n", 584 | " calculations.append(calc_text)\n", 585 | " self.play(Write(calc_text), run_time=0.5)\n", 586 | " \n", 587 | " self.wait(1)\n", 588 | " \n", 589 | " # Step 3: Sum everything\n", 590 | " step3 = Text(\"Step 3: Add all weighted inputs + bias\", font_size=24, color=WHITE)\n", 591 | " self.play(ReplacementTransform(step2, step3))\n", 592 | " \n", 593 | " # Calculate sum\n", 594 | " weighted_sum = sum(val * weight for val, weight in zip(input_values, weights))\n", 595 | " total_sum = weighted_sum + bias\n", 596 | " \n", 597 | " sum_text = Text(f\"Sum = {weighted_sum:.2f} + {bias} = {total_sum:.2f}\", \n", 598 | " font_size=20, color=ORANGE)\n", 599 | " sum_text.move_to([0, -1.5, 0])\n", 600 | " \n", 601 | " self.play(Write(sum_text))\n", 602 | " \n", 603 | " # Clear calculations\n", 604 | " self.play(*[FadeOut(calc) for calc in calculations])\n", 605 | " \n", 606 
| " self.wait(1)\n", 607 | " \n", 608 | " # Step 4: Apply activation function\n", 609 | " step4 = Text(\"Step 4: Apply activation function (sigmoid)\", font_size=24, color=WHITE)\n", 610 | " self.play(ReplacementTransform(step3, step4))\n", 611 | " \n", 612 | " # Calculate sigmoid\n", 613 | " sigmoid_output = 1 / (1 + np.exp(-total_sum))\n", 614 | " \n", 615 | " sigmoid_text = Text(f\"σ({total_sum:.2f}) = {sigmoid_output:.3f}\", \n", 616 | " font_size=20, color=RED)\n", 617 | " sigmoid_text.move_to([2, -1.5, 0])\n", 618 | " \n", 619 | " self.play(Write(sigmoid_text))\n", 620 | " \n", 621 | " # Show final output\n", 622 | " output_label = Text(f\"{sigmoid_output:.3f}\", font_size=16, color=WHITE)\n", 623 | " output_label.move_to(output.get_center())\n", 624 | " self.play(Write(output_label))\n", 625 | " \n", 626 | " # Animate flow\n", 627 | " flow_dot = Dot(color=YELLOW, radius=0.1)\n", 628 | " flow_dot.move_to(inputs[0].get_center())\n", 629 | " \n", 630 | " self.play(Create(flow_dot))\n", 631 | " self.play(flow_dot.animate.move_to(neuron.get_center()), run_time=1)\n", 632 | " self.play(flow_dot.animate.move_to(output.get_center()), run_time=1)\n", 633 | " \n", 634 | " # Final message\n", 635 | " final_msg = Text(\"Information transformed from inputs to meaningful output!\", \n", 636 | " font_size=20, color=GREEN)\n", 637 | " self.play(ReplacementTransform(step4, final_msg))\n", 638 | " \n", 639 | " self.wait(3)" 640 | ] 641 | }, 642 | { 643 | "cell_type": "markdown", 644 | "id": "c78ba4d0", 645 | "metadata": {}, 646 | "source": [ 647 | "### Animation 3: Activation Functions in Action" 648 | ] 649 | }, 650 | { 651 | "cell_type": "code", 652 | "execution_count": null, 653 | "id": "19827598", 654 | "metadata": {}, 655 | "outputs": [], 656 | "source": [ 657 | "%%manim -qm -v WARNING ActivationFunctions\n", 658 | "\n", 659 | "class ActivationFunctions(Scene):\n", 660 | " def construct(self):\n", 661 | " # Title\n", 662 | " title = Text(\"Activation Functions: The 
Neuron's Decision Maker\", font_size=36, color=BLUE)\n", 663 | " self.play(Write(title))\n", 664 | " self.wait(1)\n", 665 | " self.play(title.animate.to_edge(UP))\n", 666 | " \n", 667 | " # Create axes\n", 668 | " axes = Axes(\n", 669 | " x_range=[-5, 5, 1],\n", 670 | " y_range=[-0.5, 1.5, 0.5],\n", 671 | " x_length=8,\n", 672 | " y_length=4,\n", 673 | " axis_config={\"color\": GREY}\n", 674 | " )\n", 675 | " \n", 676 | " x_label = Text(\"Input (z)\", font_size=20).next_to(axes.x_axis, DOWN)\n", 677 | " y_label = Text(\"Output\", font_size=20).next_to(axes.y_axis, LEFT)\n", 678 | " \n", 679 | " self.play(Create(axes), Write(x_label), Write(y_label))\n", 680 | " \n", 681 | " # Sigmoid function\n", 682 | " def sigmoid(x):\n", 683 | " return 1 / (1 + np.exp(-x))\n", 684 | " \n", 685 | " # Create sigmoid curve\n", 686 | " sigmoid_curve = axes.plot(sigmoid, x_range=[-5, 5], color=BLUE)\n", 687 | " \n", 688 | " sigmoid_label = Text(\"Sigmoid Function\", font_size=24, color=BLUE)\n", 689 | " sigmoid_label.to_edge(LEFT).shift(UP*2)\n", 690 | " \n", 691 | " self.play(Create(sigmoid_curve), Write(sigmoid_label))\n", 692 | " \n", 693 | " # Show key points\n", 694 | " points_to_show = [-3, -1, 0, 1, 3]\n", 695 | " dots = []\n", 696 | " labels = []\n", 697 | " \n", 698 | " for x_val in points_to_show:\n", 699 | " y_val = sigmoid(x_val)\n", 700 | " point = axes.coords_to_point(x_val, y_val)\n", 701 | " \n", 702 | " dot = Dot(point, color=RED, radius=0.08)\n", 703 | " label = Text(f\"({x_val}, {y_val:.2f})\", font_size=12, color=WHITE)\n", 704 | " label.next_to(dot, UP, buff=0.1)\n", 705 | " \n", 706 | " dots.append(dot)\n", 707 | " labels.append(label)\n", 708 | " \n", 709 | " # Animate points\n", 710 | " for dot, label in zip(dots, labels):\n", 711 | " self.play(Create(dot), Write(label), run_time=0.5)\n", 712 | " \n", 713 | " # Explanations\n", 714 | " explanations = [\n", 715 | " \"Large negative inputs → Close to 0\",\n", 716 | " \"Zero input → Exactly 0.5\", \n", 717 | " 
\"Large positive inputs → Close to 1\",\n", 718 | " \"Smooth transition (differentiable)\"\n", 719 | " ]\n", 720 | " \n", 721 | " explanation_text = Text(\"Key Properties:\", font_size=20, color=YELLOW)\n", 722 | " explanation_text.to_edge(DOWN).shift(UP*2)\n", 723 | " self.play(Write(explanation_text))\n", 724 | " \n", 725 | " for i, exp in enumerate(explanations):\n", 726 | " exp_text = Text(f\"• {exp}\", font_size=16, color=WHITE)\n", 727 | " exp_text.next_to(explanation_text, DOWN, buff=0.3)\n", 728 | " exp_text.shift(DOWN * i * 0.4)\n", 729 | " self.play(Write(exp_text), run_time=0.8)\n", 730 | " \n", 731 | " # Show decision boundary\n", 732 | " decision_line = DashedLine(\n", 733 | " axes.coords_to_point(0, -0.5),\n", 734 | " axes.coords_to_point(0, 1.5),\n", 735 | " color=GREEN\n", 736 | " )\n", 737 | " \n", 738 | " decision_label = Text(\"Decision\\nBoundary\", font_size=14, color=GREEN)\n", 739 | " decision_label.next_to(decision_line, RIGHT)\n", 740 | " \n", 741 | " self.play(Create(decision_line), Write(decision_label))\n", 742 | " \n", 743 | " # Final insight\n", 744 | " insight = Text(\n", 745 | " \"Sigmoid converts any number to a probability-like value!\",\n", 746 | " font_size=18, color=YELLOW\n", 747 | " )\n", 748 | " insight.to_edge(DOWN)\n", 749 | " \n", 750 | " self.play(Write(insight))\n", 751 | " self.wait(3)" 752 | ] 753 | }, 754 | { 755 | "cell_type": "markdown", 756 | "id": "a457d6a7", 757 | "metadata": {}, 758 | "source": [ 759 | "## Step 8: Real-World Example - House Price Prediction\n", 760 | "\n", 761 | "Let's use our neuron to solve a real problem: predicting if a house is expensive based on its features." 
762 | ] 763 | }, 764 | { 765 | "cell_type": "code", 766 | "execution_count": null, 767 | "id": "5e545ba1", 768 | "metadata": {}, 769 | "outputs": [], 770 | "source": [ 771 | "# Real-world example: House price prediction\n", 772 | "def house_price_neuron():\n", 773 | " \"\"\"\n", 774 | " Use our neuron to predict if a house is expensive.\n", 775 | " Inputs: [size_sqft, bedrooms, age_years]\n", 776 | " Output: probability that house is expensive (>$500k)\n", 777 | " \"\"\"\n", 778 | " print(\"🏠 HOUSE PRICE PREDICTION NEURON\")\n", 779 | " print(\"=\" * 40)\n", 780 | " \n", 781 | " # Create our house price neuron\n", 782 | " house_neuron = SimpleNeuron(num_inputs=3)\n", 783 | " \n", 784 | " # Set weights based on our intuition:\n", 785 | " # - Larger houses are more expensive (positive weight)\n", 786 | " # - More bedrooms = more expensive (positive weight)\n", 787 | " # - Older houses might be less expensive (negative weight)\n", 788 | " house_neuron.set_weights_and_bias(\n", 789 | " weights=[0.001, 0.3, -0.05], # size, bedrooms, age\n", 790 | " bias=-1.5 # Generally lean towards \"not expensive\"\n", 791 | " )\n", 792 | " \n", 793 | " # Test houses\n", 794 | " test_houses = [\n", 795 | " {\"size\": 1200, \"bedrooms\": 2, \"age\": 30, \"description\": \"Small older home\"},\n", 796 | " {\"size\": 2500, \"bedrooms\": 4, \"age\": 5, \"description\": \"Large new home\"},\n", 797 | " {\"size\": 1800, \"bedrooms\": 3, \"age\": 15, \"description\": \"Medium family home\"},\n", 798 | " {\"size\": 3500, \"bedrooms\": 5, \"age\": 2, \"description\": \"Luxury mansion\"},\n", 799 | " {\"size\": 900, \"bedrooms\": 1, \"age\": 50, \"description\": \"Tiny old cottage\"}\n", 800 | " ]\n", 801 | " \n", 802 | " print(\"\\nPredicting house prices...\\n\")\n", 803 | " print(f\"{'Description':20} | {'Size':6} | {'Beds':4} | {'Age':3} | {'Probability':11} | {'Prediction':10}\")\n", 804 | " print(\"-\" * 70)\n", 805 | " \n", 806 | " for house in test_houses:\n", 807 | " # Prepare inputs 
(size stays in square feet; its 0.001 weight does the scaling)\n", 808 | " inputs = [house[\"size\"], house[\"bedrooms\"], house[\"age\"]]\n", 809 | " \n", 810 | " # Get prediction\n", 811 | " probability, raw_sum = house_neuron.forward(inputs)\n", 812 | " \n", 813 | " # Convert to binary prediction\n", 814 | " prediction = \"Expensive\" if probability > 0.5 else \"Affordable\"\n", 815 | " \n", 816 | " print(f\"{house['description']:20} | {house['size']:4d}sq | {house['bedrooms']:2d}br | {house['age']:2d}y | {probability:11.3f} | {prediction:10}\")\n", 817 | " \n", 818 | " print(\"\\n💡 How the neuron 'thinks':\")\n", 819 | " print(\"• Size weight (0.001): Bigger houses add to expense score\")\n", 820 | " print(\"• Bedroom weight (0.3): More bedrooms add significantly\")\n", 821 | " print(\"• Age weight (-0.05): Older houses reduce expense score\")\n", 822 | " print(\"• Bias (-1.5): Default assumption is 'affordable'\")\n", 823 | " print(\"• Sigmoid output: Converts score to probability\")\n", 824 | "\n", 825 | "house_price_neuron()" 826 | ] 827 | }, 828 | { 829 | "cell_type": "markdown", 830 | "id": "43346046", 831 | "metadata": {}, 832 | "source": [ 833 | "## Step 9: Visualizing Decision Boundaries\n", 834 | "\n", 835 | "Let's see how a neuron divides its input space into two regions, one for each class."
836 | ] 837 | }, 838 | { 839 | "cell_type": "code", 840 | "execution_count": null, 841 | "id": "f29a80bb", 842 | "metadata": {}, 843 | "outputs": [], 844 | "source": [ 845 | "def visualize_decision_boundary():\n", 846 | " \"\"\"\n", 847 | " Create a 2D visualization of how the neuron makes decisions.\n", 848 | " We use a 2-input neuron so that both inputs fit on the plot axes.\n", 849 | " \"\"\"\n", 850 | " # Create a simpler 2-input neuron for visualization\n", 851 | " simple_neuron = SimpleNeuron(num_inputs=2)\n", 852 | " simple_neuron.set_weights_and_bias([1.5, -0.8], bias=-1.0)\n", 853 | " \n", 854 | " # Create grid of input values\n", 855 | " x1_range = np.linspace(-3, 4, 100)\n", 856 | " x2_range = np.linspace(-3, 4, 100)\n", 857 | " X1, X2 = np.meshgrid(x1_range, x2_range)\n", 858 | " \n", 859 | " # Calculate neuron output for each point\n", 860 | " Z = np.zeros_like(X1)\n", 861 | " for i in range(len(x1_range)):\n", 862 | " for j in range(len(x2_range)):\n", 863 | " output, _ = simple_neuron.forward([X1[i,j], X2[i,j]])\n", 864 | " Z[i,j] = output\n", 865 | " \n", 866 | " # Create the plot\n", 867 | " fig = go.Figure()\n", 868 | " \n", 869 | " # Add contour plot\n", 870 | " fig.add_trace(go.Contour(\n", 871 | " x=x1_range,\n", 872 | " y=x2_range,\n", 873 | " z=Z,\n", 874 | " colorscale='RdYlBu_r', # reversed so low outputs are blue and high outputs are red\n", 875 | " showscale=True,\n", 876 | " colorbar=dict(title=\"Neuron Output\"),\n", 877 | " contours=dict(\n", 878 | " start=0,\n", 879 | " end=1,\n", 880 | " size=0.1\n", 881 | " )\n", 882 | " ))\n", 883 | " \n", 884 | " # Add decision boundary (output = 0.5)\n", 885 | " fig.add_trace(go.Contour(\n", 886 | " x=x1_range,\n", 887 | " y=x2_range,\n", 888 | " z=Z,\n", 889 | " contours=dict(\n", 890 | " start=0.5,\n", 891 | " end=0.5,\n", 892 | " size=0.1, coloring='none' # draw only the line, no fill over the first trace\n", 893 | " ),\n", 894 | " line=dict(color='black', width=4),\n", 895 | " showscale=False,\n", 896 | " name=\"Decision Boundary\"\n", 897 | " ))\n", 898 | " \n", 899 | " # Add some sample points\n", 900 | " sample_points = [[-1, 2], [2, 1], 
[0, 0], [3, -1], [-2, -1]]\n", 901 | " \n", 902 | " for i, point in enumerate(sample_points):\n", 903 | " output, _ = simple_neuron.forward(point)\n", 904 | " color = 'red' if output > 0.5 else 'blue'\n", 905 | " fig.add_trace(go.Scatter(\n", 906 | " x=[point[0]],\n", 907 | " y=[point[1]],\n", 908 | " mode='markers',\n", 909 | " marker=dict(size=12, color=color, line=dict(color='white', width=2)),\n", 910 | " name=f\"Point {i+1} (out={output:.2f})\",\n", 911 | " showlegend=True\n", 912 | " ))\n", 913 | " \n", 914 | " fig.update_layout(\n", 915 | " title=\"Neuron Decision Boundary Visualization\",\n", 916 | " xaxis_title=\"Input 1 (x1)\",\n", 917 | " yaxis_title=\"Input 2 (x2)\",\n", 918 | " width=700,\n", 919 | " height=600\n", 920 | " )\n", 921 | " \n", 922 | " return fig\n", 923 | "\n", 924 | "# Show the decision boundary\n", 925 | "boundary_plot = visualize_decision_boundary()\n", 926 | "boundary_plot.show()\n", 927 | "\n", 928 | "print(\"\\n🎯 Understanding the Decision Boundary:\")\n", 929 | "print(\"• Blue region: Neuron outputs < 0.5 (Class 0)\")\n", 930 | "print(\"• Red region: Neuron outputs > 0.5 (Class 1)\")\n", 931 | "print(\"• Black line: Decision boundary (output = 0.5)\")\n", 932 | "print(\"• Single neuron creates LINEAR decision boundary\")\n", 933 | "print(\"• Points are colored by their actual predictions\")" 934 | ] 935 | }, 936 | { 937 | "cell_type": "markdown", 938 | "id": "96560fc8", 939 | "metadata": {}, 940 | "source": [ 941 | "## Step 10: Key Takeaways\n", 942 | "\n", 943 | "Let's summarize what we've learned about single neurons and prepare for the next notebook." 
944 | ] 945 | }, 946 | { 947 | "cell_type": "code", 948 | "execution_count": null, 949 | "id": "5089d751", 950 | "metadata": {}, 951 | "outputs": [], 952 | "source": [ 953 | "print(\"🧠 NEURAL NETWORKS PART 1: SINGLE NEURON - SUMMARY\")\n", 954 | "print(\"=\" * 60)\n", 955 | "print()\n", 956 | "print(\"✅ What we learned:\")\n", 957 | "print(\"• A neuron is a mathematical function that processes inputs\")\n", 958 | "print(\"• Key components: weights, bias, activation function\")\n", 959 | "print(\"• Weights determine input importance\")\n", 960 | "print(\"• Bias shifts the decision threshold\")\n", 961 | "print(\"• Activation functions (like sigmoid) add non-linearity\")\n", 962 | "print(\"• Single neurons create linear decision boundaries\")\n", 963 | "print(\"• Neurons can solve simple classification problems\")\n", 964 | "print()\n", 965 | "print(\"🔧 What we built:\")\n", 966 | "print(\"• Complete neuron class from scratch\")\n", 967 | "print(\"• Weather decision system\")\n", 968 | "print(\"• House price predictor\")\n", 969 | "print(\"• Decision boundary visualizations\")\n", 970 | "print(\"• Beautiful Manim animations\")\n", 971 | "print()\n", 972 | "print(\"🚀 Coming up in Part 2:\")\n", 973 | "print(\"• Training neurons to learn from data\")\n", 974 | "print(\"• Gradient descent optimization\")\n", 975 | "print(\"• Loss functions and backpropagation\")\n", 976 | "print(\"• Making neurons smarter through learning\")\n", 977 | "print()\n", 978 | "print(\"🎯 Key insight: A single neuron is like a smart linear classifier\")\n", 979 | "print(\" that can learn patterns in data!\")\n", 980 | "print()\n", 981 | "print(\"Ready for Part 2? Let's teach our neuron to learn! 🚀\")" 982 | ] 983 | }, 984 | { 985 | "cell_type": "markdown", 986 | "id": "8378e7b3", 987 | "metadata": {}, 988 | "source": [ 989 | "## How to Run the Animations\n", 990 | "\n", 991 | "To run these Manim animations:\n", 992 | "\n", 993 | "1. 
**Install Manim** if you haven't already:\n", 994 | " ```bash\n", 995 | " pip install manim\n", 996 | " ```\n", 997 | "\n", 998 | "2. **Run each animation cell** one by one. The `%%manim` magic command will:\n", 999 | " - Generate the animation\n", 1000 | " - Save it as a video file\n", 1001 | " - Display it in the notebook\n", 1002 | "\n", 1003 | "3. **Animation Settings**: Medium quality (`-qm`) gives a good balance of visual quality and rendering speed.\n", 1004 | "\n", 1005 | "**Note**: First-time Manim setup might take a moment to install dependencies.\n", 1006 | "\n", 1007 | "---\n", 1008 | "\n", 1009 | "**🎓 Congratulations!** You've built your first artificial neuron from scratch and understand how it processes information. In the next notebook, we'll teach it to learn from data!" 1010 | ] 1011 | } 1012 | ], 1013 | "metadata": { 1014 | "language_info": { 1015 | "name": "python" 1016 | } 1017 | }, 1018 | "nbformat": 4, 1019 | "nbformat_minor": 5 1020 | } 1021 | --------------------------------------------------------------------------------