# Current LoRA workflow

Most of my public LoRAs are here: https://civitai.com/user/chairfull

|Model|Images Used|Note|
|-----|:---------:|----|
|[Josh Brolin](https://civitai.com/models/33629/josh-brolin)|19|Fewest images used.|
|[Guy Pierce](https://civitai.com/models/23993/guy-pierce)|32|Used CLIP instead of BLIP.|
|[Woody Harrelson](https://civitai.com/models/42418/woody-harrelson)|38|First model I used regularization images on.|
|[Jack Nicholson](https://civitai.com/models/23994/jack-nicholson)|52|Most downloaded [Male].|
|[Kelly Brook](https://civitai.com/models/23990/kelly-brook)|146|Captioned with CLIP Interrogator 2.1 at the `best` setting. For most models I use BLIP.|
|[Anne Hathaway](https://civitai.com/models/26164/anne-hathaway)|147|Most downloaded.|
|[Maitland Ward](https://civitai.com/models/26187/maitland-ward)|325|Most images used.|

## 1) Training Data

Image quality = model quality.

Image quantity = model flexibility.

Image quality is a big part of how well a LoRA turns out, so try to find the highest-quality images you can.

Many images I've used are over 2000x3000. Some are over 8000x5000.
I only crop out other people and text. I don't resize.

A high-quality image **!=** a big image. A high-quality image is one where, if you zoom in, you see details like skin pores, eye flecks, and fabric threads.

If you zoom in and it looks blurry, that image is someone's crummy upscale. Using too many of those images in training will give the model a cartoon airbrush look.
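One way to put a rough number on that "zoom in and check for blur" test is the variance of the Laplacian: sharp images have lots of strong local edges, blurry upscales don't. Below is a minimal stdlib-only sketch operating on a 2D list of grayscale values; for real photos you'd load pixels with a library such as Pillow or OpenCV, and the threshold for "too blurry" is something you'd have to tune yourself.

```python
# Sketch: score image "sharpness" as the variance of the Laplacian.
# `pixels` is a 2D list of 0-255 grayscale values (illustrative only;
# real use would read pixels from an image library).

def laplacian_variance(pixels):
    h, w = len(pixels), len(pixels[0])
    responses = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # 4-neighbor Laplacian: large magnitude = strong local edge.
            lap = (pixels[y - 1][x] + pixels[y + 1][x]
                   + pixels[y][x - 1] + pixels[y][x + 1]
                   - 4 * pixels[y][x])
            responses.append(lap)
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)

# A checkerboard (maximal fine detail) vs. a flat gray patch (no detail):
sharp = [[255 if (x + y) % 2 == 0 else 0 for x in range(8)] for y in range(8)]
flat = [[128] * 8 for _ in range(8)]
```

A crummy upscale of a small photo will score far closer to `flat` than a genuinely detailed image of the same pixel dimensions.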

### Finding images

#### (Optional) Chrome extensions
[Imagus](https://chrome.google.com/webstore/detail/imagus/immpkjjlgappgfkkfieppnmlhakdmaab?hl=en): See the full image by hovering over it or its link, and hit `Ctrl+S` to save it.

[Double Click Image Downloader](https://chrome.google.com/webstore/detail/double-click-image-downlo/bkijmpolkanhdehnlnabfooghjdokakc/?hl=en): For quicker downloading.

[YouTube Screenshot Button](https://chrome.google.com/webstore/detail/youtube-screenshot-button/ehehmcocpegbmagfhajbmeofolnpdfla?hl=en): Make sure to set the video quality to HD, as high as you can set it.
Pause a video and use the `,` and `.` keys to move back/forward one frame, to find the least blurry frame.

[uBlock Origin](https://chrome.google.com/webstore/detail/ublock-origin/cjpalhdlnbpafiamejdnhcphjbkeiagm?hl=en): Nicest ad blocker, imo.

#### Sources
[Yandex](https://yandex.ru/images/search?isize=large&text=dog)
This is my go-to. Better than Google's image search, and it makes it easy to find the same image in different sizes.

1) Search your subject.

2) Sort images by largest.

3) On the right is a size dropdown; try to find the biggest.

Only do this if the largest is actually better quality. It may be a crummy upscale, or the link may not work.

You can also search for better-quality versions by dragging an image into Yandex to do a `Similar image search`.

I'm now using YouTube, and it works quite well. Get a [YouTube screenshot Chrome extension](https://chrome.google.com/webstore/detail/youtube-screenshot-button/ehehmcocpegbmagfhajbmeofolnpdfla?hl=en) and use `,` and `.` to find unique facial angles that aren't blurry. (Be sure to set video quality to the highest possible.)

For a person, try to find **at least** one of each:
- Profile left + profile right.
- 3/4 left + 3/4 right.
- Looking at camera.
- Looking up + looking down.
- (Bonus) Looking up + down at 3/4 left and right.
- (Bonus) All these angles with multiple expressions (happy, neutral, angry).

### Processing images

#### Dealing with duplicates
While looking for images I save as many as look decent, and I sometimes come across higher-quality versions later, so I end up with duplicates.

To remove duplicates I use [Geeqie image viewer](https://www.geeqie.org).

Open Geeqie and go to your folder of images.

Select all of them in the lower-right panel.

`Right Click` and select `Find duplicates`.

Sort on `Similarity` (low, med, high).

If it finds any, get rid of whichever ones seem lower quality by `Right Clicking` and selecting `Delete`.

#### Bulk Cropping
Select the few images that need cropping and drag them into [bulkimagecrop.com](https://bulkimagecrop.com/).

While I try to remove other similar subjects (other males, if training on a male) and text, I don't try to center the subject.
Having the subject dead center in every image could train the model to think you always want that.

I have the subject on the far left, far top, bottom right...

Once you've cropped and downloaded the images as zips, you can mass-unzip with: `unzip \*.zip`

### Zip
Zip the images: `zip ./my_pics -r .`

### Upload
- Upload the zip to your Google Drive.
- `Right Click` it in GDrive and select `Share` or `Get link`.
- Toggle `Make Public`.
- Click `Copy link`.

## 2) Kohya
I used: https://github.com/Linaqruf/kohya-trainer (Dreambooth method, top one.)

I use the Google Colab version as my GPU sucks, but I assume it works the same if you run it on your PC.

I mostly use BLIP to auto-caption the images.
Recently I started upping the word count from `15-75` to `30-100`, and the results have seemed a tinge better.
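The zip step earlier uses the `zip` CLI; the same bundling can be scripted with Python's stdlib `zipfile` if you prefer. A sketch (`training_pics` and `my_pics.zip` are placeholder names):

```python
import zipfile
from pathlib import Path

def zip_images(folder, out_zip):
    """Bundle every file under `folder` into one archive for upload."""
    folder = Path(folder)
    with zipfile.ZipFile(out_zip, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(folder.rglob("*")):
            if path.is_file():
                # Store paths relative to the folder so the archive
                # doesn't embed your local directory structure.
                zf.write(path, path.relative_to(folder))

# zip_images("training_pics", "my_pics.zip")
```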

Leave pretty much all the settings at their defaults, except:
* For the pretrained model download: `Stable-Diffusion-v1-5`.
* For the VAE model download: `stablediffusion.vae.pt`.
* Set `pretrained_model_name_or_path` to `/content/pretrained_model/Stable-Diffusion-v1-5.safetensors`.
* Set `vae` to `/content/vae/stablediffusion.vae.pt`.
* Set `class_token` to `man`.

### `min_snr_gamma`
Setting it to 5 seems to lower the loss noticeably faster.
I've tried it at .00001, .01, .6, 2, 5, and 15, and 5 seemed the best.

### `network_dim`
I had this at 32 for the longest time, but raising it to 64 really improved the likeness.

I don't understand the relation with `network_alpha`. I've set it to 1, 4, and 32 and couldn't see a difference.

# Experiments

Random ideas I'm trying out:

## Regularization images
I've tried all sorts of methods with regularization images and don't find them that great.
Maybe I'm doing something wrong.

For this [Woody Harrelson](https://civitai.com/models/42418/woody-harrelson) model, I used all the photos of males I've trained other models with as regularization data. I didn't crop anything or care about aspect ratio; it seems Kohya will bucket them. Everything worked fine.

## Higher quality through tokens
Tokens, in the captions, are what you **don't** want trained as part of your model, with the exception of the `class_token`.
So for a man, `a man, in a red hat, in a forest` would only extract the `man`, not the `red hat` or the `forest`.
Theoretically this should work for style and image quality too, so for old images I might add `blurry, old image, scan, jpeg artifacts, low quality` in hopes the model will pull a sharper image.
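Applying that idea in bulk could look like the sketch below: append the degradation tokens to every caption file in a folder. This assumes the per-image `.txt` caption convention the trainer reads (the function name and the hand-picked tag string are my own); whether it actually pulls a sharper image is the untested part.

```python
from pathlib import Path

# Hypothetical quality tokens to hang the degradation on, so (in theory)
# the trainer associates blur/artifacts with these words, not the subject.
QUALITY_TAGS = "blurry, old image, scan, jpeg artifacts, low quality"

def tag_old_captions(folder):
    """Append QUALITY_TAGS to every .txt caption file under `folder`."""
    for cap in Path(folder).glob("*.txt"):
        text = cap.read_text().strip().rstrip(",")
        cap.write_text(f"{text}, {QUALITY_TAGS}\n")
```

You'd run this only over the captions of the old/low-quality images, not the whole set.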

## CLIP instead of BLIP
For [this model](https://civitai.com/models/23990/kelly-brook) I captioned 146 images with CLIP Interrogator 2.1 on the `best` setting.

It took a long time, and I don't know that it was worth it. Theoretically it should be more flexible than my other models. Needs more testing.

## Sentiment analyzer for better facial expressiveness
To get more expressiveness out of training data, I'm going to try a [sentiment analyzer](https://huggingface.co/spaces/schibsted/facial_expression_classifier) on a set of photos.

Maybe instead of a single subject, I'll train on a ton of random faces showing emotions at different angles, and caption each like:
```
img1.png: a man on the beach, neutral_90 sad_20 fear_5 happy_3
img2.png: a woman at work, happy_40 neutral_20 sad_9
```
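Such a classifier returns per-emotion scores; turning those into the `label_percent` caption suffix above could be a small formatting step. A sketch (the function, its parameters, and the example scores are made up; real scores would come from the classifier):

```python
def expression_caption(base, scores, top_n=4, threshold=0.01):
    """Format classifier scores (0..1 per emotion) as a caption suffix,
    strongest emotion first, dropping near-zero labels."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    tags = [f"{label}_{round(score * 100)}"
            for label, score in ranked[:top_n] if score >= threshold]
    return f"{base}, {' '.join(tags)}"

# Example with made-up scores:
caption = expression_caption(
    "a man on the beach",
    {"neutral": 0.90, "sad": 0.20, "fear": 0.05, "happy": 0.03},
)
# caption == "a man on the beach, neutral_90 sad_20 fear_5 happy_3"
```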