├── .gitignore
├── LICENSE
├── README.md
├── detect.py
├── local_cmd.sh
├── local_cmd_inference_time_mitigation.sh
├── plot.py
├── prompt
    ├── group_nomem500.txt
    ├── sd1_mem345.txt
    └── sd2_mem219.txt
├── read_results.py
├── refactored_classes
    ├── MemAttn.py
    ├── refactored_attention.py
    ├── refactored_attention_processor.py
    ├── refactored_transformer_2d.py
    ├── refactored_unet_2d_blocks.py
    └── refactored_unet_2d_condition.py
└── text2img.py


/.gitignore:
--------------------------------------------------------------------------------
1 | results/
2 | __pycache__/
3 | rewrite_param.py
4 | *npy
5 | *log
6 | plot/
7 | results.txt
8 | 


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
 1 | MIT License
 2 | 
 3 | Copyright (c) 2024 Jie Ren
 4 | 
 5 | Permission is hereby granted, free of charge, to any person obtaining a copy
 6 | of this software and associated documentation files (the "Software"), to deal
 7 | in the Software without restriction, including without limitation the rights
 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | # Unveiling and Mitigating Memorization in Text-to-image Diffusion Models through Cross Attention
 2 | 
 3 | Official code for [Unveiling and Mitigating Memorization in Text-to-image Diffusion Models through Cross Attention](https://arxiv.org/abs/2403.11052)
 4 | 
 5 | Recent advancements in text-to-image diffusion models have demonstrated their remarkable capability to generate high-quality images from textual prompts. However, increasing research indicates that these models memorize and replicate images from their training data, raising tremendous concerns about potential copyright infringement and privacy risks. In our study, we provide a novel perspective to understand this memorization phenomenon by examining its relationship with cross-attention mechanisms. We reveal that during memorization, the cross-attention tends to focus disproportionately on the embeddings of specific tokens. The diffusion model is overfitted to these token embeddings, memorizing corresponding training images. To elucidate this phenomenon, we further identify and discuss various intrinsic findings of cross-attention that contribute to memorization. Building on these insights, we introduce an innovative approach to detect and mitigate memorization in diffusion models. The advantage of our proposed method is that it will not compromise the speed of either the training or the inference processes in these models while preserving the quality of generated images.
 6 | 
 7 | ## Necessary dependencies
 8 | 
 9 | diffusers, pytorch, transformers, numpy, scikit-learn (used by detect.py), matplotlib (used by plot.py)
10 | 
11 | ## How to use
12 | 
13 | ### Detection
14 | 
15 | The commands for image generation, result merging, detection, and plotting are listed in **local_cmd.sh**. Uncomment the `MY_CMD` line you want to run.
16 | 
17 | ### Mitigation
18 | 
19 | The command for inference-time mitigation is in **local_cmd_inference_time_mitigation.sh**
20 | 
21 | ## Cite
22 | ```
23 | @article{ren2024unveiling,
24 |   title={Unveiling and Mitigating Memorization in Text-to-image Diffusion Models through Cross Attention},
25 |   author={Ren, Jie and Li, Yaxin and Zeng, Shenglai and Xu, Han and Lyu, Lingjuan and Xing, Yue and Tang, Jiliang},
26 |   journal={arXiv preprint arXiv:2403.11052},
27 |   year={2024}
28 | }
29 | ```
30 | 


--------------------------------------------------------------------------------
/detect.py:
--------------------------------------------------------------------------------
  1 | import numpy as np
  2 | from sklearn.metrics import roc_auc_score, roc_curve
  3 | 
  4 | import argparse
  5 | parser = argparse.ArgumentParser(description="Compute AUROC and TPR at fixed FPR from saved attention arrays.")
  6 | parser.add_argument("--mode", type=str, default='entropy', help="Detection mode: 'D' or 'E' (any other value is rejected).")
  7 | parser.add_argument("--mem_input", type=str, default=None, help="Path prefix (without .npy) of the attention array for memorized prompts.")
  8 | parser.add_argument("--nomem_input", type=str, default=None, help="Path prefix (without .npy) of the attention array for non-memorized prompts.")
  9 | args = parser.parse_args()
 10 | 
 11 | if args.mode == "D":
 12 |     use_entropy = True
 13 |     use_delta_padding = True
 14 |     use_special_layer = False
 15 |     step_id = np.arange(40, 50)
 16 | elif args.mode == "E":
 17 |     use_entropy = True
 18 |     use_delta_padding = False
 19 |     use_special_layer = True
 20 |     step_id = np.array([0])
 21 |     layer_id = np.array([3])
 22 | else:
 23 |     raise ValueError("Incorrect mode: expected 'D' or 'E'.")
 24 | 
 25 | mem_name = f'{args.mem_input}.npy'
 26 | nomem_name = f'{args.nomem_input}.npy'
 27 | mem_length_name = f'{args.mem_input}_length.npy'
 28 | nomem_length_name = f'{args.nomem_input}_length.npy'
 29 | 
 30 | if use_special_layer:
 31 |     attn_mem = np.load(mem_name)[:, :, layer_id].mean((2, 3))
 32 |     attn_nomem = np.load(nomem_name)[:, :, layer_id].mean((2, 3))
 33 | else:
 34 |     attn_mem = np.load(mem_name).mean((2, 3))
 35 |     attn_nomem = np.load(nomem_name).mean((2, 3))
 36 | 
 37 | # import pdb ; pdb.set_trace()
 38 | 
 39 | entropy_every_step_mem = (- attn_mem * np.log(attn_mem)).sum(2)
 40 | entropy_every_step_nomem = (- attn_nomem * np.log(attn_nomem)).sum(2)
 41 | 
 42 | score_mem = 0
 43 | score_nomem = 0
 44 | 
 45 | if use_entropy:
 46 |     score_mem += entropy_every_step_mem[step_id].mean(0)
 47 |     score_nomem += entropy_every_step_nomem[step_id].mean(0)
 48 | 
 49 | if use_delta_padding:
 50 | 
 51 |     length_mem = np.load(f'./{mem_length_name}')
 52 |     length_nomem = np.load(f'./{nomem_length_name}')
 53 | 
 54 |     padding_entropy_mem_list = []
 55 |     padding_entropy_nomem_list = []
 56 | 
 57 |     prompt_entropy_mem_list = []
 58 |     prompt_entropy_nomem_list = []
 59 | 
 60 |     for i in range(attn_mem.shape[1]):
 61 |         padding_entropy = attn_mem[:, i, length_mem[i]-1:]
 62 |         padding_entropy = (- padding_entropy * np.log(padding_entropy)).sum(1)
 63 |         padding_entropy_mem_list.append(padding_entropy)
 64 | 
 65 |         prompt_entropy = attn_mem[:, i, :length_mem[i]-1]
 66 |         prompt_entropy = (- prompt_entropy * np.log(prompt_entropy)).sum(1)
 67 |         prompt_entropy_mem_list.append(prompt_entropy)
 68 |         
 69 |     padding_entropy_mem_list = np.stack(padding_entropy_mem_list, axis=1)
 70 |     prompt_entropy_mem_list = np.stack(prompt_entropy_mem_list, axis=1)
 71 | 
 72 |     for i in range(attn_nomem.shape[1]):
 73 |         padding_entropy = attn_nomem[:, i, length_nomem[i]-1:]
 74 |         padding_entropy = (- padding_entropy * np.log(padding_entropy)).sum(1)
 75 |         padding_entropy_nomem_list.append(padding_entropy)
 76 | 
 77 |         prompt_entropy = attn_nomem[:, i, :length_nomem[i]-1]
 78 |         prompt_entropy = (- prompt_entropy * np.log(prompt_entropy)).sum(1)
 79 |         prompt_entropy_nomem_list.append(prompt_entropy)
 80 | 
 81 |     padding_entropy_nomem_list = np.stack(padding_entropy_nomem_list, axis=1)
 82 |     prompt_entropy_nomem_list = np.stack(prompt_entropy_nomem_list, axis=1)
 83 | 
 84 | 
 85 |     score_mem += (padding_entropy_mem_list[step_id].mean(0) - padding_entropy_mem_list[0])
 86 |     score_nomem += (padding_entropy_nomem_list[step_id].mean(0) - padding_entropy_nomem_list[0])
 87 | 
 88 | scores = np.concatenate([score_mem, score_nomem], axis=0)
 89 | labels = np.array([1] * len(score_mem) + [0] * len(score_nomem))
 90 | 
 91 | # import pdb ; pdb.set_trace()
 92 | auroc = roc_auc_score(labels, scores)
 93 | floats_list = [auroc]
 94 | 
 95 | # Calculate ROC curve
 96 | fpr, tpr, thresholds = roc_curve(labels, scores)
 97 | 
 98 | # For each target FPR (1%, 3%, 5%, 10%), read off the TPR at the ROC point with the closest FPR
 99 | thre_fpr = [0.01, 0.03, 0.05, 0.1]
100 | for i in range(len(thre_fpr)):
101 |     target_fpr = thre_fpr[i]
102 |     closest_fpr_index = np.argmin(np.abs(fpr - target_fpr))
103 |     closest_fpr = fpr[closest_fpr_index]
104 |     tpr_at_target_fpr = tpr[closest_fpr_index]
105 | 
106 |     # print(f"True Positive Rate (TPR) at 1% False Positive Rate (FPR): {tpr_at_target_fpr}")
107 |     floats_list.append(tpr_at_target_fpr)
108 | 
109 | # Specify the output file name
110 | output_file_name = 'results.txt'
111 | 
112 | heads = ['AUROC', "TPR@0.01FPR", "TPR@0.03FPR", "TPR@0.05FPR", "TPR@0.1FPR"]
113 | 
114 | # Open the output file in write mode
115 | with open(output_file_name, 'w') as file:
116 |     # Join the list of floats converted to strings with '\t' as the separator
117 |     # and write to the file
118 |     file.write('\t'.join(map(str, heads)) + '\n')
119 |     file.write('\t'.join(map(str, floats_list)))
120 | 
121 | print(f"AUROC and TPR values have been saved to '{output_file_name}'.")
122 | 
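
The TPR-at-FPR lookup in detect.py can be illustrated with a small standalone sketch. It is an assumption-laden variant, not the script's exact logic: detect.py takes the ROC point whose FPR is closest to the target, whereas the hypothetical helper `tpr_at_fpr` below returns the highest TPR among thresholds whose FPR does not exceed the target, which avoids picking a low-TPR point when several ROC points share the same FPR. It uses only the standard library and does not reproduce the point thinning that `sklearn.metrics.roc_curve` applies by default, so values may differ slightly from results.txt.

```python
def tpr_at_fpr(labels, scores, target_fpr):
    """Highest TPR over thresholds whose FPR is <= target_fpr.

    labels: 1 for positive (memorized), 0 for negative.
    scores: higher means more likely positive.
    Hypothetical helper; not part of the repository.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")

    # Sweep thresholds from the highest score downwards.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    best_tpr = 0.0
    tp = fp = 0
    i = 0
    while i < len(order):
        threshold = scores[order[i]]
        # Samples with tied scores cross the threshold together.
        while i < len(order) and scores[order[i]] == threshold:
            if labels[order[i]] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        if fp / n_neg <= target_fpr:
            best_tpr = max(best_tpr, tp / n_pos)
    return best_tpr


# Toy data: three positives and three negatives.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
for target in (0.01, 0.05, 0.1):
    print(target, tpr_at_fpr(toy_labels, toy_scores, target))
```

Whether a higher or lower score should indicate memorization depends on how the score is built, so the sign convention here is only an assumption for the toy data.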


--------------------------------------------------------------------------------
/local_cmd.sh:
--------------------------------------------------------------------------------
 1 | # Generate images and save attention entropy
 2 | # MY_CMD="python text2img.py --prompt prompt/group_nomem500 --output_name debug --save_numpy"
 3 | 
 4 | # Merging results
 5 | # MY_CMD="python read_results.py --mode merge_small --n_mem 500 --n_step 50 --merge_input ./results/local_group_nomem500_detect_small_seed0 --merge_output ./results/local_group_nomem500_detect_small_seed0"
 6 | 
 7 | # Detection results
 8 | # MY_CMD="python detect.py --mode E --mem_input ./results/local_sd1_mem_not_n_detect_small_seed0 --nomem_input ./results/local_group_nomem500_detect_small_seed0"
 9 | 
10 | # Plot results
11 | # MY_CMD="python plot.py --mem_input ./results/local_sd1_mem_not_n_detect_small_seed0 --nomem_input ./results/local_group_nomem500_detect_small_seed0"
12 | 
13 | echo "$MY_CMD"
14 | echo "${MY_CMD}" >> local_history.log
15 | CUDA_VISIBLE_DEVICES='4' $MY_CMD # HF_HOME=$HF_CACHE_DIR TRANSFORMERS_CACHE=$HF_CACHE_DIR
16 | 


--------------------------------------------------------------------------------
/local_cmd_inference_time_mitigation.sh:
--------------------------------------------------------------------------------
1 | # Generate images with inference-time mitigation enabled
2 | MY_CMD="python text2img.py --prompt prompt/group_nomem500 --float16 --output_name debug --c1 1.25 --cross_attn_mask --miti_mem"
3 | 
4 | echo "$MY_CMD"
5 | echo "${MY_CMD}" >> local_history.log
6 | CUDA_VISIBLE_DEVICES='4' $MY_CMD # HF_HOME=$HF_CACHE_DIR TRANSFORMERS_CACHE=$HF_CACHE_DIR
7 | 


--------------------------------------------------------------------------------
/plot.py:
--------------------------------------------------------------------------------
 1 | import numpy as np
 2 | import matplotlib.pyplot as plt
 3 | 
 4 | import argparse
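 5 | parser = argparse.ArgumentParser(description="Plot mean attention entropy per step for memorized and non-memorized prompts.")
 6 | parser.add_argument("--mem_input", type=str, default=None, help="Path prefix (without .npy) of the attention array for memorized prompts.")
 7 | parser.add_argument("--nomem_input", type=str, default=None, help="Path prefix (without .npy) of the attention array for non-memorized prompts.")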
 8 | args = parser.parse_args()
 9 | 
10 | save_name = 'test'
11 | 
12 | mem_name = f'{args.mem_input}.npy'
13 | nomem_name = f'{args.nomem_input}.npy'
14 | 
15 | attn_mem = np.load(mem_name).mean((2,3))
16 | attn_nomem = np.load(nomem_name).mean((2,3))
17 | 
18 | entropy_every_step_mem = (- attn_mem * np.log(attn_mem)).sum(2)
19 | entropy_every_step_nomem = (- attn_nomem * np.log(attn_nomem)).sum(2)
20 | vector1 = entropy_every_step_mem.mean(1)
21 | vector2 = entropy_every_step_nomem.mean(1)
22 | 
23 | x = np.arange(0, len(vector1))[::-1]
24 | 
25 | plt.figure(figsize=(6.7, 5))
26 | 
27 | # Plotting the vectors
28 | plt.plot(x, vector1, label='Memorization')
29 | plt.plot(x, vector2, label='Non-memorization')
30 | 
31 | plt.xticks([0, 49 / 2, 49], ['0', '$T/2$', '$T$'], fontsize=22)
32 | plt.yticks(fontsize=16)
33 | 
34 | plt.ylabel('Entropy', fontsize=22)
35 | plt.legend(loc='upper left', fontsize=20)
36 | plt.ylim(0.4, 1.3)
37 | plt.tight_layout()
38 | plt.savefig(f"./plot/{save_name}.png")
39 | print(f"saved at ./plot/{save_name}.png")
40 | 
41 | plt.close()
42 | 
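
Both plot.py and detect.py compute an entropy over the token axis of the averaged attention and then average it over prompts. The sketch below shows that computation on plain Python lists, under the assumption (inferred from the indexing in the scripts, not stated in the repository) that the averaged array is laid out as step x prompt x token. The helper names and the `eps` guard are hypothetical additions: the guard keeps a zero attention weight from producing NaN, which `- attn * np.log(attn)` would return for an exact zero.

```python
import math


def attention_entropy(probs, eps=1e-12):
    """Shannon entropy (natural log) of one attention distribution over tokens.

    eps is a hypothetical guard so that a weight of exactly 0 contributes 0
    instead of 0 * log(0).
    """
    return -sum(p * math.log(max(p, eps)) for p in probs)


def mean_entropy_per_step(attn):
    """attn[step][prompt] is a list of token weights.

    Returns one value per step: the entropy averaged over prompts.
    """
    return [
        sum(attention_entropy(tokens) for tokens in prompts) / len(prompts)
        for prompts in attn
    ]


# Toy input: 2 steps, 2 prompts, 4 tokens each.
toy_attn = [
    [[0.25, 0.25, 0.25, 0.25], [0.25, 0.25, 0.25, 0.25]],  # spread-out attention
    [[1.0, 0.0, 0.0, 0.0], [0.5, 0.5, 0.0, 0.0]],          # concentrated attention
]
curve = mean_entropy_per_step(toy_attn)
print(curve)
```

Attention concentrated on a few tokens gives lower entropy than attention spread evenly, which is the quantity the plot compares between the two prompt groups.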


--------------------------------------------------------------------------------
/prompt/group_nomem500.txt:
--------------------------------------------------------------------------------
  1 | A serene beach at sunrise
  2 | Snow-covered rooftops in a small town
  3 | A bustling urban skate park
  4 | A lonely boat on a calm lake
  5 | A vibrant street art mural
  6 | A tranquil bamboo forest
  7 | An ancient temple in the jungle
  8 | A modern city skyline at dusk
  9 | A frozen waterfall in winter
 10 | A hot air balloon over a patchwork of fields
 11 | An old windmill in a field of tulips
 12 | A lively carnival at night
 13 | A hidden cave with glowing crystals
 14 | A rustic mountain cabin with smoke from the chimney
 15 | A lush vineyard in the countryside
 16 | A misty morning in a mountain valley
 17 | A colorful bird in a tropical rainforest
 18 | A cozy bookstore in a busy city
 19 | An astronaut floating in space near a space station
 20 | A dramatic cliff overlooking the ocean
 21 | A peaceful monastery in the mountains
 22 | A busy sushi bar in Tokyo
 23 | A charming cottage surrounded by wildflowers
 24 | A surreal landscape with floating islands
 25 | A scenic overlook with a view of a winding river
 26 | A bustling farmer's market on a sunny morning
 27 | A tranquil koi pond with a small bridge
 28 | A lively jazz club in the evening
 29 | A picturesque lighthouse at sunset
 30 | A spooky abandoned house at night
 31 | A colorful carnival parade
 32 | A serene yoga studio with natural light
 33 | A vibrant coral reef with diverse marine life
 34 | A bustling train station in the evening
 35 | A romantic dinner setup on a beach
 36 | A foggy London street in early morning
 37 | A whimsical candy shop
 38 | A busy bee garden in full bloom
 39 | A rooftop terrace with city views
 40 | A wildflower meadow with a rainbow
 41 | A medieval knight in shining armor
 42 | An underwater scene with a sunken ship
 43 | A spooky graveyard in the fog
 44 | A bustling bazaar in Istanbul
 45 | A cozy winter cabin with a roaring fire
 46 | A magnificent castle surrounded by a moat
 47 | A lively Mardi Gras celebration
 48 | A tranquil morning in a Zen garden
 49 | A traditional tea ceremony in progress
 50 | A small boat in a vast ocean
 51 | A high-tech futuristic city at night
 52 | A quaint European village street
 53 | A surfer riding a big wave
 54 | A vibrant sunset over a peaceful lake
 55 | An old-fashioned steam train chugging through the countryside
 56 | A detailed close-up of a dragonfly on a leaf
 57 | A picturesque vine-covered cottage
 58 | A bustling night market in Asia
 59 | A mountain biker on a rugged trail
 60 | A scenic hot air balloon festival
 61 | A beautiful butterfly garden
 62 | A romantic gondola ride in Venice
 63 | A mysterious forest path in the moonlight
 64 | A vibrant street in Havana
 65 | A panoramic view of the Grand Canyon
 66 | An ancient stone bridge in a misty forest
 67 | A lively Oktoberfest celebration
 68 | A majestic eagle soaring over mountains
 69 | A colorful tulip field in Holland
 70 | A serene alpine lake with crystal-clear water
 71 | A cozy alpine ski lodge
 72 | A bustling city street in New York
 73 | A peaceful lavender field at sunset
 74 | An artist's palette with vibrant paint colors
 75 | A lively street dance performance
 76 | A panoramic view of the Himalayas
 77 | A vibrant fish market in Japan
 78 | A traditional Chinese dragon dance
 79 | A rustic old barn in a field
 80 | A tranquil pond with lily pads and lotus flowers
 81 | A traditional Mongolian yurt in the steppe
 82 | A breathtaking view from a mountain peak
 83 | A luxurious yacht on the open sea
 84 | A lively square in Marrakech
 85 | A serene chapel in the countryside
 86 | A vibrant festival in Rio de Janeiro
 87 | An old steam locomotive on a mountain pass
 88 | A beautiful peacock displaying its feathers
 89 | A bustling harbor with fishing boats
 90 | A charming old streetcar in San Francisco
 91 | A scenic vineyard in Tuscany
 92 | A traditional Scottish bagpiper in full dress
 93 | A dramatic thunderstorm over the plains
 94 | A traditional Indian wedding procession
 95 | A beautiful Japanese cherry blossom festival
 96 | A peaceful riverside picnic scene
 97 | A panoramic view of Paris from the Eiffel Tower
 98 | A majestic lion resting in the savannah
 99 | A colorful hot air balloon race
100 | A vibrant street festival in New Orleans
101 | A serene beach at sunset, with the sky painted in shades of pink and orange.
102 | An astronaut floating in space, with Earth visible in the background.
103 | A mystical forest shrouded in fog, with ancient trees and glowing fireflies.
104 | A cyberpunk street market bustling with activity under neon lights.
105 | A vintage car parked on a deserted highway, with mountains in the distance.
106 | A snowy village at night, with lights twinkling in cozy cottages.
107 | An underwater city with coral buildings and schools of colorful fish.
108 | A desert oasis at dawn, with palm trees and a clear, reflective pool.
109 | A steampunk workshop filled with intricate machinery and steam engines.
110 | A panoramic view of the Grand Canyon during a thunderstorm.
111 | A fantasy castle floating in the sky, connected to the ground by a vine bridge.
112 | A detailed map of a fictional world, with continents, oceans, and landmarks.
113 | A close-up of a bee pollinating a vibrant flower.
114 | A bustling medieval market square, with vendors, performers, and townsfolk.
115 | A post-apocalyptic city overrun by nature, with skyscrapers covered in vines.
116 | A cozy library with floor-to-ceiling bookshelves and a roaring fireplace.
117 | A jazz club in the 1920s, with musicians on stage and dancers on the floor.
118 | A tranquil Japanese Zen garden with a stone path leading to a tea house.
119 | An abstract painting using only geometric shapes and primary colors.
120 | A group of warriors riding dragons above a fiery volcano.
121 | A northern lights display over a snowy, untouched landscape.
122 | A high-fashion photoshoot in a futuristic metallic wardrobe.
123 | A dark alleyway in a rain-soaked city, with a mysterious figure in the shadows.
124 | A sunken pirate ship surrounded by treasure and aquatic life.
125 | A vibrant street art mural capturing the essence of urban life.
126 | A Roman coliseum filled with spectators, mid-gladiator battle.
127 | A magical library with books that float off the shelves.
128 | A space station orbiting a distant exoplanet.
129 | A Victorian ballroom with guests in period attire dancing under chandeliers.
130 | A peaceful monastery in the Himalayas, with monks meditating.
131 | An old western town at high noon, with a duel about to take place.
132 | A high-speed chase on a futuristic highway with hovercars.
133 | A secret garden hidden behind an ivy-covered wall, with a magical fountain.
134 | A detailed portrait of a queen in a lavish gown adorned with jewels.
135 | An ancient temple at sunrise, with monkeys roaming the ruins.
136 | A vibrant coral reef teeming with marine life and a sunken statue.
137 | A haunted house on a hill, with ghosts visible in the windows.
138 | A fantasy scene of a wizard casting a spell in an enchanted forest.
139 | A traditional Chinese garden with a pagoda and a dragon-shaped bridge.
140 | A futuristic city with green technology and vertical gardens on buildings.
141 | A close-up of a clockwork mechanism, with gears and springs in detail.
142 | A carnival at night, with a Ferris wheel and cotton candy stands.
143 | A battlefield from World War II, with tanks and soldiers in the fog.
144 | A microchip city, with electronic circuits and components as buildings.
145 | A superhero flying over a city, saving people from a disaster.
146 | A romantic scene under the Eiffel Tower at night, with a couple holding hands.
147 | An ancient library with scrolls and tomes, lit by candlelight.
148 | A fashion runway show with avant-garde designs and dramatic lighting.
149 | A wild west shootout in front of a saloon, with cowboys and horses.
150 | A moonlit path through a forest leading to a mysterious castle.
151 | A cybernetic organism, half-human, half-machine, in a dystopian future.
152 | A bird's-eye view of a sprawling metropolis at dawn.
153 | A fantasy scene with a knight fighting a dragon to protect a village.
154 | An abandoned amusement park taken over by nature, with a rusty Ferris wheel.
155 | A close-up of a spider web covered in morning dew.
156 | A bustling spaceport with aliens and spacecraft from different galaxies.
157 | A silent movie scene with exaggerated expressions and title cards.
158 | A Renaissance fair with jesters, knights, and artisans.
159 | A dystopian cityscape with towering billboards and a dense fog.
160 | A serene pond with lily pads and a small wooden bridge.
161 | A high-speed train cutting through a futuristic cityscape.
162 | A vintage detective's office, with a typewriter and a smoking pipe.
163 | A majestic waterfall hidden in a tropical rainforest.
164 | A street scene in Paris with cafes, artists, and the Sacré-Cœur in the distance.
165 | A secret base on Mars, with astronauts and rovers exploring the landscape.
166 | An ice cave with glowing blue ice and stalactites.
167 | A gothic cathedral with stained glass windows and flying buttresses.
168 | A wild safari scene with elephants, lions, and giraffes.
169 | A cyberpunk hacker's lair, with screens displaying code and futuristic gadgets.
170 | A scene from a noir film, with a detective in a trench coat under a streetlamp.
171 | A fairytale cottage in the woods, with a thatched roof and a smoking chimney.
172 | A luxurious yacht sailing in the Caribbean with a sunset in the background.
173 | A bustling Asian night market, with lanterns and street food stalls.
174 | A fantasy scene of mermaids swimming near an underwater palace.
175 | A surreal landscape with a checkerboard ground and floating clocks.
176 | An old-fashioned barbershop with red and white stripes and antique chairs.
177 | A dramatic opera scene with performers in elaborate costumes on stage.
178 | A Victorian era scientist's laboratory, with bubbling potions and electrical experiments.
179 | A Viking ship sailing through rough seas, with warriors ready for battle.
180 | A minimalist interior design scene with sleek furniture and natural light.
181 | A cybernetic eye with intricate details and futuristic enhancements.
182 | A dramatic cliffside castle overlooking a stormy sea.
183 | A futuristic medical facility with advanced technology and holographic displays.
184 | A serene yoga retreat in Bali, with open pavilions and lush surroundings.
185 | A bustling construction site with cranes and workers building a skyscraper.
186 | A close-up of a hummingbird hovering near a bright flower.
187 | A snowy mountain range with climbers ascending a steep peak.
188 | A pirate map with a path leading to buried treasure, marked with an X.
189 | A detective solving a mystery in a rain-soaked alleyway.
190 | A panoramic view of a city skyline at dusk, with lights starting to twinkle.
191 | A luxurious Venetian masquerade ball, with guests in ornate masks and gowns.
192 | A sunlit forest clearing with deer grazing and birds chirping.
193 | A science fiction scene of a colony on a distant planet, with domed habitats.
194 | A magical potion shop with shelves of colorful bottles and glowing ingredients.
195 | A historical reenactment of a Civil War battle, with soldiers and cannons.
196 | A serene bamboo forest in Japan, with a path winding through the tall stalks.
197 | A dystopian future with robots patrolling deserted city streets.
198 | A classic American diner at sunset, with vintage cars parked outside.
199 | A fantasy scene of a giant tree with houses and bridges built into its branches.
200 | A bustling airport terminal with travelers and planes visible through large windows.
201 | An elegant tea party in a garden, with delicate china and floral arrangements.
202 | A close-up of a vintage typewriter with a blank sheet of paper.
203 | A panoramic view of an ancient Roman city, with the Colosseum and forums.
204 | A neon-lit arcade with rows of retro video game machines.
205 | A fantasy scene of a dark lord's throne room, with torches and a red carpet.
206 | A serene koi pond with a wooden footbridge and water lilies.
207 | A vintage train station with steam locomotives and bustling passengers.
208 | A high-angle shot of a crowded music festival, with the stage in the distance.
209 | A fantasy scene of a battle between elves and orcs in a dense forest.
210 | A vibrant farmers market with stalls of fresh produce and handmade goods.
211 | A snowy log cabin in the mountains, with smoke coming from the chimney.
212 | A luxury cruise ship sailing in the Mediterranean, with the coast in the background.
213 | A close-up of a vintage camera with film reels and a leather case.
214 | A bustling street in Tokyo at night, with neon signs and busy pedestrians.
215 | A fantasy scene of a unicorn grazing in a magical meadow at sunrise.
216 | A panoramic view of the Himalayas from a mountaintop, with clouds below.
217 | A dramatic courtroom scene with a judge, jury, and attorneys in action.
218 | A traditional Indian wedding ceremony, with vibrant colors and intricate designs.
219 | A mysterious ancient ruin, with hieroglyphs and torches illuminating the walls.
220 | A high-tech command center with screens displaying global data and maps.
221 | A scenic vineyard in Tuscany, with rolling hills and a setting sun.
222 | A detailed illustration of a human heart, with arteries and veins.
223 | A fantasy scene of a fairy glen, with mushrooms and sparkling lights.
224 | A dramatic volcanic eruption, with lava flowing and ash clouding the sky.
225 | A tranquil scene of a monk meditating by a mountain stream.
226 | A bustling kitchen in a Michelin-starred restaurant, with chefs at work.
227 | A panoramic view of a modern city park, with skyscrapers in the background.
228 | A close-up of a detailed clock face, with Roman numerals and ornate hands.
229 | A fantasy scene of a sorcerer casting spells in an ancient library.
230 | A vibrant mural depicting the history of a city, with iconic buildings and figures.
231 | A luxury penthouse apartment with floor-to-ceiling windows and city views.
232 | A bustling fish market, with vendors selling fresh seafood and ice.
233 | A dramatic scene of a spaceship entering a wormhole in deep space.
234 | A panoramic view of a desert at sunset, with sand dunes and a camel caravan.
235 | A close-up of a dew-covered spider web in the early morning light.
236 | A fantasy scene of a crystal cave, with glowing gems and a hidden treasure.
237 | A vibrant street parade, with dancers, musicians, and colorful floats.
238 | A dramatic scene of a knight rescuing a princess from a tower.
239 | A modern office space with open-plan desks and glass walls.
240 | A panoramic view of a medieval city, with stone walls and narrow streets.
241 | A close-up of a dragon's eye, with scales and fiery depth.
242 | A tranquil Japanese tea ceremony, with traditional attire and utensils.
243 | A bustling open-air market in Morocco, with spices, textiles, and pottery.
244 | A dramatic thunderstorm over a prairie, with lightning striking the ground.
245 | A fantasy scene of a witch's cottage in the woods, with bubbling cauldrons.
246 | A detailed illustration of a vintage steam engine, with pistons and wheels.
247 | A serene scene of a lighthouse at dawn, with waves crashing against the rocks.
248 | A luxurious spa retreat, with hot springs and a mountain view.
249 | A dramatic scene of a pirate ship battle, with cannons and sails.
250 | A vibrant graffiti wall, with urban art and messages.
251 | A scenic view of a mountain pass, with a winding road and autumn colors.
252 | A fantasy scene of an angel descending from the heavens, with wings spread.
253 | A bustling café in Amsterdam, with bikes parked outside and canals nearby.
254 | A panoramic view of an ancient Greek city, with temples and agoras.
255 | A vibrant butterfly garden, with flowers and fluttering wings.
256 | A dramatic scene of a samurai duel at sunrise, with cherry blossoms.
257 | A modern train speeding through a European countryside.
258 | A close-up of a vintage vinyl record player, with a spinning record.
259 | A fantasy scene of a giant protecting a village from invaders.
260 | A serene scene of a paddleboarder on a calm lake at sunrise.
261 | A bustling newsroom, with journalists and screens displaying breaking news.
262 | A panoramic view of a fjord in Norway, with steep cliffs and a clear sky.
263 | A detailed illustration of a robot, with gears and circuits.
264 | A vibrant scene of a Mardi Gras parade, with costumes and beads.
265 | A serene scene of a hot air balloon floating over a lavender field.
266 | A fantasy scene of a mage casting a spell in a crystal chamber.
267 | A bustling street in Havana, with classic cars and colorful buildings.
268 | A panoramic view of a snowy ski resort, with skiers and chalets.
269 | A close-up of a jeweled crown, with intricate designs and precious stones.
270 | A vibrant scene of a salsa dance in a Latin American club.
271 | A detailed illustration of a spacecraft landing on an alien planet.
272 | A tranquil scene of a yoga class on a beach at sunrise.
273 | A bustling bazaar in Istanbul, with lanterns, rugs, and spices.
274 | A dramatic scene of a superhero saving a city from disaster.
275 | A serene scene of a cottage garden, with flowers and a picket fence.
276 | A vibrant scene of a carnival in Rio, with dancers and costumes.
277 | A detailed illustration of a medieval manuscript, with illuminations.
278 | A tranquil scene of a mountain lodge, with snow and a fireplace.
279 | A vibrant scene of a street festival in India, with colors and music.
280 | A detailed illustration of a knight in armor, with a sword and shield.
281 | A tranquil scene of a vine-covered pergola, with a garden path.
282 | A vibrant scene of a jazz band playing in a New Orleans club.
283 | A detailed illustration of a fantasy creature, with wings and scales.
284 | A tranquil scene of a sailboat on a calm sea at sunset.
285 | A vibrant scene of a food festival, with stalls and tastings.
286 | A detailed illustration of a vintage motorcycle, with chrome and leather.
287 | A tranquil scene of a forest cabin, with snow and a wood stove.
288 | A vibrant scene of a Chinese New Year parade, with dragons and lanterns.
289 | A detailed illustration of an ancient Egyptian tomb, with hieroglyphs.
290 | A tranquil scene of a desert campfire, with stars and silhouettes.
291 | A vibrant scene of a street art festival, with murals and artists.
292 | A detailed illustration of a futuristic city, with skyscrapers and drones.
293 | A tranquil scene of a monastery in the mountains, with monks and prayer flags.
294 | A vibrant scene of a Flamenco dance in Spain, with castanets and dresses.
295 | A detailed illustration of a fantasy castle, with towers and moats.
296 | A tranquil scene of a river rafting adventure, with rapids and paddles.
297 | A vibrant scene of a Tokyo crosswalk, with crowds and billboards.
298 | A detailed illustration of a vintage airplane, with propellers and wings.
299 | A tranquil scene of a Mediterranean villa, with olive trees and a pool.
300 | A vibrant scene of a night market in Thailand, with lanterns and food.
301 | A snow-covered village in the Swiss Alps.
302 | A bustling medieval marketplace.
303 | A futuristic cityscape at night with flying cars.
304 | A tranquil koi pond in a Japanese garden.
305 | A grand Venetian masquerade ball.
306 | An underwater scene with a coral reef and colorful fish.
307 | A haunting abandoned amusement park.
308 | A cozy cabin in a snowy forest.
309 | A magnificent castle on a hill during sunset.
310 | A surreal landscape with floating islands.
311 | An ancient temple hidden in the jungle.
312 | A picturesque Tuscan vineyard at sunrise.
313 | A lively carnival in Rio de Janeiro.
314 | A tranquil Zen garden with a stone path.
315 | A bustling New York street in the 1920s.
316 | A panoramic view of Athens with the Acropolis.
317 | A vibrant coral reef with diverse marine life.
318 | A detailed steampunk city with gears and pipes.
319 | A serene lavender field in Provence, France.
320 | A classic 1950s diner on Route 66.
321 | A dramatic Viking ship sailing rough seas.
322 | A romantic Parisian cafe in the rain.
323 | A wildflower meadow with a rainbow overhead.
324 | An old library filled with ancient books.
325 | A whimsical fairy tale forest with a castle.
326 | A dramatic desert landscape with sand dunes.
327 | A lively Oktoberfest celebration in Munich.
328 | A mysterious lighthouse during a storm.
329 | A colorful Moroccan bazaar.
330 | An elegant Victorian garden party.
331 | A vibrant Bollywood dance scene.
332 | A serene moonlit lake with swans.
333 | A bustling Tokyo street at night.
334 | A historic London street in the fog.
335 | A traditional Chinese tea ceremony.
336 | A dramatic cliff overlooking the ocean.
337 | A lively African village with traditional huts.
338 | A panoramic view of Rio de Janeiro from Christ the Redeemer.
339 | A frozen waterfall in a winter landscape.
340 | A colorful Day of the Dead celebration in Mexico.
341 | A picturesque Greek village on the coast.
342 | A spooky haunted house on a hill.
343 | A vibrant tulip field in the Netherlands.
344 | A traditional Native American powwow.
345 | A romantic gondola ride in Venice.
346 | A dramatic thunderstorm over the plains.
347 | A bustling fish market in Japan.
348 | A scenic view of the Grand Canyon.
349 | A lively Mardi Gras parade in New Orleans.
350 | A peaceful Buddhist temple in the mountains.
351 | An ancient Egyptian pyramid at sunset.
352 | A medieval knight's tournament.
353 | A futuristic space station orbiting a planet.
354 | A panoramic view of Hong Kong's skyline at night.
355 | A traditional Irish village with thatched cottages.
356 | A scenic African safari with elephants and giraffes.
357 | A colorful Indian Holi festival.
358 | A peaceful Alpine village in winter.
359 | A vibrant sunset over the Sahara Desert.
360 | A charming English countryside in spring.
361 | A bustling South American street market.
362 | A magical winter scene with the Northern Lights.
363 | A historic Roman Colosseum in ruins.
364 | A panoramic view of Sydney Harbour.
365 | A vibrant Caribbean beach party.
366 | A dramatic mountain landscape in the Himalayas.
367 | A traditional Japanese samurai duel.
368 | A serene Scandinavian fjord at dawn.
369 | A lively Spanish flamenco dance.
370 | A breathtaking view of the Great Wall of China.
371 | A peaceful Amish countryside with horse-drawn buggies.
372 | A traditional Korean village with hanok houses.
373 | A colorful Dutch windmill scene.
374 | A lively Hawaiian luau on the beach.
375 | A historic American Civil War battlefield.
376 | A panoramic view of Istanbul with the Blue Mosque.
377 | A bustling Southeast Asian floating market.
378 | A serene English rose garden.
379 | A traditional Mongolian yurt in the steppes.
380 | A scenic view of Victoria Falls.
381 | A charming French chateau with vineyards.
382 | A festive Chinese New Year celebration.
383 | A dramatic view of Mount Everest.
384 | A historic pirate ship sailing the Caribbean.
385 | A picturesque Norwegian village with aurora borealis.
386 | A lively Middle Eastern souk.
387 | A scenic Tuscan hilltown at dusk.
388 | A traditional Maasai village in Kenya.
389 | A panoramic view of Cape Town with Table Mountain.
390 | A vibrant Sicilian street market.
391 | A tranquil Japanese Zen rock garden.
392 | A festive Bavarian beer garden.
393 | A dramatic Icelandic waterfall.
394 | A colorful Caribbean carnival.
395 | A historic Western frontier town.
396 | A picturesque Swiss chalet in the mountains.
397 | A vibrant street scene in Havana, Cuba.
398 | A scenic view of Machu Picchu.
399 | A lively Filipino festival with traditional dances.
400 | A peaceful English cottage garden.
401 | A scenic vineyard in Napa Valley.
402 | A dramatic Alaskan glacier.
403 | A festive Russian winter scene with St. Basil's Cathedral.
404 | A panoramic view of Jerusalem's Old City.
405 | A vibrant Peruvian market in the Andes.
406 | A serene Thai beach with longtail boats.
407 | A festive Italian piazza at night.
408 | A traditional Russian village in winter.
409 | A historic American Gold Rush town.
410 | A scenic New Zealand landscape with sheep.
411 | A vibrant Ethiopian market scene.
412 | A panoramic view of Edinburgh Castle.
413 | A bustling Moroccan medina.
414 | A serene Swiss lake with mountains in the background.
415 | A festive Brazilian samba parade.
416 | A dramatic view of the Acropolis in Athens.
417 | A charming Austrian village in the Alps.
418 | A historic American Civil War reenactment.
419 | A scenic Canadian Rockies landscape.
420 | A colorful Filipino jeepney on a busy street.
421 | A tranquil Balinese temple with lotus ponds.
422 | A lively Argentine tango in Buenos Aires.
423 | A picturesque German castle in the Black Forest.
424 | A scenic Australian Outback with kangaroos.
425 | A festive Greek island with white houses and blue domes.
426 | A panoramic view of the Las Vegas Strip at night.
427 | A traditional Thai floating market.
428 | A scenic view of the Amalfi Coast.
429 | A colorful Colombian street mural.
430 | A historic American colonial town.
431 | A serene Cambodian temple at dawn.
432 | A festive Polish folk dance.
433 | A panoramic view of Dubai's skyline.
434 | A charming Belgian chocolate shop.
435 | A historic British Royal Guard ceremony.
436 | A scenic view of the Scottish Highlands.
437 | A festive South African tribal dance.
438 | A picturesque Danish windmill in the countryside.
439 | A colorful Brazilian favela.
440 | A tranquil Swedish archipelago.
441 | A scenic view of the Patagonia mountains.
442 | A festive Hungarian folk festival.
443 | A panoramic view of the Singapore skyline.
444 | A charming Czech village in the fall.
445 | A historic American jazz club in the 1920s.
446 | A scenic Icelandic geothermal spring.
447 | A festive Austrian Christmas market.
448 | A panoramic view of the San Francisco Bay.
449 | A colorful Turkish bazaar.
450 | A tranquil New Zealand fjord.
451 | A festive Indian wedding procession.
452 | A scenic view of the Rocky Mountains in Colorado.
453 | A colorful Jamaican beach scene.
454 | A festive German Oktoberfest.
455 | A panoramic view of the Miami skyline.
456 | A scenic Italian vineyard in Tuscany.
457 | A colorful Mexican Dia de Muertos celebration.
458 | A tranquil Greek monastery on a cliff.
459 | A festive Irish St. Patrick's Day parade.
460 | A scenic view of the Oregon coast.
461 | A colorful Puerto Rican street festival.
462 | A festive Scottish Highland Games.
463 | A panoramic view of the Hong Kong harbor.
464 | A scenic French lavender field at sunset.
465 | A colorful Guatemalan market.
466 | A festive Canadian hockey game.
467 | A panoramic view of the Berlin skyline.
468 | A scenic Spanish vineyard in Rioja.
469 | A colorful Dominican carnival.
470 | A festive Swedish Midsummer celebration.
471 | A panoramic view of the Moscow skyline.
472 | A scenic Dutch tulip field in spring.
473 | A colorful Kenyan tribal ceremony.
474 | A festive American Fourth of July parade.
475 | A panoramic view of the Rome skyline.
476 | A scenic view of the English Lake District.
477 | A colorful Haitian street art scene.
478 | A festive Czech Easter market.
479 | A panoramic view of the Vienna skyline.
480 | A scenic view of a vineyard in Bordeaux.
481 | A colorful Bolivian folk dance.
482 | A festive Belgian beer festival.
483 | A panoramic view of the Kuala Lumpur skyline.
484 | A scenic Hungarian castle on the Danube.
485 | A colorful Ecuadorian indigenous market.
486 | A festive Dutch King's Day celebration.
487 | A panoramic view of the Athens skyline.
488 | A scenic view of a lavender field in England.
489 | A colorful Nepalese temple ceremony.
490 | A festive Australian surf competition.
491 | A panoramic view of the Prague skyline.
492 | A scenic vineyard in the Chilean Andes.
493 | A colorful Salvadoran street mural.
494 | A festive Norwegian Constitution Day parade.
495 | A panoramic view of the Buenos Aires skyline.
496 | A scenic view of the Swiss Matterhorn.
497 | A colorful Indonesian temple festival.
498 | A festive Danish Viking festival.
499 | A panoramic view of the Shanghai skyline.
500 | A scenic view of a vineyard in the South of France.


--------------------------------------------------------------------------------
/prompt/sd1_mem345.txt:
--------------------------------------------------------------------------------
  1 | The No Limits Business Woman Podcast
  2 | Full body U-Zip main opening - Full body U-Zip main opening on front of bag for easy unloading when you get to camp
  3 | Mothers influence on her young hippo
  4 | "Watch: Passion Pit's New Video, ""Lifted Up (1985)"""
  5 | Talks on the Precepts and Buddhist Ethics
  6 | Sony Won't Release <i>The Interview</i> on VOD
  7 | <i>I Am Chris Farley</i> Documentary Releases First Trailer
  8 | <em>South Park: The Stick of Truth</em> Review (Multi-Platform)
  9 | <i>X-Men: Days of Future Past</i> Director Joins Vince Gilligan's <i>Battle Creek</i> Pilot
 10 | Insights with Laura Powers
 11 | As Punisher Joins <i>Daredevil</i> Season Two, Who Will the New Villain Be?
 12 | Aretha Franklin Files $10 Million Suit Over Patti LaBelle Fight Story On Satire Website
 13 | Hawkgirl Cast in <i>Arrow</i>/<i>Flash</i> Spinoff Series For The CW
 14 | Here's What You Need to Know About St. Vincent's Apple Music Radio Show
 15 | Daniel Radcliffe Dons a Beard and Saggy Jeans in Trailer for BBC GTA Miniseries <i>The Gamechangers</i>
 16 | The Happy Scientist
 17 | The Health Mastery Café with Dr. Dave
 18 | """Listen to The Dead Weather's New Song, """"Buzzkill(er)"""""""
 19 | There's a <i>Mrs. Doubtfire</i> Sequel in the Works
 20 | DC All Stars podcast
 21 | 35 Possible Titles for the <i>Mrs. Doubtfire</i> Sequel
 22 | "Listen to The Dead Weather's New Song, ""Buzzkill(er)"""
 23 | Rambo 5 und Rocky Spin-Off - Sylvester Stallone gibt Updates
 24 | Anna Kendrick is Writing a Collection of Funny, Personal Essays
 25 | Breaking Down the $12 in Your Six-Pack of Craft Beer
 26 | <i>It's Always Sunny</i> Gang Will Turn Your Life Around with Self-Help Book
 27 | Renegade RSS Laptop Backpack - View 91
 28 | Passion. Podcast. Profit.
 29 | Prince Reunites With Warner Brothers, Plans New Album
 30 | Here's Who Ian McShane May Be Playing in <i>Game of Thrones</i> Season Six
 31 | Future Steve Carell Movie Set In North Korea Canceled By New Regency
 32 | Gary Ryan Moving Beyond Being Good®
 33 | Will Ferrell, John C. Reilly in Talks for <i>Border Guards</i>
 34 | 25 Timeless <i>Golden Girls</i> Memes and Quotables
 35 | Long-Lost F. Scott Fitzgerald Story Rediscovered and Published, 76 Years Later
 36 | Watch the First Episode of <i>Garfunkel and Oates</i>
 37 | J Dilla's Synthesizers, Equipment Donated to Smithsonian Museum
 38 | Foyer painted in HABANERO
 39 | Living in the Light with Ann Graham Lotz
 40 | <i>The Colbert Report</i> Gets End Date
 41 | Brit Marling-Zal Batmanglij Drama Series <i>The OA</i> Gets Picked Up By Netflix
 42 | Chris Messina In Talks to Star Alongside Ben Affleck in <i>Live By Night</i>
 43 | Netflix Strikes Deal with AT&T for Faster Streaming
 44 | Foyer painted in WHITE
 45 | Sound Advice with John W Doyle
 46 | <i>The Long Dark</i> Gets First Trailer, Steam Early Access
 47 | Watch the Trailer for NBC's <i>Constantine</i>
 48 | Donna Tartt's <i>The Goldfinch</i> Scores Film Adaptation
 49 | Air Conditioners & Parts
 50 | If Barbie Were The Face of The World's Most Famous Paintings
 51 | Designart Canada White Stained Glass Floral Design 29-in Round Metal Wall Art
 52 | Aaron Paul to Play Luke Skywalker at LACMA Reading of <i>The Empire Strikes Back</i>
 53 | Foyer painted in KHAKI
 54 | Renegade RSS Laptop Backpack - View 31
 55 | Renegade RSS Laptop Backpack - View 21
 56 | Falmouth Navy Blue Area Rug by Andover Mills
 57 | Emma Watson to play Belle in Disney's <i>Beauty and the Beast</i>
 58 | George R.R. Martin Donates $10,000 to Wolf Sanctuary for a 13-Year-Old Fan
 59 | Director Danny Boyle Is Headed To TV With FX Deal
 60 | Plymouth Curtain Panel featuring Madelyn - White Botanical Floral Large Scale by heatherdutton
 61 | Plymouth Curtain Panel featuring diamond_x_blue_gray by boxwood_press
 62 | Axle Laptop Backpack - View 81
 63 | Renegade RSS Laptop Backpack - View 3
 64 | Anzell Blue/Gray Area Rug by Andover Mills
 65 | Aero 51-204710 51 Series 15x10 Wheel, Spun, 5 on 4-3/4 BP, 1 Inch BS
 66 | Melrose Gray/Blue Area Rug by Andover Mills
 67 | Aero 58-905055PUR 58 Series 15x10 Wheel, SP, 5 on 5 Inch, 5-1/2 BS
 68 | Lilah Gray Area Rug by Andover Mills
 69 | George R.R. Martin to Focus on Writing Next Book, World Rejoices
 70 | Baby Shower Turned Meteor Shower: Anne Hathaway Fights Off Aliens in Sci-Fi Comedy <i>The Shower</i>
 71 | Ava DuVernay Won't Direct <i>Black Panther</i> After All
 72 | Renegade RSS Laptop Backpack - View 51
 73 | Shaw Floors Couture' Collection Ultimate Expression 15′ Sahara 00205_19829
 74 | Foyer painted in HIGH TIDE
 75 | 33 Screenshots of Musicians in Videogames
 76 | <i>Breaking Bad</i> Fans Get a Chance to Call Saul with Albuquerque Billboard
 77 | Shaw Floors Sandy Hollow III 15′ Adobe 00108_Q4278
 78 | Designart Blue Fractal Abstract Illustration Abstract Canvas Wall Art - 7 Panels
 79 | Shaw Floors Couture' Collection Ultimate Expression 15′ Peanut Brittle 00702_19829
 80 | Aero 50-975035BLU 50 Series 15x7 Inch Wheel, 5 on 5 Inch BP 3-1/2 BS
 81 | Aero 55-004220 55 Series 15x10 Wheel, 4-lug, 4 on 4-1/4 BP, 2 Inch BS
 82 | Design Art Light in Dense Fall Forest with Fog Ultra Vibrant Landscape Oversized Circle Wall Art
 83 | Renegade RSS Laptop Backpack - View 5
 84 | Lilah Dark Gray Area Rug by Andover Mills
 85 | FUSE Backpack 25 - View 81
 86 | Obadiah Hand-Woven Wool Gray Area Rug by Mercury Row
 87 | Foyer painted in SALTY AIR
 88 | Shaw Floors Shaw Design Center Different Times II 12 Classic Buff 00108_5C494
 89 | Shaw Floors Sandy Hollow Classic Iv 12′ Sahara 00205_E0554
 90 | Shaw Floors Value Collections All Star Weekend III 15′ Net Desert Sunrise 00721_E0816
 91 | Aero 50-284730 50 Series 15x8 Inch Wheel, 5 on 4-3/4 BP, 3 Inch BS
 92 | Sarah Silverman Will Star in HBO Pilot from <i>Secret Diary of a Call Girl</i> Creator
 93 | Shaw Floors Caress By Shaw Cashmere II Icelandic 00100_CCS02
 94 | Freddy Adu Signs For Yet Another Club You Probably Don't Know
 95 | Shaw Floors Nfa/Apg Color Express II Suitable 00712_NA209
 96 | Shaw Floors Simply The Best Without Limits II Net Sandbank 00103_5E508
 97 | Shaw Floors Shaw Flooring Gallery Burtonville Luminary 00201_5293G
 98 | Shaw Floors Couture' Collection Ultimate Expression 12′ Tropic Vine 00304_19698
 99 | Shaw Floors Value Collections Cashmere I Lg Net Harvest Moon 00126_CC47B
100 | Shaw Floors Value Collections Passageway 2 12 Camel 00204_E9153
101 | Shaw Floors Shaw Design Center Different Times II 12 Silk 00104_5C494
102 | Aero 30-984540BLK 30 Series 13x8 Inch Wheel, 4 on 4-1/2 BP 4 Inch BS
103 | Shaw Floors Simply The Best Of Course We Can III 12′ Sepia 00105_E9425
104 | Signature Purple Ombre Sugar Skull and Rose Bedding
105 | Aero 52-985030GOL 52 Series 15x8 Inch Wheel, 5 on 5 BP, 3 Inch BS IMCA
106 | Shaw Floors Shaw Flooring Gallery Ellendale 15′ Ink Spot 00501_5301G
107 | Renegade RSS Laptop Backpack - View 41
108 | Pencil pleat curtains in collection Panama Cotton, fabric: 702-34
109 | Aero 30-974230BLK 30 Series 13x7 Inch Wheel, 4 on 4-1/4 BP 3 Inch BS
110 | Shaw Floors Value Collections Because We Can II 15′ Net Sea Shell 00100_E9315
111 | Designart Pink Fractal Pattern With Swirls Abstract Wall Art Canvas - 6 Panels
112 | Read a Previously Unpublished F. Scott Fitzgerald Story
113 | Design Art Beautiful View of Paris Paris Eiffel Towerunder Red Sky Ultra Glossy Cityscape Circle Wall Art
114 | Shaw Floors Queen Point Guard 12′ Flax Seed 00103_Q4855
115 | Shaw Floors Queen Sandy Hollow I 15′ Peanut Brittle 00702_Q4274
116 | Aero 56-084530 56 Series 15x8 Wheel, Spun, 5 on 4-1/2 BP, 3 Inch BS
117 | Aero 31-974510BLK 31 Series 13x7 Wheel, Spun Lite 4 on 4-1/2 BP 1 BS
118 | Shaw Floors Town Creek III Sea Mist 00400_52S32
119 | Shiflett Gray/Blue/White Area Rug by Andover Mills
120 | Aero 53-204720 53 Series 15x10 Wheel, BLock, 5 on 4-3/4 BP, 2 Inch BS
121 | Shaw Floors Shaw Design Center Royal Portrush III 12′ Crumpet 00203_5C613
122 | Shaw Floors Shaw Design Center Park Manor 12′ Cashew 00106_QC459
123 | Emma Watson Set to Star Alongside Tom Hanks in Film Adaptation of Dave Eggers' <i>The Circle</i>
124 | Shaw Floors Pure Waters 15 Clam Shell 00102_52H11
125 | Aero 31-984210BLK 31 Series 13x8 Wheel, Spun 4 on 4-1/4 BP 1 Inch BS
126 | Shaw Floors SFA Timeless Appeal I 15′ Tundra 00708_Q4311
127 | Aero 50-924750ORG 50 Series 15x12 Wheel, 5 on 4-3/4 BP, 5 Inch BS
128 | Shaw Floors SFA Tuscan Valley Cashmere 00701_52E29
129 | Shaw Floors Shaw Design Center Moment Of Truth Acorn 00700_5C789
130 | Renegade RSS Laptop Backpack - View 2
131 | Shaw Floors Nfa/Apg Barracan Classic I True Blue 00423_NA074
132 | Shaw Floors Shaw Design Center Different Times III 12 Soft Copper 00600_5C496
133 | Shaw Floors All Star Weekend I 15′ Castaway 00400_E0141
134 | Shaw Floors Couture' Collection Ultimate Expression 15′ Soft Shadow 00105_19829
135 | FUSE Backpack 25 - View 71
136 | Shaw Floors Caress By Shaw Cashmere I Lg Bismuth 00124_CC09B
137 | FUSE Backpack 25 - View 91
138 | Shaw Floors Value Collections Cozy Harbor I Net Waters Edge 00307_5E364
139 | Aero 31-904030RED 31 Series 13x10 Wheel, Spun Lite, 4 on 4 BP, 3 BS
140 | Shaw Floors Value Collections That's Right Net Sedona 00708_E0925
141 | Grieve Cream/Navy Area Rug by Bungalow Rose
142 | Shaw Floors Value Collections Explore With Me Twist Net Wave Pool 00410_E0849
143 | Shaw Floors SFA Shingle Creek Iv 15′ Mojave 00301_EA519
144 | Shaw Floors Roll Special Xv540 Tropical Wave 00420_XV540
145 | Shaw Floors Value Collections Of Course We Can III 12′ Net Shadow 00502_E9441
146 | Shaw Floors Value Collections Passageway 1 12 Net Classic Buff 00108_E9152
147 | Shaw Floors Caress By Shaw Cashmere Classic Iv Navajo 00703_CCS71
148 | Lilah Gray Area Rug By Andover Mills.
149 | Tremont Blue/Ivory Area Rug by Andover Mills
150 | Tab top lined curtains in collection Chenille, fabric: 702-22
151 | Shaw Floors Caress By Shaw Cashmere I Lg Heirloom 00122_CC09B
152 | Shaw Floors Shaw Design Center Sweet Valley II 12′ Tuscany 00204_QC422
153 | Shaw Floors Shaw Design Center Sweet Valley III 12′ Soft Shadow 00105_QC424
154 | Shaw Floors Sandy Hollow Classic III 12′ Cashew 00106_E0552
155 | Aero 52-984740RED 52 Series 15x8 Wheel, 5 on 4-3/4 BP, 4 Inch BS IMCA
156 | Shaw Floors Shaw Design Center Sweet Valley II 15′ Mountain Mist 00103_QC423
157 | Renegade RSS Laptop Backpack - View 4
158 | Shaw Floors Value Collections Cashmere I Lg Net Pebble Path 00722_CC47B
159 | Shaw Floors Couture' Collection Ultimate Expression 12′ Almond Flake 00200_19698
160 | "Daft Punk, Jay Z Collaborate on ""Computerized"""
161 | Meditation Floor Pillow
162 | Shaw Floors Value Collections Cashmere Iv Lg Net Navajo 00703_CC50B
163 | Netflix Hits 50 Million Subscribers
164 | Shaw Floors SFA Awesome 4 Linen 00104_E0741
165 | Shaw Floors SFA Enjoy The Moment I 15′ Butterscotch 00201_0C138
166 | Foyer painted in BROWN BAG
167 | Shaw Floors Value Collections Of Course We Can I 15 Net Linen 00100_E9432
168 | Shaw Floors Value Collections Sandy Hollow Cl II Net Alpine Fern 00305_5E510
169 | Shaw Floors Caress By Shaw Cashmere Classic I Mesquite 00724_CCS68
170 | Shaw Floors Caress By Shaw Cashmere Classic I Rich Henna 00620_CCS68
171 | Lilah Teal Blue Area Rug by Andover Mills
172 | Pencil pleat curtain in collection Linen, fabric: 392-05
173 | Shaw Floors Foundations Elemental Mix II Pixels 00170_E9565
174 | Shaw Floors Nfa/Apg Detailed Artistry I Snowcap 00179_NA328
175 | Designart Circled Blue Psychedelic Texture Abstract Art On Canvas - 7 Panels
176 | Shaw Floors SFA My Inspiration I Textured Canvas 00150_EA559
177 | Peraza Hand-Tufted White Area Rug by Mercury Row
178 | Shaw Floors Caress By Shaw Cashmere Classic III Rich Henna 00620_CCS70
179 | Shaw Floors SFA Timeless Appeal III 12′ Country Haze 00307_Q4314
180 | Pencil pleat curtain in collection Panama Cotton, fabric: 702-31
181 | Duhon Gray/Ivory Area Rug by Mercury Row
182 | Shaw Floors Value Collections Because We Can III 15′ Net Birch Tree 00103_E9317
183 | Shaw Floors Value Collections Cashmere Iv Lg Net Jade 00323_CC50B
184 | Shaw Floors Shaw Floor Studio Bright Spirit II 15′ Marzipan 00201_Q4651
185 | Pencil pleat curtains in collection Blackout, fabric: 269-12
186 | Shaw Floors Shaw Flooring Gallery Union City II 12′ Golden Echoes 00202_5306G
187 | Green Fractal Lights In Fog Abstract Wall Art Canvas - 6 Panels
188 | Aero 52-984510BLK 52 Series 15x8 Wheel, 5 on 4-1/2 BP, 1 Inch BS IMCA
189 | 3D Black & White Skull King Design Luggage Covers 007
190 | Shaw Floors Value Collections Take The Floor Twist II Net Biscotti 00131_5E070
191 | Aero 52-984720BLU 52 Series 15x8 Wheel, 5 on 4-3/4 BP, 2 Inch BS IMCA
192 | Björk Explains Decision To Pull <i>Vulnicura</i> From Spotify
193 | "Listen to Ricky Gervais Perform ""Slough"" as David Brent"
194 | Brickhill Ivory Area Rug by Andover Mills
195 | Shaw Floors Caress By Shaw Cashmere Iv Bison 00707_CCS04
196 | Shaw Floors Value Collections What's Up Net Linen 00104_E0926
197 | Smithtown Latte Area Rug by Andover Mills
198 | Anderson Tuftex Natural State 1 Dream Dust 00220_ARK51
199 | Shaw Floors Caress By Shaw Cashmere I Lg Gentle Doe 00128_CC09B
200 | Shaw Floors Caress By Shaw Quiet Comfort Classic III Atlantic 00523_CCB98
201 | Shaw Floors Shaw Floor Studio Textured Story 15 Candied Truffle 55750_52B76
202 | Shaw Floors Resilient Residential Stone Works 720c Plus Glacier 00147_525SA
203 | Set of 6 Brother TZe-231 Black on White P-Touch Label
204 | Aero 31-984040GRN 31 Series 13x8 Wheel, Spun, 4 on 4 BP, 4 Inch BS
205 | Shaw Floors Anso Premier Dealer Great Effect I 15′ Almond Flake 00200_Q4328
206 | Shaw Floors Spice It Up Tyler Taupe 00103_E9013
207 | Shaw Floors Shaw Flooring Gallery Inspired By II French Linen 00103_5560G
208 | Shaw Floors Caress By Shaw Quiet Comfort Classic III Froth 00520_CCB98
209 | Shaw Floors Value Collections All Star Weekend I 12 Net Crumpet 00203_E0792
210 | Pencil pleat curtains in collection Jupiter, fabric: 127-00
211 | Aero 30-904550RED 30 Series 13x10 Inch Wheel, 4 on 4-1/2 BP 5 Inch BS
212 | Shaw Floors Value Collections Solidify III 12 Net Natural Contour 00104_5E340
213 | Tab top curtains in collection Avinon, fabric: 131-15
214 | Shaw Floors Caress By Shaw Quiet Comfort Classic Iv Rich Henna 00620_CCB99
215 | Shaw Floors Value Collections Cashmere I Lg Net Barnboard 00525_CC47B
216 | Shaw Floors Caress By Shaw Cashmere Classic Iv Spruce 00321_CCS71
217 | Pencil pleat curtains in collection Jupiter, fabric: 127-50
218 | Shaw Floors Northern Parkway Crystal Clear 00410_52V34
219 | Shaw Floors Roll Special Xv863 Bare Mineral 00105_XV863
220 | Shaw Floors Leading Legacy Crystal Gray 00500_E0546
221 | Shaw Floors Roll Special Xy176 Sugar Cookie 00101_XY176
222 | FUSE Backpack 25 - View 61
223 | Aero 52-984520RED 52 Series 15x8 Wheel, 5 on 4-1/2 BP, 2 Inch BS IMCA
224 | Shaw Floors Roll Special Xv375 Royal Purple 00902_XV375
225 | Shaw Floors Caress By Shaw Cashmere III Lg Pacific 00524_CC11B
226 | Plymouth Curtain Panel featuring Christmas woofs by gkumardesign
227 | Designart Serene Maldives Seashore at Sunset Oversized Landscape Canvas Art - 4 Panels
228 | Shaw Floors Apd/Sdc Decordovan II 12′ Country Haze 00307_QC392
229 | Shaw Floors Enduring Comfort I French Linen 00103_E0341
230 | Shaw Floors Value Collections Passageway 2 12 Pewter 00501_E9153
231 | Eugenia Brown Area Rug by Andover Mills
232 | Aero 53-904530BLU 53 Series 15x10 Wheel, BL, 5 on 4-1/2 BP 3 Inch BS
233 | Anderson Tuftex Anderson Hardwood Palo Duro Mixed Width Golden Ore 37212_AA777
234 | Shaw Floors Value Collections Passageway 2 12 Mocha Chip 00705_E9153
235 | Shaw Floors Caress By Shaw Cashmere III Lg Yearling 00107_CC11B
236 | Aero 51-980540GRN 51 Series 15x8 Wheel, Spun, 5 on WIDE 5, 4 Inch BS
237 | Annabel Green Area Rug by Bungalow Rose
238 | Melrose Gray/Yellow Area Rug by Andover Mills
239 | Shaw Floors Shaw Floor Studio Porto Veneri III 12′ Golden Rod 00202_52U58
240 | Shaw Floors Shaw Design Center Sweet Valley I 15′ Blue Suede 00400_QC421
241 | Shaw Floors Shaw Floor Studio Porto Veneri I 15′ Cream 00101_52U55
242 | Shaw Floors Elemental Mix III Gentle Rain 00171_E9566
243 | Shaw Floors
244 | Shaw Floors Simply The Best Bandon Dunes Silver Leaf 00541_E0823
245 | Shaw Floors Caress By Shaw Quiet Comfort Classic I Deep Indigo 00424_CCB96
246 | Shaw Floors Value Collections Color Moxie Meteorite 00501_E9900
247 | Aero 51-985020GRN 51 Series 15x8 Wheel, Spun, 5 on 5 Inch, 2 Inch BS
248 | Shaw Floors That's Right Rustic Taupe 00706_E0812
249 | Shaw Floors Queen Thrive Fine Lace 00100_Q4207
250 | Shaw Floors Newbern Classic 15′ Crimson 55803_E0950
251 | Shaw Floors Caress By Shaw Cashmere Classic II Spearmint 00320_CCS69
252 | Shaw Floors Value Collections Xvn05 (s) Soft Chamois 00103_E1236
253 | Ethelyn Lilah Area Rug by Andover Mills
254 | Shaw Floors Caress By Shaw Tranquil Waters Net Sky Washed 00400_5E062
255 | Shaw Floors SFA Find Your Comfort Tt I Lilac Field (t) 901T_EA817
256 | Shaw Floors Solidify III 15′ Pewter 00701_5E267
257 | Red Exotic Fractal Pattern Abstract Art On Canvas-7 Panels
258 | Pencil pleat curtains 130 x 260 cm (51 x 102 inch) in collection Brooklyn, fabric: 137-79
259 | Shaw Floors All Star Weekend III Net Royal Purple 00902_E0773
260 | Shaw Floors Make It Yours (s) Dockside 00752_E0819
261 | Shaw Floors Caress By Shaw Cashmere III Lg Onyx 00528_CC11B
262 | Shaw Floors Value Collections Take The Floor Texture II Net Hickory 00711_5E067
263 | Pencil pleat curtains in collection Velvet, fabric: 704-15
264 | Shaw Floors Simply The Best Within Reach III Grey Fox 00504_5E261
265 | Shaw Floors SFA Vivid Colors I Moroccan Jewel 00803_0C160
266 | Pencil pleat curtains in collection Velvet, fabric: 704-18
267 | Shaw Floors Value Collections Xvn05 (s) Bridgewater Tan 00709_E1236
268 | Shaw Floors Shaw Flooring Gallery Highland Cove III 12 Sage Leaf 00302_5223G
269 | Shaw Floors Queen Harborfields II 15′ Green Apple 00303_Q4721
270 | Shaw Floors Fusion Value 300 Canyon Shadow 00810_E0281
271 | Pencil pleat curtains in collection Avinon, fabric: 129-66
272 | Designart Forest Road In Thick Woods Modern ForestWrapped Canvas Art - 5 Panels
273 | Hegwood Gray Area Rug by Andover Mills
274 | Shaw Floors Shaw Design Center Sweet Valley II 12′ Pine Cone 00703_QC422
275 | FUSE Backpack 25 - View 41
276 | Shaw Floors Sorin I Antique Pin 00571_FQ411
277 | Shaw Floors Sandy Hollow Classic Iv 12′ London Fog 00501_E0554
278 | Renegade RSS Laptop Backpack - View 11
279 | Shaw Floors Value Collections Fyc Tt II Net Dinner With Friends (t) 732T_5E022
280 | Designart Blue Fractal Sky With Blur Stars Abstract Canvas Art Print - 6 Panels
281 | Shaw Floors Value Collections Take The Floor Twist II Net White Hot 00150_5E070
282 | Shaw Floors My Choice II Cappuccino 00756_E0651
283 | Shaw Floors Couture' Collection Ultimate Expression 15′ Oatmeal 00104_19829
284 | Chelmsford Brown Area Rug by Andover Mills
285 | Shaw Floors Roll Special Xv864 Butter Cream 00200_XV864
286 | Designart Bohinj Lake Panorama Seashore Canvas ArtPrint - 6 Panels
287 | Designart Beautiful Winter Panorama Large Landscape Canvas Art Print - 5 Panels
288 | Shaw Floors Value Collections Xvn04 Marble Gray 00503_E1234
289 | Pencil pleat curtains in collection Venice, fabric: 142-57
290 | FUSE Roll Top Backpack 25 - View 71
291 | Shaw Floors Caress By Shaw Quiet Comfort Classic Iv Silver Lining 00123_CCB99
292 | Shaw Floors Value Collections All Star Weekend I 12 Net Butter Cream 00200_E0792
293 | Shaw Floors SFA Awesome 6 (s) Marble Gray 00503_E0745
294 | Pencil pleat curtains in collection Acapulco, fabric: 141-37
295 | Shaw Floors Value Collections Take The Floor Twist Blue White Hot 00150_5E071
296 | Shaw Floors Value Collections All Star Weekend II 12′ Net Molasses 00710_E0814
297 | Shaw Floors Value Collections Cabana Life (b) Net Summer Wind 00251_5E004
298 | Shaw Floors Inspired By III Washed Turquoise 00453_5562G
299 | Shaw Floors Shaw Design Center Toe To Toe (s) Cool Taupe 00750_5C749
300 | Beige on White Watercolor Skull Bedding
301 | Shaw Floors Caress By Shaw Cashmere II Lg Spruce 00321_CC10B
302 | Pencil pleat curtain in collection Sunny, fabric: 143-43
303 | FUSE Backpack 25 - View 31
304 | Shaw Floors Caress By Shaw Cashmere Classic Iv Deep Indigo 00424_CCS71
305 | Shaw Floors Value Collections Cashmere Iv Lg Net Brass Lantern 00222_CC50B
306 | Designart Turquoise Watercolor Fractal Pattern ContemporaryArt On Canvas - 5 Panels
307 | Shaw Floors Caress By Shaw Devon Classic II Toasted Grain 0241B_CCS94
308 | FUSE Backpack 25 - View 111
309 | Shaw Floors Caress By Shaw Cashmere II Lg Blush 00125_CC10B
310 | Tab top curtains in collection Mirella, fabric: 141-11
311 | Pencil pleat curtain in collection Edinburgh, fabric: 142-31
312 | Designart Golden Fractal Pattern With Circles Abstract Canvas Art Print - 7 Panels
313 | Thumbprintz Splatter No I Red Floor Pillow
314 | Aero 53-984710BLU 53 Series 15x8 Wheel, BL, 5 on 4-3/4, 1 Inch BS IMCA
315 | Shaw Floors Caress By Shaw Quiet Comfort Classic II Chestnut 00726_CCB97
316 | Designart Palms on Philippines Tropical Beach Modern Seascape Canvas Artwork - 5 Panels
317 | Shaw Floors Anso Premier Dealer Great Effect I 12′ Coffee Bean 00711_Q4327
318 | Florina Holiday Floor Pillow
319 | Shaw Floors Home Fn Gold Hardwood Kings Canyon 2 – 5 Stonehenge 00510_HW622
320 | Shaw Floors Value Collections Within Reach II Net Beige Bisque 00110_5E336
321 | FUSE Roll Top Backpack 25 - View 81
322 | Designart Multi Color Fractal Stained Glass Abstract Wall Art Canvas - 5 Panels
323 | Ethan Hawke to Star as Jazz Great Chet Baker in New Biopic
324 | Designart Bright Blue Veins Of Marble Abstract Wall Art Canvas - 6 Panels
325 | FUSE Roll Top Backpack 25 - View 61
326 | Renegade RSS Laptop Backpack - View 10
327 | Pink Psychedelic Relaxing Art Abstract Canvas ArtPrint - 6 Panels
328 | Designart Dark Orange Fractal Flower ContemporaryWall Art Canvas - 5 Panels
329 | Waterford Sand Silk Stripe Swatch
330 | Plattville Green Area Rug by Andover Mills
331 | Design Art Dark Morning In Forest Panorama Landscape Canvas Art Print - 5 Panels
332 | FUSE Roll Top Backpack 25 - View 91
333 | Oushak Hand-Knotted Beige/Blue Area Rug by Shalom Brothers
334 | Eyelet curtains in collection Panama Cotton, fabric: 702-00
335 | Shaw Floors Shaw Floor Studio Porto Veneri II 12′ Soft Copper 00600_52U56
336 | Designart Clear Green Veins Of Marble Abstract Canvas Art Print - 5 Panels
337 | Blackout eyelet curtain in collection Blackout, fabric: 269-99
338 | FUSE Backpack 25 - View 21
339 | Purple Starry Fractal Sky Abstract Canvas Art Print - 7 Panels
340 | Design Art Bright Pink Designs On Black Abstract Wall Art Canvas - 5 Panels
341 | Pencil pleat curtain in collection Damasco, fabric: 141-77
342 | Designart Green Abstract Metal Grill ContemporaryArt On Canvas - 5 Panels
343 | Shaw Floors SFA Shingle Creek II 12′ Silver Charm 00500_EA514
344 | Aero 53984710WGRN 53 Series 15x8 Wheel, BL, 5 on 4-3/4, 1 BS Wissota
345 | Susan Blue Area Rug by Charlton Home
346 | 


--------------------------------------------------------------------------------
/prompt/sd2_mem219.txt:
--------------------------------------------------------------------------------
  1 | Hardwood ColorCollectionStripSolid 7SAPS31403 GoldenOak
  2 | Shaw Floors SFA Awesome 4 Linen 00104_E0741
  3 | Shaw Floors Value Collections All Star Weekend I 12 Net Butter Cream 00200_E0792
  4 | Shaw Floors Value Collections All Star Weekend I 12 Net Crumpet 00203_E0792
  5 | Shaw Floors Nfa Premier Gallery Hardwood Brighton Point 5 Burnt Barnboard 00304_VH032
  6 | Shaw Floors Value Collections Cashmere I Lg Net Barnboard 00525_CC47B
  7 | Shaw Floors Value Collections Fyc Tt II Net Dinner With Friends (t) 732T_5E022
  8 | Shaw Floors Value Collections Passageway 2 12 Camel 00204_E9153
  9 | Shaw Floors Northern Parkway Crystal Clear 00410_52V34
 10 | Shaw Floors Value Collections Xvn04 Marble Gray 00503_E1234
 11 | Shaw Floors Value Collections Cashmere I Lg Net Pebble Path 00722_CC47B
 12 | Shaw Floors Simply The Best Bandon Dunes Silver Leaf 00541_E0823
 13 | Shaw Floors Shaw Flooring Gallery Inspired By II French Linen 00103_5560G
 14 | Shaw Floors Fusion Value 300 Canyon Shadow 00810_E0281
 15 | Shaw Floors Roll Special Xy176 Sugar Cookie 00101_XY176
 16 | Shaw Floors Value Collections Cashmere I Lg Net Harvest Moon 00126_CC47B
 17 | Shaw Floors Caress By Shaw Quiet Comfort Classic III Atlantic 00523_CCB98
 18 | Shaw Floors Caress By Shaw Cashmere Classic III Rich Henna 00620_CCS70
 19 | Shaw Floors Shaw Floor Studio Porto Veneri I 15′ Cream 00101_52U55
 20 | Shaw Floors SFA Awesome 7 (s) Stunning Navy 00401_E0747
 21 | Shaw Floors Nfa Premier Gallery Hardwood Edenwild 2.25 Red Oak Natural 00700_VH029
 22 | Shaw Floors Value Collections Passageway 2 12 Pewter 00501_E9153
 23 | Shaw Floors Town Creek III Sea Mist 00400_52S32
 24 | Shaw Floors Simply The Best Of Course We Can III 12′ Sepia 00105_E9425
 25 | Shaw Floors Home Fn Gold Hardwood Ruger Oak 3 Natural Oak 01000_HW537
 26 | Shaw Floors Value Collections Take The Floor Texture II Net Hickory 00711_5E067
 27 | Shaw Floors Shaw Design Center Sweet Valley I 15′ Blue Suede 00400_QC421
 28 | Shaw Floors Shaw Design Center Park Manor 12′ Cashew 00106_QC459
 29 | Shaw Floors Value Collections Cabana Life (b) Net Summer Wind 00251_5E004
 30 | Shaw Floors Caress By Shaw Cashmere I Lg Gentle Doe 00128_CC09B
 31 | Shaw Floors Home Fn Gold Hardwood Kings Canyon 2 – 5 Stonehenge 00510_HW622
 32 | Shaw Floors Caress By Shaw Cashmere I Lg Heirloom 00122_CC09B
 33 | Shaw Floors Sandy Hollow Classic III 15′ Rouge Red 00820_E0553
 34 | Shaw Floors Value Collections Solidify III 12 Net Natural Contour 00104_5E340
 35 | Shaw Floors Simply The Best Without Limits II Net Sandbank 00103_5E508
 36 | Shaw Floors Value Collections Take The Floor Twist II Net White Hot 00150_5E070
 37 | Shaw Floors Pure Waters 15 Clam Shell 00102_52H11
 38 | Shaw Floors Value Collections Xvn05 (s) Bridgewater Tan 00709_E1236
 39 | Shaw Floors Value Collections Passageway 1 12 Net Classic Buff 00108_E9152
 40 | Shaw Floors Solidify III 15′ Pewter 00701_5E267
 41 | Shaw Floors SFA Timeless Appeal I 15′ Tundra 00708_Q4311
 42 | Shaw Floors SFA Awesome 6 (s) Marble Gray 00503_E0745
 43 | Shaw Floors SFA Tuscan Valley Cashmere 00701_52E29
 44 | Shaw Floors Caress By Shaw Quiet Comfort Classic Iv Rich Henna 00620_CCB99
 45 | Shaw Floors Shaw Design Center Moment Of Truth Acorn 00700_5C789
 46 | Shaw Floors Property Solutions Tailored Elegance Suede Brown 26774_PS726
 47 | Shaw Floors Value Collections Sandy Hollow Cl II Net Alpine Fern 00305_5E510
 48 | Shaw Floors Value Collections What's Up Net Linen 00104_E0926
 49 | Shaw Floors Value Collections Of Course We Can III 12′ Net Shadow 00502_E9441
 50 | Shaw Floors My Choice II Cappuccino 00756_E0651
 51 | Shaw Floors Value Collections Of Course We Can I 15 Net Linen 00100_E9432
 52 | Shaw Floors Value Collections Cashmere Iv Lg Net Navajo 00703_CC50B
 53 | Shaw Floors Queen Sandy Hollow I 15′ Peanut Brittle 00702_Q4274
 54 | Shaw Floors Queen Harborfields III 15′ Dark Roast 00709_Q4723
 55 | Shaw Floors Shaw Flooring Gallery Highland Cove III 12 Sage Leaf 00302_5223G
 56 | Simple Floral Pave Utpala Black Onyx Ring with Amethyst and Swiss Blue Topaz in 18k Yellow Gold
 57 | Shaw Floors Value Collections Because We Can II 15′ Net Sea Shell 00100_E9315
 58 | Shaw Floors Value Collections Xvn05 (s) Soft Chamois 00103_E1236
 59 | Shaw Floors Queen Versatile Design I 15′ Dark Roast 00709_Q4784
 60 | Shaw Floors SFA Drexel Hill III 12′ Coffee Bean 00705_EA055
 61 | Shaw Floors Caress By Shaw Cashmere Classic Iv Deep Indigo 00424_CCS71
 62 | Shaw Floors Value Collections All Star Weekend III 15′ Net Desert Sunrise 00721_E0816
 63 | Shaw Floors Shaw Design Center Sweet Valley II 12′ Tuscany 00204_QC422
 64 | Shaw Floors Shaw Flooring Gallery Union City II 12′ Golden Echoes 00202_5306G
 65 | Shaw Floors SFA Shingle Creek II 12′ Silver Charm 00500_EA514
 66 | Shaw Floors Spice It Up Tyler Taupe 00103_E9013
 67 | Shaw Floors Caress By Shaw Cashmere II Icelandic 00100_CCS02
 68 | Shaw Floors Value Collections Passageway 2 12 Mocha Chip 00705_E9153
 69 | Shaw Floors Caress By Shaw Cashmere II Lg Blush 00125_CC10B
 70 | Shaw Floors That's Right Rustic Taupe 00706_E0812
 71 | Shaw Natural Values II Plus Collection: Brookdale Walnut 7mm Attached Pad Laminate SL255 638
 72 | Shaw Floors Caress By Shaw Quiet Comfort Classic I Deep Indigo 00424_CCB96
 73 | Shaw Floors Shaw Hardwoods Yukon Maple Mixed Width Bison 03000_SW549
 74 | Shaw Floors Queen Point Guard 12′ Flax Seed 00103_Q4855
 75 | Shaw Floors Caress By Shaw Quiet Comfort Classic Iv Silver Lining 00123_CCB99
 76 | Shaw Floors Enduring Comfort I French Linen 00103_E0341
 77 | Shaw Floors Caress By Shaw Cashmere III Lg Yearling 00107_CC11B
 78 | Shaw Floors Value Collections Explore With Me Twist Net Wave Pool 00410_E0849
 79 | Shaw Floors Vinyl Residential Vigor 512c Plus Auburn Oak 00698_0935V
 80 | Shaw Floors Value Collections Because We Can III 15′ Net Birch Tree 00103_E9317
 81 | Shaw Floors SFA Shingle Creek Iv 15′ Mojave 00301_EA519
 82 | Shaw Floors Couture' Collection Ultimate Expression 15′ Soft Shadow 00105_19829
 83 | Shaw Floors Caress By Shaw Quiet Comfort Classic III Froth 00520_CCB98
 84 | Shaw Floors Foundations Elemental Mix II Pixels 00170_E9565
 85 | Shaw Floors Elemental Mix III Gentle Rain 00171_E9566
 86 | Shaw Floors Shaw Floor Studio Textured Story 15 Candied Truffle 55750_52B76
 87 | Shaw Floors Caress By Shaw Quiet Comfort Classic II Chestnut 00726_CCB97
 88 | Shaw Floors Sandy Hollow Classic III 12′ Cashew 00106_E0552
 89 | Shaw Floors Roll Special Xv864 Butter Cream 00200_XV864
 90 | Shaw Floors Value Collections Cashmere Iv Lg Net Jade 00323_CC50B
 91 | Shaw Floors Queen Harborfields II 15′ Green Apple 00303_Q4721
 92 | Shaw Floors Sorin I Antique Pin 00571_FQ411
 93 | Shaw Floors Value Collections Cozy Harbor I Net Waters Edge 00307_5E364
 94 | Shaw Floors Value Collections Take The Floor Twist Blue White Hot 00150_5E071
 95 | Shaw Floors Value Collections Color Moxie Meteorite 00501_E9900
 96 | Shaw Floors Caress By Shaw Tranquil Waters Net Sky Washed 00400_5E062
 97 | Shaw Floors Value Collections Take The Floor Twist II Net Biscotti 00131_5E070
 98 | Shaw Floors SFA Enjoy The Moment I 15′ Butterscotch 00201_0C138
 99 | Shaw Floors Shaw Flooring Gallery Ellendale 15′ Ink Spot 00501_5301G
100 | Shaw Floors Inspired By III Washed Turquoise 00453_5562G
101 | Shaw Floors SFA Vivid Colors I Moroccan Jewel 00803_0C160
102 | Shaw Floors Newbern Classic 15′ Crimson 55803_E0950
103 | Shaw Floors Resilient Property Solutions Optimum 512c Plus Alabaster Oak 00117_VE210
104 | Shaw Floors SFA Timeless Appeal III 12′ Country Haze 00307_Q4314
105 | Shaw Floors Nfa/Apg Barracan Classic I True Blue 00423_NA074
106 | Shaw Floors Nfa/Apg Detailed Elegance III Chocolate 00758_NA334
107 | Shaw Floors Caress By Shaw Cashmere I Lg Bismuth 00124_CC09B
108 | Shaw Floors Shaw Design Center Sweet Valley III 12′ Soft Shadow 00105_QC424
109 | Shaw Floors Value Collections Within Reach II Net Beige Bisque 00110_5E336
110 | Shaw Floors Caress By Shaw Cashmere Iv Lg Chestnut 00726_CC12B
111 | Shaw Floors Value Collections That's Right Net Sedona 00708_E0925
112 | Shaw Floors Couture' Collection Ultimate Expression 12′ Almond Flake 00200_19698
113 | Shaw Floors Shaw Floor Studio Porto Veneri III 12′ Golden Rod 00202_52U58
114 | Shaw Floors Shaw Floor Studio Bright Spirit II 15′ Marzipan 00201_Q4651
115 | Shaw Floors Shaw Design Center Different Times II 12 Silk 00104_5C494
116 | Shaw Floors Caress By Shaw Cashmere Classic I Mesquite 00724_CCS68
117 | Shaw Floors Value Collections Cashmere Iv Lg Net Brass Lantern 00222_CC50B
118 | Shaw Floors Sandy Hollow Classic Iv 12′ London Fog 00501_E0554
119 | Shaw Floors Couture' Collection Ultimate Expression 15′ Peanut Brittle 00702_19829
120 | Shaw Floors Shaw Floor Studio Bright Spirit II 15′ Coffee Bean 00711_Q4651
121 | Shaw Floors Shaw Design Center Different Times II 12 Classic Buff 00108_5C494
122 | Shaw Floors Caress By Shaw Cashmere Iv Bison 00707_CCS04
123 | Shaw Floors Shaw Design Center Royal Portrush III 12′ Crumpet 00203_5C613
124 | Shaw Floors Caress By Shaw Cashmere Classic I Rich Henna 00620_CCS68
125 | Shaw Floors Caress By Shaw Cashmere II Lg Spruce 00321_CC10B
126 | Shaw Floors Caress By Shaw Devon Classic II Toasted Grain 0241B_CCS94
127 | Shaw Floors Caress By Shaw Cashmere Classic II Spearmint 00320_CCS69
128 | Shaw Floors Couture' Collection Ultimate Expression 15′ Sahara 00205_19829
129 | I Heart Ayn Rand - Throw Pillow
130 | Shaw Floors Apd/Sdc Decordovan II 12′ Country Haze 00307_QC392
131 | Shaw Floors Sandy Hollow Classic Iv 12′ Sahara 00205_E0554
132 | Gently - Throw Pillow
133 | Shaw Floors Queen Thrive Fine Lace 00100_Q4207
134 | Shaw Floors Couture' Collection Ultimate Expression 15′ Oatmeal 00104_19829
135 | Shaw Floors SFA My Inspiration I Textured Canvas 00150_EA559
136 | Shaw Floors Value Collections All Star Weekend II 12′ Net Molasses 00710_E0814
137 | Shaw Floors Resilient Property Solutions Easy Fashion Coconut Milk 00163_VPS50
138 | Designart Small Red Flowers in Spring Photo LargeFloral Canvas Artwork - 5 Panels
139 | Shaw Floors Shaw Design Center Sweet Valley II 12′ Pine Cone 00703_QC422
140 | Shaw Floors Caress By Shaw Cashmere III Lg Pacific 00524_CC11B
141 | Shaw Floors Shaw Floor Studio Porto Veneri II 12′ Soft Copper 00600_52U56
142 | Shaw Floors Builder Specified Water's Edge Perfect Taupe 00119_HGR77
143 | Sky Blue Heart Shape Pattern HD Sublimation Metal print with Decorating Float Frame (BOX)
144 | Shaw Floors Shaw Design Center Sweet Valley II 15′ Mountain Mist 00103_QC423
145 | Shaw Floors SFA Find Your Comfort Tt I Lilac Field (t) 901T_EA817
146 | Shaw Floors Couture' Collection Ultimate Expression 12′ Tropic Vine 00304_19698
147 | Shaw Floors Resilient Residential Stone Works 720c Plus Glacier 00147_525SA
148 | Shaw Floors Caress By Shaw Cashmere Classic Iv Spruce 00321_CCS71
149 | Shaw Floors Caress By Shaw Cashmere III Lg Onyx 00528_CC11B
150 | Shaw Floors Ceramic Solutions Range 16×32 Polished Statuario 00151_CS39W
151 | Shaw Floors Shaw Design Center Toe To Toe (s) Cool Taupe 00750_5C749
152 | Shaw Floors Anso Premier Dealer Great Effect I 15′ Almond Flake 00200_Q4328
153 | Design Art Dark Morning In Forest Panorama Landscape Canvas Art Print - 5 Panels
154 | Shaw Floors Shaw Design Center Different Times III 12 Soft Copper 00600_5C496
155 | Designart Bright Blue Veins Of Marble Abstract Wall Art Canvas - 6 Panels
156 | Shaw Floors Caress By Shaw Cashmere Classic Iv Cranberry 00821_CCS71
157 | Shaw Floors Anso Premier Dealer Great Effect I 12′ Coffee Bean 00711_Q4327
158 | Designart Blue Fractal Sky With Blur Stars Abstract Canvas Art Print - 6 Panels
159 | Designart Pink Fractal Abstract Illustration Abstract CanvasWall Art - 4 Panels
160 | Pencil pleat curtains in collection Velvet, fabric: 704-18
161 | Shaw Floors Nfa/Apg Color Express II Suitable 00712_NA209
162 | Sting Like A Bee By Louisa  - Throw Pillow
163 | Shaw Floors Shaw Flooring Gallery Burtonville Luminary 00201_5293G
164 | Designart Green Abstract Metal Grill ContemporaryArt On Canvas - 5 Panels
165 | Doggy - Throw Pillow
166 | Sixty Years Men's French Terry Sweatshirt by Lakers Nation's Artist Shop
167 | Pencil pleat curtains in collection Venice, fabric: 142-57
168 | Pencil pleat curtains in collection Acapulco, fabric: 141-37
169 | Designart Multicolor Optical Fiber Lighting LargeAbstract Canvas Wall Art - 5 Panels
170 | Shaw Floors Caress By Shaw Cashmere Classic Iv Navajo 00703_CCS71
171 | Pencil pleat curtains in collection Velvet, fabric: 704-15
172 | I Am A Dreamer Blue White Circle Throw Pillow
173 | Shaw Floors Leading Legacy Crystal Gray 00500_E0546
174 | Pencil pleat curtains 130 x 260 cm (51 x 102 inch) in collection Brooklyn, fabric: 137-79
175 | Shaw Floors Make It Yours (s) Dockside 00752_E0819
176 | Skull Of A Skeleton With A Burning Cigarette - Vincent Van Gogh Wall Tapestry
177 | Shaw Floors Simply The Best Within Reach III Grey Fox 00504_5E261
178 | Pencil pleat curtains in collection Jupiter, fabric: 127-50
179 | Shaw Floors Sandy Hollow III 15′ Adobe 00108_Q4278
180 | Designart Lavender Flowers With Old House Oversized Landscape Wrapped Wall Art Print - 5 Panels
181 | Designart Pink And Orange Roses On White Modern Forest Canvas Art - 3 Panels
182 | Pencil pleat curtain in collection Linen, fabric: 392-05
183 | Pencil pleat curtain in collection Sunny, fabric: 143-43
184 | Pencil pleat curtain in collection Panama Cotton, fabric: 702-31
185 | Designart Serene Maldives Seashore at Sunset Oversized Landscape Canvas Art - 4 Panels
186 | Design Art Light in Dense Fall Forest with Fog Ultra Vibrant Landscape Oversized Circle Wall Art
187 | Designart Crystal Cell Red Steel Texture AbstractCanvas ArtPrint - 7 Panels
188 | Designart Cloudy Orange Starry Fractal Sky Abstract Canvas Art Print - 5 Panels
189 | Designart Golden Fractal Pattern With Circles Abstract Canvas Art Print - 7 Panels
190 | DEATH METAL! (Funny Unicorn / Rainbow Mosh Parody Design) Unisex T-Shirt
191 | Designart Palms on Philippines Tropical Beach Modern Seascape Canvas Artwork - 5 Panels
192 | STEP UP OR SHUT UP UNISEX DISTRICT TEE SHIRT
193 | Flower Car Seat Covers
194 | stay wild Wall Tapestry
195 | Designart Cryptical Blue Fractal Pattern AbstractWall Art Canvas - 4 Panels
196 | Pencil pleat curtains in collection Panama Cotton, fabric: 702-34
197 | Pencil pleat curtains in collection Jupiter, fabric: 127-00
198 | Designart Bohinj Lake Panorama Seashore Canvas ArtPrint - 6 Panels
199 | Pencil pleat curtains in collection Avinon, fabric: 129-66
200 | Load image into Gallery viewer, Powered by Cannabis - Short-Sleeve Unisex T-Shirt
201 | Designart Blue Glowing Bubbles Time ContemporaryWall Art Canvas - 5 Panels
202 | Designart Clear Green Veins Of Marble Abstract Canvas Art Print - 5 Panels
203 | Designart Mystic Blue Thunder Sky Abstract CanvasArt Print- 5 Panels
204 | Designart Canada White Stained Glass Floral Design 29-in Round Metal Wall Art
205 | Designart Circled Blue Psychedelic Texture Abstract Art On Canvas - 7 Panels
206 | Shaw Floors Couture' Collection Ultimate Expression 15′ Coffee Bean 00711_19829
207 | Pencil pleat curtains in collection Blackout, fabric: 269-12
208 | Designart Turquoise Watercolor Fractal Pattern ContemporaryArt On Canvas - 5 Panels
209 | Pencil pleat curtains in collection Christmas, fabric: 141-78
210 | Bella and Canvas Short-Sleeve Unisex T-Shirt: TR-6 white text
211 | Designart Forest Road In Thick Woods Modern ForestWrapped Canvas Art - 5 Panels
212 | Designart Rapeseed Fields And Green Wheat Landscape Canvas Art Print - 6 Panels
213 | Designart River In Snowy Winter Abstract LandscapeArt
214 | Designart Dark Orange Fractal Flower ContemporaryWall Art Canvas - 5 Panels
215 | Designart Blue Waters in Spring Seascape Photography Wrapped Canvas Art Print - 5 Panels
216 | Shaw Floors Nfa/Apg Detailed Artistry I Snowcap 00179_NA328
217 | Bella and Canvas Short-Sleeve Unisex T-Shirt: immigrants black text
218 | Designart Yellow Trees and Fallen Leaves Modern Forest Canvas Art - 4 Panels
219 | Shaw Floors Roll Special Xv540 Tropical Wave 00420_XV540
220 | 


--------------------------------------------------------------------------------
/read_results.py:
--------------------------------------------------------------------------------
  1 | import argparse
  2 | parser = argparse.ArgumentParser(description="Merge per-prompt attention results (*_pos.npz) into .npy files.")
  3 | parser.add_argument("--local", type=str, default='', help="GPU id(s) to expose via CUDA_VISIBLE_DEVICES; empty leaves the environment unchanged.")
  4 | parser.add_argument("--n_mem", type=int, default=295, help="Number of per-prompt result files ({i}_pos.npz) to merge.")
  5 | parser.add_argument("--n_step", type=int, default=51, help="Number of denoising steps recorded per prompt.")
  6 | parser.add_argument("--hidden_dim", type=int, default=4096, help="Target size of dimension 1 that each array is repeated up to in 'merge' mode.")
  7 | parser.add_argument("--n_head", type=int, default=8, help="Target size of dimension 0 (heads) that each array is repeated up to in 'merge' mode.")
  8 | parser.add_argument("--job_id", type=str, default='local', help="Job identifier (not used by this script).")
  9 | parser.add_argument("--mode", type=str, default='entropy', help="'merge' or 'merge_small'; any other value (including the default) does nothing.")
 10 | parser.add_argument("--merge_input", type=str, default=None, help="Directory containing the {i}_pos.npz files to merge.")
 11 | parser.add_argument("--merge_output", type=str, default=None, help="Output path prefix; writes <prefix>.npy and <prefix>_length.npy.")
 12 | args = parser.parse_args()
 13 | 
 14 | import os
 15 | if args.local != '':
 16 |     os.environ['CUDA_VISIBLE_DEVICES'] = args.local
 17 | 
 18 | import numpy as np
 19 | from tqdm import tqdm
 20 | import torch
 21 | from diffusers import StableDiffusionPipeline
 22 | import glob
 23 | 
 24 | def merge_save(n_count, path, save_path):
 25 | 
 26 |     entropy_results = []
 27 |     length_results = []
 28 | 
 29 |     mem_bar = tqdm(range(n_count), total=n_count)
 30 | 
 31 |     for i in mem_bar:
 32 |         file_path = f"{path}/{i}_pos.npz"
 33 | 
 34 |         # Load the data
 35 |         data = np.load(file_path)
 36 |         all_arrays = []
 37 |         for array_name in data.files:
 38 |             all_arrays.append(data[array_name])
 39 |         data.close()
 40 | 
 41 |         length_results.append(all_arrays.pop())
 42 | 
 43 |         step_attn_list = [[] for _ in range(args.n_step)]
 44 |         step_entropy_list = [[] for _ in range(args.n_step)]
 45 | 
 46 |         for attn_id, array in enumerate(all_arrays):
 47 |             step_id = attn_id // 16
 48 |             array_tensor = torch.tensor(array, device='cuda')
 49 |             repeated_head = array_tensor.repeat(args.n_head // array_tensor.shape[0], args.hidden_dim // array_tensor.shape[1], 1)
 50 |             step_attn_list[step_id].append(repeated_head.detach().cpu().numpy())
 51 | 
 52 |         for step_i in range(len(step_attn_list)):
 53 |             step_attn_all_repeated = np.stack(step_attn_list[step_i], axis=0)
 54 |             step_entropy_list[step_i].append(step_attn_all_repeated.mean((1,2)))
 55 | 
 56 |         entropy_results.append(np.array(step_entropy_list))
 57 | 
 58 |     result = np.concatenate(entropy_results, axis=1)
 59 |     length_results = np.array(length_results)
 60 |     np.save(f'{save_path}_length.npy', length_results)
 61 |     np.save(f'{save_path}.npy', result)
 62 | 
 63 | def merge_save_small(n_count, path, save_path):
 64 | 
 65 |     entropy_results = []
 66 |     length_results = []
 67 | 
 68 |     mem_bar = tqdm(range(n_count), total=n_count)
 69 | 
 70 |     for i in mem_bar:
 71 |         file_path = f"{path}/{i}_pos.npz"
 72 | 
 73 |         # Load the data
 74 |         data = np.load(file_path)
 75 |         all_arrays = []
 76 |         for array_name in data.files:
 77 |             all_arrays.append(data[array_name])
 78 |         data.close()
 79 | 
 80 |         length_results.append(all_arrays.pop())
 81 | 
 82 |         # import pdb ; pdb.set_trace()
 83 | 
 84 |         step_attn_list = [[] for _ in range(args.n_step)]
 85 |         step_entropy_list = [[] for _ in range(args.n_step)]
 86 | 
 87 |         for attn_id, array in enumerate(all_arrays):
 88 |             step_id = attn_id // 16
 89 |             step_attn_list[step_id].append(array)
 90 | 
 91 |         for step_i in range(len(step_attn_list)):
 92 |             step_attn_all_repeated = np.stack(step_attn_list[step_i], axis=0)
 93 |             step_entropy_list[step_i].append(step_attn_all_repeated)
 94 | 
 95 |         entropy_results.append(np.array(step_entropy_list))
 96 | 
 97 |     result = np.concatenate(entropy_results, axis=1)
 98 |     length_results = np.array(length_results)
 99 |     np.save(f'{save_path}.npy', result)
100 |     np.save(f'{save_path}_length.npy', length_results)
101 |     print(result.shape)
102 | 
103 | n_mem = args.n_mem
104 | 
105 | if args.mode == 'merge':
106 |     entropy_every_step_mem = merge_save(n_mem, path=args.merge_input, save_path=args.merge_output)
107 | 
108 | elif args.mode == 'merge_small':
109 |     entropy_every_step_mem = merge_save_small(n_mem, path=args.merge_input, save_path=args.merge_output)
110 | 
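
Both merge functions above rely on the same bookkeeping: the arrays loaded from each `{i}_pos.npz` are treated as a flat, step-major sequence, and `attn_id // 16` assigns every 16 consecutive arrays to one denoising step. The sketch below isolates that grouping on toy data. It is an illustration, not part of the repository: `group_by_step` and `LAYERS_PER_STEP` are hypothetical names, and reading the constant 16 as "records per step" (presumably one per cross-attention layer) is an assumption drawn from the script.

```python
from typing import List, Sequence

# Assumption: 16 records are saved per denoising step, matching the
# `attn_id // 16` expression in read_results.py.
LAYERS_PER_STEP = 16


def group_by_step(records: Sequence[int], n_step: int,
                  layers_per_step: int = LAYERS_PER_STEP) -> List[List[int]]:
    """Bucket a flat, step-major sequence of records into per-step lists."""
    expected = n_step * layers_per_step
    if len(records) != expected:
        # Fail early instead of leaving some step buckets short or empty.
        raise ValueError(f"expected {expected} records, got {len(records)}")
    steps: List[List[int]] = [[] for _ in range(n_step)]
    for attn_id, record in enumerate(records):
        steps[attn_id // layers_per_step].append(record)
    return steps


# Toy input: 2 steps x 16 records, where each record is just its own index.
flat = list(range(2 * LAYERS_PER_STEP))
grouped = group_by_step(flat, n_step=2)
```

The explicit length check is a design choice of this sketch only. The script itself has no such check, so an input with a different number of arrays would leave some per-step lists short or empty before they reach `np.stack`.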


--------------------------------------------------------------------------------
/refactored_classes/MemAttn.py:
--------------------------------------------------------------------------------
  1 | import torch
  2 | import torch.nn as nn
  3 | from diffusers import StableDiffusionPipeline
  4 | from transformers import CLIPImageProcessor, CLIPTextModel, CLIPTokenizer, CLIPVisionModelWithProjection
  5 | from diffusers.loaders import LoraLoaderMixin, TextualInversionLoaderMixin
  6 | from diffusers.utils import scale_lora_layers, USE_PEFT_BACKEND, logging, unscale_lora_layers, replace_example_docstring, deprecate
  7 | from diffusers.models.lora import adjust_lora_scale_text_encoder
  8 | from diffusers.models import AutoencoderKL, UNet2DConditionModel
  9 | from diffusers.schedulers import KarrasDiffusionSchedulers
 10 | from diffusers.pipelines.stable_diffusion.safety_checker import StableDiffusionSafetyChecker
 11 | from typing import Any, Callable, Dict, List, Optional, Union, Tuple
 12 | from diffusers.image_processor import PipelineImageInput
 13 | from diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion import retrieve_timesteps, rescale_noise_cfg
 14 | from diffusers.pipelines.stable_diffusion.pipeline_output import StableDiffusionPipelineOutput
 15 | 
 16 | logger = logging.get_logger(__name__)  # pylint: disable=invalid-name
 17 | 
 18 | EXAMPLE_DOC_STRING = """
 19 |     Examples:
 20 |         ```py
 21 |         >>> import torch
 22 |         >>> from diffusers import StableDiffusionPipeline
 23 | 
 24 |         >>> pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16)
 25 |         >>> pipe = pipe.to("cuda")
 26 | 
 27 |         >>> prompt = "a photo of an astronaut riding a horse on mars"
 28 |         >>> image = pipe(prompt).images[0]
 29 |         ```
 30 | """
 31 | 
 32 | class MemStableDiffusionPipeline(StableDiffusionPipeline):
 33 | 
 34 |     model_cpu_offload_seq = "text_encoder->image_encoder->unet->vae"
 35 |     _optional_components = ["safety_checker", "feature_extractor", "image_encoder"]
 36 |     _exclude_from_cpu_offload = ["safety_checker"]
 37 |     _callback_tensor_inputs = ["latents", "prompt_embeds", "negative_prompt_embeds"]
 38 | 
 39 |     def __init__(
 40 |         self,
 41 |         vae: AutoencoderKL,
 42 |         text_encoder: CLIPTextModel,
 43 |         tokenizer: CLIPTokenizer,
 44 |         unet: UNet2DConditionModel,
 45 |         scheduler: KarrasDiffusionSchedulers,
 46 |         safety_checker: StableDiffusionSafetyChecker,
 47 |         feature_extractor: CLIPImageProcessor,
 48 |         image_encoder: CLIPVisionModelWithProjection = None,
 49 |         requires_safety_checker: bool = True,
 50 |     ):
 51 |         super().__init__(vae=vae, text_encoder=text_encoder, tokenizer=tokenizer, unet=unet,
 52 |                          scheduler=scheduler, safety_checker=safety_checker, feature_extractor=feature_extractor, image_encoder=image_encoder, requires_safety_checker=requires_safety_checker)
 53 | 
 54 |     def encode_prompt(
 55 |         self,
 56 |         prompt,
 57 |         device,
 58 |         num_images_per_prompt,
 59 |         do_classifier_free_guidance,
 60 |         negative_prompt=None,
 61 |         prompt_embeds: Optional[torch.FloatTensor] = None,
 62 |         negative_prompt_embeds: Optional[torch.FloatTensor] = None,
 63 |         lora_scale: Optional[float] = None,
 64 |         clip_skip: Optional[int] = None,
 65 |         args = None,
 66 |     ):
 67 |         r"""
 68 |         Encodes the prompt into text encoder hidden states.
 69 | 
 70 |         Args:
 71 |             prompt (`str` or `List[str]`, *optional*):
 72 |                 prompt to be encoded
 73 |             device: (`torch.device`):
 74 |                 torch device
 75 |             num_images_per_prompt (`int`):
 76 |                 number of images that should be generated per prompt
 77 |             do_classifier_free_guidance (`bool`):
 78 |                 whether to use classifier free guidance or not
 79 |             negative_prompt (`str` or `List[str]`, *optional*):
 80 |                 The prompt or prompts not to guide the image generation. If not defined, one has to pass
 81 |                 `negative_prompt_embeds` instead. Ignored when not using guidance (i.e., ignored if `guidance_scale` is
 82 |                 less than `1`).
 83 |             prompt_embeds (`torch.FloatTensor`, *optional*):
 84 |                 Pre-generated text embeddings. Can be used to easily tweak text inputs, *e.g.* prompt weighting. If not
 85 |                 provided, text embeddings will be generated from `prompt` input argument.
 86 |             negative_prompt_embeds (`torch.FloatTensor`, *optional*):
 87 |                 Pre-generated negative text embeddings. Can be used to easily tweak text inputs, *e.g.* prompt
 88 |                 weighting. If not provided, negative_prompt_embeds will be generated from `negative_prompt` input
 89 |                 argument.
 90 |             lora_scale (`float`, *optional*):
 91 |                 A LoRA scale that will be applied to all LoRA layers of the text encoder if LoRA layers are loaded.
 92 |             clip_skip (`int`, *optional*):
 93 |                 Number of layers to be skipped from CLIP while computing the prompt embeddings. A value of 1 means that
 94 |                 the output of the pre-final layer will be used for computing the prompt embeddings.
 95 |         """
 96 | 
 97 |         # set lora scale so that monkey patched LoRA
 98 |         # function of text encoder can correctly access it
 99 |         if lora_scale is not None and isinstance(self, LoraLoaderMixin):
100 |             self._lora_scale = lora_scale
101 | 
102 |             # dynamically adjust the LoRA scale
103 |             if not USE_PEFT_BACKEND:
104 |                 adjust_lora_scale_text_encoder(self.text_encoder, lora_scale)
105 |             else:
106 |                 scale_lora_layers(self.text_encoder, lora_scale)
107 | 
108 |         if prompt is not None and isinstance(prompt, str):
109 |             batch_size = 1
110 |         elif prompt is not None and isinstance(prompt, list):
111 |             batch_size = len(prompt)
112 |         else:
113 |             batch_size = prompt_embeds.shape[0]
114 | 
115 |         if prompt_embeds is None:
116 |             # textual inversion: process multi-vector tokens if necessary
117 |             if isinstance(self, TextualInversionLoaderMixin):
118 |                 prompt = self.maybe_convert_prompt(prompt, self.tokenizer)
119 | 
120 |             text_inputs = self.tokenizer(
121 |                 prompt,
122 |                 padding="max_length",
123 |                 max_length=self.tokenizer.model_max_length,
124 |                 truncation=True,
125 |                 return_tensors="pt",
126 |             )
127 | 
128 |             temp_inputs_ids = self.tokenizer(
129 |                 prompt,
130 |                 # padding="max_length", # This is only for print. It will not be used for generation (RJ)
131 |                 max_length=self.tokenizer.model_max_length,
132 |                 truncation=True,
133 |                 return_tensors="pt",
134 |             )
135 | 
136 |             print(temp_inputs_ids['input_ids'].shape)
137 |             if args is not None:
138 |                 args.prompt_length = temp_inputs_ids['input_ids'].shape[1]
139 |             # print(text_inputs['attention_mask'])
140 | 
141 |             # print(text_inputs)
142 |             # import pdb ; pdb.set_trace()
143 |             text_input_ids = text_inputs.input_ids
144 |             untruncated_ids = self.tokenizer(prompt, padding="longest", return_tensors="pt").input_ids
145 | 
146 |             if untruncated_ids.shape[-1] >= text_input_ids.shape[-1] and not torch.equal(
147 |                 text_input_ids, untruncated_ids
148 |             ):
149 |                 removed_text = self.tokenizer.batch_decode(
150 |                     untruncated_ids[:, self.tokenizer.model_max_length - 1 : -1]
151 |                 )
152 |                 logger.warning(
153 |                     "The following part of your input was truncated because CLIP can only handle sequences up to"
154 |                     f" {self.tokenizer.model_max_length} tokens: {removed_text}"
155 |                 )
156 | 
157 |             if hasattr(self.text_encoder.config, "use_attention_mask") and self.text_encoder.config.use_attention_mask:
158 |                 attention_mask = text_inputs.attention_mask.to(device)
159 |             else:
160 |                 attention_mask = None
161 |                 # attention_mask = text_inputs.attention_mask.to(device)
162 | 
163 |             # import pdb ; pdb.set_trace()
164 | 
165 |             if clip_skip is None:
166 |                 prompt_embeds = self.text_encoder(text_input_ids.to(device), attention_mask=attention_mask)
167 |                 prompt_embeds = prompt_embeds[0]
168 |             else:
169 |                 prompt_embeds = self.text_encoder(
170 |                     text_input_ids.to(device), attention_mask=attention_mask, output_hidden_states=True
171 |                 )
172 |                 # Access the `hidden_states` first, that contains a tuple of
173 |                 # all the hidden states from the encoder layers. Then index into
174 |                 # the tuple to access the hidden states from the desired layer.
175 |                 prompt_embeds = prompt_embeds[-1][-(clip_skip + 1)]
176 |                 # We also need to apply the final LayerNorm here to not mess with the
177 |                 # representations. The `last_hidden_states` that we typically use for
178 |                 # obtaining the final prompt representations passes through the LayerNorm
179 |                 # layer.
180 |                 prompt_embeds = self.text_encoder.text_model.final_layer_norm(prompt_embeds)
181 | 
182 |         if self.text_encoder is not None:
183 |             prompt_embeds_dtype = self.text_encoder.dtype
184 |         elif self.unet is not None:
185 |             prompt_embeds_dtype = self.unet.dtype
186 |         else:
187 |             prompt_embeds_dtype = prompt_embeds.dtype
188 | 
189 |         prompt_embeds = prompt_embeds.to(dtype=prompt_embeds_dtype, device=device)
190 | 
191 |         bs_embed, seq_len, _ = prompt_embeds.shape
192 |         # duplicate text embeddings for each generation per prompt, using mps friendly method
193 |         prompt_embeds = prompt_embeds.repeat(1, num_images_per_prompt, 1)
194 |         prompt_embeds = prompt_embeds.view(bs_embed * num_images_per_prompt, seq_len, -1)
195 | 
196 |         # get unconditional embeddings for classifier free guidance
197 |         if do_classifier_free_guidance and negative_prompt_embeds is None:
198 |             uncond_tokens: List[str]
199 |             if negative_prompt is None:
200 |                 uncond_tokens = [""] * batch_size
201 |             elif prompt is not None and type(prompt) is not type(negative_prompt):
202 |                 raise TypeError(
203 |                     f"`negative_prompt` should be the same type as `prompt`, but got {type(negative_prompt)} !="
204 |                     f" {type(prompt)}."
205 |                 )
206 |             elif isinstance(negative_prompt, str):
207 |                 uncond_tokens = [negative_prompt]
208 |             elif batch_size != len(negative_prompt):
209 |                 raise ValueError(
210 |                     f"`negative_prompt`: {negative_prompt} has batch size {len(negative_prompt)}, but `prompt`:"
211 |                     f" {prompt} has batch size {batch_size}. Please make sure that passed `negative_prompt` matches"
212 |                     " the batch size of `prompt`."
213 |                 )
214 |             else:
215 |                 uncond_tokens = negative_prompt
216 | 
217 |             # textual inversion: process multi-vector tokens if necessary
218 |             if isinstance(self, TextualInversionLoaderMixin):
219 |                 uncond_tokens = self.maybe_convert_prompt(uncond_tokens, self.tokenizer)
220 | 
221 |             max_length = prompt_embeds.shape[1]
222 |             uncond_input = self.tokenizer(
223 |                 uncond_tokens,
224 |                 padding="max_length",
225 |                 max_length=max_length,
226 |                 truncation=True,
227 |                 return_tensors="pt",
228 |             )
229 | 
230 |             if hasattr(self.text_encoder.config, "use_attention_mask") and self.text_encoder.config.use_attention_mask:
231 |                 attention_mask = uncond_input.attention_mask.to(device)
232 |             else:
233 |                 attention_mask = None
234 | 
235 |             negative_prompt_embeds = self.text_encoder(
236 |                 uncond_input.input_ids.to(device),
237 |                 attention_mask=attention_mask,
238 |             )
239 |             negative_prompt_embeds = negative_prompt_embeds[0]
240 | 
241 |         if do_classifier_free_guidance:
242 |             # duplicate unconditional embeddings for each generation per prompt, using mps friendly method
243 |             seq_len = negative_prompt_embeds.shape[1]
244 | 
245 |             negative_prompt_embeds = negative_prompt_embeds.to(dtype=prompt_embeds_dtype, device=device)
246 | 
247 |             negative_prompt_embeds = negative_prompt_embeds.repeat(1, num_images_per_prompt, 1)
248 |             negative_prompt_embeds = negative_prompt_embeds.view(batch_size * num_images_per_prompt, seq_len, -1)
249 | 
250 |         if isinstance(self, LoraLoaderMixin) and USE_PEFT_BACKEND:
251 |             # Retrieve the original scale by scaling back the LoRA layers
252 |             unscale_lora_layers(self.text_encoder, lora_scale)
253 | 
254 |         return prompt_embeds, negative_prompt_embeds
255 |     
256 |     @torch.no_grad()
257 |     @replace_example_docstring(EXAMPLE_DOC_STRING)
258 |     def __call__(
259 |         self,
260 |         prompt: Union[str, List[str]] = None,
261 |         height: Optional[int] = None,
262 |         width: Optional[int] = None,
263 |         num_inference_steps: int = 50,
264 |         timesteps: List[int] = None,
265 |         guidance_scale: float = 7.5,
266 |         negative_prompt: Optional[Union[str, List[str]]] = None,
267 |         num_images_per_prompt: Optional[int] = 1,
268 |         eta: float = 0.0,
269 |         generator: Optional[Union[torch.Generator, List[torch.Generator]]] = None,
270 |         latents: Optional[torch.FloatTensor] = None,
271 |         prompt_embeds: Optional[torch.FloatTensor] = None,
272 |         negative_prompt_embeds: Optional[torch.FloatTensor] = None,
273 |         ip_adapter_image: Optional[PipelineImageInput] = None,
274 |         output_type: Optional[str] = "pil",
275 |         return_dict: bool = True,
276 |         cross_attention_kwargs: Optional[Dict[str, Any]] = None,
277 |         guidance_rescale: float = 0.0,
278 |         clip_skip: Optional[int] = None,
279 |         callback_on_step_end: Optional[Callable[[int, int, Dict], None]] = None,
280 |         callback_on_step_end_tensor_inputs: List[str] = ["latents"],
281 |         save_prefix="heatmap/test",
282 |         args = None,
283 |         **kwargs,
284 |     ):
285 |         r"""
286 |         The call function to the pipeline for generation.
287 | 
288 |         Args:
289 |             prompt (`str` or `List[str]`, *optional*):
290 |                 The prompt or prompts to guide image generation. If not defined, you need to pass `prompt_embeds`.
291 |             height (`int`, *optional*, defaults to `self.unet.config.sample_size * self.vae_scale_factor`):
292 |                 The height in pixels of the generated image.
293 |             width (`int`, *optional*, defaults to `self.unet.config.sample_size * self.vae_scale_factor`):
294 |                 The width in pixels of the generated image.
295 |             num_inference_steps (`int`, *optional*, defaults to 50):
296 |                 The number of denoising steps. More denoising steps usually lead to a higher quality image at the
297 |                 expense of slower inference.
298 |             timesteps (`List[int]`, *optional*):
299 |                 Custom timesteps to use for the denoising process with schedulers which support a `timesteps` argument
300 |                 in their `set_timesteps` method. If not defined, the default behavior when `num_inference_steps` is
301 |                 passed will be used. Must be in descending order.
302 |             guidance_scale (`float`, *optional*, defaults to 7.5):
303 |                 A higher guidance scale value encourages the model to generate images closely linked to the text
304 |                 `prompt` at the expense of lower image quality. Guidance scale is enabled when `guidance_scale > 1`.
305 |             negative_prompt (`str` or `List[str]`, *optional*):
306 |                 The prompt or prompts to guide what to not include in image generation. If not defined, you need to
307 |                 pass `negative_prompt_embeds` instead. Ignored when not using guidance (`guidance_scale <= 1`).
308 |             num_images_per_prompt (`int`, *optional*, defaults to 1):
309 |                 The number of images to generate per prompt.
310 |             eta (`float`, *optional*, defaults to 0.0):
311 |                 Corresponds to parameter eta (η) from the [DDIM](https://arxiv.org/abs/2010.02502) paper. Only applies
312 |                 to the [`~schedulers.DDIMScheduler`], and is ignored in other schedulers.
313 |             generator (`torch.Generator` or `List[torch.Generator]`, *optional*):
314 |                 A [`torch.Generator`](https://pytorch.org/docs/stable/generated/torch.Generator.html) to make
315 |                 generation deterministic.
316 |             latents (`torch.FloatTensor`, *optional*):
317 |                 Pre-generated noisy latents sampled from a Gaussian distribution, to be used as inputs for image
318 |                 generation. Can be used to tweak the same generation with different prompts. If not provided, a latents
319 |                 tensor is generated by sampling using the supplied random `generator`.
320 |             prompt_embeds (`torch.FloatTensor`, *optional*):
321 |                 Pre-generated text embeddings. Can be used to easily tweak text inputs (prompt weighting). If not
322 |                 provided, text embeddings are generated from the `prompt` input argument.
323 |             negative_prompt_embeds (`torch.FloatTensor`, *optional*):
324 |                 Pre-generated negative text embeddings. Can be used to easily tweak text inputs (prompt weighting). If
325 |                 not provided, `negative_prompt_embeds` are generated from the `negative_prompt` input argument.
326 |             ip_adapter_image (`PipelineImageInput`, *optional*): Optional image input to work with IP Adapters.
327 |             output_type (`str`, *optional*, defaults to `"pil"`):
328 |                 The output format of the generated image. Choose between `PIL.Image` or `np.array`.
329 |             return_dict (`bool`, *optional*, defaults to `True`):
330 |                 Whether or not to return a [`~pipelines.stable_diffusion.StableDiffusionPipelineOutput`] instead of a
331 |                 plain tuple.
332 |             cross_attention_kwargs (`dict`, *optional*):
333 |                 A kwargs dictionary that if specified is passed along to the [`AttentionProcessor`] as defined in
334 |                 [`self.processor`](https://github.com/huggingface/diffusers/blob/main/src/diffusers/models/attention_processor.py).
335 |             guidance_rescale (`float`, *optional*, defaults to 0.0):
336 |                 Guidance rescale factor from [Common Diffusion Noise Schedules and Sample Steps are
337 |                 Flawed](https://arxiv.org/pdf/2305.08891.pdf). Guidance rescale factor should fix overexposure when
338 |                 using zero terminal SNR.
339 |             clip_skip (`int`, *optional*):
340 |                 Number of layers to be skipped from CLIP while computing the prompt embeddings. A value of 1 means that
341 |                 the output of the pre-final layer will be used for computing the prompt embeddings.
342 |             callback_on_step_end (`Callable`, *optional*):
343 |                 A function that is called at the end of each denoising step during inference. The function is called
344 |                 with the following arguments: `callback_on_step_end(self: DiffusionPipeline, step: int, timestep: int,
345 |                 callback_kwargs: Dict)`. `callback_kwargs` will include a list of all tensors as specified by
346 |                 `callback_on_step_end_tensor_inputs`.
347 |             callback_on_step_end_tensor_inputs (`List`, *optional*):
348 |                 The list of tensor inputs for the `callback_on_step_end` function. The tensors specified in the list
349 |                 will be passed as `callback_kwargs` argument. You will only be able to include variables listed in the
350 |                 `._callback_tensor_inputs` attribute of your pipeline class.
351 | 
352 |         Examples:
353 | 
354 |         Returns:
355 |             [`~pipelines.stable_diffusion.StableDiffusionPipelineOutput`] or `tuple`:
356 |                 If `return_dict` is `True`, [`~pipelines.stable_diffusion.StableDiffusionPipelineOutput`] is returned,
357 |                 otherwise a `tuple` is returned where the first element is a list with the generated images and the
358 |                 second element is a list of `bool`s indicating whether the corresponding generated image contains
359 |                 "not-safe-for-work" (nsfw) content.
360 |         """
361 | 
362 |         callback = kwargs.pop("callback", None)
363 |         callback_steps = kwargs.pop("callback_steps", None)
364 | 
365 |         if callback is not None:
366 |             deprecate(
367 |                 "callback",
368 |                 "1.0.0",
369 |                 "Passing `callback` as an input argument to `__call__` is deprecated, consider using `callback_on_step_end`",
370 |             )
371 |         if callback_steps is not None:
372 |             deprecate(
373 |                 "callback_steps",
374 |                 "1.0.0",
375 |                 "Passing `callback_steps` as an input argument to `__call__` is deprecated, consider using `callback_on_step_end`",
376 |             )
377 | 
378 |         # 0. Default height and width to unet
379 |         height = height or self.unet.config.sample_size * self.vae_scale_factor
380 |         width = width or self.unet.config.sample_size * self.vae_scale_factor
381 |         # to deal with lora scaling and other possible forward hooks
382 | 
383 |         # 1. Check inputs. Raise error if not correct
384 |         self.check_inputs(
385 |             prompt,
386 |             height,
387 |             width,
388 |             callback_steps,
389 |             negative_prompt,
390 |             prompt_embeds,
391 |             negative_prompt_embeds,
392 |             callback_on_step_end_tensor_inputs,
393 |         )
394 | 
395 |         self._guidance_scale = guidance_scale
396 |         self._guidance_rescale = guidance_rescale
397 |         self._clip_skip = clip_skip
398 |         self._cross_attention_kwargs = cross_attention_kwargs
399 |         self._interrupt = False
400 | 
401 |         # 2. Define call parameters
402 |         if prompt is not None and isinstance(prompt, str):
403 |             batch_size = 1
404 |         elif prompt is not None and isinstance(prompt, list):
405 |             batch_size = len(prompt)
406 |         else:
407 |             batch_size = prompt_embeds.shape[0]
408 | 
409 |         device = self._execution_device
410 | 
411 |         # 3. Encode input prompt
412 |         lora_scale = (
413 |             self.cross_attention_kwargs.get("scale", None) if self.cross_attention_kwargs is not None else None
414 |         )
415 | 
416 |         prompt_embeds, negative_prompt_embeds = self.encode_prompt(
417 |             prompt,
418 |             device,
419 |             num_images_per_prompt,
420 |             self.do_classifier_free_guidance,
421 |             negative_prompt,
422 |             prompt_embeds=prompt_embeds,
423 |             negative_prompt_embeds=negative_prompt_embeds,
424 |             lora_scale=lora_scale,
425 |             clip_skip=self.clip_skip,
426 |             args=args,
427 |         )
428 | 
429 |         # import pdb ; pdb.set_trace()
430 | 
431 |         # For classifier free guidance, we need to do two forward passes.
432 |         # Here we concatenate the unconditional and text embeddings into a single batch
433 |         # to avoid doing two forward passes
434 |         if self.do_classifier_free_guidance:
435 |             prompt_embeds = torch.cat([negative_prompt_embeds, prompt_embeds])
436 | 
437 |         if ip_adapter_image is not None:
438 |             image_embeds = self.prepare_ip_adapter_image_embeds(
439 |                 ip_adapter_image, device, batch_size * num_images_per_prompt
440 |             )
441 | 
442 |         # 4. Prepare timesteps
443 |         timesteps, num_inference_steps = retrieve_timesteps(self.scheduler, num_inference_steps, device, timesteps)
444 | 
445 |         # 5. Prepare latent variables
446 |         num_channels_latents = self.unet.config.in_channels
447 |         latents = self.prepare_latents(
448 |             batch_size * num_images_per_prompt,
449 |             num_channels_latents,
450 |             height,
451 |             width,
452 |             prompt_embeds.dtype,
453 |             device,
454 |             generator,
455 |             latents,
456 |         )
457 | 
458 |         # 6. Prepare extra step kwargs. TODO: Logic should ideally just be moved out of the pipeline
459 |         extra_step_kwargs = self.prepare_extra_step_kwargs(generator, eta)
460 | 
461 |         # 6.1 Add image embeds for IP-Adapter
462 |         added_cond_kwargs = {"image_embeds": image_embeds} if ip_adapter_image is not None else None
463 | 
464 |         # 6.2 Optionally get Guidance Scale Embedding
465 |         timestep_cond = None
466 |         if self.unet.config.time_cond_proj_dim is not None:
467 |             guidance_scale_tensor = torch.tensor(self.guidance_scale - 1).repeat(batch_size * num_images_per_prompt)
468 |             timestep_cond = self.get_guidance_scale_embedding(
469 |                 guidance_scale_tensor, embedding_dim=self.unet.config.time_cond_proj_dim
470 |             ).to(device=device, dtype=latents.dtype)
471 | 
472 |         # 7. Denoising loop
473 |         num_warmup_steps = len(timesteps) - num_inference_steps * self.scheduler.order
474 |         self._num_timesteps = len(timesteps)
475 |         step_counter = 0
476 |         attn_weight_list_numpy = []
477 |         with self.progress_bar(total=num_inference_steps) as progress_bar:
478 |             for i, t in enumerate(timesteps):
479 |                 if self.interrupt:
480 |                     continue
481 | 
482 |                 # expand the latents if we are doing classifier free guidance
483 |                 latent_model_input = torch.cat([latents] * 2) if self.do_classifier_free_guidance else latents
484 |                 latent_model_input = self.scheduler.scale_model_input(latent_model_input, t)
485 | 
486 |                 # if step_counter < 30:
487 |                 #     args.miti_mem = True
488 |                 # else:
489 |                 #     args.miti_mem = False
490 | 
491 |                 # predict the noise residual
492 |                 # import pdb ; pdb.set_trace()
493 |                 # if args.miss_token_debug:
494 |                 #     # print(args.prompt_length)
495 |                 #     # prompt_embeds[1, 1:args.prompt_length-1] = 0
496 |                 #     prompt_embeds[1, args.prompt_length-1:] = 0
497 |                 #     # prompt_embeds[1, 20:] = 0
498 |                 noise_pred_dict = self.unet(
499 |                     latent_model_input,
500 |                     t,
501 |                     encoder_hidden_states=prompt_embeds,
502 |                     timestep_cond=timestep_cond,
503 |                     cross_attention_kwargs=self.cross_attention_kwargs,
504 |                     added_cond_kwargs=added_cond_kwargs,
505 |                     return_dict=False,
506 |                     return_attention=True,
507 |                     miti_mem=args.miti_mem,
508 |                     args=args,
509 |                 )
510 | 
511 |                 attn_weight_list_collect = noise_pred_dict[1]
512 |                 noise_pred = noise_pred_dict[0]
513 | 
514 |                 attn_weight_list_numpy += attn_weight_list_collect
515 | 
516 |                 # if args.plot:
517 | 
518 |                 #     import numpy as np
519 |                 #     import seaborn as sns
520 |                 #     import matplotlib.pyplot as plt
521 | 
522 |                 #     # if step_counter % 10 == 0:
523 |                 #     print(step_counter)
524 |                 #     if step_counter in [0, 40]:
525 |                 #         print("got: ")
526 |                 #         for item_i, item in enumerate(attn_weight_list_collect):
527 |                 #             # if item_i not in [3, 6]:
528 |                 #             #     continue
529 |                 #             if len(args.layers_to_plot) != 0:
530 |                 #                 if item_i not in args.layers_to_plot:
531 |                 #                     continue
532 |                 #             print(item.shape)
533 | 
534 |                 #             data = attn_weight_list_collect[item_i][1:, :, :, :30]
535 | 
536 |                 #             # import pdb ; pdb.set_trace()
537 | 
538 |                 #             # Reshape the array to a new shape that consolidates the first two dimensions
539 |                 #             reshaped_data = data.reshape(-1, *data.shape[2:])
540 | 
541 |                 #             # Set up the matplotlib figure
542 |                 #             if reshaped_data.shape[0] == 8:
543 |                 #                 fig, axes = plt.subplots(2, 4, figsize=(11, 5))  # Adjust the size as needed
544 |                 #             elif reshaped_data.shape[0] == 5:
545 |                 #                 fig, axes = plt.subplots(1, 5, figsize=(11, 3))  # Adjust the size as needed
546 |                 #             elif reshaped_data.shape[0] == 10:
547 |                 #                 fig, axes = plt.subplots(2, 5, figsize=(11, 5))  # Adjust the size as needed
548 |                 #             elif reshaped_data.shape[0] == 20:
549 |                 #                 fig, axes = plt.subplots(4, 5, figsize=(11, 10))  # Adjust the size as needed
550 |                 #             else:
551 |                 #                 raise ValueError("Wrong shape")
552 | 
553 |                 #             # Iterate over the reshaped data and plot each heatmap
554 |                 #             for i, ax in enumerate(axes.flatten()):
555 |                 #                 sns.heatmap(reshaped_data[i], ax=ax, vmin=0, vmax=1)
556 | 
557 |                 #                 temp = np.sum(reshaped_data[i], axis=0)
558 |                 #                 argmax_temp = np.argmax(temp[1:]) + 1
559 |                 #                 max_temp = np.max(temp[1:])
560 |                 #                 # print(temp, max_temp)
561 |                 #                 # print(argmax_temp)
562 |                 #                 # import pdb ; pdb.set_trace()
563 | 
564 |                 #                 # ax.set_title(f"Heatmap {i+1}")
565 |                 #                 # ax.set_title(f"({argmax_temp}):{max_temp:.2f}")
566 | 
567 |                 #             plt.tight_layout()
568 |                 #             plt.savefig(f'{save_prefix}/step{step_counter}_layer{item_i}.png')
569 |                 #             print("saved at: ", f'{save_prefix}/step{step_counter}_layer{item_i}.png')
570 |                 #             plt.close()
571 |                 step_counter += 1
572 | 
573 |                 # perform guidance
574 |                 if self.do_classifier_free_guidance:
575 |                     noise_pred_uncond, noise_pred_text = noise_pred.chunk(2)
576 |                     noise_pred = noise_pred_uncond + self.guidance_scale * (noise_pred_text - noise_pred_uncond)
577 |                 #     print("check")
578 |                 # import pdb ; pdb.set_trace()
579 | 
580 |                 if self.do_classifier_free_guidance and self.guidance_rescale > 0.0:
581 |                     # Based on 3.4. in https://arxiv.org/pdf/2305.08891.pdf
582 |                     noise_pred = rescale_noise_cfg(noise_pred, noise_pred_text, guidance_rescale=self.guidance_rescale)
583 | 
584 |                 # compute the previous noisy sample x_t -> x_t-1
585 |                 latents = self.scheduler.step(noise_pred, t, latents, **extra_step_kwargs, return_dict=False)[0]
586 | 
587 |                 if callback_on_step_end is not None:
588 |                     callback_kwargs = {}
589 |                     for k in callback_on_step_end_tensor_inputs:
590 |                         callback_kwargs[k] = locals()[k]
591 |                     callback_outputs = callback_on_step_end(self, i, t, callback_kwargs)
592 | 
593 |                     latents = callback_outputs.pop("latents", latents)
594 |                     prompt_embeds = callback_outputs.pop("prompt_embeds", prompt_embeds)
595 |                     negative_prompt_embeds = callback_outputs.pop("negative_prompt_embeds", negative_prompt_embeds)
596 | 
597 |                 # call the callback, if provided
598 |                 if i == len(timesteps) - 1 or ((i + 1) > num_warmup_steps and (i + 1) % self.scheduler.order == 0):
599 |                     progress_bar.update()
600 |                     if callback is not None and i % callback_steps == 0:
601 |                         step_idx = i // getattr(self.scheduler, "order", 1)
602 |                         callback(step_idx, t, latents)
603 | 
604 |         # Save the attention scores of the text-conditioned half of the batch (index 1), followed by the prompt length
605 |         if args.save_numpy:
606 |             import numpy as np
607 |             positive_prompt_attn_scores = []
608 |             save_whole_numpy = False
609 |             for _id in range(len(attn_weight_list_numpy)):
610 |                 if save_whole_numpy:
611 |                     positive_prompt_attn_scores.append(attn_weight_list_numpy[_id][1, :, :, :])
612 |                 else:
613 |                     positive_prompt_attn_scores.append(attn_weight_list_numpy[_id][1, :, :, :].mean(1))
614 |             positive_prompt_attn_scores.append(args.prompt_length)
615 |             # np.savez(f'{args.save_prefix_numpy}/{args.prompt_id}.npz', *attn_weight_list_numpy)
616 |             np.savez(f'{args.save_prefix_numpy}/{args.prompt_id}_pos.npz', *positive_prompt_attn_scores)
617 | 
618 |         if not output_type == "latent":
619 |             image = self.vae.decode(latents / self.vae.config.scaling_factor, return_dict=False, generator=generator)[
620 |                 0
621 |             ]
622 |             image, has_nsfw_concept = self.run_safety_checker(image, device, prompt_embeds.dtype)
623 |         else:
624 |             image = latents
625 |             has_nsfw_concept = None
626 | 
627 |         if has_nsfw_concept is None:
628 |             do_denormalize = [True] * image.shape[0]
629 |         else:
630 |             do_denormalize = [not has_nsfw for has_nsfw in has_nsfw_concept]
631 | 
632 |         image = self.image_processor.postprocess(image, output_type=output_type, do_denormalize=do_denormalize)
633 | 
634 |         # Offload all models
635 |         self.maybe_free_model_hooks()
636 | 
637 |         if not return_dict:
638 |             return (image, has_nsfw_concept)
639 | 
640 |         return StableDiffusionPipelineOutput(images=image, nsfw_content_detected=has_nsfw_concept)
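
The `args.save_numpy` branch above writes one positional array per collected attention tensor and then appends `args.prompt_length` as the final entry, so the resulting `*_pos.npz` file stores its contents under NumPy's default keys `arr_0`, `arr_1`, and so on. The following is a minimal sketch of writing and reading a file with that layout. The helper names `save_positive_scores` and `load_positive_scores`, the array shapes, and the file name are hypothetical and chosen only for illustration; the real shapes depend on the UNet layers and on what the refactored attention classes return.

```python
import os
import tempfile

import numpy as np


def save_positive_scores(path, per_call_scores, prompt_length):
    """Mimic the save step: one array per attention call, prompt length appended last."""
    arrays = list(per_call_scores) + [prompt_length]
    np.savez(path, *arrays)  # positional arrays are stored as arr_0, arr_1, ...


def load_positive_scores(path):
    """Return (list of attention arrays, prompt_length) from a `*_pos.npz`-style file."""
    with np.load(path) as data:
        # Sort keys numerically so that arr_10 does not come before arr_2.
        keys = sorted(data.files, key=lambda k: int(k.split("_")[1]))
        arrays = [data[k] for k in keys]
    prompt_length = int(arrays[-1])  # the last entry is the scalar prompt length
    return arrays[:-1], prompt_length


# Hypothetical data: 12 attention calls, 8 heads, 77 text tokens each.
rng = np.random.default_rng(0)
fake_scores = [rng.random((8, 77)).astype(np.float32) for _ in range(12)]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "0_pos.npz")
    save_positive_scores(path, fake_scores, prompt_length=9)
    scores, prompt_length = load_positive_scores(path)

print(len(scores), scores[0].shape, prompt_length)
```

Sorting the keys numerically matters once more than ten arrays are saved, because a plain string sort would place `arr_10` before `arr_2` and scramble the step and layer order.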


--------------------------------------------------------------------------------
/refactored_classes/refactored_attention.py:
--------------------------------------------------------------------------------
  1 | # Copyright 2023 The HuggingFace Team. All rights reserved.
  2 | #
  3 | # Licensed under the Apache License, Version 2.0 (the "License");
  4 | # you may not use this file except in compliance with the License.
  5 | # You may obtain a copy of the License at
  6 | #
  7 | #     http://www.apache.org/licenses/LICENSE-2.0
  8 | #
  9 | # Unless required by applicable law or agreed to in writing, software
 10 | # distributed under the License is distributed on an "AS IS" BASIS,
 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 12 | # See the License for the specific language governing permissions and
 13 | # limitations under the License.
 14 | from typing import Any, Dict, Optional
 15 | 
 16 | import torch
 17 | import torch.nn.functional as F
 18 | from torch import nn
 19 | 
 20 | from diffusers.utils import USE_PEFT_BACKEND
 21 | from diffusers.utils.torch_utils import maybe_allow_in_graph
 22 | from diffusers.models.activations import GEGLU, GELU, ApproximateGELU
 23 | from .refactored_attention_processor import Attention
 24 | from diffusers.models.embeddings import SinusoidalPositionalEmbedding
 25 | from diffusers.models.lora import LoRACompatibleLinear
 26 | from diffusers.models.normalization import AdaLayerNorm, AdaLayerNormContinuous, AdaLayerNormZero, RMSNorm
 27 | 
 28 | 
 29 | def _chunked_feed_forward(
 30 |     ff: nn.Module, hidden_states: torch.Tensor, chunk_dim: int, chunk_size: int, lora_scale: Optional[float] = None
 31 | ):
 32 |     # "feed_forward_chunk_size" can be used to save memory
 33 |     if hidden_states.shape[chunk_dim] % chunk_size != 0:
 34 |         raise ValueError(
 35 |             f"`hidden_states` dimension to be chunked: {hidden_states.shape[chunk_dim]} has to be divisible by chunk size: {chunk_size}. Make sure to set an appropriate `chunk_size` when calling `unet.enable_forward_chunking`."
 36 |         )
 37 | 
 38 |     num_chunks = hidden_states.shape[chunk_dim] // chunk_size
 39 |     if lora_scale is None:
 40 |         ff_output = torch.cat(
 41 |             [ff(hid_slice) for hid_slice in hidden_states.chunk(num_chunks, dim=chunk_dim)],
 42 |             dim=chunk_dim,
 43 |         )
 44 |     else:
 45 |         # TODO(Patrick): LoRA scale can be removed once PEFT refactor is complete
 46 |         ff_output = torch.cat(
 47 |             [ff(hid_slice, scale=lora_scale) for hid_slice in hidden_states.chunk(num_chunks, dim=chunk_dim)],
 48 |             dim=chunk_dim,
 49 |         )
 50 | 
 51 |     return ff_output
 52 | 
 53 | 
 54 | @maybe_allow_in_graph
 55 | class GatedSelfAttentionDense(nn.Module):
 56 |     r"""
 57 |     A gated self-attention dense layer that combines visual features and object features.
 58 | 
 59 |     Parameters:
 60 |         query_dim (`int`): The number of channels in the query.
 61 |         context_dim (`int`): The number of channels in the context.
 62 |         n_heads (`int`): The number of heads to use for attention.
 63 |         d_head (`int`): The number of channels in each head.
 64 |     """
 65 | 
 66 |     def __init__(self, query_dim: int, context_dim: int, n_heads: int, d_head: int):
 67 |         super().__init__()
 68 | 
 69 |         # a linear projection is needed because the visual and object features are concatenated
 70 |         self.linear = nn.Linear(context_dim, query_dim)
 71 | 
 72 |         self.attn = Attention(query_dim=query_dim, heads=n_heads, dim_head=d_head)
 73 |         self.ff = FeedForward(query_dim, activation_fn="geglu")
 74 | 
 75 |         self.norm1 = nn.LayerNorm(query_dim)
 76 |         self.norm2 = nn.LayerNorm(query_dim)
 77 | 
 78 |         self.register_parameter("alpha_attn", nn.Parameter(torch.tensor(0.0)))
 79 |         self.register_parameter("alpha_dense", nn.Parameter(torch.tensor(0.0)))
 80 | 
 81 |         self.enabled = True
 82 | 
 83 |     def forward(self, x: torch.Tensor, objs: torch.Tensor) -> torch.Tensor:
 84 |         if not self.enabled:
 85 |             return x
 86 | 
 87 |         n_visual = x.shape[1]
 88 |         objs = self.linear(objs)
 89 | 
 90 |         x = x + self.alpha_attn.tanh() * self.attn(self.norm1(torch.cat([x, objs], dim=1)))[:, :n_visual, :]
 91 |         x = x + self.alpha_dense.tanh() * self.ff(self.norm2(x))
 92 | 
 93 |         return x
 94 | 
 95 | 
 96 | @maybe_allow_in_graph
 97 | class BasicTransformerBlock(nn.Module):
 98 |     r"""
 99 |     A basic Transformer block.
100 | 
101 |     Parameters:
102 |         dim (`int`): The number of channels in the input and output.
103 |         num_attention_heads (`int`): The number of heads to use for multi-head attention.
104 |         attention_head_dim (`int`): The number of channels in each head.
105 |         dropout (`float`, *optional*, defaults to 0.0): The dropout probability to use.
106 |         cross_attention_dim (`int`, *optional*): The size of the encoder_hidden_states vector for cross attention.
107 |         activation_fn (`str`, *optional*, defaults to `"geglu"`): Activation function to be used in feed-forward.
108 |         num_embeds_ada_norm (`int`, *optional*):
109 |             The number of diffusion steps used during training. See `Transformer2DModel`.
110 |         attention_bias (`bool`, *optional*, defaults to `False`):
111 |             Configure if the attentions should contain a bias parameter.
112 |         only_cross_attention (`bool`, *optional*):
113 |             Whether to use only cross-attention layers. In this case two cross attention layers are used.
114 |         double_self_attention (`bool`, *optional*):
115 |             Whether to use two self-attention layers. In this case no cross attention layers are used.
116 |         upcast_attention (`bool`, *optional*):
117 |             Whether to upcast the attention computation to float32. This is useful for mixed precision training.
118 |         norm_elementwise_affine (`bool`, *optional*, defaults to `True`):
119 |             Whether to use learnable elementwise affine parameters for normalization.
120 |         norm_type (`str`, *optional*, defaults to `"layer_norm"`):
121 |             The normalization layer to use. Can be `"layer_norm"`, `"ada_norm"` or `"ada_norm_zero"`.
122 |         final_dropout (`bool`, *optional*, defaults to `False`):
123 |             Whether to apply a final dropout after the last feed-forward layer.
124 |         attention_type (`str`, *optional*, defaults to `"default"`):
125 |             The type of attention to use. Can be `"default"` or `"gated"` or `"gated-text-image"`.
126 |         positional_embeddings (`str`, *optional*, defaults to `None`):
127 |             The type of positional embeddings to apply to the hidden states. Only `"sinusoidal"` is handled here.
128 |         num_positional_embeddings (`int`, *optional*, defaults to `None`):
129 |             The maximum number of positional embeddings to apply.
130 |     """
131 | 
132 |     def __init__(
133 |         self,
134 |         dim: int,
135 |         num_attention_heads: int,
136 |         attention_head_dim: int,
137 |         dropout=0.0,
138 |         cross_attention_dim: Optional[int] = None,
139 |         activation_fn: str = "geglu",
140 |         num_embeds_ada_norm: Optional[int] = None,
141 |         attention_bias: bool = False,
142 |         only_cross_attention: bool = False,
143 |         double_self_attention: bool = False,
144 |         upcast_attention: bool = False,
145 |         norm_elementwise_affine: bool = True,
146 |         norm_type: str = "layer_norm",  # 'layer_norm', 'ada_norm', 'ada_norm_zero', 'ada_norm_single', 'layer_norm_i2vgen'
147 |         norm_eps: float = 1e-5,
148 |         final_dropout: bool = False,
149 |         attention_type: str = "default",
150 |         positional_embeddings: Optional[str] = None,
151 |         num_positional_embeddings: Optional[int] = None,
152 |         ada_norm_continous_conditioning_embedding_dim: Optional[int] = None,
153 |         ada_norm_bias: Optional[int] = None,
154 |         ff_inner_dim: Optional[int] = None,
155 |         ff_bias: bool = True,
156 |         attention_out_bias: bool = True,
157 |     ):
158 |         super().__init__()
159 |         self.only_cross_attention = only_cross_attention
160 | 
161 |         if norm_type in ("ada_norm", "ada_norm_zero") and num_embeds_ada_norm is None:
162 |             raise ValueError(
163 |                 f"`norm_type` is set to {norm_type}, but `num_embeds_ada_norm` is not defined. Please make sure to"
164 |                 f" define `num_embeds_ada_norm` if setting `norm_type` to {norm_type}."
165 |             )
166 | 
167 |         self.norm_type = norm_type
168 |         self.num_embeds_ada_norm = num_embeds_ada_norm
169 | 
170 |         if positional_embeddings and (num_positional_embeddings is None):
171 |             raise ValueError(
172 |                 "If `positional_embeddings` type is defined, `num_positional_embeddings` must also be defined."
173 |             )
174 | 
175 |         if positional_embeddings == "sinusoidal":
176 |             self.pos_embed = SinusoidalPositionalEmbedding(dim, max_seq_length=num_positional_embeddings)
177 |         else:
178 |             self.pos_embed = None
179 | 
180 |         # Define 3 blocks. Each block has its own normalization layer.
181 |         # 1. Self-Attn
182 |         if norm_type == "ada_norm":
183 |             self.norm1 = AdaLayerNorm(dim, num_embeds_ada_norm)
184 |         elif norm_type == "ada_norm_zero":
185 |             self.norm1 = AdaLayerNormZero(dim, num_embeds_ada_norm)
186 |         elif norm_type == "ada_norm_continuous":
187 |             self.norm1 = AdaLayerNormContinuous(
188 |                 dim,
189 |                 ada_norm_continous_conditioning_embedding_dim,
190 |                 norm_elementwise_affine,
191 |                 norm_eps,
192 |                 ada_norm_bias,
193 |                 "rms_norm",
194 |             )
195 |         else:
196 |             self.norm1 = nn.LayerNorm(dim, elementwise_affine=norm_elementwise_affine, eps=norm_eps)
197 | 
198 |         self.attn1 = Attention(
199 |             query_dim=dim,
200 |             heads=num_attention_heads,
201 |             dim_head=attention_head_dim,
202 |             dropout=dropout,
203 |             bias=attention_bias,
204 |             cross_attention_dim=cross_attention_dim if only_cross_attention else None,
205 |             upcast_attention=upcast_attention,
206 |             out_bias=attention_out_bias,
207 |         )
208 | 
209 |         # 2. Cross-Attn
210 |         if cross_attention_dim is not None or double_self_attention:
211 |             # We currently only use AdaLayerNormZero for self attention where there will only be one attention block.
212 |             # I.e. the number of returned modulation chunks from AdaLayerNormZero would not make sense if returned during
213 |             # the second cross attention block.
214 |             if norm_type == "ada_norm":
215 |                 self.norm2 = AdaLayerNorm(dim, num_embeds_ada_norm)
216 |             elif norm_type == "ada_norm_continuous":
217 |                 self.norm2 = AdaLayerNormContinuous(
218 |                     dim,
219 |                     ada_norm_continous_conditioning_embedding_dim,
220 |                     norm_elementwise_affine,
221 |                     norm_eps,
222 |                     ada_norm_bias,
223 |                     "rms_norm",
224 |                 )
225 |             else:
226 |                 self.norm2 = nn.LayerNorm(dim, norm_eps, norm_elementwise_affine)
227 | 
228 |             self.attn2 = Attention(
229 |                 query_dim=dim,
230 |                 cross_attention_dim=cross_attention_dim if not double_self_attention else None,
231 |                 heads=num_attention_heads,
232 |                 dim_head=attention_head_dim,
233 |                 dropout=dropout,
234 |                 bias=attention_bias,
235 |                 upcast_attention=upcast_attention,
236 |                 out_bias=attention_out_bias,
237 |             )  # is self-attn if encoder_hidden_states is none
238 |         else:
239 |             self.norm2 = None
240 |             self.attn2 = None
241 | 
242 |         # 3. Feed-forward
243 |         if norm_type == "ada_norm_continuous":
244 |             self.norm3 = AdaLayerNormContinuous(
245 |                 dim,
246 |                 ada_norm_continous_conditioning_embedding_dim,
247 |                 norm_elementwise_affine,
248 |                 norm_eps,
249 |                 ada_norm_bias,
250 |                 "layer_norm",
251 |             )
252 | 
253 |         elif norm_type in ["ada_norm_zero", "ada_norm", "layer_norm", "ada_norm_continuous"]:
254 |             self.norm3 = nn.LayerNorm(dim, norm_eps, norm_elementwise_affine)
255 |         elif norm_type == "layer_norm_i2vgen":
256 |             self.norm3 = None
257 | 
258 |         self.ff = FeedForward(
259 |             dim,
260 |             dropout=dropout,
261 |             activation_fn=activation_fn,
262 |             final_dropout=final_dropout,
263 |             inner_dim=ff_inner_dim,
264 |             bias=ff_bias,
265 |         )
266 | 
267 |         # 4. Fuser
268 |         if attention_type == "gated" or attention_type == "gated-text-image":
269 |             self.fuser = GatedSelfAttentionDense(dim, cross_attention_dim, num_attention_heads, attention_head_dim)
270 | 
271 |         # 5. Scale-shift for PixArt-Alpha.
272 |         if norm_type == "ada_norm_single":
273 |             self.scale_shift_table = nn.Parameter(torch.randn(6, dim) / dim**0.5)
274 | 
275 |         # let chunk size default to None
276 |         self._chunk_size = None
277 |         self._chunk_dim = 0
278 | 
279 |     def set_chunk_feed_forward(self, chunk_size: Optional[int], dim: int = 0):
280 |         # Sets chunk feed-forward
281 |         self._chunk_size = chunk_size
282 |         self._chunk_dim = dim
283 | 
284 |     def forward(
285 |         self,
286 |         hidden_states: torch.FloatTensor,
287 |         attention_mask: Optional[torch.FloatTensor] = None,
288 |         encoder_hidden_states: Optional[torch.FloatTensor] = None,
289 |         encoder_attention_mask: Optional[torch.FloatTensor] = None,
290 |         timestep: Optional[torch.LongTensor] = None,
291 |         cross_attention_kwargs: Optional[Dict[str, Any]] = None,
292 |         class_labels: Optional[torch.LongTensor] = None,
293 |         added_cond_kwargs: Optional[Dict[str, torch.Tensor]] = None,
294 |         return_attention: bool = False,
295 |         miti_mem: bool = False,
296 |         args=None,
297 |     ) -> torch.FloatTensor:
298 |         # Notice that normalization is always applied before the real computation in the following blocks.
299 |         # 0. Self-Attention
300 |         batch_size = hidden_states.shape[0]
301 | 
302 |         if self.norm_type == "ada_norm":
303 |             norm_hidden_states = self.norm1(hidden_states, timestep)
304 |         elif self.norm_type == "ada_norm_zero":
305 |             norm_hidden_states, gate_msa, shift_mlp, scale_mlp, gate_mlp = self.norm1(
306 |                 hidden_states, timestep, class_labels, hidden_dtype=hidden_states.dtype
307 |             )
308 |         elif self.norm_type in ["layer_norm", "layer_norm_i2vgen"]:
309 |             norm_hidden_states = self.norm1(hidden_states)
310 |         elif self.norm_type == "ada_norm_continuous":
311 |             norm_hidden_states = self.norm1(hidden_states, added_cond_kwargs["pooled_text_emb"])
312 |         elif self.norm_type == "ada_norm_single":
313 |             shift_msa, scale_msa, gate_msa, shift_mlp, scale_mlp, gate_mlp = (
314 |                 self.scale_shift_table[None] + timestep.reshape(batch_size, 6, -1)
315 |             ).chunk(6, dim=1)
316 |             norm_hidden_states = self.norm1(hidden_states)
317 |             norm_hidden_states = norm_hidden_states * (1 + scale_msa) + shift_msa
318 |             norm_hidden_states = norm_hidden_states.squeeze(1)
319 |         else:
320 |             raise ValueError("Incorrect norm used")
321 | 
322 |         if self.pos_embed is not None:
323 |             norm_hidden_states = self.pos_embed(norm_hidden_states)
324 | 
325 |         # 1. Retrieve lora scale.
326 |         lora_scale = cross_attention_kwargs.get("scale", 1.0) if cross_attention_kwargs is not None else 1.0
327 | 
328 |         # 2. Prepare GLIGEN inputs
329 |         cross_attention_kwargs = cross_attention_kwargs.copy() if cross_attention_kwargs is not None else {}
330 |         gligen_kwargs = cross_attention_kwargs.pop("gligen", None)
331 | 
332 |         # Self-attention (attends to encoder_hidden_states only when only_cross_attention is set).
333 | 
334 |         attn_output = self.attn1(
335 |             norm_hidden_states,
336 |             encoder_hidden_states=encoder_hidden_states if self.only_cross_attention else None,
337 |             attention_mask=attention_mask,
338 |             **cross_attention_kwargs,
339 |         )
340 |         if self.norm_type == "ada_norm_zero":
341 |             attn_output = gate_msa.unsqueeze(1) * attn_output
342 |         elif self.norm_type == "ada_norm_single":
343 |             attn_output = gate_msa * attn_output
344 | 
345 |         hidden_states = attn_output + hidden_states
346 |         if hidden_states.ndim == 4:
347 |             hidden_states = hidden_states.squeeze(1)
348 | 
349 |         # 2.5 GLIGEN Control
350 |         if gligen_kwargs is not None:
351 |             hidden_states = self.fuser(hidden_states, gligen_kwargs["objs"])
352 |         attn_weights = None  # stays None when there is no cross-attention layer (self.attn2 is None)
353 |         # 3. Cross-Attention
354 |         if self.attn2 is not None:
355 |             if self.norm_type == "ada_norm":
356 |                 norm_hidden_states = self.norm2(hidden_states, timestep)
357 |             elif self.norm_type in ["ada_norm_zero", "layer_norm", "layer_norm_i2vgen"]:
358 |                 norm_hidden_states = self.norm2(hidden_states)
359 |             elif self.norm_type == "ada_norm_single":
360 |                 # For PixArt norm2 isn't applied here:
361 |                 # https://github.com/PixArt-alpha/PixArt-alpha/blob/0f55e922376d8b797edd44d25d0e7464b260dcab/diffusion/model/nets/PixArtMS.py#L70C1-L76C103
362 |                 norm_hidden_states = hidden_states
363 |             elif self.norm_type == "ada_norm_continuous":
364 |                 norm_hidden_states = self.norm2(hidden_states, added_cond_kwargs["pooled_text_emb"])
365 |             else:
366 |                 raise ValueError("Incorrect norm")
367 | 
368 |             if self.pos_embed is not None and self.norm_type != "ada_norm_single":
369 |                 norm_hidden_states = self.pos_embed(norm_hidden_states)
370 | 
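371 | 
372 |             # Cross-attention over `encoder_hidden_states`. This refactored call always
373 |             # requests the attention weights (`return_attention=True`) and forwards
374 |             # `miti_mem` and `args` (as `direct_args`) to the attention layer, which is
375 |             # assumed to accept these extra keyword arguments.
376 | 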
377 |             attn_output, attn_weights = self.attn2(
378 |                 norm_hidden_states,
379 |                 encoder_hidden_states=encoder_hidden_states,
380 |                 attention_mask=encoder_attention_mask,
381 |                 return_attention=True,
382 |                 miti_mem=miti_mem,
383 |                 direct_args=args,
384 |                 **cross_attention_kwargs,
385 |             )
386 |             hidden_states = attn_output + hidden_states
387 |             # attn_weights is handed back to the caller only when return_attention is True.
388 | 
389 |         # 4. Feed-forward
390 |         # i2vgen doesn't have this norm 🤷‍♂️
391 |         if self.norm_type == "ada_norm_continuous":
392 |             norm_hidden_states = self.norm3(hidden_states, added_cond_kwargs["pooled_text_emb"])
393 |         elif self.norm_type != "ada_norm_single":
394 |             norm_hidden_states = self.norm3(hidden_states)
395 | 
396 |         if self.norm_type == "ada_norm_zero":
397 |             norm_hidden_states = norm_hidden_states * (1 + scale_mlp[:, None]) + shift_mlp[:, None]
398 | 
399 |         if self.norm_type == "ada_norm_single":
400 |             norm_hidden_states = self.norm2(hidden_states)
401 |             norm_hidden_states = norm_hidden_states * (1 + scale_mlp) + shift_mlp
402 | 
403 |         if self._chunk_size is not None:
404 |             # "feed_forward_chunk_size" can be used to save memory
405 |             ff_output = _chunked_feed_forward(
406 |                 self.ff, norm_hidden_states, self._chunk_dim, self._chunk_size, lora_scale=lora_scale
407 |             )
408 |         else:
409 |             ff_output = self.ff(norm_hidden_states, scale=lora_scale)
410 | 
411 |         if self.norm_type == "ada_norm_zero":
412 |             ff_output = gate_mlp.unsqueeze(1) * ff_output
413 |         elif self.norm_type == "ada_norm_single":
414 |             ff_output = gate_mlp * ff_output
415 | 
416 |         hidden_states = ff_output + hidden_states
417 |         if hidden_states.ndim == 4:
418 |             hidden_states = hidden_states.squeeze(1)
419 | 
420 |         if return_attention:
421 |             return hidden_states, attn_weights
422 |         else:
423 |             return hidden_states
424 | 
425 | 
426 | @maybe_allow_in_graph
427 | class TemporalBasicTransformerBlock(nn.Module):
428 |     r"""
429 |     A basic Transformer block for video like data.
430 | 
431 |     Parameters:
432 |         dim (`int`): The number of channels in the input and output.
433 |         time_mix_inner_dim (`int`): The number of channels for temporal attention.
434 |         num_attention_heads (`int`): The number of heads to use for multi-head attention.
435 |         attention_head_dim (`int`): The number of channels in each head.
436 |         cross_attention_dim (`int`, *optional*): The size of the encoder_hidden_states vector for cross attention.
437 |     """
438 | 
439 |     def __init__(
440 |         self,
441 |         dim: int,
442 |         time_mix_inner_dim: int,
443 |         num_attention_heads: int,
444 |         attention_head_dim: int,
445 |         cross_attention_dim: Optional[int] = None,
446 |     ):
447 |         super().__init__()
448 |         self.is_res = dim == time_mix_inner_dim
449 | 
450 |         self.norm_in = nn.LayerNorm(dim)
451 | 
452 |         # Define 3 blocks. Each block has its own normalization layer.
453 |         # 1. Self-Attn
454 |         # norm_in (defined above) feeds ff_in, which projects dim -> time_mix_inner_dim.
455 |         self.ff_in = FeedForward(
456 |             dim,
457 |             dim_out=time_mix_inner_dim,
458 |             activation_fn="geglu",
459 |         )
460 | 
461 |         self.norm1 = nn.LayerNorm(time_mix_inner_dim)
462 |         self.attn1 = Attention(
463 |             query_dim=time_mix_inner_dim,
464 |             heads=num_attention_heads,
465 |             dim_head=attention_head_dim,
466 |             cross_attention_dim=None,
467 |         )
468 | 
469 |         # 2. Cross-Attn
470 |         if cross_attention_dim is not None:
471 |             # We currently only use AdaLayerNormZero for self attention where there will only be one attention block.
472 |             # I.e. the number of returned modulation chunks from AdaLayerNormZero would not make sense if returned during
473 |             # the second cross attention block.
474 |             self.norm2 = nn.LayerNorm(time_mix_inner_dim)
475 |             self.attn2 = Attention(
476 |                 query_dim=time_mix_inner_dim,
477 |                 cross_attention_dim=cross_attention_dim,
478 |                 heads=num_attention_heads,
479 |                 dim_head=attention_head_dim,
480 |             )  # is self-attn if encoder_hidden_states is none
481 |         else:
482 |             self.norm2 = None
483 |             self.attn2 = None
484 | 
485 |         # 3. Feed-forward
486 |         self.norm3 = nn.LayerNorm(time_mix_inner_dim)
487 |         self.ff = FeedForward(time_mix_inner_dim, activation_fn="geglu")
488 | 
489 |         # let chunk size default to None
490 |         self._chunk_size = None
491 |         self._chunk_dim = None
492 | 
493 |     def set_chunk_feed_forward(self, chunk_size: Optional[int], **kwargs):
494 |         # Sets chunk feed-forward
495 |         self._chunk_size = chunk_size
496 |         # chunk dim should be hardcoded to 1 to have better speed vs. memory trade-off
497 |         self._chunk_dim = 1
498 | 
499 |     def forward(
500 |         self,
501 |         hidden_states: torch.FloatTensor,
502 |         num_frames: int,
503 |         encoder_hidden_states: Optional[torch.FloatTensor] = None,
504 |     ) -> torch.FloatTensor:
505 |         # Notice that normalization is always applied before the real computation in the following blocks.
506 |         # 0. Self-Attention
507 |         # hidden_states arrives as (batch_size * num_frames, seq_length, channels).
508 | 
509 |         batch_frames, seq_length, channels = hidden_states.shape
510 |         batch_size = batch_frames // num_frames
511 | 
512 |         hidden_states = hidden_states[None, :].reshape(batch_size, num_frames, seq_length, channels)
513 |         hidden_states = hidden_states.permute(0, 2, 1, 3)
514 |         hidden_states = hidden_states.reshape(batch_size * seq_length, num_frames, channels)
515 | 
516 |         residual = hidden_states
517 |         hidden_states = self.norm_in(hidden_states)
518 | 
519 |         if self._chunk_size is not None:
520 |             hidden_states = _chunked_feed_forward(self.ff_in, hidden_states, self._chunk_dim, self._chunk_size)
521 |         else:
522 |             hidden_states = self.ff_in(hidden_states)
523 | 
524 |         if self.is_res:
525 |             hidden_states = hidden_states + residual
526 | 
527 |         norm_hidden_states = self.norm1(hidden_states)
528 |         attn_output = self.attn1(norm_hidden_states, encoder_hidden_states=None)
529 |         hidden_states = attn_output + hidden_states
530 | 
531 |         # 3. Cross-Attention
532 |         if self.attn2 is not None:
533 |             norm_hidden_states = self.norm2(hidden_states)
534 |             attn_output = self.attn2(norm_hidden_states, encoder_hidden_states=encoder_hidden_states)
535 |             hidden_states = attn_output + hidden_states
536 | 
537 |         # 4. Feed-forward
538 |         norm_hidden_states = self.norm3(hidden_states)
539 | 
540 |         if self._chunk_size is not None:
541 |             ff_output = _chunked_feed_forward(self.ff, norm_hidden_states, self._chunk_dim, self._chunk_size)
542 |         else:
543 |             ff_output = self.ff(norm_hidden_states)
544 | 
545 |         if self.is_res:
546 |             hidden_states = ff_output + hidden_states
547 |         else:
548 |             hidden_states = ff_output
549 | 
550 |         hidden_states = hidden_states[None, :].reshape(batch_size, seq_length, num_frames, channels)
551 |         hidden_states = hidden_states.permute(0, 2, 1, 3)
552 |         hidden_states = hidden_states.reshape(batch_size * num_frames, seq_length, channels)
553 | 
554 |         return hidden_states
555 | 
556 | 
557 | class SkipFFTransformerBlock(nn.Module):
558 |     def __init__(
559 |         self,
560 |         dim: int,
561 |         num_attention_heads: int,
562 |         attention_head_dim: int,
563 |         kv_input_dim: int,
564 |         kv_input_dim_proj_use_bias: bool,
565 |         dropout=0.0,
566 |         cross_attention_dim: Optional[int] = None,
567 |         attention_bias: bool = False,
568 |         attention_out_bias: bool = True,
569 |     ):
570 |         super().__init__()
571 |         if kv_input_dim != dim:
572 |             self.kv_mapper = nn.Linear(kv_input_dim, dim, kv_input_dim_proj_use_bias)
573 |         else:
574 |             self.kv_mapper = None
575 | 
576 |         self.norm1 = RMSNorm(dim, 1e-06)
577 | 
578 |         self.attn1 = Attention(
579 |             query_dim=dim,
580 |             heads=num_attention_heads,
581 |             dim_head=attention_head_dim,
582 |             dropout=dropout,
583 |             bias=attention_bias,
584 |             cross_attention_dim=cross_attention_dim,
585 |             out_bias=attention_out_bias,
586 |         )
587 | 
588 |         self.norm2 = RMSNorm(dim, 1e-06)
589 | 
590 |         self.attn2 = Attention(
591 |             query_dim=dim,
592 |             cross_attention_dim=cross_attention_dim,
593 |             heads=num_attention_heads,
594 |             dim_head=attention_head_dim,
595 |             dropout=dropout,
596 |             bias=attention_bias,
597 |             out_bias=attention_out_bias,
598 |         )
599 | 
600 |     def forward(self, hidden_states, encoder_hidden_states, cross_attention_kwargs):
601 |         cross_attention_kwargs = cross_attention_kwargs.copy() if cross_attention_kwargs is not None else {}
602 | 
603 |         if self.kv_mapper is not None:
604 |             encoder_hidden_states = self.kv_mapper(F.silu(encoder_hidden_states))
605 | 
606 |         norm_hidden_states = self.norm1(hidden_states)
607 | 
608 |         attn_output = self.attn1(
609 |             norm_hidden_states,
610 |             encoder_hidden_states=encoder_hidden_states,
611 |             **cross_attention_kwargs,
612 |         )
613 | 
614 |         hidden_states = attn_output + hidden_states
615 | 
616 |         norm_hidden_states = self.norm2(hidden_states)
617 | 
618 |         attn_output = self.attn2(
619 |             norm_hidden_states,
620 |             encoder_hidden_states=encoder_hidden_states,
621 |             **cross_attention_kwargs,
622 |         )
623 | 
624 |         hidden_states = attn_output + hidden_states
625 | 
626 |         return hidden_states
627 | 
628 | 
629 | class FeedForward(nn.Module):
630 |     r"""
631 |     A feed-forward layer.
632 | 
633 |     Parameters:
634 |         dim (`int`): The number of channels in the input.
635 |         dim_out (`int`, *optional*): The number of channels in the output. If not given, defaults to `dim`.
636 |         mult (`int`, *optional*, defaults to 4): The multiplier to use for the hidden dimension.
637 |         dropout (`float`, *optional*, defaults to 0.0): The dropout probability to use.
638 |         activation_fn (`str`, *optional*, defaults to `"geglu"`): Activation function to be used in feed-forward.
639 |         final_dropout (`bool`, *optional*, defaults to `False`): Apply a final dropout.
640 |         bias (`bool`, defaults to True): Whether to use a bias in the linear layer.
641 |     """
642 | 
643 |     def __init__(
644 |         self,
645 |         dim: int,
646 |         dim_out: Optional[int] = None,
647 |         mult: int = 4,
648 |         dropout: float = 0.0,
649 |         activation_fn: str = "geglu",
650 |         final_dropout: bool = False,
651 |         inner_dim=None,
652 |         bias: bool = True,
653 |     ):
654 |         super().__init__()
655 |         if inner_dim is None:
656 |             inner_dim = int(dim * mult)
657 |         dim_out = dim_out if dim_out is not None else dim
658 |         linear_cls = LoRACompatibleLinear if not USE_PEFT_BACKEND else nn.Linear
659 | 
660 |         if activation_fn == "gelu":
661 |             act_fn = GELU(dim, inner_dim, bias=bias)
662 |         elif activation_fn == "gelu-approximate":
663 |             act_fn = GELU(dim, inner_dim, approximate="tanh", bias=bias)
664 |         elif activation_fn == "geglu":
665 |             act_fn = GEGLU(dim, inner_dim, bias=bias)
666 |         elif activation_fn == "geglu-approximate":
667 |             act_fn = ApproximateGELU(dim, inner_dim, bias=bias)
668 | 
669 |         self.net = nn.ModuleList([])
670 |         # project in
671 |         self.net.append(act_fn)
672 |         # project dropout
673 |         self.net.append(nn.Dropout(dropout))
674 |         # project out
675 |         self.net.append(linear_cls(inner_dim, dim_out, bias=bias))
676 |         # FF as used in Vision Transformer, MLP-Mixer, etc. have a final dropout
677 |         if final_dropout:
678 |             self.net.append(nn.Dropout(dropout))
679 | 
680 |     def forward(self, hidden_states: torch.Tensor, scale: float = 1.0) -> torch.Tensor:
681 |         compatible_cls = (GEGLU,) if USE_PEFT_BACKEND else (GEGLU, LoRACompatibleLinear)
682 |         for module in self.net:
683 |             if isinstance(module, compatible_cls):
684 |                 hidden_states = module(hidden_states, scale)
685 |             else:
686 |                 hidden_states = module(hidden_states)
687 |         return hidden_states
688 | 
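
The `GatedSelfAttentionDense` fuser in `refactored_attention.py` scales both of its residual branches by `tanh(alpha)`, and `alpha_attn` / `alpha_dense` are initialised to 0.0. The pure-Python sketch below uses plain lists in place of tensors to suggest why the fuser starts out as an identity mapping. The `gated_residual` helper is hypothetical and not part of the repository.

```python
import math


def gated_residual(x, branch_out, alpha):
    """Return x + tanh(alpha) * branch_out, elementwise over plain Python lists."""
    gate = math.tanh(alpha)
    return [xi + gate * bi for xi, bi in zip(x, branch_out)]


x = [1.0, -2.0, 0.5]
branch_out = [10.0, 10.0, 10.0]  # stand-in for the attention or feed-forward output

# alpha = 0.0 mirrors the initial value of alpha_attn / alpha_dense: the gate is closed.
at_init = gated_residual(x, branch_out, alpha=0.0)

# A large alpha saturates tanh near 1, so the branch is added almost in full.
opened = gated_residual(x, branch_out, alpha=5.0)
```

With the gate closed the output equals the input, so under this reading a freshly initialised fuser leaves `hidden_states` unchanged until the alphas are trained away from zero.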

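`TemporalBasicTransformerBlock.forward` reshapes its input from `(batch_size * num_frames, seq_length, channels)` to `(batch_size * seq_length, num_frames, channels)` so that attention runs across frames, then undoes the shuffle before returning. The sketch below reproduces only that index mapping with nested lists instead of tensors. The helper names `to_temporal` and `to_spatial` are illustrative assumptions, not functions from the repository.

```python
def to_temporal(rows, batch_size, num_frames, seq_length):
    """(batch*frames, seq) -> (batch*seq, frames): rows[b*F + f][s] moves to out[b*S + s][f]."""
    out = [[None] * num_frames for _ in range(batch_size * seq_length)]
    for b in range(batch_size):
        for f in range(num_frames):
            for s in range(seq_length):
                out[b * seq_length + s][f] = rows[b * num_frames + f][s]
    return out


def to_spatial(rows, batch_size, num_frames, seq_length):
    """Inverse shuffle: (batch*seq, frames) -> (batch*frames, seq)."""
    out = [[None] * seq_length for _ in range(batch_size * num_frames)]
    for b in range(batch_size):
        for s in range(seq_length):
            for f in range(num_frames):
                out[b * num_frames + f][s] = rows[b * seq_length + s][f]
    return out


batch_size, num_frames, seq_length = 2, 3, 4
# Each entry stands in for one channel vector and is labelled (batch, frame, token).
spatial = [
    [(b, f, s) for s in range(seq_length)]
    for b in range(batch_size)
    for f in range(num_frames)
]
temporal = to_temporal(spatial, batch_size, num_frames, seq_length)
restored = to_spatial(temporal, batch_size, num_frames, seq_length)
```

Each row of `temporal` holds one token position of one sample across all frames, which is the sequence the temporal attention layers see. Applying `to_spatial` gives back the original layout.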

--------------------------------------------------------------------------------
/refactored_classes/refactored_transformer_2d.py:
--------------------------------------------------------------------------------
  1 | # Copyright 2023 The HuggingFace Team. All rights reserved.
  2 | #
  3 | # Licensed under the Apache License, Version 2.0 (the "License");
  4 | # you may not use this file except in compliance with the License.
  5 | # You may obtain a copy of the License at
  6 | #
  7 | #     http://www.apache.org/licenses/LICENSE-2.0
  8 | #
  9 | # Unless required by applicable law or agreed to in writing, software
 10 | # distributed under the License is distributed on an "AS IS" BASIS,
 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 12 | # See the License for the specific language governing permissions and
 13 | # limitations under the License.
 14 | from dataclasses import dataclass
 15 | from typing import Any, Dict, Optional
 16 | 
 17 | import torch
 18 | import torch.nn.functional as F
 19 | from torch import nn
 20 | 
 21 | from diffusers.configuration_utils import ConfigMixin, register_to_config
 22 | from diffusers.utils import USE_PEFT_BACKEND, BaseOutput, deprecate, is_torch_version
 23 | from .refactored_attention import BasicTransformerBlock
 24 | from diffusers.models.embeddings import ImagePositionalEmbeddings, PatchEmbed, PixArtAlphaTextProjection
 25 | from diffusers.models.lora import LoRACompatibleConv, LoRACompatibleLinear
 26 | from diffusers.models.modeling_utils import ModelMixin
 27 | from diffusers.models.normalization import AdaLayerNormSingle
 28 | 
 29 | 
 30 | @dataclass
 31 | class Transformer2DModelOutput(BaseOutput):
 32 |     """
 33 |     The output of [`Transformer2DModel`].
 34 | 
 35 |     Args:
 36 |         sample (`torch.FloatTensor` of shape `(batch_size, num_channels, height, width)` or `(batch_size, num_vector_embeds - 1, num_latent_pixels)` if [`Transformer2DModel`] is discrete):
 37 |             The hidden states output conditioned on the `encoder_hidden_states` input. If discrete, returns probability
 38 |             distributions for the unnoised latent pixels.
 39 |     """
 40 | 
 41 |     sample: torch.FloatTensor
 42 | 
 43 | 
 44 | class Transformer2DModel(ModelMixin, ConfigMixin):
 45 |     """
 46 |     A 2D Transformer model for image-like data.
 47 | 
 48 |     Parameters:
 49 |         num_attention_heads (`int`, *optional*, defaults to 16): The number of heads to use for multi-head attention.
 50 |         attention_head_dim (`int`, *optional*, defaults to 88): The number of channels in each head.
 51 |         in_channels (`int`, *optional*):
 52 |             The number of channels in the input and output (specify if the input is **continuous**).
 53 |         num_layers (`int`, *optional*, defaults to 1): The number of layers of Transformer blocks to use.
 54 |         dropout (`float`, *optional*, defaults to 0.0): The dropout probability to use.
 55 |         cross_attention_dim (`int`, *optional*): The number of `encoder_hidden_states` dimensions to use.
 56 |         sample_size (`int`, *optional*): The width of the latent images (specify if the input is **discrete**).
 57 |             This is fixed during training since it is used to learn a number of position embeddings.
 58 |         num_vector_embeds (`int`, *optional*):
 59 |             The number of classes of the vector embeddings of the latent pixels (specify if the input is **discrete**).
 60 |             Includes the class for the masked latent pixel.
 61 |         activation_fn (`str`, *optional*, defaults to `"geglu"`): Activation function to use in feed-forward.
 62 |         num_embeds_ada_norm ( `int`, *optional*):
 63 |             The number of diffusion steps used during training. Pass if at least one of the norm_layers is
 64 |             `AdaLayerNorm`. This is fixed during training since it is used to learn a number of embeddings that are
 65 |             added to the hidden states.
 66 | 
 67 |             During inference, you can denoise for at most `num_embeds_ada_norm` steps.
 68 |         attention_bias (`bool`, *optional*):
 69 |             Configure if the `TransformerBlocks` attention should contain a bias parameter.
 70 |     """
 71 | 
 72 |     _supports_gradient_checkpointing = True
 73 | 
 74 |     @register_to_config
 75 |     def __init__(
 76 |         self,
 77 |         num_attention_heads: int = 16,
 78 |         attention_head_dim: int = 88,
 79 |         in_channels: Optional[int] = None,
 80 |         out_channels: Optional[int] = None,
 81 |         num_layers: int = 1,
 82 |         dropout: float = 0.0,
 83 |         norm_num_groups: int = 32,
 84 |         cross_attention_dim: Optional[int] = None,
 85 |         attention_bias: bool = False,
 86 |         sample_size: Optional[int] = None,
 87 |         num_vector_embeds: Optional[int] = None,
 88 |         patch_size: Optional[int] = None,
 89 |         activation_fn: str = "geglu",
 90 |         num_embeds_ada_norm: Optional[int] = None,
 91 |         use_linear_projection: bool = False,
 92 |         only_cross_attention: bool = False,
 93 |         double_self_attention: bool = False,
 94 |         upcast_attention: bool = False,
 95 |         norm_type: str = "layer_norm",
 96 |         norm_elementwise_affine: bool = True,
 97 |         norm_eps: float = 1e-5,
 98 |         attention_type: str = "default",
 99 |         caption_channels: Optional[int] = None,
100 |     ):
101 |         super().__init__()
102 |         self.use_linear_projection = use_linear_projection
103 |         self.num_attention_heads = num_attention_heads
104 |         self.attention_head_dim = attention_head_dim
105 |         inner_dim = num_attention_heads * attention_head_dim
106 | 
107 |         conv_cls = nn.Conv2d if USE_PEFT_BACKEND else LoRACompatibleConv
108 |         linear_cls = nn.Linear if USE_PEFT_BACKEND else LoRACompatibleLinear
109 | 
110 |         # 1. Transformer2DModel can process both standard continuous images of shape `(batch_size, num_channels, width, height)` as well as quantized image embeddings of shape `(batch_size, num_image_vectors)`
111 |         # Define whether input is continuous or discrete depending on configuration
112 |         self.is_input_continuous = (in_channels is not None) and (patch_size is None)
113 |         self.is_input_vectorized = num_vector_embeds is not None
114 |         self.is_input_patches = in_channels is not None and patch_size is not None
115 | 
116 |         if norm_type == "layer_norm" and num_embeds_ada_norm is not None:
117 |             deprecation_message = (
118 |                 f"The configuration file of this model: {self.__class__} is outdated. `norm_type` is either not set or"
119 |                 " incorrectly set to `'layer_norm'`. Make sure to set `norm_type` to `'ada_norm'` in the config."
120 |                 " Please make sure to update the config accordingly as leaving `norm_type` might lead to incorrect"
121 |                 " results in future versions. If you have downloaded this checkpoint from the Hugging Face Hub, it"
122 |                 " would be very nice if you could open a Pull request for the `transformer/config.json` file"
123 |             )
124 |             deprecate("norm_type!=num_embeds_ada_norm", "1.0.0", deprecation_message, standard_warn=False)
125 |             norm_type = "ada_norm"
126 | 
127 |         if self.is_input_continuous and self.is_input_vectorized:
128 |             raise ValueError(
129 |                 f"Cannot define both `in_channels`: {in_channels} and `num_vector_embeds`: {num_vector_embeds}. Make"
130 |                 " sure that either `in_channels` or `num_vector_embeds` is None."
131 |             )
132 |         elif self.is_input_vectorized and self.is_input_patches:
133 |             raise ValueError(
134 |                 f"Cannot define both `num_vector_embeds`: {num_vector_embeds} and `patch_size`: {patch_size}. Make"
135 |                 " sure that either `num_vector_embeds` or `patch_size` is None."
136 |             )
137 |         elif not self.is_input_continuous and not self.is_input_vectorized and not self.is_input_patches:
138 |             raise ValueError(
139 |                 f"Has to define `in_channels`: {in_channels}, `num_vector_embeds`: {num_vector_embeds}, or patch_size:"
140 |                 f" {patch_size}. Make sure that `in_channels`, `num_vector_embeds` or `patch_size` is not None."
141 |             )
142 | 
143 |         # 2. Define input layers
144 |         if self.is_input_continuous:
145 |             self.in_channels = in_channels
146 | 
147 |             self.norm = torch.nn.GroupNorm(num_groups=norm_num_groups, num_channels=in_channels, eps=1e-6, affine=True)
148 |             if use_linear_projection:
149 |                 self.proj_in = linear_cls(in_channels, inner_dim)
150 |             else:
151 |                 self.proj_in = conv_cls(in_channels, inner_dim, kernel_size=1, stride=1, padding=0)
152 |         elif self.is_input_vectorized:
153 |             assert sample_size is not None, "Transformer2DModel over discrete input must provide sample_size"
154 |             assert num_vector_embeds is not None, "Transformer2DModel over discrete input must provide num_embed"
155 | 
156 |             self.height = sample_size
157 |             self.width = sample_size
158 |             self.num_vector_embeds = num_vector_embeds
159 |             self.num_latent_pixels = self.height * self.width
160 | 
161 |             self.latent_image_embedding = ImagePositionalEmbeddings(
162 |                 num_embed=num_vector_embeds, embed_dim=inner_dim, height=self.height, width=self.width
163 |             )
164 |         elif self.is_input_patches:
165 |             assert sample_size is not None, "Transformer2DModel over patched input must provide sample_size"
166 | 
167 |             self.height = sample_size
168 |             self.width = sample_size
169 | 
170 |             self.patch_size = patch_size
171 |             interpolation_scale = self.config.sample_size // 64  # => 64 (= 512 pixart) has interpolation scale 1
172 |             interpolation_scale = max(interpolation_scale, 1)
173 |             self.pos_embed = PatchEmbed(
174 |                 height=sample_size,
175 |                 width=sample_size,
176 |                 patch_size=patch_size,
177 |                 in_channels=in_channels,
178 |                 embed_dim=inner_dim,
179 |                 interpolation_scale=interpolation_scale,
180 |             )
181 | 
182 |         # 3. Define transformers blocks
183 |         self.transformer_blocks = nn.ModuleList(
184 |             [
185 |                 BasicTransformerBlock(
186 |                     inner_dim,
187 |                     num_attention_heads,
188 |                     attention_head_dim,
189 |                     dropout=dropout,
190 |                     cross_attention_dim=cross_attention_dim,
191 |                     activation_fn=activation_fn,
192 |                     num_embeds_ada_norm=num_embeds_ada_norm,
193 |                     attention_bias=attention_bias,
194 |                     only_cross_attention=only_cross_attention,
195 |                     double_self_attention=double_self_attention,
196 |                     upcast_attention=upcast_attention,
197 |                     norm_type=norm_type,
198 |                     norm_elementwise_affine=norm_elementwise_affine,
199 |                     norm_eps=norm_eps,
200 |                     attention_type=attention_type,
201 |                 )
202 |                 for d in range(num_layers)
203 |             ]
204 |         )
205 | 
206 |         # 4. Define output layers
207 |         self.out_channels = in_channels if out_channels is None else out_channels
208 |         if self.is_input_continuous:
209 |             # TODO: should use out_channels for continuous projections
210 |             if use_linear_projection:
211 |                 self.proj_out = linear_cls(inner_dim, in_channels)
212 |             else:
213 |                 self.proj_out = conv_cls(inner_dim, in_channels, kernel_size=1, stride=1, padding=0)
214 |         elif self.is_input_vectorized:
215 |             self.norm_out = nn.LayerNorm(inner_dim)
216 |             self.out = nn.Linear(inner_dim, self.num_vector_embeds - 1)
217 |         elif self.is_input_patches and norm_type != "ada_norm_single":
218 |             self.norm_out = nn.LayerNorm(inner_dim, elementwise_affine=False, eps=1e-6)
219 |             self.proj_out_1 = nn.Linear(inner_dim, 2 * inner_dim)
220 |             self.proj_out_2 = nn.Linear(inner_dim, patch_size * patch_size * self.out_channels)
221 |         elif self.is_input_patches and norm_type == "ada_norm_single":
222 |             self.norm_out = nn.LayerNorm(inner_dim, elementwise_affine=False, eps=1e-6)
223 |             self.scale_shift_table = nn.Parameter(torch.randn(2, inner_dim) / inner_dim**0.5)
224 |             self.proj_out = nn.Linear(inner_dim, patch_size * patch_size * self.out_channels)
225 | 
226 |         # 5. PixArt-Alpha blocks.
227 |         self.adaln_single = None
228 |         self.use_additional_conditions = False
229 |         if norm_type == "ada_norm_single":
230 |             self.use_additional_conditions = self.config.sample_size == 128
231 |             # TODO(Sayak, PVP) clean this, for now we use sample size to determine whether to use
232 |             # additional conditions until we find better name
233 |             self.adaln_single = AdaLayerNormSingle(inner_dim, use_additional_conditions=self.use_additional_conditions)
234 | 
235 |         self.caption_projection = None
236 |         if caption_channels is not None:
237 |             self.caption_projection = PixArtAlphaTextProjection(in_features=caption_channels, hidden_size=inner_dim)
238 | 
239 |         self.gradient_checkpointing = False
240 | 
241 |     def _set_gradient_checkpointing(self, module, value=False):
242 |         if hasattr(module, "gradient_checkpointing"):
243 |             module.gradient_checkpointing = value
244 | 
245 |     def forward(
246 |         self,
247 |         hidden_states: torch.Tensor,
248 |         encoder_hidden_states: Optional[torch.Tensor] = None,
249 |         timestep: Optional[torch.LongTensor] = None,
250 |         added_cond_kwargs: Optional[Dict[str, torch.Tensor]] = None,
251 |         class_labels: Optional[torch.LongTensor] = None,
252 |         cross_attention_kwargs: Optional[Dict[str, Any]] = None,
253 |         attention_mask: Optional[torch.Tensor] = None,
254 |         encoder_attention_mask: Optional[torch.Tensor] = None,
255 |         return_dict: bool = True,
256 |         return_attention: bool = False,
257 |         miti_mem: bool = False,
258 |         args=None,
259 |     ):
260 |         """
261 |         The [`Transformer2DModel`] forward method.
262 | 
263 |         Args:
264 |             hidden_states (`torch.LongTensor` of shape `(batch size, num latent pixels)` if discrete, `torch.FloatTensor` of shape `(batch size, channel, height, width)` if continuous):
265 |                 Input `hidden_states`.
266 |             encoder_hidden_states ( `torch.FloatTensor` of shape `(batch size, sequence len, embed dims)`, *optional*):
267 |                 Conditional embeddings for cross attention layer. If not given, cross-attention defaults to
268 |                 self-attention.
269 |             timestep ( `torch.LongTensor`, *optional*):
270 |                 Used to indicate denoising step. Optional timestep to be applied as an embedding in `AdaLayerNorm`.
271 |             class_labels ( `torch.LongTensor` of shape `(batch size, num classes)`, *optional*):
272 |                 Used to indicate class labels conditioning. Optional class labels to be applied as an embedding in
273 |                 `AdaLayerZeroNorm`.
274 |             cross_attention_kwargs ( `Dict[str, Any]`, *optional*):
275 |                 A kwargs dictionary that if specified is passed along to the `AttentionProcessor` as defined under
276 |                 `self.processor` in
277 |                 [diffusers.models.attention_processor](https://github.com/huggingface/diffusers/blob/main/src/diffusers/models/attention_processor.py).
278 |             attention_mask ( `torch.Tensor`, *optional*):
279 |                 An attention mask of shape `(batch, key_tokens)` is applied to `encoder_hidden_states`. If `1` the mask
280 |                 is kept, otherwise if `0` it is discarded. Mask will be converted into a bias, which adds large
281 |                 negative values to the attention scores corresponding to "discard" tokens.
282 |             encoder_attention_mask ( `torch.Tensor`, *optional*):
283 |                 Cross-attention mask applied to `encoder_hidden_states`. Two formats supported:
284 | 
285 |                     * Mask `(batch, sequence_length)` True = keep, False = discard.
286 |                     * Bias `(batch, 1, sequence_length)` 0 = keep, -10000 = discard.
287 | 
288 |                 If `ndim == 2`: will be interpreted as a mask, then converted into a bias consistent with the format
289 |                 above. This bias will be added to the cross-attention scores.
290 |             return_dict (`bool`, *optional*, defaults to `True`):
291 |                 Whether or not to return a [`~models.transformer_2d.Transformer2DModelOutput`] instead of a plain
292 |                 tuple.
293 | 
294 |         Returns:
295 |             If `return_dict` is True, a [`~models.transformer_2d.Transformer2DModelOutput`] is returned. Otherwise a
296 |             `tuple` `(sample,)` is returned, or `(sample, attn_weights_list)` when `return_attention` is True.
297 |         """
298 |         # ensure attention_mask is a bias, and give it a singleton query_tokens dimension.
299 |         #   we may have done this conversion already, e.g. if we came here via UNet2DConditionModel#forward.
300 |         #   we can tell by counting dims; if ndim == 2: it's a mask rather than a bias.
301 |         # expects mask of shape:
302 |         #   [batch, key_tokens]
303 |         # adds singleton query_tokens dimension:
304 |         #   [batch,                    1, key_tokens]
305 |         # this helps to broadcast it as a bias over attention scores, which will be in one of the following shapes:
306 |         #   [batch,  heads, query_tokens, key_tokens] (e.g. torch sdp attn)
307 |         #   [batch * heads, query_tokens, key_tokens] (e.g. xformers or classic attn)
308 |         if attention_mask is not None and attention_mask.ndim == 2:
309 |             # assume that mask is expressed as:
310 |             #   (1 = keep,      0 = discard)
311 |             # convert mask into a bias that can be added to attention scores:
312 |             #       (keep = +0,     discard = -10000.0)
313 |             attention_mask = (1 - attention_mask.to(hidden_states.dtype)) * -10000.0
314 |             attention_mask = attention_mask.unsqueeze(1)
315 | 
316 |         # convert encoder_attention_mask to a bias the same way we do for attention_mask
317 |         if encoder_attention_mask is not None and encoder_attention_mask.ndim == 2:
318 |             encoder_attention_mask = (1 - encoder_attention_mask.to(hidden_states.dtype)) * -10000.0
319 |             encoder_attention_mask = encoder_attention_mask.unsqueeze(1)
320 | 
321 |         # Retrieve lora scale.
322 |         lora_scale = cross_attention_kwargs.get("scale", 1.0) if cross_attention_kwargs is not None else 1.0
323 | 
324 |         # 1. Input
325 |         if self.is_input_continuous:
326 |             batch, _, height, width = hidden_states.shape
327 |             residual = hidden_states
328 | 
329 |             hidden_states = self.norm(hidden_states)
330 |             if not self.use_linear_projection:
331 |                 hidden_states = (
332 |                     self.proj_in(hidden_states, scale=lora_scale)
333 |                     if not USE_PEFT_BACKEND
334 |                     else self.proj_in(hidden_states)
335 |                 )
336 |                 inner_dim = hidden_states.shape[1]
337 |                 hidden_states = hidden_states.permute(0, 2, 3, 1).reshape(batch, height * width, inner_dim)
338 |             else:
339 |                 inner_dim = hidden_states.shape[1]
340 |                 hidden_states = hidden_states.permute(0, 2, 3, 1).reshape(batch, height * width, inner_dim)
341 |                 hidden_states = (
342 |                     self.proj_in(hidden_states, scale=lora_scale)
343 |                     if not USE_PEFT_BACKEND
344 |                     else self.proj_in(hidden_states)
345 |                 )
346 | 
347 |         elif self.is_input_vectorized:
348 |             hidden_states = self.latent_image_embedding(hidden_states)
349 |         elif self.is_input_patches:
350 |             height, width = hidden_states.shape[-2] // self.patch_size, hidden_states.shape[-1] // self.patch_size
351 |             hidden_states = self.pos_embed(hidden_states)
352 | 
353 |             if self.adaln_single is not None:
354 |                 if self.use_additional_conditions and added_cond_kwargs is None:
355 |                     raise ValueError(
356 |                         "`added_cond_kwargs` cannot be None when using additional conditions for `adaln_single`."
357 |                     )
358 |                 batch_size = hidden_states.shape[0]
359 |                 timestep, embedded_timestep = self.adaln_single(
360 |                     timestep, added_cond_kwargs, batch_size=batch_size, hidden_dtype=hidden_states.dtype
361 |                 )
362 | 
363 |         # 2. Blocks
364 |         if self.caption_projection is not None:
365 |             batch_size = hidden_states.shape[0]
366 |             encoder_hidden_states = self.caption_projection(encoder_hidden_states)
367 |             encoder_hidden_states = encoder_hidden_states.view(batch_size, -1, hidden_states.shape[-1])
368 | 
369 |         attn_weights_list = []
370 |         for block in self.transformer_blocks:
371 |             if self.training and self.gradient_checkpointing:
372 | 
373 |                 def create_custom_forward(module, return_dict=None):
374 |                     def custom_forward(*inputs):
375 |                         if return_dict is not None:
376 |                             return module(*inputs, return_dict=return_dict)
377 |                         else:
378 |                             return module(*inputs)
379 | 
380 |                     return custom_forward
381 | 
382 |                 ckpt_kwargs: Dict[str, Any] = {"use_reentrant": False} if is_torch_version(">=", "1.11.0") else {}
383 |                 hidden_states = torch.utils.checkpoint.checkpoint(
384 |                     create_custom_forward(block),
385 |                     hidden_states,
386 |                     attention_mask,
387 |                     encoder_hidden_states,
388 |                     encoder_attention_mask,
389 |                     timestep,
390 |                     cross_attention_kwargs,
391 |                     class_labels,
392 |                     **ckpt_kwargs,
393 |                 )
394 |             else:
395 |                 # TODO return attention
396 |                 hidden_states, attn_weights = block(
397 |                     hidden_states,
398 |                     attention_mask=attention_mask,
399 |                     encoder_hidden_states=encoder_hidden_states,
400 |                     encoder_attention_mask=encoder_attention_mask,
401 |                     timestep=timestep,
402 |                     cross_attention_kwargs=cross_attention_kwargs,
403 |                     class_labels=class_labels,
404 |                     return_attention=True,
405 |                     miti_mem=miti_mem,
406 |                     args=args,
407 |                 )
408 |                 attn_weights_list.append(attn_weights)
409 | 
410 |         # 3. Output
411 |         if self.is_input_continuous:
412 |             if not self.use_linear_projection:
413 |                 hidden_states = hidden_states.reshape(batch, height, width, inner_dim).permute(0, 3, 1, 2).contiguous()
414 |                 hidden_states = (
415 |                     self.proj_out(hidden_states, scale=lora_scale)
416 |                     if not USE_PEFT_BACKEND
417 |                     else self.proj_out(hidden_states)
418 |                 )
419 |             else:
420 |                 hidden_states = (
421 |                     self.proj_out(hidden_states, scale=lora_scale)
422 |                     if not USE_PEFT_BACKEND
423 |                     else self.proj_out(hidden_states)
424 |                 )
425 |                 hidden_states = hidden_states.reshape(batch, height, width, inner_dim).permute(0, 3, 1, 2).contiguous()
426 | 
427 |             output = hidden_states + residual
428 |         elif self.is_input_vectorized:
429 |             hidden_states = self.norm_out(hidden_states)
430 |             logits = self.out(hidden_states)
431 |             # (batch, self.num_vector_embeds - 1, self.num_latent_pixels)
432 |             logits = logits.permute(0, 2, 1)
433 | 
434 |             # log(p(x_0))
435 |             output = F.log_softmax(logits.double(), dim=1).float()
436 | 
437 |         if self.is_input_patches:
438 |             if self.config.norm_type != "ada_norm_single":
439 |                 conditioning = self.transformer_blocks[0].norm1.emb(
440 |                     timestep, class_labels, hidden_dtype=hidden_states.dtype
441 |                 )
442 |                 shift, scale = self.proj_out_1(F.silu(conditioning)).chunk(2, dim=1)
443 |                 hidden_states = self.norm_out(hidden_states) * (1 + scale[:, None]) + shift[:, None]
444 |                 hidden_states = self.proj_out_2(hidden_states)
445 |             elif self.config.norm_type == "ada_norm_single":
446 |                 shift, scale = (self.scale_shift_table[None] + embedded_timestep[:, None]).chunk(2, dim=1)
447 |                 hidden_states = self.norm_out(hidden_states)
448 |                 # Modulation
449 |                 hidden_states = hidden_states * (1 + scale) + shift
450 |                 hidden_states = self.proj_out(hidden_states)
451 |                 hidden_states = hidden_states.squeeze(1)
452 | 
453 |             # unpatchify
454 |             if self.adaln_single is None:
455 |                 height = width = int(hidden_states.shape[1] ** 0.5)
456 |             hidden_states = hidden_states.reshape(
457 |                 shape=(-1, height, width, self.patch_size, self.patch_size, self.out_channels)
458 |             )
459 |             hidden_states = torch.einsum("nhwpqc->nchpwq", hidden_states)
460 |             output = hidden_states.reshape(
461 |                 shape=(-1, self.out_channels, height * self.patch_size, width * self.patch_size)
462 |             )
463 | 
464 |         if not return_dict:
465 |             if return_attention:
466 |                 return (output, attn_weights_list)
467 |             else:
468 |                 return (output,)
469 | 
470 |         return Transformer2DModelOutput(sample=output)
471 | 
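
The mask handling at the top of `Transformer2DModel.forward` can be illustrated outside the model. The sketch below is a NumPy stand-in, not the repository's code: `mask_to_bias` and `softmax` are hypothetical helper names, and the all-zero scores are made-up data. It shows how a `(batch, key_tokens)` keep-mask becomes an additive bias with a singleton query dimension, and why the `-10000.0` entries effectively remove discarded tokens once softmax is applied.

```python
import numpy as np


def mask_to_bias(mask, dtype=np.float32):
    """Turn a (batch, key_tokens) keep-mask into a (batch, 1, key_tokens) additive bias."""
    mask = np.asarray(mask)
    if mask.ndim != 2:
        # Assumption mirrored from the forward pass: a non-2-D input is already a bias.
        return mask
    bias = (1 - mask.astype(dtype)) * -10000.0  # keep -> 0, discard -> -10000
    return bias[:, None, :]  # singleton query_tokens axis so the bias broadcasts


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


# Two samples, three key tokens; 1 = keep, 0 = discard.
mask = np.array([[1, 1, 0], [1, 0, 0]])
bias = mask_to_bias(mask)

# Illustrative attention scores of shape (batch, query_tokens, key_tokens).
scores = np.zeros((2, 4, 3), dtype=np.float32)
weights = softmax(scores + bias)

print(bias.shape)
print(weights[0, 0], weights[1, 0])
```

Because the bias has shape `(batch, 1, key_tokens)`, the same per-key penalty is applied to every query row, which is the purpose of the `unsqueeze(1)` call in the forward pass.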


--------------------------------------------------------------------------------
/text2img.py:
--------------------------------------------------------------------------------
  1 | import argparse
  2 | 
  3 | # Create the parser
  4 | parser = argparse.ArgumentParser(description="Generate one image per prompt from a prompt file with Stable Diffusion.")
  5 | 
  6 | # Add arguments
  7 | parser.add_argument("--model_name", type=str, default="CompVis/stable-diffusion-v1-4", help="Hugging Face model ID of the Stable Diffusion checkpoint to load.")
  8 | parser.add_argument("--local", type=str, default='', help="Value assigned to CUDA_VISIBLE_DEVICES (e.g. '0'); left unset when empty.")
  9 | parser.add_argument("--save_prefix", type=str, default="heatmap/test", help="Output path prefix (not read directly in this script; per-prompt prefixes are built from the results directory).")
 10 | parser.add_argument("--prompt", type=str, default="a photo of an astronaut riding a horse on mars", help="Path to the prompt file without the '.txt' extension; the file holds one prompt per line.")
 11 | parser.add_argument("--job_id", type=str, default='local', help="Job identifier used in the results directory name.")
 12 | parser.add_argument("--output_name", type=str, default='local', help="Run label used in the results directory name.")
 13 | parser.add_argument("--save_prefix_numpy", type=str, default='local', help="Prefix for saved NumPy outputs; overwritten with the results directory at runtime.")
 14 | parser.add_argument("--miti_mem", action="store_true", default=False, help="Enable inference-time memorization mitigation.")
 15 | parser.add_argument("--save_numpy", action="store_true", default=False, help="Flag forwarded to the pipeline through `args` (NumPy saving).")
 16 | parser.add_argument("--mask_length_minis1", action="store_true", default=False, help="Flag forwarded to the pipeline through `args`.")
 17 | parser.add_argument("--cross_attn_mask", action="store_true", default=False, help="Flag forwarded to the pipeline through `args`.")
 18 | parser.add_argument('--c1', type=float, default=1, help='Float coefficient forwarded to the pipeline through `args`.')
 19 | parser.add_argument('--seed', type=int, default=0, help='Base random seed; the prompt on line i is generated with seed + i.')
 20 | 
 21 | # Parse the arguments
 22 | args = parser.parse_args()
 23 | 
 24 | # import pdb ; pdb.set_trace()
 25 | 
 26 | import os
 27 | 
 28 | if args.local != '':
 29 |     os.environ['CUDA_VISIBLE_DEVICES'] = args.local
 30 | 
 31 | import torch
 32 | from diffusers import EulerDiscreteScheduler, DDIMScheduler
 33 | from refactored_classes.MemAttn import MemStableDiffusionPipeline as StableDiffusionPipeline
 34 | from refactored_classes.refactored_unet_2d_condition import UNet2DConditionModel
 35 | 
 36 | import numpy as np
 37 | import random
 38 | 
 39 | torch.cuda.manual_seed(args.seed)
 40 | torch.manual_seed(args.seed)
 41 | torch.cuda.manual_seed_all(args.seed)
 42 | np.random.seed(args.seed)
 43 | random.seed(args.seed)
 44 | 
 45 | # torch.backends.cudnn.enabled = True
 46 | torch.backends.cudnn.benchmark = False
 47 | torch.backends.cudnn.deterministic = True
 48 | 
 49 | 
 50 | def set_seed(seed):
 51 |     torch.cuda.manual_seed(seed)
 52 |     torch.manual_seed(seed)
 53 |     torch.cuda.manual_seed_all(seed)
 54 |     np.random.seed(seed)
 55 |     random.seed(seed)
 56 | 
 57 | model_id = args.model_name
 58 | device = "cuda"
 59 | 
 60 | unet = UNet2DConditionModel.from_pretrained(
 61 |         model_id, subfolder="unet", cache_dir="/localscratch/renjie/cache/", torch_dtype=torch.float16
 62 |     )
 63 | 
 64 | if args.model_name == "stabilityai/stable-diffusion-2":
 65 |     scheduler = EulerDiscreteScheduler.from_pretrained(model_id, subfolder="scheduler")
 66 |     pipe = StableDiffusionPipeline.from_pretrained(model_id, unet=unet, cache_dir="/egr/research-dselab/renjie3/.cache", scheduler=scheduler, safety_checker=None, torch_dtype=torch.float16)
 67 | 
 68 | else:
 69 |     pipe = StableDiffusionPipeline.from_pretrained(model_id, unet=unet, cache_dir="/egr/research-dselab/renjie3/.cache", safety_checker=None, torch_dtype=torch.float16)
 70 |     pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)
 71 | pipe = pipe.to(device)
 72 | 
 73 | save_dir = f"./results/{args.job_id}_{args.prompt}_{args.output_name}_seed{args.seed}"
 74 | if not os.path.exists(save_dir):
 75 |     os.makedirs(save_dir)
 76 |     
 77 | from time import time
 78 | 
 79 | time_counter = 0
 80 | 
 81 | args.save_prefix_numpy = save_dir
 82 | 
 83 | counter = 0
 84 | with open(f"{args.prompt}.txt", 'r') as file:
 85 |     for line_id, line in enumerate(file):
 86 |         prompt = line.strip()
 87 |         save_name = '_'.join(prompt.split(' ')).replace('/', '<#>')
 88 | 
 89 |         print(prompt)
 90 |     
 91 |         args.prompt_id = counter
 92 |         save_prefix = f"{save_dir}/{args.prompt_id}_{save_name}"
 93 | 
 94 |         set_seed(line_id + args.seed)
 95 |         num_images_per_prompt = 1
 96 |         start_time = time()
 97 |         images = pipe(prompt, num_images_per_prompt=num_images_per_prompt, save_prefix=save_prefix, args=args).images
 98 |         image = images[0]
 99 |         end_time = time()
100 |         print(end_time - start_time)
101 |         try:
102 |             image.save(f"{save_prefix}.png")
103 |             print("image saved at: ", f"{save_prefix}.png")
104 |         except Exception as e:
105 |             print(f"save at {save_prefix} failed: {e}")
106 |             continue
107 |         
108 |         time_counter += end_time - start_time
109 |         counter += 1
110 | 
111 | print(time_counter / counter if counter else "no images were generated")
112 | 
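
The per-prompt bookkeeping in the loop above (seed derivation and file naming) can be checked without loading a model. The sketch below uses only the standard library; `prompt_to_save_name` and `plan_runs` are hypothetical helper names introduced for illustration, and the two prompts are made-up inputs. It mirrors the script's rules: the seed for a prompt is its line index plus the base seed, spaces become underscores, and `/` is replaced with `<#>` so the prompt cannot introduce extra path components.

```python
from typing import Iterable, Iterator, Tuple


def prompt_to_save_name(prompt: str) -> str:
    """Build a file-name stem from a prompt: spaces -> '_', '/' -> '<#>'."""
    return "_".join(prompt.split(" ")).replace("/", "<#>")


def plan_runs(lines: Iterable[str], base_seed: int) -> Iterator[Tuple[int, str, str]]:
    """Yield (seed, prompt, save_name) for each line of a prompt file."""
    for line_id, line in enumerate(lines):
        prompt = line.strip()  # drop the trailing newline, as the script does
        yield line_id + base_seed, prompt, prompt_to_save_name(prompt)


# Made-up prompt lines standing in for the contents of a prompt file.
example_lines = ["a cat/dog photo\n", "an astronaut on mars\n"]
runs = list(plan_runs(example_lines, base_seed=7))

for seed, prompt, save_name in runs:
    print(seed, prompt, save_name)
```

Deriving the seed from the line index keeps each prompt reproducible on its own: rerunning with the same `--seed` and the same prompt file assigns every line the same seed regardless of how earlier generations went.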


--------------------------------------------------------------------------------