YouTube2Blog

├── .gitignore
├── README.md
├── WebUI
    ├── index.html
    └── script.js
├── __pycache__
    ├── main.cpython-39.pyc
    ├── restAPI.cpython-310.pyc
    └── restAPI.cpython-39.pyc
├── assets
    ├── savedImages.png
    ├── steps.png
    └── steps2.png
├── main.py
├── requirements.txt
└── restAPI.py


/.gitignore:
--------------------------------------------------------------------------------
/audio.mp3
/segments


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Youtube2Blog

Transform your YouTube content into engaging blog posts effortlessly with our AI-powered YouTube-to-blog conversion tool. Optimize your reach and unlock the potential of SEO for your videos. Get started now! 🚀 #AI #SEO #YouTube2Blog

### Features ⭐️

- Converts any YouTube or Drive video 📹
- Suggests images to go with the text 🌠
- Provides a Medium-like blog editor 📝

## If you like it, please gimme some stars ⭐️ so I have some motivation to do more!! 🥺

Here is how to use it!

### Step 1 - Enter the YouTube URL

![Steps](https://github.com/badboysm890/Youtube2Blog/raw/main/assets/steps.png "Steps")

### Step 2 - Edit the content that appears in the editor

![Steps](https://github.com/badboysm890/Youtube2Blog/raw/main/assets/steps2.png "Steps")

### Step 3 - Click on any image you want to use, then copy and paste the result into Medium or any other blog app

![Steps](https://github.com/badboysm890/Youtube2Blog/raw/main/assets/savedImages.png "Steps")

## Installation

1. Clone this repository:

```
git clone https://github.com/badboysm890/Youtube2Blog.git
```

2. Navigate to the project directory:

```
cd Youtube2Blog
```

3. Install the required packages:

```
pip install -r requirements.txt
```

## Project Structure

The project consists of the following files:

1. `main.py` - Contains the base code, from the YouTube download through to the conversion.
2. `restAPI.py` - The FastAPI server that backs the web interface.
3. `/WebUI` - The web UI files. Serve them with any web server you like, preferably the Live Server extension for VS Code.

## Usage

Step 1

Once everything has been installed, there are two more prerequisites: FFmpeg and youtube-dl. `main.py` also calls `yt-dlp` to download the audio track, so install that too (`pip install --upgrade yt-dlp`).

```
pip install --upgrade youtube-dl
```

For Mac
```
brew install ffmpeg
```

For Linux
```
sudo apt install ffmpeg
```

For example, to convert a YouTube video to a blog post:
```
python3 main.py --params https://youtu.be/SJeBRW1QQMA --name Test.txt
```


Step 2


## Contributing

We welcome contributions to this project. Please follow these guidelines:

1. Create a fork of the repository.
2. Create a new branch with a descriptive name.
3. Commit your changes to your branch.
4. Create a pull request, explaining your changes and the motivation behind them.

## License

Nothing like that, just have fun.

## Todo

- Add automation to schedule posts to Medium
- Use Stable Diffusion to add more related images to the blog
- Make paragraph generation faster and more accurate


--------------------------------------------------------------------------------
/WebUI/index.html:
--------------------------------------------------------------------------------
YouTube2Blog

Editor

Recommended Images

--------------------------------------------------------------------------------
/WebUI/script.js:
--------------------------------------------------------------------------------
var editor = new MediumEditor('.editable', {
    toolbar: {
        /* These are the default options for the toolbar,
           if nothing is passed this is what is used */
        allowMultiParagraphSelection: true,
        buttons: ['bold', 'italic', 'underline', 'anchor', 'h2', 'h3', 'quote'],
        diffLeft: 0,
        diffTop: -10,
        firstButtonClass: 'medium-editor-button-first',
        lastButtonClass: 'medium-editor-button-last',
        relativeContainer: null,
        standardizeSelectionStart: false,
        static: false,
        /* options which only apply when static is true */
        align: 'center',
        sticky: false,
        updateOnEmptySelection: false
    },
    anchor: {
        /* These are the default options for the anchor form,
           if nothing is passed this is what is used */
        customClassOption: null,
        customClassOptionText: 'Button',
        linkValidation: false,
        placeholderText: 'Paste or type a link',
        targetCheckbox: false,
        targetCheckboxText: 'Open in new window'
    },
    keyboardCommands: {
        /* This example includes the default options for keyboardCommands,
           if nothing is passed this is what is used */
        commands: [{
                command: 'bold',
                key: 'B',
                meta: true,
                shift: false,
                alt: false
            },
            {
                command: 'italic',
                key: 'I',
                meta: true,
                shift: false,
                alt: false
            },
            {
                command: 'underline',
                key: 'U',
                meta: true,
                shift: false,
                alt: false
            }
        ],
    },
    autoLink: true,
    placeholder: {
        text: 'Text Will be added here after conversion of the blog',
    },

});

$(".Convert").click(function() {
    // read the video URL from the input field
    var youtubeLink = $(".youtubeLink").val();
    $.ajax({
        type: "POST",
        url: "http://127.0.0.1:8000",
        data: JSON.stringify({ url: youtubeLink }),
        contentType: "application/json; charset=utf-8",
        dataType: "json"
    }).done(function(data) {
        console.log(data.text);
        editor.setContent(data.text);
        // cache the converted text in local storage
        localStorage.setItem("text", data.text);
    });
});

// on load of the page
$(document).ready(function() {
    // delete searchedKeywords from local storage
    localStorage.removeItem("searchedKeywords");
    var text = localStorage.getItem("text");
    if (text) {
        editor.setContent(text);
    }
    // collect the text of every h3 tag and pass each one to findImages
    var titles = [];
    $("h3").each(function() {
        titles.push($(this).text());
    });
    // remove the first element as it is the title of the blog
    titles.shift();
    titles.forEach(element => {
        findImages(element);
    });
});

function findImages(title) {
    // encodeURIComponent keeps spaces and punctuation in the title from breaking the query string
    fetch('https://pixabay.com/api/?key=35871702-48a42c8285cb9f940a8c6f663&q=' + encodeURIComponent(title) + '&image_type=photo')
        .then(response => response.json())
        .then(data => {
            var imageData = data.hits;
            var imageURLs = [];
            for (var i = 0; i < imageData.length; i++) {
                imageURLs.push(imageData[i].webformatURL);
            }
            console.log(imageURLs);
            if (imageURLs.length > 0) {
                // add a heading only if there are images
                // (the original markup string was lost in extraction; an <h4> is assumed here)
                $(".exampleImages").append('<h4>' + title + '</h4>');
            } else {
                // no hits for the full title: fall back to searching its longer words
                var searchedKeywords = JSON.parse(localStorage.getItem("searchedKeywords"));
                if (searchedKeywords == null) {
                    searchedKeywords = [];
                }
                // keep only words longer than 5 characters
                var keywords = title.split(" ").filter(function(word) {
                    return word.length > 5;
                });
                console.log(keywords);
                keywords.forEach(element => {
                    if (searchedKeywords.includes(element)) {
                        console.log("Already searched for this keyword");
                        return;
                    }
                    findImages(element);
                    searchedKeywords.push(element);
                    localStorage.setItem("searchedKeywords", JSON.stringify(searchedKeywords));
                });
            }

            imageURLs.forEach(element => {
                // (the original markup string was lost in extraction; a plain <img> is assumed here)
                $(".exampleImages").append($('<img>').attr('src', element));
            });
        })
        .catch(error => console.log(error));
}


--------------------------------------------------------------------------------
/__pycache__/main.cpython-39.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/badboysm890/Youtube2Blog/0d35ab08d233bf514fafbf2f9397b955c9939a4b/__pycache__/main.cpython-39.pyc


--------------------------------------------------------------------------------
/__pycache__/restAPI.cpython-310.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/badboysm890/Youtube2Blog/0d35ab08d233bf514fafbf2f9397b955c9939a4b/__pycache__/restAPI.cpython-310.pyc


--------------------------------------------------------------------------------
/__pycache__/restAPI.cpython-39.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/badboysm890/Youtube2Blog/0d35ab08d233bf514fafbf2f9397b955c9939a4b/__pycache__/restAPI.cpython-39.pyc


--------------------------------------------------------------------------------
/assets/savedImages.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/badboysm890/Youtube2Blog/0d35ab08d233bf514fafbf2f9397b955c9939a4b/assets/savedImages.png


--------------------------------------------------------------------------------
/assets/steps.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/badboysm890/Youtube2Blog/0d35ab08d233bf514fafbf2f9397b955c9939a4b/assets/steps.png


--------------------------------------------------------------------------------
/assets/steps2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/badboysm890/Youtube2Blog/0d35ab08d233bf514fafbf2f9397b955c9939a4b/assets/steps2.png
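

`main.py` cuts the downloaded audio into fixed 30-second segments and transcribes them one at a time. The sketch below shows only the boundary arithmetic, without any audio library. `segment_bounds` is a hypothetical helper that does not exist in the repository, and the assumption that the last segment is simply shorter reflects how pydub's stepped slicing is understood to behave.

```python
from typing import List, Tuple


def segment_bounds(total_ms: int, segment_ms: int = 30_000) -> List[Tuple[int, int]]:
    """Return (start, end) offsets in milliseconds for consecutive fixed-length segments.

    Hypothetical helper for illustration; the final segment may be shorter
    than segment_ms when the audio length is not an exact multiple.
    """
    if total_ms < 0 or segment_ms <= 0:
        raise ValueError("total_ms must be >= 0 and segment_ms must be > 0")
    return [
        (start, min(start + segment_ms, total_ms))
        for start in range(0, total_ms, segment_ms)
    ]


if __name__ == "__main__":
    # A 75-second clip, using the same segments/<index>.mp3 naming as main.py.
    for index, (start, end) in enumerate(segment_bounds(75_000)):
        print(f"segments/{index}.mp3 covers {start}-{end} ms")
```

Thirty seconds is a natural segment size here because Whisper's `pad_or_trim` pads or cuts each clip to a 30-second window before decoding.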
--------------------------------------------------------------------------------
/main.py:
--------------------------------------------------------------------------------
from __future__ import unicode_literals
import argparse
import os
import shutil
import subprocess

import torch
import whisper
from pydub import AudioSegment
from transformers import T5ForConditionalGeneration, T5Tokenizer

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# T5 model that generates a headline for each transcribed segment
models = T5ForConditionalGeneration.from_pretrained("Michau/t5-base-en-generate-headline")
tokenizer = T5Tokenizer.from_pretrained("Michau/t5-base-en-generate-headline")
models = models.to(device)
# Whisper model that transcribes the audio
model = whisper.load_model("base")

parser = argparse.ArgumentParser()
parser.add_argument('--params', type=str, required=True, help='Youtube link')
parser.add_argument('--name', type=str, required=True, help='Name of the file')
args = parser.parse_args()
url = args.params
name = args.name

if os.path.exists("audio.mp3"):
    os.remove("audio.mp3")

# Argument lists (no shell) keep characters in the URL from being interpreted as shell syntax.
subprocess.run(["youtube-dl", "--write-thumbnail", "--skip-download", url, "-o", "logo.png"])
subprocess.run(["yt-dlp", "-f", "140", "-o", "audio.mp3", url])

# The download calls block until they finish, so a missing file means the download failed.
if not os.path.exists("audio.mp3"):
    raise SystemExit("audio.mp3 was not downloaded")

# start from an empty segments directory
shutil.rmtree("segments", ignore_errors=True)
os.makedirs("segments")

audio = AudioSegment.from_file("audio.mp3")
segment_length = 30 * 1000  # 30 seconds, in milliseconds

for i, segment in enumerate(audio[::segment_length]):
    segment.export(f"segments/{i}.mp3", format="mp3")

original_text = ""
audio_list = os.listdir("segments")
headings = []
dataForWeb = {}

for i in range(len(audio_list)):
    print(f"Processing segment {i+1}/{len(audio_list)}")
    # transcribe the segment with Whisper
    audio = whisper.load_audio(f"segments/{i}.mp3")
    audio = whisper.pad_or_trim(audio)
    mel = whisper.log_mel_spectrogram(audio).to(model.device)
    options = whisper.DecodingOptions(fp16=False)
    result = whisper.decode(model, mel, options)

    # generate a headline for the transcribed text
    text = "headline: " + result.text
    encoding = tokenizer.encode_plus(text, return_tensors="pt")
    input_ids = encoding["input_ids"].to(device)
    attention_masks = encoding["attention_mask"].to(device)
    beam_outputs = models.generate(
        input_ids=input_ids,
        attention_mask=attention_masks,
        max_length=64,
        num_beams=3,
        early_stopping=True,
    )
    # skip_special_tokens drops markers such as <pad> and </s> from the heading
    results = tokenizer.decode(beam_outputs[0], skip_special_tokens=True)
    headings.append(results)
    dataForWeb[i] = {
        "heading": results,
        "text": result.text
    }

    # The markup strings were lost in extraction: <h3> matches the headings
    # that WebUI/script.js reads, and <p> is an assumption.
    original_text += "\n"
    original_text += "<h3>" + results + "</h3>"
    original_text += "\n"
    original_text += "<p>" + result.text + "</p>"

with open(name, "w") as f:
    f.write(original_text)


--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
aiohttp
aiosignal
async-generator==1.10
async-timeout
attrs
brotlipy
certifi==2022.12.7
cffi
charset-normalizer
click
colorama
cryptography
dataclasses
datasets==2.8.0
dill
exceptiongroup==1.1.0
filelock
frozenlist
fsspec
h11==0.14.0
huggingface-hub==0.12.0
idna
importlib-metadata
joblib
multidict
multiprocess
numpy
outcome==1.2.0
packaging
pandas==1.5.3
Pillow==9.4.0
pyarrow==10.0.1
pycparser
pydub==0.25.1
pyOpenSSL
PySocks
python-dateutil
pytz
PyYAML
regex
requests
responses==0.18.0
sacremoses==0.0.43
selenium==4.8.0
six
sniffio==1.3.0
sortedcontainers==2.4.0
sounddevice==0.4.5
tokenizers
torch
tqdm
transformers
trio==0.22.0
trio-websocket==0.9.2
typing_extensions
urllib3
openai-whisper  # provides the whisper module imported by main.py; the PyPI package named "whisper" is unrelated
wsproto==1.2.0
xxhash
yarl
zipp


--------------------------------------------------------------------------------
/restAPI.py:
--------------------------------------------------------------------------------
from __future__ import unicode_literals
from fastapi import FastAPI, Request
from fastapi.middleware.cors import CORSMiddleware
import os
import re
import subprocess
import sys

app = FastAPI()

# browser origins allowed to call this API
origins = [
    "http://localhost",
    "http://127.0.0.1:5500",
]

app.add_middleware(
    CORSMiddleware,
    allow_origins=origins,
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

@app.post("/")
async def check_image(request: Request):
    payload = await request.json()
    print(payload)
    url = payload["url"]
    # build a filesystem-safe output name from the URL
    textName = re.sub(r"[^a-zA-Z0-9]+", ' ', url)
    textName = textName.replace(" ", "_")
    textName = textName + ".txt"
    # An argument list (no shell) keeps the client-supplied URL from being run as shell syntax.
    subprocess.run([sys.executable, "main.py", "--params", url, "--name", textName], check=True)
    with open(textName, "r") as f:
        text = f.read()
    os.remove(textName)
    return {"text": text}

--------------------------------------------------------------------------------
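
The endpoint in `restAPI.py` turns the submitted URL into an output filename and then launches `main.py` as a child process. The sketch below isolates those two steps so they can be checked without starting the server. `output_name` and `build_command` are hypothetical helper names that do not exist in the repository. The command is built as an argument list on the assumption that it will be handed to `subprocess.run` without a shell.

```python
import re
import sys
from typing import List


def output_name(url: str) -> str:
    """Collapse every run of non-alphanumeric characters to '_' and append '.txt'.

    Hypothetical helper; it yields the same name as the two-step
    substitution in restAPI.py.
    """
    return re.sub(r"[^a-zA-Z0-9]+", "_", url) + ".txt"


def build_command(url: str, name: str) -> List[str]:
    """Build the main.py invocation as an argument list.

    Hypothetical helper. Each list element reaches main.py as one argument,
    so shell metacharacters in the URL are never interpreted.
    """
    return [sys.executable, "main.py", "--params", url, "--name", name]


if __name__ == "__main__":
    url = "https://youtu.be/SJeBRW1QQMA"
    print(output_name(url))
    print(build_command(url, output_name(url)))
```

Passing the resulting list to `subprocess.run(cmd, check=True)` would run the conversion and raise an exception if `main.py` exits with an error.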