├── .gitattributes
├── 1. Midjourney Images
    ├── Midjourney_pict_1.png
    ├── Midjourney_pict_2.png
    └── Prompt.txt
├── 2. ChatGPT Script
    ├── ChatGPT_Counselling_Script.txt
    └── chatgpt script feed.JPG
├── 3. ElevenLabs Audio
    └── synthesized_audio.mp3
├── 4. D-ID Video
    └── AI Newsreader.mp4
├── LICENSE
├── README.md
└── Screenshots
    ├── chatgpt script feed.JPG
    ├── discord midjourney.JPG
    ├── eleven labs.JPG
    ├── studio D-ID.JPG
    └── video library DID.JPG

/.gitattributes:
--------------------------------------------------------------------------------
# Auto detect text files and perform LF normalization
* text=auto

--------------------------------------------------------------------------------
/1. Midjourney Images/Midjourney_pict_1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sivamsinghsh/Generative-AI-Newsreader/c83d42be6c63d7fd0b67c7e60eba7129ba0c9686/1. Midjourney Images/Midjourney_pict_1.png

--------------------------------------------------------------------------------
/1. Midjourney Images/Midjourney_pict_2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sivamsinghsh/Generative-AI-Newsreader/c83d42be6c63d7fd0b67c7e60eba7129ba0c9686/1. Midjourney Images/Midjourney_pict_2.png

--------------------------------------------------------------------------------
/1. Midjourney Images/Prompt.txt:
--------------------------------------------------------------------------------
High-quality upper body professional photo of a female media news reporter in a red coat with a newsroom background

--------------------------------------------------------------------------------
/2. ChatGPT Script/ChatGPT_Counselling_Script.txt:
--------------------------------------------------------------------------------
Hi everyone, I'm Keezum, your newsreader. Here are today's headlines:

1. The US Senate has voted to pass a new COVID relief bill.

2. The European Union has signed a new trade agreement with China.

3. Tropical Storm Eta is expected to make landfall in Florida this week.

4. A magnitude 7.0 earthquake has struck near the Philippines.

Thanks for joining me for the news, stay tuned for more updates!

--------------------------------------------------------------------------------
/2. ChatGPT Script/chatgpt script feed.JPG:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sivamsinghsh/Generative-AI-Newsreader/c83d42be6c63d7fd0b67c7e60eba7129ba0c9686/2. ChatGPT Script/chatgpt script feed.JPG

--------------------------------------------------------------------------------
/3. ElevenLabs Audio/synthesized_audio.mp3:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sivamsinghsh/Generative-AI-Newsreader/c83d42be6c63d7fd0b67c7e60eba7129ba0c9686/3. ElevenLabs Audio/synthesized_audio.mp3

--------------------------------------------------------------------------------
/4. D-ID Video/AI Newsreader.mp4:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sivamsinghsh/Generative-AI-Newsreader/c83d42be6c63d7fd0b67c7e60eba7129ba0c9686/4. D-ID Video/AI Newsreader.mp4

--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2023 Shivam Singh

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Generative-AI Newsreader

## Presenting to you, **Keezum AI Newsreader**!

Welcome to the AI Newsreader repository! This project is a demonstration of how generative AI tools can be used together to create a sophisticated and dynamic newsreader avatar.

> :warning: Disclaimer: Just kidding feel free to use it as per your comfort.

[![shivam01-neel-high-quality-upper-body-professional-photo-of-a-f-0d6ca52a-b365-44b8-8730-c8b86079fd7c.png](https://i.postimg.cc/RZY6XMGD/shivam01-neel-high-quality-upper-body-professional-photo-of-a-f-0d6ca52a-b365-44b8-8730-c8b86079fd7c.png)](https://postimg.cc/qN8v7HPs)

[Visit this link](https://youtube.com/shorts/mpRrZsIssGc?feature=share) to watch the demo video and see Keezum in action, where she briefly reads out a few news headlines.

The outcome is pretty impressive and will only get better over time, given the speed at which generative AI is improving. Creating this demo cost me 0 dollars and took only 25 minutes.

Here are the tools I used:
1. [**Midjourney**](https://midjourney.com/home/?callbackUrl=%2Fapp%2F) to generate an image of a female newsreader
2. OpenAI's [**ChatGPT**](https://platform.openai.com/playground) to generate a script of news headlines
3. [**ElevenLabs**](http://www.elevenlabs.io/)'s Prime Voice AI to generate audio from the ChatGPT script
4. [**D-ID**](https://www.d-id.com/)'s Creative Reality Studio to generate a realistic animated avatar video synced with the audio (Free trial)

___


## Step-by-Step Guide

### (1) Midjourney - Image Generation
- We need a face to represent our avatar, and image generation tools like Midjourney can do just that.
- Midjourney is an AI service from the independent research lab of the same name (it is not an OpenAI product) that creates images from textual descriptions.
- Setup:
  1. Midjourney works entirely on Discord, so make sure you sign up for a Discord account (which is free).
  2. Visit this Midjourney site [link](https://discord.gg/midjourney), which automatically takes you to a Discord invite.
  3. Accept the Discord invite to Midjourney. Choose to Continue to Discord.
  4. Click on the Midjourney button (with the ship icon) and select any of the Newcomer rooms, e.g., `newbies-24`.
  5. In the chat line, type `/imagine` followed by your description prompt. For example, the prompt I used was "High-quality upper body professional photo of a female media news reporter in a red coat with a newsroom background". Press Enter after typing it in, and give Midjourney some time to generate the images.
  6. Once done, you will see an output of four images. Below the image set, you will see a set of buttons U1-U4 and V1-V4.
  7. The four images are numbered left to right, top to bottom (1 is upper left, 2 upper right, 3 lower left, 4 lower right). To get a new variation on one you like, select "V1" (or V2, V3, or V4), and to get a high-resolution copy, select "U1" (or U2, U3, or U4).
  8. Once you have your high-resolution variant from one of the U buttons, click on the image and select 'Open in Browser'. You can then save the high-resolution image on your local machine.

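
If you want to try several variations of the prompt from step 5, a tiny helper can assemble the `/imagine` line for you to paste into Discord. This is only a sketch: the function name `build_imagine_command` and its parameters are made up for illustration, and it does not talk to Midjourney or Discord in any way.

```python
def build_imagine_command(subject: str, attire: str = "", background: str = "") -> str:
    """Compose the text to type into Discord for Midjourney's /imagine command.

    Hypothetical helper: it only builds a string, following the wording of
    the prompt used in this README.
    """
    prompt = f"High-quality upper body professional photo of {subject}"
    if attire:
        prompt += f" in {attire}"
    if background:
        prompt += f" with {background}"
    return f"/imagine {prompt}"


if __name__ == "__main__":
    # Reproduces the prompt stored in "1. Midjourney Images/Prompt.txt".
    print(build_imagine_command(
        subject="a female media news reporter",
        attire="a red coat",
        background="a newsroom background",
    ))
```

Changing only `attire` or `background` keeps the framing of the shot the same, which makes it easier to compare the resulting image grids.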

### (2) Playground ChatGPT - Text Generation
- ChatGPT has been slow lately because of heavy use, and not everyone wants to pay for it, so we can use the OpenAI Playground to get the job done quickly. We need a script of news headlines for the avatar to read, and the Playground can generate it.
- ChatGPT is a chatbot developed by OpenAI and launched in November 2022. It is built on top of OpenAI's GPT-3 family of large language models.
- Setup:
  1. Visit the [OpenAI Playground](https://platform.openai.com/playground) (you will need to log in).
  2. In the prompt section, enter a description of the news script you want. For example, the prompt I used is as follows: "Create a script that first introduces oneself as a newsreader called Keezum, and then talks about the headlines of news with points in start and give some news info with the limit of two to three lines max".
  3. From the output on the screen, copy the generated text and save it in a text file on your local machine.

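
Before moving on, it can help to sanity-check the saved script. The sketch below is an illustration, not part of the original workflow: it pulls the numbered headlines out of the generated text so you can confirm how many items the model produced. The sample text is the script stored in this repository; the function name `extract_headlines` is hypothetical.

```python
import re
from typing import List

# Script text as saved in "2. ChatGPT Script/ChatGPT_Counselling_Script.txt".
SAMPLE_SCRIPT = """Hi everyone, I'm Keezum, your newsreader. Here are today's headlines:

1. The US Senate has voted to pass a new COVID relief bill.

2. The European Union has signed a new trade agreement with China.

3. Tropical Storm Eta is expected to make landfall in Florida this week.

4. A magnitude 7.0 earthquake has struck near the Philippines.

Thanks for joining me for the news, stay tuned for more updates!"""


def extract_headlines(script: str) -> List[str]:
    """Return the text of every line that starts with a number and a period."""
    pattern = re.compile(r"^[ \t]*\d+\.[ \t]+(.+)$", flags=re.MULTILINE)
    return [match.group(1).strip() for match in pattern.finditer(script)]


if __name__ == "__main__":
    for position, headline in enumerate(extract_headlines(SAMPLE_SCRIPT), start=1):
        print(position, headline)
```

The pattern only matches a number at the start of a line, so the "7.0" inside the earthquake headline is not mistaken for a list marker.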

### (3) ElevenLabs - Text-to-Speech Generation
- Next, we want to convert the ChatGPT script into a natural-sounding audio clip. We can do so with free tools like Prime Voice AI (by ElevenLabs).
- Prime Voice AI is a realistic and versatile AI speech software that brings the most compelling, rich and lifelike voices to creators and publishers seeking the ultimate tools for storytelling.
- Setup:
  1. Visit the [ElevenLabs page](https://beta.elevenlabs.io/) and create a free account.
  2. On the Speech Synthesis page, select a voice in the settings, paste the script into the text section, and click Generate. The voice I chose was `premade/Domi`, as I found it to be the most lively and natural. Settings such as stability and clarity can also be adjusted.
  3. I shortened the script slightly because I did not want the demo to be too long.
  4. The free account has a credit limit, so use your credits wisely for the audio you want to generate.
  5. Download and save the .mp3 file (titled 'synthesized_audio.mp3') on your local machine.

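
The steps above use the ElevenLabs web page, which is all this demo needs. If you would rather script the same step, the sketch below shows how the request could be assembled with Python's standard library. It rests on assumptions that are not part of this README: the endpoint path, the `xi-api-key` header, and the `stability` / `similarity_boost` setting names follow ElevenLabs' public REST API as I understand it, so check the current API reference before relying on them. The voice ID and API key are placeholders, and the function name `build_tts_request` is hypothetical. The code only builds the request and does not send it.

```python
import json
import urllib.request

# Assumption: base URL of the ElevenLabs REST API.
API_BASE = "https://api.elevenlabs.io/v1"


def build_tts_request(text: str, voice_id: str, api_key: str,
                      stability: float = 0.5,
                      similarity_boost: float = 0.75) -> urllib.request.Request:
    """Build (but do not send) a text-to-speech POST request.

    `stability` and `similarity_boost` are assumed to correspond to the
    stability and clarity sliders on the Speech Synthesis page.
    """
    payload = {
        "text": text,
        "voice_settings": {
            "stability": stability,
            "similarity_boost": similarity_boost,
        },
    }
    return urllib.request.Request(
        url=f"{API_BASE}/text-to-speech/{voice_id}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "xi-api-key": api_key,
            "Content-Type": "application/json",
            "Accept": "audio/mpeg",
        },
        method="POST",
    )


if __name__ == "__main__":
    request = build_tts_request(
        text="Hi everyone, I'm Keezum, your newsreader.",
        voice_id="YOUR_VOICE_ID",  # placeholder, not a real voice ID
        api_key="YOUR_API_KEY",    # placeholder
    )
    print(request.get_method(), request.full_url)
    # Sending is left commented out so the sketch runs without credentials:
    # with urllib.request.urlopen(request) as response:
    #     with open("synthesized_audio.mp3", "wb") as audio_file:
    #         audio_file.write(response.read())
```

Counting `len(text)` before sending is a cheap way to keep an eye on the free account's credit limit mentioned in step 4.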

### (4) D-ID - Photorealistic Talking Avatar (and Audio Sync) Generation
- Lastly, it is time to piece the newsreader image and the news audio together into a photorealistic video. To do so, we can use tools like D-ID.
- D-ID’s creative AI technology takes images of faces and turns them into high-quality, photorealistic videos. At the click of a button, it can combine images with audio or text to give them expression and speech.
- Setup:
  1. Visit the [D-ID](https://www.d-id.com/) website and create a Free Trial account.
  2. Select the Create Video button to start creating a new video.
  3. Add your Midjourney newsreader image as a presenter image.
  4. Upload the ElevenLabs audio of the ChatGPT script in the `Upload your own voice` section on the right.
  5. Click the `Generate Video` button at the top right and wait for your masterpiece to be ready for download!

___
### Dependencies

This project requires the following dependencies:

- Python 3.6 or higher
- Midjourney
- OpenAI's GPT-3 API
- ElevenLabs' Prime Voice AI
- D-ID's Creative Reality Studio

___
### Future Work

We plan to continue to refine and improve this project by integrating more generative AI tools and expanding the functionality of the newsreader avatar. We also welcome any contributions or suggestions from the community.

___
### Acknowledgments

We would like to thank the developers and researchers at Midjourney, OpenAI, ElevenLabs, and D-ID for their excellent generative AI tools, which made this project possible.

___
### License

This project is licensed under the MIT License - see the LICENSE file for details.

--------------------------------------------------------------------------------
/Screenshots/chatgpt script feed.JPG:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sivamsinghsh/Generative-AI-Newsreader/c83d42be6c63d7fd0b67c7e60eba7129ba0c9686/Screenshots/chatgpt script feed.JPG

--------------------------------------------------------------------------------
/Screenshots/discord midjourney.JPG:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sivamsinghsh/Generative-AI-Newsreader/c83d42be6c63d7fd0b67c7e60eba7129ba0c9686/Screenshots/discord midjourney.JPG

--------------------------------------------------------------------------------
/Screenshots/eleven labs.JPG:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sivamsinghsh/Generative-AI-Newsreader/c83d42be6c63d7fd0b67c7e60eba7129ba0c9686/Screenshots/eleven labs.JPG

--------------------------------------------------------------------------------
/Screenshots/studio D-ID.JPG:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sivamsinghsh/Generative-AI-Newsreader/c83d42be6c63d7fd0b67c7e60eba7129ba0c9686/Screenshots/studio D-ID.JPG

--------------------------------------------------------------------------------
/Screenshots/video library DID.JPG:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sivamsinghsh/Generative-AI-Newsreader/c83d42be6c63d7fd0b67c7e60eba7129ba0c9686/Screenshots/video library DID.JPG

--------------------------------------------------------------------------------