├── .gitignore ├── README.md ├── data ├── processed_posts.json └── raw_posts.json ├── few_shot.py ├── llm_helper.py ├── main.py ├── post_generator.py ├── preprocess.py ├── requirements.txt └── resources ├── architecture.jpg └── tool.jpg /.gitignore: -------------------------------------------------------------------------------- 1 | # Byte-compiled / optimized / DLL files 2 | __pycache__/ 3 | *.py[cod] 4 | *$py.class 5 | 6 | # C extensions 7 | *.so 8 | 9 | # Distribution / packaging 10 | .Python 11 | build/ 12 | develop-eggs/ 13 | dist/ 14 | downloads/ 15 | eggs/ 16 | .eggs/ 17 | lib/ 18 | lib64/ 19 | parts/ 20 | sdist/ 21 | var/ 22 | wheels/ 23 | share/python-wheels/ 24 | *.egg-info/ 25 | .installed.cfg 26 | *.egg 27 | MANIFEST 28 | 29 | # PyInstaller 30 | # Usually these files are written by a python script from a template 31 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 32 | *.manifest 33 | *.spec 34 | 35 | # Installer logs 36 | pip-log.txt 37 | pip-delete-this-directory.txt 38 | 39 | # Unit test / coverage reports 40 | htmlcov/ 41 | .tox/ 42 | .nox/ 43 | .coverage 44 | .coverage.* 45 | .cache 46 | nosetests.xml 47 | coverage.xml 48 | *.cover 49 | *.py,cover 50 | .hypothesis/ 51 | .pytest_cache/ 52 | cover/ 53 | 54 | # Translations 55 | *.mo 56 | *.pot 57 | 58 | # Django stuff: 59 | *.log 60 | local_settings.py 61 | db.sqlite3 62 | db.sqlite3-journal 63 | 64 | # Flask stuff: 65 | instance/ 66 | .webassets-cache 67 | 68 | # Scrapy stuff: 69 | .scrapy 70 | 71 | # Sphinx documentation 72 | docs/_build/ 73 | 74 | # PyBuilder 75 | .pybuilder/ 76 | target/ 77 | 78 | # Jupyter Notebook 79 | .ipynb_checkpoints 80 | 81 | # IPython 82 | profile_default/ 83 | ipython_config.py 84 | 85 | # pyenv 86 | # For a library or package, you might want to ignore these files since the code is 87 | # intended to run in multiple environments; otherwise, check them in: 88 | # .python-version 89 | 90 | # pipenv 91 | # According to pypa/pipenv#598, it is 
recommended to include Pipfile.lock in version control. 92 | # However, in case of collaboration, if having platform-specific dependencies or dependencies 93 | # having no cross-platform support, pipenv may install dependencies that don't work, or not 94 | # install all needed dependencies. 95 | #Pipfile.lock 96 | 97 | # poetry 98 | # Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control. 99 | # This is especially recommended for binary packages to ensure reproducibility, and is more 100 | # commonly ignored for libraries. 101 | # https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control 102 | #poetry.lock 103 | 104 | # pdm 105 | # Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control. 106 | #pdm.lock 107 | # pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it 108 | # in version control. 109 | # https://pdm.fming.dev/latest/usage/project/#working-with-version-control 110 | .pdm.toml 111 | .pdm-python 112 | .pdm-build/ 113 | 114 | # PEP 582; used by e.g. 
github.com/David-OConnor/pyflow and github.com/pdm-project/pdm 115 | __pypackages__/ 116 | 117 | # Celery stuff 118 | celerybeat-schedule 119 | celerybeat.pid 120 | 121 | # SageMath parsed files 122 | *.sage.py 123 | 124 | # Environments 125 | .env 126 | .venv 127 | env/ 128 | venv/ 129 | ENV/ 130 | env.bak/ 131 | venv.bak/ 132 | 133 | # Spyder project settings 134 | .spyderproject 135 | .spyproject 136 | 137 | # Rope project settings 138 | .ropeproject 139 | 140 | # mkdocs documentation 141 | /site 142 | 143 | # mypy 144 | .mypy_cache/ 145 | .dmypy.json 146 | dmypy.json 147 | 148 | # Pyre type checker 149 | .pyre/ 150 | 151 | # pytype static type analyzer 152 | .pytype/ 153 | 154 | # Cython debug symbols 155 | cython_debug/ 156 | 157 | # PyCharm 158 | # JetBrains specific template is maintained in a separate JetBrains.gitignore that can 159 | # be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore 160 | # and can be added to the global gitignore or merged into this file. For a more nuclear 161 | # option (not recommended) you can uncomment the following to ignore the entire idea folder. 162 | .idea/ 163 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # project-genai-post-generator 2 | This tool analyzes a LinkedIn influencer's past posts and helps them create new posts that match the writing style of the old ones. 3 | 4 | 5 | 6 | Let's say Mohan is a LinkedIn influencer who needs help writing his future posts. He can feed his past LinkedIn posts to this tool, and it will extract the key topics. He can then select the topic, length, language, etc. and use the Generate button to create a new post that matches his writing style. 7 | 8 | ## Technical Architecture 9 | 10 | 11 | 1. Stage 1: Collect LinkedIn posts and extract Topic, Language, Length etc. from them. 12 | 1. 
Stage 2: Use the topic, language and length to generate a new post. Past posts that match the selected topic, language and length are used as few-shot examples to guide the LLM on writing style. 13 | 14 | ## Set-up 15 | 1. First, get an API key from https://console.groq.com/keys. Inside `.env`, set `GROQ_API_KEY` to the key you created. 16 | 2. Install the dependencies: 17 | ```commandline 18 | pip install -r requirements.txt 19 | ``` 20 | 3. Run the Streamlit app: 21 | ```commandline 22 | streamlit run main.py 23 | ``` 24 | Copyright (C) Codebasics Inc. All rights reserved. 25 | 26 | 27 | **Additional Terms:** 28 | This software is licensed under the MIT License. However, commercial use of this software is strictly prohibited without prior written permission from the author. Attribution must be given in all copies or substantial portions of the software. 29 | -------------------------------------------------------------------------------- /data/processed_posts.json: -------------------------------------------------------------------------------- 1 | [ 2 | { 3 | "text": "Just saw a LinkedIn Influencer with 'Organic Growth' written in the profile with 65K+ followers claiming that he can help you in growing your platform, copying the posts from other influencers.", 4 | "engagement": 90, 5 | "line_count": 1, 6 | "language": "English", 7 | "tags": [ 8 | "Influencer", 9 | "Organic Growth" 10 | ] 11 | }, 12 | { 13 | "text": "Jobseekers, this one\u2019s for you.\n Every application, every interview, every follow-up\u2026 the pressure is immense.\n And I know what you're thinking: Am I not good enough? \n But let me tell you, this isn\u2019t about you or your skills. It\u2019s about a broken system where 60% of applicants never hear back. \n Your mental health is not worth sacrificing for a system that doesn\u2019t acknowledge your worth. 
\n Please remember, taking care of yourself is the real priority. \n Your dream job will come, but for now, breathe. \ud83c\udf3b", 14 | "engagement": 347, 15 | "line_count": 7, 16 | "language": "English", 17 | "tags": [ 18 | "Job Search", 19 | "Mental Health" 20 | ] 21 | }, 22 | { 23 | "text": "Looking for jobs on LinkedIn is like online dating: Full of promises, but in the end, you\u2019re just left ghosted.", 24 | "engagement": 109, 25 | "line_count": 1, 26 | "language": "English", 27 | "tags": [ 28 | "Job Search", 29 | "Online Dating" 30 | ] 31 | }, 32 | { 33 | "text": "LinkedIn scams be like: 'Congratulations, you've been selected for a role you didn\u2019t even apply for!' \n The catch? Pay Rs. 50,000 for the honor.", 34 | "engagement": 115, 35 | "line_count": 2, 36 | "language": "English", 37 | "tags": [ 38 | "Scams" 39 | ] 40 | }, 41 | { 42 | "text": "sapne dekhna achi baat hai,\nlekin job ka sapna dekh ke 'interested' likhna,\nyeh toh achi baat nahi hai na?", 43 | "engagement": 545, 44 | "line_count": 3, 45 | "language": "Hinglish", 46 | "tags": [ 47 | "Job Search", 48 | "Sapne" 49 | ] 50 | }, 51 | { 52 | "text": "Next time when I'll be reading some LinkedIn Influencer's story, I am starting from the last line.\nIf there's a link attached to it, it's most probably a fake one.\nSaves me time!", 53 | "engagement": 188, 54 | "line_count": 3, 55 | "language": "English", 56 | "tags": [ 57 | "Productivity", 58 | "Time Management" 59 | ] 60 | }, 61 | { 62 | "text": "Every time I poured my heart into 5-6 rounds of interviews and faced rejection, it felt like a punch in the gut. The sleepless nights, the endless preparation, all for nothing.\n\nBut looking back, I realize it wasn\u2019t nothing. 
It was the Universe\u2019s way of saying, \u201cNot this one, something better is on the way.\u201d\n\nEvery single time, I\u2019ve been shown why that rejection happened.\n\nDoors I thought I wanted to walk through were shut, only to have the right ones swing open.\n\nThe kind that aligned with my growth, my values, and my happiness.\n\nAt first, it stung. It hurt deeply. But now, when things don\u2019t go as planned, I don\u2019t panic.\n\nI don\u2019t question my worth. I sit back, breathe, and trust. The Universe knows.\n\nI know there's another plan waiting. Something bigger, better, and just for me.\n\nTo anyone feeling the weight of rejection: trust that the closed doors are protecting you from something you can\u2019t see right now.\n\nYour path is being cleared for something even more beautiful.", 63 | "engagement": 206, 64 | "line_count": 16, 65 | "language": "English", 66 | "tags": [ 67 | "Motivation" 68 | ] 69 | }, 70 | { 71 | "text": "To everyone who's still looking for a job...\n\nI see you. I feel you. \ud83d\udc94\n\nEvery rejection email feels like a punch in the gut, and every 'We'll get back to you' sounds more like 'You'll never hear from us.'\n\nBut I want you to know, you're not alone in this. \ud83c\udf38\n\nAccording to a study, 80% of jobseekers struggle with anxiety and self-doubt during their search. It's normal to feel lost, but it's not the end.\n\nTake breaks, breathe, and remember, this doesn't define you. Your worth is not tied to an offer letter. \ud83d\udca5\n\nYour mental health matters more than any job.", 72 | "engagement": 899, 73 | "line_count": 9, 74 | "language": "English", 75 | "tags": [ 76 | "Job Search", 77 | "Mental Health" 78 | ] 79 | }, 80 | { 81 | "text": "Sometimes, we forget that a company\u2019s brand name doesn\u2019t define someone\u2019s talent. 
It\u2019s easy to get caught up in the 'big company = big talent' mindset, but that's not always the case.\n\nI\u2019ve had the privilege of working with people from smaller companies (lesser known) who blow me away with their skills and dedication. They don\u2019t need a fancy title or a famous brand behind them to prove their worth.\n\nI've seen the other side too\u2014people in top-tier companies feeling lost, overwhelmed, or stuck, even though the world sees them as 'successful'.\n\nLet\u2019s stop attaching someone\u2019s value to the company they work for. Freshers especially need to hear this\u2014skills are what matter, not the size of the company behind them.\n\nAt the end of the day, happiness and growth don\u2019t come from a brand name, they come from doing what you love and constantly improving your craft.", 82 | "engagement": 166, 83 | "line_count": 11, 84 | "language": "English", 85 | "tags": [ 86 | "Self Improvement", 87 | "Career Advice" 88 | ] 89 | }, 90 | { 91 | "text": "So when I left a toxic work environment, I told my manager a simple thing and felt so good \ud83d\ude2f\n\nI just said-\n\n'Hope your son gets a manager like you.\nI hope that the manager behaves the same way as you did with me.\nThank you.'\n\nNow tell me 1 thing-\n\nShe always said that she was a great manager.\nWhy will she get offended?\n\nI just told her that I wish her son would get a manager like she was.\n\nIf you felt bad, then that means you were a bad manager and now you know it. 
\ud83e\b80\n\nIf you feel good, then take it as a blessing for your son and you'll really want someone to treat your son/daughter in the same way.\n\nShe cannot be even angry with me else it'll prove that she was not a 'great' manager.\n\nMuskan - 1\nManager - 0\n\nMuskan -> Aura +100000000\n\n(Fictional message unfortunately :(\n)\n\nHope you all become the people that your sons/daughters will like to work under \ud83d\ude4f\n\nThere are a lot of bad people/things, bring a small change and break the chain :)", 92 | "engagement": 1111, 93 | "line_count": 19, 94 | "language": "English", 95 | "tags": [ 96 | "Motivation", 97 | "Leadership" 98 | ] 99 | } 100 | ] -------------------------------------------------------------------------------- /data/raw_posts.json: -------------------------------------------------------------------------------- 1 | [ 2 | { 3 | "text": "Just saw a LinkedIn Influencer with 'Organic Growth' written in the profile with 65K+ followers claiming that he can help you in growing your platform, copying the posts from other influencers.", 4 | "engagement": 90 5 | }, 6 | { 7 | "text": "Jobseekers, this one’s for you.\n Every application, every interview, every follow-up… the pressure is immense.\n And I know what you're thinking: Am I not good enough? \n But let me tell you, this isn’t about you or your skills. It’s about a broken system where 60% of applicants never hear back. \n Your mental health is not worth sacrificing for a system that doesn’t acknowledge your worth. \n Please remember, taking care of yourself is the real priority. \n Your dream job will come, but for now, breathe. 🌻", 8 | "engagement": 347 9 | }, 10 | { 11 | "text": "Looking for jobs on LinkedIn is like online dating: Full of promises, but in the end, you’re just left ghosted.", 12 | "engagement": 109 13 | }, 14 | { 15 | "text": "LinkedIn scams be like: 'Congratulations, you've been selected for a role you didn’t even apply for!' \n The catch? Pay Rs. 
50,000 for the honor.", 16 | "engagement": 115 17 | }, 18 | { 19 | "text": "sapne dekhna achi baat hai,\nlekin job ka sapna dekh ke 'interested' likhna,\nyeh toh achi baat nahi hai na?", 20 | "engagement": 545 21 | }, 22 | { 23 | "text": "Next time when I'll be reading some LinkedIn Influencer's story, I am starting from the last line.\nIf there's a link attached to it, it's most probably a fake one.\nSaves me time!", 24 | "engagement": 188 25 | }, 26 | { 27 | "text": "Every time I poured my heart into 5-6 rounds of interviews and faced rejection, it felt like a punch in the gut. The sleepless nights, the endless preparation, all for nothing.\n\nBut looking back, I realize it wasn’t nothing. It was the Universe’s way of saying, “Not this one, something better is on the way.”\n\nEvery single time, I’ve been shown why that rejection happened.\n\nDoors I thought I wanted to walk through were shut, only to have the right ones swing open.\n\nThe kind that aligned with my growth, my values, and my happiness.\n\nAt first, it stung. It hurt deeply. But now, when things don’t go as planned, I don’t panic.\n\nI don’t question my worth. I sit back, breathe, and trust. The Universe knows.\n\nI know there's another plan waiting. Something bigger, better, and just for me.\n\nTo anyone feeling the weight of rejection: trust that the closed doors are protecting you from something you can’t see right now.\n\nYour path is being cleared for something even more beautiful.", 28 | "engagement": 206 29 | }, 30 | { 31 | "text": "To everyone who's still looking for a job...\n\nI see you. I feel you. \ud83d\udc94\n\nEvery rejection email feels like a punch in the gut, and every 'We'll get back to you' sounds more like 'You'll never hear from us.'\n\nBut I want you to know, you're not alone in this. \ud83c\udf38\n\nAccording to a study, 80% of jobseekers struggle with anxiety and self-doubt during their search. 
It's normal to feel lost, but it's not the end.\n\nTake breaks, breathe, and remember, this doesn't define you. Your worth is not tied to an offer letter. \ud83d\udca5\n\nYour mental health matters more than any job.", 32 | "engagement": 899 33 | }, 34 | { 35 | "text": "Sometimes, we forget that a company’s brand name doesn’t define someone’s talent. It’s easy to get caught up in the 'big company = big talent' mindset, but that's not always the case.\n\nI’ve had the privilege of working with people from smaller companies (lesser known) who blow me away with their skills and dedication. They don’t need a fancy title or a famous brand behind them to prove their worth.\n\nI've seen the other side too—people in top-tier companies feeling lost, overwhelmed, or stuck, even though the world sees them as 'successful'.\n\nLet’s stop attaching someone’s value to the company they work for. Freshers especially need to hear this—skills are what matter, not the size of the company behind them.\n\nAt the end of the day, happiness and growth don’t come from a brand name, they come from doing what you love and constantly improving your craft.", 36 | "engagement": 166 37 | }, 38 | { 39 | "text": "So when I left a toxic work environment, I told my manager a simple thing and felt so good \ud83d\ude2f\n\nI just said-\n\n'Hope your son gets a manager like you.\nI hope that the manager behaves the same way as you did with me.\nThank you.'\n\nNow tell me 1 thing-\n\nShe always said that she was a great manager.\nWhy will she get offended?\n\nI just told her that I wish her son would get a manager like she was.\n\nIf you felt bad, then that means you were a bad manager and now you know it. 
\ud83e\b80\n\nIf you feel good, then take it as a blessing for your son and you'll really want someone to treat your son/daughter in the same way.\n\nShe cannot be even angry with me else it'll prove that she was not a 'great' manager.\n\nMuskan - 1\nManager - 0\n\nMuskan -> Aura +100000000\n\n(Fictional message unfortunately :(\n)\n\nHope you all become the people that your sons/daughters will like to work under \ud83d\ude4f\n\nThere are a lot of bad people/things, bring a small change and break the chain :)", 40 | "engagement": 1111 41 | } 42 | ] -------------------------------------------------------------------------------- /few_shot.py: -------------------------------------------------------------------------------- 1 | import pandas as pd 2 | import json 3 | 4 | 5 | class FewShotPosts: 6 | def __init__(self, file_path="data/processed_posts.json"): 7 | self.df = None 8 | self.unique_tags = None 9 | self.load_posts(file_path) 10 | 11 | def load_posts(self, file_path): 12 | with open(file_path, encoding="utf-8") as f: 13 | posts = json.load(f) 14 | self.df = pd.json_normalize(posts) 15 | self.df['length'] = self.df['line_count'].apply(self.categorize_length) 16 | # collect unique tags across all posts (summing lists concatenates them) 17 | all_tags = self.df['tags'].sum() 18 | self.unique_tags = list(set(all_tags)) 19 | 20 | def get_filtered_posts(self, length, language, tag): 21 | df_filtered = self.df[ 22 | (self.df['tags'].apply(lambda tags: tag in tags)) & # post is tagged with the requested topic 23 | (self.df['language'] == language) & # language matches, e.g. 'English' 24 | (self.df['length'] == length) # length bucket matches: Short, Medium or Long 25 | ] 26 | return df_filtered.to_dict(orient='records') 27 | 28 | def categorize_length(self, line_count): 29 | if line_count < 5: 30 | return "Short" 31 | elif 5 <= line_count <= 10: 32 | return "Medium" 33 | else: 34 | return "Long" 35 | 36 | def get_tags(self): 37 | return self.unique_tags 38 | 39 | 40 | if __name__ == "__main__": 41 | fs = FewShotPosts() 42 | # 
print(fs.get_tags()) 43 | posts = fs.get_filtered_posts("Medium", "Hinglish", "Job Search") 44 | print(posts) -------------------------------------------------------------------------------- /llm_helper.py: -------------------------------------------------------------------------------- 1 | from langchain_groq import ChatGroq 2 | import os 3 | from dotenv import load_dotenv 4 | 5 | load_dotenv() 6 | llm = ChatGroq(groq_api_key=os.getenv("GROQ_API_KEY"), model_name="llama-3.2-90b-text-preview") 7 | 8 | 9 | if __name__ == "__main__": 10 | response = llm.invoke("Two most important ingredients in samosa are ") 11 | print(response.content) 12 | 13 | 14 | 15 | 16 | 17 | -------------------------------------------------------------------------------- /main.py: -------------------------------------------------------------------------------- 1 | import streamlit as st 2 | from few_shot import FewShotPosts 3 | from post_generator import generate_post 4 | 5 | 6 | # Options for length and language 7 | length_options = ["Short", "Medium", "Long"] 8 | language_options = ["English", "Hinglish"] 9 | 10 | 11 | # Main app layout 12 | def main(): 13 | st.subheader("LinkedIn Post Generator: Codebasics") 14 | 15 | # Create three columns for the dropdowns 16 | col1, col2, col3 = st.columns(3) 17 | 18 | fs = FewShotPosts() 19 | tags = fs.get_tags() 20 | with col1: 21 | # Dropdown for Topic (Tags) 22 | selected_tag = st.selectbox("Topic", options=tags) 23 | 24 | with col2: 25 | # Dropdown for Length 26 | selected_length = st.selectbox("Length", options=length_options) 27 | 28 | with col3: 29 | # Dropdown for Language 30 | selected_language = st.selectbox("Language", options=language_options) 31 | 32 | 33 | 34 | # Generate Button 35 | if st.button("Generate"): 36 | post = generate_post(selected_length, selected_language, selected_tag) 37 | st.write(post) 38 | 39 | 40 | # Run the app 41 | if __name__ == "__main__": 42 | main() 43 | 
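The selection that happens behind the Generate button (bucket each post by line count, then keep posts that match topic, language and length) can be sketched in plain Python without pandas. This is only an illustration under assumptions: `bucket_length` and `select_examples` are hypothetical names, the thresholds mirror `categorize_length` in few_shot.py, the `limit=2` default mirrors the two-example cap in post_generator.py, and the sample posts are made up.

```python
def bucket_length(line_count):
    """Map a line count to a length bucket (same thresholds as few_shot.py)."""
    if line_count < 5:
        return "Short"
    if line_count <= 10:
        return "Medium"
    return "Long"


def select_examples(posts, length, language, tag, limit=2):
    """Return up to `limit` posts that match the bucket, language and tag."""
    matches = [
        post for post in posts
        if tag in post["tags"]
        and post["language"] == language
        and bucket_length(post["line_count"]) == length
    ]
    return matches[:limit]


if __name__ == "__main__":
    # Made-up sample data in the shape of data/processed_posts.json.
    sample = [
        {"text": "one-liner", "line_count": 1,
         "language": "English", "tags": ["Job Search"]},
        {"text": "seven lines", "line_count": 7,
         "language": "English", "tags": ["Job Search", "Mental Health"]},
        {"text": "three lines", "line_count": 3,
         "language": "Hinglish", "tags": ["Job Search"]},
    ]
    for post in select_examples(sample, "Short", "English", "Job Search"):
        print(post["text"])
```

An empty result is a normal outcome here: when no past post matches all three criteria, the prompt is simply built without style examples.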
-------------------------------------------------------------------------------- /post_generator.py: -------------------------------------------------------------------------------- 1 | from llm_helper import llm 2 | from few_shot import FewShotPosts 3 | 4 | few_shot = FewShotPosts() 5 | 6 | 7 | def get_length_str(length): 8 | if length == "Short": 9 | return "1 to 4 lines" 10 | if length == "Medium": 11 | return "5 to 10 lines" 12 | if length == "Long": 13 | return "11 to 15 lines" 14 | 15 | 16 | def generate_post(length, language, tag): 17 | prompt = get_prompt(length, language, tag) 18 | response = llm.invoke(prompt) 19 | return response.content 20 | 21 | 22 | def get_prompt(length, language, tag): 23 | length_str = get_length_str(length) 24 | 25 | prompt = f''' 26 | Generate a LinkedIn post using the information below. No preamble. 27 | 28 | 1) Topic: {tag} 29 | 2) Length: {length_str} 30 | 3) Language: {language} 31 | If Language is Hinglish then it means it is a mix of Hindi and English. 32 | The script for the generated post should always be English. 33 | ''' 34 | # Append up to two matching past posts as few-shot examples of the writing style 35 | 36 | examples = few_shot.get_filtered_posts(length, language, tag) 37 | 38 | if len(examples) > 0: 39 | prompt += "4) Use the writing style as per the following examples." 
40 | 41 | for i, post in enumerate(examples): 42 | post_text = post['text'] 43 | prompt += f'\n\n Example {i+1}: \n\n {post_text}' 44 | 45 | if i == 1: # Use max two samples 46 | break 47 | 48 | return prompt 49 | 50 | 51 | if __name__ == "__main__": 52 | print(generate_post("Medium", "English", "Mental Health")) -------------------------------------------------------------------------------- /preprocess.py: -------------------------------------------------------------------------------- 1 | import json 2 | from llm_helper import llm 3 | from langchain_core.prompts import PromptTemplate 4 | from langchain_core.output_parsers import JsonOutputParser 5 | from langchain_core.exceptions import OutputParserException 6 | 7 | 8 | def process_posts(raw_file_path, processed_file_path): 9 | with open(raw_file_path, encoding='utf-8') as file: 10 | posts = json.load(file) 11 | enriched_posts = [] 12 | for post in posts: 13 | metadata = extract_metadata(post['text']) 14 | post_with_metadata = post | metadata 15 | enriched_posts.append(post_with_metadata) 16 | 17 | unified_tags = get_unified_tags(enriched_posts) 18 | for post in enriched_posts: 19 | current_tags = post['tags'] 20 | new_tags = {unified_tags.get(tag, tag) for tag in current_tags} 21 | post['tags'] = list(new_tags) 22 | 23 | with open(processed_file_path, encoding='utf-8', mode="w") as outfile: 24 | json.dump(enriched_posts, outfile, indent=4) 25 | 26 | 27 | def extract_metadata(post): 28 | template = ''' 29 | You are given a LinkedIn post. You need to extract the number of lines, the language of the post and tags. 30 | 1. Return a valid JSON. No preamble. 31 | 2. JSON object should have exactly three keys: line_count, language and tags. 32 | 3. tags is an array of text tags. Extract a maximum of two tags. 33 | 4. 
Language should be English or Hinglish (Hinglish means Hindi + English) 34 | 35 | Here is the actual post on which you need to perform this task: 36 | {post} 37 | ''' 38 | 39 | pt = PromptTemplate.from_template(template) 40 | chain = pt | llm 41 | response = chain.invoke(input={"post": post}) 42 | 43 | try: 44 | json_parser = JsonOutputParser() 45 | res = json_parser.parse(response.content) 46 | except OutputParserException: 47 | raise OutputParserException("Unable to parse post metadata from the LLM response.") 48 | return res 49 | 50 | 51 | def get_unified_tags(posts_with_metadata): 52 | unique_tags = set() 53 | # Loop through each post and extract the tags 54 | for post in posts_with_metadata: 55 | unique_tags.update(post['tags']) # Add the tags to the set 56 | 57 | unique_tags_list = ','.join(unique_tags) 58 | 59 | template = '''I will give you a list of tags. You need to unify the tags according to the following requirements: 60 | 1. Tags are unified and merged to create a shorter list. 61 | Example 1: "Jobseekers", "Job Hunting" can be all merged into a single tag "Job Search". 62 | Example 2: "Motivation", "Inspiration", "Drive" can be mapped to "Motivation" 63 | Example 3: "Personal Growth", "Personal Development", "Self Improvement" can be mapped to "Self Improvement" 64 | Example 4: "Scam Alert", "Job Scam" etc. can be mapped to "Scams" 65 | 2. Each tag should follow the title case convention. Example: "Motivation", "Job Search" 66 | 3. Output should be a JSON object. No preamble. 67 | 4. Output should map each original tag to its unified tag. 
68 | For example: {{"Jobseekers": "Job Search", "Job Hunting": "Job Search", "Motivation": "Motivation"}} 69 | 70 | Here is the list of tags: 71 | {tags} 72 | ''' 73 | pt = PromptTemplate.from_template(template) 74 | chain = pt | llm 75 | response = chain.invoke(input={"tags": str(unique_tags_list)}) 76 | try: 77 | json_parser = JsonOutputParser() 78 | res = json_parser.parse(response.content) 79 | except OutputParserException: 80 | raise OutputParserException("Unable to parse the unified tag mapping from the LLM response.") 81 | return res 82 | 83 | 84 | if __name__ == "__main__": 85 | process_posts("data/raw_posts.json", "data/processed_posts.json") -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | streamlit==1.35.0 2 | langchain==0.2.14 3 | langchain-core==0.2.39 4 | langchain-community==0.2.12 5 | langchain_groq==0.1.9 6 | pandas==2.0.2 -------------------------------------------------------------------------------- /resources/architecture.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/codebasics/project-genai-post-generator/62c5e1672a70dcac1724adb1a98c5771dc64b7f0/resources/architecture.jpg -------------------------------------------------------------------------------- /resources/tool.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/codebasics/project-genai-post-generator/62c5e1672a70dcac1724adb1a98c5771dc64b7f0/resources/tool.jpg --------------------------------------------------------------------------------
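Because the tag mapping in preprocess.py comes from an LLM, it may not cover every tag it was given. A minimal sketch of applying such a mapping so that unmapped tags fall back to themselves, with no LLM call involved. The helper name `apply_tag_mapping`, the fallback behavior and the sample data are illustrative assumptions, not part of the repository.

```python
def apply_tag_mapping(posts, mapping):
    """Return copies of posts with each tag replaced by its unified form.

    Hypothetical helper: a tag missing from `mapping` is kept unchanged
    instead of raising KeyError.
    """
    unified_posts = []
    for post in posts:
        unified_tags = []
        for tag in post.get("tags", []):
            unified = mapping.get(tag, tag)
            if unified not in unified_tags:  # de-duplicate, keep first-seen order
                unified_tags.append(unified)
        unified_posts.append({**post, "tags": unified_tags})
    return unified_posts


if __name__ == "__main__":
    # Made-up posts and mapping, shaped like the preprocess.py data.
    sample_posts = [
        {"text": "post A", "tags": ["Jobseekers", "Job Hunting"]},
        {"text": "post B", "tags": ["Motivation", "Unmapped Tag"]},
    ]
    sample_mapping = {
        "Jobseekers": "Job Search",
        "Job Hunting": "Job Search",
        "Motivation": "Motivation",
    }
    for post in apply_tag_mapping(sample_posts, sample_mapping):
        print(post["tags"])
```

Using a list instead of a set keeps the tag order stable between runs, which makes the processed JSON easier to diff.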