├── .gitignore
├── README.md
├── data
│   ├── processed_posts.json
│   └── raw_posts.json
├── few_shot.py
├── llm_helper.py
├── main.py
├── post_generator.py
├── preprocess.py
├── requirements.txt
└── resources
    ├── architecture.jpg
    └── tool.jpg
/.gitignore:
--------------------------------------------------------------------------------
1 | # Byte-compiled / optimized / DLL files
2 | __pycache__/
3 | *.py[cod]
4 | *$py.class
5 |
6 | # C extensions
7 | *.so
8 |
9 | # Distribution / packaging
10 | .Python
11 | build/
12 | develop-eggs/
13 | dist/
14 | downloads/
15 | eggs/
16 | .eggs/
17 | lib/
18 | lib64/
19 | parts/
20 | sdist/
21 | var/
22 | wheels/
23 | share/python-wheels/
24 | *.egg-info/
25 | .installed.cfg
26 | *.egg
27 | MANIFEST
28 |
29 | # PyInstaller
30 | # Usually these files are written by a python script from a template
31 | # before PyInstaller builds the exe, so as to inject date/other infos into it.
32 | *.manifest
33 | *.spec
34 |
35 | # Installer logs
36 | pip-log.txt
37 | pip-delete-this-directory.txt
38 |
39 | # Unit test / coverage reports
40 | htmlcov/
41 | .tox/
42 | .nox/
43 | .coverage
44 | .coverage.*
45 | .cache
46 | nosetests.xml
47 | coverage.xml
48 | *.cover
49 | *.py,cover
50 | .hypothesis/
51 | .pytest_cache/
52 | cover/
53 |
54 | # Translations
55 | *.mo
56 | *.pot
57 |
58 | # Django stuff:
59 | *.log
60 | local_settings.py
61 | db.sqlite3
62 | db.sqlite3-journal
63 |
64 | # Flask stuff:
65 | instance/
66 | .webassets-cache
67 |
68 | # Scrapy stuff:
69 | .scrapy
70 |
71 | # Sphinx documentation
72 | docs/_build/
73 |
74 | # PyBuilder
75 | .pybuilder/
76 | target/
77 |
78 | # Jupyter Notebook
79 | .ipynb_checkpoints
80 |
81 | # IPython
82 | profile_default/
83 | ipython_config.py
84 |
85 | # pyenv
86 | # For a library or package, you might want to ignore these files since the code is
87 | # intended to run in multiple environments; otherwise, check them in:
88 | # .python-version
89 |
90 | # pipenv
91 | # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
92 | # However, in case of collaboration, if having platform-specific dependencies or dependencies
93 | # having no cross-platform support, pipenv may install dependencies that don't work, or not
94 | # install all needed dependencies.
95 | #Pipfile.lock
96 |
97 | # poetry
98 | # Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
99 | # This is especially recommended for binary packages to ensure reproducibility, and is more
100 | # commonly ignored for libraries.
101 | # https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
102 | #poetry.lock
103 |
104 | # pdm
105 | # Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
106 | #pdm.lock
107 | # pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
108 | # in version control.
109 | # https://pdm.fming.dev/latest/usage/project/#working-with-version-control
110 | .pdm.toml
111 | .pdm-python
112 | .pdm-build/
113 |
114 | # PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
115 | __pypackages__/
116 |
117 | # Celery stuff
118 | celerybeat-schedule
119 | celerybeat.pid
120 |
121 | # SageMath parsed files
122 | *.sage.py
123 |
124 | # Environments
125 | .env
126 | .venv
127 | env/
128 | venv/
129 | ENV/
130 | env.bak/
131 | venv.bak/
132 |
133 | # Spyder project settings
134 | .spyderproject
135 | .spyproject
136 |
137 | # Rope project settings
138 | .ropeproject
139 |
140 | # mkdocs documentation
141 | /site
142 |
143 | # mypy
144 | .mypy_cache/
145 | .dmypy.json
146 | dmypy.json
147 |
148 | # Pyre type checker
149 | .pyre/
150 |
151 | # pytype static type analyzer
152 | .pytype/
153 |
154 | # Cython debug symbols
155 | cython_debug/
156 |
157 | # PyCharm
158 | # JetBrains specific template is maintained in a separate JetBrains.gitignore that can
159 | # be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
160 | # and can be added to the global gitignore or merged into this file. For a more nuclear
161 | # option (not recommended) you can uncomment the following to ignore the entire idea folder.
162 | .idea/
163 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # project-genai-post-generator
2 | This tool analyzes a LinkedIn influencer's past posts and helps them write new posts that match the writing style of the old ones.
3 |
4 |
5 |
6 | Let's say Mohan is a LinkedIn influencer who needs help writing his future posts. He can feed his past LinkedIn posts to this tool, and it will extract the key topics. He can then select the topic, length, language, etc. and use the Generate button to create a new post that matches his writing style.
7 |
8 | ## Technical Architecture
9 |
10 |
11 | 1. Stage 1: Collect LinkedIn posts and extract the topic, language, length, etc. from each of them.
12 | 2. Stage 2: Use the topic, language, and length to generate a new post. Past posts that match the selected topic, language, and length are used as few-shot examples to guide the LLM on the writing style.
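
Stage 2 can be sketched roughly as follows. This is a minimal, standard-library illustration and not the project's actual code: the names `length_bucket`, `select_examples`, `build_prompt`, and `sample_posts` are hypothetical, and the toy posts are made up. The length thresholds mirror `few_shot.py`, and the cap of two examples mirrors `post_generator.py`.

```python
# Hypothetical sketch of Stage 2: choose few-shot examples, then build the prompt.
MAX_EXAMPLES = 2  # post_generator.py also uses at most two examples


def length_bucket(line_count: int) -> str:
    """Map a line count to a length label (thresholds mirror few_shot.py)."""
    if line_count < 5:
        return "Short"
    if line_count <= 10:
        return "Medium"
    return "Long"


def select_examples(posts, topic, language, length):
    """Return past posts that match the requested topic, language and length."""
    return [
        post for post in posts
        if topic in post["tags"]
        and post["language"] == language
        and length_bucket(post["line_count"]) == length
    ]


def build_prompt(posts, topic, language, length):
    """Assemble the instructions, followed by up to MAX_EXAMPLES matching posts."""
    parts = [
        "Generate a LinkedIn post using the below information. No preamble.",
        f"1) Topic: {topic}",
        f"2) Length: {length}",
        f"3) Language: {language}",
    ]
    examples = select_examples(posts, topic, language, length)[:MAX_EXAMPLES]
    if examples:
        parts.append("4) Use the writing style as per the following examples.")
        for i, post in enumerate(examples, start=1):
            parts.append(f"Example {i}:\n{post['text']}")
    return "\n\n".join(parts)


# Toy data for illustration only (not taken from data/processed_posts.json).
sample_posts = [
    {"text": "Short note on job hunting.", "line_count": 1,
     "language": "English", "tags": ["Job Search"]},
    {"text": "A longer reflection.", "line_count": 7,
     "language": "English", "tags": ["Job Search", "Mental Health"]},
]

prompt = build_prompt(sample_posts, "Job Search", "English", "Short")
print(prompt)
```

When no past post matches the selection, the sketch simply omits the examples section, so the LLM would receive only the topic, length, and language.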
13 |
14 | ## Set-up
15 | 1. Get an API key from https://console.groq.com/keys. Inside `.env`, set the value of `GROQ_API_KEY` to the key you created.
16 | 2. Install the dependencies:
17 | ```commandline
18 | pip install -r requirements.txt
19 | ```
20 | 3. Run the streamlit app:
21 | ```commandline
22 | streamlit run main.py
23 | ```
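
Before launching the app, it can help to confirm that the API key is actually visible to Python. The snippet below is a small sketch under one assumption: the variable is already in the process environment (`llm_helper.py` achieves this by calling `load_dotenv()`, while this sketch reads `os.environ` only). The helper name `require_env` is hypothetical.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; add it to .env or export it first.")
    return value


if __name__ == "__main__":
    # Only reports whether the key is present; the key itself is never printed.
    try:
        require_env("GROQ_API_KEY")
        print("GROQ_API_KEY found")
    except RuntimeError as err:
        print(err)
```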
24 | Copyright (C) Codebasics Inc. All rights reserved.
25 |
26 |
27 | **Additional Terms:**
28 | This software is licensed under the MIT License. However, commercial use of this software is strictly prohibited without prior written permission from the author. Attribution must be given in all copies or substantial portions of the software.
29 |
--------------------------------------------------------------------------------
/data/processed_posts.json:
--------------------------------------------------------------------------------
1 | [
2 | {
3 | "text": "Just saw a LinkedIn Influencer with 'Organic Growth' written in the profile with 65K+ followers claiming that he can help you in growing your platform, copying the posts from other influencers.",
4 | "engagement": 90,
5 | "line_count": 1,
6 | "language": "English",
7 | "tags": [
8 | "Influencer",
9 | "Organic Growth"
10 | ]
11 | },
12 | {
13 | "text": "Jobseekers, this one\u2019s for you.\n Every application, every interview, every follow-up\u2026 the pressure is immense.\n And I know what you're thinking: Am I not good enough? \n But let me tell you, this isn\u2019t about you or your skills. It\u2019s about a broken system where 60% of applicants never hear back. \n Your mental health is not worth sacrificing for a system that doesn\u2019t acknowledge your worth. \n Please remember, taking care of yourself is the real priority. \n Your dream job will come, but for now, breathe. \ud83c\udf3b",
14 | "engagement": 347,
15 | "line_count": 7,
16 | "language": "English",
17 | "tags": [
18 | "Job Search",
19 | "Mental Health"
20 | ]
21 | },
22 | {
23 | "text": "Looking for jobs on LinkedIn is like online dating: Full of promises, but in the end, you\u2019re just left ghosted.",
24 | "engagement": 109,
25 | "line_count": 1,
26 | "language": "English",
27 | "tags": [
28 | "Job Search",
29 | "Online Dating"
30 | ]
31 | },
32 | {
33 | "text": "LinkedIn scams be like: 'Congratulations, you've been selected for a role you didn\u2019t even apply for!' \n The catch? Pay Rs. 50,000 for the honor.",
34 | "engagement": 115,
35 | "line_count": 2,
36 | "language": "English",
37 | "tags": [
38 | "Scams"
39 | ]
40 | },
41 | {
42 | "text": "sapne dekhna achi baat hai,\nlekin job ka sapna dekh ke 'interested' likhna,\nyeh toh achi baat nahi hai na?",
43 | "engagement": 545,
44 | "line_count": 3,
45 | "language": "Hinglish",
46 | "tags": [
47 | "Job Search",
48 | "Sapne"
49 | ]
50 | },
51 | {
52 | "text": "Next time when I'll be reading some LinkedIn Influencer's story, I am starting from the last line.\nIf there's a link attached to it, it's most probably a fake one.\nSaves me time!",
53 | "engagement": 188,
54 | "line_count": 3,
55 | "language": "English",
56 | "tags": [
57 | "Productivity",
58 | "Time Management"
59 | ]
60 | },
61 | {
62 | "text": "Every time I poured my heart into 5-6 rounds of interviews and faced rejection, it felt like a punch in the gut. The sleepless nights, the endless preparation, all for nothing.\n\nBut looking back, I realize it wasn\u2019t nothing. It was the Universe\u2019s way of saying, \u201cNot this one, something better is on the way.\u201d\n\nEvery single time, I\u2019ve been shown why that rejection happened.\n\nDoors I thought I wanted to walk through were shut, only to have the right ones swing open.\n\nThe kind that aligned with my growth, my values, and my happiness.\n\nAt first, it stung. It hurt deeply. But now, when things don\u2019t go as planned, I don\u2019t panic.\n\nI don\u2019t question my worth. I sit back, breathe, and trust. The Universe knows.\n\nI know there's another plan waiting. Something bigger, better, and just for me.\n\nTo anyone feeling the weight of rejection: trust that the closed doors are protecting you from something you can\u2019t see right now.\n\nYour path is being cleared for something even more beautiful.",
63 | "engagement": 206,
64 | "line_count": 16,
65 | "language": "English",
66 | "tags": [
67 | "Motivation"
68 | ]
69 | },
70 | {
71 | "text": "To everyone who's still looking for a job...\n\nI see you. I feel you. \ud83d\udc94\n\nEvery rejection email feels like a punch in the gut, and every 'We'll get back to you' sounds more like 'You'll never hear from us.'\n\nBut I want you to know, you're not alone in this. \ud83c\udf38\n\nAccording to a study, 80% of jobseekers struggle with anxiety and self-doubt during their search. It's normal to feel lost, but it's not the end.\n\nTake breaks, breathe, and remember, this doesn't define you. Your worth is not tied to an offer letter. \ud83d\udca5\n\nYour mental health matters more than any job.",
72 | "engagement": 899,
73 | "line_count": 9,
74 | "language": "English",
75 | "tags": [
76 | "Job Search",
77 | "Mental Health"
78 | ]
79 | },
80 | {
81 | "text": "Sometimes, we forget that a company\u2019s brand name doesn\u2019t define someone\u2019s talent. It\u2019s easy to get caught up in the 'big company = big talent' mindset, but that's not always the case.\n\nI\u2019ve had the privilege of working with people from smaller companies (lesser known) who blow me away with their skills and dedication. They don\u2019t need a fancy title or a famous brand behind them to prove their worth.\n\nI've seen the other side too\u2014people in top-tier companies feeling lost, overwhelmed, or stuck, even though the world sees them as 'successful'.\n\nLet\u2019s stop attaching someone\u2019s value to the company they work for. Freshers especially need to hear this\u2014skills are what matter, not the size of the company behind them.\n\nAt the end of the day, happiness and growth don\u2019t come from a brand name, they come from doing what you love and constantly improving your craft.",
82 | "engagement": 166,
83 | "line_count": 11,
84 | "language": "English",
85 | "tags": [
86 | "Self Improvement",
87 | "Career Advice"
88 | ]
89 | },
90 | {
91 | "text": "So when I left a toxic work environment, I told my manager a simple thing and felt so good \ud83d\ude2f\n\nI just said-\n\n'Hope your son gets a manager like you.\nI hope that the manager behaves the same way as you did with me.\nThank you.'\n\nNow tell me 1 thing-\n\nShe always said that she was a great manager.\nWhy will she get offended?\n\nI just told her that I wish her son would get a manager like she was.\n\nIf you felt bad, then that means you were a bad manager and now you know it. \ud83e\b80\n\nIf you feel good, then take it as a blessing for your son and you'll really want someone to treat your son/daughter in the same way.\n\nShe cannot be even angry with me else it'll prove that she was not a 'great' manager.\n\nMuskan - 1\nManager - 0\n\nMuskan -> Aura +100000000\n\n(Fictional message unfortunately :(\n)\n\nHope you all become the people that your sons/daughters will like to work under \ud83d\ude4f\n\nThere are a lot of bad people/things, bring a small change and break the chain :)",
92 | "engagement": 1111,
93 | "line_count": 19,
94 | "language": "English",
95 | "tags": [
96 | "Motivation",
97 | "Leadership"
98 | ]
99 | }
100 | ]
--------------------------------------------------------------------------------
/data/raw_posts.json:
--------------------------------------------------------------------------------
1 | [
2 | {
3 | "text": "Just saw a LinkedIn Influencer with 'Organic Growth' written in the profile with 65K+ followers claiming that he can help you in growing your platform, copying the posts from other influencers.",
4 | "engagement": 90
5 | },
6 | {
7 | "text": "Jobseekers, this one’s for you.\n Every application, every interview, every follow-up… the pressure is immense.\n And I know what you're thinking: Am I not good enough? \n But let me tell you, this isn’t about you or your skills. It’s about a broken system where 60% of applicants never hear back. \n Your mental health is not worth sacrificing for a system that doesn’t acknowledge your worth. \n Please remember, taking care of yourself is the real priority. \n Your dream job will come, but for now, breathe. 🌻",
8 | "engagement": 347
9 | },
10 | {
11 | "text": "Looking for jobs on LinkedIn is like online dating: Full of promises, but in the end, you’re just left ghosted.",
12 | "engagement": 109
13 | },
14 | {
15 | "text": "LinkedIn scams be like: 'Congratulations, you've been selected for a role you didn’t even apply for!' \n The catch? Pay Rs. 50,000 for the honor.",
16 | "engagement": 115
17 | },
18 | {
19 | "text": "sapne dekhna achi baat hai,\nlekin job ka sapna dekh ke 'interested' likhna,\nyeh toh achi baat nahi hai na?",
20 | "engagement": 545
21 | },
22 | {
23 | "text": "Next time when I'll be reading some LinkedIn Influencer's story, I am starting from the last line.\nIf there's a link attached to it, it's most probably a fake one.\nSaves me time!",
24 | "engagement": 188
25 | },
26 | {
27 | "text": "Every time I poured my heart into 5-6 rounds of interviews and faced rejection, it felt like a punch in the gut. The sleepless nights, the endless preparation, all for nothing.\n\nBut looking back, I realize it wasn’t nothing. It was the Universe’s way of saying, “Not this one, something better is on the way.”\n\nEvery single time, I’ve been shown why that rejection happened.\n\nDoors I thought I wanted to walk through were shut, only to have the right ones swing open.\n\nThe kind that aligned with my growth, my values, and my happiness.\n\nAt first, it stung. It hurt deeply. But now, when things don’t go as planned, I don’t panic.\n\nI don’t question my worth. I sit back, breathe, and trust. The Universe knows.\n\nI know there's another plan waiting. Something bigger, better, and just for me.\n\nTo anyone feeling the weight of rejection: trust that the closed doors are protecting you from something you can’t see right now.\n\nYour path is being cleared for something even more beautiful.",
28 | "engagement": 206
29 | },
30 | {
31 | "text": "To everyone who's still looking for a job...\n\nI see you. I feel you. \ud83d\udc94\n\nEvery rejection email feels like a punch in the gut, and every 'We'll get back to you' sounds more like 'You'll never hear from us.'\n\nBut I want you to know, you're not alone in this. \ud83c\udf38\n\nAccording to a study, 80% of jobseekers struggle with anxiety and self-doubt during their search. It's normal to feel lost, but it's not the end.\n\nTake breaks, breathe, and remember, this doesn't define you. Your worth is not tied to an offer letter. \ud83d\udca5\n\nYour mental health matters more than any job.",
32 | "engagement": 899
33 | },
34 | {
35 | "text": "Sometimes, we forget that a company’s brand name doesn’t define someone’s talent. It’s easy to get caught up in the 'big company = big talent' mindset, but that's not always the case.\n\nI’ve had the privilege of working with people from smaller companies (lesser known) who blow me away with their skills and dedication. They don’t need a fancy title or a famous brand behind them to prove their worth.\n\nI've seen the other side too—people in top-tier companies feeling lost, overwhelmed, or stuck, even though the world sees them as 'successful'.\n\nLet’s stop attaching someone’s value to the company they work for. Freshers especially need to hear this—skills are what matter, not the size of the company behind them.\n\nAt the end of the day, happiness and growth don’t come from a brand name, they come from doing what you love and constantly improving your craft.",
36 | "engagement": 166
37 | },
38 | {
39 | "text": "So when I left a toxic work environment, I told my manager a simple thing and felt so good \ud83d\ude2f\n\nI just said-\n\n'Hope your son gets a manager like you.\nI hope that the manager behaves the same way as you did with me.\nThank you.'\n\nNow tell me 1 thing-\n\nShe always said that she was a great manager.\nWhy will she get offended?\n\nI just told her that I wish her son would get a manager like she was.\n\nIf you felt bad, then that means you were a bad manager and now you know it. \ud83e\b80\n\nIf you feel good, then take it as a blessing for your son and you'll really want someone to treat your son/daughter in the same way.\n\nShe cannot be even angry with me else it'll prove that she was not a 'great' manager.\n\nMuskan - 1\nManager - 0\n\nMuskan -> Aura +100000000\n\n(Fictional message unfortunately :(\n)\n\nHope you all become the people that your sons/daughters will like to work under \ud83d\ude4f\n\nThere are a lot of bad people/things, bring a small change and break the chain :)",
40 | "engagement": 1111
41 | }
42 | ]
--------------------------------------------------------------------------------
/few_shot.py:
--------------------------------------------------------------------------------
1 | import pandas as pd
2 | import json
3 |
4 |
5 | class FewShotPosts:
6 |     def __init__(self, file_path="data/processed_posts.json"):
7 |         self.df = None
8 |         self.unique_tags = None
9 |         self.load_posts(file_path)
10 |
11 |     def load_posts(self, file_path):
12 |         with open(file_path, encoding="utf-8") as f:
13 |             posts = json.load(f)
14 |         self.df = pd.json_normalize(posts)
15 |         self.df['length'] = self.df['line_count'].apply(self.categorize_length)
16 |         # Collect the unique tags across all posts, sorted for a stable order
17 |         all_tags = {tag for tags in self.df['tags'] for tag in tags}
18 |         self.unique_tags = sorted(all_tags)
19 |
20 |     def get_filtered_posts(self, length, language, tag):
21 |         df_filtered = self.df[
22 |             (self.df['tags'].apply(lambda tags: tag in tags)) &  # Post has the requested tag
23 |             (self.df['language'] == language) &  # Language matches
24 |             (self.df['length'] == length)  # Length category matches
25 |         ]
26 |         return df_filtered.to_dict(orient='records')
27 |
28 |     def categorize_length(self, line_count):
29 |         if line_count < 5:
30 |             return "Short"
31 |         elif 5 <= line_count <= 10:
32 |             return "Medium"
33 |         else:
34 |             return "Long"
35 |
36 |     def get_tags(self):
37 |         return self.unique_tags
38 |
39 |
40 | if __name__ == "__main__":
41 |     fs = FewShotPosts()
42 |     # print(fs.get_tags())
43 |     posts = fs.get_filtered_posts("Medium", "Hinglish", "Job Search")
44 |     print(posts)
--------------------------------------------------------------------------------
/llm_helper.py:
--------------------------------------------------------------------------------
1 | from langchain_groq import ChatGroq
2 | import os
3 | from dotenv import load_dotenv
4 |
5 | load_dotenv()
6 | llm = ChatGroq(groq_api_key=os.getenv("GROQ_API_KEY"), model_name="llama-3.2-90b-text-preview")
7 |
8 |
9 | if __name__ == "__main__":
10 |     response = llm.invoke("Two most important ingredients in samosa are ")
11 |     print(response.content)
--------------------------------------------------------------------------------
/main.py:
--------------------------------------------------------------------------------
1 | import streamlit as st
2 | from few_shot import FewShotPosts
3 | from post_generator import generate_post
4 |
5 |
6 | # Options for length and language
7 | length_options = ["Short", "Medium", "Long"]
8 | language_options = ["English", "Hinglish"]
9 |
10 |
11 | # Main app layout
12 | def main():
13 |     st.subheader("LinkedIn Post Generator: Codebasics")
14 |
15 |     # Create three columns for the dropdowns
16 |     col1, col2, col3 = st.columns(3)
17 |
18 |     fs = FewShotPosts()
19 |     tags = fs.get_tags()
20 |     with col1:
21 |         # Dropdown for Topic (Tags)
22 |         selected_tag = st.selectbox("Topic", options=tags)
23 |
24 |     with col2:
25 |         # Dropdown for Length
26 |         selected_length = st.selectbox("Length", options=length_options)
27 |
28 |     with col3:
29 |         # Dropdown for Language
30 |         selected_language = st.selectbox("Language", options=language_options)
31 |
32 |
33 |
34 |     # Generate Button
35 |     if st.button("Generate"):
36 |         post = generate_post(selected_length, selected_language, selected_tag)
37 |         st.write(post)
38 |
39 |
40 | # Run the app
41 | if __name__ == "__main__":
42 |     main()
43 |
--------------------------------------------------------------------------------
/post_generator.py:
--------------------------------------------------------------------------------
1 | from llm_helper import llm
2 | from few_shot import FewShotPosts
3 |
4 | few_shot = FewShotPosts()
5 |
6 |
7 | def get_length_str(length):
8 |     if length == "Short":
9 |         return "1 to 4 lines"
10 |     if length == "Medium":
11 |         return "5 to 10 lines"
12 |     if length == "Long":
13 |         return "11 to 15 lines"
14 |
15 |
16 | def generate_post(length, language, tag):
17 |     prompt = get_prompt(length, language, tag)
18 |     response = llm.invoke(prompt)
19 |     return response.content
20 |
21 |
22 | def get_prompt(length, language, tag):
23 |     length_str = get_length_str(length)
24 |
25 |     prompt = f'''
26 |     Generate a LinkedIn post using the below information. No preamble.
27 |
28 |     1) Topic: {tag}
29 |     2) Length: {length_str}
30 |     3) Language: {language}
31 |     If Language is Hinglish then it means it is a mix of Hindi and English.
32 |     The script for the generated post should always be English.
33 |     '''
34 |     # prompt = prompt.format(post_topic=tag, post_length=length_str, post_language=language)
35 |
36 |     examples = few_shot.get_filtered_posts(length, language, tag)
37 |
38 |     if len(examples) > 0:
39 |         prompt += "4) Use the writing style as per the following examples."
40 |
41 |     for i, post in enumerate(examples):
42 |         post_text = post['text']
43 |         prompt += f'\n\n Example {i+1}: \n\n {post_text}'
44 |
45 |         if i == 1:  # Use max two samples
46 |             break
47 |
48 |     return prompt
49 |
50 |
51 | if __name__ == "__main__":
52 |     print(generate_post("Medium", "English", "Mental Health"))
--------------------------------------------------------------------------------
/preprocess.py:
--------------------------------------------------------------------------------
1 | import json
2 | from llm_helper import llm
3 | from langchain_core.prompts import PromptTemplate
4 | from langchain_core.output_parsers import JsonOutputParser
5 | from langchain_core.exceptions import OutputParserException
6 |
7 |
8 | def process_posts(raw_file_path, processed_file_path=None):
9 |     with open(raw_file_path, encoding='utf-8') as file:
10 |         posts = json.load(file)
11 |         enriched_posts = []
12 |         for post in posts:
13 |             metadata = extract_metadata(post['text'])
14 |             post_with_metadata = post | metadata
15 |             enriched_posts.append(post_with_metadata)
16 |
17 |     unified_tags = get_unified_tags(enriched_posts)
18 |     for post in enriched_posts:
19 |         current_tags = post['tags']
20 |         new_tags = {unified_tags.get(tag, tag) for tag in current_tags}  # Keep the original tag if the mapping omits it
21 |         post['tags'] = list(new_tags)
22 |
23 |     with open(processed_file_path, encoding='utf-8', mode="w") as outfile:
24 |         json.dump(enriched_posts, outfile, indent=4)
25 |
26 |
27 | def extract_metadata(post):
28 |     template = '''
29 |     You are given a LinkedIn post. You need to extract number of lines, language of the post and tags.
30 |     1. Return a valid JSON. No preamble.
31 |     2. JSON object should have exactly three keys: line_count, language and tags.
32 |     3. tags is an array of text tags. Extract maximum two tags.
33 |     4. Language should be English or Hinglish (Hinglish means Hindi + English)
34 |
35 |     Here is the actual post on which you need to perform this task:
36 |     {post}
37 |     '''
38 |
39 |     pt = PromptTemplate.from_template(template)
40 |     chain = pt | llm
41 |     response = chain.invoke(input={"post": post})
42 |
43 |     try:
44 |         json_parser = JsonOutputParser()
45 |         res = json_parser.parse(response.content)
46 |     except OutputParserException:
47 |         raise OutputParserException("Unable to parse the post metadata as JSON.")
48 |     return res
49 |
50 |
51 | def get_unified_tags(posts_with_metadata):
52 |     unique_tags = set()
53 |     # Loop through each post and extract the tags
54 |     for post in posts_with_metadata:
55 |         unique_tags.update(post['tags'])  # Add the tags to the set
56 |
57 |     unique_tags_list = ','.join(unique_tags)
58 |
59 |     template = '''I will give you a list of tags. You need to unify tags with the following requirements,
60 |     1. Tags are unified and merged to create a shorter list.
61 |        Example 1: "Jobseekers", "Job Hunting" can be all merged into a single tag "Job Search".
62 |        Example 2: "Motivation", "Inspiration", "Drive" can be mapped to "Motivation"
63 |        Example 3: "Personal Growth", "Personal Development", "Self Improvement" can be mapped to "Self Improvement"
64 |        Example 4: "Scam Alert", "Job Scam" etc. can be mapped to "Scams"
65 |     2. Each tag should follow title case convention. Example: "Motivation", "Job Search"
66 |     3. Output should be a JSON object. No preamble.
67 |     4. Output should have a mapping of each original tag to its unified tag.
68 |        For example: {{"Jobseekers": "Job Search", "Job Hunting": "Job Search", "Motivation": "Motivation"}}
69 |
70 |     Here is the list of tags:
71 |     {tags}
72 |     '''
73 |     pt = PromptTemplate.from_template(template)
74 |     chain = pt | llm
75 |     response = chain.invoke(input={"tags": str(unique_tags_list)})
76 |     try:
77 |         json_parser = JsonOutputParser()
78 |         res = json_parser.parse(response.content)
79 |     except OutputParserException:
80 |         raise OutputParserException("Unable to parse the unified tag mapping as JSON.")
81 |     return res
82 |
83 |
84 | if __name__ == "__main__":
85 |     process_posts("data/raw_posts.json", "data/processed_posts.json")
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | streamlit==1.35.0
2 | langchain==0.2.14
3 | langchain-core==0.2.39
4 | langchain-community==0.2.12
5 | langchain_groq==0.1.9
6 | pandas==2.0.2
7 | python-dotenv
--------------------------------------------------------------------------------
/resources/architecture.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/codebasics/project-genai-post-generator/62c5e1672a70dcac1724adb1a98c5771dc64b7f0/resources/architecture.jpg
--------------------------------------------------------------------------------
/resources/tool.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/codebasics/project-genai-post-generator/62c5e1672a70dcac1724adb1a98c5771dc64b7f0/resources/tool.jpg
--------------------------------------------------------------------------------