├── .gitignore
├── README.md
├── api_helper
    ├── ghost_api.py
    └── serp_api.py
├── app.py
├── llm_keyword_fetcher
    └── llm_generator.py
├── postgres.py
├── prompt.py
├── prompts
    ├── content_prompt.py
    ├── faq_prompt.py
    ├── feedback_content_prompt.py
    └── structure_prompt.py
├── requirements.txt
├── seo
    └── data_for_seo_api.py
└── st_frontend
    ├── frontend.py
    └── st_helper.py


/.gitignore:
--------------------------------------------------------------------------------
1 | .env
2 | /__pycache__
3 | .DS_Store
4 | /api_helper/__pycache__
5 | /st_frontend/__pycache__
6 | /prompts/__pycache__
7 | /seo/__pycache__
8 | /llm_keyword_fetcher/__pycache__


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
  1 | ## BlogIQ - Content Generation using AI (Clone of writesonic.com & copy.ai)
  2 | 
  3 | ### 🚀 Introduction:
  4 | 
  5 | BlogIQ stands as a beacon of innovation in the realm of content creation, providing bloggers with an advanced platform powered by state-of-the-art technology, including Langchain, Langgraph, and OpenAI GPT-4 models. 🌟 Seamlessly integrated with dataforseo.com API, BlogIQ offers a comprehensive suite of features tailored to streamline and optimize the blogging process.
  6 | 
  7 | 🔍 Step 1: Topic and Keyword Selection:
  8 | 
  9 | At its core, BlogIQ simplifies content creation by allowing users to specify their desired topic and primary keyword. Leveraging data from the dataforseo.com API, the app conducts an extensive Google search to curate a selection of relevant URLs. Users retain complete control, with the option to manually select additional URLs or meta keywords, or opt for automated meta keyword generation using OpenAI LLM to ensure adherence to SEO best practices. 📊💡
 10 | 
 11 | 🛠️ Step 2: Title and Structure Generation:
 12 | 
 13 | Powered by Langchain, Langgraph, and OpenAI GPT-4 models, BlogIQ facilitates title and structure generation with unprecedented accuracy. Users are presented with a range of options tailored to their preferences, with the added flexibility of providing input to direct the GPT-4 models during this process, ensuring that the generated content seamlessly aligns with their vision and objectives. 💭✨
 14 | 
 15 | 📝 Step 3: Content Generation:
 16 | 
 17 | Expanding upon the chosen structure, BlogIQ dynamically generates content enriched with insights gleaned from the selected URLs and meta keywords. Users can further guide the content generation process by providing prompts to the GPT-4 models, fostering personalized and engaging content creation experiences. 📝✨
 18 | 
 19 | 💬 Step 4: FAQ Generation:
 20 | 
 21 | In the final stage, BlogIQ completes the content creation journey by generating FAQs for the blog. By analyzing the generated content and identifying potential questions, the app automatically generates a set of FAQs, enriching the blog post with valuable additional information and enhancing its engagement potential. 🤔💬
 22 | 
 23 | ![ezgif com-optimize](https://github.com/langchain-tech/BlogIQ/assets/100914015/1b53acb4-bdd9-4bfe-9e81-03dafb11ef68)
 24 | 
 25 | 
 26 | ## Getting Started
 27 | 
 28 | ### Prerequisites
 29 | 
 30 | Before running the app, make sure you have the following dependencies installed:
 31 | 
 32 | - Python 3.x
 33 | - pip (Python package installer)
 34 | 
 35 | ### Installation
 36 | 
 37 | 1. Clone the repository:
 38 | 
 39 |     ```bash
 40 |     git clone https://github.com/langschain/BlogIQ.git
 41 |     ```
 42 | 
 43 | 2. Navigate to the project directory:
 44 | 
 45 |     ```bash
 46 |     cd BlogIQ
 47 |     ```
 48 | 
 49 | 3. Install the required packages:
 50 | 
 51 |     ```bash
 52 |     pip install -r requirements.txt
 53 |     ```
 54 | 
 55 | ### Configuration
 56 | 
 57 | 1. Create a `.env` file in the project root directory.
 58 | 
 59 | 2. Add the following environment variables to the `.env` file:
 60 | 
 61 |     ```
 62 |     LANGCHAIN_TRACING_V2=your_langchain_tracing_v2_key
 63 |     LANGCHAIN_PROJECT=your_langchain_project_key
 64 |     OPENAI_API_KEY=your_openai_api_key
 65 |     LANGCHAIN_API_KEY=your_langchain_api_key
 66 |     DATA_FOR_SEO_TOKEN=your_data_for_seo_api_key
 67 |     ```
 68 | 
 69 |     Replace `your_langchain_tracing_v2_key`, `your_langchain_project_key`, `your_openai_api_key`, `your_langchain_api_key`, and `your_data_for_seo_api_key` with your actual API keys.
 70 | 
 71 | ## Usage
 72 | 
 73 | 1. Run the Streamlit app:
 74 | 
 75 |     ```bash
 76 |     streamlit run app.py
 77 |     ```
 78 | 
 79 | 2. Access the app in your web browser at [http://localhost:8501](http://localhost:8501).
 80 | 
 81 | ## Additional Information
 82 | 
 83 | 📚 Explore Technical Blogs Generated by BlogIQ on Langchain Blogs:
 84 | 
 85 | Welcome to the collection of technical blogs generated by BlogIQ on Langchain Blogs, your go-to platform for streamlined content creation powered by advanced AI technology. 🚀
 86 | 
 87 | Where to Find the Technical Blogs:
 88 | 
 89 | Curious to see the technical content generated by BlogIQ in action? You can explore all the technical blogs created using our innovative tool on [Langchain Blogs](https://www.langchain.ca/blog/). From in-depth articles on software development and data science to tutorials on machine learning and programming languages, there's something for every tech enthusiast.
 90 | 
 91 | Why Explore:
 92 | 
 93 | Whether you're a software developer seeking new insights, a data scientist exploring cutting-edge technologies, or a tech enthusiast looking to expand your knowledge, Langchain Blogs is the perfect destination. Each blog post is crafted using BlogIQ's intuitive interface, which harnesses the power of Langchain, Langgraph, and OpenAI GPT-4 models.
 94 | 
 95 | Start Exploring:
 96 | 
 97 | Visit Langchain Blogs today and unlock your creativity! Get ready to dive deep into the world of technology with engaging content that captivates and educates. Happy exploring! 📖✨
 98 | 
 99 | [Explore Technical Blogs Generated by BlogIQ on Langchain Blogs](https://www.langchain.ca/blog/)
100 | 
101 | ## Contributing
102 | 
103 | If you'd like to contribute to this project, please fork the repository and submit a pull request.
104 | 
105 | ## License
106 | 
107 | This project is licensed under the [MIT License](LICENSE).
108 | 
109 | ## Acknowledgments
110 | 
111 | - Special thanks to Langchain and OpenAI for providing powerful tools for natural language processing.
112 | - The app structure and components are based on the Streamlit framework.
113 | 
114 | 


--------------------------------------------------------------------------------
/api_helper/ghost_api.py:
--------------------------------------------------------------------------------
 1 | import requests
 2 | 
 3 | GHOST_API_URL='https://www.langchain.ca/blog/ghost/api/v3/content/posts/'
 4 | 
 5 | # Your Ghost API key
 6 | GHOST_API_KEY='65f1ac9109333f517ddaac03:2cb16de1e79fa7d0e140896c14b541b08de4b54507b79ed54ca1e751b62de6a4'
 7 | 
 8 | 
 9 | 
10 | 
11 | def post_blog(title, content):
12 |     payload = {
13 |         'posts': [{
14 |             'title': title,
15 |             'html': content,
16 |             'tags': ['tag1', 'tag2'],
17 |             'status': 'draft'
18 |         }]
19 |     }
20 |     print(payload)
21 |     response = requests.post(GHOST_API_URL, json=payload, headers={'Authorization': f'Ghost {GHOST_API_KEY}'}, timeout=60)
22 | 
23 |     # Check response status
24 |     if response.status_code == 201:
25 |         print('Blog post published successfully!')
26 |     else:
27 |         print(f'Failed to publish blog post. Status code: {response.status_code}, Error: {response.text}')
28 | 


--------------------------------------------------------------------------------
/api_helper/serp_api.py:
--------------------------------------------------------------------------------
 1 | import os
 2 | from serpapi import GoogleSearch
 3 | from dotenv import load_dotenv
 4 | 
 5 | load_dotenv()
 6 | 
 7 | SERP_API_KEY = os.getenv("SERP_API_KEY")
 8 | 
 9 | def serp_api_caller(question):
10 |     print("---Calling SerpApi---")
11 |     params = {
12 |         "engine": "google",
13 |         "q": question,
14 |         "api_key": SERP_API_KEY
15 |     }
16 |     search = GoogleSearch(params)
17 |     results = search.get_dict()
18 |     organic_results = results["organic_results"]
19 |     return [details['link'] for details in organic_results]


--------------------------------------------------------------------------------
/app.py:
--------------------------------------------------------------------------------
  1 | import os
  2 | import json
  3 | import pprint
  4 | import secrets
  5 | import operator
  6 | import streamlit as st
  7 | from langchain import hub
  8 | from serpapi import GoogleSearch
  9 | from dotenv import load_dotenv
 10 | from typing import Annotated, Sequence, TypedDict, Dict
 11 | from langgraph.graph import END, StateGraph
 12 | from langchain.text_splitter import RecursiveCharacterTextSplitter
 13 | from langchain_community.document_loaders import WebBaseLoader
 14 | from langchain_community.vectorstores import Chroma
 15 | from langchain.output_parsers.openai_tools import PydanticToolsParser
 16 | from langchain.prompts import PromptTemplate
 17 | from langchain.schema import Document
 18 | from langchain_community.tools.tavily_search import TavilySearchResults
 19 | from langchain_community.vectorstores import Chroma
 20 | from langchain_core.messages import BaseMessage, FunctionMessage
 21 | from langchain_core.output_parsers import StrOutputParser
 22 | from langchain_core.pydantic_v1 import BaseModel, Field
 23 | from langchain_core.runnables import RunnablePassthrough
 24 | from langchain_core.utils.function_calling import convert_to_openai_tool
 25 | from langchain_openai import ChatOpenAI, OpenAIEmbeddings
 26 | from postgres import create_record, update_record
 27 | from prompt import get_structure_template, get_content_generator_template
 28 | 
 29 | 
 30 | from langchain_community.chat_models import ChatOllama
 31 | from langchain_community.embeddings import OllamaEmbeddings
 32 | 
 33 | ## app imports
 34 | from st_frontend.frontend import main
 35 | from prompts.content_prompt import content_template
 36 | from prompts.structure_prompt import structure_template
 37 | from prompts.feedback_content_prompt import feedback_content_template
 38 | from prompts.faq_prompt import faq_template
 39 | 
 40 | ### Uncomment import 'pdb' this to use debugger in the app
 41 | ### Use this code in between any file or function to stop debugger at any point pdb.set_trace()
 42 | import pdb
 43 | 
 44 | ## Used to load .env file
 45 | load_dotenv()
 46 | 
 47 | os.environ["LANGCHAIN_TRACING_V2"] = os.getenv("LANGCHAIN_TRACING_V2")
 48 | os.environ["LANGCHAIN_PROJECT"] = os.getenv("LANGCHAIN_PROJECT")
 49 | os.environ["OPENAI_API_KEY"] = os.getenv("OPENAI_API_KEY")
 50 | os.environ["LANGCHAIN_API_KEY"] = os.getenv("LANGCHAIN_API_KEY")
 51 | os.environ["TAVILY_API_KEY"] = os.getenv("TAVILY_API_KEY")
 52 | 
 53 | class GraphState(TypedDict):
 54 |     keys: Dict[str, any]
 55 | 
 56 | def create_collection(collection_name, question, urls):
 57 |     print("---Got Results---")
 58 |     docs = [WebBaseLoader(url).load() for url in urls]
 59 |     docs_list = [item for sublist in docs for item in sublist]
 60 |     text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
 61 |         chunk_size=1000, chunk_overlap=0
 62 |     )
 63 |     doc_splits = text_splitter.split_documents(docs_list)
 64 |     print("---CREATING NEW DOCUMENTS---")
 65 |     vectorstore = Chroma.from_documents(
 66 |         documents=doc_splits,
 67 |         collection_name=collection_name,
 68 |         embedding=OpenAIEmbeddings(),
 69 |     )
 70 |     create_record(collection_name, urls)
 71 |     print(f"Collection '{collection_name}' created successfully.")
 72 |     return vectorstore.as_retriever()
 73 | 
 74 | def retrieve_documents(collection_name, question):
 75 |     print("---RETRIEVING OLD DOCUMENTS---")
 76 |     embedding_function = OpenAIEmbeddings()
 77 |     vectorstore = Chroma(collection_name, embedding_function)
 78 |     return vectorstore.as_retriever()
 79 |      
 80 | 
 81 | def retrieve(state):
 82 |     print("---RETRIEVE---")
 83 |     state_dict = state["keys"]
 84 |     question = state_dict["question"]
 85 |     primary_keyword = state_dict["primary_keyword"]
 86 |     structure_prompt = state_dict["structure_prompt"]
 87 |     urls = state_dict["selected_urls"]
 88 |     step_to_execute = state_dict["step_to_execute"]
 89 |     selected_keywords = state_dict["selected_keywords"]
 90 | 
 91 | 
 92 |     if 'total_headings' in state_dict:
 93 |         total_headings = state_dict['total_headings']
 94 |     else:
 95 |         total_headings = ''
 96 | 
 97 |     if 'current_heading' in state_dict:
 98 |         current_heading = state_dict['current_heading']
 99 |     else:
100 |         current_heading = ''
101 | 
102 |     if 'faq_prompt' in state_dict:
103 |         faq_prompt = state_dict['faq_prompt']
104 |     else:
105 |         faq_prompt = ''
106 | 
107 |     if 'blog_prompt' in state_dict:
108 |         blog_prompt = state_dict['blog_prompt']
109 |     else:
110 |         blog_prompt = ''
111 | 
112 |     if 'number_of_words_per_heading' in state_dict:
113 |         number_of_words_per_heading = state_dict['number_of_words_per_heading']
114 |     else:
115 |         number_of_words_per_heading = ''
116 | 
117 |     if 'blog_content' in state_dict:
118 |         blog_content = state_dict['blog_content']
119 |     else:
120 |         blog_content = ''
121 | 
122 |     if 'blog_title' in state_dict:
123 |         blog_title = state_dict["blog_title"]
124 |     else:
125 |         blog_title = ''
126 | 
127 |     if 'blog' in state_dict:
128 |         blog = state_dict["blog"]
129 |     else:
130 |         blog = ''
131 | 
132 |     if 'rephrase_context' in state_dict:
133 |         rephrase_context = state_dict["rephrase_context"]
134 |     else:
135 |         rephrase_context = ''
136 | 
137 |     if 'rephrase' in state_dict:
138 |         rephrase = state_dict["rephrase"]
139 |     else:
140 |         rephrase = ''
141 | 
142 |     if 'structure' in state_dict:
143 |         structure = state_dict["structure"]
144 |     else:
145 |         structure = ""
146 | 
147 |     if 'heading' in state_dict:
148 |         heading = state_dict["heading"]
149 |     else:
150 |         heading = ""
151 | 
152 | 
153 |     if 'collection_key' in state_dict:
154 |         collection_key = state_dict["collection_key"]
155 |         retriever = retrieve_documents(collection_key, heading)
156 |     else:
157 |         collection_key = secrets.token_hex(12 // 2)
158 |         retriever = create_collection(collection_key, question, urls)
159 | 
160 |     documents = retriever.get_relevant_documents(heading)
161 | 
162 |     return  {    "keys":
163 | 
164 |                 {
165 |                     "documents": documents,
166 |                     "question": question,
167 |                     'primary_keyword': primary_keyword,
168 |                     "structure_prompt": structure_prompt,
169 |                     "urls": urls,
170 |                     "step_to_execute": step_to_execute,
171 |                     "structure": structure,
172 |                     "collection_key": collection_key,
173 |                     "heading": heading,
174 |                     "rephrase_context": rephrase_context,
175 |                     "rephrase": rephrase,
176 |                     "blog": blog,
177 |                     "blog_title": blog_title,
178 |                     "selected_keywords": selected_keywords,
179 |                     "blog_content": blog_content,
180 |                     "number_of_words_per_heading": number_of_words_per_heading,
181 |                     "blog_prompt": blog_prompt,
182 |                     "faq_prompt": faq_prompt,
183 |                     "total_headings": total_headings,
184 |                     "current_heading": current_heading
185 | 
186 |                 }
187 |             }
188 | 
189 | def generate(state):
190 |     blog_structure = {
191 |         "Blog_Structure_1":
192 |             {
193 |                 "title": "TITLE",
194 |                 "headings":
195 |                     [
196 |                         "HEADING 1",
197 |                         "HEADING 2",
198 |                         "HEADING 3",
199 |                         "HEADING 4",
200 |                         "HEADING 5",
201 |                         "HEADING 6",
202 |                         "HEADING 7",
203 |                         "HEADING 8",
204 |                         "HEADING 9",
205 |                         "HEADING 10"
206 |                     ]
207 |             },
208 |         "Blog_Structure_2":
209 |             {
210 |                 "title": "TITLE",
211 |                 "headings":
212 |                     [
213 |                         "HEADING 1",
214 |                         "HEADING 2",
215 |                         "HEADING 3",
216 |                         "HEADING 4",
217 |                         "HEADING 5",
218 |                         "HEADING 6",
219 |                         "HEADING 7",
220 |                         "HEADING 8",
221 |                         "HEADING 9",
222 |                         "HEADING 10"
223 |                     ]
224 |             },
225 |         "Blog_Structure_3":
226 |             {
227 |                 "title": "TITLE",
228 |                 "headings":
229 |                     [
230 |                         "HEADING 1",
231 |                         "HEADING 2",
232 |                         "HEADING 3",
233 |                         "HEADING 4",
234 |                         "HEADING 5",
235 |                         "HEADING 6",
236 |                         "HEADING 7",
237 |                         "HEADING 8",
238 |                         "HEADING 9",
239 |                         "HEADING 10"
240 |                     ]
241 |             }
242 |     }
243 |     print("---GENERATE---")
244 |     state_dict = state["keys"]
245 |     question = state_dict["question"]
246 |     documents = state_dict["documents"]
247 |     primary_keyword = state_dict["primary_keyword"]
248 |     structure_prompt = state_dict["structure_prompt"]
249 |     urls = state_dict["urls"]
250 |     collection_key = state_dict["collection_key"]
251 |     step_to_execute = state_dict["step_to_execute"]
252 |     structure = state_dict["structure"]
253 |     heading = state_dict["heading"]
254 |     rephrase_context = state_dict["rephrase_context"]
255 |     rephrase = state_dict["rephrase"]
256 |     blog = state_dict["blog"]
257 |     blog_title = state_dict["blog_title"]
258 |     selected_keywords = state_dict['selected_keywords']
259 |     blog_content = state_dict['blog_content']
260 |     number_of_words_per_heading = state_dict['number_of_words_per_heading']
261 |     blog_prompt = state_dict['blog_prompt']
262 |     faq_prompt = state_dict['faq_prompt']
263 |     total_headings = state_dict['total_headings']
264 |     current_heading = state_dict['current_heading']
265 |     print(state_dict)
266 | 
267 |     if step_to_execute == "Generate Structure":
268 |         heading = ''
269 |         template = structure_template()
270 |         prompt = PromptTemplate(template=template, input_variables=["documents", "question", "structure_prompt", "primary_keyword", "blog_structure", "selected_keywords"])
271 |     elif rephrase == True:
272 |         template = feedback_content_template()
273 |         prompt = PromptTemplate(template=template, input_variables=["documents", "structure", "primary_keyword", "refference_links", "rephrase_context", "blog", "structure_prompt"])
274 |     elif step_to_execute == "Generate Blog":
275 |         heading = state_dict["heading"]
276 |         template = content_template(blog_content)
277 |         prompt = PromptTemplate(template=template, input_variables=["documents", "structure", "primary_keyword", "number_of_words_per_heading", "refference_links", "heading", "blog_title", "selected_keywords", "blog_content", "blog_prompt", "total_headings", "current_heading"])
278 |     elif step_to_execute == "Generate Faq's":
279 |         template = faq_template()
280 |         prompt = PromptTemplate(template=template, input_variables=["documents", "primary_keyword", "selected_keywords", "faq_prompt"])
281 | 
282 |     llm = ChatOpenAI(model_name="gpt-4-turbo-preview", temperature=0.7, streaming=True, max_tokens=4096, verbose=True)
283 |     # llm = ChatOpenAI(model_name="gpt-3.5-turbo-0125", temperature=0.7, streaming=True, max_tokens=4096, verbose=True)
284 |     # llm = ChatOllama(model="llama2:latest")
285 |     rag_chain = prompt | llm | StrOutputParser()
286 | 
287 |     if step_to_execute == "Generate Structure":
288 | 
289 |         generation = rag_chain.invoke(
290 |             {
291 |                 "documents": documents,
292 |                 "question": question,
293 |                 "structure_prompt": structure_prompt,
294 |                 "primary_keyword": primary_keyword,
295 |                 "refference_links": urls,
296 |                 "blog_structure": blog_structure,
297 |                 "selected_keywords": selected_keywords
298 |             }
299 |         )
300 |         print("------- Structure Generated -------")
301 | 
302 |     elif rephrase == True:
303 |         generation = rag_chain.invoke(
304 |             {
305 |                 "documents": documents,
306 |                 "primary_keyword": primary_keyword,
307 |                 "refference_links": urls,
308 |                 "structure": structure,
309 |                 "heading": heading,
310 |                 "blog": blog,
311 |                 "blog_title": blog_title,
312 |                 "rephrase_context": rephrase_context,
313 |                 "structure_prompt": structure_prompt
314 |             }
315 |         )
316 |         print("------- Content Rephrased -------")
317 | 
318 |     elif step_to_execute == "Generate Blog":
319 |         generation = rag_chain.invoke(
320 |             {
321 |                 "documents": documents,
322 |                 "primary_keyword": primary_keyword,
323 |                 "refference_links": urls,
324 |                 "structure": structure,
325 |                 "heading": heading,
326 |                 "blog": blog,
327 |                 "blog_title": blog_title,
328 |                 "selected_keywords": selected_keywords,
329 |                 "blog_content": blog_content,
330 |                 "number_of_words_per_heading": number_of_words_per_heading,
331 |                 "blog_prompt": blog_prompt,
332 |                 "total_headings": total_headings,
333 |                 "current_heading": current_heading
334 |             }
335 |         ) 
336 |         print("------- Content Generated -------")
337 | 
338 |     elif step_to_execute == "Generate Faq's":
339 |         generation = rag_chain.invoke(
340 |             {
341 |                 "documents": documents,
342 |                 "primary_keyword": primary_keyword,
343 |                 "selected_keywords": selected_keywords,
344 |                 "faq_prompt": faq_prompt,
345 |             }
346 |         )
347 |         print("------- Faq's Generated -------")
348 | 
349 |     return  {    "keys":
350 | 
351 |                 {
352 |                     "documents": documents,
353 |                     "question": question,
354 |                     'primary_keyword': primary_keyword,
355 |                     "structure_prompt": structure_prompt,
356 |                     "urls": urls,
357 |                     "generation": generation,
358 |                     "step_to_execute": step_to_execute,
359 |                     "blog": generation,
360 |                     "collection_key": collection_key,
361 |                     "heading": heading
362 | 
363 |                 }
364 |             }
365 | 
366 | workflow = StateGraph(GraphState)
367 | workflow.add_node("retrieve", retrieve)
368 | workflow.add_node("generate", generate)
369 | workflow.set_entry_point("retrieve")
370 | 
371 | workflow.add_edge("retrieve", "generate")
372 | workflow.add_edge("generate", END)
373 | app = workflow.compile()
374 | 
375 | if __name__ == "__main__":
376 |     main(app)
377 | 


--------------------------------------------------------------------------------
/llm_keyword_fetcher/llm_generator.py:
--------------------------------------------------------------------------------
 1 | import os
 2 | from dotenv import load_dotenv
 3 | from langchain_openai import ChatOpenAI
 4 | from langchain_core.prompts import ChatPromptTemplate
 5 | from langchain_core.output_parsers import JsonOutputParser
 6 | import json  # For parsing JSON output
 7 | 
 8 | ### Uncomment import 'pdb' this to use debugger in the app
 9 | ### Use this code in between any file or function to stop debugger at any point pdb.set_trace()
10 | import pdb  # Optional debugger
11 | 
12 | ## Used to load .env file
13 | load_dotenv()
14 | 
15 | os.environ["OPENAI_API_KEY"] = os.getenv("OPENAI_API_KEY")
16 | 
17 | # Create output parser and LLM instances
18 | LLM = ChatOpenAI()
19 | 
20 | 
21 | def call_llm(topic, primary_keywords):
22 |     """
23 |     This function calls the LLM to generate SEO-friendly meta keywords
24 |     as a Python list for the provided topic and primary keyword.
25 | 
26 |     Args:
27 |         topic (str): Topic of the blog.
28 |         primary_keywords (str): Primary keyword for the blog.
29 | 
30 |     Returns:
31 |         list: List of generated SEO-friendly meta keywords.
32 |     """
33 | 
34 |     prompt = ChatPromptTemplate.from_messages([
35 |         ("system", f"You are a world SEO enhancing engineer. Provide SEO-friendly meta keywords as a JSON object containing a 'keywords' key with a list of keywords, approximately 30-40 in total."),  # Modified prompt
36 |         ("user", "{input}")
37 |     ])
38 |     chain = prompt | LLM | JsonOutputParser()
39 | 
40 |     response = chain.invoke({"input": f"Here is the topic of blog --> {topic} and primary keyword for blog --> {primary_keywords}"})
41 |     print(response)
42 | 
43 |     try:
44 |         return response["keywords"]
45 |     except (KeyError, json.JSONDecodeError) as e:
46 |         print(f"Error parsing response: {e}")
47 |         return []


--------------------------------------------------------------------------------
/postgres.py:
--------------------------------------------------------------------------------
 1 | # CREATE TABLE public.embeddings (
 2 | #     collection_key character varying,
 3 | #     serp_urls      character varying[], -- Assuming ARRAY of character varying
 4 | #     created_at     timestamp without time zone,
 5 | #     updated_at     timestamp without time zone
 6 | # );
 7 | 
 8 | import psycopg2
 9 | 
10 | import os
11 | from dotenv import load_dotenv
12 | load_dotenv()
13 | 
14 | dbname = os.getenv("DB_NAME")
15 | user = os.getenv("DB_USER")
16 | password = os.getenv("DB_PASSWORD")
17 | host = os.getenv("DB_HOST")
18 | port = os.getenv("DB_PORT")
19 | 
20 | def create_connection():
21 |     try:
22 |         connection = psycopg2.connect(dbname=dbname, user=user, password=password, host=host, port=port)
23 |         cursor = connection.cursor()
24 |         return connection, cursor
25 |     except psycopg2.Error as error:
26 |         print("Error creating database connection:", error)
27 |         return None, None
28 | 
29 | def close_connection(connection):
30 |     if connection:
31 |         connection.close()
32 |         print("Connection closed.")
33 | 
34 | def create_record(collection_key, serp_urls):
35 |     connection, cursor = create_connection()
36 |     if connection and cursor:
37 |         try:
38 |             insert_query = """
39 |             INSERT INTO embeddings (collection_key, serp_urls)
40 |             VALUES (%s, %s);
41 |             """
42 |             cursor.execute(insert_query, (collection_key, serp_urls))
43 |             connection.commit()
44 |             print("Record created successfully.")
45 |         except psycopg2.Error as error:
46 |             print("Error with database operation:", error)
47 |         finally:
48 |             close_connection(connection)
49 | 
50 | def update_record(collection_key, new_serp_urls):
51 |     connection, cursor = create_connection()
52 | 
53 |     if connection and cursor:
54 |         try:
55 |             # SQL query to update the serp_urls for a specific record
56 |             update_query = """
57 |             UPDATE embeddings
58 |             SET serp_urls = %s, updated_at = CURRENT_TIMESTAMP
59 |             WHERE collection_key = %s;
60 |             """
61 |             cursor.execute(update_query, (new_serp_urls, collection_key))
62 |             connection.commit()
63 |             print("Record updated successfully.")
64 |         except psycopg2.Error as error:
65 |             print("Error with database operation:", error)
66 |         finally:
67 |             close_connection(connection)
68 | 


--------------------------------------------------------------------------------
/prompt.py:
--------------------------------------------------------------------------------
 1 | def get_structure_template():
 2 |     template = """
 3 |     Please provide me structure of blog of on {question}, must have a structure, must have good headings and must have 10 headings.
 4 |     Important Commands --> Need to provide a good title and its must contain the key word --> {primary_keyword}
 5 |     Additional commands need to follow -->
 6 |     {structure_prompt}
 7 | 
 8 |     Here is the knowledge base for better understanding quality content blog --> {documents}
 9 |     NOTE --> Provide at least three sets of structure with different variations in heading.
10 |     Note to Content Creator: Please ensure that the content you generate is entirely original and not copied from any existing sources. Avoid direct copy-pasting and provide unique insights and perspectives. Plagiarism is strictly prohibited, and we require 100% unique content for the blog. Make sure to use your own words and creativity to deliver high-quality, original content. Thank you for your understanding and adherence to these guidelines.
11 |     Very Import Note -->  Proivde structure as json
12 |     Here is the DEMO structure to follow --> {blog_structure}
13 |     """
14 |     return template
15 | 
16 | # def get_content_generator_template():
17 | #     template = """
18 | #     You are a world class blog write and your name is Michael. You need to write a write a blog using the first object from the python dictionary.
19 | #     Here is the strcuture dictionary object --> {structure}
20 | #     Here is the knowledge base for better understanding quality content blog --> {documents}
21 | #     Important Command --> You need to write atleast 200 words for each heading.
22 | #     """
23 | #     return template
24 | 
25 | def get_content_generator_template():
26 |     template = """
27 |     You are a world class writer of blogs and articles and your name is Michael. Will provide you the heading and can make subheadings as per your way of writing.
28 |     Here is the heading on that you need to write content --> {heading}
29 |     Here is the knowledge base for better understanding quality content blog --> {documents}
30 |     Important Command --> You need to write atleast 200 words for the given topic.
31 |     """
32 |     return template
33 | 
34 | 
35 | # def get_template():
36 | #     template = """
37 | #     Provide me Good Blog.
38 | #     Here is the question on which you need to write a blog. Question --> {question}
39 | #     Note --> Generates content without imposing a maximum token limit. Must contain {blog_words_limit} words.
40 | #     As you are one of the best content writers in the world, your name is Joe. Today, we're tasked with writing a blog post, and it's essential that we adhere to the following ruleset:
41 | 
42 | #     1) Make sure each paragraph must be 100 to 150 words long.
43 | #     2) The blog should be human-readable and unique.
44 | #     3) A conclusion should be included at the end of the blog.
45 | #     4) The blog should follow a common blogging structure.
46 | #     5) Blog should in multiple paragraphs 
47 | #     6) Use meta keywords in blog writing
48 | #     7) Make sure not to copy anything from documents (knowledge base) directly, otherwise it will be plagiarism
49 | 
50 | #     I'll be provided with documents to review and understand the topics we'll be covering. Additionally, I'll provide meta keywords that we need to incorporate to enhance our ranking on Google search.
51 | 
52 | #     Here is the documents (knowledge base) that you need to utilise to write blog --> {documents}, primary meta keywords --> {primary_keyword} & here is addtional context ---> {structure_prompt}
53 | #     """
54 | #     return template
55 | 
56 | # def get_template():
57 | #     template = """
58 | #         Provide a comprehensive blog post on the topic of Natural Language Processing (NLP).
59 |         
60 | #         Additional Context:
61 | 
62 | #         {structure_prompt}
63 | 
64 | #         Meta Keywords:
65 | #         {keywords_string}
66 | 
67 | #         Knowledge Base:
68 | #         {documents}
69 | 
70 | #         Blog Minimum Word Limit and make sure blog contains:
71 | #         {blog_words_limit}
72 | #     """
73 | #     return template
74 | 
75 | 
76 | 
77 | 
78 | 


--------------------------------------------------------------------------------
/prompts/content_prompt.py:
--------------------------------------------------------------------------------
 1 | ### Uncomment import 'pdb' this to use debugger in the app
 2 | ### Use this code in between any file or function to stop debugger at any point pdb.set_trace()
 3 | import pdb
 4 | 
 5 | def content_template(blog):
 6 |   """
 7 |   This function generates a formatted prompt for the LLM to create blog content.
 8 | 
 9 |   Args:
10 |     blog (dict): Dictionary containing previously generated content (title, headings, content)
11 |                   if any.
12 |     heading (str): Current heading for which content needs to be generated.
13 | 
14 |   Returns:
15 |     str: The formatted prompt template.
16 |   """
17 | 
18 |   template = """
19 |   **NOTE: I NEED 100 PERCENT PLAGIARISM-FREE CONTENT. DO NOT DIRECTLY COPY AND PASTE FROM KNOWLEDGE PROVIDED BELOW.**
20 | 
21 |   {blog_prompt}
22 | 
23 |   You are a world-class writer of blogs and articles named Michael. You are going to provide headings of blog one by one and you need to write engaging and informative content for the following heading:
24 | 
25 |   **You need to generate content on total {total_headings} for this blog.**
26 |   **Currently you are going to generate content for {current_heading} heading.**
27 | 
28 |   <h2>{heading}</h2>
29 | 
30 |   **NOTE: Only add `Conclusion` and `Final Thoughts` in last heading of blog if required.**
31 | 
32 |   **Target Word Count:** {number_of_words_per_heading}
33 | 
34 |   **NOTE: Please provide content in proper <h3>, <h4>, <p>, <ul>, <li>, <ol>, <b>, <i>, <a> and other required html tags.**
35 | 
36 |   **Additionally, to further enhance user comprehension, consider including relevant code snippets within the content, especially for technical concepts. You can provide reference links for the important terms.**
37 | 
38 |   **Knowledge Base:**
39 | 
40 |   {documents}  **Important Note:** Aim for at least 200 words while maintaining clarity and avoiding excessive subheadings.
41 | 
42 |   **SEO Keywords:** {selected_keywords}
43 | 
44 |   """
45 |   print(template + content(blog))
46 |   return template + content(blog)
47 | 
48 | 
49 | def content(blog):
50 |   """
51 |   This function retrieves previously generated content for the next heading.
52 | 
53 |   Args:
54 |     blog (dict): Dictionary containing previously generated content.
55 | 
56 |   Returns:
57 |     str: The previously generated content or an empty string.
58 |   """
59 | 
60 |   if blog:
61 |     return "\n**Previously Generated Content:** --> {blog_content}\n\n MOST IMPORTANT WARNING: First check the previously generated content and only than generate new content which is unique and no duplicacy happens in the content at any cost."
62 |   else:
63 |     return "\n"


--------------------------------------------------------------------------------
/prompts/faq_prompt.py:
--------------------------------------------------------------------------------
1 | def faq_template():
2 |     template = """
3 |     You are a world class writer of frequently asked question answers and your name is Michael.
4 |     Here is the user commands for faq generation --> {faq_prompt}
5 |     Here is the meta seo keywords :--> {selected_keywords}. You need to use them in faq to make them seo friendly.
6 |     Here is the knowledge base that you have to use for writing the faq'a question answer --> {documents}
7 |     Provide as a orderd list of faq's.
8 |     """
9 |     return template


--------------------------------------------------------------------------------
/prompts/feedback_content_prompt.py:
--------------------------------------------------------------------------------
 1 | def feedback_content_template():
 2 |     template = """
 3 |     You are a world class writer of blogs and articles and your name is Michael. You have provided me a blog.
 4 |     Blog Title is --> {blog_title}
 5 |     Here is the blog that you have written for me --> {blog}
 6 | 
 7 | 
 8 |     Here is the rephrase query for you to work Michael --> {rephrase_context}
 9 |     Here is the knowledge base that you have used for writing the blog --> {documents}
10 |     Here is some rephrasing feedback from user that you have to consider and rephrased the content.
11 |     NOTE: Please provide content in proper <h2>, <p>, <ul>, <li>, <ol>, <b>, <i> and other required html tags.
12 |     Make sure after doing rephrasing you need to provide whole rephrased blog in the response. Please don't shorten the content at any cost.
13 |     """
14 |     return template


--------------------------------------------------------------------------------
/prompts/structure_prompt.py:
--------------------------------------------------------------------------------
 1 | def structure_template():
 2 |     template = """
 3 |     Topic for blog --> {question}.
 4 |     Important Commands --> Need to provide a good title and its must contain this meta key word --> {primary_keyword}
 5 |     Important user commands to follow -->
 6 |     {structure_prompt}
 7 |     Here is the knowledge base for better understanding quality content blog --> {documents}
 8 |     NOTE --> Provide at least three sets of structure with different variations in heading.
 9 |     Note to Content Creator: Please ensure that the content you generate is entirely original and not copied from any existing sources. Avoid direct copy-pasting and provide unique insights and perspectives. Plagiarism is strictly prohibited, and we require 100% unique content for the blog. Make sure to use your own words and creativity to deliver high-quality, original content. Thank you for your understanding and adherence to these guidelines.
10 |     Very Import Note -->  Proivde structure as json
11 |     Here is the DEMO structure to follow --> {blog_structure}
12 |     DON'T INCLUDE this "```json" in the response.
13 |     """
14 |     return template


--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
  1 | aiohttp==3.9.5
  2 | aiosignal==1.3.1
  3 | altair==5.3.0
  4 | annotated-types==0.6.0
  5 | anyio==4.3.0
  6 | asgiref==3.8.1
  7 | attrs==23.2.0
  8 | backoff==2.2.1
  9 | bcrypt==4.1.2
 10 | beautifulsoup4==4.12.3
 11 | blinker==1.7.0
 12 | build==1.2.1
 13 | cachetools==5.3.3
 14 | certifi==2024.2.2
 15 | charset-normalizer==3.3.2
 16 | chroma-hnswlib==0.7.3
 17 | chromadb==0.5.0
 18 | click==8.1.7
 19 | coloredlogs==15.0.1
 20 | dataclasses-json==0.6.4
 21 | Deprecated==1.2.14
 22 | distro==1.9.0
 23 | fastapi==0.110.2
 24 | filelock==3.13.4
 25 | flatbuffers==24.3.25
 26 | frozenlist==1.4.1
 27 | fsspec==2024.3.1
 28 | gitdb==4.0.11
 29 | GitPython==3.1.43
 30 | google-auth==2.29.0
 31 | google_search_results==2.4.2
 32 | googleapis-common-protos==1.63.0
 33 | grpcio==1.62.2
 34 | h11==0.14.0
 35 | httpcore==1.0.5
 36 | httptools==0.6.1
 37 | httpx==0.27.0
 38 | huggingface-hub==0.22.2
 39 | humanfriendly==10.0
 40 | idna==3.7
 41 | importlib-metadata==7.0.0
 42 | importlib_resources==6.4.0
 43 | Jinja2==3.1.3
 44 | jsonpatch==1.33
 45 | jsonpointer==2.4
 46 | jsonschema==4.21.1
 47 | jsonschema-specifications==2023.12.1
 48 | kubernetes==29.0.0
 49 | langchain==0.1.16
 50 | langchain-community==0.0.34
 51 | langchain-core==0.1.45
 52 | langchain-openai==0.1.3
 53 | langchain-text-splitters==0.0.1
 54 | langgraph==0.0.38
 55 | langsmith==0.1.50
 56 | markdown-it-py==3.0.0
 57 | MarkupSafe==2.1.5
 58 | marshmallow==3.21.1
 59 | mdurl==0.1.2
 60 | mmh3==4.1.0
 61 | monotonic==1.6
 62 | mpmath==1.3.0
 63 | multidict==6.0.5
 64 | mypy-extensions==1.0.0
 65 | numpy==1.26.4
 66 | oauthlib==3.2.2
 67 | onnxruntime==1.17.3
 68 | openai==1.23.3
 69 | opentelemetry-api==1.24.0
 70 | opentelemetry-exporter-otlp-proto-common==1.24.0
 71 | opentelemetry-exporter-otlp-proto-grpc==1.24.0
 72 | opentelemetry-instrumentation==0.45b0
 73 | opentelemetry-instrumentation-asgi==0.45b0
 74 | opentelemetry-instrumentation-fastapi==0.45b0
 75 | opentelemetry-proto==1.24.0
 76 | opentelemetry-sdk==1.24.0
 77 | opentelemetry-semantic-conventions==0.45b0
 78 | opentelemetry-util-http==0.45b0
 79 | orjson==3.10.1
 80 | overrides==7.7.0
 81 | packaging==23.2
 82 | pandas==2.2.2
 83 | pillow==10.3.0
 84 | posthog==3.5.0
 85 | protobuf==4.25.3
 86 | psycopg2-binary==2.9.9
 87 | pyarrow==16.0.0
 88 | pyasn1==0.6.0
 89 | pyasn1_modules==0.4.0
 90 | pydantic==2.7.1
 91 | pydantic_core==2.18.2
 92 | pydeck==0.8.1b0
 93 | Pygments==2.17.2
 94 | PyPika==0.48.9
 95 | pyproject_hooks==1.0.0
 96 | python-dateutil==2.9.0.post0
 97 | python-dotenv==1.0.1
 98 | pytz==2024.1
 99 | PyYAML==6.0.1
100 | referencing==0.34.0
101 | regex==2024.4.16
102 | requests==2.31.0
103 | requests-oauthlib==2.0.0
104 | rich==13.7.1
105 | rpds-py==0.18.0
106 | rsa==4.9
107 | serpapi==0.1.5
108 | setuptools==69.5.1
109 | shellingham==1.5.4
110 | six==1.16.0
111 | smmap==5.0.1
112 | sniffio==1.3.1
113 | soupsieve==2.5
114 | SQLAlchemy==2.0.29
115 | starlette==0.37.2
116 | streamlit==1.33.0
117 | sympy==1.12
118 | tenacity==8.2.3
119 | tiktoken==0.6.0
120 | tokenizers==0.19.1
121 | toml==0.10.2
122 | toolz==0.12.1
123 | tornado==6.4
124 | tqdm==4.66.2
125 | typer==0.12.3
126 | typing-inspect==0.9.0
127 | typing_extensions==4.11.0
128 | tzdata==2024.1
129 | urllib3==2.2.1
130 | uvicorn==0.29.0
131 | uvloop==0.19.0
132 | watchfiles==0.21.0
133 | websocket-client==1.8.0
134 | websockets==12.0
135 | wrapt==1.16.0
136 | yarl==1.9.4
137 | zipp==3.18.1
138 | 


--------------------------------------------------------------------------------
/seo/data_for_seo_api.py:
--------------------------------------------------------------------------------
  1 | import os
  2 | import json
  3 | import requests
  4 | import pandas as pd
  5 | from dotenv import load_dotenv
  6 | from collections import namedtuple
  7 | 
  8 | Result = namedtuple("Result", ["success", "data", "message"])
  9 | 
 10 | ### Uncomment import 'pdb' this to use debugger in the app
 11 | ### Use this code in between any file or function to stop debugger at any point pdb.set_trace()
 12 | import pdb
 13 | 
 14 | load_dotenv()
 15 | 
 16 | DATA_FOR_SEO_TOKEN = os.getenv("DATA_FOR_SEO_TOKEN")
 17 | 
 18 | SERP_URL = "https://api.dataforseo.com/v3/dataforseo_labs/google/related_keywords/live"
 19 | SERP_LOCATIONS_URL="https://api.dataforseo.com/v3/dataforseo_labs/locations_and_languages"
 20 | SERP_API_URL="https://api.dataforseo.com/v3/serp/google/organic/live/regular"
 21 | 
 22 | def get_headers():
 23 |     return {
 24 |         'Authorization': f"Basic {DATA_FOR_SEO_TOKEN}",
 25 |         'Content-Type': 'application/json'
 26 |     }
 27 | 
 28 | def get_locations():
 29 |     response = requests.request("GET", SERP_LOCATIONS_URL, headers=get_headers())
 30 |     response = json.loads(response.text)
 31 | 
 32 |     if 'tasks' in response and len(response['tasks']) > 0 and 'result' in response['tasks'][0] and len(response['tasks'][0]['result']) > 0:
 33 |         items = response['tasks'][0]['result']
 34 |         if items and len(items) > 0:
 35 |             data = {item['location_name']: item['location_code'] for item in items}
 36 |             return Result(success=True, data={"data": data}, message="Locations and languages fetched sucessfully.")
 37 |         else:
 38 |             result = Result(success=False, data={"error": {"message": "No locations and languages found."}}, message=None)
 39 |             return result
 40 |     else:
 41 |         return Result(success=False, data={"error": {"message": "Something went wrong"}}, message=None)
 42 | 
 43 | 
 44 | def get_serp_urls(question, location_code):
 45 |     payload = [
 46 |         {
 47 |             "keyword": question,
 48 |             "location_code": location_code,
 49 |             "language_code": "en",
 50 |             "depth": 15
 51 |         }
 52 |     ]
 53 |     payload = json.dumps(payload)
 54 |     response = requests.request("POST", SERP_API_URL, headers=get_headers(), data=payload)
 55 |     response = json.loads(response.text)
 56 |     print(response)
 57 |     if 'tasks' in response and len(response['tasks']) > 0 and 'result' in response['tasks'][0] and len(response['tasks'][0]['result']) > 0 and 'items' in response['tasks'][0]['result'][0]:
 58 |         items = response['tasks'][0]['result'][0]['items']
 59 | 
 60 |         if items and len(items) > 0:
 61 |             urls = [item['url'] for item in items]
 62 |             return Result(success=True, data={"data": urls}, message="Urls fetched sucessfully.")
 63 |         else:
 64 |             result = Result(success=False, data={"error": {"message": "No url found."}}, message=None)
 65 |             return result
 66 |     else:
 67 |         return Result(success=False, data={"error": {"message": "Something went wrong"}}, message=None)
 68 | 
 69 | 
 70 | 
 71 | 
 72 | def get_keywords(question, location_code):
 73 |     # payload = "[{\"keyword\":\"Ai applications in healthcare\", \"location_code\":2840, \"language_code\":\"en\", \"depth\":3, \"include_seed_keyword\":false, \"include_serp_info\":false, \"limit\":30, \"offset\":0}]"
 74 |     keyword_payload = [
 75 |         {
 76 |             "keyword": question,
 77 |             "location_code": location_code,
 78 |             "language_code": "en",
 79 |             "depth": 3,
 80 |             "include_seed_keyword": False,
 81 |             "include_serp_info": False,
 82 |             "limit": 30,
 83 |             "offset": 0
 84 |         }
 85 |     ]
 86 |     payload = json.dumps(keyword_payload)
 87 |     response = requests.request("POST", SERP_URL, headers=get_headers(), data=payload)
 88 |     response = json.loads(response.text)
 89 | 
 90 |     print(response)
 91 |     if 'tasks' in response and len(response['tasks']) > 0 and 'result' in response['tasks'][0] and len(response['tasks'][0]['result']) > 0 and 'items' in response['tasks'][0]['result'][0]:
 92 |         items = response['tasks'][0]['result'][0]['items']
 93 | 
 94 |         if items and len(items) > 0:
 95 |         	print(items)
 96 | 	        selected_data = [
 97 | 	            {'keyword': item['keyword_data']['keyword'],
 98 | 	             'location_code': item['keyword_data']['location_code'],
 99 | 	             'language_code': item['keyword_data']['language_code'],
100 | 	             'competition': item['keyword_data']['keyword_info']['competition'],
101 | 	             'competition_level': item['keyword_data']['keyword_info']['competition_level'],
102 | 	             'cpc': item['keyword_data']['keyword_info']['cpc'],
103 | 	             'search_volume': item['keyword_data']['keyword_info']['search_volume'],
104 | 	             'Select': False
105 | 	             }
106 | 	            for item in items
107 | 	        ]
108 | 	        data_pd = pd.DataFrame(selected_data)
109 | 	        return Result(success=True, data={"data": data_pd}, message="Keywords fetched sucessfully.")
110 |         else:
111 |             result = Result(success=False, data={"error": {"message": "No keyword found."}}, message=None)
112 |             return result
113 |     else:
114 |         return Result(success=False, data={"error": {"message": "Something went wrong"}}, message=None)
115 | 
116 | 


--------------------------------------------------------------------------------
/st_frontend/frontend.py:
--------------------------------------------------------------------------------
  1 | import ast
  2 | import time
  3 | import streamlit as st
  4 | from api_helper.ghost_api import post_blog
  5 | from st_frontend.st_helper import initialize_session_data, primary_details, generate_structure_form, convert_to_title_case
  6 | 
  7 | ### Uncomment import 'pdb' this to use debugger in the app
  8 | ### Use this code in between any file or function to stop debugger at any point pdb.set_trace()
  9 | import pdb
 10 | 
 11 | def average_number_of_words(range_str, num_headings):
 12 |     lower_bound, upper_bound = map(int, range_str.split(" - "))
 13 |     average_words = (lower_bound + upper_bound) // 2
 14 |     return average_words // num_headings
 15 | 
 16 | def main(app):
 17 |     st.set_page_config(page_title='AI Content Generator', page_icon=None, layout="centered", initial_sidebar_state="auto", menu_items=None)
 18 |     st.sidebar.title("Content Generator for SEO")
 19 | 
 20 |     if 'session_data' not in st.session_state:
 21 |         st.session_state.session_data = initialize_session_data()
 22 | 
 23 |     current_step = st.sidebar.radio("Step to create a Blog:", ["Primary Details", "Generate Structure", "Generate Blog", "Generate Faq's"])
 24 | 
 25 |     if current_step == "Primary Details":
 26 |         primary_details(st.session_state.session_data)
 27 |     elif current_step == "Generate Structure":
 28 |         if st.button("Provide Structure Manually"):
 29 |             st.session_state.session_data['gen_step'] = 1
 30 |         if st.button("Generate Structure By LLM"):
 31 |             st.session_state.session_data['gen_step'] = 2
 32 | 
 33 |         if st.session_state.session_data['gen_step'] == 1:
 34 |             st.title("Provide Structure:")
 35 |             title = st.text_input("Enter Blog Tile:")
 36 |             if title:
 37 |                 st.session_state.session_data['blog_title'] = title
 38 |             selected_headings = st.text_input("Enter Manual headings (comma separated):")
 39 |             if selected_headings:
 40 |                 selected_headings = selected_headings.split(',')
 41 |                 st.session_state.session_data['selected_headings'] = selected_headings
 42 |             st.write(f"### Blog Title:")
 43 |             st.write(f"## {st.session_state.session_data['blog_title']}")
 44 |             st.write("### Headings:")
 45 |             for heading in st.session_state.session_data['selected_headings']:
 46 |                 st.write(heading)
 47 |         elif st.session_state.session_data['gen_step'] == 2:
 48 |             st.title("Generate Structure:")
 49 |             output = ''
 50 |             generate_structure_form(st.session_state.session_data)
 51 |             st.session_state.session_data['step_to_execute'] = current_step
 52 |             if st.button("Generate Blog Structure"):
 53 |                 context = st.session_state.session_data
 54 |                 output = app.invoke({"keys": context})
 55 |                 st.subheader("Generated Content:")
 56 |                 structure = output["keys"]["generation"]
 57 |                 st.session_state.session_data['structure'] = (structure or context['structure'])
 58 | 
 59 |             temp_structure = st.session_state.session_data
 60 | 
 61 |             if temp_structure and 'structure' in temp_structure:
 62 |                 if output:
 63 |                     st.session_state.session_data['collection_key'] = output["keys"]["collection_key"]
 64 |                 data = ast.literal_eval(temp_structure['structure'])
 65 |                 for key, value in data.items():
 66 |                     st.write(f"## {convert_to_title_case(key)}")
 67 |                     st.write(f"## Title: {value['title']}")
 68 |                     for heading in value['headings']:
 69 |                         st.write(heading)
 70 | 
 71 |                 titles = [value['title'] for value in data.values()]
 72 |                 title = st.selectbox("Select Title", titles)
 73 |                 all_headings = [heading for value in data.values() for heading in value['headings']]
 74 | 
 75 |                 manual_headings = st.text_input("Enter Manual headings (comma separated):")
 76 |                 manual_headings = manual_headings.split(',')
 77 |                 st.session_state.session_data['manual_headings'] = manual_headings
 78 | 
 79 |                 if manual_headings:
 80 |                     all_headings = all_headings + manual_headings
 81 | 
 82 |                 selected_headings = st.multiselect('Select Headings', all_headings)
 83 |                 if st.button("Reset Headings"):
 84 |                     st.session_state.session_data['selected_headings'] = []
 85 | 
 86 |                 if selected_headings:
 87 |                     st.session_state.session_data['selected_headings'] = selected_headings
 88 |                 st.session_state.session_data['blog_title'] = title
 89 |                 st.write(f"### Selected Blog Structure:")
 90 |                 st.write(f"## {st.session_state.session_data['blog_title']}")
 91 |                 st.write("### Headings:")
 92 |                 for heading in st.session_state.session_data['selected_headings']:
 93 |                     st.write(heading)
 94 | 
 95 | 
 96 | 
 97 |     elif current_step == "Generate Blog":
 98 |         blog_prompt = st.text_area("Enter Prompt for Blog Generation:", st.session_state.session_data['blog_prompt'])
 99 |         st.session_state.session_data['blog_prompt'] = blog_prompt
100 |         st.session_state.session_data['step_to_execute'] = current_step
101 |         context = st.session_state.session_data
102 |         headings =  context['selected_headings']
103 |         title = context['blog_title']
104 | 
105 |         st.write(f"## Question :--> {context['question']}")
106 |         st.write(f"## Primary Keyword :--> {context['primary_keyword']}")
107 |         st.write(f"## Selected Meta Keywords :-->")
108 |         st.write("<ul>", unsafe_allow_html=True)
109 |         for keyword in context['selected_keywords']:
110 |             st.write(f"<li>{keyword}</li>",unsafe_allow_html=True)
111 |         st.write("</ul>", unsafe_allow_html=True)
112 |         st.write(f"## Blog Title :--> {title}")
113 | 
114 |         for heading in headings:
115 |             st.write(heading)
116 | 
117 |         # context['number_of_words_per_heading'] = average_number_of_words(context['blog_words_limit'], len(headings))
118 | 
119 |         if st.button("Generate Blog Content"):
120 |             context['rephrase'] = False
121 |             content = ''
122 |             st.markdown(f"<h1>{title}</h1>", unsafe_allow_html=True)
123 |             for index in range(len(headings)):
124 |                 heading = headings[index]
125 |                 context['total_headings'] = len(headings) + 1
126 |                 context['current_heading'] = index + 1
127 |                 context['heading'] = heading
128 |                 context['blog_title'] = title
129 |                 context['blog_content'] = content
130 |                 time.sleep(20)
131 |                 output = app.invoke({"keys": context})
132 |                 current_heading_content = output["keys"]["blog"]
133 |                 st.session_state.session_data['collection_key'] = output["keys"]["collection_key"]
134 |                 content += f"{current_heading_content}\n\n"
135 |                 st.session_state.session_data['blog'] = content
136 |                 st.markdown(f"{current_heading_content}\n\n", unsafe_allow_html=True)
137 | 
138 |             re_content = st.text_area("Enter your feedback to rephrase content:", height=300)
139 |             if re_content and st.button("Click to rephrase content"):
140 |                 context['rephrase'] = True
141 |                 context['rephrase_context'] = re_content
142 |                 context['blog'] = st.session_state.session_data['blog']
143 |                 content = app.invoke({"keys": context})
144 | 
145 |             content = st.text_area("Edit Blog Content", value=content, height=600)
146 | 
147 |             if st.button("Save Changes!!!"):
148 |                 st.session_state.session_data['blog'] = content
149 | 
150 |             if st.button("Post Blog to Blog WebiSte"):
151 |                 response = post_blog(title, content)
152 |         else:
153 |             st.markdown(f"<h1>{title}</h1>", unsafe_allow_html=True)
154 |             st.markdown(st.session_state.session_data['blog'], unsafe_allow_html=True)
155 | 
156 |             content = st.text_area("Enter your feedback to rephrase content:", height=300)
157 |             if content and st.button("Click to rephrase content"):
158 |                 context['rephrase'] = True
159 |                 context['rephrase_context'] = content
160 |                 context['blog'] = st.session_state.session_data['blog']
161 |                 content = app.invoke({"keys": context})
162 |                 print(content["keys"]["blog"])
163 |                 st.session_state.session_data['blog'] = content["keys"]["blog"]
164 | 
165 |             content = st.text_area("Edit Blog Content", value=st.session_state.session_data['blog'], height=600)
166 |             if st.button("Save Changes!!!"):
167 |                 st.session_state.session_data['blog'] = content
168 | 
169 |             if st.button("Post Blog to `langchain.ca`"):
170 |                 response = post_blog(title, content)
171 | 
172 |             # if st.sidebar.button("Reset", key="reset"):
173 |             #     st.session_state.session_data = initialize_session_data()
174 |     elif current_step == "Generate Faq's":
175 |         faq_prompt = st.text_area("Enter Prompt for Faq's Generation:", st.session_state.session_data['faq_prompt'])
176 |         st.session_state.session_data['faq_prompt'] = faq_prompt
177 |         st.session_state.session_data['step_to_execute'] = current_step
178 |         context = st.session_state.session_data
179 |         headings =  context['selected_headings']
180 |         title = context['blog_title']
181 |         st.write(f"## Question :--> {context['question']}")
182 |         st.write(f"## Primary Keyword :--> {context['primary_keyword']}")
183 |         st.write(f"## Selected Meta Keywords :-->")
184 |         st.write("<ul>", unsafe_allow_html=True)
185 |         for keyword in context['selected_keywords']:
186 |             st.write(f"<li>{keyword}</li>",unsafe_allow_html=True)
187 |         st.write("</ul>", unsafe_allow_html=True)
188 |         st.write(f"## Blog Title :--> {title}")
189 | 
190 |         for heading in headings:
191 |             st.write(heading)
192 | 
193 |         if st.button("Generate Faq's"):
194 |             output = app.invoke({"keys": context})
195 |             faqs = output["keys"]["blog"]
196 |             st.session_state.session_data['faqs'] = faqs
197 |             st.session_state.session_data['collection_key'] = output["keys"]["collection_key"]
198 |             st.markdown(st.session_state.session_data['faqs'], unsafe_allow_html=True)
199 | 
200 |         st.markdown(st.session_state.session_data['faqs'], unsafe_allow_html=True)
201 | 
202 | 
203 | 
204 | 
205 | 
206 | 
207 | 
208 | 
209 | 
210 | 
211 | 
212 | 
213 | 
214 | 
215 | 
216 | 
217 | 
218 | 
219 | 
220 | 
221 | 
222 | 
223 | 
224 | 


--------------------------------------------------------------------------------
/st_frontend/st_helper.py:
--------------------------------------------------------------------------------
  1 | import streamlit as st
  2 | import types
  3 | 
  4 | 
  5 | ## app imports
  6 | from api_helper.serp_api import serp_api_caller
  7 | from seo.data_for_seo_api import get_keywords, get_serp_urls
  8 | from llm_keyword_fetcher.llm_generator import call_llm
  9 | 
 10 | ### Uncomment import 'pdb' this to use debugger in the app
 11 | ### Use this code in between any file or function to stop debugger at any point pdb.set_trace()
 12 | import pdb
 13 | 
 14 | # country_names = [country.name for country in pycountry.countries]
 15 | locations = {'Algeria': 2012, 'Angola': 2024, 'Azerbaijan': 2031, 'Argentina': 2032, 'Australia': 2036, 'Austria': 2040, 'Bahrain': 2048, 'Bangladesh': 2050, 'Armenia': 2051, 'Belgium': 2056, 'Bolivia': 2068, 'Brazil': 2076, 'Bulgaria': 2100, 'Myanmar (Burma)': 2104, 'Cambodia': 2116, 'Cameroon': 2120, 'Canada': 2124, 'Sri Lanka': 2144, 'Chile': 2152, 'Taiwan': 2158, 'Colombia': 2170, 'Costa Rica': 2188, 'Croatia': 2191, 'Cyprus': 2196, 'Czechia': 2203, 'Denmark': 2208, 'Ecuador': 2218, 'El Salvador': 2222, 'Estonia': 2233, 'Finland': 2246, 'France': 2250, 'Germany': 2276, 'Ghana': 2288, 'Greece': 2300, 'Guatemala': 2320, 'Hong Kong': 2344, 'Hungary': 2348, 'India': 2356, 'Indonesia': 2360, 'Ireland': 2372, 'Israel': 2376, 'Italy': 2380, "Cote d'Ivoire": 2384, 'Japan': 2392, 'Kazakhstan': 2398, 'Jordan': 2400, 'Kenya': 2404, 'South Korea': 2410, 'Latvia': 2428, 'Lithuania': 2440, 'Malaysia': 2458, 'Malta': 2470, 'Mexico': 2484, 'Morocco': 2504, 'Netherlands': 2528, 'New Zealand': 2554, 'Nicaragua': 2558, 'Nigeria': 2566, 'Norway': 2578, 'Pakistan': 2586, 'Panama': 2591, 'Paraguay': 2600, 'Peru': 2604, 'Philippines': 2608, 'Poland': 2616, 'Portugal': 2620, 'Romania': 2642, 'Saudi Arabia': 2682, 'Senegal': 2686, 'Serbia': 2688, 'Singapore': 2702, 'Slovakia': 2703, 'Vietnam': 2704, 'Slovenia': 2705, 'South Africa': 2710, 'Spain': 2724, 'Sweden': 2752, 'Switzerland': 2756, 'Thailand': 2764, 'United Arab Emirates': 2784, 'Tunisia': 2788, 'Turkiye': 2792, 'Ukraine': 2804, 'North Macedonia': 2807, 'Egypt': 2818, 'United Kingdom': 2826, 'United States': 2840, 'Burkina Faso': 2854, 'Uruguay': 2858, 'Venezuela': 2862}
 16 | frozen_locations = types.MappingProxyType(locations)
 17 | 
 18 | def handle_success(result):
 19 |     print("Success:", result.data)
 20 | 
 21 | # Function to handle failure
 22 | def handle_failure(result):
 23 |     print("Failure:", result.data["error"]["message"])
 24 | 
 25 | def initialize_session_data():
 26 |     return {
 27 |         'question': "",
 28 |         'primary_keyword': "",
 29 |         'urls': [],
 30 |         'structure_prompt': "",
 31 |         'selected_meta_keywords': [],
 32 |         'secondary_keywords': [],
 33 |         'selected_keywords': [],
 34 |         'manual_keywords': [],
 35 |         'country': 'United States',
 36 |         'selected_urls': [],
 37 |         'keyword': [],
 38 |         'blog': '',
 39 |         'selected_headings': '',
 40 |         'gen_step': '',
 41 |         'blog_title': '',
 42 |         'blog_prompt': '',
 43 |         'faq_prompt': '',
 44 |         'faqs': ''
 45 |     }
 46 | 
 47 | def handle_urls():
 48 |     urls_str = st.text_input("Enter URLs (separated by commas):")
 49 |     if urls_str:
 50 |         urls_str = urls_str.strip()
 51 |         if urls_str:
 52 |             urls = [url.strip() for url in urls_str.split(",")]
 53 |             return urls
 54 |         else:
 55 |             st.warning("Please enter URLs separated by commas.")
 56 |     return []
 57 | 
 58 | def handle_serp_api(option, question, session_data):
 59 |     urls = []
 60 |     if (option == 'Use Serpi Api' or option == 'Use Both of them') and question:
 61 |         if st.button("Fetch Urls from DataForSeo"):
 62 |             response = get_serp_urls(question, session_data['country'])
 63 |             urls = response.data['data']
 64 | 
 65 |     if option == 'Use Serpi Api' and question:
 66 |         return urls
 67 |     elif option == 'Use Custom Urls':
 68 |         return handle_urls()
 69 |     elif option == 'Use Both of them' and question:
 70 |         return (handle_urls() + urls)
 71 |     else:
 72 |         st.write('Topic Must be present!')
 73 |         return []
 74 | 
 75 | 
 76 | def primary_details(session_data):
 77 |     st.title("Primary Details For Content Generation:")
 78 | 
 79 |     question = st.text_input("Enter your topic name:", session_data['question'])
 80 |     primary_keyword = st.text_input("Enter primary keyword:", session_data['primary_keyword'])
 81 |     selected_country = st.selectbox("Select a country", ['United States'])
 82 |     session_data['country'] = frozen_locations[selected_country]
 83 |     option = st.radio('Select an option:', ['Use Serpi Api', 'Use Custom Urls', 'Use Both of them'])
 84 |     urls = handle_serp_api(option, question, session_data)
 85 |     if len(urls) > 0:
 86 |         session_data['urls'] = urls
 87 |     selected_urls = st.multiselect("Select Urls", session_data['urls'])
 88 |     st.write("Available urls from DataForSeo:")
 89 |     st.write(session_data['urls'])
 90 |     if selected_urls:
 91 |         session_data['selected_urls'] = selected_urls
 92 | 
 93 |     if st.button("Reset selected from DataForSeo:"):
 94 |         session_data['selected_urls'] = []
 95 |     st.write("Selected urls from DataForSeo:")
 96 |     st.write(session_data['selected_urls'])
 97 | 
 98 |     session_data['option'] = option
 99 | 
100 |     if question and primary_keyword and st.button('Fetch Secondary keywords Using LLM:'):
101 |         keywords = call_llm(question, primary_keyword)
102 |         session_data['keywords'] = keywords
103 | 
104 |     if 'keywords' in session_data:
105 |         st.write(f"Available keywords --> {session_data['keywords']}")
106 | 
107 |     if 'keywords' in session_data:
108 |         # keywords_s = session_data['selected_meta_keywords']
109 |         selected_meta_keywords = st.multiselect("Select Secondary Keywords", session_data['keywords'])
110 |         if selected_meta_keywords:
111 |             session_data['selected_meta_keywords'] = selected_meta_keywords
112 |     selected_rows = ''
113 |     if st.button("Fetch keywords from DataForSeo"):
114 |         success_result = get_keywords(primary_keyword, frozen_locations[selected_country])
115 |         if success_result.success:
116 |             sec_keywords = success_result.data['data']
117 |             session_data['sec_keywords'] = sec_keywords
118 |             handle_success(success_result)
119 |         else:
120 |             handle_failure(success_result)
121 |             st.write(f"No similar keywords found for your primary keyword on --> {primary_keyword}")
122 | 
123 |     if 'sec_keywords' in session_data:
124 |         st.write('Select Secondary keywords:')
125 |         data = session_data['sec_keywords'].reindex(columns=['Select', 'keyword', 'search_volume', 'competition', 'competition_level', 'cpc', 'language_code'])
126 |         selected_rows = st.data_editor(
127 |             data,
128 |             num_rows="dynamic",
129 |             hide_index=True,
130 |         )
131 | 
132 |         selected_rows = {key: [value[i] for i in range(len(value)) if selected_rows['Select'][i]] for key, value in selected_rows.items()}
133 |         st.write('Selected keywords:')
134 |         st.data_editor(
135 |             selected_rows,
136 |             hide_index=True,
137 |             disabled=["Select"]
138 |         )
139 |         session_data['secondary_keywords'] = selected_rows
140 | 
141 |     manual_keywords = st.text_input("Enter Manual Keywords (comma separated):")
142 |     if manual_keywords:
143 |         manual_keywords = manual_keywords.split(',')
144 |         session_data['manual_keywords'] = manual_keywords
145 | 
146 |     if selected_rows:
147 |         selected_keywords = set(session_data['manual_keywords'] + selected_rows['keyword'] + list(session_data['selected_meta_keywords']) + list(session_data['selected_keywords']))
148 |     else:
149 |         selected_keywords = set(session_data['manual_keywords'] + list(session_data['selected_meta_keywords']) + list(session_data['selected_keywords']))
150 | 
151 |     if st.button("Reset Selected keywords"):
152 |         if selected_rows:
153 |             selected_rows['keyword'] = []
154 |         session_data['selected_meta_keywords'] = []
155 |         selected_keywords = []
156 | 
157 |     st.write(f"## Selected Meta Keywords :-->")
158 |     st.write("<ul>", unsafe_allow_html=True)
159 |     for keyword in selected_keywords:
160 |         st.write(f"<li>{keyword}</li>",unsafe_allow_html=True)
161 |     st.write("</ul>", unsafe_allow_html=True)
162 | 
163 |     session_data['selected_keywords'] = selected_keywords
164 |     session_data['question'] = question
165 |     session_data['primary_keyword'] = primary_keyword
166 | 
167 |     return question, primary_keyword, session_data['urls'], session_data['selected_urls']
168 | 
169 | def generate_structure_form(session_data):
170 |     structure_prompt = st.text_area("Enter Prompt for Structure Generation:", session_data['structure_prompt'])
171 |     session_data['structure_prompt'] = structure_prompt
172 |     st.write(f"## Country --> {session_data['country']}")
173 |     st.write(f"## Selected Serp Urls -->")
174 |     st.write(session_data['selected_urls'])
175 |     st.write(f"## Selected Meta Keywords :-->")
176 |     st.write("<ul>", unsafe_allow_html=True)
177 |     for keyword in session_data['selected_keywords']:
178 |         st.write(f"<li>{keyword}</li>",unsafe_allow_html=True)
179 |     st.write("</ul>", unsafe_allow_html=True)
180 | 
181 | def convert_to_title_case(input_string):
182 |     words = input_string.split('_')
183 |     capitalized_words = [word.capitalize() for word in words]
184 |     result_string = ' '.join(capitalized_words)
185 |     return result_string


--------------------------------------------------------------------------------