├── .gitattributes ├── LICENSE ├── README.md ├── index.html ├── search.php ├── search_basic.php └── style.css /.gitattributes: -------------------------------------------------------------------------------- 1 | # Auto detect text files and perform LF normalization 2 | * text=auto 3 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2024 Josh Clemm 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # ai-search 2 | A basic open-source AI search engine, modeled after [Perplexity.ai](https://www.perplexity.ai).
If you’re not familiar with AI-powered question-answering platforms, they use a large language model like ChatGPT to answer your questions, but improve on ChatGPT by pulling in accurate, real-time search results to supplement the answer (so no “knowledge cutoff”). They also list citations within the answer itself, which builds confidence that the model isn’t hallucinating and allows you to research topics further. 3 | 4 | ## How to Run (summary) 5 | 1. Clone / download the repo 6 | 2. Go get your API keys and add them to `search.php` (look for "[Fill me in]") 7 | 3. Run locally using PHP's built-in server (`php -S localhost:8000`) 8 | 9 | ## Step-by-step details 10 | 11 | ### Step 1: Get Search Results for a user query 12 | 13 | The main challenge with LLMs like ChatGPT is that they have knowledge cutoffs (and they occasionally tend to [hallucinate](https://en.wikipedia.org/wiki/Hallucination_(artificial_intelligence))). That’s because they’re trained on data up to a specific date (e.g., Sep 2021). So if you want an answer to an up-to-date question or you simply want to research a topic in detail, you’ll need to _augment_ the answer with relevant sources. This technique is known as [RAG](https://learn.microsoft.com/en-us/azure/search/retrieval-augmented-generation-overview) (retrieval-augmented generation). In our case, we can simply supply the LLM with up-to-date information from search engines like Google or Bing. 14 | 15 | To build this yourself, you’ll want to first sign up for an API key from [Bing](https://www.microsoft.com/en-us/bing/apis/bing-web-search-api), Google (via [Serper](https://serper.dev/)), [Brave](https://brave.com/search/api/), or others. Bing, Brave, and Serper all offer free usage to get started. 16 | 17 | In `search.php`, put your API key where appropriate (look for "[Fill me in]"). For this example, I have code for both Brave and Google via Serper.
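Whichever search provider you pick, its response needs to be normalized into a small, uniform list of snippets before it goes to the LLM. Here's a minimal sketch of that step for a Serper-style response (the `organic`/`title`/`link`/`snippet` field names follow Serper's response shape; Bing and Brave use different field names, as `search.php` shows):

```php
<?php
// Minimal sketch of Step 1's post-processing: turning a decoded
// Serper-style response into a uniform list of snippets for the prompt.
// Field names here ("organic", "title", "link", "snippet") assume
// Serper's response shape; adjust them for Bing or Brave.
function extract_snippets(array $response, int $num_sources = 9): array
{
    $snippets = [];
    foreach ($response['organic'] ?? [] as $result) {
        $snippets[] = [
            'name'    => $result['title'] ?? '',
            'url'     => $result['link'] ?? '',
            'snippet' => $result['snippet'] ?? '',
        ];
    }
    // Cap the list so the prompt stays small (and cheap)
    return array_slice($snippets, 0, $num_sources);
}
```

Keeping every provider's output in this one shape means the prompt-building and UI code never needs to know which search engine answered.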
18 | 19 | 20 | ### Step 2: Decide the LLM you want to use 21 | 22 | Here, you’ll need to sign up for an API key from an LLM provider. There are a lot of providers to choose from right now. For example, there’s [OpenAI](https://platform.openai.com/docs/overview), [Anthropic](https://www.anthropic.com/api), [Anyscale](https://www.anyscale.com/), [Groq](https://groq.com/), [Cloudflare](https://ai.cloudflare.com/), [Perplexity](https://docs.perplexity.ai/docs/getting-started), [Lepton](https://www.lepton.ai/), or the big players like AWS, Azure, or Google Cloud. I’ve used many of these with success, and they each offer a subset of current and popular closed and open-source models. Each model has unique strengths, different costs, and different speeds. For example, gpt-4 is very accurate but expensive and slow. When in doubt, I’d recommend using gpt-3.5-turbo from OpenAI. It’s good enough, cheap enough, and fast enough to test this out. 23 | 24 | Fortunately, most of these LLM serving providers are compatible with OpenAI’s API format, so switching to another provider / model is only minimal work (or just ask a [chatbot](https://yaddleai.com/search/?q=Show+the+code+to+call+openAI%27s+API) to write the code!). 25 | 26 | In `search.php`, put your API keys where appropriate (look for "[Fill me in]"). For this example, I'm using OpenAI (for gpt-3.5-turbo / gpt-4) and Groq (for Mixtral-8x7B). So to keep your work minimal, just go get keys for one or both of those. 27 | 28 | ### Step 3: Craft a prompt to pass along the search results in the context window 29 | 30 | When you want to ask an LLM a question, you can provide a lot of additional context. Each model has its own unique limit, and some of them are very large.
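Since most providers accept OpenAI's chat-completions payload, the request shape and a rough size check can be sketched like this (a minimal sketch; the division by 4 is the common "~4 characters per token" rule of thumb for English text, only an approximation, not an exact tokenizer):

```php
<?php
// Sketch of the OpenAI-style chat payload used throughout this project.
// Because most providers accept this same JSON shape, switching providers
// usually means changing only the endpoint URL, model name, and API key.
function build_chat_request(string $query, string $context, string $model = 'gpt-3.5-turbo'): array
{
    return [
        'model'    => $model,
        'messages' => [
            ['role' => 'system', 'content' => $context], // prompt + search snippets
            ['role' => 'user',   'content' => $query],   // the user's question
        ],
        'temperature' => 0.9,
        'max_tokens'  => 2048,
    ];
}

// Very rough token estimate: ~4 characters per token for English text.
// An approximation only — use a real tokenizer if you need exact counts.
function estimate_tokens(array $request): int
{
    $chars = 0;
    foreach ($request['messages'] as $m) {
        $chars += strlen($m['content']);
    }
    return (int) ceil($chars / 4);
}
```

A quick estimate like this is enough to sanity-check that your prompt plus snippets will fit comfortably inside a model's context window.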
For [gpt-4-turbo](https://platform.openai.com/docs/models/continuous-model-upgrades), you could pass along the entirety of the [1st Harry Potter book](https://towardsdatascience.com/de-coded-understanding-context-windows-for-transformer-models-cd1baca6427e) with your question. Google’s super powerful [Gemini 1.5](https://medium.com/google-cloud/googles-gemini-1-5-pro-revolutionizing-ai-with-a-1m-token-context-window-bfea5adfd35f) can support a context size of over a million tokens. That’s enough to pass along the entirety of the 7-book Harry Potter series! 31 | 32 | Fortunately, passing along the snippets of 8-10 search results is far smaller, allowing you to use many of the faster (and much cheaper) models like gpt-3.5-turbo or mistral-7b. 33 | 34 | In my experience, passing along the user question, custom prompt message, and search result snippets is usually under 1K tokens. This is well under even the most basic model’s limits, so this should be no problem. 35 | 36 | `search.php` has the sample prompt I’ve been playing around with. Hat-tip to the folks at [Lepton AI](https://www.lepton.ai/) who [open-sourced a similar project](https://github.com/leptonai/search_with_lepton), which helped me refine this prompt. 37 | 38 | ### Step 4: Add Related or Follow-Up Questions 39 | 40 | One of the nice features of Perplexity is how they suggest follow-up questions. Fortunately, this is easy to replicate. 41 | 42 | To do this, you can make a second call to your LLM (in parallel) asking for related questions. And don’t forget to pass along those citations in the context again. 43 | 44 | Or, you can attempt to construct a prompt so that the LLM answers the question AND comes up with related questions. This saves an API call and some tokens, but it’s a bit challenging getting these LLMs to always answer in a consistent and repeatable format. 45 | 46 | ### Step 5: Make it look a lot better! 47 | 48 | To make this a complete example, we need a usable UI.
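One UI detail worth sketching before we get there: the model's answer arrives with raw `[citation:x]` markers, which need to be swapped for clickable links against the ordered source list. The repo does this client-side in `index.html`'s JavaScript; here's the same idea expressed in PHP (the function name and markup are illustrative, not what the repo ships):

```php
<?php
// Illustrative sketch (not the repo's actual UI code): replace each
// [citation:x] marker in the answer with a numbered superscript link,
// using the same ordered source list that was sent to the LLM.
function link_citations(string $answer, array $sources): string
{
    return preg_replace_callback('/\[citation:(\d+)\]/', function ($m) use ($sources) {
        $i = (int) $m[1];
        $url = $sources[$i - 1]['url'] ?? '';
        if ($url === '') {
            return ''; // drop markers that point at no known source
        }
        return '<a href="' . htmlspecialchars($url) . '" target="_blank"><sup>' . $i . '</sup></a>';
    }, $answer);
}
```

Because the snippet list sent in the prompt and the list shown in the UI are the same array, citation number `x` always resolves to the same source in both places.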
I kept the UI as simple as possible and everything is in `index.html`. I’m using Bootstrap, jQuery, some basic CSS/JavaScript, a Markdown renderer, and a JS syntax highlighter to make this happen. 49 | 50 | To improve the experience, the UI does the following: 51 | * The answer **streams** back to the user (improving perception of speed) 52 | * The **citations** are replaced by a nicer in-line UI with a clickable popup for the user to learn more 53 | * The **sources** considered are included after the answer in case the user wants to explore further 54 | * Markdown and code syntax **highlighting** are used if necessary 55 | 56 | ### Working Example 57 | 58 | To explore a working example, check out [https://yaddleai.com](https://yaddleai.com/). It's mostly the same code, though I added a second search call in parallel to fetch images, wrote a separate page to fetch the latest news, and made a few other minor improvements. 59 | -------------------------------------------------------------------------------- /index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | Answers 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 158 | 159 | 160 | 161 | 162 |
163 | 168 | 169 | 208 | 209 |
210 |

211 | Answer 212 | 213 |

214 |
215 | 216 |
217 | 218 | 219 | 220 | 221 | Followup Questions 222 |
223 |
224 | 225 |
226 | 227 | 228 | 229 | Web Sources 230 |
231 |
232 | 233 |
234 |
235 |

236 | This answer uses a large language model () to summarize search results and can make mistakes. Consider checking important information. 237 | Answer generated in . 238 |

239 |

240 |
241 |
242 |
243 | 244 | 247 |
248 | 249 | 250 | 251 | 252 | 253 | 254 | 255 | 494 | 495 | 496 | -------------------------------------------------------------------------------- /search.php: -------------------------------------------------------------------------------- 1 | $query 29 | ); 30 | $ENDPOINT = "https://api.search.brave.com/res/v1/web/search"; 31 | $url = $ENDPOINT . '?' . http_build_query($params); 32 | 33 | $headers = array( 34 | 'X-Subscription-Token: ' . $BRAVE_KEY, 35 | 'Accept: application/json' 36 | ); 37 | curl_setopt($curl, CURLOPT_HTTPHEADER, $headers); 38 | curl_setopt($curl, CURLOPT_URL, $url); 39 | curl_setopt($curl, CURLOPT_ENCODING, 'gzip'); 40 | curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1); 41 | 42 | $response = curl_exec($curl); 43 | curl_close($curl); 44 | 45 | $jsonContent = json_decode($response, true); 46 | 47 | $snippets = []; 48 | 49 | if (isset($jsonContent['web']['results'])) { 50 | foreach ($jsonContent['web']['results'] as $c) { 51 | 52 | $extra_snippets = ""; 53 | if (isset($c['extra_snippets'])) { 54 | foreach ($c['extra_snippets'] as $s) { 55 | $extra_snippets .= $s . ' '; 56 | } 57 | } 58 | $snippets[] = [ 59 | 'name' => $c['title'], 60 | 'url' => $c['url'], 61 | 'snippet' => $c['description'], 62 | 'extra_snippets' => $extra_snippets ?? '', 63 | 'favicon' => $c['meta_url']['favicon'] ?? '', 64 | ]; 65 | } 66 | } 67 | 68 | return array_slice($snippets, 0, $num_sources); 69 | } 70 | function search_with_serper($query, $num_sources = 9) 71 | { 72 | // Put your google search serper key here (https://serper.dev/) 73 | $SERPER_KEY = "[fill me in]"; 74 | 75 | $curl = curl_init(); 76 | 77 | $request = array( 78 | "q" => $query 79 | ); 80 | $data = json_encode($request, JSON_PRETTY_PRINT); 81 | 82 | $headers = array( 83 | 'X-API-KEY: ' . 
$SERPER_KEY, 84 | 'Content-Type: application/json' 85 | ); 86 | curl_setopt($curl, CURLOPT_HTTPHEADER, $headers); 87 | curl_setopt($curl, CURLOPT_POST, 1); 88 | curl_setopt($curl, CURLOPT_POSTFIELDS, $data); 89 | curl_setopt($curl, CURLOPT_URL, "https://google.serper.dev/search"); 90 | curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1); 91 | 92 | $response = curl_exec($curl); 93 | curl_close($curl); 94 | 95 | $jsonContent = json_decode($response, true); 96 | 97 | $snippets = []; 98 | 99 | if (isset($jsonContent['knowledgeGraph'])) { 100 | $url = $jsonContent['knowledgeGraph']['descriptionUrl'] ?? $jsonContent['knowledgeGraph']['website'] ?? null; 101 | $snippet = $jsonContent['knowledgeGraph']['description'] ?? null; 102 | if ($url && $snippet) { 103 | $snippets[] = [ 104 | 'name' => $jsonContent['knowledgeGraph']['title'] ?? '', 105 | 'url' => $url, 106 | 'snippet' => $snippet, 107 | ]; 108 | } 109 | } 110 | 111 | if (isset($jsonContent['answerBox'])) { 112 | $url = $jsonContent['answerBox']['link'] ?? $jsonContent['answerBox']['url'] ?? null; 113 | $snippet = $jsonContent['answerBox']['snippet'] ?? $jsonContent['answerBox']['answer'] ?? null; 114 | if ($url && $snippet) { 115 | $snippets[] = [ 116 | 'name' => $jsonContent['answerBox']['title'] ?? '', 117 | 'url' => $url, 118 | 'snippet' => $snippet, 119 | ]; 120 | } 121 | } 122 | 123 | if (isset($jsonContent['organic'])) { 124 | foreach ($jsonContent['organic'] as $c) { 125 | $snippets[] = [ 126 | 'name' => $c['title'], 127 | 'url' => $c['link'], 128 | 'snippet' => $c['snippet'] ?? 
'', 129 | ]; 130 | } 131 | } 132 | 133 | return array_slice($snippets, 0, $num_sources); 134 | } 135 | 136 | function setup_curl_to_llm($query, $context, $max_tokens, $stream = false, $model = "gpt-3.5-turbo", $temperature = 1) 137 | { 138 | // Put your OpenAI API key here (https://platform.openai.com/overview) 139 | // if you want to use other LLMs, most use the exact same API as OpenAI, 140 | // so really only the url, model, and KEY need to change 141 | $OPENAI_KEY = "[fill me in]"; 142 | 143 | // For Groq's API, get your key here (https://wow.groq.com/) 144 | $GROQ_KEY = "[fill me in]"; 145 | 146 | if (in_array($model, array('gpt-3.5-turbo', 'gpt-4'), true)) { 147 | $LLM_ENDPOINT = "https://api.openai.com/v1/chat/completions"; 148 | $LLM_KEY = $OPENAI_KEY; 149 | } 150 | else { 151 | $LLM_ENDPOINT = "https://api.groq.com/openai/v1/chat/completions"; 152 | $LLM_KEY = $GROQ_KEY; 153 | } 154 | 155 | $system = (object) [ 156 | "role" => "system", 157 | "content" => $context 158 | ]; 159 | 160 | $user = (object) [ 161 | "role" => "user", 162 | "content" => $query 163 | ]; 164 | 165 | $request = array( 166 | "model" => $model, 167 | "messages" => array( 168 | $system, 169 | $user 170 | ), 171 | "temperature" => $temperature, 172 | "stream" => $stream, 173 | "max_tokens" => $max_tokens 174 | ); 175 | $data = json_encode($request, JSON_PRETTY_PRINT); 176 | 177 | $curl = curl_init(); 178 | $headers = array( 179 | "Content-Type: application/json", 180 | "Authorization: Bearer " . $LLM_KEY, 181 | ); 182 | 183 | curl_setopt($curl, CURLOPT_HTTPHEADER, $headers); 184 | curl_setopt($curl, CURLOPT_POST, 1); 185 | curl_setopt($curl, CURLOPT_POSTFIELDS, $data); 186 | 187 | curl_setopt($curl, CURLOPT_URL, $LLM_ENDPOINT); 188 | curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1); 189 | 190 | if ($stream) { 191 | //stream back curl chunks, this is messy I know... 192 | $callback = function ($ch, $str) { 193 | //$str has the chunks of data streamed back. 
194 | $chunks = explode("data: ", $str); 195 | foreach ($chunks as $i => $chunk) { 196 | if (!empty($chunk) && $chunk !== "[DONE]") { 197 | $json = json_decode($chunk); 198 | if (isset($json->choices)) { 199 | $choice = $json->choices[0]; 200 | if (isset($choice->delta)) { 201 | $delta = $choice->delta; 202 | if (isset($delta->content)) { 203 | echo $delta->content; 204 | flush(); 205 | ob_flush(); 206 | } else { 207 | return -1; // no content in this delta means the answer is done, so abort the transfer 208 | } 209 | } 210 | } 211 | } 212 | } 213 | return strlen($str); //signals curl to keep going 214 | }; 215 | 216 | curl_setopt($curl, CURLOPT_WRITEFUNCTION, $callback); 217 | } 218 | 219 | return $curl; 220 | } 221 | 222 | function execute_curl($curl) 223 | { 224 | $result = curl_exec($curl); 225 | curl_close($curl); 226 | $jsonResult = json_decode($result); 227 | return nl2br($jsonResult->choices[0]->message->content); 228 | } 229 | 230 | function get_snippets_for_prompt($snippets) 231 | { 232 | $snippets_context = ""; 233 | foreach ($snippets as $i => $s) { 234 | $snippets_context .= "[citation:" . ($i + 1) . "] " . $s['snippet']; 235 | 236 | if(isset($s['extra_snippets'])) { 237 | $snippets_context .= $s['extra_snippets']; 238 | } 239 | 240 | if ($i < count($snippets) - 1) { 241 | $snippets_context .= "\n\n"; 242 | } 243 | } 244 | 245 | return $snippets_context; 246 | } 247 | 248 | function setup_get_answer_prompt($snippets) 249 | { 250 | // My prompt is to provide accurate, high-quality, and expertly written responses to your questions in a positive, interesting, and engaging manner. I aim to offer informative, logical, and actionable information in the same language as your queries. 251 | $starting_context = <<<'EOD' 252 | You are an assistant written by Josh Clemm. You will be given a question. And you will respond with two things. 253 | 254 | First, respond with an answer to the question. It must be accurate, high-quality, and expertly written in a positive, interesting, and engaging manner.
It must be informative and in the same language as the user question. 255 | 256 | Second, respond with 3 related follow-up questions. First, please repeat the following phrase: ==== RELATED ====. Then write the 3 follow-up questions in a JSON array format, so it's clear you've started to answer the second part. Each related question should be no longer than 15 words. They should be based on the user's original question and the citations given in the context. Do not repeat the original question. Make sure to determine the main subject from the user's original question. That subject needs to be in any related question, so the user can ask it standalone. 257 | 258 | For both the first and second response, you will be provided a set of citations to the question. Each will start with a reference number like [citation:x], where x is a number. Always use the related citations and cite the citation at the end of each sentence in the format [citation:x]. If a sentence comes from multiple citations, please list all applicable citations, like [citation:2][citation:3]. 259 | 260 | Here are the provided citations: 261 | 262 | EOD; 263 | 264 | // $final_context = "Finally, don't repeat the provided contexts verbatim. And don't mention you were passed contexts in the response."; 265 | $final_context = ""; 266 | 267 | $full_context = $starting_context . "\n\n" . get_snippets_for_prompt($snippets) . "\n\n" .
$final_context; 268 | return $full_context; 269 | } 270 | 271 | // Use the multi cURL capabilities to run one or more curl commands in parallel 272 | function execute_multi_curl(...$curlArray) 273 | { 274 | $mh = curl_multi_init(); 275 | foreach ($curlArray as $curl) { 276 | curl_multi_add_handle($mh, $curl); 277 | } 278 | // Execute all queries simultaneously, and continue when all are complete 279 | $running = null; 280 | do { 281 | curl_multi_exec($mh, $running); 282 | // usleep(50000); 283 | curl_multi_select($mh); // This is a blocking call, only proceeding when there's activity 284 | } while ($running); 285 | 286 | // Collect the responses and remove the handles 287 | $responses = []; 288 | foreach ($curlArray as $curl) { 289 | $responses[] = curl_multi_getcontent($curl); 290 | curl_multi_remove_handle($mh, $curl); 291 | } 292 | curl_multi_close($mh); 293 | return $responses; 294 | } 295 | 296 | $snippets = array(); 297 | // $snippets = search_with_brave($query); 298 | $snippets = search_with_serper($query); 299 | 300 | echo "==== SOURCES ====\n"; 301 | echo json_encode($snippets, JSON_PRETTY_PRINT); 302 | 303 | $search_end = microtime(true); 304 | 305 | $answer_prompt_context = setup_get_answer_prompt($snippets); 306 | 307 | $answer_curl = setup_curl_to_llm($query, $answer_prompt_context, 2048, true, $model, 0.9); 308 | 309 | echo "\n==== ANSWER ====\n"; 310 | $responses = execute_multi_curl($answer_curl); 311 | 312 | $end = microtime(true); 313 | 314 | echo "\n==== METADATA ====\n"; 315 | $metadata = array( 316 | "query" => $query, 317 | "model" => $model, 318 | "duration" => array( 319 | "search" => number_format(($search_end - $start), 2) . 's', 320 | "llm" => number_format(($end - $search_end), 2) . 's', 321 | "total" => number_format(($end - $start), 2) . 
's' 322 | ) 323 | ); 324 | echo json_encode($metadata, JSON_PRETTY_PRINT); 325 | -------------------------------------------------------------------------------- /search_basic.php: -------------------------------------------------------------------------------- 1 | $query, 'text_decorations' => false); 9 | $ENDPOINT = "https://api.search.brave.com/res/v1/web/search"; 10 | $url = $ENDPOINT . '?' . http_build_query($params); 11 | $headers = array( 12 | 'X-Subscription-Token: ' . $BRAVE_KEY, 13 | 'Accept: application/json' 14 | ); 15 | $curl = curl_init(); 16 | curl_setopt($curl, CURLOPT_HTTPHEADER, $headers); 17 | curl_setopt($curl, CURLOPT_URL, $url); 18 | curl_setopt($curl, CURLOPT_ENCODING, 'gzip'); 19 | curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1); 20 | $response = curl_exec($curl); 21 | curl_close($curl); 22 | 23 | $jsonContent = json_decode(strip_tags($response), true); 24 | $snippets = []; 25 | if (isset($jsonContent['web']['results'])) { 26 | foreach ($jsonContent['web']['results'] as $c) { 27 | $snippets[] = ['name' => $c['title'], 'url' => $c['url'], 'snippet' => $c['description']]; 28 | } 29 | } 30 | return array_slice($snippets, 0, $num_sources); 31 | } 32 | 33 | function setup_curl_to_llm($query, $context, $max_tokens, $model = "gpt-3.5-turbo", $temperature = 1) { 34 | // Put your OpenAI API key here (https://platform.openai.com/overview) 35 | // if you want to use other LLMs, most use the exact same API as OpenAI, 36 | // so really only the url, model, and KEY need to change 37 | $OPENAI_KEY = "[fill me in]"; 38 | $LLM_ENDPOINT = "https://api.openai.com/v1/chat/completions"; 39 | 40 | $system = (object) ["role" => "system", "content" => $context]; 41 | $user = (object) ["role" => "user", "content" => $query]; 42 | $request = array( 43 | "model" => $model, 44 | "messages" => array( 45 | $system, 46 | $user 47 | ), 48 | "temperature" => $temperature, 49 | "max_tokens" => $max_tokens 50 | ); 51 | $headers = array( 52 | "Content-Type: application/json", 53 
| "Authorization: Bearer " . $OPENAI_KEY, 54 | ); 55 | 56 | $curl = curl_init(); 57 | curl_setopt($curl, CURLOPT_HTTPHEADER, $headers); 58 | curl_setopt($curl, CURLOPT_POST, 1); 59 | curl_setopt($curl, CURLOPT_POSTFIELDS, json_encode($request, JSON_PRETTY_PRINT)); 60 | curl_setopt($curl, CURLOPT_URL, $LLM_ENDPOINT); 61 | curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1); 62 | return $curl; 63 | } 64 | 65 | function execute_curl($curl) { 66 | $result = curl_exec($curl); 67 | curl_close($curl); 68 | $jsonResult = json_decode($result); 69 | return $jsonResult->choices[0]->message->content; 70 | } 71 | 72 | function get_snippets_for_prompt($snippets) { 73 | $snippets_context = ""; 74 | foreach ($snippets as $i => $s) { 75 | $snippets_context .= "[citation:" . ($i + 1) . "] " . $s['snippet'] . "\n\n"; 76 | } 77 | return $snippets_context; 78 | } 79 | 80 | function setup_get_answer_prompt($snippets) { 81 | $starting_context = <<<'EOD' 82 | You are an assistant written by Josh Clemm. You will be given a question. And you will respond with two things. 83 | First, respond with an answer to the question. It must be accurate, high-quality, and expertly written in a positive, interesting, and engaging manner. It must be informative and in the same language as the user question. 84 | Second, respond with 3 related follow-up questions. First print "==== RELATED ====" verbatim. Then, write the 3 follow-up questions in a JSON array format, so it's clear you've started to answer the second part. Do not use markdown. Each related question should be no longer than 15 words. They should be based on the user's original question and the citations given in the context. Do not repeat the original question. Make sure to determine the main subject from the user's original question. That subject needs to be in any related question, so the user can ask it standalone. 85 | For both the first and second response, you will be provided a set of citations for the question.
Each will start with a reference number like [citation:x], where x is a number. Always use the related citations and cite the citation at the end of each sentence in the format [citation:x]. If a sentence comes from multiple citations, please list all applicable citations, like [citation:2][citation:3]. 86 | Here are the provided citations: 87 | EOD; 88 | return $starting_context . "\n\n" . get_snippets_for_prompt($snippets); 89 | } 90 | 91 | // 0. Extract query and model from request parameters 92 | $query = $_REQUEST["q"] ?? "how did Uber scale over the years?"; 93 | $model = $_REQUEST["model"] ?? "gpt-3.5-turbo"; 94 | 95 | // 1. Call search to get sources and their snippets 96 | $snippets = search_with_brave($query); 97 | echo "==== SOURCES ====\n" . json_encode($snippets, JSON_PRETTY_PRINT); 98 | 99 | // 2. Create a prompt passing along the sources and call the language model of your choice 100 | $answer_prompt_context = setup_get_answer_prompt($snippets); 101 | echo $answer_prompt_context; 102 | $answer_curl = setup_curl_to_llm($query, $answer_prompt_context, 2048, $model, 0.9); 103 | echo "\n==== ANSWER ====\n" . 
execute_curl($answer_curl); 104 | ?> -------------------------------------------------------------------------------- /style.css: -------------------------------------------------------------------------------- 1 | pre { 2 | box-shadow: var(--bs-box-shadow) !important; 3 | border: var(--bs-border-width) var(--bs-border-style) var(--bs-border-color) !important; 4 | border-radius: var(--bs-border-radius) !important; 5 | } 6 | 7 | .hljs { 8 | display: block; 9 | overflow-x: auto; 10 | padding: 1em; 11 | background: #032453; 12 | } 13 | 14 | .hljs-built_in, 15 | .hljs-selector-tag, 16 | .hljs-section, 17 | .hljs-link { 18 | color: #8be9fd; 19 | } 20 | 21 | .hljs-keyword { 22 | color: #ff79c6; 23 | } 24 | 25 | .hljs, 26 | .hljs-subst { 27 | color: #f8f8f2; 28 | } 29 | 30 | .hljs-title, 31 | .hljs-attr, 32 | .hljs-meta-keyword { 33 | font-style: italic; 34 | color: #50fa7b; 35 | } 36 | 37 | .hljs-string, 38 | .hljs-meta, 39 | .hljs-name, 40 | .hljs-type, 41 | .hljs-symbol, 42 | .hljs-bullet, 43 | .hljs-addition, 44 | .hljs-variable, 45 | .hljs-template-tag, 46 | .hljs-template-variable { 47 | color: #f1fa8c; 48 | } 49 | 50 | .hljs-comment, 51 | .hljs-quote, 52 | .hljs-deletion { 53 | color: #6272a4; 54 | } 55 | 56 | .hljs-keyword, 57 | .hljs-selector-tag, 58 | .hljs-literal, 59 | .hljs-title, 60 | .hljs-section, 61 | .hljs-doctag, 62 | .hljs-type, 63 | .hljs-name, 64 | .hljs-strong { 65 | font-weight: bold; 66 | } 67 | 68 | .hljs-literal, 69 | .hljs-number { 70 | color: #bd93f9; 71 | } 72 | 73 | .hljs-emphasis { 74 | font-style: italic; 75 | } --------------------------------------------------------------------------------