├── CNAME ├── rig-nextra └── .gitignore ├── pages ├── guides │ ├── 1_rag │ │ └── _meta.tsx │ ├── 2_advanced │ │ ├── _meta.tsx │ │ └── 22_flight_assistant.mdx │ ├── index.mdx │ ├── 3_deploy │ │ ├── _meta.tsx │ │ ├── Blog_1_aws_lambda.mdx │ │ └── Blog_2_aws_lambda_lancedb.mdx │ └── _meta.tsx ├── about.mdx ├── examples │ ├── 3_advanced │ │ ├── _meta.tsx │ │ └── 30_concurrent_processing.mdx │ ├── _meta.tsx │ ├── 0_model_providers │ │ ├── openai.mdx │ │ ├── anthropic.mdx │ │ └── gemini.mdx │ ├── 2_basics │ │ ├── 23_tic_tac_toe.mdx │ │ ├── 22_pid_controller.mdx │ │ ├── 24_synthetic_data.mdx │ │ ├── 20_simple_agent.mdx │ │ └── 21_text_classification.mdx │ ├── remote_examples_paths.json │ ├── index.mdx │ └── 1_rag │ │ └── 10_rag_pdf.mdx ├── docs │ ├── 3_concepts.mdx │ ├── 4_integrations │ │ ├── 42_plugins │ │ │ └── twitter.mdx │ │ ├── 42_plugins.mdx │ │ ├── 41_vector_stores.mdx │ │ ├── 40_model_providers.mdx │ │ ├── 41_vector_stores │ │ │ ├── neo4j.mdx │ │ │ ├── mongodb.mdx │ │ │ ├── in_memory.mdx │ │ │ └── lancedb.mdx │ │ └── 40_model_providers │ │ │ ├── openai.mdx │ │ │ └── anthropic.mdx │ ├── _meta.tsx │ ├── index.mdx │ ├── 2_architecture.mdx │ ├── 3_concepts │ │ ├── 1_embeddings.mdx │ │ ├── 5_loaders.mdx │ │ ├── 2_tools.mdx │ │ ├── 6_extractors.mdx │ │ ├── 4_chains.mdx │ │ ├── 0_completion.mdx │ │ └── 3_agent.mdx │ ├── 5_extensions │ │ └── 0_cli_chatbot.mdx │ ├── 4_integrations.mdx │ ├── 0_quickstart.mdx │ └── 1_why_rig.mdx ├── _app.tsx ├── _meta.tsx └── index.mdx ├── public ├── favicon.ico ├── images │ ├── ask_discord.png │ ├── ask_discord_2.png │ ├── deploy_1 │ │ ├── lc-cw-logs.png │ │ ├── rig-cw-logs.png │ │ ├── lc-coldstarts.png │ │ ├── lc-power-tuner.png │ │ ├── rig-coldstarts.png │ │ ├── rig-power-tuner.png │ │ ├── lc-deployment-package.png │ │ └── rig-deployment-package.png │ └── deploy_2 │ │ ├── lc-metrics.png │ │ ├── cold_starts.png │ │ ├── memory_usage.png │ │ ├── rig-metrics.png │ │ ├── lc-power-tuner.png │ │ ├── rig-power-tuner.png │ │ ├── lc-deployment-package.png │ │ └── rig-deployment-package.png ├── rig-dark.svg └── rig-light.svg ├── .github ├── screenshot.png └── workflows │ └── nextjs.yml ├── postcss.config.js ├── components ├── counters.module.css └── counters.tsx ├── .gitignore ├── next-env.d.ts ├── tailwind.config.ts ├── next.config.ts ├── README.md ├── tsconfig.json ├── styles.css ├── LICENSE ├── package.json └── theme.config.tsx /CNAME: -------------------------------------------------------------------------------- 1 | docs.rig.rs -------------------------------------------------------------------------------- /rig-nextra/.gitignore: -------------------------------------------------------------------------------- 1 | .next 2 | node_modules 3 | -------------------------------------------------------------------------------- /pages/guides/1_rag/_meta.tsx: -------------------------------------------------------------------------------- 1 | export default { 2 | "11_rag_system": "Simple RAG" 3 | } 4 | -------------------------------------------------------------------------------- /public/favicon.ico: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Chalingok/rig-docs/HEAD/public/favicon.ico -------------------------------------------------------------------------------- /.github/screenshot.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Chalingok/rig-docs/HEAD/.github/screenshot.png 
-------------------------------------------------------------------------------- /pages/about.mdx: -------------------------------------------------------------------------------- 1 | # About 2 | 3 | This is the about page! This page is shown on the navbar. 4 | -------------------------------------------------------------------------------- /public/images/ask_discord.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Chalingok/rig-docs/HEAD/public/images/ask_discord.png -------------------------------------------------------------------------------- /public/images/ask_discord_2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Chalingok/rig-docs/HEAD/public/images/ask_discord_2.png -------------------------------------------------------------------------------- /postcss.config.js: -------------------------------------------------------------------------------- 1 | module.exports = { 2 | plugins: { 3 | tailwindcss: {}, 4 | autoprefixer: {}, 5 | }, 6 | } 7 | -------------------------------------------------------------------------------- /pages/examples/3_advanced/_meta.tsx: -------------------------------------------------------------------------------- 1 | export default { 2 | "30_concurrent_processing": "Concurrent Agent Processing", 3 | } -------------------------------------------------------------------------------- /public/images/deploy_1/lc-cw-logs.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Chalingok/rig-docs/HEAD/public/images/deploy_1/lc-cw-logs.png -------------------------------------------------------------------------------- /public/images/deploy_2/lc-metrics.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Chalingok/rig-docs/HEAD/public/images/deploy_2/lc-metrics.png -------------------------------------------------------------------------------- /public/images/deploy_1/rig-cw-logs.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Chalingok/rig-docs/HEAD/public/images/deploy_1/rig-cw-logs.png -------------------------------------------------------------------------------- /public/images/deploy_2/cold_starts.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Chalingok/rig-docs/HEAD/public/images/deploy_2/cold_starts.png -------------------------------------------------------------------------------- /public/images/deploy_2/memory_usage.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Chalingok/rig-docs/HEAD/public/images/deploy_2/memory_usage.png -------------------------------------------------------------------------------- /public/images/deploy_2/rig-metrics.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Chalingok/rig-docs/HEAD/public/images/deploy_2/rig-metrics.png -------------------------------------------------------------------------------- /public/images/deploy_1/lc-coldstarts.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Chalingok/rig-docs/HEAD/public/images/deploy_1/lc-coldstarts.png 
-------------------------------------------------------------------------------- /public/images/deploy_1/lc-power-tuner.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Chalingok/rig-docs/HEAD/public/images/deploy_1/lc-power-tuner.png -------------------------------------------------------------------------------- /public/images/deploy_1/rig-coldstarts.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Chalingok/rig-docs/HEAD/public/images/deploy_1/rig-coldstarts.png -------------------------------------------------------------------------------- /public/images/deploy_1/rig-power-tuner.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Chalingok/rig-docs/HEAD/public/images/deploy_1/rig-power-tuner.png -------------------------------------------------------------------------------- /public/images/deploy_2/lc-power-tuner.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Chalingok/rig-docs/HEAD/public/images/deploy_2/lc-power-tuner.png -------------------------------------------------------------------------------- /public/images/deploy_2/rig-power-tuner.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Chalingok/rig-docs/HEAD/public/images/deploy_2/rig-power-tuner.png -------------------------------------------------------------------------------- /pages/docs/3_concepts.mdx: -------------------------------------------------------------------------------- 1 | --- 2 | title: 🧩 Concepts 3 | description: This section contains the concepts for Rig. 
4 | --- 5 | 6 | # Concepts -------------------------------------------------------------------------------- /pages/guides/2_advanced/_meta.tsx: -------------------------------------------------------------------------------- 1 | export default { 2 | "21_discord_bot": "Discord Bot", 3 | "22_flight_assistant": "Flight Search Agent" 4 | } 5 | -------------------------------------------------------------------------------- /public/images/deploy_1/lc-deployment-package.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Chalingok/rig-docs/HEAD/public/images/deploy_1/lc-deployment-package.png -------------------------------------------------------------------------------- /public/images/deploy_1/rig-deployment-package.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Chalingok/rig-docs/HEAD/public/images/deploy_1/rig-deployment-package.png -------------------------------------------------------------------------------- /public/images/deploy_2/lc-deployment-package.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Chalingok/rig-docs/HEAD/public/images/deploy_2/lc-deployment-package.png -------------------------------------------------------------------------------- /public/images/deploy_2/rig-deployment-package.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Chalingok/rig-docs/HEAD/public/images/deploy_2/rig-deployment-package.png -------------------------------------------------------------------------------- /components/counters.module.css: -------------------------------------------------------------------------------- 1 | .counter { 2 | border: 1px solid #ccc; 3 | border-radius: 5px; 4 | padding: 2px 6px; 5 | margin: 12px 0 0; 6 | } 7 | -------------------------------------------------------------------------------- /pages/_app.tsx: -------------------------------------------------------------------------------- 1 | import '../styles.css'; 2 | 3 | export default function App({ Component, pageProps }) { 4 | return 5 | } -------------------------------------------------------------------------------- /pages/examples/_meta.tsx: -------------------------------------------------------------------------------- 1 | export default { 2 | "index": "Get Started", 3 | "1_rag": "RAG", 4 | "2_basics": "Basic", 5 | "3_advanced": "Advanced" 6 | } -------------------------------------------------------------------------------- /pages/docs/4_integrations/42_plugins/twitter.mdx: -------------------------------------------------------------------------------- 1 | --- 2 | title: Twitter 3 | description: This section describes the Twitter integration. 4 | --- 5 | 6 | Coming soon™️ ... -------------------------------------------------------------------------------- /pages/examples/0_model_providers/openai.mdx: -------------------------------------------------------------------------------- 1 | --- 2 | title: OpenAI 3 | description: This section describes the OpenAI API integration. 
4 | --- 5 | 6 | # OpenAI API Integration -------------------------------------------------------------------------------- /pages/docs/4_integrations/42_plugins.mdx: -------------------------------------------------------------------------------- 1 | --- 2 | title: Third Party Plugins 3 | description: This section describes the different third party plugins that Rig supports. 4 | --- 5 | -------------------------------------------------------------------------------- /pages/guides/index.mdx: -------------------------------------------------------------------------------- 1 |
Overview
2 | 3 | Explore our collection of blog posts and tutorials to learn how to build powerful LLM applications with Rig in Rust. -------------------------------------------------------------------------------- /pages/examples/0_model_providers/anthropic.mdx: -------------------------------------------------------------------------------- 1 | --- 2 | title: Anthropic (Claude) 3 | description: This section describes the Anthropic API (Claude) integration. 4 | --- 5 | 6 | # Anthropic API (Claude) Integration -------------------------------------------------------------------------------- /pages/guides/3_deploy/_meta.tsx: -------------------------------------------------------------------------------- 1 | const meta = { 2 | "Blog_1_aws_lambda": "Deploy a Rig Agent to AWS Lambda", 3 | "Blog_2_aws_lambda_lancedb": "Deploy a Rig Agent with LanceDB to AWS Lambda" 4 | } 5 | 6 | export default meta; -------------------------------------------------------------------------------- /pages/examples/2_basics/23_tic_tac_toe.mdx: -------------------------------------------------------------------------------- 1 | --- 2 | title: Tic-Tac-Toe Game 3 | description: This example demonstrates how to use Rig, a powerful Rust library for building LLM-powered applications, to create a Tic Tac Toe game. 4 | --- 5 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | .DS_Store 2 | .next/ 3 | .tsup/ 4 | node_modules/ 5 | *.log 6 | dist/ 7 | .turbo/ 8 | out/ 9 | 10 | .vercel/ 11 | .idea/ 12 | .eslintcache 13 | .env 14 | 15 | tsup.config.bundled* 16 | tsconfig.tsbuildinfo 17 | _pagefind/ -------------------------------------------------------------------------------- /pages/examples/2_basics/22_pid_controller.mdx: -------------------------------------------------------------------------------- 1 | --- 2 | title: PID Controller (Basic) 3 | description: This example demonstrates how to use Rig, a powerful Rust library for building LLM-powered applications, to create a PID controller. 4 | --- 5 | -------------------------------------------------------------------------------- /pages/examples/2_basics/24_synthetic_data.mdx: -------------------------------------------------------------------------------- 1 | --- 2 | title: Synthetic Data Generator 3 | description: This example demonstrates how to use Rig, a powerful Rust library for building LLM-powered applications, to create synthetic data. 4 | --- 5 | -------------------------------------------------------------------------------- /next-env.d.ts: -------------------------------------------------------------------------------- 1 | /// 2 | /// 3 | 4 | // NOTE: This file should not be edited 5 | // see https://nextjs.org/docs/pages/building-your-application/configuring/typescript for more information. 6 | -------------------------------------------------------------------------------- /pages/examples/2_basics/20_simple_agent.mdx: -------------------------------------------------------------------------------- 1 | --- 2 | title: Simple Agent 3 | description: This example demonstrates how to use Rig, a powerful Rust library for building LLM-powered applications, to create a simple agent that can answer questions and perform tasks. 
4 | --- 5 | -------------------------------------------------------------------------------- /pages/examples/2_basics/21_text_classification.mdx: -------------------------------------------------------------------------------- 1 | --- 2 | title: Text Classification 3 | description: This example demonstrates how to use Rig, a powerful Rust library for building LLM-powered applications, to create a simple agent that can answer questions and perform tasks. 4 | --- 5 | -------------------------------------------------------------------------------- /pages/guides/_meta.tsx: -------------------------------------------------------------------------------- 1 | const meta = { 2 | "index": "Overview", 3 | "0_text_extraction_classification": "Text Extraction and Classification", 4 | "1_rag": "Retrieval Augmented Generation (RAG)", 5 | "2_advanced": "Advanced Workflows", 6 | "3_deploy": "Deploy Rig" 7 | } 8 | 9 | export default meta; 10 | -------------------------------------------------------------------------------- /tailwind.config.ts: -------------------------------------------------------------------------------- 1 | import type { Config } from 'tailwindcss' 2 | 3 | export default { 4 | content: [ 5 | './pages/**/*.{js,ts,jsx,tsx,mdx}', 6 | './components/**/*.{js,ts,jsx,tsx}', 7 | './theme.config.tsx' 8 | ], 9 | theme: { 10 | extend: {} 11 | }, 12 | plugins: [], 13 | darkMode: 'class' 14 | } satisfies Config -------------------------------------------------------------------------------- /next.config.ts: -------------------------------------------------------------------------------- 1 | import nextra from 'nextra'; 2 | 3 | /** @type {import('next').NextConfig} */ 4 | const withNextra = nextra({ 5 | theme: 'nextra-theme-docs', 6 | themeConfig: './theme.config.tsx', 7 | defaultShowCopyCode: true, 8 | }); 9 | 10 | export default withNextra( 11 | { 12 | output: 'export', 13 | images: { unoptimized: true }, 14 | //basePath: '/rig-docs' 15 | } 16 | ); 17 | -------------------------------------------------------------------------------- /pages/docs/4_integrations/41_vector_stores.mdx: -------------------------------------------------------------------------------- 1 | --- 2 | title: Vector Stores 3 | description: This section describes the different vector stores that Rig supports. 4 | --- 5 | 6 | import { Cards } from 'nextra/components' 7 | 8 | 9 | 10 | 11 | 12 | 13 | -------------------------------------------------------------------------------- /components/counters.tsx: -------------------------------------------------------------------------------- 1 | // Example from https://beta.reactjs.org/learn 2 | 3 | import { useState } from 'react' 4 | import styles from './counters.module.css' 5 | 6 | function MyButton() { 7 | const [count, setCount] = useState(0) 8 | 9 | function handleClick() { 10 | setCount(count + 1) 11 | } 12 | 13 | return ( 14 |
15 | 18 |
19 | ) 20 | } 21 | 22 | export default function MyApp() { 23 | return 24 | } 25 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Rig Docs 2 | 3 | This repository contains the documentation for [Rig](https://github.com/0xPlaygrounds/rig). 4 | 5 | It is built with [Nextra](https://nextra.site) and hosted on [GitHub Pages](https://pages.github.com/). 6 | 7 | You can find documentation on Nextra's Docs theme & components [here](https://nextra.site/docs/guide). 8 | 9 | ## Local Development 10 | 11 | First, run `pnpm i` to install the dependencies. 12 | 13 | Then, run `pnpm dev` to start the development server and visit localhost:3000. 14 | 15 | ## License 16 | 17 | This project is licensed under the MIT License. 18 | -------------------------------------------------------------------------------- /tsconfig.json: -------------------------------------------------------------------------------- 1 | { 2 | "compilerOptions": { 3 | "target": "es5", 4 | "lib": ["dom", "dom.iterable", "esnext"], 5 | "allowJs": true, 6 | "skipLibCheck": true, 7 | "strict": false, 8 | "forceConsistentCasingInFileNames": true, 9 | "noEmit": true, 10 | "incremental": true, 11 | "esModuleInterop": true, 12 | "module": "esnext", 13 | "moduleResolution": "node", 14 | "resolveJsonModule": true, 15 | "isolatedModules": true, 16 | "jsx": "preserve", 17 | "paths": { 18 | "react": ["./node_modules/@types/react"] 19 | } 20 | }, 21 | "include": ["next-env.d.ts", "**/*.ts", "**/*.tsx", "tailwind.config.js", "pages/examples/index.mdx"], 22 | "exclude": ["node_modules"] 23 | 24 | } 25 | -------------------------------------------------------------------------------- /pages/examples/remote_examples_paths.json: -------------------------------------------------------------------------------- 1 | { 2 | "user": "0xPlaygrounds", 3 | "repo": "rig-examples", 4 | "branch": "master", 5 | "docsPath": "pages/examples/", 6 | "filePaths": [ 7 | "configs.mdx", 8 | "custom-rules.mdx", 9 | "getting-started.mdx", 10 | "getting-started/parser-options.mdx", 11 | "getting-started/parser.mdx", 12 | "index.mdx" 13 | ], 14 | "nestedMeta": { 15 | "index": "Introduction", 16 | "getting-started": { 17 | "type": "folder", 18 | "items": { 19 | "parser-options": "Parser Options", 20 | "parser": "Parser" 21 | } 22 | }, 23 | "configs": "Configs", 24 | "custom-rules": "Custom Rules" 25 | } 26 | } -------------------------------------------------------------------------------- /pages/docs/4_integrations/40_model_providers.mdx: -------------------------------------------------------------------------------- 1 | --- 2 | title: Model Providers 3 | description: This section describes the different model providers that Rig supports. 
4 | --- 5 | 6 | import { Cards } from 'nextra/components' 7 | 8 | # Model Provider 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | -------------------------------------------------------------------------------- /pages/docs/_meta.tsx: -------------------------------------------------------------------------------- 1 | import { Blocks, BookText, CircleHelp, DraftingCompass, Landmark, Puzzle, Rocket, Unplug } from "lucide-react"; 2 | 3 | const iconStyle = { 4 | width: '1rem', 5 | display: 'inline' 6 | } 7 | 8 | const meta = { 9 | "index": "Overview", 10 | "0_quickstart": {title: <>   Quickstart}, 11 | "1_why_rig": {title: <>   Why Rig}, 12 | "2_architecture": {title: <>   Architecture}, 13 | "3_concepts": {title: <>   Concepts}, 14 | "4_integrations": {title: <>   Integrations}, 15 | "5_extensions": {title: <>   Extensions} 16 | } 17 | 18 | export default meta; 19 | -------------------------------------------------------------------------------- /pages/_meta.tsx: -------------------------------------------------------------------------------- 1 | import { Home, BookText, PocketKnife, CookingPot } from 'lucide-react' 2 | 3 | const iconStyle = { 4 | width: '1rem', 5 | display: 'inline', 6 | verticalAlign: '-0.4rem', 7 | } 8 | 9 | 10 | 11 | const meta = { 12 | "index": { 13 | "title": <> Get Started, 14 | "type": "page" 15 | }, 16 | "docs": { 17 | "title": <> Docs, 18 | "type": "page" 19 | }, 20 | "guides": { 21 | "title": <> Tutorials & Guides, 22 | "type": "page" 23 | }, 24 | "examples": { 25 | "title": <> Examples, 26 | "type": "page" 27 | }, 28 | "apiReference": { 29 | "title": "API Reference ↗", 30 | "type": "page", 31 | "href": "https://docs.rs/rig-core", 32 | "newWindow": true 33 | }, 34 | "contact": { 35 | "title": "Contact ↗", 36 | "type": "page", 37 | "href": "https://playgrounds.network", 38 | "newWindow": true 39 | } 40 | } 41 | 42 | export default meta -------------------------------------------------------------------------------- /styles.css: -------------------------------------------------------------------------------- 1 | @tailwind base; 2 | @tailwind components; 3 | @tailwind utilities; 4 | 5 | .nextra-card > div { 6 | padding: 0 1rem 1rem 1rem; 7 | position: relative; 8 | right: 0px; 9 | order: 2; 10 | } 11 | 12 | #card { 13 | display: flex; flex-direction: column; 14 | } 15 | 16 | .section-title { 17 | font-size: 2rem; 18 | font-weight: 700; 19 | } 20 | 21 | .section-surtitle { 22 | font-size: 1.10rem; 23 | font-weight: 700; 24 | color: transparent; 25 | background-image: linear-gradient(85.52deg, rgba(3, 205, 244) -7.27%, rgb(79, 138, 254) 108.87%); 26 | background-clip: text; 27 | -webkit-background-clip: text; 28 | -webkit-text-fill-color: transparent; 29 | position: relative; 30 | top: 0.5rem; 31 | } 32 | 33 | 34 | 35 | /* Subsequent details should have less of a gap */ 36 | details + details { 37 | margin-top: 0.5rem !important; 38 | } 39 | 40 | /* Fix detail padding */ 41 | details > div { 42 | padding: 0.25rem; 43 | } 44 | 45 | /* Details header should be bold */ 46 | details > summary { 47 | font-weight: 700; 48 | } 49 | 50 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2022 Shu Ding 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without 
restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /package.json: -------------------------------------------------------------------------------- 1 | { 2 | "name": "nextra-docs-template", 3 | "version": "0.0.1", 4 | "description": "Nextra docs template", 5 | "scripts": { 6 | "dev": "next dev", 7 | "build": "next build", 8 | "start": "next start", 9 | "export": "next build && next export", 10 | "deploy": "gh-pages -d out -t true" 11 | }, 12 | "repository": { 13 | "type": "git", 14 | "url": "git+https://github.com/shuding/nextra-docs-template.git" 15 | }, 16 | "author": "Shu Ding ", 17 | "license": "MIT", 18 | "bugs": { 19 | "url": "https://github.com/shuding/nextra-docs-template/issues" 20 | }, 21 | "homepage": "https://github.com/shuding/nextra-docs-template#readme", 22 | "dependencies": { 23 | "@swc/helpers": "^0.5.15", 24 | "@types/react": "^18.3.12", 25 | "autoprefixer": "^10.4.20", 26 | "lucide-react": "^0.460.0", 27 | "next": "^15.0.3", 28 | "nextra": "^3.2.4", 29 | "nextra-theme-docs": "^3.2.4", 30 | "postcss": "^8.4.49", 31 | "react": "^18.3.1", 32 | "react-dom": "^18.3.1", 33 | "sharp": "^0.33.5" 34 | }, 35 | "devDependencies": { 36 | "@types/node": "18.11.10", 37 | "tailwindcss": "^3.4.15", 38 | "typescript": "^4.9.5" 39 | } 40 | } 41 | -------------------------------------------------------------------------------- /pages/examples/index.mdx: -------------------------------------------------------------------------------- 1 | import { GraduationCap, FileText, BookHeart } from 'lucide-react' 2 | import { Cards } from 'nextra/components' 3 | 4 |
Overview
5 | 6 | An index of our team-curated Rig code examples and a showcase of the best use cases from the community. 7 | 8 | For more detailed walkthroughs and guides, please see our [Tutorials & Guides](/guides) section. 9 | 10 | ### Showcase 11 | 12 | <Cards> 13 | <Cards.Card 14 | title={<p className="font-bold">arXiv Research Assistant</p>} 15 | href="https://rig-arxiv-agent-khjr.shuttle.app" 16 | children={<div id="card">
17 | A `shuttle.dev`-deployed research assistant that uses arXiv to find and summarize research papers. 18 | </div>
19 | } 20 | arrow 21 | target="_blank" 22 | /> 23 | PDF Summarizer} 25 | href="" 26 | children={
27 | Coming soon™️... 28 |
29 | } 30 | /> 31 | Sentiment Analysis/Extractor} 33 | href="" 34 | children={
35 | Coming soon™️... 36 |
37 | } 38 | /> 39 |
40 | 41 | -------------------------------------------------------------------------------- /pages/docs/index.mdx: -------------------------------------------------------------------------------- 1 | --- 2 | title: Overview 3 | description: This section contains the high-level documentation for Rig. 4 | --- 5 | 6 | import { CircleHelp, Landmark, Plug } from 'lucide-react' 7 | import { Cards } from 'nextra/components' 8 | 9 | # Docs 10 | 11 | This section contains the high-level documentation for Rig. 12 | 13 | 14 | } 17 | href="/docs/1_why_rig" 18 | children={
19 | Main features and benefits of Rig. 20 |
21 | } 22 | /> 23 | } 26 | href="/docs/2_architecture" 27 | children={
28 | Rig's architecture, design principles and key abstractions. 29 |
30 | } 31 | /> 32 | } 35 | href="/docs/3_concepts" 36 | children={
37 | Rig's core concepts and abstractions. 38 |
39 | } 40 | /> 41 | <Cards.Card 42 | icon={<Plug style={iconStyle}/>} 43 | title={<p className="font-bold">🔌 Integrations</p>} 44 | href="/docs/4_integrations" 45 | children={<div id="card">
46 | Integrations with LLM providers, vector databases, and third-party plugins (e.g. Discord, Twitter). 47 | </div>
48 | } 49 | /> 50 | <Cards.Card 51 | icon={<Landmark style={iconStyle}/>} 52 | title={<p className="font-bold">🛠️ Utility Tools</p>} 53 | href="/docs/5_extensions" 54 | children={<div id="card">
55 | Utility tools for Rig, such as the CLI chatbot. 56 | </div>
57 | } 58 | /> 59 |
60 | 61 | -------------------------------------------------------------------------------- /theme.config.tsx: -------------------------------------------------------------------------------- 1 | import React from 'react' 2 | import { DocsThemeConfig } from 'nextra-theme-docs' 3 | import { Home } from 'lucide-react' 4 | const config: DocsThemeConfig = { 5 | head: ( 6 | <> 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | ), 15 | logo: ( 16 | <> 17 | Rig Logo 18 | Rig Logo 19 | 37 | 38 | ), 39 | project: { 40 | link: 'https://github.com/0xPlaygrounds/rig', 41 | }, 42 | chat: { 43 | link: 'https://discord.gg/playgrounds', 44 | }, 45 | editLink: { 46 | content: 'Edit this page on GitHub →' 47 | }, 48 | feedback: { 49 | content: '💡 Question? Give us feedback →', 50 | labels: 'feedback' 51 | }, 52 | docsRepositoryBase: 'https://github.com/0xPlaygrounds/rig-docs', 53 | footer: { 54 | content: ( 55 |
56 | 68 |
69 | ) 70 | } 71 | } 72 | 73 | export default config 74 | -------------------------------------------------------------------------------- /pages/docs/2_architecture.mdx: -------------------------------------------------------------------------------- 1 | --- 2 | title: 🏛️ Architecture 3 | description: This section contains the architecture of Rig. 4 | --- 5 | 6 | import { FileTree } from 'nextra/components' 7 | 8 | # Architecture 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | ## Core concepts 38 | 39 | ### Completion and embedding models 40 | 41 | Rig provides a consistent API for working with LLMs and embeddings. Specifically, 42 | each provider (e.g. OpenAI, Cohere) has a `Client` struct that can be used to initialize completion 43 | and embedding models. These models implement the [CompletionModel](crate::completion::CompletionModel) 44 | and [EmbeddingModel](crate::embeddings::EmbeddingModel) traits respectively, which provide a common, 45 | low-level interface for creating completion and embedding requests and executing them. 46 | 47 | ### Agents 48 | 49 | Rig also provides high-level abstractions over LLMs in the form of the [Agent](crate::agent::Agent) type. 50 | 51 | The [Agent](crate::agent::Agent) type can be used to create anything from simple agents that use vanilla models to full blown 52 | RAG systems that can be used to answer questions using a knowledge base. 53 | 54 | ### Vector stores and indexes 55 | 56 | Rig defines a common interface for working with vector stores and indexes. Specifically, the library 57 | provides the [VectorStore](crate::vector_store::VectorStore) and [VectorStoreIndex](crate::vector_store::VectorStoreIndex) 58 | traits, which can be implemented on a given type to define vector stores and indices respectively. 59 | Those can then be used as the knowledge base for a RAG enabled [Agent](crate::agent::Agent), or 60 | as a source of context documents in a custom architecture that use multiple LLMs or agents. -------------------------------------------------------------------------------- /pages/examples/0_model_providers/gemini.mdx: -------------------------------------------------------------------------------- 1 | --- 2 | title: Gemini 3 | description: This section describes the Gemini API integration. 4 | --- 5 | 6 | # Gemini API Integration 7 | 8 | Based on the Gemini API Reference, most types (request parameters, response types) are generated from the API Reference, see [here](https://ai.google.dev/api?hl=fr&lang=python). 9 | 10 | 11 | ## Agent Example 12 | 13 | ```rust 14 | use rig::{ 15 | completion::Prompt, 16 | providers::gemini::{self, completion::gemini_api_types::GenerationConfig}, 17 | }; 18 | #[tracing::instrument(ret)] 19 | #[tokio::main] 20 | 21 | async fn main() -> Result<(), anyhow::Error> { 22 | tracing_subscriber::fmt() 23 | .with_max_level(tracing::Level::DEBUG) 24 | .with_target(false) 25 | .init(); 26 | 27 | // Initialize the Google Gemini client 28 | let client = gemini::Client::from_env(); 29 | 30 | // Create agent with a single context prompt 31 | let agent = client 32 | .agent(gemini::completion::GEMINI_1_5_PRO) 33 | .preamble("Be creative and concise. 
Answer directly and clearly.") 34 | .temperature(0.5) 35 | // The `GenerationConfig` utility struct helps construct a typesafe `additional_params` 36 | .additional_params(serde_json::to_value(GenerationConfig { 37 | top_k: Some(1), 38 | top_p: Some(0.95), 39 | candidate_count: Some(1), 40 | ..Default::default() 41 | })?) // Unwrap the Result to get the Value 42 | .build(); 43 | 44 | tracing::info!("Prompting the agent..."); 45 | 46 | // Prompt the agent and print the response 47 | let response = agent 48 | .prompt("How much wood would a woodchuck chuck if a woodchuck could chuck wood? Infer an answer.") 49 | .await; 50 | 51 | tracing::info!("Response: {:?}", response); 52 | 53 | match response { 54 | Ok(response) => println!("{}", response), 55 | Err(e) => { 56 | tracing::error!("Error: {:?}", e); 57 | return Err(e.into()); 58 | } 59 | } 60 | 61 | Ok(()) 62 | } 63 | ``` 64 | 65 | ## Embeddings 66 | 67 | ```rust 68 | use rig::providers::gemini; 69 | use rig::Embed; 70 | 71 | #[derive(Embed, Debug)] 72 | struct Greetings { 73 | #[embed] 74 | message: String, 75 | } 76 | 77 | #[tokio::main] 78 | async fn main() -> Result<(), anyhow::Error> { 79 | // Initialize the Google Gemini client 80 | // Create OpenAI client 81 | let client = gemini::Client::from_env(); 82 | 83 | let embeddings = client 84 | .embeddings(gemini::embedding::EMBEDDING_001) 85 | .document(Greetings { 86 | message: "Hello, world!".to_string(), 87 | })? 88 | .document(Greetings { 89 | message: "Goodbye, world!".to_string(), 90 | })? 91 | .build() 92 | .await 93 | .expect("Failed to embed documents"); 94 | 95 | println!("{:?}", embeddings); 96 | 97 | Ok(()) 98 | } 99 | ``` -------------------------------------------------------------------------------- /pages/examples/3_advanced/30_concurrent_processing.mdx: -------------------------------------------------------------------------------- 1 | --- 2 | title: Concurrent Processing with Rig 3 | description: This example demonstrates how to use Rig, a powerful Rust library for building LLM-powered applications, to perform concurrent processing of LLM tasks. This approach significantly improves performance when dealing with multiple LLM queries, making it ideal for batch processing or high-throughput scenarios. 4 | --- 5 | 6 | # Concurrent Processing with Rig 7 | 8 | This example demonstrates how to use [Rig](https://github.com/0xPlaygrounds/rig), a powerful Rust library for building LLM-powered applications, to perform concurrent processing of LLM tasks. This approach significantly improves performance when dealing with multiple LLM queries, making it ideal for batch processing or high-throughput scenarios. 9 | 10 | ### Prerequisites 11 | 12 | Before you begin, ensure you have the following installed: 13 | 14 | - Rust (latest stable version) 15 | - Cargo (Rust's package manager) 16 | 17 | You'll also need an OpenAI or Cohere API key. If you don't have one, you can sign up at [OpenAI's website](https://openai.com) or [Cohere's website](https://cohere.com/) 18 | 19 | ### Setup 20 | 21 | 1. Create a new Rust project: 22 | ``` 23 | cargo new rig-concurrent-processing 24 | cd rig-concurrent-processing 25 | ``` 26 | 27 | 2. Add the following dependencies to your `Cargo.toml`: 28 | ```toml 29 | [dependencies] 30 | rig-core = "0.1.0" 31 | tokio = { version = "1.0", features = ["full"] } 32 | ``` 33 | 34 | 3. 
Set your OpenAI API key as an environment variable: 35 | ``` 36 | export OPENAI_API_KEY=your_api_key_here 37 | ``` 38 | 39 | ### Code Overview 40 | 41 | The main components of this example are: 42 | 43 | 1. OpenAI client initialization. 44 | 2. Creation of a shared GPT-3.5-turbo model instance. 45 | 3. Spawning of multiple concurrent tasks using Tokio. 46 | 4. Concurrent execution of LLM queries. 47 | 5. Collection and display of results. 48 | 49 | ### Running the Example 50 | 51 | 1. Copy the provided code into your `src/main.rs` file. 52 | 2. Run the example using: 53 | ``` 54 | cargo run 55 | ``` 56 | 57 | ### Customization 58 | 59 | You can easily modify this example to suit your specific use case: 60 | - Change the number of concurrent tasks by adjusting the loop range. 61 | - Modify the prompt to generate different types of content. 62 | - Experiment with different OpenAI models by changing the model name. 63 | 64 | ### Performance Considerations 65 | 66 | - Be mindful of OpenAI's rate limits when increasing concurrency. 67 | - Monitor system resource usage to optimize the number of concurrent tasks. 68 | - Consider implementing error handling and retry logic for production use. 69 | 70 | ### Troubleshooting 71 | 72 | If you encounter any issues: 73 | - Ensure your OpenAI API key is correctly set. 74 | - Check that all dependencies are properly installed. 75 | - Verify that you're using a compatible Rust version. 76 | 77 | For more detailed information, refer to the [Rig documentation](https://docs.rs/rig). -------------------------------------------------------------------------------- /.github/workflows/nextjs.yml: -------------------------------------------------------------------------------- 1 | # Sample workflow for building and deploying a Next.js site to GitHub Pages 2 | # 3 | # To get started with Next.js see: https://nextjs.org/docs/getting-started 4 | # 5 | name: Deploy Next.js site to GitHub Pages 6 | 7 | on: 8 | # Runs on pushes targeting the default branch 9 | push: 10 | branches: ["main"] 11 | 12 | # Allows you to run this workflow manually from the Actions tab 13 | workflow_dispatch: 14 | 15 | # Sets permissions of the GITHUB_TOKEN to allow deployment to GitHub Pages 16 | permissions: 17 | contents: read 18 | pages: write 19 | id-token: write 20 | 21 | # Allow only one concurrent deployment, skipping runs queued between the run in-progress and latest queued. 22 | # However, do NOT cancel in-progress runs as we want to allow these production deployments to complete. 
23 | concurrency: 24 | group: "pages" 25 | cancel-in-progress: false 26 | 27 | jobs: 28 | # Build job 29 | build: 30 | runs-on: ubuntu-latest 31 | steps: 32 | - name: Checkout 33 | uses: actions/checkout@v4 34 | - name: Detect package manager 35 | id: detect-package-manager 36 | run: | 37 | if [ -f "${{ github.workspace }}/yarn.lock" ]; then 38 | echo "manager=yarn" >> $GITHUB_OUTPUT 39 | echo "command=install" >> $GITHUB_OUTPUT 40 | echo "runner=yarn" >> $GITHUB_OUTPUT 41 | exit 0 42 | elif [ -f "${{ github.workspace }}/package.json" ]; then 43 | echo "manager=npm" >> $GITHUB_OUTPUT 44 | echo "command=ci" >> $GITHUB_OUTPUT 45 | echo "runner=npx --no-install" >> $GITHUB_OUTPUT 46 | exit 0 47 | else 48 | echo "Unable to determine package manager" 49 | exit 1 50 | fi 51 | - uses: pnpm/action-setup@v4 52 | name: Install pnpm 53 | with: 54 | version: 9 55 | run_install: false 56 | 57 | - name: Install Node.js 58 | uses: actions/setup-node@v4 59 | with: 60 | node-version: 20 61 | cache: 'pnpm' 62 | 63 | - name: Install dependencies 64 | run: pnpm install 65 | - name: Setup Pages 66 | uses: actions/configure-pages@v5 67 | # with: 68 | # Automatically inject basePath in your Next.js configuration file and disable 69 | # server side image optimization (https://nextjs.org/docs/api-reference/next/image#unoptimized). 70 | # 71 | # You may remove this line if you want to manage the configuration yourself. 72 | # static_site_generator: next 73 | - name: Build with Next.js 🏗️ 74 | run: ${{ steps.detect-package-manager.outputs.runner }} next build 75 | - name: Upload artifact 76 | uses: actions/upload-pages-artifact@v3 77 | with: 78 | path: ./out 79 | 80 | # Deployment job 81 | deploy : 82 | environment: 83 | name: github-pages 84 | url: ${{ steps.deployment.outputs.page_url }} 85 | runs-on: ubuntu-latest 86 | needs: build 87 | steps: 88 | - name: Deploy to GitHub Pages 🚀 89 | id: deployment 90 | uses: actions/deploy-pages@v4 91 | -------------------------------------------------------------------------------- /pages/docs/4_integrations/41_vector_stores/neo4j.mdx: -------------------------------------------------------------------------------- 1 | --- 2 | title: Neo4j 3 | --- 4 | 5 | import { Cards } from 'nextra/components' 6 | 7 | # Rig-Neo4j Integration 8 | 9 | The `rig-neo4j` crate provides a vector store implementation using Neo4j as the underlying datastore. This integration allows for efficient vector-based searches within a Neo4j graph database, leveraging Neo4j's vector index capabilities. 10 | 11 | ## Key Features 12 | 13 | - **Vector Indexing**: Supports creating and querying vector indexes in Neo4j, enabling semantic search capabilities. 14 | - **Integration with OpenAI**: Utilizes OpenAI's embedding models to generate vector representations of data. 15 | - **Flexible Querying**: Offers methods to perform top-N searches and retrieve node IDs based on vector similarity. 16 | - **Customizable Index Configuration**: Allows configuration of index properties such as dimensions and similarity functions. 17 | 18 | ## Prerequisites 19 | 20 | - **Neo4j GenAI Plugin**: Required for vector indexing capabilities. Enabled by default in Neo4j Aura or can be installed on self-managed instances. 21 | - **Pre-existing Vector Index**: The Neo4j vector index must be created before performing searches. This can be done using the Neo4j browser, Cypher queries, or the `Neo4jClient::create_vector_index` method. 
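If you manage the index from code, the one-time setup below can run at startup so that later queries always find the index in place. This is a sketch that reuses the calls from the Usage section that follows; the import paths and the exact `create_vector_index` signature are assumptions to verify against your `rig-neo4j` version, and an `openai_client` is assumed to be in scope:

```rust
use rig::providers::openai::TEXT_EMBEDDING_ADA_002;
use rig_neo4j::{vector_index::IndexConfig, Neo4jClient};

// One-time setup: ensure the vector index exists before serving queries.
let neo4j_client =
    Neo4jClient::connect("neo4j://localhost:7687", "username", "password").await?;
let model = openai_client.embedding_model(TEXT_EMBEDDING_ADA_002);

// Creates the "moviePlots" index over `Movie` nodes (run once, e.g. at deploy time).
neo4j_client
    .create_vector_index(IndexConfig::new("moviePlots"), "Movie", &model)
    .await?;
```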
22 | 23 | ## Usage 24 | 25 | ### Setup 26 | 27 | Add the `rig-neo4j` crate to your `Cargo.toml`: 28 | 29 | ```toml 30 | [dependencies] 31 | rig-neo4j = "0.2.0" 32 | ``` 33 | 34 | ### Example Workflow 35 | 36 | 1. **Connect to Neo4j**: Establish a connection using the `Neo4jClient`. 37 | 38 | ```rust 39 | let neo4j_client = Neo4jClient::connect("neo4j://localhost:7687", "username", "password").await?; 40 | ``` 41 | 42 | 2. **Create Vector Index**: Define and create a vector index on your data. 43 | 44 | ```rust 45 | neo4j_client.create_vector_index( 46 | IndexConfig::new("moviePlots"), 47 | "Movie", 48 | &model 49 | ).await?; 50 | ``` 51 | 52 | 3. **Perform Vector Search**: Query the vector index for similar nodes. 53 | 54 | ```rust 55 | let results = index.top_n::("a historical movie on quebec", 5).await?; 56 | ``` 57 | 58 | ### Example Code 59 | 60 | ```rust 61 | #[tokio::main] 62 | async fn main() -> Result<(), anyhow::Error> { 63 | let neo4j_client = Neo4jClient::connect("neo4j://localhost:7687", "username", "password").await?; 64 | let model = openai_client.embedding_model(TEXT_EMBEDDING_ADA_002); 65 | 66 | neo4j_client.create_vector_index( 67 | IndexConfig::new("moviePlots"), 68 | "Movie", 69 | &model 70 | ).await?; 71 | 72 | let index = neo4j_client.get_index(model, "moviePlots", SearchParams::default()).await?; 73 | let results = index.top_n::("a historical movie on quebec", 5).await?; 74 | println!("{:#?}", results); 75 | 76 | Ok(()) 77 | } 78 | ``` 79 | 80 | ## Additional Resources 81 | 82 | - **Examples**: See the [examples](https://github.com/0xPlaygrounds/rig/tree/main/rig-neo4j/examples) directory for detailed usage scenarios. 83 | - **Neo4j Documentation**: Refer to the [Neo4j vector index documentation](https://neo4j.com/docs/cypher-manual/current/indexes/semantic-indexes/vector-indexes/) for more information on setting up and using vector indexes. 84 | 85 |
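As a small extension of the example code above, `top_n` can deserialize matched nodes into a concrete type instead of `serde_json::Value`. The `Movie` struct below is an illustrative assumption about the node shape, and the exact structure of each returned entry can vary between `rig` versions:

```rust
use serde::Deserialize;

// Hypothetical node shape; adjust the fields to match your graph.
#[derive(Debug, Deserialize)]
struct Movie {
    title: String,
    plot: String,
}

// Same query as above, but with strongly typed results.
let results = index.top_n::<Movie>("a historical movie on quebec", 5).await?;
for entry in &results {
    println!("{entry:?}");
}
```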
86 | 87 | 88 | 89 | -------------------------------------------------------------------------------- /pages/docs/3_concepts/1_embeddings.mdx: -------------------------------------------------------------------------------- 1 | --- 2 | title: Embeddings 3 | description: This section contains the concepts for Rig. 4 | --- 5 | import { Cards } from 'nextra/components' 6 | 7 | # Embeddings in Rig 8 | 9 | Rig provides a comprehensive embeddings system for converting text and other data types into numerical vector representations that can be used for semantic search, similarity comparisons, and other NLP tasks. 10 | 11 | ## Core Concepts 12 | 13 | ### Embeddings 14 | 15 | An embedding is a vector representation of data (usually text) where semantically similar items are mapped to nearby points in the vector space. In Rig, embeddings are represented by the `Embedding` struct which contains: 16 | 17 | - The original document text 18 | - The vector representation as `Vec` 19 | 20 | ### The Embedding Process 21 | 22 | 1. **Document Preparation** 23 | - Documents implement the `Embed` trait 24 | - The `TextEmbedder` accumulates text to be embedded 25 | - Built-in implementations for common types (strings, numbers, JSON) 26 | 27 | 2. **Batch Processing** 28 | - The `EmbeddingsBuilder` collects multiple documents 29 | - Documents are batched for efficient API calls 30 | - Handles concurrent embedding generation 31 | 32 | 3. **Vector Generation** 33 | - An `EmbeddingModel` converts text to vectors 34 | - Providers like OpenAI implement the model interface 35 | - Results include both document text and vectors 36 | 37 | ## Working with Embeddings 38 | 39 | ### Basic Usage 40 | 41 | ```rust 42 | use rig::{embeddings::EmbeddingsBuilder, providers::openai}; 43 | 44 | // Create embedding model 45 | let model = openai_client.embedding_model("text-embedding-ada-002"); 46 | 47 | // Build embeddings 48 | let embeddings = EmbeddingsBuilder::new(model) 49 | .document("Some text")? 50 | .document("More text")? 51 | .build() 52 | .await?; 53 | ``` 54 | 55 | ### Vector Operations 56 | 57 | Rig provides several distance metrics for comparing embeddings: 58 | 59 | - Cosine similarity 60 | - Angular distance 61 | - Euclidean distance 62 | - Manhattan distance 63 | - Chebyshev distance 64 | 65 | Example: 66 | ```rust 67 | let similarity = embedding1.cosine_similarity(&embedding2, false); 68 | let distance = embedding1.euclidean_distance(&embedding2); 69 | ``` 70 | 71 | ### Custom Types 72 | 73 | To make your types embeddable, implement the `Embed` trait: 74 | 75 | ```rust 76 | struct Document { 77 | title: String, 78 | content: String 79 | } 80 | 81 | impl Embed for Document { 82 | fn embed(&self, embedder: &mut TextEmbedder) -> Result<(), EmbedError> { 83 | embedder.embed(self.title.clone()); 84 | embedder.embed(self.content.clone()); 85 | Ok(()) 86 | } 87 | } 88 | ``` 89 | 90 | ## Best Practices 91 | 92 | 1. **Document Preparation** 93 | - Clean and normalize text before embedding 94 | - Consider chunking large documents 95 | - Remove irrelevant embedding content 96 | 97 | 2. **Error Handling** 98 | - Handle provider API errors gracefully 99 | - Validate vector dimensions 100 | - Check for empty or invalid input 101 | 102 | 4. 
**Batching** 103 | - Use `EmbeddingsBuilder` for multiple documents 104 | - Respects provider's max batch size 105 | - Automatically handles concurrent processing 106 | 107 | ## See Also 108 | 109 | - [Completion & Generation](./0_completion.mdx) 110 | - [Tools](./2_tools.mdx) 111 | - [Vector Stores](../4_integrations/41_vector_store.mdx) 112 | 113 |
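To tie the sections above together, here is a minimal end-to-end sketch: it derives `Embed` for a custom type, batches two documents through `EmbeddingsBuilder`, and prints the results. It assumes `rig-core`, `tokio`, and `anyhow` as dependencies and an `OPENAI_API_KEY` in the environment:

```rust
use rig::{embeddings::EmbeddingsBuilder, providers::openai, Embed};

#[derive(Embed, Debug)]
struct Document {
    title: String,
    // Only the annotated field is embedded.
    #[embed]
    content: String,
}

#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
    // Reads OPENAI_API_KEY from the environment.
    let openai_client = openai::Client::from_env();
    let model = openai_client.embedding_model("text-embedding-ada-002");

    // Both documents are accumulated and sent as a batched request.
    let embeddings = EmbeddingsBuilder::new(model)
        .document(Document {
            title: "Rust".to_string(),
            content: "Rust is a systems programming language.".to_string(),
        })?
        .document(Document {
            title: "Cooking".to_string(),
            content: "Cooking combines ingredients and heat.".to_string(),
        })?
        .build()
        .await?;

    // Each entry pairs a document with its embedding(s).
    println!("{:?}", embeddings);
    Ok(())
}
```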
114 | 115 | -------------------------------------------------------------------------------- /pages/docs/5_extensions/0_cli_chatbot.mdx: -------------------------------------------------------------------------------- 1 | --- 2 | title: CLI Chatbot 3 | description: This section contains the concepts for Rig. 4 | --- 5 | import { Cards } from 'nextra/components' 6 | 7 | # CLI Chatbot Utility 8 | 9 | ## Overview 10 | A utility function that creates an interactive REPL-style chatbot from any type implementing the `Chat` trait. Manages chat history, I/O, and basic command handling. 11 | 12 | ## Usage 13 | ```rust 14 | use rig::{cli_chatbot, providers::openai}; 15 | 16 | let agent = openai.agent("gpt-4") 17 | .preamble("You are a helpful assistant.") 18 | .build(); 19 | 20 | cli_chatbot(agent).await?; 21 | ``` 22 | 23 | ## Features 24 | - Interactive REPL interface with `> ` prompt 25 | - Maintains chat history for context 26 | - Simple "exit" command 27 | - Error handling for I/O and chat operations 28 | - Tracing support for debugging 29 | 30 | ## Implementation 31 | Reference: 32 | 33 | ```7:50:rig-core/src/cli_chatbot.rs 34 | pub async fn cli_chatbot(chatbot: impl Chat) -> Result<(), PromptError> { 35 | let stdin = io::stdin(); 36 | let mut stdout = io::stdout(); 37 | let mut chat_log = vec![]; 38 | 39 | println!("Welcome to the chatbot! Type 'exit' to quit."); 40 | loop { 41 | print!("> "); 42 | // Flush stdout to ensure the prompt appears before input 43 | stdout.flush().unwrap(); 44 | 45 | let mut input = String::new(); 46 | match stdin.read_line(&mut input) { 47 | Ok(_) => { 48 | // Remove the newline character from the input 49 | let input = input.trim(); 50 | // Check for a command to exit 51 | if input == "exit" { 52 | break; 53 | } 54 | tracing::info!("Prompt:\n{}\n", input); 55 | 56 | let response = chatbot.chat(input, chat_log.clone()).await?; 57 | chat_log.push(Message { 58 | role: "user".into(), 59 | content: input.into(), 60 | }); 61 | chat_log.push(Message { 62 | role: "assistant".into(), 63 | content: response.clone(), 64 | }); 65 | 66 | println!("========================== Response ============================"); 67 | println!("{response}"); 68 | println!("================================================================\n\n"); 69 | 70 | tracing::info!("Response:\n{}\n", response); 71 | } 72 | Err(error) => println!("Error reading input: {}", error), 73 | } 74 | } 75 | 76 | Ok(()) 77 | } 78 | ``` 79 | 80 | 81 | ## Chat History Management 82 | Automatically tracks conversation with `Message` objects: 83 | ```rust 84 | Message { 85 | role: "user"|"assistant", 86 | content: String 87 | } 88 | ``` 89 | 90 | ## Examples 91 | Used in calculator chatbot: 92 | 93 | ```279:281:rig-core/examples/calculator_chatbot.rs 94 | 95 | cli_chatbot(calculator_rag).await?; 96 | 97 | ``` 98 | 99 | 100 | And multi-agent systems: 101 | 102 | ```66:67:rig-core/examples/multi_agent.rs 103 | // Spin up a chatbot using the agent 104 | cli_chatbot(translator).await?; 105 | ``` 106 | 107 | 108 | ## Error Handling 109 | - Returns `Result<(), PromptError>` 110 | - Handles I/O errors gracefully 111 | - Propagates chat implementation errors 112 | 113 | ## See Also 114 | - [Agent System](./3_concepts/3_sagent.md) 115 | - [Completion Models](./3_concepts/0_completion.md) 116 | 117 | 118 | -------------------------------------------------------------------------------- /public/rig-dark.svg: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 10 | 11 | 12 | 14 | 16 | 
17 | 18 | 20 | -------------------------------------------------------------------------------- /public/rig-light.svg: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 10 | 11 | 12 | 14 | 16 | 17 | 18 | 20 | -------------------------------------------------------------------------------- /pages/examples/1_rag/10_rag_pdf.mdx: -------------------------------------------------------------------------------- 1 | --- 2 | title: PDF Summarizer 3 | description: This example demonstrates how to use Rig, a powerful Rust library for building LLM-powered applications, to perform concurrent processing of LLM tasks. This approach significantly improves performance when dealing with multiple LLM queries, making it ideal for batch processing or high-throughput scenarios. 4 | --- 5 | 6 | # Building a RAG Agent over PDF files using Rig 7 | 8 | ## Overview 9 | 10 | This project demonstrates a Retrieval-Augmented Generation (RAG) system built with Rig, a Rust library for developing LLM-powered applications. The system processes PDF documents, creates embeddings, and uses OpenAI's GPT-3.5-turbo model to answer questions based on the content of these documents. 11 | 12 | In this example, we use two PDF documents: 13 | 1. "Moore's Law for Everything" by Sam Altman 14 | 2. "The Last Question" by Isaac Asimov 15 | 16 | ## Features 17 | 18 | - PDF text extraction 19 | - Document embedding using OpenAI's text-embedding-ada-002 model 20 | - In-memory vector store for quick retrieval 21 | - Dynamic context generation for each query 22 | - Interactive Q&A interface 23 | 24 | ## Prerequisites 25 | 26 | Before you begin, ensure you have the following installed: 27 | - Rust (latest stable version) 28 | - Cargo (Rust's package manager) 29 | 30 | You'll also need an OpenAI API key. If you don't have one, sign up at [OpenAI's website](https://openai.com). 31 | 32 | ## Setup 33 | 34 | 1. Clone this repository: 35 | ``` 36 | git clone this repo 37 | cd pdf-rag-system 38 | ``` 39 | 40 | 2. Set your OpenAI API key as an environment variable: 41 | ``` 42 | export OPENAI_API_KEY=your_api_key_here 43 | ``` 44 | 45 | 3. Ensure you have the following PDF documents in a `documents` folder in your project root: 46 | - `Moores_Law_for_Everything.pdf` 47 | - `The_Last_Question.pdf` 48 | 49 | ## Running the Application 50 | 51 | 1. Build and run the application: 52 | ``` 53 | cargo run 54 | ``` 55 | 56 | 2. Once the system is ready, you'll see the message: "RAG System ready. Type 'exit' to quit." 57 | 58 | 3. Enter your questions at the prompt. The system will provide answers based on the content of the PDF documents. 59 | 60 | 4. To exit the application, type 'exit' at the prompt. 61 | 62 | ## Example Usage 63 | 64 | ``` 65 | RAG System ready. Type 'exit' to quit. 66 | Enter your question: tell me the premise of what sam altman is talking about 67 | Response: Sam Altman discusses the coming technological revolution driven by powerful artificial intelligence (AI) systems that can think, learn, and perform tasks currently done by people. He highlights how this AI revolution will lead to the creation of phenomenal wealth but also emphasizes the need for policy changes to distribute this wealth and ensure inclusivity in society. Altman proposes the idea of embracing AI advancements, transitioning taxation from labor to capital (such as companies and land), and distributing wealth equitably through the American Equity Fund. 
This plan aims to improve the standard of living for everyone by leveraging technology and fair economic policies in a rapidly changing future. 68 | Enter your question: 69 | ``` 70 | 71 | ## How It Works 72 | 73 | 1. **PDF Processing**: The system extracts text from the specified PDF documents. 74 | 2. **Embedding Creation**: It generates embeddings for the extracted text using OpenAI's embedding model. 75 | 3. **Vector Store**: The embeddings are stored in an in-memory vector store for quick retrieval. 76 | 4. **Query Processing**: When a user enters a question, the system: 77 | a. Generates an embedding for the question. 78 | b. Retrieves the most relevant context from the vector store. 79 | c. Sends the question and context to the GPT-3.5-turbo model. 80 | d. Returns the model's response to the user. 81 | 82 | ## Customization 83 | 84 | - To use different PDF documents, place them in the `documents` folder and update the file paths in the `main` function. 85 | - You can adjust the number of relevant documents retrieved for each query by changing the `dynamic_context` parameter. 86 | - To use a different OpenAI model, modify the model name in the `context_rag_agent` function call. 87 | 88 | ## Troubleshooting 89 | 90 | If you encounter any issues: 91 | - Ensure your OpenAI API key is correctly set. 92 | - Verify that the PDF documents are in the correct location and are readable. 93 | - Check that all dependencies are properly installed by running `cargo build`. 94 | 95 | ## Dependencies 96 | 97 | This project uses the following main dependencies: 98 | - `rig-core`: For building LLM-powered applications 99 | - `pdf-extract`: For extracting text from PDF files 100 | - `tokio`: For asynchronous runtime 101 | - `anyhow`: For error handling 102 | 103 | For a complete list of dependencies, refer to the `Cargo.toml` file. 104 | 105 | 106 | ## Contributing 107 | 108 | Contributions are welcome! Please feel free to submit a Pull Request. -------------------------------------------------------------------------------- /pages/docs/4_integrations/40_model_providers/openai.mdx: -------------------------------------------------------------------------------- 1 | --- 2 | title: OpenAI 3 | description: Integration with OpenAI's API services, supporting both completion and embedding models. 4 | --- 5 | 6 | import { Cards } from 'nextra/components' 7 | 8 | # OpenAI Integration 9 | 10 | The OpenAI provider in Rig offers integration with OpenAI's API services, supporting both completion and embedding models. It provides a client-based architecture for interacting with OpenAI's models. 
11 | 12 | ## Key Features 13 | 14 | - Full support for GPT-3.5 and GPT-4 model families 15 | - Text embedding models (Ada, text-embedding-3) 16 | - Automatic token usage tracking 17 | - Tool/function calling support 18 | - Custom API endpoint configuration 19 | 20 | ## Basic Usage 21 | 22 | ```rust 23 | use rig::providers::openai; 24 | 25 | // Create client from environment variable 26 | let client = openai::Client::from_env(); 27 | 28 | // Or explicitly with API key 29 | let client = openai::Client::new("your-api-key"); 30 | 31 | // Create a completion model 32 | let gpt4 = client.completion_model(openai::GPT_4); 33 | 34 | // Create an embedding model 35 | let embedder = client.embedding_model(openai::TEXT_EMBEDDING_3_LARGE); 36 | ``` 37 | 38 | ## Available Models 39 | 40 | ### Completion Models 41 | - `GPT_4` / `GPT_4O`: GPT-4 base and optimized versions 42 | - `GPT_35_TURBO`: GPT-3.5 Turbo and its variants 43 | - `GPT_35_TURBO_INSTRUCT`: Instruction-tuned GPT-3.5 44 | 45 | ### Embedding Models 46 | - `TEXT_EMBEDDING_3_LARGE`: 3072 dimensions 47 | - `TEXT_EMBEDDING_3_SMALL`: 1536 dimensions 48 | - `TEXT_EMBEDDING_ADA_002`: 1536 dimensions (legacy) 49 | 50 | ## Special Considerations 51 | 52 | 1. **Tool Calling**: OpenAI models support function calling through a specialized JSON format. The provider automatically handles conversion between Rig's tool definitions and OpenAI's expected format. 53 | 54 | 2. **Response Processing**: The provider implements special handling for: 55 | - Tool/function call responses 56 | - System messages 57 | - Token usage tracking 58 | 59 | 3. **Error Handling**: OpenAI-specific errors are automatically converted to Rig's error types for consistent error handling across providers. 60 | 61 | ## Implementation Details 62 | 63 | The core OpenAI provider implementation can be found in: 64 | 65 | ```rust filename="rig-core/src/providers/openai.rs [1:150]" 66 | //! OpenAI API client and Rig integration 67 | //! 68 | //! # Example 69 | //! ``` 70 | //! use rig::providers::openai; 71 | //! 72 | //! let client = openai::Client::new("YOUR_API_KEY"); 73 | //! 74 | //! let gpt4o = client.completion_model(openai::GPT_4O); 75 | //! ``` 76 | use crate::{ 77 | agent::AgentBuilder, 78 | completion::{self, CompletionError, CompletionRequest}, 79 | embeddings::{self, EmbeddingError, EmbeddingsBuilder}, 80 | extractor::ExtractorBuilder, 81 | json_utils, Embed, 82 | }; 83 | use schemars::JsonSchema; 84 | use serde::{Deserialize, Serialize}; 85 | use serde_json::json; 86 | 87 | // ================================================================ 88 | // Main OpenAI Client 89 | // ================================================================ 90 | const OPENAI_API_BASE_URL: &str = "https://api.openai.com"; 91 | 92 | #[derive(Clone)] 93 | pub struct Client { 94 | base_url: String, 95 | http_client: reqwest::Client, 96 | } 97 | 98 | impl Client { 99 | /// Create a new OpenAI client with the given API key. 100 | pub fn new(api_key: &str) -> Self { 101 | Self::from_url(api_key, OPENAI_API_BASE_URL) 102 | } 103 | 104 | /// Create a new OpenAI client with the given API key and base API URL. 
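    // NOTE: the body below is a sketch of how `from_url` is typically
    // implemented (rig bakes the API key into the reqwest client's default
    // `Authorization` header); the exact code is elided from this excerpt,
    // so consult `rig-core/src/providers/openai.rs` for the real thing.
    pub fn from_url(api_key: &str, base_url: &str) -> Self {
        Self {
            base_url: base_url.to_string(),
            http_client: reqwest::Client::builder()
                .default_headers({
                    let mut headers = reqwest::header::HeaderMap::new();
                    headers.insert(
                        "Authorization",
                        format!("Bearer {}", api_key)
                            .parse()
                            .expect("Bearer token should parse"),
                    );
                    headers
                })
                .build()
                .expect("OpenAI reqwest client should build"),
        }
    }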
105 | ```
106 |
107 |
108 | Tool calling and response handling:
109 |
110 | ```rust filename="rig-core/src/providers/openai.rs [350:450]"
111 | #[derive(Debug, Deserialize)]
112 | pub struct Choice {
113 |     pub index: usize,
114 |     pub message: Message,
115 |     pub logprobs: Option<serde_json::Value>,
116 |     pub finish_reason: String,
117 | }
118 |
119 | #[derive(Debug, Deserialize)]
120 | pub struct Message {
121 |     pub role: String,
122 |     pub content: Option<String>,
123 |     pub tool_calls: Option<Vec<ToolCall>>,
124 | }
125 |
126 | #[derive(Debug, Deserialize)]
127 | pub struct ToolCall {
128 |     pub id: String,
129 |     pub r#type: String,
130 |     pub function: Function,
131 | }
132 |
133 | #[derive(Clone, Debug, Deserialize, Serialize)]
134 | pub struct ToolDefinition {
135 |     pub r#type: String,
136 |     pub function: completion::ToolDefinition,
137 | }
138 |
139 | impl From<completion::ToolDefinition> for ToolDefinition {
140 |     fn from(tool: completion::ToolDefinition) -> Self {
141 |         Self {
142 |             r#type: "function".into(),
143 |             function: tool,
144 |         }
145 |     }
146 | ```
147 |
148 |
149 | For detailed API reference and additional features, see the OpenAI API documentation and Rig's API documentation.
150 |
151 |
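To tie the pieces above together, here is a minimal end-to-end sketch following the same agent API used throughout these docs (the model constant, preamble, and prompt are illustrative placeholders):

```rust
use rig::{completion::Prompt, providers::openai};

#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
    // Requires the OPENAI_API_KEY environment variable to be set.
    let client = openai::Client::from_env();

    // The preamble becomes the system message for every request.
    let agent = client
        .agent(openai::GPT_4O)
        .preamble("You are a concise technical assistant.")
        .temperature(0.7)
        .build();

    let answer = agent.prompt("Summarize what a vector store does.").await?;
    println!("{answer}");
    Ok(())
}
```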
152 | 153 | 154 | 155 | 156 | -------------------------------------------------------------------------------- /pages/docs/4_integrations.mdx: -------------------------------------------------------------------------------- 1 | --- 2 | title: 🔌 Integrations 3 | description: This section contains the integrations for Rig. 4 | --- 5 | 6 | # Overview 7 | 8 | Rig follows a modular integration strategy that separates core functionality into distinct integration types: 9 | 10 | - Model Providers 11 | - Vector Stores 12 | - Plugins (Discord, Twitter, etc. - coming soon) 13 | 14 | ## Model Provider Integrations 15 | 16 | Model providers are built directly into `rig-core` under the `providers` module, as shown in: 17 | 18 | 19 | ```rust filename="rig-core/src/lib.rs:61-66" 20 | //! Rig natively supports the following completion and embedding model provider integrations: 21 | //! - OpenAI 22 | //! - Cohere 23 | //! - Anthropic 24 | //! - Perplexity 25 | //! - Gemini 26 | //! - xAI 27 | ``` 28 | 29 | 30 | Each provider integration includes: 31 | - Client implementation with API handling 32 | - Model initialization helpers 33 | - Request/response type definitions 34 | - High-level abstractions (agents, RAG systems) 35 | 36 | Example using OpenAI provider: 37 | 38 | ```rust 39 | use rig::providers::openai; 40 | 41 | // Initialize the client 42 | let client = openai::Client::new("your-api-key"); 43 | 44 | // Create a model 45 | let gpt4 = client.completion_model("gpt-4"); 46 | 47 | // Or create an agent directly 48 | let agent = client.agent("gpt-4") 49 | .preamble("You are a helpful assistant") 50 | .build(); 51 | ``` 52 | 53 | ## Vector Store Integrations 54 | 55 | Vector stores are maintained as companion crates to keep the core library lean. As described in [CONTRIBUTING.md](https://github.com/0xPlaygrounds/rig/blob/main/CONTRIBUTING.md): 56 | 57 | 58 | > Rig is split up into multiple crates in a monorepo structure. The main crate `rig-core` contains all of the foundational abstractions for building with LLMs. This crate avoids adding many new dependencies to keep to lean and only really contains simple provider integrations on top of the base layer of abstractions. Side crates are leveraged to help add important first-party behavior without over burdening the main library with dependencies. For example, `rig-mongodb` contains extra dependencies to be able to interact with `mongodb` as a vector store. 59 | > 60 | > If you are unsure whether a side-crate should live in the main repo, you can spin up a personal repo containing your crate and create an issue in our repo making the case on whether this side-crate should be integrated in the main repo and maintained by the Rig team. 
61 | 62 | 63 | Current vector store integrations include: 64 | 65 | ```markdown filename="CONTRIBUTING.md:91-95" 66 | Vector stores are available as separate companion-crates: 67 | - MongoDB vector store: [`rig-mongodb`](https://github.com/0xPlaygrounds/rig/tree/main/rig-mongodb) 68 | - LanceDB vector store: [`rig-lancedb`](https://github.com/0xPlaygrounds/rig/tree/main/rig-lancedb) 69 | - Neo4j vector store: [`rig-neo4j`](https://github.com/0xPlaygrounds/rig/tree/main/rig-neo4j) 70 | - Qdrant vector store: [`rig-qdrant`](https://github.com/0xPlaygrounds/rig/tree/main/rig-qdrant) 71 | ``` 72 | 73 | Each vector store companion crate: 74 | - Implements the `VectorStoreIndex` trait from `rig-core` 75 | - Manages its own dependencies 76 | - Provides store-specific configuration options 77 | - Contains dedicated examples and documentation 78 | 79 | import { Callout } from 'nextra/components' 80 | 81 | 82 | Note: In-memory vector store is included in `rig-core` as a default implementation. 83 | 84 | 85 | ## Upcoming Plugin Integrations 86 | 87 | Rig is expanding to support platform-specific plugins for enhanced context retrieval and interactions: 88 | 89 | ### Social Media Plugins 90 | - **Twitter Plugin**: Retrieve tweet context, thread analysis, and user interactions 91 | - **Discord Plugin**: Access channel history, member interactions, and server analytics 92 | 93 | ### Features 94 | - Real-time data streaming 95 | - Context-aware responses 96 | - Platform-specific formatting 97 | - Rate limiting and caching 98 | - Authentication handling 99 | 100 | These plugins will enable: 101 | - Building context-aware chatbots 102 | - Social media monitoring and analysis 103 | - Automated engagement systems 104 | - Cross-platform content management 105 | 106 | ## Contributing New Integrations 107 | 108 | The project welcomes new integrations through pull requests. Templates are available: 109 | 110 | 111 | ```10:18:.github/ISSUE_TEMPLATE/vector-store-integration-request.md 112 | ## Vector Store Integration Request 113 | 116 | 117 | ### Resources 118 | 121 | ``` 122 | 123 | 124 | When contributing: 125 | 1. For model providers: Add implementation under `rig-core/src/providers/` 126 | 2. For vector stores: Create a new companion crate following existing patterns 127 | 3. For plugins: Use the plugin template and implement required interfaces 128 | 4. Update documentation and changelog entries 129 | 130 | For detailed contribution guidelines, see the [Contributing Guide](CONTRIBUTING.md). 131 | -------------------------------------------------------------------------------- /pages/docs/3_concepts/5_loaders.mdx: -------------------------------------------------------------------------------- 1 | --- 2 | title: Loaders 3 | description: This section contains the concepts for Rig. 4 | --- 5 | import { Cards } from 'nextra/components' 6 | 7 | # Rig Loaders: File Loading and Processing 8 | 9 | Rig's loader system provides utilities for loading and preprocessing various file types, with a focus on structured data ingestion for LLM applications. The system is designed to be extensible and error-tolerant, with built-in support for common file types. 
10 | 11 | ## Core Components
12 |
13 | ### FileLoader
14 |
15 | The base loader for handling generic files with features for:
16 |
17 | - Glob pattern matching
18 | - Directory traversal
19 | - Error handling
20 | - Content preprocessing
21 |
22 | ```rust
23 | use rig::loaders::FileLoader;
24 |
25 | // Load all Rust files in examples directory
26 | let examples = FileLoader::with_glob("examples/*.rs")?
27 |     .read_with_path()
28 |     .ignore_errors()
29 |     .into_iter();
30 | ```
31 |
32 | ### PDF Loader (Optional)
33 |
34 | Specialized loader for PDF documents with additional capabilities:
35 |
36 | - Page-by-page extraction
37 | - Metadata handling
38 | - PDF-specific error handling
39 |
40 | ```rust
41 | use rig::loaders::PdfFileLoader;
42 |
43 | let documents = PdfFileLoader::with_glob("docs/*.pdf")?
44 |     .load_with_path()
45 |     .ignore_errors()
46 |     .by_page()
47 |     .into_iter();
48 | ```
49 |
50 | ## Key Features
51 |
52 | ### 1. Error Handling
53 |
54 | The loaders provide robust error handling through custom error types:
55 |
56 | ```rust
57 | pub enum FileLoaderError {
58 |     InvalidGlobPattern(String),
59 |     IoError(std::io::Error),
60 |     PatternError(glob::PatternError),
61 |     GlobError(glob::GlobError),
62 | }
63 | ```
64 |
65 | ### 2. Flexible Loading Patterns
66 |
67 | Multiple ways to specify input sources:
68 |
69 | ```rust
70 | // Using glob patterns
71 | let glob_loader = FileLoader::with_glob("**/*.txt")?;
72 |
73 | // Using directory
74 | let dir_loader = FileLoader::with_dir("data/")?;
75 | ```
76 |
77 | ### 3. Content Processing
78 |
79 | Built-in methods for common processing tasks:
80 |
81 | ```rust
82 | let processed_files = FileLoader::with_glob("*.txt")?
83 |     .read()          // Read contents
84 |     .ignore_errors() // Skip failed reads
85 |     .into_iter()
86 |     .collect::<Vec<_>>();
87 | ```
88 |
89 | ## Integration with Rig
90 |
91 | ### Agent Context Loading
92 |
93 | Reference:
94 |
95 | ```rust filename="rig-core/examples/agent_with_loaders.rs [17:28]"
96 | // Load in all the rust examples
97 | let examples = FileLoader::with_glob("rig-core/examples/*.rs")?
98 |     .read_with_path()
99 |     .ignore_errors()
100 |     .into_iter();
101 |
102 | // Create an agent with multiple context documents
103 | let agent = examples
104 |     .fold(AgentBuilder::new(model), |builder, (path, content)| {
105 |         builder.context(format!("Rust Example {:?}:\n{}", path, content).as_str())
106 |     })
107 |     .build();
108 | ```
109 |
110 |
111 | ### PDF Document Processing
112 |
113 | Reference:
114 |
115 | ```rust filename="rig-core/src/loaders/pdf.rs [31:45]"
116 |     fn load(self) -> Result<Document, PdfLoaderError> {
117 |         Document::load(self).map_err(PdfLoaderError::PdfError)
118 |     }
119 |     fn load_with_path(self) -> Result<(PathBuf, Document), PdfLoaderError> {
120 |         let contents = Document::load(&self);
121 |         Ok((self, contents?))
122 |     }
123 | }
124 | impl Loadable for Result<PathBuf, PdfLoaderError> {
125 |     fn load(self) -> Result<Document, PdfLoaderError> {
126 |         self.map(|t| t.load())?
127 |     }
128 |     fn load_with_path(self) -> Result<(PathBuf, Document), PdfLoaderError> {
129 |         self.map(|t| t.load_with_path())?
130 |     }
131 | ```
132 |
133 |
134 | ## Best Practices
135 |
136 | 1. **Error Handling**
137 |    - Use `ignore_errors()` for fault-tolerant processing
138 |    - Handle specific error types when needed
139 |    - Log errors appropriately
140 |
141 | 2. **Resource Management**
142 |    - Process files in batches
143 |    - Consider memory usage with large files
144 |    - Clean up temporary resources
145 |
146 | 3.
**Content Processing**
147 |    - Preprocess content before LLM ingestion
148 |    - Maintain file metadata when relevant
149 |    - Use appropriate loader for file type
150 |
151 | ## Common Patterns
152 |
153 | ### Basic File Loading
154 |
155 | ```rust
156 | let loader = FileLoader::with_glob("data/*.txt")?;
157 | for content in loader.read().ignore_errors() {
158 |     // Process content
159 | }
160 | ```
161 |
162 | ### PDF Processing
163 |
164 | ```rust
165 | let pdf_loader = PdfFileLoader::with_glob("docs/*.pdf")?;
166 | let pages = pdf_loader
167 |     .load_with_path()
168 |     .ignore_errors()
169 |     .by_page()
170 |     .into_iter();
171 | ```
172 |
173 | ### Directory Processing
174 |
175 | ```rust
176 | let dir_loader = FileLoader::with_dir("data/")?
177 |     .read_with_path()
178 |     .ignore_errors();
179 |
180 | for (path, content) in dir_loader {
181 |     // Process files with path context
182 | }
183 | ```
184 |
185 | ## See Also
186 | - [Agent System](./3_agent.mdx)
187 | - [Vector Store Integration](../4_integrations/41_vector_stores.mdx)
188 |
189 |
190 | 191 | -------------------------------------------------------------------------------- /pages/docs/4_integrations/40_model_providers/anthropic.mdx:
1 | ---
2 | title: Anthropic
3 | description: Integration with Anthropic's Claude models, supporting completion models with advanced features like tool usage and prompt caching.
4 | ---
5 |
6 | import { Cards } from 'nextra/components'
7 |
8 | # Anthropic Integration
9 |
10 | The Anthropic provider in Rig offers integration with Anthropic's Claude models, supporting completion models with advanced features like tool usage and prompt caching.
11 |
12 | ## Key Features
13 |
14 | - Support for Claude 3 model family (Opus, Sonnet, Haiku)
15 | - Version-specific API controls
16 | - Beta features support through headers
17 | - Prompt caching capabilities
18 | - Tool/function calling support
19 | - Detailed token usage tracking
20 |
21 | ## Basic Usage
22 |
23 | ```rust
24 | use rig::providers::anthropic::{ClientBuilder, CLAUDE_3_SONNET};
25 |
26 | // Create client with specific version and beta features
27 | let client = ClientBuilder::new("your-api-key")
28 |     .anthropic_version("2023-06-01")
29 |     .anthropic_beta("prompt-caching-2024-07-31")
30 |     .build();
31 |
32 | // Create a completion model
33 | let claude = client.completion_model(CLAUDE_3_SONNET);
34 |
35 | // Or create an agent directly
36 | let agent = client
37 |     .agent(CLAUDE_3_SONNET)
38 |     .preamble("You are a helpful assistant")
39 |     .build();
40 | ```
41 |
42 | ## Available Models
43 |
44 | ### Completion Models
45 | - `CLAUDE_3_OPUS`: Most capable model
46 | - `CLAUDE_3_SONNET`: Balanced performance and speed
47 | - `CLAUDE_3_HAIKU`: Fastest, most efficient model
48 | - `CLAUDE_3_5_SONNET`: Latest Sonnet version
49 |
50 | ## Special Considerations
51 |
52 | 1. **API Versioning**: Anthropic requires explicit version specification:
53 |
54 | ```rust filename="rig-core/src/providers/anthropic/completion.rs [29:32]"
55 |
56 | pub const ANTHROPIC_VERSION_2023_01_01: &str = "2023-01-01";
57 | pub const ANTHROPIC_VERSION_2023_06_01: &str = "2023-06-01";
58 | pub const ANTHROPIC_VERSION_LATEST: &str = ANTHROPIC_VERSION_2023_06_01;
59 | ```
60 |
61 |
62 | 2. **Token Requirements**: Unlike other providers, Anthropic requires `max_tokens` to be set:
63 |
64 | ```rust filename="rig-core/src/providers/anthropic/completion.rs [176:182]"
65 |
66 |     let prompt_with_context = completion_request.prompt_with_context();
67 |
68 |     // Check if max_tokens is set, required for Anthropic
69 |     if completion_request.max_tokens.is_none() {
70 |         return Err(CompletionError::RequestError(
71 |             "max_tokens must be set for Anthropic".into(),
72 | ```
73 |
74 |
75 | 3. **Response Format**: Anthropic responses include detailed token usage information including cache statistics:
76 |
77 | ```rust filename="rig-core/src/providers/anthropic/completion.rs [63:69]"
78 |     pub input_tokens: u64,
79 |     pub cache_read_input_tokens: Option<u64>,
80 |     pub cache_creation_input_tokens: Option<u64>,
81 |     pub output_tokens: u64,
82 | }
83 |
84 | impl std::fmt::Display for Usage {
85 | ```
86 |
87 |
88 | 4.
**Content Types**: Anthropic supports multiple content types in responses:
89 |
90 | ```rust filename="rig-core/src/providers/anthropic/completion.rs [43:58]"
91 | }
92 |
93 | #[derive(Debug, Deserialize, Serialize)]
94 | #[serde(untagged)]
95 | pub enum Content {
96 |     String(String),
97 |     Text {
98 |         r#type: String,
99 |         text: String,
100 |     },
101 |     ToolUse {
102 |         r#type: String,
103 |         id: String,
104 |         name: String,
105 |         input: serde_json::Value,
106 |     },
107 | ```
108 |
109 |
110 | ## Implementation Details
111 |
112 | The Anthropic provider is implemented across three main components:
113 |
114 | 1. **Client Builder**: Provides a flexible way to configure API settings:
115 |
116 | ```rust filename="rig-core/src/providers/anthropic/client.rs [13:73]"
117 | const ANTHROPIC_API_BASE_URL: &str = "https://api.anthropic.com";
118 |
119 | #[derive(Clone)]
120 | pub struct ClientBuilder<'a> {
121 |     api_key: &'a str,
122 |     base_url: &'a str,
123 |     anthropic_version: &'a str,
124 |     anthropic_betas: Option<Vec<&'a str>>,
125 | }
126 |
127 | /// Create a new anthropic client using the builder
128 | ///
129 | /// # Example
130 | /// ```
131 | /// use rig::providers::anthropic::{ClientBuilder, self};
132 | ///
133 | /// // Initialize the Anthropic client
134 | /// let anthropic_client = ClientBuilder::new("your-claude-api-key")
135 | ///     .anthropic_version(ANTHROPIC_VERSION_LATEST)
136 | ///     .anthropic_beta("prompt-caching-2024-07-31")
137 | ///     .build()
138 | /// ```
139 | impl<'a> ClientBuilder<'a> {
140 |     pub fn new(api_key: &'a str) -> Self {
141 |         Self {
142 |             api_key,
143 | ```
144 |
145 |
146 | 2. **Completion Model**: Handles request/response formatting and error handling:
147 |
148 | ```rust filename="rig-core/src/providers/anthropic/completion.rs [137:142]"
149 |
150 | #[derive(Clone)]
151 | pub struct CompletionModel {
152 |     client: Client,
153 |     pub model: String,
154 | }
155 | ```
156 |
157 |
158 | 3. **Tool Integration**: Supports function calling with specific Anthropic formatting:
159 |
160 | ```rust filename="rig-core/src/providers/anthropic/completion.rs [86:91]"
161 | }
162 |
163 | #[derive(Debug, Deserialize, Serialize)]
164 | pub struct ToolDefinition {
165 |     pub name: String,
166 |     pub description: Option<String>,
167 | ```
168 |
169 |
170 | For detailed API reference and additional features, see the Anthropic API documentation and Rig's API documentation.
171 |
172 |
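Since `max_tokens` is mandatory for Anthropic (see Special Considerations above), a minimal end-to-end agent sketch looks like this (the model constant, preamble, and prompt are illustrative):

```rust
use rig::{completion::Prompt, providers::anthropic::{self, ClientBuilder}};

#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
    let api_key = std::env::var("ANTHROPIC_API_KEY")?;
    let client = ClientBuilder::new(&api_key).build();

    // Note: omitting max_tokens here would produce a request error.
    let agent = client
        .agent(anthropic::CLAUDE_3_SONNET)
        .preamble("You are a helpful assistant.")
        .max_tokens(1024)
        .build();

    let reply = agent.prompt("Explain prompt caching in one sentence.").await?;
    println!("{reply}");
    Ok(())
}
```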
173 | 174 | 175 | 176 | 177 | -------------------------------------------------------------------------------- /pages/docs/3_concepts/2_tools.mdx:
1 | ---
2 | title: Tools
3 | description: This section contains the concepts for Rig.
4 | ---
5 |
6 | import { Cards } from 'nextra/components'
7 |
8 | # Rig Tools
9 |
10 | Tools are a core concept in Rig that allow agents to perform specific actions or computations. They provide a structured way to extend an agent's capabilities beyond pure language model interactions.
11 |
12 | ## Overview
13 |
14 | Tools in Rig are implemented through two main traits:
15 | - `Tool`: The base trait for implementing simple tools
16 | - `ToolEmbedding`: An extension trait that allows tools to be stored in vector stores and used with RAG (Retrieval Augmented Generation)
17 |
18 | ## Basic Tool Implementation
19 |
20 | A basic tool requires implementing the `Tool` trait, which defines:
21 |
22 | 1. A unique name identifier
23 | 2. Input argument types
24 | 3. Output types
25 | 4. Error handling
26 | 5. Tool definition (description and parameters)
27 | 6. Execution logic
28 |
29 | Here's a simple example of a tool that adds two numbers (the `MathError` type it references is sketched at the end of this page):
30 |
31 | ```rust
32 | #[derive(Deserialize)]
33 | struct AddArgs {
34 |     x: i32,
35 |     y: i32,
36 | }
37 |
38 | #[derive(Deserialize, Serialize)]
39 | struct Adder;
40 |
41 | impl Tool for Adder {
42 |     const NAME: &'static str = "add";
43 |     type Error = MathError;
44 |     type Args = AddArgs;
45 |     type Output = i32;
46 |
47 |     async fn definition(&self, _prompt: String) -> ToolDefinition {
48 |         ToolDefinition {
49 |             name: "add".to_string(),
50 |             description: "Add x and y together".to_string(),
51 |             parameters: json!({
52 |                 "type": "object",
53 |                 "properties": {
54 |                     "x": { "type": "number", "description": "First number" },
55 |                     "y": { "type": "number", "description": "Second number" }
56 |                 }
57 |             })
58 |         }
59 |     }
60 |
61 |     async fn call(&self, args: Self::Args) -> Result<Self::Output, Self::Error> {
62 |         Ok(args.x + args.y)
63 |     }
64 | }
65 | ```
66 |
67 | ## RAG-Enabled Tools
68 |
69 | Tools can be made RAG-enabled by implementing the `ToolEmbedding` trait, which allows them to be:
70 | 1. Stored in vector stores
71 | 2. Retrieved based on semantic similarity
72 | 3.
Dynamically added to agent prompts
73 |
74 | Reference implementation:
75 |
76 | ```28:77:rig-core/examples/rag_dynamic_tools.rs
77 | struct Add;
78 |
79 | impl Tool for Add {
80 |     const NAME: &'static str = "add";
81 |
82 |     type Error = MathError;
83 |     type Args = OperationArgs;
84 |     type Output = i32;
85 |
86 |     async fn definition(&self, _prompt: String) -> ToolDefinition {
87 |         serde_json::from_value(json!({
88 |             "name": "add",
89 |             "description": "Add x and y together",
90 |             "parameters": {
91 |                 "type": "object",
92 |                 "properties": {
93 |                     "x": {
94 |                         "type": "number",
95 |                         "description": "The first number to add"
96 |                     },
97 |                     "y": {
98 |                         "type": "number",
99 |                         "description": "The second number to add"
100 |                     }
101 |                 }
102 |             }
103 |         }))
104 |         .expect("Tool Definition")
105 |     }
106 |
107 |     async fn call(&self, args: Self::Args) -> Result<Self::Output, Self::Error> {
108 |         let result = args.x + args.y;
109 |         Ok(result)
110 |     }
111 | }
112 |
113 | impl ToolEmbedding for Add {
114 |     type InitError = InitError;
115 |     type Context = ();
116 |     type State = ();
117 |
118 |     fn init(_state: Self::State, _context: Self::Context) -> Result<Self, Self::InitError> {
119 |         Ok(Add)
120 |     }
121 |
122 |     fn embedding_docs(&self) -> Vec<String> {
123 |         vec!["Add x and y together".into()]
124 |     }
125 |
126 |     fn context(&self) -> Self::Context {}
127 | ```
128 |
129 |
130 | ## Using Tools with Agents
131 |
132 | Tools can be added to agents in two ways:
133 |
134 | 1. Static Tools: Always available to the agent
135 | ```rust
136 | let agent = client
137 |     .agent("gpt-4")
138 |     .preamble("You are a calculator.")
139 |     .tool(Adder)
140 |     .tool(Subtract)
141 |     .build();
142 | ```
143 |
144 | 2. Dynamic Tools: Retrieved from a vector store based on the query
145 | ```rust
146 | let agent = client
147 |     .agent("gpt-4")
148 |     .preamble("You are a calculator.")
149 |     .dynamic_tools(2, vector_store_index, toolset)
150 |     .build();
151 | ```
152 |
153 | ## Tool Organization
154 |
155 | Tools are typically organized in a `ToolSet`, which provides:
156 | - Tool registration and management
157 | - Tool lookup by name
158 | - Tool execution routing
159 | - Conversion to embeddings for RAG
160 |
161 | ## Best Practices
162 |
163 | 1. **Unique Names**: Ensure each tool has a unique name within your application
164 | 2. **Clear Descriptions**: Provide clear, detailed descriptions in tool definitions
165 | 3. **Type Safety**: Use strong typing for tool arguments and outputs
166 | 4. **Error Handling**: Implement proper error types and handling
167 | 5. **RAG Consideration**: Consider implementing `ToolEmbedding` if your tool might benefit from semantic retrieval
168 |
169 | ## Integration with LLMs
170 |
171 | Tools are automatically integrated with LLM providers through Rig's agent system. The library handles:
172 | - Converting tool definitions to provider-specific formats
173 | - Parsing LLM outputs into tool calls
174 | - Routing tool calls to appropriate implementations
175 | - Returning tool results to the LLM
176 |
177 | For more information on integrating tools with specific LLM providers, see the provider-specific documentation in the [`providers`](../4_integrations/40_model_providers.mdx) module.
178 |
179 |
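The `Adder` and `Add` examples on this page reference `MathError` and `InitError` without defining them. A minimal sketch, mirroring rig's own examples and using `thiserror` (one convenient choice):

```rust
// Minimal error types for the tool examples above; `thiserror` derives
// `std::error::Error` and `Display` for us.
#[derive(Debug, thiserror::Error)]
#[error("Math error")]
struct MathError;

#[derive(Debug, thiserror::Error)]
#[error("Init error")]
struct InitError;
```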
180 | 181 | -------------------------------------------------------------------------------- /pages/docs/3_concepts/6_extractors.mdx:
1 | ---
2 | title: Extractors
3 | description: This section contains the concepts for Rig.
4 | ---
5 |
6 | import { Cards } from 'nextra/components'
7 |
8 | # Rig Extractors: Structured Data Extraction
9 |
10 | The Extractor system in Rig provides a high-level abstraction for extracting structured data from unstructured text using LLMs. It enables automatic parsing of text into strongly-typed Rust structures with minimal boilerplate.
11 |
12 | ## Core Concepts
13 |
14 | ### Extractor Structure
15 |
16 | The Extractor combines:
17 | 1. An LLM Agent
18 | 2. A target data structure
19 | 3. A submission tool
20 | 4. Type-safe deserialization
21 |
22 | Reference:
23 |
24 | ```rust filename="rig-core/src/extractor.rs [55:59]"
25 | /// Extractor for structured data from text
26 | pub struct Extractor<M: CompletionModel, T: JsonSchema + for<'a> Deserialize<'a> + Send + Sync> {
27 |     agent: Agent<M>,
28 |     _t: PhantomData<T>,
29 | }
30 | ```
31 |
32 |
33 | ### Target Data Requirements
34 |
35 | Structures must implement:
36 | - `serde::Deserialize`
37 | - `serde::Serialize`
38 | - `schemars::JsonSchema`
39 |
40 | ## Usage
41 |
42 | ### Basic Example
43 |
44 | ```rust
45 | use rig::providers::openai;
46 |
47 | // Define target structure
48 | #[derive(serde::Deserialize, serde::Serialize, schemars::JsonSchema)]
49 | struct Person {
50 |     name: Option<String>,
51 |     age: Option<u8>,
52 |     profession: Option<String>,
53 | }
54 |
55 | // Create and use extractor
56 | let openai = openai::Client::new(api_key);
57 | let extractor = openai.extractor::<Person>(openai::GPT_4O).build();
58 |
59 | let person = extractor.extract("John Doe is a 30 year old doctor.").await?;
60 | ```
61 |
62 | ### Error Handling
63 |
64 | The system provides comprehensive error handling through `ExtractionError`:
65 |
66 |
67 | ```rust filename="rig-core/src/extractor.rs [43:53]"
68 | #[derive(Debug, thiserror::Error)]
69 | pub enum ExtractionError {
70 |     #[error("No data extracted")]
71 |     NoData,
72 |
73 |     #[error("Failed to deserialize the extracted data: {0}")]
74 |     DeserializationError(#[from] serde_json::Error),
75 |
76 |     #[error("PromptError: {0}")]
77 |     PromptError(#[from] PromptError),
78 | }
79 | ```
80 |
81 |
82 | ## Key Features
83 |
84 | ### 1. Type Safety
85 |
86 | - Compile-time type checking
87 | - Automatic schema generation
88 | - Structured error handling
89 |
90 | ### 2. Flexible Extraction
91 |
92 | The extractor can be customized with:
93 | - Additional context
94 | - Custom preamble
95 | - Model configuration
96 |
97 | ```rust
98 | let extractor = openai.extractor::<Person>(model)
99 |     .preamble("Extract person details with high precision")
100 |     .context("Additional context about person formats")
101 |     .build();
102 | ```
103 |
104 | ### 3. Submit Tool Integration
105 |
106 | The system uses a specialized tool for data submission:
107 |
108 |
109 | ```rust filename="rig-core/src/extractor.rs [134:152]"
110 | impl<T: JsonSchema + for<'a> Deserialize<'a> + Serialize + Send + Sync> Tool for SubmitTool<T> {
111 |     const NAME: &'static str = "submit";
112 |     type Error = SubmitError;
113 |     type Args = T;
114 |     type Output = T;
115 |
116 |     async fn definition(&self, _prompt: String) -> ToolDefinition {
117 |         ToolDefinition {
118 |             name: Self::NAME.to_string(),
119 |             description: "Submit the structured data you extracted from the provided text."
120 |                 .to_string(),
121 |             parameters: json!(schema_for!(T)),
122 |         }
123 |     }
124 |
125 |     async fn call(&self, data: Self::Args) -> Result<Self::Output, Self::Error> {
126 |         Ok(data)
127 |     }
128 | }
129 | ```
130 |
131 |
132 | ## Best Practices
133 |
134 | 1. **Structure Design**
135 |    - Use `Option` for optional fields
136 |    - Keep structures focused and minimal
137 |    - Document field requirements
138 |
139 | 2. **Error Handling**
140 |    - Handle both extraction and deserialization errors
141 |    - Provide fallback values where appropriate
142 |    - Log extraction failures for debugging
143 |
144 | 3. **Context Management**
145 |    - Provide clear extraction instructions
146 |    - Include relevant domain context
147 |    - Set appropriate model parameters
148 |
149 | ## Common Patterns
150 |
151 | ### Basic Extraction
152 |
153 | ```rust
154 | let extractor = client.extractor::<MyData>(model).build();
155 | let data = extractor.extract("raw text").await?;
156 | ```
157 |
158 | ### Contextual Extraction
159 |
160 | ```rust
161 | let extractor = client.extractor::<MyData>(model)
162 |     .preamble("Extract with following rules...")
163 |     .context("Domain-specific information...")
164 |     .build();
165 | ```
166 |
167 | ### Batch Processing
168 |
169 | ```rust
170 | async fn process_documents<M, T>(extractor: &Extractor<M, T>, docs: Vec<String>) -> Vec<Result<T, ExtractionError>> {
171 |     let mut results = Vec::new();
172 |     for doc in docs {
173 |         results.push(extractor.extract(&doc).await);
174 |     }
175 |     results
176 | }
177 | ```
178 |
179 | ## Integration Examples
180 |
181 | ### With File Loaders
182 |
183 | ```rust
184 | let docs = FileLoader::with_glob("*.txt")?
185 |     .read()
186 |     .ignore_errors();
187 |
188 | let extractor = client.extractor::<MyData>(model).build();
189 |
190 | for doc in docs {
191 |     let structured_data = extractor.extract(&doc).await?;
192 |     // Process structured data
193 | }
194 | ```
195 |
196 | ### With Agents
197 |
198 | The extractor can be used as part of a larger agent system:
199 |
200 | ```rust
201 | let data_extractor = client.extractor::<MyData>(model).build();
202 | let agent = client.agent(model)
203 |     .tool(data_extractor)
204 |     .build();
205 | ```
206 |
207 | ## See Also
208 | - [Agent System](./3_agent.mdx)
209 | - [Completion Models](./0_completion.mdx)
210 | - [Tool System](./2_tools.mdx)
211 |
212 |
213 |
214 | 215 | -------------------------------------------------------------------------------- /pages/docs/4_integrations/41_vector_stores/mongodb.mdx:
1 | ---
2 | title: MongoDB
3 | description: This section describes the MongoDB integration.
4 | ---
5 |
6 | import { Cards } from 'nextra/components'
7 |
8 | # MongoDB Vector Store Integration
9 |
10 | ## Overview
11 |
12 | The MongoDB vector store implementation in Rig provides integration with MongoDB Atlas Vector Search, allowing for semantic search capabilities using MongoDB's vector search indexes.
13 |
14 | ## Key Features
15 |
16 | - Cosine similarity search
17 | - Custom search parameters
18 | - Automatic index validation
19 | - Detailed score tracking
20 | - Flexible document schema support
21 |
22 | ## Basic Usage
23 |
24 | ```rust
25 | use rig_mongodb::{MongoDbVectorIndex, SearchParams};
26 |
27 | // Initialize the vector store
28 | let index = MongoDbVectorIndex::new(
29 |     collection,
30 |     embedding_model,
31 |     "vector_index",
32 |     SearchParams::new()
33 | ).await?;
34 |
35 | // Search for similar documents
36 | let results = index.top_n::<Document>("search query", 5).await?;
37 | ```
38 |
39 | ## Implementation Details
40 |
41 | ### Core Components
42 |
43 | 1. **Vector Index Structure**:
44 |
45 | ```rust filename="rig-mongodb/src/lib.rs [82-89]"
46 | /// The `MongoDbVectorIndex` struct is the core component for interacting with MongoDB's vector search capabilities.
47 | /// It encapsulates the MongoDB collection, the embedding model, and the index name, along with search parameters.
48 | ///
49 | /// ```rust
50 | /// pub struct MongoDbVectorIndex {
51 | ///     collection: Collection<Document>,
52 | ///     model: Box<dyn EmbeddingModel>,
53 | ///     index_name: String,
54 | ///     search_params: SearchParams,
55 | /// }
56 | /// ```
57 | ///
58 | /// - `collection`: The MongoDB collection where documents are stored.
59 | /// - `model`: The embedding model used to generate vector representations of text.
60 | /// - `index_name`: The name of the vector index in MongoDB.
61 | /// - `search_params`: Parameters for customizing the search behavior.
62 | pub struct MongoDbVectorIndex {
63 |     collection: Collection<Document>,
64 |     model: Box<dyn EmbeddingModel>,
65 |     index_name: String,
66 |     search_params: SearchParams,
67 | }
68 |
69 |
70 | ```
71 |
72 |
73 | 2. **Search Parameters**:
74 |    - Configurable field name for embeddings
75 |    - Customizable number of candidates
76 |    - Support for MongoDB-specific search options
77 |
78 | ### Search Pipeline
79 |
80 | The MongoDB implementation uses an aggregation pipeline with three main stages:
81 |
82 | 1. **Search Stage**: Performs vector similarity search
83 | 2. **Score Stage**: Calculates and normalizes similarity scores
84 | 3.
**Project Stage**: Formats the output documents
88 |
89 | Reference implementation:
90 |
91 | ```246:285:rig-mongodb/src/lib.rs
92 | ///     .top_n::<Document>("My boss says I zindle too much, what does that mean?", 1)
93 | ///     .await?;
94 | /// ```
95 |     async fn top_n<T: for<'a> Deserialize<'a> + Send>(
96 |         &self,
97 |         query: &str,
98 |         n: usize,
99 |     ) -> Result<Vec<(f64, String, T)>, VectorStoreError> {
100 |         let prompt_embedding = self.model.embed_text(query).await?;
101 |
102 |         let mut cursor = self
103 |             .collection
104 |             .aggregate([
105 | ```
106 |
107 |
108 | ### Document Schema Requirements
109 |
110 | Documents must include:
111 | - A unique identifier field (`_id`)
112 | - An embedding vector field (configurable name)
113 | - Optional additional fields for storage
114 |
115 | Example schema:
116 | ```rust
117 | #[derive(Embed, Clone, Deserialize, Debug)]
118 | struct Document {
119 |     #[serde(rename = "_id")]
120 |     id: String,
121 |     #[embed]
122 |     content: String,
123 |     embedding: Vec<f64>,
124 | }
125 | ```
126 |
127 | ### MongoDB Index Requirements
128 |
129 | The collection must have a vector search index configured; see `rig-mongodb/tests/integration_tests.rs` for an example index definition.
130 |
131 | ## Special Considerations
132 |
133 | 1. **Index Validation**: The implementation automatically validates:
134 |    - Index existence
135 |    - Vector dimensions
136 |    - Similarity metric
137 |
138 | 2. **Error Handling**: MongoDB-specific errors are converted to Rig's error types (see `rig-mongodb/src/lib.rs`).
139 |
140 | 3. **Performance Optimization**:
141 |    - Uses MongoDB's native vector search capabilities
142 |    - Supports cursor-based result streaming
143 |    - Optimizes query projection
144 |
145 | ## Integration Example
146 |
147 | A complete example showing document embedding and search:
148 |
149 | ```24:95:rig-mongodb/examples/vector_search_mongodb.rs
150 |
151 | #[tokio::main]
152 | async fn main() -> Result<(), anyhow::Error> {
153 |     // Initialize OpenAI client
154 |     let openai_api_key = env::var("OPENAI_API_KEY").expect("OPENAI_API_KEY not set");
155 |     let openai_client = Client::new(&openai_api_key);
156 |
157 |     // Initialize MongoDB client
158 |     let mongodb_connection_string =
159 |         env::var("MONGODB_CONNECTION_STRING").expect("MONGODB_CONNECTION_STRING not set");
160 |     let options = ClientOptions::parse(mongodb_connection_string)
161 |         .await
162 |         .expect("MongoDB connection string should be valid");
163 |
164 |     let mongodb_client =
165 |         MongoClient::with_options(options).expect("MongoDB client options should be valid");
166 |
167 |     // Initialize MongoDB vector store
168 |     let collection: Collection<bson::Document> = mongodb_client
169 |         .database("knowledgebase")
170 |         .collection("context");
171 |
172 |     // Select the embedding model and generate our embeddings
173 |     let model = openai_client.embedding_model(TEXT_EMBEDDING_ADA_002);
174 |
175 |     let words = vec![
176 |         Word {
177 |             id: "doc0".to_string(),
178 |             definition: "Definition of a *flurbo*: A flurbo is a green alien that lives on cold planets".to_string(),
179 |         },
180 |         Word {
181 |             id: "doc1".to_string(),
182 |             definition: "Definition of a *glarb-glarb*: A glarb-glarb is a ancient tool used by the ancestors of the inhabitants of planet Jiro to farm the land.".to_string(),
183 |         },
184 |         Word {
185 |             id: "doc2".to_string(),
186 |             definition: "Definition of a *linglingdong*: A term used by inhabitants of the far side of the moon to describe humans.".to_string(),
187 |         }
]; 202 | 203 | ``` 204 | 205 | 206 | For detailed API reference and additional features, see the MongoDB Atlas Vector Search documentation and Rig's API documentation. 207 | 208 |
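The excerpt above stops after defining the source documents. The remaining steps (embedding, inserting, and querying) typically look like the sketch below; the embedding-to-BSON conversion in particular depends on your `rig` version, so treat the field accessors as assumptions:

```rust
// Embed all documents in one batched request (assumed EmbeddingsBuilder API).
let embeddings = EmbeddingsBuilder::new(model.clone())
    .documents(words)?
    .build()
    .await?;

// Convert (document, embedding) pairs into BSON and insert them.
let mongo_documents = embeddings
    .iter()
    .map(|(word, embedding)| {
        bson::doc! {
            "_id": word.id.clone(),
            "definition": word.definition.clone(),
            "embedding": embedding.first().vec.clone(), // assumed accessor
        }
    })
    .collect::<Vec<_>>();
collection.insert_many(mongo_documents, None).await?;

// Build the index wrapper and search, as shown in Basic Usage.
let index =
    MongoDbVectorIndex::new(collection, model, "vector_index", SearchParams::new()).await?;
let results = index.top_n::<Word>("What is a linglingdong?", 1).await?;
```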
209 | 210 | 211 | 212 | -------------------------------------------------------------------------------- /pages/docs/3_concepts/4_chains.mdx: -------------------------------------------------------------------------------- 1 | --- 2 | title: Pipeline (Agentic Chain) 3 | description: This section contains the concepts for Rig. 4 | --- 5 | import { FileTree, Cards } from 'nextra/components' 6 | 7 | 8 | # Pipeline Module 9 | 10 | The Pipeline module provides a flexible API for building AI-powered processing pipelines with composable operations. Inspired by orchestration frameworks like Airflow and Dagster, it implements idiomatic Rust patterns for AI workflows. 11 | 12 | ## Overview 13 | 14 | > This module defines a flexible pipeline API for defining a sequence of operations that 15 | > may or may not use AI components (e.g.: semantic search, LLMs prompting, etc). 16 | > 17 | > The pipeline API was inspired by general orchestration pipelines such as Airflow, Dagster and Prefect, 18 | > but implemented with idiomatic Rust patterns and providing some AI-specific ops out-of-the-box along 19 | > general combinators. 20 | > 21 | > Pipelines are made up of one or more operations, or "ops", each of which must implement the [Op] trait. 22 | > The [Op] trait requires the implementation of only one method: `call`, which takes an input 23 | > and returns an output. The trait provides a wide range of combinators for chaining operations together. 24 | > 25 | > One can think of a pipeline as a DAG (Directed Acyclic Graph) where each node is an operation and 26 | > the edges represent the data flow between operations. When invoking the pipeline on some input, 27 | 28 | 29 | 30 | ## Core Concepts 31 | 32 | ### Operations (Ops) 33 | 34 | Operations are the building blocks of pipelines. Each operation: 35 | - Takes an input 36 | - Performs processing 37 | - Returns an output 38 | - Implements the `Op` trait 39 | 40 | ```rust 41 | use rig::pipeline::{self, Op}; 42 | 43 | // Simple operation that adds two numbers 44 | let add_op = pipeline::new() 45 | .map(|(x, y)| x + y); 46 | 47 | // Operation with async processing 48 | let async_op = pipeline::new() 49 | .then(|x| async move { x * 2 }); 50 | ``` 51 | 52 | ### Pipeline Structure 53 | 54 | Pipelines form a Directed Acyclic Graph (DAG) where: 55 | - Nodes represent operations 56 | - Edges represent data flow between operations 57 | - Input flows from root to leaf nodes 58 | - Output is returned from the final node 59 | 60 | Example DAG visualization: 61 | ```mermaid 62 | graph TD 63 | A[Input] --> B[Operation 1] 64 | B --> C[Operation 2] 65 | C --> D[Operation 3] 66 | D --> E[Output] 67 | ``` 68 | 69 | ## Basic Usage 70 | 71 | ### Sequential Operations 72 | 73 | Chain operations that execute one after another: 74 | 75 | ```rust 76 | use rig::pipeline::{self, Op}; 77 | 78 | let pipeline = pipeline::new() 79 | .map(|(x, y)| x + y) // Add numbers 80 | .map(|z| z * 2) // Double result 81 | .map(|n| n.to_string()); // Convert to string 82 | 83 | let result = pipeline.call((5, 3)).await; 84 | assert_eq!(result, "16"); 85 | ``` 86 | 87 | ### Parallel Operations 88 | 89 | Execute multiple operations concurrently using the `parallel!` macro: 90 | 91 | 92 | ```rust filename="rig-core/src/pipeline/parallel.rs [196:208]" 93 | ] 94 | values_and_positions: [ 95 | $($acc)* 96 | $current ( $($underscores)* + ) 97 | ] 98 | munching: [] 99 | } 100 | ); 101 | 102 | // Recursion step: map each value with its "position" (underscore count). 
103 |     (
104 |         // Accumulate a token for each future that has been expanded: "_ _ _".
105 |         current_position: [
106 | ```
107 |
108 |
109 | ## AI-Specific Operations
110 |
111 | ### RAG Pipeline Example
112 |
113 | Build a Retrieval-Augmented Generation pipeline:
114 |
115 | ```rust
116 | use rig::pipeline::{self, Op};
117 |
118 | let pipeline = pipeline::new()
119 |     // Parallel: Query embedding & document lookup
120 |     .chain(parallel!(
121 |         passthrough(),
122 |         lookup::<_, _, Document>(vector_store, 3)
123 |     ))
124 |     // Format context
125 |     .map(|(query, docs)| format!(
126 |         "Query: {}\nContext: {}",
127 |         query,
128 |         docs.join("\n")
129 |     ))
130 |     // Generate response
131 |     .prompt(llm_model);
132 | ```
133 |
134 | ### Extraction Pipeline
135 |
136 | Extract structured data from text:
137 |
138 | ```rust
139 | use rig::pipeline::{self, Op};
140 |
141 | #[derive(Deserialize, JsonSchema)]
142 | struct Sentiment {
143 |     score: f64,
144 |     label: String,
145 | }
146 |
147 | let pipeline = pipeline::new()
148 |     .map(|text| format!("Analyze sentiment: {}", text))
149 |     .extract::<_, _, Sentiment>(extractor);
150 | ```
151 |
152 | ## Error Handling
153 |
154 | The module provides the `TryOp` trait for operations that may fail:
155 |
156 |
157 | ```36:52:rig-core/src/pipeline/try_op.rs
158 |     /// let result = op.try_batch_call(2, vec![2, 4]).await;
159 |     /// assert_eq!(result, Ok(vec![3, 5]));
160 |     /// ```
161 |     fn try_batch_call(
162 |         &self,
163 |         n: usize,
164 |         input: I,
165 |     ) -> impl Future<Output = Result<Vec<Self::Output>, Self::Error>> + Send
166 |     where
167 |         I: IntoIterator<Item = Self::Input> + Send,
168 |         I::IntoIter: Send,
169 |         Self: Sized,
170 |     {
171 |         use stream::{StreamExt, TryStreamExt};
172 |
173 |         async move {
174 |             stream::iter(input)
175 | ```
176 |
177 |
178 | ## Advanced Features
179 |
180 | ### Custom Operations
181 |
182 | Implement the `Op` trait for custom operations:
183 |
184 | ```rust
185 | struct CustomOp;
186 |
187 | impl Op for CustomOp {
188 |     type Input = String;
189 |     type Output = Vec<String>;
190 |
191 |     async fn call(&self, input: Self::Input) -> Self::Output {
192 |         input.split_whitespace()
193 |             .map(String::from)
194 |             .collect()
195 |     }
196 | }
197 | ```
198 |
199 | ### Batch Processing
200 |
201 | Process multiple inputs concurrently:
202 |
203 | ```rust
204 | let pipeline = pipeline::new()
205 |     .map(|text| analyze_sentiment(text));
206 |
207 | // Process 5 documents concurrently
208 | let results = pipeline.batch_call(5, documents).await;
209 | ```
210 |
211 | ## Best Practices
212 |
213 | 1. **Composability**: Design operations to be modular and reusable
214 | 2. **Error Handling**: Use `TryOp` for operations that may fail
215 | 3. **Resource Management**: Consider batch processing for multiple inputs
216 | 4. **Testing**: Unit test individual operations before combining
217 | 5. **Documentation**: Document expected inputs/outputs for each operation
218 |
219 | ## See Also
220 |
221 | - [Agent Module](./3_agent.mdx) - High-level AI agent abstractions
222 | - [Vector Store Module](../4_integrations/41_vector_stores.mdx) - Document storage and retrieval
223 | - [Completion Module](./0_completion.mdx) - LLM interaction primitives
224 |
225 |
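As a more complete illustration of `parallel!` than the macro internals excerpted above, here is a small self-contained sketch. It assumes `passthrough` is importable from `rig::pipeline` (it is used unqualified in the RAG example above) and that any chained pipeline is itself an `Op`:

```rust
use rig::parallel;
use rig::pipeline::{self, passthrough, Op};

#[tokio::main]
async fn main() {
    let pipeline = pipeline::new()
        // Run both branches on the same input and join the results as a tuple.
        .chain(parallel!(
            passthrough(),
            pipeline::new().map(|x: i32| x * 2)
        ))
        .map(|(original, doubled)| format!("{original} -> {doubled}"));

    assert_eq!(pipeline.call(4).await, "4 -> 8");
}
```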
226 | 227 | -------------------------------------------------------------------------------- /pages/index.mdx: -------------------------------------------------------------------------------- 1 | import { Cards } from 'nextra/components' 2 | import { WalletCards, Rocket } from 'lucide-react' 3 | 4 | # Get Started with `cargo add rig-core{:bash}` 5 | Rig is a Rust library for building portable, modular, and lightweight Fullstack AI Agents. You can find API documentation on [docs.rs](https://docs.rs/rig-core). 6 | 7 |
8 | 9 | Why Rig? 10 | 11 | Rig is a Rust library for building portable, modular, and lightweight Fullstack AI Agents. You can find API documentation on [docs.rs](https://docs.rs/rig-core). 12 |
13 | 14 |
15 | 16 | Why Rust?
17 |
18 | Rust 🦀 has many advantages over Python and JS/TS, the languages used by AI frameworks like Langchain, LlamaIndex, and ai16z.
19 |
20 | To visualize Rust's edge over other languages, you can check out this [tool](https://benjdd.com/languages/).
21 |
22 | - **Lightweight**: Rust runs orders of magnitude faster than Python, which makes running & deploying swarms of agents a breeze.
23 | - **Safety**: Rust's type system and ownership model help when working with unexpected LLM outputs, coercing types and handling errors.
24 | - **Portability**: Rust code can be compiled to WebAssembly, allowing it to run in web browsers (even local LLM models!).
25 |
26 | 27 |
28 | 29 |
30 | 31 | 35 | Get started quickly with our comprehensive guide. 36 | 37 | } 38 | /> 39 | 43 | Explore tutorials and guides to enhance your skills. 44 | 45 | } 46 | /> 47 | 51 | Discover recipes and examples to build with Rig. 52 | 53 | } 54 | /> 55 | 59 | Learn how to contribute to the Rig project. 60 | 61 | } 62 | /> 63 | 67 | Find answers to frequently asked questions. 68 | 69 | } 70 | /> 71 | 75 | Stay updated with the latest release notes. 76 | 77 | } 78 | /> 79 | 83 | Share your feedback and help us improve. 84 | 85 | } 86 | /> 87 | 91 | Join our community on Discord for discussions. 92 | 93 | } 94 | /> 95 | 96 | ## High-level features 97 | - Full support for LLM completion and embedding workflows 98 | - Simple but powerful common abstractions over LLM providers (e.g. OpenAI, Cohere) and vector stores (e.g. MongoDB, in-memory) 99 | - Integrate LLMs in your app with minimal boilerplate 100 | 101 | ## Integrations 102 | ### Model Providers 103 | Rig natively supports the following completion and embedding model provider integrations: 104 |
105 |
106 | - OpenAI (ChatGPT)
107 | - Anthropic (Claude)
108 | - Cohere
109 | - Gemini
110 | - xAI
111 | - Perplexity
112 |
117 | 118 | You can also implement your own model provider integration by defining types that
119 | implement the [CompletionModel](crate::completion::CompletionModel) and [EmbeddingModel](crate::embeddings::EmbeddingModel) traits.
120 | ### Vector Stores
121 | Rig currently supports the following vector store integrations via companion crates:
122 | - `rig-mongodb`: Vector store implementation for MongoDB
123 | - `rig-lancedb`: Vector store implementation for LanceDB
124 | - `rig-neo4j`: Vector store implementation for Neo4j
125 | - `rig-qdrant`: Vector store implementation for Qdrant
126 |
127 | You can also implement your own vector store integration by defining types that
128 | implement the [VectorStoreIndex](crate::vector_store::VectorStoreIndex) trait.
129 |
130 | ## Simple example:
131 | ```rust
132 | use rig::{completion::Prompt, providers::openai};
133 |
134 | #[tokio::main]
135 | async fn main() {
136 |     // Create OpenAI client and agent.
137 |     // This requires the `OPENAI_API_KEY` environment variable to be set.
138 |     let openai_client = openai::Client::from_env();
139 |
140 |     let gpt4 = openai_client.agent("gpt-4").build();
141 |
142 |     // Prompt the model and print its response
143 |     let response = gpt4
144 |         .prompt("Who are you?")
145 |         .await
146 |         .expect("Failed to prompt GPT-4");
147 |
148 |     println!("GPT-4: {response}");
149 | }
150 | ```
151 | Note: using `#[tokio::main]{:rust}` requires enabling tokio's `macros{:yaml}` and `rt-multi-thread{:yaml}` features
152 | (`cargo add tokio --features macros,rt-multi-thread{:bash}`), or just `full{:yaml}` to enable all features. -------------------------------------------------------------------------------- /pages/docs/3_concepts/0_completion.mdx:
1 | ---
2 | title: Completion & Generation
3 | description: This section contains the concepts for Rig.
4 | ---
5 |
6 | import { Cards } from 'nextra/components'
7 |
8 | # Completion in Rig: LLM Interaction Layer
9 |
10 | Rig's completion system provides a layered approach to interacting with Language Models, offering both high-level convenience and low-level control. The system is built around a set of traits that define different levels of abstraction for LLM interactions.
11 |
12 | ## Core Traits
13 |
14 | ### 1. High-Level Interfaces
15 |
16 | #### `Prompt` Trait
17 | - Simplest interface for one-shot interactions
18 | - Fire-and-forget prompting
19 | - Returns string responses
20 |
21 | ```rust
22 | async fn prompt(&self, prompt: &str) -> Result<String, PromptError>;
23 | ```
24 |
25 | #### `Chat` Trait
26 | - Conversation-aware interactions
27 | - Maintains chat history
28 | - Supports contextual responses
29 |
30 | ```rust
31 | async fn chat(&self, prompt: &str, history: Vec<Message>) -> Result<String, PromptError>;
32 | ```
33 |
34 | ### 2. Low-Level Control
35 |
36 | #### `Completion` Trait
37 | - Fine-grained request configuration
38 | - Access to raw completion responses
39 | - Tool call handling
40 |
41 | Reference to implementation:
42 |
43 | ```rust filename="rig-core/src/completion.rs [165:246]"
44 | ...
45 |         chat_history: Vec<Message>,
46 |     ) -> impl std::future::Future<Output = Result<String, PromptError>> + Send;
47 | }
48 |
49 | /// Trait defining a low-level LLM completion interface
50 | pub trait Completion<M: CompletionModel> {
51 |     /// Generates a completion request builder for the given `prompt` and `chat_history`.
52 |     /// This function is meant to be called by the user to further customize the
53 |     /// request at prompt time before sending it.
54 |     ///
55 |     /// ❗IMPORTANT: The type that implements this trait might have already
56 |     /// populated fields in the builder (the exact fields depend on the type).
57 |     /// For fields that have already been set by the model, calling the corresponding
58 |     /// method on the builder will overwrite the value set by the model.
59 |     ///
60 |     /// For example, the request builder returned by [`Agent::completion`](crate::agent::Agent::completion) will already
61 |     /// contain the `preamble` provided when creating the agent.
62 |     fn completion(
63 |         &self,
64 |         prompt: &str,
65 |         chat_history: Vec<Message>,
66 |     ) -> impl std::future::Future<Output = Result<CompletionRequestBuilder<M>, CompletionError>> + Send;
67 | }
68 |
69 | /// General completion response struct that contains the high-level completion choice
70 | /// and the raw response.
71 | #[derive(Debug)]
72 | pub struct CompletionResponse<T> {
73 |     /// The completion choice returned by the completion model provider
74 |     pub choice: ModelChoice,
75 |     /// The raw response returned by the completion model provider
76 |     pub raw_response: T,
77 | }
78 |
79 | /// Enum representing the high-level completion choice returned by the completion model provider.
80 | #[derive(Debug)]
81 | pub enum ModelChoice {
82 |     /// Represents a completion response as a message
83 |     Message(String),
84 |     /// Represents a completion response as a tool call of the form
85 |     /// `ToolCall(function_name, function_params)`.
86 |     ToolCall(String, serde_json::Value),
87 | }
88 |
89 | /// Trait defining a completion model that can be used to generate completion responses.
90 | /// This trait is meant to be implemented by the user to define a custom completion model,
91 | /// either from a third party provider (e.g.: OpenAI) or a local model.
92 | pub trait CompletionModel: Clone + Send + Sync {
93 |     /// The raw response type returned by the underlying completion model.
94 |     type Response: Send + Sync;
95 |
96 |     /// Generates a completion response for the given completion request.
97 |     fn completion(
98 |         &self,
99 |         request: CompletionRequest,
100 |     ) -> impl std::future::Future<Output = Result<CompletionResponse<Self::Response>, CompletionError>>
101 |        + Send;
102 |
103 |     /// Generates a completion request builder for the given `prompt`.
104 |     fn completion_request(&self, prompt: &str) -> CompletionRequestBuilder<Self> {
105 |         CompletionRequestBuilder::new(self.clone(), prompt.to_string())
106 |     }
107 | }
108 | ```
109 |
110 |
111 | #### `CompletionModel` Trait
112 | - Provider interface implementation
113 | - Raw request handling
114 | - Response parsing and error management
115 |
116 | ## Request Building
117 |
118 | ### CompletionRequestBuilder
119 | Fluent API for constructing requests with:
120 |
121 | ```rust
122 | let request = model.completion_request("prompt")
123 |     .preamble("system instructions")
124 |     .temperature(0.7)
125 |     .max_tokens(1000)
126 |     .documents(context_docs)
127 |     .tools(available_tools)
128 |     .build();
129 | ```
130 |
131 | ### Request Components
132 |
133 | 1. **Core Elements**
134 |    - Prompt text
135 |    - System preamble
136 |    - Chat history
137 |    - Temperature
138 |    - Max tokens
139 |
140 | 2. **Context Management**
141 |    - Document attachments
142 |    - Metadata handling
143 |    - Formatting controls
144 |
145 | 3.
**Tool Integration**
146 |    - Tool definitions
147 |    - Parameter validation
148 |    - Response parsing
149 |
150 | ## Response Handling
151 |
152 | ### CompletionResponse
153 | Structured response type with:
154 |
155 | ```rust
156 | enum ModelChoice {
157 |     Message(String),
158 |     ToolCall(String, Value)
159 | }
160 |
161 | struct CompletionResponse<T> {
162 |     choice: ModelChoice,
163 |     raw_response: T,
164 | }
165 | ```
166 |
167 | ### Error Handling
168 |
169 | Comprehensive error types:
170 | ```rust
171 | enum CompletionError {
172 |     HttpError(reqwest::Error),
173 |     JsonError(serde_json::Error),
174 |     RequestError(Box<dyn std::error::Error + Send + Sync>),
175 |     ResponseError(String),
176 |     ProviderError(String),
177 | }
178 | ```
179 |
180 | ## Usage Patterns
181 |
182 | ### Basic Completion
183 |
184 | ```rust
185 | let openai = Client::new(api_key);
186 | let model = openai.completion_model("gpt-4");
187 |
188 | let response = model
189 |     .prompt("Explain quantum computing")
190 |     .await?;
191 | ```
192 |
193 | ### Contextual Chat
194 |
195 | ```rust
196 | let chat_response = model
197 |     .chat(
198 |         "Continue the discussion",
199 |         vec![Message::user("Previous context")]
200 |     )
201 |     .await?;
202 | ```
203 |
204 | ### Advanced Request Configuration
205 |
206 | ```rust
207 | let request = model
208 |     .completion_request("Complex query")
209 |     .preamble("Expert system")
210 |     .temperature(0.8)
211 |     .documents(context)
212 |     .tools(available_tools)
213 |     .send()
214 |     .await?;
215 | ```
216 |
217 | ## Provider Integration
218 |
219 | ### Implementing New Providers
220 |
221 | ```rust
222 | impl CompletionModel for CustomProvider {
223 |     type Response = CustomResponse;
224 |
225 |     async fn completion(
226 |         &self,
227 |         request: CompletionRequest
228 |     ) -> Result<CompletionResponse<Self::Response>, CompletionError> {
229 |         // Provider-specific implementation
230 |     }
231 | }
232 | ```
233 |
234 | ## Best Practices
235 |
236 | 1. **Interface Selection**
237 |    - Use `Prompt` for simple interactions
238 |    - Use `Chat` for conversational flows
239 |    - Use `Completion` for fine-grained control
240 |
241 | 2. **Error Handling**
242 |    - Handle provider-specific errors
243 |    - Implement graceful fallbacks
244 |    - Log raw responses for debugging
245 |
246 | 3. **Resource Management**
247 |    - Reuse model instances
248 |    - Batch similar requests
249 |    - Monitor token usage
250 |
251 | ## See Also
252 | - [Agent System](./3_agent.mdx)
253 | - [Tool Integration](./2_tools.mdx)
254 | - [Provider Implementation](../4_integrations/40_model_providers.mdx)
255 |
256 |
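When working at the `Completion` level, the returned `ModelChoice` must be branched on explicitly. A sketch, reusing the `model` and `available_tools` names from the examples above:

```rust
use rig::completion::ModelChoice;

let response = model
    .completion_request("What is 2 + 2?")
    .tools(available_tools)
    .send()
    .await?;

match response.choice {
    // The model answered directly with text.
    ModelChoice::Message(text) => println!("model: {text}"),
    // The model asked to invoke a tool; route it to your ToolSet.
    ModelChoice::ToolCall(name, args) => {
        println!("tool call requested: {name}({args})");
    }
}
```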
257 | 258 | -------------------------------------------------------------------------------- /pages/docs/3_concepts/3_agent.mdx: -------------------------------------------------------------------------------- 1 | --- 2 | title: Agents 3 | description: This section contains the concepts for Rig. 4 | --- 5 | import { Cards, FileTree } from 'nextra/components' 6 | 7 | # Rig Agents: High-Level LLM Orchestration 8 | 9 | Agents in Rig provide a high-level abstraction for working with LLMs, combining models with context, tools, and configuration. They serve as the primary interface for building complex AI applications, from simple chatbots to sophisticated RAG systems. 10 | 11 | Reference: 12 | 13 | 14 | > Agents 15 | > Rig also provides high-level abstractions over LLMs in the form of the [Agent](crate::agent::Agent) type. 16 | > 17 | > The [Agent](crate::agent::Agent) type can be used to create anything from simple agents that use vanilla models to full blown 18 | > RAG systems that can be used to answer questions using a knowledge base. 19 | 20 | 21 | 22 | 23 | ## Core Concepts 24 | 25 | ### Agent Structure 26 | 27 | An Agent consists of: 28 | 29 | 1. **Base Components** 30 | - Completion Model (e.g., GPT-4, Claude) 31 | - System Prompt (preamble) 32 | - Configuration (temperature, max tokens) 33 | 34 | 2. **Context Management** 35 | - Static Context: Always available documents 36 | - Dynamic Context: RAG-based contextual documents 37 | - Vector Store Integration 38 | 39 | 3. **Tool Integration** 40 | - Static Tools: Always available capabilities 41 | - Dynamic Tools: Context-dependent capabilities 42 | - Tool Management via ToolSet 43 | 44 | ## Usage Patterns 45 | 46 | ### Basic Agent Creation 47 | 48 | ```rust 49 | use rig::{providers::openai, Agent}; 50 | 51 | let openai = openai::Client::from_env(); 52 | 53 | // Create simple agent 54 | let agent = openai.agent("gpt-4") 55 | .preamble("You are a helpful assistant.") 56 | .temperature(0.7) 57 | .build(); 58 | 59 | // Use the agent 60 | let response = agent.prompt("Hello!").await?; 61 | ``` 62 | 63 | ### RAG-Enabled Agent 64 | 65 | ```rust 66 | use rig::{Agent, vector_store::InMemoryVectorStore}; 67 | 68 | // Create vector store and index 69 | let store = InMemoryVectorStore::new(); 70 | let index = store.index(embedding_model); 71 | 72 | // Create RAG agent 73 | let agent = openai.agent("gpt-4") 74 | .preamble("You are a knowledge assistant.") 75 | .dynamic_context(3, index) // Retrieve 3 relevant documents 76 | .build(); 77 | ``` 78 | 79 | ### Tool-Augmented Agent 80 | 81 | ```rust 82 | use rig::{Agent, Tool}; 83 | 84 | // Create agent with tools 85 | let agent = openai.agent("gpt-4") 86 | .preamble("You are a capable assistant with tools.") 87 | .tool(calculator) 88 | .tool(web_search) 89 | .dynamic_tools(2, tool_index, toolset) 90 | .build(); 91 | ``` 92 | 93 | ## Key Features 94 | 95 | ### Dynamic Context Resolution 96 | 97 | The agent automatically: 98 | 1. Processes incoming prompts 99 | 2. Queries vector stores for relevant context 100 | 3. Integrates retrieved information 101 | 4. Maintains conversation coherence 102 | 103 | Reference: 104 | 105 | ```rust filename="rig-core/src/agent.rs [171:197]" 106 | let dynamic_context = stream::iter(self.dynamic_context.iter()) 107 | .then(|(num_sample, index)| async { 108 | Ok::<_, VectorStoreError>( 109 | index 110 | .top_n(prompt, *num_sample) 111 | .await? 
112 |                     .into_iter()
113 |                     .map(|(_, id, doc)| {
114 |                         // Pretty print the document if possible for better readability
115 |                         let text = serde_json::to_string_pretty(&doc)
116 |                             .unwrap_or_else(|_| doc.to_string());
117 | 
118 |                         Document {
119 |                             id,
120 |                             text,
121 |                             additional_props: HashMap::new(),
122 |                         }
123 |                     })
124 |                     .collect::<Vec<_>>(),
125 |             )
126 |         })
127 |         .try_fold(vec![], |mut acc, docs| async {
128 |             acc.extend(docs);
129 |             Ok(acc)
130 |         })
131 |         .await
132 |         .map_err(|e| CompletionError::RequestError(Box::new(e)))?;
133 | ```
134 | 
135 | 
136 | ### Tool Management
137 | 
138 | Agents can:
139 | - Maintain static and dynamic tool sets
140 | - Resolve tool calls automatically
141 | - Handle tool execution and error states
142 | 
143 | Reference:
144 | 
145 | ```rust filename="rig-core/src/agent.rs [199:221]"
146 | let dynamic_tools = stream::iter(self.dynamic_tools.iter())
147 |     .then(|(num_sample, index)| async {
148 |         Ok::<_, VectorStoreError>(
149 |             index
150 |                 .top_n_ids(prompt, *num_sample)
151 |                 .await?
152 |                 .into_iter()
153 |                 .map(|(_, id)| id)
154 |                 .collect::<Vec<_>>(),
155 |         )
156 |     })
157 |     .try_fold(vec![], |mut acc, docs| async {
158 |         for doc in docs {
159 |             if let Some(tool) = self.tools.get(&doc) {
160 |                 acc.push(tool.definition(prompt.into()).await)
161 |             } else {
162 |                 tracing::warn!("Tool implementation not found in toolset: {}", doc);
163 |             }
164 |         }
165 |         Ok(acc)
166 |     })
167 |     .await
168 |     .map_err(|e| CompletionError::RequestError(Box::new(e)))?;
169 | ```
170 | 
171 | 
172 | ### Flexible Configuration
173 | 
174 | The AgentBuilder pattern provides extensive configuration options:
175 | 
176 | ```rust
177 | let agent = AgentBuilder::new(model)
178 |     // Basic configuration
179 |     .preamble("System instructions")
180 |     .temperature(0.8)
181 |     .max_tokens(1000)
182 | 
183 |     // Context management
184 |     .context("Static context")
185 |     .dynamic_context(5, vector_store)
186 | 
187 |     // Tool integration
188 |     .tool(tool1)
189 |     .dynamic_tools(3, tool_store, toolset)
190 | 
191 |     // Additional parameters
192 |     .additional_params(json!({
193 |         "top_p": 0.9,
194 |         "frequency_penalty": 0.7
195 |     }))
196 |     .build();
197 | ```
198 | 
199 | ## Best Practices
200 | 
201 | 1. **Context Management**
202 |    - Keep static context minimal and focused
203 |    - Use dynamic context for large knowledge bases
204 |    - Consider context window limitations
205 | 
206 | 2. **Tool Integration**
207 |    - Prefer static tools for core functionality
208 |    - Use dynamic tools for context-specific operations
209 |    - Implement proper error handling in tools
210 | 
211 | 3.
**Performance Optimization**
212 |    - Configure appropriate sampling sizes for dynamic content
213 |    - Use temperature settings based on task requirements
214 |    - Monitor and optimize token usage
215 | 
216 | ## Common Patterns
217 | 
218 | ### Conversational Agents
219 | ```rust
220 | let chat_agent = openai.agent("gpt-4")
221 |     .preamble("You are a conversational assistant.")
222 |     .temperature(0.9)
223 |     .build();
224 | 
225 | let response = chat_agent
226 |     .chat("Hello!", previous_messages)
227 |     .await?;
228 | ```
229 | 
230 | ### RAG Knowledge Base
231 | ```rust
232 | let kb_agent = openai.agent("gpt-4")
233 |     .preamble("You are a knowledge base assistant.")
234 |     .dynamic_context(5, document_store)
235 |     .temperature(0.3)
236 |     .build();
237 | ```
238 | 
239 | ### Tool-First Agent
240 | ```rust
241 | let tool_agent = openai.agent("gpt-4")
242 |     .preamble("You are a tool-using assistant.")
243 |     .tool(calculator)
244 |     .tool(web_search)
245 |     .dynamic_tools(2, tool_store, toolset)
246 |     .temperature(0.5)
247 |     .build();
248 | ```
249 | 
250 | ## See Also
251 | - [Completion Models](./0_completion.mdx)
252 | - [Vector Stores](../4_integrations/41_vector_stores.mdx)
253 | - [Tools](./2_tools.mdx)
254 | - [RAG Systems](../../guides/1_rag/11_rag_system.mdx)
255 | 
256 | 
257 | 258 | -------------------------------------------------------------------------------- /pages/docs/0_quickstart.mdx: -------------------------------------------------------------------------------- 1 | --- 2 | title: Quickstart 3 | description: This section contains the quickstart guide for Rig. 4 | --- 5 | 6 | In the rapidly evolving landscape of artificial intelligence (AI), Large Language Models (LLMs) have emerged as powerful tools for building sophisticated AI applications. However, harnessing the full potential of LLMs often requires navigating complex APIs, managing different providers, and implementing intricate workflows. This is where Rig comes in – a comprehensive Rust library designed to transform how developers build LLM-powered applications. 7 | 8 | ## The Challenge of Building LLM Applications 9 | 10 | Before diving into Rig's capabilities, let's consider the challenges developers face when building LLM applications: 11 | 12 | 1. **API Complexity**: Each LLM provider has its own API, requiring developers to learn and manage multiple interfaces. 13 | 2. **Workflow Management**: Implementing advanced AI workflows, such as Retrieval-Augmented Generation (RAG), involves multiple steps and can be error-prone. 14 | 3. **Performance and Scalability**: Ensuring optimal performance and scalability in LLM applications can be challenging, especially as projects grow in complexity. 15 | 4. **Type Safety and Error Handling**: Maintaining type safety and robust error handling across different LLM interactions is crucial but often difficult. 16 | 17 | ## Enter Rig: A Game-Changer for LLM Application Development 18 | 19 | Rig is more than just an API wrapper; it's a comprehensive framework that addresses these challenges head-on. By providing high-level abstractions and a unified interface, Rig simplifies the development process, allowing you to focus on building innovative AI solutions rather than wrestling with implementation details. 20 | 21 | Whether you're a seasoned Rust developer or new to the language, Rig offers a range of features designed to make your LLM application development smoother, faster, and more enjoyable. 22 | 23 | ## Getting Started with Rig 24 | 25 | Let's dive into a simple example to demonstrate how easy it is to get started with Rig: 26 | 27 | ```rust 28 | use rig::{completion::Prompt, providers::openai}; 29 | 30 | #[tokio::main] 31 | async fn main() -> Result<(), anyhow::Error> { 32 | // Initialize the OpenAI client using environment variables 33 | let openai_client = openai::Client::from_env(); 34 | 35 | // Create a GPT-4 model instance 36 | let gpt4 = openai_client.model("gpt-4").build(); 37 | 38 | // Send a prompt to GPT-4 and await the response 39 | let response = gpt4.prompt("Explain quantum computing in one sentence.").await?; 40 | 41 | // Print the response 42 | println!("GPT-4: {}", response); 43 | 44 | Ok(()) 45 | } 46 | ``` 47 | 48 | This simple example demonstrates how Rig abstracts away the complexities of interacting with OpenAI's API, allowing you to focus on the core logic of your application. 49 | 50 | To include Rig in your project, add the following to your `Cargo.toml`: 51 | 52 | ```toml 53 | [dependencies] 54 | rig-core = "0.0.6" 55 | tokio = { version = "1.34.0", features = ["full"] } 56 | ``` 57 | 58 | > 💡 **Tip**: Don't forget to set the `OPENAI_API_KEY` environment variable before running your application. 
59 | 
60 | ## Key Features and Developer Experience
61 | 
62 | Rig combines Rust's powerful type system and performance with intuitive abstractions tailored for AI development. Let's explore some of its key features:
63 | 
64 | ### 1. Unified and Intuitive API
65 | 
66 | One of Rig's standout features is its consistent interface across different LLM providers:
67 | 
68 | ```rust
69 | // Using OpenAI
70 | let gpt4 = openai_client.model("gpt-4").build();
71 | let response = gpt4.prompt("Hello, GPT-4!").await?;
72 | 
73 | // Using Cohere
74 | let command = cohere_client.model("command").build();
75 | let response = command.prompt("Hello, Cohere!").await?;
76 | ```
77 | 
78 | This unified API design ensures that switching between providers or adding new ones to your project is seamless, reducing cognitive load and improving code maintainability.
79 | 
80 | ### 2. Advanced Abstractions for Complex Workflows
81 | 
82 | Rig shines when it comes to implementing complex AI workflows. For example, creating a Retrieval-Augmented Generation (RAG) system typically involves multiple steps:
83 | 
84 | 1. Generating embeddings for documents
85 | 2. Storing these embeddings in a vector database
86 | 3. Retrieving relevant context based on user queries
87 | 4. Augmenting the LLM prompt with this context
88 | 
89 | With Rig, this entire process can be condensed into a few lines of code:
90 | 
91 | ```rust
92 | let rag_agent = openai_client.context_rag_agent("gpt-4")
93 |     .preamble("You are a helpful assistant.")
94 |     .dynamic_context(2, vector_store.index(embedding_model))
95 |     .build();
96 | 
97 | let response = rag_agent.prompt("What is the capital of France?").await?;
98 | ```
99 | 
100 | This high-level abstraction allows developers to implement advanced AI systems quickly and efficiently, without getting bogged down in the implementation details.
101 | 
102 | ### 3. Type-Safe Development
103 | 
104 | Leveraging Rust's strong type system, Rig provides compile-time guarantees and better auto-completion, enhancing the developer experience:
105 | 
106 | ```rust
107 | #[derive(serde::Deserialize, JsonSchema)]
108 | struct Person {
109 |     name: String,
110 |     age: u8,
111 | }
112 | 
113 | let extractor = openai_client.extractor::<Person>("gpt-4").build();
114 | let person: Person = extractor.extract("John Doe is 30 years old").await?;
115 | ```
116 | 
117 | This type-safe approach helps catch errors early in the development process and makes refactoring and maintenance easier.
118 | 
119 | ### 4. Extensibility and Integration
120 | 
121 | Rig's flexible architecture allows for easy customization and seamless integration with Rust's growing AI ecosystem:
122 | 
123 | ```rust
124 | impl VectorStore for MyCustomStore {
125 |     // Implementation details...
126 | }
127 | 
128 | let my_store = MyCustomStore::new();
129 | let rag_agent = openai_client.context_rag_agent("gpt-4")
130 |     .dynamic_context(2, my_store.index(embedding_model))
131 |     .build();
132 | ```
133 | 
134 | This extensibility ensures that Rig can grow with your project's needs and integrate with other tools in your AI development stack.
135 | 136 | ## Advanced Features: RAG Systems and Beyond 137 | 138 | Let's explore a more comprehensive example of a RAG system with Rig, showcasing its ability to handle complex AI workflows: 139 | 140 | ```rust 141 | use rig::{ 142 | completion::Prompt, 143 | embeddings::EmbeddingsBuilder, 144 | providers::openai::Client, 145 | vector_store::{in_memory_store::InMemoryVectorStore, VectorStore}, 146 | }; 147 | 148 | #[tokio::main] 149 | async fn main() -> Result<(), anyhow::Error> { 150 | // Initialize OpenAI client and embedding model 151 | let openai_client = Client::from_env(); 152 | let embedding_model = openai_client.embedding_model("text-embedding-ada-002"); 153 | 154 | // Create and populate vector store 155 | let mut vector_store = InMemoryVectorStore::default(); 156 | let embeddings = EmbeddingsBuilder::new(embedding_model.clone()) 157 | .simple_document("doc1", "Rig is a Rust library for building LLM applications.") 158 | .simple_document("doc2", "Rig supports OpenAI and Cohere as LLM providers.") 159 | .build() 160 | .await?; 161 | vector_store.add_documents(embeddings).await?; 162 | 163 | // Create and use RAG agent 164 | let rag_agent = openai_client.context_rag_agent("gpt-4") 165 | .preamble("You are an assistant that answers questions about Rig.") 166 | .dynamic_context(1, vector_store.index(embedding_model)) 167 | .build(); 168 | 169 | let response = rag_agent.prompt("What is Rig?").await?; 170 | println!("RAG Agent: {}", response); 171 | 172 | Ok(()) 173 | } 174 | ``` 175 | 176 | This example demonstrates how Rig abstracts the complexity of creating a RAG system, handling embedding generation, vector storage, and context retrieval efficiently. With just a few lines of code, you've implemented a sophisticated AI system that can provide context-aware responses. 177 | 178 | But Rig's capabilities extend beyond RAG systems. Its flexible architecture allows for the implementation of various AI workflows, including: 179 | 180 | - Multi-agent systems for complex problem-solving 181 | - AI-powered data analysis and extraction 182 | - Automated content generation and summarization 183 | - And much more! -------------------------------------------------------------------------------- /pages/docs/4_integrations/41_vector_stores/in_memory.mdx: -------------------------------------------------------------------------------- 1 | --- 2 | title: In-Memory Vector Store 3 | description: This section describes the in-memory vector store integration. 4 | --- 5 | 6 | import { Cards } from 'nextra/components' 7 | 8 | # In-Memory Vector Store 9 | 10 | ## Overview 11 | 12 | The in-memory vector store is Rig's default vector store implementation, included in `rig-core`. It provides a lightweight, RAM-based solution for vector similarity search, ideal for development, testing, and small-scale applications. 13 | 14 | ## Key Features 15 | 16 | - Zero external dependencies 17 | - Automatic or custom document ID generation 18 | - Multiple embedding support per document 19 | - Cosine similarity search 20 | - Flexible document schema support 21 | 22 | ## Implementation Details 23 | 24 | ### Core Components 25 | 26 | 1. 
**Store Structure**:
27 | 
28 | The `InMemoryVectorStore` uses a simple but effective data structure:
29 | 
30 | ```rust
31 | pub struct InMemoryVectorStore<D: Serialize> {
32 |     embeddings: HashMap<String, (D, OneOrMany<Embedding>)>,
33 | }
34 | ```
35 | 
36 | Key components:
37 | - **Key**: String identifier for each document
38 | - **Value**: Tuple containing:
39 |   - `D`: The serializable document
40 |   - `OneOrMany<Embedding>`: Either a single embedding or multiple embeddings
41 | 
42 | The store supports multiple embeddings per document through the `OneOrMany` enum:
43 | ```rust
44 | pub enum OneOrMany<T> {
45 |     One(T),
46 |     Many(Vec<T>),
47 | }
48 | ```
49 | 
50 | When searching, the store:
51 | 
52 | 1. Computes cosine similarity between the query and all document embeddings
53 | 2. For documents with multiple embeddings, uses the best-matching embedding
54 | 3. Uses a `BinaryHeap` to efficiently maintain the top-N results
55 | 4. Returns results sorted by similarity score
56 | 
57 | Memory layout example:
58 | ```plaintext
59 | {
60 |     "doc1" => (
61 |         Document { title: "Example 1", ... },
62 |         One(Embedding { vec: [0.1, 0.2, ...] })
63 |     ),
64 |     "doc2" => (
65 |         Document { title: "Example 2", ... },
66 |         Many([
67 |             Embedding { vec: [0.3, 0.4, ...] },
68 |             Embedding { vec: [0.5, 0.6, ...] }
69 |         ])
70 |     )
71 | }
72 | ```
73 | 
74 | 2. **Vector Search Implementation**:
75 |    - Uses a binary heap for efficient top-N retrieval
76 |    - Maintains scores using ordered floating-point comparisons
77 |    - Supports multiple embeddings per document with best-match selection
78 | 
79 | ### Document Management
80 | 
81 | Three ways to add documents:
82 | 
83 | 1. **Auto-generated IDs**:
84 | ```rust
85 | let store = InMemoryVectorStore::from_documents(vec![
86 |     (doc1, embedding1),
87 |     (doc2, embedding2)
88 | ]);
89 | ```
90 | 
91 | 2. **Custom IDs**:
92 | ```rust
93 | let store = InMemoryVectorStore::from_documents_with_ids(vec![
94 |     ("custom_id_1", doc1, embedding1),
95 |     ("custom_id_2", doc2, embedding2)
96 | ]);
97 | ```
98 | 
99 | 3. **Function-generated IDs**:
100 | ```rust
101 | let store = InMemoryVectorStore::from_documents_with_id_f(
102 |     documents,
103 |     |doc| format!("doc_{}", doc.title)
104 | );
105 | ```
106 | 
107 | ## Special Considerations
108 | 
109 | ### 1. Memory Usage
110 | - All embeddings and documents are stored in RAM
111 | - Memory usage scales linearly with document count and embedding dimensions
112 | - Consider available memory when storing large datasets
113 | 
114 | ### 2. Performance Characteristics
115 | - Fast lookups using HashMap for document retrieval
116 | - Efficient top-N selection using BinaryHeap
117 | - O(n) complexity for vector similarity search
118 | - Best for small to medium-sized datasets
119 | 
120 | ### 3.
Document Storage
121 | - Documents must be serializable
122 | - Supports multiple embeddings per document
123 | - Automatic pruning of large arrays (>400 elements)
124 | 
125 | ## Usage Example
126 | 
127 | ```rust
128 | use rig::providers::openai;
129 | use rig::embeddings::EmbeddingsBuilder;
130 | use rig::vector_store::in_memory_store::InMemoryVectorStore;
131 | 
132 | #[tokio::main]
133 | async fn main() -> Result<(), anyhow::Error> {
134 |     // Initialize store
135 |     let mut store = InMemoryVectorStore::default();
136 | 
137 |     // Create embeddings (assumes `model` is an embedding model built elsewhere)
138 |     let embeddings = EmbeddingsBuilder::new(model.clone())
139 |         .simple_document("doc1", "First document content")
140 |         .simple_document("doc2", "Second document content")
141 |         .build()
142 |         .await?;
143 | 
144 |     // Add documents to store
145 |     store.add_documents(embeddings);
146 | 
147 |     // Create vector store index
148 |     let index = store.index(model);
149 | 
150 |     // Search similar documents via the index
151 |     let results = index
152 |         .top_n::<serde_json::Value>("search query", 5)
153 |         .await?;
154 | 
155 |     Ok(())
156 | }
157 | ```
158 | 
159 | ## Implementation Specifics
160 | 
161 | ### Vector Search Algorithm
162 | 
163 | The core search implementation:
164 | 
165 | ```rust filename=rig-core/src/vector_store/in_memory_store.rs [67:103]
166 | 
167 | /// Implement vector search on [InMemoryVectorStore].
168 | /// To be used by implementations of [VectorStoreIndex::top_n] and [VectorStoreIndex::top_n_ids] methods.
169 | fn vector_search(&self, prompt_embedding: &Embedding, n: usize) -> EmbeddingRanking {
170 |     // Sort documents by best embedding distance
171 |     let mut docs = BinaryHeap::new();
172 | 
173 |     for (id, (doc, embeddings)) in self.embeddings.iter() {
174 |         // Get the best context for the document given the prompt
175 |         if let Some((distance, embed_doc)) = embeddings
176 |             .iter()
177 |             .map(|embedding| {
178 |                 (
179 |                     OrderedFloat(embedding.cosine_similarity(prompt_embedding, false)),
180 |                     &embedding.document,
181 |                 )
182 |             })
183 |             .max_by(|a, b| a.0.cmp(&b.0))
184 |         {
185 |             docs.push(Reverse(RankingItem(distance, id, doc, embed_doc)));
186 |         };
187 | 
188 | ```
189 | 
190 | 
191 | ### Error Handling
192 | 
193 | The vector store operations can produce several error types:
194 | 
195 | - `EmbeddingError`: Issues with embedding generation
196 | - `JsonError`: Document serialization/deserialization errors
197 | - `DatastoreError`: General storage operations errors
198 | - `MissingIdError`: When a requested document ID doesn't exist
199 | 
200 | Example error handling:
201 | 
202 | ```rust
203 | match store.get_document::<Document>("doc1") {
204 |     Ok(Some(doc)) => println!("Found document: {:?}", doc),
205 |     Ok(None) => println!("Document not found"),
206 |     Err(VectorStoreError::JsonError(e)) => println!("Failed to deserialize: {}", e),
207 |     Err(e) => println!("Other error: {}", e),
208 | }
209 | ```
210 | 
211 | 
212 | ## Best Practices
213 | 
214 | 1. **Memory Management**:
215 |    - Monitor memory usage with large datasets
216 |    - Consider chunking large document additions
217 |    - Use cloud-based vector stores for production deployments
218 | 
219 | 2. **Document Structure**:
220 |    - Keep documents serializable
221 |    - Avoid extremely large arrays
222 |    - Consider using custom ID generation for meaningful identifiers
223 | 
224 | 3. **Performance Optimization**:
225 |    - Pre-allocate store capacity when possible
226 |    - Batch document additions
227 |    - Use appropriate embedding dimensions
228 | 
229 | ## Limitations
230 | 
231 | 1.
**Scalability**:
232 |    - Limited by available RAM
233 |    - No persistence between program runs
234 |    - Single-machine only
235 | 
236 | 2. **Features**:
237 |    - No built-in indexing optimizations
238 |    - No metadata filtering
239 |    - No automatic persistence
240 | 
241 | 3. **Production Use**:
242 |    - Best suited for development/testing
243 |    - Consider cloud-based alternatives for production
244 |    - No built-in backup/recovery mechanisms
245 | 
246 | For production deployments, consider using one of Rig's other vector store integrations (MongoDB, LanceDB, Neo4j, or Qdrant), which offer persistence and better scalability.
247 | 
248 | ## Thread Safety
249 | 
250 | The `InMemoryVectorStore` is thread-safe for concurrent reads but requires exclusive access for writes. The store implements `Clone` for creating independent instances and `Send + Sync` for safe concurrent access across thread boundaries.
251 | 
252 | For concurrent write access, consider wrapping the store in a synchronization primitive like `Arc<RwLock<InMemoryVectorStore<D>>>`; a minimal sketch appears after the comparison table below.
253 | 
254 | ## Comparison with Other Vector Stores
255 | | Feature | In-Memory | MongoDB | Qdrant | LanceDB |
256 | |---------|-----------|---------|---------|----------|
257 | | Persistence | ❌ | ✅ | ✅ | ✅ |
258 | | Horizontal Scaling | ❌ | ✅ | ✅ | ❌ |
259 | | Setup Complexity | Low | Medium | Medium | Low |
260 | | Memory Usage | High | Low | Medium | Low |
261 | | Query Speed | Fast | Medium | Fast | Fast |
262 | 
263 | 
264 | 
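## Example: Concurrent Access

A minimal sketch of the `Arc<RwLock<...>>` pattern described above, using `tokio::sync::RwLock`. The `Doc` type and the spawned task are hypothetical; the point is simply that writers take an exclusive lock while readers only need a shared one.

```rust
use std::sync::Arc;
use tokio::sync::RwLock;
use rig::vector_store::in_memory_store::InMemoryVectorStore;

#[derive(Clone, Default, serde::Serialize)]
struct Doc {
    text: String,
}

#[tokio::main]
async fn main() {
    // Share one store across tasks behind a read-write lock.
    let store = Arc::new(RwLock::new(InMemoryVectorStore::<Doc>::default()));

    let writer = Arc::clone(&store);
    let handle = tokio::spawn(async move {
        // Writers take an exclusive lock...
        let mut guard = writer.write().await;
        // ...and mutate the store here, e.g. guard.add_documents(embeddings);
        let _ = &mut *guard; // placeholder for a real mutation
    });

    // ...while readers proceed concurrently with a shared lock.
    let reader = store.read().await;
    drop(reader);

    handle.await.unwrap();
}
```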
265 | 
266 | 
--------------------------------------------------------------------------------
/pages/docs/4_integrations/41_vector_stores/lancedb.mdx:
--------------------------------------------------------------------------------
1 | ---
2 | title: LanceDB
3 | description: This section describes the LanceDB integration.
4 | ---
5 | 
6 | import { Cards } from 'nextra/components'
7 | 
8 | # Rig-LanceDB Integration Overview
9 | 
10 | ## Introduction
11 | 
12 | The `rig-lancedb` crate provides a vector store implementation using LanceDB, a serverless vector database built on Apache Arrow. This integration enables efficient vector similarity search with support for both local and cloud storage (S3, Google Cloud, Azure).
13 | 
14 | ## Key Features
15 | 
16 | - **Columnar Storage**: Native support for Apache Arrow's columnar format
17 | - **Multiple Index Types**:
18 |   - IVF-PQ (Inverted File with Product Quantization)
19 |   - Exact Nearest Neighbors (ENN)
20 | - **Flexible Deployment**:
21 |   - Local storage
22 |   - Cloud storage (S3, GCP, Azure)
23 | - **Advanced Query Capabilities**: Supports both vector similarity search and metadata filtering
24 | 
25 | ## Implementation Details
26 | 
27 | ### Vector Index Structure
28 | 
29 | The core implementation revolves around the `LanceDbVectorIndex` struct:
30 | 
31 | 
32 | ```34:44:rig-lancedb/src/lib.rs
33 | /// ```
34 | ```
35 | 
36 | 
37 | ### Schema Definition
38 | 
39 | LanceDB requires a specific schema for storing embeddings:
40 | 
41 | 
42 | ```35:47:rig-lancedb/tests/fixtures/lib.rs
43 | Schema::new(Fields::from(vec![
44 |     Field::new("id", DataType::Utf8, false),
45 |     Field::new("definition", DataType::Utf8, false),
46 |     Field::new(
47 |         "embedding",
48 |         DataType::FixedSizeList(
49 |             Arc::new(Field::new("item", DataType::Float64, true)),
50 |             dims as i32,
51 |         ),
52 |         false,
53 |     ),
54 | ]))
55 | }
56 | ```
57 | 
58 | 
59 | ### Search Parameters
60 | 
61 | Configurable search parameters include:
62 | - Distance type (Cosine, L2)
63 | - Number of candidates
64 | - Query vector dimension validation
65 | 
66 | ## Usage Examples
67 | 
68 | ### 1. Local Storage with ANN (Approximate Nearest Neighbors)
69 | 
70 | ```rust
71 | use rig_lancedb::{LanceDbVectorIndex, SearchParams};
72 | 
73 | // Initialize local database
74 | let db = lancedb::connect("data/lancedb-store").execute().await?;
75 | 
76 | // Create vector index
77 | let vector_store = LanceDbVectorIndex::new(
78 |     table,
79 |     model,
80 |     "id",
81 |     SearchParams::default()
82 | ).await?;
83 | 
84 | // Perform search
85 | let results = vector_store
86 |     .top_n::<serde_json::Value>("search query", 5)
87 |     .await?;
88 | ```
89 | 
90 | ### 2. S3 Storage with IVF-PQ Index
91 | 
92 | 
93 | ```24:42:rig-lancedb/examples/vector_search_s3_ann.rs
94 | let model = openai_client.embedding_model(TEXT_EMBEDDING_ADA_002);
95 | 
96 | // Initialize LanceDB on S3.
97 | // Note: see below docs for more options and IAM permission required to read/write to S3.
98 | // https://lancedb.github.io/lancedb/guides/storage/#aws-s3
99 | let db = lancedb::connect("s3://lancedb-test-829666124233")
100 |     .execute()
101 |     .await?;
102 | 
103 | // Generate embeddings for the test data.
104 | let embeddings = EmbeddingsBuilder::new(model.clone())
105 |     .documents(words())?
106 |     // Note: need at least 256 rows in order to create an index so copy the definition 256 times for testing purposes.
107 |     .documents(
108 |         (0..256)
109 |             .map(|i| Word {
110 |                 id: format!("doc{}", i),
111 |                 definition: "Definition of *flumbuzzle (noun)*: A sudden, inexplicable urge to rearrange or reorganize small objects, such as desk items or books, for no apparent reason.".to_string()
112 |             })
113 | ```
114 | 
115 | 
116 | ## LanceDB-Specific Features
117 | 
118 | ### 1. Record Batch Processing
119 | LanceDB uses Arrow's RecordBatch for efficient data handling:
120 | 
121 | 
122 | ```19:31:rig-lancedb/src/utils/mod.rs
123 | }
124 | 
125 | impl QueryToJson for lancedb::query::VectorQuery {
126 |     async fn execute_query(&self) -> Result<Vec<serde_json::Value>, VectorStoreError> {
127 |         let record_batches = self
128 |             .execute()
129 |             .await
130 |             .map_err(lancedb_to_rig_error)?
131 |             .try_collect::<Vec<_>>()
132 |             .await
133 |             .map_err(lancedb_to_rig_error)?;
134 | 
135 |         record_batches.deserialize()
136 | ```
137 | 
138 | 
139 | ### 2. Column Filtering
140 | Automatic filtering of embedding columns:
141 | 
142 | 
143 | ```33:46:rig-lancedb/src/utils/mod.rs
144 | }
145 | 
146 | /// Filter out the columns from a table that do not include embeddings. Return the vector of column names.
147 | pub(crate) trait FilterTableColumns {
148 |     fn filter_embeddings(self) -> Vec<String>;
149 | }
150 | 
151 | impl FilterTableColumns for Arc<Schema> {
152 |     fn filter_embeddings(self) -> Vec<String> {
153 |         self.fields()
154 |             .iter()
155 |             .filter_map(|field| match field.data_type() {
156 |                 DataType::FixedSizeList(inner, ..) => match inner.data_type() {
157 |                     DataType::Float64 => None,
158 | ```
159 | 
160 | 
161 | ### 3. Automatic Deserialization
162 | Built-in support for converting Arrow types to Rust types:
163 | 
164 | 
165 | ```1:45:rig-lancedb/src/utils/deserializer.rs
166 | use std::sync::Arc;
167 | 
168 | use arrow_array::{
169 |     cast::AsArray,
170 |     types::{
171 |         ArrowDictionaryKeyType, BinaryType, ByteArrayType, Date32Type, Date64Type, Decimal128Type,
172 |         DurationMicrosecondType, DurationMillisecondType, DurationNanosecondType,
173 |         DurationSecondType, Float32Type, Float64Type, Int16Type, Int32Type, Int64Type, Int8Type,
174 |         IntervalDayTime, IntervalDayTimeType, IntervalMonthDayNano, IntervalMonthDayNanoType,
175 |         IntervalYearMonthType, LargeBinaryType, LargeUtf8Type, RunEndIndexType,
176 |         Time32MillisecondType, Time32SecondType, Time64MicrosecondType, Time64NanosecondType,
177 |         TimestampMicrosecondType, TimestampMillisecondType, TimestampNanosecondType,
178 |         TimestampSecondType, UInt16Type, UInt32Type, UInt64Type, UInt8Type, Utf8Type,
179 |     },
180 |     Array, ArrowPrimitiveType, OffsetSizeTrait, RecordBatch, RunArray, StructArray, UnionArray,
181 | };
182 | use lancedb::arrow::arrow_schema::{ArrowError, DataType, IntervalUnit, TimeUnit};
183 | use rig::vector_store::VectorStoreError;
184 | use serde::Serialize;
185 | use serde_json::{json, Value};
186 | 
187 | use crate::serde_to_rig_error;
188 | 
189 | fn arrow_to_rig_error(e: ArrowError) -> VectorStoreError {
190 |     VectorStoreError::DatastoreError(Box::new(e))
191 | }
192 | 
193 | /// Trait used to deserialize data returned from LanceDB queries into a serde_json::Value vector.
194 | /// Data returned by LanceDB is a vector of `RecordBatch` items.
195 | pub(crate) trait RecordBatchDeserializer {
196 |     fn deserialize(&self) -> Result<Vec<Value>, VectorStoreError>;
197 | }
198 | 
199 | impl RecordBatchDeserializer for Vec<RecordBatch> {
200 |     fn deserialize(&self) -> Result<Vec<Value>, VectorStoreError> {
201 |         Ok(self
202 |             .iter()
203 |             .map(|record_batch| record_batch.deserialize())
204 |             .collect::<Result<Vec<_>, _>>()?
205 |             .into_iter()
206 |             .flatten()
207 |             .collect())
208 |     }
209 | }
210 | 
211 | ```
212 | 
213 | 
214 | ## Best Practices
215 | 
216 | 1. **Index Creation**:
217 |    - Minimum of 256 rows required for IVF-PQ indexing
218 |    - Choose appropriate distance metrics based on your use case
219 | 
220 | 2. **Schema Design**:
221 |    - Use appropriate data types for columns
222 |    - Consider embedding dimension requirements
223 | 
224 | 3. **Query Optimization**:
225 |    - Use column filtering to reduce data transfer
226 |    - Leverage metadata filtering when possible
227 | 
228 | ## Limitations and Considerations
229 | 
230 | 1. **Data Size**:
231 |    - Local storage is suitable for smaller datasets
232 |    - Use cloud storage for large-scale deployments
233 | 
234 | 2. **Index Requirements**:
235 |    - IVF-PQ index requires minimum dataset size
236 |    - Consider memory requirements for large indices
237 | 
238 | 3. **Performance**:
239 |    - ANN provides better performance but lower accuracy
240 |    - ENN provides exact results but is slower
241 | 
242 | ## Integration Example
243 | 
244 | A complete example showing document embedding and search:
245 | 
246 | 
247 | ```13:73:rig-lancedb/examples/vector_search_local_ann.rs
248 | #[path = "./fixtures/lib.rs"]
249 | mod fixture;
250 | 
251 | #[tokio::main]
252 | async fn main() -> Result<(), anyhow::Error> {
253 |     // Initialize OpenAI client. Use this to generate embeddings (and generate test data for RAG demo).
254 |     let openai_client = Client::from_env();
255 | 
256 |     // Select an embedding model.
257 |     let model = openai_client.embedding_model(TEXT_EMBEDDING_ADA_002);
258 | 
259 |     // Initialize LanceDB locally.
260 |     let db = lancedb::connect("data/lancedb-store").execute().await?;
261 | 
262 |     // Generate embeddings for the test data.
263 |     let embeddings = EmbeddingsBuilder::new(model.clone())
264 |         .documents(words())?
265 |         // Note: need at least 256 rows in order to create an index so copy the definition 256 times for testing purposes.
266 |         .documents(
267 |             (0..256)
268 | ```
269 | 
270 | 
271 | For detailed API reference and additional features, see the [LanceDB documentation](https://lancedb.github.io/lancedb/) and Rig's API documentation.
272 | 
273 | 
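## Putting It Together with a Rig Agent

To close the loop with Rig's agent layer, here is a minimal end-to-end sketch (assembled from the snippets on this page and the agent docs, not a verbatim repository example) that opens an existing, already-populated table and plugs the resulting `LanceDbVectorIndex` into a RAG agent. The table name `definitions` and the prompt are hypothetical.

```rust
use rig::{completion::Prompt, providers::openai::Client};
use rig_lancedb::{LanceDbVectorIndex, SearchParams};

#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
    let openai_client = Client::from_env();
    let model = openai_client.embedding_model("text-embedding-ada-002");

    // Connect to the local store created earlier and open the table.
    let db = lancedb::connect("data/lancedb-store").execute().await?;
    let table = db.open_table("definitions").execute().await?;

    // Wrap the table in a Rig vector index.
    let index = LanceDbVectorIndex::new(table, model, "id", SearchParams::default()).await?;

    // RAG agent that pulls the 2 most relevant rows into its context.
    let agent = openai_client
        .agent("gpt-4")
        .preamble("You answer questions using the provided definitions.")
        .dynamic_context(2, index)
        .build();

    let answer = agent.prompt("What does 'flumbuzzle' mean?").await?;
    println!("{answer}");
    Ok(())
}
```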
274 | 
275 | 
--------------------------------------------------------------------------------
/pages/guides/3_deploy/Blog_1_aws_lambda.mdx:
--------------------------------------------------------------------------------
1 | # How to Deploy Your Rig App on AWS Lambda: A Step-by-Step Guide
2 | 
3 | **TL;DR**
4 | 
5 | * A step-by-step walkthrough on deploying a simple AI agent built with [Rig](https://github.com/0xPlaygrounds/rig), a fullstack agent framework, on AWS Lambda using the cargo lambda CLI.
6 | * Comparison of performance metrics (memory usage, execution time, and cold starts) with a similar agent built with [LangChain](https://www.langchain.com).
7 | * **Rig vs. LangChain agent on AWS Lambda**
8 |   1. 🪶 Lower memory footprint (~26MB vs ~130MB)
9 |   2. ⏩ Faster cold starts (90ms vs. 1900ms)
10 |   3. 🧘‍♂️ More consistent performance across memory configurations
11 | 
12 | Welcome to the series **Deploy Your Rig Application**!
13 | Apps built with Rig can vary in complexity across three core dimensions: LLM usage, knowledge bases for RAG, and the compute infrastructure where the application is deployed. In this series, we'll explore how different combinations of these dimensions can be configured for production use.
14 | 
15 | Today, we'll start with a simple Rig agent that uses the [OpenAI model GPT-4-turbo](https://platform.openai.com/docs/models/gpt-4o), does not rely on a vector store (i.e. no RAG), and will be deployed on AWS Lambda.
16 | 
17 | This blog will provide a step-by-step deployment guide for the simple Rig app, showcase performance metrics of the Rig app running on AWS Lambda, and compare these metrics with those of a [LangChain](https://www.langchain.com) app on the same platform.
18 | 
19 | > *💡 If you're new to Rig and want to start from the beginning or are looking for additional tutorials, check out our [blog series](https://rig.rs/build-with-rig-guide.html).*
20 | 
21 | Let's dive in!
22 | 
23 | ## Prerequisites
24 | 
25 | Before we begin building, ensure you have the following:
26 | 
27 | * A clone of the [`rig-entertainer-lambda`](https://github.com/garance-buricatu/rig-aws/tree/master/rig-entertainer-lambda) crate (or your own Rig application).
28 | * An AWS account
29 | * An OpenAI API key
30 | 
31 | ## AWS Lambda Quick Overview
32 | 
33 | You might deploy your Rust application on AWS Lambda if it's a task that can execute in under 15 minutes or if your app is a REST API backend.
34 | 
35 | ### AWS 🤝 Rust
36 | 
37 | AWS Lambda supports Rust through the use of the [OS-only runtime Amazon Linux 2023](https://docs.aws.amazon.com/lambda/latest/dg/lambda-runtimes.html) (a lambda runtime) in conjunction with the [Rust runtime client](https://github.com/awslabs/aws-lambda-rust-runtime), a Rust crate.
38 | 
39 | #### REST API backend
40 | 
41 | * Use the [`lambda-http`](https://github.com/awslabs/aws-lambda-rust-runtime/tree/main/lambda-http) crate (from the runtime client) to write your function's entrypoint.
42 | * Then, route traffic to your lambda via AWS API services like [API Gateway](https://aws.amazon.com/api-gateway/), [AppSync](https://aws.amazon.com/pm/appsync), [VPC Lattice](https://aws.amazon.com/vpc/lattice/), etc.
43 | * If your lambda handles multiple endpoints of your API, the crate [axum](https://github.com/tokio-rs/axum) facilitates the routing within the lambda.
44 | 
45 | #### Event-based task (15 minutes max.)
46 | 
47 | * Your lambda function is invoked by some event, with the event passed as the payload.
For example, configure your S3 bucket to trigger the lambda function when a new object is added to the bucket. The function will receive the new object in the payload and can process it further.
48 | * Use the [`lambda_runtime`](https://github.com/awslabs/aws-lambda-rust-runtime/tree/main/lambda-runtime) crate with [`lambda_events`](https://github.com/awslabs/aws-lambda-rust-runtime/tree/main/lambda-events) (from the runtime client) to write your function's entrypoint.
49 | * Then, invoke your function either via the [`lambda invoke` command](https://docs.aws.amazon.com/cli/latest/reference/lambda/invoke.html) or with integrated AWS triggers (e.g. an S3 object-upload trigger).
50 | 
51 | > For both cases, the crate [`tokio`](https://docs.rs/tokio/latest/tokio/) must also be added to your project, as the lambda runtime client uses `tokio` to handle asynchronous calls.
52 | 
53 | ## Rig Entertainer Agent App 🤡
54 | 
55 | The crate [`rig-entertainer-lambda`](https://github.com/garance-buricatu/rig-aws-lambda/tree/master/rig-entertainer-lambda) implements a simple Rust program that is executed via the `lambda_runtime`. It invokes a `Rig` agent using the OpenAI API to entertain users with jokes. It is an event-based task that I will execute with the `lambda invoke` command.
56 | 
57 | The main takeaway here is that the app's `Cargo.toml` file must include the following dependencies:
58 | 
59 | 1. `rig-core` (our rig crate)
60 | 2. `lambda_runtime`
61 | 3. `tokio`
62 | 
63 | ### Now let's deploy it
64 | 
65 | There are *many* ways to deploy Rust lambdas to AWS. Some out-of-the-box options include the AWS CLI, the [cargo lambda](https://www.cargo-lambda.info/guide/getting-started.html) CLI, the AWS SAM CLI, the AWS CDK, and more. You can also decide to create a Dockerfile for your app and use that container image in your Lambda function instead. See some useful examples [here](https://docs.aws.amazon.com/lambda/latest/dg/rust-package.html).
66 | 
67 | In this blog, we'll use the cargo lambda CLI option to deploy the code in `rig-entertainer-lambda` from your local machine to an AWS lambda:
68 | 
69 | ```bash
70 | # Add your AWS credentials to your terminal
71 | # Create an AWS Lambda function named 'rig-entertainer' with architecture x86_64.
72 | 
73 | function_name='rig-entertainer'
74 | 
75 | cd rig-entertainer-lambda
76 | cargo lambda build --release # Can define different architectures here, with --arm64 for example
77 | cargo lambda deploy $function_name # Since the crate name matches the lambda function name, no need to specify a binary file
78 | ```
79 | 
80 | ### Metrics on the cloud ☁️
81 | 
82 | #### Deployment package
83 | 
84 | This is the code configuration of the `rig-entertainer` function in AWS. The function's code package (bundled code and dependencies required for lambda to run) includes the single Rust binary called `bootstrap`, which is 3.2 MB.
85 | 
86 | ![Deployment Package Rust](/images/deploy_1/rig-deployment-package.png)
87 | 
88 | #### Memory, CPU, and runtime
89 | 
90 | The image below gives metrics on memory usage and execution time of the function. Each row represents a single execution of the function. In **yellow** is the **total memory used**, in **red** is the amount of **memory allocated**, and in **blue** is the **runtime**.
91 | Although the lambda's memory can be configured anywhere from 128MB to 1024MB, we can see that the average memory used by our app is **26MB**.
92 | ![Rig Cloudwatch logs](/images/deploy_1/rig-cw-logs.png)
93 | 
94 | Let's get more information on the metrics above by spamming the function and calculating averages. I invoked `rig-entertainer` 50 times for each memory configuration of 128MB, 256MB, 512MB, and 1024MB using the [power tuner tool](https://github.com/alexcasalboni/aws-lambda-power-tuning), and the results of those invocations are displayed in the chart below.
95 | 
96 | The x-axis is the memory allocation, and the y-axis is the average runtime over the 50 executions of `rig-entertainer`.
97 | 
98 | > **Q.** We know that the function uses on average only 26MB per execution (which is less than the minimum memory allocation of 128MB), so why should we test higher memory configurations?
99 | **A.** [vCPUs are added to the lambda in proportion to memory](https://docs.aws.amazon.com/lambda/latest/operatorguide/computing-power.html), so adding memory could still affect the performance.
100 | 
101 | However, we can see that adding memory to the function (and therefore adding computational power) does not affect its performance at all. Since the [cost of a lambda execution](https://aws.amazon.com/lambda/pricing/) is calculated in GB-seconds, we get the most efficient lambda for the lowest price!
102 | ![Power Tuner Rust](/images/deploy_1/rig-power-tuner.png)
103 | 
104 | #### Cold starts ❄️
105 | 
106 | [Cold starts](https://docs.aws.amazon.com/lambda/latest/operatorguide/execution-environments.html) occur when the lambda function's execution environment needs to be booted up from scratch. This includes setting up the actual compute that the lambda function is running on, and downloading the lambda function code and dependencies into that environment.
107 | Cold start latency doesn't affect all function executions because once the lambda environment has been set up, it will be reused by subsequent executions of the same lambda.
108 | 
109 | In the lambda CloudWatch logs, if a function execution requires a cold start, we see the `Init Duration` metric at the end of the execution.
110 | 
111 | For `rig-entertainer`, we can see that the average cold start time is **90.9ms**:
112 | ![Rig cold starts](/images/deploy_1/rig-coldstarts.png)
113 | Note that the function was affected by cold starts 9 times out of the 245 times it was executed, so roughly **3.7%** of the time.
114 | 
115 | ## LangChain Entertainer Agent App 🐍
116 | 
117 | We replicated the OpenAI entertainer agent using the [langchain](https://python.langchain.com/) Python library in this [mini Python app](https://github.com/garance-buricatu/rig-aws-lambda/tree/master/langchain-entertainer-lambda), which is also deployed to AWS Lambda in a function called `langchain-entertainer`.
118 | 
119 | Let's compare the metrics outlined above.
120 | 
121 | #### Deployment package
122 | 
123 | This is the code configuration of the `langchain-entertainer` function in AWS. The function's code package is a zip file including the lambda function code and all dependencies required for the lambda program to run.
124 | ![Deployment Package Python](/images/deploy_1/lc-deployment-package.png)
125 | 
126 | #### Memory, CPU, and runtime
127 | 
128 | The table below shows the lambda at varying memory configurations of 128MB, 256MB, 512MB, and 1024MB. When 128MB of memory is allocated, on average about **112MB** of memory is used; when more than 128MB is allocated, about **130MB** of memory is used and the **runtime is lower**.
129 | ![alt text](/images/deploy_1/lc-cw-logs.png)
130 | 
131 | Let's get some more averages for these metrics: I invoked `langchain-entertainer` 50 times for each memory configuration of 128MB, 256MB, 512MB, and 1024MB using the [power tuner tool](https://github.com/alexcasalboni/aws-lambda-power-tuning), and the results of those invocations were plotted in the graph below.
132 | 
133 | We can see that by increasing the memory allocation (and therefore computational power) of `langchain-entertainer`, the function becomes more performant (lower runtime). However, note that since you pay per GB-second, a more performant function is more expensive.
134 | ![alt text](/images/deploy_1/lc-power-tuner.png)
135 | 
136 | #### Cold starts ❄️
137 | 
138 | For `langchain-entertainer`, the average cold start time is **1,898.52ms**, i.e. about 20x the Rig app's cold start.
139 | ![Langchain cold starts](/images/deploy_1/lc-coldstarts.png)
140 | Note that the function was affected by cold starts 6 times out of the 202 times it was executed, so roughly **3.0%** of the time.
--------------------------------------------------------------------------------
/pages/docs/1_why_rig.mdx:
--------------------------------------------------------------------------------
1 | ---
2 | title: ❓Why Rig?
3 | description: This section contains the compelling reasons to use Rig for your next LLM project.
4 | ---
5 | 
6 | # Why Rig? 5 Compelling Reasons to Use Rig for Your Next LLM Project
7 | 
8 | **TL;DR**
9 | - **Rig**: Rust library simplifying LLM app development
10 | - **Developer-Friendly**: Intuitive API design, comprehensive documentation, and scalability from simple chatbots to complex AI systems.
11 | - **Key Features**: Unified API across LLM providers (OpenAI, Cohere); Streamlined embeddings and vector store support; High-level abstractions for complex AI workflows (e.g., RAG); Leverages Rust's performance and safety; Extensible for custom implementations
12 | - **Contribute**: Build with Rig, provide feedback, win $100
13 | - **Resources**: [GitHub](https://github.com/0xPlaygrounds/rig), [Examples](https://github.com/0xPlaygrounds/awesome-rig), [Docs](https://docs.rs/rig-core/latest/rig/).
14 | 
15 | 
16 | Large Language Models (LLMs) have become a cornerstone of modern AI applications. However, building with LLMs often involves wrestling with complex APIs and repetitive boilerplate code. That's where Rig comes in - a Rust library designed to simplify LLM application development.
17 | 
18 | In my [previous post](https://dev.to/0thtachi/rig-a-rust-library-for-building-llm-powered-applications-3g75), I introduced Rig. Today, we're diving deeper into five key features that make Rig a compelling choice for building your next LLM project.
19 | 
20 | ## 1. Unified API Across LLM Providers
21 | 
22 | One of Rig's standout features is its consistent API that works seamlessly with different LLM providers. This abstraction layer allows you to switch between providers or use multiple providers in the same project with minimal code changes.
23 | 
24 | Let's look at an example using both OpenAI and Cohere models with Rig:
25 | 
26 | ```rust
27 | use rig::providers::{openai, cohere};
28 | use rig::completion::Prompt;
29 | 
30 | #[tokio::main]
31 | async fn main() -> Result<(), Box<dyn std::error::Error>> {
32 |     // Initialize OpenAI client using environment variables
33 |     let openai_client = openai::Client::from_env();
34 |     let gpt4 = openai_client.model("gpt-4").build();
35 | 
36 |     // Initialize Cohere client with API key from environment variable
37 |     let cohere_client = cohere::Client::new(&std::env::var("COHERE_API_KEY")?);
38 |     let command = cohere_client.model("command").build();
39 | 
40 |     // Use OpenAI's GPT-4 to explain quantum computing
41 |     let gpt4_response = gpt4.prompt("Explain quantum computing in one sentence.").await?;
42 |     println!("GPT-4: {}", gpt4_response);
43 | 
44 |     // Use Cohere's Command model to explain quantum computing
45 |     let command_response = command.prompt("Explain quantum computing in one sentence.").await?;
46 |     println!("Cohere Command: {}", command_response);
47 | 
48 |     Ok(())
49 | }
50 | ```
51 | 
52 | In this example, we're using the same `prompt` method for both OpenAI and Cohere models. This consistency allows you to focus on your application logic rather than provider-specific APIs.
53 | 
54 | Here's what the output might look like:
55 | 
56 | ```plaintext
57 | GPT-4: Quantum computing utilizes quantum mechanics principles to perform complex computations exponentially faster than classical computers.
58 | Cohere Command: Quantum computing harnesses quantum superposition and entanglement to process information in ways impossible for classical computers.
59 | ```
60 | 
61 | As you can see, we've obtained responses from two different LLM providers using nearly identical code, showcasing Rig's unified API.
62 | 
63 | ## 2. Streamlined Embedding and Vector Store Support
64 | 
65 | Rig provides robust support for embeddings and vector stores, crucial components for semantic search and retrieval-augmented generation (RAG). Let's explore a simple example using Rig's in-memory vector store:
66 | 
67 | ```rust
68 | use rig::providers::openai;
69 | use rig::embeddings::EmbeddingsBuilder;
70 | use rig::vector_store::{in_memory_store::InMemoryVectorStore, VectorStore};
71 | 
72 | #[tokio::main]
73 | async fn main() -> Result<(), Box<dyn std::error::Error>> {
74 |     // Initialize OpenAI client and create an embedding model
75 |     let openai_client = openai::Client::from_env();
76 |     let embedding_model = openai_client.embedding_model("text-embedding-ada-002");
77 | 
78 |     // Create an in-memory vector store
79 |     let mut vector_store = InMemoryVectorStore::default();
80 | 
81 |     // Generate embeddings for two documents
82 |     let embeddings = EmbeddingsBuilder::new(embedding_model.clone())
83 |         .simple_document("doc1", "Rust is a systems programming language.")
84 |         .simple_document("doc2", "Python is known for its simplicity.")
85 |         .build()
86 |         .await?;
87 | 
88 |     // Add the embeddings to the vector store
89 |     vector_store.add_documents(embeddings).await?;
90 | 
91 |     // Create an index from the vector store
92 |     let index = vector_store.index(embedding_model);
93 |     // Query the index for the most relevant document to "What is Rust?"
94 |     let results = index.top_n_from_query("What is Rust?", 1).await?;
95 | 
96 |     // Print the most relevant document
97 |     println!("Most relevant document: {:?}", results[0].1.document);
98 | 
99 |     Ok(())
100 | }
101 | ```
102 | 
103 | This code demonstrates how to create embeddings, store them in a vector database, and perform a semantic search.
Rig handles the complex processes behind the scenes, making your life a lot easier.
104 | 
105 | When you run this code, you'll see output similar to:
106 | 
107 | ```plaintext
108 | Most relevant document: "Rust is a systems programming language."
109 | ```
110 | 
111 | This result shows that Rig's vector store successfully retrieved the most semantically relevant document to our query about Rust.
112 | 
113 | ## 3. Powerful Abstractions for Agents and RAG Systems
114 | 
115 | Rig provides high-level abstractions for building complex LLM applications, including agents and Retrieval-Augmented Generation (RAG) systems. Here's an example of setting up a RAG system with Rig:
116 | 
117 | ```rust
118 | use rig::providers::openai;
119 | use rig::embeddings::EmbeddingsBuilder;
120 | use rig::vector_store::{in_memory_store::InMemoryVectorStore, VectorStore};
121 | use rig::completion::Prompt;
122 | 
123 | #[tokio::main]
124 | async fn main() -> Result<(), Box<dyn std::error::Error>> {
125 |     // Initialize OpenAI client and create an embedding model
126 |     let openai_client = openai::Client::from_env();
127 |     let embedding_model = openai_client.embedding_model("text-embedding-ada-002");
128 | 
129 |     // Create an in-memory vector store
130 |     let mut vector_store = InMemoryVectorStore::default();
131 | 
132 |     // Generate embeddings for two documents about Rust
133 |     let embeddings = EmbeddingsBuilder::new(embedding_model.clone())
134 |         .simple_document("doc1", "Rust was initially designed by Graydon Hoare at Mozilla Research.")
135 |         .simple_document("doc2", "Rust's first stable release (1.0) was on May 15, 2015.")
136 |         .build()
137 |         .await?;
138 | 
139 |     // Add the embeddings to the vector store
140 |     vector_store.add_documents(embeddings).await?;
141 | 
142 |     // Create an index from the vector store
143 |     let index = vector_store.index(embedding_model);
144 | 
145 |     // Create a RAG agent using GPT-4 with the vector store as context
146 |     let rag_agent = openai_client.context_rag_agent("gpt-4")
147 |         .preamble("You are an assistant that answers questions about Rust programming language.")
148 |         .dynamic_context(1, index)
149 |         .build();
150 | 
151 |     // Use the RAG agent to answer a question about Rust
152 |     let response = rag_agent.prompt("When was Rust's first stable release?").await?;
153 |     println!("RAG Agent Response: {}", response);
154 | 
155 |     Ok(())
156 | }
157 | ```
158 | 
159 | This code sets up a complete RAG system with just a few lines of clean and readable code. Rig's abstractions allow you to focus on the high-level architecture of your application, rather than getting bogged down in the implementation details normally associated with building RAG agents.
160 | 
161 | Running this code will produce output similar to:
162 | 
163 | ```plaintext
164 | RAG Agent Response: Rust's first stable release (1.0) was on May 15, 2015.
165 | ```
166 | 
167 | This response demonstrates how the RAG agent uses information from the vector store to provide an accurate answer, showcasing Rig's ability to seamlessly combine retrieval and generation.
168 | 
169 | ## 4. Efficient Memory Management and Thread Safety
170 | 
171 | One of Rig's standout features is its ability to handle multiple LLM requests efficiently and safely, thanks to Rust's [ownership model](https://doc.rust-lang.org/book/ch04-00-understanding-ownership.html) and [zero-cost abstractions](https://rust-lang.github.io/async-book/01_getting_started/02_why_async.html#zero-cost-abstractions).
This makes Rig particularly well-suited for building high-performance, concurrent LLM applications.
172 | 
173 | Let's look at an example that demonstrates how Rig handles multiple LLM requests simultaneously:
174 | 
175 | ```rust
176 | use rig::providers::openai;
177 | use rig::completion::Prompt;
178 | use tokio::task;
179 | use std::time::Instant;
180 | use std::sync::Arc;
181 | 
182 | #[tokio::main]
183 | async fn main() -> Result<(), Box<dyn std::error::Error>> {
184 |     let openai_client = openai::Client::from_env();
185 |     let model = Arc::new(openai_client.model("gpt-3.5-turbo").build());
186 | 
187 |     let start = Instant::now();
188 |     let mut handles = vec![];
189 | 
190 |     // Spawn 10 concurrent tasks
191 |     for i in 0..10 {
192 |         let model_clone = Arc::clone(&model);
193 |         let handle = task::spawn(async move {
194 |             let prompt = format!("Generate a random fact about the number {}", i);
195 |             model_clone.prompt(&prompt).await
196 |         });
197 |         handles.push(handle);
198 |     }
199 | 
200 |     // Collect results
201 |     for handle in handles {
202 |         let result = handle.await??;
203 |         println!("Result: {}", result);
204 |     }
205 | 
206 |     println!("Time elapsed: {:?}", start.elapsed());
207 |     Ok(())
208 | }
209 | ```
210 | 
211 | This example showcases several key strengths of Rig as a Rust-native library:
212 | 
213 | 1. **Concurrent Processing**: We're able to handle multiple tasks or LLM requests that run concurrently by leveraging Rust's [async capabilities](https://rust-lang.github.io/async-book/01_getting_started/01_chapter.html) and [Tokio runtime](https://tokio.rs/) to significantly speed up batch operations.
214 | 
215 | 2. **Efficient Resource Sharing**: The LLM model is shared across all requests without the need for costly copying, saving memory and improving performance.
216 | 
217 | 3. **Memory Safety**: Despite the concurrent operations, Rust's ownership model and the use of `Arc` ensure that we don't run into [data races](https://doc.rust-lang.org/nomicon/races.html) or [memory leaks](https://doc.rust-lang.org/nomicon/leaking.html).
218 | 
219 | 4. **Resource Management**: Rig takes care of properly managing the lifecycle of resources like the OpenAI client and model, preventing resource leaks.
220 | 
221 | When you run this code, you'll see multiple generated facts printed concurrently, followed by the total time taken. The exact output will vary, but it might look something like this:
222 | 
223 | ```plaintext
224 | Result: The number 0 is the only number that is neither positive nor negative.
225 | Result: The number 1 is the only number that is neither prime nor composite.
226 | Result: The number 2 is the only even prime number.
227 | Result: The number 3 is the only prime number that is one less than a perfect square.
228 | ...
229 | 
230 | Time elapsed: 1.502160549s
231 | ```
232 | 
233 | This example shows how Rig enables you to easily create applications that can handle multiple LLM requests at once, maintaining high performance while ensuring safety. Whether you're building a chatbot that needs to handle multiple users simultaneously or a data processing pipeline that needs to analyze large volumes of text quickly, Rig provides the tools to do so efficiently and reliably.
234 | 
235 | ## 5. Extensibility for Custom Implementations
236 | 
237 | Rig is designed to be extensible, allowing you to implement custom functionality when needed.
For example, you can create custom tools for your agents:
238 | 
239 | ```rust
240 | use rig::tool::Tool;
241 | use rig::completion::ToolDefinition;
242 | use serde::{Deserialize, Serialize};
243 | use serde_json::json;
244 | 
245 | // Define the arguments for the addition operation
246 | #[derive(Deserialize)]
247 | struct AddArgs {
248 |     x: i32,
249 |     y: i32,
250 | }
251 | 
252 | // Define a custom error type for math operations
253 | #[derive(Debug, thiserror::Error)]
254 | #[error("Math error")]
255 | struct MathError;
256 | 
257 | // Define the Adder struct
258 | #[derive(Deserialize, Serialize)]
259 | struct Adder;
260 | 
261 | // Implement the Tool trait for Adder
262 | impl Tool for Adder {
263 |     const NAME: &'static str = "add";
264 | 
265 |     type Error = MathError;
266 |     type Args = AddArgs;
267 |     type Output = i32;
268 | 
269 |     // Define the tool's interface
270 |     async fn definition(&self, _prompt: String) -> ToolDefinition {
271 |         ToolDefinition {
272 |             name: "add".to_string(),
273 |             description: "Add two numbers".to_string(),
274 |             parameters: json!({
275 |                 "type": "object",
276 |                 "properties": {
277 |                     "x": {
278 |                         "type": "integer",
279 |                         "description": "First number to add"
280 |                     },
281 |                     "y": {
282 |                         "type": "integer",
283 |                         "description": "Second number to add"
284 |                     }
285 |                 },
286 |                 "required": ["x", "y"]
287 |             }),
288 |         }
289 |     }
290 | 
291 |     // Implement the addition operation
292 |     async fn call(&self, args: Self::Args) -> Result<Self::Output, Self::Error> {
293 |         Ok(args.x + args.y)
294 |     }
295 | }
296 | ```
297 | 
298 | This code defines a custom tool (Adder) that can be used within a Rig agent. While it doesn't produce output on its own, when used in an agent, it would add two numbers. For instance, if an agent uses this tool with arguments `x: 5` and `y: 3`, it would return `8`.
299 | 
300 | This extensibility allows Rig to grow with your project's needs, accommodating custom functionality without sacrificing the convenience of its high-level abstractions.
301 | 
--------------------------------------------------------------------------------
/pages/guides/3_deploy/Blog_2_aws_lambda_lancedb.mdx:
--------------------------------------------------------------------------------
1 | # ⚡🦀 Deploy a Blazing-Fast & Lightweight LLM App with Rust-Rig-LanceDB
2 | 
3 | ## TL;DR
4 | 
5 | * A step-by-step walkthrough on deploying an LLM app using [`Rig`](https://github.com/0xPlaygrounds/rig) & [`LanceDB`](https://lancedb.com) on AWS Lambda. You'll learn how to prepare your app, choose the right storage backend (like S3 or EFS), and optimize performance by efficiently using cloud metrics.
6 | * **Stats: Rig RAG Agent using LanceDB on AWS:**
7 |   1. Low memory usage (96MB - 113MB)
8 |   2. Fast cold starts (consistently 160ms)
9 | * **Stats: LangChain RAG Agent using LanceDB on AWS:**
10 |   1. Higher memory usage (246MB - 360MB)
11 |   2. Slower cold starts (1,900ms - 2,700ms)
12 | * [**Jump to Metrics** ⏬](#final-comparison-between-rig-and-langchain)
13 | 
14 | Welcome back to **Deploy Your Rig Application**! Apps built with Rig vary in complexity based on LLM usage, vector databases for RAG, and infrastructure deployment. This series explores various configurations for production use.
15 | 
16 | ⭐ **Today's Highlight**: Rig's **LanceDB integration**! ⭐
17 | 
18 | We'll deploy a Rig agent using OpenAI's `text-embedding-ada-002` and `GPT-4o`, relying on the [LanceDB vector store](https://lancedb.com) and deployed on [AWS Lambda](https://aws.amazon.com/lambda/).
19 | 20 | > *💡 If you're new to Rig and want to start from the beginning or are looking for additional tutorials, check out our [blog series](https://rig.rs/build-with-rig-guide.html).* 21 | 22 | Let’s dive in! 23 | 24 | ## Prerequisites 25 | 26 | Before we begin building, ensure you have the following: 27 | 28 | >❗ We will *not* be covering how to write your RAG app with Rig, only how to deploy it. So make sure you read [this tutorial](https://dev.to/0thtachi/build-a-fast-and-lightweight-rust-vector-search-app-with-rig-lancedb-57h2) first to help you code your application. 29 | 30 | * A clone of the [`rig-montreal-lancedb`](https://github.com/garance-buricatu/rig-aws/tree/master/rig-montreal-lancedb) crate, which includes two separate binaries: a [`loader`](https://github.com/garance-buricatu/rig-aws/blob/master/rig-montreal-lancedb/src/bin/loader.rs) (writes data to LanceDB) and an [`app`](https://github.com/garance-buricatu/rig-aws/blob/master/rig-montreal-lancedb/src/bin/app.rs) (performs RAG on LanceDB). 31 | * An AWS account and some background knowledge on deployments on AWS, including CloudFormation templates 32 | * An OpenAI API key 33 | 34 | ## Our use case: Montreal 🌇 35 | 36 | The app in [`rig-montreal-lancedb`](https://github.com/garance-buricatu/rig-aws/tree/master/rig-montreal-lancedb) RAGs data from [Montreal open data](https://donnees.montreal.ca). The Montréal municipality generates and manages large quantities of data through its activities, such as data about agriculture, politics, transportation, health, and much more. The open data portal publishes all these datasets and makes them freely accessible to all citizens! Our app will index the metadata of all the public datasets so that a user can ask questions pertaining to the open data. 37 | The `loader` binary indexes all dataset metadata (name, description, tags, ...) into LanceDB and the `app` binary performs vector search on the data based on a prompt. For example: 38 | > **Prompt:** Give me information on gaseous pollutants in Montreal. How are the concentrations measured? 39 | > **App answer:** The concentrations of gaseous pollutants in Montreal are measured through the Réseau de surveillance de la qualité de l'air (RSQA), which is a network of measurement stations located on the Island of Montreal. These stations continuously determine the atmospheric concentration of various pollutants. The data is transmitted via telemetry, ... 40 | 41 | ## LanceDB Quick Overview 💾 42 | 43 | [**Lance**](https://github.com/lancedb/lance) is an **open-source columnar data format** designed for performant ML workloads. 44 | 45 | * Written in Rust 🦀. 46 | * Native support for storing, querying and filtering vectors, deeply nested data and multi-modal data (text, images, videos, point clouds, and more). 47 | * Support for vector similarity search, full-text search and SQL. 48 | * Interoperable with other columnar formats (such as Parquet) via [Arrow](https://arrow.apache.org/overview/) 49 | * Disk-based indexes and storage. 50 | * Built to scale to hundreds of terabytes of data. 51 | 52 | [**LanceDB**](https://lancedb.github.io/lancedb/) is an **open-source vector database**. 53 | 54 | * Written in Rust 🦀. 55 | * Built on top of Lance. 56 | * Support for Python, JavaScript, and Rust client libraries to interact with the database. 57 | * Allows storage of raw data, metadata, and embeddings all at once.
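To get a feel for the Rust client before we talk deployment, here's a minimal sketch that connects to a local Lance database and opens a table. The path and table name are illustrative placeholders (not taken from the `rig-montreal-lancedb` crate), and the builder-style calls follow recent versions of the `lancedb` crate, so double-check against the version you're using:

```rust
use std::error::Error;

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
    // Connect to a Lance database on the local file system.
    // Swap the path for an S3 URI or EFS mount point; the storage
    // options below use exactly the same call.
    let db = lancedb::connect("data/montreal-lancedb").execute().await?;

    // Open an existing table ("datasets" is a hypothetical name here).
    let table = db.open_table("datasets").execute().await?;
    println!("Opened table: {}", table.name());

    Ok(())
}
```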
58 | 59 | ## LanceDB Storage Options 60 | 61 | LanceDB's underlying optimized storage format, `lance`, is flexible enough to be supported by various storage backends, such as local NVMe, [EBS](https://aws.amazon.com/ebs/), [EFS](https://aws.amazon.com/efs/), [S3](https://aws.amazon.com/s3/), and other third-party APIs that connect to the cloud. 62 | 63 | > 💡 All you need to do to use a specific storage backend is define its connection string in the LanceDB client! 64 | 65 | Let's go through some storage options that are compatible with AWS Lambda! 66 | 67 | ## S3 - Object Store 68 | > 69 | > ❕Data is stored as individual objects *all at the same level*. 70 | > ❕Objects are tracked by a distributed hash table (DHT), where each object is identified by a unique ID. 71 | 72 | | **Pros of Object Stores** | **Cons of Object Stores** | 73 | |---------------------------------------------------|-----------------------------------------------| 74 | | **Unlimited scaling** ♾️: Objects can be stored across distributed systems, eliminating single-node limitations. This is ideal for ML and AI applications handling large data volumes. | **Higher latency** 🚚: Accessing a remote object store over a network via HTTP/HTTPS adds overhead compared to file system protocols like NFS. Additionally, storing metadata separately from objects introduces some retrieval latency. | 75 | | **Cheap** 💸: The simple storage design makes it more affordable than traditional file systems. | | 76 | | **Highly available** and **resilient** 💪: Affordable storage allows for redundant data storage within and across data centers. | | 77 | 78 | ### S3 + LanceDB setup on AWS Lambda 79 | 80 | **⚠️ Important**: S3 does **not support concurrent writes**. If multiple processes attempt to write to the same table simultaneously, it could lead to data corruption. But there's a solution! Use the [DynamoDB commit store feature in LanceDB](https://lancedb.github.io/lancedb/guides/storage/#dynamodb-commit-store-for-concurrent-writes) to prevent this. 81 | 82 | --- 83 | 84 | #### Part I - Write lambda function code 85 | 86 | 1. Create an [S3 Bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/create-bucket-overview.html) where your Lance database will be stored. Ours is called: `rig-montreal-lancedb`. 87 | 2. In the lambda code, connect to the store via the [`LanceDB client`](https://docs.rs/lancedb/latest/lancedb/connection/struct.Connection.html) like so: 88 | 89 | ```rust 90 | // Note: Create the s3://rig-montreal-lancedb bucket beforehand 91 | let db = lancedb::connect("s3://rig-montreal-lancedb").execute().await?; 92 | // OR 93 | let db = lancedb::connect("s3+ddb://rig-montreal-lancedb?ddbTableName=my-dynamodb-table").execute().await?; 94 | ``` 95 | 96 | #### Part II - Deploy lambdas 97 | > 98 | > 💡 Need a refresher on Lambda deployments? Check out our [previous blog](https://dev.to/garance_buricatu_a6864136/how-to-deploy-your-rig-app-on-aws-lambda-a-step-by-step-guide-2ge5) for a full walkthrough. 99 | 100 | ```bash 101 | # Lambda that writes to the store 102 | cargo lambda build --release --bin loader 103 | cargo lambda deploy --binary-name loader 104 | 105 | # Lambda that reads from the store 106 | cargo lambda build --release --bin app 107 | cargo lambda deploy --binary-name app 108 | ``` 109 | 110 | > 💡 Don’t forget to set the necessary [IAM permissions](https://lancedb.github.io/lancedb/guides/storage/#aws-iam-permissions)! Your lambda functions need appropriate access to the S3 bucket — whether it’s read, write, or both.
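Since the backend is just a connection string, you can keep the binaries themselves storage-agnostic. As a small illustration (our own convention for this sketch, not something the `rig-montreal-lancedb` crate defines), you could select the backend through an environment variable, so the exact same Lambda code can point at plain S3, the DynamoDB-backed commit store, or the file-system options covered next:

```rust
use std::env;
use std::error::Error;

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
    // LANCEDB_URI is a made-up variable name for this sketch. Examples:
    //   s3://rig-montreal-lancedb
    //   s3+ddb://rig-montreal-lancedb?ddbTableName=my-dynamodb-table
    //   /tmp       (Lambda ephemeral storage, see below)
    //   /mnt/efs   (EFS mount point, see below)
    let uri = env::var("LANCEDB_URI").unwrap_or_else(|_| "/tmp".to_string());

    // Same call regardless of backend; only the connection string changes.
    let db = lancedb::connect(&uri).execute().await?;

    // ... hand `db` to the loader or the RAG logic ...
    let _ = db;
    Ok(())
}
```

You would set `LANCEDB_URI` in each Lambda's environment variables, and the rest of the code never needs to know which backend it is talking to.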
111 | 112 | ## Lambda ephemeral storage - Local file system 113 | 114 | [Lambda ephemeral storage](https://docs.aws.amazon.com/lambda/latest/dg/configuration-ephemeral-storage.html) is **temporary and unique** to each execution environment; it is not intended for persistent storage. In other words, any LanceDB store created on ephemeral storage during the lambda execution will be wiped when the function cold starts. 115 | This option suits very specific use cases (mostly testing) where writing to the store needs to happen in the same process as reading, and data is only read by a single lambda execution. 116 | 117 | Ephemeral storage in a lambda is found in the `/tmp` directory. All you need to do is: 118 | 119 | ```rust 120 | let db = lancedb::connect("/tmp").execute().await?; 121 | ``` 122 | 123 | ## EFS - Virtual file system 124 | > 125 | > ❕A **serverless**, **elastic**, **shared file system** designed to be consumed by AWS services like EC2 and Lambda. 126 | > ❕Data is **persisted** and can be shared across lambda invocations (unlike the S3-without-commit-store and ephemeral storage options above). 127 | > ❕Supports up to 25,000 **concurrent connections**. 128 | 129 | | **Pros of EFS** | **Cons of EFS** | 130 | |:---------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------------------------------------------| 131 | | **Stateful lambda**: Mounting an EFS instance on a lambda function provides knowledge of previous and concurrent executions. | **Development time**: More involved cloud setup | 132 | | **Low latency** ⚡: A lambda function resides in the same **VPC** as the EFS instance, allowing low-latency network calls via the **NFS** protocol. | **Cost** 💲: More expensive than S3 | 133 | 134 | ### EFS + LanceDB setup on AWS Lambda 135 | > 136 | > 💡 Setting up EFS in the cloud can be intricate, so you can use our [CloudFormation template](https://github.com/garance-buricatu/rig-aws/blob/master/rig-montreal-lancedb/template.yaml) to streamline the deployment process. 137 | 138 | #### Part I - Build Rust code and upload zip files to S3 139 | 140 | In the lambda code, connect to the store via the [LanceDB client](https://docs.rs/lancedb/latest/lancedb/connection/struct.Connection.html) like so: 141 | 142 | ```rust 143 | let db = lancedb::connect("/mnt/efs").execute().await?; 144 | ``` 145 | 146 | Then, compile your code, zip the binaries, and upload them to S3: 147 | 148 | ```bash 149 | # Can also do this directly on the AWS console 150 | aws s3api create-bucket --bucket <bucket-name> 151 | 152 | cargo lambda build --release --bin loader 153 | cargo lambda build --release --bin app 154 | 155 | cd target/lambda/loader 156 | zip -r bootstrap.zip bootstrap 157 | # Can also do this directly on the AWS console 158 | aws s3 cp bootstrap.zip s3://<bucket-name>/rig/loader/ 159 | 160 | cd ../app 161 | zip -r bootstrap.zip bootstrap 162 | # Can also do this directly on the AWS console 163 | aws s3 cp bootstrap.zip s3://<bucket-name>/rig/app/ 164 | 165 | ``` 166 | 167 | #### Part II - Understand the CloudFormation template 168 | 169 | The [template](https://github.com/garance-buricatu/rig-aws/blob/master/rig-montreal-lancedb/template.yaml) assumes that your AWS account already has the following resources: 170 | 171 | 1. A **VPC** with at least two private subnets in separate availability zones, each with public internet access. 172 | 2. An **S3 bucket** (as created in Part I) for storing Lambda code.
173 | 174 | > 💡 If you’re missing these resources, follow this AWS [tutorial](https://docs.aws.amazon.com/vpc/latest/userguide/vpc-example-private-subnets-nat.html) to set up a basic VPC and subnets. 175 | 176 | **EFS setup** 177 | 178 | 1. **Mount Targets:** Create two mount targets for your EFS instance — one in each subnet (specified in the `Parameters` section of the CFT template). 179 | 2. **Security Groups:** Set up an EFS security group with rules to allow **NFS traffic** from your Lambda functions’ security group. 180 | 181 | **Lambda functions setup** 182 | 183 | 1. **Loader and App Lambdas:** Deploy both Lambda functions (`loader` and `app`) in the same subnets as your EFS mount targets. 184 | 2. **Security Groups:** Assign a security group that enables access to the EFS security group and the public internet. 185 | 3. **EFS Mounting:** Configure the Lambdas to mount the EFS targets at `/mnt/efs`. 186 | 187 | > 💡 Once everything’s ready, deploy the CloudFormation template to launch your environment with just one click! 188 | --- 189 | 190 | ### Metrics on the cloud ☁️ 191 | 192 | If you've made it this far, you have the Montreal Rig app deployed on AWS Lambda with EFS as the LanceDB storage backend! 🎉 Now we want to look at some metrics when the app runs in the cloud. 193 | 194 | For reference, we replicated the Montreal agent using [LangChain 🐍](https://python.langchain.com/) in this [Python project](https://github.com/garance-buricatu/rig-aws/tree/master/langchain-montreal-lancedb), which contains the source code for the [`loader`](https://github.com/garance-buricatu/rig-aws/blob/master/langchain-montreal-lancedb/loader.py) and [`app`](https://github.com/garance-buricatu/rig-aws/blob/master/langchain-montreal-lancedb/app.py) lambdas. The Python app uses the same LanceDB vector store on the same EFS instance as the Rig app. To see how the Python app was configured in the cloud, take a look at the [CloudFormation template](https://github.com/garance-buricatu/rig-aws/blob/master/rig-montreal-lancedb/template.yaml). 195 | 196 | Let's compare them! 197 | 198 | ### Rig - Memory, runtime, and cold starts 199 | 200 | We invoked the `app` function 50 times at each memory configuration (128MB, 256MB, 512MB, and 1024MB) using the [power tuner tool](https://github.com/alexcasalboni/aws-lambda-power-tuning). 201 | The CloudWatch query below gathers averages for runtime, memory usage, and cold starts of the lambda over the 50 invocations. 202 | 203 | ```sql 204 | filter @type = "REPORT" 205 | | stats 206 | avg(@maxMemoryUsed) / 1000000 as MemoryUsageMB, 207 | avg(@duration) / 1000 as AvgDurationSec, 208 | max(@duration) / 1000 as MaxDurationSec, 209 | min(@duration) / 1000 as MinDurationSec, 210 | avg(@initDuration) / 1000 as AvgColdStartTimeSec, 211 | count(*) as NumberOfInvocations, 212 | sum(@initDuration > 0) as ColdStartInvocations 213 | by bin(1d) as TimeRange, @memorySize / 1000000 as MemoryConfigurationMB 214 | ``` 215 | 216 | ![Rig CloudWatch metrics](/images/deploy_2/rig-metrics.png) 217 | 218 | > **Memory and runtime analysis** 219 | > At the memory configuration of **128MB**, the lambda has the lowest average memory usage of **96.1 MB** and the highest runtime of **5.1s**. At a memory configuration of **1GB**, the lambda has the highest average memory usage of **113.1 MB** and the lowest runtime of **4.4s**. In other words, with an extra ~7MB of memory usage, the lambda function was 700ms faster.
220 | 221 | > **Cold starts analysis** ❄️ 222 | > The average initialization time remains steady at around **0.16s**. 223 | 224 | The chart below shows the power tuner results after running the app 50 times with each of the 4 memory configurations. 225 | ![Rig power tuner results](/images/deploy_2/rig-power-tuner.png) 226 | 227 | We see that adding memory to the function (and therefore adding computational power) **does improve the performance of the lambda, but by less than a second**. 228 | 229 | ### LangChain - Memory, runtime, and cold starts 230 | 231 | #### Deployment package 232 | 233 | We are not able to use zip files for the deployment package of the lambda functions, as the zip size exceeds the maximum allowed by AWS. The [loader dependencies](https://github.com/garance-buricatu/rig-aws/blob/master/langchain-montreal-lancedb/loader_requirements.txt) and [app dependencies](https://github.com/garance-buricatu/rig-aws/blob/master/langchain-montreal-lancedb/app_requirements.txt) produce zip files of around 150 MB. 234 | 235 | Instead, we must use container images. The [docker image](https://github.com/garance-buricatu/rig-aws/blob/master/langchain-montreal-lancedb/Dockerfile) is 471.45MB, built from the base Python Lambda image. 236 | ![Deployment Package Python](/images/deploy_2/lc-deployment-package.png) 237 | 238 | We ran the same experiment as with the Rig app above and got the following metrics: 239 | ![LangChain CloudWatch metrics](/images/deploy_2/lc-metrics.png) 240 | 241 | First of all, the function is **unable to run with a memory allocation of 128MB**. It gets killed at this allocation size due to lack of memory. So we will compare the following three memory configurations: 256MB, 512MB, 1GB. 242 | 243 | > **Memory and runtime analysis** 244 | > At the memory configuration of **256MB**, the lambda has the lowest average memory usage of **245.8 MB** and the highest runtime of **4.9s**. At a memory configuration of **1GB**, the lambda has the highest average memory usage of **359.6 MB** and the lowest runtime of **4.0s**. In other words, with an extra **~113MB** of memory usage, the lambda function was 1s faster. 245 | 246 | > **Cold starts analysis** ❄️ 247 | > The average initialization time increases with the memory configuration, the lowest being **1.9s** and the highest **2.7s**. 248 | 249 | The chart below shows the power tuner results after running the app 50 times with each of the 4 memory configurations. 250 | ![LangChain power tuner results](/images/deploy_2/lc-power-tuner.png) 251 | 252 | We see that adding memory to the function (and therefore adding computational power) also improves the performance of the lambda, by about a second. 253 | 254 | ### Final Comparison between Rig and LangChain 255 | 256 | Based on the CloudWatch logs produced by both the Rig and LangChain lambdas, we were able to produce the following charts: 257 | ![Memory usage comparison](/images/deploy_2/memory_usage.png) 258 | ![Cold starts comparison](/images/deploy_2/cold_starts.png) 259 | -------------------------------------------------------------------------------- /pages/guides/2_advanced/22_flight_assistant.mdx: -------------------------------------------------------------------------------- 1 | # Build a Flight Search AI Agent with Rig 2 | 3 | **TL;DR**: This step-by-step guide will teach you how to build a Flight Search AI Assistant in Rust using the [Rig](https://github.com/0xPlaygrounds/rig) library. By the end, you'll have a functional AI agent that can find the cheapest flights between two airports.
Along the way, you'll grasp Rust fundamentals, understand how to set up AI agents with custom tools, and see how Rig simplifies the process. 4 | 5 | --- 6 | 7 | ## Introduction 8 | 9 | Ever chatted with AI assistants like Siri, Alexa, or even those nifty chatbots that help you book flights or check the weather? Ever wondered what's happening under the hood? Today, we're going to demystify that by building our very own Flight Search AI Assistant using **[Rust](https://www.rust-lang.org/)** and the **[Rig](https://github.com/0xPlaygrounds/rig)** library. 10 | 11 | You might be thinking, *"Wait, Rust? Isn't that the language with the reputation for being hard?"* Don't worry! We'll walk through everything step by step, explaining concepts as we go. By the end, not only will you have a cool AI assistant, but you'll also have dipped your toes into Rust programming. 12 | 13 | Here's our game plan: 14 | 15 | - **Why Rust and Rig?** Understanding our tools of choice. 16 | - **Setting Up the Environment**: Getting Rust and Rig ready to roll. 17 | - **Understanding Agents and Tools**: The brains and hands of our assistant. 18 | - **Building the Flight Search Tool**: Where the magic happens. 19 | - **Creating the AI Agent**: Bringing our assistant to life. 20 | - **Running and Testing**: Seeing our creation in action. 21 | - **Wrapping Up**: Recap and next steps. 22 | 23 | > *Full source code for this project can be found on our [Replit Page](https://replit.com/@playgrounds/travelplanningagent) and [GitHub](https://github.com/0xPlaygrounds/awesome-rig/tree/main/travel_flights_planning_agent)* 24 | 25 | Sound exciting? Let's dive in! 26 | 27 | --- 28 | 29 | ## Why Rust and Rig? 30 | 31 | ### Why Rust? 32 | 33 | [Rust](https://www.rust-lang.org/) is a systems programming language known for its performance and safety. But beyond that, Rust has been making waves in areas like web development, game development, and now, AI applications. Here's why we're using Rust: 34 | 35 | - **Performance**: Rust is blazingly fast, making it ideal for applications that need to handle data quickly. 36 | - **Safety**: With its strict compiler checks, Rust ensures [memory safety](https://doc.rust-lang.org/book/ch04-00-understanding-ownership.html), preventing common bugs. 37 | - **Concurrency**: Rust makes it easier to write concurrent programs, which is great for handling multiple tasks simultaneously. Learn more about Rust's [concurrency model](https://doc.rust-lang.org/book/ch16-00-concurrency.html). 38 | 39 | ### Why Rig? 40 | 41 | [Rig](https://github.com/0xPlaygrounds/rig) is an open-source Rust library that simplifies building applications powered by Large Language Models (LLMs) like GPT-4. Think of Rig as a toolkit that provides: 42 | 43 | - **Unified API**: It abstracts away the complexities of different LLM providers. 44 | - **High-Level Abstractions**: Helps you build agents and tools without reinventing the wheel. 45 | - **Extensibility**: You can create custom tools tailored to your application's needs. 46 | 47 | By combining Rust and Rig, we're setting ourselves up to build a robust, efficient, and intelligent assistant. 48 | 49 | --- 50 | 51 | ## Setting Up the Environment 52 | 53 | Before we start coding, let's get everything ready. 54 | 55 | ### Prerequisites 56 | 57 | 1. **Install Rust**: If you haven't already, install Rust by following the instructions [here](https://www.rust-lang.org/tools/install). 58 | 59 | 2. **Basic Rust Knowledge**: Don't worry if you're new.
We'll explain the Rust concepts as we encounter them. 60 | 61 | 3. **API Keys**: 62 | - **OpenAI API Key**: Sign up and get your key [here](https://platform.openai.com/). 63 | - **RapidAPI Key**: We'll use this to access the Tripadvisor flight search API. Get it [here](https://rapidapi.com/apidojo/api/tripadvisor1/). 64 | 65 | ### Project Setup 66 | 67 | #### 1. Create a New Rust Project 68 | 69 | Open your terminal and run: 70 | 71 | ```bash 72 | cargo new flight_search_assistant 73 | cd flight_search_assistant 74 | ``` 75 | 76 | This initializes a new Rust project named `flight_search_assistant`. 77 | 78 | #### 2. Update `Cargo.toml` 79 | 80 | Open the `Cargo.toml` file and update it with the necessary dependencies: 81 | 82 | ```toml 83 | [package] 84 | name = "flight_search_assistant" 85 | version = "0.1.0" 86 | edition = "2021" 87 | 88 | [dependencies] 89 | rig-core = "0.1.0" 90 | tokio = { version = "1.34.0", features = ["full"] } 91 | serde = { version = "1.0", features = ["derive"] } 92 | serde_json = "1.0" 93 | reqwest = { version = "0.11", features = ["json", "tls"] } 94 | dotenv = "0.15" 95 | thiserror = "1.0" 96 | chrono = { version = "0.4", features = ["serde"] } 97 | ``` 98 | 99 | Here's a quick rundown: 100 | 101 | - **[rig-core](https://crates.io/crates/rig-core)**: The core Rig library. 102 | - **[tokio](https://crates.io/crates/tokio)**: Asynchronous runtime for Rust. Think of it as the engine that allows us to perform tasks concurrently. 103 | - **[serde](https://serde.rs/)** & **[serde_json](https://crates.io/crates/serde_json)**: Libraries for serializing and deserializing data (converting between Rust structs and JSON). 104 | - **[reqwest](https://crates.io/crates/reqwest)**: An HTTP client for making API requests. 105 | - **[dotenv](https://crates.io/crates/dotenv)**: Loads environment variables from a `.env` file. 106 | - **[thiserror](https://crates.io/crates/thiserror)**: A library for better error handling. 107 | - **[chrono](https://crates.io/crates/chrono)**: For handling dates and times. 108 | 109 | #### 3. Set Up Environment Variables 110 | 111 | We don't want to hard-code our API keys for security reasons. Instead, we'll store them in a `.env` file. 112 | 113 | Create the file: 114 | 115 | ```bash 116 | touch .env 117 | ``` 118 | 119 | Add your API keys to `.env`: 120 | 121 | ```dotenv 122 | OPENAI_API_KEY=your_openai_api_key_here 123 | RAPIDAPI_KEY=your_rapidapi_key_here 124 | ``` 125 | 126 | *Remember to replace the placeholders with your actual keys.* 127 | 128 | #### 4. Install Dependencies 129 | 130 | Back in your terminal, run: 131 | 132 | ```bash 133 | cargo build 134 | ``` 135 | 136 | This will download and compile all the dependencies. 137 | 138 | --- 139 | 140 | ## Understanding Agents and Tools 141 | 142 | Before we jump into coding, let's clarify some key concepts. 143 | 144 | ### What Are Agents? 145 | 146 | In the context of Rig (and AI applications in general), an **Agent** is like the brain of your assistant. It's responsible for interpreting user inputs, deciding what actions to take, and generating responses. 147 | 148 | Think of the agent as the conductor of an orchestra, coordinating different instruments (or tools) to create harmonious music (or responses). 149 | 150 | ### What Are Tools? 151 | 152 | **Tools** are the skills or actions that the agent can use to fulfill a task. Each tool performs a specific function. In our case, the flight search functionality is a tool that the agent can use to find flight information.
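Concretely, a tool in Rig is just a type that implements the `Tool` trait. Here's the rough shape as a stripped-down sketch; the `NAME` constant, associated types, and the `definition`/`call` methods are exactly what we'll fill in for real later in this guide (the placeholder names and `todo!()` bodies here are ours):

```rust
use rig::completion::ToolDefinition;
use rig::tool::Tool;

struct MyTool;

impl Tool for MyTool {
    // The name the LLM uses to refer to this tool
    const NAME: &'static str = "my_tool";

    type Args = ();              // input the agent passes in, deserialized from JSON
    type Output = String;        // what the tool hands back to the agent
    type Error = std::io::Error; // placeholder error type for this sketch

    // Describes the tool (name, description, JSON parameter schema) to the LLM
    async fn definition(&self, _prompt: String) -> ToolDefinition {
        todo!("describe the tool and its parameters")
    }

    // Performs the actual work when the agent decides to use the tool
    async fn call(&self, _args: Self::Args) -> Result<Self::Output, Self::Error> {
        todo!("do the work and return the result")
    }
}
```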
153 | 154 | Continuing our analogy, tools are the instruments in the orchestra. Each one plays a specific role. 155 | 156 | ### How Do They Work Together? 157 | 158 | When a user asks, "Find me flights from NYC to LA," the agent processes this request and decides it needs to use the flight search tool to fetch the information. 159 | 160 | --- 161 | 162 | ## Building the Flight Search Tool 163 | 164 | Now, let's build the tool that will handle flight searches. 165 | 166 | ### 1. Create the Tool File 167 | 168 | In your `src` directory, create a new file named `flight_search_tool.rs`: 169 | 170 | ```bash 171 | touch src/flight_search_tool.rs 172 | ``` 173 | 174 | ### 2. Import Necessary Libraries 175 | 176 | Open `flight_search_tool.rs` and add: 177 | 178 | ```rust 179 | use chrono::{DateTime, Duration, Utc}; 180 | use rig::completion::ToolDefinition; 181 | use rig::tool::Tool; 182 | use serde::{Deserialize, Serialize}; 183 | use serde_json::{json, Value}; 184 | use std::collections::HashMap; 185 | use std::env; 186 | ``` 187 | 188 | ### 3. Define Data Structures 189 | 190 | We'll define structures to handle input arguments and output results. 191 | 192 | ```rust 193 | #[derive(Deserialize)] 194 | pub struct FlightSearchArgs { 195 | source: String, 196 | destination: String, 197 | date: Option<String>, 198 | sort: Option<String>, 199 | service: Option<String>, 200 | itinerary_type: Option<String>, 201 | adults: Option<u32>, 202 | seniors: Option<u32>, 203 | currency: Option<String>, 204 | nearby: Option<bool>, 205 | nonstop: Option<bool>, 206 | } 207 | 208 | #[derive(Serialize)] 209 | pub struct FlightOption { 210 | pub airline: String, 211 | pub flight_number: String, 212 | pub departure: String, 213 | pub arrival: String, 214 | pub duration: String, 215 | pub stops: usize, 216 | pub price: f64, 217 | pub currency: String, 218 | pub booking_url: String, 219 | } 220 | ``` 221 | 222 | - **`FlightSearchArgs`**: Represents the parameters the user provides. 223 | - **`FlightOption`**: Represents each flight option we'll display to the user. 224 | 225 | *Want to dive deeper? Check out [Rust's struct documentation](https://doc.rust-lang.org/book/ch05-01-defining-structs.html).* 226 | 227 | ### 4. Error Handling with `thiserror` 228 | 229 | Rust encourages us to handle errors explicitly. We'll define a custom error type: 230 | 231 | ```rust 232 | #[derive(Debug, thiserror::Error)] 233 | pub enum FlightSearchError { 234 | #[error("HTTP request failed: {0}")] 235 | HttpRequestFailed(String), 236 | #[error("Invalid response structure")] 237 | InvalidResponse, 238 | #[error("API error: {0}")] 239 | ApiError(String), 240 | #[error("Missing API key")] 241 | MissingApiKey, 242 | } 243 | ``` 244 | 245 | This makes it easier to manage different kinds of errors that might occur during the API call. 246 | 247 | *Learn more about [error handling in Rust](https://doc.rust-lang.org/book/ch09-02-recoverable-errors-with-result.html).* 248 | 249 | ### 5. Implement the `Tool` Trait 250 | 251 | Now, we'll implement the `Tool` trait for our `FlightSearchTool`.
252 | 253 | First, define the tool: 254 | 255 | ```rust 256 | pub struct FlightSearchTool; 257 | ``` 258 | 259 | Implement the trait: 260 | 261 | ```rust 262 | impl Tool for FlightSearchTool { 263 | const NAME: &'static str = "search_flights"; 264 | 265 | type Args = FlightSearchArgs; 266 | type Output = String; 267 | type Error = FlightSearchError; 268 | 269 | async fn definition(&self, _prompt: String) -> ToolDefinition { 270 | ToolDefinition { 271 | name: Self::NAME.to_string(), 272 | description: "Search for flights between two airports".to_string(), 273 | parameters: json!({ 274 | "type": "object", 275 | "properties": { 276 | "source": { "type": "string", "description": "Source airport code (e.g., 'JFK')" }, 277 | "destination": { "type": "string", "description": "Destination airport code (e.g., 'LAX')" }, 278 | "date": { "type": "string", "description": "Flight date in 'YYYY-MM-DD' format" }, 279 | }, 280 | "required": ["source", "destination"] 281 | }), 282 | } 283 | } 284 | 285 | async fn call(&self, args: Self::Args) -> Result<Self::Output, Self::Error> { 286 | // We'll implement the logic for calling the flight search API next. 287 | Ok("Flight search results".to_string()) 288 | } 289 | } 290 | ``` 291 | 292 | - **`definition`**: Provides metadata about the tool. 293 | - **`call`**: The function that will be executed when the agent uses this tool. 294 | 295 | *Curious about traits? Explore [Rust's trait system](https://doc.rust-lang.org/book/ch10-02-traits.html).* 296 | 297 | ### 6. Implement the `call` Function 298 | 299 | Now, let's flesh out the `call` function. 300 | 301 | #### a. Fetch the API Key 302 | 303 | ```rust 304 | let api_key = env::var("RAPIDAPI_KEY").map_err(|_| FlightSearchError::MissingApiKey)?; 305 | ``` 306 | 307 | We retrieve the API key from the environment variables. 308 | 309 | #### b. Set Default Values 310 | 311 | ```rust 312 | let date = args.date.unwrap_or_else(|| { 313 | let date = Utc::now() + Duration::days(30); 314 | date.format("%Y-%m-%d").to_string() 315 | }); 316 | ``` 317 | 318 | If the user doesn't provide a date, we'll default to 30 days from now. 319 | 320 | #### c. Build Query Parameters 321 | 322 | ```rust 323 | let mut query_params = HashMap::new(); 324 | query_params.insert("sourceAirportCode", args.source); 325 | query_params.insert("destinationAirportCode", args.destination); 326 | query_params.insert("date", date); 327 | ``` 328 | 329 | #### d. Make the API Request 330 | 331 | ```rust 332 | let client = reqwest::Client::new(); 333 | let response = client 334 | .get("https://tripadvisor16.p.rapidapi.com/api/v1/flights/searchFlights") 335 | .headers({ 336 | let mut headers = reqwest::header::HeaderMap::new(); 337 | headers.insert("X-RapidAPI-Host", "tripadvisor16.p.rapidapi.com".parse().unwrap()); 338 | headers.insert("X-RapidAPI-Key", api_key.parse().unwrap()); 339 | headers 340 | }) 341 | .query(&query_params) 342 | .send() 343 | .await 344 | .map_err(|e| FlightSearchError::HttpRequestFailed(e.to_string()))?; 345 | ``` 346 | 347 | We use `reqwest` to send an HTTP GET request to the flight search API. 348 | 349 | #### e. Parse and Format the Response 350 | 351 | After receiving the response, we need to parse the JSON data and format it for the user.
352 | 353 | ```rust 354 | let text = response 355 | .text() 356 | .await 357 | .map_err(|e| FlightSearchError::HttpRequestFailed(e.to_string()))?; 358 | 359 | let data: Value = serde_json::from_str(&text) 360 | .map_err(|_| FlightSearchError::InvalidResponse)?; 361 | 362 | let mut flight_options: Vec<FlightOption> = Vec::new(); 363 | 364 | // Here, we need to extract the flight options. (It's quite detailed, so we've omitted the full code to keep the focus clear.) 365 | 366 | // Format the flight options into a readable string 367 | let mut output = String::new(); 368 | output.push_str("Here are some flight options:\n\n"); 369 | 370 | for (i, option) in flight_options.iter().enumerate() { 371 | output.push_str(&format!("{}. **Airline**: {}\n", i + 1, option.airline)); 372 | // Additional formatting... 373 | } 374 | 375 | Ok(output) 376 | ``` 377 | 378 | *Note: A lot of this section involves parsing the raw API response. To keep things concise, the detailed extraction of flight options is intentionally omitted, but in your code, you'll parse the JSON to extract the necessary fields. See the [full code in the repository](https://replit.com/@playgrounds/travelplanningagent).* 379 | 380 | *Interested in JSON parsing? Check out [serde_json documentation](https://docs.serde.rs/serde_json/).* 381 | 382 | --- 383 | 384 | ## Creating the AI Agent 385 | 386 | Now that our tool is ready, let's build the agent that will use it. 387 | 388 | ### Updating `main.rs` 389 | 390 | Open `src/main.rs` and update it: 391 | 392 | ```rust 393 | mod flight_search_tool; 394 | 395 | use crate::flight_search_tool::FlightSearchTool; 396 | use dotenv::dotenv; 397 | use rig::completion::Prompt; 398 | use rig::providers::openai; 399 | use std::error::Error; 400 | 401 | #[tokio::main] 402 | async fn main() -> Result<(), Box<dyn Error>> { 403 | dotenv().ok(); 404 | 405 | let openai_client = openai::Client::from_env(); 406 | 407 | let agent = openai_client 408 | .agent("gpt-4") 409 | .preamble("You are a helpful assistant that can find flights for users.") 410 | .tool(FlightSearchTool) 411 | .build(); 412 | 413 | let response = agent 414 | .prompt("Find me flights from San Antonio (SAT) to Atlanta (ATL) on November 15th 2024.") 415 | .await?; 416 | 417 | println!("Agent response:\n{}", response); 418 | 419 | Ok(()) 420 | } 421 | ``` 422 | 423 | - We initialize the OpenAI client using our API key. 424 | - We create an agent, giving it a preamble (context) and adding our `FlightSearchTool`. 425 | - We prompt the agent with a query. 426 | - Finally, we print out the response. 427 | 428 | *Want to understand asynchronous functions? Learn about the `async` keyword and the `#[tokio::main]` macro [here](https://tokio.rs/tokio/tutorial/async).* 429 | 430 | --- 431 | 432 | ## Running and Testing 433 | 434 | Let's see our assistant in action! 435 | 436 | ### Build the Project 437 | 438 | In your terminal, run: 439 | 440 | ```bash 441 | cargo build 442 | ``` 443 | 444 | Fix any compilation errors that may arise. 445 | 446 | ### Run the Application 447 | 448 | ```bash 449 | cargo run 450 | ``` 451 | 452 | You should see an output similar to: 453 | 454 | ```plaintext 455 | Agent response: 456 | Here are some flight options: 457 | 458 | 1.
**Airline**: Spirit 459 | - **Flight Number**: NK123 460 | - **Departure**: 2024-11-15T05:00:00-06:00 461 | - **Arrival**: 2024-11-15T10:12:00-05:00 462 | - **Duration**: 4 hours 12 minutes 463 | - **Stops**: 1 stop(s) 464 | - **Price**: 77.97 USD 465 | - **Booking URL**: https://www.tripadvisor.com/CheapFlightsPartnerHandoff... 466 | 467 | 2. **Airline**: American 468 | - **Flight Number**: AA456 469 | - **Departure**: 2024-11-15T18:40:00-06:00 470 | - **Arrival**: 2024-11-15T23:58:00-05:00 471 | - **Duration**: 4 hours 18 minutes 472 | - **Stops**: 1 stop(s) 473 | - **Price**: 119.97 USD 474 | - **Booking URL**: https://www.tripadvisor.com/CheapFlightsPartnerHandoff... 475 | 476 | ... 477 | ``` 478 | 479 | *Note: The actual results may vary depending on the API response.* 480 | 481 | --- 482 | 483 | ## Wrapping Up 484 | 485 | Congratulations! You've built a functional Flight Search AI Assistant using Rust and Rig. Here's what we've achieved: 486 | 487 | - **Learned Rust Basics**: We've explored Rust's syntax and structure, including handling errors and asynchronous programming. 488 | - **Understood Agents and Tools**: We learned how agents act as the brain and tools as the skills. 489 | - **Built a Custom Tool**: We created a flight search tool that interacts with an external API. 490 | - **Created an AI Agent**: We integrated our tool into an agent that can understand and respond to user queries. 491 | - **Ran and Tested Our Assistant**: We saw our assistant in action, fetching and displaying flight options. 492 | 493 | ### Next Steps 494 | 495 | - **Enhance the Tool**: Add more parameters like class of service, number of passengers, or price filtering. 496 | - **Improve Error Handling**: Handle cases where no flights are found or when the API rate limit is reached. 497 | - **User Interface**: Build a simple command-line interface or even a web frontend. --------------------------------------------------------------------------------