├── .gitignore
├── LICENSE
├── README.md
└── awesome-decentralized-llms.md


/.gitignore:
--------------------------------------------------------------------------------
 1 | .cache
 2 | *.egg-info
 3 | *.py[oc]
 4 | *~
 5 | .*.sw?
 6 | .coverage
 7 | .idea
 8 | .ipynb_checkpoints
 9 | .mypy_cache
10 | .netlify
11 | .pytest_cache
12 | .venv
13 | .vscode
14 | Pipfile.lock
15 | __pycache__/
16 | archive.zip
17 | build/
18 | coverage.xml
19 | dist/
20 | docs.zip
21 | docs_build
22 | env
23 | env3.*
24 | htmlcov
25 | log.txt
26 | site
27 | test.db
28 | venv
29 | wheels/
30 | model_cache/
31 | .DS_Store


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
 1 | MIT License
 2 | 
 3 | Copyright (c) 2023 Ian Maurer
 4 | 
 5 | Permission is hereby granted, free of charge, to any person obtaining a copy
 6 | of this software and associated documentation files (the "Software"), to deal
 7 | in the Software without restriction, including without limitation the rights
 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
  1 | # Awesome LLM JSON List
  2 | 
  3 | This awesome list is dedicated to resources for using Large Language Models (LLMs) to generate JSON or other structured outputs.
  4 |   
  5 | ## Table of Contents  
  6 |   
  7 | * [Terminology](#terminology)  
  8 | * [Hosted Models](#hosted-models)
  9 | * [Local Models](#local-models)
 10 | * [Python Libraries](#python-libraries)
 11 | * [Blog Articles](#blog-articles)
 12 | * [Videos](#videos) 
 13 | * [Jupyter Notebooks](#jupyter-notebooks)
 14 | * [Leaderboards](#leaderboards)
 15 |   
 16 | ## Terminology  
 17 |   
 18 | Unfortunately, generating JSON goes by a few different names that roughly mean the same thing:  
 19 |   
 20 | * [Structured Outputs](https://platform.openai.com/docs/guides/structured-outputs): Using an LLM to generate any structured output including JSON, XML, or YAML regardless of technique (e.g. function calling, guided generation).
 21 | * [Function Calling](https://www.promptingguide.ai/applications/function_calling): Providing an LLM a hypothetical (or actual) function definition for it to "call" in its chat or completion response. The LLM doesn't actually call the function; it just indicates via a JSON message that one should be called.
 22 | * [JSON Mode](https://platform.openai.com/docs/guides/text-generation/json-mode): Specifying that an LLM must generate valid JSON. Depending on the provider, a schema may or may not be specified and the LLM may create an unexpected schema.
 23 | * [Tool Usage](https://python.langchain.com/docs/concepts/tools/): Giving an LLM a choice of tools such as image generation, web search, and "function calling". The function calling parameter in the API request is now called "tools".
 24 | * [Guided Generation](https://arxiv.org/abs/2307.09702): Constraining an LLM to generate text that follows a prescribed specification such as a [Context-Free Grammar](https://en.wikipedia.org/wiki/Context-free_grammar).
 25 | * [GPT Actions](https://platform.openai.com/docs/actions/introduction): ChatGPT invokes actions (i.e. API calls) based on the endpoints and parameters specified in an [OpenAPI specification](https://swagger.io/specification/). Unlike the capability called "Function Calling", this capability will indeed call your function hosted by an API server.
 26 | 
 27 | None of these names are great, which is why I named this list just "Awesome LLM JSON".
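
To make the "Function Calling" entry above concrete, here is a minimal sketch of the round trip using only the Python standard library. The tool name `get_weather`, its schema, and the shape of the model's JSON message are illustrative assumptions, not any provider's actual wire format. The model only names the function and its arguments; the calling code decides whether to run it.

```python
import json

# Hypothetical tool definition, written in the JSON Schema style that
# function-calling APIs commonly accept.
TOOL_SCHEMAS = {
    "get_weather": {
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    }
}


def get_weather(city: str) -> dict:
    """Stand-in implementation; a real one would query a weather service."""
    return {"city": city, "forecast": "sunny"}


# Maps each tool name to the local callable that implements it.
TOOL_IMPLEMENTATIONS = {"get_weather": get_weather}


def dispatch_tool_call(message: str) -> dict:
    """Parse the model's JSON message and run the function it names."""
    call = json.loads(message)
    name, arguments = call["name"], call["arguments"]
    if name not in TOOL_IMPLEMENTATIONS:
        raise ValueError(f"Unknown tool: {name}")
    # Check the required arguments listed in the schema before calling.
    required = TOOL_SCHEMAS[name]["parameters"]["required"]
    missing = [key for key in required if key not in arguments]
    if missing:
        raise ValueError(f"Missing arguments for {name}: {missing}")
    return TOOL_IMPLEMENTATIONS[name](**arguments)


# A made-up model response: the model only describes the call as JSON.
model_message = '{"name": "get_weather", "arguments": {"city": "Paris"}}'
result = dispatch_tool_call(model_message)
print(result)
```

In a real application the result would typically be sent back to the model in a follow-up message so it can compose the final answer.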
 28 |   
 29 | ## Hosted Models
 30 | 
 31 | | Provider     | Models                                                                             | Links                                                                                                                                                                                                                                                                                                                                                                                                                                    |
 32 | |--------------|------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
 33 | | Anthropic    | claude-3-opus-20240229<br>claude-3-sonnet-20240229<br>claude-3-haiku-20240307      | [API Docs](https://docs.anthropic.com/claude/docs/tool-use)<br>[Pricing](https://docs.anthropic.com/claude/docs/tool-use)                                                                                                                                                                                                                                                                                                                |
 34 | | AnyScale     | Mistral-7B-Instruct-v0.1<br>Mixtral-8x7B-Instruct-v0.1                             | [Function Calling](https://docs.endpoints.anyscale.com/text-generation/function-calling)<br>[JSON Mode](https://docs.endpoints.anyscale.com/text-generation/json-mode)<br>[Pricing](https://docs.endpoints.anyscale.com/pricing/)<br>[Announcement (2023)](https://www.anyscale.com/blog/anyscale-endpoints-json-mode-and-function-calling-features)                                                                                     |
 35 | | Azure        | gpt-4<br>gpt-4-turbo<br>gpt-35-turbo<br>mistral-large-latest<br>mistral-large-2402 | [Function Calling](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/function-calling?tabs=python)<br>[OpenAI Pricing](https://azure.microsoft.com/en-us/pricing/details/cognitive-services/openai-service/#pricing)<br>[Mistral Pricing](https://azuremarketplace.microsoft.com/en-us/marketplace/apps/000-000.mistral-ai-large-offer?tab=PlansAndPrice)                                                                |
 36 | | Cohere       | Command-R<br>Command-R+                                                            | [Function Calling](https://docs.cohere.com/docs/tool-use)<br>[Pricing](https://cohere.com/pricing)<br>[Command-R (2024-03-11)](https://txt.cohere.com/command-r/)<br>[Command-R+ (2024-04-04)](https://txt.cohere.com/command-r-plus-microsoft-azure/)                                                                                                                                                                                   |
 37 | | Fireworks.ai | firefunction-v1                                                                    | [Function Calling](https://readme.fireworks.ai/docs/function-calling)<br>[JSON Mode](https://readme.fireworks.ai/docs/structured-response-formatting)<br>[Grammar mode](https://readme.fireworks.ai/docs/structured-output-grammar-based)<br>[Pricing](https://fireworks.ai/pricing)<br>[Announcement (2023-12-20)](https://blog.fireworks.ai/fireworks-raises-the-quality-bar-with-function-calling-model-and-api-release-e7f49d1e98e9) |
 38 | | Google       | gemini-1.0-pro                                                                     | [Function Calling](https://cloud.google.com/vertex-ai/docs/generative-ai/multimodal/function-calling#rest)<br>[Pricing](https://ai.google.dev/pricing?authuser=1)                                                                                                                                                                                                                                                                        |
 39 | | Groq         | llama2-70b<br>mixtral-8x7b<br>gemma-7b-it                                          | [Function Calling](https://console.groq.com/docs/tool-use)<br>[Pricing](https://wow.groq.com/)                                                                                                                                                                                                                                                                                                                                           |
 40 | | Hugging Face TGI         | [many open-source models](https://huggingface.co/docs/text-generation-inference/supported_models)                                           | [Grammars, JSON mode, Function Calling and Tools](https://huggingface.co/docs/text-generation-inference/conceptual/guidance#guidance)<br>For [free locally](https://huggingface.co/docs/text-generation-inference/basic_tutorials/consuming_tgi), or via [dedicated](https://huggingface.co/docs/inference-endpoints/index) or [serverless](https://huggingface.co/docs/api-inference/index) endpoints.                                                                                                                                                                                                                                                                                                                                             |
 41 | | Mistral      | mistral-large-latest                                                               | [Function Calling](https://docs.mistral.ai/guides/function-calling/)<br>[Pricing](https://docs.mistral.ai/platform/pricing/)                                                                                                                                                                                                                                                                                                             |
 42 | | OpenAI       | gpt-4<br>gpt-4-turbo<br>gpt-3.5-turbo                                              | [Function Calling](https://openai.com/blog/openai-api/)<br>[JSON Mode](https://platform.openai.com/docs/guides/text-generation/json-mode)<br>[Pricing](https://openai.com/pricing)<br>[Announcement (2023-06-13)](https://openai.com/blog/function-calling-and-other-api-updates)                                                                                                                                                        |
 43 | | Rysana       | inversion-sm                                                                       | [API Docs](https://rysana.com/docs/api)<br>[Pricing](https://rysana.com/pricing)<br>[Announcement (2024-03-18)](https://rysana.com/inversion)                                                                                                                                                                                                                                                                                            |
 44 | | Together AI  | Mixtral-8x7B-Instruct-v0.1<br>Mistral-7B-Instruct-v0.1<br>CodeLlama-34b-Instruct   | [Function Calling](https://docs.together.ai/docs/function-calling)<br>[JSON Mode](https://docs.together.ai/docs/json-mode)<br>[Pricing](https://together.ai/pricing/)<br>[Announcement 2024-01-31](https://www.together.ai/blog/function-calling-json-mode)                                                                                                                                                                              |
 45 | 
 46 | **Parallel Function Calling**
 47 | 
 48 | Below is a list of hosted API models that support multiple parallel function calls. This could include checking the weather in multiple cities or first finding the location of a hotel and then checking the weather at its location.
 49 | 
 50 | - anthropic
 51 | 	- claude-3-opus-20240229
 52 | 	- claude-3-sonnet-20240229
 53 | 	- claude-3-haiku-20240307
 54 | - azure/openai
 55 | 	- gpt-4-turbo-preview
 56 | 	- gpt-4-1106-preview
 57 | 	- gpt-4-0125-preview
 58 | 	- gpt-3.5-turbo-1106
 59 | 	- gpt-3.5-turbo-0125
 60 | - cohere
 61 | 	- command-r
 62 | - together_ai
 63 | 	- Mixtral-8x7B-Instruct-v0.1
 64 | 	- Mistral-7B-Instruct-v0.1
 65 | 	- CodeLlama-34b-Instruct
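
When a model returns several independent function calls in one response (the multi-city weather case above), the client can run them concurrently. The sketch below assumes a made-up response format, a list of objects with `id`, `name`, and `arguments` keys, and a hypothetical `get_weather` tool. Real providers each use their own response shapes.

```python
import json
from concurrent.futures import ThreadPoolExecutor


def get_weather(city: str) -> str:
    """Hypothetical tool; a real one would call a weather API."""
    return f"sunny in {city}"


# Maps tool names to the local callables that implement them.
TOOLS = {"get_weather": get_weather}

# Made-up response in which the model requests two calls at once.
tool_calls_json = """
[
  {"id": "call_1", "name": "get_weather", "arguments": {"city": "Paris"}},
  {"id": "call_2", "name": "get_weather", "arguments": {"city": "Tokyo"}}
]
"""


def run_tool_calls(raw: str) -> dict:
    """Run every requested call concurrently and key the results by call id."""
    calls = json.loads(raw)
    with ThreadPoolExecutor() as pool:
        futures = {
            call["id"]: pool.submit(TOOLS[call["name"]], **call["arguments"])
            for call in calls
        }
        return {call_id: future.result() for call_id, future in futures.items()}


results = run_tool_calls(tool_calls_json)
print(results)
```

Keying results by call id lets each output be matched to the request that produced it when the results are returned to the model.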
 66 |  
 67 | ## Local Models
 68 | 
 69 | [Mistral 7B Instruct v0.3](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3) (2024-05-22, Apache 2.0) is an instruct fine-tuned version of Mistral with added function calling support.
 70 | 
 71 | [C4AI Command R+](https://huggingface.co/CohereForAI/c4ai-command-r-plus) (2024-03-20, CC-BY-NC, Cohere) is a 104B parameter multilingual model with advanced Retrieval Augmented Generation (RAG) and tool use capabilities, optimized for reasoning, summarization, and question answering across 10 languages. Supports quantization for efficient use and demonstrates unique multi-step tool integration for complex task execution.
 72 | 
 73 | [Hermes 2 Pro - Mistral 7B](https://huggingface.co/NousResearch/Hermes-2-Pro-Mistral-7B) (2024-03-13, Nous Research) is a 7B parameter model that excels at function calling, JSON structured outputs, and general tasks. Trained on an updated OpenHermes 2.5 Dataset and a new function calling dataset, it uses a special system prompt and multi-turn structure. Achieves 90% on function calling and 84% on JSON mode evaluations.
 74 | 
 75 | [Gorilla OpenFunctions v2](https://gorilla.cs.berkeley.edu//blogs/7_open_functions_v2.html) (2024-02-27, Apache 2.0 license, [Charlie Cheng-Jie Ji et al.](https://gorilla.cs.berkeley.edu//blogs/7_open_functions_v2.html)) interprets and executes functions based on JSON Schema Objects, supporting multiple languages and detecting function relevance.
 76 | 
 77 | [NexusRaven-V2](https://nexusflow.ai/blogs/ravenv2) (2023-12-05, Nexusflow) is a 13B model outperforming GPT-4 in zero-shot function calling by up to 7%, enabling effective use of software tools. It is further instruction-tuned from CodeLlama-13B-instruct.
 78 | 
 79 | [Functionary](https://functionary.meetkai.com/) (2023-08-04, [MeetKai](https://meetkai.com/)) interprets and executes functions based on JSON Schema Objects, supporting various compute requirements and call types. Compatible with OpenAI-python and llama-cpp-python for efficient function execution in JSON generation tasks.
 80 | 
 81 | [Hugging Face TGI](https://huggingface.co/docs/text-generation-inference/conceptual/guidance) enables JSON outputs and function calling for a [variety of local models](https://huggingface.co/docs/text-generation-inference/supported_models). 
 82 | 
 83 | 
 84 | ## Python Libraries
 85 | 
 86 | 
 87 | [DSPy](https://github.com/stanfordnlp/dspy) (MIT) is a framework for algorithmically optimizing LM prompts and weights. DSPy introduced [typed predictors and signatures](https://github.com/stanfordnlp/dspy/blob/main/docs/docs/building-blocks/8-typed_predictors.md) to leverage [Pydantic](https://github.com/pydantic/pydantic) for enforcing type constraints on inputs and outputs, improving upon string-based fields.
 88 | 
 89 | [FuzzTypes](https://github.com/genomoncology/FuzzTypes) (MIT) extends Pydantic with autocorrecting annotation types for enhanced data normalization and handling of complex types like emails, dates, and custom entities.
 90 | 
 91 | [guidance](https://github.com/guidance-ai/guidance) (Apache-2.0) enables constrained generation, interleaving Python logic with LLM calls, reusable functions, and calling external tools. Optimizes prompts for faster generation.
 92 | 
 93 | [Instructor](https://github.com/jxnl/instructor) (MIT) simplifies generating structured data from LLMs using Function Calling, Tool Calling, and constrained sampling modes. Built on Pydantic for validation and supports various LLMs.
 94 | 
 95 | [LangChain](https://github.com/langchain-ai/langchain) (MIT) provides a standard interface for chains, integrations with other tools, and end-to-end chains for common applications. LangChain offers [structured outputs](https://python.langchain.com/docs/how_to/structured_output/) and [tool calling](https://python.langchain.com/docs/how_to/tool_calling/) across models.
 96 | 
 97 | [LiteLLM](https://github.com/BerriAI/litellm) (MIT) simplifies calling 100+ LLMs in the OpenAI format, supporting [function calling](https://docs.litellm.ai/docs/completion/function_call), tool calling, and JSON mode.
 98 | 
 99 | [LlamaIndex](https://github.com/run-llama/llama_index) (MIT) provides [modules for structured outputs](https://docs.llamaindex.ai/en/stable/module_guides/querying/structured_outputs/structured_outputs.html) at different levels of abstraction, including output parsers for text completion endpoints, [Pydantic programs](https://docs.llamaindex.ai/en/stable/module_guides/querying/structured_outputs/pydantic_program.html) for mapping prompts to structured outputs using function calling or output parsing, and pre-defined Pydantic programs for specific output types.
100 | 
101 | [Marvin](https://github.com/PrefectHQ/marvin) (Apache-2.0) is a lightweight toolkit for building reliable natural language interfaces with self-documenting tools for tasks like entity extraction and multi-modal support.
102 | 
103 | [Outlines](https://github.com/outlines-dev/outlines) (Apache-2.0) facilitates structured text generation using multiple models, Jinja templating, and support for regex patterns, JSON schemas, Pydantic models, and context-free grammars.
104 | 
105 | [Pydantic](https://github.com/pydantic/pydantic) (MIT) simplifies working with data structures and JSON through data model definition, validation, JSON schema generation, and seamless parsing and serialization.
106 | 
107 | [PydanticAI](https://github.com/pydantic/pydantic-ai) (MIT) is a Python agent framework designed to make it less painful to build production grade applications with Generative AI.
108 | 
109 | [SGLang](https://github.com/sgl-project/sglang) (MPL-2.0) allows specifying JSON schemas using regular expressions or Pydantic models for constrained decoding. Its high-performance runtime accelerates JSON decoding.
110 | 
111 | [SynCode](https://github.com/uiuc-focal-lab/syncode) (MIT) is a framework for the grammar-guided generation of Large Language Models (LLMs). It supports CFG for Python, Go, Java, JSON, YAML, and many more.
112 | 
113 | [Mirascope](https://github.com/Mirascope/mirascope) (MIT) is an LLM toolkit that supports structured extraction with an intuitive Python API.
114 | 
115 | [Magentic](https://github.com/jackmpcollins/magentic) (MIT) calls LLMs from Python using 3 lines of code. Simply use the @prompt decorator to create functions that return structured output from the LLM, powered by Pydantic.
116 | 
117 | [Formatron](https://github.com/Dan-wanna-M/formatron) (MIT) is an efficient and scalable constrained decoding library that gives control over language model output format using f-string templates that support regular expressions, context-free grammars, JSON schemas, and Pydantic models. Formatron integrates seamlessly with various model inference libraries.
118 | 
119 | [Transformers-cfg](https://github.com/epfl-dlab/transformers-CFG) (MIT) extends Hugging Face Transformers with context-free grammar (CFG) support via an EBNF interface. It enables grammar-constrained generation with minimal changes to existing code of transformers and supports JSON mode and JSON Schema.
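
Several libraries in this section (guidance, Outlines, SGLang, SynCode, Formatron, Transformers-cfg) rely on some form of constrained decoding. The toy sketch below illustrates the general idea only and is not how any of these libraries is implemented. The vocabulary, the fixed scores standing in for model logits, and the set of allowed outputs are all made up. At each step, tokens that could not lead to an allowed output are masked before the highest-scoring token is chosen.

```python
# Fixed scores standing in for a model's logits (hypothetical values).
# Unconstrained greedy decoding would pick "maybe", which is not valid JSON.
SCORES = {"maybe": 3.0, "tr": 2.5, "fal": 2.0, "ue": 1.5, "se": 1.0, "null": 0.5}

# The constraint: the output must be exactly one of these JSON literals.
ALLOWED_OUTPUTS = {"true", "false", "null"}


def allowed_tokens(prefix: str) -> list:
    """Return tokens that keep the text a prefix of some allowed output."""
    return [
        token
        for token in SCORES
        if any(output.startswith(prefix + token) for output in ALLOWED_OUTPUTS)
    ]


def constrained_greedy_decode() -> str:
    """Greedily pick the best-scoring token among those the constraint allows."""
    text = ""
    while text not in ALLOWED_OUTPUTS:
        candidates = allowed_tokens(text)
        if not candidates:
            raise ValueError(f"No valid continuation after {text!r}")
        text += max(candidates, key=SCORES.get)
    return text


unconstrained_choice = max(SCORES, key=SCORES.get)
constrained_output = constrained_greedy_decode()
print(unconstrained_choice, constrained_output)
```

Production libraries replace the prefix check with a compiled regular expression, JSON Schema, or grammar, and apply the mask to real logits over the model's full vocabulary.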
120 | 
121 | ## Blog Articles
122 | 
123 | [How fast can grammar-structured generation be?](http://blog.dottxt.co/how-fast-cfg.html) (2024-04-12, .txt Engineering) demonstrates an almost cost-free method to generate text that follows a grammar. It is shown to outperform `llama.cpp` by a factor of 50x on the C grammar.
124 | 
125 | [Structured Generation Improves LLM performance: GSM8K Benchmark](https://blog.dottxt.co/performance-gsm8k.html) (2024-03-15, .txt Engineering) demonstrates consistent improvements across 8 models, highlighting benefits like "prompt consistency" and "thought-control."
126 | 
127 | [LoRAX + Outlines: Better JSON Extraction with Structured Generation and LoRA](https://predibase.com/blog/lorax-outlines-better-json-extraction-with-structured-generation-and-lora) (2024-03-03, Predibase Blog) combines Outlines with LoRAX v0.8 to enhance extraction accuracy and schema fidelity through structured generation, fine-tuning, and LoRA adapters.
128 | 
129 | [FU, Show Me The Prompt. Quickly understand inscrutable LLM frameworks by intercepting API calls](https://hamel.dev/blog/posts/prompt/) (2024-02-14, Hamel Husain) provides a practical guide to intercepting API calls using mitmproxy, gaining insights into tool functionality, and assessing necessity. Emphasizes minimizing complexity and maintaining close connection with underlying LLMs.
130 | 
131 | [Coalescence: making LLM inference 5x faster](https://blog.dottxt.co/coalescence.html) (2024-02-02, .txt Engineering) shows how structured generation can be made faster than unstructured generation using a technique called "coalescence", with a caveat regarding how it may affect the quality of the generation.
132 | 
133 | [Why Pydantic became indispensable for LLMs](https://www.factsmachine.ai/p/how-pydantic-became-indispensable) (2024-01-19, [Adam Azzam](https://twitter.com/aaazzam)) explains Pydantic's emergence as a critical tool, enabling sharing data models via JSON schemas and reasoning between unstructured and structured data. Highlights the importance of quantizing the decision space and potential issues with LLMs overfitting to older schema versions.
134 | 
135 | [Getting Started with Function Calling](https://www.promptingguide.ai/applications/function_calling) (2024-01-11, Elvis Saravia) introduces function calling for connecting LLMs with external tools and APIs, providing an example using OpenAI's API and highlighting potential applications.
136 | 
137 | [Pushing ChatGPT's Structured Data Support To Its Limits](https://minimaxir.com/2023/12/chatgpt-structured-data/) (2023-12-21, Max Woolf) delves into leveraging ChatGPT's capabilities using paid API, JSON schemas, and Pydantic. Highlights techniques for improving output quality and the benefits of structured data support.
138 | 
139 | [Why use Instructor?](https://jxnl.github.io/instructor/why/) (2023-11-18, Jason Liu) explains the benefits of the library, offering a readable approach, support for partial extraction and various types, and a self-correcting mechanism. Recommends additional resources on the Instructor website.
140 | 
141 | [Using grammars to constrain llama.cpp output](https://www.imaurer.com/llama-cpp-grammars/) (2023-09-06, Ian Maurer) integrates context-free grammars with llama.cpp for more accurate and schema-compliant responses, particularly for biomedical data.
142 | 
143 | [Using OpenAI functions and their Python library for data extraction](https://til.simonwillison.net/gpt3/openai-python-functions-data-extraction) (2023-07-09, Simon Willison) demonstrates extracting structured data using OpenAI Python library and function calling in a single API call, with a code example and suggestions for handling streaming limitations.
144 | 
145 | ## Videos
146 | 
147 | [GPT Extracting Unstructured Data with Datasette and GPT-4 Turbo](https://www.youtube.com/watch?v=g3NtJatmQR0) (2024-04-09, Simon Willison) showcases the datasette-extract plugin's ability to populate database tables from unstructured text and images, leveraging GPT-4 Turbo's API for data extraction.
148 | 
149 | [LLM Structured Output for Function Calling with Ollama](https://www.youtube.com/watch?v=_-FrUReljTQ) (2024-03-25, Andrej Baranovskij) demonstrates function calling-based data extraction using Ollama, Instructor and [Sparrow agent](https://github.com/katanaml/sparrow). 
150 | 
151 | [Hermes 2 Pro Overview](https://www.youtube.com/watch?v=ViXURxck-HM) (2024-03-18, Prompt Engineer) introduces Hermes 2 Pro, a 7B parameter model excelling at function calling and structured JSON output. Demonstrates 90% accuracy in function calling and 84% in JSON mode, outperforming other models.
152 | 
153 | [Mistral AI Function Calling](https://www.youtube.com/watch?v=eOo4GfHj3ZE) (2024-02-24, Sophia Yang) demonstrates connecting LLMs to external tools, generating function arguments, and executing functions. Could be extended to generate or manipulate JSON data.
154 | 
155 | [Function Calling in Ollama vs OpenAI](https://www.youtube.com/watch?v=RXDWkiuXtG0)  (2024-02-13, [Matt Williams](https://twitter.com/Technovangelist)) clarifies that models generate structured output for parsing and invoking functions. Compares implementations, highlighting Ollama's simpler approach and using few-shot prompts for consistency.
156 | 
157 | [LLM Engineering: Structured Outputs](https://www.youtube.com/watch?v=1xUeL63ymM0) (2024-02-12, [Jason Liu](https://twitter.com/jxnlco), [Weights & Biases Course](https://www.wandb.courses/)) offers a concise course on handling structured JSON output, function calling, and validations using Pydantic, covering essentials for robust pipelines and efficient production integration.
158 | 
159 | [Pydantic is all you need](https://www.youtube.com/watch?v=yj-wSRJwrrc)  (2023-10-10, [Jason Liu](https://twitter.com/jxnlco), [AI Engineer Conference](https://www.ai.engineer/)) discusses the importance of Pydantic for structured prompting and output validation, introducing the Instructor library and showcasing advanced applications for reliable and maintainable LLM-powered applications.
160 | 
161 | ## Jupyter Notebooks
162 | 
163 | [Function Calling with llama-cpp-python and OpenAI Python Client](https://github.com/abetlen/llama-cpp-python/blob/main/examples/notebooks/Functions.ipynb) demonstrates integration, including setup using the Instructor library, with examples of retrieving weather information and extracting user details.
164 | 
165 | [Function Calling with Mistral Models](https://colab.research.google.com/github/mistralai/cookbook/blob/main/function_calling.ipynb) demonstrates connecting Mistral models with external tools through a simple example involving a payment transactions dataframe.
166 | 
167 | [chatgpt-structured-data](https://github.com/minimaxir/chatgpt-structured-data) by [Max Woolf](https://twitter.com/minimaxir) provides demos showcasing ChatGPT's function calling and structured data support, covering various use cases and schemas.
168 | 
169 | ## Leaderboards
170 | 
171 | [Berkeley Function-Calling Leaderboard (BFCL)](https://gorilla.cs.berkeley.edu/blogs/8_berkeley_function_calling_leaderboard.html) is an evaluation framework for LLMs' function-calling capabilities including over 2k question-function-answer pairs across languages like Python, Java, JavaScript, SQL, and REST API, focusing on simple, multiple, and parallel function calls, as well as function relevance detection.
172 | 


--------------------------------------------------------------------------------
/awesome-decentralized-llms.md:
--------------------------------------------------------------------------------
  1 | # awesome-decentralized-llm
  2 | 
  3 | Collection of LLM resources that can be used to build products you can "own" or to perform reproducible research. Please note there are Terms of Service around some of the weights and training data that should be investigated before commercialization.
  4 | 
  5 | Table of Contents:
  6 | 
  7 | - [Leaderboards](#leaderboards)
  8 | - [Local LLMs](#local-llms)
  9 | - [LLM-based Tools](#llm-based-tools)
 10 | - [Training and Quantization](#training-and-quantization)
 11 | - [Non-English Models](#non-english-models)
 12 | - [Autonomous Agents](#autonomous-agents)
 13 | 
 14 | If you are looking for a list of open-source LLMs that can be used commercially, this is a great list:
 15 | [Open LLMs](https://github.com/eugeneyan/open-llms)
 16 | 
 17 | -----
 18 | 
 19 | ## Leaderboards
 20 | 
 21 | I am having a hard time keeping up with the latest and greatest open-source LLMs. Below are leaderboards I am checking periodically:
 22 | 
 23 | - [HuggingFace Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
 24 |   The 🤗 Open LLM Leaderboard aims to track, rank and evaluate LLMs and chatbots as they are released.
 25 |   (2023-05-23, HuggingFace)
 26 | 
 27 | - [AlpacaEval 🦙 Leaderboard](https://tatsu-lab.github.io/alpaca_eval/)
 28 |   An Automatic Evaluator for Instruction-following Language Models
 29 |   (2023-07-01, Stanford Alpaca/Tatsu Lab)
 30 | 
 31 | - [Code Generation on HumanEval](https://paperswithcode.com/sota/code-generation-on-humaneval)
 32 |   HumanEval problem solving dataset described in the paper "Evaluating Large Language Models Trained on Code"
 33 |   (2023-07-01, Papers With Code)
 34 |   
 35 | -----
 36 | 
 37 | ## Local LLMs
 38 | 
 39 | ### Local LLM Repositories
 40 | 
 41 | - [LLM Foundry](https://github.com/mosaicml/llm-foundry)
 42 |   Release repo for MPT-7B and related models.
 43 |   (2023-05-05, MosaicML, Apache 2.0)
 44 | 
 45 | - [FastChat](https://github.com/lm-sys/FastChat)
 46 |   Release repo for Vicuna and FastChat-T5
 47 |   (2023-04-20, LMSYS, Apache 2.0)
 48 | 
 49 | - [StableLM](https://github.com/stability-AI/stableLM/) -
 50 |   Stability AI Language Models
 51 |   (2023-04-19, StabilityAI, Apache and CC BY-SA-4.0)
 52 | 
 53 | - [GPT4All](https://github.com/nomic-ai/gpt4all) -
 54 |   LLM trained with ~800k GPT-3.5-Turbo Generations based on GPT-J and LLaMA.
 55 |   (2023-04-13, Nomic AI, Apache/Meta ToS/OpenAI ToS)
 56 |  
 57 | - [Dolly](https://github.com/databrickslabs/dolly) -
 58 |   Large language model trained on the Databricks Machine Learning Platform
 59 |   (2023-03-24, Databricks Labs, Apache)
 60 |   
 61 | - [bloomz.cpp](https://github.com/NouamaneTazi/bloomz.cpp)
 62 |   Inference of HuggingFace's BLOOM-like models in pure C/C++.
 63 |   (2023-03-16, Nouamane Tazi, MIT License)
 64 |   
 65 | - [alpaca.cpp](https://github.com/antimatter15/alpaca.cpp) -
 66 |   Locally run an Instruction-Tuned Chat-Style LLM
 67 |   (2023-03-16, Kevin Kwok, MIT License)
 68 | 
 69 | - [Stanford Alpaca](https://github.com/tatsu-lab/stanford_alpaca) -
 70 |   Code and documentation to train Stanford's Alpaca models, and generate the data.
 71 |   (2023-03-13, Stanford CRFM, Apache License, Non-Commercial Data, Meta/OpenAI ToS)
 72 | 
 73 | - [llama.cpp](https://github.com/ggerganov/llama.cpp) -
 74 |   Port of Facebook's LLaMA model in C/C++. 
 75 |   (2023-03-10, Georgi Gerganov, MIT License)
 76 | 
 77 | - [ChatRWKV](https://github.com/BlinkDL/ChatRWKV) -
 78 |   ChatRWKV is like ChatGPT but powered by RWKV (100% RNN) language model, and open source.
 79 |   (2023-01-09, PENG Bo, Apache License)
 80 |   
 81 | - [RWKV-LM](https://github.com/BlinkDL/RWKV-LM) -
 82 |   RNN with Transformer-level LLM performance. Combines best of RNN and transformer: fast inference, saves VRAM, fast training.
 83 |   (2022?, PENG Bo, Apache License)
 84 | 
 85 | - [Open Assistant](https://open-assistant.io/) - A chat-based ChatGPT-like large language model. (2023-04-15, Pythia, LLAMA, Apache 2.0 License)
 86 | 
 87 | ### Local LLM Spaces, Models & Datasets
 88 | 
 89 | - [RedPajama-Data-1T](https://huggingface.co/datasets/togethercomputer/RedPajama-Data-1T)
 90 |   (2023-04-17, [@togethercompute](https://twitter.com/togethercompute))
 91 | 
 92 | - [Vicuna 13b](https://huggingface.co/lmsys/vicuna-13b-delta-v1.1)
 93 |   (2023-04-12, [@lmsysorg](https://twitter.com/lmsysorg))
 94 | 
 95 | - [Dolly 15k Instruction Tuning Labels](https://github.com/databrickslabs/dolly/tree/master/data)
 96 |   (2023-04-12, Databricks, CC BY-SA 3.0, allows commercial use)
 97 |   
 98 | - [Cerebras-GPT 7 Models](https://huggingface.co/cerebras)
 99 |   (2023-03-28, Huggingface, Cerebras, Apache License)
100 | 
101 | - [Alpaca Data Cleaned](https://github.com/gururise/AlpacaDataCleaned)
102 |   (2023-03-21, Gene Ruebsamen, Apache & OpenAI ToS)
103 | 
104 | - [Alpaca Dataset](https://huggingface.co/datasets/tatsu-lab/alpaca)
105 |   (2023-03-13, Huggingface, Tatsu-Lab, Meta ToS/OpenAI ToS)
106 |   
107 | - [Alpaca Model Search](https://huggingface.co/models?sort=downloads&search=alpaca)
108 |   (Huggingface, Meta ToS/OpenAI ToS)
109 |   
110 | 
111 | ### Local LLM Resources
112 | 
113 | - [Introducing MPT-7B: A New Standard for Open-Source, Commercially Usable LLMs](https://www.mosaicml.com/blog/mpt-7b)
114 |   (2023-05-05, MosaicML, Blog Post)
115 | 
116 | - [Google "We Have No Moat, And Neither Does OpenAI"](https://www.semianalysis.com/p/google-we-have-no-moat-and-neither)
117 |   (2023-05-04, Leaked Internal Google Document)
118 | 
119 | - [RedPajama reproduces LLaMA training dataset of over 1.2 trillion tokens](https://www.together.xyz/blog/redpajama)
120 |   (2023-04-17, Together, Blog Post)
121 | 
122 | - [What’s in the RedPajama-Data-1T LLM training set](https://simonwillison.net/2023/Apr/17/redpajama-data/)
123 |   (2023-04-17, Simon Willison, Blog Post)
124 |   
125 | - [GPT4All-J: An Apache-2 Licensed Assistant-Style Chatbot](https://static.nomic.ai/gpt4all/2023_GPT4All-J_Technical_Report_2.pdf)
126 |   (2023-04-13, nomic.ai)
127 | 
128 | - [Databricks releases Dolly 2.0, the first open, instruction-following LLM for commercial use](https://venturebeat-com.cdn.ampproject.org/c/s/venturebeat.com/ai/databricks-releases-dolly-2-0-the-first-open-instruction-following-llm-for-commercial-use/amp/)
129 |   (2023-04-13, Venture Beat, Sharon Goldman)
130 | 
131 | - [Summary of Current Models](https://docs.google.com/spreadsheets/d/1O5KVQW1Hx5ZAkcg8AIRjbQLQzx2wVaLl0SqUu-ir9Fs/edit#gid=1158069878)
132 |   (2023-04-11, Dr Alan D. Thompson, Google Sheet)
133 | 
134 | - [Running GPT4All On a Mac Using Python langchain in a Jupyter Notebook](https://blog.ouseful.info/2023/04/04/running-gpt4all-on-a-mac-using-python-langchain-in-a-jupyter-notebook/)
135 |   (2023-04-04, Tony Hirst, Blog Post)
136 | 
137 | - [Vicuna Homepage](https://vicuna.lmsys.org/)
138 |   (2023-04-01, Meta ToS)
139 | 
140 | - [Cerebras-GPT vs LLaMA AI Model Comparison](https://www.lunasec.io/docs/blog/cerebras-gpt-vs-llama-ai-model-comparison/)
141 |   (2023-03-29, LunaSec, Blog Post)
142 | 
143 | - [Cerebras-GPT: Family of Open, Compute-efficient, LLMs](https://www.cerebras.net/blog/cerebras-gpt-a-family-of-open-compute-efficient-large-language-models/)
144 |   (2023-03-28, Cerebras, Blog Post)
145 | 
146 | - [Hello Dolly: Democratizing the magic of ChatGPT with open models](https://www.databricks.com/blog/2023/03/24/hello-dolly-democratizing-magic-chatgpt-open-models.html)
147 |   (2023-03-24, databricks, Blog Post)
148 | 
149 | - [The Coming of Local LLMs](https://nickarner.com/notes/the-coming-of-local-llms-march-23-2023/)
150 |   (2023-03-23, Nick Arner, Blog Post)
151 | 
152 | - [The RWKV language model: An RNN with the advantages of a transformer](https://johanwind.github.io/2023/03/23/rwkv_overview.html)
153 |   (2023-03-23, Johan Sokrates Wind, Blog Post)
154 |   
155 | - [Bringing Whisper and LLaMA to the masses](https://changelog.com/podcast/532)
156 |   (2023-03-15, The Changelog & Georgi Gerganov, Podcast Episode)
157 |   
158 | - [Alpaca: A Strong, Replicable Instruction-Following Model](https://crfm.stanford.edu/2023/03/13/alpaca.html)
159 |   (2023-03-13, Stanford CRFM, Project Homepage)
160 | 
161 | - [Large language models are having their Stable Diffusion moment](https://simonwillison.net/2023/Mar/11/llama/)
162 |   (2023-03-10, Simon Willison, Blog Post)
163 | 
164 | - [Running LLaMA 7B and 13B on a 64GB M2 MacBook Pro with llama.cpp](https://til.simonwillison.net/llms/llama-7b-m2)
165 |   (2023-03-10, Simon Willison, Blog/Today I Learned)
166 |   
167 | - [Introducing LLaMA: A foundational, 65-billion-parameter large language model](https://ai.facebook.com/blog/large-language-model-llama-meta-ai/)
168 |   (2023-02-24, Meta AI, Meta ToS)
169 | 
170 | -----
171 | 
172 | ## LLM-based Tools
173 | 
174 | - [xTuring](https://github.com/stochasticai/xturing) -
175 |   This tool allows for the fine-tuning of language models either on your personal computer or in the cloud, all while minimizing GPU costs.
176 |   (2023-10-05, stochastic.ai)
177 | 
178 | - [MiniGPT-4](https://github.com/Vision-CAIR/MiniGPT-4)
179 |   Enhancing Vision-language Understanding with Advanced Large Language Models
180 |   (2023-04-17, Vision CAIR Research Group, KAUST, BSD)
181 | 
182 | - [Text generation web UI](https://github.com/oobabooga/text-generation-webui)
183 |   A gradio web UI for running Large Language Models like LLaMA, llama.cpp, GPT-J, Pythia, OPT, and GALACTICA.
184 |   (2023-04-15, oobabooga, AGPL)
185 | 
186 | - [TextSynth](https://bellard.org/ts_server/)
187 |   REST API for Large Language Models. Supports a variety of models.
188 |   (2023-04-13, Fabrice Bellard, Commercial License GPU/Shareware CPU)
189 | 
190 | - [FastChat](https://github.com/lm-sys/FastChat) -
191 |   The release repo for Vicuna: An Open Chatbot Impressing GPT-4
192 |   (2023-04-13, LM-SYS, Apache)
193 | 
194 | - [tabby](https://github.com/TabbyML/tabby)
195 |   Self-hosted AI coding assistant.
196 |   (2023-04-12, TabbyML)
197 | 
198 | - [Basaran](https://github.com/hyperonym/basaran)
199 |   Open-source text completion API for Transformers-based text generation models.
200 |   (2023-04-12, Hyperonym)
201 | 
202 | - [TurboPilot](https://github.com/ravenscroftj/turbopilot)
203 |   Copilot clone that runs a 6B code-completion LLM on CPU with 4GB of RAM.
204 |   (2023-04-11, James Ravenscroft)
205 | 
206 | - [talkGPT4All](https://github.com/vra/talkGPT4All) -
207 |   A voice chatbot based on OpenAI Whisper and GPT4All, running on a local laptop.
208 |   (2023-04-09, Yunfeng Wang, MIT License)
209 | 
210 | - [LLMZoo](https://github.com/FreedomIntelligence/LLMZoo)
211 |   Data, models, and evaluation benchmark for large language models
212 |   (2023-04-08, FreedomIntelligence, Apache)
213 | 
214 | - [LMFlow](https://github.com/OptimalScale/LMFlow)
215 |   An Extensible Toolkit for Finetuning and Inference of Large Foundation Models.
216 |   (2023-04-06, OptimalScale)
217 | 
218 | ----
219 | 
220 | ## Training and Quantization
221 | 
222 | - [DeepSpeed](https://github.com/microsoft/DeepSpeed)
223 |   Deep learning optimization library that makes distributed training and inference easy.
224 |   (2023-04-13, Microsoft, Apache)
225 |   
226 | - [ColossalAI](https://github.com/hpcaitech/ColossalAI)
227 |   Making large AI models cheaper, faster and more accessible
228 |   (2023-04-16, HPC-AI, Apache)
229 |   
230 | - [GPTQ-for-LLaMA](https://github.com/qwopqwop200/GPTQ-for-LLaMa) -
231 |   4-bit quantization of LLaMA using GPTQ
232 |   (2023-04-01, qwopqwop200, Meta ToS)
233 | 
234 | - [GPTQ](https://github.com/IST-DASLab/gptq)
235 |   Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".
236 |   (2023-03-22, IST Austria Distributed Algorithms and Systems Lab)
237 | 
238 | - [xturing](https://github.com/stochasticai/xturing) -
239 |   Build and control your own LLMs
240 |   (2023-04-03, stochastic.ai)
241 |   
242 | - [spaCy](https://github.com/explosion/spaCy)
243 |   💫 Industrial-strength Natural Language Processing (NLP) in Python 
244 |   (2023-04-16, Explosion.ai, MIT)
245 |     
246 | ----
247 | 
248 | ## Non-English Models & Datasets
249 | 
250 | - [Polpaca](https://huggingface.co/mmosiolek/polpaca-lora-7b) -
251 |   Alpaca Speaks Polish
252 |   (2023-04-13, Marcin Mosiolek)
253 | 
254 | - [KOZA](https://github.com/bqpro1/koza)
255 |   KOZA is an instruct model for Polish language forked from alpaca-lora.
256 |   (2023-04-13, Leszek Bukowski)
257 |   
258 | - [Owca](https://github.com/Emplocity/owca) is a Polish-translated dataset of instructions for fine-tuning the Alpaca model (2023-04-13, Emplocity)
259 |   
260 | ----
261 | 
262 | ## LLM Technology for app integration
263 | 
264 | - [semantic-kernel](https://github.com/microsoft/semantic-kernel)
265 |   Integrate cutting-edge LLM technology quickly and easily into your apps
266 |   (2023-04-16, Microsoft)
267 |   
268 | - [LangChain](https://github.com/hwchase17/langchain)
269 |   ⚡ Building applications with LLMs through composability ⚡
270 |   (2023-04-16, Langchain)
271 |   
272 | ----
273 | 
274 | ## Autonomous Agents
275 | 
276 | 
277 | ### Autonomous Agent Repositories
278 | 
279 | - [AI Legion](https://github.com/eumemic/ai-legion) -
280 |   JS/TS framework for autonomous agents who can work together to accomplish tasks.
281 |   (2023-04-13, eumemic, MIT)
282 | 
283 | - [AgentGPT](https://github.com/reworkd/AgentGPT) -
284 |   Assemble, configure, and deploy autonomous AI Agents in your browser.
285 |   (2023-04-12, Reworkd)
286 |   
287 | - [babyagi](https://github.com/yoheinakajima/babyagi) -
288 |   Python script example of AI-powered task management system. Uses OpenAI and Pinecone APIs to create, prioritize, and execute tasks. 
289 |   (2023-04-06, Yohei Nakajima)
290 | 
291 | - [ChatArena](https://github.com/chatarena/chatarena) - 
292 |   Multi-Agent Language Game Environments for LLMs.
293 |   (2023-04-05, UCL)
294 |   
295 | - [Auto-GPT](https://github.com/Torantulino/Auto-GPT) -
296 |   An experimental open-source attempt to make GPT-4 fully autonomous.
297 |   (2023-04-06, Toran Bruce Richards)
298 | 
299 | - [JARVIS](https://github.com/microsoft/JARVIS) -
300 |   JARVIS, a system to connect LLMs with the ML community
301 |   (2023-04-06, Microsoft)
302 | 
303 | - [Autolang](https://github.com/alvarosevilla95/autolang) -
304 |   Based on BabyAGI, focused on workflows that complete. Powered by langchain.
305 |   (2023-04-10, Alvaro Sevilla)
306 | 
307 | - [Embedchain](https://github.com/embedchain/embedchain) -
308 |   Framework to create ChatGPT-like bots over your dataset.
309 |   (2023-07-19, Embedchain)
310 | 
311 | ### Autonomous Agent Resources
312 | 
313 | - [Emergent autonomous scientific research capabilities of large language models](https://arxiv.org/abs/2304.05332)
314 |   (2023-04-11, Daniil A. Boiko, Robert MacKnight, and Gabe Gomes - Carnegie Mellon University)
315 | 
316 | - [Generative Agents: Interactive Simulacra of Human Behavior](https://arxiv.org/pdf/2304.03442.pdf)
317 |   (2023-04-07, Stanford and Google)
318 | 
319 | - [Twitter List: Homebrew AGI Club](https://twitter.com/i/lists/1642934512836575232)
320 |   (2023-04-06, [@altryne](https://twitter.com/altryne))
321 | 
322 | - [LangChain: Custom Agents](https://blog.langchain.dev/custom-agents/)
323 |   (2023-04-03, LangChain)
324 |  
325 | - [HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in HuggingFace](https://arxiv.org/abs/2303.17580)
326 |   (2023-04-02, Microsoft)
327 |   
328 | - [Introducing Agents in Haystack: Make LLMs resolve complex tasks](https://haystack.deepset.ai/blog/introducing-haystack-agents) (2023-03-30, Haystack and Deepset)
329 | 
330 | - [Introducing "🤖 Task-driven Autonomous Agent"](https://twitter.com/yoheinakajima/status/1640934493489070080?s=20)
331 |   (2023-03-29, [@yoheinakajima](https://twitter.com/yoheinakajima))
332 | 
333 | - [A simple Python implementation of the ReAct pattern for LLMs](https://til.simonwillison.net/llms/python-react-pattern)
334 |   (2023-03-17, Simon Willison)
335 | 
336 | - [ReAct: Synergizing Reasoning and Acting in Language Models](https://react-lm.github.io/)
337 |   (2023-03-10, Princeton & Google)
338 | 
339 | ----
340 | 
341 | ## Prompting Tools
342 | 
343 | - [Aim 💫 — An easy-to-use & supercharged open-source AI metadata tracker (experiment tracking, prompt engineering)](https://github.com/aimhubio/aim)
344 |   (2023-04-16, [AimStack](https://aimstack.io/))
345 | 


--------------------------------------------------------------------------------