├── .gitignore ├── Makefile ├── README.md ├── cli.py ├── examples ├── artist.gpt ├── eiffel-tower.png ├── lion.gpt ├── lion.png ├── skyline.gpt └── squirrel-developer.png ├── requirements.txt ├── scripts ├── update-readme.gpt └── update-tool-file.gpt └── tool.gpt /.gitignore: -------------------------------------------------------------------------------- 1 | .venv 2 | .idea/ 3 | -------------------------------------------------------------------------------- /Makefile: -------------------------------------------------------------------------------- 1 | bootstrap: 2 | python3 -m venv .venv 3 | . .venv/bin/activate; pip3 install -r requirements.txt 4 | 5 | readme: 6 | gptscript scripts/update-readme.gpt 7 | 8 | tool-file: 9 | gptscript scripts/update-tool-file.gpt 10 | 11 | validate: readme tool-file 12 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # image-generation 2 | 3 | This CLI tool is a simple command-line interface for interacting with the OpenAI API. It was specifically created as a tool that GPTScript can leverage in order to provide the language with image generation capabilities. 4 | It allows users to input a text prompt and get images generated by OpenAI's DALL-E models. 5 | Users can provide the API key directly as a command-line argument or set it through the credential tool. 6 | 7 | ## Features 8 | 9 | - Accepts an OpenAI API key through the command line or environment variable. 10 | - Accepts a text prompt for image generation and prints the URLs of the resulting images. 11 | - Supports image generation using a specified model, image size, quality, and the number of images to generate. 12 | 13 | ## Prerequisites 14 | 15 | - Python 3.x 16 | - OpenAI API key 17 | 18 | ## How to Build 19 | 20 | This CLI tool does not require a build step, as it is a simple Python script. 
The only requirement is to install the necessary Python dependencies. 21 | 22 | ## Installation 23 | 24 | 1. Clone this repository or download the source code: 25 | 26 | ```bash 27 | git clone https://github.com/your-username/your-repo.git 28 | cd your-repo 29 | ``` 30 | 31 | 2. Install the required Python packages: 32 | 33 | ```bash 34 | pip install -r requirements.txt 35 | ``` 36 | 37 | ## Usage 38 | 39 | You can run this CLI tool using the following command format: 40 | 41 | ```bash 42 | python cli.py --api-key YOUR_API_KEY --prompt "Your text prompt here" --model MODEL_NAME --size IMAGE_SIZE --quality IMAGE_QUALITY --number NUMBER_OF_IMAGES 43 | ``` 44 | 45 | ### CLI Arguments 46 | | Argument | Short Command | Description | Example | 47 | |----------|---------------|-------------|---------| 48 | | API Key | `-k` / `--api-key` | Optional: Your OpenAI API key. Can also be set with OPENAI_API_KEY environment variable. Command-line option takes precedence. | `--api-key YOUR_API_KEY` | 49 | | Prompt | `-p` / `--prompt` | Required: The text prompt for image generation. | `--prompt "Your text prompt here"` | 50 | | Model | `-m` / `--model` | Optional: The model to use for image generation. Default is "dall-e-3". | `--model MODEL_NAME` | 51 | | Size | `-s` / `--size` | Optional: The size of the image to generate, in the format WxH (e.g., 1024x1024). Default is 1024x1024. | `--size IMAGE_SIZE` | 52 | | Quality | `-q` / `--quality` | Optional: The quality of the generated image. Allowed values are "standard" or "hd". Default is "standard". | `--quality IMAGE_QUALITY` | 53 | | Number | `-n` / `--number` | Optional: The number of images to generate. Default is 1. | `--number NUMBER` | 54 | 55 | ## Authentication 56 | 57 | An OpenAI API key is needed in order to authenticate. When running the tool with GPTScript, the credential tool will 58 | prompt you to provide a key if you have not done so already. 
When running the Python program directly, you can provide 59 | it as a command-line argument or set it as an environment variable. 60 | 61 | ### Setting the API Key as an Environment Variable 62 | 63 | On Unix-like systems, you can set the environment variable like this: 64 | 65 | ```bash 66 | export OPENAI_API_KEY="your-api-key-here" 67 | ``` 68 | 69 | On Windows, you can set the environment variable like this: 70 | 71 | ```cmd 72 | set OPENAI_API_KEY=your-api-key-here 73 | ``` 74 | 75 | After setting the environment variable, you only need to provide the prompt argument when running the tool: 76 | 77 | ```bash 78 | python cli.py --prompt "Describe a futuristic city." 79 | ``` 80 | 81 | ## Examples 82 | 83 | ### Tool Usage 84 | This repository is made to integrate nicely with GPTScript. Accordingly, a [tool.gpt](./tool.gpt) file is provided that allows other GPTScripts to use this tool. 85 | The following example shows how to use this tool in a GPTScript, assuming that `tool.gpt` is located in the same directory as the GPTScript file: 86 | 87 | ```gpt 88 | tools: ./tool.gpt 89 | 90 | You are an expert in image generation. Please generate a cartoon lion standing proudly in the savannah. 91 | ``` 92 | 93 | You can find a similar example in [examples/lion.gpt](./examples/lion.gpt). 
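
When GPTScript calls `cli.py` through `tool.gpt`, each argument is passed in the form `--size="${SIZE}"`, so an optional argument that the caller leaves out presumably arrives as an empty string. `cli.py` handles this by swapping any empty value for the parser default. The following is a minimal sketch of that fallback, not the tool's full argument handling: `build_parser` and `parse_with_fallback` are hypothetical helper names, and only two of the options are reproduced.

```python
import argparse


def build_parser():
    # Mirrors two of the options defined in cli.py, with the same defaults.
    parser = argparse.ArgumentParser()
    parser.add_argument("-s", "--size", type=str, default="1024x1024")
    parser.add_argument("-q", "--quality", type=str, default="standard")
    return parser


def parse_with_fallback(parser, argv):
    # Hypothetical helper: replace empty values with the parser default,
    # following the loop in validate_and_parse_args in cli.py.
    args = parser.parse_args(argv)
    for key, value in vars(args).items():
        if not value:
            setattr(args, key, parser.get_default(key))
    return args


# An empty --size falls back to the default; an explicit --quality is kept.
args = parse_with_fallback(build_parser(), ["--size=", "--quality=hd"])
print(args.size, args.quality)
```

Because the fallback treats every empty value the same way, a caller cannot deliberately pass an empty string for an option; for this tool that is harmless, since no option accepts an empty value.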
94 | 95 | ### Images 96 | #### Cartoon lion standing proudly in the savannah 97 | ```bash 98 | python cli.py --prompt "Cartoon lion standing proudly in the savannah" --quality "standard" 99 | ``` 100 | ![Cartoon lion](./examples/lion.png) 101 | 102 | #### A realistic photograph of a squirrel writing some code in a peaceful meadow 103 | ```bash 104 | python cli.py --prompt "A realistic photograph of a squirrel writing some code in a peaceful meadow" --quality "hd" 105 | ``` 106 | ![Squirrel developer](./examples/squirrel-developer.png) 107 | 108 | 109 | #### The eiffel tower rendered photorealistically at night with a swirling sky as the background 110 | ```bash 111 | python cli.py --prompt "The eiffel tower rendered photorealistically at night with a swirling sky as the background" --quality "hd" 112 | ``` 113 | ![Eiffel tower](./examples/eiffel-tower.png) 114 | -------------------------------------------------------------------------------- /cli.py: -------------------------------------------------------------------------------- 1 | import os 2 | import sys 3 | import argparse 4 | import openai 5 | 6 | # Set up defaults and get API key from environment variable 7 | defaults = { 8 | "api_key": os.getenv('OPENAI_API_KEY'), 9 | "model": "dall-e-3", 10 | "size": "1024x1024", 11 | "quality": "standard", 12 | "number": "1", 13 | } 14 | 15 | # Function to validate and parse arguments 16 | def validate_and_parse_args(parser): 17 | args = parser.parse_args() 18 | 19 | for key, value in vars(args).items(): 20 | if not value: 21 | args.__dict__[key] = parser.get_default(key) 22 | 23 | if not args.api_key: 24 | parser.error('The --api-key argument is required if OPENAI_API_KEY environment variable is not set.') 25 | if not args.prompt: 26 | parser.error('The --prompt argument is required.') 27 | if not args.number.isdigit(): 28 | parser.error('The --number argument must be a number.') 29 | args.number = int(args.number) 30 | 31 | return args 32 | 33 | def main(): 34 | # Parse 
the command line arguments 35 | parser = argparse.ArgumentParser(description="CLI for image generation prompt using OpenAI's DALL-E model.") 36 | parser.add_argument('-k', '--api-key', type=str, default=defaults["api_key"], 37 | help='OpenAI API key. Can also be set with OPENAI_API_KEY environment variable.') 38 | parser.add_argument('-p', '--prompt', type=str, required=True, help='Prompt for image generation.') 39 | parser.add_argument('-m', '--model', type=str, default=defaults["model"], 40 | help=f'Model to use for image generation. Default is "{defaults["model"]}".') 41 | parser.add_argument('-s', '--size', type=str, default=defaults["size"], 42 | help=f'Size of the image to generate, format WxH (e.g. {defaults["size"]}). Default is {defaults["size"]}.') 43 | parser.add_argument('-q', '--quality', type=str, default=defaults["quality"], 44 | help=f'Quality of the generated image. Allowed values are "standard" or "hd". Default is "{defaults["quality"]}"') 45 | parser.add_argument('-n', '--number', type=str, default=defaults["number"], 46 | help='Number of images to generate. 
Default is 1.') 47 | args = validate_and_parse_args(parser) 48 | 49 | # Initialize OpenAI client 50 | client = openai.OpenAI(api_key=args.api_key) 51 | 52 | # Make request to the OpenAI API 53 | try: 54 | response = client.images.generate( 55 | model=args.model, 56 | prompt=args.prompt, 57 | size=args.size, 58 | quality=args.quality, 59 | n=args.number 60 | ) 61 | print([image.url for image in response.data]) 62 | except openai.OpenAIError as e: 63 | print(f"Received an error code while generating images: {e}", file=sys.stderr) 64 | sys.exit(1) 65 | 66 | if __name__ == "__main__": 67 | main() 68 | -------------------------------------------------------------------------------- /examples/artist.gpt: -------------------------------------------------------------------------------- 1 | tools: github.com/gptscript-ai/dalle-image-generation, sys.write, sys.download 2 | 3 | You are an artist who takes inspiration from impressionist and pointillism paintings. 4 | Please come up with a new painting where you choose the subject and use your personal 5 | style. 6 | 7 | Once done, download the file and write it to a file called painting with the appropriate 8 | extension. 9 | 10 | -------------------------------------------------------------------------------- /examples/eiffel-tower.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gptscript-ai/dalle-image-generation/32dec0cf805fc3500c70a95313fa4078c352823e/examples/eiffel-tower.png -------------------------------------------------------------------------------- /examples/lion.gpt: -------------------------------------------------------------------------------- 1 | tools: github.com/gptscript-ai/dalle-image-generation 2 | 3 | You are an expert in image generation. Please generate a lion standing proudly in the savannah. 
4 | -------------------------------------------------------------------------------- /examples/lion.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gptscript-ai/dalle-image-generation/32dec0cf805fc3500c70a95313fa4078c352823e/examples/lion.png -------------------------------------------------------------------------------- /examples/skyline.gpt: -------------------------------------------------------------------------------- 1 | tools: github.com/gptscript-ai/dalle-image-generation, sys.write, sys.download 2 | 3 | Generate a charcoal drawing of the New York City skyline. 4 | 5 | Once done, download the file and write it to a file called skyline with the appropriate 6 | extension. 7 | 8 | -------------------------------------------------------------------------------- /examples/squirrel-developer.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gptscript-ai/dalle-image-generation/32dec0cf805fc3500c70a95313fa4078c352823e/examples/squirrel-developer.png -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | # requirements.txt 2 | openai 3 | argparse 4 | -------------------------------------------------------------------------------- /scripts/update-readme.gpt: -------------------------------------------------------------------------------- 1 | tools: sys.read, sys.write 2 | 3 | Parse cli.py and determine if the README.md file needs to be updated. Only make changes for the CLI 4 | options and make sure that their usage, description, and default values are up to date. Be sure 5 | that the general structure of the README.md file is not changed and that the only changes made 6 | are to the CLI options. 7 | 8 | Write the changes to the README.md file if necessary. 
9 | -------------------------------------------------------------------------------- /scripts/update-tool-file.gpt: -------------------------------------------------------------------------------- 1 | tools: sys.read, sys.write 2 | 3 | Parse cli.py and determine what CLI options are available. Also parse tool.gpt and 4 | determine the difference between the args defined in tool.gpt and the args defined in 5 | cli.py. Do not include the api-key or model in the args list. 6 | 7 | Update the tool.gpt file such that the args and their descriptions in cli.py are exactly represented 8 | in the tool.gpt file. 9 | -------------------------------------------------------------------------------- /tool.gpt: -------------------------------------------------------------------------------- 1 | name: image-generation 2 | credential: github.com/gptscript-ai/credential as sys.openai with OPENAI_API_KEY as env and "Please provide your OpenAI API key" as message and key as field 3 | description: Generates images based on the specified parameters and returns a list of URLs to the generated images. 4 | args: prompt: (required) The text prompt for image generation. 5 | args: size: (optional) The size of the image to generate, format WxH (e.g. 1024x1024). Defaults to 1024x1024. 6 | args: quality: (optional) The quality of the generated image. Allowed values are "standard" or "hd". Default is "standard". 7 | args: number: (optional) The number of images to generate. Defaults to 1. 8 | 9 | #!/usr/bin/env python3 ${GPTSCRIPT_TOOL_DIR}/cli.py --prompt="${PROMPT}" --size="${SIZE}" --quality="${QUALITY}" --number="${NUMBER}" 10 | --------------------------------------------------------------------------------