├── .coveragerc
├── .gitignore
├── LICENSE
├── Makefile
├── README.md
├── docs
│   └── example_notebooks
│       └── example.ipynb
├── poetry.lock
├── pyproject.toml
├── tests
│   ├── __init__.py
│   ├── data
│   │   └── product_df.json
│   ├── integration_tests
│   │   ├── __init__.py
│   │   └── test_llm_accessor.py
│   └── unit_tests
│       ├── __init__.py
│       └── test_llm_accessor.py
└── yolopandas
    ├── __init__.py
    ├── chains.py
    ├── llm_accessor.py
    └── utils
        ├── __init__.py
        └── query_helpers.py
/.coveragerc:
--------------------------------------------------------------------------------
1 | [run]
2 | omit = tests/*
3 |
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | .vscode/
2 |
3 | # Byte-compiled / optimized / DLL files
4 | __pycache__/
5 | *.py[cod]
6 | *$py.class
7 |
8 | # Jupyter Notebook
9 | .ipynb_checkpoints
10 |
11 | # Unit test / coverage reports
12 | .coverage
13 | coverage.xml
14 | .pytest_cache/
15 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | The MIT License
2 |
3 | Copyright 2023 Chester Curme
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in
13 | all copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
21 | THE SOFTWARE.
22 |
--------------------------------------------------------------------------------
/Makefile:
--------------------------------------------------------------------------------
1 | coverage:
2 | poetry run pytest --cov \
3 | --cov-config=.coveragerc \
4 | --cov-report xml \
5 | --cov-report term-missing:skip-covered
6 |
7 | format:
8 | poetry run black .
9 | poetry run isort .
10 |
11 | lint:
12 | poetry run mypy .
13 | poetry run black . --check
14 | poetry run isort . --check
15 | poetry run flake8 .
16 |
17 | unit_tests:
18 | poetry run pytest tests/unit_tests
19 |
20 | integration_tests:
21 | poetry run pytest tests/integration_tests
22 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # YOLOPandas
2 |
3 | Interact with Pandas objects via LLMs and [LangChain](https://github.com/hwchase17/langchain).
4 |
5 | YOLOPandas lets you specify commands with natural language and execute them directly on Pandas objects.
6 | You can preview the code before executing it, or set `yolo=True` to execute the code straight from the LLM.
7 |
8 | **Warning**: YOLOPandas will execute arbitrary Python code on the machine it runs on. This is a dangerous thing to do.
9 |
10 | https://user-images.githubusercontent.com/26529506/214591990-c295a283-b9e6-4775-81e4-28917183ebb1.mp4
11 |
12 | ## Quick Install
13 |
14 | `pip install yolopandas`
15 |
16 | ## Basic usage
17 |
18 | YOLOPandas adds an `llm` accessor to Pandas dataframes.
19 |
20 | ```python
21 | from yolopandas import pd
22 |
23 | df = pd.DataFrame(
24 | [
25 | {"name": "The Da Vinci Code", "type": "book", "price": 15, "quantity": 300, "rating": 4},
26 | {"name": "Jurassic Park", "type": "book", "price": 12, "quantity": 400, "rating": 4.5},
27 | {"name": "Jurassic Park", "type": "film", "price": 8, "quantity": 6, "rating": 5},
28 | {"name": "Matilda", "type": "book", "price": 5, "quantity": 80, "rating": 4},
29 | {"name": "Clockwork Orange", "type": None, "price": None, "quantity": 20, "rating": 4},
30 | {"name": "Walden", "type": None, "price": None, "quantity": 100, "rating": 4.5},
31 | ],
32 | )
33 |
34 | df.llm.query("What item is the least expensive?")
35 | ```
36 | The above will generate Pandas code to answer the question, and prompt the user to accept or reject the proposed code.
37 | Accepting it in this case will return a Pandas dataframe containing the result.
38 |
39 | Alternatively, you can execute the LLM output without first previewing it:
40 | ```python
41 | df.llm.query("What item is the least expensive?", yolo=True)
42 | ```
43 |
44 | `.query` can return the result of the computation, which we do not constrain. For instance, while `"Show me products under $10"` will return a dataframe, the query `"Split the dataframe into two, 1/3 in one, 2/3 in the other. Return (df1, df2)"` can return a tuple of two dataframes. You can also chain queries together, for instance:
45 | ```python
46 | df.llm.query("Group by type and take the mean of all numeric columns.", yolo=True).llm.query("Make a bar plot of the result and use a log scale.", yolo=True)
47 | ```
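
The dataframe-splitting query mentioned above unpacks into two objects; a quick sketch (the example notebook runs the same query):

```python
df1, df2 = df.llm.query(
    "Split the dataframe into two, 1/3 in one, 2/3 in the other. Return (df1, df2).",
    yolo=True,
)
```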
48 |
49 | Also, if you want a better idea of how much each query costs, you can use the `run_query_with_cost` function from the utils module to compute the cost in USD, broken down by prompt and completion tokens:
50 |
51 | ```python
52 |
53 | from yolopandas.utils.query_helpers import run_query_with_cost
54 |
55 | run_query_with_cost(df, "What item is the least expensive?", yolo=True)
56 | ```
57 | After running the above code, the output looks like the following:
58 |
59 | ```
60 | Total Tokens: 267
61 | Prompt Tokens: 252
62 | Completion Tokens: 15
63 | Total Cost (USD): $0.00534
64 | ```
65 |
66 |
67 | See the [example notebook](docs/example_notebooks/example.ipynb) for more ideas.
68 |
69 |
70 | ## LangChain Components
71 |
72 | This package uses several LangChain components, making it easy to work with if you are familiar with LangChain. In particular, it utilizes the LLM, Chain, and Memory abstractions.
73 |
74 | ### LLM Abstraction
75 |
76 | By working with LangChain's LLM abstraction, it is easy to plug different LLM providers into YOLOPandas. You can do this in a few different ways:
77 |
78 | 1. You can change the default LLM by specifying a config path using the `LLPANDAS_LLM_CONFIGURATION` environment variable. The file at this path should be in [one of the accepted formats](https://langchain.readthedocs.io/en/latest/modules/llms/examples/llm_serialization.html); see the sketch after this list.
79 |
80 | 2. If you have a LangChain LLM wrapper in memory, you can set it as the default LLM to use by doing:
81 |
82 | ```python
83 | import yolopandas
84 | yolopandas.set_llm(llm)
85 | ```
86 |
87 | 3. You can set the LLM wrapper to use for a specific dataframe by doing: `df.llm.reset_chain(llm=llm)`
88 |
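For option 1, a minimal sketch of the workflow (the path `llm.yaml` is hypothetical, and the exact file contents depend on your LangChain version and provider):

```python
import os

from langchain.llms import OpenAI

# Serialize an LLM configuration with LangChain's built-in serialization,
# then point YOLOPandas at it before the chain is created.
OpenAI(temperature=0).save("llm.yaml")  # .json also works
os.environ["LLPANDAS_LLM_CONFIGURATION"] = "llm.yaml"
```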
89 |
90 | ### Chain Abstraction
91 |
92 | By working with LangChain's Chain abstraction, it is easy to plug different chains into YOLOPandas. This is useful if you want to customize the prompt, the chain itself, or anything along those lines.
93 |
94 | To use a custom chain for a particular dataframe, you can do:
95 |
96 | ```python
97 | df.llm.set_chain(chain)
98 | ```
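
For example, here is a minimal sketch of a custom chain, reusing the `LLMChain` and `PromptTemplate` imports that `yolopandas/chains.py` already relies on. The only assumption is that the prompt accepts the `query`, `df_head`, and `df_columns` variables the accessor passes in:

```python
from langchain import LLMChain, OpenAI, PromptTemplate

TEMPLATE = (
    "You are working with a pandas dataframe `df` with columns: {df_columns}.\n"
    "Here is `df.head()`:\n{df_head}\n"
    "Respond with Python code (and nothing else) that answers: {query}\n"
)
prompt = PromptTemplate(
    template=TEMPLATE, input_variables=["query", "df_head", "df_columns"]
)
chain = LLMChain(llm=OpenAI(temperature=0), prompt=prompt)

df.llm.set_chain(chain)
```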
99 |
100 | If you ever want to reset the chain to the base chain, you can do:
101 |
102 | ```python
103 | df.llm.reset_chain()
104 | ```
105 |
106 | ### Memory Abstraction
107 |
108 | The default chain used by YOLOPandas utilizes the LangChain concept of [memory](https://langchain.readthedocs.io/en/latest/modules/memory.html). This allows it to "remember" previous commands, making it possible to ask follow-up questions or request commands that build on previous interactions.
109 |
110 | For example, the query `"Make a seaborn plot of price grouped by type"` can be followed with `"Can you use a dark theme, and pastel colors?"` upon viewing the initial result.
111 |
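In code, that interaction looks like:

```python
df.llm.query("Make a seaborn plot of price grouped by type")
# Because the default chain remembers previous commands, the follow-up can
# refer back to the plot without restating the full request:
df.llm.query("Can you use a dark theme, and pastel colors?")
```
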
112 | Memory is turned on by default. To turn it off by default, set the environment variable `LLPANDAS_USE_MEMORY=False`.
113 |
114 | If you are resetting the chain, you can also specify whether to use memory there:
115 |
116 | ```python
117 | df.llm.reset_chain(use_memory=False)
118 | ```
119 |
120 |
121 |
--------------------------------------------------------------------------------
/docs/example_notebooks/example.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "code",
5 | "execution_count": 1,
6 | "id": "7aa7789f-56e5-4372-b21b-75b9aee45bfc",
7 | "metadata": {},
8 | "outputs": [],
9 | "source": [
10 | "from yolopandas import pd"
11 | ]
12 | },
13 | {
14 | "cell_type": "code",
15 | "execution_count": 2,
16 | "id": "0f0e60c8-3a59-4834-93bc-f18931362387",
17 | "metadata": {},
18 | "outputs": [
19 | {
20 | "data": {
21 |        "text/html": ["<!-- HTML table rendering omitted; it duplicates the text/plain output below -->"],
100 | "text/plain": [
101 | " name type price quantity rating\n",
102 | "0 The Da Vinci Code book 15.0 300 4.0\n",
103 | "1 Jurassic Park book 12.0 400 4.5\n",
104 | "2 Jurassic Park film 8.0 6 5.0\n",
105 | "3 Matilda book 5.0 80 4.0\n",
106 | "4 Clockwork Orange None NaN 20 4.0\n",
107 | "5 Walden None NaN 100 4.5"
108 | ]
109 | },
110 | "execution_count": 2,
111 | "metadata": {},
112 | "output_type": "execute_result"
113 | }
114 | ],
115 | "source": [
116 | "product_df = pd.DataFrame(\n",
117 | " [\n",
118 | " {\"name\": \"The Da Vinci Code\", \"type\": \"book\", \"price\": 15, \"quantity\": 300, \"rating\": 4},\n",
119 | " {\"name\": \"Jurassic Park\", \"type\": \"book\", \"price\": 12, \"quantity\": 400, \"rating\": 4.5},\n",
120 | " {\"name\": \"Jurassic Park\", \"type\": \"film\", \"price\": 8, \"quantity\": 6, \"rating\": 5},\n",
121 | " {\"name\": \"Matilda\", \"type\": \"book\", \"price\": 5, \"quantity\": 80, \"rating\": 4},\n",
122 | " {\"name\": \"Clockwork Orange\", \"type\": None, \"price\": None, \"quantity\": 20, \"rating\": 4},\n",
123 | " {\"name\": \"Walden\", \"type\": None, \"price\": None, \"quantity\": 100, \"rating\": 4.5},\n",
124 | " ],\n",
125 | ")\n",
126 | "\n",
127 | "product_df"
128 | ]
129 | },
130 | {
131 | "cell_type": "code",
132 | "execution_count": 3,
133 | "id": "b0790116-253e-46e0-a12e-8a31cb2f7db5",
134 | "metadata": {},
135 | "outputs": [
136 | {
137 | "name": "stdout",
138 | "output_type": "stream",
139 | "text": [
140 | "\u001b[32;1m\u001b[1;3mdf.isnull().sum()\n",
141 | "\u001b[0m"
142 | ]
143 | },
144 | {
145 | "data": {
146 | "text/plain": [
147 | "name 0\n",
148 | "type 2\n",
149 | "price 2\n",
150 | "quantity 0\n",
151 | "rating 0\n",
152 | "dtype: int64"
153 | ]
154 | },
155 | "execution_count": 3,
156 | "metadata": {},
157 | "output_type": "execute_result"
158 | }
159 | ],
160 | "source": [
161 | "product_df.llm.query(\"What columns are missing values?\")"
162 | ]
163 | },
164 | {
165 | "cell_type": "code",
166 | "execution_count": 4,
167 | "id": "bc4c7b71-d4b9-4b45-8a73-25bae097e93d",
168 | "metadata": {},
169 | "outputs": [
170 | {
171 | "data": {
172 | "text/plain": [
173 | "name 0\n",
174 | "type 2\n",
175 | "dtype: int64"
176 | ]
177 | },
178 | "execution_count": 4,
179 | "metadata": {},
180 | "output_type": "execute_result"
181 | }
182 | ],
183 | "source": [
184 | "product_df.llm.query(\"Of these, are any strings?\", yolo=True)"
185 | ]
186 | },
187 | {
188 | "cell_type": "code",
189 | "execution_count": 5,
190 | "id": "403c03f2-ea70-4aad-ba4c-3bb98943e238",
191 | "metadata": {},
192 | "outputs": [
193 | {
194 | "name": "stdout",
195 | "output_type": "stream",
196 | "text": [
197 | "\u001b[32;1m\u001b[1;3mimport random\n",
198 | "\n",
199 | "fruits = ['apple', 'banana', 'orange', 'strawberry', 'grape']\n",
200 | "\n",
201 | "df['type'] = df['type'].fillna(random.choice(fruits))\n",
202 | "\u001b[0m"
203 | ]
204 | }
205 | ],
206 | "source": [
207 | "product_df.llm.query(\"Impute the type column with random fruits.\")"
208 | ]
209 | },
210 | {
211 | "cell_type": "code",
212 | "execution_count": 6,
213 | "id": "4ee691c0-a959-4d16-8e99-3eb111363a68",
214 | "metadata": {},
215 | "outputs": [
216 | {
217 | "data": {
218 |        "text/html": ["<!-- HTML table rendering omitted; it duplicates the text/plain output below -->"],
297 | "text/plain": [
298 | " name type price quantity rating\n",
299 | "0 The Da Vinci Code book 15.0 300 4.0\n",
300 | "1 Jurassic Park book 12.0 400 4.5\n",
301 | "2 Jurassic Park film 8.0 6 5.0\n",
302 | "3 Matilda book 5.0 80 4.0\n",
303 | "4 Clockwork Orange apple NaN 20 4.0\n",
304 | "5 Walden apple NaN 100 4.5"
305 | ]
306 | },
307 | "execution_count": 6,
308 | "metadata": {},
309 | "output_type": "execute_result"
310 | }
311 | ],
312 | "source": [
313 | "product_df"
314 | ]
315 | },
316 | {
317 | "cell_type": "code",
318 | "execution_count": 7,
319 | "id": "d55ae853-8253-46b2-923c-a4dd36b7c8f8",
320 | "metadata": {},
321 | "outputs": [
322 | {
323 | "name": "stdout",
324 | "output_type": "stream",
325 | "text": [
326 | "\u001b[32;1m\u001b[1;3msplit_index = round(len(df) / 3)\n",
327 | "\n",
328 | "df1 = df[:split_index]\n",
329 | "df2 = df[split_index:]\n",
330 | "\n",
331 | "(df1, df2)\n",
332 | "\u001b[0m"
333 | ]
334 | },
335 | {
336 | "data": {
337 |        "text/html": ["<!-- HTML table rendering omitted; it duplicates the text/plain output below -->"],
384 | "text/plain": [
385 | " name type price quantity rating\n",
386 | "0 The Da Vinci Code book 15.0 300 4.0\n",
387 | "1 Jurassic Park book 12.0 400 4.5"
388 | ]
389 | },
390 | "metadata": {},
391 | "output_type": "display_data"
392 | },
393 | {
394 | "data": {
395 |        "text/html": ["<!-- HTML table rendering omitted; it duplicates the text/plain output below -->"],
458 | "text/plain": [
459 | " name type price quantity rating\n",
460 | "2 Jurassic Park film 8.0 6 5.0\n",
461 | "3 Matilda book 5.0 80 4.0\n",
462 | "4 Clockwork Orange apple NaN 20 4.0\n",
463 | "5 Walden apple NaN 100 4.5"
464 | ]
465 | },
466 | "metadata": {},
467 | "output_type": "display_data"
468 | }
469 | ],
470 | "source": [
471 | "from IPython.display import display\n",
472 | "\n",
473 | "\n",
474 | "df1, df2 = product_df.llm.query(\"Split the dataframe into two, 1/3 in one, 2/3 in the other. Return (df1, df2).\")\n",
475 | "\n",
476 | "display(df1)\n",
477 | "display(df2)"
478 | ]
479 | },
480 | {
481 | "cell_type": "code",
482 | "execution_count": 8,
483 | "id": "67aea456-2fc9-4e57-b5a7-7cd987479c14",
484 | "metadata": {},
485 | "outputs": [
486 | {
487 | "name": "stdout",
488 | "output_type": "stream",
489 | "text": [
490 | "\u001b[32;1m\u001b[1;3mdf[df['type'] == 'book']\n",
491 | "\u001b[0m"
492 | ]
493 | },
494 | {
495 | "data": {
496 |        "text/html": ["<!-- HTML table rendering omitted; it duplicates the text/plain output below -->"],
551 | "text/plain": [
552 | " name type price quantity rating\n",
553 | "0 The Da Vinci Code book 15.0 300 4.0\n",
554 | "1 Jurassic Park book 12.0 400 4.5\n",
555 | "3 Matilda book 5.0 80 4.0"
556 | ]
557 | },
558 | "execution_count": 8,
559 | "metadata": {},
560 | "output_type": "execute_result"
561 | }
562 | ],
563 | "source": [
564 | "product_df.llm.query(\"Now show me all products that are books.\")"
565 | ]
566 | },
567 | {
568 | "cell_type": "code",
569 | "execution_count": 9,
570 | "id": "c35d7246-6cf2-4148-8726-aa2bd9cc46f9",
571 | "metadata": {},
572 | "outputs": [
573 | {
574 | "name": "stdout",
575 | "output_type": "stream",
576 | "text": [
577 | "\u001b[32;1m\u001b[1;3mdf[df['type'] == 'book'].sort_values(by='quantity').head(1)\n",
578 | "\u001b[0m"
579 | ]
580 | },
581 | {
582 | "data": {
583 | "text/html": [
584 | "\n",
585 | "\n",
598 | "
\n",
599 | " \n",
600 | " \n",
601 | " \n",
602 | " name \n",
603 | " type \n",
604 | " price \n",
605 | " quantity \n",
606 | " rating \n",
607 | " \n",
608 | " \n",
609 | " \n",
610 | " \n",
611 | " 3 \n",
612 | " Matilda \n",
613 | " book \n",
614 | " 5.0 \n",
615 | " 80 \n",
616 | " 4.0 \n",
617 | " \n",
618 | " \n",
619 | "
\n",
620 | "
"
621 | ],
622 | "text/plain": [
623 | " name type price quantity rating\n",
624 | "3 Matilda book 5.0 80 4.0"
625 | ]
626 | },
627 | "execution_count": 9,
628 | "metadata": {},
629 | "output_type": "execute_result"
630 | }
631 | ],
632 | "source": [
633 | "product_df.llm.query(\"Of these, which has the lowest items stocked?\")"
634 | ]
635 | },
636 | {
637 | "cell_type": "code",
638 | "execution_count": 10,
639 | "id": "023d3262-5228-4a85-a24e-f127fed6fb9b",
640 | "metadata": {},
641 | "outputs": [
642 | {
643 | "name": "stdout",
644 | "output_type": "stream",
645 | "text": [
646 | "\u001b[32;1m\u001b[1;3mimport seaborn as sns\n",
647 | "\n",
648 | "sns.catplot(x='type', y='price', data=df[df['type'] != 'apple'], kind='bar')\n",
649 | "\u001b[0m"
650 | ]
651 | },
652 | {
653 | "data": {
654 | "text/plain": [
655 | ""
656 | ]
657 | },
658 | "execution_count": 10,
659 | "metadata": {},
660 | "output_type": "execute_result"
661 | },
662 | {
663 | "data": {
664 | "image/png": "iVBORw0KGgoAAAANSUhEUgAAAWAAAAFgCAYAAACFYaNMAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMywgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/NK7nSAAAACXBIWXMAAAsTAAALEwEAmpwYAAAP2klEQVR4nO3de4yldX3H8ffH3VoXJaJ1lMZlBS2lEjRVpt4wmgqYbWuLaW0BxYo1blu1Xqpu0Bppmv5h0NRe1OKqCASCKUhF20JFvLXWqiNgF1gR6wUX2TLrFS8VKd/+MWeSZbK7Hoc9z3dmzvuVbOY8z3n2/L6TTN757dlzzqSqkCQN717dA0jStDLAktTEAEtSEwMsSU0MsCQ1Wd89wDg2b95cV1xxRfcYkrRc2dvJVbED3r17d/cIknTArYoAS9JaZIAlqYkBlqQmBliSmhhgSWpigCWpiQGWpCYGWJKaGGBJamKAJamJAZakJgZYkpqsik9D0/i2bt3Krl27OPTQQznrrLO6x5G0HwZ4jdm1axe33HJL9xiSxuBTEJLUxABLUhMDLElNDLAkNTHAktTEAEtSEwMsSU0MsCQ1McCS1MQAS1ITAyxJTQywJDUxwJLUxABLUhMDLElNJhbgJOckuS3JdXu575VJKsmDJrW+JK10k9wBnwtsXnoyyWHA04GbJ7i2JK14EwtwVX0c+OZe7nozsBWoSa0tSavBoM8BJzkJuKWqPjfkupK0Eg32O+GSHAS8loWnH8a5fguwBWDTpk0TnEySegy5A34EcATwuSRfATYCVyc5dG8XV9W2qpqtqtmZmZkBx5SkYQy2A66q7cCDF49HEZ6tqt1DzSBJK8kkX4Z2EfBJ4KgkO5O8YFJrSdJqNLEdcFWd+hPuP3xSa0vSauA74SSpiQGWpCYGWJKaGGBJamKAJamJAZakJgZYkpoYYElqYoAlqYkBlqQmBliSmhhgSWpigCWpiQGWpCYGWJKaGGBJamKAJamJAZakJgZYkpoYYElqYoAlqYkBlqQmBliSmhhgSWpigCWpiQGWpCYGWJKaGGBJamKAJamJAZakJhMLcJJzktyW5Lo9zr0xyeeT/FeSf0xyyKTWl6SVbpI74HOBzUvOXQkcU1WPBr4AvGaC60vSijaxAFfVx4FvLjn3waq6c3T4n8DGSa0vSStd53PAfwBc3ri+JLVqCXCSPwPuBC7czzVbkswlmZufnx9uOEkayOABTnI68AzgOVVV+7quqrZV1WxVzc7MzAw2nyQNZf2QiyXZDGwFnlpVPxhybUlaaSb5MrSLgE8CRyXZmeQFwFuAg4Erk1yb5OxJrS9JK93EdsBVdepeTr9rUutJ0mrjO+EkqYkBlqQmBliSmhhgSWpigCWpiQGWpCYGWJKaGGBJamKAJamJAZakJgZYkpoYYElqYoAlqYkBlqQmBliSmhhgSWpigCWpiQGWpCYGWJKaGGBJamKAJanJxH4r8kpx7KvP7x5hUAfvvp11wM27b5+q7/2zb/z97hGkn5o7YElqYoAlqYkBlqQmBliSmhhgSWpigCWpiQGWpCYGWJKaTCzASc5JcluS6/Y498AkVya5afT1AZNaX5JWuknugM8FNi85dwZwVVUdCVw1OpakqTSxAFfVx4FvLjl9EnDe6PZ5wDMntb4krXRDPwf8kKq6dXR7F/CQgdeXpBWj7T/hqqqA2tf9SbYkmUsyNz8/P+BkkjSMoQP8P0l+HmD09bZ9XVhV26pqtqpmZ2ZmBhtQkoYydIDfDzxvdPt5wGUDry9JK8YkX4Z2EfBJ4KgkO5O8AHgDcGKSm4ATRseSNJUm9oHsVXXqPu46flJrStJq4jvhJKmJAZakJgZYkpoYYElqYoAlqYkBlqQmBliSmhhgSWpigCWpiQGWpCYGWJKaGGBJamKAJamJAZakJgZYkpoYYElqYoAlqYkBlqQmBliSmhhgSWpigCWpiQGWpCYGWJKaGGBJamKAJamJAZakJmMHOMnDkpwwur0hycGTG0uS1r6xApzkhcAlwNtHpzYC75vQTJI0FcbdAb8YOA74LkBV3QQ8eFJDSdI0GDfAP6qqOxYPkqwHajIjSdJ0GDfAH0vyWmBDkhOBi4EPLHfRJK9Icn2S65JclOQ+y30sSVqtxg3wGcA8sB34Q+BfgNctZ8EkDwVeCsxW1THAOuCU5TyWJK1m68e8bgNwTlW9AyDJutG5H9yDdTck+TFwEPD1ZT6OJK1a4+6Ar2IhuIs2AB9azoJVdQvwJuBm4FbgO1X1weU8liStZuPugO9TVd9bPKiq7yU5aDkLJnkAcBJwBPBt4OIkp1XVBUuu2wJsAdi0adNylpIGcfNfPKp7BE3Yptdvn8jjjrsD/n6Sxy4eJDkW+OEy1zwB+HJVzVfVj4FLgSctvaiqtlXVbFXNzszMLHMpSVq5xt0Bv5yFnerXgQCHAicvc82bgSeMdtA/BI4H5pb5WJK0ao0V4Kr6TJJfAo4anbpxtHv9qVXVp5JcAlwN3AlcA2xbzmNJ0mq23wAneVpVfTjJby+56xeTUFWXLmfRqjoTOHM5f1eS1oqftAN+KvBh4Df3cl+x8PytJGkZ9hvgqjozyb2Ay6vqHwaaSZKmwk98FURV3QVsHWAWSZoq474M7UNJXpXksCQPXPwz0ckkaY0b92VoJ7PwnO+Llpx/+IEdR5Kmx7gBPpqF+D6ZhRD/G3D2pIaSpGkwboDPY+HD2P92dPzs0bnfm8RQkjQNxg3wMVV19B7HH0lywyQGkqRpMe5/wl2d5AmLB0kej28flqR7ZNwd8LHAfyS5eXS8CbgxyXagqurRE5lOktawcQO8eaJTSNIUGvfDeL466UEkadqM+xywJOkAM8CS1MQAS1ITAyxJTQywJDUxwJLUxABLUhMDLElNDLAkNTHAktTEAEtSEwMsSU0MsCQ1McCS1MQAS1ITAyxJTQywJDUxwJLUpCXASQ5JckmSzyfZkeSJHXNIUqdxfynngfY3wBVV9awk9wYOappDktoMHuAk9weeApwOUFV3AHcMPYckdet4CuIIYB54d5JrkrwzyX0b5pCkVh0BXg88Fvj7qnoM8H3gjKUXJdmSZC7J3Pz8/NAzStLEdQR4J7Czqj41Or6EhSDfTVVtq6rZqpqdmZkZdEBJGsLgAa6qXcDXkhw1OnU8cMPQc0hSt65XQfwJcOHoFRBfAp7fNIcktWkJcFVdC8x2rC1JK4XvhJOkJgZYkpoYYElqYoAlqYkBlqQmBliSmhhgSWpigCWpiQGWpCYGWJKaGGBJamKAJamJAZakJgZYkpoYYElq0vWB7JqQu+5937t9lbRyGeA15vtHPr17BElj8ikISWpigCWpiQGWpCYGWJKaGGBJamKAJamJAZakJgZYkpoYYElqYoAlqYkBlqQmBliSmhhgSWpigCWpSVuAk6xLck2Sf+qaQZI6de6AXwbsaFxfklq1BDjJRuA3gHd2rC9JK0HXDvivga3AXfu6IMmWJHNJ5ubn5wcbTJKGMniAkzwDuK2qPru/66pqW1X
NVtXszMzMQNNJ0nA6dsDHAb+V5CvAe4CnJbmgYQ5JajV4gKvqNVW1saoOB04BPlxVpw09hyR183XAktSk9dfSV9VHgY92ziBJXdwBS1ITAyxJTQywJDUxwJLUxABLUhMDLElNDLAkNTHAktTEAEtSEwMsSU0MsCQ1McCS1MQAS1ITAyxJTQywJDUxwJLUxABLUhMDLElNDLAkNTHAktTEAEtSEwMsSU0MsCQ1McCS1MQAS1ITAyxJTQywJDUxwJLUxABLUhMDLElNBg9wksOSfCTJDUmuT/KyoWeQpJVgfcOadwKvrKqrkxwMfDbJlVV1Q8MsktRm8B1wVd1aVVePbt8O7AAeOvQcktSt9TngJIcDjwE+tZf7tiSZSzI3Pz8/+GySNGltAU5yP+C9wMur6rtL76+qbVU1W1WzMzMzww8oSRPWEuAkP8NCfC+sqks7ZpCkbh2vggjwLmBHVf3V0OtL0krRsQM+Dngu8LQk147+/HrDHJLUavCXoVXVvwMZel1JWml8J5wkNTHAktTEAEtSEwMsSU0MsCQ1McCS1MQAS1ITAyxJTQywJDUxwJLUxABLUhMDLElNDLAkNTHAktTEAEtSEwMsSU0MsCQ1McCS1MQAS1ITAyxJTQywJDUxwJLUxABLUhMDLElNDLAkNTHAktTEAEtSEwMsSU0MsCQ1McCS1KQlwEk2J7kxyReTnNExgyR1GzzASdYBbwV+DTgaODXJ0UPPIUndOnbAjwO+WFVfqqo7gPcAJzXMIUmt1jes+VDga3sc7wQev/SiJFuALaPD7yW5cYDZ1ooHAbu7hxhS3vS87hGmydT9fHFm7ukjXFFVm5ee7AjwWKpqG7Cte47VKMlcVc12z6G1yZ+vA6fjKYhbgMP2ON44OidJU6UjwJ8BjkxyRJJ7A6cA72+YQ5JaDf4URFXdmeQlwL8C64Bzqur6oedY43zqRpPkz9cBkqrqnkGSppLvhJOkJgZYkpoY4FUiyeFJrjsAj/OVJA86EDNpbUry0iQ7knxr8aMCkvx5kld1z7bWrNjXAUtq8yLghKra2T3IWucOeHVZn+TC0e7kkiQHJTk+yTVJtic5J8nPAuzr/KIkG5JcnuSFPd+KVqIkZwMPBy5P8ookb9nLNR9N8uYkc6OfxV9JcmmSm5L85fBTr14GeHU5CnhbVT0S+C7wp8C5wMlV9SgW/kXzx0nus7fzezzO/YAPABdV1TuGG18rXVX9EfB14FeBb+3n0jtG74Y7G7gMeDFwDHB6kp+b+KBrhAFeXb5WVZ8Y3b4AOB74clV9YXTuPOApLIR6b+cXXQa8u6rOH2BmrU2Lb57aDlxfVbdW1Y+AL3H3d7pqPwzw6rL0RdvfXubjfALYnOQef8KIptaPRl/v2uP24rH/tzQmA7y6bEryxNHtZwNzwOFJfmF07rnAx4Ab93F+0etZ+OflWyc/sqR9McCry43Ai5PsAB4AvBl4PnBxku0s7D7Orqr/3dv5JY/1MmBDkrMGm17S3fhWZElq4g5YkpoYYElqYoAlqYkBlqQmBliSmhhgrUlJDknyou45pP0xwFqrDmHhU72kFcsAa616A/CIJNcmuTjJMxfvGH2i3ElJTk9y2ejTvW5KcuYe15yW5NOjv//2JOs6vgmtbQZYa9UZwH9X1S8DbwFOB0hyf+BJwD+Prnsc8DvAo4HfTTKb5JHAycBxo7//f8Bzhhxe08EPzdCaV1UfS/K2JDMsxPa9o9/ODXBlVX0DIMmlwJOBO4Fjgc+MrtkA3NYyvNY0A6xpcT5wGnAKC5+TsWjpe/ELCHBeVb1moNk0pXwKQmvV7cDBexyfC7wcoKpu2OP8iUkemGQD8EwWPqrzKuBZSR4MMLr/YQPMrCnjDlhrUlV9I8knRr/I9PKqevXoU+Tet+TSTwPvBTYCF1TVHECS1wEfTHIv4Mcs/MaHrw72DWgq+GlomgpJDmLhtzc8tqq+Mzp3OjBbVS/pnE3Ty6cgtOYlOQHYAfzdYnyllcAdsCQ1cQcsSU0MsCQ1McCS1MQAS1ITAyxJTf4f290zOgI0FxEAAAAASUVORK5CYII=\n",
665 | "text/plain": [
666 | ""
667 | ]
668 | },
669 | "metadata": {
670 | "needs_background": "light"
671 | },
672 | "output_type": "display_data"
673 | }
674 | ],
675 | "source": [
676 | "product_df.llm.query(\"Now make a seaborn plot of price grouped by type, but exclude those random fruits you made.\")"
677 | ]
678 | },
679 | {
680 | "cell_type": "code",
681 | "execution_count": 11,
682 | "id": "2ad7daa9-f830-492d-b52c-40194be8f045",
683 | "metadata": {},
684 | "outputs": [
685 | {
686 | "name": "stdout",
687 | "output_type": "stream",
688 | "text": [
689 | "\u001b[32;1m\u001b[1;3mimport seaborn as sns\n",
690 | "\n",
691 | "sns.set_style(\"dark\")\n",
692 | "sns.set_palette(\"pastel\")\n",
693 | "\n",
694 | "sns.catplot(x='type', y='price', data=df[df['type'] != 'apple'], kind='bar')\n",
695 | "\u001b[0m"
696 | ]
697 | },
698 | {
699 | "data": {
700 | "text/plain": [
701 | ""
702 | ]
703 | },
704 | "execution_count": 11,
705 | "metadata": {},
706 | "output_type": "execute_result"
707 | },
708 | {
709 | "data": {
710 | "image/png": "iVBORw0KGgoAAAANSUhEUgAAAWAAAAFgCAYAAACFYaNMAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMywgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/NK7nSAAAACXBIWXMAAAsTAAALEwEAmpwYAAAS9ElEQVR4nO3dfUyV9f/H8dfhThneJA40yww1rbRoJBZ5U1GiGAiiGTZb0spa5bHEMnTRdCtWa7rZHyVDU0tcmSSJ2Y2o2ZTu1DSztTS6wYRKRMI4Isj3jyb7+cvsRJ3rfc7h+dganQs4n/fZ2LNP1znXOa7W1tZWAQAcF2I9AAB0VAQYAIwQYAAwQoABwAgBBgAjYdYDeKOpqVnHjzdajwEA7RIT0/WcxwNiB+xyuaxHAID/XEAEGACCEQEGACMEGACMEGAAMEKAAcAIAQYAIwQYAIwQYAAwQoABwAgBBgAjBBgAjBBgADBCgIPM7t2facGC+dq9+zPrUQD8jYB4O0p4b+3aYlVWfiuPp1EJCcOsxwFwHuyAg0xjo+esrwD8FwEGACMEGACMEGAAMEKAAcAIAQYAIwQYAIwQYAAwQoABwAgBBgAjBBgAjBBgADBCgAHACAEGACMEGACM+CzAeXl5SkpKUlpa2p++t3z5cg0ePFi1tbW+Wh4A/J7PApyVlaWioqI/HT9y5Ih27NihPn36+GppAAgIPgtwYmKiunfv/qfjBQUFeuyxx+RyuXy1NAAEBEfPAW/evFmxsbG6/PLLnVwWAPySY58J19jYqKVLl2r58uVOLQkAfs2xHfAPP/ygqqoqZWRkKDk5WdXV1crKytIvv/zi1AgA4Fcc2wEPHjxYFRUVbbeTk5P1xhtvKDo62qkRAMCv+GwHPHv2bGVnZ6uyslKjR4/W2rVrfbUUAAQkn+2AFy1adN7vb9myxVdLA0BA4Eo4ADBCgAHACAEGACMEGACMEGAAMEKAAcAIAQYAIwQYAIwQYAAwQoABwAgBBgAjBBgAjBBgADBCgAHACAEGACMEGACMEGAAMEKAAcAIAQYAIwQYAIwQYAAwQoABwAgBBgAjBBgAjBBgADBCgAHACAEGACMEGACMEGAAMEKAAcBImK/uOC8vT9u2bVPPnj1VVlYmSXr22We1detWhYeH65JLLlFBQYG6devmqxEAwK/5bAeclZWloqKis46NGDFCZWVl2rBhgy699FItXbrUV8sDgN/zWYATExPVvXv3s46NHDlSYWF/bLqvueYaVVdX+2p5APB7ZueA161bp9GjR1stDwDmTAL84osvKjQ0VBMmTLBYHgD8gs+ehPsrJSUl2rZtm1asWCGXy+X08gDgNxwN8Pbt21VUVKRXX31VkZGRTi4NAH7HZwGePXu2PvnkEx07dkyjR4/WzJkzVVhYqKamJuXk5EiS4uPjtXDhQl+NAAB+zWcBXrRo0Z+O3X777b5aDgACDlfCAYARAgwARggwABghwABghAADgBECDABGCDAAGCHAAGCEAAOAEQIMAEYIMAAYIcAAYIQAA4ARAgwARggwABghwABghAADgBECDABGCDAAGCHAAGCEAAOAEZ99KrK/uKBHlMLDOs5/Z0JDXW1fY2K6Gk/jnFPNp1V37IT1GMA/EvQBDg8L0Ruf/GI9hmMaPC1tXzvS4548PMZ6BOAf6zhbQwDwMwQYAIwQYAAwQoABwAgBBgAjBBgAjBBgADDiswDn5eUpKSlJaWlpbcfq6uqUk5OjlJQU5eTk6Pjx475aHgD8ns8CnJWVpaKiorOOFRYWKikpSe+9956SkpJUWFjoq+UBwO/5LMCJiYnq3r37WcfKy8uVmZkpScrMzNTmzZt9tTwA+D1HzwEfPXpUsbGxkqSYmBgdPXrUyeUBwK+YPQnncrnkcrmslgcAc44GuGfPnvr5558lST///LOio6OdXB4A/IqjAU5OTtb69eslSevXr9ctt9zi5PIA4Fd8FuDZs2crOztblZWVGj16tNauXasZM2Zox44dSklJ0c6dOzVjxgxfLQ8Afs9n7we8aNGicx5fuXKlr5YEgIDClXAAYIQAA4ARAgwARggwABghwABghAADgBECDABGCDAAGCHAAGCEAAOAEQIMAEYIMAAYIcAAYIQAA4ARAgwARggwABghwABghAADgBECDABGCDAAGCHAAGCEAAOAEQIMAEYIMAAYIcAAYIQAA4ARrwN8+PBh7dy5U5Lk8XjU0NDgs6EAoCPwKsCvv/663G638vPzJUnV1dV66KGHfDoYAAQ7rwK8evVqrVmzRl26dJEkXXrppaqtrfXpYAAQ7LwKcEREhCIiItpuNzc3+2wgAOgowrz5ocTERL300kvyeDzasWOHiouLlZyc3O5FV6xYobVr18rlcmnQoEEqKChQp06d2n1/ABCIvNoBz5kzR9HR0Ro0aJBee+013XjjjXrkkUfatWBNTY1WrVqldevWqaysTC0tLdq4cWO77gsAAplXO2CPx6NJkyZpypQpkqSWlhZ5PB5FRka2a9Ezvx8WFiaPx6PY2Nh23Q8ABDKvdsDTp0+Xx+Npu+3xeJSTk9OuBXv16qV77rlHN998s0aOHKkuXbpo5MiR7bovAAhkXu2AT548qaioqLbbUVFRamxsbNeCx48fV3l5ucrLy9W1a1fNmjVLpaWlysjIaNf9AdaiL+is0PBw6zHgQy2nTqm2zvP3P/gPeRXgyMhIffnllxoyZIgkaf/+/ercuXO7Fty5c6cuvvhiRUdHS5JSUlK0Z88eAoyAFRoerrp3XrAeAz50wbiZkowCPG/ePM2aNUuxsbFqbW3Vr7/+qsWLF7drwT59+mjv3r1qbGxU586dVVFRoaFDh7brvgAgkHkV4KuvvlqbNm1SZWWlJCkuLk7h7fxfrvj4eI0dO1YTJ05UWFiYrrjiCt1xxx3tui8ACGTnDXBFRYWSkpL03nvvnXX8u+++k/TH6YP2cLvdcrvd7fpdAAgW5w3wp59+qqSkJG3duvWc329vgAEAfxNgt9ut06dPa9SoURo/frxTMwFAh/C3rwMOCQlRUVGRE7MAQIfi1YUYN9xwg5YtW6YjR46orq6u7R8AQPt59SqIt99+Wy6XS8XFxWcdLy8v98lQANAReB3g4uJi7dq1Sy6XS8OGDVN2dravZwOAoObVKYi5c+fq0KFDuuuuuzRt2jQdPHhQc+fO9fVsABDUvNoBf/PNN3r77bfbbl9//fW8KgIA/iWvdsBXXnmlPv/887bbe/fu5fJhAPiXvNoBf/nll8rOzlafPn0kST/99JPi4uKUnp4uSdqwYYPvJgSAIOVVgHkdMAD897wK8EUXXeTrOQCgw/HqHDAA4L9HgAHACAEGACMEGACMEGAAMEKAAcAIAQYAIwQYAIwQYAAwQoABwAgBBgAjBBgAjBBgADBCgAHACAEGACMEGACMEGAAMEKAAcCISYDr6+vldrs1btw4paamas+ePRZjAIAprz4T7r/29NNPa9SoUVqyZImamprk8XgsxgAAU47vgH/77Td9+umnmjx
5siQpIiJC3bp1c3oMADDneICrqqoUHR2tvLw8ZWZmav78+fr999+dHgMAzDke4ObmZh04cEBTp07V+vXrFRkZqcLCQqfHAABzjge4d+/e6t27t+Lj4yVJ48aN04EDB5weAwDMOR7gmJgY9e7dW99++60kqaKiQgMGDHB6DAAwZ/IqiCeffFJz5szRqVOn1LdvXxUUFFiMAQCmTAJ8xRVXqKSkxGJpAPAbXAkHAEYIMAAYIcAAYIQAA4ARAgwARggwABghwABghAADgBECDABGCDAAGCHAAGCEAAOAEQIMAEYIMAAYIcBBJrxT5FlfAfgvAhxkho+5XX36X6nhY263HgXA3zB5Q3b4Tr/LE9Tv8gTrMQB4gR0wABghwABghAADgBECDABGCDAAGCHAAGCEAAOAEQIMAEYIMAAYIcAAYIQAA4ARAgwARggwABghwABgxCzALS0tyszM1P333281AgCYMgvwqlWrNGDAAKvlAcCcSYCrq6u1bds2TZ482WJ5APALJgF+5pln9NhjjykkhFPQADouxwu4detWRUdHa+jQoU4vDQB+xfHPhNu9e7e2bNmi7du36+TJk2poaNCcOXP0/PPPOz0KAJhyPMC5ubnKzc2VJH388cdavnw58QXQIXESFgCMmH4s/XXXXafrrrvOcgQAMMMOGACMEGAAMEKAAcAIAQYAIwQYAIwQYAAwQoABwAgBBgAjBBgAjBBgADBCgAHACAEGACMEGACMEGAAMEKAAcAIAQYAIwQYAIwQYAAwQoABwAgBBgAjBBgAjBBgADBCgAHACAEGACMEGACMEGAAMEKAAcAIAQYAIwQYAIwQYAAwEub0gkeOHNHjjz+uo0ePyuVyacqUKbr77rudHgMAzDke4NDQUD3xxBMaMmSIGhoaNGnSJI0YMUIDBw50ehQAMOX4KYjY2FgNGTJEktSlSxf1799fNTU1To8BAOZMzwFXVVXpq6++Unx8vOUYAGDCLMAnTpyQ2+3WvHnz1KVLF6sxAMCMSYBPnTolt9ut9PR0paSkWIwAAOYcD3Bra6vmz5+v/v37Kycnx+nlAcBvOB7gXbt2qbS0VB999JEyMjKUkZGhDz74wOkxAMCc4y9DGzZsmL7++munlwUAv8OVcABghAADgBECDABGCDAAGCHAAGCEAAOAEQIMAEYIMAAYIcAAYIQAA4ARAgwARggwABghwABghAADgBECDABGCDAAGCHAAGCEAAOAEQIMAEYIMAAYIcAAYIQAA4ARAgwARggwABghwABghAADgBECDABGCDAAGCHAAGCEAAOAEZMAb9++XWPHjtWYMWNUWFhoMQIAmHM8wC0tLVq4cKGKioq0ceNGlZWV6eDBg06PAQDmHA/wvn371K9fP/Xt21cRERG67bbbVF5e7vQYAGAuzOkFa2pq1Lt377bbvXr10r59+877O+HhoYqJ6druNScPj2n37yJw/Ju/kX/rgnEzzdaGM3zx98WTcABgxPEA9+rVS9XV1W23a2pq1KtXL6fHAABzjgf4qquu0nfffacff/xRTU1N2rhxo5KTk50eAwDMOX4OOCwsTPn5+br33nvV0tKiSZMm6bLLLnN6DAAw52ptbW21HgIAOiKehAMAIwQYAIwQ4ABRVVWltLS0f30/ycnJqq2t/Q8mQrBatWqVUlNTlZiY2PZWAS+88IKWLVtmPFnwcfxJOAD+rbi4WCtWrDjrgin4BjvgANLc3Kzc3FylpqbK7XarsbFRFRUVyszMVHp6uvLy8tTU1CRJf3n8DI/Ho3vvvVevv/66xUOBn8rPz1dVVZXuu+8+rVixQgsXLvzTz9x111165plnlJWVpdTUVO3bt08PP/ywUlJStHjxYoOpAxcBDiCVlZW68847tWnTJkVFRenll1/WE088ocWLF2vDhg1qaWlRcXGxTp48ec7jZ/z+++964IEHlJaWpilTphg+IvibhQsXKjY2VitXrlS3bt3+8ufCw8NVUlKi7OxsPfjgg8rPz1dZWZnefPNNHTt2zMGJAxsBDiAXXnihrr32WknShAkTVFFRoYsvvlhxcXGSpIkTJ+qzzz5TZWXlOY+f8eCDDyorK0uZmZmOPwYEhzMXTw0aNEiXXXaZYmNjFRERob59+551pSvOjwAHEJfLddbt8+1QzichIUEffviheAk42isiIkKSFBIS0vbvZ243NzdbjRVwCHAA+emnn7Rnzx5JUllZmYYOHarDhw/r+++/lySVlpYqMTFRcXFx5zx+htvtVvfu3bVgwQLnHwSANgQ4gMTFxWn16tVKTU1VfX29pk+froKCAs2aNUvp6elyuVyaOnWqOnXqdM7j/9f8+fN18uRJPffcc0aPBgCXIgOAEXbAAGCEAAOAEQIMAEYIMAAYIcAAYIQAIyjV19dr9erV1mMA50WAEZTq6+u1Zs0a6zGA8+J1wAhKjz76qMrLyxUXF6d+/fppwoQJuvXWWyWp7R3l6uvr9f7776uhoUE1NTWaMGGCHn74YUl/XD34yiuv6NSpU4qPj9dTTz2l0NBQy4eEIMQOGEEpNzdXl1xyiUpLSzVt2jSVlJRIkn777Tft2bNHN910kyTpiy++0JIlS/TWW2/pnXfe0RdffKFDhw5p06ZNWrNmjUpLSxUSEqINGzYYPhoEK96QHUFv+PDhWrBggWpra/Xuu+9q7NixCgv740//hhtuUI8ePSRJY8aM0a5duxQWFqb9+/dr8uTJkv547+SePXuazY/gRYDRIWRkZOitt97Sxo0bVVBQ0Hb8/7/DnMvlUmtrqyZOnKjc3Fynx0QHwykIBKWoqCidOHGi7XZWVpZWrlwpSRo4cGDb8R07dqiurk4ej0ebN29WQkKCkpKS9O677+ro0aOSpLq6Oh0+fNjZB4AOgR0wglKPHj2UkJCgtLQ0jRo1SnPnzlX//v3bnog74+qrr9bMmTPbnoS76qqrJEmPPPKI7rnnHp0+fVrh4eHKz8/XRRddZPFQEMR4FQQ6hMbGRqWnp+vNN99U165dJUklJSXav3+/8vPzjadDR8UpCAS9nTt3avz48Zo2bVpbfAF/wA4YAIywAwYAIwQYAIwQYAAwQoABwAgBBgAj/wOSJV42B8vQEAAAAABJRU5ErkJggg==\n",
711 | "text/plain": [
712 | ""
713 | ]
714 | },
715 | "metadata": {},
716 | "output_type": "display_data"
717 | }
718 | ],
719 | "source": [
720 | "product_df.llm.query(\"Can you use a dark theme, and pastel colors?\")"
721 | ]
722 | },
723 | {
724 | "cell_type": "code",
725 | "execution_count": 12,
726 | "id": "42811777-24b0-4b52-bc21-3d10247537d7",
727 | "metadata": {},
728 | "outputs": [
729 | {
730 | "data": {
731 | "text/plain": [
732 | ""
733 | ]
734 | },
735 | "execution_count": 12,
736 | "metadata": {},
737 | "output_type": "execute_result"
738 | },
739 | {
740 | "data": {
741 | "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXYAAAEYCAYAAABIoN1PAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMywgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/NK7nSAAAACXBIWXMAAAsTAAALEwEAmpwYAAAW7klEQVR4nO3de1SVdb7H8c/e3FRUJDTEljqad8XbUcsmschqknDSJMvUsqzGGXVMmzGzUDSkpoulp5Oe0EwmndQ0FTQ1Wk6epMzLhDrZpI5KXlilIBe5bvb5w+PuUCJb3ZsHfvv9WsvV2g+bh+9mrd48PM/Db9ucTqdTAABj2K0eAADgWYQdAAxD2AHAMIQdAAxD2AHAMIQdAAzjb/UAklRRUSGHg7suAeBKBAT4XXJ7rQi7w+FUbu55q8cAgDqlWbNGl9zOqRgAMAxhBwDDEHYAMEytOMcOwHc4HOXKyflB5eWlVo9SZ/j7Byo0tJn8/NxLNmEHUKNycn5QvXoNFBzcXDabzepxaj2n06nCwjzl5Pygpk0j3PocTsUAqFHl5aUKDm5M1N1ks9kUHNz4in7DIewAahxRvzJX+v0i7ABQheTkhfrqqy+tHuOKcY4dPiusSZDsAYEe2VdFWanO5JZ4ZF++pklosAL8PXeMWVZeodycwmvej8Ph0Lhxv/PARDWPsMNn2QMCpbQ5ntlXzAuSCPvVCPC3a9nOaw/xRWP6BVf7nFOnTmrq1Inq2LGz/vWvg2rTpq2ef362Ro2KU3T0ndq160uNHDlGX36ZoVtuuVW33z5I33xzQG+++ZqKiooUGBigN998W0FB9bRw4X9q797dKisr1dChcbrvvvs99lquFmEH4JOOHz+mZ599Qd2799TcuQlas2aVJCkkJERLlrwvSfryywxJUllZmeLjn9Ps2XPVuXNXFRYWKDAwSKmp6xQcHKzk5GUqLS3V+PGPq1+/m9WixQ2WvS6JsAPwUddfH67u3XtKku6+e7BWr/6bJOmOO+76xXOPHz+mpk3D1LlzV0lScHBDSdJXX32hQ4cOadu2TyVJhYUF+v77LMIOAFb45Z0mFx7Xq1ff7X04nU49/fSfdNNN/T042bXjrhgAPik7+7T278+UJG3d+rHr6P1SWrVqrR9/PKNvvjkgSTp/vlDl5eXq16+/PvpotcrLyyVdOLIvKiry+uzV4YgdgE9q1aq11qxZpaSk2frVr9po6NDh+vDDDy753ICAAM2ePVfz5r2ikpISBQUF6Y03/kuxsffp9OlTeuyxh+V0OtWkSaiSkl6r4VfySzan02n5O1yUlTlYjx01rlmzRh67K0YxL+iHH/I9sy/DnT59TM2bt3Y9tuJ2x1OnTurPf56slJSVHvu63vbz75tU9XrsHLEDsJQn7jlHZZxjB+BzIiJa1Kmj9StF2AHAMIQdAAxD2AHAMIQdAAxD2AHAw1auXK7i4mLX42eemaT8/Hzl5+e71qTxJm53BGApTy6fLNWOJZRXrlyhu+4arHr16kmSXn11vqQL98+vXbtKw4bFefXrE3YAlvLk8smSe0sov/feYm3alKbQ0FBdf324OnbsrB07tmvChMnq1KmLcnNzNW7caK1evUGnTp3UnDnxKi6+sFTA00//WZGRPbRnzy4tWfLfatKkiY4cOayOHTsrPn6OVq/+QD/++IMmTXpKISFNtGDBIg0fHqvk5BQtXLhAJ06c0KOPjlTfvjfp7NkzGjgwWlFRt0mSEhKeV3T0IA0YcNs1fQ8IOwCfcvDgN0pP36KlS5fL4SjXY4+NUseOnat8fmjodZo37y0FBQUpK+u4Zs2aocWLUyRJ3333rVJSVqpp02YaP/5xZWZ+rbi4B/XBB+9r/vxFatKkSaV9/e53E3XkyGEtXbpckrR3726tXLlcUVG3qaCgQPv3Z2rGjFnX/BoJOwCfkpm5V1FRt7tOk9x6a9Rln19eXq55817Wd9/9S3a7n7Kyjrk+1rlzV11/fbgkqX37Djp9+qR69Ojp9iy9ev2HXnvtZeXk5Ojvf0/XwIHR8ve/9ixz8RQAJPn5+auiokKSVFr606mcDz54X6GhYVq6dIWSk5e5VnKUpMDAn64N2O12ORyOK/66v/nNYG3ZslFpaRsUEzPkGl7BTwg7AJ/So0dvbd++TSUlxTp/vlCff75dkhQREaFvvz0oSdq2Ld31/MLCAoWFNZXdbtfmzRvdineDBg10/vwv18C5sL3ygoeDB8dq5coVkqQ2bdpe5auqjLAD8CkdO3ZSdPSdeuSRkZo6dZI6deoiSXroodH66KPVGjt2pHJzc13PHzo0Th9/nKpHHnlIx44dVf361b8Rx5AhQzV16kRNnPhUpe0hIU0UGdlDo0c/oLfeelOSdN11YWrduo1iYmI99hpZthc+i2V7rfHz5Wetvt1x8eJFql+/gUaOHO2xGa5EcXGxxowZoSVL3lfDhg2rfB7L9gKoMy5E2Nr7zq3y1Vdf6qWX5mjEiJGXjfqVIuwAfNrjjz9V/ZO8pG/fm/Thh6ke3y/n2AHAMIQdAAxD2AHAMIQdAAxD2AGgClUtv1vbcVcMAEuFXFdPgX4BHttfqaNM584WV//E/+N0OuV0OmW3//I4t6rld2s7wg7AUoF+AXr95DKP7W9KizGSLh/2U6dOasqUCerSpZu+/fagunTpqsOHD6mkpES3336HHn/8Ka1a9bcql98tKjqvZ56ZpO7de2rfvkw1a9ZML730moKC6umbbw7opZfmyGazq2/fm/TFF58rJWWlx16fOzgVA8Anff99loYOjdNf/7pSEyZM1uLFKXrvvRXau3e3Dh36TnFxD6pp02aaP3+RFixYdMnPHzbswuc3bNhI27Z9KkmaOzdBf/rTc1q6dPklfwuoCRyxA/BJzZtHqFu3SEnSp59u1fr1a+VwOHTmzI86evSI2rVrf9nPj4hoofbtO0q6sP7MqVMnlZ+fr/Pnz6tbt+6SpDvv/I127Nju3RdyCYQdgE+6eN785MkTWrHir3rnnWVq3LixEhNnqbS0tNrPDwj46bqA3e4nh6P2LIvAqRgAPq2wsFD16tVXw4YNdfbsGX3xxQ7Xx6pafrcqjRo1UoMGDXTgwH5JUnr6Fo/P6w6O2AH4tPbtO6hDh44aOXK4wsPDFRnZw/Wxi8vvNm3a7JLn2S/l2Wfj9Ze/vCibza6ePXt7dHEvd7FsL3wWy/Za4+fLz1p9u6OnnT9/Xg0aNJAkpaQs1ZkzP2ry5Geueb8s2wugzrgQYetC7GkZGf+jlJSlcjjK1bx5hJ57blaNz0DYAcCD7rjjLt1xx12WzsDFUwAwDGEHUONqwaW9OuVKv18ePxXzySefaNu2bSooKNDw4cN16623evpLAKjD/P0DVViYp+DgxrLZbFaPU+s5nU4VFubJ39/994V1K+zTp0/Xtm3bFBYWptTUn97G6bPPPlNiYqIqKioUFxenJ598UoMGDdKgQYN07tw5vfzyy4QdQCWhoc2Uk/ODCgpyrR6lzvD3D1RoaDP3n+/Ok4YNG6ZRo0Zp2rR
prm0Oh0OzZ8/Wu+++q/DwcA0fPlzR0dFq166dJOntt9/Www8/fIXjAzCdn5+/mjaNsHoMo7l1jr1v374KCQmptC0zM1OtW7dWy5YtFRgYqJiYGKWnp8vpdOqVV15RVFSUunbt6pWhAQBVu+pz7NnZ2WrevLnrcXh4uDIzM5WSkqKMjAzl5+fr2LFjeuihhzwyKADAPR6/eDpmzBiNGTPG07sFALjpqm93DA8P1+nTp12Ps7OzFR4e7pGhAABX76rDHhkZqaNHjyorK0ulpaVKS0tTdHS0J2cDAFwFt07FTJkyRTt37lROTo6ioqI0ceJExcXFKT4+XuPGjZPD4dD999+v9u0vvzA9AMD7WN0RPovVHVHXVbW6I0sKAIBhCDsAGIawA4BhCDsAGIawA4BhCDsAGIawA4BhCDsAGIawA4BhCDsAGIawA4BhCDsAGIawA4BhCDsAGIawA4BhCDsAGIawA4BhCDsAGMat9zzFpYU1CZI9INAj+6ooK9WZ3BKP7AuAbyPs18AeEOix98y0x7wgibADuHacigEAwxB2ADAMYQcAwxB2ADAMYQcAwxB2ADAMYQcAwxB2ADAMYQcAwxB2ADAMYQcAwxB2ADAMYQcAwxB2ADAMYQcAwxB2ADAMYQcAwxB2ADAMYQcAwxB2ADAMYQcAwxB2ADAMYQcAwxB2ADAMYQcAwxB2ADAMYQcAwxB2ADAMYQcAwxB2ADAMYQcAwxB2ADAMYQcAwxB2ADAMYQcAwxB2ADAMYQcAwxB2ADAMYQcAwxB2ADAMYQcAwxB2ADAMYQcAwxB2ADAMYQcAwxB2ADAMYQcAwxB2ADAMYQcAwxB2ADAMYQcAwxB2ADAMYQcAwxB2ADAMYQcAwxB2ADAMYQcAwxB2ADAMYQcAwxB2ADAMYQcAwxB2ADAMYQcAwxB2ADAMYQcAwxB2ADAMYQcAwxB2ADAMYQcAwxB2ADAMYQcAwxB2ADAMYQcAwxB2ADAMYQcAwxB2ADAMYQcAwxB2ADAMYQcAwxB2ADCMv6d3mJWVpbffflsFBQWaP3++p3cPH9ckNFgB/hyPAJfjVtinT5+ubdu2KSwsTKmpqa7tn332mRITE1VRUaG4uDg9+eSTatmypebOnatJkyZ5bWj4rgB/u5btLPTIvsb0C/bIfoDaxq1Dn2HDhik5ObnSNofDodmzZys5OVlpaWlKTU3VoUOHvDIkAMB9boW9b9++CgkJqbQtMzNTrVu3VsuWLRUYGKiYmBilp6d7ZUgAgPuu+mRldna2mjdv7nocHh6u7Oxs5eTkKD4+Xv/85z+1aNEijwwJAHCfxy+ehoaGavbs2Z7eLQDATVd9xB4eHq7Tp0+7HmdnZys8PNwjQwEArt5Vhz0yMlJHjx5VVlaWSktLlZaWpujoaE/OBgC4Cm6dipkyZYp27typnJwcRUVFaeLEiYqLi1N8fLzGjRsnh8Oh+++/X+3bt/f2vACAargV9tdff/2S2wcOHKiBAwd6dCAAwLXhT/gAwDCEHQAMQ9gBwDCEHQAM4/E/UAKAqnhydc6y8grl5nhmQTjTEHYANYbVOWsGp2IAwDCEHQAMQ9gBwDCEHQAMQ9gBwDCEHQAMQ9gBwDCEHQAMQ9gBwDCEHQAMQ9gBwDCEHQAMwyJgAOomR7maNWt0zbupKCvVmdwSDwxUexB2AHWTn7+UNuead2OPeUGSWWHnVAwAGIawA4BhCDsAGIawA4BhCDsAGIawA4BhCDsAGIb72AH4tHKnwyN/6CRJpY4ynTtb7JF9XQvCDsCn+dv89PrJZR7Z15QWYyRZH3ZOxQCAYQg7ABiGsAOAYQg7ABiGsAOAYbgrppYw8ZYrANYg7LWEibdc+RJP/WDmhzI8gbADHuCpH8z8UIYncI4dAAxD2AHAMIQdAAxD2AHAMIQdAAxD2AHAMIQdAAxD2AHAMDan0+m0eggAgOdwxA4AhiHsAGAYwg4AhiHsAGAYwg4AhiHsAGAYwg4AhiHsAGAYwu5FRUVFVo8AwAcRdi/Ys2ePBg8erHvuuUeSdPDgQc2aNcvaoeC2VatW/WLbq6++asEkuBrnzp3TwYMHdeDAAdc/X8N7nnpBUlKSFi9erPHjx0uSOnXqpF27dlk8Fdy1ZcsWBQUFaciQIZKkhIQElZSUWDwV3PHGG29o7dq1atWqlWubzWbTsmWeeaP4uoKwe0lERESlx3Y7vxzVFQsWLND48eNlt9u1fft2NWrUSHPnzrV6LLhh06ZN2rp1qwIDA60exVLUxgsiIiK0Z88e2Ww2lZWVafHixbrxxhutHgvVyM3NVW5uroqLi/Xiiy8qOTlZwcHBmjBhgnJzc60eD27o0KGD8vPzrR7Dcqzu6AVnz55VYmKiMjIy5HQ69etf/1ozZsxQaGio1aPhMqKjo2Wz2eR0Ol3/vchmsyk9Pd3C6eCOffv26fe//706dOiggIAA1/aFCxdaOFXNI+wAjBETE6MRI0aoQ4cOlU5/9uvXz8Kpah7n2D1ozpw5stlsVX78+eefr8FpcLXKysq0YsUK1wXvfv36acSIEZWOAFE71atXT2PGjLF6DMtxxO5Ba9euvezHhw4dWkOT4FrMmDFD5eXluu+++yRJ69evl91uV2JiorWDoVpJSUkKDAxUdHR0pQuoXbt2tXCqmkfYvaigoECS1LBhQ4snwZUYMmSI1q9fX+021D6jR4/+xTZud4RH7Nu3T88995wKCwvldDpdt8t169bN6tHgBj8/Px0/ftx1L3RWVpb8/PwsngruSElJsXqEWoEjdi+IjY3VzJkz1adPH0nSrl27lJCQoA0bNlg8GdyRkZGh6dOnq2XLlnI6nTp58qTmzp2rm2++2erRUIV33333sh8fO3ZsDU1SO3DE7gV+fn6uqEtSnz595O/Pt7qu6N+/v7Zs2aIjR45Iktq2bevzf/BS2xUWFlo9Qq3CEbsXJCYmqqSkRDExMbLZbNq4cWOlP1H3tQs5dQ13xaCuI+xecKkLOBf54oWcuoa7Yuqed955R0888USVtxz72q3GnB/wAi7g1G379u2rdAdM//79Xb9toXZKTk7WE088oZYtWyokJMTqcSxH2L0gJydHb731lnbv3i2bzabevXvrD3/4A0sK1BHcFVP3hIWFKTs7W2vWrFFKSop8/UQEp2K8YOzYserTp4/rKG/Dhg3auXOnli5dau1gcMv/vytGkk6cOMFdMbVcSkqKli9frqysLIWHh7u2X1z3x9fW+SHsXnDvvfcqNTW10rbY2Fhud6wjSkpKtGTJEmVkZKhx48aKjIzUo48+qqCgIKtHQzVmzpyphIQEq8ewHGH3gqSkJHXv3t31Dkoff/yx9u3bp2nTplk8Gdzxxz/+UQ0bNlRsbKwkKTU1VXl5eZo/f77FkwHuIexe0KtXLxUVFbnOyzocDtWvX1/Shbti9uzZY+V4qMbgwYO1cePGarcBtRUXT71g7969ys3N1bFjxyq9pZqvLR1aV3Xp0kX/+Mc/1LNnT0nS119/zXIQqFMIuxesWrVKy5Yt0+nTp9WpUyd9/fXX6tWrF2Gv5S6eei
kvL9eDDz6oFi1aSJJOnjyptm3bWjkacEUIuxcsW7ZMq1ev1gMPPKCUlBQdPnxY8+bNs3osVMPX3mUH5iLsXhAYGOi6g6K0tFQ33nij/v3vf1s8Fapzww03WD0C4BGE3QuaN2+uvLw8DRo0SGPHjlXjxo1dv9YDgLdxV4yX7dy5U/n5+RowYAArBAKoEYQdAAxjr/4pAIC6hLADgGEIO3xOXl6e3n//favHALyGsMPn5OXlacWKFVaPAXgNF0/hc55++mmlp6erTZs2at26tYYMGaJBgwZJkqZOnap77rlHeXl52rp1qwoKCpSdna0hQ4ZowoQJkqR169YpJSVFZWVl6tGjh2bOnMl67ahVOGKHz5k6dapatWqldevWadSoUVqzZo0kKT8/X3v37tVtt90m6cI7Kc2fP1/r1693rdB5+PBhbdq0SStWrNC6detkt9tZjhm1Dn+gBJ/Wr18/JSQk6OzZs9q8ebPuvvtu+ftf+N/illtucb3r1Z133qndu3fL399f+/fv1/DhwyVJxcXFCgsLs2x+4FIIO3zeb3/7W61fv15paWlKSkpybf/5myLbbDY5nU4NHTpUU6dOrekxAbdxKgY+Jzg4WIWFha7Hw4YN03vvvSdJateunWv7559/rtzcXBUXF+uTTz5R79691b9/f23evFlnzpyRJOXm5urEiRM1+wKAanDEDp8TGhqq3r17695779WAAQM0bdo0tW3b1nUB9aLu3btr4sSJrounkZGRkqTJkyfrscceU0VFhQICAhQfH88CYqhVuCsGPq+oqEixsbFau3atGjVqJElas2aN9u/fr/j4eIunA64cp2Lg03bs2KHBgwdr1KhRrqgDdR1H7ABgGI7YAcAwhB0ADEPYAcAwhB0ADEPYAcAwhB0ADPO/BxTvbkC4RV8AAAAASUVORK5CYII=\n",
742 | "text/plain": [
743 | ""
744 | ]
745 | },
746 | "metadata": {},
747 | "output_type": "display_data"
748 | }
749 | ],
750 | "source": [
751 | "product_df.llm.query(\"Group by type and take the mean of all numeric columns.\", yolo=True).llm.query(\"Make a bar plot of the result and use a log scale.\", yolo=True)"
752 | ]
753 | }
754 | ],
755 | "metadata": {
756 | "kernelspec": {
757 | "display_name": "Python 3 (ipykernel)",
758 | "language": "python",
759 | "name": "python3"
760 | },
761 | "language_info": {
762 | "codemirror_mode": {
763 | "name": "ipython",
764 | "version": 3
765 | },
766 | "file_extension": ".py",
767 | "mimetype": "text/x-python",
768 | "name": "python",
769 | "nbconvert_exporter": "python",
770 | "pygments_lexer": "ipython3",
771 | "version": "3.9.12"
772 | }
773 | },
774 | "nbformat": 4,
775 | "nbformat_minor": 5
776 | }
777 |
--------------------------------------------------------------------------------
/pyproject.toml:
--------------------------------------------------------------------------------
1 | [tool.poetry]
2 | name = "yolopandas"
3 | version = "0.0.6"
4 | description = "Interact with Pandas objects via LLMs and langchain."
5 | authors = []
6 | license = "MIT"
7 | readme = "README.md"
8 | repository = "https://www.github.com/ccurme/yolopandas"
9 |
10 | [tool.poetry.dependencies]
11 | python = ">=3.9,<4.0"
12 | ipython = "^8.8.0"
13 | langchain = ">= 0.0.60, < 1"
14 | openai = "^0"
15 | pandas = "^1.4"
16 |
17 | [tool.poetry.group.test.dependencies]
18 | pytest = "^7.2.0"
19 | pytest-cov = "^4.0.0"
20 |
21 | [tool.poetry.group.lint.dependencies]
22 | black = "^22.10.0"
23 | isort = "^5.10.1"
24 | flake8 = "^6.0.0"
25 |
26 | [tool.poetry.group.typing.dependencies]
27 | mypy = "^0.991"
28 |
29 | [tool.poetry.group.dev.dependencies]
30 | jupyter = "^1.0.0"
31 |
32 | [tool.isort]
33 | profile = "black"
34 |
35 | [tool.mypy]
36 | ignore_missing_imports = "True"
37 | disallow_untyped_defs = "True"
38 | exclude = ["tests"]
39 |
40 | [build-system]
41 | requires = ["poetry-core"]
42 | build-backend = "poetry.core.masonry.api"
43 |
--------------------------------------------------------------------------------
/tests/__init__.py:
--------------------------------------------------------------------------------
1 | import os
2 |
3 | TEST_DIRECTORY = os.path.dirname(os.path.abspath(__file__))
4 |
--------------------------------------------------------------------------------
/tests/data/product_df.json:
--------------------------------------------------------------------------------
1 | [
2 | {
3 | "name":"The Da Vinci Code",
4 | "type":"book",
5 | "price":15,
6 | "quantity":300,
7 | "rating":4.0
8 | },
9 | {
10 | "name":"Jurassic Park",
11 | "type":"book",
12 | "price":12,
13 | "quantity":400,
14 | "rating":4.5
15 | },
16 | {
17 | "name":"Jurassic Park",
18 | "type":"film",
19 | "price":8,
20 | "quantity":6,
21 | "rating":5.0
22 | },
23 | {
24 | "name":"Matilda",
25 | "type":"book",
26 | "price":6,
27 | "quantity":80,
28 | "rating":4.0
29 | }
30 | ]
--------------------------------------------------------------------------------
/tests/integration_tests/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ccurme/yolopandas/d008d0b73252045927289016c47667fd2247cc9f/tests/integration_tests/__init__.py
--------------------------------------------------------------------------------
/tests/integration_tests/test_llm_accessor.py:
--------------------------------------------------------------------------------
1 | import os
2 | import unittest
3 |
4 | from tests import TEST_DIRECTORY
5 | from yolopandas import pd
6 | from yolopandas.utils.query_helpers import run_query_with_cost
7 |
8 |
9 | class TestLLMAccessor(unittest.TestCase):
10 | @classmethod
11 | def setUpClass(cls):
12 | test_data_path = os.path.join(TEST_DIRECTORY, "data", "product_df.json")
13 | cls.product_df = pd.read_json(test_data_path)
14 |
15 | def test_basic_use(self):
16 | self.product_df.llm.reset_chain(use_memory=False)
17 | result = self.product_df.llm.query(
18 | "What is the price of the highest-priced book?",
19 | yolo=True,
20 | )
21 | expected_result = 15
22 | self.assertEqual(expected_result, result)
23 |
24 | result = run_query_with_cost(
25 | self.product_df, "What is the price of the highest-priced book?", yolo=True
26 | )
27 | self.assertEqual(expected_result, result)
28 |
29 | result = self.product_df.llm.query(
30 | "What is the average price of products grouped by type?",
31 | yolo=True,
32 | )
33 | expected = self.product_df.groupby("type")["price"].mean()
34 | pd.testing.assert_series_equal(expected, result)
35 |
36 | result = self.product_df.llm.query(
37 | "Give me products that are not books.",
38 | yolo=True,
39 | )
40 | expected = self.product_df[self.product_df["type"] != "book"]
41 | pd.testing.assert_frame_equal(expected, result)
42 |
43 | def test_memory(self):
44 | self.product_df.llm.reset_chain(use_memory=True)
45 | _ = self.product_df.llm.query(
46 | "Show me all products that are books.",
47 | yolo=True,
48 | )
49 | result = self.product_df.llm.query(
50 | "Of these, which has the fewest items stocked?",
51 | yolo=True,
52 | )
53 | expected = (
54 | self.product_df[self.product_df["type"] == "book"]
55 | .sort_values(by="quantity")
56 | .head(1)
57 | )
58 | pd.testing.assert_frame_equal(expected, result)
59 |
--------------------------------------------------------------------------------
/tests/unit_tests/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ccurme/yolopandas/d008d0b73252045927289016c47667fd2247cc9f/tests/unit_tests/__init__.py
--------------------------------------------------------------------------------
/tests/unit_tests/test_llm_accessor.py:
--------------------------------------------------------------------------------
1 | import os
2 | import unittest
3 | from unittest.mock import Mock, patch
4 |
5 | from yolopandas import pd
6 | from langchain.chains.base import Chain
7 | from tests import TEST_DIRECTORY
8 |
9 |
10 | def _get_mock_chain(response: str) -> Chain:
11 | """Make mock Chain for unit tests."""
12 | mock_chain = Mock(spec=Chain)
13 | mock_chain.run.return_value = response
14 |
15 | return mock_chain
16 |
17 |
18 | class TestLLMAccessor(unittest.TestCase):
19 | @classmethod
20 | def setUpClass(cls):
21 | test_data_path = os.path.join(TEST_DIRECTORY, "data", "product_df.json")
22 | cls.product_df = pd.read_json(test_data_path)
23 |
24 | @patch("yolopandas.llm_accessor.get_chain")
25 | def test_basic_use(self, mock):
26 | mock.return_value = _get_mock_chain("df[df['type'] == 'book']['price'].max()")
27 | result = self.product_df.llm.query(
28 | "What is the price of the highest-priced book?",
29 | yolo=True,
30 | )
31 | expected_result = 15
32 | self.assertEqual(expected_result, result)
33 |
34 | mock.return_value = _get_mock_chain("df.groupby('type')['price'].mean()")
35 | self.product_df.llm.reset_chain()
36 | result = self.product_df.llm.query(
37 | "What is the average price of products grouped by type?",
38 | yolo=True,
39 | )
40 | expected_result = self.product_df.groupby("type")["price"].mean()
41 | pd.testing.assert_series_equal(expected_result, result)
42 |
43 | mock.return_value = _get_mock_chain("df[df['type'] != 'book']")
44 | self.product_df.llm.reset_chain()
45 | result = self.product_df.llm.query(
46 | "Give me products that are not books.",
47 | yolo=True,
48 | )
49 | expected = self.product_df[self.product_df["type"] != "book"]
50 | pd.testing.assert_frame_equal(expected, result)
51 |
52 | @patch("yolopandas.llm_accessor.get_chain")
53 | def test_sliced(self, mock):
54 | mock.return_value = _get_mock_chain("df[df['type'] == 'book']['price'].max()")
55 | self.product_df.llm.reset_chain()
56 | result = self.product_df[["name", "type", "price", "rating"]].llm.query(
57 | "What is the price of the highest-priced book?", yolo=True
58 | )
59 | expected_result = 15
60 | self.assertEqual(expected_result, result)
61 |
62 | @patch("yolopandas.llm_accessor.get_chain")
63 | def test_multi_line(self, mock):
64 | """Test that we can accommodate multiple lines in the LLM response."""
65 | query = """
66 | Add a column `new_column` to the dataframe which is range 1 - number of rows,
67 | then return the mean of this column by each type.
68 | """
69 | mock_response = (
70 | "df['new_column'] = range(1, len(df) + 1)\n"
71 | "df.groupby('type')['new_column'].mean()"
72 | )
73 | mock.return_value = _get_mock_chain(mock_response)
74 | self.product_df.llm.reset_chain()
75 | result = self.product_df.llm.query(query, yolo=True)
76 | expected = (
77 | self.product_df.assign(new_column=range(1, len(self.product_df) + 1))
78 | .groupby("type")["new_column"]
79 | .mean()
80 | )
81 | pd.testing.assert_series_equal(expected, result)
82 |
83 | @patch("yolopandas.llm_accessor.get_chain")
84 | def test_multiline_exec(self, mock):
85 | """Test a multiline command when the final line should be exec'd not eval'd."""
86 | query = """
87 | Add a column `new_column` to the dataframe which is range 1 - number of rows,
88 | then add a column `foo` which is always the value 1
89 | """
90 | mock_response = "df['new_column'] = range(1, len(df) + 1)\n" "df['foo'] = 1"
91 | mock.return_value = _get_mock_chain(mock_response)
92 | self.product_df.llm.reset_chain()
93 | self.product_df.llm.query(query, yolo=True)
94 | expected_df = self.product_df.assign(
95 | new_column=range(1, len(self.product_df) + 1)
96 | ).assign(foo=1)
97 | pd.testing.assert_frame_equal(expected_df, self.product_df)
98 |
--------------------------------------------------------------------------------
/yolopandas/__init__.py:
--------------------------------------------------------------------------------
1 | from yolopandas import llm_accessor
2 | from yolopandas.chains import set_llm
3 | from yolopandas.llm_accessor import pd
4 |
--------------------------------------------------------------------------------
/yolopandas/chains.py:
--------------------------------------------------------------------------------
1 | import os
2 | from typing import Optional
3 |
4 | from langchain import LLMChain, OpenAI, PromptTemplate
5 | from langchain.chains.base import Chain
6 | from langchain.chains.conversation.memory import ConversationBufferMemory
7 | from langchain.llms.base import BaseLLM
8 | from langchain.llms.loading import load_llm
9 |
10 |
11 | DEFAULT_LLM = None
12 | # Default template, no memory
13 | TEMPLATE = """
14 | You are working with a pandas dataframe in Python. The name of the dataframe is `df`.
15 | The dataframe has the following columns: {df_columns}.
16 |
17 | You should execute code as commanded to either provide information to answer the question or to
18 | do the transformations required.
19 |
20 | You should not assign any variables; you should return a one-liner in Pandas.
21 |
22 | This is your objective: {query}
23 |
24 | Go!
25 |
26 | ```python
27 | print(df.head())
28 | ```
29 | ```output
30 | {df_head}
31 | ```
32 | ```python"""
33 |
34 | PROMPT = PromptTemplate(template=TEMPLATE, input_variables=["query", "df_head", "df_columns"])
35 |
36 |
37 | # Template with memory
38 | # TODO: add result of executed code to memory; currently we only remember what code was run.
39 | TEMPLATE_WITH_MEMORY = """
40 | You are working with a pandas dataframe in Python. The name of the dataframe is `df`.
41 | The dataframe has the following columns: {df_columns}.
42 |
43 | You are interacting with a programmer. The programmer issues commands and you should translate
44 | them into Python code and execute them.
45 |
46 | This is the history of your interaction so far:
47 | {chat_history}
48 | Human: {query}
49 |
50 | Go!
51 |
52 | ```python
53 | df.head()
54 | ```
55 | ```output
56 | {df_head}
57 | ```
58 | ```python
59 | """
60 | PROMPT_WITH_MEMORY = PromptTemplate(
61 | template=TEMPLATE_WITH_MEMORY, input_variables=["chat_history", "query", "df_head", "df_columns"]
62 | )
63 |
64 |
65 | def set_llm(llm: BaseLLM) -> None:
66 | global DEFAULT_LLM
67 | DEFAULT_LLM = llm
68 |
69 |
70 | def get_chain(llm: Optional[BaseLLM] = None, use_memory: bool = True) -> Chain:
71 | """Get chain to use."""
72 | if llm is None:
73 | if DEFAULT_LLM is None:
74 | llm_config_path = os.environ.get("LLPANDAS_LLM_CONFIGURATION")
75 | if llm_config_path is None:
76 | llm = OpenAI(temperature=0)
77 | else:
78 | llm = load_llm(llm_config_path)
79 | else:
80 | llm = DEFAULT_LLM
81 |
82 | if use_memory:
83 | memory = ConversationBufferMemory(memory_key="chat_history", input_key="query")
84 | chain = LLMChain(llm=llm, prompt=PROMPT_WITH_MEMORY, memory=memory)
85 | else:
86 | chain = LLMChain(llm=llm, prompt=PROMPT)
87 |
88 | return chain
89 |
--------------------------------------------------------------------------------
/yolopandas/llm_accessor.py:
--------------------------------------------------------------------------------
1 | import ast
2 | import os
3 | from typing import Any, Optional
4 |
5 | import pandas as pd
6 | from IPython.display import clear_output
7 | from langchain.chains.base import Chain
8 | from langchain.input import print_text
9 | from langchain.llms.base import BaseLLM
10 |
11 | from yolopandas.chains import get_chain
12 |
13 |
14 | @pd.api.extensions.register_dataframe_accessor("llm")
15 | class LLMAccessor:
16 | def __init__(self, pandas_df: pd.DataFrame):
17 | self.df = pandas_df
18 |         use_memory = os.environ.get("LLPANDAS_USE_MEMORY", "True").lower() != "false"
19 | self.chain = get_chain(use_memory=use_memory)
20 |
21 | def set_chain(self, chain: Chain) -> None:
22 | """Set chain to use."""
23 | self.chain = chain
24 |
25 | def reset_chain(
26 | self, llm: Optional[BaseLLM] = None, use_memory: bool = True
27 | ) -> None:
28 | """Reset chain with LLM or memory kwarg."""
29 | self.chain = get_chain(llm=llm, use_memory=use_memory)
30 |
31 | def query(self, query: str, yolo: bool = False) -> Any:
32 | """Query the dataframe."""
33 | df = self.df
34 | df_columns = df.columns.tolist()
35 | inputs = {"query": query, "df_head": df.head(), "df_columns": df_columns, "stop": "```"}
36 | llm_response = self.chain.run(**inputs)
37 | eval_expression = False
38 | if not yolo:
39 | print("suggested code:")
40 | print(llm_response)
41 | print("run this code? y/n")
42 | user_input = input()
43 | if user_input == "y":
44 | clear_output(wait=True)
45 | print_text(llm_response, color="green")
46 | eval_expression = True
47 | else:
48 | eval_expression = True
49 |
50 | if eval_expression:
51 | # WARNING: This is a bad idea. Here we evaluate the (potentially multi-line)
52 | # llm response. Do not use unless you trust that llm_response is not malicious.
53 | tree = ast.parse(llm_response)
54 | module = ast.Module(tree.body[:-1], type_ignores=[])
55 | exec(ast.unparse(module))
56 | module_end = ast.Module(tree.body[-1:], type_ignores=[])
57 | module_end_str = ast.unparse(module_end)
58 | try:
59 | return eval(module_end_str)
60 | except Exception:
61 | exec(module_end_str)
62 |
--------------------------------------------------------------------------------
/yolopandas/utils/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ccurme/yolopandas/d008d0b73252045927289016c47667fd2247cc9f/yolopandas/utils/__init__.py
--------------------------------------------------------------------------------
/yolopandas/utils/query_helpers.py:
--------------------------------------------------------------------------------
1 | from typing import Any
2 |
3 | from langchain.callbacks import get_openai_callback
4 |
5 | from yolopandas import pd
6 |
7 |
8 | def run_query_with_cost(df: pd.DataFrame, query: str, yolo: bool = False) -> Any:
9 | """
10 |     Run a YOLOPandas query and report its cost in terms of tokens used.
11 | This includes total tokens, prompt tokens, completion tokens, and the total cost in USD.
12 |
13 | Parameters
14 | ----------
15 | df : pd.DataFrame
16 | The Pandas DataFrame with your data
17 | query : str
18 | The query you want to run against your data
19 | yolo : bool
20 |         If False (the default), prompt the user to accept or reject the generated code before it
21 |         is run; if True, execute the generated code without confirmation.
22 |
23 | Returns
24 | -------
25 | result : Any
26 |         The result of running the query against your data. When `yolo` is False, the user is first
27 |         prompted to confirm the generated code before the result is produced.
28 | """
29 | with get_openai_callback() as cb:
30 | result = df.llm.query(query, yolo=yolo)
31 | print(f"Total Tokens: {cb.total_tokens}")
32 | print(f"Prompt Tokens: {cb.prompt_tokens}")
33 | print(f"Completion Tokens: {cb.completion_tokens}")
34 | print(f"Total Cost (USD): ${cb.total_cost}")
35 | return result
36 |
--------------------------------------------------------------------------------