5 |
6 | # Building LLM-Powered Applications
7 |
8 | This repository contains materials for our [Building LLM-Powered Applications](https://www.wandb.courses/courses/building-llm-powered-apps) course.
9 |
10 | Learn how to build LLM-powered applications using LLM APIs, LangChain, and W&B Prompts. This course will guide you through the entire process of designing, experimenting with, and evaluating LLM-based apps.
11 |
12 | ## 🚀 [Enroll for free](https://www.wandb.courses/courses/building-llm-powered-apps)
13 |
14 | ## What you'll learn
15 |
16 | ### Understand LLM-powered applications
17 | Learn the fundamentals of LLM-powered applications, including APIs, chains, agents and prompt engineering.
18 |
19 | ### Build your own app
20 | See how we develop a support automation bot for a software company, and build your own app.
21 |
22 | ### Experiment, evaluate, and deploy your solution
23 | Improve your LLM-powered app with structured experiments and evaluation. Deploy and monitor your application in production.
24 |
25 | ## Running the code
26 |
27 | - Notebooks can be run on your local system or via Google Colab
28 | - To run the Python scripts in the `src` directory, set up a virtual environment (e.g. with `conda`) using `python<3.11` and install the dependencies from `requirements.txt` (see the example below)
29 | - If you have questions, you can ask them in [Discord](https://wandb.me/discord) in the `#courses` channel
30 |
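31 | For example, one possible setup (the environment name here is arbitrary):
32 |
33 | ```
34 | conda create -n llmapps "python<3.11" -y
35 | conda activate llmapps
36 | pip install -r requirements.txt
37 | ```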
--------------------------------------------------------------------------------
/llm-apps-course/docs_sample/collaborate-on-reports.md:
--------------------------------------------------------------------------------
1 | ---
2 | description: Collaborate and share W&B Reports with peers, co-workers, and your team.
3 | ---
4 |
5 | # Collaborate on reports
6 |
7 |
11 | Once you have saved a report, you can select the **Share** button to collaborate. A draft copy of the report is created when you select the **Edit** button. Draft reports auto-save. Select **Save to report** to publish your changes to the shared report.
12 |
13 | A warning notification will appear if an edit conflict occurs. This can happen if you and another collaborator edit the same report at the same time. The warning notification will guide you through resolving potential edit conflicts.
14 |
15 | 
16 |
17 | ### Comment on reports
18 |
19 | Click the comment button on a panel in a report to add a comment directly to that panel.
20 |
21 | 
22 |
23 |
24 |
25 | ### Who can edit and share reports?
26 |
27 | Reports created within an individual's private project are visible only to that user. The user can share their project with a team or with the public.
28 |
29 | On team projects, both the administrator and the member who created the report can toggle permissions between edit and view access for other team members. Team members can share reports.
30 |
31 | To share a report, select the **Share** button in the upper right corner. You can either provide an email address or copy the magic link. Users invited by email will need to log into Weights & Biases to view the report. Users who are given a magic link do not need to log into Weights & Biases to view the report.
32 |
33 | Shared reports are view-only.
34 |
--------------------------------------------------------------------------------
/llm-apps-course/notebooks/prompt_template.txt:
--------------------------------------------------------------------------------
1 | Here are some examples of real user questions; you will be judged by how well you match this distribution.
2 | ***
3 | {QUESTIONS}
4 | ***
5 | In the next step, you will read a fragment of W&B documentation.
6 | This will serve as inspiration for a synthetic user question and as the source of the answer.
7 | Here is the document fragment:
8 | ***
9 | {CHUNK}
10 | ***
11 | You will now generate a user question and corresponding answer based on the above document.
12 | First, explain the user context and what problems they might be trying to solve.
13 | Second, generate the user question.
14 | Third, provide an accurate and concise answer to the user question in markdown format, using the documentation.
15 | You'll be evaluated on:
16 | - how realistic is it that this question will come from a real user one day?
17 | - is this question about W&B?
18 | - can the question be answered using the W&B document fragment above?
19 | - how accurate is the answer?
20 | Remember that users have different styles and can be imprecise. You are very good at impersonating them!
21 | Use the following format:
22 | CONTEXT:
23 | QUESTION:
24 | ANSWER:
25 | Let's start!
--------------------------------------------------------------------------------
/llm-apps-course/notebooks/system_template.txt:
--------------------------------------------------------------------------------
1 | You are a creative assistant with the goal of generating a synthetic dataset of Weights & Biases (W&B) user questions.
2 | W&B users are asking these questions to a bot, so they don't know the answer and their questions are grounded in what they're trying to achieve.
3 | We are interested in questions that can be answered by W&B documentation.
4 | But the users don't have access to this documentation, so you need to imagine what they're trying to do and use language accordingly.
--------------------------------------------------------------------------------
/llm-apps-course/requirements.txt:
--------------------------------------------------------------------------------
1 | wandb>=0.15.3
2 | openai==0.27.*
3 | pandas==1.5.*
4 | unstructured>=0.6.10
5 | langchain>=0.0.188
6 | gradio>=3.33.1
7 | gradio_client>=0.2.5
8 | tenacity>=8.2.2
9 | tiktoken>=0.4.0
10 | rich>=13.0.1
11 | chromadb>=0.3.25
12 | pdf2image>=1.16.3
13 | tabulate>=0.9.0
--------------------------------------------------------------------------------
/llm-apps-course/src/config.py:
--------------------------------------------------------------------------------
1 | """Configuration for the LLM Apps Course"""
2 | from types import SimpleNamespace
3 |
4 | TEAM = None
5 | PROJECT = "llmapps"
6 | JOB_TYPE = "production"
7 |
8 | default_config = SimpleNamespace(
9 | project=PROJECT,
10 | entity=TEAM,
11 | job_type=JOB_TYPE,
12 | vector_store_artifact="darek/llmapps/vector_store:latest",
13 | chat_prompt_artifact="darek/llmapps/chat_prompt:latest",
14 | chat_temperature=0.3,
15 | max_fallback_retries=1,
16 | model_name="gpt-3.5-turbo",
17 | eval_model="gpt-3.5-turbo",
18 | eval_artifact="darek/llmapps/generated_examples:v0",
19 | )
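20 |
21 | # Minimal usage sketch (an illustration, not part of the course scripts):
22 | # this namespace is typically handed to wandb.init, e.g.
23 | # import wandb
24 | # wandb.init(project=default_config.project, entity=default_config.entity,
25 | #            job_type=default_config.job_type, config=vars(default_config))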
--------------------------------------------------------------------------------
/llm-intro/requirements.txt:
--------------------------------------------------------------------------------
1 | streamlit
2 | weave
3 | openai
--------------------------------------------------------------------------------
/llm-structured-extraction/.gitignore:
--------------------------------------------------------------------------------
1 | __pycache__
2 | wandb
3 | *.json
4 | *.jsonlines
5 | *.csv
6 |
--------------------------------------------------------------------------------
/llm-structured-extraction/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wandb/edu/dbf299fda5fbe0fbeb794e04f4cb09291a32cefc/llm-structured-extraction/__init__.py
--------------------------------------------------------------------------------
/llm-structured-extraction/helpers.py:
--------------------------------------------------------------------------------
1 | import pandas as pd
2 |
3 |
4 | def flatten_dict(d, parent_key="", sep="_"):
5 | """
6 | Flatten a nested dictionary.
7 |
8 | :param d: The nested dictionary to flatten.
9 | :param parent_key: The base key to use for the flattened keys.
10 | :param sep: Separator to use between keys.
11 | :return: A flattened dictionary.
12 | """
13 | items = []
14 | for k, v in d.items():
15 | new_key = f"{parent_key}{sep}{k}" if parent_key else k
16 | if isinstance(v, dict):
17 | items.extend(flatten_dict(v, new_key, sep=sep).items())
18 | else:
19 | items.append((new_key, v))
20 | return dict(items)
21 |
22 |
23 | def dicts_to_df(list_of_dicts):
24 | """
25 | Convert a list of dictionaries to a pandas DataFrame.
26 |
27 | :param list_of_dicts: List of dictionaries, potentially nested.
28 | :return: A pandas DataFrame representing the flattened data.
29 | """
30 | # Flatten each dictionary and create a DataFrame
31 | flattened_data = [flatten_dict(d) for d in list_of_dicts]
32 | return pd.DataFrame(flattened_data)
33 |
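34 | # Example usage (hypothetical nested records): nested keys are joined with `sep`
35 | # >>> flatten_dict({"a": 1, "b": {"c": 2}})
36 | # {'a': 1, 'b_c': 2}
37 | # >>> dicts_to_df([{"a": 1, "b": {"c": 2}}]).columns.tolist()
38 | # ['a', 'b_c']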
--------------------------------------------------------------------------------
/llm-structured-extraction/requirements.txt:
--------------------------------------------------------------------------------
1 | wandb
2 | weave
3 | ipykernel
4 | jupyter
5 | instructor
6 | openai>=1.1.0
7 | pydantic
8 | graphviz
9 | spacy
10 | nltk
11 | pandas
--------------------------------------------------------------------------------
/math-for-ml/.gitignore:
--------------------------------------------------------------------------------
1 | **/__pycache__/**
2 | **/.ipynb_checkpoints/**
3 | **/wandb/**
4 | .ok_storage*
5 |
--------------------------------------------------------------------------------
/math-for-ml/01_linearalgebra/utils/__init__.py:
--------------------------------------------------------------------------------
1 | from . import animate, random_matrix, svd
2 |
3 | __all__ = ["animate", "random_matrix", "svd"]
4 |
--------------------------------------------------------------------------------
/math-for-ml/01_linearalgebra/utils/config:
--------------------------------------------------------------------------------
1 | {
2 | "name": "Linear Algebra",
3 | "endpoint": "",
4 | "src": [""],
5 | "tests": {
6 | "utils/tests/q*.py": "ok_test"
7 | },
8 | "protocols": [
9 | "file_contents",
10 | "grading",
11 | "backup"
12 | ]
13 | }
14 |
--------------------------------------------------------------------------------
/math-for-ml/01_linearalgebra/utils/svd.py:
--------------------------------------------------------------------------------
1 | import matplotlib.pyplot as plt
2 | import numpy as np
3 |
4 |
5 | def compact(matrix, hermitian=False):
6 | return to_compact(*np.linalg.svd(matrix, hermitian=hermitian))
7 |
8 |
9 | def to_compact(U, sigma, V_T, threshold=1e-8):
10 | U_compact, sigma_compact, V_T_compact = [], [], []
11 |
12 | for idx, singular_value in enumerate(sigma):
13 | # if the singular value isn't too close to 0
14 | if singular_value > threshold:
15 | # include that singular value in sigma_compact
16 | sigma_compact.append(singular_value)
17 |
18 | # add a row of V_T as a row of V_T_compact
19 | V_T_compact.append(V_T[idx])
20 |
21 | # add a column of U as a row of U_compact
22 | U_compact.append(U.T[idx])
23 |
24 | else:
25 | break
26 |
27 | # convert the lists to arrays
28 | V_T_compact = np.array(V_T_compact)
29 | # turn sigma_compact into a diagonal matrix
30 | sigma_compact = np.diag(sigma_compact)
31 | U_compact = np.array(U_compact).T
32 |
33 | return U_compact, sigma_compact, V_T_compact
34 |
35 |
36 | def show_matrix(matrix, ax=None, add_colorbar=False):
37 | if ax is None:
38 | _, ax = plt.subplots(figsize=(6, 6))
39 | add_colorbar = True
40 | ax.axis("off")
41 | im = ax.matshow(matrix, cmap="Greys")
42 | if add_colorbar:
43 | plt.colorbar(im)
44 |
45 |
46 | def show_svd(U, S, V_T):
47 | _, axs = plt.subplots(ncols=3, figsize=(18, 6))
48 |
49 | for matrix, ax in zip([U, S, V_T], axs):
50 | show_matrix(matrix, ax=ax)
51 |
52 | dim = max(max(U.shape), max(V_T.shape))
53 | for ax in axs:
54 | ax.set_ylim([dim - 0.5, 0 - 0.5])
55 | ax.set_xlim([0 - 0.5, dim - 0.5])
56 |
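57 | # Usage sketch (hypothetical input): a rank-1 matrix keeps a single singular value
58 | # >>> U, S, V_T = compact(np.outer([1., 2.], [3., 4.]))
59 | # >>> U.shape, S.shape, V_T.shape
60 | # ((2, 1), (1, 1), (1, 2))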
--------------------------------------------------------------------------------
/math-for-ml/01_linearalgebra/utils/tests/q01.py:
--------------------------------------------------------------------------------
1 | test = {
2 | "name": "Dimensions",
3 | "points": 1,
4 | "suites": [
5 | {
6 | "cases": [
7 | {
8 | "code": r"""
9 | >>> # TESTS BEGIN HERE
10 | >>> ## you must define a dictionary called dimensions
11 | >>> type(dimensions)
12 | <class 'dict'>
13 | >>> ## A is a vector, neither row nor column
14 | >>> dimensions["A"]
15 | 1
16 | >>> ## B and D are both vectors
17 | >>> dimensions["B"] == dimensions["D"]
18 | True
19 | >>> ## B/D are explicitly row/column vectors
20 | >>> ## so they have two dimensions
21 | >>> dimensions["B"]
22 | 2
23 | >>> ## C is a matrix, so it has two dimensions
24 | >>> dimensions["C"]
25 | 2
26 | """,
27 | "hidden": False,
28 | "locked": False
29 | }
30 | ],
31 | "setup": r"""
32 | """,
33 | "teardown": r"""
34 | """,
35 | "type": "doctest"}]
36 | }
37 |
--------------------------------------------------------------------------------
/math-for-ml/01_linearalgebra/utils/tests/q02.py:
--------------------------------------------------------------------------------
1 | test = {
2 | "name": "Shapes",
3 | "points": 1,
4 | "suites": [
5 | {
6 | "cases": [
7 | {
8 | "code": r"""
9 | >>> # TESTS BEGIN HERE
10 | >>> ## you must define a dictionary called shapes
11 | >>> type(shapes)
12 | <class 'dict'>
13 | >>> ## B is a row vector with two entries
14 | >>> shapes["B"]
15 | (1, 2)
16 | >>> ## D is a column vector with two entries
17 | >>> shapes["D"]
18 | (2, 1)
19 | >>> ## C is a matrix with two rows and two columns
20 | >>> shapes["C"]
21 | (2, 2)
22 | >>> ## A is a vector with one entry
23 | >>> ## Watch out! (1) == 1 !== (1,)
24 | >>> shapes["A"]
25 | (1,)
26 | """,
27 | "hidden": False,
28 | "locked": False
29 | }
30 | ],
31 | "setup": r"""
32 | """,
33 | "teardown": r"""
34 | """,
35 | "type": "doctest"}]
36 | }
37 |
--------------------------------------------------------------------------------
/math-for-ml/01_linearalgebra/utils/tests/q03.py:
--------------------------------------------------------------------------------
1 | test = {
2 | "name": "Shape of Transpose",
3 | "points": 1,
4 | "suites": [
5 | {
6 | "cases": [
7 | {
8 | "code": r"""
9 | >>> # TESTS BEGIN HERE
10 | >>> ## you must define a function with right name
11 | >>> callable(shape_of_transpose)
12 | True
13 | >>> ## shape is reversed shape of input matrix
14 | >>> shape_of_transpose(random_matrix) == transposed_shape
15 | array([ True, True])
16 | """,
17 | "hidden": False,
18 | "locked": False
19 | }
20 | ],
21 | "setup": r"""
22 | random_shape = np.random.randint(30, size=2) + 1
23 | print(f"Testing on matrix with shape {random_shape}")
24 | random_matrix = np.random.standard_normal(size=random_shape)
25 | transposed_shape = random_shape[::-1]
26 | """,
27 | "teardown": r"""
28 | """,
29 | "type": "doctest"}]
30 | }
31 |
--------------------------------------------------------------------------------
/math-for-ml/01_linearalgebra/utils/tests/q04.py:
--------------------------------------------------------------------------------
1 | test = {
2 | "name": "Repeat Thrice",
3 | "points": 1,
4 | "suites": [
5 | {
6 | "cases": [
7 | {
8 | "code": r"""
9 | >>> # TESTS BEGIN HERE
10 | >>> ## you must define an array with right name
11 | >>> isinstance(repeat_3_2, np.ndarray)
12 | True
13 | >>> ## it should have two dimensions
14 | >>> repeat_3_2.ndim
15 | 2
16 | >>> ## it takes length-2 vectors as input
17 | >>> repeat_3_2.shape[1]
18 | 2
19 | >>> ## its output has 3 times the length of the input
20 | >>> repeat_3_2.shape[0] // 3
21 | 2
22 | >>> ## its output is three copies of the input
23 | >>> repeat_3_2 @ [1., 2.]
24 | array([1., 2., 1., 2., 1., 2.])
25 | """,
26 | "hidden": False,
27 | "locked": False
28 | }
29 | ],
30 | "setup": r"""
31 | """,
32 | "teardown": r"""
33 | """,
34 | "type": "doctest"}]
35 | }
36 |
--------------------------------------------------------------------------------
/math-for-ml/01_linearalgebra/utils/tests/q05.py:
--------------------------------------------------------------------------------
1 | test = {
2 | "name": "Matrix Type Checking",
3 | "points": 1,
4 | "suites": [
5 | {
6 | "cases": [
7 | {
8 | "code": r"""
9 | >>> # TESTS BEGIN HERE
10 | >>> ## you must define a function called are_compatible
11 | >>> callable(are_compatible)
12 | True
13 | >>> ## that function should return booleans
14 | >>> isinstance(are_compatible(A, B), bool)
15 | True
16 | >>> ## when the inner shapes are the same, return True
17 | >>> are_compatible(A, B)
18 | True
19 | >>> ## when the inner shapes differ, return False
20 | >>> are_compatible(A, M)
21 | False
22 | """,
23 | "hidden": False,
24 | "locked": False
25 | }
26 | ],
27 | "setup": r"""
28 | random_outer_shapes = np.random.randint(1, 5, size=2)
29 | random_inner_shape = np.random.randint(1, 5)
30 | A = np.random.randn(random_outer_shapes[0], random_inner_shape)
31 | B = np.random.randn(random_inner_shape, random_outer_shapes[1])
32 | M = np.random.randn(random_inner_shape + 1, random_outer_shapes[1])
33 | """,
34 | "teardown": r"""
35 | """,
36 | "type": "doctest"}]
37 | }
38 |
--------------------------------------------------------------------------------
/math-for-ml/01_linearalgebra/utils/tests/q06.py:
--------------------------------------------------------------------------------
1 | test = {
2 | "name": "Zero Second and Repeat Thrice",
3 | "points": 1,
4 | "suites": [
5 | {
6 | "cases": [
7 | {
8 | "code": r"""
9 | >>> # TESTS BEGIN HERE
10 | >>> ## you must define an array with right name
11 | >>> isinstance(set_second_to_zero_and_repeat_3, np.ndarray)
12 | True
13 | >>> ## it should have two dimensions
14 | >>> set_second_to_zero_and_repeat_3.ndim
15 | 2
16 | >>> ## it takes length-2 vectors as input
17 | >>> set_second_to_zero_and_repeat_3.shape[1]
18 | 2
19 | >>> ## its output has 3 times the length of the input
20 | >>> set_second_to_zero_and_repeat_3.shape[0] // 3
21 | 2
22 | >>> ## its output is three copies of the input
23 | >>> set_second_to_zero_and_repeat_3 @ [1., 2.]
24 | array([1., 0., 1., 0., 1., 0.])
25 | """,
26 | "hidden": False,
27 | "locked": False
28 | }
29 | ],
30 | "setup": r"""
31 | """,
32 | "teardown": r"""
33 | """,
34 | "type": "doctest"}]
35 | }
36 |
--------------------------------------------------------------------------------
/math-for-ml/01_linearalgebra/utils/tests/q07.py:
--------------------------------------------------------------------------------
1 | test = {
2 | "name": "Refactoring",
3 | "points": 1,
4 | "suites": [
5 | {
6 | "cases": [
7 | {
8 | "code": r"""
9 | >>> # TESTS BEGIN HERE
10 | >>> ## you must define an array V
11 | >>> type(V)
12 | <class 'numpy.ndarray'>
13 | >>> ## make sure you multiply in the right order!
14 | >>> np.array_equal(WXYZ @ random_vec, V @ random_vec)
15 | False
16 | >>> ## result from their pipeline and yours should be (almost) same
17 | >>> np.allclose(their_pipeline(random_vec), V @ random_vec)
18 | True
19 | """,
20 | "hidden": False,
21 | "locked": False
22 | }
23 | ],
24 | "setup": r"""
25 | WXYZ = W @ X @ Y @ Z
26 | random_vec = np.random.standard_normal(size=(2, 1))
27 | """,
28 | "teardown": r"""
29 | """,
30 | "type": "doctest"}]
31 | }
32 |
--------------------------------------------------------------------------------
/math-for-ml/01_linearalgebra/utils/tests/q08.py:
--------------------------------------------------------------------------------
1 | test = {
2 | "name": "apply_to_batch",
3 | "points": 1,
4 | "suites": [
5 | {
6 | "cases": [
7 | {
8 | "code": r"""
9 | >>> # TESTS BEGIN HERE
10 | >>> ## you must define a function called apply_to_batch
11 | >>> callable(apply_to_batch)
12 | True
13 | >>> ## it should run when applied to compatible inputs
14 | >>> out = apply_to_batch(identity, vectors)
15 | >>> ## the return is of type array
16 | >>> type(out)
17 | <class 'numpy.ndarray'>
18 | """,
19 | "hidden": False,
20 | "locked": False
21 | },
22 | {
23 | "code": r"""
24 | >>> # applying the identity matrix shouldn't change inputs
25 | >>> np.allclose(vectors, apply_to_batch(identity, vectors))
26 | True
27 | >>> # the result should be the same as normal matrix multiplication
28 | >>> np.allclose(random_matrix @ vectors, apply_to_batch(random_matrix, vectors))
29 | True
30 | >>> # return_first should pull out the first entry in each vector
31 | >>> np.allclose(vectors[0], apply_to_batch(return_first, vectors))
32 | True
33 | """
34 | }
35 | ],
36 | "setup": r"""
37 | shape, count = 5, 10
38 | identity = np.eye(shape)
39 | return_first = np.array([[1.] + [0.] * (shape - 1)])
40 | random_matrix = np.random.standard_normal((shape, shape))
41 | vectors = np.random.standard_normal((shape, count))
42 | """,
43 | "teardown": r"""
44 | """,
45 | "type": "doctest"}]
46 | }
47 |
--------------------------------------------------------------------------------
/math-for-ml/01_linearalgebra/utils/tests/q09.py:
--------------------------------------------------------------------------------
1 | test = {
2 | "name": "make_repeater",
3 | "points": 1,
4 | "suites": [
5 | {
6 | "cases": [
7 | {
8 | "code": r"""
9 | >>> # TESTS BEGIN HERE
10 | >>> ## you must define a function called make_repeater
11 | >>> callable(make_repeater)
12 | True
13 | >>> ## it should take two integer arguments
14 | >>> out = make_repeater(2, 2)
15 | >>> ## the return is of type ndarray
16 | >>> type(out)
17 | <class 'numpy.ndarray'>
18 | """,
19 | "hidden": False,
20 | "locked": False
21 | },
22 | {
23 | "code": r"""
24 | >>> # make_repeater should be able to make repeat_3_2
25 | >>> repeat_3_2 = make_repeater(3, 2)
26 | >>> # the result should have two dimensions
27 | >>> repeat_3_2.ndim
28 | 2
29 | >>> # the result should have outputs of size 6 and inputs of size 2
30 | >>> repeat_3_2.shape
31 | (6, 2)
32 | >>> # applying that repeater to the 0 vector should give a 0 vector
33 | >>> np.allclose(np.zeros(6), repeat_3_2 @ zeros_2)
34 | True
35 | >>> # applying that repeater to the 1s vector should give a 1s vector
36 | >>> np.allclose(np.ones(6), repeat_3_2 @ ones_2)
37 | True
38 | """
39 | }
40 | ],
41 | "setup": r"""
42 | zeros_2 = np.zeros(2)
43 | ones_2 = np.ones(2)
44 | """,
45 | "teardown": r"""
46 | """,
47 | "type": "doctest"}]
48 | }
49 |
--------------------------------------------------------------------------------
/math-for-ml/02_calculus/utils/__init__.py:
--------------------------------------------------------------------------------
1 | from . import grad_plot, models, surfaces
2 |
3 | __all__ = ["grad_plot", "models", "surfaces"]
4 |
--------------------------------------------------------------------------------
/math-for-ml/02_calculus/utils/config:
--------------------------------------------------------------------------------
1 | {
2 | "name": "Calculus",
3 | "endpoint": "",
4 | "src": [""],
5 | "tests": {
6 | "utils/tests/q*.py": "ok_test"
7 | },
8 | "protocols": [
9 | "file_contents",
10 | "grading",
11 | "backup"
12 | ]
13 | }
14 |
--------------------------------------------------------------------------------
/math-for-ml/02_calculus/utils/surfaces.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 |
3 | import matplotlib.pyplot as plt
4 | from mpl_toolkits.mplot3d import Axes3D # noqa
5 |
6 | import scipy.stats
7 | from scipy.signal import convolve
8 |
9 |
10 | def axis_equal_3d(ax, center=0):
11 | # FROM StackO/19933125
12 |
13 | extents = np.array([getattr(ax, 'get_{}lim'.format(dim))() for dim in 'xyz'])
14 | sz = extents[:, 1] - extents[:, 0]
15 | if center == 0:
16 | centers = [0, 0, 0]
17 | else:
18 | centers = np.mean(extents, axis=1)
19 | maxsize = max(abs(sz))
20 | r = maxsize/2
21 | for ctr, dim in zip(centers, 'xyz'):
22 | getattr(ax, 'set_{}lim'.format(dim))(ctr - r, ctr + r)
23 |
24 |
25 | def gauss_random_field(x, y, scale):
26 | """creates a correlated gaussian random field
27 | by first generating 'white' uncorrelated field
28 | and then using a low-pass filter based on convolution
29 | with a gaussian function to make a 'red-shifted', correlated
30 | gaussian random field
31 | """
32 | white_field = np.random.standard_normal(size=x.shape)
33 |
34 | pos = np.empty(x.shape + (2,))
35 | pos[:, :, 0] = x
36 | pos[:, :, 1] = y
37 | gauss_rv = scipy.stats.multivariate_normal([0, 0], cov=np.ones(2))
38 | gauss_pdf = gauss_rv.pdf(pos)
39 | red_field = scale * convolve(white_field, gauss_pdf, mode='same')
40 |
41 | return red_field
42 |
43 |
44 | def plot_loss_surface(loss, N, mesh_extent):
45 | mesh = np.linspace(-mesh_extent, mesh_extent, N)
46 | weights1, weights2 = np.meshgrid(mesh, mesh)
47 |
48 | fig = plt.figure()
49 | ax = fig.add_subplot(111, projection='3d')
50 |
51 | ax._axis3don = False
52 |
53 | ax.plot_surface(weights1, weights2, loss(weights1, weights2),
54 | rstride=2, cstride=2, linewidth=0.2, edgecolor='b',
55 | alpha=1, cmap='Blues', shade=True)
56 |
57 | axis_equal_3d(ax, center=True)
58 |
--------------------------------------------------------------------------------
/math-for-ml/02_calculus/utils/tests/q03.py:
--------------------------------------------------------------------------------
1 | test = {
2 | "name": "linear_approx",
3 | "points": 1,
4 | "suites": [
5 | {
6 | "cases": [
7 | {
8 | "code": r"""
9 | >>> # TESTS BEGIN HERE
10 | >>> ## you must define a function with right name
11 | >>> callable(linear_approx)
12 | True
13 | >>> ## it should run on and return the right types
14 | >>> isinstance(linear_approx(identity, 0., 0.), float)
15 | True
16 | >>> ## if epsilon is 0, should return f(input)
17 | >>> np.isclose(linear_approx(identity, 0., 0.), 0.)
18 | True
19 | >>> np.isclose(linear_approx(constant, 0., 0.), constant(0.))
20 | True
21 | >>> np.isclose(linear_approx(np.square, 0., 0.), 0.)
22 | True
23 | >>> np.isclose(linear_approx(np.square, val, 0.), np.square(val))
24 | True
25 | >>> ## linear approximation of abs is line with slope +/-1
26 | >>> np.isclose(linear_approx(np.abs, val, -val), 0.)
27 | True
28 | >>> ## linear approximation of square is 2 * x
29 | >>> np.isclose(linear_approx(np.square, 1., -1.), -1.)
30 | True
31 | """,
32 | "hidden": False,
33 | "locked": False
34 | }
35 | ],
36 | "setup": r"""
37 | val = np.random.randn()
38 | print(f"Testing on random value {val}")
39 | # identity function returns its inputs unchanged
40 | identity = lambda x: x
41 | # constant function always returns the same thing
42 | constant = lambda x: val
43 | """,
44 | "teardown": r"""
45 | """,
46 | "type": "doctest"}]
47 | }
48 |
--------------------------------------------------------------------------------
/math-for-ml/03_probability/utils/__init__.py:
--------------------------------------------------------------------------------
1 | from . import clt, mle
2 |
3 | __all__ = ["clt", "mle"]
4 |
--------------------------------------------------------------------------------
/math-for-ml/03_probability/utils/config:
--------------------------------------------------------------------------------
1 | {
2 | "name": "Probability",
3 | "endpoint": "",
4 | "src": [""],
5 | "tests": {
6 | "utils/tests/q*.py": "ok_test"
7 | },
8 | "protocols": [
9 | "file_contents",
10 | "grading",
11 | "backup"
12 | ]
13 | }
14 |
--------------------------------------------------------------------------------
/math-for-ml/03_probability/utils/tests/q01.py:
--------------------------------------------------------------------------------
1 | test = {
2 | "name": "array_is_pmf",
3 | "points": 1,
4 | "suites": [
5 | {
6 | "cases": [
7 | {
8 | "code": r"""
9 | >>> # TESTS BEGIN HERE
10 | >>> ## must define a function with the right name
11 | >>> callable(array_is_pmf)
12 | True
13 | >>> ## that function should take and return the right types
14 | >>> isinstance(array_is_pmf(np.array([0.])), bool)
15 | True
16 | >>> ## an array with negative entries is not a pmf
17 | >>> array_is_pmf(np.array([-1.]))
18 | False
19 | >>> ## an array that doesn't sum to 1 is not a pmf
20 | >>> array_is_pmf(np.array([0.5, 0.4]))
21 | False
22 | >>> ## the rand_array variable is never a valid pmf
23 | >>> array_is_pmf(rand_array)
24 | False
25 | >>> ## some numerical error in the sum is tolerable
26 | >>> ## so use np.isclose, not ==
27 | >>> array_is_pmf(close_to_1)
28 | True
29 | >>> ## the rand_pmf variable is always a pmf
30 | >>> array_is_pmf(rand_pmf)
31 | True
32 | """,
33 | "hidden": False,
34 | "locked": False
35 | }
36 | ],
37 | "setup": r"""
38 | >>> rand_array = 1 / 5 * np.random.rand(4)
39 | >>> rand_pmf = rand_array / np.sum(rand_array)
40 | >>> close_to_1 = np.array([1/100 for ii in range(100)])
41 | """,
42 | "teardown": r"""
43 | """,
44 | "type": "doctest"}]
45 | }
46 |
--------------------------------------------------------------------------------
/math-for-ml/03_probability/utils/tests/q02.py:
--------------------------------------------------------------------------------
1 | test = {
2 | "name": "my_pdf",
3 | "points": 1,
4 | "suites": [
5 | {
6 | "cases": [
7 | {
8 | "code": r"""
9 | >>> # TESTS BEGIN HERE
10 | >>> ## must define a function with the right name
11 | >>> callable(my_pdf)
12 | True
13 | >>> ## that function should take and return the right types
14 | >>> isinstance(my_pdf(0.5), float)
15 | True
16 | >>> ## a pdf never takes on negative values
17 | >>> all(my_pdf(x) >= 0 for x in test_values)
18 | True
19 | >>> ## a pdf integrates to 1
20 | >>> integrates_to_one(my_pdf)
21 | True
22 | """,
23 | "hidden": False,
24 | "locked": False
25 | }
26 | ],
27 | "setup": r"""
28 | >>> test_values = np.random.rand(100)
29 | """,
30 | "teardown": r"""
31 | """,
32 | "type": "doctest"}]
33 | }
34 |
--------------------------------------------------------------------------------
/math-for-ml/03_probability/utils/tests/q03.py:
--------------------------------------------------------------------------------
1 | test = {
2 | "name": "surprise",
3 | "points": 1,
4 | "suites": [
5 | {
6 | "cases": [
7 | {
8 | "code": r"""
9 | >>> # TESTS BEGIN HERE
10 | >>> ## must define a function with the right name
11 | >>> callable(surprise)
12 | True
13 | >>> ## that function should take and return the right types
14 | >>> isinstance(surprise(simple_pmf, 0), float)
15 | True
16 | >>> ## equal probabilities should yield the same surprise
17 | >>> surprise(simple_pmf, 0) == surprise(simple_pmf, 1)
18 | True
19 | >>> ## the surprise for probability 1 is 0
20 | >>> np.isclose(surprise(constant_pmf, 0), 0.)
21 | True
22 | >>> ## the inverse of the surprise is the negative exponent
23 | >>> neg_exps = [np.exp(-1 * surprise(rand_pmf, ii)) for ii in range(4)]
24 | >>> np.allclose(rand_pmf, neg_exps)
25 | True
26 | """,
27 | "hidden": False,
28 | "locked": False
29 | }
30 | ],
31 | "setup": r"""
32 | >>> constant_pmf = np.array([1.])
33 | >>> simple_pmf = np.array([0.5, 0.5])
34 | >>> rand_array = 1 / 5 * np.random.rand(4) + 0.01
35 | >>> rand_pmf = rand_array / np.sum(rand_array)
36 | """,
37 | "teardown": r"""
38 | """,
39 | "type": "doctest"}]
40 | }
41 |
--------------------------------------------------------------------------------
/math-for-ml/03_probability/utils/tests/q04.py:
--------------------------------------------------------------------------------
1 | test = {
2 | "name": "softmax",
3 | "points": 1,
4 | "suites": [
5 | {
6 | "cases": [
7 | {
8 | "code": r"""
9 | >>> # TESTS BEGIN HERE
10 | >>> ## must define a function with the right name
11 | >>> callable(softmax)
12 | True
13 | >>> ## that function should take and return the right types
14 | >>> isinstance(softmax(three_ones), np.ndarray)
15 | True
16 | >>> ## the entries are always non-negative
17 | >>> np.all(np.greater_equal(softmax(rand_array), 0.))
18 | True
19 | >>> ## the entries should sum to 1
20 | >>> np.isclose(np.sum(softmax(rand_array)), 1.)
21 | True
22 | >>> ## applying softmax shouldn't change which entry is biggest
23 | >>> np.argmax(rand_array) == np.argmax(softmax(rand_array))
24 | True
25 | >>> ## if all the entries are the same, all outputs are the same
26 | >>> np.allclose(softmax(three_ones), 1 / 3)
27 | True
28 | """,
29 | "hidden": False,
30 | "locked": False
31 | }
32 | ],
33 | "setup": r"""
34 | >>> three_ones = np.ones(3)
35 | >>> rand_array = np.random.rand(10)
36 | """,
37 | "teardown": r"""
38 | """,
39 | "type": "doctest"}]
40 | }
41 |
--------------------------------------------------------------------------------
/math-for-ml/03_probability/utils/tests/q05.py:
--------------------------------------------------------------------------------
1 | test = {
2 | "name": "entropy",
3 | "points": 1,
4 | "suites": [
5 | {
6 | "cases": [
7 | {
8 | "code": r"""
9 | >>> # TESTS BEGIN HERE
10 | >>> ## must define a function with the right name
11 | >>> callable(entropy)
12 | True
13 | >>> ## that function should take and return the right types
14 | >>> isinstance(entropy(rand_pmf), float)
15 | True
16 | >>> ## the entropy is always non-negative
17 | >>> np.greater_equal(entropy(rand_pmf), 0.)
18 | True
19 | >>> ## the entropy of a fair coin is log(2)
20 | >>> np.isclose(entropy(coin_pmf), log_2)
21 | True
22 | >>> ## the entropy of any other distribution on 2 states is lower
23 | >>> entropy(coin_pmf) > entropy(rand_pmf)
24 | True
25 | """,
26 | "hidden": False,
27 | "locked": False
28 | }
29 | ],
30 | "setup": r"""
31 | >>> rand_array = np.random.rand(2) + 0.05
32 | >>> rand_pmf = rand_array / np.sum(rand_array)
33 | >>> coin_pmf = np.array([0.5, 0.5])
34 | >>> log_2 = np.log(2)
35 | >>> rand_pmf_and_zero = np.append(rand_pmf, [0.])
36 | >>> one_hot_pmf = np.array([0., 0., 0., 1.])
37 | """,
38 | "teardown": r"""
39 | """,
40 | "type": "doctest"}]
41 | }
42 |
--------------------------------------------------------------------------------
/math-for-ml/03_probability/utils/tests/q07.py:
--------------------------------------------------------------------------------
1 | test = {
2 | "name": "divergence",
3 | "points": 1,
4 | "suites": [
5 | {
6 | "cases": [
7 | {
8 | "code": r"""
9 | >>> # TESTS BEGIN HERE
10 | >>> ## must define a function with the right name
11 | >>> callable(divergence)
12 | True
13 | >>> ## that function should take and return the right types
14 | >>> isinstance(divergence(rand_pmf, rand_pmf), float)
15 | True
16 | >>> ## the divergence is always non-negative
17 | >>> np.greater_equal(divergence(rand_pmf, coin_pmf), 0.)
18 | True
19 | >>> np.greater_equal(divergence(coin_pmf, rand_pmf), 0.)
20 | True
21 | >>> ## the order of arguments matters
22 | >>> divergence(coin_pmf, rand_pmf) != divergence(rand_pmf, coin_pmf)
23 | True
24 | >>> ## the divergence between a pmf and itself is 0.
25 | >>> np.isclose(divergence(coin_pmf, coin_pmf), 0.)
26 | True
27 | >>> np.isclose(divergence(rand_pmf, rand_pmf), 0.)
28 | True
29 | """,
30 | "hidden": False,
31 | "locked": False
32 | }
33 | ],
34 | "setup": r"""
35 | >>> rand_array = np.random.rand(2) + 0.05
36 | >>> rand_pmf = rand_array / np.sum(rand_array)
37 | >>> coin_pmf = np.array([0.5, 0.5])
38 | """,
39 | "teardown": r"""
40 | """,
41 | "type": "doctest"}]
42 | }
43 |
--------------------------------------------------------------------------------
/math-for-ml/03_probability/utils/tests/q09.py:
--------------------------------------------------------------------------------
1 | test = {
2 | "name": "gaussian_surprise",
3 | "points": 1,
4 | "suites": [
5 | {
6 | "cases": [
7 | {
8 | "code": r"""
9 | >>> # TESTS BEGIN HERE
10 | >>> ## must define a function with the right name
11 | >>> callable(gaussian_surprise)
12 | True
13 | >>> ## that function should take and return the right types
14 | >>> isinstance(gaussian_surprise(mean, rand_array), float)
15 | True
16 | >>> ## the surprise is always non-negative
17 | >>> np.greater_equal(gaussian_surprise(mean, rand_array), 0.)
18 | True
19 | >>> ## the squared error for x == mu is 0, so surprise is N * 1/2 log Z
20 | >>> np.isclose(gaussian_surprise(mean, constant_array), len(constant_array) * 0.5 * log_Z)
21 | True
22 | >>> ## N * 1/2 log Z is the minimum possible surprise
23 | >>> np.greater_equal(gaussian_surprise(mean, rand_array), len(rand_array) * 0.5 * log_Z)
24 | True
25 | >>> np.greater_equal(gaussian_surprise(rand_array[0], rand_array), len(rand_array) * 0.5 * log_Z)
26 | True
27 | """,
28 | "hidden": False,
29 | "locked": False
30 | }
31 | ],
32 | "setup": r"""
33 | >>> rand_array = np.random.randn(5)
34 | >>> mean = np.mean(rand_array)
35 | >>> constant_array = np.zeros(5) + mean
36 | >>> log_Z = np.log(2. * np.pi)
37 | """,
38 | "teardown": r"""
39 | """,
40 | "type": "doctest"}]
41 | }
42 |
--------------------------------------------------------------------------------
/math-for-ml/03_probability/utils/tests/q10.py:
--------------------------------------------------------------------------------
1 | test = {
2 | "name": "sum_squared_error",
3 | "points": 1,
4 | "suites": [
5 | {
6 | "cases": [
7 | {
8 | "code": r"""
9 | >>> # TESTS BEGIN HERE
10 | >>> ## must define a function with the right name
11 | >>> callable(sum_squared_error)
12 | True
13 | >>> ## that function should take and return the right types
14 | >>> isinstance(sum_squared_error(mean, rand_array), float)
15 | True
16 | >>> ## the surprise is always non-negative
17 | >>> np.greater_equal(sum_squared_error(mean, rand_array), 0.)
18 | True
19 | >>> ## the squared error for x == mu is 0
20 | >>> np.isclose(sum_squared_error(mean, constant_array), 0.)
21 | True
22 | >>> sse_with_mean = sum_squared_error(mean, rand_array)
23 | >>> ## the mean minimizes the squared error
24 | >>> np.greater_equal(sum_squared_error(rand_array[0], rand_array), sse_with_mean)
25 | True
26 | >>> ## the squared error of the mean is equal to the variance * N
27 | >>> np.isclose(sum_squared_error(mean, rand_array), len(rand_array) * variance)
28 | True
29 | """,
30 | "hidden": False,
31 | "locked": False
32 | }
33 | ],
34 | "setup": r"""
35 | >>> rand_array = np.random.randn(5)
36 | >>> mean = np.mean(rand_array)
37 | >>> constant_array = np.zeros(5) + mean
38 | >>> variance = np.var(rand_array)
39 | """,
40 | "teardown": r"""
41 | """,
42 | "type": "doctest"}]
43 | }
44 |
--------------------------------------------------------------------------------
/math-for-ml/03_probability/utils/util.py:
--------------------------------------------------------------------------------
1 | import scipy.integrate as integrate
2 |
3 | def integrates_to_one(pdf):
4 | integral, err_bound = integrate.quad(pdf, 0, 1)
5 | return abs(1 - integral) <= err_bound
6 |
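7 | # Sanity-check sketch: the uniform density on [0, 1] integrates to 1
8 | # >>> integrates_to_one(lambda x: 1.0)
9 | # True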
--------------------------------------------------------------------------------
/math-for-ml/Dockerfile:
--------------------------------------------------------------------------------
1 | # adapted from github/BIDS/dats Dockerfile
2 | FROM jupyter/datascience-notebook:04f7f60d34a6
3 |
4 | USER $NB_USER
5 |
6 | COPY ./requirements.txt ./
7 |
8 | # install Python libraries with pip
9 | RUN pip install --no-cache-dir -r requirements.txt
10 |
11 | # install Linux tools with apt-get
12 | USER root
13 | RUN apt-get update && apt-get install -y curl ffmpeg graphviz wget
14 |
15 | # add files to home directory and rename/reown
16 | COPY ./ /home/$NB_USER/math-for-ml/
17 |
18 | RUN usermod -G users $NB_USER && chown -R $NB_USER /home/$NB_USER/ && chgrp -R users /home/$NB_USER/
19 |
20 | USER $NB_USER
21 |
22 | ENV USER=$NB_USER
23 |
--------------------------------------------------------------------------------
/math-for-ml/autograder.py:
--------------------------------------------------------------------------------
1 | from client.api.notebook import Notebook
2 | import wandb
3 |
4 |
5 | class WandbTrackedOK(object):
6 |
7 | def __init__(self, entity, ok_path, project, wandb_path="./wandb"):
8 | self.grader = Notebook(ok_path)
9 | wandb.init(entity=entity, project=project, dir=wandb_path)
10 | self.test_map = self.grader.assignment.test_map
11 | self.pass_dict = {k: 0 for k in self.test_map}
12 | self.log()
13 |
14 | def grade(self, question, *args, **kwargs):
15 | result = self.grader.grade(question, *args, **kwargs)
16 | self.pass_dict[question] = result["passed"]
17 | self.log()
18 |
19 | def log(self):
20 | total = sum([v for v in self.pass_dict.values()])
21 | wandb.log({"passes": self.pass_dict,
22 | "total": total})
23 |
24 | def __del__(self):
25 | wandb.join()
26 |
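27 | # Usage sketch (the entity, ok_path, and project values are hypothetical):
28 | # grader = WandbTrackedOK("my-team", "utils/config", "math-for-ml")
29 | # grader.grade("q01")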
--------------------------------------------------------------------------------
/math-for-ml/requirements-colab.txt:
--------------------------------------------------------------------------------
1 | okpy==1.15.0
2 | torchviz==0.0.1
3 | wandb
4 |
--------------------------------------------------------------------------------
/math-for-ml/requirements-local.txt:
--------------------------------------------------------------------------------
1 | autograd==1.3
2 | okpy==1.15.0
3 | jupyter==1.0.0
4 | matplotlib==3.2.2
5 | pandas==1.0.5
6 | scikit-learn==0.23.1
7 | tensorflow==2.7.0
8 | torch==1.5.1
9 | torchvision==0.6.1
10 | torchviz==0.0.1
11 | wandb
12 |
--------------------------------------------------------------------------------
/math-for-ml/requirements.txt:
--------------------------------------------------------------------------------
1 | autograd==1.3
2 | nbgitpuller==0.8.0
3 | okpy==1.15.0
4 | tensorflow==2.7.0
5 | torch==1.5.1
6 | torchvision==0.6.1
7 | torchviz==0.0.1
8 | wandb
9 |
--------------------------------------------------------------------------------
/ml-dataval-course/.gitignore:
--------------------------------------------------------------------------------
1 | canonical-partitioned-dataset/
2 | canonical-partitioned-dataset.tar
3 | *.csv
4 |
5 |
--------------------------------------------------------------------------------
/ml-dataval-course/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2023 Shreya Shankar
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/ml-dataval-course/dataval/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wandb/edu/dbf299fda5fbe0fbeb794e04f4cb09291a32cefc/ml-dataval-course/dataval/__init__.py
--------------------------------------------------------------------------------
/ml-dataval-course/dataval/plot.py:
--------------------------------------------------------------------------------
1 | import pandas as pd
2 | import seaborn as sns
3 |
4 | def violinplot(data: pd.DataFrame, variable: str, groupby: str = "week"):
5 | if groupby == "week":
6 | return sns.violinplot(data = data, x=variable, y=data["fact_time"].dt.isocalendar().week.astype(str))
7 |
8 | if groupby == "day":
9 | return sns.violinplot(data = data, x=variable, y=data["fact_time"].dt.isocalendar().day.astype(str))
10 |
11 | if groupby == "hour":
12 | return sns.violinplot(data = data, x=variable, y=data["fact_time"].dt.hour.astype(str))
13 |
14 | if groupby == "hour-binary":
15 | return sns.violinplot(data = data, x=variable, y=data["fact_time"].dt.hour.between(10, 17).astype(str))
16 |
17 | raise ValueError("groupby must be week, day, hour, or hour-binary.")
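18 |
19 | # Usage sketch (df and its "fact_temperature" column are hypothetical; "fact_time" must be datetime):
20 | # ax = violinplot(df, variable="fact_temperature", groupby="hour")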
--------------------------------------------------------------------------------
/ml-dataval-course/dataval/train.py:
--------------------------------------------------------------------------------
1 | import catboost
2 | import pandas as pd
3 |
4 | from sklearn.model_selection import train_test_split
5 | from sklearn import metrics
6 |
7 |
8 | class CatBoostTrainer(object):
9 | def __init__(self, hparams: dict):
10 | self.hparams = hparams
11 | if "random_seed" not in self.hparams:
12 | self.hparams["random_seed"] = 42
13 | if "loss_function" not in self.hparams:
14 | self.hparams["loss_function"] = "RMSE"
15 | if "eval_metric" not in self.hparams:
16 | self.hparams["eval_metric"] = "RMSE"
17 |
18 | self.model = catboost.CatBoostRegressor(**self.hparams)
19 |
20 | def fit(self, X, y, verbose=False, callbacks=None):
21 | # Split X and y into train and test
22 | X_train, X_test, y_train, y_test = train_test_split(
23 | X, y, test_size=0.2, shuffle=False
24 | )
25 | self.model.fit(
26 | X_train,
27 | y_train,
28 | verbose=verbose,
29 | eval_set=(X_test, y_test),
30 | callbacks=callbacks,
31 | )
32 |
33 | def predict(self, X):
34 | preds = self.model.predict(X)
35 | return preds
36 | # if preds.ndim == 2:
37 | # return preds[:, 0]
38 |
39 | def score(self, X, y, metric: str = "MSE"):
40 | if metric == "MSE":
41 | return metrics.mean_squared_error(y, self.predict(X))
42 |
43 | raise ValueError(f"Metric {metric} not supported.")
44 |
45 | def get_feature_importance(self):
46 | importances = self.model.get_feature_importance()
47 | feature_names = self.model.feature_names_
48 | return pd.DataFrame(
49 | {"feature": feature_names, "importance": importances}
50 | ).sort_values(by="importance", ascending=False)
51 |
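52 | # Usage sketch (X, y are hypothetical feature/target data; hparams are CatBoost kwargs):
53 | # trainer = CatBoostTrainer({"iterations": 100})
54 | # trainer.fit(X, y)
55 | # print(trainer.score(X, y, metric="MSE"))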
--------------------------------------------------------------------------------
/ml-dataval-course/download.sh:
--------------------------------------------------------------------------------
1 | wget https://storage.yandexcloud.net/yandex-research/shifts/weather/canonical-partitioned-dataset.tar
2 | tar -xvf canonical-partitioned-dataset.tar
--------------------------------------------------------------------------------
/ml-dataval-course/requirements.txt:
--------------------------------------------------------------------------------
1 | wandb
2 | catboost
3 | duckdb
4 | ipykernel
5 | ipywidgets
6 | jupyterlab
7 | matplotlib
8 | modal-client
9 | numpy
10 | pandas
11 | scikit-learn
12 | gate-drift
13 | seaborn
14 |
--------------------------------------------------------------------------------
/ml-dataval-course/setup.py:
--------------------------------------------------------------------------------
1 | from setuptools import setup, find_packages
2 |
3 | setup(
4 | name="ml-dataval-tutorial",
5 | version="0.1",
6 | author="Shreya Shankar",
7 | author_email="shreyashankar@berkeley.edu",
8 | url="https://github.com/shreyashankar/ml-dataval-tutorial",
9 | packages=find_packages(),
10 | install_requires=open("requirements.txt").readlines(),
11 | scripts=[
12 | "download.sh",
13 | ],
14 | classifiers=[
15 | "Development Status :: 3 - Alpha",
16 | "Intended Audience :: Developers",
17 | "License :: OSI Approved :: MIT License",
18 | "Programming Language :: Python :: 3.8",
19 | "Programming Language :: Python :: 3.9",
20 | ],
21 | )
22 |
--------------------------------------------------------------------------------
/mlops-001/README.md:
--------------------------------------------------------------------------------
1 |
2 |
3 |
4 |
5 |
6 | # Learn Effective MLOps: Model Development
7 |
8 | This repository contains materials for our [MLOps Course](https://www.wandb.courses/courses/effective-mlops-model-development).
9 |
10 | Bringing machine learning models to production is challenging, with a continuous iterative lifecycle that consists of many complex components.
11 |
12 | Having a disciplined, flexible and collaborative process - an effective MLOps system - is crucial to enabling velocity and rigor, and to building an end-to-end machine learning pipeline that continually delivers production-ready ML models and services.
13 |
14 | ## 🚀 [Enroll for free](https://www.wandb.courses/courses/effective-mlops-model-development)
15 |
16 | ## What you'll learn
17 |
18 | - Best practice machine learning workflows
19 | - Exploratory data analysis with Tables and Reports in W&B
20 | - Versioning datasets and models with Artifacts and Model Registry in W&B
21 | - Tracking and analyzing experiments
22 | - Automating hyperparameter optimization with Sweeps
23 | - Model evaluation techniques that ensure reproducibility and enterprise-level governance
24 |
25 |
--------------------------------------------------------------------------------
/mlops-001/lesson1/params.py:
--------------------------------------------------------------------------------
1 | WANDB_PROJECT = "mlops-course-001"
2 | ENTITY = None # set this to team name if working in a team
3 | BDD_CLASSES = {i:c for i,c in enumerate(['background', 'road', 'traffic light', 'traffic sign', 'person', 'vehicle', 'bicycle'])}
4 | RAW_DATA_AT = 'bdd_simple_1k'
5 | PROCESSED_DATA_AT = 'bdd_simple_1k_split'
--------------------------------------------------------------------------------
/mlops-001/lesson1/requirements.txt:
--------------------------------------------------------------------------------
1 | torch>=1.9
2 | fastai>=2.7
3 | matplotlib
4 | numpy
5 | pandas
6 | scikit-learn
7 | torchvision
8 | tqdm
9 | wandb
--------------------------------------------------------------------------------
/mlops-001/lesson2/params.py:
--------------------------------------------------------------------------------
1 | WANDB_PROJECT = "mlops-course-001"
2 | ENTITY = 'av-team'
3 | BDD_CLASSES = {i:c for i,c in enumerate(['background', 'road', 'traffic light', 'traffic sign', 'person', 'vehicle', 'bicycle'])}
4 | RAW_DATA_AT = 'av-team/mlops-course-001/bdd_simple_1k'
5 | PROCESSED_DATA_AT = 'av-team/mlops-course-001/bdd_simple_1k_split'
--------------------------------------------------------------------------------
/mlops-001/lesson2/sweep.yaml:
--------------------------------------------------------------------------------
1 | # The program to run
2 | program: train.py
3 |
4 | # Method can be grid, random or bayes
5 | method: random
6 |
7 | # Project this sweep is part of
8 | project: mlops-course-001
9 | entity: av-team
10 |
11 | # Metric to optimize
12 | metric:
13 | name: miou
14 | goal: maximize
15 |
16 |
17 | # Parameters space to search
18 | parameters:
19 | log_preds:
20 | value: False
21 | lr:
22 | distribution: log_uniform_values
23 | min: 1e-5
24 | max: 1e-2
25 | batch_size:
26 | values: [4, 8]
27 | img_size:
28 | value: 240
29 | arch:
30 | values:
31 | - 'resnet18'
32 | - 'convnext_tiny'
33 | - 'regnet_x_400mf'
34 | - 'mobilenet_v3_small'
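35 |
36 | # To launch this sweep (a typical invocation; the first command prints the sweep id):
37 | #   wandb sweep sweep.yaml
38 | #   wandb agent <sweep_id>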
--------------------------------------------------------------------------------
/mlops-001/lesson3/params.py:
--------------------------------------------------------------------------------
1 | WANDB_PROJECT = "mlops-course-001"
2 | ENTITY = 'av-team'
3 | BDD_CLASSES = {i:c for i,c in enumerate(['background', 'road', 'traffic light', 'traffic sign', 'person', 'vehicle', 'bicycle'])}
4 | RAW_DATA_AT = 'bdd_simple_1k'
5 | PROCESSED_DATA_AT = 'bdd_simple_1k_split'
6 |
--------------------------------------------------------------------------------
/model-dev-course/lesson1/Lesson 1 slides.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wandb/edu/dbf299fda5fbe0fbeb794e04f4cb09291a32cefc/model-dev-course/lesson1/Lesson 1 slides.pdf
--------------------------------------------------------------------------------
/model-dev-course/lesson1/README.md:
--------------------------------------------------------------------------------
1 | # MLOps
2 |
3 | ## Setup
4 |
5 | To run the notebooks, please install the dependencies from [requirements.txt](requirements.txt).
6 |
--------------------------------------------------------------------------------
/model-dev-course/lesson1/assets/annotated_lemons.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wandb/edu/dbf299fda5fbe0fbeb794e04f4cb09291a32cefc/model-dev-course/lesson1/assets/annotated_lemons.png
--------------------------------------------------------------------------------
/model-dev-course/lesson1/assets/lemons.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wandb/edu/dbf299fda5fbe0fbeb794e04f4cb09291a32cefc/model-dev-course/lesson1/assets/lemons.png
--------------------------------------------------------------------------------
/model-dev-course/lesson1/requirements.txt:
--------------------------------------------------------------------------------
1 | torch>=1.9
2 | fastai>=2.7
3 | wandb
4 | pycocotools
5 | scikit-image
6 | pandas
7 | numpy
8 |
9 |
--------------------------------------------------------------------------------
/model-dev-course/lesson2/Lesson 2 slides.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wandb/edu/dbf299fda5fbe0fbeb794e04f4cb09291a32cefc/model-dev-course/lesson2/Lesson 2 slides.pdf
--------------------------------------------------------------------------------
/model-dev-course/lesson2/sweep.yaml:
--------------------------------------------------------------------------------
1 | program: train.py
2 | method: random
3 | project: lemon-project
4 | entity: wandb_course
5 | metric:
6 | name: f1_score
7 | goal: maximize
8 | parameters:
9 | bs:
10 | value: 16
11 | img_size:
12 | values: [256, 512]
13 | arch:
14 | values:
15 | - 'resnet18'
16 | - 'convnext_tiny'
17 | - 'regnetx_004'
18 | lr:
19 | distribution: 'log_uniform_values'
20 | min: 1e-5
21 | max: 1e-2
22 | seed:
23 | values: [1,2,3]
--------------------------------------------------------------------------------
/model-dev-course/lesson3/Lesson 3 Slides.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wandb/edu/dbf299fda5fbe0fbeb794e04f4cb09291a32cefc/model-dev-course/lesson3/Lesson 3 Slides.pdf
--------------------------------------------------------------------------------
/model-management/.dockerignore:
--------------------------------------------------------------------------------
1 | ./wandb/
2 | ./artifacts/
3 | ./models/
4 | ./llm_recipes.egg-info/
5 | ./nbs/
6 | ./output/
--------------------------------------------------------------------------------
/model-management/.gitignore:
--------------------------------------------------------------------------------
1 | wandb/
2 | artifacts/
3 | output/
4 | scratch*
5 | secrets
6 |
--------------------------------------------------------------------------------
/model-management/Dockerfile.eval:
--------------------------------------------------------------------------------
1 | # You can run this by building the image and then doing `docker run` on it,
2 | # e.g. docker build -f Dockerfile.eval -t eval . && docker run eval
3 | # Use an official Python runtime as a parent image
4 | FROM python:3.11-slim-bookworm
5 |
6 | # setup workdir
7 | USER root
8 | WORKDIR /root/src
9 | RUN mkdir -p /root/src
10 | COPY . /root/src/
11 |
12 | # Install gcc and python3-dev
13 | RUN apt-get update && apt-get install -y gcc python3-dev
14 |
15 | # Install any needed packages specified in requirements.txt
16 | RUN python -m pip install --no-cache-dir -r requirements_eval.txt
17 |
18 | # Entry Point
19 | ENTRYPOINT ["python", "eval.py"]
20 |
--------------------------------------------------------------------------------
/model-management/Dockerfile.train:
--------------------------------------------------------------------------------
1 | # You can run this by building the image and then doing `docker run` on it,
2 | # e.g. docker build -f Dockerfile.train -t train . && docker run train
3 | # Use an official PyTorch runtime as a parent image
4 | FROM pytorch/pytorch:latest
5 |
6 | # setup workdir
7 | USER root
8 | WORKDIR /root/src
9 | RUN mkdir -p /root/src
10 | COPY . /root/src/
11 |
12 | # Install any needed packages specified in requirements.txt
13 | RUN python -m pip install --no-cache-dir -r requirements.txt
14 |
15 | # install our lib
16 | RUN python -m pip install .
17 |
18 | # Entry Point
19 | ENTRYPOINT ["python", "train.py"]
--------------------------------------------------------------------------------
/model-management/image.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wandb/edu/dbf299fda5fbe0fbeb794e04f4cb09291a32cefc/model-management/image.png
--------------------------------------------------------------------------------
/model-management/mini_llm/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wandb/edu/dbf299fda5fbe0fbeb794e04f4cb09291a32cefc/model-management/mini_llm/__init__.py
--------------------------------------------------------------------------------
/model-management/mini_llm/data.py:
--------------------------------------------------------------------------------
1 | import wandb
2 | from .utils import load_jsonl
3 |
4 | DEFAULT_ALPACA_SPLIT = 'capecape/alpaca_ft/alpaca_gpt4_splitted:v4'
5 |
6 | def _prompt_no_input(row):
7 | return ("Below is an instruction that describes a task. "
8 | "Write a response that appropriately completes the request.\n\n"
9 | "### Instruction:\n{instruction}\n\n### Response:\n").format_map(row)
10 |
11 | def _prompt_input(row):
12 | return ("Below is an instruction that describes a task, paired with an input that provides further context. "
13 | "Write a response that appropriately completes the request.\n\n"
14 | "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n").format_map(row)
15 |
16 | def create_alpaca_prompt(row):
17 | return _prompt_no_input(row) if row["input"] == "" else _prompt_input(row)
18 |
19 | def create_alpaca_prompt_with_response(row):
20 | instruct = _prompt_no_input(row) if row["input"] == "" else _prompt_input(row)
21 | return instruct + row["output"]
22 |
23 | def get_alpaca_split(dataset_at = DEFAULT_ALPACA_SPLIT):
24 | artifact = wandb.use_artifact(dataset_at, type='dataset')
25 | dataset_dir = artifact.download()
26 |
27 | train_dataset = load_jsonl(f"{dataset_dir}/alpaca_gpt4_train.jsonl")
28 | eval_dataset = load_jsonl(f"{dataset_dir}/alpaca_gpt4_eval.jsonl")
29 |
30 | def _format_dataset(dataset):
31 | "No EOS token yet"
32 | return [{"prompt":create_alpaca_prompt(row),
33 | "output": row["output"]} for row in dataset]
34 |
35 | train_dataset = _format_dataset(train_dataset)
36 | eval_dataset = _format_dataset(eval_dataset)
37 | print(train_dataset[0])
38 | return train_dataset, eval_dataset
--------------------------------------------------------------------------------
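A quick sketch of how the prompt builders above behave; the rows here are hand-written stand-ins, not real rows from the Alpaca-GPT4 artifact:

```python
# Sketch: exercise the prompt builders on made-up rows.
from mini_llm.data import create_alpaca_prompt, create_alpaca_prompt_with_response

row_no_input = {"instruction": "Name three primary colors.", "input": "", "output": "Red, blue, yellow."}
row_with_input = {"instruction": "Summarize the text.", "input": "W&B tracks ML experiments.", "output": "It's an experiment tracker."}

print(create_alpaca_prompt(row_no_input))                 # no-input template
print(create_alpaca_prompt(row_with_input))               # instruction + input template
print(create_alpaca_prompt_with_response(row_no_input))   # prompt with the target response appended
```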
/model-management/mini_llm/openai.py:
--------------------------------------------------------------------------------
1 | from openai import OpenAI
2 | from tenacity import (
3 | retry,
4 | stop_after_attempt,
5 | wait_random_exponential, # for exponential backoff
6 | )
7 |
8 | client = OpenAI()
9 |
10 | @retry(wait=wait_random_exponential(min=1, max=60), stop=stop_after_attempt(6))
11 | def completion_with_backoff(**kwargs):
12 | return client.chat.completions.create(**kwargs)
13 |
14 |
15 |
16 |
--------------------------------------------------------------------------------
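A minimal usage sketch for the retry wrapper above (assumes `OPENAI_API_KEY` is set in the environment; the model name is illustrative):

```python
# Sketch: call the backoff-wrapped chat completion; it retries up to 6 times
# with exponential jitter if the API raises (e.g. on rate limits).
from mini_llm.openai import completion_with_backoff

response = completion_with_backoff(
    model="gpt-4o-mini",  # illustrative model name
    messages=[{"role": "user", "content": "Say hello in one word."}],
)
print(response.choices[0].message.content)
```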
/model-management/mini_llm/utils.py:
--------------------------------------------------------------------------------
1 | import json, argparse
2 | from ast import literal_eval
3 | from pathlib import Path
4 |
5 | def str2bool(v):
6 | "Fix Argparse to process bools"
7 | if isinstance(v, bool):
8 | return v
9 | if v.lower() == 'true':
10 | return True
11 | elif v.lower() == 'false':
12 | return False
13 | else:
14 | raise argparse.ArgumentTypeError('Boolean value expected.')
15 |
16 | def parse_args(config):
17 | print("Running with the following config")
18 | parser = argparse.ArgumentParser(description='Run training baseline')
19 | for k,v in config.__dict__.items():
20 | parser.add_argument('--'+k, type=type(v) if type(v) is not bool else str2bool,
21 | default=v,
22 | help=f"Default: {v}")
23 | args = vars(parser.parse_args())
24 |
25 | # update config with parsed args
26 | for k, v in args.items():
27 | try:
28 |             # attempt to literal_eval it (e.g. bool, number, etc.)
29 | attempt = literal_eval(v)
30 | except (SyntaxError, ValueError):
31 | # if that goes wrong, just use the string
32 | attempt = v
33 | setattr(config, k, attempt)
34 | print(f"--{k}:{v}")
35 |
36 | def load_jsonl(file_path):
37 | data = []
38 | with open(file_path, 'r') as file:
39 | for line in file:
40 | data.append(json.loads(line))
41 | return data
--------------------------------------------------------------------------------
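A sketch of the config-override pattern these helpers support (the same pattern `save_baseline_to_artifact.py` below uses):

```python
# Sketch: defaults live in a SimpleNamespace; parse_args turns each field into
# a --flag and writes any command-line overrides back onto the config.
from types import SimpleNamespace
from mini_llm.utils import parse_args

config = SimpleNamespace(lr=1e-4, epochs=3, use_peft=True)

if __name__ == "__main__":
    parse_args(config)   # e.g. `python script.py --lr 3e-4 --use_peft false`
    print(config.lr, config.epochs, config.use_peft)
```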
/model-management/requirements.txt:
--------------------------------------------------------------------------------
1 | wandb
2 | scipy
3 | transformers
4 | peft
5 | tenacity
6 | accelerate
7 | datasets
8 | evaluate
9 | trl
10 | pandas
--------------------------------------------------------------------------------
/model-management/requirements_eval.txt:
--------------------------------------------------------------------------------
1 | wandb>=0.15
2 | pandas
3 | instructor
4 | openai>=1.1.0
5 | tenacity
6 | tqdm
--------------------------------------------------------------------------------
/model-management/save_baseline_to_artifact.py:
--------------------------------------------------------------------------------
1 | # this script will save predictions from the baseline model to a wandb artifact
2 |
3 | import wandb
4 | import json
5 | from pathlib import Path
6 | from types import SimpleNamespace
7 | import pandas as pd
8 |
9 | from mini_llm.utils import parse_args
10 |
11 | WANDB_PROJECT = "tinyllama"
12 | WANDB_ENTITY = "reviewco"
13 | WANDB_MODEL = "reviewco/model-registry/Small-Instruct-LLM"
14 | TABLE_NAME = "sample_predictions"
15 | ALIAS_BASELINE = "baseline"
16 | ARTIFACT_NAME = "baseline_predictions"
17 |
18 | config = SimpleNamespace(
19 | wandb_model = WANDB_MODEL,
20 | alias_baseline = ALIAS_BASELINE,
21 | table_name = TABLE_NAME,
22 | out_dir="./output",
23 | )
24 |
25 | if __name__ == "__main__":
26 | parse_args(config)
27 | out_dir = Path(config.out_dir)
28 | out_dir.mkdir(parents=True, exist_ok=True)
29 | # create a run to have lineage
30 | wandb.init(project=WANDB_PROJECT, entity=WANDB_ENTITY, job_type="save_artifact", config=config)
31 | config = wandb.config
32 |     # create a new wandb artifact if it doesn't exist
33 | baseline_artifact = wandb.Artifact(ARTIFACT_NAME, type="predictions")
34 |
35 | # download the table from the baseline model
36 | table_artifact = wandb.use_artifact(f"{WANDB_MODEL}:{config.alias_baseline}", type="model")
37 | # Get the producer run ID from the artifact
38 | producer_run_id = table_artifact.logged_by().id
39 | # Retrieve the specific table ('sample_predictions') from the run
40 | table_artifact = wandb.use_artifact(f"run-{producer_run_id}-{config.table_name}:v0")
41 | # Download the table artifact
42 | table = table_artifact.get(config.table_name)
43 |
44 | # add the table to the artifact
45 | baseline_artifact.add(table, config.table_name)
46 | # save the artifact
47 | wandb.log_artifact(baseline_artifact)
48 |
49 |
--------------------------------------------------------------------------------
/model-registry-201/README.md:
--------------------------------------------------------------------------------
1 |
2 |
3 |
4 |
5 |
6 | This repository contains materials for our [Weights & Biases 201: Model Registry](https://www.wandb.courses/courses/201-model-registry) course.
7 |
8 | ## What you'll learn
9 |
10 | ### Learn to log to a centralized Model Registry
11 | Master the process of logging and managing models in a centralized model registry, allowing for an organized and efficient model management workflow.
12 |
13 | ### Master model management
14 | Gain hands-on experience in accessing, evaluating, and automating downstream processes of models, aiding in robust model deployment and monitoring.
15 |
16 | ### Develop reporting techniques
17 | Learn to build dynamic, auto-updating reports showcasing model evaluation and automation processes, enabling clear communication with stakeholders.
18 |
19 | ## Running the code
20 | - Notebooks can be run on your local system or via Google Colab
21 | - If you have questions, you can ask them in [Discord](https://wandb.me/discord) in the `#courses` channel
22 |
23 | ## 🚀 [Enroll for free](https://www.wandb.courses/courses/201-model-registry)
24 |
--------------------------------------------------------------------------------
/prompting/README.md:
--------------------------------------------------------------------------------
1 |
2 |
3 |
4 |
5 |
6 | # Developer's guide to LLM prompting
7 |
8 | This repository contains materials for our [Developer's guide to LLM prompting](https://www.wandb.courses/courses/prompting) course.
9 |
10 | Learn how to integrate LLMs effectively into your applications, leverage prompts to create consistent and reproducible outputs from large language models, and customize LLMs for your use case. Discover the anatomy of prompts, explore advanced prompting techniques, and gain hands-on experience implementing a LLM for a text understanding use case.
11 |
12 | ## 🚀 [Enroll for free](https://www.wandb.courses/courses/prompting)
13 |
14 | ## What you'll learn
15 |
16 | - Craft effective prompts using system messages, context injection, and output indicators to control LLM behavior
17 |
18 | - Understand advanced prompting techniques like zero-shot, few-shot, and chain-of-thought to solve complex tasks
19 |
20 | - Build a functional LLM application with dynamic user inputs and structured outputs for real-world use cases
21 |
22 |
23 | ## Running the code
24 |
25 | - Notebooks can be run on your local system or via Google Colab
26 | - If you have questions, you can ask them in [Discord](https://wandb.me/discord) in the `#courses` channel
27 |
28 | Happy prompting!
29 |
--------------------------------------------------------------------------------
/prompting/text_formatting.py:
--------------------------------------------------------------------------------
1 | import textwrap
2 | from IPython.display import Markdown, display_markdown, display
3 |
4 | class TextWrapperDisplay:
5 | def __init__(self, text, max_width=80):
6 | self.text = text
7 | self.max_width = max_width
8 |
9 | def _repr_pretty_(self, p, cycle):
10 | if cycle:
11 | p.text(f'TextWrapperDisplay(...)')
12 | else:
13 | wrapped_text = textwrap.fill(self.text, width=self.max_width)
14 | p.text(wrapped_text)
15 |
16 | def display_wrapped_text(text, max_width=100):
17 | display(TextWrapperDisplay(text, max_width))
18 |
19 | def escape_xml_tags(text):
20 | text = text.replace("<", "\n<").replace(">", ">\n")
21 | return text
22 |
23 | def render(md_text, markdown=False):
24 | if markdown:
25 | display_markdown(Markdown(escape_xml_tags(md_text)))
26 | else:
27 | display_wrapped_text(md_text)
--------------------------------------------------------------------------------
/pyimagesearch/README.md:
--------------------------------------------------------------------------------
1 |
2 |
3 |
4 |
5 |
6 | # W&B + PyImageSearch MLOps Course
7 |
8 | This repo contains the code for the course: https://pyimagesearch.mykajabi.com/offers/LQSsX59C
9 |
10 |
11 | 1. Download the dataset.
12 | ```shell
13 | $ git clone -qq https://github.com/softwaremill/lemon-dataset.git
14 | $ unzip -q lemon-dataset/data/lemon-dataset.zip
15 | ```
16 |
17 | 1.5 Open the `params.py` file and change the wandb parameters to your own project.
18 | ```python
19 | PROJECT_NAME = "wandb_course"
20 | ENTITY = "pyimagesearch" # just set this to your username
21 | ```
22 |
23 | 2. Run `python prepare_data.py` to prepare the dataset as an artifact and upload it to W&B.
24 |
25 | 3. Run `python eda.py` to do Exploratory Data Analysis on the dataset. We also upload the wandb Table from our analysis to W&B.
26 |
27 | 4. Train the model using `python train.py`
28 |
29 | 5. Evaluate the model using `python eval.py`
30 |
--------------------------------------------------------------------------------
/pyimagesearch/environment.yml:
--------------------------------------------------------------------------------
1 | ## Conda Environment file
2 | name: course
3 | channels:
4 | - pytorch
5 | - nvidia
6 | - conda-forge
7 | dependencies:
8 | - python==3.10.10
9 | - pytorch==1.13.1
10 | - pytorch-cuda==11.7
11 | - pycocotools==2.0.6
12 | - timm==0.6.12
13 | - pip==23.0.1
14 | - pandas==2.0.0
15 | - wandb==0.14.2
16 | - torchvision==0.14.1
17 | - scikit-image==0.19.3
18 | - scikit-learn==1.2.1
19 | - matplotlib
20 | - pip:
21 | - fastprogress
22 | - torcheval
--------------------------------------------------------------------------------
/pyimagesearch/params.py:
--------------------------------------------------------------------------------
1 | # W&B parameters
2 | PROJECT_NAME = "pis_course"
3 | ENTITY = None # change this to your wandb team/entity name
4 |
5 | # ENTITY = None
6 |
7 |
8 | # parameters for the dataset
9 | RAW_DATA_FOLDER = "lemon-dataset"
10 | IMAGES_FOLDER = "images"
11 | ANNOTATIONS_FILE = "annotations/instances_default.json"
12 | ARTIFACT_NAME = "lemon_data"
13 | DATA_AT = f"{PROJECT_NAME}/{ARTIFACT_NAME}:v0"
14 | # DATA_AT = f"{ENTITY}/{PROJECT_NAME}/{ARTIFACT_NAME}:v0" # if Entity is not None
--------------------------------------------------------------------------------
/pyimagesearch/requirements.txt:
--------------------------------------------------------------------------------
1 | black==23.1.0
2 | isort==5.12.0
3 | scikit-image==0.20.0
4 | wandb>=0.14.0
5 | pycocotools==2.0.6
6 | timm==0.6.12
7 | fastprogress==1.0.3
8 | torcheval==0.0.6
9 | torch>=1.10.0
10 | pandas>=2.0.0
11 | scikit-learn>=1.3.0
--------------------------------------------------------------------------------
/pyimagesearch/sweep.yaml:
--------------------------------------------------------------------------------
1 | program: train.py
2 | method: random
3 | entity: pyimagesearch
4 | project: wandb_course
5 | metric:
6 | name: valid_binary_f1_score
7 | goal: maximize
8 | parameters:
9 | batch_size:
10 | value: 16
11 | image_size:
12 | values: [256, 512]
13 | model_arch:
14 | values:
15 | - 'resnet18'
16 | - 'convnext_tiny'
17 | - 'regnetx_004'
18 | learning_rate:
19 | distribution: 'log_uniform_values'
20 | min: 1e-5
21 | max: 1e-2
22 | seed:
23 | values: [1,2,3]
--------------------------------------------------------------------------------
/rag-advanced/.gitignore:
--------------------------------------------------------------------------------
1 | .env
2 |
3 | data/wandb_docs/*
--------------------------------------------------------------------------------
/rag-advanced/README.md:
--------------------------------------------------------------------------------
1 | # RAG++
2 |
3 | This repository contains the code for the [RAG++ : From POC to Production course](https://www.wandb.courses/courses/rag-in-production).
4 |
5 | **LLM-friendly Resources**
6 |
7 | To help you learn with ChatGPT etc., a full-text concatenation of the entire course transcript and course notebooks [can be found here](https://github.com/wandb/edu/tree/main/rag-advanced/resources#llm-friendly-resources)
8 |
9 | # Course Overview
10 |
11 | Practical RAG techniques for engineers: learn production-ready solutions from industry experts to optimize performance, cut costs, and enhance the accuracy and relevance of your applications.
12 |
13 | ## Setup and Installation
14 |
15 | First create an environment with python 3.10:
16 | ```bash
17 | conda create -n rag-advanced python=3.10
18 | conda activate rag-advanced
19 | ```
20 |
21 | Then install the requirements:
22 | ```bash
23 | pip install -r requirements.txt
24 | ```
25 |
26 | ## Project Structure
27 |
28 | The project is structured as follows:
29 |
30 | ```
31 | ├── data # contains the data used for the course
32 | ├── notebooks # contains the notebooks and colabs used for the course
33 | ├── src # contains the source code for the course
34 | ```
35 |
--------------------------------------------------------------------------------
/rag-advanced/data/.gitinclude:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wandb/edu/dbf299fda5fbe0fbeb794e04f4cb09291a32cefc/rag-advanced/data/.gitinclude
--------------------------------------------------------------------------------
/rag-advanced/images/01_chunked_data.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wandb/edu/dbf299fda5fbe0fbeb794e04f4cb09291a32cefc/rag-advanced/images/01_chunked_data.png
--------------------------------------------------------------------------------
/rag-advanced/images/01_raw_data.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wandb/edu/dbf299fda5fbe0fbeb794e04f4cb09291a32cefc/rag-advanced/images/01_raw_data.png
--------------------------------------------------------------------------------
/rag-advanced/images/01_weave_trace_timeline.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wandb/edu/dbf299fda5fbe0fbeb794e04f4cb09291a32cefc/rag-advanced/images/01_weave_trace_timeline.png
--------------------------------------------------------------------------------
/rag-advanced/images/03_compare_retriever_responses.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wandb/edu/dbf299fda5fbe0fbeb794e04f4cb09291a32cefc/rag-advanced/images/03_compare_retriever_responses.png
--------------------------------------------------------------------------------
/rag-advanced/images/03_compare_retrievers.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wandb/edu/dbf299fda5fbe0fbeb794e04f4cb09291a32cefc/rag-advanced/images/03_compare_retrievers.png
--------------------------------------------------------------------------------
/rag-advanced/images/04_compare_query_enhanced_responses.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wandb/edu/dbf299fda5fbe0fbeb794e04f4cb09291a32cefc/rag-advanced/images/04_compare_query_enhanced_responses.png
--------------------------------------------------------------------------------
/rag-advanced/images/06_compare_prompts.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wandb/edu/dbf299fda5fbe0fbeb794e04f4cb09291a32cefc/rag-advanced/images/06_compare_prompts.png
--------------------------------------------------------------------------------
/rag-advanced/notebooks/prompts/initial_system.txt:
--------------------------------------------------------------------------------
1 | Answer the following question about W&B. Provide a helpful and complete answer based only on the provided documents.
2 |
--------------------------------------------------------------------------------
/rag-advanced/notebooks/prompts/query_enhanced_system.txt:
--------------------------------------------------------------------------------
1 | Answer the user's question about W&B.
2 | Please respond to the user in the following language: {language}
3 | We have also identified the following intents based on the user's query:
4 | {intents}
5 | Tailor your answer to address the above intents.
6 | Provide a helpful and complete answer based only on the provided documents.
--------------------------------------------------------------------------------
/rag-advanced/notebooks/prompts/search_query.json:
--------------------------------------------------------------------------------
1 | [
2 | {
3 | "role": "system",
4 |     "content": "## Instruction\nThe user will provide you with a Weights & Biases related question.\nYour goal is to generate 5 distinct search queries so that relevant information can be gathered from the web to answer the user's question.\nRespond only with a list of search queries delimited by new-lines and no other text.\n"
5 | }
6 | ]
--------------------------------------------------------------------------------
/rag-advanced/notebooks/scripts/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wandb/edu/dbf299fda5fbe0fbeb794e04f4cb09291a32cefc/rag-advanced/notebooks/scripts/__init__.py
--------------------------------------------------------------------------------
/rag-advanced/requirements.txt:
--------------------------------------------------------------------------------
1 | weave>=0.51.27
2 | cohere>=5.13.6
3 | beautifulsoup4>=4.12.3
4 | levenshtein>=0.25.1
5 | markdown-it-py>=3.0.0
6 | nltk>=3.8.1
7 | pandas>=2.2.2
8 | pymdown-extensions>=10.8.1
9 | python-dotenv>=1.0.1
10 | ranx>=0.3.19
11 | rouge>=1.0.1
12 | scikit-learn>=1.5.0
13 | bm25s>=0.1.10
14 | PyStemmer>=2.2.0.1
15 | scipy>=1.14.0
16 | fasttext-langdetect==1.0.5
17 | tiktoken>=0.7.0
18 | python-frontmatter>=1.1.0
19 | syncer<=2.0.3
20 | numpy<2.0.0
21 |
--------------------------------------------------------------------------------
/rag-advanced/resources/Chapter 0 - extras/Chapter00.md:
--------------------------------------------------------------------------------
1 | ## Chapter 0: Setup
2 |
3 |
4 |
5 |
6 |
7 |
8 |
9 | Let's install the required packages and check our setup for this course.
10 |
11 | ### 🎉 Free Cohere API key
12 |
13 | Before you run this colab notebook, head over to this [link to redeem a free Cohere API key](https://docs.google.com/forms/d/e/1FAIpQLSc9x4nV8_nSQvJnaINO1j9NIa2IUbAJqrKeSllNNCCbMFmCxw/viewform?usp=sf_link).
14 |
15 | Alternatively if you have a Cohere API key feel free to proceed. :)
16 |
17 |
18 | ```
19 | !pip install -qq weave cohere
20 | ```
21 |
22 | ## 1. Setup Weave
23 |
24 |
25 | The code cell below will prompt you to put in a W&B API key. You can get your API key by heading over to https://wandb.ai/authorize.
26 |
27 |
28 | ```
29 | # import weave
30 | import weave
31 |
32 | # initialize weave client
33 | weave_client = weave.init("rag-course")
34 | ```
35 |
36 | ## 2. Setup Cohere
37 |
38 | The code cell below will prompt you to put in a Cohere API key.
39 |
40 |
41 | ```
42 | import getpass
43 |
44 | import cohere
45 |
46 | cohere_client = cohere.ClientV2(
47 | api_key=getpass.getpass("Please enter your COHERE_API_KEY")
48 | )
49 | ```
50 |
51 | ## A single-turn chat with Cohere's command-r-plus
52 |
53 |
54 | ```
55 | response = cohere_client.chat(
56 | messages=[
57 | {"role": "user", "content": "What is retrieval augmented generation (RAG)?"}
58 | ],
59 | model="command-r-plus",
60 | temperature=0.1,
61 | max_tokens=2000,
62 | )
63 | ```
64 |
65 | Let's head over to the weave URL to check out the generated response.
66 |
--------------------------------------------------------------------------------
/rag-advanced/resources/Chapter 0 - extras/RAG-0.3-course outro.srt:
--------------------------------------------------------------------------------
1 | 1
2 | 00:00:00,000 --> 00:00:02,000
3 | That concludes our RAG++ course.
4 |
5 | 2
6 | 00:00:02,000 --> 00:00:04,000
7 | Thank you for joining us and taking the time
8 |
9 | 3
10 | 00:00:04,000 --> 00:00:05,000
11 | to enhance your skills
12 |
13 | 4
14 | 00:00:05,000 --> 00:00:08,000
15 | in building production-ready RAG systems.
16 |
17 | 5
18 | 00:00:08,000 --> 00:00:09,000
19 | We appreciate your commitment
20 |
21 | 6
22 | 00:00:09,000 --> 00:00:12,000
23 | and hope you found the course valuable.
24 |
25 | 7
26 | 00:00:12,000 --> 00:00:14,000
27 | Your feedback is important to us.
28 |
29 | 8
30 | 00:00:14,000 --> 00:00:16,000
31 | Please take a moment to leave a review
32 |
33 | 9
34 | 00:00:16,000 --> 00:00:18,000
35 | and share your thoughts.
36 |
37 | 10
38 | 00:00:18,000 --> 00:00:19,000
39 | Thank you once again,
40 |
41 | 11
42 | 00:00:19,000 --> 00:00:21,000
43 | and we look forward to seeing the innovative solutions
44 |
45 | 12
46 | 00:00:21,000 --> 00:00:23,000
47 | you will be creating.
48 |
49 | 13
50 | 00:00:23,000 --> 00:00:25,000
51 | Best of luck and happy building.
52 |
53 |
--------------------------------------------------------------------------------
/rag-advanced/resources/Chapter 1/RAG-1.1-chapter introduction.srt:
--------------------------------------------------------------------------------
1 | 1
2 | 00:00:00,000 --> 00:00:05,000
3 | Hello everyone and welcome to the first chapter of our course on retrieval augmented generation.
4 |
5 | 2
6 | 00:00:05,000 --> 00:00:10,000
7 | In this chapter, we'll set the stage for the rest of the course by introducing a few concepts,
8 |
9 | 3
10 | 00:00:10,000 --> 00:00:15,000
11 | challenges, and best practices in RAG systems. We'll also give you a sneak peek into wandbot,
12 |
13 | 4
14 | 00:00:15,000 --> 00:00:20,000
15 | our production RAG system that we'll use as a case study throughout this course.
16 |
17 | 5
18 | 00:00:20,000 --> 00:00:24,000
19 | Here's what you can expect to learn in this chapter. First, we'll explore the core components
20 |
21 | 6
22 | 00:00:24,000 --> 00:00:29,000
23 | that make up an advanced RAG system. Then, we'll discuss the challenges that you're likely to face
24 |
25 | 7
26 | 00:00:30,000 --> 00:00:36,000
27 | implementing a RAG system. Next, we'll then cover a few best practices that we've developed through
28 |
29 | 8
30 | 00:00:36,000 --> 00:00:41,000
31 | our experience with building wandbot. And finally, we'll put theory into practice by guiding you
32 |
33 | 9
34 | 00:00:41,000 --> 00:00:46,000
35 | through building a RAG system. This hands-on experience will serve as the foundation that
36 |
37 | 10
38 | 00:00:46,000 --> 00:00:50,000
39 | we'll build upon and refine through the rest of the course. By the end of this chapter,
40 |
41 | 11
42 | 00:00:50,000 --> 00:00:55,000
43 | you should have the necessary foundation required to start your journey into building an advanced
44 |
45 | 12
46 | 00:00:55,000 --> 00:00:58,000
47 | RAG system that we will explore in this course.
48 |
49 |
--------------------------------------------------------------------------------
/rag-advanced/resources/Chapter 2/RAG-2.1-chapter introduction.srt:
--------------------------------------------------------------------------------
1 | 1
2 | 00:00:00,000 --> 00:00:05,000
3 | Hello, my name is Ayush and I will be your instructor for this chapter focused on evaluating
4 |
5 | 2
6 | 00:00:05,000 --> 00:00:10,000
7 | LLM applications. We will build from where Bharat left off in the last chapter.
8 |
9 | 3
10 | 00:00:10,000 --> 00:00:14,000
11 | Before we start, a crucial disclaimer. Every application is unique, so we designed this
12 |
13 | 4
14 | 00:00:14,000 --> 00:00:20,000
15 | chapter to help build a mental model of approaching evaluation. In this chapter, we will explore
16 |
17 | 5
18 | 00:00:20,000 --> 00:00:25,000
19 | the importance of evaluating LLM applications. We will dive into practical approaches and
20 |
21 | 6
22 | 00:00:25,000 --> 00:00:31,000
23 | cover a few strategies. We will explore ideas to build eval datasets and finally try to
24 |
25 | 7
26 | 00:00:31,000 --> 00:00:35,000
27 | drill the idea of evaluation-driven development.
28 |
29 | 8
30 | 00:00:35,000 --> 00:00:39,000
31 | Talking about evaluation-driven development, we learned it the hard way. A few months ago,
32 |
33 | 9
34 | 00:00:39,000 --> 00:00:44,000
35 | we refactored wandbot to make it faster and better. In a rush to get it done quickly,
36 |
37 | 10
38 | 00:00:44,000 --> 00:00:49,000
39 | in two weeks time, we skipped through evaluating major changes. Well, our first evaluation
40 |
41 | 11
42 | 00:00:49,000 --> 00:00:54,000
43 | of the refactored branch gave only 9% accuracy, which was a significant drop from our baseline
44 |
45 | 12
46 | 00:00:54,000 --> 00:00:59,000
47 | of 72%. We made a mistake, so you don't have to. In order to do it right, we created
48 |
49 | 13
50 | 00:00:59,000 --> 00:01:06,000
51 | a GitHub branch, cherry-picked the changes and evaluated each one. After running 50 evaluations
52 |
53 | 14
54 | 00:01:06,000 --> 00:01:12,000
55 | and spending $2000 over a span of another 6 weeks, we identified the key issues, resolved
56 |
57 | 15
58 | 00:01:12,000 --> 00:01:16,000
59 | them and ultimately improved the accuracy by 8% over the baseline.
60 |
61 |
--------------------------------------------------------------------------------
/rag-advanced/resources/Chapter 2/RAG-2.10-LLM eval limitations.srt:
--------------------------------------------------------------------------------
1 | 1
2 | 00:00:00,000 --> 00:00:05,000
3 | Well, LLM evaluators have limitations as well, but there are some solutions.
4 |
5 | 2
6 | 00:00:05,000 --> 00:00:11,000
7 | The evaluator is non-deterministic, thus ideally we should be running 3-5 trials per experiment.
8 |
9 | 3
10 | 00:00:11,000 --> 00:00:15,000
11 | You might have already thought about it: if we are using an LLM to evaluate an LLM system,
12 |
13 | 4
14 | 00:00:15,000 --> 00:00:21,000
15 | who evaluates the evaluator? Remember the concept of alignment, where we align the evaluator's
16 |
17 | 5
18 | 00:00:21,000 --> 00:00:27,000
19 | judgement with that of human judgement with careful prompting.
20 |
21 | 6
22 | 00:00:27,000 --> 00:00:31,000
23 | Cost is obviously higher compared to traditional metrics, but remember you are using LLM evaluator
24 |
25 | 7
26 | 00:00:31,000 --> 00:00:37,000
27 | to measure abstract concepts and manual evaluation will be very costly and time consuming.
28 |
29 | 8
30 | 00:00:37,000 --> 00:00:43,000
31 | Finally, the evaluation dataset needs to be updated if the data source has changed, thus
32 |
33 | 9
34 | 00:00:43,000 --> 00:00:48,000
35 | requiring a careful inspection of the evaluator's alignment with human judgement. In my opinion,
36 |
37 | 10
38 | 00:00:48,000 --> 00:00:52,000
39 | having a dedicated team for evaluation makes a huge difference.
40 |
41 |
--------------------------------------------------------------------------------
/rag-advanced/resources/Chapter 2/RAG-2.11-pairwise eval.srt:
--------------------------------------------------------------------------------
1 | 1
2 | 00:00:00,000 --> 00:00:04,000
3 | Circling back to the wandbot example at the start of this chapter where I showed an 8%
4 |
5 | 2
6 | 00:00:04,000 --> 00:00:10,000
7 | increase in the performance, well we were very happy with our evaluator's judgement
8 |
9 | 3
10 | 00:00:10,000 --> 00:00:15,000
11 | but we took an extra step to set up a simple A-B test or pairwise evaluation using humans.
12 |
13 | 4
14 | 00:00:16,000 --> 00:00:20,000
15 | We set up this A-B test using Argilla and, alternatively, using a simple Gradio app.
16 |
17 | 5
18 | 00:00:21,000 --> 00:00:26,000
19 | We asked our in-house MLEs to choose the response that they felt answered the given query better.
20 |
21 | 6
22 | 00:00:26,000 --> 00:00:34,000
23 | Well, our new system was preferred 60% of the time compared to the older system which was
24 |
25 | 7
26 | 00:00:34,000 --> 00:00:40,000
27 | preferred only 40% of the time. If possible, setting up such vibe checks will give more
28 |
29 | 8
30 | 00:00:40,000 --> 00:00:43,000
31 | confidence in deploying your application.
32 |
33 |
--------------------------------------------------------------------------------
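A tiny sketch of tallying such pairwise (A/B) votes into preference rates; the votes here are made up:

```python
# Sketch: compute preference rates from pairwise human votes ("a" or "b").
votes = ["a", "a", "b", "a", "b", "a"]  # made-up labels: a = new system, b = old system
pref_a = votes.count("a") / len(votes)
print(f"new system preferred {pref_a:.0%}, old system {1 - pref_a:.0%}")
```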
/rag-advanced/resources/Chapter 2/RAG-2.12-conclusion.srt:
--------------------------------------------------------------------------------
1 | 1
2 | 00:00:00,000 --> 00:00:04,000
3 | Well, this is the end of the chapter. I hope you learned something new and useful.
4 |
5 | 2
6 | 00:00:04,000 --> 00:00:06,000
7 | To conclude, here are some of the take home pointers.
8 |
9 | 3
10 | 00:00:08,000 --> 00:00:12,000
11 | Make sure to spend time and resources on evaluation and evaluate all the non-deterministic
12 |
13 | 4
14 | 00:00:12,000 --> 00:00:18,000
15 | components both individually and end-to-end. Use a mix of direct, pairwise and reference
16 |
17 | 5
18 | 00:00:18,000 --> 00:00:24,000
19 | evaluation depending on your use case. Start building your evaluation set from the get-go,
20 |
21 | 6
22 | 00:00:24,000 --> 00:00:31,000
23 | but keep iterating on it. LLM evaluators are a strong tool, but use them wisely. Make sure to
24 |
25 | 7
26 | 00:00:31,000 --> 00:00:38,000
27 | align the judgment with that of humans. Finally, make sure to spend time to integrate your
28 |
29 | 8
30 | 00:00:38,000 --> 00:00:43,000
31 | evaluation pipeline with better tooling early on to iterate rapidly. I hope it was useful.
32 |
33 | 9
34 | 00:00:44,000 --> 00:00:46,000
35 | Let's see you in chapter 3.
36 |
37 |
--------------------------------------------------------------------------------
/rag-advanced/resources/Chapter 2/RAG-2.5-LLM evaluator-.srt:
--------------------------------------------------------------------------------
1 | 1
2 | 00:00:00,000 --> 00:00:05,000
3 | Now let's evaluate our retriever using an LLM evaluator. We will talk more formally about
4 |
5 | 2
6 | 00:00:05,000 --> 00:00:10,000
7 | this concept when we will evaluate a response generator. But here let me show you an LLM
8 |
9 | 3
10 | 00:00:10,000 --> 00:00:17,000
11 | evaluator in action for evaluating a component of the RAG pipeline. The idea is to first ask an LLM
12 |
13 | 4
14 | 00:00:17,000 --> 00:00:22,000
15 | to score each retrieved context based on its relevance to a given question. You can check
16 |
17 | 5
18 | 00:00:22,000 --> 00:00:28,000
19 | out the system prompt. The instruction is documented here which is to rank documents
20 |
21 | 6
22 | 00:00:28,000 --> 00:00:33,000
23 | based on their relevance to a given question and answer pair. The instructions are also
24 |
25 | 7
26 | 00:00:33,000 --> 00:00:40,000
27 | followed by a rubric or the scoring criteria. Here the criteria is to give a score in a range of 0 to
28 |
29 | 8
30 | 00:00:40,000 --> 00:00:48,000
31 | 2 where 0 represents that the document is irrelevant whereas 1 is neutral and 2 is that
32 |
33 | 9
34 | 00:00:48,000 --> 00:00:53,000
35 | the document is highly relevant to the question answer pair. The final output of the judge should
36 |
37 | 10
38 | 00:00:53,000 --> 00:01:00,000
39 | look something like this, where each context id is given a relevance score.
40 |
41 | 11
42 | 00:01:05,000 --> 00:01:10,000
43 | Based on this scoring by the LLM we can then compute two metrics: the mean relevance as well as the
44 |
45 | 12
46 | 00:01:10,000 --> 00:01:18,000
47 | rank score. The rank metric measures the position of the relevant chunks. We then set up our
48 |
49 | 13
50 | 00:01:18,000 --> 00:01:25,000
51 | evaluation like we did in the past and run the evaluation using our LLM as a judge metric.
52 |
53 |
--------------------------------------------------------------------------------
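A minimal sketch of the 0-2 relevance-scoring judge described above, using the Cohere client introduced in Chapter 0. The rubric wording and the JSON shape are paraphrased for illustration, not the course's exact prompt:

```python
# Sketch: ask an LLM to score each retrieved context 0-2 for relevance to a
# question/answer pair, then compute the mean relevance. Prompt is illustrative.
import json
import os

import cohere

client = cohere.ClientV2(api_key=os.environ["COHERE_API_KEY"])

JUDGE_PROMPT = """Score each context for relevance to the question and answer.
Use 0 = irrelevant, 1 = neutral, 2 = highly relevant.
Respond with JSON: {{"scores": [{{"id": <context id>, "relevance": <0|1|2>}}, ...]}}

Question: {question}
Answer: {answer}
Contexts:
{contexts}"""

def judge_contexts(question: str, answer: str, contexts: list[str]) -> float:
    numbered = "\n".join(f"[{i}] {c}" for i, c in enumerate(contexts))
    response = client.chat(
        model="command-r-plus",
        messages=[{"role": "user", "content": JUDGE_PROMPT.format(
            question=question, answer=answer, contexts=numbered)}],
        temperature=0.0,
    )
    scores = json.loads(response.message.content[0].text)["scores"]
    return sum(s["relevance"] for s in scores) / (2 * len(scores))  # mean relevance in [0, 1]
```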
/rag-advanced/resources/Chapter 2/RAG-2.6-assertions.srt:
--------------------------------------------------------------------------------
1 | 1
2 | 00:00:00,000 --> 00:00:05,000
3 | But hey, before we do any fancy evaluation, let's take a step back from LLM as a judge
4 |
5 | 2
6 | 00:00:05,000 --> 00:00:10,000
7 | jargon and try to express the desired output in plain code. This is direct evaluation using
8 |
9 | 3
10 | 00:00:10,000 --> 00:00:16,000
11 | heuristics. The heuristics should form the first layer of your system evaluation. We
12 |
13 | 4
14 | 00:00:16,000 --> 00:00:21,000
15 | can use these metrics to inspect the structure of the response. Some examples can be to check
16 |
17 | 5
18 | 00:00:21,000 --> 00:00:26,000
19 | if there are bullet points or code snippets in the response. We can also check if the
20 |
21 | 6
22 | 00:00:26,000 --> 00:00:31,000
23 | response is of a certain length or is a valid JSON in case of structured output. Doing these
24 |
25 | 7
26 | 00:00:31,000 --> 00:00:36,000
27 | types of heuristics based evaluation will reinforce what we expect and hence make the
28 |
29 | 8
30 | 00:00:36,000 --> 00:00:39,000
31 | system more robust.
32 |
33 | 9
34 | 00:00:39,000 --> 00:00:44,000
35 | Writing assertions or unit tests is application-specific. A more generic evaluation strategy
36 |
37 | 10
38 | 00:00:44,000 --> 00:00:50,000
39 | is to do reference-based evaluation using traditional metrics, like comparing the similarity
40 |
41 | 11
42 | 00:00:50,000 --> 00:00:55,000
43 | ratio between the normalized model output and the expected answer, or computing the
44 |
45 | 12
46 | 00:00:55,000 --> 00:01:03,000
47 | Levenshtein distance. We are also showing both ROUGE and BLEU metrics. We again set up our evaluation
48 |
49 | 13
50 | 00:01:03,000 --> 00:01:08,000
51 | using Weave Evaluation, but this time we evaluate a simple RAG pipeline.
52 |
53 |
--------------------------------------------------------------------------------
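A sketch of the kind of plain-code assertions described above; the specific checks are examples, not the course's exact metric set:

```python
# Sketch: direct evaluation with heuristics -- inspect the structure of a
# response in plain Python before reaching for an LLM judge.
import json

def _is_json(text: str) -> bool:
    try:
        json.loads(text)
        return True
    except ValueError:
        return False

def response_checks(response: str) -> dict:
    return {
        "has_bullet_points": any(line.lstrip().startswith(("-", "*")) for line in response.splitlines()),
        "has_code_snippet": "```" in response,
        "within_length": 50 <= len(response) <= 2000,
        "is_valid_json": _is_json(response),
    }

print(response_checks("- Use `wandb.init()` to start a run.\n- Log metrics with `wandb.log()`."))
```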
/rag-advanced/resources/Chapter 2/RAG-2.7-limitations.srt:
--------------------------------------------------------------------------------
1 | 1
2 | 00:00:00,000 --> 00:00:04,000
3 | The traditional metrics for evaluating generation fall short on measuring the semantics deeply
4 |
5 | 2
6 | 00:00:04,000 --> 00:00:08,000
7 | and usually ignore the contextual relevance of the user query.
8 |
9 | 3
10 | 00:00:08,000 --> 00:00:12,000
11 | They usually perform fuzzy exact matches, which are hard to interpret.
12 |
13 | 4
14 | 00:00:12,000 --> 00:00:17,000
15 | Furthermore, most RAG pipelines are nuanced and subjective, demanding better evaluation
16 |
17 | 5
18 | 00:00:17,000 --> 00:00:23,000
19 | metrics. Note however that these traditional metrics can be included in the evaluation suite
20 |
21 | 6
22 | 00:00:23,000 --> 00:00:29,000
23 | because of their speed. LLM evaluators can help overcome some of these limitations
24 |
25 | 7
26 | 00:00:29,000 --> 00:00:36,000
27 | but have their own set of problems. We have already used LLM as a judge to evaluate our retriever
28 |
29 | 8
30 | 00:00:36,000 --> 00:00:42,000
31 | but let us define what it means more formally here. The idea of LLM evaluator is based on two
32 |
33 | 9
34 | 00:00:42,000 --> 00:00:47,000
35 | facts. One, a powerful LLM can compare pieces of text and second it can follow instructions.
36 |
37 | 10
38 | 00:00:48,000 --> 00:00:54,000
39 | Using these two facts we can give a powerful LLM pieces of text like the retrieve context,
40 |
41 | 11
42 | 00:00:54,000 --> 00:01:00,000
43 | the generated response and the user query. We also give it a set of instructions which outline
44 |
45 | 12
46 | 00:01:00,000 --> 00:01:07,000
47 | the scoring criteria. The LLM then gives a score based on the learned internal representations.
48 |
49 | 13
50 | 00:01:07,000 --> 00:01:12,000
51 | One can pause here and ask two important questions. Are these scores deterministic
52 |
53 | 14
54 | 00:01:12,000 --> 00:01:18,000
55 | to which I would say no. If we are using an LLM to evaluate an LLM system,
56 |
57 | 15
58 | 00:01:18,000 --> 00:01:23,000
59 | how can we evaluate the LLM evaluator in the first place? Well, we will get to it later.
60 |
61 |
--------------------------------------------------------------------------------
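A sketch of the reference-based metrics mentioned above, using the `levenshtein` and `rouge` packages from this course's requirements; the normalization here is just lowercasing and whitespace collapsing:

```python
# Sketch: compare a model output against a reference answer with fast,
# traditional metrics -- Levenshtein similarity ratio and ROUGE-L F1.
from Levenshtein import ratio
from rouge import Rouge

def normalize(text: str) -> str:
    return " ".join(text.lower().split())

def reference_scores(output: str, reference: str) -> dict:
    rouge_l = Rouge().get_scores(output, reference)[0]["rouge-l"]["f"]
    return {
        "levenshtein_ratio": ratio(normalize(output), normalize(reference)),
        "rouge_l_f1": rouge_l,
    }

print(reference_scores("W&B logs metrics to the cloud.", "Weights & Biases logs metrics."))
```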
/rag-advanced/resources/Chapter 2/RAG-2.9-re-evaluting models.srt:
--------------------------------------------------------------------------------
1 | 1
2 | 00:00:00,000 --> 00:00:06,000
3 | What's even more exciting is that because our simple RAG pipeline is written using weave.Model
4 |
5 | 2
6 | 00:00:06,000 --> 00:00:13,000
7 | or in a structured manner, we can easily switch the underlying LLM from Command R to Command R+, re-initialize
8 |
9 | 3
10 | 00:00:13,000 --> 00:00:17,000
11 | the RAG pipeline and run the evaluation on all the metrics that we have discussed so
12 |
13 | 4
14 | 00:00:17,000 --> 00:00:20,000
15 | far.
16 |
17 | 5
18 | 00:00:20,000 --> 00:00:25,000
19 | Using the Weave dashboard, we can easily compare two evaluations.
20 |
21 | 6
22 | 00:00:25,000 --> 00:00:31,000
23 | In this case, I am comparing the evaluation of two RAG pipelines, one using Command R and another
24 |
25 | 7
26 | 00:00:31,000 --> 00:00:32,000
27 | one using Command R+.
28 |
29 | 8
30 | 00:00:32,000 --> 00:00:39,000
31 | Click on the compare button and we have the comparison dashboard created for us automatically.
32 |
33 | 9
34 | 00:00:39,000 --> 00:00:44,000
35 | We can look through the differences or the delta of the metrics that changed and the
36 |
37 | 10
38 | 00:00:44,000 --> 00:00:50,000
39 | best part is that we can look through individual samples and compare the answer for all of
40 |
41 | 11
42 | 00:00:50,000 --> 00:00:51,000
43 | these.
44 |
45 | 12
46 | 00:00:51,000 --> 00:00:57,000
47 | For example, here this particular output got a score of 1, whereas this output only got
48 |
49 | 13
50 | 00:00:57,000 --> 00:01:00,000
51 | a score of 0.
52 |
53 | 14
54 | 00:01:00,000 --> 00:01:06,000
55 | Maybe there was something wrong when Command R was used and we can literally run through
56 |
57 | 15
58 | 00:01:06,000 --> 00:01:11,000
59 | all the 20 samples here and try to figure out where our system is failing.
60 |
61 | 16
62 | 00:01:11,000 --> 00:01:17,000
63 | This concludes how we can use evaluation on Weave for evaluation-driven development to
64 |
65 | 17
66 | 00:01:17,000 --> 00:01:19,000
67 | improve the quality of our RAG pipeline.
68 |
69 |
--------------------------------------------------------------------------------
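A sketch of the swap-and-re-evaluate workflow described above, using Weave's evaluation API; the model class, scorer, and dataset are toy stand-ins for the course's pipeline:

```python
# Sketch: a weave.Model makes the LLM choice a parameter, so re-running the
# same weave.Evaluation with a different model gives comparable dashboards.
import asyncio

import weave

class SimpleRAGPipeline(weave.Model):
    model_name: str  # e.g. "command-r" vs "command-r-plus"

    @weave.op()
    def predict(self, question: str) -> str:
        return f"[{self.model_name}] answer to: {question}"  # stand-in for retrieve + generate

@weave.op()
def exact_match(answer: str, output: str) -> dict:
    return {"correct": answer.strip().lower() == output.strip().lower()}

weave.init("rag-course")
dataset = [{"question": "What is W&B?", "answer": "An ML experiment tracking platform."}]
evaluation = weave.Evaluation(dataset=dataset, scorers=[exact_match])

for name in ["command-r", "command-r-plus"]:
    asyncio.run(evaluation.evaluate(SimpleRAGPipeline(model_name=name)))
```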
/rag-advanced/resources/Chapter 3/RAG-3.1-chapter intro.srt:
--------------------------------------------------------------------------------
1 | 1
2 | 00:00:00,000 --> 00:00:02,000
3 | Hello and welcome to chapter 3 of our course.
4 |
5 | 2
6 | 00:00:02,000 --> 00:00:06,000
7 | In this chapter, we'll dive into a crucial aspect of RAG systems,
8 |
9 | 3
10 | 00:00:06,000 --> 00:00:08,000
11 | data ingestion and pre-processing.
12 |
13 | 4
14 | 00:00:08,000 --> 00:00:11,000
15 | Think of this chapter as your guidebook to transform your raw data
16 |
17 | 5
18 | 00:00:11,000 --> 00:00:14,000
19 | into the fuel that powers your RAG engine.
20 |
21 | 6
22 | 00:00:14,000 --> 00:00:17,000
23 | We'll explore how to process and convert all sorts of information
24 |
25 | 7
26 | 00:00:17,000 --> 00:00:20,000
27 | from documents to code snippets into a format
28 |
29 | 8
30 | 00:00:20,000 --> 00:00:23,000
31 | that your RAG system can work with efficiently.
32 |
33 | 9
34 | 00:00:23,000 --> 00:00:26,000
35 | Let's take a look at what we're going to cover in this chapter.
36 |
37 | 10
38 | 00:00:26,000 --> 00:00:30,000
39 | First, we're going to really dig into what data ingestion means for RAG systems.
40 |
41 | 11
42 | 00:00:30,000 --> 00:00:33,000
43 | We'll then explore some techniques for formatting your data.
44 |
45 | 12
46 | 00:00:33,000 --> 00:00:36,000
47 | Next, we're going to dive into the art of chunking.
48 |
49 | 13
50 | 00:00:36,000 --> 00:00:40,000
51 | We'll also look at how to use metadata to give your system even more context.
52 |
53 | 14
54 | 00:00:40,000 --> 00:00:45,000
55 | Finally, you'll learn how to spot and fix common ingestion issues.
56 |
57 | 15
58 | 00:00:45,000 --> 00:00:47,000
59 | By the end of this chapter, you'll have a solid grasp
60 |
61 | 16
62 | 00:00:47,000 --> 00:00:50,000
63 | of how to prepare your data for optimal RAG performance.
64 |
65 | 17
66 | 00:00:50,000 --> 00:00:54,000
67 | This is the foundation that can really take your RAG system to the next level.
68 |
69 | 18
70 | 00:00:54,000 --> 00:00:55,000
71 | So let's dive in.
72 |
73 |
--------------------------------------------------------------------------------
/rag-advanced/resources/Chapter 3/RAG-3.10-BM25.srt:
--------------------------------------------------------------------------------
1 | 1
2 | 00:00:00,000 --> 00:00:04,000
3 | In chapter 1, we saw how to use tf-idf vectors to build a basic retriever.
4 |
5 | 2
6 | 00:00:04,000 --> 00:00:10,000
7 | Here, we'll explore another type of retrieval method called best matching 25 or bm25 in short.
8 |
9 | 3
10 | 00:00:10,000 --> 00:00:16,000
11 | bm25 is an evolution of tf-idf that better handles document length and term frequency saturation
12 |
13 | 4
14 | 00:00:16,000 --> 00:00:22,000
15 | using a probabilistic approach. By implementing both methods and ingesting data into the two
16 |
17 | 5
18 | 00:00:22,000 --> 00:00:27,000
19 | retrievers, we can more effectively compare their impact on our RAG pipeline. Having ingested the
20 |
21 | 6
22 | 00:00:27,000 --> 00:00:32,000
23 | data into the retrievers, we can finally use Weave to evaluate them holistically. This allows us to
24 |
25 | 7
26 | 00:00:32,000 --> 00:00:38,000
27 | compare their retrieval performance as well as the overall impact on the RAG pipeline. After running
28 |
29 | 8
30 | 00:00:38,000 --> 00:00:42,000
31 | the evaluations, you should be able to compare the performance of the two methods in your Weave
32 |
33 | 9
34 | 00:00:42,000 --> 00:00:47,000
35 | dashboard. Here, you'll notice that the bm25 retriever generally performs better than the
36 |
37 | 10
38 | 00:00:47,000 --> 00:00:50,000
39 | tf-idf retriever on most metrics.
40 |
41 |
--------------------------------------------------------------------------------
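A minimal sketch of BM25 retrieval with the `bm25s` package pinned in this course's requirements; the corpus is illustrative:

```python
# Sketch: index a tiny corpus with bm25s and retrieve the top-k documents.
import bm25s

corpus = [
    "Weights & Biases tracks machine learning experiments.",
    "BM25 is a probabilistic ranking function for search.",
    "Semantic chunking groups similar sentences together.",
]

retriever = bm25s.BM25()
retriever.index(bm25s.tokenize(corpus))

# retrieve() returns (document indices, scores) for the top-k matches
results, scores = retriever.retrieve(bm25s.tokenize("how does bm25 rank documents?"), k=2)
for idx, score in zip(results[0], scores[0]):
    print(corpus[idx], score)
```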
/rag-advanced/resources/Chapter 3/RAG-3.11-conclusion.srt:
--------------------------------------------------------------------------------
1 | 1
2 | 00:00:00,000 --> 00:00:02,000
3 | To summarize, we've covered tokenization,
4 |
5 | 2
6 | 00:00:02,000 --> 00:00:07,000
7 | semantic chunking, and explored BM25 as an alternative to TF-IDF retrieval.
8 |
9 | 3
10 | 00:00:07,000 --> 00:00:12,000
11 | We also evaluated the impact of these techniques on the RAG system as a whole using Weave.
12 |
13 | 4
14 | 00:00:12,000 --> 00:00:17,000
15 | These improvements can significantly enhance the performance of your RAG system.
16 |
17 | 5
18 | 00:00:17,000 --> 00:00:23,000
19 | In the next chapter, we'll dive into improving our system's ability to understand and parse user queries better.
20 |
21 |
--------------------------------------------------------------------------------
/rag-advanced/resources/Chapter 3/RAG-3.7-from experience.srt:
--------------------------------------------------------------------------------
1 | 1
2 | 00:00:00,000 --> 00:00:04,000
3 | As we wrap up this chapter, let's reflect on some key insights we've gained from our journey with
4 |
5 | 2
6 | 00:00:04,000 --> 00:00:09,000
7 | wandbot. First and foremost, we've learned the importance of tailored pre-processing.
8 |
9 | 3
10 | 00:00:10,000 --> 00:00:15,000
11 | One size doesn't fit all when it comes to RAG systems. Your ingestion pipeline should be as
12 |
13 | 4
14 | 00:00:15,000 --> 00:00:20,000
15 | unique as your data and your use case. We've also discovered the delicate balance between efficiency
16 |
17 | 5
18 | 00:00:20,000 --> 00:00:26,000
19 | and accuracy. It's tempting to aim for perfect accuracy, but sometimes that comes at the cost
20 |
21 | 6
22 | 00:00:26,000 --> 00:00:33,000
23 | of speed. Finding the right balance is crucial for a responsive and reliable system. Another key
24 |
25 | 7
26 | 00:00:33,000 --> 00:00:39,000
27 | lesson is the value of continuous evaluation. RAG systems aren't a set-it-and-forget-it solution.
28 |
29 | 8
30 | 00:00:39,000 --> 00:00:45,000
31 | They require ongoing monitoring and refinement to maintain peak performance. And lastly, we've seen
32 |
33 | 9
34 | 00:00:45,000 --> 00:00:50,000
35 | the power of collaborative problem solving. Some of our biggest breakthroughs came from team
36 |
37 | 10
38 | 00:00:50,000 --> 00:00:56,000
39 | brainstorming sessions and cross-functional collaboration. These insights have helped us
40 |
41 | 11
42 | 00:00:56,000 --> 00:01:01,000
43 | navigate challenges, optimize performance, and continuously improve our user experience.
44 |
45 | 12
46 | 00:01:01,000 --> 00:01:06,000
47 | Remember these lessons when building your own RAG system. They should help you build more effective,
48 |
49 | 13
50 | 00:01:06,000 --> 00:01:09,000
51 | efficient, and adaptable systems.
52 |
53 |
--------------------------------------------------------------------------------
/rag-advanced/resources/Chapter 3/RAG-3.9-chunking in code.srt:
--------------------------------------------------------------------------------
1 | 1
2 | 00:00:00,000 --> 00:00:02,000
3 | Now, let's look at chunking.
4 |
5 | 2
6 | 00:00:02,000 --> 00:00:05,000
7 | Here, we are using a semantic chunking approach,
8 |
9 | 3
10 | 00:00:05,000 --> 00:00:07,000
11 | which groups similar sentences together.
12 |
13 | 4
14 | 00:00:07,000 --> 00:00:10,000
15 | This method preserves context better than simple
16 |
17 | 5
18 | 00:00:10,000 --> 00:00:12,000
19 | fixed-length chunk splitting.
20 |
21 | 6
22 | 00:00:12,000 --> 00:00:15,000
23 | While I'll not dive into the details of the chunker here,
24 |
25 | 7
26 | 00:00:15,000 --> 00:00:18,000
27 | you should definitely explore the chunking code in the repository
28 |
29 | 8
30 | 00:00:18,000 --> 00:00:21,000
31 | to develop a deeper understanding of the chunking mechanism.
32 |
33 | 9
34 | 00:00:21,000 --> 00:00:24,000
35 | Notice how our chunks have varying sizes
36 |
37 | 10
38 | 00:00:24,000 --> 00:00:26,000
39 | but maintain semantic coherence.
40 |
41 | 11
42 | 00:00:26,000 --> 00:00:29,000
43 | This adaptive approach can significantly improve
44 |
45 | 12
46 | 00:00:29,000 --> 00:00:31,000
47 | the retrieval relevance of our system.
48 |
49 |
--------------------------------------------------------------------------------
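A rough sketch of the idea (not the repository's chunker): split text into sentences, then start a new chunk whenever the next sentence's TF-IDF cosine similarity to the current chunk drops below a threshold:

```python
# Sketch: greedy semantic chunking -- extend the current chunk while the next
# sentence stays similar to it; otherwise start a new chunk. Threshold is arbitrary.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def semantic_chunks(sentences: list[str], threshold: float = 0.15) -> list[str]:
    vectorizer = TfidfVectorizer().fit(sentences)
    chunks, current = [], [sentences[0]]
    for sentence in sentences[1:]:
        sim = cosine_similarity(
            vectorizer.transform([" ".join(current)]),
            vectorizer.transform([sentence]),
        )[0, 0]
        if sim >= threshold:
            current.append(sentence)      # still semantically coherent, keep growing
        else:
            chunks.append(" ".join(current))
            current = [sentence]          # similarity dropped, start a new chunk
    chunks.append(" ".join(current))
    return chunks
```

This produces variable-size chunks that follow topic boundaries, which is the property the transcript highlights over fixed-length splitting.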
/rag-advanced/resources/Chapter 4/RAG-4.1-chapter intro.srt:
--------------------------------------------------------------------------------
1 | 1
2 | 00:00:00,000 --> 00:00:05,000
3 | Hello and welcome to chapter 4 of our course, where we will be exploring query enhancement.
4 |
5 | 2
6 | 00:00:05,000 --> 00:00:09,000
7 | In this chapter, we'll discuss some powerful tools and techniques to enrich user queries for our RAG
8 |
9 | 3
10 | 00:00:09,000 --> 00:00:15,000
11 | systems. Query enhancement gives a RAG system a deep understanding of user intentions. It allows
12 |
13 | 4
14 | 00:00:15,000 --> 00:00:20,000
15 | your system to grasp the real meaning and intentions behind each query. We'll give you
16 |
17 | 5
18 | 00:00:20,000 --> 00:00:25,000
19 | the techniques to improve how a RAG system interprets and responds to user queries.
20 |
21 | 6
22 | 00:00:25,000 --> 00:00:30,000
23 | So let's get started. By the end of this chapter, you will have a comprehensive understanding of
24 |
25 | 7
26 | 00:00:30,000 --> 00:00:35,000
27 | query enhancement and its significance in RAG systems. We'll dive into four key techniques
28 |
29 | 8
30 | 00:00:35,000 --> 00:00:39,000
31 | that can significantly improve query processing capabilities. These aren't just theoretical
32 |
33 | 9
34 | 00:00:39,000 --> 00:00:44,000
35 | concepts. We'll examine their practical applications through our wandbot example.
36 |
37 | 10
38 | 00:00:44,000 --> 00:00:49,000
39 | We'll also share some best practices to effectively implement query enhancement in your
40 |
41 | 11
42 | 00:00:49,000 --> 00:00:54,000
43 | own RAG systems. This knowledge will be instrumental in enhancing a RAG system's capabilities.
44 |
45 | 12
46 | 00:00:55,000 --> 00:01:00,000
47 | So let's begin our exploration of query enhancement and its potential to elevate RAG performance.
48 |
49 |
--------------------------------------------------------------------------------
/rag-advanced/resources/Chapter 4/RAG-4.3-enhancing context-.srt:
--------------------------------------------------------------------------------
1 | 1
2 | 00:00:00,000 --> 00:00:03,000
3 | Now let's look at an often overlooked aspect of query processing,
4 |
5 | 2
6 | 00:00:03,000 --> 00:00:08,000
7 | metadata extraction and how it enhances context in queries.
8 |
9 | 3
10 | 00:00:08,000 --> 00:00:12,000
11 | Metadata gives your RAG system additional contextual clues.
12 |
13 | 4
14 | 00:00:12,000 --> 00:00:16,000
15 | It's crucial for understanding the full picture of a user's query.
16 |
17 | 5
18 | 00:00:16,000 --> 00:00:20,000
19 | For example, in wandbot, we identify the language of the user's query.
20 |
21 | 6
22 | 00:00:20,000 --> 00:00:24,000
23 | This might seem simple, but it's actually quite crucial for our multilingual support.
24 |
25 | 7
26 | 00:00:25,000 --> 00:00:28,000
27 | Currently, wandbot can handle queries in English and Japanese.
28 |
29 | 8
30 | 00:00:28,000 --> 00:00:34,000
31 | By detecting the language, we can route queries to different language-specific resources.
32 |
33 | 9
34 | 00:00:34,000 --> 00:00:36,000
35 | For instance, if a query is detected as Japanese,
36 |
37 | 10
38 | 00:00:36,000 --> 00:00:42,000
39 | wandbot knows to retrieve relevant Japanese documentation and generate responses in Japanese.
40 |
41 | 11
42 | 00:00:43,000 --> 00:00:47,000
43 | This contextual adaptation leads to more consistent and appropriate responses.
44 |
45 | 12
46 | 00:00:48,000 --> 00:00:52,000
47 | It also streamlines the retrieval process, making our system more efficient.
48 |
49 | 13
50 | 00:00:52,000 --> 00:00:56,000
51 | By implementing metadata extraction, you are essentially fine-tuning your RAG system's
52 |
53 | 14
54 | 00:00:56,000 --> 00:01:01,000
55 | ability to understand and respond to queries in a more nuanced and contextually aware manner.
56 |
57 |
--------------------------------------------------------------------------------
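A sketch of the language-routing step using `fasttext-langdetect` from this course's requirements; the routing targets are stand-ins:

```python
# Sketch: detect the query language and route to language-specific resources.
from ftlangdetect import detect

def route_query(query: str) -> str:
    result = detect(query)  # e.g. {"lang": "ja", "score": 0.98}
    if result["lang"] == "ja":
        return "japanese_docs_index"   # stand-in for a real retriever choice
    return "english_docs_index"

print(route_query("How do I log a wandb artifact?"))
print(route_query("wandbのアーティファクトを記録するには?"))
```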
/rag-advanced/resources/Chapter 4/RAG-4.4-LLM in query enhancement-.srt:
--------------------------------------------------------------------------------
1 | 1
2 | 00:00:00,000 --> 00:00:05,000
3 | This is a good time to talk about the role that large language models play in query enhancement.
4 |
5 | 2
6 | 00:00:05,000 --> 00:00:09,000
7 | While traditional NLP techniques have been the go-to for query enhancement,
8 |
9 | 3
10 | 00:00:09,000 --> 00:00:14,000
11 | in wandbot we use LLMs with function calling capabilities for our query enhancement.
12 |
13 | 4
14 | 00:00:15,000 --> 00:00:21,000
15 | Here's how it works. We prompt the LLM to enhance the user queries and generate structured outputs.
16 |
17 | 5
18 | 00:00:21,000 --> 00:00:26,000
19 | The LLM then gives a structured JSON that includes intent classification,
20 |
21 | 6
22 | 00:00:26,000 --> 00:00:29,000
23 | keyword extraction, and subquery generation.
24 |
25 | 7
26 | 00:00:30,000 --> 00:00:33,000
27 | We then use pydantic models to validate this output,
28 |
29 | 8
30 | 00:00:33,000 --> 00:00:35,000
31 | ensuring it meets our schema requirements.
32 |
33 | 9
34 | 00:00:36,000 --> 00:00:40,000
35 | This approach has significantly improved the accuracy of our intent classification
36 |
37 | 10
38 | 00:00:40,000 --> 00:00:45,000
39 | and keyword extraction steps. It allows for more nuanced query reformulation,
40 |
41 | 11
42 | 00:00:45,000 --> 00:00:49,000
43 | which is particularly effective when handling complex queries.
44 |
45 | 12
46 | 00:00:50,000 --> 00:00:53,000
47 | With this, we are able to provide more contextually relevant responses,
48 |
49 | 13
50 | 00:00:53,000 --> 00:00:59,000
51 | even for tricky technical queries. Using LLMs with function calling capabilities in
52 |
53 | 14
54 | 00:00:59,000 --> 00:01:04,000
55 | your query enhancement process can get you started without training NLP models from scratch,
56 |
57 | 15
58 | 00:01:04,000 --> 00:01:10,000
59 | giving your RAG system the extra boost of intelligence and flexibility it needs.
60 |
61 |
--------------------------------------------------------------------------------
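A sketch of the validation step described above: a pydantic schema for the enhanced query, applied to the LLM's JSON output. The field names are illustrative, not wandbot's actual schema:

```python
# Sketch: validate an LLM's structured query-enhancement output against a schema.
from pydantic import BaseModel, ValidationError

class EnhancedQuery(BaseModel):
    intents: list[str]     # e.g. ["how-to", "troubleshooting"]
    keywords: list[str]
    subqueries: list[str]
    language: str          # e.g. "en" or "ja"

llm_json = '{"intents": ["how-to"], "keywords": ["artifact"], "subqueries": ["how to log an artifact"], "language": "en"}'

try:
    enhanced = EnhancedQuery.model_validate_json(llm_json)
    print(enhanced.subqueries)
except ValidationError as err:
    # In practice you might re-prompt the LLM with the validation errors here.
    print(err)
```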
/rag-advanced/resources/Chapter 5/RAG-5.11-rank fusion.srt:
--------------------------------------------------------------------------------
1 | 1
2 | 00:00:00,000 --> 00:00:01,000
3 | When you retrieve from multiple sources,
4 |
5 | 2
6 | 00:00:01,000 --> 00:00:05,000
7 | re-ranking so many chunks adds to the latency of the application.
8 |
9 | 3
10 | 00:00:05,000 --> 00:00:08,000
11 | A simpler yet powerful way is to use rank fusion.
12 |
13 | 4
14 | 00:00:08,000 --> 00:00:11,000
15 | The idea is to aggregate the rank of unique documents
16 |
17 | 5
18 | 00:00:11,000 --> 00:00:15,000
19 | appearing in retrieved contexts from different retrieval sources.
20 |
21 | 6
22 | 00:00:15,000 --> 00:00:18,000
23 | We then take this unified or fused rank and order it.
24 |
25 | 7
26 | 00:00:18,000 --> 00:00:22,000
27 | Taking the top-n from here should improve the recall of the retrieval system as a whole.
28 |
29 |
--------------------------------------------------------------------------------
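A minimal sketch of reciprocal rank fusion over ranked document-id lists from several retrievers (k=60 is the commonly used constant):

```python
# Sketch: reciprocal rank fusion -- each document scores sum(1 / (k + rank))
# over every ranking it appears in; a higher fused score means a higher final rank.
from collections import defaultdict

def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    fused = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += 1.0 / (k + rank)
    return sorted(fused, key=fused.get, reverse=True)

bm25_hits = ["doc3", "doc1", "doc7"]
dense_hits = ["doc1", "doc5", "doc3"]
print(reciprocal_rank_fusion([bm25_hits, dense_hits]))  # doc1 and doc3 rise to the top
```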
/rag-advanced/resources/Chapter 5/RAG-5.12-hybrid retriever-.srt:
--------------------------------------------------------------------------------
1 | 1
2 | 00:00:00,000 --> 00:00:02,000
3 | Now let's look at the hybrid retriever in action.
4 |
5 | 2
6 | 00:00:02,000 --> 00:00:04,000
7 | The hybrid retriever re-ranker class
8 |
9 | 3
10 | 00:00:04,000 --> 00:00:07,000
11 | combines both the BM25 retriever and dense retriever.
12 |
13 | 4
14 | 00:00:07,000 --> 00:00:10,000
15 | The idea is to first retrieve top-k documents
16 |
17 | 5
18 | 00:00:10,000 --> 00:00:11,000
19 | using both the retrievers.
20 |
21 | 6
22 | 00:00:11,000 --> 00:00:15,000
23 | We then use the reciprocal rank fusion to fuse the results
24 |
25 | 7
26 | 00:00:15,000 --> 00:00:17,000
27 | and then use Cohere re-ranker to rank
28 |
29 | 8
30 | 00:00:17,000 --> 00:00:21,000
31 | and then select top-n documents for our query.
32 |
33 | 9
34 | 00:00:21,000 --> 00:00:23,000
35 | Let's index the chunk data
36 |
37 | 10
38 | 00:00:23,000 --> 00:00:25,000
39 | and set up evaluation using Weave evaluation
40 |
41 | 11
42 | 00:00:25,000 --> 00:00:27,000
43 | using the metrics we have been using so far.
44 |
45 | 12
46 | 00:00:28,000 --> 00:00:31,000
47 | Let's compare the hybrid retriever re-ranker
48 |
49 | 13
50 | 00:00:31,000 --> 00:00:34,000
51 | with dense retriever as well as dense retriever with re-ranker.
52 |
53 | 14
54 | 00:00:34,000 --> 00:00:38,000
55 | This is our evaluation comparison dashboard.
56 |
57 | 15
58 | 00:00:38,000 --> 00:00:42,000
59 | The negative values here show that our hybrid retriever re-ranker
60 |
61 | 16
62 | 00:00:42,000 --> 00:00:45,000
63 | is performing better compared to the other two retrievers.
64 |
65 | 17
66 | 00:00:45,000 --> 00:00:50,000
67 | The hit rate is 60% compared to 31% and 48% respectively
68 |
69 | 18
70 | 00:00:50,000 --> 00:00:52,000
71 | for the other two retrievers.
72 |
73 | 19
74 | 00:00:52,000 --> 00:00:55,000
75 | We also improved the F1 score over the dense retriever
76 |
77 | 20
78 | 00:00:55,000 --> 00:00:57,000
79 | as well as dense retriever with re-ranker.
80 |
81 | 21
82 | 00:00:57,000 --> 00:01:01,000
83 | All this at the cost of higher latency,
84 |
85 | 22
86 | 00:01:01,000 --> 00:01:02,000
87 | which is obvious.
88 |
89 |
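
Here is a rough sketch of the fuse-and-re-rank stage, assuming two already-retrieved, best-first lists of document strings and the Cohere Python SDK's rerank endpoint; the model name and API key are placeholders.

```python
import cohere  # official Cohere SDK; requires a valid API key

co = cohere.Client("YOUR_API_KEY")  # placeholder key


def fuse_and_rerank(query, bm25_hits, dense_hits, top_n=5, k=60):
    # Reciprocal rank fusion over the two ranked lists.
    scores = {}
    for hits in (bm25_hits, dense_hits):
        for rank, doc in enumerate(hits, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    fused = sorted(scores, key=scores.get, reverse=True)

    # The Cohere re-ranker scores each candidate against the query
    # and returns the indices of the top-n documents.
    response = co.rerank(
        model="rerank-english-v3.0",
        query=query,
        documents=fused,
        top_n=top_n,
    )
    return [fused[r.index] for r in response.results]
```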
--------------------------------------------------------------------------------
/rag-advanced/resources/Chapter 5/RAG-5.13-conclusion.srt:
--------------------------------------------------------------------------------
1 | 1
2 | 00:00:00,000 --> 00:00:02,000
3 | Well, this is the end of this chapter.
4 |
5 | 2
6 | 00:00:02,000 --> 00:00:04,000
7 | To summarize, there aren't any silver bullets,
8 |
9 | 3
10 | 00:00:04,000 --> 00:00:07,000
11 | so experiment and evaluate with different retrieval techniques.
12 |
13 | 4
14 | 00:00:07,000 --> 00:00:11,000
15 | For complex queries, try breaking down the query into smaller subqueries
16 |
17 | 5
18 | 00:00:11,000 --> 00:00:14,000
19 | or rewrite the query to improve the recall of your retrieval system.
20 |
21 | 6
22 | 00:00:15,000 --> 00:00:18,000
23 | Metadata filtering helps speed up retrieval time
24 |
25 | 7
26 | 00:00:18,000 --> 00:00:20,000
27 | and enforce correct information to be picked.
28 |
29 | 8
30 | 00:00:21,000 --> 00:00:23,000
31 | Don't be afraid to create multiple vector databases
32 |
33 | 9
34 | 00:00:23,000 --> 00:00:26,000
35 | or retrieve using different algorithms,
36 |
37 | 10
38 | 00:00:26,000 --> 00:00:29,000
39 | because we can re-rank different retrieved contexts
40 |
41 | 11
42 | 00:00:29,000 --> 00:00:30,000
43 | to improve the recall.
44 |
45 | 12
46 | 00:00:30,000 --> 00:00:31,000
47 | However, this is a slow technique,
48 |
49 | 13
50 | 00:00:31,000 --> 00:00:35,000
51 | so reciprocal rank fusion is a better way or a quicker way
52 |
53 | 14
54 | 00:00:35,000 --> 00:00:37,000
55 | to select the best chunks across multiple retrieval techniques,
56 |
57 | 15
58 | 00:00:37,000 --> 00:00:40,000
59 | which can also be followed with Cohere re-ranking.
60 |
61 | 16
62 | 00:00:40,000 --> 00:00:43,000
63 | I hope you learned something new and exciting in this chapter.
64 |
65 | 17
66 | 00:00:43,000 --> 00:00:44,000
67 | Thank you.
68 |
69 |
--------------------------------------------------------------------------------
/rag-advanced/resources/Chapter 5/RAG-5.2-limitations.srt:
--------------------------------------------------------------------------------
1 | 1
2 | 00:00:00,000 --> 00:00:04,000
3 | Let me present a few arguments outlining the limitations of a simple retrieval system.
4 |
5 | 2
6 | 00:00:04,000 --> 00:00:10,000
7 | We start off with the embedding. TF-IDF and BM25-based embeddings have limited capacity to
8 |
9 | 3
10 | 00:00:10,000 --> 00:00:15,000
11 | capture different contexts of the same word. Thus, we need deep neural network-based embeddings
12 |
13 | 4
14 | 00:00:15,000 --> 00:00:20,000
15 | that are trained on billions of tokens. The second argument is on misalignment.
16 |
17 | 5
18 | 00:00:20,000 --> 00:00:26,000
19 | Usually, user queries lack the precise language or structure that aligns seamlessly with the
20 |
21 | 6
22 | 00:00:26,000 --> 00:00:31,000
23 | wording of relevant documents. Furthermore, a simple query, when embedded, has a better
24 |
25 | 7
26 | 00:00:31,000 --> 00:00:37,000
27 | chance of finding a suitable piece of context during retrieval. "What is W&B?" can be answered
28 |
29 | 8
30 | 00:00:37,000 --> 00:00:42,000
31 | using the first paragraph of our documentation, whereas a complex query which can be mapped to
32 |
33 | 9
34 | 00:00:42,000 --> 00:00:46,000
35 | different sections of our documentation would not be able to find all the pieces of relevant
36 |
37 | 10
38 | 00:00:46,000 --> 00:00:53,000
39 | information. We will go into argument 4 later; just know that the position of the relevant chunk
40 |
41 | 11
42 | 00:00:53,000 --> 00:00:55,000
43 | is important.
44 |
45 |
--------------------------------------------------------------------------------
/rag-advanced/resources/Chapter 5/RAG-5.3-comp evaluation-.srt:
--------------------------------------------------------------------------------
1 | 1
2 | 00:00:00,000 --> 00:00:05,000
3 | We have implemented a Cohere embed-english-v3 based dense retriever in the Colab notebook
4 |
5 | 2
6 | 00:00:05,000 --> 00:00:10,000
7 | for this chapter. Do check it out. Here I am showing the comparison of TF-IDF based
8 |
9 | 3
10 | 00:00:10,000 --> 00:00:15,000
11 | retriever and dense retriever system across multiple evaluation metrics. This is our evaluation
12 |
13 | 4
14 | 00:00:15,000 --> 00:00:22,000
15 | comparison dashboard. Dense retriever got a better F1 score compared to the TF-IDF retriever.
16 |
17 | 5
18 | 00:00:22,000 --> 00:00:27,000
19 | It has a better NDCG metric which is a more reliable metric compared to MRR and hit rate.
20 |
21 | 6
22 | 00:00:27,000 --> 00:00:33,000
23 | The most surprising result from this comparison is the fact that TF-IDF retriever is actually
24 |
25 | 7
26 | 00:00:33,000 --> 00:00:39,000
27 | slower compared to the dense retriever, which is, at least to me, counter-intuitive.
28 |
29 |
--------------------------------------------------------------------------------
/rag-advanced/resources/Chapter 5/RAG-5.5-retrieve with COT.srt:
--------------------------------------------------------------------------------
1 | 1
2 | 00:00:00,000 --> 00:00:02,000
3 | For complex or multi-step QA tasks,
4 |
5 | 2
6 | 00:00:02,000 --> 00:00:04,000
7 | retrieval with chain of thought reasoning
8 |
9 | 3
10 | 00:00:04,000 --> 00:00:06,000
11 | is getting a lot of traction.
12 |
13 | 4
14 | 00:00:06,000 --> 00:00:09,000
15 | The idea is the same as retrieval with decomposition,
16 |
17 | 5
18 | 00:00:09,000 --> 00:00:12,000
19 | but instead of breaking down the query into sub-questions
20 |
21 | 6
22 | 00:00:12,000 --> 00:00:14,000
23 | in the first LLM call,
24 |
25 | 7
26 | 00:00:14,000 --> 00:00:18,000
27 | we first retrieve relevant documents for a query Q.
28 |
29 | 8
30 | 00:00:18,000 --> 00:00:21,000
31 | We then prompt the LLM to read the query and documents
32 |
33 | 9
34 | 00:00:21,000 --> 00:00:25,000
35 | to come up with reasoning sentence T1 and retrieve for it.
36 |
37 | 10
38 | 00:00:25,000 --> 00:00:26,000
39 | We keep retrieving and reasoning
40 |
41 | 11
42 | 00:00:26,000 --> 00:00:28,000
43 | till the termination step is met.
44 |
45 | 12
46 | 00:00:28,000 --> 00:00:32,000
47 | In this case, the termination condition is the presence of the answer
48 |
49 | 13
50 | 00:00:32,000 --> 00:00:34,000
51 | as a substring.
52 |
53 | 14
54 | 00:00:34,000 --> 00:00:37,000
55 | You can come up with different complexities of retrieval
56 |
57 | 15
58 | 00:00:37,000 --> 00:00:39,000
59 | with chain of thought reasoning.
60 |
61 |
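
A minimal sketch of this interleaved retrieve-and-reason loop, with `retrieve_fn` and `reason_fn` as assumed callables wrapping your retriever and LLM:

```python
def retrieve_with_cot(query, retrieve_fn, reason_fn, answer, max_steps=5):
    """Interleave retrieval and chain-of-thought reasoning (sketch).

    retrieve_fn(text) -> list of documents (assumed)
    reason_fn(query, docs, thoughts) -> next reasoning sentence (assumed LLM call)
    """
    docs = retrieve_fn(query)
    thoughts = []
    for _ in range(max_steps):
        thought = reason_fn(query, docs, thoughts)
        thoughts.append(thought)
        # Termination condition from the lecture: the answer appears
        # as a substring of the latest reasoning sentence.
        if answer.lower() in thought.lower():
            break
        docs += retrieve_fn(thought)  # retrieve for the new reasoning sentence
    return docs, thoughts
```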
--------------------------------------------------------------------------------
/rag-advanced/resources/Chapter 5/RAG-5.6-metadata filtering.srt:
--------------------------------------------------------------------------------
1 | 1
2 | 00:00:00,000 --> 00:00:02,000
3 | In chapter 3, Bharat talked about metadata management,
4 |
5 | 2
6 | 00:00:02,000 --> 00:00:07,000
7 | where the vector store is not only a collection of documents and their embeddings,
8 |
9 | 3
10 | 00:00:07,000 --> 00:00:10,000
11 | but also stores relevant metadata associated with each document.
12 |
13 | 4
14 | 00:00:11,000 --> 00:00:17,000
15 | During retrieval, the metadata filtering strategy uses an LLM to first extract the metadata
16 |
17 | 5
18 | 00:00:17,000 --> 00:00:19,000
19 | from the user query based on some schema.
20 |
21 | 6
22 | 00:00:19,000 --> 00:00:23,000
23 | In this example, we are extracting the year metadata from the query.
24 |
25 | 7
26 | 00:00:23,000 --> 00:00:29,000
27 | We then filter the vector store to select only those documents associated with this metadata.
28 |
29 | 8
30 | 00:00:29,000 --> 00:00:34,000
31 | Doing cosine similarity over this subset ensures a better and richer context for the LLM.
32 |
33 |
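
For illustration, here is what the filter-then-search step can look like with Chroma; the LLM call that extracts `{"year": ...}` from the query is elided, and the document contents are toy examples.

```python
import chromadb

client = chromadb.Client()  # in-memory instance
collection = client.create_collection("docs")

# Index documents together with their metadata.
collection.add(
    ids=["a", "b"],
    documents=["2023 annual report ...", "2024 annual report ..."],
    metadatas=[{"year": 2023}, {"year": 2024}],
)

# Suppose an LLM extracted {"year": 2023} from the user query;
# similarity search then runs only over the filtered subset.
results = collection.query(
    query_texts=["revenue in 2023"],
    n_results=1,
    where={"year": 2023},
)
print(results["documents"])
```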
--------------------------------------------------------------------------------
/rag-advanced/resources/Chapter 5/RAG-5.7-logical routing.srt:
--------------------------------------------------------------------------------
1 | 1
2 | 00:00:00,000 --> 00:00:02,000
3 | In cases where we have multiple sources of information,
4 |
5 | 2
6 | 00:00:02,000 --> 00:00:05,000
7 | we not only retrieve information from a single vector store,
8 |
9 | 3
10 | 00:00:05,000 --> 00:00:08,000
11 | but can do so over multiple vector stores.
12 |
13 | 4
14 | 00:00:08,000 --> 00:00:10,000
15 | Depending on use case or business constraints,
16 |
17 | 5
18 | 00:00:10,000 --> 00:00:13,000
19 | we can even use conventional SQL databases
20 |
21 | 6
22 | 00:00:13,000 --> 00:00:15,000
23 | or consider doing web search.
24 |
25 | 7
26 | 00:00:15,000 --> 00:00:18,000
27 | We do all this by using a concept called routing.
28 |
29 | 8
30 | 00:00:18,000 --> 00:00:21,000
31 | Despite the jargon, the underlying idea is simple.
32 |
33 | 9
34 | 00:00:21,000 --> 00:00:25,000
35 | We take in the user query and use an LLM's function-calling ability
36 |
37 | 10
38 | 00:00:25,000 --> 00:00:26,000
39 | to select the retrieval sources.
40 |
41 | 11
42 | 00:00:27,000 --> 00:00:28,000
43 | Based on a recent study,
44 |
45 | 12
46 | 00:00:28,000 --> 00:00:32,000
47 | it is best if we use one LLM call to come up with a reasoning step
48 |
49 | 13
50 | 00:00:32,000 --> 00:00:36,000
51 | to select data sources and use another LLM call to do function calling.
52 |
53 |
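
A small sketch of that two-call routing pattern; `llm_reason` and `llm_route` are assumed callables wrapping the two LLM calls, and the set of sources is illustrative:

```python
from typing import Literal

from pydantic import BaseModel


class Route(BaseModel):
    # Illustrative retrieval sources for a routing decision.
    source: Literal["docs_vectorstore", "sql_database", "web_search"]


def route_query(query: str, llm_reason, llm_route) -> Route:
    # Call 1: produce an explicit reasoning step about the best source.
    reasoning = llm_reason(
        f"Which data source best answers this query? Think step by step: {query}"
    )
    # Call 2: a function/tool call that commits to one source,
    # conditioned on the reasoning from call 1.
    raw = llm_route(query=query, reasoning=reasoning, schema=Route)
    return Route.model_validate(raw)
```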
--------------------------------------------------------------------------------
/rag-advanced/resources/Chapter 5/RAG-5.8-context stuffing.srt:
--------------------------------------------------------------------------------
1 | 1
2 | 00:00:00,000 --> 00:00:03,000
3 | Now let's look at the fourth argument I made about the position of the relevant context.
4 |
5 | 2
6 | 00:00:04,000 --> 00:00:08,000
7 | After we do cosine similarity between the query embedding and all the document embeddings,
8 |
9 | 3
10 | 00:00:08,000 --> 00:00:13,000
11 | we then sort it in descending order. Note that this ordering will not necessarily surface all
12 |
13 | 4
14 | 00:00:13,000 --> 00:00:18,000
15 | the relevant information. Say there are six relevant pieces of information. If we take the
16 |
17 | 5
18 | 00:00:18,000 --> 00:00:25,000
19 | top three chunks, we only have one relevant chunk, thus the recall is 0.167. Taking the top five
20 |
21 | 6
22 | 00:00:25,000 --> 00:00:32,000
23 | chunks improves the recall to 0.4. If we take top 10 chunks, the recall will be 0.667. You get the
24 |
25 | 7
26 | 00:00:32,000 --> 00:00:37,000
27 | point. The more we take from this ordered list of documents, the better the recall. But obviously
28 |
29 | 8
30 | 00:00:37,000 --> 00:00:42,000
31 | there ain't any free lunch. More context increases the latency and the cost of the pipeline.
32 |
33 | 9
34 | 00:00:43,000 --> 00:00:48,000
35 | Moreover, we know from controlled studies that the position of the relevant context is crucial for
36 |
37 | 10
38 | 00:00:48,000 --> 00:00:53,000
39 | the LLM to properly attend to it. In the Lost in the Middle paper, the authors change the position
40 |
41 | 11
42 | 00:00:53,000 --> 00:00:58,000
43 | of the most relevant document from the first position till the 20th. Clearly the performance
44 |
45 | 12
46 | 00:00:58,000 --> 00:01:03,000
47 | is highest when the relevant information occurs at the beginning or at the end of the input context
48 |
49 | 13
50 | 00:01:03,000 --> 00:01:08,000
51 | and significantly degrades when models must access relevant information in the middle of long
52 |
53 | 14
54 | 00:01:08,000 --> 00:01:14,000
55 | contexts. In our top-k equals 10 retrieved context, two of the relevant pieces of information are in
56 |
57 | 15
58 | 00:01:14,000 --> 00:01:18,000
59 | the middle, which will obviously impact the quality of the generated response.
60 |
61 |
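
The recall numbers above are simple ratios: recall@k is the fraction of all relevant chunks that land in the top-k results. A tiny sketch with a made-up ranking that reproduces two of the figures from the example:

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of all relevant chunks appearing in the top-k results."""
    hits = len(set(ranked_ids[:k]) & set(relevant_ids))
    return hits / len(relevant_ids)


# Six relevant chunks in total, as in the example above.
relevant = {"r1", "r2", "r3", "r4", "r5", "r6"}
ranked = ["r1", "x1", "x2", "x3", "r2", "x4", "r3", "x5", "r4", "x6"]
print(round(recall_at_k(ranked, relevant, 3), 3))   # 0.167  (1 of 6)
print(round(recall_at_k(ranked, relevant, 10), 3))  # 0.667  (4 of 6)
```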
--------------------------------------------------------------------------------
/rag-advanced/resources/Chapter 5/RAG-5.9-cross encoder.srt:
--------------------------------------------------------------------------------
1 | 1
2 | 00:00:00,000 --> 00:00:03,000
3 | Another powerful re-ranker model is a cross-encoder transformer.
4 |
5 | 2
6 | 00:00:03,000 --> 00:00:06,000
7 | It's called a cross-encoder because we train a transformer like BERT
8 |
9 | 3
10 | 00:00:06,000 --> 00:00:10,000
11 | with pairs of documents and learn to map them to a relevancy score.
12 |
13 | 4
14 | 00:00:10,000 --> 00:00:14,000
15 | During ranking, we pass it pairs of the query and retrieved chunks.
16 |
17 | 5
18 | 00:00:15,000 --> 00:00:20,000
19 | The model then assigns a new score to each chunk, which we can sort in descending order.
20 |
21 | 6
22 | 00:00:21,000 --> 00:00:26,000
23 | Note that since we are processing a lot of text, this is a relatively slow process,
24 |
25 | 7
26 | 00:00:26,000 --> 00:00:28,000
27 | so be mindful of the top-k parameter of your retriever.
28 |
29 |
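
A minimal example with the sentence-transformers `CrossEncoder` class; the checkpoint name is a small public re-ranker, and the query and chunks are toy data:

```python
from sentence_transformers import CrossEncoder

# Small public cross-encoder checkpoint; swap in your own as needed.
model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

query = "How do I log metrics?"
chunks = [
    "Call wandb.log({'loss': 0.1}) inside your training loop.",
    "W&B Reports let you share findings with your team.",
]

# Score each (query, chunk) pair, then sort chunks by descending score.
scores = model.predict([(query, chunk) for chunk in chunks])
reranked = [c for _, c in sorted(zip(scores, chunks), key=lambda p: p[0], reverse=True)]
print(reranked[0])
```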
--------------------------------------------------------------------------------
/rag-advanced/resources/Chapter 6/RAG-6.1-chapter intro.srt:
--------------------------------------------------------------------------------
1 | 1
2 | 00:00:00,000 --> 00:00:05,000
3 | Hello and welcome to chapter 6 of our course where we'll be talking about response synthesis.
4 |
5 | 2
6 | 00:00:05,000 --> 00:00:10,000
7 | In this chapter, we'll explore how RAG systems transform the retrieved information
8 |
9 | 3
10 | 00:00:10,000 --> 00:00:15,000
11 | into coherent and relevant answers for our users. Response synthesis is a critical component that
12 |
13 | 4
14 | 00:00:15,000 --> 00:00:21,000
15 | bridges the gap between raw data retrieval and a user-friendly output. By the end of this chapter,
16 |
17 | 5
18 | 00:00:21,000 --> 00:00:26,000
19 | you'll understand the key techniques and best practices for crafting high-quality responses
20 |
21 | 6
22 | 00:00:26,000 --> 00:00:31,000
23 | in your RAG system. So let's get started. Here's what you will learn in this chapter.
24 |
25 | 7
26 | 00:00:31,000 --> 00:00:35,000
27 | We're going to unpack the important role of response synthesis in RAG systems.
28 |
29 | 8
30 | 00:00:35,000 --> 00:00:40,000
31 | We'll start by introducing a few key prompting techniques. These are the building blocks of an
32 |
33 | 9
34 | 00:00:40,000 --> 00:00:45,000
35 | effective RAG response generation system. You'll see how these techniques come to life through
36 |
37 | 10
38 | 00:00:45,000 --> 00:00:51,000
39 | our wandbot example. We'll also dive into some best practices and troubleshooting strategies.
40 |
41 | 11
42 | 00:00:51,000 --> 00:00:54,000
43 | These lessons will be invaluable when you're building your own RAG system.
44 |
45 | 12
46 | 00:00:55,000 --> 00:00:59,000
47 | By the end of this chapter, you'll have a comprehensive toolkit for crafting high-quality
48 |
49 | 13
50 | 00:00:59,000 --> 00:01:05,000
51 | and relevant responses from your RAG system. You'll understand how to guide LLM behavior,
52 |
53 | 14
54 | 00:01:05,000 --> 00:01:09,000
55 | ensure accuracy, and adapt to diverse queries. So let's dive in.
56 |
57 |
--------------------------------------------------------------------------------
/rag-advanced/resources/Chapter 6/RAG-6.2-response synthesis.srt:
--------------------------------------------------------------------------------
1 | 1
2 | 00:00:00,000 --> 00:00:04,000
3 | Let's kick things off by understanding response synthesis and prompting.
4 |
5 | 2
6 | 00:00:04,000 --> 00:00:08,000
7 | Think of response synthesis as the translator in your RAG system.
8 |
9 | 3
10 | 00:00:08,000 --> 00:00:12,000
11 | It takes the raw data and transforms it into a user-friendly answer.
12 |
13 | 4
14 | 00:00:12,000 --> 00:00:16,000
15 | It's the bridge between what your system retrieves and what the user sees.
16 |
17 | 5
18 | 00:00:16,000 --> 00:00:20,000
19 | Now, prompting is like giving your LLM a really good instruction manual.
20 |
21 | 6
22 | 00:00:20,000 --> 00:00:25,000
23 | It guides the LLM's behavior, ensuring that it generates relevant and friendly responses.
24 |
25 | 7
26 | 00:00:26,000 --> 00:00:30,000
27 | In wandbot, we've seen firsthand how crucial these elements are.
28 |
29 | 8
30 | 00:00:30,000 --> 00:00:33,000
31 | The beauty of effective prompting is its versatility.
32 |
33 | 9
34 | 00:00:33,000 --> 00:00:37,000
35 | It allows RAG systems like wandbot to handle a wide range of queries,
36 |
37 | 10
38 | 00:00:37,000 --> 00:00:41,000
39 | ranging from simple questions to complex technical issues.
40 |
41 |
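
As a concrete illustration of such an "instruction manual", here is a toy system prompt and message builder; the wording is ours, not wandbot's actual prompt:

```python
# Illustrative system prompt; wandbot's real prompt differs.
SYSTEM_PROMPT = """You are a support assistant for the W&B documentation.
Answer ONLY from the provided context. If the context does not contain
the answer, say you don't know and suggest where to look.
Cite the source of every claim. Keep answers concise and friendly."""


def build_messages(context: str, question: str) -> list:
    # Standard chat-style message format used by most LLM APIs.
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ]
```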
--------------------------------------------------------------------------------
/rag-advanced/resources/Chapter 6/RAG-6.4-system prompt.srt:
--------------------------------------------------------------------------------
1 | 1
2 | 00:00:00,000 --> 00:00:03,000
3 | Let's dive into some best practices for system prompt iteration.
4 |
5 | 2
6 | 00:00:03,000 --> 00:00:07,000
7 | These are the habits that separate a good RAG system from a great one.
8 |
9 | 3
10 | 00:00:07,000 --> 00:00:09,000
11 | First up, version control.
12 |
13 | 4
14 | 00:00:09,000 --> 00:00:11,000
15 | Treat your prompts like code.
16 |
17 | 5
18 | 00:00:11,000 --> 00:00:13,000
19 | Track changes and their impacts.
20 |
21 | 6
22 | 00:00:13,000 --> 00:00:16,000
23 | It's a lifesaver when you need to roll back or understand what worked.
24 |
25 | 7
26 | 00:00:17,000 --> 00:00:19,000
27 | Next, schedule regular performance reviews.
28 |
29 | 8
30 | 00:00:20,000 --> 00:00:24,000
31 | Catch issues early and keep improving consistently.
32 |
33 | 9
34 | 00:00:24,000 --> 00:00:26,000
35 | Next, involve your whole team.
36 |
37 | 10
38 | 00:00:26,000 --> 00:00:29,000
39 | Developers, domain experts, designers.
40 |
41 | 11
42 | 00:00:30,000 --> 00:00:32,000
43 | They all bring unique perspectives.
44 |
45 | 12
46 | 00:00:32,000 --> 00:00:36,000
47 | And in our experience, this cross-functional approach is what leads to more robust solutions.
48 |
49 | 13
50 | 00:00:37,000 --> 00:00:39,000
51 | And lastly, document everything.
52 |
53 | 14
54 | 00:00:39,000 --> 00:00:42,000
55 | Write down why you've made each refinement.
56 |
57 | 15
58 | 00:00:42,000 --> 00:00:46,000
59 | It builds your team's knowledge base and saves time in the long run.
60 |
61 | 16
62 | 00:00:46,000 --> 00:00:48,000
63 | Again, consistency is key.
64 |
65 | 17
66 | 00:00:48,000 --> 00:00:54,000
67 | Implement these practices and you'll see your RAG system evolve and improve over time.
68 |
69 |
--------------------------------------------------------------------------------
/rag-advanced/resources/Chapter 6/RAG-6.6-advanced prompting.srt:
--------------------------------------------------------------------------------
1 | 1
2 | 00:00:00,000 --> 00:00:03,000
3 | We'll now look at a few advanced prompting techniques.
4 |
5 | 2
6 | 00:00:03,000 --> 00:00:08,000
7 | These strategies are where RAG systems start to really flex their muscles.
8 |
9 | 3
10 | 00:00:08,000 --> 00:00:10,000
11 | First is Chain of Thought Prompting.
12 |
13 | 4
14 | 00:00:10,000 --> 00:00:15,000
15 | It helps the LLM break down complex problems into step-by-step solutions.
16 |
17 | 5
18 | 00:00:15,000 --> 00:00:16,000
19 | Next we have Self-reflection Prompting.
20 |
21 | 6
22 | 00:00:16,000 --> 00:00:22,000
23 | Here's where an LLM double checks its own work and refines its responses for improved
24 |
25 | 7
26 | 00:00:22,000 --> 00:00:23,000
27 | accuracy.
28 |
29 | 8
30 | 00:00:23,000 --> 00:00:26,000
31 | Next we have Tree of Thought Prompting.
32 |
33 | 9
34 | 00:00:26,000 --> 00:00:31,000
35 | This technique allows your LLM to explore multiple reasoning paths simultaneously,
36 |
37 | 10
38 | 00:00:31,000 --> 00:00:35,000
39 | much like a chess player considering various moves at once.
40 |
41 | 11
42 | 00:00:35,000 --> 00:00:38,000
43 | These techniques pack a punch in terms of benefits.
44 |
45 | 12
46 | 00:00:38,000 --> 00:00:43,000
47 | They sharpen reasoning skills, tackle complex queries with ease, and produce nuanced
48 |
49 | 13
50 | 00:00:43,000 --> 00:00:46,000
51 | and more context-aware responses.
52 |
53 | 14
54 | 00:00:46,000 --> 00:00:47,000
55 | But here's the catch.
56 |
57 | 15
58 | 00:00:47,000 --> 00:00:51,000
59 | With the increased power comes increased complexity.
60 |
61 | 16
62 | 00:00:51,000 --> 00:00:56,000
63 | You might find yourself juggling token limitations and intricate prompt designs.
64 |
65 | 17
66 | 00:00:56,000 --> 00:01:00,000
67 | The secret sauce is finding the right balance for your specific use case.
68 |
69 | 18
70 | 00:01:00,000 --> 00:01:04,000
71 | For those hungry for more, we have a full course dedicated to these techniques.
72 |
73 | 19
74 | 00:01:04,000 --> 00:01:08,000
75 | Check out the link in the description to dive deeper into this fascinating world.
76 |
77 |
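
For reference, here are illustrative prompt fragments for two of these techniques; the exact wording is ours, not from the course materials:

```python
# Chain-of-thought: ask for explicit intermediate steps.
CHAIN_OF_THOUGHT = (
    "Think through the problem step by step before answering.\n"
    "1. Restate the question.\n"
    "2. List the relevant facts from the context.\n"
    "3. Reason from those facts.\n"
    "4. State the final answer."
)

# Self-reflection: draft, critique, then revise.
SELF_REFLECTION = (
    "Draft an answer, then critique it: is every claim supported by the "
    "context? Revise the draft to fix unsupported claims and return only "
    "the revised answer."
)
```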
--------------------------------------------------------------------------------
/rag-advanced/resources/Chapter 6/RAG-6.7-ensuring quality-.srt:
--------------------------------------------------------------------------------
1 | 1
2 | 00:00:00,000 --> 00:00:05,000
3 | When it comes to quality control in RAG systems, we focus on three key elements:
4 |
5 | 2
6 | 00:00:05,000 --> 00:00:08,000
7 | guardrails, grounding, and citations.
8 |
9 | 3
10 | 00:00:08,000 --> 00:00:11,000
11 | Think of output guardrails as your system's safety net.
12 |
13 | 4
14 | 00:00:11,000 --> 00:00:15,000
15 | We use structural checks, semantic filters, and safety measures
16 |
17 | 5
18 | 00:00:15,000 --> 00:00:19,000
19 | to keep responses on topic and appropriate.
20 |
21 | 6
22 | 00:00:19,000 --> 00:00:23,000
23 | Grounding techniques are all about anchoring the responses in facts.
24 |
25 | 7
26 | 00:00:23,000 --> 00:00:27,000
27 | For example, in wandbot, we tightly couple the responses with our documentation
28 |
29 | 8
30 | 00:00:28,000 --> 00:00:30,000
31 | and fact check against our knowledge base.
32 |
33 | 9
34 | 00:00:30,000 --> 00:00:34,000
35 | This has reduced hallucinations and improved response accuracy significantly.
36 |
37 | 10
38 | 00:00:35,000 --> 00:00:37,000
39 | Citations are our transparency tool.
40 |
41 | 11
42 | 00:00:38,000 --> 00:00:43,000
43 | By including links and in-text references, we have enhanced user trust
44 |
45 | 12
46 | 00:00:43,000 --> 00:00:46,000
47 | and users appreciate being able to verify information themselves.
48 |
49 | 13
50 | 00:00:47,000 --> 00:00:50,000
51 | The impact: wandbot has evolved from a basic query system
52 |
53 | 14
54 | 00:00:50,000 --> 00:00:52,000
55 | to a trusted documentation assistant.
56 |
57 | 15
58 | 00:00:52,000 --> 00:00:55,000
59 | For example, when explaining complex features,
60 |
61 | 16
62 | 00:00:56,000 --> 00:00:59,000
63 | it now provides accurate step-by-step guidance with verifiable sources.
64 |
65 | 17
66 | 00:01:00,000 --> 00:01:05,000
67 | Remember, these techniques work together to create a robust, trustworthy RAG system.
68 |
69 | 18
70 | 00:01:05,000 --> 00:01:09,000
71 | Implement them consistently and you will see a marked improvement
72 |
73 | 19
74 | 00:01:09,000 --> 00:01:11,000
75 | in output quality and user satisfaction.
76 |
77 |
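
A toy sketch of an output guardrail check in the spirit described above; production systems typically use classifiers or dedicated guardrail libraries rather than these simple heuristics:

```python
def passes_guardrails(response: str, context: str) -> bool:
    """Toy structural and semantic checks on a generated response."""
    if not response.strip():
        return False  # structural: response must be non-empty
    if len(response) > 4000:
        return False  # structural: response must be bounded in length
    banned_phrases = ("as an ai language model",)
    if any(phrase in response.lower() for phrase in banned_phrases):
        return False  # simple safety/topicality filter
    # Crude grounding proxy: require some word overlap with the
    # retrieved context the response is supposed to be based on.
    overlap = set(response.lower().split()) & set(context.lower().split())
    return len(overlap) > 5
```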
--------------------------------------------------------------------------------
/rag-advanced/resources/Chapter 6/RAG-6.8-troubleshooting.srt:
--------------------------------------------------------------------------------
1 | 1
2 | 00:00:00,000 --> 00:00:03,000
3 | Let's wrap up with some troubleshooting tips and best practices.
4 |
5 | 2
6 | 00:00:03,000 --> 00:00:08,000
7 | Even the best RAG systems face challenges, but with the right approach, we can overcome them.
8 |
9 | 3
10 | 00:00:09,000 --> 00:00:14,000
11 | Common issues include incoherent responses, struggles with ambiguous queries,
12 |
13 | 4
14 | 00:00:14,000 --> 00:00:16,000
15 | and inconsistency across topics.
16 |
17 | 5
18 | 00:00:16,000 --> 00:00:20,000
19 | For improving coherence, we use structured prompts to guide logical flow.
20 |
21 | 6
22 | 00:00:21,000 --> 00:00:24,000
23 | It's like giving your LLM a roadmap for its responses.
24 |
25 | 7
26 | 00:00:24,000 --> 00:00:28,000
27 | When dealing with ambiguous queries, clarification prompts are what you need.
28 |
29 | 8
30 | 00:00:29,000 --> 00:00:32,000
31 | They help your system ask for more information when needed.
32 |
33 | 9
34 | 00:00:32,000 --> 00:00:36,000
35 | To maintain consistency, regular knowledge base updates are crucial.
36 |
37 | 10
38 | 00:00:36,000 --> 00:00:39,000
39 | Keep your system's information fresh and aligned.
40 |
41 | 11
42 | 00:00:40,000 --> 00:00:42,000
43 | These techniques aren't just theoretical.
44 |
45 | 12
46 | 00:00:42,000 --> 00:00:47,000
47 | In wandbot, they've improved the response coherence and halved our irrelevant responses.
48 |
49 | 13
50 | 00:00:48,000 --> 00:00:50,000
51 | Remember, troubleshooting is an ongoing process.
52 |
53 | 14
54 | 00:00:50,000 --> 00:00:54,000
55 | Stay vigilant, keep refining, and your RAG system will continue to improve over time.
56 |
57 | 15
58 | 00:00:55,000 --> 00:00:59,000
59 | By implementing these practices, you're not just fixing problems,
60 |
61 | 16
62 | 00:00:59,000 --> 00:01:04,000
63 | you're building a robust, reliable system that can handle whatever the user throws at it.
64 |
65 |
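
An illustrative clarification prompt for the ambiguous-query case; the wording is ours:

```python
# Appended to the system prompt so the model asks instead of guessing.
CLARIFICATION_PROMPT = (
    "If you cannot identify a single clear intent from the question and "
    "the retrieved context, do not guess: ask ONE short follow-up "
    "question that would resolve the ambiguity."
)
```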
--------------------------------------------------------------------------------
/rag-advanced/resources/Chapter 7/RAG-7.1-chapter intro.srt:
--------------------------------------------------------------------------------
1 | 1
2 | 00:00:00,000 --> 00:00:02,000
3 | Welcome to the last chapter in this course.
4 |
5 | 2
6 | 00:00:02,000 --> 00:00:06,000
7 | If you have made it this far, we have some nuggets of wisdom that we want to share in this chapter.
8 |
9 | 3
10 | 00:00:06,000 --> 00:00:10,000
11 | We will share general tips and tricks to reduce the latency of your LLM system
12 |
13 | 4
14 | 00:00:10,000 --> 00:00:14,000
15 | and go over general considerations like parallelization of your LLM calls
16 |
17 | 5
18 | 00:00:14,000 --> 00:00:16,000
19 | and making your application configurable.
20 |
21 | 6
22 | 00:00:16,000 --> 00:00:23,000
23 | In chapter 2, we covered how we evaluated wandbot to achieve an 8% accuracy improvement over our baseline of 72%.
24 |
25 | 7
26 | 00:00:23,000 --> 00:00:29,000
27 | In this chapter, we will spend some time discussing ideas on how we did it, all while reducing the latency by 84%.
28 |
29 |
--------------------------------------------------------------------------------
/rag-advanced/resources/Chapter 7/RAG-7.2-frameworks.srt:
--------------------------------------------------------------------------------
1 | 1
2 | 00:00:00,000 --> 00:00:02,000
3 | Let's start by talking about frameworks.
4 |
5 | 2
6 | 00:00:02,000 --> 00:00:04,000
7 | In general, feel free to use whatever suits your purpose,
8 |
9 | 3
10 | 00:00:04,000 --> 00:00:06,000
11 | but avoid framework overload.
12 |
13 | 4
14 | 00:00:06,000 --> 00:00:11,000
15 | While building wandbot, we switched from LlamaIndex to Instructor to LangChain or a mix of them.
16 |
17 | 5
18 | 00:00:11,000 --> 00:00:15,000
19 | It was an exercise to see what works well, but honestly, they are all great.
20 |
21 | 6
22 | 00:00:15,000 --> 00:00:18,000
23 | If your workflow is data heavy, consider using LlamaIndex.
24 |
25 | 7
26 | 00:00:18,000 --> 00:00:21,000
27 | And if it is LLM-call heavy, consider using LangChain.
28 |
29 | 8
30 | 00:00:21,000 --> 00:00:24,000
31 | I highly recommend evaluating frameworks for yourself.
32 |
33 | 9
34 | 00:00:24,000 --> 00:00:28,000
35 | I'm also a firm believer in using fewer abstractions wherever possible.
36 |
37 | 10
38 | 00:00:28,000 --> 00:00:31,000
39 | For wandbot, we use frameworks for generic tasks,
40 |
41 | 11
42 | 00:00:31,000 --> 00:00:35,000
43 | but wrote custom pure Pythonic code for performance-critical sections.
44 |
45 | 12
46 | 00:00:35,000 --> 00:00:39,000
47 | Finally, this is obvious, but asynchronous programming is your best friend.
48 |
49 | 13
50 | 00:00:39,000 --> 00:00:40,000
51 | Don't shy away from it.
52 |
53 | 14
54 | 00:00:40,000 --> 00:00:44,000
55 | Depending on the use case, there can be multiple I/O bottlenecks,
56 |
57 | 15
58 | 00:00:44,000 --> 00:00:45,000
59 | and async programming can help.
60 |
61 |
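
As a minimal illustration of the async point, the sketch below fires three simulated LLM calls concurrently with `asyncio.gather`; in a real pipeline each call would use an SDK's async client:

```python
import asyncio


async def call_llm(prompt: str) -> str:
    # Stand-in for a real async client call.
    await asyncio.sleep(1.0)  # simulate network latency
    return f"response to: {prompt}"


async def main():
    prompts = ["enhance query", "classify intent", "extract keywords"]
    # All three calls run concurrently: total time is ~1s, not ~3s.
    results = await asyncio.gather(*(call_llm(p) for p in prompts))
    print(results)


asyncio.run(main())
```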
--------------------------------------------------------------------------------
/rag-advanced/resources/Chapter 7/RAG-7.3-data ingestion.srt:
--------------------------------------------------------------------------------
1 | 1
2 | 00:00:00,000 --> 00:00:03,000
3 | Let's look at data ingestion and how we can speed it up and make it more efficient.
4 |
5 | 2
6 | 00:00:04,000 --> 00:00:07,000
7 | A general rule of thumb is to use multi-processing for your data ingestion operations,
8 |
9 | 3
10 | 00:00:07,000 --> 00:00:10,000
11 | like converting raw files to text or chunking them.
12 |
13 | 4
14 | 00:00:10,000 --> 00:00:14,000
15 | Depending on your use case, you can also approach indexing in multiple ways.
16 |
17 | 5
18 | 00:00:14,000 --> 00:00:18,000
19 | If you only have a few thousand samples, doing flat indexing is not a bad option.
20 |
21 | 6
22 | 00:00:18,000 --> 00:00:22,000
23 | You can also consider various variants of hierarchical indexing
24 |
25 | 7
26 | 00:00:22,000 --> 00:00:25,000
27 | to speed up searching through the most relevant pieces of information.
28 |
29 | 8
30 | 00:00:26,000 --> 00:00:31,000
31 | Make sure your files, be they PDFs, web pages, or markdown, are converted to simple text
32 |
33 | 9
34 | 00:00:31,000 --> 00:00:34,000
35 | and associated metadata. This makes the whole application more efficient.
36 |
37 | 10
38 | 00:00:34,000 --> 00:00:39,000
39 | I like LlamaIndex's Document class which is an excellent abstraction for handling data.
40 |
41 | 11
42 | 00:00:40,000 --> 00:00:43,000
43 | Make sure you have validation in place, which is very important.
44 |
45 | 12
46 | 00:00:43,000 --> 00:00:47,000
47 | Also make sure to keep track of your data versions with a Weave dataset.
48 |
49 | 13
50 | 00:00:47,000 --> 00:00:51,000
51 | Baking in a versioning tool like Weave is one-time work, but you get the benefits out of it
52 |
53 | 14
54 | 00:00:51,000 --> 00:00:54,000
55 | throughout the lifecycle of the project.
56 |
57 |
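
A minimal multiprocessing sketch for the parse-and-chunk step; the directory path and the naive chunker are illustrative stand-ins for real parsing logic:

```python
from multiprocessing import Pool
from pathlib import Path


def to_chunks(path: str, size: int = 1000) -> list:
    # Stand-in for real parsing (PDF/HTML/markdown -> text) and chunking.
    text = Path(path).read_text(errors="ignore")
    return [text[i:i + size] for i in range(0, len(text), size)]


if __name__ == "__main__":
    files = [str(p) for p in Path("docs_sample").glob("*.md")]  # illustrative path
    with Pool() as pool:  # one worker per CPU core by default
        chunks_per_file = pool.map(to_chunks, files)
    print(sum(len(chunks) for chunks in chunks_per_file), "chunks")
```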
--------------------------------------------------------------------------------
/rag-advanced/resources/Chapter 7/RAG-7.4-vector store.srt:
--------------------------------------------------------------------------------
1 | 1
2 | 00:00:00,000 --> 00:00:02,000
3 | Selecting the right vector store is also very important.
4 |
5 | 2
6 | 00:00:02,000 --> 00:00:05,000
7 | Usually the likes of Chroma and Weaviate are all you need.
8 |
9 | 3
10 | 00:00:05,000 --> 00:00:08,000
11 | In wandbot, when we switched from FAISS to Chroma,
12 |
13 | 4
14 | 00:00:08,000 --> 00:00:10,000
15 | we got a massive speed boost.
16 |
17 | 5
18 | 00:00:10,000 --> 00:00:12,000
19 | This was mostly coming from the fact
20 |
21 | 6
22 | 00:00:12,000 --> 00:00:15,000
23 | that we were able to do efficient metadata filtering.
24 |
25 | 7
26 | 00:00:15,000 --> 00:00:17,000
27 | Note that for most applications,
28 |
29 | 8
30 | 00:00:17,000 --> 00:00:19,000
31 | a fancy vector store is not necessary.
32 |
33 | 9
34 | 00:00:19,000 --> 00:00:21,000
35 | Vector stores are usually fast,
36 |
37 | 10
38 | 00:00:21,000 --> 00:00:23,000
39 | and super fast vector stores gain their speed
40 |
41 | 11
42 | 00:00:23,000 --> 00:00:25,000
43 | at the cost of recall.
44 |
45 | 12
46 | 00:00:25,000 --> 00:00:28,000
47 | For most applications like RAG, recall is more important.
48 |
49 | 13
50 | 00:00:29,000 --> 00:00:32,000
51 | For wandbot, we used in-memory vector store
52 |
53 | 14
54 | 00:00:32,000 --> 00:00:33,000
55 | to lower latency,
56 |
57 | 15
58 | 00:00:33,000 --> 00:00:35,000
59 | but note that this was only possible
60 |
61 | 16
62 | 00:00:35,000 --> 00:00:38,000
63 | because the total size of the documents to index
64 |
65 | 17
66 | 00:00:38,000 --> 00:00:39,000
67 | wasn't that huge.
68 |
69 | 18
70 | 00:00:39,000 --> 00:00:42,000
71 | And many applications will fall in this bracket.
72 |
73 | 19
74 | 00:00:43,000 --> 00:00:46,000
75 | However, using dedicated cloud managed DBs
76 |
77 | 20
78 | 00:00:46,000 --> 00:00:49,000
79 | makes the application overall easier to manage
80 |
81 | 21
82 | 00:00:49,000 --> 00:00:50,000
83 | and easy to configure.
84 |
85 |
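
For example, with Chroma the in-memory and persistent clients share one API, so swapping between them is a one-line change; the collection contents here are toy data:

```python
import chromadb

# In-memory store: lowest latency; viable when the corpus is small,
# as it was for wandbot.
client = chromadb.Client()
# Persistent local store: same API, survives restarts.
# client = chromadb.PersistentClient(path="./index")

collection = client.get_or_create_collection("docs")
collection.add(
    ids=["1"],
    documents=["Log metrics with wandb.log."],
    metadatas=[{"source": "guides"}],
)

# Metadata filtering narrows the search before similarity scoring.
hits = collection.query(
    query_texts=["how to log metrics"],
    n_results=1,
    where={"source": "guides"},
)
print(hits["documents"])
```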
--------------------------------------------------------------------------------
/rag-advanced/resources/Chapter 7/RAG-7.5-LLM calls.srt:
--------------------------------------------------------------------------------
1 | 1
2 | 00:00:00,000 --> 00:00:02,000
3 | LLM calls take the most time in a RAG pipeline.
4 |
5 | 2
6 | 00:00:02,000 --> 00:00:06,000
7 | If you're using an open source model, there are ways to speed up generation,
8 |
9 | 3
10 | 00:00:06,000 --> 00:00:09,000
11 | but most frontier LLM providers are already employing these tricks.
12 |
13 | 4
14 | 00:00:09,000 --> 00:00:13,000
15 | So the way ahead to reduce overhead due to LLM calls
16 |
17 | 5
18 | 00:00:13,000 --> 00:00:16,000
19 | is to make them parallel wherever possible.
20 |
21 | 6
22 | 00:00:16,000 --> 00:00:18,000
23 | Most frontier LLM providers can handle multiple requests
24 |
25 | 7
26 | 00:00:18,000 --> 00:00:21,000
27 | and this helps in parallelization.
28 |
29 | 8
30 | 00:00:21,000 --> 00:00:24,000
31 | LangChain Expression Language (LCEL) is something we have used in wandbot
32 |
33 | 9
34 | 00:00:24,000 --> 00:00:27,000
35 | and something we recommend to parallelize LLM calls
36 |
37 | 10
38 | 00:00:27,000 --> 00:00:29,000
39 | while making the code more readable and efficient.
40 |
41 | 11
42 | 00:00:30,000 --> 00:00:35,000
43 | LCEL allows you to chain small components both sequentially and in parallel,
44 |
45 | 12
46 | 00:00:35,000 --> 00:00:38,000
47 | and the best part is you can switch between sync and async mode
48 |
49 | 13
50 | 00:00:38,000 --> 00:00:39,000
51 | without changing anything in the code.
52 |
53 | 14
54 | 00:00:39,000 --> 00:00:40,000
55 | It just works.
56 |
57 | 15
58 | 00:00:40,000 --> 00:00:44,000
59 | Finally, try to batch user queries wherever possible.
60 |
61 |
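
A small LCEL sketch of this fan-out pattern; the two branches are toy `RunnableLambda`s standing in for real prompt-plus-LLM chains:

```python
from langchain_core.runnables import RunnableLambda, RunnableParallel

# Toy components standing in for real prompt | llm chains.
extract_keywords = RunnableLambda(lambda q: f"keywords({q})")
classify_intent = RunnableLambda(lambda q: f"intent({q})")

# RunnableParallel fans the input out to both branches concurrently.
enhance = RunnableParallel(keywords=extract_keywords, intent=classify_intent)

print(enhance.invoke("How do I resume a run?"))    # sync
# await enhance.ainvoke("How do I resume a run?")  # async, same chain
```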
--------------------------------------------------------------------------------
/wandb101/README.md:
--------------------------------------------------------------------------------
1 |
2 |
3 |
4 |
5 |
6 | # Learn W&B: 101
7 |
8 | This repository contains materials for our [W&B 101](https://www.wandb.courses/courses/wandb-101) course.
9 |
10 | Getting started tracking your experiments using W&B is easy with this series of short tutorials.
11 |
12 | ## 🚀 [Enroll for free](https://www.wandb.courses/courses/wandb-101)
13 |
14 | ## What you'll learn
15 |
16 | - How to visualize metrics while training models
17 | - How to track the hyperparameters of your experiments
18 | - How to analyze your results using the W&B UI
19 | - An overview of the other problems W&B can help you solve like hyperparameter tuning & model evaluation
20 |
--------------------------------------------------------------------------------
/wandb101/requirements.txt:
--------------------------------------------------------------------------------
1 | torch>=1.9
2 | torchvision
3 | wandb
4 |
5 |
--------------------------------------------------------------------------------