├── CODE_OF_CONDUCT.md
├── CONTRIBUTIONS.md
├── LICENSE
├── README.md
├── aks.md
├── arch.png
├── bhive.png
├── build
│   └── lib
│       └── krs
│           ├── __init__.py
│           ├── krs.py
│           ├── main.py
│           └── utils
│               ├── __init__.py
│               ├── cluster_scanner.py
│               ├── constants.py
│               ├── fetch_tools_krs.py
│               ├── functional.py
│               └── llm_client.py
├── demo.gif
├── dist
│   └── krs-0.1.0-py3.10.egg
├── dokc
├── dokc.md
├── eks.md
├── gke.md
├── kind.md
├── krs.egg-info
│   ├── PKG-INFO
│   ├── SOURCES.txt
│   ├── dependency_links.txt
│   ├── entry_points.txt
│   ├── requires.txt
│   └── top_level.txt
├── krs
│   ├── __init__.py
│   ├── krs.py
│   ├── main.py
│   ├── requirements.txt
│   └── utils
│       ├── __init__.py
│       ├── cluster_scanner.py
│       ├── constants.py
│       ├── fetch_tools_krs.py
│       ├── functional.py
│       └── llm_client.py
├── mkc.md
├── samples
│   ├── install-tools.sh
│   └── uninstall-tools.sh
└── setup.py
/CODE_OF_CONDUCT.md:
--------------------------------------------------------------------------------
1 | # Contributor Covenant Code of Conduct
2 |
3 | ## Our Pledge
4 |
5 | We as members, contributors, and leaders pledge to make participation in our
6 | community a harassment-free experience for everyone, regardless of age, body
7 | size, visible or invisible disability, ethnicity, sex characteristics, gender
8 | identity and expression, level of experience, education, socio-economic status,
9 | nationality, personal appearance, race, religion, or sexual identity
10 | and orientation.
11 |
12 | We pledge to act and interact in ways that contribute to an open, welcoming,
13 | diverse, inclusive, and healthy community.
14 |
15 | ## Our Standards
16 |
17 | Examples of behavior that contributes to a positive environment for our
18 | community include:
19 |
20 | * Demonstrating empathy and kindness toward other people
21 | * Being respectful of differing opinions, viewpoints, and experiences
22 | * Giving and gracefully accepting constructive feedback
23 | * Accepting responsibility and apologizing to those affected by our mistakes,
24 | and learning from the experience
25 | * Focusing on what is best not just for us as individuals, but for the
26 | overall community
27 |
28 | Examples of unacceptable behavior include:
29 |
30 | * The use of sexualized language or imagery, and sexual attention or
31 | advances of any kind
32 | * Trolling, insulting or derogatory comments, and personal or political attacks
33 | * Public or private harassment
34 | * Publishing others' private information, such as a physical or email
35 | address, without their explicit permission
36 | * Other conduct which could reasonably be considered inappropriate in a
37 | professional setting
38 |
39 | ## Enforcement Responsibilities
40 |
41 | Community leaders are responsible for clarifying and enforcing our standards of
42 | acceptable behavior and will take appropriate and fair corrective action in
43 | response to any behavior that they deem inappropriate, threatening, offensive,
44 | or harmful.
45 |
46 | Community leaders have the right and responsibility to remove, edit, or reject
47 | comments, commits, code, wiki edits, issues, and other contributions that are
48 | not aligned to this Code of Conduct, and will communicate reasons for moderation
49 | decisions when appropriate.
50 |
51 | ## Scope
52 |
53 | This Code of Conduct applies within all community spaces, and also applies when
54 | an individual is officially representing the community in public spaces.
55 | Examples of representing our community include using an official e-mail address,
56 | posting via an official social media account, or acting as an appointed
57 | representative at an online or offline event.
58 |
59 | ## Enforcement
60 |
61 | Instances of abusive, harassing, or otherwise unacceptable behavior may be
62 | reported to the community leaders responsible for enforcement at
63 | .
64 | All complaints will be reviewed and investigated promptly and fairly.
65 |
66 | All community leaders are obligated to respect the privacy and security of the
67 | reporter of any incident.
68 |
69 | ## Enforcement Guidelines
70 |
71 | Community leaders will follow these Community Impact Guidelines in determining
72 | the consequences for any action they deem in violation of this Code of Conduct:
73 |
74 | ### 1. Correction
75 |
76 | **Community Impact**: Use of inappropriate language or other behavior deemed
77 | unprofessional or unwelcome in the community.
78 |
79 | **Consequence**: A private, written warning from community leaders, providing
80 | clarity around the nature of the violation and an explanation of why the
81 | behavior was inappropriate. A public apology may be requested.
82 |
83 | ### 2. Warning
84 |
85 | **Community Impact**: A violation through a single incident or series
86 | of actions.
87 |
88 | **Consequence**: A warning with consequences for continued behavior. No
89 | interaction with the people involved, including unsolicited interaction with
90 | those enforcing the Code of Conduct, for a specified period of time. This
91 | includes avoiding interactions in community spaces as well as external channels
92 | like social media. Violating these terms may lead to a temporary or
93 | permanent ban.
94 |
95 | ### 3. Temporary Ban
96 |
97 | **Community Impact**: A serious violation of community standards, including
98 | sustained inappropriate behavior.
99 |
100 | **Consequence**: A temporary ban from any sort of interaction or public
101 | communication with the community for a specified period of time. No public or
102 | private interaction with the people involved, including unsolicited interaction
103 | with those enforcing the Code of Conduct, is allowed during this period.
104 | Violating these terms may lead to a permanent ban.
105 |
106 | ### 4. Permanent Ban
107 |
108 | **Community Impact**: Demonstrating a pattern of violation of community
109 | standards, including sustained inappropriate behavior, harassment of an
110 | individual, or aggression toward or disparagement of classes of individuals.
111 |
112 | **Consequence**: A permanent ban from any sort of public interaction within
113 | the community.
114 |
115 | ## Attribution
116 |
117 | This Code of Conduct is adapted from the [Contributor Covenant][homepage],
118 | version 2.0, available at
119 | https://www.contributor-covenant.org/version/2/0/code_of_conduct.html.
120 |
121 | Community Impact Guidelines were inspired by [Mozilla's code of conduct
122 | enforcement ladder](https://github.com/mozilla/diversity).
123 |
124 | [homepage]: https://www.contributor-covenant.org
125 |
126 | For answers to common questions about this code of conduct, see the FAQ at
127 | https://www.contributor-covenant.org/faq. Translations are available at
128 | https://www.contributor-covenant.org/translations.
129 |
--------------------------------------------------------------------------------
/CONTRIBUTIONS.md:
--------------------------------------------------------------------------------
1 | # Contribution Guidelines
2 |
3 | Thank you for your interest in contributing to the Kubetools Recommender System (Krs) tool! We welcome your contributions and are excited to help make this project better.
4 |
5 | ## Code of Conduct
6 |
7 | Before contributing, please review and adhere to the [Code of Conduct](CODE_OF_CONDUCT.md). We expect all contributors to follow these guidelines to ensure a positive and inclusive environment.
8 |
9 | ## Release Management and Pull Request Submission Guidelines
10 |
11 | ### Repository Branch Structure
12 |
13 | Our project employs a three-branch workflow to manage the development and release of new features and fixes:
14 |
15 | - **main**: Stable release branch that contains production-ready code.
16 | - **pre-release**: Staging branch for final testing before merging into main.
17 | - **release-v0.x.x**: Active development branch where all new changes, bug fixes, and features are made and tested.
18 |
19 | ### Contributing to the Project
20 |
21 | To contribute to our project, follow these steps to ensure your changes are properly integrated:
22 |
23 | **Selecting the Correct Branch**
24 |
25 | - Always base your work on the latest **release-vx.x.x** branch. This branch will be named according to the version, for example, release-v0.1.0.
26 | - Ensure you check the branch name in the repository for the most current version branch.
27 |
28 | **Working on Issues**
29 |
30 | - Before you start working on an issue, comment on that issue stating that you are taking it on. This helps prevent multiple contributors from working on the same issue simultaneously.
31 | - Include the issue number in your branch name for traceability, e.g., 123-fix-login-bug.
32 |
33 | ## **Pull Request (PR) Process**
34 |
35 | To maintain code quality and orderly management, all contributors must follow this PR process:
36 |
37 |
38 | ### **Step 1: Sync Your Fork**
39 |
40 | Before starting your work, sync your fork with the upstream repository to ensure you have the latest changes from the **release-vx.x.x** branch. The commands below assume the original repository is configured as a remote named `upstream`:
41 | ```
42 | git checkout release-vx.x.x
43 | git pull upstream release-vx.x.x
44 | ```
45 |
46 |
47 | ### **Step 2: Create a New Branch**
48 |
49 | Create a new branch from the **release-vx.x.x** branch for your work:
50 | ```
51 | git checkout -b your-branch-name
52 | ```
53 |
54 |
55 | ### **Step 3: Make Changes and Commit**
56 |
57 | Make your changes locally and commit them with clear, concise commit messages. Your commits should reference the issue number:
58 | ```
59 | git commit -m "Fix issue #123: resolve login bug"
60 | ```
61 |
62 |
63 | ### **Step 4: Push Changes**
64 |
65 | Push your branch and changes to your fork:
66 |
67 | ```
68 | git push -u origin your-branch-name
69 | ```
70 |
71 |
72 | ### **Step 5: Open a Pull Request**
73 |
74 | - Go to the original repository on GitHub and open a pull request from your branch to the **release-vx.x.x** branch.
75 | - Clearly describe the changes you are proposing in the PR description. Link the PR to any relevant issues.
76 |
77 |
78 | ### **Step 6: PR Review**
79 |
80 | - All pull requests must undergo review by at least two peers before merging.
81 | - Address any feedback and make required updates to your PR; this may involve additional commits.
82 |
83 |
84 | ### **Step 7: Final Merging**
85 |
86 | - Once your PR is approved by the reviewers, one of the maintainers will merge it into the release-v0.x.x branch.
87 | - The changes will eventually be merged into pre-release and then main as part of our release process.
88 |
89 |
90 | **Notes on Contribution**
91 |
92 | - Please make sure all tests pass before submitting a PR.
93 | - Adhere to the coding standards and guidelines provided in our repository to ensure consistency and quality.
94 |
95 | ## Additional Resources
96 |
97 | - [GitHub Guides: Contributing to Open Source](https://guides.github.com/activities/contributing-to-open-source/)
98 | - [How to Contribute to an Open Source Project](https://opensource.guide/how-to-contribute/)
99 | - [The Art of Readable Code](https://www.goodreads.com/book/show/86770.The_Art_of_Readable_Code)
100 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | Apache License
2 | Version 2.0, January 2004
3 | http://www.apache.org/licenses/
4 |
5 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
6 |
7 | 1. Definitions.
8 |
9 | "License" shall mean the terms and conditions for use, reproduction,
10 | and distribution as defined by Sections 1 through 9 of this document.
11 |
12 | "Licensor" shall mean the copyright owner or entity authorized by
13 | the copyright owner that is granting the License.
14 |
15 | "Legal Entity" shall mean the union of the acting entity and all
16 | other entities that control, are controlled by, or are under common
17 | control with that entity. For the purposes of this definition,
18 | "control" means (i) the power, direct or indirect, to cause the
19 | direction or management of such entity, whether by contract or
20 | otherwise, or (ii) ownership of fifty percent (50%) or more of the
21 | outstanding shares, or (iii) beneficial ownership of such entity.
22 |
23 | "You" (or "Your") shall mean an individual or Legal Entity
24 | exercising permissions granted by this License.
25 |
26 | "Source" form shall mean the preferred form for making modifications,
27 | including but not limited to software source code, documentation
28 | source, and configuration files.
29 |
30 | "Object" form shall mean any form resulting from mechanical
31 | transformation or translation of a Source form, including but
32 | not limited to compiled object code, generated documentation,
33 | and conversions to other media types.
34 |
35 | "Work" shall mean the work of authorship, whether in Source or
36 | Object form, made available under the License, as indicated by a
37 | copyright notice that is included in or attached to the work
38 | (an example is provided in the Appendix below).
39 |
40 | "Derivative Works" shall mean any work, whether in Source or Object
41 | form, that is based on (or derived from) the Work and for which the
42 | editorial revisions, annotations, elaborations, or other modifications
43 | represent, as a whole, an original work of authorship. For the purposes
44 | of this License, Derivative Works shall not include works that remain
45 | separable from, or merely link (or bind by name) to the interfaces of,
46 | the Work and Derivative Works thereof.
47 |
48 | "Contribution" shall mean any work of authorship, including
49 | the original version of the Work and any modifications or additions
50 | to that Work or Derivative Works thereof, that is intentionally
51 | submitted to Licensor for inclusion in the Work by the copyright owner
52 | or by an individual or Legal Entity authorized to submit on behalf of
53 | the copyright owner. For the purposes of this definition, "submitted"
54 | means any form of electronic, verbal, or written communication sent
55 | to the Licensor or its representatives, including but not limited to
56 | communication on electronic mailing lists, source code control systems,
57 | and issue tracking systems that are managed by, or on behalf of, the
58 | Licensor for the purpose of discussing and improving the Work, but
59 | excluding communication that is conspicuously marked or otherwise
60 | designated in writing by the copyright owner as "Not a Contribution."
61 |
62 | "Contributor" shall mean Licensor and any individual or Legal Entity
63 | on behalf of whom a Contribution has been received by Licensor and
64 | subsequently incorporated within the Work.
65 |
66 | 2. Grant of Copyright License. Subject to the terms and conditions of
67 | this License, each Contributor hereby grants to You a perpetual,
68 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable
69 | copyright license to reproduce, prepare Derivative Works of,
70 | publicly display, publicly perform, sublicense, and distribute the
71 | Work and such Derivative Works in Source or Object form.
72 |
73 | 3. Grant of Patent License. Subject to the terms and conditions of
74 | this License, each Contributor hereby grants to You a perpetual,
75 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable
76 | (except as stated in this section) patent license to make, have made,
77 | use, offer to sell, sell, import, and otherwise transfer the Work,
78 | where such license applies only to those patent claims licensable
79 | by such Contributor that are necessarily infringed by their
80 | Contribution(s) alone or by combination of their Contribution(s)
81 | with the Work to which such Contribution(s) was submitted. If You
82 | institute patent litigation against any entity (including a
83 | cross-claim or counterclaim in a lawsuit) alleging that the Work
84 | or a Contribution incorporated within the Work constitutes direct
85 | or contributory patent infringement, then any patent licenses
86 | granted to You under this License for that Work shall terminate
87 | as of the date such litigation is filed.
88 |
89 | 4. Redistribution. You may reproduce and distribute copies of the
90 | Work or Derivative Works thereof in any medium, with or without
91 | modifications, and in Source or Object form, provided that You
92 | meet the following conditions:
93 |
94 | (a) You must give any other recipients of the Work or
95 | Derivative Works a copy of this License; and
96 |
97 | (b) You must cause any modified files to carry prominent notices
98 | stating that You changed the files; and
99 |
100 | (c) You must retain, in the Source form of any Derivative Works
101 | that You distribute, all copyright, patent, trademark, and
102 | attribution notices from the Source form of the Work,
103 | excluding those notices that do not pertain to any part of
104 | the Derivative Works; and
105 |
106 | (d) If the Work includes a "NOTICE" text file as part of its
107 | distribution, then any Derivative Works that You distribute must
108 | include a readable copy of the attribution notices contained
109 | within such NOTICE file, excluding those notices that do not
110 | pertain to any part of the Derivative Works, in at least one
111 | of the following places: within a NOTICE text file distributed
112 | as part of the Derivative Works; within the Source form or
113 | documentation, if provided along with the Derivative Works; or,
114 | within a display generated by the Derivative Works, if and
115 | wherever such third-party notices normally appear. The contents
116 | of the NOTICE file are for informational purposes only and
117 | do not modify the License. You may add Your own attribution
118 | notices within Derivative Works that You distribute, alongside
119 | or as an addendum to the NOTICE text from the Work, provided
120 | that such additional attribution notices cannot be construed
121 | as modifying the License.
122 |
123 | You may add Your own copyright statement to Your modifications and
124 | may provide additional or different license terms and conditions
125 | for use, reproduction, or distribution of Your modifications, or
126 | for any such Derivative Works as a whole, provided Your use,
127 | reproduction, and distribution of the Work otherwise complies with
128 | the conditions stated in this License.
129 |
130 | 5. Submission of Contributions. Unless You explicitly state otherwise,
131 | any Contribution intentionally submitted for inclusion in the Work
132 | by You to the Licensor shall be under the terms and conditions of
133 | this License, without any additional terms or conditions.
134 | Notwithstanding the above, nothing herein shall supersede or modify
135 | the terms of any separate license agreement you may have executed
136 | with Licensor regarding such Contributions.
137 |
138 | 6. Trademarks. This License does not grant permission to use the trade
139 | names, trademarks, service marks, or product names of the Licensor,
140 | except as required for reasonable and customary use in describing the
141 | origin of the Work and reproducing the content of the NOTICE file.
142 |
143 | 7. Disclaimer of Warranty. Unless required by applicable law or
144 | agreed to in writing, Licensor provides the Work (and each
145 | Contributor provides its Contributions) on an "AS IS" BASIS,
146 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
147 | implied, including, without limitation, any warranties or conditions
148 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
149 | PARTICULAR PURPOSE. You are solely responsible for determining the
150 | appropriateness of using or redistributing the Work and assume any
151 | risks associated with Your exercise of permissions under this License.
152 |
153 | 8. Limitation of Liability. In no event and under no legal theory,
154 | whether in tort (including negligence), contract, or otherwise,
155 | unless required by applicable law (such as deliberate and grossly
156 | negligent acts) or agreed to in writing, shall any Contributor be
157 | liable to You for damages, including any direct, indirect, special,
158 | incidental, or consequential damages of any character arising as a
159 | result of this License or out of the use or inability to use the
160 | Work (including but not limited to damages for loss of goodwill,
161 | work stoppage, computer failure or malfunction, or any and all
162 | other commercial damages or losses), even if such Contributor
163 | has been advised of the possibility of such damages.
164 |
165 | 9. Accepting Warranty or Additional Liability. While redistributing
166 | the Work or Derivative Works thereof, You may choose to offer,
167 | and charge a fee for, acceptance of support, warranty, indemnity,
168 | or other liability obligations and/or rights consistent with this
169 | License. However, in accepting such obligations, You may act only
170 | on Your own behalf and on Your sole responsibility, not on behalf
171 | of any other Contributor, and only if You agree to indemnify,
172 | defend, and hold each Contributor harmless for any liability
173 | incurred by, or claims asserted against, such Contributor by reason
174 | of your accepting any such warranty or additional liability.
175 |
176 | END OF TERMS AND CONDITIONS
177 |
178 | APPENDIX: How to apply the Apache License to your work.
179 |
180 | To apply the Apache License to your work, attach the following
181 | boilerplate notice, with the fields enclosed by brackets "[]"
182 | replaced with your own identifying information. (Don't include
183 | the brackets!) The text should be enclosed in the appropriate
184 | comment syntax for the file format. We also recommend that a
185 | file or class name and description of purpose be included on the
186 | same "printed page" as the copyright notice for easier
187 | identification within third-party archives.
188 |
189 | Copyright [yyyy] [name of copyright owner]
190 |
191 | Licensed under the Apache License, Version 2.0 (the "License");
192 | you may not use this file except in compliance with the License.
193 | You may obtain a copy of the License at
194 |
195 | http://www.apache.org/licenses/LICENSE-2.0
196 |
197 | Unless required by applicable law or agreed to in writing, software
198 | distributed under the License is distributed on an "AS IS" BASIS,
199 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
200 | See the License for the specific language governing permissions and
201 | limitations under the License.
202 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | 
2 | 
3 | 
4 | 
5 | 
6 | 
7 |
8 |
9 |
10 | # Kubetools Recommender System (a.k.a. Krs)
11 |
12 | A GenAI-powered Kubetools Recommender system for your Kubernetes cluster.
13 |
14 |
15 |
16 |
17 |
18 |
19 |
20 |
21 | # Table of Contents
22 |
23 | - [Kubetools Recommender System](#kubetools-recommender-system)
24 | - [Getting Started](#getting-started)
25 | - [Clone the repository](#clone-the-repository)
26 | - [Install the Krs Tool](#install-the-krs-tool)
27 | - [Krs CLI](#krs-cli)
28 | - [Initialise and load the scanner](#initialise-and-load-the-scanner)
29 | - [Scan your cluster](#scan-your-cluster)
30 | - [Lists all the namespaces](#lists-all-the-namespaces)
31 | - [Installing sample Kubernetes Tools](#installing-sample-kubernetes-tools)
32 | - [Use scanner](#use-scanner)
33 | - [Kubetools Recommender System](#kubetools-recommender-system-1)
34 | - [Krs health](#krs-health)
35 | - [Using OpenAI](#using-openai)
36 | - [Using Hugging Face](#using-hugging-face)
37 | - [FAQs](#faqs)
38 |
39 |
40 |
41 | The main functionalities of the project include:
42 |
43 |
44 | - **Scanning the Kubernetes cluster**: The tool scans the cluster to identify the deployed pods, services, and deployments. It retrieves information about the tools used in the cluster and their rankings.
45 | - **Detecting tools from the repository**: The tool detects the tools used in the cluster by analyzing the names of the pods and deployments.
46 | - **Extracting rankings**: The tool extracts the rankings of the detected tools based on predefined criteria. It categorizes the tools into different categories and provides the rankings for each category.
47 | - **Generating recommendations**: The tool generates recommendations for Kubernetes tools based on the detected tools and their rankings. It suggests the best tools for each category and compares them with the tools already used in the cluster.
48 | - **Health check**: The tool provides a health check for a selected pod in the cluster. It extracts logs and events from the pod and analyzes them using a language model (LLM) to identify potential issues and provide recommendations for resolving them.
49 | - **Exporting pod information**: The tool exports the information about the pods, services, and deployments in the cluster to a JSON file.
50 | - **Cleaning up**: The tool provides an option to clean up the project's data directory by deleting all files and directories within it.
51 | - **Model**: Supports OpenAI and Hugging Face models
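
The name-based detection described in the list above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual implementation: the `KNOWN_TOOLS` table and the `detect_tools` helper are hypothetical, and the real scanner may match names differently. The two entries reuse the ranks and categories shown in the sample `krs scan` output later in this README.

```python
# Hypothetical ranking table: tool name -> (rank, category).
# The two entries mirror the sample `krs scan` output in this README.
KNOWN_TOOLS = {
    "kubeshark": (4, "Alert and Monitoring"),
    "portainer": (39, "Cluster Management"),
}


def detect_tools(resource_names, known_tools=KNOWN_TOOLS):
    """Return the known tools whose name appears in any pod or deployment name."""
    detected = {}
    for resource in resource_names:
        lowered = resource.lower()
        for tool, (rank, category) in known_tools.items():
            # Simple substring match; assumed here for illustration only.
            if tool in lowered:
                detected[tool] = {"rank": rank, "category": category}
    return detected


if __name__ == "__main__":
    # Made-up resource names for demonstration.
    pods = ["kubeshark-worker-x7k2p", "portainer-6d9f7c5b8-abcde", "nginx-pod"]
    for name, info in sorted(detect_tools(pods).items()):
        print(name, info["rank"], info["category"])
```

Substring matching is the simplest possible choice and can produce false positives when one tool name is contained in another resource name, so a stricter match may be preferable in practice.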
52 |
53 | ## How does it work?
54 |
55 |
56 |
57 |
58 | The project uses several Python libraries, such as typer, requests, kubernetes, and tabulate, along with the standard-library pickle module.
59 | It also uses a large language model (LLM) for the health check feature.
60 | The project's dependencies are listed in requirements.txt.
61 | The project's data, such as tool rankings, CNCF status, and Kubernetes cluster information, is stored in JSON and pickle files.
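
To make the storage approach concrete, here is a small sketch of how state could be written to and read back from JSON and pickle files. The file names (`cluster_info.json`, `scanner_state.pkl`) and the `save_state` / `load_state` helpers are assumptions made for this example and are not taken from the project's code.

```python
import json
import pickle
import tempfile
from pathlib import Path


def save_state(data_dir, cluster_info, scanner_state):
    """Write cluster info as JSON and scanner state as a pickle file."""
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)
    (data_dir / "cluster_info.json").write_text(json.dumps(cluster_info, indent=2))
    with open(data_dir / "scanner_state.pkl", "wb") as fh:
        pickle.dump(scanner_state, fh)


def load_state(data_dir):
    """Read back both files written by save_state."""
    data_dir = Path(data_dir)
    cluster_info = json.loads((data_dir / "cluster_info.json").read_text())
    with open(data_dir / "scanner_state.pkl", "rb") as fh:
        scanner_state = pickle.load(fh)
    return cluster_info, scanner_state


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        save_state(tmp, {"namespaces": ["default", "ns1"]}, {"initialized": True})
        print(load_state(tmp))
```

JSON suits data that people or other tools may want to inspect, while pickle can store arbitrary Python objects. Pickle files should only be loaded from trusted locations, because unpickling can execute arbitrary code.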
62 |
63 |
64 |
65 |
66 |
67 |
68 |
69 | ## Prerequisites:
70 |
71 | 1. A Kubernetes cluster up and running locally or in the Cloud.
72 | 2. Python 3.6+
73 |
74 | Note: If the kubeconfig path for your cluster is not the default *(~/.kube/config)*, make sure you provide it during `krs init`.
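
As an illustration of the fallback described in the note, the sketch below resolves a kubeconfig path, using the default location when none is supplied. The `resolve_kubeconfig` helper is hypothetical; how `krs init` actually accepts and handles the path is not shown in this README.

```python
import os

DEFAULT_KUBECONFIG = "~/.kube/config"


def resolve_kubeconfig(path=None):
    """Return an absolute kubeconfig path, falling back to the default."""
    chosen = path or DEFAULT_KUBECONFIG
    # Expand "~" to the home directory and normalise to an absolute path.
    return os.path.abspath(os.path.expanduser(chosen))


if __name__ == "__main__":
    print(resolve_kubeconfig())
    print(resolve_kubeconfig("~/clusters/staging.yaml"))
```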
75 |
76 | ## Tested Environment
77 |
78 | - [Docker Desktop(Mac, Linux and Windows)](https://github.com/kubetoolsca/krs?tab=readme-ov-file#getting-started)
79 | - [Minikube](https://github.com/kubetoolsca/krs/blob/main/mkc.md)
80 | - [Google Kubernetes Engine](https://github.com/kubetoolsca/krs/blob/main/gke.md)
81 | - [Amazon Elastic Kubernetes Service](eks.md)
82 | - [Azure Kubernetes Service](aks.md)
83 | - [DigitalOcean Kubernetes Cluster](dokc.md)
84 |
85 |
86 | ## Getting Started
87 |
88 |
89 | ### Clone the repository
90 |
91 | ```
92 | git clone https://github.com/kubetoolsca/krs.git
93 | ```
94 |
95 | ### Install the Krs Tool
96 |
97 | Change directory to /krs and run the following command to install krs locally on your system:
98 |
99 | ```
100 | pip install .
101 | ```
102 |
103 |
104 | ## Krs CLI
105 |
106 | ```
107 |
108 | krs --help
109 |
110 | Usage: krs [OPTIONS] COMMAND [ARGS]...
111 |
112 | krs: A command line interface to scan your Kubernetes Cluster, detect errors, provide resolutions using LLMs and recommend latest tools for your cluster
113 |
114 | ╭─ Options ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
115 | │ --install-completion Install completion for the current shell. │
116 | │ --show-completion Show completion for the current shell, to copy it or customize the installation. │
117 | │ --help Show this message and exit. │
118 | ╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
119 | ╭─ Commands ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
120 | │ exit Ends krs services safely and deletes all state files from system. Removes all cached data. │
121 | │ export Exports pod info with logs and events. │
122 | │ health Starts an interactive terminal using an LLM of your choice to detect and fix issues with your cluster │
123 | │ init Initializes the services and loads the scanner. │
124 | │ namespaces Lists all the namespaces. │
125 | │ pods Lists all the pods with namespaces, or lists pods under a specified namespace. │
126 | │ recommend Generates a table of recommended tools from our ranking database and their CNCF project status. │
127 | │ scan Scans the cluster and extracts a list of tools that are currently used. │
128 | ╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
129 | ```
130 |
131 | ## Initialise and load the scanner
132 |
133 | Run the following command to initialize the services and load the scanner.
134 |
135 |
136 | ```
137 | krs init
138 | ```
139 |
140 | ## Scan your cluster
141 |
142 | Run the following command to scan the cluster and extract a list of tools that are currently used.
143 |
144 | ```
145 | krs scan
146 | ```
147 |
148 | You will see the following results:
149 |
150 | ```
151 |
152 | Scanning your cluster...
153 |
154 | Cluster scanned successfully...
155 |
156 | Extracted tools used in cluster...
157 |
158 |
159 | The cluster is using the following tools:
160 |
161 | +-------------+--------+------------+---------------+
162 | | Tool Name | Rank | Category | CNCF Status |
163 | +=============+========+============+===============+
164 | +-------------+--------+------------+---------------+
165 | ```
166 |
167 |
168 | ## Lists all the namespaces
169 |
170 | ```
171 | krs namespaces
172 | Namespaces in your cluster are:
173 |
174 | 1. default
175 | 2. kube-node-lease
176 | 3. kube-public
177 | 4. kube-system
178 | ```
179 |
180 | ## Installing sample Kubernetes Tools
181 |
182 | This step assumes that you already have Kubernetes tools running in your infrastructure.
183 | If not, you can use the [samples/install-tools.sh](samples/install-tools.sh) script to install a set of sample tools.
184 |
185 | ```
186 | cd samples
187 | sh install-tools.sh
188 | ```
189 |
190 | ## Use scanner
191 |
192 | ```
193 | krs scan
194 |
195 | Scanning your cluster...
196 |
197 | Cluster scanned successfully...
198 |
199 | Extracted tools used in cluster...
200 |
201 |
202 | The cluster is using the following tools:
203 |
204 | +-------------+--------+----------------------+---------------+
205 | | Tool Name | Rank | Category | CNCF Status |
206 | +=============+========+======================+===============+
207 | | kubeshark | 4 | Alert and Monitoring | unlisted |
208 | +-------------+--------+----------------------+---------------+
209 | | portainer | 39 | Cluster Management | listed |
210 | +-------------+--------+----------------------+---------------+
211 | ```
212 |
213 | ## Kubetools Recommender System
214 |
215 | Generates a table of recommended tools from our ranking database and their CNCF project status.
216 |
217 |
218 | ```
219 | krs recommend
220 |
221 | Our recommended tools for this deployment are:
222 |
223 | +----------------------+------------------+-------------+---------------+
224 | | Category | Recommendation | Tool Name | CNCF Status |
225 | +======================+==================+=============+===============+
226 | | Alert and Monitoring | Recommended tool | grafana | listed |
227 | +----------------------+------------------+-------------+---------------+
228 | | Cluster Management | Recommended tool | rancher | unlisted |
229 | +----------------------+------------------+-------------+---------------+
230 | ```
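
One way to read the table above is that, for each category of a detected tool, the best-ranked tool in that category is suggested. The sketch below illustrates that idea only. The `RANKINGS` data and the `recommend` helper are hypothetical: the kubeshark and portainer ranks come from the sample scan output, while the ranks for grafana and rancher and the "Already using the top-ranked tool" label are invented for this example.

```python
# Hypothetical per-category rankings (lower rank = better). Only the kubeshark
# and portainer ranks come from the sample scan output; the rest are made up.
RANKINGS = {
    "Alert and Monitoring": {"grafana": 1, "kubeshark": 4},
    "Cluster Management": {"rancher": 1, "portainer": 39},
}


def recommend(detected, rankings=RANKINGS):
    """For each detected tool, suggest the best-ranked tool in its category.

    `detected` maps a tool name to its category.
    """
    rows = []
    for tool, category in detected.items():
        # Pick the tool with the smallest rank number in this category.
        best = min(rankings[category], key=rankings[category].get)
        if best == tool:
            verdict = "Already using the top-ranked tool"
        else:
            verdict = "Recommended tool"
        rows.append((category, verdict, best))
    return rows


if __name__ == "__main__":
    in_use = {"kubeshark": "Alert and Monitoring", "portainer": "Cluster Management"}
    for row in recommend(in_use):
        print(row)
```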
231 |
232 |
233 | ## Krs health
234 |
235 |
236 |
237 | Assume that there is an Nginx Pod in the namespace ns1:
238 |
239 | ```
240 | krs pods --namespace ns1
241 |
242 | Pods in namespace 'ns1':
243 |
244 | 1. nginx-pod
245 | ```
246 |
247 | ```
248 | krs health
249 |
250 | Starting interactive terminal...
251 |
252 |
253 | Choose the model provider for healthcheck:
254 |
255 | [1] OpenAI
256 | [2] Huggingface
257 |
258 | >>
259 | ```
260 |
261 | You are prompted to choose a model provider for the health check.
262 | The options are "OpenAI" and "Huggingface"; your choice determines which LLM is used for the health check.
263 |
264 | If you choose option "1", Krs installs the necessary libraries and then asks for your OpenAI API key and model name.
265 |
266 |
267 | ```
268 | Enter your OpenAI API key: sk-3iXXXXXTpTyyOq2mR
269 |
270 | Enter the OpenAI model name: gpt-3.5-turbo
271 | API key and model are valid.
272 |
273 | Namespaces in the cluster:
274 |
275 | 1. default
276 | 2. kube-node-lease
277 | 3. kube-public
278 | 4. kube-system
279 | 5. ns1
280 |
281 | Which namespace do you want to check the health for? Select a namespace by entering its number: >> 5
282 |
283 | Pods in the namespace ns1:
284 |
285 | 1. nginx-pod
286 |
287 | Which pod from ns1 do you want to check the health for? Select a pod by entering its number: >>
288 | Checking status of the pod...
289 |
290 | Extracting logs and events from the pod...
291 |
292 | Logs and events from the pod extracted successfully!
293 |
294 |
295 | Interactive session started. Type 'end chat' to exit from the session!
296 |
297 | >> The provided log entries are empty, as there is nothing between the curly braces {}. Therefore, everything looks good and there are no warnings or errors to report.
298 | ```
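
The model's remark about empty curly braces follows from how the prompt is assembled. The sketch below mirrors `create_prompt` from `krs/main.py`: the extracted log entries are sorted and placed between `{` and `}`, so a pod with no extracted entries yields an empty pair of braces. The sample log entries are invented for illustration.

```python
def create_prompt(log_entries):
    """Wrap sorted log entries in braces and append the analysis instructions."""
    prompt = "You are a DevOps expert with experience in Kubernetes. Analyze the following log entries:\n{\n"
    for entry in sorted(log_entries):  # sorted to keep a consistent order
        prompt += f"{entry}\n"
    prompt += "}\nIf there is nothing of concern in between { }, return a message stating that 'Everything looks good!'. Explain the warnings and errors and the steps that should be taken to resolve the issues, only if they exist."
    return prompt


# With no log entries, nothing appears between the braces.
print(create_prompt([]))

# With entries (made-up examples), each one lands on its own line inside the braces.
print(create_prompt(["E2 disk pressure on node", "E1 service account token has expired"]))
```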
299 |
300 | Now consider a pod that reports errors:
301 |
302 | ```
303 | krs health
304 |
305 | Starting interactive terminal...
306 |
307 |
308 | Do you want to continue fixing the previously selected pod ? (y/n): >> n
309 |
310 | Loading LLM State..
311 |
312 | Model: gpt-3.5-turbo
313 |
314 | Namespaces in the cluster:
315 |
316 | 1. default
317 | 2. kube-node-lease
318 | 3. kube-public
319 | 4. kube-system
320 | 5. portainer
321 |
322 | Which namespace do you want to check the health for? Select a namespace by entering its number: >> 4
323 |
324 | Pods in the namespace kube-system:
325 |
326 | 1. coredns-76f75df574-mdk6w
327 | 2. coredns-76f75df574-vg6z2
328 | 3. etcd-docker-desktop
329 | 4. kube-apiserver-docker-desktop
330 | 5. kube-controller-manager-docker-desktop
331 | 6. kube-proxy-p5hw4
332 | 7. kube-scheduler-docker-desktop
333 | 8. storage-provisioner
334 | 9. vpnkit-controller
335 |
336 | Which pod from kube-system do you want to check the health for? Select a pod by entering its number: >> 4
337 |
338 | Checking status of the pod...
339 |
340 | Extracting logs and events from the pod...
341 |
342 | Logs and events from the pod extracted successfully!
343 |
344 |
345 | Interactive session started. Type 'end chat' to exit from the session!
346 |
347 | >> Warning/Error 1:
348 | "Unable to authenticate the request" with err="[invalid bearer token, service account token has expired]"
349 | This indicates that there was an issue with authenticating the request due to an invalid bearer token and an expired service account token. To resolve this issue, the bearer token needs to be updated or regenerated, and the service account token needs to be renewed.
350 |
351 | Warning/Error 2:
352 | "Failed to update lease" with err="StorageError: invalid object, Code: 4, Key: /registry/leases/kube-system/apiserver-457xdkumpkdqmkwikdt6khrbya"
353 | This error indicates a storage error while trying to update a lease, possibly due to an invalid object or a conflict in UIDs. To resolve this issue, the lease object needs to be checked and possibly recreated with the correct UID.
354 |
355 | Warning/Error 3:
356 | "Error removing old endpoints from kubernetes service: no API server IP addresses were listed in storage, refusing to erase all endpoints for the kubernetes Service"
357 | This warning indicates that there were no API server IP addresses listed in storage, causing the removal of old endpoints from the Kubernetes service to fail. To resolve this issue, the API server IP addresses need to be correctly listed in storage.
358 |
359 | Warning/Error 4:
360 | "Skipping API apps/v1beta1 because it has no resources."
361 | This warning indicates that the API version 'apps/v1beta1' is being skipped because it has no associated resources. This might be an informational warning and may not necessarily require immediate action, depending on the use case.
362 |
363 | Warning/Error 5:
364 | "watch chan error: etcdserver: mvcc: required revision has been compacted"
365 | This error indicates a watch channel error related to the revision being compacted in etcdserver. To resolve this issue, the etcd database revision or configuration may need to be checked and potentially adjusted.
366 |
367 | Overall, it is important to investigate and address each warning and error to ensure the stability and reliability of the Kubernetes cluster. If there are no concerns in the log entries provided, the message 'Everything looks good!' can be returned.
368 |
369 | >> can you help me in fixing this error message
370 | >> To help you fix the error messages mentioned in the log entries, here are some steps you can take:
371 |
372 | 1. For the "Unable to authenticate the request" error:
373 | - Generate a new bearer token and update it in the relevant configuration files.
374 | - Renew the service account token to prevent expiration issues.
375 |
376 | 2. For the "Failed to update lease" error:
377 | - Investigate the storage error and check for any invalid objects.
378 | - Recreate the lease object with the correct UID if needed.
379 |
380 | 3. For the "Error removing old endpoints from Kubernetes service" error:
381 | - Ensure that the API server IP addresses are correctly listed in the storage.
382 | - Verify the configuration of the Kubernetes service and update it if necessary.
383 |
384 | 4. For the "watch chan error: etcdserver: mvcc: required revision has been compacted" error:
385 | - Check the etcd server configuration and ensure that the revision is maintained correctly.
386 | - Monitor the etcd server for any potential issues causing revision compaction.
387 |
388 | After performing these steps, restart relevant services or components as needed to apply the changes. Monitor the Kubernetes cluster for any further errors and ensure that the issues have been resolved successfully.
389 |
390 | Feel free to provide more specific details or additional logs if you need further assistance with resolving the error messages.
391 | ```
392 |
393 |
394 | ## Using Hugging Face
395 |
396 | ```
397 | krs health
398 |
399 | Starting interactive terminal...
400 |
401 |
402 | Choose the model provider for healthcheck:
403 |
404 | [1] OpenAI
405 | [2] Huggingface
406 |
407 | >> 2
408 |
409 | Installing necessary libraries..........
410 |
411 | transformers is already installed.
412 |
413 | torch is already installed.
414 | /opt/homebrew/lib/python3.11/site-packages/transformers/utils/generic.py:311: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
415 | torch.utils._pytree._register_pytree_node(
416 |
417 | Enter the Huggingface model name: codellama/CodeLlama-13b-hf
418 | tokenizer_config.json: 100%|█████████████████████████████████████████████| 749/749 [00:00<00:00, 768kB/s]
419 | tokenizer.model: 100%|████████████████████████████████████████████████| 500k/500k [00:00<00:00, 1.94MB/s]
420 | tokenizer.json: 100%|███████████████████████████████████████████████| 1.84M/1.84M [00:01<00:00, 1.78MB/s]
421 | special_tokens_map.json: 100%|██████████████████████████████████████████| 411/411 [00:00<00:00, 1.49MB/s]
422 | config.json: 100%|██████████████████████████████████████████████████████| 589/589 [00:00<00:00, 1.09MB/s]
423 | model.safetensors.index.json: 100%|█████████████████████████████████| 31.4k/31.4k [00:00<00:00, 13.9MB/s]
424 | ...
425 | ```
426 |
427 | ## FAQs
428 |
429 |
430 | How safe is Krs for production environments?
431 |
432 | Krs is designed to be non-invasive: it provides insights into the current state of a Kubernetes cluster without making any changes to the cluster itself. It does not store any sensitive data or credentials, and it only retrieves information from the cluster and external data sources.
433 |
434 |
435 |
436 |
437 | ## Community
438 | Find us on [Slack](https://www.launchpass.com/kubetoolsio)
439 |
440 |
441 |
442 |
443 |
444 |
445 |
446 |
447 |
448 |
449 |
450 |
--------------------------------------------------------------------------------
/aks.md:
--------------------------------------------------------------------------------
1 | # Setting up Krs for an AKS cluster on Microsoft Azure
2 |
3 | Enhance your Kubernetes cluster management on Azure with KRS, a tool that provides recommendations and health checks using AI. KRS scans your cluster to identify deployed pods, services, and deployments, analyzes the tools used, and provides rankings based on their popularity. With features like generating recommendations, performing health checks, and exporting pod information, KRS supports both OpenAI and Hugging Face models. This guide walks you through setting up KRS for an AKS cluster on Azure, from installation to advanced usage.
4 |
5 | ## Prerequisites:
6 |
7 | - An Azure account
8 | - Install Azure CLI on your laptop
9 |
10 | ## Installation of KRS:
11 |
12 | ## 1. Clone the repository using the command:
13 |
14 | ```
15 | git clone https://github.com/kubetoolsca/krs.git
16 | ```
17 |
18 | ## 2. Install the Krs Tool:
19 |
20 | Change into the `krs` directory and run the following command to install Krs locally on your system:
21 |
22 | ```
23 | pip install .
24 | ```
25 |
26 | ## 3. Check if the tool has been successfully installed using:
27 |
28 | ```
29 | krs --help
30 | ```
31 |
32 | Once you see the list of commands, you can move on to the next part.
33 |
34 | ## Create an AKS cluster on your Azure account
35 |
36 | ## 1. Create an AKS Cluster:
37 |
38 | To create an AKS cluster, log in to the Azure portal and search for Azure Kubernetes Service.
39 |
40 |
41 |
42 |
43 | Once you click create, you can name your cluster, add a node pool (I used the default agent pool but you can create your own), and leave everything else to its default state. This will help you create a cluster.
44 |
45 | ## 2. Install Azure CLI:
46 |
47 |
48 | ```
49 | brew update && brew install azure-cli
50 | ```
51 |
52 | ## 3. Log into your Azure account:
53 |
54 | Once the CLI is installed, log into your Azure account using the command:
55 |
56 | ```
57 | az login
58 | ```
59 |
60 | ## 4. Connect to Your Cluster:
61 |
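62 | Retrieve the connection command from your cluster details on the Azure portal and execute it to connect to your cluster. This is typically an `az aks get-credentials` command that takes your resource group and cluster name.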
63 |
64 |
65 |
66 |
67 |
68 | ## Using Krs
69 |
70 | ## 1. Initialise Krs:
71 |
72 | ```
73 | % krs init
74 | ```
75 |
76 | ## 2. Scan the Cluster:
77 |
78 | ```
79 | % krs scan
80 | Scanning your cluster...
81 | Cluster scanned successfully...
82 | Extracted tools used in cluster...
83 | The cluster is using the following tools:
84 | +-------------+--------+-----------------------------+---------------+
85 | | Tool Name   |   Rank | Category                    | CNCF Status   |
86 | +=============+========+=============================+===============+
87 | | autoscaler  |      5 | Cluster with Core CLI tools | unlisted      |
88 | +-------------+--------+-----------------------------+---------------+
89 | ```
90 |
91 | ## 3. Get Recommended Tools:
92 |
93 | ```
94 | % krs recommend
95 | Our recommended tools for this deployment are:
96 | +-----------------------------+------------------+-------------+---------------+
97 | | Category                    | Recommendation   | Tool Name   | CNCF Status   |
98 | +=============================+==================+=============+===============+
99 | | Cluster with Core CLI tools | Recommended tool | k9s         | unlisted      |
100 | +-----------------------------+------------------+-------------+---------------+
101 | ```
102 |
103 | ## 4. Installing a few tools:
104 |
105 | ```
106 | % brew install helm
107 | % helm install kubeview kubeview
110 | NAME: kubeview
111 | LAST DEPLOYED: Sat Jun 29 21:44:17 2024
112 | NAMESPACE: default
113 | STATUS: deployed
114 | REVISION: 1
115 | NOTES:
116 | =====================================
117 | ==== KubeView has been deployed! ====
118 | =====================================
119 | To get the external IP of your application, run the following:
120 | export SERVICE_IP=$(kubectl get svc --namespace default kubeview -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
121 | echo http://$SERVICE_IP
122 | NOTE: It may take a few minutes for the LoadBalancer IP to be available.
123 | You can watch the status of by running 'kubectl get --namespace default svc -w kubeview'
124 | ```
125 |
126 | ## 5. Export pod info with logs and events:
127 |
128 | ```
129 | % krs export
130 | Pod info with logs and events exported. Json file saved to current directory!
131 |
133 | % ls
134 | CODE_OF_CONDUCT.md arch.png gke.md kubeview
135 | CONTRIBUTIONS.md bhive.png krs samples
136 | LICENSE build krs.egg-info setup.py
137 | README.md exported_pod_info.json kubetail
138 | ```
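
The exported file can be inspected with a few lines of Python. Judging by `scan_kubernetes_deployment` in `krs/utils/cluster_scanner.py`, the pod info is a mapping from namespace to a list of `{'name': ..., 'info': ...}` entries. The sketch below lists pod names per namespace under that assumption. The helper name `pods_by_namespace` is hypothetical, and the sample data stands in for a real export, with `info` left empty because its exact fields are not shown here.

```python
import json


def pods_by_namespace(pod_info):
    """Map each namespace to the names of the pods recorded under it."""
    return {ns: [pod["name"] for pod in pods] for ns, pods in pod_info.items()}


# Sample data shaped like the export (assumed structure); with a real export you
# would instead load the file written by `krs export`, for example:
#     with open("exported_pod_info.json") as f:
#         pod_info = json.load(f)
sample = {
    "default": [{"name": "kubeview-64fd5d8b8c-khv8v", "info": {}}],
    "kube-system": [],
}
pod_info = json.loads(json.dumps(sample))  # round trip to mimic reading JSON

print(pods_by_namespace(pod_info))
```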
139 |
140 | ## 6. Detecting and fixing issues with your cluster:
141 |
142 | ```
143 | % krs health
144 | Starting interactive terminal...
145 | Choose the model provider for healthcheck:
146 | [1] OpenAI
147 | [2] Huggingface
148 | >> 1
149 | Installing necessary libraries.........
150 | openai is already installed.
151 | Enter your OpenAI API key: sk-proj-xxxxxxx
152 | Enter the OpenAI model name: gpt-3.5-turbo
153 | API key and model are valid.
154 | Namespaces in the cluster:
155 | 1. default
156 | 2. kube-node-lease
157 | 3. kube-public
158 | 4. kube-system
159 | Which namespace do you want to check the health for? Select a namespace by entering its number:
160 | >> 1
161 | Pods in the namespace default:
162 | 1. kubeview-64fd5d8b8c-khv8v
163 | Which pod from default do you want to check the health for? Select a pod by entering its number:
164 | >> 1
165 | Checking status of the pod...
166 | Extracting logs and events from the pod...
167 | Logs and events from the pod extracted successfully!
168 | Interactive session started. Type 'end chat' to exit from the session!
169 | >> Everything looks good!
170 | Since the log entries provided are empty, there are no warnings or errors to analyze or address. If there were actual log entries to review, common steps to resolve potential issues in a Kubernetes environment could include:
171 | 1. Checking the configuration files for any errors or inconsistencies.
172 | 2. Verifying that all necessary resources (e.g. pods, services, deployments) are running as expected.
173 | 3. Monitoring the cluster for any performance issues or resource constraints.
174 | 4. Troubleshooting any networking problems that may be impacting connectivity.
175 | 5. Updating Kubernetes components or applying patches as needed to ensure system stability and security.
176 | 6. Checking logs of specific pods or services for more detailed error messages to pinpoint the root cause of any issues.
177 | >> 2
178 | >> Since the log entries are still empty, the response remains the same: Everything looks good! If you encounter any specific issues or errors in the future, feel free to provide the logs for further analysis and troubleshooting.
179 | ```
180 |
181 | Using KRS, you can identify and optimize the tools within your Kubernetes clusters, whether on-premises or in the public cloud. The `krs recommend` command, in particular, suggests tools that may be better suited to your cluster's needs, which makes KRS a useful asset for SRE and DevOps engineers and teams.
182 |
--------------------------------------------------------------------------------
/arch.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kubetoolsca/krs/f2af89aee7c3317d67dc51eb016168a948bc70d3/arch.png
--------------------------------------------------------------------------------
/bhive.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kubetoolsca/krs/f2af89aee7c3317d67dc51eb016168a948bc70d3/bhive.png
--------------------------------------------------------------------------------
/build/lib/krs/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kubetoolsca/krs/f2af89aee7c3317d67dc51eb016168a948bc70d3/build/lib/krs/__init__.py
--------------------------------------------------------------------------------
/build/lib/krs/krs.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python3
2 |
3 | import typer, os
4 | from krs.main import KrsMain
5 | from krs.utils.constants import KRSSTATE_PICKLE_FILEPATH, KRS_DATA_DIRECTORY
6 |
7 | app = typer.Typer(help="krs: A command line interface to scan your Kubernetes Cluster, detect errors, provide resolutions using LLMs and recommend latest tools for your cluster")
8 | krs = KrsMain()
9 |
10 | def check_initialized():
11 |     if not os.path.exists(KRSSTATE_PICKLE_FILEPATH):
12 |         typer.echo("KRS is not initialized. Please run 'krs init' first.")
13 |         raise typer.Exit()
14 |
15 | if not os.path.exists(KRS_DATA_DIRECTORY):
16 |     os.mkdir(KRS_DATA_DIRECTORY)
17 |
18 | @app.command()
19 | def init(kubeconfig: str = typer.Option('~/.kube/config', help="Custom path for kubeconfig file if not default")):
20 |     """
21 |     Initializes the services and loads the scanner.
22 |     """
23 |     krs.initialize(kubeconfig)
24 |     typer.echo("Services initialized and scanner loaded.")
25 |
26 | @app.command()
27 | def scan():
28 |     """
29 |     Scans the cluster and extracts a list of tools that are currently used.
30 |     """
31 |     check_initialized()
32 |     krs.scan_cluster()
33 |
34 |
35 | @app.command()
36 | def namespaces():
37 |     """
38 |     Lists all the namespaces.
39 |     """
40 |     check_initialized()
41 |     namespaces = krs.list_namespaces()
42 |     typer.echo("Namespaces in your cluster are: \n")
43 |     for i, namespace in enumerate(namespaces):
44 |         typer.echo(str(i+1)+ ". "+ namespace)
45 |
46 | @app.command()
47 | def pods(namespace: str = typer.Option(None, help="Specify namespace to list pods from")):
48 |     """
49 |     Lists all the pods with namespaces, or lists pods under a specified namespace.
50 |     """
51 |     check_initialized()
52 |     if namespace:
53 |         pods = krs.list_pods(namespace)
54 |         if pods == 'wrong namespace name':
55 |             typer.echo(f"\nWrong namespace name entered, try again!\n")
56 |             raise typer.Abort()
57 |         typer.echo(f"\nPods in namespace '{namespace}': \n")
58 |     else:
59 |         pods = krs.list_pods_all()
60 |         typer.echo("\nAll pods in the cluster: \n")
61 |
62 |     for i, pod in enumerate(pods):
63 |         typer.echo(str(i+1)+ '. '+ pod)
64 |
65 | @app.command()
66 | def recommend():
67 |     """
68 |     Generates a table of recommended tools from our ranking database and their CNCF project status.
69 |     """
70 |     check_initialized()
71 |     krs.generate_recommendations()
72 |
73 | @app.command()
74 | def health(change_model: bool = typer.Option(False, help="Option to reinitialize/change the LLM, if set to True"),
75 |            device: str = typer.Option('cpu', help='Option to run Huggingface models on GPU by entering the option as "gpu"')):
76 |     """
77 |     Starts an interactive terminal using an LLM of your choice to detect and fix issues with your cluster
78 |     """
79 |     check_initialized()
80 |     typer.echo("\nStarting interactive terminal...\n")
81 |     krs.health_check(change_model, device)
82 |
83 | @app.command()
84 | def export():
85 |     """
86 |     Exports pod info with logs and events.
87 |     """
88 |     check_initialized()
89 |     krs.export_pod_info()
90 |     typer.echo("Pod info with logs and events exported. Json file saved to current directory!")
91 |
92 | @app.command()
93 | def exit():
94 |     """
95 |     Ends krs services safely and deletes all state files from system. Removes all cached data.
96 |     """
97 |     check_initialized()
98 |     krs.exit()
99 |     typer.echo("Krs services closed safely.")
100 |
101 | if __name__ == "__main__":
102 |     app()
103 |
--------------------------------------------------------------------------------
/build/lib/krs/main.py:
--------------------------------------------------------------------------------
1 | from krs.utils.fetch_tools_krs import krs_tool_ranking_info
2 | from krs.utils.cluster_scanner import KubetoolsScanner
3 | from krs.utils.llm_client import KrsGPTClient
4 | from krs.utils.functional import extract_log_entries, CustomJSONEncoder
5 | import os, pickle, time, json
6 | from tabulate import tabulate
7 | from krs.utils.constants import (KRSSTATE_PICKLE_FILEPATH, LLMSTATE_PICKLE_FILEPATH, POD_INFO_FILEPATH, KRS_DATA_DIRECTORY)
8 |
9 | class KrsMain:
10 |
11 |     def __init__(self):
12 |
13 |         self.pod_info = None
14 |         self.pod_list = None
15 |         self.namespaces = None
16 |         self.deployments = None
17 |         self.state_file = KRSSTATE_PICKLE_FILEPATH
18 |         self.isClusterScanned = False
19 |         self.continue_chat = False
20 |         self.logs_extracted = []
21 |         self.scanner = None
22 |         self.get_events = True
23 |         self.get_logs = True
24 |         self.cluster_tool_list = None
25 |         self.detailed_cluster_tool_list = None
26 |         self.category_cluster_tools_dict = None
27 |
28 |         self.load_state()
29 |
30 |     def initialize(self, config_file='~/.kube/config'):
31 |         self.config_file = config_file
32 |         self.tools_dict, self.category_dict, cncf_status_dict = krs_tool_ranking_info()
33 |         self.cncf_status = cncf_status_dict['cncftools']
34 |         self.scanner = KubetoolsScanner(self.get_events, self.get_logs, self.config_file)
35 |         self.save_state()
36 |
37 |     def save_state(self):
38 |         state = {
39 |             'pod_info': self.pod_info,
40 |             'pod_list': self.pod_list,
41 |             'namespaces': self.namespaces,
42 |             'deployments': self.deployments,
43 |             'cncf_status': self.cncf_status,
44 |             'tools_dict': self.tools_dict,
45 |             'category_tools_dict': self.category_dict,
46 |             'extracted_logs': self.logs_extracted,
47 |             'kubeconfig': self.config_file,
48 |             'isScanned': self.isClusterScanned,
49 |             'cluster_tool_list': self.cluster_tool_list,
50 |             'detailed_tool_list': self.detailed_cluster_tool_list,
51 |             'category_tool_list': self.category_cluster_tools_dict
52 |         }
53 |         os.makedirs(os.path.dirname(self.state_file), exist_ok=True)
54 |         with open(self.state_file, 'wb') as f:
55 |             pickle.dump(state, f)
56 |
57 |     def load_state(self):
58 |         if os.path.exists(self.state_file):
59 |             with open(self.state_file, 'rb') as f:
60 |                 state = pickle.load(f)
61 |                 self.pod_info = state.get('pod_info')
62 |                 self.pod_list = state.get('pod_list')
63 |                 self.namespaces = state.get('namespaces')
64 |                 self.deployments = state.get('deployments')
65 |                 self.cncf_status = state.get('cncf_status')
66 |                 self.tools_dict = state.get('tools_dict')
67 |                 self.category_dict = state.get('category_tools_dict')
68 |                 self.logs_extracted = state.get('extracted_logs')
69 |                 self.config_file = state.get('kubeconfig')
70 |                 self.isClusterScanned = state.get('isScanned')
71 |                 self.cluster_tool_list = state.get('cluster_tool_list')
72 |                 self.detailed_cluster_tool_list = state.get('detailed_tool_list')
73 |                 self.category_cluster_tools_dict = state.get('category_tool_list')
74 |             self.scanner = KubetoolsScanner(self.get_events, self.get_logs, self.config_file)
75 |
76 |     def check_scanned(self):
77 |         if not self.isClusterScanned:
78 |             self.pod_list, self.pod_info, self.deployments, self.namespaces = self.scanner.scan_kubernetes_deployment()
79 |             self.save_state()
80 |
81 |     def list_namespaces(self):
82 |         self.check_scanned()
83 |         return self.scanner.list_namespaces()
84 |
85 |     def list_pods(self, namespace):
86 |         self.check_scanned()
87 |         if namespace not in self.list_namespaces():
88 |             return "wrong namespace name"
89 |         return self.scanner.list_pods(namespace)
90 |
91 |     def list_pods_all(self):
92 |         self.check_scanned()
93 |         return self.scanner.list_pods_all()
94 |
95 |     def detect_tools_from_repo(self):
96 |         tool_set = set()
97 |         for pod in self.pod_list:
98 |             for service_name in pod.split('-'):
99 |                 if service_name in self.tools_dict.keys():
100 |                     tool_set.add(service_name)
101 |
102 |         for dep in self.deployments:
103 |             for service_name in dep.split('-'):
104 |                 if service_name in self.tools_dict.keys():
105 |                     tool_set.add(service_name)
106 |
107 |         return list(tool_set)
108 |
109 |     def extract_rankings(self):
110 |         tool_dict = {}
111 |         category_tools_dict = {}
112 |         for tool in self.cluster_tool_list:
113 |             tool_details = self.tools_dict[tool]
114 |             for detail in tool_details:
115 |                 rank = detail['rank']
116 |                 category = detail['category']
117 |                 if category not in category_tools_dict:
118 |                     category_tools_dict[category] = []
119 |                 category_tools_dict[category].append(rank)
120 |
121 |             tool_dict[tool] = tool_details
122 |
123 |         return tool_dict, category_tools_dict
124 |
125 |     def generate_recommendations(self):
126 |
127 |         if not self.isClusterScanned:
128 |             self.scan_cluster()
129 |
130 |         self.print_recommendations()
131 |
132 |     def scan_cluster(self):
133 |
134 |         print("\nScanning your cluster...\n")
135 |         self.pod_list, self.pod_info, self.deployments, self.namespaces = self.scanner.scan_kubernetes_deployment()
136 |         self.isClusterScanned = True
137 |         print("Cluster scanned successfully...\n")
138 |         self.cluster_tool_list = self.detect_tools_from_repo()
139 |         print("Extracted tools used in cluster...\n")
140 |         self.detailed_cluster_tool_list, self.category_cluster_tools_dict = self.extract_rankings()
141 |
142 |         self.print_scan_results()
143 |         self.save_state()
144 |
145 |     def print_scan_results(self):
146 |         scan_results = []
147 |
148 |         for tool, details in self.detailed_cluster_tool_list.items():
149 |             first_entry = True
150 |             for detail in details:
151 |                 row = [tool if first_entry else "", detail['rank'], detail['category'], self.cncf_status.get(tool, 'unlisted')]
152 |                 scan_results.append(row)
153 |                 first_entry = False
154 |
155 |         print("\nThe cluster is using the following tools:\n")
156 |         print(tabulate(scan_results, headers=["Tool Name", "Rank", "Category", "CNCF Status"], tablefmt="grid"))
157 |
158 |     def print_recommendations(self):
159 |         recommendations = []
160 |
161 |         for category, ranks in self.category_cluster_tools_dict.items():
162 |             rank = ranks[0]
163 |             recommended_tool = self.category_dict[category][1]['name']
164 |             status = self.cncf_status.get(recommended_tool, 'unlisted')
165 |             if rank == 1:
166 |                 row = [category, "Already using the best", recommended_tool, status]
167 |             else:
168 |                 row = [category, "Recommended tool", recommended_tool, status]
169 |             recommendations.append(row)
170 |
171 |         print("\nOur recommended tools for this deployment are:\n")
172 |         print(tabulate(recommendations, headers=["Category", "Recommendation", "Tool Name", "CNCF Status"], tablefmt="grid"))
173 |
174 |
175 |     def health_check(self, change_model=False, device='cpu'):
176 |
177 |         if os.path.exists(LLMSTATE_PICKLE_FILEPATH) and not change_model:
178 |             continue_previous_chat = input("\nDo you want to continue fixing the previously selected pod ? (y/n): >> ")
179 |             while True:
180 |                 if continue_previous_chat not in ['y', 'n']:
181 |                     continue_previous_chat = input("\nPlease enter one of the given options ? (y/n): >> ")
182 |                 else:
183 |                     break
184 |
185 |             if continue_previous_chat=='y':
186 |                 krsllmclient = KrsGPTClient(device=device)
187 |                 self.continue_chat = True
188 |             else:
189 |                 krsllmclient = KrsGPTClient(reset_history=True, device=device)
190 |
191 |         else:
192 |             krsllmclient = KrsGPTClient(reinitialize=True, device=device)
193 |             self.continue_chat = False
194 |
195 |         if not self.continue_chat:
196 |
197 |             self.check_scanned()
198 |
199 |             print("\nNamespaces in the cluster:\n")
200 |             namespaces = self.list_namespaces()
201 |             namespace_len = len(namespaces)
202 |             for i, namespace in enumerate(namespaces, start=1):
203 |                 print(f"{i}. {namespace}")
204 |
205 |             self.selected_namespace_index = int(input("\nWhich namespace do you want to check the health for? Select a namespace by entering its number: >> "))
206 |             while True:
207 |                 if self.selected_namespace_index not in list(range(1, namespace_len+1)):
208 |                     self.selected_namespace_index = int(input(f"\nWrong input! Select a namespace number between {1} to {namespace_len}: >> "))
209 |                 else:
210 |                     break
211 |
212 |             self.selected_namespace = namespaces[self.selected_namespace_index - 1]
213 |             pod_list = self.list_pods(self.selected_namespace)
214 |             pod_len = len(pod_list)
215 |             print(f"\nPods in the namespace {self.selected_namespace}:\n")
216 |             for i, pod in enumerate(pod_list, start=1):
217 |                 print(f"{i}. {pod}")
218 |             self.selected_pod_index = int(input(f"\nWhich pod from {self.selected_namespace} do you want to check the health for? Select a pod by entering its number: >> "))
219 |
220 |             while True:
221 |                 if self.selected_pod_index not in list(range(1, pod_len+1)):
222 |                     self.selected_pod_index = int(input(f"\nWrong input! Select a pod number between {1} to {pod_len}: >> "))
223 |                 else:
224 |                     break
225 |
226 |             print("\nChecking status of the pod...")
227 |
228 |             print("\nExtracting logs and events from the pod...")
229 |
230 |             logs_from_pod = self.get_logs_from_pod(self.selected_namespace_index, self.selected_pod_index)
231 |
232 |             self.logs_extracted = extract_log_entries(logs_from_pod)
233 |
234 |             print("\nLogs and events from the pod extracted successfully!\n")
235 |
236 |         prompt_to_llm = self.create_prompt(self.logs_extracted)
237 |
238 |         krsllmclient.interactive_session(prompt_to_llm)
239 |
240 |         self.save_state()
241 |
242 |     def get_logs_from_pod(self, namespace_index, pod_index):
243 |         try:
244 |             namespace_index -= 1
245 |             pod_index -= 1
246 |             namespace = list(self.list_namespaces())[namespace_index]
247 |             return list(self.pod_info[namespace][pod_index]['info']['Logs'].values())[0]
248 |         except KeyError as e:
249 |             print("\nKindly enter a value from the available namespaces and pods")
250 |             return None
251 |
252 |     def create_prompt(self, log_entries):
253 |         prompt = "You are a DevOps expert with experience in Kubernetes. Analyze the following log entries:\n{\n"
254 |         for entry in sorted(log_entries): # Sort to maintain consistent order
255 |             prompt += f"{entry}\n"
256 |         prompt += "}\nIf there is nothing of concern in between { }, return a message stating that 'Everything looks good!'. Explain the warnings and errors and the steps that should be taken to resolve the issues, only if they exist."
257 |         return prompt
258 |
259 |     def export_pod_info(self):
260 |
261 |         self.check_scanned()
262 |
263 |         with open(POD_INFO_FILEPATH, 'w') as f:
264 |             json.dump(self.pod_info, f, cls=CustomJSONEncoder)
265 |
266 |
267 |     def exit(self):
268 |
269 |         try:
270 |             # List all files and directories in the given directory
271 |             files = os.listdir(KRS_DATA_DIRECTORY)
272 |             for file in files:
273 |                 file_path = os.path.join(KRS_DATA_DIRECTORY, file)
274 |                 # Check if it's a file and not a directory
275 |                 if os.path.isfile(file_path):
276 |                     os.remove(file_path) # Delete the file
277 |                     print(f"Deleted file: {file_path}")
278 |
279 |         except Exception as e:
280 |             print(f"Error occurred: {e}")
281 |
282 |     def main(self):
283 |         self.scan_cluster()
284 |         self.generate_recommendations()
285 |         self.health_check()
286 |
287 |
288 | if __name__=='__main__':
289 |     recommender = KrsMain()
290 |     recommender.main()
291 |     # logs_info = recommender.get_logs_from_pod(4,2)
292 |     # print(logs_info)
293 |     # logs = recommender.extract_log_entries(logs_info)
294 |     # print(logs)
295 |     # print(recommender.create_prompt(logs))
297 |
--------------------------------------------------------------------------------
/build/lib/krs/utils/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kubetoolsca/krs/f2af89aee7c3317d67dc51eb016168a948bc70d3/build/lib/krs/utils/__init__.py
--------------------------------------------------------------------------------
/build/lib/krs/utils/cluster_scanner.py:
--------------------------------------------------------------------------------
1 | from kubernetes import client, config
2 | import logging
3 |
4 | class KubetoolsScanner:
5 |     def __init__(self, get_events=True, get_logs=True, config_file='~/.kube/config'):
6 |         self.get_events = get_events
7 |         self.get_logs = get_logs
8 |         self.config_file = config_file
9 |         self.v1 = None
10 |         self.v2 = None
11 |         self.setup_kubernetes_client()
12 |
13 |     def setup_kubernetes_client(self):
14 |         try:
15 |             config.load_kube_config(config_file=self.config_file)
16 |             self.v1 = client.AppsV1Api()
17 |             self.v2 = client.CoreV1Api()
18 |         except Exception as e:
19 |             logging.error("Failed to load Kubernetes configuration: %s", e)
20 |             raise
21 |
22 |     def scan_kubernetes_deployment(self):
23 |         try:
24 |             deployments = self.v1.list_deployment_for_all_namespaces()
25 |             namespaces = self.list_namespaces()
26 |         except Exception as e:
27 |             logging.error("Error fetching data from Kubernetes API: %s", e)
28 |             return [], {}, [], []  # Same four-value shape as the success path
29 |
30 |         pod_dict = {}
31 |         pod_list = []
32 |         for name in namespaces:
33 |             pods = self.list_pods(name)
34 |             pod_list += pods
35 |             pod_dict[name] = [{'name': pod, 'info': self.get_pod_info(name, pod, self.get_events, self.get_logs)} for pod in pods]
36 |
37 |         deployment_list = [dep.metadata.name for dep in deployments.items]
38 |         return pod_list, pod_dict, deployment_list, namespaces
39 |
40 |     def list_namespaces(self):
41 |         namespaces = self.v2.list_namespace()
42 |         return [namespace.metadata.name for namespace in namespaces.items]
43 |
44 |     def list_pods_all(self):
45 |         pods = self.v2.list_pod_for_all_namespaces()
46 |         return [pod.metadata.name for pod in pods.items]
47 |
48 |     def list_pods(self, namespace):
49 |         pods = self.v2.list_namespaced_pod(namespace)
50 |         return [pod.metadata.name for pod in pods.items]
51 |
52 |     def get_pod_info(self, namespace, pod, include_events=True, include_logs=True):
53 |         """
54 |         Retrieves information about a specific pod in a given namespace.
55 |
56 |         Args:
57 |             namespace (str): The namespace of the pod.
58 |             pod (str): The name of the pod.
59 |             include_events (bool): Flag indicating whether to include events associated with the pod.
60 |             include_logs (bool): Flag indicating whether to include logs of the pod.
61 |
62 |         Returns:
63 |             dict: A dictionary containing the pod information, events (if include_events is True), and logs (if include_logs is True).
64 |         """
65 |         pod_info = self.v2.read_namespaced_pod(pod, namespace)
66 |         pod_info_map = pod_info.to_dict()
67 |         pod_info_map["metadata"]["managed_fields"] = None  # Clean up metadata
68 |
69 |         info = {'PodInfo': pod_info_map}
70 |
71 |         if include_events:
72 |             info['Events'] = self.fetch_pod_events(namespace, pod)
73 |
74 |         if include_logs:
75 |             # Retrieve logs for all containers within the pod
76 |             container_logs = {}
77 |             for container in pod_info.spec.containers:
78 |                 try:
79 |                     logs = self.v2.read_namespaced_pod_log(name=pod, namespace=namespace, container=container.name)
80 |                     container_logs[container.name] = logs
81 |                 except Exception as e:
82 |                     logging.error("Failed to fetch logs for container %s in pod %s: %s", container.name, pod, e)
83 |                     container_logs[container.name] = "Error fetching logs: " + str(e)
84 |             info['Logs'] = container_logs
85 |
86 |         return info
87 |
88 |     def fetch_pod_events(self, namespace, pod):
89 |         events = self.v2.list_namespaced_event(namespace)
90 |         return [{
91 |             'Name': event.metadata.name,
92 |             'Message': event.message,
93 |             'Reason': event.reason
94 |         } for event in events.items if event.involved_object.name == pod]
95 |
96 |
97 | if __name__ == '__main__':
98 |
99 |     scanner = KubetoolsScanner()
100 |     pod_list, pod_info, deployments, namespaces = scanner.scan_kubernetes_deployment()
101 |     print("POD List: \n\n", pod_list)
102 |     print("\n\nPOD Info: \n\n", pod_info.keys())
103 |     print("\n\nNamespaces: \n\n", namespaces)
104 |     print("\n\nDeployments : \n\n", deployments)
105 |
106 |
--------------------------------------------------------------------------------
/build/lib/krs/utils/constants.py:
--------------------------------------------------------------------------------
1 | KUBETOOLS_JSONPATH = 'krs/data/kubetools_data.json'
2 | KUBETOOLS_DATA_JSONURL = 'https://raw.githubusercontent.com/Kubetools-Technologies-Inc/kubetools_data/main/data/kubetools_data.json'
3 |
4 | CNCF_YMLPATH = 'krs/data/landscape.yml'
5 | CNCF_YMLURL = 'https://raw.githubusercontent.com/cncf/landscape/master/landscape.yml'
6 | CNCF_TOOLS_JSONPATH = 'krs/data/cncf_tools.json'
7 |
8 | TOOLS_RANK_JSONPATH = 'krs/data/tools_rank.json'
9 | CATEGORY_RANK_JSONPATH = 'krs/data/category_rank.json'
10 |
11 | LLMSTATE_PICKLE_FILEPATH = 'krs/data/llmstate.pkl'
12 | KRSSTATE_PICKLE_FILEPATH = 'krs/data/krsstate.pkl'
13 |
14 | POD_INFO_FILEPATH = './exported_pod_info.json'
15 |
16 | MAX_OUTPUT_TOKENS = 512
17 |
18 | KRS_DATA_DIRECTORY = 'krs/data'
19 |
--------------------------------------------------------------------------------
/build/lib/krs/utils/fetch_tools_krs.py:
--------------------------------------------------------------------------------
1 | import json
2 | import requests
3 | import yaml
4 | from krs.utils.constants import (KUBETOOLS_DATA_JSONURL, KUBETOOLS_JSONPATH, CNCF_YMLPATH, CNCF_YMLURL, CNCF_TOOLS_JSONPATH, TOOLS_RANK_JSONPATH, CATEGORY_RANK_JSONPATH)
5 |
6 | # Function to convert 'githubStars' to a float, or return 0 if it is missing or cannot be converted
7 | def get_github_stars(tool):
8 |     stars = tool.get('githubStars', 0)
9 |     try:
10 |         return float(stars)
11 |     except (ValueError, TypeError):  # TypeError covers values such as None
12 |         return 0.0
13 |
14 | # Function to download and save a file
15 | def download_file(url, filename):
16 |     response = requests.get(url)
17 |     response.raise_for_status()  # Ensure we notice bad responses
18 |     with open(filename, 'wb') as file:
19 |         file.write(response.content)
20 |
21 | def parse_yaml_to_dict(yaml_file_path):
22 |     with open(yaml_file_path, 'r') as file:
23 |         data = yaml.safe_load(file)
24 |
25 |     cncftools = {}
26 |
27 |     for category in data.get('landscape', []):
28 |         for subcategory in category.get('subcategories', []):
29 |             for item in subcategory.get('items', []):
30 |                 item_name = item.get('name').lower()
31 |                 project_status = item.get('project', 'listed')
32 |                 cncftools[item_name] = project_status
33 |
34 |     return {'cncftools': cncftools}
35 |
36 | def save_json_file(jsondict, jsonpath):
37 |
38 |     # Write the given dictionary to a JSON file
39 |     with open(jsonpath, 'w') as f:
40 |         json.dump(jsondict, f, indent=4)
41 |
42 |
43 | def krs_tool_ranking_info():
44 |     # New dictionaries
45 |     tools_dict = {}
46 |     category_tools_dict = {}
47 |
48 |     download_file(KUBETOOLS_DATA_JSONURL, KUBETOOLS_JSONPATH)
49 |     download_file(CNCF_YMLURL, CNCF_YMLPATH)
50 |
51 |     with open(KUBETOOLS_JSONPATH) as f:
52 |         data = json.load(f)
53 |
54 |     for category in data:
55 |         # Sort the tools in the current category by the number of GitHub stars
56 |         sorted_tools = sorted(category['tools'], key=get_github_stars, reverse=True)
57 |
58 |         for i, tool in enumerate(sorted_tools, start=1):
59 |             tool["name"] = tool['name'].replace("\t", "").lower()
60 |             tool['ranking'] = i
61 |
62 |             # Update tools_dict
63 |             tools_dict.setdefault(tool['name'], []).append({
64 |                 'rank': i,
65 |                 'category': category['category']['name'],
66 |                 'url': tool['link']
67 |             })
68 |
69 |             # Update category_tools_dict
70 |             category_tools_dict.setdefault(category['category']['name'], {}).update({i: {'name': tool['name'], 'url': tool['link']}})
71 |
72 |
73 |     cncf_tools_dict = parse_yaml_to_dict(CNCF_YMLPATH)
74 |     save_json_file(cncf_tools_dict, CNCF_TOOLS_JSONPATH)
75 |     save_json_file(tools_dict, TOOLS_RANK_JSONPATH)
76 |     save_json_file(category_tools_dict, CATEGORY_RANK_JSONPATH)
77 |
78 |     return tools_dict, category_tools_dict, cncf_tools_dict
79 |
80 | if __name__ == '__main__':
81 |     tools_dict, category_tools_dict, cncf_tools_dict = krs_tool_ranking_info()
82 |     print(cncf_tools_dict)
83 |
84 |
--------------------------------------------------------------------------------
/build/lib/krs/utils/functional.py:
--------------------------------------------------------------------------------
1 | from difflib import SequenceMatcher
2 | import re, json
3 | from datetime import datetime
4 |
5 | class CustomJSONEncoder(json.JSONEncoder):
6 |     """JSON Encoder for complex objects not serializable by default json code."""
7 |     def default(self, obj):
8 |         if isinstance(obj, datetime):
9 |             # Format datetime object as a string in ISO 8601 format
10 |             return obj.isoformat()
11 |         # Let the base class default method raise the TypeError
12 |         return json.JSONEncoder.default(self, obj)
13 |
14 | def similarity(a, b):
15 |     return SequenceMatcher(None, a, b).ratio()
16 |
17 | def filter_similar_entries(log_entries):
18 |     unique_entries = list(log_entries)
19 |     to_remove = set()
20 |
21 |     # Compare each pair of log entries
22 |     for i in range(len(unique_entries)):
23 |         for j in range(i + 1, len(unique_entries)):
24 |             if similarity(unique_entries[i], unique_entries[j]) > 0.85:
25 |                 # Remove the longer entry of the pair, or the second one if they are the same length
26 |                 if len(unique_entries[i]) > len(unique_entries[j]):
27 |                     to_remove.add(unique_entries[i])
28 |                 else:
29 |                     to_remove.add(unique_entries[j])
30 |
31 |     # Filter out the highly similar entries
32 |     filtered_entries = {entry for entry in unique_entries if entry not in to_remove}
33 |     return filtered_entries
34 |
35 | def extract_log_entries(log_contents):
36 |     # Patterns to match different log formats
37 |     patterns = [
38 |         re.compile(r'\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{6}Z\s+(warn|error)\s+\S+\s+(.*)', re.IGNORECASE),
39 |         re.compile(r'[WE]\d{4} \d{2}:\d{2}:\d{2}\.\d+\s+\d+\s+(.*)'),
40 |         re.compile(r'({.*})')
41 |     ]
42 |
43 |     log_entries = set()
44 |     # Attempt to match each line with all patterns
45 |     for line in log_contents.split('\n'):
46 |         for pattern in patterns:
47 |             match = pattern.search(line)
48 |             if match:
49 |                 if match.groups()[0].startswith('{'):
50 |                     # Handle JSON formatted log entries
51 |                     try:
52 |                         log_json = json.loads(match.group(1))
53 |                         if 'severity' in log_json and log_json['severity'].lower() in ['error', 'warning']:
54 |                             level = "Error" if log_json['severity'].lower() == "error" else "Warning"
55 |                             message = log_json.get('error', '') if 'error' in log_json.keys() else line
56 |                             log_entries.add(f"{level}: {message.strip()}")
57 |                         elif 'level' in log_json:
58 |                             level = "Error" if log_json['level'] == "error" else "Warning"
59 |                             message = log_json.get('msg', '') + log_json.get('error', '')
60 |                             log_entries.add(f"{level}: {message.strip()}")
61 |                     except json.JSONDecodeError:
62 |                         continue  # Skip if JSON is not valid
63 |                 else:
64 |                     if len(match.groups()) == 2:
65 |                         level, message = match.groups()
66 |                     elif len(match.groups()) == 1:
67 |                         message = match.group(1)  # Assuming error as default
68 |                         level = "ERROR"  # Default if not specified in the log
69 |
70 |                     level = "Error" if "error" in level.lower() else "Warning"
71 |                     formatted_message = f"{level}: {message.strip()}"
72 |                     log_entries.add(formatted_message)
73 |                 break  # Stop after the first match
74 |
75 |     return filter_similar_entries(log_entries)
--------------------------------------------------------------------------------
/build/lib/krs/utils/llm_client.py:
--------------------------------------------------------------------------------
1 | import pickle
2 | import subprocess
3 | import os, time
4 | from krs.utils.constants import (MAX_OUTPUT_TOKENS, LLMSTATE_PICKLE_FILEPATH)
5 |
6 | class KrsGPTClient:
7 |
8 |     def __init__(self, reinitialize=False, reset_history=False, device='cpu'):
9 |
10 |         self.reinitialize = reinitialize
11 |         self.client = None
12 |         self.pipeline = None
13 |         self.provider = None
14 |         self.model = None
15 |         self.openai_api_key = None
16 |         self.continue_chat = False
17 |         self.history = []
18 |         self.max_tokens = MAX_OUTPUT_TOKENS
19 |         self.device = device
20 |
21 |
22 |         if not self.reinitialize:
23 |             print("\nLoading LLM State..")
24 |             self.load_state()
25 |             print("\nModel: ", self.model)
26 |         if not self.model:
27 |             self.initialize_client()
28 |
29 |         self.history = [] if reset_history else self.history
30 |
31 |         if self.history:
32 |             continue_chat = input("\n\nDo you want to continue previous chat ? (y/n) >> ")
33 |             while continue_chat not in ['y', 'n']:
34 |                 print("Please enter either y or n!")
35 |                 continue_chat = input("\nDo you want to continue previous chat ? (y/n) >> ")
36 |             if continue_chat == 'n':  # The loop above only lets 'y' or 'n' through
37 |                 self.history = []
38 |             else:
39 |                 self.continue_chat = True
40 |
41 |     def save_state(self, filename=LLMSTATE_PICKLE_FILEPATH):
42 |         state = {
43 |             'provider': self.provider,
44 |             'model': self.model,
45 |             'history': self.history,
46 |             'openai_api_key': self.openai_api_key
47 |         }
48 |         with open(filename, 'wb') as output:
49 |             pickle.dump(state, output, pickle.HIGHEST_PROTOCOL)
50 |
51 |     def load_state(self):
52 |         try:
53 |             with open(LLMSTATE_PICKLE_FILEPATH, 'rb') as f:
54 |                 state = pickle.load(f)
55 |                 self.provider = state['provider']
56 |                 self.model = state['model']
57 |                 self.history = state.get('history', [])
58 |                 self.openai_api_key = state.get('openai_api_key', '')
59 |                 if self.provider == 'OpenAI':
60 |                     self.init_openai_client(reinitialize=True)
61 |                 elif self.provider == 'huggingface':
62 |                     self.init_huggingface_client(reinitialize=True)
63 |         except (FileNotFoundError, EOFError):
64 |             pass
65 |
66 |     def install_package(self, package_name):
67 |         import importlib
68 |         try:
69 |             importlib.import_module(package_name)
70 |             print(f"\n{package_name} is already installed.")
71 |         except ImportError:
72 |             print(f"\nInstalling {package_name}...", end='', flush=True)
73 |             result = subprocess.run(['pip', 'install', package_name], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
74 |             if result.returncode == 0:
75 |                 print(f" \n{package_name} installed successfully.")
76 |             else:
77 |                 print(f" \nFailed to install {package_name}.")
78 |
79 |
80 |     def initialize_client(self):
81 |         if not self.client and not self.pipeline:
82 |             choice = input("\nChoose the model provider for healthcheck: \n\n[1] OpenAI \n[2] Huggingface\n\n>> ")
83 |             if choice == '1':
84 |                 self.init_openai_client()
85 |             elif choice == '2':
86 |                 self.init_huggingface_client()
87 |             else:
88 |                 raise ValueError("Invalid option selected")
89 |
90 |     def init_openai_client(self, reinitialize=False):
91 |
92 |         if not reinitialize:
93 |             print("\nInstalling necessary libraries..........")
94 |             self.install_package('openai')
95 |
96 |         import openai
97 |         from openai import OpenAI
98 |
99 |         self.provider = 'OpenAI'
100 |         self.openai_api_key = input("\nEnter your OpenAI API key: ") if not reinitialize else self.openai_api_key
101 |         self.model = input("\nEnter the OpenAI model name: ") if not reinitialize else self.model
102 |
103 |         self.client = OpenAI(api_key=self.openai_api_key)
104 |
105 |         if not reinitialize or self.reinitialize:
106 |             while True:
107 |                 try:
108 |                     self.validate_openai_key()
109 |                     break
110 |                 except openai.AuthenticationError:  # openai>=1.0 exposes errors at the top level, not under openai.error
111 |                     self.openai_api_key = self.client.api_key = input("\nInvalid Key! Please enter the correct OpenAI API key: ")
112 |                 except (openai.BadRequestError, openai.NotFoundError) as e:  # Replaces InvalidRequestError; unknown models raise NotFoundError
113 |                     print(e)
114 |                     self.model = input("\nEnter an OpenAI model name from latest OpenAI docs: ")
115 |                 except openai.APIConnectionError as e:
116 |                     print(e)
117 |                     self.init_openai_client(reinitialize=False)
118 |
119 |         self.save_state()
120 |
121 |     def init_huggingface_client(self, reinitialize=False):
122 |
123 |         if not reinitialize:
124 |             print("\nInstalling necessary libraries..........")
125 |             self.install_package('transformers')
126 |             self.install_package('torch')
127 |
128 |         os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
129 |
130 |         import warnings
131 |         from transformers import pipeline, AutoModelForCausalLM, AutoTokenizer
132 |
133 |         warnings.filterwarnings("ignore", category=FutureWarning)
134 |
135 |         self.provider = 'huggingface'
136 |         self.model = input("\nEnter the Huggingface model name: ") if not reinitialize else self.model
137 |
138 |         try:
139 |             self.tokenizer = AutoTokenizer.from_pretrained(self.model)
140 |             self.model_hf = AutoModelForCausalLM.from_pretrained(self.model)
141 |             self.pipeline = pipeline('text-generation', model=self.model_hf, tokenizer=self.tokenizer, device=0 if self.device == 'gpu' else -1)
142 |
143 |         except OSError as e:
144 |             print("\nError loading model: ", e)
145 |             print("\nPlease enter a valid Huggingface model name.")
146 |             self.init_huggingface_client(reinitialize=False)  # Prompt for a new model name instead of retrying the same one
147 |
148 |         self.save_state()
149 |
150 |     def validate_openai_key(self):
151 |         """Validate the OpenAI API key by attempting a small request."""
152 |         response = self.client.chat.completions.create(
153 |             model=self.model,
154 |             messages=[{"role": "user", "content": "Test prompt, do nothing"}],
155 |             max_tokens=5
156 |         )
157 |         print("API key and model are valid.")
158 |
159 |     def infer(self, prompt):
160 |         self.history.append({"role": "user", "content": prompt})
161 |         input_prompt = self.history_to_prompt()
162 |
163 |         if self.provider == 'OpenAI':
164 |             response = self.client.chat.completions.create(
165 |                 model=self.model,
166 |                 messages=input_prompt,
167 |                 max_tokens=self.max_tokens
168 |             )
169 |             output = response.choices[0].message.content.strip()
170 |
171 |         elif self.provider == 'huggingface':
172 |             responses = self.pipeline(input_prompt, max_new_tokens=self.max_tokens)
173 |             output = responses[0]['generated_text']
174 |
175 |         self.history.append({"role": "assistant", "content": output})
176 |         print(">> ", output)
177 |
178 |     def interactive_session(self, prompt_input):
179 |         print("\nInteractive session started. Type 'end chat' to exit from the session!\n")
180 |
181 |         if self.continue_chat:
182 |             print('>> ', self.history[-1]['content'])
183 |         else:
184 |             initial_prompt = prompt_input
185 |             self.infer(initial_prompt)
186 |
187 |         while True:
188 |             prompt = input("\n>> ")
189 |             if prompt.lower() == 'end chat':
190 |                 break
191 |             self.infer(prompt)
192 |         self.save_state()
193 |
194 |     def history_to_prompt(self):
195 |         if self.provider == 'OpenAI':
196 |             return self.history
197 |         elif self.provider == 'huggingface':
198 |             return " ".join([item["content"] for item in self.history])
199 |
200 | if __name__ == "__main__":
201 |     client = KrsGPTClient(reinitialize=False)
202 |     # client.interactive_session("You are an 8th grade math tutor. Ask questions to gauge my expertise so that you can generate a training plan for me.")
203 |
204 |
--------------------------------------------------------------------------------
/demo.gif:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kubetoolsca/krs/f2af89aee7c3317d67dc51eb016168a948bc70d3/demo.gif
--------------------------------------------------------------------------------
/dist/krs-0.1.0-py3.10.egg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kubetoolsca/krs/f2af89aee7c3317d67dc51eb016168a948bc70d3/dist/krs-0.1.0-py3.10.egg
--------------------------------------------------------------------------------
/dokc:
--------------------------------------------------------------------------------
1 | ## Prerequisites
2 |
--------------------------------------------------------------------------------
/dokc.md:
--------------------------------------------------------------------------------
1 | # Install and Configure KRS with a DigitalOcean Kubernetes Cluster
2 |
3 |
4 | ## Prerequisites
5 |
6 | - DigitalOcean account
7 | - Homebrew (if you're on Mac)
8 |
9 | ## Getting Started
10 |
11 | ### 1. Set up a Kubernetes Cluster on DigitalOcean
12 |
13 | 
14 |
15 | ### 2. Install and Set up doctl on your Local Machine
16 |
17 | If on Ubuntu, use:
18 |
19 | ```
20 | sudo snap install doctl
21 | sudo snap connect doctl:kube-config
22 | sudo snap connect doctl:ssh-keys :ssh-keys
23 | sudo snap connect doctl:dot-docker
24 | ```
25 |
26 | If on Mac, use:
27 |
28 | ```
29 | brew install doctl
30 | ```
31 |
32 | ### 3. Authenticate your DigitalOcean Account
33 |
34 | ```
35 | doctl auth init
36 | ```
37 |
38 | ### 4. Connect your Local Machine to your DigitalOcean Kubernetes Cluster
39 |
40 | ```
41 | doctl kubernetes cluster kubeconfig save ea3a5a97-fdba-4455-bd81-46df80c68267
42 | ```
43 |
44 | ### 5. Set up KRS using these commands:
45 |
46 | ```
47 | git clone https://github.com/kubetoolsca/krs.git
48 | cd krs
49 | pip install .
50 | ```
51 |
52 | ### 6. Initialize KRS to give it access to your cluster using this command:
53 |
54 | ```
55 | krs init
56 | ```
57 |
58 | ### 7. List all available KRS commands by running this command:
59 | ```
60 | krs --help
61 | ```
62 |
63 | ```
64 | krs --help
65 |
66 | Usage: krs [OPTIONS] COMMAND [ARGS]...
67 |
68 | krs: A command line interface to scan your Kubernetes Cluster, detect errors,
69 | provide resolutions using LLMs and recommend latest tools for your cluster
70 |
71 | ╭─ Options ────────────────────────────────────────────────────────────────────╮
72 | │ --install-completion Install completion for the current shell. │
73 | │ --show-completion Show completion for the current shell, to copy │
74 | │ it or customize the installation. │
75 | │ --help Show this message and exit. │
76 | ╰──────────────────────────────────────────────────────────────────────────────╯
77 | ╭─ Commands ───────────────────────────────────────────────────────────────────╮
78 | │ exit Ends krs services safely and deletes all state files from │
79 | │ system. Removes all cached data. │
80 | │ export Exports pod info with logs and events. │
81 | │ health Starts an interactive terminal using an LLM of your choice to │
82 | │ detect and fix issues with your cluster │
83 | │ init Initializes the services and loads the scanner. │
84 | │ namespaces Lists all the namespaces. │
85 | │ pods Lists all the pods with namespaces, or lists pods under a │
86 | │ specified namespace. │
87 | │ recommend Generates a table of recommended tools from our ranking │
88 | │ database and their CNCF project status. │
89 | │ scan Scans the cluster and extracts a list of tools that are │
90 | │ currently used. │
91 | ╰──────────────────────────────────────────────────────────────────────────────╯
92 | ```
93 | ### 8. Let KRS detect the tools used in your cluster by running this command:
94 |
95 | ```
96 | krs scan
97 | ```
98 |
99 | ```
100 | krs scan
101 |
102 | Scanning your cluster...
103 |
104 | Cluster scanned successfully...
105 |
106 | Extracted tools used in cluster...
107 |
108 |
109 | The cluster is using the following tools:
110 |
111 | +-------------+--------+------------------+---------------+
112 | | Tool Name   |   Rank | Category         | CNCF Status   |
113 | +=============+========+==================+===============+
114 | | cilium      |      1 | Network Policies | graduated     |
115 | +-------------+--------+------------------+---------------+
116 | | hubble      |      7 | Security Tools   | listed        |
117 | +-------------+--------+------------------+---------------+
118 |
119 | ```
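
The `Rank` column reflects a tool's position within its category when the tools are sorted by GitHub stars, which is what `krs/utils/fetch_tools_krs.py` does. A minimal sketch of that ranking follows; the `rank_tools` helper and the `sample_category` data are hypothetical and not taken from the Kubetools dataset.

```python
def get_github_stars(tool):
    """Return the star count as a float, or 0.0 if it is missing or malformed."""
    try:
        return float(tool.get("githubStars", 0))
    except (TypeError, ValueError):
        return 0.0


def rank_tools(category):
    """Rank tools in one category by GitHub stars, highest first (rank 1 is best)."""
    ordered = sorted(category["tools"], key=get_github_stars, reverse=True)
    return {tool["name"].lower(): rank for rank, tool in enumerate(ordered, start=1)}


# Hypothetical sample data for illustration only.
sample_category = {
    "category": {"name": "Security Tools"},
    "tools": [
        {"name": "tool-a", "githubStars": "1200"},
        {"name": "tool-b", "githubStars": 5400},
        {"name": "tool-c", "githubStars": "n/a"},  # malformed count sorts last
    ],
}

ranks = rank_tools(sample_category)
print(ranks)
```

A tool with a missing or non-numeric star count is treated as having zero stars, so it lands at the bottom of its category.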
120 |
121 | ### 9. Get recommendations on possible tools to use in your cluster by running this command:
122 |
123 | ```
124 | krs recommend
125 | ```
126 |
127 | ```
128 | krs recommend
129 |
130 | Our recommended tools for this deployment are:
131 |
132 | +------------------+------------------------+-------------+---------------+
133 | | Category         | Recommendation         | Tool Name   | CNCF Status   |
134 | +==================+========================+=============+===============+
135 | | Network Policies | Already using the best | cilium      | graduated     |
136 | +------------------+------------------------+-------------+---------------+
137 | | Security Tools   | Recommended tool       | trivy       | listed        |
138 | +------------------+------------------------+-------------+---------------+
139 |
140 | ```
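
One plausible way to arrive at a table like the one above is to compare, for every category the cluster already uses, the tool in use with the top-ranked tool in that category. The sketch below works under that assumption. The `recommend` function and both input dictionaries are hypothetical; they only mirror the shape of the ranking data and are not the actual KRS implementation.

```python
def recommend(tools_in_use, category_ranks):
    """Return (category, recommendation, tool) rows for each category in use.

    tools_in_use: {category: (tool_name, rank)} for tools found in the cluster.
    category_ranks: {category: {rank: tool_name}} with rank 1 as the best tool.
    """
    rows = []
    for category, (tool, rank) in tools_in_use.items():
        best = category_ranks[category][1]
        if rank == 1:
            rows.append((category, "Already using the best", tool))
        else:
            rows.append((category, "Recommended tool", best))
    return rows


# Values taken from the sample output above; the data layout is illustrative.
tools_in_use = {"Network Policies": ("cilium", 1), "Security Tools": ("hubble", 7)}
category_ranks = {
    "Network Policies": {1: "cilium"},
    "Security Tools": {1: "trivy", 7: "hubble"},
}

rows = recommend(tools_in_use, category_ranks)
for row in rows:
    print(row)
```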
141 |
142 | ### 10. Check the pod and namespace status in your Kubernetes cluster, including errors.
143 |
144 | ```
145 | krs health
146 | ```
147 |
148 | ```
149 | krs health
150 |
151 | Starting interactive terminal...
152 |
153 |
154 | Choose the model provider for healthcheck:
155 |
156 | [1] OpenAI
157 | [2] Huggingface
158 |
159 | >> 1
160 |
161 | Installing necessary libraries..........
162 |
163 | openai is already installed.
164 |
165 | Enter your OpenAI API key: sk-proj-qxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxP
166 |
167 | Enter the OpenAI model name: gpt-3.5-turbo
168 | API key and model are valid.
169 |
170 | Namespaces in the cluster:
171 |
172 | 1. default
173 | 2. kube-node-lease
174 | 3. kube-public
175 | 4. kube-system
176 | 5. portainer
177 |
178 | Which namespace do you want to check the health for? Select a namespace by entering its number: >> 4
179 |
180 | Pods in the namespace kube-system:
181 |
182 | 1. cilium-9lqbq
183 | 2. cilium-ffpct
184 | 3. cilium-pvknr
185 | 4. coredns-85f59d8784-nvr2n
186 | 5. coredns-85f59d8784-p9jcv
187 | 6. cpc-bridge-proxy-c6xzr
188 | 7. cpc-bridge-proxy-p7r4p
189 | 8. cpc-bridge-proxy-tkfrd
190 | 9. csi-do-node-hwxn7
191 | 10. csi-do-node-q27rc
192 | 11. csi-do-node-rn7dm
193 | 12. do-node-agent-6t5ms
194 | 13. do-node-agent-85r8b
195 | 14. do-node-agent-m7bvr
196 | 15. hubble-relay-74686df4df-856pj
197 | 16. hubble-ui-86cc69bddc-xc745
198 | 17. konnectivity-agent-9k8vk
199 | 18. konnectivity-agent-h5fm2
200 | 19. konnectivity-agent-kf4xh
201 | 20. kube-proxy-94945
202 | 21. kube-proxy-qgv4j
203 | 22. kube-proxy-vztzf
204 |
205 | Which pod from kube-system do you want to check the health for? Select a pod by entering its number: >> 1
206 |
207 | Checking status of the pod...
208 |
209 | Extracting logs and events from the pod...
210 |
211 | Logs and events from the pod extracted successfully!
212 |
213 |
214 | Interactive session started. Type 'end chat' to exit from the session!
215 |
216 | >> The log entries provided are empty {}, so there is nothing to analyze. Therefore, I can confirm that 'Everything looks good!' in this case.
217 |
218 | If there were warnings or errors in the log entries, I would have analyzed them thoroughly to identify the root cause. Depending on the specific warnings or errors, potential steps to resolve the issues could include:
219 |
220 | 1. Analyzing the specific error message to understand the problem
221 | 2. Checking Kubernetes resources (e.g., pods, deployments, configmaps) for any misconfigurations
222 | 3. Verifying connectivity to external resources or dependencies
223 | 4. Checking for resource limitations or constraints that could be causing issues
224 | 5. Reviewing recent changes in the Kubernetes environment that could have introduced problems
225 | 6. Using Kubernetes troubleshooting tools like kubectl logs, describe, or events to gather more information
226 |
227 | By following these steps and addressing any identified issues, you can resolve warnings or errors in the Kubernetes environment.
228 |
229 | >> Wonderful, anything else to note?
230 | >> In addition to resolving warnings or errors in Kubernetes logs, it's important to regularly monitor and maintain the Kubernetes environment to ensure smooth operation. Some best practices for Kubernetes maintenance include:
231 |
232 | 1. Regularly updating Kubernetes components and cluster nodes to the latest stable versions to benefit from bug fixes and security patches.
233 | 2. Implementing automated backups of critical data and configurations to prevent data loss in case of failures.
234 | 3. Monitoring resource utilization and scaling components as needed to optimize performance and cost efficiency.
235 | 4. Implementing security best practices, such as network policies, RBAC, and pod security policies to protect the cluster from unauthorized access or malicious activities.
236 | 5. Conducting regular health checks and performance tuning to identify and address any bottlenecks or inefficiencies in the cluster.
237 | 6. Developing a disaster recovery plan and testing it periodically to ensure business continuity in case of unexpected events.
238 |
239 | By following these maintenance practices, you can ensure the stability, scalability, and security of your Kubernetes environment for optimal DevOps operations.
240 |
241 | >> Alright, so are we done?
242 | >> Yes, we have covered the analysis of the Kubernetes log entries and discussed best practices for Kubernetes maintenance. If you have any more questions or need further assistance, feel free to ask. Otherwise, we can consider this conversation complete.
243 |
244 | >> Wonderful. Thanks! end chat
245 | >> You're welcome! If you have any more questions in the future, feel free to reach out. Have a great day! Goodbye!
246 |
247 | >> end chat
248 |
249 |
250 | ```
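
The first model response in the session above reports empty log entries because the prompt wraps the extracted warnings and errors in `{ }`, and nothing was extracted for this pod. The sketch below mirrors the `create_prompt` method of `KrsMain` as a standalone function; the sample log entry is made up for illustration.

```python
def create_prompt(log_entries):
    """Build the health-check prompt from a collection of extracted log entries."""
    prompt = (
        "You are a DevOps expert with experience in Kubernetes. "
        "Analyze the following log entries:\n{\n"
    )
    for entry in sorted(log_entries):  # sort for a stable order
        prompt += f"{entry}\n"
    prompt += (
        "}\nIf there is nothing of concern in between { }, return a message "
        "stating that 'Everything looks good!'. Explain the warnings and errors "
        "and the steps that should be taken to resolve the issues, only if they exist."
    )
    return prompt


empty_prompt = create_prompt(set())  # healthy pod: nothing between the braces
busy_prompt = create_prompt({"Error: failed to pull image"})  # hypothetical entry
print(empty_prompt)
```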
251 |
252 |
253 |
--------------------------------------------------------------------------------
/eks.md:
--------------------------------------------------------------------------------
1 | # Install and Configure KRS with EKS (AWS)
2 |
3 |
4 | ## Prerequisites
5 |
6 | - AWS account
7 | - AWS CLI installed on your system
8 | - Homebrew (if you're on Mac)
9 |
10 |
11 | ## Getting Started
12 |
13 | ### 1. Set up an Amazon EKS Cluster
14 |
15 | ```
16 | $ eksctl create cluster --name <cluster-name> --version <kubernetes-version> --region <region> --nodegroup-name <nodegroup-name> --node-type <node-type> --nodes <node-count> --zones=<zones>
17 | ```
18 |
19 |
20 |
21 | 
22 |
23 |
24 | ### 2. Authenticate your AWS account
25 |
26 |
27 | ```
28 | aws configure
29 | ```
30 |
31 |
32 |
33 | ### 3. List the running clusters on AWS using this command:
34 |
35 | ```
36 | $ aws eks list-clusters
37 | ```
38 |
39 | ### 4. Create a config file that permits KRS access to the EKS cluster using this command:
40 |
41 | ```
42 | aws eks update-kubeconfig --name <cluster-name>
43 | ```
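
KRS reads the cluster credentials from a kubeconfig file. The scanner in `cluster_scanner.py` defaults to `~/.kube/config`, which is also where `aws eks update-kubeconfig` writes by default. A small sketch for checking that the file is in place before initializing KRS; the helper name `find_kubeconfig` is hypothetical.

```python
from pathlib import Path


def find_kubeconfig(path="~/.kube/config"):
    """Return the expanded kubeconfig path if the file exists, else None."""
    candidate = Path(path).expanduser()
    return candidate if candidate.is_file() else None


found = find_kubeconfig()
print(found if found else "No kubeconfig found; run the update-kubeconfig step first.")
```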
44 |
45 |
46 | ### 5. Set up KRS using these commands:
47 |
48 | ```
49 | $ git clone https://github.com/kubetoolsca/krs.git
50 | $ cd krs
51 | $ pip install .
52 | ```
53 |
54 | ### 6. Initialize KRS to give it access to your cluster using this command:
55 |
56 | ```
57 | krs init
58 | ```
59 |
60 | ### 7. List all available KRS commands by running this command:
61 |
62 |
63 | ```
64 | krs --help
65 |
66 | Usage: krs [OPTIONS] COMMAND [ARGS]...
67 |
68 | krs: A command line interface to scan your Kubernetes Cluster, detect errors,
69 | provide resolutions using LLMs and recommend latest tools for your cluster
70 |
71 | ╭─ Options ────────────────────────────────────────────────────────────────────╮
72 | │ --install-completion Install completion for the current shell. │
73 | │ --show-completion Show completion for the current shell, to copy │
74 | │ it or customize the installation. │
75 | │ --help Show this message and exit. │
76 | ╰──────────────────────────────────────────────────────────────────────────────╯
77 | ╭─ Commands ───────────────────────────────────────────────────────────────────╮
78 | │ exit Ends krs services safely and deletes all state files from │
79 | │ system. Removes all cached data. │
80 | │ export Exports pod info with logs and events. │
81 | │ health Starts an interactive terminal using an LLM of your choice to │
82 | │ detect and fix issues with your cluster │
83 | │ init Initializes the services and loads the scanner. │
84 | │ namespaces Lists all the namespaces. │
85 | │ pods Lists all the pods with namespaces, or lists pods under a │
86 | │ specified namespace. │
87 | │ recommend Generates a table of recommended tools from our ranking │
88 | │ database and their CNCF project status. │
89 | │ scan Scans the cluster and extracts a list of tools that are │
90 | │ currently used. │
91 | ╰──────────────────────────────────────────────────────────────────────────────╯
92 | ```
93 |
94 | ### 8. Scan the cluster to detect the tools it uses:
95 |
96 | ```
97 | krs scan
98 |
99 | Scanning your cluster...
100 |
101 | Cluster scanned successfully...
102 |
103 | Extracted tools used in cluster...
104 |
105 | The cluster is using the following tools:
106 |
107 | +-------------+--------+-----------------------------+---------------+
108 | | Tool Name   |   Rank | Category                    | CNCF Status   |
109 | +=============+========+=============================+===============+
110 | | autoscaler  |      5 | Cluster with Core CLI tools | unlisted      |
111 | +-------------+--------+-----------------------------+---------------+
112 | | istio       |      2 | Service Mesh                | graduated     |
113 | +-------------+--------+-----------------------------+---------------+
114 | | kserve      |      3 | Artificial Intelligence     | listed        |
115 | +-------------+--------+-----------------------------+---------------+
116 | ```
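
The scan result above is produced by matching fragments of pod and deployment names against the tool ranking database. The following is a minimal sketch of that idea, modeled on `detect_tools_from_repo` in `krs/main.py`. The `known_tools` dictionary and the pod names are hypothetical sample data, and the result is sorted here only to make the output deterministic (the original returns an unordered list).

```python
def detect_tools(resource_names, known_tools):
    """Return the known tool names that appear as hyphen-separated
    fragments of the given pod or deployment names."""
    found = set()
    for name in resource_names:
        for fragment in name.split("-"):
            if fragment in known_tools:
                found.add(fragment)
    return sorted(found)


# Hypothetical sample data, not the real ranking database.
known_tools = {
    "istio": [{"rank": 2, "category": "Service Mesh"}],
    "kserve": [{"rank": 3, "category": "Artificial Intelligence"}],
}
pods = [
    "istio-ingressgateway-7d9f",
    "kserve-controller-manager-0",
    "coredns-586b798467-54t6h",
]

print(detect_tools(pods, known_tools))  # ['istio', 'kserve']
```

Because names are split on `-`, a tool is only detected when its full name appears as a single fragment of a resource name.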
117 |
118 | ### 9. Get tool recommendations for your cluster:
119 |
120 | ```
121 | krs recommend
122 | ```
123 | ```
124 | +-----------------------------+------------------+-------------+---------------+
125 | | Category                    | Recommendation   | Tool Name   | CNCF Status   |
126 | +=============================+==================+=============+===============+
127 | | Cluster with Core CLI tools | Recommended tool | k9s         | unlisted      |
128 | +-----------------------------+------------------+-------------+---------------+
129 | | Service Mesh                | Recommended tool | traefik     | listed        |
130 | +-----------------------------+------------------+-------------+---------------+
131 | | Artificial Intelligence     | Recommended tool | k8sgpt      | sandbox       |
132 | +-----------------------------+------------------+-------------+---------------+
133 | ```
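
Each row of the recommendation table compares the rank of a tool found in the cluster with the top-ranked tool of the same category. The sketch below follows the logic of `print_recommendations` in `krs/main.py` in simplified form. The `top_tool_by_category` mapping is an assumption made for illustration (the original looks the tool up in its own category table), and all sample values are hypothetical.

```python
def build_recommendations(cluster_ranks, top_tool_by_category, cncf_status):
    """Build one table row per category that is in use in the cluster."""
    rows = []
    for category, ranks in cluster_ranks.items():
        top_tool = top_tool_by_category[category]
        status = cncf_status.get(top_tool, "unlisted")
        # Rank 1 means the cluster already runs the top-ranked tool.
        label = "Already using the best" if ranks[0] == 1 else "Recommended tool"
        rows.append([category, label, top_tool, status])
    return rows


# Hypothetical sample data.
cluster_ranks = {"Service Mesh": [2], "Cluster Management": [1]}
top_tool_by_category = {"Service Mesh": "traefik", "Cluster Management": "rancher"}
cncf_status = {"traefik": "listed"}

for row in build_recommendations(cluster_ranks, top_tool_by_category, cncf_status):
    print(row)
```

Only the first rank recorded for a category is considered, as in the original code, so a category with several detected tools is judged by the first one found.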
134 |
135 | ### 10. Check pod and namespace status in your Kubernetes cluster, including errors:
136 |
137 | ```
138 | krs health
139 | ```
140 | ```
141 | Starting interactive terminal...
142 |
143 | Choose the model provider for healthcheck:
144 |
145 | [1] OpenAI
146 | [2] Huggingface
147 |
148 | >> 1
149 |
150 | Installing necessary libraries..........
151 |
152 | openai is already installed.
153 |
154 | Enter your OpenAI API key: sk-proj-xxxxxxxxxx
155 |
156 | Enter the OpenAI model name: gpt-3.5-turbo
157 | API key and model are valid.
158 |
159 | Namespaces in the cluster:
160 |
161 | 1. cert-manager
162 | 2. default
163 | 3. istio-system
164 | 4. knative-serving
165 | 5. kserve
166 | 6. kserve-test
167 | 7. kube-node-lease
168 | 8. kube-public
169 | 9. kube-system
170 |
171 | Which namespace do you want to check the health for? Select a namespace by entering its number: >> 9
172 |
173 | Pods in the namespace kube-system:
174 |
175 | 1. aws-node-46hzm
176 | 2. aws-node-wdgnn
177 | 3. coredns-586b798467-54t6h
178 | 4. coredns-586b798467-jmlrp
179 | 5. kube-proxy-hfmjl
180 | 6. kube-proxy-n8lc6
181 |
182 | Which pod from kube-system do you want to check the health for? Select a pod by entering its number: >> 1
183 |
184 | Checking status of the pod...
185 |
186 | Extracting logs and events from the pod...
187 |
188 | Logs and events from the pod extracted successfully!
189 |
190 |
191 | Interactive session started. Type 'end chat' to exit from the session!
192 |
193 | >> The provided log entries are empty, so there is nothing to analyze. Everything looks good!
194 |
195 | >> Wonderful, so what next
196 | >> If you have any specific questions or another set of log entries you would like me to analyze, feel free to provide them. I'm here to help with any DevOps or Kubernetes-related queries you may have. Just let me know how I can assist you further!
197 | ```
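
The first reply in the session above ("The provided log entries are empty") follows from how the prompt is assembled: the extracted log entries are placed between braces, and the model is told to answer "Everything looks good!" when nothing of concern appears there. Here is a minimal sketch that roughly follows `create_prompt` in `krs/main.py`. The function name `build_prompt` and the sample log line are hypothetical, and the closing instruction is shortened.

```python
def build_prompt(log_entries):
    """Wrap the extracted log entries in braces and add the instructions
    for the LLM. Entries are sorted so the prompt is reproducible."""
    lines = [
        "You are a DevOps expert with experience in Kubernetes. "
        "Analyze the following log entries:",
        "{",
    ]
    lines.extend(sorted(log_entries))
    lines.append("}")
    lines.append(
        "If there is nothing of concern in between { }, return a message "
        "stating that 'Everything looks good!'."
    )
    return "\n".join(lines)


# With no extracted entries the braces are empty.
print(build_prompt(set()))

# Hypothetical log line for illustration.
print(build_prompt({"E0601 12:00:00 failed to pull image"}))
```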
198 |
199 |
200 |
201 |
202 |
203 |
204 |
205 |
--------------------------------------------------------------------------------
/gke.md:
--------------------------------------------------------------------------------
1 | ## Setting up KRS for Google Kubernetes Engine
2 |
3 | ## Prerequisites
4 |
5 | - A Google Cloud account
6 | - Google Cloud SDK installed on your macOS system
7 |
8 | Run the following commands to install the Google Cloud SDK on your system:
9 |
10 | ```
11 | tar xfz google-cloud-sdk-195.0.0-darwin-x86_64.tar.gz
12 | ./google-cloud-sdk/install.sh
13 | ```
14 |
15 |
16 | - Enable Google Cloud Engine API
17 |
18 | 
19 |
20 |
21 | - Authenticate with Google Cloud using `gcloud init`
22 |
23 |
24 | ```
25 | gcloud init
26 | ```
27 |
28 | In your browser, log in to your Google user account when prompted and click Allow to grant permission to access Google Cloud Platform resources.
29 |
30 |
31 | ## Creating a GKE Cluster
32 |
33 | ```
34 | gcloud container clusters create k8s-lab1 --disk-size 10 --zone asia-east1-a --machine-type n1-standard-2 --num-nodes 3 --scopes compute-rw
35 | ```
36 |
37 | ## Viewing it on Google Cloud Platform
38 |
39 | 
40 |
41 |
42 | ## Viewing the new context on Docker Desktop
43 |
44 |
45 |
46 | ### Verifying the Google Kubernetes Cluster
47 |
48 | ```
49 | kubectl get nodes
50 | NAME                                      STATUS   ROLES    AGE    VERSION
51 | gke-k8s-lab1-default-pool-5dfb7153-3fr7   Ready    <none>   3m1s   v1.29.4-gke.1043002
52 | gke-k8s-lab1-default-pool-5dfb7153-nl3v   Ready    <none>   3m1s   v1.29.4-gke.1043002
53 | gke-k8s-lab1-default-pool-5dfb7153-rkg8   Ready    <none>   3m2s   v1.29.4-gke.1043002
54 | ```
55 |
56 | ## Initializing KRS
57 |
58 | ```
59 | krs init
60 | Services initialized and scanner loaded.
61 | ```
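
`krs init` persists its state to disk, and the other commands refuse to run until that state file exists (see `check_initialized` in `krs/krs.py` and `save_state`/`load_state` in `krs/main.py`). The following is a minimal sketch of that pattern using `pickle`. The file path and the state keys shown are hypothetical, not the values KRS actually uses.

```python
import os
import pickle
import tempfile


def save_state(path, state):
    """Write the state dictionary to disk, creating the directory if needed."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as f:
        pickle.dump(state, f)


def load_state(path):
    """Return the saved state, or None when 'init' has not run yet."""
    if not os.path.exists(path):
        return None
    with open(path, "rb") as f:
        return pickle.load(f)


# Hypothetical location; a temporary directory keeps the example self-contained.
state_path = os.path.join(tempfile.mkdtemp(), "krs_data", "krsstate.pkl")

print(load_state(state_path))  # None: not initialized yet
save_state(state_path, {"kubeconfig": "~/.kube/config", "isScanned": False})
print(load_state(state_path))
```

Pickle files should only be loaded from locations you trust, since unpickling can execute arbitrary code.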
62 |
63 | ## Running the scanner
64 |
65 | ```
66 | krs scan
67 |
68 | Scanning your cluster...
69 |
70 | Cluster scanned successfully...
71 |
72 | Extracted tools used in cluster...
73 |
74 |
75 | The cluster is using the following tools:
76 |
77 | +-------------+--------+-----------------------------+---------------+
78 | | Tool Name   |   Rank | Category                    | CNCF Status   |
79 | +=============+========+=============================+===============+
80 | | autoscaler  |      5 | Cluster with Core CLI tools | unlisted      |
81 | +-------------+--------+-----------------------------+---------------+
82 | | fluentbit   |      4 | Logging and Tracing         | unlisted      |
83 | +-------------+--------+-----------------------------+---------------+
84 | ```
85 |
86 | ## Checking the KRS Recommendations
87 |
88 | ```
89 | krs recommend
90 |
91 | Our recommended tools for this deployment are:
92 |
93 | +-----------------------------+------------------+-------------+---------------+
94 | | Category                    | Recommendation   | Tool Name   | CNCF Status   |
95 | +=============================+==================+=============+===============+
96 | | Cluster with Core CLI tools | Recommended tool | k9s         | unlisted      |
97 | +-----------------------------+------------------+-------------+---------------+
98 | | Logging and Tracing         | Recommended tool | elk         | unlisted      |
99 | ```
100 |
101 |
102 | ## Installing Kubeview
103 |
104 | ```
105 | git clone https://github.com/benc-uk/kubeview
106 | cd kubeview/charts/
107 | helm install kubeview kubeview
108 | ```
109 |
110 | ## Running the scanner again
111 |
112 | ```
113 | krs scan
114 |
115 | Scanning your cluster...
116 |
117 | Cluster scanned successfully...
118 |
119 | Extracted tools used in cluster...
120 |
121 |
122 | The cluster is using the following tools:
123 |
124 | +-------------+--------+-----------------------------+---------------+
125 | | Tool Name   |   Rank | Category                    | CNCF Status   |
126 | +=============+========+=============================+===============+
127 | | kubeview    |     30 | Cluster with Core CLI tools | unlisted      |
128 | +-------------+--------+-----------------------------+---------------+
129 | |             |      3 | Cluster Management          | unlisted      |
130 | +-------------+--------+-----------------------------+---------------+
131 | | autoscaler  |      5 | Cluster with Core CLI tools | unlisted      |
132 | +-------------+--------+-----------------------------+---------------+
133 | | fluentbit   |      4 | Logging and Tracing         | unlisted      |
134 | +-------------+--------+-----------------------------+---------------+
135 | ```
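
The row with an empty Tool Name in the table above is not a separate tool. A tool that is ranked in more than one category gets one row per category, and its name is printed only on the first of those rows, so the rank 3 "Cluster Management" row belongs to `kubeview`. The sketch below reproduces that row-building logic from `print_scan_results` in `krs/main.py`, using values taken from the output above. The function name `scan_rows` is hypothetical.

```python
def scan_rows(detailed_tools, cncf_status):
    """Build one row per (tool, category) pair; the tool name is shown
    only on the first row of each tool."""
    rows = []
    for tool, details in detailed_tools.items():
        for index, detail in enumerate(details):
            name = tool if index == 0 else ""
            rows.append([name, detail["rank"], detail["category"],
                         cncf_status.get(tool, "unlisted")])
    return rows


detailed_tools = {
    "kubeview": [
        {"rank": 30, "category": "Cluster with Core CLI tools"},
        {"rank": 3, "category": "Cluster Management"},
    ],
    "fluentbit": [{"rank": 4, "category": "Logging and Tracing"}],
}

for row in scan_rows(detailed_tools, cncf_status={}):
    print(row)
```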
136 |
--------------------------------------------------------------------------------
/kind.md:
--------------------------------------------------------------------------------
1 | # Install and Configure Krs with Kind
2 |
3 | ## Prerequisites
4 |
5 | - Docker or Podman (container runtime)
6 | - kubectl
7 | - Go (version 1.16+)
8 |
9 | ## Getting Started
10 |
11 | ### 1. Set up a Kind Kubernetes cluster on your local machine
12 | ```
13 | go install sigs.k8s.io/kind@v0.23.0 && kind create cluster
14 | ```
15 | 
16 |
17 | ### 2. Set up KRS using these commands:
18 |
19 | ```
20 | git clone https://github.com/kubetoolsca/krs.git
21 | cd krs
22 | pip install .
23 | ```
24 |
25 | ### 3. Initialize KRS to give it access to your cluster:
26 |
27 | ```
28 | krs init
29 | ```
30 |
31 | ### 4. List all available KRS commands:
32 | ```
33 | krs --help
34 | ```
35 |
36 | ```
37 | krs --help
38 |
39 | Usage: krs [OPTIONS] COMMAND [ARGS]...
40 |
41 | krs: A command line interface to scan your Kubernetes Cluster, detect errors,
42 | provide resolutions using LLMs and recommend latest tools for your cluster
43 |
44 | ╭─ Options ────────────────────────────────────────────────────────────────────╮
45 | │ --install-completion          Install completion for the current shell.      │
46 | │ --show-completion             Show completion for the current shell, to copy │
47 | │                               it or customize the installation.              │
48 | │ --help                        Show this message and exit.                    │
49 | ╰──────────────────────────────────────────────────────────────────────────────╯
50 | ╭─ Commands ───────────────────────────────────────────────────────────────────╮
51 | │ exit         Ends krs services safely and deletes all state files from       │
52 | │              system. Removes all cached data.                                │
53 | │ export       Exports pod info with logs and events.                          │
54 | │ health       Starts an interactive terminal using an LLM of your choice to   │
55 | │              detect and fix issues with your cluster                         │
56 | │ init         Initializes the services and loads the scanner.                 │
57 | │ namespaces   Lists all the namespaces.                                       │
58 | │ pods         Lists all the pods with namespaces, or lists pods under a       │
59 | │              specified namespace.                                            │
60 | │ recommend    Generates a table of recommended tools from our ranking         │
61 | │              database and their CNCF project status.                         │
62 | │ scan         Scans the cluster and extracts a list of tools that are         │
63 | │              currently used.                                                 │
64 | ╰──────────────────────────────────────────────────────────────────────────────╯
65 | ```
66 | ### 5. Scan the cluster to detect the tools it uses:
67 |
68 | ```
69 | krs scan
70 | ```
71 |
72 | ```
73 | krs scan
74 |
75 | Scanning your cluster...
76 |
77 | Cluster scanned successfully...
78 |
79 | Extracted tools used in cluster...
80 |
81 |
82 | The cluster is using the following tools:
83 |
84 | +-------------+--------+-----------------------------------+---------------+
85 | | Tool Name   |   Rank | Category                          | CNCF Status   |
86 | +=============+========+===================================+===============+
87 | | kind        |      3 | Alternative Tools for Development | listed        |
88 | +-------------+--------+-----------------------------------+---------------+
89 | |             |      4 | Cluster Management                | listed        |
90 | +-------------+--------+-----------------------------------+---------------+
91 |
92 | ```
93 |
94 | ### 6. Get tool recommendations for your cluster:
95 |
96 | ```
97 | krs recommend
98 | ```
99 |
100 | ```
101 | krs recommend
102 |
103 | Our recommended tools for this deployment are:
104 |
105 | +-----------------------------------+------------------+-------------+---------------+
106 | | Category                          | Recommendation   | Tool Name   | CNCF Status   |
107 | +===================================+==================+=============+===============+
108 | | Alternative Tools for Development | Recommended tool | minikube    | listed        |
109 | +-----------------------------------+------------------+-------------+---------------+
110 | | Cluster Management                | Recommended tool | rancher     | unlisted      |
111 | +-----------------------------------+------------------+-------------+---------------+
112 |
113 |
114 | ```
115 |
116 | ### 7. Check the pod and namespace status in your Kubernetes cluster, including errors.
117 |
118 | ```
119 | krs health
120 | ```
121 |
122 | ```
123 | krs health
124 |
125 | Starting interactive terminal...
126 |
127 |
128 | Choose the model provider for healthcheck:
129 |
130 | [1] OpenAI
131 | [2] Huggingface
132 |
133 | >> 1
134 |
135 | Installing necessary libraries..........
136 |
137 | openai is already installed.
138 |
139 | Enter your OpenAI API key: sk-proj-qxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxP
140 |
141 | Enter the OpenAI model name: gpt-3.5-turbo
142 | API key and model are valid.
143 |
144 | Namespaces in the cluster:
145 |
146 | 1. default
147 | 2. kube-node-lease
148 | 3. kube-public
149 | 4. kube-system
150 | 5. local-path-storage
151 |
152 | Which namespace do you want to check the health for? Select a namespace by entering its number: >> 4
153 |
154 | Pods in the namespace kube-system:
155 |
156 | 1. cilium-9lqbq
157 | 2. cilium-ffpct
158 | 3. cilium-pvknr
159 | 4. coredns-85f59d8784-nvr2n
160 | 5. coredns-85f59d8784-p9jcv
161 | 6. cpc-bridge-proxy-c6xzr
162 | 7. cpc-bridge-proxy-p7r4p
163 | 8. cpc-bridge-proxy-tkfrd
164 | 9. csi-do-node-hwxn7
165 | 10. csi-do-node-q27rc
166 | 11. csi-do-node-rn7dm
167 | 12. do-node-agent-6t5ms
168 | 13. do-node-agent-85r8b
169 | 14. do-node-agent-m7bvr
170 | 15. hubble-relay-74686df4df-856pj
171 | 16. hubble-ui-86cc69bddc-xc745
172 | 17. konnectivity-agent-9k8vk
173 | 18. konnectivity-agent-h5fm2
174 | 19. konnectivity-agent-kf4xh
175 | 20. kube-proxy-94945
176 | 21. kube-proxy-qgv4j
177 | 22. kube-proxy-vztzf
178 |
179 | Which pod from kube-system do you want to check the health for? Select a pod by entering its number: >> 1
180 |
181 | Checking status of the pod...
182 |
183 | Extracting logs and events from the pod...
184 |
185 | Logs and events from the pod extracted successfully!
186 |
187 |
188 | Interactive session started. Type 'end chat' to exit from the session!
189 |
190 | >> The log entries provided are empty {}, so there is nothing to analyze. Therefore, I can confirm that 'Everything looks good!' in this case.
191 |
192 | If there were warnings or errors in the log entries, I would have analyzed them thoroughly to identify the root cause. Depending on the specific warnings or errors, potential steps to resolve the issues could include:
193 |
194 | 1. Analyzing the specific error message to understand the problem
195 | 2. Checking Kubernetes resources (e.g., pods, deployments, configmaps) for any misconfigurations
196 | 3. Verifying connectivity to external resources or dependencies
197 | 4. Checking for resource limitations or constraints that could be causing issues
198 | 5. Reviewing recent changes in the Kubernetes environment that could have introduced problems
199 | 6. Using Kubernetes troubleshooting tools like kubectl logs, describe, or events to gather more information
200 |
201 | By following these steps and addressing any identified issues, you can resolve warnings or errors in the Kubernetes environment.
202 |
203 | >> Wonderful, anything else to note?
204 | >> In addition to resolving warnings or errors in Kubernetes logs, it's important to regularly monitor and maintain the Kubernetes environment to ensure smooth operation. Some best practices for Kubernetes maintenance include:
205 |
206 | 1. Regularly updating Kubernetes components and cluster nodes to the latest stable versions to benefit from bug fixes and security patches.
207 | 2. Implementing automated backups of critical data and configurations to prevent data loss in case of failures.
208 | 3. Monitoring resource utilization and scaling components as needed to optimize performance and cost efficiency.
209 | 4. Implementing security best practices, such as network policies, RBAC, and pod security policies to protect the cluster from unauthorized access or malicious activities.
210 | 5. Conducting regular health checks and performance tuning to identify and address any bottlenecks or inefficiencies in the cluster.
211 | 6. Developing a disaster recovery plan and testing it periodically to ensure business continuity in case of unexpected events.
212 |
213 | By following these maintenance practices, you can ensure the stability, scalability, and security of your Kubernetes environment for optimal DevOps operations.
214 |
215 | >> Alright, so are we done?
216 | >> Yes, we have covered the analysis of the Kubernetes log entries and discussed best practices for Kubernetes maintenance. If you have any more questions or need further assistance, feel free to ask. Otherwise, we can consider this conversation complete.
217 |
218 | >> Wonderful. Thanks! end chat
219 | >> You're welcome! If you have any more questions in the future, feel free to reach out. Have a great day! Goodbye!
220 |
221 | >> end chat
222 |
223 |
224 | ```
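
The namespace and pod prompts in the session above expect a number from the printed list. In `krs/main.py` the answer is converted with `int(input(...))`, which raises `ValueError` when the input is not a number. The sketch below shows one possible way to validate such a menu choice without raising. The helper name `parse_choice` is hypothetical and is not part of KRS.

```python
def parse_choice(raw, count):
    """Convert a 1-based menu answer into a 0-based index.

    Returns None when the answer is not a number or is outside 1..count.
    """
    try:
        number = int(raw.strip())
    except ValueError:
        return None
    if 1 <= number <= count:
        return number - 1
    return None


namespaces = ["default", "kube-node-lease", "kube-public",
              "kube-system", "local-path-storage"]

index = parse_choice("4", len(namespaces))
print(namespaces[index])                     # kube-system
print(parse_choice("abc", len(namespaces)))  # None
print(parse_choice("9", len(namespaces)))    # None
```

A caller could keep asking until `parse_choice` returns something other than `None`, which covers both the out-of-range case and the non-numeric case.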
225 |
--------------------------------------------------------------------------------
/krs.egg-info/PKG-INFO:
--------------------------------------------------------------------------------
1 | Metadata-Version: 2.1
2 | Name: krs
3 | Version: 0.1.0
4 | Summary: Kubernetes Recommendation Service with LLM integration
5 | Home-page: https://github.com/KrsGPTs/krs
6 | Author: Abhijeet Mazumdar, Karan Singh & Ajeet Singh Raina
7 | Author-email: abhijeet@kubetools.ca, karan@kubetools.ca, ajeet@kubetools.ca
8 | Classifier: Programming Language :: Python :: 3
9 | Classifier: License :: OSI Approved :: MIT License
10 | Classifier: Operating System :: OS Independent
11 | Requires-Python: >=3.6
12 | License-File: LICENSE
13 | Requires-Dist: typer==0.12.3
14 | Requires-Dist: requests==2.32.2
15 | Requires-Dist: kubernetes==29.0.0
16 | Requires-Dist: tabulate==0.9.0
17 |
--------------------------------------------------------------------------------
/krs.egg-info/SOURCES.txt:
--------------------------------------------------------------------------------
1 | LICENSE
2 | README.md
3 | setup.py
4 | krs/__init__.py
5 | krs/krs.py
6 | krs/main.py
7 | krs.egg-info/PKG-INFO
8 | krs.egg-info/SOURCES.txt
9 | krs.egg-info/dependency_links.txt
10 | krs.egg-info/entry_points.txt
11 | krs.egg-info/requires.txt
12 | krs.egg-info/top_level.txt
13 | krs/utils/__init__.py
14 | krs/utils/cluster_scanner.py
15 | krs/utils/constants.py
16 | krs/utils/fetch_tools_krs.py
17 | krs/utils/functional.py
18 | krs/utils/llm_client.py
--------------------------------------------------------------------------------
/krs.egg-info/dependency_links.txt:
--------------------------------------------------------------------------------
1 |
2 |
--------------------------------------------------------------------------------
/krs.egg-info/entry_points.txt:
--------------------------------------------------------------------------------
1 | [console_scripts]
2 | krs = krs.krs:app
3 |
--------------------------------------------------------------------------------
/krs.egg-info/requires.txt:
--------------------------------------------------------------------------------
1 | typer==0.12.3
2 | requests==2.32.2
3 | kubernetes==29.0.0
4 | tabulate==0.9.0
5 |
--------------------------------------------------------------------------------
/krs.egg-info/top_level.txt:
--------------------------------------------------------------------------------
1 | krs
2 |
--------------------------------------------------------------------------------
/krs/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kubetoolsca/krs/f2af89aee7c3317d67dc51eb016168a948bc70d3/krs/__init__.py
--------------------------------------------------------------------------------
/krs/krs.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python3
2 |
3 | import typer, os
4 | from krs.main import KrsMain
5 | from krs.utils.constants import KRSSTATE_PICKLE_FILEPATH, KRS_DATA_DIRECTORY
6 |
7 | app = typer.Typer(help="krs: A command line interface to scan your Kubernetes Cluster, detect errors, provide resolutions using LLMs and recommend latest tools for your cluster")
8 | krs = KrsMain()
9 |
10 | def check_initialized():
11 |     if not os.path.exists(KRSSTATE_PICKLE_FILEPATH):
12 |         typer.echo("KRS is not initialized. Please run 'krs init' first.")
13 |         raise typer.Exit()
14 |
15 | if not os.path.exists(KRS_DATA_DIRECTORY):
16 |     os.mkdir(KRS_DATA_DIRECTORY)
17 |
18 | @app.command()
19 | def init(kubeconfig: str = typer.Option('~/.kube/config', help="Custom path for kubeconfig file if not default")):
20 |     """
21 |     Initializes the services and loads the scanner.
22 |     """
23 |     krs.initialize(kubeconfig)
24 |     typer.echo("Services initialized and scanner loaded.")
25 |
26 | @app.command()
27 | def scan():
28 |     """
29 |     Scans the cluster and extracts a list of tools that are currently used.
30 |     """
31 |     check_initialized()
32 |     krs.scan_cluster()
33 |
34 |
35 | @app.command()
36 | def namespaces():
37 |     """
38 |     Lists all the namespaces.
39 |     """
40 |     check_initialized()
41 |     namespaces = krs.list_namespaces()
42 |     typer.echo("Namespaces in your cluster are: \n")
43 |     for i, namespace in enumerate(namespaces):
44 |         typer.echo(str(i+1)+ ". "+ namespace)
45 |
46 | @app.command()
47 | def pods(namespace: str = typer.Option(None, help="Specify namespace to list pods from")):
48 |     """
49 |     Lists all the pods with namespaces, or lists pods under a specified namespace.
50 |     """
51 |     check_initialized()
52 |     if namespace:
53 |         pods = krs.list_pods(namespace)
54 |         if pods == 'wrong namespace name':
55 |             typer.echo(f"\nWrong namespace name entered, try again!\n")
56 |             raise typer.Abort()
57 |         typer.echo(f"\nPods in namespace '{namespace}': \n")
58 |     else:
59 |         pods = krs.list_pods_all()
60 |         typer.echo("\nAll pods in the cluster: \n")
61 |
62 |     for i, pod in enumerate(pods):
63 |         typer.echo(str(i+1)+ '. '+ pod)
64 |
65 | @app.command()
66 | def recommend():
67 |     """
68 |     Generates a table of recommended tools from our ranking database and their CNCF project status.
69 |     """
70 |     check_initialized()
71 |     krs.generate_recommendations()
72 |
73 | @app.command()
74 | def health(change_model: bool = typer.Option(False, help="Option to reinitialize/change the LLM, if set to True"),
75 |            device: str = typer.Option('cpu', help='Option to run Huggingface models on GPU by entering the option as "gpu"')):
76 |     """
77 |     Starts an interactive terminal using an LLM of your choice to detect and fix issues with your cluster
78 |     """
79 |     check_initialized()
80 |     typer.echo("\nStarting interactive terminal...\n")
81 |     krs.health_check(change_model, device)
82 |
83 | @app.command()
84 | def export():
85 |     """
86 |     Exports pod info with logs and events.
87 |     """
88 |     check_initialized()
89 |     krs.export_pod_info()
90 |     typer.echo("Pod info with logs and events exported. Json file saved to current directory!")
91 |
92 | @app.command()
93 | def exit():
94 |     """
95 |     Ends krs services safely and deletes all state files from system. Removes all cached data.
96 |     """
97 |     check_initialized()
98 |     krs.exit()
99 |     typer.echo("Krs services closed safely.")
100 |
101 | if __name__ == "__main__":
102 |     app()
103 |
--------------------------------------------------------------------------------
/krs/main.py:
--------------------------------------------------------------------------------
1 | from krs.utils.fetch_tools_krs import krs_tool_ranking_info
2 | from krs.utils.cluster_scanner import KubetoolsScanner
3 | from krs.utils.llm_client import KrsGPTClient
4 | from krs.utils.functional import extract_log_entries, CustomJSONEncoder
5 | import os, pickle, time, json
6 | from tabulate import tabulate
7 | from krs.utils.constants import (KRSSTATE_PICKLE_FILEPATH, LLMSTATE_PICKLE_FILEPATH, POD_INFO_FILEPATH, KRS_DATA_DIRECTORY)
8 |
9 | class KrsMain:
10 |
11 |     def __init__(self):
12 |
13 |         self.pod_info = None
14 |         self.pod_list = None
15 |         self.namespaces = None
16 |         self.deployments = None
17 |         self.state_file = KRSSTATE_PICKLE_FILEPATH
18 |         self.isClusterScanned = False
19 |         self.continue_chat = False
20 |         self.logs_extracted = []
21 |         self.scanner = None
22 |         self.get_events = True
23 |         self.get_logs = True
24 |         self.cluster_tool_list = None
25 |         self.detailed_cluster_tool_list = None
26 |         self.category_cluster_tools_dict = None
27 |
28 |         self.load_state()
29 |
30 |     def initialize(self, config_file='~/.kube/config'):
31 |         self.config_file = config_file
32 |         self.tools_dict, self.category_dict, cncf_status_dict = krs_tool_ranking_info()
33 |         self.cncf_status = cncf_status_dict['cncftools']
34 |         self.scanner = KubetoolsScanner(self.get_events, self.get_logs, self.config_file)
35 |         self.save_state()
36 |
37 |     def save_state(self):
38 |         state = {
39 |             'pod_info': self.pod_info,
40 |             'pod_list': self.pod_list,
41 |             'namespaces': self.namespaces,
42 |             'deployments': self.deployments,
43 |             'cncf_status': self.cncf_status,
44 |             'tools_dict': self.tools_dict,
45 |             'category_tools_dict': self.category_dict,
46 |             'extracted_logs': self.logs_extracted,
47 |             'kubeconfig': self.config_file,
48 |             'isScanned': self.isClusterScanned,
49 |             'cluster_tool_list': self.cluster_tool_list,
50 |             'detailed_tool_list': self.detailed_cluster_tool_list,
51 |             'category_tool_list': self.category_cluster_tools_dict
52 |         }
53 |         os.makedirs(os.path.dirname(self.state_file), exist_ok=True)
54 |         with open(self.state_file, 'wb') as f:
55 |             pickle.dump(state, f)
56 |
57 |     def load_state(self):
58 |         if os.path.exists(self.state_file):
59 |             with open(self.state_file, 'rb') as f:
60 |                 state = pickle.load(f)
61 |                 self.pod_info = state.get('pod_info')
62 |                 self.pod_list = state.get('pod_list')
63 |                 self.namespaces = state.get('namespaces')
64 |                 self.deployments = state.get('deployments')
65 |                 self.cncf_status = state.get('cncf_status')
66 |                 self.tools_dict = state.get('tools_dict')
67 |                 self.category_dict = state.get('category_tools_dict')
68 |                 self.logs_extracted = state.get('extracted_logs')
69 |                 self.config_file = state.get('kubeconfig')
70 |                 self.isClusterScanned = state.get('isScanned')
71 |                 self.cluster_tool_list = state.get('cluster_tool_list')
72 |                 self.detailed_cluster_tool_list = state.get('detailed_tool_list')
73 |                 self.category_cluster_tools_dict = state.get('category_tool_list')
74 |             self.scanner = KubetoolsScanner(self.get_events, self.get_logs, self.config_file)
75 |
76 |     def check_scanned(self):
77 |         if not self.isClusterScanned:
78 |             self.pod_list, self.pod_info, self.deployments, self.namespaces = self.scanner.scan_kubernetes_deployment()
79 |             self.save_state()
80 |
81 |     def list_namespaces(self):
82 |         self.check_scanned()
83 |         return self.scanner.list_namespaces()
84 |
85 |     def list_pods(self, namespace):
86 |         self.check_scanned()
87 |         if namespace not in self.list_namespaces():
88 |             return "wrong namespace name"
89 |         return self.scanner.list_pods(namespace)
90 |
91 |     def list_pods_all(self):
92 |         self.check_scanned()
93 |         return self.scanner.list_pods_all()
94 |
95 |     def detect_tools_from_repo(self):
96 |         tool_set = set()
97 |         for pod in self.pod_list:
98 |             for service_name in pod.split('-'):
99 |                 if service_name in self.tools_dict.keys():
100 |                     tool_set.add(service_name)
101 |
102 |         for dep in self.deployments:
103 |             for service_name in dep.split('-'):
104 |                 if service_name in self.tools_dict.keys():
105 |                     tool_set.add(service_name)
106 |
107 |         return list(tool_set)
108 |
109 |     def extract_rankings(self):
110 |         tool_dict = {}
111 |         category_tools_dict = {}
112 |         for tool in self.cluster_tool_list:
113 |             tool_details = self.tools_dict[tool]
114 |             for detail in tool_details:
115 |                 rank = detail['rank']
116 |                 category = detail['category']
117 |                 if category not in category_tools_dict:
118 |                     category_tools_dict[category] = []
119 |                 category_tools_dict[category].append(rank)
120 |
121 |             tool_dict[tool] = tool_details
122 |
123 |         return tool_dict, category_tools_dict
124 |
125 |     def generate_recommendations(self):
126 |
127 |         if not self.isClusterScanned:
128 |             self.scan_cluster()
129 |
130 |         self.print_recommendations()
131 |
132 |     def scan_cluster(self):
133 |
134 |         print("\nScanning your cluster...\n")
135 |         self.pod_list, self.pod_info, self.deployments, self.namespaces = self.scanner.scan_kubernetes_deployment()
136 |         self.isClusterScanned = True
137 |         print("Cluster scanned successfully...\n")
138 |         self.cluster_tool_list = self.detect_tools_from_repo()
139 |         print("Extracted tools used in cluster...\n")
140 |         self.detailed_cluster_tool_list, self.category_cluster_tools_dict = self.extract_rankings()
141 |
142 |         self.print_scan_results()
143 |         self.save_state()
144 |
145 |     def print_scan_results(self):
146 |         scan_results = []
147 |
148 |         for tool, details in self.detailed_cluster_tool_list.items():
149 |             first_entry = True
150 |             for detail in details:
151 |                 row = [tool if first_entry else "", detail['rank'], detail['category'], self.cncf_status.get(tool, 'unlisted')]
152 |                 scan_results.append(row)
153 |                 first_entry = False
154 |
155 |         print("\nThe cluster is using the following tools:\n")
156 |         print(tabulate(scan_results, headers=["Tool Name", "Rank", "Category", "CNCF Status"], tablefmt="grid"))
157 |
158 |     def print_recommendations(self):
159 |         recommendations = []
160 |
161 |         for category, ranks in self.category_cluster_tools_dict.items():
162 |             rank = ranks[0]
163 |             recommended_tool = self.category_dict[category][1]['name']
164 |             status = self.cncf_status.get(recommended_tool, 'unlisted')
165 |             if rank == 1:
166 |                 row = [category, "Already using the best", recommended_tool, status]
167 |             else:
168 |                 row = [category, "Recommended tool", recommended_tool, status]
169 |             recommendations.append(row)
170 |
171 |         print("\nOur recommended tools for this deployment are:\n")
172 |         print(tabulate(recommendations, headers=["Category", "Recommendation", "Tool Name", "CNCF Status"], tablefmt="grid"))
173 |
174 |
175 |     def health_check(self, change_model=False, device='cpu'):
176 |
177 |         if os.path.exists(LLMSTATE_PICKLE_FILEPATH) and not change_model:
178 |             continue_previous_chat = input("\nDo you want to continue fixing the previously selected pod ? (y/n): >> ")
179 |             while True:
180 |                 if continue_previous_chat not in ['y', 'n']:
181 |                     continue_previous_chat = input("\nPlease enter one of the given options ? (y/n): >> ")
182 |                 else:
183 |                     break
184 |
185 |             if continue_previous_chat=='y':
186 |                 krsllmclient = KrsGPTClient(device=device)
187 |                 self.continue_chat = True
188 |             else:
189 |                 krsllmclient = KrsGPTClient(reset_history=True, device=device)
190 |
191 |         else:
192 |             krsllmclient = KrsGPTClient(reinitialize=True, device=device)
193 |             self.continue_chat = False
194 |
195 |         if not self.continue_chat:
196 |
197 |             self.check_scanned()
198 |
199 |             print("\nNamespaces in the cluster:\n")
200 |             namespaces = self.list_namespaces()
201 |             namespace_len = len(namespaces)
202 |             for i, namespace in enumerate(namespaces, start=1):
203 |                 print(f"{i}. {namespace}")
204 |
205 |             self.selected_namespace_index = int(input("\nWhich namespace do you want to check the health for? Select a namespace by entering its number: >> "))
206 |             while True:
207 |                 if self.selected_namespace_index not in list(range(1, namespace_len+1)):
208 |                     self.selected_namespace_index = int(input(f"\nWrong input! Select a namespace number between {1} to {namespace_len}: >> "))
209 |                 else:
210 |                     break
211 |
212 |             self.selected_namespace = namespaces[self.selected_namespace_index - 1]
213 |             pod_list = self.list_pods(self.selected_namespace)
214 |             pod_len = len(pod_list)
215 |             print(f"\nPods in the namespace {self.selected_namespace}:\n")
216 |             for i, pod in enumerate(pod_list, start=1):
217 |                 print(f"{i}. {pod}")
218 |             self.selected_pod_index = int(input(f"\nWhich pod from {self.selected_namespace} do you want to check the health for? Select a pod by entering its number: >> "))
219 |
220 |             while True:
221 |                 if self.selected_pod_index not in list(range(1, pod_len+1)):
222 |                     self.selected_pod_index = int(input(f"\nWrong input! Select a pod number between {1} to {pod_len}: >> "))
223 |                 else:
224 |                     break
225 |
226 |             print("\nChecking status of the pod...")
227 |
228 |             print("\nExtracting logs and events from the pod...")
229 |
230 |             logs_from_pod = self.get_logs_from_pod(self.selected_namespace_index, self.selected_pod_index)
231 |
232 |             self.logs_extracted = extract_log_entries(logs_from_pod)
233 |
234 |             print("\nLogs and events from the pod extracted successfully!\n")
235 |
236 |         prompt_to_llm = self.create_prompt(self.logs_extracted)
237 |
238 |         krsllmclient.interactive_session(prompt_to_llm)
239 |
240 |         self.save_state()
241 |
242 | def get_logs_from_pod(self, namespace_index, pod_index):
243 | try:
244 | namespace_index -= 1
245 | pod_index -= 1
246 | namespace = list(self.list_namespaces())[namespace_index]
247 | return list(self.pod_info[namespace][pod_index]['info']['Logs'].values())[0]
248 | except (KeyError, IndexError):
249 | print("\nKindly enter a value from the available namespaces and pods")
250 | return "" # empty log text keeps extract_log_entries() from failing on None
251 |
252 | def create_prompt(self, log_entries):
253 | prompt = "You are a DevOps expert with experience in Kubernetes. Analyze the following log entries:\n{\n"
254 | for entry in sorted(log_entries): # Sort to maintain consistent order
255 | prompt += f"{entry}\n"
256 | prompt += "}\nIf there is nothing of concern in between { }, return a message stating that 'Everything looks good!'. Explain the warnings and errors and the steps that should be taken to resolve the issues, only if they exist."
257 | return prompt
258 |
259 | def export_pod_info(self):
260 |
261 | self.check_scanned()
262 |
263 | with open(POD_INFO_FILEPATH, 'w') as f:
264 | json.dump(self.pod_info, f, cls=CustomJSONEncoder)
265 |
266 |
267 | def exit(self):
268 |
269 | try:
270 | # List all files and directories in the given directory
271 | files = os.listdir(KRS_DATA_DIRECTORY)
272 | for file in files:
273 | file_path = os.path.join(KRS_DATA_DIRECTORY, file)
274 | # Check if it's a file and not a directory
275 | if os.path.isfile(file_path):
276 | os.remove(file_path) # Delete the file
277 | print(f"Deleted file: {file_path}")
278 |
279 | except Exception as e:
280 | print(f"Error occurred: {e}")
281 |
282 | def main(self):
283 | self.scan_cluster()
284 | self.generate_recommendations()
285 | self.health_check()
286 |
287 |
288 | if __name__=='__main__':
289 | recommender = KrsMain()
290 | recommender.main()
291 | # logs_info = recommender.get_logs_from_pod(4,2)
292 | # print(logs_info)
293 | # logs = recommender.extract_log_entries(logs_info)
294 | # print(logs)
295 | # print(recommender.create_prompt(logs))
296 |
297 |
--------------------------------------------------------------------------------
/krs/requirements.txt:
--------------------------------------------------------------------------------
1 | typer==0.12.3
2 | requests==2.32.2
3 | kubernetes==29.0.0
4 | tabulate==0.9.0
5 |
6 |
--------------------------------------------------------------------------------
/krs/utils/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kubetoolsca/krs/f2af89aee7c3317d67dc51eb016168a948bc70d3/krs/utils/__init__.py
--------------------------------------------------------------------------------
/krs/utils/cluster_scanner.py:
--------------------------------------------------------------------------------
1 | from kubernetes import client, config
2 | import logging
3 |
4 | class KubetoolsScanner:
5 | def __init__(self, get_events=True, get_logs=True, config_file='~/.kube/config'):
6 | self.get_events = get_events
7 | self.get_logs = get_logs
8 | self.config_file = config_file
9 | self.v1 = None
10 | self.v2 = None
11 | self.setup_kubernetes_client()
12 |
13 | def setup_kubernetes_client(self):
14 | try:
15 | config.load_kube_config(config_file=self.config_file)
16 | self.v1 = client.AppsV1Api()
17 | self.v2 = client.CoreV1Api()
18 | except Exception as e:
19 | logging.error("Failed to load Kubernetes configuration: %s", e)
20 | raise
21 |
22 | def scan_kubernetes_deployment(self):
23 | try:
24 | deployments = self.v1.list_deployment_for_all_namespaces()
25 | namespaces = self.list_namespaces()
26 | except Exception as e:
27 | logging.error("Error fetching data from Kubernetes API: %s", e)
28 | return [], {}, [], [] # same four-value shape as the success path
29 |
30 | pod_dict = {}
31 | pod_list = []
32 | for name in namespaces:
33 | pods = self.list_pods(name)
34 | pod_list += pods
35 | pod_dict[name] = [{'name': pod, 'info': self.get_pod_info(name, pod, include_events=self.get_events, include_logs=self.get_logs)} for pod in pods]
36 |
37 | deployment_list = [dep.metadata.name for dep in deployments.items]
38 | return pod_list, pod_dict, deployment_list, namespaces
39 |
40 | def list_namespaces(self):
41 | namespaces = self.v2.list_namespace()
42 | return [namespace.metadata.name for namespace in namespaces.items]
43 |
44 | def list_pods_all(self):
45 | pods = self.v2.list_pod_for_all_namespaces()
46 | return [pod.metadata.name for pod in pods.items]
47 |
48 | def list_pods(self, namespace):
49 | pods = self.v2.list_namespaced_pod(namespace)
50 | return [pod.metadata.name for pod in pods.items]
51 |
52 | def get_pod_info(self, namespace, pod, include_events=True, include_logs=True):
53 | """
54 | Retrieves information about a specific pod in a given namespace.
55 |
56 | Args:
57 | namespace (str): The namespace of the pod.
58 | pod (str): The name of the pod.
59 | include_events (bool): Flag indicating whether to include events associated with the pod.
60 | include_logs (bool): Flag indicating whether to include logs of the pod.
61 |
62 | Returns:
63 | dict: A dictionary containing the pod information, events (if include_events is True), and logs (if include_logs is True).
64 | """
65 | pod_info = self.v2.read_namespaced_pod(pod, namespace)
66 | pod_info_map = pod_info.to_dict()
67 | pod_info_map["metadata"]["managed_fields"] = None # Clean up metadata
68 |
69 | info = {'PodInfo': pod_info_map}
70 |
71 | if include_events:
72 | info['Events'] = self.fetch_pod_events(namespace, pod)
73 |
74 | if include_logs:
75 | # Retrieve logs for all containers within the pod
76 | container_logs = {}
77 | for container in pod_info.spec.containers:
78 | try:
79 | logs = self.v2.read_namespaced_pod_log(name=pod, namespace=namespace, container=container.name)
80 | container_logs[container.name] = logs
81 | except Exception as e:
82 | logging.error("Failed to fetch logs for container %s in pod %s: %s", container.name, pod, e)
83 | container_logs[container.name] = "Error fetching logs: " + str(e)
84 | info['Logs'] = container_logs
85 |
86 | return info
87 |
88 | def fetch_pod_events(self, namespace, pod):
89 | events = self.v2.list_namespaced_event(namespace)
90 | return [{
91 | 'Name': event.metadata.name,
92 | 'Message': event.message,
93 | 'Reason': event.reason
94 | } for event in events.items if event.involved_object.name == pod]
95 |
96 |
97 | if __name__ == '__main__':
98 |
99 | scanner = KubetoolsScanner()
100 | pod_list, pod_info, deployments, namespaces = scanner.scan_kubernetes_deployment()
101 | print("POD List: \n\n", pod_list)
102 | print("\n\nPOD Info: \n\n", pod_info.keys())
103 | print("\n\nNamespaces: \n\n", namespaces)
104 | print("\n\nDeployments : \n\n", deployments)
105 |
106 |
--------------------------------------------------------------------------------
/krs/utils/constants.py:
--------------------------------------------------------------------------------
1 | KUBETOOLS_JSONPATH = 'krs/data/kubetools_data.json'
2 | KUBETOOLS_DATA_JSONURL = 'https://raw.githubusercontent.com/Kubetools-Technologies-Inc/kubetools_data/main/data/kubetools_data.json'
3 |
4 | CNCF_YMLPATH = 'krs/data/landscape.yml'
5 | CNCF_YMLURL = 'https://raw.githubusercontent.com/cncf/landscape/master/landscape.yml'
6 | CNCF_TOOLS_JSONPATH = 'krs/data/cncf_tools.json'
7 |
8 | TOOLS_RANK_JSONPATH = 'krs/data/tools_rank.json'
9 | CATEGORY_RANK_JSONPATH = 'krs/data/category_rank.json'
10 |
11 | LLMSTATE_PICKLE_FILEPATH = 'krs/data/llmstate.pkl'
12 | KRSSTATE_PICKLE_FILEPATH = 'krs/data/krsstate.pkl'
13 |
14 | POD_INFO_FILEPATH = './exported_pod_info.json'
15 |
16 | MAX_OUTPUT_TOKENS = 512
17 |
18 | KRS_DATA_DIRECTORY = 'krs/data'
19 |
--------------------------------------------------------------------------------
/krs/utils/fetch_tools_krs.py:
--------------------------------------------------------------------------------
1 | import json
2 | import requests
3 | import yaml
4 | from krs.utils.constants import (KUBETOOLS_DATA_JSONURL, KUBETOOLS_JSONPATH, CNCF_YMLPATH, CNCF_YMLURL, CNCF_TOOLS_JSONPATH, TOOLS_RANK_JSONPATH, CATEGORY_RANK_JSONPATH)
5 |
6 | # Function to convert 'githubStars' to a float, or return 0 if it cannot be converted
7 | def get_github_stars(tool):
8 | stars = tool.get('githubStars', 0)
9 | try:
10 | return float(stars)
11 | except (ValueError, TypeError): # float(None) raises TypeError, not ValueError
12 | return 0.0
13 |
14 | # Function to download and save a file
15 | def download_file(url, filename):
16 | response = requests.get(url, timeout=30) # avoid hanging forever on a stalled connection
17 | response.raise_for_status() # Ensure we notice bad responses
18 | with open(filename, 'wb') as file:
19 | file.write(response.content)
20 |
21 | def parse_yaml_to_dict(yaml_file_path):
22 | with open(yaml_file_path, 'r') as file:
23 | data = yaml.safe_load(file)
24 |
25 | cncftools = {}
26 |
27 | for category in data.get('landscape', []):
28 | for subcategory in category.get('subcategories', []):
29 | for item in subcategory.get('items', []):
30 | item_name = item.get('name').lower()
31 | project_status = item.get('project', 'listed')
32 | cncftools[item_name] = project_status
33 |
34 | return {'cncftools': cncftools}
35 |
36 | def save_json_file(jsondict, jsonpath):
37 |
38 | # Write the category dictionary to a new JSON file
39 | with open(jsonpath, 'w') as f:
40 | json.dump(jsondict, f, indent=4)
41 |
42 |
43 | def krs_tool_ranking_info():
44 | # New dictionaries
45 | tools_dict = {}
46 | category_tools_dict = {}
47 |
48 | download_file(KUBETOOLS_DATA_JSONURL, KUBETOOLS_JSONPATH)
49 | download_file(CNCF_YMLURL, CNCF_YMLPATH)
50 |
51 | with open(KUBETOOLS_JSONPATH) as f:
52 | data = json.load(f)
53 |
54 | for category in data:
55 | # Sort the tools in the current category by the number of GitHub stars
56 | sorted_tools = sorted(category['tools'], key=get_github_stars, reverse=True)
57 |
58 | for i, tool in enumerate(sorted_tools, start=1):
59 | tool["name"] = tool['name'].replace("\t", "").lower()
60 | tool['ranking'] = i
61 |
62 | # Update tools_dict
63 | tools_dict.setdefault(tool['name'], []).append({
64 | 'rank': i,
65 | 'category': category['category']['name'],
66 | 'url': tool['link']
67 | })
68 |
69 | # Update category_tools_dict
70 | category_tools_dict.setdefault(category['category']['name'], {}).update({i: {'name': tool['name'], 'url': tool['link']}})
71 |
72 |
73 | cncf_tools_dict = parse_yaml_to_dict(CNCF_YMLPATH)
74 | save_json_file(cncf_tools_dict, CNCF_TOOLS_JSONPATH)
75 | save_json_file(tools_dict, TOOLS_RANK_JSONPATH)
76 | save_json_file(category_tools_dict, CATEGORY_RANK_JSONPATH)
77 |
78 | return tools_dict, category_tools_dict, cncf_tools_dict
79 |
80 | if __name__=='__main__':
81 | tools_dict, category_tools_dict, cncf_tools_dict = krs_tool_ranking_info()
82 | print(cncf_tools_dict)
83 |
84 |
--------------------------------------------------------------------------------
/krs/utils/functional.py:
--------------------------------------------------------------------------------
1 | from difflib import SequenceMatcher
2 | import re, json
3 | from datetime import datetime
4 |
5 | class CustomJSONEncoder(json.JSONEncoder):
6 | """JSON Encoder for complex objects not serializable by default json code."""
7 | def default(self, obj):
8 | if isinstance(obj, datetime):
9 | # Format datetime object as a string in ISO 8601 format
10 | return obj.isoformat()
11 | # Let the base class default method raise the TypeError
12 | return json.JSONEncoder.default(self, obj)
13 |
14 | def similarity(a, b):
15 | return SequenceMatcher(None, a, b).ratio()
16 |
17 | def filter_similar_entries(log_entries):
18 | unique_entries = list(log_entries)
19 | to_remove = set()
20 |
21 | # Compare each pair of log entries
22 | for i in range(len(unique_entries)):
23 | for j in range(i + 1, len(unique_entries)):
24 | if similarity(unique_entries[i], unique_entries[j]) > 0.85:
25 | # Remove the longer entry, or the second one if they are the same length
26 | if len(unique_entries[i]) > len(unique_entries[j]):
27 | to_remove.add(unique_entries[i])
28 | else:
29 | to_remove.add(unique_entries[j])
30 |
31 | # Filter out the highly similar entries
32 | filtered_entries = {entry for entry in unique_entries if entry not in to_remove}
33 | return filtered_entries
34 |
35 | def extract_log_entries(log_contents):
36 | # Patterns to match different log formats
37 | patterns = [
38 | re.compile(r'\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{6}Z\s+(warn|error)\s+\S+\s+(.*)', re.IGNORECASE),
39 | re.compile(r'[WE]\d{4} \d{2}:\d{2}:\d{2}\.\d+\s+\d+\s+(.*)'),
40 | re.compile(r'({.*})')
41 | ]
42 |
43 | log_entries = set()
44 | # Attempt to match each line with all patterns
45 | for line in log_contents.split('\n'):
46 | for pattern in patterns:
47 | match = pattern.search(line)
48 | if match:
49 | if match.groups()[0].startswith('{'):
50 | # Handle JSON formatted log entries
51 | try:
52 | log_json = json.loads(match.group(1))
53 | if 'severity' in log_json and str(log_json['severity']).lower() in ['error', 'warning']:
54 | level = "Error" if str(log_json['severity']).lower() == "error" else "Warning"
55 | message = log_json.get('error', '') if 'error' in log_json.keys() else line
56 | log_entries.add(f"{level}: {message.strip()}")
57 | elif str(log_json.get('level', '')).lower() in ['error', 'warn', 'warning']: # skip info/debug entries
58 | level = "Error" if str(log_json['level']).lower() == "error" else "Warning"
59 | message = log_json.get('msg', '') + log_json.get('error', '')
60 | log_entries.add(f"{level}: {message.strip()}")
61 | except json.JSONDecodeError:
62 | continue # Skip if JSON is not valid
63 | else:
64 | if len(match.groups()) == 2:
65 | level, message = match.groups()
66 | elif len(match.groups()) == 1:
67 | message = match.group(1) # Assuming error as default
68 | level = "ERROR" # Default if not specified in the log
69 |
70 | level = "Error" if "error" in level.lower() else "Warning"
71 | formatted_message = f"{level}: {message.strip()}"
72 | log_entries.add(formatted_message)
73 | break # Stop after the first match
74 |
75 | return filter_similar_entries(log_entries)
--------------------------------------------------------------------------------
/krs/utils/llm_client.py:
--------------------------------------------------------------------------------
1 | import pickle
2 | import subprocess
3 | import os, time
4 | from krs.utils.constants import (MAX_OUTPUT_TOKENS, LLMSTATE_PICKLE_FILEPATH)
5 |
6 | class KrsGPTClient:
7 |
8 | def __init__(self, reinitialize=False, reset_history=False, device='cpu'):
9 |
10 | self.reinitialize = reinitialize
11 | self.client = None
12 | self.pipeline = None
13 | self.provider = None
14 | self.model = None
15 | self.openai_api_key = None
16 | self.continue_chat = False
17 | self.history = []
18 | self.max_tokens = MAX_OUTPUT_TOKENS
19 | self.device = device
20 |
21 |
22 | if not self.reinitialize:
23 | print("\nLoading LLM State..")
24 | self.load_state()
25 | print("\nModel: ", self.model)
26 | if not self.model:
27 | self.initialize_client()
28 |
29 | self.history = [] if reset_history == True else self.history
30 |
31 | if self.history:
32 | continue_chat = input("\n\nDo you want to continue previous chat ? (y/n) >> ")
33 | while continue_chat not in ['y', 'n']:
34 | print("Please enter either y or n!")
35 | continue_chat = input("\nDo you want to continue previous chat ? (y/n) >> ")
36 | if continue_chat == 'n':
37 | self.history = []
38 | else:
39 | self.continue_chat = True
40 |
41 | def save_state(self, filename=LLMSTATE_PICKLE_FILEPATH):
42 | state = {
43 | 'provider': self.provider,
44 | 'model': self.model,
45 | 'history': self.history,
46 | 'openai_api_key': self.openai_api_key
47 | }
48 | with open(filename, 'wb') as output:
49 | pickle.dump(state, output, pickle.HIGHEST_PROTOCOL)
50 |
51 | def load_state(self):
52 | try:
53 | with open(LLMSTATE_PICKLE_FILEPATH, 'rb') as f:
54 | state = pickle.load(f)
55 | self.provider = state['provider']
56 | self.model = state['model']
57 | self.history = state.get('history', [])
58 | self.openai_api_key = state.get('openai_api_key', '')
59 | if self.provider == 'OpenAI':
60 | self.init_openai_client(reinitialize=True)
61 | elif self.provider == 'huggingface':
62 | self.init_huggingface_client(reinitialize=True)
63 | except (FileNotFoundError, EOFError):
64 | pass
65 |
66 | def install_package(self, package_name):
67 | import importlib, sys
68 | try:
69 | importlib.import_module(package_name)
70 | print(f"\n{package_name} is already installed.")
71 | except ImportError:
72 | print(f"\nInstalling {package_name}...", end='', flush=True)
73 | result = subprocess.run([sys.executable, '-m', 'pip', 'install', package_name], stdout=subprocess.PIPE, stderr=subprocess.PIPE) # install into the interpreter that is running krs
74 | if result.returncode == 0:
75 | print(f" \n{package_name} installed successfully.")
76 | else:
77 | print(f" \nFailed to install {package_name}.")
78 |
79 |
80 | def initialize_client(self):
81 | if not self.client and not self.pipeline:
82 | choice = input("\nChoose the model provider for healthcheck: \n\n[1] OpenAI \n[2] Huggingface\n\n>> ")
83 | if choice == '1':
84 | self.init_openai_client()
85 | elif choice == '2':
86 | self.init_huggingface_client()
87 | else:
88 | raise ValueError("Invalid option selected")
89 |
90 | def init_openai_client(self, reinitialize=False):
91 |
92 | if not reinitialize:
93 | print("\nInstalling necessary libraries..........")
94 | self.install_package('openai')
95 |
96 | import openai
97 | from openai import OpenAI
98 | import getpass
99 |
100 | self.provider = 'OpenAI'
101 | self.openai_api_key = getpass.getpass("\nEnter your OpenAI API key: ") if not reinitialize else self.openai_api_key
102 | self.model = input("\nEnter the OpenAI model name: ") if not reinitialize else self.model
103 |
104 | self.client = OpenAI(api_key=self.openai_api_key)
105 |
106 | if not reinitialize or self.reinitialize:
107 | while True:
108 | try:
109 | self.validate_openai_key()
110 | break
111 | except openai.AuthenticationError:
112 | self.openai_api_key = getpass.getpass("\nInvalid Key! Please enter the correct OpenAI API key: ")
113 | self.client = OpenAI(api_key=self.openai_api_key) # rebuild the client so the new key is actually used
114 | except (openai.BadRequestError, openai.NotFoundError) as e:
115 | self.model = input(f"\n{e}\n\nEnter an OpenAI model name from latest OpenAI docs: ")
116 | except openai.APIConnectionError as e:
117 | print(e)
118 | self.init_openai_client(reinitialize=False)
119 |
120 | self.save_state()
121 |
122 | def init_huggingface_client(self, reinitialize=False):
123 |
124 | if not reinitialize:
125 | print("\nInstalling necessary libraries..........")
126 | self.install_package('transformers')
127 | self.install_package('torch')
128 |
129 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
130 |
131 | import warnings
132 | from transformers import pipeline, AutoModelForCausalLM, AutoTokenizer
133 |
134 | warnings.filterwarnings("ignore", category=FutureWarning)
135 |
136 | self.provider = 'huggingface'
137 | self.model = input("\nEnter the Huggingface model name: ") if not reinitialize else self.model
138 |
139 | try:
140 | self.tokenizer = AutoTokenizer.from_pretrained(self.model)
141 | self.model_hf = AutoModelForCausalLM.from_pretrained(self.model)
142 | self.pipeline = pipeline('text-generation', model=self.model_hf, tokenizer=self.tokenizer, device=0 if self.device == 'gpu' else -1)
143 |
144 | except OSError as e:
145 | print("\nError loading model: ", e)
146 | print("\nPlease enter a valid Huggingface model name.")
147 | self.init_huggingface_client(reinitialize=False) # prompt for a new model name instead of retrying the same one forever
148 |
149 | self.save_state()
150 |
151 | def validate_openai_key(self):
152 | """Validate the OpenAI API key by attempting a small request."""
153 | response = self.client.chat.completions.create(
154 | model=self.model,
155 | messages=[{"role": "user", "content": "Test prompt, do nothing"}],
156 | max_tokens=5
157 | )
158 | print("API key and model are valid.")
159 |
160 | def infer(self, prompt):
161 | self.history.append({"role": "user", "content": prompt})
162 | input_prompt = self.history_to_prompt()
163 |
164 | if self.provider == 'OpenAI':
165 | response = self.client.chat.completions.create(
166 | model=self.model,
167 | messages=input_prompt,
168 | max_tokens = self.max_tokens
169 | )
170 | output = response.choices[0].message.content.strip()
171 |
172 | elif self.provider == 'huggingface':
173 | responses = self.pipeline(input_prompt, max_new_tokens=self.max_tokens, return_full_text=False) # return only the newly generated text, not the prompt
174 | output = responses[0]['generated_text'].strip()
175 |
176 | self.history.append({"role": "assistant", "content": output})
177 | print(">> ", output)
178 |
179 | def interactive_session(self, prompt_input):
180 | print("\nInteractive session started. Type 'end chat' to exit from the session!\n")
181 |
182 | if self.continue_chat:
183 | print('>> ', self.history[-1]['content'])
184 | else:
185 | initial_prompt = prompt_input
186 | self.infer(initial_prompt)
187 |
188 | while True:
189 | prompt = input("\n>> ")
190 | if prompt.lower() == 'end chat':
191 | break
192 | self.infer(prompt)
193 | self.save_state()
194 |
195 | def history_to_prompt(self):
196 | if self.provider == 'OpenAI':
197 | return self.history
198 | elif self.provider == 'huggingface':
199 | return " ".join([item["content"] for item in self.history])
200 |
201 | if __name__ == "__main__":
202 | client = KrsGPTClient(reinitialize=False)
203 | # client.interactive_session("You are an 8th grade math tutor. Ask questions to gauge my expertise so that you can generate a training plan for me.")
204 |
205 |
--------------------------------------------------------------------------------
/mkc.md:
--------------------------------------------------------------------------------
1 | # Install and Configure KRS with Minikube
2 |
3 | ## Prerequisites
4 |
5 | - Podman, Docker, or VirtualBox (container runtime or hypervisor for Minikube)
6 | - kubectl
7 |
8 | ## Getting Started
9 |
10 | ### 1. Set up a Minikube Kubernetes Cluster on your Local Machine
11 | ```
12 | curl -LO https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64
13 | sudo install minikube-linux-amd64 /usr/local/bin/minikube && rm minikube-linux-amd64
14 | minikube start
15 | ```
16 | 
17 |
18 | ### 2. Install KRS using these commands:
19 |
20 | ```
21 | git clone https://github.com/kubetoolsca/krs.git
22 | cd krs
23 | pip install .
24 | ```
25 |
26 | ### 3. Initialize KRS and give it access to your cluster using the given command:
27 |
28 | ```
29 | krs init
30 | ```
31 |
32 | ### 4. View all available KRS commands by running the given command:
33 | ```
34 | krs --help
35 | ```
36 |
37 | ```
38 | krs --help
39 |
40 | Usage: krs [OPTIONS] COMMAND [ARGS]...
41 |
42 | krs: A command line interface to scan your Kubernetes Cluster, detect errors,
43 | provide resolutions using LLMs and recommend latest tools for your cluster
44 |
45 | ╭─ Options ────────────────────────────────────────────────────────────────────╮
46 | │ --install-completion Install completion for the current shell. │
47 | │ --show-completion Show completion for the current shell, to copy │
48 | │ it or customize the installation. │
49 | │ --help Show this message and exit. │
50 | ╰──────────────────────────────────────────────────────────────────────────────╯
51 | ╭─ Commands ───────────────────────────────────────────────────────────────────╮
52 | │ exit Ends krs services safely and deletes all state files from │
53 | │ system. Removes all cached data. │
54 | │ export Exports pod info with logs and events. │
55 | │ health Starts an interactive terminal using an LLM of your choice to │
56 | │ detect and fix issues with your cluster │
57 | │ init Initializes the services and loads the scanner. │
58 | │ namespaces Lists all the namespaces. │
59 | │ pods Lists all the pods with namespaces, or lists pods under a │
60 | │ specified namespace. │
61 | │ recommend Generates a table of recommended tools from our ranking │
62 | │ database and their CNCF project status. │
63 | │ scan Scans the cluster and extracts a list of tools that are │
64 | │ currently used. │
65 | ╰──────────────────────────────────────────────────────────────────────────────╯
66 | ```
67 | ### 5. Let KRS detect the tools used in your cluster by running the given command:
68 |
69 | ```
70 | krs scan
71 | ```
72 |
73 | ```
74 | krs scan
75 |
76 | Scanning your cluster...
77 |
78 | Cluster scanned successfully...
79 |
80 | Extracted tools used in cluster...
81 |
82 |
83 | The cluster is using the following tools:
84 |
85 | +-------------+--------+------------------+---------------+
86 | | Tool Name | Rank | Category | CNCF Status |
87 | +=============+========+==================+===============+
88 | | cilium | 1 | Network Policies | graduated |
89 | +-------------+--------+------------------+---------------+
90 | | hubble | 7 | Security Tools | listed |
91 | +-------------+--------+------------------+---------------+
92 |
93 | ```
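
The `Rank` column above reflects how tools are ordered within a category by GitHub stars, which is what `krs/utils/fetch_tools_krs.py` does with the `githubStars` field. The sketch below illustrates that idea only; the helper names `to_stars` and `rank_tools`, the tool names, and the star counts are made up for this example and are not part of KRS.

```python
def to_stars(tool):
    """Return the tool's GitHub star count as a float, or 0.0 if it is unusable."""
    try:
        return float(tool.get("githubStars", 0))
    except (TypeError, ValueError):
        return 0.0


def rank_tools(categories):
    """Map each tool name to its 1-based rank in every category it appears in."""
    ranks = {}
    for category in categories:
        # Most-starred tool first; ties keep their original order
        ordered = sorted(category["tools"], key=to_stars, reverse=True)
        for position, tool in enumerate(ordered, start=1):
            ranks.setdefault(tool["name"].lower(), []).append(
                {"rank": position, "category": category["category"]["name"]}
            )
    return ranks


# Hypothetical data: names and star counts are invented for illustration.
sample = [
    {
        "category": {"name": "Network Policies"},
        "tools": [
            {"name": "tool-a", "githubStars": "1200"},
            {"name": "tool-b", "githubStars": 18000},
            {"name": "tool-c", "githubStars": None},
        ],
    }
]

ranks = rank_tools(sample)
print(ranks["tool-b"])
```

A tool with a missing or malformed star count is treated as having zero stars here, so it sinks to the bottom of its category instead of stopping the ranking.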
94 |
95 | ### 6. Get recommendations on possible tools to use in your cluster by running the given command
96 |
97 | ```
98 | krs recommend
99 | ```
100 |
101 | ```
102 | krs recommend
103 |
104 | Our recommended tools for this deployment are:
105 |
106 | +------------------+------------------------+-------------+---------------+
107 | | Category | Recommendation | Tool Name | CNCF Status |
108 | +==================+========================+=============+===============+
109 | | Network Policies | Already using the best | cilium | graduated |
110 | +------------------+------------------------+-------------+---------------+
111 | | Security Tools | Recommended tool | trivy | listed |
112 | +------------------+------------------------+-------------+---------------+
113 |
114 | ```
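
One plausible way to arrive at a table like the one above is to compare each tool in use with the top-ranked tool of its category. The sketch below is an assumption about the approach, not the code KRS ships: the function `recommend`, the simplified `category_ranks` mapping, and the tool `example-cni` are hypothetical.

```python
def recommend(tools_in_use, category_ranks):
    """Return (category, recommendation, tool) rows for every category in use."""
    rows = []
    for category, used_tool in tools_in_use.items():
        best_tool = category_ranks[category][1]  # rank 1 is the top tool
        if used_tool == best_tool:
            rows.append((category, "Already using the best", used_tool))
        else:
            rows.append((category, "Recommended tool", best_tool))
    return rows


# Hypothetical ranking data, shaped as {category: {rank: tool name}}.
category_ranks = {
    "Network Policies": {1: "cilium", 2: "example-cni"},
    "Security Tools": {1: "trivy", 7: "hubble"},
}
tools_in_use = {"Network Policies": "cilium", "Security Tools": "hubble"}

rows = recommend(tools_in_use, category_ranks)
for row in rows:
    print(row)
```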
115 |
116 | ### 7. Check the health of a pod in a namespace of your choice, including errors in its logs.
117 |
118 | ```
119 | krs health
120 | ```
121 |
122 | ```
123 | krs health
124 |
125 | Starting interactive terminal...
126 |
127 |
128 | Choose the model provider for healthcheck:
129 |
130 | [1] OpenAI
131 | [2] Huggingface
132 |
133 | >> 1
134 |
135 | Installing necessary libraries..........
136 |
137 | openai is already installed.
138 |
139 | Enter your OpenAI API key: sk-proj-qxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxP
140 |
141 | Enter the OpenAI model name: gpt-3.5-turbo
142 | API key and model are valid.
143 |
144 | Namespaces in the cluster:
145 |
146 | 1. default
147 | 2. kube-node-lease
148 | 3. kube-public
149 | 4. kube-system
150 | 5. portainer
151 |
152 | Which namespace do you want to check the health for? Select a namespace by entering its number: >> 4
153 |
154 | Pods in the namespace kube-system:
155 |
156 | 1. cilium-9lqbq
157 | 2. cilium-ffpct
158 | 3. cilium-pvknr
159 | 4. coredns-85f59d8784-nvr2n
160 | 5. coredns-85f59d8784-p9jcv
161 | 6. cpc-bridge-proxy-c6xzr
162 | 7. cpc-bridge-proxy-p7r4p
163 | 8. cpc-bridge-proxy-tkfrd
164 | 9. csi-do-node-hwxn7
165 | 10. csi-do-node-q27rc
166 | 11. csi-do-node-rn7dm
167 | 12. do-node-agent-6t5ms
168 | 13. do-node-agent-85r8b
169 | 14. do-node-agent-m7bvr
170 | 15. hubble-relay-74686df4df-856pj
171 | 16. hubble-ui-86cc69bddc-xc745
172 | 17. konnectivity-agent-9k8vk
173 | 18. konnectivity-agent-h5fm2
174 | 19. konnectivity-agent-kf4xh
175 | 20. kube-proxy-94945
176 | 21. kube-proxy-qgv4j
177 | 22. kube-proxy-vztzf
178 |
179 | Which pod from kube-system do you want to check the health for? Select a pod by entering its number: >> 1
180 |
181 | Checking status of the pod...
182 |
183 | Extracting logs and events from the pod...
184 |
185 | Logs and events from the pod extracted successfully!
186 |
187 |
188 | Interactive session started. Type 'end chat' to exit from the session!
189 |
190 | >> The log entries provided are empty {}, so there is nothing to analyze. Therefore, I can confirm that 'Everything looks good!' in this case.
191 |
192 | If there were warnings or errors in the log entries, I would have analyzed them thoroughly to identify the root cause. Depending on the specific warnings or errors, potential steps to resolve the issues could include:
193 |
194 | 1. Analyzing the specific error message to understand the problem
195 | 2. Checking Kubernetes resources (e.g., pods, deployments, configmaps) for any misconfigurations
196 | 3. Verifying connectivity to external resources or dependencies
197 | 4. Checking for resource limitations or constraints that could be causing issues
198 | 5. Reviewing recent changes in the Kubernetes environment that could have introduced problems
199 | 6. Using Kubernetes troubleshooting tools like kubectl logs, describe, or events to gather more information
200 |
201 | By following these steps and addressing any identified issues, you can resolve warnings or errors in the Kubernetes environment.
202 |
203 | >> Wonderful, anything else to note?
204 | >> In addition to resolving warnings or errors in Kubernetes logs, it's important to regularly monitor and maintain the Kubernetes environment to ensure smooth operation. Some best practices for Kubernetes maintenance include:
205 |
206 | 1. Regularly updating Kubernetes components and cluster nodes to the latest stable versions to benefit from bug fixes and security patches.
207 | 2. Implementing automated backups of critical data and configurations to prevent data loss in case of failures.
208 | 3. Monitoring resource utilization and scaling components as needed to optimize performance and cost efficiency.
209 | 4. Implementing security best practices, such as network policies, RBAC, and pod security policies to protect the cluster from unauthorized access or malicious activities.
210 | 5. Conducting regular health checks and performance tuning to identify and address any bottlenecks or inefficiencies in the cluster.
211 | 6. Developing a disaster recovery plan and testing it periodically to ensure business continuity in case of unexpected events.
212 |
213 | By following these maintenance practices, you can ensure the stability, scalability, and security of your Kubernetes environment for optimal DevOps operations.
214 |
215 | >> Alright, so are we done?
216 | >> Yes, we have covered the analysis of the Kubernetes log entries and discussed best practices for Kubernetes maintenance. If you have any more questions or need further assistance, feel free to ask. Otherwise, we can consider this conversation complete.
217 |
218 | >> Wonderful. Thanks! end chat
219 | >> You're welcome! If you have any more questions in the future, feel free to reach out. Have a great day! Goodbye!
220 |
221 | >> end chat
222 |
223 |
224 | ```
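
Before the chat starts, KRS reduces the pod logs to warning and error entries, drops near-duplicates, and wraps the result in a prompt (see `extract_log_entries` and `create_prompt` in the repository). The sketch below is a simplified illustration of that flow, not the shipped code: it handles only klog-style lines, keeps the first of two similar entries, and the names `KLOG_LINE`, `extract_entries`, `build_prompt` and the sample log are assumptions made for this example.

```python
import re
from difflib import SequenceMatcher

# klog-style line: level letter, MMDD, time, thread id, then the message
KLOG_LINE = re.compile(r"^([WE])\d{4} \d{2}:\d{2}:\d{2}\.\d+\s+\d+\s+(.*)")


def extract_entries(log_text, threshold=0.85):
    """Keep warning/error lines and skip entries that nearly repeat a kept one."""
    entries = []
    for line in log_text.splitlines():
        match = KLOG_LINE.search(line)
        if not match:
            continue  # info lines and unrecognised formats are ignored
        level = "Error" if match.group(1) == "E" else "Warning"
        entry = f"{level}: {match.group(2).strip()}"
        if all(SequenceMatcher(None, entry, kept).ratio() <= threshold for kept in entries):
            entries.append(entry)
    return entries


def build_prompt(entries):
    """Wrap the entries in braces so the model can tell where the logs start and end."""
    body = "\n".join(sorted(entries))
    return "Analyze the following log entries:\n{\n" + body + "\n}"


# Hypothetical log excerpt, invented for illustration.
sample_log = """I0612 10:15:01.000001       1 server.go:10] started
E0612 10:15:02.000002       1 pod.go:42] failed to pull image "example:1.0"
E0612 10:15:03.000003       1 pod.go:42] failed to pull image "example:1.1"
W0612 10:15:04.000004       1 probe.go:7] readiness probe timed out
"""

entries = extract_entries(sample_log)
prompt = build_prompt(entries)
print(prompt)
```

Filtering and deduplicating first keeps the prompt short, which matters because long logs can easily exceed the context an LLM accepts. When no entries survive, the braces are empty, which matches the "log entries provided are empty" reply in the session above.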
225 |
--------------------------------------------------------------------------------
/samples/install-tools.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | # Add the Kubeshark Helm repository, then refresh the local cache
4 | helm repo add kubeshark https://helm.kubeshark.co
5 | helm repo update
6 |
7 | # Install Kubeshark
8 | helm install kubeshark kubeshark/kubeshark
9 |
10 | ## Installing Portainer
11 |
12 | kubectl create namespace portainer
13 | helm repo add portainer https://portainer.github.io/k8s/
14 | helm repo update
15 |
16 | # Dry run Portainer installation to see what gets installed
17 | helm install --dry-run --debug portainer -n portainer portainer/portainer
18 |
19 | # Install Portainer
20 | helm upgrade -i -n portainer portainer portainer/portainer
21 |
22 |
23 | echo "Kubeshark and Portainer installed successfully!"
24 |
--------------------------------------------------------------------------------
/samples/uninstall-tools.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | # Function to uninstall a Helm release
4 | uninstall_helm_release() {
5 | local release_name="$1"
6 | helm uninstall "$release_name" || true # Suppress errors if release not found
7 | }
8 |
9 | # Update Helm repository cache
10 | helm repo update
11 |
12 |
13 | # Uninstall Kubeshark
14 | uninstall_helm_release kubeshark
15 |
16 | # Uninstall Portainer
17 | helm uninstall portainer -n portainer || true # Portainer was installed into the portainer namespace
18 |
19 | ## deleting the namespaces
20 | kubectl delete ns portainer --ignore-not-found
21 | kubectl delete ns kubeshark --ignore-not-found
22 |
23 | echo "Kubeshark and Portainer uninstalled (if previously installed)."
24 |
--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
1 | from setuptools import setup, find_packages
2 |
3 | # Read the requirements.txt file for dependencies
4 | with open('krs/requirements.txt') as f:
5 | requirements = f.read().splitlines()
6 |
7 | setup(
8 | name='krs',
9 | version='0.1.0',
10 | description='Kubernetes Recommendation Service with LLM integration',
11 | author='Abhijeet Mazumdar, Karan Singh & Ajeet Singh Raina',
12 | author_email='abhijeet@kubetools.ca, karan@kubetools.ca, ajeet@kubetools.ca',
13 | url='https://github.com/kubetoolsca/krs',
14 | packages=find_packages(),
15 | include_package_data=True,
16 | install_requires=requirements,
17 | entry_points={
18 | 'console_scripts': [
19 | 'krs=krs.krs:app', # Adjust the module and function path as needed
20 | ],
21 | },
22 | classifiers=[
23 | 'Programming Language :: Python :: 3',
24 | 'License :: OSI Approved :: MIT License',
25 | 'Operating System :: OS Independent',
26 | ],
27 | python_requires='>=3.8', # requests 2.32 no longer supports Python 3.7 and older
28 | )
29 |
--------------------------------------------------------------------------------