├── README.md
├── auditing.ipynb
├── fine_tune.ipynb
├── fine_tune.txt
├── solodit.json
└── training_data.pkl
/README.md:
--------------------------------------------------------------------------------
1 | # Audit GPT
2 | Fine-tuning GPT-3/4 for Smart Contract Auditing.
3 |
4 |
5 | ## Running
6 | Get an API key from OpenAI and set it as an environment variable. Fine-tuning costs around $16.
7 | ```
8 | export OPENAI_API_KEY=sk-XXXXXXXX
9 | ```
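Both notebooks read the key with `os.getenv("OPENAI_API_KEY")`, which silently returns `None` when the variable is missing. A minimal sketch of an early check follows; the helper name `require_api_key` and its `env` parameter are hypothetical and not part of this repository.

```python
import os


def require_api_key(env=None, name="OPENAI_API_KEY"):
    """Return the API key from the environment, or raise a clear error.

    `env` defaults to os.environ; passing a dict makes the check easy to test.
    This helper is illustrative only and does not exist in the notebooks.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running the notebooks")
    return key
```

Calling `require_api_key()` at the top of a notebook would surface a missing key before any paid API request is made.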
10 |
11 | ## Files
12 | * `fine_tune.ipynb` - Code for crawling & preparing data and fine-tuning OpenAI models.
13 | * `auditing.ipynb` - Code for using the fine-tuned model to perform an audit.
14 |
15 | ## Example
16 |
17 | Let the fine-tuned GPT audit the [Kairos Auction Facet](https://github.com/sherlock-audit/2023-02-kairos/blob/main/kairos-contracts/src/AuctionFacet.sol) from the Sherlock competition:
18 |
19 | | **False Positives** | **9** |
20 | |---------------------|-------|
21 | | **False Negatives** | **1** |
22 |
23 | #### Details
24 | ```solidity
25 | contract AuctionFacet is IAuctionFacet, SafeMint {
26 | using RayMath for Ray;
27 | using RayMath for uint256;
28 | using Erc20CheckedTransfer for IERC20;
29 |
30 | /// @notice buy one or multiple NFTs in liquidation
31 | ```
32 | 🤖️ the method buy() doesn’t check that the sender of the transaction is the winner of the auction. This allows the attacker to steal all of the funds in the contract.
33 | ```solidity
34 | /// @param args arguments on what and how to buy
35 | function buy(BuyArg[] memory args) external {
36 | for (uint256 i = 0; i < args.length; i++) {
37 | useLoan(args[i]);
38 | }
39 | }
40 | ```
41 | 🤖️ malicious user can buy tokens for the lender multiple times, draining their stake.
42 | ```solidity
43 |
44 | /// @notice gets the price to buy the underlying collateral of the loan
45 | /// @param loanId identifier of the loan
46 | /// @return price computed price
47 | function price(uint256 loanId) public view returns (uint256) {
48 | Loan storage loan = protocolStorage().loan[loanId];
49 | ```
50 | 🤖️ the method price() returns the wrong price of the collateral ETH when the loan is in the Erased state. This allows a malicious lender to buy back the collateral ETH at a low price before it is erased from their wallet.
51 | ```solidity
52 | uint256 loanEndDate = loan.endDate;
53 | uint256 timeSinceLiquidable = block.timestamp - loanEndDate;
54 |
55 | checkLoanStatus(loanId);
56 |
57 | /* the decreasing factor controls the evolution of the price from its initial value to 0 (and staying at 0)
58 | ```
59 | 🤖️ Liquidable loans can be stuck in the endDate state, so that timeSinceLiquidable will always be greater than block.timestamp, and the borrower will not be able to pay back the loan before its end date. This can cause the borrower’s collateral to be stuck in the EndDate state forever, and can lead to a loss of principal for the lender.
60 | ```solidity
61 | over the course of the auction duration */
62 | Ray decreasingFactor = timeSinceLiquidable >= loan.auction.duration
63 | ? ZERO
64 | : ONE.sub(timeSinceLiquidable.div(loan.auction.duration));
65 |
66 | /* the estimated value arises from the mean of the loan offer loanToValues used in the loan regarding their
67 | ```
68 | 🤖️ the estimated liquidable stake is wrong, which can be used to frontrun the bidders.
69 | ```solidity
70 | share in the collateral usage. This must stay consitent even if less than the full value of the NFT has been
71 | used as collateral */
72 | uint256 estimatedValue = loan.lent.mul(ONE.div(loan.shareLent));
73 |
74 | /* by mutliplying the estimated price by some factor and slowly decreasing this price over time we aim to
75 | make sure a liquidator will buy the NFT at fair market price. */
76 | ```
77 | 🤖️ the liquidator may not buy the collateral token for its fair value, but instead buy it for much less than its fair value, allowing the borrower to profit from the liquidator’s mistake.
78 | ```solidity
79 | return estimatedValue.mul(loan.auction.priceFactor).mul(decreasingFactor);
80 | }
81 |
82 | /// @notice handles buying one NFT
83 | /// @param arg arguments on what and how to buy
84 | function useLoan(BuyArg memory arg) internal {
85 | ```
86 | 🤖️ the user can lose their NFTs when they buy a loan.
87 | ```solidity
88 | Loan storage loan = protocolStorage().loan[arg.loanId];
89 |
90 | checkLoanStatus(arg.loanId);
91 | uint256 toPay = price(arg.loanId);
92 |
93 | /* store as liquidated and paid before transfers to avoid malicious reentrency, following
94 | ```
95 | 🤖️ Users can take the ETH in the collateral of a loan without paying back the loan.
96 | ```solidity
97 | checks-effects-interaction pattern */
98 | loan.payment.liquidated = true;
99 | loan.payment.paid = toPay;
100 | loan.assetLent.checkedTransferFrom(msg.sender, address(this), toPay);
101 | loan.collateral.implem.safeTransferFrom(address(this), arg.to, loan.collateral.id);
102 |
103 | ```
104 | 🤖️ the user’s NFT could be stuck in the contract when the user tries to liquidate the loan.
105 | ```solidity
106 | emit Buy(arg.loanId, abi.encode(arg));
107 | }
108 |
109 | /// @notice checks that loan is liquidable, revert if not
110 | /// @param loanId identifier of the loan
111 | function checkLoanStatus(uint256 loanId) internal view {
112 | ```
113 | 🤖️ Liquidation can be bypassed by calling buy on a loan that has already been liquidated. This allows users to purchase a loan that they should not be able to purchase, allowing them to steal funds from the borrower.
114 | ```solidity
115 | Loan storage loan = protocolStorage().loan[loanId];
116 |
117 | if (block.timestamp < loan.endDate) {
118 | revert CollateralIsNotLiquidableYet(loan.endDate, loanId);
119 | }
120 | if (loan.payment.paid != 0 || loan.payment.liquidated) {
121 | ```
122 | 🤖️ Liquidation of collateral does not update endDate, so users can take out a new loan in the old endDate, allowing them to take out multiple loans with the same collateral.
123 | ```solidity
124 | revert LoanAlreadyRepaid(loanId);
125 | }
126 | }
127 | ```
128 | 🤖️ User’s loan can be maliciously reported as already repaid so that the user can’t repay the loan again. This allows users to take NFTs out of their account multiple times.
129 |
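The findings above come from feeding the contract to the model in small chunks, each followed by the suffix `The vulnerability is:`. The following is a minimal sketch of that chunking, modeled on `parse_solidity` in `auditing.ipynb`. The names `split_into_segments`, `build_prompt`, and `max_lines` are illustrative assumptions, and the sketch leaves out the whitespace stripping that the notebook applies to each prompt.

```python
def split_into_segments(source, max_lines=6):
    """Split Solidity source into small chunks using a rough brace-depth count."""
    depth = 0
    current = []
    segments = []
    for line in source.split("\n"):
        # Rough heuristic: the depth changes by at most one level per line,
        # so a line such as `import {A} from "a";` leaves it unchanged.
        if "{" in line:
            depth += 1
        if "}" in line:
            depth -= 1
        if depth != 0:
            # Only lines inside a contract body are collected.
            current.append(line)
            if len(current) >= max_lines:
                segments.append("\n".join(current))
                current = []
    # Keep a trailing chunk only if it has more than one line.
    if len(current) > 1:
        segments.append("\n".join(current))
    return segments


def build_prompt(segment):
    """Append the completion suffix that the fine-tuned model is prompted with."""
    return segment + "\n\nThe vulnerability is:"
```

Because chunks are cut by line count rather than by function boundary, a chunk can end in the middle of a comment or function, which is why several of the excerpts above stop partway through a block.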
--------------------------------------------------------------------------------
/auditing.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "code",
5 | "execution_count": 19,
6 | "metadata": {},
7 | "outputs": [],
8 | "source": [
9 | "# change the model name to your own model name!\n",
10 | "MODEL_NAME = \"davinci:ft-personal-2023-04-23-04-39-34\""
11 | ]
12 | },
13 | {
14 | "cell_type": "code",
15 | "execution_count": null,
16 | "metadata": {},
17 | "outputs": [],
18 | "source": [
19 | "import os\n",
20 | "import openai\n",
21 | "openai.api_key = os.getenv(\"OPENAI_API_KEY\")\n",
22 | "\n",
23 | "def find_vulnerabilities(prompt):\n",
24 | " prompt = prompt.replace(\" \", \"\").replace(\"\\t\", \"\") + \"\\n\\nThe vulnerability is:\"\n",
25 | " response = openai.Completion.create( \n",
26 | " model=MODEL_NAME, \n",
27 | " prompt=prompt, \n",
28 | " temperature=0, \n",
29 | " max_tokens=1024, \n",
30 | " top_p=1, \n",
31 | " frequency_penalty=0.5, \n",
32 | " presence_penalty=0, stop=[\"\\n\", \" User:\", \" AI:\"] \n",
33 | " )\n",
34 | " return response[\"choices\"][0][\"text\"]"
35 | ]
36 | },
37 | {
38 | "cell_type": "code",
39 | "execution_count": 20,
40 | "metadata": {},
41 | "outputs": [],
42 | "source": [
43 | "### https://github.com/sherlock-audit/2023-02-kairos/blob/main/kairos-contracts/src/AuctionFacet.sol\n",
44 | "contract = \"\"\"\n",
45 | "// SPDX-License-Identifier: UNLICENSED\n",
46 | "pragma solidity 0.8.18;\n",
47 | "\n",
48 | "import {IERC20} from \"@openzeppelin/contracts/token/ERC20/IERC20.sol\";\n",
49 | "\n",
50 | "import {IAuctionFacet} from \"./interface/IAuctionFacet.sol\";\n",
51 | "\n",
52 | "import {BuyArg, NFToken, Ray} from \"./DataStructure/Objects.sol\";\n",
53 | "import {Loan, Protocol, Provision, SupplyPosition} from \"./DataStructure/Storage.sol\";\n",
54 | "import {RayMath} from \"./utils/RayMath.sol\";\n",
55 | "import {Erc20CheckedTransfer} from \"./utils/Erc20CheckedTransfer.sol\";\n",
56 | "import {SafeMint} from \"./SupplyPositionLogic/SafeMint.sol\";\n",
57 | "import {protocolStorage, supplyPositionStorage, ONE, ZERO} from \"./DataStructure/Global.sol\";\n",
58 | "// solhint-disable-next-line max-line-length\n",
59 | "import {LoanAlreadyRepaid, CollateralIsNotLiquidableYet} from \"./DataStructure/Errors.sol\";\n",
60 | "\n",
61 | "/// @notice handles sale of collaterals being liquidated, following a dutch auction starting at repayment date\n",
62 | "contract AuctionFacet is IAuctionFacet, SafeMint {\n",
63 | " using RayMath for Ray;\n",
64 | " using RayMath for uint256;\n",
65 | " using Erc20CheckedTransfer for IERC20;\n",
66 | "\n",
67 | " /// @notice buy one or multiple NFTs in liquidation\n",
68 | " /// @param args arguments on what and how to buy\n",
69 | " function buy(BuyArg[] memory args) external {\n",
70 | " for (uint256 i = 0; i < args.length; i++) {\n",
71 | " useLoan(args[i]);\n",
72 | " }\n",
73 | " }\n",
74 | "\n",
75 | " /// @notice gets the price to buy the underlying collateral of the loan\n",
76 | " /// @param loanId identifier of the loan\n",
77 | " /// @return price computed price\n",
78 | " function price(uint256 loanId) public view returns (uint256) {\n",
79 | " Loan storage loan = protocolStorage().loan[loanId];\n",
80 | " uint256 loanEndDate = loan.endDate;\n",
81 | " uint256 timeSinceLiquidable = block.timestamp - loanEndDate;\n",
82 | "\n",
83 | " checkLoanStatus(loanId);\n",
84 | "\n",
85 | " /* the decreasing factor controls the evolution of the price from its initial value to 0 (and staying at 0)\n",
86 | " over the course of the auction duration */\n",
87 | " Ray decreasingFactor = timeSinceLiquidable >= loan.auction.duration\n",
88 | " ? ZERO\n",
89 | " : ONE.sub(timeSinceLiquidable.div(loan.auction.duration));\n",
90 | "\n",
91 | " /* the estimated value arises from the mean of the loan offer loanToValues used in the loan regarding their\n",
92 | " share in the collateral usage. This must stay consitent even if less than the full value of the NFT has been\n",
93 | " used as collateral */\n",
94 | " uint256 estimatedValue = loan.lent.mul(ONE.div(loan.shareLent));\n",
95 | "\n",
96 | " /* by mutliplying the estimated price by some factor and slowly decreasing this price over time we aim to\n",
97 | " make sure a liquidator will buy the NFT at fair market price. */\n",
98 | " return estimatedValue.mul(loan.auction.priceFactor).mul(decreasingFactor);\n",
99 | " }\n",
100 | "\n",
101 | " /// @notice handles buying one NFT\n",
102 | " /// @param arg arguments on what and how to buy\n",
103 | " function useLoan(BuyArg memory arg) internal {\n",
104 | " Loan storage loan = protocolStorage().loan[arg.loanId];\n",
105 | "\n",
106 | " checkLoanStatus(arg.loanId);\n",
107 | " uint256 toPay = price(arg.loanId);\n",
108 | "\n",
109 | " /* store as liquidated and paid before transfers to avoid malicious reentrency, following\n",
110 | " checks-effects-interaction pattern */\n",
111 | " loan.payment.liquidated = true;\n",
112 | " loan.payment.paid = toPay;\n",
113 | " loan.assetLent.checkedTransferFrom(msg.sender, address(this), toPay);\n",
114 | " loan.collateral.implem.safeTransferFrom(address(this), arg.to, loan.collateral.id);\n",
115 | "\n",
116 | " emit Buy(arg.loanId, abi.encode(arg));\n",
117 | " }\n",
118 | "\n",
119 | " /// @notice checks that loan is liquidable, revert if not\n",
120 | " /// @param loanId identifier of the loan\n",
121 | " function checkLoanStatus(uint256 loanId) internal view {\n",
122 | " Loan storage loan = protocolStorage().loan[loanId];\n",
123 | "\n",
124 | " if (block.timestamp < loan.endDate) {\n",
125 | " revert CollateralIsNotLiquidableYet(loan.endDate, loanId);\n",
126 | " }\n",
127 | " if (loan.payment.paid != 0 || loan.payment.liquidated) {\n",
128 | " revert LoanAlreadyRepaid(loanId);\n",
129 | " }\n",
130 | " }\n",
131 | "}\n",
132 | "\"\"\""
133 | ]
134 | },
135 | {
136 | "cell_type": "code",
137 | "execution_count": 21,
138 | "metadata": {},
139 | "outputs": [],
140 | "source": [
141 | "def parse_solidity(content):\n",
142 | "    brace_depth = 0\n",
143 | "    current_contract = []\n",
144 | "    contract_segments = []\n",
145 | "    for i in content.split(\"\\n\"):\n",
146 | "        if \"{\" in i:\n",
147 | "            brace_depth += 1\n",
148 | "        if \"}\" in i:\n",
149 | "            brace_depth -= 1\n",
150 | "        if brace_depth != 0:\n",
151 | "            current_contract.append(i)\n",
152 | "        if len(current_contract) > 5:\n",
153 | "            contract_segments.append(\"\\n\".join(current_contract))\n",
154 | "            current_contract = []\n",
155 | "    if len(current_contract) > 1:\n",
156 | "        contract_segments.append(\"\\n\".join(current_contract))\n",
157 | "    return contract_segments\n",
158 | "\n",
159 | "\n"
160 | ]
161 | },
162 | {
163 | "cell_type": "code",
164 | "execution_count": 22,
165 | "metadata": {},
166 | "outputs": [
167 | {
168 | "name": "stdout",
169 | "output_type": "stream",
170 | "text": [
171 | "==================================================\n",
172 | "contract AuctionFacet is IAuctionFacet, SafeMint {\n",
173 | " using RayMath for Ray;\n",
174 | " using RayMath for uint256;\n",
175 | " using Erc20CheckedTransfer for IERC20;\n",
176 | "\n",
177 | " /// @notice buy one or multiple NFTs in liquidation\n",
178 | "🤖️ the method buy() doesn’t check that the sender of the transaction is the winner of the auction. This allows the attacker to steal all of the funds in the contract.\n",
179 | "==================================================\n",
180 | "==================================================\n",
181 | " /// @param args arguments on what and how to buy\n",
182 | " function buy(BuyArg[] memory args) external {\n",
183 | " for (uint256 i = 0; i < args.length; i++) {\n",
184 | " useLoan(args[i]);\n",
185 | " }\n",
186 | " }\n",
187 | "🤖️ malicious user can buy tokens for the lender multiple times, draining their stake.\n",
188 | "==================================================\n",
189 | "==================================================\n",
190 | "\n",
191 | " /// @notice gets the price to buy the underlying collateral of the loan\n",
192 | " /// @param loanId identifier of the loan\n",
193 | " /// @return price computed price\n",
194 | " function price(uint256 loanId) public view returns (uint256) {\n",
195 | " Loan storage loan = protocolStorage().loan[loanId];\n",
196 | "🤖️ the method price() returns the wrong price of the collateral ETH when the loan is in the Erased state. This allows a malicious lender to buy back the collateral ETH at a low price before it is erased from their wallet.\n",
197 | "==================================================\n",
198 | "==================================================\n",
199 | " uint256 loanEndDate = loan.endDate;\n",
200 | " uint256 timeSinceLiquidable = block.timestamp - loanEndDate;\n",
201 | "\n",
202 | " checkLoanStatus(loanId);\n",
203 | "\n",
204 | " /* the decreasing factor controls the evolution of the price from its initial value to 0 (and staying at 0)\n",
205 | "🤖️ Liquidable loans can be stuck in the endDate state, so that timeSinceLiquidable will always be greater than block.timestamp, and the borrower will not be able to pay back the loan before its end date. This can cause the borrower’s collateral to be stuck in the EndDate state forever, and can lead to a loss of principal for the lender.\n",
206 | "==================================================\n",
207 | "==================================================\n",
208 | " over the course of the auction duration */\n",
209 | " Ray decreasingFactor = timeSinceLiquidable >= loan.auction.duration\n",
210 | " ? ZERO\n",
211 | " : ONE.sub(timeSinceLiquidable.div(loan.auction.duration));\n",
212 | "\n",
213 | " /* the estimated value arises from the mean of the loan offer loanToValues used in the loan regarding their\n",
214 | "🤖️ the liquidable state is wrongfully calculated, causing the bidders to overpay for the loan.\n",
215 | "==================================================\n",
216 | "==================================================\n",
217 | " share in the collateral usage. This must stay consitent even if less than the full value of the NFT has been\n",
218 | " used as collateral */\n",
219 | " uint256 estimatedValue = loan.lent.mul(ONE.div(loan.shareLent));\n",
220 | "\n",
221 | " /* by mutliplying the estimated price by some factor and slowly decreasing this price over time we aim to\n",
222 | " make sure a liquidator will buy the NFT at fair market price. */\n",
223 | "🤖️ the liquidator may not buy the collateral token for its fair value, but instead buy it for much less than its fair value, allowing the borrower to profit from the liquidator’s mistake.\n",
224 | "==================================================\n",
225 | "==================================================\n",
226 | " return estimatedValue.mul(loan.auction.priceFactor).mul(decreasingFactor);\n",
227 | " }\n",
228 | "\n",
229 | " /// @notice handles buying one NFT\n",
230 | " /// @param arg arguments on what and how to buy\n",
231 | " function useLoan(BuyArg memory arg) internal {\n",
232 | "🤖️ the user can lose their NFTs when they buy a loan.\n",
233 | "==================================================\n",
234 | "==================================================\n",
235 | " Loan storage loan = protocolStorage().loan[arg.loanId];\n",
236 | "\n",
237 | " checkLoanStatus(arg.loanId);\n",
238 | " uint256 toPay = price(arg.loanId);\n",
239 | "\n",
240 | " /* store as liquidated and paid before transfers to avoid malicious reentrency, following\n",
241 | "🤖️ Users can take the ETH in the collateral of a loan without paying back the loan.\n",
242 | "==================================================\n",
243 | "==================================================\n",
244 | " checks-effects-interaction pattern */\n",
245 | " loan.payment.liquidated = true;\n",
246 | " loan.payment.paid = toPay;\n",
247 | " loan.assetLent.checkedTransferFrom(msg.sender, address(this), toPay);\n",
248 | " loan.collateral.implem.safeTransferFrom(address(this), arg.to, loan.collateral.id);\n",
249 | "\n",
250 | "🤖️ the user’s NFT could be stuck in the contract when the user tries to liquidate the loan.\n",
251 | "==================================================\n",
252 | "==================================================\n",
253 | " emit Buy(arg.loanId, abi.encode(arg));\n",
254 | " }\n",
255 | "\n",
256 | " /// @notice checks that loan is liquidable, revert if not\n",
257 | " /// @param loanId identifier of the loan\n",
258 | " function checkLoanStatus(uint256 loanId) internal view {\n",
259 | "🤖️ Liquidation can be bypassed by taking a new loan with the same collateral before the old one is liquidated. This allows users to take multiple loans in a single transaction, which is not possible when using the Smart Contract implementation.\n",
260 | "==================================================\n",
261 | "==================================================\n",
262 | " Loan storage loan = protocolStorage().loan[loanId];\n",
263 | "\n",
264 | " if (block.timestamp < loan.endDate) {\n",
265 | " revert CollateralIsNotLiquidableYet(loan.endDate, loanId);\n",
266 | " }\n",
267 | " if (loan.payment.paid != 0 || loan.payment.liquidated) {\n",
268 | "🤖️ Liquidation of collateral does not update endDate, so users can take out a loan with no endDate and no payment, allowing them to profit from the liquidation fee without paying it.\n",
269 | "==================================================\n",
270 | "==================================================\n",
271 | " revert LoanAlreadyRepaid(loanId);\n",
272 | " }\n",
273 | " }\n",
274 | "🤖️ User’s loan can be maliciously reported as already repaid so that the user can’t repay the loan again. This allows users to take NFTs out of their account multiple times.\n",
275 | "==================================================\n"
276 | ]
277 | }
278 | ],
279 | "source": [
280 | "parsed = parse_solidity(contract)\n",
281 | "\n",
282 | "for i in parsed:\n",
283 | " print(\"=\" * 50)\n",
284 | " print(i)\n",
285 | " print(\"🤖️ \" + find_vulnerabilities(i))\n",
286 | " print(\"=\" * 50)"
287 | ]
288 | },
289 | {
290 | "cell_type": "code",
291 | "execution_count": null,
292 | "metadata": {},
293 | "outputs": [],
294 | "source": []
295 | }
296 | ],
297 | "metadata": {
298 | "kernelspec": {
299 | "display_name": "Python 3",
300 | "language": "python",
301 | "name": "python3"
302 | },
303 | "language_info": {
304 | "codemirror_mode": {
305 | "name": "ipython",
306 | "version": 3
307 | },
308 | "file_extension": ".py",
309 | "mimetype": "text/x-python",
310 | "name": "python",
311 | "nbconvert_exporter": "python",
312 | "pygments_lexer": "ipython3",
313 | "version": "3.10.8"
314 | },
315 | "orig_nbformat": 4
316 | },
317 | "nbformat": 4,
318 | "nbformat_minor": 2
319 | }
320 |
--------------------------------------------------------------------------------
/fine_tune.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "attachments": {},
5 | "cell_type": "markdown",
6 | "metadata": {},
7 | "source": [
8 | "## Crawl Code4rena reports from Solodit"
9 | ]
10 | },
11 | {
12 | "cell_type": "code",
13 | "execution_count": 160,
14 | "metadata": {},
15 | "outputs": [],
16 | "source": [
17 | "import requests\n",
18 | "\n",
19 | "def get_solodit(page):\n",
20 | " headers = {\n",
21 | " 'authorization': 'Token 36dc738e703c50039f3e6f03ee696730c49c54cb', # <- replace with your own token! You can find it in the network tab of your browser\n",
22 | " 'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/112.0.0.0 Safari/537.36',\n",
23 | " }\n",
24 | "\n",
25 | " params = {\n",
26 | " 'source': 'Code4rena',\n",
27 | " 'impact': 'HIGH,MEDIUM',\n",
28 | " 'finder': '',\n",
29 | " 'protocol': '',\n",
30 | " 'report_date': '',\n",
31 | " 'min_quality_score': '0',\n",
32 | " 'min_general_score': '0',\n",
33 | " 'tags': '',\n",
34 | " 'bookmarked': 'False',\n",
35 | " 'keyword': '',\n",
36 | " 'page': page,\n",
37 | " }\n",
38 | "\n",
39 | " response = requests.get('https://solodit.xyz/api/issues/rest/', params=params, headers=headers)\n",
40 | " return response.json()\n"
41 | ]
42 | },
43 | {
44 | "cell_type": "code",
45 | "execution_count": 161,
46 | "metadata": {},
47 | "outputs": [
48 | {
49 | "data": {
50 | "text/plain": [
51 | "124"
52 | ]
53 | },
54 | "execution_count": 161,
55 | "metadata": {},
56 | "output_type": "execute_result"
57 | }
58 | ],
59 | "source": [
60 | "total_pages = get_solodit(1)[\"total_pages\"]\n",
61 | "total_pages"
62 | ]
63 | },
64 | {
65 | "cell_type": "code",
66 | "execution_count": 234,
67 | "metadata": {},
68 | "outputs": [
69 | {
70 | "name": "stderr",
71 | "output_type": "stream",
72 | "text": [
73 | "/var/folders/wy/h6tpyrcn4szfs0598d0y5hhh0000gn/T/ipykernel_62742/2122698606.py:3: TqdmDeprecationWarning: This function will be removed in tqdm==5.0.0\n",
74 | "Please use `tqdm.notebook.tqdm` instead of `tqdm.tqdm_notebook`\n",
75 | " for i in tqdm.tqdm_notebook(range(1, total_pages)):\n"
76 | ]
77 | },
78 | {
79 | "data": {
80 | "application/vnd.jupyter.widget-view+json": {
81 | "model_id": "2acbbb3f231f43dc8d9dbd99f60db3d8",
82 | "version_major": 2,
83 | "version_minor": 0
84 | },
85 | "text/plain": [
86 | " 0%| | 0/123 [00:00, ?it/s]"
87 | ]
88 | },
89 | "metadata": {},
90 | "output_type": "display_data"
91 | }
92 | ],
93 | "source": [
94 | "import tqdm\n",
95 | "data = []\n",
96 | "for i in tqdm.tqdm_notebook(range(1, total_pages+1)):\n",
97 | "    try:\n",
98 | "        data += get_solodit(i)[\"results\"]\n",
99 | "    except Exception as e:\n",
100 | "        print(e)\n",
101 | "        print(f\"Error in page {i}\")"
102 | ]
103 | },
104 | {
105 | "cell_type": "code",
106 | "execution_count": 236,
107 | "metadata": {},
108 | "outputs": [],
109 | "source": [
110 | "import json\n",
111 | "\n",
112 | "with open(\"solodit.json\", \"w\") as f:\n",
113 | " json.dump(data, f)"
114 | ]
115 | },
116 | {
117 | "attachments": {},
118 | "cell_type": "markdown",
119 | "metadata": {},
120 | "source": [
121 | "## Parse Markdown Data & GitHub LoC"
122 | ]
123 | },
124 | {
125 | "cell_type": "code",
126 | "execution_count": 237,
127 | "metadata": {},
128 | "outputs": [],
129 | "source": [
130 | "import re\n",
131 | "def parse(content):\n",
132 | " loc_and_vuln = content[\"content\"].split(\"# Vulnerability details\")\n",
133 | " loc = loc_and_vuln[0]\n",
134 | " locs = re.findall(r\"https://github.com/(.+?)/blob/(.+?)/(.+?)#(.+)\", loc)\n",
135 | " vuln = \" \".join(content[\"title\"].split(\"] \")[1:])\n",
136 | " return locs, vuln"
137 | ]
138 | },
139 | {
140 | "attachments": {},
141 | "cell_type": "markdown",
142 | "metadata": {},
143 | "source": [
144 | "## Crawl Code From GitHub"
145 | ]
146 | },
147 | {
148 | "cell_type": "code",
149 | "execution_count": 238,
150 | "metadata": {},
151 | "outputs": [],
152 | "source": [
153 | "import functools\n",
154 | "\n",
155 | "@functools.lru_cache(maxsize=1000)\n",
156 | "def fetch_github(url):\n",
157 | " response = requests.get(url)\n",
158 | " return response.text\n",
159 | "\n",
160 | "def crawl(repo, commit_hash, file_name, line_number):\n",
161 | "    line_number = line_number.replace(\"L\", \"\")\n",
162 | "    url = f\"https://raw.githubusercontent.com/{repo}/{commit_hash}/{file_name}\"\n",
163 | "    content = fetch_github(url)\n",
164 | "    lines = content.split(\"\\n\")\n",
165 | "    if \"-\" in line_number:\n",
166 | "        start, end = line_number.split(\"-\")\n",
167 | "        start = max(0, int(start) - 1)  # GitHub line anchors are 1-indexed\n",
168 | "        end = min(len(lines), int(end) + 1)\n",
169 | "    else:\n",
170 | "        line_number = int(line_number)\n",
171 | "        start = max(0, line_number - 1)\n",
172 | "        end = min(len(lines), line_number + 2)\n",
173 | "\n",
174 | "    return \"\\n\".join(lines[start:end])"
175 | ]
176 | },
177 | {
178 | "cell_type": "code",
179 | "execution_count": 242,
180 | "metadata": {},
181 | "outputs": [
182 | {
183 | "name": "stderr",
184 | "output_type": "stream",
185 | "text": [
186 | "/var/folders/wy/h6tpyrcn4szfs0598d0y5hhh0000gn/T/ipykernel_62742/4115993144.py:1: TqdmDeprecationWarning: This function will be removed in tqdm==5.0.0\n",
187 | "Please use `tqdm.notebook.tqdm` instead of `tqdm.tqdm_notebook`\n",
188 | " for idx, i in tqdm.tqdm_notebook(enumerate(data)):\n"
189 | ]
190 | },
191 | {
192 | "data": {
193 | "application/vnd.jupyter.widget-view+json": {
194 | "model_id": "86a96b6e4bed4fe6a67f0c40833d7500",
195 | "version_major": 2,
196 | "version_minor": 0
197 | },
198 | "text/plain": [
199 | "0it [00:00, ?it/s]"
200 | ]
201 | },
202 | "metadata": {},
203 | "output_type": "display_data"
204 | },
205 | {
206 | "name": "stdout",
207 | "output_type": "stream",
208 | "text": [
209 | "invalid literal for int() with base 10: '199), the cidData is written directly, without checking and handling the case that a previously added nft may not have been removed:\\r'\n",
210 | "Error in 11 ('OpenCoreCH/cid-c4-squad', '4558d25aa8ea92644f3e778457fd6708104e0f24', 'src/CidNFT.sol', 'L192-L199), the cidData is written directly, without checking and handling the case that a previously added nft may not have been removed:\\r')\n",
211 | "invalid literal for int() with base 10: '199):\\r'\n",
212 | "Error in 11 ('OpenCoreCH/cid-c4-squad', '4558d25aa8ea92644f3e778457fd6708104e0f24', 'src/CidNFT.sol', 'L192-L199):\\r')\n",
213 | "invalid literal for int() with base 10: '#127'\n",
214 | "Error in 199 ('pooltogether/ERC5164', '5647bd84f2a6d1a37f41394874d567e45a97bf48', 'src/ethereum-arbitrum/EthereumToArbitrumRelayer.sol', 'L118-#L127')\n",
215 | "invalid literal for int() with base 10: '465:#466'\n",
216 | "Error in 789 ('code-423n4/2022-06-illuminate', 'main', 'lender/Lender.sol', 'L465:#L466')\n",
217 | "invalid literal for int() with base 10: '235)'\n",
218 | "Error in 805 ('code-423n4/2022-06-illuminate', 'main', 'lender/Lender.sol', 'L192-L235)')\n",
219 | "invalid literal for int() with base 10: '534)'\n",
220 | "Error in 805 ('code-423n4/2022-06-illuminate', 'main', 'lender/Lender.sol', 'L486-L534)')\n",
221 | "invalid literal for int() with base 10: '589)'\n",
222 | "Error in 805 ('code-423n4/2022-06-illuminate', 'main', 'lender/Lender.sol', 'L545-L589)')\n",
223 | "invalid literal for int() with base 10: '193)'\n",
224 | "Error in 809 ('code-423n4/2022-06-illuminate', 'main', 'redeemer/Redeemer.sol', 'L193)')\n",
225 | "invalid literal for int() with base 10: ':~:text=function%20transferToke(,%7D'\n",
226 | "Error in 830 ('code-423n4/2022-06-yieldy', 'main', 'src/contracts/Staking.sol', ':~:text=function%20transferToke(,%7D')\n",
227 | "too many values to unpack (expected 2)\n",
228 | "Error in 917 ('code-423n4/2022-06-connext', '4dd6149748b635f95460d4c3924c7e3fb6716967', 'contracts/contracts/core/connext/facets/BridgeFacet.sol', 'L819](https://github.com/code-423n4/2022-06-connext/blob/4dd6149748b635f95460d4c3924c7e3fb6716967/contracts/contracts/core/connext/facets/BridgeFacet.sol#L819')\n",
229 | "too many values to unpack (expected 2)\n",
230 | "Error in 933 ('code-423n4/2022-06-notional-coop', '6f8c325f604e2576e2fe257b6b57892ca181509a', 'index-coop-notional-trade-module/contracts/protocol/modules/v1/NotionalTradeModule.sol', 'L526](https://github.com/code-423n4/2022-06-notional-coop/blob/6f8c325f604e2576e2fe257b6b57892ca181509a/index-coop-notional-trade-module/contracts/protocol/modules/v1/NotionalTradeModule.sol#L526)\\r')\n",
231 | "invalid literal for int() with base 10: '229:#244'\n",
232 | "Error in 1010 ('code-423n4/2022-05-rubicon', 'main', 'contracts/RubiconRouter.sol', 'L229:#L244')\n",
233 | "invalid literal for int() with base 10: '492)'\n",
234 | "Error in 1014 ('code-423n4/2022-05-rubicon', 'main', 'contracts/RubiconRouter.sol', 'L475-L492)')\n",
235 | "invalid literal for int() with base 10: '48)'\n",
236 | "Error in 1313 ('code-423n4/2022-03-lifinance', 'main', 'src/Libraries/LibSwap.sol', 'L29-L48)')\n",
237 | "invalid literal for int() with base 10: '#182'\n",
238 | "Error in 1461 ('code-423n4/2022-02-redacted-cartel', 'main', 'contracts/RewardDistributor.sol', 'L178-#L182')\n",
239 | "invalid literal for int() with base 10: '#73'\n",
240 | "Error in 1461 ('code-423n4/2022-02-redacted-cartel', 'main', 'contracts/RewardDistributor.sol', 'L65-#L73')\n",
241 | "invalid literal for int() with base 10: ':~:text=uint256%20amount%20%3D%20records,_nftId%5D.reserve%20%3D%20_reserve%3B'\n",
242 | "Error in 1475 ('code-423n4/2022-02-nested', '69cf51d8e4eeb8bce3025db7f4f74cc565c9fad3', 'contracts/NestedRecords.sol', ':~:text=uint256%20amount%20%3D%20records,_nftId%5D.reserve%20%3D%20_reserve%3B')\n",
243 | "invalid literal for int() with base 10: '28\">name and symbol. It is possible it set them back to an empty string, uninitializing the contract and letting the initialize(..)
function be called again. This way, the owner may,\n",
244 | "Error in 1727 ('code-423n4/2021-12-amun', 'main', 'contracts/basket/contracts/facets/ERC20/ERC20Facet.sol', 'L25-L28\">name and symbol. It is possible it set them back to an empty string, uninitializing the contract and letting the initialize(..)
function be called again. This way, the owner may, for example, hide minting additional tokens. Or, after accidentally setting name and symbol to empty strings, anyone can take control over the contract and mint any number of tokens.