├── .env.template ├── .gitignore ├── CODE_OF_CONDUCT.md ├── CONTRIBUTING.md ├── LICENSE ├── README.md ├── anthropic_api.py ├── app.py ├── flask_server.py ├── index.html ├── main.css ├── main.js └── requirements.txt /.env.template: -------------------------------------------------------------------------------- 1 | ANTHROPIC_API_KEY=your-api-key-here -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | venv/ 2 | __pycache__/ 3 | .vscode/ 4 | .env 5 | links.json -------------------------------------------------------------------------------- /CODE_OF_CONDUCT.md: -------------------------------------------------------------------------------- 1 | # Code Of Conduct 2 | 3 | ## Our Pledge 4 | 5 | We as members, contributors and leaders pledge to make participation in our project and community a harassment-free experience for everyone, regardless of age, body size, visible or invisible disability, ethnicity, sex characteristics, gender identity, gender expression, level of experience, education, socio-economic status, nationality, personal appearance, race, religion or sexual identity and orientation. 6 | 7 | We pledge to act and interact in ways that contribute to an open, welcoming, diverse, inclusive and healthy community. 8 | 9 | ## Our Standards 10 | 11 | Examples of behavior that contributes to a positive environment include: 12 | 13 | - Using welcoming and inclusive language 14 | - Being respectful of differing viewpoints and experiences 15 | - Giving and gracefully accepting constructive feedback 16 | - Showing empathy towards other community members 17 | 18 | Examples of unacceptable behavior include: 19 | 20 | - Trolling, insulting/derogatory comments and personal or political attacks 21 | - Public or private harassment 22 | - Publishing others’ private information, such as a physical or email address, without their explicit permission 23 | - Other conduct which could reasonably be considered inappropriate 24 | 25 | ## Enforcement Responsibilities 26 | 27 | Project maintainers are responsible for clarifying and enforcing standards of acceptable behavior and will take appropriate and fair corrective action in response to any instances of unacceptable behavior. 28 | 29 | ## Scope 30 | 31 | This Code Of Conduct applies within all project spaces and also applies when an individual is representing the project or its community in public spaces. 32 | 33 | ## Enforcement 34 | 35 | Instances of abusive, harassing or otherwise unacceptable behavior may be reported to the project team at contact@bretbernhoft.com. All complaints will be reviewed and investigated promptly and fairly. 36 | 37 | ## Attribution 38 | 39 | This Code of Conduct is adapted from the [Contributor Covenant](https://www.contributor-covenant.org), version 2.1, available [here](https://www.contributor-covenant.org/version/2/1/code_of_conduct/). 40 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # Contributing To Mapping A Website's Internal Links 2 | 3 | Thanks for taking the time to contribute. 4 | 5 | ## How To Contribute 6 | 7 | - Report bugs by opening an issue. 8 | - Suggest enhancements by opening an issue and labeling it as an enhancement. 9 | - Fork the repo, create a new branch, and submit a pull request. 
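A typical workflow for that last step might look like the following (a minimal sketch; the fork URL and branch name are placeholders to replace with your own):

```bash
# Clone your fork of the repository (example URL; use your own fork).
git clone git@github.com:your-username/website-internal-links.git
cd website-internal-links

# Create a branch that follows the naming pattern described below.
git checkout -b feature/your-description

# Stage and commit your changes using the Conventional Commits format.
git add .
git commit -m "feat: briefly describe your change"

# Push the branch to your fork, then open a pull request on GitHub.
git push origin feature/your-description
```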
10 | 11 | ## Branching Strategy 12 | 13 | - Work from a feature branch, not from `main`. 14 | - Name your branch using this pattern: `feature/your-description` or `bugfix/short-description`. 15 | 16 | ## Before Submitting A Pull Request 17 | 18 | - Ensure the PR description clearly explains the problem and solution. 19 | - Format your code with Prettier. 20 | - Update documentation if necessary. 21 | 22 | ## Code Standards 23 | 24 | - Use 2 spaces for indentation. 25 | - Follow the existing code style. 26 | - Run Prettier before pushing code. 27 | 28 | ## Running Locally 29 | 30 | 1. Clone the repo 31 | 2. Install dependencies 32 | 3. Run `python3 app.py` 33 | 34 | ## Commit Messages 35 | 36 | Follow [Conventional Commits](https://www.conventionalcommits.org/en/v1.0.0/). 37 | 38 | ## Code Of Conduct 39 | 40 | Please read our [Code Of Conduct](CODE_OF_CONDUCT.md) before contributing. 41 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2025 Bret Bernhoft 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Mapping A Website's Internal Links 2 | 3 | 4 | 5 | Explore a website's internal links, then visualize those connections as a network graph with scorecards and analysis using Claude AI. 6 | 7 | ## Set Up 8 | 9 | ### Programs Needed 10 | 11 | - [Git](https://git-scm.com/downloads) 12 | - [Python](https://www.python.org/downloads/) (When installing on Windows, make sure you check the "[Add python 3.xx to PATH](https://hosting.photobucket.com/images/i/bernhoftbret/python.png)" box.) 13 | 14 | ### Steps 15 | 16 | 1. Install the above programs. 17 | 18 | 2. Open a shell window (for Windows open PowerShell, for macOS open Terminal, and for Linux open your distro's terminal emulator). 19 | 20 | 3. Clone this repository using `git` by running the following command: `git clone git@github.com:devbret/website-internal-links.git`. 21 | 22 | 4. Navigate to the repo's directory by running: `cd website-internal-links`. 23 | 24 | 5. Create a virtual environment with this command: `python3 -m venv venv`. Then activate your virtual environment using: `source venv/bin/activate`. 25 | 26 | 6. 
Install the dependencies needed to run the script: `pip install -r requirements.txt`. 27 | 28 | 7. Set the environment variable for the Anthropic API key by renaming the `.env.template` file to `.env` and placing your API key immediately after the `=` character. 29 | 30 | 8. Edit the `WEBSITE_TO_CRAWL` variable in the app.py file (on line 21); this is the website you would like to visualize. 31 | 32 | - Also edit the `MAX_PAGES_TO_CRAWL` variable in the app.py file (on line 24); this specifies how many pages you would like to crawl. 33 | 34 | 9. Run the script with the command `python3 app.py`. 35 | 36 | 10. To view the website's connections using the index.html file, you will need to run a local web server. To do this, run `python3 -m http.server` in a new terminal. 37 | 38 | 11. Once the network map has been launched, hover over any given node for more information about that particular web page, as well as the option to submit its data for analysis via Claude AI. Clicking a node opens the related URL in a new tab. 39 | 40 | 12. To exit the virtual environment (venv), type `deactivate` in the terminal. 41 | 42 | ## Performance Considerations 43 | 44 | Generating visualizations for this app takes a surprisingly large amount of processing power. It is therefore advisable to start by mapping fewer than one hundred pages per run. 45 | 46 | ## Additional Notes 47 | 48 | The analysis uses textstat for readability scoring and TextBlob for sentiment analysis. 49 | 50 | The crawler checks for SEO and accessibility markers such as: 51 | 52 | - Heading structure 53 | 54 | - Image alt tags 55 | 56 | - Form label usage 57 | 58 | - Semantic HTML elements 59 | 60 | ## Troubleshooting 61 | 62 | If working with GitHub Codespaces, you may need to: 63 | 64 | - Run `python -m nltk.downloader punkt_tab` 65 | 66 | - Then reattempt steps 6-9. 67 | 68 | If all else fails, please contact the maintainer here on GitHub or via [LinkedIn](https://www.linkedin.com/in/bernhoftbret/). 69 | 70 | Cheers! 71 | -------------------------------------------------------------------------------- /anthropic_api.py: -------------------------------------------------------------------------------- 1 | import os 2 | import json 3 | from anthropic import Anthropic 4 | from dotenv import load_dotenv 5 | 6 | load_dotenv() 7 | 8 | api_key = os.getenv("ANTHROPIC_API_KEY") 9 | if not api_key: 10 | raise ValueError("ANTHROPIC_API_KEY not found in environment variables. Please set it in your .env file.") 11 | 12 | try: 13 | anthropic = Anthropic(api_key=api_key) 14 | except Exception as e: 15 | raise ValueError(f"Failed to initialize Anthropic client: {e}") 16 | 17 | 18 | def analyze_with_anthropic(page_data): 19 | system_prompt = """You are an expert analyst. Your task is to review structured JSON data from a webpage. 20 | Summarize the strengths and weaknesses of this page in terms of SEO, accessibility, and semantic HTML structure. 21 | Provide specific, actionable suggestions for improvements. 22 | Structure your response clearly, using Markdown for headings (e.g., ## Strengths, ## Weaknesses, ## Suggestions).""" 23 | 24 | user_message_content = f""" 25 | Here is a structured JSON of a webpage: 26 | 27 | {json.dumps(page_data, indent=2)} 28 | 29 | Please analyze it based on the instructions provided. 
30 | """ 31 | 32 | try: 33 | response = anthropic.messages.create( 34 | model="claude-3-7-sonnet-20250219", 35 | max_tokens=1500, 36 | temperature=0.5, 37 | system=system_prompt, 38 | messages=[ 39 | { 40 | "role": "user", 41 | "content": user_message_content 42 | } 43 | ] 44 | ) 45 | if response.content and len(response.content) > 0: 46 | return response.content[0].text.strip() 47 | else: 48 | return "No content returned from API." 49 | 50 | except Exception as e: 51 | error_message = f"Anthropic API error: {e}" 52 | print(error_message) 53 | if hasattr(e, 'response') and hasattr(e.response, 'json'): 54 | try: 55 | error_details = e.response.json() 56 | error_message += f" | Details: {json.dumps(error_details)}" 57 | except json.JSONDecodeError: 58 | error_message += f" | Details: (Could not decode JSON error response from API)" 59 | 60 | 61 | raise Exception(error_message) -------------------------------------------------------------------------------- /app.py: -------------------------------------------------------------------------------- 1 | import requests 2 | import subprocess 3 | from urllib.parse import urljoin, urlparse 4 | from bs4 import BeautifulSoup 5 | import json 6 | import time 7 | import textstat 8 | from textblob import TextBlob 9 | import nltk 10 | from nltk.corpus import stopwords 11 | from collections import Counter 12 | import re 13 | from dotenv import load_dotenv 14 | 15 | load_dotenv() 16 | 17 | nltk.download('punkt', quiet=True) 18 | nltk.download('stopwords', quiet=True) 19 | 20 | # The website you would like to visualize. 21 | WEBSITE_TO_CRAWL = 'https://example.com/' 22 | 23 | # Specify how many pages you would like to crawl. 24 | MAX_PAGES_TO_CRAWL = 20 25 | 26 | def is_internal(url, base): 27 | return urlparse(url).netloc == urlparse(base).netloc 28 | 29 | def check_heading_structure(soup): 30 | headings = [int(tag.name[1]) for tag in soup.find_all(re.compile('^h[1-6]$'))] 31 | skipped_levels = [] 32 | prev_level = 0 33 | for level in headings: 34 | if prev_level and level > prev_level + 1: 35 | skipped_levels.append((prev_level, level)) 36 | prev_level = level 37 | return skipped_levels 38 | 39 | def check_semantic_elements(soup): 40 | semantic_tags = ['main', 'nav', 'article', 'section', 'header', 'footer', 'aside'] 41 | used_semantics = {tag: bool(soup.find(tag)) for tag in semantic_tags} 42 | return used_semantics 43 | 44 | def check_image_alts(soup): 45 | images = soup.find_all('img') 46 | images_without_alt = [img['src'] for img in images if not img.get('alt') or img.get('alt').strip() == ''] 47 | return images_without_alt 48 | 49 | def check_form_labels(soup): 50 | inputs = soup.find_all(['input', 'textarea', 'select']) 51 | labeled_inputs = set() 52 | for label in soup.find_all('label'): 53 | if label.get('for'): 54 | labeled_inputs.add(label['for']) 55 | inputs_without_labels = [] 56 | for field in inputs: 57 | if field.get('id') and field.get('type') not in ['hidden', 'submit', 'button', 'reset']: 58 | if field['id'] not in labeled_inputs: 59 | parent_label = field.find_parent('label') 60 | if not parent_label: 61 | inputs_without_labels.append(field['id']) 62 | return inputs_without_labels 63 | 64 | 65 | def crawl_site(start_url, max_links=MAX_PAGES_TO_CRAWL): 66 | visited = set() 67 | site_structure = {} 68 | to_visit = [start_url.rstrip('/')] 69 | 70 | while to_visit and len(visited) < max_links: 71 | url = to_visit.pop(0) 72 | if url in visited: 73 | continue 74 | 75 | normalized_url = url.rstrip('/') 76 | if normalized_url in visited: 77 | 
continue 78 | 79 | visited.add(normalized_url) 80 | print(f"Crawling: {normalized_url} ({len(visited)}/{max_links})") 81 | 82 | try: 83 | start_time = time.time() 84 | response = requests.get(normalized_url, timeout=10, headers={'User-Agent': 'MyCrawler/1.0'}) 85 | response_time = time.time() - start_time 86 | status_code = response.status_code 87 | 88 | if response.status_code != 200: 89 | print(f"Skipping {normalized_url} due to status code: {status_code}") 90 | site_structure[normalized_url] = {"status_code": status_code, "error": "Failed to fetch"} 91 | continue 92 | 93 | content_type = response.headers.get('content-type', '').lower() 94 | if 'text/html' not in content_type: 95 | print(f"Skipping {normalized_url} as content type is not HTML: {content_type}") 96 | site_structure[normalized_url] = {"status_code": status_code, "error": "Not HTML content"} 97 | continue 98 | 99 | soup = BeautifulSoup(response.text, 'html.parser') 100 | page_title = soup.title.string.strip() if soup.title else '' 101 | 102 | meta_desc_tag = soup.find('meta', attrs={'name': 'description'}) 103 | meta_description = meta_desc_tag['content'].strip() if meta_desc_tag and 'content' in meta_desc_tag.attrs else '' 104 | 105 | meta_keywords_tag = soup.find('meta', attrs={'name': 'keywords'}) 106 | meta_keywords = meta_keywords_tag['content'].strip() if meta_keywords_tag and 'content' in meta_keywords_tag.attrs else '' 107 | 108 | h1_tags = [h1.get_text(strip=True) for h1 in soup.find_all('h1')] 109 | 110 | text_content_for_analysis = [] 111 | for element in soup.find_all(['p', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'li', 'span', 'article']): 112 | text_content_for_analysis.append(element.get_text(separator=' ', strip=True)) 113 | text = " ".join(text_content_for_analysis) 114 | 115 | word_count = len(text.split()) if text else 0 116 | 117 | readability_score = textstat.flesch_kincaid_grade(text) if text else 0 118 | sentiment = TextBlob(text).sentiment.polarity if text else 0 119 | 120 | keyword_density = {} 121 | if text: 122 | text_clean = re.sub(r'[^\w\s]', '', text.lower()) 123 | tokens = nltk.word_tokenize(text_clean) 124 | stop_words = set(stopwords.words('english')) 125 | filtered_tokens = [word for word in tokens if word not in stop_words and word.isalpha() and len(word) > 1] 126 | if filtered_tokens: 127 | word_freq = Counter(filtered_tokens) 128 | total_filtered_words = sum(word_freq.values()) 129 | most_common = word_freq.most_common(10) 130 | keyword_density = {word: round(count / total_filtered_words, 4) for word, count in most_common} 131 | 132 | 133 | image_count = len(soup.find_all('img')) 134 | script_count = len(soup.find_all('script')) 135 | stylesheet_count = len(soup.find_all('link', rel='stylesheet')) 136 | has_viewport_meta = bool(soup.find('meta', attrs={'name': 'viewport'})) 137 | heading_count = len(soup.find_all(['h2', 'h3', 'h4', 'h5', 'h6'])) 138 | paragraph_count = len(soup.find_all('p')) 139 | 140 | semantic_elements = check_semantic_elements(soup) 141 | heading_issues = check_heading_structure(soup) 142 | unlabeled_inputs = check_form_labels(soup) 143 | images_without_alt = check_image_alts(soup) 144 | 145 | internal_links_found = [] 146 | external_links_found = [] 147 | 148 | for link_tag in soup.find_all('a', href=True): 149 | href = link_tag.get('href') 150 | if not href or href.startswith('#') or href.startswith('mailto:') or href.startswith('tel:'): 151 | continue 152 | 153 | absolute_href = urljoin(normalized_url, href).split('#')[0].rstrip('/') 154 | 155 | if 
is_internal(absolute_href, start_url): 156 | internal_links_found.append(absolute_href) 157 | if absolute_href not in visited and absolute_href not in to_visit and len(visited) + len(to_visit) < max_links : 158 | to_visit.append(absolute_href) 159 | else: 160 | external_links_found.append(absolute_href) 161 | 162 | site_structure[normalized_url] = { 163 | "url": normalized_url, 164 | "title": page_title, 165 | "meta_description": meta_description, 166 | "meta_keywords": meta_keywords, 167 | "h1_tags": h1_tags, 168 | "word_count": word_count, 169 | "readability_score": readability_score, 170 | "sentiment": sentiment, 171 | "keyword_density": keyword_density, 172 | "image_count": image_count, 173 | "script_count": script_count, 174 | "stylesheet_count": stylesheet_count, 175 | "has_viewport_meta": has_viewport_meta, 176 | "heading_count": heading_count, 177 | "paragraph_count": paragraph_count, 178 | "status_code": status_code, 179 | "response_time": round(response_time, 2), 180 | "internal_links": list(set(internal_links_found)), 181 | "external_links": list(set(external_links_found)), 182 | "semantic_elements": semantic_elements, 183 | "heading_issues": heading_issues, 184 | "unlabeled_inputs": unlabeled_inputs, 185 | "images_without_alt": images_without_alt 186 | } 187 | 188 | except requests.exceptions.Timeout: 189 | print(f"Timeout crawling {normalized_url}") 190 | site_structure[normalized_url] = {"status_code": "Timeout", "error": "Request timed out"} 191 | except requests.exceptions.RequestException as e: 192 | print(f"Failed to crawl {normalized_url}: {e}") 193 | site_structure[normalized_url] = {"status_code": "Error", "error": str(e)} 194 | except Exception as e: 195 | print(f"An unexpected error occurred while processing {normalized_url}: {e}") 196 | site_structure[normalized_url] = {"status_code": "Processing Error", "error": str(e)} 197 | 198 | 199 | return site_structure 200 | 201 | 202 | def save_links_as_json(site_structure, filename='links.json'): 203 | with open(filename, 'w') as file: 204 | json.dump(site_structure, file, indent=2) 205 | print(f"Site structure saved to {filename}") 206 | 207 | if __name__ == "__main__": 208 | crawled_site_structure = crawl_site(WEBSITE_TO_CRAWL, MAX_PAGES_TO_CRAWL) 209 | save_links_as_json(crawled_site_structure) 210 | print("Crawling complete. Starting Flask server subprocess...") 211 | subprocess.run(["python", "flask_server.py"]) 212 | print("Flask server subprocess has been initiated.") -------------------------------------------------------------------------------- /flask_server.py: -------------------------------------------------------------------------------- 1 | from flask import Flask, request, jsonify 2 | from flask_cors import CORS 3 | from anthropic_api import analyze_with_anthropic 4 | import json 5 | 6 | site_structure = {} 7 | 8 | app = Flask(__name__) 9 | CORS(app) 10 | 11 | def load_crawled_data(filename='links.json'): 12 | global site_structure 13 | try: 14 | with open(filename, 'r') as f: 15 | site_structure = json.load(f) 16 | print(f"Successfully loaded {len(site_structure)} URLs from {filename}") 17 | except FileNotFoundError: 18 | print(f"ERROR: {filename} not found. Make sure app.py has run and created it.") 19 | site_structure = {} 20 | except json.JSONDecodeError: 21 | print(f"ERROR: Could not decode JSON from {filename}. 
It might be corrupted.") 22 | site_structure = {} 23 | 24 | @app.route('/api/analyze', methods=['POST']) 25 | def analyze(): 26 | data = request.json 27 | if not data or 'url' not in data: 28 | return jsonify({"error": "Missing 'url' in request body"}), 400 29 | 30 | requested_url = data['url'].rstrip("/") 31 | page_data = site_structure.get(requested_url) or site_structure.get(requested_url + "/") 32 | 33 | if not page_data: 34 | print(f"Debug: URL '{requested_url}' not found in site_structure.") 35 | print(f"Debug: Available keys: {list(site_structure.keys())[:5]}") 36 | return jsonify({"error": "No data found for this URL"}), 404 37 | 38 | try: 39 | analysis = analyze_with_anthropic(page_data) 40 | return jsonify({"analysis": analysis}) 41 | except Exception as e: 42 | print(f"Error during analysis for {requested_url}: {e}") 43 | return jsonify({"error": f"An error occurred during analysis: {str(e)}"}), 500 44 | 45 | @app.route('/api/urls') 46 | def list_urls(): 47 | return jsonify(list(site_structure.keys())) 48 | 49 | def attach_data(structure): 50 | global site_structure 51 | print("attach_data called. Note: Server primarily loads data from links.json on startup.") 52 | site_structure = structure 53 | 54 | if __name__ == "__main__": 55 | load_crawled_data() 56 | app.run(debug=True) -------------------------------------------------------------------------------- /index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 |
4 |