├── requirements.txt
├── LICENSE
├── README.md
└── Healthcare Analytics Program for Salesforce Health Cloud


/requirements.txt:
--------------------------------------------------------------------------------
 1 | numpy==1.24.3
 2 | pandas==2.0.2
 3 | scikit-learn==1.3.0
 4 | simple-salesforce==1.12.4
 5 | joblib==1.3.1
 6 | matplotlib==3.7.2
 7 | seaborn==0.12.2
 8 | python-dotenv==1.0.0
 9 | pytest==7.4.0
10 | pytest-cov==4.1.0
11 | requests==2.31.0
12 | 


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
 1 | MIT License
 2 | 
 3 | Copyright (c) 2025 
 4 | 
 5 | Permission is hereby granted, free of charge, to any person obtaining a copy
 6 | of this software and associated documentation files (the "Software"), to deal
 7 | in the Software without restriction, including without limitation the rights
 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
  1 | # Healthcare-Analytics-Program-for-Salesforce-Health-Cloud
  2 | This healthcare analytics program integrates with Salesforce Health Cloud to identify high-risk patients who may need intervention.
  3 | 
  4 | Healthcare Analytics for Salesforce Health Cloud
  5 | A comprehensive analytics solution that integrates with Salesforce Health Cloud to identify high-risk patients who may need intervention based on behavioral and clinical data analysis.
  6 | ## Overview
  7 | This solution analyzes patient data from Salesforce Health Cloud to:
  8 | 
  9 | - Identify high-risk patients using machine learning
 10 | - Track patient behavior patterns and engagement
 11 | - Generate personalized intervention recommendations
 12 | - Create care gaps for care coordination
 13 | - Provide risk stratification for population health management
 14 | 
 15 | ## Features
 16 | 
 17 | - Salesforce Health Cloud Integration: Seamless data exchange with Health Cloud
 18 | - Risk Prediction Model: Machine learning model to predict patient risk levels
 19 | - Behavioral Analytics: Analysis of medication adherence, appointment attendance, and engagement
 20 | - Intervention Recommendations: Automated personalized care recommendations
 21 | - Care Gap Creation: Automated creation of care gaps for high-risk patients
 22 | - Data Visualization: Risk distribution and patient segmentation reports
 23 | 
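As a rough illustration of the behavioral analytics listed above, the sketch below derives a medication adherence proxy and an engagement score from per-patient counts. The column names mirror those produced by `fetch_patient_data` in this repository; the helper name `add_behavioral_features` and the sample values are assumptions made for the example, and the vectorized form is one possible way to compute these features rather than the project's exact code.

```python
import pandas as pd


def add_behavioral_features(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with adherence and engagement columns added."""
    out = df.copy()
    # Adherence proxy: share of recorded medications that are still active.
    # Rows with no medications become NaN in the divisor and are set to 0.0.
    divisor = out["MedicationCount"].where(out["MedicationCount"] > 0)
    out["MedicationAdherence"] = (out["ActiveMedicationCount"] / divisor).fillna(0.0)
    # Engagement: closed tasks plus recorded encounters.
    out["EngagementScore"] = (out["TaskCount"] - out["OpenTaskCount"]) + out["EncounterCount"]
    return out


# Hypothetical sample data for two patients.
patients = pd.DataFrame({
    "MedicationCount": [4, 0],
    "ActiveMedicationCount": [3, 0],
    "TaskCount": [5, 0],
    "OpenTaskCount": [2, 0],
    "EncounterCount": [3, 0],
})
features = add_behavioral_features(patients)
```

Working on a copy keeps the caller's DataFrame unchanged, which makes the helper easier to test in isolation.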
 24 | ## Installation
 25 | 
 26 | 1. Clone this repository
 27 | 
 28 | git clone https://github.com/yourusername/healthcare-analytics.git
 29 | cd healthcare-analytics
 30 | 
 31 | 2. Create a virtual environment and activate it
 32 | 
 33 | python -m venv venv
 34 | source venv/bin/activate  # On Windows: venv\Scripts\activate
 35 | 
 36 | 3. Install the required packages
 37 | 
 38 | pip install -r requirements.txt
 39 | 
 40 | 4. Configure your Salesforce credentials
 41 | 
 42 | cp config.example.ini config.ini
 43 | # Edit config.ini with your Salesforce credentials
 44 | ## Usage
 45 | ### Basic Usage
 46 | from healthcare_analytics import HealthcareAnalytics
 47 | 
 48 | # Initialize with Salesforce credentials
 49 | analytics = HealthcareAnalytics(
 50 |     sf_username="your_username@example.com",
 51 |     sf_password="your_password",
 52 |     sf_security_token="your_security_token"
 53 | )
 54 | 
 55 | # Run the full analysis pipeline
 56 | results = analytics.run_full_analysis()
 57 | 
 58 | # Save results to CSV
 59 | results.to_csv("patient_risk_analysis.csv", index=False)
 60 | ### Custom Analysis
 61 | # Fetch patient data
 62 | patient_data = analytics.fetch_patient_data(query_limit=500)
 63 | 
 64 | # Preprocess the data
 65 | processed_data = analytics.preprocess_data(patient_data)
 66 | 
 67 | # Create a new risk model...
 68 | analytics.create_risk_model(processed_data)
 69 | # ...or load a previously saved model and scaler instead
 70 | # analytics.load_model('path/to/model.pkl', 'path/to/scaler.pkl')
 71 | 
 72 | # Predict risk scores
 73 | risk_results = analytics.predict_risk(processed_data)
 74 | 
 75 | # Generate recommendations
 76 | recommendations = analytics.generate_intervention_recommendations(risk_results)
 77 | 
 78 | # Upload to Salesforce
 79 | analytics.upload_risk_scores_to_salesforce(recommendations)
 80 | ## Configuration
 81 | The program requires the following Salesforce Health Cloud setup:
 82 | 
 83 | 1. Custom fields on the Patient object (HealthCloudGA__EhrPatient__c):
 84 | 
 85 |    - Risk_Score__c (Number)
 86 |    - Risk_Category__c (Picklist: Low, Medium, High)
 87 |    - Last_Risk_Assessment_Date__c (DateTime)
 88 | 
 89 | 
 90 | 2. Access to standard Health Cloud objects:
 91 | 
 92 |    - HealthCloudGA__EhrPatient__c
 93 |    - HealthCloudGA__EhrCondition__c
 94 |    - HealthCloudGA__EhrMedication__c
 95 |    - HealthCloudGA__EhrEncounter__c
 96 |    - HealthCloudGA__CareGap__c
 97 | 
 98 | 
 99 | 
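To make the role of the three custom fields concrete, here is a minimal sketch of the payload written back for one patient. The 0.3 and 0.6 cut-offs follow the bins used in `predict_risk`; the helper names `risk_category` and `build_risk_update` are hypothetical and exist only for this example.

```python
from datetime import datetime, timezone


def risk_category(score: float) -> str:
    """Map a 0-1 risk score to the Risk_Category__c picklist values."""
    if score <= 0.3:
        return "Low"
    if score <= 0.6:
        return "Medium"
    return "High"


def build_risk_update(score: float, assessed_at: datetime) -> dict:
    """Build the field payload for one patient record update."""
    # Convert to UTC before formatting so the trailing 'Z' is truthful.
    stamp = assessed_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.000Z")
    return {
        "Risk_Score__c": float(score),
        "Risk_Category__c": risk_category(score),
        "Last_Risk_Assessment_Date__c": stamp,
    }


payload = build_risk_update(0.72, datetime(2025, 1, 15, 2, 0, tzinfo=timezone.utc))
```

The picklist in the org needs to contain exactly the values this mapping can return, otherwise restricted picklists will reject the update.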
100 | ## Customization
101 | ### Modifying Risk Factors
102 | Edit the `preprocess_data` method in `healthcare_analytics.py` to add or modify risk factors:
103 | # Add your custom risk factor
104 | df['CustomRiskFactor'] = df['Factor1'] * df['Factor2']
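A slightly more concrete version of the placeholder above might look like the following sketch. The `PolypharmacyRisk` column and the `add_polypharmacy_risk` helper are hypothetical names, and the threshold of five medications is borrowed from the recommendation logic in this repository, so treat it as an assumption rather than a clinical rule.

```python
import pandas as pd


def add_polypharmacy_risk(df: pd.DataFrame, threshold: int = 5) -> pd.DataFrame:
    """Flag patients whose medication count exceeds the threshold."""
    out = df.copy()
    # Guard against a missing column, in the same spirit as preprocess_data.
    if "MedicationCount" in out.columns:
        out["PolypharmacyRisk"] = (out["MedicationCount"] > threshold).astype(int)
    return out


# Hypothetical sample data.
sample = pd.DataFrame({"PatientId": ["a", "b"], "MedicationCount": [7, 2]})
flagged = add_polypharmacy_risk(sample)
```

A new column only influences the model if its name is also added to the feature list used in `create_risk_model` and `predict_risk`, since both methods select features by name.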
105 | ### Changing Recommendation Logic
106 | Edit the get_recommendation function in the generate_intervention_recommendations method to customize intervention recommendations.
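One way to keep such customizations manageable is to move the conditions into a rule table, sketched below for the high-risk branch only. The recommendation texts and thresholds are copied from the existing `get_recommendation` function; the table structure and the name `high_risk_recommendations` are illustrative assumptions, not part of the project.

```python
# Each rule is (recommendation text, predicate over one patient row).
HIGH_RISK_RULES = [
    ("Chronic care management program enrollment",
     lambda row: row.get("ChronicConditionCount", 0) > 2),
    ("Medication reconciliation and therapy management",
     lambda row: row.get("MedicationCount", 0) > 5),
    ("Schedule immediate follow-up appointment",
     lambda row: bool(row.get("MissingFollowUp", False))),
]


def high_risk_recommendations(row) -> str:
    """Join the recommendations whose predicate matches a high-risk patient."""
    matched = ["Immediate care coordination intervention required"]
    matched += [text for text, applies in HIGH_RISK_RULES if applies(row)]
    return "; ".join(matched)


# Hypothetical patient row; a plain dict works because only .get() is used.
example = high_risk_recommendations(
    {"ChronicConditionCount": 3, "MedicationCount": 2, "MissingFollowUp": True}
)
```

Because the function relies only on `.get()`, it accepts both plain dictionaries and the pandas rows passed by `DataFrame.apply(..., axis=1)`.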
107 | ## Deployment
108 | For production deployment:
109 | 
110 | 1. Schedule the analysis to run periodically:
111 | # Example crontab entry to run daily at 2 AM
112 | 0 2 * * * /path/to/venv/bin/python /path/to/run_analysis.py
113 | 
114 | 2. Set up logging to a secure location:
115 | import logging  # in your deployment script
116 | logging.basicConfig(
117 |     filename='/var/log/healthcare_analytics.log',
118 |     level=logging.INFO
119 | )
120 | 
121 | 3. Configure secure credential management using environment variables or a vault service.
122 | 
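For the environment-variable option, a minimal sketch could look like this. The variable names (`SF_USERNAME`, `SF_PASSWORD`, `SF_SECURITY_TOKEN`, `SF_DOMAIN`) and the helper `load_salesforce_credentials` are assumptions chosen for the example; the returned keys match the `HealthcareAnalytics` constructor arguments.

```python
import os


class MissingCredentialError(RuntimeError):
    """Raised when a required environment variable is absent or empty."""


def load_salesforce_credentials(env=os.environ) -> dict:
    """Read Salesforce credentials from environment variables."""
    # Hypothetical variable names; align them with your deployment.
    names = {
        "sf_username": "SF_USERNAME",
        "sf_password": "SF_PASSWORD",
        "sf_security_token": "SF_SECURITY_TOKEN",
    }
    missing = [var for var in names.values() if not env.get(var)]
    if missing:
        raise MissingCredentialError("Missing environment variables: " + ", ".join(missing))
    creds = {arg: env[var] for arg, var in names.items()}
    # 'login' matches the constructor default; sandboxes typically use 'test'.
    creds["sf_domain"] = env.get("SF_DOMAIN", "login")
    return creds
```

The result can be passed straight to the constructor, for example `HealthcareAnalytics(**load_salesforce_credentials())`. During local development, `load_dotenv()` from python-dotenv (already listed in requirements.txt) can populate the environment from a `.env` file before this function is called.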
123 | ## Contributing
124 | Contributions are welcome! Please feel free to submit a Pull Request.
125 | 
126 | ## License
127 | This project is licensed under the MIT License - see the LICENSE file for details.
128 | 


--------------------------------------------------------------------------------
/Healthcare Analytics Program for Salesforce Health Cloud:
--------------------------------------------------------------------------------
  1 | """
  2 | Healthcare Analytics Program for Salesforce Health Cloud
  3 | This program integrates with Salesforce Health Cloud to analyze patient behavior
  4 | and identify high-risk patients who may need intervention.
  5 | """
  6 | 
  7 | import pandas as pd
  8 | import numpy as np
  9 | from simple_salesforce import Salesforce
 10 | from sklearn.ensemble import RandomForestClassifier
 11 | from sklearn.preprocessing import StandardScaler
 12 | from sklearn.model_selection import train_test_split
 13 | from sklearn.metrics import classification_report, confusion_matrix
 14 | import joblib
 15 | import datetime
 16 | import logging
 17 | import os
 18 | 
 19 | # Set up logging
 20 | logging.basicConfig(
 21 |     level=logging.INFO,
 22 |     format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
 23 |     handlers=[
 24 |         logging.FileHandler("healthcare_analytics.log"),
 25 |         logging.StreamHandler()
 26 |     ]
 27 | )
 28 | logger = logging.getLogger(__name__)
 29 | 
 30 | class HealthcareAnalytics:
 31 |     def __init__(self, sf_username, sf_password, sf_security_token, sf_domain='login'):
 32 |         """
 33 |         Initialize the Healthcare Analytics class with Salesforce credentials.
 34 |         
 35 |         Args:
 36 |             sf_username (str): Salesforce username
 37 |             sf_password (str): Salesforce password
 38 |             sf_security_token (str): Salesforce security token
 39 |             sf_domain (str): Salesforce domain, default is 'login'
 40 |         """
 41 |         self.sf = None
 42 |         self.sf_username = sf_username
 43 |         self.sf_password = sf_password
 44 |         self.sf_security_token = sf_security_token
 45 |         self.sf_domain = sf_domain
 46 |         self.model = None
 47 |         self.scaler = None
 48 |         
 49 |     def connect_to_salesforce(self):
 50 |         """Connect to Salesforce using the provided credentials."""
 51 |         try:
 52 |             self.sf = Salesforce(
 53 |                 username=self.sf_username,
 54 |                 password=self.sf_password,
 55 |                 security_token=self.sf_security_token,
 56 |                 domain=self.sf_domain
 57 |             )
 58 |             logger.info("Successfully connected to Salesforce")
 59 |             return True
 60 |         except Exception as e:
 61 |             logger.error(f"Failed to connect to Salesforce: {e}")
 62 |             return False
 63 |     
 64 |     def fetch_patient_data(self, query_limit=1000):
 65 |         """
 66 |         Fetch patient data from Salesforce Health Cloud.
 67 |         
 68 |         Args:
 69 |             query_limit (int): Maximum number of records to fetch
 70 |             
 71 |         Returns:
 72 |             pandas.DataFrame: DataFrame containing patient data
 73 |         """
 74 |         if not self.sf:
 75 |             if not self.connect_to_salesforce():
 76 |                 return None
 77 |         
 78 |         try:
 79 |             # Query patient data from HealthCloudGA__EhrPatient__c object
 80 |             # Adjust the query based on your Salesforce Health Cloud setup
 81 |             query = f"""
 82 |                 SELECT Id, Name, HealthCloudGA__Age__c, HealthCloudGA__Gender__c, 
 83 |                        HealthCloudGA__PrimaryLanguage__c, HealthCloudGA__MedicaidEnrollmentStatus__c,
 84 |                        HealthCloudGA__MedicareEnrollmentStatus__c, 
 85 |                        (SELECT Id, Subject, ActivityDate, Status, Priority FROM Tasks),
 86 |                        (SELECT HealthCloudGA__RelatedAccount__c, HealthCloudGA__DiagnosisDate__c, 
 87 |                              HealthCloudGA__DiagnosticSummary__c FROM HealthCloudGA__EhrConditions__r),
 88 |                        (SELECT HealthCloudGA__RelatedAccount__c, HealthCloudGA__MedicationName__c, 
 89 |                              HealthCloudGA__Status__c FROM HealthCloudGA__EhrMedications__r),
 90 |                        (SELECT HealthCloudGA__RelatedAccount__c, HealthCloudGA__ServiceDate__c,
 91 |                              HealthCloudGA__Status__c FROM HealthCloudGA__EhrEncounters__r LIMIT 10)
 92 |                 FROM HealthCloudGA__EhrPatient__c
 93 |                 LIMIT {query_limit}
 94 |             """
 95 |             
 96 |             patients = self.sf.query_all(query)
 97 |             logger.info(f"Retrieved {len(patients['records'])} patient records")
 98 |             
 99 |             # Convert to DataFrame
100 |             patient_records = []
101 |             for record in patients['records']:
102 |                 patient_data = {
103 |                     'PatientId': record['Id'],
104 |                     'Name': record['Name'],
105 |                     'Age': record.get('HealthCloudGA__Age__c', 0),
106 |                     'Gender': record.get('HealthCloudGA__Gender__c', ''),
107 |                     'PrimaryLanguage': record.get('HealthCloudGA__PrimaryLanguage__c', ''),
108 |                     'MedicaidStatus': record.get('HealthCloudGA__MedicaidEnrollmentStatus__c', ''),
109 |                     'MedicareStatus': record.get('HealthCloudGA__MedicareEnrollmentStatus__c', ''),
110 |                 }
111 |                 
112 |                 # Process tasks
113 |                 tasks = (record.get('Tasks') or {}).get('records', [])  # empty child relationships come back as None
114 |                 patient_data['TaskCount'] = len(tasks)
115 |                 patient_data['OpenTaskCount'] = sum(1 for t in tasks if t['Status'] == 'Open')
116 |                 patient_data['HighPriorityTaskCount'] = sum(1 for t in tasks if t['Priority'] == 'High')
117 |                 
118 |                 # Process conditions
119 |                 conditions = (record.get('HealthCloudGA__EhrConditions__r') or {}).get('records', [])
120 |                 patient_data['ConditionCount'] = len(conditions)
121 |                 patient_data['ChronicConditionCount'] = sum(1 for c in conditions if 'chronic' in (c.get('HealthCloudGA__DiagnosticSummary__c', '') or '').lower())
122 |                 
123 |                 # Process medications
124 |                 medications = (record.get('HealthCloudGA__EhrMedications__r') or {}).get('records', [])
125 |                 patient_data['MedicationCount'] = len(medications)
126 |                 patient_data['ActiveMedicationCount'] = sum(1 for m in medications if m.get('HealthCloudGA__Status__c') == 'Active')
127 |                 
128 |                 # Process encounters
129 |                 encounters = (record.get('HealthCloudGA__EhrEncounters__r') or {}).get('records', [])
130 |                 patient_data['EncounterCount'] = len(encounters)
131 |                 
132 |                 # Calculate days since last encounter
133 |                 if encounters:
134 |                     encounter_dates = [datetime.datetime.strptime(e['HealthCloudGA__ServiceDate__c'][:10], '%Y-%m-%d')  # [:10] also tolerates DateTime strings
135 |                                       for e in encounters if e.get('HealthCloudGA__ServiceDate__c')]
136 |                     if encounter_dates:
137 |                         latest_encounter = max(encounter_dates)
138 |                         days_since = (datetime.datetime.now() - latest_encounter).days
139 |                         patient_data['DaysSinceLastEncounter'] = days_since
140 |                     else:
141 |                         patient_data['DaysSinceLastEncounter'] = 365  # Default if no date
142 |                 else:
143 |                     patient_data['DaysSinceLastEncounter'] = 365  # Default if no encounters
144 |                 
145 |                 patient_records.append(patient_data)
146 |             
147 |             return pd.DataFrame(patient_records)
148 |         
149 |         except Exception as e:
150 |             logger.error(f"Error fetching patient data: {e}")
151 |             return None
152 |     
153 |     def preprocess_data(self, df):
154 |         """
155 |         Preprocess patient data for analysis.
156 |         
157 |         Args:
158 |             df (pandas.DataFrame): DataFrame containing patient data
159 |             
160 |         Returns:
161 |             pandas.DataFrame: Preprocessed DataFrame
162 |         """
163 |         if df is None or df.empty:
164 |             logger.error("No data to preprocess")
165 |             return None
166 |         
167 |         try:
168 |             # Fill missing values
169 |             numeric_cols = ['Age', 'TaskCount', 'OpenTaskCount', 'HighPriorityTaskCount',
170 |                            'ConditionCount', 'ChronicConditionCount', 'MedicationCount',
171 |                            'ActiveMedicationCount', 'EncounterCount', 'DaysSinceLastEncounter']
172 |             
173 |             for col in numeric_cols:
174 |                 if col in df.columns:
175 |                     df[col] = df[col].fillna(0)
176 |             
177 |             # Categorical encoding
178 |             categorical_cols = ['Gender', 'PrimaryLanguage', 'MedicaidStatus', 'MedicareStatus']
179 |             for col in categorical_cols:
180 |                 if col in df.columns:
181 |                     df[col] = df[col].fillna('Unknown')
182 |                     df[col + '_Encoded'] = pd.factorize(df[col])[0]
183 |             
184 |             # Feature engineering
185 |             # Medication adherence proxy (ratio of active to total medications)
186 |             if 'MedicationCount' in df.columns and 'ActiveMedicationCount' in df.columns:
187 |                 df['MedicationAdherence'] = df.apply(
188 |                     lambda x: x['ActiveMedicationCount'] / x['MedicationCount'] if x['MedicationCount'] > 0 else 0, 
189 |                     axis=1
190 |                 )
191 |             
192 |             # Engagement score based on tasks and encounters
193 |             if all(col in df.columns for col in ['TaskCount', 'OpenTaskCount', 'EncounterCount']):
194 |                 df['EngagementScore'] = (df['TaskCount'] - df['OpenTaskCount']) + df['EncounterCount']
195 |             
196 |             # Risk factor based on chronic conditions and age
197 |             if all(col in df.columns for col in ['ChronicConditionCount', 'Age']):
198 |                 df['ChronicConditionAgeRisk'] = df['ChronicConditionCount'] * (df['Age'] / 100)
199 |             
200 |             # Identify missing follow-ups
201 |             if 'DaysSinceLastEncounter' in df.columns:
202 |                 df['MissingFollowUp'] = df['DaysSinceLastEncounter'] > 180  # Flag if no visit in 6 months
203 |             
204 |             logger.info("Data preprocessing completed successfully")
205 |             return df
206 |         
207 |         except Exception as e:
208 |             logger.error(f"Error during data preprocessing: {e}")
209 |             return None
210 |     
211 |     def create_risk_model(self, df, target_column=None):
212 |         """
213 |         Create a model to predict patient risk.
214 |         If target_column is provided, train a supervised model on it; otherwise train on a synthetic label derived from chronic-condition count and age.
215 |         
216 |         Args:
217 |             df (pandas.DataFrame): Preprocessed patient data
218 |             target_column (str, optional): Name of the column to predict
219 |             
220 |         Returns:
221 |             bool: True if model creation was successful, False otherwise
222 |         """
223 |         if df is None or df.empty:
224 |             logger.error("No data to create model")
225 |             return False
226 |         
227 |         try:
228 |             # Feature selection
229 |             feature_cols = [col for col in df.columns if col.endswith('_Encoded') 
230 |                           or col in ['Age', 'TaskCount', 'ConditionCount', 'ChronicConditionCount',
231 |                                     'MedicationCount', 'ActiveMedicationCount', 'EncounterCount',
232 |                                     'DaysSinceLastEncounter', 'MedicationAdherence', 
233 |                                     'EngagementScore', 'ChronicConditionAgeRisk']]
234 |             
235 |             features = df[feature_cols].copy()
236 |             
237 |             # Handle any remaining missing values
238 |             features = features.fillna(0)
239 |             
240 |             # Scale features
241 |             self.scaler = StandardScaler()
242 |             features_scaled = self.scaler.fit_transform(features)
243 |             
244 |             # If we have labeled data, use supervised learning
245 |             if target_column and target_column in df.columns:
246 |                 labels = df[target_column]
247 |                 
248 |                 # Split the data
249 |                 X_train, X_test, y_train, y_test = train_test_split(
250 |                     features_scaled, labels, test_size=0.25, random_state=42
251 |                 )
252 |                 
253 |                 # Train a RandomForest model
254 |                 self.model = RandomForestClassifier(n_estimators=100, random_state=42)
255 |                 self.model.fit(X_train, y_train)
256 |                 
257 |                 # Evaluate the model
258 |                 y_pred = self.model.predict(X_test)
259 |                 logger.info(f"Classification Report:\n{classification_report(y_test, y_pred)}")
260 |                 logger.info(f"Confusion Matrix:\n{confusion_matrix(y_test, y_pred)}")
261 |                 
262 |             else:
263 |                 # For unsupervised learning, we'll create a risk score based on selected features
264 |                 # This is a simplified approach - in a real scenario, consider clustering or more sophisticated methods
265 |                 logger.info("No target column provided. Creating risk score based on feature importance")
266 |                 
267 |                 # Create a basic RandomForest model to get feature importances
268 |                 self.model = RandomForestClassifier(n_estimators=100, random_state=42)
269 |                 
270 |                 # We'll create a synthetic target (high values = high risk) for feature importance
271 |                 synthetic_target = features['ChronicConditionCount'] + features['Age'] / 10
272 |                 synthetic_target = (synthetic_target > synthetic_target.median()).astype(int)
273 |                 
274 |                 # Fit the model
275 |                 self.model.fit(features_scaled, synthetic_target)
276 |                 
277 |                 # Get feature importances
278 |                 importances = self.model.feature_importances_
279 |                 feature_importance = pd.DataFrame({
280 |                     'Feature': feature_cols,
281 |                     'Importance': importances
282 |                 }).sort_values('Importance', ascending=False)
283 |                 
284 |                 logger.info(f"Feature importance:\n{feature_importance}")
285 |             
286 |             # Save the model
287 |             joblib.dump(self.model, 'healthcare_risk_model.pkl')
288 |             joblib.dump(self.scaler, 'healthcare_risk_scaler.pkl')
289 |             logger.info("Model created and saved successfully")
290 |             
291 |             return True
292 |         
293 |         except Exception as e:
294 |             logger.error(f"Error creating risk model: {e}")
295 |             return False
296 |     
297 |     def load_model(self, model_path='healthcare_risk_model.pkl', scaler_path='healthcare_risk_scaler.pkl'):
298 |         """
299 |         Load a previously saved model.
300 |         
301 |         Args:
302 |             model_path (str): Path to the saved model
303 |             scaler_path (str): Path to the saved scaler
304 |             
305 |         Returns:
306 |             bool: True if loading was successful, False otherwise
307 |         """
308 |         try:
309 |             if os.path.exists(model_path) and os.path.exists(scaler_path):
310 |                 self.model = joblib.load(model_path)
311 |                 self.scaler = joblib.load(scaler_path)
312 |                 logger.info("Model and scaler loaded successfully")
313 |                 return True
314 |             else:
315 |                 logger.error("Model or scaler file not found")
316 |                 return False
317 |         except Exception as e:
318 |             logger.error(f"Error loading model: {e}")
319 |             return False
320 |     
321 |     def predict_risk(self, df):
322 |         """
323 |         Predict risk scores for patients.
324 |         
325 |         Args:
326 |             df (pandas.DataFrame): Preprocessed patient data
327 |             
328 |         Returns:
329 |             pandas.DataFrame: DataFrame with original data and risk scores
330 |         """
331 |         if df is None or df.empty:
332 |             logger.error("No data to predict risk")
333 |             return None
334 |         
335 |         if self.model is None or self.scaler is None:
336 |             logger.error("Model or scaler not initialized")
337 |             return None
338 |         
339 |         try:
340 |             # Ensure we have all the necessary features
341 |             feature_cols = [col for col in df.columns if col.endswith('_Encoded') 
342 |                           or col in ['Age', 'TaskCount', 'ConditionCount', 'ChronicConditionCount',
343 |                                     'MedicationCount', 'ActiveMedicationCount', 'EncounterCount',
344 |                                     'DaysSinceLastEncounter', 'MedicationAdherence', 
345 |                                     'EngagementScore', 'ChronicConditionAgeRisk']]
346 |             
347 |             available_cols = [col for col in feature_cols if col in df.columns]
348 |             features = df[available_cols].copy()
349 |             
350 |             # Handle any missing values
351 |             features = features.fillna(0)
352 |             
353 |             # Scale features
354 |             features_scaled = self.scaler.transform(features)
355 |             
356 |             # Predict probabilities
357 |             risk_probs = self.model.predict_proba(features_scaled)[:, 1]
358 |             
359 |             # Add risk score to the original dataframe
360 |             result_df = df.copy()
361 |             result_df['RiskScore'] = risk_probs
362 |             
363 |             # Add risk category
364 |             result_df['RiskCategory'] = pd.cut(
365 |                 result_df['RiskScore'],
366 |                 bins=[0, 0.3, 0.6, 1.0],
367 |                 labels=['Low', 'Medium', 'High'], include_lowest=True  # keep scores of exactly 0.0 in 'Low'
368 |             )
369 |             
370 |             logger.info(f"Risk predictions completed for {len(df)} patients")
371 |             return result_df
372 |         
373 |         except Exception as e:
374 |             logger.error(f"Error predicting risk: {e}")
375 |             return None
376 |     
377 |     def upload_risk_scores_to_salesforce(self, risk_df):
378 |         """
379 |         Upload risk scores back to Salesforce Health Cloud.
380 |         
381 |         Args:
382 |             risk_df (pandas.DataFrame): DataFrame with risk scores
383 |             
384 |         Returns:
385 |             bool: True if upload was successful, False otherwise
386 |         """
387 |         if risk_df is None or risk_df.empty:
388 |             logger.error("No risk data to upload")
389 |             return False
390 |         
391 |         if not self.sf:
392 |             if not self.connect_to_salesforce():
393 |                 return False
394 |         
395 |         try:
396 |             # Batch size for Salesforce API calls
397 |             batch_size = 200
398 |             success_count = 0
399 |             error_count = 0
400 |             
401 |             # Process in batches
402 |             for i in range(0, len(risk_df), batch_size):
403 |                 batch = risk_df.iloc[i:i+batch_size]
404 |                 
405 |                 for _, row in batch.iterrows():
406 |                     patient_id = row['PatientId']
407 |                     risk_score = float(row['RiskScore'])
408 |                     risk_category = row['RiskCategory']
409 |                     
410 |                     # Create a custom field update
411 |                     # Assuming you have custom fields for risk score and category
412 |                     update_data = {
413 |                         'Risk_Score__c': risk_score,
414 |                         'Risk_Category__c': risk_category,
415 |                         'Last_Risk_Assessment_Date__c': datetime.datetime.now(datetime.timezone.utc).strftime('%Y-%m-%dT%H:%M:%S.000Z')
416 |                     }
417 |                     
418 |                     try:
419 |                         # Update the patient record
420 |                         self.sf.HealthCloudGA__EhrPatient__c.update(patient_id, update_data)
421 |                         success_count += 1
422 |                     except Exception as e:
423 |                         logger.error(f"Error updating patient {patient_id}: {e}")
424 |                         error_count += 1
425 |                 
426 |                 logger.info(f"Processed batch {i//batch_size + 1}, success: {success_count}, errors: {error_count}")
427 |             
428 |             # Create a Care Gap record for high-risk patients that need intervention
429 |             high_risk_patients = risk_df[risk_df['RiskCategory'] == 'High']
430 |             gap_count = 0
431 |             
432 |             for _, patient in high_risk_patients.iterrows():
433 |                 patient_id = patient['PatientId']
434 |                 
435 |                 # Create a Care Gap record - adjust fields as needed for your org
436 |                 care_gap_data = {
437 |                     'HealthCloudGA__Account__c': patient_id,
438 |                     'HealthCloudGA__GapStatus__c': 'Open',
439 |                     'HealthCloudGA__GapReason__c': 'High Risk Assessment',
440 |                     'HealthCloudGA__GapPriority__c': 'High',
441 |                     'HealthCloudGA__GapCreateDate__c': datetime.datetime.now(datetime.timezone.utc).strftime('%Y-%m-%dT%H:%M:%S.000Z')
442 |                 }
443 |                 
444 |                 try:
445 |                     # Create the care gap record
446 |                     self.sf.HealthCloudGA__CareGap__c.create(care_gap_data)
447 |                     gap_count += 1
448 |                 except Exception as e:
449 |                     logger.error(f"Error creating care gap for patient {patient_id}: {e}")
450 |             
451 |             logger.info(f"Created {gap_count} care gaps for high-risk patients")
452 |             logger.info(f"Upload completed: {success_count} successes, {error_count} errors")
453 |             
454 |             return success_count > 0
455 |         
456 |         except Exception as e:
457 |             logger.error(f"Error uploading risk scores: {e}")
458 |             return False
459 |     
460 |     def generate_intervention_recommendations(self, risk_df):
461 |         """
462 |         Generate intervention recommendations based on risk levels.
463 |         
464 |         Args:
465 |             risk_df (pandas.DataFrame): DataFrame with risk scores
466 |             
467 |         Returns:
468 |             pandas.DataFrame: DataFrame with recommendations
469 |         """
470 |         if risk_df is None or risk_df.empty:
471 |             logger.error("No risk data for recommendations")
472 |             return None
473 |         
474 |         try:
475 |             # Create a copy for recommendations
476 |             rec_df = risk_df.copy()
477 |             
478 |             # Define recommendation logic based on risk factors
479 |             def get_recommendation(row):
480 |                 recommendations = []
481 |                 
482 |                 # Check for specific risk factors
483 |                 if row['RiskCategory'] == 'High':
484 |                     recommendations.append("Immediate care coordination intervention required")
485 |                     
486 |                     if row.get('ChronicConditionCount', 0) > 2:
487 |                         recommendations.append("Chronic care management program enrollment")
488 |                     
489 |                     if row.get('MedicationCount', 0) > 5:
490 |                         recommendations.append("Medication reconciliation and therapy management")
491 |                     
492 |                     if row.get('MissingFollowUp', False):
493 |                         recommendations.append("Schedule immediate follow-up appointment")
494 |                     
495 |                 elif row['RiskCategory'] == 'Medium':
496 |                     recommendations.append("Preventive care coordination recommended")
497 |                     
498 |                     if row.get('DaysSinceLastEncounter', 0) > 90:
499 |                         recommendations.append("Schedule follow-up within 30 days")
500 |                     
501 |                     if row.get('MedicationAdherence', 1) < 0.8:
502 |                         recommendations.append("Medication adherence counseling")
503 |                     
504 |                 else:  # Low risk
505 |                     recommendations.append("Routine care maintenance")
506 |                     
507 |                     if row.get('DaysSinceLastEncounter', 0) > 180:
508 |                         recommendations.append("Schedule routine check-up")
509 |                 
510 |                 # Add age-specific recommendations
511 |                 if row.get('Age', 0) >= 65:  # include patients who are exactly 65
512 |                     recommendations.append("Fall risk assessment")
513 |                     recommendations.append("Medicare Annual Wellness Visit if not completed")
514 |                 
515 |                 return "; ".join(recommendations)
516 |             
517 |             # Apply the recommendation logic
518 |             rec_df['InterventionRecommendations'] = rec_df.apply(get_recommendation, axis=1)
519 |             
520 |             logger.info("Generated intervention recommendations")
521 |             return rec_df
522 |         
523 |         except Exception as e:
524 |             logger.error(f"Error generating recommendations: {e}")
525 |             return None
526 |     
527 |     def run_full_analysis(self):
528 |         """
529 |         Run the full analysis pipeline from data fetching to recommendations.
530 |         
531 |         Returns:
532 |             pandas.DataFrame: DataFrame with risk scores and recommendations
533 |         """
534 |         try:
535 |             # Step 1: Fetch patient data
536 |             logger.info("Starting full analysis pipeline")
537 |             patient_data = self.fetch_patient_data()
538 |             
539 |             if patient_data is None or patient_data.empty:
540 |                 logger.error("Failed to fetch patient data")
541 |                 return None
542 |             
543 |             # Step 2: Preprocess the data
544 |             processed_data = self.preprocess_data(patient_data)
545 |             
546 |             if processed_data is None:
547 |                 logger.error("Failed to preprocess data")
548 |                 return None
549 |             
550 |             # Step 3: Load or create risk model
551 |             if os.path.exists('healthcare_risk_model.pkl'):
552 |                 logger.info("Loading existing risk model")
553 |                 model_loaded = self.load_model()
554 |                 if not model_loaded:
555 |                     logger.warning("Failed to load existing model, creating new one")
556 |                     model_created = self.create_risk_model(processed_data)
557 |                     if not model_created:
558 |                         logger.error("Failed to create risk model")
559 |                         return None
560 |             else:
561 |                 logger.info("Creating new risk model")
562 |                 model_created = self.create_risk_model(processed_data)
563 |                 if not model_created:
564 |                     logger.error("Failed to create risk model")
565 |                     return None
566 |             
567 |             # Step 4: Predict risk scores
568 |             risk_results = self.predict_risk(processed_data)
569 |             
570 |             if risk_results is None:
571 |                 logger.error("Failed to predict risk scores")
572 |                 return None
573 |             
574 |             # Step 5: Generate intervention recommendations
575 |             recommendations = self.generate_intervention_recommendations(risk_results)
576 |             
577 |             if recommendations is None:
578 |                 logger.error("Failed to generate recommendations")
579 |                 return None
580 |             
581 |             # Step 6: Upload results to Salesforce
582 |             logger.info("Uploading results to Salesforce")
583 |             upload_success = self.upload_risk_scores_to_salesforce(recommendations)
584 |             
585 |             if upload_success:
586 |                 logger.info("Uploaded risk scores to Salesforce; see the upload summary above for per-record errors")
587 |             else:
588 |                 logger.warning("Failed to upload risk scores to Salesforce")
589 |             
590 |             # Return the final results
591 |             return recommendations
592 |         
593 |         except Exception as e:
594 |             logger.exception(f"Error in full analysis pipeline: {e}")  # also records the traceback
595 |             return None
596 | 
597 | 
598 | # Example usage
599 | if __name__ == "__main__":
600 |     # Placeholder credentials: replace with your own, ideally loaded from environment variables rather than hardcoded
601 |     healthcare_analytics = HealthcareAnalytics(
602 |         sf_username="your_username@example.com",
603 |         sf_password="your_password",
604 |         sf_security_token="your_security_token"
605 |     )
606 |     
607 |     # Run the full analysis
608 |     results = healthcare_analytics.run_full_analysis()
609 |     
610 |     if results is not None:
611 |         # Output summary of high-risk patients
612 |         high_risk = results[results['RiskCategory'] == 'High']
613 |         print(f"Found {len(high_risk)} high-risk patients requiring intervention")
614 |         
615 |         # Save results to CSV
616 |         results.to_csv("patient_risk_analysis.csv", index=False)
617 |         print("Results saved to patient_risk_analysis.csv")
618 |     else:
619 |         print("Analysis failed. Check the logs for details.")
620 | 
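The category-based rules in `generate_intervention_recommendations` can be exercised without Salesforce or a trained model. The following is a minimal sketch, not the repository's implementation: `recommend_for_patient` is a hypothetical helper that takes a plain dict instead of a DataFrame row, reuses the thresholds and messages from the method above, and leaves out the age-specific rules.

```python
# Hypothetical standalone helper mirroring the category-based rules above.
# It accepts a plain dict so the logic can be unit tested in isolation.

def recommend_for_patient(patient: dict) -> str:
    """Return a '; '-joined recommendation string for one patient record."""
    recommendations = []
    category = patient.get("RiskCategory")

    if category == "High":
        recommendations.append("Immediate care coordination intervention required")
        if patient.get("ChronicConditionCount", 0) > 2:
            recommendations.append("Chronic care management program enrollment")
        if patient.get("MedicationCount", 0) > 5:
            recommendations.append("Medication reconciliation and therapy management")
        if patient.get("MissingFollowUp", False):
            recommendations.append("Schedule immediate follow-up appointment")
    elif category == "Medium":
        recommendations.append("Preventive care coordination recommended")
        if patient.get("DaysSinceLastEncounter", 0) > 90:
            recommendations.append("Schedule follow-up within 30 days")
        if patient.get("MedicationAdherence", 1) < 0.8:
            recommendations.append("Medication adherence counseling")
    else:
        # Any other category is treated as low risk, as in the original method.
        recommendations.append("Routine care maintenance")
        if patient.get("DaysSinceLastEncounter", 0) > 180:
            recommendations.append("Schedule routine check-up")

    return "; ".join(recommendations)


if __name__ == "__main__":
    sample = {"RiskCategory": "High", "ChronicConditionCount": 3, "MedicationCount": 2}
    print(recommend_for_patient(sample))
```

Keeping the rules in a pure function like this would make them straightforward to cover with `pytest`, which is already listed in `requirements.txt`.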


--------------------------------------------------------------------------------