├── Data Flow Diagram.png
├── Playbook for Watson Assistant Metrics Notebook.pdf
├── README.md
├── Watson Assistant Metrics Notebook.ipynb
├── Watson Dashboard Template.json
├── cognos-covid-dash.png
├── cognos-covid-dash2.png
├── coverage_calc.txt
├── db2-tables-sql-V2API.txt
├── db2-tables-sql.txt
├── dev
│   └── Watson Assistant Metrics Notebook (DB2).ipynb
└── testfile
/Data Flow Diagram.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/preethm/watson-assistant-metrics-notebook/0c10bd46c55a225ba412a4ec0655f6f5e146196c/Data Flow Diagram.png
--------------------------------------------------------------------------------
/Playbook for Watson Assistant Metrics Notebook.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/preethm/watson-assistant-metrics-notebook/0c10bd46c55a225ba412a4ec0655f6f5e146196c/Playbook for Watson Assistant Metrics Notebook.pdf
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Summary
2 | This Jupyter notebook was developed for analyzing the conversation logs for Watson Assistant (and Voice Interaction) solutions. The Playbook will demonstrate the steps required to complete the following:
3 |
4 |
5 | KPIs include conversation volume, key user topics, coverage, containment, escalation, and other key metrics vital to reporting on the performance of a Watson Assistant solution.
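
As a rough illustration of how two of these KPIs can be derived from log events, here is a minimal sketch on toy data. The column names, the node id `anything_else`, and the helper `summarize` are hypothetical stand-ins chosen for this example, not the notebook's exact code; the notebook itself works on the columns returned by `extractConversations`.

```python
import pandas as pd

# Toy log events. Column names and node ids are hypothetical stand-ins.
logs = pd.DataFrame({
    "conversation_id": ["c1", "c1", "c2", "c2", "c3"],
    "input_text": ["", "reset my password", "", "asdf qwerty", ""],
    "nodes_visited": [
        ["welcome"], ["node_reset"], ["welcome"], ["anything_else"], ["welcome"],
    ],
})

ANYTHING_ELSE_NODES = {"anything_else"}  # assumed fallback node id


def summarize(logs, fallback_nodes):
    """Return conversation count, abandonment at greeting, and coverage."""
    total_convs = logs["conversation_id"].nunique()

    # Blank inputs are greetings/system events, not user messages.
    user_msgs = logs[logs["input_text"] != ""]

    # Conversations that never produced a user message were abandoned.
    abandoned = total_convs - user_msgs["conversation_id"].nunique()

    # A user message is uncovered if it visited any fallback node.
    uncovered = user_msgs["nodes_visited"].apply(
        lambda nodes: any(node in fallback_nodes for node in nodes)
    )
    coverage = 1 - uncovered.mean() if len(user_msgs) else float("nan")

    return {
        "conversations": total_convs,
        "abandoned_at_greeting": abandoned,
        "coverage": coverage,
    }


metrics = summarize(logs, ANYTHING_ELSE_NODES)
print(metrics)
```

In this toy data, one of the three conversations has no user input, and one of the two user messages lands on the fallback node, so coverage is one half. A real run would also need to exclude Voice Gateway tags such as `vgwHangUp` from the user messages, as the notebook does.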
6 |
7 |
8 | ## Assets
9 | 1. **Playbook for Watson Assistant Metrics Notebook.pdf** - Contains step-by-step instructions in order to run the notebook and create a Dashboard.
10 | 2. **Watson Assistant Metrics Notebook.ipynb** - Jupyter Notebook that will be installed and run in Watson Studio.
11 | 3. **db2-tables-sql.txt** - Text file with the SQL statements needed to create the tables in a Db2 database (the shaped log data will be stored here).
12 |
13 | ## Instructions
14 | 1. Download the PDF in the repository named `Playbook for Watson Assistant Metrics Notebook.pdf`
15 | 2. Follow the step-by-step instructions in the playbook using the Jupyter notebook `Watson Assistant Metrics Notebook.ipynb` as a tool.
16 |
17 | ## Example Dashboards
18 |
19 |
20 |
21 |
22 |
23 | Happy Analyzing!
24 |
--------------------------------------------------------------------------------
/Watson Assistant Metrics Notebook.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "# Watson Assistant Metrics Notebook\n",
8 | "\n",
9 | "This notebook performs analytics on the user log records of Watson Assistant (including Voice Interaction). The logs are extracted, transformed, and loaded into a DB2 database and CSV files. A variety of key business metrics are calculated and displayed in the notebook. Using Watson Studio to build a dashboard is recommended for further data exploration and visualization. \n",
10 | "\n",
11 | "\n",
12 | "\n",
13 | "### Table of Contents\n",
14 | "* [1. Configuration and Log Collection](#config) - This section will extract and transform the user log data from Watson Assistant.\n",
15 | "* [2. Key Performance Metrics](#performance-metrics) - Key metrics including containment rate, active users, and top intents will be calculated. \n",
16 | "* [3. Export Logs](#export-logs) The transformed log data will be saved to a DB2 database and CSV files."
17 | ]
18 | },
19 | {
20 | "cell_type": "markdown",
21 | "metadata": {},
22 | "source": [
23 | "## Housekeeping \n",
24 | "This section will import libraries and dependencies for this notebook. \n",
25 | " \n",
26 | "> **Action Required:** Update the `project_id` and `project_access_token` in order to access your data assets. Instructions can be found here: https://dataplatform.cloud.ibm.com/docs/content/wsj/analyze-data/token.html"
27 | ]
28 | },
29 | {
30 | "cell_type": "code",
31 | "execution_count": null,
32 | "metadata": {},
33 | "outputs": [],
34 | "source": [
35 | "# @hidden_cell\n",
36 | "# The project token is an authorization token that is used to access project resources like data sources and connections, and is used by platform APIs.\n",
37 | "from project_lib import Project\n",
38 | "project = Project(project_id='XXXXXXXX', project_access_token='XXXXXXXXX')\n",
39 | "pc = project.project_context\n"
40 | ]
41 | },
42 | {
43 | "cell_type": "code",
44 | "execution_count": null,
45 | "metadata": {
46 | "scrolled": true
47 | },
48 | "outputs": [],
49 | "source": [
50 | "!curl -O https://raw.githubusercontent.com/cognitive-catalyst/WA-Testing-Tool/master/log_analytics/getAllLogs.py\n",
51 | "!curl -O https://raw.githubusercontent.com/cognitive-catalyst/WA-Testing-Tool/master/log_analytics/extractConversations.py\n",
52 | "!curl -O https://raw.githubusercontent.com/pratyushsingh97/Bigrams/master/bigrams.py\n",
53 | "\n",
54 | "%load_ext autoreload\n",
55 | "%autoreload 2\n",
56 | "import warnings\n",
57 | "warnings.simplefilter(\"ignore\")\n",
58 | "\n",
59 | "!pip install ibm-watson\n",
60 | "!pip install --user --upgrade \"pandas==1.0.3\";\n",
61 | "!pip install -r https://raw.githubusercontent.com/pratyushsingh97/Bigrams/master/requirements.txt\n",
62 | "\n",
63 | "import json\n",
64 | "import pandas as pd\n",
65 | "import getAllLogs\n",
66 | "import extractConversations\n",
67 | "import bigrams\n",
68 | "import seaborn as sn\n",
69 | "import ibm_db\n",
70 | "import ibm_db_dbi"
71 | ]
72 | },
73 | {
74 | "cell_type": "code",
75 | "execution_count": null,
76 | "metadata": {},
77 | "outputs": [],
78 | "source": [
79 | "# Custom functions to re-use code throughout notebook\n",
80 | "def turn_dict_to_df(df,col_names):\n",
81 | " df = pd.DataFrame.from_dict(df)\n",
82 | " df.reset_index(level=0, inplace=True)\n",
83 | " df.columns = col_names\n",
84 | " return df"
85 | ]
86 | },
87 | {
88 | "cell_type": "markdown",
89 | "metadata": {},
90 | "source": [
91 | "## 1. Configuration and log collection \n",
92 | "This section will configure your DB2 connection, log query parameters, and will extract the logs from your Watson Assistant instance.\n",
93 | "\n",
94 | "> **Action Required:** Update each of the variables marked with 'XXXXXXXX'. The comments in the cells guide you in the configuration."
95 | ]
96 | },
97 | {
98 | "cell_type": "code",
99 | "execution_count": null,
100 | "metadata": {},
101 | "outputs": [],
102 | "source": [
103 | "# Define the customer name. This prefix will be used for saving CSV & JSON files.\n",
104 | "custName = 'XXXXXXXX'\n",
105 | "\n",
106 | "# Set True or False if you want data to write to DB2 table\n",
107 | "connectDB2 = True\n",
108 | "\n",
109 | "# Set the start date for the log fetch. If you are using the DB2 connection in Section 1.1, this will be defined automatically.\n",
110 | "log_fetch_start = '2020-05-15'"
111 | ]
112 | },
113 | {
114 | "cell_type": "markdown",
115 | "metadata": {},
116 | "source": [
117 | "### 1.1 Configure & Establish DB2 connection\n",
118 | "This section will define the values for your DB2 database, establish the connection, and check for the last loaded date.\n",
119 | "\n",
120 | "> **Action Required:** You will first need to provision your DB2 instance and establish the table schemas. Follow these instructions. Then, update the values below marked 'XXXXXXXX'. "
121 | ]
122 | },
123 | {
124 | "cell_type": "code",
125 | "execution_count": 3,
126 | "metadata": {},
127 | "outputs": [],
128 | "source": [
129 | "# Enter the values for your database connection. These can be found in DB2's Service Credentials from the tooling. \n",
130 | "dsn_database = \"XXXXXXXX\" # e.g. \"MORTGAGE\"\n",
131 | "dsn_uid = \"XXXXXXXX\" # e.g. \"dash104434\"\n",
132 | "dsn_pwd = \"XXXXXXXX\" # e.g. \"7dBZ3jWt9xN6$o0JiX!m\"\n",
133 | "dsn_hostname = \"XXXXXXXX\" # e.g. \"Use the same IP as Web Console\"\n",
134 | "dsn_port = \"50000\" # e.g. \"50000\" \n",
135 | "dsn_protocol = \"TCPIP\" # i.e. \"TCPIP\"\n",
136 | "dsn_driver = \"IBM DB2 ODBC DRIVER\" # Don't change"
137 | ]
138 | },
139 | {
140 | "cell_type": "code",
141 | "execution_count": null,
142 | "metadata": {},
143 | "outputs": [],
144 | "source": [
145 | "# Establish database connection\n",
146 | "if connectDB2 == True:\n",
147 | " dsn = (\"DRIVER={{IBM DB2 ODBC DRIVER}};\" \"DATABASE={0};\" \"HOSTNAME={1};\" \"PORT={2};\" \"PROTOCOL=TCPIP;\" \"UID={3};\" \"PWD={4};\").format(dsn_database, dsn_hostname, dsn_port, dsn_uid, dsn_pwd)\n",
148 | " options = { ibm_db.SQL_ATTR_AUTOCOMMIT: ibm_db.SQL_AUTOCOMMIT_ON }\n",
149 | " conn = ibm_db.connect(dsn, \"\", \"\",options)\n",
150 | " #Added options for auto commit\n",
151 | " \n",
152 | " # Retrieve the date for the previous DB2 run. If there is none defined, use 2020-04-15. This variable log_fetch_start is used for filtering WA logs.\n",
153 | " select_sql = 'SELECT * FROM WATSON.WA_LAST_RUN_LOG'\n",
154 | " select_stmt = ibm_db.exec_immediate(conn, select_sql)\n",
155 | " prev_run = ibm_db.fetch_both(select_stmt)\n",
156 | " first_run = True\n",
157 | " log_fetch_start = '2020-04-15'\n",
158 | " if prev_run != False:\n",
159 | " first_run = False\n",
160 | " l_conversation_id = prev_run.get('CONVERSATION_ID')\n",
161 | " l_request_timestamp = prev_run.get('REQUEST_TIMESTAMP')\n",
162 | " l_response_timestamp = prev_run.get('RESPONSE_TIMESTAMP')\n",
163 | " l_prev_run = prev_run.get('LASTRUN_TIMESTAMP')\n",
164 | " log_fetch_start = str(l_response_timestamp.date())\n",
165 | "print('log_fetch_start:',log_fetch_start)"
166 | ]
167 | },
168 | {
169 | "cell_type": "markdown",
170 | "metadata": {},
171 | "source": [
172 | "### 1.2 Retrieve logs from the Watson Assistant instance\n",
173 | "This section will retrieve the user logs from the Assistant `/logs` API.\n",
174 | "\n",
175 | "> **Action Required:** Update the fields below marked 'XXXXXXXX' based on the credentials of your Assistant. \n",
176 | "Solutions using an Assistant layer (v2 API) should set `workspace_id=None` and provide `assistant_id`. Otherwise, define workspace and comment out assistant_id.\n",
177 | "\n",
178 | "\n"
179 | ]
180 | },
181 | {
182 | "cell_type": "code",
183 | "execution_count": null,
184 | "metadata": {},
185 | "outputs": [],
186 | "source": [
187 | "# Extract logs from your assistant. Complete this information.\n",
188 | "iam_apikey = 'XXXXXXXX' \n",
189 | "url = \"XXXXXXXX\" # Set the URL to the region, e.g. https://api.us-east.assistant.watson.cloud.ibm.com\n",
190 | "assistant_id = 'XXXXXXXX'\n",
191 | "workspace_id = None\n",
192 | "\n",
193 | "# If not using assistant_id, comment out the 2nd line below. \n",
194 | "log_filter=\"language::en,response_timestamp>=\" + log_fetch_start \\\n",
195 | "+\",request.context.system.assistant_id::\" + assistant_id\n",
196 | "\n",
197 | "#Change the number of logs retrieved, default settings will return 100,000 logs (200 pages of 500)\n",
198 | "page_size_limit=500\n",
199 | "page_num_limit=200\n",
200 | "\n",
201 | "version=\"2018-09-20\" # Watson Assistant API version\n",
202 | "\n",
203 | "rawLogsJson = getAllLogs.getLogs(iam_apikey, url, workspace_id, log_filter, page_size_limit, page_num_limit, version)\n",
204 | "rawLogsPath= custName + \"_logs.json\"\n",
205 | "\n",
206 | "# getAllLogs.writeLogs(rawLogsJson, rawLogsPath) # Saves the logs locally\n",
207 | "project.save_data(file_name = rawLogsPath,data = json.dumps(rawLogsJson),overwrite=True); # Saves the logs in Studio/COS\n",
208 | "print('\\nSaved log data to {}'.format(rawLogsPath))"
209 | ]
210 | },
211 | {
212 | "cell_type": "markdown",
213 | "metadata": {},
214 | "source": [
215 | "### 1.3 Load logs from JSON file (Defunct)\n",
216 | "If you have previously saved the JSON file, you can uncomment this section to load the logs. Otherwise, comment this section out and continue."
217 | ]
218 | },
219 | {
220 | "cell_type": "code",
221 | "execution_count": null,
222 | "metadata": {},
223 | "outputs": [],
224 | "source": [
225 | "# #If you have previously stored your logs on the file system, you can reload them here by uncommenting these lines\n",
226 | "# rawLogsPath= custName+\"_logs.json\"\n",
227 | "# rawLogsJson = extractConversations.readLogs(rawLogsPath)"
228 | ]
229 | },
230 | {
231 | "cell_type": "markdown",
232 | "metadata": {},
233 | "source": [
234 | "### 1.4 Format logs\n",
235 | "Now that the logs have been retrieved, this section will transform the data out of JSON format and into a Pandas dataframe. \n",
236 | "\n",
237 | "> **Optional:** If you wish to add any custom fields (such as a context variable), add them to the first line, `customFieldNames`, below. Otherwise, run this cell as-is."
238 | ]
239 | },
240 | {
241 | "cell_type": "code",
242 | "execution_count": null,
243 | "metadata": {},
244 | "outputs": [],
245 | "source": [
246 | "# Optionally provide a comma-separated list of custom fields you want to extract, in addition to the default fields\n",
247 | "customFieldNames = ''\n",
248 | "\n",
249 | "# Unique conversation identifier across all records. This is default. For a multi-skill assistant you will need to provide your own key.\n",
250 | "primaryLogKey = \"response.context.conversation_id\"\n",
251 | "conversationKey='conversation_id' # Name of the correlating key as it appears in the data frame columns (remove 'response.context.')\n",
252 | "\n",
253 | "# These custom fields are added to the list. They are used for extracting metrics in the notebook. Do not change these.\n",
254 | "customFieldNames = customFieldNames + \",response.context.vgwSIPFromURI,response.context.vgwSessionID,request.context.vgwSMSFailureReason,\\\n",
255 | "request.context.vgwSMSUserPhoneNumber,response.output.vgwAction.parameters.transferTarget,response.context.language,\\\n",
256 | "response.context.metadata.user_id,response.output.generic\"\n",
257 | "\n",
258 | "allLogsDF = extractConversations.extractConversationData(rawLogsJson, primaryLogKey, customFieldNames)\n",
259 | "conversationsGroup = allLogsDF.groupby(conversationKey,as_index=False)\n",
260 | "\n",
261 | "# Splits the response_timestamp into month, day, and year fields that can be used for easier data filtering/visualizations \n",
262 | "allLogsDF[\"full_date\"] = pd.to_datetime(allLogsDF[\"response_timestamp\"])\n",
263 | "allLogsDF['month'] = allLogsDF['full_date'].dt.month\n",
264 | "allLogsDF['day'] = allLogsDF['full_date'].dt.day\n",
265 | "allLogsDF['year'] = allLogsDF['full_date'].dt.year\n",
266 | "\n",
267 | "print(\"Total log events:\",len(allLogsDF))\n",
268 | "allLogsDF.head()"
269 | ]
270 | },
271 | {
272 | "cell_type": "code",
273 | "execution_count": 1,
274 | "metadata": {},
275 | "outputs": [],
276 | "source": [
277 | "# Preprocessing step: replace inputs containing non-ASCII characters with a placeholder string\n",
278 | "def _is_ascii(text) -> str:\n",
279 | " try:\n",
280 | " text.encode().decode('ascii')\n",
281 | " except UnicodeDecodeError: # if you cannot decode the string into ascii then it contains non-ascii characters\n",
282 | " text = \"TEXT CONTAINED NON-ASCII CHARS\"\n",
283 | " \n",
284 | " return text\n",
285 | " \n",
286 | "allLogsDF['input.text'] = allLogsDF['input.text'].apply(lambda text: _is_ascii(text)) # Go through each row and check if it is ascii\n",
287 | "print(\"Total log events:\",len(allLogsDF))\n",
288 | "allLogsDF.head()"
289 | ]
290 | },
291 | {
292 | "cell_type": "code",
293 | "execution_count": null,
294 | "metadata": {},
295 | "outputs": [],
296 | "source": [
297 | "# Print the column names\n",
298 | "# allLogsDF.columns"
299 | ]
300 | },
301 | {
302 | "cell_type": "markdown",
303 | "metadata": {},
304 | "source": [
305 | "### 1.5 Extract Bigrams from Input Text\n",
306 | "A bigram is a sequence of two adjacent elements from a string of tokens, which are typically letters, syllables, or words. A Python script will be used to extract the bigrams in their lemmatized form. For example, \"I am eating pie\" and \"I eat pie\" result in the same bigram \"eat_pie\". This cell will extract the bigrams from the `input.text` column and load them into an `input_bigrams` column. This column can be used to create efficient data visualizations (e.g. word clouds)."
307 | ]
308 | },
309 | {
310 | "cell_type": "code",
311 | "execution_count": null,
312 | "metadata": {},
313 | "outputs": [],
314 | "source": [
315 | "bigrams_list = bigrams.runner(allLogsDF['input.text'], stopwords=[\"like\"]) # add \"like\" to the list of stopwords\n",
316 | "allLogsDF['input_bigrams'] = bigrams_list"
317 | ]
318 | },
319 | {
320 | "cell_type": "markdown",
321 | "metadata": {},
322 | "source": [
323 | "# 2. Key Performance Metrics \n",
324 | "The notebook will calculate various performance metrics including `coverage` and `containment`. Standard volume metrics will also be provided.\n",
325 | "\n",
326 | "* [2.1 Core Metrics](#core-metrics) - These are conversational metrics that apply to both chat and voice solutions.\n",
327 | "* [2.2 Voice Interaction Metrics](#voice-metrics) - Additional measurements for voice solutions including phone calls, call transfers, unique caller IDs, etc.\n",
328 | "* [2.3 Custom Metrics](#custom-metrics) - Other ad-hoc analysis. Requires knowledge of Python."
329 | ]
330 | },
331 | {
332 | "cell_type": "markdown",
333 | "metadata": {},
334 | "source": [
335 | "## 2.1 Core Metrics \n",
336 | "These metrics apply to all Watson Assistant solutions. For voice solutions, additional metrics are in the next section.\n",
337 | "* [2.1.1 Abandonment at Greeting](#abandonment)\n",
338 | "* [2.1.2 Coverage Metric](#coverage-metric)\n",
339 | "* [2.1.3 Search Skill Responses](#search-skill)\n",
340 | "* [2.1.4 Escalation Requests](#escalation-metric)\n",
341 | "* [2.1.5 Active Users](#active-users)\n",
342 | "* [2.1.6 Top Intents & Average Confidence Scores](#top-intents-scores)\n",
343 | "* [2.1.7 Top Entities](#top-entities)\n",
344 | "* [2.1.8 Optional: Bilingual Assistants](#bilingual-assistants)"
345 | ]
346 | },
347 | {
348 | "cell_type": "code",
349 | "execution_count": null,
350 | "metadata": {},
351 | "outputs": [],
352 | "source": [
353 | "# dict{} that we will send to CSV for use in Watson Studio Cognos Dashboard\n",
354 | "metrics_dict = {}\n",
355 | "\n",
356 | "# These should match the count in the Watson Assistant Analytics tooling.\n",
357 | "totalConvs = len(allLogsDF[conversationKey].unique())\n",
358 | "print(\"Total messages: \", len(allLogsDF))\n",
359 | "print(\"Total conversations:\", totalConvs)"
360 | ]
361 | },
362 | {
363 | "cell_type": "markdown",
364 | "metadata": {},
365 | "source": [
366 | "### 2.1.1 Abandonment at Greeting \n",
367 | "\n",
368 | "The logs include non-user messages such as welcome messages and system messages from a Voice Interaction solution. Filtering out these messages reveals how many conversations were abandoned before the first user utterance."
369 | ]
370 | },
371 | {
372 | "cell_type": "code",
373 | "execution_count": null,
374 | "metadata": {},
375 | "outputs": [],
376 | "source": [
377 | "# This removes blank inputs and vgwHangUp tags in log events\n",
378 | "filteredLogsDF = allLogsDF[allLogsDF['input.text'] != \"\"]\n",
379 | "filteredLogsDF = filteredLogsDF[filteredLogsDF['input.text'] != 'vgwHangUp'] \n",
380 | "filteredLogsDF = filteredLogsDF[filteredLogsDF['input.text'] != 'vgwPostResponseTimeout'] \n",
381 | "\n",
382 | "filteredMessages = len(filteredLogsDF)\n",
383 | "filteredConvs = len(filteredLogsDF[conversationKey].unique())\n",
384 | "abandonedAtGreeting = (totalConvs - filteredConvs)\n",
385 | "metrics_dict['abandonedAtGreeting'] = [abandonedAtGreeting] # Put into metrics dict\n",
386 | "\n",
387 | "print(\"Abandoned conversations (no user input):\", abandonedAtGreeting)"
388 | ]
389 | },
390 | {
391 | "cell_type": "markdown",
392 | "metadata": {},
393 | "source": [
394 | "### 2.1.2 Coverage Metric \n",
395 | "Coverage is the portion of total user messages that your assistant attempts to respond to. For example, any message answered with \"Sorry I didn't understand\" from the anything_else node is considered uncovered.\n",
396 | "\n",
397 | "> **Action Required:** Define the node ids in the `anything_else_nodes` list that represent any responses for uncovered messages. These can be found by exporting the Skill from the Assistant tooling and searching the JSON for the relevant `dialog_node`. "
398 | ]
399 | },
400 | {
401 | "cell_type": "code",
402 | "execution_count": null,
403 | "metadata": {},
404 | "outputs": [],
405 | "source": [
406 | "# Define the node_id for anything_else and other uncovered nodes\n",
407 | "anything_else_nodes = ['XXXXXXXX'] \n",
408 | "\n",
409 | "# coveredDF = allLogsDF\n",
410 | "allLogsDF.rename(columns={'input.text': 'input_text'}, inplace=True)\n",
411 | "coverage = []\n",
412 | "\n",
413 | "for row in allLogsDF.itertuples():\n",
414 | " appended = False \n",
415 | " nodes = row.nodes_visited\n",
416 | " for node in nodes:\n",
417 | " if node in anything_else_nodes:\n",
418 | " coverage.append('uncovered') # Mark as uncovered if message hit node in anything_else_nodes list\n",
419 | " appended = True\n",
420 | " break\n",
421 | " if (row.input_text == '' or row.input_text == 'vgwHangUp' or row.input_text == 'vgwPostResponseTimeout') and not appended:\n",
422 | " coverage.append('system_message') # Mark greetings and voicegateway actions as system_messages\n",
423 | " appended = True\n",
424 | " if not appended:\n",
425 | " coverage.append('covered') # else, everything else is covered\n",
426 | "\n",
427 | "allLogsDF['coverage'] = coverage\n",
428 | "allLogsDF.rename(columns={'input_text': 'input.text'}, inplace=True)\n",
429 | "coveredDF = allLogsDF[allLogsDF['coverage'] == 'covered']\n",
430 | "uncoveredDF = allLogsDF[allLogsDF['coverage'] == 'uncovered']\n",
431 | "\n",
432 | "print('Covered messages: ', len(coveredDF))\n",
433 | "print('Uncovered messages: ', len(allLogsDF[allLogsDF['coverage'] == 'uncovered']))\n",
434 | "print('System messages: ', len(allLogsDF[allLogsDF['coverage'] == 'system_message']))\n",
435 | "print('\\nCoverage metric: ','{:.0%}'.format(len(coveredDF) / filteredMessages))\n",
436 | "\n",
437 | "# coveredMsgs[['input_text','output.text','coverage']].tail(10)\n",
438 | "\n",
439 | "metrics_dict['coverage'] = [len(coveredDF) / filteredMessages] # Put into metrics dict"
440 | ]
441 | },
442 | {
443 | "cell_type": "code",
444 | "execution_count": null,
445 | "metadata": {},
446 | "outputs": [],
447 | "source": [
448 | "# uncoveredDF[['input.text','output.text']].head(10)"
449 | ]
450 | },
451 | {
452 | "cell_type": "markdown",
453 | "metadata": {},
454 | "source": [
455 | "### 2.1.3 Search Skill Responses \n",
456 | "Watson Assistant has multiple response types including `text`, `option`, `image`, `pause`, or `search skill`. Each of these types is marked within `output.generic.response_type` inside the log data. This cell will calculate the number of Search Skill responses."
457 | ]
458 | },
459 | {
460 | "cell_type": "code",
461 | "execution_count": null,
462 | "metadata": {},
463 | "outputs": [],
464 | "source": [
465 | "# Run this cell\n",
466 | "response_type = []\n",
467 | "\n",
468 | "for row in allLogsDF['output.generic']:\n",
469 | " search_skill = False\n",
470 | " for response in row: # each output can have multiple responses\n",
471 | " if response['response_type'] == 'search_skill':\n",
472 | " response_type.append('search_skill')\n",
473 | " search_skill = True\n",
474 | " break\n",
475 | " \n",
476 | " if not search_skill: # if the response was not a search skill, append other to the list\n",
477 | " response_type.append('other')\n",
478 | " \n",
479 | "allLogsDF['response_type'] = response_type # Add in response_type column to allLogsDF\n",
480 | "searchSkillDF = allLogsDF[allLogsDF['response_type'] == 'search_skill'] # Set new DF \n",
481 | "print('Total Search Skill responses:',len(searchSkillDF))\n",
482 | "print('Percentage of total messages: {:.0%}'.format(len(searchSkillDF) / len(allLogsDF) ))\n",
483 | "\n",
484 | "searchSkillDF[['input.text','response_type']].head().reset_index(drop=True) # Print the list of user inputs that caused search skill"
485 | ]
486 | },
487 | {
488 | "cell_type": "code",
489 | "execution_count": null,
490 | "metadata": {},
491 | "outputs": [],
492 | "source": [
493 | "# Saves to CSV\n",
494 | "project.save_data(file_name = custName + \"_search-skill-inputs.csv\",data = searchSkillDF.to_csv(index=False),overwrite=True); # This saves in COS. Comment out if running locally"
495 | ]
496 | },
497 | {
498 | "cell_type": "markdown",
499 | "metadata": {},
500 | "source": [
501 | "### 2.1.4 Escalation Requests \n",
502 | "\n",
503 | "Escalation refers to any time a user is prompted to contact a live person (e.g. 1-800 number). If the assistant has an integration with a live handoff service (e.g. ZenDesk), this is considered escalation. For Voice Interaction solutions, we calculate `call containment` in the next section by counting the number of actual call transfers in the logs.\n",
504 | "\n",
505 | "> **Action Required:** Define the node id in `escalation_node` for a node that represents any responses to an escalation request (e.g. `#General-Agent-Escalation`). This can be found by exporting the Skill from the Assistant tooling, and searching the JSON for the relevant dialog_node.\n",
506 | " "
507 | ]
508 | },
509 | {
510 | "cell_type": "code",
511 | "execution_count": null,
512 | "metadata": {},
513 | "outputs": [],
514 | "source": [
515 | "# Define the escalation node\n",
516 | "escalation_node = \"XXXXXXXX\" \n",
517 | "node_visits_escalated = allLogsDF[[escalation_node in x for x in allLogsDF['nodes_visited']]]\n",
518 | "\n",
519 | "escalationMetric = len(node_visits_escalated)/filteredMessages\n",
520 | "metrics_dict['escalation'] = [escalationMetric] # Put into metrics dict\n",
521 | "print(\"Total visits to escalation node:\",len(node_visits_escalated))\n",
522 | "print(\"Percent of total messages escalated:\",'{:.0%}'.format(escalationMetric))"
523 | ]
524 | },
525 | {
526 | "cell_type": "markdown",
527 | "metadata": {},
528 | "source": [
529 | "### 2.1.5 Active Users \n",
530 | "How many unique users used the assistant?"
531 | ]
532 | },
533 | {
534 | "cell_type": "code",
535 | "execution_count": null,
536 | "metadata": {},
537 | "outputs": [],
538 | "source": [
539 | "uniqueUsers = allLogsDF[\"metadata.user_id\"].nunique()\n",
540 | "metrics_dict['uniqueUsers'] = [uniqueUsers] # inserts into metrics dict\n",
541 | "print('Total unique users: {}'.format(uniqueUsers))"
542 | ]
543 | },
544 | {
545 | "cell_type": "markdown",
546 | "metadata": {},
547 | "source": [
548 | "### 2.1.6 Top Intents & Average Confidence Scores "
549 | ]
550 | },
551 | {
552 | "cell_type": "code",
553 | "execution_count": null,
554 | "metadata": {},
555 | "outputs": [],
556 | "source": [
557 | "# Using pandas aggregators to count how often each intent is selected and its average confidence\n",
558 | "intentsDF = filteredLogsDF.groupby('intent',as_index=False).agg({\n",
559 | " 'input.text': ['count'], \n",
560 | " 'intent_confidence': ['mean']\n",
561 | "})\n",
562 | "\n",
563 | "intentsDF.columns=[\"intent\",\"count\",\"confidence\"] #Flatten the column headers for ease of use\n",
564 | "\n",
565 | "intentsDF = intentsDF[intentsDF['intent'] !=''] # Remove blanks, usually VGW tags + greetings\n",
566 | "intentsDF = intentsDF.sort_values('count',ascending=False)\n",
567 | "intentsDF = intentsDF.reset_index(drop=True)\n",
568 | "intentsDF.head(5) # If you want specific number shown, edit inside head(). If you want to show all, remove head() "
569 | ]
570 | },
571 | {
572 | "cell_type": "code",
573 | "execution_count": null,
574 | "metadata": {},
575 | "outputs": [],
576 | "source": [
577 | "#ax = sn.barplot(x=\"count\", y=\"intent\", data=intentsDF.head(),orient='h',palette=\"Blues_d\").set_title('Top Intents')"
578 | ]
579 | },
580 | {
581 | "cell_type": "markdown",
582 | "metadata": {},
583 | "source": [
584 | "### 2.1.7 Top Entities (Defunct) "
585 | ]
586 | },
587 | {
588 | "cell_type": "code",
589 | "execution_count": null,
590 | "metadata": {},
591 | "outputs": [],
592 | "source": [
593 | "entityDF = allLogsDF[allLogsDF[\"entities\"] != \"\"]\n",
594 | "#intentsDF = intentsDF[intentsDF['intent'] !=''] # Remove blanks, usually VGW tags + greetings\n",
595 | "entityDF[\"entities\"].iloc[0]"
596 | ]
597 | },
598 | {
599 | "cell_type": "markdown",
600 | "metadata": {},
601 | "source": [
602 | "### 2.1.8 Optional: Bilingual Assistants \n",
603 | "This section is for assistants that use a single skill for two different languages. The skill may set a context variable (e.g. `$language==\"english\"`) and then respond accordingly based on this variable. This cell will count the unique conversation_ids that have a given context variable.\n",
604 | "\n",
605 | "> **Optional:** Define the `languageVar` that your skill uses to identify the language used to respond to the user."
606 | ]
607 | },
608 | {
609 | "cell_type": "code",
610 | "execution_count": null,
611 | "metadata": {},
612 | "outputs": [],
613 | "source": [
614 | "languageVar = 'language' # define the context variable that you retrieved above in customFields\n",
615 | "\n",
616 | "languageDF = allLogsDF.groupby([languageVar])[\"conversation_id\"].nunique()\n",
617 | "languageDF = turn_dict_to_df(languageDF, ['Context Var', 'Count'])\n",
618 | "languageDF = languageDF[languageDF['Context Var'] != '']\n",
619 | "languageDF"
620 | ]
621 | },
622 | {
623 | "cell_type": "markdown",
624 | "metadata": {},
625 | "source": [
626 | "## 2.2 Voice Interaction Metrics \n",
627 | "These metrics are for Voice Agent solutions. We start with volume metrics. \n",
628 | "If your solution is chat only, [skip to the next section.](#export-to-csv)\n",
629 | "\n",
630 | "* [2.2.1 Call Containment Rate](#containment-rate)\n",
631 | "* [2.2.2 Unique Callers](#unique-callers)\n",
632 | "* [2.2.3 SMS Sent](#sms-sent)"
633 | ]
634 | },
635 | {
636 | "cell_type": "code",
637 | "execution_count": null,
638 | "metadata": {},
639 | "outputs": [],
640 | "source": [
641 | "uniqueCallers = allLogsDF['vgwSIPFromURI'].unique()\n",
642 | "uniqueCalls = allLogsDF['vgwSessionID'].unique()\n",
643 | "\n",
644 | "print(\"Total phone calls:\", len(uniqueCalls)) # It will print '1' if there are no calls found in the logs\n",
645 | "print(\"Total unique callers:\", len(uniqueCallers))\n",
646 | "print(\"Average messages per call:\", int(len(allLogsDF) / len(uniqueCalls)))"
647 | ]
648 | },
649 | {
650 | "cell_type": "code",
651 | "execution_count": null,
652 | "metadata": {},
653 | "outputs": [],
654 | "source": [
655 | "# Filters out blank inputs and vgwHangUp tags in log events\n",
656 | "filteredLogsDF = allLogsDF[allLogsDF['input.text'] != \"\"]\n",
657 | "filteredLogsDF = filteredLogsDF[filteredLogsDF['input.text'] != 'vgwHangUp'] \n",
658 | "filteredLogsDF = filteredLogsDF[filteredLogsDF['input.text'] != 'vgwPostResponseTimeout'] "
659 | ]
660 | },
661 | {
662 | "cell_type": "markdown",
663 | "metadata": {},
664 | "source": [
665 | "### 2.2.1 Call Containment Rate \n",
666 | "How many call transfers did the voice solution perform?"
667 | ]
668 | },
669 | {
670 | "cell_type": "code",
671 | "execution_count": null,
672 | "metadata": {},
673 | "outputs": [],
674 | "source": [
675 | "transfersDF = allLogsDF.groupby([\"output.vgwAction.parameters.transferTarget\"])[\"vgwSessionID\"].count()\n",
676 | "transfersDF = turn_dict_to_df(transfersDF, ['TransferTo', 'Count'])\n",
677 | "transfersDF = transfersDF[transfersDF['TransferTo'] != '']\n",
678 | "\n",
679 | "print('Call transfer count:', transfersDF['Count'].sum()) \n",
680 | "containmentRate = 1 - transfersDF['Count'].sum() / len(uniqueCalls)\n",
681 | "print('Call containment rate:', '{:.0%}'.format(containmentRate))\n",
682 | "metrics_dict['callTransfers'] = [transfersDF['Count'].sum()] # Put into metrics dict\n",
683 | "metrics_dict['containment'] = [containmentRate] # Put into metrics dict\n",
684 | "transfersDF.sort_values('Count',ascending=False)"
685 | ]
686 | },
687 | {
688 | "cell_type": "markdown",
689 | "metadata": {},
690 | "source": [
691 | "### 2.2.2 Unique Callers \n",
692 | "How many unique caller IDs dialed into the voice solution?"
693 | ]
694 | },
695 | {
696 | "cell_type": "code",
697 | "execution_count": null,
698 | "metadata": {},
699 | "outputs": [],
700 | "source": [
701 | "callsDF = allLogsDF.groupby(['vgwSIPFromURI'])['vgwSessionID'].nunique()\n",
702 | "callsDF = pd.DataFrame.from_dict(callsDF)\n",
703 | "callsDF.reset_index(level=0, inplace=True)\n",
704 | "callsDF.columns = ['Caller ID', 'Call Count']\n",
705 | "print('Total unique caller IDs:', len(callsDF))\n",
706 | "metrics_dict['callerIDs'] = [len(callsDF)] # Put into metrics dict\n",
707 | "callsDF.sort_values('Call Count', ascending=False).head() # Sort before head() to show the top callers; must be the last expression to render"
708 | ]
709 | },
710 | {
711 | "cell_type": "markdown",
712 | "metadata": {},
713 | "source": [
714 | "### 2.2.3 SMS Sent \n",
715 | "How many SMS messages did the assistant send? A text message can be sent to the caller, initiated from within the Watson Assistant JSON editor. This cell counts the log events that have a non-empty `vgwSMSUserPhoneNumber`."
716 | ]
717 | },
718 | {
719 | "cell_type": "code",
720 | "execution_count": null,
721 | "metadata": {},
722 | "outputs": [],
723 | "source": [
724 | "smsDF = allLogsDF[allLogsDF['vgwSMSUserPhoneNumber'] != '']\n",
725 | "metrics_dict['sms'] = [len(smsDF)] # Put into metrics dict\n",
726 | "print('Total SMS sent to callers: {}'.format(len(smsDF)))"
727 | ]
728 | },
729 | {
730 | "cell_type": "markdown",
731 | "metadata": {},
732 | "source": [
733 | "## 2.3 Custom Metrics \n",
734 | "This section is optional and can be used to create custom metrics. It requires basic knowledge of Python and pandas. The two example custom metrics below can be modified, or additional metrics can be added here. [Jump to section 3](#export-logs) if you do not wish to build custom metrics.\n",
735 | "\n",
736 | "* [2.3.1 Context Variable Count](#context-variable-count)\n",
737 | "* [2.3.2 Response Mentions](#response-mentions)"
738 | ]
739 | },
740 | {
741 | "cell_type": "markdown",
742 | "metadata": {},
743 | "source": [
744 | "### 2.3.1 Context Variable Count \n",
745 | "Some use cases require the use of context variables in order to track user inputs. For one customer, the assistant asks a series of questions in order to screen the patient. \n",
746 | "\n",
747 | "> **Optional:** If you wish to count the unique conversation IDs for each value of a context variable, define `contextVar` below."
748 | ]
749 | },
750 | {
751 | "cell_type": "code",
752 | "execution_count": null,
753 | "metadata": {},
754 | "outputs": [],
755 | "source": [
756 | "contextVar = 'preferredContact' # define the context variable that you retrieved above in customFields\n",
757 | "\n",
758 | "contextDF = allLogsDF.groupby([contextVar])[\"conversation_id\"].nunique()\n",
759 | "contextDF = turn_dict_to_df(contextDF, ['Context Var', 'Count'])\n",
760 | "contextDF = contextDF[contextDF['Context Var'] != '']\n",
761 | "contextDF"
762 | ]
763 | },
764 | {
765 | "cell_type": "code",
766 | "execution_count": null,
767 | "metadata": {},
768 | "outputs": [],
769 | "source": [
770 | "contextVar = 'contactSubmitted' # define the context variable that you retrieved above in customFields\n",
771 | "\n",
772 | "contextDF = allLogsDF.groupby([contextVar])[\"conversation_id\"].nunique()\n",
773 | "contextDF = turn_dict_to_df(contextDF, ['Context Var', 'Count'])\n",
774 | "contextDF = contextDF[contextDF['Context Var'] != '']\n",
775 | "contextDF"
776 | ]
777 | },
778 | {
779 | "cell_type": "code",
780 | "execution_count": null,
781 | "metadata": {},
782 | "outputs": [],
783 | "source": [
784 | "# project.save_data(file_name = custName + \"_ScreeningCount.csv\",data = contextDF.to_csv(index=False),overwrite=True) # This saves the contextDF from the cell above in COS. Comment out if running locally"
785 | ]
786 | },
787 | {
788 | "cell_type": "markdown",
789 | "metadata": {},
790 | "source": [
791 | "### 2.3.2 Response Mentions \n",
792 | "A specific customer wanted to identify all mentions of `311` in the responses to users. You can modify this or comment it out."
793 | ]
794 | },
795 | {
796 | "cell_type": "code",
797 | "execution_count": null,
798 | "metadata": {},
799 | "outputs": [],
800 | "source": [
801 | "helpDF = allLogsDF[allLogsDF['output.text'].str.contains('311|3-1-1', na=False)] # na=False treats missing responses as non-matches\n",
802 | "print('Total 3-1-1 response mentions:', len(helpDF))"
803 | ]
804 | },
805 | {
806 | "cell_type": "markdown",
807 | "metadata": {},
808 | "source": [
809 | "# 3. Export Logs \n",
810 | "The transformed log data inside the Pandas dataframe will be saved to CSV files and to a DB2 on Cloud database. These logs can be used for further data exploration and for creating visualizations in Cognos Dashboard in Watson Studio.\n",
811 | "\n",
812 | "* [3.1 Saving CSV files to Cloud Object Storage](#export-to-csv): CSV files will be saved to the project's Data Assets and Cloud Object Storage.\n",
813 | "* [3.2 Loading into DB2 on Cloud database](#export-to-db2): The data will be saved to a table on your DB2 instance. \n",
814 | "\n",
815 | "## 3.1 Saving CSV files to Cloud Object Storage \n",
816 | "The data will be saved into a CSV file in Cloud Object Storage, accessible via your project's assets folder in Watson Studio. There will be three distinct CSV files saved:\n",
817 | "* `_logs.csv` will contain all of the data within the allLogs dataframe\n",
818 | "* `_KeyMetrics.csv` will contain the calculated metrics such as coverage, escalation, containment rate, etc.\n",
819 | "* `_uncovered_msgs.csv` will contain the selection of uncovered messages. This file can be used for making improvements to intent training and dialog responses.\n",
820 | "\n",
821 | "\n",
822 | "### 3.1.1 Save all logs to CSV"
823 | ]
824 | },
825 | {
826 | "cell_type": "code",
827 | "execution_count": null,
828 | "metadata": {},
829 | "outputs": [],
830 | "source": [
831 | "# allLogsDF.to_csv(custName+'_logs.csv',index=False) # This saves if running notebook locally. Comment out for Studio. \n",
832 | "print('Saving all logs to {}'.format(custName+ \"_logs.csv\"))\n",
833 | "project.save_data(file_name = custName + \"_logs.csv\",data = allLogsDF.to_csv(index=False),overwrite=True); # This saves in COS. Comment out if running locally"
834 | ]
835 | },
836 | {
837 | "cell_type": "markdown",
838 | "metadata": {},
839 | "source": [
840 | "### 3.1.2 Save KPIs to CSV"
841 | ]
842 | },
843 | {
844 | "cell_type": "code",
845 | "execution_count": null,
846 | "metadata": {},
847 | "outputs": [],
848 | "source": [
849 | "metricsDF = pd.DataFrame(metrics_dict)\n",
850 | "# metricsDF.to_csv(custName + \"_KeyMetrics.csv\",index=False) # This saves if running notebook locally. Comment out for Studio. \n",
851 | "project.save_data(file_name = custName + \"_KeyMetrics.csv\",data = metricsDF.to_csv(index=False),overwrite=True); # This saves in COS. Comment out if running locally\n",
852 | "print('Saving key metrics to {}'.format(custName+ \"_KeyMetrics.csv\"))\n",
853 | "metricsDF"
854 | ]
855 | },
856 | {
857 | "cell_type": "markdown",
858 | "metadata": {},
859 | "source": [
860 | "### 3.1.3 Save uncovered messages to CSV\n",
861 | "Improve Coverage by analyzing these uncovered messages. This might require adding training data to Intents or customizing STT models."
862 | ]
863 | },
864 | {
865 | "cell_type": "code",
866 | "execution_count": null,
867 | "metadata": {},
868 | "outputs": [],
869 | "source": [
870 | "# uncoveredDF.to_csv(custName + \"_uncovered_msgs.csv\",index=False, header=['Utterance','Response','Intent','Confidence']) # This saves if running notebook locally. Comment out for Studio.\n",
871 | "project.save_data(file_name = custName + \"_uncovered_msgs.csv\",data = uncoveredDF.to_csv(index=False),overwrite=True); # This saves in COS. Comment out if running locally\n",
872 | "\n",
873 | "print('\\nSaved', len(uncoveredDF), 'messages to:', custName + \"_uncovered_msgs.csv\") # Report only after the save has run"
874 | ]
875 | },
876 | {
877 | "cell_type": "markdown",
878 | "metadata": {},
879 | "source": [
880 | "## 3.2 Loading into DB2 on Cloud database \n",
881 | "The transformed log data will be inserted into a table within an instance of DB2 on Cloud. This requires the configuration in section 1.1."
882 | ]
883 | },
884 | {
885 | "cell_type": "code",
886 | "execution_count": null,
887 | "metadata": {},
888 | "outputs": [],
889 | "source": [
890 | "# Prepare to create Logs\n",
891 | "columns = 'BRANCH_EXITED_REASON,CONVERSATION_ID,DIALOG_TURN_COUNTER,ENTITIES,INPUT_TEXT,INTENT,INTENT_CONFIDENCE,NODES_VISITED,OUTPUT_TEXT,REQUEST_TIMESTAMP,RESPONSE_TIMESTAMP,LANGUAGE,PREV_NODES_VISITED,MONTH,DAY,YEAR,CALLER_ID,VGW_SESSION_ID,SMS_NUMBER,CALL_TRANSFER,USER_ID,COVERAGE,RESPONSE_TYPE,INPUT_BIGRAMS'\n",
892 | "insertSQL = 'Insert into WATSON.WA_FULL_LOGS(' + columns + ') values(?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)'\n",
893 | "stmt = ibm_db.prepare(conn, insertSQL)\n",
894 | "checkSQL = 'Select CONVERSATION_ID,RESPONSE_TIMESTAMP from WATSON.WA_FULL_LOGS where CONVERSATION_ID = ? and RESPONSE_TIMESTAMP = ?'\n",
895 | "checkStmt = ibm_db.prepare(conn, checkSQL)\n",
896 | "insertSQL"
897 | ]
898 | },
899 | {
900 | "cell_type": "code",
901 | "execution_count": null,
902 | "metadata": {
903 | "scrolled": true
904 | },
905 | "outputs": [],
906 | "source": [
907 | "LOG_EVENTS_COUNTER = 0\n",
908 | "# Insert the rows from the dataframe into the DB2 table (assumes allLogsDF has a default 0..N-1 index).\n",
909 | "for n in range(len(allLogsDF)):\n",
910 | "    # Check whether a record with this conversation ID and response timestamp already exists\n",
911 | "    ibm_db.bind_param(checkStmt,1,allLogsDF.at[n,'conversation_id'], ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_VARCHAR)\n",
912 | "    ibm_db.bind_param(checkStmt, 2,allLogsDF.at[n,'response_timestamp'], ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_TYPE_TIMESTAMP)\n",
913 | "    #select_stmt = ibm_db.exec_immediate(conn, select_sql)\n",
914 | "    select_result = ibm_db.execute(checkStmt)\n",
915 | "    row_exists = ibm_db.fetch_both(checkStmt)\n",
916 | "    if not row_exists:\n",
917 | "        # Row does not exist, so insert it into the all-logs table\n",
918 | "        LOG_EVENTS_COUNTER = LOG_EVENTS_COUNTER + 1\n",
919 | "        #print('Inserting conversation id: ' + allLogsDF.at[n,'conversation_id'])\n",
920 | "        \n",
921 | "        ibm_db.bind_param(stmt,1,allLogsDF.at[n,'branch_exited_reason'], ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_VARCHAR)\n",
922 | "        ibm_db.bind_param(stmt,2,allLogsDF.at[n,'conversation_id'], ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_VARCHAR)\n",
923 | "        ibm_db.bind_param(stmt,3,allLogsDF.at[n,'dialog_turn_counter'], ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_DECIMAL)\n",
924 | "        ibm_db.bind_param(stmt,4,str(allLogsDF.at[n,'entities'])[:1020], ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_VARCHAR) # Convert to str first, then truncate to 1020 characters\n",
925 | "        ibm_db.bind_param(stmt,5,str(allLogsDF.at[n,'input.text'])[:1020], ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_VARCHAR)\n",
926 | "        ibm_db.bind_param(stmt,6,str(allLogsDF.at[n,'intent'])[:1020], ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_VARCHAR)\n",
927 | "        ibm_db.bind_param(stmt,7,allLogsDF.at[n,'intent_confidence'], ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_DECIMAL)\n",
928 | "        ibm_db.bind_param(stmt, 8,str(allLogsDF.at[n,'nodes_visited'])[:1020], ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_VARCHAR)\n",
929 | "        ibm_db.bind_param(stmt, 9,str(allLogsDF.at[n,'output.text'])[:1020], ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_VARCHAR)\n",
930 | "        ibm_db.bind_param(stmt, 10,allLogsDF.at[n,'request_timestamp'], ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_TYPE_TIMESTAMP)\n",
931 | "        ibm_db.bind_param(stmt, 11,allLogsDF.at[n,'response_timestamp'], ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_TYPE_TIMESTAMP)\n",
932 | "\n",
933 | "        ibm_db.bind_param(stmt, 12,str(allLogsDF.at[n,'language'])[:1020], ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_VARCHAR)\n",
934 | "        ibm_db.bind_param(stmt, 13,str(allLogsDF.at[n,'prev_nodes_visited'])[:1020], ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_VARCHAR)\n",
935 | "        \n",
936 | "        ibm_db.bind_param(stmt, 14,str(allLogsDF.at[n,'month']), ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_INTEGER)\n",
937 | "        ibm_db.bind_param(stmt, 15,str(allLogsDF.at[n,'day']), ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_INTEGER)\n",
938 | "        ibm_db.bind_param(stmt, 16,str(allLogsDF.at[n,'year']), ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_INTEGER)\n",
939 | "\n",
940 | "        ibm_db.bind_param(stmt, 17,str(allLogsDF.at[n,'vgwSIPFromURI']), ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_VARCHAR)\n",
941 | "        ibm_db.bind_param(stmt, 18,str(allLogsDF.at[n,'vgwSessionID']), ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_VARCHAR)\n",
942 | "        ibm_db.bind_param(stmt, 19,str(allLogsDF.at[n,'vgwSMSUserPhoneNumber']), ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_VARCHAR)\n",
943 | "        ibm_db.bind_param(stmt, 20,str(allLogsDF.at[n,'output.vgwAction.parameters.transferTarget']), ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_VARCHAR)\n",
944 | "        ibm_db.bind_param(stmt, 21,str(allLogsDF.at[n,'metadata.user_id']), ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_VARCHAR)\n",
945 | "        ibm_db.bind_param(stmt, 22,str(allLogsDF.at[n,'coverage']), ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_VARCHAR)\n",
946 | "        ibm_db.bind_param(stmt, 23,str(allLogsDF.at[n,'response_type']), ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_VARCHAR)\n",
947 | "        ibm_db.bind_param(stmt, 24,str(allLogsDF.at[n,'input_bigrams']), ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_VARCHAR)\n",
948 | "        \n",
949 | "        ibm_db.execute(stmt)\n",
950 | "#ibm_db.commit(conn)\n",
951 | "#ibm_db.close(conn)\n",
952 | "print('Total log events saved to database:', LOG_EVENTS_COUNTER)"
953 | ]
954 | },
955 | {
956 | "cell_type": "markdown",
957 | "metadata": {},
958 | "source": [
959 | "### Update the current run details (WA_LAST_RUN_LOG)"
960 | ]
961 | },
962 | {
963 | "cell_type": "code",
964 | "execution_count": null,
965 | "metadata": {},
966 | "outputs": [],
967 | "source": [
968 | "# Store Current run details\n",
969 | "del_tracking = 'Delete from WATSON.WA_LAST_RUN_LOG'\n",
970 | "insert_tracking = 'Insert into WATSON.WA_LAST_RUN_LOG (conversation_id, request_timestamp, response_timestamp, lastrun_timestamp) Values (?,?,?,?) '\n",
971 | "trans_stmt = ibm_db.prepare(conn, insert_tracking)\n",
972 | "insert_tracking"
973 | ]
974 | },
975 | {
976 | "cell_type": "code",
977 | "execution_count": null,
978 | "metadata": {},
979 | "outputs": [],
980 | "source": [
981 | "#Delete previous entry\n",
982 | "ibm_db.exec_immediate(conn,del_tracking)\n",
983 | "\n",
984 | "# Get the latest log entry from the dataframe. First let's sort it so tail(1) is the last entry.\n",
985 | "allLogsDF.sort_values(by=['response_timestamp'], axis=0, \n",
986 | " ascending=True, inplace=True)\n",
987 | "allLogsDF = allLogsDF.reset_index(drop=True)\n",
988 | "last_row = allLogsDF.tail(1).reset_index(drop=True)\n",
989 | "#store the latest row details.\n",
990 | "ibm_db.bind_param(trans_stmt,1,last_row.at[0,'conversation_id'], ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_VARCHAR)\n",
991 | "ibm_db.bind_param(trans_stmt,2,last_row.at[0,'request_timestamp'], ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_TYPE_TIMESTAMP)\n",
992 | "ibm_db.bind_param(trans_stmt,3,last_row.at[0,'response_timestamp'], ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_TYPE_TIMESTAMP)\n",
993 | "ibm_db.bind_param(trans_stmt,4,pd.Timestamp.now(),ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_TYPE_TIMESTAMP)\n",
994 | "ibm_db.execute(trans_stmt)\n",
995 | "print(pd.Timestamp.now())\n",
996 | "#Commit and close the connection\n",
997 | "ibm_db.commit(conn)\n",
998 | "ibm_db.close(conn)"
999 | ]
1000 | },
1001 | {
1002 | "cell_type": "markdown",
1003 | "metadata": {},
1004 | "source": [
1005 | "### End of Notebook v2.1 (last modified on 7-2-20)"
1006 | ]
1007 | },
1008 | {
1009 | "cell_type": "code",
1010 | "execution_count": null,
1011 | "metadata": {},
1012 | "outputs": [],
1013 | "source": []
1014 | }
1015 | ],
1016 | "metadata": {
1017 | "kernelspec": {
1018 | "display_name": "Python 3",
1019 | "language": "python",
1020 | "name": "python3"
1021 | },
1022 | "language_info": {
1023 | "codemirror_mode": {
1024 | "name": "ipython",
1025 | "version": 3
1026 | },
1027 | "file_extension": ".py",
1028 | "mimetype": "text/x-python",
1029 | "name": "python",
1030 | "nbconvert_exporter": "python",
1031 | "pygments_lexer": "ipython3",
1032 | "version": "3.7.4"
1033 | }
1034 | },
1035 | "nbformat": 4,
1036 | "nbformat_minor": 2
1037 | }
1038 |
--------------------------------------------------------------------------------
/Watson Dashboard Template.json:
--------------------------------------------------------------------------------
1 | {
2 | "name": "New dashboard",
3 | "layout": {
4 | "id": "page0",
5 | "items": [
6 | {
7 | "id": "page1",
8 | "items": [
9 | {
10 | "id": "page2",
11 | "css": "templateBox aspectRatio_default",
12 | "items": [
13 | {
14 | "id": "model0000017259181f3b_00000000",
15 | "style": {
16 | "top": "0%",
17 | "left": "0%",
18 | "right": "75%",
19 | "bottom": "75%"
20 | },
21 | "type": "templateDropZone",
22 | "templateName": "dz1",
23 | "relatedLayouts": "|model0000017258da3fbc_00000002|"
24 | },
25 | {
26 | "id": "model0000017259181f3b_00000001",
27 | "css": "noBorderLeft",
28 | "style": {
29 | "top": "0%",
30 | "left": "25%",
31 | "right": "50%",
32 | "bottom": "75%"
33 | },
34 | "type": "templateDropZone",
35 | "templateName": "dz3",
36 | "relatedLayouts": "|model0000017259407010_00000003|"
37 | },
38 | {
39 | "id": "model0000017259181f3b_00000002",
40 | "css": "noBorderLeft",
41 | "style": {
42 | "top": "0%",
43 | "left": "50%",
44 | "right": "25%",
45 | "bottom": "75%"
46 | },
47 | "type": "templateDropZone",
48 | "templateName": "dz4",
49 | "relatedLayouts": "|model0000017259293c90_00000001|"
50 | },
51 | {
52 | "id": "model0000017259181f3c_00000000",
53 | "css": "noBorderLeft",
54 | "style": {
55 | "top": "0%",
56 | "left": "75%",
57 | "right": "00%",
58 | "bottom": "75%"
59 | },
60 | "type": "templateDropZone",
61 | "templateName": "dz5",
62 | "relatedLayouts": "|model0000017259188226_00000002|"
63 | },
64 | {
65 | "id": "model0000017259181f3c_00000001",
66 | "css": "noBorderTop",
67 | "style": {
68 | "top": "25%",
69 | "left": "0%",
70 | "right": "0%",
71 | "bottom": "0%"
72 | },
73 | "type": "templateDropZone",
74 | "templateName": "dz2",
75 | "relatedLayouts": ""
76 | },
77 | {
78 | "id": "model0000017259232989_00000002",
79 | "style": {
80 | "top": "62.34132581100141%",
81 | "left": "37.286821705426355%",
82 | "height": "37.517630465444284%",
83 | "width": "62.63565891472868%"
84 | },
85 | "type": "widget",
86 | "relatedLayouts": ""
87 | },
88 | {
89 | "id": "model0000017259308c46_00000000",
90 | "style": {
91 | "left": "0.0992063492063492%",
92 | "top": "25.09025270758123%",
93 | "height": "74.72924187725631%",
94 | "width": "37.301587301587304%"
95 | },
96 | "type": "widget",
97 | "relatedLayouts": ""
98 | },
99 | {
100 | "id": "model0000017258dbae82_00000003",
101 | "style": {
102 | "top": "25.105782792665725%",
103 | "left": "37.51937984496124%",
104 | "height": "37.23554301833568%",
105 | "width": "62.325581395348834%"
106 | },
107 | "type": "widget",
108 | "relatedLayouts": ""
109 | },
110 | {
111 | "id": "model0000017259188226_00000002",
112 | "style": {
113 | "top": "0.14104372355430184%",
114 | "left": "74.96124031007751%",
115 | "height": "24.682651622002822%",
116 | "width": "24.88372093023256%"
117 | },
118 | "type": "widget",
119 | "relatedLayouts": "model0000017259181f3c_00000000"
120 | },
121 | {
122 | "id": "model0000017259293c90_00000001",
123 | "style": {
124 | "top": "0.18050541516245489%",
125 | "left": "50%",
126 | "height": "24.729241877256317%",
127 | "width": "24.900793650793652%"
128 | },
129 | "type": "widget",
130 | "relatedLayouts": "model0000017259181f3b_00000002"
131 | },
132 | {
133 | "id": "model0000017259407010_00000003",
134 | "style": {
135 | "top": "0.18050541516245489%",
136 | "left": "25%",
137 | "height": "24.729241877256317%",
138 | "width": "24.900793650793652%"
139 | },
140 | "type": "widget",
141 | "relatedLayouts": "model0000017259181f3b_00000001",
142 | "clones": 1
143 | },
144 | {
145 | "id": "model0000017258da3fbc_00000002",
146 | "style": {
147 | "top": "0.18050541516245489%",
148 | "left": "0.0992063492063492%",
149 | "height": "24.729241877256317%",
150 | "width": "24.8015873015873%"
151 | },
152 | "type": "widget",
153 | "relatedLayouts": "model0000017259181f3b_00000000"
154 | },
155 | {
156 | "id": "model00000172762bef64_00000000",
157 | "from": "model000001727625d6c4_00000000",
158 | "style": {
159 | "width": "74.90079365079364%",
160 | "height": "6.67870036101083%",
161 | "left": "0%",
162 | "top": "99.81949458483754%"
163 | },
164 | "type": "widget",
165 | "relatedLayouts": ""
166 | }
167 | ],
168 | "type": "scalingAbsolute"
169 | }
170 | ],
171 | "type": "container",
172 | "title": {
173 | "translationTable": {
174 | "Default": "[Client] Chatbot"
175 | }
176 | },
177 | "templateName": "Template4",
178 | "clones": 1
179 | }
180 | ],
181 | "style": {
182 | "height": "100%"
183 | },
184 | "type": "tab"
185 | },
186 | "theme": "darkTheme",
187 | "version": 1009,
188 | "eventGroups": [
189 | {
190 | "id": "page1:1",
191 | "widgetIds": [
192 | "model0000017258da3fbc_00000002",
193 | "model0000017258dbae82_00000003",
194 | "model0000017259188226_00000002",
195 | "model0000017259293c90_00000001",
196 | "model0000017259308c46_00000000",
197 | "model0000017259407010_00000003",
198 | "model0000017259232989_00000002"
199 | ]
200 | }
201 | ],
202 | "properties": {
203 | "defaultLocale": "Default"
204 | },
205 | "pageContext": [],
206 | "dataSources": {
207 | "version": "1.0",
208 | "sources": [
209 | {
210 | "id": "model0000017258d9dfa0_00000002",
211 | "assetId": "assetId0000017258d9dfa0_00000000",
212 | "clientId": "",
213 | "module": {
214 | "xsd": "https://ibm.com/daas/module/1.0/module.xsd",
215 |
216 | "label": "WA_FULL_LOGS",
217 | "identifier": "WA_FULL_LOGS",
218 | "table": {
219 | "column": [
220 | {
221 | "datatype": "varchar(128)",
222 | "nullable": true,
223 | "name": "CONVERSATION_ID",
224 | "label": "CONVERSATION_ID",
225 | "usage": "attribute",
226 | "regularAggregate": "countDistinct",
227 | "taxonomyFamily": "cNone"
228 | },
229 | {
230 | "datatype": "integer",
231 | "nullable": true,
232 | "name": "MONTH",
233 | "label": "MONTH",
234 | "usage": "attribute",
235 | "regularAggregate": "countDistinct",
236 | "taxonomyFamily": "cMonth"
237 | },
238 | {
239 | "datatype": "integer",
240 | "nullable": true,
241 | "name": "DAY",
242 | "label": "DAY",
243 | "usage": "attribute",
244 | "regularAggregate": "countDistinct",
245 | "taxonomyFamily": "cDay"
246 | },
247 | {
248 | "datatype": "integer",
249 | "nullable": true,
250 | "name": "YEAR",
251 | "label": "YEAR",
252 | "usage": "attribute",
253 | "regularAggregate": "countDistinct",
254 | "taxonomyFamily": "cYear"
255 | },
256 | {
257 | "datatype": "varchar(3072)",
258 | "nullable": true,
259 | "name": "INPUT_TEXT",
260 | "label": "INPUT_TEXT",
261 | "usage": "attribute",
262 | "regularAggregate": "countDistinct",
263 | "taxonomyFamily": "cNone"
264 | },
265 | {
266 | "datatype": "varchar(3072)",
267 | "nullable": true,
268 | "name": "OUTPUT_TEXT",
269 | "label": "OUTPUT_TEXT",
270 | "usage": "attribute",
271 | "regularAggregate": "countDistinct",
272 | "taxonomyFamily": "cNone"
273 | },
274 | {
275 | "datatype": "varchar(1024)",
276 | "nullable": true,
277 | "name": "INTENT",
278 | "label": "INTENT",
279 | "usage": "attribute",
280 | "regularAggregate": "countDistinct",
281 | "taxonomyFamily": "cNone"
282 | },
283 | {
284 | "datatype": "decimal(5)",
285 | "nullable": true,
286 | "name": "INTENT_CONFIDENCE",
287 | "label": "INTENT_CONFIDENCE",
288 | "usage": "attribute",
289 | "regularAggregate": "countDistinct",
290 | "taxonomyFamily": "cNone"
291 | },
292 | {
293 | "datatype": "varchar(3072)",
294 | "nullable": true,
295 | "name": "ENTITIES",
296 | "label": "ENTITIES",
297 | "usage": "attribute",
298 | "regularAggregate": "countDistinct",
299 | "taxonomyFamily": "cNone"
300 | },
301 | {
302 | "datatype": "varchar(128)",
303 | "nullable": true,
304 | "name": "COVERAGE",
305 | "label": "COVERAGE",
306 | "usage": "attribute",
307 | "regularAggregate": "countDistinct",
308 | "taxonomyFamily": "cNone"
309 | },
310 | {
311 | "datatype": "decimal(5)",
312 | "nullable": true,
313 | "name": "DIALOG_TURN_COUNTER",
314 | "label": "DIALOG_TURN_COUNTER",
315 | "usage": "fact",
316 | "regularAggregate": "total",
317 | "taxonomyFamily": "cNone"
318 | },
319 | {
320 | "datatype": "varchar(128)",
321 | "nullable": true,
322 | "name": "USER_ID",
323 | "label": "USER_ID",
324 | "usage": "attribute",
325 | "regularAggregate": "countDistinct",
326 | "taxonomyFamily": "cNone"
327 | },
328 | {
329 | "datatype": "varchar(256)",
330 | "nullable": true,
331 | "name": "BRANCH_EXITED_REASON",
332 | "label": "BRANCH_EXITED_REASON",
333 | "usage": "attribute",
334 | "regularAggregate": "countDistinct",
335 | "taxonomyFamily": "cNone"
336 | },
337 | {
338 | "datatype": "varchar(1024)",
339 | "nullable": true,
340 | "name": "NODES_VISITED",
341 | "label": "NODES_VISITED",
342 | "usage": "attribute",
343 | "regularAggregate": "countDistinct",
344 | "taxonomyFamily": "cNone"
345 | },
346 | {
347 | "datatype": "varchar(1024)",
348 | "nullable": true,
349 | "name": "PREV_NODES_VISITED",
350 | "label": "PREV_NODES_VISITED",
351 | "usage": "attribute",
352 | "regularAggregate": "countDistinct",
353 | "taxonomyFamily": "cNone"
354 | },
355 | {
356 | "datatype": "timestamp",
357 | "nullable": true,
358 | "name": "REQUEST_TIMESTAMP",
359 | "label": "REQUEST_TIMESTAMP",
360 | "usage": "attribute",
361 | "regularAggregate": "countDistinct",
362 | "taxonomyFamily": "cNone"
363 | },
364 | {
365 | "datatype": "timestamp",
366 | "nullable": true,
367 | "name": "RESPONSE_TIMESTAMP",
368 | "label": "RESPONSE_TIMESTAMP",
369 | "usage": "attribute",
370 | "regularAggregate": "countDistinct",
371 | "taxonomyFamily": "cNone"
372 | },
373 | {
374 | "datatype": "varchar(1024)",
375 | "nullable": true,
376 | "name": "LANGUAGE",
377 | "label": "LANGUAGE",
378 | "usage": "attribute",
379 | "regularAggregate": "countDistinct",
380 | "taxonomyFamily": "cNone"
381 | },
382 | {
383 | "datatype": "varchar(128)",
384 | "nullable": true,
385 | "name": "CALLER_ID",
386 | "label": "CALLER_ID",
387 | "usage": "attribute",
388 | "regularAggregate": "countDistinct",
389 | "taxonomyFamily": "cNone"
390 | },
391 | {
392 | "datatype": "varchar(128)",
393 | "nullable": true,
394 | "name": "VGW_SESSION_ID",
395 | "label": "VGW_SESSION_ID",
396 | "usage": "attribute",
397 | "regularAggregate": "countDistinct",
398 | "taxonomyFamily": "cNone"
399 | },
400 | {
401 | "datatype": "varchar(128)",
402 | "nullable": true,
403 | "name": "SMS_NUMBER",
404 | "label": "SMS_NUMBER",
405 | "usage": "attribute",
406 | "regularAggregate": "countDistinct",
407 | "taxonomyFamily": "cNone"
408 | },
409 | {
410 | "datatype": "varchar(128)",
411 | "nullable": true,
412 | "name": "CALL_TRANSFER",
413 | "label": "CALL_TRANSFER",
414 | "usage": "attribute",
415 | "regularAggregate": "countDistinct",
416 | "taxonomyFamily": "cNone"
417 | }
418 | ],
419 | "name": "WA_FULL_LOGS",
420 | "label": "WA_FULL_LOGS",
421 | "description": ""
422 | }
423 | },
424 | "name": "WA_FULL_LOGS",
425 | "shaping": {
426 | "shapingId": "shaping00000172760d4893_00000000",
427 | "embeddedModuleUpToDate": false,
428 | "moserJSON": {
429 | "version": "5.0.1",
430 | "container": "C",
431 | "useSpec": [
432 | {
433 | "identifier": "ES",
434 | "type": "url",
435 | "storeID": "baseModule",
436 | "imports": "*"
437 | }
438 | ],
439 | "expressionLocale": "en-us",
440 | "calculation": [
441 | {
442 | "expression": "(total (If (WA_FULL_LOGS.COVERAGE = 'covered') then( 1)\nelse (0))\n/\ntotal (If (WA_FULL_LOGS.COVERAGE = 'covered' or WA_FULL_LOGS.COVERAGE = 'uncovered') then (1)\nelse (0))) ",
443 | "usage": "fact",
444 | "datatype": "DOUBLE",
445 | "nullable": true,
446 | "regularAggregate": "calculated",
447 | "datatypeCategory": "number",
448 | "highlevelDatatype": "decimal",
449 | "facetDefinition": {
450 | "enabled": "automatic"
451 | },
452 | "identifier": "COVERAGE_1",
453 | "label": "COVERAGE Calc",
454 | "property": [
455 | {
456 | "name": "ui_expr",
457 | "value": "{\"func\":\"customCalculation\",\"version\":\"5.0.1\"}"
458 | }
459 | ],
460 | "propertyOverride": [
461 | "NEW"
462 | ]
463 | },
464 | {
465 | "expression": "(COVERAGE_1)",
466 | "usage": "fact",
467 | "format": "{\"formatGroup\":{\"percentFormat\":{}}}",
468 | "datatype": "DOUBLE",
469 | "nullable": true,
470 | "regularAggregate": "calculated",
471 | "datatypeCategory": "number",
472 | "highlevelDatatype": "decimal",
473 | "facetDefinition": {
474 | "enabled": "automatic"
475 | },
476 | "identifier": "COVERAGE_100",
477 | "label": "COVERAGE Calc %",
478 | "property": [
479 | {
480 | "name": "ui_expr",
481 | "value": "{\"func\":\"customCalculation\",\"version\":\"5.0.1\"}"
482 | }
483 | ],
484 | "propertyOverride": [
485 | "NEW"
486 | ]
487 | }
488 | ],
489 | "metadataTreeView": [
490 | {
491 | "folderItem": [
492 | {
493 | "ref": "COVERAGE_1"
494 | },
495 | {
496 | "ref": "COVERAGE_100"
497 | }
498 | ]
499 | }
500 | ],
501 | "dataRetrievalMode": "liveConnection",
502 | "identifier": "newModel",
503 | "label": "newModel"
504 | }
505 | }
506 | }
507 | ]
508 | },
509 | "drillThrough": [],
510 | "widgets": {
511 | "model0000017258da3fbc_00000002": {
512 | "id": "model0000017258da3fbc_00000002",
513 | "data": {
514 | "dataViews": [
515 | {
516 | "modelRef": "model0000017258d9dfa0_00000002",
517 | "dataItems": [
518 | {
519 | "id": "model0000017258da3fbb_00000000",
520 | "itemId": "WA_FULL_LOGS.CONVERSATION_ID",
521 | "itemLabel": "CONVERSATION_ID"
522 | }
523 | ],
524 | "id": "model0000017258da3fbc_00000000"
525 | }
526 | ]
527 | },
528 | "visTypeLocked": true,
529 | "slotmapping": {
530 | "slots": [
531 | {
532 | "name": "values",
533 | "dataItems": [
534 | "model0000017258da3fbb_00000000"
535 | ],
536 | "dataItemSettings": [],
537 | "caption": "Value",
538 | "id": "values",
539 | "layerId": "data"
540 | }
541 | ]
542 | },
543 | "type": "live",
544 | "name": {
545 | "translationTable": {
546 | "Default": "Total Conversations"
547 | }
548 | },
549 | "visId": "summary",
550 | "showTitle": true,
551 | "properties": [
552 | {
553 | "id": "colorPalette",
554 | "value": "colorPalette2"
555 | },
556 | {
557 | "id": "valueColor",
558 | "value": 7
559 | },
560 | {
561 | "id": "showItemLabel",
562 | "value": false
563 | }
564 | ]
565 | },
566 | "model0000017258dbae82_00000003": {
567 | "id": "model0000017258dbae82_00000003",
568 | "data": {
569 | "dataViews": [
570 | {
571 | "modelRef": "model0000017258d9dfa0_00000002",
572 | "dataItems": [
573 | {
574 | "id": "model0000017258dbae82_00000000",
575 | "itemId": "WA_FULL_LOGS.CONVERSATION_ID",
576 | "itemLabel": "CONVERSATION_ID"
577 | },
578 | {
579 | "id": "model0000017258dbe23a_00000001",
580 | "itemId": "WA_FULL_LOGS.MONTH_",
581 | "itemLabel": "MONTH"
582 | },
583 | {
584 | "id": "model0000017258dbe9cc_00000000",
585 | "itemId": "WA_FULL_LOGS.DAY_",
586 | "itemLabel": "DAY"
587 | }
588 | ],
589 | "id": "model0000017258dbae82_00000001"
590 | }
591 | ]
592 | },
593 | "visTypeLocked": true,
594 | "slotmapping": {
595 | "slots": [
596 | {
597 | "name": "values",
598 | "dataItems": [
599 | "model0000017258dbae82_00000000"
600 | ],
601 | "caption": "Length",
602 | "id": "values"
603 | },
604 | {
605 | "name": "categories",
606 | "dataItems": [
607 | "model0000017258dbe23a_00000001",
608 | "model0000017258dbe9cc_00000000"
609 | ],
610 | "dataItemSettings": [],
611 | "caption": "Bars",
612 | "id": "categories"
613 | }
614 | ]
615 | },
616 | "type": "live",
617 | "name": {
618 | "translationTable": {
619 | "Default": "Daily Conversations"
620 | }
621 | },
622 | "visId": "com.ibm.vis.rave2bundlecolumn",
623 | "showTitle": true,
624 | "properties": [
625 | {
626 | "id": "itemAxis.labels.visible",
627 | "value": true
628 | },
629 | {
630 | "id": "gridLines.visible",
631 | "value": false
632 | },
633 | {
634 | "id": "titles.visible",
635 | "value": false
636 | },
637 | {
638 | "id": "itemAxis.labels.layoutMode",
639 | "value": "rotate45"
640 | },
641 | {
642 | "id": "valueLabels.visible",
643 | "value": true
644 | },
645 | {
646 | "id": "defaultPaletteIndex",
647 | "value": 15
648 | },
649 | {
650 | "id": "colorPalette",
651 | "value": "colorPalette7"
652 | }
653 | ]
654 | },
655 | "model0000017259188226_00000002": {
656 | "id": "model0000017259188226_00000002",
657 | "data": {
658 | "dataViews": [
659 | {
660 | "modelRef": "model0000017258d9dfa0_00000002",
661 | "dataItems": [
662 | {
663 | "id": "model0000017259188225_00000000",
664 | "itemId": "WA_FULL_LOGS.LANGUAGE_",
665 | "itemLabel": "LANGUAGE"
666 | },
667 | {
668 | "id": "model000001725918c14a_00000001",
669 | "itemId": "WA_FULL_LOGS.LANGUAGE_",
670 | "itemLabel": "LANGUAGE"
671 | }
672 | ],
673 | "id": "model0000017259188226_00000000"
674 | }
675 | ]
676 | },
677 | "visTypeLocked": true,
678 | "slotmapping": {
679 | "slots": [
680 | {
681 | "name": "categories",
682 | "dataItems": [
683 | "model0000017259188225_00000000"
684 | ],
685 | "dataItemSettings": [],
686 | "caption": "Segments",
687 | "id": "categories",
688 | "layerId": "data"
689 | },
690 | {
691 | "name": "values",
692 | "dataItems": [
693 | "model000001725918c14a_00000001"
694 | ],
695 | "caption": "Size",
696 | "id": "values"
697 | }
698 | ]
699 | },
700 | "type": "live",
701 | "name": {
702 | "translationTable": {
703 | "Default": "Language"
704 | }
705 | },
706 | "localFilters": [
707 | {
708 | "id": "WA_FULL_LOGS.LANGUAGE_",
709 | "columnId": "WA_FULL_LOGS.LANGUAGE_",
710 | "values": [
711 | {
712 | "u": "WA_FULL_LOGS.LANGUAGE_->[]",
713 | "d": ""
714 | }
715 | ],
716 | "excludedValues": [],
717 | "operator": "notin",
718 | "type": null
719 | }
720 | ],
721 | "visId": "com.ibm.vis.rave2bundlepie",
722 | "showTitle": true,
723 | "properties": [
724 | {
725 | "id": "donutRadius",
726 | "value": 0.7000000000000001
727 | },
728 | {
729 | "id": "label.percentage",
730 | "value": true
731 | },
732 | {
733 | "id": "contrast.label.color",
734 | "value": true
735 | },
736 | {
737 | "id": "widget.legend.position",
738 | "value": "right"
739 | }
740 | ]
741 | },
742 | "model0000017259232989_00000002": {
743 | "id": "model0000017259232989_00000002",
744 | "data": {
745 | "dataViews": [
746 | {
747 | "modelRef": "model0000017258d9dfa0_00000002",
748 | "dataItems": [
749 | {
750 | "id": "model0000017259232988_00000000",
751 | "itemId": "WA_FULL_LOGS.INPUT_TEXT",
752 | "itemLabel": "INPUT_TEXT"
753 | },
754 | {
755 | "id": "model000001725923c9bc_00000000",
756 | "itemId": "WA_FULL_LOGS.INPUT_TEXT",
757 | "itemLabel": "INPUT_TEXT"
758 | }
759 | ],
760 | "id": "model0000017259232989_00000000"
761 | }
762 | ]
763 | },
764 | "visTypeLocked": true,
765 | "slotmapping": {
766 | "slots": [
767 | {
768 | "name": "categories",
769 | "dataItems": [
770 | "model0000017259232988_00000000"
771 | ],
772 | "dataItemSettings": [],
773 | "caption": "Words",
774 | "id": "categories",
775 | "layerId": "data"
776 | },
777 | {
778 | "name": "size",
779 | "dataItems": [
780 | "model000001725923c9bc_00000000"
781 | ],
782 | "caption": "Size",
783 | "id": "size"
784 | }
785 | ]
786 | },
787 | "type": "live",
788 | "name": {
789 | "translationTable": {
790 | "Default": "Frequent Unhandled Phrases"
791 | }
792 | },
793 | "localFilters": [
794 | {
795 | "id": "WA_FULL_LOGS.COVERAGE",
796 | "columnId": "WA_FULL_LOGS.COVERAGE",
797 | "values": [
798 | {
799 | "d": "uncovered",
800 | "u": "WA_FULL_LOGS.COVERAGE->[uncovered]"
801 | }
802 | ],
803 | "excludedValues": [],
804 | "operator": "in",
805 | "type": null
806 | }
807 | ],
808 | "visId": "com.ibm.vis.rave2bundlewordcloud",
809 | "showTitle": true,
810 | "properties": [
811 | {
812 | "id": "widget.legend.display",
813 | "value": false
814 | }
815 | ]
816 | },
817 | "model0000017259293c90_00000001": {
818 | "id": "model0000017259293c90_00000001",
819 | "data": {
820 | "dataViews": [
821 | {
822 | "modelRef": "model0000017258d9dfa0_00000002",
823 | "dataItems": [
824 | {
825 | "id": "model0000017259293c8f_00000000",
826 | "itemId": "WA_FULL_LOGS.USER_ID",
827 | "itemLabel": "USER_ID",
828 | "aggregate": "countdistinct"
829 | }
830 | ],
831 | "id": "model0000017259293c8f_00000001"
832 | }
833 | ]
834 | },
835 | "visTypeLocked": true,
836 | "slotmapping": {
837 | "slots": [
838 | {
839 | "name": "values",
840 | "dataItems": [
841 | "model0000017259293c8f_00000000"
842 | ],
843 | "dataItemSettings": [],
844 | "caption": "Value",
845 | "id": "values",
846 | "layerId": "data"
847 | }
848 | ]
849 | },
850 | "type": "live",
851 | "name": {
852 | "translationTable": {
853 | "Default": "Active Users"
854 | }
855 | },
856 | "localFilters": [],
857 | "visId": "summary",
858 | "showTitle": true,
859 | "properties": [
860 | {
861 | "id": "colorPalette",
862 | "value": "colorPalette2"
863 | },
864 | {
865 | "id": "valueColor",
866 | "value": 7
867 | },
868 | {
869 | "id": "showItemLabel",
870 | "value": false
871 | }
872 | ]
873 | },
874 | "model0000017259308c46_00000000": {
875 | "id": "model0000017259308c46_00000000",
876 | "data": {
877 | "dataViews": [
878 | {
879 | "modelRef": "model0000017258d9dfa0_00000002",
880 | "dataItems": [
881 | {
882 | "id": "model0000017259308c45_00000000",
883 | "itemId": "WA_FULL_LOGS.INTENT",
884 | "itemLabel": "INTENT"
885 | },
886 | {
887 | "id": "model0000017259310a6b_00000001",
888 | "itemId": "WA_FULL_LOGS.INTENT",
889 | "itemLabel": "INTENT",
890 | "selection": [
891 | {
892 | "operation": "order",
893 | "sort": {
894 | "type": "desc",
895 | "priority": 0,
896 | "by": "caption"
897 | }
898 | },
899 | {
900 | "operation": "keep",
901 | "topBottom": {
902 | "type": "topcount",
903 | "value": 20
904 | }
905 | }
906 | ]
907 | }
908 | ],
909 | "id": "model0000017259308c45_00000001"
910 | }
911 | ]
912 | },
913 | "visTypeLocked": true,
914 | "slotmapping": {
915 | "slots": [
916 | {
917 | "name": "categories",
918 | "dataItems": [
919 | "model0000017259308c45_00000000"
920 | ],
921 | "dataItemSettings": [],
922 | "caption": "Bars",
923 | "id": "categories",
924 | "layerId": "data"
925 | },
926 | {
927 | "name": "values",
928 | "dataItems": [
929 | "model0000017259310a6b_00000001"
930 | ],
931 | "caption": "Length",
932 | "id": "values",
933 | "layerId": "data"
934 | }
935 | ]
936 | },
937 | "type": "live",
938 | "name": {
939 | "translationTable": {
940 | "Default": "Frequent Topics"
941 | }
942 | },
943 | "localFilters": [
944 | {
945 | "id": "WA_FULL_LOGS.INTENT",
946 | "columnId": "WA_FULL_LOGS.INTENT",
947 | "values": [
948 | {
949 | "u": "WA_FULL_LOGS.INTENT->[]",
950 | "d": ""
951 | }
952 | ],
953 | "excludedValues": [],
954 | "operator": "notin",
955 | "type": null
956 | }
957 | ],
958 | "visId": "com.ibm.vis.rave2bundlebar",
959 | "showTitle": true,
960 | "properties": [
961 | {
962 | "id": "itemAxis.labels.visible",
963 | "value": true
964 | },
965 | {
966 | "id": "valueAxis.labels.visible",
967 | "value": true
968 | },
969 | {
970 | "id": "gridLines.visible",
971 | "value": false
972 | },
973 | {
974 | "id": "titles.visible",
975 | "value": false
976 | },
977 | {
978 | "id": "valueLabels.visible",
979 | "value": true
980 | },
981 | {
982 | "id": "valueLabels.format",
983 | "value": "PercentOfColor"
984 | }
985 | ]
986 | },
987 | "model0000017259407010_00000003": {
988 | "id": "model0000017259407010_00000003",
989 | "data": {
990 | "dataViews": [
991 | {
992 | "modelRef": "model0000017258d9dfa0_00000002",
993 | "dataItems": [
994 | {
995 | "id": "model00000172760ce800_00000000",
996 | "itemId": "COVERAGE_100",
997 | "itemLabel": "COVERAGE + % 100"
998 | }
999 | ],
1000 | "id": "model0000017259407010_00000001"
1001 | }
1002 | ]
1003 | },
1004 | "visTypeLocked": true,
1005 | "slotmapping": {
1006 | "slots": [
1007 | {
1008 | "name": "values",
1009 | "dataItems": [
1010 | "model00000172760ce800_00000000"
1011 | ],
1012 | "dataItemSettings": [],
1013 | "caption": "Value",
1014 | "id": "values"
1015 | }
1016 | ]
1017 | },
1018 | "type": "live",
1019 | "name": {
1020 | "translationTable": {
1021 | "Default": "Coverage"
1022 | }
1023 | },
1024 | "localFilters": [],
1025 | "visId": "summary",
1026 | "showTitle": true,
1027 | "properties": [
1028 | {
1029 | "id": "colorPalette",
1030 | "value": "colorPalette2"
1031 | },
1032 | {
1033 | "id": "valueColor",
1034 | "value": 7
1035 | },
1036 | {
1037 | "id": "showItemLabel",
1038 | "value": false
1039 | }
1040 | ]
1041 | },
1042 | "model00000172762bef64_00000000": {
1043 | "id": "model00000172762bef64_00000000",
1044 | "type": "text",
1045 | "content": {
1046 | "translationTable": {
1047 | "Default": "Updated hourly with new user data"
1048 | }
1049 | },
1050 | "isResponsive": false,
1051 | "visTypeLocked": true,
1052 | "name": ""
1053 | }
1054 | }
1055 | }
1056 |
--------------------------------------------------------------------------------
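The dashboard template above binds its widgets to columns of WA_FULL_LOGS: "Active Users" applies a `countdistinct` aggregate to USER_ID, and "Frequent Topics" filters out blank INTENT values and keeps a top count of 20. A minimal Python sketch of comparable aggregations follows. The sample rows and the function names `active_users` and `frequent_topics` are hypothetical, skipping empty user IDs is an assumption, and ranking intents by frequency is an assumption about what the bar lengths represent; Cognos may order or break ties differently.

```python
from collections import Counter


def active_users(rows):
    """Count distinct non-empty user_id values (comparable to countdistinct)."""
    return len({r["user_id"] for r in rows if r.get("user_id")})


def frequent_topics(rows, top_n=20):
    """Return the top_n most frequent non-blank intents as (intent, count) pairs."""
    counts = Counter(r["intent"] for r in rows if r.get("intent"))
    return counts.most_common(top_n)


# Hypothetical sample rows standing in for WA_FULL_LOGS records
rows = [
    {"user_id": "u1", "intent": "hours"},
    {"user_id": "u1", "intent": "hours"},
    {"user_id": "u2", "intent": "location"},
    {"user_id": "u3", "intent": ""},  # blank intent is excluded from topics
]

print(active_users(rows))
print(frequent_topics(rows, top_n=2))
```

Keeping the blank-intent filter inside `frequent_topics` mirrors the widget's local `notin` filter, so unclassified messages do not crowd out real topics.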
/cognos-covid-dash.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/preethm/watson-assistant-metrics-notebook/0c10bd46c55a225ba412a4ec0655f6f5e146196c/cognos-covid-dash.png
--------------------------------------------------------------------------------
/cognos-covid-dash2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/preethm/watson-assistant-metrics-notebook/0c10bd46c55a225ba412a4ec0655f6f5e146196c/cognos-covid-dash2.png
--------------------------------------------------------------------------------
/coverage_calc.txt:
--------------------------------------------------------------------------------
1 | (total (If (WA_FULL_LOGS.COVERAGE = 'covered') then( 1)
2 | else (0))
3 | /
4 | total (If (WA_FULL_LOGS.COVERAGE = 'covered' or WA_FULL_LOGS.COVERAGE = 'uncovered') then (1)
5 | else (0)))
6 |
--------------------------------------------------------------------------------
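The expression in coverage_calc.txt divides the number of rows marked 'covered' by the number of rows marked either 'covered' or 'uncovered', so system messages are left out of the denominator. A minimal Python sketch of the same ratio follows. The function name `coverage_ratio` and the sample labels are hypothetical, and returning None when nothing is counted is an assumption; how Cognos handles a zero denominator is not addressed here.

```python
def coverage_ratio(labels):
    """Share of covered messages among covered + uncovered messages."""
    covered = sum(1 for label in labels if label == "covered")
    counted = sum(1 for label in labels if label in ("covered", "uncovered"))
    # Assumption: report None rather than dividing by zero
    return covered / counted if counted else None


# Hypothetical labels, as stored in the COVERAGE column
labels = ["covered", "covered", "uncovered", "system_message"]
print("{:.0%}".format(coverage_ratio(labels)))
```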
/db2-tables-sql-V2API.txt:
--------------------------------------------------------------------------------
1 | Create Table WATSON_WEB_ACTIONS.WA_FULL_LOGS(
2 | conversation_id Varchar(128),
3 | month Integer,
4 | day Integer,
5 | year Integer,
6 | input_text Varchar(30000),
7 | output_text Varchar(30000),
8 | action_title Varchar(1024),
9 | intent Varchar(1024),
10 | intent_confidence Decimal,
11 | entities Varchar(30000),
12 | input_bigrams Varchar(30000),
13 | dialog_turn_number Integer,
14 | conversation_length Integer,
15 | user_id Varchar(128),
16 | request_timestamp Timestamp,
17 | response_timestamp Timestamp,
18 | language_selection Varchar(128)
19 | );
20 |
21 |
22 | CREATE TABLE WATSON_WEB_ACTIONS.WA_LAST_RUN_LOG(
23 | conversation_id Varchar(128),
24 | request_timestamp Timestamp,
25 | response_timestamp Timestamp,
26 | lastrun_timestamp Timestamp
27 | );
--------------------------------------------------------------------------------
/db2-tables-sql.txt:
--------------------------------------------------------------------------------
1 | Create Table WATSON.WA_FULL_LOGS(
2 | conversation_id Varchar(128),
3 | month Integer,
4 | day Integer,
5 | year Integer,
6 | input_text Varchar(30000),
7 | output_text Varchar(30000),
8 | intent Varchar(1024),
9 | intent_confidence Decimal,
10 | entities Varchar(30000),
11 | coverage Varchar(128),
12 | response_type Varchar(128),
13 | input_bigrams Varchar(30000),
14 | dialog_turn_counter Decimal,
15 | user_id Varchar(128),
16 | branch_exited_reason Varchar(256),
17 | nodes_visited Varchar(1024),
18 | prev_nodes_visited Varchar(1024),
19 | request_timestamp Timestamp,
20 | response_timestamp Timestamp,
21 | language Varchar(1024),
22 | caller_id Varchar(128),
23 | vgw_session_id Varchar(128),
24 | sms_number Varchar(128),
25 | call_transfer Varchar(128)
26 | );
27 |
28 |
29 | CREATE TABLE WATSON.WA_LAST_RUN_LOG(
30 | conversation_id Varchar(128),
31 | request_timestamp Timestamp,
32 | response_timestamp Timestamp,
33 | lastrun_timestamp Timestamp
34 | );
35 |
--------------------------------------------------------------------------------
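The WA_LAST_RUN_LOG table defined above stores the timestamps of the most recent load. The DB2 notebook that follows reads one row from it and uses the date part of RESPONSE_TIMESTAMP as the start date for the next log fetch, falling back to '2020-04-15' when the table is empty. A minimal sketch of that decision, without a database connection, is shown below. The function name `next_fetch_start` and the sample row are hypothetical; the row is assumed to be a dict keyed by upper-case column names, or a falsy value when no row exists.

```python
from datetime import datetime

DEFAULT_FETCH_START = "2020-04-15"  # fallback date used in the notebook's DB2 cell


def next_fetch_start(prev_run, default=DEFAULT_FETCH_START):
    """Return the YYYY-MM-DD start date for the next log fetch."""
    if not prev_run:  # no previous run recorded
        return default
    return str(prev_run["RESPONSE_TIMESTAMP"].date())


# Hypothetical row from WA_LAST_RUN_LOG
row = {"RESPONSE_TIMESTAMP": datetime(2020, 6, 1, 13, 30)}
print(next_fetch_start(row))
print(next_fetch_start(False))
```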
/dev/Watson Assistant Metrics Notebook (DB2).ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "# Watson Assistant Metrics Notebook\n",
8 | "\n",
9 | "This notebook performs analytics on the user log records of Watson Assistant (including Voice Interaction). The logs are extracted, transformed, and loaded into a DB2 database and CSV files. A variety of key business metrics are calculated and displayed in the notebook. Using Watson Studio to build a Dashboard is recommended for further data exploration and visualization. \n",
10 | "\n",
11 | "\n",
12 | "\n",
13 | "### Table of Contents\n",
14 | "* [1. Configuration and Log Collection](#config) - This section will extract and transform the user log data from Watson Assistant.\n",
15 | "* [2. Key Performance Metrics](#performance-metrics) - Key metrics including containment rate, active users, and top intents will be calculated. \n",
16 | "* [3. Export Logs](#export-logs) - The transformed log data will be saved to a DB2 database and CSV files."
17 | ]
18 | },
19 | {
20 | "cell_type": "markdown",
21 | "metadata": {},
22 | "source": [
23 | "## Housekeeping \n",
24 | "This section will import libraries and dependencies for this notebook. \n",
25 | " \n",
26 | "> **Action Required:** Update the `project_id` and `project_access_token` in order to access your data assets. Instructions can be found here: https://dataplatform.cloud.ibm.com/docs/content/wsj/analyze-data/token.html"
27 | ]
28 | },
29 | {
30 | "cell_type": "code",
31 | "execution_count": null,
32 | "metadata": {},
33 | "outputs": [],
34 | "source": [
35 | "# @hidden_cell\n",
36 | "# The project token is an authorization token that is used to access project resources like data sources, connections, and used by platform APIs.\n",
37 | "from project_lib import Project\n",
38 | "project = Project(project_id='XXXXXXXX', project_access_token='XXXXXXXXX')\n",
39 | "pc = project.project_context\n"
40 | ]
41 | },
42 | {
43 | "cell_type": "code",
44 | "execution_count": null,
45 | "metadata": {
46 | "scrolled": true
47 | },
48 | "outputs": [],
49 | "source": [
50 | "!curl -O https://raw.githubusercontent.com/cognitive-catalyst/WA-Testing-Tool/master/log_analytics/getAllLogs.py\n",
51 | "!curl -O https://raw.githubusercontent.com/cognitive-catalyst/WA-Testing-Tool/master/log_analytics/extractConversations.py\n",
52 | "\n",
53 | "%load_ext autoreload\n",
54 | "%autoreload 2\n",
55 | "import warnings\n",
56 | "warnings.simplefilter(\"ignore\")\n",
57 | "\n",
58 | "!pip install ibm-watson\n",
59 | "!pip install --user --upgrade \"pandas==1.0.3\";\n",
60 | "\n",
61 | "import json\n",
62 | "import pandas as pd\n",
63 | "import getAllLogs\n",
64 | "import extractConversations\n",
65 | "import seaborn as sn\n",
66 | "import ibm_db\n",
67 | "import ibm_db_dbi"
68 | ]
69 | },
70 | {
71 | "cell_type": "code",
72 | "execution_count": null,
73 | "metadata": {},
74 | "outputs": [],
75 | "source": [
76 | "# Custom functions to re-use code throughout notebook\n",
77 | "def turn_dict_to_df(df,col_names):\n",
78 | " df = pd.DataFrame.from_dict(df)\n",
79 | " df.reset_index(level=0, inplace=True)\n",
80 | " df.columns = col_names\n",
81 | " return df"
82 | ]
83 | },
84 | {
85 | "cell_type": "markdown",
86 | "metadata": {},
87 | "source": [
88 | "## 1. Configuration and log collection \n",
89 | "This section will configure your DB2 connection, log query parameters, and will extract the logs from your Watson Assistant instance.\n",
90 | "\n",
91 | "> **Action Required:** Update each of the variables marked with 'XXXXXXXX'. The comments in the cells guide you in the configuration."
92 | ]
93 | },
94 | {
95 | "cell_type": "code",
96 | "execution_count": null,
97 | "metadata": {},
98 | "outputs": [],
99 | "source": [
100 | "# Define the customer name. This prefix will be used for saving CSV & JSON files.\n",
101 | "custName = 'XXXXXXXX'\n",
102 | "\n",
103 | "# Set True or False if you want data to write to DB2 table\n",
104 | "connectDB2 = True\n",
105 | "\n",
106 | "# Set the start date for the log fetch. If you are using the DB2 connection in Section 1.1, this will be defined automatically.\n",
107 | "log_fetch_start = '2020-05-15'"
108 | ]
109 | },
110 | {
111 | "cell_type": "markdown",
112 | "metadata": {},
113 | "source": [
114 | "### 1.1 Configure & Establish DB2 connection\n",
115 | "This section will define the values for your DB2 database, establish the connection, check for the last loaded date.\n",
116 | "\n",
117 | "> **Action Required:** You will first need to provision your DB2 instance and establish the table schemas. Follow these instructions. Then, update the values below marked 'XXXXXXXX'. "
118 | ]
119 | },
120 | {
121 | "cell_type": "code",
122 | "execution_count": 3,
123 | "metadata": {},
124 | "outputs": [],
125 | "source": [
126 | "# Enter the values for your database connection. These can be found in DB2's Service Credentials from the tooling. \n",
127 | "dsn_database = \"XXXXXXXX\" # e.g. \"MORTGAGE\"\n",
128 | "dsn_uid = \"XXXXXXXX\" # e.g. \"dash104434\"\n",
129 | "dsn_pwd = \"XXXXXXXX\" # e.g. \"7dBZ3jWt9xN6$o0JiX!m\"\n",
130 | "dsn_hostname = \"XXXXXXXX\" # e.g. \"Use the same IP as Web Console\"\n",
131 | "dsn_port = \"50000\" # e.g. \"50000\" \n",
132 | "dsn_protocol = \"TCPIP\" # i.e. \"TCPIP\"\n",
133 | "dsn_driver = \"IBM DB2 ODBC DRIVER\" # Don't change"
134 | ]
135 | },
136 | {
137 | "cell_type": "code",
138 | "execution_count": null,
139 | "metadata": {},
140 | "outputs": [],
141 | "source": [
142 | "# Establish database connection\n",
143 | "if connectDB2:\n",
144 | " dsn = (\"DRIVER={{IBM DB2 ODBC DRIVER}};\" \"DATABASE={0};\" \"HOSTNAME={1};\" \"PORT={2};\" \"PROTOCOL=TCPIP;\" \"UID={3};\" \"PWD={4};\").format(dsn_database, dsn_hostname, dsn_port, dsn_uid, dsn_pwd)\n",
145 | " options = { ibm_db.SQL_ATTR_AUTOCOMMIT: ibm_db.SQL_AUTOCOMMIT_ON }\n",
146 | " conn = ibm_db.connect(dsn, \"\", \"\",options)\n",
147 | " #Added options for auto commit\n",
148 | " \n",
149 | " # Retrieve the date for the previous DB2 run. If there is none defined, use 2020-04-15. This variable log_fetch_start is used for filtering WA logs.\n",
150 | " select_sql = 'SELECT * FROM WATSON.WA_LAST_RUN_LOG'\n",
151 | " select_stmt = ibm_db.exec_immediate(conn, select_sql)\n",
152 | " prev_run = ibm_db.fetch_both(select_stmt)\n",
153 | " first_run = True\n",
154 | " log_fetch_start = '2020-04-15'\n",
155 | " if prev_run != False:\n",
156 | " first_run = False\n",
157 | " l_conversation_id = prev_run.get('CONVERSATION_ID')\n",
158 | " l_request_timestamp = prev_run.get('REQUEST_TIMESTAMP')\n",
159 | " l_response_timestamp = prev_run.get('RESPONSE_TIMESTAMP')\n",
160 | " l_prev_run = prev_run.get('LASTRUN_TIMESTAMP')\n",
161 | " log_fetch_start = str(l_response_timestamp.date())\n",
162 | "print('log_fetch_start:',log_fetch_start)"
163 | ]
164 | },
165 | {
166 | "cell_type": "markdown",
167 | "metadata": {},
168 | "source": [
169 | "### 1.2 Retrieve logs from the Watson Assistant instance\n",
170 | "This section will retrieve the user logs from the Assistant `/logs` API.\n",
171 | "\n",
172 | "> **Action Required:** Update the fields below marked 'XXXXXXXX' based on the credentials of your Assistant. \n",
173 | "Solutions using an Assistant layer (v2 API) should set `workspace_id=None` and provide `assistant_id`. Otherwise, define workspace and comment out assistant_id.\n",
174 | "\n",
175 | "\n"
176 | ]
177 | },
178 | {
179 | "cell_type": "code",
180 | "execution_count": null,
181 | "metadata": {},
182 | "outputs": [],
183 | "source": [
184 | "# Extract logs from your assistant. Complete this information.\n",
185 | "iam_apikey = 'XXXXXXXX' \n",
186 | "url = \"XXXXXXXX\" # Set the URL to the region, e.g. https://api.us-east.assistant.watson.cloud.ibm.com\n",
187 | "assistant_id = 'XXXXXXXX'\n",
188 | "workspace_id = None\n",
189 | "\n",
190 | "# If not using assistant_id, comment out the 2nd line below. \n",
191 | "log_filter=\"language::en,response_timestamp>=\" + log_fetch_start \\\n",
192 | "+\",request.context.system.assistant_id::\" + assistant_id\n",
193 | "\n",
194 | "#Change the number of logs retrieved, default settings will return 100,000 logs (200 pages of 500)\n",
195 | "page_size_limit=500\n",
196 | "page_num_limit=200\n",
197 | "\n",
198 | "version=\"2018-09-20\" # Watson Assistant API version\n",
199 | "\n",
200 | "rawLogsJson = getAllLogs.getLogs(iam_apikey, url, workspace_id, log_filter, page_size_limit, page_num_limit, version)\n",
201 | "rawLogsPath= custName + \"_logs.json\"\n",
202 | "\n",
203 | "# getAllLogs.writeLogs(rawLogsJson, rawLogsPath) # Saves the logs locally\n",
204 | "project.save_data(file_name = rawLogsPath,data = json.dumps(rawLogsJson),overwrite=True); # Saves the logs in Studio/COS\n",
205 | "print('\\nSaved log data to {}'.format(rawLogsPath))"
206 | ]
207 | },
208 | {
209 | "cell_type": "markdown",
210 | "metadata": {},
211 | "source": [
212 | "### 1.3 Load logs from JSON file (Defunct)\n",
213 | "If you have previously saved the JSON file, you can uncomment this section to load the logs. Otherwise, comment this section out and continue."
214 | ]
215 | },
216 | {
217 | "cell_type": "code",
218 | "execution_count": null,
219 | "metadata": {},
220 | "outputs": [],
221 | "source": [
222 | "# #If you have previously stored your logs on the file system, you can reload them here by uncommenting these lines\n",
223 | "# rawLogsPath= custName+\"_logs.json\"\n",
224 | "# rawLogsJson = extractConversations.readLogs(rawLogsPath)"
225 | ]
226 | },
227 | {
228 | "cell_type": "markdown",
229 | "metadata": {},
230 | "source": [
231 | "### 1.4 Format logs\n",
232 | "Now that the logs have been retrieved, this section will transform the data out of JSON format and into a Pandas dataframe. \n",
233 | "\n",
234 | "> **Optional:** If you wish to add any custom fields (such as a context variable), add them to `customFieldNames` on the first line below. Otherwise, run this cell as-is."
235 | ]
236 | },
237 | {
238 | "cell_type": "code",
239 | "execution_count": null,
240 | "metadata": {},
241 | "outputs": [],
242 | "source": [
243 | "# Optionally provide a comma-separated list of custom fields you want to extract, in addition to the default fields\n",
244 | "customFieldNames = ''\n",
245 | "\n",
246 | "# Unique conversation identifier across all records. This is default. For a multi-skill assistant you will need to provide your own key.\n",
247 | "primaryLogKey = \"response.context.conversation_id\"\n",
248 | "conversationKey='conversation_id' # Name of the correlating key as it appears in the data frame columns (remove 'response.context.')\n",
249 | "\n",
250 | "# These custom fields are added to the list. They are used for extracting metrics in the notebook. Do not change these.\n",
251 | "customFieldNames = customFieldNames + \",response.context.vgwSIPFromURI,response.context.vgwSessionID,request.context.vgwSMSFailureReason,\\\n",
252 | "request.context.vgwSMSUserPhoneNumber,response.output.vgwAction.parameters.transferTarget,response.context.language,\\\n",
253 | "response.context.metadata.user_id,response.output.generic\"\n",
254 | "\n",
255 | "allLogsDF = extractConversations.extractConversationData(rawLogsJson, primaryLogKey, customFieldNames)\n",
256 | "conversationsGroup = allLogsDF.groupby(conversationKey,as_index=False)\n",
257 | "\n",
258 | "# Splits the response_timestamp into month, day, and year fields that can be used for easier data filtering/visualizations \n",
259 | "allLogsDF[\"full_date\"] = pd.to_datetime(allLogsDF[\"response_timestamp\"])\n",
260 | "allLogsDF['month'] = allLogsDF['full_date'].dt.month\n",
261 | "allLogsDF['day'] = allLogsDF['full_date'].dt.day\n",
262 | "allLogsDF['year'] = allLogsDF['full_date'].dt.year\n",
263 | "\n",
264 | "print(\"Total log events:\",len(allLogsDF))\n",
265 | "allLogsDF.head()"
266 | ]
267 | },
268 | {
269 | "cell_type": "code",
270 | "execution_count": null,
271 | "metadata": {},
272 | "outputs": [],
273 | "source": [
274 | "# Print the column names\n",
275 | "# allLogsDF.columns"
276 | ]
277 | },
278 | {
279 | "cell_type": "markdown",
280 | "metadata": {},
281 | "source": [
282 | "## 2. Key Performance Metrics \n",
283 | "The notebook will calculate various performance metrics including `coverage` and `containment`. Standard volume metrics will also be provided.\n",
284 | "\n",
285 | "* [2.1 Core Metrics](#core-metrics) - These are conversational metrics that apply to both chat and voice solutions.\n",
286 | "* [2.2 Voice Interaction Metrics](#voice-metrics) - Additional measurements for voice solutions including phone calls, call transfers, unique caller IDs, etc.\n",
287 | "* [2.3 Custom Metrics](#custom-metrics) - Other ad-hoc analysis. Requires knowledge of Python."
288 | ]
289 | },
290 | {
291 | "cell_type": "markdown",
292 | "metadata": {},
293 | "source": [
294 | "## 2.1 Core Metrics \n",
295 | "These metrics apply to all Watson Assistant solutions. For voice solutions, additional metrics are in the next section.\n",
296 | "* [2.1.1 Abandonment at Greeting](#abandonment)\n",
297 | "* [2.1.2 Coverage Metric](#coverage-metric)\n",
298 | "* [2.1.3 Search Skill Responses](#search-skill)\n",
299 | "* [2.1.4 Escalation Requests](#escalation-metric)\n",
300 | "* [2.1.5 Active Users](#active-users)\n",
301 | "* [2.1.6 Top Intents & Average Confidence Scores](#top-intents-scores)\n",
302 | "* [2.1.7 Top Entities](#top-entities)\n",
303 | "* [2.1.8 Optional: Bilingual Assistants](#bilingual-assistants)"
304 | ]
305 | },
306 | {
307 | "cell_type": "code",
308 | "execution_count": null,
309 | "metadata": {},
310 | "outputs": [],
311 | "source": [
312 | "# dict{} that we will send to CSV for use in Watson Studio Cognos Dashboard\n",
313 | "metrics_dict = {}\n",
314 | "\n",
315 | "# These should match the count in the Watson Assistant Analytics tooling.\n",
316 | "totalConvs = len(allLogsDF[conversationKey].unique())\n",
317 | "print(\"Total messages: \", len(allLogsDF))\n",
318 | "print(\"Total conversations:\", totalConvs)"
319 | ]
320 | },
321 | {
322 | "cell_type": "markdown",
323 | "metadata": {},
324 | "source": [
325 | "### 2.1.1 Abandonment at Greeting \n",
326 | "\n",
327 | "The logs include non-user messages such as welcome messages and system messages from a Voice Interaction solution. Filtering out these messages reveals how many conversations were abandoned before the first user utterance."
328 | ]
329 | },
330 | {
331 | "cell_type": "code",
332 | "execution_count": null,
333 | "metadata": {},
334 | "outputs": [],
335 | "source": [
336 | "# This removes log events with blank inputs or with the vgwHangUp and vgwPostResponseTimeout tags\n",
337 | "filteredLogsDF = allLogsDF[allLogsDF['input.text'] != \"\"]\n",
338 | "filteredLogsDF = filteredLogsDF[filteredLogsDF['input.text'] != 'vgwHangUp'] \n",
339 | "filteredLogsDF = filteredLogsDF[filteredLogsDF['input.text'] != 'vgwPostResponseTimeout'] \n",
340 | "\n",
341 | "filteredMessages = len(filteredLogsDF)\n",
342 | "filteredConvs = len(filteredLogsDF[conversationKey].unique())\n",
343 | "abandonedAtGreeting = (totalConvs - filteredConvs)\n",
344 | "metrics_dict['abandonedAtGreeting'] = [abandonedAtGreeting] # Put into metrics dict\n",
345 | "\n",
346 | "print(\"Abandoned conversations (no user input):\", abandonedAtGreeting)"
347 | ]
348 | },
349 | {
350 | "cell_type": "markdown",
351 | "metadata": {},
352 | "source": [
353 | "### 2.1.2 Coverage Metric \n",
354 | "Coverage measures the portion of total user messages that your assistant attempts to respond to. For example, any message answered with \"Sorry I didn't understand\" from the anything_else node is considered uncovered.\n",
355 | "\n",
356 | "> **Action Required:** Define the node ids in `anything_else_nodes` list that represent any responses for uncovered messages. This can be found by exporting the Skill from the Assistant tooling, and searching the JSON for the relevant `dialog_node`. "
357 | ]
358 | },
359 | {
360 | "cell_type": "code",
361 | "execution_count": null,
362 | "metadata": {},
363 | "outputs": [],
364 | "source": [
365 | "# Define the node_id for anything_else and other uncovered nodes\n",
366 | "anything_else_nodes = ['XXXXXXXX'] \n",
367 | "\n",
368 | "# coveredDF = allLogsDF\n",
369 | "allLogsDF.rename(columns={'input.text': 'input_text'}, inplace=True)\n",
370 | "coverage = []\n",
371 | "\n",
372 | "for row in allLogsDF.itertuples():\n",
373 | " appended = False \n",
374 | " nodes = row.nodes_visited\n",
375 | " for node in nodes:\n",
376 | " if node in anything_else_nodes:\n",
377 | " coverage.append('uncovered') # Mark as uncovered if message hit node in anything_else_nodes list\n",
378 | " appended = True\n",
379 | " break\n",
380 | " if (row.input_text == '' or row.input_text == 'vgwHangUp' or row.input_text == 'vgwPostResponseTimeout') and not appended:\n",
381 | " coverage.append('system_message') # Mark greetings and voicegateway actions as system_messages\n",
382 | " appended = True\n",
383 | " if not appended:\n",
384 | " coverage.append('covered') # else, everything else is covered\n",
385 | "\n",
386 | "allLogsDF['coverage'] = coverage\n",
387 | "allLogsDF.rename(columns={'input_text': 'input.text'}, inplace=True)\n",
388 | "coveredDF = allLogsDF[allLogsDF['coverage'] == 'covered']\n",
389 | "uncoveredDF = allLogsDF[allLogsDF['coverage'] == 'uncovered']\n",
390 | "\n",
391 | "print('Covered messages: ', len(coveredDF))\n",
392 | "print('Uncovered messages: ', len(allLogsDF[allLogsDF['coverage'] == 'uncovered']))\n",
393 | "print('System messages: ', len(allLogsDF[allLogsDF['coverage'] == 'system_message']))\n",
394 | "print('\\nCoverage metric: ','{:.0%}'.format(len(coveredDF) / filteredMessages))\n",
395 | "\n",
396 | "# coveredMsgs[['input_text','output.text','coverage']].tail(10)\n",
397 | "\n",
398 | "metrics_dict['coverage'] = [len(coveredDF) / filteredMessages] # Put into metrics dict"
399 | ]
400 | },
401 | {
402 | "cell_type": "code",
403 | "execution_count": null,
404 | "metadata": {},
405 | "outputs": [],
406 | "source": [
407 | "# uncoveredDF[['input.text','output.text']].head(10)"
408 | ]
409 | },
410 | {
411 | "cell_type": "markdown",
412 | "metadata": {},
413 | "source": [
414 | "### 2.1.3 Search Skill Responses \n",
415 | "Watson Assistant has multiple response types including `text`, `option`, `image`, `pause`, or `search skill`. Each of these types is marked within `output.generic.response_type` inside the log data. This cell will calculate the number of Search Skill responses."
416 | ]
417 | },
418 | {
419 | "cell_type": "code",
420 | "execution_count": null,
421 | "metadata": {},
422 | "outputs": [],
423 | "source": [
424 | "# Run this cell\n",
425 | "response_type = []\n",
426 | "\n",
427 | "for row in allLogsDF['output.generic']:\n",
428 | " search_skill = False\n",
429 | " for response in row: # each output can have multiple responses\n",
430 | " if response['response_type'] == 'search_skill':\n",
431 | " response_type.append('search_skill')\n",
432 | " search_skill = True\n",
433 | " break\n",
434 | " \n",
435 | " if not search_skill: # if the response was not a search skill, append other to the list\n",
436 | " response_type.append('other')\n",
437 | " \n",
438 | "allLogsDF['response_type'] = response_type # Add in response_type column to allLogsDF\n",
439 | "searchSkillDF = allLogsDF[allLogsDF['response_type'] == 'search_skill'] # Set new DF \n",
440 | "print('Total Search Skill responses:',len(searchSkillDF))\n",
441 | "print('Percentage of total messages: {:.0%}'.format(len(searchSkillDF) / len(allLogsDF) ))\n",
442 | "\n",
443 | "searchSkillDF[['input.text','response_type']].head().reset_index(drop=True) # Print the list of user inputs that caused search skill"
444 | ]
445 | },
446 | {
447 | "cell_type": "code",
448 | "execution_count": null,
449 | "metadata": {},
450 | "outputs": [],
451 | "source": [
452 | "# Saves to CSV\n",
453 | "project.save_data(file_name = custName + \"_search-skill-inputs.csv\",data = searchSkillDF.to_csv(index=False),overwrite=True); # This saves in COS. Comment out if running locally"
454 | ]
455 | },
456 | {
457 | "cell_type": "markdown",
458 | "metadata": {},
459 | "source": [
460 | "### 2.1.3 Escalation Requests \n",
461 | "\n",
462 | "Escalation refers to any time a user is prompted to contact a live person (e.g. via a 1-800 number). A handoff through an integration with a live agent service (e.g. Zendesk) also counts as escalation. For Voice Interaction solutions, we calculate `call containment` in the next section by counting the number of actual call transfers in the logs.\n",
463 | "\n",
464 | "> **Action Required:** Define the node id in `escalation_node` for a node that represents any responses to an escalation request (e.g. `#General-Agent-Escalation`). This can be found by exporting the Skill from the Assistant tooling, and searching the JSON for the relevant dialog_node.\n",
465 | " "
466 | ]
467 | },
468 | {
469 | "cell_type": "code",
470 | "execution_count": null,
471 | "metadata": {},
472 | "outputs": [],
473 | "source": [
474 | "# Define the escalation node\n",
475 | "escalation_node = \"XXXXXXXX\" \n",
476 | "node_visits_escalated = allLogsDF[[escalation_node in x for x in allLogsDF['nodes_visited']]]\n",
477 | "\n",
478 | "escalationMetric = len(node_visits_escalated)/filteredMessages\n",
479 | "metrics_dict['escalation'] = [escalationMetric] # Put into metrics dict\n",
480 | "print(\"Total visits to escalation node:\",len(node_visits_escalated))\n",
481 | "print(\"Percent of total messages escalated:\",'{:.0%}'.format(escalationMetric))"
482 | ]
483 | },
484 | {
485 | "cell_type": "markdown",
486 | "metadata": {},
487 | "source": [
488 | "### 2.1.4 Active Users \n",
489 | "How many unique users used the assistant?"
490 | ]
491 | },
492 | {
493 | "cell_type": "code",
494 | "execution_count": null,
495 | "metadata": {},
496 | "outputs": [],
497 | "source": [
498 | "uniqueUsers = allLogsDF[\"metadata.user_id\"].nunique()\n",
499 | "metrics_dict['uniqueUsers'] = [uniqueUsers] # inserts into metrics dict\n",
500 | "print('Total unique users: {}'.format(uniqueUsers))"
501 | ]
502 | },
503 | {
504 | "cell_type": "markdown",
505 | "metadata": {},
506 | "source": [
507 | "### 2.1.5 Top Intents & Average Confidence Scores "
508 | ]
509 | },
510 | {
511 | "cell_type": "code",
512 | "execution_count": null,
513 | "metadata": {},
514 | "outputs": [],
515 | "source": [
516 | "# Using pandas aggregators to count how often each intent is selected and its average confidence\n",
517 | "intentsDF = filteredLogsDF.groupby('intent',as_index=False).agg({\n",
518 | " 'input.text': ['count'], \n",
519 | " 'intent_confidence': ['mean']\n",
520 | "})\n",
521 | "\n",
522 | "intentsDF.columns=[\"intent\",\"count\",\"confidence\"] #Flatten the column headers for ease of use\n",
523 | "\n",
524 | "intentsDF = intentsDF[intentsDF['intent'] !=''] # Remove blanks, usually VGW tags + greetings\n",
525 | "intentsDF = intentsDF.sort_values('count',ascending=False)\n",
526 | "intentsDF = intentsDF.reset_index(drop=True)\n",
527 | "intentsDF.head(5) # If you want specific number shown, edit inside head(). If you want to show all, remove head() "
528 | ]
529 | },
530 | {
531 | "cell_type": "code",
532 | "execution_count": null,
533 | "metadata": {},
534 | "outputs": [],
535 | "source": [
536 | "#ax = sns.barplot(x=\"count\", y=\"intent\", data=intentsDF.head(),orient='h',palette=\"Blues_d\").set_title('Top Intents')"
537 | ]
538 | },
539 | {
540 | "cell_type": "markdown",
541 | "metadata": {},
542 | "source": [
543 | "### 2.1.6 Top Entities (Defunct) "
544 | ]
545 | },
546 | {
547 | "cell_type": "code",
548 | "execution_count": null,
549 | "metadata": {},
550 | "outputs": [],
551 | "source": [
552 | "entityDF = allLogsDF[allLogsDF[\"entities\"] != \"\"]\n",
553 | "#intentsDF = intentsDF[intentsDF['intent'] !=''] # Remove blanks, usually VGW tags + greetings\n",
554 | "entityDF[\"entities\"].iloc[0]"
555 | ]
556 | },
557 | {
558 | "cell_type": "markdown",
559 | "metadata": {},
560 | "source": [
561 | "### 2.1.7 Optional: Bilingual Assistants \n",
562 | "This section applies to assistants that use a single skill for two different languages. The skill may set a context variable (e.g. `$language==\"english\"`) and then respond based on this variable. This cell counts the unique conversation_ids for each value of that context variable.\n",
563 | "\n",
564 | "> **Optional:** Define the `languageVar` that your skill uses to identify the language used to respond to the user."
565 | ]
566 | },
567 | {
568 | "cell_type": "code",
569 | "execution_count": null,
570 | "metadata": {},
571 | "outputs": [],
572 | "source": [
573 | "languageVar = 'language' # define the context variable that you retrieved above in customFields\n",
574 | "\n",
575 | "languageDF = allLogsDF.groupby([languageVar])[\"conversation_id\"].nunique()\n",
576 | "languageDF = turn_dict_to_df(languageDF, ['Context Var', 'Count'])\n",
577 | "languageDF = languageDF[languageDF['Context Var'] != '']\n",
578 | "languageDF"
579 | ]
580 | },
581 | {
582 | "cell_type": "markdown",
583 | "metadata": {},
584 | "source": [
585 | "## 2.2 Voice Interaction Metrics \n",
586 | "These metrics are for Voice Agent solutions. We start with volume metrics. \n",
587 | "If your solution is chat only, [skip to the next section.](#export-to-csv)\n",
588 | "\n",
589 | "* [2.2.1 Call Containment Rate](#containment-rate)\n",
590 | "* [2.2.2 Unique Callers](#unique-callers)\n",
591 | "* [2.2.3 SMS Sent](#sms-sent)"
592 | ]
593 | },
594 | {
595 | "cell_type": "code",
596 | "execution_count": null,
597 | "metadata": {},
598 | "outputs": [],
599 | "source": [
600 | "uniqueCallers = allLogsDF['vgwSIPFromURI'].unique()\n",
601 | "uniqueCalls = allLogsDF['vgwSessionID'].unique()\n",
602 | "\n",
603 | "print(\"Total phone calls:\", len(uniqueCalls)) # It will print '1' if there are no calls found in the logs\n",
604 | "print(\"Total unique callers:\", len(uniqueCallers))\n",
605 | "print(\"Average messages per call:\", int(len(allLogsDF) / len(uniqueCalls)))"
606 | ]
607 | },
608 | {
609 | "cell_type": "code",
610 | "execution_count": null,
611 | "metadata": {},
612 | "outputs": [],
613 | "source": [
614 | "# Filters out blank inputs and vgwHangUp tags in log events\n",
615 | "filteredLogsDF = allLogsDF[allLogsDF['input.text'] != \"\"]\n",
616 | "filteredLogsDF = filteredLogsDF[filteredLogsDF['input.text'] != 'vgwHangUp'] \n",
617 | "filteredLogsDF = filteredLogsDF[filteredLogsDF['input.text'] != 'vgwPostResponseTimeout'] "
618 | ]
619 | },
620 | {
621 | "cell_type": "markdown",
622 | "metadata": {},
623 | "source": [
624 | "### 2.2.1 Call Containment Rate \n",
625 | "How many call transfers did the voice solution perform?"
626 | ]
627 | },
628 | {
629 | "cell_type": "code",
630 | "execution_count": null,
631 | "metadata": {},
632 | "outputs": [],
633 | "source": [
634 | "transfersDF = allLogsDF.groupby([\"output.vgwAction.parameters.transferTarget\"])[\"vgwSessionID\"].count()\n",
635 | "transfersDF = turn_dict_to_df(transfersDF, ['TransferTo', 'Count'])\n",
636 | "transfersDF = transfersDF[transfersDF['TransferTo'] != '']\n",
637 | "\n",
638 | "print('Call transfer count:', transfersDF['Count'].sum()) \n",
639 | "containmentRate = 1 - transfersDF['Count'].sum() / len(uniqueCalls)\n",
640 | "print('Call containment rate:', '{:.0%}'.format(containmentRate))\n",
641 | "metrics_dict['callTransfers'] = [transfersDF['Count'].sum()] # Put into metrics dict\n",
642 | "metrics_dict['containment'] = [containmentRate] # Put into metrics dict\n",
643 | "transfersDF.sort_values('Count',ascending=False)"
644 | ]
645 | },
646 | {
647 | "cell_type": "markdown",
648 | "metadata": {},
649 | "source": [
650 | "### 2.2.2 Unique Callers \n",
651 | "How many unique caller IDs dialed into the voice solution?"
652 | ]
653 | },
654 | {
655 | "cell_type": "code",
656 | "execution_count": null,
657 | "metadata": {},
658 | "outputs": [],
659 | "source": [
660 | "callsDF = allLogsDF.groupby(['vgwSIPFromURI'])['vgwSessionID'].nunique()\n",
661 | "callsDF = pd.DataFrame.from_dict(callsDF)\n",
662 | "callsDF.reset_index(level=0, inplace=True)\n",
663 | "callsDF.columns = ['Caller ID', 'Call Count']\n",
664 | "print('Total unique caller IDs:', len(callsDF))\n",
665 | "metrics_dict['callerIDs'] = [len(callsDF)] # Put into metrics dict\n",
666 | "callsDF.sort_values('Call Count',ascending=False).head() # Sort first, then show the most frequent callers as the cell output"
667 | ]
668 | },
669 | {
670 | "cell_type": "markdown",
671 | "metadata": {},
672 | "source": [
673 | "### 2.2.3 SMS Sent \n",
674 | "How many SMS were sent by the assistant? A text message can be sent to the caller and can be initiated from within the Watson Assistant JSON editor. This will count the number of SMS sent."
675 | ]
676 | },
677 | {
678 | "cell_type": "code",
679 | "execution_count": null,
680 | "metadata": {},
681 | "outputs": [],
682 | "source": [
683 | "smsDF = allLogsDF[allLogsDF['vgwSMSUserPhoneNumber'] != '']\n",
684 | "metrics_dict['sms'] = [len(smsDF)] # Put into metrics dict\n",
685 | "print('Total SMS sent to callers: {}'.format(len(smsDF)))"
686 | ]
687 | },
688 | {
689 | "cell_type": "markdown",
690 | "metadata": {},
691 | "source": [
692 | "## 2.3 Custom Metrics \n",
693 | "This section is optional and can be used to create custom metrics. It requires basic knowledge of Python and pandas. The two example custom metrics below can be modified, or additional metrics can be added here. [Jump to section 3](#export-logs) if you do not wish to build custom metrics.\n",
694 | "\n",
695 | "* [2.3.1 Context Variable Count](#context-variable-count)\n",
696 | "* [2.3.2 Response Mentions](#response-mentions)"
697 | ]
698 | },
699 | {
700 | "cell_type": "markdown",
701 | "metadata": {},
702 | "source": [
703 | "### 2.3.1 Context Variable Count \n",
704 | "Some use cases require the use of context variables in order to track user inputs. For one customer, the assistant asks a series of questions in order to screen the patient. \n",
705 | "\n",
706 | "> **Optional:** If you wish to count the unique conversation IDs for each value of a context variable, define `contextVar` below."
707 | ]
708 | },
709 | {
710 | "cell_type": "code",
711 | "execution_count": null,
712 | "metadata": {},
713 | "outputs": [],
714 | "source": [
715 | "contextVar = 'preferredContact' # define the context variable that you retrieved above in customFields\n",
716 | "\n",
717 | "contextDF = allLogsDF.groupby([contextVar])[\"conversation_id\"].nunique()\n",
718 | "contextDF = turn_dict_to_df(contextDF, ['Context Var', 'Count'])\n",
719 | "contextDF = contextDF[contextDF['Context Var'] != '']\n",
720 | "contextDF"
721 | ]
722 | },
723 | {
724 | "cell_type": "code",
725 | "execution_count": null,
726 | "metadata": {},
727 | "outputs": [],
728 | "source": [
729 | "contextVar = 'contactSubmitted' # define the context variable that you retrieved above in customFields\n",
730 | "\n",
731 | "contextDF = allLogsDF.groupby([contextVar])[\"conversation_id\"].nunique()\n",
732 | "contextDF = turn_dict_to_df(contextDF, ['Context Var', 'Count'])\n",
733 | "contextDF = contextDF[contextDF['Context Var'] != '']\n",
734 | "contextDF"
735 | ]
736 | },
737 | {
738 | "cell_type": "code",
739 | "execution_count": null,
740 | "metadata": {},
741 | "outputs": [],
742 | "source": [
743 | "# project.save_data(file_name = custName + \"_ScreeningCount.csv\",data = contextDF.to_csv(index=False),overwrite=True) # This saves in COS. Comment out if running locally"
744 | ]
745 | },
746 | {
747 | "cell_type": "markdown",
748 | "metadata": {},
749 | "source": [
750 | "### 2.3.2 Response Mentions \n",
751 | "A specific customer wanted to identify all mentions of `311` in the responses to users. You can modify this or comment it out."
752 | ]
753 | },
754 | {
755 | "cell_type": "code",
756 | "execution_count": null,
757 | "metadata": {},
758 | "outputs": [],
759 | "source": [
760 | "helpDF = allLogsDF[(allLogsDF['output.text'].str.contains('311')) | (allLogsDF['output.text'].str.contains('3-1-1'))] \n",
761 | "print('Total 3-1-1 response mentions:', len(helpDF))"
762 | ]
763 | },
764 | {
765 | "cell_type": "markdown",
766 | "metadata": {},
767 | "source": [
768 | "# 3. Export Logs \n",
769 | "The transformed log data in the pandas dataframe will be saved to CSV files and to a DB2 on Cloud database. These logs can be used for further data exploration and for creating visualizations in Cognos Dashboard in Watson Studio.\n",
770 | "\n",
771 | "* [3.1 Saving CSV files to Cloud Object Storage](#export-to-csv) CSV files will be saved to the project's Data Assets and Cloud Object Storage.\n",
772 | "* [3.2 Loading into DB2 on Cloud database](#export-to-db2) The data will be saved to a table on your DB2 instance. \n",
773 | "\n",
774 | "## 3.1 Saving CSV files to Cloud Object Storage \n",
775 | "The data will be saved into a CSV file in Cloud Object Storage, accessible via your project's assets folder in Watson Studio. There will be three distinct CSV files saved:\n",
776 | "* `_logs.csv` will contain all of the data within the allLogs dataframe\n",
777 | "* `_KeyMetrics.csv` will contain the calculated metrics such as coverage, escalation, containment rate, etc.\n",
778 | "* `_uncovered_msgs.csv` will contain the selection of uncovered messages. This file can be used for making improvements to intent training and dialog responses.\n",
779 | "\n",
780 | "\n",
781 | "### 3.1.1 Save all logs to CSV"
782 | ]
783 | },
784 | {
785 | "cell_type": "code",
786 | "execution_count": null,
787 | "metadata": {},
788 | "outputs": [],
789 | "source": [
790 | "# allLogsDF.to_csv(custName+'_logs.csv',index=False) # This saves if running notebook locally. Comment out for Studio. \n",
791 | "print('Saving all logs to {}'.format(custName+ \"_logs.csv\"))\n",
792 | "project.save_data(file_name = custName + \"_logs.csv\",data = allLogsDF.to_csv(index=False),overwrite=True); # This saves in COS. Comment out if running locally"
793 | ]
794 | },
795 | {
796 | "cell_type": "markdown",
797 | "metadata": {},
798 | "source": [
799 | "### 3.1.2 Save KPIs to CSV"
800 | ]
801 | },
802 | {
803 | "cell_type": "code",
804 | "execution_count": null,
805 | "metadata": {},
806 | "outputs": [],
807 | "source": [
808 | "metricsDF = pd.DataFrame(metrics_dict)\n",
809 | "# metricsDF.to_csv(custName + \"_KeyMetrics.csv\",index=False) # This saves if running notebook locally. Comment out for Studio. \n",
810 | "project.save_data(file_name = custName + \"_KeyMetrics.csv\",data = metricsDF.to_csv(index=False),overwrite=True); # This saves in COS. Comment out if running locally\n",
811 | "print('Saving key metrics to {}'.format(custName+ \"_KeyMetrics.csv\"))\n",
812 | "metricsDF"
813 | ]
814 | },
815 | {
816 | "cell_type": "markdown",
817 | "metadata": {},
818 | "source": [
819 | "### 3.1.3 Save uncovered messages to CSV\n",
820 | "Improve Coverage by analyzing these uncovered messages. This might require adding training data to Intents or customizing STT models."
821 | ]
822 | },
823 | {
824 | "cell_type": "code",
825 | "execution_count": null,
826 | "metadata": {},
827 | "outputs": [],
828 | "source": [
829 | "# uncoveredDF.to_csv(custName + \"_uncovered_msgs.csv\",index=False, header=['Utterance','Response','Intent','Confidence'])\n",
830 | "\n",
831 | "project.save_data(file_name = custName + \"_uncovered_msgs.csv\",data = uncoveredDF.to_csv(index=False),overwrite=True); # This saves in COS. Comment out if running locally\n",
832 | "print('\\nSaved', len(uncoveredDF), 'messages to:', custName + \"_uncovered_msgs.csv\") # Report only after the save completes"
833 | ]
834 | },
835 | {
836 | "cell_type": "markdown",
837 | "metadata": {},
838 | "source": [
839 | "## 3.2 Loading into DB2 on Cloud database \n",
840 | "The transformed log data will be inserted into a table within an instance of DB2 on Cloud. This requires the configuration in section 1.1."
841 | ]
842 | },
843 | {
844 | "cell_type": "code",
845 | "execution_count": null,
846 | "metadata": {},
847 | "outputs": [],
848 | "source": [
849 | "# Prepare to create Logs\n",
850 | "columns = 'BRANCH_EXITED_REASON,CONVERSATION_ID,DIALOG_TURN_COUNTER,ENTITIES,INPUT_TEXT,INTENT,INTENT_CONFIDENCE,NODES_VISITED,OUTPUT_TEXT,REQUEST_TIMESTAMP,RESPONSE_TIMESTAMP,LANGUAGE,PREV_NODES_VISITED,MONTH,DAY,YEAR,CALLER_ID,VGW_SESSION_ID,SMS_NUMBER,CALL_TRANSFER,USER_ID,COVERAGE,RESPONSE_TYPE'\n",
851 | "insertSQL = 'Insert into WATSON.WA_FULL_LOGS(' + columns + ') values(?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)'\n",
852 | "stmt = ibm_db.prepare(conn, insertSQL)\n",
853 | "checkSQL = 'Select CONVERSATION_ID,RESPONSE_TIMESTAMP from WATSON.WA_FULL_LOGS where CONVERSATION_ID = ? and RESPONSE_TIMESTAMP = ?'\n",
854 | "checkStmt = ibm_db.prepare(conn, checkSQL)\n",
855 | "insertSQL"
856 | ]
857 | },
858 | {
859 | "cell_type": "code",
860 | "execution_count": null,
861 | "metadata": {
862 | "scrolled": true
863 | },
864 | "outputs": [],
865 | "source": [
866 | "LOG_EVENTS_COUNTER = 0\n",
867 | "# Insert the rows from the dataframe into the DB2 table.\n",
868 | "for n in range(len(allLogsDF)):\n",
869 | "    # Check whether a record with this conversation id and response timestamp already exists\n",
870 | "    ibm_db.bind_param(checkStmt,1,allLogsDF.at[n,'conversation_id'], ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_VARCHAR)\n",
871 | "    ibm_db.bind_param(checkStmt, 2,allLogsDF.at[n,'response_timestamp'], ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_TYPE_TIMESTAMP)\n",
872 | "    #select_stmt = ibm_db.exec_immediate(conn, select_sql)\n",
873 | "    select_result = ibm_db.execute(checkStmt)\n",
874 | "    row_exists = ibm_db.fetch_both(checkStmt)\n",
875 | "    if not row_exists:\n",
876 | "        # Row does not exist, so insert it into the full logs table\n",
877 | "        LOG_EVENTS_COUNTER = LOG_EVENTS_COUNTER + 1\n",
878 | "        #print('Inserting conversation id: ' + allLogsDF.at[n,'conversation_id'])\n",
879 | "        \n",
880 | "        ibm_db.bind_param(stmt,1,allLogsDF.at[n,'branch_exited_reason'], ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_VARCHAR)\n",
881 | "        ibm_db.bind_param(stmt,2,allLogsDF.at[n,'conversation_id'], ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_VARCHAR)\n",
882 | "        ibm_db.bind_param(stmt,3,allLogsDF.at[n,'dialog_turn_counter'], ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_DECIMAL)\n",
883 | "        ibm_db.bind_param(stmt,4,str(allLogsDF.at[n,'entities'])[:1020], ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_VARCHAR) # Convert to str first, then truncate to 1020 characters\n",
884 | "        ibm_db.bind_param(stmt,5,str(allLogsDF.at[n,'input.text'])[:1020], ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_VARCHAR)\n",
885 | "        ibm_db.bind_param(stmt,6,str(allLogsDF.at[n,'intent'])[:1020], ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_VARCHAR)\n",
886 | "        ibm_db.bind_param(stmt,7,allLogsDF.at[n,'intent_confidence'], ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_DECIMAL)\n",
887 | "        ibm_db.bind_param(stmt, 8,str(allLogsDF.at[n,'nodes_visited'])[:1020], ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_VARCHAR)\n",
888 | "        ibm_db.bind_param(stmt, 9,str(allLogsDF.at[n,'output.text'])[:1020], ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_VARCHAR)\n",
889 | "        ibm_db.bind_param(stmt, 10,allLogsDF.at[n,'request_timestamp'], ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_TYPE_TIMESTAMP)\n",
890 | "        ibm_db.bind_param(stmt, 11,allLogsDF.at[n,'response_timestamp'], ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_TYPE_TIMESTAMP)\n",
891 | "\n",
892 | "        ibm_db.bind_param(stmt, 12,str(allLogsDF.at[n,'language'])[:1020], ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_VARCHAR)\n",
893 | "        ibm_db.bind_param(stmt, 13,str(allLogsDF.at[n,'prev_nodes_visited'])[:1020], ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_VARCHAR)\n",
894 | "        \n",
895 | "        ibm_db.bind_param(stmt, 14,str(allLogsDF.at[n,'month']), ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_INTEGER)\n",
896 | "        ibm_db.bind_param(stmt, 15,str(allLogsDF.at[n,'day']), ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_INTEGER)\n",
897 | "        ibm_db.bind_param(stmt, 16,str(allLogsDF.at[n,'year']), ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_INTEGER)\n",
898 | "\n",
899 | "        ibm_db.bind_param(stmt, 17,str(allLogsDF.at[n,'vgwSIPFromURI']), ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_VARCHAR)\n",
900 | "        ibm_db.bind_param(stmt, 18,str(allLogsDF.at[n,'vgwSessionID']), ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_VARCHAR)\n",
901 | "        ibm_db.bind_param(stmt, 19,str(allLogsDF.at[n,'vgwSMSUserPhoneNumber']), ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_VARCHAR)\n",
902 | "        ibm_db.bind_param(stmt, 20,str(allLogsDF.at[n,'output.vgwAction.parameters.transferTarget']), ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_VARCHAR)\n",
903 | "        ibm_db.bind_param(stmt, 21,str(allLogsDF.at[n,'metadata.user_id']), ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_VARCHAR)\n",
904 | "        ibm_db.bind_param(stmt, 22,str(allLogsDF.at[n,'coverage']), ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_VARCHAR)\n",
905 | "        ibm_db.bind_param(stmt, 23,str(allLogsDF.at[n,'response_type']), ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_VARCHAR)\n",
906 | "        \n",
907 | "        ibm_db.execute(stmt)\n",
908 | "#ibm_db.commit(conn)\n",
909 | "#ibm_db.close(conn)\n",
910 | "print('Total log events saved to database:', LOG_EVENTS_COUNTER)"
911 | ]
912 | },
913 | {
914 | "cell_type": "markdown",
915 | "metadata": {},
916 | "source": [
917 | "### Update the current run details (WA_LAST_RUN_LOG)"
918 | ]
919 | },
920 | {
921 | "cell_type": "code",
922 | "execution_count": null,
923 | "metadata": {},
924 | "outputs": [],
925 | "source": [
926 | "# Store Current run details\n",
927 | "del_tracking = 'Delete from WATSON.WA_LAST_RUN_LOG'\n",
928 | "insert_tracking = 'Insert into WATSON.WA_LAST_RUN_LOG (conversation_id, request_timestamp, response_timestamp, lastrun_timestamp) Values (?,?,?,?) '\n",
929 | "trans_stmt = ibm_db.prepare(conn, insert_tracking)\n",
930 | "insert_tracking"
931 | ]
932 | },
933 | {
934 | "cell_type": "code",
935 | "execution_count": null,
936 | "metadata": {},
937 | "outputs": [],
938 | "source": [
939 | "#Delete previous entry\n",
940 | "ibm_db.exec_immediate(conn,del_tracking)\n",
941 | "\n",
942 | "# Get the latest log entry from the dataframe. First let's sort it so tail(1) is the last entry.\n",
943 | "allLogsDF.sort_values(by=['response_timestamp'], axis=0, \n",
944 | " ascending=True, inplace=True)\n",
945 | "allLogsDF = allLogsDF.reset_index(drop=True)\n",
946 | "last_row = allLogsDF.tail(1).reset_index(drop=True)\n",
947 | "#store the latest row details.\n",
948 | "ibm_db.bind_param(trans_stmt,1,last_row.at[0,'conversation_id'], ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_VARCHAR)\n",
949 | "ibm_db.bind_param(trans_stmt,2,last_row.at[0,'request_timestamp'], ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_TYPE_TIMESTAMP)\n",
950 | "ibm_db.bind_param(trans_stmt,3,last_row.at[0,'response_timestamp'], ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_TYPE_TIMESTAMP)\n",
951 | "ibm_db.bind_param(trans_stmt,4,pd.Timestamp.now(),ibm_db.SQL_PARAM_INPUT,ibm_db.SQL_TYPE_TIMESTAMP)\n",
952 | "ibm_db.execute(trans_stmt)\n",
953 | "print(pd.Timestamp.now())\n",
954 | "#Commit and close the connection\n",
955 | "ibm_db.commit(conn)\n",
956 | "ibm_db.close(conn)"
957 | ]
958 | },
959 | {
960 | "cell_type": "markdown",
961 | "metadata": {},
962 | "source": [
963 | "### End of Notebook v2.0 (last modified on 6-8-20)"
964 | ]
965 | },
966 | {
967 | "cell_type": "code",
968 | "execution_count": null,
969 | "metadata": {},
970 | "outputs": [],
971 | "source": []
972 | }
973 | ],
974 | "metadata": {
975 | "kernelspec": {
976 | "display_name": "Python 3",
977 | "language": "python",
978 | "name": "python3"
979 | },
980 | "language_info": {
981 | "codemirror_mode": {
982 | "name": "ipython",
983 | "version": 3
984 | },
985 | "file_extension": ".py",
986 | "mimetype": "text/x-python",
987 | "name": "python",
988 | "nbconvert_exporter": "python",
989 | "pygments_lexer": "ipython3",
990 | "version": "3.7.6"
991 | }
992 | },
993 | "nbformat": 4,
994 | "nbformat_minor": 2
995 | }
996 |
--------------------------------------------------------------------------------
/testfile:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/preethm/watson-assistant-metrics-notebook/0c10bd46c55a225ba412a4ec0655f6f5e146196c/testfile
--------------------------------------------------------------------------------