├── .gitignore
├── ACKNOWLEDGEMENTS.md
├── CONTRIBUTING.md
├── LICENSE
├── MAINTAINERS.md
├── README.md
├── data
│   └── examples
│       └── nodebook_1.ipynb
├── doc
│   └── source
│       └── images
│           ├── architecture.png
│           ├── new_custom_environment.png
│           ├── new_notebook_custom_environment.png
│           └── notebook_preview.png
└── notebooks
    ├── images
    │   ├── display_sin_cos.png
    │   ├── mapbox_americas.png
    │   ├── mapbox_uk.png
    │   ├── pd_chart_types.png
    │   └── pixiedust_node_schematic.png
    └── nodebook_1.ipynb
/.gitignore:
--------------------------------------------------------------------------------
1 | notebooks/.ipynb_checkpoints/
2 | notebooks/derby.log
3 | notebooks/metastore_db/
4 |
5 |
6 |
--------------------------------------------------------------------------------
/ACKNOWLEDGEMENTS.md:
--------------------------------------------------------------------------------
1 | ## Acknowledgements
2 |
3 | * [Glynn Bird](https://github.com/glynnbird) created the [pixiedust_node](https://github.com/ibm-watson-data-lab/pixiedust_node) Python library, on which this code pattern is based.
4 |
--------------------------------------------------------------------------------
/CONTRIBUTING.md:
--------------------------------------------------------------------------------
1 | # Contributing
2 |
3 | This is an open source project, and we appreciate your help!
4 |
5 | We use the GitHub issue tracker to discuss new features and non-trivial bugs.
6 |
7 | In addition to the issue tracker, [#journeys on
8 | Slack](https://dwopen.slack.com) is the best way to get into contact with the
9 | project's maintainers.
10 |
11 | To contribute code, documentation, or tests, please submit a pull request to
12 | the GitHub repository. Generally, we expect two maintainers to review your pull
13 | request before it is approved for merging. For more details, see the
14 | [MAINTAINERS](MAINTAINERS.md) page.
15 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | Apache License
2 | Version 2.0, January 2004
3 | http://www.apache.org/licenses/
4 |
5 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
6 |
7 | 1. Definitions.
8 |
9 | "License" shall mean the terms and conditions for use, reproduction,
10 | and distribution as defined by Sections 1 through 9 of this document.
11 |
12 | "Licensor" shall mean the copyright owner or entity authorized by
13 | the copyright owner that is granting the License.
14 |
15 | "Legal Entity" shall mean the union of the acting entity and all
16 | other entities that control, are controlled by, or are under common
17 | control with that entity. For the purposes of this definition,
18 | "control" means (i) the power, direct or indirect, to cause the
19 | direction or management of such entity, whether by contract or
20 | otherwise, or (ii) ownership of fifty percent (50%) or more of the
21 | outstanding shares, or (iii) beneficial ownership of such entity.
22 |
23 | "You" (or "Your") shall mean an individual or Legal Entity
24 | exercising permissions granted by this License.
25 |
26 | "Source" form shall mean the preferred form for making modifications,
27 | including but not limited to software source code, documentation
28 | source, and configuration files.
29 |
30 | "Object" form shall mean any form resulting from mechanical
31 | transformation or translation of a Source form, including but
32 | not limited to compiled object code, generated documentation,
33 | and conversions to other media types.
34 |
35 | "Work" shall mean the work of authorship, whether in Source or
36 | Object form, made available under the License, as indicated by a
37 | copyright notice that is included in or attached to the work
38 | (an example is provided in the Appendix below).
39 |
40 | "Derivative Works" shall mean any work, whether in Source or Object
41 | form, that is based on (or derived from) the Work and for which the
42 | editorial revisions, annotations, elaborations, or other modifications
43 | represent, as a whole, an original work of authorship. For the purposes
44 | of this License, Derivative Works shall not include works that remain
45 | separable from, or merely link (or bind by name) to the interfaces of,
46 | the Work and Derivative Works thereof.
47 |
48 | "Contribution" shall mean any work of authorship, including
49 | the original version of the Work and any modifications or additions
50 | to that Work or Derivative Works thereof, that is intentionally
51 | submitted to Licensor for inclusion in the Work by the copyright owner
52 | or by an individual or Legal Entity authorized to submit on behalf of
53 | the copyright owner. For the purposes of this definition, "submitted"
54 | means any form of electronic, verbal, or written communication sent
55 | to the Licensor or its representatives, including but not limited to
56 | communication on electronic mailing lists, source code control systems,
57 | and issue tracking systems that are managed by, or on behalf of, the
58 | Licensor for the purpose of discussing and improving the Work, but
59 | excluding communication that is conspicuously marked or otherwise
60 | designated in writing by the copyright owner as "Not a Contribution."
61 |
62 | "Contributor" shall mean Licensor and any individual or Legal Entity
63 | on behalf of whom a Contribution has been received by Licensor and
64 | subsequently incorporated within the Work.
65 |
66 | 2. Grant of Copyright License. Subject to the terms and conditions of
67 | this License, each Contributor hereby grants to You a perpetual,
68 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable
69 | copyright license to reproduce, prepare Derivative Works of,
70 | publicly display, publicly perform, sublicense, and distribute the
71 | Work and such Derivative Works in Source or Object form.
72 |
73 | 3. Grant of Patent License. Subject to the terms and conditions of
74 | this License, each Contributor hereby grants to You a perpetual,
75 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable
76 | (except as stated in this section) patent license to make, have made,
77 | use, offer to sell, sell, import, and otherwise transfer the Work,
78 | where such license applies only to those patent claims licensable
79 | by such Contributor that are necessarily infringed by their
80 | Contribution(s) alone or by combination of their Contribution(s)
81 | with the Work to which such Contribution(s) was submitted. If You
82 | institute patent litigation against any entity (including a
83 | cross-claim or counterclaim in a lawsuit) alleging that the Work
84 | or a Contribution incorporated within the Work constitutes direct
85 | or contributory patent infringement, then any patent licenses
86 | granted to You under this License for that Work shall terminate
87 | as of the date such litigation is filed.
88 |
89 | 4. Redistribution. You may reproduce and distribute copies of the
90 | Work or Derivative Works thereof in any medium, with or without
91 | modifications, and in Source or Object form, provided that You
92 | meet the following conditions:
93 |
94 | (a) You must give any other recipients of the Work or
95 | Derivative Works a copy of this License; and
96 |
97 | (b) You must cause any modified files to carry prominent notices
98 | stating that You changed the files; and
99 |
100 | (c) You must retain, in the Source form of any Derivative Works
101 | that You distribute, all copyright, patent, trademark, and
102 | attribution notices from the Source form of the Work,
103 | excluding those notices that do not pertain to any part of
104 | the Derivative Works; and
105 |
106 | (d) If the Work includes a "NOTICE" text file as part of its
107 | distribution, then any Derivative Works that You distribute must
108 | include a readable copy of the attribution notices contained
109 | within such NOTICE file, excluding those notices that do not
110 | pertain to any part of the Derivative Works, in at least one
111 | of the following places: within a NOTICE text file distributed
112 | as part of the Derivative Works; within the Source form or
113 | documentation, if provided along with the Derivative Works; or,
114 | within a display generated by the Derivative Works, if and
115 | wherever such third-party notices normally appear. The contents
116 | of the NOTICE file are for informational purposes only and
117 | do not modify the License. You may add Your own attribution
118 | notices within Derivative Works that You distribute, alongside
119 | or as an addendum to the NOTICE text from the Work, provided
120 | that such additional attribution notices cannot be construed
121 | as modifying the License.
122 |
123 | You may add Your own copyright statement to Your modifications and
124 | may provide additional or different license terms and conditions
125 | for use, reproduction, or distribution of Your modifications, or
126 | for any such Derivative Works as a whole, provided Your use,
127 | reproduction, and distribution of the Work otherwise complies with
128 | the conditions stated in this License.
129 |
130 | 5. Submission of Contributions. Unless You explicitly state otherwise,
131 | any Contribution intentionally submitted for inclusion in the Work
132 | by You to the Licensor shall be under the terms and conditions of
133 | this License, without any additional terms or conditions.
134 | Notwithstanding the above, nothing herein shall supersede or modify
135 | the terms of any separate license agreement you may have executed
136 | with Licensor regarding such Contributions.
137 |
138 | 6. Trademarks. This License does not grant permission to use the trade
139 | names, trademarks, service marks, or product names of the Licensor,
140 | except as required for reasonable and customary use in describing the
141 | origin of the Work and reproducing the content of the NOTICE file.
142 |
143 | 7. Disclaimer of Warranty. Unless required by applicable law or
144 | agreed to in writing, Licensor provides the Work (and each
145 | Contributor provides its Contributions) on an "AS IS" BASIS,
146 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
147 | implied, including, without limitation, any warranties or conditions
148 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
149 | PARTICULAR PURPOSE. You are solely responsible for determining the
150 | appropriateness of using or redistributing the Work and assume any
151 | risks associated with Your exercise of permissions under this License.
152 |
153 | 8. Limitation of Liability. In no event and under no legal theory,
154 | whether in tort (including negligence), contract, or otherwise,
155 | unless required by applicable law (such as deliberate and grossly
156 | negligent acts) or agreed to in writing, shall any Contributor be
157 | liable to You for damages, including any direct, indirect, special,
158 | incidental, or consequential damages of any character arising as a
159 | result of this License or out of the use or inability to use the
160 | Work (including but not limited to damages for loss of goodwill,
161 | work stoppage, computer failure or malfunction, or any and all
162 | other commercial damages or losses), even if such Contributor
163 | has been advised of the possibility of such damages.
164 |
165 | 9. Accepting Warranty or Additional Liability. While redistributing
166 | the Work or Derivative Works thereof, You may choose to offer,
167 | and charge a fee for, acceptance of support, warranty, indemnity,
168 | or other liability obligations and/or rights consistent with this
169 | License. However, in accepting such obligations, You may act only
170 | on Your own behalf and on Your sole responsibility, not on behalf
171 | of any other Contributor, and only if You agree to indemnify,
172 | defend, and hold each Contributor harmless for any liability
173 | incurred by, or claims asserted against, such Contributor by reason
174 | of your accepting any such warranty or additional liability.
175 |
176 | END OF TERMS AND CONDITIONS
177 |
178 | APPENDIX: How to apply the Apache License to your work.
179 |
180 | To apply the Apache License to your work, attach the following
181 | boilerplate notice, with the fields enclosed by brackets "[]"
182 | replaced with your own identifying information. (Don't include
183 | the brackets!) The text should be enclosed in the appropriate
184 | comment syntax for the file format. We also recommend that a
185 | file or class name and description of purpose be included on the
186 | same "printed page" as the copyright notice for easier
187 | identification within third-party archives.
188 |
189 | Copyright [yyyy] [name of copyright owner]
190 |
191 | Licensed under the Apache License, Version 2.0 (the "License");
192 | you may not use this file except in compliance with the License.
193 | You may obtain a copy of the License at
194 |
195 | http://www.apache.org/licenses/LICENSE-2.0
196 |
197 | Unless required by applicable law or agreed to in writing, software
198 | distributed under the License is distributed on an "AS IS" BASIS,
199 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
200 | See the License for the specific language governing permissions and
201 | limitations under the License.
202 |
--------------------------------------------------------------------------------
/MAINTAINERS.md:
--------------------------------------------------------------------------------
1 | # Maintainers Guide
2 |
3 | This guide is intended for maintainers: anybody with commit access to one or
4 | more Code Pattern repositories.
5 |
6 | ## Methodology
7 |
8 | This repository does not have a traditional release management cycle, but
9 | should instead be maintained as a useful, working, and polished reference at
10 | all times. While all work can therefore be focused on the master branch, the
11 | quality of this branch should never be compromised.
12 |
13 | The remainder of this document details how to merge pull requests to the
14 | repositories.
15 |
16 | ## Merge approval
17 |
18 | The project maintainers use LGTM (Looks Good To Me) in comments on the pull
19 | request to indicate acceptance prior to merging. A change requires LGTMs from
20 | two project maintainers. If the code is written by a maintainer, the change
21 | only requires one additional LGTM.
22 |
23 | ## Reviewing Pull Requests
24 |
25 | We recommend reviewing pull requests directly within GitHub. This allows a
26 | public commentary on changes, providing transparency for all users. When
27 | providing feedback be civil, courteous, and kind. Disagreement is fine, so long
28 | as the discourse is carried out politely. If we see a record of uncivil or
29 | abusive comments, we will revoke your commit privileges and invite you to leave
30 | the project.
31 |
32 | During your review, consider the following points:
33 |
34 | ### Does the change have positive impact?
35 |
36 | Some proposed changes may not represent a positive impact to the project. Ask
37 | whether or not the change will make understanding the code easier, or if it
38 | could simply be a personal preference on the part of the author (see
39 | [bikeshedding](https://en.wiktionary.org/wiki/bikeshedding)).
40 |
41 | Pull requests that do not have a clear positive impact should be closed without
42 | merging.
43 |
44 | ### Do the changes make sense?
45 |
46 | If you do not understand what the changes are or what they accomplish, ask the
47 | author for clarification. Ask the author to add comments and/or clarify test
48 | case names to make the intentions clear.
49 |
50 | At times, such clarification will reveal that the author may not be using the
51 | code correctly, or is unaware of features that accommodate their needs. If you
52 | feel this is the case, work up a code sample that would address the pull
53 | request for them, and feel free to close the pull request once they confirm.
54 |
55 | ### Does the change introduce a new feature?
56 |
57 | For any given pull request, ask yourself "is this a new feature?" If so, does
58 | the pull request (or associated issue) contain narrative indicating the need
59 | for the feature? If not, ask them to provide that information.
60 |
61 | Are new unit tests in place that test all new behaviors introduced? If not, do
62 | not merge the feature until they are! Is documentation in place for the new
63 | feature? (See the documentation guidelines.) If not, do not merge the feature
64 | until it is! Is the feature necessary for general use cases? Try to keep the
65 | scope of any given component narrow. If a proposed feature does not fit that
66 | scope, recommend to the user that they maintain the feature on their own, and
67 | close the request. You may also recommend that they see if the feature gains
68 | traction among other users, and suggest they re-submit when they can show such
69 | support.
70 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Run Node.js code in Jupyter notebooks
2 |
3 | Notebooks are where data scientists process, analyse, and visualise data in an iterative, collaborative environment. They typically run environments for languages like Python, R, and Scala. For years, data science notebooks have served academics and research scientists as a scratchpad for writing code, refining algorithms, and sharing and proving their work. Today, it's a workflow that lends itself well to web developers experimenting with data sets in Node.js.
4 |
5 | To that end, [pixiedust_node](https://github.com/pixiedust/pixiedust_node) is an add-on for Jupyter notebooks that allows Node.js/JavaScript to run inside notebook cells. To learn more, follow the setup steps and explore the getting started notebook, or click the sample image below to preview the output.
6 |
7 | [](https://nbviewer.jupyter.org/github/IBM/nodejs-in-notebooks/blob/master/data/examples/nodebook_1.ipynb)
8 |
9 | When the reader has completed this Code Pattern, they will understand how to:
10 |
11 | * Run Node.js/JavaScript inside a Jupyter Notebook
12 | * Use JavaScript variables, functions, and promises
13 | * Work with remote data sources
14 | * Share data between Python and Node.js
15 |
16 | 
17 |
18 | ## Flow
19 |
20 | 1. Install Node.js in target environment (Watson Studio or a local machine)
21 | 2. Open Node.js notebook in target environment
22 | 3. Run Node.js notebook
23 |
24 | ## Included Components
25 | * [Watson Studio](https://www.ibm.com/cloud/watson-studio): Analyze data using RStudio, Jupyter, and Python in a configured, collaborative environment that includes IBM value-adds, such as managed Spark.
26 | * [Jupyter Notebook](https://jupyter.org/): An open-source web application that allows you to create and share documents that contain live code, equations, visualizations and explanatory text.
27 | * [PixieDust](https://github.com/pixiedust/pixiedust): Provides a Python helper library for IPython Notebook.
28 | * [Cloudant NoSQL DB](https://cloud.ibm.com/catalog/services/cloudant): A fully managed data layer designed for modern web and mobile applications that leverages a flexible JSON schema.
29 |
30 | ## Featured Technologies
31 | * [pixiedust_node](https://github.com/pixiedust/pixiedust_node): An open source Python package that adds support for running JavaScript/Node.js code in notebook cells.
32 | * [Node.js](https://nodejs.org/): An open-source JavaScript run-time environment for executing server-side JavaScript code.
33 |
34 | # Steps
35 |
36 | You can run Node.js code in Watson Studio or your local environment:
37 | * [Run Node.js notebooks in Watson Studio](#run-nodejs-notebooks-in-watson-studio)
38 | * [Run Node.js notebooks in a local environment](#run-nodejs-notebooks-in-a-local-environment)
39 |
40 | To preview an example notebook without going through a setup [follow this link](https://nbviewer.jupyter.org/github/IBM/nodejs-in-notebooks/blob/master/data/examples/nodebook_1.ipynb).
41 |
42 | ## Run Node.js notebooks in Watson Studio
43 |
44 | ### Creating a custom runtime environment
45 |
46 | A runtime environment in Watson Studio (IBM's Data Science platform) is defined by its hardware and software configuration. By default, Node.js is not installed in runtime environments and you therefore need to create a custom runtime environment definition. [[Learn more about environments...]](https://dataplatform.ibm.com/docs/content/analyze-data/notebook-environments.html)
47 |
48 | * Open [Watson Studio](https://www.ibm.com/cloud/watson-studio) in your web browser. Sign up for a free account if necessary.
49 | * [Create a "Complete" project.](https://dataplatform.ibm.com/projects?context=analytics) [[Learn more about projects...]](https://dataplatform.ibm.com/docs/content/manage-data/manage-projects.html)
50 | * In this project, open the **Environments** tab. A list of existing environment definitions for Python and R is displayed.
51 | * Create a new environment definition.
52 | * Assign a name to the new environment definition, such as `Python 2 with Node.js`.
53 | * Enter a brief environment description.
54 | * Choose the desired hardware configuration, such as a minimalist free setup (which is sufficient for demonstration purposes).
55 | * Select Python 2 as _software version_. (Python 3 is currently not supported by pixiedust_node.)
56 | * `Create` the environment definition.
57 | * Customize the software definition.
58 | * Add the [nodejs conda package](https://anaconda.org/anaconda/nodejs) dependency, as shown below:
59 | ```
60 | # Please add conda channels here
61 | channels:
62 | - defaults
63 |
64 | # Please add conda packages here
65 | dependencies:
66 | - nodejs
67 |
68 | # Please add pip packages here
69 | # To add pip packages, please comment out the next line
70 | #- pip:
71 | ```
72 | * `Apply` the customization. It should look as follows:
73 |
74 | 
75 |
76 | You can now associate notebooks with this environment definition and run Node.js in the code cells, as illustrated in the getting started notebook.
77 | > Note: An environment definition is only available within the project in which it was defined.
78 |
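Once a notebook is associated with this environment, pixiedust_node is loaded with a plain Python `import`, after which the `%%node` cell magic runs JavaScript. A minimal sketch (the cell contents here are illustrative, not taken from the getting started notebook):

```
# first cell (Python): load the add-on, which starts a Node.js subprocess
import pixiedust_node
```

```
%%node
// subsequent cells prefixed with %%node run as JavaScript
var total = 1 + 2;
console.log(total);
```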
79 | ### Loading the getting started notebook
80 |
81 | The [getting started notebook](notebooks/nodebook_1.ipynb) outlines how to
82 | * use variables, functions, and promises,
83 | * work with remote data sources, such as Apache CouchDB (or its managed sibling Cloudant),
84 | * visualize data,
85 | * share data between Python and Node.js.
86 |
87 | In the project you've created, add a new notebook _from URL_:
88 | * Enter any notebook name.
89 | * Specify remote URL `https://raw.githubusercontent.com/IBM/nodebook-code-pattern/master/notebooks/nodebook_1.ipynb` as source.
90 | * Select the custom runtime environment `Python 2 with Node.js` that you created earlier.
91 |
92 | 
93 |
94 | Follow the notebook instructions.
95 | > You should be able to run all cells one at a time without making any changes. Do not use **Run All**.
96 |
97 | ***
98 |
99 | ## Run Node.js notebooks in a local environment
100 |
101 | ### Prerequisites
102 | To get started with nodebooks, you'll need local installations of:
103 |
104 | * [PixieDust and its prerequisites](https://pixiedust.github.io/pixiedust/install.html)
105 | * A Python 2.7 kernel with Spark 2.x. (see section *Install a Jupyter Kernel* in [the PixieDust installation instructions](https://pixiedust.github.io/pixiedust/install.html))
106 | * [Node.js/npm](https://nodejs.org/en/download/)
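With those prerequisites in place, the two notebook packages install with `pip`, as the getting started notebook does in its first cell (shown here for reference; depending on your setup you may need a Python 2.7 virtualenv):

```
$ pip install pixiedust pixiedust_node
```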
107 |
108 |
109 | ### Installing the samples
110 |
111 | To access the samples, clone this repository and launch a Jupyter server on your local machine.
112 |
113 | ```
114 | $ git clone https://github.com/IBM/nodejs-in-notebooks.git
115 | $ cd nodejs-in-notebooks
116 | $ jupyter notebook notebooks/
117 | ```
118 |
119 | ### Running the samples
120 |
121 | Open [nodebook_1](notebooks/nodebook_1.ipynb) to learn more about
122 |
123 | * using variables, functions, and promises,
124 | * working with remote data sources, such as Apache CouchDB (or its managed sibling Cloudant),
125 | * visualizing data,
126 | * sharing data between Python and Node.js.
127 |
128 | > You should be able to run all cells one at a time without making any changes. Do not use **Run All**.
129 |
130 | ***
131 |
132 | ## Optional data source customization
133 |
134 | Some of the nodebook code pattern examples access a read-only Cloudant database for illustrative purposes. If you prefer, you can create your own copy of this database by replicating from remote database URL `https://56953ed8-3fba-4f7e-824e-5498c8e1d18e-bluemix.cloudant.com/cities`. [[Learn more about database replication...]](https://developer.ibm.com/clouddataservices/docs/cloudant/replication/)
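Cloudant and CouchDB replicate by accepting a replication document that names a source and a target; POSTing a document like the following to your account's `_replicate` endpoint copies the read-only database into your own (a sketch: `USERNAME`, `PASSWORD`, and `YOUR-ACCOUNT` are placeholders for your own Cloudant credentials):

```
{
  "source": "https://56953ed8-3fba-4f7e-824e-5498c8e1d18e-bluemix.cloudant.com/cities",
  "target": "https://USERNAME:PASSWORD@YOUR-ACCOUNT.cloudant.com/cities",
  "create_target": true
}
```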
135 |
136 | # Sample Output
137 |
138 |
139 | Open [this link](https://nbviewer.jupyter.org/github/IBM/nodejs-in-notebooks/blob/master/data/examples/nodebook_1.ipynb) to preview the completed notebook.
140 |
141 | # Links
142 | * [pixiedust_node](https://github.com/pixiedust/pixiedust_node)
143 | * [pixiedust](https://github.com/pixiedust/pixiedust)
144 | * [Nodebooks: Introducing Node.js Data Science Notebooks](https://medium.com/ibm-watson-data-lab/nodebooks-node-js-data-science-notebooks-aa140bea21ba)
145 | * [Nodebooks: Sharing Data Between Node.js & Python](https://medium.com/ibm-watson-data-lab/nodebooks-sharing-data-between-node-js-python-3a4acae27a02)
146 | * [Sharing Variables Between Python & Node.js in Jupyter Notebooks](https://medium.com/ibm-watson-data-lab/sharing-variables-between-python-node-js-in-jupyter-notebooks-682a79d4bdd9)
147 |
148 | # Learn more
149 | * **Watson Studio**: Master the art of data science with IBM's [Watson Studio](https://www.ibm.com/cloud/watson-studio/)
150 | * **Data Analytics Code Patterns**: Enjoyed this Code Pattern? Check out our other [Data Analytics Code Patterns](https://developer.ibm.com/technologies/data-science/)
151 | * **With Watson**: Want to take your Watson app to the next level? Looking to utilize Watson Brand assets? [Join the With Watson program](https://www.ibm.com/watson/with-watson/) to leverage exclusive brand, marketing, and tech resources to amplify and accelerate your Watson embedded commercial solution.
152 |
153 | # License
154 |
155 | This code pattern is licensed under the Apache Software License, Version 2. Separate third party code objects invoked within this code pattern are licensed by their respective providers pursuant to their own separate licenses. Contributions are subject to the [Developer Certificate of Origin, Version 1.1 (DCO)](https://developercertificate.org/) and the [Apache Software License, Version 2](https://www.apache.org/licenses/LICENSE-2.0.txt).
156 |
157 | [Apache Software License (ASL) FAQ](https://www.apache.org/foundation/license-faq.html#WhatDoesItMEAN)
158 |
--------------------------------------------------------------------------------
/data/examples/nodebook_1.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "## Nodebooks: Introducing Node.js Data Science Notebooks\n",
8 | "\n",
9 | "Notebooks are where data scientists process, analyse, and visualise data in an iterative, collaborative environment. They typically run environments for languages like Python, R, and Scala. For years, data science notebooks have served academics and research scientists as a scratchpad for writing code, refining algorithms, and sharing and proving their work. Today, it's a workflow that lends itself well to web developers experimenting with data sets in Node.js.\n",
10 | "\n",
11 | "To that end, pixiedust_node is an add-on for Jupyter notebooks that allows Node.js/JavaScript to run inside notebook cells. Not only can web developers use the same workflow for collaborating in Node.js, but they can also use the same tools to work with existing data scientists coding in Python.\n",
12 | "\n",
13 | "pixiedust_node is built on the popular PixieDust helper library. Let’s get started.\n",
14 | "\n",
15 | "> Note: Run one cell at a time or unexpected results might be observed.\n",
16 | "\n",
17 | "\n",
18 | "## Part 1: Variables, functions, and promises\n",
19 | "\n",
20 | "\n",
21 | "### Installing\n",
22 | "Install the [`pixiedust`](https://pypi.python.org/pypi/pixiedust) and [`pixiedust_node`](https://pypi.python.org/pypi/pixiedust-node) packages using `pip`, the Python package manager. "
23 | ]
24 | },
25 | {
26 | "cell_type": "code",
27 | "execution_count": 1,
28 | "metadata": {},
29 | "outputs": [
30 | {
31 | "name": "stdout",
32 | "output_type": "stream",
33 | "text": [
34 | "Requirement already up-to-date: pixiedust in /opt/conda/envs/DSX-Python27/lib/python2.7/site-packages\n",
35 | "Requirement not upgraded as not directly required: lxml in /opt/conda/envs/DSX-Python27/lib/python2.7/site-packages (from pixiedust)\n",
36 | "Requirement not upgraded as not directly required: geojson in /opt/conda/envs/DSX-Python27/lib/python2.7/site-packages (from pixiedust)\n",
37 | "Requirement not upgraded as not directly required: colour in /opt/conda/envs/DSX-Python27/lib/python2.7/site-packages (from pixiedust)\n",
38 | "Requirement not upgraded as not directly required: mpld3 in /opt/conda/envs/DSX-Python27/lib/python2.7/site-packages (from pixiedust)\n",
39 | "Requirement not upgraded as not directly required: astunparse in /opt/conda/envs/DSX-Python27/lib/python2.7/site-packages (from pixiedust)\n",
40 | "Requirement not upgraded as not directly required: markdown in /opt/conda/envs/DSX-Python27/lib/python2.7/site-packages (from pixiedust)\n",
41 | "Requirement not upgraded as not directly required: six<2.0,>=1.6.1 in /opt/conda/envs/DSX-Python27/lib/python2.7/site-packages (from astunparse->pixiedust)\n",
42 | "Requirement not upgraded as not directly required: wheel<1.0,>=0.23.0 in /opt/conda/envs/DSX-Python27/lib/python2.7/site-packages (from astunparse->pixiedust)\n",
43 | "Requirement already up-to-date: pixiedust_node in /opt/conda/envs/DSX-Python27/lib/python2.7/site-packages\n",
44 | "Requirement not upgraded as not directly required: pixiedust in /opt/conda/envs/DSX-Python27/lib/python2.7/site-packages (from pixiedust_node)\n",
45 | "Requirement not upgraded as not directly required: pandas in /opt/conda/envs/DSX-Python27/lib/python2.7/site-packages (from pixiedust_node)\n",
46 | "Requirement not upgraded as not directly required: ipython in /opt/conda/envs/DSX-Python27/lib/python2.7/site-packages (from pixiedust_node)\n",
47 | "Requirement not upgraded as not directly required: lxml in /opt/conda/envs/DSX-Python27/lib/python2.7/site-packages (from pixiedust->pixiedust_node)\n",
48 | "Requirement not upgraded as not directly required: geojson in /opt/conda/envs/DSX-Python27/lib/python2.7/site-packages (from pixiedust->pixiedust_node)\n",
49 | "Requirement not upgraded as not directly required: colour in /opt/conda/envs/DSX-Python27/lib/python2.7/site-packages (from pixiedust->pixiedust_node)\n",
50 | "Requirement not upgraded as not directly required: mpld3 in /opt/conda/envs/DSX-Python27/lib/python2.7/site-packages (from pixiedust->pixiedust_node)\n",
51 | "Requirement not upgraded as not directly required: astunparse in /opt/conda/envs/DSX-Python27/lib/python2.7/site-packages (from pixiedust->pixiedust_node)\n",
52 | "Requirement not upgraded as not directly required: markdown in /opt/conda/envs/DSX-Python27/lib/python2.7/site-packages (from pixiedust->pixiedust_node)\n",
53 | "Requirement not upgraded as not directly required: python-dateutil in /opt/conda/envs/DSX-Python27/lib/python2.7/site-packages (from pandas->pixiedust_node)\n",
54 | "Requirement not upgraded as not directly required: pytz>=2011k in /opt/conda/envs/DSX-Python27/lib/python2.7/site-packages (from pandas->pixiedust_node)\n",
55 | "Requirement not upgraded as not directly required: numpy>=1.9.0 in /opt/conda/envs/DSX-Python27/lib/python2.7/site-packages (from pandas->pixiedust_node)\n",
56 | "Requirement not upgraded as not directly required: setuptools>=18.5 in /opt/conda/envs/DSX-Python27/lib/python2.7/site-packages (from ipython->pixiedust_node)\n",
57 | "Requirement not upgraded as not directly required: decorator in /opt/conda/envs/DSX-Python27/lib/python2.7/site-packages (from ipython->pixiedust_node)\n",
58 | "Requirement not upgraded as not directly required: pickleshare in /opt/conda/envs/DSX-Python27/lib/python2.7/site-packages (from ipython->pixiedust_node)\n",
59 | "Requirement not upgraded as not directly required: simplegeneric>0.8 in /opt/conda/envs/DSX-Python27/lib/python2.7/site-packages (from ipython->pixiedust_node)\n",
60 | "Requirement not upgraded as not directly required: traitlets>=4.2 in /opt/conda/envs/DSX-Python27/lib/python2.7/site-packages (from ipython->pixiedust_node)\n",
61 | "Requirement not upgraded as not directly required: prompt_toolkit<2.0.0,>=1.0.4 in /opt/conda/envs/DSX-Python27/lib/python2.7/site-packages (from ipython->pixiedust_node)\n",
62 | "Requirement not upgraded as not directly required: pygments in /opt/conda/envs/DSX-Python27/lib/python2.7/site-packages (from ipython->pixiedust_node)\n",
63 | "Requirement not upgraded as not directly required: pexpect in /opt/conda/envs/DSX-Python27/lib/python2.7/site-packages (from ipython->pixiedust_node)\n",
64 | "Requirement not upgraded as not directly required: backports.shutil_get_terminal_size in /opt/conda/envs/DSX-Python27/lib/python2.7/site-packages (from ipython->pixiedust_node)\n",
65 | "Requirement not upgraded as not directly required: pathlib2 in /opt/conda/envs/DSX-Python27/lib/python2.7/site-packages (from ipython->pixiedust_node)\n",
66 | "Requirement not upgraded as not directly required: six<2.0,>=1.6.1 in /opt/conda/envs/DSX-Python27/lib/python2.7/site-packages (from astunparse->pixiedust->pixiedust_node)\n",
67 | "Requirement not upgraded as not directly required: wheel<1.0,>=0.23.0 in /opt/conda/envs/DSX-Python27/lib/python2.7/site-packages (from astunparse->pixiedust->pixiedust_node)\n",
68 | "Requirement not upgraded as not directly required: ipython_genutils in /opt/conda/envs/DSX-Python27/lib/python2.7/site-packages (from traitlets>=4.2->ipython->pixiedust_node)\n",
69 | "Requirement not upgraded as not directly required: enum34 in /opt/conda/envs/DSX-Python27/lib/python2.7/site-packages (from traitlets>=4.2->ipython->pixiedust_node)\n",
70 | "Requirement not upgraded as not directly required: wcwidth in /opt/conda/envs/DSX-Python27/lib/python2.7/site-packages (from prompt_toolkit<2.0.0,>=1.0.4->ipython->pixiedust_node)\n",
71 | "Requirement not upgraded as not directly required: scandir in /opt/conda/envs/DSX-Python27/lib/python2.7/site-packages (from pathlib2->ipython->pixiedust_node)\n"
72 | ]
73 | }
74 | ],
75 | "source": [
76 | "# install or upgrade the packages\n",
77 | "# restart the kernel to pick up the latest version\n",
78 | "!pip install pixiedust --upgrade\n",
79 | "!pip install pixiedust_node --upgrade"
80 | ]
81 | },
82 | {
83 | "cell_type": "markdown",
84 | "metadata": {},
85 | "source": [
86 | "### Using pixiedust_node\n",
87 | "Now we can import `pixiedust_node` into our notebook:"
88 | ]
89 | },
90 | {
91 | "cell_type": "code",
92 | "execution_count": 2,
93 | "metadata": {},
94 | "outputs": [
95 | {
96 | "name": "stdout",
97 | "output_type": "stream",
98 | "text": [
99 | "Pixiedust database opened successfully\n"
100 | ]
101 | },
102 | {
103 | "data": {
104 | "text/html": [
105 | "\n",
106 | "\n",
112 | " "
113 | ],
114 | "text/plain": [
115 | ""
116 | ]
117 | },
118 | "metadata": {},
119 | "output_type": "display_data"
120 | },
121 | {
122 | "data": {
123 | "text/html": [
124 | "\n",
125 | " \n"
131 | ],
132 | "text/plain": [
133 | ""
134 | ]
135 | },
136 | "metadata": {},
137 | "output_type": "display_data"
138 | },
139 | {
140 | "name": "stdout",
141 | "output_type": "stream",
142 | "text": [
143 | "pixiedust_node 0.2.5 started. Cells starting '%%node' may contain Node.js code.\n"
144 | ]
145 | }
146 | ],
147 | "source": [
148 | "import pixiedust_node"
149 | ]
150 | },
151 | {
152 | "cell_type": "markdown",
153 | "metadata": {},
154 | "source": [
155 | "And then we can write JavaScript code in cells whose first line is `%%node`:"
156 | ]
157 | },
158 | {
159 | "cell_type": "code",
160 | "execution_count": 3,
161 | "metadata": {},
162 | "outputs": [],
163 | "source": [
164 | "%%node\n",
165 | "// get the current date\n",
166 | "var date = new Date();"
167 | ]
168 | },
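{
"cell_type": "markdown",
"metadata": {},
"source": [
"The Node.js process persists between cells, so `date` remains defined after this cell runs. For example, a later `%%node` cell (a hypothetical follow-on, not part of the original sequence) could print it:\n",
"\n",
"```\n",
"%%node\n",
"console.log(date.toISOString());\n",
"```"
]
},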
169 | {
170 | "cell_type": "markdown",
171 | "metadata": {},
172 | "source": [
173 | "It’s that easy! We can have Python and Node.js in the same notebook. Cells are Python by default, but simply starting a cell with `%%node` indicates that the next lines will be JavaScript."
174 | ]
175 | },
176 | {
177 | "cell_type": "markdown",
178 | "metadata": {},
179 | "source": [
180 | "### Displaying HTML and images in notebook cells\n",
181 | "We can use the `html` function to render HTML code in a cell:"
182 | ]
183 | },
184 | {
185 | "cell_type": "code",
186 | "execution_count": 4,
187 | "metadata": {},
188 | "outputs": [
189 | {
190 | "data": {
191 | "text/html": [
192 | "Quote \"Imagination is more important than knowledge\"\n",
193 | "Albert Einstein"
194 | ],
195 | "text/plain": [
196 | ""
197 | ]
198 | },
199 | "metadata": {},
200 | "output_type": "display_data"
201 | }
202 | ],
203 | "source": [
204 | "%%node\n",
205 | "var str = 'Quote \"Imagination is more important than knowledge\"\\nAlbert Einstein';\n",
206 | "html(str)"
207 | ]
208 | },
209 | {
210 | "cell_type": "markdown",
211 | "metadata": {},
212 | "source": [
213 | "If we have an image we want to render, we can do that with the `image` function:"
214 | ]
215 | },
216 | {
217 | "cell_type": "code",
218 | "execution_count": 5,
219 | "metadata": {},
220 | "outputs": [
221 | {
222 | "data": {
223 | "text/html": [
224 | ""
225 | ],
226 | "text/plain": [
227 | ""
228 | ]
229 | },
230 | "metadata": {},
231 | "output_type": "display_data"
232 | }
233 | ],
234 | "source": [
235 | "%%node\n",
236 | "var url = 'https://github.com/IBM/nodejs-in-notebooks/blob/master/notebooks/images/pixiedust_node_schematic.png?raw=true';\n",
237 | "image(url);"
238 | ]
239 | },
240 | {
241 | "cell_type": "markdown",
242 | "metadata": {},
243 | "source": [
244 | "### Printing JavaScript variables\n",
245 | "\n",
246 | "Print variables using `console.log`."
247 | ]
248 | },
249 | {
250 | "cell_type": "code",
251 | "execution_count": 6,
252 | "metadata": {},
253 | "outputs": [
254 | {
255 | "name": "stdout",
256 | "output_type": "stream",
257 | "text": [
258 | "{ a: 1, b: 'two', c: true }\n"
259 | ]
260 | }
261 | ],
262 | "source": [
263 | "%%node\n",
264 | "var x = { a:1, b:'two', c: true };\n",
265 | "console.log(x);"
266 | ]
267 | },
268 | {
269 | "cell_type": "markdown",
270 | "metadata": {},
271 | "source": [
272 | "Calling the `print` function within your JavaScript code is the same as calling `print` in your Python code."
273 | ]
274 | },
275 | {
276 | "cell_type": "code",
277 | "execution_count": 7,
278 | "metadata": {},
279 | "outputs": [
280 | {
281 | "name": "stdout",
282 | "output_type": "stream",
283 | "text": [
284 | "{\"a\": 3, \"c\": false, \"b\": \"four\"}\n"
285 | ]
286 | }
287 | ],
288 | "source": [
289 | "%%node\n",
290 | "var y = { a:3, b:'four', c: false };\n",
291 | "print(y);"
292 | ]
293 | },
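{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note the difference between the two outputs above: `console.log` uses Node.js's object formatting (single-quoted strings, unquoted keys), while `print` appears to serialize the value as JSON, roughly equivalent to this sketch:\n",
"\n",
"```\n",
"%%node\n",
"console.log(JSON.stringify(y));\n",
"```"
]
},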
294 | {
295 | "cell_type": "markdown",
296 | "metadata": {},
297 | "source": [
298 | "### Visualizing data using PixieDust\n",
299 | "You can also use PixieDust’s `display` function to render data graphically. Configured as a line chart, the visualization looks as follows:\n",
300 | ]
301 | },
302 | {
303 | "cell_type": "code",
304 | "execution_count": 8,
305 | "metadata": {
306 | "pixiedust": {
307 | "displayParams": {
308 | "aggregation": "SUM",
309 | "chartsize": "99",
310 | "handlerId": "lineChart",
311 | "keyFields": "x",
312 | "rowCount": "500",
313 | "valueFields": "cos,sin"
314 | }
315 | }
316 | },
317 | "outputs": [],
318 | "source": [
319 | "%%node\n",
320 | "var data = [];\n",
321 | "for (var i = 0; i < 1000; i++) {\n",
322 | " var x = 2*Math.PI * i/ 360;\n",
323 | " var obj = {\n",
324 | " x: x,\n",
325 | " i: i,\n",
326 | " sin: Math.sin(x),\n",
327 | " cos: Math.cos(x),\n",
328 | " tan: Math.tan(x)\n",
329 | " };\n",
330 | " data.push(obj);\n",
331 | "}\n",
332 | "// render data \n",
333 | "display(data);"
334 | ]
335 | },
336 | {
337 | "cell_type": "markdown",
338 | "metadata": {},
339 | "source": [
340 | "\n",
341 | "\n",
342 | "PixieDust presents visualisations of DataFrames using Matplotlib, Bokeh, Brunel, d3, Google Maps, and MapBox. No code is required on your part because PixieDust provides simple pull-down menus and a friendly point-and-click interface, allowing you to configure how the data is presented:\n",
343 | "\n",
344 | ""
345 | ]
346 | },
347 | {
348 | "cell_type": "markdown",
349 | "metadata": {},
350 | "source": [
351 | "### Adding npm modules\n",
352 | "There are thousands of libraries and tools in the npm registry, the package ecosystem for Node.js. It’s essential that we can install npm libraries and use them in our notebook code.\n",
353 | "Let’s say we want to make some HTTP calls to an external API service. We could use Node.js’s low-level HTTP library, but an easier option is the ubiquitous `request` npm module.\n",
354 | "Once we have pixiedust_node set up, installing an npm module is as simple as running `npm.install` in a Python cell:"
355 | ]
356 | },
357 | {
358 | "cell_type": "code",
359 | "execution_count": 9,
360 | "metadata": {},
361 | "outputs": [
362 | {
363 | "name": "stdout",
364 | "output_type": "stream",
365 | "text": [
366 | "/opt/conda/envs/DSX-Python27/bin/npm install -s request\n",
367 | "+ request@2.87.0\n",
368 | "updated 1 package in 0.856s\n"
369 | ]
370 | }
371 | ],
372 | "source": [
373 | "npm.install('request');"
374 | ]
375 | },
376 | {
377 | "cell_type": "markdown",
378 | "metadata": {},
379 | "source": [
380 | "Once installed, you may `require` the module in your JavaScript code:"
381 | ]
382 | },
383 | {
384 | "cell_type": "code",
385 | "execution_count": 10,
386 | "metadata": {},
387 | "outputs": [
388 | {
389 | "name": "stdout",
390 | "output_type": "stream",
391 | "text": [
392 | "... ... ... ...\n",
393 | "... ...\n",
394 | "{ iss_position: { longitude: '42.2119', latitude: '51.1124' },\n",
395 | "timestamp: 1531779053,\n",
396 | "message: 'success' }\n"
397 | ]
398 | }
399 | ],
400 | "source": [
401 | "%%node\n",
402 | "var request = require('request');\n",
403 | "var r = {\n",
404 | " method:'GET',\n",
405 | " url: 'http://api.open-notify.org/iss-now.json',\n",
406 | " json: true\n",
407 | "};\n",
408 | "request(r, function(err, req, body) {\n",
409 | " console.log(body);\n",
410 | "});\n"
411 | ]
412 | },
413 | {
414 | "cell_type": "markdown",
415 | "metadata": {},
416 | "source": [
417 | "As an HTTP request is an asynchronous action, the `request` library calls our callback function when the operation has completed. Inside that function, we can call `console.log` to render the data.\n",
418 | "We can organise our code into functions to encapsulate complexity and make it easier to reuse code. We can create a function to get the current position of the International Space Station in one notebook cell:"
419 | ]
420 | },
421 | {
422 | "cell_type": "code",
423 | "execution_count": 11,
424 | "metadata": {},
425 | "outputs": [
426 | {
427 | "name": "stdout",
428 | "output_type": "stream",
429 | "text": [
430 | "... ..... ..... ..... ..... ... ..... ..... ....... ....... ....... ....... ....... ..... ..... ...\n"
431 | ]
432 | }
433 | ],
434 | "source": [
435 | "%%node\n",
436 | "var request = require('request');\n",
437 | "var getPosition = function(callback) {\n",
438 | " var r = {\n",
439 | " method:'GET',\n",
440 | " url: 'http://api.open-notify.org/iss-now.json',\n",
441 | " json: true\n",
442 | " };\n",
443 | " request(r, function(err, req, body) {\n",
444 | " var obj = null;\n",
445 | " if (!err) {\n",
446 | " obj = body.iss_position\n",
447 | " obj.latitude = parseFloat(obj.latitude);\n",
448 | " obj.longitude = parseFloat(obj.longitude);\n",
449 | " obj.time = new Date().getTime(); \n",
450 | " }\n",
451 | " callback(err, obj);\n",
452 | " });\n",
453 | "};"
454 | ]
455 | },
456 | {
457 | "cell_type": "markdown",
458 | "metadata": {},
459 | "source": [
460 | "And use it in another cell:"
461 | ]
462 | },
463 | {
464 | "cell_type": "code",
465 | "execution_count": 12,
466 | "metadata": {},
467 | "outputs": [
468 | {
469 | "name": "stdout",
470 | "output_type": "stream",
471 | "text": [
472 | "... ...\n",
473 | "{ longitude: 42.9493, latitude: 51.1819, time: 1531779061073 }\n"
474 | ]
475 | }
476 | ],
477 | "source": [
478 | "%%node\n",
479 | "getPosition(function(err, data) {\n",
480 | " console.log(data);\n",
481 | "});"
482 | ]
483 | },
484 | {
485 | "cell_type": "markdown",
486 | "metadata": {},
487 | "source": [
488 | "### Promises\n",
489 | "If you prefer to work with JavaScript Promises when writing asynchronous code, then that’s okay too. Let’s rewrite our `getPosition` function to return a Promise. First we're going to install the `request-promise` module from npm:"
490 | ]
491 | },
492 | {
493 | "cell_type": "code",
494 | "execution_count": 13,
495 | "metadata": {},
496 | "outputs": [
497 | {
498 | "name": "stdout",
499 | "output_type": "stream",
500 | "text": [
501 | "/opt/conda/envs/DSX-Python27/bin/npm install -s request request-promise\n",
502 | "+ request@2.87.0\n",
503 | "+ request-promise@4.2.2\n",
504 | "updated 2 packages in 0.912s\n"
505 | ]
506 | }
507 | ],
508 | "source": [
509 | "npm.install( ('request', 'request-promise') )"
510 | ]
511 | },
512 | {
513 | "cell_type": "markdown",
514 | "metadata": {},
515 | "source": [
516 | "Notice how you can install multiple modules in a single call. Just pass in a Python `list` or `tuple`.\n",
517 | "Then we can refactor our function a little:"
518 | ]
519 | },
520 | {
521 | "cell_type": "code",
522 | "execution_count": 14,
523 | "metadata": {},
524 | "outputs": [
525 | {
526 | "name": "stdout",
527 | "output_type": "stream",
528 | "text": [
529 | "... ..... ..... ..... ..... ... ..... ..... ..... ..... ..... ..... ..... ...\n"
530 | ]
531 | }
532 | ],
533 | "source": [
534 | "%%node\n",
535 | "var request = require('request-promise');\n",
536 | "var getPosition = function(callback) {\n",
537 | " var r = {\n",
538 | " method:'GET',\n",
539 | " url: 'http://api.open-notify.org/iss-now.json',\n",
540 | " json: true\n",
541 | " };\n",
542 | " return request(r).then(function(body) {\n",
543 | " var obj = null;\n",
544 | " obj = body.iss_position;\n",
545 | " obj.latitude = parseFloat(obj.latitude);\n",
546 | " obj.longitude = parseFloat(obj.longitude);\n",
547 | " obj.time = new Date().getTime(); \n",
548 | " return obj;\n",
549 | " });\n",
550 | "};"
551 | ]
552 | },
553 | {
554 | "cell_type": "markdown",
555 | "metadata": {},
556 | "source": [
557 | "And call it in the Promises style:"
558 | ]
559 | },
560 | {
561 | "cell_type": "code",
562 | "execution_count": 15,
563 | "metadata": {},
564 | "outputs": [
565 | {
566 | "name": "stdout",
567 | "output_type": "stream",
568 | "text": [
569 | "... ... ... ...\n",
570 | "{ longitude: 44.0843, latitude: 51.2787, time: 1531779072259 }\n"
571 | ]
572 | }
573 | ],
574 | "source": [
575 | "%%node\n",
576 | "getPosition().then(function(data) {\n",
577 | " console.log(data);\n",
578 | "}).catch(function(err) {\n",
579 | " console.error(err); \n",
580 | "});"
581 | ]
582 | },
583 | {
584 | "cell_type": "markdown",
585 | "metadata": {},
586 | "source": [
587 | "Or call it in a more compact form:"
588 | ]
589 | },
590 | {
591 | "cell_type": "code",
592 | "execution_count": 16,
593 | "metadata": {},
594 | "outputs": [
595 | {
596 | "name": "stdout",
597 | "output_type": "stream",
598 | "text": [
599 | "{ longitude: 44.6288, latitude: 51.3208, time: 1531779077984 }\n"
600 | ]
601 | }
602 | ],
603 | "source": [
604 | "%%node\n",
605 | "getPosition().then(console.log).catch(console.error);"
606 | ]
607 | },
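{
"cell_type": "markdown",
"metadata": {},
"source": [
"If the underlying Node.js runtime supports `async`/`await` (Node.js 7.6 or later), the same call can also be written imperatively. A sketch under that assumption, not executed above:\n",
"\n",
"```\n",
"%%node\n",
"(async function() {\n",
"  try {\n",
"    var data = await getPosition();\n",
"    console.log(data);\n",
"  } catch (err) {\n",
"    console.error(err);\n",
"  }\n",
"})();\n",
"```"
]
},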
608 | {
609 | "cell_type": "markdown",
610 | "metadata": {},
611 | "source": [
612 | "In the next part of this notebook we'll illustrate how you can access local and remote data sources from within the notebook."
613 | ]
614 | },
615 | {
616 | "cell_type": "markdown",
617 | "metadata": {},
618 | "source": [
619 | "***\n",
620 | "# Part 2: Working with data sources\n",
621 | "\n",
622 | "You can access any data source using your favorite public or home-grown packages. In the second part of this notebook you'll learn how to retrieve data from an Apache CouchDB (or Cloudant) database and visualize it using PixieDust or third-party libraries.\n",
623 | "\n",
624 | "## Accessing Cloudant data sources\n",
625 | "\n",
626 | "\n",
627 | "To access data stored in an Apache CouchDB or Cloudant database, we can use the [`cloudant-quickstart`](https://www.npmjs.com/package/cloudant-quickstart) npm module:"
628 | ]
629 | },
630 | {
631 | "cell_type": "code",
632 | "execution_count": 17,
633 | "metadata": {},
634 | "outputs": [
635 | {
636 | "name": "stdout",
637 | "output_type": "stream",
638 | "text": [
639 | "/opt/conda/envs/DSX-Python27/bin/npm install -s cloudant-quickstart\n",
640 | "+ cloudant-quickstart@1.25.5\n",
641 | "updated 1 package in 0.983s\n"
642 | ]
643 | }
644 | ],
645 | "source": [
646 | "npm.install('cloudant-quickstart')"
647 | ]
648 | },
649 | {
650 | "cell_type": "markdown",
651 | "metadata": {},
652 | "source": [
653 | "With our Cloudant URL, we can start exploring the data in Node.js. First we make a connection to the remote Cloudant database:"
654 | ]
655 | },
656 | {
657 | "cell_type": "code",
658 | "execution_count": 18,
659 | "metadata": {},
660 | "outputs": [],
661 | "source": [
662 | "%%node\n",
663 | "// connect to Cloudant using cloudant-quickstart\n",
664 | "const cqs = require('cloudant-quickstart');\n",
665 | "const cities = cqs('https://56953ed8-3fba-4f7e-824e-5498c8e1d18e-bluemix.cloudant.com/cities');"
666 | ]
667 | },
668 | {
669 | "cell_type": "markdown",
670 | "metadata": {},
671 | "source": [
672 | "> For this code pattern example a remote database has been pre-configured to accept anonymous connection requests. If you wish to explore the `cloudant-quickstart` library beyond what is covered in this nodebook, we recommend you create your own replica and replace the above URL with your own, e.g. `https://myid:mypassword@mycloudanthost/mydatabase`.\n",
673 | "\n",
674 | "Now we have an object named `cities` that we can use to access the database. \n",
675 | "\n",
676 | "### Exploring the data using Node.js in a notebook \n",
677 | "\n",
678 | "We can retrieve all documents using `all`."
679 | ]
680 | },
681 | {
682 | "cell_type": "code",
683 | "execution_count": 19,
684 | "metadata": {},
685 | "outputs": [
686 | {
687 | "name": "stdout",
688 | "output_type": "stream",
689 | "text": [
690 | "[ { _id: '1000501',\n",
691 | "name: 'Grahamstown',\n",
692 | "latitude: -33.30422,\n",
693 | "longitude: 26.53276,\n",
694 | "country: 'ZA',\n",
695 | "population: 91548,\n",
696 | "timezone: 'Africa/Johannesburg' },\n",
697 | "{ _id: '1000543',\n",
698 | "name: 'Graaff-Reinet',\n",
699 | "latitude: -32.25215,\n",
700 | "longitude: 24.53075,\n",
701 | "country: 'ZA',\n",
702 | "population: 62896,\n",
703 | "timezone: 'Africa/Johannesburg' },\n",
704 | "{ _id: '100077',\n",
705 | "name: 'Abū Ghurayb',\n",
706 | "latitude: 33.30563,\n",
707 | "longitude: 44.18477,\n",
708 | "country: 'IQ',\n",
709 | "population: 900000,\n",
710 | "timezone: 'Asia/Baghdad' } ]\n"
711 | ]
712 | }
713 | ],
714 | "source": [
715 | "%%node\n",
716 | "// If no limit is specified, 100 documents will be returned\n",
717 | "cities.all({limit:3}).then(console.log).catch(console.error)"
718 | ]
719 | },
720 | {
721 | "cell_type": "markdown",
722 | "metadata": {},
723 | "source": [
724 | "By specifying the optional `limit` and `skip` parameters, we can paginate through the document list:\n",
725 | "\n",
726 | "```\n",
727 | "cities.all({limit:10}).then(console.log).catch(console.error)\n",
728 | "cities.all({skip:10, limit:10}).then(console.log).catch(console.error)\n",
729 | "```"
730 | ]
731 | },
732 | {
733 | "cell_type": "markdown",
734 | "metadata": {},
735 | "source": [
736 | "If we know the IDs of documents, we can retrieve them singly:"
737 | ]
738 | },
739 | {
740 | "cell_type": "code",
741 | "execution_count": 20,
742 | "metadata": {},
743 | "outputs": [
744 | {
745 | "name": "stdout",
746 | "output_type": "stream",
747 | "text": [
748 | "{ _id: '2636749',\n",
749 | "name: 'Stowmarket',\n",
750 | "latitude: 52.18893,\n",
751 | "longitude: 0.99774,\n",
752 | "country: 'GB',\n",
753 | "population: 15394,\n",
754 | "timezone: 'Europe/London' }\n"
755 | ]
756 | }
757 | ],
758 | "source": [
759 | "%%node\n",
760 | "cities.get('2636749').then(console.log).catch(console.error);"
761 | ]
762 | },
763 | {
764 | "cell_type": "markdown",
765 | "metadata": {},
766 | "source": [
767 | "Or in bulk:"
768 | ]
769 | },
770 | {
771 | "cell_type": "code",
772 | "execution_count": 21,
773 | "metadata": {
774 | "scrolled": true
775 | },
776 | "outputs": [
777 | {
778 | "name": "stdout",
779 | "output_type": "stream",
780 | "text": [
781 | "[ { _id: '5913490',\n",
782 | "name: 'Calgary',\n",
783 | "latitude: 51.05011,\n",
784 | "longitude: -114.08529,\n",
785 | "country: 'CA',\n",
786 | "population: 1019942,\n",
787 | "timezone: 'America/Edmonton' },\n",
788 | "{ _id: '4140963',\n",
789 | "name: 'Washington, D.C.',\n",
790 | "latitude: 38.89511,\n",
791 | "longitude: -77.03637,\n",
792 | "country: 'US',\n",
793 | "population: 601723,\n",
794 | "timezone: 'America/New_York' },\n",
795 | "{ _id: '3520274',\n",
796 | "name: 'Río Blanco',\n",
797 | "latitude: 18.83036,\n",
798 | "longitude: -97.156,\n",
799 | "country: 'MX',\n",
800 | "population: 39543,\n",
801 | "timezone: 'America/Mexico_City' } ]\n"
802 | ]
803 | }
804 | ],
805 | "source": [
806 | "%%node\n",
807 | "cities.get(['5913490', '4140963','3520274']).then(console.log).catch(console.error);"
808 | ]
809 | },
810 | {
811 | "cell_type": "markdown",
812 | "metadata": {},
813 | "source": [
814 | "Instead of just calling `print` to output the JSON, we can bring PixieDust's `display` function to bear by passing it an array of data to visualize. Using Mapbox as the renderer and satellite as the basemap, we can display the location and population of the selected cities:\n",
815 | ]
816 | },
817 | {
818 | "cell_type": "code",
819 | "execution_count": 22,
820 | "metadata": {
821 | "pixiedust": {
822 | "displayParams": {
823 | "basemap": "satellite-v9",
824 | "chartsize": "76",
825 | "coloropacity": "53",
826 | "colorrampname": "Orange to Purple",
827 | "handlerId": "mapView",
828 | "keyFields": "latitude,longitude",
829 | "kind": "simple-cluster",
830 | "legend": "false",
831 | "mapboxtoken": "pk.eyJ1IjoibWFwYm94IiwiYSI6ImNpejY4M29iazA2Z2gycXA4N2pmbDZmangifQ.-g_vE53SD2WrJ6tFX7QHmA",
832 | "rendererId": "mapbox",
833 | "rowCount": "500",
834 | "valueFields": "population,name"
835 | }
836 | },
837 | "scrolled": false
838 | },
839 | "outputs": [],
840 | "source": [
841 | "%%node\n",
842 | "cities.get(['5913490', '4140963','3520274']).then(display).catch(console.error);"
843 | ]
844 | },
845 | {
846 | "cell_type": "markdown",
847 | "metadata": {},
848 | "source": [
849 | "\n",
850 | "We can also query a subset of the data using the `query` function, passing it a [Cloudant Query](https://cloud.ibm.com/docs/services/Cloudant/api/cloudant_query.html#query) statement. Using Mapbox as the renderer, the customizable output looks as follows:\n",
851 | ]
852 | },
853 | {
854 | "cell_type": "code",
855 | "execution_count": 23,
856 | "metadata": {
857 | "pixiedust": {
858 | "displayParams": {
859 | "basemap": "outdoors-v9",
860 | "colorrampname": "Yellow to Blue",
861 | "handlerId": "mapView",
862 | "keyFields": "latitude,longitude",
863 | "mapboxtoken": "pk.eyJ1IjoibWFwYm94IiwiYSI6ImNpejY4M29iazA2Z2gycXA4N2pmbDZmangifQ.-g_vE53SD2WrJ6tFX7QHmA",
864 | "rowCount": "500",
865 | "valueFields": "name,population"
866 | }
867 | },
868 | "scrolled": false
869 | },
870 | "outputs": [],
871 | "source": [
872 | "%%node\n",
873 | "// fetch cities in UK above latitude 54 degrees north\n",
874 | "cities.query({country:'GB', latitude: { \"$gt\": 54}}).then(display).catch(console.error);"
875 | ]
876 | },
877 | {
878 | "cell_type": "markdown",
879 | "metadata": {},
880 | "source": [
881 | "\n",
882 | "\n",
883 | "### Aggregating data\n",
884 | "The `cloudant-quickstart` library also allows aggregations (sum, count, stats) to be performed in the Cloudant database.\n",
885 | "Let’s calculate the sum of the population field:"
886 | ]
887 | },
888 | {
889 | "cell_type": "code",
890 | "execution_count": 24,
891 | "metadata": {},
892 | "outputs": [
893 | {
894 | "name": "stdout",
895 | "output_type": "stream",
896 | "text": [
897 | "2694222973\n",
898 | "\n"
899 | ]
900 | }
901 | ],
902 | "source": [
903 | "%%node\n",
904 | "cities.sum('population').then(console.log).catch(console.error);"
905 | ]
906 | },
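{
"cell_type": "markdown",
"metadata": {},
"source": [
"According to the `cloudant-quickstart` documentation, other aggregations follow the same pattern, e.g. (a sketch, not executed here):\n",
"\n",
"```\n",
"%%node\n",
"// total number of documents\n",
"cities.count().then(console.log).catch(console.error);\n",
"// min/max/mean statistics for a field\n",
"cities.stats('population').then(console.log).catch(console.error);\n",
"```"
]
},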
907 | {
908 | "cell_type": "markdown",
909 | "metadata": {},
910 | "source": [
911 | "Or compute the sum of the `population`, grouped by the `country` field, displaying the 10 countries with the largest populations:\n",
912 | ]
913 | },
914 | {
915 | "cell_type": "code",
916 | "execution_count": 25,
917 | "metadata": {
918 | "pixiedust": {
919 | "displayParams": {
920 | "aggregation": "SUM",
921 | "handlerId": "barChart",
922 | "keyFields": "name",
923 | "mapboxtoken": "pk.eyJ1IjoibWFwYm94IiwiYSI6ImNpejY4M29iazA2Z2gycXA4N2pmbDZmangifQ.-g_vE53SD2WrJ6tFX7QHmA",
924 | "orientation": "vertical",
925 | "rendererId": "google",
926 | "rowCount": "100",
927 | "sortby": "Values DESC",
928 | "valueFields": "population"
929 | }
930 | },
931 | "scrolled": false
932 | },
933 | "outputs": [
934 | {
935 | "name": "stdout",
936 | "output_type": "stream",
937 | "text": [
938 | "... ... ... ..... ..... ... ... ..... ..... ... ... ..... ..... ...\n",
939 | "CN 389,487,480\n",
940 | "IN 269,553,896\n",
941 | "US 190,515,768\n",
942 | "BR 125,426,547\n",
943 | "RU 108,885,695\n",
944 | "JP 99,000,238\n",
945 | "MX 80,474,387\n",
946 | "ID 63,161,801\n",
947 | "DE 58,884,999\n",
948 | "TR 55,733,719\n"
949 | ]
950 | }
951 | ],
952 | "source": [
953 | "%%node\n",
954 | "\n",
955 | "// helper function\n",
956 | "function top10(data) {\n",
957 | " // convert input data structure to array\n",
958 | " var pop_array = [];\n",
959 | " Object.keys(data).forEach(function(n,k) {\n",
960 | " pop_array.push({name: n, population: data[n]});\n",
961 | " });\n",
962 | " // sort array by population in descending order\n",
963 | " pop_array.sort(function(a,b) {\n",
964 | " return b.population - a.population; \n",
965 | " });\n",
966 | " // display top 10 entries\n",
967 | " pop_array.slice(0,10).forEach(function(e) {\n",
968 | " console.log(e.name + ' ' + e.population.toLocaleString()); \n",
969 | " });\n",
970 | "}\n",
971 | "\n",
972 | "// fetch aggregated data and invoke helper routine\n",
973 | "cities.sum('population','country').then(top10).catch(console.error);"
974 | ]
975 | },
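{
"cell_type": "markdown",
"metadata": {},
"source": [
"To chart the grouped totals with PixieDust instead of logging them, the same result could be converted to an array and passed to `display` (a hypothetical variation on the helper above):\n",
"\n",
"```\n",
"%%node\n",
"cities.sum('population', 'country').then(function(data) {\n",
"  var pop_array = Object.keys(data).map(function(n) {\n",
"    return { name: n, population: data[n] };\n",
"  });\n",
"  display(pop_array);\n",
"}).catch(console.error);\n",
"```"
]
},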
976 | {
977 | "cell_type": "markdown",
978 | "metadata": {},
979 | "source": [
980 | "The `cloudant-quickstart` package is just one of several Node.js libraries that you can use to access Apache CouchDB or Cloudant. Follow [this link](https://medium.com/ibm-watson-data-lab/choosing-a-cloudant-library-d14c06f3d714) to learn more about your options. "
981 | ]
982 | },
983 | {
984 | "cell_type": "markdown",
985 | "metadata": {},
986 | "source": [
987 | "### Visualizing data using custom charts\n",
988 | "\n",
989 | "If you prefer, you can also use third-party Node.js charting packages to visualize your data, such as [`quiche`](https://www.npmjs.com/package/quiche)."
990 | ]
991 | },
992 | {
993 | "cell_type": "code",
994 | "execution_count": 26,
995 | "metadata": {},
996 | "outputs": [
997 | {
998 | "name": "stdout",
999 | "output_type": "stream",
1000 | "text": [
1001 | "/opt/conda/envs/DSX-Python27/bin/npm install -s quiche\n",
1002 | "+ quiche@0.3.0\n",
1003 | "updated 1 package in 0.957s\n"
1004 | ]
1005 | }
1006 | ],
1007 | "source": [
1008 | "npm.install('quiche');"
1009 | ]
1010 | },
1011 | {
1012 | "cell_type": "code",
1013 | "execution_count": 27,
1014 | "metadata": {},
1015 | "outputs": [
1016 | {
1017 | "name": "stdout",
1018 | "output_type": "stream",
1019 | "text": [
1020 | "... ... ... ... ... ... ... ... ...\n"
1021 | ]
1022 | },
1023 | {
1024 | "data": {
1025 | "text/html": [
1026 | ""
1027 | ],
1028 | "text/plain": [
1029 | ""
1030 | ]
1031 | },
1032 | "metadata": {},
1033 | "output_type": "display_data"
1034 | }
1035 | ],
1036 | "source": [
1037 | "%%node\n",
1038 | "var Quiche = require('quiche');\n",
1039 | "var pie = new Quiche('pie');\n",
1040 | "\n",
1041 | "// fetch cities named Cambridge\n",
1042 | "cities.query({name: 'Cambridge'}).then(function(data) {\n",
1043 | "\n",
1044 | " var colors = ['ff00ff','0055ff', 'ff0000', 'ffff00', '00ff00','0000ff'];\n",
1045 | " for(i in data) {\n",
1046 | " var city = data[i];\n",
1047 | " pie.addData(city.population, city.name + '(' + city.country +')', colors[i]);\n",
1048 | " }\n",
1049 | " var imageUrl = pie.getUrl(true);\n",
1050 | " image(imageUrl); \n",
1051 | "});"
1052 | ]
1053 | },
1054 | {
1055 | "cell_type": "markdown",
1056 | "metadata": {},
1057 | "source": [
1058 | "***\n",
1059 | "# Part 3: Sharing data between Python and Node.js cells\n",
1060 | "\n",
1061 | "You can share variables between Python and Node.js cells. Why would you want to do that? Read on.\n",
1062 | "\n",
1063 | "The Node.js library ecosystem is extensive. Perhaps you need to fetch data from a database and prefer the syntax of a particular Node.js npm module. You can use Node.js to fetch the data, move it to the Python environment, and convert it into a Pandas or Spark DataFrame for aggregation, analysis and visualisation.\n",
1064 | "\n",
1065 | "PixieDust and pixiedust_node give you the flexibility to mix and match Python and Node.js code to suit the workflow you are building and the skill sets you have in your team.\n",
1066 | "\n",
1067 | "Mixing Node.js and Python code in the same notebook is a great way to integrate the work of your software development and data science teams to produce a collaborative report or dashboard.\n",
1068 | "\n",
1069 | "\n",
1070 | "### Sharing data\n",
1071 | "\n",
1072 | "Define variables in a Python cell."
1073 | ]
1074 | },
1075 | {
1076 | "cell_type": "code",
1077 | "execution_count": 28,
1078 | "metadata": {},
1079 | "outputs": [],
1080 | "source": [
1081 | "# define a couple variables in Python\n",
1082 | "a = 'Hello from Python!'\n",
1083 | "b = 2\n",
1084 | "c = False\n",
1085 | "d = {'x':1, 'y':2}\n",
1086 | "e = 3.142\n",
1087 | "f = [{'a':1}, {'a':2}, {'a':3}]"
1088 | ]
1089 | },
1090 | {
1091 | "cell_type": "markdown",
1092 | "metadata": {},
1093 | "source": [
1094 | "Access or modify their values in Node.js cells."
1095 | ]
1096 | },
1097 | {
1098 | "cell_type": "code",
1099 | "execution_count": 29,
1100 | "metadata": {},
1101 | "outputs": [
1102 | {
1103 | "name": "stdout",
1104 | "output_type": "stream",
1105 | "text": [
1106 | "Hello from Python! 2 false { y: 2, x: 1 } 3.142 [ { a: 1 }, { a: 2 }, { a: 3 } ]\n"
1107 | ]
1108 | }
1109 | ],
1110 | "source": [
1111 | "%%node\n",
1112 | "// print variable values\n",
1113 | "console.log(a, b, c, d, e, f);\n",
1114 | "\n",
1115 | "// change variable value \n",
1116 | "a = 'Hello from Node.js!';\n",
1117 | "\n",
1118 | "// define a new variable\n",
1119 | "var g = 'Yes, it works both ways.';"
1120 | ]
1121 | },
1122 | {
1123 | "cell_type": "markdown",
1124 | "metadata": {},
1125 | "source": [
1126 | "Inspect the manipulated data."
1127 | ]
1128 | },
1129 | {
1130 | "cell_type": "code",
1131 | "execution_count": 30,
1132 | "metadata": {},
1133 | "outputs": [
1134 | {
1135 | "name": "stdout",
1136 | "output_type": "stream",
1137 | "text": [
1138 | "Hello from Node.js! Yes, it works both ways.\n"
1139 | ]
1140 | }
1141 | ],
1142 | "source": [
1143 | "# display modified variable and the new variable\n",
1144 | "print('{} {}'.format(a,g))"
1145 | ]
1146 | },
1147 | {
1148 | "cell_type": "markdown",
1149 | "metadata": {},
1150 | "source": [
1151 | "**Note:** PixieDust natively supports [data sharing between Python and Scala](https://ibm-watson-data-lab.github.io/pixiedust/scalabridge.html), which extends variable sharing to Scala for some data types:\n",
1152 | " ```\n",
1153 | " %%scala\n",
1154 | " println(a,b,c,d,e,f,g)\n",
1155 | " \n",
1156 | " (Hello from Node.js!,2,null,null,null,null,Yes, it works both ways.)\n",
1157 | " ```"
1158 | ]
1159 | },
1160 | {
1161 | "cell_type": "markdown",
1162 | "metadata": {},
1163 | "source": [
1164 | "### Sharing data from an asynchronous callback\n",
1165 | "\n",
1166 | "If you wish to transfer data from Node.js to Python inside an asynchronous callback, make sure you write the data to a global variable. \n",
1167 | "\n",
1168 | "Load a CSV file from a GitHub repository."
1169 | ]
1170 | },
1171 | {
1172 | "cell_type": "code",
1173 | "execution_count": 31,
1174 | "metadata": {},
1175 | "outputs": [
1176 | {
1177 | "name": "stdout",
1178 | "output_type": "stream",
1179 | "text": [
1180 | "... ... ...\n",
1181 | "Fetched sample data from GitHub.\n"
1182 | ]
1183 | }
1184 | ],
1185 | "source": [
1186 | "%%node\n",
1187 | "\n",
1188 | "// global variable\n",
1189 | "var sample_csv_data = '';\n",
1190 | "\n",
1191 | "// load csv file from GitHub and store data in the global variable\n",
1192 | "request.get('https://github.com/ibm-watson-data-lab/open-data/raw/master/cars/cars.csv').then(function(data) {\n",
1193 | " sample_csv_data = data;\n",
1194 | " console.log('Fetched sample data from GitHub.');\n",
1195 | "});"
1196 | ]
1197 | },
1198 | {
1199 | "cell_type": "markdown",
1200 | "metadata": {},
1201 | "source": [
1202 | "Create a Pandas DataFrame from the downloaded data."
1203 | ]
1204 | },
1205 | {
1206 | "cell_type": "code",
1207 | "execution_count": 32,
1208 | "metadata": {
1209 | "pixiedust": {
1210 | "displayParams": {}
1211 | }
1212 | },
1213 | "outputs": [
1214 | {
1215 | "data": {
1311 | "text/plain": [
1312 | " mpg cylinders engine horsepower weight acceleration year origin \\\n",
1313 | "0 18.0 8 307.0 130 3504 12.0 70 American \n",
1314 | "1 15.0 8 350.0 165 3693 11.5 70 American \n",
1315 | "2 18.0 8 318.0 150 3436 11.0 70 American \n",
1316 | "3 16.0 8 304.0 150 3433 12.0 70 American \n",
1317 | "4 17.0 8 302.0 140 3449 10.5 70 American \n",
1318 | "\n",
1319 | " name \n",
1320 | "0 chevrolet chevelle malibu \n",
1321 | "1 buick skylark 320 \n",
1322 | "2 plymouth satellite \n",
1323 | "3 amc rebel sst \n",
1324 | "4 ford torino "
1325 | ]
1326 | },
1327 | "execution_count": 33,
1328 | "metadata": {},
1329 | "output_type": "execute_result"
1330 | }
1331 | ],
1332 | "source": [
1333 | "import pandas as pd\n",
1334 | "import io\n",
1335 | "# create DataFrame from shared csv data\n",
1336 | "pandas_df = pd.read_csv(io.StringIO(sample_csv_data))\n",
1337 | "# display first five rows\n",
1338 | "pandas_df.head(5)"
1339 | ]
1340 | },
1341 | {
1342 | "cell_type": "markdown",
1343 | "metadata": {},
1344 | "source": [
1345 | "**Note**: The above example is for illustrative purposes only. If you want to create a DataFrame from a URL, a much easier solution is to use [PixieDust's sampleData method](https://ibm-watson-data-lab.github.io/pixiedust/loaddata.html#load-a-csv-using-its-url). "
1346 | ]
1347 | },
1348 | {
1349 | "cell_type": "markdown",
1350 | "metadata": {},
1351 | "source": [
1352 | "#### References:\n",
1353 | " * [Nodebooks: Introducing Node.js Data Science Notebooks](https://medium.com/ibm-watson-data-lab/nodebooks-node-js-data-science-notebooks-aa140bea21ba)\n",
1354 | " * [Nodebooks: Sharing Data Between Node.js & Python](https://medium.com/ibm-watson-data-lab/nodebooks-sharing-data-between-node-js-python-3a4acae27a02)\n",
1355 | " * [Sharing Variables Between Python & Node.js in Jupyter Notebooks](https://medium.com/ibm-watson-data-lab/sharing-variables-between-python-node-js-in-jupyter-notebooks-682a79d4bdd9)"
1356 | ]
1357 | }
1358 | ],
1359 | "metadata": {
1360 | "kernelspec": {
1361 | "display_name": "Python 2.7",
1362 | "language": "python",
1363 | "name": "python2"
1364 | },
1365 | "language_info": {
1366 | "codemirror_mode": {
1367 | "name": "ipython",
1368 | "version": 2
1369 | },
1370 | "file_extension": ".py",
1371 | "mimetype": "text/x-python",
1372 | "name": "python",
1373 | "nbconvert_exporter": "python",
1374 | "pygments_lexer": "ipython2",
1375 | "version": "2.7.15"
1376 | }
1377 | },
1378 | "nbformat": 4,
1379 | "nbformat_minor": 2
1380 | }
1381 |
--------------------------------------------------------------------------------
/doc/source/images/architecture.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IBM/nodejs-in-notebooks/aaae516889eebb7cbb4db57bdc4963b6a2ad9325/doc/source/images/architecture.png
--------------------------------------------------------------------------------
/doc/source/images/new_custom_environment.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IBM/nodejs-in-notebooks/aaae516889eebb7cbb4db57bdc4963b6a2ad9325/doc/source/images/new_custom_environment.png
--------------------------------------------------------------------------------
/doc/source/images/new_notebook_custom_environment.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IBM/nodejs-in-notebooks/aaae516889eebb7cbb4db57bdc4963b6a2ad9325/doc/source/images/new_notebook_custom_environment.png
--------------------------------------------------------------------------------
/doc/source/images/notebook_preview.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IBM/nodejs-in-notebooks/aaae516889eebb7cbb4db57bdc4963b6a2ad9325/doc/source/images/notebook_preview.png
--------------------------------------------------------------------------------
/notebooks/images/display_sin_cos.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IBM/nodejs-in-notebooks/aaae516889eebb7cbb4db57bdc4963b6a2ad9325/notebooks/images/display_sin_cos.png
--------------------------------------------------------------------------------
/notebooks/images/mapbox_americas.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IBM/nodejs-in-notebooks/aaae516889eebb7cbb4db57bdc4963b6a2ad9325/notebooks/images/mapbox_americas.png
--------------------------------------------------------------------------------
/notebooks/images/mapbox_uk.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IBM/nodejs-in-notebooks/aaae516889eebb7cbb4db57bdc4963b6a2ad9325/notebooks/images/mapbox_uk.png
--------------------------------------------------------------------------------
/notebooks/images/pd_chart_types.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IBM/nodejs-in-notebooks/aaae516889eebb7cbb4db57bdc4963b6a2ad9325/notebooks/images/pd_chart_types.png
--------------------------------------------------------------------------------
/notebooks/images/pixiedust_node_schematic.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IBM/nodejs-in-notebooks/aaae516889eebb7cbb4db57bdc4963b6a2ad9325/notebooks/images/pixiedust_node_schematic.png
--------------------------------------------------------------------------------
/notebooks/nodebook_1.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "## Nodebooks: Introducing Node.js Data Science Notebooks\n",
8 | "\n",
9 | "Notebooks are where data scientists process, analyse, and visualise data in an iterative, collaborative environment. They typically run environments for languages like Python, R, and Scala. For years, data science notebooks have served academics and research scientists as a scratchpad for writing code, refining algorithms, and sharing and proving their work. Today, it's a workflow that lends itself well to web developers experimenting with data sets in Node.js.\n",
10 | "\n",
11 | "To that end, pixiedust_node is an add-on for Jupyter notebooks that allows Node.js/JavaScript to run inside notebook cells. Not only can web developers use the same workflow for collaborating in Node.js, but they can also use the same tools to work with existing data scientists coding in Python.\n",
12 | "\n",
13 | "pixiedust_node is built on the popular PixieDust helper library. Let’s get started.\n",
14 | "\n",
15 | "> Note: Run one cell at a time, or you might observe unexpected results.\n",
16 | "\n",
17 | "\n",
18 | "## Part 1: Variables, functions, and promises\n",
19 | "\n",
20 | "\n",
21 | "### Installing\n",
22 | "Install the [`pixiedust`](https://pypi.python.org/pypi/pixiedust) and [`pixiedust_node`](https://pypi.python.org/pypi/pixiedust-node) packages using `pip`, the Python package manager. "
23 | ]
24 | },
25 | {
26 | "cell_type": "code",
27 | "execution_count": null,
28 | "metadata": {},
29 | "outputs": [],
30 | "source": [
31 | "# install or upgrade the packages\n",
32 | "# restart the kernel to pick up the latest version\n",
33 | "!pip install pixiedust --upgrade\n",
34 | "!pip install pixiedust_node --upgrade"
35 | ]
36 | },
37 | {
38 | "cell_type": "markdown",
39 | "metadata": {},
40 | "source": [
41 | "### Using pixiedust_node\n",
42 | "Now we can import `pixiedust_node` into our notebook:"
43 | ]
44 | },
45 | {
46 | "cell_type": "code",
47 | "execution_count": null,
48 | "metadata": {},
49 | "outputs": [],
50 | "source": [
51 | "import pixiedust_node"
52 | ]
53 | },
54 | {
55 | "cell_type": "markdown",
56 | "metadata": {},
57 | "source": [
58 | "And then we can write JavaScript code in cells whose first line is `%%node`:"
59 | ]
60 | },
61 | {
62 | "cell_type": "code",
63 | "execution_count": null,
64 | "metadata": {},
65 | "outputs": [],
66 | "source": [
67 | "%%node\n",
68 | "// get the current date\n",
69 | "var date = new Date();"
70 | ]
71 | },
72 | {
73 | "cell_type": "markdown",
74 | "metadata": {},
75 | "source": [
76 | "It’s that easy! We can have Python and Node.js in the same notebook. Cells are Python by default, but simply starting a cell with `%%node` indicates that the next lines will be JavaScript."
77 | ]
78 | },
79 | {
80 | "cell_type": "markdown",
81 | "metadata": {},
82 | "source": [
83 | "### Displaying HTML and images in notebook cells\n",
84 | "We can use the `html` function to render HTML code in a cell:"
85 | ]
86 | },
87 | {
88 | "cell_type": "code",
89 | "execution_count": null,
90 | "metadata": {},
91 | "outputs": [],
92 | "source": [
93 | "%%node\n",
94 | "var str = '<h1>Quote</h1>\"Imagination is more important than knowledge\"\\nAlbert Einstein';\n",
95 | "html(str)"
96 | ]
97 | },
98 | {
99 | "cell_type": "markdown",
100 | "metadata": {},
101 | "source": [
102 | "If we have an image we want to render, we can do that with the `image` function:"
103 | ]
104 | },
105 | {
106 | "cell_type": "code",
107 | "execution_count": null,
108 | "metadata": {},
109 | "outputs": [],
110 | "source": [
111 | "%%node\n",
112 | "var url = 'https://github.com/IBM/nodejs-in-notebooks/blob/master/notebooks/images/pixiedust_node_schematic.png?raw=true';\n",
113 | "image(url);"
114 | ]
115 | },
116 | {
117 | "cell_type": "markdown",
118 | "metadata": {},
119 | "source": [
120 | "### Printing JavaScript variables\n",
121 | "\n",
122 | "Print variables using `console.log`."
123 | ]
124 | },
125 | {
126 | "cell_type": "code",
127 | "execution_count": null,
128 | "metadata": {},
129 | "outputs": [],
130 | "source": [
131 | "%%node\n",
132 | "var x = { a:1, b:'two', c: true };\n",
133 | "console.log(x);"
134 | ]
135 | },
136 | {
137 | "cell_type": "markdown",
138 | "metadata": {},
139 | "source": [
140 | "Calling the `print` function within your JavaScript code is the same as calling `print` in your Python code."
141 | ]
142 | },
143 | {
144 | "cell_type": "code",
145 | "execution_count": null,
146 | "metadata": {},
147 | "outputs": [],
148 | "source": [
149 | "%%node\n",
150 | "var y = { a:3, b:'four', c: false };\n",
151 | "print(y);"
152 | ]
153 | },
154 | {
155 | "cell_type": "markdown",
156 | "metadata": {},
157 | "source": [
158 | "### Visualizing data using PixieDust\n",
159 | "You can also use PixieDust’s `display` function to render data graphically. Configured as a line chart, the visualization looks as follows: "
160 | ]
161 | },
162 | {
163 | "cell_type": "code",
164 | "execution_count": null,
165 | "metadata": {
166 | "pixiedust": {
167 | "displayParams": {
168 | "aggregation": "SUM",
169 | "chartsize": "99",
170 | "handlerId": "lineChart",
171 | "keyFields": "x",
172 | "rowCount": "500",
173 | "valueFields": "cos,sin"
174 | }
175 | }
176 | },
177 | "outputs": [],
178 | "source": [
179 | "%%node\n",
180 | "var data = [];\n",
181 | "for (var i = 0; i < 1000; i++) {\n",
182 | " var x = 2*Math.PI * i/ 360;\n",
183 | " var obj = {\n",
184 | " x: x,\n",
185 | " i: i,\n",
186 | " sin: Math.sin(x),\n",
187 | " cos: Math.cos(x),\n",
188 | " tan: Math.tan(x)\n",
189 | " };\n",
190 | " data.push(obj);\n",
191 | "}\n",
192 | "// render data \n",
193 | "display(data);"
194 | ]
195 | },
196 | {
197 | "cell_type": "markdown",
198 | "metadata": {},
199 | "source": [
200 | "PixieDust presents visualisations of DataFrames using Matplotlib, Bokeh, Brunel, d3, Google Maps, and Mapbox. No code is required on your part because PixieDust provides simple pull-down menus and a friendly point-and-click interface, allowing you to configure how the data is presented:\n",
201 | "\n",
202 | "<img src='images/pd_chart_types.png'/>"
203 | ]
204 | },
205 | {
206 | "cell_type": "markdown",
207 | "metadata": {},
208 | "source": [
209 | "### Adding npm modules\n",
210 | "There are thousands of libraries and tools in the npm registry, the package ecosystem for Node.js. It’s essential that we can install npm libraries and use them in our notebook code.\n",
211 | "Let’s say we want to make some HTTP calls to an external API service. We could deal with Node.js’s low-level HTTP library, or an easier option would be to use the ubiquitous `request` npm module.\n",
212 | "Once we have pixiedust_node set up, installing an npm module is as simple as running `npm.install` in a Python cell:"
213 | ]
214 | },
215 | {
216 | "cell_type": "code",
217 | "execution_count": null,
218 | "metadata": {},
219 | "outputs": [],
220 | "source": [
221 | "npm.install('request');"
222 | ]
223 | },
224 | {
225 | "cell_type": "markdown",
226 | "metadata": {},
227 | "source": [
228 | "Once installed, you may `require` the module in your JavaScript code:"
229 | ]
230 | },
231 | {
232 | "cell_type": "code",
233 | "execution_count": null,
234 | "metadata": {},
235 | "outputs": [],
236 | "source": [
237 | "%%node\n",
238 | "var request = require('request');\n",
239 | "var r = {\n",
240 | " method:'GET',\n",
241 | " url: 'http://api.open-notify.org/iss-now.json',\n",
242 | " json: true\n",
243 | "};\n",
244 | "request(r, function(err, req, body) {\n",
245 | " console.log(body);\n",
246 | "});\n"
247 | ]
248 | },
249 | {
250 | "cell_type": "markdown",
251 | "metadata": {},
252 | "source": [
253 | "As an HTTP request is an asynchronous action, the `request` library calls our callback function when the operation has completed. Inside that function, we can call `console.log` to render the data.\n",
254 | "We can organise our code into functions to encapsulate complexity and make it easier to reuse code. We can create a function to get the current position of the International Space Station in one notebook cell:"
255 | ]
256 | },
257 | {
258 | "cell_type": "code",
259 | "execution_count": null,
260 | "metadata": {},
261 | "outputs": [],
262 | "source": [
263 | "%%node\n",
264 | "var request = require('request');\n",
265 | "var getPosition = function(callback) {\n",
266 | " var r = {\n",
267 | " method:'GET',\n",
268 | " url: 'http://api.open-notify.org/iss-now.json',\n",
269 | " json: true\n",
270 | " };\n",
271 | " request(r, function(err, req, body) {\n",
272 | " var obj = null;\n",
273 | " if (!err) {\n",
274 | " obj = body.iss_position\n",
275 | " obj.latitude = parseFloat(obj.latitude);\n",
276 | " obj.longitude = parseFloat(obj.longitude);\n",
277 | " obj.time = new Date().getTime(); \n",
278 | " }\n",
279 | " callback(err, obj);\n",
280 | " });\n",
281 | "};"
282 | ]
283 | },
284 | {
285 | "cell_type": "markdown",
286 | "metadata": {},
287 | "source": [
288 | "And use it in another cell:"
289 | ]
290 | },
291 | {
292 | "cell_type": "code",
293 | "execution_count": null,
294 | "metadata": {},
295 | "outputs": [],
296 | "source": [
297 | "%%node\n",
298 | "getPosition(function(err, data) {\n",
299 | " console.log(data);\n",
300 | "});"
301 | ]
302 | },
303 | {
304 | "cell_type": "markdown",
305 | "metadata": {},
306 | "source": [
307 | "### Promises\n",
308 | "If you prefer to work with JavaScript Promises when writing asynchronous code, then that’s okay too. Let’s rewrite our `getPosition` function to return a Promise. First we're going to install the `request-promise` module from npm:"
309 | ]
310 | },
311 | {
312 | "cell_type": "code",
313 | "execution_count": null,
314 | "metadata": {},
315 | "outputs": [],
316 | "source": [
317 | "npm.install(('request', 'request-promise'))"
318 | ]
319 | },
320 | {
321 | "cell_type": "markdown",
322 | "metadata": {},
323 | "source": [
324 | "Notice how you can install multiple modules in a single call. Just pass in a Python `list` or `tuple`.\n",
325 | "Then we can refactor our function a little:"
326 | ]
327 | },
328 | {
329 | "cell_type": "code",
330 | "execution_count": null,
331 | "metadata": {},
332 | "outputs": [],
333 | "source": [
334 | "%%node\n",
335 | "var request = require('request-promise');\n",
336 | "var getPosition = function(callback) {\n",
337 | " var r = {\n",
338 | " method:'GET',\n",
339 | " url: 'http://api.open-notify.org/iss-now.json',\n",
340 | " json: true\n",
341 | " };\n",
342 | " return request(r).then(function(body) {\n",
343 | " var obj = null;\n",
344 | " obj = body.iss_position;\n",
345 | " obj.latitude = parseFloat(obj.latitude);\n",
346 | " obj.longitude = parseFloat(obj.longitude);\n",
347 | " obj.time = new Date().getTime(); \n",
348 | " return obj;\n",
349 | " });\n",
350 | "};"
351 | ]
352 | },
353 | {
354 | "cell_type": "markdown",
355 | "metadata": {},
356 | "source": [
357 | "And call it in the Promises style:"
358 | ]
359 | },
360 | {
361 | "cell_type": "code",
362 | "execution_count": null,
363 | "metadata": {},
364 | "outputs": [],
365 | "source": [
366 | "%%node\n",
367 | "getPosition().then(function(data) {\n",
368 | " console.log(data);\n",
369 | "}).catch(function(err) {\n",
370 | " console.error(err); \n",
371 | "});"
372 | ]
373 | },
374 | {
375 | "cell_type": "markdown",
376 | "metadata": {},
377 | "source": [
378 | "Or call it in a more compact form:"
379 | ]
380 | },
381 | {
382 | "cell_type": "code",
383 | "execution_count": null,
384 | "metadata": {},
385 | "outputs": [],
386 | "source": [
387 | "%%node\n",
388 | "getPosition().then(console.log).catch(console.error);"
389 | ]
390 | },
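On Node.js versions with async/await support (Node 8+; whether your notebook's Node.js runtime qualifies is an assumption to verify), the same promise style can be consumed with `await`. A self-contained sketch using a hypothetical `getAnswer` stand-in rather than the real `getPosition`:

```javascript
// Hypothetical promise-returning function standing in for getPosition.
function getAnswer() {
  return Promise.resolve({ answer: 42 });
}

// Consume the promise with async/await inside an async IIFE.
(async function () {
  try {
    var data = await getAnswer();
    console.log(data.answer);
  } catch (err) {
    console.error(err);
  }
})();
```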
391 | {
392 | "cell_type": "markdown",
393 | "metadata": {},
394 | "source": [
395 | "In the next part of this notebook we'll illustrate how you can access local and remote data sources from within the notebook."
396 | ]
397 | },
398 | {
399 | "cell_type": "markdown",
400 | "metadata": {},
401 | "source": [
402 | "***\n",
403 | "# Part 2: Working with data sources\n",
404 | "\n",
405 | "You can access any data source using your favorite public or home-grown packages. In the second part of this notebook you'll learn how to retrieve data from an Apache CouchDB (or Cloudant) database and visualize it using PixieDust or third-party libraries.\n",
406 | "\n",
407 | "## Accessing Cloudant data sources\n",
408 | "\n",
409 | "\n",
410 | "To access data stored in an Apache CouchDB or Cloudant database, we can use the [`cloudant-quickstart`](https://www.npmjs.com/package/cloudant-quickstart) npm module:"
411 | ]
412 | },
413 | {
414 | "cell_type": "code",
415 | "execution_count": null,
416 | "metadata": {},
417 | "outputs": [],
418 | "source": [
419 | "npm.install('cloudant-quickstart')"
420 | ]
421 | },
422 | {
423 | "cell_type": "markdown",
424 | "metadata": {},
425 | "source": [
426 | "With our Cloudant URL, we can start exploring the data in Node.js. First we make a connection to the remote Cloudant database:"
427 | ]
428 | },
429 | {
430 | "cell_type": "code",
431 | "execution_count": null,
432 | "metadata": {},
433 | "outputs": [],
434 | "source": [
435 | "%%node\n",
436 | "// connect to Cloudant using cloudant-quickstart\n",
437 | "const cqs = require('cloudant-quickstart');\n",
438 | "const cities = cqs('https://56953ed8-3fba-4f7e-824e-5498c8e1d18e-bluemix.cloudant.com/cities');"
439 | ]
440 | },
441 | {
442 | "cell_type": "markdown",
443 | "metadata": {},
444 | "source": [
445 | "> For this code pattern example, a remote database has been pre-configured to accept anonymous connection requests. If you wish to explore the `cloudant-quickstart` library beyond what is covered in this nodebook, we recommend you create your own replica and replace the above URL with your own, e.g. `https://myid:mypassword@mycloudanthost/mydatabase`.\n",
446 | "\n",
447 | "Now we have an object named `cities` that we can use to access the database. \n",
448 | "\n",
449 | "### Exploring the data using Node.js in a notebook \n",
450 | "\n",
451 | "We can retrieve all documents using `all`."
452 | ]
453 | },
454 | {
455 | "cell_type": "code",
456 | "execution_count": null,
457 | "metadata": {},
458 | "outputs": [],
459 | "source": [
460 | "%%node\n",
461 | "// If no limit is specified, 100 documents will be returned\n",
462 | "cities.all({limit:3}).then(console.log).catch(console.error)"
463 | ]
464 | },
465 | {
466 | "cell_type": "markdown",
467 | "metadata": {},
468 | "source": [
469 | "By specifying the optional `limit` and `skip` parameters, we can paginate through the document list:\n",
470 | "\n",
471 | "```\n",
472 | "cities.all({limit:10}).then(console.log).catch(console.error)\n",
473 | "cities.all({skip:10, limit:10}).then(console.log).catch(console.error)\n",
474 | "```"
475 | ]
476 | },
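`skip` and `limit` implement plain offset pagination. A self-contained sketch with a hypothetical in-memory `page` helper (not part of the `cloudant-quickstart` API) shows the effect:

```javascript
// Hypothetical stand-in mimicking skip/limit over an array of documents.
function page(docs, opts) {
  var skip = (opts && opts.skip) || 0;
  var limit = (opts && opts.limit) || 100;  // default batch size noted above
  return docs.slice(skip, skip + limit);
}

var docs = ['doc1', 'doc2', 'doc3', 'doc4', 'doc5'];
console.log(page(docs, { limit: 2 }));           // first page
console.log(page(docs, { skip: 2, limit: 2 }));  // second page
```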
477 | {
478 | "cell_type": "markdown",
479 | "metadata": {},
480 | "source": [
481 | "If we know the IDs of documents, we can retrieve them singly:"
482 | ]
483 | },
484 | {
485 | "cell_type": "code",
486 | "execution_count": null,
487 | "metadata": {},
488 | "outputs": [],
489 | "source": [
490 | "%%node\n",
491 | "cities.get('2636749').then(console.log).catch(console.error);"
492 | ]
493 | },
494 | {
495 | "cell_type": "markdown",
496 | "metadata": {},
497 | "source": [
498 | "Or in bulk:"
499 | ]
500 | },
501 | {
502 | "cell_type": "code",
503 | "execution_count": null,
504 | "metadata": {},
505 | "outputs": [],
506 | "source": [
507 | "%%node\n",
508 | "cities.get(['5913490', '4140963','3520274']).then(console.log).catch(console.error);"
509 | ]
510 | },
511 | {
512 | "cell_type": "markdown",
513 | "metadata": {},
514 | "source": [
515 | "Instead of just calling `print` to output the JSON, we can bring PixieDust's `display` function to bear by passing it an array of data to visualize. Using Mapbox as the renderer and satellite as the basemap, we can display the location and population of the selected cities: "
516 | ]
517 | },
518 | {
519 | "cell_type": "code",
520 | "execution_count": null,
521 | "metadata": {
522 | "pixiedust": {
523 | "displayParams": {
524 | "basemap": "satellite-v9",
525 | "chartsize": "76",
526 | "coloropacity": "53",
527 | "colorrampname": "Orange to Purple",
528 | "handlerId": "mapView",
529 | "keyFields": "latitude,longitude",
530 | "kind": "simple-cluster",
531 | "legend": "false",
532 | "mapboxtoken": "pk.eyJ1IjoibWFwYm94IiwiYSI6ImNpejY4M29iazA2Z2gycXA4N2pmbDZmangifQ.-g_vE53SD2WrJ6tFX7QHmA",
533 | "rendererId": "mapbox",
534 | "rowCount": "500",
535 | "valueFields": "population,name"
536 | }
537 | },
538 | "scrolled": false
539 | },
540 | "outputs": [],
541 | "source": [
542 | "%%node\n",
543 | "cities.get(['5913490', '4140963','3520274']).then(display).catch(console.error);"
544 | ]
545 | },
546 | {
547 | "cell_type": "markdown",
548 | "metadata": {},
549 | "source": [
550 | "We can also query a subset of the data using the `query` function, passing it a [Cloudant Query](https://cloud.ibm.com/docs/services/Cloudant/api/cloudant_query.html#query) statement. Using Mapbox as the renderer, the customizable output looks as follows:"
551 | ]
552 | },
553 | {
554 | "cell_type": "code",
555 | "execution_count": null,
556 | "metadata": {
557 | "pixiedust": {
558 | "displayParams": {
559 | "basemap": "outdoors-v9",
560 | "colorrampname": "Yellow to Blue",
561 | "handlerId": "mapView",
562 | "keyFields": "latitude,longitude",
563 | "mapboxtoken": "pk.eyJ1IjoibWFwYm94IiwiYSI6ImNpejY4M29iazA2Z2gycXA4N2pmbDZmangifQ.-g_vE53SD2WrJ6tFX7QHmA",
564 | "rowCount": "500",
565 | "valueFields": "name,population"
566 | }
567 | },
568 | "scrolled": false
569 | },
570 | "outputs": [],
571 | "source": [
572 | "%%node\n",
573 | "// fetch cities in UK above latitude 54 degrees north\n",
574 | "cities.query({country:'GB', latitude: { \"$gt\": 54}}).then(display).catch(console.error);"
575 | ]
576 | },
577 | {
578 | "cell_type": "markdown",
579 | "metadata": {},
580 | "source": [
581 | "### Aggregating data\n",
582 | "The `cloudant-quickstart` library also allows aggregations (sum, count, stats) to be performed in the Cloudant database.\n",
583 | "Let’s calculate the sum of the population field:"
584 | ]
585 | },
586 | {
587 | "cell_type": "code",
588 | "execution_count": null,
589 | "metadata": {},
590 | "outputs": [],
591 | "source": [
592 | "%%node\n",
593 | "cities.sum('population').then(console.log).catch(console.error);"
594 | ]
595 | },
596 | {
597 | "cell_type": "markdown",
598 | "metadata": {},
599 | "source": [
600 | "Or compute the sum of the `population`, grouped by the `country` field, displaying the 10 countries with the largest populations:"
601 | ]
602 | },
603 | {
604 | "cell_type": "code",
605 | "execution_count": null,
606 | "metadata": {
607 | "pixiedust": {
608 | "displayParams": {
609 | "aggregation": "SUM",
610 | "handlerId": "barChart",
611 | "keyFields": "name",
612 | "mapboxtoken": "pk.eyJ1IjoibWFwYm94IiwiYSI6ImNpejY4M29iazA2Z2gycXA4N2pmbDZmangifQ.-g_vE53SD2WrJ6tFX7QHmA",
613 | "orientation": "vertical",
614 | "rendererId": "google",
615 | "rowCount": "100",
616 | "sortby": "Values DESC",
617 | "valueFields": "population"
618 | }
619 | },
620 | "scrolled": false
621 | },
622 | "outputs": [],
623 | "source": [
624 | "%%node\n",
625 | "\n",
626 | "// helper function\n",
627 | "function top10(data) {\n",
628 | " // convert input data structure to array\n",
629 | " var pop_array = [];\n",
630 | " Object.keys(data).forEach(function(n,k) {\n",
631 | " pop_array.push({name: n, population: data[n]});\n",
632 | " });\n",
633 | " // sort array by population in descending order\n",
634 | " pop_array.sort(function(a,b) {\n",
635 | " return b.population - a.population; \n",
636 | " });\n",
637 | " // display top 10 entries\n",
638 | " pop_array.slice(0,10).forEach(function(e) {\n",
639 | " console.log(e.name + ' ' + e.population.toLocaleString()); \n",
640 | " });\n",
641 | "}\n",
642 | "\n",
643 | "// fetch aggregated data and invoke helper routine\n",
644 | "cities.sum('population','country').then(top10).catch(console.error);"
645 | ]
646 | },
647 | {
648 | "cell_type": "markdown",
649 | "metadata": {},
650 | "source": [
651 | "The `cloudant-quickstart` package is just one of several Node.js libraries that you can use to access Apache CouchDB or Cloudant. Follow [this link](https://medium.com/ibm-watson-data-lab/choosing-a-cloudant-library-d14c06f3d714) to learn more about your options. "
652 | ]
653 | },
654 | {
655 | "cell_type": "markdown",
656 | "metadata": {},
657 | "source": [
658 | "### Visualizing data using custom charts\n",
659 | "\n",
660 | "If you prefer, you can also use third-party Node.js charting packages to visualize your data, such as [`quiche`](https://www.npmjs.com/package/quiche)."
661 | ]
662 | },
663 | {
664 | "cell_type": "code",
665 | "execution_count": null,
666 | "metadata": {},
667 | "outputs": [],
668 | "source": [
669 | "npm.install('quiche');"
670 | ]
671 | },
672 | {
673 | "cell_type": "code",
674 | "execution_count": null,
675 | "metadata": {},
676 | "outputs": [],
677 | "source": [
678 | "%%node\n",
679 | "var Quiche = require('quiche');\n",
680 | "var pie = new Quiche('pie');\n",
681 | "\n",
682 | "// fetch cities named Cambridge\n",
683 | "cities.query({name: 'Cambridge'}).then(function(data) {\n",
684 | "\n",
685 | " var colors = ['ff00ff','0055ff', 'ff0000', 'ffff00', '00ff00','0000ff'];\n",
686 | "  for (var i in data) {\n",
687 | " var city = data[i];\n",
688 | " pie.addData(city.population, city.name + '(' + city.country +')', colors[i]);\n",
689 | " }\n",
690 | " var imageUrl = pie.getUrl(true);\n",
691 | " image(imageUrl); \n",
692 | "});"
693 | ]
694 | },
695 | {
696 | "cell_type": "markdown",
697 | "metadata": {},
698 | "source": [
699 | "***\n",
700 | "# Part 3: Sharing data between Python and Node.js cells\n",
701 | "\n",
702 | "You can share variables between Python and Node.js cells. Why would you want to do that? Read on.\n",
703 | "\n",
704 | "The Node.js library ecosystem is extensive. Perhaps you need to fetch data from a database and prefer the syntax of a particular Node.js npm module. You can use Node.js to fetch the data, move it to the Python environment, and convert it into a Pandas or Spark DataFrame for aggregation, analysis and visualisation.\n",
705 | "\n",
706 | "PixieDust and pixiedust_node give you the flexibility to mix and match Python and Node.js code to suit the workflow you are building and the skill sets you have in your team.\n",
707 | "\n",
708 | "Mixing Node.js and Python code in the same notebook is a great way to integrate the work of your software development and data science teams to produce a collaborative report or dashboard.\n",
709 | "\n",
710 | "\n",
711 | "### Sharing data\n",
712 | "\n",
713 | "Define variables in a Python cell."
714 | ]
715 | },
716 | {
717 | "cell_type": "code",
718 | "execution_count": null,
719 | "metadata": {},
720 | "outputs": [],
721 | "source": [
722 | "# define a few variables in Python\n",
723 | "a = 'Hello from Python!'\n",
724 | "b = 2\n",
725 | "c = False\n",
726 | "d = {'x':1, 'y':2}\n",
727 | "e = 3.142\n",
728 | "f = [{'a':1}, {'a':2}, {'a':3}]"
729 | ]
730 | },
731 | {
732 | "cell_type": "markdown",
733 | "metadata": {},
734 | "source": [
735 | "Access or modify their values in Node.js cells."
736 | ]
737 | },
738 | {
739 | "cell_type": "code",
740 | "execution_count": null,
741 | "metadata": {},
742 | "outputs": [],
743 | "source": [
744 | "%%node\n",
745 | "// print variable values\n",
746 | "console.log(a, b, c, d, e, f);\n",
747 | "\n",
748 | "// change variable value \n",
749 | "a = 'Hello from Node.js!';\n",
750 | "\n",
751 | "// define a new variable\n",
752 | "var g = 'Yes, it works both ways.';"
753 | ]
754 | },
755 | {
756 | "cell_type": "markdown",
757 | "metadata": {},
758 | "source": [
759 | "Inspect the manipulated data."
760 | ]
761 | },
762 | {
763 | "cell_type": "code",
764 | "execution_count": null,
765 | "metadata": {},
766 | "outputs": [],
767 | "source": [
768 | "# display the modified variable and the new variable\n",
769 | "print('{} {}'.format(a,g))"
770 | ]
771 | },
772 | {
773 | "cell_type": "markdown",
774 | "metadata": {},
775 | "source": [
776 | "**Note:** PixieDust natively supports [data sharing between Python and Scala](https://ibm-watson-data-lab.github.io/pixiedust/scalabridge.html), which extends this variable sharing to Scala for some data types:\n",
777 | " ```\n",
778 | " %%scala\n",
779 | " println(a,b,c,d,e,f,g)\n",
780 | " \n",
781 | " (Hello from Node.js!,2,null,null,null,null,Yes, it works both ways.)\n",
782 | " ```"
783 | ]
784 | },
785 | {
786 | "cell_type": "markdown",
787 | "metadata": {},
788 | "source": [
789 | "### Sharing data from an asynchronous callback\n",
790 | "\n",
791 | "If you wish to transfer data from Node.js to Python from an asynchronous callback, make sure you write the data to a global variable.\n",
792 | "\n",
793 | "Load a CSV file from a GitHub repository."
794 | ]
795 | },
796 | {
797 | "cell_type": "code",
798 | "execution_count": null,
799 | "metadata": {},
800 | "outputs": [],
801 | "source": [
802 | "%%node\n",
803 | "\n",
804 | "// global variable\n",
805 | "var sample_csv_data = '';\n",
806 | "\n",
807 | "// load the CSV file from GitHub and store its contents in the global variable\n",
808 | "request.get('https://github.com/ibm-watson-data-lab/open-data/raw/master/cars/cars.csv').then(function(data) {\n",
809 | " sample_csv_data = data;\n",
810 | " console.log('Fetched sample data from GitHub.');\n",
811 | "});"
812 | ]
813 | },
814 | {
815 | "cell_type": "markdown",
816 | "metadata": {},
817 | "source": [
818 | "Create a Pandas DataFrame from the downloaded data."
819 | ]
820 | },
821 | {
822 | "cell_type": "code",
823 | "execution_count": null,
824 | "metadata": {
825 | "pixiedust": {
826 | "displayParams": {}
827 | }
828 | },
829 | "outputs": [],
830 | "source": [
831 | "import pandas as pd\n",
832 | "import io\n",
833 | "# create DataFrame from shared csv data\n",
834 | "pandas_df = pd.read_csv(io.StringIO(sample_csv_data))\n",
835 | "# display first five rows\n",
836 | "pandas_df.head(5)"
837 | ]
838 | },
839 | {
840 | "cell_type": "markdown",
841 | "metadata": {},
842 | "source": [
843 | "**Note**: The above example is for illustrative purposes only. If you want to create a DataFrame from a URL, a much easier solution is to use [PixieDust's sampleData method](https://ibm-watson-data-lab.github.io/pixiedust/loaddata.html#load-a-csv-using-its-url)."
844 | ]
845 | },
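846 | {
847 | "cell_type": "markdown",
848 | "metadata": {},
849 | "source": [
850 | "For comparison, here is a minimal sketch of that approach. It assumes the same `cars.csv` URL used above; depending on your environment, `sampleData` returns a pandas or Spark DataFrame."
851 | ]
852 | },
853 | {
854 | "cell_type": "code",
855 | "execution_count": null,
856 | "metadata": {},
857 | "outputs": [],
858 | "source": [
859 | "import pixiedust\n",
860 | "# load the CSV directly from its URL into a DataFrame (no Node.js round trip)\n",
861 | "cars_df = pixiedust.sampleData('https://github.com/ibm-watson-data-lab/open-data/raw/master/cars/cars.csv')\n",
862 | "cars_df.head(5)"
863 | ]
864 | },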
846 | {
847 | "cell_type": "markdown",
848 | "metadata": {},
849 | "source": [
850 | "#### References:\n",
851 | " * [Nodebooks: Introducing Node.js Data Science Notebooks](https://medium.com/ibm-watson-data-lab/nodebooks-node-js-data-science-notebooks-aa140bea21ba)\n",
852 | " * [Nodebooks: Sharing Data Between Node.js & Python](https://medium.com/ibm-watson-data-lab/nodebooks-sharing-data-between-node-js-python-3a4acae27a02)\n",
853 | " * [Sharing Variables Between Python & Node.js in Jupyter Notebooks](https://medium.com/ibm-watson-data-lab/sharing-variables-between-python-node-js-in-jupyter-notebooks-682a79d4bdd9)"
854 | ]
855 | }
856 | ],
857 | "metadata": {
858 | "kernelspec": {
859 | "display_name": "Python 2.7",
860 | "language": "python",
861 | "name": "python2"
862 | },
863 | "language_info": {
864 | "codemirror_mode": {
865 | "name": "ipython",
866 | "version": 2
867 | },
868 | "file_extension": ".py",
869 | "mimetype": "text/x-python",
870 | "name": "python",
871 | "nbconvert_exporter": "python",
872 | "pygments_lexer": "ipython2",
873 | "version": "2.7.15"
874 | }
875 | },
876 | "nbformat": 4,
877 | "nbformat_minor": 2
878 | }
879 |
--------------------------------------------------------------------------------