├── enterprise_gateway
│   ├── system-diagram.jpg
│   └── proposal.md
├── .gitignore
├── template
│   └── proposal.md
├── LICENSE
├── scipy_traittypes
│   └── proposal.md
├── README.md
├── contentmanagement
│   └── proposal.md
├── sparkmagic
│   └── proposal.md
├── declarativewidgets
│   └── proposal.md
├── dashboards
│   └── proposal.md
└── kernel_gateway
    └── proposal.md
/enterprise_gateway/system-diagram.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jupyter-incubator/proposals/HEAD/enterprise_gateway/system-diagram.jpg -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | # Byte-compiled / optimized / DLL files 2 | __pycache__/ 3 | *.py[cod] 4 | 5 | # C extensions 6 | *.so 7 | 8 | # Distribution / packaging 9 | .Python 10 | env/ 11 | build/ 12 | develop-eggs/ 13 | dist/ 14 | downloads/ 15 | eggs/ 16 | .eggs/ 17 | lib/ 18 | lib64/ 19 | parts/ 20 | sdist/ 21 | var/ 22 | *.egg-info/ 23 | .installed.cfg 24 | *.egg 25 | 26 | # PyInstaller 27 | # Usually these files are written by a python script from a template 28 | # before PyInstaller builds the exe, so as to inject date/other info into it.
29 | *.manifest 30 | *.spec 31 | 32 | # Installer logs 33 | pip-log.txt 34 | pip-delete-this-directory.txt 35 | 36 | # Unit test / coverage reports 37 | htmlcov/ 38 | .tox/ 39 | .coverage 40 | .coverage.* 41 | .cache 42 | nosetests.xml 43 | coverage.xml 44 | *.cover 45 | 46 | # Translations 47 | *.mo 48 | *.pot 49 | 50 | # Django stuff: 51 | *.log 52 | 53 | # Sphinx documentation 54 | docs/_build/ 55 | 56 | # PyBuilder 57 | target/ 58 | -------------------------------------------------------------------------------- /template/proposal.md: -------------------------------------------------------------------------------- 1 | # Proposal for Incubation 2 | 3 | Please read the [New Subproject 4 | Process](https://github.com/jupyter/governance/blob/master/newsubprojects.md) for details 5 | about the incubation process. 6 | 7 | Please answer the following questions concisely and completely. This proposal doesn't have 8 | to be very long. 9 | 10 | ## Subproject name 11 | 12 | What is the name of the proposed Subproject? 13 | 14 | ## Development team and Advocate 15 | 16 | Please list the full names, affiliations and GitHub ids of the Subproject development 17 | team: 18 | 19 | * Lisa Simpson, Springfield Elementary School (`@lisasimpson`) 20 | * Bart Simpson, Springfield Elementary School (`@bartsimpson`) 21 | 22 | Who is the Steering Council Advocate for this Subproject? 23 | 24 | * Steering Council Person (`@their-github`) 25 | 26 | ## Subproject goals, scope and functionality 27 | 28 | Please describe/list the goals, scope and functionality of this Subproject. 29 | 30 | ## Audience 31 | 32 | Please describe who would use the Subproject and what they would use it for. 33 | 34 | ## Other options 35 | 36 | Please describe other software, both commercial and open-source, that addresses the same goals and audience.
37 | 38 | ## Integration with Project Jupyter 39 | 40 | Please describe how this Subproject will integrate with other official Project Jupyter subprojects and efforts. 41 | 42 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Copyright (c) 2015, jupyter-incubator 2 | All rights reserved. 3 | 4 | Redistribution and use in source and binary forms, with or without 5 | modification, are permitted provided that the following conditions are met: 6 | 7 | * Redistributions of source code must retain the above copyright notice, this 8 | list of conditions and the following disclaimer. 9 | 10 | * Redistributions in binary form must reproduce the above copyright notice, 11 | this list of conditions and the following disclaimer in the documentation 12 | and/or other materials provided with the distribution. 13 | 14 | * Neither the name of proposals nor the names of its 15 | contributors may be used to endorse or promote products derived from 16 | this software without specific prior written permission. 17 | 18 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 19 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 20 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 21 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 22 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 23 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 24 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 25 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 26 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 27 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
28 | 29 | -------------------------------------------------------------------------------- /scipy_traittypes/proposal.md: -------------------------------------------------------------------------------- 1 | # Proposal for Incubation 2 | 3 | ## Subproject name 4 | 5 | Scipy Trait Types (jupyter-incubator/traittypes) 6 | 7 | ## Development team and Advocate 8 | 9 | Subproject Development Team: 10 | 11 | * Sylvain Corlay, Bloomberg LP (`@SylvainCorlay`) 12 | * Jason Grout, Bloomberg LP (`@JasonGrout`) 13 | 14 | Steering Council Advocate: 15 | 16 | * Jason Grout (`@JasonGrout`) 17 | 18 | ## Subproject goals, scope and functionality 19 | 20 | ### Goals 21 | 22 | Provide a reference implementation of trait types for common data structures used in the scipy stack such as 23 | - numpy arrays 24 | - pandas / xray data structures 25 | 26 | which are out of the scope of the main traitlets repository but are a common requirement to build applications with traitlets in combination with the scipy stack. 27 | 28 | Another goal is to create adequate serialization and deserialization routines for these trait types to be used with the [ipywidgets](https://github.com/ipython/ipywidgets) project (`to_json` and `from_json`). These could also return a list of binary buffers as allowed by the current message protocol. 29 | 30 | ### Scope 31 | 32 | The trait cross-validation allows for complex coercion or validation of trait values. 33 | 34 | For example, numpy arrays could have dimensional constraints, or be coerced on assignment. We should identify the common denominator that we would want in a most basic implementation and provide extension points for users willing to provide more complex features. 35 | 36 | The recent changes in the traitlets APIs enabling multiple trait notification types allow trait type authors to define custom protocols for element changes in containers. It would be in scope of this project to implement notification protocols for operational transforms. 
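The coercion-on-assignment behavior described above can be sketched with the custom trait API that traitlets already provides. The `NDArray` name and its behavior here are purely illustrative assumptions, not the actual traittypes API:

```python
import numpy as np
from traitlets import HasTraits, TraitType

class NDArray(TraitType):
    """Illustrative trait type that coerces assigned values to numpy arrays."""
    info_text = "a numpy array"

    def validate(self, obj, value):
        try:
            # Cross-validation hook: coerce the value on assignment.
            return np.asarray(value)
        except Exception:
            self.error(obj, value)

class Model(HasTraits):
    data = NDArray()

m = Model()
m.data = [[1, 2], [3, 4]]      # a plain list is coerced to an ndarray
print(type(m.data).__name__)   # ndarray
```

A real implementation would add the extension points mentioned above (e.g. optional dimensional constraints) and `to_json`/`from_json` serializers for use with ipywidgets.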
37 | 38 | ## Audience 39 | 40 | Developers who wish to combine traitlets with the Scipy stack. Authors of custom IPython widgets. Matplotlib developers (see https://github.com/matplotlib/matplotlib/pull/4762). 41 | 42 | ## Other options 43 | 44 | - Thomas Robitaille (@astrofrog) recently started [numtraits](https://github.com/astrofrog/numtraits), which implements a numpy array trait type. 45 | - The recently released [bqplot](https://github.com/bloomberg/bqplot) project provides an implementation of numpy array and pandas dataframe trait types. 46 | 47 | ## Integration with Project Jupyter 48 | 49 | If it becomes an official Subproject, the right organization for Scipy Trait Types would probably be [IPython](https://github.com/ipython/) rather than [Jupyter](https://github.com/jupyter/) since it is a Python-specific project. It is a natural extension of the traitlets repository. 50 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Proposals For Incubation on [jupyter-incubator](https://github.com/jupyter-incubator) 2 | 3 | 4 | ## Overview 5 | 6 | Project Jupyter is organized as a set of Subprojects that are each a GitHub 7 | repository with a development team that follows the Jupyter governance, license and 8 | contribution model. Officially supported and maintained Subprojects are hosted on 9 | the [jupyter](https://github.com/jupyter) GitHub organization. 10 | 11 | Incubation refers to the process of a Subproject initially being developed outside 12 | the official Jupyter organization. The 13 | [jupyter-incubator](https://github.com/jupyter-incubator) GitHub organization is 14 | maintained and managed by Project Jupyter as one possible location where Subprojects 15 | can be incubated.
The [New Subproject 16 | Process](https://github.com/jupyter/governance/blob/master/newsubprojects.md) 17 | document in the Jupyter [governance](https://github.com/jupyter/governance) 18 | repository details the process of incubation, the process of incorporation into the 19 | main Jupyter organization, and the criteria for incorporation. 20 | 21 | 22 | ## Incubation process 23 | 24 | This repository is used for incubation proposals. To initiate this process: 25 | 26 | 1. Read the [New Subproject Process](https://github.com/jupyter/governance/blob/master/newsubprojects.md) 27 | to see if incubation on [jupyter-incubator](https://github.com/jupyter-incubator) is appropriate for your new Subproject (it might not be). If in doubt, please send an email to the main Jupyter list asking for input. 28 | 2. Pick a name for your new Subproject that is free of trademark problems, follows the 29 | naming patterns and conventions used in the project and respects the work of others 30 | even when there are no explicit trademark problems. In general the project favors 31 | functional names that indicate what the Subproject actually does. 32 | 3. Find an Advocate for your Subproject who is an active Steering Council member. 33 | 4. Create a pull request on this repository by copying the `template` directory 34 | into a new directory with the name of your proposed Subproject. 35 | 5. In that pull request, fill out the `proposal.md` document in that subdirectory. 36 | 6. Announce your proposal for incubation to the community on the main Jupyter list. 37 | 38 | 39 | ## Incorporation process 40 | 41 | Incorporation refers to the process of an incubating Subproject becoming an officially 42 | maintained and developed part of Project Jupyter. The incorporation process is described 43 | in the [New Subproject 44 | Process](https://github.com/jupyter/governance/blob/master/newsubprojects.md) document.
45 | 46 | 47 | ## Currently incubating Subprojects 48 | 49 | Put links to the [jupyter-incubator](https://github.com/jupyter-incubator) GitHub repositories of all incubating Subprojects here. 50 | 51 | * https://github.com/jupyter-incubator/contentmanagement 52 | * https://github.com/jupyter-incubator/sparkmagic 53 | 54 | ## Subprojects that have been incorporated into Project Jupyter 55 | 56 | Put links to the [jupyter](https://github.com/jupyter) GitHub repositories of all Subprojects that previously incubated here but have since been incorporated into the main Jupyter organization. 57 | 58 | * https://github.com/jupyter/enterprise_gateway 59 | * https://github.com/jupyter/nb2kg 60 | * https://github.com/jupyter/kernel_gateway 61 | * https://github.com/jupyter/dashboards 62 | * https://github.com/jupyter-widgets/traittypes 63 | * https://github.com/jupyter-attic/declarativewidgets 64 | -------------------------------------------------------------------------------- /contentmanagement/proposal.md: -------------------------------------------------------------------------------- 1 | # Proposal for Incubation 2 | 3 | ## Subproject name 4 | 5 | Jupyter Content Management Extensions (jupyter-incubator/contentmanagement) 6 | 7 | ## Development team and Advocate 8 | 9 | Subproject development team: 10 | 11 | * Justin Tyberg, IBM (`@jtyberg`) 12 | * Dan Gisolfi, IBM (`@vinomaster`) 13 | * Peter Parente, IBM (`@parente`) 14 | 15 | Steering Council Advocate: 16 | 17 | * Brian Granger (`@ellisonbg`) 18 | 19 | ## Subproject goals, scope and functionality 20 | 21 | ### Concept 22 | The Jupyter Notebook allows for the publishing, editing, organizing and deleting of content from a central web interface. Depending on an individual's workflow and collaboration requirements when working on notebook-related projects, the collection of content management features they desire may differ from that of other users.
23 | 24 | This incubation proposal pertains to the creation of a package of self-installable content management features that extend the baseline Jupyter Notebook. 25 | 26 | ### Goals 27 | * Establish a collection of content management extensions that can be installed to customize a user instance of a Jupyter Notebook. 28 | * Allow 1..n content management extensions to be combined to customize an instance of a Jupyter Notebook. 29 | 30 | ### Scope 31 | 32 | Refactor content management features that were used to build the [IBM Knowledge Anyhow Workbench](https://knowledgeanyhow.org) technology preview, and potentially graduate them to a Jupyter Subproject, or integrate them into the [Jupyter Interactive Notebook](https://github.com/jupyter/notebook) repo. 33 | 34 | ### Functionality 35 | 36 | As outlined in this [blog post](http://blog.ibmjstart.net/2015/08/20/jupyter-notebooks-content-management-contributions/), the intent of this proposal is to provide support for: 37 | 38 | * Including secondary notebooks by reference. 39 | * Table of contents and intra-notebook navigation. 40 | * Adding a search button on any or all of the user views (notebook, file-tree, text editor). 41 | * Downloading file bundles (.zips) from the Jupyter dashboard. 42 | * Importing any web-accessible notebook via an omnibox. 43 | * Sharing notebooks from one Jupyter instance to another via share links. 44 | * Uploading files easily from any Jupyter Notebook page. 45 | 46 | ## Audience 47 | 48 | Notebook power users who tend to require a broader range of organization, publication, and collaboration features. 49 | 50 | ## Other options 51 | 52 | While each of these potential content management features could be handled as individual pull requests against the baseline Jupyter Notebook, not all features would be desirable for all users.
This incubation proposal allows the community to (a) validate the applicability of individual features; (b) gather feedback on when/where various feature combinations are desirable; and (c) graduate popular features to the [Jupyter Interactive Notebook](https://github.com/jupyter/notebook). 53 | 54 | ## Integration with Project Jupyter 55 | In accordance with the [New Subproject Process](https://github.com/jupyter/governance/blob/master/newsubprojects.md#incubation-of-subprojects), this incubation proposal pertains to existing code that a vendor is willing to contribute, but it is not yet clear how the Subproject will integrate with the rest of Jupyter. 56 | 57 | During the incubation process, the content management extensions could be vetted by the community by establishing an experimental notebook as a turnkey Docker container in the [jupyter/docker-stacks repo](https://github.com/jupyter/docker-stacks) that combines the baseline [Jupyter Interactive Notebook](https://github.com/jupyter/notebook) with the incubation repo. 58 | 59 | A successful incubation process could yield the justification for the proposed extensions to be directly integrated via pull requests into the [Jupyter Interactive Notebook](https://github.com/jupyter/notebook). 60 | -------------------------------------------------------------------------------- /sparkmagic/proposal.md: -------------------------------------------------------------------------------- 1 | # Proposal for Incubation 2 | 3 | ## Subproject name 4 | 5 | sparkmagic 6 | 7 | ## Development team and Advocate 8 | 9 | Subproject development team: 10 | 11 | * Alejandro Guerrero Gonzalez, Microsoft (`@aggFTW`) 12 | * Sangeetha Shekar, Microsoft (`@sangeethashekar`) 13 | * Auberon Lopez, Cal Poly (`@alope107`) 14 | 15 | Steering Council Advocate: 16 | 17 | * Brian Granger (`@ellisonbg`) 18 | 19 | ## Subproject goals, scope and functionality 20 | 21 | Goals: 22 | * Provide the ability to connect from any IPython notebook to different remote Spark clusters.
23 | * Frictionless Spark usage from an IPython notebook without a local installation of Spark. 24 | * The functionality delivered will be pip-installable or conda-installable. 25 | 26 | Scope: 27 | * IPython magics to enable remote Spark code execution through [Livy](https://github.com/cloudera/hue/tree/master/apps/spark/java), a Spark REST endpoint, which allows for Scala, Python, and R support as of September 2015. 28 | * The project will create a Python Livy client that will be used by the magics. 29 | * The project will integrate the output of the Livy client (by creating [pandas](https://github.com/pydata/pandas) dataframes from it) with the rich visualization framework that is being proposed [LINK]. 30 | * This project takes a dependency on Livy. The team will work on Livy improvements required to support these scenarios. 31 | 32 | Functionality: 33 | * Enable users to connect to a remote Spark cluster running Livy from any IPython notebook to execute Scala, Pyspark, and R code. 34 | * Ability to reference custom libraries pre-installed in the remote Spark cluster. 35 | * Automatic rich visualization of Spark responses when appropriate. Look at [Zeppelin](https://zeppelin.incubator.apache.org/) for a vision of the functionality we want to enable for Spark users. 36 | * Return of pandas dataframes from Spark when appropriate to enable users to transform Spark results with the Python libraries available for pandas dataframes. These transformations will only be available in Python. 37 | 38 | Additional notes: 39 | * Livy will be installed on a remote Spark cluster. Livy will create Spark drivers that have full network access to the Spark master and the Spark worker nodes in the cluster. 40 | * Most data transformation will happen on the cluster, by sending Scala, Pyspark, or R code to it, where the Scala, Python, or R execution contexts will live. This will keep compute and data close together.
41 | * Livy JSON responses will be transformed to pandas dataframes when appropriate. These dataframes will be used to generate rich visualizations or to allow users to transform data returned by Spark locally. 42 | * The functionality enables remote Spark code execution on the cluster (e.g. a Spark cluster on Azure or AWS). Any input/output of data will happen on the filesystem enabled on the cluster and not on the local installation of Jupyter. This enables scenarios like Spark Streaming or ML. 43 | 44 | ## Audience 45 | 46 | Spark users who want to have the best experience possible from a Notebook. These users are familiar with the Jupyter ecosystem and the Python libraries available for IPython. 47 | 48 | ## Other options 49 | 50 | Alternatives we know of: 51 | 52 | * Combination of IPython, R kernel (or [rpy2](http://rpy.sourceforge.net/rpy2.html)), and Scala kernel for an in-Spark-cluster Jupyter installation. This does not allow the user to point to different Spark clusters. It might also result in resource contention (CPU or memory) between the Jupyter installation (kernels) and Spark. 53 | * [IBM's Spark kernel](https://github.com/ibm-et/spark-kernel) allows for the execution of Spark code on a local installation of Jupyter in the cluster. Automatic visualizations are not yet supported. 54 | * [sparklingpandas](https://github.com/sparklingpandas/sparklingpandas) builds on Spark's DataFrame class to give users a Pandas-like API. Remote connectivity is not in scope. The project covers Pyspark. 55 | 56 | ## Integration with Project Jupyter 57 | 58 | * These magics will be pip-installable and loadable from any Jupyter/IPython installation. 59 | * By virtue of returning pandas dataframes, the dataframes will be easily visualizable by using the library created by the automatic rich visualizations incubation subproject [LINK]. 
60 | * There are integration points that any remote Spark job submission story will have to think through, regardless of implementation choice. This project will aim to solve them as they arise. Some issues to work through: 61 | * Figure out the right amount of data from the result set to bring back to the client via the wire. Is it a sample or the top of the result set? 62 | * Spark Streaming. How do we expose the results endpoint? -------------------------------------------------------------------------------- /declarativewidgets/proposal.md: -------------------------------------------------------------------------------- 1 | # Proposal for Incubation 2 | 3 | ## Subproject name 4 | 5 | Jupyter Declarative Widget Extension (jupyter-incubator/declarativewidgets) 6 | 7 | ## Development team and Advocate 8 | 9 | Subproject development team: 10 | 11 | * Gino Bustelo, IBM (`@ginobustelo`) 12 | * Peter Parente, IBM (`@parente`) 13 | * Justin Tyberg, IBM (`@jtyberg`) 14 | * Dan Gisolfi, IBM (`@vinomaster`) 15 | 16 | Steering Council Advocate: 17 | 18 | * Brian Granger (`@ellisonbg`) 19 | 20 | ## Subproject goals, scope and functionality 21 | 22 | ### Concept 23 | 24 | >What if the ecosystem around Project Jupyter included a pervasive set of rich notebook user interfaces (a.k.a. widgets) using declarative markup that can interact with other code authored in the notebook? 25 | 26 | A very powerful feature of Jupyter Notebooks is the ability to mix interactive widgets along with your code and data. These widgets turn the notebook into more of a web application, allowing the author to create a User Interface (UI) to the code entered in the Notebook's cells. Rather than editing code and executing cells, the consumer of the Notebook can interact with the code by moving sliders, filling a form, pressing a button, etc.
27 | 28 | This incubation proposal pertains to the creation of a Jupyter extension that would allow developers to create a compendium of language-agnostic widgets that can support the Jupyter ecosystem of language backends. The extension would leverage Web Components and Polymer in combination with the existing IPywidgets libraries. 29 | 30 | ### Goals 31 | * Give the author access to a vast toolbox of components and a simple yet powerful authoring experience. 32 | * Enable quick and easy integration between the ecosystem of Web Components (e.g., widgets in the Polymer catalog) and the Jupyter Interactive Notebook. 33 | * Provide Polymer-based binding support to allow widgets to be aware of interactivity within the scope of the notebook namespace. 34 | * Minimize the effort to fully support interactive widgets on all Notebook-supported languages. 35 | * Instantiation of widgets within the notebook using HTML. 36 | * Achieve compatibility and portability between the ecosystem of widgets and the Jupyter Interactive Notebook, regardless of the language used in the Notebook (i.e. Python, R, Scala, etc.). 37 | 38 | ### Scope 39 | The objective of this incubation project is to establish a baseline prototype that begins to blur the lines between a notebook and an application while bootstrapping the ecosystem of notebook-compatible widgets. It will build on top of the foundation set by IPywidgets. 40 | 41 | This incubation effort can be scoped as: 42 | 43 | 1. Identification of the minimum essential pieces of IPyWidgets that need to exist in all language kernels. 44 | 2. Building a portable layer that can leverage an existing and ever-growing ecosystem of web widgets. 45 | 3. Providing a widget framework that works across the various and growing set of languages that can be hosted in a Jupyter Interactive Notebook.
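As a rough illustration of item 1 above — the minimal kernel-side piece that would need to exist in every language — consider serializing kernel data into a language-neutral state message that a browser-side declarative element could consume. All names and the message format here are hypothetical assumptions for the sketch, not the project's API:

```python
import json

def make_state_message(name, columns, rows):
    """Serialize tabular kernel data into a language-neutral state update.

    Any kernel language that can emit JSON in this shape could drive the
    same browser-side element; the format is illustrative only.
    """
    return json.dumps({
        "method": "update",
        "state": {"name": name, "columns": columns, "data": rows},
    })

msg = make_state_message("sales", ["region", "total"],
                         [["east", 120], ["west", 95]])
print(msg)
```

The same shape of message produced from R or Scala would drive the browser-side element identically, which is the portability property the scope items aim for.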
46 | 47 | ### Functionality 48 | 49 | As outlined in this [blog post](http://blog.ibmjstart.net/2015/08/21/declarative-widget-system-for-jupyter-notebooks/), the intent of this proposal is to provide support for: 50 | 51 | * A **function element** which will allow the running of a segment of code within the notebook as a side effect of a user's interactivity with UI elements. By using a function defined in a code cell, the author can connect other visual elements to set the function arguments, invoke it and display its result. 52 | * A **dataframe element** as a way to interact with a Pandas or a Spark DataFrame defined in the notebook. Using the dataframe element, the author can bind other visual elements to display the data held by a DataFrame. The dataframe element can be set to listen to changes to the DataFrame instance in the kernel and trigger notifications that change other elements that are bound to it. 53 | * Extensions to the data binding capabilities of Polymer to allow multiple cell output areas to share data and related events. Support for changing and watching data using the language of the kernel. 54 | * An element and related server extension to support importing new third-party elements into the Notebook. The import mechanism uses Bower to install new Polymer elements upon the user's request. 55 | 56 | ## Audience 57 | 58 | 1. Developers of rich interactive web widgets. 59 | 2. Notebook users who desire to inject rich interactive widgets into their notebooks. 60 | 3. Notebook users who wish to create interactive dashboards in their notebooks. 61 | 62 | ## Other options 63 | The [IPyWidgets](https://github.com/ipython/ipywidgets) project is the reference implementation of interactive widgets for the Python language. At its core, it defines a communication channel and protocol between the browser and the Python kernel. This channel is used to exchange messages between a Javascript and a Python representation of a widget. It keeps the state of the two in sync and triggers events.
When a user clicks a button in the browser, a message is sent to the Python representation of the button, which performs a preconfigured action. However, the IPyWidgets project is not architected to achieve the compatibility and portability goals associated with Jupyter language kernels. 64 | 65 | ## Integration with Project Jupyter 66 | In accordance with the [New Subproject Process](https://github.com/jupyter/governance/blob/master/newsubprojects.md#incubation-of-subprojects), this incubation proposal pertains to an idea that has yet to be vetted with the community. 67 | 68 | A successful incubation process could yield the justification for a new Jupyter Subproject for declarative widgets. 69 | -------------------------------------------------------------------------------- /enterprise_gateway/proposal.md: -------------------------------------------------------------------------------- 1 | # Proposal for Incubation 2 | 3 | ## Subproject name 4 | 5 | Jupyter Enterprise Gateway (`jupyter-incubator/enterprise_gateway`) 6 | 7 | ## Development team and Advocate 8 | 9 | * Luciano Resende, IBM (`@lresende`) 10 | * Kevin Bates, IBM (`@kevin-bates`) 11 | * Christian Kadner, IBM (`@ckadner`) 12 | * Kun Liu, IBM (`@liukun1016`) 13 | * Alan Chin, IBM (`@akchinSTC`) 14 | * Sherry Guo, IBM (`@sherryxg`) 15 | * Fred Reiss, IBM (`@frreiss`) 16 | * Sanjay Saxena (`@sanjay-saxena`) 17 | 18 | Steering Council Advocates: 19 | 20 | * Kyle Kelley (`@rgbkrk`) 21 | * Peter Parente (`@parente`) 22 | 23 | ## Subproject goals, scope and functionality 24 | 25 | ### Problem 26 | Founded in academia, the Jupyter projects provide a rich and highly popular set of tools for 27 | interacting with and iterating on large and complex applications. They have been truly ground-breaking.
28 | However, when we first attempted to build a Notebook service that could enable a large number of data 29 | scientists to run frequent and large workloads against a large Apache Spark cluster, we identified 30 | several requirements that were not yet available in the Jupyter open source ecosystem. We tried 31 | to use the Jupyter Kernel Gateway project, but we quickly realized that the JKG server became the 32 | bottleneck because the co-located Spark driver applications for these kinds of workloads (in this case, 33 | the kernel processes running on behalf of notebook cells) were extremely resource-intensive. Toss in a team 34 | or organization of data scientists, and you quickly saturate the compute resources of the Kernel Gateway 35 | server. 36 | 37 | ### Goals 38 | The primary goal behind Jupyter Enterprise Gateway is to address the gaps identified during our experience 39 | described above and provide a lightweight, multi-tenant gateway that enables many notebook users to share 40 | a single Spark cluster and run kernels as managed resources within the cluster (i.e. Yarn cluster mode). 41 | By providing tight integration with the resource manager used by the Spark cluster, we avoid resource starvation 42 | caused by multiple processes that are unaware of each other and compete for the same resources. We can also provide 43 | complex scheduling capabilities such as [capacity scheduling and priority queues](https://hadoop.apache.org/docs/r2.7.4/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html) 44 | with a very low development cost, thus satisfying enterprise administrators' manageability and security 45 | requirements. Although the Apache Yarn resource manager is supported immediately, we plan to support 46 | others, such as Spark on Kubernetes, in the near future by utilizing the existing pluggable resource manager framework. 47 | 48 | We strive not to alter any kernel code itself.
As a result, we've developed kernel launchers that essentially 49 | house the target kernel and serve as life-cycle managers for functionality like interrupt or, in some cases, 50 | kernel termination. 51 | 52 | ![System Diagram](system-diagram.jpg) 53 | 54 | ### Scope 55 | In general, Enterprise Gateway is intended to address the needs of enterprise and cloud administrators 56 | such that cluster resource utilization is optimized across all applications, including those of data scientists 57 | performing analysis via Notebooks residing on their desktops. 58 | 59 | ### Use Cases 60 | 61 | * As a data scientist, I want to attach a local Jupyter Notebook server to the enterprise/cloud shared compute cluster 62 | to run compute-intensive analytical workloads (e.g. a multi-tenant interactive gateway to an Apache Spark cluster). 63 | 64 | * As a Spark cluster administrator, I want to enable notebook users to utilize all the computing resources available 65 | in the enterprise/cloud compute cluster to run analytical workloads in a managed and secure way. 66 | 67 | * As an administrator, I want to enforce user isolation such that notebook kernel processes are protected from each 68 | other, allowing users to preserve and leverage their own environment, i.e. libraries and/or packages, data, etc. 69 | 70 | ## Audience 71 | 72 | Platform engineers and Apache Spark administrators who want to enable data scientists and other Jupyter 73 | Notebook users to connect to and share existing Spark cluster resources when executing their notebooks. 74 | 75 | 76 | ## Other options 77 | 78 | Enterprise Gateway has a very close affinity with Kernel Gateway and could be considered the next generation of JKG.
79 | As JKG is planning to move its functionality directly into Jupyter Notebook (see [JKG-259](https://github.com/jupyter/kernel_gateway/issues/259) 80 | and [JUPYTER-2644](https://github.com/jupyter/notebook/issues/2644)), and based on recommendations from some 81 | members of the JKG community, we have decided to bootstrap a new project (“Enterprise Gateway”). We expect that in 82 | the future, once the functionality has been moved out of JKG, Enterprise Gateway will have a direct dependency on the 83 | Notebook server. 84 | 85 | Other projects have tried to solve the 'remote kernel' problem (e.g. [remotekernel](https://github.com/danielballan/remotekernel) 86 | or [remote_ikernel](https://bitbucket.org/tdaff/remote_ikernel)), but these seem to provide a wrapper/proxy kernel 87 | that introduces a level of indirection that can cause side effects. In any case, they do not provide any integration 88 | with Spark and its resource manager. 89 | 90 | As for JupyterHub, we believe it is a full solution, providing capabilities to spawn and manage Notebook servers per 91 | user. In this scenario, JupyterHub will be independently consuming and managing cluster compute resources that 92 | compete directly with managed resources. This may lead to operational and management issues for cluster administrators. 93 | 94 | ## Integration with Project Jupyter 95 | 96 | Enterprise Gateway is based on the Jupyter Notebook stack, and directly extends Kernel Gateway, Notebook Server, 97 | and Jupyter Client. These components, including their transitive dependencies on other projects, are all used and 98 | required by Jupyter Enterprise Gateway.
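The kernel-launcher idea described in the Goals section can be sketched as a small life-cycle manager that starts a target kernel process and handles interrupt and termination on its behalf, without modifying the kernel code itself. This is a hypothetical illustration of the concept, not Enterprise Gateway's actual launcher API:

```python
import signal
import subprocess
import sys

class KernelLauncher:
    """Hypothetical sketch: wraps a kernel process and manages its life cycle."""

    def __init__(self, argv):
        # Launch the target kernel as a child process; in Enterprise Gateway
        # the launcher would run inside the cluster (e.g. a Yarn container).
        self.proc = subprocess.Popen(argv)

    def interrupt(self):
        # Forward an interrupt to the managed kernel (SIGINT on POSIX).
        self.proc.send_signal(signal.SIGINT)

    def shutdown(self, timeout=5):
        # Ask the kernel to stop, escalating to a kill if it does not exit.
        self.proc.terminate()
        try:
            return self.proc.wait(timeout)
        except subprocess.TimeoutExpired:
            self.proc.kill()
            return self.proc.wait()

# Example: "launch" a trivial stand-in for a kernel, then shut it down.
launcher = KernelLauncher([sys.executable, "-c", "import time; time.sleep(60)"])
rc = launcher.shutdown()
print("kernel exited with code", rc)
```

A real launcher would additionally relay the kernel's connection information back to the gateway and listen for out-of-band interrupt requests, since signals cannot be delivered directly across cluster nodes.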
99 | -------------------------------------------------------------------------------- /dashboards/proposal.md: -------------------------------------------------------------------------------- 1 | # Proposal for Incubation 2 | 3 | ## Subproject name 4 | 5 | Jupyter Dynamic Dashboards from Notebooks (jupyter-incubator/dashboards) 6 | 7 | ## Development team and Advocate 8 | 9 | Subproject development team: 10 | 11 | * Peter Parente, IBM (`@parente`) 12 | * Gino Bustelo, IBM (`@ginobustelo`) 13 | * Justin Tyberg, IBM (`@jtyberg`) 14 | * Dan Gisolfi, IBM (`@vinomaster`) 15 | 16 | Steering Council Advocate: 17 | 18 | * Brian Granger (`@ellisonbg`) 19 | 20 | ## Subproject goals, scope and functionality 21 | 22 | ### Concept 23 | 24 | > What if the notebook author could quickly and easily publish rich interactive dashboard applications that allow users to perform interactive visual data exploration? 25 | 26 | Today, notebook authors can share their notebooks for others to view and run in the Jupyter Notebook web application. Authors can also transform their notebooks into a variety of static formats for ease-of-viewing in common tools like web browsers or PDF readers. However, if the notebook author desired to generate a dashboard application that would interact with the insights derived in the notebook, she would have to step outside the authoring environment to build a traditional web application. Such a task involves a great deal of copy/pasting and rewriting code that already exists in the notebook. 27 | 28 | This incubation proposal pertains to a research sandbox associated with investigating ways of transforming notebooks directly into dashboards.
The proposed repo would be based on a proof-of-concept extension for Jupyter Notebook that builds upon existing community work like [Thebe](https://github.com/oreillymedia/thebe) and the [recent refactoring of the Jupyter Notebook](http://blog.jupyter.org/2015/04/15/the-big-split/) in order to demonstrate the concept of producing dynamic dashboards from notebooks. 29 | 30 | ### Goals 31 | 32 | * Provide within-notebook enhancements for laying out notebook cell outputs in a dashboard format, capturing layout metadata in the notebook document, and converting notebook documents into deployable web apps. 33 | * Lay the foundation for non-trivial infrastructure topics associated with the deployment of dynamic dashboards (e.g., security, kernel execution model, scalability, and widget ecosystem). 34 | 35 | ### Scope 36 | 37 | The objective of this incubation project is to establish a baseline prototype that demonstrates how notebook authors can publish dashboards and how application users can access and interact with dashboards. 38 | 39 | This incubation effort can be scoped as: 40 | 41 | 1. Extending the Jupyter Notebook user experience to enable drag/drop layout of cell outputs 42 | 2. Extending the Jupyter Notebook backend to enable the conversion of notebook documents to a new, web application format using nbconvert 43 | 3. 
Driving requirements for other components (e.g., [thebe](https://github.com/oreillymedia/thebe), [jupyter-js-services](https://github.com/jupyter/jupyter-js-services), [kernel gateway](https://github.com/jupyter-incubator/proposals/pull/3)) to enable robust operation of deployed dashboards 44 | 45 | ### Functionality 46 | 47 | As outlined in this [blog post](http://blog.ibmjstart.net/2015/08/22/dynamic-dashboards-from-jupyter-notebooks/), the intent of this proposal is to seed a repository with a proof-of-concept extension providing the following features: 48 | 49 | * **Dashboard Layout** where the notebook author can drag/drop and resize notebook cells in a grid layout to dynamically create the web application within Jupyter. 50 | * **Dashboard Transformation** where the notebook author can convert the notebook document into a web app that executes the same code as the original notebook and retains the desired application layout. Packaging of the web app assets includes: 51 | * Static web resources necessary for operation of the dashboard frontend 52 | * Configuration for directing the frontend to an appropriate kernel execution environment (e.g., [tmpnb](https://github.com/jupyter/tmpnb), [kernel gateway](https://github.com/jupyter-incubator/proposals/pull/3)) 53 | * Assets to ease deployment of the dashboard frontend using common cloud technologies (e.g., Dockerfile, Cloud Foundry manifest) 54 | 55 | After publishing the initial prototype, improvements will occur within the incubator project in order to mature the features for later integration into Jupyter Notebook or promotion as other mainline components.
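The layout-capture feature described above amounts to recording grid positions in per-cell notebook metadata, since a notebook document is plain JSON. A minimal sketch of what that could look like (the `dashboard` metadata key, its `row`/`col`/`width`/`height` fields, and the `plot_sales_by_region` helper are hypothetical illustrations, not an established schema):

```python
import json

# A Jupyter notebook is a JSON document; per-cell "metadata" is the natural
# place to record dashboard layout. The "dashboard" key and its grid fields
# below are assumptions for illustration only.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 2,
    "metadata": {},
    "cells": [
        {
            "cell_type": "code",
            "execution_count": None,
            "source": "plot_sales_by_region()",  # hypothetical helper
            "outputs": [],
            "metadata": {
                "dashboard": {"row": 0, "col": 0, "width": 6, "height": 2}
            },
        }
    ],
}

# A converter (e.g. a custom nbconvert exporter) could read the layout back
# and emit a grid position for each cell's output area:
for cell in notebook["cells"]:
    layout = cell["metadata"].get("dashboard")
    if layout:
        print("grid-area: {row} / {col} / span {height} / span {width}".format(**layout))

# Round-tripping through JSON shows the layout survives save/load:
restored = json.loads(json.dumps(notebook))
print(restored["cells"][0]["metadata"]["dashboard"]["width"])
```

Because the layout lives in metadata rather than in cell source, notebooks annotated this way remain fully usable in a stock Jupyter Notebook, which simply ignores keys it does not understand.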
56 | 57 | ## Audience 58 | 59 | * Notebook users who wish to switch to a dashboard view for easier interaction with widgets in their notebooks (e.g., during data exploration) 60 | * Notebook users who wish to lay out and present their results in a dashboard view 61 | * Notebook users who wish to share interactive dashboard views with other notebook users when passing around notebook documents 62 | * Non-notebook users who want to interact with findings from analyses done in notebooks without running Jupyter Notebook themselves 63 | * New Notebook users who wish to use it as a lightweight application composition environment 64 | 65 | ## Other options 66 | 67 | * Any number of commercial and open-source dashboarding applications exist (e.g., Tableau). Some have the ability to produce dashboards specifically from notebooks (e.g., Mathematica Cloud, Apache Zeppelin, Databricks). 68 | * The traditional approach of doing analysis in notebooks and then stepping outside to create a traditional web application is always an option. 69 | * One could use [thebe](https://github.com/oreillymedia/thebe) or an equivalent JS library to manually create a web application that uses Jupyter kernels for code execution. This project automates that otherwise manual effort. 70 | 71 | ## Integration with Project Jupyter 72 | In accordance with the [New Subproject Process](https://github.com/jupyter/governance/blob/master/newsubprojects.md#incubation-of-subprojects), this incubation proposal pertains to an idea that has yet to be vetted with the community. 73 | 74 | A successful incubation process should provide justification for the inclusion of the dashboard layout and conversion features into the baseline [Jupyter Interactive Notebook](https://github.com/jupyter/notebook). In addition, successful incubation should drive the creation of new Jupyter components that enable the robust execution of deployed notebook-dashboards, as well as other next-generation Jupyter client applications.
-------------------------------------------------------------------------------- /kernel_gateway/proposal.md: -------------------------------------------------------------------------------- 1 | # Proposal for Incubation 2 | 3 | ## Subproject name 4 | 5 | Jupyter Kernel Gateway (`jupyter-incubator/kernel_gateway`) 6 | 7 | ## Development team and Advocate 8 | 9 | * Peter Parente, IBM (`@parente`) 10 | * Dan Gisolfi, IBM (`@vinomaster`) 11 | * Justin Tyberg, IBM (`@jtyberg`) 12 | * Gino Bustelo, IBM (`@lbustelo`) 13 | 14 | Steering Council Advocate: 15 | 16 | * Kyle Kelley (`@rgbkrk`) 17 | 18 | ## Subproject goals, scope and functionality 19 | 20 | ### Problem 21 | 22 | Applications that use Jupyter kernels as execution engines outside of the traditional notebook/console user experience have started appearing on the web (e.g., [pyxie](https://github.com/oreillymedia/pyxie-static), [pyxie kernel server](https://github.com/oreillymedia/ipython-kernel), [Thebe](https://github.com/oreillymedia/thebe), [gist exec](https://github.com/rgbkrk/gistexec), [notebooks-to-dashboards](http://blog.ibmjstart.net/2015/08/22/dynamic-dashboards-from-jupyter-notebooks/)). Today, these projects rely on: 23 | 24 | 1. A _client_ (e.g., Thebe) that includes JavaScript code from Jupyter Notebook to request and communicate with kernels 25 | 26 | 2. A _spawner_ (e.g., tmpnb) that provisions gateway servers to handle client kernel requests 27 | 28 | 3. A _gateway_ (e.g., the entirety of Jupyter Notebook) that accepts [CRUD](https://en.wikipedia.org/wiki/Create,_read,_update_and_delete) requests for kernels, isolates kernel workspaces (e.g., via Docker), and proxies web-friendly protocols (e.g., Websocket) to the kernel protocol (0mq) 29 | 30 | Maturing the Jupyter stack so that these efforts (and future ones) can find robust, common ground will require improvements in each of the above areas.
Work is already underway to define a JavaScript library for communication with kernels and kernel provisioners (i.e., [jupyter/jupyter-js-services](https://github.com/jupyter/jupyter-js-services)). Discussion has started about defining an API for provisioning notebook servers as well ([binder-project/binder#8](https://github.com/binder-project/binder/issues/8)), a topic that touches on the spawner concept above. This proposal focuses on the third area, the concept of a standard kernel provisioning and communication API. 31 | 32 | ### Goals 33 | 34 | The goal of this incubator project is to **prototype and evaluate** possible solutions that satisfy the rising, novel kernel uses above, particularly with regard to the driving use cases documented below. The design and code of this incubator may become new Jupyter components, be folded into other relevant efforts underway, or be discarded if better options arise as the evaluation proceeds. At present, we do not know the **correct** design and implementation, but we believe there is value in trying the following: 35 | 36 | 1. Using jupyter_client, jupyter_core, and pieces of jupyter/notebook (e.g., MappingKernelManager) to construct a headless kernel gateway that can talk to a cluster manager (e.g., Mesos). 37 | 2. Implementing a websocket-to-0mq bridge that can be placed in any Docker container that already runs a kernel, to allow web-friendly access to that kernel. 38 | 3. Adding a new jupyter_client.WebsocketKernelManager that can be plugged into Jupyter Notebook or consumed by other tools to talk to kernels fronted by a websocket-to-0mq bridge. (See Use Case #3 below.) 39 | 40 | ### Use Cases 41 | 42 | #### Use Case #1: Simple, Static Web Apps w/ Modifiable Code 43 | 44 | Alice has a scientific computing blog. In her posts, Alice often includes snippets of code demonstrating concepts in languages like R and Python. She sometimes writes these snippets inline in Markdown code blocks.
Other times, she embeds GitHub gists containing her code. To date, her readers can view these snippets on her blog, clone her gists to edit them, and copy/paste other code for their own purposes. 45 | 46 | Having heard about Thebe and gist exec, Alice is interested in making her code snippets executable and editable by her readers. She adds a bit of JavaScript code on her blog to include the Thebe JS library, and turn her code blocks into edit areas with Run buttons. She also configures Thebe to contact a publicly addressable Jupyter Kernel Gateway (hosted by the good graces of the community and Rackspace ;) as the execution environment for her code. 47 | 48 | When Bob visits Alice's blog, his browser loads the markup and JS for her latest post. The JS contacts the configured kernel gateway to request a new kernel. The gateway provisions a kernel in its compute cluster and sends information about the new kernel instance back to the requesting in-browser JS client. Most importantly, this response contains information about a Websocket endpoint on the kernel gateway to which the client can establish a connection for communication with the kernel. Thereafter, the gateway acts as a Websocket-to-0mq proxy for communication between Bob's browser and the kernel until Bob leaves the page and the kernel eventually shuts down. 49 | 50 | ![](https://hackpad-attachments.s3.amazonaws.com/jupyter.hackpad.com_sZx2qqNHnY1_p.454990_1440709697802_undefined) 51 | 52 | Note that this use case is not much different from the current Thebe and gist exec sample applications; it simply serves to formalize the APIs and components used for the additional use cases stated next. 53 | 54 | #### Use Case #2: Notebooks Converted to Standalone Dashboard Applications 55 | 56 | Cathy uses Jupyter Notebook in her role as a data scientist at ACME Corp. She writes notebooks to munge data, create models, evaluate models, visualize results, and generate reports for internal stakeholders. 
Sometimes, her work informs the creation of dashboard web applications that allow other users to explore and interact with her results. In these scenarios, Cathy is faced with the task of rewriting her notebook(s) in the form of a traditional web application. Cathy would love to avoid this costly rewrite step. 57 | 58 | One day, Cathy deploys a Jupyter Kernel Gateway to the same compute cluster where she authors her Jupyter notebooks. The next time she needs to build a web app, she creates a new notebook that includes interactive widgets, uses Jupyter extensions to position the widgets in a dashboard-like layout, and transforms the notebook into a standalone NodeJS web app. Cathy deploys this web app to ACME Corp's internal web hosting platform, and configures it with the URL and credentials for the kernel gateway on her compute cluster. 59 | 60 | Cathy sends the URL of her running dashboard to David, a colleague from the ACME marketing department. When David visits the URL, the application prompts for his intranet credentials. After login, his browser loads the markup and JS for the frontend of the dashboard web app. In contrast to the open-access blog post example in the prior use case, the JavaScript in David's browser does not contact the kernel gateway directly. It does not contain the credentials to do so, and it does not contain any code from the original notebook. Instead, to limit David's control over kernels on the compute cluster, the JS in David's browser only communicates with the dashboard app NodeJS backend. The dashboard server requests the kernel, sends code to the kernel for execution upon David's interaction with the frontend, and proxies the responses back to the frontend JS for display to David. Throughout all this interaction, the kernel gateway behaves in the same manner as in the previous example: it provisions kernels and proxies Websocket-to-0mq connections for all dashboard users. 
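In both use cases, the Websocket-to-0mq proxying the gateway performs boils down to re-framing JSON messages into the Jupyter wire protocol's signed multipart format: identities, a `<IDS|MSG>` delimiter, an HMAC-SHA256 signature, then the serialized header, parent header, metadata, and content. A minimal sketch of the inbound translation (the function names and sample key are illustrative; a real bridge also handles 0mq routing identities and the reverse direction):

```python
import hashlib
import hmac
import json
import uuid
from datetime import datetime, timezone

DELIM = b"<IDS|MSG>"

def sign(key, parts):
    """HMAC-SHA256 signature over the serialized message parts,
    as required by the Jupyter wire protocol."""
    mac = hmac.new(key, digestmod=hashlib.sha256)
    for p in parts:
        mac.update(p)
    return mac.hexdigest().encode("ascii")

def ws_to_zmq(key, ws_frame):
    """Translate one JSON message received over the Websocket into the
    multipart frame list a kernel's 0mq shell socket expects."""
    msg = json.loads(ws_frame)
    parts = [
        json.dumps(msg["header"]).encode("utf-8"),
        json.dumps(msg.get("parent_header", {})).encode("utf-8"),
        json.dumps(msg.get("metadata", {})).encode("utf-8"),
        json.dumps(msg.get("content", {})).encode("utf-8"),
    ]
    return [DELIM, sign(key, parts)] + parts

# Example: an execute_request as a browser-side client might send it.
request = json.dumps({
    "header": {
        "msg_id": uuid.uuid4().hex,
        "username": "bob",
        "session": uuid.uuid4().hex,
        "msg_type": "execute_request",
        "version": "5.0",
        "date": datetime.now(timezone.utc).isoformat(),
    },
    "content": {"code": "1 + 1", "silent": False},
})

frames = ws_to_zmq(b"secret-kernel-key", request)
print(len(frames))  # 6 frames: delimiter, signature, and the four JSON parts
```

Because the signature is computed gateway-side with the kernel's connection key, browser clients never need to hold that key, which is part of what makes the gateway a useful trust boundary in the scenarios above.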
61 | 62 | ![](https://hackpad-attachments.s3.amazonaws.com/jupyter.hackpad.com_sZx2qqNHnY1_p.454990_1440679339192_undefined) 63 | 64 | #### Use Case #3: Notebook Authoring Separated from Kernel/Compute Clusters 65 | 66 | Erin is a Jupyter Notebook and Spark user. She would like to pay a third-party provider for a hosted compute plus data storage solution, and drive Spark jobs on it using her local Jupyter Notebook server as the authoring environment. When Erin decides to convert some of her notebooks to dashboards, she also wants those deployed dashboards to use her compute provider to avoid having to move her data around. 67 | 68 | In a bright and not-so-distant future, Erin chooses a Spark provider that offers a Jupyter kernel service API. Her provider allows Erin to launch and communicate with Jupyter kernels via Websockets. The kernels run in containers (kernel-stacks) that are pre-configured with Spark and typical scientific computing libraries. Erin points her local Jupyter Notebook server to her provider's kernel service API. When Erin launches a new notebook locally, her Notebook server does the work of requesting a kernel from the kernel service API, and establishing a Websocket connection to the provider's kernel gateway, which proxies her commands to her running kernel via 0mq. 69 | 70 | When Erin converts one of her notebooks to a dashboard, she supplies credentials for the dashboard server to access her kernel provider. When users visit Erin's dashboard, the dashboard server contacts the kernel provider to manage the lifecycle and communication with kernels. 71 | 72 | When Frank, a colleague of Erin, learns about her great setup, he asks to share her compute provider account and make it a team account. Happy to help, Erin does so.
Frank then spins up a VM in his current cloud provider, runs a Jupyter Notebook server on it, and points it at the kernel gateway running in Erin's hosted environment in the same manner Erin did with her local Jupyter instance. 73 | 74 | ![](https://hackpad-attachments.s3.amazonaws.com/jupyter.hackpad.com_sZx2qqNHnY1_p.454990_1440679897295_undefined) 75 | 76 | **N.B.:** The key difference in this scenario versus what exists in Jupyter Notebook today lies in the fact that the Jupyter Notebook server is no longer talking to kernels via 0mq. Rather, the Notebook **server** is a Websocket client itself, much like the Notebook frontend, and communicates with kernels via a kernel gateway. This setup makes it possible to run the Notebook web application outside of the compute cluster, across the web if need be. Of course, realizing it would require new remote kernel provisioning and Websocket client code paths within the Jupyter Notebook Python code. 77 | 78 | ## Audience 79 | 80 | * Jupyter Notebook users who want to run their notebook server remotely from their kernel compute cluster 81 | * Cloud providers who want to provide remote access to kernel compute services to clients other than Jupyter Notebook (e.g., dashboards) 82 | * Application developers who want to create new tools and systems that leverage the use of kernels 83 | 84 | ## Other options 85 | 86 | Other than the proofs of concept mentioned at the start of this proposal (e.g., tmpnb used headlessly from Thebe/gistexec), there are no other clear options for enabling the use cases described above. Other up-and-coming projects (e.g., mybinder.org) may begin to improve upon these existing proofs of concept, but they are not, at the moment, designed specifically to address the use cases outlined in this proposal.
87 | 88 | ## Integration with Project Jupyter 89 | 90 | This incubation effort should help Jupyter developers make informed decisions about future refactoring, reimplementation, and extension efforts with respect to kernel provisioning and access (e.g., jupyter-js-services). As mentioned above, if code assets produced from this exploratory incubation effort have merit, they should be promoted to full Jupyter projects and maintained as such (e.g., jupyter/kernel-gateway). 91 | --------------------------------------------------------------------------------