├── images
    ├── 5-structure.png
    ├── 1-our-offering.png
    ├── 3-training-sessions.png
    ├── 4-community-driven.png
    ├── 6-innovation-cycle.png
    ├── 2-example-map-heart-disease.png
    └── 2-overview-digital-education.png
├── blaze-proposal
    ├── gantt-color.png
    ├── logo-blaze.png
    ├── video_screenshot.png
    ├── work-plan_color.png
    ├── blaze-example-zika.png
    ├── blaze-example-zoom-details.png
    ├── architecture.md
    ├── use-case.md
    └── proposal.md
├── README.md
└── roadmap.md


/images/5-structure.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/OpenKnowledgeMaps/open-discovery/HEAD/images/5-structure.png


--------------------------------------------------------------------------------
/images/1-our-offering.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/OpenKnowledgeMaps/open-discovery/HEAD/images/1-our-offering.png


--------------------------------------------------------------------------------
/blaze-proposal/gantt-color.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/OpenKnowledgeMaps/open-discovery/HEAD/blaze-proposal/gantt-color.png


--------------------------------------------------------------------------------
/blaze-proposal/logo-blaze.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/OpenKnowledgeMaps/open-discovery/HEAD/blaze-proposal/logo-blaze.png


--------------------------------------------------------------------------------
/images/3-training-sessions.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/OpenKnowledgeMaps/open-discovery/HEAD/images/3-training-sessions.png


--------------------------------------------------------------------------------
/images/4-community-driven.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/OpenKnowledgeMaps/open-discovery/HEAD/images/4-community-driven.png


--------------------------------------------------------------------------------
/images/6-innovation-cycle.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/OpenKnowledgeMaps/open-discovery/HEAD/images/6-innovation-cycle.png


--------------------------------------------------------------------------------
/blaze-proposal/video_screenshot.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/OpenKnowledgeMaps/open-discovery/HEAD/blaze-proposal/video_screenshot.png


--------------------------------------------------------------------------------
/blaze-proposal/work-plan_color.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/OpenKnowledgeMaps/open-discovery/HEAD/blaze-proposal/work-plan_color.png


--------------------------------------------------------------------------------
/blaze-proposal/blaze-example-zika.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/OpenKnowledgeMaps/open-discovery/HEAD/blaze-proposal/blaze-example-zika.png


--------------------------------------------------------------------------------
/images/2-example-map-heart-disease.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/OpenKnowledgeMaps/open-discovery/HEAD/images/2-example-map-heart-disease.png


--------------------------------------------------------------------------------
/images/2-overview-digital-education.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/OpenKnowledgeMaps/open-discovery/HEAD/images/2-overview-digital-education.png


--------------------------------------------------------------------------------
/blaze-proposal/blaze-example-zoom-details.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/OpenKnowledgeMaps/open-discovery/HEAD/blaze-proposal/blaze-example-zoom-details.png


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # open-discovery
2 | This repository is intended to openly develop a roadmap for [Open Knowledge Maps](http://openknowledgemaps.org). Our goal is to revolutionize discovery with a visual interface that dramatically increases the visibility of research results for science and society alike. We are building this interface to the world’s scientific knowledge as a public good.
3 | 
4 | ## License
5 | <img src="http://mirrors.creativecommons.org/presskit/buttons/88x31/png/by.png" width=100>
6 | 
7 | Unless otherwise noted, all content is licensed under a [Creative Commons Attribution 4.0 International License](https://creativecommons.org/licenses/by/4.0/).
8 | 
9 | 


--------------------------------------------------------------------------------
/blaze-proposal/architecture.md:
--------------------------------------------------------------------------------
 1 | # Architecture
 2 | ## Content & Metadata Aggregation
 3 | 
 4 | **Description:** A JSON REST API or GraphQL endpoint which wraps various content and metadata sources, such as Scott's [rOpenSci fulltext library](https://github.com/ropensci/fulltext).
 5 | 
 6 | **URL:** https://api.archivelab.org/scholar
 7 | 
 8 | **Stack:** Python, Flask, Postgres, SQLAlchemy, R
 9 | 
10 | **Github:** https://github.com/ArchiveLabs/scholar.archivelab.org
11 | 
12 | ## BLAZE Backend + Database
13 | 
14 | **Description:** A REST API to a database of user accounts, preferences, and curated topic maps
15 | 
16 | **URL:** https://api.openknowledgemaps.org
17 | 
18 | **Stack:** PHP, MySQL
19 | 
20 | **Github:** https://github.com/pkraker/BlazeServer (to be created)
21 | 
22 | ## BLAZE Frontend
23 | 
24 | **Description:** A standalone, serverless SPA (single page app)
25 | 
26 | **URL:** https://openknowledgemaps.org
27 | 
28 | **Stack:** React, D3.js, jQuery
29 | 
30 | **Github:** https://github.com/pkraker/Blaze (to be created)
31 | 
32 | 
33 | **github:** https://github.com/pkraker/Headstart
34 | 
35 | **spec:** pending approval
36 | 


--------------------------------------------------------------------------------
/blaze-proposal/use-case.md:
--------------------------------------------------------------------------------
 1 | # Main Use Case #
 2 | 
 3 | Sarah is a first-year PhD student in biomedicine, writing her thesis on the zika virus. First, she needs to catch up with the literature and build a mental model of the field. With thousands of papers already published on the topic and new ones emerging every day, this is a daunting task.
 4 | 
 5 | *For many researchers, the starting point in their quest to conquer an unfamiliar knowledge domain is to turn to their personal favourite search engine, type in the name of the field of interest and start reading at the top of the list. Once you have read through the first few articles (usually highly cited review articles), and followed relevant references, you develop an idea of important journals and authors in the field and adapt your search strategy accordingly. With time and patience, a researcher can thus build a mental model of a field.*
 6 | 
 7 | *The problem with this strategy is that it can take weeks, if not months, before this mental model emerges. Indeed, in many PhD programs, the first year is devoted to this process. There is also a lot of reading and summarizing involved, but searching for relevant literature usually accounts for a large chunk of the time. Another problem is that the results of the discovery process are usually not shared; they become visible only later as reference lists or reading lists, but again with very little context. Therefore, the discovery process is repeated over and over again, taking away a lot of time and resources from the actual research.*
 8 | 
 9 | And that’s where Blaze comes in. Blaze is a collaborative discovery tool that provides a visual representation of this mental model. Sarah types “zika”, her topic of interest, into Blaze.
10 | 
11 | Blaze now searches the literature using the rOpenSci package “fulltext”, covering PubMed Central, PLOS, BioMed Central, arXiv, bioRxiv and Crossref. Blaze also includes other types of research outputs such as data sets, source code and images via archives like figshare, thus leveraging the variety of resources that open science provides. It then uses a variety of measures derived from fulltext, metadata and usage data to infer similarities between the most relevant items. Blaze uses a variety of clustering and ordination techniques to create a map of the results that shows items clustered by sub-topic. Using content mining APIs such as OpenAIRE and ContentMine, Blaze adds additional facts to each paper, e.g. the species that are mentioned in each paper.
12 | 
13 | Sarah is presented with an interactive map for this field. Blue bubbles represent the main research areas; the closer two bubbles, the closer they are subject-wise. This gives Sarah an overview of the main areas within the field and an indication of how they are related. Thus, she can quickly get a mental model of the field.
14 | 
15 | Now Sarah starts exploring the map. Blaze uses the well-tested approach of “overview-first, zoom and filter, details-on-demand”, employing an interaction model that she knows from her smartphone. Once she clicks on a research area, she is presented with relevant papers in that area. Sarah can see the metadata of each paper and even read full papers within the same interface.
16 | 
17 | In the course of her exploration, Sarah identifies a number of articles that warrant their own area. So she goes into edit mode. She adds a new area and drags the papers she found into the newly created bubble. She adds a title and places the area on the map.
18 | 
19 | *Algorithms are not perfect; therefore, Blaze offers an edit mode to change maps.*
20 | 
21 | Sarah is interrupted by a message from her supervisor Lauren. Lauren suggests a list of papers related to the zika virus that she has added to their joint Zotero group. Sarah connects Blaze to her Zotero account and imports the papers into her map. Blaze automatically places the new papers on the map and creates additional areas when needed.
22 | 
23 | *Blaze connects to the open science infrastructure, including Zotero and the Open Science Framework.*
24 | 
25 | Sarah is done for the day. So she publishes her map for other users to explore and modify on Blaze. She tweets the link: "Hey #biomed community, check out my overview map on the #zika virus: https://t.co/40FJMC3Ez6. Would love to get feedback #phdchat"
26 | 
27 | *The maps themselves are open. Thus, apart from linking to a map, they can also be embedded on other websites. In addition, the underlying data including the structuring information can be exported in various formats.*
28 | 
29 | The next day, Sarah fires up her e-mail to see that she has received several notifications on her map. She sees that fellow PhD student Amar has added several papers to her map. 
30 | 
31 | *The history will be preserved to show the individual contributions of users.*
32 | 
33 | She also notices that Tom, who is working on a map on Aedes (the genus of mosquito that transmits the zika virus), has included her map as a sub-map of his.
34 | 
35 | *Using a combination of automated features and collaboration, Blaze aims to build a comprehensive, layered map of biomedical research. This will save thousands of hours that are currently lost in discovery by biomedical researchers each day, and make researchers aware of resources that they would not have found using traditional means of discovery. It will thus make biomedical research more effective and efficient.*
36 | 


--------------------------------------------------------------------------------
/blaze-proposal/proposal.md:
--------------------------------------------------------------------------------
  1 | ![Blaze Logo](images/logo-blaze.png "Blaze Logo")
  2 | 
  3 | &nbsp;
  4 | 
  5 | # BLAZE: The Comprehensive Open Science Discovery Tool
  6 | 
  7 | &nbsp;
  8 | 
  9 | ### A roadmap for developing a visual interface to the world's scientific knowledge.
 10 | 
 11 | &nbsp;
 12 | 
 13 | ##### Authors:
 14 | Peter Kraker, Mike Skaug, Scott Chamberlain, Maxi Schramm, Michael Karpeles, Omiros Metaxas, Asura Enkhbayar & Björn Brembs
 15 | 
 16 | &nbsp;
 17 | 
 18 | &nbsp;
 19 | 
 20 | <img src="http://mirrors.creativecommons.org/presskit/buttons/88x31/png/by.png" width=100>
 21 | 
 22 | *Open science, all the way - the proposal and the supplementary files are hosted on Github: [http://github.com/pkraker/open-discovery](http://github.com/pkraker/open-discovery)*
 23 | 
 24 | &nbsp;
 25 | 
 26 | ## Motivation
 27 | Discovery is an essential task for every researcher, especially in dynamic research fields such as biomedicine. For example, researchers often need to get an overview of a research field (e.g. at the beginning of their PhD, or later on when venturing into a neighbouring field) or a certain topic (e.g. when writing the state-of-the-art for a project proposal).
 28 | 
 29 | Currently, only a limited number of discovery tools for scientific content can be used by a mainstream audience. Most researchers still rely on scholarly search engines, which satisfy some information needs but are poor discovery tools. Search engines present resources in a linear, one-dimensional way, making it necessary to sift through every item in the list. They provide little context apart from basic metadata, which makes it hard to infer the topical structure of the content. Most scholarly search engines are also not suited to the new open science paradigm, in which not only the written results are published, but also data, source code, images etc., because they typically ignore non-publication resources like GenBank and other NCBI databases.
 30 | 
 31 | Another problem is that the results of the discovery process are usually not shared; they become visible only later as references in a publication or reading lists, but again with very little context and structure. Therefore, the discovery process is repeated over and over again by different researchers because it lacks the collaborative efficiencies that have become the norm in the open science environment, and researchers' time and resources are wasted.
 32 | 
 33 | To overcome these problems with the traditional, closed discovery process, we propose BLAZE, the collaborative discovery tool for open science.
 34 | 
 35 | ## Overview of BLAZE
 36 | BLAZE goes far beyond the functionality of search engines and social reading lists in order to meet the discovery needs of biomedical researchers and students. BLAZE leverages the digital open science ecosystem to provide topical maps of knowledge domains, including not only peer-reviewed literature, but also datasets, presentations, source code, project proposals and media files. The knowledge maps are created automatically using algorithms based on open content to calculate similarities among research content and to derive topical structures. The map visualization reveals relationships between content that are typically hidden in the one-dimensional list returned by a search engine.
 37 | 
 38 | BLAZE provides rich context for the discovered content by associating additional information with each resource. For example, the resources are enriched with open (alt)metrics data to indicate the popularity of items, and additional facts extracted from open content, like specific species or genes, can be used to uncover hidden connections between the resources.
 39 | 
 40 | Another key feature of the BLAZE tool is that the automatically generated knowledge maps are not static like search engine results. Instead, the maps become a living and crowd-sourced guide to research fields. Researchers can explore, edit and share the maps from a single intuitive interface. For example, a user could explore different topical areas, filter the content based on different metrics, and view individual resources including full text in the same browser window. Users can even modify and annotate the maps and introduce new papers and topical areas. From there, maps are shared on [openknowledgemaps.org](http://openknowledgemaps.org) where they can be extended by other researchers – collaboratively creating layered maps of research fields. The collaboration history of a map is preserved and the maps themselves are open, so users can embed them on their own websites or in open lab books, and export the structure in various open formats into other tools (e.g. Zotero, Open Science Framework).
 41 | 
 42 | We believe that making the discovery process open and visible will have many advantages. Researchers will be able to reuse previously created maps, saving valuable time and effort. They will also be able to identify other researchers that work in the same area, highlighting potential collaborators long before the research is usually communicated. BLAZE will also improve the meta-dialogue in and across disciplines by making the structure and the vocabulary of domains explicit.
 43 | 
 44 | BLAZE will furthermore show the enormous potential of open science for innovation in scholarly communication and discovery. In addition, we believe that this tool will increase the visibility and awareness of open content.
 45 | 
 46 | ## Main Use Case
 47 | Sarah is a first-year PhD student in biomedicine, starting her PhD project on the zika virus. First, she needs to catch up with the literature and build a mental model of the field. 
 48 | 
 49 | Sarah types "zika", her topic of interest, into BLAZE. Sarah is presented with an interactive map for this field. Blue bubbles represent the main research areas; the closer two bubbles, the closer they are subject-wise. This gives Sarah an overview of the main areas within the field and gives her a mental model of the field.
 50 | 
 51 | ![Visualization](images/blaze-example-zika.png "Exemplary visualization of zika")
 52 | 
 53 | Sarah starts exploring the map using a simple intuitive interaction model. Once she clicks on a research area, she is presented with relevant content in that area. Sarah can see associated content and even read full papers within the same interface.
 54 | 
 55 | ![Visualization Details](images/blaze-example-zoom-details.png "Zoom, then details-on-demand")
 56 | 
 57 | Sarah identifies a number of articles that warrant their own area. So she goes into edit mode. She adds a new area and drags the papers she found into the newly created bubble. She adds a title and places the area on the map.
 58 | 
 59 | Sarah is interrupted by a message from her supervisor Lauren. Lauren suggests a presentation related to the zika virus that she has added to their joint Zotero group. Sarah connects Blaze to her Zotero account and imports the presentation into her map. Blaze automatically places the new content on the map and creates additional areas when needed.
 60 | 
 61 | Sarah is done for the day, so she publishes her map for other users to explore and modify on Blaze. She tweets the link: "Hey #biomed community, check out my overview map on the #zika virus: [https://t.co/40FJMC3Ez6](https://t.co/40FJMC3Ez6). Would love to get feedback #phdchat"
 62 | 
 63 | The next day, Sarah fires up her e-mail to see that she has received several notifications on her map. She sees that fellow PhD student Amar has added several papers to her map.
 64 | She also notices that Tom, who is working on a map on Aedes (the genus of mosquito that transmits the zika virus), has included her map as a sub-map of his.
 65 | 
 66 | Sarah now has a better mental model of her research field and the open resources that will assist her in advancing research on the Zika virus.
 67 | 
 68 | ## Implementation
 69 | ### Current Knowledge Mapping Software "Headstart"
 70 | There is existing knowledge mapping software for BLAZE (see [http://openknowledgemaps.org/search](http://openknowledgemaps.org/search)). The software is able to create an interactive visualization based on a PLOS search result. The software has a JavaScript frontend built with [D3.js](http://d3js.org). Blue bubbles represent the main research areas; the closer two bubbles, the closer they are subject-wise. The size of the bubbles signifies the relative importance of that area (by number of downloads, clicks, readers etc.). Each area contains a number of relevant resources. A dropdown on the right displays the contents' metadata in list form. This gives users an overview of the main areas within the field and gives them a mental model of the field.
 71 | 
 72 | In terms of the interaction design, the visualization follows [Shneiderman’s well-tested approach](http://drum.lib.umd.edu/bitstream/handle/1903/466/CS-TR-3665.pdf) of "overview first, zoom and filter, then details-on-demand". Once you click on a bubble, the visualization zooms in and you are presented with the relevant publications in that area. When clicking on a resource, its metadata and abstract are presented in the right-hand pane. The full text can be retrieved by clicking on the thumbnail in the metadata panel. By clicking on the white background, users can zoom out and inspect another area. In addition, a user can filter the publications by entering terms in the search field on top of the list. Only publications that contain all of the search terms are displayed within the bubbles and the list. The list can also be sorted by title, area, and number of readers to facilitate exploration.
 73 | 
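The "contain all of the search terms" filter described above can be sketched as follows. This is an illustrative Python sketch only, not Headstart's actual frontend code (which is JavaScript); the function names and the `title`/`abstract` fields are assumptions made for the example.

```python
def matches_all_terms(text, query):
    """Return True if every whitespace-separated query term occurs in text.

    Matching is case-insensitive substring matching (an assumption;
    the real tool may tokenize differently).
    """
    haystack = text.lower()
    return all(term in haystack for term in query.lower().split())


def filter_publications(publications, query):
    """Keep only publications whose title or abstract contains all terms.

    `publications` is a list of dicts with hypothetical "title" and
    "abstract" keys.
    """
    return [
        pub for pub in publications
        if matches_all_terms(pub["title"] + " " + pub["abstract"], query)
    ]


papers = [
    {"title": "Zika virus in Brazil", "abstract": "Outbreak report."},
    {"title": "Aedes mosquito control", "abstract": "Vector of the zika virus."},
    {"title": "Heart disease overview", "abstract": "Cardiology review."},
]
zika_mosquito = filter_publications(papers, "zika mosquito")
```

An empty query keeps every publication, because there is no term that could fail to match.
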
 74 | The backend of the visualization is written in PHP and R. A preprocessing component is responsible for creating the data for the visualization. It connects to the [PLOS API](http://api.plos.org/) via the [rplos](https://ropensci.org/tutorials/rplos_tutorial.html) package to retrieve metadata and fulltext. It then proceeds to calculate cosine similarity between papers based on a term-document matrix using the [R tm package](https://cran.r-project.org/web/packages/tm/index.html). Based on the similarity matrix, the spatial representation and the sub-areas are calculated using ordination and clustering techniques. A naming component finally determines the label for each cluster using keywords. After processing, the representation is saved in a SQLite database. The source code for the knowledge mapping software is hosted on Github: [http://github.com/pkraker/Headstart](http://github.com/pkraker/Headstart)
 75 | 
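To illustrate the similarity step, here is a minimal sketch of cosine similarity computed from raw term counts. It is written in plain Python for readability and is not the Headstart implementation, which uses R and the tm package; the function names are hypothetical, and the whitespace tokenization without weighting or stop-word removal is a simplifying assumption.

```python
import math
from collections import Counter


def term_vector(text):
    """Count term occurrences in one document (one column of a term-document matrix)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two term-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


def similarity_matrix(documents):
    """Pairwise cosine similarities between all documents."""
    vectors = [term_vector(doc) for doc in documents]
    return [[cosine_similarity(u, v) for v in vectors] for u in vectors]


docs = ["zika virus", "zika mosquito", "heart disease"]
sims = similarity_matrix(docs)
```

The resulting matrix is symmetric with ones on the diagonal; documents that share no terms get a similarity of zero. Ordination and clustering would then operate on this matrix.
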
 76 | ### Planned Improvements
 77 | 
 78 | In the first phase, we will extend both the backend and the frontend of the existing knowledge mapping software. 
 79 | 
 80 | #### Expand Content Sources
 81 | One of the primary objectives is to extend the existing software to incorporate further open content sources, including non-publication content. We use many software packages produced by [rOpenSci](http://ropensci.org), including packages for searching scholarly content in the following engines:
 82 | 
 83 | * Crossref
 84 | * PubMed
 85 | * Europe PMC
 86 | 
 87 | In addition, we will query against pre-print services, including:
 88 | 
 89 | * arXiv
 90 | * bioRxiv
 91 | * PeerJ Preprints
 92 | 
 93 | Employing the rOpenSci text mining suite, we will retrieve open access full-text content via the search engines listed above.
 94 | 
 95 | We will also use rOpenSci packages to search repositories that expose various other types of resources (datasets, presentations, source code, media files...), including but not limited to:
 96 | 
 97 | * Figshare
 98 | * Dryad
 99 | * DataONE
100 | 
101 | This means that we will also need to develop a data model that incorporates heterogeneous data sources. Part of this will involve adding connector(s) to altmetrics APIs through rOpenSci to provide additional context for each resource. Using the [rAltmetric](http://ropensci.github.io/rAltmetric/) and [alm](https://ropensci.org/tutorials/alm_tutorial.html) packages, we will retrieve altmetrics data from the following resources:
102 | 
103 | * Altmetric.com
104 | * PLOS
105 | * Crossref
106 | * eLife
107 | * PKP
108 | * Pensoft
109 |  
110 | Moreover, we will utilize OpenAIRE services to further annotate/enrich publications with additional side information:
111 | 
112 | * References
113 | * MeSH terms
114 | * pdbCodes
115 | * Funding (Grant) Info for NIH, Wellcome Trust, EU
116 | 
117 | The content will then be exposed via an API to the other components of BLAZE.
118 | 
119 | #### Improving Topic Detection and Similarity Analysis
120 | 
121 | Another objective is to improve the automatic creation of maps. We plan to utilize a novel, multi-view probabilistic topic modeling (MV-PTM) engine that jointly analyzes massive collections of documents and related side information, and identifies hidden themes (topics) that characterize them. Side information ("views") may be of different kinds (e.g. structured or unstructured attributes and metadata), have hierarchies or taxonomies (e.g. MeSH terms), be of different modalities (e.g. images), and form networks (e.g. citations). Multiple views can help to explain each other, and the discovered multi-view topics are more coherent and interpretable, uncovering concepts not resolved by traditional, text-only topic models.
122 | 
123 | The proposed MV-PTM engine has already been used in real-world applications, e.g. evaluation of EU-funded projects or publication policy making, and on top of massive real-world datasets, e.g. the ACM corpus and open access PubMed. In our case, we plan to use the proposed engine to:
124 | 
125 | 1. Identify relevant content: Based on MV-PTM we will enhance similarity analysis and relevant content identification. Thus, we will also be able to support topic-based search or search by example (specifying one or more publications). 
126 | 2. Eliminate "duplicate" research areas: Human-based annotation and clustering in large research spaces may lead to duplicate entries on the map. We will utilize MV-PTM to identify overlapping research areas or clusters. 
127 | 
128 | #### Map Extensions
129 | We will extend the map visualization to enable highlighting of contextual facts and to create additional links between the papers. For example, a researcher might want to highlight all papers that contain the same species, focus on recently published material, or view the citation links between papers. Researchers might also want to cluster the resources based on a metric other than keyword similarity, like readership, type of content (i.e. paper, data set, presentation, etc.) or funding source. All of this will be enabled through the interactive BLAZE frontend interface.
130 | 
131 | #### Map Editing and Sharing
132 | To unleash the full potential of BLAZE, one of the primary goals is to enable editing and sharing of knowledge maps. This will also require adaptations to the backend database operations and the frontend user interaction. On the front end, we will enable an edit mode that allows researchers to manually add content to the map, modify or add metadata to content, like tags, and create new clusters. The editing history will be preserved in a Wikipedia-like model to allow collaborative building of knowledge maps. The maps themselves will be saved at [Open Knowledge Maps](http://openknowledgemaps.org/mozfest) where they can be browsed by other researchers and can serve as a starting point for other researchers' exploration.
133 | 
134 | We will also add integration with existing tools in the open digital ecosystem, including the [Open Science Framework](https://osf.io/), [Zotero](https://www.zotero.org/), and [ORCID](http://orcid.org/), so that BLAZE will fit seamlessly into researchers' current workflows. BLAZE strives to be completely open, so we will also add functionality to export the map and the underlying data in various open formats, so that, for example, a researcher could embed a map on her personal website.
135 | 
136 | ## Architecture
137 | ### Content & Metadata Aggregation
138 | 
139 | **Description:** A JSON REST API or GraphQL endpoint which wraps various content and metadata sources, such as Scott's [rOpenSci fulltext library](https://github.com/ropensci/fulltext).
140 | 
141 | **URL:** https://api.archivelab.org/scholar
142 | 
143 | **Stack:** Python, Flask, Postgres, SQLAlchemy, R
144 | 
145 | **Github:** https://github.com/ArchiveLabs/scholar.archivelab.org
146 | 
147 | ### BLAZE Backend + Database
148 | 
149 | **Description:** A REST API to a database of user accounts, preferences, and curated topic maps
150 | 
151 | **URL:** https://api.openknowledgemaps.org
152 | 
153 | **Stack:** PHP, MySQL
154 | 
155 | **Github:** https://github.com/pkraker/BlazeServer (to be created)
156 | 
157 | ### BLAZE Frontend
158 | 
159 | **Description:** A standalone, serverless SPA (single page app)
160 | 
161 | **URL:** https://openknowledgemaps.org
162 | 
163 | **Stack:** React, D3.js, jQuery
164 | 
165 | **Github:** https://github.com/pkraker/Blaze (to be created)
166 | 
167 | ## Work Plan
168 | Development of BLAZE will take place on Github in the repositories outlined above. The concrete targets for developing the innovation will be published as Github issues in these repositories. A summary of the major work components is illustrated below, along with each team member's area of primary contribution.
169 | 
170 | ![Work Plan](images/work-plan_color.png)
171 | 
172 | &nbsp;
173 | 
174 | The development schedule is shown in the timeline below:
175 | 
176 | ![Development Schedule](images/gantt-color.png)
177 | 
178 | ## Licensing
179 | The code will be made available on Github under the license of the existing knowledge mapping software (LGPL v3). The visualizations will be released under CC-BY 4.0 - with the exception of the contained content, which of course retains its original license.
180 | 


--------------------------------------------------------------------------------
/roadmap.md:
--------------------------------------------------------------------------------
  1 | # White Paper
  2 | 
  3 | *By Peter Kraker, Maxi Schramm, Christopher Kittel and [the Open Knowledge Maps team](https://openknowledgemaps.org/team)*  
  4 | *With input from [our advisory board](https://openknowledgemaps.org/team#advisors) and [the Open Knowledge Maps community](https://openknowledgemaps.org/community)*
  5 | 
  6 | Version: 3.0 (Update December 2019)
  7 | 
  8 | **Let us know what you think!**<br>
  9 | We invite you to share your thoughts on our roadmap and services. You can get in touch via [email](mailto:info@openknowledgemaps.org), [Twitter](https://twitter.com/OK_Maps), or [by creating an issue](https://github.com/OpenKnowledgeMaps/open-discovery/issues) in this repository.
 10 | 
 11 | #### Table of Content
 12 | 
 13 | 1. [Motivation](#Motivation)
 14 | 1. [Introduction to Open Knowledge Maps](#Introduction-to-Open-Knowledge-Maps)  
 15 |   2.1 [Services](#Discovery-services)  
 16 |   2.2 [Software and infrastructure](#Software-and-infrastructure)  
 17 |   2.3 [Community Engagement](#Community-Engagement)  
 18 |   2.4 [Openness](#Openness)  
 19 |   2.5 [Sustainability](#Sustainability)  
 20 |   2.6 [Governance](#Governance)
 21 | 1. [Open Knowledge Maps as part of an open discovery infrastructure](#Open-Knowledge-Maps-as-part-of-an-open-discovery-infrastructure)
 22 | 1. [Workplan](#Workplan)
 23 | 1. [Support us](#Support-us)
 24 | 
 25 | # Motivation
 26 | 
 27 | **The open science revolution has dramatically increased the accessibility of scientific knowledge.** In 2015, 45% of research articles were published open access, with a clear upward trend ([Piwowar et al. 2017](https://peerj.com/articles/4375/)). In addition, we see a huge increase in open research data, which will mark the next wave of openly accessible outputs.
 28 | 
 29 | **Discoverability, however, is falling behind.** With two and a half million papers published every year, and tens of thousands of new research projects launched every week, discovery becomes increasingly difficult. Traditional approaches involving search engines providing long, unstructured lists of scientific outputs are not sufficient any more. We can also see this problem reflected in the numbers: a high number of publications remains uncited (between 12% and 82% depending on the discipline, [Lariviere et al. 2008](https://arxiv.org/abs/0809.5250)). For research data, this is even more pronounced, with up to 85% of datasets never being cited ([Peters et al. 2016](https://link.springer.com/article/10.1007/s11192-016-1887-4)). And where it gets really gloomy is the transfer to practice. Even in application-oriented disciplines such as medicine, only a minority of results ever gets transferred to practice, and if so, with a considerable delay ([Brownson et al. 2006](https://journals.sagepub.com/doi/10.1177/003335490612100118)).
 30 | 
 31 | These numbers highlight that accessibility does not equal discoverability and reuse of knowledge. **There is a discoverability crisis that negatively affects the efficiency and effectiveness of science and its transfer to practice**, canceling out many of the positive effects of increased accessibility. In many ways, we cannot cash the cheques written by the open science movement as long as we do not address this issue.
 32 | 
 33 | [Back to top](#Table-of-Content)
 34 | 
 35 | # Introduction to Open Knowledge Maps
 36 | 
 37 | **The goal of Open Knowledge Maps is to close the gap between accessibility and discoverability of scientific knowledge**. We are a charitable non-profit organization dedicated to dramatically increasing the visibility of scientific knowledge for science and society alike. Headquartered in Vienna, Austria, Open Knowledge Maps comprises an international community including team members, advisors, and partner organizations.
 38 | 
 39 | As part of our mission, we operate the largest visual search engine for research in the world. On our website, users can create knowledge maps of research topics in any discipline based on 150 million scientific outputs. Our services enable a diverse set of stakeholders to openly explore, discover, and make use of scientific content. Among our users are researchers, students, librarians, educators, science journalists and practitioners across all disciplines and geographic regions. Open Knowledge Maps increases the visibility of content from a wide variety of stakeholders such as libraries, repositories, funding organisations, and publishers. At the moment **Open Knowledge Maps is the only solution in this space that is free and open, and that is usable without prior knowledge**.
 40 | 
 41 | **Open Knowledge Maps was established in 2016, but it has already become a crucial infrastructure** for researchers, students and practitioners around the world. In the last 3 years, we have had more than half a million visits on the website, and more than 140,000 maps were created. In addition, we had more than 1,500 participants in our workshops worldwide. At the same time, we have been rapidly growing our platform and coverage to a stable and user-friendly service with high availability. We have also developed an agile organisation, a strong community, as well as professional outreach and training efforts.
 42 | 
 43 | ![](images/1-our-offering.png)
 44 | Figure 1: Overview of our services
 45 | 
 46 | In the future, we want to enable our users to collaboratively edit and discuss the knowledge maps on our platform, thus **transforming discovery from a closed individual process to an open and collaborative one.** By sharing the results of our discoveries, we can build on top of each other’s knowledge. This process will be aided by our artificial intelligence core and result in a large-scale system of open, interactive and interlinked knowledge maps for every research topic, every field and every discipline. **Users will be able to explore the entirety of scientific knowledge within this system.** In addition, we will feed the information generated in the OKMaps system back into the open discovery infrastructure: both the knowledge models and the metadata improved or added by our users. Therefore, Open Knowledge Maps will be the only visual interface that combines machine learning with human curation, providing a truly human-centered service as a result. Watch a [video of this vision](https://vimeo.com/188647919).
 47 | 
 48 | [Back to top](#Table-of-Content)
 49 | 
 50 | ## Discovery services
 51 | Our approach is to use knowledge maps for discovery. Knowledge maps provide an instant overview of a topic by showing the main areas at a glance with relevant papers and concepts attached to each area (see example map below).
 52 | 
 53 | ![](images/2-example-map-heart-disease.png)
 54 | Figure 2: Concept of a knowledge map, using a simplified example for the area of “heart diseases”
 55 | 
 56 | Based on this concept, we operate the largest visual search engine for research in the world. On our website https://openknowledgemaps.org, users can create knowledge maps of research topics in any discipline (see Figure 3 for an example). Users can select between two integrations: BASE, the Bielefeld Academic Search Engine, which provides an index of more than 150 million scientific outputs, and PubMed, the large bibliographic database in the natural sciences with a focus on biomedicine. In addition, we operate a service that makes it possible to create knowledge maps of research projects. VIPER, the Visual Project Explorer, enables users to systematically explore a project’s output, and to understand its reception in different areas.
 57 | 
 58 | Our search services incorporate a total of 21 output types, including datasets and multimedia files, and many resources, especially from university repositories, that are not included in commercial search engines. In addition, we do not restrict language in searches, thus increasing the visibility of non-English language content and making the service especially appealing to communities that communicate in a language other than English. We can already see that these efforts have paid off. **Open Knowledge Maps serves both users from the Global South and the Global North:** most of our page views are from users in Indonesia, followed by the US, India, Germany, Austria, the UK, China, Canada, and Australia. In fact, we are seeing some of the strongest growth in the Global South with countries like India (109,000 page views, a fivefold increase in 2018), South Africa (10,500 page views, a threefold increase in 2018), or Nigeria (10,000 page views, a fourfold increase in 2018).
 59 | 
 60 | ![](images/2-overview-digital-education.png)
 61 | Figure 3: Knowledge Map of "digital education", online at https://openknowledgemaps.org/map/c8fe7a11ca29a8b8580e5612fcebc42a
 62 | 
 63 | Open Knowledge Maps supports a number of use cases for which no other comprehensive tools existed before. These use cases include getting an overview of a topic at a glance and understanding the terminology related to that topic. Open Knowledge Maps also enables users to easily separate relevant from irrelevant papers with respect to their current information need by clustering papers into subtopics. Such cross-discipline overviews become more and more relevant as the biggest problems of our time (including [climate change, migration, globalisation](https://www.un.org/sustainabledevelopment/sustainable-development-goals/)) require an interdisciplinary approach. We also allow for direct access to OA papers within our interface, and offer annotation capabilities for these papers courtesy of Hypothesis. Thus, **we enable users to do a large part of their literature research within the same browser tab.**
 64 | 
 65 | Outside of open infrastructures, the need for systems providing visual overviews of scientific fields has also been identified. At the moment **Open Knowledge Maps is the only solution in this space that is free and open, and that is usable without prior knowledge.**
 66 | 
 67 | [Try out our service](https://openknowledgemaps.org) and let us know what you think.
 68 | 
 69 | [Back to top](#Table-of-Content)
 70 | 
 71 | ## Software and infrastructure
 72 | 
 73 | Open Knowledge Maps is based on the open source knowledge mapping framework Head Start. Head Start is a high-performance and high-availability software stack that is capable of automatically producing knowledge maps from a variety of data, including text, metadata, and references using unsupervised machine learning techniques.
 74 | 
 75 | Head Start includes connectors to a number of academic search engines through rOpenSci, including BASE, PubMed, OpenAIRE, PLOS, and DOAJ. We retrieve the metadata of the most relevant documents for a search term from the respective database, and create a knowledge map in four steps:
 76 | 
 77 | 1. Documents are preprocessed by cleaning metadata like title, abstract, author, and journal by removing punctuation, filtering stopwords, transforming to lower-case and stemming.
 78 | 2. Similarity of documents is then calculated based on the cosine-similarity of cleaned metadata.
 79 | 3. The documents are then clustered into groups with Ward’s method of minimum variance, and placed on the map with non-metric multidimensional scaling.
 80 | 4. Finally, clusters are labeled by taking the top three n-grams of keywords, weighted by Term Frequency - Inverse Document Frequency.
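
The first two steps above (preprocessing and cosine similarity) can be sketched in a few lines. This is a minimal illustration in Python, not the Head Start implementation, which is written in R. The stopword list, the crude plural-stripping `stem` helper, the use of plain term counts as vectors, and the sample titles are all assumptions made for the example. Clustering, layout, and labeling (steps 3 and 4) are omitted.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}


def stem(token):
    # Crude plural stripping; a stand-in for a real stemmer such as Porter's.
    return token[:-1] if token.endswith("s") and len(token) > 3 else token


def preprocess(text):
    """Lower-case, drop punctuation and stopwords, then stem each token."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [stem(t) for t in tokens if t not in STOPWORDS]


def cosine(u, v):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(weight * v.get(term, 0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical document titles standing in for cleaned metadata.
docs = [
    "Heart diseases and blood pressure",
    "Blood pressure in heart patients",
    "Knowledge maps for literature discovery",
]
vectors = [Counter(preprocess(d)) for d in docs]

sim_01 = cosine(vectors[0], vectors[1])  # three of four terms shared
sim_02 = cosine(vectors[0], vectors[2])  # no terms shared
print(f"doc 0 vs doc 1: {sim_01:.2f}")
print(f"doc 0 vs doc 2: {sim_02:.2f}")
```

Documents that share more terms get a higher similarity score, which is what later places them closer together on the map and makes them more likely to land in the same area.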
 81 | 
 82 | In short: the more words two documents have in common in their metadata, the closer they are placed to each other on the map, and the more likely they are assigned to the same area. The representation of the knowledge map is stored in a database from where it is delivered to the user-facing frontend. The frontend consists of a web-based interactive visualization, which follows Shneiderman’s (1996) principle of "Overview first, zoom and filter, then details-on-demand". Knowledge maps are complemented by a synchronized list representation of documents and an integrated PDF viewer and annotation tool courtesy of Hypothesis.
 83 | 
 84 | Head Start has a client-server architecture with the user-facing search interface and interactive map frontend based on JavaScript and D3.js. The service and API layer is based on Apache/PHP, and the natural language processing/machine learning backend is written in R. The architecture is complemented by a persistence and versioning system based on SQLite.
 85 | 
 86 | **The overall technical design aims to balance two goals: reliability of service and ease of maintenance.**
 87 | We choose underlying technologies according to robustness and versatility rather than novelty or specialization, to focus development efforts on improving the user experience. Through our choice of open source components, which are themselves developed and maintained by large communities, we avoid vendor lock-in effects and long-term dependencies.
 88 | 
 89 | The source code for the whole service is available under the open source MIT License on GitHub. We publish new releases of our software at regular intervals; they are available as open source packages and archived via Zenodo.
 90 | 
 91 | [Back to top](#Table-of-Content)
 92 | 
 93 | ## Community Engagement
 94 | 
 95 | In the past years, **we have successfully established a wide array of user engagement activities**, including (1) talks and workshops at research institutions, libraries and conferences around the world, as well as via webinars, (2) engagement via our [newsletter](https://openknowledgemaps.us13.list-manage.com/subscribe?u=c399f89442d6aa733a9896515&id=f9d9e47566) and on [social media](https://twitter.com/OK_Maps/), and (3) our [Enthusiasts program](https://openknowledgemaps.org/community). These activities are aimed at our diverse stakeholder groups, involving different aspects of the discovery workflow. These aspects include both the reader’s and the author’s point of view (finding and getting found), as well as insights into the functioning of discovery tools and the discovery infrastructure in general.
 96 | 
 97 | **Our engagement activities are reusable under CC BY, making it easy for our stakeholders to run their own trainings.** We provide them with instructions and materials that are also published on our website for [self-guided workshops and lectures](https://openknowledgemaps.org/community#training-materials). Currently, stakeholders can choose between two different formats (“The Scientific Scavenger Hunt” and “Academic SEO”). Both workshops have strong participatory elements. In “The Scientific Scavenger Hunt”, we play a game with the participants that introduces them to Open Knowledge Maps and helps improve their literature search skills along the way. In “Academic SEO”, we discuss the inner workings of search engines and tools that participants use for discovery.
 98 | 
 99 | In the [Enthusiasts program](https://openknowledgemaps.org/community) we involve Open Knowledge Maps power users and ambassadors in our engagement activities. Under our guidance, participants with diverse backgrounds from all around the world conduct workshops on open discovery and Open Knowledge Maps in their communities. In doing so, they collect valuable feedback on the future development of our services. These activities give us a unique opportunity to better understand our user groups and their specific needs, especially in regions and communities that we cannot reach, and thus **drive the Open Knowledge Maps development from a community perspective.**
100 | 
101 | ![](images/3-training-sessions.png)
102 | Figure 4: Impressions from our training sessions
103 | 
104 | [Back to top](#Table-of-Content)
105 | 
106 | ## Openness
107 | 
108 | **Open Knowledge Maps is an open infrastructure** based on the principles of open science: source code, content and data are shared under an open license. We are a building block of the open discovery infrastructure, with which we extensively collaborate. As a community-driven initiative, we develop our services in a participatory process together with our stakeholders (see section 2.6 on governance). Our aim is to create an inclusive, sustainable and equitable infrastructure that can be used by anyone, independent of geographic area, age, or stakeholder group. Therefore, we are also conducting workshops to introduce potential users to the tool and help improve their discovery skills along the way. In our community program, the Enthusiasts program, we support power users and ambassadors to conduct workshops themselves and to give feedback from their community’s perspective.
109 | 
110 | ![](images/4-community-driven.png)
111 | Figure 5: Open Knowledge Maps is an open infrastructure that is community-owned and community-driven
112 | 
113 | [Back to top](#Table-of-Content)
114 | 
115 | ## Sustainability
116 | 
117 | Our funding model consists of three main pillars:
118 | 
119 | (1) funded projects  
120 | (2) membership-based funding  
121 | (3) donations
122 | 
123 | Of these, **funded projects currently provide the largest income.** In this pillar, we create new discovery services using our open source software. **The outcome must always serve the public good.** These projects are either funded by a collaborating institution (e.g. Ludwig Boltzmann Gesellschaft or Austrian Academy of Sciences) or by a third-party funding body (e.g. the European Commission or Mozilla).
124 | 
125 | **Funded projects currently cover only a small percentage of our activities - the largest part is donated in-kind as volunteer efforts.** So far, we have secured EUR 100,000 in funding, which partially covers the activities of the core team. In addition, project-based funding is highly volatile. To achieve financial, technical, and organisational sustainability, **we will complement funded projects with membership-based funding.** In our membership-based funding model, supporting organisations become members of Open Knowledge Maps and provide an annual contribution. In return, members get a say in the future development of Open Knowledge Maps. Members are invited to join the Board of Supporters, which participates in a yearly vote on what features and sources to implement. **This establishes a true community-driven solution for the discovery of scientific knowledge.**
126 | 
127 | Donations are our third pillar, and our smallest yet. In November/December 2018, we launched our first donation campaign. The feedback from the community and the results were encouraging. With the funds raised, we will be able to cover our server costs for 2019.
128 | 
129 | **We know that increased openness does not automatically lead to a more equitable world.** It needs organizations and services that translate this openness in a way that everyone can benefit from it. Open Knowledge Maps is in a prime position to do so having developed a service and interface that can be understood across geographical borders, age groups and societal stakeholders. At this moment we need more funding to do that in a sustainable way.
130 | 
131 | [Back to top](#Table-of-Content)
132 | 
133 | ## Governance
134 | 
135 | Open Knowledge Maps’ structure is largely determined by the legal requirements for a charitable non-profit (“gemeinnütziger Verein”) under Austrian law (Vereinsgesetz 2002) and defined in the organization’s bylaws.
136 | 
137 | Open Knowledge Maps has the following branches (see Figure 6 below):  
138 | (1) The executive branch, which consists of the Executive Board and the team  
139 | (2) The advisory councils, including the Advisory Board and the Board of Supporters  
140 | (3) The monitoring facilities, including the arbitration tribunal and the auditors  
141 | (4) The community, with our users (including the enthusiasts), partners, and networks.
142 | 
143 | ![](images/5-structure.png)
144 | Figure 6: The Open Knowledge Maps governance structure
145 | 
146 | The Executive Board is responsible for the management of the organization and the team. Together with the Executive Board, the team carries out the work to realize the purpose of the organization.
147 | 
148 | The Executive Board has two advisory councils: the Advisory Board and the Board of Supporters. The members of the advisory board advise the Executive Board on organisational and technical development and the means for achieving the purpose of the organisation. The Advisory Board is designed to represent the diversity of stakeholders of the organisation, including users, partners, advocates and so on. Advisors are appointed for one year by the Executive Board; the appointment can be renewed.
149 | 
150 | The Board of Supporters consists of the supporting members. Depending on the extent of the membership fee, a supporting member can appoint up to three representatives to the [Board of Supporters](https://openknowledgemaps.org/supporting-membership). The Board of Supporters gives input to the technical roadmap and takes part in the decision-making process on the roadmap through a yearly vote. The vote of the Board of Supporters is weighted 1/3 towards the composition of the final roadmap. The other two thirds are decided by the wider Open Knowledge Maps community in a community vote (weighting: 1/3) and by a vote of the Open Knowledge Maps team and Executive Board (weighting: 1/3).
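
To make the weighting concrete, here is a small sketch in Python of how three equally weighted votes could be combined. The tallying rule is an assumption: the text above only fixes the 1/3 weights, so this example normalises each group's votes to shares before weighting them. The group keys, feature names, and vote counts are hypothetical.

```python
from fractions import Fraction

# Each voting body contributes one third to the final roadmap (from the text).
WEIGHTS = {
    "supporters": Fraction(1, 3),  # Board of Supporters
    "community": Fraction(1, 3),   # wider community vote
    "team": Fraction(1, 3),        # team and Executive Board
}


def combine_votes(votes):
    """Combine per-group tallies {group: {feature: count}} into weighted shares.

    Assumption: each group's counts are normalised to shares first, so a
    large community vote cannot outweigh the smaller bodies by size alone.
    """
    scores = {}
    for group, tally in votes.items():
        total = sum(tally.values())
        if total == 0:
            continue  # a group that cast no votes contributes nothing
        for feature, count in tally.items():
            share = WEIGHTS[group] * Fraction(count, total)
            scores[feature] = scores.get(feature, Fraction(0)) + share
    return scores


# Hypothetical vote counts for two candidate features.
votes = {
    "supporters": {"Zotero export": 3, "ORCID links": 1},
    "community": {"Zotero export": 40, "ORCID links": 60},
    "team": {"Zotero export": 2, "ORCID links": 2},
}
scores = combine_votes(votes)
ranking = sorted(scores, key=scores.get, reverse=True)
for feature in ranking:
    print(f"{feature}: {float(scores[feature]):.2f}")
```

In this made-up example the community prefers one feature, but the other feature ranks first overall because the Board of Supporters favours it strongly and each body carries the same weight.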
151 | 
152 | The auditors are responsible for auditing the financial management of the association with regard to the regularity of accounting and the use of funds in accordance with the bylaws of the organization. For the arbitration of all disputes arising within the association, the arbitration tribunal of the association is called. It is a "mediation facility" according to Austrian law (Vereinsgesetz 2002).
153 | 
154 | [Back to top](#Table-of-Content)
155 | 
156 | # Open Knowledge Maps as part of an open discovery infrastructure
157 | 
158 | **Discovery is a space that has traditionally been dominated by commercial players** (Elsevier, Clarivate, Google), and with more resources becoming available thanks to Open Science, they have recently increased their activity in this area. In the past two years alone, we have seen the market entrance of Springer with Digital Science Dimensions, as well as the launch of Google Dataset Search and Elsevier DataSearch Beta 2.
159 | 
160 | **Commercial infrastructures have, however, demonstrated a stronger emphasis on reaching market dominance than on developing innovative services.** Their reliance on proprietary indices and list-based interfaces, and a lack of community-driven development, has led to the current discoverability crisis (see also section 1). In addition, commercial infrastructures retain all governance over their systems, including user data and algorithms, and due to their proprietary nature lead to lock-in effects. Thankfully, **we can see an open discovery ecosystem emerging** (see Figure 7). But this system is still developing: in a recent analysis [Bianca Kramer and Jeroen Bosman](https://docs.google.com/spreadsheets/d/1h0Aq6NYIeVnLDw33vx1SGnv1jbE2B7widbHhU7tpiUw/edit#gid=2141288902) identified just three scholarly commons-compliant infrastructures in the area of discovery - Open Knowledge Maps being one of them.
161 | 
162 | ![](images/6-innovation-cycle.png)
163 | Figure 7: The open discovery infrastructure
164 | 
165 | **The open discovery system is still fragmented and it needs innovative, cross-disciplinary frontends and infrastructures that act as a catalyst for its development. Open Knowledge Maps does not seek to replace traditional search engines, but to complement their strengths.** While search engines help users whose information need is clearly defined to access research outputs, Open Knowledge Maps provides discovery services that put single pieces of research into context, giving users with a fuzzier information need multiple entry points into a topic. We also see ourselves as complementary to the efforts of visualization tools such as VOSviewer or Scholia, which require expertise in creating and manipulating datasets. We therefore seek to create a useful workflow between all of these systems (see our collaborations with BASE, OpenAIRE and Scholia).
166 | 
167 | We believe that we have what it takes to act as a catalyst for this infrastructure. **Due to our broad use, we have the ability to increase the visibility of features and services developed by the rest of the open discovery infrastructure.** Open Knowledge Maps already acts as a catalyst for source applications in BASE. We also collaborate with rOpenSci, who develop all of our data clients. These clients are then also available as standalone R packages. Thus, when Open Knowledge Maps grows, the data science community also benefits. We have also already integrated Hypothesis, and we are planning integrations of other services. We believe that in this way, we can best use the advantages of openness and jointly become a viable alternative to the proprietary infrastructure.
168 | 
169 | **We also provide a well-tested, user-facing open source service, thus bringing an innovative reusable interface to the open infrastructure.** This is of huge value to our stakeholders, which can use our interface to provide better discoverability for their collections. This is exemplified by collaboration projects with the Austrian Academy of Sciences and the Ludwig Boltzmann Gesellschaft, an OpenAIRE tender project, and an upcoming EU project with OPERAS.
170 | 
171 | **In addition, we are strong advocates for the open discovery infrastructure.** In the #DontLeaveItToGoogle campaign, we raise awareness of the need for an open discovery ecosystem and open infrastructures in general using tweets, blog posts, and invited talks. (See these blog posts for an overview of activities: [#DontLeaveItToGoogle Campaign](https://science20.wordpress.com/2018/09/10/dontleaveittogoogle/) and [Update on the DontLeaveItToGoogle Campaign](https://science20.wordpress.com/2019/03/28/update-on-the-dontleaveittogoogle-campaign/).) This has been met with a lot of interest: we have now reached about 100,000 people with this campaign, and already received invitations for follow-up pieces and talks.
172 | 
173 | [Back to top](#Table-of-Content)
174 | 
175 | # Workplan
176 | 
177 | Our work plan has three main pillars:  
178 | **(1) maintaining and developing our core services  
179 | (2) carrying out our community and training activities  
180 | (3) implementing our funding model.**
181 | 
182 | It is structured along 9 crucial objectives. Note that we do not include a timeline for the work plan at this point, as its implementation strongly depends on the available funding. Currently, we are looking for dedicated funding for all activities, as well as expressions of interest from organizations to become [supporting members](https://openknowledgemaps.org/supporting-membership).
183 | 
184 | **1) Achieve financial sustainability**  
185 | We are looking to achieve financial sustainability by strongly developing the sustainable pillars of our funding model, in particular the membership-based funding model. In addition, we will continue our donation program. We do have an existing setup that allows us to run a donation campaign with minimum effort. We will therefore keep doing this to gather more insight into how we can grow this instrument. We will still look to acquire project-based funding that has a strong alignment with our overall roadmap, but to a lesser extent.
186 | 
187 | **2) Strengthen integration with the non-profit open science infrastructure**  
188 | If we want to be taken seriously as an alternative to proprietary infrastructures, the open infrastructures need to work together towards a better user experience. To this end we will work on a more seamless integration of complementary functionalities of other services into our own service. This will in turn provide backlinks to the rest of the infrastructure and increase the use of their services.
189 | 
190 | Based on user feedback and popular use cases, we are planning to integrate, among others, the following functionalities within the next two years:  
191 | * **DOAJ:** include the Seal of Approval  
192 | * **ORCID:** include ORCID iDs from BASE and link to them  
193 | * **Unpaywall:** provide links to additional open access copies  
194 | * **OA Button:** provide OA Button within the interface for non-OA items  
195 | * **Zotero:** enable exporting citations to Zotero  
196 | 
197 | We will also perform user tests evaluating the implementation of the functionalities mentioned above. In addition, we will work towards integration of our functionalities in other open infrastructures.
198 | 
199 | **3) Provide data interfaces to our database**  
200 | The open science ecosystem relies on the creation of positive network effects, from which we benefit in our daily use of open APIs, and to which we want to contribute by offering an API to our own services and data sets. This includes providing a means of access to our services’ knowledge mapping capabilities (‘Discovery as a Service’), and bulk data access to our underlying data sets, so that other initiatives can reuse them, for example to build on top of the data or for research purposes. For those sections of our future data sets that include user-contributed content, our data interfaces will be constrained to data for which users consciously opt in prior to any publication.
201 | 
202 | **4) Improve our capabilities for data discovery**  
203 | Research data are among the fastest growing openly accessible scientific outputs. There is a need to improve on the low rate of discoverability to increase the reuse of FAIR data - up to 85% of datasets are never cited (Peters et al. 2016). Moreover, several new entrants to the data discovery market are following a closed and proprietary model, which makes creating open alternatives all the more important.
204 | We want to transfer our accumulated experience and competences from literature discoverability to visibility and discoverability of research data across disciplines. We are building on our existing experiences with including data sets in knowledge maps as part of the [VIPER project](https://openknowledgemaps.org/viper/), and supplementary datasets included in BASE. Our role as lead in the GO FAIR Implementation Network “Discovery” puts us in a unique position to work with the community on three tasks:
205 | * **Stocktaking:** We will identify relevant open indices and innovative open source interfaces and user-facing services to be (re-)used, as well as the main use cases that we want to address.  
206 | * **Structuring:** We will define the standards and structure of an open ecosystem of services and interfaces for data discovery that fulfils the identified use cases.  
207 | * **Implementation:** We will work towards implementation of the ecosystem laid out above.
208 | 
209 | **5) Refactoring project: improving our platform and its reusability**  
210 | We will perform fundamental refactoring, which is a necessary foundation for our goal of a collaborative discovery environment. This includes:  
211 | (1) A rewrite and extension of the web-based visualization in a reactive framework that allows easier maintenance and addition of functionalities.  
212 | (2) The development of a scalable data storage architecture that better enables future growth, introduces data replicability and further reduces the risk of data loss via an improved backup system.  
213 | (3) A rewrite of the backend in a scalable, easily deployable architecture to improve portability across collections and further reduce the risk of downtime.  
214 | 
215 | As a direct effect of this, our software can be more easily reused or adapted, and therefore become a standard software package for making collections visible and discoverable.
216 | 
217 | **6) Further develop our community activities, in particular our Enthusiasts program**  
218 | In the past years, we have successfully established a wide array of community activities, including workshops, webinars and our Enthusiasts program. These activities give us a unique opportunity to better understand our user groups and their specific needs, especially in regions and communities that we cannot regularly interact with, and thus drive the Open Knowledge Maps development from a community perspective. Apart from a science mini-grant from Mozilla and the Sloan Foundation, these activities were created entirely on volunteer time. Moving forward, we want to maintain existing activities, and we will work towards expanding the program by involving curriculum managers to reach a bigger audience. We also plan to gradually increase the number of participants that we take on in the Enthusiasts program (currently 6), to develop new and improved materials, and to support enthusiasts financially when conducting bigger events.
219 | 
220 | **7) Continue our communication and support efforts and make them sustainable**  
221 | We have already established professional communication and support via our website, newsletter and social media channels. As with the community efforts, these are completely volunteer-based at the moment. Over the next two years, we will make our communication efforts sustainable so that they can be carried out by any person within the organisation. This includes a written social media strategy that lays out coordinated activities across our channels, so that we can use them more effectively, increase followers, and attract new users. On our website, we are planning to replace our news and community sections with a CMS. We will also dedicate a new page to our governance structure and decision-making processes. In addition, we will continue to update the content on our website regularly and send out a newsletter every 4-6 weeks.
222 | 
223 | **8) Continue our advocacy for and networking with open infrastructures**  
224 | We have started a very successful advocacy campaign for open infrastructures, which has already reached about 100,000 people. We will continue this campaign through posts, interviews (including one with Open Science Radio), talks and social media. We will also continue our work in leading open science and open infrastructure networks, including JROST, GO FAIR, the Leibniz Research Alliance Open Science, AAKC, OANA (Open Science Network Austria) and CSNA (Citizen Science Network Austria).
225 | 
226 | **9) Prepare the organisational transition to paid staff and improve operational transparency**  
227 | We have already successfully transitioned from a pure volunteer organisation to one that mixes volunteer work with funded projects. We have set up an organisational structure including executive and administrative functions, supported by a number of project management tools, that allows us to combine the two in an agile way. Going forward, we will establish the necessary processes and administrative functions to deal with paid staff.
228 | 
229 | In addition, we will improve our operational transparency. We already provide information about our sustainability goals and funding efforts, including our membership-based funding model, and will continue to do so. We will also communicate our governance structure and decision-making processes on our website.
230 | 
231 | [Back to top](#Table-of-Content)
232 | 
233 | # Support us
234 | 
235 | You can support Open Knowledge Maps in a variety of ways:
236 | 
237 | **Let us know what you think:** share your thoughts on our roadmap and services via [email](mailto:info@openknowledgemaps.org), [Twitter](https://twitter.com/OK_Maps), or [by creating an issue](https://github.com/OpenKnowledgeMaps/open-discovery/issues) in this repository.
238 | 
239 | **Become an enthusiast:** are you interested in spreading the word on open discovery and Open Knowledge Maps? [Sign up for our Enthusiasts program](https://openknowledgemaps.org/community#enthusiasts-program).
240 | 
241 | **Make a donation** (for individuals and organizations): if you want to help us achieve our goals financially, consider [making a donation](https://openknowledgemaps.org/donate-now).
242 | 
243 | **Become a supporting member** (for organizations): to provide a sustainable platform for open discovery, we propose to fund Open Knowledge Maps in a collective effort: organizations become supporting members and provide a yearly contribution. In return, our supporting members are invited to co-create the platform with us. Find out more [on our website](https://openknowledgemaps.org/supporting-membership).
244 | 
245 | **Fund our roadmap** (for funders): we are currently looking for dedicated funding for all activities in our work plan. Please get in touch with Open Knowledge Maps founder and chairman [Peter Kraker](mailto:pkraker@openknowledgemaps.org).
246 | 
247 | 
248 | [Back to top](#Table-of-Content)
249 | 


--------------------------------------------------------------------------------