├── Meetup
├── Paper
    ├── CHAOSS_measure index.pdf
    └── README.md
├── Picture Material
    ├── LOGO
    └── PPT Template
├── README.md
├── README_zh_CN.md
└── images
    └── introduction
        ├── OSPprojects.png
        ├── WDS advantages.png
        ├── WDSAdvantages.png
        ├── WeDataSphere all-in-one-202211.png
        ├── introduction05.jpg
        ├── os projects.png
        └── weChatQQ.png

/Meetup:
--------------------------------------------------------------------------------
1 |
2 |

--------------------------------------------------------------------------------
/Paper/CHAOSS_measure index.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/WeBankFinTech/WeDataSphere/8a0a9e6258e78f9c8d4de43f6ca62ad05da4cbe1/Paper/CHAOSS_measure index.pdf

--------------------------------------------------------------------------------
/Paper/README.md:
--------------------------------------------------------------------------------
1 | # This is where we store our documents and industry reports
2 |

--------------------------------------------------------------------------------
/Picture Material/LOGO:
--------------------------------------------------------------------------------
1 |
2 |

--------------------------------------------------------------------------------
/Picture Material/PPT Template:
--------------------------------------------------------------------------------
1 |
2 |

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | [English](README.md) | [中文](README_zh_CN.md)
2 |
3 | ## WeDataSphere Open Source Components
4 | Projects marked with the blue "S" ball in the image below are open source.
These include [DataSphere Studio](https://github.com/WeBankFinTech/DataSphereStudio), [Linkis](https://github.com/WeBankFinTech/Linkis), [Scriptis](https://github.com/WeBankFinTech/Scriptis), [Qualitis](https://github.com/WeBankFinTech/Qualitis), [Schedulis](https://github.com/WeBankFinTech/Schedulis), [Exchangis](https://github.com/WeBankFinTech/Exchangis), [Visualis](https://github.com/WeBankFinTech/Visualis), [Prophecis](https://github.com/WeBankFinTech/Prophecis), and [Streamis](https://github.com/WeBankFinTech/Streamis).
5 |
6 | ![OSProjects](images/introduction/WeDataSphere%20all-in-one-202211.png)
7 |
8 | # *[Apache Linkis (Incubating)](https://github.com/WeBankFinTech/Linkis)*
9 |
10 | **[Click here](https://github.com/WeBankFinTech/Linkis) to go to the GitHub repo**
11 |
12 | [Linkis](https://github.com/WeBankFinTech/Linkis) builds a computation middleware layer that decouples upper-layer applications from the underlying data engines. It provides standardized interfaces (REST, JDBC, WebSocket, etc.) for connecting to various underlying engines (Spark, Presto, Flink, etc.), and enables cross-engine context sharing as well as unified job and engine governance and orchestration.
13 |
14 | # *[DataSphere Studio](https://github.com/WeBankFinTech/DataSphereStudio)*
15 |
16 | **[Click here](https://github.com/WeBankFinTech/DataSphereStudio) to go to the GitHub repo**
17 |
18 | [DataSphere Studio](https://github.com/WeBankFinTech/DataSphereStudio) is a data application development portal that covers the entire data application development process in a closed loop. Under a unified UI, its workflow-style graphical drag-and-drop experience supports the full lifecycle of data application development: data import, desensitization and cleaning, data analysis, data mining, quality inspection, visualization, scheduling, and data output.
19 |
20 | # *[Scriptis](https://github.com/WeBankFinTech/Scriptis)*
21 |
22 | **[Click here](https://github.com/WeBankFinTech/Scriptis) to go to the GitHub repo**
23 |
24 | [Scriptis](https://github.com/WeBankFinTech/Scriptis) is an interactive data analysis tool that supports script development (SQL, PySpark, HiveQL), task submission (Spark, Hive), UDFs, functions, resource management, and intelligent diagnosis.
25 |
26 | # *[Qualitis](https://github.com/WeBankFinTech/Qualitis)*
27 |
28 | **[Click here](https://github.com/WeBankFinTech/Qualitis) to go to the GitHub repo**
29 |
30 | [Qualitis](https://github.com/WeBankFinTech/Qualitis) is a one-stop data quality management platform that supports quality verification, notification, and management for various data sources. It addresses the data quality problems that arise during data processing.
31 |
32 | # *[Schedulis](https://github.com/WeBankFinTech/Schedulis)*
33 |
34 | **[Click here](https://github.com/WeBankFinTech/Schedulis) to go to the GitHub repo**
35 |
36 | [Schedulis](https://github.com/WeBankFinTech/Schedulis) is a high-performance workflow task scheduling system with financial-grade features such as high availability and multi-tenancy. It supports the Linkis computation middleware and has been integrated into the DataSphere Studio data application development portal.
37 |
38 | # *[Exchangis](https://github.com/WeBankFinTech/Exchangis)*
39 |
40 | **[Click here](https://github.com/WeBankFinTech/Exchangis) to go to the GitHub repo**
41 |
42 | [Exchangis](https://github.com/WeBankFinTech/Exchangis) is a lightweight, highly extensible data exchange platform that supports data transmission between structured and unstructured heterogeneous data sources. At the application layer, it offers business features such as data permission management and control, high availability of node services, and multi-tenant resource isolation. At the data layer, its architecture features diversified transmission architectures, pluggable modules, and loosely coupled components.
43 |
44 | # *[Visualis](https://github.com/WeBankFinTech/Visualis)*
45 |
46 | **[Click here](https://github.com/WeBankFinTech/Visualis) to go to the GitHub repo**
47 |
48 | [Visualis](https://github.com/WeBankFinTech/Visualis) is an open-source data visualization BI tool developed on top of Davinci, an open-source project by Yixin.
It has been integrated into the DataSphere Studio data application development portal. In this release, Visualis 1.0.0 supports Linkis 1.1.1 and DSS 1.1.0.
49 |
50 | # *[Prophecis](https://github.com/WeBankFinTech/Prophecis)*
51 |
52 | **[Click here](https://github.com/WeBankFinTech/Prophecis) to go to the GitHub repo**
53 |
54 | [Prophecis](https://github.com/WeBankFinTech/Prophecis) is a one-stop machine learning platform developed by WeBank. It integrates multiple open-source machine learning frameworks, provides multi-tenant management of machine learning compute clusters, and offers full-stack container deployment and management services for production environments.
55 |
56 | # *[Streamis](https://github.com/WeBankFinTech/Streamis)*
57 |
58 | **[Click here](https://github.com/WeBankFinTech/Streamis) to go to the GitHub repo**
59 |
60 | [Streamis](https://github.com/WeBankFinTech/Streamis) is a jointly developed project for streaming application development and management, established by WeBank, CtYun, Samoyed Financial Cloud and XianWeng Technology.
61 |
62 | # More open-source WDS components? Coming soon...
63 |
64 | ## WeDataSphere Introduction
65 |
66 | WeDataSphere is a financial-grade, one-stop, open-source suite for big data platforms. The fundamental platform consists of four layers: data exchange, data distribution, computation and storage. The functional platform consists of three layers: platform tools, data tools and application tools, and focuses on meeting users' requirements for functional tools. Together they form a complete big data platform technology ecosystem and provide a one-stop, comprehensive set of components and functions.
67 |
68 | ## WeDataSphere Core Features
69 |
70 | - Fundamental capabilities
71 |
72 | Powered by a range of open-source components contributed by the community, such as Hadoop, Spark, HBase, Kubeflow and FFDL, WeDataSphere achieves financial-grade reliability in fundamental data computation, storage and exchange. It also enhances the open-source versions by addressing the security, performance, availability and manageability issues encountered in practice, along with bug fixes.
73 |
74 | - Platform tools
75 |
76 | Consists of a platform portal, a data middleware (Linkis) and an operation management system. The platform portal supports a product map, financial expense calculation and cloud service requests. As the data middleware, Linkis links applications with the underlying computation and storage systems and provides financial-grade multi-tenancy, resource governance and access isolation, filling a gap in the open-source community and the industry. The operation management system encompasses cluster management, configuration management, change management and service request automation. It supports one-click installation, one-click upgrade and graphical operation and maintenance, and provides alerting, health monitoring and diagnosis, and automatic recovery, simplifying operation and maintenance of the platform.
77 |
78 | - Data tools
79 |
80 | Consists of a data map, data desensitization, data quality, and data exchange tools that work across different Hadoop clusters. The data map manages the data resources of the whole bank, with components for metadata management, data access control and data lineage; data quality and data model functions are under development. Data desensitization masks highly confidential data and keeps users from accessing the raw data directly. The data quality tool provides a unified process for defining and checking the quality of datasets, with immediate problem reporting.
The cross-cluster data exchange tool supports scheduling, monitoring, statistics and management for data exchange tasks.
81 |
82 | - Application tools
83 |
84 | Consists of a development and exploration tool (Scriptis), a graphical workflow scheduling system, a data visualization BI tool and a machine learning support system. Scriptis connects to various computation and storage engines, with a graphical interface and support for multiple development languages. The graphical workflow scheduling system provides a graphical interface for workflow definition, job execution, dependency display, status display, historical statistics and monitoring configuration. The data visualization BI tool generates various charts through drag-and-drop operations and simple scripting, and can send them by scheduled email. The machine learning support system supports multiple model training modes, including both self-developed ML algorithms and open-source ML frameworks, with multi-tenant management capability for high-performance clusters.
85 |
86 | ## WeDataSphere Major Advantages
87 |
88 | ![WDSAdvantages](https://github.com/WeBankFinTech/WeDataSphere/blob/master/images/introduction/WDSAdvantages.png)
89 |
90 | - One stop
91 |
92 | The three layers of platform tools, data tools and application tools, together with powerful machine learning capabilities, make up an enterprise big data solution.
93 |
94 | - Synchronization across clusters among 3 datacenters in 2 cities
95 |
96 | Efficient and reliable big data transfer across clusters and IDCs, with sophisticated data backup and disaster tolerance solutions.
97 |
98 | - Financial grade
99 |
100 | Unified security control, full adoption of containers and microservices, and multi-tenant isolation at different layers.
101 |
102 | - Seamless experience
103 |
104 | The unique data middleware (Linkis) links systems in different layers, bringing data lineage, code reusability and user resources together.
105 |
106 | - Open source
107 |
108 | The core components are already open source, and the rest are coming soon.
109 |
110 | ## WeDataSphere Community
111 |
112 | For the fastest response, please raise an issue, or scan the QR code below with WeChat or QQ to join our group:
113 |
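114 | ![weChatAndQQ](https://github.com/WeBankFinTech/WeDataSphere/assets/11496700/853e2b68-109f-42ba-a1b7-5e42d01b2865)
115 |

--------------------------------------------------------------------------------
/README_zh_CN.md:
--------------------------------------------------------------------------------
1 | [English](README.md) | [中文](README_zh_CN.md)
2 |
3 | ## WeDataSphere Open Source Components
4 |
5 | # *[DataSphere Studio](https://github.com/WeBankFinTech/DataSphereStudio)*
6 |
7 | **[Click here](https://github.com/WeBankFinTech/DataSphereStudio) to go to the GitHub repo**
8 |
9 | [DataSphere Studio](https://github.com/WeBankFinTech/DataSphereStudio) is positioned as a data application development portal that covers the entire data application development process in a closed loop. Under a unified UI, its workflow-style graphical drag-and-drop experience meets the needs of the whole data application development process, from data import, desensitization and cleaning, analysis and mining, quality inspection, visualization, and scheduled execution to data output.
10 |
11 | # *[Qualitis](https://github.com/WeBankFinTech/Qualitis)*
12 |
13 | **[Click here](https://github.com/WeBankFinTech/Qualitis) to go to the GitHub repo**
14 |
15 | [Qualitis](https://github.com/WeBankFinTech/Qualitis) is a one-stop data quality management platform that provides quality verification, notification, and management services for a variety of heterogeneous data sources. It addresses the data quality problems that arise in business system operation, data center construction, and data governance.
16 |
17 | # *[Linkis](https://github.com/WeBankFinTech/Linkis)*
18 |
19 | **[Click here](https://github.com/WeBankFinTech/Linkis) to go to the GitHub repo**
20 |
21 | [Linkis](https://github.com/WeBankFinTech/Linkis) is a computation middleware that connects multiple computation and storage engines, such as Spark, Flink, Hive, Python, and HBase, and exposes unified REST/WS/JDBC interfaces for submitting and executing scripts in SQL, PySpark, HiveQL, Scala, and other languages.
22 |
23 | # *[Scriptis](https://github.com/WeBankFinTech/Scriptis)*
24 |
25 | **[Click here](https://github.com/WeBankFinTech/Scriptis) to go to the GitHub repo**
26 |
27 | [Scriptis](https://github.com/WeBankFinTech/Scriptis) is an interactive web tool for data analysis. It lets users write SQL, PySpark, HiveQL, and other scripts online and submit them to Linkis for execution, and it supports enterprise-grade features such as UDFs, functions, resource management, and intelligent diagnosis.
28 |
29 |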
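30 | More open-source components are coming soon...
31 |
32 | ----
33 |
34 | ## WeDataSphere Introduction
35 |
36 | WeDataSphere is a one-stop, financial-grade, fully connected, open-source big data platform suite. The fundamental platform consists of four layers (data exchange, data distribution, computation, and storage) and focuses on the underlying data transfer, computation, and storage capabilities. The functional platform consists of three layers (platform tools, data tools, and application tools) and focuses on meeting users' needs for functional tools. Together they form a complete big data platform technology stack, providing a rich, one-stop set of data platform components and functions.
37 |
38 | ----
39 |
40 | ## WeDataSphere Core Features
41 |
42 | - Fundamental capabilities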
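43 | Built on open-source components from the community, such as Hadoop, Spark, HBase, Kubeflow, and FFDL, WeDataSphere provides financial-grade, reliable fundamental capabilities for computation, storage, and data exchange, along with strong machine learning capabilities. It also extends the open-source versions, resolving the security, performance, high-availability, and manageability problems encountered in real-world use and fixing various bugs.
44 |
45 | - Platform tools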
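46 | Provides a platform portal, the data middleware Linkis, and an operation management system. The platform portal supports a product map, multi-tenant management and control, financial billing, intelligent recommendation of onboarding solutions, operational reports, and cloud service requests. Linkis serves as the data middleware, providing financial-grade multi-tenancy, resource governance, and permission isolation, connecting upper-layer applications with the underlying computation and storage systems, and filling a gap in the open-source community and the industry. The operation management system covers cluster management, configuration management, change management, monitoring management, and service request automation. It supports one-click installation, one-click upgrade, and graphical operation and maintenance, and provides alerting, health monitoring and diagnosis, and automatic fault recovery, simplifying operation and maintenance of the platform.
47 |
48 | - Data tools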
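49 | Provides a data map, a data desensitization tool, a data quality tool, and a data transfer tool that works across Hadoop clusters. The data map manages the data resources of the whole bank, including metadata management, data permissions, and data lineage, with data quality and data model modules under development. The data desensitization tool masks highly classified data so that users do not come into direct contact with the raw data. The data quality tool provides a complete, unified process for defining and checking the quality of datasets and reporting problems promptly. The cross-Hadoop-cluster data transfer tool supports management of data transfer tasks, including scheduling, status, statistics, and monitoring.
50 |
51 | - Application tools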
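52 | Provides the development and exploration tool Scriptis, a graphical workflow scheduling system, a data presentation BI tool, and a machine learning support system. Scriptis connects to multiple computation and storage engines and offers a graphical interface and support for multiple programming languages. The scheduling system provides a graphical interface for workflow definition, scheduled execution, dependency display, status viewing, historical statistics, and monitoring configuration. The BI tool generates various charts and reports through drag-and-drop in a graphical interface and simple scripting, and supports scheduled delivery by email. The machine learning support system offers multiple ways to train and debug models, integrates self-developed machine learning algorithms and multiple open-source machine learning frameworks, and provides multi-tenant management for heterogeneous high-performance clusters.
53 |
54 | ----
55 |
56 | ## WeDataSphere Core Advantages
57 |
58 | - One stop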
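59 | Provides a rich set of functional components, from data application development to data visualization and from batch jobs to real-time stream computing, meeting the data application development, operation, and data management needs of different scenarios.
60 |
61 | - Financial grade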
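62 | Makes a range of enhancements in high availability, data governance, data security, and other areas to build a financial-grade, highly reliable big data platform that supports core, mission-critical business applications.
63 |
64 | - Fully connected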
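65 | With its unique data application development and management portal, DataSphere Studio, and its computation middleware, Linkis, the two-layer connected and integrated architecture truly links the platform's components in both the north-south and east-west directions, providing a more seamless user experience, a simpler platform architecture, and more powerful management and control.
66 |
67 | - Open source and open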
68 | WeDataSphere is built on open source and gives back to open source: its self-developed components will be open-sourced step by step. The overall design is open and flexible, extension-friendly, and built from pluggable components. By being open source and open, it also aims to attract more individuals and organizations to take part in developing, promoting, and applying WDS.
69 |
70 | ----
71 |
72 | ## WeDataSphere Community
73 |
74 | For the fastest response, please raise an issue, or scan the QR code to join our group:
75 |
76 | ![weChatAndQQ](https://github.com/WeBankFinTech/WeDataSphere/assets/11496700/853e2b68-109f-42ba-a1b7-5e42d01b2865)
77 |

--------------------------------------------------------------------------------
/images/introduction/OSPprojects.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/WeBankFinTech/WeDataSphere/8a0a9e6258e78f9c8d4de43f6ca62ad05da4cbe1/images/introduction/OSPprojects.png

--------------------------------------------------------------------------------
/images/introduction/WDS advantages.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/WeBankFinTech/WeDataSphere/8a0a9e6258e78f9c8d4de43f6ca62ad05da4cbe1/images/introduction/WDS advantages.png

--------------------------------------------------------------------------------
/images/introduction/WDSAdvantages.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/WeBankFinTech/WeDataSphere/8a0a9e6258e78f9c8d4de43f6ca62ad05da4cbe1/images/introduction/WDSAdvantages.png

--------------------------------------------------------------------------------
/images/introduction/WeDataSphere all-in-one-202211.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/WeBankFinTech/WeDataSphere/8a0a9e6258e78f9c8d4de43f6ca62ad05da4cbe1/images/introduction/WeDataSphere all-in-one-202211.png

--------------------------------------------------------------------------------
/images/introduction/introduction05.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/WeBankFinTech/WeDataSphere/8a0a9e6258e78f9c8d4de43f6ca62ad05da4cbe1/images/introduction/introduction05.jpg

--------------------------------------------------------------------------------
/images/introduction/os projects.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/WeBankFinTech/WeDataSphere/8a0a9e6258e78f9c8d4de43f6ca62ad05da4cbe1/images/introduction/os projects.png

--------------------------------------------------------------------------------
/images/introduction/weChatQQ.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/WeBankFinTech/WeDataSphere/8a0a9e6258e78f9c8d4de43f6ca62ad05da4cbe1/images/introduction/weChatQQ.png

--------------------------------------------------------------------------------