├── LICENSE
├── PEDT
│   └── chinese
│       └── PEDT.md
└── README.md


/LICENSE:
--------------------------------------------------------------------------------
  1 |                                  Apache License
  2 |                            Version 2.0, January 2004
  3 |                         http://www.apache.org/licenses/
  4 | 
  5 |    TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
  6 | 
  7 |    1. Definitions.
  8 | 
  9 |       "License" shall mean the terms and conditions for use, reproduction,
 10 |       and distribution as defined by Sections 1 through 9 of this document.
 11 | 
 12 |       "Licensor" shall mean the copyright owner or entity authorized by
 13 |       the copyright owner that is granting the License.
 14 | 
 15 |       "Legal Entity" shall mean the union of the acting entity and all
 16 |       other entities that control, are controlled by, or are under common
 17 |       control with that entity. For the purposes of this definition,
 18 |       "control" means (i) the power, direct or indirect, to cause the
 19 |       direction or management of such entity, whether by contract or
 20 |       otherwise, or (ii) ownership of fifty percent (50%) or more of the
 21 |       outstanding shares, or (iii) beneficial ownership of such entity.
 22 | 
 23 |       "You" (or "Your") shall mean an individual or Legal Entity
 24 |       exercising permissions granted by this License.
 25 | 
 26 |       "Source" form shall mean the preferred form for making modifications,
 27 |       including but not limited to software source code, documentation
 28 |       source, and configuration files.
 29 | 
 30 |       "Object" form shall mean any form resulting from mechanical
 31 |       transformation or translation of a Source form, including but
 32 |       not limited to compiled object code, generated documentation,
 33 |       and conversions to other media types.
 34 | 
 35 |       "Work" shall mean the work of authorship, whether in Source or
 36 |       Object form, made available under the License, as indicated by a
 37 |       copyright notice that is included in or attached to the work
 38 |       (an example is provided in the Appendix below).
 39 | 
 40 |       "Derivative Works" shall mean any work, whether in Source or Object
 41 |       form, that is based on (or derived from) the Work and for which the
 42 |       editorial revisions, annotations, elaborations, or other modifications
 43 |       represent, as a whole, an original work of authorship. For the purposes
 44 |       of this License, Derivative Works shall not include works that remain
 45 |       separable from, or merely link (or bind by name) to the interfaces of,
 46 |       the Work and Derivative Works thereof.
 47 | 
 48 |       "Contribution" shall mean any work of authorship, including
 49 |       the original version of the Work and any modifications or additions
 50 |       to that Work or Derivative Works thereof, that is intentionally
 51 |       submitted to Licensor for inclusion in the Work by the copyright owner
 52 |       or by an individual or Legal Entity authorized to submit on behalf of
 53 |       the copyright owner. For the purposes of this definition, "submitted"
 54 |       means any form of electronic, verbal, or written communication sent
 55 |       to the Licensor or its representatives, including but not limited to
 56 |       communication on electronic mailing lists, source code control systems,
 57 |       and issue tracking systems that are managed by, or on behalf of, the
 58 |       Licensor for the purpose of discussing and improving the Work, but
 59 |       excluding communication that is conspicuously marked or otherwise
 60 |       designated in writing by the copyright owner as "Not a Contribution."
 61 | 
 62 |       "Contributor" shall mean Licensor and any individual or Legal Entity
 63 |       on behalf of whom a Contribution has been received by Licensor and
 64 |       subsequently incorporated within the Work.
 65 | 
 66 |    2. Grant of Copyright License. Subject to the terms and conditions of
 67 |       this License, each Contributor hereby grants to You a perpetual,
 68 |       worldwide, non-exclusive, no-charge, royalty-free, irrevocable
 69 |       copyright license to reproduce, prepare Derivative Works of,
 70 |       publicly display, publicly perform, sublicense, and distribute the
 71 |       Work and such Derivative Works in Source or Object form.
 72 | 
 73 |    3. Grant of Patent License. Subject to the terms and conditions of
 74 |       this License, each Contributor hereby grants to You a perpetual,
 75 |       worldwide, non-exclusive, no-charge, royalty-free, irrevocable
 76 |       (except as stated in this section) patent license to make, have made,
 77 |       use, offer to sell, sell, import, and otherwise transfer the Work,
 78 |       where such license applies only to those patent claims licensable
 79 |       by such Contributor that are necessarily infringed by their
 80 |       Contribution(s) alone or by combination of their Contribution(s)
 81 |       with the Work to which such Contribution(s) was submitted. If You
 82 |       institute patent litigation against any entity (including a
 83 |       cross-claim or counterclaim in a lawsuit) alleging that the Work
 84 |       or a Contribution incorporated within the Work constitutes direct
 85 |       or contributory patent infringement, then any patent licenses
 86 |       granted to You under this License for that Work shall terminate
 87 |       as of the date such litigation is filed.
 88 | 
 89 |    4. Redistribution. You may reproduce and distribute copies of the
 90 |       Work or Derivative Works thereof in any medium, with or without
 91 |       modifications, and in Source or Object form, provided that You
 92 |       meet the following conditions:
 93 | 
 94 |       (a) You must give any other recipients of the Work or
 95 |           Derivative Works a copy of this License; and
 96 | 
 97 |       (b) You must cause any modified files to carry prominent notices
 98 |           stating that You changed the files; and
 99 | 
100 |       (c) You must retain, in the Source form of any Derivative Works
101 |           that You distribute, all copyright, patent, trademark, and
102 |           attribution notices from the Source form of the Work,
103 |           excluding those notices that do not pertain to any part of
104 |           the Derivative Works; and
105 | 
106 |       (d) If the Work includes a "NOTICE" text file as part of its
107 |           distribution, then any Derivative Works that You distribute must
108 |           include a readable copy of the attribution notices contained
109 |           within such NOTICE file, excluding those notices that do not
110 |           pertain to any part of the Derivative Works, in at least one
111 |           of the following places: within a NOTICE text file distributed
112 |           as part of the Derivative Works; within the Source form or
113 |           documentation, if provided along with the Derivative Works; or,
114 |           within a display generated by the Derivative Works, if and
115 |           wherever such third-party notices normally appear. The contents
116 |           of the NOTICE file are for informational purposes only and
117 |           do not modify the License. You may add Your own attribution
118 |           notices within Derivative Works that You distribute, alongside
119 |           or as an addendum to the NOTICE text from the Work, provided
120 |           that such additional attribution notices cannot be construed
121 |           as modifying the License.
122 | 
123 |       You may add Your own copyright statement to Your modifications and
124 |       may provide additional or different license terms and conditions
125 |       for use, reproduction, or distribution of Your modifications, or
126 |       for any such Derivative Works as a whole, provided Your use,
127 |       reproduction, and distribution of the Work otherwise complies with
128 |       the conditions stated in this License.
129 | 
130 |    5. Submission of Contributions. Unless You explicitly state otherwise,
131 |       any Contribution intentionally submitted for inclusion in the Work
132 |       by You to the Licensor shall be under the terms and conditions of
133 |       this License, without any additional terms or conditions.
134 |       Notwithstanding the above, nothing herein shall supersede or modify
135 |       the terms of any separate license agreement you may have executed
136 |       with Licensor regarding such Contributions.
137 | 
138 |    6. Trademarks. This License does not grant permission to use the trade
139 |       names, trademarks, service marks, or product names of the Licensor,
140 |       except as required for reasonable and customary use in describing the
141 |       origin of the Work and reproducing the content of the NOTICE file.
142 | 
143 |    7. Disclaimer of Warranty. Unless required by applicable law or
144 |       agreed to in writing, Licensor provides the Work (and each
145 |       Contributor provides its Contributions) on an "AS IS" BASIS,
146 |       WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
147 |       implied, including, without limitation, any warranties or conditions
148 |       of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
149 |       PARTICULAR PURPOSE. You are solely responsible for determining the
150 |       appropriateness of using or redistributing the Work and assume any
151 |       risks associated with Your exercise of permissions under this License.
152 | 
153 |    8. Limitation of Liability. In no event and under no legal theory,
154 |       whether in tort (including negligence), contract, or otherwise,
155 |       unless required by applicable law (such as deliberate and grossly
156 |       negligent acts) or agreed to in writing, shall any Contributor be
157 |       liable to You for damages, including any direct, indirect, special,
158 |       incidental, or consequential damages of any character arising as a
159 |       result of this License or out of the use or inability to use the
160 |       Work (including but not limited to damages for loss of goodwill,
161 |       work stoppage, computer failure or malfunction, or any and all
162 |       other commercial damages or losses), even if such Contributor
163 |       has been advised of the possibility of such damages.
164 | 
165 |    9. Accepting Warranty or Additional Liability. While redistributing
166 |       the Work or Derivative Works thereof, You may choose to offer,
167 |       and charge a fee for, acceptance of support, warranty, indemnity,
168 |       or other liability obligations and/or rights consistent with this
169 |       License. However, in accepting such obligations, You may act only
170 |       on Your own behalf and on Your sole responsibility, not on behalf
171 |       of any other Contributor, and only if You agree to indemnify,
172 |       defend, and hold each Contributor harmless for any liability
173 |       incurred by, or claims asserted against, such Contributor by reason
174 |       of your accepting any such warranty or additional liability.
175 | 
176 |    END OF TERMS AND CONDITIONS
177 | 
178 |    APPENDIX: How to apply the Apache License to your work.
179 | 
180 |       To apply the Apache License to your work, attach the following
181 |       boilerplate notice, with the fields enclosed by brackets "{}"
182 |       replaced with your own identifying information. (Don't include
183 |       the brackets!)  The text should be enclosed in the appropriate
184 |       comment syntax for the file format. We also recommend that a
185 |       file or class name and description of purpose be included on the
186 |       same "printed page" as the copyright notice for easier
187 |       identification within third-party archives.
188 | 
189 |    Copyright {yyyy} {name of copyright owner}
190 | 
191 |    Licensed under the Apache License, Version 2.0 (the "License");
192 |    you may not use this file except in compliance with the License.
193 |    You may obtain a copy of the License at
194 | 
195 |        http://www.apache.org/licenses/LICENSE-2.0
196 | 
197 |    Unless required by applicable law or agreed to in writing, software
198 |    distributed under the License is distributed on an "AS IS" BASIS,
199 |    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
200 |    See the License for the specific language governing permissions and
201 |    limitations under the License.
202 | 
203 | 


--------------------------------------------------------------------------------
/PEDT/chinese/PEDT.md:
--------------------------------------------------------------------------------
   1 | # PEDT Specifications
   2 | 
   3 | Author: aimingoo(aimingoo@wandoujia.com)
   4 | 
   5 | Date: 2015.11
   6 | 
   7 | Version: Series 1 - 1.1.0
   8 | 
   9 | Language: Chinese
  10 | 
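  11 | # Overview
  12 | 
  13 | PEDT (Parallel Exchangeable Distribution Task) is a specification for parallel tasks that targets computable clusters and aims at real-time processing. It defines three parts: task data, task processing, and the task invocation interface.
  14 | 
  15 | PEDT aims to give computable clusters a lightweight, efficient, cross-platform, and reliable task-processing mechanism.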
  16 | 
  17 | ### Table of Contents
  18 | 
  19 | * [PEDT Specifications](#pedt-specifications)
  20 | * [Overview](#overview)
  21 | * [Specifications](#specifications)
  22 |   * [PEDT define specification](#pedt-define-specification)
  23 |     * [Basic definitions](#basic-definitions)
  24 |       * [Task definition (taskDef)](#task-definition-taskdef)
  25 |       * [Task (task/distribution task)](#task-taskdistribution-task)
  26 |       * [Task identifier (taskId)](#task-identifier-taskid)
  27 |       * [String (string)](#string-string)
  28 |       * [Method (method)](#method-method)
  29 |         * [Distribution scope (scope)](#distribution-scope-scope)
  30 |         * [Task arguments (arguments)](#task-arguments-arguments)
  31 |     * [Specification](#specification)
  32 |       * [PEDT define specification 0.9](#pedt-define-specification-09)
  33 |       * [PEDT define specification 1.0](#pedt-define-specification-10)
  34 |       * [PEDT define specification 1.1](#pedt-define-specification-11)
  35 |     * [keywords](#keywords)
  36 |   * [PEDT process specification](#pedt-process-specification)
  37 |     * [Basic definitions](#basic-definitions-1)
  38 |       * [The four processing stages of a PEDT task](#the-four-processing-stages-of-a-pedt-task)
  39 |       * [The three specific process methods of a PEDT task](#the-three-specific-process-methods-of-a-pedt-task)
  40 |       * [The two specific distribution methods of a PEDT task](#the-two-specific-distribution-methods-of-a-pedt-task)
  41 |       * [Basic requirements for the processing system](#basic-requirements-for-the-processing-system)
  42 |     * [Specification](#specification-1)
  43 |       * [PEDT process specification 1.0](#pedt-process-specification-10)
  44 |   * [PEDT interface specification](#pedt-interface-specification)
  45 |     * [Local interfaces](#本地接口)
  46 |       * [Mix-in object: mix](#混入对象mix)
  47 |       * [HTTP request: distributed_request](#http请求distributed_request)
  48 |       * [Method: taskDef.distributed](#方法taskdefdistributed)
  49 |       * [Method: taskDef.promised](#方法taskdefpromised)
  50 |       * [Method: taskDef.rejected](#方法taskdefrejected)
  51 |       * [Parsing the scope](#解析范围)
  52 |       * [Downloading a task](#下载任务)
  53 |       * [Executing a task](#执行任务)
  54 |       * [Method: task.run](#方法taskrun)
  55 |       * [Method: task.map](#方法taskmap)
  56 |     * [Remote interfaces](#远程接口)
  57 |       * [Task registration/register_task](#任务注册register_task)
  58 |       * [Task download/download_task](#任务下载download_task)
  59 |       * [Task execution/execute_task](#任务执行execute_task)
  60 |       * [Resource query](#资源query)
  61 |     * [Others](#其它)
  62 |       * [Resource subscription/subscribe](#资源订阅subscribe)
  63 |       * [Resource request/require](#资源请求require)
  64 |       * [Method task.reduce](#方法taskreduce)
  65 |       * [Method task.daemon](#方法taskdaemon)
  66 | * [Implementation](#实现)
  67 | * [Q&A](#qa)
  68 | 
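  69 | # Specifications
  70 | 
  71 | In this set of specifications, the task data definition (taskDef) takes version 1.0 as its baseline. Together with taskDef 0.9 and 1.1, three standard versions have been published. The corresponding task processing specification (process specification) and task invocation interface specification (interface specification) exist only in version 1.0.
  72 | 
  73 | The taskDef versions address different distributed-task scenarios, but their processing logic and external interfaces are the same.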
  74 | 
  75 | ## PEDT define specification
  76 | 
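  77 | This specification describes the format of the transferable data part of a PEDT task, which we call the taskDef.
  78 | 
  79 | A taskDef is defined in JSON. Version 0.9 uses a limited JSON format, while versions 1.0 and 1.1 use fully supported JSON.
  80 | 
  81 | ### Basic definitions
  82 | 
  83 | #### Task definition (taskDef)
  84 | 
  85 | A taskDef is always a JSON object with zero or more members (members/fields). If at least one member is a distribution task or a process method, the taskDef is distributed; otherwise it is static.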
  86 | 
  87 | ``` text
  88 | # sample - static taskDef
  89 | {}
  90 | 
  91 | # sample - distributed taskDef
  92 | { "x": { "run": "task:570b41ba61ade63987d318b0c08e4fa4" } }
  93 | ```
  94 | 
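The static/distributed rule above can be sketched as a small check. This is only an illustration: the helper names `isTask` and `isDistributed` are hypothetical, and it assumes that a process method is recognised by its member name alone and a task by the presence of a "run" or "map" member.

```javascript
// Hypothetical helper names; the specification defines the rule, not this API.
const PROCESS_METHODS = ["distributed", "promised", "rejected"];

// Assumption: a member counts as a task when it is an object with "run" or "map".
function isTask(value) {
  return value !== null && typeof value === "object" && !Array.isArray(value) &&
    ("run" in value || "map" in value);
}

// A taskDef is distributed when at least one member is a task or a process method.
function isDistributed(taskDef) {
  return Object.entries(taskDef).some(
    ([name, value]) => PROCESS_METHODS.includes(name) || isTask(value)
  );
}

console.log(isDistributed({})); // false
console.log(isDistributed({ x: { run: "task:570b41ba61ade63987d318b0c08e4fa4" } })); // true
```
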
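  95 | A distributed taskDef can also process a taskDef itself. In that case the static taskDef is handled as a data object rather than as an executable task object, for example to distribute it or to use it as a data schema.
  96 | 
  97 | In transit, a taskDef is a standard string, UTF-8 encoded by default, that can always be parsed as JSON.
  98 | 
  99 | #### Task (task/distribution task)
 100 | 
 101 | A task is always a JSON object. It has at least one member named "map" or "run" that denotes the distribution method, plus optional member properties such as scope and arguments.
 102 | 
 103 | If the map/run member of a task is a task identifier (taskId), the task refers to a published taskDef that must be downloaded from a remote location. The map member accepts only a taskId.
 104 | 
 105 | If the run member of a task is a string, it is an ordinary local task that is encoded inside the taskDef and can be executed. If it is an object, it is handled as a local (unpublished) taskDef.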
 106 | 
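 107 | #### Task identifier (taskId)
 108 | 
 109 | Once defined, a taskDef cannot be changed, so it has a unique ID. The PEDT specification exchanges tasks by taskId rather than by exchanging the task text directly.
 110 | 
 111 | PEDT Specifications series 1 uses the MD5 value of the task text as the taskId, in the format: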
 112 | 
 113 | ``` text
 114 | <prefix><md5>
 115 | 	prefix  : the prefix "task:"
 116 | 	md5     : MD5 value of the taskDef text, lowercase
 117 | ```
 118 | 
 119 | In a taskDef, a taskId is recorded as a string value.
 120 | 
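As a minimal sketch of the format above, a taskId can be derived with Node.js's built-in `crypto` module. The function name `taskIdOf` is hypothetical, and hashing the UTF-8 bytes of the text is an assumption based on the default transfer encoding.

```javascript
const crypto = require("crypto");

// taskId = "task:" + lowercase hex MD5 of the taskDef text.
// Assumption: the text is hashed as UTF-8 bytes.
function taskIdOf(taskDefText) {
  const md5 = crypto.createHash("md5").update(taskDefText, "utf8").digest("hex");
  return "task:" + md5;
}

console.log(taskIdOf('{"x":{"run":"task:570b41ba61ade63987d318b0c08e4fa4"}}'));
```

Because the identifier is computed from the text itself, two taskDefs that differ only in whitespace get different taskIds.
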
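 121 | #### String (string)
 122 | 
 123 | A taskDef contains two kinds of strings, normal and encoded; both are standard JSON string values. An encoded string uses a prefix to express the meaning of the encoded content, and the processing system must decode it when it obtains the taskDef.
 124 | 
 125 | An encoded string uses a three parts prefix in the format: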
 126 | 
 127 | ``` text
 128 | <type[:subType]>[:encodeType]:
 129 |     type       : string
 130 |     subType    : <type define>
 131 | 	encodeType : string
 132 | ```
 133 | 
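 134 | PEDT Specifications series 1 currently supports only two top-level type prefixes, "data" and "script". When the top-level type prefix is "data", subType defaults to "string"; the top-level type prefix "script" has no default subType.
 135 | 
 136 | > NOTE: In specification 0.9, "task:" is also understood as a top-level type prefix, but in specification 1.0 and above it is converted into a standard task object.
 137 | > 
 138 | > NOTE: The three parts prefix is a nested format whose middle parts may be omitted, so its number of effective segments is not fixed (it is at least 1). A subType can be set explicitly (or take its default) only when the encodeType is valid. In addition, the ":" character must not appear in the content area (body/context) that follows the prefix.
 139 | 
 140 | PEDT Specifications series 1 currently supports two encode types (encodeType), base64 and utf8; the default is "utf8".
 141 | 
 142 | Therefore both of the following definitions denote the normal string "Hello World!":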
 143 | 
 144 | ``` text
 145 | "data:Hello World!"
 146 | "data:string:utf8:Hello World!"
 147 | 	NOTE1: "data" is encoded as utf8, and
 148 | 	NOTE2: "string" is a new type used as the subType
 149 | ```
 150 | 
 151 | And all three of the following definitions denote the string "Hello World!" encoded as base64:
 152 | 
 153 | ``` text
 154 | "data:base64:SGVsbG8gV29ybGQhCg=="
 155 | "data:string:base64:SGVsbG8gV29ybGQhCg=="
 156 | "data:string:utf8:base64:SGVsbG8gV29ybGQhCg=="
 157 | ```
 158 | 
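 159 | > NOTE: "string" is not a top-level type prefix; it is the default subType of the "data" type.
 160 | 
 161 | The sample declarations below are equivalent: each script appears once encoded as base64 and once as utf8.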
 162 | 
 163 | ``` text
 164 | "script:lua:base64:cHJpbnQoImhpIikK"
 165 | "script:javascript:base64:Y29uc29sZS5sb2coImhpIikK"
 166 | "script:lua:utf8:print(\"hi\")"
 167 | "script:javascript:utf8:console.log(\"hi\")"
 168 | ```
 169 | 
 170 | > NOTE: The "script" prefix has no default subType, so it must be declared explicitly.
 171 | 
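The prefix rules in this section can be sketched as a small parser. It is an illustration, not the reference decoder: the function name `parseEncodedString` and the result shape are hypothetical. It assumes that the last segment before the body is the encodeType when it is "base64" or "utf8", and that any remaining middle segments together form the subType.

```javascript
// Hypothetical parser for the three parts prefix.
const ENCODE_TYPES = ["base64", "utf8"];
const DEFAULT_SUBTYPE = { data: "string" }; // "script" has no default subType

function parseEncodedString(str) {
  const parts = str.split(":");
  const type = parts[0];
  // Strings without a known top-level type prefix are normal strings.
  if (parts.length < 2 || (type !== "data" && type !== "script")) return null;
  const body = parts[parts.length - 1]; // the body cannot contain ":"
  const middle = parts.slice(1, -1);
  let encodeType = "utf8"; // default encodeType
  if (middle.length > 0 && ENCODE_TYPES.includes(middle[middle.length - 1])) {
    encodeType = middle.pop();
  }
  // Assumption: whatever remains in the middle is the (possibly nested) subType.
  const subType = middle.length > 0 ? middle.join(":") : DEFAULT_SUBTYPE[type];
  if (subType === undefined) {
    throw new Error('"' + type + '" requires an explicit subType');
  }
  const text = encodeType === "base64"
    ? Buffer.from(body, "base64").toString("utf8")
    : body;
  return { type, subType, encodeType, text };
}

console.log(parseEncodedString("data:Hello World!"));
console.log(parseEncodedString('script:lua:utf8:print("hi")'));
```

One detail worth noting when testing against the samples above: the base64 payloads in this section end in `Cg==` or `K`, so they decode to the shown text followed by a newline character.
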
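 172 | #### Method (method)
 173 | 
 174 | Methods comprise process methods and distribution methods. Process methods act on a taskDef, and distribution methods act on a task.
 175 | 
 176 | In the taskDef text, a method is always expressed as an encoded string (for a taskDef) or as a task identifier (taskId, for a task). An encoded string, once decoded, is a function, script, or call entry that the processing system can execute; how it is encoded and decoded is agreed by the processing system through the string prefix.
 177 | 
 178 | > NOTE: How these methods are executed is decided by the processing system. Other chapters of this specification define the flow and interfaces of the process methods, but a processing system may stay consistent with the PEDT specification in every respect and still choose a different implementation for the actual processing. For example, task.map requires a group of tasks to run in parallel, yet a synchronous implementation is still feasible; it only changes performance and does not affect the externally visible interface behavior.
 179 | > 
 180 | > NOTE: In the taskDef of a particular system, a distribution method (acting on a task) may also be expressed as an object or an encoded string. This happens when that system supports local functions or objects (recommended by the specification, not mandatory).
 181 | 
 182 | The process methods are taskDef.distributed, taskDef.promised, and taskDef.rejected; the distribution methods are task.run and task.map. A taskDef may have no process methods or several (their names, of course, cannot repeat). A typical taskDef with all process methods looks like this: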
 183 | 
 184 | ``` JSON
 185 | {
 186 | 	"distributed": "script:lua:base64: ... ",
 187 | 	"promised": "script:lua:base64: ... ",
 188 | 	"rejected": "script:lua:base64: ... "
 189 | }
 190 | ```
 191 | 
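 192 | A task must have exactly one of the two distribution methods (task.run or task.map). The task arguments of a distribution method are optional. In addition, a map method must also define a distribution scope. A typical task with a map method looks like this: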
 193 | 
 194 | ``` JSON
 195 | {
 196 | 	"map": "task: ...",
 197 | 	"scope": "...",
 198 | 	"arguments": { ... }
 199 | }
 200 | ```
 201 | 
 202 | A typical task with a run method looks like this:
 203 | 
 204 | ``` JSON
 205 | {
 206 | 	"run": "task: ...",
 207 | 	"arguments": { ... }
 208 | }
 209 | ```
 210 | 
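 211 | The following are examples of tasks with a run method in a system that implements local function or object support (recommended by the specification, not mandatory):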
 212 | 
 213 | ``` JSON
 214 | // with local function support
 215 | {
 216 | 	"run": "script:javascript:base64: ...",
 217 | 	"arguments": { ... }
 218 | }
 219 | // with taskObject support
 220 | {
 221 | 	"run": { ... },
 222 | 	"arguments": { ... }
 223 | }
 224 | ```
 225 | 
 226 | ##### Distribution scope (scope)
 227 | 
 228 | When a task uses the map method, it must define a scope member in the task object to express the distribution scope. The scope is an unencoded normal string in the three parts token format:
 229 | 
 230 | ``` text
 231 | <systemPart>:[pathPart]:<scopePart>
 232 | 	systemPart  : distribute to a system
 233 | 	pathPart    : distribute to a path within that system
 234 | 	scopePart   : distribute to a range (a group of nodes) under that system path
 235 | ```
 236 | 
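 237 | systemPart and scopePart must not contain the ":" character and cannot be omitted; pathPart can be any string, including the empty string.
 238 | 
 239 | > NOTE: A three parts token is a string with a fixed number of segments, so a ":" inside pathPart is not treated as a separator.
 240 | 
 241 | A normal scope string is always longer than 3 characters. Strings of length 3 or less are reserved tokens. The reserved tokens defined so far are: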
 242 | 
 243 | ``` text
 244 | >	"?"      : unhandled placeholder
 245 | >	"::*"	: unhandled distribution scope
 246 | >	"::?"	: distribution scope scopePart invalid
 247 | >	":::"	: distribution scope invalid
 248 | ```
 249 | 
 250 | For example, the following task definition is "a map method with a scope":
 251 | 
 252 | ``` javascript
 253 | {
 254 | 	"map": "task:570b41ba61ade63987d318b0c08e4fa4",
 255 | 	"scope": "n4c:/a/b/c/test:*"
 256 | }
 257 | ```
 258 | 
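The three parts token described above can be split by taking the first and the last ":" as separators, which leaves any ":" inside pathPart untouched. The following is a sketch under that reading; the function name `splitScope` and the result shape are hypothetical.

```javascript
// Reserved tokens listed in the specification (strings of length <= 3).
const RESERVED_TOKENS = ["?", "::*", "::?", ":::"];

// Hypothetical helper: split "<systemPart>:[pathPart]:<scopePart>".
function splitScope(scope) {
  if (scope.length <= 3) {
    return { reserved: true, token: scope, known: RESERVED_TOKENS.includes(scope) };
  }
  const first = scope.indexOf(":");
  const last = scope.lastIndexOf(":");
  // systemPart and scopePart are mandatory, so two separators are needed.
  if (first <= 0 || last === first || last === scope.length - 1) {
    throw new Error("invalid distribution scope: " + scope);
  }
  return {
    reserved: false,
    systemPart: scope.slice(0, first),
    pathPart: scope.slice(first + 1, last), // may be empty or contain ":"
    scopePart: scope.slice(last + 1),
  };
}

console.log(splitScope("n4c:/a/b/c/test:*"));
// { reserved: false, systemPart: 'n4c', pathPart: '/a/b/c/test', scopePart: '*' }
```
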
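 259 | ##### Task arguments (arguments)
 260 | 
 261 | A task object can define an arguments member to express the parameters its execution depends on. The arguments value (that is, the member) must be an object. Multiple arguments are expressed as members of that object and have no ordering.
 262 | 
 263 | > NOTE: PEDT Specifications series 1 does not currently require the processing system to enforce the type of arguments, so a concrete implementation may still accept a non-object. This applies only to a task.run method executed locally, and because it clearly breaks the specification, it creates obstacles when trying to be "compatible with or upgraded to" some optional features of specification 1.1.
 264 | 
 265 | For example, the following task definition is "a run method with arguments":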
 266 | 
 267 | ``` javascript
 268 | {
 269 | 	"run": "task:570b41ba61ade63987d318b0c08e4fa4",
 270 | 	"arguments": { "arg1": 1, "arg2": true }
 271 | }
 272 | ```
 273 | 
 274 | ### Specification
 275 | 
 276 | #### PEDT define specification 0.9
 277 | 
 278 | > * 0.9-1 taskDef is limited JSON format
 279 | >   
 280 | >   ``` 
 281 | >   1. full support JSON value types
 282 | >   2. limited support JSON array and object types
 283 | >   	- array/object data can't use other arrays/objects as members or elements
 284 | >   	- task execute result is limited JSON format, or(optional) support full JSON format
 285 | >   ```
 286 | >   
 287 | > * 0.9-2 "task:" is top level prefix
 288 | >   
 289 | >   ``` 
 290 | >   1. task arguments is unsupported in task.run and task.map
 291 | >   2. local taskObject/function is unsupported in task.run
 292 | >   ```
 293 | >   
 294 | > * 0.9-3 "script:" prefix is optional
 295 | >   
 296 | >   ``` 
 297 | >   1. taskDef.distributed is optional
 298 | >   2. taskDef.promised is optional
 299 | >   3. taskDef.rejected is optional
 300 | >   ```
 301 | 
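 302 | The main limitation of PEDT define specification 0.9 is that it does not support multi-level or mutually nested definitions of JSON arrays/objects. For example, all three definitions below are invalid: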
 303 | 
 304 | ``` javascript
 305 | // unsupported in 0.9
 306 | [[1]]
 307 | [{ "a": 1 }]
 308 | { "a": [1] }
 309 | ```
 310 | 
 311 | For the same reason, specification 0.9 cannot support the standard, object-format task. For example:
 312 | 
 313 | ``` javascript
 314 | // unsupported in 0.9
 315 | { "x": { "run": "task:570b41ba61ade63987d318b0c08e4fa4" } }
 316 | ```
 317 | 
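 318 | This clearly introduces a multi-level object. Specification 0.9 therefore requires such a task to be expressed with "task:" as an encoded string prefix.
 319 | 
 320 | A taskDef in specification 0.9 cannot use arrays (it must be an object, and nesting is not supported), but the result returned by a task execution may contain arrays, even in an implementation that supports only the limited JSON format. In addition, according to the specification, support for full JSON types in returned results is optional.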
 321 | 
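The nesting limitation of specification 0.9 can be checked with a few lines of code. This is a sketch only: the names `isPrimitive` and `isLimitedJson` are hypothetical, and the input is assumed to be a value produced by `JSON.parse`.

```javascript
// After JSON.parse, anything that is not an array or object is a plain value.
function isPrimitive(value) {
  return value === null || typeof value !== "object";
}

// Limited JSON: an array or object may hold plain values only,
// never another array or object.
function isLimitedJson(value) {
  if (isPrimitive(value)) return true;
  const items = Array.isArray(value) ? value : Object.values(value);
  return items.every(isPrimitive);
}

console.log(isLimitedJson(JSON.parse('[[1]]')));        // false
console.log(isLimitedJson(JSON.parse('{ "a": [1] }'))); // false
console.log(isLimitedJson(JSON.parse('{ "x": "task:570b41ba61ade63987d318b0c08e4fa4" }'))); // true
```
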
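 322 | Specification 0.9 can use an "@" prefix to express the scope of a map method in a taskDef. If the scope contains the ":" character or non-utf8 characters, the whole task must be encoded with base64 (or another agreed format). Definitions such as the following are valid: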
 323 | 
 324 | ``` javascript
 325 | // supported in 0.9
 326 | { "x": "task:570b41ba61ade63987d318b0c08e4fa4"}
 327 | { "x": "task:570b41ba61ade63987d318b0c08e4fa4@localhost" }
 328 | { "x": "task:base64:NTcwYjQxYmE2MWFkZTYzOTg3ZDMxOGIwYzA4ZTRmYTRAaHR0cDovL2xvY2FsaG9zdC90ZXN0OioK" }
 329 | ```
 330 | 
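 331 | > NOTE: The "localhost" in "@localhost" is not a distribution scope but a system-internal scope token, so it is not in the standard three parts format. This is an optional implementation decided by each processing system.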
 332 | 
 333 | #### PEDT define specification 1.0
 334 | 
 335 | > * 1.0-1 full support JSON types
 336 | >   
 337 | > * 1.0-2 support top level prefix: "data:", "script:"
 338 | >   
 339 | >   ``` 
 340 | >     1. "task:" is not top level prefix, but support downward compatibility
 341 | >     2. "string:" is subType for "data:" only
 342 | >     3. support encodeType: "base64" and "utf8"
 343 | >   ```
 344 | >   
 345 | > * 1.0-3 support full taskDef/task features
 346 | >   
 347 | >   ``` 
 348 | >     1. support scope property for task.map method
 349 | >     2. support arguments property for task.map and task.run
 350 | >     3. support taskDef.promised, taskDef.distributed and taskDef.rejected fields
 351 | >     4. local taskObject/function is strongly recommended in task.run
 352 | >   ```
 353 | >   
 354 | > * 1.0-4 support taskDef as member of other taskDef
 355 | >   
 356 | >   ``` 
 357 | >     1. support taskDef array as member of other taskDef
 358 | >   ```
 359 | >   
 360 | > * 1.0-5 taskDef as arguments is optional
 361 | >   
 362 | >   ``` 
 363 | >   1. "reduce" as task method is optional
 364 | >   2. "daemon" as task method is optional
 365 | >   ```
 366 | 
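 367 | The main limitation of PEDT define specification 1.0 is that the arguments of a task is a simple JSON object with no agreed meaning. Whether a taskDef can be used as arguments therefore becomes optional, and as a further consequence extension methods such as reduce/daemon are optional as well.
 368 | 
 369 | Specification 1.0 recommends implementing the reduce and daemon methods, but does not state how they should be implemented.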
 370 | 
 371 | #### PEDT define specification 1.1
 372 | 
 373 | > * 1.1-1 full features of specification 1.0 are supported, with downward compatibility
 374 | >   
 375 | > * 1.1-2 access current task processor in all taskDef methods and task.run method	
 376 | >   
 377 | > * 1.1-3 support taskDef as task arguments
 378 | >   
 379 | > * 1.1-4 more task method is optional
 380 | >   
 381 | >   ``` 
 382 | >     1. "reduce" and "daemon" as task methods are strongly recommended
 383 | >     2. more task methods are optional
 384 | >   ```
 385 | >   
 386 | > * 1.1-5 run method is map method at local, optional
 387 | >   
 388 | >   ``` 
 389 | >     1. support full/real distribution taskDef when this feature ready
 390 | >   ```
 391 | 
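 392 | PEDT define specification 1.1 has the full set of features known so far, and it strongly recommends that you implement the reduce/daemon methods. You can also implement the run method as a local map method; once you do, your entire system becomes a complete and purely distributable system.
 393 | 
 394 | > NOTE: Otherwise, some of your tasks use local logic and cannot be migrated "freely and at will" to other nodes for execution. However, that is the normal state of most systems, and requiring that "all methods can be distributed" adds, to some degree, an unnecessary burden on the system.
 395 | 
 396 | Finally, specification 1.1 is the only specification that requires you to pass the task processor (current task processor) into task methods. It is the task object processor instance currently in use (in effect the this object in JavaScript, the self object in Lua, and the like).
 397 | 
 398 | > NOTE: This depends on the execution environment; you may not be able to implement this feature easily in an environment that is not object-oriented.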
 399 | 
 400 | ### keywords
 401 | 
 402 | JSON/JSON format
 403 | 
 404 | JSON values
 405 | 
 406 | object members/object fields
 407 | 
 408 | array elements
 409 | 
 410 | taskDef
 411 | 
 412 | static taskDef
 413 | 
 414 | distributed taskDef
 415 | 
 416 | published taskDef
 417 | 
 418 | local taskDef/unpublished taskDef
 419 | 
 420 | method
 421 | 
 422 | taskDef methods/process method
 423 | 
 424 | - distributed
 425 | - promised
 426 | - rejected
 427 | 
 428 | task methods/distribution method
 429 | 
 430 | - run
 431 | - map
 432 | - reduce
 433 | - daemon
 434 | 
 435 | local task
 436 | 
 437 | task/distribution task
 438 | 
 439 | task arguments
 440 | 
 441 | scope/distribution scope
 442 | 
 443 | three parts token
 444 | 
 445 | - systemPart
 446 | - pathPart
 447 | - scopePart
 448 | 
 449 | md5
 450 | 
 451 | taskId
 452 | 
 453 | taskId prefix
 454 | 
 455 | - "task:"
 456 | 
 457 | normal string
 458 | 
 459 | encoded string
 460 | 
 461 | encoded string prefix
 462 | 
 463 | three parts prefix
 464 | 
 465 | - "data:"
 466 | - "script:"
 467 | - "string:"
 468 | - "base64:"
 469 | - "utf8:"
 470 | 
 471 | current task processor/this/self
 472 | 
 473 | token
 474 | 
 475 | reserved tokens
 476 | 
 477 | ## PEDT process specification
 478 | 
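 479 | This specification describes how to process a taskDef so that it is handled consistently across different execution environments.
 480 | 
 481 | ### Basic definitions
 482 | 
 483 | #### The four processing stages of a PEDT task
 484 | 
 485 | * Task declaration
 486 |   
 487 |   A taskDef is the JSON text of a task. It is used for storage and distribution and is not itself executable (functions or code cannot be passed directly). The parts of a task that need to be executed are either encoded into strings, or referenced as an external taskDef by its taskId (and declared in a task).
 488 |   
 489 | * Task preprocessing
 490 |   
 491 |   A taskObject is the object obtained by decoding the taskDef above. Decoding first converts the JSON text into a local object; the encoded strings in it must then be decoded as PEDT prescribes. Note that an encoded string is not necessarily still a string after decoding; it may become a member type supported locally. For example, decoding the "script:javascript" prefix rewrites the corresponding member into a function.
 492 |   
 493 |   A taskDef cannot be changed once it is registered, so it has exactly one taskId. Likewise, the local taskObject decoded from such a taskDef is immutable; it can be cached or used as the prototype of other objects.
 494 |   
 495 | * Task order
 496 |   
 497 |   A taskOrder is a processable image of a taskObject; it can be a subclass of the taskObject or an instance of it. Make sure that the taskOrder can reach all processable information of the taskObject and that processing does not damage the taskObject.
 498 |   
 499 |   Task processing formally begins when the taskOrder is submitted.
 500 |   
 501 | * Task result
 502 |   
 503 |   A taskResult is the result of processing a taskOrder. It can be the taskOrder itself (a taskOrder may be rewritten), or a result of a completely different type produced by processing the taskOrder further.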
 504 | 
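In JavaScript, one way to obtain a taskOrder that can read everything on the taskObject without modifying it is prototype inheritance. The sketch below assumes that approach; `createTaskOrder` is a hypothetical name, and the sample members are made up for illustration.

```javascript
// taskObject: decoded once, then treated as immutable and cached.
const taskObject = { node_id: "n1", retries: 3 };

// Hypothetical helper: the order inherits from the taskObject, so reads fall
// through to it while writes create own properties on the order only.
function createTaskOrder(source) {
  return Object.create(source);
}

const taskOrder = createTaskOrder(taskObject);
taskOrder.retries = 0;

console.log(taskOrder.node_id, taskOrder.retries, taskObject.retries); // n1 0 3
```

This protects top-level members only. Nested objects are still shared between the taskOrder and the taskObject, so an implementation that rewrites nested members would need a deeper copy.
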
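 505 | #### The three specific process methods of a PEDT task
 506 | 
 507 | A taskDef can declare three process methods: taskDef.distributed, taskDef.promised, and taskDef.rejected.
 508 | 
 509 | * The distributed method
 510 |   
 511 |   This method is called when a new taskDef is distributed to the current environment, and it runs only once in that environment. This means you can use it to handle information specific to the current environment. For example: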
 512 |   
 513 |   ``` text
 514 |   taskDef = { "node_id": "unknown" }
 515 |   ```
 516 |   
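 517 |   In this declaration, node_id depends on the current execution environment, so the distributed method can assign it once.
 518 |   
 519 |   Because the distributed method runs only once, the current environment can convert the taskDef into a task object (taskObject) and cache it after the method has run.
 520 |   
 521 | * The promised method
 522 |   
 523 |   This method is called after the taskOrder has been processed correctly and a taskResult has been obtained. In it you can process the taskResult further and return it, or turn it into some other result value and return that.
 524 |   
 525 |   If the promised method returns no value, the taskOrder as a whole still resolves to the taskResult; otherwise it resolves to the value returned by promised.
 526 |   
 527 |   PEDT does not guarantee that the execution result of a taskDef/taskOrder matches its prototype (in object type).
 528 |   
 529 | * The rejected method
 530 |   
 531 |   This method is called when an error occurs while the taskOrder is being processed. In it you can handle the error further. Once this method is defined, the error is swallowed (muted) by the current taskDef unless you explicitly return the error again.
 532 |   
 533 |   The rejected method is also called when an error occurs while the promised method is processing the taskResult. By default, the value returned by rejected is treated as a successfully returned taskResult, but control is not handed back to the promised method.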
 534 | 
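The interplay of promised and rejected described above maps naturally onto a promise chain. The following is a sketch, not the specified implementation: `settle`, `work`, and `methods` are hypothetical names, and it assumes that "explicitly returning the error again" means throwing it or returning a rejected promise from rejected.

```javascript
// Hypothetical driver: `work` produces the taskResult; `methods` holds the
// optional promised/rejected process methods of the taskDef.
function settle(work, methods) {
  let p = Promise.resolve().then(work);
  if (typeof methods.promised === "function") {
    p = p.then((taskResult) => {
      const value = methods.promised(taskResult);
      // No return value from promised: keep the original taskResult.
      return value === undefined ? taskResult : value;
    });
  }
  if (typeof methods.rejected === "function") {
    // Attached last, so it sees errors from both `work` and `promised`,
    // and its return value is not passed back to `promised`.
    p = p.catch((error) => methods.rejected(error));
  }
  return p;
}

settle(() => 1, { promised: (r) => r + 1 }).then((v) => console.log(v)); // 2
```

Without a rejected method the chain simply stays rejected, so the error propagates to whoever submitted the taskOrder.
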
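 535 | #### The two specific distribution methods of a PEDT task
 536 | 
 537 | A task can declare two distribution methods, task.run and task.map. A task has one and only one distribution method.
 538 | 
 539 | > NOTE: From the interface point of view, both methods must be implemented. In a concrete environment, however, it is acceptable for them to be only partially effective, depending on the stage (or state) the execution environment is in. Partially effective means that the interface exists but is in an unusable state.
 540 | > 
 541 | > NOTE: In an optional implementation of specification 1.1, task.run is implemented through task.map, so in that case the processing system has only one distribution method internally.
 542 | 
 543 | * The run method
 544 |   
 545 |   This method means the specified task runs locally. If the task is specified as a taskId, the processing system should first obtain the taskDef by downloading it (download_task), and eventually convert it into a taskOrder and submit that for execution.
 546 |   
 547 | * The map method
 548 |   
 549 |   This method means the specified task runs on a group of remote nodes and eventually yields a group of results (of the same number), each of which is necessarily a taskResult.
 550 |   
 551 |   > NOTE: Because the PEDT specification does not guarantee that a taskResult resembles the taskDef/taskOrder in object type, the results in that group may differ from one another.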
 552 | 
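 553 | #### Basic requirements for the processing system
 554 | 
 555 | PEDT requires the processing system to have the following capabilities. They are described simply as callable interfaces so that a processing system can implement them concretely.
 556 | 
 557 | > NOTE: For details of each interface, see the "PEDT interface specification" part of this set of specifications.
 558 | 
 559 | * mix(Object obj, Object ref)
 560 |   
 561 |   The processing system should mix all members of the ref object into the obj object; the return value is the modified obj object itself.
 562 |   
 563 | * Promise.all(Array arr)
 564 |   
 565 |   Provides the ability to submit a group of tasks to remote RESTful interfaces in map() and capture the returned results, and the ability to execute a group of tasks locally within a taskDef and obtain the results. The members of arr are ordinary values or promise objects.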
 566 |   
 567 |   > NOTE: For details on Promise, see:
 568 |   > 
 569 |   > * Promise in MDN: [https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Promise](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Promise)
 570 |   > * Promises/A+ specification: [http://promisesaplus.com](http://promisesaplus.com)
 571 |   
 572 | * parse_scope(String scope)
 573 |   
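 574 |   Provides the ability to resolve part or all of a distributionScope into a group of nodes that can accept task distribution. The interface information of these nodes is what Promise.all() relies on when handling remote calls.
 575 |   
 576 | * register_task(String taskDef)
 577 |   
 578 |   Provides the ability to place a taskDef on a remote public node that all nodes can access. The interface returns the taskId string corresponding to the taskDef.
 579 |   
 580 | * download_task(String taskId)
 581 |   
 582 |   Provides the ability to fetch the taskDef specified by taskId from the public node above. The interface returns a JSON value.
 583 | 
 584 | > NOTE: The three interfaces above are usually wrapped as internal_xxx interfaces to show that they are implemented internally. They usually call external interfaces to do the work and only convert between interfaces and data themselves.
 585 | > 
 586 | > NOTE: The concrete implementation of these three interfaces depends on the network environment.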
 587 | 
 588 | ### Specification
 589 | 
 590 | #### PEDT process specification 1.0
 591 | 
 592 | 1. From taskDef to taskObject
 593 |    
 594 |    Precondition: the taskDef text has already been obtained remotely through download_task(taskId), or the taskDef has been obtained from the local cache.
 595 |    
 596 |    > 1.1 task_def_decode
 597 |    > 
 598 |    > > 1.1.1 Convert to a local object: localObj = JSON_decode(taskDef)
 599 |    > > 
 600 |    > > 1.1.2 Process each member and decode every encoded string: decodedObj = decode_task_fields(localObj)
 601 |    > > 
 602 |    > > > NOTE: decode_task_fields() processes each encoded string member (obj.x) of localObj and sets obj.x = decode(obj.x); the resulting member x may no longer be a string.
 603 |    > > 
 604 |    > > 1.1.3 If a member of localObj is an object, apply 1.1.2 to it recursively
 605 |    > > 
 606 |    > > 1.1.4 Obtain the decoded result: taskObject = decodedObj
 607 |    > 
 608 |    > 1.2 distributed_task
 609 |    > 
 610 |    > If this is the first time the taskDef has been obtained remotely and a distributed method exists, call it: distributed(taskObject)
 611 |    > 
 612 |    > 1.3 Return the result: return taskObject
 613 |    
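 614 | 2. From taskOrder to taskResult
 615 |    
 616 |    Precondition: an image of taskObject has been created as the taskOrder, such that modifying members of taskOrder does not affect taskObject while all accessible members of the latter remain accessible.
 617 |    
 618 |    > 2.1 promise_static_member
 619 |    > 
 620 |    > > 2.1.1 Process each member of taskOrder; if it is a task object, call task.map or task.run for that task. These tasks are denoted taskOrder.x0..n.
 621 |    > > 
 622 |    > > > NOTE: Because taskOrder starts as an image of taskObject, taskOrder.x0 and taskObject.x0 are identical at this point. And because taskObject comes from a taskDef that is "distributed once, never changed", the member list x0..n and the corresponding list of tasks are in fact also invariant.
 623 |    > > 
 624 |    > > 2.1.2 Put the results of those calls, as promise objects, into an array promises associated with the current taskOrder; these are denoted promises[0..n].
 625 |    > > 
 626 |    > > > NOTE: The associated taskOrder can itself be placed in promises so that later steps can reach it without any other association or index.
 627 |    > > 
 628 |    > > 2.1.3 Call Promise.all(promises) and obtain the result value results. If an exception occurs during the Promise.all() call, go to 2.4.
 629 |    > > 
 630 |    > > > NOTE: Per the Promise specification, results is actually obtained in the onFulfilled function of .then(). onFulfilled is called once, after all promises have been asynchronously and fully settled, that is, executed and returned their results.
 631 |    > 
 632 |    > 2.2 promise_member_rewrite
 633 |    > 
 634 |    > Following the order of the elements of results strictly, copy each result (denoted results[0..n]) into the member that held the corresponding task. That is:
 635 |    > 
 636 |    > > taskOrder.x0 = results[0], ...
 637 |    > 
 638 |    > 2.3 pickTaskResult
 639 |    > 
 640 |    > > 2.3.1 Obtain the result value: taskResult = taskOrder
 641 |    > > 
 642 |    > > 2.3.2 If taskDef.promised exists, call it once for the taskOrder: taskResult = promised(taskResult). If an exception occurs in the promised(taskResult) call, go to 2.4; otherwise go to 2.5.
 643 |    > 
 644 |    > 2.4 rejected
 645 |    > 
 646 |    > > 2.4.1 taskResult = taskDef.rejected(reason)
 647 |    > 
 648 |    > 2.5 Return the result: return taskResult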
 649 |    
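 650 | 3. Implementing execute_task(String taskId, Object args)
 651 |    
 652 |    This method requires the processing system to execute the taskDef identified by taskId. The processing system may load the taskDef from the local cache or download it remotely, or may execute a taskObject that was derived from the taskDef and cached, but it must eventually return a taskResult.
 653 |    
 654 |    > 3.1 internal_download_task
 655 |    > 
 656 |    > Obtain a usable taskObject from the taskId.
 657 |    > 
 658 |    > > 3.1.1 Call the internal download_task() to obtain the taskDef from the taskId
 659 |    > > 
 660 |    > > 3.1.2 Apply process 1 of this specification to obtain the taskObject from the taskDef
 661 |    > 
 662 |    > 3.2 internal_execute_task
 663 |    > 
 664 |    > args must be processed before the taskDef is executed.
 665 |    > 
 666 |    > > 3.2.1 Obtain a writable image of taskObject as the taskOrder, then mix(taskOrder, args)
 667 |    > > 
 668 |    > > > NOTE: This means that member values in args can affect the member information of (the image of) the taskDef being processed.
 669 |    > > 
 670 |    > > 3.2.2 Apply process 2 of this specification to obtain the taskResult from the taskOrder
 671 |    > > 
 672 |    > > > NOTE: The convention in specification 1.1 that "a taskDef can be used as arguments" does not affect this interface. In other words, args here gets no chance to be "executed as a taskDef beforehand".
 673 |    > 
 674 |    > 3.3 Return the result: return taskResult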
 675 |    
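 676 | 4. Implementing map(String distributionScope, String taskId, Object args)
 677 |    
 678 |    The processing system should first interpret distributionScope as a group of nodes that accept task distribution, then issue RESTful requests to the execute_task interface of those nodes, passing args as parameters in the requests.
 679 |    
 680 |    The processing system should process args before calling the remote execute_task interface.
 681 |    
 682 |    > NOTE: In specification 1.1, args may be a new taskDef2. In that case taskDef2 is executed first, and its result object becomes the input parameter of taskDef.
 683 |    > 
 684 |    > NOTE: For details of how taskId and args are passed as parameters (or data) in the RESTful request, see the "invocation interface specification" of this specification.
 685 |    
 686 |    The processing system should collect the results returned by the execute_task interface of those nodes, put them into an array in call order, and finally parse the array into locally usable data as the return value of this map() call.
 687 |    
 688 |    > 4.1 internal_parse_scope
 689 |    > 
 690 |    > Interpret distributionScope as the array request_uri[0..n].
 691 |    > 
 692 |    > > NOTE: For how distributionScope is interpreted, see the "invocation interface specification" of this specification.
 693 |    > 
 694 |    > 4.2 distributed_request
 695 |    > 
 696 |    > Submit request_uri[] as a group of remote, concurrent web RESTful requests.
 697 |    > 
 698 |    > > 4.2.1 Suppose request_uri[0..n] records the uri of each node that accepts task distribution, where the uri points to the execute_task() interface that the target node publishes in web RESTful form. distributed_request() then sends to that uri
 699 |    > > 
 700 |    > > > http.request(uri + taskId)
 701 |    > > 
 702 |    > > a GET/POST request of this form. At the same time, args is submitted to each node as parameters of the http request.
 703 |    > > 
 704 |    > > 4.2.2 Use Promise.all() to wait for all requests to return.
 705 |    > > 
 706 |    > > > NOTE: Promise.all() puts each result into results[0..n], corresponding to request_uri[0..n].
 707 |    > 
 708 |    > 4.3 extractMapedTaskResult
 709 |    > 
 710 |    > Extract the result values from results: maped = JSON_decode(results[0..n])
 711 |    > 
 712 |    > > NOTE: The semantics here are equivalent to the following JavaScript: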
 713 |    > > 
 714 |    > > ``` javascript
 715 |    > > maped = results.map(JSON.parse)
 716 |    > > ```
 717 |    > > 
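 718 |    > > NOTE: The RESTful interface of execute_task() returns a JSON value, so the local system can always parse it into an object and put it into the array, and the final result array can still be serialized as a JSON value.
 719 |    > 
 720 |    > 4.4 Return the result array: return maped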
 721 |    
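 722 | 5. Implementing run(task, Object args)
 723 |    
 724 |    The processing system handles task differently depending on what it is.
 725 |    
 726 |    The processing system should process args before executing task.
 727 |    
 728 |    > 5.1 If task is a taskId string, hand it to execute_task(). See process 3 of this specification: implementing execute_task(String taskId, Object args).
 729 |    > 
 730 |    > 5.2 If task is a local function, call it with the args object as its only argument.
 731 |    > 
 732 |    > 5.3 If task is an object, try to invoke it as a local task object (taskObject). See process 3.2 of this specification: internal_execute_task.
 733 |    > 
 734 |    > > NOTE: In specification 1.1, the args parameter may be a new taskDef2. In that case taskDef2 is executed first, and its result object becomes the input parameter of taskDef.
 735 |    > > 
 736 |    > > NOTE: 5.2 and 5.3 are strongly recommended but not required.
 737 |    > > 
 738 |    > > NOTE: Step 5.2 treats args differently from the other two. 5.2 calls the function with args as its only argument, whereas 5.1 and 5.3 call mix() to mix args into the taskOrder.
 739 |    > > 
 740 |    > > NOTE: For step 5.3, when task is a taskObject, its distributed method (inherited from the taskDef) is not called even if it exists.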
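
The three branches of process 5 can be sketched as a small dispatcher. This is only an illustration under stated assumptions: the `deps` parameter and its two members are hypothetical stand-ins for the process 3 and process 3.2 entry points, which are not implemented here, and the error thrown for unsupported values is not something the specification prescribes.

```javascript
// Hypothetical dispatcher for run(task, args).
// deps.execute_task and deps.internal_execute_task are assumed to be supplied
// by the processing system (process 3 and process 3.2 respectively).
function run(task, args, deps) {
  if (typeof task === 'string') {
    // 5.1: a taskId string is handed to execute_task()
    return deps.execute_task(task, args);
  }
  if (typeof task === 'function') {
    // 5.2: a local function receives args as its only argument (no mix())
    return task(args);
  }
  if (task !== null && typeof task === 'object') {
    // 5.3: an object is treated as a local taskObject
    return deps.internal_execute_task(task, args);
  }
  // Assumption: anything else is rejected; the specification leaves this open.
  throw new TypeError('unsupported task: ' + typeof task);
}
```

Passing the two entry points in explicitly keeps the sketch independent of how a particular processing system implements downloading and execution.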
 741 | 
 742 | ## PEDT interface specification
 743 | 
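 744 | ### Local interfaces
 745 | 
 746 | Local interfaces are those that must be implemented in the current processing system so that it can carry out the processes defined above in this specification.
 747 | 
 748 | #### Mixing objects: mix
 749 | 
 750 | ``` javascript
 751 | function mix(obj, ref)
 752 |  - Parameters:
 753 | 	obj: JSON_Supported, the target object to mix into
 754 | 	ref: JSON_Supported, the source to mix from (an object or a value)
 755 |  - Returns:
 756 | 	return: if obj is not an object, a new object with ref mixed in; otherwise obj itself with ref mixed in; if ref is invalid (undefined/nil, etc.), always obj.
 757 | ```
 758 | 
 759 | Description:
 760 | 
 761 | If ref is invalid in the call, the return value is obj; in that case the result is not guaranteed to be an object (because obj may be a value rather than an object).
 762 | 
 763 | If obj is an object, mix() always returns obj itself, with the members of ref mixed into its members. Members that share a name with a member of ref are overwritten, and members that do not exist in obj are copied from ref. The mix operation is recursive.
 764 | 
 765 | > NOTE: JSON_Supported here means all data types supported by JSON. ref and obj are objects obtained by parsing JSON values, so their members are limited to four value types and two reference types (the compound types, object and array).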
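
A minimal sketch of mix() in JavaScript, following the rules above. Two points are assumptions rather than part of the specification: arrays are treated as leaf values (overwritten, not merged element by element), and a ref that is a plain value contributes no members. The `isObject` helper is introduced only for this illustration.

```javascript
// True for plain (non-array, non-null) objects.
function isObject(value) {
  return value !== null && typeof value === 'object' && !Array.isArray(value);
}

function mix(obj, ref) {
  // Invalid ref: always return obj unchanged (it may be a value, not an object).
  if (ref === undefined || ref === null) return obj;
  // obj is not an object: mix into a new object instead.
  if (!isObject(obj)) obj = {};
  // Assumption: a non-object ref has no members to mix in.
  if (!isObject(ref)) return obj;
  Object.keys(ref).forEach(function (key) {
    if (key === '__proto__') return;           // defensive: never touch the prototype
    if (isObject(obj[key]) && isObject(ref[key])) {
      mix(obj[key], ref[key]);                 // both sides are objects: recurse
    } else {
      obj[key] = ref[key];                     // overwrite same-named member or copy a new one
    }
  });
  return obj;
}
```

Members copied from ref are shared by reference; a processing system that needs isolation between taskOrder and args would have to copy them first.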
 766 | 
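 767 | #### HTTP requests: distributed_request
 768 | 
 769 | ``` javascript
 770 | function distributed_request(URLs, taskId, args)
 771 |  - Parameters:
 772 | 	URLs: Array of String, the service addresses of a group of nodes.
 773 | 	taskId: String, a string prefixed with "task:".
 774 | 	args: Object, the arguments object.
 775 |  - Returns:
 776 | 	taskResults: Array of HTTP_Responsed, the array of results from executing taskId remotely.
 777 | ```
 778 | 
 779 | Description:
 780 | 
 781 | In distributed_request() the processor implements distributed requests against a group of service addresses (URLs). These requests take one of two possible forms: HTTP GET or POST.
 782 | 
 783 | Whether GET or POST is used, the taskId is always appended to URLs[x] as part of the request URL. For example (note that the taskId carries the "task:" string prefix):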
 784 | 
 785 | > ``` text
 786 | > http://.../execute_task:570b41ba61ade63987d318b0c08e4fa4
 787 | > ```
 788 | 
 789 | or:
 790 | 
 791 | > ``` text
 792 | > http://.../call?task:570b41ba61ade63987d318b0c08e4fa4
 793 | > ```
 794 | 
 795 | With a GET request, args is encoded as URL parameters and appended to the URL above. For how an args object is encoded into a URL parameter string, see:
 796 | 
 797 | > [querystring.stringify in nodejs](https://nodejs.org/api/querystring.html#querystring_querystring_stringify_obj_sep_eq_options)
 798 | > 
 799 | > [ngx.encode_args in lua-nginx](https://github.com/openresty/lua-nginx-module#ngxencode_args)
 800 | 
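 801 | With a POST request, you can choose one of the following two HTTP headers to set how args is carried in the body/data area.
 802 | 
 803 | ``` text
 804 | Content-Type: application/x-www-form-urlencoded
 805 | Content-Type: application/json
 806 | ```
 807 | 
 808 | With the x-www-form-urlencoded format, args should be encoded with the URL parameter encoding described above and submitted as post data; with the json format, args should be serialized to JSON text and submitted as post data. If Content-Type is omitted, the server should treat "application/x-www-form-urlencoded" as the default format.
 809 | 
 810 | With a POST request, parameters can also be passed in the URL at the same time, so the args that the server finally parses is the mix of the URL parameters and the body data. In that case you can add a parameter to Content-Type to tell the server to skip the mix-in (mixin) when parsing. For example:
 811 | 
 812 | ``` text
 813 | Content-Type: application/x-www-form-urlencoded; mixin=false
 814 | Content-Type: application/json; mixin=false
 815 | ```
 816 | 
 817 | You can still add language or other parameters to Content-Type to direct other server behavior.
 818 | 
 819 | Finally, when distributed_request() handles the server's response, HTTP_Responsed refers to whatever structure or data the implementer returns over HTTP, which depends on the implementation. It is usually an HTTP Response object, with response.header holding the response headers and response.body holding the response data.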
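
The request shapes described in this section can be sketched as a helper that only describes a request and does not send it. The function name `buildExecuteRequest` and the returned object layout are hypothetical. The sketch uses the standard `URLSearchParams` class as a stand-in for the `querystring.stringify` / `ngx.encode_args` encoders referenced above, and it assumes that args is joined with `&` when the URL already contains a `?` and with `?` otherwise; the specification does not spell out that separator rule.

```javascript
// Hypothetical helper: builds a description of one execute_task request.
function buildExecuteRequest(url, taskId, args, method, contentType) {
  var target = url + taskId;                             // taskId is always part of the URL
  var encoded = new URLSearchParams(args || {}).toString();
  if (method === 'GET') {
    // Assumption: choose the separator by whether a search string already exists.
    if (encoded) target += (target.indexOf('?') >= 0 ? '&' : '?') + encoded;
    return { method: 'GET', url: target, headers: {}, body: null };
  }
  // POST: form encoding is the default when no Content-Type is given.
  var type = contentType || 'application/x-www-form-urlencoded';
  var isJson = type.indexOf('application/json') === 0;
  return {
    method: 'POST',
    url: target,
    headers: { 'Content-Type': type },
    body: isJson ? JSON.stringify(args || {}) : encoded
  };
}
```

For example, calling it with the Content-Type value `application/json; mixin=false` yields a POST whose body is the JSON text of args and whose header carries the mixin parameter unchanged.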
 820 | 
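 821 | #### Method: taskDef.distributed
 822 | 
 823 | ``` javascript
 824 | function taskDef.distributed(taskObject)
 825 |  - Parameters:
 826 | 	taskObject: Object, the object decoded from the taskDef
 827 |  - Returns: none
 828 | ```
 829 | 
 830 | Description:
 831 | 
 832 | distributed is an optionally declared handler method of a taskDef. It can make small changes to the taskDef after it has been downloaded locally, such as adding a local identifier or IP address, or rewriting some of the task's parameters.
 833 | 
 834 | distributed is usually used to rewrite the taskObject, but it can also run initialization work when the taskDef is obtained, such as creating a local service.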
 835 | 
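 836 | #### Method: taskDef.promised
 837 | 
 838 | ``` javascript
 839 | function taskDef.promised(taskResult)
 840 |  - Parameters:
 841 | 	taskResult: Object, the rewritten taskOrder
 842 |  - Returns: type undetermined; the task's execution result taskResult or any other value
 843 | ```
 844 | 
 845 | Description:
 846 | 
 847 | promised() is an optionally declared handler method of a taskDef. Each time the taskDef is executed, it gets one chance to process the return value (taskResult). In promised() you can modify taskResult or return a new value. You can even call a new distribution method in promised().
 848 | 
 849 | If promised() returns a non-null value, that value is used as the execution result of the current taskOrder; otherwise taskResult remains the execution result, whether or not it was modified in promised().
 850 | 
 851 | > NOTE: promised() cannot directly return null as the result value, because that is understood as "use the current taskResult". In a concrete implementation environment, however, the same effect can be achieved by returning a promise. For example: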
 852 | > 
 853 | > ``` javascript
 854 | > return Promise.resolve(null)
 855 | > ```
 856 | 
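 857 | #### Method: taskDef.rejected
 858 | 
 859 | ``` javascript
 860 | function taskDef.rejected(reason)
 861 |  - Parameters:
 862 | 	reason: JSON_Supported, a value or object describing the error
 863 |  - Returns: type undetermined; the task's execution result taskResult or any other value
 864 | ```
 865 | 
 866 | Description:
 867 | 
 868 | rejected() is an optionally declared handler method of a taskDef. Each time execution of the taskDef fails, it gets one chance to handle the error. In rejected() you can construct and return a new taskResult value, or trigger the rejection again.
 869 | 
 870 | You can also call a new distribution method in rejected() and return its result. If an error reason2 occurs inside rejected(), the processing system ignores the original error and returns the error with the new reason2.
 871 | 
 872 | If rejected() returns a non-null value, that value is used as the successful execution result of the current taskOrder; otherwise reason remains a failed result. If user code itself rejects with a new (or the existing) reason, that is also a failed result.
 873 | 
 874 | > NOTE 1: How a new reason is rejected is up to the processing system. For example, in JavaScript: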
 875 | > 
 876 | > ``` javascript
 877 | > return Promise.reject(reason)
 878 | > ```
 879 | > 
 880 | > NOTE 2: As with taskDef.promised(), user code in a concrete processing system can also return a null value. For example:
 881 | > 
 882 | > ``` javascript
 883 | > return Promise.resolve(null)
 884 | > ```
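
The return-value rules of promised() and rejected() can be sketched together on top of a standard Promise chain. This is an illustration, not the reference implementation: the function name `pickTaskResult` is borrowed from step 2.3 of the process specification, its signature is hypothetical, and treating `undefined` the same as `null` is an assumption made because a JavaScript function without a return statement yields `undefined`.

```javascript
// rewritten: the rewritten taskOrder, or a promise for it (e.g. from Promise.all).
function pickTaskResult(taskDef, rewritten) {
  return Promise.resolve(rewritten)
    .then(function (taskResult) {
      if (typeof taskDef.promised !== 'function') return taskResult;
      var returned = taskDef.promised(taskResult);
      // A null return means "use the current taskResult" (possibly modified in place).
      return (returned === null || returned === undefined) ? taskResult : returned;
    })
    .catch(function (reason) {
      // Reached when the rewrite step or promised() fails.
      if (typeof taskDef.rejected !== 'function') throw reason;
      var returned = taskDef.rejected(reason);   // a throw here replaces reason with reason2
      // A null return keeps reason as a failed result.
      if (returned === null || returned === undefined) throw reason;
      return returned;                           // non-null: a successful result
    });
}
```

Because a promise is itself a non-null value, returning `Promise.resolve(null)` from either handler passes through the null check and the chain then settles with `null`, and returning `Promise.reject(reason)` from rejected() settles the chain as a failure.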
 885 | 
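 886 | #### Parsing the scope
 887 | 
 888 | ``` javascript
 889 | function internal_parse_scope(distributionScope)
 890 |  - Depends on: see "Remote interfaces - resource query" in this specification
 891 |  - Parameters:
 892 | 	distributionScope: String, a three parts token string
 893 |  - Returns:
 894 | 	return: Array of String, the RESTful interface addresses of the group of nodes that the scope maps to
 895 | ```
 896 | 
 897 | Description:
 898 | 
 899 | internal_parse_scope accepts a three parts token (systemPart:pathPart:scopePart), but submits only the systemPart:pathPart portion as the key of a remote query, which returns the interface addresses (URLs) of a group of nodes.
 900 | 
 901 | > NOTE: In concrete implementations, the query interface is usually wrapped through a require+subscribe mechanism rather than called directly, to avoid resolving the same distributionScope repeatedly.
 902 | 
 903 | internal_parse_scope then filters the URLs locally according to the scopePart to obtain the string array it finally returns.
 904 | 
 905 | PEDT reserves only one expression for scopePart, "*", which denotes the full set of URLs. scopePart may also take other values, indicating that some other special processing is needed to select a subset of the URLs. Some possible examples:
 906 | 
 907 | ``` text
 908 | "free>2g": selects, among all nodes, those whose currently free memory is greater than 2G
 909 | "include('master')": selects, among all nodes, those whose node name contains the string master
 910 | ```
 911 | 
 912 | These scopePart expressions are defined and implemented by each processing system.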
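
A rough sketch of the two stages described above: splitting the token, then filtering the queried URLs. Several things here are assumptions for illustration only. The token is split at its first two colons, the `query` argument is a synchronous stand-in for the remote query (a real one would be asynchronous), and the `filters` table keyed by the literal scopePart string is a hypothetical way for a processing system to plug in its own expressions.

```javascript
// Split "systemPart:pathPart:scopePart" at the first two colons (assumption).
function splitScope(distributionScope) {
  var first = distributionScope.indexOf(':');
  var second = distributionScope.indexOf(':', first + 1);
  if (first < 0 || second < 0) {
    throw new Error('not a three parts token: ' + distributionScope);
  }
  return {
    key: distributionScope.slice(0, second),          // systemPart:pathPart
    scopePart: distributionScope.slice(second + 1)
  };
}

// query(key) returns an array of URLs; filters maps a scopePart to a predicate.
function internal_parse_scope(distributionScope, query, filters) {
  var parts = splitScope(distributionScope);
  var urls = query(parts.key);
  if (parts.scopePart === '*') return urls;           // reserved: the full set
  var filter = (filters || {})[parts.scopePart];
  if (typeof filter !== 'function') {
    throw new Error('unsupported scopePart: ' + parts.scopePart);
  }
  return urls.filter(filter);
}
```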
 913 | 
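 914 | #### Downloading a task
 915 | 
 916 | ``` javascript
 917 | function internal_download_task(taskId)
 918 |  - Depends on: see "Remote interfaces - task download" in this specification
 919 |  - Parameters:
 920 | 	taskId: String, a string prefixed with "task:".
 921 |  - Returns:
 922 | 	return: String, the JSON text of the taskDef.
 923 | ```
 924 | 
 925 | Description:
 926 | 
 927 | The processing system should obtain the taskDef corresponding to the taskId through the interface negotiated with the remote end (for example, a task registry node).
 928 | 
 929 | > NOTE: How a task is registered with the remote end is unrelated to the task execution process in this specification.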
 930 | 
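 931 | #### Executing a task
 932 | 
 933 | ``` javascript
 934 | function execute_task(taskId, args)
 935 |  - Depends on: see "Local interfaces - downloading a task" in this specification
 936 |  - Parameters:
 937 | 	taskId: String, a string prefixed with "task:".
 938 | 	args: Object, a local object whose members should be JSON-supported data types
 939 |  - Returns: type undetermined; the task's execution result taskResult or any other value
 940 | ```
 941 | 
 942 | Description:
 943 | 
 944 | The processing system first converts the taskId into a taskOrder, then calls mix() to mix args into the taskOrder, and then starts executing that taskOrder.
 945 | 
 946 | The data type of the execution result is undefined. By default it should be the rewritten taskOrder serving as the taskResult, but after processing by taskDef.promised() it may be any value or object.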
 947 | 
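 948 | #### Method: task.run
 949 | 
 950 | ``` javascript
 951 | function task.run(task, args)
 952 |  - Depends on: see "Local interfaces - execute_task" in this specification
 953 |  - Parameters:
 954 | 	task: String/Function/Object, a string prefixed with "task:", or a function, or an object.
 955 | 	args: Object, a local object whose members should be JSON-supported data types
 956 |  - Returns: type undetermined; the task's execution result taskResult or any other value
 957 | ```
 958 | 
 959 | Description:
 960 | 
 961 | If task is a taskId string, it is handed to execute_task().
 962 | 
 963 | > NOTE: The following parts are strongly recommended:
 964 | > 
 965 | > * If task is a local function, call it with the args object as its only argument.
 966 | > * If task is an object, try to invoke it as a local task object (taskObject).
 967 | 
 968 | The data type of the execution result is undefined. When task is a valid taskId or object, the return value by default should be the rewritten taskOrder serving as the taskResult, but after processing by taskDef.promised() it may be any value or object. When task is a local function, the result is that function's return value.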
 969 | 
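 970 | #### Method: task.map
 971 | 
 972 | ``` javascript
 973 | function task.map(distributionScope, taskId, args)
 974 |  - Depends on: see "Local interfaces - internal_parse_scope" in this specification
 975 |  - Depends on: see "Local interfaces - HTTP requests: distributed_request" in this specification
 976 |  - Depends on: see "Remote interfaces - task execution/execute_task" in this specification
 977 |  - Parameters:
 978 | 	distributionScope: String, a three parts token
 979 | 	taskId: String, a string prefixed with "task:".
 980 | 	args: Object, a local object whose members should be JSON-supported data types
 981 |  - Returns:
 982 | 	return: Array, always an array, but the number and the types of its members are not fixed.
 983 | ```
 984 | 
 985 | Description:
 986 | 
 987 | The processor calls internal_parse_scope() to resolve the systemPart:pathPart of distributionScope into a group of reachable node addresses, and assumes that these nodes accept remote execute_task requests, which by convention are RESTful calls.
 988 | 
 989 | The processor resolves the scopePart of distributionScope locally to decide whether to send requests to some or all of the addresses in that list.
 990 | 
 991 | > NOTE: Because scopePart dynamically adjusts the range of reachable nodes, the number of requests sent is not guaranteed (it may differ from the size of the node address list), and so neither is the number of members in the result array.
 992 | 
 993 | The processor uses the taskId as part of the RESTful request URL and sends requests to those nodes through the distributed_request interface (HTTP requests). The requests may be GET or POST, so that args can be carried as parameters or request data.
 994 | 
 995 | The processor treats the responses to those requests as JSON text, parses them into local objects, and puts them into the result array. Finally, it returns the result array as the return value of task.map().
 996 | 
 997 | > NOTE: Because the remote execute_task interface does not guarantee that the returned data matches, in data type, the taskDef/taskObject identified by taskId, task.map necessarily returns a result array of the same size and order as the array of requests sent, but it cannot guarantee that the members of the result array share a data type.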
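
The flow of task.map() can be sketched as follows. The `deps` parameter is hypothetical: `deps.parseScope` stands in for internal_parse_scope() (shown here as synchronous for brevity) and `deps.request` stands in for a single request made by distributed_request(), assumed to return a promise for the response body text.

```javascript
function map(distributionScope, taskId, args, deps) {
  // 4.1: resolve the scope into node addresses
  var urls = deps.parseScope(distributionScope);
  // 4.2: one concurrent request per node, kept in call order
  var requests = urls.map(function (url) {
    return deps.request(url, taskId, args);
  });
  // 4.3 and 4.4: wait for all bodies, then parse each one as JSON
  return Promise.all(requests).then(function (bodies) {
    return bodies.map(function (body) { return JSON.parse(body); });
  });
}
```

Promise.all() preserves the order of its input, so the result array lines up with the node address list even when responses arrive out of order.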
 998 | 
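 999 | ### Remote interfaces
1000 | 
1001 | Remote interfaces are those that must be implemented outside the current processing system. The current processing system assumes that these remote interfaces are ready before any service request calls them, and that they are all delivered as HTTP RESTful interfaces.
1002 | 
1003 | In the interface descriptions below:
1004 | 
1005 | ``` 
1006 | * requests are described using curl,
1007 | * "jq ." is used to display the returned JSON on the console,
1008 | * the ${SERV} parameter stands for the address of the service.
1009 | ```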
1010 | 
1011 | #### Task registration/register_task
1012 | 
1013 | ``` bash
1014 | > curl -s -XPOST --data-binary @taskDef.json "${SERV}/register_task?version=1.1" | jq .
1015 | "task:68bb82e2a6bcbb5f9a83b93c85cff07a"
1016 | ```
1017 | 
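1018 | The version parameter specifies which specification version the taskDef uses; when it is omitted, the server defaults to 1.1.
1019 | 
1020 | On success, the result is a JSON string (with "" quotes) rather than plain text. Returning successful results with "Content-Type: application/json" is recommended.
1021 | 
1022 | > NOTE: The client can check whether the returned text has the '"task:' prefix to judge success.
1023 | 
1024 | When the server encounters an error, returning the error information as follows is recommended:
1025 | 
1026 | > ``` text
1027 | > * set http_status_code to 5xx;
1028 | > * set Content-Type in the header to "application/json";
1029 | > * return the error message text as the json body.
1030 | > ```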
1031 | 
1032 | #### Task download/download_task
1033 | 
1034 | ``` bash
1035 | > curl -s -D- "${SERV}/download_task:68bb82e2a6bcbb5f9a83b93c85cff07a" 
1036 | HTTP/1.1 200 OK
1037 | Content-Type: application/json
1038 | X-Pedt-Version: 1.1
1039 | Date: Wed, 28 Oct 2015 02:59:10 GMT
1040 | Content-Length: 68
1041 | 
1042 | {
1043 | 	"x": { "run": "script:lua:utf8:function() return 'Hi!' end" }
1044 | }
1045 | ```
1046 | 
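1047 | The path actually requested is "/download_"; "task:", being part of the taskId prefix, is appended after that URL path. In the headers returned for this GET request, "X-Pedt-Version" indicates the version the taskDef was registered with, defaulting to 1.1.
1048 | 
1049 | After obtaining the taskDef text, the requester can decide from "X-Pedt-Version" whether it supports the task.
1050 | 
1051 | On success, the result is a taskDef text in JSON format that conforms to the PEDT specification. Returning successful results with "Content-Type: application/json" is recommended.
1052 | 
1053 | > NOTE: The client can check whether the returned text has the '{' prefix to judge success.
1054 | 
1055 | When the server encounters an error, it returns the error information as follows:
1056 | 
1057 | > ``` 
1058 | > * set http_status_code to 5xx;
1059 | > * set Content-Type in the header to "application/json";
1060 | > * return the error message text as the json body.
1061 | > ```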
1062 | 
1063 | #### Task execution/execute_task
1064 | 
1065 | ``` bash
1066 | > curl -s  "${SERV}/execute_task:68bb82e2a6bcbb5f9a83b93c85cff07a" | jq .
1067 | {
1068 | 	"x": "Hi!"
1069 | }
1070 | ```
1071 | 
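1072 | The path actually requested is "/execute_"; "task:", being part of the taskId prefix, is appended after that URL path.
1073 | 
1074 | As described in "Local interfaces - HTTP requests: distributed_request" of this specification, you can add more parameters to the request URL (the args parameter of execute_task), or use a POST request and submit data. Further, with a POST request you can also use headers to specify the content type (Content-Type) of the data or to direct how the args parameter is handled.
1075 | 
1076 | On success, the result may be any JSON text. Returning successful results with "Content-Type: application/json" is recommended.
1077 | 
1078 | > NOTE: The client can check the prefix of the returned text to determine whether the return value is valid JSON text (note that this is not an entirely reliable method).
1079 | 
1080 | When the server encounters an error, it returns the error information as follows:
1081 | 
1082 | > ``` 
1083 | > * set http_status_code to 5xx;
1084 | > * set Content-Type in the header to "application/json";
1085 | > * return the error message text as the json body.
1086 | > ```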
1087 | 
1088 | #### Resource query
1089 | 
1090 | ``` bash
1091 | > curl -s  "${SERV}/query?SYS:a/b/c/d" | jq .
1092 | [
1093 |   "http://127.0.0.1:8011/kwaf/invoke?execute=",
1094 |   "http://127.0.0.1:8012/kwaf/invoke?execute=",
1095 |   "http://127.0.0.1:8010/kwaf/invoke?execute=",
1096 |   "http://127.0.0.1:8013/kwaf/invoke?execute="
1097 | ]
1098 | ```
1099 | 
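1100 | With a GET request, the path actually requested is "../query", and the whole search string that follows, "SYS:a/b/c/d", is the systemPart:pathPart portion of the distributionScope.
1101 | 
1102 | With a POST request, the following JSON object should be sent as the request data: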
1103 | 
1104 | ``` JSON
1105 | {
1106 | 	"key": "SYS:a/b/c/d",
1107 | 	"type": "scope",
1108 | 	"version": "1.1"
1109 | }
1110 | ```
1111 | 
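1112 | When key is omitted, the server uses the search string of the URL as the key (the same as for a GET request).
1113 | 
1114 | In both GET and POST requests, the key is always encoded with encodeURI().
1115 | 
1116 | On success, the server's result is always an array of strings, which may be empty. Returning successful results with "Content-Type: application/json" is recommended.
1117 | 
1118 | > NOTE: The client can check whether the returned text has the '[' prefix to judge success.
1119 | 
1120 | When the server encounters an error, it returns the error information as follows:
1121 | 
1122 | > ``` 
1123 | > * set http_status_code to 5xx;
1124 | > * set Content-Type in the header to "application/json";
1125 | > * return the error message text as json.
1126 | > ```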
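
The GET and POST forms of the query request can be sketched as a helper that only describes the request. The function name `buildQueryRequest` and the returned object layout are hypothetical, and the sketch leaves out request headers because this section does not specify any for query.

```javascript
// Hypothetical helper: describes a resource query without sending it.
// key is the systemPart:pathPart portion of a distributionScope.
function buildQueryRequest(serv, key, method) {
  var encodedKey = encodeURI(key);        // the key is always encodeURI()-encoded
  if (method === 'POST') {
    return {
      method: 'POST',
      url: serv + '/query',
      body: JSON.stringify({ key: encodedKey, type: 'scope', version: '1.1' })
    };
  }
  // GET: the encoded key is the entire search string
  return { method: 'GET', url: serv + '/query?' + encodedKey, body: null };
}
```

encodeURI() leaves `:` and `/` untouched, so a key such as `SYS:a/b/c/d` is sent as written, while characters such as spaces are percent-encoded.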
1127 | 
1128 | ### Other
1129 | 
1130 | The interfaces listed below are recommended.
1131 | 
1132 | #### Resource subscription/subscribe
1133 | 
1134 | ``` bash
1135 | > curl -s -XPOST --data '{...}' "${SERV}/subscribe" | jq .
1136 | [
1137 |   "http://127.0.0.1:8011/kwaf/invoke?execute=",
1138 |   "http://127.0.0.1:8012/kwaf/invoke?execute=",
1139 |   "http://127.0.0.1:8010/kwaf/invoke?execute=",
1140 |   "http://127.0.0.1:8013/kwaf/invoke?execute="
1141 | ]
1142 | ```
1143 | 
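1144 | Alongside the query interface, a resource service can also provide a subscribe interface for a given systemPart:pathPart. In that case systemPart:pathPart is understood as the resource key.
1145 | 
1146 | The subscribe interface supports only POST requests, and its post data is essentially the same as that of "resource query":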
1147 | 
1148 | ``` JSON
1149 | {
1150 | 	"key": "SYS:a/b/c/d",
1151 | 	"type": "scope",
1152 | 	"version": "1.1",
1153 | 	"notify": "..."
1154 | }
1155 | ```
1156 | 
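1157 | The server uses the notify field in the post data above to keep a list of subscribers. When a subscribed resource changes, the subscriber receives a notification at its notify address; the subscriber should then treat the resource for that resource key as stale and fetch it again.
1158 | 
1159 | Notifications on notify use exactly the same protocol as "resource query" above: systemPart:pathPart is sent as the search string of a GET request, or as the key field of a POST request.
1160 | 
1161 | The subscribe interface returns the same result as query; in fact it calls query after completing the subscription.
1162 | 
1163 | #### Resource request/require
1164 | 
1165 | A simple wrapper around the remote query interface, used for example to implement a local cache. It always returns the same result as query.
1166 | 
1167 | The require interface is used to implement a proxy in multi-level resource management services, so that clients do not face the query+subscribe interfaces directly.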
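
One way to picture require on top of subscribe is a small cache that is invalidated by notifications. This is a sketch under assumptions: `createResourceCache`, `requireResource` and `onNotify` are hypothetical names, `subscribe` is an injected function that performs the remote call, and whatever it returns (typically a promise for the URL array) is cached as is.

```javascript
// Hypothetical require wrapper: caches the result of subscribe(key) per resource key.
function createResourceCache(subscribe) {
  var cache = Object.create(null);
  return {
    // Returns the cached result, subscribing only on the first request for a key.
    requireResource: function (key) {
      if (!(key in cache)) cache[key] = subscribe(key);
      return cache[key];
    },
    // Called when a notification arrives at the notify address:
    // the cached resource is stale, so the next request fetches it again.
    onNotify: function (key) {
      delete cache[key];
    }
  };
}
```

Caching per resource key is what avoids resolving the same distributionScope repeatedly between notifications.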
1168 | 
1169 | #### Method: task.reduce
1170 | 
1171 | ``` javascript
1172 | function task.reduce(distributionScope, taskId, reduce)
1173 | or
1174 | function task.reduce(distributionScope, taskId, args, reduce)
1175 |  - Parameters:
1176 | 	distributionScope, taskId, args: (see task.map)
1177 | 	reduce: function, the callback invoked after the task.map step
1178 |  - Returns: type undetermined; the result of calling task.run(reduce, taskResults).
1179 | ```
1180 | 
1181 | This method is an implementation of the classic map/reduce model: the reduce function in the interface processes the group of results taskResults returned by map().
1182 | 
1183 | Without the args parameter, it is implemented with the following logic (the example is written in JavaScript):
1184 | 
1185 | ``` javascript
1186 | return task.run(reduce, task.map(distributionScope, taskId))
1187 | ```
1188 | 
1189 | With the args parameter, it is implemented with the following logic (the example is written in JavaScript):
1190 | 
1191 | ``` javascript
1192 | return task.run(reduce, task.map(distributionScope, taskId, args))
1193 | ```
1194 | 
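1195 | #### Method: task.daemon
1196 | 
1197 | ``` javascript
1198 | function task.daemon(distributionScope, taskId, daemon, daemonArgs)
1199 |  - Parameters:
1200 | 	distributionScope, taskId: (see task.map)
1201 | 	daemon: function, the function called before the task.map step
1202 | 	daemonArgs: Object, the arguments passed when daemon is called through task.run()
1203 |  - Returns: Array, the result of calling task.map().
1204 | ```
1205 | 
1206 | This method first calls task.run(daemon, daemonArgs) locally to make sure that a service is started locally (or that a remote service is launched from the local side), and then calls task.map(). The return value of that map call usually does not matter, because the real result or target of the behavior is the service started beforehand.
1207 | 
1208 | The method is always implemented as: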
1209 | 
1210 | ``` javascript
1211 | return task.map(distributionScope, taskId, task.run(daemon, daemonArgs))
1212 | ```
1213 | 
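1214 | > NOTE: The value returned by running/starting daemon is passed as the args of map(). map therefore cannot define its own args value; if that is really needed, the value should be defined in daemonArgs and returned to map() by daemon.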
1215 | 
1216 | # Implementation
1217 | 
1218 | (omitted for now)
1219 | 
1220 | # Q&A
1221 | 
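1222 | 1. Why is there a specification 0.9
1223 |    
1224 |    The core of specification 0.9 is to allow PEDT tasks to be used in environments that do not support overly complex text analysis or object orientation, while making sure that higher versions of the specification remain backward compatible with it. For example, while writing this specification we drew extensively on an implementation of it in a BASH environment.
1225 |    
1226 | 2. Does specification 0.9 really not support task arguments
1227 |    
1228 |    Indeed. But you can achieve this in the processing system by executing the task in two phases: first fetch the arguments from a remote or local configuration system and prepare global (or local) variables in the processing system, then execute the task and use those global (or local) variables.
1229 |    
1230 | 3. Why does specification 1.0 exist as a weakened version of 1.1
1231 |    
1232 |    The core of specification 1.0 is static task definition and execution. It contains the complete task definition features and is easy to process across systems, but all of its features concerning dynamic execution are "optional or undefined", which means that dynamically launching a new task from within a task is hard.
1233 |    
1234 |    Specification 1.0 suits task systems that deliver static logic, such as timer-triggered tasks, data collection and preprocessing tasks, or specific watch tasks. It is hard to apply to complex distributed workloads with dynamic logic.
1235 |    
1236 | 4. Why do strings use a three-part prefix
1237 |    
1238 |    The simple benefit is that, ignoring the encoding, a forward search quickly determines the type of the content (context/body); for example, the current environment only needs to check for "script:lua:" to decide whether to handle it. The encoding types are declared in nested pairs in the latter part of the prefix, so decoding can be done by simple recursion without considering the type declaration in the former part.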
1239 |    
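1240 | 5. Does the processing system need to implement the complete Promise
1241 |    
1242 |    The PEDT specification asks that Promise be implemented according to the specification in MDN (in fact the ECMAScript specification), but this is not mandatory.
1243 |    
1244 |    First, PEDT in fact only requires that parallel tasks be implemented in the form of the Promise.all interface. At the interface level, whether the processing system implements this asynchronously or synchronously makes no difference to the interface (only to performance). Second, in the PEDT specification a single processing node exposes one and only one execute_task interface, which carries only the RESTful interface requirements and places no restriction on whether its internal implementation uses Promise or is synchronous or asynchronous.
1245 |    
1246 |    So a processing system can implement PEDT in a fully synchronous way inside the current node and expose the execute_task interface to show that it can process PEDT tasks. In systems based on specification 0.9 this is a feasible (even common) strategy. In such a scheme Promise.all is replaced by a synchronous task queue, and other asynchronous features (such as task.run) are handled in a similar way.
1247 |    
1248 |    Seen from another angle, a processing system can also achieve the asynchronous effect of Promise.all without implementing a Promise framework, which is still technically feasible.
1249 |    
1250 | 6. Why use a RESTful interface such as "/execute_"
1251 |    
1252 |    Because PEDT needs to pass the args parameters through the url when using a simple GET request, it is not suitable to handle taskId as a url parameter. So when describing the interfaces of this specification, we recommend appending taskId to the interface as part of the URL path.
1253 |    
1254 |    With a POST request, args can be submitted as data and mixin=false can be added to the header to keep the server from mixing args with the url parameters, so in that case taskId can also be encoded as a url parameter.
1255 |    
1256 |    On the design of RESTful interface addresses (URL+Path), this specification is advisory rather than mandatory. For example, in the n4c architecture, when the ngx_cc module implements support for the PEDT protocol, the URL used here is 'http://.../../invoke?execute='.
1257 |    
1258 | 7. Can prefix checking be used to verify the validity of values returned by remote interfaces
1259 |    
1260 |    We do not recommend this approach.
1261 |    
1262 |    All PEDT responses default to the application/json type:
1263 |    
1264 |    > * the response can be any value in valid JSON format (not necessarily an object);
1265 |    > * errors are indicated by the http status, and the content of the response body is not (strictly) constrained.
1266 |    
1267 |    In these circumstances prefix detection is neither safe nor effective. However, the following cases always indicate failure (the converse does not guarantee success):
1268 |    
1269 |    > * register_task was called and the result does not start with the '"task:' prefix.
1270 |    > * download_task was called and the result does not start with "{".
1271 |    > * task.map() or resource require() was called and the result does not start with "[".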
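
The three "always a failure" cases in answer 7 can be expressed as a one-way check. This is a sketch only: the function name `definitelyFailed` and the call labels are hypothetical, and skipping leading whitespace before comparing is an assumption. A `false` result does not prove success; the HTTP status remains the authoritative signal.

```javascript
// Expected leading text per call; a mismatch is always a failure.
var EXPECTED_PREFIX = {
  register_task: '"task:',
  download_task: '{',
  map: '[',
  require: '['
};

// Returns true when the response text certainly indicates a failure.
function definitelyFailed(call, text) {
  var prefix = EXPECTED_PREFIX[call];
  if (prefix === undefined) throw new Error('unknown call: ' + call);
  // Assumption: leading whitespace is ignored before the comparison.
  return String(text).trimStart().indexOf(prefix) !== 0;
}
```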


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
  1 | ## N4C
  2 | 
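  3 | N4C is a standard architecture. It provides a set of guiding specifications, cases, and highly reliable modules for implementing complex, stable business logic on a "controllable, computable, communicable cluster".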
  4 | 
  5 | > N4C is a Controllable & Computable Communication Cluster architecture.
  6 | 
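  7 | N4C was originally proposed with real-time computing in mind. It puts forward three principles for real-time computing, which serve as the basic principles of real-time computing under the N4C architecture.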
  8 | 
  9 | > ``` 
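 10 | > - No full data set: under transient observation there is no requirement for consistency or validity; data completeness within a single computation is unnecessary.
 11 | > - Compute nearby: the earlier data is normalized, the higher the payoff; there is always a central point where data is merged, but it can be dynamic; hierarchy determines parallel efficiency.
 12 | > - Approach precision: nothing is precise in one pass; precision should be approached through a sufficient number of correct results.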
 13 | > ```
 14 | 
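 15 | Real-time computing is not the only environmental factor that N4C considers.
 16 | 
 17 | This project (aimingoo/n4c) provides only the standardized documents of N4C and related indexes.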
 18 | 
 19 | ### Table of Contents
 20 | 
 21 | * [N4C specifications](#n4c-specifications)
 22 |   * [1. N4C architecture](#1-n4c-architecture)
 23 |   * [2. PEDT specifications](#2-pedt-specifications)
 24 | * [N4C projects](#n4c-projects)
 25 |   * [PEDT implements](#pedt-implements)
 26 |     * [redpoll](#redpoll)
 27 |     * [harpseal](#harpseal)
 28 |     * [tundrawolf](#tundrawolf)
 29 |     * [pedt-scala](#pedt-for-scala)
 30 |   * [N4C implements](#n4c-implements)
 31 |     * [ngx_4c](#ngx_4c)
 32 |     * [sandpiper](#sandpiper)
 33 | * [N4C documents](#n4c-documents)
 34 | * [others](#others)
 35 | * [history](#history)
 36 | 
 37 | # N4C specifications
 38 | 
 39 | ## 1. N4C architecture
 40 | 
 41 | ![n4c architecture](https://github.com/aimingoo/n4c/wiki/images/n4c_architecture.png)
 42 | 
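 43 | N4C as a whole is a parallel, distributed computing architecture with the following characteristics:
 44 | 
 45 | - Real-time: supports in-place computation to provide sufficient real-time performance
 46 | - Distributed: uses the PEDT distributed task specification
 47 | - Parallel computing: the Promise parallel computing model
 48 | 
 49 | Cluster construction in N4C is called the Communication Cluster. It requires that the nodes of the cluster can communicate in groups; this "grouping" relationship is abstracted as a resource description (n4c resource description) and is reflected to the resource management layer through an interface (n4c resource interface).
 50 | 
 51 | The task management layer is an abstraction layer in which the underlying resource management is abstracted once more as a Scope. The application layer at the top no longer faces "resources" directly, which avoids problems such as "resource scheduling". The application layer is described specifically as a kind of "distributed task" with its own independent specification, called PEDT (Parallel Exchangeable Distribution Task).
 52 | 
 53 | PEDT defines the format of tasks and the way they interact and, most importantly, the basic unit and pattern of "task scheduling".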
 54 | 
 55 | ## 2. PEDT specifications
 56 | 
 57 | > @see $(n4c)/PEDT/*
 58 | 
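 59 | The Parallel Exchangeable Distribution Task (PEDT) is a parallel task specification that targets computable clusters and aims at real-time processing. The specification defines three parts: task data, task processing, and task invocation interfaces.
 60 | 
 61 | PEDT aims to provide computable clusters with a lightweight, efficient, cross-platform, and reliable task processing mechanism.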
 62 | 
 63 | # N4C projects
 64 | 
 65 | This section lists the main open source projects related to N4C.
 66 | 
 67 | ## PEDT implements
 68 | 
 69 | * [redpoll for nodejs](https://github.com/aimingoo/redpoll)
 70 | * [harpseal for lua](https://github.com/aimingoo/harpseal)
 71 | * [tundrawolf for nginx_lua](https://github.com/aimingoo/tundrawolf)
 72 | * [pedt-scala](https://github.com/cyber4ron/pedt-scala)
 73 | 
 74 | ### redpoll
 75 | 
 76 | redpoll (红顶雀) is a nodejs project that fully implements PEDT 1.1. It can be installed as an npm module.
 77 | 
 78 | [https://github.com/aimingoo/redpoll](https://github.com/aimingoo/redpoll)
 79 | 
 80 | ### harpseal
 81 | 
 82 | harpseal (竖琴海豹) is a lua project that implements PEDT 1.1 in native lua. It requires the LuaSocket module for its http client and the copas module for parallel http requests.
 83 | 
 84 | > [https://github.com/aimingoo/harpseal](https://github.com/aimingoo/harpseal)
 85 | 
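 86 | harpseal implements a complete distributionScope parsing function (prefixParse in infra/taskhelper.lua), which is the three parts token parsing algorithm that conforms most closely to the PEDT specification. This differs considerably from other implementations (such as redpoll for nodejs), which usually handle the three parts token with simple prefix matching. See: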
 87 | 
 88 | > ``` text
 89 | > $(harpseal)/testcase/t_pedt_prefix.lua
 90 | > ```
 91 | 
 92 | harpseal also implements a tool that registers taskDefs recursively, so a multi-level taskDef can be registered to the register_center in one pass. See:
 93 | 
 94 | > ``` text
 95 | > $(harpseal)/testcase/t_loadTask.lua
 96 | > $(harpseal)/tools/taskloader.lua
 97 | > ```
 98 | 
 99 | ### tundrawolf
100 | 
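101 | tundrawolf (冻原狼) is an nginx_lua project. It is essentially the same as the harpseal project, but it directly uses the http client, MD5, BASE64 and other functions/modules built into nginx lua, and is customized specifically for the nginx lua environment.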
102 | 
103 | > [https://github.com/aimingoo/tundrawolf](https://github.com/aimingoo/tundrawolf)
104 | 
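105 | tundrawolf implements a distinctive (standalone) mechanism: route discovery (system_route discoveries). It manages the objects registered with the system route (system_route) on the current node. These objects can be anything (other than the lua values false/nil). Once an object is registered with the system route, it can be used as a parallel object in the Promise framework.
106 | 
107 | tundrawolf is usually the base component for implementing N4C in nginx lua.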
108 | 
109 | ### PEDT for Scala
110 | 
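111 | PEDT-scala is an implementation of the PEDT protocol in scala, based on Nashorn Javascript. It implements the PEDT protocol itself in scala (its "com.wandoujia.n4c.pedt.core.PEDT4JS" is a native java object) and uses javascript to deliver an extensible programmable interface.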
112 | 
113 | > [https://github.com/cyber4ron/pedt-scala](https://github.com/cyber4ron/pedt-scala)
114 | 
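115 | PEDT-scala implements a simple n4c mock for testing. Third-party projects can use this mock to test the PEDT protocol (alternatively, Sandpiper in nodejs can serve as the resource server for testing).
116 | 
117 | PEDT-scala describes a clear and concise working framework.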
118 | 
119 | ![illustrate pedt-scala](https://github.com/cyber4ron/notes/raw/master/images/pedt-scala.png)
120 | 
121 | ## N4C implements
122 | 
123 | * [ngx_4c in nginx](https://github.com/aimingoo/ngx_4c)
124 | * [sandpiper in nodejs](https://github.com/aimingoo/sandpiper)
125 | 
126 | ### ngx_4c
127 | 
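128 | ngx_4c is a programming framework on nginx. It implements the N4C architecture on top of the following projects:
129 | 
130 | > * tundrawolf
131 | >   
132 | >   Implements the PEDT distribution tasks specification.
133 | >   
134 | > * ngx_cc
135 | >   
136 | >   Implements communication within an nginx cluster.
137 | >   
138 | > * Promise
139 | >   
140 | >   Implements parallel processing.
141 | >   
142 | > * Events
143 | >   
144 | >   Implements a standard programming architecture on nginx based on multicast events; the resource management and task management services of N4C are built on Events.
145 | >   
146 | > * etcd
147 | >   
148 | >   Uses etcd for storage and heartbeat notification in the cluster construction layer of N4C.
149 | 
150 | You need to install nginx to build your test environment. Because ngx_4c can optionally integrate ngx_cc, and the latter requires a specific patch to nginx so that each nginx worker process is assigned a communication port, we recommend reading the following document first: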
151 | 
152 | > ``` text
153 | > https://github.com/aimingoo/ngx_4c#install--usage
154 | > ```
155 | 
156 | However, if ngx_cc is not integrated, ngx_4c itself does not require the nginx patch mentioned above.
157 | 
158 | ### sandpiper
159 | 
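160 | sandpiper (Baird's sandpiper, 黑腰滨鹬) is an N4C implementation on nodejs, based on the following projects:
161 | 
162 | > * redpoll
163 | >   
164 | >   Implements the PEDT distribution tasks specification.
165 | >   
166 | > * node-etcd-promise
167 | >   
168 | >   An etcd client implemented with promise-based parallel processing.
169 | >   
170 | > * request
171 | >   
172 | >   A simple http client.
173 | >   
174 | > * crc-32
175 | >   
176 | >   Uses crc-32 for a second verification of the taskDef when a task is registered.
177 | >   
178 | > * etcd
179 | >   
180 | >   Uses etcd for storage and heartbeat notification in the cluster construction layer of N4C.
181 | 
182 | sandpiper uses the standard Promise built into nodejs for parallel processing. Like ngx_4c, it implements a complete resource center and task center and publishes their standard client interfaces. Based on the PEDT specification, it fully implements roles such as executor, publisher, and dispatcher in an N4C cluster.
183 | 
184 | sandpiper provides a few small tools, including a resource service (for quickly checking/testing other N4C projects):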
185 | 
186 | ``` bash
187 | > git clone 'http://github.com/aimingoo/sandpiper'
188 | > npm install; npm start
189 | ```
190 | 
191 | # N4C documents
192 | 
193 | This section lists the main public documents and discussions about N4C.
194 | 
195 | # others
196 | 
197 | Other content yet to be determined.
198 | 
199 | 
200 | 
201 | # history
202 | 
203 | ``` text
204 | 	2015.11		N4C opensource and github hosted.
205 | ```
206 | 
207 | 


--------------------------------------------------------------------------------