├── 0.9.1
│   ├── darwin-amd64.zip
│   ├── linux-386.zip
│   └── linux-amd64.zip
├── README.md
├── php-sdk
│   ├── rabbit.0.8.3.php
│   └── rabbit.0.9.1.php
└── python-sdk
    ├── rabbit.0.8.3.py
    └── rabbit.0.9.1.py


/0.9.1/darwin-amd64.zip:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nixuehan/rabbit/31118c076205e9731946a25a398f75eef0edf5fa/0.9.1/darwin-amd64.zip

--------------------------------------------------------------------------------
/0.9.1/linux-386.zip:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nixuehan/rabbit/31118c076205e9731946a25a398f75eef0edf5fa/0.9.1/linux-386.zip

--------------------------------------------------------------------------------
/0.9.1/linux-amd64.zip:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nixuehan/rabbit/31118c076205e9731946a25a398f75eef0edf5fa/0.9.1/linux-amd64.zip

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------

rabbit-兔兔不良
============


Features
-------------------

* Matching on combinations of multiple banned words
* Correction of malformed (obfuscated) banned words
* Backed by a large official banned-word library that keeps up with current events
* Custom banned words


Installation
-------------------

Download the build that matches your machine and run it:


    $ ./rabbit


### Supported flags:


    Usage of ./rabbit:

      -apikey string
            application apikey (default "sdeolkmddw")
      -host string
            bind address (default "127.0.0.1")
      -interval int
            auto reload time interval (default 12)
      -log string
            log file path (default "rabbit.log")
      -port int
            port the rabbit will listen on (default 9394)


The defaults are fine for most flags. Two are worth explaining.
-interval sets how often the banned-word library is reloaded, in hours. By default it reloads automatically every 12 hours.
#### -apikey defaults to a test key. For production use, contact me for an apikey: add me on QQ and mention "apikey" in the request. QQ: 237852571


-------------------
API test console
-------------------

#### Online test console: http://tutusoft.net (try it out first).


rabbit currently ships with a basic built-in admin console for creating, deleting, editing, and querying banned words.

After starting the rabbit service, open http://<your bound IP>:<port>

There you can test the filtering, add custom banned words, and so on. You can also use the same APIs to build your own monitoring console for blocked content.

-------------------
sdk
-------------------
php: https://github.com/nixuehan/rabbit/tree/master/php-sdk

python: https://github.com/nixuehan/rabbit/tree/master/python-sdk

nodejs: to be continued

golang: to be continued


-------------------
What you need to know
-------------------


* Blacklist and graylist: these correspond to the values 1 and 2. For example, 撸撸射 is blacklisted and 苍井空 is graylisted.
  When content is matched, the response includes a rate, so your code can decide whether to block the content outright or queue it for manual review afterwards.


* The banned-word library has two parts: the official one and your custom one. Incoming content is filtered against the official core library first, then against your custom words. The official library is updated continuously so that you do not have to maintain it. I plan to open it up later so that everyone can contribute.

* Malformed-word correction: this option appears when you add a custom banned word. By default, every numeral form is converted to Arabic digits and Traditional Chinese is converted to Simplified Chinese before matching. This can cause false positives. Take the banned word 血案+六四 and the content: 血案呀,今天我被割到手指。在641寝室被割的。 Because 六四 is normalized to 64, which also occurs in 641, this content is hit by 血案+六四. So when adding 血案+六四, set malformed-word correction to 2 to turn it off. Then 六四 is not converted to 64 internally.
  The false-positive rate drops considerably as a result.

* Banned-word categories: illegal, pornographic, politically sensitive, and so on. I have defined 9 categories; pick whichever fits.

* Combined banned words: banned words can be combined with +, for example 天猫招工+空闲+结算. At most 3 words can currently be combined. A few built-in contact types are also supported: qq, phone, and url. For example: 信用卡套现:phone, 轻松赚钱+曰结:qq.



### Remember to reload banned words

After changing banned words, remember to open http://<your bound IP>:<port> and reload the banned words. Otherwise the changes do not take effect. The reload can also be triggered with curl.


--------
API documentation
--------
With the API you can easily integrate the service into your own project.

### Filter API

POST /filter

Parameters: contents=蒙汗药

Response format: JSON

Example response and field descriptions:
{"Category":"9","CategoryName":"违法信息","Hit":"1","Id":"38509","Rate":"2","Word":"蒙汗药"}

```text
Hit: whether the content matched; 0 = no, 1 = yes

Category: category id of the banned word

CategoryName: name of the category the banned word belongs to

Id: banned-word ID; use this ID to edit the banned word

Rate: blacklist or graylist; 1 = blacklist, 2 = graylist (review the content yourself)

Word: the banned word
```

--------

### Pornographic image filter
#### Images are currently scored by the proportion of human skin tones they contain. This is a crude heuristic, better than nothing, and it does not work equally well for all skin tones. Use it together with a manual review console.

POST /porn

Parameters: file=/data/thumb/xxoo.jpg or file=http://sd.com/xxoo.jpg

Response format: JSON

Example response and field descriptions:
{"score":86}

```text
score: suggested thresholds: 65 - 85 is suggestive, above 85 is pornographic
```


### Add a banned word

POST /create

Parameters:

category: category id; see the category endpoint

word: the banned word

rate: blacklist or graylist; 1 = blacklist, 2 = graylist

correct: whether malformed-word correction is enabled; 1 = yes, 2 = no

Response format: JSON

Example response: { "success": 1 }

--------

### Delete a banned word

DELETE /delete

Parameters: id=1

Response format: JSON

Example response: { "success": 1 }


--------

### Revise a banned word

PUT /revise

Parameters:

id: banned-word id (primary key)

category: category id; see the category endpoint

word: the banned word

rate: blacklist or graylist; 1 = blacklist, 2 = graylist

correct: whether malformed-word correction is enabled; 1 = yes, 2 = no

Response format: JSON

Example response: { "success": 1 }

--------

### Query banned words

GET /query

Parameters:

id: banned-word id (primary key)

category: category id; see the category endpoint

word: the banned word

rate: blacklist or graylist; 1 = blacklist, 2 = graylist

correct: whether malformed-word correction is enabled; 1 = yes, 2 = no

start: index of the first record (for pagination)

end: index of the last record (for pagination)

Response format: JSON

Example response and field descriptions: [ { "Id": 9, "Category": 2, "CategoryName": "低俗信息", "Word": "我做你做不做", "Correct": 1, "Rate": 1 }, { "Id": 8, "Category": 2, "CategoryName": "低俗信息", "Word": "发问了你", "Correct": 1, "Rate": 1 } ]

```text
Category: category id of the banned word

CategoryName: name of the category the banned word belongs to

Id: banned-word ID; use this ID to edit the banned word

Rate: blacklist or graylist; 1 = blacklist, 2 = graylist (review the content yourself)

Word: the banned word

Correct: whether malformed-word correction is enabled
```

--------

### Banned-word categories

GET /category

Parameters: none

Response format: JSON

[{"Category_id":1,"Category_name":"个性化"},{"Category_id":2,"Category_name":"低俗信息"},{"Category_id":3,"Category_name":"灌水信息"},{"Category_id":5,"Category_name":"政治敏感"},{"Category_id":6,"Category_name":"违约广告"},{"Category_id":7,"Category_name":"跨站追杀"},{"Category_id":8,"Category_name":"色情信息"},{"Category_id":9,"Category_name":"违法信息"},{"Category_id":10,"Category_name":"垃圾广告"}]


--------

### Reload banned words

GET /reload

Note: added or modified banned words take effect only after a reload.

Response format: JSON

Example response: { "success": 1 }


----------
Benchmark results
----------
Hardware: Alibaba Cloud host, CPU: 1 core, memory: 1024 MB

The benchmark payload is shown below. It totals 371 words, 817 characters:
```lua
wrk.method = "POST"
wrk.body =
"contents=原来,7日中午,张某夫妇带着儿子张峰(化名,现年2岁半)和女儿张娟(化名,现年1岁),来到位于银海区银滩镇龙潭村委会的姐夫刘某家吃饭。吃饭过程中,张峰和张娟被放在刘某房间玩耍,调皮的张峰在床头的夹层里翻出一包东西,他以为是零食,顺手抓了几颗放进嘴里。一旁的张娟见哥哥吃东西,她也爬过来拿着往嘴里塞两人的举动引起张某夫妇的注意。等走近一看,发现孩子们吃的东西竟是老鼠药。吓坏了的张某赶紧抱起儿子张峰,并从其嘴里抠出两粒老鼠药。“快送医院!”目睹眼前这一幕后,刘某急忙大喊。回过神来的夫妇,立即抱起两个孩子冲出门外,并拨打了120急救电话其间,由于担心路上被堵,刘某的妻子建议先把孩子送到附近派出所,再通过民警送往医医。当天下午,经过约一小时的急救和洗胃,张峰和张娟脱离了生命危险。随后,两人被安排在儿科儿童病房进行观察。8日下午,两人的各项身体指标,都已恢复正常医生提醒家长,蒙汗药是一种烈性毒药,千万不要放在小孩够得到的地方"

wrk.headers["Content-Type"] = "application/form-data"
```

Run it:
```shell
./wrk -t2 -c100 -d60s --script=../post.lua http://10.161.171.74:9394/filter
```

2 threads, 100 connections, 60 seconds:
```text
Running 1m test @ http://10.16.17.74:9394/filter
  2 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    12.82ms   10.94ms  90.37ms   88.89%
    Req/Sec     4.63k     1.06k    6.85k    63.33%
  552582 requests in 1.00m, 70.62MB read
Requests/sec:   9207.21
Transfer/sec:      1.18MB
```

2 threads, 3000 connections, 30 seconds:
```text
Running 30s test @ http://10.16.17.74:9394/filter
  2 threads and 3000 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   314.93ms   97.50ms    1.82s   84.51%
    Req/Sec     4.48k     1.60k    7.33k    78.82%
  266017 requests in 30.04s, 33.99MB read
Requests/sec:   8854.35
Transfer/sec:      1.13MB
```

2 threads, 4000 connections, 30 seconds. Timeouts start to appear here; this is about the limit of this Alibaba Cloud instance:
```text
Running 30s test @ http://10.161.171.74:9394/filter
  2 threads and 4000 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   375.02ms  133.17ms    1.95s   78.31%
    Req/Sec     4.22k     1.68k    8.15k    69.59%
  249464 requests in 30.10s, 31.88MB read
  Socket errors: connect 0, read 871, write 0, timeout 136
Requests/sec:   8288.02
Transfer/sec:      1.06MB
```



Next, the same benchmark on my Mac. Configuration: Intel Core i5 1.6 GHz, 8 GB of memory.
```shell
wrk -t8 -c100 -d60s --script=./post.lua http://127.0.0.1:9394/filter
```
Results:
```text
Running 1m test @ http://127.0.0.1:9394/filter
  8 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     7.76ms   12.46ms 245.15ms   90.71%
    Req/Sec     2.62k   637.00    12.44k    79.64%
  1253342 requests in 1.00m, 198.42MB read
Requests/sec:  20853.38
Transfer/sec:      3.30MB
```

Still on my Mac, this run tests long content and multi-word matching. Content: 4420 words, 9916 characters. Banned word: 网络+兼职+日入:qq
```shell
wrk -t8 -c100 -d60s --script=./post.lua http://127.0.0.1:9394/filter
```
Results:
```text
Running 1m test @ http://127.0.0.1:9394/filter
  8 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    13.27ms   16.38ms 219.79ms   87.87%
    Req/Sec     1.33k   235.26     4.59k    73.11%
  636879 requests in 1.00m, 100.82MB read
Requests/sec:  10596.88
Transfer/sec:      1.68MB
```

----------
Future plans?
----------
I am writing a full-featured monitoring console that makes it easy to manage existing banned words and to review content that has been blocked.

----------
Having problems?
----------
Join group 243663452 to reach me.

--------------------------------------------------------------------------------
/php-sdk/rabbit.0.8.3.php:
--------------------------------------------------------------------------------
        $this->host($host,$port);
    }

    public function __clone(){
        trigger_error('Clone is not allowed!',E_USER_ERROR);
    }

    /**
     * Debug helper for API responses.
     *
     * @param string $msg raw response body
     * @return void exits with $msg when $this->trace is enabled
     */
    private function trace($msg) {
        if($this->trace){
            exit($msg);
        }
    }

    private function decode($data) {
        $this->trace($data);
        return json_decode($data,true);
    }

    private function host($host,$port) {
        $this->host = "http://" . $host . ":" .
            $port."/";
    }

    // Declared static because it is called as Rabbit::factory(...).
    public static function factory($host = 'localhost',$port = 9394) {
        if(is_null(self::$instance)) {
            self::$instance = new self($host,$port);
        }
        return self::$instance;
    }

    private function request($do,Array $parameter = [],$method = 'GET') {
        $query = '';

        $ch = curl_init();

        if($method != 'GET') {
            curl_setopt($ch, CURLOPT_POSTFIELDS, $parameter);
            curl_setopt($ch, CURLOPT_CUSTOMREQUEST, $method);

        }else{
            $query = "?" . http_build_query($parameter);
        }

        curl_setopt($ch, CURLOPT_URL, $this->host.$do.$query);
        curl_setopt($ch, CURLOPT_TIMEOUT,60);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER,true);

        $data = curl_exec($ch);
        if(curl_errno($ch)){
            curl_close($ch); // release the handle on failure as well
            return false;
        }
        curl_close($ch);
        return $data;
    }

    /**
     * Filter content.
     *
     * @param string $contents content to filter
     * @return bool or array
     */
    public function filter($contents) {
        $result = $this->request("filter",['contents' => $contents],'POST');
        if($result !== false) {
            $code = $this->decode($result);
            if(isset($code['success']) && $code['success'] === self::NO) {
                return false;
            }
            return $code;
        }
        return false;
    }

    /**
     * Delete a banned word.
     *
     * @param int $id banned-word id (required)
     * @return bool
     */
    public function delete($id) {
        if(!$id) return false;
        $result = $this->request("delete",['id' => $id],'DELETE');
        if($result !== false) {
            $code = $this->decode($result);
            if(isset($code['success']) && $code['success'] === self::NO) {
                return false;
            }
            return true;
        }
        return false;
    }

    /**
     * Add a banned word.
     *
     * @param array
     *      string $word     the banned word
     *      int    $category banned-word category
     *      int    $rate     blacklist (1) or graylist (2)
     *      int    $correct  malformed-word correction: 1 = on, 2 = off
     * @return bool
     */
    public function create(Array $parameter) {
        $result = $this->request("create",$parameter,'POST');
        if($result !== false) {
            $code = $this->decode($result);
            if(isset($code['success']) && $code['success'] === self::NO) {
                return false;
            }
            return true;
        }
        return false;
    }

    /**
     * Revise a banned word.
     *
     * @param array
     *      int    $id       banned-word id (required)
     *      string $word     the banned word
     *      int    $category banned-word category
     *      int    $rate     blacklist (1) or graylist (2)
     *      int    $correct  malformed-word correction: 1 = on, 2 = off
     * @return bool
     */
    public function revise(Array $parameter) {

        // The id is mandatory; an empty array fails this check too.
        if(!isset($parameter['id'])) {
            return false;
        }

        $result = $this->request("revise",$parameter,'PUT');
        if($result !== false) {
            $code = $this->decode($result);
            if(isset($code['success']) && $code['success'] === self::NO) {
                return false;
            }
            return true;
        }
        return false;
    }

    /**
     * Query banned words.
     *
     * @param array
     *      int    $id       banned-word id
     *      string $word     the banned word
     *      int    $category banned-word category
     *      int    $rate     blacklist (1) or graylist (2)
     *      int    $correct  malformed-word correction: 1 = on, 2 = off
     * @return bool or array
     */
    public function query(Array $parameter) {
        $result = $this->request("query",$parameter,'GET');
        if($result !== false) {
            $code = $this->decode($result);

            if(isset($code['success']) && $code['success'] === self::NO) {
                return false;
            }
            return $code;
        }
        return false;
    }

    /**
     * List banned-word categories.
     * ID:category name  1:个性化 2:低俗信息 3:灌水信息 5:政治敏感 6:违约广告 7:跨站追杀 8:色情信息 9:违法信息 10:垃圾广告
     * @return bool or array
     */
    public function category() {
        $result = $this->request("category");
        if($result !== false) {
            $code = $this->decode($result);

            if(isset($code['success']) && $code['success'] === self::NO) {
                return false;
            }

            return $code;
        }
        return false;
    }

    /**
     * Reload banned words.
     *
     * @return bool
     */
    public function reload() {
        $result = $this->request("reload");
        if($result !== false) {
            $code = $this->decode($result);

            if(isset($code['success']) && $code['success'] == self::NO) {
                return false;
            }
            return true;
        }
        return false;
    }
}

$rabbit = Rabbit::factory('127.0.0.1',9394);

// $rabbit->trace = true;

//$data = $rabbit->filter('万人敬仰+大兄弟 15009987776');

$data = $rabbit->create([
    'word' => '什么狗屎+猪头三:phone',
    'category' => 3,
    'rate' => 1,
    'correct' => 1
]);

//$data = $rabbit->delete(29);

// $data = $rabbit->reload();

var_dump($data);

--------------------------------------------------------------------------------
/php-sdk/rabbit.0.9.1.php:
--------------------------------------------------------------------------------
        $this->host($host,$port);
    }

    public function __clone(){
        trigger_error('Clone is not allowed!',E_USER_ERROR);
    }

    /**
     * Debug helper for API responses.
     *
     * @param string $msg raw response body
     * @return void exits with $msg when $this->trace is enabled
     */
    private function trace($msg) {
        if($this->trace){
            exit($msg);
        }
    }

    private function decode($data) {
        $this->trace($data);
        return json_decode($data,true);
    }

    private function host($host,$port) {
        $this->host = "http://" . $host . ":" .
            $port."/";
    }

    // Declared static because it is called as Rabbit::factory(...).
    public static function factory($host = 'localhost',$port = 9394) {
        if(is_null(self::$instance)) {
            self::$instance = new self($host,$port);
        }
        return self::$instance;
    }

    private function request($do,Array $parameter = [],$method = 'GET') {
        $query = '';

        $ch = curl_init();

        if($method != 'GET') {
            curl_setopt($ch, CURLOPT_POSTFIELDS, $parameter);
            curl_setopt($ch, CURLOPT_CUSTOMREQUEST, $method);

        }else{
            $query = "?" . http_build_query($parameter);
        }

        curl_setopt($ch, CURLOPT_URL, $this->host.$do.$query);
        curl_setopt($ch, CURLOPT_TIMEOUT,60);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER,true);

        $data = curl_exec($ch);
        if(curl_errno($ch)){
            curl_close($ch); // release the handle on failure as well
            return false;
        }
        curl_close($ch);
        return $data;
    }

    /**
     * Filter content.
     *
     * @param string $contents content to filter
     * @return bool or array
     */
    public function filter($contents) {
        $result = $this->request("filter",['contents' => $contents],'POST');
        if($result !== false) {
            $code = $this->decode($result);
            if(isset($code['success']) && $code['success'] === self::NO) {
                return false;
            }
            return $code;
        }
        return false;
    }

    /**
     * Filter pornographic images.
     *
     * @param string $file image to check: a local path or a URL
     * @return bool or int score; suggested thresholds: 65 - 85 is suggestive, above 85 is pornographic
     */
    public function porn($file) {
        $result = $this->request("porn",['file' => $file],'POST');
        if($result !== false) {
            $code = $this->decode($result);
            if(isset($code['success']) && $code['success'] === self::NO) {
                return false;
            }
            return $code['score'];
        }
        return false;
    }

    /**
     * Delete a banned word.
     *
     * @param int $id banned-word id (required)
     * @return bool
     */
    public function delete($id) {
        if(!$id) return false;
        $result = $this->request("delete",['id' => $id],'DELETE');
        if($result !== false) {
            $code = $this->decode($result);
            if(isset($code['success']) && $code['success'] === self::NO) {
                return false;
            }
            return true;
        }
        return false;
    }

    /**
     * Add a banned word.
     *
     * @param array
     *      string word     the banned word
     *      int    category banned-word category
     *      int    rate     blacklist (1) or graylist (2)
     *      int    correct  malformed-word correction: 1 = on, 2 = off
     * @return bool
     */
    public function create(Array $parameter) {
        $result = $this->request("create",$parameter,'POST');
        if($result !== false) {
            $code = $this->decode($result);
            if(isset($code['success']) && $code['success'] === self::NO) {
                return false;
            }
            return true;
        }
        return false;
    }

    /**
     * Revise a banned word.
     *
     * @param array
     *      int    id       banned-word id (required)
     *      string word     the banned word
     *      int    category banned-word category
     *      int    rate     blacklist (1) or graylist (2)
     *      int    correct  malformed-word correction: 1 = on, 2 = off
     * @return bool
     */
    public function revise(Array $parameter) {

        // The id is mandatory; an empty array fails this check too.
        if(!isset($parameter['id'])) {
            return false;
        }

        $result = $this->request("revise",$parameter,'PUT');
        if($result !== false) {
            $code = $this->decode($result);
            if(isset($code['success']) && $code['success'] === self::NO) {
                return false;
            }
            return true;
        }
        return false;
    }

    /**
     * Query banned words.
     *
     * @param array
     *      int    id       banned-word id
     *      string word     the banned word
     *      int    category banned-word category
     *      int    rate     blacklist (1) or graylist (2)
     *      int    correct  malformed-word correction: 1 = on, 2 = off
     * @return bool or array
     */
    public function query(Array $parameter) {
        $result = $this->request("query",$parameter,'GET');
        if($result !== false) {
            $code = $this->decode($result);

            if(isset($code['success']) && $code['success'] === self::NO) {
                return false;
            }
            return $code;
        }
        return false;
    }

    /**
     * List banned-word categories.
     * ID:category name  1:个性化 2:低俗信息 3:灌水信息 5:政治敏感 6:违约广告 7:跨站追杀 8:色情信息 9:违法信息 10:垃圾广告
     * @return bool or array
     */
    public function category() {
        $result = $this->request("category");
        if($result !== false) {
            $code = $this->decode($result);

            if(isset($code['success']) && $code['success'] === self::NO) {
                return false;
            }

            return $code;
        }
        return false;
    }

    /**
     * Reload banned words.
     *
     * @return bool
     */
    public function reload() {
        $result = $this->request("reload");
        if($result !== false) {
            $code = $this->decode($result);

            if(isset($code['success']) && $code['success'] == self::NO) {
                return false;
            }
            return true;
        }
        return false;
    }
}

$rabbit = Rabbit::factory('127.0.0.1',9394);

// $rabbit->trace = true;

//$data = $rabbit->filter('万人敬仰+大兄弟 15009987776');

// $data = $rabbit->create([
//     'word' => '什么狗屎+猪头三:phone',
//     'category' => 3,
//     'rate' => 1,
//     'correct' => 1
// ]);

//$data = $rabbit->delete(29);

// $data = $rabbit->reload();

$score = $rabbit->porn("/Users/yehuiming/Downloads/6609c93d70cf3bc7e43db93dd500baa1cd112a25.jpg");

var_dump($score);

--------------------------------------------------------------------------------
/python-sdk/rabbit.0.8.3.py:
--------------------------------------------------------------------------------
# -*- coding: utf-8 -*-
"""
https://github.com/nixuehan/rabbit
"""
__author__ = '逆雪寒'
__version__ = '0.8.3'

try:
    import requests
except ImportError:
    raise ImportError("Please install python requests. pip install requests")


class RabbitException(Exception):
    pass


class Rabbit(object):
    """doc: https://github.com/nixuehan/rabbit."""

    YES,NO = 1,0

    def __init__(self,host,port,timeout=10,trace=0):
        self.host,self.timeout,self.trace = "http://" + host + ":" + str(port) + "/",timeout,trace

    def __do(self,do):
        return self.host + do

    def __trace(self,msg):
        if self.trace == self.YES:
            print(msg)

    def filter(self,contents):
        """
        Filter content.
        """
        try:
            result = requests.post(self.__do('filter'),params={'contents':contents},timeout=self.timeout)
            data = result.json()
            if ("success" in data) and (data["success"] == self.NO):
                self.__trace(data)
                return False
            return data

        except Exception as msg:
            raise RabbitException(msg)


    def delete(self,id):
        """
        Delete a banned word.
        """
        try:
            result = requests.delete(self.__do('delete'),params={'id':id},timeout=self.timeout)
            data = result.json()
            if ("success" in data) and (data["success"] == self.NO):
                return False
            # data is already the parsed JSON; calling .json() on it again would fail
            return data
        except Exception as msg:
            raise RabbitException(msg)

    def create(self,info):
        """
        Create a banned word.

        @param dict
            string word      the banned word
            int category     banned-word category
            int rate         blacklist (1) or graylist (2)
            int correct      malformed-word correction: 1 = on, 2 = off
        """
        try:
            result = requests.post(self.__do('create'),params=info,timeout=self.timeout)
            data = result.json()
            if ("success" in data) and (data["success"] == self.NO):
                self.__trace(data)
                return False
            return data

        except Exception as msg:
            raise RabbitException(msg)

    def revise(self,info):
        """
        Revise a banned word.

        @param dict
            int id           banned-word id
            string word      the banned word
            int category     banned-word category
            int rate         blacklist (1) or graylist (2)
            int correct      malformed-word correction: 1 = on, 2 = off
        """
        try:
            result = requests.put(self.__do('revise'),params=info,timeout=self.timeout)
            data = result.json()
            if ("success" in data) and (data["success"] == self.NO):
                self.__trace(data)
                return False
            else:
                return True

        except Exception as msg:
            raise RabbitException(msg)

    def query(self,info):
        """
        Query banned words.

        @param dict
            int id           banned-word id
            string word      the banned word
            int category     banned-word category
            int rate         blacklist (1) or graylist (2)
            int correct      malformed-word correction: 1 = on, 2 = off
        """
        try:
            result = requests.get(self.__do('query'),params=info,timeout=self.timeout)
            data = result.json()
            if ("success" in data) and (data["success"] == self.NO):
                self.__trace(data)
                return False
            return data

        except Exception as msg:
            raise RabbitException(msg)

    def reload(self):
        """
        Reload banned words.
        """
        try:
            result = requests.get(self.__do('reload'),timeout=self.timeout)
            data = result.json()
            if ("success" in data) and (data["success"] == self.NO):
                return False
            return True
        except Exception as msg:
            raise RabbitException(msg)


rabbit = Rabbit(host='127.0.0.1',port=9394)

# print(rabbit.query({'category':2,'rate':2}))

result = rabbit.filter("我吃蒙汗药")
# The README documents the response key as "Hit" with the string values "0" and "1".
if result and str(result.get('Hit')) == '1':
    print('hit')
else:
    print('no')

--------------------------------------------------------------------------------
/python-sdk/rabbit.0.9.1.py:
--------------------------------------------------------------------------------
# -*- coding: utf-8 -*-
"""
https://github.com/nixuehan/rabbit
"""
__author__ = '逆雪寒'
__version__ = '0.9.1'

try:
    import requests
except ImportError:
    raise ImportError("Please install python requests. pip install requests")


class RabbitException(Exception):
    pass


class Rabbit(object):
    """doc: https://github.com/nixuehan/rabbit."""

    YES,NO = 1,0

    def __init__(self,host,port,timeout=2,trace=0):
        self.host,self.timeout,self.trace = "http://" + host + ":" + str(port) + "/",timeout,trace

    def __do(self,do):
        return self.host + do

    def __trace(self,msg):
        if self.trace == self.YES:
            print(msg)

    def filter(self,contents):
        """
        Filter content.
        """
        try:
            result = requests.post(self.__do('filter'),params={'contents':contents},timeout=self.timeout)
            data = result.json()
            if ("success" in data) and (data["success"] == self.NO):
                self.__trace(data)
                return False
            return data

        except Exception as msg:
            raise RabbitException(msg)


    def porn(self,filePath):
        """
        Filter pornographic images. Suggested score thresholds: 65 - 85 is suggestive, above 85 is pornographic.
        """
        try:
            result = requests.post(self.__do('porn'),params={'file':filePath},timeout=self.timeout)
            data = result.json()
            if ("success" in data) and (data["success"] == self.NO):
                self.__trace(data)
                return False
            return data['score']

        except Exception as msg:
            raise RabbitException(msg)


    def delete(self,id):
        """
        Delete a banned word.
        """
        try:
            result = requests.delete(self.__do('delete'),params={'id':id},timeout=self.timeout)
            data = result.json()
            if ("success" in data) and (data["success"] == self.NO):
                return False
            # data is already the parsed JSON; calling .json() on it again would fail
            return data
        except Exception as msg:
            raise RabbitException(msg)

    def create(self,info):
        """
        Create a banned word.

        @param dict
            string word      the banned word
            int category     banned-word category
            int rate         blacklist (1) or graylist (2)
            int correct      malformed-word correction: 1 = on, 2 = off
        """
        try:
            result = requests.post(self.__do('create'),params=info,timeout=self.timeout)
            data = result.json()
            if ("success" in data) and (data["success"] == self.NO):
                self.__trace(data)
                return False
            return data

        except Exception as msg:
            raise RabbitException(msg)

    def revise(self,info):
        """
        Revise a banned word.

        @param dict
            int id           banned-word id
            string word      the banned word
            int category     banned-word category
            int rate         blacklist (1) or graylist (2)
            int correct      malformed-word correction: 1 = on, 2 = off
        """
        try:
            result = requests.put(self.__do('revise'),params=info,timeout=self.timeout)
            data = result.json()
            if ("success" in data) and (data["success"] == self.NO):
                self.__trace(data)
                return False
            else:
                return True

        except Exception as msg:
            raise RabbitException(msg)

    def query(self,info):
        """
        Query banned words.

        @param dict
            int id           banned-word id
            string word      the banned word
            int category     banned-word category
            int rate         blacklist (1) or graylist (2)
            int correct      malformed-word correction: 1 = on, 2 = off
        """
        try:
            result = requests.get(self.__do('query'),params=info,timeout=self.timeout)
            data = result.json()
            if ("success" in data) and (data["success"] == self.NO):
                self.__trace(data)
                return False
            return data

        except Exception as msg:
            raise RabbitException(msg)

    def reload(self):
        """
        Reload banned words.
        """
        try:
            result = requests.get(self.__do('reload'),timeout=self.timeout)
            data = result.json()
            if ("success" in data) and (data["success"] == self.NO):
                return False
            return True
        except Exception as msg:
            raise RabbitException(msg)


rabbit = Rabbit(host='127.0.0.1',port=9394)

try:
    score = rabbit.porn("https://ss0.bdstatic.com/94oimg?image&quality=100&size=b4000_4000&sec=1465993408&di=db6c2337ac2ffe0cab3a6410e987098c&src=http://t1.mmonly.cc/uploads/allimg/20150727/h2pg440kruf.jpg")
except RabbitException:
    score = False

print(score)
# print(rabbit.query({'category':2,'rate':2}))

# result = rabbit.filter("我吃蒙汗药")
# The README documents the response key as "Hit" with the string values "0" and "1".
# if result and str(result.get('Hit')) == '1':
#     print('hit')
# else:
#     print('no')
--------------------------------------------------------------------------------
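The README describes two decisions that calling code has to make: what to do with a `/filter` hit depending on `Rate` (1 = blacklist, 2 = graylist), and how to read the `/porn` score (65 - 85 suggestive, above 85 pornographic). A minimal sketch of that logic follows. It assumes the response shapes shown in the README; the function names `moderation_decision` and `classify_image_score` and the returned labels are hypothetical and are not part of either SDK.

```python
def moderation_decision(response):
    """Map a parsed /filter response (a dict) to 'allow', 'review', or 'block'.

    The README example encodes values as strings ("Hit": "1"), so values are
    normalised with str() to accept both strings and integers.
    """
    if str(response.get("Hit", "0")) != "1":
        return "allow"          # nothing matched
    if str(response.get("Rate", "")) == "1":
        return "block"          # blacklist hit
    # Graylist hit (2). Sending unknown rates to review is an assumption.
    return "review"


def classify_image_score(score):
    """Apply the README's suggested thresholds to a /porn score."""
    if score > 85:
        return "porn"
    if score >= 65:
        return "sexy"
    return "ok"


if __name__ == "__main__":
    sample = {"Category": "9", "CategoryName": "违法信息", "Hit": "1",
              "Id": "38509", "Rate": "2", "Word": "蒙汗药"}
    print(moderation_decision(sample))
    print(classify_image_score(86))
```

The SDK methods return `False` when the service reports a failure, so a caller would check for that before passing the result to `moderation_decision`.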