├── README.md
├── localbottle.py
├── requirements.txt
├── smallgfw.py
└── words.txt
/README.md:
--------------------------------------------------------------------------------
1 | SensitivePy
2 | ===========
3 |
4 | A minimal sensitive-word filtering system written in Python
5 | -----------
6 |
7 | API LIST
8 | -----------
9 | 1. Detect sensitive words
10 | URL http://your_domain/check
11 | Parameter  Method  Optional  Max length
12 | words      POST    false     65535
13 |
14 | Response format: JSON
15 | {"count":1,"data":[[0,6,"\u6bcd\u5b5d"]]}
16 |
17 | 2. Filter sensitive words
18 | URL http://your_domain/replace
19 | Parameter  Method  Optional  Max length
20 | words      POST    false     65535
21 | Response format: text
22 | **this is the filtered text**, all good
23 |
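As a rough illustration of calling the two endpoints above, here is a minimal Python 3 client sketch. The base URL `http://localhost:8080` and the helper names `build_request`, `parse_check_response`, and `post_words` are assumptions made for this example and are not part of the project. The sample response above uses the key `data`, while `localbottle.py` writes `datas`, so the sketch accepts either.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: the service runs locally on port 8080; use your own domain instead.
BASE_URL = "http://localhost:8080"


def build_request(endpoint, text, base_url=BASE_URL):
    """Build a POST request that carries the text in the `words` form field."""
    body = urlencode({"words": text}).encode("utf-8")
    return Request(base_url + endpoint, data=body, method="POST")


def parse_check_response(raw):
    """Decode the JSON body of /check into (count, matches)."""
    payload = json.loads(raw)
    # The README shows "data"; localbottle.py emits "datas". Accept both keys.
    matches = payload.get("data", payload.get("datas", []))
    return payload["count"], matches


def post_words(endpoint, text):
    """Send the request and return the response body as text."""
    with urlopen(build_request(endpoint, text)) as resp:
        return resp.read().decode("utf-8")
```

With a server running, `parse_check_response(post_words("/check", text))` would give the match count and the list of matches, and `post_words("/replace", text)` would give the filtered text.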
24 | words.txt is the sensitive-word file (one word per line)
25 |
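26 | ### Installation
27 | First install the bottle framework with pip or easy_install.
28 | Then edit the port and domain settings in localbottle.py and start it with python.
29 | *For cloud environments, adjust the configuration and keep wsgi.py; see your cloud provider's documentation for details.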
30 |
31 | ### Changelog
32 | 2014/10/7
33 | 1. Completed the core detection and filtering APIs
34 | 2. Integrated the bottle framework
35 | 3. Detection uses a DFA filtering algorithm
36 |
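DFA-style word filtering is commonly built as a character trie that is walked once per starting position in the text. The sketch below illustrates that general idea only; it is not the code in `smallgfw.py`, and the class name `WordFilter`, its method names, the longest-match rule, and the end-exclusive character offsets are all assumptions made for this example.

```python
_END = object()  # sentinel key marking the end of a complete word in the trie


class WordFilter:
    """Hypothetical trie-based filter; not the project's GFW class."""

    def __init__(self, words=()):
        self.root = {}
        for word in words:
            self.add(word)

    def add(self, word):
        """Insert one word into the trie, ignoring empty strings."""
        if not word:
            return
        node = self.root
        for ch in word:
            node = node.setdefault(ch, {})
        node[_END] = True

    def check(self, text):
        """Return [start, end, word] for the longest match at each position."""
        hits = []
        i = 0
        while i < len(text):
            node, j, end = self.root, i, -1
            # Follow trie edges as long as the next character matches.
            while j < len(text) and text[j] in node:
                node = node[text[j]]
                j += 1
                if _END in node:
                    end = j  # remember the longest complete word so far
            if end != -1:
                hits.append([i, end, text[i:end]])
                i = end  # resume scanning after the matched word
            else:
                i += 1
        return hits

    def replace(self, text, mask="**"):
        """Replace every matched word with the mask string."""
        out, pos = [], 0
        for start, end, _ in self.check(text):
            out.append(text[pos:start])
            out.append(mask)
            pos = end
        out.append(text[pos:])
        return "".join(out)
```

For example, `WordFilter(["bad", "badge", "evil"]).replace("a badge is evil")` yields `"a ** is **"`, because the longer word `badge` wins over its prefix `bad`.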
37 |
38 | caroltc(312493732@qq.com)
39 |
--------------------------------------------------------------------------------
/localbottle.py:
--------------------------------------------------------------------------------
1 | #-*- coding:utf-8 -*-
2 | #localhost testing
3 | #caroltc 2014/10/7
4 | from bottle import route, run, request
5 | from smallgfw import *
6 | import json
7 | import sys
8 |
9 | def initWords():
10 |     path = 'words.txt'
11 |     fp = open(path, 'r')
12 |     word_list = []
13 |     for line in fp:
14 |         line = line.rstrip('\r\n')  # drop the line ending without truncating an unterminated last line
15 |         word_list.append(line)
16 |     fp.close()
17 |     return word_list
18 |
19 | @route('/replace', method="POST")
20 | def replace():
21 |     reload(sys)
22 |     sys.setdefaultencoding('utf8')
23 |     getwords = request.params.words or ""
24 |     gfw = GFW()
25 |     words = initWords()
26 |     gfw.set(words)  # set the sensitive-word list
27 |     res = gfw.check(getwords.encode('utf8'))
28 |     # for obj in res:
29 |     #     print json.dumps(obj),obj[2]
30 |     s = gfw.replace(getwords.encode('utf8'), "**")
31 |     return s
32 |
33 | @route('/check', method="POST")
34 | def check():
35 |     reload(sys)
36 |     sys.setdefaultencoding('utf8')
37 |     getwords = request.params.words or ""
38 |     gfw = GFW()
39 |     words = initWords()
40 |     gfw.set(words)  # set the sensitive-word list
41 |     res = gfw.check(getwords.encode('utf8'))
42 |     resp = {}
43 |     resp['count'] = len(res)
44 |     resp['datas'] = res
45 |     return json.dumps(resp)
46 |
47 | @route('/test')
48 | def test():
49 |     reload(sys)
50 |     sys.setdefaultencoding('utf8')
51 |     webdata = '