├── .gitmodules
├── README.md
└── docker
    ├── Dockerfile
    ├── build.sh
    ├── demo_rus.sh
    └── server

/.gitmodules:
--------------------------------------------------------------------------------
[submodule "api"]
	path = api
	url = https://github.com/IINemo/syntaxnet_api_server.git
[submodule "syntaxnet_api_server"]
	path = syntaxnet_api_server
	url = https://github.com/IINemo/syntaxnet_api_server.git

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
Syntaxnet for Russian
=========

[Google's SyntaxNet](https://github.com/tensorflow/models/tree/master/syntaxnet) parser and POS tagger with a model for the Russian language.


## Usage
-----

### 1. Single parse using shell:
```shell
echo "мама мыла раму" | docker run --rm -i inemo/syntaxnet_rus
...
Input: Name this boat
Parse (CONLL format):
1 мама _ NOUN _ Animacy=Anim|Case=Nom|Gender=Fem|Number=Sing|fPOS=NOUN++ 2 nsubj _ _
2 мыла _ VERB _ Aspect=Imp|Gender=Fem|Mood=Ind|Number=Sing|Tense=Past|VerbForm=Fin|Voice=Act|fPOS=VERB++ 0 ROOT _ _
3 раму _ NOUN _ Animacy=Inan|Case=Acc|Gender=Fem|Number=Sing|fPOS=NOUN++ 2 dobj _ _

```
### 2. Standalone SyntaxNet server that keeps the models loaded between requests (stays alive; unstable):

```shell
docker run --shm-size=1024m -ti --rm -p 8111:9999 inemo/syntaxnet_rus server 0.0.0.0 9999
```
Note that although this container installs the model for Russian, the server itself can be used with any language (any model trained in SyntaxNet).

2.1 You can also use the server together with the SyntaxNet [python wrapper](https://github.com/IINemo/syntaxnet_wrapper).
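
The parser returns CoNLL rows like those shown in section 1, both from the single-shot container and from the server. As a minimal sketch, the rows can be reduced to token id, form, POS tag, head index, and relation with `awk`. This assumes ten whitespace-separated columns in the order shown in the sample output; `conll_brief` is a hypothetical helper name and is not part of this repository.

```shell
#!/bin/bash
# conll_brief: hypothetical helper, not shipped with this repository.
# Reads parser output on stdin and prints: id, form, POS, head, relation.
# Assumption: each token row has 10 whitespace-separated CoNLL columns,
# as in the sample output above. Other lines (logs, "Input:", blanks)
# are skipped because they do not start with a numeric id.
conll_brief() {
    awk 'NF == 10 && $1 ~ /^[0-9]+$/ { print $1, $2, $4, $7, $8 }'
}

# Usage (requires Docker and the image from this README):
#   echo "мама мыла раму" | docker run --rm -i inemo/syntaxnet_rus | conll_brief
```

Filtering on the field count and a numeric first column is a deliberately loose heuristic; a stricter consumer would validate every column.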

2.2 You can use telnet to talk to the parser (be aware of Unicode escaping problems in telnet; e.g., 'маму' will not work by default):
```shell
telnet localhost 8111
```
```shell
мама мыла
```
```shell
1 мама _ NOUN _ Animacy=Anim|Case=Nom|Gender=Fem|Number=Sing|fPOS=NOUN++ 2 nsubj _ _
2 мыла _ VERB _ Aspect=Imp|Gender=Fem|Mood=Ind|Number=Sing|Tense=Past|VerbForm=Fin|Voice=Act|fPOS=VERB++ 0 ROOT _ _

```


## Updating
--------

```
cd docker/
./build.sh
#docker login
#docker build -t inemo/syntaxnet_rus --no-cache . && docker push inemo/syntaxnet_rus

```

--------------------------------------------------------------------------------
/docker/Dockerfile:
--------------------------------------------------------------------------------
FROM ubuntu:15.10


RUN apt-get update && apt-get install -y \
        build-essential \
        curl \
        g++ \
        git \
        libfreetype6-dev \
        libpng12-dev \
        libzmq3-dev \
        libcurl3-dev \
        openjdk-8-jdk \
        pkg-config \
        python-dev \
        python-numpy \
        python-pip \
        software-properties-common \
        swig \
        unzip \
        zip \
        zlib1g-dev \
        && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

RUN update-ca-certificates -f

# Set up Bazel.

# Running bazel inside a `docker build` command causes trouble, cf:
#   https://github.com/bazelbuild/bazel/issues/134
# The easiest solution is to set up a bazelrc file forcing --batch.
RUN echo "startup --batch" >>/root/.bazelrc

# Similarly, we need to work around sandboxing issues:
#   https://github.com/bazelbuild/bazel/issues/418
RUN echo "build --spawn_strategy=standalone --genrule_strategy=standalone" \
    >>/root/.bazelrc
ENV BAZELRC /root/.bazelrc

# Install Bazel (the release version is pinned below).
ENV BAZEL_VERSION 0.4.5
WORKDIR /
RUN mkdir /bazel && \
    cd /bazel && \
    curl -fSsL -O https://github.com/bazelbuild/bazel/releases/download/$BAZEL_VERSION/bazel-$BAZEL_VERSION-installer-linux-x86_64.sh && \
    chmod +x bazel-*.sh && \
    ./bazel-$BAZEL_VERSION-installer-linux-x86_64.sh && \
    cd / && \
    rm -f /bazel/bazel-$BAZEL_VERSION-installer-linux-x86_64.sh


# Syntaxnet dependencies

RUN pip install -U protobuf==3.0.0b2
RUN pip install asciitree mock


# Download and build Syntaxnet

RUN git clone --recursive https://github.com/tensorflow/models.git /root/models
RUN cd /root/models/syntaxnet/tensorflow && tensorflow/tools/ci_build/builds/configured CPU
RUN cd /root/models/syntaxnet && bazel build -c opt @org_tensorflow//tensorflow:tensorflow_py
RUN cd /root/models/syntaxnet && bazel build syntaxnet/...

######################################

# Setting up locales for Russian

RUN \
    echo ru_RU.UTF-8 UTF-8 > /etc/locale.gen && \
    locale-gen "ru_RU.UTF-8" && \
    echo 'LANG="ru_RU.UTF-8"' > /etc/default/locale && \
    dpkg-reconfigure --frontend=noninteractive locales && \
    update-locale LC_ALL=ru_RU.UTF-8 LANG=ru_RU.UTF-8

ENV LANG ru_RU.UTF-8


# Downloading and unpacking model for Russian

ADD http://download.tensorflow.org/models/parsey_universal/Russian-SynTagRus.zip /root/models/syntaxnet/syntaxnet/models/
RUN unzip /root/models/syntaxnet/syntaxnet/models/Russian-SynTagRus.zip -d /root/models/syntaxnet/syntaxnet/models/


# Misc

COPY demo_rus.sh /root/models/syntaxnet/syntaxnet/
COPY server /usr/bin/

# Standalone server

COPY api /root/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/__main__/syntaxnet/api/

######################################

WORKDIR /root/models/syntaxnet/

CMD
/root/models/syntaxnet/syntaxnet/demo_rus.sh


--------------------------------------------------------------------------------
/docker/build.sh:
--------------------------------------------------------------------------------
#!/bin/bash
# Sync the API server sources into the Docker build context, then build the image.
SCRIPT_DIR=$(dirname "$0")
rsync -avz --links --delete --exclude ".svn" "$SCRIPT_DIR/../syntaxnet_api_server/src/syntaxnet_api_server/" "$SCRIPT_DIR/api/" || exit 1

docker build -t inemo/syntaxnet_rus:latest "$SCRIPT_DIR"
RES=$?

exit $RES


--------------------------------------------------------------------------------
/docker/demo_rus.sh:
--------------------------------------------------------------------------------
#!/bin/bash

./syntaxnet/models/parsey_universal/parse.sh ./syntaxnet/models/Russian-SynTagRus


--------------------------------------------------------------------------------
/docker/server:
--------------------------------------------------------------------------------
#!/bin/bash

HOST="$1"
PORT="$2"
python /root/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/__main__/syntaxnet/api/syntaxnet_rus_api.py --host="$HOST" --port="$PORT"

--------------------------------------------------------------------------------