├── .gitignore ├── .gitmodules ├── .python-version ├── .travis.yml ├── LICENSE ├── MANIFEST.in ├── README.md ├── Vagrantfile ├── bench.py ├── makefile ├── provision.sh ├── requirements.txt ├── setup.cfg ├── setup.py ├── simhash ├── __init__.py ├── simhash.cpp ├── simhash.pxd └── simhash.pyx └── test └── test.py /.gitignore: -------------------------------------------------------------------------------- 1 | *.dSYM 2 | *.o 3 | *.pyc 4 | *.so 5 | *~ 6 | .vagrant/ 7 | MANIFEST 8 | build/ 9 | dist/ 10 | driver 11 | venv/ 12 | -------------------------------------------------------------------------------- /.gitmodules: -------------------------------------------------------------------------------- 1 | [submodule "simhash/cpp/deps/Catch"] 2 | path = simhash/cpp/deps/Catch 3 | url = https://github.com/philsquared/Catch.git 4 | [submodule "simhash/simhash-cpp"] 5 | path = simhash/simhash-cpp 6 | url = https://github.com/seomoz/simhash-cpp.git 7 | -------------------------------------------------------------------------------- /.python-version: -------------------------------------------------------------------------------- 1 | 2.7.11 2 | -------------------------------------------------------------------------------- /.travis.yml: -------------------------------------------------------------------------------- 1 | dist: trusty 2 | sudo: required # this is necessary to get gcc-4.8 for some reason (otherwise gcc-4.6 is used) 3 | language: python 4 | python: 5 | - 2.7 6 | - 3.2 7 | - 3.3 8 | - 3.4 9 | - 3.5 10 | install: pip install -r requirements.txt 11 | script: make test 12 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Copyright (c) 2013-2014 SEOmoz, Inc. 
2 | 3 | Permission is hereby granted, free of charge, to any person obtaining 4 | a copy of this software and associated documentation files (the 5 | "Software"), to deal in the Software without restriction, including 6 | without limitation the rights to use, copy, modify, merge, publish, 7 | distribute, sublicense, and/or sell copies of the Software, and to 8 | permit persons to whom the Software is furnished to do so, subject to 9 | the following conditions: 10 | 11 | The above copyright notice and this permission notice shall be 12 | included in all copies or substantial portions of the Software. 13 | 14 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 15 | EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 16 | MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND 17 | NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE 18 | LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION 19 | OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION 20 | WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
21 | -------------------------------------------------------------------------------- /MANIFEST.in: -------------------------------------------------------------------------------- 1 | include simhash/simhash.cpp 2 | include simhash/simhash-cpp/src/* 3 | include simhash/simhash-cpp/include/* 4 | include test/* 5 | include makefile 6 | include LICENSE 7 | include README.md 8 | include requirements.txt 9 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | Simhash Near-Duplicate Detection 2 | ================================ 3 | [![Build Status](https://travis-ci.org/seomoz/simhash-py.svg?branch=master)](https://travis-ci.org/seomoz/simhash-py) 4 | 5 | ![Status: Production](https://img.shields.io/badge/status-production-green.svg?style=flat) 6 | ![Team: Big Data](https://img.shields.io/badge/team-big_data-green.svg?style=flat) 7 | ![Scope: External](https://img.shields.io/badge/scope-external-green.svg?style=flat) 8 | ![Open Source: MIT](https://img.shields.io/badge/open_source-MIT-green.svg?style=flat) 9 | ![Critical: Yes](https://img.shields.io/badge/critical-yes-red.svg?style=flat) 10 | 11 | This library enables the efficient identification of near-duplicate documents with 12 | `simhash`, implemented as a C++ extension. 13 | 14 | Usage 15 | ===== 16 | `simhash` differs from most hashes in that its goal is to have two similar documents 17 | produce similar hashes, whereas most hashes aim to produce very different 18 | hashes even in the face of small changes to the input. 19 | 20 | The input to `simhash` is a list of hashes representative of a document. The output is an 21 | unsigned 64-bit integer. The input list of hashes can be produced in several ways, but 22 | one common mechanism is to: 23 | 24 | 1. tokenize the document 25 | 1. consider overlapping shingles of these tokens (`simhash.shingle`) 26 | 1. `hash` these overlapping shingles 27 | 1. 
input these hashes into `simhash.compute` 28 | 29 | This has the effect of considering phrases in a document, rather than just a bag of the 30 | words in it. 31 | 32 | Once we've produced a `simhash`, we would like to compare it to other documents. For two 33 | documents to be considered near-duplicates, they must have few bits that differ. We can 34 | compare two documents: 35 | 36 | ```python 37 | import simhash 38 | 39 | a = simhash.compute(...) 40 | b = simhash.compute(...) 41 | simhash.num_differing_bits(a, b) 42 | ``` 43 | 44 | One of the key advantages of `simhash` is that it does not require `O(n^2)` time to find 45 | all near-duplicate pairs from a set of hashes. Given a whole set of `simhashes`, we can 46 | find all pairs efficiently: 47 | 48 | ```python 49 | import simhash 50 | 51 | # The `simhash`-es from our documents 52 | hashes = [] 53 | 54 | # Number of blocks to use (more in the next section) 55 | blocks = 4 56 | # Number of bits that may differ in matching pairs 57 | distance = 3 58 | matches = simhash.find_all(hashes, blocks, distance) 59 | ``` 60 | 61 | All the matches returned are guaranteed to be _all_ pairs where the hashes differ by 62 | `distance` bits or fewer. The `blocks` parameter is less intuitive, but is best described 63 | in [this article](https://moz.com/devblog/near-duplicate-detection/) or in 64 | [the paper](http://www2007.cpsc.ucalgary.ca/papers/paper215.pdf). The best parameter to 65 | choose depends on the distribution of the input simhashes, but it must always be at least 66 | one greater than the provided `distance`. 67 | 68 | Internally, `find_all` takes `blocks C distance` passes to complete. The idea is that as 69 | that value increases (for instance by increasing `blocks`), each pass completes faster. 70 | In terms of memory, `find_all` takes `O(hashes + matches)` memory. 
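
The pipeline described in this section can be sketched in pure Python. This is a minimal illustration, not the C++ implementation behind `simhash.compute`: the helper names (`shingles`, `hash64`, `simhash64`, `differing_bits`, `fingerprint`), the whitespace tokenizer and the MD5-based 64-bit hash are all assumptions chosen to keep the example self-contained.

```python
import hashlib
import struct


def shingles(tokens, window=4):
    """Yield every run of `window` consecutive tokens (hypothetical helper)."""
    tokens = list(tokens)
    for i in range(len(tokens) - window + 1):
        yield tokens[i:i + window]


def hash64(text):
    """Assumed hash: the first 8 bytes of MD5 as an unsigned 64-bit integer."""
    digest = hashlib.md5(text.encode("utf-8")).digest()
    return struct.unpack(">Q", digest[:8])[0]


def simhash64(hashes):
    """Textbook simhash: every input hash votes +1/-1 on each of the 64 bits."""
    votes = [0] * 64
    for h in hashes:
        for bit in range(64):
            votes[bit] += 1 if (h >> bit) & 1 else -1
    result = 0
    for bit, vote in enumerate(votes):
        if vote > 0:
            result |= 1 << bit
    return result


def differing_bits(a, b):
    """Hamming distance between two 64-bit fingerprints."""
    return bin(a ^ b).count("1")


def fingerprint(document, window=4):
    """Tokenize on whitespace, shingle, hash each shingle, then combine."""
    tokens = document.lower().split()
    return simhash64(hash64(" ".join(s)) for s in shingles(tokens, window))


a = fingerprint("the quick brown fox jumps over the lazy dog near the river bank")
b = fingerprint("the quick brown fox jumps over the lazy dog near the river")
print(differing_bits(a, b))
```

Because each bit is decided by a majority vote over all shingle hashes, changing a few shingles tends to flip only a few bits, which is what makes a small `distance` threshold meaningful.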
71 | 72 | Building 73 | ======== 74 | This is installable via `pip`: 75 | 76 | ```bash 77 | pip install git+https://github.com/seomoz/simhash-py.git 78 | ``` 79 | 80 | It can also be built from `git`: 81 | 82 | ```bash 83 | git submodule update --init --recursive 84 | python setup.py install 85 | ``` 86 | 87 | or 88 | ```bash 89 | pip install simhash-py 90 | ``` 91 | Under OS X, you should run 92 | ```bash 93 | export MACOSX_DEPLOYMENT_TARGET=10.x # replace 10.x with your version, e.g. 10.9 or 10.10 94 | ``` 95 | first. 96 | 97 | Benchmark 98 | ========= 99 | This is a rough benchmark, but should help to give you an idea of the order of 100 | magnitude for the performance available. Running on a single core on a `vagrant` instance 101 | on a 2015 MacBook Pro: 102 | 103 | ```bash 104 | $ ./bench.py --random 1000000 --blocks 5 --bits 3 105 | Generating 1000000 hashes 106 | Starting Find all 107 | Ran Find all in 1.595416s 108 | ``` 109 | 110 | Architecture 111 | ============ 112 | Each document gets associated with a 64-bit hash calculated using a rolling 113 | hash function and simhash. This hash can be thought of as a fingerprint for 114 | the content. Two documents are considered near-duplicates if their hashes differ 115 | by at most _k_ bits, a parameter chosen by the user. 116 | 117 | In this context, there is a large corpus of known fingerprints, and we would 118 | like to determine all the fingerprints that differ from our query by _k_ or fewer 119 | bits. To accomplish this, we divide up the 64 bits into _m_ blocks, where 120 | _m_ is greater than _k_. If hashes A and B differ by at most _k_ bits, then at 121 | least _m - k_ blocks are the same. 122 | 123 | Choosing all the unique combinations of _m - k_ blocks, we perform a permutation 124 | on each of the hashes for the documents so that those blocks are first in the 125 | hash. 
Perhaps a picture would illustrate it better: 126 | 127 | 63------53|52------42|41-----32|31------21|20------10|09------0| 128 | | A | B | C | D | E | F | 129 | 130 | If m = 6, k = 3, we'll choose permutations: 131 | - A B C D E F 132 | - A B D C E F 133 | - A B E C D F 134 | ... 135 | - C D F A B E 136 | - C E F A B D 137 | - D E F A B C 138 | 139 | This generates a number of tables that can be put into sorted order. For a query, a 140 | small range of candidates can then be found in each of those tables, and 141 | each candidate in that range can be compared to our query. 142 | 143 | The corpus is represented by the union of these tables, which could conceivably be 144 | hosted on a separate machine. Each of these tables is also amenable to 145 | sharding, where each shard would comprise a contiguous range of numbers. For 146 | example, you might divide a table into 256 shards, where each shard is 147 | associated with one of the possible first bytes. 148 | 149 | The best partitioning remains to be determined, likely through experimentation, but the 150 | basis of this is the `table`. The `table` tracks hashes inserted into it subject 151 | to a permutation associated with the table. This permutation is described as a 152 | vector of bitmasks of contiguous bit ranges, whose populations sum to 64. 
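
The block orderings listed above can be enumerated with a short sketch. The function name `block_orders` is hypothetical and this is not the library's own permutation code; it only illustrates choosing every combination of _m - k_ leading blocks and moving them to the front.

```python
from itertools import combinations


def block_orders(blocks, k):
    """Yield one block ordering per table: each combination of m - k blocks
    is moved to the front, followed by the remaining blocks in order."""
    blocks = list(blocks)
    m = len(blocks)
    if m <= k:
        raise ValueError("the number of blocks must exceed k")
    for leading in combinations(blocks, m - k):
        rest = [b for b in blocks if b not in leading]
        yield list(leading) + rest


orders = list(block_orders("ABCDEF", 3))
print(len(orders))            # 20 tables, i.e. "6 choose 3"
print(" ".join(orders[0]))    # A B C D E F
print(" ".join(orders[-1]))   # D E F A B C
```

The number of tables grows as "m choose k", which matches the number of passes described for `find_all`, so raising the number of blocks trades more tables for a narrower candidate range in each.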
153 | 154 | Example 155 | ======= 156 | 157 | Let's suppose that our corpus has a fingerprint: 158 | 159 | 0100101110111011001000101111101110111100001010011101100110110101 160 | 161 | and we have a query: 162 | 163 | 0100101110111011011000101111101110011100001010011100100110110101 164 | 165 | and they differ by only three bits which happen to fall in blocks B, D and E: 166 | 167 | 63------53|52------42|41-----32|31------21|20------10|09------0| 168 | | A | B | C | D | E | F | 169 | | | | | | | | 170 | 0000000000000000010000000000000000100000000000000001000000000000 171 | 172 | Since any fingerprint matching the query differs by at most 3 bits, at most 3 173 | blocks can differ, and at least 3 must match. Whatever table has the 3 blocks 174 | that do not differ as the leading blocks will match the query when doing a scan. 175 | In this case, the table that's permuted `A C F B D E` will match. It's important 176 | to note that it's possible for a query to match from more than one table. This 177 | happens, for example, if two of the non-matching bits are in the same block, or if the query 178 | differs by fewer than 3 bits. 179 | 180 | 32-Bit Systems 181 | ============== 182 | The only requirement of `simhash-py` is that the platform provides `uint64_t`. 
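
Returning to the two fingerprints in the Example section, the differing bits and the blocks they fall in can be checked with a few lines of Python. The block boundaries are copied from the diagram; the `BLOCKS` table and the `blocks_touched` helper are hypothetical names introduced only for this sketch.

```python
corpus = int("0100101110111011001000101111101110111100001010011101100110110101", 2)
query = int("0100101110111011011000101111101110011100001010011100100110110101", 2)

# Bits set in the XOR are exactly the bits where the two fingerprints differ.
diff = corpus ^ query

# (name, highest bit, lowest bit), taken from the diagram in the README.
BLOCKS = [("A", 63, 53), ("B", 52, 42), ("C", 41, 32),
          ("D", 31, 21), ("E", 20, 10), ("F", 9, 0)]


def blocks_touched(diff):
    """Return the names of the blocks containing at least one differing bit."""
    touched = []
    for name, high, low in BLOCKS:
        width = high - low + 1
        mask = ((1 << width) - 1) << low
        if diff & mask:
            touched.append(name)
    return touched


print(bin(diff).count("1"))   # 3
print(blocks_touched(diff))   # ['B', 'D', 'E']
```

The untouched blocks are A, C and F, so the table whose ordering starts with those three blocks is the one in which the query and the corpus fingerprint share a prefix.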
183 | -------------------------------------------------------------------------------- /Vagrantfile: -------------------------------------------------------------------------------- 1 | # Encoding: utf-8 2 | # -*- mode: ruby -*- 3 | # vi: set ft=ruby : 4 | 5 | ENV['VAGRANT_DEFAULT_PROVIDER'] = 'virtualbox' 6 | 7 | # http://docs.vagrantup.com/v2/ 8 | Vagrant.configure('2') do |config| 9 | config.vm.box = 'ubuntu/trusty64' 10 | config.vm.hostname = 'simhash-py' 11 | config.ssh.forward_agent = true 12 | 13 | config.vm.provider :virtualbox do |vb| 14 | vb.customize ["modifyvm", :id, "--memory", "1024"] 15 | vb.customize ["modifyvm", :id, "--cpus", "2"] 16 | end 17 | 18 | config.vm.provision :shell, path: 'provision.sh', privileged: false 19 | end 20 | -------------------------------------------------------------------------------- /bench.py: -------------------------------------------------------------------------------- 1 | #! /usr/bin/env python 2 | from __future__ import print_function 3 | 4 | import time 5 | import random 6 | import simhash 7 | import argparse 8 | 9 | # Generate some random hashes with known 10 | parser = argparse.ArgumentParser(description='Run a quick bench') 11 | parser.add_argument('--random', dest='random', type=int, default=None, 12 | help='Generate N random hashes for querying') 13 | parser.add_argument('--blocks', dest='blocks', type=int, default=6, 14 | help='Number of blocks to divide 64-bit hashes into') 15 | parser.add_argument('--bits', dest='bits', type=int, default=3, 16 | help='How many bits may differ') 17 | parser.add_argument('--hashes', dest='hashes', type=str, default=None, 18 | help='Path to file with hashes to insert') 19 | 20 | args = parser.parse_args() 21 | 22 | hashes = [] 23 | 24 | if args.hashes: 25 | with open(args.hashes) as f: 26 | hashes = [int(l) for l in f.read().split('\n') if l] 27 | 28 | if args.random: 29 | if args.hashes: 30 | print('Random supplied with --hashes') 31 | exit(1) 32 | 33 | if not hashes: 34 | 
print('Generating %i hashes' % args.random) 35 | hashes = [random.randint(0, (1 << 64) - 1) for i in range(args.random)] 36 | elif not args.hashes: 37 | print('No hashes or queries supplied') 38 | exit(2) 39 | 40 | class Timer(object): 41 | def __init__(self, name): 42 | self.name = name 43 | 44 | def __enter__(self): 45 | self.start = -time.time() 46 | print('Starting %s' % self.name) 47 | return self 48 | 49 | def __exit__(self, t, v, tb): 50 | self.start += time.time() 51 | if t: 52 | print(' Failed %s in %fs' % (self.name, self.start)) 53 | else: 54 | print(' Ran %s in %fs' % (self.name, self.start)) 55 | 56 | with Timer('Find all'): 57 | len(simhash.find_all(hashes, args.blocks, args.bits)) 58 | -------------------------------------------------------------------------------- /makefile: -------------------------------------------------------------------------------- 1 | CPP_DEPS = \ 2 | simhash/simhash.pyx \ 3 | simhash/simhash.pxd \ 4 | simhash/simhash-cpp/include/permutation.h \ 5 | simhash/simhash-cpp/src/permutation.cpp \ 6 | simhash/simhash-cpp/include/simhash.h \ 7 | simhash/simhash-cpp/src/simhash.cpp 8 | 9 | .PHONY: test 10 | test: simhash/simhash.so 11 | nosetests --verbose --nocapture 12 | 13 | simhash/simhash.so: $(CPP_DEPS) 14 | python setup.py build_ext --inplace 15 | 16 | clean: 17 | rm -rf simhash.egg-info build dist simhash/simhash.cpp 18 | find . -name '*.pyc' | xargs --no-run-if-empty rm -f 19 | find simhash -name '*.so' | xargs --no-run-if-empty rm -f 20 | 21 | install: 22 | python setup.py install 23 | -------------------------------------------------------------------------------- /provision.sh: -------------------------------------------------------------------------------- 1 | #! 
/usr/bin/env bash 2 | 3 | set -e 4 | 5 | sudo apt-get update 6 | sudo apt-get install -y tar curl git 7 | 8 | # Libraries required to build a complete python with pyenv: 9 | # https://github.com/yyuu/pyenv/wiki 10 | sudo apt-get install -y make build-essential libssl-dev zlib1g-dev libbz2-dev \ 11 | libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev 12 | 13 | # Install pyenv 14 | git clone https://github.com/yyuu/pyenv.git ~/.pyenv 15 | echo ' 16 | # Pyenv 17 | export PYENV_ROOT="$HOME/.pyenv" 18 | export PATH="$PYENV_ROOT/bin:$PATH" 19 | eval "$(pyenv init -)" 20 | ' >> ~/.bash_profile 21 | source ~/.bash_profile 22 | hash 23 | 24 | pushd /vagrant 25 | 26 | # Submodules 27 | git submodule update --init --recursive 28 | 29 | # Install our python version 30 | pyenv install 31 | pyenv rehash 32 | 33 | # Install a virtualenv 34 | pip install virtualenv 35 | if [ ! -d venv ]; then 36 | virtualenv venv 37 | fi 38 | source venv/bin/activate 39 | 40 | # Lastly, our dependencies 41 | pip install -r requirements.txt 42 | 43 | popd 44 | echo 'cd /vagrant/' >> ~/.bash_profile 45 | echo ' 46 | . 
/vagrant/venv/bin/activate 47 | ' >> ~/.bash_profile 48 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | colorama==0.3.7 2 | coverage==4.1 3 | Cython==0.29.14 4 | nose==1.3.7 5 | nose-timer==0.6.0 6 | python-termstyle==0.1.10 7 | rednose==1.2.1 8 | termcolor==1.1.0 9 | six==1.10.0 10 | -------------------------------------------------------------------------------- /setup.cfg: -------------------------------------------------------------------------------- 1 | [metadata] 2 | description-file = README.md 3 | 4 | [nosetests] 5 | verbosity=2 6 | rednose=1 7 | with-timer=1 8 | exe=0 9 | cover-package=url 10 | cover-branches=1 11 | cover-min-percentage=100 12 | cover-inclusive=1 13 | cover-erase=1 14 | logging-clear-handlers=1 15 | -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- 1 | #! /usr/bin/env python 2 | from __future__ import print_function 3 | 4 | from distutils.core import setup 5 | from distutils.extension import Extension 6 | 7 | # Complain on 32-bit systems. See README for more details 8 | import struct 9 | 10 | if struct.calcsize("P") < 8: 11 | raise RuntimeError("Simhash-py does not work on 32-bit systems. 
See README.md") 12 | 13 | ext_files = ["simhash/simhash-cpp/src/permutation.cpp", "simhash/simhash-cpp/src/simhash.cpp"] 14 | 15 | kwargs = {} 16 | 17 | try: 18 | from Cython.Distutils import build_ext 19 | 20 | print("Building from Cython") 21 | ext_files.append("simhash/simhash.pyx") 22 | kwargs["cmdclass"] = {"build_ext": build_ext} 23 | except ImportError: 24 | print("Building from C++") 25 | ext_files.append("simhash/simhash.cpp") 26 | 27 | ext_modules = [ 28 | Extension( 29 | "simhash.simhash", 30 | ext_files, 31 | language="c++", 32 | extra_compile_args=["-std=c++11"], 33 | include_dirs=["simhash/simhash-cpp/include"], 34 | ) 35 | ] 36 | 37 | setup( 38 | name="simhash-py", 39 | version="0.4.2", 40 | description="Near-Duplicate Detection with Simhash", 41 | url="http://github.com/seomoz/simhash-py", 42 | author="Moz, Inc.", 43 | author_email="turbo@moz.com", 44 | classifiers=[ 45 | "Programming Language :: Python", 46 | "Intended Audience :: Developers", 47 | "Operating System :: OS Independent", 48 | "Topic :: Internet :: WWW/HTTP", 49 | ], 50 | ext_modules=ext_modules, 51 | packages=["simhash"], 52 | package_dir={"simhash": "simhash"}, 53 | tests_require=["coverage", "nose", "nose-timer", "rednose"], 54 | **kwargs 55 | ) 56 | -------------------------------------------------------------------------------- /simhash/__init__.py: -------------------------------------------------------------------------------- 1 | #! /usr/bin/env python 2 | 3 | from .simhash import unsigned_hash, num_differing_bits, compute, find_all 4 | from six.moves import range as six_range 5 | 6 | 7 | def shingle(tokens, window=4): 8 | """A generator for a moving window of the provided tokens.""" 9 | if window <= 0: 10 | raise ValueError("Window size must be positive") 11 | 12 | # Start with an empty output set. 13 | curr_window = [] 14 | 15 | # Iterate over the input tokens, once. 16 | for token in tokens: 17 | # Add to the window. 
18 | curr_window.append(token) 19 | 20 | # If we've collected too many, remove the oldest item(s) from the collection 21 | while len(curr_window) > window: 22 | curr_window.pop(0) 23 | 24 | # Finally, if the window is full, yield the data set. 25 | if len(curr_window) == window: 26 | yield list(curr_window) 27 | -------------------------------------------------------------------------------- /simhash/simhash.cpp: -------------------------------------------------------------------------------- 1 | /* Generated by Cython 0.24.1 */ 2 | 3 | #define PY_SSIZE_T_CLEAN 4 | #include "Python.h" 5 | #ifndef Py_PYTHON_H 6 | #error Python headers needed to compile C extensions, please install development version of Python. 7 | #elif PY_VERSION_HEX < 0x02060000 || (0x03000000 <= PY_VERSION_HEX && PY_VERSION_HEX < 0x03020000) 8 | #error Cython requires Python 2.6+ or Python 3.2+. 9 | #else 10 | #define CYTHON_ABI "0_24_1" 11 | #include 12 | #ifndef offsetof 13 | #define offsetof(type, member) ( (size_t) & ((type*)0) -> member ) 14 | #endif 15 | #if !defined(WIN32) && !defined(MS_WINDOWS) 16 | #ifndef __stdcall 17 | #define __stdcall 18 | #endif 19 | #ifndef __cdecl 20 | #define __cdecl 21 | #endif 22 | #ifndef __fastcall 23 | #define __fastcall 24 | #endif 25 | #endif 26 | #ifndef DL_IMPORT 27 | #define DL_IMPORT(t) t 28 | #endif 29 | #ifndef DL_EXPORT 30 | #define DL_EXPORT(t) t 31 | #endif 32 | #ifndef PY_LONG_LONG 33 | #define PY_LONG_LONG LONG_LONG 34 | #endif 35 | #ifndef Py_HUGE_VAL 36 | #define Py_HUGE_VAL HUGE_VAL 37 | #endif 38 | #ifdef PYPY_VERSION 39 | #define CYTHON_COMPILING_IN_PYPY 1 40 | #define CYTHON_COMPILING_IN_CPYTHON 0 41 | #else 42 | #define CYTHON_COMPILING_IN_PYPY 0 43 | #define CYTHON_COMPILING_IN_CPYTHON 1 44 | #endif 45 | #if !defined(CYTHON_USE_PYLONG_INTERNALS) && CYTHON_COMPILING_IN_CPYTHON && PY_VERSION_HEX >= 0x02070000 46 | #define CYTHON_USE_PYLONG_INTERNALS 1 47 | #endif 48 | #if CYTHON_USE_PYLONG_INTERNALS 49 | #include "longintrepr.h" 50 | 
#undef SHIFT 51 | #undef BASE 52 | #undef MASK 53 | #endif 54 | #if CYTHON_COMPILING_IN_PYPY && PY_VERSION_HEX < 0x02070600 && !defined(Py_OptimizeFlag) 55 | #define Py_OptimizeFlag 0 56 | #endif 57 | #define __PYX_BUILD_PY_SSIZE_T "n" 58 | #define CYTHON_FORMAT_SSIZE_T "z" 59 | #if PY_MAJOR_VERSION < 3 60 | #define __Pyx_BUILTIN_MODULE_NAME "__builtin__" 61 | #define __Pyx_PyCode_New(a, k, l, s, f, code, c, n, v, fv, cell, fn, name, fline, lnos)\ 62 | PyCode_New(a+k, l, s, f, code, c, n, v, fv, cell, fn, name, fline, lnos) 63 | #define __Pyx_DefaultClassType PyClass_Type 64 | #else 65 | #define __Pyx_BUILTIN_MODULE_NAME "builtins" 66 | #define __Pyx_PyCode_New(a, k, l, s, f, code, c, n, v, fv, cell, fn, name, fline, lnos)\ 67 | PyCode_New(a, k, l, s, f, code, c, n, v, fv, cell, fn, name, fline, lnos) 68 | #define __Pyx_DefaultClassType PyType_Type 69 | #endif 70 | #ifndef Py_TPFLAGS_CHECKTYPES 71 | #define Py_TPFLAGS_CHECKTYPES 0 72 | #endif 73 | #ifndef Py_TPFLAGS_HAVE_INDEX 74 | #define Py_TPFLAGS_HAVE_INDEX 0 75 | #endif 76 | #ifndef Py_TPFLAGS_HAVE_NEWBUFFER 77 | #define Py_TPFLAGS_HAVE_NEWBUFFER 0 78 | #endif 79 | #ifndef Py_TPFLAGS_HAVE_FINALIZE 80 | #define Py_TPFLAGS_HAVE_FINALIZE 0 81 | #endif 82 | #if PY_VERSION_HEX > 0x03030000 && defined(PyUnicode_KIND) 83 | #define CYTHON_PEP393_ENABLED 1 84 | #define __Pyx_PyUnicode_READY(op) (likely(PyUnicode_IS_READY(op)) ?\ 85 | 0 : _PyUnicode_Ready((PyObject *)(op))) 86 | #define __Pyx_PyUnicode_GET_LENGTH(u) PyUnicode_GET_LENGTH(u) 87 | #define __Pyx_PyUnicode_READ_CHAR(u, i) PyUnicode_READ_CHAR(u, i) 88 | #define __Pyx_PyUnicode_KIND(u) PyUnicode_KIND(u) 89 | #define __Pyx_PyUnicode_DATA(u) PyUnicode_DATA(u) 90 | #define __Pyx_PyUnicode_READ(k, d, i) PyUnicode_READ(k, d, i) 91 | #define __Pyx_PyUnicode_IS_TRUE(u) (0 != (likely(PyUnicode_IS_READY(u)) ? 
PyUnicode_GET_LENGTH(u) : PyUnicode_GET_SIZE(u))) 92 | #else 93 | #define CYTHON_PEP393_ENABLED 0 94 | #define __Pyx_PyUnicode_READY(op) (0) 95 | #define __Pyx_PyUnicode_GET_LENGTH(u) PyUnicode_GET_SIZE(u) 96 | #define __Pyx_PyUnicode_READ_CHAR(u, i) ((Py_UCS4)(PyUnicode_AS_UNICODE(u)[i])) 97 | #define __Pyx_PyUnicode_KIND(u) (sizeof(Py_UNICODE)) 98 | #define __Pyx_PyUnicode_DATA(u) ((void*)PyUnicode_AS_UNICODE(u)) 99 | #define __Pyx_PyUnicode_READ(k, d, i) ((void)(k), (Py_UCS4)(((Py_UNICODE*)d)[i])) 100 | #define __Pyx_PyUnicode_IS_TRUE(u) (0 != PyUnicode_GET_SIZE(u)) 101 | #endif 102 | #if CYTHON_COMPILING_IN_PYPY 103 | #define __Pyx_PyUnicode_Concat(a, b) PyNumber_Add(a, b) 104 | #define __Pyx_PyUnicode_ConcatSafe(a, b) PyNumber_Add(a, b) 105 | #else 106 | #define __Pyx_PyUnicode_Concat(a, b) PyUnicode_Concat(a, b) 107 | #define __Pyx_PyUnicode_ConcatSafe(a, b) ((unlikely((a) == Py_None) || unlikely((b) == Py_None)) ?\ 108 | PyNumber_Add(a, b) : __Pyx_PyUnicode_Concat(a, b)) 109 | #endif 110 | #if CYTHON_COMPILING_IN_PYPY && !defined(PyUnicode_Contains) 111 | #define PyUnicode_Contains(u, s) PySequence_Contains(u, s) 112 | #endif 113 | #if CYTHON_COMPILING_IN_PYPY && !defined(PyByteArray_Check) 114 | #define PyByteArray_Check(obj) PyObject_TypeCheck(obj, &PyByteArray_Type) 115 | #endif 116 | #if CYTHON_COMPILING_IN_PYPY && !defined(PyObject_Format) 117 | #define PyObject_Format(obj, fmt) PyObject_CallMethod(obj, "__format__", "O", fmt) 118 | #endif 119 | #if CYTHON_COMPILING_IN_PYPY && !defined(PyObject_Malloc) 120 | #define PyObject_Malloc(s) PyMem_Malloc(s) 121 | #define PyObject_Free(p) PyMem_Free(p) 122 | #define PyObject_Realloc(p) PyMem_Realloc(p) 123 | #endif 124 | #define __Pyx_PyString_FormatSafe(a, b) ((unlikely((a) == Py_None)) ? PyNumber_Remainder(a, b) : __Pyx_PyString_Format(a, b)) 125 | #define __Pyx_PyUnicode_FormatSafe(a, b) ((unlikely((a) == Py_None)) ? 
PyNumber_Remainder(a, b) : PyUnicode_Format(a, b)) 126 | #if PY_MAJOR_VERSION >= 3 127 | #define __Pyx_PyString_Format(a, b) PyUnicode_Format(a, b) 128 | #else 129 | #define __Pyx_PyString_Format(a, b) PyString_Format(a, b) 130 | #endif 131 | #if PY_MAJOR_VERSION < 3 && !defined(PyObject_ASCII) 132 | #define PyObject_ASCII(o) PyObject_Repr(o) 133 | #endif 134 | #if PY_MAJOR_VERSION >= 3 135 | #define PyBaseString_Type PyUnicode_Type 136 | #define PyStringObject PyUnicodeObject 137 | #define PyString_Type PyUnicode_Type 138 | #define PyString_Check PyUnicode_Check 139 | #define PyString_CheckExact PyUnicode_CheckExact 140 | #endif 141 | #if PY_MAJOR_VERSION >= 3 142 | #define __Pyx_PyBaseString_Check(obj) PyUnicode_Check(obj) 143 | #define __Pyx_PyBaseString_CheckExact(obj) PyUnicode_CheckExact(obj) 144 | #else 145 | #define __Pyx_PyBaseString_Check(obj) (PyString_Check(obj) || PyUnicode_Check(obj)) 146 | #define __Pyx_PyBaseString_CheckExact(obj) (PyString_CheckExact(obj) || PyUnicode_CheckExact(obj)) 147 | #endif 148 | #ifndef PySet_CheckExact 149 | #define PySet_CheckExact(obj) (Py_TYPE(obj) == &PySet_Type) 150 | #endif 151 | #define __Pyx_TypeCheck(obj, type) PyObject_TypeCheck(obj, (PyTypeObject *)type) 152 | #if PY_MAJOR_VERSION >= 3 153 | #define PyIntObject PyLongObject 154 | #define PyInt_Type PyLong_Type 155 | #define PyInt_Check(op) PyLong_Check(op) 156 | #define PyInt_CheckExact(op) PyLong_CheckExact(op) 157 | #define PyInt_FromString PyLong_FromString 158 | #define PyInt_FromUnicode PyLong_FromUnicode 159 | #define PyInt_FromLong PyLong_FromLong 160 | #define PyInt_FromSize_t PyLong_FromSize_t 161 | #define PyInt_FromSsize_t PyLong_FromSsize_t 162 | #define PyInt_AsLong PyLong_AsLong 163 | #define PyInt_AS_LONG PyLong_AS_LONG 164 | #define PyInt_AsSsize_t PyLong_AsSsize_t 165 | #define PyInt_AsUnsignedLongMask PyLong_AsUnsignedLongMask 166 | #define PyInt_AsUnsignedLongLongMask PyLong_AsUnsignedLongLongMask 167 | #define PyNumber_Int PyNumber_Long 168 | 
#endif 169 | #if PY_MAJOR_VERSION >= 3 170 | #define PyBoolObject PyLongObject 171 | #endif 172 | #if PY_MAJOR_VERSION >= 3 && CYTHON_COMPILING_IN_PYPY 173 | #ifndef PyUnicode_InternFromString 174 | #define PyUnicode_InternFromString(s) PyUnicode_FromString(s) 175 | #endif 176 | #endif 177 | #if PY_VERSION_HEX < 0x030200A4 178 | typedef long Py_hash_t; 179 | #define __Pyx_PyInt_FromHash_t PyInt_FromLong 180 | #define __Pyx_PyInt_AsHash_t PyInt_AsLong 181 | #else 182 | #define __Pyx_PyInt_FromHash_t PyInt_FromSsize_t 183 | #define __Pyx_PyInt_AsHash_t PyInt_AsSsize_t 184 | #endif 185 | #if PY_MAJOR_VERSION >= 3 186 | #define __Pyx_PyMethod_New(func, self, klass) ((self) ? PyMethod_New(func, self) : PyInstanceMethod_New(func)) 187 | #else 188 | #define __Pyx_PyMethod_New(func, self, klass) PyMethod_New(func, self, klass) 189 | #endif 190 | #if PY_VERSION_HEX >= 0x030500B1 191 | #define __Pyx_PyAsyncMethodsStruct PyAsyncMethods 192 | #define __Pyx_PyType_AsAsync(obj) (Py_TYPE(obj)->tp_as_async) 193 | #elif CYTHON_COMPILING_IN_CPYTHON && PY_MAJOR_VERSION >= 3 194 | typedef struct { 195 | unaryfunc am_await; 196 | unaryfunc am_aiter; 197 | unaryfunc am_anext; 198 | } __Pyx_PyAsyncMethodsStruct; 199 | #define __Pyx_PyType_AsAsync(obj) ((__Pyx_PyAsyncMethodsStruct*) (Py_TYPE(obj)->tp_reserved)) 200 | #else 201 | #define __Pyx_PyType_AsAsync(obj) NULL 202 | #endif 203 | #ifndef CYTHON_RESTRICT 204 | #if defined(__GNUC__) 205 | #define CYTHON_RESTRICT __restrict__ 206 | #elif defined(_MSC_VER) && _MSC_VER >= 1400 207 | #define CYTHON_RESTRICT __restrict 208 | #elif defined (__STDC_VERSION__) && __STDC_VERSION__ >= 199901L 209 | #define CYTHON_RESTRICT restrict 210 | #else 211 | #define CYTHON_RESTRICT 212 | #endif 213 | #endif 214 | #define __Pyx_void_to_None(void_result) ((void)(void_result), Py_INCREF(Py_None), Py_None) 215 | 216 | #ifndef __cplusplus 217 | #error "Cython files generated with the C++ option must be compiled with a C++ compiler." 
218 | #endif 219 | #ifndef CYTHON_INLINE 220 | #define CYTHON_INLINE inline 221 | #endif 222 | template 223 | void __Pyx_call_destructor(T& x) { 224 | x.~T(); 225 | } 226 | template 227 | class __Pyx_FakeReference { 228 | public: 229 | __Pyx_FakeReference() : ptr(NULL) { } 230 | __Pyx_FakeReference(const T& ref) : ptr(const_cast(&ref)) { } 231 | T *operator->() { return ptr; } 232 | operator T&() { return *ptr; } 233 | private: 234 | T *ptr; 235 | }; 236 | 237 | #if defined(WIN32) || defined(MS_WINDOWS) 238 | #define _USE_MATH_DEFINES 239 | #endif 240 | #include 241 | #ifdef NAN 242 | #define __PYX_NAN() ((float) NAN) 243 | #else 244 | static CYTHON_INLINE float __PYX_NAN() { 245 | float value; 246 | memset(&value, 0xFF, sizeof(value)); 247 | return value; 248 | } 249 | #endif 250 | #if defined(__CYGWIN__) && defined(_LDBL_EQ_DBL) 251 | #define __Pyx_truncl trunc 252 | #else 253 | #define __Pyx_truncl truncl 254 | #endif 255 | 256 | 257 | #define __PYX_ERR(f_index, lineno, Ln_error) \ 258 | { \ 259 | __pyx_filename = __pyx_f[f_index]; __pyx_lineno = lineno; __pyx_clineno = __LINE__; goto Ln_error; \ 260 | } 261 | 262 | #if PY_MAJOR_VERSION >= 3 263 | #define __Pyx_PyNumber_Divide(x,y) PyNumber_TrueDivide(x,y) 264 | #define __Pyx_PyNumber_InPlaceDivide(x,y) PyNumber_InPlaceTrueDivide(x,y) 265 | #else 266 | #define __Pyx_PyNumber_Divide(x,y) PyNumber_Divide(x,y) 267 | #define __Pyx_PyNumber_InPlaceDivide(x,y) PyNumber_InPlaceDivide(x,y) 268 | #endif 269 | 270 | #ifndef __PYX_EXTERN_C 271 | #ifdef __cplusplus 272 | #define __PYX_EXTERN_C extern "C" 273 | #else 274 | #define __PYX_EXTERN_C extern 275 | #endif 276 | #endif 277 | 278 | #define __PYX_HAVE__simhash__simhash 279 | #define __PYX_HAVE_API__simhash__simhash 280 | #include 281 | #include "ios" 282 | #include "new" 283 | #include "stdexcept" 284 | #include "typeinfo" 285 | #include 286 | #include 287 | #include "stdint.h" 288 | #include "simhash-cpp/include/simhash.h" 289 | #ifdef _OPENMP 290 | #include 291 | 
#endif /* _OPENMP */ 292 | 293 | #ifdef PYREX_WITHOUT_ASSERTIONS 294 | #define CYTHON_WITHOUT_ASSERTIONS 295 | #endif 296 | 297 | #ifndef CYTHON_UNUSED 298 | # if defined(__GNUC__) 299 | # if !(defined(__cplusplus)) || (__GNUC__ > 3 || (__GNUC__ == 3 && __GNUC_MINOR__ >= 4)) 300 | # define CYTHON_UNUSED __attribute__ ((__unused__)) 301 | # else 302 | # define CYTHON_UNUSED 303 | # endif 304 | # elif defined(__ICC) || (defined(__INTEL_COMPILER) && !defined(_MSC_VER)) 305 | # define CYTHON_UNUSED __attribute__ ((__unused__)) 306 | # else 307 | # define CYTHON_UNUSED 308 | # endif 309 | #endif 310 | #ifndef CYTHON_NCP_UNUSED 311 | # if CYTHON_COMPILING_IN_CPYTHON 312 | # define CYTHON_NCP_UNUSED 313 | # else 314 | # define CYTHON_NCP_UNUSED CYTHON_UNUSED 315 | # endif 316 | #endif 317 | typedef struct {PyObject **p; const char *s; const Py_ssize_t n; const char* encoding; 318 | const char is_unicode; const char is_str; const char intern; } __Pyx_StringTabEntry; 319 | 320 | #define __PYX_DEFAULT_STRING_ENCODING_IS_ASCII 0 321 | #define __PYX_DEFAULT_STRING_ENCODING_IS_DEFAULT 0 322 | #define __PYX_DEFAULT_STRING_ENCODING "" 323 | #define __Pyx_PyObject_FromString __Pyx_PyBytes_FromString 324 | #define __Pyx_PyObject_FromStringAndSize __Pyx_PyBytes_FromStringAndSize 325 | #define __Pyx_uchar_cast(c) ((unsigned char)c) 326 | #define __Pyx_long_cast(x) ((long)x) 327 | #define __Pyx_fits_Py_ssize_t(v, type, is_signed) (\ 328 | (sizeof(type) < sizeof(Py_ssize_t)) ||\ 329 | (sizeof(type) > sizeof(Py_ssize_t) &&\ 330 | likely(v < (type)PY_SSIZE_T_MAX ||\ 331 | v == (type)PY_SSIZE_T_MAX) &&\ 332 | (!is_signed || likely(v > (type)PY_SSIZE_T_MIN ||\ 333 | v == (type)PY_SSIZE_T_MIN))) ||\ 334 | (sizeof(type) == sizeof(Py_ssize_t) &&\ 335 | (is_signed || likely(v < (type)PY_SSIZE_T_MAX ||\ 336 | v == (type)PY_SSIZE_T_MAX))) ) 337 | #if defined (__cplusplus) && __cplusplus >= 201103L 338 | #include 339 | #define __Pyx_sst_abs(value) std::abs(value) 340 | #elif SIZEOF_INT >= 
SIZEOF_SIZE_T 341 | #define __Pyx_sst_abs(value) abs(value) 342 | #elif SIZEOF_LONG >= SIZEOF_SIZE_T 343 | #define __Pyx_sst_abs(value) labs(value) 344 | #elif defined (_MSC_VER) && defined (_M_X64) 345 | #define __Pyx_sst_abs(value) _abs64(value) 346 | #elif defined (__STDC_VERSION__) && __STDC_VERSION__ >= 199901L 347 | #define __Pyx_sst_abs(value) llabs(value) 348 | #elif defined (__GNUC__) 349 | #define __Pyx_sst_abs(value) __builtin_llabs(value) 350 | #else 351 | #define __Pyx_sst_abs(value) ((value<0) ? -value : value) 352 | #endif 353 | static CYTHON_INLINE char* __Pyx_PyObject_AsString(PyObject*); 354 | static CYTHON_INLINE char* __Pyx_PyObject_AsStringAndSize(PyObject*, Py_ssize_t* length); 355 | #define __Pyx_PyByteArray_FromString(s) PyByteArray_FromStringAndSize((const char*)s, strlen((const char*)s)) 356 | #define __Pyx_PyByteArray_FromStringAndSize(s, l) PyByteArray_FromStringAndSize((const char*)s, l) 357 | #define __Pyx_PyBytes_FromString PyBytes_FromString 358 | #define __Pyx_PyBytes_FromStringAndSize PyBytes_FromStringAndSize 359 | static CYTHON_INLINE PyObject* __Pyx_PyUnicode_FromString(const char*); 360 | #if PY_MAJOR_VERSION < 3 361 | #define __Pyx_PyStr_FromString __Pyx_PyBytes_FromString 362 | #define __Pyx_PyStr_FromStringAndSize __Pyx_PyBytes_FromStringAndSize 363 | #else 364 | #define __Pyx_PyStr_FromString __Pyx_PyUnicode_FromString 365 | #define __Pyx_PyStr_FromStringAndSize __Pyx_PyUnicode_FromStringAndSize 366 | #endif 367 | #define __Pyx_PyObject_AsSString(s) ((signed char*) __Pyx_PyObject_AsString(s)) 368 | #define __Pyx_PyObject_AsUString(s) ((unsigned char*) __Pyx_PyObject_AsString(s)) 369 | #define __Pyx_PyObject_FromCString(s) __Pyx_PyObject_FromString((const char*)s) 370 | #define __Pyx_PyBytes_FromCString(s) __Pyx_PyBytes_FromString((const char*)s) 371 | #define __Pyx_PyByteArray_FromCString(s) __Pyx_PyByteArray_FromString((const char*)s) 372 | #define __Pyx_PyStr_FromCString(s) __Pyx_PyStr_FromString((const char*)s) 373 | 
#define __Pyx_PyUnicode_FromCString(s) __Pyx_PyUnicode_FromString((const char*)s) 374 | #if PY_MAJOR_VERSION < 3 375 | static CYTHON_INLINE size_t __Pyx_Py_UNICODE_strlen(const Py_UNICODE *u) 376 | { 377 | const Py_UNICODE *u_end = u; 378 | while (*u_end++) ; 379 | return (size_t)(u_end - u - 1); 380 | } 381 | #else 382 | #define __Pyx_Py_UNICODE_strlen Py_UNICODE_strlen 383 | #endif 384 | #define __Pyx_PyUnicode_FromUnicode(u) PyUnicode_FromUnicode(u, __Pyx_Py_UNICODE_strlen(u)) 385 | #define __Pyx_PyUnicode_FromUnicodeAndLength PyUnicode_FromUnicode 386 | #define __Pyx_PyUnicode_AsUnicode PyUnicode_AsUnicode 387 | #define __Pyx_NewRef(obj) (Py_INCREF(obj), obj) 388 | #define __Pyx_Owned_Py_None(b) __Pyx_NewRef(Py_None) 389 | #define __Pyx_PyBool_FromLong(b) ((b) ? __Pyx_NewRef(Py_True) : __Pyx_NewRef(Py_False)) 390 | static CYTHON_INLINE int __Pyx_PyObject_IsTrue(PyObject*); 391 | static CYTHON_INLINE PyObject* __Pyx_PyNumber_IntOrLong(PyObject* x); 392 | static CYTHON_INLINE Py_ssize_t __Pyx_PyIndex_AsSsize_t(PyObject*); 393 | static CYTHON_INLINE PyObject * __Pyx_PyInt_FromSize_t(size_t); 394 | #if CYTHON_COMPILING_IN_CPYTHON 395 | #define __pyx_PyFloat_AsDouble(x) (PyFloat_CheckExact(x) ? PyFloat_AS_DOUBLE(x) : PyFloat_AsDouble(x)) 396 | #else 397 | #define __pyx_PyFloat_AsDouble(x) PyFloat_AsDouble(x) 398 | #endif 399 | #define __pyx_PyFloat_AsFloat(x) ((float) __pyx_PyFloat_AsDouble(x)) 400 | #if PY_MAJOR_VERSION >= 3 401 | #define __Pyx_PyNumber_Int(x) (PyLong_CheckExact(x) ? __Pyx_NewRef(x) : PyNumber_Long(x)) 402 | #else 403 | #define __Pyx_PyNumber_Int(x) (PyInt_CheckExact(x) ? __Pyx_NewRef(x) : PyNumber_Int(x)) 404 | #endif 405 | #define __Pyx_PyNumber_Float(x) (PyFloat_CheckExact(x) ? 
__Pyx_NewRef(x) : PyNumber_Float(x)) 406 | #if PY_MAJOR_VERSION < 3 && __PYX_DEFAULT_STRING_ENCODING_IS_ASCII 407 | static int __Pyx_sys_getdefaultencoding_not_ascii; 408 | static int __Pyx_init_sys_getdefaultencoding_params(void) { 409 | PyObject* sys; 410 | PyObject* default_encoding = NULL; 411 | PyObject* ascii_chars_u = NULL; 412 | PyObject* ascii_chars_b = NULL; 413 | const char* default_encoding_c; 414 | sys = PyImport_ImportModule("sys"); 415 | if (!sys) goto bad; 416 | default_encoding = PyObject_CallMethod(sys, (char*) "getdefaultencoding", NULL); 417 | Py_DECREF(sys); 418 | if (!default_encoding) goto bad; 419 | default_encoding_c = PyBytes_AsString(default_encoding); 420 | if (!default_encoding_c) goto bad; 421 | if (strcmp(default_encoding_c, "ascii") == 0) { 422 | __Pyx_sys_getdefaultencoding_not_ascii = 0; 423 | } else { 424 | char ascii_chars[128]; 425 | int c; 426 | for (c = 0; c < 128; c++) { 427 | ascii_chars[c] = c; 428 | } 429 | __Pyx_sys_getdefaultencoding_not_ascii = 1; 430 | ascii_chars_u = PyUnicode_DecodeASCII(ascii_chars, 128, NULL); 431 | if (!ascii_chars_u) goto bad; 432 | ascii_chars_b = PyUnicode_AsEncodedString(ascii_chars_u, default_encoding_c, NULL); 433 | if (!ascii_chars_b || !PyBytes_Check(ascii_chars_b) || memcmp(ascii_chars, PyBytes_AS_STRING(ascii_chars_b), 128) != 0) { 434 | PyErr_Format( 435 | PyExc_ValueError, 436 | "This module compiled with c_string_encoding=ascii, but default encoding '%.200s' is not a superset of ascii.", 437 | default_encoding_c); 438 | goto bad; 439 | } 440 | Py_DECREF(ascii_chars_u); 441 | Py_DECREF(ascii_chars_b); 442 | } 443 | Py_DECREF(default_encoding); 444 | return 0; 445 | bad: 446 | Py_XDECREF(default_encoding); 447 | Py_XDECREF(ascii_chars_u); 448 | Py_XDECREF(ascii_chars_b); 449 | return -1; 450 | } 451 | #endif 452 | #if __PYX_DEFAULT_STRING_ENCODING_IS_DEFAULT && PY_MAJOR_VERSION >= 3 453 | #define __Pyx_PyUnicode_FromStringAndSize(c_str, size) PyUnicode_DecodeUTF8(c_str, size, NULL) 454 
| #else 455 | #define __Pyx_PyUnicode_FromStringAndSize(c_str, size) PyUnicode_Decode(c_str, size, __PYX_DEFAULT_STRING_ENCODING, NULL) 456 | #if __PYX_DEFAULT_STRING_ENCODING_IS_DEFAULT 457 | static char* __PYX_DEFAULT_STRING_ENCODING; 458 | static int __Pyx_init_sys_getdefaultencoding_params(void) { 459 | PyObject* sys; 460 | PyObject* default_encoding = NULL; 461 | char* default_encoding_c; 462 | sys = PyImport_ImportModule("sys"); 463 | if (!sys) goto bad; 464 | default_encoding = PyObject_CallMethod(sys, (char*) (const char*) "getdefaultencoding", NULL); 465 | Py_DECREF(sys); 466 | if (!default_encoding) goto bad; 467 | default_encoding_c = PyBytes_AsString(default_encoding); 468 | if (!default_encoding_c) goto bad; 469 | __PYX_DEFAULT_STRING_ENCODING = (char*) malloc(strlen(default_encoding_c) + 1); 470 | if (!__PYX_DEFAULT_STRING_ENCODING) goto bad; 471 | strcpy(__PYX_DEFAULT_STRING_ENCODING, default_encoding_c); 472 | Py_DECREF(default_encoding); 473 | return 0; 474 | bad: 475 | Py_XDECREF(default_encoding); 476 | return -1; 477 | } 478 | #endif 479 | #endif 480 | 481 | 482 | /* Test for GCC > 2.95 */ 483 | #if defined(__GNUC__) && (__GNUC__ > 2 || (__GNUC__ == 2 && (__GNUC_MINOR__ > 95))) 484 | #define likely(x) __builtin_expect(!!(x), 1) 485 | #define unlikely(x) __builtin_expect(!!(x), 0) 486 | #else /* !__GNUC__ or GCC < 2.95 */ 487 | #define likely(x) (x) 488 | #define unlikely(x) (x) 489 | #endif /* __GNUC__ */ 490 | 491 | static PyObject *__pyx_m; 492 | static PyObject *__pyx_d; 493 | static PyObject *__pyx_b; 494 | static PyObject *__pyx_empty_tuple; 495 | static PyObject *__pyx_empty_bytes; 496 | static PyObject *__pyx_empty_unicode; 497 | static int __pyx_lineno; 498 | static int __pyx_clineno = 0; 499 | static const char * __pyx_cfilenm= __FILE__; 500 | static const char *__pyx_filename; 501 | 502 | 503 | static const char *__pyx_f[] = { 504 | "simhash/simhash.pyx", 505 | "stringsource", 506 | "simhash/simhash.pxd", 507 | }; 508 | 509 | /*--- Type
declarations ---*/ 510 | struct __pyx_obj___pyx_scope_struct____Pyx_CFunc_size__t____hash__t____hash__t___to_py; 511 | 512 | /* "cfunc.to_py":64 513 | * 514 | * @cname("__Pyx_CFunc_size__t____hash__t____hash__t___to_py") 515 | * cdef object __Pyx_CFunc_size__t____hash__t____hash__t___to_py(size_t (*f)(hash_t, hash_t) except *): # <<<<<<<<<<<<<< 516 | * def wrap(hash_t a, hash_t b): 517 | * """wrap(a: 'hash_t', b: 'hash_t') -> 'size_t'""" 518 | */ 519 | struct __pyx_obj___pyx_scope_struct____Pyx_CFunc_size__t____hash__t____hash__t___to_py { 520 | PyObject_HEAD 521 | size_t (*__pyx_v_f)(Simhash::hash_t, Simhash::hash_t); 522 | }; 523 | 524 | 525 | /* --- Runtime support code (head) --- */ 526 | /* Refnanny.proto */ 527 | #ifndef CYTHON_REFNANNY 528 | #define CYTHON_REFNANNY 0 529 | #endif 530 | #if CYTHON_REFNANNY 531 | typedef struct { 532 | void (*INCREF)(void*, PyObject*, int); 533 | void (*DECREF)(void*, PyObject*, int); 534 | void (*GOTREF)(void*, PyObject*, int); 535 | void (*GIVEREF)(void*, PyObject*, int); 536 | void* (*SetupContext)(const char*, int, const char*); 537 | void (*FinishContext)(void**); 538 | } __Pyx_RefNannyAPIStruct; 539 | static __Pyx_RefNannyAPIStruct *__Pyx_RefNanny = NULL; 540 | static __Pyx_RefNannyAPIStruct *__Pyx_RefNannyImportAPI(const char *modname); 541 | #define __Pyx_RefNannyDeclarations void *__pyx_refnanny = NULL; 542 | #ifdef WITH_THREAD 543 | #define __Pyx_RefNannySetupContext(name, acquire_gil)\ 544 | if (acquire_gil) {\ 545 | PyGILState_STATE __pyx_gilstate_save = PyGILState_Ensure();\ 546 | __pyx_refnanny = __Pyx_RefNanny->SetupContext((name), __LINE__, __FILE__);\ 547 | PyGILState_Release(__pyx_gilstate_save);\ 548 | } else {\ 549 | __pyx_refnanny = __Pyx_RefNanny->SetupContext((name), __LINE__, __FILE__);\ 550 | } 551 | #else 552 | #define __Pyx_RefNannySetupContext(name, acquire_gil)\ 553 | __pyx_refnanny = __Pyx_RefNanny->SetupContext((name), __LINE__, __FILE__) 554 | #endif 555 | #define __Pyx_RefNannyFinishContext()\ 
556 | __Pyx_RefNanny->FinishContext(&__pyx_refnanny) 557 | #define __Pyx_INCREF(r) __Pyx_RefNanny->INCREF(__pyx_refnanny, (PyObject *)(r), __LINE__) 558 | #define __Pyx_DECREF(r) __Pyx_RefNanny->DECREF(__pyx_refnanny, (PyObject *)(r), __LINE__) 559 | #define __Pyx_GOTREF(r) __Pyx_RefNanny->GOTREF(__pyx_refnanny, (PyObject *)(r), __LINE__) 560 | #define __Pyx_GIVEREF(r) __Pyx_RefNanny->GIVEREF(__pyx_refnanny, (PyObject *)(r), __LINE__) 561 | #define __Pyx_XINCREF(r) do { if((r) != NULL) {__Pyx_INCREF(r); }} while(0) 562 | #define __Pyx_XDECREF(r) do { if((r) != NULL) {__Pyx_DECREF(r); }} while(0) 563 | #define __Pyx_XGOTREF(r) do { if((r) != NULL) {__Pyx_GOTREF(r); }} while(0) 564 | #define __Pyx_XGIVEREF(r) do { if((r) != NULL) {__Pyx_GIVEREF(r);}} while(0) 565 | #else 566 | #define __Pyx_RefNannyDeclarations 567 | #define __Pyx_RefNannySetupContext(name, acquire_gil) 568 | #define __Pyx_RefNannyFinishContext() 569 | #define __Pyx_INCREF(r) Py_INCREF(r) 570 | #define __Pyx_DECREF(r) Py_DECREF(r) 571 | #define __Pyx_GOTREF(r) 572 | #define __Pyx_GIVEREF(r) 573 | #define __Pyx_XINCREF(r) Py_XINCREF(r) 574 | #define __Pyx_XDECREF(r) Py_XDECREF(r) 575 | #define __Pyx_XGOTREF(r) 576 | #define __Pyx_XGIVEREF(r) 577 | #endif 578 | #define __Pyx_XDECREF_SET(r, v) do {\ 579 | PyObject *tmp = (PyObject *) r;\ 580 | r = v; __Pyx_XDECREF(tmp);\ 581 | } while (0) 582 | #define __Pyx_DECREF_SET(r, v) do {\ 583 | PyObject *tmp = (PyObject *) r;\ 584 | r = v; __Pyx_DECREF(tmp);\ 585 | } while (0) 586 | #define __Pyx_CLEAR(r) do { PyObject* tmp = ((PyObject*)(r)); r = NULL; __Pyx_DECREF(tmp);} while(0) 587 | #define __Pyx_XCLEAR(r) do { if((r) != NULL) {PyObject* tmp = ((PyObject*)(r)); r = NULL; __Pyx_DECREF(tmp);}} while(0) 588 | 589 | /* ArgTypeTest.proto */ 590 | static CYTHON_INLINE int __Pyx_ArgTypeTest(PyObject *obj, PyTypeObject *type, int none_allowed, 591 | const char *name, int exact); 592 | 593 | /* PyObjectGetAttrStr.proto */ 594 | #if CYTHON_COMPILING_IN_CPYTHON 595 | 
static CYTHON_INLINE PyObject* __Pyx_PyObject_GetAttrStr(PyObject* obj, PyObject* attr_name) { 596 | PyTypeObject* tp = Py_TYPE(obj); 597 | if (likely(tp->tp_getattro)) 598 | return tp->tp_getattro(obj, attr_name); 599 | #if PY_MAJOR_VERSION < 3 600 | if (likely(tp->tp_getattr)) 601 | return tp->tp_getattr(obj, PyString_AS_STRING(attr_name)); 602 | #endif 603 | return PyObject_GetAttr(obj, attr_name); 604 | } 605 | #else 606 | #define __Pyx_PyObject_GetAttrStr(o,n) PyObject_GetAttr(o,n) 607 | #endif 608 | 609 | /* GetBuiltinName.proto */ 610 | static PyObject *__Pyx_GetBuiltinName(PyObject *name); 611 | 612 | /* GetModuleGlobalName.proto */ 613 | static CYTHON_INLINE PyObject *__Pyx_GetModuleGlobalName(PyObject *name); 614 | 615 | /* PyObjectCall.proto */ 616 | #if CYTHON_COMPILING_IN_CPYTHON 617 | static CYTHON_INLINE PyObject* __Pyx_PyObject_Call(PyObject *func, PyObject *arg, PyObject *kw); 618 | #else 619 | #define __Pyx_PyObject_Call(func, arg, kw) PyObject_Call(func, arg, kw) 620 | #endif 621 | 622 | /* PyObjectCallMethO.proto */ 623 | #if CYTHON_COMPILING_IN_CPYTHON 624 | static CYTHON_INLINE PyObject* __Pyx_PyObject_CallMethO(PyObject *func, PyObject *arg); 625 | #endif 626 | 627 | /* PyObjectCallOneArg.proto */ 628 | static CYTHON_INLINE PyObject* __Pyx_PyObject_CallOneArg(PyObject *func, PyObject *arg); 629 | 630 | /* PyObjectCallNoArg.proto */ 631 | #if CYTHON_COMPILING_IN_CPYTHON 632 | static CYTHON_INLINE PyObject* __Pyx_PyObject_CallNoArg(PyObject *func); 633 | #else 634 | #define __Pyx_PyObject_CallNoArg(func) __Pyx_PyObject_Call(func, __pyx_empty_tuple, NULL) 635 | #endif 636 | 637 | /* SliceObject.proto */ 638 | static CYTHON_INLINE PyObject* __Pyx_PyObject_GetSlice( 639 | PyObject* obj, Py_ssize_t cstart, Py_ssize_t cstop, 640 | PyObject** py_start, PyObject** py_stop, PyObject** py_slice, 641 | int has_cstart, int has_cstop, int wraparound); 642 | 643 | /* GetItemInt.proto */ 644 | #define __Pyx_GetItemInt(o, i, type, is_signed, to_py_func, 
is_list, wraparound, boundscheck)\ 645 | (__Pyx_fits_Py_ssize_t(i, type, is_signed) ?\ 646 | __Pyx_GetItemInt_Fast(o, (Py_ssize_t)i, is_list, wraparound, boundscheck) :\ 647 | (is_list ? (PyErr_SetString(PyExc_IndexError, "list index out of range"), (PyObject*)NULL) :\ 648 | __Pyx_GetItemInt_Generic(o, to_py_func(i)))) 649 | #define __Pyx_GetItemInt_List(o, i, type, is_signed, to_py_func, is_list, wraparound, boundscheck)\ 650 | (__Pyx_fits_Py_ssize_t(i, type, is_signed) ?\ 651 | __Pyx_GetItemInt_List_Fast(o, (Py_ssize_t)i, wraparound, boundscheck) :\ 652 | (PyErr_SetString(PyExc_IndexError, "list index out of range"), (PyObject*)NULL)) 653 | static CYTHON_INLINE PyObject *__Pyx_GetItemInt_List_Fast(PyObject *o, Py_ssize_t i, 654 | int wraparound, int boundscheck); 655 | #define __Pyx_GetItemInt_Tuple(o, i, type, is_signed, to_py_func, is_list, wraparound, boundscheck)\ 656 | (__Pyx_fits_Py_ssize_t(i, type, is_signed) ?\ 657 | __Pyx_GetItemInt_Tuple_Fast(o, (Py_ssize_t)i, wraparound, boundscheck) :\ 658 | (PyErr_SetString(PyExc_IndexError, "tuple index out of range"), (PyObject*)NULL)) 659 | static CYTHON_INLINE PyObject *__Pyx_GetItemInt_Tuple_Fast(PyObject *o, Py_ssize_t i, 660 | int wraparound, int boundscheck); 661 | static CYTHON_INLINE PyObject *__Pyx_GetItemInt_Generic(PyObject *o, PyObject* j); 662 | static CYTHON_INLINE PyObject *__Pyx_GetItemInt_Fast(PyObject *o, Py_ssize_t i, 663 | int is_list, int wraparound, int boundscheck); 664 | 665 | /* RaiseArgTupleInvalid.proto */ 666 | static void __Pyx_RaiseArgtupleInvalid(const char* func_name, int exact, 667 | Py_ssize_t num_min, Py_ssize_t num_max, Py_ssize_t num_found); 668 | 669 | /* RaiseDoubleKeywords.proto */ 670 | static void __Pyx_RaiseDoubleKeywordsError(const char* func_name, PyObject* kw_name); 671 | 672 | /* ParseKeywords.proto */ 673 | static int __Pyx_ParseOptionalKeywords(PyObject *kwds, PyObject **argnames[],\ 674 | PyObject *kwds2, PyObject *values[], Py_ssize_t num_pos_args,\ 675 | const 
char* function_name); 676 | 677 | /* FetchCommonType.proto */ 678 | static PyTypeObject* __Pyx_FetchCommonType(PyTypeObject* type); 679 | 680 | /* CythonFunction.proto */ 681 | #define __Pyx_CyFunction_USED 1 682 | #include 683 | #define __Pyx_CYFUNCTION_STATICMETHOD 0x01 684 | #define __Pyx_CYFUNCTION_CLASSMETHOD 0x02 685 | #define __Pyx_CYFUNCTION_CCLASS 0x04 686 | #define __Pyx_CyFunction_GetClosure(f)\ 687 | (((__pyx_CyFunctionObject *) (f))->func_closure) 688 | #define __Pyx_CyFunction_GetClassObj(f)\ 689 | (((__pyx_CyFunctionObject *) (f))->func_classobj) 690 | #define __Pyx_CyFunction_Defaults(type, f)\ 691 | ((type *)(((__pyx_CyFunctionObject *) (f))->defaults)) 692 | #define __Pyx_CyFunction_SetDefaultsGetter(f, g)\ 693 | ((__pyx_CyFunctionObject *) (f))->defaults_getter = (g) 694 | typedef struct { 695 | PyCFunctionObject func; 696 | #if PY_VERSION_HEX < 0x030500A0 697 | PyObject *func_weakreflist; 698 | #endif 699 | PyObject *func_dict; 700 | PyObject *func_name; 701 | PyObject *func_qualname; 702 | PyObject *func_doc; 703 | PyObject *func_globals; 704 | PyObject *func_code; 705 | PyObject *func_closure; 706 | PyObject *func_classobj; 707 | void *defaults; 708 | int defaults_pyobjects; 709 | int flags; 710 | PyObject *defaults_tuple; 711 | PyObject *defaults_kwdict; 712 | PyObject *(*defaults_getter)(PyObject *); 713 | PyObject *func_annotations; 714 | } __pyx_CyFunctionObject; 715 | static PyTypeObject *__pyx_CyFunctionType = 0; 716 | #define __Pyx_CyFunction_NewEx(ml, flags, qualname, self, module, globals, code)\ 717 | __Pyx_CyFunction_New(__pyx_CyFunctionType, ml, flags, qualname, self, module, globals, code) 718 | static PyObject *__Pyx_CyFunction_New(PyTypeObject *, PyMethodDef *ml, 719 | int flags, PyObject* qualname, 720 | PyObject *self, 721 | PyObject *module, PyObject *globals, 722 | PyObject* code); 723 | static CYTHON_INLINE void *__Pyx_CyFunction_InitDefaults(PyObject *m, 724 | size_t size, 725 | int pyobjects); 726 | static CYTHON_INLINE 
void __Pyx_CyFunction_SetDefaultsTuple(PyObject *m, 727 | PyObject *tuple); 728 | static CYTHON_INLINE void __Pyx_CyFunction_SetDefaultsKwDict(PyObject *m, 729 | PyObject *dict); 730 | static CYTHON_INLINE void __Pyx_CyFunction_SetAnnotationsDict(PyObject *m, 731 | PyObject *dict); 732 | static int __pyx_CyFunction_init(void); 733 | 734 | /* ListCompAppend.proto */ 735 | #if CYTHON_COMPILING_IN_CPYTHON 736 | static CYTHON_INLINE int __Pyx_ListComp_Append(PyObject* list, PyObject* x) { 737 | PyListObject* L = (PyListObject*) list; 738 | Py_ssize_t len = Py_SIZE(list); 739 | if (likely(L->allocated > len)) { 740 | Py_INCREF(x); 741 | PyList_SET_ITEM(list, len, x); 742 | Py_SIZE(list) = len+1; 743 | return 0; 744 | } 745 | return PyList_Append(list, x); 746 | } 747 | #else 748 | #define __Pyx_ListComp_Append(L,x) PyList_Append(L,x) 749 | #endif 750 | 751 | /* IncludeStringH.proto */ 752 | #include 753 | 754 | /* Import.proto */ 755 | static PyObject *__Pyx_Import(PyObject *name, PyObject *from_list, int level); 756 | 757 | /* CodeObjectCache.proto */ 758 | typedef struct { 759 | PyCodeObject* code_object; 760 | int code_line; 761 | } __Pyx_CodeObjectCacheEntry; 762 | struct __Pyx_CodeObjectCache { 763 | int count; 764 | int max_count; 765 | __Pyx_CodeObjectCacheEntry* entries; 766 | }; 767 | static struct __Pyx_CodeObjectCache __pyx_code_cache = {0,0,NULL}; 768 | static int __pyx_bisect_code_objects(__Pyx_CodeObjectCacheEntry* entries, int count, int code_line); 769 | static PyCodeObject *__pyx_find_code_object(int code_line); 770 | static void __pyx_insert_code_object(int code_line, PyCodeObject* code_object); 771 | 772 | /* AddTraceback.proto */ 773 | static void __Pyx_AddTraceback(const char *funcname, int c_line, 774 | int py_line, const char *filename); 775 | 776 | /* CIntToPy.proto */ 777 | static CYTHON_INLINE PyObject* __Pyx_PyInt_From_long(long value); 778 | 779 | /* CIntToPy.proto */ 780 | static CYTHON_INLINE PyObject* __Pyx_PyInt_From_uint64_t(uint64_t 
value); 781 | 782 | /* CppExceptionConversion.proto */ 783 | #ifndef __Pyx_CppExn2PyErr 784 | #include 785 | #include 786 | #include 787 | #include 788 | static void __Pyx_CppExn2PyErr() { 789 | try { 790 | if (PyErr_Occurred()) 791 | ; // let the latest Python exn pass through and ignore the current one 792 | else 793 | throw; 794 | } catch (const std::bad_alloc& exn) { 795 | PyErr_SetString(PyExc_MemoryError, exn.what()); 796 | } catch (const std::bad_cast& exn) { 797 | PyErr_SetString(PyExc_TypeError, exn.what()); 798 | } catch (const std::domain_error& exn) { 799 | PyErr_SetString(PyExc_ValueError, exn.what()); 800 | } catch (const std::invalid_argument& exn) { 801 | PyErr_SetString(PyExc_ValueError, exn.what()); 802 | } catch (const std::ios_base::failure& exn) { 803 | PyErr_SetString(PyExc_IOError, exn.what()); 804 | } catch (const std::out_of_range& exn) { 805 | PyErr_SetString(PyExc_IndexError, exn.what()); 806 | } catch (const std::overflow_error& exn) { 807 | PyErr_SetString(PyExc_OverflowError, exn.what()); 808 | } catch (const std::range_error& exn) { 809 | PyErr_SetString(PyExc_ArithmeticError, exn.what()); 810 | } catch (const std::underflow_error& exn) { 811 | PyErr_SetString(PyExc_ArithmeticError, exn.what()); 812 | } catch (const std::exception& exn) { 813 | PyErr_SetString(PyExc_RuntimeError, exn.what()); 814 | } 815 | catch (...) 
816 | { 817 | PyErr_SetString(PyExc_RuntimeError, "Unknown exception"); 818 | } 819 | } 820 | #endif 821 | 822 | /* CIntFromPy.proto */ 823 | static CYTHON_INLINE uint64_t __Pyx_PyInt_As_uint64_t(PyObject *); 824 | 825 | /* CIntFromPy.proto */ 826 | static CYTHON_INLINE size_t __Pyx_PyInt_As_size_t(PyObject *); 827 | 828 | /* CIntFromPy.proto */ 829 | static CYTHON_INLINE long __Pyx_PyInt_As_long(PyObject *); 830 | 831 | /* CIntFromPy.proto */ 832 | static CYTHON_INLINE int __Pyx_PyInt_As_int(PyObject *); 833 | 834 | /* CheckBinaryVersion.proto */ 835 | static int __Pyx_check_binary_version(void); 836 | 837 | /* InitStrings.proto */ 838 | static int __Pyx_InitStrings(__Pyx_StringTabEntry *t); 839 | 840 | 841 | /* Module declarations from 'libcpp.vector' */ 842 | 843 | /* Module declarations from 'libcpp.utility' */ 844 | 845 | /* Module declarations from 'libcpp.unordered_set' */ 846 | 847 | /* Module declarations from 'simhash.simhash' */ 848 | static PyTypeObject *__pyx_ptype___pyx_scope_struct____Pyx_CFunc_size__t____hash__t____hash__t___to_py = 0; 849 | static PyObject *__Pyx_CFunc_size__t____hash__t____hash__t___to_py(size_t (*)(Simhash::hash_t, Simhash::hash_t)); /*proto*/ 850 | static std::vector<Simhash::hash_t> __pyx_convert_vector_from_py_Simhash_3a__3a_hash_t(PyObject *); /*proto*/ 851 | static std::unordered_set<Simhash::hash_t> __pyx_convert_unordered_set_from_py_Simhash_3a__3a_hash_t(PyObject *); /*proto*/ 852 | static PyObject *__pyx_convert_pair_to_py_Simhash_3a__3a_hash_t____Simhash_3a__3a_hash_t(std::pair<Simhash::hash_t,Simhash::hash_t> const &); /*proto*/ 853 | static PyObject *__pyx_convert_vector_to_py_Simhash_3a__3a_match_t(const std::vector<Simhash::match_t> &); /*proto*/ 854 | #define __Pyx_MODULE_NAME "simhash.simhash" 855 | int __pyx_module_is_main_simhash__simhash = 0; 856 | 857 | /* Implementation of 'simhash.simhash' */ 858 | static PyObject *__pyx_builtin_range; 859 | static const char __pyx_k_Q[] = ">Q"; 860 | static const char __pyx_k_a[] = "a"; 861 | static const char __pyx_k_b[] = "b"; 862 | static const char
__pyx_k_md5[] = "md5"; 863 | static const char __pyx_k_obj[] = "obj"; 864 | static const char __pyx_k_main[] = "__main__"; 865 | static const char __pyx_k_test[] = "__test__"; 866 | static const char __pyx_k_wrap[] = "wrap"; 867 | static const char __pyx_k_range[] = "range"; 868 | static const char __pyx_k_digest[] = "digest"; 869 | static const char __pyx_k_hashes[] = "hashes"; 870 | static const char __pyx_k_import[] = "__import__"; 871 | static const char __pyx_k_struct[] = "struct"; 872 | static const char __pyx_k_unpack[] = "unpack"; 873 | static const char __pyx_k_hashlib[] = "hashlib"; 874 | static const char __pyx_k_cfunc_to_py[] = "cfunc.to_py"; 875 | static const char __pyx_k_stringsource[] = "stringsource"; 876 | static const char __pyx_k_unsigned_hash[] = "unsigned_hash"; 877 | static const char __pyx_k_different_bits[] = "different_bits"; 878 | static const char __pyx_k_simhash_simhash[] = "simhash.simhash"; 879 | static const char __pyx_k_number_of_blocks[] = "number_of_blocks"; 880 | static const char __pyx_k_vagrant_simhash_simhash_pyx[] = "/vagrant/simhash/simhash.pyx"; 881 | static const char __pyx_k_Pyx_CFunc_size__t____hash__t[] = "__Pyx_CFunc_size__t____hash__t____hash__t___to_py..wrap"; 882 | static PyObject *__pyx_n_s_Pyx_CFunc_size__t____hash__t; 883 | static PyObject *__pyx_kp_s_Q; 884 | static PyObject *__pyx_n_s_a; 885 | static PyObject *__pyx_n_s_b; 886 | static PyObject *__pyx_n_s_cfunc_to_py; 887 | static PyObject *__pyx_n_s_different_bits; 888 | static PyObject *__pyx_n_s_digest; 889 | static PyObject *__pyx_n_s_hashes; 890 | static PyObject *__pyx_n_s_hashlib; 891 | static PyObject *__pyx_n_s_import; 892 | static PyObject *__pyx_n_s_main; 893 | static PyObject *__pyx_n_s_md5; 894 | static PyObject *__pyx_n_s_number_of_blocks; 895 | static PyObject *__pyx_n_s_obj; 896 | static PyObject *__pyx_n_s_range; 897 | static PyObject *__pyx_n_s_simhash_simhash; 898 | static PyObject *__pyx_kp_s_stringsource; 899 | static PyObject 
*__pyx_n_s_struct; 900 | static PyObject *__pyx_n_s_test; 901 | static PyObject *__pyx_n_s_unpack; 902 | static PyObject *__pyx_n_s_unsigned_hash; 903 | static PyObject *__pyx_kp_s_vagrant_simhash_simhash_pyx; 904 | static PyObject *__pyx_n_s_wrap; 905 | static PyObject *__pyx_pf_7simhash_7simhash_unsigned_hash(CYTHON_UNUSED PyObject *__pyx_self, PyObject *__pyx_v_obj); /* proto */ 906 | static PyObject *__pyx_pf_7simhash_7simhash_2compute(CYTHON_UNUSED PyObject *__pyx_self, PyObject *__pyx_v_hashes); /* proto */ 907 | static PyObject *__pyx_pf_7simhash_7simhash_4find_all(CYTHON_UNUSED PyObject *__pyx_self, PyObject *__pyx_v_hashes, PyObject *__pyx_v_number_of_blocks, PyObject *__pyx_v_different_bits); /* proto */ 908 | static PyObject *__pyx_pf_11cfunc_dot_to_py_49__Pyx_CFunc_size__t____hash__t____hash__t___to_py_wrap(PyObject *__pyx_self, Simhash::hash_t __pyx_v_a, Simhash::hash_t __pyx_v_b); /* proto */ 909 | static PyObject *__pyx_tp_new___pyx_scope_struct____Pyx_CFunc_size__t____hash__t____hash__t___to_py(PyTypeObject *t, PyObject *a, PyObject *k); /*proto*/ 910 | static PyObject *__pyx_int_0; 911 | static PyObject *__pyx_int_8; 912 | static PyObject *__pyx_int_18446744073709551615; 913 | static PyObject *__pyx_slice_; 914 | static PyObject *__pyx_tuple__2; 915 | static PyObject *__pyx_tuple__4; 916 | static PyObject *__pyx_codeobj__3; 917 | static PyObject *__pyx_codeobj__5; 918 | 919 | /* "simhash/simhash.pyx":8 920 | * 921 | * 922 | * def unsigned_hash(bytes obj): # <<<<<<<<<<<<<< 923 | * '''Returns a hash suitable for use as a hash_t.''' 924 | * return struct.unpack('>Q', hashlib.md5(obj).digest()[0:8])[0] & 0xFFFFFFFFFFFFFFFF 925 | */ 926 | 927 | /* Python wrapper */ 928 | static PyObject *__pyx_pw_7simhash_7simhash_1unsigned_hash(PyObject *__pyx_self, PyObject *__pyx_v_obj); /*proto*/ 929 | static char __pyx_doc_7simhash_7simhash_unsigned_hash[] = "Returns a hash suitable for use as a hash_t."; 930 | static PyMethodDef 
__pyx_mdef_7simhash_7simhash_1unsigned_hash = {"unsigned_hash", (PyCFunction)__pyx_pw_7simhash_7simhash_1unsigned_hash, METH_O, __pyx_doc_7simhash_7simhash_unsigned_hash}; 931 | static PyObject *__pyx_pw_7simhash_7simhash_1unsigned_hash(PyObject *__pyx_self, PyObject *__pyx_v_obj) { 932 | PyObject *__pyx_r = 0; 933 | __Pyx_RefNannyDeclarations 934 | __Pyx_RefNannySetupContext("unsigned_hash (wrapper)", 0); 935 | if (unlikely(!__Pyx_ArgTypeTest(((PyObject *)__pyx_v_obj), (&PyBytes_Type), 1, "obj", 1))) __PYX_ERR(0, 8, __pyx_L1_error) 936 | __pyx_r = __pyx_pf_7simhash_7simhash_unsigned_hash(__pyx_self, ((PyObject*)__pyx_v_obj)); 937 | 938 | /* function exit code */ 939 | goto __pyx_L0; 940 | __pyx_L1_error:; 941 | __pyx_r = NULL; 942 | __pyx_L0:; 943 | __Pyx_RefNannyFinishContext(); 944 | return __pyx_r; 945 | } 946 | 947 | static PyObject *__pyx_pf_7simhash_7simhash_unsigned_hash(CYTHON_UNUSED PyObject *__pyx_self, PyObject *__pyx_v_obj) { 948 | PyObject *__pyx_r = NULL; 949 | __Pyx_RefNannyDeclarations 950 | PyObject *__pyx_t_1 = NULL; 951 | PyObject *__pyx_t_2 = NULL; 952 | PyObject *__pyx_t_3 = NULL; 953 | PyObject *__pyx_t_4 = NULL; 954 | PyObject *__pyx_t_5 = NULL; 955 | PyObject *__pyx_t_6 = NULL; 956 | PyObject *__pyx_t_7 = NULL; 957 | Py_ssize_t __pyx_t_8; 958 | __Pyx_RefNannySetupContext("unsigned_hash", 0); 959 | 960 | /* "simhash/simhash.pyx":10 961 | * def unsigned_hash(bytes obj): 962 | * '''Returns a hash suitable for use as a hash_t.''' 963 | * return struct.unpack('>Q', hashlib.md5(obj).digest()[0:8])[0] & 0xFFFFFFFFFFFFFFFF # <<<<<<<<<<<<<< 964 | * 965 | * def compute(hashes): 966 | */ 967 | __Pyx_XDECREF(__pyx_r); 968 | __pyx_t_2 = __Pyx_GetModuleGlobalName(__pyx_n_s_struct); if (unlikely(!__pyx_t_2)) __PYX_ERR(0, 10, __pyx_L1_error) 969 | __Pyx_GOTREF(__pyx_t_2); 970 | __pyx_t_3 = __Pyx_PyObject_GetAttrStr(__pyx_t_2, __pyx_n_s_unpack); if (unlikely(!__pyx_t_3)) __PYX_ERR(0, 10, __pyx_L1_error) 971 | __Pyx_GOTREF(__pyx_t_3); 972 | 
__Pyx_DECREF(__pyx_t_2); __pyx_t_2 = 0; 973 | __pyx_t_5 = __Pyx_GetModuleGlobalName(__pyx_n_s_hashlib); if (unlikely(!__pyx_t_5)) __PYX_ERR(0, 10, __pyx_L1_error) 974 | __Pyx_GOTREF(__pyx_t_5); 975 | __pyx_t_6 = __Pyx_PyObject_GetAttrStr(__pyx_t_5, __pyx_n_s_md5); if (unlikely(!__pyx_t_6)) __PYX_ERR(0, 10, __pyx_L1_error) 976 | __Pyx_GOTREF(__pyx_t_6); 977 | __Pyx_DECREF(__pyx_t_5); __pyx_t_5 = 0; 978 | __pyx_t_5 = NULL; 979 | if (CYTHON_COMPILING_IN_CPYTHON && unlikely(PyMethod_Check(__pyx_t_6))) { 980 | __pyx_t_5 = PyMethod_GET_SELF(__pyx_t_6); 981 | if (likely(__pyx_t_5)) { 982 | PyObject* function = PyMethod_GET_FUNCTION(__pyx_t_6); 983 | __Pyx_INCREF(__pyx_t_5); 984 | __Pyx_INCREF(function); 985 | __Pyx_DECREF_SET(__pyx_t_6, function); 986 | } 987 | } 988 | if (!__pyx_t_5) { 989 | __pyx_t_4 = __Pyx_PyObject_CallOneArg(__pyx_t_6, __pyx_v_obj); if (unlikely(!__pyx_t_4)) __PYX_ERR(0, 10, __pyx_L1_error) 990 | __Pyx_GOTREF(__pyx_t_4); 991 | } else { 992 | __pyx_t_7 = PyTuple_New(1+1); if (unlikely(!__pyx_t_7)) __PYX_ERR(0, 10, __pyx_L1_error) 993 | __Pyx_GOTREF(__pyx_t_7); 994 | __Pyx_GIVEREF(__pyx_t_5); PyTuple_SET_ITEM(__pyx_t_7, 0, __pyx_t_5); __pyx_t_5 = NULL; 995 | __Pyx_INCREF(__pyx_v_obj); 996 | __Pyx_GIVEREF(__pyx_v_obj); 997 | PyTuple_SET_ITEM(__pyx_t_7, 0+1, __pyx_v_obj); 998 | __pyx_t_4 = __Pyx_PyObject_Call(__pyx_t_6, __pyx_t_7, NULL); if (unlikely(!__pyx_t_4)) __PYX_ERR(0, 10, __pyx_L1_error) 999 | __Pyx_GOTREF(__pyx_t_4); 1000 | __Pyx_DECREF(__pyx_t_7); __pyx_t_7 = 0; 1001 | } 1002 | __Pyx_DECREF(__pyx_t_6); __pyx_t_6 = 0; 1003 | __pyx_t_6 = __Pyx_PyObject_GetAttrStr(__pyx_t_4, __pyx_n_s_digest); if (unlikely(!__pyx_t_6)) __PYX_ERR(0, 10, __pyx_L1_error) 1004 | __Pyx_GOTREF(__pyx_t_6); 1005 | __Pyx_DECREF(__pyx_t_4); __pyx_t_4 = 0; 1006 | __pyx_t_4 = NULL; 1007 | if (CYTHON_COMPILING_IN_CPYTHON && likely(PyMethod_Check(__pyx_t_6))) { 1008 | __pyx_t_4 = PyMethod_GET_SELF(__pyx_t_6); 1009 | if (likely(__pyx_t_4)) { 1010 | PyObject* function = 
PyMethod_GET_FUNCTION(__pyx_t_6); 1011 | __Pyx_INCREF(__pyx_t_4); 1012 | __Pyx_INCREF(function); 1013 | __Pyx_DECREF_SET(__pyx_t_6, function); 1014 | } 1015 | } 1016 | if (__pyx_t_4) { 1017 | __pyx_t_2 = __Pyx_PyObject_CallOneArg(__pyx_t_6, __pyx_t_4); if (unlikely(!__pyx_t_2)) __PYX_ERR(0, 10, __pyx_L1_error) 1018 | __Pyx_DECREF(__pyx_t_4); __pyx_t_4 = 0; 1019 | } else { 1020 | __pyx_t_2 = __Pyx_PyObject_CallNoArg(__pyx_t_6); if (unlikely(!__pyx_t_2)) __PYX_ERR(0, 10, __pyx_L1_error) 1021 | } 1022 | __Pyx_GOTREF(__pyx_t_2); 1023 | __Pyx_DECREF(__pyx_t_6); __pyx_t_6 = 0; 1024 | __pyx_t_6 = __Pyx_PyObject_GetSlice(__pyx_t_2, 0, 8, NULL, NULL, &__pyx_slice_, 1, 1, 1); if (unlikely(!__pyx_t_6)) __PYX_ERR(0, 10, __pyx_L1_error) 1025 | __Pyx_GOTREF(__pyx_t_6); 1026 | __Pyx_DECREF(__pyx_t_2); __pyx_t_2 = 0; 1027 | __pyx_t_2 = NULL; 1028 | __pyx_t_8 = 0; 1029 | if (CYTHON_COMPILING_IN_CPYTHON && unlikely(PyMethod_Check(__pyx_t_3))) { 1030 | __pyx_t_2 = PyMethod_GET_SELF(__pyx_t_3); 1031 | if (likely(__pyx_t_2)) { 1032 | PyObject* function = PyMethod_GET_FUNCTION(__pyx_t_3); 1033 | __Pyx_INCREF(__pyx_t_2); 1034 | __Pyx_INCREF(function); 1035 | __Pyx_DECREF_SET(__pyx_t_3, function); 1036 | __pyx_t_8 = 1; 1037 | } 1038 | } 1039 | __pyx_t_4 = PyTuple_New(2+__pyx_t_8); if (unlikely(!__pyx_t_4)) __PYX_ERR(0, 10, __pyx_L1_error) 1040 | __Pyx_GOTREF(__pyx_t_4); 1041 | if (__pyx_t_2) { 1042 | __Pyx_GIVEREF(__pyx_t_2); PyTuple_SET_ITEM(__pyx_t_4, 0, __pyx_t_2); __pyx_t_2 = NULL; 1043 | } 1044 | __Pyx_INCREF(__pyx_kp_s_Q); 1045 | __Pyx_GIVEREF(__pyx_kp_s_Q); 1046 | PyTuple_SET_ITEM(__pyx_t_4, 0+__pyx_t_8, __pyx_kp_s_Q); 1047 | __Pyx_GIVEREF(__pyx_t_6); 1048 | PyTuple_SET_ITEM(__pyx_t_4, 1+__pyx_t_8, __pyx_t_6); 1049 | __pyx_t_6 = 0; 1050 | __pyx_t_1 = __Pyx_PyObject_Call(__pyx_t_3, __pyx_t_4, NULL); if (unlikely(!__pyx_t_1)) __PYX_ERR(0, 10, __pyx_L1_error) 1051 | __Pyx_GOTREF(__pyx_t_1); 1052 | __Pyx_DECREF(__pyx_t_4); __pyx_t_4 = 0; 1053 | __Pyx_DECREF(__pyx_t_3); __pyx_t_3 = 0; 
1054 | __pyx_t_3 = __Pyx_GetItemInt(__pyx_t_1, 0, long, 1, __Pyx_PyInt_From_long, 0, 0, 1); if (unlikely(!__pyx_t_3)) __PYX_ERR(0, 10, __pyx_L1_error) 1055 | __Pyx_GOTREF(__pyx_t_3); 1056 | __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0; 1057 | __pyx_t_1 = PyNumber_And(__pyx_t_3, __pyx_int_18446744073709551615); if (unlikely(!__pyx_t_1)) __PYX_ERR(0, 10, __pyx_L1_error) 1058 | __Pyx_GOTREF(__pyx_t_1); 1059 | __Pyx_DECREF(__pyx_t_3); __pyx_t_3 = 0; 1060 | __pyx_r = __pyx_t_1; 1061 | __pyx_t_1 = 0; 1062 | goto __pyx_L0; 1063 | 1064 | /* "simhash/simhash.pyx":8 1065 | * 1066 | * 1067 | * def unsigned_hash(bytes obj): # <<<<<<<<<<<<<< 1068 | * '''Returns a hash suitable for use as a hash_t.''' 1069 | * return struct.unpack('>Q', hashlib.md5(obj).digest()[0:8])[0] & 0xFFFFFFFFFFFFFFFF 1070 | */ 1071 | 1072 | /* function exit code */ 1073 | __pyx_L1_error:; 1074 | __Pyx_XDECREF(__pyx_t_1); 1075 | __Pyx_XDECREF(__pyx_t_2); 1076 | __Pyx_XDECREF(__pyx_t_3); 1077 | __Pyx_XDECREF(__pyx_t_4); 1078 | __Pyx_XDECREF(__pyx_t_5); 1079 | __Pyx_XDECREF(__pyx_t_6); 1080 | __Pyx_XDECREF(__pyx_t_7); 1081 | __Pyx_AddTraceback("simhash.simhash.unsigned_hash", __pyx_clineno, __pyx_lineno, __pyx_filename); 1082 | __pyx_r = NULL; 1083 | __pyx_L0:; 1084 | __Pyx_XGIVEREF(__pyx_r); 1085 | __Pyx_RefNannyFinishContext(); 1086 | return __pyx_r; 1087 | } 1088 | 1089 | /* "simhash/simhash.pyx":12 1090 | * return struct.unpack('>Q', hashlib.md5(obj).digest()[0:8])[0] & 0xFFFFFFFFFFFFFFFF 1091 | * 1092 | * def compute(hashes): # <<<<<<<<<<<<<< 1093 | * '''Compute the simhash of a vector of hashes.''' 1094 | * return c_compute(hashes) 1095 | */ 1096 | 1097 | /* Python wrapper */ 1098 | static PyObject *__pyx_pw_7simhash_7simhash_3compute(PyObject *__pyx_self, PyObject *__pyx_v_hashes); /*proto*/ 1099 | static char __pyx_doc_7simhash_7simhash_2compute[] = "Compute the simhash of a vector of hashes."; 1100 | static PyObject *__pyx_pw_7simhash_7simhash_3compute(PyObject *__pyx_self, PyObject *__pyx_v_hashes) { 
1101 | PyObject *__pyx_r = 0; 1102 | __Pyx_RefNannyDeclarations 1103 | __Pyx_RefNannySetupContext("compute (wrapper)", 0); 1104 | __pyx_r = __pyx_pf_7simhash_7simhash_2compute(__pyx_self, ((PyObject *)__pyx_v_hashes)); 1105 | 1106 | /* function exit code */ 1107 | __Pyx_RefNannyFinishContext(); 1108 | return __pyx_r; 1109 | } 1110 | 1111 | static PyObject *__pyx_pf_7simhash_7simhash_2compute(CYTHON_UNUSED PyObject *__pyx_self, PyObject *__pyx_v_hashes) { 1112 | PyObject *__pyx_r = NULL; 1113 | __Pyx_RefNannyDeclarations 1114 | std::vector<Simhash::hash_t> __pyx_t_1; 1115 | PyObject *__pyx_t_2 = NULL; 1116 | __Pyx_RefNannySetupContext("compute", 0); 1117 | 1118 | /* "simhash/simhash.pyx":14 1119 | * def compute(hashes): 1120 | * '''Compute the simhash of a vector of hashes.''' 1121 | * return c_compute(hashes) # <<<<<<<<<<<<<< 1122 | * 1123 | * def find_all(hashes, number_of_blocks, different_bits): 1124 | */ 1125 | __Pyx_XDECREF(__pyx_r); 1126 | __pyx_t_1 = __pyx_convert_vector_from_py_Simhash_3a__3a_hash_t(__pyx_v_hashes); if (unlikely(PyErr_Occurred())) __PYX_ERR(0, 14, __pyx_L1_error) 1127 | __pyx_t_2 = __Pyx_PyInt_From_uint64_t(Simhash::compute(__pyx_t_1)); if (unlikely(!__pyx_t_2)) __PYX_ERR(0, 14, __pyx_L1_error) 1128 | __Pyx_GOTREF(__pyx_t_2); 1129 | __pyx_r = __pyx_t_2; 1130 | __pyx_t_2 = 0; 1131 | goto __pyx_L0; 1132 | 1133 | /* "simhash/simhash.pyx":12 1134 | * return struct.unpack('>Q', hashlib.md5(obj).digest()[0:8])[0] & 0xFFFFFFFFFFFFFFFF 1135 | * 1136 | * def compute(hashes): # <<<<<<<<<<<<<< 1137 | * '''Compute the simhash of a vector of hashes.''' 1138 | * return c_compute(hashes) 1139 | */ 1140 | 1141 | /* function exit code */ 1142 | __pyx_L1_error:; 1143 | __Pyx_XDECREF(__pyx_t_2); 1144 | __Pyx_AddTraceback("simhash.simhash.compute", __pyx_clineno, __pyx_lineno, __pyx_filename); 1145 | __pyx_r = NULL; 1146 | __pyx_L0:; 1147 | __Pyx_XGIVEREF(__pyx_r); 1148 | __Pyx_RefNannyFinishContext(); 1149 | return __pyx_r; 1150 | } 1151 | 1152 | /* "simhash/simhash.pyx":16 
1153 | * return c_compute(hashes) 1154 | * 1155 | * def find_all(hashes, number_of_blocks, different_bits): # <<<<<<<<<<<<<< 1156 | * ''' 1157 | * Find the set of all matches within the provided vector of hashes. 1158 | */ 1159 | 1160 | /* Python wrapper */ 1161 | static PyObject *__pyx_pw_7simhash_7simhash_5find_all(PyObject *__pyx_self, PyObject *__pyx_args, PyObject *__pyx_kwds); /*proto*/ 1162 | static char __pyx_doc_7simhash_7simhash_4find_all[] = "\n Find the set of all matches within the provided vector of hashes.\n\n The provided hashes are manipulated in place, but upon completion are\n restored to their original state.\n "; 1163 | static PyObject *__pyx_pw_7simhash_7simhash_5find_all(PyObject *__pyx_self, PyObject *__pyx_args, PyObject *__pyx_kwds) { 1164 | PyObject *__pyx_v_hashes = 0; 1165 | PyObject *__pyx_v_number_of_blocks = 0; 1166 | PyObject *__pyx_v_different_bits = 0; 1167 | PyObject *__pyx_r = 0; 1168 | __Pyx_RefNannyDeclarations 1169 | __Pyx_RefNannySetupContext("find_all (wrapper)", 0); 1170 | { 1171 | static PyObject **__pyx_pyargnames[] = {&__pyx_n_s_hashes,&__pyx_n_s_number_of_blocks,&__pyx_n_s_different_bits,0}; 1172 | PyObject* values[3] = {0,0,0}; 1173 | if (unlikely(__pyx_kwds)) { 1174 | Py_ssize_t kw_args; 1175 | const Py_ssize_t pos_args = PyTuple_GET_SIZE(__pyx_args); 1176 | switch (pos_args) { 1177 | case 3: values[2] = PyTuple_GET_ITEM(__pyx_args, 2); 1178 | case 2: values[1] = PyTuple_GET_ITEM(__pyx_args, 1); 1179 | case 1: values[0] = PyTuple_GET_ITEM(__pyx_args, 0); 1180 | case 0: break; 1181 | default: goto __pyx_L5_argtuple_error; 1182 | } 1183 | kw_args = PyDict_Size(__pyx_kwds); 1184 | switch (pos_args) { 1185 | case 0: 1186 | if (likely((values[0] = PyDict_GetItem(__pyx_kwds, __pyx_n_s_hashes)) != 0)) kw_args--; 1187 | else goto __pyx_L5_argtuple_error; 1188 | case 1: 1189 | if (likely((values[1] = PyDict_GetItem(__pyx_kwds, __pyx_n_s_number_of_blocks)) != 0)) kw_args--; 1190 | else { 1191 | 
__Pyx_RaiseArgtupleInvalid("find_all", 1, 3, 3, 1); __PYX_ERR(0, 16, __pyx_L3_error) 1192 | } 1193 | case 2: 1194 | if (likely((values[2] = PyDict_GetItem(__pyx_kwds, __pyx_n_s_different_bits)) != 0)) kw_args--; 1195 | else { 1196 | __Pyx_RaiseArgtupleInvalid("find_all", 1, 3, 3, 2); __PYX_ERR(0, 16, __pyx_L3_error) 1197 | } 1198 | } 1199 | if (unlikely(kw_args > 0)) { 1200 | if (unlikely(__Pyx_ParseOptionalKeywords(__pyx_kwds, __pyx_pyargnames, 0, values, pos_args, "find_all") < 0)) __PYX_ERR(0, 16, __pyx_L3_error) 1201 | } 1202 | } else if (PyTuple_GET_SIZE(__pyx_args) != 3) { 1203 | goto __pyx_L5_argtuple_error; 1204 | } else { 1205 | values[0] = PyTuple_GET_ITEM(__pyx_args, 0); 1206 | values[1] = PyTuple_GET_ITEM(__pyx_args, 1); 1207 | values[2] = PyTuple_GET_ITEM(__pyx_args, 2); 1208 | } 1209 | __pyx_v_hashes = values[0]; 1210 | __pyx_v_number_of_blocks = values[1]; 1211 | __pyx_v_different_bits = values[2]; 1212 | } 1213 | goto __pyx_L4_argument_unpacking_done; 1214 | __pyx_L5_argtuple_error:; 1215 | __Pyx_RaiseArgtupleInvalid("find_all", 1, 3, 3, PyTuple_GET_SIZE(__pyx_args)); __PYX_ERR(0, 16, __pyx_L3_error) 1216 | __pyx_L3_error:; 1217 | __Pyx_AddTraceback("simhash.simhash.find_all", __pyx_clineno, __pyx_lineno, __pyx_filename); 1218 | __Pyx_RefNannyFinishContext(); 1219 | return NULL; 1220 | __pyx_L4_argument_unpacking_done:; 1221 | __pyx_r = __pyx_pf_7simhash_7simhash_4find_all(__pyx_self, __pyx_v_hashes, __pyx_v_number_of_blocks, __pyx_v_different_bits); 1222 | 1223 | /* function exit code */ 1224 | __Pyx_RefNannyFinishContext(); 1225 | return __pyx_r; 1226 | } 1227 | 1228 | static PyObject *__pyx_pf_7simhash_7simhash_4find_all(CYTHON_UNUSED PyObject *__pyx_self, PyObject *__pyx_v_hashes, PyObject *__pyx_v_number_of_blocks, PyObject *__pyx_v_different_bits) { 1229 | Simhash::matches_t __pyx_v_results_set; 1230 | std::vector<Simhash::match_t> __pyx_v_results_vector; 1231 | PyObject *__pyx_r = NULL; 1232 | __Pyx_RefNannyDeclarations 1233 | std::unordered_set<Simhash::hash_t> __pyx_t_1; 
1234 | size_t __pyx_t_2; 1235 | size_t __pyx_t_3; 1236 | PyObject *__pyx_t_4 = NULL; 1237 | __Pyx_RefNannySetupContext("find_all", 0); 1238 | 1239 | /* "simhash/simhash.pyx":23 1240 | * restored to their original state. 1241 | * ''' 1242 | * cdef matches_t results_set = c_find_all(hashes, number_of_blocks, different_bits) # <<<<<<<<<<<<<< 1243 | * cdef vector[match_t] results_vector 1244 | * results_vector.assign(results_set.begin(), results_set.end()) 1245 | */ 1246 | __pyx_t_1 = __pyx_convert_unordered_set_from_py_Simhash_3a__3a_hash_t(__pyx_v_hashes); if (unlikely(PyErr_Occurred())) __PYX_ERR(0, 23, __pyx_L1_error) 1247 | __pyx_t_2 = __Pyx_PyInt_As_size_t(__pyx_v_number_of_blocks); if (unlikely((__pyx_t_2 == (size_t)-1) && PyErr_Occurred())) __PYX_ERR(0, 23, __pyx_L1_error) 1248 | __pyx_t_3 = __Pyx_PyInt_As_size_t(__pyx_v_different_bits); if (unlikely((__pyx_t_3 == (size_t)-1) && PyErr_Occurred())) __PYX_ERR(0, 23, __pyx_L1_error) 1249 | __pyx_v_results_set = Simhash::find_all(__pyx_t_1, __pyx_t_2, __pyx_t_3); 1250 | 1251 | /* "simhash/simhash.pyx":25 1252 | * cdef matches_t results_set = c_find_all(hashes, number_of_blocks, different_bits) 1253 | * cdef vector[match_t] results_vector 1254 | * results_vector.assign(results_set.begin(), results_set.end()) # <<<<<<<<<<<<<< 1255 | * return results_vector 1256 | */ 1257 | try { 1258 | __pyx_v_results_vector.assign(__pyx_v_results_set.begin(), __pyx_v_results_set.end()); 1259 | } catch(...) 
{ 1260 | __Pyx_CppExn2PyErr(); 1261 | __PYX_ERR(0, 25, __pyx_L1_error) 1262 | } 1263 | 1264 | /* "simhash/simhash.pyx":26 1265 | * cdef vector[match_t] results_vector 1266 | * results_vector.assign(results_set.begin(), results_set.end()) 1267 | * return results_vector # <<<<<<<<<<<<<< 1268 | */ 1269 | __Pyx_XDECREF(__pyx_r); 1270 | __pyx_t_4 = __pyx_convert_vector_to_py_Simhash_3a__3a_match_t(__pyx_v_results_vector); if (unlikely(!__pyx_t_4)) __PYX_ERR(0, 26, __pyx_L1_error) 1271 | __Pyx_GOTREF(__pyx_t_4); 1272 | __pyx_r = __pyx_t_4; 1273 | __pyx_t_4 = 0; 1274 | goto __pyx_L0; 1275 | 1276 | /* "simhash/simhash.pyx":16 1277 | * return c_compute(hashes) 1278 | * 1279 | * def find_all(hashes, number_of_blocks, different_bits): # <<<<<<<<<<<<<< 1280 | * ''' 1281 | * Find the set of all matches within the provided vector of hashes. 1282 | */ 1283 | 1284 | /* function exit code */ 1285 | __pyx_L1_error:; 1286 | __Pyx_XDECREF(__pyx_t_4); 1287 | __Pyx_AddTraceback("simhash.simhash.find_all", __pyx_clineno, __pyx_lineno, __pyx_filename); 1288 | __pyx_r = NULL; 1289 | __pyx_L0:; 1290 | __Pyx_XGIVEREF(__pyx_r); 1291 | __Pyx_RefNannyFinishContext(); 1292 | return __pyx_r; 1293 | } 1294 | 1295 | /* "cfunc.to_py":65 1296 | * @cname("__Pyx_CFunc_size__t____hash__t____hash__t___to_py") 1297 | * cdef object __Pyx_CFunc_size__t____hash__t____hash__t___to_py(size_t (*f)(hash_t, hash_t) except *): 1298 | * def wrap(hash_t a, hash_t b): # <<<<<<<<<<<<<< 1299 | * """wrap(a: 'hash_t', b: 'hash_t') -> 'size_t'""" 1300 | * return f(a, b) 1301 | */ 1302 | 1303 | /* Python wrapper */ 1304 | static PyObject *__pyx_pw_11cfunc_dot_to_py_49__Pyx_CFunc_size__t____hash__t____hash__t___to_py_1wrap(PyObject *__pyx_self, PyObject *__pyx_args, PyObject *__pyx_kwds); /*proto*/ 1305 | static char __pyx_doc_11cfunc_dot_to_py_49__Pyx_CFunc_size__t____hash__t____hash__t___to_py_wrap[] = "wrap(a: 'hash_t', b: 'hash_t') -> 'size_t'"; 1306 | static PyMethodDef 
__pyx_mdef_11cfunc_dot_to_py_49__Pyx_CFunc_size__t____hash__t____hash__t___to_py_1wrap = {"wrap", (PyCFunction)__pyx_pw_11cfunc_dot_to_py_49__Pyx_CFunc_size__t____hash__t____hash__t___to_py_1wrap, METH_VARARGS|METH_KEYWORDS, __pyx_doc_11cfunc_dot_to_py_49__Pyx_CFunc_size__t____hash__t____hash__t___to_py_wrap}; 1307 | static PyObject *__pyx_pw_11cfunc_dot_to_py_49__Pyx_CFunc_size__t____hash__t____hash__t___to_py_1wrap(PyObject *__pyx_self, PyObject *__pyx_args, PyObject *__pyx_kwds) { 1308 | Simhash::hash_t __pyx_v_a; 1309 | Simhash::hash_t __pyx_v_b; 1310 | PyObject *__pyx_r = 0; 1311 | __Pyx_RefNannyDeclarations 1312 | __Pyx_RefNannySetupContext("wrap (wrapper)", 0); 1313 | { 1314 | static PyObject **__pyx_pyargnames[] = {&__pyx_n_s_a,&__pyx_n_s_b,0}; 1315 | PyObject* values[2] = {0,0}; 1316 | if (unlikely(__pyx_kwds)) { 1317 | Py_ssize_t kw_args; 1318 | const Py_ssize_t pos_args = PyTuple_GET_SIZE(__pyx_args); 1319 | switch (pos_args) { 1320 | case 2: values[1] = PyTuple_GET_ITEM(__pyx_args, 1); 1321 | case 1: values[0] = PyTuple_GET_ITEM(__pyx_args, 0); 1322 | case 0: break; 1323 | default: goto __pyx_L5_argtuple_error; 1324 | } 1325 | kw_args = PyDict_Size(__pyx_kwds); 1326 | switch (pos_args) { 1327 | case 0: 1328 | if (likely((values[0] = PyDict_GetItem(__pyx_kwds, __pyx_n_s_a)) != 0)) kw_args--; 1329 | else goto __pyx_L5_argtuple_error; 1330 | case 1: 1331 | if (likely((values[1] = PyDict_GetItem(__pyx_kwds, __pyx_n_s_b)) != 0)) kw_args--; 1332 | else { 1333 | __Pyx_RaiseArgtupleInvalid("wrap", 1, 2, 2, 1); __PYX_ERR(1, 65, __pyx_L3_error) 1334 | } 1335 | } 1336 | if (unlikely(kw_args > 0)) { 1337 | if (unlikely(__Pyx_ParseOptionalKeywords(__pyx_kwds, __pyx_pyargnames, 0, values, pos_args, "wrap") < 0)) __PYX_ERR(1, 65, __pyx_L3_error) 1338 | } 1339 | } else if (PyTuple_GET_SIZE(__pyx_args) != 2) { 1340 | goto __pyx_L5_argtuple_error; 1341 | } else { 1342 | values[0] = PyTuple_GET_ITEM(__pyx_args, 0); 1343 | values[1] = PyTuple_GET_ITEM(__pyx_args, 1); 1344 
| } 1345 | __pyx_v_a = __Pyx_PyInt_As_uint64_t(values[0]); if (unlikely((__pyx_v_a == (Simhash::hash_t)-1) && PyErr_Occurred())) __PYX_ERR(1, 65, __pyx_L3_error) 1346 | __pyx_v_b = __Pyx_PyInt_As_uint64_t(values[1]); if (unlikely((__pyx_v_b == (Simhash::hash_t)-1) && PyErr_Occurred())) __PYX_ERR(1, 65, __pyx_L3_error) 1347 | } 1348 | goto __pyx_L4_argument_unpacking_done; 1349 | __pyx_L5_argtuple_error:; 1350 | __Pyx_RaiseArgtupleInvalid("wrap", 1, 2, 2, PyTuple_GET_SIZE(__pyx_args)); __PYX_ERR(1, 65, __pyx_L3_error) 1351 | __pyx_L3_error:; 1352 | __Pyx_AddTraceback("cfunc.to_py.__Pyx_CFunc_size__t____hash__t____hash__t___to_py.wrap", __pyx_clineno, __pyx_lineno, __pyx_filename); 1353 | __Pyx_RefNannyFinishContext(); 1354 | return NULL; 1355 | __pyx_L4_argument_unpacking_done:; 1356 | __pyx_r = __pyx_pf_11cfunc_dot_to_py_49__Pyx_CFunc_size__t____hash__t____hash__t___to_py_wrap(__pyx_self, __pyx_v_a, __pyx_v_b); 1357 | 1358 | /* function exit code */ 1359 | __Pyx_RefNannyFinishContext(); 1360 | return __pyx_r; 1361 | } 1362 | 1363 | static PyObject *__pyx_pf_11cfunc_dot_to_py_49__Pyx_CFunc_size__t____hash__t____hash__t___to_py_wrap(PyObject *__pyx_self, Simhash::hash_t __pyx_v_a, Simhash::hash_t __pyx_v_b) { 1364 | struct __pyx_obj___pyx_scope_struct____Pyx_CFunc_size__t____hash__t____hash__t___to_py *__pyx_cur_scope; 1365 | struct __pyx_obj___pyx_scope_struct____Pyx_CFunc_size__t____hash__t____hash__t___to_py *__pyx_outer_scope; 1366 | PyObject *__pyx_r = NULL; 1367 | __Pyx_RefNannyDeclarations 1368 | size_t __pyx_t_1; 1369 | PyObject *__pyx_t_2 = NULL; 1370 | __Pyx_RefNannySetupContext("wrap", 0); 1371 | __pyx_outer_scope = (struct __pyx_obj___pyx_scope_struct____Pyx_CFunc_size__t____hash__t____hash__t___to_py *) __Pyx_CyFunction_GetClosure(__pyx_self); 1372 | __pyx_cur_scope = __pyx_outer_scope; 1373 | 1374 | /* "cfunc.to_py":67 1375 | * def wrap(hash_t a, hash_t b): 1376 | * """wrap(a: 'hash_t', b: 'hash_t') -> 'size_t'""" 1377 | * return f(a, b) # 
<<<<<<<<<<<<<< 1378 | * return wrap 1379 | * 1380 | */ 1381 | __Pyx_XDECREF(__pyx_r); 1382 | __pyx_t_1 = __pyx_cur_scope->__pyx_v_f(__pyx_v_a, __pyx_v_b); if (unlikely(PyErr_Occurred())) __PYX_ERR(1, 67, __pyx_L1_error) 1383 | __pyx_t_2 = __Pyx_PyInt_FromSize_t(__pyx_t_1); if (unlikely(!__pyx_t_2)) __PYX_ERR(1, 67, __pyx_L1_error) 1384 | __Pyx_GOTREF(__pyx_t_2); 1385 | __pyx_r = __pyx_t_2; 1386 | __pyx_t_2 = 0; 1387 | goto __pyx_L0; 1388 | 1389 | /* "cfunc.to_py":65 1390 | * @cname("__Pyx_CFunc_size__t____hash__t____hash__t___to_py") 1391 | * cdef object __Pyx_CFunc_size__t____hash__t____hash__t___to_py(size_t (*f)(hash_t, hash_t) except *): 1392 | * def wrap(hash_t a, hash_t b): # <<<<<<<<<<<<<< 1393 | * """wrap(a: 'hash_t', b: 'hash_t') -> 'size_t'""" 1394 | * return f(a, b) 1395 | */ 1396 | 1397 | /* function exit code */ 1398 | __pyx_L1_error:; 1399 | __Pyx_XDECREF(__pyx_t_2); 1400 | __Pyx_AddTraceback("cfunc.to_py.__Pyx_CFunc_size__t____hash__t____hash__t___to_py.wrap", __pyx_clineno, __pyx_lineno, __pyx_filename); 1401 | __pyx_r = NULL; 1402 | __pyx_L0:; 1403 | __Pyx_XGIVEREF(__pyx_r); 1404 | __Pyx_RefNannyFinishContext(); 1405 | return __pyx_r; 1406 | } 1407 | 1408 | /* "cfunc.to_py":64 1409 | * 1410 | * @cname("__Pyx_CFunc_size__t____hash__t____hash__t___to_py") 1411 | * cdef object __Pyx_CFunc_size__t____hash__t____hash__t___to_py(size_t (*f)(hash_t, hash_t) except *): # <<<<<<<<<<<<<< 1412 | * def wrap(hash_t a, hash_t b): 1413 | * """wrap(a: 'hash_t', b: 'hash_t') -> 'size_t'""" 1414 | */ 1415 | 1416 | static PyObject *__Pyx_CFunc_size__t____hash__t____hash__t___to_py(size_t (*__pyx_v_f)(Simhash::hash_t, Simhash::hash_t)) { 1417 | struct __pyx_obj___pyx_scope_struct____Pyx_CFunc_size__t____hash__t____hash__t___to_py *__pyx_cur_scope; 1418 | PyObject *__pyx_v_wrap = 0; 1419 | PyObject *__pyx_r = NULL; 1420 | __Pyx_RefNannyDeclarations 1421 | PyObject *__pyx_t_1 = NULL; 1422 | __Pyx_RefNannySetupContext("__Pyx_CFunc_size__t____hash__t____hash__t___to_py", 
0); 1423 | __pyx_cur_scope = (struct __pyx_obj___pyx_scope_struct____Pyx_CFunc_size__t____hash__t____hash__t___to_py *)__pyx_tp_new___pyx_scope_struct____Pyx_CFunc_size__t____hash__t____hash__t___to_py(__pyx_ptype___pyx_scope_struct____Pyx_CFunc_size__t____hash__t____hash__t___to_py, __pyx_empty_tuple, NULL); 1424 | if (unlikely(!__pyx_cur_scope)) { 1425 | __Pyx_RefNannyFinishContext(); 1426 | return 0; 1427 | } 1428 | __Pyx_GOTREF(__pyx_cur_scope); 1429 | __pyx_cur_scope->__pyx_v_f = __pyx_v_f; 1430 | 1431 | /* "cfunc.to_py":65 1432 | * @cname("__Pyx_CFunc_size__t____hash__t____hash__t___to_py") 1433 | * cdef object __Pyx_CFunc_size__t____hash__t____hash__t___to_py(size_t (*f)(hash_t, hash_t) except *): 1434 | * def wrap(hash_t a, hash_t b): # <<<<<<<<<<<<<< 1435 | * """wrap(a: 'hash_t', b: 'hash_t') -> 'size_t'""" 1436 | * return f(a, b) 1437 | */ 1438 | __pyx_t_1 = __Pyx_CyFunction_NewEx(&__pyx_mdef_11cfunc_dot_to_py_49__Pyx_CFunc_size__t____hash__t____hash__t___to_py_1wrap, 0, __pyx_n_s_Pyx_CFunc_size__t____hash__t, ((PyObject*)__pyx_cur_scope), __pyx_n_s_cfunc_to_py, __pyx_d, ((PyObject *)__pyx_codeobj__3)); if (unlikely(!__pyx_t_1)) __PYX_ERR(1, 65, __pyx_L1_error) 1439 | __Pyx_GOTREF(__pyx_t_1); 1440 | __pyx_v_wrap = __pyx_t_1; 1441 | __pyx_t_1 = 0; 1442 | 1443 | /* "cfunc.to_py":68 1444 | * """wrap(a: 'hash_t', b: 'hash_t') -> 'size_t'""" 1445 | * return f(a, b) 1446 | * return wrap # <<<<<<<<<<<<<< 1447 | * 1448 | * 1449 | */ 1450 | __Pyx_XDECREF(__pyx_r); 1451 | __Pyx_INCREF(__pyx_v_wrap); 1452 | __pyx_r = __pyx_v_wrap; 1453 | goto __pyx_L0; 1454 | 1455 | /* "cfunc.to_py":64 1456 | * 1457 | * @cname("__Pyx_CFunc_size__t____hash__t____hash__t___to_py") 1458 | * cdef object __Pyx_CFunc_size__t____hash__t____hash__t___to_py(size_t (*f)(hash_t, hash_t) except *): # <<<<<<<<<<<<<< 1459 | * def wrap(hash_t a, hash_t b): 1460 | * """wrap(a: 'hash_t', b: 'hash_t') -> 'size_t'""" 1461 | */ 1462 | 1463 | /* function exit code */ 1464 | __pyx_L1_error:; 1465 | 
__Pyx_XDECREF(__pyx_t_1); 1466 | __Pyx_AddTraceback("cfunc.to_py.__Pyx_CFunc_size__t____hash__t____hash__t___to_py", __pyx_clineno, __pyx_lineno, __pyx_filename); 1467 | __pyx_r = 0; 1468 | __pyx_L0:; 1469 | __Pyx_XDECREF(__pyx_v_wrap); 1470 | __Pyx_DECREF(((PyObject *)__pyx_cur_scope)); 1471 | __Pyx_XGIVEREF(__pyx_r); 1472 | __Pyx_RefNannyFinishContext(); 1473 | return __pyx_r; 1474 | } 1475 | 1476 | /* "vector.from_py":49 1477 | * 1478 | * @cname("__pyx_convert_vector_from_py_Simhash_3a__3a_hash_t") 1479 | * cdef vector[X] __pyx_convert_vector_from_py_Simhash_3a__3a_hash_t(object o) except *: # <<<<<<<<<<<<<< 1480 | * cdef vector[X] v 1481 | * for item in o: 1482 | */ 1483 | 1484 | static std::vector<Simhash::hash_t> __pyx_convert_vector_from_py_Simhash_3a__3a_hash_t(PyObject *__pyx_v_o) { 1485 | std::vector<Simhash::hash_t> __pyx_v_v; 1486 | PyObject *__pyx_v_item = NULL; 1487 | std::vector<Simhash::hash_t> __pyx_r; 1488 | __Pyx_RefNannyDeclarations 1489 | PyObject *__pyx_t_1 = NULL; 1490 | Py_ssize_t __pyx_t_2; 1491 | PyObject *(*__pyx_t_3)(PyObject *); 1492 | PyObject *__pyx_t_4 = NULL; 1493 | Simhash::hash_t __pyx_t_5; 1494 | __Pyx_RefNannySetupContext("__pyx_convert_vector_from_py_Simhash_3a__3a_hash_t", 0); 1495 | 1496 | /* "vector.from_py":51 1497 | * cdef vector[X] __pyx_convert_vector_from_py_Simhash_3a__3a_hash_t(object o) except *: 1498 | * cdef vector[X] v 1499 | * for item in o: # <<<<<<<<<<<<<< 1500 | * v.push_back(X_from_py(item)) 1501 | * return v 1502 | */ 1503 | if (likely(PyList_CheckExact(__pyx_v_o)) || PyTuple_CheckExact(__pyx_v_o)) { 1504 | __pyx_t_1 = __pyx_v_o; __Pyx_INCREF(__pyx_t_1); __pyx_t_2 = 0; 1505 | __pyx_t_3 = NULL; 1506 | } else { 1507 | __pyx_t_2 = -1; __pyx_t_1 = PyObject_GetIter(__pyx_v_o); if (unlikely(!__pyx_t_1)) __PYX_ERR(1, 51, __pyx_L1_error) 1508 | __Pyx_GOTREF(__pyx_t_1); 1509 | __pyx_t_3 = Py_TYPE(__pyx_t_1)->tp_iternext; if (unlikely(!__pyx_t_3)) __PYX_ERR(1, 51, __pyx_L1_error) 1510 | } 1511 | for (;;) { 1512 | if (likely(!__pyx_t_3)) { 1513 | if 
(likely(PyList_CheckExact(__pyx_t_1))) { 1514 | if (__pyx_t_2 >= PyList_GET_SIZE(__pyx_t_1)) break; 1515 | #if CYTHON_COMPILING_IN_CPYTHON 1516 | __pyx_t_4 = PyList_GET_ITEM(__pyx_t_1, __pyx_t_2); __Pyx_INCREF(__pyx_t_4); __pyx_t_2++; if (unlikely(0 < 0)) __PYX_ERR(1, 51, __pyx_L1_error) 1517 | #else 1518 | __pyx_t_4 = PySequence_ITEM(__pyx_t_1, __pyx_t_2); __pyx_t_2++; if (unlikely(!__pyx_t_4)) __PYX_ERR(1, 51, __pyx_L1_error) 1519 | __Pyx_GOTREF(__pyx_t_4); 1520 | #endif 1521 | } else { 1522 | if (__pyx_t_2 >= PyTuple_GET_SIZE(__pyx_t_1)) break; 1523 | #if CYTHON_COMPILING_IN_CPYTHON 1524 | __pyx_t_4 = PyTuple_GET_ITEM(__pyx_t_1, __pyx_t_2); __Pyx_INCREF(__pyx_t_4); __pyx_t_2++; if (unlikely(0 < 0)) __PYX_ERR(1, 51, __pyx_L1_error) 1525 | #else 1526 | __pyx_t_4 = PySequence_ITEM(__pyx_t_1, __pyx_t_2); __pyx_t_2++; if (unlikely(!__pyx_t_4)) __PYX_ERR(1, 51, __pyx_L1_error) 1527 | __Pyx_GOTREF(__pyx_t_4); 1528 | #endif 1529 | } 1530 | } else { 1531 | __pyx_t_4 = __pyx_t_3(__pyx_t_1); 1532 | if (unlikely(!__pyx_t_4)) { 1533 | PyObject* exc_type = PyErr_Occurred(); 1534 | if (exc_type) { 1535 | if (likely(exc_type == PyExc_StopIteration || PyErr_GivenExceptionMatches(exc_type, PyExc_StopIteration))) PyErr_Clear(); 1536 | else __PYX_ERR(1, 51, __pyx_L1_error) 1537 | } 1538 | break; 1539 | } 1540 | __Pyx_GOTREF(__pyx_t_4); 1541 | } 1542 | __Pyx_XDECREF_SET(__pyx_v_item, __pyx_t_4); 1543 | __pyx_t_4 = 0; 1544 | 1545 | /* "vector.from_py":52 1546 | * cdef vector[X] v 1547 | * for item in o: 1548 | * v.push_back(X_from_py(item)) # <<<<<<<<<<<<<< 1549 | * return v 1550 | * 1551 | */ 1552 | __pyx_t_5 = __Pyx_PyInt_As_uint64_t(__pyx_v_item); if (unlikely(__pyx_t_5 == -1LL && PyErr_Occurred())) __PYX_ERR(1, 52, __pyx_L1_error) 1553 | __pyx_v_v.push_back(__pyx_t_5); 1554 | 1555 | /* "vector.from_py":51 1556 | * cdef vector[X] __pyx_convert_vector_from_py_Simhash_3a__3a_hash_t(object o) except *: 1557 | * cdef vector[X] v 1558 | * for item in o: # <<<<<<<<<<<<<< 1559 | * 
v.push_back(X_from_py(item)) 1560 | * return v 1561 | */ 1562 | } 1563 | __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0; 1564 | 1565 | /* "vector.from_py":53 1566 | * for item in o: 1567 | * v.push_back(X_from_py(item)) 1568 | * return v # <<<<<<<<<<<<<< 1569 | * 1570 | * 1571 | */ 1572 | __pyx_r = __pyx_v_v; 1573 | goto __pyx_L0; 1574 | 1575 | /* "vector.from_py":49 1576 | * 1577 | * @cname("__pyx_convert_vector_from_py_Simhash_3a__3a_hash_t") 1578 | * cdef vector[X] __pyx_convert_vector_from_py_Simhash_3a__3a_hash_t(object o) except *: # <<<<<<<<<<<<<< 1579 | * cdef vector[X] v 1580 | * for item in o: 1581 | */ 1582 | 1583 | /* function exit code */ 1584 | __pyx_L1_error:; 1585 | __Pyx_XDECREF(__pyx_t_1); 1586 | __Pyx_XDECREF(__pyx_t_4); 1587 | __Pyx_AddTraceback("vector.from_py.__pyx_convert_vector_from_py_Simhash_3a__3a_hash_t", __pyx_clineno, __pyx_lineno, __pyx_filename); 1588 | __pyx_L0:; 1589 | __Pyx_XDECREF(__pyx_v_item); 1590 | __Pyx_RefNannyFinishContext(); 1591 | return __pyx_r; 1592 | } 1593 | 1594 | /* "set.from_py":120 1595 | * 1596 | * @cname("__pyx_convert_unordered_set_from_py_Simhash_3a__3a_hash_t") 1597 | * cdef set[X] __pyx_convert_unordered_set_from_py_Simhash_3a__3a_hash_t(object o) except *: # <<<<<<<<<<<<<< 1598 | * cdef set[X] s 1599 | * for item in o: 1600 | */ 1601 | 1602 | static std::unordered_set<Simhash::hash_t> __pyx_convert_unordered_set_from_py_Simhash_3a__3a_hash_t(PyObject *__pyx_v_o) { 1603 | std::unordered_set<Simhash::hash_t> __pyx_v_s; 1604 | PyObject *__pyx_v_item = NULL; 1605 | std::unordered_set<Simhash::hash_t> __pyx_r; 1606 | __Pyx_RefNannyDeclarations 1607 | PyObject *__pyx_t_1 = NULL; 1608 | Py_ssize_t __pyx_t_2; 1609 | PyObject *(*__pyx_t_3)(PyObject *); 1610 | PyObject *__pyx_t_4 = NULL; 1611 | Simhash::hash_t __pyx_t_5; 1612 | __Pyx_RefNannySetupContext("__pyx_convert_unordered_set_from_py_Simhash_3a__3a_hash_t", 0); 1613 | 1614 | /* "set.from_py":122 1615 | * cdef set[X] __pyx_convert_unordered_set_from_py_Simhash_3a__3a_hash_t(object o) except *: 1616 | * cdef set[X] s 
1617 | * for item in o: # <<<<<<<<<<<<<< 1618 | * s.insert(X_from_py(item)) 1619 | * return s 1620 | */ 1621 | if (likely(PyList_CheckExact(__pyx_v_o)) || PyTuple_CheckExact(__pyx_v_o)) { 1622 | __pyx_t_1 = __pyx_v_o; __Pyx_INCREF(__pyx_t_1); __pyx_t_2 = 0; 1623 | __pyx_t_3 = NULL; 1624 | } else { 1625 | __pyx_t_2 = -1; __pyx_t_1 = PyObject_GetIter(__pyx_v_o); if (unlikely(!__pyx_t_1)) __PYX_ERR(1, 122, __pyx_L1_error) 1626 | __Pyx_GOTREF(__pyx_t_1); 1627 | __pyx_t_3 = Py_TYPE(__pyx_t_1)->tp_iternext; if (unlikely(!__pyx_t_3)) __PYX_ERR(1, 122, __pyx_L1_error) 1628 | } 1629 | for (;;) { 1630 | if (likely(!__pyx_t_3)) { 1631 | if (likely(PyList_CheckExact(__pyx_t_1))) { 1632 | if (__pyx_t_2 >= PyList_GET_SIZE(__pyx_t_1)) break; 1633 | #if CYTHON_COMPILING_IN_CPYTHON 1634 | __pyx_t_4 = PyList_GET_ITEM(__pyx_t_1, __pyx_t_2); __Pyx_INCREF(__pyx_t_4); __pyx_t_2++; if (unlikely(0 < 0)) __PYX_ERR(1, 122, __pyx_L1_error) 1635 | #else 1636 | __pyx_t_4 = PySequence_ITEM(__pyx_t_1, __pyx_t_2); __pyx_t_2++; if (unlikely(!__pyx_t_4)) __PYX_ERR(1, 122, __pyx_L1_error) 1637 | __Pyx_GOTREF(__pyx_t_4); 1638 | #endif 1639 | } else { 1640 | if (__pyx_t_2 >= PyTuple_GET_SIZE(__pyx_t_1)) break; 1641 | #if CYTHON_COMPILING_IN_CPYTHON 1642 | __pyx_t_4 = PyTuple_GET_ITEM(__pyx_t_1, __pyx_t_2); __Pyx_INCREF(__pyx_t_4); __pyx_t_2++; if (unlikely(0 < 0)) __PYX_ERR(1, 122, __pyx_L1_error) 1643 | #else 1644 | __pyx_t_4 = PySequence_ITEM(__pyx_t_1, __pyx_t_2); __pyx_t_2++; if (unlikely(!__pyx_t_4)) __PYX_ERR(1, 122, __pyx_L1_error) 1645 | __Pyx_GOTREF(__pyx_t_4); 1646 | #endif 1647 | } 1648 | } else { 1649 | __pyx_t_4 = __pyx_t_3(__pyx_t_1); 1650 | if (unlikely(!__pyx_t_4)) { 1651 | PyObject* exc_type = PyErr_Occurred(); 1652 | if (exc_type) { 1653 | if (likely(exc_type == PyExc_StopIteration || PyErr_GivenExceptionMatches(exc_type, PyExc_StopIteration))) PyErr_Clear(); 1654 | else __PYX_ERR(1, 122, __pyx_L1_error) 1655 | } 1656 | break; 1657 | } 1658 | __Pyx_GOTREF(__pyx_t_4); 1659 | } 1660 | 
__Pyx_XDECREF_SET(__pyx_v_item, __pyx_t_4); 1661 | __pyx_t_4 = 0; 1662 | 1663 | /* "set.from_py":123 1664 | * cdef set[X] s 1665 | * for item in o: 1666 | * s.insert(X_from_py(item)) # <<<<<<<<<<<<<< 1667 | * return s 1668 | * 1669 | */ 1670 | __pyx_t_5 = __Pyx_PyInt_As_uint64_t(__pyx_v_item); if (unlikely(__pyx_t_5 == -1LL && PyErr_Occurred())) __PYX_ERR(1, 123, __pyx_L1_error) 1671 | __pyx_v_s.insert(__pyx_t_5); 1672 | 1673 | /* "set.from_py":122 1674 | * cdef set[X] __pyx_convert_unordered_set_from_py_Simhash_3a__3a_hash_t(object o) except *: 1675 | * cdef set[X] s 1676 | * for item in o: # <<<<<<<<<<<<<< 1677 | * s.insert(X_from_py(item)) 1678 | * return s 1679 | */ 1680 | } 1681 | __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0; 1682 | 1683 | /* "set.from_py":124 1684 | * for item in o: 1685 | * s.insert(X_from_py(item)) 1686 | * return s # <<<<<<<<<<<<<< 1687 | * 1688 | * 1689 | */ 1690 | __pyx_r = __pyx_v_s; 1691 | goto __pyx_L0; 1692 | 1693 | /* "set.from_py":120 1694 | * 1695 | * @cname("__pyx_convert_unordered_set_from_py_Simhash_3a__3a_hash_t") 1696 | * cdef set[X] __pyx_convert_unordered_set_from_py_Simhash_3a__3a_hash_t(object o) except *: # <<<<<<<<<<<<<< 1697 | * cdef set[X] s 1698 | * for item in o: 1699 | */ 1700 | 1701 | /* function exit code */ 1702 | __pyx_L1_error:; 1703 | __Pyx_XDECREF(__pyx_t_1); 1704 | __Pyx_XDECREF(__pyx_t_4); 1705 | __Pyx_AddTraceback("set.from_py.__pyx_convert_unordered_set_from_py_Simhash_3a__3a_hash_t", __pyx_clineno, __pyx_lineno, __pyx_filename); 1706 | __pyx_L0:; 1707 | __Pyx_XDECREF(__pyx_v_item); 1708 | __Pyx_RefNannyFinishContext(); 1709 | return __pyx_r; 1710 | } 1711 | 1712 | /* "pair.to_py":180 1713 | * 1714 | * @cname("__pyx_convert_pair_to_py_Simhash_3a__3a_hash_t____Simhash_3a__3a_hash_t") 1715 | * cdef object __pyx_convert_pair_to_py_Simhash_3a__3a_hash_t____Simhash_3a__3a_hash_t(const pair[X,Y]& p): # <<<<<<<<<<<<<< 1716 | * return X_to_py(p.first), Y_to_py(p.second) 1717 | * 1718 | */ 1719 | 1720 | static PyObject 
*__pyx_convert_pair_to_py_Simhash_3a__3a_hash_t____Simhash_3a__3a_hash_t(std::pair<Simhash::hash_t,Simhash::hash_t> const &__pyx_v_p) { 1721 | PyObject *__pyx_r = NULL; 1722 | __Pyx_RefNannyDeclarations 1723 | PyObject *__pyx_t_1 = NULL; 1724 | PyObject *__pyx_t_2 = NULL; 1725 | PyObject *__pyx_t_3 = NULL; 1726 | __Pyx_RefNannySetupContext("__pyx_convert_pair_to_py_Simhash_3a__3a_hash_t____Simhash_3a__3a_hash_t", 0); 1727 | 1728 | /* "pair.to_py":181 1729 | * @cname("__pyx_convert_pair_to_py_Simhash_3a__3a_hash_t____Simhash_3a__3a_hash_t") 1730 | * cdef object __pyx_convert_pair_to_py_Simhash_3a__3a_hash_t____Simhash_3a__3a_hash_t(const pair[X,Y]& p): 1731 | * return X_to_py(p.first), Y_to_py(p.second) # <<<<<<<<<<<<<< 1732 | * 1733 | * 1734 | */ 1735 | __Pyx_XDECREF(__pyx_r); 1736 | __pyx_t_1 = __Pyx_PyInt_From_uint64_t(__pyx_v_p.first); if (unlikely(!__pyx_t_1)) __PYX_ERR(1, 181, __pyx_L1_error) 1737 | __Pyx_GOTREF(__pyx_t_1); 1738 | __pyx_t_2 = __Pyx_PyInt_From_uint64_t(__pyx_v_p.second); if (unlikely(!__pyx_t_2)) __PYX_ERR(1, 181, __pyx_L1_error) 1739 | __Pyx_GOTREF(__pyx_t_2); 1740 | __pyx_t_3 = PyTuple_New(2); if (unlikely(!__pyx_t_3)) __PYX_ERR(1, 181, __pyx_L1_error) 1741 | __Pyx_GOTREF(__pyx_t_3); 1742 | __Pyx_GIVEREF(__pyx_t_1); 1743 | PyTuple_SET_ITEM(__pyx_t_3, 0, __pyx_t_1); 1744 | __Pyx_GIVEREF(__pyx_t_2); 1745 | PyTuple_SET_ITEM(__pyx_t_3, 1, __pyx_t_2); 1746 | __pyx_t_1 = 0; 1747 | __pyx_t_2 = 0; 1748 | __pyx_r = __pyx_t_3; 1749 | __pyx_t_3 = 0; 1750 | goto __pyx_L0; 1751 | 1752 | /* "pair.to_py":180 1753 | * 1754 | * @cname("__pyx_convert_pair_to_py_Simhash_3a__3a_hash_t____Simhash_3a__3a_hash_t") 1755 | * cdef object __pyx_convert_pair_to_py_Simhash_3a__3a_hash_t____Simhash_3a__3a_hash_t(const pair[X,Y]& p): # <<<<<<<<<<<<<< 1756 | * return X_to_py(p.first), Y_to_py(p.second) 1757 | * 1758 | */ 1759 | 1760 | /* function exit code */ 1761 | __pyx_L1_error:; 1762 | __Pyx_XDECREF(__pyx_t_1); 1763 | __Pyx_XDECREF(__pyx_t_2); 1764 | __Pyx_XDECREF(__pyx_t_3); 1765 | 
__Pyx_AddTraceback("pair.to_py.__pyx_convert_pair_to_py_Simhash_3a__3a_hash_t____Simhash_3a__3a_hash_t", __pyx_clineno, __pyx_lineno, __pyx_filename); 1766 | __pyx_r = 0; 1767 | __pyx_L0:; 1768 | __Pyx_XGIVEREF(__pyx_r); 1769 | __Pyx_RefNannyFinishContext(); 1770 | return __pyx_r; 1771 | } 1772 | 1773 | /* "vector.to_py":67 1774 | * 1775 | * @cname("__pyx_convert_vector_to_py_Simhash_3a__3a_match_t") 1776 | * cdef object __pyx_convert_vector_to_py_Simhash_3a__3a_match_t(vector[X]& v): # <<<<<<<<<<<<<< 1777 | * return [X_to_py(v[i]) for i in range(v.size())] 1778 | * 1779 | */ 1780 | 1781 | static PyObject *__pyx_convert_vector_to_py_Simhash_3a__3a_match_t(const std::vector<Simhash::match_t> &__pyx_v_v) { 1782 | size_t __pyx_v_i; 1783 | PyObject *__pyx_r = NULL; 1784 | __Pyx_RefNannyDeclarations 1785 | PyObject *__pyx_t_1 = NULL; 1786 | size_t __pyx_t_2; 1787 | size_t __pyx_t_3; 1788 | PyObject *__pyx_t_4 = NULL; 1789 | __Pyx_RefNannySetupContext("__pyx_convert_vector_to_py_Simhash_3a__3a_match_t", 0); 1790 | 1791 | /* "vector.to_py":68 1792 | * @cname("__pyx_convert_vector_to_py_Simhash_3a__3a_match_t") 1793 | * cdef object __pyx_convert_vector_to_py_Simhash_3a__3a_match_t(vector[X]& v): 1794 | * return [X_to_py(v[i]) for i in range(v.size())] # <<<<<<<<<<<<<< 1795 | * 1796 | * 1797 | */ 1798 | __Pyx_XDECREF(__pyx_r); 1799 | __pyx_t_1 = PyList_New(0); if (unlikely(!__pyx_t_1)) __PYX_ERR(1, 68, __pyx_L1_error) 1800 | __Pyx_GOTREF(__pyx_t_1); 1801 | __pyx_t_2 = __pyx_v_v.size(); 1802 | for (__pyx_t_3 = 0; __pyx_t_3 < __pyx_t_2; __pyx_t_3+=1) { 1803 | __pyx_v_i = __pyx_t_3; 1804 | __pyx_t_4 = __pyx_convert_pair_to_py_Simhash_3a__3a_hash_t____Simhash_3a__3a_hash_t((__pyx_v_v[__pyx_v_i])); if (unlikely(!__pyx_t_4)) __PYX_ERR(1, 68, __pyx_L1_error) 1805 | __Pyx_GOTREF(__pyx_t_4); 1806 | if (unlikely(__Pyx_ListComp_Append(__pyx_t_1, (PyObject*)__pyx_t_4))) __PYX_ERR(1, 68, __pyx_L1_error) 1807 | __Pyx_DECREF(__pyx_t_4); __pyx_t_4 = 0; 1808 | } 1809 | __pyx_r = __pyx_t_1; 1810 | __pyx_t_1 = 
0; 1811 | goto __pyx_L0; 1812 | 1813 | /* "vector.to_py":67 1814 | * 1815 | * @cname("__pyx_convert_vector_to_py_Simhash_3a__3a_match_t") 1816 | * cdef object __pyx_convert_vector_to_py_Simhash_3a__3a_match_t(vector[X]& v): # <<<<<<<<<<<<<< 1817 | * return [X_to_py(v[i]) for i in range(v.size())] 1818 | * 1819 | */ 1820 | 1821 | /* function exit code */ 1822 | __pyx_L1_error:; 1823 | __Pyx_XDECREF(__pyx_t_1); 1824 | __Pyx_XDECREF(__pyx_t_4); 1825 | __Pyx_AddTraceback("vector.to_py.__pyx_convert_vector_to_py_Simhash_3a__3a_match_t", __pyx_clineno, __pyx_lineno, __pyx_filename); 1826 | __pyx_r = 0; 1827 | __pyx_L0:; 1828 | __Pyx_XGIVEREF(__pyx_r); 1829 | __Pyx_RefNannyFinishContext(); 1830 | return __pyx_r; 1831 | } 1832 | 1833 | static struct __pyx_obj___pyx_scope_struct____Pyx_CFunc_size__t____hash__t____hash__t___to_py *__pyx_freelist___pyx_scope_struct____Pyx_CFunc_size__t____hash__t____hash__t___to_py[8]; 1834 | static int __pyx_freecount___pyx_scope_struct____Pyx_CFunc_size__t____hash__t____hash__t___to_py = 0; 1835 | 1836 | static PyObject *__pyx_tp_new___pyx_scope_struct____Pyx_CFunc_size__t____hash__t____hash__t___to_py(PyTypeObject *t, CYTHON_UNUSED PyObject *a, CYTHON_UNUSED PyObject *k) { 1837 | PyObject *o; 1838 | if (CYTHON_COMPILING_IN_CPYTHON && likely((__pyx_freecount___pyx_scope_struct____Pyx_CFunc_size__t____hash__t____hash__t___to_py > 0) & (t->tp_basicsize == sizeof(struct __pyx_obj___pyx_scope_struct____Pyx_CFunc_size__t____hash__t____hash__t___to_py)))) { 1839 | o = (PyObject*)__pyx_freelist___pyx_scope_struct____Pyx_CFunc_size__t____hash__t____hash__t___to_py[--__pyx_freecount___pyx_scope_struct____Pyx_CFunc_size__t____hash__t____hash__t___to_py]; 1840 | memset(o, 0, sizeof(struct __pyx_obj___pyx_scope_struct____Pyx_CFunc_size__t____hash__t____hash__t___to_py)); 1841 | (void) PyObject_INIT(o, t); 1842 | } else { 1843 | o = (*t->tp_alloc)(t, 0); 1844 | if (unlikely(!o)) return 0; 1845 | } 1846 | return o; 1847 | } 1848 | 1849 | static void 
__pyx_tp_dealloc___pyx_scope_struct____Pyx_CFunc_size__t____hash__t____hash__t___to_py(PyObject *o) { 1850 | if (CYTHON_COMPILING_IN_CPYTHON && ((__pyx_freecount___pyx_scope_struct____Pyx_CFunc_size__t____hash__t____hash__t___to_py < 8) & (Py_TYPE(o)->tp_basicsize == sizeof(struct __pyx_obj___pyx_scope_struct____Pyx_CFunc_size__t____hash__t____hash__t___to_py)))) { 1851 | __pyx_freelist___pyx_scope_struct____Pyx_CFunc_size__t____hash__t____hash__t___to_py[__pyx_freecount___pyx_scope_struct____Pyx_CFunc_size__t____hash__t____hash__t___to_py++] = ((struct __pyx_obj___pyx_scope_struct____Pyx_CFunc_size__t____hash__t____hash__t___to_py *)o); 1852 | } else { 1853 | (*Py_TYPE(o)->tp_free)(o); 1854 | } 1855 | } 1856 | 1857 | static PyTypeObject __pyx_scope_struct____Pyx_CFunc_size__t____hash__t____hash__t___to_py = { 1858 | PyVarObject_HEAD_INIT(0, 0) 1859 | "simhash.simhash.__pyx_scope_struct____Pyx_CFunc_size__t____hash__t____hash__t___to_py", /*tp_name*/ 1860 | sizeof(struct __pyx_obj___pyx_scope_struct____Pyx_CFunc_size__t____hash__t____hash__t___to_py), /*tp_basicsize*/ 1861 | 0, /*tp_itemsize*/ 1862 | __pyx_tp_dealloc___pyx_scope_struct____Pyx_CFunc_size__t____hash__t____hash__t___to_py, /*tp_dealloc*/ 1863 | 0, /*tp_print*/ 1864 | 0, /*tp_getattr*/ 1865 | 0, /*tp_setattr*/ 1866 | #if PY_MAJOR_VERSION < 3 1867 | 0, /*tp_compare*/ 1868 | #endif 1869 | #if PY_MAJOR_VERSION >= 3 1870 | 0, /*tp_as_async*/ 1871 | #endif 1872 | 0, /*tp_repr*/ 1873 | 0, /*tp_as_number*/ 1874 | 0, /*tp_as_sequence*/ 1875 | 0, /*tp_as_mapping*/ 1876 | 0, /*tp_hash*/ 1877 | 0, /*tp_call*/ 1878 | 0, /*tp_str*/ 1879 | 0, /*tp_getattro*/ 1880 | 0, /*tp_setattro*/ 1881 | 0, /*tp_as_buffer*/ 1882 | Py_TPFLAGS_DEFAULT|Py_TPFLAGS_HAVE_VERSION_TAG|Py_TPFLAGS_CHECKTYPES|Py_TPFLAGS_HAVE_NEWBUFFER, /*tp_flags*/ 1883 | 0, /*tp_doc*/ 1884 | 0, /*tp_traverse*/ 1885 | 0, /*tp_clear*/ 1886 | 0, /*tp_richcompare*/ 1887 | 0, /*tp_weaklistoffset*/ 1888 | 0, /*tp_iter*/ 1889 | 0, /*tp_iternext*/ 1890 | 0, 
/*tp_methods*/ 1891 | 0, /*tp_members*/ 1892 | 0, /*tp_getset*/ 1893 | 0, /*tp_base*/ 1894 | 0, /*tp_dict*/ 1895 | 0, /*tp_descr_get*/ 1896 | 0, /*tp_descr_set*/ 1897 | 0, /*tp_dictoffset*/ 1898 | 0, /*tp_init*/ 1899 | 0, /*tp_alloc*/ 1900 | __pyx_tp_new___pyx_scope_struct____Pyx_CFunc_size__t____hash__t____hash__t___to_py, /*tp_new*/ 1901 | 0, /*tp_free*/ 1902 | 0, /*tp_is_gc*/ 1903 | 0, /*tp_bases*/ 1904 | 0, /*tp_mro*/ 1905 | 0, /*tp_cache*/ 1906 | 0, /*tp_subclasses*/ 1907 | 0, /*tp_weaklist*/ 1908 | 0, /*tp_del*/ 1909 | 0, /*tp_version_tag*/ 1910 | #if PY_VERSION_HEX >= 0x030400a1 1911 | 0, /*tp_finalize*/ 1912 | #endif 1913 | }; 1914 | 1915 | static PyMethodDef __pyx_methods[] = { 1916 | {"compute", (PyCFunction)__pyx_pw_7simhash_7simhash_3compute, METH_O, __pyx_doc_7simhash_7simhash_2compute}, 1917 | {"find_all", (PyCFunction)__pyx_pw_7simhash_7simhash_5find_all, METH_VARARGS|METH_KEYWORDS, __pyx_doc_7simhash_7simhash_4find_all}, 1918 | {0, 0, 0, 0} 1919 | }; 1920 | 1921 | #if PY_MAJOR_VERSION >= 3 1922 | static struct PyModuleDef __pyx_moduledef = { 1923 | #if PY_VERSION_HEX < 0x03020000 1924 | { PyObject_HEAD_INIT(NULL) NULL, 0, NULL }, 1925 | #else 1926 | PyModuleDef_HEAD_INIT, 1927 | #endif 1928 | "simhash", 1929 | 0, /* m_doc */ 1930 | -1, /* m_size */ 1931 | __pyx_methods /* m_methods */, 1932 | NULL, /* m_reload */ 1933 | NULL, /* m_traverse */ 1934 | NULL, /* m_clear */ 1935 | NULL /* m_free */ 1936 | }; 1937 | #endif 1938 | 1939 | static __Pyx_StringTabEntry __pyx_string_tab[] = { 1940 | {&__pyx_n_s_Pyx_CFunc_size__t____hash__t, __pyx_k_Pyx_CFunc_size__t____hash__t, sizeof(__pyx_k_Pyx_CFunc_size__t____hash__t), 0, 0, 1, 1}, 1941 | {&__pyx_kp_s_Q, __pyx_k_Q, sizeof(__pyx_k_Q), 0, 0, 1, 0}, 1942 | {&__pyx_n_s_a, __pyx_k_a, sizeof(__pyx_k_a), 0, 0, 1, 1}, 1943 | {&__pyx_n_s_b, __pyx_k_b, sizeof(__pyx_k_b), 0, 0, 1, 1}, 1944 | {&__pyx_n_s_cfunc_to_py, __pyx_k_cfunc_to_py, sizeof(__pyx_k_cfunc_to_py), 0, 0, 1, 1}, 1945 | {&__pyx_n_s_different_bits, 
__pyx_k_different_bits, sizeof(__pyx_k_different_bits), 0, 0, 1, 1}, 1946 | {&__pyx_n_s_digest, __pyx_k_digest, sizeof(__pyx_k_digest), 0, 0, 1, 1}, 1947 | {&__pyx_n_s_hashes, __pyx_k_hashes, sizeof(__pyx_k_hashes), 0, 0, 1, 1}, 1948 | {&__pyx_n_s_hashlib, __pyx_k_hashlib, sizeof(__pyx_k_hashlib), 0, 0, 1, 1}, 1949 | {&__pyx_n_s_import, __pyx_k_import, sizeof(__pyx_k_import), 0, 0, 1, 1}, 1950 | {&__pyx_n_s_main, __pyx_k_main, sizeof(__pyx_k_main), 0, 0, 1, 1}, 1951 | {&__pyx_n_s_md5, __pyx_k_md5, sizeof(__pyx_k_md5), 0, 0, 1, 1}, 1952 | {&__pyx_n_s_number_of_blocks, __pyx_k_number_of_blocks, sizeof(__pyx_k_number_of_blocks), 0, 0, 1, 1}, 1953 | {&__pyx_n_s_obj, __pyx_k_obj, sizeof(__pyx_k_obj), 0, 0, 1, 1}, 1954 | {&__pyx_n_s_range, __pyx_k_range, sizeof(__pyx_k_range), 0, 0, 1, 1}, 1955 | {&__pyx_n_s_simhash_simhash, __pyx_k_simhash_simhash, sizeof(__pyx_k_simhash_simhash), 0, 0, 1, 1}, 1956 | {&__pyx_kp_s_stringsource, __pyx_k_stringsource, sizeof(__pyx_k_stringsource), 0, 0, 1, 0}, 1957 | {&__pyx_n_s_struct, __pyx_k_struct, sizeof(__pyx_k_struct), 0, 0, 1, 1}, 1958 | {&__pyx_n_s_test, __pyx_k_test, sizeof(__pyx_k_test), 0, 0, 1, 1}, 1959 | {&__pyx_n_s_unpack, __pyx_k_unpack, sizeof(__pyx_k_unpack), 0, 0, 1, 1}, 1960 | {&__pyx_n_s_unsigned_hash, __pyx_k_unsigned_hash, sizeof(__pyx_k_unsigned_hash), 0, 0, 1, 1}, 1961 | {&__pyx_kp_s_vagrant_simhash_simhash_pyx, __pyx_k_vagrant_simhash_simhash_pyx, sizeof(__pyx_k_vagrant_simhash_simhash_pyx), 0, 0, 1, 0}, 1962 | {&__pyx_n_s_wrap, __pyx_k_wrap, sizeof(__pyx_k_wrap), 0, 0, 1, 1}, 1963 | {0, 0, 0, 0, 0, 0, 0} 1964 | }; 1965 | static int __Pyx_InitCachedBuiltins(void) { 1966 | __pyx_builtin_range = __Pyx_GetBuiltinName(__pyx_n_s_range); if (!__pyx_builtin_range) __PYX_ERR(1, 68, __pyx_L1_error) 1967 | return 0; 1968 | __pyx_L1_error:; 1969 | return -1; 1970 | } 1971 | 1972 | static int __Pyx_InitCachedConstants(void) { 1973 | __Pyx_RefNannyDeclarations 1974 | __Pyx_RefNannySetupContext("__Pyx_InitCachedConstants", 0); 
1975 | 1976 | /* "simhash/simhash.pyx":10 1977 | * def unsigned_hash(bytes obj): 1978 | * '''Returns a hash suitable for use as a hash_t.''' 1979 | * return struct.unpack('>Q', hashlib.md5(obj).digest()[0:8])[0] & 0xFFFFFFFFFFFFFFFF # <<<<<<<<<<<<<< 1980 | * 1981 | * def compute(hashes): 1982 | */ 1983 | __pyx_slice_ = PySlice_New(__pyx_int_0, __pyx_int_8, Py_None); if (unlikely(!__pyx_slice_)) __PYX_ERR(0, 10, __pyx_L1_error) 1984 | __Pyx_GOTREF(__pyx_slice_); 1985 | __Pyx_GIVEREF(__pyx_slice_); 1986 | 1987 | /* "cfunc.to_py":65 1988 | * @cname("__Pyx_CFunc_size__t____hash__t____hash__t___to_py") 1989 | * cdef object __Pyx_CFunc_size__t____hash__t____hash__t___to_py(size_t (*f)(hash_t, hash_t) except *): 1990 | * def wrap(hash_t a, hash_t b): # <<<<<<<<<<<<<< 1991 | * """wrap(a: 'hash_t', b: 'hash_t') -> 'size_t'""" 1992 | * return f(a, b) 1993 | */ 1994 | __pyx_tuple__2 = PyTuple_Pack(2, __pyx_n_s_a, __pyx_n_s_b); if (unlikely(!__pyx_tuple__2)) __PYX_ERR(1, 65, __pyx_L1_error) 1995 | __Pyx_GOTREF(__pyx_tuple__2); 1996 | __Pyx_GIVEREF(__pyx_tuple__2); 1997 | __pyx_codeobj__3 = (PyObject*)__Pyx_PyCode_New(2, 0, 2, 0, 0, __pyx_empty_bytes, __pyx_empty_tuple, __pyx_empty_tuple, __pyx_tuple__2, __pyx_empty_tuple, __pyx_empty_tuple, __pyx_kp_s_stringsource, __pyx_n_s_wrap, 65, __pyx_empty_bytes); if (unlikely(!__pyx_codeobj__3)) __PYX_ERR(1, 65, __pyx_L1_error) 1998 | 1999 | /* "simhash/simhash.pyx":8 2000 | * 2001 | * 2002 | * def unsigned_hash(bytes obj): # <<<<<<<<<<<<<< 2003 | * '''Returns a hash suitable for use as a hash_t.''' 2004 | * return struct.unpack('>Q', hashlib.md5(obj).digest()[0:8])[0] & 0xFFFFFFFFFFFFFFFF 2005 | */ 2006 | __pyx_tuple__4 = PyTuple_Pack(1, __pyx_n_s_obj); if (unlikely(!__pyx_tuple__4)) __PYX_ERR(0, 8, __pyx_L1_error) 2007 | __Pyx_GOTREF(__pyx_tuple__4); 2008 | __Pyx_GIVEREF(__pyx_tuple__4); 2009 | __pyx_codeobj__5 = (PyObject*)__Pyx_PyCode_New(1, 0, 1, 0, 0, __pyx_empty_bytes, __pyx_empty_tuple, __pyx_empty_tuple, __pyx_tuple__4, 
__pyx_empty_tuple, __pyx_empty_tuple, __pyx_kp_s_vagrant_simhash_simhash_pyx, __pyx_n_s_unsigned_hash, 8, __pyx_empty_bytes); if (unlikely(!__pyx_codeobj__5)) __PYX_ERR(0, 8, __pyx_L1_error) 2010 | __Pyx_RefNannyFinishContext(); 2011 | return 0; 2012 | __pyx_L1_error:; 2013 | __Pyx_RefNannyFinishContext(); 2014 | return -1; 2015 | } 2016 | 2017 | static int __Pyx_InitGlobals(void) { 2018 | if (__Pyx_InitStrings(__pyx_string_tab) < 0) __PYX_ERR(0, 1, __pyx_L1_error); 2019 | __pyx_int_0 = PyInt_FromLong(0); if (unlikely(!__pyx_int_0)) __PYX_ERR(0, 1, __pyx_L1_error) 2020 | __pyx_int_8 = PyInt_FromLong(8); if (unlikely(!__pyx_int_8)) __PYX_ERR(0, 1, __pyx_L1_error) 2021 | __pyx_int_18446744073709551615 = PyInt_FromString((char *)"18446744073709551615", 0, 0); if (unlikely(!__pyx_int_18446744073709551615)) __PYX_ERR(0, 1, __pyx_L1_error) 2022 | return 0; 2023 | __pyx_L1_error:; 2024 | return -1; 2025 | } 2026 | 2027 | #if PY_MAJOR_VERSION < 3 2028 | PyMODINIT_FUNC initsimhash(void); /*proto*/ 2029 | PyMODINIT_FUNC initsimhash(void) 2030 | #else 2031 | PyMODINIT_FUNC PyInit_simhash(void); /*proto*/ 2032 | PyMODINIT_FUNC PyInit_simhash(void) 2033 | #endif 2034 | { 2035 | PyObject *__pyx_t_1 = NULL; 2036 | __Pyx_RefNannyDeclarations 2037 | #if CYTHON_REFNANNY 2038 | __Pyx_RefNanny = __Pyx_RefNannyImportAPI("refnanny"); 2039 | if (!__Pyx_RefNanny) { 2040 | PyErr_Clear(); 2041 | __Pyx_RefNanny = __Pyx_RefNannyImportAPI("Cython.Runtime.refnanny"); 2042 | if (!__Pyx_RefNanny) 2043 | Py_FatalError("failed to import 'refnanny' module"); 2044 | } 2045 | #endif 2046 | __Pyx_RefNannySetupContext("PyMODINIT_FUNC PyInit_simhash(void)", 0); 2047 | if (__Pyx_check_binary_version() < 0) __PYX_ERR(0, 1, __pyx_L1_error) 2048 | __pyx_empty_tuple = PyTuple_New(0); if (unlikely(!__pyx_empty_tuple)) __PYX_ERR(0, 1, __pyx_L1_error) 2049 | __pyx_empty_bytes = PyBytes_FromStringAndSize("", 0); if (unlikely(!__pyx_empty_bytes)) __PYX_ERR(0, 1, __pyx_L1_error) 2050 | __pyx_empty_unicode = 
PyUnicode_FromStringAndSize("", 0); if (unlikely(!__pyx_empty_unicode)) __PYX_ERR(0, 1, __pyx_L1_error) 2051 | #ifdef __Pyx_CyFunction_USED 2052 | if (__pyx_CyFunction_init() < 0) __PYX_ERR(0, 1, __pyx_L1_error) 2053 | #endif 2054 | #ifdef __Pyx_FusedFunction_USED 2055 | if (__pyx_FusedFunction_init() < 0) __PYX_ERR(0, 1, __pyx_L1_error) 2056 | #endif 2057 | #ifdef __Pyx_Coroutine_USED 2058 | if (__pyx_Coroutine_init() < 0) __PYX_ERR(0, 1, __pyx_L1_error) 2059 | #endif 2060 | #ifdef __Pyx_Generator_USED 2061 | if (__pyx_Generator_init() < 0) __PYX_ERR(0, 1, __pyx_L1_error) 2062 | #endif 2063 | #ifdef __Pyx_StopAsyncIteration_USED 2064 | if (__pyx_StopAsyncIteration_init() < 0) __PYX_ERR(0, 1, __pyx_L1_error) 2065 | #endif 2066 | /*--- Library function declarations ---*/ 2067 | /*--- Threads initialization code ---*/ 2068 | #if defined(__PYX_FORCE_INIT_THREADS) && __PYX_FORCE_INIT_THREADS 2069 | #ifdef WITH_THREAD /* Python build with threading support? */ 2070 | PyEval_InitThreads(); 2071 | #endif 2072 | #endif 2073 | /*--- Module creation code ---*/ 2074 | #if PY_MAJOR_VERSION < 3 2075 | __pyx_m = Py_InitModule4("simhash", __pyx_methods, 0, 0, PYTHON_API_VERSION); Py_XINCREF(__pyx_m); 2076 | #else 2077 | __pyx_m = PyModule_Create(&__pyx_moduledef); 2078 | #endif 2079 | if (unlikely(!__pyx_m)) __PYX_ERR(0, 1, __pyx_L1_error) 2080 | __pyx_d = PyModule_GetDict(__pyx_m); if (unlikely(!__pyx_d)) __PYX_ERR(0, 1, __pyx_L1_error) 2081 | Py_INCREF(__pyx_d); 2082 | __pyx_b = PyImport_AddModule(__Pyx_BUILTIN_MODULE_NAME); if (unlikely(!__pyx_b)) __PYX_ERR(0, 1, __pyx_L1_error) 2083 | #if CYTHON_COMPILING_IN_PYPY 2084 | Py_INCREF(__pyx_b); 2085 | #endif 2086 | if (PyObject_SetAttrString(__pyx_m, "__builtins__", __pyx_b) < 0) __PYX_ERR(0, 1, __pyx_L1_error); 2087 | /*--- Initialize various global constants etc. 
---*/ 2088 | if (__Pyx_InitGlobals() < 0) __PYX_ERR(0, 1, __pyx_L1_error) 2089 | #if PY_MAJOR_VERSION < 3 && (__PYX_DEFAULT_STRING_ENCODING_IS_ASCII || __PYX_DEFAULT_STRING_ENCODING_IS_DEFAULT) 2090 | if (__Pyx_init_sys_getdefaultencoding_params() < 0) __PYX_ERR(0, 1, __pyx_L1_error) 2091 | #endif 2092 | if (__pyx_module_is_main_simhash__simhash) { 2093 | if (PyObject_SetAttrString(__pyx_m, "__name__", __pyx_n_s_main) < 0) __PYX_ERR(0, 1, __pyx_L1_error) 2094 | } 2095 | #if PY_MAJOR_VERSION >= 3 2096 | { 2097 | PyObject *modules = PyImport_GetModuleDict(); if (unlikely(!modules)) __PYX_ERR(0, 1, __pyx_L1_error) 2098 | if (!PyDict_GetItemString(modules, "simhash.simhash")) { 2099 | if (unlikely(PyDict_SetItemString(modules, "simhash.simhash", __pyx_m) < 0)) __PYX_ERR(0, 1, __pyx_L1_error) 2100 | } 2101 | } 2102 | #endif 2103 | /*--- Builtin init code ---*/ 2104 | if (__Pyx_InitCachedBuiltins() < 0) __PYX_ERR(0, 1, __pyx_L1_error) 2105 | /*--- Constants init code ---*/ 2106 | if (__Pyx_InitCachedConstants() < 0) __PYX_ERR(0, 1, __pyx_L1_error) 2107 | /*--- Global init code ---*/ 2108 | /*--- Variable export code ---*/ 2109 | /*--- Function export code ---*/ 2110 | /*--- Type init code ---*/ 2111 | if (PyType_Ready(&__pyx_scope_struct____Pyx_CFunc_size__t____hash__t____hash__t___to_py) < 0) __PYX_ERR(1, 64, __pyx_L1_error) 2112 | __pyx_scope_struct____Pyx_CFunc_size__t____hash__t____hash__t___to_py.tp_print = 0; 2113 | __pyx_ptype___pyx_scope_struct____Pyx_CFunc_size__t____hash__t____hash__t___to_py = &__pyx_scope_struct____Pyx_CFunc_size__t____hash__t____hash__t___to_py; 2114 | /*--- Type import code ---*/ 2115 | /*--- Variable import code ---*/ 2116 | /*--- Function import code ---*/ 2117 | /*--- Execution code ---*/ 2118 | #if defined(__Pyx_Generator_USED) || defined(__Pyx_Coroutine_USED) 2119 | if (__Pyx_patch_abc() < 0) __PYX_ERR(0, 1, __pyx_L1_error) 2120 | #endif 2121 | 2122 | /* "simhash/simhash.pyx":1 2123 | * import hashlib # <<<<<<<<<<<<<< 2124 | * import 
struct 2125 | * 2126 | */ 2127 | __pyx_t_1 = __Pyx_Import(__pyx_n_s_hashlib, 0, -1); if (unlikely(!__pyx_t_1)) __PYX_ERR(0, 1, __pyx_L1_error) 2128 | __Pyx_GOTREF(__pyx_t_1); 2129 | if (PyDict_SetItem(__pyx_d, __pyx_n_s_hashlib, __pyx_t_1) < 0) __PYX_ERR(0, 1, __pyx_L1_error) 2130 | __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0; 2131 | 2132 | /* "simhash/simhash.pyx":2 2133 | * import hashlib 2134 | * import struct # <<<<<<<<<<<<<< 2135 | * 2136 | * from simhash cimport compute as c_compute 2137 | */ 2138 | __pyx_t_1 = __Pyx_Import(__pyx_n_s_struct, 0, -1); if (unlikely(!__pyx_t_1)) __PYX_ERR(0, 2, __pyx_L1_error) 2139 | __Pyx_GOTREF(__pyx_t_1); 2140 | if (PyDict_SetItem(__pyx_d, __pyx_n_s_struct, __pyx_t_1) < 0) __PYX_ERR(0, 2, __pyx_L1_error) 2141 | __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0; 2142 | 2143 | /* "simhash/simhash.pyx":8 2144 | * 2145 | * 2146 | * def unsigned_hash(bytes obj): # <<<<<<<<<<<<<< 2147 | * '''Returns a hash suitable for use as a hash_t.''' 2148 | * return struct.unpack('>Q', hashlib.md5(obj).digest()[0:8])[0] & 0xFFFFFFFFFFFFFFFF 2149 | */ 2150 | __pyx_t_1 = PyCFunction_NewEx(&__pyx_mdef_7simhash_7simhash_1unsigned_hash, NULL, __pyx_n_s_simhash_simhash); if (unlikely(!__pyx_t_1)) __PYX_ERR(0, 8, __pyx_L1_error) 2151 | __Pyx_GOTREF(__pyx_t_1); 2152 | if (PyDict_SetItem(__pyx_d, __pyx_n_s_unsigned_hash, __pyx_t_1) < 0) __PYX_ERR(0, 8, __pyx_L1_error) 2153 | __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0; 2154 | 2155 | /* "simhash/simhash.pyx":1 2156 | * import hashlib # <<<<<<<<<<<<<< 2157 | * import struct 2158 | * 2159 | */ 2160 | __pyx_t_1 = PyDict_New(); if (unlikely(!__pyx_t_1)) __PYX_ERR(0, 1, __pyx_L1_error) 2161 | __Pyx_GOTREF(__pyx_t_1); 2162 | if (PyDict_SetItem(__pyx_d, __pyx_n_s_test, __pyx_t_1) < 0) __PYX_ERR(0, 1, __pyx_L1_error) 2163 | __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0; 2164 | 2165 | /* "vector.to_py":67 2166 | * 2167 | * @cname("__pyx_convert_vector_to_py_Simhash_3a__3a_match_t") 2168 | * cdef object 
__pyx_convert_vector_to_py_Simhash_3a__3a_match_t(vector[X]& v): # <<<<<<<<<<<<<< 2169 | * return [X_to_py(v[i]) for i in range(v.size())] 2170 | * 2171 | */ 2172 | 2173 | /*--- Wrapped vars code ---*/ 2174 | { 2175 | PyObject* wrapped = __Pyx_CFunc_size__t____hash__t____hash__t___to_py(Simhash::num_differing_bits); 2176 | if (unlikely(!wrapped)) __PYX_ERR(2, 23, __pyx_L1_error) 2177 | if (PyObject_SetAttrString(__pyx_m, "num_differing_bits", wrapped) < 0) __PYX_ERR(2, 23, __pyx_L1_error); 2178 | } 2179 | 2180 | goto __pyx_L0; 2181 | __pyx_L1_error:; 2182 | __Pyx_XDECREF(__pyx_t_1); 2183 | if (__pyx_m) { 2184 | if (__pyx_d) { 2185 | __Pyx_AddTraceback("init simhash.simhash", __pyx_clineno, __pyx_lineno, __pyx_filename); 2186 | } 2187 | Py_DECREF(__pyx_m); __pyx_m = 0; 2188 | } else if (!PyErr_Occurred()) { 2189 | PyErr_SetString(PyExc_ImportError, "init simhash.simhash"); 2190 | } 2191 | __pyx_L0:; 2192 | __Pyx_RefNannyFinishContext(); 2193 | #if PY_MAJOR_VERSION < 3 2194 | return; 2195 | #else 2196 | return __pyx_m; 2197 | #endif 2198 | } 2199 | 2200 | /* --- Runtime support code --- */ 2201 | /* Refnanny */ 2202 | #if CYTHON_REFNANNY 2203 | static __Pyx_RefNannyAPIStruct *__Pyx_RefNannyImportAPI(const char *modname) { 2204 | PyObject *m = NULL, *p = NULL; 2205 | void *r = NULL; 2206 | m = PyImport_ImportModule((char *)modname); 2207 | if (!m) goto end; 2208 | p = PyObject_GetAttrString(m, (char *)"RefNannyAPI"); 2209 | if (!p) goto end; 2210 | r = PyLong_AsVoidPtr(p); 2211 | end: 2212 | Py_XDECREF(p); 2213 | Py_XDECREF(m); 2214 | return (__Pyx_RefNannyAPIStruct *)r; 2215 | } 2216 | #endif 2217 | 2218 | /* ArgTypeTest */ 2219 | static void __Pyx_RaiseArgumentTypeInvalid(const char* name, PyObject *obj, PyTypeObject *type) { 2220 | PyErr_Format(PyExc_TypeError, 2221 | "Argument '%.200s' has incorrect type (expected %.200s, got %.200s)", 2222 | name, type->tp_name, Py_TYPE(obj)->tp_name); 2223 | } 2224 | static CYTHON_INLINE int __Pyx_ArgTypeTest(PyObject *obj, 
PyTypeObject *type, int none_allowed, 2225 | const char *name, int exact) 2226 | { 2227 | if (unlikely(!type)) { 2228 | PyErr_SetString(PyExc_SystemError, "Missing type object"); 2229 | return 0; 2230 | } 2231 | if (none_allowed && obj == Py_None) return 1; 2232 | else if (exact) { 2233 | if (likely(Py_TYPE(obj) == type)) return 1; 2234 | #if PY_MAJOR_VERSION == 2 2235 | else if ((type == &PyBaseString_Type) && likely(__Pyx_PyBaseString_CheckExact(obj))) return 1; 2236 | #endif 2237 | } 2238 | else { 2239 | if (likely(PyObject_TypeCheck(obj, type))) return 1; 2240 | } 2241 | __Pyx_RaiseArgumentTypeInvalid(name, obj, type); 2242 | return 0; 2243 | } 2244 | 2245 | /* GetBuiltinName */ 2246 | static PyObject *__Pyx_GetBuiltinName(PyObject *name) { 2247 | PyObject* result = __Pyx_PyObject_GetAttrStr(__pyx_b, name); 2248 | if (unlikely(!result)) { 2249 | PyErr_Format(PyExc_NameError, 2250 | #if PY_MAJOR_VERSION >= 3 2251 | "name '%U' is not defined", name); 2252 | #else 2253 | "name '%.200s' is not defined", PyString_AS_STRING(name)); 2254 | #endif 2255 | } 2256 | return result; 2257 | } 2258 | 2259 | /* GetModuleGlobalName */ 2260 | static CYTHON_INLINE PyObject *__Pyx_GetModuleGlobalName(PyObject *name) { 2261 | PyObject *result; 2262 | #if CYTHON_COMPILING_IN_CPYTHON 2263 | result = PyDict_GetItem(__pyx_d, name); 2264 | if (likely(result)) { 2265 | Py_INCREF(result); 2266 | } else { 2267 | #else 2268 | result = PyObject_GetItem(__pyx_d, name); 2269 | if (!result) { 2270 | PyErr_Clear(); 2271 | #endif 2272 | result = __Pyx_GetBuiltinName(name); 2273 | } 2274 | return result; 2275 | } 2276 | 2277 | /* PyObjectCall */ 2278 | #if CYTHON_COMPILING_IN_CPYTHON 2279 | static CYTHON_INLINE PyObject* __Pyx_PyObject_Call(PyObject *func, PyObject *arg, PyObject *kw) { 2280 | PyObject *result; 2281 | ternaryfunc call = func->ob_type->tp_call; 2282 | if (unlikely(!call)) 2283 | return PyObject_Call(func, arg, kw); 2284 | if (unlikely(Py_EnterRecursiveCall((char*)" while calling a 
Python object"))) 2285 | return NULL; 2286 | result = (*call)(func, arg, kw); 2287 | Py_LeaveRecursiveCall(); 2288 | if (unlikely(!result) && unlikely(!PyErr_Occurred())) { 2289 | PyErr_SetString( 2290 | PyExc_SystemError, 2291 | "NULL result without error in PyObject_Call"); 2292 | } 2293 | return result; 2294 | } 2295 | #endif 2296 | 2297 | /* PyObjectCallMethO */ 2298 | #if CYTHON_COMPILING_IN_CPYTHON 2299 | static CYTHON_INLINE PyObject* __Pyx_PyObject_CallMethO(PyObject *func, PyObject *arg) { 2300 | PyObject *self, *result; 2301 | PyCFunction cfunc; 2302 | cfunc = PyCFunction_GET_FUNCTION(func); 2303 | self = PyCFunction_GET_SELF(func); 2304 | if (unlikely(Py_EnterRecursiveCall((char*)" while calling a Python object"))) 2305 | return NULL; 2306 | result = cfunc(self, arg); 2307 | Py_LeaveRecursiveCall(); 2308 | if (unlikely(!result) && unlikely(!PyErr_Occurred())) { 2309 | PyErr_SetString( 2310 | PyExc_SystemError, 2311 | "NULL result without error in PyObject_Call"); 2312 | } 2313 | return result; 2314 | } 2315 | #endif 2316 | 2317 | /* PyObjectCallOneArg */ 2318 | #if CYTHON_COMPILING_IN_CPYTHON 2319 | static PyObject* __Pyx__PyObject_CallOneArg(PyObject *func, PyObject *arg) { 2320 | PyObject *result; 2321 | PyObject *args = PyTuple_New(1); 2322 | if (unlikely(!args)) return NULL; 2323 | Py_INCREF(arg); 2324 | PyTuple_SET_ITEM(args, 0, arg); 2325 | result = __Pyx_PyObject_Call(func, args, NULL); 2326 | Py_DECREF(args); 2327 | return result; 2328 | } 2329 | static CYTHON_INLINE PyObject* __Pyx_PyObject_CallOneArg(PyObject *func, PyObject *arg) { 2330 | #ifdef __Pyx_CyFunction_USED 2331 | if (likely(PyCFunction_Check(func) || PyObject_TypeCheck(func, __pyx_CyFunctionType))) { 2332 | #else 2333 | if (likely(PyCFunction_Check(func))) { 2334 | #endif 2335 | if (likely(PyCFunction_GET_FLAGS(func) & METH_O)) { 2336 | return __Pyx_PyObject_CallMethO(func, arg); 2337 | } 2338 | } 2339 | return __Pyx__PyObject_CallOneArg(func, arg); 2340 | } 2341 | #else 2342 | 
static CYTHON_INLINE PyObject* __Pyx_PyObject_CallOneArg(PyObject *func, PyObject *arg) { 2343 | PyObject *result; 2344 | PyObject *args = PyTuple_Pack(1, arg); 2345 | if (unlikely(!args)) return NULL; 2346 | result = __Pyx_PyObject_Call(func, args, NULL); 2347 | Py_DECREF(args); 2348 | return result; 2349 | } 2350 | #endif 2351 | 2352 | /* PyObjectCallNoArg */ 2353 | #if CYTHON_COMPILING_IN_CPYTHON 2354 | static CYTHON_INLINE PyObject* __Pyx_PyObject_CallNoArg(PyObject *func) { 2355 | #ifdef __Pyx_CyFunction_USED 2356 | if (likely(PyCFunction_Check(func) || PyObject_TypeCheck(func, __pyx_CyFunctionType))) { 2357 | #else 2358 | if (likely(PyCFunction_Check(func))) { 2359 | #endif 2360 | if (likely(PyCFunction_GET_FLAGS(func) & METH_NOARGS)) { 2361 | return __Pyx_PyObject_CallMethO(func, NULL); 2362 | } 2363 | } 2364 | return __Pyx_PyObject_Call(func, __pyx_empty_tuple, NULL); 2365 | } 2366 | #endif 2367 | 2368 | /* SliceObject */ 2369 | static CYTHON_INLINE PyObject* __Pyx_PyObject_GetSlice(PyObject* obj, 2370 | Py_ssize_t cstart, Py_ssize_t cstop, 2371 | PyObject** _py_start, PyObject** _py_stop, PyObject** _py_slice, 2372 | int has_cstart, int has_cstop, CYTHON_UNUSED int wraparound) { 2373 | #if CYTHON_COMPILING_IN_CPYTHON 2374 | PyMappingMethods* mp; 2375 | #if PY_MAJOR_VERSION < 3 2376 | PySequenceMethods* ms = Py_TYPE(obj)->tp_as_sequence; 2377 | if (likely(ms && ms->sq_slice)) { 2378 | if (!has_cstart) { 2379 | if (_py_start && (*_py_start != Py_None)) { 2380 | cstart = __Pyx_PyIndex_AsSsize_t(*_py_start); 2381 | if ((cstart == (Py_ssize_t)-1) && PyErr_Occurred()) goto bad; 2382 | } else 2383 | cstart = 0; 2384 | } 2385 | if (!has_cstop) { 2386 | if (_py_stop && (*_py_stop != Py_None)) { 2387 | cstop = __Pyx_PyIndex_AsSsize_t(*_py_stop); 2388 | if ((cstop == (Py_ssize_t)-1) && PyErr_Occurred()) goto bad; 2389 | } else 2390 | cstop = PY_SSIZE_T_MAX; 2391 | } 2392 | if (wraparound && unlikely((cstart < 0) | (cstop < 0)) && likely(ms->sq_length)) { 2393 | 
Py_ssize_t l = ms->sq_length(obj); 2394 | if (likely(l >= 0)) { 2395 | if (cstop < 0) { 2396 | cstop += l; 2397 | if (cstop < 0) cstop = 0; 2398 | } 2399 | if (cstart < 0) { 2400 | cstart += l; 2401 | if (cstart < 0) cstart = 0; 2402 | } 2403 | } else { 2404 | if (!PyErr_ExceptionMatches(PyExc_OverflowError)) 2405 | goto bad; 2406 | PyErr_Clear(); 2407 | } 2408 | } 2409 | return ms->sq_slice(obj, cstart, cstop); 2410 | } 2411 | #endif 2412 | mp = Py_TYPE(obj)->tp_as_mapping; 2413 | if (likely(mp && mp->mp_subscript)) 2414 | #endif 2415 | { 2416 | PyObject* result; 2417 | PyObject *py_slice, *py_start, *py_stop; 2418 | if (_py_slice) { 2419 | py_slice = *_py_slice; 2420 | } else { 2421 | PyObject* owned_start = NULL; 2422 | PyObject* owned_stop = NULL; 2423 | if (_py_start) { 2424 | py_start = *_py_start; 2425 | } else { 2426 | if (has_cstart) { 2427 | owned_start = py_start = PyInt_FromSsize_t(cstart); 2428 | if (unlikely(!py_start)) goto bad; 2429 | } else 2430 | py_start = Py_None; 2431 | } 2432 | if (_py_stop) { 2433 | py_stop = *_py_stop; 2434 | } else { 2435 | if (has_cstop) { 2436 | owned_stop = py_stop = PyInt_FromSsize_t(cstop); 2437 | if (unlikely(!py_stop)) { 2438 | Py_XDECREF(owned_start); 2439 | goto bad; 2440 | } 2441 | } else 2442 | py_stop = Py_None; 2443 | } 2444 | py_slice = PySlice_New(py_start, py_stop, Py_None); 2445 | Py_XDECREF(owned_start); 2446 | Py_XDECREF(owned_stop); 2447 | if (unlikely(!py_slice)) goto bad; 2448 | } 2449 | #if CYTHON_COMPILING_IN_CPYTHON 2450 | result = mp->mp_subscript(obj, py_slice); 2451 | #else 2452 | result = PyObject_GetItem(obj, py_slice); 2453 | #endif 2454 | if (!_py_slice) { 2455 | Py_DECREF(py_slice); 2456 | } 2457 | return result; 2458 | } 2459 | PyErr_Format(PyExc_TypeError, 2460 | "'%.200s' object is unsliceable", Py_TYPE(obj)->tp_name); 2461 | bad: 2462 | return NULL; 2463 | } 2464 | 2465 | /* GetItemInt */ 2466 | static CYTHON_INLINE PyObject *__Pyx_GetItemInt_Generic(PyObject *o, PyObject* j) { 2467 | 
PyObject *r; 2468 | if (!j) return NULL; 2469 | r = PyObject_GetItem(o, j); 2470 | Py_DECREF(j); 2471 | return r; 2472 | } 2473 | static CYTHON_INLINE PyObject *__Pyx_GetItemInt_List_Fast(PyObject *o, Py_ssize_t i, 2474 | CYTHON_NCP_UNUSED int wraparound, 2475 | CYTHON_NCP_UNUSED int boundscheck) { 2476 | #if CYTHON_COMPILING_IN_CPYTHON 2477 | if (wraparound & unlikely(i < 0)) i += PyList_GET_SIZE(o); 2478 | if ((!boundscheck) || likely((0 <= i) & (i < PyList_GET_SIZE(o)))) { 2479 | PyObject *r = PyList_GET_ITEM(o, i); 2480 | Py_INCREF(r); 2481 | return r; 2482 | } 2483 | return __Pyx_GetItemInt_Generic(o, PyInt_FromSsize_t(i)); 2484 | #else 2485 | return PySequence_GetItem(o, i); 2486 | #endif 2487 | } 2488 | static CYTHON_INLINE PyObject *__Pyx_GetItemInt_Tuple_Fast(PyObject *o, Py_ssize_t i, 2489 | CYTHON_NCP_UNUSED int wraparound, 2490 | CYTHON_NCP_UNUSED int boundscheck) { 2491 | #if CYTHON_COMPILING_IN_CPYTHON 2492 | if (wraparound & unlikely(i < 0)) i += PyTuple_GET_SIZE(o); 2493 | if ((!boundscheck) || likely((0 <= i) & (i < PyTuple_GET_SIZE(o)))) { 2494 | PyObject *r = PyTuple_GET_ITEM(o, i); 2495 | Py_INCREF(r); 2496 | return r; 2497 | } 2498 | return __Pyx_GetItemInt_Generic(o, PyInt_FromSsize_t(i)); 2499 | #else 2500 | return PySequence_GetItem(o, i); 2501 | #endif 2502 | } 2503 | static CYTHON_INLINE PyObject *__Pyx_GetItemInt_Fast(PyObject *o, Py_ssize_t i, int is_list, 2504 | CYTHON_NCP_UNUSED int wraparound, 2505 | CYTHON_NCP_UNUSED int boundscheck) { 2506 | #if CYTHON_COMPILING_IN_CPYTHON 2507 | if (is_list || PyList_CheckExact(o)) { 2508 | Py_ssize_t n = ((!wraparound) | likely(i >= 0)) ? i : i + PyList_GET_SIZE(o); 2509 | if ((!boundscheck) || (likely((n >= 0) & (n < PyList_GET_SIZE(o))))) { 2510 | PyObject *r = PyList_GET_ITEM(o, n); 2511 | Py_INCREF(r); 2512 | return r; 2513 | } 2514 | } 2515 | else if (PyTuple_CheckExact(o)) { 2516 | Py_ssize_t n = ((!wraparound) | likely(i >= 0)) ? 
i : i + PyTuple_GET_SIZE(o); 2517 | if ((!boundscheck) || likely((n >= 0) & (n < PyTuple_GET_SIZE(o)))) { 2518 | PyObject *r = PyTuple_GET_ITEM(o, n); 2519 | Py_INCREF(r); 2520 | return r; 2521 | } 2522 | } else { 2523 | PySequenceMethods *m = Py_TYPE(o)->tp_as_sequence; 2524 | if (likely(m && m->sq_item)) { 2525 | if (wraparound && unlikely(i < 0) && likely(m->sq_length)) { 2526 | Py_ssize_t l = m->sq_length(o); 2527 | if (likely(l >= 0)) { 2528 | i += l; 2529 | } else { 2530 | if (!PyErr_ExceptionMatches(PyExc_OverflowError)) 2531 | return NULL; 2532 | PyErr_Clear(); 2533 | } 2534 | } 2535 | return m->sq_item(o, i); 2536 | } 2537 | } 2538 | #else 2539 | if (is_list || PySequence_Check(o)) { 2540 | return PySequence_GetItem(o, i); 2541 | } 2542 | #endif 2543 | return __Pyx_GetItemInt_Generic(o, PyInt_FromSsize_t(i)); 2544 | } 2545 | 2546 | /* RaiseArgTupleInvalid */ 2547 | static void __Pyx_RaiseArgtupleInvalid( 2548 | const char* func_name, 2549 | int exact, 2550 | Py_ssize_t num_min, 2551 | Py_ssize_t num_max, 2552 | Py_ssize_t num_found) 2553 | { 2554 | Py_ssize_t num_expected; 2555 | const char *more_or_less; 2556 | if (num_found < num_min) { 2557 | num_expected = num_min; 2558 | more_or_less = "at least"; 2559 | } else { 2560 | num_expected = num_max; 2561 | more_or_less = "at most"; 2562 | } 2563 | if (exact) { 2564 | more_or_less = "exactly"; 2565 | } 2566 | PyErr_Format(PyExc_TypeError, 2567 | "%.200s() takes %.8s %" CYTHON_FORMAT_SSIZE_T "d positional argument%.1s (%" CYTHON_FORMAT_SSIZE_T "d given)", 2568 | func_name, more_or_less, num_expected, 2569 | (num_expected == 1) ? 
"" : "s", num_found); 2570 | } 2571 | 2572 | /* RaiseDoubleKeywords */ 2573 | static void __Pyx_RaiseDoubleKeywordsError( 2574 | const char* func_name, 2575 | PyObject* kw_name) 2576 | { 2577 | PyErr_Format(PyExc_TypeError, 2578 | #if PY_MAJOR_VERSION >= 3 2579 | "%s() got multiple values for keyword argument '%U'", func_name, kw_name); 2580 | #else 2581 | "%s() got multiple values for keyword argument '%s'", func_name, 2582 | PyString_AsString(kw_name)); 2583 | #endif 2584 | } 2585 | 2586 | /* ParseKeywords */ 2587 | static int __Pyx_ParseOptionalKeywords( 2588 | PyObject *kwds, 2589 | PyObject **argnames[], 2590 | PyObject *kwds2, 2591 | PyObject *values[], 2592 | Py_ssize_t num_pos_args, 2593 | const char* function_name) 2594 | { 2595 | PyObject *key = 0, *value = 0; 2596 | Py_ssize_t pos = 0; 2597 | PyObject*** name; 2598 | PyObject*** first_kw_arg = argnames + num_pos_args; 2599 | while (PyDict_Next(kwds, &pos, &key, &value)) { 2600 | name = first_kw_arg; 2601 | while (*name && (**name != key)) name++; 2602 | if (*name) { 2603 | values[name-argnames] = value; 2604 | continue; 2605 | } 2606 | name = first_kw_arg; 2607 | #if PY_MAJOR_VERSION < 3 2608 | if (likely(PyString_CheckExact(key)) || likely(PyString_Check(key))) { 2609 | while (*name) { 2610 | if ((CYTHON_COMPILING_IN_PYPY || PyString_GET_SIZE(**name) == PyString_GET_SIZE(key)) 2611 | && _PyString_Eq(**name, key)) { 2612 | values[name-argnames] = value; 2613 | break; 2614 | } 2615 | name++; 2616 | } 2617 | if (*name) continue; 2618 | else { 2619 | PyObject*** argname = argnames; 2620 | while (argname != first_kw_arg) { 2621 | if ((**argname == key) || ( 2622 | (CYTHON_COMPILING_IN_PYPY || PyString_GET_SIZE(**argname) == PyString_GET_SIZE(key)) 2623 | && _PyString_Eq(**argname, key))) { 2624 | goto arg_passed_twice; 2625 | } 2626 | argname++; 2627 | } 2628 | } 2629 | } else 2630 | #endif 2631 | if (likely(PyUnicode_Check(key))) { 2632 | while (*name) { 2633 | int cmp = (**name == key) ? 
0 : 2634 | #if !CYTHON_COMPILING_IN_PYPY && PY_MAJOR_VERSION >= 3 2635 | (PyUnicode_GET_SIZE(**name) != PyUnicode_GET_SIZE(key)) ? 1 : 2636 | #endif 2637 | PyUnicode_Compare(**name, key); 2638 | if (cmp < 0 && unlikely(PyErr_Occurred())) goto bad; 2639 | if (cmp == 0) { 2640 | values[name-argnames] = value; 2641 | break; 2642 | } 2643 | name++; 2644 | } 2645 | if (*name) continue; 2646 | else { 2647 | PyObject*** argname = argnames; 2648 | while (argname != first_kw_arg) { 2649 | int cmp = (**argname == key) ? 0 : 2650 | #if !CYTHON_COMPILING_IN_PYPY && PY_MAJOR_VERSION >= 3 2651 | (PyUnicode_GET_SIZE(**argname) != PyUnicode_GET_SIZE(key)) ? 1 : 2652 | #endif 2653 | PyUnicode_Compare(**argname, key); 2654 | if (cmp < 0 && unlikely(PyErr_Occurred())) goto bad; 2655 | if (cmp == 0) goto arg_passed_twice; 2656 | argname++; 2657 | } 2658 | } 2659 | } else 2660 | goto invalid_keyword_type; 2661 | if (kwds2) { 2662 | if (unlikely(PyDict_SetItem(kwds2, key, value))) goto bad; 2663 | } else { 2664 | goto invalid_keyword; 2665 | } 2666 | } 2667 | return 0; 2668 | arg_passed_twice: 2669 | __Pyx_RaiseDoubleKeywordsError(function_name, key); 2670 | goto bad; 2671 | invalid_keyword_type: 2672 | PyErr_Format(PyExc_TypeError, 2673 | "%.200s() keywords must be strings", function_name); 2674 | goto bad; 2675 | invalid_keyword: 2676 | PyErr_Format(PyExc_TypeError, 2677 | #if PY_MAJOR_VERSION < 3 2678 | "%.200s() got an unexpected keyword argument '%.200s'", 2679 | function_name, PyString_AsString(key)); 2680 | #else 2681 | "%s() got an unexpected keyword argument '%U'", 2682 | function_name, key); 2683 | #endif 2684 | bad: 2685 | return -1; 2686 | } 2687 | 2688 | /* FetchCommonType */ 2689 | static PyTypeObject* __Pyx_FetchCommonType(PyTypeObject* type) { 2690 | PyObject* fake_module; 2691 | PyTypeObject* cached_type = NULL; 2692 | fake_module = PyImport_AddModule((char*) "_cython_" CYTHON_ABI); 2693 | if (!fake_module) return NULL; 2694 | Py_INCREF(fake_module); 2695 | cached_type 
= (PyTypeObject*) PyObject_GetAttrString(fake_module, type->tp_name); 2696 | if (cached_type) { 2697 | if (!PyType_Check((PyObject*)cached_type)) { 2698 | PyErr_Format(PyExc_TypeError, 2699 | "Shared Cython type %.200s is not a type object", 2700 | type->tp_name); 2701 | goto bad; 2702 | } 2703 | if (cached_type->tp_basicsize != type->tp_basicsize) { 2704 | PyErr_Format(PyExc_TypeError, 2705 | "Shared Cython type %.200s has the wrong size, try recompiling", 2706 | type->tp_name); 2707 | goto bad; 2708 | } 2709 | } else { 2710 | if (!PyErr_ExceptionMatches(PyExc_AttributeError)) goto bad; 2711 | PyErr_Clear(); 2712 | if (PyType_Ready(type) < 0) goto bad; 2713 | if (PyObject_SetAttrString(fake_module, type->tp_name, (PyObject*) type) < 0) 2714 | goto bad; 2715 | Py_INCREF(type); 2716 | cached_type = type; 2717 | } 2718 | done: 2719 | Py_DECREF(fake_module); 2720 | return cached_type; 2721 | bad: 2722 | Py_XDECREF(cached_type); 2723 | cached_type = NULL; 2724 | goto done; 2725 | } 2726 | 2727 | /* CythonFunction */ 2728 | static PyObject * 2729 | __Pyx_CyFunction_get_doc(__pyx_CyFunctionObject *op, CYTHON_UNUSED void *closure) 2730 | { 2731 | if (unlikely(op->func_doc == NULL)) { 2732 | if (op->func.m_ml->ml_doc) { 2733 | #if PY_MAJOR_VERSION >= 3 2734 | op->func_doc = PyUnicode_FromString(op->func.m_ml->ml_doc); 2735 | #else 2736 | op->func_doc = PyString_FromString(op->func.m_ml->ml_doc); 2737 | #endif 2738 | if (unlikely(op->func_doc == NULL)) 2739 | return NULL; 2740 | } else { 2741 | Py_INCREF(Py_None); 2742 | return Py_None; 2743 | } 2744 | } 2745 | Py_INCREF(op->func_doc); 2746 | return op->func_doc; 2747 | } 2748 | static int 2749 | __Pyx_CyFunction_set_doc(__pyx_CyFunctionObject *op, PyObject *value) 2750 | { 2751 | PyObject *tmp = op->func_doc; 2752 | if (value == NULL) { 2753 | value = Py_None; 2754 | } 2755 | Py_INCREF(value); 2756 | op->func_doc = value; 2757 | Py_XDECREF(tmp); 2758 | return 0; 2759 | } 2760 | static PyObject * 2761 | 
__Pyx_CyFunction_get_name(__pyx_CyFunctionObject *op) 2762 | { 2763 | if (unlikely(op->func_name == NULL)) { 2764 | #if PY_MAJOR_VERSION >= 3 2765 | op->func_name = PyUnicode_InternFromString(op->func.m_ml->ml_name); 2766 | #else 2767 | op->func_name = PyString_InternFromString(op->func.m_ml->ml_name); 2768 | #endif 2769 | if (unlikely(op->func_name == NULL)) 2770 | return NULL; 2771 | } 2772 | Py_INCREF(op->func_name); 2773 | return op->func_name; 2774 | } 2775 | static int 2776 | __Pyx_CyFunction_set_name(__pyx_CyFunctionObject *op, PyObject *value) 2777 | { 2778 | PyObject *tmp; 2779 | #if PY_MAJOR_VERSION >= 3 2780 | if (unlikely(value == NULL || !PyUnicode_Check(value))) { 2781 | #else 2782 | if (unlikely(value == NULL || !PyString_Check(value))) { 2783 | #endif 2784 | PyErr_SetString(PyExc_TypeError, 2785 | "__name__ must be set to a string object"); 2786 | return -1; 2787 | } 2788 | tmp = op->func_name; 2789 | Py_INCREF(value); 2790 | op->func_name = value; 2791 | Py_XDECREF(tmp); 2792 | return 0; 2793 | } 2794 | static PyObject * 2795 | __Pyx_CyFunction_get_qualname(__pyx_CyFunctionObject *op) 2796 | { 2797 | Py_INCREF(op->func_qualname); 2798 | return op->func_qualname; 2799 | } 2800 | static int 2801 | __Pyx_CyFunction_set_qualname(__pyx_CyFunctionObject *op, PyObject *value) 2802 | { 2803 | PyObject *tmp; 2804 | #if PY_MAJOR_VERSION >= 3 2805 | if (unlikely(value == NULL || !PyUnicode_Check(value))) { 2806 | #else 2807 | if (unlikely(value == NULL || !PyString_Check(value))) { 2808 | #endif 2809 | PyErr_SetString(PyExc_TypeError, 2810 | "__qualname__ must be set to a string object"); 2811 | return -1; 2812 | } 2813 | tmp = op->func_qualname; 2814 | Py_INCREF(value); 2815 | op->func_qualname = value; 2816 | Py_XDECREF(tmp); 2817 | return 0; 2818 | } 2819 | static PyObject * 2820 | __Pyx_CyFunction_get_self(__pyx_CyFunctionObject *m, CYTHON_UNUSED void *closure) 2821 | { 2822 | PyObject *self; 2823 | self = m->func_closure; 2824 | if (self == NULL) 2825 | 
self = Py_None; 2826 | Py_INCREF(self); 2827 | return self; 2828 | } 2829 | static PyObject * 2830 | __Pyx_CyFunction_get_dict(__pyx_CyFunctionObject *op) 2831 | { 2832 | if (unlikely(op->func_dict == NULL)) { 2833 | op->func_dict = PyDict_New(); 2834 | if (unlikely(op->func_dict == NULL)) 2835 | return NULL; 2836 | } 2837 | Py_INCREF(op->func_dict); 2838 | return op->func_dict; 2839 | } 2840 | static int 2841 | __Pyx_CyFunction_set_dict(__pyx_CyFunctionObject *op, PyObject *value) 2842 | { 2843 | PyObject *tmp; 2844 | if (unlikely(value == NULL)) { 2845 | PyErr_SetString(PyExc_TypeError, 2846 | "function's dictionary may not be deleted"); 2847 | return -1; 2848 | } 2849 | if (unlikely(!PyDict_Check(value))) { 2850 | PyErr_SetString(PyExc_TypeError, 2851 | "setting function's dictionary to a non-dict"); 2852 | return -1; 2853 | } 2854 | tmp = op->func_dict; 2855 | Py_INCREF(value); 2856 | op->func_dict = value; 2857 | Py_XDECREF(tmp); 2858 | return 0; 2859 | } 2860 | static PyObject * 2861 | __Pyx_CyFunction_get_globals(__pyx_CyFunctionObject *op) 2862 | { 2863 | Py_INCREF(op->func_globals); 2864 | return op->func_globals; 2865 | } 2866 | static PyObject * 2867 | __Pyx_CyFunction_get_closure(CYTHON_UNUSED __pyx_CyFunctionObject *op) 2868 | { 2869 | Py_INCREF(Py_None); 2870 | return Py_None; 2871 | } 2872 | static PyObject * 2873 | __Pyx_CyFunction_get_code(__pyx_CyFunctionObject *op) 2874 | { 2875 | PyObject* result = (op->func_code) ? 
op->func_code : Py_None; 2876 | Py_INCREF(result); 2877 | return result; 2878 | } 2879 | static int 2880 | __Pyx_CyFunction_init_defaults(__pyx_CyFunctionObject *op) { 2881 | int result = 0; 2882 | PyObject *res = op->defaults_getter((PyObject *) op); 2883 | if (unlikely(!res)) 2884 | return -1; 2885 | #if CYTHON_COMPILING_IN_CPYTHON 2886 | op->defaults_tuple = PyTuple_GET_ITEM(res, 0); 2887 | Py_INCREF(op->defaults_tuple); 2888 | op->defaults_kwdict = PyTuple_GET_ITEM(res, 1); 2889 | Py_INCREF(op->defaults_kwdict); 2890 | #else 2891 | op->defaults_tuple = PySequence_ITEM(res, 0); 2892 | if (unlikely(!op->defaults_tuple)) result = -1; 2893 | else { 2894 | op->defaults_kwdict = PySequence_ITEM(res, 1); 2895 | if (unlikely(!op->defaults_kwdict)) result = -1; 2896 | } 2897 | #endif 2898 | Py_DECREF(res); 2899 | return result; 2900 | } 2901 | static int 2902 | __Pyx_CyFunction_set_defaults(__pyx_CyFunctionObject *op, PyObject* value) { 2903 | PyObject* tmp; 2904 | if (!value) { 2905 | value = Py_None; 2906 | } else if (value != Py_None && !PyTuple_Check(value)) { 2907 | PyErr_SetString(PyExc_TypeError, 2908 | "__defaults__ must be set to a tuple object"); 2909 | return -1; 2910 | } 2911 | Py_INCREF(value); 2912 | tmp = op->defaults_tuple; 2913 | op->defaults_tuple = value; 2914 | Py_XDECREF(tmp); 2915 | return 0; 2916 | } 2917 | static PyObject * 2918 | __Pyx_CyFunction_get_defaults(__pyx_CyFunctionObject *op) { 2919 | PyObject* result = op->defaults_tuple; 2920 | if (unlikely(!result)) { 2921 | if (op->defaults_getter) { 2922 | if (__Pyx_CyFunction_init_defaults(op) < 0) return NULL; 2923 | result = op->defaults_tuple; 2924 | } else { 2925 | result = Py_None; 2926 | } 2927 | } 2928 | Py_INCREF(result); 2929 | return result; 2930 | } 2931 | static int 2932 | __Pyx_CyFunction_set_kwdefaults(__pyx_CyFunctionObject *op, PyObject* value) { 2933 | PyObject* tmp; 2934 | if (!value) { 2935 | value = Py_None; 2936 | } else if (value != Py_None && !PyDict_Check(value)) { 2937 | 
PyErr_SetString(PyExc_TypeError, 2938 | "__kwdefaults__ must be set to a dict object"); 2939 | return -1; 2940 | } 2941 | Py_INCREF(value); 2942 | tmp = op->defaults_kwdict; 2943 | op->defaults_kwdict = value; 2944 | Py_XDECREF(tmp); 2945 | return 0; 2946 | } 2947 | static PyObject * 2948 | __Pyx_CyFunction_get_kwdefaults(__pyx_CyFunctionObject *op) { 2949 | PyObject* result = op->defaults_kwdict; 2950 | if (unlikely(!result)) { 2951 | if (op->defaults_getter) { 2952 | if (__Pyx_CyFunction_init_defaults(op) < 0) return NULL; 2953 | result = op->defaults_kwdict; 2954 | } else { 2955 | result = Py_None; 2956 | } 2957 | } 2958 | Py_INCREF(result); 2959 | return result; 2960 | } 2961 | static int 2962 | __Pyx_CyFunction_set_annotations(__pyx_CyFunctionObject *op, PyObject* value) { 2963 | PyObject* tmp; 2964 | if (!value || value == Py_None) { 2965 | value = NULL; 2966 | } else if (!PyDict_Check(value)) { 2967 | PyErr_SetString(PyExc_TypeError, 2968 | "__annotations__ must be set to a dict object"); 2969 | return -1; 2970 | } 2971 | Py_XINCREF(value); 2972 | tmp = op->func_annotations; 2973 | op->func_annotations = value; 2974 | Py_XDECREF(tmp); 2975 | return 0; 2976 | } 2977 | static PyObject * 2978 | __Pyx_CyFunction_get_annotations(__pyx_CyFunctionObject *op) { 2979 | PyObject* result = op->func_annotations; 2980 | if (unlikely(!result)) { 2981 | result = PyDict_New(); 2982 | if (unlikely(!result)) return NULL; 2983 | op->func_annotations = result; 2984 | } 2985 | Py_INCREF(result); 2986 | return result; 2987 | } 2988 | static PyGetSetDef __pyx_CyFunction_getsets[] = { 2989 | {(char *) "func_doc", (getter)__Pyx_CyFunction_get_doc, (setter)__Pyx_CyFunction_set_doc, 0, 0}, 2990 | {(char *) "__doc__", (getter)__Pyx_CyFunction_get_doc, (setter)__Pyx_CyFunction_set_doc, 0, 0}, 2991 | {(char *) "func_name", (getter)__Pyx_CyFunction_get_name, (setter)__Pyx_CyFunction_set_name, 0, 0}, 2992 | {(char *) "__name__", (getter)__Pyx_CyFunction_get_name, 
(setter)__Pyx_CyFunction_set_name, 0, 0}, 2993 | {(char *) "__qualname__", (getter)__Pyx_CyFunction_get_qualname, (setter)__Pyx_CyFunction_set_qualname, 0, 0}, 2994 | {(char *) "__self__", (getter)__Pyx_CyFunction_get_self, 0, 0, 0}, 2995 | {(char *) "func_dict", (getter)__Pyx_CyFunction_get_dict, (setter)__Pyx_CyFunction_set_dict, 0, 0}, 2996 | {(char *) "__dict__", (getter)__Pyx_CyFunction_get_dict, (setter)__Pyx_CyFunction_set_dict, 0, 0}, 2997 | {(char *) "func_globals", (getter)__Pyx_CyFunction_get_globals, 0, 0, 0}, 2998 | {(char *) "__globals__", (getter)__Pyx_CyFunction_get_globals, 0, 0, 0}, 2999 | {(char *) "func_closure", (getter)__Pyx_CyFunction_get_closure, 0, 0, 0}, 3000 | {(char *) "__closure__", (getter)__Pyx_CyFunction_get_closure, 0, 0, 0}, 3001 | {(char *) "func_code", (getter)__Pyx_CyFunction_get_code, 0, 0, 0}, 3002 | {(char *) "__code__", (getter)__Pyx_CyFunction_get_code, 0, 0, 0}, 3003 | {(char *) "func_defaults", (getter)__Pyx_CyFunction_get_defaults, (setter)__Pyx_CyFunction_set_defaults, 0, 0}, 3004 | {(char *) "__defaults__", (getter)__Pyx_CyFunction_get_defaults, (setter)__Pyx_CyFunction_set_defaults, 0, 0}, 3005 | {(char *) "__kwdefaults__", (getter)__Pyx_CyFunction_get_kwdefaults, (setter)__Pyx_CyFunction_set_kwdefaults, 0, 0}, 3006 | {(char *) "__annotations__", (getter)__Pyx_CyFunction_get_annotations, (setter)__Pyx_CyFunction_set_annotations, 0, 0}, 3007 | {0, 0, 0, 0, 0} 3008 | }; 3009 | static PyMemberDef __pyx_CyFunction_members[] = { 3010 | {(char *) "__module__", T_OBJECT, offsetof(__pyx_CyFunctionObject, func.m_module), PY_WRITE_RESTRICTED, 0}, 3011 | {0, 0, 0, 0, 0} 3012 | }; 3013 | static PyObject * 3014 | __Pyx_CyFunction_reduce(__pyx_CyFunctionObject *m, CYTHON_UNUSED PyObject *args) 3015 | { 3016 | #if PY_MAJOR_VERSION >= 3 3017 | return PyUnicode_FromString(m->func.m_ml->ml_name); 3018 | #else 3019 | return PyString_FromString(m->func.m_ml->ml_name); 3020 | #endif 3021 | } 3022 | static PyMethodDef 
__pyx_CyFunction_methods[] = { 3023 | {"__reduce__", (PyCFunction)__Pyx_CyFunction_reduce, METH_VARARGS, 0}, 3024 | {0, 0, 0, 0} 3025 | }; 3026 | #if PY_VERSION_HEX < 0x030500A0 3027 | #define __Pyx_CyFunction_weakreflist(cyfunc) ((cyfunc)->func_weakreflist) 3028 | #else 3029 | #define __Pyx_CyFunction_weakreflist(cyfunc) ((cyfunc)->func.m_weakreflist) 3030 | #endif 3031 | static PyObject *__Pyx_CyFunction_New(PyTypeObject *type, PyMethodDef *ml, int flags, PyObject* qualname, 3032 | PyObject *closure, PyObject *module, PyObject* globals, PyObject* code) { 3033 | __pyx_CyFunctionObject *op = PyObject_GC_New(__pyx_CyFunctionObject, type); 3034 | if (op == NULL) 3035 | return NULL; 3036 | op->flags = flags; 3037 | __Pyx_CyFunction_weakreflist(op) = NULL; 3038 | op->func.m_ml = ml; 3039 | op->func.m_self = (PyObject *) op; 3040 | Py_XINCREF(closure); 3041 | op->func_closure = closure; 3042 | Py_XINCREF(module); 3043 | op->func.m_module = module; 3044 | op->func_dict = NULL; 3045 | op->func_name = NULL; 3046 | Py_INCREF(qualname); 3047 | op->func_qualname = qualname; 3048 | op->func_doc = NULL; 3049 | op->func_classobj = NULL; 3050 | op->func_globals = globals; 3051 | Py_INCREF(op->func_globals); 3052 | Py_XINCREF(code); 3053 | op->func_code = code; 3054 | op->defaults_pyobjects = 0; 3055 | op->defaults = NULL; 3056 | op->defaults_tuple = NULL; 3057 | op->defaults_kwdict = NULL; 3058 | op->defaults_getter = NULL; 3059 | op->func_annotations = NULL; 3060 | PyObject_GC_Track(op); 3061 | return (PyObject *) op; 3062 | } 3063 | static int 3064 | __Pyx_CyFunction_clear(__pyx_CyFunctionObject *m) 3065 | { 3066 | Py_CLEAR(m->func_closure); 3067 | Py_CLEAR(m->func.m_module); 3068 | Py_CLEAR(m->func_dict); 3069 | Py_CLEAR(m->func_name); 3070 | Py_CLEAR(m->func_qualname); 3071 | Py_CLEAR(m->func_doc); 3072 | Py_CLEAR(m->func_globals); 3073 | Py_CLEAR(m->func_code); 3074 | Py_CLEAR(m->func_classobj); 3075 | Py_CLEAR(m->defaults_tuple); 3076 | Py_CLEAR(m->defaults_kwdict); 3077 | 
Py_CLEAR(m->func_annotations); 3078 | if (m->defaults) { 3079 | PyObject **pydefaults = __Pyx_CyFunction_Defaults(PyObject *, m); 3080 | int i; 3081 | for (i = 0; i < m->defaults_pyobjects; i++) 3082 | Py_XDECREF(pydefaults[i]); 3083 | PyObject_Free(m->defaults); 3084 | m->defaults = NULL; 3085 | } 3086 | return 0; 3087 | } 3088 | static void __Pyx_CyFunction_dealloc(__pyx_CyFunctionObject *m) 3089 | { 3090 | PyObject_GC_UnTrack(m); 3091 | if (__Pyx_CyFunction_weakreflist(m) != NULL) 3092 | PyObject_ClearWeakRefs((PyObject *) m); 3093 | __Pyx_CyFunction_clear(m); 3094 | PyObject_GC_Del(m); 3095 | } 3096 | static int __Pyx_CyFunction_traverse(__pyx_CyFunctionObject *m, visitproc visit, void *arg) 3097 | { 3098 | Py_VISIT(m->func_closure); 3099 | Py_VISIT(m->func.m_module); 3100 | Py_VISIT(m->func_dict); 3101 | Py_VISIT(m->func_name); 3102 | Py_VISIT(m->func_qualname); 3103 | Py_VISIT(m->func_doc); 3104 | Py_VISIT(m->func_globals); 3105 | Py_VISIT(m->func_code); 3106 | Py_VISIT(m->func_classobj); 3107 | Py_VISIT(m->defaults_tuple); 3108 | Py_VISIT(m->defaults_kwdict); 3109 | if (m->defaults) { 3110 | PyObject **pydefaults = __Pyx_CyFunction_Defaults(PyObject *, m); 3111 | int i; 3112 | for (i = 0; i < m->defaults_pyobjects; i++) 3113 | Py_VISIT(pydefaults[i]); 3114 | } 3115 | return 0; 3116 | } 3117 | static PyObject *__Pyx_CyFunction_descr_get(PyObject *func, PyObject *obj, PyObject *type) 3118 | { 3119 | __pyx_CyFunctionObject *m = (__pyx_CyFunctionObject *) func; 3120 | if (m->flags & __Pyx_CYFUNCTION_STATICMETHOD) { 3121 | Py_INCREF(func); 3122 | return func; 3123 | } 3124 | if (m->flags & __Pyx_CYFUNCTION_CLASSMETHOD) { 3125 | if (type == NULL) 3126 | type = (PyObject *)(Py_TYPE(obj)); 3127 | return __Pyx_PyMethod_New(func, type, (PyObject *)(Py_TYPE(type))); 3128 | } 3129 | if (obj == Py_None) 3130 | obj = NULL; 3131 | return __Pyx_PyMethod_New(func, obj, type); 3132 | } 3133 | static PyObject* 3134 | __Pyx_CyFunction_repr(__pyx_CyFunctionObject *op) 3135 | { 
3136 | #if PY_MAJOR_VERSION >= 3 3137 | return PyUnicode_FromFormat("<cyfunction %U at %p>", 3138 | op->func_qualname, (void *)op); 3139 | #else 3140 | return PyString_FromFormat("<cyfunction %s at %p>", 3141 | PyString_AsString(op->func_qualname), (void *)op); 3142 | #endif 3143 | } 3144 | #if CYTHON_COMPILING_IN_PYPY 3145 | static PyObject * __Pyx_CyFunction_Call(PyObject *func, PyObject *arg, PyObject *kw) { 3146 | PyCFunctionObject* f = (PyCFunctionObject*)func; 3147 | PyCFunction meth = f->m_ml->ml_meth; 3148 | PyObject *self = f->m_self; 3149 | Py_ssize_t size; 3150 | switch (f->m_ml->ml_flags & (METH_VARARGS | METH_KEYWORDS | METH_NOARGS | METH_O)) { 3151 | case METH_VARARGS: 3152 | if (likely(kw == NULL || PyDict_Size(kw) == 0)) 3153 | return (*meth)(self, arg); 3154 | break; 3155 | case METH_VARARGS | METH_KEYWORDS: 3156 | return (*(PyCFunctionWithKeywords)meth)(self, arg, kw); 3157 | case METH_NOARGS: 3158 | if (likely(kw == NULL || PyDict_Size(kw) == 0)) { 3159 | size = PyTuple_GET_SIZE(arg); 3160 | if (likely(size == 0)) 3161 | return (*meth)(self, NULL); 3162 | PyErr_Format(PyExc_TypeError, 3163 | "%.200s() takes no arguments (%" CYTHON_FORMAT_SSIZE_T "d given)", 3164 | f->m_ml->ml_name, size); 3165 | return NULL; 3166 | } 3167 | break; 3168 | case METH_O: 3169 | if (likely(kw == NULL || PyDict_Size(kw) == 0)) { 3170 | size = PyTuple_GET_SIZE(arg); 3171 | if (likely(size == 1)) { 3172 | PyObject *result, *arg0 = PySequence_ITEM(arg, 0); 3173 | if (unlikely(!arg0)) return NULL; 3174 | result = (*meth)(self, arg0); 3175 | Py_DECREF(arg0); 3176 | return result; 3177 | } 3178 | PyErr_Format(PyExc_TypeError, 3179 | "%.200s() takes exactly one argument (%" CYTHON_FORMAT_SSIZE_T "d given)", 3180 | f->m_ml->ml_name, size); 3181 | return NULL; 3182 | } 3183 | break; 3184 | default: 3185 | PyErr_SetString(PyExc_SystemError, "Bad call flags in " 3186 | "__Pyx_CyFunction_Call. 
METH_OLDARGS is no " 3187 | "longer supported!"); 3188 | return NULL; 3189 | } 3190 | PyErr_Format(PyExc_TypeError, "%.200s() takes no keyword arguments", 3191 | f->m_ml->ml_name); 3192 | return NULL; 3193 | } 3194 | #else 3195 | static PyObject * __Pyx_CyFunction_Call(PyObject *func, PyObject *arg, PyObject *kw) { 3196 | return PyCFunction_Call(func, arg, kw); 3197 | } 3198 | #endif 3199 | static PyTypeObject __pyx_CyFunctionType_type = { 3200 | PyVarObject_HEAD_INIT(0, 0) 3201 | "cython_function_or_method", 3202 | sizeof(__pyx_CyFunctionObject), 3203 | 0, 3204 | (destructor) __Pyx_CyFunction_dealloc, 3205 | 0, 3206 | 0, 3207 | 0, 3208 | #if PY_MAJOR_VERSION < 3 3209 | 0, 3210 | #else 3211 | 0, 3212 | #endif 3213 | (reprfunc) __Pyx_CyFunction_repr, 3214 | 0, 3215 | 0, 3216 | 0, 3217 | 0, 3218 | __Pyx_CyFunction_Call, 3219 | 0, 3220 | 0, 3221 | 0, 3222 | 0, 3223 | Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC, 3224 | 0, 3225 | (traverseproc) __Pyx_CyFunction_traverse, 3226 | (inquiry) __Pyx_CyFunction_clear, 3227 | 0, 3228 | #if PY_VERSION_HEX < 0x030500A0 3229 | offsetof(__pyx_CyFunctionObject, func_weakreflist), 3230 | #else 3231 | offsetof(PyCFunctionObject, m_weakreflist), 3232 | #endif 3233 | 0, 3234 | 0, 3235 | __pyx_CyFunction_methods, 3236 | __pyx_CyFunction_members, 3237 | __pyx_CyFunction_getsets, 3238 | 0, 3239 | 0, 3240 | __Pyx_CyFunction_descr_get, 3241 | 0, 3242 | offsetof(__pyx_CyFunctionObject, func_dict), 3243 | 0, 3244 | 0, 3245 | 0, 3246 | 0, 3247 | 0, 3248 | 0, 3249 | 0, 3250 | 0, 3251 | 0, 3252 | 0, 3253 | 0, 3254 | 0, 3255 | #if PY_VERSION_HEX >= 0x030400a1 3256 | 0, 3257 | #endif 3258 | }; 3259 | static int __pyx_CyFunction_init(void) { 3260 | #if !CYTHON_COMPILING_IN_PYPY 3261 | __pyx_CyFunctionType_type.tp_call = PyCFunction_Call; 3262 | #endif 3263 | __pyx_CyFunctionType = __Pyx_FetchCommonType(&__pyx_CyFunctionType_type); 3264 | if (__pyx_CyFunctionType == NULL) { 3265 | return -1; 3266 | } 3267 | return 0; 3268 | } 3269 | static CYTHON_INLINE 
void *__Pyx_CyFunction_InitDefaults(PyObject *func, size_t size, int pyobjects) { 3270 | __pyx_CyFunctionObject *m = (__pyx_CyFunctionObject *) func; 3271 | m->defaults = PyObject_Malloc(size); 3272 | if (!m->defaults) 3273 | return PyErr_NoMemory(); 3274 | memset(m->defaults, 0, size); 3275 | m->defaults_pyobjects = pyobjects; 3276 | return m->defaults; 3277 | } 3278 | static CYTHON_INLINE void __Pyx_CyFunction_SetDefaultsTuple(PyObject *func, PyObject *tuple) { 3279 | __pyx_CyFunctionObject *m = (__pyx_CyFunctionObject *) func; 3280 | m->defaults_tuple = tuple; 3281 | Py_INCREF(tuple); 3282 | } 3283 | static CYTHON_INLINE void __Pyx_CyFunction_SetDefaultsKwDict(PyObject *func, PyObject *dict) { 3284 | __pyx_CyFunctionObject *m = (__pyx_CyFunctionObject *) func; 3285 | m->defaults_kwdict = dict; 3286 | Py_INCREF(dict); 3287 | } 3288 | static CYTHON_INLINE void __Pyx_CyFunction_SetAnnotationsDict(PyObject *func, PyObject *dict) { 3289 | __pyx_CyFunctionObject *m = (__pyx_CyFunctionObject *) func; 3290 | m->func_annotations = dict; 3291 | Py_INCREF(dict); 3292 | } 3293 | 3294 | /* Import */ 3295 | static PyObject *__Pyx_Import(PyObject *name, PyObject *from_list, int level) { 3296 | PyObject *empty_list = 0; 3297 | PyObject *module = 0; 3298 | PyObject *global_dict = 0; 3299 | PyObject *empty_dict = 0; 3300 | PyObject *list; 3301 | #if PY_VERSION_HEX < 0x03030000 3302 | PyObject *py_import; 3303 | py_import = __Pyx_PyObject_GetAttrStr(__pyx_b, __pyx_n_s_import); 3304 | if (!py_import) 3305 | goto bad; 3306 | #endif 3307 | if (from_list) 3308 | list = from_list; 3309 | else { 3310 | empty_list = PyList_New(0); 3311 | if (!empty_list) 3312 | goto bad; 3313 | list = empty_list; 3314 | } 3315 | global_dict = PyModule_GetDict(__pyx_m); 3316 | if (!global_dict) 3317 | goto bad; 3318 | empty_dict = PyDict_New(); 3319 | if (!empty_dict) 3320 | goto bad; 3321 | { 3322 | #if PY_MAJOR_VERSION >= 3 3323 | if (level == -1) { 3324 | if (strchr(__Pyx_MODULE_NAME, '.')) { 3325 | 
#if PY_VERSION_HEX < 0x03030000 3326 | PyObject *py_level = PyInt_FromLong(1); 3327 | if (!py_level) 3328 | goto bad; 3329 | module = PyObject_CallFunctionObjArgs(py_import, 3330 | name, global_dict, empty_dict, list, py_level, NULL); 3331 | Py_DECREF(py_level); 3332 | #else 3333 | module = PyImport_ImportModuleLevelObject( 3334 | name, global_dict, empty_dict, list, 1); 3335 | #endif 3336 | if (!module) { 3337 | if (!PyErr_ExceptionMatches(PyExc_ImportError)) 3338 | goto bad; 3339 | PyErr_Clear(); 3340 | } 3341 | } 3342 | level = 0; 3343 | } 3344 | #endif 3345 | if (!module) { 3346 | #if PY_VERSION_HEX < 0x03030000 3347 | PyObject *py_level = PyInt_FromLong(level); 3348 | if (!py_level) 3349 | goto bad; 3350 | module = PyObject_CallFunctionObjArgs(py_import, 3351 | name, global_dict, empty_dict, list, py_level, NULL); 3352 | Py_DECREF(py_level); 3353 | #else 3354 | module = PyImport_ImportModuleLevelObject( 3355 | name, global_dict, empty_dict, list, level); 3356 | #endif 3357 | } 3358 | } 3359 | bad: 3360 | #if PY_VERSION_HEX < 0x03030000 3361 | Py_XDECREF(py_import); 3362 | #endif 3363 | Py_XDECREF(empty_list); 3364 | Py_XDECREF(empty_dict); 3365 | return module; 3366 | } 3367 | 3368 | /* CodeObjectCache */ 3369 | static int __pyx_bisect_code_objects(__Pyx_CodeObjectCacheEntry* entries, int count, int code_line) { 3370 | int start = 0, mid = 0, end = count - 1; 3371 | if (end >= 0 && code_line > entries[end].code_line) { 3372 | return count; 3373 | } 3374 | while (start < end) { 3375 | mid = start + (end - start) / 2; 3376 | if (code_line < entries[mid].code_line) { 3377 | end = mid; 3378 | } else if (code_line > entries[mid].code_line) { 3379 | start = mid + 1; 3380 | } else { 3381 | return mid; 3382 | } 3383 | } 3384 | if (code_line <= entries[mid].code_line) { 3385 | return mid; 3386 | } else { 3387 | return mid + 1; 3388 | } 3389 | } 3390 | static PyCodeObject *__pyx_find_code_object(int code_line) { 3391 | PyCodeObject* code_object; 3392 | int pos; 3393 | 
if (unlikely(!code_line) || unlikely(!__pyx_code_cache.entries)) { 3394 | return NULL; 3395 | } 3396 | pos = __pyx_bisect_code_objects(__pyx_code_cache.entries, __pyx_code_cache.count, code_line); 3397 | if (unlikely(pos >= __pyx_code_cache.count) || unlikely(__pyx_code_cache.entries[pos].code_line != code_line)) { 3398 | return NULL; 3399 | } 3400 | code_object = __pyx_code_cache.entries[pos].code_object; 3401 | Py_INCREF(code_object); 3402 | return code_object; 3403 | } 3404 | static void __pyx_insert_code_object(int code_line, PyCodeObject* code_object) { 3405 | int pos, i; 3406 | __Pyx_CodeObjectCacheEntry* entries = __pyx_code_cache.entries; 3407 | if (unlikely(!code_line)) { 3408 | return; 3409 | } 3410 | if (unlikely(!entries)) { 3411 | entries = (__Pyx_CodeObjectCacheEntry*)PyMem_Malloc(64*sizeof(__Pyx_CodeObjectCacheEntry)); 3412 | if (likely(entries)) { 3413 | __pyx_code_cache.entries = entries; 3414 | __pyx_code_cache.max_count = 64; 3415 | __pyx_code_cache.count = 1; 3416 | entries[0].code_line = code_line; 3417 | entries[0].code_object = code_object; 3418 | Py_INCREF(code_object); 3419 | } 3420 | return; 3421 | } 3422 | pos = __pyx_bisect_code_objects(__pyx_code_cache.entries, __pyx_code_cache.count, code_line); 3423 | if ((pos < __pyx_code_cache.count) && unlikely(__pyx_code_cache.entries[pos].code_line == code_line)) { 3424 | PyCodeObject* tmp = entries[pos].code_object; 3425 | entries[pos].code_object = code_object; 3426 | Py_DECREF(tmp); 3427 | return; 3428 | } 3429 | if (__pyx_code_cache.count == __pyx_code_cache.max_count) { 3430 | int new_max = __pyx_code_cache.max_count + 64; 3431 | entries = (__Pyx_CodeObjectCacheEntry*)PyMem_Realloc( 3432 | __pyx_code_cache.entries, (size_t)new_max*sizeof(__Pyx_CodeObjectCacheEntry)); 3433 | if (unlikely(!entries)) { 3434 | return; 3435 | } 3436 | __pyx_code_cache.entries = entries; 3437 | __pyx_code_cache.max_count = new_max; 3438 | } 3439 | for (i=__pyx_code_cache.count; i>pos; i--) { 3440 | entries[i] = 
entries[i-1]; 3441 | } 3442 | entries[pos].code_line = code_line; 3443 | entries[pos].code_object = code_object; 3444 | __pyx_code_cache.count++; 3445 | Py_INCREF(code_object); 3446 | } 3447 | 3448 | /* AddTraceback */ 3449 | #include "compile.h" 3450 | #include "frameobject.h" 3451 | #include "traceback.h" 3452 | static PyCodeObject* __Pyx_CreateCodeObjectForTraceback( 3453 | const char *funcname, int c_line, 3454 | int py_line, const char *filename) { 3455 | PyCodeObject *py_code = 0; 3456 | PyObject *py_srcfile = 0; 3457 | PyObject *py_funcname = 0; 3458 | #if PY_MAJOR_VERSION < 3 3459 | py_srcfile = PyString_FromString(filename); 3460 | #else 3461 | py_srcfile = PyUnicode_FromString(filename); 3462 | #endif 3463 | if (!py_srcfile) goto bad; 3464 | if (c_line) { 3465 | #if PY_MAJOR_VERSION < 3 3466 | py_funcname = PyString_FromFormat( "%s (%s:%d)", funcname, __pyx_cfilenm, c_line); 3467 | #else 3468 | py_funcname = PyUnicode_FromFormat( "%s (%s:%d)", funcname, __pyx_cfilenm, c_line); 3469 | #endif 3470 | } 3471 | else { 3472 | #if PY_MAJOR_VERSION < 3 3473 | py_funcname = PyString_FromString(funcname); 3474 | #else 3475 | py_funcname = PyUnicode_FromString(funcname); 3476 | #endif 3477 | } 3478 | if (!py_funcname) goto bad; 3479 | py_code = __Pyx_PyCode_New( 3480 | 0, 3481 | 0, 3482 | 0, 3483 | 0, 3484 | 0, 3485 | __pyx_empty_bytes, /*PyObject *code,*/ 3486 | __pyx_empty_tuple, /*PyObject *consts,*/ 3487 | __pyx_empty_tuple, /*PyObject *names,*/ 3488 | __pyx_empty_tuple, /*PyObject *varnames,*/ 3489 | __pyx_empty_tuple, /*PyObject *freevars,*/ 3490 | __pyx_empty_tuple, /*PyObject *cellvars,*/ 3491 | py_srcfile, /*PyObject *filename,*/ 3492 | py_funcname, /*PyObject *name,*/ 3493 | py_line, 3494 | __pyx_empty_bytes /*PyObject *lnotab*/ 3495 | ); 3496 | Py_DECREF(py_srcfile); 3497 | Py_DECREF(py_funcname); 3498 | return py_code; 3499 | bad: 3500 | Py_XDECREF(py_srcfile); 3501 | Py_XDECREF(py_funcname); 3502 | return NULL; 3503 | } 3504 | static void 
__Pyx_AddTraceback(const char *funcname, int c_line, 3505 | int py_line, const char *filename) { 3506 | PyCodeObject *py_code = 0; 3507 | PyFrameObject *py_frame = 0; 3508 | py_code = __pyx_find_code_object(c_line ? c_line : py_line); 3509 | if (!py_code) { 3510 | py_code = __Pyx_CreateCodeObjectForTraceback( 3511 | funcname, c_line, py_line, filename); 3512 | if (!py_code) goto bad; 3513 | __pyx_insert_code_object(c_line ? c_line : py_line, py_code); 3514 | } 3515 | py_frame = PyFrame_New( 3516 | PyThreadState_GET(), /*PyThreadState *tstate,*/ 3517 | py_code, /*PyCodeObject *code,*/ 3518 | __pyx_d, /*PyObject *globals,*/ 3519 | 0 /*PyObject *locals*/ 3520 | ); 3521 | if (!py_frame) goto bad; 3522 | py_frame->f_lineno = py_line; 3523 | PyTraceBack_Here(py_frame); 3524 | bad: 3525 | Py_XDECREF(py_code); 3526 | Py_XDECREF(py_frame); 3527 | } 3528 | 3529 | /* CIntFromPyVerify */ 3530 | #define __PYX_VERIFY_RETURN_INT(target_type, func_type, func_value)\ 3531 | __PYX__VERIFY_RETURN_INT(target_type, func_type, func_value, 0) 3532 | #define __PYX_VERIFY_RETURN_INT_EXC(target_type, func_type, func_value)\ 3533 | __PYX__VERIFY_RETURN_INT(target_type, func_type, func_value, 1) 3534 | #define __PYX__VERIFY_RETURN_INT(target_type, func_type, func_value, exc)\ 3535 | {\ 3536 | func_type value = func_value;\ 3537 | if (sizeof(target_type) < sizeof(func_type)) {\ 3538 | if (unlikely(value != (func_type) (target_type) value)) {\ 3539 | func_type zero = 0;\ 3540 | if (exc && unlikely(value == (func_type)-1 && PyErr_Occurred()))\ 3541 | return (target_type) -1;\ 3542 | if (is_unsigned && unlikely(value < zero))\ 3543 | goto raise_neg_overflow;\ 3544 | else\ 3545 | goto raise_overflow;\ 3546 | }\ 3547 | }\ 3548 | return (target_type) value;\ 3549 | } 3550 | 3551 | /* CIntToPy */ 3552 | static CYTHON_INLINE PyObject* __Pyx_PyInt_From_long(long value) { 3553 | const long neg_one = (long) -1, const_zero = (long) 0; 3554 | const int is_unsigned = neg_one > const_zero; 3555 | if 
(is_unsigned) { 3556 | if (sizeof(long) < sizeof(long)) { 3557 | return PyInt_FromLong((long) value); 3558 | } else if (sizeof(long) <= sizeof(unsigned long)) { 3559 | return PyLong_FromUnsignedLong((unsigned long) value); 3560 | } else if (sizeof(long) <= sizeof(unsigned PY_LONG_LONG)) { 3561 | return PyLong_FromUnsignedLongLong((unsigned PY_LONG_LONG) value); 3562 | } 3563 | } else { 3564 | if (sizeof(long) <= sizeof(long)) { 3565 | return PyInt_FromLong((long) value); 3566 | } else if (sizeof(long) <= sizeof(PY_LONG_LONG)) { 3567 | return PyLong_FromLongLong((PY_LONG_LONG) value); 3568 | } 3569 | } 3570 | { 3571 | int one = 1; int little = (int)*(unsigned char *)&one; 3572 | unsigned char *bytes = (unsigned char *)&value; 3573 | return _PyLong_FromByteArray(bytes, sizeof(long), 3574 | little, !is_unsigned); 3575 | } 3576 | } 3577 | 3578 | /* CIntToPy */ 3579 | static CYTHON_INLINE PyObject* __Pyx_PyInt_From_uint64_t(uint64_t value) { 3580 | const uint64_t neg_one = (uint64_t) -1, const_zero = (uint64_t) 0; 3581 | const int is_unsigned = neg_one > const_zero; 3582 | if (is_unsigned) { 3583 | if (sizeof(uint64_t) < sizeof(long)) { 3584 | return PyInt_FromLong((long) value); 3585 | } else if (sizeof(uint64_t) <= sizeof(unsigned long)) { 3586 | return PyLong_FromUnsignedLong((unsigned long) value); 3587 | } else if (sizeof(uint64_t) <= sizeof(unsigned PY_LONG_LONG)) { 3588 | return PyLong_FromUnsignedLongLong((unsigned PY_LONG_LONG) value); 3589 | } 3590 | } else { 3591 | if (sizeof(uint64_t) <= sizeof(long)) { 3592 | return PyInt_FromLong((long) value); 3593 | } else if (sizeof(uint64_t) <= sizeof(PY_LONG_LONG)) { 3594 | return PyLong_FromLongLong((PY_LONG_LONG) value); 3595 | } 3596 | } 3597 | { 3598 | int one = 1; int little = (int)*(unsigned char *)&one; 3599 | unsigned char *bytes = (unsigned char *)&value; 3600 | return _PyLong_FromByteArray(bytes, sizeof(uint64_t), 3601 | little, !is_unsigned); 3602 | } 3603 | } 3604 | 3605 | /* CIntFromPy */ 3606 | static 
CYTHON_INLINE uint64_t __Pyx_PyInt_As_uint64_t(PyObject *x) { 3607 | const uint64_t neg_one = (uint64_t) -1, const_zero = (uint64_t) 0; 3608 | const int is_unsigned = neg_one > const_zero; 3609 | #if PY_MAJOR_VERSION < 3 3610 | if (likely(PyInt_Check(x))) { 3611 | if (sizeof(uint64_t) < sizeof(long)) { 3612 | __PYX_VERIFY_RETURN_INT(uint64_t, long, PyInt_AS_LONG(x)) 3613 | } else { 3614 | long val = PyInt_AS_LONG(x); 3615 | if (is_unsigned && unlikely(val < 0)) { 3616 | goto raise_neg_overflow; 3617 | } 3618 | return (uint64_t) val; 3619 | } 3620 | } else 3621 | #endif 3622 | if (likely(PyLong_Check(x))) { 3623 | if (is_unsigned) { 3624 | #if CYTHON_USE_PYLONG_INTERNALS 3625 | const digit* digits = ((PyLongObject*)x)->ob_digit; 3626 | switch (Py_SIZE(x)) { 3627 | case 0: return (uint64_t) 0; 3628 | case 1: __PYX_VERIFY_RETURN_INT(uint64_t, digit, digits[0]) 3629 | case 2: 3630 | if (8 * sizeof(uint64_t) > 1 * PyLong_SHIFT) { 3631 | if (8 * sizeof(unsigned long) > 2 * PyLong_SHIFT) { 3632 | __PYX_VERIFY_RETURN_INT(uint64_t, unsigned long, (((((unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0]))) 3633 | } else if (8 * sizeof(uint64_t) >= 2 * PyLong_SHIFT) { 3634 | return (uint64_t) (((((uint64_t)digits[1]) << PyLong_SHIFT) | (uint64_t)digits[0])); 3635 | } 3636 | } 3637 | break; 3638 | case 3: 3639 | if (8 * sizeof(uint64_t) > 2 * PyLong_SHIFT) { 3640 | if (8 * sizeof(unsigned long) > 3 * PyLong_SHIFT) { 3641 | __PYX_VERIFY_RETURN_INT(uint64_t, unsigned long, (((((((unsigned long)digits[2]) << PyLong_SHIFT) | (unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0]))) 3642 | } else if (8 * sizeof(uint64_t) >= 3 * PyLong_SHIFT) { 3643 | return (uint64_t) (((((((uint64_t)digits[2]) << PyLong_SHIFT) | (uint64_t)digits[1]) << PyLong_SHIFT) | (uint64_t)digits[0])); 3644 | } 3645 | } 3646 | break; 3647 | case 4: 3648 | if (8 * sizeof(uint64_t) > 3 * PyLong_SHIFT) { 3649 | if (8 * sizeof(unsigned long) > 4 * PyLong_SHIFT) { 3650 | 
__PYX_VERIFY_RETURN_INT(uint64_t, unsigned long, (((((((((unsigned long)digits[3]) << PyLong_SHIFT) | (unsigned long)digits[2]) << PyLong_SHIFT) | (unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0]))) 3651 | } else if (8 * sizeof(uint64_t) >= 4 * PyLong_SHIFT) { 3652 | return (uint64_t) (((((((((uint64_t)digits[3]) << PyLong_SHIFT) | (uint64_t)digits[2]) << PyLong_SHIFT) | (uint64_t)digits[1]) << PyLong_SHIFT) | (uint64_t)digits[0])); 3653 | } 3654 | } 3655 | break; 3656 | } 3657 | #endif 3658 | #if CYTHON_COMPILING_IN_CPYTHON 3659 | if (unlikely(Py_SIZE(x) < 0)) { 3660 | goto raise_neg_overflow; 3661 | } 3662 | #else 3663 | { 3664 | int result = PyObject_RichCompareBool(x, Py_False, Py_LT); 3665 | if (unlikely(result < 0)) 3666 | return (uint64_t) -1; 3667 | if (unlikely(result == 1)) 3668 | goto raise_neg_overflow; 3669 | } 3670 | #endif 3671 | if (sizeof(uint64_t) <= sizeof(unsigned long)) { 3672 | __PYX_VERIFY_RETURN_INT_EXC(uint64_t, unsigned long, PyLong_AsUnsignedLong(x)) 3673 | } else if (sizeof(uint64_t) <= sizeof(unsigned PY_LONG_LONG)) { 3674 | __PYX_VERIFY_RETURN_INT_EXC(uint64_t, unsigned PY_LONG_LONG, PyLong_AsUnsignedLongLong(x)) 3675 | } 3676 | } else { 3677 | #if CYTHON_USE_PYLONG_INTERNALS 3678 | const digit* digits = ((PyLongObject*)x)->ob_digit; 3679 | switch (Py_SIZE(x)) { 3680 | case 0: return (uint64_t) 0; 3681 | case -1: __PYX_VERIFY_RETURN_INT(uint64_t, sdigit, (sdigit) (-(sdigit)digits[0])) 3682 | case 1: __PYX_VERIFY_RETURN_INT(uint64_t, digit, +digits[0]) 3683 | case -2: 3684 | if (8 * sizeof(uint64_t) - 1 > 1 * PyLong_SHIFT) { 3685 | if (8 * sizeof(unsigned long) > 2 * PyLong_SHIFT) { 3686 | __PYX_VERIFY_RETURN_INT(uint64_t, long, -(long) (((((unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0]))) 3687 | } else if (8 * sizeof(uint64_t) - 1 > 2 * PyLong_SHIFT) { 3688 | return (uint64_t) (((uint64_t)-1)*(((((uint64_t)digits[1]) << PyLong_SHIFT) | (uint64_t)digits[0]))); 3689 | } 3690 | } 3691 | break; 
3692 | case 2: 3693 | if (8 * sizeof(uint64_t) > 1 * PyLong_SHIFT) { 3694 | if (8 * sizeof(unsigned long) > 2 * PyLong_SHIFT) { 3695 | __PYX_VERIFY_RETURN_INT(uint64_t, unsigned long, (((((unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0]))) 3696 | } else if (8 * sizeof(uint64_t) - 1 > 2 * PyLong_SHIFT) { 3697 | return (uint64_t) ((((((uint64_t)digits[1]) << PyLong_SHIFT) | (uint64_t)digits[0]))); 3698 | } 3699 | } 3700 | break; 3701 | case -3: 3702 | if (8 * sizeof(uint64_t) - 1 > 2 * PyLong_SHIFT) { 3703 | if (8 * sizeof(unsigned long) > 3 * PyLong_SHIFT) { 3704 | __PYX_VERIFY_RETURN_INT(uint64_t, long, -(long) (((((((unsigned long)digits[2]) << PyLong_SHIFT) | (unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0]))) 3705 | } else if (8 * sizeof(uint64_t) - 1 > 3 * PyLong_SHIFT) { 3706 | return (uint64_t) (((uint64_t)-1)*(((((((uint64_t)digits[2]) << PyLong_SHIFT) | (uint64_t)digits[1]) << PyLong_SHIFT) | (uint64_t)digits[0]))); 3707 | } 3708 | } 3709 | break; 3710 | case 3: 3711 | if (8 * sizeof(uint64_t) > 2 * PyLong_SHIFT) { 3712 | if (8 * sizeof(unsigned long) > 3 * PyLong_SHIFT) { 3713 | __PYX_VERIFY_RETURN_INT(uint64_t, unsigned long, (((((((unsigned long)digits[2]) << PyLong_SHIFT) | (unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0]))) 3714 | } else if (8 * sizeof(uint64_t) - 1 > 3 * PyLong_SHIFT) { 3715 | return (uint64_t) ((((((((uint64_t)digits[2]) << PyLong_SHIFT) | (uint64_t)digits[1]) << PyLong_SHIFT) | (uint64_t)digits[0]))); 3716 | } 3717 | } 3718 | break; 3719 | case -4: 3720 | if (8 * sizeof(uint64_t) - 1 > 3 * PyLong_SHIFT) { 3721 | if (8 * sizeof(unsigned long) > 4 * PyLong_SHIFT) { 3722 | __PYX_VERIFY_RETURN_INT(uint64_t, long, -(long) (((((((((unsigned long)digits[3]) << PyLong_SHIFT) | (unsigned long)digits[2]) << PyLong_SHIFT) | (unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0]))) 3723 | } else if (8 * sizeof(uint64_t) - 1 > 4 * PyLong_SHIFT) { 3724 | return 
(uint64_t) (((uint64_t)-1)*(((((((((uint64_t)digits[3]) << PyLong_SHIFT) | (uint64_t)digits[2]) << PyLong_SHIFT) | (uint64_t)digits[1]) << PyLong_SHIFT) | (uint64_t)digits[0]))); 3725 | } 3726 | } 3727 | break; 3728 | case 4: 3729 | if (8 * sizeof(uint64_t) > 3 * PyLong_SHIFT) { 3730 | if (8 * sizeof(unsigned long) > 4 * PyLong_SHIFT) { 3731 | __PYX_VERIFY_RETURN_INT(uint64_t, unsigned long, (((((((((unsigned long)digits[3]) << PyLong_SHIFT) | (unsigned long)digits[2]) << PyLong_SHIFT) | (unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0]))) 3732 | } else if (8 * sizeof(uint64_t) - 1 > 4 * PyLong_SHIFT) { 3733 | return (uint64_t) ((((((((((uint64_t)digits[3]) << PyLong_SHIFT) | (uint64_t)digits[2]) << PyLong_SHIFT) | (uint64_t)digits[1]) << PyLong_SHIFT) | (uint64_t)digits[0]))); 3734 | } 3735 | } 3736 | break; 3737 | } 3738 | #endif 3739 | if (sizeof(uint64_t) <= sizeof(long)) { 3740 | __PYX_VERIFY_RETURN_INT_EXC(uint64_t, long, PyLong_AsLong(x)) 3741 | } else if (sizeof(uint64_t) <= sizeof(PY_LONG_LONG)) { 3742 | __PYX_VERIFY_RETURN_INT_EXC(uint64_t, PY_LONG_LONG, PyLong_AsLongLong(x)) 3743 | } 3744 | } 3745 | { 3746 | #if CYTHON_COMPILING_IN_PYPY && !defined(_PyLong_AsByteArray) 3747 | PyErr_SetString(PyExc_RuntimeError, 3748 | "_PyLong_AsByteArray() not available in PyPy, cannot convert large numbers"); 3749 | #else 3750 | uint64_t val; 3751 | PyObject *v = __Pyx_PyNumber_IntOrLong(x); 3752 | #if PY_MAJOR_VERSION < 3 3753 | if (likely(v) && !PyLong_Check(v)) { 3754 | PyObject *tmp = v; 3755 | v = PyNumber_Long(tmp); 3756 | Py_DECREF(tmp); 3757 | } 3758 | #endif 3759 | if (likely(v)) { 3760 | int one = 1; int is_little = (int)*(unsigned char *)&one; 3761 | unsigned char *bytes = (unsigned char *)&val; 3762 | int ret = _PyLong_AsByteArray((PyLongObject *)v, 3763 | bytes, sizeof(val), 3764 | is_little, !is_unsigned); 3765 | Py_DECREF(v); 3766 | if (likely(!ret)) 3767 | return val; 3768 | } 3769 | #endif 3770 | return (uint64_t) -1; 3771 | } 3772 
| } else { 3773 | uint64_t val; 3774 | PyObject *tmp = __Pyx_PyNumber_IntOrLong(x); 3775 | if (!tmp) return (uint64_t) -1; 3776 | val = __Pyx_PyInt_As_uint64_t(tmp); 3777 | Py_DECREF(tmp); 3778 | return val; 3779 | } 3780 | raise_overflow: 3781 | PyErr_SetString(PyExc_OverflowError, 3782 | "value too large to convert to uint64_t"); 3783 | return (uint64_t) -1; 3784 | raise_neg_overflow: 3785 | PyErr_SetString(PyExc_OverflowError, 3786 | "can't convert negative value to uint64_t"); 3787 | return (uint64_t) -1; 3788 | } 3789 | 3790 | /* CIntFromPy */ 3791 | static CYTHON_INLINE size_t __Pyx_PyInt_As_size_t(PyObject *x) { 3792 | const size_t neg_one = (size_t) -1, const_zero = (size_t) 0; 3793 | const int is_unsigned = neg_one > const_zero; 3794 | #if PY_MAJOR_VERSION < 3 3795 | if (likely(PyInt_Check(x))) { 3796 | if (sizeof(size_t) < sizeof(long)) { 3797 | __PYX_VERIFY_RETURN_INT(size_t, long, PyInt_AS_LONG(x)) 3798 | } else { 3799 | long val = PyInt_AS_LONG(x); 3800 | if (is_unsigned && unlikely(val < 0)) { 3801 | goto raise_neg_overflow; 3802 | } 3803 | return (size_t) val; 3804 | } 3805 | } else 3806 | #endif 3807 | if (likely(PyLong_Check(x))) { 3808 | if (is_unsigned) { 3809 | #if CYTHON_USE_PYLONG_INTERNALS 3810 | const digit* digits = ((PyLongObject*)x)->ob_digit; 3811 | switch (Py_SIZE(x)) { 3812 | case 0: return (size_t) 0; 3813 | case 1: __PYX_VERIFY_RETURN_INT(size_t, digit, digits[0]) 3814 | case 2: 3815 | if (8 * sizeof(size_t) > 1 * PyLong_SHIFT) { 3816 | if (8 * sizeof(unsigned long) > 2 * PyLong_SHIFT) { 3817 | __PYX_VERIFY_RETURN_INT(size_t, unsigned long, (((((unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0]))) 3818 | } else if (8 * sizeof(size_t) >= 2 * PyLong_SHIFT) { 3819 | return (size_t) (((((size_t)digits[1]) << PyLong_SHIFT) | (size_t)digits[0])); 3820 | } 3821 | } 3822 | break; 3823 | case 3: 3824 | if (8 * sizeof(size_t) > 2 * PyLong_SHIFT) { 3825 | if (8 * sizeof(unsigned long) > 3 * PyLong_SHIFT) { 3826 | 
__PYX_VERIFY_RETURN_INT(size_t, unsigned long, (((((((unsigned long)digits[2]) << PyLong_SHIFT) | (unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0]))) 3827 | } else if (8 * sizeof(size_t) >= 3 * PyLong_SHIFT) { 3828 | return (size_t) (((((((size_t)digits[2]) << PyLong_SHIFT) | (size_t)digits[1]) << PyLong_SHIFT) | (size_t)digits[0])); 3829 | } 3830 | } 3831 | break; 3832 | case 4: 3833 | if (8 * sizeof(size_t) > 3 * PyLong_SHIFT) { 3834 | if (8 * sizeof(unsigned long) > 4 * PyLong_SHIFT) { 3835 | __PYX_VERIFY_RETURN_INT(size_t, unsigned long, (((((((((unsigned long)digits[3]) << PyLong_SHIFT) | (unsigned long)digits[2]) << PyLong_SHIFT) | (unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0]))) 3836 | } else if (8 * sizeof(size_t) >= 4 * PyLong_SHIFT) { 3837 | return (size_t) (((((((((size_t)digits[3]) << PyLong_SHIFT) | (size_t)digits[2]) << PyLong_SHIFT) | (size_t)digits[1]) << PyLong_SHIFT) | (size_t)digits[0])); 3838 | } 3839 | } 3840 | break; 3841 | } 3842 | #endif 3843 | #if CYTHON_COMPILING_IN_CPYTHON 3844 | if (unlikely(Py_SIZE(x) < 0)) { 3845 | goto raise_neg_overflow; 3846 | } 3847 | #else 3848 | { 3849 | int result = PyObject_RichCompareBool(x, Py_False, Py_LT); 3850 | if (unlikely(result < 0)) 3851 | return (size_t) -1; 3852 | if (unlikely(result == 1)) 3853 | goto raise_neg_overflow; 3854 | } 3855 | #endif 3856 | if (sizeof(size_t) <= sizeof(unsigned long)) { 3857 | __PYX_VERIFY_RETURN_INT_EXC(size_t, unsigned long, PyLong_AsUnsignedLong(x)) 3858 | } else if (sizeof(size_t) <= sizeof(unsigned PY_LONG_LONG)) { 3859 | __PYX_VERIFY_RETURN_INT_EXC(size_t, unsigned PY_LONG_LONG, PyLong_AsUnsignedLongLong(x)) 3860 | } 3861 | } else { 3862 | #if CYTHON_USE_PYLONG_INTERNALS 3863 | const digit* digits = ((PyLongObject*)x)->ob_digit; 3864 | switch (Py_SIZE(x)) { 3865 | case 0: return (size_t) 0; 3866 | case -1: __PYX_VERIFY_RETURN_INT(size_t, sdigit, (sdigit) (-(sdigit)digits[0])) 3867 | case 1: __PYX_VERIFY_RETURN_INT(size_t, 
digit, +digits[0]) 3868 | case -2: 3869 | if (8 * sizeof(size_t) - 1 > 1 * PyLong_SHIFT) { 3870 | if (8 * sizeof(unsigned long) > 2 * PyLong_SHIFT) { 3871 | __PYX_VERIFY_RETURN_INT(size_t, long, -(long) (((((unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0]))) 3872 | } else if (8 * sizeof(size_t) - 1 > 2 * PyLong_SHIFT) { 3873 | return (size_t) (((size_t)-1)*(((((size_t)digits[1]) << PyLong_SHIFT) | (size_t)digits[0]))); 3874 | } 3875 | } 3876 | break; 3877 | case 2: 3878 | if (8 * sizeof(size_t) > 1 * PyLong_SHIFT) { 3879 | if (8 * sizeof(unsigned long) > 2 * PyLong_SHIFT) { 3880 | __PYX_VERIFY_RETURN_INT(size_t, unsigned long, (((((unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0]))) 3881 | } else if (8 * sizeof(size_t) - 1 > 2 * PyLong_SHIFT) { 3882 | return (size_t) ((((((size_t)digits[1]) << PyLong_SHIFT) | (size_t)digits[0]))); 3883 | } 3884 | } 3885 | break; 3886 | case -3: 3887 | if (8 * sizeof(size_t) - 1 > 2 * PyLong_SHIFT) { 3888 | if (8 * sizeof(unsigned long) > 3 * PyLong_SHIFT) { 3889 | __PYX_VERIFY_RETURN_INT(size_t, long, -(long) (((((((unsigned long)digits[2]) << PyLong_SHIFT) | (unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0]))) 3890 | } else if (8 * sizeof(size_t) - 1 > 3 * PyLong_SHIFT) { 3891 | return (size_t) (((size_t)-1)*(((((((size_t)digits[2]) << PyLong_SHIFT) | (size_t)digits[1]) << PyLong_SHIFT) | (size_t)digits[0]))); 3892 | } 3893 | } 3894 | break; 3895 | case 3: 3896 | if (8 * sizeof(size_t) > 2 * PyLong_SHIFT) { 3897 | if (8 * sizeof(unsigned long) > 3 * PyLong_SHIFT) { 3898 | __PYX_VERIFY_RETURN_INT(size_t, unsigned long, (((((((unsigned long)digits[2]) << PyLong_SHIFT) | (unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0]))) 3899 | } else if (8 * sizeof(size_t) - 1 > 3 * PyLong_SHIFT) { 3900 | return (size_t) ((((((((size_t)digits[2]) << PyLong_SHIFT) | (size_t)digits[1]) << PyLong_SHIFT) | (size_t)digits[0]))); 3901 | } 3902 | } 3903 | break; 3904 | case 
-4: 3905 | if (8 * sizeof(size_t) - 1 > 3 * PyLong_SHIFT) { 3906 | if (8 * sizeof(unsigned long) > 4 * PyLong_SHIFT) { 3907 | __PYX_VERIFY_RETURN_INT(size_t, long, -(long) (((((((((unsigned long)digits[3]) << PyLong_SHIFT) | (unsigned long)digits[2]) << PyLong_SHIFT) | (unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0]))) 3908 | } else if (8 * sizeof(size_t) - 1 > 4 * PyLong_SHIFT) { 3909 | return (size_t) (((size_t)-1)*(((((((((size_t)digits[3]) << PyLong_SHIFT) | (size_t)digits[2]) << PyLong_SHIFT) | (size_t)digits[1]) << PyLong_SHIFT) | (size_t)digits[0]))); 3910 | } 3911 | } 3912 | break; 3913 | case 4: 3914 | if (8 * sizeof(size_t) > 3 * PyLong_SHIFT) { 3915 | if (8 * sizeof(unsigned long) > 4 * PyLong_SHIFT) { 3916 | __PYX_VERIFY_RETURN_INT(size_t, unsigned long, (((((((((unsigned long)digits[3]) << PyLong_SHIFT) | (unsigned long)digits[2]) << PyLong_SHIFT) | (unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0]))) 3917 | } else if (8 * sizeof(size_t) - 1 > 4 * PyLong_SHIFT) { 3918 | return (size_t) ((((((((((size_t)digits[3]) << PyLong_SHIFT) | (size_t)digits[2]) << PyLong_SHIFT) | (size_t)digits[1]) << PyLong_SHIFT) | (size_t)digits[0]))); 3919 | } 3920 | } 3921 | break; 3922 | } 3923 | #endif 3924 | if (sizeof(size_t) <= sizeof(long)) { 3925 | __PYX_VERIFY_RETURN_INT_EXC(size_t, long, PyLong_AsLong(x)) 3926 | } else if (sizeof(size_t) <= sizeof(PY_LONG_LONG)) { 3927 | __PYX_VERIFY_RETURN_INT_EXC(size_t, PY_LONG_LONG, PyLong_AsLongLong(x)) 3928 | } 3929 | } 3930 | { 3931 | #if CYTHON_COMPILING_IN_PYPY && !defined(_PyLong_AsByteArray) 3932 | PyErr_SetString(PyExc_RuntimeError, 3933 | "_PyLong_AsByteArray() not available in PyPy, cannot convert large numbers"); 3934 | #else 3935 | size_t val; 3936 | PyObject *v = __Pyx_PyNumber_IntOrLong(x); 3937 | #if PY_MAJOR_VERSION < 3 3938 | if (likely(v) && !PyLong_Check(v)) { 3939 | PyObject *tmp = v; 3940 | v = PyNumber_Long(tmp); 3941 | Py_DECREF(tmp); 3942 | } 3943 | #endif 3944 | 
if (likely(v)) { 3945 | int one = 1; int is_little = (int)*(unsigned char *)&one; 3946 | unsigned char *bytes = (unsigned char *)&val; 3947 | int ret = _PyLong_AsByteArray((PyLongObject *)v, 3948 | bytes, sizeof(val), 3949 | is_little, !is_unsigned); 3950 | Py_DECREF(v); 3951 | if (likely(!ret)) 3952 | return val; 3953 | } 3954 | #endif 3955 | return (size_t) -1; 3956 | } 3957 | } else { 3958 | size_t val; 3959 | PyObject *tmp = __Pyx_PyNumber_IntOrLong(x); 3960 | if (!tmp) return (size_t) -1; 3961 | val = __Pyx_PyInt_As_size_t(tmp); 3962 | Py_DECREF(tmp); 3963 | return val; 3964 | } 3965 | raise_overflow: 3966 | PyErr_SetString(PyExc_OverflowError, 3967 | "value too large to convert to size_t"); 3968 | return (size_t) -1; 3969 | raise_neg_overflow: 3970 | PyErr_SetString(PyExc_OverflowError, 3971 | "can't convert negative value to size_t"); 3972 | return (size_t) -1; 3973 | } 3974 | 3975 | /* CIntFromPy */ 3976 | static CYTHON_INLINE long __Pyx_PyInt_As_long(PyObject *x) { 3977 | const long neg_one = (long) -1, const_zero = (long) 0; 3978 | const int is_unsigned = neg_one > const_zero; 3979 | #if PY_MAJOR_VERSION < 3 3980 | if (likely(PyInt_Check(x))) { 3981 | if (sizeof(long) < sizeof(long)) { 3982 | __PYX_VERIFY_RETURN_INT(long, long, PyInt_AS_LONG(x)) 3983 | } else { 3984 | long val = PyInt_AS_LONG(x); 3985 | if (is_unsigned && unlikely(val < 0)) { 3986 | goto raise_neg_overflow; 3987 | } 3988 | return (long) val; 3989 | } 3990 | } else 3991 | #endif 3992 | if (likely(PyLong_Check(x))) { 3993 | if (is_unsigned) { 3994 | #if CYTHON_USE_PYLONG_INTERNALS 3995 | const digit* digits = ((PyLongObject*)x)->ob_digit; 3996 | switch (Py_SIZE(x)) { 3997 | case 0: return (long) 0; 3998 | case 1: __PYX_VERIFY_RETURN_INT(long, digit, digits[0]) 3999 | case 2: 4000 | if (8 * sizeof(long) > 1 * PyLong_SHIFT) { 4001 | if (8 * sizeof(unsigned long) > 2 * PyLong_SHIFT) { 4002 | __PYX_VERIFY_RETURN_INT(long, unsigned long, (((((unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned 
long)digits[0]))) 4003 | } else if (8 * sizeof(long) >= 2 * PyLong_SHIFT) { 4004 | return (long) (((((long)digits[1]) << PyLong_SHIFT) | (long)digits[0])); 4005 | } 4006 | } 4007 | break; 4008 | case 3: 4009 | if (8 * sizeof(long) > 2 * PyLong_SHIFT) { 4010 | if (8 * sizeof(unsigned long) > 3 * PyLong_SHIFT) { 4011 | __PYX_VERIFY_RETURN_INT(long, unsigned long, (((((((unsigned long)digits[2]) << PyLong_SHIFT) | (unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0]))) 4012 | } else if (8 * sizeof(long) >= 3 * PyLong_SHIFT) { 4013 | return (long) (((((((long)digits[2]) << PyLong_SHIFT) | (long)digits[1]) << PyLong_SHIFT) | (long)digits[0])); 4014 | } 4015 | } 4016 | break; 4017 | case 4: 4018 | if (8 * sizeof(long) > 3 * PyLong_SHIFT) { 4019 | if (8 * sizeof(unsigned long) > 4 * PyLong_SHIFT) { 4020 | __PYX_VERIFY_RETURN_INT(long, unsigned long, (((((((((unsigned long)digits[3]) << PyLong_SHIFT) | (unsigned long)digits[2]) << PyLong_SHIFT) | (unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0]))) 4021 | } else if (8 * sizeof(long) >= 4 * PyLong_SHIFT) { 4022 | return (long) (((((((((long)digits[3]) << PyLong_SHIFT) | (long)digits[2]) << PyLong_SHIFT) | (long)digits[1]) << PyLong_SHIFT) | (long)digits[0])); 4023 | } 4024 | } 4025 | break; 4026 | } 4027 | #endif 4028 | #if CYTHON_COMPILING_IN_CPYTHON 4029 | if (unlikely(Py_SIZE(x) < 0)) { 4030 | goto raise_neg_overflow; 4031 | } 4032 | #else 4033 | { 4034 | int result = PyObject_RichCompareBool(x, Py_False, Py_LT); 4035 | if (unlikely(result < 0)) 4036 | return (long) -1; 4037 | if (unlikely(result == 1)) 4038 | goto raise_neg_overflow; 4039 | } 4040 | #endif 4041 | if (sizeof(long) <= sizeof(unsigned long)) { 4042 | __PYX_VERIFY_RETURN_INT_EXC(long, unsigned long, PyLong_AsUnsignedLong(x)) 4043 | } else if (sizeof(long) <= sizeof(unsigned PY_LONG_LONG)) { 4044 | __PYX_VERIFY_RETURN_INT_EXC(long, unsigned PY_LONG_LONG, PyLong_AsUnsignedLongLong(x)) 4045 | } 4046 | } else { 4047 | #if 
CYTHON_USE_PYLONG_INTERNALS 4048 | const digit* digits = ((PyLongObject*)x)->ob_digit; 4049 | switch (Py_SIZE(x)) { 4050 | case 0: return (long) 0; 4051 | case -1: __PYX_VERIFY_RETURN_INT(long, sdigit, (sdigit) (-(sdigit)digits[0])) 4052 | case 1: __PYX_VERIFY_RETURN_INT(long, digit, +digits[0]) 4053 | case -2: 4054 | if (8 * sizeof(long) - 1 > 1 * PyLong_SHIFT) { 4055 | if (8 * sizeof(unsigned long) > 2 * PyLong_SHIFT) { 4056 | __PYX_VERIFY_RETURN_INT(long, long, -(long) (((((unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0]))) 4057 | } else if (8 * sizeof(long) - 1 > 2 * PyLong_SHIFT) { 4058 | return (long) (((long)-1)*(((((long)digits[1]) << PyLong_SHIFT) | (long)digits[0]))); 4059 | } 4060 | } 4061 | break; 4062 | case 2: 4063 | if (8 * sizeof(long) > 1 * PyLong_SHIFT) { 4064 | if (8 * sizeof(unsigned long) > 2 * PyLong_SHIFT) { 4065 | __PYX_VERIFY_RETURN_INT(long, unsigned long, (((((unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0]))) 4066 | } else if (8 * sizeof(long) - 1 > 2 * PyLong_SHIFT) { 4067 | return (long) ((((((long)digits[1]) << PyLong_SHIFT) | (long)digits[0]))); 4068 | } 4069 | } 4070 | break; 4071 | case -3: 4072 | if (8 * sizeof(long) - 1 > 2 * PyLong_SHIFT) { 4073 | if (8 * sizeof(unsigned long) > 3 * PyLong_SHIFT) { 4074 | __PYX_VERIFY_RETURN_INT(long, long, -(long) (((((((unsigned long)digits[2]) << PyLong_SHIFT) | (unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0]))) 4075 | } else if (8 * sizeof(long) - 1 > 3 * PyLong_SHIFT) { 4076 | return (long) (((long)-1)*(((((((long)digits[2]) << PyLong_SHIFT) | (long)digits[1]) << PyLong_SHIFT) | (long)digits[0]))); 4077 | } 4078 | } 4079 | break; 4080 | case 3: 4081 | if (8 * sizeof(long) > 2 * PyLong_SHIFT) { 4082 | if (8 * sizeof(unsigned long) > 3 * PyLong_SHIFT) { 4083 | __PYX_VERIFY_RETURN_INT(long, unsigned long, (((((((unsigned long)digits[2]) << PyLong_SHIFT) | (unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0]))) 
4084 | } else if (8 * sizeof(long) - 1 > 3 * PyLong_SHIFT) { 4085 | return (long) ((((((((long)digits[2]) << PyLong_SHIFT) | (long)digits[1]) << PyLong_SHIFT) | (long)digits[0]))); 4086 | } 4087 | } 4088 | break; 4089 | case -4: 4090 | if (8 * sizeof(long) - 1 > 3 * PyLong_SHIFT) { 4091 | if (8 * sizeof(unsigned long) > 4 * PyLong_SHIFT) { 4092 | __PYX_VERIFY_RETURN_INT(long, long, -(long) (((((((((unsigned long)digits[3]) << PyLong_SHIFT) | (unsigned long)digits[2]) << PyLong_SHIFT) | (unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0]))) 4093 | } else if (8 * sizeof(long) - 1 > 4 * PyLong_SHIFT) { 4094 | return (long) (((long)-1)*(((((((((long)digits[3]) << PyLong_SHIFT) | (long)digits[2]) << PyLong_SHIFT) | (long)digits[1]) << PyLong_SHIFT) | (long)digits[0]))); 4095 | } 4096 | } 4097 | break; 4098 | case 4: 4099 | if (8 * sizeof(long) > 3 * PyLong_SHIFT) { 4100 | if (8 * sizeof(unsigned long) > 4 * PyLong_SHIFT) { 4101 | __PYX_VERIFY_RETURN_INT(long, unsigned long, (((((((((unsigned long)digits[3]) << PyLong_SHIFT) | (unsigned long)digits[2]) << PyLong_SHIFT) | (unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0]))) 4102 | } else if (8 * sizeof(long) - 1 > 4 * PyLong_SHIFT) { 4103 | return (long) ((((((((((long)digits[3]) << PyLong_SHIFT) | (long)digits[2]) << PyLong_SHIFT) | (long)digits[1]) << PyLong_SHIFT) | (long)digits[0]))); 4104 | } 4105 | } 4106 | break; 4107 | } 4108 | #endif 4109 | if (sizeof(long) <= sizeof(long)) { 4110 | __PYX_VERIFY_RETURN_INT_EXC(long, long, PyLong_AsLong(x)) 4111 | } else if (sizeof(long) <= sizeof(PY_LONG_LONG)) { 4112 | __PYX_VERIFY_RETURN_INT_EXC(long, PY_LONG_LONG, PyLong_AsLongLong(x)) 4113 | } 4114 | } 4115 | { 4116 | #if CYTHON_COMPILING_IN_PYPY && !defined(_PyLong_AsByteArray) 4117 | PyErr_SetString(PyExc_RuntimeError, 4118 | "_PyLong_AsByteArray() not available in PyPy, cannot convert large numbers"); 4119 | #else 4120 | long val; 4121 | PyObject *v = __Pyx_PyNumber_IntOrLong(x); 4122 
| #if PY_MAJOR_VERSION < 3 4123 | if (likely(v) && !PyLong_Check(v)) { 4124 | PyObject *tmp = v; 4125 | v = PyNumber_Long(tmp); 4126 | Py_DECREF(tmp); 4127 | } 4128 | #endif 4129 | if (likely(v)) { 4130 | int one = 1; int is_little = (int)*(unsigned char *)&one; 4131 | unsigned char *bytes = (unsigned char *)&val; 4132 | int ret = _PyLong_AsByteArray((PyLongObject *)v, 4133 | bytes, sizeof(val), 4134 | is_little, !is_unsigned); 4135 | Py_DECREF(v); 4136 | if (likely(!ret)) 4137 | return val; 4138 | } 4139 | #endif 4140 | return (long) -1; 4141 | } 4142 | } else { 4143 | long val; 4144 | PyObject *tmp = __Pyx_PyNumber_IntOrLong(x); 4145 | if (!tmp) return (long) -1; 4146 | val = __Pyx_PyInt_As_long(tmp); 4147 | Py_DECREF(tmp); 4148 | return val; 4149 | } 4150 | raise_overflow: 4151 | PyErr_SetString(PyExc_OverflowError, 4152 | "value too large to convert to long"); 4153 | return (long) -1; 4154 | raise_neg_overflow: 4155 | PyErr_SetString(PyExc_OverflowError, 4156 | "can't convert negative value to long"); 4157 | return (long) -1; 4158 | } 4159 | 4160 | /* CIntFromPy */ 4161 | static CYTHON_INLINE int __Pyx_PyInt_As_int(PyObject *x) { 4162 | const int neg_one = (int) -1, const_zero = (int) 0; 4163 | const int is_unsigned = neg_one > const_zero; 4164 | #if PY_MAJOR_VERSION < 3 4165 | if (likely(PyInt_Check(x))) { 4166 | if (sizeof(int) < sizeof(long)) { 4167 | __PYX_VERIFY_RETURN_INT(int, long, PyInt_AS_LONG(x)) 4168 | } else { 4169 | long val = PyInt_AS_LONG(x); 4170 | if (is_unsigned && unlikely(val < 0)) { 4171 | goto raise_neg_overflow; 4172 | } 4173 | return (int) val; 4174 | } 4175 | } else 4176 | #endif 4177 | if (likely(PyLong_Check(x))) { 4178 | if (is_unsigned) { 4179 | #if CYTHON_USE_PYLONG_INTERNALS 4180 | const digit* digits = ((PyLongObject*)x)->ob_digit; 4181 | switch (Py_SIZE(x)) { 4182 | case 0: return (int) 0; 4183 | case 1: __PYX_VERIFY_RETURN_INT(int, digit, digits[0]) 4184 | case 2: 4185 | if (8 * sizeof(int) > 1 * PyLong_SHIFT) { 4186 | if (8 * 
sizeof(unsigned long) > 2 * PyLong_SHIFT) { 4187 | __PYX_VERIFY_RETURN_INT(int, unsigned long, (((((unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0]))) 4188 | } else if (8 * sizeof(int) >= 2 * PyLong_SHIFT) { 4189 | return (int) (((((int)digits[1]) << PyLong_SHIFT) | (int)digits[0])); 4190 | } 4191 | } 4192 | break; 4193 | case 3: 4194 | if (8 * sizeof(int) > 2 * PyLong_SHIFT) { 4195 | if (8 * sizeof(unsigned long) > 3 * PyLong_SHIFT) { 4196 | __PYX_VERIFY_RETURN_INT(int, unsigned long, (((((((unsigned long)digits[2]) << PyLong_SHIFT) | (unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0]))) 4197 | } else if (8 * sizeof(int) >= 3 * PyLong_SHIFT) { 4198 | return (int) (((((((int)digits[2]) << PyLong_SHIFT) | (int)digits[1]) << PyLong_SHIFT) | (int)digits[0])); 4199 | } 4200 | } 4201 | break; 4202 | case 4: 4203 | if (8 * sizeof(int) > 3 * PyLong_SHIFT) { 4204 | if (8 * sizeof(unsigned long) > 4 * PyLong_SHIFT) { 4205 | __PYX_VERIFY_RETURN_INT(int, unsigned long, (((((((((unsigned long)digits[3]) << PyLong_SHIFT) | (unsigned long)digits[2]) << PyLong_SHIFT) | (unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0]))) 4206 | } else if (8 * sizeof(int) >= 4 * PyLong_SHIFT) { 4207 | return (int) (((((((((int)digits[3]) << PyLong_SHIFT) | (int)digits[2]) << PyLong_SHIFT) | (int)digits[1]) << PyLong_SHIFT) | (int)digits[0])); 4208 | } 4209 | } 4210 | break; 4211 | } 4212 | #endif 4213 | #if CYTHON_COMPILING_IN_CPYTHON 4214 | if (unlikely(Py_SIZE(x) < 0)) { 4215 | goto raise_neg_overflow; 4216 | } 4217 | #else 4218 | { 4219 | int result = PyObject_RichCompareBool(x, Py_False, Py_LT); 4220 | if (unlikely(result < 0)) 4221 | return (int) -1; 4222 | if (unlikely(result == 1)) 4223 | goto raise_neg_overflow; 4224 | } 4225 | #endif 4226 | if (sizeof(int) <= sizeof(unsigned long)) { 4227 | __PYX_VERIFY_RETURN_INT_EXC(int, unsigned long, PyLong_AsUnsignedLong(x)) 4228 | } else if (sizeof(int) <= sizeof(unsigned PY_LONG_LONG)) { 
4229 | __PYX_VERIFY_RETURN_INT_EXC(int, unsigned PY_LONG_LONG, PyLong_AsUnsignedLongLong(x)) 4230 | } 4231 | } else { 4232 | #if CYTHON_USE_PYLONG_INTERNALS 4233 | const digit* digits = ((PyLongObject*)x)->ob_digit; 4234 | switch (Py_SIZE(x)) { 4235 | case 0: return (int) 0; 4236 | case -1: __PYX_VERIFY_RETURN_INT(int, sdigit, (sdigit) (-(sdigit)digits[0])) 4237 | case 1: __PYX_VERIFY_RETURN_INT(int, digit, +digits[0]) 4238 | case -2: 4239 | if (8 * sizeof(int) - 1 > 1 * PyLong_SHIFT) { 4240 | if (8 * sizeof(unsigned long) > 2 * PyLong_SHIFT) { 4241 | __PYX_VERIFY_RETURN_INT(int, long, -(long) (((((unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0]))) 4242 | } else if (8 * sizeof(int) - 1 > 2 * PyLong_SHIFT) { 4243 | return (int) (((int)-1)*(((((int)digits[1]) << PyLong_SHIFT) | (int)digits[0]))); 4244 | } 4245 | } 4246 | break; 4247 | case 2: 4248 | if (8 * sizeof(int) > 1 * PyLong_SHIFT) { 4249 | if (8 * sizeof(unsigned long) > 2 * PyLong_SHIFT) { 4250 | __PYX_VERIFY_RETURN_INT(int, unsigned long, (((((unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0]))) 4251 | } else if (8 * sizeof(int) - 1 > 2 * PyLong_SHIFT) { 4252 | return (int) ((((((int)digits[1]) << PyLong_SHIFT) | (int)digits[0]))); 4253 | } 4254 | } 4255 | break; 4256 | case -3: 4257 | if (8 * sizeof(int) - 1 > 2 * PyLong_SHIFT) { 4258 | if (8 * sizeof(unsigned long) > 3 * PyLong_SHIFT) { 4259 | __PYX_VERIFY_RETURN_INT(int, long, -(long) (((((((unsigned long)digits[2]) << PyLong_SHIFT) | (unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0]))) 4260 | } else if (8 * sizeof(int) - 1 > 3 * PyLong_SHIFT) { 4261 | return (int) (((int)-1)*(((((((int)digits[2]) << PyLong_SHIFT) | (int)digits[1]) << PyLong_SHIFT) | (int)digits[0]))); 4262 | } 4263 | } 4264 | break; 4265 | case 3: 4266 | if (8 * sizeof(int) > 2 * PyLong_SHIFT) { 4267 | if (8 * sizeof(unsigned long) > 3 * PyLong_SHIFT) { 4268 | __PYX_VERIFY_RETURN_INT(int, unsigned long, (((((((unsigned 
long)digits[2]) << PyLong_SHIFT) | (unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0]))) 4269 | } else if (8 * sizeof(int) - 1 > 3 * PyLong_SHIFT) { 4270 | return (int) ((((((((int)digits[2]) << PyLong_SHIFT) | (int)digits[1]) << PyLong_SHIFT) | (int)digits[0]))); 4271 | } 4272 | } 4273 | break; 4274 | case -4: 4275 | if (8 * sizeof(int) - 1 > 3 * PyLong_SHIFT) { 4276 | if (8 * sizeof(unsigned long) > 4 * PyLong_SHIFT) { 4277 | __PYX_VERIFY_RETURN_INT(int, long, -(long) (((((((((unsigned long)digits[3]) << PyLong_SHIFT) | (unsigned long)digits[2]) << PyLong_SHIFT) | (unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0]))) 4278 | } else if (8 * sizeof(int) - 1 > 4 * PyLong_SHIFT) { 4279 | return (int) (((int)-1)*(((((((((int)digits[3]) << PyLong_SHIFT) | (int)digits[2]) << PyLong_SHIFT) | (int)digits[1]) << PyLong_SHIFT) | (int)digits[0]))); 4280 | } 4281 | } 4282 | break; 4283 | case 4: 4284 | if (8 * sizeof(int) > 3 * PyLong_SHIFT) { 4285 | if (8 * sizeof(unsigned long) > 4 * PyLong_SHIFT) { 4286 | __PYX_VERIFY_RETURN_INT(int, unsigned long, (((((((((unsigned long)digits[3]) << PyLong_SHIFT) | (unsigned long)digits[2]) << PyLong_SHIFT) | (unsigned long)digits[1]) << PyLong_SHIFT) | (unsigned long)digits[0]))) 4287 | } else if (8 * sizeof(int) - 1 > 4 * PyLong_SHIFT) { 4288 | return (int) ((((((((((int)digits[3]) << PyLong_SHIFT) | (int)digits[2]) << PyLong_SHIFT) | (int)digits[1]) << PyLong_SHIFT) | (int)digits[0]))); 4289 | } 4290 | } 4291 | break; 4292 | } 4293 | #endif 4294 | if (sizeof(int) <= sizeof(long)) { 4295 | __PYX_VERIFY_RETURN_INT_EXC(int, long, PyLong_AsLong(x)) 4296 | } else if (sizeof(int) <= sizeof(PY_LONG_LONG)) { 4297 | __PYX_VERIFY_RETURN_INT_EXC(int, PY_LONG_LONG, PyLong_AsLongLong(x)) 4298 | } 4299 | } 4300 | { 4301 | #if CYTHON_COMPILING_IN_PYPY && !defined(_PyLong_AsByteArray) 4302 | PyErr_SetString(PyExc_RuntimeError, 4303 | "_PyLong_AsByteArray() not available in PyPy, cannot convert large numbers"); 
4304 | #else 4305 | int val; 4306 | PyObject *v = __Pyx_PyNumber_IntOrLong(x); 4307 | #if PY_MAJOR_VERSION < 3 4308 | if (likely(v) && !PyLong_Check(v)) { 4309 | PyObject *tmp = v; 4310 | v = PyNumber_Long(tmp); 4311 | Py_DECREF(tmp); 4312 | } 4313 | #endif 4314 | if (likely(v)) { 4315 | int one = 1; int is_little = (int)*(unsigned char *)&one; 4316 | unsigned char *bytes = (unsigned char *)&val; 4317 | int ret = _PyLong_AsByteArray((PyLongObject *)v, 4318 | bytes, sizeof(val), 4319 | is_little, !is_unsigned); 4320 | Py_DECREF(v); 4321 | if (likely(!ret)) 4322 | return val; 4323 | } 4324 | #endif 4325 | return (int) -1; 4326 | } 4327 | } else { 4328 | int val; 4329 | PyObject *tmp = __Pyx_PyNumber_IntOrLong(x); 4330 | if (!tmp) return (int) -1; 4331 | val = __Pyx_PyInt_As_int(tmp); 4332 | Py_DECREF(tmp); 4333 | return val; 4334 | } 4335 | raise_overflow: 4336 | PyErr_SetString(PyExc_OverflowError, 4337 | "value too large to convert to int"); 4338 | return (int) -1; 4339 | raise_neg_overflow: 4340 | PyErr_SetString(PyExc_OverflowError, 4341 | "can't convert negative value to int"); 4342 | return (int) -1; 4343 | } 4344 | 4345 | /* CheckBinaryVersion */ 4346 | static int __Pyx_check_binary_version(void) { 4347 | char ctversion[4], rtversion[4]; 4348 | PyOS_snprintf(ctversion, 4, "%d.%d", PY_MAJOR_VERSION, PY_MINOR_VERSION); 4349 | PyOS_snprintf(rtversion, 4, "%s", Py_GetVersion()); 4350 | if (ctversion[0] != rtversion[0] || ctversion[2] != rtversion[2]) { 4351 | char message[200]; 4352 | PyOS_snprintf(message, sizeof(message), 4353 | "compiletime version %s of module '%.100s' " 4354 | "does not match runtime version %s", 4355 | ctversion, __Pyx_MODULE_NAME, rtversion); 4356 | return PyErr_WarnEx(NULL, message, 1); 4357 | } 4358 | return 0; 4359 | } 4360 | 4361 | /* InitStrings */ 4362 | static int __Pyx_InitStrings(__Pyx_StringTabEntry *t) { 4363 | while (t->p) { 4364 | #if PY_MAJOR_VERSION < 3 4365 | if (t->is_unicode) { 4366 | *t->p = PyUnicode_DecodeUTF8(t->s, 
t->n - 1, NULL); 4367 | } else if (t->intern) { 4368 | *t->p = PyString_InternFromString(t->s); 4369 | } else { 4370 | *t->p = PyString_FromStringAndSize(t->s, t->n - 1); 4371 | } 4372 | #else 4373 | if (t->is_unicode | t->is_str) { 4374 | if (t->intern) { 4375 | *t->p = PyUnicode_InternFromString(t->s); 4376 | } else if (t->encoding) { 4377 | *t->p = PyUnicode_Decode(t->s, t->n - 1, t->encoding, NULL); 4378 | } else { 4379 | *t->p = PyUnicode_FromStringAndSize(t->s, t->n - 1); 4380 | } 4381 | } else { 4382 | *t->p = PyBytes_FromStringAndSize(t->s, t->n - 1); 4383 | } 4384 | #endif 4385 | if (!*t->p) 4386 | return -1; 4387 | ++t; 4388 | } 4389 | return 0; 4390 | } 4391 | 4392 | static CYTHON_INLINE PyObject* __Pyx_PyUnicode_FromString(const char* c_str) { 4393 | return __Pyx_PyUnicode_FromStringAndSize(c_str, (Py_ssize_t)strlen(c_str)); 4394 | } 4395 | static CYTHON_INLINE char* __Pyx_PyObject_AsString(PyObject* o) { 4396 | Py_ssize_t ignore; 4397 | return __Pyx_PyObject_AsStringAndSize(o, &ignore); 4398 | } 4399 | static CYTHON_INLINE char* __Pyx_PyObject_AsStringAndSize(PyObject* o, Py_ssize_t *length) { 4400 | #if CYTHON_COMPILING_IN_CPYTHON && (__PYX_DEFAULT_STRING_ENCODING_IS_ASCII || __PYX_DEFAULT_STRING_ENCODING_IS_DEFAULT) 4401 | if ( 4402 | #if PY_MAJOR_VERSION < 3 && __PYX_DEFAULT_STRING_ENCODING_IS_ASCII 4403 | __Pyx_sys_getdefaultencoding_not_ascii && 4404 | #endif 4405 | PyUnicode_Check(o)) { 4406 | #if PY_VERSION_HEX < 0x03030000 4407 | char* defenc_c; 4408 | PyObject* defenc = _PyUnicode_AsDefaultEncodedString(o, NULL); 4409 | if (!defenc) return NULL; 4410 | defenc_c = PyBytes_AS_STRING(defenc); 4411 | #if __PYX_DEFAULT_STRING_ENCODING_IS_ASCII 4412 | { 4413 | char* end = defenc_c + PyBytes_GET_SIZE(defenc); 4414 | char* c; 4415 | for (c = defenc_c; c < end; c++) { 4416 | if ((unsigned char) (*c) >= 128) { 4417 | PyUnicode_AsASCIIString(o); 4418 | return NULL; 4419 | } 4420 | } 4421 | } 4422 | #endif 4423 | *length = PyBytes_GET_SIZE(defenc); 4424 | 
return defenc_c; 4425 | #else 4426 | if (__Pyx_PyUnicode_READY(o) == -1) return NULL; 4427 | #if __PYX_DEFAULT_STRING_ENCODING_IS_ASCII 4428 | if (PyUnicode_IS_ASCII(o)) { 4429 | *length = PyUnicode_GET_LENGTH(o); 4430 | return PyUnicode_AsUTF8(o); 4431 | } else { 4432 | PyUnicode_AsASCIIString(o); 4433 | return NULL; 4434 | } 4435 | #else 4436 | return PyUnicode_AsUTF8AndSize(o, length); 4437 | #endif 4438 | #endif 4439 | } else 4440 | #endif 4441 | #if (!CYTHON_COMPILING_IN_PYPY) || (defined(PyByteArray_AS_STRING) && defined(PyByteArray_GET_SIZE)) 4442 | if (PyByteArray_Check(o)) { 4443 | *length = PyByteArray_GET_SIZE(o); 4444 | return PyByteArray_AS_STRING(o); 4445 | } else 4446 | #endif 4447 | { 4448 | char* result; 4449 | int r = PyBytes_AsStringAndSize(o, &result, length); 4450 | if (unlikely(r < 0)) { 4451 | return NULL; 4452 | } else { 4453 | return result; 4454 | } 4455 | } 4456 | } 4457 | static CYTHON_INLINE int __Pyx_PyObject_IsTrue(PyObject* x) { 4458 | int is_true = x == Py_True; 4459 | if (is_true | (x == Py_False) | (x == Py_None)) return is_true; 4460 | else return PyObject_IsTrue(x); 4461 | } 4462 | static CYTHON_INLINE PyObject* __Pyx_PyNumber_IntOrLong(PyObject* x) { 4463 | PyNumberMethods *m; 4464 | const char *name = NULL; 4465 | PyObject *res = NULL; 4466 | #if PY_MAJOR_VERSION < 3 4467 | if (PyInt_Check(x) || PyLong_Check(x)) 4468 | #else 4469 | if (PyLong_Check(x)) 4470 | #endif 4471 | return __Pyx_NewRef(x); 4472 | m = Py_TYPE(x)->tp_as_number; 4473 | #if PY_MAJOR_VERSION < 3 4474 | if (m && m->nb_int) { 4475 | name = "int"; 4476 | res = PyNumber_Int(x); 4477 | } 4478 | else if (m && m->nb_long) { 4479 | name = "long"; 4480 | res = PyNumber_Long(x); 4481 | } 4482 | #else 4483 | if (m && m->nb_int) { 4484 | name = "int"; 4485 | res = PyNumber_Long(x); 4486 | } 4487 | #endif 4488 | if (res) { 4489 | #if PY_MAJOR_VERSION < 3 4490 | if (!PyInt_Check(res) && !PyLong_Check(res)) { 4491 | #else 4492 | if (!PyLong_Check(res)) { 4493 | #endif 4494 
| PyErr_Format(PyExc_TypeError, 4495 | "__%.4s__ returned non-%.4s (type %.200s)", 4496 | name, name, Py_TYPE(res)->tp_name); 4497 | Py_DECREF(res); 4498 | return NULL; 4499 | } 4500 | } 4501 | else if (!PyErr_Occurred()) { 4502 | PyErr_SetString(PyExc_TypeError, 4503 | "an integer is required"); 4504 | } 4505 | return res; 4506 | } 4507 | static CYTHON_INLINE Py_ssize_t __Pyx_PyIndex_AsSsize_t(PyObject* b) { 4508 | Py_ssize_t ival; 4509 | PyObject *x; 4510 | #if PY_MAJOR_VERSION < 3 4511 | if (likely(PyInt_CheckExact(b))) { 4512 | if (sizeof(Py_ssize_t) >= sizeof(long)) 4513 | return PyInt_AS_LONG(b); 4514 | else 4515 | return PyInt_AsSsize_t(x); 4516 | } 4517 | #endif 4518 | if (likely(PyLong_CheckExact(b))) { 4519 | #if CYTHON_USE_PYLONG_INTERNALS 4520 | const digit* digits = ((PyLongObject*)b)->ob_digit; 4521 | const Py_ssize_t size = Py_SIZE(b); 4522 | if (likely(__Pyx_sst_abs(size) <= 1)) { 4523 | ival = likely(size) ? digits[0] : 0; 4524 | if (size == -1) ival = -ival; 4525 | return ival; 4526 | } else { 4527 | switch (size) { 4528 | case 2: 4529 | if (8 * sizeof(Py_ssize_t) > 2 * PyLong_SHIFT) { 4530 | return (Py_ssize_t) (((((size_t)digits[1]) << PyLong_SHIFT) | (size_t)digits[0])); 4531 | } 4532 | break; 4533 | case -2: 4534 | if (8 * sizeof(Py_ssize_t) > 2 * PyLong_SHIFT) { 4535 | return -(Py_ssize_t) (((((size_t)digits[1]) << PyLong_SHIFT) | (size_t)digits[0])); 4536 | } 4537 | break; 4538 | case 3: 4539 | if (8 * sizeof(Py_ssize_t) > 3 * PyLong_SHIFT) { 4540 | return (Py_ssize_t) (((((((size_t)digits[2]) << PyLong_SHIFT) | (size_t)digits[1]) << PyLong_SHIFT) | (size_t)digits[0])); 4541 | } 4542 | break; 4543 | case -3: 4544 | if (8 * sizeof(Py_ssize_t) > 3 * PyLong_SHIFT) { 4545 | return -(Py_ssize_t) (((((((size_t)digits[2]) << PyLong_SHIFT) | (size_t)digits[1]) << PyLong_SHIFT) | (size_t)digits[0])); 4546 | } 4547 | break; 4548 | case 4: 4549 | if (8 * sizeof(Py_ssize_t) > 4 * PyLong_SHIFT) { 4550 | return (Py_ssize_t) (((((((((size_t)digits[3]) << 
PyLong_SHIFT) | (size_t)digits[2]) << PyLong_SHIFT) | (size_t)digits[1]) << PyLong_SHIFT) | (size_t)digits[0])); 4551 | } 4552 | break; 4553 | case -4: 4554 | if (8 * sizeof(Py_ssize_t) > 4 * PyLong_SHIFT) { 4555 | return -(Py_ssize_t) (((((((((size_t)digits[3]) << PyLong_SHIFT) | (size_t)digits[2]) << PyLong_SHIFT) | (size_t)digits[1]) << PyLong_SHIFT) | (size_t)digits[0])); 4556 | } 4557 | break; 4558 | } 4559 | } 4560 | #endif 4561 | return PyLong_AsSsize_t(b); 4562 | } 4563 | x = PyNumber_Index(b); 4564 | if (!x) return -1; 4565 | ival = PyInt_AsSsize_t(x); 4566 | Py_DECREF(x); 4567 | return ival; 4568 | } 4569 | static CYTHON_INLINE PyObject * __Pyx_PyInt_FromSize_t(size_t ival) { 4570 | return PyInt_FromSize_t(ival); 4571 | } 4572 | 4573 | 4574 | #endif /* Py_PYTHON_H */ 4575 | -------------------------------------------------------------------------------- /simhash/simhash.pxd: -------------------------------------------------------------------------------- 1 | ################################################################################ 2 | # Cython declarations 3 | ################################################################################ 4 | 5 | from libcpp.vector cimport vector 6 | from libcpp.utility cimport pair 7 | from libcpp.unordered_set cimport unordered_set 8 | 9 | cdef extern from "stdint.h": 10 | ctypedef unsigned long long uint64_t 11 | ctypedef long long int64_t 12 | ctypedef unsigned int size_t 13 | 14 | cdef extern from "simhash-cpp/include/simhash.h" namespace "Simhash": 15 | ctypedef uint64_t hash_t 16 | ctypedef pair[hash_t, hash_t] match_t 17 | 18 | cppclass match_t_hash: 19 | size_t operator()(const match_t& v) const 20 | 21 | ctypedef unordered_set[match_t, match_t_hash] matches_t 22 | 23 | cpdef size_t num_differing_bits(hash_t a, hash_t b) 24 | hash_t compute(const vector[hash_t]& hashes) 25 | matches_t find_all(unordered_set[hash_t] hashes, 26 | size_t number_of_blocks, 27 | size_t different_bits) 28 | 
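The declarations above expose `num_differing_bits` and `compute` from simhash-cpp, whose source is not part of this listing. As a rough pure-Python sketch of the behavior that the expectations in test/test.py imply, and assuming a 64-bit `hash_t`, the two functions could be modeled as below. The names `py_num_differing_bits` and `py_compute` are hypothetical and exist only for illustration; the real C++ implementation may differ in detail.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF  # assumption: hash_t is an unsigned 64-bit value


def py_num_differing_bits(a, b):
    """Hamming distance: count the bit positions where a and b differ."""
    return bin((a ^ b) & MASK64).count("1")


def py_compute(hashes):
    """Bitwise majority vote over the input hashes.

    Each bit position gets +1 for every hash with that bit set and -1
    otherwise. The output bit is set only when the tally is strictly
    positive, so ties (and an empty input) yield a 0 bit.
    """
    counts = [0] * 64
    for h in hashes:
        for i in range(64):
            counts[i] += 1 if (h >> i) & 1 else -1
    result = 0
    for i, tally in enumerate(counts):
        if tally > 0:
            result |= 1 << i
    return result
```

Treating ties as 0 is inferred from `test_inverse`, where two bitwise-complementary hashes are expected to produce a simhash of 0.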
-------------------------------------------------------------------------------- /simhash/simhash.pyx: -------------------------------------------------------------------------------- 1 | import hashlib 2 | import struct 3 | 4 | from simhash cimport compute as c_compute 5 | from simhash cimport find_all as c_find_all 6 | 7 | 8 | def unsigned_hash(bytes obj): 9 | '''Returns a hash suitable for use as a hash_t.''' 10 | # Takes first 8 bytes of MD5 digest 11 | digest = hashlib.md5(obj).digest()[0:8] 12 | # Unpacks the binary bytes in digest into a Python integer 13 | return struct.unpack('>Q', digest)[0] & 0xFFFFFFFFFFFFFFFF 14 | 15 | def compute(hashes): 16 | '''Compute the simhash of a vector of hashes.''' 17 | return c_compute(hashes) 18 | 19 | def find_all(hashes, number_of_blocks, different_bits): 20 | ''' 21 | Find the set of all matches within the provided vector of hashes. 22 | 23 | The provided hashes are manipulated in place, but upon completion are 24 | restored to their original state. 25 | ''' 26 | cdef matches_t results_set = c_find_all(hashes, number_of_blocks, different_bits) 27 | cdef vector[match_t] results_vector 28 | results_vector.assign(results_set.begin(), results_set.end()) 29 | return results_vector 30 | -------------------------------------------------------------------------------- /test/test.py: -------------------------------------------------------------------------------- 1 | #! 
/usr/bin/env python 2 | 3 | import re 4 | import unittest 5 | 6 | import simhash 7 | 8 | 9 | class TestNumDifferingBits(unittest.TestCase): 10 | '''Tests about num_differing_bits''' 11 | 12 | def test_basic(self): 13 | a = 0xDEADBEEF 14 | b = 0xDEADBEAD 15 | self.assertEqual(2, simhash.num_differing_bits(a, b)) 16 | 17 | 18 | class TestCompute(unittest.TestCase): 19 | '''Tests about computing a simhash.''' 20 | 21 | def test_empty(self): 22 | self.assertEqual(0, simhash.compute([])) 23 | 24 | def test_repeat(self): 25 | number = 0xDEADBEEF 26 | self.assertEqual(number, simhash.compute([number] * 100)) 27 | 28 | def test_inverse(self): 29 | hashes = [0xDEADBEEFDEADBEEF, 0x2152411021524110] 30 | self.assertEqual(64, simhash.num_differing_bits(*hashes)) 31 | self.assertEqual(0, simhash.compute(hashes)) 32 | 33 | def test_basic(self): 34 | hashes = [0xABCD, 0xBCDE, 0xCDEF] 35 | self.assertEqual(0xADCF, simhash.compute(hashes)) 36 | 37 | 38 | class TestFindAll(unittest.TestCase): 39 | '''Tests about find_all.''' 40 | 41 | def test_basic(self): 42 | hashes = [ 43 | 0x000000FF, 0x000000EF, 0x000000EE, 0x000000CE, 0x00000033, 44 | 0x0000FF00, 0x0000EF00, 0x0000EE00, 0x0000CE00, 0x00003300, 45 | 0x00FF0000, 0x00EF0000, 0x00EE0000, 0x00CE0000, 0x00330000, 46 | 0xFF000000, 0xEF000000, 0xEE000000, 0xCE000000, 0x33000000 47 | ] 48 | expected = [ 49 | (0x000000EF, 0x000000FF), 50 | (0x000000EE, 0x000000EF), 51 | (0x000000EE, 0x000000FF), 52 | (0x000000CE, 0x000000EE), 53 | (0x000000CE, 0x000000EF), 54 | (0x000000CE, 0x000000FF), 55 | (0x0000EF00, 0x0000FF00), 56 | (0x0000EE00, 0x0000EF00), 57 | (0x0000EE00, 0x0000FF00), 58 | (0x0000CE00, 0x0000EE00), 59 | (0x0000CE00, 0x0000EF00), 60 | (0x0000CE00, 0x0000FF00), 61 | (0x00EF0000, 0x00FF0000), 62 | (0x00EE0000, 0x00EF0000), 63 | (0x00EE0000, 0x00FF0000), 64 | (0x00CE0000, 0x00EE0000), 65 | (0x00CE0000, 0x00EF0000), 66 | (0x00CE0000, 0x00FF0000), 67 | (0xEF000000, 0xFF000000), 68 | (0xEE000000, 0xEF000000), 69 | (0xEE000000, 
0xFF000000), 70 | (0xCE000000, 0xEE000000), 71 | (0xCE000000, 0xEF000000), 72 | (0xCE000000, 0xFF000000) 73 | ] 74 | for blocks in range(4, 10): 75 | self.assertEqual( 76 | sorted(expected), sorted(simhash.find_all(hashes, blocks, 3))) 77 | 78 | def test_diverse(self): 79 | hashes = [ 80 | 0x00000000, 0x10101000, 0x10100010, 0x10001010, 0x00101010, 81 | 0x01010100, 0x01010001, 0x01000101, 0x00010101 82 | ] 83 | expected = [ 84 | (0x00000000, 0x10101000), 85 | (0x00000000, 0x10100010), 86 | (0x00000000, 0x10001010), 87 | (0x00000000, 0x00101010), 88 | (0x00000000, 0x01010100), 89 | (0x00000000, 0x01010001), 90 | (0x00000000, 0x01000101), 91 | (0x00000000, 0x00010101), 92 | (0x00101010, 0x10001010), 93 | (0x00101010, 0x10100010), 94 | (0x00101010, 0x10101000), 95 | (0x10001010, 0x10100010), 96 | (0x10001010, 0x10101000), 97 | (0x10100010, 0x10101000), 98 | (0x00010101, 0x01000101), 99 | (0x00010101, 0x01010001), 100 | (0x00010101, 0x01010100), 101 | (0x01000101, 0x01010001), 102 | (0x01000101, 0x01010100), 103 | (0x01010001, 0x01010100) 104 | ] 105 | for blocks in range(4, 10): 106 | self.assertEqual( 107 | sorted(expected), sorted(simhash.find_all(hashes, blocks, 3))) 108 | 109 | 110 | class TestShingle(unittest.TestCase): 111 | '''Tests about computing shingles of tokens.''' 112 | 113 | def test_fewer_than_window(self): 114 | tokens = list(range(3)) 115 | self.assertEqual([], list(simhash.shingle(tokens, 4))) 116 | 117 | def test_zero_window_size(self): 118 | tokens = list(range(10)) 119 | with self.assertRaises(ValueError): 120 | list(simhash.shingle(tokens, 0)) 121 | 122 | def test_negative_window_size(self): 123 | tokens = list(range(10)) 124 | with self.assertRaises(ValueError): 125 | list(simhash.shingle(tokens, -1)) 126 | 127 | def test_basic(self): 128 | tokens = list(range(10)) 129 | expected = [ 130 | [0, 1, 2, 3], 131 | [1, 2, 3, 4], 132 | [2, 3, 4, 5], 133 | [3, 4, 5, 6], 134 | [4, 5, 6, 7], 135 | [5, 6, 7, 8], 136 | [6, 7, 8, 9] 137 | ] 138 | 
self.assertEqual(expected, list(simhash.shingle(tokens, 4))) 139 | 140 | 141 | class TestFunctional(unittest.TestCase): 142 | '''Can the tool be used functionally.''' 143 | 144 | MATCH_THRESHOLD = 3 145 | 146 | jabberwocky = ''' 147 | Twas brillig, and the slithy toves 148 | Did gyre and gimble in the wabe: 149 | All mimsy were the borogoves, 150 | And the mome raths outgrabe. 151 | "Beware the Jabberwock, my son! 152 | The jaws that bite, the claws that catch! 153 | Beware the Jubjub bird, and shun 154 | The frumious Bandersnatch!" 155 | He took his vorpal sword in hand: 156 | Long time the manxome foe he sought -- 157 | So rested he by the Tumtum tree, 158 | And stood awhile in thought. 159 | And, as in uffish thought he stood, 160 | The Jabberwock, with eyes of flame, 161 | Came whiffling through the tulgey wood, 162 | And burbled as it came! 163 | One, two! One, two! And through and through 164 | The vorpal blade went snicker-snack! 165 | He left it dead, and with its head 166 | He went galumphing back. 167 | "And, has thou slain the Jabberwock? 168 | Come to my arms, my beamish boy! 169 | O frabjous day! Callooh! Callay!' 170 | He chortled in his joy. 
171 | `Twas brillig, and the slithy toves 172 | Did gyre and gimble in the wabe; 173 | All mimsy were the borogoves, 174 | And the mome raths outgrabe.''' 175 | 176 | pope = '''There once was a man named 'Pope' 177 | Who loved an oscilloscope 178 | And the cyclical trace 179 | Of their carnal embrace 180 | Had a damned-near infinite slope!''' 181 | 182 | def compute(self, text): 183 | tokens = re.split(r'\W+', text.lower(), flags=re.UNICODE) 184 | shingles = [''.join(shingle) for shingle in 185 | simhash.shingle(''.join(tokens), 4)] 186 | hashes = [simhash.unsigned_hash(s.encode('utf8')) for s in shingles] 187 | return simhash.compute(hashes) 188 | 189 | def test_added_text(self): 190 | a = self.compute(self.jabberwocky) 191 | b = self.compute( 192 | self.jabberwocky + ' - Lewis Carroll (Alice in Wonderland)') 193 | 194 | self.assertLessEqual( 195 | simhash.num_differing_bits(a, b), 196 | self.MATCH_THRESHOLD) 197 | 198 | def test_identical_text(self): 199 | a = self.compute(self.jabberwocky) 200 | b = self.compute(self.jabberwocky) 201 | self.assertEqual(0, simhash.num_differing_bits(a, b)) 202 | 203 | def test_different(self): 204 | a = self.compute(self.jabberwocky) 205 | b = self.compute(self.pope) 206 | self.assertGreater( 207 | simhash.num_differing_bits(a, b), 208 | self.MATCH_THRESHOLD) 209 | --------------------------------------------------------------------------------
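The tests above exercise `simhash.shingle` and a text-to-hashes pipeline, but the listing here does not include the `shingle` implementation. The following is a minimal sketch, assuming only the behavior that `TestShingle` and `TestFunctional.compute` describe. The names `shingle_sketch`, `unsigned_hash_sketch`, and `shingle_hashes` are hypothetical; `unsigned_hash_sketch` mirrors the MD5-based `unsigned_hash` in simhash.pyx without the Cython typing.

```python
import hashlib
import re
import struct


def shingle_sketch(tokens, window=4):
    """Yield every run of `window` consecutive tokens as a list.

    Raises ValueError for a non-positive window and yields nothing when
    there are fewer tokens than the window size.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    tokens = list(tokens)
    for start in range(len(tokens) - window + 1):
        yield tokens[start:start + window]


def unsigned_hash_sketch(data):
    """First 8 bytes of the MD5 digest as a big-endian unsigned 64-bit int."""
    return struct.unpack(">Q", hashlib.md5(data).digest()[:8])[0]


def shingle_hashes(text, window=4):
    """Lowercase, drop non-word characters, then hash each character shingle."""
    joined = "".join(re.split(r"\W+", text.lower(), flags=re.UNICODE))
    return [unsigned_hash_sketch("".join(s).encode("utf8"))
            for s in shingle_sketch(joined, window)]
```

The list returned by `shingle_hashes` has the shape that the test helper passes to `simhash.compute`. Because `shingle_sketch` is a generator, the `ValueError` surfaces on first iteration, which is why the tests wrap the call in `list(...)`.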