├── .gitignore ├── LICENSE ├── README.md ├── go.mod ├── go.sum ├── img ├── logo-dark.png └── logo-light.png ├── main.go └── testdata ├── sentences.txt └── tlds.txt /.gitignore: -------------------------------------------------------------------------------- 1 | raink 2 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2025 Bishop Fox 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 |

2 | 3 | 4 | logo 5 | 6 |
7 | Use LLMs for document ranking. 8 |

9 | 10 | ## Description 11 | 12 | There's power in AI in that you can "throw a problem at it" and get some result, without even fully defining the problem. For example, give it a bunch of code diffs and a security advisory, and ask, "Which of these diffs seems most likely to fix the security bug?" However, it's not always that easy: 13 | - nondeterminism: doesn't always respond with the same result 14 | - context window: can't pass in all the data at once, need to break it up 15 | - output constraints: sometimes doesn't return all the data you asked it to review 16 | - subjectivity in scoring: has a really hard time assigning a numeric score to an individual item 17 | 18 | We built raink to circumvent those issues and solve general ranking problems that are otherwise difficult for LLMs to process. See our blog post [raink: Use LLMs for Document Ranking](https://bishopfox.com/blog/raink-llms-document-ranking) for more background on this technique, and our talk [Patch Perfect: Harmonizing with LLMs to Find Security Vulns](https://www.youtube.com/watch?v=IBuL1zY69tY) to see how we've applied raink to offensive security problems. 19 | 20 | ## Getting started 21 | 22 | ### Install 23 | 24 | ``` 25 | git clone https://github.com/noperator/raink 26 | cd raink 27 | go install 28 | ``` 29 | 30 | ### Configure 31 | 32 | Set your `OPENAI_API_KEY` environment variable. 
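As a rough illustration of when the key is actually required, here is a minimal Go sketch based on the check in `Config.Validate` in `main.go`, which only rejects an empty key when neither `-ollama-model` nor `-openai-url` is set. The helper name `needsOpenAIKey` is hypothetical and not part of raink; treat this as a reading of that one condition rather than the tool's full startup logic.

```go
package main

import (
	"fmt"
	"os"
)

// needsOpenAIKey is a hypothetical helper (not part of raink) that mirrors the
// credential condition in Config.Validate: a key is only required when neither
// an Ollama model nor a custom OpenAI-compatible base URL is configured.
func needsOpenAIKey(ollamaModel, openAIURL string) bool {
	return ollamaModel == "" && openAIURL == ""
}

func main() {
	key := os.Getenv("OPENAI_API_KEY")

	// Default setup: no -ollama-model and no -openai-url were passed.
	if needsOpenAIKey("", "") && key == "" {
		fmt.Println("OPENAI_API_KEY is not set; raink would refuse to start")
		return
	}
	fmt.Println("credentials look sufficient for the default OpenAI backend")
}
```

In other words, running against a local Ollama model or an OpenAI-compatible endpoint should not need the environment variable, as far as this validation step is concerned.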
33 | 34 | ### Usage 35 | 36 | ``` 37 | raink -h 38 | Usage of raink: 39 | -dry-run 40 | Enable dry run mode (log API calls without making them) 41 | -encoding string 42 | Tokenizer encoding (default "o200k_base") 43 | -f string 44 | Input file 45 | -json 46 | Force JSON parsing regardless of file extension 47 | -o string 48 | JSON output file 49 | -ollama-model string 50 | Ollama model name (if not set, OpenAI will be used) 51 | -ollama-url string 52 | Ollama API URL (default "http://localhost:11434/api/chat") 53 | -openai-model string 54 | OpenAI model name (default "gpt-4o-mini") 55 | -p string 56 | Initial prompt (prefix with @ to use a file) 57 | -r int 58 | Number of runs (default 10) 59 | -ratio float 60 | Refinement ratio as a decimal (e.g., 0.5 for 50%) (default 0.5) 61 | -s int 62 | Number of items per batch (default 10) 63 | -t int 64 | Max tokens per batch (default 128000) 65 | -template string 66 | Template for each object in the input file (prefix with @ to use a file) (default "{{.Data}}") 67 | ``` 68 | 69 | Compares 100 [sentences](https://github.com/noperator/raink/blob/main/testdata/sentences.txt) in under 2 min. 70 | 71 | ``` 72 | raink \ 73 | -f testdata/sentences.txt \ 74 | -r 10 \ 75 | -s 10 \ 76 | -p 'Rank each of these items according to their relevancy to the concept of "time".' | 77 | jq -r '.[:10] | map(.value)[]' | 78 | nl 79 | 80 | 1 The train arrived exactly on time. 81 | 2 The old clock chimed twelve times. 82 | 3 The clock ticked steadily on the wall. 83 | 4 The bell rang, signaling the end of class. 84 | 5 The rooster crowed at the break of dawn. 85 | 6 She climbed to the top of the hill to watch the sunset. 86 | 7 He watched as the leaves fell one by one. 87 | 8 The stars twinkled brightly in the clear night sky. 88 | 9 He spotted a shooting star while stargazing. 89 | 10 She opened the curtains to let in the morning light. 
90 | ``` 91 | 92 | #### JSON Support 93 | 94 | If the input file is a JSON document, it will be read as an array of objects and each object will be used for ranking. 95 | 96 | For instance, two objects would be loaded and ranked from this document: 97 | 98 | ```json 99 | [ 100 | { 101 | "path": "/foo", 102 | "code": "bar" 103 | }, 104 | { 105 | "path": "/baz", 106 | "code": "nope" 107 | } 108 | ] 109 | ``` 110 | 111 | #### Templates 112 | 113 | It is possible to include each element from the input file in a template using the [Go template syntax](https://pkg.go.dev/text/template) via the `-template "template string"` (or `-template @file.tpl`) argument. 114 | 115 | For text input files, each line can be referenced in the template with the `Data` variable: 116 | 117 | ``` 118 | Anything you want with {{ .Data }} 119 | ``` 120 | 121 | For JSON input files, each object in the array can be referenced directly. For instance, elements of the previous JSON example can be referenced in the template code like so: 122 | 123 | ``` 124 | # {{ .path }} 125 | 126 | {{ .code }} 127 | ``` 128 | 129 | Note in the following example that the resulting `value` key contains the actual value being presented for ranking (as described by the template), while the `object` key contains the entire original object from the input file for easy reference. 130 | 131 | ``` 132 | # Create some test JSON data. 133 | seq 9 | 134 | paste -d @ - - - | 135 | parallel 'echo {} | tr @ "\n" | jo -a | jo nums=:/dev/stdin' | 136 | jo -a | 137 | tee input.json 138 | 139 | [{"nums":[1,2,3]},{"nums":[4,5,6]},{"nums":[7,8,9]}] 140 | 141 | # Use template to extract the first element of the nums array in each input object. 142 | raink \ 143 | -f input.json \ 144 | -template '{{ index .nums 0 }}' \ 145 | -p 'Which is biggest?' 
\ 146 | -r 1 | 147 | jq -c '.[]' 148 | 149 | {"key":"eQJpm-Qs","value":"7","object":{"nums":[7,8,9]},"score":0,"exposure":1,"rank":1} 150 | {"key":"SyJ3d9Td","value":"4","object":{"nums":[4,5,6]},"score":2,"exposure":1,"rank":2} 151 | {"key":"a4ayc_80","value":"1","object":{"nums":[1,2,3]},"score":3,"exposure":1,"rank":3} 152 | ``` 153 | 154 | ## Back matter 155 | 156 | ### See also 157 | 158 | - [Hard problems that reduce to document ranking](https://noperator.dev/posts/document-ranking-for-complex-problems/) 159 | - [Commentary: Critical Thinking - Bug Bounty Podcast](https://youtu.be/qd08UBNpu7k?si=pMVEYtmKnyuJkL9B&t=1511) 160 | - [Discussion: Hacker News](https://news.ycombinator.com/item?id=43174910) 161 | - [Raink: Use LLMs for Document Ranking](https://bishopfox.com/blog/raink-llms-document-ranking) 162 | - [Patch Perfect: Harmonizing with LLMs to Find Security Vulns](https://www.youtube.com/watch?v=IBuL1zY69tY) 163 | - [Large Language Models are Effective Text Rankers with Pairwise Ranking Prompting](https://arxiv.org/html/2306.17563v2) 164 | - [Introducing Rerank 3.5: Precise AI Search](https://cohere.com/blog/rerank-3pt5) 165 | 166 | ### To-do 167 | 168 | - [x] parallelize openai calls for each run 169 | - [x] save time by using shorter hash ids 170 | - [x] make sure that each randomized run is evenly split into groups so each one gets included/exposed 171 | - [ ] allow specifying an input _directory_ (where each file is distinct object) 172 | - [x] alert if the incoming context window is super large 173 | - [x] some batches near the end of a run (9?) are small for some reason 174 | - [ ] run openai batch mode 175 | - [x] automatically calculate optimal batch size? 
176 | - [x] explore "tournament" sort vs complete exposure each time 177 | - [x] add parameter for refinement ratio 178 | - [x] add blog link 179 | - [x] support non-OpenAI models 180 | - [ ] add ~boolean~ refinement ratio flag 181 | - [ ] separate package and cli tool 182 | - [ ] add python bindings? 183 | - [ ] clarify when prompt included in token estimate 184 | - [ ] remove token limit threshold? potentially confusing/unnecessary 185 | 186 | ### License 187 | 188 | This project is licensed under the [MIT License](LICENSE). 189 | -------------------------------------------------------------------------------- /go.mod: -------------------------------------------------------------------------------- 1 | module github.com/bishopfox/raink 2 | 3 | go 1.23.4 4 | 5 | require ( 6 | github.com/invopop/jsonschema v0.12.0 7 | github.com/openai/openai-go v0.1.0-alpha.38 8 | github.com/pkoukk/tiktoken-go v0.1.7 9 | ) 10 | 11 | require ( 12 | github.com/bahlo/generic-list-go v0.2.0 // indirect 13 | github.com/buger/jsonparser v1.1.1 // indirect 14 | github.com/dlclark/regexp2 v1.10.0 // indirect 15 | github.com/google/uuid v1.6.0 // indirect 16 | github.com/mailru/easyjson v0.7.7 // indirect 17 | github.com/tidwall/gjson v1.14.4 // indirect 18 | github.com/tidwall/match v1.1.1 // indirect 19 | github.com/tidwall/pretty v1.2.1 // indirect 20 | github.com/tidwall/sjson v1.2.5 // indirect 21 | github.com/wk8/go-ordered-map/v2 v2.1.8 // indirect 22 | gopkg.in/yaml.v3 v3.0.1 // indirect 23 | ) 24 | -------------------------------------------------------------------------------- /go.sum: -------------------------------------------------------------------------------- 1 | github.com/bahlo/generic-list-go v0.2.0 h1:5sz/EEAK+ls5wF+NeqDpk5+iNdMDXrh3z3nPnH1Wvgk= 2 | github.com/bahlo/generic-list-go v0.2.0/go.mod h1:2KvAjgMlE5NNynlg/5iLrrCCZ2+5xWbdbCW3pNTGyYg= 3 | github.com/buger/jsonparser v1.1.1 h1:2PnMjfWD7wBILjqQbt530v576A/cAbQvEW9gGIpYMUs= 4 | github.com/buger/jsonparser 
v1.1.1/go.mod h1:6RYKKt7H4d4+iWqouImQ9R2FZql3VbhNgx27UK13J/0= 5 | github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c= 6 | github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38= 7 | github.com/dlclark/regexp2 v1.10.0 h1:+/GIL799phkJqYW+3YbOd8LCcbHzT0Pbo8zl70MHsq0= 8 | github.com/dlclark/regexp2 v1.10.0/go.mod h1:DHkYz0B9wPfa6wondMfaivmHpzrQ3v9q8cnmRbL6yW8= 9 | github.com/google/uuid v1.6.0 h1:NIvaJDMOsjHA8n1jAhLSgzrAzy1Hgr+hNrb57e+94F0= 10 | github.com/google/uuid v1.6.0/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo= 11 | github.com/invopop/jsonschema v0.12.0 h1:6ovsNSuvn9wEQVOyc72aycBMVQFKz7cPdMJn10CvzRI= 12 | github.com/invopop/jsonschema v0.12.0/go.mod h1:ffZ5Km5SWWRAIN6wbDXItl95euhFz2uON45H2qjYt+0= 13 | github.com/josharian/intern v1.0.0/go.mod h1:5DoeVV0s6jJacbCEi61lwdGj/aVlrQvzHFFd8Hwg//Y= 14 | github.com/mailru/easyjson v0.7.7 h1:UGYAvKxe3sBsEDzO8ZeWOSlIQfWFlxbzLZe7hwFURr0= 15 | github.com/mailru/easyjson v0.7.7/go.mod h1:xzfreul335JAWq5oZzymOObrkdz5UnU4kGfJJLY9Nlc= 16 | github.com/openai/openai-go v0.1.0-alpha.38 h1:j/rL0aEIHWnWaPgA8/AXYKCI79ZoW44NTIpn7qfMEXQ= 17 | github.com/openai/openai-go v0.1.0-alpha.38/go.mod h1:3SdE6BffOX9HPEQv8IL/fi3LYZ5TUpRYaqGQZbyk11A= 18 | github.com/pkoukk/tiktoken-go v0.1.7 h1:qOBHXX4PHtvIvmOtyg1EeKlwFRiMKAcoMp4Q+bLQDmw= 19 | github.com/pkoukk/tiktoken-go v0.1.7/go.mod h1:9NiV+i9mJKGj1rYOT+njbv+ZwA/zJxYdewGl6qVatpg= 20 | github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM= 21 | github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4= 22 | github.com/stretchr/testify v1.8.2 h1:+h33VjcLVPDHtOdpUCuF+7gSuG3yGIftsP1YvFihtJ8= 23 | github.com/stretchr/testify v1.8.2/go.mod h1:w2LPCIKwWwSfY2zedu0+kehJoqGctiVI29o6fzry7u4= 24 | github.com/tidwall/gjson v1.14.2/go.mod h1:/wbyibRr2FHMks5tjHJ5F8dMZh3AcwJEMf5vlfC0lxk= 25 | github.com/tidwall/gjson v1.14.4 h1:uo0p8EbA09J7RQaflQ1aBRffTR7xedD2bcIVSYxLnkM= 26 
| github.com/tidwall/gjson v1.14.4/go.mod h1:/wbyibRr2FHMks5tjHJ5F8dMZh3AcwJEMf5vlfC0lxk= 27 | github.com/tidwall/match v1.1.1 h1:+Ho715JplO36QYgwN9PGYNhgZvoUSc9X2c80KVTi+GA= 28 | github.com/tidwall/match v1.1.1/go.mod h1:eRSPERbgtNPcGhD8UCthc6PmLEQXEWd3PRB5JTxsfmM= 29 | github.com/tidwall/pretty v1.2.0/go.mod h1:ITEVvHYasfjBbM0u2Pg8T2nJnzm8xPwvNhhsoaGGjNU= 30 | github.com/tidwall/pretty v1.2.1 h1:qjsOFOWWQl+N3RsoF5/ssm1pHmJJwhjlSbZ51I6wMl4= 31 | github.com/tidwall/pretty v1.2.1/go.mod h1:ITEVvHYasfjBbM0u2Pg8T2nJnzm8xPwvNhhsoaGGjNU= 32 | github.com/tidwall/sjson v1.2.5 h1:kLy8mja+1c9jlljvWTlSazM7cKDRfJuR/bOJhcY5NcY= 33 | github.com/tidwall/sjson v1.2.5/go.mod h1:Fvgq9kS/6ociJEDnK0Fk1cpYF4FIW6ZF7LAe+6jwd28= 34 | github.com/wk8/go-ordered-map/v2 v2.1.8 h1:5h/BUHu93oj4gIdvHHHGsScSTMijfx5PeYkE/fJgbpc= 35 | github.com/wk8/go-ordered-map/v2 v2.1.8/go.mod h1:5nJHM5DyteebpVlHnWMV0rPz6Zp7+xBAnxjb1X5vnTw= 36 | gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405 h1:yhCVgyC4o1eVCa2tZl7eS0r+SDo693bJlVdllGtEeKM= 37 | gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0= 38 | gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA= 39 | gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM= 40 | -------------------------------------------------------------------------------- /img/logo-dark.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noperator/raink/3a5ac848b3eb9cf59589391ac853c7db333a4766/img/logo-dark.png -------------------------------------------------------------------------------- /img/logo-light.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noperator/raink/3a5ac848b3eb9cf59589391ac853c7db333a4766/img/logo-light.png -------------------------------------------------------------------------------- /main.go: 
-------------------------------------------------------------------------------- 1 | package main 2 | 3 | import ( 4 | "bufio" 5 | "bytes" 6 | "context" 7 | "crypto/sha256" 8 | "encoding/base64" 9 | "encoding/json" 10 | "flag" 11 | "fmt" 12 | "io" 13 | "log" 14 | "math/rand" 15 | "net/http" 16 | "os" 17 | "path/filepath" 18 | "sort" 19 | "strconv" 20 | "strings" 21 | "text/template" 22 | "time" 23 | 24 | "github.com/invopop/jsonschema" 25 | "github.com/openai/openai-go" 26 | "github.com/openai/openai-go/option" 27 | "github.com/pkoukk/tiktoken-go" 28 | ) 29 | 30 | const ( 31 | idLen = 8 32 | minBatchSize = 2 33 | ) 34 | 35 | /* 36 | When deciding whether a value belongs in Config or Ranker structs, consider the following: 37 | - Does this value change during operation? → Ranker if yes, Config if no 38 | - Should users be able to configure this directly? → Config if yes, Ranker if no 39 | - Is this derived from other configuration? → Usually Ranker 40 | - Does this require initialization or cleanup? → Usually Ranker 41 | - Is this part of the public API? → Config if yes, Ranker if no 42 | */ 43 | 44 | type Config struct { 45 | InitialPrompt string `json:"initial_prompt"` 46 | BatchSize int `json:"batch_size"` 47 | NumRuns int `json:"num_runs"` 48 | OllamaModel string `json:"ollama_model"` 49 | OpenAIModel openai.ChatModel `json:"openai_model"` 50 | TokenLimit int `json:"token_limit"` 51 | RefinementRatio float64 `json:"refinement_ratio"` 52 | OpenAIKey string `json:"-"` 53 | OpenAIAPIURL string `json:"-"` 54 | OllamaAPIURL string `json:"-"` 55 | Encoding string `json:"encoding"` 56 | BatchTokens int `json:"batch_tokens"` 57 | DryRun bool `json:"-"` 58 | } 59 | 60 | // TODO: Move all CLI flag validation to this func instead. 
61 | func (c *Config) Validate() error { 62 | if c.InitialPrompt == "" { 63 | return fmt.Errorf("initial prompt cannot be empty") 64 | } 65 | if c.BatchSize <= 0 { 66 | return fmt.Errorf("batch size must be greater than 0") 67 | } 68 | if c.NumRuns <= 0 { 69 | return fmt.Errorf("number of runs must be greater than 0") 70 | } 71 | if c.TokenLimit <= 0 { 72 | return fmt.Errorf("token limit must be greater than 0") 73 | } 74 | if c.OllamaModel == "" && c.OpenAIAPIURL == "" && c.OpenAIKey == "" { 75 | return fmt.Errorf("openai key cannot be empty") 76 | } 77 | if c.BatchSize < minBatchSize { 78 | return fmt.Errorf("batch size must be at least %d", minBatchSize) 79 | } 80 | return nil 81 | } 82 | 83 | type Ranker struct { 84 | cfg *Config 85 | encoding *tiktoken.Tiktoken 86 | rng *rand.Rand 87 | numBatches int 88 | round int 89 | } 90 | 91 | func NewRanker(config *Config) (*Ranker, error) { 92 | if err := config.Validate(); err != nil { 93 | return nil, err 94 | } 95 | 96 | encoding, err := tiktoken.GetEncoding(config.Encoding) 97 | if err != nil { 98 | return nil, fmt.Errorf("failed to get tiktoken encoding: %w", err) 99 | } 100 | 101 | return &Ranker{ 102 | cfg: config, 103 | encoding: encoding, 104 | rng: rand.New(rand.NewSource(time.Now().UnixNano())), 105 | }, nil 106 | } 107 | 108 | func (ranker *Ranker) AdjustBatchSize(objects []Object, samples int) { 109 | // Dynamically adjust batch size upfront. 110 | for { 111 | valid := true 112 | var estTotalTokens int 113 | var numBatches int 114 | 115 | for i := 0; i < samples; i++ { 116 | ranker.rng.Shuffle(len(objects), func(i, j int) { 117 | objects[i], objects[j] = objects[j], objects[i] 118 | }) 119 | numBatches = max(1, len(objects)/ranker.cfg.BatchSize) // Need at least one batch. 120 | for j := 0; j < numBatches; j++ { 121 | batch := objects[j*ranker.cfg.BatchSize : (j+1)*min(len(objects), ranker.cfg.BatchSize)] // Don't index more objects than we have. 
122 | estBatchTokens := ranker.estimateTokens(batch, true) 123 | estTotalTokens += estBatchTokens 124 | if estBatchTokens > ranker.cfg.TokenLimit { 125 | log.Printf("Sample %d: estimated tokens %d > max token threshold %d", i, estBatchTokens, ranker.cfg.TokenLimit) 126 | ranker.logTokenSizes(batch) 127 | valid = false 128 | break 129 | } 130 | } 131 | if !valid { 132 | break 133 | } 134 | } 135 | 136 | if valid { 137 | avgEstTokens := estTotalTokens / (samples * numBatches) 138 | avgEstPct := float64(avgEstTokens) / float64(ranker.cfg.TokenLimit) * 100 139 | log.Printf("Average estimated tokens: %d (%.2f%% of max %d tokens)", avgEstTokens, avgEstPct, ranker.cfg.TokenLimit) 140 | break 141 | } 142 | if ranker.cfg.BatchSize <= minBatchSize { 143 | log.Fatal("Cannot create a valid batch within the token limit") 144 | } 145 | ranker.cfg.BatchSize-- 146 | log.Printf("Decreasing batch size to %d", ranker.cfg.BatchSize) 147 | } 148 | } 149 | 150 | type Object struct { 151 | // unique object identifier, used to identify the object in the final results 152 | ID string `json:"id"` 153 | // string value to be ranked 154 | Value string `json:"value"` 155 | // the original structured object if we're loading a json file 156 | Object interface{} `json:"object"` 157 | } 158 | 159 | type RankedObject struct { 160 | Object Object 161 | Score float64 162 | } 163 | 164 | type RankedObjectResponse struct { 165 | Objects []string `json:"objects" jsonschema_description:"List of ranked object IDs"` 166 | } 167 | 168 | type FinalResult struct { 169 | Key string `json:"key"` 170 | Value string `json:"value"` 171 | // the original structured object if we're loading a json file 172 | Object interface{} `json:"object"` 173 | Score float64 `json:"score"` 174 | Exposure int `json:"exposure"` 175 | Rank int `json:"rank"` 176 | } 177 | 178 | func GenerateSchema[T any]() interface{} { 179 | reflector := jsonschema.Reflector{ 180 | AllowAdditionalProperties: false, 181 | DoNotReference: true, 182 | } 
183 | var v T 184 | schema := reflector.Reflect(v) 185 | return schema 186 | } 187 | 188 | var RankedObjectResponseSchema = GenerateSchema[RankedObjectResponse]() 189 | 190 | func ShortDeterministicID(input string, length int) string { 191 | // Keep only A-Za-z0-9 from Base64-encoded SHA-256 hash. 192 | hash := sha256.Sum256([]byte(input)) 193 | base64Encoded := base64.URLEncoding.EncodeToString(hash[:]) 194 | var result strings.Builder 195 | for _, char := range base64Encoded { 196 | if (char >= '0' && char <= '9') || (char >= 'a' && char <= 'z') || (char >= 'A' && char <= 'Z') { 197 | result.WriteRune(char) 198 | } 199 | } 200 | filtered := result.String() 201 | if length > len(filtered) { 202 | length = len(filtered) 203 | } 204 | return filtered[:length] 205 | } 206 | 207 | func loadObjectsFromFile(filePath string, templateData string, forceJSON bool) (objects []Object, err error) { 208 | var tmpl *template.Template 209 | if templateData != "" { 210 | if templateData[0] == '@' { 211 | content, err := os.ReadFile(templateData[1:]) 212 | if err != nil { 213 | return nil, err 214 | } 215 | templateData = string(content) 216 | } 217 | if tmpl, err = template.New("raink-item-template").Parse(templateData); err != nil { 218 | return nil, err 219 | } 220 | } 221 | 222 | file, err := os.Open(filePath) 223 | if err != nil { 224 | return nil, err 225 | } 226 | defer file.Close() 227 | 228 | ext := strings.ToLower(filepath.Ext(filePath)) 229 | if ext == ".json" || forceJSON { 230 | // parse the file in an opaque array 231 | var data []interface{} 232 | if err := json.NewDecoder(file).Decode(&data); err != nil { 233 | return nil, err 234 | } 235 | 236 | // iterate over the map and create objects 237 | for _, value := range data { 238 | var valueStr string 239 | if tmpl != nil { 240 | var tmplData bytes.Buffer 241 | if err := tmpl.Execute(&tmplData, value); err != nil { 242 | return nil, err 243 | } 244 | valueStr = tmplData.String() 245 | } else { 246 | 
log.Printf("WARNING: using json input without a template, using JSON object as it is\n") 247 | jsonValue, err := json.Marshal(value) 248 | if err != nil { 249 | return nil, err 250 | } 251 | valueStr = string(jsonValue) 252 | } 253 | 254 | id := ShortDeterministicID(valueStr, idLen) 255 | objects = append(objects, Object{ID: id, Object: value, Value: valueStr}) 256 | } 257 | } else { 258 | // read and interpolate the file line by line 259 | reader := bufio.NewReader(file) 260 | for { 261 | line, err := reader.ReadString('\n') 262 | if err != nil { 263 | if err == io.EOF { 264 | break 265 | } 266 | return nil, err 267 | } 268 | line = strings.TrimSpace(line) 269 | 270 | if tmpl != nil { 271 | var tmplData bytes.Buffer 272 | if err := tmpl.Execute(&tmplData, map[string]string{"Data": line}); err != nil { 273 | return nil, err 274 | } 275 | line = tmplData.String() 276 | } 277 | 278 | id := ShortDeterministicID(line, idLen) 279 | objects = append(objects, Object{ID: id, Object: nil, Value: line}) 280 | } 281 | } 282 | 283 | return objects, nil 284 | } 285 | 286 | // TODO: Move all of this CLI-related code to a separate package. 
287 | func main() { 288 | log.SetOutput(os.Stderr) 289 | 290 | inputFile := flag.String("f", "", "Input file") 291 | forceJSON := flag.Bool("json", false, "Force JSON parsing regardless of file extension") 292 | inputTemplate := flag.String("template", "{{.Data}}", "Template for each object in the input file (prefix with @ to use a file)") 293 | batchSize := flag.Int("s", 10, "Number of items per batch") 294 | numRuns := flag.Int("r", 10, "Number of runs") 295 | batchTokens := flag.Int("t", 128000, "Max tokens per batch") 296 | initialPrompt := flag.String("p", "", "Initial prompt (prefix with @ to use a file)") 297 | outputFile := flag.String("o", "", "JSON output file") 298 | 299 | ollamaURL := flag.String("ollama-url", "http://localhost:11434/api/chat", "Ollama API URL") 300 | ollamaModel := flag.String("ollama-model", "", "Ollama model name (if not set, OpenAI will be used)") 301 | oaiModel := flag.String("openai-model", openai.ChatModelGPT4oMini, "OpenAI model name") 302 | oaiURL := flag.String("openai-url", "", "OpenAI API base URL (e.g., for OpenAI-compatible API like vLLM)") 303 | encoding := flag.String("encoding", "o200k_base", "Tokenizer encoding") 304 | 305 | dryRun := flag.Bool("dry-run", false, "Enable dry run mode (log API calls without making them)") 306 | refinementRatio := flag.Float64("ratio", 0.5, "Refinement ratio as a decimal (e.g., 0.5 for 50%)") 307 | flag.Parse() 308 | 309 | // TODO: This should be a more resilient check. We're assuming that if the 310 | // batchTokens is 128000, then a user didn't pass that value via CLI (i.e., 311 | // that it's the default value). 312 | if *ollamaModel != "" && *batchTokens == 128000 { 313 | *batchTokens = 4096 314 | } 315 | 316 | // This "threshold" is a way to add some padding to our estimation of 317 | // average token usage per batch. We're effectively leaving 5% of 318 | // wiggle room. 
319 | var tokenLimitThreshold = int(0.95 * float64(*batchTokens)) 320 | 321 | if *inputFile == "" { 322 | log.Println("Usage: raink -f [-s ] [-r ] [-p ] [-t ] [-ollama-model ] [-openai-model ] [-openai-url ] [-ratio ]") 323 | return 324 | } 325 | 326 | if *refinementRatio < 0 || *refinementRatio >= 1 { 327 | fmt.Println("Error: Refinement ratio must be >= 0 and < 1") 328 | os.Exit(1) 329 | } 330 | 331 | userPrompt := *initialPrompt 332 | if strings.HasPrefix(userPrompt, "@") { 333 | filePath := strings.TrimPrefix(userPrompt, "@") 334 | content, err := os.ReadFile(filePath) 335 | if err != nil { 336 | log.Fatalf("Error reading initial prompt file: %v", err) 337 | } 338 | userPrompt = string(content) 339 | } 340 | 341 | config := &Config{ 342 | InitialPrompt: userPrompt, 343 | BatchSize: *batchSize, 344 | NumRuns: *numRuns, 345 | OllamaModel: *ollamaModel, 346 | OpenAIModel: *oaiModel, 347 | TokenLimit: tokenLimitThreshold, 348 | RefinementRatio: *refinementRatio, 349 | OpenAIKey: os.Getenv("OPENAI_API_KEY"), 350 | OpenAIAPIURL: *oaiURL, 351 | OllamaAPIURL: *ollamaURL, 352 | Encoding: *encoding, 353 | BatchTokens: *batchTokens, 354 | DryRun: *dryRun, 355 | } 356 | 357 | ranker, err := NewRanker(config) 358 | if err != nil { 359 | log.Fatal(err) 360 | } 361 | 362 | objects, err := loadObjectsFromFile(*inputFile, *inputTemplate, *forceJSON) 363 | if err != nil { 364 | log.Fatal(err) 365 | } 366 | 367 | // check that no object is too large 368 | for _, obj := range objects { 369 | tokens := ranker.estimateTokens([]Object{obj}, true) 370 | if tokens > *batchTokens { 371 | log.Fatalf("Object is too large with %d tokens:\n%s", tokens, obj.Value) 372 | } 373 | } 374 | 375 | // Dynamically adjust batch size upfront. 
376 | ranker.AdjustBatchSize(objects, 10) 377 | 378 | // Recursive processing 379 | finalResults := ranker.Rank(objects, 1) 380 | 381 | // Add the rank key to each final result based on its position in the list 382 | for i := range finalResults { 383 | finalResults[i].Rank = i + 1 384 | } 385 | 386 | jsonResults, err := json.MarshalIndent(finalResults, "", " ") 387 | if err != nil { 388 | panic(err) 389 | } 390 | 391 | if !config.DryRun { 392 | fmt.Println(string(jsonResults)) 393 | } 394 | 395 | if *outputFile != "" { 396 | os.WriteFile(*outputFile, jsonResults, 0644) 397 | log.Printf("Results written to %s\n", *outputFile) 398 | } 399 | } 400 | 401 | // TODO: The final exposure value should be the sum of all exposures from all 402 | // refinement rounds (not just the last one). This isn't crucial since exposure 403 | // is just a helpful metric to show that objects compared to a sufficiently 404 | // large number of other objects. 405 | 406 | func (r *Ranker) Rank(objects []Object, round int) []FinalResult { 407 | r.round = round 408 | 409 | log.Printf("Round %d: Ranking %d objects\n", r.round, len(objects)) 410 | 411 | // If we've narrowed down to a single object, we're done. 412 | if len(objects) == 1 { 413 | return []FinalResult{ 414 | { 415 | Key: objects[0].ID, 416 | Value: objects[0].Value, 417 | Object: objects[0].Object, 418 | Score: 0, // 0 is guaranteed to be the "highest" score. 419 | Exposure: 1, 420 | }, 421 | } 422 | } 423 | 424 | // Downstream ranking gets unhappy if we try to rank more objects than we 425 | // have. 426 | if r.cfg.BatchSize > len(objects) { 427 | r.cfg.BatchSize = len(objects) 428 | } 429 | 430 | r.numBatches = len(objects) / r.cfg.BatchSize 431 | 432 | // Process the objects and get the sorted results. 433 | results := r.shuffleBatchRank(objects) 434 | 435 | // If the refinement ratio is 0, that effectively means we're refining 436 | // _none_ of the top objects, so we're done. 
437 | if r.cfg.RefinementRatio == 0 { 438 | return results 439 | } 440 | 441 | // Calculate the mid index based on the refinement ratio. 442 | mid := int(float64(len(results)) * r.cfg.RefinementRatio) 443 | topPortion := results[:mid] 444 | bottomPortion := results[mid:] 445 | 446 | // If the top portion is empty (nothing left to refine) or we haven't 447 | // reduced the number of objects, we're done. 448 | if len(topPortion) == 0 || len(topPortion) == len(objects) { 449 | return results 450 | } 451 | 452 | log.Println("Top items being sent back into recursion:") 453 | for i, obj := range topPortion { 454 | log.Printf("Rank %d: ID=%s, Score=%.2f, Value=%s", i+1, obj.Key, obj.Score, obj.Value) 455 | } 456 | 457 | var topPortionObjects []Object 458 | for _, result := range topPortion { 459 | topPortionObjects = append(topPortionObjects, Object{ID: result.Key, Value: result.Value, Object: result.Object}) 460 | } 461 | 462 | refinedTopPortion := r.Rank(topPortionObjects, round+1) 463 | 464 | // Adjust scores by recursion depth; this serves as an inverted weight so 465 | // that later rounds are guaranteed to sit higher in the final list. 466 | for i := range refinedTopPortion { 467 | refinedTopPortion[i].Score /= float64(2 * round) 468 | } 469 | 470 | // Combine the refined top portion with the unrefined bottom portion. 471 | finalResults := append(refinedTopPortion, bottomPortion...) 472 | 473 | return finalResults 474 | } 475 | 476 | // TODO: Also log the request/retry attempt number. 477 | func (r *Ranker) logFromApiCall(runNum, batchNum int, message string, args ...interface{}) { 478 | prefix := fmt.Sprintf("Round %d, Run %*d/%d, Batch %*d/%d: ", r.round, len(strconv.Itoa(r.cfg.NumRuns)), runNum, r.cfg.NumRuns, len(strconv.Itoa(r.numBatches)), batchNum, r.numBatches) 479 | log.Printf(prefix+message, args...) 
480 | } 481 | 482 | func (r *Ranker) shuffleBatchRank(objects []Object) []FinalResult { 483 | scores := make(map[string][]float64) 484 | 485 | exposureCounts := make(map[string]int) 486 | 487 | resultsChan := make(chan []RankedObject, r.numBatches) 488 | 489 | var firstRunRemainderItems []Object 490 | 491 | for i := 0; i < r.cfg.NumRuns; i++ { 492 | r.rng.Shuffle(len(objects), func(i, j int) { 493 | objects[i], objects[j] = objects[j], objects[i] 494 | }) 495 | 496 | // Ensure remainder items from the first run are not in the remainder 497 | // range in the second run 498 | if i == 1 && len(firstRunRemainderItems) > 0 { 499 | for { 500 | remainderStart := r.numBatches * r.cfg.BatchSize 501 | remainderItems := objects[remainderStart:] 502 | conflictFound := false 503 | for _, item := range remainderItems { 504 | for _, firstRunItem := range firstRunRemainderItems { 505 | if item.ID == firstRunItem.ID { 506 | log.Printf("Conflicting remainder item found: %v, %v\n", item, firstRunItem) 507 | conflictFound = true 508 | break 509 | } 510 | } 511 | if conflictFound { 512 | break 513 | } 514 | } 515 | if !conflictFound { 516 | break 517 | } 518 | r.rng.Shuffle(len(objects), func(i, j int) { 519 | objects[i], objects[j] = objects[j], objects[i] 520 | }) 521 | } 522 | } 523 | 524 | // Split into groups of batchSize and process them concurrently 525 | log.Printf("Round %d, Run %*d/%d: Submitting batches to API\n", r.round, len(strconv.Itoa(r.cfg.NumRuns)), i+1, r.cfg.NumRuns) 526 | for j := 0; j < r.numBatches; j++ { 527 | batch := objects[j*r.cfg.BatchSize : (j+1)*r.cfg.BatchSize] 528 | go func(runNumber, batchNumber int, batch []Object) { 529 | rankedBatch := r.rankObjects(batch, runNumber, batchNumber) 530 | resultsChan <- rankedBatch 531 | }(i+1, j+1, batch) 532 | } 533 | 534 | // Collect results from all batches 535 | for j := 0; j < r.numBatches; j++ { 536 | rankedBatch := <-resultsChan 537 | for _, rankedObject := range rankedBatch { 538 | 
scores[rankedObject.Object.ID] = append(scores[rankedObject.Object.ID], rankedObject.Score) 539 | exposureCounts[rankedObject.Object.ID]++ // Update exposure count 540 | } 541 | } 542 | 543 | // Save remainder items from the first run 544 | if i == 0 { 545 | remainderStart := r.numBatches * r.cfg.BatchSize 546 | if remainderStart < len(objects) { 547 | firstRunRemainderItems = make([]Object, len(objects[remainderStart:])) 548 | copy(firstRunRemainderItems, objects[remainderStart:]) 549 | log.Printf("First run remainder items: %v\n", firstRunRemainderItems) 550 | } 551 | } 552 | } 553 | 554 | // Calculate average scores 555 | finalScores := make(map[string]float64) 556 | for id, scoreList := range scores { 557 | var sum float64 558 | for _, score := range scoreList { 559 | sum += score 560 | } 561 | finalScores[id] = sum / float64(len(scoreList)) 562 | } 563 | 564 | var results []FinalResult 565 | for id, score := range finalScores { 566 | for _, obj := range objects { 567 | if obj.ID == id { 568 | results = append(results, FinalResult{ 569 | Key: id, 570 | Value: obj.Value, 571 | Object: obj.Object, 572 | Score: score, 573 | Exposure: exposureCounts[id], // Include exposure count 574 | }) 575 | break 576 | } 577 | } 578 | } 579 | 580 | sort.Slice(results, func(i, j int) bool { 581 | return results[i].Score < results[j].Score 582 | }) 583 | 584 | return results 585 | } 586 | 587 | func (r *Ranker) logTokenSizes(group []Object) { 588 | log.Println("Logging token sizes for each object in the batch:") 589 | for _, obj := range group { 590 | tokenSize := r.estimateTokens([]Object{obj}, false) 591 | valuePreview := obj.Value 592 | if len(valuePreview) > 100 { 593 | valuePreview = valuePreview[:100] 594 | } 595 | log.Printf("Object ID: %s, Token Size: %d, Value Preview: %s", obj.ID, tokenSize, valuePreview) 596 | } 597 | } 598 | 599 | const promptFmt = "id: `%s`\nvalue:\n```\n%s\n```\n\n" 600 | 601 | // TODO: Merge these and clean them up. 
602 | 603 | var promptDisclaimer = fmt.Sprintf( 604 | "\n\nREMEMBER to:\n"+ 605 | "- ALWAYS respond with the short %d-character ID of each item found above the value "+ 606 | "(i.e., I'll provide you with `id: ` above the value, and you should respond with that same ID in your response)\n"+ 607 | "- NEVER respond with the actual value!\n"+ 608 | "- NEVER include backticks around IDs in your response!\n"+ 609 | "- NEVER include scores or a written reason/justification in your response!\n"+ 610 | "- Respond in RANKED DESCENDING order, where the FIRST item in your response is the MOST RELEVANT\n"+ 611 | "- Respond in JSON format, with the following schema:\n {\"objects\": [\"\", \"\", ...]}\n\n"+ 612 | "Here are the objects to be ranked:\n\n", 613 | idLen, 614 | ) 615 | 616 | const missingIDsStr = "Your last response was missing the following IDs: [%s]. " + 617 | "Try again, and make ABSOLUTELY SURE to remember to:\n" + 618 | "- ALWAYS return the IDs and NOT THE VALUES!\n" + 619 | "- ALWAYS respond in JSON format as specified!\n" + 620 | "- ALWAYS return ALL of the IDs in the list!\n" + 621 | "- NEVER include backticks around IDs in your response!\n" + 622 | "- NEVER include scores or a written reason/justification in your response!" 623 | 624 | const invalidJSONStr = "Your last response was not valid JSON. Try again!" 
625 | 626 | func (r *Ranker) estimateTokens(group []Object, includePrompt bool) int { 627 | text := "" 628 | if includePrompt { 629 | text += r.cfg.InitialPrompt + promptDisclaimer 630 | } 631 | for _, obj := range group { 632 | text += fmt.Sprintf(promptFmt, obj.ID, obj.Value) 633 | } 634 | 635 | if r.cfg.OllamaModel != "" { 636 | // TODO: Update to use Ollama tokenize API when this PR is merged: 637 | // https://github.com/ollama/ollama/pull/6586 638 | return len(text) / 4 639 | } else { 640 | return len(r.encoding.Encode(text, nil, nil)) 641 | } 642 | } 643 | 644 | func (r *Ranker) rankObjects(group []Object, runNumber int, batchNumber int) []RankedObject { 645 | prompt := r.cfg.InitialPrompt + promptDisclaimer 646 | for _, obj := range group { 647 | prompt += fmt.Sprintf(promptFmt, obj.ID, obj.Value) 648 | } 649 | 650 | if r.cfg.DryRun { 651 | log.Printf("Dry run API call") 652 | // Simulate a ranked response for dry run 653 | var rankedObjects []RankedObject 654 | for i, obj := range group { 655 | rankedObjects = append(rankedObjects, RankedObject{ 656 | Object: obj, 657 | Score: float64(i + 1), // Simulate scores based on position 658 | }) 659 | } 660 | return rankedObjects 661 | } 662 | 663 | var rankedResponse RankedObjectResponse 664 | inputIDs := make(map[string]bool) 665 | for _, obj := range group { 666 | inputIDs[obj.ID] = true 667 | } 668 | if r.cfg.OllamaModel != "" { 669 | rankedResponse = r.callOllama(prompt, runNumber, batchNumber, inputIDs) 670 | } else { 671 | rankedResponse = r.callOpenAI(prompt, runNumber, batchNumber, inputIDs) 672 | } 673 | 674 | // Assign scores based on position in the ranked list 675 | var rankedObjects []RankedObject 676 | for i, id := range rankedResponse.Objects { 677 | for _, obj := range group { 678 | if obj.ID == id { 679 | rankedObjects = append(rankedObjects, RankedObject{ 680 | Object: obj, 681 | Score: float64(i + 1), // Score based on position (1 for first, 2 for second, etc.) 
682 | }) 683 | break 684 | } 685 | } 686 | } 687 | 688 | return rankedObjects 689 | } 690 | 691 | type CustomTransport struct { 692 | Transport http.RoundTripper 693 | Headers http.Header 694 | StatusCode int 695 | Body []byte 696 | } 697 | 698 | func (t *CustomTransport) RoundTrip(req *http.Request) (*http.Response, error) { 699 | resp, err := t.Transport.RoundTrip(req) 700 | if err != nil { 701 | return nil, err 702 | } 703 | 704 | t.Headers = resp.Header 705 | t.StatusCode = resp.StatusCode 706 | 707 | t.Body, err = io.ReadAll(resp.Body) 708 | if err != nil { 709 | return nil, err 710 | } 711 | 712 | resp.Body = io.NopCloser(bytes.NewBuffer(t.Body)) 713 | 714 | return resp, nil 715 | } 716 | 717 | // Updates the rankedResponse in place to fix case-insensitive ID mismatches. 718 | // If any IDs are missing, returns the missing IDs along with an error. 719 | // TODO: Also error on IDs in rankedResponse that are not in inputIDs. For example: 720 | // Run 1/10, Batch 8/10: Missing IDs: [VkCMOyV9] 721 | // Ollama API response: {"objects": ["5reULTRv", "KTJsPKHz", "eBFIaWo7", "AhqhnGsE", "Ug_hOxYp", "bWfMDUnE", "4sSg4Ojz", "VkJMOyV9", "UJ1-iMmW", "v6Puwf8K"]} 722 | 723 | func validateIDs(rankedResponse *RankedObjectResponse, inputIDs map[string]bool) ([]string, error) { 724 | // Create a map for case-insensitive ID matching 725 | inputIDsLower := make(map[string]string) 726 | for id := range inputIDs { 727 | inputIDsLower[strings.ToLower(id)] = id 728 | } 729 | 730 | missingIDs := make(map[string]bool) 731 | for id := range inputIDs { 732 | missingIDs[id] = true 733 | } 734 | 735 | for i, id := range rankedResponse.Objects { 736 | id = strings.ReplaceAll(id, "`", "") 737 | lowerID := strings.ToLower(id) 738 | if correctID, found := inputIDsLower[lowerID]; found { 739 | if correctID != id { 740 | // Replace the case-wrong match with the correct ID 741 | rankedResponse.Objects[i] = correctID 742 | } 743 | delete(missingIDs, correctID) 744 | } 745 | } 746 | 747 | if 
len(missingIDs) == 0 { 748 | return nil, nil 749 | } else { 750 | missingIDsKeys := make([]string, 0, len(missingIDs)) 751 | for id := range missingIDs { 752 | missingIDsKeys = append(missingIDsKeys, id) 753 | } 754 | return missingIDsKeys, fmt.Errorf("missing IDs: %s", strings.Join(missingIDsKeys, ", ")) 755 | } 756 | } 757 | 758 | func (r *Ranker) callOpenAI(prompt string, runNum int, batchNum int, inputIDs map[string]bool) RankedObjectResponse { 759 | 760 | customTransport := &CustomTransport{Transport: http.DefaultTransport} 761 | customClient := &http.Client{Transport: customTransport} 762 | 763 | clientOptions := []option.RequestOption{ 764 | option.WithAPIKey(r.cfg.OpenAIKey), 765 | option.WithHTTPClient(customClient), 766 | option.WithMaxRetries(5), 767 | } 768 | 769 | // Add base URL option if specified 770 | if r.cfg.OpenAIAPIURL != "" { 771 | // Ensure the URL ends with a trailing slash 772 | baseURL := r.cfg.OpenAIAPIURL 773 | if !strings.HasSuffix(baseURL, "/") { 774 | baseURL += "/" 775 | } 776 | clientOptions = append(clientOptions, option.WithBaseURL(baseURL)) 777 | } 778 | 779 | client := openai.NewClient(clientOptions...) 
780 | 781 | backoff := time.Second 782 | 783 | conversationHistory := []openai.ChatCompletionMessageParamUnion{ 784 | openai.UserMessage(prompt), 785 | } 786 | 787 | var rankedResponse RankedObjectResponse 788 | for { 789 | ctx, cancel := context.WithTimeout(context.Background(), 15*time.Second) 790 | // cancel is called right after the request below; a defer inside this retry loop would not run until the function returns. 791 | 792 | completion, err := client.Chat.Completions.New(ctx, openai.ChatCompletionNewParams{ 793 | Messages: openai.F(conversationHistory), 794 | ResponseFormat: openai.F[openai.ChatCompletionNewParamsResponseFormatUnion]( 795 | openai.ResponseFormatJSONSchemaParam{ 796 | Type: openai.F(openai.ResponseFormatJSONSchemaTypeJSONSchema), 797 | JSONSchema: openai.F(openai.ResponseFormatJSONSchemaJSONSchemaParam{ 798 | Name: openai.F("ranked_object_response"), 799 | Description: openai.F("List of ranked object IDs"), 800 | Schema: openai.F(RankedObjectResponseSchema), 801 | Strict: openai.Bool(true), 802 | }), 803 | }, 804 | ), 805 | Model: openai.F(r.cfg.OpenAIModel), 806 | }) 807 | cancel() 808 | if err == nil { 809 | conversationHistory = append(conversationHistory, 810 | openai.AssistantMessage(completion.Choices[0].Message.Content), 811 | ) 812 | 813 | err = json.Unmarshal([]byte(completion.Choices[0].Message.Content), &rankedResponse) 814 | if err != nil { 815 | r.logFromApiCall(runNum, batchNum, fmt.Sprintf("Error unmarshalling response: %v\n", err)) 816 | conversationHistory = append(conversationHistory, 817 | openai.UserMessage(invalidJSONStr), 818 | ) 819 | trimmedContent := strings.TrimSpace(completion.Choices[0].Message.Content) 820 | log.Printf("OpenAI API response: %s", trimmedContent) 821 | continue 822 | } 823 | 824 | missingIDs, err := validateIDs(&rankedResponse, inputIDs) 825 | if err != nil { 826 | r.logFromApiCall(runNum, batchNum, fmt.Sprintf("Missing IDs: [%s]", strings.Join(missingIDs, ", "))) 827 | conversationHistory = append(conversationHistory, 828 | openai.UserMessage(fmt.Sprintf(missingIDsStr, strings.Join(missingIDs, ", "))), 829 | ) 
830 | trimmedContent := strings.TrimSpace(completion.Choices[0].Message.Content) 831 | log.Printf("OpenAI API response: %s", trimmedContent) 832 | continue 833 | } 834 | 835 | return rankedResponse 836 | } 837 | 838 | if err == context.DeadlineExceeded { 839 | r.logFromApiCall(runNum, batchNum, "Context deadline exceeded, retrying...") 840 | time.Sleep(backoff) 841 | backoff *= 2 842 | continue 843 | } 844 | 845 | if customTransport.StatusCode == http.StatusTooManyRequests { 846 | for key, values := range customTransport.Headers { 847 | if strings.HasPrefix(key, "X-Ratelimit") { 848 | for _, value := range values { 849 | r.logFromApiCall(runNum, batchNum, fmt.Sprintf("Rate limit header: %s: %s", key, value)) 850 | } 851 | } 852 | } 853 | 854 | respBody := customTransport.Body 855 | if respBody == nil { 856 | r.logFromApiCall(runNum, batchNum, "Error reading response body: %v", "response body is nil") 857 | } else { 858 | r.logFromApiCall(runNum, batchNum, "Response body: %s", string(respBody)) 859 | } 860 | 861 | remainingTokensStr := customTransport.Headers.Get("X-Ratelimit-Remaining-Tokens") 862 | resetTokensStr := customTransport.Headers.Get("X-Ratelimit-Reset-Tokens") 863 | 864 | remainingTokens, _ := strconv.Atoi(remainingTokensStr) 865 | resetDuration, _ := time.ParseDuration(resetTokensStr) 866 | 867 | r.logFromApiCall(runNum, batchNum, fmt.Sprintf("Rate limit exceeded. Suggested wait time: %v. 
Remaining tokens: %d", resetDuration, remainingTokens)) 868 | 869 | if resetDuration > 0 { 870 | r.logFromApiCall(runNum, batchNum, fmt.Sprintf("Waiting for %v before retrying...", resetDuration)) 871 | time.Sleep(resetDuration) 872 | } else { 873 | r.logFromApiCall(runNum, batchNum, fmt.Sprintf("Waiting for %v before retrying...", backoff)) 874 | time.Sleep(backoff) 875 | backoff *= 2 876 | } 877 | } else { 878 | log.Fatalf("Run %*d/%d, Batch %*d/%d: Unexpected error: %v", len(strconv.Itoa(r.cfg.NumRuns)), runNum, r.cfg.NumRuns, len(strconv.Itoa(r.numBatches)), batchNum, r.numBatches, err) 879 | } 880 | } 881 | } 882 | 883 | func (r *Ranker) callOllama(prompt string, runNum int, batchNum int, inputIDs map[string]bool) RankedObjectResponse { 884 | 885 | var rankedResponse RankedObjectResponse 886 | 887 | // Initialize the conversation history with the initial prompt 888 | conversationHistory := []map[string]interface{}{ 889 | {"role": "user", "content": prompt}, 890 | } 891 | 892 | for { 893 | 894 | requestBody, err := json.Marshal(map[string]interface{}{ 895 | "model": r.cfg.OllamaModel, 896 | "stream": false, 897 | "format": "json", 898 | "num_ctx": r.cfg.BatchTokens, 899 | "messages": conversationHistory, 900 | }) 901 | if err != nil { 902 | log.Fatalf("Error creating Ollama API request body: %v", err) 903 | } 904 | 905 | req, err := http.NewRequest("POST", r.cfg.OllamaAPIURL, bytes.NewReader(requestBody)) 906 | if err != nil { 907 | log.Fatalf("Error creating Ollama API request: %v", err) 908 | } 909 | req.Header.Set("Content-Type", "application/json") 910 | 911 | client := &http.Client{} 912 | 913 | resp, err := client.Do(req) 914 | if err != nil { 915 | log.Fatalf("Error making request to Ollama API: %v", err) 916 | } 917 | defer resp.Body.Close() 918 | 919 | if resp.StatusCode != http.StatusOK { 920 | body, _ := io.ReadAll(resp.Body) 921 | log.Fatalf("Ollama API returned an error: %v, body: %s", resp.StatusCode, body) 922 | } 923 | 924 | responseBody, err := 
io.ReadAll(resp.Body) 925 | if err != nil { 926 | log.Fatalf("Error reading Ollama API response body: %v", err) 927 | } 928 | 929 | var ollamaResponse struct { 930 | Message struct { 931 | Content string `json:"content"` 932 | } `json:"message"` 933 | } 934 | 935 | err = json.Unmarshal(responseBody, &ollamaResponse) 936 | if err != nil { 937 | log.Fatalf("Error parsing Ollama API response: %v", err) 938 | } 939 | 940 | conversationHistory = append( 941 | conversationHistory, 942 | map[string]interface{}{ 943 | "role": "assistant", 944 | "content": ollamaResponse.Message.Content, 945 | }, 946 | ) 947 | 948 | err = json.Unmarshal([]byte(ollamaResponse.Message.Content), &rankedResponse) 949 | if err != nil { 950 | r.logFromApiCall(runNum, batchNum, fmt.Sprintf("Error unmarshalling response: %v\n", err)) 951 | conversationHistory = append(conversationHistory, 952 | map[string]interface{}{ 953 | "role": "user", 954 | "content": invalidJSONStr, 955 | }, 956 | ) 957 | trimmedContent := strings.TrimSpace(ollamaResponse.Message.Content) 958 | log.Printf("Ollama API response: %s", trimmedContent) 959 | continue 960 | } 961 | 962 | missingIDs, err := validateIDs(&rankedResponse, inputIDs) 963 | if err != nil { 964 | r.logFromApiCall(runNum, batchNum, fmt.Sprintf("Missing IDs: [%s]", strings.Join(missingIDs, ", "))) 965 | conversationHistory = append(conversationHistory, 966 | map[string]interface{}{ 967 | "role": "user", 968 | "content": fmt.Sprintf(missingIDsStr, strings.Join(missingIDs, ", ")), 969 | }, 970 | ) 971 | trimmedContent := strings.TrimSpace(ollamaResponse.Message.Content) 972 | log.Printf("Ollama API response: %s", trimmedContent) 973 | continue 974 | } 975 | 976 | return rankedResponse 977 | } 978 | } 979 | -------------------------------------------------------------------------------- /testdata/sentences.txt: -------------------------------------------------------------------------------- 1 | A group of hikers trekked through the dense forest. 
2 | He read the letter aloud with a trembling voice. 3 | A small child laughed and splashed in the puddle. 4 | The festival was filled with music and laughter. 5 | The rain tapped gently on the roof. 6 | The candlelight created a warm glow in the room. 7 | A sudden storm darkened the sky. 8 | He stacked the logs neatly by the fireplace. 9 | The chef drizzled sauce over the plated dish. 10 | He wrote a letter he would never send. 11 | The snow crunched underfoot as they walked. 12 | The library was quiet except for the rustle of pages. 13 | The fox dashed into the woods at the first sound. 14 | A rainbow appeared after the heavy rain. 15 | He found an old coin buried in the sand. 16 | The clock ticked steadily on the wall. 17 | A spider spun a web between the branches. 18 | The little boy held a balloon tightly in his hand. 19 | He opened the door and gasped in surprise. 20 | A squirrel darted across the park path. 21 | A child waved enthusiastically from the swings. 22 | A butterfly landed on her outstretched hand. 23 | He poured tea into delicate porcelain cups. 24 | The city lights sparkled from the rooftop view. 25 | The waves crashed against the rocky shore. 26 | The train whistle echoed through the valley. 27 | She knitted a scarf while sitting by the window. 28 | The farmer harvested ripe apples from the orchard. 29 | A pair of swans glided gracefully across the lake. 30 | The stage was set for the grand performance. 31 | A thunderclap startled the sleeping dog. 32 | He built a small wooden boat by hand. 33 | The book had a surprising twist at the end. 34 | The aroma of fresh bread filled the air. 35 | A lizard basked in the warm sunlight. 36 | The magician performed a trick that amazed everyone. 37 | The kite soared high in the clear blue sky. 38 | The sound of waves soothed her mind. 39 | The violinist played a hauntingly beautiful melody. 40 | A small fish swam near the surface of the pond. 41 | The painting was breathtakingly beautiful. 
42 | The chocolate melted in the summer heat. 43 | She wore a bracelet made of colorful beads. 44 | A paper airplane sailed across the classroom. 45 | The old bridge creaked as they crossed it. 46 | A curious fox appeared at the edge of the forest. 47 | She discovered an old diary in the attic. 48 | He watched as the leaves fell one by one. 49 | A cool breeze swept through the meadow. 50 | The crowd erupted in cheers at the winning goal. 51 | The mountain trail was steep and challenging. 52 | The music played softly in the background. 53 | The ancient ruins stood silently in the desert. 54 | The mirror reflected an unfamiliar face. 55 | She planted flowers in her grandmother's garden. 56 | She opened the curtains to let in the morning light. 57 | The detective inspected the room carefully. 58 | A mysterious letter arrived in the mail. 59 | He whispered a secret into her ear. 60 | A small group of stars formed a familiar constellation. 61 | The bell rang, signaling the end of class. 62 | A stray dog found shelter under the porch. 63 | She placed the last puzzle piece into its spot. 64 | The chef prepared a dish with perfect precision. 65 | He walked into the room with a smile. 66 | He sketched a portrait of his best friend. 67 | The new store had an impressive display of goods. 68 | She discovered a hidden passage behind the bookshelf. 69 | He wore a red scarf on the chilly day. 70 | The ship's horn sounded as it departed the dock. 71 | The stars twinkled brightly in the clear night sky. 72 | The fire crackled softly in the hearth. 73 | The rooster crowed at the break of dawn. 74 | She balanced a tray full of dishes with ease. 75 | The candle flickered as the wind blew gently. 76 | She couldn't find her keys anywhere. 77 | She wore a dress that shimmered in the light. 78 | She decided to bake a cake from scratch. 79 | A bird chirped happily outside the window. 80 | The coffee shop was packed with customers. 81 | A stray cat followed him down the street. 
82 | The cat jumped onto the windowsill. 83 | She climbed to the top of the hill to watch the sunset. 84 | The wind carried the scent of the ocean. 85 | He doodled in the margins of his notebook. 86 | She tied a ribbon around the gift box. 87 | He finished the marathon despite the pain. 88 | The lighthouse stood tall against the stormy sky. 89 | A horse galloped across the open field. 90 | The train arrived exactly on time. 91 | He spotted a shooting star while stargazing. 92 | The professor spoke passionately about the subject. 93 | The old clock chimed twelve times. 94 | Her laughter echoed through the hall. 95 | The smell of fresh paint lingered in the air. 96 | The wind howled through the abandoned house. 97 | A group of friends gathered around the campfire. 98 | She wore a hat decorated with colorful feathers. 99 | The puppy wagged its tail excitedly. 100 | A stranger handed her a flower as she walked by. 101 | -------------------------------------------------------------------------------- /testdata/tlds.txt: -------------------------------------------------------------------------------- 1 | aaa 2 | aarp 3 | abb 4 | abbott 5 | abbvie 6 | abc 7 | able 8 | abogado 9 | abudhabi 10 | ac 11 | academy 12 | accenture 13 | accountant 14 | accountants 15 | aco 16 | actor 17 | ad 18 | ads 19 | adult 20 | ae 21 | aeg 22 | aero 23 | aetna 24 | af 25 | afl 26 | africa 27 | ag 28 | agakhan 29 | agency 30 | ai 31 | aig 32 | airbus 33 | airforce 34 | airtel 35 | akdn 36 | al 37 | alibaba 38 | alipay 39 | allfinanz 40 | allstate 41 | ally 42 | alsace 43 | alstom 44 | am 45 | amazon 46 | americanexpress 47 | americanfamily 48 | amex 49 | amfam 50 | amica 51 | amsterdam 52 | analytics 53 | android 54 | anquan 55 | anz 56 | ao 57 | aol 58 | apartments 59 | app 60 | apple 61 | aq 62 | aquarelle 63 | ar 64 | arab 65 | aramco 66 | archi 67 | army 68 | arpa 69 | art 70 | arte 71 | as 72 | asda 73 | asia 74 | associates 75 | at 76 | athleta 77 | attorney 78 | au 79 | auction 80 | 
audi 81 | audible 82 | audio 83 | auspost 84 | author 85 | auto 86 | autos 87 | aw 88 | aws 89 | ax 90 | axa 91 | az 92 | azure 93 | ba 94 | baby 95 | baidu 96 | banamex 97 | band 98 | bank 99 | bar 100 | barcelona 101 | barclaycard 102 | barclays 103 | barefoot 104 | bargains 105 | baseball 106 | basketball 107 | bauhaus 108 | bayern 109 | bb 110 | bbc 111 | bbt 112 | bbva 113 | bcg 114 | bcn 115 | bd 116 | be 117 | beats 118 | beauty 119 | beer 120 | bentley 121 | berlin 122 | best 123 | bestbuy 124 | bet 125 | bf 126 | bg 127 | bh 128 | bharti 129 | bi 130 | bible 131 | bid 132 | bike 133 | bing 134 | bingo 135 | bio 136 | biz 137 | bj 138 | black 139 | blackfriday 140 | blockbuster 141 | blog 142 | bloomberg 143 | blue 144 | bm 145 | bms 146 | bmw 147 | bn 148 | bnpparibas 149 | bo 150 | boats 151 | boehringer 152 | bofa 153 | bom 154 | bond 155 | boo 156 | book 157 | booking 158 | bosch 159 | bostik 160 | boston 161 | bot 162 | boutique 163 | box 164 | br 165 | bradesco 166 | bridgestone 167 | broadway 168 | broker 169 | brother 170 | brussels 171 | bs 172 | bt 173 | build 174 | builders 175 | business 176 | buy 177 | buzz 178 | bv 179 | bw 180 | by 181 | bz 182 | bzh 183 | ca 184 | cab 185 | cafe 186 | cal 187 | call 188 | calvinklein 189 | cam 190 | camera 191 | camp 192 | canon 193 | capetown 194 | capital 195 | capitalone 196 | car 197 | caravan 198 | cards 199 | care 200 | career 201 | careers 202 | cars 203 | casa 204 | case 205 | cash 206 | casino 207 | cat 208 | catering 209 | catholic 210 | cba 211 | cbn 212 | cbre 213 | cc 214 | cd 215 | center 216 | ceo 217 | cern 218 | cf 219 | cfa 220 | cfd 221 | cg 222 | ch 223 | chanel 224 | channel 225 | charity 226 | chase 227 | chat 228 | cheap 229 | chintai 230 | christmas 231 | chrome 232 | church 233 | ci 234 | cipriani 235 | circle 236 | cisco 237 | citadel 238 | citi 239 | citic 240 | city 241 | ck 242 | cl 243 | claims 244 | cleaning 245 | click 246 | clinic 247 | clinique 248 | clothing 249 | cloud 250 
| club 251 | clubmed 252 | cm 253 | cn 254 | co 255 | coach 256 | codes 257 | coffee 258 | college 259 | cologne 260 | com 261 | commbank 262 | community 263 | company 264 | compare 265 | computer 266 | comsec 267 | condos 268 | construction 269 | consulting 270 | contact 271 | contractors 272 | cooking 273 | cool 274 | coop 275 | corsica 276 | country 277 | coupon 278 | coupons 279 | courses 280 | cpa 281 | cr 282 | credit 283 | creditcard 284 | creditunion 285 | cricket 286 | crown 287 | crs 288 | cruise 289 | cruises 290 | cu 291 | cuisinella 292 | cv 293 | cw 294 | cx 295 | cy 296 | cymru 297 | cyou 298 | cz 299 | dad 300 | dance 301 | data 302 | date 303 | dating 304 | datsun 305 | day 306 | dclk 307 | dds 308 | de 309 | deal 310 | dealer 311 | deals 312 | degree 313 | delivery 314 | dell 315 | deloitte 316 | delta 317 | democrat 318 | dental 319 | dentist 320 | desi 321 | design 322 | dev 323 | dhl 324 | diamonds 325 | diet 326 | digital 327 | direct 328 | directory 329 | discount 330 | discover 331 | dish 332 | diy 333 | dj 334 | dk 335 | dm 336 | dnp 337 | do 338 | docs 339 | doctor 340 | dog 341 | domains 342 | dot 343 | download 344 | drive 345 | dtv 346 | dubai 347 | dunlop 348 | dupont 349 | durban 350 | dvag 351 | dvr 352 | dz 353 | earth 354 | eat 355 | ec 356 | eco 357 | edeka 358 | edu 359 | education 360 | ee 361 | eg 362 | email 363 | emerck 364 | energy 365 | engineer 366 | engineering 367 | enterprises 368 | epson 369 | equipment 370 | er 371 | ericsson 372 | erni 373 | es 374 | esq 375 | estate 376 | et 377 | eu 378 | eurovision 379 | eus 380 | events 381 | exchange 382 | expert 383 | exposed 384 | express 385 | extraspace 386 | fage 387 | fail 388 | fairwinds 389 | faith 390 | family 391 | fan 392 | fans 393 | farm 394 | farmers 395 | fashion 396 | fast 397 | fedex 398 | feedback 399 | ferrari 400 | ferrero 401 | fi 402 | fidelity 403 | fido 404 | film 405 | final 406 | finance 407 | financial 408 | fire 409 | firestone 410 | firmdale 411 | 
fish 412 | fishing 413 | fit 414 | fitness 415 | fj 416 | fk 417 | flickr 418 | flights 419 | flir 420 | florist 421 | flowers 422 | fly 423 | fm 424 | fo 425 | foo 426 | food 427 | football 428 | ford 429 | forex 430 | forsale 431 | forum 432 | foundation 433 | fox 434 | fr 435 | free 436 | fresenius 437 | frl 438 | frogans 439 | frontier 440 | ftr 441 | fujitsu 442 | fun 443 | fund 444 | furniture 445 | futbol 446 | fyi 447 | ga 448 | gal 449 | gallery 450 | gallo 451 | gallup 452 | game 453 | games 454 | gap 455 | garden 456 | gay 457 | gb 458 | gbiz 459 | gd 460 | gdn 461 | ge 462 | gea 463 | gent 464 | genting 465 | george 466 | gf 467 | gg 468 | ggee 469 | gh 470 | gi 471 | gift 472 | gifts 473 | gives 474 | giving 475 | gl 476 | glass 477 | gle 478 | global 479 | globo 480 | gm 481 | gmail 482 | gmbh 483 | gmo 484 | gmx 485 | gn 486 | godaddy 487 | gold 488 | goldpoint 489 | golf 490 | goo 491 | goodyear 492 | goog 493 | google 494 | gop 495 | got 496 | gov 497 | gp 498 | gq 499 | gr 500 | grainger 501 | graphics 502 | gratis 503 | green 504 | gripe 505 | grocery 506 | group 507 | gs 508 | gt 509 | gu 510 | gucci 511 | guge 512 | guide 513 | guitars 514 | guru 515 | gw 516 | gy 517 | hair 518 | hamburg 519 | hangout 520 | haus 521 | hbo 522 | hdfc 523 | hdfcbank 524 | health 525 | healthcare 526 | help 527 | helsinki 528 | here 529 | hermes 530 | hiphop 531 | hisamitsu 532 | hitachi 533 | hiv 534 | hk 535 | hkt 536 | hm 537 | hn 538 | hockey 539 | holdings 540 | holiday 541 | homedepot 542 | homegoods 543 | homes 544 | homesense 545 | honda 546 | horse 547 | hospital 548 | host 549 | hosting 550 | hot 551 | hotels 552 | hotmail 553 | house 554 | how 555 | hr 556 | hsbc 557 | ht 558 | hu 559 | hughes 560 | hyatt 561 | hyundai 562 | ibm 563 | icbc 564 | ice 565 | icu 566 | id 567 | ie 568 | ieee 569 | ifm 570 | ikano 571 | il 572 | im 573 | imamat 574 | imdb 575 | immo 576 | immobilien 577 | in 578 | inc 579 | industries 580 | infiniti 581 | info 582 | ing 583 
| ink 584 | institute 585 | insurance 586 | insure 587 | int 588 | international 589 | intuit 590 | investments 591 | io 592 | ipiranga 593 | iq 594 | ir 595 | irish 596 | is 597 | ismaili 598 | ist 599 | istanbul 600 | it 601 | itau 602 | itv 603 | jaguar 604 | java 605 | jcb 606 | je 607 | jeep 608 | jetzt 609 | jewelry 610 | jio 611 | jll 612 | jm 613 | jmp 614 | jnj 615 | jo 616 | jobs 617 | joburg 618 | jot 619 | joy 620 | jp 621 | jpmorgan 622 | jprs 623 | juegos 624 | juniper 625 | kaufen 626 | kddi 627 | ke 628 | kerryhotels 629 | kerrylogistics 630 | kerryproperties 631 | kfh 632 | kg 633 | kh 634 | ki 635 | kia 636 | kids 637 | kim 638 | kindle 639 | kitchen 640 | kiwi 641 | km 642 | kn 643 | koeln 644 | komatsu 645 | kosher 646 | kp 647 | kpmg 648 | kpn 649 | kr 650 | krd 651 | kred 652 | kuokgroup 653 | kw 654 | ky 655 | kyoto 656 | kz 657 | la 658 | lacaixa 659 | lamborghini 660 | lamer 661 | lancaster 662 | land 663 | landrover 664 | lanxess 665 | lasalle 666 | lat 667 | latino 668 | latrobe 669 | law 670 | lawyer 671 | lb 672 | lc 673 | lds 674 | lease 675 | leclerc 676 | lefrak 677 | legal 678 | lego 679 | lexus 680 | lgbt 681 | li 682 | lidl 683 | life 684 | lifeinsurance 685 | lifestyle 686 | lighting 687 | like 688 | lilly 689 | limited 690 | limo 691 | lincoln 692 | link 693 | lipsy 694 | live 695 | living 696 | lk 697 | llc 698 | llp 699 | loan 700 | loans 701 | locker 702 | locus 703 | lol 704 | london 705 | lotte 706 | lotto 707 | love 708 | lpl 709 | lplfinancial 710 | lr 711 | ls 712 | lt 713 | ltd 714 | ltda 715 | lu 716 | lundbeck 717 | luxe 718 | luxury 719 | lv 720 | ly 721 | ma 722 | madrid 723 | maif 724 | maison 725 | makeup 726 | man 727 | management 728 | mango 729 | map 730 | market 731 | marketing 732 | markets 733 | marriott 734 | marshalls 735 | mattel 736 | mba 737 | mc 738 | mckinsey 739 | md 740 | me 741 | med 742 | media 743 | meet 744 | melbourne 745 | meme 746 | memorial 747 | men 748 | menu 749 | merckmsd 750 | mg 751 | 
mh 752 | miami 753 | microsoft 754 | mil 755 | mini 756 | mint 757 | mit 758 | mitsubishi 759 | mk 760 | ml 761 | mlb 762 | mls 763 | mm 764 | mma 765 | mn 766 | mo 767 | mobi 768 | mobile 769 | moda 770 | moe 771 | moi 772 | mom 773 | monash 774 | money 775 | monster 776 | mormon 777 | mortgage 778 | moscow 779 | moto 780 | motorcycles 781 | mov 782 | movie 783 | mp 784 | mq 785 | mr 786 | ms 787 | msd 788 | mt 789 | mtn 790 | mtr 791 | mu 792 | museum 793 | music 794 | mv 795 | mw 796 | mx 797 | my 798 | mz 799 | na 800 | nab 801 | nagoya 802 | name 803 | navy 804 | nba 805 | nc 806 | ne 807 | nec 808 | net 809 | netbank 810 | netflix 811 | network 812 | neustar 813 | new 814 | news 815 | next 816 | nextdirect 817 | nexus 818 | nf 819 | nfl 820 | ng 821 | ngo 822 | nhk 823 | ni 824 | nico 825 | nike 826 | nikon 827 | ninja 828 | nissan 829 | nissay 830 | nl 831 | no 832 | nokia 833 | norton 834 | now 835 | nowruz 836 | nowtv 837 | np 838 | nr 839 | nra 840 | nrw 841 | ntt 842 | nu 843 | nyc 844 | nz 845 | obi 846 | observer 847 | office 848 | okinawa 849 | olayan 850 | olayangroup 851 | ollo 852 | om 853 | omega 854 | one 855 | ong 856 | onl 857 | online 858 | ooo 859 | open 860 | oracle 861 | orange 862 | org 863 | organic 864 | origins 865 | osaka 866 | otsuka 867 | ott 868 | ovh 869 | pa 870 | page 871 | panasonic 872 | paris 873 | pars 874 | partners 875 | parts 876 | party 877 | pay 878 | pccw 879 | pe 880 | pet 881 | pf 882 | pfizer 883 | pg 884 | ph 885 | pharmacy 886 | phd 887 | philips 888 | phone 889 | photo 890 | photography 891 | photos 892 | physio 893 | pics 894 | pictet 895 | pictures 896 | pid 897 | pin 898 | ping 899 | pink 900 | pioneer 901 | pizza 902 | pk 903 | pl 904 | place 905 | play 906 | playstation 907 | plumbing 908 | plus 909 | pm 910 | pn 911 | pnc 912 | pohl 913 | poker 914 | politie 915 | porn 916 | post 917 | pr 918 | pramerica 919 | praxi 920 | press 921 | prime 922 | pro 923 | prod 924 | productions 925 | prof 926 | progressive 
927 | promo 928 | properties 929 | property 930 | protection 931 | pru 932 | prudential 933 | ps 934 | pt 935 | pub 936 | pw 937 | pwc 938 | py 939 | qa 940 | qpon 941 | quebec 942 | quest 943 | racing 944 | radio 945 | re 946 | read 947 | realestate 948 | realtor 949 | realty 950 | recipes 951 | red 952 | redstone 953 | redumbrella 954 | rehab 955 | reise 956 | reisen 957 | reit 958 | reliance 959 | ren 960 | rent 961 | rentals 962 | repair 963 | report 964 | republican 965 | rest 966 | restaurant 967 | review 968 | reviews 969 | rexroth 970 | rich 971 | richardli 972 | ricoh 973 | ril 974 | rio 975 | rip 976 | ro 977 | rocks 978 | rodeo 979 | rogers 980 | room 981 | rs 982 | rsvp 983 | ru 984 | rugby 985 | ruhr 986 | run 987 | rw 988 | rwe 989 | ryukyu 990 | sa 991 | saarland 992 | safe 993 | safety 994 | sakura 995 | sale 996 | salon 997 | samsclub 998 | samsung 999 | sandvik 1000 | sandvikcoromant 1001 | sanofi 1002 | sap 1003 | sarl 1004 | sas 1005 | save 1006 | saxo 1007 | sb 1008 | sbi 1009 | sbs 1010 | sc 1011 | scb 1012 | schaeffler 1013 | schmidt 1014 | scholarships 1015 | school 1016 | schule 1017 | schwarz 1018 | science 1019 | scot 1020 | sd 1021 | se 1022 | search 1023 | seat 1024 | secure 1025 | security 1026 | seek 1027 | select 1028 | sener 1029 | services 1030 | seven 1031 | sew 1032 | sex 1033 | sexy 1034 | sfr 1035 | sg 1036 | sh 1037 | shangrila 1038 | sharp 1039 | shell 1040 | shia 1041 | shiksha 1042 | shoes 1043 | shop 1044 | shopping 1045 | shouji 1046 | show 1047 | si 1048 | silk 1049 | sina 1050 | singles 1051 | site 1052 | sj 1053 | sk 1054 | ski 1055 | skin 1056 | sky 1057 | skype 1058 | sl 1059 | sling 1060 | sm 1061 | smart 1062 | smile 1063 | sn 1064 | sncf 1065 | so 1066 | soccer 1067 | social 1068 | softbank 1069 | software 1070 | sohu 1071 | solar 1072 | solutions 1073 | song 1074 | sony 1075 | soy 1076 | spa 1077 | space 1078 | sport 1079 | spot 1080 | sr 1081 | srl 1082 | ss 1083 | st 1084 | stada 1085 | staples 1086 | star 1087 
| statebank 1088 | statefarm 1089 | stc 1090 | stcgroup 1091 | stockholm 1092 | storage 1093 | store 1094 | stream 1095 | studio 1096 | study 1097 | style 1098 | su 1099 | sucks 1100 | supplies 1101 | supply 1102 | support 1103 | surf 1104 | surgery 1105 | suzuki 1106 | sv 1107 | swatch 1108 | swiss 1109 | sx 1110 | sy 1111 | sydney 1112 | systems 1113 | sz 1114 | tab 1115 | taipei 1116 | talk 1117 | taobao 1118 | target 1119 | tatamotors 1120 | tatar 1121 | tattoo 1122 | tax 1123 | taxi 1124 | tc 1125 | tci 1126 | td 1127 | tdk 1128 | team 1129 | tech 1130 | technology 1131 | tel 1132 | temasek 1133 | tennis 1134 | teva 1135 | tf 1136 | tg 1137 | th 1138 | thd 1139 | theater 1140 | theatre 1141 | tiaa 1142 | tickets 1143 | tienda 1144 | tips 1145 | tires 1146 | tirol 1147 | tj 1148 | tjmaxx 1149 | tjx 1150 | tk 1151 | tkmaxx 1152 | tl 1153 | tm 1154 | tmall 1155 | tn 1156 | to 1157 | today 1158 | tokyo 1159 | tools 1160 | top 1161 | toray 1162 | toshiba 1163 | total 1164 | tours 1165 | town 1166 | toyota 1167 | toys 1168 | tr 1169 | trade 1170 | trading 1171 | training 1172 | travel 1173 | travelers 1174 | travelersinsurance 1175 | trust 1176 | trv 1177 | tt 1178 | tube 1179 | tui 1180 | tunes 1181 | tushu 1182 | tv 1183 | tvs 1184 | tw 1185 | tz 1186 | ua 1187 | ubank 1188 | ubs 1189 | ug 1190 | uk 1191 | unicom 1192 | university 1193 | uno 1194 | uol 1195 | ups 1196 | us 1197 | uy 1198 | uz 1199 | va 1200 | vacations 1201 | vana 1202 | vanguard 1203 | vc 1204 | ve 1205 | vegas 1206 | ventures 1207 | verisign 1208 | versicherung 1209 | vet 1210 | vg 1211 | vi 1212 | viajes 1213 | video 1214 | vig 1215 | viking 1216 | villas 1217 | vin 1218 | vip 1219 | virgin 1220 | visa 1221 | vision 1222 | viva 1223 | vivo 1224 | vlaanderen 1225 | vn 1226 | vodka 1227 | volvo 1228 | vote 1229 | voting 1230 | voto 1231 | voyage 1232 | vu 1233 | wales 1234 | walmart 1235 | walter 1236 | wang 1237 | wanggou 1238 | watch 1239 | watches 1240 | weather 1241 | weatherchannel 1242 | 
webcam 1243 | weber 1244 | website 1245 | wed 1246 | wedding 1247 | weibo 1248 | weir 1249 | wf 1250 | whoswho 1251 | wien 1252 | wiki 1253 | williamhill 1254 | win 1255 | windows 1256 | wine 1257 | winners 1258 | wme 1259 | wolterskluwer 1260 | woodside 1261 | work 1262 | works 1263 | world 1264 | wow 1265 | ws 1266 | wtc 1267 | wtf 1268 | xbox 1269 | xerox 1270 | xihuan 1271 | xin 1272 | xn--11b4c3d 1273 | xn--1ck2e1b 1274 | xn--1qqw23a 1275 | xn--2scrj9c 1276 | xn--30rr7y 1277 | xn--3bst00m 1278 | xn--3ds443g 1279 | xn--3e0b707e 1280 | xn--3hcrj9c 1281 | xn--3pxu8k 1282 | xn--42c2d9a 1283 | xn--45br5cyl 1284 | xn--45brj9c 1285 | xn--45q11c 1286 | xn--4dbrk0ce 1287 | xn--4gbrim 1288 | xn--54b7fta0cc 1289 | xn--55qw42g 1290 | xn--55qx5d 1291 | xn--5su34j936bgsg 1292 | xn--5tzm5g 1293 | xn--6frz82g 1294 | xn--6qq986b3xl 1295 | xn--80adxhks 1296 | xn--80ao21a 1297 | xn--80aqecdr1a 1298 | xn--80asehdb 1299 | xn--80aswg 1300 | xn--8y0a063a 1301 | xn--90a3ac 1302 | xn--90ae 1303 | xn--90ais 1304 | xn--9dbq2a 1305 | xn--9et52u 1306 | xn--9krt00a 1307 | xn--b4w605ferd 1308 | xn--bck1b9a5dre4c 1309 | xn--c1avg 1310 | xn--c2br7g 1311 | xn--cck2b3b 1312 | xn--cckwcxetd 1313 | xn--cg4bki 1314 | xn--clchc0ea0b2g2a9gcd 1315 | xn--czr694b 1316 | xn--czrs0t 1317 | xn--czru2d 1318 | xn--d1acj3b 1319 | xn--d1alf 1320 | xn--e1a4c 1321 | xn--eckvdtc9d 1322 | xn--efvy88h 1323 | xn--fct429k 1324 | xn--fhbei 1325 | xn--fiq228c5hs 1326 | xn--fiq64b 1327 | xn--fiqs8s 1328 | xn--fiqz9s 1329 | xn--fjq720a 1330 | xn--flw351e 1331 | xn--fpcrj9c3d 1332 | xn--fzc2c9e2c 1333 | xn--fzys8d69uvgm 1334 | xn--g2xx48c 1335 | xn--gckr3f0f 1336 | xn--gecrj9c 1337 | xn--gk3at1e 1338 | xn--h2breg3eve 1339 | xn--h2brj9c 1340 | xn--h2brj9c8c 1341 | xn--hxt814e 1342 | xn--i1b6b1a6a2e 1343 | xn--imr513n 1344 | xn--io0a7i 1345 | xn--j1aef 1346 | xn--j1amh 1347 | xn--j6w193g 1348 | xn--jlq480n2rg 1349 | xn--jvr189m 1350 | xn--kcrx77d1x4a 1351 | xn--kprw13d 1352 | xn--kpry57d 1353 | xn--kput3i 1354 | xn--l1acc 
1355 | xn--lgbbat1ad8j 1356 | xn--mgb9awbf 1357 | xn--mgba3a3ejt 1358 | xn--mgba3a4f16a 1359 | xn--mgba7c0bbn0a 1360 | xn--mgbaam7a8h 1361 | xn--mgbab2bd 1362 | xn--mgbah1a3hjkrd 1363 | xn--mgbai9azgqp6j 1364 | xn--mgbayh7gpa 1365 | xn--mgbbh1a 1366 | xn--mgbbh1a71e 1367 | xn--mgbc0a9azcg 1368 | xn--mgbca7dzdo 1369 | xn--mgbcpq6gpa1a 1370 | xn--mgberp4a5d4ar 1371 | xn--mgbgu82a 1372 | xn--mgbi4ecexp 1373 | xn--mgbpl2fh 1374 | xn--mgbt3dhd 1375 | xn--mgbtx2b 1376 | xn--mgbx4cd0ab 1377 | xn--mix891f 1378 | xn--mk1bu44c 1379 | xn--mxtq1m 1380 | xn--ngbc5azd 1381 | xn--ngbe9e0a 1382 | xn--ngbrx 1383 | xn--node 1384 | xn--nqv7f 1385 | xn--nqv7fs00ema 1386 | xn--nyqy26a 1387 | xn--o3cw4h 1388 | xn--ogbpf8fl 1389 | xn--otu796d 1390 | xn--p1acf 1391 | xn--p1ai 1392 | xn--pgbs0dh 1393 | xn--pssy2u 1394 | xn--q7ce6a 1395 | xn--q9jyb4c 1396 | xn--qcka1pmc 1397 | xn--qxa6a 1398 | xn--qxam 1399 | xn--rhqv96g 1400 | xn--rovu88b 1401 | xn--rvc1e0am3e 1402 | xn--s9brj9c 1403 | xn--ses554g 1404 | xn--t60b56a 1405 | xn--tckwe 1406 | xn--tiq49xqyj 1407 | xn--unup4y 1408 | xn--vermgensberater-ctb 1409 | xn--vermgensberatung-pwb 1410 | xn--vhquv 1411 | xn--vuq861b 1412 | xn--w4r85el8fhu5dnra 1413 | xn--w4rs40l 1414 | xn--wgbh1c 1415 | xn--wgbl6a 1416 | xn--xhq521b 1417 | xn--xkc2al3hye2a 1418 | xn--xkc2dl3a5ee0h 1419 | xn--y9a3aq 1420 | xn--yfro4i67o 1421 | xn--ygbi2ammx 1422 | xn--zfr164b 1423 | xxx 1424 | xyz 1425 | yachts 1426 | yahoo 1427 | yamaxun 1428 | yandex 1429 | ye 1430 | yodobashi 1431 | yoga 1432 | yokohama 1433 | you 1434 | youtube 1435 | yt 1436 | yun 1437 | za 1438 | zappos 1439 | zara 1440 | zero 1441 | zip 1442 | zm 1443 | zone 1444 | zuerich 1445 | zw 1446 | --------------------------------------------------------------------------------