├── README.md
├── index.go
├── search.go
├── search_test.go
└── segmentation.go


/README.md:
--------------------------------------------------------------------------------
# redisgosearch

redisgosearch implements fast full-text search with Golang and Redis, using Redis's rich support for sets.

This fork of @sunfmin's original adds custom keyword segmentation and cleaned-up, documented interfaces. The tests have no external dependencies other than a working Redis installation.

The [original documentation](https://theplant.jp/en/blogs/13-techforce-making-a-simple-full-text-search-with-golang-and-redis) is clarified below.

## Tutorial

Check out [godoc](http://godoc.org/github.com/purohit/redisgosearch) for package documentation.

Let's say you have blog entries:

```go
type Entry struct {
	Id      string
	Title   string
	Content string
}
```

You want to be able to search on the `Title` and `Content` fields. Let's create two blog entries to index.

```go
e1 := &Entry{
	Id:      "50344415ff3a8aa694000001",
	Title:   "Organizing Go code",
	Content: "Go code is organized differently from that of other languages. This post discusses",
}

e2 := &Entry{
	Id:      "50344415ff3a8aa694000002",
	Title:   "Getting to know the Go community",
	Content: "Over the past couple of years Go has attracted a lot of users and contributors",
}
```

All keys in Redis are prefixed by a namespace you pass in when creating a `Client` (in this case, `entries`).

When you call `Index` on the two entries, the text from `Title` and `Content` is broken up into keywords by the `DefaultSegment` function (if you have your own keyword segmentation function, call `IndexCustom` instead). Each keyword becomes a Redis key whose value is a set, and the set's members point back to the original entries.

```
redis 127.0.0.1:6379> keys *
1) "entries:keywords:go"
2) "entries:keywords:community"
3) ...

redis 127.0.0.1:6379> SMEMBERS entries:keywords:go
1) "entries:entity:50344415ff3a8aa694000001"
2) "entries:entity:50344415ff3a8aa694000002"

redis 127.0.0.1:6379> SMEMBERS entries:keywords:community
1) "entries:entity:50344415ff3a8aa694000002"
```

That way, finding the entries that match a query such as "go community" is a simple set intersection. The query is first segmented to `["go", "community"]`, and the intersection (Redis: `SINTER`) of the keyword sets yields the keys of the matching entities.

```
redis 127.0.0.1:6379> SINTER entries:keywords:go entries:keywords:community
1) "entries:entity:50344415ff3a8aa694000002"
```

`Search` then fetches the entities stored under the resulting keys, unmarshals them into the original structs, and returns them to you. redisgosearch can index any Go struct satisfying `Indexable`.

```go
type Entry struct {
	Id          string
	GroupId     string
	Title       string
	Content     string
	Attachments []*Attachment
	CreatedAt   time.Time
}

func (entry *Entry) IndexPieces() (r []string, ais []redisgosearch.Indexable) {
	r = append(r, entry.Title)
	r = append(r, entry.Content)

	for _, a := range entry.Attachments {
		r = append(r, a.Filename)
		ais = append(ais, &IndexedAttachment{entry, a})
	}

	return
}

func (entry *Entry) IndexEntity() (indexType string, key string, entity interface{}, rank int64) {
	key = entry.Id
	indexType = "entries"
	entity = entry
	rank = entry.CreatedAt.UnixNano() // higher ranks sort first, so newer entries come first
	return
}

func (entry *Entry) IndexFilters() (r map[string]string) {
	r = make(map[string]string)
	r["group"] = entry.GroupId
	return
}
```

`IndexPieces` tells the package what text should be segmented and indexed.
In our example, you might also want to index other data connected to an entry, such as attachments, so that searching for a filename finds the entries the file belongs to. To support this, the second return value (`ais`) is a slice of `Indexable` objects that are indexed along with the original struct.

`IndexEntity` tells the package the `indexType` string (used to prefix keys), the unique key, the entity itself, and a rank used to order search results (highest first). Combined with the namespace, the index type and key form the Redis key under which the entity is stored with `SET`, after being marshalled into JSON.

`IndexFilters` attaches metadata that can further filter queries. For example, because we added a filter above, you can search "go community" restricted to the "group" "New York":

```go
var entries []*Entry
count, err := client.Search("entries", "go community",
	map[string]string{"group": "New York"},
	0, 20, &entries)
```

The `0` and `20` are the pagination offset and page size, and `count` is the total number of entries that matched the query and filters.

## Contributing
The current feature set is simple, and new features are appreciated. Please open a pull request, and make sure to run `go fmt` and `golint` first!

--------------------------------------------------------------------------------
/index.go:
--------------------------------------------------------------------------------
package redisgosearch

import (
	"encoding/json"
	"strings"

	"github.com/garyburd/redigo/redis"
)

// Client wraps a namespace (Redis-key prefix) and internal connection.
type Client struct {
	namespace string
	redisConn redis.Conn
}

// Indexable is satisfied by any struct that can be indexed
// and searched in Redis by this package.
type Indexable interface {
	// IndexPieces returns the text pieces to segment and index, plus any
	// related Indexables that should be indexed alongside this one.
	IndexPieces() (pieces []string, relatedPieces []Indexable)
	// IndexEntity returns the index type (a key prefix), the unique key,
	// the entity to marshal into JSON, and a rank used to sort results.
	IndexEntity() (indexType string, key string, entity interface{}, rank int64)
	// IndexFilters returns key/value metadata that searches can filter on.
	IndexFilters() (r map[string]string)
}

// NewClient returns a Client given the redis address and namespace,
// or an error if a connection couldn't be made.
func NewClient(address string, namespace string) (r *Client, err error) {
	r = &Client{namespace: namespace}
	r.redisConn, err = redis.Dial("tcp", address)
	return
}

func (client *Client) index(i Indexable, segmentFn SegmentFn) (err error) {
	indexType, key, entity, rank := i.IndexEntity()

	c, err := json.Marshal(entity)
	if err != nil {
		return
	}

	pieces, relatedIndexables := i.IndexPieces()

	// Store the marshalled entity and its rank under the namespaced entity key.
	entityKey := client.withnamespace(indexType, "entity", key)
	if _, err = client.redisConn.Do("SET", entityKey, c); err != nil {
		return
	}
	if _, err = client.redisConn.Do("SET", "rank_"+entityKey, rank); err != nil {
		return
	}

	filters := i.IndexFilters()

	// Each filter key/value pair owns a set of the entity keys that carry it.
	for k, v := range filters {
		if _, err = client.redisConn.Do("SADD", client.withnamespace(indexType, "filters", k, v), entityKey); err != nil {
			return
		}
	}

	// Each keyword owns a set of the entity keys whose text contains it.
	for _, piece := range pieces {
		words := segmentFn(piece)
		for _, word := range words {
			if _, err = client.redisConn.Do("SADD", client.withnamespace(indexType, "keywords", word), entityKey); err != nil {
				return
			}
		}
	}

	// Index related entities with the same segmentation function.
	for _, i1 := range relatedIndexables {
		if err = client.index(i1, segmentFn); err != nil {
			return
		}
	}

	return
}

// Index marshals the given Indexable and stores
// it in the Redis database, using the default keyword segmentation function.
func (client *Client) Index(i Indexable) (err error) {
	return client.index(i, DefaultSegment)
}

// IndexCustom does the same as Index, with a custom keyword segmentation function.
func (client *Client) IndexCustom(i Indexable, segmentFn SegmentFn) (err error) {
	return client.index(i, segmentFn)
}

func (client *Client) removeIndex(i Indexable, segmentFn SegmentFn) (err error) {
	// Only the index type and key are needed to locate the stored data.
	indexType, key, _, _ := i.IndexEntity()

	pieces, relatedIndexables := i.IndexPieces()

	// DEL takes key names only, so delete the entity key and its rank key.
	entityKey := client.withnamespace(indexType, "entity", key)
	if _, err = client.redisConn.Do("DEL", entityKey, "rank_"+entityKey); err != nil {
		return
	}

	filters := i.IndexFilters()

	for k, v := range filters {
		if _, err = client.redisConn.Do("SREM", client.withnamespace(indexType, "filters", k, v), entityKey); err != nil {
			return
		}
	}

	for _, piece := range pieces {
		words := segmentFn(piece)
		for _, word := range words {
			if _, err = client.redisConn.Do("SREM", client.withnamespace(indexType, "keywords", word), entityKey); err != nil {
				return
			}
		}
	}

	// Remove related entities with the same segmentation function.
	for _, i1 := range relatedIndexables {
		if err = client.removeIndex(i1, segmentFn); err != nil {
			return
		}
	}

	return
}

// RemoveIndex deletes the Redis keys and data for the given
// Indexable (the opposite of Index).
func (client *Client) RemoveIndex(i Indexable) (err error) {
	return client.removeIndex(i, DefaultSegment)
}

// RemoveIndexCustom does the same as RemoveIndex, with a custom keyword segmentation function.
func (client *Client) RemoveIndexCustom(i Indexable, segmentFn SegmentFn) (err error) {
	return client.removeIndex(i, segmentFn)
}

func (client *Client) withnamespace(keys ...string) (r string) {
	keys = append([]string{client.namespace}, keys...)
	r = strings.Join(keys, ":")
	return
}

--------------------------------------------------------------------------------
/search.go:
--------------------------------------------------------------------------------
package redisgosearch

import (
	"encoding/json"
	"strings"
)

func (client *Client) search(indexType string, keywords string, filters map[string]string, skip int, limit int, segmentFn SegmentFn, result interface{}) (count int, err error) {
	// Segment the query with the supplied function so that it matches
	// how the entities were segmented when they were indexed.
	words := segmentFn(keywords)
	if len(words) == 0 {
		return
	}
	keywordsKey := client.withnamespace(indexType, "search", strings.Join(words, "+"))
	var args []interface{}
	for _, word := range words {
		args = append(args, client.withnamespace(indexType, "keywords", word))
	}

	// Ranging over a nil map is a no-op, so no nil check is needed.
	for k, v := range filters {
		args = append(args, client.withnamespace(indexType, "filters", k, v))
	}

	// The destination key comes first, followed by the sets to intersect.
	args = append([]interface{}{keywordsKey}, args...)

	_, err = client.redisConn.Do("SINTERSTORE", args...)

	if err != nil {
		return
	}

	// Order the matching entity keys by their stored rank, highest first.
	sortArgs := []interface{}{keywordsKey, "BY", "rank_*", "DESC"}

	rawKeyRs, err := client.redisConn.Do("SORT", sortArgs...)
	if err != nil {
		return
	}

	iKeyRs := rawKeyRs.([]interface{})
	if len(iKeyRs) == 0 {
		return
	}

	count = len(iKeyRs)
	if skip < 0 {
		skip = 0
	}
	if skip >= count {
		// The requested page starts past the last match.
		return
	}
	end := skip + limit
	if end > count {
		end = count
	}
	iKeyRs = iKeyRs[skip:end]

	rawRs, err := client.redisConn.Do("MGET", iKeyRs...)
	if err != nil {
		return
	}

	iRs := rawRs.([]interface{})

	// Skip keys whose entity no longer exists; MGET returns nil for those.
	var stringRs []string
	for _, row := range iRs {
		if row == nil {
			continue
		}
		stringRs = append(stringRs, string(row.([]byte)))
	}

	// Join the stored JSON objects into one JSON array and decode it into result.
	jsonData := "[" + strings.Join(stringRs, ", ") + "]"
	err = json.Unmarshal([]byte(jsonData), result)

	return
}

// Search finds the indexed entities of type indexType that match every keyword
// and filter, unmarshals the requested page of them into result (a pointer to
// a slice), and returns the total number of matches.
func (client *Client) Search(indexType string, keywords string, filters map[string]string, skip int, limit int, result interface{}) (count int, err error) {
	return client.search(indexType, keywords, filters, skip, limit, DefaultSegment, result)
}

// SearchCustom does the same as Search, but with a custom keyword segmentation function instead of the default one. See SegmentFn.
func (client *Client) SearchCustom(indexType string, keywords string, filters map[string]string, skip int, limit int, segmentFn SegmentFn, result interface{}) (count int, err error) {
	return client.search(indexType, keywords, filters, skip, limit, segmentFn, result)
}

--------------------------------------------------------------------------------
/search_test.go:
--------------------------------------------------------------------------------
package redisgosearch_test

import (
	"testing"
	"time"

	"github.com/purohit/redisgosearch"
)

type Entry struct {
	ID          string
	GroupID     string
	Title       string
	Content     string
	Attachments []*Attachment
	CreatedAt   time.Time
}

type Attachment struct {
	Filename    string
	ContentType string
	CreatedAt   time.Time
}

type IndexedAttachment struct {
	Entry      *Entry
	Attachment *Attachment
}

func (attachment *IndexedAttachment) IndexPieces() (r []string, ais []redisgosearch.Indexable) {
	r = append(r, attachment.Attachment.Filename)
	return
}

func (attachment *IndexedAttachment) IndexEntity() (indexType string, key string, entity interface{}, rank int64) {
	key = attachment.Entry.ID + attachment.Attachment.Filename
	indexType = "files"
	entity = attachment
	rank = attachment.Entry.CreatedAt.UnixNano()
	return
}

func (attachment *IndexedAttachment) IndexFilters() (r map[string]string) {
	r = make(map[string]string)
	r["group"] = attachment.Entry.GroupID
	return
}

func (entry *Entry) IndexPieces() (r []string, ais []redisgosearch.Indexable) {
	r = append(r, entry.Title)
	r = append(r, entry.Content)

	for _, a := range entry.Attachments {
		r = append(r, a.Filename)
		ais = append(ais, &IndexedAttachment{entry, a})
	}

	return
}

func (entry *Entry) IndexEntity() (indexType string, key string, entity interface{}, rank int64) {
	key = entry.ID
	indexType = "entries"
	entity = entry
	rank = entry.CreatedAt.UnixNano()
	return
}

func (entry *Entry) IndexFilters() (r map[string]string) {
	r = make(map[string]string)
	r["group"] = entry.GroupID
	return
}

// TestIndexAndSearch requires that Redis be set up in your environment,
// listening on the default port.
func TestIndexAndSearch(t *testing.T) {
	client, err := redisgosearch.NewClient("localhost:6379", "theplant")
	if err != nil {
		// Nothing below can run without a connection.
		t.Fatal(err)
	}

	e1 := &Entry{
		ID:      "50344415ff3a8aa694000001",
		GroupID: "Qortex",
		Title:   "Thread Safety",
		Content: "The connection http://google.com Send and Flush methods cannot be called concurrently with other calls to these methods. The connection Receive method cannot be called concurrently with other calls to Receive. Because the connection Do method uses Send, Flush and Receive, the Do method cannot be called concurrently with Send, Flush, Receive or Do. Unless stated otherwise, all other concurrent access is allowed.",
		Attachments: []*Attachment{
			{
				Filename:    "QORTEX UI 0.88.pdf",
				ContentType: "application/pdf",
				CreatedAt:   time.Now(),
			},
		},
		CreatedAt: time.Unix(10000, 0),
	}
	e2 := &Entry{
		ID:      "50344415ff3a8aa694000002",
		GroupID: "ASICS",
		Title:   "redis is a client for the Redis database",
		Content: "The Conn interface is the primary interface for working with Redis. Applications create connections by calling the Dial, DialWithTimeout or NewConn functions. In the future, functions will be added for creating shareded and other types of connections.",
		Attachments: []*Attachment{
			{
				Filename:    "Screen Shot 2012-08-19 at 11.52.51 AM.png",
				ContentType: "image/png",
				CreatedAt:   time.Now(),
			}, {
				Filename:    "Alternate Qortex Logo.jpg",
				ContentType: "image/jpg",
				CreatedAt:   time.Now(),
			},
		},
		CreatedAt: time.Unix(20000, 0),
	}

	if err := client.Index(e1); err != nil {
		t.Fatal(err)
	}
	if err := client.Index(e2); err != nil {
		t.Fatal(err)
	}

	var entries []*Entry
	count, err := client.Search("entries", "concurrent access", nil, 0, 10, &entries)
	if err != nil {
		t.Fatal(err)
	}
	// Check the length before indexing so a failure reports instead of panicking.
	if count != 1 || len(entries) != 1 {
		t.Fatalf("expected exactly 1 entry, got count=%d entries=%v", count, entries)
	}
	if entries[0].Title != "Thread Safety" {
		t.Error(entries[0])
	}

	var attachments []*IndexedAttachment
	_, err = client.Search("files", "alternate qortex", map[string]string{"group": "ASICS"}, 0, 20, &attachments)
	if err != nil {
		t.Fatal(err)
	}
	if len(attachments) != 1 {
		t.Fatalf("expected exactly 1 attachment, got %v", attachments)
	}
	if attachments[0].Attachment.Filename != "Alternate Qortex Logo.jpg" {
		t.Error(attachments[0])
	}

	// Results are sorted by rank, descending, so the newer entry comes first.
	var sorted []*Entry
	if _, err = client.Search("entries", "other", nil, 0, 10, &sorted); err != nil {
		t.Fatal(err)
	}
	if len(sorted) == 0 || sorted[0].ID != "50344415ff3a8aa694000002" {
		t.Error(sorted)
	}
}

--------------------------------------------------------------------------------
/segmentation.go:
--------------------------------------------------------------------------------
package redisgosearch

import (
	"strings"
	"unicode"
)

// SegmentFn breaks the given string into
// keywords to be indexed.
type SegmentFn func(string) []string

// nonWordOrNumbers reports whether w is neither a letter nor a digit.
func nonWordOrNumbers(w rune) (r bool) {
	r = !unicode.IsLetter(w) && !unicode.IsDigit(w)
	return
}

// DefaultSegment lowercases the string and splits it at any
// non-letter, non-digit char.
func DefaultSegment(p string) (r []string) {
	p = strings.ToLower(p)
	r1 := strings.Fields(p)
	for _, word := range r1 {
		deepSplitWords := strings.FieldsFunc(word, nonWordOrNumbers)
		r = append(r, deepSplitWords...)
	}
	return
}

--------------------------------------------------------------------------------