├── Makefile ├── README.asciidoc ├── best-practices.asciidoc ├── code-concepts.asciidoc ├── code-cpp.asciidoc ├── code-golang.asciidoc ├── code-java.asciidoc ├── example.thrift ├── language-reference.asciidoc ├── publish.sh ├── thrift-docinfo.html └── thrift.asciidoc /Makefile: -------------------------------------------------------------------------------- 1 | default: 2 | asciidoc \ 3 | -b html5 \ 4 | -a theme=flask \ 5 | -a toc2 \ 6 | -a data-uri \ 7 | -a docinfo \ 8 | -a icons \ 9 | -a pygments \ 10 | -a iconsdir=/usr/local/Cellar/asciidoc/8.6.8/etc/asciidoc/images/icons \ 11 | -o index.html \ 12 | thrift.asciidoc 13 | 14 | pdf: 15 | a2x --fop \ 16 | -a toc \ 17 | -a data-uri \ 18 | -a docinfo \ 19 | -a icons \ 20 | -a pygments \ 21 | -a iconsdir=/usr/local/Cellar/asciidoc/8.6.8/etc/asciidoc/images/icons \ 22 | -a pygments \ 23 | --no-xmllint \ 24 | thrift.asciidoc 25 | 26 | all: default pdf 27 | 28 | clean: 29 | rm -f index.html *.png 30 | 31 | publish: default pdf 32 | ./publish.sh 33 | -------------------------------------------------------------------------------- /README.asciidoc: -------------------------------------------------------------------------------- 1 | Thrift: The Missing Guide 2 | ========================= 3 | Diwaker Gupta 4 | 5 | This project is an attempt to plug the gap in http://thrift.apache.org[Thrift] 6 | documentation. The guide can be found at: 7 | 8 | http://diwakergupta.github.io/thrift-missing-guide 9 | 10 | A PDF version can be found at: 11 | 12 | http://diwakergupta.github.io/thrift-missing-guide/thrift.pdf 13 | -------------------------------------------------------------------------------- /best-practices.asciidoc: -------------------------------------------------------------------------------- 1 | Versioning/Compatibility 2 | ~~~~~~~~~~~~~~~~~~~~~~~~ 3 | 4 | Protocols evolve over time. 
If an existing message type no longer meets all 5 | your needs -- for example, you'd like the message format to have an extra field 6 | -- but you'd still like to use code created with the old format, don't worry! 7 | It's very simple to update message types without breaking any of your existing 8 | code. Just remember the following rules: 9 | 10 | * Don't change the numeric tags for any existing fields. 11 | * Any new fields that you add should be optional. This means that any messages 12 | serialized by code using your "old" message format can be parsed by your new 13 | generated code, as they won't be missing any required elements. You should set 14 | up sensible default values for these elements so that new code can properly 15 | interact with messages generated by old code. Similarly, messages created by 16 | your new code can be parsed by your old code: old binaries simply ignore the 17 | new field when parsing. Note, however, that unlike Protocol Buffers, generated 18 | Thrift code skips unknown fields rather than retaining them: if old code later 19 | re-serializes the message, the new fields are lost and will not reach any new 20 | code downstream. 21 | * Non-required fields can be removed, as long as the tag number is not used 22 | again in your updated message type (it may be better to rename the field 23 | instead, perhaps adding the prefix "OBSOLETE_", so that future users of your 24 | .thrift can't accidentally reuse the number). 25 | * Changing a default value is generally OK, as long as you remember that default 26 | values are never sent over the wire. Thus, if a program receives a message in 27 | which a particular field isn't set, the program will see the default value as 28 | it was defined in that program's version of the protocol. It will NOT see the 29 | default value that was defined in the sender's code. 
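
The rules above can be illustrated with a small Go sketch. It is a simulation under stated assumptions, not Thrift itself: +wireMessage+, +TweetV1+, +TweetV2+, +decodeV1+ and +decodeV2+ are hypothetical names, and a map keyed by tag stands in for the real wire format. It shows an old reader ignoring a tag it does not know, and a new reader applying its own default when an optional field is absent.

```go
package main

import "fmt"

// wireMessage is a hypothetical stand-in for a serialized struct: field
// values keyed by numeric tag. It is NOT the real Thrift wire format.
type wireMessage map[int16]string

// TweetV1 models an old reader that knows only tags 2 and 3.
type TweetV1 struct {
	UserName string // tag 2
	Text     string // tag 3
}

// TweetV2 models a newer reader that also knows the optional tag 16.
type TweetV2 struct {
	UserName string // tag 2
	Text     string // tag 3
	Language string // tag 16, optional, reader-side default "english"
}

// decodeV1 reads the tags it knows and ignores everything else.
func decodeV1(m wireMessage) TweetV1 {
	return TweetV1{UserName: m[2], Text: m[3]}
}

// decodeV2 applies its own default when tag 16 is absent from the message.
func decodeV2(m wireMessage) TweetV2 {
	t := TweetV2{UserName: m[2], Text: m[3], Language: "english"}
	if lang, ok := m[16]; ok {
		t.Language = lang
	}
	return t
}

func main() {
	fromOldWriter := wireMessage{2: "alice", 3: "hello"}
	fromNewWriter := wireMessage{2: "bob", 3: "hola", 16: "spanish"}

	fmt.Println(decodeV2(fromOldWriter).Language) // english
	fmt.Println(decodeV2(fromNewWriter).Language) // spanish
	fmt.Printf("%+v\n", decodeV1(fromNewWriter))  // {UserName:bob Text:hola}
}
```

Because the default lives in the reader, changing it in +decodeV2+ changes what new code sees for old messages without any change on the sending side.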
30 | -------------------------------------------------------------------------------- /code-concepts.asciidoc: -------------------------------------------------------------------------------- 1 | Here is a pictorial view of the Thrift network stack: 2 | 3 | ["ditaa"] 4 | .The Thrift Network Stack 5 | ----------------------------------------------------------------------------- 6 | +-------------------------------------------+ 7 | | cGRE | 8 | | Server | 9 | | (single-threaded, event-driven etc) | 10 | +-------------------------------------------+ 11 | | cBLU | 12 | | Processor | 13 | | (compiler generated) | 14 | +-------------------------------------------+ 15 | | cGRE | 16 | | Protocol | 17 | | (JSON, compact etc) | 18 | +-------------------------------------------+ 19 | | cGRE | 20 | | Transport | 21 | | (raw TCP, HTTP etc) | 22 | +-------------------------------------------+ 23 | ----------------------------------------------------------------------------- 24 | 25 | Transport 26 | ^^^^^^^^^ 27 | 28 | The Transport layer provides a simple abstraction for reading/writing from/to 29 | the network. This enables Thrift to decouple the underlying transport from the 30 | rest of the system (serialization/deserialization, for instance). 31 | 32 | Here are some of the methods exposed by the +Transport+ interface: 33 | 34 | * +open+ 35 | * +close+ 36 | * +read+ 37 | * +write+ 38 | * +flush+ 39 | 40 | In addition to the +Transport+ interface above, Thrift also uses a 41 | +ServerTransport+ interface used to accept or create primitive transport 42 | objects. As the name suggests, +ServerTransport+ is used mainly on the server 43 | side to create new Transport objects for incoming connections. 
44 | 45 | * +open+ 46 | * +listen+ 47 | * +accept+ 48 | * +close+ 49 | 50 | Here are some of the transports available for the majority of the Thrift-supported 51 | languages: 52 | 53 | * file: read/write to/from a file on disk 54 | * http: as the name suggests 55 | 56 | Protocol 57 | ^^^^^^^^ 58 | 59 | The Protocol abstraction defines a mechanism to map in-memory data structures to 60 | a wire format. In other words, a protocol specifies how datatypes use the 61 | underlying Transport to encode/decode themselves. Thus the protocol 62 | implementation governs the encoding scheme and is responsible for 63 | (de)serialization. Some examples of protocols in this sense include JSON, XML, 64 | plain text, compact binary etc. 65 | 66 | Here is the +Protocol+ interface: 67 | [source,cpp] 68 | ----------------------------------------------------------------------------- 69 | writeMessageBegin(name, type, seq) 70 | writeMessageEnd() 71 | writeStructBegin(name) 72 | writeStructEnd() 73 | writeFieldBegin(name, type, id) 74 | writeFieldEnd() 75 | writeFieldStop() 76 | writeMapBegin(ktype, vtype, size) 77 | writeMapEnd() 78 | writeListBegin(etype, size) 79 | writeListEnd() 80 | writeSetBegin(etype, size) 81 | writeSetEnd() 82 | writeBool(bool) 83 | writeByte(byte) 84 | writeI16(i16) 85 | writeI32(i32) 86 | writeI64(i64) 87 | writeDouble(double) 88 | writeString(string) 89 | 90 | name, type, seq = readMessageBegin() 91 | readMessageEnd() 92 | name = readStructBegin() 93 | readStructEnd() 94 | name, type, id = readFieldBegin() 95 | readFieldEnd() 96 | k, v, size = readMapBegin() 97 | readMapEnd() 98 | etype, size = readListBegin() 99 | readListEnd() 100 | etype, size = readSetBegin() 101 | readSetEnd() 102 | bool = readBool() 103 | byte = readByte() 104 | i16 = readI16() 105 | i32 = readI32() 106 | i64 = readI64() 107 | double = readDouble() 108 | string = readString() 109 | ----------------------------------------------------------------------------- 110 | 111 | Thrift Protocols are 
stream-oriented by design. There is no need for any 112 | explicit framing. For instance, it is not necessary to know the length of a 113 | string or the number of items in a list before we start serializing them. 114 | 115 | Here are some of the protocols available for the majority of the Thrift-supported 116 | languages: 117 | 118 | * binary: Fairly simple binary encoding -- the length and type of a field are 119 | encoded as bytes followed by the actual value of the field. 120 | * compact: Described in 121 | https://issues.apache.org/jira/browse/THRIFT-110[THRIFT-110] 122 | * json: 123 | 124 | Processor 125 | ^^^^^^^^^ 126 | 127 | A Processor encapsulates the ability to read data from input streams and write 128 | to output streams. The input and output streams are represented by Protocol 129 | objects. The Processor interface is extremely simple: 130 | 131 | [source,java] 132 | ----------------------------------------------------------------------------- 133 | interface TProcessor { 134 | boolean process(TProtocol in, TProtocol out) throws TException; 135 | } 136 | ----------------------------------------------------------------------------- 137 | 138 | Service-specific processor implementations are generated by the compiler. The 139 | Processor essentially reads data from the wire (using the input protocol), 140 | delegates processing to the handler (implemented by the user) and writes the 141 | response over the wire (using the output protocol). 142 | 143 | Server 144 | ^^^^^^ 145 | 146 | A Server pulls together all of the various features described above: 147 | 148 | * Create a transport 149 | * Create input/output protocols for the transport 150 | * Create a processor based on the input/output protocols 151 | * Wait for incoming connections and hand them off to the processor 152 | 153 | Next we discuss the generated code for specific languages. 
Unless mentioned 154 | otherwise, the sections below will assume the following Thrift specification: 155 | 156 | [source,cpp] 157 | .Example IDL 158 | ----------------------------------------------------------------------------- 159 | include::example.thrift[] 160 | ----------------------------------------------------------------------------- 161 | 162 | .How are nested structs initialized? 163 | ***************************************************************************** 164 | In an earlier section, we saw how Thrift allows structs to contain other structs 165 | (no nested definitions yet though!) In most object-oriented and/or dynamic 166 | languages, structs map to objects and so it is instructive to understand how 167 | Thrift initializes nested structs. One reasonable approach would be to treat the 168 | nested structs as pointers or references and initialize them with NULL, until 169 | explicitly set by the user. 170 | 171 | Unfortunately, for many languages, Thrift uses a 'pass by value' model. As a 172 | concrete example, consider the generated C++ code for the +Tweet+ struct in our 173 | example above: 174 | 175 | [source,cpp] 176 | ----------------------------------------------------------------------------- 177 | ... 178 | int32_t userId; 179 | std::string userName; 180 | std::string text; 181 | Location loc; 182 | TweetType::type tweetType; 183 | std::string language; 184 | ... 185 | ----------------------------------------------------------------------------- 186 | 187 | As you can see, the nested +Location+ structure is *fully allocated inline*. 188 | Because +Location+ is optional, the code uses the internal '__isset' flags to 189 | determine if the field has actually been "set" by the user. 
190 | 191 | This can lead to some surprising and unintuitive behavior: 192 | 193 | * Since the full size of every sub-structure may be allocated at initialization 194 | in some languages, memory usage may be higher than you expect, especially for 195 | complicated structures with many unset fields. 196 | * The parameters and return types for service methods may not be "optional" and 197 | you can't assign or return +null+ in any dynamic language. Thus to return a 198 | "no value" result from a method, you must declare an envelope structure with 199 | an optional field containing the value and then return the envelope with that 200 | field unset. 201 | * The transport layer can, however, marshal method calls from older versions of 202 | a service definition with missing parameters. Thus, if the original service 203 | contained a method +postTweet(1: Tweet tweet)+ and a later version changes it 204 | to +postTweet(1: Tweet tweet, 2: string group)+, then an older client invoking 205 | the previous method will result in a newer server receiving the call with the 206 | new parameter unset. If the new server is in Java, for instance, you may 207 | in fact receive a +null+ value for the new parameter. And yet you may not 208 | declare a parameter to be nullable within the IDL. 
209 | ***************************************************************************** 210 | -------------------------------------------------------------------------------- /code-cpp.asciidoc: -------------------------------------------------------------------------------- 1 | Generated Files 2 | ^^^^^^^^^^^^^^^ 3 | 4 | * all constants go into a single +.cpp/.h+ pair 5 | * all type definitions (enums and structs) go into another +.cpp/.h+ pair 6 | * each service gets its own +.cpp/.h+ pair 7 | 8 | ----------------------------------------------------------------------------- 9 | $ tree gen-cpp 10 | |-- example_constants.cpp 11 | |-- example_constants.h 12 | |-- example_types.cpp 13 | |-- example_types.h 14 | |-- Twitter.cpp 15 | |-- Twitter.h 16 | `-- Twitter_server.skeleton.cpp 17 | ----------------------------------------------------------------------------- 18 | 19 | Types 20 | ^^^^^ 21 | 22 | Thrift maps the various base and container types to C++ types as follows: 23 | 24 | * +bool+: +bool+ 25 | * +binary+: +std::string+ 26 | * +byte+: +int8_t+ 27 | * +i16+: +int16_t+ 28 | * +i32+: +int32_t+ 29 | * +i64+: +int64_t+ 30 | * +double+: +double+ 31 | * +string+: +std::string+ 32 | * +list<t1>+: +std::vector<t1>+ 33 | * +set<t1>+: +std::set<t1>+ 34 | * +map<t1,t2>+: +std::map<t1,t2>+ 35 | -------------------------------------------------------------------------------- /code-golang.asciidoc: -------------------------------------------------------------------------------- 1 | Generated Files 2 | ^^^^^^^^^^^^^^^ 3 | 4 | * The +constants.go+ file, which contains all constants. 5 | * The +ttypes.go+ file, which contains type definitions. 
6 | 7 | ----------------------------------------------------------------------------- 8 | $ tree gen-go 9 | `-- thrift 10 | `-- example 11 | |-- constants.go 12 | `-- ttypes.go 13 | ----------------------------------------------------------------------------- 14 | 15 | [TIP] 16 | .Naming with Go 17 | ============================================================================= 18 | The Go language uses capitalization to determine export rules. The Go Thrift 19 | library adapts Thrift names to adhere to this convention. So a struct 20 | field called +userName+ will be accessible in Go as +UserName+. As a rule, the 21 | first letter of any constant or struct field will be capitalized. 22 | ============================================================================= 23 | 24 | Types 25 | ^^^^^ 26 | 27 | Thrift maps the various base and container types to Go types as follows: 28 | 29 | * +bool+: +bool+ 30 | * +binary+: +[]byte+ 31 | * +byte+: +byte+ 32 | * +i16+: +int16+ 33 | * +i32+: +int32+ 34 | * +i64+: +int64+ 35 | * +double+: +float64+ 36 | * +string+: +string+ 37 | * +list<t1>+: +[]t1+ 38 | * +set<t1>+: +map[t1]bool+ where +bool+ is always +true+ 39 | * +map<t1,t2>+: +map[t1]t2+ 40 | 41 | With the exception of +set+, which has no direct Go equivalent, all of the 42 | types are straightforward. 43 | 44 | The Thrift developers decided that the best way to represent Thrift +set+ types 45 | in Go was to implement them as +map[t1]bool+. 46 | 47 | Typedefs 48 | ^^^^^^^^ 49 | 50 | Thrift typedefs are implemented in Go as Go types. Thus +typedef i32 MyInt+ 51 | becomes the Go +type MyInt int32+. 52 | 53 | Enums 54 | ^^^^^ 55 | 56 | The Go Thrift compiler translates enums to constants. The strategy it employs 57 | is to create a set of constants where the +enum+ name is the prefix and the 58 | +enum+ item's name is the suffix. The two are separated by an underscore (+_+). 59 | 60 | The +TweetType+ of +TWEET+ becomes, in Go, +TweetType_TWEET+. 
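
The naming scheme described above can be sketched in Go as follows. This is a hand-written approximation, not actual compiler output: the underlying +int64+ type, the +String+ method and the +<UNSET>+ fallback are assumptions made for illustration, and the value of +TweetType_REPLY+ assumes the compiler continues from the previous explicit value.

```go
package main

import "fmt"

// TweetType approximates the Go type generated for the TweetType enum.
// The underlying int64 is an assumption made for this sketch.
type TweetType int64

// One constant per enum item, named <EnumName>_<ItemName>.
const (
	TweetType_TWEET   TweetType = 0   // compiler-assigned default
	TweetType_RETWEET TweetType = 2   // explicit value from the IDL
	TweetType_DM      TweetType = 0xa // hex value from the IDL
	TweetType_REPLY   TweetType = 0xb // assumed: previous value plus one
)

// String returns the IDL name of the item (hand-written for this sketch).
func (t TweetType) String() string {
	switch t {
	case TweetType_TWEET:
		return "TWEET"
	case TweetType_RETWEET:
		return "RETWEET"
	case TweetType_DM:
		return "DM"
	case TweetType_REPLY:
		return "REPLY"
	}
	return "<UNSET>"
}

func main() {
	t := TweetType_DM
	fmt.Println(t, int64(t)) // DM 10
}
```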
61 | 62 | Constants 63 | ^^^^^^^^^ 64 | 65 | Thrift constants are declared +const+ for Go types that can be declared as 66 | constants (e.g. +int+, +string+, etc.). For types that cannot be declared 67 | +const+ (like maps and slices), the Go library declares these as global +var+. 68 | 69 | Structs 70 | ^^^^^^^ 71 | 72 | The Thrift +struct+ is translated to Go +struct+ differently than in other 73 | language libraries. 74 | 75 | Unlike most Thrift implementations, the type declarations are substantially 76 | impacted by whether or not a thrift attribute is declared +required+ or 77 | +optional+. To distinguish between an empty value and a +nil+ value, the 78 | Go library uses pointers for all +optional+ values. 79 | 80 | For example, here is an abbreviated listing of the +Tweet+ struct discussed 81 | above: 82 | 83 | [source,thrift] 84 | -------------------------------------------------------------------------------- 85 | struct Tweet { 86 | 1: required i32 userId 87 | // ... 88 | 4: optional Location loc 89 | // ... 90 | } 91 | -------------------------------------------------------------------------------- 92 | 93 | In Go, the field +userId+ will be +UserId int32+, while +loc+ will be 94 | +Loc *Location+. The salient detail is that any field tagged as +optional+ 95 | will be implemented as a Go pointer to a value, while +required+ types are 96 | implemented as values. 97 | 98 | Unions 99 | ^^^^^^ 100 | 101 | Thrift +union+ types are translated into Go structs, where only one field on 102 | the struct may be set to a non-nil value. 
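
The struct and union conventions above can be sketched as follows. This is a hand-written approximation rather than generated code: the +IsSetLoc+ helper, the +SearchKey+ union (which does not appear in the example IDL) and +CountSetFields+ are hypothetical names chosen for illustration.

```go
package main

import "fmt"

// Location mirrors the Location struct from the example IDL.
type Location struct {
	Latitude  float64
	Longitude float64
}

// Tweet is abbreviated to one required and one optional field.
type Tweet struct {
	UserId int32     // required: stored as a plain value
	Loc    *Location // optional: nil means "not set"
}

// IsSetLoc reports whether the optional field carries a value.
func (t *Tweet) IsSetLoc() bool { return t.Loc != nil }

// SearchKey is a hypothetical union: at most one member may be non-nil.
type SearchKey struct {
	UserId   *int32
	UserName *string
}

// CountSetFields returns how many members are non-nil; a well-formed
// union value has exactly one.
func (k *SearchKey) CountSetFields() int {
	n := 0
	if k.UserId != nil {
		n++
	}
	if k.UserName != nil {
		n++
	}
	return n
}

func main() {
	t := Tweet{UserId: 42}
	fmt.Println(t.IsSetLoc()) // false
	t.Loc = &Location{Latitude: 47.6, Longitude: -122.3}
	fmt.Println(t.IsSetLoc()) // true

	name := "alice"
	k := SearchKey{UserName: &name}
	fmt.Println(k.CountSetFields() == 1) // true
}
```

Using a pointer for the optional field is what lets a reader tell "absent" apart from a zero value such as latitude 0, longitude 0.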
103 | 104 | -------------------------------------------------------------------------------- /code-java.asciidoc: -------------------------------------------------------------------------------- 1 | Generated Files 2 | ^^^^^^^^^^^^^^^ 3 | 4 | * a single file (+Constants.java+) containing all constant definitions 5 | * one file per struct, enum and service 6 | 7 | ----------------------------------------------------------------------------- 8 | $ tree gen-java 9 | `-- thrift 10 | `-- example 11 | |-- Constants.java 12 | |-- Location.java 13 | |-- Tweet.java 14 | |-- TweetSearchResult.java 15 | |-- TweetType.java 16 | `-- Twitter.java 17 | ----------------------------------------------------------------------------- 18 | 19 | [TIP] 20 | .Naming Conventions 21 | ============================================================================= 22 | While the Thrift compiler does not enforce any naming conventions, it is 23 | advisable to stick to standard naming conventions; otherwise you may be in for 24 | some surprises. For instance, if you have a struct named +tweetSearchResults+ 25 | (note the mixedCase), the Thrift compiler will generate a Java file named 26 | +TweetSearchResults+ (note the CamelCase) containing a class named 27 | +tweetSearchResults+ (like the original struct). This will obviously not 28 | compile under Java. 29 | ============================================================================= 30 | 31 | Types 32 | ^^^^^ 33 | 34 | Thrift maps the various base and container types to Java types as follows: 35 | 36 | * +bool+: +boolean+ 37 | * +binary+: +byte[]+ 38 | * +byte+: +byte+ 39 | * +i16+: +short+ 40 | * +i32+: +int+ 41 | * +i64+: +long+ 42 | * +double+: +double+ 43 | * +string+: +String+ 44 | * +list<t1>+: +List<t1>+ 45 | * +set<t1>+: +Set<t1>+ 46 | * +map<t1,t2>+: +Map<t1,t2>+ 47 | 48 | As you can see, the mapping is straightforward and one-to-one for the most 49 | part. 
This is not surprising given that Java was the primary target language 50 | when the Thrift project began. 51 | 52 | Typedefs 53 | ^^^^^^^^ 54 | 55 | The Java language does not have any native support for "typedefs". So when the 56 | Thrift Java code generator encounters a typedef declaration, it merely 57 | substitutes it with the original type. That is, even though you may have 58 | typedef'd +TypeA+ to +TypeB+, in the generated Java code, all references to 59 | +TypeB+ will be replaced by +TypeA+. 60 | 61 | Consider the example IDL above. The declaration for +tweets+ in the generated 62 | code for +TweetSearchResult+ is simply +public List<Tweet> tweets+. 63 | 64 | Enums 65 | ^^^^^ 66 | 67 | Thrift enums map to Java +enum+ types. You can obtain the numeric value of an 68 | enum by using the +getValue+ method (via the interface +TEnum+). In addition, 69 | the compiler generates a +findByValue+ method to obtain the enum corresponding 70 | to a numeric value. This is more robust than using the +ordinal+ feature of Java 71 | enums. 72 | 73 | Constants 74 | ^^^^^^^^^ 75 | 76 | Thrift puts all defined constants in a public class named +Constants+ as +public 77 | static final+ members. Constants of any of the primitive types are supported. 78 | 79 | [TIP] 80 | .Contain your Constants 81 | ============================================================================= 82 | If you have multiple Thrift files (in the same namespace) containing const 83 | definitions, the Thrift compiler will overwrite the +Constants.java+ file with 84 | the definitions found in the file processed last. You _must_ either define all 85 | your constants in a single file, or invoke the compiler on a single file that 86 | includes all the other files. 
87 | ============================================================================= 88 | -------------------------------------------------------------------------------- /example.thrift: -------------------------------------------------------------------------------- 1 | namespace cpp thrift.example 2 | namespace java thrift.example 3 | 4 | enum TweetType { 5 | TWEET, 6 | RETWEET = 2, 7 | DM = 0xa, 8 | REPLY 9 | } 10 | 11 | struct Location { 12 | 1: required double latitude; 13 | 2: required double longitude; 14 | } 15 | 16 | struct Tweet { 17 | 1: required i32 userId; 18 | 2: required string userName; 19 | 3: required string text; 20 | 4: optional Location loc; 21 | 5: optional TweetType tweetType = TweetType.TWEET; 22 | 16: optional string language = "english"; 23 | } 24 | 25 | typedef list<Tweet> TweetList 26 | 27 | struct TweetSearchResult { 28 | 1: TweetList tweets; 29 | } 30 | 31 | exception TwitterUnavailable { 32 | 1: string message; 33 | } 34 | 35 | const i32 MAX_RESULTS = 100; 36 | 37 | service Twitter { 38 | void ping(), 39 | bool postTweet(1:Tweet tweet) throws (1:TwitterUnavailable unavailable), 40 | TweetSearchResult searchTweets(1:string query); 41 | oneway void zip() 42 | } 43 | -------------------------------------------------------------------------------- /language-reference.asciidoc: -------------------------------------------------------------------------------- 1 | Types 2 | ~~~~~ 3 | 4 | The Thrift type system consists of pre-defined base types, user-defined structs, 5 | container types, exceptions and service definitions. 
6 | 7 | Base Types 8 | ^^^^^^^^^^ 9 | 10 | * +bool+: A boolean value (true or false), one byte 11 | * +byte+: A signed byte 12 | * +i16+: A 16-bit signed integer 13 | * +i32+: A 32-bit signed integer 14 | * +i64+: A 64-bit signed integer 15 | * +double+: A 64-bit floating point number 16 | * +binary+: A byte array 17 | * +string+: Encoding-agnostic text or binary string 18 | 19 | Note that Thrift does not support unsigned integers because they have no direct 20 | translation to native (primitive) types in many of Thrift's target languages. 21 | 22 | Containers 23 | ^^^^^^^^^^ 24 | 25 | Thrift containers are strongly typed containers that map to the most commonly 26 | used containers in popular programming languages. They are annotated using the 27 | Java Generics style. There are three container types available: 28 | 29 | * +list<t1>+: An ordered list of elements of type +t1+. May contain duplicates. 30 | * +set<t1>+: An unordered set of unique elements of type +t1+. 31 | * +map<t1,t2>+: A map of strictly unique keys of type +t1+ to values of type 32 | +t2+. 33 | 34 | Types used in containers may be any valid Thrift type (including structs and 35 | exceptions) excluding services. 36 | 37 | Structs and Exceptions 38 | ^^^^^^^^^^^^^^^^^^^^^^ 39 | 40 | A Thrift struct is conceptually similar to a +C+ struct -- a convenient way of 41 | grouping together (and encapsulating) related items. Structs translate to 42 | classes in object-oriented languages. 43 | 44 | Exceptions are syntactically and functionally equivalent to structs except that 45 | they are declared using the +exception+ keyword instead of the +struct+ keyword. 46 | They differ from structs in semantics -- when defining RPC services, developers 47 | may declare that a remote method throws an exception. 48 | 49 | Details on defining structs and exceptions are the subject of a 50 | <<_defining_structs,later section>>. 
51 | 52 | Services 53 | ^^^^^^^^ 54 | 55 | Service definitions are semantically equivalent to defining an +interface+ (or a 56 | pure virtual abstract class) in object-oriented programming. The Thrift compiler 57 | generates fully functional client and server stubs that implement the interface. 58 | 59 | Details on defining services are the subject of a <<_defining_services,later 60 | section>>. 61 | 62 | Typedefs 63 | ~~~~~~~~ 64 | 65 | Thrift supports C/C++ style typedefs. 66 | 67 | [source,c] 68 | ----------------------------------------------------------------------------- 69 | typedef i32 MyInteger // <1> 70 | typedef Tweet ReTweet // <2> 71 | ----------------------------------------------------------------------------- 72 | <1> Note there is no trailing semi-colon 73 | <2> Structs can also be used in typedefs 74 | 75 | Enums 76 | ~~~~~ 77 | 78 | When you're defining a message type, you might want one of its fields to only 79 | have one of a pre-defined list of values. For example, let's say you want to add 80 | a +tweetType+ field for each +Tweet+, where the +tweetType+ can be 81 | +TWEET+, +RETWEET+, +DM+, or +REPLY+. You can do this very simply by 82 | adding an enum to your message definition -- a field with an enum type can only 83 | have one of a specified set of constants as its value (if you try to provide a 84 | different value, the parser will treat it like an unknown field). 
In the 85 | following example we've added an enum called +TweetType+ with all the possible 86 | values, and a field of the same type: 87 | 88 | [source,c] 89 | ----------------------------------------------------------------------------- 90 | enum TweetType { 91 | TWEET, // <1> 92 | RETWEET = 2, // <2> 93 | DM = 0xa, // <3> 94 | REPLY 95 | } // <4> 96 | 97 | struct Tweet { 98 | 1: required i32 userId; 99 | 2: required string userName; 100 | 3: required string text; 101 | 4: optional Location loc; 102 | 5: optional TweetType tweetType = TweetType.TWEET // <5> 103 | 16: optional string language = "english" 104 | } 105 | ----------------------------------------------------------------------------- 106 | <1> Enums are specified C-style. Compiler assigns default values starting at 0. 107 | <2> You can of course, supply specific integral values for constants. 108 | <3> Hex values are also acceptable. 109 | <4> Again notice no trailing semi-colon 110 | <5> Use the fully qualified name of the constant when assigning default values. 111 | 112 | Note that unlike Protocol Buffers, Thrift does NOT yet support nested enums (or 113 | structs, for that matter). 114 | 115 | Enumerator constants MUST be in the range of _positive_ 32-bit integers. 116 | 117 | Comments 118 | ~~~~~~~~ 119 | 120 | Thrift supports shell-style, C-style multi-line as well as single-line Java/C++ 121 | style comments. 122 | 123 | [source,c] 124 | ----------------------------------------------------------------------------- 125 | # This is a valid comment. 126 | 127 | /* 128 | * This is a multi-line comment. 129 | * Just like in C. 130 | */ 131 | 132 | // C++/Java style single-line comments work just as well. 133 | ----------------------------------------------------------------------------- 134 | 135 | Namespaces 136 | ~~~~~~~~~~ 137 | 138 | Namespaces in Thrift are akin to namespaces in C++ or packages in Java -- they 139 | offer a convenient way of organizing (or isolating) your code. 
Namespaces may 140 | also be used to prevent name clashes between type definitions. 141 | 142 | Because each language has its own package-like mechanisms (e.g. Python has 143 | modules), Thrift allows you to customize the namespace behavior on a 144 | per-language basis: 145 | 146 | [source,cpp] 147 | ----------------------------------------------------------------------------- 148 | namespace cpp com.example.project // <1> 149 | namespace java com.example.project // <2> 150 | ----------------------------------------------------------------------------- 151 | <1> Translates to +namespace com { namespace example { namespace project {+ 152 | <2> Translates to +package com.example.project+ 153 | 154 | Includes 155 | ~~~~~~~~ 156 | 157 | It is often useful to split up Thrift definitions in separate files to ease 158 | maintenance, enable reuse and improve modularity/organization. Thrift allows 159 | files to _include_ other Thrift files. Included files are looked up in the 160 | current directory and by searching relative to any paths specified with the +-I+ 161 | compiler flag. 162 | 163 | Included objects are accessed using the name of the Thrift file as a prefix. 164 | 165 | [source,cpp] 166 | ----------------------------------------------------------------------------- 167 | include "tweet.thrift" // <1> 168 | ... 169 | struct TweetSearchResult { 170 | 1: list<tweet.Tweet> tweets; // <2> 171 | } 172 | ----------------------------------------------------------------------------- 173 | <1> File names must be quoted; again notice the absent semi-colon. 174 | <2> Note the +tweet+ prefix. 175 | 176 | Constants 177 | ~~~~~~~~~ 178 | 179 | Thrift lets you define constants for use across languages. Complex types and 180 | structs are specified using JSON notation. 
181 | 182 | [source,cpp] 183 | ----------------------------------------------------------------------------- 184 | const i32 INT_CONST = 1234; // <1> 185 | const map<string,string> MAP_CONST = {"hello": "world", "goodnight": "moon"} 186 | ----------------------------------------------------------------------------- 187 | <1> Semi-colon is (confusingly) optional; hex values are valid here. 188 | 189 | Defining Structs 190 | ~~~~~~~~~~~~~~~~ 191 | 192 | Structs (also known as 'messages' in some systems) are the basic building blocks 193 | in a Thrift IDL. A struct is composed of _fields_; each field has a unique 194 | integer identifier, a type, a name and an optional default value. 195 | 196 | Consider a simple example. Suppose you want to build a 197 | http://twitter.com[Twitter]-like service. Here is how you might define a +Tweet+: 198 | 199 | [source,c] 200 | ----------------------------------------------------------------------------- 201 | struct Location { // <5> 202 | 1: required double latitude; 203 | 2: required double longitude; 204 | } 205 | 206 | struct Tweet { 207 | 1: required i32 userId; // <1> 208 | 2: required string userName; // <2> 209 | 3: required string text; 210 | 4: optional Location loc; // <3> 211 | 16: optional string language = "english" // <4> 212 | } 213 | ----------------------------------------------------------------------------- 214 | <1> Every field *must* have a unique, positive integer identifier 215 | <2> Fields may be marked as +required+ or +optional+ 216 | <3> Structs may contain other structs 217 | <4> You may specify an optional "default" value for a field 218 | <5> Multiple structs can be defined and referred to within the same Thrift file 219 | 220 | As you can see, each field in the message definition has a unique numbered tag. 221 | These tags are used to identify your fields in the wire format, and should not 222 | be changed once your message type is in use. 
223 | 224 | Fields may be marked +required+ or +optional+ with obvious meanings for 225 | well-formed structs. Thrift will complain if required fields have not been set 226 | in a struct, for instance. If an optional field has not been set in the struct, 227 | it will not be serialized over the wire. If a default value has been specified 228 | for an optional field, the field is assigned the default value when the struct 229 | is parsed and no value has been explicitly assigned for that field. 230 | 231 | Unlike services, structs do not support inheritance, that is, a struct may not 232 | extend other structs. 233 | 234 | [WARNING] 235 | .Required Is Forever 236 | You should be very careful about marking fields as required. If at some point 237 | you wish to stop writing or sending a required field, it will be problematic to 238 | change the field to an optional field -- old readers will consider messages 239 | without this field to be incomplete and may reject or drop them unintentionally. 240 | You should consider writing application-specific custom validation routines for 241 | your structs instead. Some have come to the conclusion that using required does 242 | more harm than good; they prefer to use only optional. However, this view is not 243 | universal. 244 | 245 | Defining Services 246 | ~~~~~~~~~~~~~~~~~ 247 | 248 | While there are several popular serialization/deserialization frameworks (like 249 | Protocol Buffers), there are few frameworks that provide out-of-the-box support 250 | for RPC-based services across multiple languages. This is one of the major 251 | attractions of Thrift. 252 | 253 | Think of service definitions as Java interfaces -- you need to supply a name and 254 | signatures for the methods. Optionally, a service may extend other services. 255 | 256 | The Thrift compiler will generate service interface code (for the server) and 257 | stubs (for the client) in your chosen language. 
Thrift ships with RPC libraries 258 | for most languages that you can then use to run your client and server. 259 | 260 | [source,java] 261 | ----------------------------------------------------------------------------- 262 | service Twitter { 263 | // A method definition looks like C code. It has a return type, arguments, 264 | // and optionally a list of exceptions that it may throw. Note that argument 265 | // lists and exception lists are specified using the exact same syntax as 266 | // field lists in structs. 267 | void ping(), // <1> 268 | bool postTweet(1:Tweet tweet) throws (1:TwitterUnavailable unavailable), // <2> 269 | TweetSearchResult searchTweets(1:string query); // <3> 270 | 271 | // The 'oneway' modifier indicates that the client only makes a request and 272 | // does not wait for any response at all. Oneway methods MUST be void. 273 | oneway void zip() // <4> 274 | } 275 | ----------------------------------------------------------------------------- 276 | <1> Confusingly, method definitions can be terminated using comma or semi-colon 277 | <2> Arguments can be primitive types or structs 278 | <3> Likewise for return types 279 | <4> +void+ is a valid return type for functions 280 | 281 | Note that the argument lists (and exception lists) for functions are specified 282 | exactly like structs. 283 | 284 | Services support inheritance: a service may optionally inherit from another 285 | service using the +extends+ keyword. 286 | 287 | // TODO: an example here. 288 | 289 | [IMPORTANT] 290 | .Nested Types 291 | As of this writing, Thrift does NOT support nested type _definitions_. That is, 292 | you may not define a struct (or an enum) within a struct; you may of course 293 | _use_ structs/enums within other structs.
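
Returning to service inheritance, here is a minimal sketch of the +extends+
keyword in use. The +TwitterAdmin+ service and its +banUser+ and +userCount+
methods are hypothetical names invented for illustration; only the +Twitter+
service comes from the example above.

[source,java]
-----------------------------------------------------------------------------
// Hypothetical service, for illustration only. It inherits ping(),
// postTweet(), searchTweets() and zip() from Twitter and adds two methods.
service TwitterAdmin extends Twitter {
  void banUser(1:i32 userId),
  i32 userCount()
}
-----------------------------------------------------------------------------

A server that implements +TwitterAdmin+ has to provide handlers for the
inherited methods as well as the new ones, and a +TwitterAdmin+ client can call
all of them.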
294 | -------------------------------------------------------------------------------- /publish.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash - 2 | 3 | set -o nounset # Treat unset variables as an error 4 | 5 | rm -rf pages 6 | git clone -b gh-pages \ 7 | git@github.com:diwakergupta/thrift-missing-guide.git pages 8 | mv index.html thrift.pdf pages 9 | 10 | pushd pages 11 | git config user.name "Diwaker Gupta" 12 | git config user.email diwakergupta@gmail.com 13 | git add index.html thrift.pdf 14 | git commit -m Update 15 | git push origin HEAD:gh-pages 16 | popd 17 | 18 | rm -rf pages 19 | -------------------------------------------------------------------------------- /thrift-docinfo.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 17 | -------------------------------------------------------------------------------- /thrift.asciidoc: -------------------------------------------------------------------------------- 1 | Thrift: The Missing Guide 2 | ========================= 3 | Diwaker Gupta 4 | {localdate}: 5 | Written against Thrift 0.6.0 6 | 7 | From the http://thrift.apache.org[Thrift website]: 8 | [quote] 9 | Thrift is a software framework for scalable cross-language services development. 10 | It combines a software stack with a code generation engine to build services 11 | that work efficiently and seamlessly between C++, Java, Python, PHP, Ruby, 12 | Erlang, Perl, Haskell, C#, Cocoa, JavaScript, Node.js, Smalltalk, and OCaml. 13 | 14 | Thrift is clearly abundant in features. What is sorely lacking though is _good_ 15 | documentation. This guide is an attempt to fill that hole. But note that this is 16 | a reference guide -- for a step-by-step example on how to use Thrift, refer to 17 | the Thrift tutorial. 
18 | 19 | Many aspects of the structure and organization of this guide have been borrowed 20 | from the (excellent) 21 | http://code.google.com/apis/protocolbuffers/docs/proto.html[Google Protocol 22 | Buffer Language Guide]. I thank the authors of that document. 23 | 24 | A link:thrift.pdf[PDF version] is also available. 25 | 26 | .Copyright 27 | 28 | Copyright (C) 2013 Diwaker Gupta 29 | 30 | This work is licensed under the 31 | http://creativecommons.org/licenses/by-nc/3.0/[Creative Commons 32 | Attribution-NonCommercial 3.0 33 | Unported License]. 34 | 35 | .Contributions 36 | 37 | I welcome feedback and contributions to this guide. You can find the 38 | https://github.com/diwakergupta/thrift-missing-guide[source code] 39 | over at http://github.com[GitHub]. Alternatively, you can file a 40 | https://github.com/diwakergupta/thrift-missing-guide/issues[bug]. 41 | 42 | .Acknowledgements 43 | 44 | I thank the authors of Thrift for the software, the authors of the Google 45 | Protocol Buffer documentation for the inspiration and the Thrift community for 46 | the feedback. Special thanks to Dave Engberg from Evernote for his input. 47 | 48 | .About the Author 49 | 50 | I'm an open source geek and a software architect. I blog over at 51 | http://floatingsun.net[Floating Sun] and you can find more about me 52 | http://diwakergupta.info[here]. 53 | 54 | Language Reference 55 | ------------------ 56 | include::language-reference.asciidoc[] 57 | 58 | Generated Code 59 | -------------- 60 | 61 | This section contains documentation for working with Thrift generated code in 62 | various target languages. We begin by introducing the common concepts that are 63 | used across the board -- these govern how the generated code is structured and 64 | will hopefully help you understand how to use it effectively. 
65 | 66 | Concepts 67 | ~~~~~~~~ 68 | include::code-concepts.asciidoc[] 69 | 70 | Java 71 | ~~~~ 72 | include::code-java.asciidoc[] 73 | 74 | C++ 75 | ~~~ 76 | include::code-cpp.asciidoc[] 77 | 78 | 79 | Other Languages 80 | ~~~~~~~~~~~~~~~ 81 | 82 | Python, Ruby, Javascript etc. 83 | 84 | Best Practices 85 | -------------- 86 | include::best-practices.asciidoc[] 87 | 88 | Resources 89 | --------- 90 | 91 | * http://thrift.apache.org/static/files/thrift-20070401.pdf[Thrift whitepaper] 92 | * http://wiki.apache.org/thrift/Tutorial[Thrift Tutorial] 93 | * http://wiki.apache.org/thrift[Thrift Wiki] 94 | * http://code.google.com/apis/protocolbuffers/docs/overview.html[Protocol 95 | Buffers] 96 | 97 | Translations 98 | ------------ 99 | 100 | * http://science.webhostinggeeks.com/thrift-uputstvo-koje-nedostaje[Serbo-Croatian] 101 | by Anja Skarba of http://webhostinggeeks.com/ 102 | --------------------------------------------------------------------------------