├── .gitignore
├── LICENSE
├── README.md
├── SECURITY.md
├── doc
│   ├── azureportaldatabasecontainer.jpg
│   ├── confluentaddconnector.JPG
│   └── gremlinconnectorconfig.jpg
├── pom.xml
└── src
    ├── main
    │   └── java
    │       └── com
    │           └── microsoft
    │               └── cosmos
    │                   └── gremlin
    │                       ├── GremlinQueryBuilder.java
    │                       ├── KafkaGremlinSinkConnector.java
    │                       ├── KafkaGremlinSinkTask.java
    │                       └── StickyLoadBalancingStrategy.java
    └── test
        └── java
            └── com
                └── microsoft
                    └── cosmos
                        └── gremlin
                            ├── GremlinQueryBuilderTest.java
                            ├── KafkaGremlinSinkConnectorTest.java
                            ├── KafkaGremlinSinkTaskTest.java
                            ├── StickyLoadBalancingStrategyTest.java
                            └── TestSinkTaskContext.java

/.gitignore:
--------------------------------------------------------------------------------
################################################################################
# This .gitignore file was automatically created by Microsoft(R) Visual Studio.
################################################################################

# Exclude build from source control
/target

# Exclude eclipse files
/.project
/.classpath
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) Microsoft Corporation. All rights reserved.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Apache Kafka connector for Cosmos DB Gremlin API

This is a sink connector that streams data from [Apache Kafka](https://kafka.apache.org/documentation/#connect) into a [Microsoft Azure Cosmos DB Graph](https://docs.microsoft.com/en-us/azure/cosmos-db/graph-introduction) account. It allows modelling events as vertices and edges of a graph and manipulating them with the [Apache TinkerPop Gremlin](https://tinkerpop.apache.org/gremlin.html) language.

The connector supports primitive, binary, JSON, and Avro serializers.

# Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a
Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us
the rights to use your contribution. For details, visit https://cla.microsoft.com.

When you submit a pull request, a CLA-bot will automatically determine whether you need to provide
a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions
provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).
For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or
contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments.

# Setup
These instructions assume that Confluent Platform or Apache Kafka is already up and running.
1. Clone the repository.
2. Open the root folder in a terminal and execute
```maven
mvn package
```
This command produces the connector and its dependencies:
```
/target/dependencies/*.*
/target/kafka-connect-cosmosdb-graph-0.2.jar
```
3. Copy the connector and all of its dependencies into your Kafka cluster plugin folder. For Confluent Platform, create the folder below and copy the connector and dependencies into it:
```
/share/java/kafka-connect-gremlin
```
4. Restart your Connect worker process. It discovers the new connector automatically by inspecting the plugin folder.
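If the connector does not show up after the restart, verify that the folder from step 3 is visible to the worker through its `plugin.path` setting. A minimal sketch, assuming the Confluent folder layout above (the exact path depends on your installation):

```
# worker configuration, e.g. connect-distributed.properties
plugin.path=/share/java
```

Note that `plugin.path` points at the parent directory that contains `kafka-connect-gremlin`, not at the connector folder itself.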
# Configuration
To start using the connector, open Confluent Control Center and navigate to **Management** -> **Kafka Connect** -> **Send data out** -> **Add Connector**.

![Confluent Control Center Add Connector](/doc/confluentaddconnector.JPG)

On the next page select **KafkaGremlinSinkConnector**. If this connector is not available, the Connect worker likely did not pick up the changes; restart the worker and let it finish scanning the plugin directory before trying to add the connector again.

![Cosmos DB Graph Connector Configuration](/doc/gremlinconnectorconfig.jpg)

**host** - fully qualified domain name of the Gremlin account. For public Azure, use the DNS record in the **gremlin.cosmos.azure.com** zone. Do not use **documents.azure.com**; it will not work.

**port** - HTTPS port, 443 by default.

**database** - the database resource inside Cosmos DB, not to be confused with the global database account. This value appears in Data Explorer after "New Graph" is created.

**container** - name of the Cosmos DB collection that contains the graph data.

![Cosmos DB Graph Connector Configuration](/doc/azureportaldatabasecontainer.jpg)

**traversal** - Gremlin traversal to execute for every Kafka message published to the topic and received by the connector. A sample traversal could add a vertex for every event:

```
g.addV()
    .property('id', ${value.uid})
    .property('email', ${value.emailAddress})
    .property('language', ${value.language})
```

## Supported event syntax
Each Kafka event contains `key` and `value` properties, each of which has a schema. Both can be resolved independently in the traversal template configured on the connector, as summarized in the table below and illustrated in the worked example that follows.

| Schema type | Mapping | Result |
| ------------------------------------------------------------ | ------------------------------------ | -------------------------------------------------- |
| INT8, INT16, INT32, INT64, FLOAT32, FLOAT64, BOOLEAN, STRING | `${key}` or `${value}` | Value as is |
| STRUCT | `${key.field}` or `${value.field}` | Resolves to structure field |
| MAP | `${key.key}` or `${value.key}` | Resolves to value of the key in the map |
| ARRAY | `${key[index]}` or `${value[index]}` | Resolves to a positional element in an array |
| BYTES | `${key}` or `${value}` | Resolves to Java string representation of an array |
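For example (field names here are illustrative), suppose the connector receives an event whose value is a structure like:

```
{ "uid": "u-42", "emailAddress": "user@contoso.com", "language": "en" }
```

With the traversal template

```
g.addV()
    .property('id', ${value.uid})
    .property('email', ${value.emailAddress})
```

the connector rewrites the template to `g.addV().property('id', gp1).property('email', gp2)` and submits it with the parameter bindings `gp1 = "u-42"` and `gp2 = "user@contoso.com"`. Because the placeholders are replaced with driver-side parameters rather than concatenated into the query string, values do not need to be quoted or escaped in the template.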
Gremlin is a powerful language, so a great deal of event transformation can be done within Gremlin itself on the server side.
For example, Cosmos DB requires the **id** property to be a string, but the incoming stream may carry the id as an integer:
```
g.addV()
    .property('id', ${value.uid}.toString())
```

# References
It is worth looking through this material to get a better understanding of how this connector works and how to use it:

[Kafka Connect Deep Dive](https://www.confluent.io/blog/kafka-connect-deep-dive-error-handling-dead-letter-queues)

[Kafka, Avro Serialization, and the Schema Registry](https://dzone.com/articles/kafka-avro-serialization-and-the-schema-registry)

[Spring Kafka - JSON Serializer Deserializer Example](https://codenotfound.com/spring-kafka-json-serializer-deserializer-example.html)

[Gremlin Language Reference](http://tinkerpop.apache.org/docs/current/reference/)
--------------------------------------------------------------------------------
/SECURITY.md:
--------------------------------------------------------------------------------
## Security

Microsoft takes the security of our software products and services seriously, which includes all source code repositories managed through our GitHub organizations, which include [Microsoft](https://github.com/microsoft), [Azure](https://github.com/Azure), [DotNet](https://github.com/dotnet), [AspNet](https://github.com/aspnet), [Xamarin](https://github.com/xamarin), and [our GitHub organizations](https://opensource.microsoft.com/).

If you believe you have found a security vulnerability in any Microsoft-owned repository that meets [Microsoft's definition of a security vulnerability](https://aka.ms/opensource/security/definition), please report it to us as described below.

## Reporting Security Issues

**Please do not report security vulnerabilities through public GitHub issues.**

Instead, please report them to the Microsoft Security Response Center (MSRC) at [https://msrc.microsoft.com/create-report](https://aka.ms/opensource/security/create-report).

If you prefer to submit without logging in, send email to [secure@microsoft.com](mailto:secure@microsoft.com). If possible, encrypt your message with our PGP key; please download it from the [Microsoft Security Response Center PGP Key page](https://aka.ms/opensource/security/pgpkey).

You should receive a response within 24 hours. If for some reason you do not, please follow up via email to ensure we received your original message. Additional information can be found at [microsoft.com/msrc](https://aka.ms/opensource/security/msrc).
Please include the requested information listed below (as much as you can provide) to help us better understand the nature and scope of the possible issue:

* Type of issue (e.g. buffer overflow, SQL injection, cross-site scripting, etc.)
* Full paths of source file(s) related to the manifestation of the issue
* The location of the affected source code (tag/branch/commit or direct URL)
* Any special configuration required to reproduce the issue
* Step-by-step instructions to reproduce the issue
* Proof-of-concept or exploit code (if possible)
* Impact of the issue, including how an attacker might exploit the issue

This information will help us triage your report more quickly.

If you are reporting for a bug bounty, more complete reports can contribute to a higher bounty award. Please visit our [Microsoft Bug Bounty Program](https://aka.ms/opensource/security/bounty) page for more details about our active programs.

## Preferred Languages

We prefer all communications to be in English.

## Policy

Microsoft follows the principle of [Coordinated Vulnerability Disclosure](https://aka.ms/opensource/security/cvd).
--------------------------------------------------------------------------------
/doc/azureportaldatabasecontainer.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Azure/kafka-connect-cosmosdb-graph/06797b784a8f715d29159829eec5e8e75f12c5de/doc/azureportaldatabasecontainer.jpg
--------------------------------------------------------------------------------
/doc/confluentaddconnector.JPG:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Azure/kafka-connect-cosmosdb-graph/06797b784a8f715d29159829eec5e8e75f12c5de/doc/confluentaddconnector.JPG
--------------------------------------------------------------------------------
/doc/gremlinconnectorconfig.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Azure/kafka-connect-cosmosdb-graph/06797b784a8f715d29159829eec5e8e75f12c5de/doc/gremlinconnectorconfig.jpg
--------------------------------------------------------------------------------
/pom.xml:
--------------------------------------------------------------------------------
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.microsoft.cosmos.gremlin</groupId>
    <artifactId>kafka-connect-cosmosdb-graph</artifactId>
    <version>0.2</version>
    <packaging>jar</packaging>

    <name>kafka-connect-cosmosdb-graph</name>
    <url>http://maven.apache.org</url>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <kafka.version>2.2.0</kafka.version>
        <tinkerpop.version>3.4.1</tinkerpop.version>
        <maven.compiler.version>3.0</maven.compiler.version>
        <junit.version>5.4.0</junit.version>
    </properties>

    <dependencies>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>3.8.1</version>
            <scope>test</scope>
        </dependency>

        <dependency>
            <groupId>org.apache.kafka</groupId>
            <artifactId>connect-api</artifactId>
            <version>${kafka.version}</version>
            <scope>provided</scope>
        </dependency>

        <dependency>
            <groupId>org.apache.tinkerpop</groupId>
            <artifactId>gremlin-driver</artifactId>
            <version>${tinkerpop.version}</version>
        </dependency>

        <dependency>
            <groupId>org.junit.jupiter</groupId>
            <artifactId>junit-jupiter-api</artifactId>
            <version>${junit.version}</version>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>${maven.compiler.version}</version>
                <configuration>
                    <source>1.8</source>
                    <target>1.8</target>
                </configuration>
            </plugin>

            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-dependency-plugin</artifactId>
                <version>2.6</version>
                <executions>
                    <execution>
                        <id>copy-dependencies</id>
                        <phase>prepare-package</phase>
                        <goals>
                            <goal>copy-dependencies</goal>
                        </goals>
                        <configuration>
                            <outputDirectory>${project.build.directory}/dependencies</outputDirectory>
                            <overWriteReleases>false</overWriteReleases>
                            <overWriteSnapshots>false</overWriteSnapshots>
                            <overWriteIfNewer>true</overWriteIfNewer>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</project>
--------------------------------------------------------------------------------
/src/main/java/com/microsoft/cosmos/gremlin/GremlinQueryBuilder.java:
--------------------------------------------------------------------------------
package com.microsoft.cosmos.gremlin;

import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.kafka.connect.data.Schema;
import org.apache.kafka.connect.data.Schema.Type;
import org.apache.kafka.connect.data.Struct;
import org.apache.kafka.connect.sink.SinkRecord;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
 * Class that generates Gremlin statements to push data from the task into the
 * service.
 *
 * @author olignat
 */
final class GremlinQueryBuilder {
    private static final Logger log = LoggerFactory.getLogger(GremlinQueryBuilder.class);

    private static final String PARAMETER_MARKER = "gp";

    private static final String KEY_DESIGNATION = "key";
    private static final String VALUE_DESIGNATION = "value";

    /**
     * Class that contains the results of parameterizing a Gremlin traversal
     * before parameter values are materialized.
     *
     * @author olignat
     */
    static final class GremlinParameterizedQuery {
        private String parameterizedTraversal;
        private Map<String, String> traversalParameters;

        GremlinParameterizedQuery(String parameterizedTraversal, Map<String, String> traversalParameters) {
            this.parameterizedTraversal = parameterizedTraversal;
            this.traversalParameters = traversalParameters;
        }

        /**
         * @return modified traversal that contains Kafka parameters replaced with Gremlin parameter markers
         */
        public String getParameterizedTraversal() {
            return this.parameterizedTraversal;
        }

        /**
         * @return map of Gremlin parameters to the Kafka parameters that they replaced.
         *         Key is a Gremlin marker (e.g. "gp1").
         *         Value is a Kafka marker without ${} around it (e.g. "value.property").
         */
        public Map<String, String> getTraversalParameters() {
            return this.traversalParameters;
        }
    }

    /**
     * Process the traversal and replace all Kafka event markers with Gremlin parameter markers.
     * Extract the map of Kafka to Gremlin parameters to be substituted at a later time during execution.
     */
    static GremlinParameterizedQuery parameterize(String traversal) {
        String parameterizedTraversal = traversal;
        Map<String, String> traversalParameters = new HashMap<String, String>();

        if (traversal != null && !traversal.isEmpty()) {
            int lastMatchedMarker = -1;
            int lastMatchedMarkerEnd = -1;
            int parameterCounter = 1;

            do {
                // Match parameter
                lastMatchedMarker = parameterizedTraversal.indexOf("${", lastMatchedMarker + 1);
                if (lastMatchedMarker != -1) {
                    lastMatchedMarkerEnd = parameterizedTraversal.indexOf("}", lastMatchedMarker);
                    if (lastMatchedMarkerEnd != -1) {
                        // Extract parameter
                        String kafkaParameterMarker = parameterizedTraversal.substring(lastMatchedMarker + 2, lastMatchedMarkerEnd).trim();
                        String gremlinParameterMarker = GremlinQueryBuilder.PARAMETER_MARKER + String.valueOf(parameterCounter++);

                        // Replace with a Gremlin parameter marker
                        parameterizedTraversal =
                            parameterizedTraversal.substring(0, lastMatchedMarker)
                            + gremlinParameterMarker
                            + parameterizedTraversal.substring(lastMatchedMarkerEnd + 1);

                        // Capture parameter mapping
                        traversalParameters.put(gremlinParameterMarker, kafkaParameterMarker);
                    }
                }
            } while (lastMatchedMarker != -1);
        }

        log.debug("Parameterized traversal '{}' as '{}' with {} parameters", traversal, parameterizedTraversal, traversalParameters.keySet().size());

        return new GremlinQueryBuilder.GremlinParameterizedQuery(parameterizedTraversal, traversalParameters);
    }
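    // Illustrative behavior (see GremlinQueryBuilderTest for the covered cases):
    // parameterize("g.addV().property('id', ${value.uid})") yields the traversal
    // "g.addV().property('id', gp1)" with the parameter map {gp1 -> "value.uid"};
    // materialize(...) below later resolves "value.uid" against each SinkRecord.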
    /**
     * Using the parameterized query, populate a map of parameter values from the sink record
     */
    static Map<String, Object> materialize(GremlinParameterizedQuery parameterizedQuery, SinkRecord record) {
        Map<String, Object> materializedParameters = new HashMap<String, Object>();

        // Check if we have work to do
        if (!parameterizedQuery.getTraversalParameters().isEmpty()) {
            // Iterate through each Gremlin parameter
            for (Map.Entry<String, String> traversalParameter : parameterizedQuery.getTraversalParameters().entrySet()) {
                // Get the references
                String gremlinParameterMarker = traversalParameter.getKey();
                String kafkaParameterMarker = traversalParameter.getValue();

                Schema parameterSchema = null;
                Object parameterValue = null;

                // Resolve schema and value
                if (kafkaParameterMarker.startsWith(GremlinQueryBuilder.KEY_DESIGNATION)) {
                    parameterSchema = record.keySchema();
                    parameterValue = record.key();
                }
                else if (kafkaParameterMarker.startsWith(GremlinQueryBuilder.VALUE_DESIGNATION)) {
                    parameterSchema = record.valueSchema();
                    parameterValue = record.value();
                }

                // Process different types of schema
                if (parameterSchema != null) {
                    // Check if parameter marker includes any child references
                    int indexOfChildPropertySeparator = kafkaParameterMarker.indexOf('.');

                    if (parameterSchema.type() == Type.MAP) {
                        // Check if we have a child property reference
                        // If we do not - the entire map is fair game
                        if (indexOfChildPropertySeparator != -1) {
                            // Handle the map
                            Map<?, ?> parameterValueMap = (Map<?, ?>) parameterValue;
                            parameterValue = parameterValueMap.get(kafkaParameterMarker.substring(indexOfChildPropertySeparator + 1));
                        }
                    } else if (parameterSchema.type() == Type.STRUCT) {
                        // Check if we have a child property reference
                        // If we do not - the entire structure is fair game
                        if (indexOfChildPropertySeparator != -1) {
                            // Handle the structure
                            Struct parameterValueStruct = (Struct) parameterValue;
                            parameterValue = parameterValueStruct.get(kafkaParameterMarker.substring(indexOfChildPropertySeparator + 1));
                        }
                    } else if (parameterSchema.type() == Type.ARRAY) {
                        // Check if we have a positional element reference
                        int indexOfArrayIndexerStart = kafkaParameterMarker.indexOf('[');
                        if (indexOfArrayIndexerStart != -1) {
                            // Find the first closing positional element
                            int indexOfArrayIndexerEnd = kafkaParameterMarker.indexOf(']', indexOfArrayIndexerStart + 1);
                            if (indexOfArrayIndexerEnd > indexOfArrayIndexerStart) {
                                // Handle the array
                                List<?> parameterValueArray = (List<?>) parameterValue;
                                parameterValue = parameterValueArray.get(Integer.parseInt(kafkaParameterMarker.substring(indexOfArrayIndexerStart + 1, indexOfArrayIndexerEnd)));
                            }
                        }
                    }
                }

                // Store the resolved parameter value
                materializedParameters.put(gremlinParameterMarker, parameterValue);
            }
        }

        return materializedParameters;
    }
}
--------------------------------------------------------------------------------
/src/main/java/com/microsoft/cosmos/gremlin/KafkaGremlinSinkConnector.java:
--------------------------------------------------------------------------------
/**
 *
 */
package com.microsoft.cosmos.gremlin;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.kafka.common.config.ConfigDef;
import org.apache.kafka.common.config.ConfigDef.Importance;
import org.apache.kafka.common.config.ConfigDef.Type;
import org.apache.kafka.common.config.ConfigException;
import org.apache.kafka.common.utils.AppInfoParser;
import org.apache.kafka.connect.connector.Task;
import org.apache.kafka.connect.sink.SinkConnector;

/**
 * Entry point into the Gremlin sink connector
 *
 * @author olignat
 *
 */
public final class KafkaGremlinSinkConnector extends SinkConnector {

    public enum Keys {
        ;
        static final String HOST = "host";
        static final String PORT = "port";
        static final String DATABASE = "database";
        static final String CONTAINER = "container";
        static final String KEY = "key";
        static final String TRAVERSAL = "traversal";
        static final String ENABLE_SKIP_ON_CONFLICT = "enableSkipOnConflict";
        static final String ENABLE_SSL = "enableSsl";
        static final String ENABLE_ERROR_ON_EMPTY_RESULTS = "enableErrorOnEmptyResult";
        static final String MAX_WAIT_FOR_CONNECTION_MILLISECONDS = "maxWaitForConnectionMilliseconds";
        static final String RECORD_WRITE_RETRY_COUNT = "recordWriteRetryCount";
        static final String RECORD_WRITE_RETRY_MILLISECONDS = "recordWriteRetryMilliseconds";
    }

    static final int DEFAULT_PORT = 443;
    static final boolean DEFAULT_ENABLE_SKIP_ON_CONFLICT = false;
    static final boolean DEFAULT_ENABLE_SSL = true;
    static final boolean DEFAULT_ENABLE_ERROR_ON_EMPTY_RESULTS = false;
    static final int DEFAULT_MAX_WAIT_FOR_CONNECTION_MILLISECONDS = 15000;
    static final int DEFAULT_RECORD_WRITE_RETRY_COUNT = 3;
    static final int DEFAULT_RECORD_WRITE_RETRY_MILLISECONDS = 1000;
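    // Illustrative connector configuration (all values below are placeholders,
    // not defaults; connector.class and topics are standard Kafka Connect keys):
    //   connector.class=com.microsoft.cosmos.gremlin.KafkaGremlinSinkConnector
    //   host=myaccount.gremlin.cosmos.azure.com
    //   port=443
    //   database=mydatabase
    //   container=mygraph
    //   key=<account primary key>
    //   traversal=g.addV().property('id', ${value.uid}.toString())
    //   topics=mytopic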
    private static final ConfigDef CONFIG_DEF = new ConfigDef().define(Keys.HOST, Type.STRING, "", Importance.HIGH,
            "Microsoft Azure Cosmos Gremlin account fully qualified name in the format *.gremlin.cosmos.azure.com")
            .define(Keys.PORT, Type.INT, KafkaGremlinSinkConnector.DEFAULT_PORT, Importance.HIGH,
                    "Port number to which to send traffic at the host")
            .define(Keys.DATABASE, Type.STRING, "", Importance.HIGH, "Database inside global database account")
            .define(Keys.CONTAINER, Type.STRING, "", Importance.HIGH, "Container or collection inside database")
            .define(Keys.KEY, Type.STRING, "", Importance.HIGH, "Primary or secondary authentication key")
            .define(Keys.TRAVERSAL, Type.STRING, "", Importance.HIGH,
                    "Gremlin query to execute for every event. Use the ${key.property} or ${value.property} marker to match fields in MAP and STRUCT messages. For primitive types use simple ${key} and ${value} markers instead. For arrays it is possible to match an entire array with ${key} or ${value} or a specific zero-based position in an array with ${key[5]} or ${value[0]}.")
            .define(Keys.ENABLE_SKIP_ON_CONFLICT, Type.BOOLEAN,
                    KafkaGremlinSinkConnector.DEFAULT_ENABLE_SKIP_ON_CONFLICT, Importance.MEDIUM,
                    "When enabled the connector will skip over traversals that result in conflicting writes and just drop the records rather than fail and stall the flow of messages.")
            .define(Keys.ENABLE_SSL, Type.BOOLEAN, KafkaGremlinSinkConnector.DEFAULT_ENABLE_SSL, Importance.MEDIUM,
                    "Flag that controls whether SSL is enabled or disabled. SSL is required for Microsoft Azure Cosmos DB accounts but can be disabled for local testing with the emulator.")
            .define(Keys.ENABLE_ERROR_ON_EMPTY_RESULTS, Type.BOOLEAN,
                    KafkaGremlinSinkConnector.DEFAULT_ENABLE_ERROR_ON_EMPTY_RESULTS, Importance.MEDIUM,
                    "Flag that turns an empty result from a gremlin traversal into a connector error.")
            .define(Keys.MAX_WAIT_FOR_CONNECTION_MILLISECONDS, Type.INT,
                    KafkaGremlinSinkConnector.DEFAULT_MAX_WAIT_FOR_CONNECTION_MILLISECONDS, Importance.MEDIUM,
                    "Amount of time a client will wait for a connection to Microsoft Azure Cosmos DB before giving up.")
            .define(Keys.RECORD_WRITE_RETRY_COUNT, Type.INT, KafkaGremlinSinkConnector.DEFAULT_RECORD_WRITE_RETRY_COUNT,
                    Importance.MEDIUM,
                    "Number of times to attempt to write a record to the Microsoft Azure Cosmos DB account before giving up.")
            .define(Keys.RECORD_WRITE_RETRY_MILLISECONDS, Type.INT,
                    KafkaGremlinSinkConnector.DEFAULT_RECORD_WRITE_RETRY_MILLISECONDS, Importance.MEDIUM,
                    "Default retry interval for a failed attempt to write a record into the Microsoft Azure Cosmos DB account.");

    private String host;
    private String port;
    private String database;
    private String container;
    private String key;
    private String traversal;
    private String enableSkipOnConflict;
    private String enableSsl;
    private String enableErrorOnEmptyResult;
    private String maxWaitForConnectionMilliseconds;
    private String recordWriteRetryCount;
    private String recordWriteRetryMilliseconds;

    public String version() {
        return AppInfoParser.getVersion();
    }

    @Override
    public void start(Map<String, String> props) {
        this.host = props.get(Keys.HOST);
        if (this.host == null || this.host.isEmpty()) {
            throw new ConfigException(Keys.HOST, "",
                    "Global database account address is required to establish connection");
        }

        this.port = props.get(Keys.PORT);
        if (this.port == null || this.port.isEmpty()) {
            throw new ConfigException(Keys.PORT, "", "Port is required to establish connection");
        }

        this.database = props.get(Keys.DATABASE);
        if (this.database == null || this.database.isEmpty()) {
            throw new ConfigException(Keys.DATABASE, "", "Database name is required to establish connection");
        }

        this.container = props.get(Keys.CONTAINER);
        if (this.container == null || this.container.isEmpty()) {
            throw new ConfigException(Keys.CONTAINER, "", "Container name is required to establish connection");
        }

        this.key = props.get(Keys.KEY);
        if (this.key == null || this.key.isEmpty()) {
            throw new ConfigException(Keys.KEY, "", "Authentication key is required to establish connection");
        }

        this.traversal = props.get(Keys.TRAVERSAL);
        this.enableSkipOnConflict = props.get(Keys.ENABLE_SKIP_ON_CONFLICT);
        this.enableSsl = props.get(Keys.ENABLE_SSL);
        this.enableErrorOnEmptyResult = props.get(Keys.ENABLE_ERROR_ON_EMPTY_RESULTS);
        this.maxWaitForConnectionMilliseconds = props.get(Keys.MAX_WAIT_FOR_CONNECTION_MILLISECONDS);
        this.recordWriteRetryCount = props.get(Keys.RECORD_WRITE_RETRY_COUNT);
        this.recordWriteRetryMilliseconds = props.get(Keys.RECORD_WRITE_RETRY_MILLISECONDS);
    }
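    // Every task created below receives an identical copy of the connector
    // configuration; parallelism comes from Kafka Connect assigning different
    // topic partitions to each task, not from differences in task configuration.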
    @Override
    public Class<? extends Task> taskClass() {
        return KafkaGremlinSinkTask.class;
    }

    @Override
    public List<Map<String, String>> taskConfigs(int maxTasks) {
        ArrayList<Map<String, String>> configs = new ArrayList<Map<String, String>>();
        for (int i = 0; i < maxTasks; i++) {
            Map<String, String> config = new HashMap<String, String>();

            config.put(Keys.HOST, this.host);
            config.put(Keys.PORT, this.port);
            config.put(Keys.DATABASE, this.database);
            config.put(Keys.CONTAINER, this.container);
            config.put(Keys.KEY, this.key);

            if (this.traversal != null && !this.traversal.isEmpty()) {
                config.put(Keys.TRAVERSAL, this.traversal);
            }

            if (this.enableSkipOnConflict != null && !this.enableSkipOnConflict.isEmpty()) {
                config.put(Keys.ENABLE_SKIP_ON_CONFLICT, this.enableSkipOnConflict);
            }

            if (this.enableSsl != null && !this.enableSsl.isEmpty()) {
                config.put(Keys.ENABLE_SSL, this.enableSsl);
            }

            if (this.enableErrorOnEmptyResult != null && !this.enableErrorOnEmptyResult.isEmpty()) {
                config.put(Keys.ENABLE_ERROR_ON_EMPTY_RESULTS, this.enableErrorOnEmptyResult);
            }

            if (this.maxWaitForConnectionMilliseconds != null && !this.maxWaitForConnectionMilliseconds.isEmpty()) {
                config.put(Keys.MAX_WAIT_FOR_CONNECTION_MILLISECONDS, this.maxWaitForConnectionMilliseconds);
            }

            if (this.recordWriteRetryCount != null && !this.recordWriteRetryCount.isEmpty()) {
                config.put(Keys.RECORD_WRITE_RETRY_COUNT, this.recordWriteRetryCount);
            }

            if (this.recordWriteRetryMilliseconds != null && !this.recordWriteRetryMilliseconds.isEmpty()) {
                config.put(Keys.RECORD_WRITE_RETRY_MILLISECONDS, this.recordWriteRetryMilliseconds);
            }

            configs.add(config);
        }

        return configs;
    }

    @Override
    public void stop() {
    }

    @Override
    public ConfigDef config() {
        return CONFIG_DEF;
    }
}
--------------------------------------------------------------------------------
/src/main/java/com/microsoft/cosmos/gremlin/KafkaGremlinSinkTask.java:
--------------------------------------------------------------------------------
/**
 *
 */
package com.microsoft.cosmos.gremlin;

import java.time.Duration;
import java.time.LocalTime;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.connect.errors.ConnectException;
import org.apache.kafka.connect.errors.RetriableException;
import org.apache.kafka.connect.sink.SinkRecord;
import org.apache.kafka.connect.sink.SinkTask;
import org.apache.tinkerpop.gremlin.driver.AuthProperties;
import org.apache.tinkerpop.gremlin.driver.Client;
import org.apache.tinkerpop.gremlin.driver.Cluster;
import org.apache.tinkerpop.gremlin.driver.Result;
import org.apache.tinkerpop.gremlin.driver.ResultSet;
import org.apache.tinkerpop.gremlin.driver.exception.ResponseException;
import org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
/**
 * Sink task used to replicate data from Kafka into a Cosmos DB Gremlin account
 *
 * @author olignat
 *
 */
public final class KafkaGremlinSinkTask extends SinkTask {

    private static final Logger log = LoggerFactory.getLogger(KafkaGremlinSinkTask.class);

    private static final String X_MS_STATUS_CODE_HEADER = "x-ms-status-code";
    private static final String X_MS_RETRY_AFTER_MS_HEADER = "x-ms-retry-after-ms";

    private static final int X_MS_STATUS_CODE_VALUE_UNKNOWN = -1;
    private static final int X_MS_STATUS_CODE_VALUE_CONFLICT = 409;
    private static final int X_MS_MAX_RETRY_AFTER_MS = 10000;

    private String host;
    private int port;
    private String database;
    private String container;
    private String key;
    private GremlinQueryBuilder.GremlinParameterizedQuery parameterizedTraversal;
    private Boolean enableSkipOnConflict;
    private Boolean enableSsl;
    private Boolean enableErrorOnEmptyResult;
    private int maxWaitForConnectionMilliseconds;
    private int recordWriteRetryCount;
    private int recordWriteRetryMilliseconds;

    private Cluster cluster;
    private Client client;

    private int remainingRetries;

    public KafkaGremlinSinkTask() {
    }

    public String version() {
        return new KafkaGremlinSinkConnector().version();
    }

    @Override
    public void start(Map<String, String> props) {
        this.host = props.get(KafkaGremlinSinkConnector.Keys.HOST);
        this.port = Integer.parseInt(props.get(KafkaGremlinSinkConnector.Keys.PORT));
        this.database = props.get(KafkaGremlinSinkConnector.Keys.DATABASE);
        this.container = props.get(KafkaGremlinSinkConnector.Keys.CONTAINER);
        this.key = props.get(KafkaGremlinSinkConnector.Keys.KEY);

        // Process traversal and prepare for execution
        this.parameterizedTraversal = GremlinQueryBuilder
                .parameterize(props.get(KafkaGremlinSinkConnector.Keys.TRAVERSAL));

        if (props.containsKey(KafkaGremlinSinkConnector.Keys.ENABLE_SKIP_ON_CONFLICT)) {
            this.enableSkipOnConflict = Boolean
                    .parseBoolean(props.get(KafkaGremlinSinkConnector.Keys.ENABLE_SKIP_ON_CONFLICT));
        } else {
            this.enableSkipOnConflict = KafkaGremlinSinkConnector.DEFAULT_ENABLE_SKIP_ON_CONFLICT;
        }

        if (props.containsKey(KafkaGremlinSinkConnector.Keys.ENABLE_SSL)) {
            this.enableSsl = Boolean.parseBoolean(props.get(KafkaGremlinSinkConnector.Keys.ENABLE_SSL));
        } else {
            this.enableSsl = KafkaGremlinSinkConnector.DEFAULT_ENABLE_SSL;
        }

        if (props.containsKey(KafkaGremlinSinkConnector.Keys.ENABLE_ERROR_ON_EMPTY_RESULTS)) {
            this.enableErrorOnEmptyResult = Boolean
                    .parseBoolean(props.get(KafkaGremlinSinkConnector.Keys.ENABLE_ERROR_ON_EMPTY_RESULTS));
        } else {
            this.enableErrorOnEmptyResult = KafkaGremlinSinkConnector.DEFAULT_ENABLE_ERROR_ON_EMPTY_RESULTS;
        }

        if (props.containsKey(KafkaGremlinSinkConnector.Keys.MAX_WAIT_FOR_CONNECTION_MILLISECONDS)) {
            this.maxWaitForConnectionMilliseconds = Integer
                    .parseInt(props.get(KafkaGremlinSinkConnector.Keys.MAX_WAIT_FOR_CONNECTION_MILLISECONDS));
        } else {
            this.maxWaitForConnectionMilliseconds = KafkaGremlinSinkConnector.DEFAULT_MAX_WAIT_FOR_CONNECTION_MILLISECONDS;
        }

        if (props.containsKey(KafkaGremlinSinkConnector.Keys.RECORD_WRITE_RETRY_COUNT)) {
            this.recordWriteRetryCount = Integer
                    .parseInt(props.get(KafkaGremlinSinkConnector.Keys.RECORD_WRITE_RETRY_COUNT));
        } else {
            this.recordWriteRetryCount = KafkaGremlinSinkConnector.DEFAULT_RECORD_WRITE_RETRY_COUNT;
        }

        if (props.containsKey(KafkaGremlinSinkConnector.Keys.RECORD_WRITE_RETRY_MILLISECONDS)) {
            this.recordWriteRetryMilliseconds = Integer
                    .parseInt(props.get(KafkaGremlinSinkConnector.Keys.RECORD_WRITE_RETRY_MILLISECONDS));
        } else {
            this.recordWriteRetryMilliseconds = KafkaGremlinSinkConnector.DEFAULT_RECORD_WRITE_RETRY_MILLISECONDS;
        }

        // Stop in case we are already started
        this.stop();

        Cluster.Builder builder = Cluster.build();
        builder.addContactPoint(this.host);
        builder.port(this.port);
        builder.maxWaitForConnection(this.maxWaitForConnectionMilliseconds);

        AuthProperties authenticationProperties = new AuthProperties();
        authenticationProperties.with(AuthProperties.Property.USERNAME,
                String.format("/dbs/%s/colls/%s", this.database, this.container));
        authenticationProperties.with(AuthProperties.Property.PASSWORD, this.key);

        builder.authProperties(authenticationProperties);
        builder.enableSsl(this.enableSsl);

        Map<String, Object> config = new HashMap<String, Object>();
        config.put("serializeResultToString", "true");

        GraphSONMessageSerializerV1d0 serializer = new GraphSONMessageSerializerV1d0();
        serializer.configure(config, null);

        builder.serializer(serializer);

        // Configure a special load balancing strategy for Azure that ignores host
        // unavailability and continues to talk to the same host
        builder.loadBalancingStrategy(new StickyLoadBalancingStrategy());

        this.cluster = builder.create();
        this.client = this.cluster.connect();

        this.remainingRetries = this.recordWriteRetryCount;
    }
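    // Cosmos DB reports the request outcome through response status attributes:
    // "x-ms-status-code" carries the backend status (409 means a conflicting
    // write) and "x-ms-retry-after-ms" suggests a backoff before retrying.
    // put() below inspects both to decide whether to skip, retry, or fail.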
    public void put(Collection<SinkRecord> sinkRecords) {
        for (SinkRecord sinkRecord : sinkRecords) {
            // Materialize traversal parameters
            Map<String, Object> materializedTraversalParameters = GremlinQueryBuilder
                    .materialize(this.parameterizedTraversal, sinkRecord);
            log.debug("Executing {} with {} parameters", this.parameterizedTraversal.getParameterizedTraversal(),
                    materializedTraversalParameters.size());

            try {
                ResultSet resultSet = this.client.submit(this.parameterizedTraversal.getParameterizedTraversal(),
                        materializedTraversalParameters);
                List<Result> results = resultSet.all().get();

                if (results == null || results.isEmpty()) {
                    log.debug("Completed successfully without results");

                    if (this.enableErrorOnEmptyResult) {
                        throw new ConnectException("Completed successfully without results");
                    }
                } else {
                    for (Result result : results) {
                        log.debug("Result {}", result.toString());
                    }
                }
            } catch (Exception e) {
                log.error("Write failed {}, remaining retries = {}", e.toString(), this.remainingRetries);

                int targetRecordWriteRetryMilliseconds = this.recordWriteRetryMilliseconds;

                // Special case for known errors when conflicting documents are being inserted
                ResponseException re = KafkaGremlinSinkTask.getResponseExceptionIfPossible(e);
                if (re != null) {
                    // Check for known errors that need to be retried or skipped
                    if (re.getStatusAttributes().isPresent()) {
                        Map<String, Object> attributes = re.getStatusAttributes().get();
                        int statusCode = (int) attributes.getOrDefault(KafkaGremlinSinkTask.X_MS_STATUS_CODE_HEADER,
                                KafkaGremlinSinkTask.X_MS_STATUS_CODE_VALUE_UNKNOWN);

                        // Now we can check for specific conditions
                        if (statusCode == KafkaGremlinSinkTask.X_MS_STATUS_CODE_VALUE_CONFLICT) {
                            if (this.enableSkipOnConflict) {
                                // Do not retry on this error - move on to the next item
                                log.warn(
                                        "Record in partition {} and offset {} resulted in conflicting traversal. Record is skipped.",
                                        sinkRecord.kafkaPartition(), sinkRecord.kafkaOffset());
                                continue;
                            }
                        }

                        // Check if we need to delay the retry
                        if (attributes.containsKey(KafkaGremlinSinkTask.X_MS_RETRY_AFTER_MS_HEADER)) {
                            int suggestedRetryAfter = KafkaGremlinSinkTask.parseTimeSpan(
                                    (String) attributes.get(KafkaGremlinSinkTask.X_MS_RETRY_AFTER_MS_HEADER));
                            if (suggestedRetryAfter > 0) {
                                // Use the suggestion within reasonable bounds
                                targetRecordWriteRetryMilliseconds = Math.min(suggestedRetryAfter,
                                        KafkaGremlinSinkTask.X_MS_MAX_RETRY_AFTER_MS);
                            }
                        }
                    }
                }

                if (this.remainingRetries == 0) {
                    throw new ConnectException(e);
                }

                this.remainingRetries -= 1;
                this.context.timeout(targetRecordWriteRetryMilliseconds);
                throw new RetriableException(e);
            }
        }

        this.remainingRetries = this.recordWriteRetryCount;
    }
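    // flush() below is a no-op: put() blocks on resultSet.all().get() for every
    // record, so by the time offsets are committed there is no buffered state
    // left to flush.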
log.debug("Completed successfully without results"); 174 | 175 | if (this.enableErrorOnEmptyResult) { 176 | throw new ConnectException("Completed successfully without results"); 177 | } 178 | } else { 179 | for (Result result : results) { 180 | log.debug("Result {}", result.toString()); 181 | } 182 | } 183 | } catch (Exception e) { 184 | log.error("Write failed {}, remaining retries = {}", e.toString(), this.remainingRetries); 185 | 186 | int targetRecordWriteRetryMilliseconds = this.recordWriteRetryMilliseconds; 187 | 188 | // Special case for known errors when conflicting documents are being inserted 189 | ResponseException re = KafkaGremlinSinkTask.getResponseExceptionIfPossible(e); 190 | if (re != null) { 191 | // Check for known errors that need to be retried or skipped 192 | if (re.getStatusAttributes().isPresent()) { 193 | Map attributes = re.getStatusAttributes().get(); 194 | int statusCode = (int) attributes.getOrDefault(KafkaGremlinSinkTask.X_MS_STATUS_CODE_HEADER, 195 | KafkaGremlinSinkTask.X_MS_STATUS_CODE_VALUE_UNKNOWN); 196 | 197 | // Now we can check for specific conditions 198 | if (statusCode == KafkaGremlinSinkTask.X_MS_STATUS_CODE_VALUE_CONFLICT) { 199 | if (this.enableSkipOnConflict) { 200 | // Do not retry on this error - move on to next item 201 | log.warn( 202 | "Record in partition {} and offset {} resulted in conflicting traversal. Record is skipped.", 203 | sinkRecord.kafkaPartition(), sinkRecord.kafkaOffset()); 204 | continue; 205 | } 206 | } 207 | 208 | // Check if we need to delay retry 209 | if (attributes.containsKey(KafkaGremlinSinkTask.X_MS_RETRY_AFTER_MS_HEADER)) { 210 | int suggestedRetryAfter = KafkaGremlinSinkTask.parseTimeSpan( 211 | (String) attributes.get(KafkaGremlinSinkTask.X_MS_RETRY_AFTER_MS_HEADER)); 212 | if (suggestedRetryAfter > 0) { 213 | // Use suggestion within reasonable bounds 214 | targetRecordWriteRetryMilliseconds = Math.min(suggestedRetryAfter, 215 | KafkaGremlinSinkTask.X_MS_MAX_RETRY_AFTER_MS); 216 | } 217 | } 218 | } 219 | } 220 | 221 | if (this.remainingRetries == 0) { 222 | throw new ConnectException(e); 223 | } 224 | 225 | this.remainingRetries -= 1; 226 | this.context.timeout(targetRecordWriteRetryMilliseconds); 227 | throw new RetriableException(e); 228 | } 229 | } 230 | 231 | this.remainingRetries = this.recordWriteRetryCount; 232 | } 233 | 234 | public void flush(Map offsets) { 235 | } 236 | 237 | @Override 238 | public void stop() { 239 | if (this.client != null) { 240 | this.client.close(); 241 | this.client = null; 242 | } 243 | 244 | if (this.cluster != null) { 245 | this.cluster.close(); 246 | this.cluster = null; 247 | } 248 | } 249 | 250 | private static ResponseException getResponseExceptionIfPossible(Throwable e) { 251 | if (e == null) { 252 | return null; 253 | } 254 | 255 | if (e instanceof ResponseException) { 256 | return (ResponseException) e; 257 | } 258 | 259 | if (e.getCause() != null) { 260 | return getResponseExceptionIfPossible(e.getCause()); 261 | } 262 | 263 | return null; 264 | } 265 | 266 | /** 267 | * Parse a string in format "00:00:00.5000000" and return total milliseconds it 268 | * represents. 
/src/main/java/com/microsoft/cosmos/gremlin/StickyLoadBalancingStrategy.java:
--------------------------------------------------------------------------------
/**
 *
 */
package com.microsoft.cosmos.gremlin;

import java.util.Collection;
import java.util.Iterator;
import java.util.Random;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;

import org.apache.tinkerpop.gremlin.driver.Cluster;
import org.apache.tinkerpop.gremlin.driver.Host;
import org.apache.tinkerpop.gremlin.driver.LoadBalancingStrategy;
import org.apache.tinkerpop.gremlin.driver.message.RequestMessage;

/**
 * In Microsoft Azure Cosmos DB there is only a single host (DNS CNAME) that
 * points to a single VIP behind which there are many physical nodes serving
 * traffic. It is incorrect to mark the host as unavailable and look for a
 * better one because there won't be a better one. The VIP does not go down,
 * but a single node can. Failure of a node should not render the entire
 * cluster unusable. We need a load balancing strategy that sticks to a single
 * host no matter what.
 *
 * @author olignat
 *
 */
final class StickyLoadBalancingStrategy implements LoadBalancingStrategy {
    private final CopyOnWriteArrayList<Host> hosts = new CopyOnWriteArrayList<Host>();
    private final AtomicInteger index = new AtomicInteger();

    @Override
    public void initialize(final Cluster cluster, final Collection<Host> hosts) {
        this.hosts.addAll(hosts);
        this.index.set(new Random().nextInt(Math.max(hosts.size(), 1)));
    }
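    // select() below hands out hosts in a fixed rotation starting from the random
    // index chosen in initialize(). Because onAvailable/onUnavailable are no-ops,
    // the strategy keeps returning the same Cosmos DB endpoint even when the
    // driver reports it unavailable, which is exactly the desired "sticky" behavior.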
    @Override
    public Iterator<Host> select(final RequestMessage msg) {
        final int startIndex = index.getAndIncrement();

        if (startIndex > Integer.MAX_VALUE - 10000)
            index.set(0);

        return new Iterator<Host>() {

            private int currentIndex = startIndex;

            @Override
            public boolean hasNext() {
                return hosts.size() > 0;
            }

            @Override
            public Host next() {
                int c = currentIndex++ % hosts.size();
                if (c < 0)
                    c += hosts.size();
                return hosts.get(c);
            }
        };
    }

    @Override
    public void onAvailable(final Host host) {
    }

    @Override
    public void onUnavailable(final Host host) {
    }

    @Override
    public void onNew(final Host host) {
        this.hosts.addIfAbsent(host);
    }

    @Override
    public void onRemove(final Host host) {
        this.hosts.remove(host);
    }
}
--------------------------------------------------------------------------------
/src/test/java/com/microsoft/cosmos/gremlin/GremlinQueryBuilderTest.java:
--------------------------------------------------------------------------------
/**
 *
 */
package com.microsoft.cosmos.gremlin;

import static org.junit.jupiter.api.Assertions.*;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.kafka.connect.data.Schema;
import org.apache.kafka.connect.data.SchemaBuilder;
import org.apache.kafka.connect.data.Struct;
import org.apache.kafka.connect.sink.SinkRecord;
import org.junit.jupiter.api.Test;

/**
 * Test coverage for the gremlin query builder
 *
 * @author olignat
 *
 */
public final class GremlinQueryBuilderTest {

    @Test
    public void testParameterizeNull() throws Exception {
        GremlinQueryBuilder.GremlinParameterizedQuery parameterizedQuery = GremlinQueryBuilder.parameterize(null);
        assertNotNull(parameterizedQuery, "Expected to receive an instance of parameterized query for null input traversal");
        assertNull(parameterizedQuery.getParameterizedTraversal(), "Expected to receive null parameterized traversal");
        assertNotNull(parameterizedQuery.getTraversalParameters(), "Expected to receive an instance of traversal parameters");
        assertEquals(0, parameterizedQuery.getTraversalParameters().size(), "Traversal parameters are expected to be empty because there is no traversal on input");
    }
    @Test
    public void testParameterizeNonMatchingPlaceholder() throws Exception {
        GremlinQueryBuilder.GremlinParameterizedQuery parameterizedQuery = GremlinQueryBuilder.parameterize("test123 ${placeholder xyz");
        assertEquals("test123 ${placeholder xyz", parameterizedQuery.getParameterizedTraversal(),
                "Expected placeholder to not be reset because it doesn't have } marker");
        assertNotNull(parameterizedQuery.getTraversalParameters(), "Expected to have an instance of traversal parameters map regardless of the content");
        assertEquals(0, parameterizedQuery.getTraversalParameters().size(), "Traversal parameters are expected to be empty because there are no matching parameter placeholders");

        parameterizedQuery = GremlinQueryBuilder.parameterize("test123 placeholder} xyz");
        assertEquals("test123 placeholder} xyz", parameterizedQuery.getParameterizedTraversal(),
                "Expected placeholder to not be reset because it doesn't have { marker");
        assertNotNull(parameterizedQuery.getTraversalParameters(), "Expected to have an instance of traversal parameters map regardless of the content");
        assertEquals(0, parameterizedQuery.getTraversalParameters().size(), "Traversal parameters are expected to be empty because there are no matching parameter placeholders");

        parameterizedQuery = GremlinQueryBuilder.parameterize("test123 {placeholder} xyz");
        assertEquals("test123 {placeholder} xyz", parameterizedQuery.getParameterizedTraversal(),
                "Expected placeholder to not be reset because it doesn't have $ marker");
        assertNotNull(parameterizedQuery.getTraversalParameters(), "Expected to have an instance of traversal parameters map regardless of the content");
        assertEquals(0, parameterizedQuery.getTraversalParameters().size(), "Traversal parameters are expected to be empty because there are no matching parameter placeholders");
    }

    @Test
    public void testParameterizeMultipleMatchingPlaceholders() throws Exception {
        GremlinQueryBuilder.GremlinParameterizedQuery parameterizedQuery = GremlinQueryBuilder.parameterize("test123 ${p1} Y ${2} xyz ${} ---");
        assertEquals("test123 gp1 Y gp2 xyz gp3 ---", parameterizedQuery.getParameterizedTraversal(),
                "Expected 3 placeholders to be replaced by parameter markers");
        assertNotNull(parameterizedQuery.getTraversalParameters(), "Expected to have an instance of traversal parameters map");
        assertEquals(3, parameterizedQuery.getTraversalParameters().size(), "Traversal parameters are expected to be stored in traversal map");

        assertTrue(parameterizedQuery.getTraversalParameters().containsKey("gp1"), "Expected to find the first gremlin parameter inside traversal map");
        assertEquals("p1", parameterizedQuery.getTraversalParameters().get("gp1"), "Expected the first Gremlin parameter to match the first Kafka parameter");

        assertTrue(parameterizedQuery.getTraversalParameters().containsKey("gp2"), "Expected to find the second gremlin parameter inside traversal map");
        assertEquals("2", parameterizedQuery.getTraversalParameters().get("gp2"), "Expected the second Gremlin parameter to match the second Kafka parameter");

        assertTrue(parameterizedQuery.getTraversalParameters().containsKey("gp3"), "Expected to find the third gremlin parameter inside traversal map");
        assertEquals("", parameterizedQuery.getTraversalParameters().get("gp3"), "Expected the third Gremlin parameter to match the third Kafka parameter");
    }
parameters"); 82 | 83 | assertTrue(parameterizedQuery.getTraversalParameters().containsKey("gp1"), "Expected Gremlin parameter was not found in traversal map"); 84 | assertEquals("key", parameterizedQuery.getTraversalParameters().get("gp1"), "Unexpected Kafka placeholder value for Gremlin parameter"); 85 | 86 | // Testing integer value 87 | SinkRecord sinkRecord = new SinkRecord("Test-Topic", 1, Schema.INT32_SCHEMA, 1, null, null, 15L); 88 | Map materializedParameters = GremlinQueryBuilder.materialize(parameterizedQuery, sinkRecord); 89 | 90 | assertNotNull(materializedParameters, "Expected a valid instance of materialized parameters for integer type"); 91 | assertEquals(1, materializedParameters.size(), "Expected a single entry in materialized parameters map because there is a single integer parameter to materialize"); 92 | 93 | assertTrue(materializedParameters.containsKey("gp1"), "Expected materialized parameters to contain a single first Gremlin parameter"); 94 | assertEquals(1, materializedParameters.get("gp1"), "Unexpected value of a materialized integer Gremlin parameter"); 95 | 96 | // Testing string value 97 | sinkRecord = new SinkRecord("Test-Topic", 1, Schema.STRING_SCHEMA, "abc", null, null, 15L); 98 | materializedParameters = GremlinQueryBuilder.materialize(parameterizedQuery, sinkRecord); 99 | 100 | assertNotNull(materializedParameters, "Expected a valid instance of materialized parameters for string type"); 101 | assertEquals(1, materializedParameters.size(), "Expected a single entry in materialized parameters map because there is a single string parameter to materialize"); 102 | 103 | assertTrue(materializedParameters.containsKey("gp1"), "Expected materialized parameters to contain a single first Gremlin parameter"); 104 | assertEquals("abc", materializedParameters.get("gp1"), "Unexpected value of a materialized string Gremlin parameter"); 105 | } 106 | 107 | @Test 108 | public void testReplacePrimitiveValue() throws Exception { 109 | GremlinQueryBuilder.GremlinParameterizedQuery parameterizedQuery = GremlinQueryBuilder.parameterize("before ${value} after"); 110 | assertNotNull(parameterizedQuery, "Expected a valid instance of parameterized query"); 111 | 112 | assertEquals("before gp1 after", parameterizedQuery.getParameterizedTraversal(), "Unexpected parameterized Gremlin traversal"); 113 | assertEquals(1, parameterizedQuery.getTraversalParameters().size(), "Expected exactly one key to be stored in traversal parameters"); 114 | 115 | assertTrue(parameterizedQuery.getTraversalParameters().containsKey("gp1"), "Expected Gremlin parameter was not found in traversal map"); 116 | assertEquals("value", parameterizedQuery.getTraversalParameters().get("gp1"), "Unexpected Kafka placeholder value for Gremlin parameter"); 117 | 118 | // Testing integer value 119 | SinkRecord sinkRecord = new SinkRecord("Test-Topic", 1, null, null, Schema.INT32_SCHEMA, 1234, 15L); 120 | Map materializedParameters = GremlinQueryBuilder.materialize(parameterizedQuery, sinkRecord); 121 | 122 | assertNotNull(materializedParameters, "Expected a valid instance of materialized parameters for integer type"); 123 | assertEquals(1, materializedParameters.size(), "Expected a single entry in materialized parameters map because there is a single integer parameter to materialize"); 124 | 125 | assertTrue(materializedParameters.containsKey("gp1"), "Expected materialized parameters to contain a single first Gremlin parameter"); 126 | assertEquals(1234, materializedParameters.get("gp1"), "Unexpected value of a 
    @Test
    public void testReplacePrimitiveValue() throws Exception {
        GremlinQueryBuilder.GremlinParameterizedQuery parameterizedQuery = GremlinQueryBuilder.parameterize("before ${value} after");
        assertNotNull(parameterizedQuery, "Expected a valid instance of parameterized query");

        assertEquals("before gp1 after", parameterizedQuery.getParameterizedTraversal(), "Unexpected parameterized Gremlin traversal");
        assertEquals(1, parameterizedQuery.getTraversalParameters().size(), "Expected exactly one key to be stored in traversal parameters");

        assertTrue(parameterizedQuery.getTraversalParameters().containsKey("gp1"), "Expected Gremlin parameter was not found in traversal map");
        assertEquals("value", parameterizedQuery.getTraversalParameters().get("gp1"), "Unexpected Kafka placeholder value for Gremlin parameter");

        // Testing integer value
        SinkRecord sinkRecord = new SinkRecord("Test-Topic", 1, null, null, Schema.INT32_SCHEMA, 1234, 15L);
        Map<String, Object> materializedParameters = GremlinQueryBuilder.materialize(parameterizedQuery, sinkRecord);

        assertNotNull(materializedParameters, "Expected a valid instance of materialized parameters for integer type");
        assertEquals(1, materializedParameters.size(), "Expected a single entry in materialized parameters map because there is a single integer parameter to materialize");

        assertTrue(materializedParameters.containsKey("gp1"), "Expected materialized parameters to contain a single first Gremlin parameter");
        assertEquals(1234, materializedParameters.get("gp1"), "Unexpected value of a materialized integer Gremlin parameter");

        // Testing string value
        sinkRecord = new SinkRecord("Test-Topic", 1, null, null, Schema.STRING_SCHEMA, "xyz", 15L);
        materializedParameters = GremlinQueryBuilder.materialize(parameterizedQuery, sinkRecord);

        assertNotNull(materializedParameters, "Expected a valid instance of materialized parameters for string type");
        assertEquals(1, materializedParameters.size(), "Expected a single entry in materialized parameters map because there is a single string parameter to materialize");

        assertTrue(materializedParameters.containsKey("gp1"), "Expected materialized parameters to contain a single first Gremlin parameter");
        assertEquals("xyz", materializedParameters.get("gp1"), "Unexpected value of a materialized string Gremlin parameter");

        // Testing null string value
        sinkRecord = new SinkRecord("Test-Topic", 1, null, null, Schema.STRING_SCHEMA, null, 15L);
        materializedParameters = GremlinQueryBuilder.materialize(parameterizedQuery, sinkRecord);

        assertNotNull(materializedParameters, "Expected a valid instance of materialized parameters for null string type");
        assertEquals(1, materializedParameters.size(), "Expected a single entry in materialized parameters map because there is a single null string parameter to materialize");

        assertTrue(materializedParameters.containsKey("gp1"), "Expected materialized parameters to contain a single first Gremlin parameter");
        assertEquals(null, materializedParameters.get("gp1"), "Unexpected value of a materialized null string Gremlin parameter");
    }
valueMap.put("secondField", 423); 171 | valueMap.put("thirdField", null); 172 | 173 | SinkRecord sinkRecord = new SinkRecord("Test-Topic", 1, null, null, valueSchema, valueMap, 11L); 174 | Map materializedParameters = GremlinQueryBuilder.materialize(parameterizedQuery, sinkRecord); 175 | 176 | assertNotNull(materializedParameters, "Expected a valid instance of materialized parameters"); 177 | assertEquals(3, materializedParameters.size(), "Expected materialized parameters map to have the same number of entries as there are placeholders in traversal template"); 178 | 179 | assertTrue(materializedParameters.containsKey("gp1"), "Expected materialized parameters to contain the first Gremlin parameter"); 180 | assertEquals(2019, materializedParameters.get("gp1"), "Unexpected value of the first materialized Gremlin parameter"); 181 | 182 | assertTrue(materializedParameters.containsKey("gp2"), "Expected materialized parameters to contain the second Gremlin parameter"); 183 | assertEquals(423, materializedParameters.get("gp2"), "Unexpected value of the second materialized Gremlin parameter"); 184 | 185 | assertTrue(materializedParameters.containsKey("gp3"), "Expected materialized parameters to contain the third Gremlin parameter"); 186 | assertEquals(null, materializedParameters.get("gp3"), "Unexpected value of the third materialized Gremlin parameter"); 187 | } 188 | 189 | @Test 190 | public void testReplaceStructValue() throws Exception { 191 | GremlinQueryBuilder.GremlinParameterizedQuery parameterizedQuery = GremlinQueryBuilder.parameterize("first |${key.boolField1}| second {${key.intField2}} third ${key.stringField3} end"); 192 | assertNotNull(parameterizedQuery, "Expected a valid instance of parameterized query"); 193 | 194 | assertEquals("first |gp1| second {gp2} third gp3 end", parameterizedQuery.getParameterizedTraversal(), "Unexpected parameterized Gremlin traversal"); 195 | assertEquals(3, parameterizedQuery.getTraversalParameters().size(), "Expected exactly 3 keys to be stored in traversal parameters because there are 3 valid placeholders in the traversal template"); 196 | 197 | assertTrue(parameterizedQuery.getTraversalParameters().containsKey("gp1"), "Expected first Gremlin parameter was not found in traversal map"); 198 | assertEquals("key.boolField1", parameterizedQuery.getTraversalParameters().get("gp1"), "Unexpected Kafka placeholder value for the first Gremlin parameter"); 199 | 200 | assertTrue(parameterizedQuery.getTraversalParameters().containsKey("gp2"), "Expected second Gremlin parameter was not found in traversal map"); 201 | assertEquals("key.intField2", parameterizedQuery.getTraversalParameters().get("gp2"), "Unexpected Kafka placeholder value for the second Gremlin parameter"); 202 | 203 | assertTrue(parameterizedQuery.getTraversalParameters().containsKey("gp3"), "Expected third Gremlin parameter was not found in traversal map"); 204 | assertEquals("key.stringField3", parameterizedQuery.getTraversalParameters().get("gp3"), "Unexpected Kafka placeholder value for the third Gremlin parameter"); 205 | 206 | Schema keySchema = SchemaBuilder 207 | .struct() 208 | .field("boolField1", Schema.BOOLEAN_SCHEMA) 209 | .field("intField2", Schema.INT32_SCHEMA) 210 | .field("stringField3", Schema.STRING_SCHEMA) 211 | .field("nullField4", Schema.OPTIONAL_STRING_SCHEMA).build(); 212 | 213 | Struct keyRecord = new Struct(keySchema) 214 | .put("boolField1", true) 215 | .put("intField2", 64534) 216 | .put("stringField3", "StringValue") 217 | .put("nullField4", null); 218 | 219 | 
SinkRecord sinkRecord = new SinkRecord("Test-Topic", 1, keySchema, keyRecord, null, null, 11L); 220 | Map materializedParameters = GremlinQueryBuilder.materialize(parameterizedQuery, sinkRecord); 221 | 222 | assertNotNull(materializedParameters, "Expected a valid instance of materialized parameters"); 223 | assertEquals(3, materializedParameters.size(), "Expected materialized parameters map to have the same number of entries as there are placeholders in traversal template"); 224 | 225 | assertTrue(materializedParameters.containsKey("gp1"), "Expected materialized parameters to contain the first Gremlin parameter"); 226 | assertEquals(true, materializedParameters.get("gp1"), "Unexpected value of the first materialized Gremlin parameter"); 227 | 228 | assertTrue(materializedParameters.containsKey("gp2"), "Expected materialized parameters to contain the second Gremlin parameter"); 229 | assertEquals(64534, materializedParameters.get("gp2"), "Unexpected value of the second materialized Gremlin parameter"); 230 | 231 | assertTrue(materializedParameters.containsKey("gp3"), "Expected materialized parameters to contain the third Gremlin parameter"); 232 | assertEquals("StringValue", materializedParameters.get("gp3"), "Unexpected value of the third materialized Gremlin parameter"); 233 | } 234 | 235 | @Test 236 | public void testReplaceArrayValue() throws Exception { 237 | GremlinQueryBuilder.GremlinParameterizedQuery parameterizedQuery = GremlinQueryBuilder.parameterize("a ${value[0]} b /${value[1]}/ c |${value[2]}| d"); 238 | assertNotNull(parameterizedQuery, "Expected a valid instance of parameterized query"); 239 | 240 | assertEquals("a gp1 b /gp2/ c |gp3| d", parameterizedQuery.getParameterizedTraversal(), "Unexpected parameterized Gremlin traversal"); 241 | assertEquals(3, parameterizedQuery.getTraversalParameters().size(), "Expected exactly 3 keys to be stored in traversal parameters because there are 3 valid placeholders in the traversal template"); 242 | 243 | assertTrue(parameterizedQuery.getTraversalParameters().containsKey("gp1"), "Expected first Gremlin parameter was not found in traversal map"); 244 | assertEquals("value[0]", parameterizedQuery.getTraversalParameters().get("gp1"), "Unexpected Kafka placeholder value for the first Gremlin parameter"); 245 | 246 | assertTrue(parameterizedQuery.getTraversalParameters().containsKey("gp2"), "Expected second Gremlin parameter was not found in traversal map"); 247 | assertEquals("value[1]", parameterizedQuery.getTraversalParameters().get("gp2"), "Unexpected Kafka placeholder value for the second Gremlin parameter"); 248 | 249 | assertTrue(parameterizedQuery.getTraversalParameters().containsKey("gp3"), "Expected third Gremlin parameter was not found in traversal map"); 250 | assertEquals("value[2]", parameterizedQuery.getTraversalParameters().get("gp3"), "Unexpected Kafka placeholder value for the third Gremlin parameter"); 251 | 252 | Schema valueSchema = SchemaBuilder.array(Schema.STRING_SCHEMA).build(); 253 | 254 | List valueRecord = new ArrayList(); 255 | valueRecord.add("first"); 256 | valueRecord.add("second"); 257 | valueRecord.add(null); 258 | 259 | SinkRecord sinkRecord = new SinkRecord("Test-Topic", 1, null, null, valueSchema, valueRecord, 11L); 260 | Map materializedParameters = GremlinQueryBuilder.materialize(parameterizedQuery, sinkRecord); 261 | 262 | assertNotNull(materializedParameters, "Expected a valid instance of materialized parameters"); 263 | assertEquals(3, materializedParameters.size(), "Expected materialized parameters 
map to have the same number of entries as there are placeholders in traversal template"); 264 | 265 | assertTrue(materializedParameters.containsKey("gp1"), "Expected materialized parameters to contain the first Gremlin parameter"); 266 | assertEquals("first", materializedParameters.get("gp1"), "Unexpected value of the first materialized Gremlin parameter"); 267 | 268 | assertTrue(materializedParameters.containsKey("gp2"), "Expected materialized parameters to contain the second Gremlin parameter"); 269 | assertEquals("second", materializedParameters.get("gp2"), "Unexpected value of the second materialized Gremlin parameter"); 270 | 271 | assertTrue(materializedParameters.containsKey("gp3"), "Expected materialized parameters to contain the third Gremlin parameter"); 272 | assertEquals(null, materializedParameters.get("gp3"), "Unexpected value of the third materialized Gremlin parameter"); 273 | } 274 | } 275 | -------------------------------------------------------------------------------- /src/test/java/com/microsoft/cosmos/gremlin/KafkaGremlinSinkConnectorTest.java: -------------------------------------------------------------------------------- 1 | /** 2 | * 3 | */ 4 | package com.microsoft.cosmos.gremlin; 5 | 6 | import static org.junit.jupiter.api.Assertions.*; 7 | 8 | import java.lang.reflect.Modifier; 9 | import java.util.HashMap; 10 | import java.util.List; 11 | import java.util.Map; 12 | import java.util.UUID; 13 | 14 | import org.apache.kafka.common.config.ConfigDef; 15 | import org.apache.kafka.common.config.ConfigDef.ConfigKey; 16 | import org.apache.kafka.common.config.ConfigException; 17 | import org.junit.jupiter.api.Test; 18 | 19 | import com.microsoft.cosmos.gremlin.KafkaGremlinSinkConnector.Keys; 20 | 21 | /** 22 | * Test coverage for sink connector 23 | * 24 | * @author olignat 25 | * 26 | */ 27 | final public class KafkaGremlinSinkConnectorTest { 28 | 29 | @Test 30 | public void testReflectionContract() throws Exception { 31 | assertTrue(Modifier.isPublic(KafkaGremlinSinkConnector.class.getModifiers()), 32 | "Sink connector must be public to be discoverable by Kafka"); 33 | assertTrue(KafkaGremlinSinkConnector.class.getConstructors().length > 0, 34 | "Sink connector must have a public constructor"); 35 | } 36 | 37 | @Test 38 | public void testConfigurationDefinition() throws Exception { 39 | KafkaGremlinSinkConnector connector = new KafkaGremlinSinkConnector(); 40 | ConfigDef configurationDefinition = connector.config(); 41 | 42 | assertNotNull(configurationDefinition, "Configuration definition should not be null"); 43 | assertFalse(configurationDefinition.names().isEmpty(), "Set of configuration names should not be empty"); 44 | 45 | for (String configurationKey : configurationDefinition.names()) { 46 | ConfigKey configurationKeyDefinition = configurationDefinition.configKeys().get(configurationKey); 47 | 48 | assertNotNull(configurationKeyDefinition.name, "Configuration key definition name should not be null"); 49 | assertFalse(configurationKeyDefinition.name.isEmpty(), 50 | "Configuration key definition name should not be empty"); 51 | 52 | assertNotNull(configurationKeyDefinition.documentation, 53 | "Configuration key " + configurationKeyDefinition.name + " documentation should not be null"); 54 | assertFalse(configurationKeyDefinition.documentation.isEmpty(), 55 | "Configuration key " + configurationKeyDefinition.name + " documentation should not be empty"); 56 | } 57 | 58 | // Ensure it contains all expected keys 59 | 
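// Keys is the nested constants class on KafkaGremlinSinkConnector (imported at the top of this file) that names every configuration property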
assertTrue(configurationDefinition.configKeys().containsKey(Keys.HOST), 60 | "Key " + Keys.HOST + " should be present in configuration definition"); 61 | assertTrue(configurationDefinition.configKeys().containsKey(Keys.PORT), 62 | "Key " + Keys.PORT + " should be present in configuration definition"); 63 | assertTrue(configurationDefinition.configKeys().containsKey(Keys.DATABASE), 64 | "Key " + Keys.DATABASE + " should be present in configuration definition"); 65 | assertTrue(configurationDefinition.configKeys().containsKey(Keys.CONTAINER), 66 | "Key " + Keys.CONTAINER + " should be present in configuration definition"); 67 | assertTrue(configurationDefinition.configKeys().containsKey(Keys.KEY), 68 | "Key " + Keys.KEY + " should be present in configuration definition"); 69 | assertTrue(configurationDefinition.configKeys().containsKey(Keys.TRAVERSAL), 70 | "Key " + Keys.TRAVERSAL + " should be present in configuration definition"); 71 | assertTrue(configurationDefinition.configKeys().containsKey(Keys.ENABLE_SKIP_ON_CONFLICT), 72 | "Key " + Keys.ENABLE_SKIP_ON_CONFLICT + " should be present in configuration definition"); 73 | assertTrue(configurationDefinition.configKeys().containsKey(Keys.ENABLE_SSL), 74 | "Key " + Keys.ENABLE_SSL + " should be present in configuration definition"); 75 | assertTrue(configurationDefinition.configKeys().containsKey(Keys.ENABLE_ERROR_ON_EMPTY_RESULTS), 76 | "Key " + Keys.ENABLE_ERROR_ON_EMPTY_RESULTS + " should be present in configuration definition"); 77 | assertTrue(configurationDefinition.configKeys().containsKey(Keys.MAX_WAIT_FOR_CONNECTION_MILLISECONDS), 78 | "Key " + Keys.MAX_WAIT_FOR_CONNECTION_MILLISECONDS + " should be present in configuration definition"); 79 | assertTrue(configurationDefinition.configKeys().containsKey(Keys.RECORD_WRITE_RETRY_COUNT), 80 | "Key " + Keys.RECORD_WRITE_RETRY_COUNT + " should be present in configuration definition"); 81 | assertTrue(configurationDefinition.configKeys().containsKey(Keys.RECORD_WRITE_RETRY_MILLISECONDS), 82 | "Key " + Keys.RECORD_WRITE_RETRY_MILLISECONDS + " should be present in configuration definition"); 83 | 84 | assertEquals(12, configurationDefinition.configKeys().keySet().size(), 85 | "Unexpected configuration values found in configuration definition"); 86 | } 87 | 88 | @Test 89 | public void testVersion() throws Exception { 90 | KafkaGremlinSinkConnector connector = new KafkaGremlinSinkConnector(); 91 | assertEquals("2.2.0", connector.version(), "Expected specific version to be returned by connector"); 92 | } 93 | 94 | @Test 95 | public void testSinkTaskClass() throws Exception { 96 | KafkaGremlinSinkConnector connector = new KafkaGremlinSinkConnector(); 97 | assertEquals(KafkaGremlinSinkTask.class, connector.taskClass(), 98 | "Expected sink task to be returned by connector"); 99 | } 100 | 101 | @Test 102 | public void testStartEmptyConfiguration() throws Exception { 103 | KafkaGremlinSinkConnector connector = new KafkaGremlinSinkConnector(); 104 | 105 | Map configuration = new HashMap(); 106 | assertThrows(ConfigException.class, () -> connector.start(configuration)); 107 | } 108 | 109 | @Test 110 | public void testStartRequiredConfiguration() throws Exception { 111 | KafkaGremlinSinkConnector connector = new KafkaGremlinSinkConnector(); 112 | 113 | Map configuration = new HashMap(); 114 | 115 | configuration.put(Keys.HOST, ""); 116 | assertThrows(ConfigException.class, () -> connector.start(configuration)); 117 | 118 | configuration.put(Keys.HOST, "test value"); 119 | 
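// A valid HOST alone is not a complete configuration: PORT, DATABASE, CONTAINER and KEY are still required, so start() keeps throwing until each is supplied below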
assertThrows(ConfigException.class, () -> connector.start(configuration)); 120 | 121 | configuration.put(Keys.PORT, ""); 122 | assertThrows(ConfigException.class, () -> connector.start(configuration)); 123 | 124 | configuration.put(Keys.PORT, "321"); 125 | assertThrows(ConfigException.class, () -> connector.start(configuration)); 126 | 127 | configuration.put(Keys.DATABASE, ""); 128 | assertThrows(ConfigException.class, () -> connector.start(configuration)); 129 | 130 | configuration.put(Keys.DATABASE, "database"); 131 | assertThrows(ConfigException.class, () -> connector.start(configuration)); 132 | 133 | configuration.put(Keys.CONTAINER, ""); 134 | assertThrows(ConfigException.class, () -> connector.start(configuration)); 135 | 136 | configuration.put(Keys.CONTAINER, "container"); 137 | assertThrows(ConfigException.class, () -> connector.start(configuration)); 138 | 139 | configuration.put(Keys.KEY, ""); 140 | assertThrows(ConfigException.class, () -> connector.start(configuration)); 141 | 142 | configuration.put(Keys.KEY, "key"); 143 | connector.start(configuration); 144 | 145 | configuration.put(Keys.TRAVERSAL, ""); 146 | connector.start(configuration); 147 | 148 | configuration.put(Keys.TRAVERSAL, "traversal"); 149 | connector.start(configuration); 150 | 151 | configuration.put(Keys.ENABLE_SKIP_ON_CONFLICT, ""); 152 | connector.start(configuration); 153 | 154 | configuration.put(Keys.ENABLE_SKIP_ON_CONFLICT, "true"); 155 | connector.start(configuration); 156 | 157 | configuration.put(Keys.ENABLE_SSL, ""); 158 | connector.start(configuration); 159 | 160 | configuration.put(Keys.ENABLE_SSL, "false"); 161 | connector.start(configuration); 162 | 163 | configuration.put(Keys.ENABLE_ERROR_ON_EMPTY_RESULTS, ""); 164 | connector.start(configuration); 165 | 166 | configuration.put(Keys.ENABLE_ERROR_ON_EMPTY_RESULTS, "true"); 167 | connector.start(configuration); 168 | 169 | configuration.put(Keys.MAX_WAIT_FOR_CONNECTION_MILLISECONDS, ""); 170 | connector.start(configuration); 171 | 172 | configuration.put(Keys.MAX_WAIT_FOR_CONNECTION_MILLISECONDS, "12345"); 173 | connector.start(configuration); 174 | 175 | configuration.put(Keys.RECORD_WRITE_RETRY_COUNT, ""); 176 | connector.start(configuration); 177 | 178 | configuration.put(Keys.RECORD_WRITE_RETRY_COUNT, "54321"); 179 | connector.start(configuration); 180 | 181 | configuration.put(Keys.RECORD_WRITE_RETRY_MILLISECONDS, ""); 182 | connector.start(configuration); 183 | 184 | configuration.put(Keys.RECORD_WRITE_RETRY_MILLISECONDS, "3"); 185 | connector.start(configuration); 186 | } 187 | 188 | @Test 189 | public void testTaskConfigs() throws Exception { 190 | KafkaGremlinSinkConnector connector = new KafkaGremlinSinkConnector(); 191 | 192 | Map configuration = new HashMap(); 193 | 194 | configuration.put(Keys.HOST, "host" + UUID.randomUUID().toString()); 195 | configuration.put(Keys.PORT, "port-" + UUID.randomUUID().toString()); 196 | configuration.put(Keys.DATABASE, "database-" + UUID.randomUUID().toString()); 197 | configuration.put(Keys.CONTAINER, "container-" + UUID.randomUUID().toString()); 198 | configuration.put(Keys.KEY, "key-" + UUID.randomUUID().toString()); 199 | configuration.put(Keys.TRAVERSAL, "traversal"); 200 | configuration.put(Keys.ENABLE_SKIP_ON_CONFLICT, "true"); 201 | configuration.put(Keys.ENABLE_SSL, "true"); 202 | configuration.put(Keys.ENABLE_ERROR_ON_EMPTY_RESULTS, "true"); 203 | configuration.put(Keys.MAX_WAIT_FOR_CONNECTION_MILLISECONDS, "10001"); 204 | configuration.put(Keys.RECORD_WRITE_RETRY_COUNT, "7"); 205 | 
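// With the final retry setting below, every connector key is populated; start() must accept the map and taskConfigs() must echo each entry back to every task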
configuration.put(Keys.RECORD_WRITE_RETRY_MILLISECONDS, "1324"); 206 | connector.start(configuration); 207 | 208 | List> taskConfigs = connector.taskConfigs(2); 209 | 210 | assertEquals(2, taskConfigs.size(), 211 | "Expected exactly two configurations because only two task configurations were requested"); 212 | 213 | for (Map taskConfig : taskConfigs) { 214 | for (String taskConfigKey : configuration.keySet()) { 215 | assertEquals(configuration.get(taskConfigKey), taskConfig.get(taskConfigKey), 216 | "Expected task configuration for " + taskConfigKey + " to match connector"); 217 | } 218 | } 219 | } 220 | 221 | @Test 222 | public void testOptionalTaskConfigs() throws Exception { 223 | KafkaGremlinSinkConnector connector = new KafkaGremlinSinkConnector(); 224 | 225 | Map configuration = new HashMap(); 226 | 227 | configuration.put(Keys.HOST, "host" + UUID.randomUUID().toString()); 228 | configuration.put(Keys.PORT, "port-" + UUID.randomUUID().toString()); 229 | configuration.put(Keys.DATABASE, "database-" + UUID.randomUUID().toString()); 230 | configuration.put(Keys.CONTAINER, "container-" + UUID.randomUUID().toString()); 231 | configuration.put(Keys.KEY, "key-" + UUID.randomUUID().toString()); 232 | connector.start(configuration); 233 | 234 | List> taskConfigs = connector.taskConfigs(1); 235 | assertEquals(1, taskConfigs.size(), 236 | "Expected exactly one configuration because only one task configuration was requested"); 237 | 238 | // Verify that optional keys are not present 239 | assertFalse(taskConfigs.get(0).containsKey(Keys.TRAVERSAL), 240 | "Expected to not find " + Keys.TRAVERSAL + " because it was not specified in connector configuration"); 241 | assertFalse(taskConfigs.get(0).containsKey(Keys.ENABLE_SKIP_ON_CONFLICT), "Expected to not find " 242 | + Keys.ENABLE_SKIP_ON_CONFLICT + " because it was not specified in connector configuration"); 243 | assertFalse(taskConfigs.get(0).containsKey(Keys.ENABLE_SSL), 244 | "Expected to not find " + Keys.ENABLE_SSL + " because it was not specified in connector configuration"); 245 | assertFalse(taskConfigs.get(0).containsKey(Keys.ENABLE_ERROR_ON_EMPTY_RESULTS), "Expected to not find " 246 | + Keys.ENABLE_ERROR_ON_EMPTY_RESULTS + " because it was not specified in connector configuration"); 247 | assertFalse(taskConfigs.get(0).containsKey(Keys.MAX_WAIT_FOR_CONNECTION_MILLISECONDS), 248 | "Expected to not find " + Keys.MAX_WAIT_FOR_CONNECTION_MILLISECONDS 249 | + " because it was not specified in connector configuration"); 250 | assertFalse(taskConfigs.get(0).containsKey(Keys.RECORD_WRITE_RETRY_COUNT), "Expected to not find " 251 | + Keys.RECORD_WRITE_RETRY_COUNT + " because it was not specified in connector configuration"); 252 | assertFalse(taskConfigs.get(0).containsKey(Keys.RECORD_WRITE_RETRY_MILLISECONDS), "Expected to not find " 253 | + Keys.RECORD_WRITE_RETRY_MILLISECONDS + " because it was not specified in connector configuration"); 254 | 255 | 
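// Empty strings should behave exactly like absent settings: the connector is expected to drop them from the task configurations it hands out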
// Set empty values 256 | configuration.put(Keys.TRAVERSAL, ""); 257 | configuration.put(Keys.ENABLE_SKIP_ON_CONFLICT, ""); 258 | configuration.put(Keys.ENABLE_SSL, ""); 259 | configuration.put(Keys.ENABLE_ERROR_ON_EMPTY_RESULTS, ""); 260 | configuration.put(Keys.MAX_WAIT_FOR_CONNECTION_MILLISECONDS, ""); 261 | configuration.put(Keys.RECORD_WRITE_RETRY_COUNT, ""); 262 | configuration.put(Keys.RECORD_WRITE_RETRY_MILLISECONDS, ""); 263 | connector.start(configuration); 264 | 265 | taskConfigs = connector.taskConfigs(1); 266 | assertEquals(1, taskConfigs.size(), 267 | "Expected exactly one configuration because only one task configuration was requested subsequently"); 268 | 269 | // Verify that optional keys are not present 270 | assertFalse(taskConfigs.get(0).containsKey(Keys.TRAVERSAL), 271 | "Expected to not find " + Keys.TRAVERSAL + " because it was set to an empty string"); 272 | assertFalse(taskConfigs.get(0).containsKey(Keys.ENABLE_SKIP_ON_CONFLICT), 273 | "Expected to not find " + Keys.ENABLE_SKIP_ON_CONFLICT + " because it was set to an empty string"); 274 | assertFalse(taskConfigs.get(0).containsKey(Keys.ENABLE_SSL), 275 | "Expected to not find " + Keys.ENABLE_SSL + " because it was set to an empty string"); 276 | assertFalse(taskConfigs.get(0).containsKey(Keys.ENABLE_ERROR_ON_EMPTY_RESULTS), "Expected to not find " 277 | + Keys.ENABLE_ERROR_ON_EMPTY_RESULTS + " because it was set to an empty string"); 278 | assertFalse(taskConfigs.get(0).containsKey(Keys.MAX_WAIT_FOR_CONNECTION_MILLISECONDS), "Expected to not find " 279 | + Keys.MAX_WAIT_FOR_CONNECTION_MILLISECONDS + " because it was set to an empty string"); 280 | assertFalse(taskConfigs.get(0).containsKey(Keys.RECORD_WRITE_RETRY_COUNT), 281 | "Expected to not find " + Keys.RECORD_WRITE_RETRY_COUNT + " because it was set to an empty string"); 282 | assertFalse(taskConfigs.get(0).containsKey(Keys.RECORD_WRITE_RETRY_MILLISECONDS), "Expected to not find " 283 | + Keys.RECORD_WRITE_RETRY_MILLISECONDS + " because it was set to an empty string"); 284 | } 285 | 286 | @Test 287 | public void testStop() { 288 | KafkaGremlinSinkConnector connector = new KafkaGremlinSinkConnector(); 289 | connector.stop(); 290 | } 291 | } 292 | -------------------------------------------------------------------------------- /src/test/java/com/microsoft/cosmos/gremlin/KafkaGremlinSinkTaskTest.java: -------------------------------------------------------------------------------- 1 | /** 2 | * 3 | */ 4 | package com.microsoft.cosmos.gremlin; 5 | 6 | import static org.junit.jupiter.api.Assertions.*; 7 | 8 | import java.lang.reflect.Modifier; 9 | import java.util.ArrayList; 10 | import java.util.Collection; 11 | import java.util.HashMap; 12 | import java.util.List; 13 | import java.util.Map; 14 | import java.util.UUID; 15 | import java.util.concurrent.ExecutionException; 16 | 17 | import org.apache.kafka.clients.consumer.OffsetAndMetadata; 18 | import org.apache.kafka.common.TopicPartition; 19 | import org.apache.kafka.connect.data.Schema; 20 | import org.apache.kafka.connect.errors.RetriableException; 21 | import org.apache.kafka.connect.sink.SinkRecord; 22 | import org.apache.kafka.connect.sink.SinkTaskContext; 23 | import org.apache.tinkerpop.gremlin.driver.AuthProperties; 24 | import org.apache.tinkerpop.gremlin.driver.Client; 25 | import org.apache.tinkerpop.gremlin.driver.Cluster; 26 | import org.apache.tinkerpop.gremlin.driver.Result; 27 | import org.apache.tinkerpop.gremlin.driver.exception.ResponseException; 28 | import org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0; 29 | import org.junit.jupiter.api.Test; 30 | 31 | /** 32 | * Test coverage for sink task 33 | * 34 | * Setup instructions for this test: 1. Start CosmosDB.Emulator.exe with 35 | * /EnableGremlinEndpoint argument 2. In the browser, navigate to 36 | * https://localhost:8081/_explorer/index.html 3. Click on Explorer tab on the 37 | * left -> New Collection on the right 4. Create new database 38 | * "kafka-gremlin-database" and collection "kafka-gremlin-container" with 39 | * partition key "/gremlinpk" 40 | * 
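* Note: port 8901 used below is the emulator's default Gremlin endpoint and is
* only open when /EnableGremlinEndpoint is passed; the https://localhost:8081
* address above is the Data Explorer, used purely for these manual setup steps.
*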
41 | * @author olignat 42 | * 43 | */ 44 | final public class KafkaGremlinSinkTaskTest { 45 | 46 | private static final String COSMOS_EMULATOR_HOST = "localhost"; 47 | private static final int COSMOS_EMULATOR_PORT = 8901; 48 | private static final String COSMOS_EMULATOR_DATABASE = "kafka-gremlin-database"; 49 | private static final String COSMOS_EMULATOR_CONTAINER = "kafka-gremlin-container"; 50 | private static final String COSMOS_EMULATOR_CONTAINER_PARTITIONKEY = "gremlinpk"; 51 | 52 | /** 53 | * This is a well known Cosmos DB local emulator key 54 | */ 55 | private static final String COSMOS_EMULATOR_KEY = "C2y6yDjf5/R+ob0N8A7Cgv30VRDJIWEHLM+4QDU5DE2nQ9nDuVTqobD4b8mGGyPMbIZnqyMsEcaGQy67XIw/Jw=="; 56 | 57 | @Test 58 | public void testReflectionContract() throws Exception { 59 | assertTrue(Modifier.isPublic(KafkaGremlinSinkTask.class.getModifiers()), 60 | "Sink task must be public to be discoverable by Kafka"); 61 | assertTrue(KafkaGremlinSinkTask.class.getConstructors().length > 0, "Sink task must have a public constructor"); 62 | } 63 | 64 | @Test 65 | public void testVersion() throws Exception { 66 | KafkaGremlinSinkTask task = new KafkaGremlinSinkTask(); 67 | assertEquals("2.2.0", task.version(), "Expected specific version to be returned by task"); 68 | } 69 | 70 | @Test 71 | public void testStopWithoutStart() throws Exception { 72 | KafkaGremlinSinkTask task = new KafkaGremlinSinkTask(); 73 | task.stop(); 74 | } 75 | 76 | @Test 77 | public void testFlushWithoutStart() throws Exception { 78 | KafkaGremlinSinkTask task = new KafkaGremlinSinkTask(); 79 | task.flush(new HashMap()); 80 | } 81 | 82 | @Test 83 | public void testPrimitiveKeyTraversal() throws Exception { 84 | // Get properties and configure traversal for primitive type 85 | Map taskConfiguration = KafkaGremlinSinkTaskTest.getRequiredConnectionProperties(); 86 | taskConfiguration.put(KafkaGremlinSinkConnector.Keys.TRAVERSAL, "g.addV().property('" 87 | + KafkaGremlinSinkTaskTest.COSMOS_EMULATOR_CONTAINER_PARTITIONKEY + "', ${key})"); 88 | 89 | // Create task 90 | KafkaGremlinSinkTask task = new KafkaGremlinSinkTask(); 91 | SinkTaskContext context = new TestSinkTaskContext(); 92 | task.initialize(context); 93 | task.start(taskConfiguration); 94 | 95 | // Prepare record to traverse 96 | String property = "gremlin-test-property-" + UUID.randomUUID().toString(); 97 | SinkRecord sinkRecord = new SinkRecord("Test-Topic", 1, Schema.STRING_SCHEMA, property, null, null, 15L); 98 | 99 | Collection sinkRecords = new ArrayList(); 100 | sinkRecords.add(sinkRecord); 101 | 102 | // Execute a single record traversal 103 | task.put(sinkRecords); 104 | 105 | // Verify that record is there 106 | Cluster cluster = KafkaGremlinSinkTaskTest.createCluster(); 107 | Client client = cluster.connect(); 108 | 109 | List results = client.submit("g.V().has('" 110 | + KafkaGremlinSinkTaskTest.COSMOS_EMULATOR_CONTAINER_PARTITIONKEY + "', '" + property + "')").all() 111 | .get(); 112 | assertTrue(results.size() > 0, 113 | "Expected to find a vertex with the property that was just inserted by the test"); 114 | 115 | // Shut down the client 116 | client.close(); 117 | cluster.close(); 118 | 119 | // Stop task 120 | task.stop(); 121 | } 122 | 
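// Note on the traversal template above: GremlinQueryBuilder does not splice the record key into the
// traversal text; ${key} is rewritten to a bound parameter (gp1, gp2, ...) whose value travels in a
// separate parameter map (see GremlinQueryBuilderTest), so unusual characters in the key cannot break
// the Gremlin syntax.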
123 | @Test 124 | public void testOptionalConfigurationTraversal() throws Exception { 125 | // Get properties and configure traversal 126 | Map taskConfiguration = KafkaGremlinSinkTaskTest.getRequiredConnectionProperties(); 127 | taskConfiguration.put(KafkaGremlinSinkConnector.Keys.TRAVERSAL, "g.addV().property('" 128 | + KafkaGremlinSinkTaskTest.COSMOS_EMULATOR_CONTAINER_PARTITIONKEY + "', '${value}')"); 129 | taskConfiguration.put(KafkaGremlinSinkConnector.Keys.ENABLE_SKIP_ON_CONFLICT, "true"); 130 | taskConfiguration.put(KafkaGremlinSinkConnector.Keys.ENABLE_ERROR_ON_EMPTY_RESULTS, "true"); 131 | taskConfiguration.put(KafkaGremlinSinkConnector.Keys.MAX_WAIT_FOR_CONNECTION_MILLISECONDS, "5000"); 132 | taskConfiguration.put(KafkaGremlinSinkConnector.Keys.RECORD_WRITE_RETRY_COUNT, "50"); 133 | taskConfiguration.put(KafkaGremlinSinkConnector.Keys.RECORD_WRITE_RETRY_MILLISECONDS, "35"); 134 | 135 | taskConfiguration.remove(KafkaGremlinSinkConnector.Keys.ENABLE_SSL); 136 | 137 | // Create task 138 | KafkaGremlinSinkTask task = new KafkaGremlinSinkTask(); 139 | SinkTaskContext context = new TestSinkTaskContext(); 140 | task.initialize(context); 141 | task.start(taskConfiguration); 142 | task.stop(); 143 | } 144 | 145 | @Test 146 | public void testConflictFailTraversal() throws Exception { 147 | // Get properties and configure traversal for primitive type 148 | Map taskConfiguration = KafkaGremlinSinkTaskTest.getRequiredConnectionProperties(); 149 | taskConfiguration.put(KafkaGremlinSinkConnector.Keys.TRAVERSAL, 150 | "g.addV().property('" + KafkaGremlinSinkTaskTest.COSMOS_EMULATOR_CONTAINER_PARTITIONKEY + "', ${key})" 151 | + ".property('id', ${value})"); 152 | 153 | // Create task 154 | KafkaGremlinSinkTask task = new KafkaGremlinSinkTask(); 155 | SinkTaskContext context = new TestSinkTaskContext(); 156 | task.initialize(context); 157 | task.start(taskConfiguration); 158 | 159 | // Prepare record to traverse 160 | String pkProperty = "gremlin-pk-property-" + UUID.randomUUID().toString(); 161 | String idProperty = "gremlin-id-property-" + UUID.randomUUID().toString(); 162 | SinkRecord sinkRecord = new SinkRecord("Test-Topic", 1, Schema.STRING_SCHEMA, pkProperty, Schema.STRING_SCHEMA, 163 | idProperty, 15L); 164 | 165 | Collection sinkRecords = new ArrayList(); 166 | sinkRecords.add(sinkRecord); 167 | 168 | // Execute a single record traversal 169 | task.put(sinkRecords); 170 | 171 | try { 172 | // First execution should succeed, second execution must throw an exception 173 | // because we've added the same identifier 174 | task.put(sinkRecords); fail("Expected a RetriableException because the same identifier was inserted twice"); 175 | } catch (RetriableException e) { 176 | 177 | Throwable currentThrowable = e.getCause(); 178 | assertTrue(currentThrowable instanceof ExecutionException, 179 | "Expected the cause to be an ExecutionException"); 180 | 181 | currentThrowable = currentThrowable.getCause(); 182 | assertTrue(currentThrowable instanceof ResponseException, 183 | "Expected to receive a conflict error because a duplicate key was inserted"); 184 | 185 | ResponseException responseException = (ResponseException) currentThrowable; 186 | 187 | assertEquals(409, (int) responseException.getStatusAttributes().get().get("x-ms-status-code"), 188 | "Unexpected response code on conflicting write"); 189 | } 190 | 191 | // Disable failures on conflicting writes 192 | taskConfiguration.put(KafkaGremlinSinkConnector.Keys.ENABLE_SKIP_ON_CONFLICT, Boolean.TRUE.toString()); 193 | task.start(taskConfiguration); 194 | 195 | // Now this will throw an exception internally and will be caught 196 | task.put(sinkRecords); 197 | 198 | // Stop task 199 | task.stop(); 200 | } 201 | 202 | private static Map getRequiredConnectionProperties() { 203 | Map 
taskConfiguration = new HashMap(); 204 | taskConfiguration.put(KafkaGremlinSinkConnector.Keys.HOST, KafkaGremlinSinkTaskTest.COSMOS_EMULATOR_HOST); 205 | taskConfiguration.put(KafkaGremlinSinkConnector.Keys.PORT, 206 | String.valueOf(KafkaGremlinSinkTaskTest.COSMOS_EMULATOR_PORT)); 207 | taskConfiguration.put(KafkaGremlinSinkConnector.Keys.DATABASE, 208 | KafkaGremlinSinkTaskTest.COSMOS_EMULATOR_DATABASE); 209 | taskConfiguration.put(KafkaGremlinSinkConnector.Keys.CONTAINER, 210 | KafkaGremlinSinkTaskTest.COSMOS_EMULATOR_CONTAINER); 211 | taskConfiguration.put(KafkaGremlinSinkConnector.Keys.KEY, KafkaGremlinSinkTaskTest.COSMOS_EMULATOR_KEY); 212 | taskConfiguration.put(KafkaGremlinSinkConnector.Keys.ENABLE_SSL, Boolean.FALSE.toString()); 213 | 214 | return taskConfiguration; 215 | } 216 | 217 | private static Cluster createCluster() throws Exception { 218 | Cluster.Builder builder = Cluster.build(); 219 | builder.addContactPoint(KafkaGremlinSinkTaskTest.COSMOS_EMULATOR_HOST); 220 | builder.port(KafkaGremlinSinkTaskTest.COSMOS_EMULATOR_PORT); 221 | 222 | AuthProperties authenticationProperties = new AuthProperties(); 223 | authenticationProperties.with(AuthProperties.Property.USERNAME, String.format("/dbs/%s/colls/%s", 224 | KafkaGremlinSinkTaskTest.COSMOS_EMULATOR_DATABASE, KafkaGremlinSinkTaskTest.COSMOS_EMULATOR_CONTAINER)); 225 | authenticationProperties.with(AuthProperties.Property.PASSWORD, KafkaGremlinSinkTaskTest.COSMOS_EMULATOR_KEY); 226 | 227 | builder.authProperties(authenticationProperties); 228 | builder.enableSsl(false); 229 | 230 | Map config = new HashMap(); 231 | config.put("serializeResultToString", "true"); 232 | 233 | GraphSONMessageSerializerV1d0 serializer = new GraphSONMessageSerializerV1d0(); 234 | serializer.configure(config, null); 235 | 236 | builder.serializer(serializer); 237 | 238 | return builder.create(); 239 | } 240 | } 241 | -------------------------------------------------------------------------------- /src/test/java/com/microsoft/cosmos/gremlin/StickyLoadBalancingStrategyTest.java: -------------------------------------------------------------------------------- 1 | /** 2 | * 3 | */ 4 | package com.microsoft.cosmos.gremlin; 5 | 6 | import static org.junit.jupiter.api.Assertions.*; 7 | 8 | import java.util.ArrayList; 9 | import java.util.Iterator; 10 | 11 | import org.apache.tinkerpop.gremlin.driver.Cluster; 12 | import org.apache.tinkerpop.gremlin.driver.Host; 13 | import org.junit.jupiter.api.Test; 14 | 15 | /** 16 | * Test coverage for load balancing strategy for Microsoft Azure Cosmos DB 17 | * 18 | * @author olignat 19 | * 20 | */ 21 | final public class StickyLoadBalancingStrategyTest { 22 | 23 | @Test 24 | public void testUnavailable() throws Exception { 25 | StickyLoadBalancingStrategy strategy = new StickyLoadBalancingStrategy(); 26 | 27 | Cluster cluster = Cluster.build().addContactPoint("localhost").create(); 28 | cluster.init(); 29 | 30 | strategy.initialize(cluster, cluster.allHosts()); 31 | 32 | Iterator iterator = strategy.select(null); 33 | assertTrue(iterator.hasNext(), "Expected to have a host on the first call"); 34 | assertNotNull(iterator.next(), "Expected iterator to return single available host"); 35 | assertTrue(iterator.hasNext(), "Expected to have a host on subsequent call"); 36 | assertNotNull(iterator.next(), "Expected iterator to return single available host on subsequent call"); 37 | 38 | // Notify strategy that host is down 39 | strategy.onUnavailable(cluster.allHosts().iterator().next()); 40 | 41 | iterator = 
strategy.select(null); 42 | assertTrue(iterator.hasNext(), "Expected to have a host on the first call after unavailability notification"); 43 | assertNotNull(iterator.next(), 44 | "Expected iterator to return single available host after unavailability notification"); 45 | assertTrue(iterator.hasNext(), "Expected to have a host on subsequent call after unavailability notification"); 46 | assertNotNull(iterator.next(), 47 | "Expected iterator to return single available host on subsequent call after unavailability notification"); 48 | 49 | // Notify strategy that host is up 50 | strategy.onAvailable(cluster.allHosts().iterator().next()); 51 | 52 | iterator = strategy.select(null); 53 | assertTrue(iterator.hasNext(), "Expected to have a host on the first call after availability notification"); 54 | assertNotNull(iterator.next(), 55 | "Expected iterator to return single available host after availability notification"); 56 | assertTrue(iterator.hasNext(), "Expected to have a host on subsequent call after availability notification"); 57 | assertNotNull(iterator.next(), 58 | "Expected iterator to return single available host on subsequent call after availability notification"); 59 | } 60 | 61 | @Test 62 | public void testNewRemove() throws Exception { 63 | StickyLoadBalancingStrategy strategy = new StickyLoadBalancingStrategy(); 64 | 65 | Cluster cluster = Cluster.build().addContactPoint("localhost").create(); 66 | cluster.init(); 67 | 68 | // Start with empty collection 69 | strategy.initialize(cluster, new ArrayList()); 70 | 71 | Iterator iterator = strategy.select(null); 72 | assertFalse(iterator.hasNext(), "Expected to not have a host"); 73 | assertFalse(iterator.hasNext(), "Expected to not have a host on subsequent call"); 74 | 75 | // Add a host 76 | strategy.onNew(cluster.allHosts().iterator().next()); 77 | 78 | iterator = strategy.select(null); 79 | assertTrue(iterator.hasNext(), "Expected to have a host after registration"); 80 | assertTrue(iterator.hasNext(), "Expected to have a host on subsequent call after registration"); 81 | 82 | // Remove a host 83 | strategy.onRemove(cluster.allHosts().iterator().next()); 84 | 85 | iterator = strategy.select(null); 86 | assertFalse(iterator.hasNext(), "Expected to not have a host after removal"); 87 | assertFalse(iterator.hasNext(), "Expected to not have a host on subsequent call after removal"); 88 | } 89 | } 90 | -------------------------------------------------------------------------------- /src/test/java/com/microsoft/cosmos/gremlin/TestSinkTaskContext.java: -------------------------------------------------------------------------------- 1 | /** 2 | * 3 | */ 4 | package com.microsoft.cosmos.gremlin; 5 | 6 | import java.util.Map; 7 | import java.util.Set; 8 | 9 | import org.apache.kafka.common.TopicPartition; 10 | import org.apache.kafka.connect.sink.SinkTaskContext; 11 | 12 | /** 13 | * Test sink task context 14 | * 15 | * @author olignat 16 | * 17 | */ 18 | public class TestSinkTaskContext implements SinkTaskContext { 19 | 20 | @Override 21 | public Map configs() { 22 | return null; 23 | } 24 | 25 | @Override 26 | public void offset(Map offsets) { 27 | } 28 | 29 | @Override 30 | public void offset(TopicPartition tp, long offset) { 31 | } 32 | 33 | @Override 34 | public void timeout(long timeoutMs) { 35 | } 36 | 37 | @Override 38 | public Set assignment() { 39 | return null; 40 | } 41 | 42 | @Override 43 | public void pause(TopicPartition... partitions) { 44 | } 45 | 46 | @Override 47 | public void resume(TopicPartition... 
partitions) { 48 | } 49 | 50 | @Override 51 | public void requestCommit() { 52 | } 53 | 54 | } 55 | --------------------------------------------------------------------------------