├── cassandra-features-4x ├── virtual-tables │ ├── finish.md │ ├── foreground.sh │ ├── background.sh │ ├── intro.md │ ├── assets │ │ └── wait.sh │ ├── quiz.md │ ├── step5.md │ ├── step6.md │ ├── step7.md │ ├── step1.md │ ├── step3.md │ ├── index.json │ ├── step2.md │ └── step4.md ├── cassandra4-audit-logging │ ├── finish.md │ ├── foreground.sh │ ├── assets │ │ ├── daniel.png │ │ ├── simon.png │ │ ├── stopped.png │ │ ├── daniel-log.png │ │ ├── nodetool-status.png │ │ └── wait.sh │ ├── intro.md │ ├── background.sh │ ├── quiz.md │ ├── index.json │ ├── step1.md │ └── step3.md ├── cassandra4-full-query-logging │ ├── finish.md │ ├── foreground.sh │ ├── background.sh │ ├── intro.md │ ├── step3.md │ ├── assets │ │ └── wait.sh │ ├── quiz.md │ ├── step1.md │ ├── index.json │ ├── step4.md │ └── step2.md ├── cassandra4-migrate-cassandra-3-to-4 │ ├── finish.md │ ├── background.sh │ ├── assets │ │ ├── normal.jpg │ │ ├── version.png │ │ ├── setup-complete.jpg │ │ └── wait.sh │ ├── step4.md │ ├── intro.md │ ├── step5.md │ ├── step7.md │ ├── step1.md │ ├── quiz.md │ ├── step6.md │ ├── foreground.sh │ ├── index.json │ ├── step2.md │ └── step3.md ├── cassandra4-internode-message │ ├── assets │ │ ├── out.cql │ │ ├── in.cql │ │ └── wait.sh │ ├── finish.md │ ├── intro.md │ ├── foreground.sh │ ├── step4.md │ ├── step1.md │ ├── step2.md │ ├── step6.md │ ├── quiz.md │ ├── step5.md │ ├── step3.md │ ├── step7.md │ └── index.json ├── cassandra4-repair-improvements │ ├── finish.md │ ├── foreground.sh │ ├── intro.md │ ├── assets │ │ └── wait.sh │ ├── quiz.md │ ├── index.json │ └── step2.md └── structure.json ├── cql ├── foreground.sh ├── step2.md ├── step7.md ├── step5.md ├── quiz.md ├── step9.md ├── step8.md ├── step4.md ├── step6.md ├── intro.md ├── finish.md ├── step1.md ├── step3.md ├── assets │ └── wait.sh ├── step10.md ├── background.sh └── index.json ├── README.md ├── cassandra-fundamentals ├── cql │ ├── foreground.sh │ ├── step2.md │ ├── step7.md │ ├── step5.md │ ├── step9.md │ ├── 
step4.md │ ├── step8.md │ ├── intro.md │ ├── step6.md │ ├── finish.md │ ├── step1.md │ ├── quiz.md │ ├── step3.md │ ├── assets │ │ └── wait.sh │ ├── step10.md │ ├── background.sh │ └── index.json ├── queries │ ├── foreground.sh │ ├── intro.md │ ├── finish.md │ ├── assets │ │ └── wait.sh │ ├── step4.md │ ├── step9.md │ ├── quiz.md │ ├── step11.md │ ├── step10.md │ ├── step5.md │ ├── step1.md │ ├── step8.md │ ├── step6.md │ ├── background.sh │ ├── index.json │ ├── step7.md │ └── step2.md └── structure.json └── cassandra-data-modeling ├── music-data ├── foreground.sh ├── assets │ ├── music_data.tar.gz │ └── wait.sh ├── finish.md ├── step1.md ├── step10.md ├── step5.md ├── step9.md ├── step12.md ├── step8.md ├── step6.md ├── step7.md ├── step11.md ├── step13.md ├── intro.md ├── step4.md ├── background.sh ├── step2.md ├── index.json └── step3.md ├── time-series-data ├── foreground.sh ├── step4.md ├── finish.md ├── assets │ ├── time_series_data.tar.gz │ └── wait.sh ├── step1.md ├── step6.md ├── step5.md ├── intro.md ├── step11.md ├── step9.md ├── step10.md ├── step7.md ├── step8.md ├── index.json ├── background.sh └── step2.md ├── sensor-data ├── foreground.sh ├── finish.md ├── step6.md ├── step4.md ├── step1.md ├── step7.md ├── intro.md ├── step3.md ├── assets │ └── wait.sh ├── step2.md ├── index.json ├── background.sh └── step5.md ├── investment-data ├── foreground.sh ├── finish.md ├── step4.md ├── step5.md ├── step1.md ├── step6.md ├── intro.md ├── step7.md ├── step10.md ├── step8.md ├── step9.md ├── assets │ └── wait.sh ├── step3.md ├── background.sh ├── index.json └── step2.md ├── messaging-data ├── foreground.sh ├── finish.md ├── step1.md ├── intro.md ├── step5.md ├── step4.md ├── step6.md ├── assets │ └── wait.sh ├── step3.md ├── step7.md ├── step2.md ├── index.json └── background.sh ├── order-management-data ├── foreground.sh ├── finish.md ├── step7.md ├── step1.md ├── step5.md ├── step6.md ├── step4.md ├── intro.md ├── assets │ └── wait.sh ├── step3.md ├── 
step8.md ├── index.json ├── background.sh └── step2.md ├── shopping-cart-data ├── foreground.sh ├── finish.md ├── step7.md ├── step8.md ├── step1.md ├── step5.md ├── step9.md ├── intro.md ├── assets │ └── wait.sh ├── step3.md ├── step6.md ├── step10.md ├── step4.md ├── step2.md ├── index.json └── background.sh └── structure.json /cassandra-features-4x/virtual-tables/finish.md: -------------------------------------------------------------------------------- 1 | Finish -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-audit-logging/finish.md: -------------------------------------------------------------------------------- 1 | Finish -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-full-query-logging/finish.md: -------------------------------------------------------------------------------- 1 | Finish -------------------------------------------------------------------------------- /cql/foreground.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | sleep 4; wait.sh; source .bashrc 3 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # killercoda-scenarios 2 | Collaborative Scenarios repository 3 | -------------------------------------------------------------------------------- /cassandra-features-4x/virtual-tables/foreground.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | sleep 8; wait.sh -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-migrate-cassandra-3-to-4/finish.md: -------------------------------------------------------------------------------- 1 | *** Finish *** 2 | 
-------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-audit-logging/foreground.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | sleep 8; wait.sh -------------------------------------------------------------------------------- /cassandra-fundamentals/cql/foreground.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | sleep 4; wait.sh; source .bashrc 3 | -------------------------------------------------------------------------------- /cassandra-fundamentals/queries/foreground.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | sleep 2; wait.sh; source .bashrc 3 | -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-migrate-cassandra-3-to-4/background.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | apt-get update -------------------------------------------------------------------------------- /cassandra-data-modeling/music-data/foreground.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | sleep 6; wait.sh; source .bashrc 3 | 4 | -------------------------------------------------------------------------------- /cassandra-data-modeling/time-series-data/foreground.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | sleep 10; wait.sh; source .bashrc 3 | 4 | -------------------------------------------------------------------------------- /cassandra-data-modeling/sensor-data/foreground.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | sleep 4; wait.sh; source .bashrc 3 | 4 | cqlsh 5 | -------------------------------------------------------------------------------- 
/cassandra-data-modeling/investment-data/foreground.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | sleep 4; wait.sh; source .bashrc 3 | 4 | cqlsh 5 | -------------------------------------------------------------------------------- /cassandra-data-modeling/messaging-data/foreground.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | sleep 4; wait.sh; source .bashrc 3 | 4 | cqlsh 5 | -------------------------------------------------------------------------------- /cassandra-data-modeling/order-management-data/foreground.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | sleep 4; wait.sh; source .bashrc 3 | 4 | cqlsh 5 | -------------------------------------------------------------------------------- /cassandra-data-modeling/shopping-cart-data/foreground.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | sleep 4; wait.sh; source .bashrc 3 | 4 | cqlsh 5 | -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-internode-message/assets/out.cql: -------------------------------------------------------------------------------- 1 | SELECT sent_count, sent_bytes FROM system_views.internode_outbound; 2 | -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-internode-message/assets/in.cql: -------------------------------------------------------------------------------- 1 | SELECT received_count, received_bytes FROM system_views.internode_inbound; 2 | -------------------------------------------------------------------------------- /cassandra-data-modeling/time-series-data/step4.md: -------------------------------------------------------------------------------- 1 | Start the CQL shell and connect to the `time_series` keyspace: 2 | ```bash 3 | 
cqlsh -k time_series 4 | ```{{execute}} -------------------------------------------------------------------------------- /cassandra-data-modeling/music-data/assets/music_data.tar.gz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datastaxdevs/killercoda-scenarios/main/cassandra-data-modeling/music-data/assets/music_data.tar.gz -------------------------------------------------------------------------------- /cassandra-data-modeling/sensor-data/finish.md: -------------------------------------------------------------------------------- 1 | In this scenario, you explored: 2 | 3 | * Schema design for a sensor data use case 4 | * Sample sensor data 5 | * CQL queries over sensor data -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-audit-logging/assets/daniel.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datastaxdevs/killercoda-scenarios/main/cassandra-features-4x/cassandra4-audit-logging/assets/daniel.png -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-audit-logging/assets/simon.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datastaxdevs/killercoda-scenarios/main/cassandra-features-4x/cassandra4-audit-logging/assets/simon.png -------------------------------------------------------------------------------- /cassandra-data-modeling/messaging-data/finish.md: -------------------------------------------------------------------------------- 1 | In this scenario, you explored: 2 | 3 | * Schema design for a messaging data use case 4 | * Sample messaging data 5 | * CQL queries over messaging data -------------------------------------------------------------------------------- 
/cassandra-features-4x/cassandra4-audit-logging/assets/stopped.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datastaxdevs/killercoda-scenarios/main/cassandra-features-4x/cassandra4-audit-logging/assets/stopped.png -------------------------------------------------------------------------------- /cassandra-data-modeling/time-series-data/finish.md: -------------------------------------------------------------------------------- 1 | In this scenario, you explored: 2 | 3 | * Schema design for a time series use case 4 | * Sample time series data 5 | * CQL queries over time series data -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-audit-logging/assets/daniel-log.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datastaxdevs/killercoda-scenarios/main/cassandra-features-4x/cassandra4-audit-logging/assets/daniel-log.png -------------------------------------------------------------------------------- /cassandra-data-modeling/music-data/finish.md: -------------------------------------------------------------------------------- 1 | In this scenario, you explored: 2 | 3 | * Schema design for a digital library use case 4 | * Sample digital library data 5 | * CQL queries over digital library data -------------------------------------------------------------------------------- /cassandra-data-modeling/time-series-data/assets/time_series_data.tar.gz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datastaxdevs/killercoda-scenarios/main/cassandra-data-modeling/time-series-data/assets/time_series_data.tar.gz -------------------------------------------------------------------------------- /cassandra-data-modeling/shopping-cart-data/finish.md: 
-------------------------------------------------------------------------------- 1 | In this scenario, you explored: 2 | 3 | * Schema design for a shopping cart data use case 4 | * Sample shopping cart data 5 | * CQL queries over shopping cart data -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-audit-logging/assets/nodetool-status.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datastaxdevs/killercoda-scenarios/main/cassandra-features-4x/cassandra4-audit-logging/assets/nodetool-status.png -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-migrate-cassandra-3-to-4/assets/normal.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datastaxdevs/killercoda-scenarios/main/cassandra-features-4x/cassandra4-migrate-cassandra-3-to-4/assets/normal.jpg -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-migrate-cassandra-3-to-4/assets/version.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datastaxdevs/killercoda-scenarios/main/cassandra-features-4x/cassandra4-migrate-cassandra-3-to-4/assets/version.png -------------------------------------------------------------------------------- /cassandra-data-modeling/order-management-data/finish.md: -------------------------------------------------------------------------------- 1 | In this scenario, you explored: 2 | 3 | * Schema design for an order management use case 4 | * Sample order management data 5 | * CQL queries over order management data -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-full-query-logging/foreground.sh: 
-------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | while [ ! -f /usr/local/bin/wait.sh ] || [ $(stat -c "%a" /usr/local/bin/wait.sh) != "755" ]; do clear; sleep 0.2; done 3 | clear 4 | wait.sh 5 | -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-internode-message/finish.md: -------------------------------------------------------------------------------- 1 | In this scenario, you learned about: 2 | 3 | * Internode messaging optimization 4 | * Internode messaging stabilization 5 | * Internode metrics virtual tables 6 | -------------------------------------------------------------------------------- /cassandra-data-modeling/investment-data/finish.md: -------------------------------------------------------------------------------- 1 | In this scenario, you explored: 2 | 3 | * Schema design for an investment portfolio use case 4 | * Sample investment portfolio data 5 | * CQL queries over investment portfolio data -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-internode-message/intro.md: -------------------------------------------------------------------------------- 1 | In this learning unit, you will learn about: 2 | 3 | * Internode messaging optimization 4 | * Internode messaging stabilization 5 | * Internode metrics virtual tables 6 | -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-migrate-cassandra-3-to-4/assets/setup-complete.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datastaxdevs/killercoda-scenarios/main/cassandra-features-4x/cassandra4-migrate-cassandra-3-to-4/assets/setup-complete.jpg -------------------------------------------------------------------------------- /cassandra-data-modeling/music-data/step1.md: 
-------------------------------------------------------------------------------- 1 | Create the `music_data` keyspace: 2 | ```bash 3 | cqlsh -e " 4 | 5 | CREATE KEYSPACE music_data 6 | WITH replication = { 7 | 'class': 'NetworkTopologyStrategy', 8 | 'DC-Houston': 1 };" 9 | ```{{execute}} 10 | -------------------------------------------------------------------------------- /cassandra-data-modeling/music-data/step10.md: -------------------------------------------------------------------------------- 1 | Find tracks with title `Let It Be`: 2 | 3 |
4 | Solution 5 | 6 | ```sql 7 | SELECT * 8 | FROM tracks_by_title 9 | WHERE title = 'Let It Be'; 10 | ```{{execute}} 11 | 12 |
13 | -------------------------------------------------------------------------------- /cassandra-data-modeling/music-data/step5.md: -------------------------------------------------------------------------------- 1 | Find a performer with name `The Beatles`: 2 | 3 |
4 | Solution 5 | 6 | ```sql 7 | SELECT * 8 | FROM performers 9 | WHERE name = 'The Beatles'; 10 | ```{{execute}} 11 | 12 |
13 | -------------------------------------------------------------------------------- /cassandra-data-modeling/time-series-data/step1.md: -------------------------------------------------------------------------------- 1 | Create the `time_series` keyspace: 2 | ```bash 3 | cqlsh -e " 4 | 5 | CREATE KEYSPACE time_series 6 | WITH replication = { 7 | 'class': 'NetworkTopologyStrategy', 8 | 'DC-Houston': 1 };" 9 | ```{{execute}} 10 | 11 | -------------------------------------------------------------------------------- /cassandra-data-modeling/shopping-cart-data/step7.md: -------------------------------------------------------------------------------- 1 | Find all information about an item with id `Box2`: 2 | 3 |
4 | Solution 5 | 6 | ```sql 7 | SELECT * 8 | FROM items_by_id 9 | WHERE id = 'Box2'; 10 | ```{{execute}} 11 | 12 |
13 | 14 | -------------------------------------------------------------------------------- /cassandra-data-modeling/music-data/step9.md: -------------------------------------------------------------------------------- 1 | Find albums from genre `Classical`; order by year (desc): 2 | 3 |
4 | Solution 5 | 6 | ```sql 7 | SELECT * 8 | FROM albums_by_genre 9 | WHERE genre = 'Classical'; 10 | ```{{execute}} 11 | 12 |
13 | -------------------------------------------------------------------------------- /cassandra-data-modeling/shopping-cart-data/step8.md: -------------------------------------------------------------------------------- 1 | Find all information about items with name `Chocolate Cake`: 2 | 3 |
4 | Solution 5 | 6 | ```sql 7 | SELECT * 8 | FROM items_by_name 9 | WHERE name = 'Chocolate Cake'; 10 | ```{{execute}} 11 | 12 |
-------------------------------------------------------------------------------- /cql/step2.md: -------------------------------------------------------------------------------- 1 | The CQL shell is a command-line client for executing CQL statements over a Cassandra database interactively. 2 | 3 | Get the CQL shell usage help: 4 | ``` 5 | cqlsh -h 6 | ```{{execute}} 7 | 8 | Start the CQL shell: 9 | ``` 10 | cqlsh 11 | ```{{execute}} 12 | 13 | 14 | 15 | -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-full-query-logging/background.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | apt-get update 3 | 4 | /usr/share/cassandra/bin/cassandra -R 5 | until /usr/share/cassandra/bin/cqlsh -e "DESCRIBE KEYSPACES" 6 | do 7 | sleep 0.1 8 | done 9 | 10 | echo "done" >> /opt/katacoda-background-finished -------------------------------------------------------------------------------- /cassandra-data-modeling/music-data/step12.md: -------------------------------------------------------------------------------- 1 | Find a user with id `12345678-aaaa-bbbb-cccc-123456789abc`: 2 | 3 |
4 | Solution 5 | 6 | ```sql 7 | SELECT * 8 | FROM users 9 | WHERE id = 12345678-aaaa-bbbb-cccc-123456789abc; 10 | ```{{execute}} 11 | 12 |
13 | -------------------------------------------------------------------------------- /cassandra-data-modeling/music-data/step8.md: -------------------------------------------------------------------------------- 1 | Find albums with title `20 Greatest Hits`; order by year (desc): 2 | 3 |
4 | Solution 5 | 6 | ```sql 7 | SELECT * 8 | FROM albums_by_title 9 | WHERE title = '20 Greatest Hits'; 10 | ```{{execute}} 11 | 12 |
13 | -------------------------------------------------------------------------------- /cassandra-data-modeling/sensor-data/step6.md: -------------------------------------------------------------------------------- 1 | Find information about all sensors in network `forest-net`: 2 | 3 |
4 | Solution 5 | 6 | ```sql 7 | SELECT * 8 | FROM sensors_by_network 9 | WHERE network = 'forest-net'; 10 | ```{{execute}} 11 | 12 |
13 | 14 | -------------------------------------------------------------------------------- /cassandra-data-modeling/investment-data/step4.md: -------------------------------------------------------------------------------- 1 | Find information about all investment accounts of a user with username `joe`: 2 | 3 |
4 | Solution 5 | 6 | ```sql 7 | SELECT * 8 | FROM accounts_by_user 9 | WHERE username = 'joe'; 10 | ```{{execute}} 11 | 12 |
13 | -------------------------------------------------------------------------------- /cassandra-data-modeling/music-data/step6.md: -------------------------------------------------------------------------------- 1 | Find albums of performer `The Beatles`; order by year (desc): 2 | 3 |
4 | Solution 5 | 6 | ```sql 7 | SELECT * 8 | FROM albums_by_performer 9 | WHERE performer = 'The Beatles'; 10 | ```{{execute}} 11 | 12 |
13 | 14 | -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-repair-improvements/finish.md: -------------------------------------------------------------------------------- 1 | In this scenario you have: 2 | 3 | - introduced the need for data repair on a small Cassandra cluster 4 | - performed the repair and observed the data being in agreement again within the cluster 5 | - learned about incremental repair in Cassandra 4.0 6 | -------------------------------------------------------------------------------- /cassandra-data-modeling/investment-data/step5.md: -------------------------------------------------------------------------------- 1 | Find all positions in account `joe001`; order by instrument symbol (asc): 2 | 3 |
4 | Solution 5 | 6 | ```sql 7 | SELECT * 8 | FROM positions_by_account 9 | WHERE account = 'joe001'; 10 | ```{{execute}} 11 | 12 |
13 | -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-internode-message/foreground.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | export JAVA_HOME="/usr/lib/jvm/default-java" 3 | export PATH="$PATH:/usr/share/cassandra/bin:/usr/share/cassandra/tools/bin" 4 | clear 5 | sleep 5; wait.sh 6 | echo "Cassandra Cluster with nodes [[HOST1_IP]] and [[HOST2_IP]] has started!" 7 | -------------------------------------------------------------------------------- /cassandra-data-modeling/sensor-data/step4.md: -------------------------------------------------------------------------------- 1 | Find information about all networks; order by name (asc): 2 | 3 |
4 | Solution 5 | 6 | ```sql 7 | SELECT name, description, 8 | region, num_sensors 9 | FROM networks 10 | WHERE bucket = 'all'; 11 | ```{{execute}} 12 | 13 |
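The task asks for networks ordered by name, yet the solution contains no `ORDER BY`. This works when `name` is a clustering column, because Cassandra stores and returns the rows of a partition in clustering order. The following is a minimal sketch of a table definition that would behave this way. The column types and the primary key are assumptions made for illustration; the scenario's actual schema is defined in another step and may differ.

```sql
-- Assumed schema, for illustration only (not taken from this step).
-- Every row uses the same partition key value ('all'), so all networks
-- live in one partition and come back sorted by the clustering column
-- `name`, ascending by default.
CREATE TABLE IF NOT EXISTS networks (
  bucket TEXT,
  name TEXT,
  description TEXT,
  region TEXT,
  num_sensors INT,
  PRIMARY KEY ((bucket), name)
);
```

Because all rows share a single partition, a layout like this is only suitable while the number of rows stays small.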
-------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-repair-improvements/foreground.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | export JAVA_HOME="/usr/lib/jvm/default-java" 3 | export PATH="$PATH:/usr/share/cassandra/bin:/usr/share/cassandra/tools/bin" 4 | clear 5 | sleep 5; wait.sh 6 | echo "Cassandra Cluster with nodes [[HOST1_IP]] and [[HOST2_IP]] has started!" 7 | -------------------------------------------------------------------------------- /cassandra-fundamentals/cql/step2.md: -------------------------------------------------------------------------------- 1 | The CQL shell is a command-line client for executing CQL statements over a Cassandra database interactively. 2 | 3 | Get the CQL shell usage help: 4 | ``` 5 | cqlsh -h 6 | ```{{execute}} 7 | 8 | Start the CQL shell: 9 | ``` 10 | cqlsh 11 | ```{{execute}} 12 | 13 | 14 | 15 | -------------------------------------------------------------------------------- /cassandra-data-modeling/music-data/step7.md: -------------------------------------------------------------------------------- 1 | Find an album with title `Magical Mystery Tour` and year `1967`: 2 | 3 |
4 | Solution 5 | 6 | ```sql 7 | SELECT * 8 | FROM albums_by_title 9 | WHERE title = 'Magical Mystery Tour' 10 | AND year = 1967; 11 | ```{{execute}} 12 | 13 |
14 | -------------------------------------------------------------------------------- /cassandra-data-modeling/time-series-data/step6.md: -------------------------------------------------------------------------------- 1 | Find information about all metrics stored in bucket `all`; order by metric name (asc): 2 | 3 |
4 | Solution 5 | 6 | ```sql 7 | SELECT * 8 | FROM time_series.metrics 9 | WHERE bucket = 'all'; 10 | ```{{execute}} 11 | 12 |
13 | 14 | -------------------------------------------------------------------------------- /cassandra-data-modeling/sensor-data/step1.md: -------------------------------------------------------------------------------- 1 | Create the `sensor_data` keyspace: 2 | ```sql 3 | CREATE KEYSPACE sensor_data 4 | WITH replication = { 5 | 'class': 'NetworkTopologyStrategy', 6 | 'DC-Houston': 1 }; 7 | ```{{execute}} 8 | 9 | Set the current working keyspace: 10 | ```sql 11 | USE sensor_data; 12 | ```{{execute}} 13 | -------------------------------------------------------------------------------- /cassandra-data-modeling/messaging-data/step1.md: -------------------------------------------------------------------------------- 1 | Create the `messaging_data` keyspace: 2 | ```sql 3 | CREATE KEYSPACE messaging_data 4 | WITH replication = { 5 | 'class': 'NetworkTopologyStrategy', 6 | 'DC-Houston': 1 }; 7 | ```{{execute}} 8 | 9 | Set the current working keyspace: 10 | ```sql 11 | USE messaging_data; 12 | ```{{execute}} 13 | -------------------------------------------------------------------------------- /cassandra-data-modeling/investment-data/step1.md: -------------------------------------------------------------------------------- 1 | Create the `investment_data` keyspace: 2 | ```sql 3 | CREATE KEYSPACE investment_data 4 | WITH replication = { 5 | 'class': 'NetworkTopologyStrategy', 6 | 'DC-Houston': 1 }; 7 | ```{{execute}} 8 | 9 | Set the current working keyspace: 10 | ```sql 11 | USE investment_data; 12 | ```{{execute}} 13 | -------------------------------------------------------------------------------- /cassandra-data-modeling/order-management-data/step7.md: -------------------------------------------------------------------------------- 1 | Find a status history for order `111-0461064-1669732`; sort by status timestamp (desc): 2 | 3 |
4 | Solution 5 | 6 | ```sql 7 | SELECT * 8 | FROM order_status_history_by_id 9 | WHERE order_id = '111-0461064-1669732'; 10 | ```{{execute}} 11 | 12 |
13 | 14 | -------------------------------------------------------------------------------- /cassandra-data-modeling/music-data/step11.md: -------------------------------------------------------------------------------- 1 | Find tracks from album `Magical Mystery Tour` of `1967`; order by track number (asc): 2 | 3 |
4 | Solution 5 | 6 | ```sql 7 | SELECT * 8 | FROM tracks_by_album 9 | WHERE album_title = 'Magical Mystery Tour' 10 | AND album_year = 1967; 11 | ```{{execute}} 12 | 13 |
14 | -------------------------------------------------------------------------------- /cassandra-data-modeling/sensor-data/step7.md: -------------------------------------------------------------------------------- 1 | Find raw measurements for sensor `s1003` on `2020-07-06`; order by timestamp (desc): 2 | 3 |
4 | Solution 5 | 6 | ```sql 7 | SELECT timestamp, value 8 | FROM temperatures_by_sensor 9 | WHERE sensor = 's1003' 10 | AND date = '2020-07-06'; 11 | ```{{execute}} 12 | 13 |
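The solution relies on the table's clustering order to return measurements newest first, which is presumably why it has no `ORDER BY`. The sketch below assumes that the partition key of `temperatures_by_sensor` is (`sensor`, `date`) and that `timestamp` is its first clustering column; the table definition is not shown in this step, so treat both as assumptions. Under those assumptions, the order can be reversed within a single partition, for example to list the five earliest measurements of the day:

```sql
-- Assumes partition key (sensor, date) and clustering column `timestamp`.
-- ORDER BY is only accepted on clustering columns and only when the
-- partition key is restricted, as it is here.
SELECT timestamp, value
FROM temperatures_by_sensor
WHERE sensor = 's1003'
  AND date = '2020-07-06'
ORDER BY timestamp ASC
LIMIT 5;
```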
-------------------------------------------------------------------------------- /cassandra-data-modeling/shopping-cart-data/step1.md: -------------------------------------------------------------------------------- 1 | Create the `shopping_cart_data` keyspace: 2 | ```sql 3 | CREATE KEYSPACE shopping_cart_data 4 | WITH replication = { 5 | 'class': 'NetworkTopologyStrategy', 6 | 'DC-Houston': 1 }; 7 | ```{{execute}} 8 | 9 | Set the current working keyspace: 10 | ```sql 11 | USE shopping_cart_data; 12 | ```{{execute}} 13 | -------------------------------------------------------------------------------- /cassandra-data-modeling/time-series-data/step5.md: -------------------------------------------------------------------------------- 1 | Find information about all data sources in group `House A`: 2 | 3 |
4 | Solution 5 | 6 | ```sql 7 | SELECT group, source, description, 8 | characteristics['Model number'] 9 | FROM sources_by_group 10 | WHERE group = 'House A'; 11 | ```{{execute}} 12 | 13 |
14 | 15 | -------------------------------------------------------------------------------- /cassandra-data-modeling/order-management-data/step1.md: -------------------------------------------------------------------------------- 1 | Create the `order_management_data` keyspace: 2 | ```sql 3 | CREATE KEYSPACE order_management_data 4 | WITH replication = { 5 | 'class': 'NetworkTopologyStrategy', 6 | 'DC-Houston': 1 }; 7 | ```{{execute}} 8 | 9 | Set the current working keyspace: 10 | ```sql 11 | USE order_management_data; 12 | ```{{execute}} 13 | -------------------------------------------------------------------------------- /cassandra-data-modeling/order-management-data/step5.md: -------------------------------------------------------------------------------- 1 | Find all information about order `113-3827060-8722206`; sort items by name (asc): 2 | 3 |
4 | Solution 5 | 6 | ```sql 7 | EXPAND ON; 8 | 9 | SELECT * 10 | FROM orders_by_id 11 | WHERE order_id = '113-3827060-8722206'; 12 | 13 | EXPAND OFF; 14 | ```{{execute}} 15 | 16 |
17 | -------------------------------------------------------------------------------- /cql/step7.md: -------------------------------------------------------------------------------- 1 | Now, retrieve the row using the CQL `SELECT` statement: 2 | ``` 3 | SELECT * FROM users 4 | WHERE email = 'joe@datastax.com'; 5 | ```{{execute}} 6 | 7 | Retrieve a different row from the table: 8 |
9 | Solution 10 | ``` 11 | SELECT * FROM users 12 | WHERE email = 'jen@datastax.com'; 13 | ```{{execute}} 14 |
15 | 16 | -------------------------------------------------------------------------------- /cassandra-data-modeling/shopping-cart-data/step5.md: -------------------------------------------------------------------------------- 1 | Find ids and names of all shopping carts that belong to user `jen`; order by cart name (asc): 2 | 3 |
4 | Solution 5 | 6 | ```sql 7 | SELECT user_id, cart_name, 8 | cart_id, cart_is_active 9 | FROM carts_by_user 10 | WHERE user_id = 'jen'; 11 | ```{{execute}} 12 | 13 |
14 | -------------------------------------------------------------------------------- /cassandra-data-modeling/order-management-data/step6.md: -------------------------------------------------------------------------------- 1 | Find all orders that contain item `n-0023` and are placed by user `joe`; sort by order timestamp (desc): 2 | 3 |
4 | Solution 5 | 6 | ```sql 7 | SELECT * 8 | FROM orders_by_user_item 9 | WHERE item_id = 'n-0023' 10 | AND user_id = 'joe'; 11 | ```{{execute}} 12 | 13 |
14 | 15 | 16 | -------------------------------------------------------------------------------- /cassandra-fundamentals/cql/step7.md: -------------------------------------------------------------------------------- 1 | Now, retrieve the row using the CQL `SELECT` statement: 2 | ``` 3 | SELECT * FROM users 4 | WHERE email = 'joe@datastax.com'; 5 | ```{{execute}} 6 | 7 | Retrieve a different row from the table: 8 | 9 |
10 | Solution 11 | 12 | ``` 13 | SELECT * FROM users 14 | WHERE email = 'jen@datastax.com'; 15 | ```{{execute}} 16 | 17 |
18 | -------------------------------------------------------------------------------- /cassandra-data-modeling/music-data/step13.md: -------------------------------------------------------------------------------- 1 | Find all tracks played by a user in `September 2020`; order by timestamp (desc): 2 | 3 |
4 | Solution 5 | 6 | ```sql 7 | SELECT timestamp, album_title, album_year, number, title 8 | FROM tracks_by_user 9 | WHERE id = 12345678-aaaa-bbbb-cccc-123456789abc 10 | AND month = '2020-09-01'; 11 | ```{{execute}} 12 | 13 |
14 | -------------------------------------------------------------------------------- /cql/step5.md: -------------------------------------------------------------------------------- 1 | A Cassandra table has named columns with data types, rows with values, and a primary key to uniquely identify each row. 2 | As an example, let's create table `users` with four columns and primary key `email`. 3 | 4 | Create the table: 5 | ``` 6 | CREATE TABLE users ( 7 | email TEXT PRIMARY KEY, 8 | name TEXT, 9 | age INT, 10 | date_joined DATE 11 | ); 12 | ```{{execute}} 13 | 14 | 15 | -------------------------------------------------------------------------------- /cql/quiz.md: -------------------------------------------------------------------------------- 1 | Here is a short quiz for you. 2 | 3 | >>1. Which CQL statement can be used to add rows into a table? << 4 | ( ) SELECT 5 | ( ) DELETE 6 | (*) INSERT 7 | 8 | >>2. Which CQL statement can be used to retrieve rows from a table? << 9 | (*) SELECT 10 | ( ) DELETE 11 | ( ) INSERT 12 | 13 | >>3. Which CQL statement can be used to remove rows from a table? << 14 | ( ) SELECT 15 | (*) DELETE 16 | ( ) INSERT -------------------------------------------------------------------------------- /cql/step9.md: -------------------------------------------------------------------------------- 1 | Finally, delete the row using the CQL `DELETE` statement: 2 | ``` 3 | DELETE FROM users 4 | WHERE email = 'joe@datastax.com'; 5 | 6 | SELECT * FROM users; 7 | ```{{execute}} 8 | 9 | Delete another row from the table: 10 | 
11 | Solution 12 | ``` 13 | DELETE FROM users 14 | WHERE email = 'jen@datastax.com'; 15 | 16 | SELECT * FROM users; 17 | ```{{execute}} 18 |
-------------------------------------------------------------------------------- /cassandra-data-modeling/investment-data/step6.md: -------------------------------------------------------------------------------- 1 | Find all trades for account `joe001`; order by trade date (desc): 2 | 3 |
4 | Solution 5 | 6 | ```sql 7 | SELECT account, 8 | TODATE(DATEOF(trade_id)) AS date, 9 | trade_id, type, symbol, 10 | shares, price, amount 11 | FROM trades_by_a_d 12 | WHERE account = 'joe001'; 13 | ```{{execute}} 14 | 15 |
16 | 17 | -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-audit-logging/intro.md: -------------------------------------------------------------------------------- 1 | Audit logging is one of the important new features of Apache Cassandra™ 4.x. 2 | Logging is crucial for regulatory compliance, security compliance and debugging. 3 | 4 | In this scenario you will learn how to: 5 | 6 | - Dynamically enable/disable audit logging using *nodetool* 7 | - Statically enable/disable audit logging in *cassandra.yaml* 8 | - Configure logging properties -------------------------------------------------------------------------------- /cassandra-features-4x/structure.json: -------------------------------------------------------------------------------- 1 | { 2 | "title": "New Features in Apache Cassandra™ 4.x", 3 | "description": "Learn about the exciting new features in Apache Cassandra™ 4.x", 4 | "items": [ 5 | 6 | { "path": "virtual-tables", 7 | "title": "Virtual Tables in Cassandra", 8 | "description": "Learn what virtual tables are and how to use them in Cassandra" 9 | } 10 | 11 | ] 12 | } -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-audit-logging/background.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | apt-get update 3 | 4 | echo "deb http://downloads.apache.org/cassandra/debian 40x main" | sudo tee -a /etc/apt/sources.list.d/cassandra.sources.list 5 | 6 | 7 | curl https://downloads.apache.org/cassandra/KEYS | sudo apt-key add - 8 | 9 | sudo apt-get update 10 | 11 | sudo apt-get install -y cassandra -------------------------------------------------------------------------------- /cassandra-fundamentals/cql/step5.md: -------------------------------------------------------------------------------- 1 | A Cassandra table has named 
columns with data types, rows with values, and a primary key to uniquely identify each row. 2 | As an example, let's create table `users` with four columns and primary key `email`. 3 | 4 | Create the table: 5 | ``` 6 | CREATE TABLE users ( 7 | email TEXT PRIMARY KEY, 8 | name TEXT, 9 | age INT, 10 | date_joined DATE 11 | ); 12 | ```{{execute}} 13 | 14 | 15 | -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-internode-message/step4.md: -------------------------------------------------------------------------------- 1 | Now, we want to _show_ you some of the changes. 2 | 3 | Let's start by taking a look at the cluster we have set up. 4 | 5 | ``` 6 | nodetool status 7 | ```{{execute}} 8 | 9 | We see there are two datacenters with one node each. 10 | In a real-life production cluster, you can usually expect to have 3 or more nodes per datacenter, but for this demo, we only need two nodes. 11 | -------------------------------------------------------------------------------- /cassandra-data-modeling/order-management-data/step4.md: -------------------------------------------------------------------------------- 1 | Find all orders placed by user `joe` between dates `2020-01-01` and `2020-12-31`, inclusive; sort by order timestamp (desc): 2 | 3 |
4 | Solution 5 | 6 | ```sql 7 | SELECT * 8 | FROM orders_by_user 9 | WHERE user_id = 'joe' 10 | AND order_timestamp >= '2020-01-01' 11 | AND order_timestamp < '2021-01-01'; 12 | ```{{execute}} 13 | 14 |
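The solution above has no `ORDER BY` clause, so the descending sort presumably comes from the table's clustering order. Here is a minimal sketch of that idea. The table `orders_by_user_sketch`, its columns and its key are assumptions made for illustration, not the schema created earlier in this scenario:

```sql
-- Hypothetical table for illustration only; the real orders_by_user
-- schema is defined in an earlier step and may differ.
CREATE TABLE IF NOT EXISTS orders_by_user_sketch (
  user_id         TEXT,
  order_timestamp TIMESTAMP,
  order_id        TEXT,
  PRIMARY KEY ((user_id), order_timestamp, order_id)
) WITH CLUSTERING ORDER BY (order_timestamp DESC, order_id ASC);

-- Rows in one partition come back in clustering order,
-- so the newest orders are returned first without ORDER BY.
SELECT *
FROM orders_by_user_sketch
WHERE user_id = 'joe'
  AND order_timestamp >= '2020-01-01'
  AND order_timestamp <  '2021-01-01';
```

The half-open range (`>=` the first day, `<` the first day of the next year) keeps every order placed on `2020-12-31`, whatever its time of day.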
15 | -------------------------------------------------------------------------------- /cassandra-data-modeling/sensor-data/intro.md: -------------------------------------------------------------------------------- 1 | In this scenario, you will: 2 | 3 | * Create tables for a sensor data use case 4 | * Populate tables with sample sensor data 5 | * Design and execute CQL queries over sensor data 6 | 7 | _This scenario is also available on our [datastax.com/dev](https://www.datastax.com/learn/data-modeling-by-example/sensor-data-model) site, where you can find many more resources to help you succeed with Apache Cassandra™._ -------------------------------------------------------------------------------- /cql/step8.md: -------------------------------------------------------------------------------- 1 | Next, update the row using the CQL `UPDATE` statement: 2 | ``` 3 | UPDATE users SET name = 'Joseph' 4 | WHERE email = 'joe@datastax.com'; 5 | 6 | SELECT * FROM users; 7 | ```{{execute}} 8 | 9 | Update another row in the table: 10 |
11 | Solution 12 | ``` 13 | UPDATE users SET name = 'Jennifer' 14 | WHERE email = 'jen@datastax.com'; 15 | 16 | SELECT * FROM users; 17 | ```{{execute}} 18 |
-------------------------------------------------------------------------------- /cassandra-fundamentals/structure.json: -------------------------------------------------------------------------------- 1 | { 2 | "title": "Apache Cassandra™ Fundamentals", 3 | "description": "Hands-on introduction to Apache Cassandra™ NoSQL database", 4 | "items": [ 5 | 6 | { "path": "cql", 7 | "title": "Cassandra Query Language", 8 | "description": "Learn about the most essential data definition and data manipulation statements in Cassandra Query Language (CQL)" 9 | } 10 | 11 | ] 12 | } -------------------------------------------------------------------------------- /cassandra-data-modeling/messaging-data/intro.md: -------------------------------------------------------------------------------- 1 | In this scenario, you will: 2 | 3 | * Create tables for a messaging data use case 4 | * Populate tables with sample messaging data 5 | * Design and execute CQL queries over messaging data 6 | 7 | _This scenario is also available on our [datastax.com/dev](https://www.datastax.com/learn/data-modeling-by-example/messaging-data-model) site, where you can find many more resources to help you succeed with Apache Cassandra™._ -------------------------------------------------------------------------------- /cassandra-data-modeling/time-series-data/intro.md: -------------------------------------------------------------------------------- 1 | In this scenario, you will: 2 | 3 | * Create tables for a time series use case 4 | * Populate tables with sample time series data 5 | * Design and execute CQL queries over time series data 6 | 7 | _This scenario is also available on our [datastax.com/dev](https://www.datastax.com/learn/data-modeling-by-example/time-series-model) site, where you can find many more resources to help you succeed with Apache Cassandra™._ -------------------------------------------------------------------------------- /cassandra-data-modeling/shopping-cart-data/step9.md: 
-------------------------------------------------------------------------------- 1 | Find all items and their subtotal for a cart with id `19925cc1-4f8b-4a44-b893-2a49a8434fc8`; order items by timestamp (desc): 2 | 3 |
4 | Solution 5 | 6 | ```sql 7 | SELECT timestamp, item_id, item_price, 8 | quantity, cart_subtotal 9 | FROM items_by_cart 10 | WHERE cart_id = 19925cc1-4f8b-4a44-b893-2a49a8434fc8; 11 | ```{{execute}} 12 | 13 |
14 | 15 | 16 | -------------------------------------------------------------------------------- /cassandra-fundamentals/cql/step9.md: -------------------------------------------------------------------------------- 1 | Finally, delete the row using the CQL `DELETE` statement: 2 | ``` 3 | DELETE FROM users 4 | WHERE email = 'joe@datastax.com'; 5 | 6 | SELECT * FROM users; 7 | ```{{execute}} 8 | 9 | Delete another row from the table: 10 | 11 |
12 | Solution 13 | 14 | ``` 15 | DELETE FROM users 16 | WHERE email = 'jen@datastax.com'; 17 | 18 | SELECT * FROM users; 19 | ```{{execute}} 20 | 21 |
22 | -------------------------------------------------------------------------------- /cql/step4.md: -------------------------------------------------------------------------------- 1 | Many CQL statements work with tables, indexes and other objects defined within a specific keyspace. 2 | For example, to refer to a table, we have to either use a fully-qualified name consisting of a keyspace name and a table name, 3 | or set a working keyspace and simply refer to the table by its name. For convenience, we go with the second option. 4 | 5 | Set the current working keyspace: 6 | ``` 7 | USE killr_video; 8 | ```{{execute}} 9 | 10 | -------------------------------------------------------------------------------- /cql/step6.md: -------------------------------------------------------------------------------- 1 | Add the row into our table using the CQL `INSERT` statement: 2 | ``` 3 | INSERT INTO users (email, name, age, date_joined) 4 | VALUES ('joe@datastax.com', 'Joe', 25, '2020-01-01'); 5 | ```{{execute}} 6 | 7 | Insert another row into the table: 8 |
9 | Solution 10 | ``` 11 | INSERT INTO users (email, name, age, date_joined) 12 | VALUES ('jen@datastax.com', 'Jen', 27, '2020-01-01'); 13 | ```{{execute}} 14 |
-------------------------------------------------------------------------------- /cassandra-data-modeling/shopping-cart-data/intro.md: -------------------------------------------------------------------------------- 1 | In this scenario, you will: 2 | 3 | * Create tables for a shopping cart data use case 4 | * Populate tables with sample shopping cart data 5 | * Design and execute CQL queries over shopping cart data 6 | 7 | _This scenario is also available on our [datastax.com/dev](https://www.datastax.com/learn/data-modeling-by-example/shopping-cart) site, where you can find many more resources to help you succeed with Apache Cassandra™._ -------------------------------------------------------------------------------- /cassandra-data-modeling/music-data/intro.md: -------------------------------------------------------------------------------- 1 | In this scenario, you will: 2 | 3 | * Create tables for a digital library data use case 4 | * Populate tables with sample digital library data 5 | * Design and execute CQL queries over digital library data 6 | 7 | _This scenario is also available on our [datastax.com/dev](https://www.datastax.com/learn/data-modeling-by-example/digital-library-data-model) site, where you can find many more resources to help you succeed with Apache Cassandra™._ -------------------------------------------------------------------------------- /cassandra-data-modeling/order-management-data/intro.md: -------------------------------------------------------------------------------- 1 | In this scenario, you will: 2 | 3 | * Create tables for an order management use case 4 | * Populate tables with sample order management data 5 | * Design and execute CQL queries over order management data 6 | 7 | _This scenario is also available on our [datastax.com/dev](https://www.datastax.com/learn/data-modeling-by-example/order-management) site, where you can find many more resources to help you succeed with Apache Cassandra™._ 
-------------------------------------------------------------------------------- /cassandra-fundamentals/cql/step4.md: -------------------------------------------------------------------------------- 1 | Many CQL statements work with tables, indexes and other objects defined within a specific keyspace. 2 | For example, to refer to a table, we have to either use a fully-qualified name consisting of a keyspace name and a table name, 3 | or set a working keyspace and simply refer to the table by its name. For convenience, we go with the second option. 4 | 5 | Set the current working keyspace: 6 | ``` 7 | USE killr_video; 8 | ```{{execute}} 9 | 10 | -------------------------------------------------------------------------------- /cassandra-fundamentals/cql/step8.md: -------------------------------------------------------------------------------- 1 | Next, update the row using the CQL `UPDATE` statement: 2 | ``` 3 | UPDATE users SET name = 'Joseph' 4 | WHERE email = 'joe@datastax.com'; 5 | 6 | SELECT * FROM users; 7 | ```{{execute}} 8 | 9 | Update another row in the table: 10 | 11 |
12 | Solution 13 | 14 | ``` 15 | UPDATE users SET name = 'Jennifer' 16 | WHERE email = 'jen@datastax.com'; 17 | 18 | SELECT * FROM users; 19 | ```{{execute}} 20 | 21 |
22 | -------------------------------------------------------------------------------- /cassandra-data-modeling/investment-data/intro.md: -------------------------------------------------------------------------------- 1 | In this scenario, you will: 2 | 3 | * Create tables for an investment portfolio data use case 4 | * Populate tables with sample investment portfolio data 5 | * Design and execute CQL queries over investment portfolio data 6 | 7 | _This scenario is also available on our [datastax.com/dev](https://www.datastax.com/learn/data-modeling-by-example/investment-data-model) site, where you can find many more resources to help you succeed with Apache Cassandra™._ -------------------------------------------------------------------------------- /cassandra-features-4x/virtual-tables/background.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | sudo apt-get install -y pip 4 | 5 | pip install dse-driver==2.11.1 6 | 7 | echo "deb http://downloads.apache.org/cassandra/debian 40x main" | sudo tee -a /etc/apt/sources.list.d/cassandra.sources.list 8 | 9 | 10 | curl https://downloads.apache.org/cassandra/KEYS | sudo apt-key add - 11 | 12 | sudo apt-get update 13 | 14 | sudo apt-get install -y cassandra 15 | -------------------------------------------------------------------------------- /cassandra-fundamentals/cql/intro.md: -------------------------------------------------------------------------------- 1 | In this scenario, you will: 2 | 3 | * Learn about Cassandra Query Language (CQL) 4 | * Use the CQL shell 5 | * Execute statements `CREATE KEYSPACE`, `USE` and `CREATE TABLE` 6 | * Practice using statements `INSERT`, `SELECT`, `UPDATE` and `DELETE` 7 | 8 | _This scenario is also available on our [datastax.com/dev](https://www.datastax.com/learn/cassandra-fundamentals/cql) site, where you can find many more resources to help you succeed with Apache 
Cassandra™._ 9 | -------------------------------------------------------------------------------- /cassandra-data-modeling/messaging-data/step5.md: -------------------------------------------------------------------------------- 1 | Find ids, subjects, senders, read/unread statuses and timestamps of all emails with label `inbox` for user `joe@datastax.com`; order by timestamp (desc): 2 | 3 |
4 | Solution 5 | 6 | ```sql 7 | SELECT id, subject, "from", is_read, 8 | toTimestamp(id) AS timestamp 9 | FROM emails_by_user_folder 10 | WHERE label = 'inbox' 11 | AND username = 'joe@datastax.com'; 12 | ```{{execute}} 13 | 14 |
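The column `"from"` is written in double quotes because `FROM` is a reserved CQL keyword. A small sketch of how such a column can be declared and read, using a hypothetical table `quoted_identifier_sketch` that is not part of this scenario's schema:

```sql
-- Hypothetical table for illustration only.
CREATE TABLE IF NOT EXISTS quoted_identifier_sketch (
  id      TIMEUUID PRIMARY KEY,
  "from"  TEXT,     -- reserved word, so it must be quoted
  subject TEXT
);

-- The quotes are needed wherever the column is referenced.
SELECT id, "from", subject
FROM quoted_identifier_sketch;
```

Quoted identifiers are case-sensitive, so `"from"` and `"From"` would name two different columns.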
15 | -------------------------------------------------------------------------------- /cassandra-fundamentals/cql/step6.md: -------------------------------------------------------------------------------- 1 | Add the row into our table using the CQL `INSERT` statement: 2 | ``` 3 | INSERT INTO users (email, name, age, date_joined) 4 | VALUES ('joe@datastax.com', 'Joe', 25, '2020-01-01'); 5 | ```{{execute}} 6 | 7 | Insert another row into the table: 8 | 9 |
10 | Solution 11 | 12 | ``` 13 | INSERT INTO users (email, name, age, date_joined) 14 | VALUES ('jen@datastax.com', 'Jen', 27, '2020-01-01'); 15 | ```{{execute}} 16 | 17 |
18 | -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-repair-improvements/intro.md: -------------------------------------------------------------------------------- 1 | With the 4.0 release of Apache Cassandra™, there have been improvements to 2 | incremental repair: the process is now more reliable and robust, easier to manage 3 | and ready for use in production. 4 | 5 | In this scenario you will: 6 | 7 | - populate a small cluster with data and introduce the need for repair 8 | - execute incremental repair and observe the data being reconciled between nodes 9 | - learn how to manage repairs on a Cassandra cluster. 10 | -------------------------------------------------------------------------------- /cassandra-features-4x/virtual-tables/intro.md: -------------------------------------------------------------------------------- 1 | Virtual tables, introduced in Apache Cassandra™ 4.0, allow you to expose 2 | metrics and properties of the node with the same interface as ordinary tables. 3 | This allows for a much easier way to observe the state and the health 4 | of the cluster without leaving the CQL protocol. 5 | 6 | In this scenario you will learn: 7 | 8 | - how to look at virtual tables and their contents 9 | - what you _cannot_ do with virtual tables 10 | - what to expect if you change node settings 11 | -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-internode-message/step1.md: -------------------------------------------------------------------------------- 1 | One performance enhancement of Apache Cassandra™ 4.X is internode communication optimizations and tuning. 2 | These adjustments consist of several changes; some of these changes are just good to know about, but others you can tune with parameters. 3 | Many of the internode communication changes are a result of retiring technical debt. 
4 | 5 | Not all of these adjustments are visible outside of the Cassandra code. 6 | But in this scenario, we explain the changes and show you what we can. 7 | -------------------------------------------------------------------------------- /cql/intro.md: -------------------------------------------------------------------------------- 1 | In this scenario, you will: 2 | 3 | * Learn about Cassandra Query Language (CQL) 4 | * Use the CQL shell 5 | * Execute statements `CREATE KEYSPACE`, `USE` and `CREATE TABLE` 6 | * Practice using statements `INSERT`, `SELECT`, `UPDATE` and `DELETE` 7 | 8 | _This scenario is also available on our [datastax.com/dev](https://www.datastax.com/learn/cassandra-fundamentals/cql) site, where you can find many more resources to help you succeed with Apache Cassandra™._ 9 | -------------------------------------------------------------------------------- /cassandra-data-modeling/time-series-data/step11.md: -------------------------------------------------------------------------------- 1 | Find daily min, max, median, mean and standard deviation values for 2 | a time series with source `Termostate A2`, metric `humidity` and 3 | date range [`2019-12-25`,`2020-01-07`]; order by date (desc): 4 | 5 | 
6 | Solution 7 | 8 | ```sql 9 | SELECT * 10 | FROM time_series.statistics_by_source_metric 11 | WHERE source = 'Termostate A2' 12 | AND metric = 'humidity' 13 | AND date >= '2019-12-25' 14 | AND date <= '2020-01-07'; 15 | ```{{execute}} 16 | 17 |
-------------------------------------------------------------------------------- /cql/finish.md: -------------------------------------------------------------------------------- 1 | **Did you know?** 2 | 3 | You can use Cassandra as a service in the cloud. Nothing to install, no credit card required. 4 | Sign up and launch your database with a few clicks at [astra.datastax.com](https://astra.datastax.com/register?utm_source=devplay&utm_medium=katacoda&utm_campaign=cassandra-fundamentals)! 5 | 6 | **In this scenario, you learned about:** 7 | 8 | * Cassandra Query Language (CQL) 9 | * The CQL shell 10 | * Statements `CREATE KEYSPACE`, `USE` and `CREATE TABLE` 11 | * Statements `INSERT`, `SELECT`, `UPDATE` and `DELETE` -------------------------------------------------------------------------------- /cql/step1.md: -------------------------------------------------------------------------------- 1 | *Cassandra Query Language* (*CQL*) is the primary language for interacting with Apache Cassandra™ databases. 2 | CQL data definition and data manipulation statements include: 3 | 4 | *CQL Data Definition* 5 | - `CREATE | ALTER | DROP KEYSPACE` 6 | - `USE` 7 | - `CREATE | ALTER | DROP | TRUNCATE TABLE` 8 | 9 | *CQL Data Manipulation* 10 | - `INSERT` (**C**reate) 11 | - `SELECT` (**R**ead) 12 | - `UPDATE` (**U**pdate) 13 | - `DELETE` (**D**elete) 14 | 15 | Let's use some of these statements and see how they work. 16 | 17 | 18 | 19 | 20 | -------------------------------------------------------------------------------- /cassandra-data-modeling/investment-data/step7.md: -------------------------------------------------------------------------------- 1 | Find all trades for account `joe001` and date range `2020-09-07` - `2020-09-11`; order by trade date (desc): 2 | 3 |
4 | Solution 5 | 6 | ```sql 7 | SELECT account, 8 | TODATE(DATEOF(trade_id)) AS date, 9 | trade_id, type, symbol, 10 | shares, price, amount 11 | FROM trades_by_a_d 12 | WHERE account = 'joe001' 13 | AND trade_id > maxTimeuuid('2020-09-07') 14 | AND trade_id < minTimeuuid('2020-09-12'); 15 | ```{{execute}} 16 | 17 |
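A note on the range bounds: `maxTimeuuid('2020-09-07')` is the largest possible timeuuid for the first instant of that day, so `trade_id > maxTimeuuid('2020-09-07')` skips a trade stamped at exactly `2020-09-07 00:00:00.000`. If that first millisecond should be included, one possible variant is sketched below. It assumes the same `trades_by_a_d` table as the solution and calls `toDate` directly on the timeuuid instead of `TODATE(DATEOF(...))`:

```sql
-- Sketch only: same table and columns as the solution above.
SELECT account,
       toDate(trade_id) AS date,   -- toDate accepts a timeuuid directly
       trade_id, type, symbol,
       shares, price, amount
FROM trades_by_a_d
WHERE account = 'joe001'
  AND trade_id >= minTimeuuid('2020-09-07')  -- inclusive lower bound
  AND trade_id <  minTimeuuid('2020-09-12'); -- exclusive upper bound
```

Both forms return the same rows unless a trade falls on the boundary millisecond, so the choice mostly matters when exact inclusivity is required.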
18 | 19 | 20 | -------------------------------------------------------------------------------- /cassandra-fundamentals/cql/finish.md: -------------------------------------------------------------------------------- 1 | **Did you know?** 2 | 3 | You can use Cassandra as a service in the cloud. Nothing to install, no credit card required. 4 | Sign up and launch your database with a few clicks at [astra.datastax.com](https://astra.datastax.com/register?utm_source=devplay&utm_medium=katacoda&utm_campaign=cassandra-fundamentals)! 5 | 6 | **In this scenario, you learned about:** 7 | 8 | * Cassandra Query Language (CQL) 9 | * The CQL shell 10 | * Statements `CREATE KEYSPACE`, `USE` and `CREATE TABLE` 11 | * Statements `INSERT`, `SELECT`, `UPDATE` and `DELETE` -------------------------------------------------------------------------------- /cassandra-data-modeling/time-series-data/step9.md: -------------------------------------------------------------------------------- 1 | Retrieve time series with a high resolution of 60 seconds for metric `temperature`, 2 | group `House A` and time range [`2020-10-04 23:59:00`,`2020-10-05 00:01:00`]; 3 | order by timestamp (desc) and source (asc): 4 | 5 |
6 | Solution 7 | 8 | ```sql 9 | SELECT * 10 | FROM time_series.series_by_metric_high 11 | WHERE group = 'House A' 12 | AND metric = 'temperature' 13 | AND timestamp >= '2020-10-04 23:59:00' 14 | AND timestamp <= '2020-10-05 00:01:00'; 15 | ```{{execute}} 16 | 17 |
-------------------------------------------------------------------------------- /cassandra-fundamentals/cql/step1.md: -------------------------------------------------------------------------------- 1 | *Cassandra Query Language* (*CQL*) is the primary language for interacting with Apache Cassandra™ databases. 2 | CQL data definition and data manipulation statements include: 3 | 4 | *CQL Data Definition* 5 | - `CREATE | ALTER | DROP KEYSPACE` 6 | - `USE` 7 | - `CREATE | ALTER | DROP | TRUNCATE TABLE` 8 | 9 | *CQL Data Manipulation* 10 | - `INSERT` (**C**reate) 11 | - `SELECT` (**R**ead) 12 | - `UPDATE` (**U**pdate) 13 | - `DELETE` (**D**elete) 14 | 15 | Let's use some of these statements and see how they work. 16 | 17 | 18 | 19 | 20 | -------------------------------------------------------------------------------- /cassandra-fundamentals/queries/intro.md: -------------------------------------------------------------------------------- 1 | In this learning unit, you will: 2 | 3 | * Query tables using the CQL `SELECT` statement 4 | * Understand efficient data access patterns 5 | * Learn about equality and inequality predicates 6 | * Group rows and compute aggregates 7 | * Order rows based on the table clustering order 8 | * Use other CQL querying capabilities 9 | 10 | _This scenario is also available on our [datastax.com/dev](https://www.datastax.com/learn/cassandra-fundamentals/queries) site, where you can find many more resources to help you succeed with Apache Cassandra™._ -------------------------------------------------------------------------------- /cassandra-data-modeling/time-series-data/step10.md: -------------------------------------------------------------------------------- 1 | Retrieve time series with a low resolution of 60 minutes for metric `temperature`, 2 | group `House A` and time range [`2019-01-01 00:00:00`,`2019-01-01 06:00:00`]; 3 | order by timestamp (desc) and source (asc): 4 | 5 |
6 | Solution 7 | 8 | ```sql 9 | SELECT * 10 | FROM time_series.series_by_metric_low 11 | WHERE group = 'House A' 12 | AND year = 2019 13 | AND metric = 'temperature' 14 | AND timestamp >= '2019-01-01 00:00:00' 15 | AND timestamp <= '2019-01-01 06:00:00'; 16 | ```{{execute}} 17 | 18 |
-------------------------------------------------------------------------------- /cassandra-data-modeling/investment-data/step10.md: -------------------------------------------------------------------------------- 1 | Find all trades for account `joe001`, date range `2020-09-07` - `2020-09-11` and instrument symbol `AAPL`; order by trade date (desc): 2 | 3 |
4 | Solution 5 | 6 | ```sql 7 | SELECT account, 8 | TODATE(DATEOF(trade_id)) AS date, 9 | trade_id, type, symbol, 10 | shares, price, amount 11 | FROM trades_by_a_sd 12 | WHERE account = 'joe001' 13 | AND symbol = 'AAPL' 14 | AND trade_id > maxTimeuuid('2020-09-07') 15 | AND trade_id < minTimeuuid('2020-09-12'); 16 | ```{{execute}} 17 | 18 |
19 | 20 | -------------------------------------------------------------------------------- /cassandra-data-modeling/investment-data/step8.md: -------------------------------------------------------------------------------- 1 | Find all trades for account `joe001`, date range `2020-09-07` - `2020-09-11` and transaction type `buy`; order by trade date (desc): 2 | 3 |
4 | Solution 5 | 6 | ```sql 7 | SELECT account, 8 | TODATE(DATEOF(trade_id)) AS date, 9 | trade_id, type, symbol, 10 | shares, price, amount 11 | FROM trades_by_a_td 12 | WHERE account = 'joe001' 13 | AND type = 'buy' 14 | AND trade_id > maxTimeuuid('2020-09-07') 15 | AND trade_id < minTimeuuid('2020-09-12'); 16 | ```{{execute}} 17 | 18 |
19 | 20 | 21 | -------------------------------------------------------------------------------- /cassandra-data-modeling/messaging-data/step4.md: -------------------------------------------------------------------------------- 1 | Find all folder labels and colors for user `joe@datastax.com`: 2 | 3 |
4 | Solution 5 | 6 | ```sql 7 | SELECT label, color 8 | FROM folders_by_user 9 | WHERE username = 'joe@datastax.com'; 10 | ```{{execute}} 11 | 12 |
13 | 14 |
15 | 16 | Find all folder labels and unread email quantities for user `joe@datastax.com`: 17 | 18 |
19 | Solution 20 | 21 | ```sql 22 | SELECT label, num_unread 23 | FROM unread_email_stats 24 | WHERE username = 'joe@datastax.com'; 25 | ```{{execute}} 26 | 27 |
-------------------------------------------------------------------------------- /cassandra-data-modeling/time-series-data/step7.md: -------------------------------------------------------------------------------- 1 | Retrieve time series with a high resolution of 60 seconds for group `House A`, 2 | sources `Refrigerator A1` and `Freezer A1`, and time range [`2020-10-05 12:44:00`,`2020-10-05 12:47:00`]; 3 | order by timestamp (desc) and metric (asc): 4 | 5 |
6 | Solution 7 | 8 | ```sql 9 | SELECT * 10 | FROM time_series.series_by_source_high 11 | WHERE group = 'House A' 12 | AND source IN ('Refrigerator A1','Freezer A1') 13 | AND timestamp >= '2020-10-05 12:44:00' 14 | AND timestamp <= '2020-10-05 12:47:00'; 15 | ```{{execute}} 16 | 17 |
18 | 19 | -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-full-query-logging/intro.md: -------------------------------------------------------------------------------- 1 | Cassandra 4.x full query logging enables you to get the exact CQL query strings used by your client applications. This information can be used for: 2 | - Analyzing poorly performing queries 3 | - Debugging queries that are producing incorrect results 4 | - Live traffic capture and replay 5 | - Troubleshooting/Debugging 6 | 7 | In this scenario you will: 8 | 9 | - Enable and disable full query logging statically (in *cassandra.yaml*) and dynamically (using *nodetool*). 10 | - Learn features of the full query logging tool *fqltool* 11 | - Read and interpret full query logs 12 | -------------------------------------------------------------------------------- /cassandra-data-modeling/time-series-data/step8.md: -------------------------------------------------------------------------------- 1 | Retrieve time series with a low resolution of 60 minutes for group `House A`, 2 | sources `Refrigerator A1` and `Freezer A1`, and time range [`2020-10-05 12:00:00`,`2020-10-05 15:00:00`]; 3 | order by source (asc), timestamp (desc) and metric (asc): 4 | 5 |
6 | Solution 7 | 8 | ```sql 9 | SELECT * 10 | FROM time_series.series_by_source_low 11 | WHERE group = 'House A' 12 | AND year = 2020 13 | AND source IN ('Refrigerator A1','Freezer A1') 14 | AND timestamp >= '2020-10-05 12:00:00' 15 | AND timestamp <= '2020-10-05 15:00:00'; 16 | ```{{execute}} 17 | 18 |
-------------------------------------------------------------------------------- /cassandra-data-modeling/investment-data/step9.md: -------------------------------------------------------------------------------- 1 | Find all trades for account `joe001`, date range `2020-09-07` - `2020-09-11`, transaction type `buy` and instrument symbol `AAPL`; order by trade date (desc): 2 | 3 |
4 | Solution 5 | 6 | ```sql 7 | SELECT account, 8 | TODATE(DATEOF(trade_id)) AS date, 9 | trade_id, type, symbol, 10 | shares, price, amount 11 | FROM trades_by_a_std 12 | WHERE account = 'joe001' 13 | AND symbol = 'AAPL' 14 | AND type = 'buy' 15 | AND trade_id > maxTimeuuid('2020-09-07') 16 | AND trade_id < minTimeuuid('2020-09-12'); 17 | ```{{execute}} 18 | 19 |
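The range predicates in this solution turn the inclusive date range `2020-09-07` - `2020-09-11` into timeuuid bounds: `maxTimeuuid('2020-09-07')` is the largest timeuuid for midnight at the start of September 7, and `minTimeuuid('2020-09-12')` is the smallest timeuuid for midnight at the start of the day after the range ends. As a minimal sketch, assuming GNU `date` is available, the hypothetical helper `exclusive_upper_bound` (not part of this scenario) computes that "day after" value for any inclusive end date:

```shell
#!/bin/bash

# exclusive_upper_bound END_DATE
# Hypothetical helper (an illustration, not part of the scenario):
# prints the day after END_DATE (YYYY-MM-DD), i.e. the date to pass to
# minTimeuuid() so that every trade made on END_DATE is still included.
# Assumes GNU date, which accepts relative items such as "+1 day".
exclusive_upper_bound() {
  date -u -d "$1 +1 day" +%F
}

# Upper bound for the date range used in this step.
exclusive_upper_bound 2020-09-11
```

For the range in this step the helper yields `2020-09-12`, which is the value the solution passes to `minTimeuuid`.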
20 | 21 | -------------------------------------------------------------------------------- /cassandra-fundamentals/queries/finish.md: -------------------------------------------------------------------------------- 1 | **Did you know?** 2 | 3 | You can use Cassandra as a service in the cloud. Nothing to install, no credit card required. 4 | Sign up and launch your database with a few clicks at [astra.datastax.com](https://astra.datastax.com/register?utm_source=devplay&utm_medium=katacoda&utm_campaign=cassandra-fundamentals)! 5 | 6 | **In this scenario, you learned about:** 7 | 8 | * CQL queries and the `SELECT` statement 9 | * Efficient data access patterns 10 | * Equality and inequality predicates 11 | * Grouping rows and computing aggregates 12 | * Ordering rows based on the table clustering order 13 | * Other CQL querying capabilities 14 | 15 | -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-full-query-logging/step3.md: -------------------------------------------------------------------------------- 1 | In this step, you will use *fqltool* to read the contents of a full query log. 2 | 3 | Let's use the `fqltool dump` command, pointing it to the directory we set up previously to store our full query logs. 4 | 5 | ``` 6 | fqltool dump /tmp/fqllogs 7 | ```{{execute}} 8 | 9 | To see more about the different options for the `dump` command, use the help: 10 | 11 | ``` 12 | fqltool help dump 13 | ```{{execute}} 14 | 15 | To see what other *fqltool* commands are available, try: 16 | 17 | ``` 18 | fqltool help 19 | ```{{execute}} 20 | 21 | # Summary 22 | 23 | In this step, you used *fqltool* to view the contents of a full query log. 
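If the log directory does not exist yet or is still empty, there is nothing for `fqltool dump` to read. The following is a minimal sketch of a guard you could run first; the helper name `has_log_files` is hypothetical, and the only thing taken from this step is the `/tmp/fqllogs` directory and the `fqltool dump` command:

```shell
#!/bin/bash

# has_log_files DIR
# Hypothetical helper (an illustration, not part of fqltool):
# succeeds only if DIR exists and contains at least one entry.
has_log_files() {
  local dir="$1"
  [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]
}

# Usage sketch (left as comments so the block has no side effects):
#   if has_log_files /tmp/fqllogs; then
#     fqltool dump /tmp/fqllogs
#   else
#     echo "No full query logs found in /tmp/fqllogs yet"
#   fi
```

The check only looks for directory entries; it does not validate that the files are well-formed full query logs.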
-------------------------------------------------------------------------------- /cql/step3.md: -------------------------------------------------------------------------------- 1 | A keyspace is a namespace for a set of tables sharing a data replication strategy and some options. 2 | It is conceptually similar to a "database" in a relational database management system. 3 | 4 | Create the keyspace: 5 | ``` 6 | CREATE KEYSPACE killr_video 7 | WITH replication = { 8 | 'class': 'NetworkTopologyStrategy', 9 | 'DC-Houston': 1 }; 10 | ```{{execute}} 11 | 12 | Our keyspace name is `killr_video`. Any data in this keyspace will be replicated to datacenter `DC-Houston` 13 | using replication strategy `NetworkTopologyStrategy` and replication factor `1`. In production, however, we strongly 14 | recommend multiple datacenters and at least three replicas per datacenter for higher availability. 15 | -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-migrate-cassandra-3-to-4/step4.md: -------------------------------------------------------------------------------- 1 | In this step, we will prepare the Cassandra 3.x cluster for the upgrade. 2 | 3 | Take a snapshot of the node in case you need to roll back the upgrade. 4 | (`nodetool snapshot` also flushes the memtables to disk.) 5 | ``` 6 | nodetool snapshot 7 | ```{{execute T1}} 8 | 9 | Stop the node by finding the PID and calling kill. 10 | ``` 11 | pgrep -u root -f cassandra | xargs kill -9 12 | ```{{execute T1}} 13 | 14 | Use nodetool to verify that the server has been shut down. 15 | ``` 16 | nodetool status 17 | ```{{execute T1}} 18 | 19 | Clear the screen and continue. 20 | ``` 21 | clear 22 | ```{{execute T1}} 23 | 24 | The node has been shut down; continue to the next step. 
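This step stops the node with `kill -9`, which is fine for a disposable single-node lab. On a node you care about, a gentler sequence is usually preferable: run `nodetool drain` to flush memtables and stop accepting writes, send a normal termination signal, and only fall back to `kill -9` if the process does not exit. The sketch below shows the waiting part; `wait_for_exit` is a hypothetical helper, and the timeout values are assumptions chosen for illustration:

```shell
#!/bin/bash

# wait_for_exit PID [TIMEOUT_SECONDS]
# Hypothetical helper (an illustration, not part of Cassandra):
# polls once per second until the process with the given PID is gone.
# Returns 0 once the process has exited, 1 if the timeout is reached first.
wait_for_exit() {
  local pid="$1"
  local timeout="${2:-30}"
  local waited=0
  # kill -0 sends no signal; it only checks that the PID can be signalled.
  while kill -0 "$pid" 2>/dev/null; do
    if [ "$waited" -ge "$timeout" ]; then
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}

# Usage sketch (left as comments because it needs a running node):
#   nodetool drain
#   pid="$(pgrep -u root -f cassandra | head -n 1)"
#   kill "$pid"
#   wait_for_exit "$pid" 60 || kill -9 "$pid"
```

Keeping the forced kill as a fallback preserves the behavior of this step while giving the node a chance to shut down cleanly first.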
-------------------------------------------------------------------------------- /cassandra-fundamentals/cql/quiz.md: -------------------------------------------------------------------------------- 1 | Here is a short quiz for you. 2 | 3 | Q1. Which CQL statement can be used to add rows into a table? 4 | 5 | - [ ] A. SELECT 6 | - [ ] B. DELETE 7 | - [ ] C. INSERT 8 | 9 |
10 | Answer 11 |

C

12 |
13 | 14 | Q2. Which CQL statement can be used to retrieve rows from a table? 15 | 16 | - [ ] A. SELECT 17 | - [ ] B. DELETE 18 | - [ ] C. INSERT 19 | 20 |
21 | Answer 22 |

A

23 |
24 | 25 | Q3. Which CQL statement can be used to remove rows from a table? 26 | 27 | - [ ] A. SELECT 28 | - [ ] B. DELETE 29 | - [ ] C. INSERT 30 | 31 |
32 | Answer 33 |

B

34 |
35 | -------------------------------------------------------------------------------- /cassandra-fundamentals/cql/step3.md: -------------------------------------------------------------------------------- 1 | A keyspace is a namespace for a set of tables sharing a data replication strategy and some options. 2 | It is conceptually similar to a "database" in a relational database management system. 3 | 4 | Create the keyspace: 5 | ``` 6 | CREATE KEYSPACE killr_video 7 | WITH replication = { 8 | 'class': 'NetworkTopologyStrategy', 9 | 'DC-Houston': 1 }; 10 | ```{{execute}} 11 | 12 | Our keyspace name is `killr_video`. Any data in this keyspace will be replicated to datacenter `DC-Houston` 13 | using replication strategy `NetworkTopologyStrategy` and replication factor `1`. In production, however, we strongly 14 | recommend multiple datacenters and at least three replicas per datacenter for higher availability. 15 | -------------------------------------------------------------------------------- /cassandra-data-modeling/messaging-data/step6.md: -------------------------------------------------------------------------------- 1 | Find all available information about an email with id `8ae31dd0-d361-11ea-a40e-5dd6331dfc45`: 2 | 3 |
4 | Solution 5 | 6 | ```sql 7 | SELECT id, "to", "from", 8 | toTimestamp(id) AS timestamp, 9 | subject, body, 10 | attachments 11 | FROM emails 12 | WHERE id = 8ae31dd0-d361-11ea-a40e-5dd6331dfc45; 13 | ```{{execute}} 14 | 15 |
16 | 17 |
18 | 19 | Notice the file names and sizes (measured in kilobytes) in column `attachments`. For more efficient retrieval, we can assume that larger files are split into 20 | chunks of 1000KB or less. For example, a 530KB file can be stored as one chunk, while a 2416KB file has to be stored using 3 chunks. 21 | -------------------------------------------------------------------------------- /cassandra-data-modeling/sensor-data/step3.md: -------------------------------------------------------------------------------- 1 | Execute the CQL script to insert sample data: 2 | ```sql 3 | SOURCE '~/sensor_data.cql' 4 | ```{{execute}} 5 | 6 | Retrieve all rows from table `networks`: 7 | ```sql 8 | SELECT * FROM networks; 9 | ```{{execute}} 10 | 11 | Retrieve all rows from table `temperatures_by_network`: 12 | ```sql 13 | SELECT network, week, date_hour, 14 | sensor, avg_temperature 15 | FROM temperatures_by_network; 16 | ```{{execute}} 17 | 18 | Retrieve all rows from table `sensors_by_network`: 19 | ```sql 20 | SELECT * FROM sensors_by_network; 21 | ```{{execute}} 22 | 23 | Retrieve all rows from table `temperatures_by_sensor`: 24 | ```sql 25 | SELECT * FROM temperatures_by_sensor; 26 | ```{{execute}} 27 | -------------------------------------------------------------------------------- /cql/assets/wait.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | show_progress() 4 | { 5 | local -r pid="${1}" 6 | local -r delay='0.75' 7 | local spinstr='\|/-' 8 | local temp 9 | echo -n "Starting up Cassandra..." 10 | while true; do 11 | sudo grep -i "done" /opt/katacoda-background-finished &> /dev/null 12 | if [[ "$?" -ne 0 ]]; then 13 | temp="${spinstr#?}" 14 | printf " [%c] " "${spinstr}" 15 | spinstr=${temp}${spinstr%"${temp}"} 16 | sleep "${delay}" 17 | printf "\b\b\b\b\b\b" 18 | else 19 | break 20 | fi 21 | done 22 | printf " \b\b\b\b" 23 | echo "" 24 | echo "Cassandra has started!" 
25 | } 26 | 27 | show_progress 28 | sleep 1 29 | clear 30 | printf "\033[0;32mYour Interactive Bash Terminal.\033[0m\n" 31 | -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-internode-message/step2.md: -------------------------------------------------------------------------------- 1 | A significant change to the Cassandra 4.X code has to do with how the code receives messages. 2 | Prior to 4.0, the code would post a read and wait for a message. 3 | The thread posting the read would block and not be able to do anything else until it received the message. 4 | 5 | The new code uses a Java package known as _NIO_, and a messaging framework known as _Netty_. 6 | This package and framework perform _asynchronous_ IO, which means the thread gets notified when it needs to receive a message. 7 | The result is that threads no longer block and there is less overhead associated with thread-switching and management. 8 | 9 | The bottom line is that Cassandra 4.X can process messages without consuming as much CPU! 10 | -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-migrate-cassandra-3-to-4/intro.md: -------------------------------------------------------------------------------- 1 | In this scenario you will learn how to upgrade an Apache Cassandra™ cluster from 3.x to 4.x. To keep the scenario from becoming too complex, the hands-on exercises in this scenario migrate a *single-node* cluster. 2 | 3 | Production Cassandra clusters always have multiple nodes. Therefore, the steps include notes describing the *extra* work necessary to upgrade *multi-node* clusters. 
4 | 5 | In this scenario, you will: 6 | - Configure a single-node Cassandra 3.x cluster 7 | - Populate this cluster with data 8 | - Prepare the cluster for upgrade 9 | - Install Cassandra 4.x 10 | - Start the Cassandra 4.x cluster 11 | - Verify that the data is available in the upgraded Cassandra 4.x cluster 12 | 13 | ## Let's get started -------------------------------------------------------------------------------- /cassandra-fundamentals/cql/assets/wait.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | show_progress() 4 | { 5 | local -r pid="${1}" 6 | local -r delay='0.75' 7 | local spinstr='\|/-' 8 | local temp 9 | echo -n "Starting up Cassandra..." 10 | while true; do 11 | sudo grep -i "done" /opt/katacoda-background-finished &> /dev/null 12 | if [[ "$?" -ne 0 ]]; then 13 | temp="${spinstr#?}" 14 | printf " [%c] " "${spinstr}" 15 | spinstr=${temp}${spinstr%"${temp}"} 16 | sleep "${delay}" 17 | printf "\b\b\b\b\b\b" 18 | else 19 | break 20 | fi 21 | done 22 | printf " \b\b\b\b" 23 | echo "" 24 | echo "Cassandra has started!" 25 | } 26 | 27 | show_progress 28 | sleep 1 29 | clear 30 | printf "\033[0;32mYour Interactive Bash Terminal.\033[0m\n" 31 | -------------------------------------------------------------------------------- /cassandra-fundamentals/queries/assets/wait.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | show_progress() 4 | { 5 | local -r pid="${1}" 6 | local -r delay='0.75' 7 | local spinstr='\|/-' 8 | local temp 9 | echo -n "Starting up Cassandra..." 10 | while true; do 11 | sudo grep -i "done" /opt/katacoda-background-finished &> /dev/null 12 | if [[ "$?" -ne 0 ]]; then 13 | temp="${spinstr#?}" 14 | printf " [%c] " "${spinstr}" 15 | spinstr=${temp}${spinstr%"${temp}"} 16 | sleep "${delay}" 17 | printf "\b\b\b\b\b\b" 18 | else 19 | break 20 | fi 21 | done 22 | printf " \b\b\b\b" 23 | echo "" 24 | echo "Cassandra has started!" 
25 | } 26 | 27 | show_progress 28 | sleep 1 29 | clear 30 | printf "\033[0;32mYour Interactive Bash Terminal.\033[0m\n" 31 | -------------------------------------------------------------------------------- /cassandra-fundamentals/queries/step4.md: -------------------------------------------------------------------------------- 1 | Table `users` stores information about users who are uniquely identified by their email addresses. 2 | This table has single-row partitions and 3 | the primary key defined as `PRIMARY KEY ((email))`. 4 | Let's first retrieve all rows from the table to see what the data looks like and then focus 5 | on predicates that the primary key can support. 6 | 7 | Q1. Retrieve all rows: 8 | ``` 9 | SELECT * FROM users; 10 | ```{{execute}} 11 | 12 | Q2. Retrieve one row/partition: 13 | ``` 14 | SELECT * FROM users 15 | WHERE email = 'joe@datastax.com'; 16 | ```{{execute}} 17 | 18 | Q3. Retrieve two rows/partitions: 19 | ``` 20 | SELECT * FROM users 21 | WHERE email IN ('joe@datastax.com', 22 | 'jen@datastax.com'); 23 | ```{{execute}} 24 | 25 | -------------------------------------------------------------------------------- /cassandra-data-modeling/music-data/assets/wait.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | show_progress() 4 | { 5 | local -r pid="${1}" 6 | local -r delay='0.75' 7 | local spinstr='\|/-' 8 | local temp 9 | echo -n "Starting up Cassandra..." 10 | while true; do 11 | sudo grep -i "done" /opt/katacoda-background-finished &> /dev/null 12 | if [[ "$?" -ne 0 ]]; then 13 | temp="${spinstr#?}" 14 | printf " [%c] " "${spinstr}" 15 | spinstr=${temp}${spinstr%"${temp}"} 16 | sleep "${delay}" 17 | printf "\b\b\b\b\b\b" 18 | else 19 | break 20 | fi 21 | done 22 | printf " \b\b\b\b" 23 | echo "" 24 | echo "Cassandra has started!" 
25 | } 26 | 27 | show_progress 28 | sleep 1 29 | clear 30 | printf "\033[0;32mYour Interactive Bash Terminal.\033[0m\n" 31 | -------------------------------------------------------------------------------- /cassandra-data-modeling/sensor-data/assets/wait.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | show_progress() 4 | { 5 | local -r pid="${1}" 6 | local -r delay='0.75' 7 | local spinstr='\|/-' 8 | local temp 9 | echo -n "Starting up Cassandra..." 10 | while true; do 11 | sudo grep -i "done" /opt/katacoda-background-finished &> /dev/null 12 | if [[ "$?" -ne 0 ]]; then 13 | temp="${spinstr#?}" 14 | printf " [%c] " "${spinstr}" 15 | spinstr=${temp}${spinstr%"${temp}"} 16 | sleep "${delay}" 17 | printf "\b\b\b\b\b\b" 18 | else 19 | break 20 | fi 21 | done 22 | printf " \b\b\b\b" 23 | echo "" 24 | echo "Cassandra has started!" 25 | } 26 | 27 | show_progress 28 | sleep 1 29 | clear 30 | printf "\033[0;32mYour Interactive Bash Terminal.\033[0m\n" 31 | -------------------------------------------------------------------------------- /cassandra-data-modeling/investment-data/assets/wait.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | show_progress() 4 | { 5 | local -r pid="${1}" 6 | local -r delay='0.75' 7 | local spinstr='\|/-' 8 | local temp 9 | echo -n "Starting up Cassandra..." 10 | while true; do 11 | sudo grep -i "done" /opt/katacoda-background-finished &> /dev/null 12 | if [[ "$?" -ne 0 ]]; then 13 | temp="${spinstr#?}" 14 | printf " [%c] " "${spinstr}" 15 | spinstr=${temp}${spinstr%"${temp}"} 16 | sleep "${delay}" 17 | printf "\b\b\b\b\b\b" 18 | else 19 | break 20 | fi 21 | done 22 | printf " \b\b\b\b" 23 | echo "" 24 | echo "Cassandra has started!" 
25 | } 26 | 27 | show_progress 28 | sleep 1 29 | clear 30 | printf "\033[0;32mYour Interactive Bash Terminal.\033[0m\n" 31 | -------------------------------------------------------------------------------- /cassandra-data-modeling/messaging-data/assets/wait.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | show_progress() 4 | { 5 | local -r pid="${1}" 6 | local -r delay='0.75' 7 | local spinstr='\|/-' 8 | local temp 9 | echo -n "Starting up Cassandra..." 10 | while true; do 11 | sudo grep -i "done" /opt/katacoda-background-finished &> /dev/null 12 | if [[ "$?" -ne 0 ]]; then 13 | temp="${spinstr#?}" 14 | printf " [%c] " "${spinstr}" 15 | spinstr=${temp}${spinstr%"${temp}"} 16 | sleep "${delay}" 17 | printf "\b\b\b\b\b\b" 18 | else 19 | break 20 | fi 21 | done 22 | printf " \b\b\b\b" 23 | echo "" 24 | echo "Cassandra has started!" 25 | } 26 | 27 | show_progress 28 | sleep 1 29 | clear 30 | printf "\033[0;32mYour Interactive Bash Terminal.\033[0m\n" 31 | -------------------------------------------------------------------------------- /cassandra-data-modeling/time-series-data/assets/wait.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | show_progress() 4 | { 5 | local -r pid="${1}" 6 | local -r delay='0.75' 7 | local spinstr='\|/-' 8 | local temp 9 | echo -n "Starting up Cassandra..." 10 | while true; do 11 | sudo grep -i "done" /opt/katacoda-background-finished &> /dev/null 12 | if [[ "$?" -ne 0 ]]; then 13 | temp="${spinstr#?}" 14 | printf " [%c] " "${spinstr}" 15 | spinstr=${temp}${spinstr%"${temp}"} 16 | sleep "${delay}" 17 | printf "\b\b\b\b\b\b" 18 | else 19 | break 20 | fi 21 | done 22 | printf " \b\b\b\b" 23 | echo "" 24 | echo "Cassandra has started!" 
25 | } 26 | 27 | show_progress 28 | sleep 1 29 | clear 30 | printf "\033[0;32mYour Interactive Bash Terminal.\033[0m\n" 31 | -------------------------------------------------------------------------------- /cassandra-data-modeling/order-management-data/assets/wait.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | show_progress() 4 | { 5 | local -r pid="${1}" 6 | local -r delay='0.75' 7 | local spinstr='\|/-' 8 | local temp 9 | echo -n "Starting up Cassandra..." 10 | while true; do 11 | sudo grep -i "done" /opt/katacoda-background-finished &> /dev/null 12 | if [[ "$?" -ne 0 ]]; then 13 | temp="${spinstr#?}" 14 | printf " [%c] " "${spinstr}" 15 | spinstr=${temp}${spinstr%"${temp}"} 16 | sleep "${delay}" 17 | printf "\b\b\b\b\b\b" 18 | else 19 | break 20 | fi 21 | done 22 | printf " \b\b\b\b" 23 | echo "" 24 | echo "Cassandra has started!" 25 | } 26 | 27 | show_progress 28 | sleep 1 29 | clear 30 | printf "\033[0;32mYour Interactive Bash Terminal.\033[0m\n" 31 | -------------------------------------------------------------------------------- /cassandra-data-modeling/shopping-cart-data/assets/wait.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | show_progress() 4 | { 5 | local -r pid="${1}" 6 | local -r delay='0.75' 7 | local spinstr='\|/-' 8 | local temp 9 | echo -n "Starting up Cassandra..." 10 | while true; do 11 | sudo grep -i "done" /opt/katacoda-background-finished &> /dev/null 12 | if [[ "$?" -ne 0 ]]; then 13 | temp="${spinstr#?}" 14 | printf " [%c] " "${spinstr}" 15 | spinstr=${temp}${spinstr%"${temp}"} 16 | sleep "${delay}" 17 | printf "\b\b\b\b\b\b" 18 | else 19 | break 20 | fi 21 | done 22 | printf " \b\b\b\b" 23 | echo "" 24 | echo "Cassandra has started!" 
25 | } 26 | 27 | show_progress 28 | sleep 1 29 | clear 30 | printf "\033[0;32mYour Interactive Bash Terminal.\033[0m\n" 31 | -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-full-query-logging/assets/wait.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | show_progress() 4 | { 5 | local -r pid="${1}" 6 | local -r delay='0.75' 7 | local spinstr='\|/-' 8 | local temp 9 | echo -n "Starting up Cassandra..." 10 | while true; do 11 | sudo grep -i "done" /opt/katacoda-background-finished &> /dev/null 12 | if [[ "$?" -ne 0 ]]; then 13 | temp="${spinstr#?}" 14 | printf " [%c] " "${spinstr}" 15 | spinstr=${temp}${spinstr%"${temp}"} 16 | sleep "${delay}" 17 | printf "\b\b\b\b\b\b" 18 | else 19 | break 20 | fi 21 | done 22 | printf " \b\b\b\b" 23 | echo "" 24 | echo "Cassandra has started!" 25 | } 26 | 27 | show_progress 28 | sleep 1 29 | clear 30 | printf "\033[0;32mYour Interactive Bash Terminal.\033[0m\n" 31 | -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-internode-message/assets/wait.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | show_progress() 4 | { 5 | local -r pid="${1}" 6 | local -r delay='0.75' 7 | local spinstr='\|/-' 8 | local temp 9 | echo -n "Starting up Cassandra..." 10 | while true; do 11 | sudo grep -i "done" /opt/katacoda-background-finished &> /dev/null 12 | if [[ "$?" -ne 0 ]]; then 13 | temp="${spinstr#?}" 14 | printf " [%c] " "${spinstr}" 15 | spinstr=${temp}${spinstr%"${temp}"} 16 | sleep "${delay}" 17 | printf "\b\b\b\b\b\b" 18 | else 19 | break 20 | fi 21 | done 22 | printf " \b\b\b\b" 23 | echo "" 24 | echo "Cassandra has started!" 
25 | } 26 | 27 | show_progress 28 | sleep 1 29 | clear 30 | printf "\033[0;32mYour Interactive Bash Terminal.\033[0m\n" 31 | -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-migrate-cassandra-3-to-4/assets/wait.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | show_progress() 4 | { 5 | local -r pid="${1}" 6 | local -r delay='0.75' 7 | local spinstr='\|/-' 8 | local temp 9 | echo -n "Starting up Cassandra..." 10 | while true; do 11 | sudo grep -i "done" /opt/katacoda-background-finished &> /dev/null 12 | if [[ "$?" -ne 0 ]]; then 13 | temp="${spinstr#?}" 14 | printf " [%c] " "${spinstr}" 15 | spinstr=${temp}${spinstr%"${temp}"} 16 | sleep "${delay}" 17 | printf "\b\b\b\b\b\b" 18 | else 19 | break 20 | fi 21 | done 22 | printf " \b\b\b\b" 23 | echo "" 24 | echo "Cassandra has started!" 25 | } 26 | 27 | show_progress 28 | sleep 1 29 | clear 30 | printf "\033[0;32mYour Interactive Bash Terminal.\033[0m\n" -------------------------------------------------------------------------------- /cassandra-features-4x/virtual-tables/assets/wait.sh: -------------------------------------------------------------------------------- 1 | # !/bin/bash 2 | 3 | show_progress() 4 | { 5 | local -r pid="${1}" 6 | local -r delay='0.75' 7 | local spinstr='\|/-' 8 | local temp 9 | echo -n "Starting up Cassandra..." 10 | while true; do 11 | # sudo grep -i "done" /opt/katacoda-background-finished &> /dev/null 12 | sudo grep -i "Startup complete" /var/log/cassandra/system.log &> /dev/null 13 | if [[ "$?" -ne 0 ]]; then 14 | temp="${spinstr#?}" 15 | printf " [%c] " "${spinstr}" 16 | spinstr=${temp}${spinstr%"${temp}"} 17 | sleep "${delay}" 18 | printf "\b\b\b\b\b\b" 19 | else 20 | break 21 | fi 22 | done 23 | clear 24 | printf " \b\b\b\b" 25 | echo "" 26 | echo "Cassandra has started!" 
27 | } 28 | 29 | show_progress 30 | -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-repair-improvements/assets/wait.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | show_progress() 4 | { 5 | local -r pid="${1}" 6 | local -r delay='0.75' 7 | local spinstr='\|/-' 8 | local temp 9 | echo -n "Starting up Cassandra..." 10 | while true; do 11 | sudo grep -i "done" /opt/katacoda-background-finished &> /dev/null 12 | if [[ "$?" -ne 0 ]]; then 13 | temp="${spinstr#?}" 14 | printf " [%c] " "${spinstr}" 15 | spinstr=${temp}${spinstr%"${temp}"} 16 | sleep "${delay}" 17 | printf "\b\b\b\b\b\b" 18 | else 19 | break 20 | fi 21 | done 22 | printf " \b\b\b\b" 23 | echo "" 24 | echo "Cassandra has started!" 25 | } 26 | 27 | show_progress 28 | sleep 1 29 | clear 30 | printf "\033[0;32mYour Interactive Bash Terminal.\033[0m\n" 31 | -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-audit-logging/assets/wait.sh: -------------------------------------------------------------------------------- 1 | # !/bin/bash 2 | 3 | show_progress() 4 | { 5 | local -r pid="${1}" 6 | local -r delay='0.75' 7 | local spinstr='\|/-' 8 | local temp 9 | echo -n "Starting up Cassandra..." 10 | while true; do 11 | # sudo grep -i "done" /opt/katacoda-background-finished &> /dev/null 12 | sudo grep -i "Startup complete" /var/log/cassandra/system.log &> /dev/null 13 | if [[ "$?" -ne 0 ]]; then 14 | temp="${spinstr#?}" 15 | printf " [%c] " "${spinstr}" 16 | spinstr=${temp}${spinstr%"${temp}"} 17 | sleep "${delay}" 18 | printf "\b\b\b\b\b\b" 19 | else 20 | break 21 | fi 22 | done 23 | clear 24 | printf " \b\b\b\b" 25 | echo "" 26 | echo "Cassandra has started!" 
27 | } 28 | 29 | show_progress 30 | -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-migrate-cassandra-3-to-4/step5.md: -------------------------------------------------------------------------------- 1 | In this step, you will download and unpack the Cassandra 4.0.0 distribution. 2 | 3 | Download the tarball. 4 | ``` 5 | wget https://archive.apache.org/dist/cassandra/4.0.0/apache-cassandra-4.0.0-bin.tar.gz 6 | ```{{execute T1}} 7 | 8 | Extract Cassandra. 9 | ``` 10 | tar xzf apache-cassandra-4.0.0-bin.tar.gz 11 | ```{{execute T1}} 12 | 13 | Move the Cassandra folder. 14 | ``` 15 | mv apache-cassandra-4.0.0 /usr/share/cassandra4 16 | ```{{execute T1}} 17 | 18 | Delete the archive. 19 | ``` 20 | rm apache-cassandra-4.0.0-bin.tar.gz 21 | ```{{execute T1}} 22 | 23 | Update the PATH variable. 24 | ``` 25 | export PATH="/usr/bin:/usr/share/cassandra4/bin:/usr/share/cassandra4/tools/bin:$PATH" 26 | ```{{execute T1}} 27 | 28 | Clear the screen and continue. 
29 | ``` 30 | clear 31 | ```{{execute T1}} -------------------------------------------------------------------------------- /cassandra-data-modeling/shopping-cart-data/step3.md: -------------------------------------------------------------------------------- 1 | Execute the CQL script to insert sample data: 2 | ```sql 3 | SOURCE '~/shopping_cart_data.cql' 4 | ```{{execute}} 5 | 6 | Retrieve all rows from table `carts_by_user`: 7 | ```sql 8 | SELECT user_id, cart_name, 9 | cart_id, cart_is_active 10 | FROM carts_by_user; 11 | ```{{execute}} 12 | 13 | Retrieve all rows from table `items_by_id`: 14 | ```sql 15 | SELECT * FROM items_by_id; 16 | ```{{execute}} 17 | 18 | Retrieve all rows from materialized view `items_by_name`: 19 | ```sql 20 | SELECT * FROM items_by_name; 21 | ```{{execute}} 22 | 23 | Retrieve all rows from table `items_by_cart`: 24 | ```sql 25 | SELECT cart_id, timestamp, item_id 26 | FROM items_by_cart; 27 | 28 | SELECT cart_id, item_id, item_price, 29 | quantity, cart_subtotal 30 | FROM items_by_cart; 31 | ```{{execute}} 32 | -------------------------------------------------------------------------------- /cassandra-fundamentals/queries/step9.md: -------------------------------------------------------------------------------- 1 | Some queries may need to organize rows into groups 2 | and compute aggregates for each individual group. In Cassandra, 3 | grouping is always based on partition and clustering key columns and 4 | must follow the primary key definition order. In other words, a group 5 | is always defined as a set of rows belonging to the same partition. 6 | Consider the following query examples. 7 | 8 | Q1. Calculate average ratings for all movies: 9 | ``` 10 | SELECT title, year, 11 | AVG(CAST(rating AS FLOAT)) AS avg_rating 12 | FROM ratings_by_movie 13 | GROUP BY title, year; 14 | ```{{execute}} 15 | 16 | Q2. Calculate the number of ratings per user: 17 |
18 | Solution 19 | 20 | ``` 21 | SELECT email, COUNT(rating) AS n 22 | FROM ratings_by_user 23 | GROUP BY email; 24 | ```{{execute}} 25 | 26 |
-------------------------------------------------------------------------------- /cassandra-data-modeling/messaging-data/step3.md: -------------------------------------------------------------------------------- 1 | Execute the CQL script to insert sample data: 2 | ```sql 3 | SOURCE '~/messaging_data.cql' 4 | ```{{execute}} 5 | 6 | Retrieve all rows from table `folders_by_user`: 7 | ```sql 8 | SELECT * FROM folders_by_user; 9 | ```{{execute}} 10 | 11 | Retrieve all rows from table `unread_email_stats`: 12 | ```sql 13 | SELECT * FROM unread_email_stats; 14 | ```{{execute}} 15 | 16 | Retrieve all rows from table `emails_by_user_folder`: 17 | ```sql 18 | SELECT * FROM emails_by_user_folder; 19 | ```{{execute}} 20 | 21 | Retrieve all rows from table `emails`: 22 | ```sql 23 | SELECT id, "to", "from" FROM emails; 24 | SELECT id, subject, body FROM emails; 25 | SELECT id, attachments FROM emails; 26 | ```{{execute}} 27 | 28 | Retrieve all rows from table `attachments`: 29 | ```sql 30 | SELECT * FROM attachments; 31 | ```{{execute}} -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-full-query-logging/quiz.md: -------------------------------------------------------------------------------- 1 | Here is a short quiz for you. 2 | 3 | >>1. What is a valid statement about full query log configuration in Cassandra? << 4 | ( ) Full query logging can only be controlled from the cassandra.yaml configuration file. 5 | ( ) Enabling full query logging on a node using nodetool ensures the feature will be enabled even when the node is restarted. 6 | (*) Setting a directory for full query log files in cassandra.yaml enables the full query logging feature. 7 | 8 | 9 | >>2. Which of the following is not a command supported by the full query logging tool fqltool? << 10 | ( ) fqltool dump 11 | (*) fqltool archive 12 | ( ) fqltool replay 13 | ( ) fqltool compare 14 | 15 | 16 | >>3. 
Full query log files are human-readable << 17 | ( ) TRUE 18 | (*) FALSE 19 | 20 | -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-migrate-cassandra-3-to-4/step7.md: -------------------------------------------------------------------------------- 1 | In this step, you will verify that the Cassandra node has been upgraded and that the data is still available. 2 | 3 | Verify that the version is 4.0.0. 4 | ``` 5 | nodetool version 6 | ```{{execute T1}} 7 | 8 | Make sure the node is in the *UP* and *NORMAL* (*UN*) state. 9 | ``` 10 | nodetool status | grep -v UN 11 | ```{{execute T1}} 12 | 13 | Verify that there are no errors. 14 | ``` 15 | grep -e "WARN" -e "ERROR" /usr/share/cassandra/logs/system.log 16 | ```{{execute T1}} 17 | 18 | Open a CQL shell. 19 | ``` 20 | cqlsh 21 | ```{{execute T1}} 22 | 23 | Use the keyspace. 24 | ``` 25 | USE united_states; 26 | ```{{execute T1}} 27 | 28 | Verify that the data has been loaded. 29 | ``` 30 | SELECT * FROM cities_by_state; 31 | ```{{execute T1}} 32 | 33 | If you can see the data, you have successfully upgraded from Cassandra 3.11.9 to 4.0.0! -------------------------------------------------------------------------------- /cql/step10.md: -------------------------------------------------------------------------------- 1 | If you are familiar with SQL, CQL may look quite similar. 2 | Indeed, there are many syntactic similarities between the two languages, but there are also many 3 | important differences. 
Here are just a few facts about CQL that highlight some of the differences: 4 | 5 | - CQL supports tables with single-row and multi-row partitions 6 | - CQL table primary key consists of a mandatory partition key and an optional clustering key 7 | - CQL does not support referential integrity constraints 8 | - CQL updates or inserts may result in upserts 9 | - CQL queries cannot retrieve data based on an arbitrary table column 10 | - CQL supports no joins or other binary operations 11 | - CQL CRUD operations are executed with a tunable consistency level 12 | - CQL supports lightweight transactions but not ACID transactions 13 | 14 | If some of the above facts do not sound familiar, you know that there is more to learn about CQL! 15 | 16 | -------------------------------------------------------------------------------- /cassandra-data-modeling/investment-data/step3.md: -------------------------------------------------------------------------------- 1 | Execute the CQL script to insert sample data: 2 | ```sql 3 | SOURCE '~/investment_data.cql' 4 | ```{{execute}} 5 | 6 | Retrieve all rows from table `accounts_by_user`: 7 | ```sql 8 | SELECT * FROM accounts_by_user; 9 | ```{{execute}} 10 | 11 | Retrieve all rows from table `positions_by_account`: 12 | ```sql 13 | SELECT * FROM positions_by_account; 14 | ```{{execute}} 15 | 16 | Retrieve all rows from table `trades_by_a_d`: 17 | ```sql 18 | SELECT * FROM trades_by_a_d; 19 | ```{{execute}} 20 | 21 | Retrieve all rows from table `trades_by_a_td`: 22 | ```sql 23 | SELECT * FROM trades_by_a_td; 24 | ```{{execute}} 25 | 26 | Retrieve all rows from table `trades_by_a_std`: 27 | ```sql 28 | SELECT * FROM trades_by_a_std; 29 | ```{{execute}} 30 | 31 | Retrieve all rows from table `trades_by_a_sd`: 32 | ```sql 33 | SELECT * FROM trades_by_a_sd; 34 | ```{{execute}} -------------------------------------------------------------------------------- /cassandra-fundamentals/cql/step10.md: 
-------------------------------------------------------------------------------- 1 | If you are familiar with SQL, CQL may look quite similar. 2 | Indeed, there are many syntactic similarities between the two languages, but there are also many 3 | important differences. Here are just a few facts about CQL that highlight some of the differences: 4 | 5 | - CQL supports tables with single-row and multi-row partitions 6 | - CQL table primary key consists of a mandatory partition key and an optional clustering key 7 | - CQL does not support referential integrity constraints 8 | - CQL updates or inserts may result in upserts 9 | - CQL queries cannot retrieve data based on an arbitrary table column 10 | - CQL supports no joins or other binary operations 11 | - CQL CRUD operations are executed with a tunable consistency level 12 | - CQL supports lightweight transactions but not ACID transactions 13 | 14 | If some of the above facts do not sound familiar, you know that there is more to learn about CQL! 15 | 16 | -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-migrate-cassandra-3-to-4/step1.md: -------------------------------------------------------------------------------- 1 | In this step, a script running in the background is installing JDK 8 and Cassandra 3.11.9. The script creates a *single-node* Cassandra cluster. The script performs the following actions: 2 | 3 | 1. Remove JDK 11 (the base image for this exercise has JDK 11 installed by default; Cassandra 3.x *does not* support JDK 11) 4 | 2. Install JDK 8 5 | 3. Install Cassandra 3.11.9 and configure environment variables 6 | 4. Start Cassandra 7 | 8 | Wait until you see the message *Cassandra setup complete*. (This may take a few minutes.) 9 | 10 | ![Setup Complete](./assets/setup-complete.jpg) 11 | 12 | Click to verify that the Cassandra version is 3.11.9. 
13 | ``` 14 | nodetool version 15 | ```{{execute T1}} 16 | 17 | You should see the correct version. 18 | ![Version 3.11.9](./assets/version.png) 19 | 20 | After verifying the version, clear the screen and continue to the next step. 21 | ``` 22 | clear 23 | ```{{execute T1}} 24 | -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-internode-message/step6.md: -------------------------------------------------------------------------------- 1 | Let's use _CQLSH_ to query the virtual tables. 2 | 3 | Notice that each Cassandra node has two local tables. 4 | Let's look at the tables in the second node. 5 | 6 | ``` 7 | cqlsh node2 8 | ```{{execute}} 9 | 10 | Here's the query to see the tables' contents. 11 | 12 | ``` 13 | SELECT * FROM system_views.internode_inbound; 14 | SELECT * FROM system_views.internode_outbound; 15 | ```{{execute}} 16 | 17 | Let's switch and look at the contents of the tables in the first node. 18 | 19 | ``` 20 | QUIT 21 | cqlsh node1 22 | ```{{execute}} 23 | 24 | Now, query the first node's tables. 25 | ``` 26 | SELECT * FROM system_views.internode_inbound; 27 | SELECT * FROM system_views.internode_outbound; 28 | ```{{execute}} 29 | 30 | Notice that the tables in the first node show the DC-East datacenter, whereas the tables in the second node showed the DC-West datacenter. 31 | 32 | Exit _CQLSH_. 33 | 34 | ``` 35 | QUIT 36 | ```{{execute}} 37 | -------------------------------------------------------------------------------- /cassandra-data-modeling/shopping-cart-data/step6.md: -------------------------------------------------------------------------------- 1 | Save an active shopping cart with name `My Birthday` and id `4e66baf8-f3ad-4c3b-9151-52be4574f2de`, 2 | and designate a different cart with name `Gifts for Mom` and id `19925cc1-4f8b-4a44-b893-2a49a8434fc8` to be a new active shopping cart for user `jen`: 3 | 4 |
5 | Solution 6 | 7 | ```sql 8 | BEGIN BATCH 9 | UPDATE carts_by_user 10 | SET cart_is_active = false 11 | WHERE user_id = 'jen' 12 | AND cart_name = 'My Birthday' 13 | AND cart_id = 4e66baf8-f3ad-4c3b-9151-52be4574f2de 14 | IF cart_is_active = true; 15 | UPDATE carts_by_user 16 | SET cart_is_active = true 17 | WHERE user_id = 'jen' 18 | AND cart_name = 'Gifts for Mom' 19 | AND cart_id = 19925cc1-4f8b-4a44-b893-2a49a8434fc8; 20 | APPLY BATCH; 21 | 22 | SELECT user_id, cart_name, 23 | cart_id, cart_is_active 24 | FROM carts_by_user 25 | WHERE user_id = 'jen'; 26 | ```{{execute}} 27 | 28 |
29 | 30 | 31 | -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-migrate-cassandra-3-to-4/quiz.md: -------------------------------------------------------------------------------- 1 | Here is a short quiz for you. 2 | 3 | >>1. What is a valid statement about upgrading a cluster from Cassandra 3.x to 4.x? << 4 | ( ) All nodes must be in the Down and Normal state. 5 | (*) All nodes must be in the Up and Normal state. 6 | ( ) All nodes must be in the Up and Leaving state. 7 | 8 | 9 | >>2. Which strategy can support zero-downtime migration? << 10 | ( ) Shut down the 3.x cluster and migrate all nodes at once. 11 | (*) Leave the 3.x cluster running and migrate nodes one at a time. 12 | ( ) Leave the 3.x cluster running, start the 4.x cluster, then migrate all nodes. 13 | 14 | 15 | >>3. What is a valid statement about upgrading a cluster from Cassandra 3.x to 4.x? << 16 | (*) Cassandra 4.x reads the existing Cassandra 3.x files. 17 | ( ) Cassandra 3.x SSTable files must be upgraded using nodetool. 18 | ( ) Cassandra 3.x SSTable files must be exported to Cassandra 4.x format. -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-audit-logging/quiz.md: -------------------------------------------------------------------------------- 1 | Here is a short quiz for you. 2 | 3 | >>1. What is a valid statement about audit log configuration in Cassandra? << 4 | ( ) Audit log configuration set with nodetool persists across server restarts. 5 | ( ) Audit log configuration set with nodetool applies to all nodes in a Cassandra cluster. 6 | (*) Audit log configuration set with nodetool overrides configuration set in cassandra.yaml. 7 | 8 | 9 | >>2. How should you handle sensitive data when sharing audit logs? << 10 | ( ) Audit logs do not contain sensitive data. 11 | (*) Manually redact sensitive data in the audit logs. 
12 | ( ) Use nodetool to redact specific fields in the audit logs. 13 | 14 | 15 | >>3. Which command disables the audit log for the finance keyspace? << 16 | ( ) nodetool auditlog --ignore finance 17 | ( ) nodetool auditlog --disable --keyspace finance 18 | (*) nodetool enableauditlog --excluded-keyspaces finance 19 | -------------------------------------------------------------------------------- /cassandra-fundamentals/queries/quiz.md: -------------------------------------------------------------------------------- 1 | Here is a short quiz for you. 2 | 3 | Q1. A table with a composite partition key can be queried using ... 4 | 5 | - [ ] A. only the first column of the partition key 6 | - [ ] B. any subset of partition key columns, as long as the primary key definition order is respected 7 | - [ ] C. all columns of the partition key 8 | 9 |
10 | Answer 11 |

C

12 |
13 | 14 |
15 | 16 | Q2. Inequality predicates are allowed on ... 17 | 18 | - [ ] A. partition key columns 19 | - [ ] B. clustering key columns 20 | - [ ] C. all table columns 21 | 22 |
23 | Answer 24 |

B

25 |
26 | 27 |
28 | 29 | Q3. Row ordering is only possible ... 30 | 31 | - [ ] A. on all primary key columns 32 | - [ ] B. within each partition 33 | - [ ] C. when explicitly specified in a query 34 | 35 |
36 | Answer 37 |

B

38 |
39 | -------------------------------------------------------------------------------- /cassandra-data-modeling/shopping-cart-data/step10.md: -------------------------------------------------------------------------------- 1 | Add item `Box2` into active cart `19925cc1-4f8b-4a44-b893-2a49a8434fc8` and update the cart subtotal to `111.50`: 2 | 3 |
4 | Solution 5 | 6 | ```sql 7 | BEGIN BATCH 8 | INSERT INTO items_by_cart ( 9 | cart_id, 10 | timestamp, 11 | item_id, 12 | item_name, 13 | item_description, 14 | item_price, 15 | quantity) 16 | VALUES ( 17 | 19925cc1-4f8b-4a44-b893-2a49a8434fc8, 18 | TOTIMESTAMP(NOW()), 19 | 'Box2', 20 | 'Chocolates', 21 | '25 gourmet chocolates from our collection', 22 | 60.00, 23 | 1); 24 | UPDATE items_by_cart 25 | SET cart_subtotal = 111.50 26 | WHERE cart_id = 19925cc1-4f8b-4a44-b893-2a49a8434fc8 27 | IF cart_subtotal = 51.50; 28 | APPLY BATCH; 29 | 30 | SELECT timestamp, item_id, item_price, 31 | quantity, cart_subtotal 32 | FROM items_by_cart 33 | WHERE cart_id = 19925cc1-4f8b-4a44-b893-2a49a8434fc8; 34 | ```{{execute}} 35 | 36 |
37 | 38 | 39 | -------------------------------------------------------------------------------- /cassandra-fundamentals/queries/step11.md: -------------------------------------------------------------------------------- 1 | Finally, a query can limit the number of rows that can be returned. 2 | This is doable with clauses `PER PARTITION LIMIT` and `LIMIT`, which can be used by themselves 3 | or together in the same query. 4 | 5 | Q1. Use no limits: 6 | ``` 7 | SELECT * FROM ratings_by_user 8 | WHERE email IN ('joe@datastax.com', 9 | 'jim@datastax.com'); 10 | ```{{execute}} 11 | 12 | Q2. Use the per partition limit: 13 | ``` 14 | SELECT * FROM ratings_by_user 15 | WHERE email IN ('joe@datastax.com', 16 | 'jim@datastax.com') 17 | PER PARTITION LIMIT 2; 18 | ```{{execute}} 19 | 20 | Q3. Use the overall limit: 21 | ``` 22 | SELECT * FROM ratings_by_user 23 | WHERE email IN ('joe@datastax.com', 24 | 'jim@datastax.com') 25 | LIMIT 3; 26 | ```{{execute}} 27 | 28 | Q4. Use both limits: 29 | ``` 30 | SELECT * FROM ratings_by_user 31 | WHERE email IN ('joe@datastax.com', 32 | 'jim@datastax.com') 33 | PER PARTITION LIMIT 2 34 | LIMIT 3; 35 | ```{{execute}} -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-full-query-logging/step1.md: -------------------------------------------------------------------------------- 1 | In this step, you will enable full query logging via *nodetool*. 2 | 3 | We've already started a single node Cassandra cluster for you in the background. When the command prompt appears in the terminal, the node is initialized and ready to go. 
4 | 5 | First, let's create a directory to store our full query log files: 6 | 7 | ``` 8 | mkdir /tmp/fqllogs 9 | ```{{execute}} 10 | 11 | Now you can connect to the node using nodetool and enable full query logging, using the directory we just created as the path: 12 | 13 | ``` 14 | nodetool enablefullquerylog --path /tmp/fqllogs 15 | ```{{execute}} 16 | 17 | To get a listing of the other options available on this command, execute the following: 18 | 19 | ``` 20 | nodetool help enablefullquerylog 21 | ```{{execute}} 22 | 23 | # Summary 24 | 25 | In this step, you enabled full query logging dynamically on a running Cassandra node using *nodetool* and learned about the available options on the `enablefullquerylog` command. 26 | -------------------------------------------------------------------------------- /cassandra-fundamentals/queries/step10.md: -------------------------------------------------------------------------------- 1 | Cassandra does not sort rows when executing queries. Instead, a query either preserves the clustering order or reverses it 2 | when retrieving rows from a table. Even when `ORDER BY` is not used, a query result still preserves the clustering order. 3 | Also, remember that the clustering order applies to rows within the same partition and does not apply to rows that belong 4 | to different partitions. 5 | 6 | For table `ratings_by_user` with `CLUSTERING ORDER BY (title ASC, year DESC)`, there are only two ordering options as shown below. 7 | 8 | Q1. Use the clustering order: 9 | ``` 10 | SELECT * FROM ratings_by_user 11 | WHERE email = 'jim@datastax.com' 12 | ORDER BY title ASC, year DESC; 13 | 14 | -- ORDER BY can be omitted 15 | SELECT * FROM ratings_by_user 16 | WHERE email = 'jim@datastax.com'; 17 | ```{{execute}} 18 | 19 | Q2. 
Use the reverse clustering order: 20 | ``` 21 | SELECT * FROM ratings_by_user 22 | WHERE email = 'jim@datastax.com' 23 | ORDER BY title DESC, year ASC; 24 | ```{{execute}} -------------------------------------------------------------------------------- /cassandra-data-modeling/shopping-cart-data/step4.md: -------------------------------------------------------------------------------- 1 | Find id and name of an active shopping cart that belongs to user `jen`: 2 | 3 |
4 | Solution 1 (preferred) 5 | 6 | ```sql 7 | -- Retrieve all carts for jen 8 | -- and scan the result set 9 | -- within an application 10 | -- to find an active cart. 11 | SELECT user_id, cart_name, 12 | cart_id, cart_is_active 13 | FROM carts_by_user 14 | WHERE user_id = 'jen'; 15 | ```{{execute}} 16 | 17 |
18 | 19 |
20 | 21 |
22 | Solution 2 23 | 24 | ```sql 25 | -- Retrieve all carts for jen 26 | -- and scan the result set 27 | -- within Cassandra 28 | -- to find an active cart. 29 | -- Note that this is a rare case of 30 | -- scanning within a small partition 31 | -- when ALLOW FILTERING 32 | -- might be acceptable. 33 | SELECT user_id, cart_name, 34 | cart_id, cart_is_active 35 | FROM carts_by_user 36 | WHERE user_id = 'jen' 37 | AND cart_is_active = true 38 | ALLOW FILTERING; 39 | ```{{execute}} 40 | 41 |
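Solution 1 leaves the filtering to the application. The sketch below shows what that client-side scan could look like in Python, assuming the rows for `jen` have already been fetched and are available as plain dictionaries. The helper name `find_active_cart` and the sample row values are illustrative assumptions, not part of the scenario, and no Cassandra driver is involved.

```python
from typing import Iterable, Optional


def find_active_cart(rows: Iterable[dict]) -> Optional[dict]:
    """Return the first row whose cart_is_active flag is true, or None."""
    for row in rows:
        if row.get("cart_is_active"):
            return row
    return None


# Hypothetical result set standing in for:
#   SELECT user_id, cart_name, cart_id, cart_is_active
#   FROM carts_by_user WHERE user_id = 'jen';
rows = [
    {"user_id": "jen", "cart_name": "Gifts for Mom",
     "cart_id": "19925cc1-4f8b-4a44-b893-2a49a8434fc8", "cart_is_active": False},
    {"user_id": "jen", "cart_name": "My Birthday",
     "cart_id": "4e66baf8-f3ad-4c3b-9151-52be4574f2de", "cart_is_active": True},
]

active = find_active_cart(rows)
if active is not None:
    print(active["cart_name"], active["cart_id"])
```

Because a user's partition holds only a handful of carts, scanning it in the application costs little and avoids `ALLOW FILTERING` altogether.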
-------------------------------------------------------------------------------- /cassandra-features-4x/virtual-tables/quiz.md: -------------------------------------------------------------------------------- 1 | Let's consolidate what you learnt in this hands-on experience 2 | with a short quiz! 3 | 4 | >>1. You can create your own virtual tables.<< 5 | ( ) True 6 | (*) False 7 | 8 | >>2. Virtual tables make using 'nodetool' obsolete.<< 9 | ( ) True 10 | (*) False 11 | 12 | >>3. How can you alter a setting in Cassandra 4.0? (Choose two.)<< 13 | [*] By using 'nodetool' 14 | [ ] By writing into the 'settings' virtual table 15 | [*] By changing 'cassandra.yaml' and restarting 16 | 17 | >>4. Virtual tables cannot coexist with regular tables in a given keyspace.<< 18 | (*) True 19 | ( ) False 20 | 21 | >>5. It is OK to use the clause 'ALLOW FILTERING' on a virtual table<< 22 | (*) True 23 | ( ) False 24 | 25 | >>6. You can close a client connection with a DELETE on the 'clients' virtual table<< 26 | ( ) True 27 | (*) False 28 | 29 | >>7. If you change a setting, the change is reflected in...<< 30 | [*] the output of 'nodetool' 31 | [ ] the 'cassandra.yaml' configuration file 32 | [*] the virtual table 'settings' 33 | -------------------------------------------------------------------------------- /cassandra-features-4x/virtual-tables/step5.md: -------------------------------------------------------------------------------- 1 | Virtual tables are also a great way to access all sorts of settings and 2 | configuration parameters for a Cassandra node. 3 | 4 | Let's turn our attention to the **read request timeout**, a quantity that 5 | specifies how long this node will wait before timing out when it's acting 6 | as read query coordinator. 
7 | 8 | You can look for the setting directly in the `cassandra.yaml` file: 9 | ``` 10 | grep "read_request_timeout_in_ms:" /etc/cassandra/cassandra.yaml 11 | ```{{execute T1}} 12 | 13 | Alternatively you can use the corresponding `get*` 14 | operation offered by `nodetool`: 15 | ``` 16 | nodetool gettimeout read 17 | ```{{execute T1}} 18 | 19 | With virtual tables, the same information is now available with a `SELECT`: 20 | ``` 21 | SELECT * FROM system_views.settings 22 | WHERE name = 'read_request_timeout_in_ms'; 23 | ```{{execute T2}} 24 | 25 | In most situations, the default setting (5000 milliseconds) 26 | is perfectly fine; however, there may be exceptions, as we will soon see. 27 | 28 | -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-internode-message/quiz.md: -------------------------------------------------------------------------------- 1 | Here is a short quiz for you. 2 | 3 | >>1. Which is NOT true about Cassandra 4.X internode messaging? << 4 | (*) Internode messaging uses ESP 5 | ( ) Internode messaging improvements retired technical debt 6 | ( ) Internode messaging now employs non-blocking IO 7 | 8 | >>2. Which of the following is a way in which Cassandra 4.X exposes internode messaging metrics? << 9 | ( ) Via email 10 | (*) Using virtual tables 11 | ( ) Using ESP, just like the messages themselves 12 | 13 | >>3. Which one of the following is true about internode messaging metrics? << 14 | ( ) In a Cassandra cluster, the total of all inbound bytes received will appear to exceed the bytes sent because of inefficiencies 15 | (*) In a Cassandra cluster, the total of all outbound bytes sent will appear to be greater than or equal to the total bytes received 16 | ( ) In a Cassandra cluster, the total of all inbound bytes received will never appear to equal the total bytes sent because the receiving node is always behind the sending node. 
17 | -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-migrate-cassandra-3-to-4/step6.md: -------------------------------------------------------------------------------- 1 | In this step, you will configure `cassandra.yaml` and start Cassandra. 2 | 3 | Use `sed` to modify the number of virtual nodes in the server. The 3.x cluster had 256 and the 4.0 cluster is set to 16 by default. Set `num_tokens` to 256 in Cassandra 4.x 4 | ``` 5 | sed -i 's/num_tokens: 16/num_tokens: 256/' /usr/share/cassandra4/conf/cassandra.yaml 6 | ```{{execute T1}} 7 | 8 | Use `sed` to point the new cluster to the old datafiles. 9 | ``` 10 | sed -i 's/# data_file_directories:/data_file_directories:/' /usr/share/cassandra4/conf/cassandra.yaml 11 | sed -i 's/# - \/var\/lib\/cassandra\/data/ - \/usr\/share\/cassandra\/data\/data/' /usr/share/cassandra4/conf/cassandra.yaml 12 | ```{{execute T1}} 13 | 14 | Start the Cassandra 4.x server. 15 | ``` 16 | cassandra -R 17 | ```{{execute T1}} 18 | 19 | Look for the message, *state jump to NORMAL* to indicate that the server is running. 20 | ![Version 3.11.9](./assets/normal.jpg) 21 | 22 | Clear the screen and continue. 23 | ``` 24 | clear 25 | ```{{execute T1}} -------------------------------------------------------------------------------- /cassandra-features-4x/virtual-tables/step6.md: -------------------------------------------------------------------------------- 1 | _A heavy exceptional maintenance task is starting on your node: 2 | you need to temporarily 3 | raise the read timeout to 18 seconds until the task is over. 
This way, you'll 4 | prevent the application from repeatedly timing out 5 | (increased latency is more acceptable in your case)._ 6 | 7 | You sure don't want to edit `cassandra.yaml` and restart the nodes, so you 8 | decide to change this setting on the fly, with: 9 | ``` 10 | nodetool settimeout read 18000 11 | ```{{execute T1}} 12 | 13 | (Remember: writing to a virtual table is not supported... yet.) 14 | 15 | Now, does `cassandra.yaml` automatically reflect this change? 16 | ``` 17 | grep "read_request_timeout_in_ms:" /etc/cassandra/cassandra.yaml 18 | ```{{execute T1}} 19 | 20 | Does `nodetool` itself reflect the change? 21 | ``` 22 | nodetool gettimeout read 23 | ```{{execute T1}} 24 | 25 | Does the virtual-table method give you the newly-set value of 18000? 26 | ``` 27 | SELECT * FROM system_views.settings WHERE name = 'read_request_timeout_in_ms'; 28 | ```{{execute T2}} 29 | -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-migrate-cassandra-3-to-4/foreground.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | echo "Downgrade Java to JDK 8" 4 | 5 | sudo rm -r /usr/lib/jvm 6 | apt-get update > /dev/null 2>&1 7 | sudo apt-get -y install openjdk-8-jdk openjdk-8-jre < "/dev/null" > /dev/null 2>&1 8 | 9 | until [[ -e /usr/lib/jvm/java-8-openjdk-amd64 && -e /usr/bin/java ]] 10 | do 11 | sleep 1 12 | echo -n '.' 
13 | done 14 | 15 | echo "Install Cassandra 3.11.9" 16 | 17 | wget https://archive.apache.org/dist/cassandra/3.11.9/apache-cassandra-3.11.9-bin.tar.gz < "/dev/null" > /dev/null 2>&1 18 | tar xzf apache-cassandra-3.11.9-bin.tar.gz 19 | mv apache-cassandra-3.11.9 /usr/share/cassandra 20 | rm apache-cassandra-3.11.9-bin.tar.gz 21 | export JAVA_HOME="/usr" 22 | export PATH="$PATH:/usr/bin:/usr/share/cassandra/bin:/usr/share/cassandra/tools/bin" 23 | 24 | echo "Start Cassandra" 25 | 26 | cassandra -R < "/dev/null" > /dev/null 2>&1 27 | 28 | while [ `grep "Starting listening for CQL clients" /usr/share/cassandra/logs/system.log | wc -l` -lt 1 ]; do sleep 10; done 29 | 30 | echo "Cassandra setup complete" -------------------------------------------------------------------------------- /cassandra-fundamentals/queries/step5.md: -------------------------------------------------------------------------------- 1 | Table `movies` stores information about movies, which are uniquely identified by their titles and release years. 2 | This table has single-row partitions and 3 | the primary key defined as `PRIMARY KEY ((title, year))`. 4 | Let's first retrieve all rows from the table to learn what the data looks like and then focus 5 | on predicates that the primary key can support. 6 | 7 | Q1. Retrieve all rows: 8 |
9 | Solution 10 | 11 | ``` 12 | SELECT * FROM movies; 13 | ```{{execute}} 14 | 15 |
16 | 17 |
18 | 19 | Q2. Retrieve one row/partition: 20 |
21 | Solution 22 | 23 | ``` 24 | SELECT * FROM movies 25 | WHERE title = 'Alice in Wonderland' 26 | AND year = 2010; 27 | ```{{execute}} 28 | 29 |
30 | 31 |
32 | 33 | Q3. Retrieve two rows/partitions: 34 |
35 | Solution 36 | 37 | ``` 38 | SELECT * FROM movies 39 | WHERE title = 'Alice in Wonderland' 40 | AND year IN (2010, 1951); 41 | ```{{execute}} 42 | 43 |
44 | 45 | 46 | -------------------------------------------------------------------------------- /cassandra-features-4x/virtual-tables/step7.md: -------------------------------------------------------------------------------- 1 | _Now the maintenance is completed and the node load is back to normal. 2 | You could simply revert the timeout setting to its default with 3 | a `nodetool settimeout read 5000` command._ 4 | 5 | Let's try instead restarting Cassandra on this node to see 6 | if the timeout gets reset to the default of 5 seconds: 7 | 8 | ``` 9 | systemctl restart cassandra 10 | ```{{execute T1}} 11 | 12 | Wait until `nodetool status` reports state `UN` (=Up, Normal) again: 13 | ``` 14 | nodetool status 15 | ```{{execute T1}} 16 | 17 | Now let's look at the timeout value as read through the `system_views.settings` 18 | virtual table: 19 | ``` 20 | SELECT * FROM system_views.settings WHERE name = 'read_request_timeout_in_ms'; 21 | ```{{execute T2}} 22 | _(Note: the node may appear unavailable for a short while, in which case you can 23 | repeat this `SELECT` command to see the results.)_ 24 | 25 | Compare with the output that `nodetool` provides: 26 | ``` 27 | nodetool gettimeout read 28 | ```{{execute T1}} 29 | 30 | Has the setting reverted to the default after restarting? 31 | -------------------------------------------------------------------------------- /cassandra-data-modeling/messaging-data/step7.md: -------------------------------------------------------------------------------- 1 | Find an attachment file with name `Budget.xlsx` for an email with id `8ae31dd0-d361-11ea-a40e-5dd6331dfc45`, assuming that the complete file is stored in one partition with chunk number `1`: 2 | 3 |
4 | Solution 5 | 6 | ```sql 7 | SELECT filename, type, value, 8 | blobAsText(value) 9 | FROM attachments 10 | WHERE email_id = 8ae31dd0-d361-11ea-a40e-5dd6331dfc45 11 | AND filename = 'Budget.xlsx' 12 | AND chunk_number = 1; 13 | ```{{execute}} 14 | 15 |
16 | 17 |
18 | 19 | Find an attachment file with name `Presentation.pptx` for an email with id `8ae31dd0-d361-11ea-a40e-5dd6331dfc45`, assuming that the three file chunks are stored across three partitions with chunk numbers `1`, `2` and `3`: 20 | 21 |
22 | Solution 23 | 24 | ```sql 25 | SELECT filename, type, value, 26 | blobAsText(value) 27 | FROM attachments 28 | WHERE email_id = 8ae31dd0-d361-11ea-a40e-5dd6331dfc45 29 | AND filename = 'Presentation.pptx' 30 | AND chunk_number IN (1,2,3); 31 | ```{{execute}} 32 | 33 |
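When a file is split across several partitions, the application still has to put the chunks back together after the query returns. A minimal Python sketch of that step follows, assuming each returned row has been reduced to a `(chunk_number, value)` pair. The helper name `reassemble` and the sample byte strings are hypothetical. The chunks are sorted explicitly so the sketch does not depend on the order in which rows arrive.

```python
from typing import Iterable, Tuple


def reassemble(chunks: Iterable[Tuple[int, bytes]]) -> bytes:
    """Concatenate chunk values in ascending chunk_number order."""
    ordered = sorted(chunks, key=lambda chunk: chunk[0])
    return b"".join(value for _, value in ordered)


# Hypothetical (chunk_number, value) pairs, deliberately out of order.
chunks = [(2, b"lo, "), (3, b"world"), (1, b"Hel")]

print(reassemble(chunks))
```

A real client would also want to confirm that no chunk number is missing before treating the result as the complete file.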
-------------------------------------------------------------------------------- /cassandra-data-modeling/order-management-data/step3.md: -------------------------------------------------------------------------------- 1 | Execute the CQL script to insert sample data: 2 | ```sql 3 | SOURCE '~/order_management_data.cql' 4 | ```{{execute}} 5 | 6 | Retrieve all rows from table `orders_by_user`: 7 | ```sql 8 | SELECT * FROM orders_by_user; 9 | ```{{execute}} 10 | 11 | Retrieve all rows from table `orders_by_id`: 12 | ```sql 13 | EXPAND ON; 14 | 15 | SELECT 16 | order_id, 17 | item_name, 18 | item_id, 19 | item_description, 20 | item_price, 21 | item_quantity, 22 | order_status, 23 | order_timestamp, 24 | order_subtotal, 25 | order_shipping, 26 | order_tax, 27 | order_total, 28 | payment_summary, 29 | payment_details, 30 | billing_summary, 31 | billing_details, 32 | shipping_summary, 33 | shipping_details, 34 | delivery_id, 35 | delivery_details 36 | FROM orders_by_id; 37 | 38 | EXPAND OFF; 39 | ```{{execute}} 40 | 41 | Retrieve all rows from table `orders_by_user_item`: 42 | ```sql 43 | SELECT * FROM orders_by_user_item; 44 | ```{{execute}} 45 | 46 | Retrieve all rows from table `order_status_history_by_id`: 47 | ```sql 48 | SELECT * FROM order_status_history_by_id; 49 | ```{{execute}} 50 | -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-repair-improvements/quiz.md: -------------------------------------------------------------------------------- 1 | Let's complement this scenario with a small quiz about what you have learnt! 2 | 3 | >>1. One should perform data repair: << 4 | ( ) When a node restarts after crashing 5 | ( ) Under exceptional circumstances of various kinds 6 | (*) Periodically, similar to other ordinary maintenance operations 7 | ( ) Just after a heavy bulk write/data migration 8 | 9 | >>2. To check if an SSTable has been repaired already, one can... 
<< 10 | ( ) Execute a SHOW REPAIRED TABLECHUNKS command in cqlsh 11 | (*) Use the sstablemetadata command-line tool 12 | ( ) Look into the system.repairs table 13 | ( ) Do nothing: there is no way to get this information 14 | 15 | >>3. Incremental repair in Cassandra 4.0 is structured with: << 16 | ( ) A transaction; two SSTable pools (repaired/non-repaired) 17 | ( ) No transaction; two SSTable pools (repaired/non-repaired) 18 | (*) A transaction; three SSTable pools (repaired/pending/non-repaired) 19 | ( ) No transaction; three SSTable pools (repaired/pending/non-repaired) 20 | 21 | >>4. The system.repairs table is a virtual table: its contents are potentially different on each node. << 22 | ( ) True 23 | (*) False 24 | -------------------------------------------------------------------------------- /cassandra-data-modeling/shopping-cart-data/step2.md: -------------------------------------------------------------------------------- 1 | Create table `carts_by_user`: 2 | ```sql 3 | CREATE TABLE carts_by_user ( 4 | user_id TEXT, 5 | cart_name TEXT, 6 | cart_id UUID, 7 | cart_is_active BOOLEAN, 8 | user_email TEXT STATIC, 9 | PRIMARY KEY ((user_id),cart_name,cart_id) 10 | ); 11 | ```{{execute}} 12 | 13 | Create table `items_by_id`: 14 | ```sql 15 | CREATE TABLE items_by_id ( 16 | id TEXT, 17 | name TEXT, 18 | description TEXT, 19 | price DECIMAL, 20 | PRIMARY KEY ((id)) 21 | ); 22 | ```{{execute}} 23 | 24 | Create materialized view `items_by_name`: 25 | ```sql 26 | CREATE MATERIALIZED VIEW items_by_name 27 | AS 28 | SELECT * FROM items_by_id 29 | WHERE name IS NOT NULL 30 | AND id IS NOT NULL 31 | PRIMARY KEY ((name), id); 32 | ```{{execute}} 33 | 34 | 35 | Create table `items_by_cart`: 36 | ```sql 37 | CREATE TABLE items_by_cart ( 38 | cart_id UUID, 39 | timestamp TIMESTAMP, 40 | item_id TEXT, 41 | item_name TEXT, 42 | item_description TEXT, 43 | item_price DECIMAL, 44 | quantity INT, 45 | cart_subtotal DECIMAL STATIC, 46 | PRIMARY KEY ((cart_id),timestamp,item_id) 
47 | ) WITH CLUSTERING ORDER BY (timestamp DESC, item_id ASC); 48 | ```{{execute}} -------------------------------------------------------------------------------- /cassandra-features-4x/virtual-tables/step1.md: -------------------------------------------------------------------------------- 1 | _Note: wait until you see the message "Cassandra has started" in the 2 | terminal before proceeding._ 3 | 4 | First, verify that Cassandra is properly installed on this machine and is running 5 | as a system service. To do so, you can ask your operating system's daemon 6 | manager: 7 | 8 | ``` 9 | systemctl status cassandra --no-pager 10 | ```{{execute T1}} 11 | 12 | Look for a green circle and `Active (running)` in the output. 13 | 14 | Even better, you can use `nodetool`, Cassandra's utility for node administration: 15 | 16 | ``` 17 | nodetool status 18 | ```{{execute T1}} 19 | 20 | The output should show that the current node (which forms a cluster by itself) 21 | is in a status "UN" (meaning Up and Normal). 22 | 23 | _Make sure Cassandra is completely started before proceeding._ 24 | 25 | Please run the following to initialize the other terminals of this scenario: 26 | ``` 27 | echo Initializing terminal 2 28 | ```{{execute T2}} 29 | 30 | ``` 31 | echo Initializing terminal 3 32 | ```{{execute T3}} 33 | 34 | You will use `cqlsh` several times during this exercise. 
So let us open a 35 | `cqlsh` console and keep it running on the second terminal: 36 | 37 | ``` 38 | cqlsh 39 | ```{{execute T2}} 40 | -------------------------------------------------------------------------------- /cassandra-data-modeling/sensor-data/step2.md: -------------------------------------------------------------------------------- 1 | Create table `networks`: 2 | ```sql 3 | CREATE TABLE networks ( 4 | bucket TEXT, 5 | name TEXT, 6 | description TEXT, 7 | region TEXT, 8 | num_sensors INT, 9 | PRIMARY KEY ((bucket),name) 10 | ); 11 | ```{{execute}} 12 | 13 | Create table `temperatures_by_network`: 14 | ```sql 15 | CREATE TABLE temperatures_by_network ( 16 | network TEXT, 17 | week DATE, 18 | date_hour TIMESTAMP, 19 | sensor TEXT, 20 | avg_temperature FLOAT, 21 | latitude DECIMAL, 22 | longitude DECIMAL, 23 | PRIMARY KEY ((network,week),date_hour,sensor) 24 | ) WITH CLUSTERING ORDER BY (date_hour DESC, sensor ASC); 25 | ```{{execute}} 26 | 27 | Create table `sensors_by_network`: 28 | ```sql 29 | CREATE TABLE sensors_by_network ( 30 | network TEXT, 31 | sensor TEXT, 32 | latitude DECIMAL, 33 | longitude DECIMAL, 34 | characteristics MAP, 35 | PRIMARY KEY ((network),sensor) 36 | ); 37 | ```{{execute}} 38 | 39 | 40 | Create table `temperatures_by_sensor`: 41 | ```sql 42 | CREATE TABLE temperatures_by_sensor ( 43 | sensor TEXT, 44 | date DATE, 45 | timestamp TIMESTAMP, 46 | value FLOAT, 47 | PRIMARY KEY ((sensor,date),timestamp) 48 | ) WITH CLUSTERING ORDER BY (timestamp DESC); 49 | ```{{execute}} -------------------------------------------------------------------------------- /cassandra-data-modeling/messaging-data/step2.md: -------------------------------------------------------------------------------- 1 | Create table `folders_by_user`: 2 | ```sql 3 | CREATE TABLE folders_by_user ( 4 | username TEXT, 5 | label TEXT, 6 | color TEXT, 7 | PRIMARY KEY ((username),label) 8 | ); 9 | ```{{execute}} 10 | 11 | Create table `unread_email_stats`: 12 | ```sql 13 | 
CREATE TABLE unread_email_stats ( 14 | username TEXT, 15 | label TEXT, 16 | num_unread COUNTER, 17 | PRIMARY KEY ((username),label) 18 | ); 19 | ```{{execute}} 20 | 21 | Create table `emails_by_user_folder`: 22 | ```sql 23 | CREATE TABLE emails_by_user_folder ( 24 | username TEXT, 25 | label TEXT, 26 | id TIMEUUID, 27 | "from" TEXT, 28 | subject TEXT, 29 | is_read BOOLEAN, 30 | PRIMARY KEY ((username,label),id) 31 | ) WITH CLUSTERING ORDER BY (id DESC); 32 | ```{{execute}} 33 | 34 | Create table `emails`: 35 | ```sql 36 | CREATE TABLE emails ( 37 | id TIMEUUID, 38 | "to" LIST, 39 | "from" TEXT, 40 | subject TEXT, 41 | body TEXT, 42 | attachments MAP, 43 | PRIMARY KEY ((id)) 44 | ); 45 | ```{{execute}} 46 | 47 | Create table `attachments`: 48 | ```sql 49 | CREATE TABLE attachments ( 50 | email_id TIMEUUID, 51 | filename TEXT, 52 | chunk_number INT, 53 | type TEXT, 54 | value BLOB, 55 | PRIMARY KEY ((email_id,filename,chunk_number)) 56 | ); 57 | ```{{execute}} -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-audit-logging/index.json: -------------------------------------------------------------------------------- 1 | { 2 | "private": false, 3 | "title": "Audit Logging", 4 | "description": "New Features in Cassandra 4", 5 | "difficulty": "Easy", 6 | "time": "15 minutes", 7 | "details": { 8 | "assets": { 9 | "host01": [ 10 | {"file": "wait.sh", "target": "/usr/local/bin", "chmod": "+x"} 11 | ] 12 | }, 13 | "steps": [ 14 | { 15 | "title": "Setup", 16 | "text": "step1.md" 17 | }, 18 | { 19 | "title": "Enable Audit Logging Dynamically", 20 | "text": "step2.md" 21 | }, 22 | { 23 | "title": "Enable Audit Logging in cassandra.yaml", 24 | "text": "step3.md" 25 | }, 26 | { 27 | "title": "Configuring Audit Logging", 28 | "text": "step4.md" 29 | }, 30 | { 31 | "title": "Test Your Understanding", 32 | "text": "quiz.md" 33 | } 34 | ], 35 | "intro": { 36 | "courseData": "background.sh", 37 | "code": "foreground.sh", 
38 | "text": "intro.md" 39 | }, 40 | "finish": { 41 | "text": "finish.md" 42 | } 43 | }, 44 | "environment": { 45 | "uilayout": "editor-terminal", 46 | "uieditorpath": "/" 47 | }, 48 | "backend": { 49 | "imageid": "ubuntu:1804" 50 | } 51 | } 52 | -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-internode-message/step5.md: -------------------------------------------------------------------------------- 1 | One visible aspect of the Cassandra 4.X internode messaging improvements is that internode messaging metrics are now available as virtual tables. 2 | In this step, we'll show you what we mean. 3 | 4 | Each node in the Cassandra cluster has virtual tables in which Cassandra keeps internode messaging metrics. 5 | One table, internode_inbound, keeps track of inbound messaging metrics, and internode_outbound keeps track of the outbound metrics. 6 | Both tables are in the system_views keyspace. 7 | 8 | Note that these are not real tables. 9 | They merely _appear_ as tables to allow access to the metrics they contain. 10 | 11 | We'll use _CQLSH_ to look at these tables. 12 | 13 | ``` 14 | cqlsh node1 15 | ```{{execute}} 16 | 17 | In _CQLSH_, the following command shows what the inbound table looks like. 18 | 19 | ``` 20 | DESCRIBE TABLE system_views.internode_inbound; 21 | ```{{execute}} 22 | 23 | Here's the outbound table. 24 | 25 | ``` 26 | DESCRIBE TABLE system_views.internode_outbound; 27 | ```{{execute}} 28 | 29 | Notice that these descriptions are embedded within comments. 30 | This is because the tables are virtual and were never actually created. 31 | 32 | Exit _CQLSH_ using the following command. 
33 | 34 | ``` 35 | QUIT 36 | ```{{execute}} 37 | -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-internode-message/step3.md: -------------------------------------------------------------------------------- 1 | Besides changing the way threads receive messages, Cassandra developers did a lot of cleanup and tuning of the internode message code path. 2 | 3 | As developers work on code and make changes, sometimes the code can become a bit brittle or inefficient. 4 | Developers refer to this as _Technical Debt_. 5 | 6 | It's good to retire technical debt by refactoring or cleaning up the code, and that is exactly what developers did with the internode message code for the 4.X release. 7 | The benefits of retiring technical debt include: 8 | * More efficient code, which means the code requires less processing 9 | * Code that is easier to read and understand so future changes are easier 10 | * Code that is more robust, yielding faster and more predictable response times 11 | 12 | The Cassandra 4.X cleanup includes: 13 | * Protocol improvements that remove redundant information and make the protocol more efficient 14 | * Handling corner cases where code didn't deal gracefully with exceptions 15 | * Buffer optimization that reduces memory requirements due to internode messaging 16 | * Introduction of messaging timeouts under certain conditions 17 | * Optimizations that allow a node to bypass long code paths when sending messages to itself 18 | 19 | The bottom line for these changes is Cassandra is faster, more efficient and more robust! 
20 | -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-full-query-logging/index.json: -------------------------------------------------------------------------------- 1 | { 2 | "title": "Full Query Logging", 3 | "description": "New Features in Cassandra 4", 4 | "difficulty": "Easy", 5 | "time": "15 minutes", 6 | "details": { 7 | "assets": { 8 | "host01": [ 9 | {"file": "wait.sh", "target": "/usr/local/bin", "chmod": "+x"} 10 | ] 11 | }, 12 | "steps": [ 13 | { 14 | "title": "Enable Full Query Logging via nodetool", 15 | "text": "step1.md" 16 | }, 17 | { 18 | "title": "Create Schema and Perform Queries", 19 | "text": "step2.md" 20 | }, 21 | { 22 | "title": "Use fqltool to review Full Query Logs", 23 | "text": "step3.md" 24 | }, 25 | { 26 | "title": "Configure Full Query Logging via cassandra.yaml", 27 | "text": "step4.md" 28 | }, 29 | { 30 | "title": "Test Your Understanding", 31 | "text": "quiz.md" 32 | } 33 | ], 34 | "intro": { 35 | "text": "intro.md", 36 | "courseData": "background.sh", 37 | "code": "foreground.sh" 38 | }, 39 | "finish": { 40 | "text": "finish.md" 41 | } 42 | }, 43 | "environment": { 44 | "uilayout": "editor-terminal", 45 | "uieditorpath": "/" 46 | }, 47 | "backend": { 48 | "imageid": "datastax-oss-cassandra" 49 | } 50 | } 51 | -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-internode-message/step7.md: -------------------------------------------------------------------------------- 1 | Since there are only two nodes, you would expect that the number of bytes sent from one node should be equal to the number of bytes received by the other node. 2 | Let's see if we can demonstrate that. 3 | 4 | We have prepared two files containing the CQL queries for these tables. 5 | We will run these queries on separate nodes nearly simultaneously and look at the results. 6 | 7 | Take a look at the inbound query. 
8 | 9 | ``` 10 | cat in.cql 11 | ```{{execute}} 12 | 13 | You see we are only looking at two fields: the number of operations and the number of bytes. 14 | Isolating these metrics makes it a little easier to compare the results across nodes. 15 | 16 | 17 | Here's the outbound query 18 | 19 | ``` 20 | cat out.cql 21 | ```{{execute}} 22 | 23 | Here's the command to execute both of these queries on separate nodes nearly simultaneously. 24 | 25 | ``` 26 | cqlsh node1 -f in.cql; cqlsh node2 -f out.cql 27 | ```{{execute}} 28 | 29 | Often, the number of bytes written will exceed the number of bytes read. 30 | You can make sense of this by considering the number of operations. 31 | You see that the number of write operations often exceeds the number of read operations (until the read node catches up). 32 | 33 | Re-run the queries (by clicking above) until the number of operations is the same for both nodes. 34 | You see that the number of bytes transferred also matches. 35 | -------------------------------------------------------------------------------- /cassandra-fundamentals/queries/step1.md: -------------------------------------------------------------------------------- 1 | CQL queries look just like SQL queries. However, while you will see familiar clauses `SELECT`, `FROM`, `WHERE`, `GROUP BY` 2 | and `ORDER BY`, CQL queries are much more restrictive in what goes into those clauses. 3 | 4 | A CQL query can only retrieve data from a single table, so there are no joins, self-joins, nested queries, unions, intersections and so forth. 5 | Moreover, only columns that are declared in table's `PRIMARY KEY` definition can be used to filter, group or order rows. 6 | The *primary key definition order* must be respected when filtering and grouping, such that a complete partition key must be used and 7 | when a clustering key column is used, any preceding clustering column in the primary key definition must also be used. 
8 | When ordering rows, the *clustering order* declared in the table definition must be respected. Ordering only applies to rows within a partition and 9 | can be either preserved or reversed. 10 | 11 | These restrictions ensure that your queries only use efficient data access patterns, which include *retrieving one row*, 12 | *retrieving all rows or a subset of rows from one partition* and *retrieving rows from at most a few partitions*. 13 | The smaller the number of partitions a query touches, the better performance and throughput can be expected. When studying 14 | our query examples in this tutorial, pay attention to data access patterns they implement. 15 | 16 | -------------------------------------------------------------------------------- /cassandra-data-modeling/structure.json: -------------------------------------------------------------------------------- 1 | { 2 | "title": "Data Modeling By Example", 3 | "description": "Learn how to create efficient and scalable Cassandra data models for IoT, e-commerce, finance, and more.", 4 | "items": [ 5 | { "path": "sensor-data", 6 | "title": "Sensor Data Modeling", 7 | "description": "Learn how to create a data model for temperature monitoring sensor networks" }, 8 | { "path": "messaging-data", 9 | "title": "Messaging Data Modeling", 10 | "description": "Learn how to create a data model for an email system" }, 11 | { "path": "music-data", 12 | "title": "Digital Library Data Modeling", 13 | "description": "Learn how to create a data model for a digital music library" }, 14 | { "path": "investment-data", 15 | "title": "Investment Portfolio Data Modeling", 16 | "description": "Learn how to create a data model for investment accounts or portfolios" }, 17 | { "path": "time-series-data", 18 | "title": "Time Series Data Modeling", 19 | "description": "Learn how to create a data model for time series data" }, 20 | { "path": "shopping-cart-data", 21 | "title": "Shopping Cart Data Modeling", 22 | "description": "Learn how 
to create a data model for online shopping carts" }, 23 | { "path": "order-management-data", 24 | "title": "Order Management Data Modeling", 25 | "description": "Learn how to create a data model for an order management system" } 26 | ] 27 | } -------------------------------------------------------------------------------- /cassandra-data-modeling/order-management-data/step8.md: -------------------------------------------------------------------------------- 1 | Cancel order `113-3827060-8722206` placed by user `joe` on `2020-11-17` at `22:20:43` by updating its status from `pending` to `canceled`: 2 | 3 |
4 | Solution 5 | 6 |

Step 1. Update the "source-of-truth" table using a lightweight transaction:

7 | 8 | ```sql 9 | UPDATE orders_by_id 10 | SET order_status = 'canceled' 11 | WHERE order_id = '113-3827060-8722206' 12 | IF order_status = 'pending'; 13 | ```{{execute}} 14 | 15 | 16 |

Step 2. Update the other tables if and only if the previous transaction was successfully applied:

17 | 18 | ```sql 19 | UPDATE orders_by_user 20 | SET order_status = 'canceled' 21 | WHERE order_id = '113-3827060-8722206' 22 | AND user_id = 'joe' 23 | AND order_timestamp = '2020-11-17 22:20:43'; 24 | 25 | INSERT INTO order_status_history_by_id (order_id, status_timestamp, order_status) 26 | VALUES ('113-3827060-8722206',TOTIMESTAMP(NOW()),'canceled'); 27 | ```{{execute}} 28 | 29 |
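The "if and only if" condition in Step 2 can also be checked from a script rather than by eye. The following is a minimal sketch, not part of the scenario itself: `lwt_applied` is a hypothetical helper, and it assumes `cqlsh` prints the result of a lightweight transaction as an `[applied]` header followed by a row that starts with `True` or `False`. The keyspace name in the usage comment is likewise an assumption.

```bash
# Hypothetical helper: reads cqlsh output on stdin and succeeds only when
# the first row under the [applied] header is True.
lwt_applied() {
  awk '
    /\[applied\]/     { hdr = 1; next }    # header row of the LWT result
    hdr && /^[-+ ]+$/ { next }             # separator row under the header
    hdr && NF         { print $1; exit }   # first data row: True or False
  ' | grep -qx 'True'
}

# Usage sketch (keyspace name is an assumption):
# cqlsh -k order_management_data \
#   -e "UPDATE orders_by_id SET order_status = 'canceled' WHERE order_id = '113-3827060-8722206' IF order_status = 'pending';" \
#   | lwt_applied && echo "applied: safe to run Step 2"
```

If the helper fails, the order was not in the `pending` state (or the output could not be parsed), so the Step 2 statements should be skipped.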

Step 3. Optionally, verify the changes:

30 | 31 | ```sql 32 | SELECT order_status 33 | FROM orders_by_id 34 | WHERE order_id = '113-3827060-8722206'; 35 | 36 | SELECT order_status 37 | FROM orders_by_user 38 | WHERE order_id = '113-3827060-8722206' 39 | AND user_id = 'joe' 40 | AND order_timestamp = '2020-11-17 22:20:43'; 41 | 42 | SELECT order_status 43 | FROM order_status_history_by_id 44 | WHERE order_id = '113-3827060-8722206' 45 | LIMIT 1; 46 | ```{{execute}} 47 | 48 |
49 | 50 | 51 | -------------------------------------------------------------------------------- /cassandra-data-modeling/sensor-data/index.json: -------------------------------------------------------------------------------- 1 | { 2 | "title": "Sensor Data Modeling Example for Cassandra", 3 | "description": "Explore how IoT sensor data can be stored and queried in Cassandra NoSQL database", 4 | "difficulty": "Beginner", 5 | "time": "15 minutes", 6 | "details": { 7 | "assets": { 8 | "host01": [ 9 | {"file": "wait.sh", "target": "/usr/local/bin/", "chmod": "+x"}, 10 | {"file": "sensor_data.cql", "target": "/root/"} 11 | ] 12 | }, 13 | "steps": [ 14 | { 15 | "title": "Create a keyspace", 16 | "text": "step1.md" 17 | }, 18 | { 19 | "title": "Create tables", 20 | "text": "step2.md" 21 | }, 22 | { 23 | "title": "Populate tables", 24 | "text": "step3.md" 25 | }, 26 | { 27 | "title": "Design query Q1", 28 | "text": "step4.md" 29 | }, 30 | { 31 | "title": "Design query Q2", 32 | "text": "step5.md" 33 | }, 34 | { 35 | "title": "Design query Q3", 36 | "text": "step6.md" 37 | }, 38 | { 39 | "title": "Design query Q4", 40 | "text": "step7.md" 41 | } 42 | ], 43 | "intro": { 44 | "courseData": "background.sh", 45 | "code": "foreground.sh", 46 | "text": "intro.md" 47 | }, 48 | "finish": { 49 | "text": "finish.md" 50 | } 51 | }, 52 | "environment": { 53 | "uilayout": "terminal" 54 | }, 55 | "backend": { 56 | "imageid": "ubuntu20.04" 57 | } 58 | } 59 | -------------------------------------------------------------------------------- /cassandra-data-modeling/messaging-data/index.json: -------------------------------------------------------------------------------- 1 | { 2 | "title": "Messaging Data Modeling Example for Cassandra", 3 | "description": "Explore how messaging data can be stored and queried in Cassandra NoSQL database", 4 | "difficulty": "Beginner", 5 | "time": "15 minutes", 6 | "details": { 7 | "assets": { 8 | "host01": [ 9 | {"file": "wait.sh", "target": 
"/usr/local/bin/", "chmod": "+x"}, 10 | {"file": "messaging_data.cql", "target": "/root/"} 11 | ] 12 | }, 13 | "steps": [ 14 | { 15 | "title": "Create a keyspace", 16 | "text": "step1.md" 17 | }, 18 | { 19 | "title": "Create tables", 20 | "text": "step2.md" 21 | }, 22 | { 23 | "title": "Populate tables", 24 | "text": "step3.md" 25 | }, 26 | { 27 | "title": "Design queries Q1.1 and Q1.2", 28 | "text": "step4.md" 29 | }, 30 | { 31 | "title": "Design query Q2", 32 | "text": "step5.md" 33 | }, 34 | { 35 | "title": "Design query Q3", 36 | "text": "step6.md" 37 | }, 38 | { 39 | "title": "Design query Q4", 40 | "text": "step7.md" 41 | } 42 | ], 43 | "intro": { 44 | "courseData": "background.sh", 45 | "code": "foreground.sh", 46 | "text": "intro.md" 47 | }, 48 | "finish": { 49 | "text": "finish.md" 50 | } 51 | }, 52 | "environment": { 53 | "uilayout": "terminal" 54 | }, 55 | "backend": { 56 | "imageid": "ubuntu20.04" 57 | } 58 | } 59 | -------------------------------------------------------------------------------- /cassandra-features-4x/virtual-tables/step3.md: -------------------------------------------------------------------------------- 1 | Virtual tables and their related keyspaces impose several restrictions 2 | on the kinds of operations that can be performed. 
3 | 4 | Find out if you can add a column to a virtual table: 5 | ``` 6 | ALTER TABLE settings ADD comment TEXT; 7 | ```{{execute T2}} 8 | 9 | _(Note: this command and the next ones will produce an error in the 10 | `cqlsh` console.)_ 11 | 12 | Find out if you can upsert a new row to a virtual table: 13 | ``` 14 | INSERT INTO settings (name , value ) VALUES ( 'MaxNumberOfGlorxes', '137'); 15 | ```{{execute T2}} 16 | 17 | Find out if you can clear the contents of a virtual table: 18 | ``` 19 | TRUNCATE settings ; 20 | ```{{execute T2}} 21 | 22 | Find out if you can create an index: 23 | ``` 24 | CREATE INDEX ON settings (value) ; 25 | ```{{execute T2}} 26 | 27 | Virtual tables can be queried with the same syntax as regular tables. 28 | Suppose we want to list all (Boolean) settings that are set to "true": 29 | ``` 30 | SELECT name FROM settings WHERE value='true'; 31 | ```{{execute T2}} 32 | 33 | This query, as it is, will fail with a standard message about _data filtering_ 34 | (which is how CQL advises against full-cluster scans). 35 | Thanks to the fact that virtual tables are **not** actually distributed, 36 | however, it is 37 | perfectly fine to add the `ALLOW FILTERING` clause to such a query 38 | (indeed, this is one of the very few cases where doing so is acceptable): 39 | 40 | ``` 41 | SELECT name FROM settings WHERE value='true' ALLOW FILTERING ; 42 | ```{{execute T2}} 43 | 44 | This observation will come in handy in the next step. -------------------------------------------------------------------------------- /cassandra-fundamentals/queries/step8.md: -------------------------------------------------------------------------------- 1 | CQL aggregates include `COUNT`, `SUM`, `AVG`, `MIN` and `MAX`. CQL also 2 | supports many functions, of which we will showcase `CAST`, `NOW`, and `TODATE`. 3 | It is also possible to create user-defined aggregates and functions using 4 | statements `CREATE AGGREGATE` and `CREATE FUNCTION`. 
We will create a function to calculate 5 | the number of days between two dates. Study and execute the following query examples. 6 | 7 | Q1. Analyze ratings for the movie: 8 | ``` 9 | SELECT COUNT(rating) AS count, 10 | SUM(rating) AS sum, 11 | AVG(CAST(rating AS FLOAT)) AS avg, 12 | MIN(rating) AS min, 13 | MAX(rating) AS max 14 | FROM ratings_by_movie 15 | WHERE title = 'Alice in Wonderland' 16 | AND year = 2010; 17 | ```{{execute}} 18 | 19 | Q2. Find the user name, date of joining and current date: 20 | ``` 21 | SELECT name, 22 | date_joined, 23 | TODATE(NOW()) AS date_today 24 | FROM users 25 | WHERE email = 'joe@datastax.com'; 26 | ```{{execute}} 27 | 28 | Q3. Calculate how many days have passed since the user joined: 29 | ``` 30 | CREATE FUNCTION IF NOT EXISTS 31 | DAYS_BETWEEN_DATES(date1 TEXT, date2 TEXT) 32 | RETURNS NULL ON NULL INPUT 33 | RETURNS BIGINT 34 | LANGUAGE Java AS 35 | 'return java.lang.Math.abs( 36 | java.time.temporal.ChronoUnit.DAYS.between( 37 | java.time.LocalDate.parse(date1), 38 | java.time.LocalDate.parse(date2) 39 | ) 40 | );'; 41 | 42 | SELECT name, 43 | DAYS_BETWEEN_DATES( 44 | CAST(date_joined AS TEXT), 45 | CAST(TODATE(NOW()) AS TEXT) ) AS days 46 | FROM users 47 | WHERE email = 'joe@datastax.com'; 48 | ```{{execute}} -------------------------------------------------------------------------------- /cassandra-features-4x/virtual-tables/index.json: -------------------------------------------------------------------------------- 1 | { 2 | "title": "Virtual Tables", 3 | "description": "New Features in Cassandra 4", 4 | "difficulty": "Easy", 5 | "time": "15 minutes", 6 | "details": { 7 | "assets": { 8 | "host01": [ 9 | {"file": "wait.sh", "target": "/usr/local/bin", "chmod": "+x"} 10 | ] 11 | }, 12 | "steps": [ 13 | { 14 | "title": "Setup", 15 | "text": "step1.md" 16 | }, 17 | { 18 | "title": "Inspect tables", 19 | "text": "step2.md" 20 | }, 21 | { 22 | "title": "Limitations on tables", 23 | "text": "step3.md" 24 | }, 25 | { 26 | "title": 
"Table 'clients'", 27 | "text": "step4.md" 28 | }, 29 | { 30 | "title": "Checking read timeout", 31 | "text": "step5.md" 32 | }, 33 | 34 | { 35 | "title": "Changing read timeout!", 36 | "text": "step6.md" 37 | }, 38 | { 39 | "title": "Restarting Cassandra", 40 | "text": "step7.md" 41 | }, 42 | { 43 | "title": "Test Your Understanding", 44 | "text": "quiz.md" 45 | } 46 | ], 47 | "intro": { 48 | "courseData": "background.sh", 49 | "code": "foreground.sh", 50 | "text": "intro.md" 51 | }, 52 | "finish": { 53 | "text": "finish.md" 54 | } 55 | }, 56 | "environment": { 57 | "uilayout": "terminal", 58 | "terminals": [ 59 | {"name": "CQL console", "target": "host01"}, 60 | {"name": "Client console", "target": "host01"} 61 | ] 62 | }, 63 | "backend": { 64 | "imageid": "ubuntu:1804" 65 | } 66 | } 67 | -------------------------------------------------------------------------------- /cassandra-data-modeling/order-management-data/index.json: -------------------------------------------------------------------------------- 1 | { 2 | "title": "Order Management Data Modeling Example for Cassandra", 3 | "description": "Explore how online orders can be stored and queried in Cassandra NoSQL database", 4 | "difficulty": "Beginner", 5 | "time": "15 minutes", 6 | "details": { 7 | "assets": { 8 | "host01": [ 9 | {"file": "wait.sh", "target": "/usr/local/bin/", "chmod": "+x"}, 10 | {"file": "order_management_data.cql", "target": "/root/"} 11 | ] 12 | }, 13 | "steps": [ 14 | { 15 | "title": "Create a keyspace", 16 | "text": "step1.md" 17 | }, 18 | { 19 | "title": "Create tables", 20 | "text": "step2.md" 21 | }, 22 | { 23 | "title": "Populate tables", 24 | "text": "step3.md" 25 | }, 26 | { 27 | "title": "Design query Q1", 28 | "text": "step4.md" 29 | }, 30 | { 31 | "title": "Design query Q2", 32 | "text": "step5.md" 33 | }, 34 | { 35 | "title": "Design query Q3", 36 | "text": "step6.md" 37 | }, 38 | { 39 | "title": "Design query Q4", 40 | "text": "step7.md" 41 | }, 42 | { 43 | "title": 
"Design update U1", 44 | "text": "step8.md" 45 | } 46 | ], 47 | "intro": { 48 | "courseData": "background.sh", 49 | "code": "foreground.sh", 50 | "text": "intro.md" 51 | }, 52 | "finish": { 53 | "text": "finish.md" 54 | } 55 | }, 56 | "environment": { 57 | "uilayout": "terminal" 58 | }, 59 | "backend": { 60 | "imageid": "ubuntu20.04" 61 | } 62 | } 63 | -------------------------------------------------------------------------------- /cassandra-data-modeling/music-data/step4.md: -------------------------------------------------------------------------------- 1 | Start the CQL shell: 2 | ```bash 3 | cqlsh -k music_data 4 | ```{{execute}} 5 | 6 | Insert rows into table `users`: 7 | 8 | ```sql 9 | INSERT INTO users (id, name) 10 | VALUES (12345678-aaaa-bbbb-cccc-123456789abc, 'Joe'); 11 | INSERT INTO users (id, name) 12 | VALUES (UUID(), 'Jen'); 13 | INSERT INTO users (id, name) 14 | VALUES (UUID(), 'Jim'); 15 | 16 | SELECT * FROM users; 17 | ```{{execute}} 18 | 19 | Insert rows into table `tracks_by_user`: 20 | 21 | ```sql 22 | INSERT INTO tracks_by_user (id, month, timestamp, album_title, album_year, number, title) 23 | VALUES (12345678-aaaa-bbbb-cccc-123456789abc, '2020-01-01', '2020-01-05T11:22:33', '20 Greatest Hits', 1982, 16, 'Hey Jude'); 24 | 25 | INSERT INTO tracks_by_user (id, month, timestamp, album_title, album_year, number, title) 26 | VALUES (12345678-aaaa-bbbb-cccc-123456789abc, '2020-09-01', '2020-09-15T09:00:00', '20 Greatest Hits', 1982, 16, 'Hey Jude'); 27 | 28 | INSERT INTO tracks_by_user (id, month, timestamp, album_title, album_year, number, title) 29 | VALUES (12345678-aaaa-bbbb-cccc-123456789abc, '2020-09-01', '2020-09-15T16:41:10', 'Legendary Concert Performances', 1978, 6, 'Johnny B. 
Goode'); 30 | 31 | INSERT INTO tracks_by_user (id, month, timestamp, album_title, album_year, number, title) 32 | VALUES (12345678-aaaa-bbbb-cccc-123456789abc, '2020-09-01', '2020-09-15T16:44:56', 'The Beatles 1967-1970', 1973, 17, 'Come Together'); 33 | 34 | INSERT INTO tracks_by_user (id, month, timestamp, album_title, album_year, number, title) 35 | VALUES (12345678-aaaa-bbbb-cccc-123456789abc, '2020-09-01', '2020-09-15T21:13:13', 'Dark Side Of The Moon', 1973, 3, 'Time'); 36 | 37 | SELECT * FROM tracks_by_user; 38 | ```{{execute}} 39 | -------------------------------------------------------------------------------- /cql/background.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | apt-get update 3 | apt install -y openjdk-11-jre-headless 4 | export JAVA_HOME="/usr/lib/jvm/java-11-openjdk-amd64" 5 | wget https://archive.apache.org/dist/cassandra/4.0.0/apache-cassandra-4.0.0-bin.tar.gz 6 | tar xzf apache-cassandra-4.0.0-bin.tar.gz 7 | sed -i 's/^cluster_name: .*$/cluster_name: "Cassandra Cluster"/g' apache-cassandra-4.0.0/conf/cassandra.yaml 8 | #sed -i "s/^num_tokens:.*$/num_tokens: 1/g" apache-cassandra-4.0.0/conf/cassandra.yaml 9 | #sed -i "s/^# initial_token:.*$/initial_token: -9223372036854775808/g" apache-cassandra-4.0.0/conf/cassandra.yaml 10 | sed -i 's/^endpoint_snitch: .*$/endpoint_snitch: GossipingPropertyFileSnitch/g' apache-cassandra-4.0.0/conf/cassandra.yaml 11 | sed -i 's/^dc=.*$/dc=DC-Houston/g' apache-cassandra-4.0.0/conf/cassandra-rackdc.properties 12 | sed -i "s/^listen_address:.*$/listen_address: 127.0.0.1/g" apache-cassandra-4.0.0/conf/cassandra.yaml 13 | sed -i 's/^rpc_address:.*$/rpc_address: 127.0.0.1/g' apache-cassandra-4.0.0/conf/cassandra.yaml 14 | echo '127.0.0.1 node1' >> /etc/hosts 15 | #echo '[[HOST2_IP]] node2' >> /etc/hosts 16 | sed -i 's/^ - seeds:.*$/ - seeds: "127.0.0.1"/g' apache-cassandra-4.0.0/conf/cassandra.yaml 17 | mv apache-cassandra-4.0.0 /usr/share/cassandra 18 | rm 
apache-cassandra-4.0.0-bin.tar.gz 19 | echo 'PATH="$PATH:/usr/share/cassandra/bin:/usr/share/cassandra/tools/bin"' >> .bashrc 20 | export PATH="$PATH:/usr/share/cassandra/bin:/usr/share/cassandra/tools/bin" 21 | source .bashrc 22 | /usr/share/cassandra/bin/cassandra -R 23 | while [ `grep "Starting listening for CQL clients" /usr/share/cassandra/logs/system.log | wc -l` -lt 1 ]; do 24 | sleep 15 25 | done 26 | echo "done" >> /opt/katacoda-background-finished 27 | -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-audit-logging/step1.md: -------------------------------------------------------------------------------- 1 | In this step, you will verify that Cassandra has been installed and is running as a service. 2 | Next, you will connect using *cqlsh* and create a keyspace and table. 3 | 4 | During startup, this scenario uses *apt-get* to install and start a single Cassandra node running as a service. 5 | This process may take a few minutes. Wait until you see `Cassandra has started!` before you continue. 6 | 7 | Once Cassandra has started, click to verify the cluster status with *nodetool*. 8 | ``` 9 | nodetool status 10 | ```{{execute}} 11 | 12 | --- 13 |

14 | **Status:** 15 | Look at the first two characters of the status. 16 | Each character has an individual meaning. 17 | The sequence `UN` means the node's status is `Up` and state is `Normal`. 18 |

19 | --- 20 | 21 | ![Up/Normal](./assets/nodetool-status.png) 22 | 23 | Now that the node is running, you will create a keyspace and table. 24 | Start the CQL Shell (*cqlsh*) so you can issue CQL commands. 25 | 26 | ``` 27 | cqlsh 28 | ```{{execute}} 29 | 30 | Create the `music` keyspace. 31 | 32 | ``` 33 | create KEYSPACE music WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}; 34 | ```{{execute}} 35 | 36 | Use the `music` keyspace. 37 | 38 | ``` 39 | use music; 40 | ```{{execute}} 41 | 42 | Create the `songs` table. 43 | 44 | ``` 45 | CREATE TABLE songs ( 46 | artist TEXT, 47 | title TEXT, 48 | year INT, 49 | PRIMARY KEY ((artist), title) 50 | ); 51 | ```{{execute}} 52 | 53 | Type `exit` to close *cqlsh*. 54 | ``` 55 | exit 56 | ```{{execute}} 57 | 58 | # Summary 59 | 60 | In this step, you have verified that Cassandra is running and created the *music* keyspace and the *songs* table. -------------------------------------------------------------------------------- /cassandra-fundamentals/cql/background.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | apt-get update 3 | apt install -y openjdk-11-jre-headless 4 | export JAVA_HOME="/usr/lib/jvm/java-11-openjdk-amd64" 5 | wget https://archive.apache.org/dist/cassandra/4.0.0/apache-cassandra-4.0.0-bin.tar.gz 6 | tar xzf apache-cassandra-4.0.0-bin.tar.gz 7 | sed -i 's/^cluster_name: .*$/cluster_name: "Cassandra Cluster"/g' apache-cassandra-4.0.0/conf/cassandra.yaml 8 | #sed -i "s/^num_tokens:.*$/num_tokens: 1/g" apache-cassandra-4.0.0/conf/cassandra.yaml 9 | #sed -i "s/^# initial_token:.*$/initial_token: -9223372036854775808/g" apache-cassandra-4.0.0/conf/cassandra.yaml 10 | sed -i 's/^endpoint_snitch: .*$/endpoint_snitch: GossipingPropertyFileSnitch/g' apache-cassandra-4.0.0/conf/cassandra.yaml 11 | sed -i 's/^dc=.*$/dc=DC-Houston/g' apache-cassandra-4.0.0/conf/cassandra-rackdc.properties 12 | sed -i "s/^listen_address:.*$/listen_address: 
127.0.0.1/g" apache-cassandra-4.0.0/conf/cassandra.yaml 13 | sed -i 's/^rpc_address:.*$/rpc_address: 127.0.0.1/g' apache-cassandra-4.0.0/conf/cassandra.yaml 14 | echo '127.0.0.1 node1' >> /etc/hosts 15 | #echo '[[HOST2_IP]] node2' >> /etc/hosts 16 | sed -i 's/^ - seeds:.*$/ - seeds: "127.0.0.1"/g' apache-cassandra-4.0.0/conf/cassandra.yaml 17 | mv apache-cassandra-4.0.0 /usr/share/cassandra 18 | rm apache-cassandra-4.0.0-bin.tar.gz 19 | echo 'PATH="$PATH:/usr/share/cassandra/bin:/usr/share/cassandra/tools/bin"' >> .bashrc 20 | export PATH="$PATH:/usr/share/cassandra/bin:/usr/share/cassandra/tools/bin" 21 | source .bashrc 22 | /usr/share/cassandra/bin/cassandra -R 23 | while [ `grep "Starting listening for CQL clients" /usr/share/cassandra/logs/system.log | wc -l` -lt 1 ]; do 24 | sleep 15 25 | done 26 | echo "done" >> /opt/katacoda-background-finished 27 | -------------------------------------------------------------------------------- /cassandra-fundamentals/queries/step6.md: -------------------------------------------------------------------------------- 1 | Table `ratings_by_user` stores information about movie ratings organized by users, 2 | such that each partition contains all ratings left by one particular user. 3 | This table has multi-row partitions and 4 | the primary key defined as `PRIMARY KEY ((email), title, year)`. 5 | Let's first retrieve all rows from the table to learn how the data looks like and then focus 6 | on predicates that the primary key can support. 7 | 8 | Q1. Retrieve all rows: 9 | ``` 10 | SELECT * FROM ratings_by_user; 11 | ```{{execute}} 12 | 13 | Q2. Retrieve one partition: 14 | ``` 15 | SELECT * FROM ratings_by_user 16 | WHERE email = 'joe@datastax.com'; 17 | ```{{execute}} 18 | 19 | Q3. Retrieve two partitions: 20 | ``` 21 | SELECT * FROM ratings_by_user 22 | WHERE email IN ('joe@datastax.com', 23 | 'jen@datastax.com'); 24 | ```{{execute}} 25 | 26 | Q4. 
Retrieve one row: 27 | ``` 28 | SELECT * FROM ratings_by_user 29 | WHERE email = 'jim@datastax.com' 30 | AND title = 'Alice in Wonderland' 31 | AND year = 2010; 32 | ```{{execute}} 33 | 34 | Q5 - Q8. Retrieve a subset of rows from a partition: 35 | ``` 36 | SELECT * FROM ratings_by_user 37 | WHERE email = 'jim@datastax.com' 38 | AND title = 'Alice in Wonderland' 39 | AND year IN (2010, 1951); 40 | ```{{execute}} 41 | ``` 42 | SELECT * FROM ratings_by_user 43 | WHERE email = 'jim@datastax.com' 44 | AND title = 'Alice in Wonderland' 45 | AND year > 1950; 46 | ```{{execute}} 47 | ``` 48 | SELECT * FROM ratings_by_user 49 | WHERE email = 'jim@datastax.com' 50 | AND title = 'Alice in Wonderland'; 51 | ```{{execute}} 52 | ``` 53 | SELECT * FROM ratings_by_user 54 | WHERE email = 'jim@datastax.com' 55 | AND title < 'Charlie and the Chocolate Factory'; 56 | ```{{execute}} -------------------------------------------------------------------------------- /cassandra-data-modeling/sensor-data/background.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | apt-get update 3 | apt install -y openjdk-11-jre-headless 4 | export JAVA_HOME="/usr/lib/jvm/java-11-openjdk-amd64" 5 | wget https://archive.apache.org/dist/cassandra/4.0.0/apache-cassandra-4.0.0-bin.tar.gz 6 | tar xzf apache-cassandra-4.0.0-bin.tar.gz 7 | sed -i 's/^cluster_name: .*$/cluster_name: "Cassandra Cluster"/g' apache-cassandra-4.0.0/conf/cassandra.yaml 8 | #sed -i "s/^num_tokens:.*$/num_tokens: 1/g" apache-cassandra-4.0.0/conf/cassandra.yaml 9 | #sed -i "s/^# initial_token:.*$/initial_token: -9223372036854775808/g" apache-cassandra-4.0.0/conf/cassandra.yaml 10 | sed -i 's/^endpoint_snitch: .*$/endpoint_snitch: GossipingPropertyFileSnitch/g' apache-cassandra-4.0.0/conf/cassandra.yaml 11 | sed -i 's/^dc=.*$/dc=DC-Houston/g' apache-cassandra-4.0.0/conf/cassandra-rackdc.properties 12 | sed -i "s/^listen_address:.*$/listen_address: 127.0.0.1/g" 
apache-cassandra-4.0.0/conf/cassandra.yaml 13 | sed -i 's/^rpc_address:.*$/rpc_address: 127.0.0.1/g' apache-cassandra-4.0.0/conf/cassandra.yaml 14 | echo '127.0.0.1 node1' >> /etc/hosts 15 | #echo '[[HOST2_IP]] node2' >> /etc/hosts 16 | sed -i 's/^ - seeds:.*$/ - seeds: "127.0.0.1"/g' apache-cassandra-4.0.0/conf/cassandra.yaml 17 | mv apache-cassandra-4.0.0 /usr/share/cassandra 18 | rm apache-cassandra-4.0.0-bin.tar.gz 19 | echo 'PATH="$PATH:/usr/share/cassandra/bin:/usr/share/cassandra/tools/bin"' >> .bashrc 20 | export PATH="$PATH:/usr/share/cassandra/bin:/usr/share/cassandra/tools/bin" 21 | source .bashrc 22 | /usr/share/cassandra/bin/cassandra -R 23 | while [ `grep "Starting listening for CQL clients" /usr/share/cassandra/logs/system.log | wc -l` -lt 1 ]; do 24 | sleep 15 25 | done 26 | echo "done" >> /opt/katacoda-background-finished 27 | -------------------------------------------------------------------------------- /cassandra-data-modeling/investment-data/background.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | apt-get update 3 | apt install -y openjdk-11-jre-headless 4 | export JAVA_HOME="/usr/lib/jvm/java-11-openjdk-amd64" 5 | wget https://archive.apache.org/dist/cassandra/4.0.0/apache-cassandra-4.0.0-bin.tar.gz 6 | tar xzf apache-cassandra-4.0.0-bin.tar.gz 7 | sed -i 's/^cluster_name: .*$/cluster_name: "Cassandra Cluster"/g' apache-cassandra-4.0.0/conf/cassandra.yaml 8 | #sed -i "s/^num_tokens:.*$/num_tokens: 1/g" apache-cassandra-4.0.0/conf/cassandra.yaml 9 | #sed -i "s/^# initial_token:.*$/initial_token: -9223372036854775808/g" apache-cassandra-4.0.0/conf/cassandra.yaml 10 | sed -i 's/^endpoint_snitch: .*$/endpoint_snitch: GossipingPropertyFileSnitch/g' apache-cassandra-4.0.0/conf/cassandra.yaml 11 | sed -i 's/^dc=.*$/dc=DC-Houston/g' apache-cassandra-4.0.0/conf/cassandra-rackdc.properties 12 | sed -i "s/^listen_address:.*$/listen_address: 127.0.0.1/g" apache-cassandra-4.0.0/conf/cassandra.yaml 
13 | sed -i 's/^rpc_address:.*$/rpc_address: 127.0.0.1/g' apache-cassandra-4.0.0/conf/cassandra.yaml 14 | echo '127.0.0.1 node1' >> /etc/hosts 15 | #echo '[[HOST2_IP]] node2' >> /etc/hosts 16 | sed -i 's/^ - seeds:.*$/ - seeds: "127.0.0.1"/g' apache-cassandra-4.0.0/conf/cassandra.yaml 17 | mv apache-cassandra-4.0.0 /usr/share/cassandra 18 | rm apache-cassandra-4.0.0-bin.tar.gz 19 | echo 'PATH="$PATH:/usr/share/cassandra/bin:/usr/share/cassandra/tools/bin"' >> .bashrc 20 | export PATH="$PATH:/usr/share/cassandra/bin:/usr/share/cassandra/tools/bin" 21 | source .bashrc 22 | /usr/share/cassandra/bin/cassandra -R 23 | while [ `grep "Starting listening for CQL clients" /usr/share/cassandra/logs/system.log | wc -l` -lt 1 ]; do 24 | sleep 15 25 | done 26 | echo "done" >> /opt/katacoda-background-finished 27 | -------------------------------------------------------------------------------- /cassandra-data-modeling/messaging-data/background.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | apt-get update 3 | apt install -y openjdk-11-jre-headless 4 | export JAVA_HOME="/usr/lib/jvm/java-11-openjdk-amd64" 5 | wget https://archive.apache.org/dist/cassandra/4.0.0/apache-cassandra-4.0.0-bin.tar.gz 6 | tar xzf apache-cassandra-4.0.0-bin.tar.gz 7 | sed -i 's/^cluster_name: .*$/cluster_name: "Cassandra Cluster"/g' apache-cassandra-4.0.0/conf/cassandra.yaml 8 | #sed -i "s/^num_tokens:.*$/num_tokens: 1/g" apache-cassandra-4.0.0/conf/cassandra.yaml 9 | #sed -i "s/^# initial_token:.*$/initial_token: -9223372036854775808/g" apache-cassandra-4.0.0/conf/cassandra.yaml 10 | sed -i 's/^endpoint_snitch: .*$/endpoint_snitch: GossipingPropertyFileSnitch/g' apache-cassandra-4.0.0/conf/cassandra.yaml 11 | sed -i 's/^dc=.*$/dc=DC-Houston/g' apache-cassandra-4.0.0/conf/cassandra-rackdc.properties 12 | sed -i "s/^listen_address:.*$/listen_address: 127.0.0.1/g" apache-cassandra-4.0.0/conf/cassandra.yaml 13 | sed -i 
's/^rpc_address:.*$/rpc_address: 127.0.0.1/g' apache-cassandra-4.0.0/conf/cassandra.yaml 14 | echo '127.0.0.1 node1' >> /etc/hosts 15 | #echo '[[HOST2_IP]] node2' >> /etc/hosts 16 | sed -i 's/^ - seeds:.*$/ - seeds: "127.0.0.1"/g' apache-cassandra-4.0.0/conf/cassandra.yaml 17 | mv apache-cassandra-4.0.0 /usr/share/cassandra 18 | rm apache-cassandra-4.0.0-bin.tar.gz 19 | echo 'PATH="$PATH:/usr/share/cassandra/bin:/usr/share/cassandra/tools/bin"' >> .bashrc 20 | export PATH="$PATH:/usr/share/cassandra/bin:/usr/share/cassandra/tools/bin" 21 | source .bashrc 22 | /usr/share/cassandra/bin/cassandra -R 23 | while [ `grep "Starting listening for CQL clients" /usr/share/cassandra/logs/system.log | wc -l` -lt 1 ]; do 24 | sleep 15 25 | done 26 | echo "done" >> /opt/katacoda-background-finished 27 | -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-migrate-cassandra-3-to-4/index.json: -------------------------------------------------------------------------------- 1 | { 2 | "private": false, 3 | "title": "Migrate Cassandra 3.x -> 4.x", 4 | "description": "Learn how to perform 'zero-downtime' migration from Cassandra 3.x -> 4.x", 5 | "difficulty": "Intermediate", 6 | "time": "20 minutes", 7 | "details": { 8 | "assets": { 9 | "host01": [ 10 | {"file": "wait.sh", "target": "/usr/local/bin", "chmod": "+x"} 11 | ] 12 | }, 13 | "steps": [ 14 | { 15 | "title": "Create a Cassandra 3.11.9 Cluster", 16 | "text": "step1.md" 17 | }, 18 | { 19 | "title": "Populate the Cluster", 20 | "text": "step2.md" 21 | }, 22 | { 23 | "title": "Verify that the Cluster is Ready to Upgrade", 24 | "text": "step3.md" 25 | }, 26 | { 27 | "title": "Prepare the 3.x Cluster for Migration", 28 | "text": "step4.md" 29 | }, 30 | { 31 | "title": "Install Cassandra 4.0", 32 | "text": "step5.md" 33 | }, 34 | { 35 | "title": "Start New Node", 36 | "text": "step6.md" 37 | }, 38 | { 39 | "title": "Verify New Node", 40 | "text": "step7.md" 41 | }, 42 | { 
43 | "title": "Test Your Understanding", 44 | "text": "quiz.md" 45 | } 46 | ], 47 | "intro": { 48 | "courseData": "background.sh", 49 | "code": "foreground.sh", 50 | "text": "intro.md" 51 | }, 52 | "finish": { 53 | "text": "finish.md" 54 | } 55 | }, 56 | "environment": { 57 | "uilayout": "terminal" 58 | }, 59 | "backend": { 60 | "imageid": "ubuntu:1804" 61 | } 62 | } -------------------------------------------------------------------------------- /cassandra-data-modeling/sensor-data/step5.md: -------------------------------------------------------------------------------- 1 | Find hourly average temperatures for every sensor in network `forest-net` and date range [`2020-07-05`,`2020-07-06`] within the week of `2020-07-05`; 2 | order by date (desc) and hour (desc): 3 | 4 |
5 | Solution 6 | 7 | ```sql 8 | SELECT date_hour, avg_temperature, 9 | latitude, longitude, sensor 10 | FROM temperatures_by_network 11 | WHERE network = 'forest-net' 12 | AND week = '2020-07-05' 13 | AND date_hour >= '2020-07-05' 14 | AND date_hour < '2020-07-07'; 15 | ```{{execute}} 16 | 17 |
18 | 19 |
20 | 21 | Find hourly average temperatures for every sensor in network `forest-net` and date range [`2020-07-04`,`2020-07-06`] within the weeks of `2020-06-28` and `2020-07-05`; 22 | order by date (desc) and hour (desc): 23 | 24 |
25 | Solution 1 26 | 27 | ```sql 28 | SELECT date_hour, avg_temperature, 29 | latitude, longitude, sensor 30 | FROM temperatures_by_network 31 | WHERE network = 'forest-net' 32 | AND week = '2020-07-05' 33 | AND date_hour >= '2020-07-04' 34 | AND date_hour < '2020-07-07'; 35 | 36 | SELECT date_hour, avg_temperature, 37 | latitude, longitude, sensor 38 | FROM temperatures_by_network 39 | WHERE network = 'forest-net' 40 | AND week = '2020-06-28' 41 | AND date_hour >= '2020-07-04' 42 | AND date_hour < '2020-07-07'; 43 | ```{{execute}} 44 | 45 |
46 | 47 |
48 | Solution 2 49 | 50 | ```sql 51 | SELECT date_hour, avg_temperature, 52 | latitude, longitude, sensor 53 | FROM temperatures_by_network 54 | WHERE network = 'forest-net' 55 | AND week IN ('2020-07-05','2020-06-28') 56 | AND date_hour >= '2020-07-04' 57 | AND date_hour < '2020-07-07'; 58 | ```{{execute}} 59 | 60 |
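
Both solutions depend on knowing which `week` partitions a date range touches. As a rough sketch, assuming that a `week` value is the date of the Sunday that starts the week (consistent with `2020-06-28` and `2020-07-05` above, but not stated in this step) and that GNU `date` is available, the hypothetical helper `week_start` below maps a date to its week bucket:

```shell
#!/bin/bash
# Hypothetical helper (not part of the scenario): print the Sunday on or
# before the given date (YYYY-MM-DD).
# Assumptions: GNU date, and week buckets that start on Sunday.
week_start() {
  local day_of_week
  day_of_week=$(TZ=UTC date -d "$1" +%w)   # 0 = Sunday ... 6 = Saturday
  TZ=UTC date -d "$1 $day_of_week days ago" +%F
}

week_start 2020-07-04   # first day of the requested range
week_start 2020-07-06   # last day of the requested range
```

Under these assumptions, the two calls should print the two week values that Solution 1 queries separately and Solution 2 lists in its `IN` predicate.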
-------------------------------------------------------------------------------- /cql/index.json: -------------------------------------------------------------------------------- 1 | { 2 | "title": "Cassandra Query Language", 3 | "description": "Learn about the most essential data definition and data manipulation statements in Cassandra Query Language (CQL)", 4 | "difficulty": "Beginner", 5 | "time": "15 minutes", 6 | "details": { 7 | "assets": { 8 | "host01": [ 9 | {"file": "wait.sh", "target": "/usr/local/bin/", "chmod": "+x"} 10 | ] 11 | }, 12 | "steps": [ 13 | { 14 | "title": "CQL", 15 | "text": "step1.md" 16 | }, 17 | { 18 | "title": "Start the CQL shell", 19 | "text": "step2.md" 20 | }, 21 | { 22 | "title": "Create a keyspace", 23 | "text": "step3.md" 24 | }, 25 | { 26 | "title": "Set a working keyspace", 27 | "text": "step4.md" 28 | }, 29 | { 30 | "title": "Create a table", 31 | "text": "step5.md" 32 | }, 33 | { 34 | "title": "Insert a row", 35 | "text": "step6.md" 36 | }, 37 | { 38 | "title": "Retrieve a row", 39 | "text": "step7.md" 40 | }, 41 | { 42 | "title": "Update a row", 43 | "text": "step8.md" 44 | }, 45 | { 46 | "title": "Delete a row", 47 | "text": "step9.md" 48 | }, 49 | { 50 | "title": "CQL vs. 
SQL", 51 | "text": "step10.md" 52 | }, 53 | { 54 | "title": "Test Your Understanding", 55 | "text": "quiz.md" 56 | } 57 | ], 58 | "intro": { 59 | "courseData": "background.sh", 60 | "code": "foreground.sh", 61 | "text": "intro.md" 62 | }, 63 | "finish": { 64 | "text": "finish.md" 65 | } 66 | }, 67 | "environment": { 68 | "uilayout": "terminal" 69 | }, 70 | "backend": { 71 | "imageid": "ubuntu:1804" 72 | } 73 | } 74 | -------------------------------------------------------------------------------- /cassandra-data-modeling/shopping-cart-data/index.json: -------------------------------------------------------------------------------- 1 | { 2 | "title": "Shopping Cart Data Modeling Example for Cassandra", 3 | "description": "Explore how online shopping carts can be stored and queried in Cassandra NoSQL database", 4 | "difficulty": "Beginner", 5 | "time": "15 minutes", 6 | "details": { 7 | "assets": { 8 | "host01": [ 9 | {"file": "wait.sh", "target": "/usr/local/bin/", "chmod": "+x"}, 10 | {"file": "shopping_cart_data.cql", "target": "/root/"} 11 | ] 12 | }, 13 | "steps": [ 14 | { 15 | "title": "Create a keyspace", 16 | "text": "step1.md" 17 | }, 18 | { 19 | "title": "Create tables", 20 | "text": "step2.md" 21 | }, 22 | { 23 | "title": "Populate tables", 24 | "text": "step3.md" 25 | }, 26 | { 27 | "title": "Design query Q1", 28 | "text": "step4.md" 29 | }, 30 | { 31 | "title": "Design query Q4", 32 | "text": "step5.md" 33 | }, 34 | { 35 | "title": "Design update U2", 36 | "text": "step6.md" 37 | }, 38 | { 39 | "title": "Design query Q2", 40 | "text": "step7.md" 41 | }, 42 | { 43 | "title": "Design query Q3", 44 | "text": "step8.md" 45 | }, 46 | { 47 | "title": "Design query Q5", 48 | "text": "step9.md" 49 | }, 50 | { 51 | "title": "Design update U1", 52 | "text": "step10.md" 53 | } 54 | ], 55 | "intro": { 56 | "courseData": "background.sh", 57 | "code": "foreground.sh", 58 | "text": "intro.md" 59 | }, 60 | "finish": { 61 | "text": "finish.md" 62 | } 63 | }, 64 | 
"environment": { 65 | "uilayout": "terminal" 66 | }, 67 | "backend": { 68 | "imageid": "ubuntu20.04" 69 | } 70 | } 71 | -------------------------------------------------------------------------------- /cassandra-data-modeling/investment-data/index.json: -------------------------------------------------------------------------------- 1 | { 2 | "title": "Investment Portfolio Data Modeling Example for Cassandra", 3 | "description": "Explore how investment portfolio data can be stored and queried in Cassandra NoSQL database", 4 | "difficulty": "Beginner", 5 | "time": "15 minutes", 6 | "details": { 7 | "assets": { 8 | "host01": [ 9 | {"file": "wait.sh", "target": "/usr/local/bin/", "chmod": "+x"}, 10 | {"file": "investment_data.cql", "target": "/root/"} 11 | ] 12 | }, 13 | "steps": [ 14 | { 15 | "title": "Create a keyspace", 16 | "text": "step1.md" 17 | }, 18 | { 19 | "title": "Create tables", 20 | "text": "step2.md" 21 | }, 22 | { 23 | "title": "Populate tables", 24 | "text": "step3.md" 25 | }, 26 | { 27 | "title": "Design query Q1", 28 | "text": "step4.md" 29 | }, 30 | { 31 | "title": "Design query Q2", 32 | "text": "step5.md" 33 | }, 34 | { 35 | "title": "Design query Q3.1", 36 | "text": "step6.md" 37 | }, 38 | { 39 | "title": "Design query Q3.2", 40 | "text": "step7.md" 41 | }, 42 | { 43 | "title": "Design query Q3.3", 44 | "text": "step8.md" 45 | }, 46 | { 47 | "title": "Design query Q3.4", 48 | "text": "step9.md" 49 | }, 50 | { 51 | "title": "Design query Q3.5", 52 | "text": "step10.md" 53 | } 54 | ], 55 | "intro": { 56 | "courseData": "background.sh", 57 | "code": "foreground.sh", 58 | "text": "intro.md" 59 | }, 60 | "finish": { 61 | "text": "finish.md" 62 | } 63 | }, 64 | "environment": { 65 | "uilayout": "terminal" 66 | }, 67 | "backend": { 68 | "imageid": "ubuntu20.04" 69 | } 70 | } 71 | -------------------------------------------------------------------------------- /cassandra-data-modeling/shopping-cart-data/background.sh: 
-------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | apt-get update 3 | apt install -y openjdk-11-jre-headless 4 | export JAVA_HOME="/usr/lib/jvm/java-11-openjdk-amd64" 5 | wget https://archive.apache.org/dist/cassandra/4.0.0/apache-cassandra-4.0.0-bin.tar.gz 6 | tar xzf apache-cassandra-4.0.0-bin.tar.gz 7 | sed -i 's/^cluster_name: .*$/cluster_name: "Cassandra Cluster"/g' apache-cassandra-4.0.0/conf/cassandra.yaml 8 | #sed -i "s/^num_tokens:.*$/num_tokens: 1/g" apache-cassandra-4.0.0/conf/cassandra.yaml 9 | #sed -i "s/^# initial_token:.*$/initial_token: -9223372036854775808/g" apache-cassandra-4.0.0/conf/cassandra.yaml 10 | sed -i 's/^endpoint_snitch: .*$/endpoint_snitch: GossipingPropertyFileSnitch/g' apache-cassandra-4.0.0/conf/cassandra.yaml 11 | sed -i 's/^dc=.*$/dc=DC-Houston/g' apache-cassandra-4.0.0/conf/cassandra-rackdc.properties 12 | sed -i "s/^listen_address:.*$/listen_address: 127.0.0.1/g" apache-cassandra-4.0.0/conf/cassandra.yaml 13 | sed -i 's/^rpc_address:.*$/rpc_address: 127.0.0.1/g' apache-cassandra-4.0.0/conf/cassandra.yaml 14 | sed -i 's/^enable_materialized_views:.*$/enable_materialized_views: true/g' apache-cassandra-4.0.0/conf/cassandra.yaml 15 | echo '127.0.0.1 node1' >> /etc/hosts 16 | #echo '[[HOST2_IP]] node2' >> /etc/hosts 17 | sed -i 's/^ - seeds:.*$/ - seeds: "127.0.0.1"/g' apache-cassandra-4.0.0/conf/cassandra.yaml 18 | mv apache-cassandra-4.0.0 /usr/share/cassandra 19 | rm apache-cassandra-4.0.0-bin.tar.gz 20 | echo 'PATH="$PATH:/usr/share/cassandra/bin:/usr/share/cassandra/tools/bin"' >> .bashrc 21 | export PATH="$PATH:/usr/share/cassandra/bin:/usr/share/cassandra/tools/bin" 22 | source .bashrc 23 | /usr/share/cassandra/bin/cassandra -R 24 | while [ `grep "Starting listening for CQL clients" /usr/share/cassandra/logs/system.log | wc -l` -lt 1 ]; do 25 | sleep 15 26 | done 27 | echo "done" >> /opt/katacoda-background-finished 28 | 
-------------------------------------------------------------------------------- /cassandra-fundamentals/cql/index.json: -------------------------------------------------------------------------------- 1 | { 2 | "title": "Cassandra Query Language", 3 | "description": "Learn about the most essential data definition and data manipulation statements in Cassandra Query Language (CQL)", 4 | "difficulty": "Beginner", 5 | "time": "15 minutes", 6 | "details": { 7 | "assets": { 8 | "host01": [ 9 | {"file": "wait.sh", "target": "/usr/local/bin/", "chmod": "+x"} 10 | ] 11 | }, 12 | "steps": [ 13 | { 14 | "title": "CQL", 15 | "text": "step1.md" 16 | }, 17 | { 18 | "title": "Start the CQL shell", 19 | "text": "step2.md" 20 | }, 21 | { 22 | "title": "Create a keyspace", 23 | "text": "step3.md" 24 | }, 25 | { 26 | "title": "Set a working keyspace", 27 | "text": "step4.md" 28 | }, 29 | { 30 | "title": "Create a table", 31 | "text": "step5.md" 32 | }, 33 | { 34 | "title": "Insert a row", 35 | "text": "step6.md" 36 | }, 37 | { 38 | "title": "Retrieve a row", 39 | "text": "step7.md" 40 | }, 41 | { 42 | "title": "Update a row", 43 | "text": "step8.md" 44 | }, 45 | { 46 | "title": "Delete a row", 47 | "text": "step9.md" 48 | }, 49 | { 50 | "title": "CQL vs. 
SQL", 51 | "text": "step10.md" 52 | }, 53 | { 54 | "title": "Test Your Understanding", 55 | "text": "quiz.md" 56 | } 57 | ], 58 | "intro": { 59 | "courseData": "background.sh", 60 | "code": "foreground.sh", 61 | "text": "intro.md" 62 | }, 63 | "finish": { 64 | "text": "finish.md" 65 | } 66 | }, 67 | "environment": { 68 | "uilayout": "terminal" 69 | }, 70 | "backend": { 71 | "imageid": "ubuntu:1804" 72 | } 73 | } 74 | -------------------------------------------------------------------------------- /cassandra-fundamentals/queries/background.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | apt-get update 3 | apt install -y openjdk-11-jre-headless 4 | export JAVA_HOME="/usr/lib/jvm/java-11-openjdk-amd64" 5 | wget https://archive.apache.org/dist/cassandra/4.0.0/apache-cassandra-4.0.0-bin.tar.gz 6 | tar xzf apache-cassandra-4.0.0-bin.tar.gz 7 | sed -i 's/^cluster_name: .*$/cluster_name: "Cassandra Cluster"/g' apache-cassandra-4.0.0/conf/cassandra.yaml 8 | #sed -i "s/^num_tokens:.*$/num_tokens: 1/g" apache-cassandra-4.0.0/conf/cassandra.yaml 9 | #sed -i "s/^# initial_token:.*$/initial_token: -9223372036854775808/g" apache-cassandra-4.0.0/conf/cassandra.yaml 10 | sed -i 's/^endpoint_snitch: .*$/endpoint_snitch: GossipingPropertyFileSnitch/g' apache-cassandra-4.0.0/conf/cassandra.yaml 11 | sed -i 's/^dc=.*$/dc=DC-Houston/g' apache-cassandra-4.0.0/conf/cassandra-rackdc.properties 12 | sed -i "s/^listen_address:.*$/listen_address: 127.0.0.1/g" apache-cassandra-4.0.0/conf/cassandra.yaml 13 | sed -i 's/^rpc_address:.*$/rpc_address: 127.0.0.1/g' apache-cassandra-4.0.0/conf/cassandra.yaml 14 | sed -i 's/^enable_user_defined_functions:.*$/enable_user_defined_functions: true/g' apache-cassandra-4.0.0/conf/cassandra.yaml 15 | echo '127.0.0.1 node1' >> /etc/hosts 16 | #echo '[[HOST2_IP]] node2' >> /etc/hosts 17 | sed -i 's/^ - seeds:.*$/ - seeds: "127.0.0.1"/g' apache-cassandra-4.0.0/conf/cassandra.yaml 18 | mv 
apache-cassandra-4.0.0 /usr/share/cassandra 19 | rm apache-cassandra-4.0.0-bin.tar.gz 20 | echo 'PATH="$PATH:/usr/share/cassandra/bin:/usr/share/cassandra/tools/bin"' >> .bashrc 21 | export PATH="$PATH:/usr/share/cassandra/bin:/usr/share/cassandra/tools/bin" 22 | source .bashrc 23 | /usr/share/cassandra/bin/cassandra -R 24 | while [ `grep "Starting listening for CQL clients" /usr/share/cassandra/logs/system.log | wc -l` -lt 1 ]; do 25 | sleep 15 26 | done 27 | echo "done" >> /opt/katacoda-background-finished 28 | -------------------------------------------------------------------------------- /cassandra-data-modeling/order-management-data/background.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | apt-get update 3 | apt install -y openjdk-11-jre-headless 4 | export JAVA_HOME="/usr/lib/jvm/java-11-openjdk-amd64" 5 | wget https://archive.apache.org/dist/cassandra/4.0.0/apache-cassandra-4.0.0-bin.tar.gz 6 | tar xzf apache-cassandra-4.0.0-bin.tar.gz 7 | sed -i 's/^cluster_name: .*$/cluster_name: "Cassandra Cluster"/g' apache-cassandra-4.0.0/conf/cassandra.yaml 8 | #sed -i "s/^num_tokens:.*$/num_tokens: 1/g" apache-cassandra-4.0.0/conf/cassandra.yaml 9 | #sed -i "s/^# initial_token:.*$/initial_token: -9223372036854775808/g" apache-cassandra-4.0.0/conf/cassandra.yaml 10 | sed -i 's/^endpoint_snitch: .*$/endpoint_snitch: GossipingPropertyFileSnitch/g' apache-cassandra-4.0.0/conf/cassandra.yaml 11 | sed -i 's/^dc=.*$/dc=DC-Houston/g' apache-cassandra-4.0.0/conf/cassandra-rackdc.properties 12 | sed -i "s/^listen_address:.*$/listen_address: 127.0.0.1/g" apache-cassandra-4.0.0/conf/cassandra.yaml 13 | sed -i 's/^rpc_address:.*$/rpc_address: 127.0.0.1/g' apache-cassandra-4.0.0/conf/cassandra.yaml 14 | sed -i 's/^enable_materialized_views:.*$/enable_materialized_views: true/g' apache-cassandra-4.0.0/conf/cassandra.yaml 15 | echo '127.0.0.1 node1' >> /etc/hosts 16 | #echo '[[HOST2_IP]] node2' >> /etc/hosts 17 | sed -i 's/^ 
- seeds:.*$/ - seeds: "127.0.0.1"/g' apache-cassandra-4.0.0/conf/cassandra.yaml 18 | mv apache-cassandra-4.0.0 /usr/share/cassandra 19 | rm apache-cassandra-4.0.0-bin.tar.gz 20 | echo 'PATH="$PATH:/usr/share/cassandra/bin:/usr/share/cassandra/tools/bin"' >> .bashrc 21 | export PATH="$PATH:/usr/share/cassandra/bin:/usr/share/cassandra/tools/bin" 22 | source .bashrc 23 | /usr/share/cassandra/bin/cassandra -R 24 | while [ `grep "Starting listening for CQL clients" /usr/share/cassandra/logs/system.log | wc -l` -lt 1 ]; do 25 | sleep 15 26 | done 27 | echo "done" >> /opt/katacoda-background-finished 28 | -------------------------------------------------------------------------------- /cassandra-data-modeling/order-management-data/step2.md: -------------------------------------------------------------------------------- 1 | Create table `orders_by_user`: 2 | ```sql 3 | CREATE TABLE orders_by_user ( 4 | user_id TEXT, 5 | order_timestamp TIMESTAMP, 6 | order_id TEXT, 7 | order_status TEXT, 8 | order_total DECIMAL, 9 | PRIMARY KEY ((user_id),order_timestamp,order_id) 10 | ) WITH CLUSTERING ORDER BY (order_timestamp DESC, order_id ASC); 11 | ```{{execute}} 12 | 13 | Create table `orders_by_id`: 14 | ```sql 15 | CREATE TABLE orders_by_id ( 16 | order_id TEXT, 17 | item_name TEXT, 18 | item_id TEXT, 19 | item_description TEXT, 20 | item_price DECIMAL, 21 | item_quantity INT, 22 | order_status TEXT STATIC, 23 | order_timestamp TIMESTAMP STATIC, 24 | order_subtotal DECIMAL STATIC, 25 | order_shipping DECIMAL STATIC, 26 | order_tax DECIMAL STATIC, 27 | order_total DECIMAL STATIC, 28 | payment_summary TEXT STATIC, 29 | payment_details MAP STATIC, 30 | billing_summary TEXT STATIC, 31 | billing_details MAP STATIC, 32 | shipping_summary TEXT STATIC, 33 | shipping_details MAP STATIC, 34 | delivery_id TEXT STATIC, 35 | delivery_details MAP STATIC, 36 | PRIMARY KEY ((order_id),item_name,item_id) 37 | ); 38 | ```{{execute}} 39 | 40 | Create table `orders_by_user_item`: 41 | ```sql 42 | 
CREATE TABLE orders_by_user_item ( 43 | user_id TEXT, 44 | item_id TEXT, 45 | order_timestamp TIMESTAMP, 46 | order_id TEXT, 47 | PRIMARY KEY ((user_id,item_id),order_timestamp,order_id) 48 | ) WITH CLUSTERING ORDER BY (order_timestamp DESC, order_id ASC); 49 | ```{{execute}} 50 | 51 | 52 | Create table `order_status_history_by_id`: 53 | ```sql 54 | CREATE TABLE order_status_history_by_id ( 55 | order_id TEXT, 56 | status_timestamp TIMESTAMP, 57 | order_status TEXT, 58 | PRIMARY KEY ((order_id),status_timestamp) 59 | ) WITH CLUSTERING ORDER BY (status_timestamp DESC); 60 | ```{{execute}} 61 | 62 | 63 | -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-full-query-logging/step4.md: -------------------------------------------------------------------------------- 1 | Previously, you enabled full query logging for a Cassandra node using *nodetool*, but the logging will not remain enabled when the node is restarted unless you edit the `cassandra.yaml` file. In this step, you will learn how to configure some of the properties of full query logging. 2 | 3 | Click to open the `/usr/share/cassandra/conf/cassandra.yaml`{{open}} file in the editor. 4 | 5 | Add the YAML configuration to enable full query logging: 6 |
full_query_logging_options:
7 |
    log_dir: /tmp/fqllogs
8 | 9 | 10 | # Configurable Properties 11 | Here are the configurable properties for full query logging: 12 | 13 | - `log_dir`: Enables full query logging when set to an existing directory location. 14 | - `roll_cycle`: Sets the frequency at which log segments are rolled: DAILY, HOURLY (the default), or MINUTELY. 15 | - `block`: Determines whether writes to the full query log will block query completion if full query logging falls behind (default: true). 16 | - `max_queue_weight`: Sets the maximum size of the in-memory queue of full query logs to be written to disk before blocking occurs (default: 256 MiB). 17 | - `max_log_size`: Sets the maximum size of full query log files on disk (default: 16 GiB). After this value is exceeded, the oldest log file will be deleted. 18 | - `archive_command`: Optionally, provides a command that will be used to archive full query log files before deletion. 19 | - `max_archive_retries`: Sets the maximum number of times a failed archive command will be retried (default: 10). 20 | 21 | # Summary 22 | 23 | In this step, you learned how to enable full query logging in the `cassandra.yaml` file and explored the configurable properties of full query logging. 
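
The configurable properties listed in this step can be combined into a single `full_query_logging_options` block. The following is a minimal sketch, written in the style of this scenario's setup scripts, that writes such a block to a scratch file rather than to the live `cassandra.yaml`. The scratch path `/tmp/fql-example.yaml` is a hypothetical name chosen for illustration, the values are the defaults described in this step (with 256 MiB and 16 GiB written out in bytes), and the archive-related properties are left out because they depend on a site-specific script.

```shell
#!/bin/bash
# Sketch only: write an example full_query_logging_options block to a
# scratch file (hypothetical path), not to the live cassandra.yaml.
FQL_EXAMPLE=/tmp/fql-example.yaml

cat > "$FQL_EXAMPLE" <<'EOF'
full_query_logging_options:
    log_dir: /tmp/fqllogs
    roll_cycle: HOURLY
    block: true
    max_queue_weight: 268435456   # 256 MiB
    max_log_size: 17179869184     # 16 GiB
EOF

# log_dir has to point to an existing directory, so create it up front.
mkdir -p /tmp/fqllogs

# Show the block so it can be reviewed before copying it into cassandra.yaml.
cat "$FQL_EXAMPLE"
```

Keeping the example in a scratch file makes it easy to review the settings before pasting them into `cassandra.yaml` in the editor; the node would still need a restart to pick up the change.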
24 | 25 | 26 | -------------------------------------------------------------------------------- /cassandra-data-modeling/investment-data/step2.md: -------------------------------------------------------------------------------- 1 | Create table `accounts_by_user`: 2 | ```sql 3 | CREATE TABLE accounts_by_user ( 4 | username TEXT, 5 | account_number TEXT, 6 | cash_balance DECIMAL, 7 | name TEXT STATIC, 8 | PRIMARY KEY ((username),account_number) 9 | ); 10 | ```{{execute}} 11 | 12 | Create table `positions_by_account`: 13 | ```sql 14 | CREATE TABLE positions_by_account ( 15 | account TEXT, 16 | symbol TEXT, 17 | quantity DECIMAL, 18 | PRIMARY KEY ((account),symbol) 19 | ); 20 | ```{{execute}} 21 | 22 | Create table `trades_by_a_d`: 23 | ```sql 24 | CREATE TABLE trades_by_a_d ( 25 | account TEXT, 26 | trade_id TIMEUUID, 27 | type TEXT, 28 | symbol TEXT, 29 | shares DECIMAL, 30 | price DECIMAL, 31 | amount DECIMAL, 32 | PRIMARY KEY ((account),trade_id) 33 | ) WITH CLUSTERING ORDER BY (trade_id DESC); 34 | ```{{execute}} 35 | 36 | Create table `trades_by_a_td`: 37 | ```sql 38 | CREATE TABLE trades_by_a_td ( 39 | account TEXT, 40 | trade_id TIMEUUID, 41 | type TEXT, 42 | symbol TEXT, 43 | shares DECIMAL, 44 | price DECIMAL, 45 | amount DECIMAL, 46 | PRIMARY KEY ((account),type,trade_id) 47 | ) WITH CLUSTERING ORDER BY (type ASC, trade_id DESC); 48 | ```{{execute}} 49 | 50 | Create table `trades_by_a_std`: 51 | ```sql 52 | CREATE TABLE trades_by_a_std ( 53 | account TEXT, 54 | trade_id TIMEUUID, 55 | type TEXT, 56 | symbol TEXT, 57 | shares DECIMAL, 58 | price DECIMAL, 59 | amount DECIMAL, 60 | PRIMARY KEY ((account),symbol,type,trade_id) 61 | ) WITH CLUSTERING ORDER BY (symbol ASC, type ASC, trade_id DESC); 62 | ```{{execute}} 63 | 64 | Create table `trades_by_a_sd`: 65 | ```sql 66 | CREATE TABLE trades_by_a_sd ( 67 | account TEXT, 68 | trade_id TIMEUUID, 69 | type TEXT, 70 | symbol TEXT, 71 | shares DECIMAL, 72 | price DECIMAL, 73 | amount DECIMAL, 74 | PRIMARY KEY 
((account),symbol,trade_id) 75 | ) WITH CLUSTERING ORDER BY (symbol ASC, trade_id DESC); 76 | ```{{execute}} -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-internode-message/index.json: -------------------------------------------------------------------------------- 1 | { 2 | "title": "Apache Cassandra™ Internode Messaging Improvements", 3 | "description": "New Features in Cassandra 4", 4 | "difficulty": "Beginner", 5 | "time": "10 minutes", 6 | "details": { 7 | "assets": { 8 | "host01": [ 9 | {"file": "wait.sh", "target": "/usr/local/bin/", "chmod": "+x"}, 10 | {"file": "in.cql", "target": "/root/"}, 11 | {"file": "out.cql", "target": "/root/"} 12 | ], 13 | "host02": [ 14 | {"file": "wait.sh", "target": "/usr/local/bin/", "chmod": "+x"}, 15 | {"file": "in.cql", "target": "/root/"}, 16 | {"file": "out.cql", "target": "/root/"} 17 | ] 18 | }, 19 | "steps": [ 20 | { 21 | "title": "Welcome to Internode Messaging", 22 | "text": "step1.md" 23 | }, 24 | { 25 | "title": "Asynchronous Messages", 26 | "text": "step2.md" 27 | }, 28 | { 29 | "title": "Cleaning Up Technical Debt", 30 | "text": "step3.md" 31 | }, 32 | { 33 | "title": "Check Out the Cluster", 34 | "text": "step4.md" 35 | }, 36 | { 37 | "title": "Internode Metrics and Virtual Tables", 38 | "text": "step5.md" 39 | }, 40 | { 41 | "title": "Review the Metrics", 42 | "text": "step6.md" 43 | }, 44 | { 45 | "title": "Do the Metrics Add Up?", 46 | "text": "step7.md" 47 | }, 48 | { 49 | "title": "Test your understanding", 50 | "text": "quiz.md" 51 | } 52 | ], 53 | "intro": { 54 | "courseData": "background.sh", 55 | "code": "foreground.sh", 56 | "text": "intro.md" 57 | }, 58 | "finish": { 59 | "text": "finish.md" 60 | } 61 | }, 62 | "environment": { 63 | "uilayout": "terminal" 64 | }, 65 | "backend": { 66 | "imageid": "docker-swarm" 67 | } 68 | } 69 | -------------------------------------------------------------------------------- 
/cassandra-data-modeling/time-series-data/index.json: -------------------------------------------------------------------------------- 1 | { 2 | "title": "Time Series Data Modeling Example for Cassandra", 3 | "description": "Explore how time series data can be stored and queried in Cassandra NoSQL database", 4 | "difficulty": "Beginner", 5 | "time": "15 minutes", 6 | "details": { 7 | "assets": { 8 | "host01": [ 9 | {"file": "wait.sh", "target": "/usr/local/bin/", "chmod": "+x"}, 10 | {"file": "time_series_data.tar.gz", "target": "/root/"} 11 | ] 12 | }, 13 | "steps": [ 14 | { 15 | "title": "Create a keyspace", 16 | "text": "step1.md" 17 | }, 18 | { 19 | "title": "Create tables", 20 | "text": "step2.md" 21 | }, 22 | { 23 | "title": "Populate tables using DSBulk", 24 | "text": "step3.md" 25 | }, 26 | { 27 | "title": "Start the CQL shell", 28 | "text": "step4.md" 29 | }, 30 | { 31 | "title": "Design query Q1", 32 | "text": "step5.md" 33 | }, 34 | { 35 | "title": "Design query Q2", 36 | "text": "step6.md" 37 | }, 38 | { 39 | "title": "Design query Q3", 40 | "text": "step7.md" 41 | }, 42 | { 43 | "title": "Design query Q4", 44 | "text": "step8.md" 45 | }, 46 | { 47 | "title": "Design query Q5", 48 | "text": "step9.md" 49 | }, 50 | { 51 | "title": "Design query Q6", 52 | "text": "step10.md" 53 | }, 54 | { 55 | "title": "Design query Q7", 56 | "text": "step11.md" 57 | } 58 | ], 59 | "intro": { 60 | "courseData": "background.sh", 61 | "code": "foreground.sh", 62 | "text": "intro.md" 63 | }, 64 | "finish": { 65 | "text": "finish.md" 66 | } 67 | }, 68 | "environment": { 69 | "uilayout": "terminal" 70 | }, 71 | "backend": { 72 | "imageid": "ubuntu20.04" 73 | } 74 | } 75 | -------------------------------------------------------------------------------- /cassandra-fundamentals/queries/index.json: -------------------------------------------------------------------------------- 1 | { 2 | "title": "Queries in Apache Cassandra™", 3 | "description": "Learn how to retrieve data from 
Cassandra tables", 4 | "difficulty": "Beginner", 5 | "time": "20 minutes", 6 | "details": { 7 | "assets": { 8 | "host01": [ 9 | {"file": "wait.sh", "target": "/usr/local/bin/", "chmod": "+x"} 10 | ] 11 | }, 12 | "steps": [ 13 | { 14 | "title": "Querying tables", 15 | "text": "step1.md" 16 | }, 17 | { 18 | "title": "Syntax", 19 | "text": "step2.md" 20 | }, 21 | { 22 | "title": "Let's get started ...", 23 | "text": "step3.md" 24 | }, 25 | { 26 | "title": "Querying table \"users\"", 27 | "text": "step4.md" 28 | }, 29 | { 30 | "title": "Querying table \"movies\"", 31 | "text": "step5.md" 32 | }, 33 | { 34 | "title": "Querying table \"ratings_by_user\"", 35 | "text": "step6.md" 36 | }, 37 | { 38 | "title": "Querying table \"ratings_by_movie\"", 39 | "text": "step7.md" 40 | }, 41 | { 42 | "title": "Using aggregates and functions", 43 | "text": "step8.md" 44 | }, 45 | { 46 | "title": "Grouping rows", 47 | "text": "step9.md" 48 | }, 49 | { 50 | "title": "Ordering rows", 51 | "text": "step10.md" 52 | }, 53 | { 54 | "title": "Setting limits", 55 | "text": "step11.md" 56 | }, 57 | { 58 | "title": "Test Your Understanding", 59 | "text": "quiz.md" 60 | } 61 | ], 62 | "intro": { 63 | "courseData": "background.sh", 64 | "code": "foreground.sh", 65 | "text": "intro.md" 66 | }, 67 | "finish": { 68 | "text": "finish.md" 69 | } 70 | }, 71 | "environment": { 72 | "uilayout": "terminal" 73 | }, 74 | "backend": { 75 | "imageid": "ubuntu20.04" 76 | } 77 | } 78 | -------------------------------------------------------------------------------- /cassandra-data-modeling/music-data/background.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | apt-get update 3 | apt install -y openjdk-11-jre-headless 4 | export JAVA_HOME="/usr/lib/jvm/java-11-openjdk-amd64" 5 | tar -xzvf music_data.tar.gz 6 | rm music_data.tar.gz 7 | wget https://downloads.datastax.com/dsbulk/dsbulk.tar.gz 8 | tar -xzvf dsbulk.tar.gz 9 | rm dsbulk.tar.gz 10 | mv dsbulk* 
dsbulk 11 | echo 'PATH="$PATH:/root/dsbulk/bin"' >> .bashrc 12 | export PATH="$PATH:/root/dsbulk/bin" 13 | wget https://archive.apache.org/dist/cassandra/4.0.0/apache-cassandra-4.0.0-bin.tar.gz 14 | tar xzf apache-cassandra-4.0.0-bin.tar.gz 15 | sed -i 's/^cluster_name: .*$/cluster_name: "Cassandra Cluster"/g' apache-cassandra-4.0.0/conf/cassandra.yaml 16 | #sed -i "s/^num_tokens:.*$/num_tokens: 1/g" apache-cassandra-4.0.0/conf/cassandra.yaml 17 | #sed -i "s/^# initial_token:.*$/initial_token: -9223372036854775808/g" apache-cassandra-4.0.0/conf/cassandra.yaml 18 | sed -i 's/^endpoint_snitch: .*$/endpoint_snitch: GossipingPropertyFileSnitch/g' apache-cassandra-4.0.0/conf/cassandra.yaml 19 | sed -i 's/^dc=.*$/dc=DC-Houston/g' apache-cassandra-4.0.0/conf/cassandra-rackdc.properties 20 | sed -i "s/^listen_address:.*$/listen_address: 127.0.0.1/g" apache-cassandra-4.0.0/conf/cassandra.yaml 21 | sed -i 's/^rpc_address:.*$/rpc_address: 127.0.0.1/g' apache-cassandra-4.0.0/conf/cassandra.yaml 22 | echo '127.0.0.1 node1' >> /etc/hosts 23 | #echo '[[HOST2_IP]] node2' >> /etc/hosts 24 | sed -i 's/^ - seeds:.*$/ - seeds: "127.0.0.1"/g' apache-cassandra-4.0.0/conf/cassandra.yaml 25 | mv apache-cassandra-4.0.0 /usr/share/cassandra 26 | rm apache-cassandra-4.0.0-bin.tar.gz 27 | echo 'PATH="$PATH:/usr/share/cassandra/bin:/usr/share/cassandra/tools/bin"' >> .bashrc 28 | export PATH="$PATH:/usr/share/cassandra/bin:/usr/share/cassandra/tools/bin" 29 | source .bashrc 30 | /usr/share/cassandra/bin/cassandra -R 31 | while [ `grep "Starting listening for CQL clients" /usr/share/cassandra/logs/system.log | wc -l` -lt 1 ]; do 32 | sleep 15 33 | done 34 | echo "done" >> /opt/katacoda-background-finished 35 | -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-repair-improvements/index.json: -------------------------------------------------------------------------------- 1 | { 2 | "title": "Repair Improvements", 3 | "description": 
"Learn how to manage incremental repair in a Cassandra 4.0 cluster", 4 | "difficulty": "Intermediate", 5 | "time": "25 minutes", 6 | "details": { 7 | "assets": { 8 | "host01": [ 9 | {"file": "wait.sh", "target": "/usr/local/bin/", "chmod": "+x"}, 10 | {"file": "delete_nongases.cql", "target": "/root/"}, 11 | {"file": "elements.csv", "target": "/root/"} 12 | ], 13 | "host02": [ 14 | {"file": "wait.sh", "target": "/usr/local/bin/", "chmod": "+x"}, 15 | {"file": "delete_nongases.cql", "target": "/root/"}, 16 | {"file": "elements.csv", "target": "/root/"} 17 | ] 18 | }, 19 | "steps": [ 20 | { 21 | "title": "Setup & create data", 22 | "text": "step1.md" 23 | }, 24 | { 25 | "title": "Fun with SSTables", 26 | "text": "step2.md" 27 | }, 28 | { 29 | "title": "The need for repair", 30 | "text": "step3.md" 31 | }, 32 | { 33 | "title": "Incremental repair", 34 | "text": "step4.md" 35 | }, 36 | { 37 | "title": "Test Your Understanding", 38 | "text": "quiz.md" 39 | } 40 | ], 41 | "intro": { 42 | "courseData": "background.sh", 43 | "code": "foreground.sh", 44 | "text": "intro.md" 45 | }, 46 | "finish": { 47 | "text": "finish.md" 48 | } 49 | }, 50 | "environment": { 51 | "uilayout": "terminal", 52 | "terminals": [ 53 | {"name": "Node1 Admin", "target": "host01"}, 54 | {"name": "Node1 Console", "target": "host01"}, 55 | {"name": "Node1 CQLSH", "target": "host01"}, 56 | {"name": "Node2 Admin", "target": "host02"}, 57 | {"name": "Node2 Console", "target": "host02"}, 58 | {"name": "Node2 CQLSH", "target": "host02"} 59 | ] 60 | }, 61 | "backend": { 62 | "imageid": "docker-swarm" 63 | } 64 | } 65 | -------------------------------------------------------------------------------- /cassandra-fundamentals/queries/step7.md: -------------------------------------------------------------------------------- 1 | Table `ratings_by_movie` stores information about ratings organized by movies, 2 | such that each partition contains all ratings for one particular movie. 
3 | This table has multi-row partitions and 4 | the primary key defined as `PRIMARY KEY ((title, year), email)`. 5 | Let's first retrieve all rows from the table to learn what the data looks like and then focus 6 | on predicates that the primary key can support. 7 | 8 | Q1. Retrieve all rows: 9 |
10 | Solution 11 | 12 | ``` 13 | SELECT * FROM ratings_by_movie; 14 | ```{{execute}} 15 | 16 |
17 | 18 |
19 | 20 | Q2. Retrieve one partition: 21 |
22 | Solution 23 | 24 | ``` 25 | SELECT * FROM ratings_by_movie 26 | WHERE title = 'Alice in Wonderland' 27 | AND year = 2010; 28 | ```{{execute}} 29 | 30 |
31 | 32 |
33 | 34 | Q3. Retrieve two partitions: 35 |
36 | Solution 37 | 38 | ``` 39 | SELECT * FROM ratings_by_movie 40 | WHERE title = 'Alice in Wonderland' 41 | AND year IN (2010, 1951); 42 | ```{{execute}} 43 | 44 |
45 | 46 |
47 | 48 | Q4. Retrieve one row: 49 |
50 | Solution 51 | 52 | ``` 53 | SELECT * FROM ratings_by_movie 54 | WHERE title = 'Alice in Wonderland' 55 | AND year = 2010 56 | AND email = 'joe@datastax.com'; 57 | ```{{execute}} 58 | 59 |
60 | 61 |
62 | 63 | Q5 - Q6. Retrieve a subset of rows from a partition: 64 |
65 | Solution 1 66 | 67 | ``` 68 | SELECT * FROM ratings_by_movie 69 | WHERE title = 'Alice in Wonderland' 70 | AND year = 2010 71 | AND email IN ('jen@datastax.com', 72 | 'jim@datastax.com'); 73 | ```{{execute}} 74 | 75 |
76 |
77 | Solution 2 78 | 79 | ``` 80 | SELECT * FROM ratings_by_movie 81 | WHERE title = 'Alice in Wonderland' 82 | AND year = 2010 83 | AND email < 'job@datastax.com'; 84 | ```{{execute}} 85 | 86 |
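The two solutions for Q5 - Q6 can be compared with a small model. This is only a sketch, assuming the partition for 'Alice in Wonderland' (2010) holds ratings from exactly the three emails that appear in this step (the real data set may contain more). The names `select_in` and `select_less_than` are hypothetical helpers, not part of any driver. Text clustering values are compared lexicographically, and for ASCII strings Python's string comparison gives the same ordering.

```python
# Hypothetical contents of one partition; only the clustering key (email) is modeled.
emails = sorted(["joe@datastax.com", "jen@datastax.com", "jim@datastax.com"])


def select_in(rows, wanted):
    """Model of `email IN (...)`: keep rows whose email is listed."""
    return [email for email in rows if email in wanted]


def select_less_than(rows, bound):
    """Model of `email < bound`: lexicographic comparison on the clustering key."""
    return [email for email in rows if email < bound]


solution_1 = select_in(emails, {"jen@datastax.com", "jim@datastax.com"})
solution_2 = select_less_than(emails, "job@datastax.com")

print(solution_1)
print(solution_2)
```

Under that assumption both predicates pick the same two rows, because `'joe...'` sorts after `'job...'` while `'jen...'` and `'jim...'` sort before it.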
-------------------------------------------------------------------------------- /cassandra-data-modeling/music-data/step2.md: -------------------------------------------------------------------------------- 1 | Create tables `performers`, `albums_by_performer`, `albums_by_title`, 2 | `albums_by_genre`, `tracks_by_title`, `tracks_by_album`, `users` and `tracks_by_user`: 3 | ```sql 4 | cqlsh -e " 5 | 6 | USE music_data; 7 | 8 | CREATE TABLE performers ( 9 | name TEXT, 10 | type TEXT, 11 | country TEXT, 12 | born INT, 13 | died INT, 14 | founded INT, 15 | PRIMARY KEY ((name)) 16 | ); 17 | 18 | CREATE TABLE albums_by_performer ( 19 | performer TEXT, 20 | year INT, 21 | title TEXT, 22 | genre TEXT, 23 | PRIMARY KEY ((performer),year,title) 24 | ) WITH CLUSTERING ORDER BY (year DESC, title ASC); 25 | 26 | CREATE TABLE albums_by_title ( 27 | title TEXT, 28 | year INT, 29 | performer TEXT, 30 | genre TEXT, 31 | PRIMARY KEY ((title),year) 32 | ) WITH CLUSTERING ORDER BY (year DESC); 33 | 34 | CREATE TABLE albums_by_genre ( 35 | genre TEXT, 36 | year INT, 37 | title TEXT, 38 | performer TEXT, 39 | PRIMARY KEY ((genre),year,title) 40 | ) WITH CLUSTERING ORDER BY (year DESC, title ASC); 41 | 42 | CREATE TABLE tracks_by_title ( 43 | title TEXT, 44 | album_year INT, 45 | album_title TEXT, 46 | number INT, 47 | length INT, 48 | genre TEXT, 49 | PRIMARY KEY ((title),album_year,album_title,number) 50 | ) WITH CLUSTERING ORDER BY (album_year DESC, album_title ASC, number ASC); 51 | 52 | CREATE TABLE tracks_by_album ( 53 | album_title TEXT, 54 | album_year INT, 55 | number INT, 56 | title TEXT, 57 | length INT, 58 | genre TEXT STATIC, 59 | PRIMARY KEY ((album_title,album_year),number) 60 | ); 61 | 62 | CREATE TABLE users ( 63 | id UUID, 64 | name TEXT, 65 | PRIMARY KEY ((id)) 66 | ); 67 | 68 | CREATE TABLE tracks_by_user ( 69 | id UUID, 70 | month DATE, 71 | timestamp TIMESTAMP, 72 | album_title TEXT, 73 | album_year INT, 74 | number INT, 75 | title TEXT, 76 | length INT, 77 | 
PRIMARY KEY ((id,month),timestamp,album_title,album_year,number) 78 | ) WITH CLUSTERING ORDER BY (timestamp DESC, album_title ASC, album_year ASC, number ASC);" 79 | ```{{execute}} -------------------------------------------------------------------------------- /cassandra-data-modeling/time-series-data/background.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | apt-get update 3 | apt install -y openjdk-11-jre-headless 4 | export JAVA_HOME="/usr/lib/jvm/java-11-openjdk-amd64" 5 | tar -xzvf time_series_data.tar.gz 6 | rm time_series_data.tar.gz 7 | wget https://downloads.datastax.com/dsbulk/dsbulk.tar.gz 8 | tar -xzvf dsbulk.tar.gz 9 | rm dsbulk.tar.gz 10 | mv dsbulk* dsbulk 11 | echo 'PATH="$PATH:/root/dsbulk/bin"' >> .bashrc 12 | export PATH="$PATH:/root/dsbulk/bin" 13 | wget https://archive.apache.org/dist/cassandra/4.0.0/apache-cassandra-4.0.0-bin.tar.gz 14 | tar xzf apache-cassandra-4.0.0-bin.tar.gz 15 | sed -i 's/^cluster_name: .*$/cluster_name: "Cassandra Cluster"/g' apache-cassandra-4.0.0/conf/cassandra.yaml 16 | #sed -i "s/^num_tokens:.*$/num_tokens: 1/g" apache-cassandra-4.0.0/conf/cassandra.yaml 17 | #sed -i "s/^# initial_token:.*$/initial_token: -9223372036854775808/g" apache-cassandra-4.0.0/conf/cassandra.yaml 18 | sed -i 's/^endpoint_snitch: .*$/endpoint_snitch: GossipingPropertyFileSnitch/g' apache-cassandra-4.0.0/conf/cassandra.yaml 19 | sed -i 's/^dc=.*$/dc=DC-Houston/g' apache-cassandra-4.0.0/conf/cassandra-rackdc.properties 20 | sed -i "s/^listen_address:.*$/listen_address: 127.0.0.1/g" apache-cassandra-4.0.0/conf/cassandra.yaml 21 | sed -i 's/^rpc_address:.*$/rpc_address: 127.0.0.1/g' apache-cassandra-4.0.0/conf/cassandra.yaml 22 | echo '127.0.0.1 node1' >> /etc/hosts 23 | #echo '[[HOST2_IP]] node2' >> /etc/hosts 24 | sed -i 's/^ - seeds:.*$/ - seeds: "127.0.0.1"/g' apache-cassandra-4.0.0/conf/cassandra.yaml 25 | mv apache-cassandra-4.0.0 /usr/share/cassandra 26 | rm 
apache-cassandra-4.0.0-bin.tar.gz 27 | echo 'PATH="$PATH:/usr/share/cassandra/bin:/usr/share/cassandra/tools/bin"' >> .bashrc 28 | export PATH="$PATH:/usr/share/cassandra/bin:/usr/share/cassandra/tools/bin" 29 | source .bashrc 30 | /usr/share/cassandra/bin/cassandra -R 31 | while [ `grep "Starting listening for CQL clients" /usr/share/cassandra/logs/system.log | wc -l` -lt 1 ]; do 32 | sleep 15 33 | done 34 | echo "done" >> /opt/katacoda-background-finished 35 | -------------------------------------------------------------------------------- /cassandra-data-modeling/time-series-data/step2.md: -------------------------------------------------------------------------------- 1 | Create tables `sources_by_group`, `metrics`, `series_by_source_high`, 2 | `series_by_source_low`, `series_by_metric_high`, `series_by_metric_low` and `statistics_by_source_metric`: 3 | ```sql 4 | cqlsh -e " 5 | 6 | USE time_series; 7 | 8 | CREATE TABLE sources_by_group ( 9 | group TEXT, 10 | source TEXT, 11 | characteristics MAP, 12 | description TEXT STATIC, 13 | PRIMARY KEY ((group), source) 14 | ); 15 | 16 | CREATE TABLE metrics ( 17 | bucket TEXT, 18 | metric TEXT, 19 | unit TEXT, 20 | PRIMARY KEY ((bucket), metric) 21 | ); 22 | 23 | CREATE TABLE series_by_source_high ( 24 | group TEXT, 25 | source TEXT, 26 | timestamp TIMESTAMP, 27 | metric TEXT, 28 | value DECIMAL, 29 | PRIMARY KEY ((group, source), timestamp, metric) 30 | ) WITH CLUSTERING ORDER BY (timestamp DESC, metric ASC); 31 | 32 | CREATE TABLE series_by_source_low ( 33 | group TEXT, 34 | year INT, 35 | source TEXT, 36 | timestamp TIMESTAMP, 37 | metric TEXT, 38 | value DECIMAL, 39 | PRIMARY KEY ((group, year), source, timestamp, metric) 40 | ) WITH CLUSTERING ORDER BY (source ASC, timestamp DESC, metric ASC); 41 | 42 | CREATE TABLE series_by_metric_high ( 43 | group TEXT, 44 | metric TEXT, 45 | timestamp TIMESTAMP, 46 | source TEXT, 47 | value DECIMAL, 48 | PRIMARY KEY ((group, metric), timestamp, source) 49 | ) WITH CLUSTERING 
ORDER BY (timestamp DESC, source ASC); 50 | 51 | CREATE TABLE series_by_metric_low ( 52 | group TEXT, 53 | year INT, 54 | metric TEXT, 55 | timestamp TIMESTAMP, 56 | source TEXT, 57 | value DECIMAL, 58 | PRIMARY KEY ((group, year, metric), timestamp, source) 59 | ) WITH CLUSTERING ORDER BY (timestamp DESC, source ASC); 60 | 61 | CREATE TABLE statistics_by_source_metric ( 62 | source TEXT, 63 | metric TEXT, 64 | date DATE, 65 | min DECIMAL, 66 | max DECIMAL, 67 | median DECIMAL, 68 | mean DECIMAL, 69 | stdev DECIMAL, 70 | PRIMARY KEY ((source,metric),date) 71 | ) WITH CLUSTERING ORDER BY (date DESC);" 72 | ```{{execute}} 73 | 74 | -------------------------------------------------------------------------------- /cassandra-data-modeling/music-data/index.json: -------------------------------------------------------------------------------- 1 | { 2 | "title": "Digital Library Data Modeling Example for Cassandra", 3 | "description": "Explore how digital library data can be stored and queried in Cassandra NoSQL database", 4 | "difficulty": "Beginner", 5 | "time": "15 minutes", 6 | "details": { 7 | "assets": { 8 | "host01": [ 9 | {"file": "wait.sh", "target": "/usr/local/bin/", "chmod": "+x"}, 10 | {"file": "music_data.tar.gz", "target": "/root/"} 11 | ] 12 | }, 13 | "steps": [ 14 | { 15 | "title": "Create a keyspace", 16 | "text": "step1.md" 17 | }, 18 | { 19 | "title": "Create tables", 20 | "text": "step2.md" 21 | }, 22 | { 23 | "title": "Populate tables using DSBulk", 24 | "text": "step3.md" 25 | }, 26 | { 27 | "title": "Insert rows using the CQL shell", 28 | "text": "step4.md" 29 | }, 30 | { 31 | "title": "Design query Q1", 32 | "text": "step5.md" 33 | }, 34 | { 35 | "title": "Design query Q2", 36 | "text": "step6.md" 37 | }, 38 | { 39 | "title": "Design query Q3", 40 | "text": "step7.md" 41 | }, 42 | { 43 | "title": "Design query Q4", 44 | "text": "step8.md" 45 | }, 46 | { 47 | "title": "Design query Q5", 48 | "text": "step9.md" 49 | }, 50 | { 51 | "title": "Design 
query Q6", 52 | "text": "step10.md" 53 | }, 54 | { 55 | "title": "Design query Q7", 56 | "text": "step11.md" 57 | }, 58 | { 59 | "title": "Design query Q8", 60 | "text": "step12.md" 61 | }, 62 | { 63 | "title": "Design query Q9", 64 | "text": "step13.md" 65 | } 66 | ], 67 | "intro": { 68 | "courseData": "background.sh", 69 | "code": "foreground.sh", 70 | "text": "intro.md" 71 | }, 72 | "finish": { 73 | "text": "finish.md" 74 | } 75 | }, 76 | "environment": { 77 | "uilayout": "terminal" 78 | }, 79 | "backend": { 80 | "imageid": "ubuntu20.04" 81 | } 82 | } 83 | -------------------------------------------------------------------------------- /cassandra-features-4x/virtual-tables/step2.md: -------------------------------------------------------------------------------- 1 | Let's have a look at the virtual tables that are available, and the specific 2 | keyspaces where these tables are located. 3 | 4 | First, list the existing keyspaces in the CQL console with: 5 | ``` 6 | DESCRIBE KEYSPACES; 7 | ```{{execute T2}} 8 | 9 | Notice there are two keyspaces named `system_virtual_schema` and `system_views`. 10 | They are special-purpose keyspaces designed to host virtual tables only. 11 | 12 | We will work with the `system_views` keyspace, so let us make it the default 13 | one for subsequent operations: 14 | ``` 15 | USE system_views; 16 | ```{{execute T2}} 17 | 18 | Get a listing of all tables in this keyspace: 19 | ``` 20 | DESCRIBE TABLES; 21 | ```{{execute T2}} 22 | 23 | A virtual table that provides valuable insight into read performance 24 | is the `tombstones_per_read` table: it contains a row for each table 25 | in the database, with statistics on how many tombstones are encountered 26 | while reading. 
27 | Let's have a closer look at the table structure with: 28 | ``` 29 | DESCRIBE TABLE tombstones_per_read; 30 | ```{{execute T2}} 31 | 32 | Remember that the output of this command is for reference only, since 33 | (as we will soon see) you cannot directly alter these tables. 34 | 35 | Look at the primary key for table `tombstones_per_read`: there is indeed 36 | one row for each table in the database. 37 | One can imagine building 38 | a performance-monitoring tool based on the contents of these rows. 39 | 40 | Another very important table is the `settings` table. 41 | It provides a way to 42 | programmatically access the whole configuration as specified in file 43 | `cassandra.yaml`. 44 | Try reading the table in its entirety: 45 | ``` 46 | SELECT * FROM settings; 47 | ```{{execute T2}} 48 | 49 | How many rows does the table contain? _(Hint: press_ Enter _until you get 50 | to the total row count.)_ 51 | 52 | Try looking for a particular setting: 53 | ``` 54 | SELECT value FROM settings WHERE name = 'num_tokens'; 55 | ```{{execute T2}} 56 | 57 | Can you find out what _data type_ the `value` column is? How? 58 | _(Hint: the answer can be found in the output of one of the commands above.)_ 59 | Do you have an explanation for this choice of data type? 60 | -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-audit-logging/step3.md: -------------------------------------------------------------------------------- 1 | In this step, you will stop the Cassandra service, enable audit logging in `cassandra.yaml`, re-start the Cassandra service, insert some data and view the inserts in the *audit log*. 2 | 3 | --- 4 |

5 | **Note:** 6 | Settings in `cassandra.yaml` only take effect after a node start or re-start. 7 |

8 | --- 9 | 10 | Stop the Cassandra service 11 | ``` 12 | service cassandra stop 13 | ```{{execute}} 14 | 15 | Verify that Cassandra has stopped 16 | ``` 17 | nodetool status 18 | ```{{execute}} 19 | 20 | You should see a message like this: 21 | ![Stopped](./assets/stopped.png) 22 | 23 | Click to open the `/etc/cassandra/cassandra.yaml`{{open}} file in the editor. 24 | 25 | Add the YAML configuration to enable audit logging: 26 |
audit_logging_options:
27 |
    enabled: true
28 | 29 | Re-start the Cassandra service 30 | ``` 31 | service cassandra start 32 | ```{{execute}} 33 | 34 | Verify that Cassandra has started 35 | ``` 36 | nodetool status 37 | ```{{execute}} 38 | 39 | --- 40 |

41 | **Note:** 42 | You may need to run `nodetool status` a few times before Cassandra has finished the startup process. 43 |

44 | --- 45 | 46 | Next you will insert another song and verify that the insertion shows up in the audit logs. 47 | 48 | Open cqlsh 49 | ``` 50 | cqlsh 51 | ```{{execute}} 52 | 53 | 54 | Insert another song into the *songs* table. 55 | ``` 56 | use music; 57 | INSERT INTO songs (artist, title, year) VALUES('Paul Simon', 'Kodachrome', 1973); 58 | ```{{execute}} 59 | 60 | Type `exit` to close *cqlsh*. 61 | ``` 62 | exit 63 | ```{{execute}} 64 | 65 | View the audit logs. 66 | ``` 67 | auditlogviewer /var/log/cassandra/audit 68 | ```{{execute}} 69 | 70 | You should now see that Paul Simon's *Kodachrome* has been inserted. 71 | 72 | ![Kodachrome](./assets/simon.png) 73 | 74 | # Summary 75 | 76 | In this step, you modified `cassandra.yaml` and re-started the server to enable audit logging. You then used *auditlogviewer* to verify that the operations you performed were recorded in the audit logs. -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-migrate-cassandra-3-to-4/step2.md: -------------------------------------------------------------------------------- 1 | In this step, you will create a keyspace and a table and populate them with some data. 2 | 3 | Click to start a CQL shell (cqlsh) to execute CQL commands in the cluster. 4 | ``` 5 | cqlsh 6 | ```{{execute T1}} 7 | 8 | Create a keyspace. 9 | ``` 10 | CREATE KEYSPACE united_states WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}; 11 | ```{{execute T1}} 12 | 13 | Use the keyspace. 14 | ``` 15 | USE united_states; 16 | ```{{execute T1}} 17 | 18 | Create the table. 19 | ``` 20 | CREATE TABLE cities_by_state( 21 | state text, 22 | name text, 23 | population int, 24 | PRIMARY KEY((state), name) 25 | ); 26 | ```{{execute T1}} 27 | 28 | Insert the top 10 (by population) cities in the United States. 
29 | ``` 30 | INSERT INTO cities_by_state (state, name, population) 31 | VALUES ('New York','New York City',8622357); 32 | INSERT INTO cities_by_state (state, name, population) 33 | VALUES ('California','Los Angeles',4085014); 34 | INSERT INTO cities_by_state (state, name, population) 35 | VALUES ('Illinois','Chicago',2670406); 36 | INSERT INTO cities_by_state (state, name, population) 37 | VALUES ('Texas','Houston',2378146); 38 | INSERT INTO cities_by_state (state, name, population) 39 | VALUES ('Arizona','Phoenix',1743469); 40 | INSERT INTO cities_by_state (state, name, population) 41 | VALUES ('Pennsylvania','Philadelphia',1590402); 42 | INSERT INTO cities_by_state (state, name, population) 43 | VALUES ('Texas','San Antonio',1579504); 44 | INSERT INTO cities_by_state (state, name, population) 45 | VALUES ('California','San Diego',1469490); 46 | INSERT INTO cities_by_state (state, name, population) 47 | VALUES ('Texas','Dallas',1400337); 48 | INSERT INTO cities_by_state (state, name, population) 49 | VALUES ('California','San Jose',1036242); 50 | ```{{execute T1}} 51 | 52 | Verify that the data has been loaded. 53 | ``` 54 | SELECT * FROM cities_by_state; 55 | ```{{execute T1}} 56 | 57 | Retrieve all the cities in California. 58 | ``` 59 | SELECT * FROM cities_by_state WHERE state = 'California'; 60 | ```{{execute T1}} 61 | 62 | Exit the CQL shell and clear the screen. 63 | ``` 64 | exit 65 | clear 66 | ```{{execute T1}} 67 | 68 | You have loaded the data, continue to the next step. -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-full-query-logging/step2.md: -------------------------------------------------------------------------------- 1 | In this step, you will connect using *cqlsh* and create a keyspace and table, perform some queries, and verify that full query logs are being created 2 | 3 | Start the CQL Shell (*cqlsh*) so you can issue CQL commands. 
4 | 5 | ``` 6 | cqlsh 7 | ```{{execute}} 8 | 9 | Create the `movies` keyspace. 10 | 11 | ``` 12 | create KEYSPACE movies WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}; 13 | ```{{execute}} 14 | 15 | Use the `movies` keyspace. 16 | 17 | ``` 18 | use movies; 19 | ```{{execute}} 20 | 21 | Create the `movie_metadata` table. 22 | 23 | ``` 24 | CREATE TABLE movie_metadata( 25 | imdb_id text, 26 | overview text, 27 | release_date text, 28 | title text, 29 | average_rating float, 30 | PRIMARY KEY(imdb_id)); 31 | ```{{execute}} 32 | 33 | Insert a row into the *movie_metadata* table. 34 | ``` 35 | INSERT INTO movie_metadata ( 36 | imdb_id, overview, release_date, title, average_rating 37 | ) VALUES('tt0114709', 'Led by Woody, Andy''s toys live happily in his room until Andy''s birthday brings Buzz Lightyear onto the scene. Afraid of losing his place in Andy''s heart, Woody plots against Buzz. But when circumstances separate Buzz and Woody from their owner, the duo eventually learns to put aside their differences.', '10/30/95', 'Toy Story', 7.7); 38 | ```{{execute}} 39 | 40 | Now let's do a `SELECT` 41 | 42 | ``` 43 | SELECT * FROM movie_metadata WHERE imdb_id = 'tt0114709'; 44 | ```{{execute}} 45 | 46 | You should see the row you just inserted. 47 | 48 | Type `exit` to close *cqlsh*. 49 | ``` 50 | exit 51 | ```{{execute}} 52 | 53 | Now, let's check the contents of our log directory to see if anything has been created: 54 | 55 | ``` 56 | ls /tmp/fqllogs 57 | ```{{execute}} 58 | 59 | You'll see two files, a file with a date timestamp in the name, and another file which provides a directory of all the dated files that have been written. You can try opening these files if you wish, but the contents won't make a lot of sense since they are binary data. Don't worry, Cassandra has a way to read this data. 
60 | 61 | # Summary 62 | 63 | In this step, you have created the *movies* keyspace and the *movie_metadata* table, and performed some queries, and verified that full query logs were created. -------------------------------------------------------------------------------- /cassandra-features-4x/virtual-tables/step4.md: -------------------------------------------------------------------------------- 1 | Now, we want to look at the clients currently connected to this node through CQL. 2 | This is done by querying the virtual table `system_views.clients`: 3 | 4 | ``` 5 | SELECT port, connection_stage, driver_name, protocol_version, username FROM clients ; 6 | ```{{execute T2}} 7 | 8 | Wait a minute ... who are these clients? 9 | 10 | It turns out that `cqlsh` uses the Python driver. 11 | This driver keeps two connections alive on two different ports 12 | (the port numbers are chosen dynamically). 13 | So you are simply looking at the connection between your own `cqlsh` 14 | and the node. 15 | 16 | Let's create more connections. 17 | First, let's start a Python interpreter console (or _REPL_) and connect to the 18 | node from there. 19 | Go to the third terminal and type 20 | ``` 21 | python3 22 | ```{{execute T3}} 23 | 24 | Next, import the Python drivers and use them to connect to the local node 25 | (which is the default connection, so you don't need to provide IP addresses): 26 | ``` 27 | from dse.cluster import Cluster 28 | cluster = Cluster(protocol_version=4) 29 | session = cluster.connect() 30 | ```{{execute T3}} 31 | 32 | (Note: the drivers, `dse-driver==2.11.1`, have been preinstalled in Python for 33 | this scenario). 
34 | 35 | In the Python REPL, try the following loop - which achieves the same effect 36 | as the query you ran earlier in `cqlsh` - **press Enter** to 37 | make it run: 38 | ``` 39 | rows = session.execute('SELECT port, connection_stage, ' 40 | 'driver_name, protocol_version FROM ' 41 | 'system_views.clients') 42 | for row in rows: 43 | print('%5i %8s %36s %2i' % ( 44 | row.port, row.connection_stage, 45 | row.driver_name, row.protocol_version 46 | )) 47 | ```{{execute T3}} 48 | 49 | How many rows are there? Look at the ports used and the protocol versions. 50 | Notice that the latter matches the required version specified a few lines above, 51 | when creating the `Cluster` object (`protocol_version=4`). 52 | 53 | Suppose you want to make sure all your clients have been upgraded to the 54 | more recent protocol (version 5). Check by issuing, in `cqlsh`, 55 | the following command (note its `WHERE` clause): 56 | ``` 57 | SELECT address, protocol_version, username FROM clients WHERE protocol_version < 5 ALLOW FILTERING ; 58 | ```{{execute T2}} 59 | 60 | Recall that for virtual tables there's no need to worry about 61 | full-cluster scans. 62 | -------------------------------------------------------------------------------- /cassandra-features-4x/cassandra4-migrate-cassandra-3-to-4/step3.md: -------------------------------------------------------------------------------- 1 | In this step, we will verify that the Cassandra 3.x cluster is ready to be upgraded. There are 9 factors to consider: 2 | 3 | **Current State** 4 | All nodes in the cluster need to be in an ‘Up and Normal’ state. Check that there are no nodes in the cluster that are in a state other than *Up and Normal*. This command will list any nodes **not** in the *UN* state. 5 | ``` 6 | nodetool status | grep -v UN 7 | ```{{execute T1}} 8 | 9 | **Disk Space** 10 | Verify that each node has at least 50% disk space free. 
11 | ``` 12 | df -h 13 | ```{{execute T1}} 14 | 15 | **Errors** 16 | Ensure that there are no unresolved errors on nodes. Take a look at logged warnings as well. 17 | 18 | ``` 19 | grep -e "WARN" -e "ERROR" /usr/share/cassandra/logs/system.log 20 | ```{{execute T1}} 21 | 22 | **Gossip Stable** 23 | Verify all entries in the gossip information output have the gossip state ‘STATUS:NORMAL’. Use the following command to check if there are any nodes that have a status other than ‘NORMAL’. 24 | ``` 25 | nodetool gossipinfo | grep STATUS | grep -v NORMAL 26 | ```{{execute T1}} 27 | 28 | **Dropped Messages** 29 | Confirm that no dropped-message log entries have been recorded on any node in the previous 72 hours. 30 | ``` 31 | nodetool tpstats | grep -A 12 Dropped 32 | ```{{execute T1}} 33 | 34 | **Backups Disabled** 35 | Verify that all automatic backups have been disabled. This includes disabling *Medusa* and any scripts that call `nodetool snapshot` until the upgrade is complete. 36 | 37 | **Repair Disabled** 38 | Verify that *repairs* have been disabled. This includes disabling automated repairs in *Reaper*. 39 | 40 | **Monitoring** 41 | Upgrading may result in a temporary reduction in performance, as it simulates a series of temporary node failures. Understanding how the upgrade impacts the performance of the system, both during and after, is crucial when working through the process. 42 | 43 | **Availability** 44 | Confirm that areas of the application that require Strong Consistency are using the `LOCAL_QUORUM` Consistency Level and a Replication Factor of 3. 45 | 46 | When `LOCAL_QUORUM` is used with a Replication Factor below 3, all replica nodes must be available for requests to succeed. A rolling restart using this configuration will result in full or partial unavailability while a node is *DOWN*. 47 | 48 | --- 49 | 50 | Clear the screen. 51 | ``` 52 | clear 53 | ```{{execute T1}} 54 | 55 | You are now ready to continue to the next step and begin the upgrade. 
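The reasoning behind the **Availability** check can be made concrete with a little arithmetic. The sketch below is illustrative only: `quorum` and `tolerated_down` are hypothetical helper names, and the replication factor is taken to be the one of the local datacenter, which is what `LOCAL_QUORUM` considers. A quorum is a majority of replicas, that is, half the replication factor rounded down, plus one.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that must respond for a quorum (majority of replicas)."""
    return replication_factor // 2 + 1


def tolerated_down(replication_factor: int) -> int:
    """Number of replicas that may be unavailable while a quorum is still reachable."""
    return replication_factor - quorum(replication_factor)


for rf in (1, 2, 3):
    print(f"RF={rf}: quorum={quorum(rf)}, replicas that may be down={tolerated_down(rf)}")
```

With a replication factor of 1 or 2 no replica may be down, so restarting a single node during a rolling upgrade makes some requests fail. With a replication factor of 3, one replica at a time can be restarted.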
-------------------------------------------------------------------------------- /cassandra-fundamentals/queries/step2.md: -------------------------------------------------------------------------------- 1 | To retrieve data from a table, Cassandra Query Language provides statement `SELECT` with the following simplified syntax: 2 | 3 | ``` 4 | SELECT [DISTINCT] * | 5 | select_expression [AS column_name][ , ... ] 6 | FROM [keyspace_name.] table_name 7 | [WHERE partition_key_predicate 8 | [AND clustering_key_predicate]] 9 | [GROUP BY primary_key_column_name][ , ... ] 10 | [ORDER BY clustering_key_column_name ASC|DESC][ , ... ] 11 | [PER PARTITION LIMIT number] 12 | [LIMIT number] 13 | [ALLOW FILTERING] 14 | ``` 15 | 16 | The `SELECT` clause specifies what to project into a final result. The projection list can include all columns using wildcard `*`, 17 | individual column names, aggregates, such as `COUNT` and `AVG`, and numerous functions that work with write-time timestamps, 18 | TTLs, and values of various data types. It is even possible to create user-defined aggregates and functions using 19 | statements `CREATE AGGREGATE` and `CREATE FUNCTION`. 20 | 21 | The `FROM` clause uses keyspace name and table name to identify an existing table. 22 | If a keyspace name is omitted, the current working keyspace is used. 23 | 24 | The `WHERE` clause supplies partition and row filtering predicates. At the very least, 25 | *all* partition key column values should be provided. Predicates for *one or more* clustering key columns can 26 | further restrict the result, as long as the primary key definition order is respected. All predicates must be *equality* predicates (`=` and `IN`), 27 | except the last clustering key column predicate can be an *inequality* predicate (`>`, `<`, `>=`, `<=`). 28 | 29 | The `GROUP BY` clause can group rows based on partition and clustering key columns, as long as the primary key definition order is respected. 
30 | 31 | The `ORDER BY` clause can retrieve rows from each partition based on the clustering order declared in a table definition or its reverse. 32 | Even when `ORDER BY` is not used, a query result still preserves the clustering order. 33 | 34 | The `PER PARTITION LIMIT` and `LIMIT` clauses are used to specify the maximum number of rows per partition or overall, respectively, 35 | that can appear in a final result. 36 | 37 | Finally, `ALLOW FILTERING` allows Cassandra to scan data to execute queries. While this relaxes many restrictions on what predicates can be used in the `WHERE` clause, 38 | scanning is a very inefficient access pattern that should not be used in production. Only in rare cases, when a partition key is known, 39 | scanning rows within one partition may be ok. Even then, a new table, materialized view or secondary index should be considered instead as a better alternative. 40 | As a rule of thumb, you should avoid using `ALLOW FILTERING` in your queries and you can expect us to do the same in our examples. 
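The `WHERE` clause rules described above can be expressed as a small checker. This is a simplified model of the rules as stated in this step, not Cassandra's actual validation logic, and `where_is_supported` is a hypothetical name. Predicates are passed as a mapping from column name to operator; the example key mirrors a table with `PRIMARY KEY ((title, year), email)`.

```python
EQUALITY = {"=", "IN"}
INEQUALITY = {">", "<", ">=", "<="}


def where_is_supported(partition_key, clustering_key, predicates):
    """Return True if the predicates follow the simplified rules without ALLOW FILTERING."""
    known = set(partition_key) | set(clustering_key)
    if any(column not in known for column in predicates):
        return False
    # Every partition key column needs an equality predicate.
    if any(predicates.get(column) not in EQUALITY for column in partition_key):
        return False
    # Restricted clustering columns must form a prefix of the clustering key.
    used = [column for column in clustering_key if column in predicates]
    if used != list(clustering_key[:len(used)]):
        return False
    # Only the last restricted clustering column may use an inequality.
    if any(predicates[column] not in EQUALITY for column in used[:-1]):
        return False
    if used and predicates[used[-1]] not in EQUALITY | INEQUALITY:
        return False
    return True


pk, ck = ("title", "year"), ("email",)
print(where_is_supported(pk, ck, {"title": "=", "year": "IN"}))
print(where_is_supported(pk, ck, {"title": "=", "year": "=", "email": "<"}))
print(where_is_supported(pk, ck, {"title": "="}))
```

The third call fails the check because the `year` partition key column is missing, which is exactly the kind of query that would otherwise require `ALLOW FILTERING`.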
--------------------------------------------------------------------------------
/cassandra-data-modeling/music-data/step3.md:
--------------------------------------------------------------------------------
Load data into table `performers`:
```bash
dsbulk load -url performers.csv \
            -k music_data \
            -t performers \
            -header true \
            -logDir /tmp/logs
```{{execute}}

Retrieve some rows from table `performers`:
```bash
cqlsh -e "SELECT * FROM music_data.performers LIMIT 10;"
```{{execute}}

Load data into tables `albums_by_performer`, `albums_by_title` and `albums_by_genre`:
```bash
dsbulk load -url albums.csv \
            -k music_data \
            -t albums_by_performer \
            -header true \
            -logDir /tmp/logs

dsbulk load -url albums.csv \
            -k music_data \
            -t albums_by_title \
            -header true \
            -logDir /tmp/logs

dsbulk load -url albums.csv \
            -k music_data \
            -t albums_by_genre \
            -header true \
            -logDir /tmp/logs
```{{execute}}

Retrieve some rows from tables `albums_by_performer`, `albums_by_title` and `albums_by_genre`:
```bash
cqlsh -e "SELECT * FROM music_data.albums_by_performer LIMIT 5;"
cqlsh -e "SELECT * FROM music_data.albums_by_title LIMIT 5;"
cqlsh -e "SELECT * FROM music_data.albums_by_genre LIMIT 5;"
```{{execute}}

Load data into tables `tracks_by_title` and `tracks_by_album`:
```bash
dsbulk load -url tracks.csv \
            -k music_data \
            -t tracks_by_title \
            -header true \
            -m "0=album_title, \
                1=album_year, \
                2=genre, \
                3=number, \
                4=title" \
            -logDir /tmp/logs

dsbulk load -url tracks.csv \
            -k music_data \
            -t tracks_by_album \
            -header true \
            -m "0=album_title, \
                1=album_year, \
                2=genre, \
                3=number, \
                4=title" \
            -logDir /tmp/logs
```{{execute}}

Retrieve some rows from tables
`tracks_by_title` and `tracks_by_album`:
```bash
cqlsh -e "SELECT * FROM music_data.tracks_by_title LIMIT 5;"
cqlsh -e "SELECT * FROM music_data.tracks_by_album LIMIT 5;"
```{{execute}}

--------------------------------------------------------------------------------
/cassandra-features-4x/cassandra4-repair-improvements/step2.md:
--------------------------------------------------------------------------------
We are about to bring the cluster to the conditions that warrant a data
repair; but first, we have to make sure all recently inserted rows, probably
still lingering in memory (in the memtables), are flushed to disk in the
form of SSTables.

### Flushing data

Each time a table is created, it gets an ID that is used, among other things,
in the name of the directory containing the corresponding data.
To identify the full name of the data directory for `elements`, look at
the result of this command on Node1:
```
ls /usr/share/cassandra/data/data/chemistry/
```{{Execute T3}}
The output will be something similar to
`elements-8f40e960043011ec8f376feadc8291b4`.

Since we just inserted only a few rows, the data directory is probably
still empty.
This can be verified with the following command (**NOTE**: copy and paste the
actual ID in the command before executing):
```
ls /usr/share/cassandra/data/data/chemistry/elements-
```{{Execute T3}}
(There should just be a `backups` subdirectory for incremental backups - we
can ignore it here.)
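
As an alternative to copying the table ID by hand, the shell can resolve the directory name with a glob. This is a minimal sketch, not part of the original scenario: the helper name `table_dir` is made up for illustration, and it assumes the data root shown above. If a table was dropped and recreated, more than one directory may match, so the sketch simply takes the first one.

```shell
# Print every data directory that belongs to the given table.
# Arguments: data root, keyspace name, table name (helper name is hypothetical).
table_dir() {
  data_root="$1"
  keyspace="$2"
  table="$3"
  for dir in "$data_root/$keyspace/$table"-*/; do
    # An unmatched glob stays literal, so skip anything that is not a directory.
    [ -d "$dir" ] || continue
    printf '%s\n' "${dir%/}"
  done
}

# Assumption: same data root as in the commands above.
elements_dir="$(table_dir /usr/share/cassandra/data/data chemistry elements | head -n 1)"

if [ -n "$elements_dir" ]; then
  ls "$elements_dir"
else
  echo "no data directory found for chemistry.elements"
fi
```

The variable `elements_dir` can then be reused in later commands in the same shell session in place of the pasted path.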

Now we can force a flush of all insertions to disk, by executing the following
command on both nodes:
```
nodetool flush       # Node1
```{{Execute T3}}

```
nodetool flush       # Node2
```{{Execute T6}}

This time, inspection of the data directory will confirm that at least
one SSTable has been created. Remember that you can use the up-arrow key to
bring up a command you already typed, and re-execute the following:
```
ls /usr/share/cassandra/data/data/chemistry/elements-
```{{Execute T3}}
You will now see the SSTable files.
Notice in particular a file named `[...]-Data.db`, where the actual contents
of the table are stored.

### Examining SSTables

You can examine the repair status of this brand new SSTable with the following
command (**NOTE**: again, replace the actual table ID and the SSTable
file name for the command to work):
```
sstablemetadata /usr/share/cassandra/data/data/chemistry/elements-/-Data.db
```{{execute T3}}

Look for the repair information in the output: there should be two lines such as
```
...
Repaired at: 0
Pending repair: --
...
```

meaning, respectively, that this SSTable has never been repaired
and is not currently in the pending-repair pool of any running repair.

### Recap

We have forced a data flush to disk to make sure our SSTable files
are up-to-date; indeed the files are there and, as expected, have
never undergone any repair operation (...yet).

Now it's time to engineer a data misalignment between the two nodes,
to later see incremental repair in action!

--------------------------------------------------------------------------------