├── post-install
│   ├── mapr_fstab
│   ├── spark-terasort-1.1-SNAPSHOT.jar
│   ├── spark-terasort-1.2-SNAPSHOT.jar
│   ├── chunkChk.sh
│   ├── mr-verify.sh
│   ├── pig-verify.sh
│   ├── solr51-install.sh
│   ├── mkBMvol.sh
│   ├── spark-verify.sh
│   ├── hive-verify.sh
│   ├── add-mapr-user.sh
│   ├── spReconfig.sh
│   ├── runSparkTeraGenSort.sh
│   ├── hbasePerf.sh
│   ├── gen_profile.sh
│   ├── hbase-install.sh
│   ├── runDFSIO.sh
│   ├── runRWSpeedTest.sh
│   ├── runTeraGenSort.sh
│   ├── runYCSB.sh
│   ├── regionsp.py
│   ├── profile_cluster.sh
│   ├── hive-install.sh
│   └── mapr-audit.sh
├── pre-install
│   ├── iperf
│   ├── iozone
│   ├── rpctest
│   ├── rpctest1
│   ├── stream59
│   ├── lat_mem_rd
│   ├── storcli-config.sh
│   ├── memory-test.sh
│   ├── java-post-install.sh
│   ├── make-repos.sh
│   ├── summIOzone.sh
│   ├── mapr3x-install.sh
│   ├── disk-test.sh
│   ├── network-test.sh
│   ├── cluster-audit.sh
│   └── mapr-install.sh
├── .github
│   └── ISSUE_TEMPLATE
│       ├── feature_request.md
│       └── bug_report.md
└── README.md
/post-install/mapr_fstab: -------------------------------------------------------------------------------- 1 | localhost:/mapr /mapr hard,intr,nolock,noatime 2 | -------------------------------------------------------------------------------- /pre-install/iperf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MapRPS/archive-cv/HEAD/pre-install/iperf -------------------------------------------------------------------------------- /pre-install/iozone: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MapRPS/archive-cv/HEAD/pre-install/iozone -------------------------------------------------------------------------------- /pre-install/rpctest: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MapRPS/archive-cv/HEAD/pre-install/rpctest -------------------------------------------------------------------------------- /pre-install/rpctest1: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MapRPS/archive-cv/HEAD/pre-install/rpctest1 -------------------------------------------------------------------------------- /pre-install/stream59: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MapRPS/archive-cv/HEAD/pre-install/stream59 -------------------------------------------------------------------------------- /pre-install/lat_mem_rd: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MapRPS/archive-cv/HEAD/pre-install/lat_mem_rd -------------------------------------------------------------------------------- /post-install/spark-terasort-1.1-SNAPSHOT.jar: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MapRPS/archive-cv/HEAD/post-install/spark-terasort-1.1-SNAPSHOT.jar -------------------------------------------------------------------------------- /post-install/spark-terasort-1.2-SNAPSHOT.jar: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MapRPS/archive-cv/HEAD/post-install/spark-terasort-1.2-SNAPSHOT.jar -------------------------------------------------------------------------------- /post-install/chunkChk.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | # jbenninghoff 2013-Jul-22 vi: set ai et sw=3 tabstop=3: 3 | 4 | #TBD: check for non-uniform map-slots
or containers 5 | # Check TeraGenerated chunks per node 6 | echo 'Checking /benchmarks/tera/in/part* for chunks per node' 7 | hadoop mfs -ls '/benchmarks/tera/in/part*' | 8 | grep ':5660' | grep -v -E 'p [0-9]+\.[0-9]+\.[0-9]+' | 9 | tr -s '[:blank:]' ' ' | cut -d' ' -f 4 | 10 | sort | uniq -c 11 | # tr -s '[:blank:]' ' ' | cut -d' ' -f 2 12 | # sort | uniq | wc -l 13 | -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE/feature_request.md: -------------------------------------------------------------------------------- 1 | --- 2 | name: Feature request 3 | about: Suggest an idea for this project 4 | title: '' 5 | labels: '' 6 | assignees: '' 7 | 8 | --- 9 | 10 | **Is your feature request related to a problem? Please describe.** 11 | A clear and concise description of what the problem is. Ex. I'm always frustrated when [...] 12 | 13 | **Describe the solution you'd like** 14 | A clear and concise description of what you want to happen. 15 | 16 | **Describe alternatives you've considered** 17 | A clear and concise description of any alternative solutions or features you've considered. 18 | 19 | **Additional context** 20 | Add any other context or screenshots about the feature request here. 21 | -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE/bug_report.md: -------------------------------------------------------------------------------- 1 | --- 2 | name: Bug report 3 | about: Create a report to help us improve 4 | title: '' 5 | labels: '' 6 | assignees: '' 7 | 8 | --- 9 | 10 | **Describe the bug** 11 | A clear and concise description of what the bug is. 12 | 13 | **To Reproduce** 14 | Steps to reproduce the behavior: 15 | 1. Go to '...' 16 | 2. Click on '....' 17 | 3. Scroll down to '....' 18 | 4. See error 19 | 20 | **Expected behavior** 21 | A clear and concise description of what you expected to happen. 22 | 23 | **Screenshots** 24 | If applicable, add screenshots to help explain your problem. 25 | 26 | **Desktop (please complete the following information):** 27 | - OS: [e.g. iOS] 28 | - Browser [e.g. chrome, safari] 29 | - Version [e.g. 22] 30 | 31 | **Smartphone (please complete the following information):** 32 | - Device: [e.g. iPhone6] 33 | - OS: [e.g. iOS8.1] 34 | - Browser [e.g. stock browser, safari] 35 | - Version [e.g. 22] 36 | 37 | **Additional context** 38 | Add any other context about the problem here. 39 | -------------------------------------------------------------------------------- /post-install/mr-verify.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | # Quick smoke-test for MapReduce on Yarn using builtin example code (Pi) 3 | 4 | [[ $(id -u) -eq 0 ]] && { echo This script must be run as non-root; exit 1; } 5 | 6 | # Extract service account name from daemon.conf (typically 'mapr') 7 | srvid=$(awk -F= '/mapr.daemon.user/{print $2}' /opt/mapr/conf/daemon.conf) 8 | if [[ "$srvid" == $(id -un) ]]; then 9 | echo This script should ALSO be run as non-service-account 10 | fi 11 | 12 | # Get total disk count to use one map task per disk 13 | nmaps=$(maprcli dashboard info -json |grep total_disks |grep -o '[0-9]*') 14 | if ! 
[[ $nmaps -gt 0 ]]; then 15 | maprcli dashboard info -json |grep yarn -A999 16 | echo total_disks not found; exit 17 | fi 18 | 19 | # Get the examples jar file path 20 | exjar=$(find /opt/mapr/hadoop -name hadoop-mapreduce-examples\*.jar \ 21 | 2>/dev/null |grep -v sources) 22 | 23 | # Run the Pi example mapreduce job, expect ~3.14.... on the console 24 | hadoop jar "$exjar" pi "$nmaps" 4000 25 | 26 | # If this job is not successful, the cluster is not ready 27 | #4000 nSamples takes ~same time as 400. 28 | 29 | #readarray -t factors < <(maprcli dashboard info -json | \ 30 | # grep -e num_node_managers -e total_disks | grep -o '[0-9]*') 31 | #nmaps=$(( ${factors[0]} * ${factors[1]} )); #nmaps=${factors[1]} 32 | -------------------------------------------------------------------------------- /post-install/pig-verify.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | # Script to verify pig works for non-root, non-mapr user 4 | hadoop fs -ls || { echo Pig requires user directory, directory not found; exit 1; } 5 | #sudo hadoop fs -mkdir /user/$(id -un) && sudo hadoop fs -chown $(id -un):$(id -gn) /user/$(id -un) 6 | #sudo hadoop fs -mkdir /tmp 7 | #sudo hadoop fs -chmod 777 /tmp 8 | 9 | hadoop fs -copyFromLocal /opt/mapr/pig/pig-0.1?/tutorial/data/excite-small.log /tmp 10 | 11 | pig <<-'EOF' 12 | set mapred.map.child.java.opts '-Xmx1g' 13 | lines = load '/tmp/excite-small.log' using TextLoader() as (line:chararray); -- load a file from hadoop FS 14 | words = foreach lines generate flatten(TOKENIZE(line,' \t')) as word; -- Split each line into words using space and tab as delimiters 15 | uniqwords = group words by word; 16 | wordcount = foreach uniqwords generate group, COUNT(words); 17 | dump wordcount; 18 | EOF 19 | #store wordcount into 'pig-wordcount'; 20 | 21 | #echo 'Pig website with tutorial: http://pig.apache.org/docs/r0.13.0/start.html' 22 | 23 | # Pig code to load hive table 24 | #tab = load 'tablename' using HCatLoader(); 25 | #ftab = filter tab by date = '20100819' ; 26 | #... 27 | #store ftab into 'processedevents' using HCatStorer("date=20100819"); 28 | 29 | #STORE my_processed_data INTO 'tablename2' 30 | # USING org.apache.hcatalog.pig.HCatStorer(); 31 | # https://cwiki.apache.org/confluence/display/Hive/HCatalog+LoadStore#HCatalogLoadStore-HCatStorer 32 |
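If the hadoop fs -ls check above fails, the commented mkdir/chown lines near the top of pig-verify.sh are the one-time fix; a sketch of running them as an admin (assumes passwordless sudo and that the cluster is up):

sudo hadoop fs -mkdir /user/$(id -un)
sudo hadoop fs -chown $(id -un):$(id -gn) /user/$(id -un)
hadoop fs -ls    # should now succeed, listing the new home directory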
-------------------------------------------------------------------------------- /pre-install/storcli-config.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | # jbenninghoff 2013-Jan-06 vi: set ai et sw=3 tabstop=3: 3 | 4 | [ $(id -u) -ne 0 ] && { echo This script must be run as root; exit 1; } 5 | 6 | if type storcli >& /dev/null; then 7 | : 8 | elif [ -x /opt/MegaRAID/storcli/storcli64 ]; then 9 | storcli() { /opt/MegaRAID/storcli/storcli64 "$@"; } 10 | else 11 | echo storcli command not found; exit 2 12 | fi 13 | 14 | storcli /c0 show all #Display all disk controller values 15 | storcli /c0 /eall /sall show | awk '$3 == "UGood"{print $1}'; exit 16 | 17 | #Modify existing virtual drive 1 configuration (example) 18 | #storcli /c0 /v1 set wrcache=wb rdcache=ra iopolicy=cached pdcache=off strip=1024 #strip size probably cannot be changed 19 | 20 | #Loop over all UGood drives and create RAID0 single disk virtual drive (vd) 21 | #storcli /c0 /eall /sall show | awk '$3 == "UGood"{print $1}' | xargs -i sudo storcli /c0 add vd drives={} type=r0 strip=1024 ra wb cached pdcache=off 22 | 23 | #Assuming drive 17:7 is UGood. 1024 strip needs recent LSI/Avago controller and 7.x RHEL Linux kernel 24 | #sudo storcli /c0 add vd drives=17:7 type=r0 strip=1024 ra wb cached pdcache=off 25 | 26 | # Download 45MB zip file (July 2016): 27 | # https://docs.broadcom.com/docs/1.20.15_StorCLI 28 | 29 | # Use smartctl to examine MegaRAID virtual drives: 30 | # smartctl -a -d megaraid,0 /dev/sdd 31 | # Test unmounted drives for bad spots and other problems: 32 | # smartctl -d megaraid,0 -t short /dev/sdd 33 | -------------------------------------------------------------------------------- /post-install/solr51-install.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | # jbenninghoff 2015-May-05 vi: set ai et sw=3 tabstop=3: 3 | 4 | [ $(id -u) -ne 0 ] && { echo This script must be run as root; exit 1; } 5 | 6 | hostname=$(hostname) 7 | clustername=$(awk 'NR==1{print $1}' /opt/mapr/conf/mapr-clusters.conf) 8 | zookeepers=$(maprcli node listzookeepers | sed -n 2p) 9 | 10 | # Download Solr 5.1 tarball and place in /mapr/$clustername/tmp/ 11 | # curl 'http://mirror.cogentco.com/pub/apache/lucene/solr/5.1.0/solr-5.1.0.tgz' -o /mapr/$clustername/tmp/solr-5.1.0.tgz 12 | tar xzf /mapr/$clustername/tmp/solr-5.1.0.tgz solr-5.1.0/bin/install_solr_service.sh --strip-components=2 #Extract install script 13 | ./install_solr_service.sh /mapr/$clustername/tmp/solr-5.1.0.tgz -i /opt -d /var/solr -u mapr -s solr -p 8983 #Use Linux FS 14 | 15 | #maprcli volume create -name localvol-$hostname -path /apps/solr/localvol-$hostname -createparent true -localvolumehost $hostname -replication 1 16 | #./install_solr_service.sh /mapr/$clustername/tmp/solr-5.1.0.tgz -i /opt -d /mapr/$clustername/apps/solr/localvol-$hostname -u mapr -s solr -p 8983 #Use MapR NFS 17 | 18 | service solr stop 19 | # add to /var/solr/solr.in.sh 20 | cat - <<EOF >> /var/solr/solr.in.sh 21 | SOLR_MODE=solrcloud 22 | SOLR_HOST="$hostname" 23 | ZK_HOST="$zookeepers/solr" 24 | EOF 25 | 26 | /opt/solr/server/scripts/cloud-scripts/zkcli.sh -zkhost $zookeepers/solr -cmd bootstrap -solrhome /var/solr/data 27 | 28 | service solr start 29 | service solr status 30 | -------------------------------------------------------------------------------- /post-install/mkBMvol.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | # jbenninghoff 2013-Jul-22 vi: set ai et sw=3 tabstop=3: 3 | 4 | # Remove and recreate a MapR 1x volume just for benchmarking 5 | # Use replication 1 to get peak write performance 6 | if maprcli volume info -name benchmarks1x > /dev/null; then 7 | maprcli volume unmount -name benchmarks1x 8 | maprcli volume remove -name benchmarks1x 9 | sleep 2 10 | fi 11 | maprcli volume create -name benchmarks1x -path /benchmarks1x -replication 1 12 | # use -topology /data...
if needed 13 | hadoop fs -chmod 777 /benchmarks1x #open the folder up for all to use 14 | 15 | # Create standard 3x benchmarks volume/folder 16 | if maprcli volume info -name benchmarks > /dev/null; then 17 | maprcli volume unmount -name benchmarks 18 | maprcli volume remove -name benchmarks 19 | sleep 2 20 | fi 21 | maprcli volume create -name benchmarks -path /benchmarks 22 | hadoop fs -chmod 777 /benchmarks #open the folder up for all to use 23 | 24 | # Increase MFS cache on all nodes (with clush) 25 | # wconf=/opt/mapr/conf/warden.conf 26 | # sed -i 's/mfs.heapsize.percent=20/mfs.heapsize.percent=30/' "$wconf" 27 | # /opt/mapr/hadoop/hadoop-0.20.2/conf/mapred-site.xml 28 | # mapred.tasktracker.map.tasks.maximum CPUS-1 29 | # mapred.tasktracker.reduce.tasks.maximum CPUS/2 30 | 31 | #source maprcli_check function from mapr-audit.sh 32 | #source <(awk '/^ *maprcli_check\(\)/,/^ *} *$/' mapr-audit.sh) 33 | 34 | #hadoop fs -stat /benchmarks #Check if folder exists and ... TBD 35 | #hadoop mfs -setcompression off /benchmarks 36 | #compression may help but not allowed by sortbenchmark.org 37 | #hadoop mfs -setchunksize $[512*1024*1024] /benchmarks 38 | #default 256MB, optimal chunksize determined by cluster size 39 | 40 | -------------------------------------------------------------------------------- /post-install/spark-verify.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | # Quick smoke-test for Spark on Yarn using builtin example code 3 | 4 | # Check UID 5 | [[ $(id -u) -eq 0 ]] && { echo This script must be run as non-root; exit 1; } 6 | if [[ -f /opt/mapr/conf/daemon.conf ]]; then 7 | srvid=$(awk -F= '/mapr.daemon.user/{print $2}' /opt/mapr/conf/daemon.conf) 8 | fi 9 | if [[ "$srvid" == $(id -un) ]]; then 10 | echo "This script should be run as a non service ($srvid) account" 11 | fi 12 | 13 | spkhome=$(find /opt/mapr/spark -maxdepth 1 -type d -name spark-\* \ 14 | |sort -n |tail -1) 15 | spkjar=$(find "$spkhome" -name spark-examples\*.jar) 16 | # JavaWordCount requires script arg to be an existing file in maprfs:/// 17 | spkclass=org.apache.spark.examples.JavaWordCount 18 | spkclass=org.apache.spark.examples.SparkPi 19 | #spkdrv=$(hostname -i) 20 | #$spkhome/bin/spark-submit --conf spark.driver.host=$spkdrv \ 21 | 22 | "$spkhome/bin/spark-submit" \ 23 | --master yarn \ 24 | --deploy-mode client \ 25 | --class $spkclass \ 26 | "$spkjar" "${1:-40}" 27 | 28 | # Cluster mode, use sparkhistory logs to view stdout 29 | #$spkhome/bin/spark-submit --master yarn --deploy-mode cluster \ 30 | # --class $spkclass $spkjar 40 31 | 32 | # /opt/mapr/hadoop/hadoop-2.7.0/bin/yarn logs -applicationId application_1469809164296_0036 |awk '/^LogType:stdout/,/^End of LogType:stdout/' #Look for Pi answer 3.14.... 
33 | 34 | # /opt/mapr/spark-1.6.1-bin-without-hadoop/bin/spark-submit --driver-java-options="-Dmylevel=WARN" --driver-library-path /opt/mapr/spark-1.6.1-bin-without-hadoop/lib --master yarn-client --class org.apache.spark.examples.SparkPi /opt/mapr/spark-1.6.1-bin-without-hadoop/lib/spark-examples-1.6.1-hadoop2.2.0.jar 40 #log4j filter by setting mylevel 35 | # log4j setting to pass runtime level as shown above: /opt/mapr/spark-1.6.1-bin-without-hadoop/conf/log4j.properties:log4j.appender.console.threshold=${mylevel} 36 | -------------------------------------------------------------------------------- /pre-install/memory-test.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | # jbenninghoff 2012-Jan-19 vi: set ai et sw=3 tabstop=3: 3 | # Run Stream benchmark or mem latency benchmark 4 | 5 | sockets=$(grep '^physical' /proc/cpuinfo | sort -u | grep -c ^) 6 | cores=$(grep '^cpu cores' /proc/cpuinfo | sort -u | awk '{print $NF}') 7 | thrds=$(grep '^siblings' /proc/cpuinfo | sort -u | awk '{print $NF}') 8 | scriptdir="$(cd "$(dirname "$0")"; pwd -P)" #absolute path to this script's folder 9 | #objdump -d $scriptdir/stream59 10 | 11 | if [ -f /sys/kernel/mm/transparent_hugepage/enabled ]; then 12 | enpath=/sys/kernel/mm/transparent_hugepage/enabled 13 | elif [ -f /sys/kernel/mm/redhat_transparent_hugepage/enabled ]; then 14 | enpath=/sys/kernel/mm/redhat_transparent_hugepage/enabled 15 | else 16 | echo Transparent Huge Page setting not found, performance may be affected 17 | fi 18 | 19 | # -t (THP) option to set THP for peak performance 20 | while getopts "t" opt; do 21 | case $opt in 22 | t) thp=setit ;; 23 | \?) echo "Invalid option: -$OPTARG" >&2; exit ;; 24 | esac 25 | done 26 | 27 | if [ -n "$enpath" -a -n "$thp" ]; then # save current THP setting 28 | # To enable/disable hugepages, you must run as root 29 | thp=$(cat $enpath | grep -o '\[.*\]') 30 | thp=${thp#[}; thp=${thp%]} #strip [] brackets off string 31 | fi 32 | 33 | if [ "$1" == "lat" ]; then 34 | #http://www.bitmover.com/lmbench/lat_mem_rd.8.html (also in my evernotes) 35 | echo 'Running lat_mem(lmbench) to measure memory latency in nano seconds' 36 | [ -n "$thp" -a $(id -u) -eq 0 ] && echo always > $enpath 37 | taskset 0x1 $scriptdir/lat_mem_rd -N3 -P1 2048m 513 2>&1 38 | # Pinned to 1st socket. Pinning to 2nd socket (0x8) shows slower latency 39 | [ -n "$thp" -a $(id -u) -eq 0 ] && echo $thp > $enpath 40 | else 41 | [ -n "$thp" -a $(id -u) -eq 0 ] && echo never > $enpath 42 | if [ $cores == $thrds ]; then 43 | $scriptdir/stream59 44 | else 45 | OMP_NUM_THREADS=$((cores * sockets)) KMP_AFFINITY=granularity=core,scatter $scriptdir/stream59 46 | fi 47 | [ -n "$thp" -a $(id -u) -eq 0 ] && echo $thp > $enpath 48 | fi 49 | 50 | -------------------------------------------------------------------------------- /pre-install/java-post-install.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | # jbenninghoff 2013-Oct-06 vi: set ai et sw=3 tabstop=3: 3 | set -o nounset 4 | set -o errexit 5 | 6 | usage() { 7 | cat << EOF 8 | This script sets up the Java default using alternatives. 9 | This can be useful when there are multiple Java versions installed. 10 | 11 | There are commented commands in the script that demonstrate using 12 | the CLI to download and install JDK 13 | EOF 14 | } 15 | 16 | # Handle script options 17 | while getopts "d" opt; do 18 | case $opt in 19 | \?) 
usage; exit ;; 20 | esac 21 | done 22 | 23 | # Set some global variables 24 | javapath=/usr/java/default #Oracle 25 | #javapath=/usr/lib/jvm/java #OpenJDK 26 | sep=$(printf %80s); sep=${sep// /#} #Substitute all blanks with ###### 27 | distro=$(cat /etc/*release 2>&1 |grep -m1 -i -o -e ubuntu -e redhat -e 'red hat' -e centos) || distro=centos 28 | distro=$(echo $distro | tr '[:upper:]' '[:lower:]') 29 | #distro=$(lsb_release -is | tr [[:upper:]] [[:lower:]]) 30 | 31 | [ -d $javapath ] || { echo $javapath does not exist; exit 1; } 32 | echo $javapath is $(readlink -f $javapath) 33 | 34 | for item in java javac javaws jar jps javah keytool; do 35 | alternatives --install /usr/bin/$item $item $javapath/bin/$item 9 36 | alternatives --set $item $javapath/bin/$item 37 | done 38 | 39 | # Download and install using CLI 40 | #curl -L -C - -b "oraclelicense=accept-securebackup-cookie" -O http://download.oracle.com/otn-pub/java/jdk/7u80-b15/jdk-7u80-linux-x64.rpm 41 | #curl -L -C - -b "oraclelicense=accept-securebackup-cookie" -O http://download.oracle.com/otn-pub/java/jdk/8u121-b13/e9e7ea248e2c4826b92b3f075a80e441/jdk-8u121-linux-x64.rpm 42 | #clush -ab -c /tmp/jdk-7u75-linux-x64.rpm #Push it out to all the nodes in /tmp/ 43 | #clush -ab yum -y localinstall /tmp/jdk-7u75-linux-x64.rpm 44 | 45 | ## Java Browser (Mozilla) Plugin 32-bit ## 46 | #alternatives --install /usr/lib/mozilla/plugins/libjavaplugin.so libjavaplugin.so /usr/java/jdk1.6.0_32/jre/lib/i386/libnpjp2.so 20000 47 | ## Java Browser (Mozilla) Plugin 64-bit ## 48 | #alternatives --install /usr/lib64/mozilla/plugins/libjavaplugin.so libjavaplugin.so.x86_64 /usr/java/jdk1.6.0_32/jre/lib/amd64/libnpjp2.so 20000 49 | 50 |
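A quick check that the alternatives switch took effect (a sketch; --display is standard alternatives syntax, and the expected target simply mirrors the javapath variable above):

alternatives --display java | head -3
readlink -f /usr/bin/java   # should resolve under /usr/java/default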
-------------------------------------------------------------------------------- /post-install/hive-verify.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | # Script to verify hive works for non-root, non-mapr user 4 | [ $(id -u) -eq 0 ] && { echo This script must be run as non-root; exit 1; } 5 | 6 | srvid=$(awk -F= '/mapr.daemon.user/{ print $2}' /opt/mapr/conf/daemon.conf) 7 | if [[ "$srvid" == $(id -un) ]]; then 8 | echo This script should be run as non-service-account 9 | fi 10 | if [[ $# -ne 1 ]]; then 11 | #TBD: find HS2 hostname with maprcli 12 | echo This script requires an HS2 hostname as only argument; exit 2 13 | fi 14 | if ! hadoop fs -ls; then 15 | echo Hive requires user directory, directory not found; exit 3 16 | fi 17 | 18 | tmpfile=$(mktemp); trap 'rm $tmpfile' 0 1 2 3 15 19 | hs2host=$1 20 | hivehome=$(eval echo /opt/mapr/hive/hive-2*) 21 | 22 | # Create simple csv table 23 | cat - > $tmpfile <&1 |grep "$grepopts") || distro=centos 27 | distro=$(echo $distro | tr '[:upper:]' '[:lower:]') 28 | reponame=/etc/yum.repos.d/mapr.repo 29 | 30 | if clush "$clargs" -S -B -g all 'grep -qi mapr /etc/yum.repos.d/*'; then 31 | echo MapR repos found; exit 1 32 | fi 33 | if ! clush "$clargs" -S -B -g all 'grep -qi -m1 epel /etc/yum.repos.d/*'; then 34 | echo Warning, EPEL repo not found 35 | fi 36 | 37 | #Create 6.x repos on all nodes 38 | cat </dev/null 62 | clstr: @all 63 | zk: 64 | cldb: 65 | rm: 66 | hist: 67 | EOF1 68 | } 69 | -------------------------------------------------------------------------------- /post-install/add-mapr-user.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | # jbenninghoff 2015-Jun-29 vi: set ai et sw=3 tabstop=3 retab: 3 | 4 | # Usage 5 | usage() { 6 | echo "Usage: $0 new-user-name [uid] [password]" 7 | echo group name and gid will match user-name and uid 8 | echo optional uid will be checked for availability and used if available 9 | exit 1 10 | } 11 | [[ $# -lt 1 ]] && usage 12 | [[ $(id -u) -ne 0 ]] && { echo This script must be run as root; exit 1; } 13 | type clush >& /dev/null || { echo clush required for this script; exit 2; } 14 | 15 | pw=${3:-password} 16 | # Check if current host in clush group all and define exception 17 | nodeset -e @all | grep $(hostname -s) && xprimenode="-x $(hostname -s)" 18 | 19 | # Check for existing uid and gid 20 | if [[ $# -gt 1 ]]; then 21 | clush -S -b -g all $xprimenode "getent passwd $2" && { echo $2 is in use already; exit 1; } 22 | clush -S -b -g all $xprimenode "getent group $2" && { echo $2 in use already; exit 1; } 23 | adduid="-u $2" 24 | addgid="-g $2" 25 | fi 26 | 27 | # Create new Linux user on all cluster nodes 28 | prep-linux-user() { 29 | groupadd $addgid $1 30 | useradd -m -c 'MapR user account' -g $1 $adduid $1 31 | # set password for user 32 | echo -e "$pw\n$pw" | passwd $1 33 | # Get system generated uid/gid 34 | uid=$(getent passwd $1| awk -F: '{print $3}') 35 | gid=$(getent group $1| awk -F: '{print $3}') 36 | adduid="-u $uid"; printf "UID: $uid\n" 37 | addgid="-g $gid"; printf "GID: $gid\n" 38 | 39 | # Create group on all nodes 40 | clush -b -g all $xprimenode "groupadd $addgid $1" 41 | # Create user on all nodes 42 | clush -b -g all $xprimenode "useradd -m -c 'MapR user account' -g $1 $adduid $1" 43 | # Set password for user on all nodes 44 | clush -b -g all $xprimenode "echo -e '$pw\n$pw' | passwd $1" 45 | # Set secondary group membership as needed, modify and uncomment 46 | # clush -b -g all $xprimenode usermod -G wheel,project1 $1 47 | # Verify consistent id 48 | clush -b -g all "id $1" 49 | } 50 | 51 | # Set up MapR folder and proxy for the specified user 52 | prep-mapr-user() { 53 | clush -a touch /opt/mapr/conf/proxy/$1 54 | clush -a chown mapr:mapr /opt/mapr/conf/proxy/$1 55 | #TBD: run as mapr and define ticket location 56 | hadoop fs -mkdir /user/$1 57 | hadoop fs -chown $1:$1 /user/$1 58 | } 59 | 60 | if getent passwd $1; then 61 | echo $1 is in use already 62 | clush -b -g all "id $1" 63 | echo Adding $1 to MapR cluster 64 | prep-mapr-user $1 65 | else 66 | prep-linux-user $1 67 | prep-mapr-user $1 68 | fi 69 | 70 |
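A usage sketch for add-mapr-user.sh (the user name, uid, and password are illustrative; assumes the clush group all is defined):

./add-mapr-user.sh alice              # system-assigned uid/gid, default password
./add-mapr-user.sh alice 2500 s3cret  # request uid/gid 2500 on every node
clush -b -g all "id alice"            # verify uid/gid are identical cluster-wide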
-------------------------------------------------------------------------------- /post-install/spReconfig.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | # jbenninghoff 2014-Oct-16 vi: set ai et sw=3 tabstop=3: 3 | 4 | [ $(id -u) -ne 0 ] && { echo This script must be run as root; exit 1; } 5 | 6 | tmpfile=$(mktemp); trap 'rm $tmpfile' 0 1 2 3 15 7 | 8 | # Define the set of disk devices that will be taken offline and then 9 | # merged into new Storage Pool. All data on these drives will be lost. 10 | # Assumes the disks in this list form at least one Storage Pool 11 | disklist='/dev/sdh /dev/sdi /dev/sdj /dev/sdk' 12 | disklist='' #Set to null for safety, script must be edited to define disks 13 | echo These disks will be reformatted and all data lost: $disklist 14 | read -p "Press enter to continue or ctrl-c to abort" 15 | 16 | # Configure re-replication to start in 1 minute after SP goes offline, 17 | # rather than default 60 min. Assumes cluster is otherwise quiet. 18 | maprcli config save -values '{"cldb.fs.mark.rereplicate.sec":"60"}' 19 | maprcli config save -values '{"cldb.replication.max.in.transit.containers.per.sp":"8"}' 20 | 21 | /opt/mapr/server/mrconfig sp list -v 22 | 23 | # Iterate over each disk, allowing the data on each disk to be re-replicated 24 | # before going to the next disk 25 | # Minimal risk since script can be interrupted and SPs brought back online 26 | for dsk in $disklist; do 27 | echo Taking $dsk offline 28 | /opt/mapr/server/mrconfig sp offline $dsk || { echo $dsk not an SP; continue; } # If $dsk fails to offline it must not be an SP 29 | date; echo Waiting 180 seconds for rereplication to start; sleep 180 30 | # Now wait until rereplication stops 31 | until (maprcli dump rereplicationinfo | grep 'No active rereplications'); do 32 | echo -n 'Still Replicating '; sleep 120 33 | done 34 | date; echo rereplication for $dsk is done, next disk 35 | done 36 | 37 | echo $disklist All offline and data rereplicated 38 | maprcli dump rereplicationinfo 39 | 40 | echo Initial MapR Disktab 41 | cat /opt/mapr/conf/disktab 42 | 43 | echo Removing $disklist from MapR FS, SPs cannot be brought back now 44 | maprcli disk remove -host $(hostname) -disks ${disklist// /,} -force true 45 | 46 | >$tmpfile 47 | for dsk in $disklist; do 48 | echo $dsk >> $tmpfile 49 | done 50 | cat $tmpfile 51 | 52 | /opt/mapr/server/disksetup -W $(wc -l < $tmpfile) -F $tmpfile #stripe width = disk count 53 | echo New disktab ================ 54 | cat /opt/mapr/conf/disktab 55 | 56 | /opt/mapr/server/mrconfig sp list -v 57 | maprcli config save -values '{"cldb.fs.mark.rereplicate.sec":"3600"}' #Reset 58 | maprcli config save -values '{"cldb.replication.max.in.transit.containers.per.sp":"4"}' 59 | echo Restart the warden: service mapr-warden restart 60 | 61 |
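Re-replication progress can also be watched from a second shell while the until-loop above polls; a sketch using the same maprcli call:

watch -n 60 "maprcli dump rereplicationinfo | head -20"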
-------------------------------------------------------------------------------- /post-install/runSparkTeraGenSort.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | # jbenninghoff 2017-Apr-14 vi: set ai et sw=3 tabstop=3: 3 | 4 | # Swap sizes below for full 1TB run 5 | size=1T 6 | size=50G 7 | 8 | # Size Spark resources starting with max 4 cores per executor 9 | ecores=4 10 | nodes=$(maprcli node list -columns service |grep -c nodemanager) 11 | vcores=$(maprcli dashboard info -json |awk -F: '/total_vcores/{printf("%i\n", $2)}') 12 | ncores=$((vcores/nodes)) # Cores per node 13 | if [[ "$ncores" -gt 7 ]]; then 14 | nexecs=$((ncores/ecores)) 15 | else 16 | nexecs=$((ncores/2)) 17 | ecores=2 18 | fi 19 | nexecs=$((nexecs * nodes)) 20 | vram=$(maprcli dashboard info -json |awk -F: '/total_memory_mb/{printf("%i\n", $2)}') 21 | emem=$(( (vram / nexecs) - 2000 )) # per-executor memory in MB 22 | echo nexecs: $nexecs 23 | echo ecores: $ecores 24 | echo emem: $emem 25 | 26 | tsjar=spark-terasort-1.1-SNAPSHOT-jar-with-dependencies.jar 27 | tsjar=spark-terasort-1.1-SNAPSHOT.jar 28 | if ! jar tf $tsjar > /dev/null; then 29 | echo spark-terasort jar not readable 30 | exit 31 | fi 32 | #PATH=/opt/mapr/spark/spark-1.6.1/bin:$PATH 33 | spkhome=$(find /opt/mapr/spark -maxdepth 1 -type d -name spark-\* \ 34 | |sort -n |tail -1) 35 | PATH=$spkhome/bin:$PATH 36 | #spkjar=$(find $spkhome -name spark-examples\*.jar) 37 | #spkclass=org.apache.spark.examples.SparkPi 38 | 39 | #spark-submit --master yarn-client \ 40 | spark-submit --master yarn --deploy-mode cluster \ 41 | --name 'TeraGen' \ 42 | --class com.github.ehiggs.spark.terasort.TeraGen \ 43 | --num-executors $nexecs \ 44 | --executor-cores $ecores \ 45 | --executor-memory ${emem}m \ 46 | $tsjar $size /user/$USER/spark-terasort 47 | # Use small 50G test to verify all is in order, increase to 1T or more 48 | # Move block comment line up to skip TeraGen once data is created 49 | 50 | : << '--BLOCK-COMMENT--' 51 | exit 52 | --executor-cores 4 53 | --executor-memory 16G 54 | --BLOCK-COMMENT-- 55 | 56 | #export DRIVER_MEMORY=1g 57 | #spark-submit --master yarn-cluster \ 58 | spark-submit --master yarn --deploy-mode client --driver-memory 1g \ 59 | --name 'TeraSort' \ 60 | --class com.github.ehiggs.spark.terasort.TeraSort \ 61 | --num-executors $nexecs \ 62 | --executor-cores $ecores \ 63 | --executor-memory ${emem}m \ 64 | $tsjar /user/$USER/spark-terasort /user/$USER/terasort-output 65 | #Many small executors seem to perform better than fewer large executors 66 | 67 | # --conf 'spark.driver.extraClassPath=/opt/mapr/lib/libprotodefs-4.0.1-mapr.jar:/opt/mapr/lib/protobuf-java-2.5.0.jar:/opt/mapr/lib/guava-13.0.1.jar' \ 68 | # --class org.apache.spark.examples.terasort.TeraGen \ 69 | # --conf 'mapreduce.terasort.num.partitions=5' \ 70 | # --spark.driver.extraJavaOptions -Dspark.driver.port=40880 \ 71 | # --driver-java-options=-Dspark.driver.port=40880 \ #FW hole didn't work 72 | -------------------------------------------------------------------------------- /post-install/hbasePerf.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | # jbenninghoff 2013-Sep-13 vi: set ai et sw=3 tabstop=3: 3 | 4 | # MapR DB tests below assume /tables volume exists and mapping in core-site.xml is configured 5 | #maprcli volume create -name tables -path /tables -topology /data/default-rack -replication 3 -replicationtype low_latency 6 | #hadoop mfs -setcompression off /tables 7 | #echo ' hbase.table.namespace.mappings *:/tables ' >> /opt/mapr/hadoop.../conf/core-site.xml 8 | # Apache HBase will be used if core-site.xml mappings do not exist and $table set to just TestTable 9 | table=/tables/TestTable #table name used by PerformanceEvaluation object 10 | 11 | # HBase bundled performance tool, multithreaded 12 | # Integer arg is number of Java threads or mapreduce processes 13 | # Each thread writes and then reads 1M 1K-byte rows (1GB) 14 | thrds=4 15 | /usr/bin/time hbase org.apache.hadoop.hbase.PerformanceEvaluation --table=$table --nomapred sequentialWrite $thrds |& tee hbasePerfEvalSeqWrite-${thrds}T.log 16 | /usr/bin/time hbase org.apache.hadoop.hbase.PerformanceEvaluation --table=$table --nomapred randomWrite
$thrds |& tee hbasePerfEvalRanWrite-${thrds}T.log 17 | /usr/bin/time hbase org.apache.hadoop.hbase.PerformanceEvaluation --table=$table --nomapred sequentialRead $thrds |& tee hbasePerfEvalSeqRead-${thrds}T.log 18 | /usr/bin/time hbase org.apache.hadoop.hbase.PerformanceEvaluation --table=$table --nomapred randomRead $thrds |& tee hbasePerfEvalRanRead-${thrds}T.log 19 | 20 | # mapreduce clients (use processes across the cluster instead of threads on a client machine) 21 | #/usr/bin/time hbase org.apache.hadoop.hbase.PerformanceEvaluation sequentialWrite 20 |& tee hbasePerfEvalSeqWrite20P.log 22 | #/usr/bin/time hbase org.apache.hadoop.hbase.PerformanceEvaluation randomWrite 20 |& tee hbasePerfEvalRanWrite20P.log 23 | #/usr/bin/time hbase org.apache.hadoop.hbase.PerformanceEvaluation sequentialRead 20 |& tee hbasePerfEvalSeqRead20P.log 24 | #/usr/bin/time hbase org.apache.hadoop.hbase.PerformanceEvaluation randomRead 20 |& tee hbasePerfEvalRanRead20P.log 25 | 26 | # Very time consuming tests: 27 | #/usr/bin/time hbase org.apache.hadoop.hbase.PerformanceEvaluation --nomapred randomSeekScan 4 |& tee hbasePerfEvalRanSeekScan.log 28 | #/usr/bin/time hbase org.apache.hadoop.hbase.PerformanceEvaluation randomSeekScan 20 |& tee hbasePerfEvalRanSeekScan20P.log 29 | #/usr/bin/time hbase org.apache.hadoop.hbase.PerformanceEvaluation --nomapred scanRange1000 4 |& tee hbasePerfEvalScanRange1K.log 30 | #/usr/bin/time hbase org.apache.hadoop.hbase.PerformanceEvaluation scanRange1000 20 |& tee hbasePerfEvalScanRange1K20P.log 31 | 32 | # What does the table look like: 33 | hbase shell <<< "scan '$table', {LIMIT=>20}" 34 | 35 | columns=$(stty -a | awk '/columns/{printf "%d\n",$7}') 36 | # How did the regions get distributed across the cluster nodes: 37 | /opt/mapr/bin/maprcli table region list -path $table | cut -c -$columns 38 | # How did the regions get distributed across the storage pools: 39 | #./regionsp.py $table 40 | 41 | echo Get Throughput: grep '[0-9.]* MB/s' hbasePerfEval\*.log 42 | -------------------------------------------------------------------------------- /post-install/gen_profile.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | # dgomerman@maprtech.com 2014-Mar-25 3 | OUTPUT="/tmp/node_profile.sh" 4 | SUDO="" 5 | [ $(id -u) -ne 0 ] && SUDO=sudo 6 | 7 | # OS 8 | distro=$(cat /etc/*release | grep -m1 -i -o -e ubuntu -e redhat -e centos) 9 | if [ ! -n "$distro" ] ; then 10 | distro="unknown" 11 | fi 12 | # Manufacturer 13 | manufacturer=$($SUDO dmidecode |grep -A2 '^System Information' |grep -i "Manufacturer" |sed -e 's/^.*: //') 14 | if [ ! -n "$manufacturer" ] ; then 15 | manufacturer="unknown" 16 | fi 17 | product=$($SUDO dmidecode |grep -A2 '^System Information' |grep -i "Product" |sed -e 's/[^0-9]*//g') 18 | if [ ! -n "$product" ] ; then 19 | product="unknown" 20 | fi 21 | # Memory DIMS 22 | memoryDims=$($SUDO dmidecode | grep -c '^[[:space:]]Size: [0-9]* MB') 23 | if [ ! -n "$memoryDims" ] ; then 24 | memoryDims="0" 25 | fi 26 | # Memory Total 27 | memoryTotal=$(cat /proc/meminfo | grep -i ^memt | uniq |sed -e 's/[^0-9]*//g') 28 | if [ ! -n "$memoryTotal" ] ; then 29 | memoryTotal="0" 30 | fi 31 | # Core Count 32 | coreCount=$(lscpu | grep -v -e op-mode -e ^Vendor -e family -e Model: -e Stepping: -e BogoMIPS -e Virtual -e ^Byte -e '^NUMA node(s)' | awk '/^CPU MHz:/{sub($3,sprintf("%0.0f",$3))};{print}' |grep '^CPU(s)' |sed -e 's/[^0-9]*//g') 33 | if [ ! 
-n "$coreCount" ] ; then 34 | coreCount="0" 35 | fi 36 | # Core Speed 37 | cpuMhz=$(lscpu | grep -v -e op-mode -e ^Vendor -e family -e Model: -e Stepping: -e BogoMIPS -e Virtual -e ^Byte -e '^NUMA node(s)' | awk '/^CPU MHz:/{sub($3,sprintf("%0.0f",$3))};{print}' |grep '^CPU MHz' |sed -e 's/[^0-9]*//g') 38 | if [ ! -n "$cpuMhz" ] ; then 39 | cpuMhz="0" 40 | fi 41 | # NIC Count & Speed 42 | which ethtool >/dev/null 2>&1 43 | ethExists=$? 44 | if [ $ethExists -eq 0 ] ; then 45 | nicSpeeds=$(/sbin/ip link show | sed '/ lo: /,+1d' | awk '/UP/{sub(":","",$2);print $2}' | xargs -l $SUDO ethtool | grep -e ^Settings -e Speed |grep "Speed" |sed -e 's/[^0-9]//g') 46 | else 47 | nicSpeeds=$(/sbin/ip link show | sed '/ lo: /,+1d' | awk '/UP/{sub(":","",$2);print $2}' | xargs -l $SUDO mii-tool|sed -e 's/^.*negotiated//' -e 's/[^0-9]//g') 48 | fi 49 | if [ ! -n "$nicSpeeds" ] ; then 50 | nicSpeeds="1000" 51 | fi 52 | 53 | avgSpeed=0 54 | aggSpeed=0 55 | nicCount=0 56 | for speed in $nicSpeeds ; do 57 | let aggSpeed+=$speed 58 | let nicCount+=1 59 | done 60 | let avgSpeed=$aggSpeed/$nicCount 61 | # Disk count 62 | diskCount=$(cat /opt/mapr/conf/disktab |grep '^\/' |wc --lines) 63 | if [ ! -n "$diskCount" ] ; then 64 | nicSpeeds="0" 65 | fi 66 | 67 | # Save Profile 68 | cat < $OUTPUT 69 | #!/bin/bash 70 | # OS 71 | export DISTRO="$distro" 72 | # Manufacturer 73 | export MANUFACTURER="$manufacturer" 74 | export PRODUCT="$product" 75 | # Memory DIMS 76 | export MEMORY_DIMS="$memoryDims" 77 | # Memory Total 78 | export MEMORY_TOTAL="$memoryTotal" 79 | # Core Count 80 | export CORE_COUNT="$coreCount" 81 | # Core Speed 82 | export CPU_MHZ="$cpuMhz" 83 | # NIC Count & Speed 84 | export NIC_SPEEDS="$nicSpeeds" 85 | export AVG_NIC_SPEED="$avgSpeed" 86 | export AGG_NIC_SPEED="$aggSpeed" 87 | export NIC_COUNT="$nicCount" 88 | # Disk count 89 | export DISK_COUNT="$diskCount" 90 | EOF 91 | -------------------------------------------------------------------------------- /post-install/hbase-install.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | # jbenninghoff 2013-Mar-20 vi: set ai et sw=3 tabstop=3: 3 | 4 | export JAVA_HOME=/usr/java/default #Oracle JDK 5 | clargs='-o -qtt' 6 | [ $(id -u) -ne 0 ] && SUDO=sudo #Use sudo, assuming account has passwordless sudo (sudo -i)? 
7 | 8 | grep ^hbm /etc/clustershell/groups || { echo clustershell group: hbm undefined; exit 1; } 9 | grep ^hbr /etc/clustershell/groups || { echo clustershell group: hbr undefined; exit 1; } 10 | grep ^zk /etc/clustershell/groups || { echo clustershell group: zk undefined; exit 1; } 11 | grep ^cldb /etc/clustershell/groups || { echo clustershell group: cldb undefined; exit 1; } 12 | #clush -ab "java -version |& grep -e x86_64 -e 64-Bit" 13 | clush -S -B -g hbm,hbr "$JAVA_HOME/bin/java -version |& grep -e x86_64 -e 64-Bit" || { echo $JAVA_HOME/bin/java does not exist on all nodes or is not 64bit; exit 3; } 14 | # Check for JAVA_HOME setting 15 | clush $clargs -g hbm,hbr "${SUDO:-} grep '^export JAVA_HOME' /opt/mapr/conf/env.sh" 16 | clush -S -B -g hbm,hbr 'grep -i mapr /etc/yum.repos.d/*' || { echo MapR repos not found; exit 3; } 17 | 18 | # Install MapR Patch needed for HBase replication 19 | #clush $clargs -v -g clstr "${SUDO:-} yum -y install mapr-patch-4.1.0.31175.GA-32134" #If patch is in your repo 20 | clush $clargs -v -g clstr "${SUDO:-} yum -y localinstall /tmp/mapr-patch-4.1.0.31175.GA-32134.x86_64.rpm" 21 | # Install HBase Region Servers 22 | clush $clargs -v -g hbr "${SUDO:-} yum -y install mapr-hbase-0.98.7.201501291259-1.noarch mapr-hbase-regionserver-0.98.7.201501291259-1.noarch" 23 | # Install HBase Region Servers 24 | clush $clargs -v -g hbm "${SUDO:-} yum -y install mapr-hbase-0.98.7.201501291259-1.noarch mapr-hbase-master-0.98.7.201501291259-1.noarch" 25 | # Install HBase client patch 26 | clush $clargs -g hbr "${SUDO:-} mv /opt/mapr/hbase/hbase-0.98.7/lib/hbase-client-0.98.7-mapr-1501-r1.jar /opt/mapr/hbase/hbase-0.98.7/lib/hbase-client-0.98.7-mapr-1501-r1.jar.orig" 27 | clush $clargs -g hbr "${SUDO:-} cp /tmp/hbase-client-0.98.7-mapr-1501-SNAPSHOT.jar /opt/mapr/hbase/hbase-0.98.7/lib/hbase-client-0.98.7-mapr-1501-r1.jar" 28 | clush $clargs -g hbr "${SUDO:-} chmod 644 /opt/mapr/hbase/hbase-0.98.7/lib/hbase-client-0.98.7-mapr-1501-r1.jar" 29 | 30 | 31 | # Configure ALL HBase nodes with the CLDB and Zookeeper info (-N does not like spaces in the name) 32 | clush $clargs -g hbm,hbr "${SUDO:-} /opt/mapr/server/configure.sh -Z $(nodeset -S, -e @zk) -C $(nodeset -S, -e @cldb) -R" 33 | 34 | echo Restart Warden on all nodes 35 | #read -p "Press enter to continue or ctrl-c to abort" 36 | clush $clargs -a "${SUDO:-} service mapr-warden restart" 37 | 38 | echo Waiting 2 minutes for system to initialize; end=$((SECONDS+120)) 39 | sp='/-\|'; printf ' '; while [ $SECONDS -lt $end ]; do printf '\b%.1s' "$sp"; sp=${sp#?}${sp%???}; sleep .3; done # Spinner from StackOverflow 40 | #ssh -qtt $(nodeset -I0 -e @zk) "${SUDO:-} maprcli node cldbmaster" 41 | ssh -qtt $(nodeset -I0 -e @zk) "su - mapr -c ' maprcli node cldbmaster'" 42 | [ $? 
-------------------------------------------------------------------------------- /post-install/runDFSIO.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | # jbenninghoff 2013-Apr-08 vi: set ai et sw=3 tabstop=3: 3 | # 4 | # This test will create N files of size fsize, where N is defined as 5 | # Number of total disks available in the cluster (for YARN clusters), OR 6 | # Number of Map Slots (for MapReduce Version 1 clusters) 7 | # fsize is defined as 250MB to avoid file sharding due to chunksize 8 | # 9 | # For YARN clusters, the first arg to the script becomes a multiplier of files. 10 | # This arg can be used to find the maximum throughput per map task 11 | # 12 | # The output is appended to a TestDFSIO.log file for all runs. The 13 | # key metrics reported are elapsed time and throughput (per map task). 14 | # 15 | # DFSIO creates a directory TestDFSIO under /benchmarks in the 16 | # distributed file system. The folder "/benchmarks" can be its own 17 | # volume in MapR (see mkBMvol.sh) 18 | 19 | MRV=$(hadoop version | awk 'NR==1{printf("%1.1s\n",$2)}') 20 | # Size of files to be written (in MB) 21 | fsize=250 22 | filesPerDisk=${1:-1} 23 | 24 | if [ $MRV == "2" ] ; then 25 | hadooppath=$(ls -c1 -d /opt/mapr/hadoop/hadoop-* |sort -n |tail -1) 26 | hadoopver=${hadooppath#*/hadoop-} # must be set before jarpath uses it 27 | jarpath="$hadooppath/share/hadoop/mapreduce/" 28 | jarpath+="hadoop-mapreduce-client-jobclient-${hadoopver}*-tests.jar" 29 | jarpath=$(eval ls $jarpath) 30 | 31 | clcmd="/opt/mapr/server/mrconfig sp list -v " 32 | clcmd+=" |grep -o '/dev/[^ ,]*' | sort -u | wc -l" 33 | tdisks=$(clush -aN "$clcmd" |awk '{ndisks+=$1}; END{print ndisks}') 34 | #tdisks=$(( $tdisks * $filesPerDisk )) 35 | tdisks=$(echo "$tdisks * $filesPerDisk" |bc ) 36 | mapDisks=$(echo "scale=2; 1 / $filesPerDisk" | bc) 37 | echo Number of disks per Map task: $mapDisks 38 | echo tdisks: $tdisks; echo filesPerDisk: $filesPerDisk; echo fsize: $fsize 39 | read -p "Press enter to continue or ctrl-c to abort" 40 | 41 | # Use "mapreduce" properties to force containers per available disk 42 | # Default is 1 container/disk (so map.disk=1 and nrFiles is tdisks*1 ) 43 | # The intent is to create one 'wave' of map tasks with max containers 44 | # per node utilized.
More than 1 container/disk can be specified to 45 | # discover peak cluster throughput 46 | hadoop jar $jarpath TestDFSIO \ 47 | -Dmapreduce.job.name=DFSIO-write \ 48 | -Dmapreduce.map.cpu.vcores=0 \ 49 | -Dmapreduce.map.memory.mb=768 \ 50 | -Dmapreduce.map.disk=${mapDisks:-1} \ 51 | -Dmapreduce.map.speculative=false \ 52 | -Dmapreduce.reduce.speculative=false \ 53 | -write -nrFiles $tdisks \ 54 | -fileSize $fsize -bufferSize 65536 55 | 56 | hadoop jar $jarpath TestDFSIO \ 57 | -Dmapreduce.job.name=DFSIO-read \ 58 | -Dmapreduce.map.cpu.vcores=0 \ 59 | -Dmapreduce.map.memory.mb=768 \ 60 | -Dmapreduce.map.disk=${mapDisks:-1} \ 61 | -Dmapreduce.map.speculative=false \ 62 | -Dmapreduce.reduce.speculative=false \ 63 | -read -nrFiles $tdisks \ 64 | -fileSize $fsize -bufferSize 65536 65 | 66 | # Optional settings to ratchet down memory consumption 67 | # -Dmapreduce.map.memory.mb=768 # default 1024 68 | # -Dmapreduce.map.java.opts=-Xmx768m # default -Xmx900m 69 | 70 | else # $MRV == 1 71 | 72 | HHOME=$(ls -d /opt/mapr/hadoop/hadoop-0*) 73 | HVER=${HHOME#*/hadoop-} 74 | TJAR=$HHOME/hadoop-${HVER}-dev-test.jar 75 | mtasks=$(maprcli dashboard info -json | grep map_task_capacity | grep -o '[0-9][0-9]*') 76 | 77 | # DFSIO write test 78 | hadoop jar $TJAR TestDFSIO \ 79 | -Dmapred.job.name=DFSIO-write \ 80 | -Dmapred.map.tasks.speculative.execution=false \ 81 | -Dmapred.reduce.tasks.speculative.execution=false \ 82 | -write -nrFiles $mtasks -fileSize $fsize -bufferSize 65536 83 | 84 | # DFSIO read test 85 | hadoop jar $TJAR TestDFSIO \ 86 | -Dmapred.job.name=DFSIO-read \ 87 | -Dmapred.map.tasks.speculative.execution=false \ 88 | -Dmapred.reduce.tasks.speculative.execution=false \ 89 | -read -nrFiles $mtasks -fileSize $fsize -bufferSize 65536 90 | 91 | fi 92 | 93 | echo "Results appended to ./TestDFSIO_results.log" 94 | echo " NOTE: Resulting metric is per map slot / container" 95 | 96 | # Quick test of map-reduce. Can be used right after building/rebuild a cluster 97 | # hadoop jar /opt/mapr/hadoop/hadoop-0.20.2/hadoop-0.20.2-dev-examples.jar pi 10 10 98 | # hadoop jar /opt/mapr/hadoop/hadoop-0.20.2/hadoop-0.20.2-dev-examples.jar wordcount file:///etc/services apacheWC 99 | -------------------------------------------------------------------------------- /pre-install/summIOzone.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | # updated to work with AWS 3 | # 20171020 R.Itäpuro initialize valsum variable to get correct CV values 4 | 5 | usage() { 6 | cat << EOF 7 | 8 | This script summarizes iozone results on a set of disks 9 | iozone results presumed to be in current folder in .log files 10 | Use -c option to output a csv format 11 | Use -C option to output a csv format with header line 12 | Use clush -aLN summIOzone.sh -c to gather all disk-test.sh results in csv. 
13 | 14 | EOF 15 | exit 16 | } 17 | 18 | csv=false; hdr=false DBG='' 19 | while getopts "Ccd" opt; do 20 | case $opt in 21 | c) csv=true ;; 22 | C) csv=true hdr=true;; 23 | d) DBG=true ;; # Enable debug statements 24 | *) usage ;; 25 | esac 26 | done 27 | [[ -n "$DBG" ]] && echo Options set: csv: $csv 28 | [[ -n "$DBG" ]] && read -rp "Press enter to continue or ctrl-c to abort" 29 | 30 | files=$(ls ./*-iozone.log 2>/dev/null) 31 | [[ -n "$files" ]] || { echo No iozone.log files found; exit 1; } 32 | 33 | if [[ $csv = "true" ]]; then 34 | gawk -v OFS=, -v HOST="$(hostname -s)" -v HDR=$hdr ' 35 | BEGIN { 36 | hdr="Host,Disk,DataSize,RecordSize,SeqWrite,SeqRead," 37 | hdr=hdr"RandRead,RandWrite" 38 | if ( HDR == "true" ) print hdr 39 | } 40 | /KB reclen +write/ { 41 | getline 42 | print HOST, substr(FILENAME,0,3), $1, $2, $3, $5, $7, $8 43 | } 44 | ' ./*-iozone.log 45 | exit 46 | fi 47 | 48 | cat ./*-iozone.log | gawk ' 49 | BEGIN { 50 | # Initialize seq & rand, min and max values 51 | swmin=6000000; srmin=swmin; rrmin=swmin; rwmin=swmin 52 | } 53 | 54 | # For all input files, 55 | # Match header of IOzone output line, 56 | # Get next line, read and store data fields 57 | /KB reclen +write/ { 58 | getline 59 | # err chk if NF < 8 60 | count++ 61 | fsize = $1 62 | swtotal += $3 63 | srtotal += $5 64 | rrtotal += $7 65 | rwtotal += $8 66 | swvals[count] = $3 67 | srvals[count] = $5 68 | rrvals[count] = $7 69 | rwvals[count] = $8 70 | if ($3 < swmin) swmin = $3; if ($3 > swmax) swmax = $3 71 | if ($5 < srmin) srmin = $5; if ($5 > srmax) srmax = $5 72 | if ($7 < rrmin) rrmin = $7; if ($7 > rrmax) rrmax = $7 73 | if ($8 < rwmin) rwmin = $8; if ($8 > rwmax) rwmax = $8 74 | } 75 | 76 | END { 77 | printf "%-7s %1.2f%s\n", "File size:", fsize/(1024*1024), "GB" 78 | printf "%-7s %6d\n", "Disk count:", count 79 | print "" 80 | 81 | print "IOzone Sequential Write Summary(KB/sec)" 82 | swavg = swtotal/count 83 | printf "%-7s %1.2f%s\n", "aggregate:", swtotal/(1024*1024), "GB/sec" 84 | printf "%-7s %6d\n", "mean:", swavg 85 | printf "%-7s %6d\n", "min:", swmin 86 | printf "%-7s %6d\n", "max:", swmax 87 | valsum = 0 88 | for (val in swvals) { 89 | valsum += (swvals[val] - swavg) ** 2 90 | } 91 | # print "stdev: ", sqrt(valsum/count) 92 | printf "CV: %00.1f%%\n", 100*(sqrt(valsum/count) / swavg) 93 | print "" 94 | 95 | print "IOzone Sequential Read Summary(KB/sec)" 96 | sravg = srtotal/count 97 | printf "%-7s %1.2f%s\n", "aggregate:", srtotal/(1024*1024), "GB/sec" 98 | printf "%-7s %6d\n", "mean:", sravg 99 | printf "%-7s %6d\n", "min:", srmin 100 | printf "%-7s %6d\n", "max:", srmax 101 | valsum = 0 102 | for (val in srvals) { 103 | valsum += (srvals[val] - sravg) ** 2 104 | } 105 | # print "stdev: ", sqrt(valsum/count) 106 | printf "CV: %00.1f%%\n", 100*(sqrt(valsum/count) / sravg) 107 | print "" 108 | 109 | print "IOzone Random Write Summary(KB/sec)" 110 | rwavg = rwtotal/count 111 | printf "%-7s %1.2f%s\n", "aggregate:", rwtotal/(1024*1024), "GB/sec" 112 | printf "%-7s %6d\n", "mean:", rwavg 113 | printf "%-7s %6d\n", "min:", rwmin 114 | printf "%-7s %6d\n", "max:", rwmax 115 | valsum = 0 116 | for (val in rwvals) { 117 | valsum += (rwvals[val] - rwavg) ** 2 118 | } 119 | # print "stdev: ", sqrt(valsum/count) 120 | printf "CV: %00.1f%%\n", 100*(sqrt(valsum/count) / rwavg) 121 | print "" 122 | 123 | print "IOzone Random Read Summary(KB/sec)" 124 | rravg = rrtotal/count 125 | printf "%-7s %1.2f%s\n", "aggregate:", rrtotal/(1024*1024), "GB/sec" 126 | printf "%-7s %6d\n", "mean:", rravg 127 | printf "%-7s 
%6d\n", "min:", rrmin 128 | printf "%-7s %6d\n", "max:", rrmax 129 | valsum = 0 130 | for (val in rrvals) { 131 | valsum += (rrvals[val] - rravg) ** 2 132 | } 133 | # print "stdev: ", sqrt(valsum/count) 134 | printf "CV: %00.1f%%\n", 100*(sqrt(valsum/count) / rravg) 135 | print "" 136 | } 137 | ' 138 | # jbenninghoff 2012-Aug-31 vim: set ai et sw=3 tabstop=3: 139 | -------------------------------------------------------------------------------- /pre-install/mapr3x-install.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | # jbenninghoff 2013-Mar-20 vi: set ai et sw=3 tabstop=3: 3 | 4 | grep ^all /etc/clustershell/groups || { echo clustershell group: all undefined; exit 1; } 5 | grep ^jt /etc/clustershell/groups || { echo clustershell group: jt undefined; exit 1; } 6 | grep ^zkcldb /etc/clustershell/groups || { echo clustershell group: zkcldb undefined; exit 1; } 7 | 8 | tmpfile=$(mktemp); trap 'rm $tmpfile' 0 1 2 3 15 9 | clname=pslab1 10 | admin1=jbenninghoff #Set to a non-root, non-mapr linux account which has a known password, this will be used to login to web ui 11 | node1=$(nodeset -I0 -e @zkcldb) #first node in zkcldb group 12 | clargs='-o -qtt' 13 | [ $(id -u) -ne 0 ] && SUDO=sudo #Use sudo, assuming account has passwordless sudo 14 | 15 | cat - << 'EOF' 16 | # Assumes clush is installed, available from EPEL repository 17 | # Assumes all nodes have been audited with cluster-audit.sh and all issues fixed 18 | # Assumes all nodes have met subsystem performance expectations as measured by memory-test.sh, network-test.sh and disk-test.sh 19 | # Assumes that MapR will run as mapr user 20 | # Assumes jt, zkcldb and all group have been defined for clush in /etc/clustershell/groups 21 | EOF 22 | exit #modify this script to match your set of nodes, then remove or comment out the exit command 23 | 24 | #Create 3.x repos on all nodes #TBD does not work with sudo 25 | #cat - << 'EOF2' | clush -a "cat - > /etc/yum.repos.d/maprtech.repo" 26 | cat - << 'EOF2' > $tmpfile 27 | [maprtech] 28 | name=MapR Technologies 29 | baseurl=http://package.mapr.com/releases/v3.1.1/redhat/ 30 | enabled=1 31 | gpgcheck=0 32 | protect=1 33 | 34 | [maprecosystem] 35 | name=MapR Technologies 36 | baseurl=http://package.mapr.com/releases/ecosystem/redhat/ 37 | enabled=1 38 | gpgcheck=0 39 | protect=1 40 | EOF2 41 | 42 | clush -abc $tmpfile --dest /tmp/${tmpfile##*/} 43 | clush $clargs -a "${SUDO:-} mv /tmp/${tmpfile##*/} /etc/yum.repos.d/maprtech.repo" 44 | clush $clargs -a "${SUDO:-} yum clean all" 45 | 46 | # Identify and format the data disks for MapR, destroys all data on all disks listed in /tmp/disk.list on all nodes 47 | clush -B $clargs -a "${SUDO:-} lsblk -id | grep -o ^sd. | grep -v ^sda |sort|sed 's,^,/dev/,' | tee /tmp/disk.list; wc /tmp/disk.list" 48 | echo Scrutinize the disk list above. 
All disks will be formatted for MapR FS, destroying all existing data on the disks 49 | echo Once the disk list is approved, edit this script and remove or comment the exit statement below 50 | read -p "Press enter to continue or ctrl-c to abort" 51 | 52 | # Install all servers with minimal rpms to provide storage and compute plus NFS 53 | clush $clargs -a "${SUDO:-} yum -y install mapr-fileserver mapr-nfs mapr-tasktracker" 54 | #read -p "Press enter to continue or ctrl-c to abort" 55 | 56 | # Service Layout option #1 ==================== 57 | # Admin services layered over data nodes defined in jt and zkcldb groups 58 | clush $clargs -g jt "${SUDO:-} yum -y install mapr-jobtracker mapr-webserver mapr-metrics" # At least 2 JobTracker nodes 59 | clush $clargs -g zkcldb "${SUDO:-} yum -y install mapr-cldb" # 3 CLDB nodes for HA, 1 does writes, all 3 do reads 60 | clush $clargs -g zkcldb "${SUDO:-} yum -y install mapr-zookeeper" #3 Zookeeper nodes, fileserver, tt and nfs could be erased 61 | ssh -qtt $node1 "${SUDO:-} yum -y install mapr-webserver" # Install webserver to bootstrap cluster install 62 | 63 | # Service Layout option #2 ==================== 64 | # Admin services on dedicated nodes, uncomment the line below 65 | #clush $clargs -g jt,zkcldb "${SUDO:-} yum -y erase mapr-tasktracker" 66 | 67 | # Check for correct java version and set JAVA_HOME 68 | clush $clargs -a "${SUDO:-} sed -i.bak 's,^#export JAVA_HOME=,export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk.x86_64,' /opt/mapr/conf/env.sh" 69 | 70 | # Configure ALL nodes with the CLDB and Zookeeper info (-N does not like spaces in the name) 71 | #clush $clargs -a "${SUDO:-} /opt/mapr/server/configure.sh -N $clname -Z $(nodeset -S, -e @zkcldb) -C $(nodeset -S, -e @zkcldb) -u mapr -g mapr -no-autostart" 72 | #[ $? -ne 0 ] && { echo configure.sh failed, check screen for errors; exit 2; } 73 | 74 | # Identify and format the data disks for MapR 75 | #clush $clargs -a "${SUDO:-} /opt/mapr/server/disksetup -F /tmp/disk.list" 76 | #clush $clargs -a "${SUDO:-} /opt/mapr/server/disksetup -W $(cat /tmp/disk.list | wc -l) -F /tmp/disk.list" #Fast but less resilient storage 77 | 78 | #clush $clargs -g zkcldb "${SUDO:-} service mapr-zookeeper start"; sleep 10 79 | #ssh -qtt $node1 "${SUDO:-} service mapr-warden start" # Start 1 CLDB and webserver on first node 80 | 81 | clush $clargs -a "${SUDO:-} /opt/mapr/server/configure.sh -N $clname -Z $(nodeset -S, -e @zkcldb) -C $(nodeset -S, -e @zkcldb) -u mapr -g mapr -F /tmp/disk.list" 82 | [ $? -ne 0 ] && { echo configure.sh failed, check screen for errors; exit 2; } 83 | echo Wait at least 120 seconds for system to initialize; sleep 120 84 | 85 | ssh -qtt $node1 "${SUDO:-} maprcli node cldbmaster" && ssh $node1 "${SUDO:-} maprcli acl edit -type cluster -user $admin1:fc" 86 | [ $? -ne 0 ] && { echo CLDB did not startup, check status and logs on $node1; exit 3; } 87 | 88 | echo With a web browser, open this URL to continue with license installation: 89 | echo "https://$node1:8443/" 90 | echo 91 | echo Alternatively, license can be installed with maprcli like this: 92 | cat - << 'EOF3' 93 | You can use any browser to connect to mapr.com, in the upper right corner there is a login link. login and register if you have not already. 94 | Once logged in, you can use the register button on the right of your login page to register a cluster by just entering a clusterid. 
95 | You can get the cluster id with maprcli like this: 96 | maprcli dashboard info -json -cluster TestCluster |grep id 97 | "id":"4878526810219217706" 98 | Once you finish the register form, you will get back a license which you can copy and paste to a file on the same node you ran maprcli (corpmapr-r02 I believe). 99 | Use that file as filename in the following maprcli command: 100 | maprcli license add -is_file true -license filename 101 | EOF3 102 | echo 103 | echo Restart mapr-warden on all servers once the license is applied 104 | echo clush $clargs -a "${SUDO:-} service mapr-warden start" 105 | -------------------------------------------------------------------------------- /post-install/runRWSpeedTest.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | # jbenninghoff 2013-Sep-25 vi: set ai et sw=3 tabstop=3: 3 | 4 | usage() { 5 | cat << EOF 6 | 7 | # Simple script to run the MapR RWSpeedTest on a single node in a local volume. 8 | # RWSpeedTest is NOT like DFSIO ... it is not a MapReduce job. 9 | # So to run on all nodes in a cluster you need to use clush 10 | # RWSpeedTest measures MFS throughput 11 | # Compare Local volume Aggregate results with disk-test.sh Sequential Aggregate results per node 12 | 13 | Usage: $0 [-n] [-r] [-p] [-d] [-s] [-x ] 14 | -d option for script debug 15 | -s option to run a set of fixed size tests 16 | -x option for file size divider 17 | -X option for file size multiplier 18 | -p option to preserve local volume [default is to delete it] 19 | -r option to use regular 3x replication volume 20 | -n option to skip compression on tests 21 | 22 | EOF 23 | 24 | } 25 | 26 | dbg=false; fact=1; sizes=false; preserve=false; volume=local; compression=true 27 | mult=1 28 | while getopts "nrpdsx:X:" opt; do 29 | case $opt in 30 | n) compression=false ;; 31 | r) volume=regular ;; 32 | p) preserve=true ;; 33 | d) dbg=true ;; 34 | s) sizes=true ;; 35 | x) [[ "$OPTARG" =~ ^[0-9]+$ ]] && fact=$OPTARG || { echo $OPTARG is not an integer; usage; exit 1; } ;; 36 | X) [[ "$OPTARG" =~ ^[0-9]+$ ]] && mult=$OPTARG || { echo $OPTARG is not an integer; usage; exit 1; } ;; 37 | :) echo "Option -$OPTARG requires an argument." >&2; usage; exit 2 ;; 38 | *) usage; exit 3 ;; 39 | esac 40 | done 41 | 42 | mapracct=$(stat -c "%U" /opt/mapr/conf/) 43 | tmpfile=$(mktemp); trap "rm $tmpfile; echo EXIT sigspec: $?; exit" EXIT 44 | localvol=localvol-$(hostname -s) 45 | MAPR_HOME=${MAPR_HOME:-/opt/mapr} 46 | if [[ $(id -un) != $mapracct && $(id -u) -ne 0 ]]; then 47 | echo This script must be run as root or mapr; exit 1 48 | fi 49 | 50 | #Check if folder exists and clear it out 51 | if hadoop fs -stat /$localvol >& /dev/null; then 52 | hadoop fs -rm -r -skipTrash /$localvol/\* 53 | fi 54 | #TBD: ! maprcli volume info -name $localvol > /dev/null #vol exists? 
55 | 56 | if [[ $volume == "regular" ]]; then 57 | # Make regular volume configured with replication 3 and compression off 58 | regvol=mfs-benchmarks-$(hostname -s) 59 | opts="-name $regvol -path /$regvol -replication 3" 60 | opts+=" -topology /data/default-rack" 61 | maprcli volume create $opts 62 | hadoop fs -rm -r -skipTrash /$regvol/\* >/dev/null 63 | hadoop mfs -setcompression off /$regvol 64 | else 65 | # Make local volume configured with replication 1 and compression off 66 | opts="-name $localvol -path /$localvol -replication 1 " 67 | opts+=" -localvolumehost $(<$MAPR_HOME/hostname)" 68 | maprcli volume create $opts 69 | hadoop mfs -setcompression off /$localvol 70 | fi 71 | 72 | #find jars, there should only be one of these jars ... let's hope :) 73 | MFS_TEST_JAR=$(find $MAPR_HOME/lib -name maprfs-diagnostic-tools-\*.jar) 74 | 75 | #set number of Java processes to the number of data drives 76 | pcmd="grep -o '/dev/[^ ,]*' | sort -u | wc -l" 77 | ndisk=$(/opt/mapr/server/mrconfig sp list -v | eval "$pcmd") 78 | #ndisk=$(/opt/mapr/server/mrconfig sp list -v | "$pcmd") 79 | #(( ndisk=ndisk*2 )) #Modify the process count if need be 80 | echo ndisk: $ndisk 81 | echo 82 | 83 | # Show the Storage Pools on this node 84 | /opt/mapr/server/mrconfig sp list -v 85 | echo 86 | # Show multimfs status 87 | maprcli config load -json |grep multimfs 88 | netstat -plnt |grep :566 89 | echo 90 | 91 | # Find the available MapR storage space on this node 92 | fsize=$(/opt/mapr/server/mrconfig sp list | awk '/totalfree/{print $9}') 93 | #echo Total File space $fsize MB 94 | 95 | # Start with 1% of available space 96 | (( fsize=(fsize/100) )) 97 | # Divide by the number of processes that will be run to set the file size 98 | (( fsize=(fsize/(${fact:-1}*ndisk) ) )) 99 | [[ $mult -gt 1 ]] && (( fsize=(fsize*mult) )) 100 | mfscache=$(pgrep -a mfs |grep -o -- '-m [0-9]*' |grep -o '[0-9]*') 101 | echo Num processes: $ndisk 102 | echo File size set to $fsize MB; echo 103 | if [[ $mfscache -gt $((ndisk * fsize)) ]]; then 104 | echo MFS cache exceeds aggregate file size 105 | echo mfscache: $mfscache 106 | echo data size: $((ndisk * fsize)) 107 | exit 108 | fi 109 | #read -p "Press enter to continue or ctrl-c to abort" 110 | 111 | # Usage: RWSpeedTest filename [-]megabytes uri 112 | # A simple single core (1 process) test to verify node if needed 113 | if [[ "$dbg" == "true" ]]; then 114 | opts="/$localvol/RWTestSingleTest $fsize maprfs:///" 115 | eval hadoop jar $MFS_TEST_JAR com.mapr.fs.RWSpeedTest "$opts" 116 | exit 117 | fi 118 | 119 | export HADOOP_ROOT_LOGGER="WARN,console" 120 | 121 | #TBD: Add loop to run with specific file sizes 256MB, 1GB and calculated size 122 | #run RWSpeedTest writes uncompressed 123 | for i in $(seq 1 $ndisk); do 124 | hadoop jar $MFS_TEST_JAR com.mapr.fs.RWSpeedTest /$localvol/RWTest${i} $fsize maprfs:/// & 125 | done | tee $tmpfile 126 | wait 127 | sleep 3 128 | awk '/Write rate:/{mbs+=$3};END{print "Aggregate Write Rate for this node is:", mbs, "MB/sec";}' $tmpfile 129 | 130 | #run RWSpeedTest reads uncompressed 131 | for i in $(seq 1 $ndisk); do 132 | hadoop jar $MFS_TEST_JAR com.mapr.fs.RWSpeedTest /$localvol/RWTest${i} -$fsize maprfs:/// & 133 | done | tee $tmpfile 134 | wait 135 | sleep 3 136 | awk '/Read rate:/{mbs+=$3};END{print "Aggregate Read Rate for this node is:", mbs, "MB/sec";}' $tmpfile 137 | 138 | if [[ $compression == "true" ]]; then 139 | hadoop fs -rm -r -skipTrash /$localvol/\* 140 | hadoop mfs -setcompression on /$localvol 141 | echo 142 | 143 | #run 
RWSpeedTest writes on compressed volume
144 | for i in $(seq 1 $ndisk); do
145 | hadoop jar $MFS_TEST_JAR com.mapr.fs.RWSpeedTest /$localvol/RWTest${i} $fsize maprfs:/// &
146 | done | tee $tmpfile
147 | wait
148 | sleep 3
149 | awk '/Write rate:/{mbs+=$3};END{print "Aggregate Write Rate using MFS compression for this node is:", mbs, "MB/sec";}' $tmpfile
150 |
151 | #run RWSpeedTest reads on compressed volume
152 | for i in $(seq 1 $ndisk); do
153 | hadoop jar $MFS_TEST_JAR com.mapr.fs.RWSpeedTest /$localvol/RWTest${i} -$fsize maprfs:/// &
154 | done | tee $tmpfile
155 | wait
156 | sleep 3
157 | awk '/Read rate:/{mbs+=$3};END{print "Aggregate Read Rate using MFS compression for this node is:", mbs, "MB/sec";}' $tmpfile
158 | fi
159 |
160 | if [[ $preserve == "false" ]]; then
161 | maprcli volume unmount -name $localvol
162 | maprcli volume remove -name $localvol
163 | fi
164 |
--------------------------------------------------------------------------------
/post-install/runTeraGenSort.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | # jbenninghoff 2013-Mar-8 vi: set ai et sw=3 tabstop=3:
3 |
4 | usage() {
5 | cat << EOF
6 | Usage: $0 [-d] [-t]
7 | -d To enable debug output
8 | -t To specify thin map tasks
9 |
10 | This script runs TeraGen and TeraSort using MR2 to measure big data
11 | sort performance. Specify the number of reduce tasks per node as an
12 | argument to the script. Modern multicore servers will run best with
13 | 8 or more reduce tasks per node. Some experimentation is required
14 | to achieve peak performance.
15 |
16 | EOF
17 | }
18 |
19 | # Handle script options
20 | DBG=""; mtask=fat
21 | while getopts "dt" opt; do
22 | case $opt in
23 | d) DBG=true ;;
24 | t) mtask=thin ;;
25 | \?) usage; exit ;;
26 | esac
27 | done
28 | [ -n "$DBG" ] && set -x
29 |
30 | if [[ $mtask == "thin" ]]; then
31 | chunksize=$((1*256*1024*1024)) # default size
32 | else
33 | chunksize=$((3*256*1024*1024)) # Fat map tasks
34 | fi
35 | ### TeraGen size (specify size using 100 Byte records)
36 | size=$((10*1000*1000*1000)) #1TB for full TeraSort run
37 | size=$((1000*1000*1000)) # 1B records (100GB) for quick runs when tuning
38 | bytes=$((size*100)) #Convert size to bytes
39 |
40 | #Define a map count for TeraGen resulting in no sharded/chunked files
41 | #Bash does not do floating point; round up by adding 1 when there is a remainder
42 | maps=$(( (bytes/chunksize) + (bytes % chunksize > 0) )) # e.g. 100GB / 768MiB chunks -> 125 maps
43 |
44 | #find latest hadoop installed
45 | hdphome=$(find /opt/mapr/hadoop -maxdepth 1 -type d -name hadoop-\* \
46 | |sort -n |tail -1)
47 | hdpjar=$(find "$hdphome" -name hadoop-mapreduce-examples\*[0-9].jar)
48 | #tbpath=/benchmarks/100tb
49 |
50 | #Run TeraGen
51 | if (hadoop fs -stat /benchmarks/tera/in); then
52 | echo Using existing TeraGen data
53 | mesg="Use: hadoop fs -rm -r /benchmarks/tera/in, "
54 | mesg+="to remove and trigger new TeraGen"
55 | echo "$mesg"
56 | sleep 3
57 | else
58 | #Delete previous data generated by teragen
59 | hadoop fs -rm -r /benchmarks/tera/in
60 | maprcli volume create -name benchmarks -path /benchmarks -replication 1
61 | hadoop fs -chmod 777 /benchmarks
62 | hadoop fs -mkdir /benchmarks/tera
63 | #Set MFS chunksize to 768MB for fat map tasks
64 | if !
hadoop mfs -setchunksize $chunksize /benchmarks/tera; then 65 | echo "setchunksize failed" 66 | echo "set chunksize on /benchmarks/tera to $chunksize manually" 67 | exit 68 | fi 69 | # Run TeraGen 70 | hadoop jar "$hdpjar" teragen \ 71 | -Dmapreduce.job.maps=$maps \ 72 | -Dmapreduce.map.disk=0 \ 73 | -Dmapreduce.map.cpu.vcores=0 \ 74 | -Dmapreduce.map.speculative=false \ 75 | $size /benchmarks/tera/in 76 | sleep 3 77 | fi 78 | hadoop mfs -ls /benchmarks/tera/in | grep ^-rwx | tail 79 | 80 | # maprcli config load -keys cldb.balancer.role.paused 81 | # maprcli config load -keys cldb.balancer.disk.paused 82 | # maprcli config save -values '{"cldb.balancer.disk.paused":"1"}' 83 | # maprcli config save -values '{"cldb.balancer.role.paused":"1"}' 84 | # Define vars for TeraSort run 85 | logname=terasort-$(date "+%FT%T").log 86 | nodes=$(maprcli node list -columns service |grep -c nodemanager) 87 | # Start with 2 reduce tasks per node, reduce tasks per node limited by RAM 88 | ((rtasks=nodes*${1:-2})) 89 | echo nodes="$nodes" | tee "$logname" 90 | echo rtasks="$rtasks" | tee -a "$logname" 91 | hadoop fs -rm -r /benchmarks/tera/out 92 | 93 | # Run TeraSort with fat or thin map tasks, depending on chunksize 94 | # Set mapreduce disk and vcores to 0 to size yarn containers by RAM only 95 | case $chunksize in 96 | $((3*256*1024*1024)) ) 97 | echo "Running TeraSort (size=$size) using 'fat' map tasks" 98 | echo "Uses fewer map tasks, reduces MxR shuffle" 99 | sleep 2 100 | hadoop jar "$hdpjar" terasort \ 101 | -Dmapreduce.map.disk=0 \ 102 | -Dmapreduce.map.cpu.vcores=0 \ 103 | -Dmapreduce.map.output.compress=false \ 104 | -Dmapreduce.map.sort.spill.percent=0.99 \ 105 | -Dmapreduce.map.memory.mb=2000 \ 106 | -Dmapreduce.map.java.opts="-Xmx1900m -Xms1900m" \ 107 | -Dmapreduce.task.io.sort.mb=1500 \ 108 | -Dmapreduce.task.io.sort.factor=100 \ 109 | -Dmapreduce.reduce.disk=0 \ 110 | -Dmapreduce.reduce.cpu.vcores=0 \ 111 | -Dmapreduce.reduce.shuffle.parallelcopies="$nodes" \ 112 | -Dmapreduce.reduce.merge.inmem.threshold=0 \ 113 | -Dmapreduce.job.reduces=$rtasks \ 114 | -Dmapreduce.job.reduce.slowstart.completedmaps=0.85 \ 115 | -Dyarn.app.mapreduce.am.log.level=ERROR \ 116 | -Dyarn.app.mapreduce.am.resource.mb=4000 \ 117 | -Dyarn.app.mapreduce.am.command-opts="-Xmx3200M -Xms3200M" \ 118 | /benchmarks/tera/in /benchmarks/tera/out 2>&1 | tee terasort.tmp 119 | # AM resized to handle 10 and 100TB runs 120 | ;; 121 | $((1*256*1024*1024)) ) 122 | echo "Running TeraSort (size=$size) using thin map tasks(more map tasks)" 123 | sleep 2 124 | hadoop jar "$hdpjar" terasort \ 125 | -Dmapreduce.map.disk=0 \ 126 | -Dmapreduce.map.cpu.vcores=0 \ 127 | -Dmapreduce.map.output.compress=false \ 128 | -Dmapreduce.map.sort.spill.percent=0.99 \ 129 | -Dmapreduce.reduce.disk=0 \ 130 | -Dmapreduce.reduce.cpu.vcores=0 \ 131 | -Dmapreduce.reduce.shuffle.parallelcopies="$nodes" \ 132 | -Dmapreduce.reduce.merge.inmem.threshold=0 \ 133 | -Dmapreduce.task.io.sort.mb=480 \ 134 | -Dmapreduce.task.io.sort.factor=100 \ 135 | -Dmapreduce.job.reduces=$rtasks \ 136 | -Dmapreduce.job.reduce.slowstart.completedmaps=0.55 \ 137 | -Dyarn.app.mapreduce.am.log.level=ERROR \ 138 | /benchmarks/tera/in /benchmarks/tera/out 2>&1 | tee terasort.tmp 139 | ;; 140 | *) echo Undefined chunk size!; exit ;; 141 | esac 142 | 143 | #-Dmapreduce.map.speculative=false \ 144 | #-Dmapreduce.reduce.speculative=false \ 145 | #-Dmapreduce.reduce.memory.mb=3000 \ 146 | 147 | # Post-process the TeraSort job output 148 | sleep 3 149 | # Capture the job history log 150 | 
myd=$(date +'%Y/%m/%d')
151 | myj=$(grep 'INFO mapreduce.Job: Running job' terasort.tmp |awk '{print $7}')
152 | myhist="/var/mapr/cluster/yarn/rm/staging/history/done/$myd/00*/$myj*.jhist"
153 | until (hadoop fs -stat $myhist); do
154 | echo Waiting 20 sec for "$myhist"; sleep 20
155 | done
156 | myhist=$(hadoop fs -ls "$myhist" |awk '{print $NF}')
157 | #echo "HISTORY FILE: $myhist"
158 |
159 | mapred job -history "$myhist" >> "$logname" # capture the run log
160 | cat "$0" >> "$logname" # append actual script run to the log
161 | head -22 "$logname" # show the top of the log with elapsed time, etc
162 | echo; echo View "$logname" for full job stats
163 | cat terasort.tmp >> "$logname"; rm terasort.tmp
164 | #./mapr-audit.sh >> $logname
165 |
166 | # To validate TeraSort output, uncomment below and change output folder
167 | # hadoop jar /opt/mapr/hadoop/hadoop-0.20.2/hadoop-0.20.2-dev-examples.jar
168 | # teravalidate /benchmarks/tera/out /benchmarks/tera/validate
169 |
--------------------------------------------------------------------------------
/post-install/runYCSB.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | #jbenninghoff 2017-Apr-26 vi: set ai et sw=3 tabstop=3 retab
3 | #set -o nounset
4 |
5 | usage() {
6 | cat << EOF
7 | Usage: $0 -t <threads> -r <rows> -m -s <host1,host2,...>
8 | -w -c -n -T -p -d
9 | -t Thread count (30/client HBase default, 20/client for MapR-DB)
10 | -r Row count (1M default)
11 | -m MapR-DB (using /benchmarks/usertable)
12 | -s Specify comma separated list of YCSB client hostnames
13 | -w YCSB workload files a-f (default is auto-generated workloadtest)
14 | -c Disable client buffering (client buffering provides best performance)
15 | -n Skip load phase
16 | -T Create table
17 | -p Presplit the table
18 | -d Enable debug
19 |
20 | This script runs one YCSB client against an HBase or MapR-DB cluster to
21 | measure throughput and latency. The output is saved to a log file.
22 |
23 | The script can be invoked by clush to run multiple YCSB clients.
24 | To do so, the MapR HBase client and YCSB package must be installed
25 | on all client nodes. The YCSB package and this script must be pushed to
26 | all client nodes. A clush group such as 'ycsb' should be defined.
27 | The clush command would be:
28 |
29 | clush -b -g ycsb -c \$PWD
30 | clush -b -g ycsb \$PWD/runYCSB.sh -s \$(nodeset -S, -e @ycsb)
31 | #plus any other YCSB options needed
32 |
33 | Make sure the MapR HBase client package, mapr-hbase, is installed on this node
34 | and all YCSB client nodes.
35 |
36 | YCSB package available here:
37 | https://github.com/brianfrankcooper/YCSB
38 |
39 | EOF
40 | }
41 |
42 | # Handle script arguments
43 | DBG=''; clients=''; rows=$[1000*1000]; threads=30; table=usertable;
44 | cbuff=true; wkld=test; load=true; create=false; istart=0; seq=0; presplit=false
45 | while getopts "Tdpncmt:r:s:w" opt; do
46 | case $opt in
47 | d) DBG=true ;;
48 | c) cbuff=false ;;
49 | n) load=false ;;
50 | p) presplit=true ;;
51 | T) create=true ;;
52 | m) table=/benchmarks/usertable; threads=20; echo Using $table ;;
53 | w) wkld=$(echo {a..f}) ;;
54 | t) if [[ "$OPTARG" =~ ^[0-9]+$ ]]; then
55 | threads=$OPTARG
56 | else
57 | echo $OPTARG is not a number; exit
58 | fi ;;
59 | r) if [[ "$OPTARG" =~ ^[0-9]+$ ]]; then
60 | rows=$OPTARG
61 | else
62 | echo $OPTARG is not a number; exit
63 | fi ;;
64 | s) if [[ "$OPTARG" =~ .*,.* ]]; then
65 | clients=$OPTARG
66 | else
67 | echo $OPTARG is not a host list; exit;
68 | fi ;;
69 | \?) usage; exit ;;
70 | esac
71 | done
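# Example invocations (illustrative only; see the usage text above):
#   ./runYCSB.sh -T -p -r 100000000 -s client1,client2,client3  # HBase table, presplit
#   ./runYCSB.sh -m -T -s client1,client2,client3               # MapR-DB table
# Note that the -s check above requires a comma in the argument, so a
# single-host list is rejected; on a lone client just omit -s.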
72 | if [[ -n "$DBG" ]]; then
73 | echo clients: $clients, rows: $rows, threads: $threads, table: $table
74 | echo wkld: $wkld, load: $load, cbuff: $cbuff
75 | fi
76 |
77 | # Check for hbase shell
78 | if ! type hbase >& /dev/null; then
79 | echo hbase client required for this script
80 | yum -q list installed mapr-hbase || echo mapr-hbase not installed
81 | exit 2
82 | fi
83 |
84 | # Set up some variables
85 | setvars() {
86 | hostcount=1
87 | thishost=$(hostname -s)
88 | opcount=$[rows/hostcount]
89 | if [[ -n "$clients" ]]; then
90 | hostarray=(${clients//,/ })
91 | hostcount=${#hostarray[@]}
92 | opcount=$[rows/hostcount]
93 | i=0
94 | for host in ${hostarray[@]}; do
95 | [[ "$host" == "$thishost" ]] && { istart=$[opcount*i]; seq=$i; }
96 | ((i++))
97 | done
98 | fi
99 | [[ $[$opcount / (1000)] -gt 0 ]] && mag=$[$opcount / (1000)]K
100 | [[ $[$opcount / (1000*1000)] -gt 0 ]] && mag=$[$opcount / (1000*1000)]M
101 | [[ $[$opcount / (1000*1000*1000)] -gt 0 ]] && mag=$[$opcount / (1000*1000*1000)]B
102 | ycsbdb=hbase10
103 | ycsbdb=maprdb
104 | ycsbargs="-s -threads $threads -p table=$table "
105 | ycsbargs+=" -p clientbuffering=$cbuff -p recordcount=$rows"
106 | ycsbargs+=" -p operationcount=$opcount -cp $(hbase classpath)"
107 | #ycsbargs+=" -p operationcount=$opcount -cp lib"
108 | tmpdate=$(date '+%Y-%m-%dT%H+%M')
109 | teelog="ycsb-${threads}T-$thishost-$seq-$mag-${table//\//-}-$tmpdate"
110 | #export CLASSPATH=$(hbase classpath) #YCSB does not appear to honor the envar
111 | }
112 | setvars
113 |
114 | # Check for YCSB package
115 | [[ -d ${ycsbdb}-binding ]] || { echo YCSB or DB binding not installed; exit 2; }
116 |
117 | #Create table if requested
118 | if [[ "$create" == "true" ]]; then
119 | hbase shell <<< "disable '$table'; drop '$table'"
120 | if [[ "$presplit" == "true" ]]; then
121 | hbase shell <<-EOF
122 | #nreg=64 #num of regions
123 | #keyrange=9.999999E18 #Max row key value
124 | #regsize=keyrange/nreg #Region size
125 | #splits=(1..nreg-1).map {|i| "user#{sprintf '%019d', i*regsize}"}
126 | nreg=71; keyrange=9.999999E18; regsize=keyrange/nreg #Based on 100M rows, ~64 used regions
127 | splits=(8..nreg-1).map {|i| "user#{sprintf '%019d', i*regsize}"} #zero padded keys do not appear to be used by YCSB
128 | create '$table', {NAME=>'family',COMPRESSION=>'none',IN_MEMORY=>'true'}, SPLITS => splits
129 | EOF
130 | else
131 | hbase shell <<-EOF2
132 | # Simple table for MapR-DB
133 | create '$table', {NAME=>'family',IN_MEMORY=>'true'}
134 | #create '$table',{NAME=>'family',COMPRESSION=>'none',IN_MEMORY=>'true'}
135 | EOF2
136 | fi
137 | fi
138 |
139 | #Generate a workload file
140 | cat <<EOF3 >workloads/workloadtest
141 | workload=com.yahoo.ycsb.workloads.CoreWorkload
142 |
143 | #100M rows, 5x500 byte row length (2.5K) ~= 250GB data set
144 | recordcount=$rows
145 | fieldlength=500
146 | fieldcount=5
147 | columnfamily=family
148 | insertstart=$istart
149 | insertcount=$opcount
150 | #Must be here due to a YCSB bug; also specified on the YCSB CLI.
151 | # 0 means run forever
152 | operationcount=$opcount
153 |
154 | #Op proportions
155 | readproportion=0.0
156 | scanproportion=0.0
157 | insertproportion=0.0
158 | updateproportion=1.0
159 | #Caution: insert changes the size of the db, which makes comparisons
160 | #across runs impossible. Since the work of an update is identical to
161 | #an insert, we recommend not using insert.
162 | #readmodifywriteproportion=0.05
163 |
164 | #distribution
165 | #readallfields=true
166 | requestdistribution=zipfian
167 | maxscanlength=100
168 | scanlengthdistribution=uniform
169 | EOF3
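# Worked example of the partitioning computed in setvars() above
# (illustrative numbers): with rows=1000000 and three hosts given via -s,
# each client gets opcount = 1000000/3 = 333333 operations, and the host
# at index i in the list loads keys starting at insertstart = 333333*i,
# so the client key ranges do not overlap.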
170 |
171 |
172 | for wkldf in $wkld; do
173 | args2=" -P workloads/workload$wkldf "
174 | args2+=" -p fieldlength=500 -p fieldcount=5 -p columnfamily=family"
175 | hbase shell <<< "disable '$table'; drop '$table'" >/dev/null
176 | hbase shell <<< "create '$table', {NAME=>'family',IN_MEMORY=>'true'}"
177 |
178 | #YCSB load phase
179 | if [[ "$load" == "true" ]]; then
180 | teelog1=${teelog}-wkld-$wkldf-load.log
181 | args3=" -p insertcount=$opcount -p insertstart=$istart"
182 | if [[ -n "$DBG" ]]; then
183 | echo bin/ycsb load $ycsbdb $ycsbargs $args2 $args3 $teelog1; exit
184 | fi
185 | bin/ycsb load $ycsbdb $ycsbargs $args2 $args3 |& tee $teelog1
186 | fi
187 | #YCSB run phase
188 | teelog2=${teelog}-wkld-$wkldf-run.log
189 | bin/ycsb run $ycsbdb $ycsbargs $args2 |& tee $teelog2
190 | done
191 |
192 | #Check for table
193 | #gs1='is not a MapRDB'; gs2='does not exist'
194 | #if (hbase shell <<< "exists '$table'" |grep -q -e "$gs1" -e "$gs2"); then
195 | # echo Table Does Not Exist; exit 2
196 | #fi
197 |
--------------------------------------------------------------------------------
/post-install/regionsp.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/python
2 | #
3 | # Show distribution of regions (tablets) for a MapR table across storage pools
4 | #
5 | # Requirements:
6 | # maprcli is in PATH
7 | # passwordless ssh to cluster nodes to run mrconfig
8 | #
9 | # Usage: regionsp.py <tablepath>
10 | #
11 |
12 | import sys
13 | import subprocess
14 | import re
15 | import json
16 | import locale
17 | from collections import defaultdict
18 |
19 | locale.setlocale(locale.LC_ALL, 'en_US')
20 | progname=sys.argv[0]
21 |
22 | def usage(*ustring):
23 | print 'Usage: '+ progname + ' tablepath'
24 | if len(ustring) > 0:
25 | print " ",ustring[0]
26 | exit(1)
27 |
28 | def errexit(estring, err):
29 | print estring,": ",err
30 | exit(1)
31 |
32 | def execCmd(cmdlist):
33 | process = subprocess.Popen(cmdlist, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
34 | out, err = process.communicate()
35 | return out,err
36 |
37 | def getSpDict(node):
38 | # Create a storage pool dictionary.
39 | # Key : SP Name 40 | # Value : List of Containers 41 | spContainerDict = defaultdict(list) 42 | 43 | # Get a storage pool listing for node 44 | spls, errout = execCmd(["ssh", "-o", "StrictHostKeyChecking=no", node, "/opt/mapr/server/mrconfig sp list"]) 45 | #if errout: 46 | # errexit(node + " mrconfig sp list", errout) 47 | splines = spls.split('\n') 48 | 49 | # For each storage pool, extract name and first disk device 50 | for spline in splines: 51 | # splineRe=re.compile('^SP .+name (.+?),.*path (.+?)\s*\Z') 52 | splineRe=re.compile('^SP .+name (.+?), (.+?),.*path (.+?)\s*\Z') 53 | if not splineRe.match(spline): 54 | continue 55 | matches = splineRe.search(spline).groups() 56 | spName=matches[0] 57 | spStatus=matches[1] 58 | spDev=matches[2] 59 | 60 | if spStatus == "Offline": 61 | continue 62 | 63 | # For the given storage pool (first disk device), get RW container list 64 | spcntr, errout = execCmd(["ssh", "-o", "StrictHostKeyChecking=no", node, "/opt/mapr/server/mrconfig info containers rw " + spDev]) 65 | spcntrRe=re.compile('^RW containers: (.*)') 66 | matches = spcntrRe.search(spcntr).groups() 67 | cntrlist = matches[0].split() 68 | 69 | # put the list in the dictionary and return 70 | spContainerDict[spName].extend(cntrlist) 71 | 72 | return spContainerDict 73 | 74 | def getRegionInfo(tablepath): 75 | regionInfo, errout = execCmd(["/opt/mapr/bin/maprcli", "table", "region", "list", "-path", tablepath, "-json"]) 76 | regionInfoJson=json.loads(regionInfo) 77 | return regionInfoJson 78 | 79 | def isValidRegionJson(regionInfoJson): 80 | # region info JSON should always have a status 81 | desc = "NOT_OK" 82 | status = "NOT_OK" 83 | if "status" in regionInfoJson: 84 | desc = status = regionInfoJson["status"] 85 | if "errors" in regionInfoJson: 86 | desc = regionInfoJson["errors"][0]["desc"] 87 | return status, desc 88 | 89 | def kStr(i): 90 | # return number as a string with comma separated thousands 91 | if isinstance(i, (int,long)): 92 | return locale.format("%ld", i, grouping=True) 93 | else: 94 | return i 95 | 96 | #def printfields(field1, sp, regionCnt, rowCnt, size, fidList): 97 | # print field1.ljust(16), sp.ljust(8), ":", kStr(regionCnt).rjust(4), "regions", kStr(rowCnt).rjust(14), "rows", kStr(size).rjust(17), "bytes ", fidList 98 | 99 | 100 | def printfields(field1, sp, regionCnt, containerCnt, rowCnt, size, cList): 101 | print field1.ljust(17), sp.ljust(6), kStr(regionCnt).rjust(8), kStr(containerCnt).rjust(6), kStr(rowCnt).rjust(16), kStr(size).rjust(19), " ", cList 102 | 103 | def main(): 104 | 105 | if len(sys.argv) != 2: 106 | usage() 107 | 108 | tablepath=sys.argv[1] 109 | 110 | regionInfoJson = getRegionInfo(tablepath) 111 | status, description = isValidRegionJson(regionInfoJson) 112 | if status != "OK": 113 | usage(description) 114 | 115 | nodeList=[] 116 | tableContainerDict = defaultdict(list) 117 | # Create a table container dictionary 118 | # Key : Node 119 | # Value : List of Region Dicts 120 | 121 | ''' 122 | Each region dictionary looks like this. One per region. 
123 | { 124 | "primarynode":"se-node11.se.lab", 125 | "secondarynodes":"se-node10.se.lab, se-node13.se.lab", 126 | "startkey":"user1987445486054028621", 127 | "endkey":"user24927076355330151", 128 | "lastheartbeat":0, 129 | "fid":"3058.32.131224", 130 | "physicalsize":5983674368, 131 | "logicalsize":5983674368, 132 | "numberofrows":4869428 133 | } 134 | 135 | ''' 136 | 137 | regionList=regionInfoJson["data"] # This is a list of region dicts 138 | 139 | # Create a list of nodes that hold a container for the table 140 | for regionDict in regionList: 141 | node=regionDict["primarynode"] 142 | tableContainerDict[node].append(regionDict) 143 | if node not in nodeList: 144 | nodeList.append(node) 145 | nodeList.sort() 146 | 147 | # Loop through nodes printing out region info for each SP. 148 | # Note that a given container may contain more than one region (fid) for the table 149 | tableCntTotal=0 150 | tableSizeTotal=0 151 | tableRowsTotal=0 152 | tableContainersTotal=0 153 | for node in nodeList: 154 | spTableFidDict = defaultdict(list) # for each SP, the table FIDs in that SP 155 | nodeCntTotal=0 156 | nodeSizeTotal=0 157 | nodeRowsTotal=0 158 | nodeContainersTotal=0 159 | nodeContainerList=[] 160 | printfields(node, "SP", "Regions", "Cntnrs", "Rows", "Bytes", "Container IDs") 161 | spRegionCntDict=defaultdict(int) 162 | spRegionSzDict=defaultdict(long) 163 | spRegionRowsDict=defaultdict(long) 164 | spContainerDict=getSpDict(node) # For specified node, SPName:list of containers 165 | for regionDict in tableContainerDict[node]: 166 | regionContainer=regionDict["fid"].split('.')[0] 167 | #print "regionContainer = ", regionContainer 168 | for sp in spContainerDict: 169 | if regionContainer in spContainerDict[sp]: 170 | spTableFidDict[sp].append(str(regionDict["fid"])) # Add FID to this SP's list of FIDs 171 | spRegionCntDict[sp] += 1 172 | spRegionSzDict[sp] += regionDict["physicalsize"] 173 | spRegionRowsDict[sp] += regionDict["numberofrows"] 174 | # Need to print out 0 for SPs that don't have anything 175 | for sp in sorted(spRegionCntDict.iterkeys()): 176 | #print " ", sp, "\t", spRegionCntDict[sp] 177 | size=spRegionSzDict[sp] 178 | sizeMB=spRegionSzDict[sp]/(1024*1024) 179 | rowCnt=spRegionRowsDict[sp] 180 | #print " ", sp.ljust(8), ":", kStr(spRegionCntDict[sp]).rjust(4), "regions", kStr(rowCnt).rjust(14), "rows", kStr(sizeMB).rjust(10), "MB" 181 | spTableFidDict[sp].sort() 182 | containerList = [fid.split('.')[0] for fid in spTableFidDict[sp]] 183 | containerList=list(set(containerList)) 184 | containerCnt = len(set(containerList)) 185 | printfields(" ", sp, spRegionCntDict[sp], containerCnt, rowCnt, size, ' '.join(containerList)) # spTableFidDict[sp]) 186 | #print " ", sp.ljust(8), ":", kStr(spRegionCntDict[sp]).rjust(4), "regions", kStr(rowCnt).rjust(14), "rows", kStr(size).rjust(17), "bytes ", spTableFidDict[sp] 187 | nodeCntTotal += spRegionCntDict[sp] 188 | #nodeSizeTotal += sizeMB 189 | nodeSizeTotal += size 190 | nodeRowsTotal += rowCnt 191 | nodeContainersTotal += containerCnt 192 | nodeContainerList.extend(containerList) 193 | nodeContainerList.sort() 194 | #print " ", "TOTAL".ljust(8), ":", kStr(nodeCntTotal).rjust(4), "regions", kStr(nodeRowsTotal).rjust(14), "rows", kStr(nodeSizeTotal).rjust(10), "MB" 195 | printfields("TOTAL".rjust(17), "", nodeCntTotal, nodeContainersTotal, nodeRowsTotal, nodeSizeTotal, ' '.join(nodeContainerList)) 196 | print " " 197 | tableCntTotal += nodeCntTotal 198 | tableSizeTotal += nodeSizeTotal 199 | tableRowsTotal += nodeRowsTotal 200 | 
tableContainersTotal += nodeContainersTotal 201 | #print "TABLE TOTAL", ":", kStr(tableCntTotal).rjust(4), "regions", kStr(tableRowsTotal).rjust(14), "rows", kStr(tableSizeTotal).rjust(17), "bytes" 202 | printfields("TABLE TOTAL:", "", tableCntTotal, tableContainersTotal, tableRowsTotal, tableSizeTotal, "") 203 | 204 | if __name__ == "__main__": 205 | main() 206 | 207 | exit(0) 208 | 209 | -------------------------------------------------------------------------------- /post-install/profile_cluster.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | # dgomerman@maprtech.com 2014-Mar-25 3 | # Variables 4 | DATE=$(date +%Y%m%d%H%M) 5 | GEN_PROFILE="gen_profile.sh" 6 | CLUSH_TT_GROUP="tt" 7 | CLUSH_DEST="/tmp/tera_profile" 8 | GEN_PROFILE_OUTPUT="/tmp/node_profile.sh" 9 | OUTPUT_DIR="/tmp/profile_results.$$" 10 | RESULTS_DIR="$OUTPUT_DIR/node_results" 11 | CLUSTER_OUTPUT="$OUTPUT_DIR/cluster_profile.ini" 12 | CLUSTER_ZIP="/tmp/cluster-profile-$DATE.tar.gz" 13 | TERA_SCRIPT="./runTeraTune.sh" 14 | TERA_LOG="$OUTPUT_DIR/teraTune.log" 15 | CLUSH_CMD="clush" 16 | CLUSH_OPTIONS="--nostdin" 17 | NODESET_CMD="nodeset" 18 | MAPRCLI_CMD="maprcli" 19 | SSH_CMD="" 20 | #SSH_CMD="ssh root@node" # Uncomment to specify SSH HOST with maprcli 21 | 22 | # Functions 23 | pcheck() { 24 | local p="$1" 25 | which "$p" >/dev/null 2>&1 26 | local pE=$? 27 | if [ $pE -ne 0 ] ; then 28 | echo "ERROR: $p does not exist. Exiting." 29 | exit 1 30 | fi 31 | } 32 | 33 | # Pre-checks 34 | pcheck $CLUSH_CMD 35 | pcheck $NODESET_CMD 36 | pcheck $MAPRCLI_CMD 37 | 38 | if [ ! -e "$GEN_PROFILE" ] ; then 39 | echo "ERROR: $GEN_PROFILE script is missing. Exiting." 40 | exit 1 41 | fi 42 | 43 | if [ -d "$RESULTS_DIR" ] ; then 44 | mv "$RESULTS_DIR" "$RESULTS_DIR.bk" 45 | fi 46 | mkdir -p "$RESULTS_DIR" 47 | 48 | # Check for Clush group definition and existence of nodes 49 | if [ ! -n "$CLUSH_TT_GROUP" ] ; then 50 | echo "ERROR: clush task tracker group \"CLUSH_TT_GROUP\" must be defined. Exiting." 51 | exit 1 52 | fi 53 | myNodes="`$NODESET_CMD -e @$CLUSH_TT_GROUP`" 54 | if [ ! -n "$myNodes" ] ; then 55 | echo "ERROR: CLUSH_TT_GROUP \"$CLUSH_TT_GROUP\" does not contain any nodes. Exiting." 
56 | exit 1
57 | fi
58 | clush_args="$CLUSH_OPTIONS -g $CLUSH_TT_GROUP"
59 |
60 | # Copy gen_profile.sh script to all nodes
61 | $CLUSH_CMD $clush_args "mkdir -p \"$CLUSH_DEST\""
62 | $CLUSH_CMD $clush_args --copy "$GEN_PROFILE" --dest "$CLUSH_DEST"
63 | # Run gen_profile.sh script on all Task Tracker nodes
64 | $CLUSH_CMD $clush_args -B "$CLUSH_DEST/$GEN_PROFILE"
65 | # Collect gen_profile.sh results from all nodes
66 | $CLUSH_CMD $clush_args --rcopy "$GEN_PROFILE_OUTPUT" --dest "$RESULTS_DIR"
67 | # Remote cleanup
68 | #$CLUSH_CMD $clush_args "rm -rf \"$CLUSH_DEST\""
69 | #$CLUSH_CMD $clush_args "rm -f \"$GEN_PROFILE_OUTPUT\""
70 | # Analyze results
71 | c_node_distro="unknown"
72 | c_node_manufacturer="unknown"
73 | c_node_product="unknown"
74 | c_node_dims_total=0
75 | c_node_dims_avg=0
76 | c_node_memory_total=0
77 | c_node_memory_avg=0
78 | c_node_core_total=0
79 | c_node_core_avg=0
80 | c_node_cpu_mhz_total=0
81 | c_node_cpu_mhz_avg=0
82 | c_node_nic_avg_speed_total=0
83 | c_node_nic_avg_speed_avg=0
84 | c_node_nic_agg_speed_total=0
85 | c_node_nic_agg_speed_avg=0
86 | c_node_nic_count_total=0
87 | c_node_nic_count_avg=0
88 | c_node_disk_count_total=0
89 | c_node_disk_count_avg=0
90 |
91 | count=0
92 | homogenous=1
93 | cat <<EOF > $CLUSTER_OUTPUT
94 | ; This file is auto generated by MapR profile_cluster.sh
95 |
96 | EOF
97 |
98 | cd "$RESULTS_DIR"
99 | files=$(find node_profile.sh* -maxdepth 1 -type f)
100 | if [ ! -n "$files" ] ; then
101 | echo "ERROR: no node_profile.sh* files found. Exiting."
102 | exit 1
103 | fi
104 | for f in $files ; do
105 | # Save old values to check for homogenous cluster
106 | MEMORY_DIMS_P=$MEMORY_DIMS
107 | MEMORY_TOTAL_P=$MEMORY_TOTAL
108 | CORE_COUNT_P=$CORE_COUNT
109 | CPU_MHZ_P=$CPU_MHZ
110 | AVG_NIC_SPEED_P=$AVG_NIC_SPEED
111 | AGG_NIC_SPEED_P=$AGG_NIC_SPEED
112 | NIC_COUNT_P=$NIC_COUNT
113 | DISK_COUNT_P=$DISK_COUNT
114 |
115 | # Clear old values in case future values are blank
116 | unset DISTRO
117 | unset MANUFACTURER
118 | unset PRODUCT
119 | unset MEMORY_DIMS
120 | unset MEMORY_TOTAL
121 | unset CORE_COUNT
122 | unset CPU_MHZ
123 | unset NIC_SPEEDS
124 | unset AVG_NIC_SPEED
125 | unset AGG_NIC_SPEED
126 | unset NIC_COUNT
127 | unset DISK_COUNT
128 | # Get new values
129 | source "$f"
130 | cat <<EOF >> $CLUSTER_OUTPUT
131 | [node$count]
132 | distro = "$DISTRO"
133 | manufacturer = "$MANUFACTURER"
134 | product = "$PRODUCT"
135 | memory_dims = "$MEMORY_DIMS"
136 | memory_total = "$MEMORY_TOTAL"
137 | core_count = "$CORE_COUNT"
138 | cpu_mhz = "$CPU_MHZ"
139 | nic_speeds = "$NIC_SPEEDS"
140 | avg_nic_speed = "$AVG_NIC_SPEED"
141 | agg_nic_speed = "$AGG_NIC_SPEED"
142 | nic_count = "$NIC_COUNT"
143 | disk_count = "$DISK_COUNT"
144 |
145 | EOF
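# An example of one emitted stanza (values are hypothetical; the real ones
# come from sourcing each collected node_profile.sh):
# [node0]
# distro = "CentOS 6.5"
# memory_dims = "8"
# memory_total = "64397"
# core_count = "16"
# cpu_mhz = "2600"
# nic_count = "2"
# disk_count = "12"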
-n "$MEMORY_DIMS_P" ] ; then 167 | MEMORY_DIMS_P="$MEMORY_DIMS" 168 | fi 169 | if [ $MEMORY_DIMS_P -ne $MEMORY_DIMS ] ; then 170 | homogenous=0 171 | fi 172 | let c_node_memory_total+=$MEMORY_TOTAL 173 | if [ ! -n "$MEMORY_TOTAL_P" ] ; then 174 | MEMORY_TOTAL_P="$MEMORY_TOTAL" 175 | fi 176 | if [ $MEMORY_TOTAL_P -ne $MEMORY_TOTAL ] ; then 177 | homogenous=0 178 | fi 179 | let c_node_core_total+=$CORE_COUNT 180 | if [ ! -n "$CORE_COUNT_P" ] ; then 181 | CORE_COUNT_P="$CORE_COUNT" 182 | fi 183 | if [ $CORE_COUNT_P -ne $CORE_COUNT ] ; then 184 | homogenous=0 185 | fi 186 | let c_node_cpu_mhz_total+=$CPU_MHZ 187 | if [ ! -n "$CPU_MHZ_P" ] ; then 188 | CPU_MHZ_P="$CPU_MHZ" 189 | fi 190 | if [ $CPU_MHZ_P -ne $CPU_MHZ ] ; then 191 | homogenous=0 192 | fi 193 | let c_node_nic_avg_speed_total+=$AVG_NIC_SPEED 194 | if [ ! -n "$AVG_NIC_SPEED_P" ] ; then 195 | AVG_NIC_SPEED_P="$AVG_NIC_SPEED" 196 | fi 197 | if [ $AVG_NIC_SPEED_P -ne $AVG_NIC_SPEED ] ; then 198 | homogenous=0 199 | fi 200 | let c_node_nic_agg_speed_total+=$AGG_NIC_SPEED 201 | 202 | let c_node_nic_count_total+=$NIC_COUNT 203 | if [ ! -n "$NIC_COUNT_P" ] ; then 204 | NIC_COUNT_P="$NIC_COUNT" 205 | fi 206 | if [ $NIC_COUNT_P -ne $NIC_COUNT ] ; then 207 | homogenous=0 208 | fi 209 | let c_node_disk_count_total+=$DISK_COUNT 210 | if [ ! -n "$DISK_COUNT_P" ] ; then 211 | DISK_COUNT_P="$DISK_COUNT" 212 | fi 213 | if [ $DISK_COUNT_P -ne $DISK_COUNT ] ; then 214 | homogenous=0 215 | fi 216 | 217 | let count+=1 218 | done 219 | # Calculate averages 220 | let c_node_dims_avg=$c_node_dims_total/$count 221 | let c_node_memory_avg=$c_node_memory_total/$count 222 | let c_node_core_avg=$c_node_core_total/$count 223 | let c_node_cpu_mhz_avg=$c_node_cpu_mhz_total/$count 224 | let c_node_nic_avg_speed_avg=$c_node_nic_avg_speed_total/$count 225 | let c_node_nic_agg_speed_avg=$c_node_nic_agg_speed_total/$count 226 | let c_node_nic_count_avg=$c_node_nic_count_total/$count 227 | let c_node_disk_count_avg=$c_node_disk_count_total/$count 228 | 229 | # Grab cluster information from CLI 230 | if [ -n "$SSH_CMD" ] ; then 231 | MAPRCLI_CMD="$SSH_CMD $MAPRCLI_CMD" 232 | fi 233 | map_capacity="`$MAPRCLI_CMD dashboard info -json |grep '\"map_task_capacity\"' |sed -e 's/[^0-9]*//g'`" 234 | reduce_capacity="`$MAPRCLI_CMD dashboard info -json |grep '\"reduce_task_capacity\"' |sed -e 's/[^0-9]*//g'`" 235 | nodes=$($MAPRCLI_CMD node list -columns hostname,cpus,ttReduceSlots | awk '/^[1-9]/{if ($2>0) count++};END{print count}') 236 | cluster_tt_nodes=$($MAPRCLI_CMD node list -columns hostname,configuredservice -filter "[rp==/*]and[svc==tasktracker]" |tail -n +2 |wc --lines) 237 | mapr_version=$($MAPRCLI_CMD dashboard info -version true |tail -1 |sed -e 's/[ \t]*//g') 238 | cluster_name=$($MAPRCLI_CMD dashboard info -json |grep -A3 "cluster\":{" |grep "name" |sed -e 's/^.*:\"//' -e 's/\".*$//') 239 | cluster_id=$($MAPRCLI_CMD dashboard info -json |grep -A3 "cluster\":{" |grep "id" |sed -e 's/^.*:\"//' -e 's/\".*$//') 240 | 241 | cd - >/dev/null 2>&1 242 | # Complete cluster profile 243 | cat <> $CLUSTER_OUTPUT 244 | [cluster] 245 | cluster_name = "$cluster_name" 246 | cluster_id = "$cluster_id" 247 | mapr_version = "$mapr_version" 248 | homogenous = "$homogenous" 249 | cluster_nodes = "$nodes" 250 | cluster_tt_nodes = "$cluster_tt_nodes" 251 | distro = "$c_node_distro" 252 | manufacturer = "$c_node_manufacturer" 253 | product = "$c_node_product" 254 | memory_dims_avg = "$c_node_dims_avg" 255 | memory_total_avg = "$c_node_memory_avg" 256 | memory_total = 
"$c_node_memory_total" 257 | core_count_avg = "$c_node_core_avg" 258 | core_count_total = "$c_node_core_total" 259 | cpu_mhz_avg = "$c_node_cpu_mhz_avg" 260 | nic_speed_avg = "$c_node_nic_avg_speed_avg" 261 | agg_nic_speed_avg = "$c_node_nic_agg_speed_avg" 262 | nic_count_avg = "$c_node_nic_count_avg" 263 | disk_count = "$c_node_disk_count_avg" 264 | map_slots = "$map_capacity" 265 | reduce_slots = "$reduce_capacity" 266 | 267 | EOF 268 | 269 | # Run TeraTune 270 | $TERA_SCRIPT "$TERA_LOG" 271 | 272 | # Zip up results 273 | cd "$OUTPUT_DIR" 274 | tar zcf "$CLUSTER_ZIP" * >/dev/null 2>&1 275 | # Local cleanup 276 | # Instructions 277 | cat <&2; usage ;; 45 | esac;; 46 | a) diskset=all ;; 47 | s) seq=true ;; 48 | r) testtype=readtest ;; 49 | p) preserve=true ;; 50 | z) [[ "$OPTARG" =~ ^[0-9.]+$ ]] && size=$OPTARG 51 | [[ "$OPTARG" =~ ^[0-9.]+$ ]] || { echo $OPTARG is not a number;exit; } ;; 52 | d) DBG=true ;; # Enable debug statements 53 | *) usage ;; 54 | esac 55 | #TBD: add disk detail option, -i 56 | done 57 | [[ -n "$DBG" ]] && echo Options set: diskset: $diskset, seq: $seq, size: $size 58 | [[ -n "$DBG" ]] && read -p "Press enter to continue or ctrl-c to abort" 59 | 60 | [[ $(id -u) != 0 ]] && { echo This script must be run as root; exit 1; } 61 | 62 | find_unused_disks() { 63 | [[ -n "$DBG" ]] && set -x 64 | disklist="" 65 | fdisks=$(fdisk -l |& awk '/^Disk .* bytes/{print $2}' |sort) 66 | for d in $fdisks; do 67 | [[ -n "$DBG" ]] && echo Fdisk list loop, Checking Device: $dev 68 | dev=${d%:} # Strip colon off the dev path string 69 | # If mounted, skip device 70 | mount | grep -q -w -e $dev -e ${dev}1 -e ${dev}2 && continue 71 | # If swap partition, skip device 72 | swapon -s | grep -q -w $dev && continue 73 | # If physical volume is part of LVM, skip device 74 | type pvdisplay &> /dev/null && pvdisplay $dev &> /dev/null && continue 75 | # If device name appears to be LVM swap device, skip device 76 | [[ $dev == *swap* ]] && continue 77 | # Looks like might be swap device 78 | lsblk -nl "$(readlink -f $dev)" | grep -i swap && continue 79 | # If device is part of encrypted partition, skip device 80 | type cryptsetup >& /dev/null && cryptsetup isLuks $dev && continue 81 | if [[ $dev == /dev/md* ]]; then 82 | mdisks+="$(mdadm -D $dev |grep -o '/dev/[^0-9 ]*' |grep -v /dev/md) " 83 | continue 84 | fi 85 | if [[ "$testtype" != "readtest" ]]; then 86 | #Looks like part of MapR disk set already 87 | grep $dev /opt/mapr/conf/disktab &>/dev/null && continue 88 | #Looks like something has device open 89 | lsof $dev && continue 90 | fi 91 | ## Survived all filters, add device to the list of unused disks!! 
92 | disklist="$disklist $dev " 93 | done 94 | 95 | for d in $mdisks; do #Remove devices used by /dev/md* 96 | echo Removing MDisk from list: $d 97 | disklist=${disklist/$d } 98 | done 99 | 100 | #Remove devices used by LVM or mounted partitions 101 | [[ -n "$DBG" ]] && echo LVM checks 102 | awkcmd='$2=="lvm" {print "/dev/"$3; print "/dev/mapper/"$1}; ' 103 | awkcmd+=' $2=="part" {print "/dev/"$3; print "/dev/"$1}' 104 | lvmdisks=$(lsblk -ln -o NAME,TYPE,PKNAME,MOUNTPOINT |awk "$awkcmd" |sort -u) 105 | for d in $lvmdisks; do 106 | echo Removing LVM disk from list: $d 107 | disklist=${disklist/$d } 108 | done 109 | 110 | # Remove /dev/mapper duplicates from $disklist 111 | for i in $disklist; do 112 | [[ "$i" != /dev/mapper* ]] && continue 113 | [[ -n "$DBG" ]] && echo Disk is mapper: $i 114 | #/dev/mapper underlying device 115 | dupdev=$(lsblk |grep -B2 "$(basename $i)" |awk '/disk/{print "/dev/"$1}') 116 | #strip underlying device used by mapper from disklist 117 | disklist=${disklist/$dupdev } 118 | #disklist=${disklist/$i } #strip mapper device 119 | done 120 | 121 | # Remove /dev/secvm/dev duplicates from $disklist (Vormetric) 122 | for i in $disklist; do 123 | [[ "$i" != /dev/secvm/dev/* ]] && continue 124 | [[ -n "$DBG" ]] && echo Disk is Vormetric: $i 125 | #/dev/secvm/dev underlying device 126 | dupdev=$(lsblk |grep -B2 "$(basename $i)" |awk '/disk/{print "/dev/"$1}') 127 | #strip underlying device used by secvm(Vormetric) from disklist 128 | disklist=${disklist/$dupdev } 129 | #disklist=${disklist/$i } #strip secvm(Vormetric) device 130 | done 131 | [[ -n "$DBG" ]] && { set +x; echo DiskList: $disklist; } 132 | [[ -n "$DBG" ]] && read -p "Press enter to continue or ctrl-c to abort" 133 | } 134 | 135 | # Report on unused or all disks found 136 | case "$diskset" in 137 | all) 138 | disklist=$(fdisk -l 2>/dev/null | awk '/^Disk \// {print $2}' |sort) 139 | echo -e "All disks: " $disklist; echo; exit 140 | ;; 141 | unused) 142 | if [[ $preserve == true ]]; then 143 | [[ -f /tmp/disk.list ]] || { echo /tmp/disk.list does not exist; exit; } 144 | # Re-use /tmp/disk.list 145 | disklist=$(/tmp/disk.list 151 | fi 152 | [[ -n "$DBG" ]] && cat /tmp/disk.list 153 | [[ -n "$DBG" ]] && read -p "Press enter to continue or ctrl-c to abort" 154 | if [[ -n "$disklist" ]]; then 155 | echo; echo "Unused disks: $disklist" 156 | [[ -t 1 ]] && { tput -S <<< $'setab 3\nsetaf 0'; } 157 | echo -n Scrutinize this list carefully!! 158 | [[ -t 1 ]] && tput op 159 | echo 160 | #echo -e "\033[33;5;7mScrutinize this list carefully!!\033[0m" 161 | else 162 | echo; echo "No Unused disks!"; echo; exit 1 163 | fi 164 | : << '--BLOCK-COMMENT--' 165 | diskqty=$(echo $disklist | wc -w) 166 | #See /opt/mapr/conf/mfs.conf: mfs.max.disks 167 | #TBD: add smartctl disk detail probes 168 | if type smartctl >& /dev/null; then 169 | grepopts='-e ^Vendor -e ^Product -e Capacity -e ^Rotation ' 170 | grepopts+=' -e ^Form -e ^Transport' 171 | smartctl -d megaraid,0 -a /dev/sdf | grep $grepopts 172 | elif [[ -f /opt/MegaRAID/MegaCLI ]]; then 173 | /opt/MegaRAID/MegaCLI ... 
178 |
179 | # Run read-only or read-write (destructive) tests
180 | case "$testtype" in
181 | readtest)
182 | [[ -n "$DBG" ]] && set -x
183 | #read-only dd test, possible even after MFS is in place
184 | ddopts="of=/dev/null iflag=direct bs=1M count=$((size*1000))"
185 | if [[ $seq == "false" ]]; then
186 | [[ -n "$DBG" ]] && echo Concurrent dd disklist: $disklist
187 | for i in $disklist; do
188 | dd if=$i $ddopts |& tee "$(basename $i)-dd.log" &
189 | done
190 | echo; echo "Waiting for dd to finish"
191 | wait
192 | echo
193 | else
194 | for i in $disklist; do
195 | dd if=$i $ddopts |& tee "$(basename $i)-seq-dd.log"
196 | done
197 | fi
198 | sleep 3
199 | for i in $disklist; do grep -H MB/s "$(basename $i)"*-dd.log; done
200 | ;;
201 | destroy)
202 | [[ -n "$DBG" ]] && set -x
203 | if service mapr-warden status; then
204 | echo 'MapR warden appears to be running'
205 | echo 'Stop warden (e.g. service mapr-warden stop)'
206 | exit
207 | fi
208 | if pgrep iozone; then
209 | echo 'iozone appears to be running'
210 | echo 'kill all iozones running (e.g. pkill iozone)'
211 | exit
212 | fi
213 | #tar up previous log files
214 | files=$(ls ./*-{dd,iozone}.log 2>/dev/null)
215 | if [[ -n "$files" ]]; then
216 | tar czf disk-tests-"$(date "+%FT%T" |tr : .)".tgz $files
217 | rm -f $files
218 | fi
219 | #TBD: add sync option. Async IO (-k) by default
220 | iozopts="-I -r 1M -s ${size}G -k 10 -+n -i 0 -i 1 -i 2"
221 | if [[ $seq == "false" ]]; then # Benchmark all disks concurrently
222 | for disk in $disklist; do
223 | iozlog=$(basename $disk)-iozone.log
224 | $scriptdir/iozone $iozopts -f $disk > $iozlog &
225 | sleep 2 #Some disk controllers lock up without a delay
226 | done
227 | echo; echo "Waiting for iozone to finish"; wait; echo
228 | else # Sequence through the disk list, one by one
229 | for disk in $disklist; do
230 | iozlog=$(basename $disk)-seq-iozone.log
231 | $scriptdir/iozone $iozopts -f $disk > $iozlog
232 | done
233 | echo; echo "IOzone sequential testing done"; echo
234 | fi
235 | #write dd test
236 | #dd if=/dev/urandom of=/dev/sdX oflag=direct bs=1M count=4000
237 | ;;
238 | none)
239 | echo No test requested
240 | ;;
241 | esac
242 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | Cluster Validation
2 | ==================
3 |
4 | Before installing MapR it is critically important to validate the
5 | hardware and software that MapR will depend on. Doing so verifies
6 | that items like disks, CPUs and NICs are performing as expected,
7 | and it logs the benchmark metrics. It also verifies that many of
8 | the basic OS configurations and packages are in the required
9 | state, and that state is likewise recorded in an output
10 | log.
11 |
12 | Please use the steps below to test CPU/RAM, disk, and networking
13 | performance as well as to verify that your cluster meets MapR
14 | installation requirements. Pre-install tests should be run before
15 | installing MapR. Post-install tests should be run after installing
16 | the MapR software and configuring it. Post-install tests
17 | help assure that the cluster is in good working order and ready
18 | to hand over to your production team.
19 |
20 | Install clustershell (rpm available via EPEL) on a machine with
21 | password-less ssh to all other cluster nodes. If using a non-root
22 | account, the non-root account must have password-less sudo rights
23 | configured in /etc/sudoers. Update the file
24 | `/etc/clustershell/groups.d/local.cfg` or `/etc/clustershell/groups`
25 | to include an entry for "all" matching a pattern or patterns of
26 | host names in your cluster.
27 | For example:
28 |
29 | all: node[0-10]
30 |
31 | Verify clush works correctly by running:
32 |
33 | clush -a date
34 |
35 | Compare results with:
36 |
37 | clush -ab date
38 |
39 | Complete documentation for clush and clustershell can be found here:
40 | http://clustershell.readthedocs.org/en/latest/tools/clush.html
41 |
42 | If you don't find clustershell in EPEL, you may be able to download the rpm here:
43 | `http://mirror.math.princeton.edu/pub/epel/6/x86_64/clustershell-1.7.2-1.el6.noarch.rpm`
44 |
45 | Next, download and extract the cluster-validation package with a command like this:
46 |
47 | curl -L -o cluster-validation.tgz http://github.com/MapRPS/cluster-validation/tarball/master
48 |
49 | Extract with tar in /root or your home folder and rename the top level folder like this:
50 |
51 | mv jbenninghoff-cluster-validation-* cluster-validation
52 | or
53 | mv cluster-validation-* cluster-validation
54 |
55 | Copy the cluster-validation folder to all nodes in the cluster. The
56 | clush command simplifies this:
57 |
58 | clush -a --copy /path.../cluster-validation
59 | clush -Ba ls /path.../cluster-validation # confirm that all nodes have the utilities
60 |
61 | Step 1 : Gather Base Audit Information
62 | --------------------------------------
63 | Run cluster-audit.sh as root to verify that all nodes have met the
64 | MapR installation requirements. Run:
65 |
66 | cd /root/cluster-validation/
67 | pre-install/cluster-audit.sh | tee cluster-audit.log
68 |
69 | Run those commands on the node where clush has been installed and
70 | ssh configured to access all the cluster nodes. Examine the log
71 | for inconsistencies among the nodes. Do not proceed until all
72 | inconsistencies have been resolved and all requirements, such as
73 | missing rpms, Java version, etc., have been met. Please send the
74 | output of the cluster-audit.log back to us.
75 |
76 | NOTE: cluster-audit.sh is designed for physical servers.
77 | Virtual instances in cloud environments (e.g. Amazon, Google, or
78 | OpenStack) may generate confusing responses to some specific
79 | commands (e.g. dmidecode). In most cases, these anomalies are
80 | irrelevant.
81 |
82 | Step 2 : Evaluate Network Interconnect Bandwidth
83 | ------------------------------------------------
84 | Use the network test to validate network bandwidth. This will take
85 | about two minutes to run before producing output, so be patient.
86 | The script will use clush to collect the IP addresses of all the
87 | nodes and split the set in half, using the first half as servers and
88 | the second half as clients. The half1 and half2 arrays in the
89 | network-test.sh script can be manually defined as well. There are
90 | command line options for sequential mode and for the choice of test
91 | program as well. Run:
92 |
93 | cd /root/cluster-validation/
94 | pre-install/network-test.sh | tee network-test.log
95 |
96 | Run those commands on the node where clush has been installed and
97 | configured. Expect about 90% of peak bandwidth for either 1GbE or
98 | 10GbE networks:
99 |
100 | 1 GbE ==> ~115 MB/sec
101 | 10 GbE ==> ~1150 MB/sec
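(As a sanity check on those targets: 10 Gbit/sec divided by 8 bits/byte
is 1250 MB/sec of raw line rate, and ~90% of that is roughly the
~1150 MB/sec figure above; the same arithmetic gives ~115 MB/sec for 1GbE.)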
102 |
103 | Step 3 : Evaluate Raw Memory Performance
104 | ----------------------------------------
105 | Use the stream59 benchmark to test memory performance. This test will take
106 | about a minute or so to run. It can be executed in parallel on all
107 | the cluster nodes with the clush command:
108 |
109 | cd /root/cluster-validation/
110 | clush -Ba "$PWD/pre-install/memory-test.sh | grep -e ^Func -e ^Triad" | tee memory-test.log
111 |
112 | System memory bandwidth is determined by the speed of the DIMMs, the
113 | number of memory channels and, to a lesser degree, by CPU frequency.
114 | Current generation Xeon-based servers with eight or more 1600MHz DIMMs
115 | can deliver 70-80GB/sec Triad results. Previous generation Xeon CPUs
116 | (Westmere) can deliver ~40GB/sec Triad results. You can look up
117 | the CPU, as shown by cluster-audit.sh, at http://ark.intel.com/.
118 | The technical specifications for the CPU model will include the Max
119 | Memory Bandwidth (per CPU socket). Servers typically have two CPU
120 | sockets, so doubling the Max Bandwidth shown will give you the Max
121 | Memory Bandwidth for the server. The measured Stream Triad result
122 | from memory-test.log can then be compared to the Max Memory Bandwidth
123 | of the server. The Triad bandwidth should be approximately 75-85%
124 | of the Max Memory Bandwidth of the server.
125 |
126 | Step 4 : Evaluate Raw Disk Performance
127 | --------------------------------------
128 | Use the iozone benchmark to test disk performance. This process
129 | is destructive to the disks that are tested, so make sure that
130 | you have not installed MapR and have no needed data on those
131 | spindles. The script as shipped will ONLY list the disks to
132 | be tested. When run with no arguments, this script outputs a
133 | list of unused disks. After carefully examining this list, run
134 | it again with --destroy as the argument ('disk-test.sh --destroy')
135 | to run the destructive IOzone tests on all unused disks.
136 |
137 | The test can be run in parallel on all nodes with clush:
138 |
139 | cd /root/cluster-validation/
140 | clush -ab "$PWD/pre-install/disk-test.sh"
141 | clush -ab "$PWD/pre-install/summIOzone.sh"
142 |
143 | Current generation (2012+) 7200 rpm SATA drives can report 100-150MB/sec
144 | sequential read and write throughput. SAS and SSD drives can report
145 | from 200MB/sec up to nearly 2GB/sec for NVMe drives, as measured
146 | by sequential iozone or fio tests on the raw device. By default,
147 | the disk test only uses a 4GB data set size in order to finish
148 | quickly. Consider using a larger size to measure disk throughput
149 | more thoroughly. Doing so will typically require hours to run, which
150 | could be done overnight if your schedule allows. For large numbers
151 | of nodes and disks there is a summIOzone.sh script that can help
152 | provide a summary of disk-test.sh output using clush.
153 |
154 | clush -ab /root/cluster-validation/pre-install/summIOzone.sh
155 |
156 | Complete Pre-Installation Checks
157 | --------------------------------
158 | When all subsystem tests have passed and met expectations, there
159 | is an example install script in the pre-install folder that can be
160 | modified and used for a scripted install by experienced system
161 | administrators.
162 | 163 | pre-install/mapr-install.sh -h 164 | 165 | Otherwise, follow the instructions from the mapr web site: 166 | http://maprdocs.mapr.com/home/install.html 167 | 168 | Post Installation tests 169 | -------------------------------- 170 | Post installation tests are in the post-install folder. The primary 171 | tests are RWSpeedTest and TeraSort. Scripts to run each are 172 | provided in the folder. Read the scripts for additional info. These 173 | scripts should all be run by the MapR service account, typically 174 | 'mapr'. The tests should be rerun as a normal user account to 175 | verify that normal accounts have no permission issues and can 176 | achieve the same throughput. 177 | 178 | 1: Create the benchmarks volume 179 | -------------------------------------- 180 | A script to create a benchmarks volume `mkBMvol.sh` is provided. 181 | Be sure to create the benchmarks volume before running any of the 182 | post install benchmarks. 183 | 184 | 2: Run MFS benchmark 185 | -------------------------------------- 186 | A script to run a per node MFS benchmark is the first post-install 187 | test to run on MapR. It is run on each cluster node using the script 188 | post-install/runRWSpeedTest.sh. It should be run on all nodes using clush: 189 | 190 | clush -ab post-install/runRWSpeedTest.sh |tee MFSbenchmark.log 191 | 192 | The default values are a good place to start. Use the -h option 193 | to the script to see the available options. 194 | 195 | 3: Run DFSIO benchmark 196 | -------------------------------------- 197 | DFSIO is a standard Hadoop benchmark to measure HDFS throughput. 198 | The script to run the benchmark can be run like this: 199 | 200 | post-install/runDFSIO.sh |tee DFSIO.log 201 | 202 | The metric is a per map task throughput (averaged). 203 | 204 | 4: Run TeraSort benchmark 205 | -------------------------------------- 206 | TeraSort is a standard Hadoop benchmark to measure MapReduce 207 | performance for a simple but common use case, sorting. 208 | The script to run the benchmark can be run like this: 209 | 210 | post-install/runTeraGenSort.sh 211 | 212 | The script combines TeraGen and TeraSort and takes an argument which 213 | is the number of reduce tasks per node. It creates its own log file. 214 | 215 | NOTE: The TeraSort benchmark (executed by runTeraGenSort.sh) will likely 216 | require tuning for each specific cluster. At a minimum, pass integer 217 | arguments in powers of 2 (e.g. 4, 8, etc) to the script to increase the 218 | number of reduce tasks per node up to the maximum reduce slots available on 219 | your cluster. Experiment with the -D options if you are an MR2 expert. 220 | 221 | There are other tunings in the script to optimize TeraSort performance. 222 | 223 | 5: Run Spark TeraSort benchmark 224 | -------------------------------------- 225 | There is an unoptimized version of TeraSort for Spark on github. 226 | This script runs that benchmark and can be run like this: 227 | 228 | post-install/runSparkTeraGenSort.sh |tee SparkTeraSort.log 229 | 230 | This script needs improvement. It currently requires cores, memory 231 | and executors to be modified in the script for specific cluster 232 | sizes. 233 | 234 | Additionally the default script uses a jar built for Scala 2.10 or 2.11 235 | and will fail when running on MapR 6.2 or above. The script will need to 236 | be updated to reference version 1.2 of the jar instead of 1.1 to work on 237 | newer clusters. 
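For example, the jar reference can be switched with a one-line edit
(a hedged sketch; verify the exact jar filename and how the script
references it before running):

    sed -i 's/spark-terasort-1\.1/spark-terasort-1.2/' post-install/runSparkTeraGenSort.sh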
238 |
239 |
240 | The post-install folder also contains a mapr-audit.sh script which
241 | can be run to provide an audit log of the MapR configuration. The
242 | script contains a useful set of example maprcli commands. There are
243 | also install, upgrade and uninstall options to mapr-install.sh
244 | that leverage clush to run quickly on an entire set of nodes or
245 | cluster. Some of the scripts will not run without editing, so read the
246 | scripts carefully to understand how to edit them with site-specific
247 | info. All scripts support the -h option to show help on usage.
248 |
249 | /John Benninghoff
250 |
--------------------------------------------------------------------------------
/pre-install/network-test.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | # jbenninghoff 2013-Jun-07 vi: set ai et sw=3 tabstop=3:
3 | # shellcheck disable=SC2016,SC2029
4 |
5 | usage() {
6 | cat << EOF >&2
7 |
8 | This script runs the iperf benchmark to validate network bandwidth
9 | using a cluster bisection strategy. One half of the nodes act as
10 | clients by sending load to the other half of the nodes acting as servers.
11 | The throughput between each pair of nodes is reported. When run
12 | concurrently (the default case), load is also applied to the network
13 | switch(es).
14 |
15 | Use -s option to run tests in sequential mode
16 | Use -c option to run MapR rpctest instead of iperf
17 | Use -x option to run multiple flows/streams from each
18 | client to measure bonding/teaming NICs (default 1)
19 | Use -X option to run 2 processes on servers and clients to measure
20 | bonding/teaming NICs
21 | Use -z option to specify size of test in MB (default=5000)
22 | Use -r to specify reverse sort order of the target IP addresses
23 | Use -R to specify random sort order of IP addresses
24 | Use -g option to specify a clush group (default is group all)
25 | Use -m option to run tests on multiple server NIC IP addresses
26 | Use -d option to enable debug output
27 |
28 | EOF
29 | exit
30 | }
31 |
32 | concurrent=true
33 | runiperf=true
34 | multinic=false
35 | size=5000
36 | DBG=''
37 | xtra=1
38 | group=all
39 | procs=1
40 | while getopts "XdscmrRg:z:x:" opt; do
41 | msg="$OPTARG is not an integer"
42 | case $opt in
43 | d) DBG=true ;;
44 | g) group=$OPTARG ;;
45 | s) concurrent=false ;;
46 | c) runiperf=false ;;
47 | x) [[ "$OPTARG" =~ ^[0-9]+$ ]] && xtra=$OPTARG || { echo $msg; usage; };;
48 | X) procs=2 ;;
49 | m) multinic=true ;;
50 | r) ORDER=reverse ;;
51 | R) ORDER=random ;;
52 | z) [[ "$OPTARG" =~ ^[0-9]+$ ]] && size=$OPTARG || { echo $msg; usage; };;
53 | \?)
usage >&2; exit ;; 54 | esac 55 | done 56 | 57 | setvars() { 58 | scriptdir="$(cd "$(dirname "$0")"; pwd -P)" #absolute path to script dir 59 | iperfbin=iperf3 #Installed iperf3 {uses same options} 60 | if type iperf >& /dev/null; then 61 | iperfbin=iperf #Installed iperf version 62 | else 63 | iperfbin=$scriptdir/iperf #Packaged version 64 | fi 65 | rpctestbin=/opt/mapr/server/tools/rpctest #Installed version 66 | rpctestbin=$scriptdir/rpctest #Packaged version 67 | port2=5002 68 | #Uncomment next 3 vars to enable NUMA taskset, check nodes with numactl -H 69 | #numanode0="0-13,28-41" 70 | #numanode1="14-27,42-55" 71 | #taskset="taskset -c " 72 | #tmpfile=$(mktemp); trap "rm $tmpfile; echo EXIT sigspec: $?; exit" EXIT 73 | if [[ $(id -u) != 0 ]]; then 74 | ssh() { /usr/bin/ssh -l root "$@"; } 75 | fi 76 | } 77 | setvars 78 | if [[ -n "$DBG" ]]; then 79 | echo concurrent: $concurrent 80 | echo runiperf: $runiperf 81 | echo multinic: $multinic 82 | echo size: $size 83 | echo ORDER: ${ORDER} 84 | echo xtra: $xtra 85 | echo group: $group 86 | echo procs: $procs 87 | read -p "DBG: Press enter to continue or ctrl-c to abort" 88 | fi 89 | 90 | # Generate a host list array 91 | if type nodeset >& /dev/null; then 92 | hostlist=( $(nodeset -e @${group}) ) #Host list in bash array 93 | elif [[ -f ~/host.list ]]; then 94 | hostlist=( $(< ~/host.list) ) 95 | else 96 | echo 'This test requires a host list via clush/nodeset or ' >&2 97 | echo "$HOME/host.list file, one host per line" >&2 98 | exit 99 | fi 100 | [[ -n "$DBG" ]] && echo hostlist: "${hostlist[@]}" 101 | 102 | # Convert host list into an ip list array, awk/match first IPv4 addr 103 | awkcmd='{ if (match($0, /[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/, m)) print m[0] }' 104 | for host in "${hostlist[@]}"; do 105 | iplist+=( $(ssh "$host" hostname -i |awk "$awkcmd") ) 106 | #iplist+=( $(ssh "$host" hostname -i |grep -Po '\d+(\.\d+){3}\s') ) 107 | done 108 | [[ -n "$DBG" ]] && echo iplist: "${iplist[@]}" 109 | 110 | # Capture multiple NIC addrs on servers 111 | if [[ $multinic == "true" ]]; then 112 | for host in "${hostlist[@]}"; do 113 | iplist2+=( $(ssh $host "hostname -I |sed 's/ /,/g'") ) 114 | done #comma sep pair in bash array 115 | len=${#iplist2[@]}; ((len=len/2)); ((len--)) 116 | #extract first half of array (servers for rpctest) 117 | multinics=( ${iplist2[@]:0:$len} ) 118 | [[ -n "$DBG" ]] && echo multinics: "${multinics[@]}" 119 | fi 120 | 121 | # Sort client list 122 | if [[ "x${ORDER}" == "xrandom" ]]; then 123 | readarray -t sortlist < <(printf '%s\n' "${iplist[@]}" | sort -R ) 124 | iplist=( "${sortlist[@]}" ) 125 | echo Randomized iplist: "${iplist[@]}" 126 | fi 127 | # Generate the 2 bash arrays with IP address values using array extraction 128 | len=${#iplist[@]}; ((len=len/2)) 129 | half1=( "${iplist[@]:0:$len}" ) #extract first half of array (servers) 130 | half2=( "${iplist[@]:$len}" ) #extract second half of array (clients) 131 | [[ -n "$DBG" ]] && echo half1: "${half1[@]}" 132 | [[ -n "$DBG" ]] && echo half2: "${half2[@]}" 133 | 134 | # Tar up old log files on the server hosts 135 | for host in "${half2[@]}"; do 136 | p1='files=$(ls *-{rpctest,iperf}.log 2>/dev/null); ' 137 | p1+='[[ -n "$files" ]] && ' 138 | p1+='{ tar czf network-tests-$(date "+%FT%T" |tr : .).tgz $files; ' 139 | p1+='rm -f $files; echo "$(hostname -s): ' 140 | p1+='Previous run results archived into: $PWD/network-tests-*.tgz"; }' 141 | [[ -n "$DBG" ]] && echo ssh "$host" "$p1" 142 | [[ -n "$DBG" ]] && echo ssh "$host" 'ls *-{rpctest,iperf}.log 2>/dev/null' 143 
| ssh "$host" "$p1"
144 | done
145 | echo
146 |
147 | # Sort client list
148 | if [[ "x${ORDER}" == "xreverse" ]]; then
149 | readarray -t sortlist < <(printf '%s\n' "${half2[@]}" | sort -r )
150 | half2=( "${sortlist[@]}" )
151 | echo Reversed half2: "${half2[@]}"
152 | fi
153 |
154 | # Handle uneven total host count, save and strip last element
155 | #list of 3 hosts is special case, reason half2/2 can't be used
156 | len=${#iplist[@]}
157 | if [[ $((len & 1)) -eq 1 ]]; then
158 | echo Uneven IP address count, removing extra client IP
159 | #recalc length of client array, to be used to modify client array
160 | len=${#half2[@]}
161 | (( len-- ))
162 | extraip=${half2[$len]}; echo extraip: "$extraip"
163 | #(( len-- )); echo len: $len
164 | half2=( "${half2[@]:0:$len}" )
165 | [[ -n "$DBG" ]] && echo half2: "${half2[@]}"
166 | fi
167 | [[ -n "$DBG" ]] && read -p "DBG: Press enter to continue or ctrl-c to abort"
168 |
169 | ##### Servers ###############################################
170 | # It's possible but not recommended to manually define array of server hosts
171 | # half1=(10.10.100.165 10.10.100.166 10.10.100.167)
172 | # NOTE: use IP addresses to ensure specific NIC utilization
173 | for node in "${half1[@]}"; do
174 | if [[ $runiperf == "true" ]]; then
175 | ssh -n $node "$taskset $numanode0 $iperfbin -s > /dev/null" &
176 | if [[ $procs -gt 1 ]]; then
177 | ssh -n $node "$taskset $numanode1 $iperfbin -s -p $port2 >/dev/null" &
178 | echo 2nd server process on port $port2
179 | fi
180 | else
181 | ssh -n $node $rpctestbin -server &
182 | fi
183 | #ssh $node 'echo $[4*1024] $[1024*1024] $[4*1024*1024] | \
184 | #tee /proc/sys/net/ipv4/tcp_wmem > /proc/sys/net/ipv4/tcp_rmem'
185 | done
186 | echo ${#half1[@]} Servers have been launched
187 | [[ $procs -gt 1 ]] && echo $procs processes per server launched
188 | sleep 5 # let the servers stabilize
189 | [[ -n "$DBG" ]] && read -rp "DBG: Press enter to continue or ctrl-c to abort"
190 |
191 | ##### Clients ###############################################
192 | # It's possible but not recommended to manually define array of client hosts
193 | # half2=(10.10.100.168 10.10.100.169 10.10.100.169)
194 | # NOTE: use IP addresses to ensure specific NIC utilization
195 | i=0 # Index into the server array
196 | for node in "${half2[@]}"; do #Loop over all clients
197 | [[ -n "$DBG" ]] && echo client-node: $node, server-node: ${half1[$i]}
198 | log="${half1[$i]}--$node-iperf.log"
199 | if [[ $concurrent == "true" ]]; then
200 | if [[ $runiperf == "true" ]]; then
201 | #$iperfbin -w 16K #16K window size MapR uses
202 | cmd="$taskset $numanode0 $iperfbin -c ${half1[$i]} -t 30 -P$xtra"
203 | ssh -n $node "$cmd > $log" &
204 | clients+=" $!" #catch PID
205 | if [[ $procs -gt 1 ]]; then
206 | cmd="$taskset $numanode1 $iperfbin -c ${half1[$i]} -t 30 -P$xtra"
207 | ssh -n $node "$cmd -p $port2 > ${log/iperf/$port2-iperf}" &
208 | clients+=" $!" #catch PID
209 | fi
210 | else
211 | log="${half1[$i]}--$node-rpctest.log"
212 | if [[ $multinic == "true" ]]; then
213 | cmd="$rpctestbin -client -b 32 $size ${multinics[$i]}"
214 | ssh -n $node "$cmd > $log" &
215 | clients+=" $!" #catch this client PID
216 | else
217 | #increase $size value 10x for better test
218 | cmd="$rpctestbin -client -b 32 $size ${half1[$i]}"
219 | ssh -n $node "$cmd > $log" &
220 | clients+=" $!" #catch this client PID
221 | if [[ $procs -gt 1 ]]; then
222 | ssh -n $node "$cmd > ${log/rpctest/2-rpctest}" &
223 | clients+=" $!"
#catch this client PID 224 | fi 225 | [[ -n "$DBG" ]] && { jobs -l; jobs -p; } 226 | fi 227 | fi 228 | [[ -n "$DBG" ]] && echo clients: "$clients $!" 229 | else #Sequential mode can be used to help isolate NIC and cable issues 230 | if [[ $runiperf == "true" ]]; then 231 | #$iperfbin -w 16K #16K window size MapR uses 232 | if [[ $procs -gt 1 ]]; then 233 | cmd="$taskset $numanode1 $iperfbin -c ${half1[$i]} -t 30 -P$xtra" 234 | ssh -n $node "$cmd -p $port2 > ${log/iperf/$port2-iperf}" & 235 | echo 2nd client process to port $port2 236 | echo logging to ${log/iperf/$port2-iperf} 237 | fi 238 | cmd="$taskset $numanode0 $iperfbin -c ${half1[$i]} -t 30 -P$xtra" 239 | ssh $node "$cmd > $log" 240 | else 241 | log="${half1[$i]}--$node-rpctest.log" 242 | if [[ $multinic == "true" ]]; then 243 | cmd="$rpctestbin -client -b 32 $size ${multinics[$i]}" 244 | ssh -n $node "$cmd > $log" 245 | else 246 | cmd="$rpctestbin -client -b 32 $size ${half1[$i]}" 247 | if [[ $procs -gt 1 ]]; then 248 | ssh -n $node "$cmd > ${log/rpctest/2-rpctest}" & 249 | fi 250 | ssh -n $node "$cmd > $log" 251 | fi 252 | fi 253 | echo; echo "Test from $node to ${half1[$i]} complete" 254 | # Get NIC stats when running sequential tests 255 | echo >> $log; echo >> $log 256 | if type ip >& /dev/null; then 257 | nics=$(ssh $node ip neigh |awk '{print $3}' |sort -u) 258 | for inic in $nics; do 259 | ssh $node "ethtool $inic |grep -e ^Settings -e Speed: >> $log" 260 | ssh $node "ip -s link show dev $inic >> $log" 261 | done 262 | else 263 | gs="-e HWaddr -e errors -e 'inet addr:'" 264 | nics=$(ssh $node arp -na |awk '{print $NF}' |sort -u) 265 | for inic in $nics; do 266 | ssh $node "ethtool $inic |grep -e ^Settings -e Speed: >> $log" 267 | ssh $node "ifconfig $inic |grep $gs >> $log" 268 | done 269 | fi 270 | fi 271 | ((i++)) 272 | done 273 | 274 | if [[ $concurrent == "true" ]]; then 275 | echo ${#half2[@]} Clients have been launched 276 | [[ $procs -gt 1 ]] && echo $procs processes per client launched 277 | echo Waiting for client PIDS: $clients 278 | wait $clients 279 | echo Wait over, post processing; sleep 3 280 | fi 281 | 282 | # Handle the odd numbered node count case (extra node) 283 | if [[ -n "$extraip" ]]; then 284 | [[ -n "$DBG" ]] && set -x 285 | echo 286 | echo Measuring extra IP address, NOT a concurrent measurement 287 | ((i--)) #decrement to reuse last server in server list $half1 288 | if [[ $runiperf == "true" ]]; then 289 | iargs="-c ${half1[$i]} -t30 -P$xtra" 290 | ilog="${half1[$i]}--$extraip" 291 | if [[ $procs -gt 1 ]]; then 292 | ssh -n $extraip "$iperfbin $iargs -p $port2 > $ilog-$port2-iperf.log" & 293 | fi 294 | ssh -n $extraip "$iperfbin $iargs > $ilog-iperf.log" 295 | else 296 | rargs="-client -b 32 $size ${half1[$i]}" 297 | rlog="${half1[$i]}--$extraip" 298 | ssh -n $extraip "$rpctestbin $rargs > $rlog-rpctest.log" 299 | fi 300 | echo Extra IP address $extraip done. 
301 | [[ -n "$DBG" ]] && set +x
302 | fi
303 |
304 | # Define list of client nodes to collect results from
305 | half2+=("$extraip")
306 | [[ -n "$DBG" ]] && echo Clients: "${half2[@]}"
307 | [[ -n "$DBG" ]] && read -rp "DBG: Press enter to continue or ctrl-c to abort"
308 | echo
309 |
310 | if [[ $concurrent == "true" ]]; then
311 | echo Concurrent network throughput results
312 | else
313 | echo Sequential network throughput results
314 | fi
315 |
316 | if [[ $runiperf == "true" ]]; then
317 | # Print the measured bandwidth (string TBD)
318 | for host in "${half2[@]}"; do ssh "$host" 'grep -ih -e ^ *-iperf.log'; done
319 | echo
320 | echo "Theoretical Max: 1GbE=125MB/s, 10GbE=1250MB/s"
321 | echo "Expect 90-94% best case, 1125-1175 MB/sec on all pairs for 10GbE"
322 | #Kill the servers
323 | for host in "${half1[@]}"; do ssh "$host" pkill iperf; done
324 | else
325 | # Print the network bandwidth
326 | for host in "${half2[@]}"; do
327 | ssh "$host" 'grep -i -H -e ^Rate -e error *-rpctest.log'
328 | done
329 | echo
330 | echo "(mb/s is MB/sec), Theoretical Max: 1GbE=125MB/s, 10GbE=1250MB/s"
331 | echo "expect 90-94% best case"
332 | echo "e.g. Expect 1125-1175 MB/sec on all pairs for 10GbE links"
333 | #Kill the servers
334 | for host in "${half1[@]}"; do ssh "$host" pkill rpctest; done
335 | fi
336 |
337 | # Unlike most Linux commands, option order is important for rpctest,
338 | # -port must be used before other options
339 | #[root@rhel1 ~]# /opt/mapr/server/tools/rpctest --help
340 | #usage: rpctest [-port port (def:5555)] -server
341 | #usage: rpctest [-port port (def:5555)] -client mb-to-xfer
342 | #(prefix - to fetch, + for bi-dir) ip ip ip ...
343 |
344 | #ssh $node 'echo $[4*1024] $[1024*1024] $[4*1024*1024] | \
345 | #tee /proc/sys/net/ipv4/tcp_wmem > /proc/sys/net/ipv4/tcp_rmem'
346 |
--------------------------------------------------------------------------------
/post-install/hive-install.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | # jbenninghoff 2014-Aug-24 vi: set ai et sw=3 ts=3:
3 | # shellcheck disable=SC2162
4 |
5 | [[ $(id -u) -ne 0 ]] && { echo This script must be run as root; exit 1; }
6 | type clush >/dev/null 2>&1 || { echo clush required for install; exit 3; }
7 |
8 | usage() {
9 | echo "Usage: $0 <mysql-host> <hs2-host> <metastore-host>"
10 | echo Provide all required hostnames for installation, metastore optional
11 | exit 2
12 | }
13 | [[ $# -ne 3 ]] && usage
14 |
15 | # Configure clush groups
16 | clush_grps() {
17 | clgrps=/etc/clustershell/groups
18 | clgrps=/etc/clustershell/groups.d/local.cfg
19 | grep ^mysql: $clgrps || echo mysql: "$1" >> $clgrps
20 | grep ^hs2: $clgrps || echo hs2: "$2" >> $clgrps
21 | grep ^hivemeta: $clgrps || echo hivemeta: "$3" >> $clgrps
22 | tail $clgrps
23 | read -p "Press enter to continue or ctrl-c to abort"
24 | }
25 | clush_grps "$1" "$2" "$3"
26 |
27 | # Install all Hive RPMs
28 | hive_rpms() {
29 | # Embedded metastore class (without service) requires multiple MySQL accts
30 | # Metastore service provides single (shared) MySQL account access
31 | # See http://doc.mapr.com/display/MapR/Hive
32 | # Install hive, hive metastore, and hiveserver2
33 | clush -g hs2 "yum install -y mapr-hiveserver2 mapr-hive mysql"
34 | clush -g hivemeta "yum install -y mapr-hivemetastore mapr-hive mysql"
35 | clush -g all "yum install -y mapr-hive"
36 | # Capture latest installed Hive version/path
37 | hivepath=$(find /opt/mapr/hive -type d -name hive-\* |sort -n |tail -1)
38 | #TBD: check /opt/mapr/conf/env for HIVE/SASL
settings
39 | echo hivepath: "$hivepath"
40 | read -p "Press enter to continue or ctrl-c to abort"
41 | }
42 | hive_rpms
43 | # Variables used for mysql and hive-site.xml configuration
44 | setvars() {
45 | MYSQL_NODE=$(nodeset -e @mysql)
46 | METASTORE_NODE=$(nodeset -e @hivemeta)
47 | for mhost in $METASTORE_NODE; do
48 | METASTORE_URI+="thrift://$mhost:9083,"
49 | done
50 | # Remove trailing comma
51 | METASTORE_URI=${METASTORE_URI%,}
52 | # Set to empty value to use local metastore class
53 | #METASTORE_URI=''
54 | HS2_NODE=$(nodeset -e @hs2)
55 | ZK_NODES=$(nodeset -S, -e @zk)
56 | # Set up mysql database and user
57 | ROOT_PASSWORD=mapr
58 | DATABASE=hive
59 | USER=hive
60 | PASSWORD=mapr
61 | }
62 | setvars
63 | # Install MariaDB rpms, hive user and grants
64 | install_mariadb() {
65 | #initial mysql configuration
66 | clush -g mysql "yum install -y mariadb-server"
67 | clush -g mysql "systemctl enable --now mariadb"
68 | #set mysql root password
69 | clush -g mysql "mysqladmin -u root password $ROOT_PASSWORD"
70 | # Reset mysql root password
71 | #clush -g mysql "mysqladmin -u root -pmapr password $ROOT_PASSWORD"
72 |
73 | clush -g mysql mysql -u root -p$ROOT_PASSWORD << EOF
74 | create database $DATABASE;
75 | create user '$USER'@'%' identified by '$PASSWORD';
76 | grant all privileges on $DATABASE.* to '$USER'@'%' with grant option;
77 | create user '$USER'@'localhost' IDENTIFIED BY '$PASSWORD';
78 | grant all privileges on $DATABASE.* to '$USER'@'localhost' with grant option;
79 | create user '$USER'@'$METASTORE_NODE' IDENTIFIED BY '$PASSWORD';
80 | grant all privileges on $DATABASE.* to '$USER'@'$METASTORE_NODE' with grant option;
81 | create user '$USER'@'$HS2_NODE' IDENTIFIED BY '$PASSWORD';
82 | grant all privileges on $DATABASE.* to '$USER'@'$HS2_NODE' with grant option;
83 | flush privileges;
84 | EOF
85 |
86 | echo Scroll up and check for mysql install errors
87 | read -p "Press enter to continue or ctrl-c to abort"
88 | mysql -uroot -p$ROOT_PASSWORD -e "select user,host,password from mysql.user"
89 | mysql -uroot -p$ROOT_PASSWORD -e "show grants for 'hive';"
90 | read -p "Press enter to continue or ctrl-c to abort"
91 | #TBD: check for errors
92 | #echo -e "[client]\nuser=root\npassword=$ROOT_PASSWORD" > ~/.my.cnf
93 | #chmod 600 ~/.my.cnf
94 | #mysql -uroot -pmapr -sNe"$(mysql -uroot -pmapr -se"SELECT CONCAT('SHOW GRANTS FOR \'',user,'\'@\'',host,'\';') FROM mysql.user;")"
95 | }
96 | install_mariadb
97 |
98 | # Old install function
99 | install_mysql() {
100 | #initial mysql configuration
101 | clush -g mysql "yum install -y mysql-server"
102 | clush -g mysql "service mysqld start"
103 | #set mysql root password
104 | clush -g mysql "mysqladmin -u root password $ROOT_PASSWORD"
105 |
106 | clush -g mysql mysql -u root -p$ROOT_PASSWORD << EOF
107 | create database $DATABASE;
108 | create user '$USER'@'%' identified by '$PASSWORD';
109 | grant all privileges on $DATABASE.* to '$USER'@'%' with grant option;
110 | create user '$USER'@'localhost' IDENTIFIED BY '$PASSWORD';
111 | grant all privileges on $DATABASE.* to '$USER'@'localhost' with grant option;
112 | create user '$USER'@'$METASTORE_NODE' IDENTIFIED BY '$PASSWORD';
113 | grant all privileges on $DATABASE.* to '$USER'@'$METASTORE_NODE' with grant option;
114 | create user '$USER'@'$HS2_NODE' IDENTIFIED BY '$PASSWORD';
115 | grant all privileges on $DATABASE.* to '$USER'@'$HS2_NODE' with grant option;
116 | flush privileges;
117 | EOF
118 |
119 | #TBD: check for errors
120 | #mysql -uroot -p"$ROOT_PASSWORD" -sNe"$(mysql -uroot
-p"$ROOT_PASSWORD" -se"SELECT CONCAT('SHOW GRANTS FOR \'',user,'\'@\'',host,'\';') FROM mysql.user;")" 121 | 122 | #echo -e "[client]\nuser=root\npassword=$ROOT_PASSWORD" > ~/.my.cnf; chmod 600 ~/.my.cnf 123 | #mysql -e "select user,host,password from mysql.user; show grants for 'hive';" 124 | echo Check for mysql install errors 125 | read -p "Press enter to continue or ctrl-c to abort" 126 | } 127 | 128 | # The driver for the MySQL JDBC connector (a jar file) is part of the 129 | # MapR distribution under /opt/mapr/lib/. Link file into the Hive lib directory 130 | clush -g mysql "ln -s /opt/mapr/lib/mysql-connector-java-5.1.*-bin.jar $hivepath/lib/" 131 | 132 | #create a hive-site.xml 133 | cat > /tmp/hive-site.xml < 135 | 136 | 137 | 138 | 139 | 140 | 141 | 142 | hive.metastore.uris 143 | thrift://$METASTORE_NODE:9083 144 | Use blank(no value) to enable local metastore, 145 | use a host:pair to enable a 'remote' metastore. 146 | Use multiple host:port pairs separated by commas for HA 147 | 148 | 149 | 150 | 151 | 152 | 153 | javax.jdo.option.ConnectionURL 154 | jdbc:mysql://$MYSQL_NODE:3306/hive?createDatabaseIfNotExist=true 155 | JDBC connect string for a JDBC metastore 156 | 157 | 158 | 159 | javax.jdo.option.ConnectionDriverName 160 | com.mysql.jdbc.Driver 161 | Driver class name for a JDBC metastore 162 | 163 | 164 | 165 | javax.jdo.option.ConnectionUserName 166 | $USER 167 | username to use against metastore database 168 | 169 | 170 | 171 | javax.jdo.option.ConnectionPassword 172 | $PASSWORD 173 | password to use against metastore database 174 | 175 | 176 | 179 | 180 | 181 | hive.metastore.sasl.enabled 182 | true 183 | Property to enable Hive Metastore SASL on secure cluster 184 | 185 | 186 | 187 | 188 | hive.metastore.schema.verification 189 | false 190 | 191 | Enforce metastore schema version consistency. 192 | True: Verify that version information stored in is compatible 193 | with one from Hive jars. Also disable automatic 194 | schema migration attempt. Users are required to manually 195 | migrate schema after Hive upgrade which ensures proper 196 | metastore schema migration. (Default) 197 | 198 | False: Warn if the version information stored in metastore 199 | doesn't match with one from in Hive jars. 200 | 201 | 202 | 203 | 204 | datanucleus.schema.autoCreateTables 205 | true 206 | 207 | 208 | 209 | hive.metastore.execute.setugi 210 | false 211 | Set this property to true to enable Hive Metastore service 212 | impersonation in unsecure mode. 213 | True causes the metastore to execute DFS operations 214 | using the client's reported user and group permissions. 215 | Note that this property must be set on 216 | BOTH the client and server sides. 
217 | 218 | 219 | 220 | 221 | 222 | 223 | 224 | 225 | hive.server2.authentication 226 | PAM 227 | 228 | 229 | 230 | hive.server2.authentication.pam.services 231 | login,sudo,sshd 232 | comma separated list of pam modules to verify 233 | 234 | 235 | 236 | hive.server2.enable.doAs 237 | true 238 | Set this property to enable impersonation in Hive Server 2 239 | 240 | 241 | 242 | hive.server2.thrift.port 243 | 10000 244 | TCP port number for Hive Server to listen on, default 10000, conflicts with webmin 245 | 246 | 247 | 248 | hive.server2.thrift.sasl.qop 249 | auth-conf 250 | Added in Hive 2.1 Secure cluster 251 | 252 | 253 | 254 | hive.server2.webui.use.pam 255 | true 256 | 257 | 258 | 259 | hive.server2.webui.use.ssl 260 | true 261 | 262 | 263 | 264 | hive.server2.webui.keystore.path 265 | /opt/mapr/conf/ssl_keystore 266 | 267 | 268 | 269 | hive.server2.webui.keystore.password 270 | mapr123 271 | 272 | 273 | 274 | hive.server2.support.dynamic.service.discovery 275 | true 276 | Set to true to enable HiveServer2 dynamic service discovery 277 | by its clients. (default is false) 278 | 279 | 280 | 281 | 282 | hive.server2.zookeeper.namespace 283 | hiveserver2 284 | The parent znode in ZooKeeper, which is used by HiveServer2 285 | when supporting dynamic service discovery.(default value) 286 | 287 | 288 | 289 | 296 | 297 | 298 | 299 | hive.support.concurrency 300 | true 301 | Enable Hive's Table Lock Manager Service 302 | 303 | 304 | 305 | hive.zookeeper.quorum 306 | $ZK_NODES 307 | Zookeeper quorum used by Hive's Table Lock Manager 308 | List of ZooKeeper servers to talk to. 309 | Used in connection string by JDBC/ODBC clients instead of URI of specific HiveServer2 instance. 310 | 311 | 312 | 313 | 314 | hive.zookeeper.client.port 315 | 5181 316 | The Zookeeper client port. The MapR default clientPort is 5181. 317 | 318 | 319 | 320 | hive.zookeeper.session.timeout 321 | 600000 322 | (600000 is default value) 323 | 324 | 325 | 358 | 359 | 360 | EOF 361 | 362 | if type xmllint >& /dev/null; then 363 | xmllint /tmp/hive-site.xml 364 | fi 365 | echo xmldiff /tmp/hive-site.xml with "$hivepath/conf/hive-site.xml" 366 | read -p "Press enter to continue or ctrl-c to abort" 367 | 368 | # Run these commands using 'mapr' service account 369 | cat - <] 13 | -d option for script debug 14 | -v option for more extensive auditing 15 | -t option for a quick terse output 16 | -s option to audit the cluster for security 17 | -e option to audit an edge node 18 | -g option to specify a clush group other than all 19 | -a option to audit volume ACEs 20 | 21 | EOF 22 | } 23 | 24 | verbose=false; terse=false; security=false; edge=false; group=all; volacl=false 25 | while getopts ":dvtsea:g:" opt; do 26 | case $opt in 27 | d) dbg=true ;; 28 | v) verbose=true ;; 29 | t) terse=true ;; 30 | s) security=true ;; 31 | e) edge=true ;; 32 | g) group=$OPTARG ;; 33 | a) volacl=true; mntpt=$OPTARG ;; 34 | :) echo "Option -$OPTARG requires an argument." >&2; exit 1 ;; 35 | \?) 
usage >&2; exit ;; 36 | esac 37 | done 38 | 39 | setvars() { 40 | eval printf -v sep "'#%.0s'" "{1..${COLUMNS:-80}}" 41 | parg="-b -g ${group:-all}" 42 | mrv=$(hadoop version |awk 'NR==1 {printf("%1.1s\n", $2)}') 43 | #printf -v sep '#%.0s' {1..80} #Set sep to 80 # chars 44 | #TBD: If edge==true group!=all 45 | #TBD: clush can use --hostfile if group cannot be set 46 | } 47 | 48 | if [[ -f /opt/mapr/conf/daemon.conf ]]; then 49 | srvid=$(awk -F= '/mapr.daemon.user/ {print $2}' /opt/mapr/conf/daemon.conf) 50 | else 51 | srvid=mapr #guess at service acct if not found 52 | fi 53 | 54 | maprcli_check() { 55 | if ( ! type maprcli > /dev/null 2>&1 ); then #If maprcli not on this machine 56 | node=$(nodeset -I0 -e @$group) 57 | if [[ -z "$node" ]]; then 58 | read -e -p 'maprcli not found, enter host name to run maprcli on: ' node 59 | fi 60 | if ( ! ssh "$node" "type maprcli > /dev/null 2>&1" ); then 61 | echo maprcli not found on host "$node", rerun with valid host name; exit 62 | fi 63 | node="ssh -qtt $node " #Single node to run maprcli commands from 64 | #chgu="su -u $srvid -c " # Run as service account 65 | fi 66 | #Sudo to mapr on secure cluster 67 | #node="sudo -u mapr MAPR_TICKETFILE_LOCATION=/opt/mapr/conf/mapruserticket " 68 | } 69 | 70 | sudo_setup() { 71 | SUDO="sudo -u $srvid -i " 72 | snode=$(nodeset -I0 -e @$group) 73 | sscmd="sudo -U $srvid -ln 2>&1 |grep -q 'sudo: a password is required'" 74 | if (ssh -qtt $snode $sscmd); then 75 | read -s -e -p 'Enter sudo password: ' mypasswd 76 | #echo $mypasswd | sudo -S -i dmidecode -t bios || exit 77 | SUDO="echo $mypasswd | $SUDO -S " 78 | fi 79 | 80 | clcmd="${SUDO:-} grep -q '^Defaults.*requiretty' /etc/sudoers" 81 | if (clush $parg -S $clcmd >& /dev/null); then 82 | parg="-o -qtt $parg" # Add -qtt for sudo tty via ssh/clush 83 | #To run sudo without a tty use: 84 | # clush -ab -o -qtt 85 | # "sudo sed -i.bak '/^Defaults.*requiretty/s/^/#/' /etc/sudoers" 86 | fi 87 | } 88 | 89 | cluster_checks1() { 90 | echo ==================== MapR audits ================================ 91 | date; echo "$sep" 92 | if [[ "$mrv" == "1" ]] ; then # MRv1 93 | msg="Hadoop Jobs Status "; printf "%s%s \\n" "$msg" "${sep:${#msg}}" 94 | ${node:-} timeout 9 hadoop job -list; echo "$sep" 95 | elif [[ "$mrv" == "2" ]] ; then # MRv2 96 | msg="Hadoop Jobs Status "; printf "%s%s \\n" "$msg" "${sep:${#msg}}" 97 | ${node:-} timeout 9 mapred job -list; echo "$sep" 98 | fi 99 | msg="MapR Dashboard "; printf "%s%s \\n" "$msg" "${sep:${#msg}}" 100 | if (type -p jq >/dev/null); then 101 | jqcmd='.data[] | {version, cluster,utilization}' 102 | ${node:-} maprcli dashboard info -json | jq "$jqcmd" 103 | echo "$sep" 104 | else 105 | ${node:-} maprcli dashboard info -json; echo "$sep" 106 | fi 107 | msg="MapR Alarms "; printf "%s%s \\n" "$msg" "${sep:${#msg}}" 108 | ${node:-} maprcli alarm list -summary true; echo "$sep" 109 | msg="MapR Services "; printf "%s%s \\n" "$msg" "${sep:${#msg}}" 110 | ${node:-} maprcli node list -columns svc |awk 'NF{--NF};1{printf "%-25s %s\n", $1,$2;}'; echo "$sep" 111 | #${node:-} maprcli node list -columns svc; echo "$sep" 112 | msg="Zookeepers: "; printf "%s%s \\n" "$msg" "${sep:${#msg}}" 113 | ${node:-} maprcli node listzookeepers; echo "$sep" 114 | msg="Current MapR Version: "; printf "%s%s \\n" "$msg" "${sep:${#msg}}" 115 | ${node:-} maprcli config load -keys mapr.targetversion 116 | msg="Current MapR Licenses: "; printf "%s%s \\n" "$msg" "${sep:${#msg}}" 117 | ${node:-} maprcli license list | grep -i lictype 118 | echo 119 | } 120 | 121 | 
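# cluster_checks2 runs additional audits across all nodes via clush.
# Note the banner idiom used throughout these functions, e.g.
#   msg="Some Check "; printf "%s%s \n" "$msg" "${sep:${#msg}}"
# It pads the title out to a fixed-width header line: ${sep:${#msg}} is
# bash substring expansion, taking $sep (a full line of '#' characters)
# starting at offset ${#msg}, the length of the title string.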
cluster_checks2() { 122 | echo ==================== Additional MapR audits =========================== 123 | msg="MapR System Stats "; printf "%s%s \n" "$msg" "${sep:${#msg}}" 124 | ${node:-} maprcli node list -columns hostname,cpus,mused; echo "$sep" 125 | msg="Customer Site Specific Volumes " 126 | printf "%s%s \n" "$msg" "${sep:${#msg}}" 127 | opts='-filter "[n!=mapr.*] and [n!=*local*]"' 128 | opts+=' -columns n,numreplicas,mountdir,used,numcontainers,logicalUsed' 129 | eval ${node:-} maprcli volume list "$opts"; echo "$sep" 130 | echo 131 | #clush $parg "echo MapR /etc/shadow access:; ls -l /etc/shadow; id $srvid" 132 | clush $parg "echo MapR /etc/shadow access:; stat -c '%A %U %G %n'\ 133 | /etc/shadow; id $srvid" 134 | echo "$sep" 135 | clush $parg "echo 'MapR SHMEM Segments:'; ${SUDO:-} ipcs -m | uniq -w10" 136 | echo "$sep" 137 | clush $parg "echo MapR HostID:; cat /opt/mapr/hostid"; echo "$sep" 138 | clush $parg "echo MapR Patch; yum --noplugins list installed mapr-patch" 139 | echo "$sep" 140 | echo MFS Heap Size: 141 | clush ${parg/-b/} "pgrep -oaf /opt/mapr/server/mfs" | \ 142 | grep -e '\-m [^ ]*' -e '^[^ ]*' 143 | echo "$sep" 144 | clush $parg "echo 'MapR Storage Pools'; \ 145 | ${SUDO:-} /opt/mapr/server/mrconfig sp list -v" 146 | echo "$sep" 147 | clush $parg "echo 'Cat mapr-clusters.conf'; \ 148 | cat /opt/mapr/conf/mapr-clusters.conf" 149 | echo "$sep" 150 | #TBD: if mapr-clusters.conf has more than one line, 151 | #look for mirror volumes {maprcli volume list -json |grep mirror???} 152 | clush $parg "echo 'MapR Env Settings'; grep ^export /opt/mapr/conf/env.sh" 153 | echo "$sep" 154 | if [[ "$mrv" == "1" ]] ; then # MRv1 155 | clush $parg "echo 'Mapred-site.xml Checksum Consistency'; \ 156 | sum /opt/mapr/hadoop/hadoop-0.20.2/conf/mapred-site.xml" 157 | echo "$sep" 158 | clush $parg "echo 'core-site.xml Checksum Consistency'; sum /opt/mapr/hadoop/hadoop-0.20.2/conf/core-site.xml"; echo "$sep" 159 | clush $parg "echo 'MapR Central Logging Setting'; grep ROOT_LOGGER /opt/mapr/hadoop/hadoop-0.20.2/conf/hadoop-env.sh"; echo "$sep" 160 | else 161 | clush $parg "echo 'MR2 core-site.xml Checksum Consistency'; sum /opt/mapr/hadoop/hadoop-2.*/etc/hadoop/core-site.xml"; echo "$sep" 162 | clush $parg "echo 'MR2 core-site.xml Property Count: '; awk '//dev/null); then 181 | #TBD: get history server hostname/ip, get 1-3 days history and log it. 
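# The commented sketch below pulls recent job history from the
# JobHistory server REST API; startedTimeBegin is in epoch
# milliseconds, so begin is computed as (now - 3 days of seconds) * 1000.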
182 | #hist=$(maprcli node list -columns svc |awk '/historyserver/ {print $1}')
183 | #begin=$(( ($(date +%s) - 86400*3) * 1000 ))
184 | #url="https://$hist:19890/ws/v1/history/mapreduce/jobs"
185 | #url+="?startedTimeBegin=$begin"
186 | #curl -s -u mapr:mapr -k "$url"
187 | #curl -s -u mapr:mapr -k "$url" |jq
188 | }
189 |
190 | edgenode_checks() {
191 | msg="Edge Node Checking "; printf "%s%s \n" "$msg" "${sep:${#msg}}"
192 | clush $parg 'echo "MapR packages installed"; rpm -qa |grep mapr- |sort'
193 | echo "$sep"
194 |
195 | msg="Checking for MySQL Server "; printf "%s%s \n" "$msg" "${sep:${#msg}}"
196 | clush $parg ${SUDO:-} "service mysqld status 2>/dev/null"
197 | echo "$sep"
198 |
199 | #TBD: Check Hive config
200 | #TBD: Check HS2 port and config
201 | #TBD: Check MetaStore port and config
202 | msg="Checking Hive Configuration "; printf "%s%s \n" "$msg" "${sep:${#msg}}"
203 | clcmd="sed '//d' /opt/mapr/hive/hive-2.1/conf/hive-site.xml \
204 | |sed '//d' |grep ''"
205 | clush $parg ${SUDO:-} "$clcmd"
206 | echo "$sep"
207 |
208 | #TBD: Check Hue port and config (hue.ini)
209 | #TBD: Check Sqoop config (RDBMS jars)
210 | #TBD: Check Pig config
211 | #TBD: Check Spark/Yarn config (spark-defaults.conf, web-proxy jar)
212 | }
213 |
214 | cluster_checks3() {
215 | [[ -n "$dbg" ]] && set -x
216 | msg="Verbose audits "; printf "%s%s \n" "$msg" "${sep:${#msg}}"; echo
217 | #$node maprcli dump balancerinfo | sort | awk '$1 == prvkey {size += $9}; $1 != prvkey {if (prvkey!="") print size; prvkey=$1; size=$9}'
218 | #echo MapR disk list per host
219 | clush $parg 'echo "MapR packages installed"; rpm -qa |grep mapr- |sort'; echo "$sep"
220 | clush $parg 'echo "MapR Disk List per Host"; maprcli disk list -output terse -system 0 -host $(hostname)'; echo "$sep"
221 | clush $parg 'echo "MapR Disk Stripe Depth"; ${SUDO:-} /opt/mapr/server/mrconfig dg list | grep -A4 StripeDepth'; echo "$sep"
222 | #clush $parg 'echo "MapR Disk Stripe Depth"; ${SUDO:-} /opt/mapr/server/mrconfig dg list '; echo "$sep"
223 | msg="MapR Complete Volume List "; printf "%s%s \n" "$msg" "${sep:${#msg}}"
224 | ${node:-} maprcli volume list -columns n,numreplicas,mountdir,used,numcontainers,logicalUsed; echo "$sep"
225 | msg="MapR Storage Pool Details "; printf "%s%s \n" "$msg" "${sep:${#msg}}"
226 | ${node:-} maprcli dump balancerinfo | sort -r; echo "$sep"
227 | msg="Hadoop Configuration Variable Dump "; printf "%s%s \n" "$msg" "${sep:${#msg}}"
228 | if [[ "$mrv" == "1" ]] ; then # MRv1
229 | ${node:-} hadoop conf -dump | sort; echo "$sep"
230 | else
231 | ${node:-} hadoop conf-details print-all-effective-properties | grep -o '<name>.*</value>' |sed 's/<name>//;s/<\/value>//;s/<\/name>/=/'
232 | echo "$sep"
233 | fi
234 | msg="MapR Configuration Variable Dump "; printf "%s%s \n" "$msg" "${sep:${#msg}}"
235 | ${node:-} maprcli config load -json; echo "$sep"
236 | #msg="List Unique File Owners, Down 4 Levels"; printf "%s%s \n" "$msg" "${sep:${#msg}}"
237 | #${node:-} find $mntpt -maxdepth 4 -exec stat -c '%U' {} \; |sort -u; echo "$sep" #find uniq owners
238 | # TBD: check all hadoop* packages installed
239 | clush -b -g zk -g cldb "echo 'ZK and CLDB nice values'; ps -ocomm,pid,nice $(/dev/null && wc /etc/sssd/sssd.conf' #TBD: Check sssd settings
255 | clush $parg ${SUDO:-} "service krb5kdc status |sed 's/(.*)//'; service kadmin status |sed 's/(.*)//'" # Check for Kerberos
256 | clush $parg ${SUDO:-} "echo Checking for Firewall; service iptables status |sed 's/(.*)//'"
257 | clush $parg ${SUDO:-} 'echo Checking for LUKS; grep -v -e ^# -e ^$
/etc/crypttab'
258 | clush $parg ${SUDO:-} 'echo Checking for C and Java Compilers; type gcc; type javac; find /usr/lib -name javac|sort'
259 | #TBD: clush $parg ${SUDO:-} 'echo Checking MySQL; type mysql && mysql -u root -e "show databases" && echo "Passwordless MySQL access"'
260 | clush $parg 'echo Checking for Internet Access; { curl -f http://mapr.com/ >/dev/null 2>&1 || curl -f http://54.245.106.105/; } && echo Internet Access Available || echo Internet Access Unavailable'
261 | clush $parg "echo Checking All TCP/UDP connections; netstat -t -u -p -e --numeric-ports"
262 |
263 | # Cluster nodes only
264 | if [[ "$edge" == "false" ]]; then
265 | msg="MapR Secure Mode "; printf "%s%s \n" "$msg" "${sep:${#msg}}"
266 | ${node:-} maprcli dashboard info -json | grep 'secure.*true,' && maprsecure=true || echo === MapR cluster running non-secure
267 | msg="MapR Auditing Status "; printf "%s%s \n" "$msg" "${sep:${#msg}}"
268 | ${node:-} maprcli config load -json | grep "mfs.feature.audit.support" && mapraudit=true || echo === MapR FS Auditing unavailable
269 | msg="MapR Fast Inode Scan "; printf "%s%s \n" "$msg" "${sep:${#msg}}"
270 | ${node:-} maprcli config load -json | grep "mfs.feature.fastinodescan.support" && fastinodes=true || echo === MapR Fast Inode Scan not enabled
271 | # Fast Inode Scan helps mirroring thousands/millions of files/volume
272 | msg="MapR Cluster Admin ACLs"; printf "%s%s \n" "$msg" "${sep:${#msg}}"
273 | ${node:-} maprcli acl show -type cluster
274 | # Check for MapR whitelist: http://doc.mapr.com/display/MapR/Configuring+MapR+Security#ConfiguringMapRSecurity-whitelist
275 | clush $parg "echo 'MapR MFS Whitelist Defined'; grep mfs.subnets.whitelist /opt/mapr/conf/mfs.conf"
276 | clush $parg "echo 'MapR YARN Submit ACLs'; awk '// {if (/acl|/dev/null; then
340 | echo clush required for advanced options; exit 1
341 | fi
342 | if [[ $(nodeset -c @"${group:-all}") == 0 ]]; then
343 | echo group: "${group:-all}" does not exist; exit 2
344 | fi
345 | if ! clush $parg -S test -d /opt/mapr; then
346 | echo MapR not installed in node group "$group"; exit 3
347 | fi
348 | echo "$sep"
349 |
350 | [[ "$(id -un)" != "$srvid" ]] && sudo_setup
351 | [[ "$edge" == "false" && "$terse" == "false" ]] && cluster_checks2
352 | [[ "$edge" == "false" && "$verbose" == "true" ]] && cluster_checks3
353 | [[ "$edge" == "false" && "$volacl" == "true" ]] && volume_acls
354 | [[ "$edge" == "true" ]] && edgenode_checks
355 | [[ "$security" == "true" ]] && security_checks
356 |
357 | echo "Extract cluster summary from the captured output log with awk: "
358 | awkcmd='FNR==1 {print FILENAME}; /[ \t]+\"version\":/; '
359 | awkcmd+='/[ \t]+\"cluster\":/,/},/'
360 | echo "awk '$awkcmd' mapr-audit-*.log"
361 |
--------------------------------------------------------------------------------
/pre-install/cluster-audit.sh:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env bash
2 | # jbenninghoff 2013-Oct-06 vi: set ai et sw=3 tabstop=3:
3 | # edwbuck 01FEB2021
4 | # shellcheck disable=SC2162,SC2086,SC2046,SC2016
5 | #set -o nounset errexit
6 |
7 | usage() {
8 | cat << EOF
9 | Usage: $0 [-g <group>] [-d] [-l <user>] [-s <service account>]
10 | -g To specify clush group other than "all"
11 | -d To enable debug output
12 | -l To specify clush/ssh user other than $USER
13 | -s To specify a service account name other than "mapr"
14 |
15 | This script is a sequence of parallel shell commands probing for
16 | current system configuration and highlighting differences between
17 | the nodes in a cluster.
18 | 19 | The script requires that the clush utility (a parallel ssh tool) 20 | be installed and configured using passwordless ssh connectivity for root to 21 | all the nodes under test. Or passwordless sudo for a non-root account. 22 | Use -l mapr for example if mapr account has passwordless sudo rights. 23 | 24 | EOF 25 | } 26 | 27 | # Handle script options 28 | DBG=""; group=all; cluser="" 29 | while getopts "dl:g:s:" opt; do 30 | case $opt in 31 | d) DBG=true ;; 32 | g) group=$OPTARG ;; 33 | l) cluser="-l $OPTARG" ;; 34 | s) srvid="$OPTARG" ;; 35 | \?) usage; exit ;; 36 | esac 37 | done 38 | [ -n "$DBG" ] && set -x 39 | 40 | # Set some global variables 41 | printf -v sep '#%.0s' {1..80} #Set sep to 80 # chars 42 | 43 | # Set distro information 44 | export SUPPORTED_DISTROS=( rhel sles ubuntu ) 45 | 46 | distro_detect() 47 | { 48 | DISTRO_ID=$(awk 'BEGIN { FS="=" } $1=="ID" { gsub(/"/, "", $2); print $2 }' /etc/os-release) 49 | DISTRO_ID_LIKE=( $(awk 'BEGIN { FS="=" } $1=="ID_LIKE" { gsub(/"/, "", $2); print $2 }' /etc/os-release) ) 50 | DISTRO_ID_VERSION=( $(awk 'BEGIN { FS="=" } $1=="VERSION_ID" { gsub(/"/, "", $2); print $2 }' /etc/os-release) ) 51 | export DISTRO_ID 52 | export DISTRO_ID_LIKE 53 | export DISTRO_ID_VERSION 54 | } 55 | 56 | distro_detect 57 | 58 | if [[ " ${SUPPORTED_DISTROS[@]} " == *" ${DISTRO_ID} "* ]] 59 | then 60 | EFFECTIVE_DISTRO=${DISTRO_ID} 61 | export EFFECTIVE_DISTRO 62 | else 63 | for SIMILAR_DISTRO in "${DISTRO_ID_LIKE[@]}" 64 | do 65 | if [[ " ${SUPPORTED_DISTROS[@]} " == *" $SIMILAR_DISTRO "* ]] 66 | then 67 | EFFECTIVE_DISTRO=${SIMILAR_DISTRO} 68 | export EFFECTIVE_DISTRO 69 | break 70 | fi 71 | done 72 | fi 73 | 74 | echo Distro = $DISTRO_ID, effective distro = $EFFECTIVE_DISTRO, version = $DISTRO_ID_VERSION 75 | 76 | if [[ -z ${EFFECTIVE_DISTRO} ]] 77 | then 78 | echo "unsupported distro ${DISTRO_ID}" 79 | exit -1 80 | fi 81 | 82 | [[ "$(uname -s)" == "Darwin" ]] && alias sed=gsed 83 | #Turn the BOKS chatter down 84 | export BOKS_SUDO_NO_WARNINGS=1 85 | 86 | # Check for clush and provide alt if not found 87 | if type clush >& /dev/null; then 88 | [ $(nodeset -c @${group:-all}) -gt 0 ] || { echo group: ${group:-all} does not exist; exit 2; } 89 | #clush specific arguments 90 | parg="${cluser} -b -g ${group:-all}" 91 | node=$(nodeset -I0 -e @${group:-all}) 92 | narg="-w $node -o -qtt" 93 | sshcnf=$HOME/.ssh/config 94 | [[ ! -f $sshcnf ]] && { touch $sshcnf; chmod 600 $sshcnf; } 95 | if ! grep -q StrictHostKeyChecking $sshcnf ; then 96 | echo To suppress ssh noise, add the following to $sshcnf 97 | echo StrictHostKeyChecking no 98 | echo LogLevel ERROR 99 | fi 100 | #e1='/^StrictHostKeyChecking/{s/.*/StrictHostKeyChecking no/;:z;n;bz}' 101 | #e2='$aStrictHostKeyChecking no\nLogLevel ERROR' 102 | #sed -i.bak -e "$e1" -e "$e2" $sshcnf 103 | #if ! 
diff $sshcnf $sshcnf.bak >/dev/null; then 104 | # echo To suppress ssh noise, $sshcnf has been modified 105 | #fi 106 | # Common arguments to pass in to clush execution 107 | #clcnt=$(nodeset -c @all) 108 | #parg="$parg -f $clcnt" #fanout set to cluster node count 109 | #parg="-o '-oLogLevel=ERROR' $parg" 110 | else 111 | echo clush not found, doing a single node inspection without ssh; sleep 3 112 | clush() { eval "$@"; } #clush becomes no-op, all commands run locally doing a single node inspection 113 | #clush() { for h in $(<~/host.list); do; ssh $h $@; done; } #ssh in for loop 114 | fi 115 | if [[ -n "$DBG" ]]; then 116 | clush "${cluser} -b -g ${group:-all}" -S -u 30 date || { echo clush failed; usage; exit 3; } 117 | fi 118 | 119 | # Locate or guess MapR Service Account 120 | if [[ -f /opt/mapr/conf/daemon.conf ]]; then 121 | echo "Using mapr.daemon.user from /opt/mapr/conf/daemon.conf"; sleep 3 122 | srvid=$(awk -F= '/mapr.daemon.user/ {print $2}' /opt/mapr/conf/daemon.conf) 123 | [[ -z "$srvid" ]] && srvid=mapr #guess 124 | else 125 | srvid=${srvid:-mapr} #guess at service acct if not found 126 | #TBD: add 'getent passwd |grep -i mapr' to list other service acct names 127 | fi 128 | 129 | # Define Sudo options if available 130 | if [[ $(id -u) -ne 0 && "$cluser" != "-l root" ]]; then 131 | if (clush $narg sudo -ln 2>&1 | grep -q 'sudo: a password is required'); then 132 | read -s -e -p 'Enter sudo password: ' mypasswd 133 | #echo $mypasswd | sudo -S -i dmidecode -t bios || exit 134 | SUDO="echo $mypasswd | sudo -S -i " 135 | # sudo -ln can say pw required when it isn't required 136 | else 137 | SUDO='sudo PATH=/sbin:/usr/sbin:$PATH ' 138 | fi 139 | gs="'^Defaults.*requiretty'" 140 | if (clush $narg -S "${SUDO:-} grep -q $gs /etc/sudoers" >&/dev/null);then 141 | parg="-o -qtt $parg" # Add -qtt for sudo tty via ssh/clush 142 | #echo Use: clush -ab -o -qtt "sudo sed -i.bak 143 | #'/^Defaults.*requiretty/s/^/#/' /etc/sudoers" 144 | #To run sudo without a tty 145 | fi 146 | fi 147 | 148 | # Check for systemd and basic RPMs 149 | clcmd="[ -f /etc/systemd/system.conf ]" 150 | sysd=$(clush -qNS -g ${group:-all} ${cluser} "$clcmd" && echo true || echo false) 151 | 152 | distro_match() 153 | { 154 | for PATTERN in "$@"; do 155 | if [[ "${EFFECTIVE_DISTRO}-${DISTRO_ID_VERSION}" =~ $PATTERN ]]; then 156 | return 0 157 | fi 158 | done 159 | return 1 160 | } 161 | 162 | verify_installed_packages() { 163 | if distro_match rhel-8; then 164 | if ! clush $parg -S "rpm -q $@ >/dev/null" >/dev/null 2>&1; then 165 | echo "Required packages not installed, fix with:" 166 | echo " clush $parg -S \"dnf -y install $@\" " 167 | return 1 168 | fi 169 | return 0 170 | fi 171 | if distro_match rhel-7 rhel-6 sles-*; then 172 | echo matched rhel something 173 | if ! clush $parg -S "rpm -q $@ >/dev/null"; then 174 | echo "Required packages not installed, fix with:" 175 | echo " clush $parg -S \"yum -y install $@\" " 176 | return 1 177 | fi 178 | return 0 179 | fi 180 | if distro_match ubuntu-*; then 181 | echo matched ubuntu something 182 | if ! 
check="clush $parg -S dpkg -l $@ >/dev/null"; then 183 | echo "Required packages not installed, fix with:" 184 | echo " clush $parg -S \"apt-get -y install $@\" " 185 | return 1 186 | fi 187 | return 0 188 | fi 189 | } 190 | 191 | # Checking tool requirements 192 | echo "Checking cluster-audit required tools" 193 | required_packages=() 194 | if distro_match rhel-* sles-*; then 195 | required_packages+=( "pciutils" "dmidecode" "net-tools" "ethtool" "bind-utils" ) 196 | fi 197 | if distro_match ubuntu-*; then 198 | required_packages=( "pciutils" "dmidecode" "net-tools" "ethtool" "bind9utils" ) 199 | fi 200 | if ! verify_installed_packages ${required_packages[@]}; then 201 | echo Exiting in failure; exit 1 202 | fi 203 | 204 | [ -n "$DBG" ] && { echo sysd: $sysd; echo srvid: $srvid; echo SUDO: $SUDO; echo parg: $parg; echo node: $node; } 205 | [ -n "$DBG" ] && exit 206 | 207 | 208 | echo;echo "#################### Hardware audits ###############################" 209 | date; echo $sep 210 | echo NodeSet: $(nodeset -e @${group:-all}); echo $sep 211 | echo All the groups currently defined for clush:; nodeset -l 212 | echo groups zk, cldb, rm, and hist needed for clush based install; echo $sep 213 | # probe for system info ############### 214 | clush $parg "echo DMI Sys Info:; ${SUDO:-} dmidecode | grep -A2 '^System Information'"; echo $sep 215 | clush $parg "echo DMI BIOS:; ${SUDO:-} dmidecode |grep -A3 '^BIOS I'"; echo $sep 216 | 217 | # probe for cpu info ############## 218 | clush $parg "grep '^model name' /proc/cpuinfo | sort -u"; echo $sep 219 | clush $parg "lscpu | grep -v -e op-mode -e ^Vendor -e family -e Model: -e Stepping: -e BogoMIPS -e Virtual -e ^Byte -e '^NUMA node(s)' -e '^CPU MHz:' -e ^Flags -e cache: " 220 | echo $sep 221 | #clush $parg "lscpu | grep -v -e op-mode -e ^Vendor -e family -e Model: -e Stepping: -e BogoMIPS -e Virtual -e ^Byte -e '^NUMA node(s)' | awk '/^CPU MHz:/{sub(\$3,sprintf(\"%0.0f\",\$3))};{print}'"; echo $sep 222 | #clush $parg "lscpu | grep -e ^Thread"; echo $sep 223 | #TBD: grep '^model name' /proc/cpuinfo | sed 's/.*CPU[ ]*\(.*\)[ ]*@.*/\1/' 224 | #TBD: curl -s -L 'http://ark.intel.com/search?q=E5-2420%20v2' | grep -A2 -e 'Memory Channels' -e 'Max Memory Bandwidth' 225 | 226 | # probe for mem/dimm info ############### 227 | clush $parg "cat /proc/meminfo | grep -i ^memt | uniq"; echo $sep 228 | clush $parg "echo -n 'DIMM slots: '; ${SUDO:-} dmidecode -t memory |grep -c '^[[:space:]]*Locator:'"; echo $sep 229 | clush $parg "echo -n 'DIMM count is: '; ${SUDO:-} dmidecode -t memory | grep -c '^[[:space:]]Size: [0-9][0-9]*'"; echo $sep 230 | clush $parg "echo DIMM Details; ${SUDO:-} dmidecode -t memory | awk '/Memory Device$/,/^$/ {print}' | grep -e '^Mem' -e Size: -e Speed: -e Part | sort -u | grep -v -e 'NO DIMM' -e 'No Module Installed' -e 'Not Specified'"; echo $sep 231 | 232 | # probe for nic info ############### 233 | #clush $parg "ifconfig | grep -o ^eth.| xargs -l ${SUDO:-} /usr/sbin/ethtool | grep -e ^Settings -e Speed -e detected" 234 | #clush $parg "ifconfig | awk '/^[^ ]/ && \$1 !~ /lo/{print \$1}' | xargs -l ${SUDO:-} /usr/sbin/ethtool | grep -e ^Settings -e Speed" 235 | clush $parg "${SUDO:-} lspci | grep -i ether" 236 | clush $parg ${SUDO:-} "ip link show |sed '/ lo: /,+1d; /@.*:/,+1d' |awk '/UP/{sub(\":\",\"\",\$2);print \$2}' |sort |xargs -l /sbin/ethtool |grep -e ^Settings -e Speed -e Link" 237 | #Above filters out lo and vnics using @interface labels 238 | #TBD: fix SUDO to find ethtool, not /sbin/ethtool 239 | #clush $parg "echo -n 'Nic 
Speed: '; /sbin/ip link show | sed '/ lo: /,+1d' | awk '/UP/{sub(\":\",\"\",\$2);print \$2}' | xargs -l -I % cat /sys/class/net/%/speed"
240 | echo $sep
241 | [ -n "$DBG" ] && exit
242 |
243 | # probe for disk info ###############
244 | #TBD: Probe disk controller settings, needs storcli64 binary, won't work on HP which needs smartarray tool
245 | #/opt/MegaRAID/storcli/storcli64 /c0 /eall /sall show | awk '$3 == "UGood"{print $1}'; exit
246 | #./MegaCli64 -cfgeachdskraid0 WT RA cached NoCachedBadBBU -strpsz256 -a0
247 | clush $parg "echo 'Storage Controller: '; ${SUDO:-} lspci | grep -i -e ide -e raid -e storage -e lsi"; echo $sep
248 | clush $parg "echo 'SCSI RAID devices in dmesg: '; ${SUDO:-} dmesg | grep -i raid | grep -i -o 'scsi.*$' |uniq"; echo $sep
249 | case ${EFFECTIVE_DISTRO} in
250 | ubuntu)
251 | clush $parg "${SUDO:-} fdisk -l | grep '^Disk /.*:' |sort"; echo $sep
252 | ;;
253 | rhel|sles)
254 | clush $parg "echo 'Block Devices: '; lsblk -id -o NAME,SIZE,TYPE,MOUNTPOINT |grep -v ^sr0 |uniq -c -f1 |sed '1s/ 1/Qty/'"; echo $sep
255 | ;;
256 | *) echo Unknown Linux distro! ${DISTRO_ID}; exit ;;
257 | esac
258 | #TBD: add smartctl disk detail probes
259 | # smartctl -d megaraid,0 -a /dev/sdf | grep -e ^Vendor -e ^Product -e Capacity -e ^Rotation -e ^Form -e ^Transport
260 | clush $parg "echo 'Udev rules: '; ${SUDO:-} ls /etc/udev/rules.d"; echo $sep
261 | #clush $parg "echo 'Storage Drive(s): '; fdisk -l 2>/dev/null | grep '^Disk /dev/.*: ' | sort | grep mapper"
262 | #clush $parg "echo 'Storage Drive(s): '; fdisk -l 2>/dev/null | grep '^Disk /dev/.*: ' | sort | grep -v mapper"
263 |
264 | echo
265 | echo "#################### Linux audits ################################"
266 | #clush $parg "cat /etc/*release | uniq"; echo $sep
267 | clush $parg "[ -f /etc/system-release ] && cat /etc/system-release || cat /etc/os-release | uniq"; echo $sep
268 | #clush $parg "uname -a | fmt"; echo $sep
269 | clush $parg "uname -srvmo | fmt"; echo $sep
270 | clush $parg "echo Time Sync Check: ; date"; echo $sep
271 |
272 | echo Hostname IP addresses
273 | if [[ "${EFFECTIVE_DISTRO}" != "sles" ]]; then
274 | clush ${parg/-b /} 'hostname -I'; echo $sep
275 | else
276 | clush ${parg/-b /} 'hostname -i'; echo $sep
277 | fi
278 | echo DNS lookup
279 | clush ${parg/-b /} 'host $(hostname -f)'; echo $sep
280 | echo Reverse DNS lookup
281 | clush ${parg/-b /} 'host $(hostname -i)'; echo $sep
282 |
283 | case ${EFFECTIVE_DISTRO} in
284 | ubuntu)
285 | # Ubuntu SElinux tools not so good.
286 | clush $parg "echo 'NTP status '; ${SUDO:-} service ntp status"; echo $sep 287 | clush $parg "${SUDO:-} apparmor_status | sed 's/([0-9]*)//'"; echo $sep 288 | clush $parg "echo -n 'SElinux status: '; ([ -d /etc/selinux -a -f /etc/selinux/config ] && grep ^SELINUX= /etc/selinux/config) || echo Disabled" 289 | echo $sep 290 | clush $parg "echo 'Firewall status: '; ${SUDO:-} service ufw status | head -10"; echo $sep 291 | clush $parg "echo 'IPtables status: '; ${SUDO:-} iptables -L | head -10"; echo $sep 292 | clush $parg "echo 'NFS packages installed '; dpkg -l '*nfs*' | grep ^i"; echo $sep 293 | ;; 294 | rhel|sles) 295 | if [[ "${EFFECTIVE_DISTRO}" == "sles" ]]; then 296 | clush $parg 'echo "MapR Repos Check "; zypper repos | grep -i mapr && zypper -q info mapr-core mapr-spark mapr-patch';echo $sep 297 | clush $parg "echo -n 'SElinux status: '; rpm -q selinux-tools selinux-policy" ; echo $sep 298 | clush $parg "${SUDO:-} service SuSEfirewall2_init status"; echo $sep 299 | else 300 | clush $parg -S 'echo "MapR Repos Check "; yum --noplugins repolist | grep -i mapr && yum -q info mapr-core mapr-spark mapr-patch';echo $sep 301 | clush $parg "echo -n 'SElinux status: '; grep ^SELINUX= /etc/selinux/config; ${SUDO:-} getenforce" ; echo $sep 302 | fi 303 | clush $parg 'echo "NFS packages installed "; rpm -qa | grep -i nfs |sort' 304 | echo $sep 305 | pkgs="dmidecode bind-utils irqbalance syslinux hdparm sdparm rpcbind" 306 | pkgs+=" nfs-utils redhat-lsb-core ntp" #TBD: SLES should have lsb5-core 307 | clush $parg "echo Required RPMs: ; rpm -q $pkgs | grep 'is not installed' || echo All Required RPMS are Installed"; echo $sep 308 | pkgs="patch nc dstat xml2 jq git tmux zsh vim nmap mysql mysql-server" 309 | pkgs+=" tuned smartmontools pciutils lsof lvm2 iftop ntop iotop atop" 310 | pkgs+=" ftop htop ntpdate tree net-tools ethtool" 311 | clush $parg "echo Optional RPMs:; rpm -q $pkgs |grep 'not installed' |sort" 312 | echo $sep 313 | #TBD suggest: setenforce Permissive and sed -i.bak 's/enforcing/permissive/' /etc/selinux/config 314 | #TBD SElinux different for SLES 315 | case $sysd in 316 | true) 317 | #clush $parg "ntpstat | head -1" ; echo $sep 318 | clush $parg "echo NTPD Active:; ${SUDO:-} systemctl is-active ntpd" 319 | echo $sep 320 | clush $parg "${SUDO:-} systemctl list-dependencies iptables" 321 | echo $sep 322 | clush $parg "${SUDO:-} systemctl status iptables"; echo $sep 323 | clush $parg "${SUDO:-} systemctl status firewalld"; echo $sep 324 | clush $parg "${SUDO:-} systemctl status cpuspeed"; echo $sep 325 | ;; 326 | false) 327 | clush $parg "echo 'NTP status '; ${SUDO:-} service ntpd status |sed 's/(.*)//'"; echo $sep 328 | clush $parg "${SUDO:-} chkconfig --list iptables" ; echo $sep 329 | clush $parg "${SUDO:-} service iptables status | head -10"; echo $sep 330 | clush $parg "echo -n 'CPUspeed Service: '; ${SUDO:-} service cpuspeed status"; echo $sep 331 | clush $parg "${SUDO:-} service sssd status|sed 's/(.*)//' && chkconfig --list sssd | grep -e 3:on -e 5:on >/dev/null" 332 | clush $parg "${SUDO:-} wc /etc/sssd/sssd.conf" #TBD: Check sssd settings and add sysd checks 333 | #clush $parg "/sbin/service iptables status | grep -m 3 -e ^Table -e ^Chain" 334 | #clush $parg "echo -n 'Frequency Governor: '; for dev in /sys/devices/system/cpu/cpu[0-9]*; do cat \$dev/cpufreq/scaling_governor; done | uniq -c" 335 | #clush $parg "echo -n 'CPUspeed Service: '; ${SUDO:-} chkconfig --list cpuspeed"; echo $sep 336 | ;; 337 | esac 338 | ;; 339 | *) echo Unknown Linux distro! 
${DISTRO_ID}; exit ;; 340 | #clush $parg 'echo "MapR Repos Check "; zypper repos |grep -i mapr && yum -q info mapr-core mapr-spark mapr-patch';echo $sep 341 | esac 342 | 343 | # See https://www.percona.com/blog/2014/04/28/oom-relation-vm-swappiness0-new-kernel/ 344 | clush $parg "echo 'Sysctl Values: '; ${SUDO:-} sysctl vm.swappiness net.ipv4.tcp_retries2 vm.overcommit_memory"; echo $sep 345 | echo -e "/etc/sysctl.conf values should be:\nvm.swappiness = 1\nnet.ipv4.tcp_retries2 = 5\nvm.overcommit_memory = 0"; echo $sep 346 | #clush $parg "grep AUTOCONF /etc/sysconfig/network" ; echo $sep 347 | clush $parg "echo -n 'Transparent Huge Pages: '; cat /sys/kernel/mm/transparent_hugepage/enabled" ; echo $sep 348 | clush $parg ${SUDO:-} 'echo Checking for LUKS; grep -v -e ^# -e ^$ /etc/crypttab | uniq -c -f2' 349 | clush $parg 'echo "Disk Controller Max Transfer Size:"; files=$(ls /sys/block/{sd,xvd,vd}*/queue/max_hw_sectors_kb 2>/dev/null); for each in $files; do printf "%s: %s\n" $each $(cat $each); done |uniq -c -f1'; echo $sep 350 | clush $parg 'echo "Disk Controller Configured Transfer Size:"; files=$(ls /sys/block/{sd,xvd,vd}*/queue/max_sectors_kb 2>/dev/null); for each in $files; do printf "%s: %s\n" $each $(cat $each); done |uniq -c -f1'; echo $sep 351 | echo Check Mounted FS 352 | case $sysd in 353 | true) 354 | clush $parg -u 30 "df -h --output=fstype,size,pcent,target -x tmpfs -x devtmpfs"; echo $sep ;; 355 | false) 356 | clush $parg -u 30 "df -hT | cut -c22-28,39- | grep -e ' *' | grep -v -e /dev"; echo $sep ;; 357 | esac 358 | echo Check for nosuid and noexec mounts 359 | clush $parg -u 30 "mount | grep -e noexec -e nosuid | grep -v tmpfs |grep -v 'type cgroup'"; echo $sep 360 | #clush $parg -u 30 "mount | grep -e noexec -e nosuid | grep -v tmpfs |grep -v 'type cgroup'" |cut -d' ' -f3- |column -t; echo $sep 361 | echo Check for /tmp permission 362 | clush $parg "stat -c %a /tmp | grep 1777 || echo /tmp permissions not 1777" ; echo $sep 363 | case $sysd in 364 | true) 365 | ;; 366 | false) 367 | echo Check for tmpwatch on NM local dir 368 | clush $parg -B "grep -H /tmp/hadoop-mapr/nm-local-dir /etc/cron.daily/tmpwatch || echo Not in tmpwatch: /tmp/hadoop-mapr/nm-local-dir"; echo $sep 369 | ;; 370 | esac 371 | #FIX: clush -l root -ab "echo '/usr/sbin/tmpwatch \"\$flags\" -x /tmp/hadoop-mapr/nm-local-dir' >> /etc/cron.daily/tmpwatch" 372 | #TBD: systemd-tmpfiles 'tmpfiles.d' man page. Configuration 373 | #in /usr/lib/tmpfiles.d/tmp.conf, and in /etc/tmpfiles.d/*.conf. 
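# A possible systemd-tmpfiles exclusion for the systemd case (untested
# sketch; the drop-in file name is hypothetical, the path matches the
# tmpwatch entry above; 'x' tells systemd-tmpfiles to skip the path
# during cleanup):
#   echo 'x /tmp/hadoop-mapr/nm-local-dir' > /etc/tmpfiles.d/nm-local-dir.conf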
374 | 375 | echo Java Version 376 | clush $parg -B 'java -version || echo See java-post-install.sh' 377 | if [[ "${EFFECTIVE_DISTRO}" != "sles" ]]; then 378 | clush $parg -B 'yum list installed \*jdk\* \*java\*' 379 | else 380 | clush $parg -B 'zypper search -i java jdk' 381 | fi 382 | clush $parg -B 'javadir=$(dirname $(readlink -f /usr/bin/java)); test -x $javadir/jps || { test -x $javadir/../../bin/jps || echo JDK not installed; }' 383 | echo $sep 384 | echo Check for root ownership of /opt/mapr 385 | clush $parg -B 'stat --printf="%U:%G %A %n\n" $(readlink -f /opt/mapr)'; echo $sep 386 | echo "Check for $srvid login" 387 | clush $parg -S "echo '$srvid account for MapR Hadoop '; getent passwd $srvid" || { echo "$srvid user NOT found!"; exit 2; } 388 | #TBD: add 'getent passwd |grep -i mapr' to search for other service acct names 389 | echo $sep 390 | 391 | if [[ $(id -u) -eq 0 || "$parg" =~ root || "$SUDO" =~ sudo ]]; then 392 | #TBD check umask for root and mapr 393 | echo Check for $srvid user specific open file and process limits 394 | clush $parg "echo -n 'Open process limit(should be >=32K): '; ${SUDO:-} su - $srvid -c 'ulimit -u'" 395 | clush $parg "echo -n 'Open file limit(should be >=32K): '; ${SUDO:-} su - $srvid -c 'ulimit -n'"; echo $sep 396 | echo Check for $srvid users java exec permission and version 397 | clush $parg -B "echo -n 'Java version: '; ${SUDO:-} su - $srvid -c 'java -version'"; echo $sep 398 | clush $parg -B "echo -n 'Locale setting(must be en_US): '; ${SUDO:-} su - $srvid -c 'locale |grep LANG'"; echo $sep 399 | echo "Check for $srvid passwordless ssh (only for MapR v3.x)" 400 | clush $parg "${SUDO:-} ls ~$srvid/.ssh/authorized_keys"; echo $sep 401 | elif [[ $(id -un) == $srvid ]]; then 402 | echo Check for $srvid user specific open file and process limits 403 | clush $parg "echo -n 'Open process limit(should be >=32K): '; ulimit -u" 404 | clush $parg "echo -n 'Open file limit(should be >=32K): '; ulimit -n"; echo $sep 405 | echo Check for $srvid users java exec permission and version 406 | clush $parg -B "echo -n 'Java version: '; java -version"; echo $sep 407 | echo "Check for $srvid passwordless ssh (only for MapR v3.x)" 408 | clush $parg "ls ~$srvid/.ssh/authorized_keys*"; echo $sep 409 | else 410 | echo Must have root access or sudo rights to check $srvid limits 411 | fi 412 | echo Check for system wide nproc and nofile limits 413 | clush $parg "${SUDO:-} test -d /etc/security/limits.d && { grep -e nproc -e nofile /etc/security/limits.d/*.conf |grep -v ':#'; } || exit 0" 414 | clush $parg "${SUDO:-} grep -e nproc -e nofile /etc/security/limits.conf |grep -v ':#' "; echo $sep 415 | #echo 'Check for root user login and passwordless ssh (not needed for MapR, just easy for clush)' 416 | #clush $parg "echo 'Root login '; getent passwd root && { ${SUDO:-} echo ~root/.ssh; ${SUDO:-} ls ~root/.ssh; }"; echo $sep 417 | -------------------------------------------------------------------------------- /pre-install/mapr-install.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | # jbenninghoff 2013-Mar-20 vi: set ai et sw=3 tabstop=3: 3 | # shellcheck disable=SC2029,SC2046,SC2086,SC2181,SC2162 4 | # ,SC2016 5 | #TBD: Handle mapruid install 6 | 7 | usage() { 8 | cat << EOF 9 | 10 | Usage: 11 | $0 -L -M -s -m|-e -u -x -a -k ServicePrincipalName -n ClusterName 12 | 13 | -L option to install MapR Log Searching (Kibana, ES, fluentd), after core install 14 | -M option to install MapR Metrics (Grafana, etc), after core 
install 15 | -n option to specify cluster name (no spaces) 16 | -s option for secure cluster installation 17 | -k option for Kerberos cluster installation (implies -s) 18 | -m option for MFS only cluster installation 19 | -a option for cluster with dedicated admin nodes not running nodemanager 20 | -e option to install on edge node (no fileserver). Can combine with -s or -x 21 | -u option to upgrade existing cluster 22 | -x option to uninstall existing cluster, destroying all data! 23 | -d option to enable DARE (data-at-rest encryption); -D option to enable debug pauses 24 | MapR Install methods: 25 | 1) Manually follow documentation at http://maprdocs.mapr.com/home/install.html 26 | 2) MapR GUI installer 27 | https://maprdocs.mapr.com/home/MapRInstaller.html 28 | (curl -LO http://package.mapr.com/releases/installer/mapr-setup.sh) 29 | 3) Bash script using clush groups and yum (this script) 30 | 4) Ansible install playbooks 31 | https://github.com/mapr-emea/mapr-ansible 32 | 33 | Install of MapR must be done as root 34 | (or with passwordless sudo as mapr service account) 35 | 36 | Kerberos configuration document: 37 | https://mapr.com/docs/61/SecurityGuide/Configuring-Kerberos-User-Authentication.html 38 | 39 | This script requires these clush groups: 40 | clstr cldb zk rm hist [graf otsdb kibana es] 41 | 42 | EOF 43 | exit 2 44 | } 45 | 46 | secure=false; kerberos=false; mfsonly=false; uninstall=false; upgrade=false 47 | admin=false; edge=false; metrics=false; clname=''; logsearch=false; dare=false 48 | pn=""; DBG=false 49 | while getopts "LMsdDmuxaek:n:" opt; do 50 | case $opt in 51 | L) logsearch=true ;; 52 | M) metrics=true ;; 53 | n) clname="$OPTARG" ;; 54 | s) secure=true ;; 55 | k) kerberos=true; secure=true; pn="$OPTARG" ;; 56 | m) mfsonly=true ;; 57 | u) upgrade=true ;; 58 | x) uninstall=true ;; 59 | a) admin=true ;; 60 | e) edge=true ;; 61 | d) dare=true ;; 62 | D) DBG=true ;; 63 | \?)
usage ;; 64 | esac 65 | done 66 | 67 | shift $(( OPTIND - 1 )) #Report unknown options and exit 68 | if [[ $# -gt 0 ]]; then 69 | echo Unknown option or argument 1: "$1" 70 | echo Unknown option or argument 2: "$2" 71 | exit 1 72 | fi 73 | 74 | setvars() { 75 | ########## Site specific variables 76 | # If MacOS has coreutils via brew install 77 | if [[ -d /usr/local/opt/coreutils/libexec/gnubin ]]; then 78 | PATH="/usr/local/opt/coreutils/libexec/gnubin:$PATH" 79 | fi 80 | pause="Press enter to continue or ctrl-c to abort" 81 | clname=${clname:-''} #Name for the entire cluster, no spaces 82 | realm="" #Kerberos Realm 83 | # Login to web ui; use non-root, non-mapr account to create "hadoop admin" 84 | admin1='mapr' #Non-root, non-mapr linux account which has a known password 85 | mapruid=mapr; maprgid=mapr #MapR service account and group 86 | spw=2 #Storage Pool width 87 | if compgen -G "/etc/*release" >/dev/null; then 88 | gcmd="grep -m1 -i -o -e ubuntu -e redhat -e 'red hat' -e centos" 89 | distro=$(cat /etc/*release 2>/dev/null | eval "$gcmd") 90 | else 91 | distro=centos 92 | fi 93 | maprver=v6.1.0 #TBD: Grep repo file to confirm or alter 94 | nfs='mapr-nfs' #Set to null to skip MapR NFS install 95 | #export JAVA_HOME=/usr/java/default #Oracle JDK 96 | export JAVA_HOME=/usr/lib/jvm/java #Openjdk 97 | # MapR rpm installs look for $JAVA_HOME, 98 | # all clush/ssh cmds will forward setting in ~/.ssh/environment 99 | (umask 0077 && echo JAVA_HOME=$JAVA_HOME >> $HOME/.ssh/environment) 100 | #If not root use sudo, assuming mapr account has password-less sudo 101 | [[ $(id -u) -ne 0 ]] && SUDO='sudo PATH=/sbin:/usr/sbin:$PATH ' 102 | cldb1=$(nodeset -I0 -e @cldb) #first node in cldb group 103 | # If root has mapr public key on all nodes 104 | #clush() { /Users/jbenninghoff/bin/clush -l root $@; } 105 | if [[ $(id -u) -ne 0 ]] && [[ $(id -un) != "$mapruid" ]]; then 106 | echo "This script must be run as root or $mapruid (with sudo)" 107 | #exit 1 108 | fi 109 | t1=$SECONDS 110 | } 111 | setvars #Set some global vars for install 112 | 113 | # Check install pre-requisites 114 | chk_prereq() { 115 | # Check for clush groups to layout services 116 | groups="clstr cldb zk rm hist" 117 | [[ "$metrics" == true ]] && groups+=" graf otsdb" 118 | [[ "$logsearch" == true ]] && groups+=" kibana es" 119 | #[[ "$edge" == true ]] && groups+=" edge " 120 | if [[ $(nodeset -c @cldb) -ne $(nodeset -c @clstr) ]]; then 121 | groups+=" noncldb" 122 | fi 123 | clushgrps=true 124 | for grp in $groups; do 125 | gmsg="Clustershell group: $grp undefined" 126 | [[ $(nodeset -c "@$grp") == 0 ]] && { echo "$gmsg"; clushgrps=false; } 127 | done 128 | [[ "$clushgrps" == false ]] && exit 1 129 | 130 | if [[ "$uninstall" == "false" && -z "$clname" ]]; then 131 | echo Cluster name not set. Use -n option to set cluster name 132 | usage 133 | exit 2 134 | fi 135 | if [[ "$kerberos" == true && $realm == "" ]]; then 136 | echo Kerberos Realm not set. Set realm var in this script. 137 | exit 2 138 | fi 139 | cldb1=$(nodeset -I0 -e @cldb) #first node in cldb group 140 | if [[ -z "$cldb1" ]]; then 141 | echo Primary node name not set.
142 | echo Set or check cldb1 variable in this script 143 | nodeset -I0 -e @cldb 144 | exit 2 145 | fi 146 | clush -S -B -g clstr id $admin1 || { echo $admin1 account does not exist on all nodes; exit 3; } 147 | clush -S -B -g clstr id $mapruid || { echo $mapruid account does not exist on all nodes; exit 3; } 148 | if [[ "$admin1" != "$mapruid" ]]; then 149 | clush -S -B -g clstr id $admin1 || \ 150 | { echo $admin1 account does not exist on all nodes; exit 3; } 151 | fi 152 | clush -S -B -g clstr "$JAVA_HOME/bin/java -version \ 153 | |& grep -e x86_64 -e 64-Bit -e version" || \ 154 | { echo $JAVA_HOME/bin/java does not exist on all nodes or is not 64bit; \ 155 | exit 3; } 156 | clush -qB -g clstr 'pkill -f yum; exit 0' 157 | clush -SB -g clstr 'echo "MapR Repo Check "; yum --noplugins -q search mapr-core' || { echo MapR RPMs not found; exit 3; } 158 | clush -SB -g clstr 'echo "MapR Repo URL ";yum --noplugins repoinfo mapr\* |grep baseurl' 159 | #rpm --import http://package.mapr.com/releases/pub/maprgpg.key 160 | #clush -S -B -g clstr 'echo "MapR Repos Check "; grep -li mapr /etc/yum.repos.d/* |xargs -l grep -Hi baseurl' || { echo MapR repos not found; } 161 | #clush -S -B -g clstr 'echo Check for EPEL; grep -qi -m1 epel /etc/yum.repos.d/*' || { echo Warning EPEL repo not found; } 162 | #TBD check for gpgcheck and key(s) 163 | read -p "All checks passed, $pause" 164 | } 165 | [[ "$uninstall" == "true" || "$edge" == "true" ]] || chk_prereq 166 | 167 | #Find, Download and install mapr-patch (called by other functions) 168 | install_patch() { 169 | inrepo=false 170 | clush -SB -g clstr ${SUDO:-} "yum --noplugins info mapr-patch" && inrepo=true 171 | if [[ "$inrepo" == "true" ]]; then 172 | clush -v -g clstr ${SUDO:-} "yum --noplugins -y install mapr-patch" 173 | else 174 | rpmsite="http://package.mapr.com/patches/releases/$maprver/redhat/" 175 | sedcmd="s/.*\\(mapr-patch-${maprver//v}.*.rpm\\).*/\\1/p" 176 | patchrpm=$(timeout 9 curl -s $rpmsite | sed -n "$sedcmd") 177 | if [[ ${#patchrpm}
-ne 0 ]]; then 178 | url=http://package.mapr.com/patches/releases/$maprver/redhat/$patchrpm 179 | clush -v -g clstr ${SUDO:-} "yum --noplugins -y install $url" 180 | else 181 | echo "Patch not found, patchrpm=$patchrpm" 182 | fi 183 | fi 184 | # If mapr-patch rpm cannot be added to local repo 185 | #clush -abc /tmp/mapr-patch-6.0.1.20180404222005.GA-20180626035114.x86_64.rpm 186 | #clush -ab "${SUDO:-} yum -y localinstall /tmp/mapr-patch-6*.rpm" 187 | #clush -ab "systemctl stop mapr-warden; systemctl stop mapr-zookeeper" 188 | #clush -ab "systemctl start mapr-zookeeper; systemctl start mapr-warden" 189 | } 190 | 191 | install_metrics() { 192 | if [[ "$uninstall" == "true" ]]; then 193 | sshpfx="MAPR_TICKETFILE_LOCATION=/opt/mapr/conf/mapruserticket" 194 | sshpfx+=" maprcli node services -action stop " 195 | 196 | # Stop the metric services 197 | sshcmd="$sshpfx -name grafana -nodes $(nodeset -e @graf)" 198 | clush -o -qtt -w $cldb1 "su - mapr -c '$sshcmd'" 199 | sshcmd="$sshpfx -name opentsdb -nodes $(nodeset -e @otsdb)" 200 | clush -o -qtt -w $cldb1 "su - mapr -c '$sshcmd'" 201 | sshcmd="$sshpfx -name collectd -nodes $(nodeset -e @clstr)" 202 | clush -o -qtt -w $cldb1 "su - mapr -c '$sshcmd'" 203 | 204 | # Remove the metric rpms 205 | clush -g graf "${SUDO:-} yum --noplugins -y erase mapr-grafana" 206 | clush -g otsdb "${SUDO:-} yum --noplugins -y erase mapr-opentsdb" 207 | clush -g clstr "${SUDO:-} yum --noplugins -y erase mapr-collectd" 208 | 209 | # Reconfigure MapR 210 | clcmd="env MAPR_TICKETFILE_LOCATION=/opt/mapr/conf/mapruserticket " 211 | clcmd+=" /opt/mapr/server/configure.sh -R " 212 | clush -g clstr "${SUDO:-} $clcmd" 213 | else 214 | # Check for MapR install 215 | clcmd="test -f /opt/mapr/conf/disktab >& /dev/null" 216 | if clush -S -B -g clstr "$clcmd"; then 217 | echo MapR appears to be installed, installing metric packages 218 | else 219 | echo MapR appears to not be installed, install cluster first 220 | exit 2 221 | fi 222 | 223 | # Install RPMs 224 | clush -g graf "${SUDO:-} yum --noplugins -y install mapr-grafana" 225 | clush -g otsdb "${SUDO:-} yum --noplugins -y install mapr-opentsdb" 226 | clush -g clstr "${SUDO:-} yum --noplugins -y install mapr-collectd" 227 | 228 | # Run configure.sh as root with MapR built-in ticket 229 | clcmd="env MAPR_TICKETFILE_LOCATION=/opt/mapr/conf/mapruserticket " 230 | clcmd+=" /opt/mapr/server/configure.sh -R " 231 | clcmd+=" -OT $(nodeset -S, -e @otsdb) " 232 | clush -g clstr "${SUDO:-} $clcmd" 233 | fi 234 | exit 235 | } 236 | [[ "$metrics" == true ]] && install_metrics # And exit script 237 | 238 | install_logsearch() { 239 | if [[ "$uninstall" == "true" ]]; then 240 | sshpfx="MAPR_TICKETFILE_LOCATION=/opt/mapr/conf/mapruserticket" 241 | sshpfx+=" maprcli node services -action stop " 242 | 243 | # Stop the log search services 244 | sshcmd="$sshpfx -name kibana -nodes $(nodeset -e @kibana)" 245 | clush -o -qtt -w $cldb1 "su - mapr -c '$sshcmd'" 246 | sshcmd="$sshpfx -name elasticsearch -nodes $(nodeset -e @es)" 247 | clush -o -qtt -w $cldb1 "su - mapr -c '$sshcmd'" 248 | sshcmd="$sshpfx -name fluentd -nodes $(nodeset -e @clstr)" 249 | clush -o -qtt -w $cldb1 "su - mapr -c '$sshcmd'" 250 | 251 | # Remove the log search rpms 252 | clush -g kibana "${SUDO:-} yum --noplugins -y erase mapr-kibana" 253 | clush -g es "${SUDO:-} yum --noplugins -y erase mapr-elasticsearch" 254 | clush -g clstr "${SUDO:-} yum --noplugins -y erase mapr-fluentd" 255 | 256 | # Reconfigure MapR 257 | clcmd="env 
MAPR_TICKETFILE_LOCATION=/opt/mapr/conf/mapruserticket " 258 | clcmd+=" /opt/mapr/server/configure.sh -R " 259 | clush -g clstr "${SUDO:-} $clcmd" 260 | else 261 | # Fluentd copies MapR service logs to ES 262 | clush -g clstr "${SUDO:-} yum --noplugins -y install mapr-fluentd" 263 | # ElasticSearch on 3 nodes for HA (mapr ulimit -n >= 64K) 264 | if [[ "$DBG" == "true" ]]; then 265 | clush -b -g es 'su -c "echo Num Files; ulimit -n" - mapr' 266 | read -p "$pause" 267 | fi 268 | clush -g es "${SUDO:-} yum --noplugins -y install mapr-elasticsearch" 269 | # Kibana provides webui to ES 270 | clush -g kibana "${SUDO:-} yum --noplugins -y install mapr-kibana" 271 | 272 | # If Cluster is secure... 273 | if [[ "$secure" == "true" ]]; then 274 | # Generate ES keys on first ES node 275 | es1=$(nodeset -I0 -e @es) #first node in es group 276 | clcmd="env MAPR_TICKETFILE_LOCATION=/opt/mapr/conf/mapruserticket " 277 | clcmd+=" /opt/mapr/server/configure.sh -R " 278 | clcmd+=" -ES $(nodeset -S, -e @es) " 279 | clcmd+=" -EPelasticsearch '-genESKeys' " 280 | clush -w $es1 "${SUDO:-} $clcmd" 281 | clcmd="cat /opt/mapr/elasticsearch/elasticsearchversion" 282 | esver=$(clush -Nw $es1 "${SUDO:-} $clcmd") 283 | [[ "$DBG" == "true" ]] && echo ES Version $esver 284 | [[ "$DBG" == "true" ]] && read -p "$pause" 285 | 286 | # Pull a copy of the keys from first ES node 287 | esdir="/opt/mapr/elasticsearch/elasticsearch-$esver/etc/elasticsearch" 288 | [[ "$DBG" == "true" ]] && ssh root@$es1 ls $esdir $esdir/ca $esdir/sg $esdir/keystores 289 | [[ "$DBG" == "true" ]] && read -p "$pause" 290 | eskeys="sg/ca/es-root-ca.pem" 291 | eskeys+=" sg/sg2.yml .keystore_password" 292 | eskeys+=" sg/truststore.jks sg/admin-usr-keystore.jks" 293 | for eshost in $(nodeset -e @es); do 294 | eskeys+=" sg/$eshost-srvr-keystore.jks" 295 | eskeys+=" sg/sg_http_$eshost.yml sg/sg_ssl_$eshost.yml" 296 | done 297 | rm -rf ~/eskeys 298 | mkdir -p ~/eskeys/sg ~/eskeys/sg/ca 299 | [[ "$DBG" == "true" ]] && echo Pulling ES Keys 300 | for file in $eskeys; do 301 | ssh root@$es1 dd status=none if=$esdir/$file > ~/eskeys/"$file" #Pull 302 | [[ "$DBG" == "true" ]] && { echo file is: $file; read -p "$pause"; } 303 | done 304 | 305 | # Copy pem to all ES nodes 306 | file="ca/es-root-ca.pem" 307 | clush -g es -x $es1 "${SUDO:-} mkdir -p $esdir/ca" 308 | ddcmd="dd of=$esdir/$file status=none" 309 | clush -g es -x $es1 "${SUDO:-} $ddcmd" < ~/eskeys/sg/"$file" #Push 310 | chcmd="chown mapr:mapr $esdir/$file; chmod 640 $esdir/$file" 311 | clush -g es -x $es1 "${SUDO:-} $chcmd" #chown/chmod 312 | 313 | # Copy Java keystores to all ES nodes 314 | eskeys="truststore.jks admin-usr-keystore.jks" 315 | for eshost in $(nodeset -e @es); do 316 | eskeys+=" $eshost-srvr-keystore.jks" 317 | done 318 | clush -g es -x $es1 "${SUDO:-} mkdir -p $esdir/keystores" 319 | [[ "$DBG" == "true" ]] && echo Pushing ES .jks 320 | for file in $eskeys; do 321 | ddcmd="dd of=$esdir/keystores/$file status=none" 322 | clush -g es -x $es1 "${SUDO:-} $ddcmd" < ~/eskeys/sg/"$file" #Push 323 | chcmd="chown mapr:mapr $esdir/keystores/$file; chmod 640 $esdir/keystores/$file" 324 | clush -g es -x $es1 "${SUDO:-} $chcmd" #chown/chmod 325 | [[ "$DBG" == "true" ]] && { echo file is: $file; read -p "$pause"; } 326 | done 327 | 328 | # Copy YAML files to all ES nodes 329 | eskeys="sg2.yml" 330 | for eshost in $(nodeset -e @es); do 331 | eskeys+=" sg_http_$eshost.yml sg_ssl_$eshost.yml" 332 | done 333 | [[ "$DBG" == "true" ]] && echo Pushing ES yml files 334 | for file in $eskeys; do 335 | ddcmd="dd 
of=$esdir/$file status=none" 336 | clush -g es -x $es1 "${SUDO:-} $ddcmd" < ~/eskeys/sg/"$file" #Push 337 | chcmd="chown mapr:mapr $esdir/$file; chmod 640 $esdir/$file" 338 | clush -g es -x $es1 "${SUDO:-} $chcmd" #chown/chmod 339 | [[ "$DBG" == "true" ]] && { echo file is: $file; read -p "$pause"; } 340 | done 341 | 342 | # Copy keystore password file to all ES nodes 343 | file=".keystore_password" 344 | ddcmd="dd of=$esdir/$file status=none" 345 | clush -g es -x $es1 "${SUDO:-} $ddcmd" < ~/eskeys/"$file" #Push 346 | chcmd="chown mapr:mapr $esdir/$file; chmod 640 $esdir/$file" 347 | clush -g es -x $es1 "${SUDO:-} $chcmd" #chown/chmod 348 | 349 | # Copy pem and keystore_password to all Kibana nodes 350 | # v6.0.1 bug: vi /opt/mapr/kibana/kibana-5.4.1/bin/configure.sh (line #375) 351 | clcmd="cat /opt/mapr/kibana/kibanaversion" 352 | kibver=$(clush -Ng kibana "${SUDO:-} $clcmd") 353 | kibdir="/opt/mapr/kibana/kibana-$kibver/config" 354 | file=sg/ca/es-root-ca.pem 355 | clush -g kibana "${SUDO:-} mkdir -p $kibdir/ca; chown mapr:mapr $kibdir/ca" 356 | ddcmd="dd of=$kibdir/ca/$(basename $file) status=none" 357 | clush -g kibana "${SUDO:-} $ddcmd" < ~/eskeys/$file 358 | chcmd="chown mapr:mapr $kibdir/ca/$(basename $file);" 359 | chcmd+=" chmod 640 $kibdir/ca/$(basename $file)" 360 | clush -g kibana "${SUDO:-} $chcmd" 361 | 362 | file=.keystore_password 363 | ddcmd="dd of=$kibdir/$file status=none" 364 | clush -g kibana "${SUDO:-} $ddcmd" < ~/eskeys/$file 365 | chcmd="chown mapr:mapr $kibdir/$file;" 366 | chcmd+=" chmod 640 $kibdir/$file" 367 | clush -g kibana "${SUDO:-} $chcmd" 368 | [[ "$DBG" == "true" ]] && echo All ES Keys pushed out 369 | fi 370 | 371 | # Run configure with all required options 372 | clcmd="env MAPR_TICKETFILE_LOCATION=/opt/mapr/conf/mapruserticket" 373 | #clcmd+=" ES_ADMIN_PASSWORD=admin" 374 | clcmd+=" /opt/mapr/server/configure.sh -R" 375 | clcmd+=" -ES $(nodeset -S, -e @es)" 376 | clcmd+=" -EPelasticsearch '-password admin' -EPkibana '-password admin'" 377 | clcmd+=" -EPfluentd '-password admin'" 378 | clush -g clstr "${SUDO:-} $clcmd" 379 | fi 380 | exit 381 | } 382 | [[ "$logsearch" == true ]] && install_logsearch # And exit script 383 | 384 | do_upgrade() { 385 | #TBD: grep secure=true /opt/mapr/conf/mapr-clusters.conf && 386 | # { cp ../post-install/mapr-audit.sh /tmp; 387 | # sudo -u $mapruid /tmp/mapr-audit.sh; } 388 | #sudo -u mapr bash -c : && RUNAS="sudo -u mapr"; $RUNAS bash </dev/null 442 | done 443 | folder_list='conf/ hadoop/hadoop-*/etc/hadoop/ hadoop/hadoop-*/conf ' 444 | folder_list+=$(for role in $(ls /opt/mapr/roles |grep -v -e cldb -e fileserver -e nodemanager -e nfs -e apiserver); do cd /opt/mapr; ls -d $role/$role-*/{conf,etc} 2>/dev/null; done |xargs echo; echo ' ') 445 | --BLOCK-COMMENT-- 446 | clcmd="${SUDO:-} cd /opt/mapr/ && " 447 | clcmd+="tar cfz $HOME/mapr_configs-\$(hostname -f)-\$(date "+%FT%T").tgz " 448 | clcmd+="${folder_list}" 449 | clush -g clstr -b "$clcmd" 450 | #ansible -i /etc/ansible/hosts all -m shell -a "$clcmd" 451 | clush -g clstr -b ${SUDO:-} "ls -l $HOME/mapr_configs*.tgz" 452 | #ansible -i /etc/ansible/hosts all -m shell -a "ls -l $HOME/mapr_conf*.tgz" 453 | # TBD: make /tmp script, push it to all nodes, run it on all nodes. 
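# A minimal sketch of the TBD above (illustrative; the script name and /tmp
# path are placeholders). Stage a backup script on every node with clush file
# copy, then run it everywhere so hostname and date expand on each node:
#   cat > /tmp/mapr-conf-backup.sh <<'SCRIPT'
#   #!/bin/bash
#   cd /opt/mapr || exit 1
#   tar czf "$HOME/mapr_configs-$(hostname -f)-$(date +%FT%T).tgz" conf/
#   SCRIPT
#   clush -g clstr -c /tmp/mapr-conf-backup.sh --dest /tmp/
#   clush -g clstr -b "${SUDO:-} bash /tmp/mapr-conf-backup.sh"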
454 | 455 | # Remove mapr-patch 456 | clush -g clstr -b ${SUDO:-} yum --noplugins -y erase mapr-patch 457 | 458 | # Update all MapR RPMs on all nodes 459 | # yum --disablerepo mapr-eco update mapr-\* 460 | #Exclude specific rpms with --exclude=mapr-some-somepackage 461 | clush -v -g clstr ${SUDO:-} "yum --noplugins -y update mapr-\\*" 462 | readtxt="Check console for errors. If none, press enter to continue or " 463 | readtxt+="ctrl-c to abort" 464 | read -p "$readtxt" 465 | 466 | # Download and install mapr-patch 467 | install_patch 468 | 469 | # Run configure.sh -R to ensure configuration is updated 470 | clush -g clstr -b ${SUDO:-} /opt/mapr/server/configure.sh -R 471 | # TBD: modify yarn-site.xml and mapred-site.xml and container-executor.cfg 472 | # when upgrading 473 | 474 | # Start rpcbind, zk and warden 475 | clush -g clstr -b ${SUDO:-} service rpcbind restart 476 | clush -g zk -b ${SUDO:-} service mapr-zookeeper start 477 | sleep 9 478 | clush -g zk -b ${SUDO:-} service mapr-zookeeper qstatus 479 | clush -g clstr -b ${SUDO:-} service mapr-warden start 480 | sleep 90 481 | export MAPR_TICKETFILE_LOCATION=/opt/mapr/conf/mapruserticket 482 | maprconf="{mapr.targetversion:\"$(cat /opt/mapr/MapRBuildVersion)\"}" 483 | sudo -u mapr maprcli config save -values "$maprconf" 484 | sudo -u mapr maprcli cluster feature enable -all 485 | exit 486 | } 487 | [[ "$upgrade" == "true" ]] && do_upgrade # And exit script 488 | 489 | uninstall() { 490 | sshcmd="MAPR_TICKETFILE_LOCATION=/opt/mapr/conf/mapruserticket" 491 | sshcmd+=" maprcli dashboard info -json " 492 | clush -o -qtt -w $cldb1 "su - mapr -c '$sshcmd'" |awk '/"disk_space":{/,/}/' 493 | read -p "All data will be lost, $pause" 494 | clush -g clstr -b ${SUDO:-} umount /mapr 495 | clush -g clstr -b ${SUDO:-} service mapr-warden stop 496 | clush -g zk -b ${SUDO:-} service mapr-zookeeper stop 497 | clush -g clstr -b ${SUDO:-} jps 498 | clush -g clstr -b ${SUDO:-} pkill -u $mapruid 499 | clush -g clstr -b "${SUDO:-} ps ax | grep $mapruid" 500 | readtxt="If any $mapruid process is still running, " 501 | readtxt+="press ctrl-c to abort. Kill all $mapruid processes manually" 502 | read -p "$readtxt" 503 | clcmd="cp /opt/mapr/conf/disktab /var/tmp/mapr-disktab" 504 | clush -g clstr -b "${SUDO:-} $clcmd" 505 | echo Copy of disktab saved to /var/tmp/ on all nodes 506 | 507 | shopt -s nocasematch 508 | while read -p "Enter 'yes' to remove all mapr packages and /opt/mapr: "; do 509 | [[ "$REPLY" == "yes" ]] && break 510 | done 511 | 512 | case $distro in 513 | redhat|centos|red*) 514 | clcmd="yum clean all; yum -y erase mapr-\\*" 515 | clush -g clstr -b ${SUDO:-} "$clcmd" ;; 516 | ubuntu) 517 | clush -g clstr -B 'dpkg -P mapr-\*' ;; 518 | *) echo Unknown Linux distro!
$distro; exit ;; 519 | esac 520 | clush -g clstr -b ${SUDO:-} rm -rf /opt/mapr 521 | clush -g clstr -b ${SUDO:-} rm -rf /tmp/hadoop-mapr 522 | clush -g clstr -b ${SUDO:-} 'rm -rf /tmp/maprticket_*' 523 | exit 524 | } 525 | [[ "$uninstall" == "true" && "$edge" == "false" ]] && uninstall # And exit 526 | 527 | install_edge_node() { 528 | if [[ $(nodeset -c @edge) == 0 ]]; then 529 | echo clustershell group: edge undefined 530 | exit 1 531 | fi 532 | if [[ "$uninstall" == "true" ]]; then 533 | clush -g edge -b ${SUDO:-} umount /mapr 534 | clush -g edge -b ${SUDO:-} service mapr-warden stop 535 | clush -g edge -b ${SUDO:-} service mapr-posix-client-basic stop 536 | clush -g edge -b ${SUDO:-} jps 537 | clush -g edge -b ${SUDO:-} pkill -u $mapruid 538 | clush -g edge -b "${SUDO:-} ps ax | grep $mapruid" 539 | read -p "If any $mapruid process is still running, \ 540 | press ctrl-c to abort and kill all manually" 541 | clush -g edge -b "${SUDO:-} yum clean all; yum -y erase mapr-\\*" 542 | clush -g edge -b ${SUDO:-} rm -rf /opt/mapr 543 | exit 544 | else 545 | if ! clush -S -B -g edge id $mapruid; then 546 | echo $mapruid account does not exist on all nodes 547 | mustexit=true 548 | fi 549 | clcmd="$JAVA_HOME/bin/java -version |& grep -e x86_64 -e 64-Bit -e version" 550 | if ! clush -S -B -g edge "$clcmd"; then 551 | echo $JAVA_HOME/bin/java does not exist on all nodes or is not 64bit 552 | mustexit=true 553 | fi 554 | clush -qB -g edge 'pkill -f yum; exit 0' 555 | if ! clush -SB -g edge 'echo "MapR Repo Check "; yum --noplugins -q search mapr-core'; then 556 | echo MapR RPMs not found, define mapr repo 557 | mustexit=true 558 | fi 559 | if [[ "$mustexit" == "true" ]]; then 560 | echo Pre-requisites not met; exit 3 561 | fi 562 | clush -SB -g edge 'echo "MapR Repo URL ";yum --noplugins repoinfo mapr* |grep baseurl' 563 | # Install mapr-core to use warden to run HS2,Metastore,etc 564 | rpms="mapr-core mapr-posix-client-basic" 565 | clush -v -g edge "${SUDO:-} yum -y install $rpms" 566 | # Edge node without maprcli 567 | #rpms="mapr-client mapr-posix-client-basic" 568 | # Enables edge node as simple client with loopback NFS to maprfs 569 | #rpms="mapr-client mapr-nfs" 570 | # TBD: If mapr-core installed, install patch? 571 | 572 | if [[ "$secure" == "true" ]]; then 573 | keys="ssl_truststore,ssl_keystore,maprserverticket,mapruserticket" 574 | scp "root@$cldb1:/opt/mapr/conf/{$keys}" . #fetch a copy of the keys 575 | clush -g edge -c ssl_truststore --dest /opt/mapr/conf/ 576 | clush -g edge -c ssl_keystore --dest /opt/mapr/conf/ 577 | clush -g edge -c maprserverticket --dest /opt/mapr/conf/ 578 | clush -g edge -c mapruserticket --dest /opt/mapr/conf/ 579 | clush -g edge "${SUDO:-} chown $mapruid:$maprgid /opt/mapr/conf/{$keys}" 580 | clush -g edge "${SUDO:-} chmod 600 /opt/mapr/conf/{$keys}" 581 | clush -g edge "${SUDO:-} chmod 644 /opt/mapr/conf/ssl_truststore" 582 | fi 583 | # v4.1+ use RM zeroconf, no -RM option 584 | confopts="-N $clname -Z $(nodeset -S, -e @zk) -C $(nodeset -S, -e @cldb)" 585 | confopts+=" -HS $(nodeset -I0 -e @hist) -u $mapruid -g $maprgid" 586 | confopts+=" -no-autostart" 587 | #confopts+=" -no-autostart -c" 588 | [[ "$secure" == "true" ]] && confopts+=" -S" 589 | clush -S -g edge "${SUDO:-} /opt/mapr/server/configure.sh $confopts" 590 | clush -g edge "${SUDO:-} chmod u+s /opt/mapr/bin/fusermount" 591 | clush -g edge "${SUDO:-} mkdir -p /mapr" 592 | echo Edit /opt/mapr/conf/fuse.conf.
Append mapr ticket file path 593 | echo systemctl start mapr-posix-client-basic 594 | #systemctl restart mapr-warden 595 | exit 596 | fi 597 | } 598 | [[ "$edge" == "true" ]] && install_edge_node # And exit script 599 | 600 | chk_disk_list() { 601 | clear 602 | clush -S -B -g clstr "cat /tmp/disk.list; wc /tmp/disk.list" || { echo /tmp/disk.list not found, run clush disk-test.sh; exit 4; } 603 | clush -S -B -g clstr 'test -f /opt/mapr/conf/disktab' >& /dev/null && { echo MapR appears to be installed; exit 3; } 604 | 605 | # Create multiple disk lists for heterogeneous Storage Pools 606 | #clush -B -g clstr "sed -n '1,10p' /tmp/disk.list > /tmp/disk.list1" 607 | #clush -B -g clstr "sed -n '11,\$p' /tmp/disk.list > /tmp/disk.list2" 608 | #clush -B -g clstr "cat /tmp/disk.list1; wc /tmp/disk.list1" || { echo /tmp/disk.list1 not found; exit 4; } 609 | #clush -B -g clstr "cat /tmp/disk.list2; wc /tmp/disk.list2" || { echo /tmp/disk.list2 not found; exit 4; } 610 | 611 | cat <& /dev/null" 691 | [[ "$DBG" == "true" ]] && { echo Removed existing keys; read -p "$pause"; } 692 | #echo rm-keys done; read -p "$pause" 693 | 694 | # Generate keys using primary CLDB node 695 | clcmd="/opt/mapr/server/configure.sh -N $clname " 696 | clcmd+=" -Z $(nodeset -S, -e @zk) -C $(nodeset -S, -e @cldb) " 697 | clcmd+=" -secure -genkeys -f -u $mapruid -g $maprgid " 698 | clcmd+=" -no-autostart -OT $(nodeset -S, -e @otsdb) " 699 | [[ "$kerberos" == "true" ]] && clcmd+=" -K -P $mapruid/$clname@$realm " 700 | [[ "$dare" == "true" ]] && clcmd+=" -dare " 701 | clush -S -w "$cldb1" "${SUDO:-} $clcmd" 702 | if [[ $? -ne 0 ]]; then 703 | echo "configure.sh -genkeys failed" 704 | echo ${SUDO:-} $clcmd 705 | echo check screen and $cldb1:/opt/mapr/logs for errors 706 | exit 2 707 | fi 708 | [[ "$DBG" == "true" ]] && { echo "gen-keys done"; read -p "$pause"; } 709 | 710 | # Pull a copy of the keys from first CLDB node, then push to all nodes 711 | for file in $secfiles; do 712 | ssh root@$cldb1 dd status=none if=/opt/mapr/conf/$file > ~/"$file" #Pull 713 | ddcmd="dd of=/opt/mapr/conf/$file status=none" 714 | clush -g clstr -x $cldb1 "${SUDO:-} $ddcmd" < ~/"$file" #Push 715 | #echo file is: $file; read -p "$pause" 716 | done 717 | ssh root@$cldb1 "cksum /opt/mapr/conf/$seckeys" 718 | ssh root@$cldb1 "ls -l /opt/mapr/conf/$seckeys" 719 | [[ "$DBG" == "true" ]] && { echo "pull-keys done"; read -p "$pause"; } 720 | 721 | # Set owner and permissions on all key files pushed out 722 | clcmd="chown $mapruid:$maprgid /opt/mapr/conf/$seckeys" 723 | clush -g clstr "${SUDO:-} $clcmd" 724 | clcmd="chmod 400 /opt/mapr/conf/$seckeys" 725 | clush -g clstr "${SUDO:-} $clcmd" 726 | clcmd="chmod 444 /opt/mapr/conf/ssl_truststore*" 727 | clush -g clstr "${SUDO:-} $clcmd" 728 | clcmd="chmod 600 /opt/mapr/conf/{cldb.key,maprserverticket}" 729 | clush -g clstr "${SUDO:-} $clcmd" 730 | if [[ "$dare" == "true" ]]; then 731 | clcmd="chmod 600 /opt/mapr/conf/dare.master.key" 732 | clush -g clstr "${SUDO:-} $clcmd" 733 | fi 734 | #clush -b -g clstr "cksum /opt/mapr/conf/$seckeys" 735 | #echo install_keys; read -p "$pause" 736 | } 737 | [[ "$secure" == "true" ]] && install_keys && echo MapR Keys installed 738 | 739 | configure_nodes() { 740 | cfg=/opt/mapr/server/configure.sh 741 | # Define all configure.sh options needed 742 | confopts="-N $clname -Z $(nodeset -S, -e @zk) -C $(nodeset -S, -e @cldb) " 743 | confopts+=" -u $mapruid -g $maprgid -no-autostart " 744 | [[ "$mfsonly" == "false" ]] && confopts+="-HS $(nodeset -I0 -e @hist) " 745 | [[ "$secure" == 
"true" ]] && confopts+=" -S " 746 | [[ "$kerberos" == "true" ]] && confopts+=" -K -P $mapruid/$clname@$realm " 747 | #TBD: Handle $pn and $realm 748 | if [[ "$1" == "cldb" ]]; then 749 | clush -S -w $(nodeset -S, -e @cldb -x $cldb1) "${SUDO:-} $cfg $confopts" 750 | else 751 | clush -S -g $1 "${SUDO:-} $cfg $confopts" 752 | fi 753 | if [[ $? -ne 0 ]]; then 754 | echo configure.sh failed 755 | echo check screen history and /opt/mapr/logs/configure.log for errors 756 | exit 2 757 | fi 758 | #echo configure_nodes; read -p "$pause" 759 | } 760 | configure_nodes cldb && echo Configure.sh on CLDB nodes finished 761 | 762 | format_disks() { 763 | disks=/tmp/disk.list 764 | dargs="-F -W $spw" 765 | clush -g $1 "${SUDO:-} rm -f /opt/mapr/conf/disktab" 766 | clush -S -g $1 "${SUDO:-} /opt/mapr/server/disksetup $dargs $disks" 767 | if [[ $? -ne 0 ]]; then 768 | echo disksetup failed, check terminal and /opt/mapr/logs for errors 769 | exit 3 770 | fi 771 | #echo format_disks(); read -p "$pause" 772 | } 773 | format_disks cldb && echo CLDB disks formatted 774 | 775 | start_mapr() { 776 | clush -g zk "${SUDO:-} service mapr-zookeeper start" 777 | clush -g $1 "${SUDO:-} service mapr-warden start" 778 | 779 | echo Waiting 2 minutes for system to initialize 780 | end=$((SECONDS+120)) 781 | sp='/-\|' 782 | printf ' ' 783 | while (( SECONDS < end )); do 784 | printf '\b%.1s' "$sp" 785 | sp=${sp#?}${sp%???} 786 | sleep .3 787 | done # Spinner from StackOverflow 788 | 789 | t2=$SECONDS; echo -n "Duration time for installation: " 790 | date -u -d @$((t2 - t1)) +"%T" 791 | } 792 | start_mapr cldb && echo Warden started on CLDB nodes 793 | 794 | # Repeat configuration on non-cldb nodes 795 | if [[ $(nodeset -c @noncldb) -ne 0 ]]; then 796 | configure_nodes noncldb && echo MapR noncldb configured 797 | format_disks noncldb && echo MapR disks formatted 798 | start_mapr noncldb && echo MapR warden started 799 | fi 800 | 801 | add_acl_lic() { 802 | sshcmd="MAPR_TICKETFILE_LOCATION=/opt/mapr/conf/mapruserticket" 803 | sshcmd+=" maprcli node cldbmaster" 804 | clush -o -qtt -w $cldb1 "su - $mapruid -c '$sshcmd'" 805 | if [[ $? -ne 0 ]]; then 806 | echo CLDB did not startup, check status and logs on $cldb1 807 | exit 3 808 | fi 809 | 810 | sshcmd="MAPR_TICKETFILE_LOCATION=/opt/mapr/conf/mapruserticket" 811 | sshcmd+=" maprcli acl edit -type cluster -user $admin1:fc,a" 812 | clush -o -qtt -w $cldb1 "su - $mapruid -c '$sshcmd'" 813 | 814 | cat << LICMESG 815 | With a web browser, connect to one of the webservers to continue 816 | with license installation: 817 | Webserver nodes: $(nodeset -e @cldb) 818 | 819 | Alternatively, the license can be installed with maprcli. 820 | First, get the cluster id with maprcli like this: 821 | 822 | maprcli dashboard info -json |grep -e id -e name 823 | "name":"MyCluster", 824 | "id":"1111111111111111111", 825 | 826 | Then you can use any browser to connect to http://mapr.com/. In the 827 | upper right corner there is a login link. login and register if you 828 | have not already. Once logged in, you can use the register button on 829 | the right of your login page to register a cluster by just entering 830 | a clusterid. 831 | Once you finish the register form, you will get back a license which 832 | you can copy and paste to a file on the same node you ran maprcli. 
833 | Use that file as filename in the following maprcli command: 834 | maprcli license add -is_file true -license filename 835 | 836 | The license server API can also be used with a valid mapr.com 837 | login which requires prior registration. Specify the generated 838 | cluster ID and the cluster name to the REST interface: 839 | curl -u jbenninghoff@maprtech.com 'https://mapr-installer-dialhome.appspot.com/trial?cluster_id=5681466578299529065&cluster_name=ps&out=text' 840 | 841 | Copy the resulting file (stdout) to the cluster if need be. 842 | Install the license on the MapR cluster: 843 | env MAPR_TICKETFILE_LOCATION=/opt/mapr/conf/mapruserticket maprcli license add -is_file true -license /tmp/WFtest3.lic 844 | 845 | Restart the entire cluster: 846 | clush -ab systemctl restart mapr-warden 847 | 848 | LICMESG 849 | 850 | sshcmd="MAPR_TICKETFILE_LOCATION=/opt/mapr/conf/mapruserticket" 851 | sshcmd+=" maprcli dashboard info -json " 852 | clush -o -qtt -w $cldb1 "su - $mapruid -c '$sshcmd'" |grep -e id -e name 853 | } 854 | add_acl_lic 855 | --------------------------------------------------------------------------------
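# Example invocations of pre-install/mapr-install.sh (illustrative only;
# "ps-cluster" is a placeholder name, and the clush groups listed in usage()
# must already be defined):
#   ./mapr-install.sh -n ps-cluster       # basic core install
#   ./mapr-install.sh -s -n ps-cluster    # secure cluster install
#   ./mapr-install.sh -M                  # add metrics (collectd/opentsdb/grafana) after core
#   ./mapr-install.sh -x                  # uninstall, destroying all data!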