├── README.md
├── artefacts
│   ├── common-config
│   │   └── set-env-variables.sh
│   ├── git
│   │   ├── hooks
│   │   │   └── pre-commit
│   │   ├── list_latest_tags.sh
│   │   ├── package-git-repo.sh
│   │   └── update-all-git-repos.sh
│   ├── pdi
│   │   ├── .kettle
│   │   │   ├── .gitignore
│   │   │   ├── .spoonrc
│   │   │   ├── kettle.properties
│   │   │   ├── repositories-file.xml
│   │   │   ├── repositories-jackrabbit.xml
│   │   │   └── shared.xml
│   │   ├── repo
│   │   │   ├── db_connection_template.kdb
│   │   │   ├── jb_master.kjb
│   │   │   └── module_1
│   │   │       ├── jb_module_1_master.kjb
│   │   │       └── tr_dummy.ktr
│   │   └── shell-scripts
│   │       ├── ci-settings.sh
│   │       ├── delete-pentaho-repo-folder.sh
│   │       ├── export-file-based-repo.sh
│   │       └── upload-pdi-repo.sh
│   ├── pentaho-server
│   │   ├── shell-scripts
│   │   │   ├── command-line-utility-based
│   │   │   │   ├── export-pentaho-server-content.sh
│   │   │   │   └── upload-pentaho-server-content.sh
│   │   │   ├── download-all-content.sh
│   │   │   ├── upload-file.sh
│   │   │   ├── upload-folder.sh
│   │   │   ├── upload-jdbc-connection.sh
│   │   │   ├── upload-metadata-model.sh
│   │   │   ├── upload-mondrian-schema.sh
│   │   │   └── upload-schedule.sh
│   │   └── templates
│   │       ├── jdbc-connection-generic.json
│   │       └── schedule.json
│   ├── project-config
│   │   ├── project.properties
│   │   ├── run_jb_name.sh
│   │   └── wrapper.sh
│   └── utilities
│       └── build-rpm
│           └── template.spec
├── config
│   └── settings.sh
├── initialise-repo.sh
└── presentations
    ├── how-to-run-presentation.md
    ├── pcm2017.md
    └── pics
        ├── IMG_0050.PNG
        ├── code-config-separation.png
        ├── git-merge-sqash.png
        ├── git-merge-squash.png
        ├── logo_v4_websafe_small.gif
        ├── logo_v4_white_small.png
        ├── modules-shown-in-repo-browser.png
        ├── pre-commit-filename-validation.png
        ├── pre-commit-validation.png
        ├── spoon-preconfigured.png
        ├── structure-with-common-artefacts.png
        └── structure-without-common-artefacts.png

/README.md:
--------------------------------------------------------------------------------
# Getting Started

## Purpose

To understand what the project is about, have a look at this repo's [presentation](./presentations/pcm2017.md).
Read this before proceeding.

> **In a nutshell**: This project delivers a utility script which sets up one or more Pentaho-specific git repos with a predefined/standardised folder structure and utilises Git hooks to enforce a few coding standards.

## Initial Setup

Clone the repo to your local machine:

```bash
git clone git@github.com:diethardsteiner/pentaho-standardised-git-repo-setup.git
cd pentaho-standardised-git-repo-setup
```

Next, adjust the following file **for each project and environment** that you want to create the git skeleton for:

- `config/settings.sh`: Read the comments in the file to understand what the settings mean.

> **Note** on `MODULES_GIT_REPO_URL`: If you do not have a repo yet for the PDI modules (reusable code), use `initialise-repo.sh` to create it and push it to your Git server (GitHub, etc). Then adjust the configuration.

There are also some artefacts that are currently not controlled by the central settings file, e.g. the Spoon settings located in:

```
artefacts/pdi/.kettle/.spoonrc
```

## Initialise Project Structure

The `initialise-repo.sh` script can create:

- project-specific **config repo** for a given environment (`[PROJECT]-conf-[ENV]`)
- common **config repo** for a given environment (`common-conf-[ENV]`)
- project **code repo** (`[PROJECT]-code`)
- common **docu repo** (`common-documentation`)
- project **docu repo** (`[PROJECT]-documentation`)
- PDI **modules** (`modules`): for reusable code/patterns. Holds plain modules only, so it can be used in either a file-based or a repo-based PDI setup.
- PDI **modules repo** (`modules-pdi-repo`): required when creating modules via a PDI repo.

Each comes with the very basic folder structure and required artefacts. The script enables you to create the repositories individually or in certain combinations.
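As an illustration, the `config/settings.sh` mentioned above might contain entries along these lines. This is only a sketch, not the actual file contents: `MODULES_GIT_REPO_URL` and the `PSGRS_*` variable names appear elsewhere in this repo, but all values below are placeholders for your own paths.

```shell
# Illustrative sketch of config/settings.sh entries - read the comments in the
# real file for the authoritative list of settings. All values are placeholders.
MODULES_GIT_REPO_URL=git@github.com:yourorg/modules-pdi-repo.git
PSGRS_KETTLE_HOME=/home/youruser/git/myproject/common-config-dev/pdi/.kettle
PSGRS_PDI_DIR=/opt/pentaho/design-tools/data-integration
PSGRS_LOG_DIR=/var/log/pdi
```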
### Standardised Git Repo Structure - Code Repo

| folder                              | description
|------------------------------------ |---------------------------------
| `pdi/repo`                          | PDI files (ktr, kjb). Also the root of the file-based repo, if used.
| `pdi/sql`                           | SQL queries
| `pdi/sql/ddl`                       | DDL
| `pentaho-server/metadata`           | Pentaho metadata models
| `pentaho-server/mondrian`           | Mondrian cube definitions
| `pentaho-server/repo`               | contains exports from the Pentaho Server repo
| `pentaho-server/prd`                | Pentaho report files
| `shell-scripts`                     | any shell scripts that don't hold configuration-specific instructions

> **Note**: Data, like lookup tables, must not be stored with the code. For development and unit testing it can be stored in the `config` git repo's `test-data` folder. In production, however, it must reside outside any git repo if it is the only source available.

### Standardised Git Repo Structure - Configuration Repo

| folder                              | description
|------------------------------------ |---------------------------------
| `pdi/.kettle`                       | PDI config files
| `pdi/metadata`                      | any metadata files that drive DI processes
| `pdi/properties`                    | properties files sourced by PDI
| `pdi/schedules`                     | holds crontab instructions, DI server schedules or similar
| `pdi/shell-scripts`                 | shell scripts to execute e.g. PDI jobs
| `pdi/test-data`                     | optional: test data for development or unit testing - specific to the environment
| `pentaho-server/connections`        | Pentaho Server connections
| `utilities`                         |

### How to run the script

The `initialise-repo.sh` script expects the following **arguments**:

- action (required)
- project name (not always required)
- environment (not always required)
- PDI file storage (not always required)

> **Important**: The project name must only include letters, no other characters. The same applies to the environment name.
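The letters-only rule can be verified up front with a small shell check. The helper below is a sketch based on the stated naming rule, not part of `initialise-repo.sh`:

```shell
#!/bin/bash
# Sketch: check that a project or environment name contains letters only.
# This helper is illustrative and not part of the toolkit.
is_letters_only() {
    case "$1" in
        ''|*[!a-zA-Z]*) return 1 ;;  # reject empty strings and any non-letter
        *)              return 0 ;;
    esac
}

is_letters_only "myproject" && echo "ok"        # accepted
is_letters_only "my-project" || echo "rejected" # hyphen is not a letter
```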
> **Important**: All the repositories have to be located within the same folder. This folder is referred to as `BASE_DIR`.

> **Note**: If any of these repositories already exist within the same folder, they will not be overwritten. The idea is to run the script in a fresh/clean base dir, have the script create the repos and then push them to the central git server.

If you run the script without arguments, the expected usage pattern will be displayed:

```bash
$ ./initialise-repo.sh
```

> **Important**: As of version 7, Pentaho officially only supports the **file-based** approach and the Jackrabbit/Pentaho Server repository. For this reason I strongly recommend that you use the **file-based** approach for development. In production you can either do the same or import the files into the Jackrabbit/Pentaho Server repository. Also note that the default PDI modules, which are sourced by `initialise-repo.sh`, are to be used with a file-based approach only.

### Example

Creating a new **project** called `myproject` with **common artefacts** for the `dev` **environment** using a PDI file-based **storage approach**:

```bash
$ sh ./initialise-repo.sh -a 1 -g myproject -p mpr -e dev -s file-based
```

This will create a folder called `myproject` in the current directory, which will hold all the other git repositories.

Once this is in place, most settings should be set automatically; however, double-check the following files and amend them if required:

- `common-config-[ENV]/pdi/.kettle/repositories.xml` (only when using repo storage mode)
- `common-config-[ENV]/pdi/shell-scripts/set-env-variables.sh`: Adjust `PDI_HOME` and `LOG_DIR`.
- `myproject-config-[ENV]/pdi/shell-scripts/wrapper.sh`: Changes are only required in the `PROJECT-SPECIFIC CONFIGURATION PROPERTIES` section.
- `myproject-config-[ENV]/pdi/shell-scripts/run_jb_[PROJECT]_master.sh`: Adjust the path to the main PDI job (once it exists).

If you are setting this up on your local workstation, you should be able to start Spoon now and connect to the PDI repository.

> **Note**: Pay attention to the console output while running the script. There should be a line at the end saying how you can initialise the essential environment variables. You have to run this command before starting Spoon!

As the next step you might want to adjust:

- `common-config-[ENV]/pdi/.kettle/kettle.properties`
- `common-config-[ENV]/pdi/.kettle/shared.xml` (only when using file-based storage mode)
- `myproject-config-[ENV]/pdi/properties/myproject.properties`
- `myproject-config-[ENV]/pdi/properties/jb_myproject_master.properties`

Don't forget to commit all these changes. You will also have to set the Git remote for these repositories.

### Example: Setting up various environments

Change to the directory where the **Pentaho Standardised Git Repo Setup** repo is located.

We will first create a shell variable called `PSGRS_HOME` to store the location of the **Pentaho Standardised Git Repo Setup** repo:

```bash
$ export PSGRS_HOME=`pwd`
$ echo $PSGRS_HOME
/home/dsteiner/git/pentaho-standardised-git-repo-setup
# your location will be different
```

Let's now change to a convenient directory where we can create our new project repos. We will create the dev environment setup, using a file-based approach.
Note that we use the action switch `1`, which will create the full series of required repos to facilitate the setup:

```bash
# I chose this dir, you might want to choose a different one
$ cd ~/git
$ $PSGRS_HOME/initialise-repo.sh -a 1 -g myproject -p mpr -e dev -s file-repo
```

At the end of the log output you will find a message like this (the path will vary):

```
Before using Spoon, source this file:
source /home/dsteiner/git/myproject/common-config-dev/pdi/shell-scripts/set-env-variables.sh
===============================
```

Execute this command. It makes sure that the essential variables are set so that PDI Spoon can pick them up, provided Spoon is started within the same shell window.

You should have the following directories now:

```bash
$ ll myproject/
total 20
drwxrwxr-x. 4 dsteiner dsteiner 4096 Feb  6 23:02 common-config-dev
drwxrwxr-x. 3 dsteiner dsteiner 4096 Feb  6 23:02 common-documentation
drwxrwxr-x. 6 dsteiner dsteiner 4096 Feb  6 23:02 mpr-code
drwxrwxr-x. 6 dsteiner dsteiner 4096 Feb  6 23:02 mpr-config-dev
drwxrwxr-x. 3 dsteiner dsteiner 4096 Feb  6 23:02 mpr-documentation
```

Each of these folders is a dedicated git repo. It is recommended that you create equivalent repos (with the same names) on your central **Git server** (GitLab, Bitbucket, GitHub, etc). Once you do this, the server usually shows commands for setting up those repos locally; one of the command sections shown is typically the one for linking an **existing** local repo with your online/central one. Use these commands for each of your local repos. In a nutshell: we are linking our existing local repos with the remote/central repos.

There are a few config settings that you can or should adjust at this point, as mentioned in the previous example section.
E.g.:

- `common-config-[ENV]/pdi/shell-scripts/set-env-variables.sh`: Adjust `PDI_HOME` and `LOG_DIR`.

Next, within the same terminal window, navigate to the directory where the **PDI client** is installed and start **Spoon**:

```bash
$ sh ./spoon.sh
```

In our case we chose a file repo, so within **Spoon** go to **Tools > Repository > Connect**, choose the file repository and point it to `myproject/mpr-code/pdi/repo`. Any new jobs and transformations should be stored in this repo in the `mpr` folder. You must not change any files within the `modules` folder. Treat it as a read-only folder! Call your main job `jb_mpr_master`.

You can execute your main job via:

```
myproject/mpr-config-dev/pdi/shell-scripts/run_jb_mpr_master.sh
```

Let's simulate another environment called `test` now. For this we have to create an additional set of configs:

- `common-config-test`
- `mpr-config-test`

We use the specific action parameters `common_config` and `project_config` to create these:

```bash
$ $PSGRS_HOME/initialise-repo.sh -a common_config -g myproject -p mpr -e test -s file-repo
$ $PSGRS_HOME/initialise-repo.sh -a project_config -g myproject -p mpr -e test -s file-repo
```

We should now see the new repos:

```bash
$ ll myproject/
total 28
drwxrwxr-x. 4 dsteiner dsteiner 4096 Feb  6 23:02 common-config-dev
drwxrwxr-x. 4 dsteiner dsteiner 4096 Feb  6 23:22 common-config-test
drwxrwxr-x. 3 dsteiner dsteiner 4096 Feb  6 23:02 common-documentation
drwxrwxr-x. 6 dsteiner dsteiner 4096 Feb  6 23:02 mpr-code
drwxrwxr-x. 6 dsteiner dsteiner 4096 Feb  6 23:02 mpr-config-dev
drwxrwxr-x. 6 dsteiner dsteiner 4096 Feb  6 23:22 mpr-config-test
drwxrwxr-x.
3 dsteiner dsteiner 4096 Feb  6 23:02 mpr-documentation
```

Again, create the respective remote/central repos and link them up.

Adjust the config files to match the new environment. Among those files is:

```
myproject/mpr-config-test/pdi/properties/jb_mpr_master.properties
```

You can execute your main job for the new `test` environment via:

```
myproject/mpr-config-test/pdi/shell-scripts/run_jb_mpr_master.sh
```

Note how easy it is to switch environments: you just pick the respective config folder and everything else stays the same!

### How to execute your PDI jobs

You must execute your jobs via `pdi/shell-scripts/run_jb_[PROJECT]_master.sh`, which is located in the environment-specific project git repo. The reason: this shell script first calls `wrapper.sh`, which sets up all relevant environment variables; `wrapper.sh` in turn calls a wrapper PDI job from the PDI modules, which sets the project- and job-specific properties and only then calls the job you asked to be executed.

You cannot achieve exactly the same behaviour from within Spoon. However, for testing it should be possible to execute your job via the module wrapper job in Spoon as well.

> **Important**: Do not change the module wrapper job! It should be treated as read-only! Supply parameter values via the **Run** dialog.

# Code Repository

## What is NOT Code

- **Configuration**: Goes into a dedicated config repo per environment.
- **Documentation**: Goes into a dedicated docu repo.
- **Data**:
  - Lookup data: e.g. business users provide you with lookup data to enrich operational data. This should be stored separately.
  - Test data: Can be stored with your code, since it serves the purpose of testing the quality of your code.
- **Binary files**: Excel, Word, Zip files etc.

--------------------------------------------------------------------------------
/artefacts/common-config/set-env-variables.sh:
--------------------------------------------------------------------------------
#!/bin/bash

# PENTAHO GLOBAL ENV VARIABLES
# Note:
# If you nest the common config folder within a project-specific parent
# folder on deployment (in which case it is not really shared code),
# make sure that you include the project name variable in the path below.
# The project name variable ${PROJECT_NAME} is set in wrapper.sh
# before this script is called, so it will be available!
# To be clear, this only applies to the following scenario:
#
# myproject
#     common-config-prod
#     myproject-code
#     myproject-config-prod
#
# which is in contrast to the normal deployment:
#
# common-config-prod
# myproject
#     myproject-code
#     myproject-config-prod
#
export KETTLE_HOME=${PSGRS_KETTLE_HOME}
export PDI_DIR=${PSGRS_PDI_DIR}
#export LOG_DIR=${PSGRS_LOG_DIR}

--------------------------------------------------------------------------------
/artefacts/git/hooks/pre-commit:
--------------------------------------------------------------------------------
#!/bin/bash
SHELL_DIR=$(dirname $0)
source ${SHELL_DIR}/settings.sh
# check whether it is a config or code repository
IS_CONFIG={{ IS_CONFIG }}
# IS_CONFIG=N
# check whether it is a file-based or repo-based PDI setup
IS_REPO_BASED={{ IS_REPO_BASED }}
#IS_REPO_BASED=N
TEST_FAILED=" - Test - [\e[31m FAILED\e[39m ]"
TEST_PASSED=" - Test - [\e[32m PASSED\e[39m ]"
GIT_ROOT_DIR=`git rev-parse --show-toplevel`
GIT_PDI_DIR=/pdi/repo
declare -i STAT=0

# color scales: http://misc.flogisoft.com/bash/tip_colors_and_formatting

# utility functions
function print_header {
    echo -e "\e[34m\e[47mRunning $@ Test\e[0m"
}

function print_failed {
    echo -e $@ ${TEST_FAILED}
}

function print_passed {
    echo -e $@ ${TEST_PASSED}
}

# it would be good to have a switch here in case we have to check a legacy project,
# in which case we have to go through all files, not just the new ones
cd ${GIT_ROOT_DIR}
if [ ${PSGRS_GIT_HOOKS_CHECKS_NEW_FILES_ONLY} = "Y" ]; then
    FILE_LIST=`git diff --cached --name-only`
else
    # get list of files known by git and exclude hidden files
    FILE_LIST=`git ls-files | grep -vE "^\..+"`
    # FILE_LIST=`find . -path ./.git -prune -o -print`
fi

# note the grouping parentheses: without them the implicit AND binds tighter
# than -o, so directories matching '*kjb' would slip through
PDI_KTR_AND_KJB_FILES=$(find pdi ! -type d \( -name '*ktr' -o -name '*kjb' \))

#######################################
## FILE NAME CHECKS
#######################################

# Check for non ASCII filenames
function check_for_non_ascii_filenames {

    TEST_NAME="Non ASCII Filenames"
    print_header ${TEST_NAME}

    # create array to hold matched values
    FILE_NAMES_WITH_NON_ASCII_CHARACTERS=()
    # create counter for matched values
    declare -i NUMBER_OF_FILE_NAMES_WITH_NON_ASCII_CHARACTERS=0

    # printable range starts at space character and ends with tilde
    # Note that the brackets around the tr range are required for portability
    # to Solaris's /usr/bin/tr.
# The square bracket bytes happen to fall in
    # the designated space

    for FILE in ${FILE_LIST}
    do
        FILE_MATCH=$(echo ${FILE} | LC_ALL=C tr -d '[ -~]\0')
        if [ -n "${FILE_MATCH}" ]
        then
            FILE_NAMES_WITH_NON_ASCII_CHARACTERS+=(${FILE_MATCH})
            NUMBER_OF_FILE_NAMES_WITH_NON_ASCII_CHARACTERS+=1
        fi
    done

    echo "Number of non ASCII file names: ${NUMBER_OF_FILE_NAMES_WITH_NON_ASCII_CHARACTERS}"

    # error message if unacceptable files present
    if [ ${NUMBER_OF_FILE_NAMES_WITH_NON_ASCII_CHARACTERS} -gt 0 ]; then
        print_failed ${TEST_NAME}
        echo -e "\e[93mPlease only use ASCII characters for filenames!"
        echo -e "Following filenames contain non ASCII characters:"
        echo -e "${FILE_NAMES_WITH_NON_ASCII_CHARACTERS}\n"
        STAT=1
    else
        print_passed ${TEST_NAME}
    fi
    return ${STAT}

}

function check_for_paths_with_whitespaces {

    TEST_NAME="Whitespaces in File Paths"
    print_header ${TEST_NAME}

    # create array to hold matched values
    PATHS_WITH_WHITESPACES=()
    # create counter for matched values
    declare -i NUMBER_OF_PATHS_WITH_WHITESPACES=0

    MY_FILE_LIST=$(find . \
        -path pdi/.meta -prune -o -print)

    for FILE in ${MY_FILE_LIST}
    do
        FILE_MATCH=$(echo ${FILE} | grep " ")
        if [ -n "${FILE_MATCH}" ]
        then
            PATHS_WITH_WHITESPACES+=(${FILE_MATCH})
            NUMBER_OF_PATHS_WITH_WHITESPACES+=1
        fi
    done

    echo "Number of paths with whitespaces: ${NUMBER_OF_PATHS_WITH_WHITESPACES}"

    # error message if unacceptable files present
    if [ ${NUMBER_OF_PATHS_WITH_WHITESPACES} -gt 0 ]; then
        print_failed ${TEST_NAME}
        echo -e "\e[93mPlease remove whitespaces!\e[39m"
        echo -e "Following paths contain whitespaces:"
        echo -e "${PATHS_WITH_WHITESPACES}\n"
        STAT=1
    else
        print_passed ${TEST_NAME}
    fi
    return ${STAT}
}

# Check new files meet naming convention
function check_new_pdi_files_meet_naming_convention {

    TEST_NAME="Naming Convention"
    print_header ${TEST_NAME}

    # create array to hold matched values
    UNACCEPTABLE_PDI_FILES=()
    # create counter for matched values
    declare -i NUMBER_OF_UNACCEPTABLE_PDI_FILES=0

    # Get a list of new files to be pushed
    # Check which transformations don't match with the following patterns:
    # jb_filename.kjb, tr_filename.ktr

    for FILE in ${FILE_LIST}
    do
        FILE_MATCH=$(echo ${FILE} | grep -Ev "${PSGRS_PDI_ACCEPTED_JOB_OR_TRANSFORMATION_NAME}" | grep -E "ktr$|kjb$")
        if [ -n "${FILE_MATCH}" ]
        then
            UNACCEPTABLE_PDI_FILES+=(${FILE_MATCH})
            NUMBER_OF_UNACCEPTABLE_PDI_FILES+=1
        fi
    done

    echo "Number of unacceptable new PDI file names: ${NUMBER_OF_UNACCEPTABLE_PDI_FILES}"

    # error message if unacceptable files present
    if [ ${NUMBER_OF_UNACCEPTABLE_PDI_FILES} -gt 0 ]; then
        print_failed ${TEST_NAME}
        echo -e "\e[93mPlease only use alphanumeric filenames!"
        echo -e "Jobs/Transformations must follow the pattern:"
        echo -e " - tr_filename.ktr"
        echo -e " - jb_filename.kjb\e[39m"
        echo -e "\e[93mThe following filename(s) do not meet filename conventions:\e[39m"
        echo -e "${UNACCEPTABLE_PDI_FILES}\n"
        STAT=1
    else
        print_passed ${TEST_NAME}
    fi
    return ${STAT}
}

# Check new files match accepted file types
function check_supported_file_type {

    TEST_NAME="Accepted File Type"
    print_header ${TEST_NAME}

    # create array to hold matched values
    NOT_SUPPORTED_NEW_FILES=()
    # create counter for matched values
    declare -i NUMBER_OF_NOT_SUPPORTED_NEW_FILES=0

    # Get a list of new files to be pushed
    # Check if they match a list of accepted file extensions
    # Keep non matching ones
    for FILE in ${FILE_LIST}
    do
        if [ ${IS_CONFIG} = "Y" ]
        then
            FILE_MATCH=$(echo ${FILE} | grep --invert-match -E "${PSGRS_GIT_CONFIG_REPO_ACCEPTED_FILE_EXTENSIONS_REGEX}")
        else
            FILE_MATCH=$(echo ${FILE} | grep --invert-match -E "${PSGRS_GIT_CODE_REPO_ACCEPTED_FILE_EXTENSIONS_REGEX}")
        fi

        if [ -n "${FILE_MATCH}" ]
        then
            NOT_SUPPORTED_NEW_FILES+=(${FILE_MATCH})
            NUMBER_OF_NOT_SUPPORTED_NEW_FILES+=1
        fi
    done

    echo "Number of files with not supported file types: ${NUMBER_OF_NOT_SUPPORTED_NEW_FILES}"

    # error message if unacceptable files present
    if [ ${NUMBER_OF_NOT_SUPPORTED_NEW_FILES} -gt 0 ]; then
        print_failed ${TEST_NAME}
        echo -e "\e[93mThe following filename(s) do not match the list of accepted file extensions:\e[39m"
        echo -e "${NOT_SUPPORTED_NEW_FILES}\n"
        STAT=1
    else
        print_passed ${TEST_NAME}
    fi
    return ${STAT}
}

function check_filenames_are_lower_case {

    TEST_NAME="All File And Folder Names Are Lower Case And
Without Whitespaces"
    print_header ${TEST_NAME}

    # create array to hold matched values
    FILES_WITH_UPPER_CASES_OR_WHITESPACES=()
    # create counter for matched values
    declare -i NUMBER_OF_FILES_WITH_UPPER_CASES_OR_WHITESPACES=0

    for FILE in ${FILE_LIST}
    do
        # note: the hyphen is placed last in the bracket expression so it is treated literally
        FILE_MATCH=$(echo ${FILE} | grep --invert-match -E "^[a-z0-9_./-]+$")
        if [ -n "${FILE_MATCH}" ]
        then
            FILES_WITH_UPPER_CASES_OR_WHITESPACES+=(${FILE_MATCH})
            NUMBER_OF_FILES_WITH_UPPER_CASES_OR_WHITESPACES+=1
        fi
    done

    echo "Number of files with upper case letters and/or whitespaces: ${NUMBER_OF_FILES_WITH_UPPER_CASES_OR_WHITESPACES}"

    # error message if unacceptable files present
    if [ ${NUMBER_OF_FILES_WITH_UPPER_CASES_OR_WHITESPACES} -gt 0 ]; then
        print_failed ${TEST_NAME}
        echo -e "\e[93mPlease only use lower case letters in filenames and avoid whitespaces!"
        echo -e "Following filenames contain upper case letters and/or whitespaces:"
        echo -e "${FILES_WITH_UPPER_CASES_OR_WHITESPACES}\n"
        STAT=1
    else
        print_passed ${TEST_NAME}
    fi
    return ${STAT}

}

# File or folder names do not contain `dev`, `test`, `beta`, `new`, `v[0-9]{1}`
# this counteracts not using dedicated feature branches
# doing this check at the end because by now every filename should be lower case
function check_filenames_for_forbidden_keywords {

    TEST_NAME="File And Folder Names Do Not Contain ${PSGRS_FILE_OR_FOLDER_NAME_FORBIDDEN_KEYWORD}"
    print_header ${TEST_NAME}

    # create array to hold matched values
    FILES_WITH_FORBIDDEN_KEYWORD=()
    # create counter for matched values
    declare -i NUMBER_OF_FILES_WITH_FORBIDDEN_KEYWORD=0

    for FILE in ${FILE_LIST}
    do
        FILE_MATCH=$(echo ${FILE} | grep -E "${PSGRS_FILE_OR_FOLDER_NAME_FORBIDDEN_KEYWORD}")
        if [ -n \
"${FILE_MATCH}" ]
        then
            FILES_WITH_FORBIDDEN_KEYWORD+=(${FILE_MATCH})
            NUMBER_OF_FILES_WITH_FORBIDDEN_KEYWORD+=1
        fi
    done

    echo "Number of files with forbidden keywords: ${NUMBER_OF_FILES_WITH_FORBIDDEN_KEYWORD}"

    # error message if unacceptable files present
    if [ ${NUMBER_OF_FILES_WITH_FORBIDDEN_KEYWORD} -gt 0 ]; then
        print_failed ${TEST_NAME}
        echo -e "\e[93mPlease stop using dev, test, beta, new, v[0-9]{1} in your filenames!"
        echo -e "It gives the impression you are not using a dedicated feature branch"
        echo -e "Following filenames contain dev, test, beta, new, v[0-9]{1}:"
        echo -e "${FILES_WITH_FORBIDDEN_KEYWORD}\n"
        STAT=1
    else
        print_passed ${TEST_NAME}
    fi
    return ${STAT}

}

#######################################
## CONTENT CHECKS
#######################################

# Check for jobs and transformations whose repository path does not match their filesystem path
# applies to file-repo setup only
function check_repo_path {

    TEST_NAME="Check Repo Paths"
    print_header ${TEST_NAME}
    # cd ${GIT_ROOT_DIR}${GIT_PDI_DIR}

    MY_FILES=$(echo ${PDI_KTR_AND_KJB_FILES})

    declare -i COUNTER=0

    # change internal file separator so that spaces in file and folder names
    # don't cause havoc
    # SAVEIFS=${IFS}
    # IFS=$(echo -en "\n\b")

    for MY_FILE in ${MY_FILES}
    do
        FOLDER=$(dirname ${MY_FILE})
        # get rid of the `pdi/repo/` starting bit
        FOLDER=`echo ${FOLDER} | cut -c 9-`
        # `grep -m 1` replaced with `grep ... | head -n1` since some older systems do not support the first option
        REPO_FOLDER=$(grep '' $MY_FILE | head -n1 | \
            sed -e 's| *||g' \
                -e 's|/|/|g' \
                -e 's|.*||g')

        if [[ ! \
"${FOLDER}" = "${REPO_FOLDER}" ]]; then 338 | echo "${MY_FILE} -> FILESYSTEM: ${FOLDER} -> REPO: ${REPO_FOLDER}" 339 | STAT=1 340 | COUNTER=COUNTER+1 341 | fi 342 | done 343 | 344 | # restore IFS 345 | # IFS=${SAVEIFS} 346 | 347 | if [ ${COUNTER} -gt 0 ]; then 348 | print_failed ${TEST_NAME} 349 | else 350 | print_passed ${TEST_NAME} 351 | fi 352 | 353 | return ${STAT} 354 | } 355 | 356 | 357 | # Check for PDI files with hardcoded IP addresses 358 | # grep -Er "[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}" . 359 | function check_hardcoded_ip { 360 | 361 | TEST_NAME="Check Hardcoded IP" 362 | print_header ${TEST_NAME} 363 | 364 | if [ -d "${GIT_ROOT_DIR}${GIT_PDI_DIR}" ] 365 | then 366 | 367 | cd ${GIT_ROOT_DIR}${GIT_PDI_DIR} 368 | 369 | # create array to hold matched values 370 | FILES_WITH_HARDCODED_IP=() 371 | # create counter for matched values 372 | declare -i NUMBER_OF_FILES_WITH_HARDCODED_IP=0 373 | 374 | 375 | FILES_WITH_HARDCODED_IP=$(grep -Erl "${PSGRS_PDI_IP_ADDRESS_REGEX}") 376 | # if the list is not empty ... 377 | if [ ! 
-z "${FILES_WITH_HARDCODED_IP}" ]; then 378 | NUMBER_OF_FILES_WITH_HARDCODED_IP=$(echo ${FILES_WITH_HARDCODED_IP} | wc -w ) 379 | fi 380 | 381 | echo "Number of files PDI files with hardcoded IP addresses: ${NUMBER_OF_FILES_WITH_HARDCODED_IP}" 382 | 383 | # error message if unacceptable files present 384 | if [ ${NUMBER_OF_FILES_WITH_HARDCODED_IP} -gt 0 ]; then 385 | print_failed ${TEST_NAME} 386 | echo -e "\e[93mThe following filename(s) do have hard code IP addresses:\e[39m" 387 | echo -e "${FILES_WITH_HARDCODED_IP}\n" 388 | STAT=1 389 | else 390 | print_passed ${TEST_NAME} 391 | fi 392 | return ${STAT} 393 | 394 | fi 395 | } 396 | 397 | # Check for PDI files with hardcoded domain name 398 | function check_hard_coded_domain_name { 399 | 400 | TEST_NAME="Check Domain Name" 401 | print_header ${TEST_NAME} 402 | 403 | 404 | if [ -d "${GIT_ROOT_DIR}${GIT_PDI_DIR}" ] 405 | then 406 | 407 | cd ${GIT_ROOT_DIR}${GIT_PDI_DIR} 408 | 409 | FILES_WITH_HARDCODED_DOMAIN_NAME=() 410 | declare -i NUMBER_OF_FILES_WITH_HARDCODED_DOMAIN_NAME=0 411 | # older versions of PDI seems to escape colon and slashes in xml, newer versions not 412 | # we also make sure that you can define something like this: http://${VAR_NAME} 413 | FILES_WITH_HARDCODED_DOMAIN_NAME=$(grep -Erl "${PSGRS_PDI_DOMAIN_NAME_REGEX}" . ) 414 | # if the list is not empty ... 415 | if [ ! 
-z "${FILES_WITH_HARDCODED_DOMAIN_NAME}" ]; then 416 | NUMBER_OF_FILES_WITH_HARDCODED_DOMAIN_NAME=$(echo ${FILES_WITH_HARDCODED_DOMAIN_NAME} | wc -w ) 417 | fi 418 | 419 | echo "Number of PDI files with hardcoded domain names: ${NUMBER_OF_FILES_WITH_HARDCODED_DOMAIN_NAME}" 420 | 421 | # error message if unacceptable files present 422 | if [ ${NUMBER_OF_FILES_WITH_HARDCODED_DOMAIN_NAME} -gt 0 ]; then 423 | print_failed ${TEST_NAME} 424 | echo -e "\e[93mThe following filename(s) do have hard code Domain Names:\e[39m" 425 | echo -e "${FILES_WITH_HARDCODED_DOMAIN_NAME}\n" 426 | STAT=1 427 | else 428 | print_passed ${TEST_NAME} 429 | fi 430 | return ${STAT} 431 | else 432 | echo "Skipping test since no PDI repo present ..." 433 | fi 434 | } 435 | 436 | 437 | # Check connection references do match a predefined list 438 | function check_referenced_database_are_part_of_the_project { 439 | 440 | TEST_NAME="Check Referenced Databases Are Part of the Project" 441 | print_header ${TEST_NAME} 442 | 443 | if [ -d "${GIT_ROOT_DIR}${GIT_PDI_DIR}" ] 444 | then 445 | 446 | cd ${GIT_ROOT_DIR}${GIT_PDI_DIR} 447 | 448 | FILES_WITH_DB_CONNECTION_REFERENCE=$(grep -Erl ".*.*?.*" --include="*.kjb" --include="*.ktr" --include="*.kdb" . 
|| (echo "")) 449 | 450 | declare -i FILES_WITH_DB_CONNECTION_REFERENCE_COUNTER=0 451 | FILES_WITH_INVALID_DB_CONNECTION_REFERENCE=() 452 | 453 | # change internal file separator so that spaces in file and folder names 454 | # dont cause havoc 455 | SAVEIFS=${IFS} 456 | IFS=$(echo -en "\n\b") 457 | 458 | for FILE_WITH_DB_CONNECTION_REFERENCE in ${FILES_WITH_DB_CONNECTION_REFERENCE} 459 | do 460 | # allowed database names specified at the end here - should be externalised in the long term 461 | CONNECTIONS=$(grep -E ".*.*?.*" ${FILE_WITH_DB_CONNECTION_REFERENCE} | cut -d'>' -f2 | cut -d'<' -f1 | sort -u | grep -v "") 462 | LOCAL_COUNTER=$(echo ${CONNECTIONS} | grep -cvE "${PSGRS_PDI_ACCEPTED_DATABASE_CONNECTION_NAMES_REGEX}") 463 | if [ ${LOCAL_COUNTER} -gt 0 ]; then 464 | FILES_WITH_INVALID_DB_CONNECTION_REFERENCE+=(${FILE_WITH_DB_CONNECTION_REFERENCE}) 465 | FILES_WITH_DB_CONNECTION_REFERENCE_COUNTER+=1 466 | 467 | echo "${FILE_WITH_DB_CONNECTION_REFERENCE} has a non-project specific connection: " 468 | echo ${CONNECTIONS} | grep -vE "${PSGRS_PDI_ACCEPTED_DATABASE_CONNECTION_NAMES_REGEX}" 469 | 470 | fi 471 | done 472 | 473 | # restore IFS 474 | IFS=${SAVEIFS} 475 | 476 | 477 | echo "Number of files with non-project specific database connection references: ${FILES_WITH_DB_CONNECTION_REFERENCE_COUNTER}" 478 | # echo "Number of unacceptable files: ${#FILES_WITH_INVALID_DB_CONNECTION_REFERENCE[*]}" 479 | # error message if unacceptable files present 480 | if [ ${FILES_WITH_DB_CONNECTION_REFERENCE_COUNTER} -gt 0 ]; then 481 | print_failed ${TEST_NAME} 482 | echo -e "\e[93mThe following filename(s) do have invalid DB references:\e[39m" 483 | 484 | 485 | for ITEM in ${FILES_WITH_INVALID_DB_CONNECTION_REFERENCE[*]} 486 | do 487 | echo -e "${ITEM}" 488 | done 489 | STAT=1 490 | else 491 | print_passed ${TEST_NAME} 492 | fi 493 | return ${STAT} 494 | 495 | else 496 | echo "Skipping test since no PDI repo present ..." 
497 | fi 498 | } 499 | 500 | 501 | function check_parameters_and_variables_follow_naming_convention { 502 | 503 | TEST_NAME="Parameters and Variables Follow Naming Convention" 504 | print_header ${TEST_NAME} 505 | 506 | if [ -d "${GIT_ROOT_DIR}${GIT_PDI_DIR}" ] 507 | then 508 | 509 | cd ${GIT_ROOT_DIR}${GIT_PDI_DIR} 510 | 511 | FILES_WITH_PARAMETERS=$(grep -Erl "\\$\{" . || (echo "")) 512 | 513 | declare -i FILES_WITH_PARAMETERS_COUNTER=0 514 | FILE_WITH_INVALID_PARAMETERS=() 515 | 516 | # change internal file separator so that spaces in file and folder names 517 | # don't cause havoc 518 | SAVEIFS=${IFS} 519 | IFS=$(echo -en "\n\b") 520 | 521 | for FILE_WITH_PARAMETERS in ${FILES_WITH_PARAMETERS} 522 | do 523 | # allowed parameter names specified at the end here - should be externalised in the long term 524 | PARAMS_LIST=$(grep -E "\\$\{" ${FILE_WITH_PARAMETERS} | cut -d'{' -f2 | cut -d'}' -f1) 525 | LOCAL_COUNTER=$(echo "${PARAMS_LIST}" | grep -Evc "${PSGRS_PDI_ACCEPTED_PARAMETER_OR_VARIABLE_NAME}") 526 | if [ ${LOCAL_COUNTER} -gt 0 ]; then 527 | FILE_WITH_INVALID_PARAMETERS+=(${FILE_WITH_PARAMETERS}) 528 | FILES_WITH_PARAMETERS_COUNTER+=1 529 | 530 | echo "${FILE_WITH_PARAMETERS} has an invalid parameter or variable name: " 531 | echo "${PARAMS_LIST}" | grep -Ev "${PSGRS_PDI_ACCEPTED_PARAMETER_OR_VARIABLE_NAME}" 532 | 533 | fi 534 | done 535 | 536 | # restore IFS 537 | IFS=${SAVEIFS} 538 | 539 | 540 | echo "Number of files with invalid parameter names: ${FILES_WITH_PARAMETERS_COUNTER}" 541 | # echo "Number of unacceptable files: ${#FILE_WITH_INVALID_PARAMETERS[*]}" 542 | # error message if unacceptable files present 543 | if [ ${FILES_WITH_PARAMETERS_COUNTER} -gt 0 ]; then 544 | print_failed ${TEST_NAME} 545 | echo -e "\e[93mThe following file(s) do not have valid parameter names:\e[39m" 546 | 547 | 548 | for ITEM in ${FILE_WITH_INVALID_PARAMETERS[*]} 549 | do 550 | echo -e "${ITEM}" 551 | done 552 | STAT=1 553 | else 554 | print_passed ${TEST_NAME} 555 | fi 556 | return 
${STAT} 557 | 558 | else 559 | echo "Skipping test since no PDI repo present ..." 560 | fi 561 | 562 | } 563 | 564 | 565 | ####################################### 566 | ## GIT REPO CHECKS 567 | ####################################### 568 | 569 | function check_supported_branch_name { 570 | 571 | TEST_NAME="Supported Branch Name" 572 | print_header ${TEST_NAME} 573 | 574 | cd ${GIT_ROOT_DIR} 575 | 576 | declare -i VALID_BRANCH_NAME_COUNT=0 577 | 578 | # collect the local branch names 579 | BRANCH_NAME=`git branch | while read STAR BRANCH; do echo $BRANCH; done` 580 | 581 | INVALID_BRANCH_NAMES_COUNT=$(echo "${BRANCH_NAME}" | grep -Evc "${PSGRS_GIT_REPO_ACCEPTED_BRANCH_NAMES_REGEX}") 582 | INVALID_BRANCH_NAMES=$(echo "${BRANCH_NAME}" | grep -Ev "${PSGRS_GIT_REPO_ACCEPTED_BRANCH_NAMES_REGEX}") 583 | 584 | 585 | # error message if unacceptable branch names present 586 | if [ ${INVALID_BRANCH_NAMES_COUNT} -gt 0 ]; then 587 | print_failed ${TEST_NAME} 588 | echo -e "\e[93mThe following branch names do not match the list of accepted ones:\e[39m" 589 | echo -e ${INVALID_BRANCH_NAMES} 590 | echo -e "Accepted Branch Names: ${PSGRS_GIT_REPO_ACCEPTED_BRANCH_NAMES_REGEX}" 591 | STAT=1 592 | else 593 | print_passed ${TEST_NAME} 594 | fi 595 | return ${STAT} 596 | } 597 | 598 | # Check that defined database connections match a predefined list 599 | function check_defined_database_connections_are_part_of_the_project { 600 | 601 | TEST_NAME="Check Defined Databases Are Part of the Project" 602 | print_header ${TEST_NAME} 603 | 604 | if [ -d "${GIT_ROOT_DIR}${GIT_PDI_DIR}" ] 605 | then 606 | 607 | cd ${GIT_ROOT_DIR}${GIT_PDI_DIR} 608 | 609 | # get connection references that do not match a predefined list 610 | # grep -Erl "" -A+1 . | grep "" | cut -d'>' -f2 | cut -d'<' -f1 | grep -cvE "hive_generic|impala_generic|mysql_process_control" 611 | # grep -Er ".*?" . 612 | 613 | FILES_WITH_DB_CONNECTION_DEFINITION=$(grep -Erl "*.*." 
|| (echo "")) 614 | 615 | declare -i FILES_WITH_DB_CONNECTION_DEFINITION_COUNTER=0 616 | FILES_WITH_INVALID_DB_CONNECTION_DEFINITION=() 617 | 618 | # change internal file separator so that spaces in file and folder names 619 | # don't cause havoc 620 | SAVEIFS=${IFS} 621 | IFS=$(echo -en "\n\b") 622 | 623 | for FILE_WITH_DB_CONNECTION_REFERENCE in ${FILES_WITH_DB_CONNECTION_DEFINITION} 624 | do 625 | LOCAL_COUNTER=$(grep -E "*.*." -A+1 ${FILE_WITH_DB_CONNECTION_REFERENCE} | grep "*.*." | cut -d'>' -f2 | cut -d'<' -f1 | sort -u | grep -cvE "hive_generic|impala_generic|mysql_process_control") 626 | if [ ${LOCAL_COUNTER} -gt 0 ]; then 627 | FILES_WITH_INVALID_DB_CONNECTION_DEFINITION+=(${FILE_WITH_DB_CONNECTION_REFERENCE}) 628 | FILES_WITH_DB_CONNECTION_DEFINITION_COUNTER+=1 629 | fi 630 | done 631 | 632 | # restore IFS 633 | IFS=${SAVEIFS} 634 | 635 | 636 | echo "Number of files with project unrelated database definitions: ${FILES_WITH_DB_CONNECTION_DEFINITION_COUNTER}" 637 | # echo "Number of unacceptable files: ${#FILES_WITH_INVALID_DB_CONNECTION_DEFINITION[*]}" 638 | # error message if unacceptable files present 639 | if [ ${FILES_WITH_DB_CONNECTION_DEFINITION_COUNTER} -gt 0 ]; then 640 | print_failed ${TEST_NAME} 641 | echo -e "\e[93mThe following file(s) have invalid DB connection definitions:\e[39m" 642 | 643 | 644 | for ITEM in ${FILES_WITH_INVALID_DB_CONNECTION_DEFINITION[*]} 645 | do 646 | echo -e "${ITEM}" 647 | done 648 | STAT=1 649 | else 650 | print_passed ${TEST_NAME} 651 | fi 652 | return ${STAT} 653 | 654 | else 655 | echo "Skipping test since no PDI repo present ..." 
656 | fi 657 | } 658 | 659 | 660 | ####################################### 661 | ## BUNDLE CHECKS 662 | ####################################### 663 | 664 | if [ ${IS_CONFIG} = "Y" ]; then 665 | # Checks to only run for config repos 666 | check_supported_file_type 667 | check_for_non_ascii_filenames 668 | else 669 | # Checks to only run for code repos 670 | check_for_paths_with_whitespaces 671 | check_supported_file_type 672 | check_for_non_ascii_filenames 673 | check_new_pdi_files_meet_naming_convention 674 | check_hardcoded_ip 675 | check_hard_coded_domain_name 676 | check_defined_database_connections_are_part_of_the_project 677 | check_referenced_database_are_part_of_the_project 678 | check_parameters_and_variables_follow_naming_convention 679 | check_filenames_are_lower_case 680 | check_filenames_for_forbidden_keywords 681 | check_supported_branch_name 682 | if [ ${IS_REPO_BASED} = "Y" ]; then 683 | check_repo_path 684 | fi 685 | fi 686 | 687 | if [ ${STAT} -gt 0 ]; then 688 | echo "" 689 | echo -e "[\e[31m Too many errors. Not committing anything. Resolve issues. \e[39m ]" 690 | echo "" 691 | exit 1 692 | fi -------------------------------------------------------------------------------- /artefacts/git/list_latest_tags.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | # Kindly provided by Luis Silva 4 | 5 | # Script to list the latest tags for all 6 | # git repositories under this folder. 7 | 8 | cd $PWD 9 | 10 | # Header 11 | printf "%-30s|%-15s %-6s|%-11s|%-15s|%-11s\n" "REPO" "latest" "+extra" "date" "deployed" "date" 12 | printf "%-30s|%-15s-%-6s|%-11s|%-15s|%-11s\n" "------------------------------" "---------------" "------" "-----------" "---------------" "-----------" 13 | 14 | # loop over all of the sandboxes below the current folder except those that 15 | # have data in the name. 16 | for sandbox in `find . 
-type d -iname .git|grep -v data|sort` 17 | do 18 | ( 19 | # Get the simplified path of the repo 20 | folder=`readlink -f $sandbox/..` 21 | cd $folder 22 | 23 | # Extract deployed tag 24 | deployed=`git tag --sort=committerdate -l *-prod|grep -v pre-prod|tail -n1` 25 | deployed_date="" 26 | test -n "$deployed" && deployed_date=`git show -s --pretty=%ci $deployed|cut -c -10` 27 | 28 | # Extract latest tag 29 | latest=`git tag --sort=committerdate -l |tail -n1` 30 | latest_date="" 31 | test -n "$latest" && latest_date=`git show -s --pretty=%ci $latest|cut -c -10` 32 | 33 | # Check for extra commits since the latest tag. 34 | extra_commits="" 35 | # There may not be a latest tag so account for that. 36 | test -n "$latest" && extra_commits="+`git rev-list --count ^$latest HEAD`" 37 | 38 | # for each project, print all we have on it. 39 | printf "%-30s|%-15s %-6s|%-11s|%-15s|%-11s\n" "`basename $folder`" "$latest" "$extra_commits" "$latest_date" "$deployed" "$deployed_date" 40 | ) 41 | done 42 | 43 | -------------------------------------------------------------------------------- /artefacts/git/package-git-repo.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | # expects: 4 | # - GIT_DIR: location of the git repo you want to package 5 | # - VERSION: git tag 6 | # - PREFIX: folder prefix used inside the generated archive 7 | # - PACKAGE_FILE_PATH: path to and filename of package file 8 | 9 | # read GIT_DIR VERSION PREFIX PACKAGE_FILE_PATH BUILD_RPM 10 | GIT_DIR=$1 11 | VERSION=$2 12 | PREFIX=$3 13 | PACKAGE_FILE_PATH=$4 14 | BUILD_RPM="Y" 15 | 16 | echo "Following parameter values were passed:" 17 | echo "GIT_DIR: ${GIT_DIR}" 18 | echo "VERSION: ${VERSION}" 19 | echo "PREFIX: ${PREFIX}" 20 | echo "PACKAGE_FILE_PATH: ${PACKAGE_FILE_PATH}" 21 | echo "BUILD_RPM: ${BUILD_RPM}" 22 | 23 | # ----------------- 24 | # EXAMPLE USAGE: ./package-git-repo.sh /tmp/test/mysampleproj-code 0.1 mysampleproj-code /tmp/mysampleproj-code.tar Y 25 | # ----------------- 26 | 27 | 
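The script trusts that all four positional parameters are set. As a hedged sketch (the `check_args` helper below is invented for illustration and is not part of this repo), a guard like this could fail fast on a missing argument before any packaging starts:

```shell
#!/bin/bash
# Hypothetical guard, not part of the original script: refuse to continue
# when fewer than the four required positional parameters were supplied,
# instead of running on with empty values.
check_args() {
  if [ "$#" -lt 4 ]; then
    echo "Usage: package-git-repo.sh GIT_DIR VERSION PREFIX PACKAGE_FILE_PATH" >&2
    return 1
  fi
}

# With all four arguments present the check passes and the script may proceed:
check_args /tmp/test/mysampleproj-code 0.1 mysampleproj-code /tmp/mysampleproj-code.tar \
  && echo "arguments look complete"
```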
PSGRS_SHELL_DIR=$(dirname $0) 28 | 29 | source ${PSGRS_SHELL_DIR}/settings.sh 30 | 31 | cd ${GIT_DIR} 32 | 33 | # Adding version number to prefix might be a good idea since you can easily roll 34 | # back then. On target system you can use a sym link to abstract this. 35 | 36 | TAG=`git tag | grep ${VERSION}` 37 | if [ -z "${TAG}" ]; then 38 | echo "Warning: ${VERSION} does not exist. Using last commit instead." 39 | LAST_COMMIT_ID=`git log --format="%H" -n 1` 40 | TAG=${LAST_COMMIT_ID} 41 | fi 42 | 43 | git archive --prefix=${PREFIX}-${TAG}/ -o ${PACKAGE_FILE_PATH} ${TAG} 44 | 45 | # create folder to store tmp tar files if it doesn't exist already 46 | if [ ! -d "${GIT_DIR}/rpmbuild" ]; then 47 | echo "Creating tmp dir ${GIT_DIR}/rpmbuild ..." 48 | mkdir ${GIT_DIR}/rpmbuild 49 | fi 50 | 51 | echo "Following version was used for the package:" > ${GIT_DIR}/rpmbuild/MANIFEST-REPOS.md 52 | echo "main module tag/commit id: ${TAG}" >> ${GIT_DIR}/rpmbuild/MANIFEST-REPOS.md 53 | 54 | # FETCH SUBMODULE CODE 55 | 56 | # pipe the output of the git submodule foreach command into a while loop 57 | (echo .; git submodule foreach) | \ 58 | # the previous command returns something like `Entering etl/modules` 59 | # we are only interested in the latter part 60 | while read ENTERING PATH_SUBMODULE; do 61 | # get rid of the enclosing single quotation marks 62 | TEMP_PATH_SUBMODULE="${PATH_SUBMODULE%\'}" 63 | TEMP_PATH_SUBMODULE="${TEMP_PATH_SUBMODULE#\'}" 64 | PATH_SUBMODULE=${TEMP_PATH_SUBMODULE} 65 | # check there is actually a path value available 66 | if [ ! 
"${PATH_SUBMODULE}" = "" ]; then 67 | echo "Submodule found: ${PATH_SUBMODULE}" 68 | echo "Changing to following submodule dir: ${GIT_DIR}/${PATH_SUBMODULE}" 69 | # change to submodule folder 70 | cd ${GIT_DIR}/${PATH_SUBMODULE} 71 | # Run a normal git archive command 72 | # Create a plain uncompressed tar archive in a temporary directory 73 | LAST_COMMIT_ID_SUBMODULE=`git log --format="%H" -n 1` 74 | echo "Packaging Submodule ..." 75 | git archive --prefix=${PREFIX}-${TAG}/${PATH_SUBMODULE}/ ${LAST_COMMIT_ID_SUBMODULE} > ${GIT_DIR}/rpmbuild/tmp.tar 76 | # Add the temporary submodule tar file to the existing project tar file 77 | echo "Adding Submodule package to Main package ..." 78 | tar --concatenate --file=${PACKAGE_FILE_PATH} ${GIT_DIR}/rpmbuild/tmp.tar 79 | # Remove temporary tar file 80 | rm ${GIT_DIR}/rpmbuild/tmp.tar 81 | echo "submodule ${PATH_SUBMODULE} tag/commit id: ${LAST_COMMIT_ID_SUBMODULE}" >> ${GIT_DIR}/rpmbuild/MANIFEST-REPOS.md 82 | fi 83 | done 84 | 85 | # ADD THE MANIFEST 86 | # using the -C flag to change to the directory the file is in 87 | # so that it is placed in the root directory of our archive 88 | echo "Adding Repo Manifest to Main package ..." 89 | tar --append --file=${PACKAGE_FILE_PATH} -C ${GIT_DIR}/rpmbuild MANIFEST-REPOS.md 90 | 91 | 92 | function git_root { 93 | git rev-parse --show-toplevel 94 | } 95 | 96 | # create a manifest for files of main repo only (no submodules) 97 | cd `git_root` 98 | git ls-tree -r ${TAG} --abbrev > ${GIT_DIR}/rpmbuild/MANIFEST.md 99 | 100 | echo "Adding File Manifest to Main package ..." 
101 | tar --append --file=${PACKAGE_FILE_PATH} -C ${GIT_DIR}/rpmbuild MANIFEST.md 102 | 103 | # ADD CHANGELOG 104 | 105 | 106 | cd `git_root` 107 | # get commit id of the second most recent commit 108 | TAG_FROM=$(git log --format="%H" | head -n2 | tail -n1) 109 | TAG_TO=${TAG} 110 | TEMP=`mktemp` 111 | 112 | # Append the new log to the top of the changelog file 113 | echo -e "# Version ${TAG_TO}\n=============" > ${GIT_DIR}/rpmbuild/CHANGELOG.md 114 | echo "Showing changes from: ${TAG_FROM}" >> ${GIT_DIR}/rpmbuild/CHANGELOG.md 115 | CHANGELOG=$(git log $TAG_FROM..$TAG_TO --no-merges --sparse --date='format-local:%Y-%m-%d %H:%M:%S' --pretty=format:"%ad %<(20,trunc)%aN %s" | sed -e 's/_/\\_/g') 116 | echo "${CHANGELOG}" >> ${GIT_DIR}/rpmbuild/CHANGELOG.md 117 | 118 | echo "Adding Changelog to Main package ..." 119 | tar --append --file=${PACKAGE_FILE_PATH} -C ${GIT_DIR}/rpmbuild CHANGELOG.md 120 | 121 | # Remove temporary files 122 | if [ -d "${GIT_DIR}/rpmbuild" ]; then 123 | echo "Removing tmp dir ${GIT_DIR}/rpmbuild ..." 
124 | rm -r ${GIT_DIR}/rpmbuild 125 | fi 126 | 127 | # BUILD RPM 128 | 129 | # this is an extremely basic implementation and should not be used as-is 130 | # improvements are necessary so that the RPM can be built in an isolated environment 131 | 132 | if [ ${BUILD_RPM} = "Y" ] 133 | then 134 | 135 | mkdir ${PSGRS_RPM_BUILD_HOME} 136 | cd ${PSGRS_RPM_BUILD_HOME} 137 | # create minimum required folders 138 | mkdir SOURCES SPECS BUILD RPMS SRPMS 139 | # copy tar file into source folder 140 | cp ${PACKAGE_FILE_PATH} SOURCES 141 | # copy spec file 142 | cp ${PSGRS_SHELL_DIR}/template.spec SPECS 143 | # build rpm 144 | rpmbuild -v -bb --clean SPECS/template.spec 145 | fi -------------------------------------------------------------------------------- /artefacts/git/update-all-git-repos.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | # Kindly provided by Luis Silva 4 | 5 | # Script to update all of the 6 | # git repositories under this folder. 7 | 8 | cd $PWD 9 | pad=$(printf '%0.1s' " "{1..150}) 10 | padlength=80 11 | SUCCESS="SUCCESS!" 12 | FAILURE="FAILURE!" 13 | 14 | # The following is for the correct padding of output messages. 15 | function success() 16 | { 17 | local folder=$1 18 | printf '%*.*s' 0 $((padlength - ${#folder} - 9 )) "$pad" 19 | printf '\e[32m%s\e[39m\n' $SUCCESS 20 | } 21 | 22 | function failure() 23 | { 24 | local folder=$1 25 | printf '%*.*s' 0 $((padlength - ${#folder} - 9 )) "$pad" 26 | printf '\e[31m%s\e[39m\n' $FAILURE 27 | } 28 | 29 | for sandbox in `find . -type d -iname .git` 30 | do 31 | ( 32 | folder=`readlink -f $sandbox/..` 33 | printf '%s' "Updating $folder..." 
34 | cd $folder && git pull > /dev/null 2>&1 && success $folder || failure $folder 35 | ) 36 | done 37 | 38 | -------------------------------------------------------------------------------- /artefacts/pdi/.kettle/.gitignore: -------------------------------------------------------------------------------- 1 | *.backup -------------------------------------------------------------------------------- /artefacts/pdi/.kettle/.spoonrc: -------------------------------------------------------------------------------- 1 | CanvasGridSize=16 2 | IconSize=32 3 | ShowCanvasGrid=Y 4 | SaveOnlyUsedConnectionsToXML=Y -------------------------------------------------------------------------------- /artefacts/pdi/.kettle/kettle.properties: -------------------------------------------------------------------------------- 1 | ######################################## 2 | ## KETTLE PROPERTIES ## 3 | ######################################## 4 | 5 | 6 | ## STANDARD PRE-DEFINED KETTLE PROPERTIES 7 | 8 | # Show the full ancestor line of a given job entry or step 9 | KETTLE_LOG_MARK_MAPPINGS=Y 10 | # Avoids PDI spending ages on finding out the hostname, see http://jira.pentaho.com/browse/PDI-12358 11 | KETTLE_SYSTEM_HOSTNAME=localhost 12 | 13 | ## DATABASE CONNECTIONS 14 | 15 | # generic set of parameters to define details for a database connection 16 | # ideally rename params to something more unique since you might have more than one db connection. 17 | # when using the pdi repo: 18 | # these parameters will be picked up by the *.kdb file located in the root of the pdi repository. 19 | # when not using the pdi repo: 20 | # these parameters will be picked up by the `.kettle/shared.xml` file. 
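By way of illustration only (all values below are invented and not part of this repo), a filled-in set of these connection parameters for a hypothetical PostgreSQL warehouse could look like the following; the values are commented out here so they do not override the empty defaults that follow:

```properties
## EXAMPLE ONLY (hypothetical values - adjust to your environment):
# VAR_DB_CONNECTION_NAME=dwh_pg
# VAR_DB_CONNECTION_URL=jdbc:postgresql://dbhost:5432/dwh
# VAR_DB_DRIVER_CLASS_NAME=org.postgresql.Driver
# VAR_DB_PORT=5432
# VAR_DB_USER_NAME=etl_user
# VAR_DB_PASSWORD=change_me
```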
21 | VAR_DB_CONNECTION_NAME= 22 | VAR_DB_CONNECTION_URL= 23 | VAR_DB_DRIVER_CLASS_NAME= 24 | VAR_DB_PORT= 25 | VAR_DB_USER_NAME= 26 | VAR_DB_PASSWORD= 27 | -------------------------------------------------------------------------------- /artefacts/pdi/.kettle/repositories-file.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | KettleFileRepository 5 | ${PSGRS_PDI_REPO_NAME} 6 | ${PSGRS_PDI_REPO_DESCRIPTION} 7 | false 8 | ${PSGRS_PDI_REPO_PATH} 9 | N 10 | N 11 | 12 | 13 | -------------------------------------------------------------------------------- /artefacts/pdi/.kettle/repositories-jackrabbit.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | PentahoEnterpriseRepository 5 | ${PSGRS_PDI_REPO_NAME} 6 | ${PSGRS_PDI_REPO_DESCRIPTION} 7 | false 8 | ${PSGRS_PDI_REPO_URL} 9 | N 10 | 11 | 12 | -------------------------------------------------------------------------------- /artefacts/pdi/.kettle/shared.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | {{ VAR_DB_CONNECTION_NAME }} 5 | 6 | GENERIC 7 | Native 8 | 9 | 1521 10 | ${VAR_DB_USER_NAME} 11 | ${VAR_DB_PASSWORD} 12 | 13 | 14 | 15 | 16 | CUSTOM_DRIVER_CLASS${VAR_DB_DRIVER_CLASS_NAME} 17 | CUSTOM_URL${VAR_DB_CONNECTION_URL} 18 | FORCE_IDENTIFIERS_TO_LOWERCASEN 19 | FORCE_IDENTIFIERS_TO_UPPERCASEN 20 | IS_CLUSTEREDN 21 | PORT_NUMBER1521 22 | PRESERVE_RESERVED_WORD_CASEY 23 | QUOTE_ALL_FIELDSN 24 | SUPPORTS_BOOLEAN_DATA_TYPEY 25 | SUPPORTS_TIMESTAMP_DATA_TYPEY 26 | USE_POOLINGN 27 | 28 | 29 | 30 | 31 | -------------------------------------------------------------------------------- /artefacts/pdi/repo/db_connection_template.kdb: -------------------------------------------------------------------------------- 1 | 2 | {{ VAR_DB_CONNECTION_NAME }} 3 | 4 | GENERIC 5 | Native 6 | 7 | ${VAR_DB_PORT} 8 | ${VAR_DB_USER_NAME} 9 | ${VAR_DB_PASSWORD} 10 | 11 | 12 | 13 | 
14 | CUSTOM_DRIVER_CLASS${VAR_DB_DRIVER_CLASS_NAME} 15 | CUSTOM_URL${VAR_DB_CONNECTION_URL} 16 | FORCE_IDENTIFIERS_TO_LOWERCASEN 17 | FORCE_IDENTIFIERS_TO_UPPERCASEN 18 | IS_CLUSTEREDN 19 | PORT_NUMBER1521 20 | PRESERVE_RESERVED_WORD_CASEY 21 | QUOTE_ALL_FIELDSN 22 | SUPPORTS_BOOLEAN_DATA_TYPEY 23 | SUPPORTS_TIMESTAMP_DATA_TYPEY 24 | USE_POOLINGN 25 | 26 | 27 | -------------------------------------------------------------------------------- /artefacts/pdi/repo/jb_master.kjb: -------------------------------------------------------------------------------- 1 | 2 | 3 | jb_master 4 | 5 | 6 | 7 | /{{ PSGRS_PROJECT_NAME }} 8 | - 9 | 2018/03/22 22:12:33.002 10 | - 11 | 2018/03/22 22:12:33.002 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | ID_JOB 25 | Y 26 | ID_JOB 27 | 28 | 29 | CHANNEL_ID 30 | Y 31 | CHANNEL_ID 32 | 33 | 34 | JOBNAME 35 | Y 36 | JOBNAME 37 | 38 | 39 | STATUS 40 | Y 41 | STATUS 42 | 43 | 44 | LINES_READ 45 | Y 46 | LINES_READ 47 | 48 | 49 | LINES_WRITTEN 50 | Y 51 | LINES_WRITTEN 52 | 53 | 54 | LINES_UPDATED 55 | Y 56 | LINES_UPDATED 57 | 58 | 59 | LINES_INPUT 60 | Y 61 | LINES_INPUT 62 | 63 | 64 | LINES_OUTPUT 65 | Y 66 | LINES_OUTPUT 67 | 68 | 69 | LINES_REJECTED 70 | Y 71 | LINES_REJECTED 72 | 73 | 74 | ERRORS 75 | Y 76 | ERRORS 77 | 78 | 79 | STARTDATE 80 | Y 81 | STARTDATE 82 | 83 | 84 | ENDDATE 85 | Y 86 | ENDDATE 87 | 88 | 89 | LOGDATE 90 | Y 91 | LOGDATE 92 | 93 | 94 | DEPDATE 95 | Y 96 | DEPDATE 97 | 98 | 99 | REPLAYDATE 100 | Y 101 | REPLAYDATE 102 | 103 | 104 | LOG_FIELD 105 | Y 106 | LOG_FIELD 107 | 108 | 109 | EXECUTING_SERVER 110 | N 111 | EXECUTING_SERVER 112 | 113 | 114 | EXECUTING_USER 115 | N 116 | EXECUTING_USER 117 | 118 | 119 | START_JOB_ENTRY 120 | N 121 | START_JOB_ENTRY 122 | 123 | 124 | CLIENT 125 | N 126 | CLIENT 127 | 128 | 129 | 130 | 131 | 132 |
133 | 134 | 135 | ID_BATCH 136 | Y 137 | ID_BATCH 138 | 139 | 140 | CHANNEL_ID 141 | Y 142 | CHANNEL_ID 143 | 144 | 145 | LOG_DATE 146 | Y 147 | LOG_DATE 148 | 149 | 150 | JOBNAME 151 | Y 152 | TRANSNAME 153 | 154 | 155 | JOBENTRYNAME 156 | Y 157 | STEPNAME 158 | 159 | 160 | LINES_READ 161 | Y 162 | LINES_READ 163 | 164 | 165 | LINES_WRITTEN 166 | Y 167 | LINES_WRITTEN 168 | 169 | 170 | LINES_UPDATED 171 | Y 172 | LINES_UPDATED 173 | 174 | 175 | LINES_INPUT 176 | Y 177 | LINES_INPUT 178 | 179 | 180 | LINES_OUTPUT 181 | Y 182 | LINES_OUTPUT 183 | 184 | 185 | LINES_REJECTED 186 | Y 187 | LINES_REJECTED 188 | 189 | 190 | ERRORS 191 | Y 192 | ERRORS 193 | 194 | 195 | RESULT 196 | Y 197 | RESULT 198 | 199 | 200 | NR_RESULT_ROWS 201 | Y 202 | NR_RESULT_ROWS 203 | 204 | 205 | NR_RESULT_FILES 206 | Y 207 | NR_RESULT_FILES 208 | 209 | 210 | LOG_FIELD 211 | N 212 | LOG_FIELD 213 | 214 | 215 | COPY_NR 216 | N 217 | COPY_NR 218 | 219 | 220 | 221 | 222 | 223 |
224 | 225 | 226 | ID_BATCH 227 | Y 228 | ID_BATCH 229 | 230 | 231 | CHANNEL_ID 232 | Y 233 | CHANNEL_ID 234 | 235 | 236 | LOG_DATE 237 | Y 238 | LOG_DATE 239 | 240 | 241 | LOGGING_OBJECT_TYPE 242 | Y 243 | LOGGING_OBJECT_TYPE 244 | 245 | 246 | OBJECT_NAME 247 | Y 248 | OBJECT_NAME 249 | 250 | 251 | OBJECT_COPY 252 | Y 253 | OBJECT_COPY 254 | 255 | 256 | REPOSITORY_DIRECTORY 257 | Y 258 | REPOSITORY_DIRECTORY 259 | 260 | 261 | FILENAME 262 | Y 263 | FILENAME 264 | 265 | 266 | OBJECT_ID 267 | Y 268 | OBJECT_ID 269 | 270 | 271 | OBJECT_REVISION 272 | Y 273 | OBJECT_REVISION 274 | 275 | 276 | PARENT_CHANNEL_ID 277 | Y 278 | PARENT_CHANNEL_ID 279 | 280 | 281 | ROOT_CHANNEL_ID 282 | Y 283 | ROOT_CHANNEL_ID 284 | 285 | 286 | N 287 | 288 | 289 | 290 | START 291 | 292 | SPECIAL 293 | Y 294 | N 295 | N 296 | 0 297 | 0 298 | 60 299 | 12 300 | 0 301 | 1 302 | 1 303 | N 304 | Y 305 | 0 306 | 192 307 | 112 308 | 309 | 310 | Success 311 | 312 | SUCCESS 313 | N 314 | Y 315 | 0 316 | 192 317 | 272 318 | 319 | 320 | Write To Log 321 | 322 | WRITE_TO_LOG 323 | ========== STARTING {{ PSGRS_PROJECT_NAME }} MASTER JOB =========== 324 | Basic 325 | 326 | N 327 | Y 328 | 0 329 | 192 330 | 192 331 | 332 | 333 | 334 | 335 | START 336 | Write To Log 337 | 0 338 | 0 339 | Y 340 | Y 341 | Y 342 | 343 | 344 | Write To Log 345 | Success 346 | 0 347 | 0 348 | Y 349 | Y 350 | N 351 | 352 | 353 | 354 | 355 | 356 | -------------------------------------------------------------------------------- /artefacts/pdi/repo/module_1/jb_module_1_master.kjb: -------------------------------------------------------------------------------- 1 | 2 | jb_module_1_master 3 | 4 | 5 | 6 | 0 7 | /module_1 8 | - 9 | 2017/08/26 15:25:52.288 10 | - 11 | 2017/08/26 15:36:43.173 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 |
20 | 21 | 22 | 23 | 24 | ID_JOB 25 | Y 26 | ID_JOB 27 | 28 | 29 | CHANNEL_ID 30 | Y 31 | CHANNEL_ID 32 | 33 | 34 | JOBNAME 35 | Y 36 | JOBNAME 37 | 38 | 39 | STATUS 40 | Y 41 | STATUS 42 | 43 | 44 | LINES_READ 45 | Y 46 | LINES_READ 47 | 48 | 49 | LINES_WRITTEN 50 | Y 51 | LINES_WRITTEN 52 | 53 | 54 | LINES_UPDATED 55 | Y 56 | LINES_UPDATED 57 | 58 | 59 | LINES_INPUT 60 | Y 61 | LINES_INPUT 62 | 63 | 64 | LINES_OUTPUT 65 | Y 66 | LINES_OUTPUT 67 | 68 | 69 | LINES_REJECTED 70 | Y 71 | LINES_REJECTED 72 | 73 | 74 | ERRORS 75 | Y 76 | ERRORS 77 | 78 | 79 | STARTDATE 80 | Y 81 | STARTDATE 82 | 83 | 84 | ENDDATE 85 | Y 86 | ENDDATE 87 | 88 | 89 | LOGDATE 90 | Y 91 | LOGDATE 92 | 93 | 94 | DEPDATE 95 | Y 96 | DEPDATE 97 | 98 | 99 | REPLAYDATE 100 | Y 101 | REPLAYDATE 102 | 103 | 104 | LOG_FIELD 105 | Y 106 | LOG_FIELD 107 | 108 | 109 | EXECUTING_SERVER 110 | N 111 | EXECUTING_SERVER 112 | 113 | 114 | EXECUTING_USER 115 | N 116 | EXECUTING_USER 117 | 118 | 119 | START_JOB_ENTRY 120 | N 121 | START_JOB_ENTRY 122 | 123 | 124 | CLIENT 125 | N 126 | CLIENT 127 | 128 | 129 | 130 | 131 | 132 |
133 | 134 | 135 | ID_BATCH 136 | Y 137 | ID_BATCH 138 | 139 | 140 | CHANNEL_ID 141 | Y 142 | CHANNEL_ID 143 | 144 | 145 | LOG_DATE 146 | Y 147 | LOG_DATE 148 | 149 | 150 | JOBNAME 151 | Y 152 | TRANSNAME 153 | 154 | 155 | JOBENTRYNAME 156 | Y 157 | STEPNAME 158 | 159 | 160 | LINES_READ 161 | Y 162 | LINES_READ 163 | 164 | 165 | LINES_WRITTEN 166 | Y 167 | LINES_WRITTEN 168 | 169 | 170 | LINES_UPDATED 171 | Y 172 | LINES_UPDATED 173 | 174 | 175 | LINES_INPUT 176 | Y 177 | LINES_INPUT 178 | 179 | 180 | LINES_OUTPUT 181 | Y 182 | LINES_OUTPUT 183 | 184 | 185 | LINES_REJECTED 186 | Y 187 | LINES_REJECTED 188 | 189 | 190 | ERRORS 191 | Y 192 | ERRORS 193 | 194 | 195 | RESULT 196 | Y 197 | RESULT 198 | 199 | 200 | NR_RESULT_ROWS 201 | Y 202 | NR_RESULT_ROWS 203 | 204 | 205 | NR_RESULT_FILES 206 | Y 207 | NR_RESULT_FILES 208 | 209 | 210 | LOG_FIELD 211 | N 212 | LOG_FIELD 213 | 214 | 215 | COPY_NR 216 | N 217 | COPY_NR 218 | 219 | 220 | 221 | 222 | 223 |
224 | 225 | 226 | ID_BATCH 227 | Y 228 | ID_BATCH 229 | 230 | 231 | CHANNEL_ID 232 | Y 233 | CHANNEL_ID 234 | 235 | 236 | LOG_DATE 237 | Y 238 | LOG_DATE 239 | 240 | 241 | LOGGING_OBJECT_TYPE 242 | Y 243 | LOGGING_OBJECT_TYPE 244 | 245 | 246 | OBJECT_NAME 247 | Y 248 | OBJECT_NAME 249 | 250 | 251 | OBJECT_COPY 252 | Y 253 | OBJECT_COPY 254 | 255 | 256 | REPOSITORY_DIRECTORY 257 | Y 258 | REPOSITORY_DIRECTORY 259 | 260 | 261 | FILENAME 262 | Y 263 | FILENAME 264 | 265 | 266 | OBJECT_ID 267 | Y 268 | OBJECT_ID 269 | 270 | 271 | OBJECT_REVISION 272 | Y 273 | OBJECT_REVISION 274 | 275 | 276 | PARENT_CHANNEL_ID 277 | Y 278 | PARENT_CHANNEL_ID 279 | 280 | 281 | ROOT_CHANNEL_ID 282 | Y 283 | ROOT_CHANNEL_ID 284 | 285 | 286 | N 287 | 288 | 289 | 290 | Success 291 | 292 | SUCCESS 293 | N 294 | Y 295 | 0 296 | 304 297 | 432 298 | 299 | 300 | START 301 | 302 | SPECIAL 303 | Y 304 | N 305 | N 306 | 0 307 | 0 308 | 60 309 | 12 310 | 0 311 | 1 312 | 1 313 | N 314 | Y 315 | 0 316 | 304 317 | 80 318 | 319 | 320 | Transformation 321 | 322 | TRANS 323 | rep_name 324 | 325 | 326 | tr_dummy 327 | ${Internal.Entry.Current.Directory} 328 | N 329 | N 330 | N 331 | N 332 | N 333 | N 334 | 335 | 336 | N 337 | N 338 | Basic 339 | N 340 | 341 | N 342 | Y 343 | N 344 | N 345 | N 346 | Pentaho local 347 | 348 | Y 349 | 350 | N 351 | Y 352 | 0 353 | 304 354 | 240 355 | 356 | 357 | 358 | 359 | START 360 | Transformation 361 | 0 362 | 0 363 | Y 364 | Y 365 | Y 366 | 367 | 368 | Transformation 369 | Success 370 | 0 371 | 0 372 | Y 373 | Y 374 | N 375 | 376 | 377 | 378 | 379 | Dummy to demonstrate module structure 380 | 470 381 | 63 382 | 545 383 | 48 384 | 385 | -1 386 | N 387 | N 388 | 0 389 | 0 390 | 0 391 | 255 392 | 205 393 | 112 394 | 100 395 | 100 396 | 100 397 | Y 398 | 399 | 400 | 401 | 402 | METASTORE.pentaho 403 | 404 | Default Run Configuration 405 | {"namespace":"pentaho","id":"Default Run Configuration","name":"Default Run Configuration","description":"Defines a default run 
configuration","metaStoreName":null} 406 | 407 | 408 | 409 | {"_":"Embedded MetaStore Elements","namespace":"pentaho","type":"Default Run Configuration"} 410 | 411 | Pentaho local 412 | {"children":[{"children":[],"id":"server","value":null},{"children":[],"id":"clustered","value":"N"},{"children":[],"id":"name","value":"Pentaho local"},{"children":[],"id":"description","value":null},{"children":[],"id":"readOnly","value":"Y"},{"children":[],"id":"sendResources","value":"N"},{"children":[],"id":"logRemoteExecutionLocally","value":"N"},{"children":[],"id":"remote","value":"N"},{"children":[],"id":"local","value":"Y"},{"children":[],"id":"showTransformations","value":"N"}],"id":"Pentaho local","value":null,"name":"Pentaho local","owner":null,"ownerPermissionsList":[]} 413 | 414 | 415 | 416 | 417 | -------------------------------------------------------------------------------- /artefacts/pdi/repo/module_1/tr_dummy.ktr: -------------------------------------------------------------------------------- 1 | 2 | 3 | tr_dummy 4 | 5 | 6 | 7 | Normal 8 | 0 9 | /module_1 10 | 11 | 12 | 13 | 14 | 15 | 16 |
17 | 18 | 19 | 20 | 21 | ID_BATCH 22 | Y 23 | ID_BATCH 24 | 25 | 26 | CHANNEL_ID 27 | Y 28 | CHANNEL_ID 29 | 30 | 31 | TRANSNAME 32 | Y 33 | TRANSNAME 34 | 35 | 36 | STATUS 37 | Y 38 | STATUS 39 | 40 | 41 | LINES_READ 42 | Y 43 | LINES_READ 44 | 45 | 46 | 47 | LINES_WRITTEN 48 | Y 49 | LINES_WRITTEN 50 | 51 | 52 | 53 | LINES_UPDATED 54 | Y 55 | LINES_UPDATED 56 | 57 | 58 | 59 | LINES_INPUT 60 | Y 61 | LINES_INPUT 62 | 63 | 64 | 65 | LINES_OUTPUT 66 | Y 67 | LINES_OUTPUT 68 | 69 | 70 | 71 | LINES_REJECTED 72 | Y 73 | LINES_REJECTED 74 | 75 | 76 | 77 | ERRORS 78 | Y 79 | ERRORS 80 | 81 | 82 | STARTDATE 83 | Y 84 | STARTDATE 85 | 86 | 87 | ENDDATE 88 | Y 89 | ENDDATE 90 | 91 | 92 | LOGDATE 93 | Y 94 | LOGDATE 95 | 96 | 97 | DEPDATE 98 | Y 99 | DEPDATE 100 | 101 | 102 | REPLAYDATE 103 | Y 104 | REPLAYDATE 105 | 106 | 107 | LOG_FIELD 108 | Y 109 | LOG_FIELD 110 | 111 | 112 | EXECUTING_SERVER 113 | N 114 | EXECUTING_SERVER 115 | 116 | 117 | EXECUTING_USER 118 | N 119 | EXECUTING_USER 120 | 121 | 122 | CLIENT 123 | N 124 | CLIENT 125 | 126 | 127 | 128 | 129 | 130 |
[PDI transformation XML continues here — the markup was stripped during extraction. The recoverable content: log-table field definitions (step, channel, performance and metrics logs, fields such as ID_BATCH, CHANNEL_ID, LINES_READ, METRICS_VALUE), transformation settings, and a minimal two-step flow wiring a `Generate Rows` step ("hello world") into a `Write to log` step.]
-------------------------------------------------------------------------------- /artefacts/pdi/shell-scripts/ci-settings.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | ## File-Based Repo Details ## 3 | PARAM_FB_PDI_REPO_USER=admin 4 | PARAM_FB_PDI_REPO_PASSWORD=password 5 | PARAM_FB_PDI_REPO_NAME= 6 | # repo location of the job 7 | PARAM_FB_PDI_CI_DIR=/modules/continuous_integration 8 | # where to store the exported xml on the file system 9 | PARAM_FB_REPO_EXPORT_FILE_LOCATION=/tmp/fb-repo-export.xml 10 | ## EE Repo Details ## 11 | PARAM_EE_PDI_REPO_USER=admin 12 | PARAM_EE_PDI_REPO_PASSWORD=password 13 | PARAM_EE_PDI_REPO_NAME=pentaho-di 14 | PARAM_EE_PDI_REPO_PATH_PREFIX=/home/projectx 15 | ## DI Server Details ## 16 | PARAM_DI_SERVER_HOST= 17 | PARAM_DI_SERVER_PORT= 18 | # depending on server version set to `pentaho` or `pentaho-di`
PARAM_DI_SERVER_WEB_APP_NAME=pentaho-di 20 | # comment that should be added to the ee repo once upload is complete 21 | PARAM_COMMENT=latest-and-greatest -------------------------------------------------------------------------------- /artefacts/pdi/shell-scripts/delete-pentaho-repo-folder.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | source ./ci-settings.sh 3 | # we need PDI_DIR 4 | source ../../../common-config-dev/shell-scripts/set_env_variables.sh 5 | 6 | cd ${PDI_DIR} 7 | 8 | ./pan.sh \ 9 | -user=${PARAM_FB_PDI_REPO_USER} \ 10 | -pass=${PARAM_FB_PDI_REPO_PASSWORD} \ 11 | -rep=${PARAM_FB_PDI_REPO_NAME} \ 12 | -dir=${PARAM_FB_PDI_CI_DIR} \ 13 | -trans=tr_delete_ee_repo_folder \ 14 | -param:PARAM_DI_SERVER_REPO_PATH_TO_FOLDER=${PARAM_DI_SERVER_REPO_PATH_TO_FOLDER} \ 15 | -param:PARAM_DI_SERVER_WEB_APP_NAME=${PARAM_DI_SERVER_WEB_APP_NAME} \ 16 | -param:PARAM_PDI_EE_REPO_USER=${PARAM_EE_PDI_REPO_USER} \ 17 | -param:PARAM_PDI_EE_REPO_PASSWORD=${PARAM_EE_PDI_REPO_PASSWORD} -------------------------------------------------------------------------------- /artefacts/pdi/shell-scripts/export-file-based-repo.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | source ./ci-settings.sh 3 | # we need KETTLE_HOME, PDI_DIR 4 | source ../../../common-config-dev/shell-scripts/set_env_variables.sh 5 | 6 | cd ${PDI_DIR} 7 | ./kitchen.sh \ 8 | -user=${PARAM_FB_PDI_REPO_USER} \ 9 | -pass=${PARAM_FB_PDI_REPO_PASSWORD} \ 10 | -rep=${PARAM_FB_PDI_REPO_NAME} \ 11 | -dir=${PARAM_FB_PDI_CI_DIR} \ 12 | -job=jb_export_file_repo_all_objects \ 13 | -param:PARAM_PDI_REPO_NAME=${PARAM_FB_PDI_REPO_NAME} \ 14 | -param:PARAM_PDI_REPO_USER=${PARAM_FB_PDI_REPO_USER} \ 15 | -param:PARAM_PDI_REPO_PASSWORD=${PARAM_FB_PDI_REPO_PASSWORD} \ 16 | -param:PARAM_REPO_EXPORT_TARGET_FOLDER_AND_FILE=${PARAM_FB_REPO_EXPORT_TARGET_FOLDER_AND_FILE} 17 | 18 | 
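Chained together, the CI scripts above implement a simple promotion flow: export the file-based repo to XML, clear the target EE repo folder, then import the XML. The repo does not ship a driver for this; the following is a hypothetical sketch (script names and ordering assumed from the files above), relying on the fact that kitchen.sh/pan.sh return a non-zero exit code on failure:

```shell
#!/bin/bash
# Hypothetical CI driver (illustration only, not part of this repo).
# run_step logs each step and aborts the promotion as soon as one
# of them returns a non-zero exit code.
run_step() {
  echo "running: $*"
  if ! "$@"; then
    echo "step failed: $* - aborting promotion" >&2
    exit 1
  fi
}

# The real sequence would be (scripts assumed to sit side by side):
# run_step ./export-file-based-repo.sh
# run_step ./delete-pentaho-repo-folder.sh
# run_step ./upload-pdi-repo.sh
run_step true   # stand-in step so the sketch runs as-is
echo "PDI repo promotion completed"
```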
-------------------------------------------------------------------------------- /artefacts/pdi/shell-scripts/upload-pdi-repo.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | # NOTE: It might be better to use the Pentaho Server REST API restore call instead 4 | # since there is no dependency on repositories.xml 5 | 6 | source ./ci-settings.sh 7 | # we need PDI_DIR 8 | source ../../../common-config-dev/shell-scripts/set_env_variables.sh 9 | 10 | cd ${PDI_DIR} 11 | 12 | echo all | ./import.sh \ 13 | -rep=${PARAM_EE_PDI_REPO_NAME} \ 14 | -user=${PARAM_EE_PDI_REPO_USER} \ 15 | -pass=${PARAM_EE_PDI_REPO_PASSWORD} \ 16 | -dir=${PARAM_EE_PDI_REPO_PATH_PREFIX} \ 17 | -file=${PARAM_FB_REPO_EXPORT_FILE_LOCATION} \ 18 | -replace=true \ 19 | -norules=N \ 20 | -coe=false \ 21 | -comment=${PARAM_COMMENT} 22 | -------------------------------------------------------------------------------- /artefacts/pentaho-server/shell-scripts/command-line-utility-based/export-pentaho-server-content.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | # [OPEN] Use REST calls instead?
4 | 5 | PENTAHO_SERVER_HOME=/home/dsteiner/apps/pentaho-server-ce-7.1 6 | PENTAHO_SERVER_URL=http://localhost:8080/pentaho 7 | PENTAHO_SERVER_USER=admin 8 | PENTAHO_SERVER_PASSWORD=password 9 | TARGET_FILE=/tmp/test.zip 10 | PENTAHO_SERVER_REPO_PATH=/ 11 | LOG_FILE_PATH=/tmp/pentaho-server-export.log 12 | 13 | cd ${PENTAHO_SERVER_HOME} 14 | ./import-export.sh \ 15 | --export \ 16 | --url=${PENTAHO_SERVER_URL} \ 17 | --username=${PENTAHO_SERVER_USER} \ 18 | --password=${PENTAHO_SERVER_PASSWORD} \ 19 | --file-path=${TARGET_FILE} \ 20 | --charset=UTF-8 \ 21 | --path=${PENTAHO_SERVER_REPO_PATH} \ 22 | --withManifest=true \ 23 | --logfile=${LOG_FILE_PATH} 24 | 25 | # unzip 26 | 27 | # exclude the following directories 28 | # public/plugin-samples,public/cde/,public/Steel+Wheels/,/public/bi-developers/ 29 | 30 | # add to /pentaho-solutions folder 31 | 32 | 33 | # This script should be part of a project code repo 34 | 35 | 36 | # exporting data sources 37 | # cd ${PENTAHO_SERVER_HOME} 38 | # ./import-export.sh \ 39 | # --export \ 40 | # --url=${PENTAHO_SERVER_URL} \ 41 | # --username=${PENTAHO_SERVER_USER} \ 42 | # --password=${PENTAHO_SERVER_PASSWORD} \ 43 | # --resource-type=datasource \ 44 | # --datasource-type=analysis \ 45 | # --analysis-datasource=SampleData \ 46 | # --charset=UTF-8 \ 47 | # --file-path=${TARGET_FILE} \ 48 | # --withManifest=true \ 49 | # --logfile=${LOG_FILE_PATH} -------------------------------------------------------------------------------- /artefacts/pentaho-server/shell-scripts/command-line-utility-based/upload-pentaho-server-content.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | # [OPEN] Use REST calls instead?
4 | 5 | PENTAHO_SERVER_HOME=/home/dsteiner/apps/pentaho-server-ce-7.1 6 | PENTAHO_SERVER_URL=http://localhost:8080/pentaho 7 | PENTAHO_SERVER_USER=admin 8 | PENTAHO_SERVER_PASSWORD=password 9 | TARGET_FILE=/tmp/test.zip 10 | PENTAHO_SERVER_REPO_PATH=/public 11 | LOG_FILE_PATH=/tmp/pentaho-server-export.log 12 | 13 | 14 | cd ${PENTAHO_SERVER_HOME} 15 | ./import-export.sh \ 16 | --import \ 17 | --url=${PENTAHO_SERVER_URL} \ 18 | --username=${PENTAHO_SERVER_USER} \ 19 | --password=${PENTAHO_SERVER_PASSWORD} \ 20 | --file-path=${TARGET_FILE} \ 21 | --charset=UTF-8 \ 22 | --path=${PENTAHO_SERVER_REPO_PATH} \ 23 | --logfile=${LOG_FILE_PATH} \ 24 | --permission=true \ 25 | --overwrite=true -------------------------------------------------------------------------------- /artefacts/pentaho-server/shell-scripts/download-all-content.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | # OPEN - DUMMY ONLY 4 | 5 | curl http://${PENTAHO_SERVER_USER}:${PENTAHO_SERVER_PASSWORD}@${PENTAHO_SERVER_HOST}:${PENTAHO_SERVER_PORT}/pentaho/api/repo/files/backup > /tmp/backup.zip -------------------------------------------------------------------------------- /artefacts/pentaho-server/shell-scripts/upload-file.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | # OPEN - DUMMY ONLY 4 | 5 | # sample values 6 | # PATH_TO_SOURCE_FILE=/home/dsteiner/Downloads/report1.xanalyzer 7 | # PENTAHO_SERVER_REPO_TARGET_PATH=/public/report1.xanalyzer 8 | 9 | curl -v -X POST -H "Content-Type: multipart/form-data" \ 10 | -F fileUpload=@${PATH_TO_SOURCE_FILE} \ 11 | -F overwriteFile=true \ 12 | -F importPath=${PENTAHO_SERVER_REPO_TARGET_PATH} \ 13 | http://${PENTAHO_SERVER_USER}:${PENTAHO_SERVER_PASSWORD}@${PENTAHO_SERVER_HOST}:${PENTAHO_SERVER_PORT}/pentaho/api/repo/publish/publishfile -------------------------------------------------------------------------------- 
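The dummy curl scripts in this folder fire and forget: curl exits with code 0 even when the server responds with an HTTP error. One way to harden them (a sketch, not part of the repo; the function name and argument layout are made up) is to capture the status code via curl's `-w '%{http_code}'` write-out variable and fail on anything outside the 2xx range:

```shell
#!/bin/bash
# Hypothetical wrapper around curl that fails on non-2xx responses.
# -s silences progress output, -o /dev/null discards the body, and
# -w '%{http_code}' prints only the HTTP status code.
upload_with_check() {
  local url="$1"; shift
  local status
  status=$(curl -s -o /dev/null -w '%{http_code}' "$@" "$url")
  case "$status" in
    2*) echo "upload OK (HTTP ${status})" ;;
    *)  echo "upload FAILED (HTTP ${status})" >&2; return 1 ;;
  esac
}

# Example call mirroring upload-file.sh (all variables assumed set):
# upload_with_check \
#   "http://${PENTAHO_SERVER_HOST}:${PENTAHO_SERVER_PORT}/pentaho/api/repo/publish/publishfile" \
#   -u "${PENTAHO_SERVER_USER}:${PENTAHO_SERVER_PASSWORD}" \
#   -F fileUpload=@"${PATH_TO_SOURCE_FILE}" \
#   -F overwriteFile=true \
#   -F importPath="${PENTAHO_SERVER_REPO_TARGET_PATH}"
```

Passing credentials with `-u` also avoids embedding the password in the URL, where it would show up in logs and process listings.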
/artefacts/pentaho-server/shell-scripts/upload-folder.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | # OPEN - DUMMY ONLY 4 | 5 | curl -X POST -v -H "Content-Type: multipart/form-data" \ 6 | -F overwriteFile=true \ 7 | -F fileUpload=@${PATH_TO_SOURCE_ZIP_FILE} \ 8 | http://${PENTAHO_SERVER_USER}:${PENTAHO_SERVER_PASSWORD}@${PENTAHO_SERVER_HOST}:${PENTAHO_SERVER_PORT}/pentaho/api/repo/files/systemRestore -------------------------------------------------------------------------------- /artefacts/pentaho-server/shell-scripts/upload-jdbc-connection.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | # OPEN - DUMMY ONLY 4 | function upload_jdbc_connection { 5 | 6 | curl -vX PUT http://${PENTAHO_SERVER_USER}:${PENTAHO_SERVER_PASSWORD}@${PENTAHO_SERVER_HOST}:${PENTAHO_SERVER_PORT}/pentaho/plugin/data-access/api/datasource/jdbc/connection/${PENTAHO_SERVER_DB_CONNECTION_NAME} \ 7 | --header "Content-Type: application/json" \ 8 | --data @"${PATH_TO_JSON_DB_CONNECTION_DEFINITION_FILE}" 9 | 10 | } -------------------------------------------------------------------------------- /artefacts/pentaho-server/shell-scripts/upload-metadata-model.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | # OPEN - PROTOTYPE ONLY 3 | curl -v -H "Content-Type: multipart/form-data" -X PUT \ 4 | -F metadataFile=@${PATH_TO_METADATA_FILE} -F overwrite=true \ 5 | http://${PENTAHO_SERVER_USER}:${PENTAHO_SERVER_PASSWORD}@${PENTAHO_SERVER_HOST}:${PENTAHO_SERVER_PORT}/yourdomain/plugin/data-access/api/datasource/metadata/domain/${METADATA_DOMAIN_NAME} -------------------------------------------------------------------------------- /artefacts/pentaho-server/shell-scripts/upload-mondrian-schema.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | # OPEN - PROTOTYPE ONLY 3 | curl -v -H
"Content-Type: multipart/form-data" -X PUT \ 4 | -F uploadInput=@${PATH_TO_MONDRIAN_SCHEMA_FILE} -F overwrite=true -F xmlaEnabledFlag=false -F parameters="Datasource=${PENTAHO_SERVER_DB_CONNECTION_NAME}" \ 5 | http://${PENTAHO_SERVER_USER}:${PENTAHO_SERVER_PASSWORD}@${PENTAHO_SERVER_HOST}:${PENTAHO_SERVER_PORT}/yourdomain/plugin/data-access/api/datasource/analysis/catalog/${MONDRIAN_SCHEMA_NAME} -------------------------------------------------------------------------------- /artefacts/pentaho-server/shell-scripts/upload-schedule.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | # OPEN - DUMMY ONLY 4 | 5 | # Sample values 6 | # REPOSITORY_PATH_TO_PDI_JOB_OR_TRANSFORMATION=/home/user1/pdi/jb_main.kjb 7 | # PENTAHO_SERVER_SCHEDULE_START_TIME=2017-08-14T11:46:00.000-04:00 8 | # PDI_JOB_OR_TRANSFORMATION_NAME=jb_main 9 | 10 | curl -X POST -v \ 11 | -H "Content-Type: application/json" \ 12 | --data @schedule.json http://${PENTAHO_SERVER_USER}:${PENTAHO_SERVER_PASSWORD}@${PENTAHO_SERVER_HOST}:${PENTAHO_SERVER_PORT}/pentaho/api/scheduler/job -------------------------------------------------------------------------------- /artefacts/pentaho-server/templates/jdbc-connection-generic.json: -------------------------------------------------------------------------------- 1 | { 2 | "accessType": "NATIVE", 3 | "attributes": { 4 | "CUSTOM_DRIVER_CLASS": "${PSGRS_JDBC_CUSTOM_DRIVER_CLASS}", 5 | "CUSTOM_URL": "${PSGRS_JDBC_CUSTOM_URL}", 6 | "PORT_NUMBER": "0" 7 | }, 8 | "changed": true, 9 | "connectSql": "", 10 | "connectionPoolingProperties": {}, 11 | "databaseName": null, 12 | "databasePort": "0", 13 | "databaseType": { 14 | "defaultDatabaseName": null, 15 | "defaultDatabasePort": -1, 16 | "defaultOptions": null, 17 | "extraOptionsHelpUrl": null, 18 | "name": "Generic database", 19 | "shortName": "GENERIC" 20 | }, 21 | "extraOptions": {}, 22 | "extraOptionsOrder": {}, 23 | "forcingIdentifiersToLowerCase": false, 24 |
"forcingIdentifiersToUpperCase": false, 25 | "hostname": "", 26 | "initialPoolSize": 0, 27 | "maximumPoolSize": 0, 28 | "name": "${PSGRS_JDBC_CONNECTION_NAME}", 29 | "partitioned": false, 30 | "password": "", 31 | "quoteAllFields": false, 32 | "streamingResults": false, 33 | "username": "", 34 | "usingConnectionPool": true, 35 | "usingDoubleDecimalAsSchemaTableSeparator": false 36 | } -------------------------------------------------------------------------------- /artefacts/pentaho-server/templates/schedule.json: -------------------------------------------------------------------------------- 1 | { 2 | "inputFile" : "${REPOSITORY_PATH_TO_PDI_JOB_OR_TRANSFORMATION}", 3 | "simpleJobTrigger" : { 4 | "repeatInterval" : "5", 5 | "repeatCount" : "-1", 6 | "startTime" : "${PENTAHO_SERVER_SCHEDULE_START_TIME}", 7 | "uiPassParam" : "MINUTES" 8 | }, 9 | "jobName" : "${PDI_JOB_OR_TRANSFORMATION_NAME}" 10 | } -------------------------------------------------------------------------------- /artefacts/project-config/project.properties: -------------------------------------------------------------------------------- 1 | ####################################################### 2 | ### PROPERTIES FOR ${PSGRS_PROJECT_NAME} ### 3 | ####################################################### 4 | 5 | # Project name 6 | PROP_PROJECT_NAME=${PSGRS_PROJECT_NAME} 7 | # Environment 8 | PROP_ENVIRONMENT=${PSGRS_ENV} 9 | # Path to folder that holds the git repos 10 | PROP_BASE_DIR=${PSGRS_BASE_DIR} 11 | # Project properties home 12 | PROP_PROJECT_PROPERTIES_HOME=${PSGRS_BASE_DIR}/${PSGRS_PROJECT_NAME}-config-${PSGRS_ENV} 13 | # Path to SQL Folder 14 | PROP_SQL_FOLDER_PATH=${PSGRS_BASE_DIR}/${PSGRS_PROJECT_NAME}-code/pdi/sql 15 | # Path to the modules folder within the PDI repo 16 | PROP_PDI_MODULES_REPO_PATH=modules -------------------------------------------------------------------------------- /artefacts/project-config/run_jb_name.sh: 
-------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | # specify the path to the job 3 | JOB_PATH="your_project_name" 4 | # specify the PDI job name - with extension for file-based approach, otherwise without 5 | JOB_NAME="jb_name.kjb" 6 | 7 | #BASE_DIR=$(dirname 0) 8 | # to make it run with crontab as well 9 | BASE_DIR="$( cd "$( /usr/bin/dirname "$0" )" && pwd )" 10 | echo "The run shell script is running from the following directory: ${BASE_DIR}" 11 | # the repo name has to be the env name 12 | source ${BASE_DIR}/wrapper.sh ${JOB_PATH} ${JOB_NAME} 13 | -------------------------------------------------------------------------------- /artefacts/project-config/wrapper.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | 3 | # ======================= INPUT ARGUMENTS ============================ # 4 | 5 | ## ~~~~~~~~~~~~~~~~~~~~~~~~ DO NOT CHANGE ~~~~~~~~~~~~~~~~~~~~~~~~~~~~## 6 | ## ______________________ ## 7 | 8 | 9 | ## --- THIS IS MEANT TO WORK WITH A FILE-BASED PDI APPROACH ONLY ---- ## 10 | 11 | # environmental argument parameter 12 | if [ $# -eq 0 ] || [ -z "$1" ] || [ -z "$2" ] 13 | then 14 | echo "ERROR: Not all mandatory arguments supplied, please supply the job home and job name arguments" 15 | echo 16 | echo "Usage: wrapper.sh [JOB_HOME] [JOB_NAME]" 17 | echo "Run a wrapper PDI job to execute input PDI jobs" 18 | echo 19 | echo "Mandatory arguments" 20 | echo 21 | echo "JOB_HOME: The PDI repo path. Specify '/' if the job is located in the root dir." 22 | echo "JOB_NAME: The name of the target job to run from within the wrapper" 23 | echo 24 | echo "exiting ..."
25 | exit 1 26 | else 27 | # PDI repo relative path for home directory of project kjb files 28 | JOB_HOME="$1" 29 | echo "JOB_HOME: ${JOB_HOME}" 30 | # target job name (kjb file name) 31 | JOB_NAME="$2" 32 | echo "JOB_NAME: ${JOB_NAME}" 33 | fi 34 | 35 | 36 | # ============= PROJECT-SPECIFIC CONFIGURATION PROPERTIES ============ # 37 | 38 | ## ~~~~~~~~~~~~~~~~~~~~~~~~ DO NOT CHANGE ~~~~~~~~~~~~~~~~~~~~~~~~~~~~## 39 | ## ______________________ ## 40 | 41 | # some logic to automatically figure out project name and environment 42 | 43 | WRAPPER_DIR="$( cd "$( /usr/bin/dirname "$0" )" && pwd )" 44 | echo "The wrapper shell script is running from the following directory:" 45 | echo "${WRAPPER_DIR}" 46 | 47 | # Path to the root directory where the common and project-specific git repos are stored 48 | # 49 | # we know that this shell script runs from within the 50 | # `proj-config-env/pdi/shell-scripts` folder 51 | # we want to get the path to the proj-config-env folder 52 | # so we can easily extract project and environment names 53 | PROJECT_CONFIG_DIR=${WRAPPER_DIR%/*/*} 54 | # deployment dir is 4 levels up if common config is external 55 | DEPLOYMENT_DIR=${WRAPPER_DIR%/*/*/*/*} 56 | # Get project group name 57 | PROJECT_GROUP_DIR=${WRAPPER_DIR%/*/*/*} 58 | # Get name of project group folder 59 | PROJECT_GROUP_FOLDER_NAME=${PROJECT_GROUP_DIR##*/} 60 | 61 | # The following section expects: 62 | # Either a common config in a common group folder 63 | # Or a standalone project (no common config) 64 | 65 | # Extract project name and environment from the standardised project folder name 66 | # The folder name gets initially standardised by the `initialise-repo.sh` 67 | # Get last `/` and apply substring from there to the end 68 | PROJECT_CONFIG_FOLDER_NAME=${PROJECT_CONFIG_DIR##*/} 69 | # Get substring from first character to first `-` 70 | PROJECT_NAME=${PROJECT_CONFIG_FOLDER_NAME%%-*} 71 | # Get substring from last `-` to the end 72 | PDI_ENV=${PROJECT_CONFIG_FOLDER_NAME##*-}
73 | # build path for project code dir 74 | PROJECT_CODE_DIR=${PROJECT_GROUP_DIR}/${PROJECT_NAME}-code 75 | # path to di files root dir - used for file-based pdi approach 76 | PROJECT_CODE_PDI_REPO_DIR=${PROJECT_CODE_DIR}/pdi/repo 77 | # Path to the environment-specific common configuration 78 | # [OPEN]: make `pentaho-common` configurable - var already exists 79 | COMMON_CONFIG_DIR="${DEPLOYMENT_DIR}/pentaho-common/common-config-${PDI_ENV}" 80 | # workaround so that we can handle standalone projects as well 81 | if [ ! -d ${COMMON_CONFIG_DIR} ]; then 82 | echo "No common config exists, so assuming it is a standalone project ..." 83 | COMMON_CONFIG_DIR="${DEPLOYMENT_DIR}/${PROJECT_GROUP_FOLDER_NAME}/${PROJECT_NAME}-config-${PDI_ENV}" 84 | fi 85 | # source common environment variables here so that they can be used straight away for project-specific variables 86 | source ${COMMON_CONFIG_DIR}/pdi/shell-scripts/set-env-variables.sh 87 | # is this project using a pdi repo setup or a file-based one? 88 | if [ -f ${COMMON_CONFIG_DIR}/pdi/.kettle/repositories.xml ] 89 | then 90 | echo "Note: repositories.xml does exist ... " 91 | IS_PDI_REPO_BASED="Y" 92 | else 93 | echo "Note: repositories.xml does not exist ... " 94 | echo "... so assuming there is a file-based PDI setup in place."
95 | IS_PDI_REPO_BASED="N" 96 | fi 97 | 98 | # Absolute path for project log files 99 | PROJECT_LOG_HOME="${PROJECT_GROUP_DIR}/logs/${PDI_ENV}" 100 | 101 | 102 | 103 | # ============= PROJECT-SPECIFIC CONFIGURATION PROPERTIES ============ # 104 | 105 | ## ~~~~~~~~~~~~~~~~~~~~ AMEND FOR YOUR PROJECT ~~~~~~~~~~~~~~~~~~~~~~~## 106 | ## ______________________ ## 107 | 108 | # PDI repo name 109 | PDI_REPO_NAME="{{ PSGRS_PROJECT_NAME }}" 110 | # PDI user name 111 | PDI_REPO_USER="yourUserName" 112 | # PDI password 113 | PDI_REPO_PASS="yourPassword" 114 | # PDI repo root dir: if your repo has a deep hierarchy and the first few levels do not hold any 115 | # files - this saves you from always specifying the full path for the job directory and also 116 | # helps if the root directory is located at different levels on different environments 117 | # e.g. if in dev your project folder is in the root but in prod within /home/pentaho 118 | # so for prod set PDI_REPO_MAIN_DIR_PATH to /home/pentaho 119 | PDI_REPO_MAIN_DIR_PATH="" 120 | 121 | 122 | 123 | # ============== JOB-SPECIFIC CONFIGURATION PROPERTIES =============== # 124 | 125 | ## ~~~~~~~~~~~~~~~~~~~~~~~~ DO NOT CHANGE ~~~~~~~~~~~~~~~~~~~~~~~~~~~~## 126 | ## ______________________ ## 127 | 128 | 129 | # PDI repo directory path 130 | PDI_ARTEFACT_DIRECTORY_PATH=${PDI_REPO_MAIN_DIR_PATH}/${JOB_HOME} 131 | # Project logs file name 132 | JOB_LOG_FILE="${JOB_NAME}.err.log" 133 | # Project historic logs filename 134 | JOB_LOG_HIST_FILE="${JOB_NAME}.hist.log" 135 | # Project properties file 136 | PROJECT_PROPERTIES_FILE="${PROJECT_CONFIG_DIR}/pdi/properties/${PROJECT_NAME}.properties" 137 | # Job properties file 138 | JOB_PROPERTIES_FILE="${PROJECT_CONFIG_DIR}/pdi/properties/${JOB_NAME}.properties" 139 | # PDI repo full path for home directory of wrapper job 140 | # --- [OPEN] --- Wrapper has to be part of the modules repo 141 | WRAPPER_JOB_HOME="${PDI_REPO_MAIN_DIR_PATH}/modules/master_wrapper" 142 | # PDI wrapper job name
143 | WRAPPER_JOB_NAME="jb_master_wrapper" 144 | # get currently selected branch names 145 | cd ${COMMON_CONFIG_DIR} 146 | COMMON_CONFIG_DIR_BRANCH=`git branch` 147 | cd ${PROJECT_CONFIG_DIR} 148 | PROJECT_CONFIG_DIR_BRANCH=`git branch` 149 | cd ${PROJECT_CODE_DIR} 150 | PROJECT_CODE_DIR_BRANCH=`git branch` 151 | 152 | 153 | START_DATETIME=`date '+%Y-%m-%d_%H-%M-%S'` 154 | START_UNIX_TIMESTAMP=`date "+%s"` 155 | 156 | 157 | mkdir -p ${PROJECT_LOG_HOME} 158 | 159 | 160 | echo "Location of log file: ${PROJECT_LOG_HOME}/${JOB_LOG_FILE}" 161 | 162 | cat > ${PROJECT_LOG_HOME}/${JOB_LOG_FILE} <> ${PROJECT_LOG_HOME}/${JOB_LOG_FILE} 2>&1 206 | else 207 | ./kitchen.sh \ 208 | -file="${PROJECT_CODE_PDI_REPO_DIR}/${WRAPPER_JOB_HOME}/${WRAPPER_JOB_NAME}.kjb" \ 209 | -param:PARAM_PROJECT_PROPERTIES_FILE="${PROJECT_PROPERTIES_FILE}" \ 210 | -param:PARAM_JOB_PROPERTIES_FILE="${JOB_PROPERTIES_FILE}" \ 211 | -param:PARAM_JOB_NAME="${JOB_NAME}" \ 212 | -param:PARAM_TRANSFORMATION_NAME="" \ 213 | -param:PARAM_PDI_ARTEFACT_DIRECTORY_PATH="${PROJECT_CODE_PDI_REPO_DIR}/${PDI_ARTEFACT_DIRECTORY_PATH}" \ 214 | -param:PARAM_CONTROL_FILE_DIRECTORY="/tmp/{{ PSGRS_PROJECT_NAME }}/" \ 215 | -param:PARAM_CONTROL_FILE_NAME="${JOB_NAME}" \ 216 | >> ${PROJECT_LOG_HOME}/${JOB_LOG_FILE} 2>&1 217 | fi 218 | 219 | RES=$?
220 | 221 | END_DATETIME=`date '+%Y-%m-%d_%H-%M-%S'` 222 | END_UNIX_TIMESTAMP=`date "+%s"` 223 | echo 224 | echo "End DateTime: ${END_DATETIME}" >> ${PROJECT_LOG_HOME}/${JOB_LOG_FILE} 225 | 226 | 227 | DURATION_IN_SECONDS=`expr ${END_UNIX_TIMESTAMP} - ${START_UNIX_TIMESTAMP}` 228 | #DURATION_IN_MINUTES=`echo "scale=0;${DURATION_IN_SECONDS}/60" | bc` 229 | DURATION_IN_SECONDS_MSG=`printf '%dh:%dm:%ds\n' $((${DURATION_IN_SECONDS}/(60*60))) $((${DURATION_IN_SECONDS}%(60*60)/60)) $((${DURATION_IN_SECONDS}%60))` 230 | 231 | # Project historic logs filename 232 | JOB_LOG_HIST_FILE="${JOB_NAME}.hist.log" 233 | # Project archive logs filename 234 | PROJECT_LOG_ARCHIVE_FILE="${JOB_NAME}_${END_DATETIME}.err.log" 235 | 236 | # Get the duration in human-readable format 237 | # DURATION=`grep "Processing ended after " ${PROJECT_LOG_HOME}/${JOB_LOG_FILE} | sed -n -e 's/^.*Processing ended after after //p'` 238 | 239 | echo "Result: ${RES}" 240 | # DURATION_IN_SECONDS calc missing 241 | echo "Start: ${START_DATETIME} END: ${END_DATETIME} Result: ${RES} Duration: ${DURATION_IN_SECONDS_MSG} - Duration in Seconds: ${DURATION_IN_SECONDS}s" >> ${PROJECT_LOG_HOME}/${JOB_LOG_HIST_FILE} 242 | cat ${PROJECT_LOG_HOME}/${JOB_LOG_FILE} > ${PROJECT_LOG_HOME}/${PROJECT_LOG_ARCHIVE_FILE} 243 | 244 | exit ${RES} 245 | -------------------------------------------------------------------------------- /artefacts/utilities/build-rpm/template.spec: -------------------------------------------------------------------------------- 1 | # This is a sample spec file 2 | 3 | %define _topdir ${PSGRS_RPM_BUILD_HOME} 4 | %define name ${PSGRS_PROJECT_NAME} 5 | %define release 1 # <-- [OPEN has to be replaced at run-time] 6 | %define version 1.01 # <-- [OPEN has to be replaced at run-time] 7 | # SOURCE BOTH PARAM VALUES WHEN USER RUNS package-git-repo.sh 8 | %define buildroot %{_topdir}/%{name}-%{version}-root 9 | 10 | BuildRoot: %{buildroot} 11 | Summary: ${PSGRS_RPM_SUMMARY} 12 | License: GPL 13 | Name: %{name} 
14 | Version: %{version} 15 | Release: %{release} 16 | Group: Development/Tools 17 | Requires: bash 18 | Source: %{name}-%{version}.tgz 19 | 20 | %description 21 | ${PSGRS_RPM_DESCRIPTION} 22 | 23 | %global debug_package %{nil} 24 | 25 | 26 | # Prep is used to set up the environment for building the rpm package 27 | # Expansion of source tarballs is done in this section 28 | %prep 29 | %setup 30 | 31 | # Used to compile and to build the source 32 | %build 33 | # Normally this part would be full of fancy compile stuff. Make this, make that. 34 | # We're simple folks who just want to copy some files out of a tar.gz, 35 | # so we leave this section empty... 36 | 37 | # The installation. 38 | # We actually just put all our install files into a directory structure that 39 | # mimics a server directory structure here 40 | # Normally the “make install” 41 | # command places the files 42 | # where they need to go. You can also copy the files, as we do here... 43 | %install 44 | # target directory of install is %{_topdir}/BUILD 45 | # create the directory structure that matches the final one 46 | # where packages will be installed 47 | 48 | # First we make sure we start clean 49 | rm -rf %{buildroot} 50 | 51 | # Then we create the directories where the files go 52 | # don't worry if the directories exist on your target systems, rpm 53 | # creates them if necessary 54 | mkdir -p %{buildroot}/opt/${PSGRS_PROJECT_NAME} 55 | 56 | 57 | # install -p -m 755 %{_topdir}/BUILD/%{name}-%{version}/* %{buildroot}/opt/${PSGRS_PROJECT_NAME} 58 | cp %{_topdir}/BUILD/%{name}-%{version}/* %{buildroot}/opt/${PSGRS_PROJECT_NAME} 59 | 60 | %clean 61 | rm -rf %{buildroot} 62 | 63 | 64 | %post 65 | echo "Installed %{name} scripts to /opt/${PSGRS_PROJECT_NAME}" 66 | # Contains a list of the files that are part of the package 67 | # See useful directives such as attr here: http://www.rpm.org/max-rpm-snapshot/s1-rpm-specref-files-list-directives.html 68 | 69 | # list files or
directories that should be bundled into the RPM 70 | # optionally set permissions 71 | %files 72 | /opt/${PSGRS_PROJECT_NAME} 73 | # set permissions 74 | #%defattr(-,root,root) 75 | #/usr/local/bin/wget 76 | 77 | %changelog 78 | 79 | # add manual 80 | #%doc %attr(0444,root,root) /usr/local/share/man/man1/wget.1 81 | -------------------------------------------------------------------------------- /config/settings.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | # ========= ORGANISATION-SPECIFIC CONFIGURATION PROPERTIES =========== # 4 | 5 | ## ~~~~~~~~~~~~~~~~~ AMEND FOR YOUR ORGANISATION ~~~~~~~~~~~~~~~~~~~~~## 6 | ## ______________________ ## 7 | # The SSH or HTTPS URL to clone the modules repo 8 | export PSGRS_MODULES_GIT_REPO_URL=git@github.com:diethardsteiner/pentaho-pdi-modules.git 9 | # **Note**: If this repo is not present yet, use this script 10 | # to create it and push it to your Git Server (GitHub, etc). 11 | 12 | # name of the common group folder 13 | export PSGRS_G_COMMON_GROUP_NAME="pentaho-common" 14 | # name of the deployment folder, which holds the group and project folder 15 | export PSGRS_G_DEPLOYMENT_FOLDER_NAME="psgr-target" 16 | 17 | ## ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~## 18 | ### TARGET ENVIRONMENT SETTINGS ### 19 | export PSGRS_PDI_DIR=${HOME}/apps/data-integration 20 | #export PSGRS_LOG_DIR=${HOME}/logs 21 | # OPEN: NEXT FOUR ONES NOT INTEGRATED YET ...
they are for the wrapper.sh 22 | export PSGRS_PDI_REPO_NAME= 23 | export PSGRS_PDI_REPO_USER= 24 | export PSGRS_PDI_REPO_PASS= 25 | export PSGRS_PDI_REPO_MAIN_DIR_PATH= 26 | ### SETTINGS FOR GIT HOOKS ### 27 | export PSGRS_GIT_HOOKS_CHECKS_NEW_FILES_ONLY="N" 28 | # pipe-delimited list of accepted database connection names 29 | export PSGRS_PDI_ACCEPTED_DATABASE_CONNECTION_NAMES_REGEX="^module\_.*|^hive_generic$|^impala_generic$|^mysql_process_control$" 30 | # regex to identify valid file extensions in a code repo 31 | export PSGRS_GIT_CODE_REPO_ACCEPTED_FILE_EXTENSIONS_REGEX="cda$|cdfde$|css$|csv$|gitignore$|html$|jpeg$|js$|json$|kdb$|kjb$|ktr$|md$|png$|prpt$|prpti$|sql$|sh$|svg$|txt$|wcdf$|xanalyzer$|xmi$|xml$" 32 | # regex to identify valid file extensions in a config repo 33 | export PSGRS_GIT_CONFIG_REPO_ACCEPTED_FILE_EXTENSIONS_REGEX="gitignore$|md$|properties$|sh$|spec$|spoonrc$|json$|xml$|txt$|csv$" 34 | # regex specifying the accepted git branch names 35 | export PSGRS_GIT_REPO_ACCEPTED_BRANCH_NAMES_REGEX="^master$|^dev$|^feature\_.+|^release\_.+|^hotfix\_.+" 36 | # regex specifying any words that should not show up in file and folder names 37 | export PSGRS_FILE_OR_FOLDER_NAME_FORBIDDEN_KEYWORD="(dev|test|beta|new|v[0-9]{1})" 38 | # regex specifying the accepted pdi parameter or variable name 39 | export PSGRS_PDI_ACCEPTED_PARAMETER_OR_VARIABLE_NAME="(^(VAR\_|PROP\_|PARAM\_)[A-Z0-9\_]+|^(Internal|awt|embedded|felix|file|https|java|karaf|log4j|org|sun|user|vfs)[a-zA-Z\.]+)" 40 | # regex specifying the accepted pdi job and transformation name 41 | export PSGRS_PDI_ACCEPTED_JOB_OR_TRANSFORMATION_NAME="(.*/)?tr\_[a-z0-9\_]+\.ktr$|(.*/)?jb\_[a-z0-9\_]+\.kjb$" 42 | # regex specifying the IP address match 43 | export PSGRS_PDI_IP_ADDRESS_REGEX="[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}" 44 | # regex specifying the domain name match 45 | export
PSGRS_PDI_DOMAIN_NAME_REGEX="(https://[a-zA-Z]+)|(http://[a-zA-Z]+)|(https\:\/\/[a-zA-Z]+)|(http\:\/\/[a-zA-Z]+)|(www\.[a-zA-Z]+)" 46 | ## ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~## 47 | ### RPM BUILD ### 48 | # directory which will hold the rpm build artefacts - should be within home dir 49 | export PSGRS_RPM_BUILD_HOME=${HOME}/psgrs 50 | # top level description that forms part of the rpm build - use quotation marks 51 | export PSGRS_RPM_SUMMARY="A sample high level summary" 52 | # slightly longer description that forms part of the rpm build - use quotation marks 53 | export PSGRS_RPM_DESCRIPTION="A sample detailed description" -------------------------------------------------------------------------------- /initialise-repo.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | 4 | ## ~~~~~~~~~~~~~~~~~~~~~~~~~ DO NOT CHANGE ~~~~~~~~~~~~~~~~~~~~~~~~~~~## 5 | ## ______________________ ## 6 | 7 | 8 | if [ $# -eq 0 ] || [ -z "$1" ] 9 | then 10 | echo "ERROR: Not all mandatory arguments supplied, please supply all required options" 11 | echo 12 | echo "Usage: initialise-repo.sh ..." 13 | echo "Creates a basic folder structure for a Pentaho code or config repository" 14 | echo 15 | echo "Mandatory arguments" 16 | echo 17 | echo "-a PSGRS_ACTION: Choose number" 18 | echo " (1) Project Repo with Common Config and Modules" 19 | echo " (2) Standalone Project and Config (No Common Artefacts)" 20 | echo " pdi_module" 21 | echo " pdi_module_repo" 22 | echo " project_code" 23 | echo " project_config" 24 | echo " standalone_project_config" 25 | echo " common_code" 26 | echo " common_config" 27 | echo " project_docu" 28 | echo " common_docu" 29 | echo " " 30 | echo "Mandatory arguments:" 31 | echo " " 32 | echo "-g GROUP NAME: Full project group name, e.g. world-wide-trading" 33 | echo " Lower case, only letters and dashes allowed, no underscores, spaces etc."
34 | echo " Minimum of 3 to a maximum of 40 characters." 35 | echo "-p PROJECT NAME: Project name abbreviation, e.g. wwt" 36 | echo " Lower case, only letters allowed, no underscores, dashes etc." 37 | echo " Minimum of 3 to a maximum of 20 letters." 38 | echo "-e ENVIRONMENT: Name of the environment: dev, test, prod or similar. " 39 | echo " Lower case, only letters allowed, no underscores, dashes etc" 40 | echo " Minimum of 3 to a maximum of 10 letters." 41 | echo "-s STORAGE TYPE: Which type of PDI storage to use." 42 | echo " Possible values: file-based, file-repo. Not supported: db-repo, ee-repo" 43 | echo " File-repo is only partially supported. You will have to create your own modules/wrapper jobs." 44 | echo "-w WEB-SPOON: Optional. If you intend to run everything within a WebSpoon Docker container." 45 | echo " For now only relevant if you use the file-based PDI repository." 46 | echo " Using this option means config won't work outside the container!" 47 | echo " Possible values: yes" 48 | echo "" 49 | echo "Sample usage:" 50 | echo "initialise-repo.sh -a 1 -g mysampleproj -p msp -e dev -s file-based" 51 | echo "initialise-repo.sh -a 1 -g mysampleproj -p mys -e dev -s file-repo -w yes" 52 | echo "initialise-repo.sh -a 2 -g mysampleproj -p msp -e dev -s file-based" 53 | echo "initialise-repo.sh -a standalone_project_config -g mysampleproj -p msp -e dev -s file-based" 54 | echo "" 55 | echo "exiting ..." 56 | exit 1 57 | fi 58 | 59 | while getopts ":a:g:p:e:s:w:" opt; do 60 | case $opt in 61 | a) PSGRS_ACTION="$OPTARG" 62 | echo "Submitted PSGRS_ACTION value: ${PSGRS_ACTION}" 63 | ;; 64 | g) export PSGRS_PROJECT_GROUP_NAME="$OPTARG" 65 | echo "Submitted group name value: ${PSGRS_PROJECT_GROUP_NAME}" 66 | if [[ ! ${PSGRS_PROJECT_GROUP_NAME} =~ ^[a-z\-]{3,40}$ ]]; then 67 | echo "Unsupported group name!" 68 | echo "Lower case, only letters and dashes allowed, no underscores etc." 69 | echo "Minimum of 3 to a maximum of 40 characters."
70 | exit 1 71 | fi 72 | ;; 73 | p) export PSGRS_PROJECT_NAME="$OPTARG" 74 | echo "Submitted project name value: ${PSGRS_PROJECT_NAME}" 75 | if [[ ! ${PSGRS_PROJECT_NAME} =~ ^[a-z]{3,20}$ ]]; then 76 | echo "Unsupported project name!" 77 | echo "Lower case, only letters allowed, no underscores, dashes, spaces etc." 78 | echo "Minimum of 3 to a maximum of 20 characters." 79 | exit 1 80 | fi 81 | ;; 82 | e) export PSGRS_ENV="$OPTARG" 83 | echo "Submitted environment value: ${PSGRS_ENV}" 84 | if [[ ! ${PSGRS_ENV} =~ ^[a-z]{3,10}$ ]]; then 85 | echo "Unsupported environment name!" 86 | echo "Lower case, only letters allowed, no underscores, dashes, spaces etc." 87 | echo "Minimum of 3 to a maximum of 10 letters." 88 | exit 1 89 | fi 90 | ;; 91 | s) PSGRS_PDI_STORAGE_TYPE="$OPTARG" 92 | echo "Submitted storage type value: ${PSGRS_PDI_STORAGE_TYPE}" 93 | # check that supplied value is in the list of possible values 94 | # validate() { echo "files file-repo ee-repo" | grep -F -q -w "${PSGRS_PDI_STORAGE_TYPE}"; } 95 | LIST_CHECK=$(echo "file-based file-repo ee-repo" | grep -F -q -w "${PSGRS_PDI_STORAGE_TYPE}" && echo "valid" || echo "invalid") 96 | echo "List check: ${LIST_CHECK}" 97 | if [ ${LIST_CHECK} = "invalid" ]; then 98 | echo "Unsupported storage type!" 99 | echo "Possible values: file-based, file-repo, ee-repo" 100 | exit 1 101 | fi 102 | ;; 103 | w) PSGRS_PDI_WEBSPOON_SUPPORT="$OPTARG" 104 | echo "Submitted WebSpoon Support value: ${PSGRS_PDI_WEBSPOON_SUPPORT}" 105 | ;; 106 | \?)
107 | echo "Invalid option -$OPTARG" >&2 108 | exit 1 109 | ;; 110 | esac 111 | done 112 | 113 | # Example Usage: 114 | # /home/dsteiner/git/pentaho-standardised-git-repo-setup/initialise-repo.sh -a standalone_project_config -g mysampleproj -p mys -e dev -s file-based 115 | # /home/dsteiner/git/pentaho-standardised-git-repo-setup/initialise-repo.sh -a 1 -g mysampleproj -p mys -e dev -s file-based 116 | # /home/dsteiner/git/pentaho-standardised-git-repo-setup/initialise-repo.sh -a 1 -g mysampleproj -p mys -e dev -s file-repo 117 | # /home/dsteiner/git/pentaho-standardised-git-repo-setup/initialise-repo.sh -a 1 -g mysampleproj -p mys -e dev -s file-repo -w yes 118 | 119 | 120 | # MAIN SCRIPT 121 | 122 | PSGRS_WORKING_DIR=`pwd` 123 | PSGRS_SHELL_DIR=$(dirname $0) 124 | 125 | # Source config settings 126 | source ${PSGRS_SHELL_DIR}/config/settings.sh 127 | 128 | 129 | PSGRS_COMMON_GROUP_NAME=${PSGRS_G_COMMON_GROUP_NAME} 130 | # folder that holds the common and project specific repos 131 | # added as there are some scripts that sit outside the repos 132 | PSGRS_DEPLOYMENT_FOLDER_NAME=${PSGRS_G_DEPLOYMENT_FOLDER_NAME} 133 | # create top level folder to not pollute any other folder 134 | 135 | # make sure group name value is set 136 | if [ -z ${PSGRS_PROJECT_GROUP_NAME} ]; then 137 | echo "Not all required arguments were supplied. Group Name value missing." 138 | echo "exiting ..." 139 | exit 1 140 | fi 141 | 142 | # check if directory already exists 143 | # otherwise create it 144 | if [ ! -d "${PSGRS_DEPLOYMENT_FOLDER_NAME}" ]; then 145 | mkdir ${PSGRS_DEPLOYMENT_FOLDER_NAME} 146 | fi 147 | 148 | cd ${PSGRS_DEPLOYMENT_FOLDER_NAME} 149 | 150 | if [ ! -d "${PSGRS_COMMON_GROUP_NAME}" ]; then 151 | mkdir ${PSGRS_COMMON_GROUP_NAME} 152 | fi 153 | 154 | if [ !
-d "${PSGRS_PROJECT_GROUP_NAME}" ]; then 155 | mkdir ${PSGRS_PROJECT_GROUP_NAME} 156 | fi 157 | 158 | PSGRS_DEPLOYMENT_DIR=${PSGRS_WORKING_DIR}/${PSGRS_DEPLOYMENT_FOLDER_NAME} 159 | PSGRS_COMMON_GROUP_DIR=${PSGRS_WORKING_DIR}/${PSGRS_DEPLOYMENT_FOLDER_NAME}/${PSGRS_COMMON_GROUP_NAME} 160 | PSGRS_PROJECT_GROUP_DIR=${PSGRS_WORKING_DIR}/${PSGRS_DEPLOYMENT_FOLDER_NAME}/${PSGRS_PROJECT_GROUP_NAME} 161 | 162 | 163 | echo "===============** PSGRS PATHS **====================" 164 | echo "" 165 | echo "PSGRS SHELL DIR: ${PSGRS_SHELL_DIR}" 166 | echo "PSGRS_DEPLOYMENT_DIR: ${PSGRS_DEPLOYMENT_DIR}" 167 | echo "PSGRS COMMON GROUP DIR: ${PSGRS_COMMON_GROUP_DIR}" 168 | echo "PSGRS PROJECT GROUP DIR: ${PSGRS_PROJECT_GROUP_DIR}" 169 | echo "" 170 | echo "====================================================" 171 | 172 | function pdi_module { 173 | # check if required parameter values are available 174 | if [ -z ${PSGRS_ACTION} ]; then 175 | echo "Not all required arguments were supplied. Required:" 176 | echo "-a " 177 | echo "exiting ..." 178 | exit 1 179 | fi 180 | echo "================PDI MODULES====================" 181 | cd ${PSGRS_COMMON_GROUP_DIR} 182 | PDI_MODULES_DIR=${PSGRS_COMMON_GROUP_DIR}/modules 183 | echo "PDI_MODULES_DIR: ${PDI_MODULES_DIR}" 184 | if [ ! -d "${PDI_MODULES_DIR}" ]; then 185 | echo "Creating PDI modules folder ..." 186 | mkdir ${PDI_MODULES_DIR} 187 | cd ${PDI_MODULES_DIR} 188 | echo "Initialising Git Repo ..." 189 | git init . 190 | echo "Creating and pointing to default git branch" 191 | git checkout -b dev 192 | # git hooks won't work here since the directory structure is different 193 | # echo "Adding Git hooks ..." 194 | # cp ${PSGRS_SHELL_DIR}/artefacts/git/hooks/* ${PDI_MODULES_DIR}/.git/hooks 195 | # we have to create a file so that the master branch is created 196 | echo "creating README file ..." 197 | touch readme.md 198 | echo "adding module_1 sample module ..." 199 | cp -r ${PSGRS_SHELL_DIR}/artefacts/pdi/repo/module_1 .
200 | git add --all 201 | git commit -am "initial commit" 202 | fi 203 | } 204 | 205 | 206 | function project_code { 207 | # check if required parameter values are available 208 | if [ -z ${PSGRS_ACTION} ] || [ -z ${PSGRS_PROJECT_NAME} ] || [ -z ${PSGRS_PDI_STORAGE_TYPE} ]; then 209 | echo "Not all required arguments were supplied. Required:" 210 | echo "-a " 211 | echo "-p " 212 | echo "-s " 213 | echo "exiting ..." 214 | exit 1 215 | fi 216 | echo "================PROJECT CODE====================" 217 | cd ${PSGRS_PROJECT_GROUP_DIR} 218 | PROJECT_CODE_DIR=${PSGRS_PROJECT_GROUP_DIR}/${PSGRS_PROJECT_NAME}-code 219 | echo "PROJECT_CODE_DIR: ${PROJECT_CODE_DIR}" 220 | 221 | if [ ! -d "${PROJECT_CODE_DIR}" ]; then 222 | 223 | echo "Creating project code folder ..." 224 | echo "location: ${PROJECT_CODE_DIR}" 225 | mkdir ${PROJECT_CODE_DIR} 226 | cd ${PROJECT_CODE_DIR} 227 | 228 | echo "Initialising Git Repo ..." 229 | git init . 230 | 231 | echo "Adding Git hooks ..." 232 | cp ${PSGRS_SHELL_DIR}/artefacts/git/hooks/* ${PROJECT_CODE_DIR}/.git/hooks 233 | cp ${PSGRS_SHELL_DIR}/config/settings.sh ${PROJECT_CODE_DIR}/.git/hooks 234 | perl -0777 \ 235 | -pe "s@\{\{ IS_CONFIG \}\}@N@igs" \ 236 | -i ${PROJECT_CODE_DIR}/.git/hooks/pre-commit 237 | if [ ${PSGRS_PDI_STORAGE_TYPE} = "file-based" ]; then 238 | perl -0777 \ 239 | -pe "s@\{\{ IS_REPO_BASED \}\}@N@igs" \ 240 | -i ${PROJECT_CODE_DIR}/.git/hooks/pre-commit 241 | else 242 | perl -0777 \ 243 | -pe "s@\{\{ IS_REPO_BASED \}\}@Y@igs" \ 244 | -i ${PROJECT_CODE_DIR}/.git/hooks/pre-commit 245 | fi 246 | 247 | 248 | echo "Creating and pointing to default git branch" 249 | git checkout -b dev 250 | 251 | echo "Creating basic folder structure ..." 
252 | mkdir -p pdi/repo/${PSGRS_PROJECT_NAME} 253 | mkdir -p pdi/sql/ddl 254 | mkdir -p pentaho-server/repo 255 | mkdir -p pentaho-server/metadata 256 | mkdir -p pentaho-server/mondrian 257 | mkdir -p pentaho-server/prd 258 | mkdir -p shell-scripts 259 | 260 | # adding file so folders can be committed 261 | touch pdi/repo/${PSGRS_PROJECT_NAME}/.gitignore 262 | touch pdi/sql/ddl/.gitignore 263 | touch pentaho-server/repo/.gitignore 264 | touch pentaho-server/metadata/.gitignore 265 | touch pentaho-server/mondrian/.gitignore 266 | touch pentaho-server/prd/.gitignore 267 | touch shell-scripts/this-folder-contains-non-environment-specific-shell-files.md 268 | 269 | cp ${PSGRS_SHELL_DIR}/artefacts/pdi/repo/jb_master.kjb \ 270 | ${PROJECT_CODE_DIR}/pdi/repo/${PSGRS_PROJECT_NAME} 271 | 272 | mv ${PROJECT_CODE_DIR}/pdi/repo/${PSGRS_PROJECT_NAME}/jb_master.kjb \ 273 | ${PROJECT_CODE_DIR}/pdi/repo/${PSGRS_PROJECT_NAME}/jb_${PSGRS_PROJECT_NAME}_master.kjb 274 | 275 | echo "Creating basic README file ..." 276 | echo "Documentation can be found in the dedicated documentation Git repo called ${PSGRS_PROJECT_NAME}-documentation" > readme.md 277 | 278 | if [ ${PSGRS_PDI_STORAGE_TYPE} = "file-repo" ]; then 279 | echo "Adding kettle db connection files ..." 
280 | cp -r ${PSGRS_SHELL_DIR}/artefacts/pdi/repo/*.kdb pdi/repo 281 | 282 | perl -0777 \ 283 | -pe "s@\{\{ VAR_DB_CONNECTION_NAME \}\}@sample_db_connection@igs" \ 284 | -i ${PROJECT_CODE_DIR}/pdi/repo/db_connection_template.kdb 285 | 286 | perl -0777 \ 287 | -pe "s@\{\{ PSGRS_PROJECT_NAME \}\}@${PSGRS_PROJECT_NAME}@igs" \ 288 | -i ${PROJECT_CODE_DIR}/pdi/repo/${PSGRS_PROJECT_NAME}/jb_${PSGRS_PROJECT_NAME}_master.kjb 289 | fi 290 | 291 | if [ ${PSGRS_PDI_STORAGE_TYPE} = "file-based" ]; then 292 | # nothing to do: shared.xml is part of .kettle, which lives in the config repo 293 | perl -0777 \ 294 | -pe "s@\{\{ PSGRS_PROJECT_NAME \}\}@@igs" \ 295 | -i ${PROJECT_CODE_DIR}/pdi/repo/${PSGRS_PROJECT_NAME}/jb_${PSGRS_PROJECT_NAME}_master.kjb 296 | fi 297 | 298 | echo "Adding pdi modules as a git submodule ..." 299 | 300 | git submodule add -b master ${PSGRS_MODULES_GIT_REPO_URL} pdi/repo/modules 301 | git submodule init 302 | git submodule update 303 | 304 | # echo "Setting branch for submodule ..." 305 | # cd pdi/repo/modules 306 | # git checkout master 307 | 308 | 309 | # committing new files 310 | git add --all 311 | git commit -am "initial commit" 312 | 313 | cd ${PROJECT_CODE_DIR} 314 | 315 | # enable pre-commit hook 316 | chmod 700 ${PROJECT_CODE_DIR}/.git/hooks/pre-commit 317 | chmod 700 ${PROJECT_CODE_DIR}/.git/hooks/settings.sh 318 | 319 | fi 320 | } 321 | 322 | function project_config { 323 | # check if required parameter values are available 324 | if [ -z ${PSGRS_ACTION} ] || [ -z ${PSGRS_PROJECT_NAME} ] || [ -z ${PSGRS_ENV} ] || [ -z ${PSGRS_PDI_STORAGE_TYPE} ]; then 325 | echo "Not all required arguments were supplied. Required:" 326 | echo "-a " 327 | echo "-p " 328 | echo "-e " 329 | echo "-s " 330 | echo "exiting ..." 
331 | exit 1 332 | fi 333 | echo "================PROJECT CONFIG==================" 334 | cd ${PSGRS_PROJECT_GROUP_DIR} 335 | PROJECT_CONFIG_DIR=${PSGRS_PROJECT_GROUP_DIR}/${PSGRS_PROJECT_NAME}-config-${PSGRS_ENV} 336 | echo "PROJECT_CONFIG_DIR: ${PROJECT_CONFIG_DIR}" 337 | 338 | if [ ! -d "${PROJECT_CONFIG_DIR}" ]; then 339 | 340 | echo "Creating project config folder ..." 341 | echo "location: ${PROJECT_CONFIG_DIR}" 342 | mkdir ${PROJECT_CONFIG_DIR} 343 | cd ${PROJECT_CONFIG_DIR} 344 | 345 | echo "Initialising Git Repo ..." 346 | git init . 347 | 348 | echo "Creating and pointing to default git branch" 349 | git checkout -b master 350 | 351 | echo "Adding Git hooks ..." 352 | cp ${PSGRS_SHELL_DIR}/artefacts/git/hooks/* ${PROJECT_CONFIG_DIR}/.git/hooks 353 | cp ${PSGRS_SHELL_DIR}/config/settings.sh ${PROJECT_CONFIG_DIR}/.git/hooks 354 | perl -0777 \ 355 | -pe "s@\{\{ IS_CONFIG \}\}@Y@igs" \ 356 | -i ${PROJECT_CONFIG_DIR}/.git/hooks/pre-commit 357 | perl -0777 \ 358 | -pe "s@\{\{ IS_REPO_BASED \}\}@N@igs" \ 359 | -i ${PROJECT_CONFIG_DIR}/.git/hooks/pre-commit 360 | 361 | echo "Creating basic folder structure ..." 362 | 363 | # mkdir -p pdi/.kettle -> standalone project only 364 | mkdir -p pdi/metadata 365 | mkdir -p pdi/properties 366 | mkdir -p pdi/schedules 367 | mkdir -p pdi/shell-scripts 368 | mkdir -p pdi/test-data 369 | mkdir -p pentaho-server/connections 370 | 371 | # adding file so that the folders can be committed 372 | touch pdi/metadata/.gitignore 373 | touch pdi/properties/.gitignore 374 | touch pdi/schedules/.gitignore 375 | touch pdi/shell-scripts/.gitignore 376 | touch pdi/test-data/.gitignore 377 | touch pentaho-server/connections/.gitignore 378 | 379 | echo "Adding essential shell files ..."
380 | 381 | cp ${PSGRS_SHELL_DIR}/artefacts/project-config/wrapper.sh \ 382 | ${PROJECT_CONFIG_DIR}/pdi/shell-scripts 383 | 384 | perl -0777 \ 385 | -pe "s@\{\{ PSGRS_PROJECT_NAME \}\}@${PSGRS_PROJECT_NAME}@igs" \ 386 | -i ${PROJECT_CONFIG_DIR}/pdi/shell-scripts/wrapper.sh 387 | 388 | 389 | cp ${PSGRS_SHELL_DIR}/artefacts/project-config/run_jb_name.sh \ 390 | ${PROJECT_CONFIG_DIR}/pdi/shell-scripts 391 | 392 | mv ${PROJECT_CONFIG_DIR}/pdi/shell-scripts/run_jb_name.sh \ 393 | ${PROJECT_CONFIG_DIR}/pdi/shell-scripts/run_jb_${PSGRS_PROJECT_NAME}_master.sh 394 | 395 | perl -0777 \ 396 | -pe "s@your_project_name@${PSGRS_PROJECT_NAME}@igs" \ 397 | -i ${PROJECT_CONFIG_DIR}/pdi/shell-scripts/run_jb_${PSGRS_PROJECT_NAME}_master.sh 398 | 399 | perl -0777 \ 400 | -pe "s@jb_name@jb_${PSGRS_PROJECT_NAME}_master@igs" \ 401 | -i ${PROJECT_CONFIG_DIR}/pdi/shell-scripts/run_jb_${PSGRS_PROJECT_NAME}_master.sh 402 | 403 | chmod 700 ${PROJECT_CONFIG_DIR}/pdi/shell-scripts/*.sh 404 | 405 | echo "Adding essential properties files ..." 
406 | 407 | envsubst \ 408 | < ${PSGRS_SHELL_DIR}/artefacts/project-config/project.properties \ 409 | > ${PROJECT_CONFIG_DIR}/pdi/properties/project.properties 410 | 411 | # rename project properties file 412 | mv ${PROJECT_CONFIG_DIR}/pdi/properties/project.properties \ 413 | ${PROJECT_CONFIG_DIR}/pdi/properties/${PSGRS_PROJECT_NAME}.properties 414 | 415 | touch ${PROJECT_CONFIG_DIR}/pdi/properties/jb_${PSGRS_PROJECT_NAME}_master.properties 416 | 417 | # copy deployment scripts across 418 | # [OPEN] 419 | 420 | # rpm script 421 | mkdir -p utilities/build-rpm 422 | 423 | cp \ 424 | ${PSGRS_SHELL_DIR}/artefacts/git/package-git-repo.sh \ 425 | ${PROJECT_CONFIG_DIR}/utilities/build-rpm 426 | 427 | cp \ 428 | ${PSGRS_SHELL_DIR}/config/settings.sh \ 429 | ${PROJECT_CONFIG_DIR}/utilities/build-rpm 430 | 431 | envsubst \ 432 | < ${PSGRS_SHELL_DIR}/artefacts/utilities/build-rpm/template.spec \ 433 | > ${PROJECT_CONFIG_DIR}/utilities/build-rpm/template.spec 434 | 435 | 436 | echo "Creating basic README file ..." 437 | echo "Project specific configuration for ${PSGRS_ENV} environment." > ${PROJECT_CONFIG_DIR}/readme.md 438 | 439 | # commit new files 440 | git add --all 441 | git commit -am "initial commit" 442 | 443 | # enable pre-commit hook 444 | chmod 700 ${PROJECT_CONFIG_DIR}/.git/hooks/pre-commit 445 | chmod 700 ${PROJECT_CONFIG_DIR}/.git/hooks/settings.sh 446 | 447 | 448 | fi 449 | } 450 | 451 | function standalone_project_config { 452 | # This caters for projects that do not need a common project or config 453 | # check if required parameter values are available 454 | if [ -z ${PSGRS_ACTION} ] || [ -z ${PSGRS_PROJECT_NAME} ] || [ -z ${PSGRS_ENV} ] || [ -z ${PSGRS_PDI_STORAGE_TYPE} ]; then 455 | echo "Not all required arguments were supplied. Required:" 456 | echo "-a " 457 | echo "-p " 458 | echo "-e " 459 | echo "-s " 460 | echo "exiting ..."
461 | exit 1 462 | fi 463 | 464 | project_config 465 | 466 | mkdir -p pdi/.kettle 467 | cp ${PSGRS_SHELL_DIR}/artefacts/pdi/.kettle/.gitignore \ 468 | ${PROJECT_CONFIG_DIR}/pdi/.kettle 469 | 470 | echo "Adding essential shell files ..." 471 | 472 | export PSGRS_KETTLE_HOME=${PROJECT_CONFIG_DIR}/pdi 473 | 474 | envsubst \ 475 | < ${PSGRS_SHELL_DIR}/artefacts/common-config/set-env-variables.sh \ 476 | > ${PROJECT_CONFIG_DIR}/pdi/shell-scripts/set-env-variables.sh 477 | 478 | # add_kettle_artefacts 479 | echo "Adding .kettle files for ${PSGRS_PDI_STORAGE_TYPE} ..." 480 | cp ${PSGRS_SHELL_DIR}/artefacts/pdi/.kettle/kettle.properties \ 481 | ${PROJECT_CONFIG_DIR}/pdi/.kettle 482 | 483 | if [ ${PSGRS_PDI_STORAGE_TYPE} = 'file-repo' ]; then 484 | 485 | export PSGRS_PDI_REPO_NAME=${PSGRS_PROJECT_NAME} 486 | export PSGRS_PDI_REPO_DESCRIPTION="This is the repo for the ${PSGRS_PROJECT_NAME} project" 487 | if [ "${PSGRS_PDI_WEBSPOON_SUPPORT}" = "yes" ]; then 488 | # we mount the project code repo into the Docker container under /root/my-project 489 | export PSGRS_PDI_REPO_PATH=/root/${PSGRS_DEPLOYMENT_FOLDER_NAME}/${PSGRS_PROJECT_GROUP_NAME}/${PSGRS_PROJECT_NAME}-code/pdi/repo 490 | else 491 | export PSGRS_PDI_REPO_PATH=${PSGRS_PROJECT_GROUP_DIR}/${PSGRS_PROJECT_NAME}-code/pdi/repo 492 | fi 493 | 494 | envsubst \ 495 | < ${PSGRS_SHELL_DIR}/artefacts/pdi/.kettle/repositories-file.xml \ 496 | > ${PROJECT_CONFIG_DIR}/pdi/.kettle/repositories.xml 497 | 498 | fi 499 | 500 | if [ ${PSGRS_PDI_STORAGE_TYPE} = "file-based" ]; then 501 | 502 | cp ${PSGRS_SHELL_DIR}/artefacts/pdi/.kettle/shared.xml \ 503 | ${PROJECT_CONFIG_DIR}/pdi/.kettle 504 | fi 505 | 506 | cp ${PSGRS_SHELL_DIR}/artefacts/pdi/.kettle/.spoonrc \ 507 | ${PROJECT_CONFIG_DIR}/pdi/.kettle 508 | 509 | # disable pre-commit hook 510 | chmod 400 ${PROJECT_CONFIG_DIR}/.git/hooks/pre-commit 511 | chmod 400 ${PROJECT_CONFIG_DIR}/.git/hooks/settings.sh 512 | 513 | 514 | # commit new files 515 | git add --all 516 | git commit -am 
"initial commit" 517 | 518 | # enable pre-commit hook 519 | chmod 700 ${PROJECT_CONFIG_DIR}/.git/hooks/pre-commit 520 | chmod 700 ${PROJECT_CONFIG_DIR}/.git/hooks/settings.sh 521 | 522 | echo "" 523 | echo "===============================" 524 | echo "" 525 | echo -e "\e[34m\e[47mIMPORTANT\e[0m" 526 | echo "Amend the following configuration file:" 527 | echo "${PROJECT_CONFIG_DIR}/pdi/shell-scripts/set-env-variables.sh" 528 | echo "" 529 | echo "Before using Spoon, source this file:" 530 | echo "source ${PROJECT_CONFIG_DIR}/pdi/shell-scripts/set-env-variables.sh" 531 | echo "===============================" 532 | echo "" 533 | 534 | # echo "Running set-env-variables.sh now so that at least KETTLE_HOME is defined." 535 | # echo "You can start PDI Spoon now if working on a dev machine." 536 | echo "" 537 | 538 | source ${PROJECT_CONFIG_DIR}/pdi/shell-scripts/set-env-variables.sh 539 | 540 | } 541 | 542 | 543 | function common_config { 544 | # check if required parameter values are available 545 | if [ -z ${PSGRS_ACTION} ] || [ -z ${PSGRS_ENV} ] || [ -z ${PSGRS_PDI_STORAGE_TYPE} ]; then 546 | echo "Not all required arguments were supplied. Required:" 547 | echo "-a " 548 | echo "-e " 549 | echo "-s " 550 | echo "exiting ..." 551 | exit 1 552 | fi 553 | echo "==========COMMON CONFIG==================" 554 | cd ${PSGRS_COMMON_GROUP_DIR} 555 | COMMON_CONFIG_DIR=${PSGRS_COMMON_GROUP_DIR}/common-config-${PSGRS_ENV} 556 | echo "COMMON_CONFIG_DIR: ${COMMON_CONFIG_DIR}" 557 | if [ ! -d "${COMMON_CONFIG_DIR}" ]; then 558 | 559 | echo "Creating common config folder ..." 560 | echo "location: ${COMMON_CONFIG_DIR}" 561 | mkdir ${COMMON_CONFIG_DIR} 562 | cd ${COMMON_CONFIG_DIR} 563 | 564 | echo "Initialising Git Repo ..." 565 | git init . 566 | 567 | echo "Creating and pointing to default git branch" 568 | git checkout -b master 569 | 570 | echo "Creating basic folder structure ..." 
571 | 572 | mkdir -p pdi/.kettle 573 | cp ${PSGRS_SHELL_DIR}/artefacts/pdi/.kettle/.gitignore \ 574 | ${COMMON_CONFIG_DIR}/pdi/.kettle 575 | 576 | mkdir -p pdi/shell-scripts 577 | 578 | 579 | echo "Adding Git hooks ..." 580 | cp ${PSGRS_SHELL_DIR}/artefacts/git/hooks/* ${COMMON_CONFIG_DIR}/.git/hooks 581 | cp ${PSGRS_SHELL_DIR}/config/settings.sh ${COMMON_CONFIG_DIR}/.git/hooks 582 | 583 | perl -0777 \ 584 | -pe "s@\{\{ IS_CONFIG \}\}@Y@igs" \ 585 | -i ${COMMON_CONFIG_DIR}/.git/hooks/pre-commit 586 | perl -0777 \ 587 | -pe "s@\{\{ IS_REPO_BASED \}\}@N@igs" \ 588 | -i ${COMMON_CONFIG_DIR}/.git/hooks/pre-commit 589 | 590 | # add_kettle_artefacts 591 | 592 | echo "Adding .kettle files ..." 593 | 594 | cp ${PSGRS_SHELL_DIR}/artefacts/pdi/.kettle/kettle.properties \ 595 | pdi/.kettle 596 | 597 | if [ ${PSGRS_PDI_STORAGE_TYPE} = "file-repo" ]; then 598 | 599 | export PSGRS_PDI_REPO_NAME=${PSGRS_PROJECT_NAME} 600 | export PSGRS_PDI_REPO_DESCRIPTION="This is the repo for the ${PSGRS_PROJECT_NAME} project" 601 | if [ "${PSGRS_PDI_WEBSPOON_SUPPORT}" = "yes" ]; then 602 | # we mount the project code repo into the Docker container under /root/my-project 603 | export PSGRS_PDI_REPO_PATH=/root/${PSGRS_DEPLOYMENT_FOLDER_NAME}/${PSGRS_PROJECT_GROUP_NAME}/${PSGRS_PROJECT_NAME}-code/pdi/repo 604 | else 605 | export PSGRS_PDI_REPO_PATH=${PSGRS_PROJECT_GROUP_DIR}/${PSGRS_PROJECT_NAME}-code/pdi/repo 606 | fi 607 | 608 | 609 | envsubst \ 610 | < ${PSGRS_SHELL_DIR}/artefacts/pdi/.kettle/repositories-file.xml \ 611 | > ${COMMON_CONFIG_DIR}/pdi/.kettle/repositories.xml 612 | 613 | fi 614 | if [ ${PSGRS_PDI_STORAGE_TYPE} = "file-based" ]; then 615 | cp ${PSGRS_SHELL_DIR}/artefacts/pdi/.kettle/shared.xml \ 616 | pdi/.kettle 617 | fi 618 | # --- 619 | echo "Adding essential shell files ..." 
620 | 621 | export PSGRS_KETTLE_HOME=${COMMON_CONFIG_DIR}/pdi 622 | 623 | envsubst \ 624 | < ${PSGRS_SHELL_DIR}/artefacts/common-config/set-env-variables.sh \ 625 | > ${COMMON_CONFIG_DIR}/pdi/shell-scripts/set-env-variables.sh 626 | 627 | cp ${PSGRS_SHELL_DIR}/artefacts/pdi/.kettle/.spoonrc \ 628 | ${COMMON_CONFIG_DIR}/pdi/.kettle 629 | 630 | # commit new files 631 | git add --all 632 | git commit -am "initial commit" 633 | 634 | # enable pre-commit hook 635 | chmod 700 ${COMMON_CONFIG_DIR}/.git/hooks/pre-commit 636 | chmod 700 ${COMMON_CONFIG_DIR}/.git/hooks/settings.sh 637 | 638 | 639 | echo "Creating basic README file ..." 640 | echo "Common configuration for ${PSGRS_ENV} environment." > ${COMMON_CONFIG_DIR}/readme.md 641 | 642 | echo "" 643 | echo "===============================" 644 | echo "" 645 | echo -e "\e[34m\e[47mIMPORTANT\e[0m" 646 | echo "Amend the following configuration file:" 647 | echo "${COMMON_CONFIG_DIR}/pdi/shell-scripts/set-env-variables.sh" 648 | echo "" 649 | echo "" 650 | echo "Before using Spoon, source this file:" 651 | echo "source ${COMMON_CONFIG_DIR}/pdi/shell-scripts/set-env-variables.sh" 652 | echo "===============================" 653 | echo "" 654 | 655 | # echo "Running set-env-variables.sh now so that at least KETTLE_HOME is defined." 656 | # echo "You can start PDI Spoon now if working on a dev machine." 657 | echo "" 658 | 659 | source ${COMMON_CONFIG_DIR}/pdi/shell-scripts/set-env-variables.sh 660 | fi 661 | } 662 | 663 | 664 | function project_docu { 665 | # check if required parameter values are available 666 | if [ -z ${PSGRS_ACTION} ] || [ -z ${PSGRS_PROJECT_NAME} ]; then 667 | echo "Not all required arguments were supplied. Required:" 668 | echo "-a " 669 | echo "-p " 670 | echo "exiting ..." 
671 | exit 1 672 | fi 673 | echo "===========PROJECT DOCUMENTATION==================" 674 | cd ${PSGRS_PROJECT_GROUP_DIR} 675 | PROJECT_DOCU_DIR=${PSGRS_PROJECT_GROUP_DIR}/${PSGRS_PROJECT_NAME}-documentation 676 | echo "PROJECT_DOCU_DIR: ${PROJECT_DOCU_DIR}" 677 | if [ ! -d "${PROJECT_DOCU_DIR}" ]; then 678 | 679 | echo "Creating project documentation folder ..." 680 | echo "location: ${PROJECT_DOCU_DIR}" 681 | 682 | mkdir ${PROJECT_DOCU_DIR} 683 | cd ${PROJECT_DOCU_DIR} 684 | 685 | echo "Initialising Git Repo ..." 686 | git init . 687 | 688 | echo "Creating and pointing to default git branch" 689 | git checkout -b master 690 | 691 | echo "Creating basic README file ..." 692 | echo "# Documentation for ${PSGRS_PROJECT_NAME}" > ${PROJECT_DOCU_DIR}/readme.md 693 | 694 | # commit new files 695 | git add --all 696 | git commit -am "initial commit" 697 | 698 | fi 699 | } 700 | 701 | function common_docu { 702 | # check if required parameter values are available 703 | if [ -z ${PSGRS_ACTION} ]; then 704 | echo "Not all required arguments were supplied. Required:" 705 | echo "-a " 706 | echo "exiting ..." 707 | exit 1 708 | fi 709 | echo "===========COMMON DOCUMENTATION==================" 710 | cd ${PSGRS_COMMON_GROUP_DIR} 711 | COMMON_DOCU_DIR=${PSGRS_COMMON_GROUP_DIR}/common-documentation 712 | echo "COMMON_DOCU_DIR: ${COMMON_DOCU_DIR}" 713 | if [ ! -d "${COMMON_DOCU_DIR}" ]; then 714 | 715 | echo "Creating common documentation folder ..." 716 | echo "location: ${COMMON_DOCU_DIR}" 717 | 718 | mkdir ${COMMON_DOCU_DIR} 719 | cd ${COMMON_DOCU_DIR} 720 | 721 | echo "Initialising Git Repo ..." 722 | git init . 723 | 724 | echo "Creating and pointing to default git branch" 725 | git checkout -b master 726 | 727 | echo "Creating basic README file ..."
728 | echo "# Common Documentation" > ${COMMON_DOCU_DIR}/readme.md 729 | 730 | # commit new files 731 | git add --all 732 | git commit -am "initial commit" 733 | 734 | fi 735 | } 736 | 737 | 738 | # full setup with common config 739 | if [ ${PSGRS_ACTION} = "1" ]; then 740 | project_config 741 | project_code 742 | project_docu 743 | common_docu 744 | common_config 745 | 746 | # copy utility scripts 747 | cd ${PSGRS_DEPLOYMENT_DIR} 748 | cp ${PSGRS_SHELL_DIR}/artefacts/git/update-all-git-repos.sh . 749 | 750 | if [ "${PSGRS_PDI_WEBSPOON_SUPPORT}" = "yes" ]; then 751 | cat > ${PSGRS_DEPLOYMENT_DIR}/start-webspoon.sh < ${PSGRS_DEPLOYMENT_DIR}/start-webspoon.sh < Each team possibly defines their own standards, but what about global standards? 38 | 39 | # Challenges 40 | 41 | - **Standards** are **boring** ... 42 | - Teams don't **talk** to each other 43 | - The usual PDI developer is **not a programmer**. **Version Control System** (Git) is often an unfamiliar concept. 44 | - Extremely tight schedules: **Just get the job done!** 45 | 46 | ## Total Chaos? 47 | 48 | - **Inconsistent usage of branches**. *Team A* uses `master` branch for development and *Team B* uses it for the prod-ready code. Which one to deploy? 49 | - Different **file name conventions** being used or none at all. 50 | - Whole lot of **file types** committed to the Git repository that shouldn't be there in the first place. 51 | - Hard coded **configuration details**. 52 | - Code can't run easily in environment X because setup is not flexible enough 53 | - Difficult to identify **main job** due to lack of consistent naming across projects. 54 | - Teams will **develop code over and over again**. 55 | - On deployment, not all required tables are created automatically 56 | 57 | ## My world - their world 58 | 59 | ![IMG_0050.PNG](pics/IMG_0050.PNG) 60 | 61 | ## Think of ... 
deploying 62 | 63 | - an automated process (**DevOps**) has to be able to pick up your code and deploy it 64 | - this requires **standards** 65 | - this requires strict **separation** of **code** and **configuration**: In prod config details might be supplied differently than in dev. 66 | 67 | ## Think of ... supporting 68 | 69 | Think of the one who has to monitor and support your projects! 70 | 71 | - They shouldn't have to consult a **100 page project-specific handbook** to keep your process alive. 72 | - They **monitor 100 other projects**. 73 | - **Consistency is key!** 74 | 75 | # What we are aiming for 76 | 77 | **ENFORCE**: 78 | 79 | - **standardised git folder structure** setup 80 | - **naming conventions** (to some extent) 81 | 82 | **ENABLE**: 83 | 84 | - **easy configuration** of **multiple environments** (but not necessarily production) 85 | - running **multiple projects next to each other** with the same OS user 86 | - **Simulation of multiple environments** on one machine within one user account 87 | - **sharing of artefacts** across multiple projects (PDI and config) 88 | 89 | No software dependencies: We are cut off from the world 90 | 91 | # Artefacts 92 | 93 | ## PDI Store Types 94 | 95 | - File based 96 | - File Repo 97 | - DB Repo 98 | - Pentaho Repository (Jackrabbit based, CE and EE) 99 | 100 | 101 | ## PDI Artefacts 102 | 103 | 104 | | Name | Storage Type | Purpose 105 | | ----------------------------- | :------------: | ---- | 106 | | `.kettle/kettle.properties` | all | Stores global PDI properties 107 | | `.kettle/repositories.xml` | repo | Stores locations of PDI repositories 108 | | `.kettle/shared.xml` | file-based | Enables sharing DB connection details 109 | | `.kettle/metastore` | all | Stores various other artefacts 110 | | `<connection_name>.kdb` | repo | Stores a db connection 111 | | `.kjb` | all | PDI job 112 | | `.ktr` | all | PDI transformation 113 | 114 | 115 | ## Pentaho Server Artefacts 116 | 117 | | Name | File Extension | Store as is?
| 118 | | -------------------- | :--------------: | :------------: | 119 | | Mondrian Schema | `xml` | yes | 120 | | Metadata Model | `xmi` (xml) | yes | 121 | | Analyzer Report | `xanalyzer` (xml) | yes | 122 | | Interactive Report | `prpti` (zip) | no | 123 | | CDE | `cda`, `cdfde`, `wcdf`, `html`, `js`, `css`, etc | yes | 124 | | DB Connection | `json` | yes | 125 | 126 | 127 | # Solution 128 | 129 | ## Developers need ... 130 | 131 | **A starter package**: 132 | 133 | - with **predefined folder structure** and 134 | - **git hooks** to control names and file types that can be committed. 135 | 136 | ## Separating Configuration from Code (1) 137 | 138 | - Configuration details stored in **dedicated Git Repo per environment** 139 | - Only one branch used: `master` 140 | 141 | > **Give Me Code! Only Code!** 142 | 143 | > **We develop not for any specific environment, but for any environment**: Process has to be generic enough! 144 | 145 | ## Separating Configuration from Code (2) 146 | 147 | Reasons for separation of code and config: 148 | 149 | - **Security**: Developers should not see prod config 150 | - **Incompatible with branching**: You can't merge config from dev branch into test branch since parameter values are different. 151 | - **Avoids mix-ups**: Dev config should not be on prod 152 | - **Enforces parameterisation of code**: Developers are more conscious since code and config repos are physically separated 153 | 154 | ## Separating Configuration from Code (3) 155 | 156 | ![code-config-separation](pics/code-config-separation.png) 157 | 158 | ## What is NOT Code 159 | 160 | - **Configuration**: Goes into dedicated config repo by environment. 161 | - **Documentation**: Goes into dedicated docu repo. 162 | - **Data**: 163 | - Lookup Data: E.g. business users provide you with lookup data to enrich operational data. This should be stored separately. 164 | - Test Data: Can be stored with your code since it serves the purpose of testing the quality of your code.
165 | - **Binary files**: Excel, Word, Zip files etc 166 | 167 | ## Standardised Git Repo Structure - Code Repo 168 | 169 | | folder | description 170 | |------------------------------------ |--------------------------------- 171 | | `pdi/repo` | pdi files (ktr, kjb). Also root of file based repo if used. 172 | | `pdi/sql` | SQL queries 173 | | `pdi/sql/ddl` | ddl 174 | | `pentaho-server/metadata` | pentaho metadata models 175 | | `pentaho-server/mondrian` | mondrian cube definitions 176 | | `pentaho-server/repo` | contains export from pentaho server repo 177 | | `pentaho-server/prd` | pentaho report files 178 | | `shell-scripts` | any shell-scripts that don't hold configuration specific instructions 179 | 180 | > **Note**: Data, like lookup tables, must not be stored with the code. For development and unit testing they can be stored in the `config` git repo's `test-data` folder. But in prod it must reside outside any git repo if it is the only source available. 181 | 182 | ## Standardised Git Repo Structure - Configuration Repo 183 | 184 | | folder | description 185 | |------------------------------------ |--------------------------------- 186 | | `pdi/.kettle` | pdi config files 187 | | `pdi/metadata` | any metadata files that drive DI processes 188 | | `pdi/properties` | properties files sourced by pdi 189 | | `pdi/schedules` | holds crontab instructions, DI server schedules or similar 190 | | `pdi/shell-scripts` | shell scripts to execute e.g. PDI jobs 191 | | `pdi/test-data` | optional: test data for development or unit testing - specific to environment 192 | | `pentaho-server/connections` | pentaho server connections 193 | | `utilities` | utility scripts, e.g. for building rpm packages 194 | 195 | ## Enforcing Standards via Git 196 | 197 | **Special thanks** to **Luis Silva** for suggesting **Git Hooks** and providing some of the hooks.
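To make the idea concrete, here is a minimal, hypothetical sketch of how a pre-commit hook can validate staged file extensions. The function name `check_staged_files` and the shortened regex below are illustrative assumptions; the actual hook ships in `artefacts/git/hooks/pre-commit` and reads its patterns (e.g. `PSGRS_GIT_CODE_REPO_ACCEPTED_FILE_EXTENSIONS_REGEX`) from `config/settings.sh`.

```shell
#!/bin/sh
# Sketch only - not the project's actual pre-commit hook.
# Shortened accepted-extensions regex for illustration; the real list
# lives in PSGRS_GIT_CODE_REPO_ACCEPTED_FILE_EXTENSIONS_REGEX.
ACCEPTED_FILE_EXTENSIONS_REGEX="kjb$|ktr$|md$|sh$|sql$|xml$"

check_staged_files() {
  # Takes file names as arguments; a real hook would feed it the
  # output of: git diff --cached --name-only --diff-filter=ACM
  rc=0
  for f in "$@"; do
    if ! printf '%s\n' "$f" | grep -Eq "${ACCEPTED_FILE_EXTENSIONS_REGEX}"; then
      echo "REJECTED: ${f} - extension not in accepted list" >&2
      rc=1
    fi
  done
  return $rc
}
```

Git runs the `pre-commit` hook before every commit; a non-zero exit code aborts the commit, which is how the checks on the following slides are enforced.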
198 | 199 | ## Enforcing Standards via Git - File Names and Paths 200 | 201 | **Check that**: 202 | 203 | - File names do not contain **non ASCII characters** 204 | - File names are **unique** 205 | - File names are **lower case** 206 | - File or folder names do not contain `dev`, `test`, `beta`, `new`, `v[0-9]{1}` 207 | - File type is in the list of **accepted file extensions** 208 | - PDI job and transformation filenames meet **naming conventions** 209 | 210 | ## Enforcing Standards via Git - File Content 211 | 212 | PDI jobs and transformations: 213 | 214 | - **Repository path** matches their OS level filesystem path 215 | - Hardcoded **IP addresses** are not used 216 | - Hardcoded **domain names** are not used 217 | - **Parameters** and **variables** follow naming conventions (upper case, prefixes) 218 | - Referenced **database connections** are part of a specified list (and not empty) 219 | - Defined **database connections** are part of a specified list 220 | 221 | ## Enforcing Standards via Git - Repo 222 | 223 | - Git **Branch name** is valid: Checks against a certain set of **accepted branch names**. 224 | 225 | 226 | ## Pre-Commit File Extension Check 227 | 228 | Supported Extensions: 229 | 230 | - **Code Repository**: cda, cdfde, css, csv, html, jpeg, js, json, kjb, ktr, md, png, prpt, prpti, sh, svg, txt, wcdf, xanalyzer, xmi, xml 231 | - **Config Repository**: csv, md, properties, sh 232 | 233 | ## Other Git Gems 234 | 235 | - **Generate Manifest**: Allows you to see which version of code was added to a package (when you prepare code for deployment) 236 | - **Generate Changelog**: Visibility of what features, bug fixes etc were implemented in the last build. 237 | 238 | 239 | # Auto-Setup 240 | 241 | Use the `initialise-repo.sh` script to set up a standardised **Git repo**.
Repos can be created individually or in popular combinations: 242 | 243 | **project-specific**: 244 | 245 | - **config repo** for a given environment (`-conf-`) 246 | - **code repo** (`-code`) 247 | - **docu repo** (`-documentation`) 248 | 249 | **common**: 250 | 251 | - **config repo** for a given environment (`common-conf-`) 252 | - **docu repo** (`common-documentation`) 253 | 254 | **modules**: 255 | 256 | - PDI **modules** (`modules`): for reusable code/patterns. Holds plain modules only, so it can be used either in a file-based or repo-based PDI setup. 257 | - PDI **modules repo** (`modules-pdi-repo`): required when creating modules via a PDI repo. 258 | 259 | ## Structure: Standalone Project (1) 260 | 261 | - Download the code from [GitHub](https://github.com/diethardsteiner/pentaho-standardised-git-repo-setup) 262 | - Follow the configuration instructions outlined in the README 263 | - Example: Create a standalone project with a PDI file repo - no shared artefacts: 264 | 265 | 266 | ``` 267 | ./initialise-repo.sh -a 2 -p mysampleproj -e dev -s file-repo 268 | ``` 269 | 270 | ## Structure: Standalone Project (2) 271 | 272 | ![Picture: Structure without common artefacts](./pics/structure-without-common-artefacts.png) 273 | 274 | ## Structure: Standalone Project (3) 275 | 276 | ``` 277 | mysampleproj-code 278 | ├── .git 279 | │   └── hooks <-- COMMIT CHECKS 280 | ├── pdi 281 | │   ├── repo 282 | │   │   ├── db_connection_template.kdb 283 | │   │   ├── modules <-- REUSABLE CODE 284 | │   │   │   ├── continuous_delivery 285 | │   │   │   ├── database_versioning_tool 286 | │   │   │   ├── master_wrapper 287 | │   │   │   ├── pentaho_server_refresh 288 | │   │   │   └── restartable_job 289 | │   │   └── mysampleproj 290 | │   └── sql 291 | │   └── ddl 292 | ├── pentaho-server 293 | │   ├── metadata 294 | │   ├── mondrian 295 | │   ├── prd 296 | │   └── repo 297 | ├── readme.md 298 | └── shell-scripts 299 | ``` 300 | 301 | ## Structure: Standalone Project (4) 302 | 303 | ```
304 | mysampleproj-config-dev/ 305 | ├── .git 306 | │   └── hooks <-- COMMIT CHECKS 307 | ├── pdi 308 | │   ├── .kettle <-- PDI CONFIG 309 | │   │   ├── kettle.properties 310 | │   │   ├── repositories.xml 311 | │   │   └── .spoonrc 312 | │   ├── metadata 313 | │   ├── properties 314 | │   │   ├── jb_mysampleproj_master.properties <-- JOB CONFIG 315 | │   │   └── mysampleproj.properties <-- PROJECT CONFIG 316 | │   ├── schedules 317 | │   ├── shell-scripts <-- STANDARDISED EXECUTION SCRIPTS 318 | │   │   ├── run_jb_mysampleproj_master.sh <-- MASTER JOB RUNNER 319 | │   │   ├── set-env-variables.sh 320 | │   │   └── wrapper.sh <-- GENERIC JOB WRAPPER (USED BY ALL RUNNERS) 321 | │   └── test-data 322 | ├── pentaho-server 323 | │   └── connections 324 | ├── readme.md 325 | └── utilities 326 | └── build-rpm 327 | ├── package-git-repo.sh 328 | ├── settings.sh 329 | └── template.spec 330 | ``` 331 | 332 | ## Structure: Project with Common Artefacts 333 | 334 | ![Picture: Structure with common artefacts](pics/structure-with-common-artefacts.png) 335 | 336 | ## Spoon Pre-Configured 337 | 338 | - Important settings enforced out-of-the-box: 339 | 340 | ![Picture: Spoon Preconfigured](./pics/spoon-preconfigured.png) 341 | 342 | ## Repository 343 | 344 | - Preconfigured access to a file based **PDI repository**: After initialisation developers can access the repo straight away from **Spoon**.
345 | - The PDI repo is **preloaded** with centrally maintained **Modules**, to ensure **common design patterns** are followed: 346 | 347 | ![Picture: PDI Repo Access Pre-Configured](./pics/modules-shown-in-repo-browser.png) 348 | 349 | ## PDI Reusable Code: Modules 350 | 351 | - Modules reside in a **separate repository** 352 | - They are referenced from within each project's `code` repository 353 | - We use **Git Submodules** to achieve this 354 | - Modules are maintained in one place 355 | - Code changes/improvements can easily be pulled into each project 356 | - Projects reference the **module's master branch**: Any improvement work happens in the module's feature branches. (follows GitFlow) 357 | 358 | ## Default Branches 359 | 360 | | git repo | default branch | code propagation flow 361 | | --------------------------- |--------------------|---------------- 362 | | `myproject-code` | `dev` | `featureX > dev > releaseX > master` 363 | | `myproject-config-` | `master` | `n/a` 364 | | `myproject-documentation` | `master` | `n/a` 365 | | `common-conf-` | `master` | `n/a` 366 | | `common-documentation` | `master` | `n/a` 367 | 368 | ## Git Hooks 369 | 370 | - Checks are run straight from the first commit: 371 | 372 | ![Picture: Pre-Commit Validation](./pics/pre-commit-validation.png) 373 | 374 | ## Simulating Multiple Environments On One Machine 375 | 376 | ### Same Code Branch different Configs 377 | 378 | Since we externalised the config details, we can just throw any config at the code: 379 | 380 | ``` 381 | myproject-code <-- e.g.
release_X branch checked out 382 | myproject-config-integration <-- config details for integration env 383 | myproject-config-uat <-- config details for uat env 384 | ``` 385 | 386 | ### Mixing Different Code Branches On Same Machine 387 | 388 | Simple: Just create a parent folder and check out the different code branches under different local names, e.g.: 389 | 390 | ``` 391 | // 392 | -config- 393 | -config- 394 | ``` 395 | 396 | **Example**: Cloning with a specific local folder name 397 | 398 | ``` 399 | $ mkdir myproject && cd myproject 400 | $ git clone URL release 401 | ``` 402 | 403 | ## Utilities for Continuous Integration 404 | 405 | *This is still work in progress* 406 | 407 | - Package repo 408 | - Upload to EE repository 409 | - Upload artefacts to BA Server 410 | - Purge existing artefacts in EE repository 411 | 412 | 413 | ## Deployment 414 | 415 | *This is still work in progress* 416 | 417 | Simple deployment options: 418 | 419 | - Package as RPM 420 | - Version name included in folder, so on the target machine you can symlink to it: Enables easy rollback 421 | 422 | ## Deployment - Isolation of Common Artefacts (Requirement) 423 | 424 | - In production you might want to allow projects to reference different versions of the common artefacts. 425 | - In this case, common artefacts cannot be shared among projects any more. 426 | - This avoids impacting legacy projects when code changes happen and there is not enough time or budget for testing. 427 | 428 | ## Deployment - Isolation of Common Artefacts (Solution) 429 | 430 | - Introduction of a parent folder into which all required repos get cloned 431 | - If using the file repo and deploying to the EE repo or file repo, use the `import.sh` **path prefix** option to create one additional parent folder.
432 | 433 | ``` 434 | projectX <--- TOP LEVEL FOLDER ADDED 435 | ├── common-config-prod 436 | ├── projectX-code 437 | | └── pdi 438 | | └── projectX <--- TOP LEVEL FOLDER ADDED 439 | | ├── modules 440 | | └── projectX 441 | └── projectX-config-prod 442 | ``` 443 | 444 | # Other Recommendations and Comments 445 | 446 | ## PDI: Using Project and Job specific properties files 447 | 448 | General Hierarchy: **3 Levels of Scope** 449 | 450 | ``` 451 | kettle.properties <--- GLOBAL 452 | └── .properties <--- PROJECT SPECIFIC 453 | └── .properties <--- JOB SPECIFIC (MASTER JOB) 454 | └── .properties <--- JOB SPECIFIC (SUB JOB) 455 | └── .properties <--- JOB SPECIFIC (SUB JOB) 456 | └── ... 457 | ``` 458 | 459 | A **generic wrapper** job sources the **project and job specific properties files**. 460 | Job specific properties files should have the same name as the job. 461 | Reference job specific properties like so: `${PROP_CONFIG_PATH}/${Internal.Job.Name}.properties`. 462 | 463 | ## Notes on kettle.properties 464 | 465 | - Using `kettle.properties` for the global scope only works reliably if used with the `pan` and `kitchen` command line utilities. 466 | - The Pentaho Server/DI Server requires a full restart if the properties file changes. 467 | - If the **DI Server** is used for **Scheduling**, a global properties file with a different name should be used, which is sourced each time via a PDI job. 468 | 469 | ## PDI: Externalise SQL 470 | 471 | - **Easier to maintain** 472 | - Don't have to open Spoon to change it 473 | - Syntax highlighting in a text editor 474 | - Any other goodies offered by a text editor 475 | 476 | ## Notes on Scheduling 477 | 478 | - A specific **Linux User** runs the **DI Server**. When a DI job is scheduled via the DI Server's **Scheduler**, it will use this user and hence there can always only be one `kettle.properties` file. 479 | - Crontab is user specific. If you run e.g.
2 processes at the same time and use a wrapper script, everything you set within the wrapper script will not overlap with what is set in the other process, as long as you do not use `EXPORT`. So in a nutshell, `KETTLE_HOME` can be defined for each user. 480 | 481 | ## Notes on PDI Repository Folder Structure 482 | 483 | - In development, **do not nest your project folder too deep** (e.g `/home/pentaho/projectX` VS `projectX`) 484 | - Ideally, create your project folder directly on the **top level** (just under ` dev -> releaseX -> master** 538 | 539 | - **feature branches**: One for each new feature implemented 540 | - **dev branch**: consolidates code for finished features 541 | - **release branches**: One for each release 542 | - **master branch**: holds the latest production ready code 543 | 544 | - Code gets **propagated** from featureX all the way up to master 545 | - **Developers** can **only write** to `feature*` and `dev` branches 546 | - Code review is mandatory before merging a `feature*` branch into the `dev` branch 547 | - **No code changes** on the master branch! 548 | - Release code runs against **integration tests** first before being promoted 549 | 550 | ## Basic Branching Strategy (2) 551 | 552 | The Git frontend (GitLab, BitBucket, etc) has to be configured so that merging into the development branch is not possible without an accepted **pull request/merge request** (= code review).
553 | 554 | On merge: 555 | 556 | - **Full history preserved** VS 557 | - **squashed history** 558 | 559 | ## Simple Merge Strategy 560 | 561 | Since our code repo only contains code and we standardised the way code gets promoted, the merge strategy is simple: 562 | 563 | ``` 564 | # create new feature branch for jira issue cis-201 based on dev branch 565 | $ git checkout -b feature-cis-201 dev 566 | # once complete initiate pull request into dev branch 567 | --- has to be done via web frontend --- 568 | # merge into dev branch 569 | --- has to be done via web frontend --- 570 | $ git checkout dev (shown for illustration purposes only) 571 | $ git merge --no-ff feature-cis-201 (shown for illustration purposes only) 572 | # promote to release: create new release branch 573 | $ git checkout -b release-1.2 dev 574 | # promote to master 575 | $ git checkout master 576 | $ git merge --no-ff release-1.2 577 | $ git tag -a 1.2 578 | # if we made changes in the release branch, merge them back into dev 579 | $ git checkout dev 580 | $ git merge --no-ff release-1.2 581 | ``` 582 | 583 | Example 2: 584 | 585 | ```bash 586 | cd -code 587 | git checkout master 588 | # check if there are any changes 589 | git status 590 | git merge --squash release 591 | git commit 592 | git push 593 | # check for tags created so far 594 | git tag 595 | git tag -a x.x.x-prod -m "my comments" 596 | git push --tags 597 | git checkout dev 598 | 599 | cd ../-config-prod 600 | # check if there are any changes 601 | git status 602 | # check for tags created so far 603 | git tag 604 | git tag -a x.x.x -m "my comments" 605 | git push --tags 606 | ``` 607 | 608 | ### How to easily spot differences in config files between environments 609 | 610 | Finding config properties files that are different: 611 | 612 | ``` 613 | $ diff --brief dm-config-uat/properties/ dm-config-prod/properties/ 614 | ``` 615 | 616 | Based on the output of the previous command, take a look at each of the mentioned files to see
what the difference is: 617 | 618 | ```bash 619 | $ diff dm-config-uat/properties/dm.properties dm-config-prod/properties/dm.properties 620 | ``` 621 | 622 | Or alternatively you might prefer the side by side comparison: 623 | 624 | ```bash 625 | $ diff --side-by-side dm-config-uat/properties/dm.properties dm-config-prod/properties/dm.properties 626 | ``` 627 | 628 | > **Note**: Depending on the line length, the output might be truncated. In any case, it will give you a first indication of where the differences are. 629 | 630 | ## Enforcements 631 | 632 | - restrict permissions on certain branches (e.g. `master`) 633 | - introduce merge requests 634 | - make `dev` the default branch 635 | 636 | 637 | ## Example: Restricting Merge Permissions on GitLab 638 | 639 | GitLab is used here to illustrate the process; the feature is available in other Git servers as well. 640 | 641 | 1. Go to your project on the GitLab website, choose **Settings**, then click **Repository**. 642 | 2. Go to the **Protected Branches** section and fill out the form to **allow merge into branch** only for a certain group. GitLab has `Master` and `Developer` roles. 643 | 644 | branch | allowed to merge | allowed to push | should be applied? 645 | -----------|------------------|-----------------|----------- 646 | `master` | `Masters` | `Masters` | mandatory 647 | `release*` | `Masters` | `Masters` | recommended 648 | 649 | 650 | ## Example: Merge Request on GitLab 651 | 652 | [Info](https://docs.gitlab.com/ee/user/project/merge_requests/merge_request_approvals.html) 653 | 654 | Available in the **Enterprise Edition** only. 655 | 656 | ## Provide essential Git Info 657 | 658 | Each commit should be linked to your name.
Provide details like so: 659 | 660 | ``` 661 | git config --global user.name "Diethard Steiner" 662 | git config --global user.email "diethard.steiner@bissolconsulting.com" 663 | ``` 664 | 665 | ## Git Tagging For Code Branches 666 | 667 | Our git branches: 668 | 669 | - `feature*` 670 | - `dev` 671 | - `release*` 672 | - `master` 673 | 674 | We use the following tagging notation: 675 | 676 | ``` 677 | x.x.x- 678 | ``` 679 | 680 | The first 3 digits denote the following: 681 | 682 | - **Major**: Major changes 683 | - **Minor**: small incremental changes 684 | - **Patch**/feature 685 | 686 | Increment the **patch number** when you have merged one or more new features into the `dev` branch. After you have accumulated enough features in the `dev` branch that a new release is warranted, merge the code into the `release*` branch (or directly into the `master` branch if you do not use a `release*` branch) and tag it by increasing the **minor number** and adding the `-prod` suffix. 687 | 688 | On merge we recommend that you use the `squash` strategy, and this is why: 689 | 690 | ![](pics/git-merge-squash.png) 691 | 692 | If the `Feature1` branch gets merged, a tag `1.1.1-dev` gets created, and `Feature2` gets merged later: If we check out `1.1.1-dev` we will also get any commits coming from `Feature2` at this stage since the whole history got merged. So if we were to put this code into production, we would have partially finished `Feature2` features in it, which is - of course - a problem. Do we need the history of features? Not really, not each commit. The solution is to squash all the commits for all merges, which basically compresses the history into just one commit (the overall change is preserved as one single **changelog**-style commit instead of many individual ones). The command to use is: 693 | 694 | ``` 695 | git merge --squash ... 696 | ``` 697 | 698 | Over time feature branches get deleted.
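The effect can be demonstrated in a throw-away repository (hypothetical repo, file and branch names; the `-q` flags only mute output):

```shell
# Demonstrate that `git merge --squash` collapses a feature branch
# into a single pending change on the target branch.
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "demo" && git config user.email "demo@example.com"
echo base > file.txt && git add file.txt && git commit -qm "initial commit"
git checkout -qb dev
git checkout -qb feature-demo
echo one >> file.txt && git commit -qam "feature commit 1"
echo two >> file.txt && git commit -qam "feature commit 2"
git checkout -q dev
# --squash stages the combined change but does NOT create the commit itself
git merge --squash feature-demo
git commit -qm "feature-demo (squashed)"
# dev now holds just two commits: the initial one plus the squashed feature
git log --oneline
```

Both feature commits arrive on `dev` as one commit, which is exactly the single-changelog behaviour described above.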
699 | 700 | We merge into the release or master branch only when we have enough features ready. In this case, we just want a squashed merge. We tag the merge like so: `1.1.0-prod`. 701 | 702 | **Example**: Merging and tagging a code branch (`dev` to `release`): 703 | 704 | ```bash 705 | # check if release branch already exists 706 | $ git branch 707 | # if release branch does not exist yet 708 | # create new branch based on dev branch 709 | $ git checkout -b release dev 710 | # OR if release branch already exists 711 | $ git checkout release && git merge dev 712 | # tag the release 713 | $ git tag -a "1.1.0-prod" -m "CDC added" 714 | $ git push --tags 715 | # and now back to the dev branch since you must not 716 | # commit anything on the release branch 717 | $ git checkout dev 718 | ``` 719 | 720 | ## Git Tagging For Config Branches 721 | 722 | Since there is no branching and no real releases on the config repos, the environment suffix is not required. If you change the config, just increase the patch number for the tag. 723 | 724 | Example: `1.1.0` 725 | 726 | **Example**: Tagging a config branch 727 | 728 | ```bash 729 | $ git tag -a "1.1.0" -m "DB config details changed" 730 | $ git push --tags 731 | ``` 732 | 733 | ## Example: Setting the default branch on GitLab 734 | 735 | Applies to `*-code` repos only: 736 | 737 | - When you clone a project, usually by default the `master` branch gets checked out. 738 | - If developers don't pay attention, they might change code and commit it back on top of the production-ready code! 739 | 740 | To change the default branch in GitLab: 741 | 742 | 1. Settings > General > General project settings > Expand 743 | 2. Default Branch > Change your project default branch 744 | 3.
Save changes 745 | 746 | ## Central Quality Control by Technical Architect 747 | 748 | - Automatic controls are great but ultimately they cannot catch everything 749 | - Recommendation: Every project, before going into production, must be checked and approved by a Senior Developer/Technical Architect. This person has to be dedicated to this job most of the time. 750 | - It is essential that this person has a view across all the projects: Usually people only focus on project-specific work and don't worry too much about common standards. Having someone central in place to ensure that common standards are followed is essential! 751 | 752 | ## How to easily spot differences in config files between environments 753 | 754 | One of the important things to do at times is to check that properties files across environments are in sync. By "in sync" we mean that the same properties are listed, but values might differ (as you would expect). To make this task as easy as possible, properties should be listed in the same order across environments.
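If the files have drifted out of order, one workaround is to diff sorted copies so that ordering differences disappear and only real differences remain. This is a hedged sketch using bash process substitution; the file names and values below are made up for the demo:

```shell
# Hypothetical demo: same keys in a different order, one genuinely different value
printf 'db.host=uat-db\ndb.port=5432\n'  > uat.properties
printf 'db.port=5432\ndb.host=prod-db\n' > prod.properties

# Sorting both sides first means only real key/value differences show up;
# `|| true` keeps the non-zero exit code of diff from aborting a script
diff <(sort uat.properties) <(sort prod.properties) || true
```

Here only the `db.host` line is reported, even though the raw files differ on every line.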
755 | 756 | To easily spot the differences between properties files in different environment git repos, you can apply following strategy: 757 | 758 | Finding config properties files that are different: 759 | 760 | ``` 761 | $ diff --brief mpr-config-uat/properties/ mpr-config-prod/properties/ 762 | ``` 763 | 764 | Based on the output of the previous command, take a look at each of the mentioned files to see what the difference is: 765 | 766 | ```bash 767 | $ diff mpr-config-uat/properties/mpr.properties mpr-config-prod/properties/mpr.properties 768 | ``` 769 | 770 | Or alternatively you might prefer the side by side comparison: 771 | 772 | ```bash 773 | $ diff --side-by-side mpr-config-uat/properties/mpr.properties mpr-config-prod/properties/mpr.properties 774 | ``` 775 | 776 | 777 | -------------------------------------------------------------------------------- /presentations/pics/IMG_0050.PNG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/diethardsteiner/pentaho-standardised-git-repo-setup/40d01b2f5968d02f19fdddf0fccbfa1c88cdbd1f/presentations/pics/IMG_0050.PNG -------------------------------------------------------------------------------- /presentations/pics/code-config-separation.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/diethardsteiner/pentaho-standardised-git-repo-setup/40d01b2f5968d02f19fdddf0fccbfa1c88cdbd1f/presentations/pics/code-config-separation.png -------------------------------------------------------------------------------- /presentations/pics/git-merge-sqash.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/diethardsteiner/pentaho-standardised-git-repo-setup/40d01b2f5968d02f19fdddf0fccbfa1c88cdbd1f/presentations/pics/git-merge-sqash.png -------------------------------------------------------------------------------- 
/presentations/pics/git-merge-squash.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/diethardsteiner/pentaho-standardised-git-repo-setup/40d01b2f5968d02f19fdddf0fccbfa1c88cdbd1f/presentations/pics/git-merge-squash.png -------------------------------------------------------------------------------- /presentations/pics/logo_v4_websafe_small.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/diethardsteiner/pentaho-standardised-git-repo-setup/40d01b2f5968d02f19fdddf0fccbfa1c88cdbd1f/presentations/pics/logo_v4_websafe_small.gif -------------------------------------------------------------------------------- /presentations/pics/logo_v4_white_small.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/diethardsteiner/pentaho-standardised-git-repo-setup/40d01b2f5968d02f19fdddf0fccbfa1c88cdbd1f/presentations/pics/logo_v4_white_small.png -------------------------------------------------------------------------------- /presentations/pics/modules-shown-in-repo-browser.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/diethardsteiner/pentaho-standardised-git-repo-setup/40d01b2f5968d02f19fdddf0fccbfa1c88cdbd1f/presentations/pics/modules-shown-in-repo-browser.png -------------------------------------------------------------------------------- /presentations/pics/pre-commit-filename-validation.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/diethardsteiner/pentaho-standardised-git-repo-setup/40d01b2f5968d02f19fdddf0fccbfa1c88cdbd1f/presentations/pics/pre-commit-filename-validation.png -------------------------------------------------------------------------------- /presentations/pics/pre-commit-validation.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/diethardsteiner/pentaho-standardised-git-repo-setup/40d01b2f5968d02f19fdddf0fccbfa1c88cdbd1f/presentations/pics/pre-commit-validation.png -------------------------------------------------------------------------------- /presentations/pics/spoon-preconfigured.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/diethardsteiner/pentaho-standardised-git-repo-setup/40d01b2f5968d02f19fdddf0fccbfa1c88cdbd1f/presentations/pics/spoon-preconfigured.png -------------------------------------------------------------------------------- /presentations/pics/structure-with-common-artefacts.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/diethardsteiner/pentaho-standardised-git-repo-setup/40d01b2f5968d02f19fdddf0fccbfa1c88cdbd1f/presentations/pics/structure-with-common-artefacts.png -------------------------------------------------------------------------------- /presentations/pics/structure-without-common-artefacts.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/diethardsteiner/pentaho-standardised-git-repo-setup/40d01b2f5968d02f19fdddf0fccbfa1c88cdbd1f/presentations/pics/structure-without-common-artefacts.png --------------------------------------------------------------------------------