├── README.md ├── base_image.md ├── cf-pg-cluster.json └── cf-pg-configs ├── README.md ├── ro ├── i3en.2xlarge.conf └── i3en.xlarge.conf └── rw ├── r5a.2xlarge.conf ├── r5a.xlarge.conf └── t3a.medium.conf /README.md: -------------------------------------------------------------------------------- 1 | # CloudFormation Script for creating PostgreSQL Clusters 2 | 3 | This project aims at turnkey deployment of PostgreSQL clusters on the AWS cloud. It uses [AWS CloudFormation](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/stacks.html) to create a "Stack" that defines the Postgres primary, an optional replica, and related resources. Things like instance type, availability zone, DB password and key pair are parameterized so they can be set at run time. You can parameterize much more than I have, depending on your needs. This is intended to be a starting point. 4 | 5 | In its simplest deployment you can select `not_set` for the EC2 instance type of the replica, and a single database instance will be created. 6 | 7 | ## Motivation 8 | 9 | The motivation behind this project is to create a containerless Postgres cluster with the click of a button on varying instance types. This implies that tuning needs to take place for each instance type to get the right values for the various parameters that affect performance in **postgresql.conf**. 
You can use a tuning guide like https://postgresqlco.nf/ or a simpler one like https://pgtune.leopard.in.ua 10 | 11 | #### More on motivation: 12 | 13 | [HERE](https://medium.com/@mkremer_75412/why-postgres-rds-didnt-work-for-us-and-why-it-won-t-work-for-you-if-you-re-implementing-a-big-6c4fff5a8644) and [HERE](https://medium.com/@mkremer_75412/how-to-replicate-postgres-rds-functionality-with-a-few-scripts-and-a-cloudformation-template-748c391fce51) 14 | 15 | ## EC2 Instances and Tuning postgresql.conf 16 | 17 | This project focuses on two instance types, the r5a and the i3en. This does not preclude you from using other instance types and sizes. The r5a is a general purpose memory optimized instance, good for ingestion and batch processing workloads. The i3en, which serves as our read replica (via PostgreSQL streaming replication), is good at fast I/O and performs OLAP workloads very quickly on large datasets. [Read more about how instance-specific postgresql.conf parameters are set in CloudFormation](cf-pg-configs/README.md). 18 | 19 | > **Note:** When launching instances, make sure to pair the instance sizes accordingly: a `large` primary means a `large` replica, `xlarge` with `xlarge`, `2xlarge` with `2xlarge`, etc. 20 | 21 | #### The Memory Optimized r5a instance types (Read/Write Primary) 22 | https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instance-store-volumes.html#instance-store-vol-mo 23 | 24 | #### The Storage Optimized i3en instance types (Read Only Replica) 25 | https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instance-store-volumes.html#instance-store-vol-gp 26 | 27 | 28 | ## Instrumental debug data for stack development 29 | 30 | After SSHing into the EC2 instance, use this command to check (grep) for errors during stack creation. 
31 | 32 | https://repost.aws/knowledge-center/cloudformation-failed-signal 33 | 34 | ```sh 35 | grep -ni 'error\|failure' $(sudo find /var/log -name cfn\* -or -name cloud-init\*) 36 | ``` 37 | 38 | ### Huge Pages and Mappings 39 | 40 | Both the primary and replica Postgres servers rely on the Huge Pages performance optimization. The config files in S3 contain `huge_pages = on`. 41 | There may be a way to compute nPages dynamically by querying the instance type's total available memory. For now it's statically coded in the Mappings section of the CloudFormation template like this: 42 | 43 | ```json 44 | "Mappings": { 45 | "HugePages": { 46 | "r5a.2xlarge": { 47 | "nPages": "16000" 48 | }, 49 | "r5a.xlarge": { 50 | "nPages": "8000" 51 | }, 52 | "t3a.medium": { 53 | "nPages": "1500" 54 | }, 55 | "i3en.2xlarge": { 56 | "nPages": "16000" 57 | }, 58 | "i3en.xlarge": { 59 | "nPages": "8000" 60 | } 61 | } 62 | } 63 | ``` 64 | 65 | ### ZFS 66 | 67 | The instances rely on ZFS, which can provide large I/O performance improvements over conventional file systems by trading computation for compression, so less data has to be fetched from disk. The [base_image.md](base_image.md) contains installation steps for ZFS. This should be part of your base AMI. 68 | 69 | https://openzfs.github.io/openzfs-docs/Getting%20Started/RHEL-based%20distro/index.html 70 | 71 | ### VPC security groups 72 | 73 | The template demonstrates both referencing existing VPC security group IDs and creating security groups on the fly as resources, then using them when associating security groups with the EC2 resources. Replace my security group ID (`sg-0055ac66`) with yours. 74 | 75 | >Beware: There are weird limitations around creating ingress and egress rules as part of the security group in one JSON model. 
This is why the SecurityGroupIngress and SecurityGroup resources are created separately in the template. 76 | 77 | 78 | ### WAL-G and replication 79 | 80 | WAL archiving is performed by WAL-G via PostgreSQL’s `archive_command`, and streaming replication uses PostgreSQL’s built-in replication protocol. These functions are distinct, but closely related. WAL-G handles backups and point-in-time recovery (PITR via WAL shipping), while streaming replication supports the hot standby server. However, if the standby falls behind the primary and misses a WAL segment, it can also retrieve it from S3 using WAL-G. WAL-G should be installed on the base AMI. 81 | 82 | ### CloudWatch 83 | 84 | CloudWatch alarms are handy for monitoring ZFS pool usage, i.e. disk space consumption. The template sets alarms at an 80% threshold, but you may change or parameterize this as you see fit. 85 | 86 | 87 | ### DNS 88 | 89 | I create Route 53 DNS records for an existing local hosted zone. This is handy for giving the servers in the cluster names that upstream clients can use to address them. You end up with DNS names like `[stack-name]-rw.myapp.local`. Replace `myapp` with your namespace. -------------------------------------------------------------------------------- /base_image.md: -------------------------------------------------------------------------------- 1 | 2 | ## Recipe for creating a PostgreSQL Base AMI using PostgreSQL 14 and ZFS 3 | 4 | Start with the Amazon Linux 2 AMI (HVM) - Kernel 5.10: 5 | 6 | **ami-0895022f3dac85884 (64-bit (x86))** 7 | 8 | 9 | ### Install ZFS 10 | 11 | ZFS is a huge performance booster for Postgres I/O. By trading a bit of CPU time for compression, we are able to read and write data on disk faster. 
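As a rough illustration of the compression trade-off described above, the sketch below prints (without executing) ZFS commands that enable lz4 compression and then report the achieved ratio. The pool name `pgtank` and the property values are taken from cf-pg-cluster.json in this repo; the helper name `zfs_tuning_cmds` is made up for this example, and only a subset of the template's properties is shown.

```sh
#!/bin/sh
# Dry-run sketch: print (do not execute) ZFS tuning commands for a pool.
# Assumption: the pool is called "pgtank", as in cf-pg-cluster.json.
# zfs_tuning_cmds is a hypothetical helper, not part of this repo.
zfs_tuning_cmds() {
  pool="${1:-pgtank}"
  # Property values mirror the ones set by the template's Install step.
  for prop in compression=lz4 atime=off xattr=sa recordsize=128K; do
    echo "zfs set $prop $pool"
  done
  # compressratio is a read-only property reporting the achieved ratio.
  echo "zfs get compression,compressratio $pool"
}

zfs_tuning_cmds pgtank
```

Once the pool exists, running the last printed command on the instance shows the `compressratio` property, which gives a rough idea of how much I/O the compression is saving for your data.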
12 | 13 | https://openzfs.github.io/openzfs-docs/Getting%20Started/RHEL-based%20distro/index.html 14 | 15 | 16 | 17 | #### Install ZFS and its dependencies 18 | ``` 19 | sudo amazon-linux-extras install epel 20 | sudo yum install -y kernel-devel 21 | sudo yum install -y epel-release 22 | sudo yum install https://zfsonlinux.org/epel/zfs-release-2-2.el7.noarch.rpm 23 | sudo sed -i 's/\$releasever/7/g' /etc/yum.repos.d/zfs.repo 24 | sudo reboot 25 | sudo yum install -y zfs 26 | sudo /sbin/modprobe zfs 27 | ``` 28 | Note the funny business above with sed replacing $releasever with 7: this is needed to get a compatible (EL7) version for Amazon Linux 2. 29 | 30 | 31 | ### Install PostgreSQL using amazon-linux-extras 32 | 33 | ``` 34 | sudo amazon-linux-extras install postgresql14 35 | sudo yum install -y postgresql-server postgresql-contrib postgresql-server-devel.x86_64 36 | sudo yum update 37 | ``` 38 | 39 | ### Install a few handy Perl utils we'll need 40 | 41 | ``` 42 | sudo yum install -y perl-Switch \ 43 |   perl-DateTime \ 44 |   perl-Sys-Syslog \ 45 |   perl-LWP-Protocol-https \ 46 |   perl-Digest-SHA.x86_64 47 | ``` 48 | 49 | ### Get and unzip CloudWatch Monitoring scripts 50 | ``` 51 | curl https://aws-cloudwatch.s3.amazonaws.com/downloads/CloudWatchMonitoringScripts-1.2.2.zip -O 52 | unzip CloudWatchMonitoringScripts-1.2.2.zip && \ 53 |   rm CloudWatchMonitoringScripts-1.2.2.zip && \ 54 |   cd aws-scripts-mon 55 | ``` 56 | 57 | 58 | ### Install WAL-G (Amazon Linux 2 compatible version) 59 | ``` 60 | wget https://github.com/wal-g/wal-g/releases/download/v2.0.1/wal-g-pg-ubuntu-18.04-amd64.tar.gz 61 | tar -xvf wal-g-pg-ubuntu-18.04-amd64.tar.gz 62 | mv wal-g-pg-ubuntu-18.04-amd64 wal-g 63 | sudo chown root:root wal-g 64 | sudo mv wal-g /usr/local/bin/ 65 | ``` 66 | 67 | ### Install dependencies for WAL-G. 
68 | ``` 69 | sudo yum install -y lzo 70 | ``` 71 | ### Install nvme-cli for querying NVMe drives 72 | ``` 73 | sudo yum install nvme-cli 74 | ``` 75 | 76 | Once you've done the above, create your own AMI and use its ami-#### ID in your CloudFormation template. -------------------------------------------------------------------------------- /cf-pg-cluster.json: -------------------------------------------------------------------------------- 1 | { 2 | "AWSTemplateFormatVersion": "2010-09-09", 3 | "Description": "", 4 | "Parameters": { 5 | "BaseImage": { 6 | "Type": "String", 7 | "Default": "ami-005973cfb88e3d3af", 8 | "AllowedValues": [ 9 | "ami-005973cfb88e3d3af" 10 | ] 11 | }, 12 | "KeyName": { 13 | "Description": "", 14 | "Type": "AWS::EC2::KeyPair::KeyName", 15 | "ConstraintDescription": "must be the name of an existing EC2 KeyPair." 16 | }, 17 | "AvailabilityZone": { 18 | "Description": "", 19 | "Type": "AWS::EC2::AvailabilityZone::Name" 20 | }, 21 | "InstanceTypeRW": { 22 | "Description": "PostgreSQL primary (read/write) EC2 instance type", 23 | "Type": "String", 24 | "AllowedValues": [ 25 | "r5a.2xlarge", 26 | "r5a.xlarge", 27 | "t3a.medium" 28 | ], 29 | "ConstraintDescription": "must be a valid EC2 instance type." 30 | }, 31 | "InstanceTypeRO": { 32 | "Description": "PostgreSQL read-only replica EC2 instance type (not_set skips the replica)", 33 | "Type": "String", 34 | "Default": "not_set", 35 | "AllowedValues": [ 36 | "not_set", 37 | "i3en.2xlarge", 38 | "i3en.xlarge" 39 | ], 40 | "ConstraintDescription": "must be a valid EC2 instance type." 41 | }, 42 | "SSHLocation": { 43 | "Description": "The IP address range that can be used to SSH to the EC2 instances", 44 | "Type": "String", 45 | "MinLength": "9", 46 | "MaxLength": "18", 47 | "Default": "0.0.0.0/0", 48 | "AllowedPattern": "(\\d{1,3})\\.(\\d{1,3})\\.(\\d{1,3})\\.(\\d{1,3})/(\\d{1,2})", 49 | "ConstraintDescription": "must be a valid IP CIDR range of the form x.x.x.x/x." 
50 | }, 51 | "DBPassword": { 52 | "NoEcho": "true", 53 | "Description": "DB Password for postgres and repuser users", 54 | "Type": "String", 55 | "MinLength": "1", 56 | "MaxLength": "41", 57 | "AllowedPattern": "[a-zA-Z0-9]*", 58 | "ConstraintDescription": "must contain only alphanumeric characters." 59 | } 60 | }, 61 | "Mappings": { 62 | "HugePages": { 63 | "r5a.2xlarge": { 64 | "nPages": "16000" 65 | }, 66 | "r5a.xlarge": { 67 | "nPages": "8000" 68 | }, 69 | "t3a.medium": { 70 | "nPages": "1500" 71 | }, 72 | "i3en.2xlarge": { 73 | "nPages": "16000" 74 | }, 75 | "i3en.xlarge": { 76 | "nPages": "8000" 77 | } 78 | } 79 | }, 80 | "Conditions": { 81 | "ConditionsDeployRO": { 82 | "Fn::Not": [ 83 | { 84 | "Fn::Equals": [ 85 | { 86 | "Ref": "InstanceTypeRO" 87 | }, 88 | "not_set" 89 | ] 90 | } 91 | ] 92 | } 93 | }, 94 | "Resources": { 95 | "InstanceProfile": { 96 | "Type": "AWS::IAM::InstanceProfile", 97 | "Properties": { 98 | "Path": "/", 99 | "Roles": [ 100 | "EC2_S3_Role" 101 | ] 102 | } 103 | }, 104 | "SG": { 105 | "Type": "AWS::EC2::SecurityGroup", 106 | "Properties": { 107 | "GroupDescription": "Enable SSH and replication", 108 | "VpcId": "vpc-5de9c836" 109 | } 110 | }, 111 | "SGinSelfPG": { 112 | "Type": "AWS::EC2::SecurityGroupIngress", 113 | "DependsOn": "SG", 114 | "Properties": { 115 | "GroupId": { "Ref": "SG" }, 116 | "IpProtocol": "tcp", 117 | "FromPort": "5432", 118 | "ToPort": "5432", 119 | "SourceSecurityGroupId": { "Ref": "SG" } 120 | } 121 | }, 122 | "SGinPG": { 123 | "Type": "AWS::EC2::SecurityGroupIngress", 124 | "DependsOn": "SG", 125 | "Properties": { 126 | "GroupId": { "Ref": "SG" }, 127 | "IpProtocol": "tcp", 128 | "FromPort": "5432", 129 | "ToPort": "5432", 130 | "CidrIp": { "Ref": "SSHLocation" } 131 | } 132 | }, 133 | "SGinSSH": { 134 | "Type": "AWS::EC2::SecurityGroupIngress", 135 | "DependsOn": "SG", 136 | "Properties": { 137 | "GroupId": { "Ref": "SG" }, 138 | "IpProtocol": "tcp", 139 | "FromPort": "22", 140 | "ToPort": "22", 141 | 
"CidrIp": { "Ref": "SSHLocation" } 142 | } 143 | }, 144 | "PsqlServer": { 145 | "Type": "AWS::EC2::Instance", 146 | "Metadata": { 147 | "AWS::CloudFormation::Authentication": { 148 | "S3AccessCreds": { 149 | "type": "S3", 150 | "roleName": "EC2_S3_Role", 151 | "buckets": [ 152 | "cf-pg-configs", 153 | "cf-pg-backups" 154 | ] 155 | } 156 | }, 157 | "AWS::CloudFormation::Init": { 158 | "configSets": { 159 | "InstallAndRun": [ 160 | "Install" 161 | ], 162 | "Configs": [ 163 | "configFiles", 164 | "setPassword", 165 | "setupBackup", 166 | "restartPG", 167 | "push1stBackup" 168 | ] 169 | }, 170 | "Install": { 171 | "packages": { 172 | "yum": { 173 | "htop": [] 174 | } 175 | }, 176 | "commands": { 177 | "00_command": { 178 | "command": "/sbin/modprobe zfs" 179 | }, 180 | "01_command": { 181 | "command": "zpool create -o ashift=9 pgtank nvme1n1 nvme2n1" 182 | }, 183 | "02_command": { 184 | "command": "zfs set compression=lz4 pgtank && zfs set atime=off pgtank && zfs set relatime=on pgtank && zfs set xattr=sa pgtank && zfs set recordsize=128K pgtank && zfs set primarycache=all pgtank" 185 | }, 186 | "03_command": { 187 | "command": "zfs create pgtank/db" 188 | }, 189 | "04_command": { 190 | "command": "mkdir /pgtank/db/postgres" 191 | }, 192 | "05_command": { 193 | "command": "chown postgres:postgres /pgtank/db/postgres" 194 | }, 195 | "06_command": { 196 | "command": "mkdir /backup_scripts && chown postgres:postgres /backup_scripts " 197 | }, 198 | "07_command": { 199 | "command": "runuser -l postgres -c 'initdb -D /pgtank/db/postgres'" 200 | }, 201 | "08_command": { 202 | "command": "echo vm.nr_hugepages=$PAGE_N | tee -a /etc/sysctl.conf", 203 | "env": { 204 | "PAGE_N": { 205 | "Fn::FindInMap": [ 206 | "HugePages", 207 | { 208 | "Ref": "InstanceTypeRW" 209 | }, 210 | "nPages" 211 | ] 212 | } 213 | } 214 | }, 215 | "09_command":{ 216 | "command": "sysctl -p /etc/sysctl.conf" 217 | }, 218 | "10_command": { 219 | "command": { 220 | "Fn::Join": [ 221 | "", 222 | [ 223 | 
"sudo hostnamectl set-hostname ", 224 | { 225 | "Ref": "AWS::StackName" 226 | }, 227 | "-rw.myapp.local" 228 | ] 229 | ] 230 | } 231 | } 232 | }, 233 | "services": { 234 | "sysvinit": { 235 | "cfn-hup": { 236 | "enabled": "true", 237 | "ensureRunning": "true", 238 | "files": [ 239 | "/etc/cfn/cfn-hup.conf", 240 | "/etc/cfn/hooks.d/cfn-auto-reloader.conf" 241 | ] 242 | } 243 | } 244 | }, 245 | "files": { 246 | "/etc/cfn/cfn-hup.conf": { 247 | "content": { 248 | "Fn::Join": [ 249 | "", 250 | [ 251 | "[main]\n", 252 | "stack=", 253 | { 254 | "Ref": "AWS::StackId" 255 | }, 256 | "\n", 257 | "region=", 258 | { 259 | "Ref": "AWS::Region" 260 | }, 261 | "\n", 262 | "verbose=true\n", 263 | "interval=5\n" 264 | ] 265 | ] 266 | }, 267 | "mode": "000400", 268 | "owner": "root", 269 | "group": "root" 270 | }, 271 | "/etc/cfn/hooks.d/cfn-auto-reloader.conf": { 272 | "content": { 273 | "Fn::Join": [ 274 | "", 275 | [ 276 | "[cfn-auto-reloader-hook]\n", 277 | "triggers=post.update\n", 278 | "path=Resources.PsqlServer.Metadata.AWS::CloudFormation::Init\n", 279 | "action=/opt/aws/bin/cfn-init -v ", 280 | " --stack ", 281 | { 282 | "Ref": "AWS::StackName" 283 | }, 284 | " --resource PsqlServer ", 285 | " --configsets Configs ", 286 | " --region ", 287 | { 288 | "Ref": "AWS::Region" 289 | }, 290 | "\n", 291 | "runas=root\n" 292 | ] 293 | ] 294 | }, 295 | "mode": "000400", 296 | "owner": "root", 297 | "group": "root" 298 | } 299 | } 300 | }, 301 | "configFiles": { 302 | "services": { 303 | "systemd": { 304 | "postgresql": { 305 | "enabled": "true", 306 | "ensureRunning": "true" 307 | } 308 | } 309 | }, 310 | "files": { 311 | "/pgtank/db/postgres/pg_hba.conf": { 312 | "mode": "000755", 313 | "owner": "postgres", 314 | "group": "postgres", 315 | "content": { 316 | "Fn::Join": [ 317 | "", 318 | [ 319 | "local all all trust\n", 320 | "host all all 127.0.0.1/32 trust\n", 321 | "host all all 0.0.0.0/0 scram-sha-256\n", 322 | "host all all ::1/128 trust\n", 323 | "local replication all 
trust\n", 324 | "host replication all 127.0.0.1/32 trust\n", 325 | "host replication all ::1/128 trust\n" 326 | ] 327 | ] 328 | } 329 | }, 330 | "/pgtank/db/postgres/postgresql.conf": { 331 | "mode": "000755", 332 | "owner": "postgres", 333 | "group": "postgres", 334 | "source": { 335 | "Fn::Join": [ 336 | "", 337 | [ 338 | "https://cf-pg-configs.s3.us-west-2.amazonaws.com/rw/", 339 | { 340 | "Ref": "InstanceTypeRW" 341 | }, 342 | ".conf" 343 | ] 344 | ] 345 | }, 346 | "authentication": "S3AccessCreds" 347 | }, 348 | "/usr/lib/systemd/system/postgresql.service": { 349 | "mode": "000755", 350 | "owner": "postgres", 351 | "group": "postgres", 352 | "content": { 353 | "Fn::Join": [ 354 | "", 355 | [ 356 | "[Unit]\n", 357 | "Description=PostgreSQL database server\n", 358 | "After=network.target\n", 359 | "\n", 360 | "[Service]\n", 361 | "Type=notify\n", 362 | "\n", 363 | "User=postgres\n", 364 | "Group=postgres\n", 365 | "\n", 366 | "# Where to send early-startup messages from the server (before the logging\n", 367 | "# options of postgresql.conf take effect)\n", 368 | "# This is normally controlled by the global default set by systemd\n", 369 | "# StandardOutput=syslog\n", 370 | "\n", 371 | "# Disable OOM kill on the postmaster\n", 372 | "OOMScoreAdjust=-1000\n", 373 | "# ... 
but allow it still to be effective for child processes\n", 374 | "# (note that these settings are ignored by Postgres releases before 9.5)\n", 375 | "Environment=PG_OOM_ADJUST_FILE=/proc/self/oom_score_adj\n", 376 | "Environment=PG_OOM_ADJUST_VALUE=0\n", 377 | "\n", 378 | "Environment=PGDATA=/pgtank/db/postgres\n", 379 | "\n", 380 | "ExecStartPre=/usr/libexec/postgresql-check-db-dir %N\n", 381 | "# Even though the $PGDATA variable is exported (postmaster would accept that)\n", 382 | "# use the -D option here so PGDATA content is printed by /bin/ps and by\n", 383 | "# 'systemctl status'.\n", 384 | "ExecStart=/usr/bin/postmaster -D ${PGDATA}\n", 385 | "ExecReload=/bin/kill -HUP $MAINPID\n", 386 | "KillMode=mixed\n", 387 | "KillSignal=SIGINT\n", 388 | "\n", 389 | "# No artificial start/stop timeout.\n", 390 | "TimeoutSec=0\n", 391 | "\n", 392 | "[Install]\n", 393 | "WantedBy=multi-user.target\n" 394 | ] 395 | ] 396 | } 397 | }, 398 | "/backup_scripts/push-backup.sh": { 399 | "mode": "000755", 400 | "content": { 401 | "Fn::Join": [ 402 | "", 403 | [ 404 | "#!/bin/sh\n", 405 | "export PGHOST=/var/run/postgresql\n", 406 | "export PGUSER=postgres\n", 407 | "export WALG_S3_PREFIX=s3://cf-pg-backups/", 408 | { 409 | "Ref": "AWS::StackName" 410 | }, 411 | "\nexport WALG_COMPRESSION_METHOD=brotli\n", 412 | "/usr/local/bin/wal-g backup-push /pgtank/db/postgres\n", 413 | "/usr/local/bin/wal-g delete --confirm retain FULL 2\n" 414 | ] 415 | ] 416 | } 417 | }, 418 | "/backup_scripts/poll-pg_hba_update.sh": { 419 | "mode": "000755", 420 | "content": { 421 | "Fn::Join": [ 422 | "", 423 | [ 424 | "#!/bin/sh -xe\n", 425 | "STACK_NAME='", 426 | { 427 | "Ref": "AWS::StackName" 428 | }, 429 | "'\n", 430 | "LAST_IP_FILE='/backup_scripts/last_replica_ip.txt'\n", 431 | "REPLICA_IP=$(aws cloudformation describe-stacks --stack-name $STACK_NAME --region us-west-2 ", 432 | "--query \"Stacks[0].Outputs[?OutputKey=='PsqlServerROIP'].OutputValue\" --output text)\n", 433 | "if [ \"$REPLICA_IP\" = 
\"None\" ]; then\n", 434 | " echo \"Replica IP not available yet.\"\n", 435 | " exit 1\n", 436 | "fi\n", 437 | "if [ -f \"$LAST_IP_FILE\" ]; then\n", 438 | " LAST_IP=$(cat \"$LAST_IP_FILE\")\n", 439 | "else\n", 440 | " LAST_IP=\"\"\n", 441 | "fi\n", 442 | "if [ \"$REPLICA_IP\" != \"$LAST_IP\" ]; then\n", 443 | " echo \"Updating pg_hba.conf with new replica IP: $REPLICA_IP\"\n", 444 | " cp /pgtank/db/postgres/pg_hba.conf /pgtank/db/postgres/pg_hba.conf.bak\n", 445 | " echo \"host replication repuser $REPLICA_IP/32 scram-sha-256\" >> /pgtank/db/postgres/pg_hba.conf\n", 446 | " psql -c \"SELECT pg_reload_conf();\"\n", 447 | " echo \"$REPLICA_IP\" > \"$LAST_IP_FILE\"\n", 448 | "else\n", 449 | " echo \"Replica IP has not changed.\"\n", 450 | "fi\n" 451 | ] 452 | ] 453 | } 454 | } 455 | } 456 | }, 457 | "setPassword": { 458 | "commands": { 459 | "set_password": { 460 | "command": { 461 | "Fn::Join": [ 462 | "", 463 | [ 464 | "psql -U postgres -c \"ALTER USER postgres with PASSWORD '", 465 | { 466 | "Ref": "DBPassword" 467 | }, 468 | "';\"" 469 | ] 470 | ] 471 | } 472 | } 473 | } 474 | }, 475 | "restartPG": { 476 | "commands": { 477 | "restart": { 478 | "command": "systemctl restart postgresql" 479 | } 480 | 481 | } 482 | }, 483 | "push1stBackup": { 484 | "commands": { 485 | 486 | "pushBackup": { 487 | "command": "runuser -l postgres -c '/backup_scripts/push-backup.sh'" 488 | } 489 | 490 | } 491 | }, 492 | "setupBackup": { 493 | "commands": { 494 | "00_command": { 495 | "command": "chmod u+x /backup_scripts/push-backup.sh" 496 | }, 497 | "01_command": { 498 | "command": "runuser -l postgres -c 'psql -U postgres -c \"CREATE EXTENSION plpython3u\" -d postgres'" 499 | }, 500 | "02_command": { 501 | "command": "runuser -l postgres -c 'psql -U postgres -c \"CREATE EXTENSION aws_s3\" -d postgres'" 502 | }, 503 | "03_command": { 504 | "command": "echo \"archive_command = 'WALG_S3_PREFIX=s3://cf-pg-backups/$BUCKUP_FOLDER wal-g wal-push %p'\" >> 
/pgtank/db/postgres/postgresql.conf", 505 | "env": { 506 | "BUCKUP_FOLDER": { 507 | "Ref": "AWS::StackName" 508 | } 509 | } 510 | }, 511 | "04_cron1_command": { 512 | "command": "echo \"0 4 * * * /backup_scripts/push-backup.sh &> /backup_scripts/push-backup.log\" >> /var/spool/cron/postgres" 513 | }, 514 | "04_cron2_command": { 515 | "command": "echo \"*/5 * * * * /home/ec2-user/aws-scripts-mon/mon-put-instance-data.pl --mem-util --mem-used --disk-space-util --disk-path=/ --disk-path=/pgtank/db --from-cron\" >> /var/spool/cron/ec2-user" 516 | }, 517 | "05_command": { 518 | "command": "sudo -u postgres createuser -U postgres repuser -c 5 --replication -w" 519 | }, 520 | "set_repuser_password": { 521 | "command": { 522 | "Fn::Join": [ 523 | "", 524 | [ 525 | "psql -U postgres -c \"ALTER USER repuser with PASSWORD '", 526 | { 527 | "Ref": "DBPassword" 528 | }, 529 | "';\"" 530 | ] 531 | ] 532 | } 533 | }, 534 | "update_replica_ip_pg_hba_polling": { 535 | "command": "echo \"*/5 * * * * /backup_scripts/poll-pg_hba_update.sh >> /backup_scripts/poll-pg_hba_update.log 2>&1\" >> /var/spool/cron/postgres" 536 | } 537 | 538 | } 539 | } 540 | } 541 | }, 542 | "Properties": { 543 | "IamInstanceProfile": { 544 | "Ref": "InstanceProfile" 545 | }, 546 | "Tags": [ 547 | { 548 | "Key": "Name", 549 | "Value": { 550 | "Fn::Join": [ 551 | "", 552 | [ 553 | { 554 | "Ref": "AWS::StackName" 555 | }, 556 | "-rw" 557 | ] 558 | ] 559 | } 560 | } 561 | ], 562 | "AvailabilityZone": { 563 | "Ref": "AvailabilityZone" 564 | }, 565 | "ImageId": { 566 | "Ref": "BaseImage" 567 | }, 568 | "InstanceType": { 569 | "Ref": "InstanceTypeRW" 570 | }, 571 | "BlockDeviceMappings": [ 572 | { 573 | "DeviceName": "/dev/xvda", 574 | "Ebs": { 575 | "VolumeType": "gp3", 576 | "VolumeSize": "30", 577 | "DeleteOnTermination": "true" 578 | } 579 | } 580 | ], 581 | "SecurityGroupIds": [ 582 | { 583 | "Ref": "SG" 584 | }, 585 | "sg-0055ac66" 586 | ], 587 | "Volumes": [ 588 | { 589 | "Device": "/dev/sdf", 590 | 
"VolumeId": { 591 | "Ref": "disk1" 592 | } 593 | }, 594 | { 595 | "Device": "/dev/sdh", 596 | "VolumeId": { 597 | "Ref": "disk2" 598 | } 599 | } 600 | ], 601 | "KeyName": { 602 | "Ref": "KeyName" 603 | }, 604 | "UserData": { 605 | "Fn::Base64": { 606 | "Fn::Join": [ 607 | "", 608 | [ 609 | "#!/bin/bash -xe\n", 610 | "yum install -y aws-cfn-bootstrap\n", 611 | "# Install the files and packages from the metadata\n", 612 | "/opt/aws/bin/cfn-init -v ", 613 | " --stack ", 614 | { 615 | "Ref": "AWS::StackName" 616 | }, 617 | " --resource PsqlServer ", 618 | " --configsets InstallAndRun,Configs ", 619 | " --region ", 620 | { 621 | "Ref": "AWS::Region" 622 | }, 623 | "\n", 624 | "# Signal the status from cfn-init\n", 625 | "/opt/aws/bin/cfn-signal -e $? ", 626 | " --stack ", 627 | { 628 | "Ref": "AWS::StackName" 629 | }, 630 | " --resource PsqlServer ", 631 | " --region ", 632 | { 633 | "Ref": "AWS::Region" 634 | }, 635 | "\n" 636 | ] 637 | ] 638 | } 639 | } 640 | }, 641 | "CreationPolicy": { 642 | "ResourceSignal": { 643 | "Timeout": "PT10M" 644 | } 645 | } 646 | }, 647 | "PsqlServerPgTankAlarm": { 648 | "Type": "AWS::CloudWatch::Alarm", 649 | "Properties": { 650 | "AlarmName": { 651 | "Fn::Join": [ 652 | "", 653 | [ 654 | { "Ref": "AWS::StackName" }, 655 | "-RW-PGTANK-", 656 | { "Ref": "PsqlServer" } 657 | ] 658 | ] 659 | }, 660 | "Namespace": "System/Linux", 661 | "MetricName": "DiskSpaceUtilization", 662 | "Dimensions": [ 663 | { 664 | "Name": "InstanceId", 665 | "Value": { "Ref": "PsqlServer" } 666 | }, 667 | { 668 | "Name": "MountPath", 669 | "Value": "/pgtank/db" 670 | }, 671 | { 672 | "Name": "Filesystem", 673 | "Value": "pgtank/db" 674 | } 675 | ], 676 | "Statistic": "Average", 677 | "Period": 300, 678 | "EvaluationPeriods": 1, 679 | "Threshold": 80, 680 | "ComparisonOperator": "GreaterThanThreshold", 681 | "AlarmActions": [ 682 | "arn:aws:sns:us-west-2:528107429540:PGrds" 683 | ], 684 | "TreatMissingData": "missing" 685 | } 686 | }, 687 | "PsqlServerRO": { 688 | 
"Type": "AWS::EC2::Instance", 689 | "Condition": "ConditionsDeployRO", 690 | "Metadata": { 691 | "AWS::CloudFormation::Authentication": { 692 | "S3AccessCreds": { 693 | "type": "S3", 694 | "roleName": "EC2_S3_Role", 695 | "buckets": [ 696 | "cf-pg-configs", 697 | "cf-pg-backups" 698 | ] 699 | } 700 | }, 701 | "AWS::CloudFormation::Init": { 702 | "configSets": { 703 | "InstallAndRun": [ 704 | "Install" 705 | ], 706 | "Configs": [ 707 | "configFiles", 708 | "restartPG" 709 | ] 710 | }, 711 | "Install": { 712 | "packages": { 713 | "yum": { 714 | "htop": [] 715 | } 716 | }, 717 | "commands": { 718 | "00_command": { 719 | "command": "/sbin/modprobe zfs" 720 | }, 721 | "01_command": { 722 | "command": "./backup_scripts/zpool-nvme.sh" 723 | }, 724 | "02_command": { 725 | "command": "zfs set compression=lz4 pgtank && zfs set atime=off pgtank && zfs set relatime=on pgtank && zfs set xattr=sa pgtank && zfs set recordsize=128K pgtank && zfs set primarycache=all pgtank" 726 | }, 727 | "03_command": { 728 | "command": "zfs create pgtank/db" 729 | }, 730 | "06_command": { 731 | "command": "mkdir /pgtank/db/postgres" 732 | }, 733 | "07_command": { 734 | "command": "chown postgres:postgres /pgtank/db/postgres" 735 | }, 736 | "08_command": { 737 | "command": "WALG_S3_PREFIX=s3://cf-pg-backups/$BUCKUP_FOLDER wal-g backup-fetch /pgtank/db/postgres LATEST", 738 | "env": { 739 | "BUCKUP_FOLDER": { 740 | "Ref": "AWS::StackName" 741 | } 742 | } 743 | }, 744 | "09_command": { 745 | "command": "aws s3 cp s3://cf-pg-configs/ro/${INSTANCE_TYPE_RO}.conf /pgtank/db/postgres/postgresql.conf && chown postgres:postgres /pgtank/db/postgres/postgresql.conf", 746 | "env": { 747 | "INSTANCE_TYPE_RO": {"Ref": "InstanceTypeRO"}, 748 | "AWS_DEFAULT_REGION": {"Ref": "AWS::Region"} 749 | } 750 | } 751 | , 752 | "10_command": { 753 | "command": "echo \"restore_command = 'WALG_S3_PREFIX=s3://cf-pg-backups/$BUCKUP_FOLDER wal-g wal-fetch \"%f\" \"%p\"'\" >> /pgtank/db/postgres/postgresql.conf", 754 | "env": { 
755 | "BUCKUP_FOLDER": { 756 | "Ref": "AWS::StackName" 757 | } 758 | } 759 | }, 760 | "11_command": { 761 | "command": "touch /pgtank/db/postgres/standby.signal" 762 | }, 763 | "12_command": { 764 | "command": "echo \"primary_conninfo = 'host=$MASTER_IP port=5432 user=repuser password=$DB_PASS'\" >> /pgtank/db/postgres/postgresql.conf", 765 | "env": { 766 | "BUCKUP_FOLDER": { 767 | "Ref": "AWS::StackName" 768 | }, 769 | "MASTER_IP": { 770 | "Fn::GetAtt": [ 771 | "PsqlServer", 772 | "PrivateIp" 773 | ] 774 | }, 775 | "DB_PASS": { 776 | "Ref": "DBPassword" 777 | } 778 | } 779 | }, 780 | "13_command": { 781 | "command": "chown -R postgres:postgres /pgtank/db/postgres/" 782 | }, 783 | "14_command": { 784 | "command": "echo vm.nr_hugepages=$PAGE_N | tee -a /etc/sysctl.conf", 785 | "env": { 786 | "PAGE_N": { 787 | "Fn::FindInMap": [ 788 | "HugePages", 789 | { 790 | "Ref": "InstanceTypeRO" 791 | }, 792 | "nPages" 793 | ] 794 | } 795 | } 796 | }, 797 | "15_command":{ 798 | "command": "sysctl -p /etc/sysctl.conf" 799 | }, 800 | "16_command": { 801 | "command": { 802 | "Fn::Join": [ 803 | "", 804 | [ 805 | "sudo hostnamectl set-hostname ", 806 | { 807 | "Ref": "AWS::StackName" 808 | }, 809 | "-ro.myapp.local" 810 | ] 811 | ] 812 | } 813 | } 814 | }, 815 | "services": { 816 | "sysvinit": { 817 | "cfn-hup": { 818 | "enabled": "true", 819 | "ensureRunning": "true", 820 | "files": [ 821 | "/etc/cfn/cfn-hup.conf", 822 | "/etc/cfn/hooks.d/cfn-auto-reloader.conf" 823 | ] 824 | } 825 | } 826 | }, 827 | "files": { 828 | "/etc/cfn/cfn-hup.conf": { 829 | "content": { 830 | "Fn::Join": [ 831 | "", 832 | [ 833 | "[main]\n", 834 | "stack=", 835 | { 836 | "Ref": "AWS::StackId" 837 | }, 838 | "\n", 839 | "region=", 840 | { 841 | "Ref": "AWS::Region" 842 | }, 843 | "\n", 844 | "verbose=true\n", 845 | "interval=5\n" 846 | ] 847 | ] 848 | }, 849 | "mode": "000400", 850 | "owner": "root", 851 | "group": "root" 852 | }, 853 | "/etc/cfn/hooks.d/cfn-auto-reloader.conf": { 854 | "content": { 855 | 
"Fn::Join": [ 856 | "", 857 | [ 858 | "[cfn-auto-reloader-hook]\n", 859 | "triggers=post.update\n", 860 | "path=Resources.PsqlServerRO.Metadata.AWS::CloudFormation::Init\n", 861 | "action=/opt/aws/bin/cfn-init -v ", 862 | " --stack ", 863 | { 864 | "Ref": "AWS::StackName" 865 | }, 866 | " --resource PsqlServerRO ", 867 | " --configsets Configs ", 868 | " --region ", 869 | { 870 | "Ref": "AWS::Region" 871 | }, 872 | "\n", 873 | "runas=root\n" 874 | ] 875 | ] 876 | }, 877 | "mode": "000400", 878 | "owner": "root", 879 | "group": "root" 880 | }, 881 | "/backup_scripts/zpool-nvme.sh": { 882 | "content": { 883 | "Fn::Join": [ 884 | "", 885 | [ 886 | "#!/bin/bash\n", 887 | "declare -a DRIVE_NAMES\n", 888 | "while IFS= read -r line; do\n", 889 | " drive=$(echo \"$line\" | awk '{print $1}' | sed 's|/dev/||')\n", 890 | " DRIVE_NAMES+=(\"$drive\")\n", 891 | "done < <(sudo nvme list | grep \"NVMe Instance Storage\")\n", 892 | "if [ ${#DRIVE_NAMES[@]} -eq 0 ]; then\n", 893 | " echo \"No NVMe instance storage drives found.\"\n", 894 | " exit 1\n", 895 | "fi\n", 896 | "DRIVE_LIST=\"\"\n", 897 | "for drive in \"${DRIVE_NAMES[@]}\"; do\n", 898 | " DRIVE_LIST+=\"$drive \"\n", 899 | "done\n", 900 | "DRIVE_LIST=$(echo $DRIVE_LIST | xargs)\n", 901 | "sudo zpool create -o ashift=9 pgtank $DRIVE_LIST\n", 902 | "echo \"ZFS pool 'pgtank' created with drives: $DRIVE_LIST\"\n" 903 | ] 904 | ] 905 | }, 906 | "mode": "000700", 907 | "owner": "root", 908 | "group": "root" 909 | } 910 | } 911 | }, 912 | "configFiles": { 913 | "services": { 914 | "systemd": { 915 | "postgresql": { 916 | "enabled": "true", 917 | "ensureRunning": "true" 918 | } 919 | } 920 | }, 921 | "files": { 922 | "/pgtank/db/postgres/pg_hba.conf": { 923 | "mode": "000755", 924 | "owner": "postgres", 925 | "group": "postgres", 926 | "content": { 927 | "Fn::Join": [ 928 | "", 929 | [ 930 | "local all all trust\n", 931 | "host all all 127.0.0.1/32 trust\n", 932 | "host all all 0.0.0.0/0 scram-sha-256\n", 933 | "host all all 
::1/128 trust\n", 934 | "local replication all trust\n", 935 | "host replication all 127.0.0.1/32 trust\n", 936 | "host replication all ::1/128 trust\n" 937 | ] 938 | ] 939 | } 940 | }, 941 | "/usr/lib/systemd/system/postgresql.service": { 942 | "content": { 943 | "Fn::Join": [ 944 | "", 945 | [ 946 | "[Unit]\n", 947 | "Description=PostgreSQL database server\n", 948 | "After=network.target\n", 949 | "\n", 950 | "[Service]\n", 951 | "Type=notify\n", 952 | "\n", 953 | "User=postgres\n", 954 | "Group=postgres\n", 955 | "\n", 956 | "# Where to send early-startup messages from the server (before the logging\n", 957 | "# options of postgresql.conf take effect)\n", 958 | "# This is normally controlled by the global default set by systemd\n", 959 | "# StandardOutput=syslog\n", 960 | "\n", 961 | "# Disable OOM kill on the postmaster\n", 962 | "OOMScoreAdjust=-1000\n", 963 | "# ... but allow it still to be effective for child processes\n", 964 | "# (note that these settings are ignored by Postgres releases before 9.5)\n", 965 | "Environment=PG_OOM_ADJUST_FILE=/proc/self/oom_score_adj\n", 966 | "Environment=PG_OOM_ADJUST_VALUE=0\n", 967 | "\n", 968 | "Environment=PGDATA=/pgtank/db/postgres\n", 969 | "\n", 970 | "ExecStartPre=/usr/libexec/postgresql-check-db-dir %N\n", 971 | "# Even though the $PGDATA variable is exported (postmaster would accept that)\n", 972 | "# use the -D option here so PGDATA content is printed by /bin/ps and by\n", 973 | "# 'systemctl status'.\n", 974 | "ExecStart=/usr/bin/postmaster -D ${PGDATA}\n", 975 | "ExecReload=/bin/kill -HUP $MAINPID\n", 976 | "KillMode=mixed\n", 977 | "KillSignal=SIGINT\n", 978 | "\n", 979 | "# No artificial start/stop timeout.\n", 980 | "TimeoutSec=0\n", 981 | "\n", 982 | "[Install]\n", 983 | "WantedBy=multi-user.target\n" 984 | ] 985 | ] 986 | } 987 | } 988 | } 989 | }, 990 | "restartPG": { 991 | "commands": { 992 | "01_command": { 993 | "command": "systemctl restart postgresql" 994 | } 995 | } 996 | } 997 | } 998 | }, 999 | 
"Properties": { 1000 | "IamInstanceProfile": { 1001 | "Ref": "InstanceProfile" 1002 | }, 1003 | "Tags": [ 1004 | { 1005 | "Key": "Name", 1006 | "Value": { 1007 | "Fn::Join": [ 1008 | "", 1009 | [ 1010 | { 1011 | "Ref": "AWS::StackName" 1012 | }, 1013 | "-ro" 1014 | ] 1015 | ] 1016 | } 1017 | } 1018 | ], 1019 | "AvailabilityZone": { 1020 | "Ref": "AvailabilityZone" 1021 | }, 1022 | "ImageId": { 1023 | "Ref": "BaseImage" 1024 | }, 1025 | "InstanceType": { 1026 | "Ref": "InstanceTypeRO" 1027 | }, 1028 | "SecurityGroupIds": [ 1029 | { 1030 | "Ref": "SG" 1031 | }, 1032 | "sg-0055ac66" 1033 | ], 1034 | "KeyName": { 1035 | "Ref": "KeyName" 1036 | }, 1037 | "UserData": { 1038 | "Fn::Base64": { 1039 | "Fn::Join": [ 1040 | "", 1041 | [ 1042 | "#!/bin/bash -xe\n", 1043 | "yum install -y aws-cfn-bootstrap\n", 1044 | "# Install the files and packages from the metadata\n", 1045 | "/opt/aws/bin/cfn-init -v ", 1046 | " --stack ", 1047 | { 1048 | "Ref": "AWS::StackName" 1049 | }, 1050 | " --resource PsqlServerRO ", 1051 | " --configsets InstallAndRun,Configs ", 1052 | " --region ", 1053 | { 1054 | "Ref": "AWS::Region" 1055 | }, 1056 | "\n", 1057 | "# Signal the status from cfn-init\n", 1058 | "/opt/aws/bin/cfn-signal -e $? 
", 1059 | " --stack ", 1060 | { 1061 | "Ref": "AWS::StackName" 1062 | }, 1063 | " --resource PsqlServerRO ", 1064 | " --region ", 1065 | { 1066 | "Ref": "AWS::Region" 1067 | }, 1068 | "\n" 1069 | ] 1070 | ] 1071 | } 1072 | } 1073 | }, 1074 | "CreationPolicy": { 1075 | "ResourceSignal": { 1076 | "Timeout": "PT10M" 1077 | } 1078 | } 1079 | }, 1080 | "PsqlServerROPgTankAlarm": { 1081 | "Type": "AWS::CloudWatch::Alarm", 1082 | "Condition": "ConditionsDeployRO", 1083 | "Properties": { 1084 | "AlarmName": { 1085 | "Fn::Join": [ 1086 | "", 1087 | [ 1088 | { "Ref": "AWS::StackName" }, 1089 | "-RO-PGTANK-", 1090 | { "Ref": "PsqlServerRO" } 1091 | ] 1092 | ] 1093 | }, 1094 | "Namespace": "System/Linux", 1095 | "MetricName": "DiskSpaceUtilization", 1096 | "Dimensions": [ 1097 | { 1098 | "Name": "InstanceId", 1099 | "Value": { "Ref": "PsqlServerRO" } 1100 | }, 1101 | { 1102 | "Name": "MountPath", 1103 | "Value": "/pgtank/db" 1104 | }, 1105 | { 1106 | "Name": "Filesystem", 1107 | "Value": "pgtank/db" 1108 | } 1109 | ], 1110 | "Statistic": "Average", 1111 | "Period": 300, 1112 | "EvaluationPeriods": 1, 1113 | "Threshold": 80, 1114 | "ComparisonOperator": "GreaterThanThreshold", 1115 | "AlarmActions": [ 1116 | "arn:aws:sns:us-west-2:528107429540:PGrds" 1117 | ], 1118 | "TreatMissingData": "missing" 1119 | } 1120 | }, 1121 | 1122 | "disk1": { 1123 | "Type": "AWS::EC2::Volume", 1124 | "Properties": { 1125 | "Tags": [ 1126 | { 1127 | "Key": "Name", 1128 | "Value": { 1129 | "Fn::Join": [ 1130 | "", 1131 | [ 1132 | { 1133 | "Ref": "AWS::StackName" 1134 | }, 1135 | "-disk1" 1136 | ] 1137 | ] 1138 | } 1139 | } 1140 | ], 1141 | "Size": 30, 1142 | "VolumeType": "gp3", 1143 | "AvailabilityZone": { 1144 | "Ref": "AvailabilityZone" 1145 | }, 1146 | "Iops": 1000 1147 | }, 1148 | "Metadata": { 1149 | } 1150 | }, 1151 | "disk2": { 1152 | "Type": "AWS::EC2::Volume", 1153 | "Properties": { 1154 | "Tags": [ 1155 | { 1156 | "Key": "Name", 1157 | "Value": { 1158 | "Fn::Join": [ 1159 | "", 1160 | [ 
1161 | { 1162 | "Ref": "AWS::StackName" 1163 | }, 1164 | "-disk2" 1165 | ] 1166 | ] 1167 | } 1168 | } 1169 | ], 1170 | "Size": 30, 1171 | "VolumeType": "gp3", 1172 | "AvailabilityZone": { 1173 | "Ref": "AvailabilityZone" 1174 | }, 1175 | "Iops": 1000 1176 | }, 1177 | "Metadata": { 1178 | } 1179 | }, 1180 | "PrimaryDNSRecord": { 1181 | "Type": "AWS::Route53::RecordSet", 1182 | "Properties": { 1183 | "HostedZoneId": "Z22IFTP89RRJMU", 1184 | "Name": { 1185 | "Fn::Join": [ 1186 | "", 1187 | [ 1188 | { "Ref": "AWS::StackName" }, 1189 | "-rw.myapp.local." 1190 | ] 1191 | ] 1192 | }, 1193 | "Type": "A", 1194 | "TTL": "300", 1195 | "ResourceRecords": [ 1196 | { 1197 | "Fn::GetAtt": ["PsqlServer", "PrivateIp"] 1198 | } 1199 | ] 1200 | } 1201 | }, 1202 | "ReadOnlyDNSRecord": { 1203 | "Type": "AWS::Route53::RecordSet", 1204 | "Condition": "ConditionsDeployRO", 1205 | "Properties": { 1206 | "HostedZoneId": "Z22IFTP89RRJMU", 1207 | "Name": { 1208 | "Fn::Join": [ 1209 | "", 1210 | [ 1211 | { "Ref": "AWS::StackName" }, 1212 | "-ro.myapp.local." 
1213 | ] 1214 | ] 1215 | }, 1216 | "Type": "A", 1217 | "TTL": "300", 1218 | "ResourceRecords": [ 1219 | { 1220 | "Fn::GetAtt": ["PsqlServerRO", "PrivateIp"] 1221 | } 1222 | ] 1223 | } 1224 | } 1225 | }, 1226 | "Metadata": { 1227 | }, 1228 | "Outputs": { 1229 | "PsqlServerRWIP": { 1230 | "Description": "Private ip of rw master", 1231 | "Value": { 1232 | "Fn::GetAtt": [ 1233 | "PsqlServer", 1234 | "PrivateIp" 1235 | ] 1236 | } 1237 | }, 1238 | "PsqlServerROIP": { 1239 | "Condition":"ConditionsDeployRO", 1240 | "Description": "Private ip of ro replica", 1241 | "Value": { 1242 | "Fn::GetAtt": [ 1243 | "PsqlServerRO", 1244 | "PrivateIp" 1245 | ] 1246 | } 1247 | } 1248 | 1249 | } 1250 | } 1251 | 1252 | -------------------------------------------------------------------------------- /cf-pg-configs/README.md: -------------------------------------------------------------------------------- 1 | 2 | # EC2 Instance type postgresql.conf 3 | 4 | This project focuses on three instance types: the r5a, t3a and i3en. This does not preclude you from using other instance types and sizes. The r5a is a general-purpose, memory-optimized instance good for ingestion and batch processing workloads. The i3en, which will serve as our read replica (via PostgreSQL streaming replication), is good at fast IO and performs OLAP workloads very quickly on large datasets.
The CloudFormation template references a **cf-pg-configs S3 bucket** that holds one [instance_type.size].conf file per supported instance type. Each file is the postgresql.conf parameter file for that instance type, stored under a prefix for its role in the cluster (rw for the primary, ro for the replica), like this: 5 | 6 | 7 | ``` 8 | cf-pg-configs.s3.us-west-2.amazonaws.com/rw/r5a.2xlarge.conf 9 | cf-pg-configs.s3.us-west-2.amazonaws.com/rw/r5a.xlarge.conf 10 | cf-pg-configs.s3.us-west-2.amazonaws.com/rw/t3a.medium.conf 11 | cf-pg-configs.s3.us-west-2.amazonaws.com/ro/i3en.2xlarge.conf 12 | cf-pg-configs.s3.us-west-2.amazonaws.com/ro/i3en.xlarge.conf 13 | ``` 14 | 15 | The CloudFormation template relies on the above naming convention. -------------------------------------------------------------------------------- /cf-pg-configs/ro/i3en.2xlarge.conf: -------------------------------------------------------------------------------- 1 | listen_addresses = '0.0.0.0' # what IP address(es) to listen on; 2 | max_connections = 100 # (change requires restart) 3 | password_encryption = scram-sha-256 # md5 or scram-sha-256 4 | shared_buffers = 24GB # min 128kB 5 | huge_pages = on # on, off, or try 6 | work_mem = 256MB # min 64kB 7 | maintenance_work_mem = 2GB # min 1MB 8 | dynamic_shared_memory_type = posix # the default is the first option 9 | effective_io_concurrency = 200 # 1-1000; 0 disables prefetching 10 | max_worker_processes = 4 # (change requires restart) 11 | max_parallel_workers_per_gather = 2 # taken from max_parallel_workers 12 | max_parallel_workers = 4 # maximum number of max_worker_processes that 13 | wal_level = replica # minimal, replica, or logical 14 | synchronous_commit = off # synchronization level; 15 | wal_buffers = 16MB # min 32kB, -1 sets based on shared_buffers 16 | max_wal_size = 4GB 17 | min_wal_size = 1GB 18 | archive_mode = off # enables archiving; off, on, or always 19 | archive_timeout = 60 # force a logfile segment switch after this 20 | hot_standby = on # "off"
disallows queries during recovery 21 | max_standby_archive_delay = 1200s # max delay before canceling queries 22 | max_standby_streaming_delay = 1200s # max delay before canceling queries 23 | random_page_cost = 1.2 # same scale as above 24 | effective_cache_size = 20GB 25 | default_statistics_target = 10000 # range 1-10000 26 | logging_collector = on # Enable capturing of stderr and csvlog 27 | log_filename = 'postgresql-%a.log' # log file name pattern, 28 | log_truncate_on_rotation = on # If on, an existing log file with the 29 | log_rotation_age = 1d # Automatic rotation of logfiles will 30 | log_rotation_size = 0 # Automatic rotation of logfiles will 31 | log_timezone = 'UTC' 32 | track_activity_query_size = 16384 # (change requires restart) 33 | datestyle = 'iso, ymd' 34 | timezone = 'UTC' 35 | lc_messages = 'en_CA.UTF-8' # locale for system error message 36 | lc_monetary = 'en_CA.UTF-8' # locale for monetary formatting 37 | lc_numeric = 'en_CA.UTF-8' # locale for number formatting 38 | lc_time = 'en_CA.UTF-8' # locale for time formatting 39 | default_text_search_config = 'pg_catalog.english' 40 | max_locks_per_transaction = 1024 # min 10 -------------------------------------------------------------------------------- /cf-pg-configs/ro/i3en.xlarge.conf: -------------------------------------------------------------------------------- 1 | listen_addresses = '0.0.0.0' # what IP address(es) to listen on; 2 | max_connections = 100 # (change requires restart) 3 | password_encryption = scram-sha-256 # md5 or scram-sha-256 4 | shared_buffers = 8GB # min 128kB 5 | huge_pages = on # on, off, or try 6 | work_mem = 256MB # min 64kB 7 | maintenance_work_mem = 2GB # min 1MB 8 | dynamic_shared_memory_type = posix # the default is the first option 9 | effective_io_concurrency = 200 # 1-1000; 0 disables prefetching 10 | max_worker_processes = 4 # (change requires restart) 11 | max_parallel_workers_per_gather = 2 # taken from max_parallel_workers 12 | max_parallel_workers = 4 # 
maximum number of max_worker_processes that 13 | wal_level = logical # minimal, replica, or logical 14 | synchronous_commit = off # synchronization level; 15 | wal_buffers = 16MB # min 32kB, -1 sets based on shared_buffers 16 | max_wal_size = 4GB 17 | min_wal_size = 1GB 18 | archive_mode = off # enables archiving; off, on, or always 19 | archive_timeout = 60 # force a logfile segment switch after this 20 | hot_standby = on # off disallows queries during recovery 21 | max_standby_archive_delay = 1200s # max delay before canceling queries 22 | max_standby_streaming_delay = 1200s # max delay before canceling queries 23 | random_page_cost = 1.2 # same scale as above 24 | effective_cache_size = 20GB 25 | default_statistics_target = 10000 # range 1-10000 26 | logging_collector = on # Enable capturing of stderr and csvlog 27 | log_filename = 'postgresql-%a.log' # log file name pattern, 28 | log_truncate_on_rotation = on # If on, an existing log file with the 29 | log_rotation_age = 1d # Automatic rotation of logfiles will 30 | log_rotation_size = 0 # Automatic rotation of logfiles will 31 | log_timezone = 'UTC' 32 | track_activity_query_size = 16384 # (change requires restart) 33 | datestyle = 'iso, ymd' 34 | timezone = 'UTC' 35 | lc_messages = 'en_CA.UTF-8' # locale for system error message 36 | lc_monetary = 'en_CA.UTF-8' # locale for monetary formatting 37 | lc_numeric = 'en_CA.UTF-8' # locale for number formatting 38 | lc_time = 'en_CA.UTF-8' # locale for time formatting 39 | default_text_search_config = 'pg_catalog.english' 40 | max_locks_per_transaction = 1024 # min 10 41 | 42 | -------------------------------------------------------------------------------- /cf-pg-configs/rw/r5a.2xlarge.conf: -------------------------------------------------------------------------------- 1 | listen_addresses = '0.0.0.0' # what IP address(es) to listen on; 2 | max_connections = 100 # (change requires restart) 3 | password_encryption = scram-sha-256 # md5 or scram-sha-256 4 | 
shared_buffers = 4GB # min 128kB 5 | huge_pages = on # on, off, or try 6 | work_mem = 256MB # min 64kB 7 | maintenance_work_mem = 1GB # min 1MB 8 | autovacuum_work_mem = 1GB # min 1MB, or -1 to use maintenance_work_mem 9 | dynamic_shared_memory_type = posix # the default is the first option 10 | effective_io_concurrency = 200 # 1-1000; 0 disables prefetching 11 | max_worker_processes = 4 # (change requires restart) 12 | max_parallel_workers_per_gather = 2 # taken from max_parallel_workers 13 | max_parallel_workers = 4 # maximum number of max_worker_processes that 14 | wal_level = logical # minimal, replica, or logical 15 | synchronous_commit = off # synchronization level; 16 | wal_buffers = 16MB # min 32kB, -1 sets based on shared_buffers 17 | max_wal_size = 4GB 18 | min_wal_size = 80MB 19 | archive_mode = on # enables archiving; off, on, or always 20 | archive_timeout = 120 # force a logfile segment switch after this 21 | random_page_cost = 1.2 # same scale as above 22 | effective_cache_size = 20GB 23 | default_statistics_target = 10000 # range 1-10000 24 | logging_collector = on # Enable capturing of stderr and csvlog 25 | log_filename = 'postgresql-%a.log' # log file name pattern, 26 | log_truncate_on_rotation = on # If on, an existing log file with the 27 | log_rotation_age = 1d # Automatic rotation of logfiles will 28 | log_rotation_size = 0 # Automatic rotation of logfiles will 29 | log_timezone = 'UTC' 30 | track_activity_query_size = 8192 # (change requires restart) 31 | autovacuum = on # Enable autovacuum subprocess? 
'on' 32 | autovacuum_max_workers = 3 # max number of autovacuum subprocesses 33 | autovacuum_naptime = 300s # time between autovacuum runs 34 | autovacuum_vacuum_threshold = 2001 # min number of row updates before 35 | autovacuum_analyze_threshold = 2000 # min number of row updates before 36 | autovacuum_vacuum_scale_factor = 0.008 # fraction of table size before vacuum 37 | autovacuum_analyze_scale_factor = 0.008 # fraction of table size before analyze 38 | datestyle = 'iso, ymd' 39 | timezone = 'UTC' 40 | lc_messages = 'en_CA.UTF-8' # locale for system error message 41 | lc_monetary = 'en_CA.UTF-8' # locale for monetary formatting 42 | lc_numeric = 'en_CA.UTF-8' # locale for number formatting 43 | lc_time = 'en_CA.UTF-8' # locale for time formatting 44 | default_text_search_config = 'pg_catalog.english' 45 | max_locks_per_transaction = 1024 # min 10 46 | -------------------------------------------------------------------------------- /cf-pg-configs/rw/r5a.xlarge.conf: -------------------------------------------------------------------------------- 1 | listen_addresses = '0.0.0.0' # what IP address(es) to listen on; 2 | max_connections = 100 # (change requires restart) 3 | password_encryption = scram-sha-256 # md5 or scram-sha-256 4 | shared_buffers = 4GB # min 128kB 5 | huge_pages = on # on, off, or try 6 | work_mem = 256MB # min 64kB 7 | maintenance_work_mem = 1GB # min 1MB 8 | autovacuum_work_mem = 1GB # min 1MB, or -1 to use maintenance_work_mem 9 | dynamic_shared_memory_type = posix # the default is the first option 10 | effective_io_concurrency = 200 # 1-1000; 0 disables prefetching 11 | max_worker_processes = 4 # (change requires restart) 12 | max_parallel_workers_per_gather = 2 # taken from max_parallel_workers 13 | max_parallel_workers = 4 # maximum number of max_worker_processes that 14 | wal_level = logical # minimal, replica, or logical 15 | synchronous_commit = off # synchronization level; 16 | wal_buffers = 16MB # min 32kB, -1 sets based on 
shared_buffers 17 | max_wal_size = 4GB 18 | min_wal_size = 80MB 19 | archive_mode = on # enables archiving; off, on, or always 20 | archive_timeout = 120 # force a logfile segment switch after this 21 | random_page_cost = 1.2 # same scale as above 22 | effective_cache_size = 20GB 23 | default_statistics_target = 10000 # range 1-10000 24 | logging_collector = on # Enable capturing of stderr and csvlog 25 | log_filename = 'postgresql-%a.log' # log file name pattern, 26 | log_truncate_on_rotation = on # If on, an existing log file with the 27 | log_rotation_age = 1d # Automatic rotation of logfiles will 28 | log_rotation_size = 0 # Automatic rotation of logfiles will 29 | log_timezone = 'UTC' 30 | track_activity_query_size = 8192 # (change requires restart) 31 | autovacuum = on # Enable autovacuum subprocess? 'on' 32 | autovacuum_max_workers = 3 # max number of autovacuum subprocesses 33 | autovacuum_naptime = 300s # time between autovacuum runs 34 | autovacuum_vacuum_threshold = 2001 # min number of row updates before 35 | autovacuum_analyze_threshold = 2000 # min number of row updates before 36 | autovacuum_vacuum_scale_factor = 0.008 # fraction of table size before vacuum 37 | autovacuum_analyze_scale_factor = 0.008 # fraction of table size before analyze 38 | datestyle = 'iso, ymd' 39 | timezone = 'UTC' 40 | lc_messages = 'en_CA.UTF-8' # locale for system error message 41 | lc_monetary = 'en_CA.UTF-8' # locale for monetary formatting 42 | lc_numeric = 'en_CA.UTF-8' # locale for number formatting 43 | lc_time = 'en_CA.UTF-8' # locale for time formatting 44 | default_text_search_config = 'pg_catalog.english' 45 | max_locks_per_transaction = 1024 # min 10 46 | -------------------------------------------------------------------------------- /cf-pg-configs/rw/t3a.medium.conf: -------------------------------------------------------------------------------- 1 | listen_addresses = '0.0.0.0' # what IP address(es) to listen on 2 | max_connections = 100 # (change requires 
restart) 3 | password_encryption = scram-sha-256 # md5 or scram-sha-256 4 | shared_buffers = 512kB # min 128kB 5 | dynamic_shared_memory_type = posix # the default is the first option 6 | wal_level = logical # minimal, replica, or logical 7 | archive_mode = on # enables archiving; off, on, or always 8 | archive_timeout = 60 # force a logfile segment switch after this 9 | logging_collector = on # Enable capturing of stderr and csvlog 10 | log_filename = 'postgresql-%a.log' # log file name pattern, 11 | log_truncate_on_rotation = on # If on, an existing log file with the 12 | log_rotation_age = 1d # Automatic rotation of logfiles will 13 | log_rotation_size = 0 # Automatic rotation of logfiles will 14 | log_timezone = 'UTC' 15 | track_activity_query_size = 8096 # (change requires restart) 16 | datestyle = 'iso, mdy' 17 | timezone = 'UTC' 18 | lc_messages = 'C.UTF-8' # locale for system error message 19 | lc_monetary = 'C.UTF-8' # locale for monetary formatting 20 | lc_numeric = 'C.UTF-8' # locale for number formatting 21 | lc_time = 'C.UTF-8' # locale for time formatting 22 | default_text_search_config = 'pg_catalog.english' 23 | max_locks_per_transaction = 1024 # min 10 24 | --------------------------------------------------------------------------------
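
The naming convention described in cf-pg-configs/README.md (a role prefix of `rw` or `ro`, followed by `<instance_type>.<size>.conf`) can be sketched as a small lookup helper. This is only an illustration of the convention, not code from the template: the function name `config_url` is hypothetical, and the `https://` scheme is an assumption, since the README lists the object locations without one.

```python
# Hypothetical helper illustrating the cf-pg-configs naming convention.
# The bucket host and the rw/ro prefixes are taken from the README listing;
# the https:// scheme and the function itself are assumptions.
BUCKET_HOST = "cf-pg-configs.s3.us-west-2.amazonaws.com"
ROLES = ("rw", "ro")  # rw = read/write primary, ro = read-only replica


def config_url(role: str, instance_type: str) -> str:
    """Return the assumed URL of the postgresql.conf for one instance type."""
    if role not in ROLES:
        raise ValueError(f"role must be one of {ROLES}, got {role!r}")
    # Instance types look like "<family>.<size>", e.g. "r5a.2xlarge".
    family, _, size = instance_type.partition(".")
    if not family or not size:
        raise ValueError(f"expected '<family>.<size>', got {instance_type!r}")
    return f"https://{BUCKET_HOST}/{role}/{instance_type}.conf"


if __name__ == "__main__":
    print(config_url("rw", "r5a.2xlarge"))
    print(config_url("ro", "i3en.xlarge"))
```

Supporting a new instance type under this convention would then only require uploading one more `.conf` file under the matching prefix.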