├── LICENSE.txt ├── README.md ├── blank.gif ├── get-easylist.sh ├── patterns.sed └── urls.txt /LICENSE.txt: -------------------------------------------------------------------------------- 1 | The MIT License (MIT) 2 | 3 | Copyright (c) 2015 James White 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # squidGuard/ufdbGuard Adblock Plus Expression Lists Converter 2 | 3 | A shell script that converts Adblock Plus lists into expressions files compatible with squidGuard and ufdbGuard 4 | 5 | Many similar scripts exist to convert such lists, but some are quite old and use `sed` patterns that actually cause problems. 
6 | 7 | ### Dependencies 8 | 9 | * bash 10 | * sed 11 | * wget 12 | * squid 13 | * squidGuard or ufdbGuard 14 | 15 | ### Setup 16 | 17 | The script is designed to automatically detect the required aspects of your Squid and squidGuard/ufdbGuard configuration, paths, file locations etc. When you run the script it will output what it thinks is correct, with the ability to stop the script if anything is wrong. 18 | 19 | While I try to test this script on multiple Linux distributions, there may be issues with detection in some cases. If you experience any problems, please file an issue on GitHub and I'll look into it. 20 | 21 | ### Installation 22 | 23 | No formal installation is required, other than to get the source and to extract it to any directory on the machine you are running Squid on. 24 | 25 | You will need to grant executable permission to the shell script before running it: 26 | 27 | ``` 28 | chmod +x get-easylist.sh 29 | ``` 30 | 31 | See the Usage section for instructions on how to use the script. 32 | 33 | ### Usage 34 | 35 | The script takes a filter type and an optional `autoconfirm` parameter, depending on the setup you are running: 36 | 37 | ``` 38 | ./get-easylist.sh [squidGuard/ufdbGuard] [autoconfirm] 39 | ``` 40 | 41 | For additional guidance, you can run the script without any parameters, which will show the help message. 42 | 43 | #### Examples: 44 | 45 | ##### squidGuard 46 | 47 | ``` 48 | ./get-easylist.sh squidGuard 49 | ``` 50 | 51 | ##### ufdbGuard 52 | 53 | ``` 54 | ./get-easylist.sh ufdbGuard 55 | ``` 56 | 57 | The lists will automatically be downloaded, converted and written to the database folder of the respective URL filter. The conversion will need a bit of processing power, as the lists can be quite large.
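To give a rough idea of what the conversion step does, here is a minimal sketch that applies a small subset of the rules from `patterns.sed` to a single Adblock rule. The `convert_rule` function and the sample rules are made up for illustration; the real script applies the full pattern file to each downloaded list, so treat this as an approximation rather than the exact behaviour.

```shell
#!/usr/bin/env bash
# convert_rule is a hypothetical helper used only for this illustration.
# It applies four rules copied from patterns.sed, not the full set.
# $1: a single Adblock Plus rule
convert_rule() {
    printf '%s\n' "$1" | sed \
        -e '/^!.*/d' \
        -e '/@@.*/d' \
        -e 's,[+.?&/|],\\&,g' \
        -e 's/\\|\\|\(.*\)\^\(.*\)/(^|\\\.)\1\\\/\2/g'
}

convert_rule '||ads.example.com^'   # domain rule, converted to a regex
convert_rule '! a comment line'     # comments are dropped
convert_rule '@@||example.com^'     # exception rules are dropped
```

With GNU sed, the first call should print `(^|\.)ads\.example\.com\/`, while the other two print nothing because comment and exception lines are deleted.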
58 | 59 | ### Adding the converted Adblock expression files to your URL filter 60 | 61 | ##### squidGuard 62 | 63 | ``` 64 | dest adblock { 65 | expressionlist adblock/easylist 66 | expressionlist adblock/easyprivacy 67 | } 68 | ``` 69 | 70 | ``` 71 | acl { 72 | default { 73 | pass !adblock all 74 | } 75 | } 76 | ``` 77 | 78 | ##### ufdbGuard 79 | 80 | ``` 81 | category adblock { 82 | expressionlist adblock/easylist 83 | expressionlist adblock/easyprivacy 84 | } 85 | ``` 86 | 87 | ``` 88 | acl { 89 | default { 90 | pass !adblock any 91 | } 92 | } 93 | ``` 94 | 95 | #### Replacing advertisements with a transparent image 96 | 97 | You should also consider adding a redirect directive to your dest/category adblock config, to serve a transparent image that replaces the original advert space/content when an expression rule matches. You can find a 1x1 blank.gif image within the source. This needs to be hosted on a web server and referenced like so in a `dest` or `category` block: 98 | 99 | ##### squidGuard 100 | 101 | ``` 102 | redirect http://yourwebsite.com/blank.gif 103 | ``` 104 | 105 | The image file could also be hosted on an internal server, as long as the server is accessible when using the proxy. 106 | 107 | ##### ufdbGuard 108 | 109 | With ufdbGuard you have the option to run a local Apache HTTPD instance, which will serve the transparent image for you as it already includes one. The settings for Apache are present in `ufdbGuard.conf`. An example redirect in ufdbGuard that is generally non-intrusive is below: 110 | 111 | ``` 112 | redirect http://your-proxy-server.com:8080/cgi-bin/URLblocked.cgi?admin=%A&mode=default&color=red&size=normal&clientaddr=%a&clientname=%n&clientuser=%i&clientgroup=%s&targetgroup=%t&url=%u 113 | ``` 114 | 115 | Make sure you change the proxy address and port number to match your setup. It doesn't necessarily have to be a registered DNS name; an IP address is also sufficient.
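If it helps to see where the redirect sits relative to the expression lists, the sketch below prints a complete `dest` or `category` block with the redirect included. The `print_adblock_block` function is a hypothetical helper, not part of `get-easylist.sh`, and the `192.0.2.10` address is only a placeholder for wherever you host `blank.gif`.

```shell
#!/usr/bin/env bash
# print_adblock_block is a hypothetical helper, not part of get-easylist.sh.
# $1: filter type (squidGuard or ufdbGuard)
# $2: URL of the hosted transparent image (placeholder value used below)
print_adblock_block() {
    local keyword

    # squidGuard uses "dest" blocks, ufdbGuard uses "category" blocks
    case "$1" in
        squidGuard) keyword="dest" ;;
        ufdbGuard)  keyword="category" ;;
        *) echo "Unknown filter type: $1" >&2 ; return 1 ;;
    esac

    printf '%s\n' \
        "${keyword} adblock {" \
        "    expressionlist adblock/easylist" \
        "    expressionlist adblock/easyprivacy" \
        "    redirect $2" \
        "}"
}

print_adblock_block ufdbGuard "http://192.0.2.10/blank.gif"
```

The printed block can be pasted into your filter configuration in place of the plain `dest`/`category` examples above.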
116 | 117 | ### Automatic updates of expressions lists via cron 118 | 119 | The Adblock list files themselves are updated quite regularly, so you may want to run this script through cron. To do so, pass an additional parameter to skip the user confirmation prompt: 120 | 121 | ``` 122 | ./get-easylist.sh [squidGuard/ufdbGuard] autoconfirm 123 | ``` 124 | 125 | You can then schedule the job through cron. Be sure to update the path to wherever you've actually stored the script. An example cron entry could be: 126 | 127 | ``` 128 | 0 0 * * * /path/to/get-easylist.sh squidGuard autoconfirm >/dev/null 2>&1 129 | ``` 130 | 131 | This would run the script every day at midnight (00:00) for squidGuard. 132 | 133 | ### Comparison to Adblock Plus Browser Addon 134 | 135 | There are advantages and disadvantages to running Adblock expression lists through Squid. I'll cover them here: 136 | 137 | #### Advantages 138 | 139 | * Filtering is applied at the proxy level, no need for browser addons for each client 140 | * Enables ad filtering for devices that don't support browser addons but can use a user-defined HTTP proxy 141 | * Useful for devices with low RAM as the proxy will take care of all the filtering 142 | * Being able to point multiple clients at the proxy means you don't have to configure each client with ad blocking 143 | * Works with Squid as a transparent proxy 144 | 145 | #### Disadvantages 146 | 147 | * Cannot detect JavaScript-based ads 148 | * Some ad space can sometimes be left behind 149 | 150 | ### Contributing 151 | 152 | Pull Requests and feedback welcome!
153 | -------------------------------------------------------------------------------- /blank.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jamesmacwhite/squidguard-adblock/116e95bd0c6405c518d80d8586dd6344e6e83d1b/blank.gif -------------------------------------------------------------------------------- /get-easylist.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | ############################################################################################### 3 | ## get-easylist.sh 4 | ## Author: James White (james@jmwhite.co.uk) 5 | ## Description: Gets Adblock lists and converts them to squidGuard/ufdbGuard expression lists 6 | ## Version: 0.3 BETA 2 7 | ## 8 | ## Notes: 9 | ## A specific sed pattern file is required for the conversion 10 | ## Due to changes in the EasyList formats, older sed patterns will cause problems 11 | ## The pattern this script uses is tested regularly for any issues 12 | ## 13 | 14 | SCRIPT_NAME=${0##*/} 15 | SCRIPT_VERSION="0.3 BETA 2" 16 | GITHUB_REPO="https://github.com/jamesmacwhite/squidguard-adblock" 17 | WORKING_DIR=$(dirname "$0") 18 | OS=$(uname) 19 | 20 | if ! 
[ "$(id -u)" = 0 ] ; then 21 | echo "Please run this script as root" 22 | exit 1 23 | fi 24 | 25 | usage() { 26 | 27 | printf "%s\n" " " \ 28 | "-------------------------------------------------------------------------" \ 29 | "${SCRIPT_NAME} (Version ${SCRIPT_VERSION})" \ 30 | "${GITHUB_REPO}" \ 31 | "Gets Adblock lists and converts them for use with squidGuard/ufdbGuard" \ 32 | "Developed by James White" \ 33 | "-------------------------------------------------------------------------" \ 34 | "" \ 35 | "USAGE:" \ 36 | "${SCRIPT_NAME} [squidGuard/ufdbGuard] [autoconfirm]" \ 37 | "Note: The autoconfirm parameter is for running the script without user prompts" \ 38 | "" 39 | 40 | } 41 | 42 | # If no parameters are specified, show help guide 43 | if [ $# -eq 0 ] ; then 44 | usage 45 | exit 0 46 | fi 47 | 48 | show_message() { 49 | 50 | printf "\n%s\n" "INFO: $1" 51 | } 52 | 53 | report_issue() { 54 | # If anything fails with detection, prompt to submit a bug report 55 | GITHUB_NEW_ISSUE_URL="${GITHUB_REPO}/issues/new/?title=$1" 56 | 57 | printf "%s\n" "ERROR: $1" \ 58 | "Please report a bug via this URL:" \ 59 | "${GITHUB_NEW_ISSUE_URL}" \ 60 | " " \ 61 | "Providing additional information in your report such as your OS and setup will help" 62 | exit 1 63 | } 64 | 65 | # Pattern and URL files 66 | SED_PATTERN_FILE="${WORKING_DIR}/patterns.sed" 67 | URL_LIST_FILE="${WORKING_DIR}/urls.txt" 68 | 69 | if [ ! -e "${SED_PATTERN_FILE}" ] || [ ! 
-e "${URL_LIST_FILE}" ] ; then 70 | echo "One or more helper files are missing" 71 | exit 1 72 | fi 73 | 74 | # Find the path to squid binary on target system 75 | if [ "${OS}" = "FreeBSD" ] ; then 76 | SQUID_BIN=$(type squid | awk '{ print $3 }') 77 | else 78 | SQUID_BIN=$(command -v squid squid2 squid3) 79 | fi 80 | 81 | if [ -z "${SQUID_BIN}" ] ; then 82 | report_issue "Squid was not detected in PATH" 83 | exit 1 84 | fi 85 | 86 | get_squid_build_flag() { 87 | # $1: Squid build flag value 88 | ${SQUID_BIN} -v | tr "'" "\n" | grep -- "$1" | tail -n1 | cut -f2 -d '=' 89 | } 90 | 91 | get_squid_conf_value() { 92 | # $1: Squid config value 93 | # $2: Squid config filename 94 | grep -i "$1" "$2" | awk '{ print $2 }' | cut -f2 -d ':' 95 | 96 | } 97 | 98 | UFDBGUARD_SYSCONF_FILE="/etc/sysconfig/ufdbguard" 99 | 100 | # If ufdbGuard has this file present use that for config values 101 | if [ -e "${UFDBGUARD_SYSCONF_FILE}" ] ; then 102 | UFDBGUARD_SYSCONF=1 103 | else 104 | UFDBGUARD_SYSCONF=0 105 | fi 106 | 107 | get_ufdb_sysconf_value() { 108 | grep "^$1" ${UFDBGUARD_SYSCONF_FILE} | cut -f2 -d '=' | tr -d '"' 109 | } 110 | 111 | get_ufdb_conf_value() { 112 | # $1: ufdbGuard config value 113 | # $2: ufdbGuard filename path 114 | grep -i "$1" "$2" | awk '{ print $2 }' | sed 's/\"//g' 115 | } 116 | 117 | show_message "Scanning your setup please wait..." 
118 | 119 | # Squid configuration values, we can mostly use ./configure parameters 120 | SQUID_USER=$(get_squid_build_flag "--with-default-user") 121 | SQUID_CONF_DIR=$(get_squid_build_flag "--sysconfdir") 122 | SQUID_CONF_FILE="${SQUID_CONF_DIR}/squid.conf" 123 | SQUID_LOG_FILE=$(get_squid_conf_value "access_log" "${SQUID_CONF_FILE}") 124 | 125 | # If any of these are blank, better stop what were doing, because the script will fail 126 | # Log file is not critical as its checked differently later 127 | if [ -z "${SQUID_USER}" ] || 128 | [ -z "${SQUID_CONF_FILE}" ] || 129 | [ -z "${SQUID_CONF_DIR}" ] ; then 130 | report_issue "Squid configuration could not be properly detected" 131 | exit 1 132 | fi 133 | 134 | # Depending on filter type passed to the script, set the values accordingly 135 | case "$1" in 136 | [sS][qQ][uU][iI][dD][gG][uU][aA][rR][dD]) 137 | 138 | FILTER_TYPE="squidGuard" 139 | if [ "${OS}" = "FreeBSD" ] ; then 140 | SQUIDGUARD_BIN=$(type squidGuard | awk '{ print $3 }') 141 | else 142 | SQUIDGUARD_BIN=$(command -v squidguard squidGuard) 143 | fi 144 | FILTER_CONF_FILE="${SQUID_CONF_DIR}/${FILTER_TYPE}.conf" 145 | FILTER_DB_DIR=$(get_squid_conf_value "dbhome" "${FILTER_CONF_FILE}") 146 | FILTER_LOG_DIR=$(get_squid_conf_value "logdir" "${FILTER_CONF_FILE}") 147 | 148 | ;; 149 | 150 | [uU][fF][dD][bB][gG][uU][aA][rR][dD]) 151 | 152 | FILTER_TYPE="ufdbGuard" 153 | if [ "${OS}" = "FreeBSD" ] ; then 154 | UFDBGUARD_BIN=$(type ufdbgclient | awk '{ print $3 }') 155 | else 156 | UFDBGUARD_BIN=$(command -v ufdbgclient) 157 | fi 158 | FILTER_CONF_FILE=$(find / -iname ${FILTER_TYPE}.conf 2>&1 | grep -v "Permission denied") 159 | FILTER_DB_DIR=$(get_ufdb_conf_value "dbhome" "${FILTER_CONF_FILE}") 160 | FILTER_LOG_DIR=$(get_ufdb_conf_value "logdir" "${FILTER_CONF_FILE}") 161 | 162 | # if sysconfig file exists use this to pull values instead 163 | if [ "${UFDBGUARD_SYSCONF}" -eq 1 ] ; then 164 | FILTER_DB_DIR=$(get_ufdb_sysconf_value "BLACKLIST_DIR" 
"${UFDBGUARD_SYSCONF}") 165 | UFDBGUARD_USER=$(get_ufdb_sysconf_value "RUNAS" "${UFDBGUARD_SYSCONF}") 166 | SQUID_USER=${UFDBGUARD_USER} 167 | fi 168 | 169 | ;; 170 | 171 | *) 172 | echo "$1 is not a valid filter value" 173 | exit 1 174 | ;; 175 | 176 | esac 177 | 178 | FILTER_ADBLOCK_DIR="${FILTER_DB_DIR}/adblock" 179 | 180 | # Again if any these are blank the script will fail 181 | # FILTER_ADBLOCK_DIR would never be blank so it doesn't get checked 182 | if [ -z "${FILTER_CONF_FILE}" ] || 183 | [ -z "${FILTER_DB_DIR}" ] ; then 184 | report_issue "Unable to detect ${FILTER_TYPE} setup" 185 | exit 1 186 | fi 187 | 188 | if [ "${FILTER_TYPE}" == "squidGuard" ] ; then 189 | FILTER_BIN=${SQUIDGUARD_BIN} 190 | fi 191 | 192 | if [ "${FILTER_TYPE}" == "ufdbGuard" ] ; then 193 | FILTER_BIN=${UFDBGUARD_BIN} 194 | fi 195 | 196 | if [ -z "${FILTER_BIN}" ] ; then 197 | report_issue "Cannot detect ${FILTER_TYPE} in PATH" 198 | exit 1 199 | fi 200 | 201 | 202 | show_message "The following setup has been detected" 203 | 204 | printf "%s\n" "Squid Bin Path: ${SQUID_BIN}" \ 205 | "Squid User: ${SQUID_USER}" \ 206 | "Squid Config Folder: ${SQUID_CONF_DIR}" \ 207 | "Squid Config File: ${SQUID_CONF_FILE}" \ 208 | "${FILTER_TYPE} Bin Path ${FILTER_BIN}" \ 209 | "${FILTER_TYPE} Config File: ${FILTER_CONF_FILE}" \ 210 | "${FILTER_TYPE} Database Folder: ${FILTER_DB_DIR}" \ 211 | "${FILTER_TYPE} Folder ${FILTER_ADBLOCK_DIR}" 212 | 213 | if [ ! "$2" == "autoconfirm" ] ; then 214 | 215 | read -r -p "Does everything look OK? [Y/N] " SQUID_CONF_OK 216 | 217 | case ${SQUID_CONF_OK} in 218 | 219 | [yY][eE][sS]|[yY]) 220 | echo "Great, will continue executing script" 221 | ;; 222 | 223 | *) 224 | echo "Exiting..." 
225 | exit 1 226 | ;; 227 | esac 228 | 229 | fi 230 | 231 | # Removes the header and modifies the format for use with this script 232 | strip_file_header() { 233 | grep -v '^$\|^#' "$1" | sed 's/$/ /' | tr -d '\n' 234 | } 235 | 236 | ADBLOCK_PATTERNS=$(strip_file_header "${SED_PATTERN_FILE}") 237 | URL_LIST=$(strip_file_header "${URL_LIST_FILE}") 238 | 239 | # EasyList Configuration 240 | EASYLIST_TMP_DIR="/tmp/adblock" 241 | EASYLIST_URL_LIST=(${URL_LIST}) # URL list as array to loop 242 | 243 | mkdir -p "${FILTER_ADBLOCK_DIR}" 244 | mkdir -p ${EASYLIST_TMP_DIR} 245 | 246 | show_message "Preparing expressions lists" 247 | 248 | for URL in "${EASYLIST_URL_LIST[@]}" 249 | do 250 | echo "Downloading list from: ${URL}" 251 | wget -q --no-check-certificate -P ${EASYLIST_TMP_DIR} "${URL}" 252 | 253 | LIST_FILE_PATH="${EASYLIST_TMP_DIR}/$(basename "${URL}")" 254 | LIST_FILE_NAME="$(basename "${LIST_FILE_PATH}" .txt)" 255 | 256 | grep -q -E '^\[Adblock.*\]$' "${LIST_FILE_PATH}" 257 | 258 | if [ ! $? 
-eq 0 ] ; then 259 | echo "A non-Adblock list was detected" 260 | exit 1 261 | fi 262 | 263 | echo "Converting ${LIST_FILE_NAME} to an expressions list for ${FILTER_TYPE}" 264 | sed -e "${ADBLOCK_PATTERNS}" "${LIST_FILE_PATH}" > "${FILTER_ADBLOCK_DIR}/${LIST_FILE_NAME}" 265 | 266 | done 267 | 268 | show_message "Rebuilding Database" 269 | 270 | if [ "${FILTER_TYPE}" == "squidGuard" ] ; then 271 | ${SQUIDGUARD_BIN} -b -d -C all 272 | fi 273 | 274 | if [ "${FILTER_TYPE}" == "ufdbGuard" ] ; then 275 | if [ "${UFDBGUARD_SYSCONF}" -eq 1 ] ; then 276 | systemctl restart ufdb 277 | else 278 | /etc/init.d/ufdb restart 279 | fi 280 | fi 281 | 282 | # Make sure permissions are good, to prevent problems with launching any processes 283 | chmod 644 "${FILTER_CONF_FILE}" 284 | chmod -R 640 "${FILTER_DB_DIR}" 285 | chmod -R 640 "${FILTER_LOG_DIR}" 286 | [ -n "${FILTER_LOG_FILE}" ] && chmod -R 644 "$(dirname "${FILTER_LOG_FILE}")" # skipped when FILTER_LOG_FILE is unset, otherwise dirname resolves to "." 287 | chown "${SQUID_USER}":"${SQUID_USER}" "${FILTER_CONF_FILE}" 288 | chown -R "${SQUID_USER}":"${SQUID_USER}" "${FILTER_DB_DIR}" 289 | find "${FILTER_DB_DIR}" -type d -exec chmod 755 \{\} \; > /dev/null 2>&1 290 | 291 | # access_log may not be defined or set to none, so we need to check before using chmod 292 | if [ -n "${SQUID_LOG_FILE}" ] && [ "${SQUID_LOG_FILE}" != "none" ] ; then 293 | chmod 755 "$(dirname "${SQUID_LOG_FILE}")" 294 | fi 295 | 296 | show_message "Reloading squid" 297 | ${SQUID_BIN} -k reconfigure 298 | 299 | # Remove adblock folder in /tmp 300 | rm -rf ${EASYLIST_TMP_DIR} > /dev/null 2>&1 301 | 302 | show_message "Adblock expressions lists are now installed!"
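The `access_log` check near the end of the script is meant to skip the `chmod` when the value is empty or set to `none`. A minimal sketch of that check as a standalone function is shown here. The `log_dir_needs_chmod` name is hypothetical and the log path is only a sample value; the loop prints what would be done instead of changing any permissions.

```shell
#!/usr/bin/env bash
# log_dir_needs_chmod is a hypothetical helper, not part of get-easylist.sh.
# $1: value detected for access_log (may be empty or "none")
# Succeeds only when the value is a usable path.
log_dir_needs_chmod() {
    [ -n "$1" ] && [ "$1" != "none" ]
}

# Dry run over three sample values: a path, "none" and an empty string
for value in "/var/log/squid/access.log" "none" "" ; do
    if log_dir_needs_chmod "${value}" ; then
        echo "chmod 755 $(dirname "${value}")"
    else
        echo "skip '${value}'"
    fi
done
```

Both conditions have to hold at the same time, which is why they are joined with `&&`. Joining them with `||` would let an empty value or `none` through.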
303 | -------------------------------------------------------------------------------- /patterns.sed: -------------------------------------------------------------------------------- 1 | ############################################################################ 2 | ## patterns.sed 3 | ## Description: sed expression rules that are used in the conversion 4 | ## Last Modified: 16/06/2015 5 | ## 6 | ## Notes: 7 | ## These rules have been adapted from older working examples with tweaks 8 | ## Tests are run regularly to confirm the conversion process is accurate 9 | ## Updates to this file may be required if upstream changes to lists occur 10 | ## 11 | 12 | s/\r//g; 13 | /Adblock/d; 14 | /.*\$.*/d; 15 | /\n/d; 16 | /.*\#.*/d; 17 | /@@.*/d; 18 | /^!.*/d; 19 | /^\[.*\]$/d; 20 | s#http://#||#g; 21 | s/\/\//||/g; 22 | s,[+.?&/|],\\&,g; 23 | s/\[/\\\[/g; 24 | s/\]/\\\]/g; 25 | s#*#.*#g; 26 | s,\$.*$,,g; 27 | s/\\|\\|\(.*\)\^\(.*\)/(^|\\\.)\1\\\/\2/g; 28 | s/\\|\\|\(.*\)/(^|\\\.)\1/g; 29 | /^\.\*$/d; 30 | /^$/d; 31 | -------------------------------------------------------------------------------- /urls.txt: -------------------------------------------------------------------------------- 1 | ######################################################################### 2 | ## urls.txt 3 | ## Description: URL's of Adblock lists to download for conversion 4 | ## 5 | ## Notes: 6 | ## Each list URL should be entered on a new line 7 | ## By default EasyList and EasyPrivacy are included 8 | ## Additional list files: https://adblockplus.org/subscriptions 9 | ## 10 | 11 | https://easylist-downloads.adblockplus.org/easylist.txt 12 | https://easylist-downloads.adblockplus.org/easyprivacy.txt 13 | --------------------------------------------------------------------------------
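For anyone adding their own lists to `urls.txt`, the sketch below shows roughly how the file is turned into a list of URLs: comment and blank lines are removed and the remaining lines are joined with spaces. The `strip_file_header` body here is a slightly more portable variant of the function in `get-easylist.sh` (two `-e` patterns instead of `\|`), and the `example.com` URLs and temporary file are placeholders.

```shell
#!/usr/bin/env bash
# Variant of strip_file_header from get-easylist.sh, using two -e patterns.
# Drops blank lines and lines starting with #, then joins the rest with spaces.
strip_file_header() {
    grep -v -e '^$' -e '^#' "$1" | sed 's/$/ /' | tr -d '\n'
}

# Build a small sample file in the same layout as urls.txt (placeholder URLs)
sample_file=$(mktemp)
printf '%s\n' \
    '## urls.txt' \
    '' \
    'https://example.com/one.txt' \
    'https://example.com/two.txt' > "${sample_file}"

# Split the space-separated result into an array, one URL per element
read -r -a url_list <<< "$(strip_file_header "${sample_file}")"
rm -f "${sample_file}"

printf '%s\n' "${url_list[@]}"
```

Each array element is then a single URL, ready to be passed to `wget` in a loop.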