├── .gitignore
├── Dockerfile
├── LICENSE
├── Makefile
├── README.md
├── duckling-proxy.go
├── go.mod
└── go.sum
/.gitignore:
--------------------------------------------------------------------------------
1 | .hgignore
2 | *.crt
3 | *.key
4 | .hg/
5 | .idea/
6 |
--------------------------------------------------------------------------------
/Dockerfile:
--------------------------------------------------------------------------------
1 | FROM alpine:latest
2 |
3 | COPY build/linux/duckling /
4 |
5 | # These two files must exist.
6 | COPY duckling.crt /duckling.crt
7 | COPY duckling.key /duckling.key
8 |
9 | CMD ["/duckling", "--address", "0.0.0.0", "--serverCert", "/duckling.crt", "--serverKey", "/duckling.key"]
10 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2020 LukeEmmet
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/Makefile:
--------------------------------------------------------------------------------
1 | .PHONY: clean run build.local build.linux build.docker deploy
2 |
3 | BINARY ?= duckling
4 | SOURCES = $(shell find . -name '*.go')
5 | VERSION ?= $(shell git describe --tags --always)
6 | IMAGE ?= deploy.glv.one/lukee/$(BINARY)
7 | DOCKERFILE ?= Dockerfile
8 | BUILD_FLAGS ?= -v
9 | LDFLAGS ?= -w -s
10 |
11 | default: run
12 |
13 | clean:
14 | rm -rf build
15 |
16 | run: build.local
17 | ./build/$(BINARY) --serverCert duckling.crt --serverKey duckling.key
18 |
19 | build.local: build/$(BINARY)
20 | build.linux: build/linux/$(BINARY)
21 |
22 | build/$(BINARY): $(SOURCES)
23 | CGO_ENABLED=0 go build -o build/$(BINARY) $(BUILD_FLAGS) -ldflags "$(LDFLAGS)" .
24 |
25 | build/linux/$(BINARY): $(SOURCES)
26 | GOOS=linux GOARCH=amd64 CGO_ENABLED=0 go build $(BUILD_FLAGS) -o build/linux/$(BINARY) -ldflags "$(LDFLAGS)" .
27 |
28 | build.docker: build.linux
29 | docker build --rm -t "$(IMAGE):$(VERSION)" -f $(DOCKERFILE) .
30 |
31 | deploy: build.docker
32 | docker push "$(IMAGE):$(VERSION)"
33 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Duckling Proxy 🦆
2 | Duckling proxy is a Gemini proxy to access the Small Web. Connecting to it with your Gemini client means you can access many web pages directly with your favourite client.
3 |
4 | Cross platform, written in Go.
5 |
6 | ## What is the Small Web?
7 |
8 | The Small Web consists of those pages on the WWW that typically:
9 |
10 | * are simple and document/content centric, using only simple headings, bullets, links and tables
11 | * are accessible to different web clients, and do not need a monster browser such as Chrome to view them
12 | * do not require client side JavaScript
13 | * do not require tracking, cookies, forms or authentication to be viewed
14 | * can be accessed using standard HTTP GET requests
15 | * could be rendered as text/gemini without significant loss of information
16 | * make up a significant portion of the web, apart from the huge commercial mega sites
17 |
18 | With Duckling, you can now browse the Small Web using your favourite Gemini client, and just open a standard web browser only when you need to!
19 |
20 | ## What is the Duckling proxy?
21 |
22 | The Duckling proxy 🦆 is a scheme-specific filtering proxy for Gemini clients to access the web. It behaves as a normal Gemini server, except it retrieves its content from the web. You can set options when it starts to tailor how web pages are transformed to gemtext.
23 |
24 | It is scheme-specific, i.e. it is designed to handle http:// and https:// requests only. [Agena](https://tildegit.org/solderpunk/agena) is another example of a scheme-specific proxy, supporting gopher://.
25 |
26 | Web pages are translated to text/gemini. Other web resources are returned directly.
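
The split between translated pages and pass-through resources comes down to a small decision on the HTTP `Content-Type` header. The following is a minimal sketch, not the proxy's own code: `shouldConvert` is a hypothetical helper, and it parses the media type strictly with `mime.ParseMediaType`, whereas Duckling itself uses a simpler substring check on the header.

```go
package main

import (
	"fmt"
	"mime"
)

// shouldConvert reports whether a response with the given Content-Type
// header would be translated to text/gemini. The name and signature are
// illustrative assumptions; unfiltered mirrors the --unfiltered flag.
func shouldConvert(contentType string, unfiltered bool) bool {
	if unfiltered {
		// The user asked for raw content, so nothing is translated.
		return false
	}
	// ParseMediaType strips parameters such as "; charset=utf-8" and
	// lower-cases the media type.
	mediaType, _, err := mime.ParseMediaType(contentType)
	if err != nil {
		// Missing or malformed header: pass the resource through as-is.
		return false
	}
	return mediaType == "text/html"
}

func main() {
	fmt.Println(shouldConvert("text/html; charset=utf-8", false)) // true
	fmt.Println(shouldConvert("image/png", false))                // false
	fmt.Println(shouldConvert("text/html", true))                 // false
}
```

Anything for which the helper returns false would be returned to the Gemini client with its original content type.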
27 |
28 | The primary intended use case for this proxy is as a personal proxy to make the web accessible to your favourite Gemini client.
29 |
30 | ## Why is it called "Duckling"?
31 |
32 | Small Web Daemon -> Small WebD -> Small webbed -> Duckling.
33 |
34 | ## Usage
35 |
36 | ```
37 | Usage:
38 |
39 | duckling-proxy [flags]
40 |
41 | -a, --address string Bind to address
42 | (default "127.0.0.1")
43 | -m, --citationMarkers Use footnote style citation markers
44 | -s, --citationStart int Start citations from this index (default 1)
45 | -e, --emitImagesAsLinks Emit links to included images
46 | -l, --linkEmitFrequency int Emit gathered links through the document after this number of paragraphs (default 2)
47 | -T, --maxConnectTime int Max connect time (s)
48 | (default 5)
49 | -t, --maxDownloadTime int Max download time (s)
50 | (default 10)
51 | -n, --numberedLinks Number the links
52 | -p, --port int Server port (default 1965)
53 | -r, --prettyTables Pretty tables - works with most simple tables
54 | -c, --serverCert string serverCert path.
55 | -k, --serverKey string serverKey path.
56 | --unfiltered Do not filter text/html to text/gemini
57 | -u, --userAgent string User agent for HTTP requests
58 | -v, --version Find out what version of Duckling Proxy you're running
59 |
60 | ```
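
The two timeout flags apply at different stages of the outgoing web request: `--maxConnectTime` bounds connection setup, while `--maxDownloadTime` bounds the whole request including reading the body. A rough sketch of how such limits can be wired into Go's `net/http` client follows; `newClient` is a hypothetical helper name, and the values 5 and 10 are simply the documented defaults.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"time"
)

// newClient builds an HTTP client with separate connect and overall limits.
// Both arguments are in seconds, like the command line flags.
func newClient(maxConnectSecs, maxDownloadSecs int) *http.Client {
	connectTimeout := time.Duration(maxConnectSecs) * time.Second
	transport := &http.Transport{
		// Limits how long the TCP connection may take to establish.
		DialContext: (&net.Dialer{Timeout: connectTimeout}).DialContext,
		// Limits the TLS handshake for https:// targets.
		TLSHandshakeTimeout: connectTimeout,
	}
	return &http.Client{
		// Covers the entire exchange, including reading the response body.
		Timeout:   time.Duration(maxDownloadSecs) * time.Second,
		Transport: transport,
	}
}

func main() {
	client := newClient(5, 10)
	fmt.Println(client.Timeout) // 10s
}
```

Because the client timeout covers the body read, a slow download is abandoned even when the connection itself was established quickly.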
61 |
62 | ## Remarks
63 |
64 | * serverCert - required - path to Gemini server TLS certificate
65 | * serverKey - required - path to Gemini server TLS private key
66 | * All other flags are optional and you can experiment with them
67 |
68 | You will need to configure your Gemini client to point to the server when there is a need to access any http:// or https:// requests.
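
To illustrate what a client does once it is pointed at the proxy: a Gemini request is just the absolute URL followed by CRLF, sent over TLS, and the reply starts with a `<STATUS> <META>` header line. The sketch below is an assumption-laden illustration, not part of Duckling: `requestLine`, `parseHeader` and `fetchHeader` are hypothetical names, and the proxy address passed to `fetchHeader` would be whatever you configured, for example `127.0.0.1:1965`.

```go
package main

import (
	"bufio"
	"crypto/tls"
	"fmt"
	"strconv"
	"strings"
)

// requestLine builds a Gemini request: the absolute URL followed by CRLF.
// Sending an http:// or https:// URL is what asks the server to proxy.
func requestLine(url string) string {
	return url + "\r\n"
}

// parseHeader splits a Gemini response header of the form "<STATUS> <META>".
func parseHeader(line string) (status int, meta string, err error) {
	line = strings.TrimRight(line, "\r\n")
	if len(line) < 2 {
		return 0, "", fmt.Errorf("header too short: %q", line)
	}
	status, err = strconv.Atoi(line[:2])
	if err != nil || status < 10 {
		return 0, "", fmt.Errorf("bad status in header: %q", line)
	}
	return status, strings.TrimPrefix(line[2:], " "), nil
}

// fetchHeader sends one proxied request and returns the parsed header.
// proxyAddr is an assumption, e.g. "127.0.0.1:1965" for a local proxy.
func fetchHeader(proxyAddr, url string) (int, string, error) {
	// Assumption: the proxy uses a self-signed certificate, so verification
	// is skipped here; a real client should pin the certificate instead.
	conn, err := tls.Dial("tcp", proxyAddr, &tls.Config{InsecureSkipVerify: true})
	if err != nil {
		return 0, "", err
	}
	defer conn.Close()
	if _, err := conn.Write([]byte(requestLine(url))); err != nil {
		return 0, "", err
	}
	line, err := bufio.NewReader(conn).ReadString('\n')
	if err != nil {
		return 0, "", err
	}
	return parseHeader(line)
}

func main() {
	// Offline demonstration of the wire format only; no proxy is contacted.
	fmt.Printf("%q\n", requestLine("https://example.com/"))
	status, meta, err := parseHeader("20 text/gemini\r\n")
	fmt.Println(status, meta, err) // 20 text/gemini <nil>
}
```

With a proxy running locally, calling `fetchHeader("127.0.0.1:1965", "https://example.com/")` would perform the actual round trip; clients with per-scheme proxy support do the equivalent internally.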
69 |
70 | ## Supported clients
71 |
72 | The following clients support per-scheme proxies and can be configured to use Duckling proxy.
73 |
74 | * [Amfora](https://github.com/makeworld-the-better-one/amfora) - supports per scheme proxies since v1.5.0
75 | * [AV-98](https://tildegit.org/solderpunk/AV-98) - Merge [pull request #24](https://tildegit.org/solderpunk/AV-98/pulls/24) then use `set http_proxy machine:port` to access.
76 | * [diohsc](https://repo.or.cz/diohsc.git) - edit diohscrc config file
77 | * [gemget](https://github.com/makeworld-the-better-one/gemget) - use -p option
78 | * [GemiNaut](https://github.com/LukeEmmet/GemiNaut) - since 0.8.8, which also has its own native html to gemini conversion - update in settings
79 | * [Lagrange](https://git.skyjake.fi/skyjake/lagrange) - set proxy in preferences (use 127.0.0.1:port, not localhost:port for localhost)
80 | * [Telescope](https://telescope.omarpolo.com/) - in the config file, add ```proxy "https" via "gemini://127.0.0.1:1965"```, and similarly for http
81 |
82 | Let me know if your client supports per scheme proxies and I'll add it to the list.
83 |
84 | ## Installation
85 |
86 | If you have Go installed, you can install the latest commit with:
87 |
88 | ```
89 | go env -w GO111MODULE=on
90 | go get github.com/LukeEmmet/duckling-proxy@master
91 | ```
92 |
93 | ## Feedback
94 |
95 | Send your thoughts and feedback to:
96 |
97 | ```
98 | luke [at] marmaladefoo [dot] com
99 | ```
100 |
101 | ## History
102 |
103 | ### 0.2.1
104 |
105 | First publicly versioned build.
106 |
107 | * fix bug whereby Duckling would crash on download timeout.
108 | * removed patch for AV-98 and updated readme now that AV-98 officially supports http proxies
109 | * add -v version flag
110 | * print version in footer
111 |
112 |
113 | ### 0.1
114 |
115 | First release (unversioned), 30-Aug-2020
116 |
--------------------------------------------------------------------------------
/duckling-proxy.go:
--------------------------------------------------------------------------------
1 | package main
2 |
3 | import (
4 | "github.com/LukeEmmet/html2gemini"
5 | gemini "github.com/makeworld-the-better-one/go-gemini"
6 | flag "github.com/spf13/pflag"
7 | "strconv"
8 |
9 | "fmt"
10 | "io"
11 | "io/ioutil"
12 | "log"
13 | "net"
14 | "net/http"
15 | "os"
16 | "strings"
17 | "time"
18 | )
19 |
20 |
21 | var version = "0.2.1"
22 |
23 | type WebPipeHandler struct {
24 | }
25 |
26 | var (
27 | citationStart = flag.IntP("citationStart", "s", 1, "Start citations from this index")
28 | citationMarkers = flag.BoolP("citationMarkers", "m", false, "Use footnote style citation markers")
29 | numberedLinks = flag.BoolP("numberedLinks", "n", false, "Number the links")
30 | prettyTables = flag.BoolP("prettyTables", "r", false, "Pretty tables - works with most simple tables")
31 | emitImagesAsLinks = flag.BoolP("emitImagesAsLinks", "e", false, "Emit links to included images")
32 | linkEmitFrequency = flag.IntP("linkEmitFrequency", "l", 2, "Emit gathered links through the document after this number of paragraphs")
33 | serverCert = flag.StringP("serverCert", "c", "", "serverCert path. ")
34 | serverKey = flag.StringP("serverKey", "k", "", "serverKey path. ")
35 | userAgent = flag.StringP("userAgent", "u", "", "User agent for HTTP requests\n")
36 | maxDownloadTime = flag.IntP("maxDownloadTime", "t", 10, "Max download time (s)\n")
37 | maxConnectTime = flag.IntP("maxConnectTime", "T", 5, "Max connect time (s)\n")
38 | port = flag.IntP("port", "p", 1965, "Server port")
39 | address = flag.StringP("address", "a", "127.0.0.1", "Bind to address\n")
40 | unfiltered = flag.BoolP("unfiltered", "", false, "Do not filter text/html to text/gemini")
41 | verFlag = flag.BoolP("version", "v", false, "Find out what version of Duckling Proxy you're running")
42 | )
43 |
44 | func fatal(format string, a ...interface{}) {
45 | urlError(format, a...)
46 | os.Exit(1)
47 | }
48 |
49 | func urlError(format string, a ...interface{}) {
50 | format = "Error: " + strings.TrimRight(format, "\n") + "\n"
51 | fmt.Fprintf(os.Stderr, format, a...)
52 | }
53 |
54 | func info(format string, a ...interface{}) {
55 | format = "Info: " + strings.TrimRight(format, "\n") + "\n"
56 | fmt.Fprintf(os.Stderr, format, a...)
57 | }
58 |
59 | func check(e error) {
60 | if e != nil {
61 | //panic unwinds the stack and never returns, so no os.Exit is needed
62 | panic(e)
63 | }
64 | }
65 |
66 | func htmlToGmi(inputHtml string) (string, error) {
67 |
68 | //convert html to gmi
69 | options := html2gemini.NewOptions()
70 | options.PrettyTables = *prettyTables
71 | options.CitationStart = *citationStart
72 | options.LinkEmitFrequency = *linkEmitFrequency
73 | options.CitationMarkers = *citationMarkers
74 | options.NumberedLinks = *numberedLinks
75 | options.EmitImagesAsLinks = *emitImagesAsLinks
76 |
77 | //dont use an extra line to separate header from body, but
78 | //do separate each row visually
79 | options.PrettyTablesOptions.HeaderLine = false
80 | options.PrettyTablesOptions.RowLine = true
81 |
82 | //pretty tables option is somewhat experimental
83 | //and the column positions not always correct
84 | //so use invisible borders of spaces for now
85 | options.PrettyTablesOptions.CenterSeparator = " "
86 | options.PrettyTablesOptions.ColumnSeparator = " "
87 | options.PrettyTablesOptions.RowSeparator = " "
88 |
89 | ctx := html2gemini.NewTraverseContext(*options)
90 |
91 | return html2gemini.FromString(inputHtml, *ctx)
92 |
93 | }
94 |
95 | func (h WebPipeHandler) Handle(r gemini.Request) *gemini.Response {
96 |
97 | url := r.URL.String()
98 | if r.URL.Scheme != "http" && r.URL.Scheme != "https" {
99 | //any other schemes are not implemented by this proxy
100 | return &gemini.Response{53, "Scheme not supported: " + r.URL.Scheme, nil, nil}
101 | }
102 |
103 | info("Retrieve: %s", r.URL.String())
104 |
105 | //see https://medium.com/@nate510/don-t-use-go-s-default-http-client-4804cb19f779
106 | //also https://gist.github.com/ijt/950790/fca88967337b9371bb6f7155f3304b3ccbf3946f
107 |
108 | connectTimeout := time.Second * time.Duration(*maxConnectTime)
109 | clientTimeout := time.Second * time.Duration(*maxDownloadTime)
110 |
111 | //create custom transport with timeout
112 | var netTransport = &http.Transport{
113 | DialContext: (&net.Dialer{
114 | Timeout: connectTimeout,
115 | }).DialContext,
116 | TLSHandshakeTimeout: connectTimeout,
117 | }
118 |
119 | //create custom client with timeout
120 | var netClient = &http.Client{
121 | Timeout: clientTimeout,
122 | Transport: netTransport,
123 | }
124 |
125 | //fmt.Println("making request")
126 | req, err := http.NewRequest("GET", url, nil)
127 | if err != nil {
128 | return &gemini.Response{43, "Could not build HTTP request for URL", nil, nil}
129 | }
130 |
131 | //set user agent if specified
132 | if *userAgent != "" {
133 | req.Header.Add("User-Agent", *userAgent)
134 | }
135 |
136 | response, err := netClient.Do(req)
137 | if err != nil {
138 | return &gemini.Response{43, "Remote host did not respond with valid HTTP", nil, nil}
139 | }
140 |
141 | defer response.Body.Close()
142 |
143 | //final response (may have redirected)
144 | if url != response.Request.URL.String() {
145 | //notify of target location on stderr
146 | //see https://stackoverflow.com/questions/16784419/in-golang-how-to-determine-the-final-url-after-a-series-of-redirects
147 | info("Redirected to: %s", response.Request.URL.String())
148 |
149 | //tell the client to get it from a different location otherwise the client
150 | //wont know the baseline for link refs
151 | return &gemini.Response{30, response.Request.URL.String(), nil, nil}
152 | }
153 |
154 | contents, err := ioutil.ReadAll(response.Body)
155 | if err != nil {
156 | abandonMsg := fmt.Sprintf("Download abandoned after %d seconds: %s", *maxDownloadTime, response.Request.URL.String())
157 | info("%s", abandonMsg)
158 | return &gemini.Response{43, abandonMsg, nil, nil}
159 | }
160 |
161 | if response.StatusCode == 200 {
162 | contentType := response.Header.Get("Content-Type")
163 |
164 | info("Content-Type: %s", contentType)
165 |
166 | var body io.ReadCloser
167 | if !*unfiltered && strings.Contains(contentType, "text/html") {
168 |
169 | info("Converting to text/gemini: %s", r.URL.String())
170 |
171 | //translate html to gmi
172 | gmi, err := htmlToGmi(string(contents))
173 |
174 | if err != nil {
175 | return &gemini.Response{42, "HTML to GMI conversion failure", nil, nil}
176 | }
177 |
178 | //add a footer to communicate that the content was filtered and not original
179 | //also the link provides a clickable link that the user can activate to launch a browser, depending on their client
180 | //behaviour (e.g. Ctrl-Click or similar)
181 | footer := ""
182 | footer += "\n\n──────────────────── 🦆 ──────────────────── 🦆 ──────────────────── \n\n"
183 | footer += "Web page filtered and simplified by Duckling Proxy v" + version + ". To view the original content, open the page in your system web browser.\n"
184 | footer += "=> " + r.URL.String() + " Source page \n"
185 |
186 | body = ioutil.NopCloser(strings.NewReader(gmi + footer))
187 |
188 | contentType = "text/gemini"
189 |
190 | } else {
191 | //let everything else through with the same content type
192 | body = ioutil.NopCloser(strings.NewReader(string(contents)))
193 | }
194 |
195 | return &gemini.Response{20, contentType, body, nil}
196 |
197 | } else if response.StatusCode == 404 {
198 | return &gemini.Response{51, "Not found", nil, nil}
199 | } else {
200 | return &gemini.Response{50, "Failure: HTTP status: " + response.Status, nil, nil}
201 | }
202 |
203 | }
204 |
205 | func main() {
206 |
207 | flag.Parse()
208 |
209 | if *verFlag {
210 | fmt.Println("Duckling Proxy v" + version)
211 | return
212 | }
213 |
214 | handler := WebPipeHandler{}
215 |
216 | info("Starting Duckling Proxy v%s on %s port: %d", version, *address, *port)
217 |
218 | err := gemini.ListenAndServe(net.JoinHostPort(*address, strconv.Itoa(*port)), *serverCert, *serverKey, handler)
219 |
220 |
221 | if err != nil {
222 | log.Fatal(err)
223 | }
224 | }
225 |
--------------------------------------------------------------------------------
/go.mod:
--------------------------------------------------------------------------------
1 | module github.com/LukeEmmet/duckling-proxy
2 |
3 | go 1.14
4 |
5 | require (
6 | github.com/LukeEmmet/html2gemini v0.0.0-20200831220433-65476d2a84ff
7 | github.com/makeworld-the-better-one/go-gemini v0.8.5
8 | github.com/olekukonko/tablewriter v0.0.4 // indirect
9 | github.com/spf13/pflag v1.0.5
10 | github.com/ssor/bom v0.0.0-20170718123548-6386211fdfcf // indirect
11 | golang.org/x/net v0.0.0-20200822124328-c89045814202 // indirect
12 | )
13 |
--------------------------------------------------------------------------------
/go.sum:
--------------------------------------------------------------------------------
1 | github.com/LukeEmmet/html2gemini v0.0.0-20200831220433-65476d2a84ff h1:gtWkiW3wWq3QolGyMhE1tEjOjQE7yDWRUe8DZk2pnzg=
2 | github.com/LukeEmmet/html2gemini v0.0.0-20200831220433-65476d2a84ff/go.mod h1:yj8BdB/daZB09i9DoL3y7oOuDm7KLOSIxzIwESjEk5k=
3 | github.com/google/go-cmp v0.3.1 h1:Xye71clBPdm5HgqGwUkwhbynsUJZhDbS20FvLhQ2izg=
4 | github.com/google/go-cmp v0.3.1/go.mod h1:8QqcDgzrUqlUb/G2PQTWiueGozuR1884gddMywk6iLU=
5 | github.com/makeworld-the-better-one/go-gemini v0.8.5 h1:+sGjfHvbhuhLFpMyVeujo7kKpts2NcGJGzJSMNWoobc=
6 | github.com/makeworld-the-better-one/go-gemini v0.8.5/go.mod h1:P7/FbZ+IEIbA/d+A0Y3w2GNgD8SA2AcNv7aDGJbaWG4=
7 | github.com/mattn/go-runewidth v0.0.7 h1:Ei8KR0497xHyKJPAv59M1dkC+rOZCMBJ+t3fZ+twI54=
8 | github.com/mattn/go-runewidth v0.0.7/go.mod h1:H031xJmbD/WCDINGzjvQ9THkh0rPKHF+m2gUSrubnMI=
9 | github.com/olekukonko/tablewriter v0.0.4 h1:vHD/YYe1Wolo78koG299f7V/VAS08c6IpCLn+Ejf/w8=
10 | github.com/olekukonko/tablewriter v0.0.4/go.mod h1:zq6QwlOf5SlnkVbMSr5EoBv3636FWnp+qbPhuoO21uA=
11 | github.com/spf13/pflag v1.0.5 h1:iy+VFUOCP1a+8yFto/drg2CJ5u0yRoB7fZw3DKv/JXA=
12 | github.com/spf13/pflag v1.0.5/go.mod h1:McXfInJRrz4CZXVZOBLb0bTZqETkiAhM9Iw0y3An2Bg=
13 | github.com/ssor/bom v0.0.0-20170718123548-6386211fdfcf h1:pvbZ0lM0XWPBqUKqFU8cmavspvIl9nulOYwdy6IFRRo=
14 | github.com/ssor/bom v0.0.0-20170718123548-6386211fdfcf/go.mod h1:RJID2RhlZKId02nZ62WenDCkgHFerpIOmW0iT7GKmXM=
15 | golang.org/x/crypto v0.0.0-20190308221718-c2843e01d9a2/go.mod h1:djNgcEr1/C05ACkg1iLfiJU5Ep61QUkGW8qpdssI0+w=
16 | golang.org/x/crypto v0.0.0-20200622213623-75b288015ac9/go.mod h1:LzIPMQfyMNhhGPhUkYOs5KpL4U8rLKemX1yGLhDgUto=
17 | golang.org/x/net v0.0.0-20190404232315-eb5bcb51f2a3/go.mod h1:t9HGtf8HONx5eT2rtn7q6eTqICYqUVnKs3thJo3Qplg=
18 | golang.org/x/net v0.0.0-20200822124328-c89045814202 h1:VvcQYSHwXgi7W+TpUR6A9g6Up98WAHf3f/ulnJ62IyA=
19 | golang.org/x/net v0.0.0-20200822124328-c89045814202/go.mod h1:/O7V0waA8r7cgGh81Ro3o1hOxt32SMVPicZroKQ2sZA=
20 | golang.org/x/sys v0.0.0-20190215142949-d0b11bdaac8a/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
21 | golang.org/x/sys v0.0.0-20190412213103-97732733099d/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
22 | golang.org/x/sys v0.0.0-20200323222414-85ca7c5b95cd/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
23 | golang.org/x/text v0.3.0/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=
24 |
--------------------------------------------------------------------------------