├── .github
│   ├── FUNDING.yml
│   ├── ISSUE_TEMPLATE.md
│   ├── PULL_REQUEST_TEMPLATE.md
│   └── workflows
│       └── ci.yml
├── .gitignore
├── CHANGELOG.md
├── LICENSE
├── README.md
├── deprecated.md
├── domains.yml
├── index.js
├── lib
│   ├── error.js
│   ├── index.js
│   ├── is_error_fatal.js
│   └── providers
│       ├── clck.ru.js
│       ├── flic.kr.js
│       ├── google.com.js
│       ├── index.js
│       └── vk.com.js
├── package.json
├── support
│   └── unshort.js
└── test
    ├── cache.js
    ├── default.js
    ├── expand.js
    ├── services.js
    └── services.yml
/.github/FUNDING.yml:
--------------------------------------------------------------------------------
1 | open_collective: puzrin
2 | patreon: puzrin
3 |
--------------------------------------------------------------------------------
/.github/ISSUE_TEMPLATE.md:
--------------------------------------------------------------------------------
1 | Before requesting that a new site be added to the defaults, please check:
2 |
3 | - It should be popular
4 | - It should not be restricted to a single domain (fb.me and others)
5 |
6 | Anything else can be added via the library API on the user's side.
7 |
--------------------------------------------------------------------------------
/.github/PULL_REQUEST_TEMPLATE.md:
--------------------------------------------------------------------------------
1 | Requirements for adding a new site to the defaults:
2 |
3 | - It should be popular
4 | - It should not be restricted to a single domain (fb.me and others)
5 | - A test for the new site should exist (`npm run test-all`)
6 |
7 | Anything else can be added via the library API on the user's side.
8 |
--------------------------------------------------------------------------------
/.github/workflows/ci.yml:
--------------------------------------------------------------------------------
1 | name: CI
2 |
3 | on:
4 |   push:
5 |   pull_request:
6 |   schedule:
7 |     - cron: '0 0 * * 3'
8 |
9 | jobs:
10 |   test:
11 |     runs-on: ubuntu-latest
12 |
13 |     steps:
14 |     - uses: actions/checkout@v2
15 |     - uses: actions/setup-node@v2
16 |
17 |     - run: npm install
18 |
19 |     - name: Test
20 |       run: |
21 |         npm test
22 |
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | node_modules
2 | doc
3 | *.log
4 | *.swp
5 |
--------------------------------------------------------------------------------
/CHANGELOG.md:
--------------------------------------------------------------------------------
1 | 6.1.0 / 2022-10-22
2 | ------------------
3 |
4 | - Deps bump.
5 | - Deprecated `soo.gd` & `korta.nu`.
6 | - Fixed `vk.cc` & `vurl.com`.
7 |
8 |
9 | 6.0.0 / 2022-05-17
10 | ------------------
11 |
12 | - Cleanup deprecated redirectors. Move deprecation info to separate file.
13 | - node.js v14+ required.
14 | - Renamed option `select` => `link_selector`.
15 | - Added `.remove()` method.
16 | - Add `isErrorFatal` helper.
17 | - Deps bump.
18 |
19 |
20 | 5.0.0 / 2017-06-08
21 | ------------------
22 |
23 | - Switch to native async/await (need nodejs 7.+)
24 | - Drop callbacks support.
25 |
26 |
27 | 4.1.0 / 2017-06-08
28 | ------------------
29 |
30 | - Maintenance, deps bump. `got` 6.x -> 7.x. `got` timeouts may work a bit
31 |   differently but should not affect result.
32 |
33 |
34 | 4.0.0 / 2016-12-08
35 | ------------------
36 |
37 | - Move request options to `options.request`.
38 | - Update default User-Agent string.
39 | - Deprecate `error.status` (use `error.statusCode`).
40 | - Add more info (code) to error messages.
41 | - flic.kr should use `.request()` method.
42 | - Increase default request timeout to 30 seconds.
43 |
44 |
45 | 3.1.0 / 2016-12-05
46 | ------------------
47 |
48 | - `err.status` -> `err.statusCode` (old `err.status` still exists for backward
49 | compatibility, but will be deprecated).
50 |
51 |
52 | 3.0.0 / 2016-11-27
53 | ------------------
54 |
55 | - Rewrite internals to promises (including .require() / cache.get() /
56 | cache.set()).
57 | - Drop old node.js support, now v4.+ required.
58 |
59 |
60 | 2.1.0 / 2016-07-15
61 | ------------------
62 |
63 | - Added `google.*/url` unshortening.
64 | - Reenabled some glitching services.
65 | - Added incident dates to default config for tracking progress in future.
66 |
67 |
68 | 2.0.0 / 2016-05-24
69 | ------------------
70 |
71 | - Added Promise support in `.expand` method.
72 | - Services cleanup.
73 |
74 |
75 | 1.1.3 / 2016-01-17
76 | ------------------
77 |
78 | - Maintenance: deps update.
79 |
80 |
81 | 1.1.2 / 2015-12-07
82 | ------------------
83 |
84 | - Enhanced error info with `code` & `status` properties.
85 |
86 |
87 | 1.1.1 / 2015-11-27
88 | ------------------
89 |
90 | - Improved cache use for edge case with empty result.
91 |
92 |
93 | 1.1.0 / 2015-11-25
94 | ------------------
95 |
96 | - Optimized cache use. Store data only if fetch happened.
97 | - Increased request timeout to 10 seconds.
98 |
99 |
100 | 1.0.1 / 2015-10-28
101 | ------------------
102 |
103 | - Added `vk.com/away.php` support.
104 |
105 |
106 | 1.0.0 / 2015-08-16
107 | ------------------
108 |
109 | - First release.
110 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | Copyright (c) 2015 Vitaly Puzrin.
2 |
3 | Permission is hereby granted, free of charge, to any person
4 | obtaining a copy of this software and associated documentation
5 | files (the "Software"), to deal in the Software without
6 | restriction, including without limitation the rights to use,
7 | copy, modify, merge, publish, distribute, sublicense, and/or sell
8 | copies of the Software, and to permit persons to whom the
9 | Software is furnished to do so, subject to the following
10 | conditions:
11 |
12 | The above copyright notice and this permission notice shall be
13 | included in all copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
16 | EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
17 | OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
18 | NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
19 | HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
20 | WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
21 | FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
22 | OTHER DEALINGS IN THE SOFTWARE.
23 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # url-unshort
2 |
3 | [CI](https://github.com/nodeca/url-unshort/actions/workflows/ci.yml)
4 | [NPM version](https://www.npmjs.org/package/url-unshort)
5 |
6 | > This library expands urls provided by url shortening services (see [full list](https://github.com/nodeca/url-unshort/blob/master/domains.yml)).
7 |
8 |
9 | ## Why should I use it?
10 |
11 | It has been [argued](http://joshua.schachter.org/2009/04/on-url-shorteners) that
12 | “shorteners are bad for the ecosystem as a whole”. In particular, if you're
13 | running a forum or a blog, such services might cause trouble for your users:
14 |
15 | - such links load slower than usual (shortening services require an extra DNS
16 | and HTTP request)
17 | - it adds another point of failure (should this service go down, the links will
18 | die; [301works](https://archive.org/details/301works) tries to solve this,
19 | but it's better to avoid the issue in the first place)
20 | - users don't see where the link points to (tinyurl previews don't *really*
21 | solve this)
22 | - it can be used for user activity tracking
23 | - certain shortening services are displaying ads before redirect
24 | - shortening services can be malicious or be hacked so they could redirect to
25 | a completely different place next month
26 |
27 | Also, short links are used to bypass the spam filters. So if you're implementing
28 | a domain black list for your blog comments, you might want to check where all
29 | those short links *actually* point to.
30 |
31 |
32 | ## Installation
33 |
34 | ```bash
35 | $ npm install url-unshort
36 | ```
37 |
38 | ## Basic usage
39 |
40 | ```js
41 | const uu = require('url-unshort')()
42 |
43 | try {
44 |   const url = await uu.expand('http://goo.gl/HwUfwd')
45 |
46 |   if (url) console.log(`Original url is: ${url}`)
47 |   else console.log('This url can\'t be expanded')
48 |
49 | } catch (err) {
50 |   console.log(err)
51 | }
52 | ```
53 |
54 | ## Retrying errors
55 |
56 | Temporary network errors are retried automatically once (`options.request.retry=1` by default).
57 |
58 | You may choose to retry some errors after an extended period of time using code like this:
59 |
60 | ```js
61 | const uu = require('url-unshort')()
62 | const { isErrorFatal } = require('url-unshort')
63 | let tries = 0
64 |
65 | while (true) {
66 |   try {
67 |     tries++
68 |     const url = await uu.expand('http://goo.gl/HwUfwd')
69 |
70 |     // If url is expanded, it returns string (expanded url);
71 |     // `null` is returned if service is unknown
72 |     if (url) console.log(`Original url is: ${url}`)
73 |     else console.log("This url can't be expanded")
74 |     break
75 |
76 |   } catch (err) {
77 |     // use isErrorFatal function to check if url can be retried or not
78 |     if (isErrorFatal(err)) {
79 |       // this url can't be expanded (e.g. 404 error)
80 |       console.log(`Unshort error (fatal): ${err}`)
81 |       break
82 |     }
83 |
84 |     // Temporary error, trying again in 10 minutes
85 |     // (5xx errors, ECONNRESET, etc.)
86 |     console.log(`Unshort error (retrying): ${err}`)
87 |     if (tries >= 3) {
88 |       console.log(`Too many errors, aborting`)
89 |       break
90 |     }
91 |     await new Promise(resolve => setTimeout(resolve, 10 * 60 * 1000))
92 |   }
93 | }
94 | ```
95 |
96 |
97 | ## API
98 |
99 | ### Creating an instance
100 |
101 | When you create an instance, you can pass an options object to fine-tune unshortener behavior.
102 |
103 | ```js
104 | const uu = require('url-unshort')({
105 |   nesting: 3,
106 |   cache: {
107 |     get: async key => {},
108 |     set: async (key, value) => {}
109 |   }
110 | });
111 | ```
112 |
113 | Available options are:
114 |
115 | - **nesting** (Number, default: `3`) - stop resolving urls
116 |   when `nesting` amount of redirects is reached.
117 |
118 |   It happens if one shortening service refers to a link belonging to
119 |   another shortening service which in turn points to yet another one
120 |   and so on.
121 |
122 |   If this limit is reached, `expand()` will return an error.
123 |
124 | - **cache** (Object) - set a custom cache implementation (e.g. if you wish
125 |   to store urls in Redis).
126 |
127 |   You need to specify 2 promise-based functions, `set(key, value)` & `get(key)`.
128 |
129 | - **request** (Object) - default options for
130 |   [got](https://github.com/sindresorhus/got) in `.request()` method. Can be
131 |   used to set custom `User-Agent` and other headers.
132 |
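As a rough illustration of the `cache` option, the sketch below builds an in-memory cache on top of a `Map`. `createMemoryCache` is a hypothetical helper name and not part of the library; the only contract taken from the source is a pair of promise-based functions, `get(key)` and `set(key, value)`. In `lib/index.js` a cached `null` is treated as a stored negative result while `undefined` is treated as a miss, so the sketch relies on `Map#get` returning `undefined` for unknown keys.

```js
// Hypothetical helper (not part of url-unshort): a Map-backed cache
// exposing the promise-based get/set pair the `cache` option expects.
function createMemoryCache () {
  const store = new Map()

  return {
    // Resolves to the stored value, or `undefined` on a cache miss.
    get: async key => store.get(key),
    // Stores expanded urls as well as `null` (a cached "can't be expanded").
    set: async (key, value) => { store.set(key, value) }
  }
}
```

It could then be passed as `require('url-unshort')({ cache: createMemoryCache() })`. A long-running service would probably want expiry and a shared store (such as Redis) instead of a process-local `Map`.
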
133 |
134 | ### uu.expand(url) -> Promise
135 |
136 | Expand the supplied URL. Returns `null` if we don't know how to expand it.
137 |
138 | ```js
139 | const uu = require('url-unshort')();
140 |
141 | try {
142 |   const url = await uu.expand('http://goo.gl/HwUfwd')
143 |
144 |   if (url) console.log(`Original url is: ${url}`)
145 |   // no shortening service or an unknown one is used
146 |   else console.log('This url can\'t be expanded')
147 |
148 | } catch (err) {
149 |   console.log(err)
150 | }
151 | ```
152 |
153 | ### uu.add(domain [, options])
154 |
155 | Add a new url shortening service (domain name or an array of them) to the white
156 | list of domains we know how to expand.
157 |
158 | ```js
159 | uu.add([ 'tinyurl.com', 'bit.ly' ])
160 | ```
161 |
162 | The default behavior will be to follow the URL with a HEAD request and check
163 | the status code. If it's `3xx`, return the `Location` header. You can override
164 | this behavior by supplying your own function in options.
165 |
166 | Options:
167 |
168 | - **aliases** (Array) - Optional. List of alternate domain names, if any exist.
169 | - **match** (String|RegExp) - Optional. Custom regexp to use for URL match.
170 |   For example, if you need to match wildcard prefixes or country-specific
171 |   suffixes. If used with `validate`, the regexp need not be precise and only
172 |   has to filter out noise. If `match` is not passed, an exact value is
173 |   auto-generated from `domain` & `aliases`.
174 | - **validate** (Function) - Optional. Does an exact URL check when complex logic
175 |   is required and a regexp is not enough (when `match` is only preliminary). See
176 |   `./lib/providers/*` for examples.
177 | - **fetch** (Function) - Optional. Specifies a custom function to retrieve the
178 |   expanded url, see `./lib/providers/*` for examples. If not set, the default
179 |   method is used (it checks 30X redirect codes & `<meta http-equiv="refresh">`
180 |   in HTML).
181 | - **link_selector** (String) - Optional. Some sites may return HTML pages instead
182 |   of 302 redirects. This option allows using a jquery-like selector to extract
183 |   the `<a href="...">` value.
184 |
185 | Example:
186 |
187 | ```js
188 | const uu = require('url-unshort')()
189 |
190 | uu.add('notlong.com', {
191 |   match: '^(https?:)//[a-zA-Z0-9_-]+[.]notlong[.]com/'
192 | })
193 |
194 | uu.add('tw.gs', {
195 |   link_selector: '#lurllink > a'
196 | })
197 | ```
198 |
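The `validate` and `fetch` options described above can be sketched as follows. The domain `short.example.test`, its path format and the name `providerOptions` are assumptions made up for illustration. The structure loosely follows the bundled providers in `./lib/providers/*`: `fetch` is called with the unshortener instance as `this`, so it can use `this.request()`. `this._isRedirect()` is an internal, underscore-prefixed helper that those providers also use, so treat it as subject to change.

```js
// Options for a hypothetical shortener at short.example.test
// (domain, path format and names are assumptions for illustration).
const providerOptions = {
  // Loose pre-filter; `validate` below performs the exact check.
  match: '^(https?:)//short[.]example[.]test/',

  // Accept only single-segment alphanumeric paths, e.g. /Ab12
  validate: url => /^\/[a-z0-9]+$/i.test(new URL(url).pathname),

  // Called with the unshortener instance as `this`.
  fetch: async function (url) {
    let res

    try {
      res = await this.request(url, { method: 'HEAD' })
    } catch (e) {
      // 4xx: the short link does not exist, report "can't be expanded"
      if (e.statusCode >= 400 && e.statusCode < 500) return null
      throw e
    }

    if (!this._isRedirect(res.statusCode) || !res.headers.location) return null

    // Resolve relative `Location` values against the short url
    return new URL(res.headers.location, url).toString()
  }
}
```

Registering it would look like `uu.add('short.example.test', providerOptions)`. Returning `null` from `fetch` tells `expand()` to stop and report the url as not expandable, while thrown errors propagate to the caller.
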
199 | ### uu.remove(domain)
200 |
201 | (String|Array|Undefined). Opposite to `.add()`. Remove selected domains from
202 | instance config. If no params passed - remove everything.
203 |
204 |
205 | ## Security considerations
206 |
207 | Only `http` and `https` protocols are allowed in the output. Browsers technically
208 | support redirects to other protocols (like `ftp` or `magnet`), but most url
209 | shortening services limit redirects to `http` and `https` anyway. If a
210 | service redirects to an unknown protocol, `expand()` will return an error.
211 |
212 | The `expand()` function returns the url from the url shortening service **as is**,
213 | without any escaping or even ensuring that the url is valid. If you want to
214 | guarantee a valid url as an output, you're encouraged to re-encode it like this:
215 |
216 | ```js
217 | const URL = require('url')
218 |
219 | let url = await uu.expand('http://goo.gl/HwUfwd')
220 |
221 | if (url) url = URL.format(URL.parse(url, null, true))
222 |
223 | console.log(url)
224 | ```
225 |
226 | ## License
227 |
228 | [MIT](https://raw.github.com/nodeca/url-unshort/master/LICENSE)
229 |
--------------------------------------------------------------------------------
/deprecated.md:
--------------------------------------------------------------------------------
1 | List of outdated domains, removed from `domains.yml`.
2 |
3 | **2.gp**
4 |
5 | Alias: 7.ly
6 |
7 | 2021.11.30. Redirects to another site, old links removed.
8 |
9 | **adfa.st**
10 |
11 | 2016.07.13. Domain lost.
12 |
13 | **b23.ru**
14 |
15 | 2016.07.13. Not responding.
16 |
17 | **budurl.me**
18 |
19 | 2016.07.13. Became adware, old links removed.
20 |
21 | **fur.ly**
22 |
23 | 2016.12.06. Empty main page. 404 to all links.
24 |
25 | **korta.nu**
26 |
27 | 2022.10.22. Not working (access denied)
28 |
29 | **macte.ch**
30 |
31 | 2016.07.13. Removed old links & disabled foreign domains
32 |
33 | **migre.me**
34 |
35 | 2021.11.30. Not working.
36 |
37 | **minu.me**
38 |
39 | 2021.11.30. Not responding.
40 |
41 | **nsfw.in**
42 |
43 | 2021.11.30. Cyclic redirect.
44 |
45 | **o-x.fr**
46 |
47 | 2016.07.13. Domain lost.
48 |
49 | **qr.net**
50 |
51 | 2016.12.06. Not working, strange default redirects.
52 |
53 | **scrnch.me**
54 |
55 | 2021.11.30. Domain lost.
56 |
57 | **smsh.me**
58 |
59 | 2016.07.13. Not working.
60 |
61 | **snipurl.com**
62 |
63 | 2021.11.30. "We're migrating to a new server" (and nothing changed)
64 |
65 | **soo.gd**
66 |
67 | 2022.10.22. Domain lost.
68 |
69 | **thecow.me**
70 |
71 | 2021.11.30. Only root page works (and nothing changed)
72 |
73 | **tiny.ly**
74 |
75 | 2016.07.13. Not responding.
76 |
77 | **tnij.org**
78 |
79 | 2021.11.30. Not working.
80 |
81 | **to.ly**
82 |
83 | 2016.07.13. Not responding. Currently redirects to another site, old links removed.
84 |
85 | **tr.im**
86 |
87 | 2022.02.02. Domain lost.
88 |
89 | **trim.li**
90 |
91 | 2021.11.30. Not working.
92 |
93 | **url.az**
94 |
95 | 2016.07.13. Became adware with intermediate page. Old links removed.
96 |
97 | **ur1.ca**
98 |
99 | 2021.11.30. Domain lost.
100 |
101 | **➡.ws**
102 |
103 | With aliases: ➯.ws, ➔.ws, ➞.ws, ➽.ws, ➹.ws, ✩.ws, ✿.ws, ❥.ws, ›.ws, ⌘.ws, ‽.ws,
104 | ☁.ws, ta.gd, ri.ms.
105 |
106 | 2021.11.30. All domains unconfigured or lost.
--------------------------------------------------------------------------------
/domains.yml:
--------------------------------------------------------------------------------
1 | #
2 | # The list of domains handled by internal rules
3 | #
4 |
5 | - 0rz.tw
6 | - alturl.com
7 | - amzn.to
8 | - bit.do
9 | - bit.ly:
10 |     aliases: [ aaja.de, adct.me, archdai.ly, aspt.co, ccwc.me, crks.me,
11 |       bcool.bz, detne.ws, digs.by, drudge.tw, emu.sc, go72.de, got.cr, j.mp,
12 |       j-tv.me, hnnng.de, s.htc.com, kon.gg, livesi.de, perez.ly, rol.st,
13 |       scr.bi, theatln.tc, tgr.ph, trib.in, utsd.us, yhoo.it ]
14 | - chilp.it
15 | - clck.ru
16 | - cort.as
17 | - cutt.us
18 | - db.tt
19 | - fave.co
20 | - flic.kr
21 | - goo.gl
22 | - google.com:
23 |     match: '^(https?:)//(www[.])?google([.]\w+)([.]\w+)?/'
24 | - is.gd
25 | - merky.de
26 | - notlong.com:
27 |     match: '^(https?:)//[a-zA-Z0-9_-]+[.]notlong[.]com/'
28 | - ow.ly:
29 |     aliases: [ owl.li, ht.ly ]
30 | - shorl.com
31 | - smu.gs
32 | - t.co
33 | # 2016.07.13, Domain blacklisted by idiots from RKN. Viva Russia!
34 | - tiny.cc
35 | - tiny.pl
36 | - tinyurl.com
37 | - tmblr.co:
38 |     aliases: [ tumblr.com ]
39 | # 2022.10.22 Shows DB Error
40 | - tw.gs:
41 |     link_selector: '#lurllink > a'
42 | # 2021.11.30 Unstable, but works.
43 | - url.ie
44 | - v.gd:
45 |     link_selector: '.biglink'
46 | - vk.cc
47 | - vk.com:
48 |     aliases: [ vkontakte.ru ]
49 | - vurl.com:
50 |     link_selector: '.padder a:nth-child(3)'
51 | - wp.me
52 | - xurl.es
53 |
--------------------------------------------------------------------------------
/index.js:
--------------------------------------------------------------------------------
1 | 'use strict'
2 |
3 | module.exports = require('./lib')
4 | module.exports.Error = require('./lib/error')
5 | module.exports.isErrorFatal = require('./lib/is_error_fatal')
6 |
--------------------------------------------------------------------------------
/lib/error.js:
--------------------------------------------------------------------------------
1 | // Error class based on http://stackoverflow.com/questions/8458984
2 | //
3 | 'use strict'
4 |
5 | class UnshortError extends Error {
6 | constructor (message, code, statusCode) {
7 | super(message)
8 | this.name = this.constructor.name
9 | Error.captureStackTrace(this, this.constructor)
10 |
11 | if (code) this.code = code
12 | if (statusCode) this.statusCode = statusCode
13 | }
14 | }
15 |
16 | module.exports = UnshortError
17 |
--------------------------------------------------------------------------------
/lib/index.js:
--------------------------------------------------------------------------------
1 | // Main class
2 | //
3 |
4 | 'use strict'
5 |
6 | const $ = require('cheerio/lib/slim').load('')
7 | const read = require('fs').readFileSync
8 | const yaml = require('js-yaml')
9 | const path = require('path')
10 | const punycode = require('punycode/')
11 | const got = require('got')
12 | const escapeRe = require('escape-string-regexp')
13 | const merge = require('lodash.merge')
14 | const URL = require('url').URL
15 | const UnshortError = require('./error')
16 | const pkg = require('../package.json')
17 |
18 | const config = yaml.load(read(path.join(__dirname, '..', 'domains.yml'), 'utf8'))
19 |
20 | const defaultAgent = `${pkg.name}/${pkg.version} (+https://github.com/nodeca/url-unshort)`
21 |
22 | const defaultOptions = {
23 | timeout: 30 * 1000,
24 | retry: 1,
25 | followRedirect: false, // redirects are handled manually
26 | headers: {
27 | 'User-Agent': defaultAgent
28 | }
29 | }
30 |
31 | const customProviders = require('./providers/index')
32 |
33 | // Create an unshortener instance
34 | //
35 | // options:
36 | // - cache (Object) - cache instance
37 | // - get(key) -> Promise
38 | // - set(key, value) -> Promise
39 | // - nesting (Number) - max amount of redirects to follow, default: `3`
40 | // - request (Object) - default options for `got` in `.request()` method
41 | //
42 | function Unshort (options = {}) {
43 | if (!(this instanceof Unshort)) return new Unshort(options)
44 |
45 | this._options = merge({}, defaultOptions, options.request || {})
46 |
47 | // config data with compiled regexps and fetch functions attached
48 | this._sites = []
49 | this._compiled_sites = []
50 |
51 | this.cache = options.cache || {
52 | get: async () => {},
53 | set: async () => {}
54 | }
55 |
56 | this.nesting = options.nesting || 3
57 |
58 | // Regexp that matches links to all the known services, it is used
59 | // to determine whether url should be processed at all or not.
60 | //
61 | // Initialized to regexp that matches nothing, it gets overwritten
62 | // when domains are added.
63 | //
64 | this._matchAllRE = /(?!)/
65 |
66 | // Merge config data & custom providers
67 | config.forEach(site => {
68 | let domain, options
69 |
70 | if (typeof site === 'string') {
71 | [domain, options] = [site, {}]
72 | } else {
73 | [domain, options] = Object.entries(site)[0]
74 | }
75 |
76 | if (customProviders[domain]) {
77 | Object.assign(options, customProviders[domain])
78 | }
79 |
80 | this.add(domain, options)
81 | })
82 | }
83 |
84 | // Remove previously added domain.
85 | //
86 | // - domain (String|Array) - list of domain names (leave undefined to drop all)
87 | //
88 | Unshort.prototype.remove = function (domain) {
89 | if (!domain) {
90 | this._sites.length = 0
91 | } else if (!Array.isArray(domain)) {
92 | this._sites = this._sites.filter(s => s.id !== domain)
93 | } else {
94 | for (const d of domain) {
95 | this._sites = this._sites.filter(s => s.id !== d)
96 | }
97 | }
98 |
99 | this._compile()
100 | }
101 |
102 | // Add a domain name to the list of known domains
103 | //
104 | // - domain (String|Array) - list of domain names
105 | // - options (Object) - options for these domains
106 | // - link_selector (String) - jquery-like selector to retrieve url with
107 | // - match (String|RegExp) - custom regexp to use to match this domain
108 | // - fetch (Function) - custom function to retrieve expanded url
109 | //
110 | Unshort.prototype.add = function (domain, options = {}) {
111 | if (Array.isArray(domain)) {
112 | for (const d of domain) {
113 | this._sites.push(Object.assign({ id: d }, options))
114 | }
115 | } else {
116 | this._sites.push(Object.assign({ id: domain }, options))
117 | }
118 |
119 | this._compile()
120 | }
121 |
122 | // Normalize site data:
123 | //
124 | // - create default handlers if not exist
125 | // - build `match` regexp, depending on other fields
126 | //
127 | // Returns normalized object, suitable for unified processing.
128 | //
129 | Unshort.prototype._compileSingle = function (site) {
130 | // Prepare list of all the domain names, including aliases
131 | // and punycode variations
132 | let dList = [site.id].concat(site.aliases || [])
133 |
134 | // create variations + make unique
135 | dList = Array.from(new Set(
136 | dList.map(punycode.toASCII).concat(dList.map(punycode.toUnicode))
137 | ))
138 |
139 | let match
140 |
141 | if (site.match) {
142 | // regexp is specified by a user
143 | match = typeof site.match === 'string'
144 | ? new RegExp(site.match, 'i')
145 | : site.match
146 | } else {
147 | // regexp is auto-generated out of domain list
148 | match = new RegExp(
149 | `^(https?:)?//(www[.])?(${dList.map(escapeRe).join('|')})/`,
150 | 'i'
151 | )
152 | }
153 |
154 | return Object.assign(
155 | { fetch: this._defaultFetch, validate: () => true },
156 | site,
157 | { match }
158 | )
159 | }
160 |
161 | // Rebuild all regexps & default handlers for fast run
162 | Unshort.prototype._compile = function () {
163 | this._compiled_sites.length = 0
164 |
165 | for (const site of this._sites) {
166 | this._compiled_sites.push(this._compileSingle(site))
167 | }
168 |
169 | // Create global search regexp
170 | this._matchAllRE = new RegExp(
171 | this._compiled_sites.map(cs => cs.match.source).join('|'),
172 | 'i'
173 | )
174 | }
175 |
176 | // Internal method to perform an http(s) request, it's supposed to be used
177 | // in fetchers. You can override it with custom implementation (for example,
178 | // if you want to avoid http requests at all and use cache only, you can
179 | // replace this with a stub).
180 | //
181 | Unshort.prototype.request = function (url, options) {
182 | const opts = merge({}, this._options, options || {})
183 |
184 | return got(url, opts).catch(err => {
185 | let statusCode = err.statusCode
186 |
187 | if (err.code === 'ERR_NON_2XX_3XX_RESPONSE' && err.response) {
188 | // https://github.com/sindresorhus/got/blob/main/documentation/8-errors.md
189 | statusCode = err.response.statusCode
190 | }
191 |
192 | throw new UnshortError(
193 | `Remote server error, code ${err.code}, statusCode ${statusCode}`,
194 | 'EHTTP',
195 | statusCode)
196 | })
197 | }
198 |
199 | // Expand an URL
200 | //
201 | // - url (String) - url to expand
202 | //
203 | Unshort.prototype.expand = function (url) {
204 | return this._expand(url)
205 | }
206 |
207 | // Internal method that expands url recursively up to `nesting` times,
208 | // on each execution it parses input url and calls a fetcher of the
209 | // matching domain.
210 | //
211 | Unshort.prototype._expand = async function (origUrl) {
212 | if (origUrl.startsWith('//')) {
213 | try {
214 | /* eslint-disable no-new */
215 | new URL(origUrl)
216 | } catch (e) {
217 | try {
218 | // set protocol for relative links like `//example.com`
219 | new URL('http:' + origUrl)
220 | origUrl = 'http:' + origUrl
221 | } catch {}
222 | }
223 | }
224 |
225 | let url = origUrl
226 | let shouldCache = false
227 | let nestingLeft = this.nesting
228 |
229 | for (; nestingLeft >= 0; nestingLeft--) {
230 | let hash = ''
231 |
232 | //
233 | // Normalize url & pre-validate
234 | //
235 |
236 | let u = new URL(url)
237 |
238 | // user-submitted url has weird protocol, just return `null` in this case
239 | if (u.protocol !== 'http:' && u.protocol !== 'https:') break
240 |
241 | if (u.hash) {
242 | // Copying browser-like behavior here: if we're not redirected to a hash,
243 | // but original url has one, set it as a final hash.
244 | hash = u.hash
245 | u.hash = ''
246 | }
247 |
248 | const urlNormalized = u.toString()
249 |
250 | //
251 | // At top level try cache first. On recursive calls skip cache.
252 | // !! Cache should be probed even for disabled services, to resolve old links.
253 | //
254 | let result
255 |
256 | if (nestingLeft === this.nesting) {
257 | result = await this.cache.get(urlNormalized)
258 |
259 | // If cache exists - use it.
260 | if (result || result === null) {
261 | // forward hash if needed
262 | if (hash && result) {
263 | u = new URL(result)
264 | u.hash = u.hash || hash
265 | result = u.toString()
266 | }
267 |
268 | return result
269 | }
270 | }
271 |
272 | //
273 | // First pass validation (quick).
274 | //
275 |
276 | if (!this._matchAllRE.test(urlNormalized)) break
277 |
278 | // Something found - run additional checks.
279 |
280 | const siteConfig = this._compiled_sites.find(cs => cs.match.exec(urlNormalized))
281 |
282 | if (!siteConfig || !siteConfig.validate(urlNormalized)) break
283 |
284 | // Valid redirector => should cache result
285 | shouldCache = true
286 |
287 | result = await siteConfig.fetch.call(this, urlNormalized, siteConfig)
288 |
289 | // If unshortener has persistent fail - stop.
290 | if (!result) break
291 |
292 | // Parse and check url
293 | //
294 | try {
295 | u = new URL(result)
296 | } catch (e) {
297 | if (e instanceof TypeError && e.message === 'Invalid URL') {
298 | throw new UnshortError('Redirected to an invalid location', 'EBADREDIRECT')
299 | }
300 |
301 | throw e
302 | }
303 |
304 | if (u.protocol !== 'http:' && u.protocol !== 'https:') {
305 | // Accept:
306 | //
307 | // - http:// protocol (e.g. http://example.org/)
308 | // - https:// protocol (e.g. https://example.org/)
309 | //
310 | // Restriction is done for security reasons. Even though browsers
311 | // can redirect anywhere, most shorteners have similar restrictions.
312 | //
313 | throw new UnshortError('Redirected to an invalid location', 'EBADREDIRECT')
314 | }
315 |
316 | // restore hash if needed
317 | if (hash && !u.hash) {
318 | u.hash = hash
319 | result = u.toString()
320 | }
321 |
322 | url = result
323 | }
324 |
325 | if (nestingLeft < 0) {
326 | throw new UnshortError('Too many redirects', 'EBADREDIRECT')
327 | }
328 |
329 | const result = (url !== origUrl) ? url : null
330 |
331 | if (shouldCache) {
332 | // Cache result.
333 | // !! use normalized original URL for cache key.
334 | const uo = new URL(origUrl)
335 |
336 | uo.hash = ''
337 |
338 | await this.cache.set(uo.toString(), result)
339 | }
340 |
341 | return result
342 | }
343 |
344 | Unshort.prototype._isRedirect = function (code) {
345 | return [301, 302, 303, 307, 308].includes(code)
346 | }
347 |
348 | // Default fetcher, it requests an url and retrieves url it redirects to
349 | // using following data sources:
350 | //
351 | // - "Location" header if response code is 3xx
352 | // - <meta http-equiv="refresh"> tag
353 | // - $(selector).attr('href') if selector is specified
354 | //
355 | Unshort.prototype._defaultFetch = async function (url, options) {
356 | let res
357 |
358 | try {
359 | res = await this.request(url)
360 | } catch (e) {
361 | if (e.statusCode >= 400 && e.statusCode < 500) return null
362 | throw e
363 | }
364 |
365 | if (this._isRedirect(res.statusCode)) {
366 | return res.headers.location ? res.headers.location.trim() : null
367 | }
368 |
369 | if (res.statusCode >= 200 && res.statusCode < 300) {
370 | if (!res.headers['content-type'] ||
371 | res.headers['content-type'].split(';')[0].trim() !== 'text/html') {
372 | return null
373 | }
374 |
375 | const body = String(res.body)
376 |
377 | if (options.link_selector) {
378 | // try to lookup selector if it's defined in the config
379 | const el = $(body).find(options.link_selector)
380 | const result = el.attr('href')
381 |
382 | if (result) return result.trim()
383 | }
384 |
385 | // try <meta http-equiv="refresh"> tag
386 | let refresh = $(body)
387 | .find('meta[http-equiv="refresh"]')
388 | .attr('content')
389 |
390 | if (!refresh) return null
391 |
392 | // parse meta-tag and remove timeout,
393 | // refresh at this point is like `0.5; url=http://example.org`
394 | refresh = refresh.replace(/^[^;]+;\s*url=/i, '').trim()
395 |
396 | return refresh
397 | }
398 |
399 | throw new UnshortError(
400 | `Remote server error, code ${res.code}, statusCode ${res.statusCode}`,
401 | 'EHTTP',
402 | res.statusCode)
403 | }
404 |
405 | module.exports = Unshort
406 |
--------------------------------------------------------------------------------
/lib/is_error_fatal.js:
--------------------------------------------------------------------------------
1 | // Function checks whether an error can be retried later or not
2 | //
3 | 'use strict'
4 |
5 | function isErrorFatal (err) {
6 | // HTTP errors, fatal errors are everything except:
7 | // - 5xx - server-side errors
8 | // - 429 - rate limit
9 | // - 408 - request timeout
10 | if (err.statusCode && !String(+err.statusCode).match(/^(5..|429|408)$/)) {
11 | return true
12 | }
13 |
14 | // EINVAL - bad urls like http://1234
15 | if (err.code === 'EINVAL') return true
16 |
17 | // server returned invalid url or caused redirect loop
18 | if (err.code === 'EBADREDIRECT') return true
19 |
20 | return false
21 | }
22 |
23 | module.exports = isErrorFatal
24 |
--------------------------------------------------------------------------------
/lib/providers/clck.ru.js:
--------------------------------------------------------------------------------
1 | // clck.ru redirects to https://sba.yandex.net/redirect?url=..., which
2 | // restricts allowed user agents.
3 | // Let's extract url directly.
4 |
5 | 'use strict'
6 |
7 | const URL = require('url').URL
8 | const UnshortError = require('../error')
9 |
10 | exports.fetch = async function (url) {
11 | let res
12 | try {
13 | res = await this.request(url, { method: 'HEAD' })
14 | } catch (e) {
15 | if (e.statusCode >= 400 && e.statusCode < 500) return null
16 | throw e
17 | }
18 |
19 | if (!this._isRedirect(res.statusCode)) {
20 | throw new UnshortError(
21 | `Unexpected server response ${res.statusCode}, expect redirect`,
22 | 'EHTTP',
23 | 500
24 | )
25 | }
26 |
27 | let dest
28 | try {
29 | const u = new URL(res.headers.location)
30 | dest = u.searchParams.get('url')
31 | } catch (e) {
32 | throw new UnshortError(
33 | 'Redirected to an invalid location',
34 | 'EBADREDIRECT'
35 | )
36 | }
37 |
38 | return dest
39 | }
40 |
--------------------------------------------------------------------------------
/lib/providers/flic.kr.js:
--------------------------------------------------------------------------------
1 | // Process flic.kr redirects (including relative urls, which the default fetcher can't handle)
2 | //
3 |
4 | 'use strict'
5 |
6 | const URL = require('url').URL
7 | const UnshortError = require('../error')
8 |
9 | exports.fetch = async function (url) {
10 | let nestingLeft = 5
11 |
12 | while (nestingLeft--) {
13 | let res
14 |
15 | try {
16 | res = await this.request(url, { method: 'HEAD' })
17 | } catch (e) {
18 | if (e.statusCode >= 400 && e.statusCode < 500) return null
19 | throw e
20 | }
21 |
22 | if (this._isRedirect(res.statusCode)) {
23 | try {
24 | const uDst = new URL(res.headers.location, url)
25 |
26 | url = uDst.toString()
27 | continue
28 | } catch (e) {
29 | if (e instanceof TypeError && e.message === 'Invalid URL') {
30 | throw new UnshortError('Redirected to an invalid location', 'EBADREDIRECT')
31 | }
32 |
33 | throw e
34 | }
35 | }
36 |
37 | // reached destination
38 | if (res.statusCode >= 200 && res.statusCode < 300) return url
39 |
40 | throw new UnshortError(`Unexpected status code: ${res.statusCode}`, 'EHTTP', res.statusCode)
41 | }
42 |
43 | return null
44 | }
45 |
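The key step in this provider is resolving a possibly relative `Location` header against the current url with the WHATWG `URL` constructor. A minimal standalone sketch of that step follows; `resolveLocation` is a hypothetical name, and the paths used are made up for illustration.

```javascript
'use strict'

const URL = require('url').URL

// Hypothetical helper: resolves a redirect target against the url that
// produced it, the same way `new URL(res.headers.location, url)` does above.
function resolveLocation (location, base) {
  return new URL(location, base).toString()
}

// Path-only redirect keeps the scheme and host of the base url
console.log(resolveLocation('/photos/foo/', 'https://flic.kr/p/p6kuZs'))
// Absolute redirect ignores the base url
console.log(resolveLocation('https://www.flickr.com/photos/foo/', 'https://flic.kr/p/p6kuZs'))
```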
--------------------------------------------------------------------------------
/lib/providers/google.com.js:
--------------------------------------------------------------------------------
1 | // Process google.com/url?... redirects
2 | //
3 |
4 | 'use strict'
5 |
6 | const URL = require('url').URL
7 | const isGoogle = require('is-google-domain')
8 |
9 | exports.validate = url => {
10 | const u = new URL(url)
11 |
12 | if (!isGoogle(u.hostname)) return false
13 |
14 | return u.pathname === '/url' && u.searchParams.get('url')
15 | }
16 |
17 | exports.fetch = async url => {
18 | const u = new URL(url)
19 |
20 | return u.searchParams.get('url')
21 | }
22 |
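Providers of this kind never touch the network: the destination is already encoded in a query parameter. The sketch below shows that extraction on its own, using sample links from `test/services.yml` (the Google one shortened); `extractParam` is a hypothetical helper, and the `is-google-domain` host check is deliberately left out.

```javascript
'use strict'

const URL = require('url').URL

// Hypothetical helper: returns the decoded value of a query parameter,
// or null when the parameter is absent.
function extractParam (url, name) {
  return new URL(url).searchParams.get(name)
}

console.log(extractParam(
  'https://www.google.com/url?sa=t&url=http%3A%2F%2Fwww.rcdesign.ru%2Farticles%2Favia%2Fdvs_trnr',
  'url'
))
console.log(extractParam(
  'http://vk.com/away.php?to=https%3A%2F%2Fgithub.com%2Fnodeca%2Furl-unshort',
  'to'
))
```

`searchParams.get()` percent-decodes the value, so no extra `decodeURIComponent` call is needed.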
--------------------------------------------------------------------------------
/lib/providers/index.js:
--------------------------------------------------------------------------------
1 | 'use strict'
2 |
3 | module.exports = {
4 | 'clck.ru': require('./clck.ru'),
5 | 'flic.kr': require('./flic.kr'),
6 | 'google.com': require('./google.com'),
7 | 'vk.com': require('./vk.com')
8 | }
9 |
--------------------------------------------------------------------------------
/lib/providers/vk.com.js:
--------------------------------------------------------------------------------
1 | // Process vk.com/away.php redirects
2 | //
3 |
4 | 'use strict'
5 |
6 | const URL = require('url').URL
7 |
8 | exports.validate = url => {
9 | const u = new URL(url)
10 |
11 | return u.pathname === '/away.php' && u.searchParams.get('to')
12 | }
13 |
14 | exports.fetch = async url => {
15 | const u = new URL(url)
16 |
17 | return u.searchParams.get('to')
18 | }
19 |
--------------------------------------------------------------------------------
/package.json:
--------------------------------------------------------------------------------
1 | {
2 | "name": "url-unshort",
3 | "version": "6.1.0",
4 | "description": "Expand urls provided by url shortening services.",
5 | "keywords": [
6 | "unshort",
7 | "expand",
8 | "url"
9 | ],
10 | "repository": "nodeca/url-unshort",
11 | "license": "MIT",
12 | "scripts": {
13 | "lint": "standardx -v .",
14 | "test": "npm run lint && mocha",
15 | "test-all": "npm run lint && LINKS_CHECK=all mocha"
16 | },
17 | "files": [
18 | "index.js",
19 | "domains.yml",
20 | "lib/"
21 | ],
22 | "dependencies": {
23 | "cheerio": "^1.0.0-rc.12",
24 | "escape-string-regexp": "^4.0.0",
25 | "got": "^11.8.3",
26 | "is-google-domain": "^1.0.0",
27 | "js-yaml": "^4.1.0",
28 | "lodash.merge": "^4.6.2",
29 | "mdurl": "^1.0.0",
30 | "punycode": "^2.0.1"
31 | },
32 | "devDependencies": {
33 | "mocha": "^10.1.0",
34 | "mocha.parallel": "^0.15.2",
35 | "nock": "^13.2.4",
36 | "standardx": "^7.0.0"
37 | },
38 | "mocha": {
39 | "timeout": 60000
40 | },
41 | "engines": {
42 | "node": ">=14"
43 | }
44 | }
45 |
--------------------------------------------------------------------------------
/support/unshort.js:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env node
2 |
3 | 'use strict'
4 |
5 | /* eslint-disable no-console */
6 |
7 | const params = process.argv.slice(2)
8 |
9 | if (!params.length) {
10 | console.error('Usage: unshort.js URL')
11 | process.exit(1)
12 | }
13 |
14 | const url = params[0]
15 |
16 | require('../')().expand(url).then((to) => {
17 | console.log(to)
18 | })
19 |
--------------------------------------------------------------------------------
/test/cache.js:
--------------------------------------------------------------------------------
1 | 'use strict'
2 |
3 | /* eslint-env mocha */
4 |
5 | const assert = require('assert')
6 |
7 | describe('Cache', function () {
8 | let uu
9 | let fetchCount = 0
10 | let cache = {}
11 | let result
12 |
13 | before(() => {
14 | uu = require('../')({
15 | cache: {
16 | get: async key => cache[key],
17 | set: async (key, value) => {
18 | cache[key] = value
19 | return true
20 | }
21 | }
22 | })
23 |
24 | uu.add('example.org', {
25 | async fetch () {
26 | fetchCount++
27 | return 'http://foo.bar/'
28 | }
29 | })
30 | })
31 |
32 | it('should cache urls', async () => {
33 | cache = {}
34 |
35 | result = await uu.expand('http://example.org/foo')
36 | assert.strictEqual(result, 'http://foo.bar/')
37 |
38 | result = await uu.expand('http://example.org/foo')
39 | assert.strictEqual(result, 'http://foo.bar/')
40 | assert.strictEqual(fetchCount, 1)
41 | })
42 |
43 | it('should not cache invalid urls', async () => {
44 | cache = {}
45 |
46 | result = await uu.expand('http://invalid-url.com/foo')
47 | assert.strictEqual(result, null)
48 | assert.deepStrictEqual(cache, {})
49 | })
50 |
51 | it('should resolve disabled services from cache, if used before', async () => {
52 | cache = { 'http://old.service.com/123': 'http://redirected.to/' }
53 |
54 | result = await uu.expand('http://old.service.com/123')
55 | assert.strictEqual(result, 'http://redirected.to/')
56 | })
57 |
58 | it('should forward hash to cached value', async () => {
59 | cache = { 'http://old.service.com/123': 'http://redirected.to/' }
60 |
61 | result = await uu.expand('http://old.service.com/123#foo')
62 | assert.strictEqual(result, 'http://redirected.to/#foo')
63 | })
64 |
65 | it('should cache null result after first fetch', async () => {
66 | uu.add('example2.org', {
67 | fetch: async () => null
68 | })
69 |
70 | cache = {}
71 |
72 | result = await uu.expand('http://example2.org/foo')
73 | assert.strictEqual(result, null)
74 | assert.deepStrictEqual(cache, { 'http://example2.org/foo': null })
75 |
76 | result = await uu.expand('http://example2.org/foo')
77 | assert.strictEqual(result, null)
78 | })
79 |
80 | it('should properly cache last null fetch in nested redirects', async () => {
81 | uu.add('example3.org', {
82 | fetch: async () => 'http://example4.org/test'
83 | })
84 |
85 | uu.add('example4.org', {
86 | fetch: async () => null
87 | })
88 |
89 | cache = {}
90 |
91 | result = await uu.expand('http://example3.org/foo')
92 | assert.strictEqual(result, 'http://example4.org/test')
93 | assert.deepStrictEqual(cache, { 'http://example3.org/foo': 'http://example4.org/test' })
94 | })
95 | })
96 |
--------------------------------------------------------------------------------
/test/default.js:
--------------------------------------------------------------------------------
1 | 'use strict'
2 |
3 | /* eslint-env mocha */
4 |
5 | const assert = require('assert')
6 | const nock = require('nock')
7 | const { isErrorFatal } = require('../')
8 |
9 | describe('Default', function () {
10 | let uu
11 |
12 | before(async () => {
13 | uu = require('..')({
14 | request: {
15 | retry: 0
16 | }
17 | })
18 | uu.add('example.org')
19 | })
20 |
21 | it('should process redirect', async () => {
22 | nock('http://example.org')
23 | .get('/foo')
24 | .reply(301, '', { location: 'https://github.com/0' })
25 |
26 | const result = await uu.expand('http://example.org/foo')
27 | assert.strictEqual(result, 'https://github.com/0')
28 | })
29 |
30 | it('should parse meta tags', async () => {
31 | const html = '<html><head><meta http-equiv="refresh" content="0; url=https://github.com/1"></head></html>'
32 | nock('http://example.org')
33 | .get('/bar')
34 | .reply(200, html, { 'content-type': 'text/html' })
35 |
36 | const result = await uu.expand('http://example.org/bar')
37 | assert.strictEqual(result, 'https://github.com/1')
38 | })
39 |
40 | it("should not process file if it's not html", async () => {
41 | const html = ''
42 | nock('http://example.org')
43 | .get('/zzz')
44 | .reply(200, html, { 'content-type': 'application/json' })
45 |
46 | const result = await uu.expand('http://example.org/zzz')
47 | assert.strictEqual(result, null)
48 | })
49 |
50 | it('should return nothing on 404', async () => {
51 | nock('http://example.org')
52 | .get('/baz')
53 | .reply(404, '')
54 |
55 | const result = await uu.expand('http://example.org/baz')
56 | assert.strictEqual(result, null)
57 | })
58 |
59 | it('should return errors on unknown status codes', async () => {
60 | nock('http://example.org')
61 | .get('/bazzz')
62 | .reply(503, '')
63 |
64 | await assert.rejects(
65 | async () => uu.expand('http://example.org/bazzz'),
66 | err => {
67 | assert.match(err.message, /Remote server error/)
68 | assert.strictEqual(err.code, 'EHTTP')
69 | assert.strictEqual(err.statusCode, 503)
70 | assert.strictEqual(isErrorFatal(err), false)
71 | return true
72 | }
73 | )
74 | })
75 |
76 | it('should treat invalid urls as fatal error', async () => {
77 | nock('http://example.org')
78 | .get('/invalid')
79 | .reply(301, '', { location: 'http://xn--/1' })
80 |
81 | await assert.rejects(
82 | async () => uu.expand('http://example.org/invalid'),
83 | err => {
84 | assert.match(err.message, /Redirected to an invalid location/)
85 | assert.strictEqual(err.code, 'EBADREDIRECT')
86 | assert.strictEqual(isErrorFatal(err), true)
87 | return true
88 | }
89 | )
90 | })
91 |
92 | it.skip('should fail on page > 100K', async () => {
93 | const html = ' '.repeat(110000) + ''
94 | nock('http://example.org')
95 | .get('/large')
96 | .reply(200, html, { 'content-type': 'text/html' })
97 |
98 | const result = await uu.expand('http://example.org/large')
99 | assert.strictEqual(result, null)
100 | })
101 | })
102 |
--------------------------------------------------------------------------------
/test/expand.js:
--------------------------------------------------------------------------------
1 | 'use strict'
2 |
3 | /* eslint-env mocha */
4 |
5 | const assert = require('assert')
6 |
7 | const urls = {
8 | 'http://example.org/regular': 'https://github.com/',
9 |
10 | // loop1 -> loop2 -> loop3 -> loop4 -> github
11 | 'http://example.org/loop1': 'http://example.org/loop2',
12 | 'http://example.org/loop2': 'http://example.org/loop3',
13 | 'http://example.org/loop3': 'http://example.org/loop4',
14 | 'http://example.org/loop4': 'https://github.com/',
15 |
16 | // self-referenced
17 | 'http://example.org/cycle': 'http://example.org/cycle',
18 |
19 | // control characters in the output
20 | 'http://example.org/control': 'https://github.com/',
21 |
22 | // invalid protocol
23 | 'http://example.org/file': 'file:///etc/passwd',
24 |
25 | // result has anchor in it
26 | 'http://example.org/hashy': 'https://github.com/foo#bar',
27 |
28 | // relative urls
29 | 'http://example.org/rel1': '//github.com/foo',
30 | 'http://example.org/rel2': '/foo',
31 |
32 | // invalid urls
33 | 'http://example.org/invalid_punycode': 'https://xn--/',
34 |
35 | // internationalized domain names
36 | 'http://example.org/idn1': 'http://www.bücher.de/',
37 | 'http://example.org/idn2': 'http://www.xn--bcher-kva.de/',
38 |
39 | // l1 -> l2 -> null
40 | 'http://example.org/l1': 'http://example.org/l2',
41 | 'http://example.org/l2': null
42 | }
43 |
44 | describe('Expand', function () {
45 | let uu
46 | let result
47 |
48 | before(function () {
49 | uu = require('../')()
50 |
51 | uu.add('example.org', {
52 | fetch: url => urls[url.replace(/^https/, 'http')]
53 | })
54 | })
55 |
56 | it('should expand regular url via Promise', async () => {
57 | result = await uu.expand('http://example.org/regular')
58 | assert.strictEqual(result, 'https://github.com/')
59 | })
60 |
61 | it('should expand url up to 3 levels', async () => {
62 | result = await uu.expand('http://example.org/loop2')
63 | assert.strictEqual(result, 'https://github.com/')
64 | })
65 |
66 | it('should fail on url nested more than 3 levels', async () => {
67 | await assert.rejects(
68 | async () => uu.expand('http://example.org/loop1'),
69 | /Too many redirects/
70 | )
71 | })
72 |
73 | it('should fail on links redirecting to themselves', async () => {
74 | await assert.rejects(
75 | async () => uu.expand('http://example.org/cycle'),
76 | /Too many redirects/
77 | )
78 | })
79 |
80 | it('should fail on bad protocols', async () => {
81 | await assert.rejects(
82 | async () => uu.expand('http://example.org/file'),
83 | /Redirected to an invalid location/
84 | )
85 | })
86 |
87 | it('should not encode non-url characters', async () => {
88 | result = await uu.expand('http://example.org/control')
89 | assert.strictEqual(result, 'https://github.com/')
90 | })
91 |
92 | it('should preserve an anchor', async () => {
93 | result = await uu.expand('http://example.org/regular#foobar')
94 | assert.strictEqual(result, 'https://github.com/#foobar')
95 | })
96 |
97 | it('should respect destination anchor', async () => {
98 | result = await uu.expand('http://example.org/hashy#quux')
99 | assert.strictEqual(result, 'https://github.com/foo#bar')
100 | })
101 |
102 | it('should accept relative urls without protocol', async () => {
103 | result = await uu.expand('//example.org/regular')
104 | assert.strictEqual(result, 'https://github.com/')
105 | })
106 |
107 | it('should reject links to relative urls without protocol', async () => {
108 | await assert.rejects(
109 | async () => uu.expand('http://example.org/rel1'),
110 | /Redirected to an invalid location/
111 | )
112 | })
113 |
114 | it('should reject links to relative urls without host', async () => {
115 | await assert.rejects(
116 | async () => uu.expand('http://example.org/rel2'),
117 | /Redirected to an invalid location/
118 | )
119 | })
120 |
121 | it('should reject links to invalid urls', async () => {
122 | await assert.rejects(
123 | async () => uu.expand('http://example.org/invalid_punycode'),
124 | /Redirected to an invalid location/
125 | )
126 | })
127 |
128 | it('should accept IDN (decoded)', async () => {
129 | result = await uu.expand('http://example.org/idn1')
130 | assert.strictEqual(result, 'http://www.bücher.de/')
131 | })
132 |
133 | it('should accept IDN (punycode)', async () => {
134 | result = await uu.expand('http://example.org/idn2')
135 | assert.strictEqual(result, 'http://www.xn--bcher-kva.de/')
136 | })
137 |
138 | it('should properly expand url with last null fetch in nested redirects', async () => {
139 | result = await uu.expand('http://example.org/l1')
140 | assert.strictEqual(result, 'http://example.org/l2')
141 | })
142 | })
143 |
--------------------------------------------------------------------------------
/test/services.js:
--------------------------------------------------------------------------------
1 | 'use strict'
2 |
3 | /* eslint-env mocha */
4 |
5 | const assert = require('assert')
6 | const read = require('fs').readFileSync
7 | const YAML = require('js-yaml')
8 | const path = require('path')
9 | const punycode = require('punycode/')
10 | const URL = require('url').URL
11 | const uu = require('../')()
12 | const parallel = require('mocha.parallel')
13 |
14 | const urls = YAML.load(read(path.join(__dirname, 'services.yml'), 'utf8'))
15 | const domains = YAML.load(read(path.join(__dirname, '..', 'domains.yml'), 'utf8'))
16 |
17 | const checkAll = (process.env.LINKS_CHECK === 'all')
18 |
19 | // get 2nd level domain, e.g. "foo.example.org" -> "example.org"
20 | function truncateDomain (str) {
21 | return str.split('.').slice(-2).join('.')
22 | }
23 |
24 | describe('Services', function () {
25 | it('all services should be tested', function () {
26 | let expected = []
27 | const actual = []
28 |
29 | domains.forEach(function (d) {
30 | if (typeof d === 'string') {
31 | expected.push(d)
32 | } else {
33 | expected = expected.concat(Object.keys(d))
34 | }
35 | })
36 |
37 | Object.keys(urls).forEach(function (url) {
38 | const u = new URL(url)
39 |
40 | actual.push(u.host)
41 | })
42 |
43 | assert.deepStrictEqual(
44 | expected.map(truncateDomain).map(punycode.toUnicode).sort(),
45 | actual.map(truncateDomain).map(punycode.toUnicode).sort()
46 | )
47 | })
48 |
49 | parallel('ping services', function () {
50 | let links = Object.keys(urls)
51 |
52 | if (!checkAll) { links = links.slice(0, 1) }
53 |
54 | links.forEach(function (link) {
55 | it(link, async () => {
56 | const result = await uu.expand(link)
57 | assert.strictEqual(result, urls[link])
58 | })
59 | })
60 | })
61 | })
62 |
--------------------------------------------------------------------------------
/test/services.yml:
--------------------------------------------------------------------------------
1 | # In normal mode we run only the first test, to avoid unnecessary errors in CI.
2 | http://bit.ly/1gMSVzZ: https://github.com/nodeca/url-unshort
3 |
4 |
5 | http://0rz.tw/DvhxQ: https://www.google.ru/maps
6 | http://alturl.com/8iyap: https://github.com/nodeca/url-unshort
7 | http://amzn.to/MyKindleBook: http://www.amazon.com/The-Most-Useful-Websites-ebook/dp/B006R4RN3U
8 | http://bit.do/9ZiH: https://github.com/nodeca/url-unshort
9 | http://chilp.it/e8dcdce: https://github.com/nodeca/url-unshort
10 | https://clck.ru/9ZEvf: https://github.com/nodeca/url-unshort
11 | http://cort.as/VgV0: https://github.com/nodeca/url-unshort
12 | http://cutt.us/0B2XI: https://github.com/nodeca/url-unshort
13 | https://db.tt/c0mFuu1Y: https://www.dropbox.com/s/toyzur6e0m34t7v/dropbox-logos_dropbox-glyph-blue.png
14 | http://fave.co/1Tba9m5: https://github.com/nodeca/url-unshort
15 | https://flic.kr/p/p6kuZs: https://www.flickr.com/photos/europeanspaceagency/15156592796/
16 | https://goo.gl/HwUfwd: https://github.com/nodeca/url-unshort
17 | https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&uact=8&ved=0ahUKEwj5m5GD_bjLAhUrMJoKHVQiBWsQFggcMAA&url=http%3A%2F%2Fwww.rcdesign.ru%2Farticles%2Favia%2Fdvs_trnr&usg=AFQjCNHTKYt-4cx-4_r7vljdRdm1wspAlA: http://www.rcdesign.ru/articles/avia/dvs_trnr
18 | http://is.gd/YkmUG5: https://github.com/nodeca/url-unshort
19 | http://merky.de/1fce5b: https://github.com/nodeca/url-unshort
20 | http://ow.ly/QCoNe: https://github.com/nodeca/url-unshort
21 | http://shorl.com/gridejynybyso: https://github.com/nodeca/url-unshort
22 | http://smu.gs/1JUWjme: http://www.ianbrodiephoto.net/Image-of-the-Day/Image-of-the-Day/i-mchdBZ3/
23 | https://t.co/DD3MKQZtXj: https://github.com/nodeca/url-unshort
24 | http://tiny.cc/6d2muz: https://www.youtube.com/watch?v=EHLTVVMxXuA
25 | http://tiny.pl/gx13v: https://github.com/nodeca/url-unshort
26 | https://tinyurl.com/nzezbl8: https://github.com/nodeca/url-unshort
27 | http://tmblr.co/ZIaJJw1raVase: https://calciofication.tumblr.com/post/126240050600/1990
28 | http://tw.gs/3xT0fX: https://github.com/nodeca/url-unshort
29 | http://url.ie/z346: https://github.com/nodeca/url-unshort
30 | http://v.gd/rSZr8O: https://github.com/nodeca/url-unshort
31 | http://vk.cc/45cFoR: https://github.com/nodeca/url-unshort
32 | http://vk.com/away.php?to=https%3A%2F%2Fgithub.com%2Fnodeca%2Furl-unshort: https://github.com/nodeca/url-unshort
33 | http://vurl.com/mZGMO: https://github.com/nodeca/url-unshort
34 | http://winpe.notlong.com/: 'http://technet2.microsoft.com/WindowsVista/en/library/08629d0b-56b0-4194-9782-88d01a488ae01033.mspx?mfr=true'
35 | http://wp.me/pBMYe-Lu: https://wptavern.com/?p=2944
36 | http://xurl.es/p8xti: https://github.com/nodeca/url-unshort
37 |
--------------------------------------------------------------------------------