You can install this via the published NPM package:
npm i beam-uri
A complete definition of what constitutes a valid URL can be found in RFC 3986 and RFC 3987. The short version is that a valid URL must, at minimum, consist of a scheme (such as https://, ftp://, or gopher://) and a host name. If it does not, validation should fail, and the browser should throw an error.
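As a rough illustration of that minimum requirement (a sketch, not this package's implementation), the check can be approximated with the WHATWG `URL` constructor, which throws a `TypeError` when a string cannot be parsed. The helper name `hasSchemeAndHost` is hypothetical.

```javascript
// Hypothetical helper, not part of beam-uri: checks only the minimum described
// above, namely that the string parses and has both a scheme and a host name.
function hasSchemeAndHost(input) {
  let parsed;
  try {
    parsed = new URL(input); // throws TypeError if the string cannot be parsed
  } catch (err) {
    return false;
  }
  return parsed.protocol.length > 1 && parsed.hostname.length > 0;
}

console.log(hasSchemeAndHost("https://example.com/path")); // true
console.log(hasSchemeAndHost("ftp://ftp.example.com"));    // true
console.log(hasSchemeAndHost("example.com/path"));         // false (no scheme)
console.log(hasSchemeAndHost("mailto:user@example.com"));  // false (no host)
```

A parser-based check like this is more permissive than a strict validation regex, so it is best read as a first filter rather than a full validator.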
A URL string is a structured string containing multiple meaningful components. When parsed, a URL object is returned containing properties for each of these components.
The Node.js url module provides two APIs for working with URLs: a legacy API that is Node.js specific, and a newer API that implements the same WHATWG URL Standard used by web browsers.
┌────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                              href                                              │
├──────────┬──┬─────────────────────┬────────────────────────┬───────────────────────────┬───────┤
│ protocol │  │        auth         │          host          │           path            │ hash  │
│          │  │                     ├─────────────────┬──────┼──────────┬────────────────┤       │
│          │  │                     │    hostname     │ port │ pathname │     search     │       │
│          │  │                     │                 │      │          ├─┬──────────────┤       │
│          │  │                     │                 │      │          │ │    query     │       │
"  https:   //    user   :   pass   @ sub.example.com : 8080   /p/a/t/h  ?  query=string   #hash "
│          │  │          │          │    hostname     │ port │          │                │       │
│          │  │          │          ├─────────────────┴──────┤          │                │       │
│ protocol │  │ username │ password │          host          │          │                │       │
├──────────┴──┼──────────┴──────────┼────────────────────────┤          │                │       │
│   origin    │                     │         origin         │ pathname │     search     │ hash  │
├─────────────┴─────────────────────┴────────────────────────┴──────────┴────────────────┴───────┤
│                                              href                                              │
└────────────────────────────────────────────────────────────────────────────────────────────────┘
(All spaces in the "" line should be ignored. They are purely for formatting.)
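To make the lower half of the diagram concrete, the following sketch parses the same example URL with the WHATWG `URL` class, which is available as a global in Node.js and in browsers, and prints the component properties.

```javascript
// Parse the example URL from the diagram (with the formatting spaces removed).
const exampleUrl = new URL(
  "https://user:pass@sub.example.com:8080/p/a/t/h?query=string#hash"
);

console.log(exampleUrl.protocol); // "https:"
console.log(exampleUrl.username); // "user"
console.log(exampleUrl.password); // "pass"
console.log(exampleUrl.host);     // "sub.example.com:8080"
console.log(exampleUrl.hostname); // "sub.example.com"
console.log(exampleUrl.port);     // "8080"
console.log(exampleUrl.pathname); // "/p/a/t/h"
console.log(exampleUrl.search);   // "?query=string"
console.log(exampleUrl.hash);     // "#hash"
console.log(exampleUrl.origin);   // "https://sub.example.com:8080"
```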
We can extract the domain from a URL by building on our method for parsing the hostname. Since the getHostName() method gets us very close to a solution, we only need to remove the sub-domain and handle special cases (such as .co.uk).
Returns: String - the extracted domain
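The idea of dropping the sub-domain while handling multi-part suffixes can be sketched as follows. This is an illustration under stated assumptions, not the package's code: the helper name `sketchDomain` and the small suffix sample are hypothetical, and a robust implementation would typically consult a full public suffix list.

```javascript
// Tiny illustrative sample of two-label suffixes; a real list is far longer.
const TWO_LABEL_SUFFIXES = new Set(["co.uk", "org.uk", "com.au", "co.jp"]);

// Hypothetical helper: reduces a hostname to its registrable domain.
function sketchDomain(hostname) {
  const labels = hostname.toLowerCase().split(".");
  if (labels.length <= 2) return labels.join(".");
  // Keep three labels when the last two form a known suffix such as "co.uk".
  const lastTwo = labels.slice(-2).join(".");
  const keep = TWO_LABEL_SUFFIXES.has(lastTwo) ? 3 : 2;
  return labels.slice(-keep).join(".");
}

console.log(sketchDomain("sub.example.com"));    // "example.com"
console.log(sketchDomain("shop.example.co.uk")); // "example.co.uk"
```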
Extract the main domain without the .domain notation
Returns: String - the extracted domain
Extracting the hostname from a URL is generally easier than parsing the domain. The hostname of a URL consists of the entire domain plus any sub-domain. We can parse this with a regular expression that captures everything to the right of the double slash, up to the next slash, colon, or question mark. We remove the "www" prefix (and any trailing integers, e.g. www2), as it is typically not needed when parsing the hostname from a URL.
Returns: String - the extracted hostname
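A regular-expression approach along these lines might look like the sketch below. The pattern and the helper name `sketchHostName` are assumptions made for illustration; the package's actual getHostName() implementation is not shown in this document.

```javascript
// Illustrative only: not the package's actual getHostName() implementation.
function sketchHostName(url) {
  // Optional "scheme://" (or bare "//"), optional "user:pass@", then capture
  // everything up to the first ":", "/", "?" or "#".
  const pattern = /^(?:(?:[a-z][a-z0-9+.-]*:)?\/\/)?(?:[^@\/?#]*@)?([^:\/?#]+)/i;
  const match = pattern.exec(url);
  if (!match) return null;
  // Drop a leading "www", "www2", ... label.
  return match[1].toLowerCase().replace(/^www\d*\./, "");
}

console.log(sketchHostName("https://www2.blog.example.com:8080/a?b=1")); // "blog.example.com"
console.log(sketchHostName("http://user:pass@www.example.com/x"));       // "example.com"
```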
Identify if the link is for a social website
Kind: global function
Validate if a passed string is a valid IP according to: http://jsfiddle.net/AJEzQ/
Returns: Boolean - indication if the string is a valid IP or not
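For comparison, Node.js ships a built-in check that covers the same need. The sketch below uses `net.isIP` instead of the regular expression referenced above, so it is an alternative technique rather than this package's method; the wrapper name `isValidIp` is hypothetical.

```javascript
const net = require("net");

// Hypothetical wrapper: net.isIP returns 4 for IPv4, 6 for IPv6, and 0 otherwise.
function isValidIp(input) {
  return net.isIP(input) !== 0;
}

console.log(isValidIp("192.168.0.1")); // true
console.log(isValidIp("::1"));         // true
console.log(isValidIp("256.1.1.1"));   // false
console.log(isValidIp("example.com")); // false
```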
Validate if a passed string is a valid URI according to: https://gist.github.com/dperini/729294
Returns: Boolean - indication if the string is a valid URI or not
Normalizes and canonicalises URLs, including data URLs. The function first normalizes the URL through a series of steps, from lower-casing to encoding. It then strips any trackers and padding from the URL. Finally, it tries to canonicalise the URL where possible, based on configuration that depends on the domain name.
Returns: String - the normalized and canonical URL
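Some of the basic normalization steps (lower-casing the scheme and host, dropping the default port, resolving dot segments, percent-encoding) come for free from the WHATWG `URL` parser. The sketch below shows only that baseline; the helper name `sketchNormalize` is hypothetical, and the package's own steps, tracker stripping, and domain-specific canonicalisation are not reproduced here.

```javascript
// Hypothetical helper: relies solely on the normalization that the WHATWG URL
// parser performs while parsing and serializing.
function sketchNormalize(input) {
  return new URL(input.trim()).href;
}

console.log(sketchNormalize("HTTP://Example.COM:80/a/./b/../c d?q=1"));
// "http://example.com/a/c%20d?q=1"
```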
Removes tracking query parameters from the URL
Returns: String - strippedUrl, the URL after tracker stripping
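Tracker stripping of this kind can be sketched with `URLSearchParams`. The parameter list below (the `utm_` prefix, `fbclid`, `gclid`) is an assumed sample for illustration, and `sketchStripTrackers` is a hypothetical name; the list this package actually uses is not shown in this document.

```javascript
// Assumed tracker list for illustration only.
const TRACKER_PREFIXES = ["utm_"];
const TRACKER_NAMES = new Set(["fbclid", "gclid"]);

function sketchStripTrackers(input) {
  const url = new URL(input);
  // Copy the keys first so that deleting does not disturb the iteration.
  for (const key of [...url.searchParams.keys()]) {
    const lower = key.toLowerCase();
    const isTracker =
      TRACKER_NAMES.has(lower) ||
      TRACKER_PREFIXES.some((prefix) => lower.startsWith(prefix));
    if (isTracker) url.searchParams.delete(key);
  }
  return url.toString();
}

console.log(sketchStripTrackers("https://example.com/a?id=7&utm_source=news&fbclid=abc"));
// "https://example.com/a?id=7"
```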
Parses a valid URI into its subparts
Returns: Object - the parsed URL
- In search of the perfect URL validation regex
- uri-js: An RFC 3986 compliant, scheme extendable URI parsing/validating/normalizing/resolving library for JavaScript
- regex-weburl: Regular Expression for URL validation
- parse-domain: Splits a URL into sub-domain, domain and the top-level domain. Provides TypeScript typings
- normalize-url: Normalize a URL