
whatwg-url




No Maintenance Intended

Python implementation of the WHATWG URL Living Standard.

The latest revision of the standard that this package implements is from August 7th, 2018 (commit 49060c7).

Getting Started

Install the whatwg-url package using pip.

python -m pip install whatwg-url

And use the module like so:

import whatwg_url

url = whatwg_url.parse_url("https://www.google.com")
print(url)
# Url(scheme='https', hostname='www.google.com', port=None, path='', query='', fragment='')

Features

Compatibility with urllib.parse.urlparse()

import whatwg_url

parseresult = whatwg_url.urlparse("https://seth:larson@www.google.com:1234/maps?query=string#fragment")

print(parseresult.scheme)  # 'https'
print(parseresult.netloc)  # 'www.google.com:1234'
print(parseresult.userinfo)  # 'seth:larson'
print(parseresult.path)  # '/maps'
print(parseresult.params)  # ''
print(parseresult.query)  # 'query=string'
print(parseresult.fragment)  # 'fragment'
print(parseresult.username)  # 'seth'
print(parseresult.password)  # 'larson'
print(parseresult.hostname)  # 'www.google.com'
print(parseresult.port)  # 1234
print(parseresult.geturl())  # 'https://seth:larson@www.google.com:1234/maps?query=string#fragment'

URL Normalization

The WHATWG URL specification describes how to normalize URL inputs into usable URLs. Normalization covers percent-encoding, default ports, path segments such as "..", IPv4 and IPv6 addresses, IDNA (2008 and 2003), repeated slashes after the scheme, and more.

import whatwg_url

print(whatwg_url.normalize_url("https://////www.google.com"))  # https://www.google.com
print(whatwg_url.normalize_url("https://www.google.com/dir1/../dir2"))  # https://www.google.com/dir2
print(whatwg_url.normalize_url("https://你好你好"))  # https://xn--6qqa088eba/
print(whatwg_url.normalize_url("https://0Xc0.0250.01"))  # https://192.168.0.1/
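
To make the last example less mysterious, here is a minimal standalone sketch of how the standard's IPv4 parser treats hexadecimal, octal, and short-form hosts. It is a simplified illustration, not this package's code: the helper names `parse_ipv4_number` and `normalize_ipv4_host` are hypothetical, and the real algorithm also decides whether a host should be treated as IPv4 at all, which this sketch skips.

```python
def parse_ipv4_number(part: str) -> int:
    """Parse one dotted part as hex ("0x.."), octal (leading "0"), or decimal."""
    if part == "":
        raise ValueError("empty IPv4 part")
    lowered = part.lower()
    if lowered.startswith("0x"):
        digits, radix = lowered[2:], 16
    elif len(lowered) > 1 and lowered.startswith("0"):
        digits, radix = lowered[1:], 8
    else:
        digits, radix = lowered, 10
    if digits == "":
        return 0  # a bare "0x" counts as zero
    allowed = "0123456789abcdef"[:radix]
    if any(ch not in allowed for ch in digits):
        raise ValueError(f"invalid digit in IPv4 part: {part!r}")
    return int(digits, radix)


def normalize_ipv4_host(host: str) -> str:
    """Return the dotted-decimal form of an IPv4 host (hypothetical helper)."""
    parts = host.split(".")
    if len(parts) > 1 and parts[-1] == "":
        parts.pop()  # tolerate a single trailing dot
    if not 1 <= len(parts) <= 4:
        raise ValueError("an IPv4 host has between one and four parts")
    numbers = [parse_ipv4_number(p) for p in parts]
    # Every part except the last must fit in one byte.
    if any(n > 255 for n in numbers[:-1]):
        raise ValueError("IPv4 part out of range")
    # The last part fills all remaining bytes of the 32-bit address.
    if numbers[-1] >= 256 ** (5 - len(numbers)):
        raise ValueError("last IPv4 part out of range")
    value = numbers[-1]
    for index, n in enumerate(numbers[:-1]):
        value += n * 256 ** (3 - index)
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


print(normalize_ipv4_host("0Xc0.0250.01"))  # 192.168.0.1
```

Here `0Xc0` is hexadecimal 192, `0250` is octal 168, and the final `01` fills the two remaining bytes as `0.1`.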

URL Validation

import whatwg_url

print(whatwg_url.is_valid_url("https://www.google.com"))  # True
print(whatwg_url.is_valid_url("https://www .google.com"))  # False

Relative URLs

HTTP redirects often contain relative URLs (via the Location header) that need to be applied to the current URL. Passing the base parameter lets you give a relative URL as input; it is resolved against the base and returned as a new URL object.

import whatwg_url

url = whatwg_url.parse_url("../dev?a=1#f", base="https://www.google.com/maps")
print(url.href)  # https://www.google.com/dev?a=1#f
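
As a rough illustration of the redirect use case, the sketch below resolves a chain of Location values, each against the URL that produced it. It uses the standard library's `urllib.parse.urljoin` as a stand-in so that it runs without this package installed; `urljoin` follows RFC 3986 rather than the WHATWG standard, so results may differ on edge cases. The `resolve_redirects` helper is hypothetical, and with this package the loop body would presumably be `whatwg_url.parse_url(location, base=current).href` instead.

```python
from urllib.parse import urljoin


def resolve_redirects(start_url, locations):
    """Apply each Location header value to the current URL in turn."""
    current = start_url
    for location in locations:
        # A relative Location is resolved against the URL that returned it;
        # an absolute Location simply replaces the current URL.
        current = urljoin(current, location)
    return current


print(resolve_redirects("https://www.google.com/maps", ["../dev?a=1#f"]))
# https://www.google.com/dev?a=1#f
```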

URL Property Mutators

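Modifying a property on a URL object runs the parser with "state overrides" so that the URL object is mutated according to the standard. In the example below, changing the scheme to 'https' clears the port because 443 is the default port for https.

import whatwg_url

url = whatwg_url.parse_url("http://www.google.com:443")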

print(url.scheme)  # 'http'
print(url.port)  # 443

url.scheme = 'https'

print(url.scheme)  # 'https'
print(url.port)  # None
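
The port behaviour above can be approximated in a few lines. This is a simplified sketch, not the package's implementation: the `TinyUrl` class and its `set_scheme` method are hypothetical, and the real scheme setter goes through the full parser and enforces further rules (for example, around switching between special and non-special schemes). The default ports are those the standard lists for special schemes.

```python
from dataclasses import dataclass
from typing import Optional

# Default ports for the standard's special schemes ("file" has none).
DEFAULT_PORTS = {"ftp": 21, "http": 80, "https": 443, "ws": 80, "wss": 443}


@dataclass
class TinyUrl:
    """Hypothetical stand-in for a URL object, holding just three fields."""
    scheme: str
    hostname: str
    port: Optional[int] = None

    def set_scheme(self, new_scheme: str) -> None:
        self.scheme = new_scheme.lower()
        # A port equal to the new scheme's default is redundant, so drop it.
        if self.port == DEFAULT_PORTS.get(self.scheme):
            self.port = None


url = TinyUrl(scheme="http", hostname="www.google.com", port=443)
url.set_scheme("https")
print(url.scheme, url.port)  # https None
```

A non-default port such as 8080 would survive the same scheme change untouched.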

"Splatable"

The module is a single file, which makes it easy to vendor into other projects.

License

Apache-2.0