This site was created for fun on April 1st, 2013 by Christopher Gutteridge (University of Southampton) with a very over-the-top set of claims. Since July 2022, it has been hosted and developed by IS4. Check out the news!
How does it work?
The URI resolver just parses the URI to get the parts, then uses registries and external queries to retrieve information about MIME types, top-level domains and so forth. This data is taken from IANA and Wikidata and is periodically updated.
It accepts any possible URI, including web addresses, email addresses and ISBNs, as well as some more obscure schemes such as Gopher.
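As a rough illustration of the "parse, then look up" approach described above, here is a minimal Python sketch. It is not the site's actual implementation: the `SCHEME_REGISTRY` dictionary and the `describe` function are hypothetical stand-ins for the cached IANA and Wikidata registries, and only the standard-library `urllib.parse.urlsplit` is used for parsing.

```python
from urllib.parse import urlsplit

# Hypothetical, hand-written stand-in for a cached scheme registry.
# The real site draws this kind of data from IANA and Wikidata.
SCHEME_REGISTRY = {
    "http": "Hypertext Transfer Protocol",
    "mailto": "Electronic mail address",
    "urn": "Uniform Resource Names",
    "gopher": "The Gopher Protocol",
}

def describe(uri: str) -> dict:
    """Split a URI into parts and attach a registry label for its scheme."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,
        "scheme_label": SCHEME_REGISTRY.get(parts.scheme, "unknown"),
        "host": parts.hostname,  # None for schemes without an authority part
        "path": parts.path,
    }

for uri in (
    "gopher://gopher.example.org/1/world",
    "mailto:someone@example.org",
    "urn:isbn:9780131103628",
):
    print(describe(uri))
```

Schemes without a `//` authority part (such as `mailto:` or `urn:`) simply end up with no host and everything after the scheme in the path, which is why a single parser can accept them all.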
The resources hosted by this site use URIs in the format:
<type> is one of
<identifier> is necessary only for
# and the usual invalid characters. Note that due to the limitations of w3id.org, you should always use
https:// and never escape
/ in the described URI is encoded as
Resolving a URI will invoke content negotiation to pick one of Turtle, RDF/XML, JSON-LD or HTML and a 303 redirect to the relevant document. Each URI has an associated set of documents:
html. These have URLs in the following formats:
https://w3id.org/uri4uri/uri/http://xkcd.com/123/ - URI for "http://xkcd.com/123/"
https://w3id.org/uri4uri/host/totl.net - URI for the domain "totl.net"
https://w3id.org/uri4uri/suffix/pdf - URI for the suffix ".pdf"
https://w3id.org/uri4uri/scheme/ftp - URI for the URI scheme "ftp"
https://w3id.org/uri4uri/mime/text/plain - URI for the MIME Type "text/plain"
https://w3id.org/uri4uri/urn/uuid - URI for the "urn:uuid:" Namespace
https://w3id.org/uri4uri/well-known/void - URI for the "/.well-known/void" Service
https://w3id.org/uri4uri/port/80 - URI for the port 80
https://w3id.org/uri4uri/protocol/tcp - URI for the TCP protocol
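The content negotiation step mentioned earlier (choosing among Turtle, RDF/XML, JSON-LD and HTML before the 303 redirect) could look roughly like the sketch below. The media types are the standard ones for these formats, but the function names, the HTML default and the tie-breaking order are assumptions made for illustration, not the site's documented behaviour. The redirect itself is left out, since it only needs the chosen format.

```python
from typing import List, Optional, Tuple

# Standard media types for the four formats; the order is an assumed
# server-side preference used only to break ties.
SUPPORTED = ["text/html", "text/turtle", "application/rdf+xml", "application/ld+json"]

def parse_accept(header: str) -> List[Tuple[str, float]]:
    """Return (media_range, q) pairs from an Accept header value."""
    ranges = []
    for item in header.split(","):
        fields = [f.strip() for f in item.split(";")]
        media = fields[0].lower()
        if not media:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        ranges.append((media, q))
    return ranges

def specificity(media_range: str) -> int:
    """Rank */* below type/* below an exact media type."""
    if media_range == "*/*":
        return 0
    return 1 if media_range.endswith("/*") else 2

def matches(media_range: str, media_type: str) -> bool:
    if media_range == "*/*":
        return True
    if media_range.endswith("/*"):
        return media_type.startswith(media_range[:-1])
    return media_range == media_type

def negotiate(accept_header: str, default: str = "text/html") -> Optional[str]:
    """Pick the supported type with the highest q, or None if none is acceptable."""
    ranges = parse_accept(accept_header)
    if not ranges:
        return default  # no Accept header: fall back to the assumed default
    best, best_q = None, 0.0
    for media_type in SUPPORTED:
        candidates = [(specificity(r), q) for r, q in ranges if matches(r, media_type)]
        if not candidates:
            continue
        q = max(candidates)[1]  # the most specific matching range decides q
        if q > best_q:
            best, best_q = media_type, q
    return best

print(negotiate("application/ld+json;q=0.9, text/turtle;q=0.5"))
```

A server using this sketch would answer with `303 See Other` and a `Location` header pointing at the document for the chosen format, or with `406 Not Acceptable` when `negotiate` returns `None`.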
Each identifier is normalized before a description is generated. All identifiers except for URIs are converted to lowercase. Domain names are converted to their Unicode-based variants. Invalid characters in URIs are percent-encoded.
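The normalization rules above can be approximated in a few lines of Python. This is a sketch under assumptions: the `kind` names mirror the path segments in the examples, the set of characters treated as valid in URIs is a guess based on the RFC 3986 reserved and unreserved sets, and Python's built-in `idna` codec (IDNA 2003) stands in for whatever conversion the site really uses.

```python
from urllib.parse import quote

# Characters left untouched in URIs: RFC 3986 reserved characters plus "%"
# (so existing escapes survive). Unreserved characters are always kept by quote().
RESERVED_AND_PERCENT = ":/?#[]@!$&'()*+,;=%"

def normalize_identifier(kind: str, identifier: str) -> str:
    """Normalize an identifier roughly as described in the text above."""
    if kind == "uri":
        # URIs keep their case; only invalid characters are percent-encoded.
        return quote(identifier, safe=RESERVED_AND_PERCENT)
    identifier = identifier.lower()
    if kind == "host":
        try:
            # Convert Punycode ("xn--") labels to their Unicode form.
            return identifier.encode("ascii").decode("idna")
        except UnicodeError:
            return identifier  # already Unicode, or not decodable: leave as-is
    return identifier

print(normalize_identifier("host", "XN--MNCHEN-3YA.de"))
print(normalize_identifier("mime", "Text/Plain"))
print(normalize_identifier("uri", "http://example.org/a b"))
```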
How big is it?
Virtually infinite, and still growing! However, since most results are generated on the fly, the size can be compressed pretty efficiently to almost 0. The remainder consists of the registries from IANA, which are cached and take up about 2.5 MiB of space in total.
What is included
URIs, Internet Domains, MIME Types, File Suffixes, URI Schemes, URN Namespaces, Well-Known URIs, Ports, Protocols.
How do I find the URI4URI which identifies the URL of a page I'm viewing?
What parts of a URI are supported?
The majority of the effort has gone into calculating the components of http and https URIs. An example showing off all the parts of a URI would be http://foo:firstname.lastname@example.org:80/index.html?a=1&b=2#fragment. Other URI schemes are supported as well, e.g. tel: or secondlife:.
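To make the parts of that example URI concrete, the following Python sketch splits it with the standard library. It only illustrates which components exist. The site's own parser may name or group them differently, and the `components` helper is hypothetical.

```python
import posixpath
from urllib.parse import parse_qs, urlsplit

def components(uri: str) -> dict:
    """Break an http(s)-style URI into its individual parts."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,
        "path": parts.path,
        "suffix": posixpath.splitext(parts.path)[1],  # file suffix, e.g. ".html"
        "query": parse_qs(parts.query),               # each key maps to a list of values
        "fragment": parts.fragment,
    }

example = "http://foo:firstname.lastname@example.org:80/index.html?a=1&b=2#fragment"
for name, value in components(example).items():
    print(f"{name}: {value}")
```

Several of these parts correspond to resource types listed under "What is included": the scheme, the host, the port and the file suffix.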
Can I see the source code?
Sure, you can find the uri4uri source on github.com.
This section summarizes the changes to this website since July 2022: