This site was created for fun on April 1st 2013 by Christopher Gutteridge (University of Southampton) with a very over-the-top set of claims. Since July 2022, it has been hosted and developed by IS4. Check out the news!
How does it work?
The URI resolver just parses the URI to get the parts, then uses registries and external queries to retrieve information about MIME types, top-level domains and so forth. This data is taken from IANA and Wikidata and is periodically updated.
It accepts any possible URI, including web addresses, email addresses and ISBNs, as well as some more obscure schemes such as Gopher.
URIs
The resources hosted by this site use URIs in the format:
https://w3id.org/uri4uri/<type>/<identifier>
...where <type> is one of uri, scheme, host, part, mime, suffix, urn, well-known, port, protocol, service. Percent-encoding <identifier> is necessary only for ? and # and the usual invalid characters. Note that due to the limitations of w3id.org, you should always use https:// and never escape / (in /uri/, escaped / in the described URI is encoded as %252F).
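The encoding rules above can be sketched in Python. This is only an illustration, not the site's own code: the function name build_uri4uri and the SAFE character set are assumptions (the page does not enumerate "the usual invalid characters"), and existing percent-escapes are assumed to be passed through unchanged.

```python
import re
from urllib.parse import quote

BASE = "https://w3id.org/uri4uri"
TYPES = {"uri", "scheme", "host", "part", "mime", "suffix",
         "urn", "well-known", "port", "protocol", "service"}

# Assumption: characters left untouched inside <identifier>.
# "?" and "#" are deliberately absent, so quote() percent-encodes them;
# "/" is kept because it must never be escaped, and "%" is kept so that
# existing percent-escapes are not encoded a second time.
SAFE = "/:@!$&'()*+,;=%"

def build_uri4uri(kind: str, identifier: str) -> str:
    """Build a resource URI of the form <BASE>/<type>/<identifier>."""
    if kind not in TYPES:
        raise ValueError(f"unknown type: {kind!r}")
    if kind == "uri":
        # An escaped "/" in the described URI is written as %252F.
        identifier = re.sub(r"%2[Ff]", "%252F", identifier)
    return f"{BASE}/{kind}/{quote(identifier, safe=SAFE)}"
```

For instance, build_uri4uri("uri", "http://example.org/a?b=1#c") encodes the ? and # but leaves every / alone.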
Resolving a URI will invoke content negotiation to pick one of Turtle, RDF/XML, JSON-LD or HTML and a 303 redirect to the relevant document. Each URI has an associated set of documents: ttl, rdf, jsonld, nt, html. These have URLs in the following formats:
https://w3id.org/uri4uri/<type>.<format>/<identifier>
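As a rough sketch, the document URL pattern and a content-negotiation request could be built like this in Python. The helper names document_url and negotiation_request are hypothetical, and the Accept values are the standard media types for each format; whether the server matches exactly these strings is an assumption. Nothing is sent over the network here.

```python
from urllib.request import Request

BASE = "https://w3id.org/uri4uri"
FORMATS = ("ttl", "rdf", "jsonld", "nt", "html")

# Assumption: standard media types for the four negotiated formats.
ACCEPT = {
    "ttl": "text/turtle",
    "rdf": "application/rdf+xml",
    "jsonld": "application/ld+json",
    "html": "text/html",
}

def document_url(kind: str, fmt: str, identifier: str) -> str:
    """Address one specific document: <BASE>/<type>.<format>/<identifier>."""
    if fmt not in FORMATS:
        raise ValueError(f"unknown format: {fmt!r}")
    return f"{BASE}/{kind}.{fmt}/{identifier}"

def negotiation_request(resource_uri: str, fmt: str) -> Request:
    """Prepare (but do not send) a request asking for a given format."""
    return Request(resource_uri, headers={"Accept": ACCEPT[fmt]})
```

Passing the prepared request to urllib.request.urlopen would follow the 303 redirect automatically and return the selected document.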
Examples:
https://w3id.org/uri4uri/uri/http://xkcd.com/123/ - URI for "http://xkcd.com/123/"
https://w3id.org/uri4uri/host/totl.net - URI for the domain "totl.net"
https://w3id.org/uri4uri/suffix/pdf - URI for the suffix ".pdf"
https://w3id.org/uri4uri/scheme/ftp - URI for the URI scheme "ftp"
https://w3id.org/uri4uri/mime/text/plain - URI for the MIME Type "text/plain"
https://w3id.org/uri4uri/urn/uuid - URI for the "urn:uuid:" Namespace
https://w3id.org/uri4uri/well-known/void - URI for the "/.well-known/void" Service
https://w3id.org/uri4uri/port/80 - URI for the port 80
https://w3id.org/uri4uri/protocol/tcp - URI for the TCP protocol
Each identifier is normalized before a description is generated. All identifiers except for URIs are converted to lowercase. Domain names are converted to their Unicode-based variants. Invalid characters in URIs are percent-encoded.
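The normalization steps might look roughly like the following Python sketch. It is not the site's implementation: normalize_identifier is a hypothetical name, the set of characters treated as valid in a URI is an assumption, and Python's built-in idna codec is used as a stand-in for whatever Punycode handling the site performs.

```python
from urllib.parse import quote

# Assumption: characters that are valid in a URI and so are left alone
# (reserved characters plus "%"; unreserved ones are never encoded by quote()).
URI_SAFE = ":/?#[]@!$&'()*+,;=%"

def normalize_identifier(kind: str, identifier: str) -> str:
    """Normalize an identifier before a description is generated."""
    if kind == "uri":
        # URIs keep their case; only invalid characters are percent-encoded.
        return quote(identifier, safe=URI_SAFE)
    lowered = identifier.lower()
    if kind == "host":
        try:
            # Convert a Punycode ("xn--") name to its Unicode form.
            return lowered.encode("ascii").decode("idna")
        except UnicodeError:
            # Already Unicode, or not decodable: keep the lowercased name.
            return lowered
    return lowered
```

For example, normalize_identifier("host", "XN--BCHER-KVA.de") yields "bücher.de", while normalize_identifier("uri", "http://Example.org/a b") keeps the capital E and encodes the space as %20.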
How big is it?
Virtually infinite, and still growing! However, since it generates most results on the fly, the size can be compressed quite efficiently to almost 0. The remainder consists of the registries from IANA, which are cached and take about 2.5 MiB of space in total.
What is included
URIs, Internet Domains, MIME Types, File Suffixes, URI Schemes, URN Namespaces, Well-Known URIs, Ports, Protocols.
How do I find the URI4URI which identifies the URL of a page I'm viewing?
Simple: just drag this handy "bookmarklet" into your browser toolbar: URI4URI. When you click it, it runs a teeny tiny bit of JavaScript which takes you to a page telling you about the URL and its uri4uri.
What parts of a URI are supported?
The majority of the effort has gone into calculating the components of http and https URIs. An example showing off all the parts of a URI would be http://foo:bar@bbc.co.uk:80/index.html?a=1&b=2#fragment. However, other URI schemes are also supported, e.g. tel: or secondlife:.
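To see which parts that example URI contains, here is a small Python sketch using the standard library's urlsplit. The uri_parts helper and the keys of its result are illustrative only; the site's own resolver may split and name the components differently.

```python
from posixpath import splitext
from urllib.parse import parse_qsl, urlsplit

def uri_parts(uri: str) -> dict:
    """Split a URI into its components (illustrative key names)."""
    p = urlsplit(uri)
    return {
        "scheme": p.scheme,
        "user": p.username,
        "password": p.password,
        "host": p.hostname,
        "port": p.port,
        "path": p.path,
        # File suffix of the path without the dot, or None if there is none.
        "suffix": splitext(p.path)[1].lstrip(".") or None,
        # Query string as a list of (key, value) pairs.
        "query": parse_qsl(p.query),
        "fragment": p.fragment or None,
    }

parts = uri_parts("http://foo:bar@bbc.co.uk:80/index.html?a=1&b=2#fragment")
```

For a scheme without an authority, such as tel:, the same call returns only a scheme and a path, with the host left as None.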
Can I see the source code?
Sure, you can find the uri4uri source on github.com.
Changelog
This section summarizes the changes to this website since July 2022: