I don't think you understand what I mean. I have tried all the parsing platforms; adding and parsing such suffixes is not supported at all. You can test them yourself. I want to know: how do I parse them? If you have a parsing tutorial, please let me know.
Good morning,
Can you give us an example of a parsing program that you are referring to? Multiple references would be great.
But since we now understand your question a bit better, we think we can point you to the answer.
With reference to our previous reply, please see the 'Public Suffix List' (PSL), which can be found here - publicsuffix.org
Parsing of this kind is one of the purposes of that list.
It is an authoritative list of all publicly available domain name suffixes, used across the industry (it is itself a quasi-official standard).
To quote its website -
The Public Suffix List is an initiative of Mozilla, but is maintained as a community resource. It is available for use in any software, but was originally created to meet the needs of browser manufacturers. It allows browsers to, for example:
- Avoid privacy-damaging "supercookies" being set for high-level domain name suffixes
- Highlight the most important part of a domain name in the user interface
- Accurately sort history entries by site
(Note the second point)
To further quote the site -
The Public Suffix List is a cross-vendor initiative to provide an accurate list of domain name suffixes, maintained by the hard work of Mozilla volunteers and by submissions from registries, to whom we are very grateful.
To further quote the site -
Since there was and remains no algorithmic method of finding the highest level at which a domain may be registered for a particular top-level domain (the policies differ with each registry), the only method is to create a list. This is the aim of the Public Suffix List.
To further quote the site -
Some people use the PSL to determine what is a valid domain name and what isn't.
END QUOTES
What is being said is that, without such a list, there is no way to determine which part of a domain name is the 'domain word' (e.g. 'whatever') and which part is the domain extension (e.g. '.com.ru').
So, by maintaining a list, one can find which part of a domain is the 'domain word' and which part is the domain extension.
We presume this is what you mean by parsing (splitting a domain name into the 'domain word' and the domain extension).
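To illustrate the idea, here is a minimal sketch of such a split in Python. The handful of suffixes below is a tiny illustrative excerpt, not the real list - a real implementation would load the full, current list from publicsuffix.org:

```python
# Tiny illustrative excerpt of public suffixes (the real PSL has
# thousands of entries and must be fetched from publicsuffix.org).
PUBLIC_SUFFIXES = {"com", "ru", "com.ru", "uk", "co.uk"}

def split_domain(name: str) -> tuple:
    """Split a domain name into (domain word, extension) by finding
    the longest public suffix that matches the end of the name."""
    labels = name.lower().rstrip(".").split(".")
    # Try progressively shorter suffixes: the longest match wins.
    for i in range(1, len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in PUBLIC_SUFFIXES:
            # The label just before the suffix is the 'domain word'.
            return labels[i - 1], candidate
    # No rule matched: fall back to treating the last label as the suffix.
    return labels[-2] if len(labels) > 1 else labels[0], labels[-1]

print(split_domain("whatever.com.ru"))  # -> ('whatever', 'com.ru')
print(split_domain("example.co.uk"))    # -> ('example', 'co.uk')
```

Note that a naive split on the last dot would wrongly report 'com' as the domain word of 'whatever.com.ru'; only the list tells you that 'com.ru' is a single extension.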
This is why the list is used for the following purposes, among others (quoting the site) -
Chromium/Google Chrome (pre-processing, DAFSA builder, parser)
- Restricting cookie setting
- Determining whether entered text is a search or a website URL
- Determining whether wildcard subdomains are allowed in Origin Trial tokens
Opera
- Restricting cookie setting
- Restricting the setting of the document.domain property
Internet Explorer
- Restricting cookie setting
- Domain highlighting in the URL bar
- Zone determination
- ActiveX opt-in list security restriction
Other Apps
Qt uses it to restrict cookie setting from version 4.7.2 onwards.
WhoisMind uses it to get the domain name out of inputted URLs.
Crawler-Commons is a suite of tools for building a web crawler, and it uses the PSL.
Libraries
C, Perl and PHP: regdom-libs includes libraries for working with the Public Suffix List.
C: libpsl, a fast offline PSL lookup library in C
C: Faup, a command line tool with a C library and Python bindings
C#: Nager.PublicSuffix
Elixir: publicsuffix-elixir
Erlang: publicsuffix_erlang
Go: x/net/publicsuffix
Go: tldextract
Go: publicsuffix-go
Haskell: publicsuffix-haskell
Java: regdom-libs has a Java port too
Java: Guava - Google's core Java libraries - has a PSL-using class
Java: Java API for the Public Suffix List
JavaScript: publicsuffixlist.js
JavaScript: tld.js
TypeScript: tldts
Lua: lua-psl
.NET: Louw.PublicSuffix.
Objective-C: KKDomain
Perl: Domain::PublicSuffix
PHP: php-domain-parser
PHP: TLDExtract
Python: publicsuffix
Python: publicsuffixlist
Python: dnspy - claims to be more flexible.
Ruby: publicsuffix-ruby gem
Rust: publicsuffix
Swift: Dashlane/SwiftDomainParser
There's also a list of libraries in various languages in the comments on this Stack Overflow question.
Standards
DMARC
CAB Forum Baseline Requirements. The Baseline Requirements ban the issuance of wildcard certs where the wildcard is the next label immediately after a registry-controlled label, and suggests using the "ICANN DOMAINS" section of the Public Suffix List for determining what's registry-controlled.
HTML 5 (document.domain)
Other
Let's Encrypt uses it for rate limiting applications to their CA. If you just need an exception from their rate limits, please do not request a change to the list, but instead use their form, linked from their documentation. This is a faster way to achieve what you want.
END QUOTE
Please note the following lines -
Chromium/Google Chrome (pre-processing, DAFSA builder, parser)
Internet Explorer
- Domain highlighting in the URL bar
- Zone determination
Go: tldextract
PHP: php-domain-parser
Swift: Dashlane/SwiftDomainParser
As you can see, this is one of the purposes of the PSL: to determine, when parsing, which part of a domain name is the 'domain word' and which part is the domain extension.
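The libraries listed above implement the full matching semantics of the list, which also include wildcard rules (e.g. '*.ck') and exception rules (e.g. '!www.ck'). Here is a rough Python sketch of that matching, again over a tiny illustrative rule set rather than the real list:

```python
# Tiny illustrative rule set; a real implementation would load the
# full, current list from publicsuffix.org.
RULES = ["com", "ru", "com.ru", "uk", "co.uk", "*.ck", "!www.ck"]

def public_suffix(name: str) -> str:
    """Return the public suffix of a name under PSL-style rules:
    exception rules win outright; otherwise the longest rule wins."""
    labels = name.lower().rstrip(".").split(".")
    matches = []
    for rule in RULES:
        exception = rule.startswith("!")
        rule_labels = rule.lstrip("!").split(".")
        if len(rule_labels) > len(labels):
            continue
        tail = labels[-len(rule_labels):]
        # '*' matches exactly one label at that position.
        if all(r == "*" or r == l for r, l in zip(rule_labels, tail)):
            matches.append((exception, rule_labels))
    if not matches:
        # If no rule matches, the prevailing rule is '*' (last label).
        return labels[-1]
    exceptions = [m for m in matches if m[0]]
    if exceptions:
        # Exception rule: the suffix is the rule minus its leftmost label.
        return ".".join(exceptions[0][1][1:])
    _, best = max(matches, key=lambda m: len(m[1]))
    return ".".join(labels[-len(best):])

def registrable_domain(name: str) -> str:
    """The public suffix plus one more label ('domain word' + extension)."""
    labels = name.lower().rstrip(".").split(".")
    n = len(public_suffix(name).split("."))
    return ".".join(labels[-(n + 1):])

print(public_suffix("whatever.com.ru"))           # -> com.ru
print(public_suffix("foo.anything.ck"))           # -> anything.ck (wildcard)
print(public_suffix("www.ck"))                    # -> ck (exception)
print(registrable_domain("a.b.whatever.com.ru"))  # -> whatever.com.ru
```

This is only a sketch of the published matching rules; for production use, rely on one of the maintained libraries above, which also handle list updates, punycode and edge cases.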
Whichever parsing platforms you are referring to either use some unique algorithm we cannot comment on, or utilize the PSL - possibly an outdated copy of it.
Again, please give examples of these parsing platforms.
But either way, this is the answer to your question.
Check out the 'Public Suffix List' (PSL), which can be found at publicsuffix.org
We invite you to take advantage of our coupon code offers, register a domain, and try our service for yourself.