Parse a URL in PHP to return domain


RJ

Please write a short php function to parse a URL and return just the domain name.

Example:
Code:
$url="http://www.namepros.com/showthread.php?p=350493";

$domain = getdomain($url); # domain equals "namepros.com"

Offering a new domain registration.
 
0
•••
HI Anand,

I got it all working with other code I already had, but that did not handle .co.uk or other multi-part extensions. When I saw your code would handle .co.uk I tried it, but could not get it to work using the code below:

$domain = $_SERVER["HTTP_HOST"];
$myurl = parse_url_domain($domain);

If you work it out let me know, I will be happy to donate some NP$ if I can get it working, :)

Thanks
Tom Dahne

Expired Domain Software
http://www.expireddomainsoftware.com
 
0
•••
Whoa! After hours of php debugging, I discovered a certain predicament with my code... using no 'www' in the URL. In this case, the system actually parses and removes the domain, as that's the first target (no 'www' is present, remember...)

I'm in the process of making the parsing code "intelligent" by figuring if a www is present or not, and if not, to leave as is (after regular parsing), if there is, then run the additional parsing :)

And actually, Tom, your problem is related with that, as the Server HTTP HOST variable contains no 'www' :!: So I've developed a preliminary checking mechanism for that too, it should go well ;)

I'll get back to you on this shortly !

Cheers
Anand
 
0
•••
v3.01 FINAL - DomainFind PHP SCRIPT - by Anand A.

Here is the Version 3.01 Final, 100% working, and newest full-featured version of this script. ;) :talk: :]

Now, it's protected by the free GNU Public License, so please respect the property rights and leave the credits.

Info:
PHP:
//+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
// ---  FIND DOMAIN NAME (HOSTNAME) GIVEN ANY TYPE OF URL OR TLD EXTENSION (v3.01)  ---
//+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
//                    --  VERSION 3.01 Final  --
/*
**NOTE: The below comments may be removed in your script. Above must remain intact.
------------------
| Changes/Fixes  |
------------------
+ Added auto removal of 'www' if present
+ Added ANY TLD routine ... CountryCode TLDs work! ex> co.uk, any.tld.ext
+ Added subdomain function ... manual recognition
+ Integrated all routines in function... only end-user input is isolated
+ Fixed SERVER variables, in which "host" is undefined, only "path" array holds
+ Fixed case insensitivity mode
+ Improved Execution of Code via Regular Expressions
*/
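For anyone puzzled by the $_SERVER fix in the list above: PHP's built-in parse_url() only fills in the "host" key when the string has a scheme, so a bare host name such as $_SERVER["HTTP_HOST"] ends up under "path" instead. Here is a minimal sketch of one way around that. The helper name host_from_input() is made up for illustration and is not taken from the attached package.

```php
<?php
// Hypothetical helper, for illustration only: returns the host part of
// either a full URL or a bare host name such as $_SERVER['HTTP_HOST'].
function host_from_input($input) {
    $input = strtolower(trim($input));

    // parse_url() needs a scheme to recognise the host, so add one if missing.
    if (!preg_match('/^[a-z][a-z0-9+.-]*:\/\//', $input)) {
        $input = 'http://' . $input;
    }

    $parts = parse_url($input);
    if ($parts === false || !isset($parts['host'])) {
        return '';
    }

    // Strip a single leading "www." only if it is actually there.
    return preg_replace('/^www\./', '', $parts['host']);
}

$a = parse_url('www.namepros.com');          // only the "path" key is set
$b = parse_url('http://www.namepros.com/');  // the "host" key is set

echo host_from_input('www.bigdomain.co.uk'), "\n";
echo host_from_input('http://WWW.NamePros.com/showthread.php?p=350493'), "\n";
```

This only normalises the host name. Trimming subdomains or recognising .co.uk-style extensions would still need the extra logic discussed in this thread.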

Package Attached to this post. If ever you run into problems, have suggestions, feature improvements or bug reports, simply email me at [email protected] OR PM here, OR post here :)

Any $NP donation is much appreciated, I worked hard developing a nice alternative to the other codes, and a lot of hard work was put into it ensuring it's bug-free !

Leave your comments on this thread as well...

Cheers,
Anand
 
0
•••
Note that if you are in PHP routine and you just want the hostname for the page you are on, it's as simple as this:

PHP:
$host = preg_replace('/^www\./', '', strtolower($_SERVER['HTTP_HOST'])); // strips only a leading "www."
NP$'s accepted if used ;)
 
0
•••
This may seem like a simple job, but it's actually pretty complicated. Even without going into the details of URL obfuscation, host names containing more than [a-z0-9-], people who don't bother to type http://, port numbers, etc., it's still difficult, because there is a fundamental problem when dealing with ccTLDs.

For example, originally domains could be registered as .COM.CN, .NET.CN, etc. Now names can be registered as .CN. The result is that www.net.cn is a domain name, whereas www.ten.cn is a host in the ten.cn domain.

To do what you want accurately, you would need a rule set for every TLD and ccTLD that defines what the TLD really is (i.e. .com.cn and/or .cn). More importantly, you would need to keep it up to date, as the rules change from time to time.
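A rule set of that kind could be as simple as a list of suffixes under which names get registered, where the longest matching suffix wins. A rough sketch, assuming a tiny hand-written list ($suffixes and registered_domain() are made-up names for illustration; a real list would need every TLD and ccTLD, plus regular updates):

```php
<?php
// Illustrative only: a real rule set needs far more entries than this.
$suffixes = array('com', 'net', 'cn', 'com.cn', 'net.cn', 'uk', 'co.uk');

function registered_domain($host, array $suffixes) {
    $labels = explode('.', strtolower(trim($host, '.')));
    $count  = count($labels);

    // Walk from the left so the longest candidate suffix is tried first.
    for ($i = 1; $i < $count; $i++) {
        $candidate = implode('.', array_slice($labels, $i));
        if (in_array($candidate, $suffixes, true)) {
            // The registered domain is the suffix plus one more label.
            return $labels[$i - 1] . '.' . $candidate;
        }
    }
    return ''; // no listed suffix matched
}

echo registered_domain('www.ten.cn', $suffixes), "\n";
echo registered_domain('www.net.cn', $suffixes), "\n";
echo registered_domain('mail.example.com.cn', $suffixes), "\n";
```

With both cn and net.cn in the list, www.net.cn comes back whole while www.ten.cn is trimmed to ten.cn, which is the distinction described above.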

Without doing all that, your best bet is to grab the host name with a simple regex and either strip the www off the front, or take the last 2 or 3 levels of the host name. For example:

"https?:\/\/(www\.)?(.*?)[^a-z0-9\-\.]"
puts the host name, less the www., into $2

"https?:\/\/.*?([a-zA-Z0-9-]{2,67}\.[a-zA-Z]{2,4})[^a-z0-9\-\.]"
puts the last two levels of the host name into $1 (e.g. yahoo.com, co.uk)

"https?:\/\/.*?([a-zA-Z0-9-]{2,67}\.[a-zA-Z0-9-]{2,67}\.[a-zA-Z]{2,4})[^a-z0-9\-\.]"
puts the last three levels of the host name into $1 (e.g. yahoo.co.uk, mail.yahoo.com)

Any combination of those ought to be able to approximate what you want. Personally I use a half baked version of the rule set system described above, which is fine for what I need.
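A rough PHP sketch of how patterns of this shape could be used with preg_match(). The delimiters, the /i flag (so upper-case hosts are not cut short), and the https? prefix reflect how they are written here for PHP, and first_match() is a made-up helper, so treat this as an approximation rather than a finished function:

```php
<?php
// Sketch only: three patterns of the kind described in the post.
$hostNoWww = '/https?:\/\/(www\.)?(.*?)[^a-z0-9\-\.]/i';
$lastTwo   = '/https?:\/\/.*?([a-zA-Z0-9-]{2,67}\.[a-zA-Z]{2,4})[^a-z0-9\-\.]/i';
$lastThree = '/https?:\/\/.*?([a-zA-Z0-9-]{2,67}\.[a-zA-Z0-9-]{2,67}\.[a-zA-Z]{2,4})[^a-z0-9\-\.]/i';

// Each pattern needs one character after the host name, so a "/" is
// appended in case the URL ends right after the host.
function first_match($pattern, $url, $group) {
    return preg_match($pattern, $url . '/', $m) ? strtolower($m[$group]) : '';
}

echo first_match($hostNoWww, 'http://WWW.NamePros.com/showthread.php?p=350493', 2), "\n";
echo first_match($lastTwo, 'http://mail.yahoo.com/inbox', 1), "\n";
echo first_match($lastThree, 'http://www.yahoo.co.uk', 1), "\n";
```

Which pattern gives the "right" answer still depends on the extension, which is exactly the ccTLD problem: the two-level pattern would return co.uk for the last URL.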
 
0
•••
Please excuse my lack of programming knowledge but I have 2 basic questions for all of you and that is I just don't get what use this can be to me?

Could you please explain in easy to understand terms how I may apply and use this and the exact benefit of it. Sorry I do not grasp its usage at all.
 
0
•••
RealNames said:
Please excuse my lack of programming knowledge but I have 2 basic questions for all of you and that is I just don't get what use this can be to me?

Could you please explain in easy to understand terms how I may apply and use this and the exact benefit of it. Sorry I do not grasp its usage at all.
Well, it could be used as part of a larger script that, for example, lets users paste a large number of URLs into a form and strips the domain name out of each one, leaving a nice clean list of domains to do whatever you want with. That's just one example to give you an idea.
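To make that concrete, here is a minimal sketch of the "paste a list of URLs, get a clean list back" idea. It relies on PHP's built-in parse_url() and only strips a leading "www.", so it returns host names rather than true registered domains. clean_host_list() is a made-up name, not one of the functions posted in this thread.

```php
<?php
// Sketch only: turn pasted text (one URL per line) into a clean host list.
function clean_host_list($text) {
    $hosts = array();
    foreach (preg_split('/\R/', $text) as $line) {
        $line = strtolower(trim($line));
        if ($line === '') {
            continue; // skip blank lines
        }
        // parse_url() only reports a host when a scheme is present.
        if (strpos($line, '://') === false) {
            $line = 'http://' . $line;
        }
        $host = parse_url($line, PHP_URL_HOST);
        if (is_string($host) && $host !== '') {
            $hosts[] = preg_replace('/^www\./', '', $host);
        }
    }
    // Drop duplicates and reindex from zero.
    return array_values(array_unique($hosts));
}

$pasted = "http://www.namepros.com/showthread.php?p=350493\n"
        . "namepros.com/index.php\n"
        . "https://www.direct.gov.uk/Homepage/fs/en\n";

print_r(clean_host_list($pasted));
```

In a real form handler, $pasted would come from the submitted field instead of a hard-coded string.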
 
0
•••
deadserious said:
-RJ- said:
Ideally, it should be able to handle all types of domains, including .co.uk and URLS that contain uppercase letters (http://WWW.NamePros.com/ returns namepros.com)

Even subdomains that are not second or third level domain names? So http://sub.sub2.sub3.sub4.sub5.net would be okay? Or are you looking for it to only match real domains?


I solved the problem, don't worry, my script takes care of it all ;)
 
0
•••
PolurNET said:
I solved the problem, don't worry, my script takes care of it all ;)
And I should just take your word for that shouldn't I? Just like the first code you posted was the "best custom function" although you had to go on and post 5 more variations of it? And just like your version 2.7.0 was the "final version, all integrated, and 100% working" although you went on and posted yet "another" final version?

And just like you claimed the code that I posted was "your version 2.5" when in fact it was a modified example off php.net just like I said it was. And if you compare the results from the code I posted and the code you posted you'll see there's quite a difference. And just like you claimed that your version 2.5 results in "nameprosrules.tld" when the actual result of your code in that version is "subdomain.nameprosrules.tld"? But I should just take your word for everything even when you don't even know the results of your own code?

:bah:

And I should just take your word for everything you said although most everything you said was a bunch of inaccurate jargon shouldn't I? :| Actually I don't think so, you may have said something that was accurate, but I sure wouldn't take your word for it considering all the inaccurate jargon you've posted. And if you cared to notice I was not asking you the question that I posted. Maybe if you would have posted with an answer other than one basically telling me I should not post or ask questions just because you say so I would have let it go at that just like I initially did when you inaccurately claimed the code I posted was "your code," although your code didn't even do what you said it did, and the code I posted actually did do what you inaccurately claimed your code did.

And I'm surely not worried about anything, especially not posting or asking questions just because you say I shouldn't. I'll post and ask questions as I feel necessary.
 
0
•••
Whoa, are you feeling okay? :lala:

All i got to say is:

so far ezimedia, and RJ can vouch on my behalf that the code works 100% as described, and the reasons for variations are due to improvements and bug fixes, a natural process in all php scripts. If you have personal problems with me that are hard to solve, then I suggest you re-read the Namepros rules, that state respect & professionalism

Thanks
 
0
•••
PolurNET said:
Whoa, are you feeling okay? :lala:
And I'm the one that needs to understand what respect and professionalism are, yea right. It was just so respectful and professional of you to respond to me as if I'm mentally disturbed or something, hmmm :|

PolurNET said:
All i got to say is:

so far ezimedia, and RJ can vouch on my behalf that the code works 100% as described, and the reasons for variations are due to improvements and bug fixes, a natural process in all php scripts. If you have personal problems with me that are hard to solve, then I suggest you re-read the Namepros rules, that state respect & professionalism

Thanks
And I suggest that you learn what respect and professionalism actually mean. They surely don't mean that I cannot disagree with you or that everyone must believe everything you say nor do they mean plagiarism and inaccuracy. I think you're the one that needs to re-read the Namepros rules. Of course I know what the rules state, I wrote half of them, but it looks like you're only able to selectively understand the rules in a single forum just like you were only able to selectively respond to what I said while ignoring the whole point. And if you cared to notice I never questioned anything about your final code not even whether it works as described or not. This was of no concern to me. I think if you cared to read everything that I said you'll be able to easily understand what I'm saying and why I said it.

And to answer your question since you didn't seem to get it from what I have already said. No I don't have a personal problem with you, and of course I could solve it if I did (The ignore feature works well)

I have a problem with you basically telling everyone who wants to post their code here or ask questions that are not even directed towards you that there is no need to do so because your code is the best, and that their code is your code, and that everyone else is wasting their time and that no one else should even participate just because you say so.
 
0
•••
deadserious said:
-RJ- said:
Ideally, it should be able to handle all types of domains, including .co.uk and URLS that contain uppercase letters (http://WWW.NamePros.com/ returns namepros.com)

Even subdomains that are not second or third level domain names? So http://sub.sub2.sub3.sub4.sub5.net would be okay? Or are you looking for it to only match real domains?

For my purposes, I would want it to return the actual second level domain and strip the subdomains. So your example would return sub5.net.

I can't quite vouch for the latest version of Polur's scripts as I haven't tested it out yet. At the time I needed it, I was able to fill my need with a function based on an earlier version, used together with an existing function I had to trim subdomains, which worked fine!

When I get a chance, I'll try out the new stuff and see what performs best.
 
0
•••
Based on your requests RJ for a script that recognizes ccTLDs (such as ".co.uk" or ".cc1.cc2.com", etc.) the final versions of the scripts were designed.

deadserious here alleges that my first versions of the script do not output exactly what was described. Although it was tested at time of release, a few bugs did appear, which were identified by ezimedia, when using subdomains.

The reason for the three initial versions are simply due to differences in coding, however, the latest one I posted satisfies for RJ's purposes at least, the subdomain removal feature, ccTLDs, $_SERVER variables and everything else as described.

If you still allege that the code I posted is just a copy of php.net, I suggest you get a pair of proper reading glasses with a focal point that actually helps your nearpoint... if I did find one on php.net, I would've obviously stated so clearly, I have no benefit of posting unoriginal code. Indeed I still can't find the place you refer to @ php.net that contains my code. Also, if you also allege I prevented anyone from posting their codes here, well I'll be ****** ! I simply met the coding needs of the original poster, and improved on my version. Your posts seem to be oblivious of this fact, and just repeat the same facts and not one piece of original code that actually was developed by you is posted. Not that it's wrong, it's simply the fact your whining isn't helping the original poster nor solving the script problem(s), if any.

If you believe calling scripts final versions, yet some people who report some minor bug fixes would suddenly discredit the script and say it's "crap", that's the predicament IMO. Bugfixes & improvements are always happening on good products. I just wanted to help people, and did successfully, I really don't know your aim :!:

Okay, signing out.
 
0
•••
Cool, here's some more new stuff. It probably needs some adjustments and such, but I think it at least "almost" works. :tu: You would just need to add/remove the second level domains that you want to be matched. Of course it could be made so you could add/remove these through a form or something, but this is just an example of how or "if" it even functions for now.
PHP:
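<?php 

function getdomain($url) {

    $url = strtolower(trim($url));

    // Second-level suffixes to match. Kept as a plain list so no stray
    // whitespace or line breaks end up inside the regular expression.
    $slds = array(
        'co.uk', 'me.uk', 'net.uk', 'org.uk', 'sch.uk',
        'ac.uk', 'gov.uk', 'nhs.uk', 'police.uk',
        'mod.uk', 'asn.au', 'com.au', 'net.au', 'id.au',
        'org.au', 'edu.au', 'gov.au', 'csiro.au',
    );
    $slds = implode('|', array_map('preg_quote', $slds));

    // Capture the whole host name after the optional scheme.
    if (!preg_match("/^(?:https?:\/\/)?([^\/:?#]+)/", $url, $matches)) {
        return '';
    }
    $host = $matches[1];

    // IP addresses have no domain name, so return nothing for them.
    if (preg_match("/^[0-9.]+$/", $host)) {
        return '';
    }

    if (preg_match("/\.($slds)$/", $host)) {
        // Listed second-level suffix: keep the last three labels.
        $found = preg_match("/[^\.\/]+\.[^\.\/]+\.[^\.\/]+$/", $host, $matches);
    } 
    else {
        // Anything else: keep the last two labels.
        $found = preg_match("/[^\.\/]+\.[^\.\/]+$/", $host, $matches);
    }
    return $found ? $matches[0] : '';
} 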

$url1 = "http://WWW.NAMEPROS.COM/showthread.php?t=53456"; 
$url2 = "https://www.direct.gov.uk/Homepage/fs/en";    
$url3 = "http://horribly.long.subdomains.file.net/index.php"; 
$url4 = "http://66.98.205.16/tester";  # just for fun
$url5 = "http://WWW.somedomain.pros.com.au/dir/file.html"; 
$url6 = "https://www.www2.hosting.org.uk/home/index.php?file=1";    
$url7 = "http://sub.sub2.sub3.sub4.cars.com.au/index.php"; 
$url8 = "http://66.227.205.43/files/index.pl";  # just for fun2
$url9 = "http://WWW.BIGDOMAIN.co.uk/long/file/dir/path/enter.htm"; 
$url10 = "https://www.er.doctors.net.uk/";    
$url11 = "http://number.2.3.4.711.com/"; 
$url12 = "http://66.98.205.16/tester/ip.a/files";  # just for fun3   

echo "<br>url: $url1 <Br>domain: ".getdomain($url1)."<br>";
echo "<br>url: $url2 <Br>domain: ".getdomain($url2)."<br>";
echo "<br>url: $url3 <Br>domain: ".getdomain($url3)."<br>";
echo "<br>url: $url4 <Br>domain: ".getdomain($url4)."<br>";
echo "<br>url: $url5 <Br>domain: ".getdomain($url5)."<br>";
echo "<br>url: $url6 <Br>domain: ".getdomain($url6)."<br>";
echo "<br>url: $url7 <Br>domain: ".getdomain($url7)."<br>";
echo "<br>url: $url8 <Br>domain: ".getdomain($url8)."<br>";
echo "<br>url: $url9 <Br>domain: ".getdomain($url9)."<br>";
echo "<br>url: $url10 <Br>domain: ".getdomain($url10)."<br>";
echo "<br>url: $url11 <Br>domain: ".getdomain($url11)."<br>";
echo "<br>url: $url12 <Br>domain: ".getdomain($url12)."<br>";
?>

The current results from the above example URLs, so you can get an idea, would I think look something like this:

url: http://horribly.long.subdomains.file.net/index.php
domain: file.net

url: http://66.98.205.16/tester
domain:

url: http://WWW.somedomain.pros.com.au/dir/file.html
domain: pros.com.au

url: https://www.www2.hosting.org.uk/home/index.php?file=1
domain: hosting.org.uk

url: http://sub.sub2.sub3.sub4.cars.com.au/index.php
domain: cars.com.au

url: http://66.227.205.43/files/index.pl
domain:

url: http://WWW.BIGDOMAIN.co.uk/long/file/dir/path/enter.htm
domain: bigdomain.co.uk

url: https://www.er.doctors.net.uk/
domain: doctors.net.uk

url: http://number.2.3.4.711.com/
domain: 711.com

url: http://66.98.205.16/tester/ip.a/files
domain:
 
0
•••
armstrong said:
What will be the result for http://domain.com.keyword.net/index.php ? Will it correctly return keyword.net ?

Works with the versions I posted too, you can input subdirectories, $_SERVER vars etc. And the advantage is, IMHO, the script does NOT need to know all the ccTLDs that exist, but instead calculates that info itself. But it depends exactly what the use is for; in mine, you'll need to input if the incoming URL is a subdomain or not (just one more switch when calling function), as it already automatically verifies the "www" presence, and doesn't need to know what kind of domain extensions are present.

Anyway, Peace,
Anand
 
0
•••
HI Anand

I can say your code worked fine for me and you replied to my post quickly with a fix and I got your code to do what I wanted it to and that is all I was worried about.

And I am sure if someone else could not have gotten your code to work you would have been more than happy to help them out also.

I don't really think knocking each other about whose code is better really matters.. Or does it... Well at least not to me :)

Keep up the good work anyways,

Thanks
Tom Dahne
 
0
•••
ds's getdomain function

Code:
getdomain results:
url: http://WWW.NAMEPROS.COM/showthread.php?t=53456
domain: namepros.com

url: https://www.direct.gov.uk/Homepage/fs/en
domain: direct.gov.uk

url: http://horribly.long.subdomains.file.net/index.php
domain: file.net

url: http://66.98.205.16/tester
domain:

url: http://WWW.somedomain.pros.com.au/dir/file.html
domain: pros.com.au

url: https://www.www2.hosting.org.uk/home/index.php?file=1
domain: hosting.org.uk

url: http://sub.sub2.sub3.sub4.cars.com.au/index.php
domain: cars.com.au

url: http://66.227.205.43/files/index.pl
domain:

url: http://WWW.BIGDOMAIN.co.uk/long/file/dir/path/enter.htm
domain: bigdomain.co.uk

url: https://www.er.doctors.net.uk/
domain: doctors.net.uk

url: http://number.2.3.4.711.com/
domain: 711.com

url: http://66.98.205.16/tester/ip.a/files
domain:

Polur version test

Code:
parse_url_domain results:
url: http://WWW.NAMEPROS.COM/showthread.php?t=53456 
domain: namepros.com

url: https://www.direct.gov.uk/Homepage/fs/en 
domain: direct.gov.uk

url: http://horribly.long.subdomains.file.net/index.php 
domain: horribly.long.subdomains.file.net

url: http://66.98.205.16/tester 
domain: 66.98.205.16

url: http://WWW.somedomain.pros.com.au/dir/file.html 
domain: somedomain.pros.com.au

url: https://www.www2.hosting.org.uk/home/index.php?file=1 
domain: www2.hosting.org.uk

url: http://sub.sub2.sub3.sub4.cars.com.au/index.php 
domain: sub.sub2.sub3.sub4.cars.com.au

url: http://66.227.205.43/files/index.pl 
domain: 66.227.205.43

url: http://WWW.BIGDOMAIN.co.uk/long/file/dir/path/enter.htm 
domain: bigdomain.co.uk

url: https://www.er.doctors.net.uk/ 
domain: er.doctors.net.uk

url: http://number.2.3.4.711.com/ 
domain: number.2.3.4.711.com

url: http://66.98.205.16/tester/ip.a/files 
domain: 66.98.205.16

polur, w/ subdomain=true
Code:
parse_url_domain results:
url: http://WWW.NAMEPROS.COM/showthread.php?t=53456 
domain: com

url: https://www.direct.gov.uk/Homepage/fs/en 
domain: gov.uk

url: http://horribly.long.subdomains.file.net/index.php 
domain: long.subdomains.file.net

url: http://66.98.205.16/tester 
domain: 98.205.16

url: http://WWW.somedomain.pros.com.au/dir/file.html 
domain: pros.com.au

url: https://www.www2.hosting.org.uk/home/index.php?file=1 
domain: hosting.org.uk

url: http://sub.sub2.sub3.sub4.cars.com.au/index.php 
domain: sub2.sub3.sub4.cars.com.au

url: http://66.227.205.43/files/index.pl 
domain: 227.205.43

url: http://WWW.BIGDOMAIN.co.uk/long/file/dir/path/enter.htm 
domain: co.uk

url: https://www.er.doctors.net.uk/ 
domain: doctors.net.uk

url: http://number.2.3.4.711.com/ 
domain: 2.3.4.711.com

url: http://66.98.205.16/tester/ip.a/files 
domain: 98.205.16
 
0
•••