Parse a URL in PHP to return domain

RJ

Please write a short php function to parse a URL and return just the domain name.

Example:
Code:
$url="http://www.namepros.com/showthread.php?p=350493";

$domain = getdomain($url); # domain equals "namepros.com"

Offering a new domain registration.
 
0
•••
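<?php
// Naive approach: assumes the URL looks like scheme://www.name.tld/...
function getdomain($url)
{
$explode = explode(".", $url);
$tld = $explode[2];
$tld = explode("/", $tld); // drop the path that follows the TLD
$name = $explode[1];
return "$name.$tld[0]"; // return the value so the caller can print or store it
}

print(getdomain("http://www.namepros.com/showthread.php?p=350493"));

?>

This should work.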
 
1
•••
Wait, working on a better one, RJ :D
 
1
•••
PHP:
//Copyright 2004 by Anand A. For authorized use by -RJ- Only :D
// www.polurnet.com
/*=================================
----EDIT WITH DESIRED URL BELOW--- */

$myurl = "http://www.polurnet.com";

//=====NO More Editing Below Required!==========

// Return the host part of the URL with 'www.' removed
function parse_url_domain ($url) {
$parsed = parse_url($url);
$hostname = $parsed['host'];
return str_replace('www.', '', $hostname);
}

echo parse_url_domain($myurl);

Example:

$myurl = "http://www.polurnet.com";

Returns
Code:
polurnet.com

The best custom function, and never fails :D Tested for 1 hour too!

Cheers,
Anand
 
1
•••
Would have taken this on if I saw it earlier :) Nice function Polur!
 
0
•••
VERSION 2.0

Example:

$myurl = "http://subdomain.mycrazydomain.anytld";

Returns
Code:
mycrazydomain.anytld


PHP:
//Copyright 2004 by Anand A. For authorized use by -RJ- Only :D
// www.polurnet.com
// VERSION 2.0  -- ANY SUBDOMAIN REMOVAL!

function parse_url_domain ($url) {
$parsed = parse_url($url);
$hostname = $parsed['host'];
return $hostname;
}

$raw_url = parse_url("http://subdomain.mycrazydomain.anytld");
preg_match ("/\.([^\/]+)/", $raw_url['host'], $domain_only);
echo $domain_only[1];

:cy:
 
0
•••
VERSION 2.5

USING FAST REGULAR EXPRESSION SEARCH ONLY!

PHP:
//Copyright 2004 by Anand A. For authorized use by -RJ- Only :D
// www.polurnet.com

//SET URL HERE
$myurl = "http://subdomain.nameprosrules.tld/index.html";

// get host name
preg_match("/^(http:\/\/)?([^\/]+)/i",
   $myurl, $domain_only);
$host = $domain_only[2];

// keep only the last two segments of the host name
preg_match("/[^\.\/]+\.[^\.\/]+$/", $host, $domain_only);
$host = $domain_only[0];
echo "Domain Name is:  " . $host;

output:
Domain Name is:  nameprosrules.tld
 
0
•••
Ideally, it should be able to handle all types of domains, including .co.uk, and URLs that contain uppercase letters (http://WWW.NamePros.com/ returns namepros.com).

So far, this one seems to be working well. It is based on Polur's first submission. I was using a function similar to axilant's explode method before, but those third-level ccTLDs would trip it up.

PHP:
function getdomain($url) { 
$parsed = parse_url($url); 
return str_replace('www.','', strtolower($parsed['host'])); 
}

See any problems with that? Simply want to give it a url at anytime and return the domain name associated with it, ie:

$domain = getdomain($url);
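
One possible catch, offered as a sketch rather than a replacement for the function above: str_replace() removes 'www.' wherever it occurs in the host, not only at the start. The variant below, under the hypothetical name getdomain_leading_www(), strips a single leading 'www.' and returns an empty string when parse_url() finds no host.

```php
<?php
// Hypothetical variant: strip only a leading "www." from the host.
function getdomain_leading_www($url) {
    // With PHP_URL_HOST, parse_url() returns the host string,
    // or null/false when no host can be found.
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host)) {
        return '';
    }
    return preg_replace('/^www\./', '', strtolower($host));
}

echo getdomain_leading_www("http://WWW.NamePros.com/"), "\n"; // namepros.com
echo getdomain_leading_www("http://awww.example.com/"), "\n"; // awww.example.com
```

With the str_replace() version, the second URL would come back as aexample.com, because the 'www.' inside 'awww.' is removed as well.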
 
0
•••
Version 2 and Version 2.5 are what I recommend, RJ. They handle ALL SUBDOMAINS (meaning they will remove the subdomain whether it is 'www', 'blablah', 'forum', etc.). They will also accept ALL TYPES of TLDs... you name it, you got it. That's why the examples are posted with 'anytld' as the extension :D

Version 1 will only remove 'www', which is not an ideal situation.

Version 2.5 would be the fastest version, as it only performs a complicated, albeit fast, regular expression operation. The only thing is that I didn't make it a function, but let me know if you want one.

You pick :D

BTW: All case INSENSITIVE, I believe. If not, let me know :wave:
 
0
•••
Could you put that in a function that handles it all?

Template for it,

PHP:
<?
function getdomain($url) {

#
# your function here
#

return $domain;
}

$url1 = "http://WWW.NAMEPROS.COM/showthread.php?t=53456";
$url2 = "https://www.direct.gov.uk/Homepage/fs/en";  
$url3 = "http://horribly.long.subdomains.file.net/index.php";
$url4 = "http://66.98.205.16/tester";  # just for fun

echo "<br>url: $url1 <Br>domain: ".getdomain($url1)."<br>";
echo "<br>url: $url2 <Br>domain: ".getdomain($url2)."<br>";
echo "<br>url: $url3 <Br>domain: ".getdomain($url3)."<br>";
echo "<br>url: $url4 <Br>domain: ".getdomain($url4)."<br>";
?>
 
0
•••
::Proof of All Domain (Even multi-dots) Compatibility with Version 1.0, 2.0, 2.5::

Say my URL is "http://forum.nameprosrules.co.uk"

Returns: "nameprosrules.co.uk" !!!
=========

For Case Insensitivity, Modify above last echo line by:

{Version 2.0}
PHP:
echo strtolower($domain_only[1]);

{Version 2.5}
PHP:
echo "Domain Name is:  " . strtolower($host);

Hope it helps

Cheers,
Anand
 
0
•••
-RJ- said:
Could you put that in a function that handles it all?

Template for it,

Actually, on second thought, Version 2.0 (using a combination of regexp and function) would remain the most stable. The 'fastest' one may cause unexpected trouble, as it's a PURE regexp with strict rules. So stick with the second one, with everything coded as a function. The above post also verifies all possible cases.
 
0
•••
Hold on, I'm going to modify the function, so that only the URL needs to be inputted, and no preg_match is done outside of function :D
 
0
•••
PolurNET said:
Hold on, I'm going to modify the function, so that only the URL needs to be inputted, and no preg_match is done outside of function :D

Nice, thanks. The less that needs to be done outside the function, the better, so I can just throw a .getdomain($url). in my code anywhere.
 
0
•••
Final Version 2.70 - The Best!

Version 2.70

PHP:
//Copyright 2004 by Anand A. For authorized use by -RJ- Only
//+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
// -----  FIND DOMAIN NAME ONLY (HOSTNAME) GIVEN ANY TYPE OF URL OR TLD  -------------
//+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
// | ADVERTISEMENT | Visit : www.polurnet.com
// VERSION 2.70  -- Integrated to One Function... Input any URL and go!!

function parse_url_domain ($url) {
$raw_url= parse_url($url);
preg_match ("/\.([^\/]+)/", $raw_url['host'], $domain_only);
return strtolower($domain_only[1]);
}

/// Put URL Of Interest Here :::   ///
$myurl = parse_url_domain ("http://subdomain.mycrazydomain.any.tld/index.php");

/// Output : Case  lowered, only domain + tld shown

echo $myurl;

// Enjoy!


Okay, the above is the final version, all integrated, and 100% working :hehe:

The function that you should call to execute the parsing: parse_url_domain() (Input: Mixed String, Output: Mixed String)
Let me know if you need anything else, or more help :)

Cheers,
Anand
 
0
•••
I just took an example off php.net, modified it to grab domains out of https URLs as well as http, and put it in a function the way I think you wanted it. It would probably take only a simple modification to the regex if you wanted to skip URLs that are just IP addresses.

PHP:
<? 

function getdomain($url) {

    preg_match (
        "/^(http:\/\/|https:\/\/)?([^\/]+)/i",
        $url, $matches
    );

    $host = $matches[2]; 

    preg_match (
        "/[^\.\/]+\.[^\.\/]+$/", 
        $host, $matches
    );
    
    return strtolower("{$matches[0]}");
} 

$url1 = "http://WWW.NAMEPROS.COM/showthread.php?t=53456"; 
$url2 = "https://www.direct.gov.uk/Homepage/fs/en";   
$url3 = "http://horribly.long.subdomains.file.net/index.php"; 
$url4 = "http://66.98.205.16/tester";  # just for fun 

echo "<br>url: $url1 <Br>domain: ".getdomain($url1)."<br>"; 
echo "<br>url: $url2 <Br>domain: ".getdomain($url2)."<br>"; 
echo "<br>url: $url3 <Br>domain: ".getdomain($url3)."<br>"; 
echo "<br>url: $url4 <Br>domain: ".getdomain($url4)."<br>";

?>
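
The IP-address case mentioned in this post could also be handled outside the regex. A minimal sketch, assuming the hypothetical function name getdomain_or_ip() and PHP's built-in filter_var() with FILTER_VALIDATE_IP; it is meant for plain IPv4 hosts like the test URL and does not attempt bracketed IPv6 literals.

```php
<?php
// Hypothetical helper: same two-step idea as getdomain() above,
// but an IP address is returned unchanged instead of being cut in half.
function getdomain_or_ip($url) {
    // Capture the host after an optional http:// or https:// prefix,
    // stopping at the first slash or port separator.
    if (!preg_match('/^(?:https?:\/\/)?([^\/:]+)/i', $url, $matches)) {
        return '';
    }
    $host = strtolower($matches[1]);

    // An IP address has no domain part to extract, so hand it back as-is.
    if (filter_var($host, FILTER_VALIDATE_IP) !== false) {
        return $host;
    }

    // Otherwise keep the last two dot-separated labels.
    if (preg_match('/[^.]+\.[^.]+$/', $host, $matches)) {
        return $matches[0];
    }
    return $host;
}

echo getdomain_or_ip("http://66.98.205.16/tester"), "\n";
echo getdomain_or_ip("https://www.direct.gov.uk/Homepage/fs/en"), "\n";
```

Like the function above, this still reduces www.direct.gov.uk to gov.uk, since it only ever keeps two labels.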
 
0
•••
Nice, I would have replied too had I seen this but I'm a newb here :)
 
0
•••
deadserious, yup, that's my version 2.5 above, but I didn't notice php.net already having it, so I made my own :D
 
0
•••
HI

Interesting thread. It seems very close to what I am looking for: code that will support .co.uk domains. But which variable do I need to use to do the following?

$myurl =$_SERVER["HTTP_HOST"];

I tried the following,

$myurl = parse_url_domain($_SERVER["HTTP_HOST"]);

and also

$domain = $_SERVER["HTTP_HOST"];
$myurl = parse_url_domain($domain);

echo $myurl;

But it did not work. Any ideas?

The reason I want this to work is that I have multiple domains parked on one website with the same DNS, and I want the domain the visitor has in their browser to load in a special popunder I have...

I have it working with all extensions except .co.uk, or anything with a multi-part extension on the end of the domain.

Does that make sense? If not, sorry, I'm still a newbie to PHP.

Thanks
Tom Dahne

Expire Domain Software
http://www.expireddomainsoftware.com
 
0
•••
Hi Tom,

If I'm not mistaken it might be caused by a lack of quotes around the server call code, when defining $domain.

But I'll try with the server variable you said, and get back to you :)

:hehe: if you liked this code too, I'd appreciate a small donation of NPs if you don't mind :)

Cheers
Anand
 
0
•••
HI Anand,

Well, I got it all working with other code I already had, but it did not work with .co.uk or multi-part extensions. When I saw your code would handle .co.uk I tried it, but could not get it to work using the code below.

$domain = $_SERVER["HTTP_HOST"];
$myurl = parse_url_domain($domain);

If you work it out let me know, I will be happy to donate some NP$ if I can get it working, :)

Thanks
Tom Dahne

Expired Domain Software
http://www.expireddomainsoftware.com
 
0
•••
Whoa! After hours of php debugging, I discovered a certain predicament with my code... using no 'www' in the URL. In this case, the system actually parses and removes the domain, as that's the first target (no 'www' is present, remember...)
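
The failure is easy to reproduce with the Version 2.70 pattern: it keeps everything after the first dot in the host, so a host with no subdomain loses its name. A short demonstration, using the hypothetical name after_first_dot() for the same logic with two added guards:

```php
<?php
// Same logic as Version 2.70, under a hypothetical name for illustration.
function after_first_dot($url) {
    $raw_url = parse_url($url);
    if (!isset($raw_url['host'])) {
        return ''; // no host found, e.g. the URL had no scheme
    }
    // Capture everything that follows the first dot in the host.
    if (!preg_match("/\.([^\/]+)/", $raw_url['host'], $domain_only)) {
        return '';
    }
    return strtolower($domain_only[1]);
}

echo after_first_dot("http://www.namepros.com/"), "\n"; // namepros.com
echo after_first_dot("http://namepros.com/"), "\n";     // com
```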

I'm in the process of making the parsing code "intelligent" by figuring if a www is present or not, and if not, to leave as is (after regular parsing), if there is, then run the additional parsing :)

And actually, Tom, your problem is related with that, as the Server HTTP HOST variable contains no 'www' :!: So I've developed a preliminary checking mechanism for that too, it should go well ;)
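
There is a second factor with $_SERVER["HTTP_HOST"] that can be checked directly: given a bare host name with no scheme, parse_url() returns the whole string under 'path' and sets no 'host' key at all. A minimal sketch of one workaround, assuming a hypothetical helper named host_from_input() that prepends 'http://' when no scheme is present:

```php
<?php
// Hypothetical helper: accept either a full URL or a bare host such as
// the value of $_SERVER['HTTP_HOST'], and return the lowercased host.
function host_from_input($input) {
    // Without a scheme, parse_url() treats the whole string as a path,
    // so add one before parsing.
    if (!preg_match('/^[a-z][a-z0-9+.-]*:\/\//i', $input)) {
        $input = 'http://' . $input;
    }
    $host = parse_url($input, PHP_URL_HOST);
    return is_string($host) ? strtolower($host) : '';
}

print_r(parse_url("example.co.uk"));  // only a 'path' entry, no 'host'
echo host_from_input("example.co.uk"), "\n";
echo host_from_input("http://WWW.Example.co.uk/page"), "\n";
```

The result is still the full host name; deciding how many labels to keep is a separate step.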

I'll get back to you on this shortly !

Cheers
Anand
 
0
•••
v3.01 FINAL - DomainFind PHP SCRIPT - by Anand A.

Here is the Version 3.01 Final, 100% working, and newest full-featured version of this script. ;) :talk: :]

Now, it's protected by the free GNU General Public License, so please respect the property rights and leave the credits.

Info:
PHP:
//+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
// ---  FIND DOMAIN NAME (HOSTNAME) GIVEN ANY TYPE OF URL OR TLD EXTENSION (v3.01)  ---
//+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
//                    --  VERSION 3.01 Final  --
/* NOTE: The below comments may be removed in your script. Above must remain intact.
------------------
| Changes/Fixes  |
------------------
+ Added auto removal of 'www' if present
+ Added ANY TLD routine ... CountryCode TLDs work! ex> co.uk, any.tld.ext
+ Added subdomain function ... manual recognition
+ Integrated all routines in function... only end-user input is isolated
+ Fixed SERVER variables, in which "host" is undefined, only "path" array holds
+ Fixed case insensitivity mode
+ Improved Execution of Code via Regular Expressions
*/

Package Attached to this post. If ever you run into problems, have suggestions, feature improvements or bug reports, simply email me at [email protected] OR PM here, OR post here :)

Any $NP donation is much appreciated, I worked hard developing a nice alternative to the other codes, and a lot of hard work was put into it ensuring it's bug-free !

Leave your comments on this thread as well...

Cheers,
Anand
 
0
•••
Note that if you are in a PHP routine and just want the hostname for the page you are on, it's as simple as this:

PHP:
$host=str_replace('www.', '', strtolower($_SERVER['HTTP_HOST']));
NP$'s accepted if used ;)
 
0
•••
This may seem like a simple job, but it's actually pretty complicated. Even without going into the details of URL obfuscation, host names containing more than [a-z0-9-.], people who don't bother to use HTTP://, port numbers, etc., it's still difficult. There is a fundamental problem when dealing with ccTLDs.

For example, originally domains could be registered as .COM.CN, .NET.CN, etc. Now names can be registered as .CN. The result is that www.net.cn is a domain name, whereas www.ten.cn is a host in the ten.cn domain.

To accurately do what you want, you would need to include a rule set for every TLD and ccTLD to define what the TLD really is( ie com.cn and/or .cn). More importantly you need to keep this up to date as the rules change from time to time.
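
The rule-set idea can be sketched as a lookup against a list of known suffixes. The list below is a tiny hand-picked sample for illustration only, and the function name registrable_domain() is hypothetical; a real rule set would need to cover every TLD and ccTLD and be kept current, as noted above.

```php
<?php
// Hypothetical sketch: find the longest known suffix that the host ends
// with, then keep exactly one more label to the left of it.
function registrable_domain($host, array $suffixes) {
    $labels = explode('.', strtolower($host));
    $count = count($labels);

    // Try the longest candidate suffix first (everything but the first label).
    for ($i = 1; $i < $count; $i++) {
        $candidate = implode('.', array_slice($labels, $i));
        if (in_array($candidate, $suffixes, true)) {
            return $labels[$i - 1] . '.' . $candidate;
        }
    }
    return null; // single-label host, or no rule matched
}

// Illustrative sample only, not a complete or current rule set.
$suffixes = ['com', 'net', 'uk', 'co.uk', 'gov.uk', 'cn', 'com.cn', 'net.cn'];

echo registrable_domain('www.direct.gov.uk', $suffixes), "\n"; // direct.gov.uk
echo registrable_domain('www.ten.cn', $suffixes), "\n";        // ten.cn
echo registrable_domain('www.net.cn', $suffixes), "\n";        // www.net.cn
```

Trying the longest suffix first is what separates www.net.cn (a name under net.cn) from www.ten.cn (a host under ten.cn).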

Without doing all that, your best bet is to grab the host name with a simple regex and either strip the www off the front or take the last 2 or 3 levels of the host name. For example:

"https?:\/\/(www\.)?(.*?)[^a-z0-9\-\.]"
puts the host name, less www., into $2

"https?:\/\/.*?([a-zA-Z0-9-]{2,67}\.[a-zA-Z]{2,4})[^a-z0-9\-\.]"
puts the last two levels of the host name into $1 (e.g. yahoo.com, co.uk)

"https?:\/\/.*?([a-zA-Z0-9-]{2,67}\.[a-zA-Z0-9-]{2,67}\.[a-zA-Z]{2,4})[^a-z0-9\-\.]"
puts the last three levels of the host name into $1 (e.g. yahoo.co.uk, mail.yahoo.com)

Any combination of those ought to be able to approximate what you want. Personally I use a half baked version of the rule set system described above, which is fine for what I need.
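
As a rough illustration of how the "last two levels" pattern might be applied in PHP, the sketch below wraps it in preg_match(). Several details are assumptions rather than part of the post: the scheme is written as "https?" so both http and https match, the i flag is added so mixed-case hosts work, "|$" lets a URL that ends right after the host still match, and the function name last_two_levels() is hypothetical.

```php
<?php
// Hypothetical wrapper around the "last two levels" pattern.
function last_two_levels($url) {
    // Group 1 captures name.tld; it must be followed by a character that
    // cannot appear in a host name, or by the end of the string.
    $pattern = '/https?:\/\/.*?([a-z0-9-]{2,67}\.[a-z]{2,4})(?:[^a-z0-9\-.]|$)/i';
    if (preg_match($pattern, $url, $m)) {
        return strtolower($m[1]);
    }
    return null;
}

echo last_two_levels("http://mail.yahoo.com/inbox"), "\n"; // yahoo.com
echo last_two_levels("https://www.yahoo.co.uk/"), "\n";    // co.uk
```

The pattern is not anchored to the host, so on a dotless host such as localhost a dotted file name in the path could match instead. Without the i flag, an uppercase host would be cut short, because the trailing class only excludes lowercase letters.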
 
0
•••