
Help needed, RegEx string


cafi

Established Member
Negative matching is tricky in regular expressions, and almost impossible when you don't have a strong positive match just before the negative. I'd be more inclined to search for the file types you want (e.g. html, htm, php) rather than the ones you don't. That has its own difficulty, though: you'll also need to match URLs that give just the domain name, URLs that end with a directory name, and URLs that end with a slash.
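To give a rough idea of the "search for what you want" approach, here is a minimal sketch. It does not try to do everything in one regular expression; it checks each URL against a whitelist after extraction. The function name isWantedUrl and the $wanted list are made up for this example, and it assumes PHP 7 or later.

```php
<?php
// Hypothetical whitelist of page extensions; adjust to suit your case.
$wanted = ['html', 'htm', 'php'];

function isWantedUrl(string $url, array $wanted): bool
{
    $path = parse_url($url, PHP_URL_PATH);

    // No path at all, or a path ending in a slash: a bare domain or a directory.
    if (!is_string($path) || $path === '' || substr($path, -1) === '/') {
        return true;
    }

    // Keep extensionless paths (directory names) and whitelisted page types.
    $ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));
    return $ext === '' || in_array($ext, $wanted, true);
}

$urls = [
    'http://www.namepros.com/images/logo-60.gif',
    'http://www.namepros.com/wibble.jpg',
    'http://www.namepros.com/',
    'http://www.namepros.com',
    'http://www.namepros.com/index.html',
];

// array_values() re-indexes the result so the keys run 0, 1, 2, ...
$pages = array_values(array_filter($urls, function ($url) use ($wanted) {
    return isWantedUrl($url, $wanted);
}));

print_r($pages);
```

One caveat with this sketch: a final path segment containing a dot but no trailing slash (say /v1.2) would be read as having the extension "2" and dropped, so the whitelist may need extending for sites like that.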

You could use your existing regular expression and then filter out any unwanted URLs afterwards. The following code gives an example of this:

PHP:
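<?php

$text = '
http://www.namepros.com/images/logo-60.gif
http://www.namepros.com/wibble.jpg
http://www.namepros.com/
http://www.namepros.com
http://www.namepros.com/index.html
';

// Grab everything that looks like a URL, stopping at whitespace, ")" or a double quote.
// With "#" as the delimiter, the slashes don't need escaping.
preg_match_all('#https?://[^\s)"]*#', $text, $matches);

echo "All URLs:\n";
print_r($matches);
echo "\n";

// Walk the matches and drop any URL that ends in an unwanted file extension.
for ($i = count($matches[0]) - 1; $i >= 0; $i--)
{
    if (preg_match('#\.(gif|jpg|jpeg|png|exe)$#i', $matches[0][$i]))
    {
        unset($matches[0][$i]);
    }
}

// unset() leaves gaps in the keys, so re-index before using the list.
$matches[0] = array_values($matches[0]);

echo "With images and exes removed:\n";
print_r($matches);
?>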
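One thing the extension check above won't catch is a URL with a query string after the file name (for example logo-60.gif?v=2), because the pattern is anchored to the end of the URL. A possible variant, only a sketch, is to look at the path part alone using parse_url() and pathinfo(). The sample URLs with "?v=2" and the upper-case ".JPG" are invented for this illustration, as is the $unwanted list name.

```php
<?php
$text = '
http://www.namepros.com/images/logo-60.gif?v=2
http://www.namepros.com/wibble.JPG
http://www.namepros.com/
http://www.namepros.com/index.html
';

// Same URL pattern as before: stop at whitespace, ")" or a double quote.
preg_match_all('#https?://[^\s)"]*#', $text, $matches);

// Extensions to throw away (same set as the original filter).
$unwanted = ['gif', 'jpg', 'jpeg', 'png', 'exe'];

$kept = array_values(array_filter($matches[0], function ($url) use ($unwanted) {
    // Only the path is inspected, so "?v=2" does not hide the extension.
    $path = (string) parse_url($url, PHP_URL_PATH);
    $ext  = strtolower(pathinfo($path, PATHINFO_EXTENSION));
    return !in_array($ext, $unwanted, true);
}));

print_r($kept);
```

Using array_filter() with array_values() also avoids the gaps in the array keys that unset() leaves behind.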
 