Help needed, RegEx string

cafi

Negative matching is tricky in regular expressions, and almost impossible when you don't have a strong positive match just before the negative. I'd be more inclined to search for the file types you want (e.g. html, htm, php) rather than the ones you don't. That approach is tricky too, though, because you'll also want to match URLs that give just the domain name, that end with a directory name, or that end with a slash.
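A minimal sketch of that positive approach, for illustration only: instead of one large regular expression, it inspects the path of each URL. The helper name `looksLikePage` and the allow-list of extensions are assumptions, not something from the original question.

```php
<?php
// Hypothetical helper: true when a URL looks like a page rather than a file.
// The allow-list (html, htm, php) is an assumption; adjust it as needed.
function looksLikePage(string $url): bool
{
    $path = parse_url($url, PHP_URL_PATH);

    // Bare domain (no path), unparsable URL, or a trailing slash: accept.
    if ($path === null || $path === false || $path === '' || substr($path, -1) === '/') {
        return true;
    }

    $extension = strtolower(pathinfo($path, PATHINFO_EXTENSION));

    // No extension: assume a directory name. Otherwise require an allowed type.
    return $extension === '' || in_array($extension, ['html', 'htm', 'php'], true);
}

$urls = [
    'http://www.namepros.com/images/logo-60.gif',
    'http://www.namepros.com/wibble.jpg',
    'http://www.namepros.com/',
    'http://www.namepros.com',
    'http://www.namepros.com/index.html',
];

// array_values() reindexes the result so the keys run 0, 1, 2, ...
$pages = array_values(array_filter($urls, 'looksLikePage'));
print_r($pages);
```

Treating an extension-less last segment as a directory is a guess; a file without an extension would slip through as well.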

You could use your existing regular expression and then filter out any unwanted URLs afterwards. The following code gives an example of this:

PHP:
<?php

$text = '
http://www.namepros.com/images/logo-60.gif
http://www.namepros.com/wibble.jpg
http://www.namepros.com/
http://www.namepros.com
http://www.namepros.com/index.html
';

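// Collect every URL: "http://" or "https://" followed by anything up to
// whitespace, a closing parenthesis or a double quote.
preg_match_all('#https?://[^\s)"]*#', $text, $matches);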

echo "All URLs:\n";
print_r($matches);
echo "\n";

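// Walk the list backwards and drop any URL ending in an unwanted extension.
for ($i = count($matches[0]) - 1; $i >= 0; $i--)
{
    if (preg_match('#\.(gif|jpg|jpeg|png|exe)$#i', $matches[0][$i]))
    {
        unset($matches[0][$i]);
    }
}

// unset() leaves gaps in the keys, so reindex the remaining URLs.
$matches[0] = array_values($matches[0]);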

echo "With images and exes removed:\n";
print_r($matches);
?>
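The same filtering step can also be sketched without an explicit loop, using preg_grep() with the PREG_GREP_INVERT flag to keep only the entries that do not match. The extended pattern below, which also tolerates a query string or fragment after the extension, and the `?size=large` URL are assumptions added for illustration.

```php
<?php
$urls = [
    'http://www.namepros.com/images/logo-60.gif',
    'http://www.namepros.com/wibble.jpg?size=large', // hypothetical query string
    'http://www.namepros.com/',
    'http://www.namepros.com/index.html',
];

// Unwanted extension, optionally followed by "?query" or "#fragment".
// "~" is used as the delimiter so that "#" can appear inside the pattern.
$unwanted = '~\.(?:gif|jpe?g|png|exe)(?:[?#].*)?$~i';

// PREG_GREP_INVERT returns the entries that do NOT match the pattern;
// array_values() then reindexes the result.
$kept = array_values(preg_grep($unwanted, $urls, PREG_GREP_INVERT));
print_r($kept);
```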
 