[PHP] Link parsing


liam_d

Basically, I am trying to parse addresses like "www.blah.com/efdsf" in my forum script.

I have it working when an address has "http://" in front of it, but not when "www." is on its own.

Here is my code:
PHP:
$post = preg_replace("`\b(https?|ftp|file)://[-A-Za-z0-9+&@#/%?=~_|!:,.;]*[-A-Za-z0-9+&@#/%=~_|]\b`", '<a href="\0" target="_blank">\0</a>', $post);

$post = preg_replace("/\s(www\.([a-z][a-z0-9_\..-]*[a-z]{2,6})([a-zA-Z0-9\/*-?&%]*))\s/i", " <a href=\"http://$1\">$1</a> ", $post);

The second one should parse addresses with "www." on its own, but it doesn't.
 
0
•••
I can't see anything wrong with the search part. Your $post must have spaces before and after your expected URL. There are some special characters which aren't escaped, which might be causing your problem. Also, I would escape \ within a double-quoted string to avoid confusion ("\z" is the same as "\\z" and '\z', but "\v" is a vertical tab, so is not the same as "\\v" or '\v').

The replace part uses $1 inside a double-quoted string. PHP leaves $1 alone there (variable names can't start with a digit), but it is easy to misread as a variable, so I would either escape the $, use \\1 instead or use single quotes round the string.

Try:

Code:
$post = preg_replace("/\\s(www\\.([a-z][a-z0-9_\\.-]*[a-z]{2,6})([a-zA-Z0-9\\/\\*-\\?&%]*))\s/i", ' <a href="http://$1">$1</a> ', $post);
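
To make the point about backslashes in double-quoted strings concrete, here is a small sketch. It only uses string literals, so nothing in it is specific to the forum script:

```php
<?php
// "\z" and "\s" are not recognised PHP escapes, so the backslash is kept as-is.
var_dump("\z" === '\z');   // bool(true)
var_dump("\z" === "\\z");  // bool(true)
var_dump("\s" === "\\s");  // bool(true)

// "\v" is a recognised escape (vertical tab), so the backslash is consumed.
var_dump(strlen("\v"));    // int(1)
var_dump(strlen('\v'));    // int(2)
var_dump("\\v" === '\v');  // bool(true)
```

So "\s" happens to reach the regex engine intact either way, but doubling the backslash (or using single quotes) means you never have to remember which letters PHP treats specially.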
 
0
•••
I tried the code you posted with no luck; it still doesn't do anything :(
 
0
•••
I tried the following and it worked:

test.php:

Code:
<?php

$post = ' www.blah.com/efdsf ';

$post = preg_replace("/\\s(www\\.([a-z][a-z0-9_\\.-]*[a-z]{2,6})([a-zA-Z0-9\\/\\*-\\?&%]*))\s/i", ' <a href="http://$1">$1</a> ', $post);

echo $post;
echo "\n";

?>

$ php test.php
<a href="http://www.blah.com/efdsf">www.blah.com/efdsf</a>
$
 
0
•••
Well, it still won't work on my end. Here is the whole code:
PHP:
// main post parser
function main_post_parser($post)
{
	global $db;
	
	$post = htmlentities($post);
	
	$post = trim($post); 
	
	// sort out new lines into breaks
	$post = str_replace(array("\r\n", "\r", "\n"), "<br />", $post);
	if( get_magic_quotes_gpc() ) 
	{
		$post = stripslashes($post);
	}

	if( !is_numeric($post) || $post[0] == '0' ) 
	{
		$post = $db->escape($post);
	}
	
	// check if they are a guest, if they are check if guests links get parsed
	if ($_SESSION['group'] == 4)
	{
		if ($site_config['guest_links_parsed'] == 0)
		{
			// then don't parse
		}
		
		else
		{
			// auto-make links
			// Thanks to "geirha" from ubuntu forums for his amazing work on this for me :)
			$post = preg_replace("`\b(https?|ftp|file)://[-A-Za-z0-9+&@#/%?=~_|!:,.;]*[-A-Za-z0-9+&@#/%=~_|]\b`", '<a href="\0" target="_blank">\0</a>', $post);
			$post = preg_replace("/\\s(www\\.([a-z][a-z0-9_\\.-]*[a-z]{2,6})([a-zA-Z0-9\\/\\*-\\?&%]*))\s/i", ' <a href="http://$1">$1</a> ', $post);

		}
	}
	
	// they are not a guest so parse away
	else if ($_SESSION['group'] != 4)
	{
		// auto-make links
		// Thanks to "geirha" from ubuntu forums for his amazing work on this for me :)
		$post = preg_replace("`\b(https?|ftp|file)://[-A-Za-z0-9+&@#/%?=~_|!:,.;]*[-A-Za-z0-9+&@#/%=~_|]\b`", '<a href="\0" target="_blank">\0</a>', $post);
		$post = preg_replace("/\\s(www\\.([a-z][a-z0-9_\\.-]*[a-z]{2,6})([a-zA-Z0-9\\/\\*-\\?&%]*))\s/i", ' <a href="http://$1">$1</a> ', $post);
	}
	
	return $post;
}
 
0
•••
Why do you write your own parser? PHP has an excellent built-in function for it :

parse_url()
 
0
•••
It is not just a URL parser; it parses BBCode, HTML input, etc. I am trying to turn the address into a clickable link, and parse_url() just breaks it down into pieces, which doesn't do much for me.
 
0
•••
mvl said:
Why do you write your own parser? PHP has an excellent built-in function for it :

parse_url()

I think you have misunderstood the problem. parse_url is for parsing a single URL. The OP is trying to replace a URL within a string with an HTML link.
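
A quick sketch of the difference, using the http:// pattern from the first post. The sample sentence in $post is made up for illustration:

```php
<?php
// parse_url() splits ONE url into its components; it does not search text.
$parts = parse_url('http://www.blah.com/efdsf');
print_r($parts); // scheme => http, host => www.blah.com, path => /efdsf

// preg_replace() finds the URL inside a longer string and wraps it in a link.
$post = 'Have a look at http://www.blah.com/efdsf when you can.';
$linked = preg_replace(
    "`\b(https?|ftp|file)://[-A-Za-z0-9+&@#/%?=~_|!:,.;]*[-A-Za-z0-9+&@#/%=~_|]\b`",
    '<a href="\0" target="_blank">\0</a>',
    $post
);
echo $linked, "\n";
```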

liam_d said:
Well, it still won't work on my end. Here is the whole code:

I think my test shows that the preg_replace code should work. I guess that one of the conditions is not being met in your code. Perhaps ($_SESSION['group'] == 4) and ($site_config['guest_links_parsed'] == 0).
 
0
•••
Look at the second part, which checks that the session group isn't 4 (mine isn't, it is 1). The guest check doesn't matter in that case.
 
0
•••
OK - What is your $post input string? If you call:

Code:
$_SESSION['group'] = 1;
echo main_post_parser(' www.blah.com/efdsf ');

what do you get?
 
0
•••
"www.blah.com/efdsf"

I also output the session to check, and the group is definitely 1.
 
0
•••
I found the problem. You have:

Code:
    $post = trim($post);

which removes the leading and trailing space. Then the preg_replace will not match the string. Move the trim to the end of the function instead and try again.
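
You can see the effect of the trim in isolation with a quick check like this (same pattern as above, nothing else from the function):

```php
<?php
// The pattern requires whitespace (\s) on both sides of the www. address.
$pattern = "/\\s(www\\.([a-z][a-z0-9_\\.-]*[a-z]{2,6})([a-zA-Z0-9\\/\\*-\\?&%]*))\\s/i";

$padded  = ' www.blah.com/efdsf ';
$trimmed = trim($padded); // 'www.blah.com/efdsf', no surrounding spaces left

var_dump(preg_match($pattern, $padded));  // int(1): matches
var_dump(preg_match($pattern, $trimmed)); // int(0): nothing for \s to match
```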

The code as it stands replaces newlines with "<br />". This will stop it matching a URL after or before a newline. You might want to move this replacement to the end of the function also and change the preg to:

Code:
$post = preg_replace("/(\\s)(www\\.([a-z][a-z0-9_\\.-]*[a-z]{2,6})([a-zA-Z0-9\\/\\*-\\?&%]*))(\\s)/i", '$1<a href="http://$2">$2</a>$5', $post);

The above preserves the matched whitespace before and after the URL.
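
As a rough check of that ordering (link first, convert newlines afterwards), here is a sketch with a made-up three-line post:

```php
<?php
$pattern = "/(\\s)(www\\.([a-z][a-z0-9_\\.-]*[a-z]{2,6})([a-zA-Z0-9\\/\\*-\\?&%]*))(\\s)/i";

// Sample input: the URL sits on its own line, so it is surrounded by "\n".
$post = "first line\nwww.blah.com/efdsf\nlast line";

// $1 and $5 put back whatever whitespace was matched (here the newlines).
$linked = preg_replace($pattern, '$1<a href="http://$2">$2</a>$5', $post);

// Only now turn the newlines into breaks.
echo str_replace("\n", "<br />", $linked), "\n";
```

One thing to watch, as far as I can tell: each match consumes the whitespace on both sides, so two www. addresses separated by a single space will only have the first one linked.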
 
0
•••
Wait, that's the thing though: what if the URL is right at the start with no space before it? Then it won't get parsed.
 
0
•••
Add a space before and after it then:

Code:
function main_post_parser($post)
{
    global $db;
    
    $post = " $post ";
    $post = htmlentities($post);
...
 
0
•••
Can we not just stop the preg_replace having to have spaces?
 
0
•••
Try:

Code:
$post = preg_replace("/(www\\.([a-z][a-z0-9_\\.-]*[a-z]{2,6})([a-zA-Z0-9\\/\\*-\\?&%]*))/i", '<a href="http://$1">$1</a>', $post);
 
0
•••
qbert220 said:
Try:

Code:
$post = preg_replace("/(www\\.([a-z][a-z0-9_\\.-]*[a-z]{2,6})([a-zA-Z0-9\\/\\*-\\?&%]*))/i", '<a href="http://$1">$1</a>', $post);

I tried that with this:
Code:
www.prxa.info/test
http://www.test.com
test.com
http://prxa.info

and got this:
Code:
www.prxa.info/test
/>www.test.com" target="_blank">http://www.test.com
/>test.com
http://prxa.info
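
The mangled lines look like the www. rule re-matching addresses that the http:// rule had already wrapped in a link. One possible way around that, offered only as a sketch, is a negative lookbehind so that "www." is skipped when it directly follows a slash or another word character. The function name link_bare_www() and its simplified pattern are my own assumptions, not code from this thread:

```php
<?php
// Hypothetical helper (not from the thread): link bare "www." addresses
// without requiring whitespace around them.
function link_bare_www($post)
{
    // (?<![/\w.-]) refuses a match when "www." directly follows a slash,
    // a word character, a dot or a hyphen, e.g. inside "http://www.test.com".
    // The optional tail (?:/[^\s<>"]*)? picks up a path such as "/test".
    $pattern = '~(?<![/\w.-])(www\.[a-z0-9][a-z0-9.-]*\.[a-z]{2,6}(?:/[^\s<>"]*)?)~i';

    return preg_replace($pattern, '<a href="http://$1">$1</a>', $post);
}

// A bare address at the very start of the string is linked.
echo link_bare_www('www.prxa.info/test'), "\n";

// An address already linked by the http:// rule is left alone.
echo link_bare_www('<a href="http://www.test.com" target="_blank">http://www.test.com</a>'), "\n";
```

Because the lookbehind accepts ">" before the address, this should also cope with an address that comes straight after a "<br />", so it would not matter whether newlines are converted before or after linking.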
 
0
•••
It can be a great tool for parsing links as well if you just put it in the <a> tag... but I guess it's not exactly what you guys are talking about.
 
0
•••