07-29-2007, 12:17 PM
· #1 Resident Linux Geek
Name: Michael Walker
Location: East Yorkshire, England
Join Date: Aug 2005
Posts: 2,413
NP$: 300.25
Get all links from a page
This code gets all the links from a page (example). I developed it as part of a simple spider I'm working on.
This is what I'm using it for. Obviously it's not finished, but I think it's a pretty good (if strange) idea. It needs JavaScript and has only been tested in Opera.
PHP Code:
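<pre><?php
// Page to scan, passed as ?url=...
$url = isset($_GET['url']) ? $_GET['url'] : '';
$parsed = parse_url($url);
// Only fetch http(s) URLs so file_get_contents() cannot be pointed at local files.
if (empty($parsed['scheme']) || empty($parsed['host']) || !in_array(strtolower($parsed['scheme']), array('http', 'https')))
    die('Please supply an http or https URL.');
$html = file_get_contents($url);
if ($html === false)
    die('Could not fetch the page.');
$preg = array();
$base = array();
$links = array();
// Double- and single-quoted hrefs are matched separately; group 4 is the URL, group 6 the link text.
preg_match_all("/<a(\s+)href(\s*)=(\s*)\"(.*?)\"(.*?)>(.*?)<\/a>/i", $html, $preg[0]);
preg_match_all("/<a(\s+)href(\s*)=(\s*)'(.*?)'(.*?)>(.*?)<\/a>/i", $html, $preg[1]);
// Accept both <base href="..."> and the self-closing <base href="..." />.
preg_match("/<base(\s+)href(\s*)=(\s*)\"(.*?)\"(\s*)\/?>/i", $html, $base);
$title = array_merge($preg[0][6], $preg[1][6]);
$href = array_merge($preg[0][4], $preg[1][4]);
// preg_match() leaves $base empty when the page has no <base> tag.
$base = isset($base[4]) ? $base[4] : '';
if (empty($base))
    $base = (!empty($parsed['user'])) ? "{$parsed['scheme']}://{$parsed['user']}:{$parsed['pass']}@{$parsed['host']}" : "{$parsed['scheme']}://{$parsed['host']}";
for ($i = 0; $i < count($href); $i++) {
    // Root-relative links get the base prepended.
    if (substr($href[$i], 0, 1) == '/')
        $href[$i] = "{$base}{$href[$i]}";
    // Query-only and fragment-only links are relative to the current page.
    if (substr($href[$i], 0, 1) == '?' || substr($href[$i], 0, 1) == '#')
        $href[$i] = "{$url}{$href[$i]}";
    $links[$i] = array("title" => htmlentities($title[$i]), "url" => htmlentities($href[$i]));
}
print_r($links);
?></pre>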
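The regexes above only match when href is the first attribute and the whole tag sits on one line. As a rough sketch of an alternative (not what the script above does), the same links can be collected with PHP's bundled DOM extension. The function name extract_links() and the sample markup are made up for illustration, and the sketch assumes the DOM/libxml extension is enabled. It does not resolve relative URLs or deal with character encodings.

```php
<?php
// Hypothetical helper (not part of the original script): collect every
// <a href> in an HTML string using DOMDocument instead of regexes.
function extract_links($html)
{
    $links = array();
    if (trim($html) === '') {
        return $links; // nothing to parse
    }

    $doc = new DOMDocument();
    // Real-world markup is rarely valid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('a') as $a) {
        if (!$a->hasAttribute('href')) {
            continue; // named anchors such as <a name="top"> carry no URL
        }
        $links[] = array(
            'title' => trim($a->textContent),   // link text with inner tags stripped
            'url'   => $a->getAttribute('href') // attribute value with entities decoded
        );
    }
    return $links;
}

// Usage with a small inline sample (illustrative markup only).
$sample = '<p><a class="x" href=\'/about\'>About <b>us</b></a> '
        . '<a name="top">Top</a> '
        . '<a href="http://example.com/?a=1&amp;b=2">Home</a></p>';
print_r(extract_links($sample));
```

The parser copes with either quote style, attributes in any order, and links that span several lines, so the two separate preg_match_all() calls are not needed. The returned values are raw, so they still need htmlentities() before being printed into a page.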
Last edited by Mikor: 07-29-2007 at 02:23 PM.
08-04-2007, 10:13 AM
· #2 Danltn.com
Name: Daniel Neville
Location: Danltn.com / Nottingham, UK
Join Date: May 2007
Posts: 1,183
NP$: 676.56
I like it!
Very good potential on this script, thanks for posting.
You don't suppose you could post or zip the other files (.css, .js (although I think it's inline) and .php)? We could of course pull them from the source, but it's polite to ask.
Thanks,
Dan
08-05-2007, 03:45 PM
· #3 Senior Member
Location: Ireland
Join Date: Dec 2004
Posts: 2,455
NP$: 14.50
Thanks
Repped
__________________
Quote: - Don't learn the tricks of the trade, learn the trade
08-05-2007, 11:07 PM
· #4 Resident Linux Geek
Name: Michael Walker
Location: East Yorkshire, England
Join Date: Aug 2005
Posts: 2,413
NP$: 300.25
08-06-2007, 11:54 AM
· #5 Danltn.com
Name: Daniel Neville
Location: Danltn.com / Nottingham, UK
Join Date: May 2007
Posts: 1,183
NP$: 676.56
Is this open source, unrestricted code? I have a commercial use for this; I can send you a finished script with resell rights in exchange for full reseller rights to the code.
Thanks,
Dan.
(P.S. Please don't say no.)
08-06-2007, 02:22 PM
· #6 Resident Linux Geek
Name: Michael Walker
Location: East Yorkshire, England
Join Date: Aug 2005
Posts: 2,413
NP$: 300.25
Originally Posted by Danltn:
Is this open source, unrestricted code? I have a commercial use for this; I can send you a finished script with resell rights in exchange for full reseller rights to the code.
Thanks,
Dan.
(P.S. Please don't say no.)
Of course.