[Resolved] Regular Expression Help

Hi All,

I am loading a Craigslist page through cURL and have the page contents in a PHP variable. I need to locate a couple of things on this page:

http://sandiego.craigslist.org/off/438472718.html

I need the loc value from this link (the Google Maps link on the page above):
http://maps.google.com/?q=loc:+Balboa+Avenue+at+Genesee+San+Diego+CA+US

I also would like to have the src value of the photos (up to 4 photos) from that listing page.

I think what I need is a regular expression to find these matches, but there may be something easier (for instance if I could use the DOM then it would be much easier to look for these matches). Any help or hard code is appreciated.

For the code minded:
PHP:
require_once 'rss_fetch.inc'; // MagpieRSS

$url = 'http://sandiego.craigslist.org/off/index.rss';
$rss = fetch_rss($url);

echo "Site: ", $rss->channel['title'], "<br>";
foreach ($rss->items as $item) {
    $title = $item['title'];
    $url   = $item['link'];
    echo "<a href=\"$url\">$title</a><br />\n";

    // Fetch the listing page into $buffer
    $curl_handle = curl_init();
    curl_setopt($curl_handle, CURLOPT_URL, $url);
    curl_setopt($curl_handle, CURLOPT_CONNECTTIMEOUT, 2);
    curl_setopt($curl_handle, CURLOPT_RETURNTRANSFER, 1);
    $buffer = curl_exec($curl_handle);
    curl_close($curl_handle);

    if (empty($buffer)) {
        print "Sorry, example.com are a bunch of poopy-heads.<p>";
    } else {
        print $buffer;

        // this is where I'm at
        $googleLink = null;    // ??? the loc value from the google map link
        $images     = array(); // ??? the src of each photo (up to 4)
    }
}
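
On the DOM idea raised above: PHP's DOMDocument can parse the fetched page, after which the image sources and the map link are plain attribute lookups. This is only a sketch under assumptions: the markup in $buffer is made-up sample data standing in for the real listing page, and extract_listing_data is a hypothetical name.

```php
<?php
// Made-up sample markup standing in for the fetched listing page.
$buffer = '<html><body>'
    . '<a href="http://maps.google.com/?q=loc%3A+Balboa+Avenue+at+Genesee+San+Diego+CA+US">google map</a>'
    . '<table><tr><td><img src="http://images.craigslist.org/a.jpg"></td>'
    . '<td><img src="http://images.craigslist.org/b.jpg"></td></tr></table>'
    . '</body></html>';

// Hypothetical helper: returns array($location, $images).
function extract_listing_data($html) {
    libxml_use_internal_errors(true); // real-world HTML is rarely valid; keep parser warnings quiet
    $doc = new DOMDocument();
    $doc->loadHTML($html);

    // Collect the src of every image hosted on images.craigslist.org
    $images = array();
    foreach ($doc->getElementsByTagName('img') as $img) {
        $src = $img->getAttribute('src');
        if (strpos($src, 'http://images.craigslist.org/') === 0) {
            $images[] = $src;
        }
    }

    // Take the q parameter of the first Google Maps link and drop the "loc:" prefix
    $location = null;
    foreach ($doc->getElementsByTagName('a') as $a) {
        $href = $a->getAttribute('href');
        if (strpos($href, 'http://maps.google.com/') === 0) {
            parse_str((string) parse_url($href, PHP_URL_QUERY), $query);
            if (isset($query['q'])) {
                $location = trim(preg_replace('/^loc:/', '', $query['q']));
            }
            break;
        }
    }

    return array($location, $images);
}

list($location, $images) = extract_listing_data($buffer);
```

A listing without photos simply yields an empty $images array, and a listing without a map link leaves $location as null.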
 
A limited alternative to regex is to use strpos and substr together.
Here is an example that pulls the title out of <title>Title</title>:
PHP:
$data = "<title>Title</title>";
// Position just after the opening tag (7 is the length of "<title>")
$title_detect_1 = strpos($data, "<title>") + 7;
// Position of the closing tag, searching from the start of the title text
$title_detect_2 = strpos($data, "</title>", $title_detect_1);
// Length of the text between the two tags
$title_detect = $title_detect_2 - $title_detect_1;
$title = substr($data, $title_detect_1, $title_detect);
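
The same strpos/substr idea can be wrapped in a small helper that also copes with a missing marker. A rough sketch: text_between is a hypothetical name, and the sample link is modeled on the map URL from the question.

```php
<?php
// Hypothetical helper: returns the text between $start and $end, or null if either is missing.
function text_between($haystack, $start, $end) {
    $from = strpos($haystack, $start);
    if ($from === false) {
        return null; // start marker not found
    }
    $from += strlen($start); // skip past the start marker itself
    $to = strpos($haystack, $end, $from);
    if ($to === false) {
        return null; // end marker not found
    }
    return substr($haystack, $from, $to - $from);
}

$link = '<a href="http://maps.google.com/?q=loc%3A+Balboa+Avenue">google map</a>';
$loc = text_between($link, 'q=loc%3A', '"');       // "+Balboa+Avenue"
$missing = text_between($link, '<title>', '</title>'); // null
```

Checking for false with === matters here, because strpos returns 0 when the marker sits at the very start of the string.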
 
-=Azn-Devil=- said:
A limited alternative to Regex would be to use strpos and substr simultaneously.
Here is an example to get the title <title>Title</title>
PHP:
$data = "<title>Title</title>";
$title_detect_1 = strpos($data, "<title>") + 7;
$title_detect_2 = strpos($data, "</title>", $title_detect_1);
$title_detect = $title_detect_2 - $title_detect_1;
$title = substr($data, $title_detect_1, $title_detect);


PHP:
preg_match_all('/<title>(.*?)<\/title>/', "<title>Hello World</title>", $matches);
// $matches[1][0] is "Hello World"

is a better way.
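
One thing to watch with patterns like this is greediness: a plain (.*) runs to the last closing tag on the line, while (.*?) stops at the first. A short illustration, using a made-up string with two tags on one line:

```php
<?php
$html = '<b>one</b> and <b>two</b>';

// Greedy: .* swallows everything up to the LAST </b>
preg_match_all('/<b>(.*)<\/b>/', $html, $greedy);

// Lazy: .*? stops at the FIRST </b>, so each tag is matched separately
preg_match_all('/<b>(.*?)<\/b>/', $html, $lazy);

// $greedy[1] holds one entry: 'one</b> and <b>two'
// $lazy[1] holds two entries: 'one' and 'two'
```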
 
I thought he was asking for an alternative to regex.
 
This is using my custom functions, I think this is what you need.

PHP:
<?php
require_once("/home/danltn/www/functions.php"); // Custom functions file
$con = @get_page("http://maps.google.com/maps?f=q&&hl=en&q=1+Inham+Road%2C+Chilwell"."&output=html");
preg_match('/<\/span><\/a><\/td><\/tr><\/table><p><table cellpadding="0" cellspacing="0" border="0"><tr><td valign="center" nowrap>(.*)<\/td><td valign="middle" nowrap style="padding-left: 3em">/',$con, $output);

$addr = $output[1];
$addr = str_replace("<br>",", ",$addr);
echo $addr;

get_page is my own wrapper around cURL, written so I can fetch a page with a single call.
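
The functions.php file isn't shown in the thread, so the following is only a guess at what a wrapper of that kind might look like, not the actual get_page. The name fetch_page, the timeout default, and the choice of options are all assumptions.

```php
<?php
// Hypothetical stand-in for a get_page-style helper; the real one is in a private file.
function fetch_page($url, $timeout = 5) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);        // return the body instead of printing it
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout); // give up if the host does not answer
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);        // follow redirects
    $body = curl_exec($ch);
    curl_close($ch);

    // curl_exec returns false on failure; hand back an empty string instead
    return $body === false ? '' : $body;
}
```

Returning an empty string on failure means the result can be passed straight to preg_match without an extra type check.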

EDIT: After re-reading your post, I think I was off on the wrong track. :hehe:
 
You guys are on the right track.

What I need is an expression to grab this data:

http://images.craigslist.org/01010401020801030920071002d8d864cdaa0f337847009ed6.jpg
http://images.craigslist.org/01021201040501030620071002ba10e98c42090c3bec00c410.jpg
http://images.craigslist.org/01011201021001040820071002b54834d55abdc86a3900126f.jpg
Balboa+Avenue+at+Genesee+San+Diego+CA+US

From this page:
http://sandiego.craigslist.org/off/438472718.html

So I am trying to get the image absolute paths (there can be 0-4 images on any listing) AND I need the location string from the google map URL.
 
PHP:
<?php
require_once("/home/danltn/www/functions.php"); // Custom functions

$page = @get_page("http://sandiego.craigslist.org/off/438472718.html"); // Custom function for cURL

// Capture the file name of every image hosted on images.craigslist.org
preg_match_all('/<img src="http:\/\/images\.craigslist\.org\/([^"]*)"><\/td>/', $page, $con);

foreach ($con[1] as $k) {
    echo "http://images.craigslist.org/$k<br />";
}

?>

PHP:
<?php
require_once("/home/danltn/www/functions.php"); // Custom functions

$page = @get_page("http://sandiego.craigslist.org/off/438472718.html"); // Custom function for cURL

// Capture everything after "loc%3A" in the google map link
preg_match_all('/http:\/\/maps\.google\.com\/\?q=loc%3A([^"]*)">google map<\/a>/', $page, $con);

$url = $con[1][0];
$url = trim($url, "+"); // strip the leading "+"
echo $url;

?>

Combined:

PHP:
<?php
require_once("/home/danltn/www/functions.php"); // Custom functions

$page = @get_page("http://sandiego.craigslist.org/off/438472718.html"); // Custom function for cURL

// Location string from the google map link
preg_match_all('/http:\/\/maps\.google\.com\/\?q=loc%3A([^"]*)">google map<\/a>/', $page, $goog);

$url = $goog[1][0];
$url = trim($url, "+"); // strip the leading "+"
echo $url;

echo "<br />---<br />";

// Absolute path of each image (0-4 per listing)
preg_match_all('/<img src="http:\/\/images\.craigslist\.org\/([^"]*)"><\/td>/', $page, $con);

foreach ($con[1] as $k) {
    echo "http://images.craigslist.org/$k<br />";
}

?>
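
Since get_page lives in a private file, here is a variant of the combined approach that runs against an inline sample string instead of a live fetch. It is a sketch under assumptions: parse_listing is a hypothetical name, the sample markup is made up, and the image pattern is loosened so it does not depend on the closing </td>.

```php
<?php
// Hypothetical helper: pulls the location string and image URLs out of a listing page.
function parse_listing($page) {
    // Location: everything after "loc%3A" in the google map link, minus the leading "+"
    $location = null;
    if (preg_match('/http:\/\/maps\.google\.com\/\?q=loc%3A([^"]*)">google map<\/a>/', $page, $m)) {
        $location = trim($m[1], '+');
    }

    // Images: every src that points at images.craigslist.org
    preg_match_all('/<img src="(http:\/\/images\.craigslist\.org\/[^"]*)"/', $page, $imgs);

    return array('location' => $location, 'images' => $imgs[1]);
}

// Made-up sample markup standing in for the real page
$sample = '<a href="http://maps.google.com/?q=loc%3A+Balboa+Avenue+at+Genesee+San+Diego+CA+US">google map</a>'
        . '<table><tr><td><img src="http://images.craigslist.org/a.jpg"></td></tr></table>';

$result = parse_listing($sample);
$empty  = parse_listing('<p>no map, no photos</p>');
```

Guarding the preg_match call avoids an undefined-index notice on listings that have no map link, and a listing with no photos just returns an empty images array.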

Enjoy :)
 
Perfect! Thanks Dan you leet mofo
 
DylanButler said:
Perfect! Thanks Dan you leet mofo
Mods, out of the goodness of your hearts never remove that post! :hehe:
 
Wow I didn't see you supplied a google one as well. I was able to write my own using RegEx Coach.

PHP:
if (preg_match('/<a[^>]*href="http:\/\/maps\.google\.com\/\?q=loc%3A\+([^"]*)">google map<\/a>/', $buffer, $match)) {
    $googleLink = urldecode($match[1]); // urldecode turns each "+" into a space
}
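
For anyone comparing the two approaches in this thread: trim($url, "+") only strips plus signs from the ends, while urldecode turns every "+" into a space and decodes sequences such as %3A. A quick illustration on the location string from the listing (variable names are just for the example):

```php
<?php
$raw = '+Balboa+Avenue+at+Genesee+San+Diego+CA+US';

// Strips the leading "+" but leaves the ones between words
$trimmed = trim($raw, '+');

// Converts every "+" to a space; the outer trim removes the leading space
$decoded = trim(urldecode($raw));

// Percent-escapes are decoded too: %3A is a colon
$prefix = urldecode('loc%3A');
```

Which one fits depends on the use: the trimmed form can go straight back into a URL, the decoded form is the readable address.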

Thanks again, +rep added.
 