How to use Curl to grab contents between <tag></tag>?


newsiness

I would like to seek help...
How can I use cURL to achieve something like this:

- take a web URL
- scan through the entire HTML source of that URL and grab the contents of every tag of the form <font...>ABCDE</font>
- then display "ABCDE" as output

Thanks !!!
 
•••
The views expressed on this page by users and staff are their own, not those of NamePros.
The font tag is deprecated. Are you using this only on your own websites? I am fairly sure the majority of remote sites don't use font tags anymore.

Anyway, here's how to do it, wrapped in a function. (Untested, but it should work.)

PHP:
<?php

/**
 * Finds all data between <font> and </font> tags.
 * 
 * @param string $url
 * @return array $fontData
 */
function getFontData($url)
{
    $ch = curl_init();

    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

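    $data = curl_exec($ch);
    curl_close($ch);

    // curl_exec() returns false on failure; there is nothing to match then.
    if ($data === false) {
        return array();
    }

    // Match every <font> element, with or without attributes.
    // The "s" modifier lets "." span line breaks inside the tag body.
    preg_match_all('/<font\b[^>]*>(.*?)<\/font>/is', $data, $matches);

    return $matches[1];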
}
?>

:tu:
 
•••
Building on DomainManDave's code, you probably want to iterate over each match and output it like this (again untested):

Code:
    // Use regular expressions to match all font data.
    // PREG_SET_ORDER groups the results per match, so $matches[$i][1]
    // is the captured text of match number $i.
    $matchCount = preg_match_all('/<font\b[^>]*>(.*?)<\/font>/is', $data, $matches, PREG_SET_ORDER);

    // Iterate over each match
    for ($i = 0; $i < $matchCount; $i++)
    {
        $text = $matches[$i][1];  // Gets the part of the match within the first ()s
        echo "$text<br />\n";  // Output it
    }
 
•••
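Actually, with the default flag (PREG_PATTERN_ORDER) I think it should be $matches[1][$i], no? $matches[$i][1] is only right when PREG_SET_ORDER is passed to preg_match_all().

You may also want to add each $text to an array that you can use outside of your "for" loop.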
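
Something like this, as a rough sketch of the array idea. The function name collectFontText() and the sample markup are made up for illustration; in practice $html would be the string returned by curl_exec(). It assumes PREG_SET_ORDER, so each match carries its own capture group.

```php
<?php
// Hypothetical helper: collect the text of every <font ...> element into an array.
function collectFontText(string $html): array
{
    $texts = [];

    // PREG_SET_ORDER makes $matches[$i][1] the first capture group of match $i.
    $count = preg_match_all('/<font\b[^>]*>(.*?)<\/font>/is', $html, $matches, PREG_SET_ORDER);

    for ($i = 0; $i < $count; $i++) {
        // Drop any nested tags and surrounding whitespace from the captured text.
        $texts[] = trim(strip_tags($matches[$i][1]));
    }

    return $texts;
}

// Sample markup standing in for the cURL response body (an assumption).
$html = '<p><font color="red">ABCDE</font> and <font size="2">FGHIJ</font></p>';
$texts = collectFontText($html);

// The array is available here, outside the loop that built it.
foreach ($texts as $text) {
    echo htmlspecialchars($text) . "<br />\n";
}
```

Returning the array instead of echoing inside the loop leaves the caller free to decide how to display or store the results.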
 
•••
echo $text."<br />\n";
 
•••
Well, realistically it depends on the page you are trying to scrape the data from.

If the page has more than one of the tags you are searching for, then you need some kind of unique identifier to scrape exact data from exact places on the page.

It could be anything. It could be the position of the tag among the repeated occurrences of that tag on the page, or it could be the class or id that ties it to the stylesheet.
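
For picking out one specific tag, a DOM parser with XPath is usually easier than a regex. Here is a rough sketch using PHP's DOMDocument and DOMXPath; the class name "price", the id "title" and the sample markup are all invented for the example, and $html would normally be the cURL response.

```php
<?php
// Sample markup standing in for a fetched page (hypothetical class and id names).
$html = <<<HTML
<html><body>
  <font class="price">10.00</font>
  <font id="title">ABCDE</font>
  <font class="price">12.50</font>
</body></html>
HTML;

$doc = new DOMDocument();
// Real-world HTML is often malformed; collect parser warnings instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// By id: the single element carrying id="title".
$title = $xpath->query('//font[@id="title"]')->item(0)->textContent;

// By class: every <font> whose class attribute is exactly "price".
$prices = [];
foreach ($xpath->query('//font[@class="price"]') as $node) {
    $prices[] = trim($node->textContent);
}

// By position: the second <font> on the page (item() is zero-based).
$second = $xpath->query('//font')->item(1)->textContent;

echo $title, "\n";
echo implode(', ', $prices), "\n";
echo $second, "\n";
```

On a real page, check that item() did not return null before reading textContent, since the element you expect may not be there.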
 