How to use Curl to grab contents between <tag></tag>?

newsiness · Sep 19, 2009

I'd like some help.
How can I use cURL to do something like this:

- take a web URL as input
- scan through the entire HTML source of that URL and grab the contents of every tag of the form <font...>ABCDE</font>
- then display "ABCDE" as output

Thanks!

Dave · Sep 19, 2009

The font tag is deprecated. Are you using this only on your own websites? I'm fairly sure the majority of remote sites don't use font tags anymore.

Anyway, here's how to do it, wrapped in a function. (Untested, but should work.)

PHP:

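<?php

/**
 * Finds all data between <font ...> and </font> tags.
 *
 * @param string $url
 * @return array Text found inside each font tag (empty if the request fails).
 */
function getFontData($url)
{
    $ch = curl_init();

    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

    $data = curl_exec($ch);
    curl_close($ch);

    // curl_exec() returns false on failure; there is nothing to scan then.
    if ($data === false) {
        return array();
    }

    // Match every font tag, with or without attributes. The "s" modifier
    // lets "." span line breaks; "i" makes the match case-insensitive.
    preg_match_all('/<font\b[^>]*>(.*?)<\/font>/is', $data, $matches);

    return $matches[1];
}
?>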

:tu:
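Since the function above is posted untested, it may help to try the matching step on its own, without a network request. The sketch below assumes the page source is already in a string; `extractFontText` and the sample markup are made-up names for illustration, not part of the posted code.

```php
<?php
// Hypothetical helper: pulls the text of every <font ...>...</font> pair
// out of an HTML string that has already been downloaded.
function extractFontText(string $html): array
{
    // \b[^>]* allows attributes such as <font color="red">;
    // "s" lets "." span line breaks, "i" ignores case.
    preg_match_all('/<font\b[^>]*>(.*?)<\/font>/is', $html, $matches);

    // $matches[1] holds the text captured by the first (...) group.
    return $matches[1];
}

// Made-up sample markup, standing in for the output of curl_exec().
$sample = '<p><font color="red">ABCDE</font> and <FONT>xyz</FONT></p>';
print_r(extractFontText($sample));
```

Keeping the download and the parsing separate also means the pattern can be adjusted for a different tag without touching the cURL part.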

qbert220 · Sep 21, 2009

Building on DomainManDave's code, you probably want to iterate over each match and output it like this (again untested):

Code:

    // Use regular expressions to match all font data (attributes allowed).
    $matchCount = preg_match_all('/<font\b[^>]*>(.*?)<\/font>/is', $data, $matches);

    // Iterate over each match
    for ($i = 0; $i < $matchCount; $i++)
    {
        // With the default flags, $matches[1] lists what the first ()s captured
        $text = $matches[1][$i];
        echo "$text<br />\n";  // Output it
    }

wellmanneredsquirrel · Oct 10, 2009

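Watch the index order: with the default flags, the captured text is in $matches[1][$i], no? ($matches[$i][1] would only be right with PREG_SET_ORDER.)

You may also want to add each $text to an array that you can use outside of your "for" loop.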
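The suggestion to collect each $text in an array could look roughly like this. It is a minimal sketch that assumes the default `preg_match_all` flags; the sample `$data` string and the `$texts` name are made up for illustration.

```php
<?php
// Made-up sample markup standing in for a downloaded page.
$data = '<font>first</font><font>second</font>';

$matchCount = preg_match_all('/<font>(.*?)<\/font>/i', $data, $matches);

$texts = [];  // Declared before the loop so it is still available afterwards
for ($i = 0; $i < $matchCount; $i++) {
    // Default layout: $matches[0] = full matches, $matches[1] = first group
    $texts[] = $matches[1][$i];
}

print_r($texts);
```

With the default flags the loop is not strictly needed, since `$matches[1]` is already that array; the loop earns its place once each item is filtered or transformed on the way in.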

pixelbypixel · Oct 10, 2009

echo $text."<br />\n";

Keral_Patel · Oct 15, 2009

Well, realistically it depends on the page you are trying to scrape the data from.

If it has more than one of the tags you are searching for, then you need some kind of unique identifier to scrape the exact data from the exact place on the page.

It could be anything. It could be the position of that tag among its repeats on the page, or it could be the class or id used by its stylesheet.
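One way to pick a tag by class, id, or position is to parse the page with PHP's DOM extension instead of a regular expression. This is only a sketch under assumptions: the DOM extension is enabled, and the markup, class name and id below are invented for the example.

```php
<?php
// Made-up sample markup; the class and id names are illustrative only.
$html = '<html><body>'
      . '<font class="price">10</font>'
      . '<font id="title">Hello</font>'
      . '<font class="price">20</font>'
      . '</body></html>';

$doc = new DOMDocument();
// Real pages are rarely valid HTML, so collect parser warnings
// internally instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// By class: every <font> whose class attribute is exactly "price".
$prices = [];
foreach ($xpath->query('//font[@class="price"]') as $node) {
    $prices[] = $node->textContent;
}

// By id: the <font> with id="title", or null if it is absent.
$titleNode = $xpath->query('//font[@id="title"]')->item(0);
$title = $titleNode ? $titleNode->textContent : null;

// By position: the second <font> in document order (XPath counts from 1).
$secondNode = $xpath->query('(//font)[2]')->item(0);
$second = $secondNode ? $secondNode->textContent : null;

print_r($prices);
echo $title, "\n", $second, "\n";
```

In practice `$html` would be the string returned by `curl_exec()`, and the XPath expressions would be adapted to whatever identifier the target page actually uses.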
