String Manipulation Help

Ik · Oct 23, 2009

Please see the code below.

Code:

<ul>
	<li>
		Services
		<ul>
			<li>
				Web Design & Development
				<ul>
					<li>CMS</li>
					<li>e-Commerce</li>
				</ul>
			</li>
			<li>SEO</li>
			<li>Hosting</li>
		</ul>
		<li>About</li>
		<li>Contact</li>
	</li>
</ul>

I have it in a string variable, and I want to be able to get the 3rd level UL from it. So here is what I want to get:

Code:

<ul>
<li>CMS</li>
<li>e-Commerce</li>
</ul>

Your help is appreciated. I'm using PHP 5.

Kate · Oct 23, 2009

I would have a look at the PHP DOM parser to traverse the HTML structure. Other XML libraries should do the job as well.
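For reference, a minimal sketch of what the DOM approach could look like. The sample string is a shortened, hypothetical version of the markup in the question, and the XPath expression (a <ul> with exactly two <ul> ancestors) is just one way to say "3rd level". Note that passing a node to saveHTML() needs PHP 5.3.6 or later; on older PHP 5 builds saveXML($node) may be a workable substitute.

```php
<?php
// Hypothetical sample input, shortened from the markup in the question.
$html = '<ul><li>Services<ul><li>Web Design & Development'
      . '<ul><li>CMS</li><li>e-Commerce</li></ul></li>'
      . '<li>SEO</li><li>Hosting</li></ul></li></ul>';

// Collect parser warnings (e.g. the bare "&") instead of printing them.
libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

// Select every <ul> that has exactly two <ul> ancestors, i.e. the 3rd level.
$xpath = new DOMXPath($dom);
$nodes = $xpath->query('//ul[count(ancestor::ul) = 2]');

$thirdLevel = '';
if ($nodes->length > 0) {
    // Serialize only the matched node, not the whole document.
    $thirdLevel = $dom->saveHTML($nodes->item(0));
}

echo $thirdLevel, "\n";
```

This should print the innermost list (CMS and e-Commerce only). The exact whitespace in the output depends on the libxml version, so trim or normalise it if you need to compare strings.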

qbert220 · Oct 24, 2009

Or you could use preg_match like this (untested):

Code:

if (preg_match('#^.*?<ul>.*?<ul>.*?(<ul>.*?</ul>)#i', $matches))
{
    echo $matches[1];
}

Ik · Oct 26, 2009

qbert220 said:
Or you could use preg_match like this (untested):

Code:

if (preg_match('#^.*?<ul>.*?<ul>.*?(<ul>.*?</ul>)#i', $matches)) { echo $matches[1]; }

It seems to me that preg_match is the solution, but the code above is not working and I'm not too good with regular expressions.

RageD · Oct 26, 2009

Well, what exactly are you looking for?

The best thing to do is to find a single regex that matches everything you're looking for. Otherwise, you can use a for loop or similar to run multiple regex queries, but preg_match() will generally be slower than plain string functions such as strpos().

-RageD

qbert220 · Oct 26, 2009

Tested working code:

PHP:

<?php

$string = '<ul>
        <li>
                Services
                <ul>
                        <li>
                                Web Design & Development
                                <ul>
                                        <li>CMS</li>
                                        <li>e-Commerce</li>
                                </ul>
                        </li>
                        <li>SEO</li>
                        <li>Hosting</li>
                </ul>
                <li>About</li>
                <li>Contact</li>
        </li>
</ul>
';

if (preg_match('#^.*?<ul>.*?<ul>.*?(\s*<ul>.*?</ul>)#is', $string, $matches))
{
    echo $matches[1]."\n";
}

?>

This should display:

Code:

                                <ul>
                                        <li>CMS</li>
                                        <li>e-Commerce</li>
                                </ul>

In the preg_match:

"^" means match the start of the string
"." means match any character
"*" means zero or more of the previous element
"?" means non-greedy (matches as little as possible - by default the regexp will match as many characters as possible)
"\s" means match whitespace characters

The "i" modifier means case insensitive
The "s" modifier makes "." match newline characters as well, so the pattern can span multiple lines

So ".*" means match zero or more of any character. Adding the "?" means match as few characters as possible. I added the "\s*" to also capture the indenting before the last <ul>.
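To see the "?" and the "s" modifier in isolation, here is a small sketch on a made-up two-line string (the variable names and sample text are only for illustration):

```php
<?php
// Hypothetical sample text containing a newline between two tags.
$text = "<b>one</b>\n<b>two</b>";

// Greedy: ".*" runs as far as it can, up to the LAST "</b>".
preg_match('#<b>.*</b>#s', $text, $greedy);

// Non-greedy: ".*?" stops at the FIRST "</b>".
preg_match('#<b>.*?</b>#s', $text, $lazy);

// Without "s", "." does not match the newline, so this finds nothing.
$withoutS = preg_match('#one.*two#', $text);

// With "s", "." also matches the newline, so this matches.
$withS = preg_match('#one.*two#s', $text);

echo strlen($greedy[0]), " ", $lazy[0], " ", $withoutS, " ", $withS, "\n";
```

The greedy match covers the whole string, the non-greedy one only the first tag, and the pattern without "s" fails because of the line break. That line break is the same reason the earlier untested pattern could not match the multi-line list.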

Ik · Oct 26, 2009

That is perfect. Thanks a lot for the explanation as well

qbert220 said:
Tested working code:

...snip...
