Regular Expression Help [2 lines]

DylanButler · May 30, 2009

Hi folks,

Please see the regex below and let me know where I'm going wrong! I have been banging my head on this for a few hours already and I am just stumped.

PHP:

$subcategories = '<li><a href="test">Test - Sports</a></li>
<li><a href="test">Something - Fashion</a></li>
<li><a href="test">Random - Technology</a></li>';
$subcategories = preg_replace('/<a (.*?)/>(.*?)/<\/a/>','<a '.$1.'>'.(isset(explode(' - ', $2)[1])) ? explode(' - ', $2)[1]:$2.'</a>', $subcategories);

I want it to become:

HTML:

<li><a href="test">Sports</a></li>
<li><a href="test">Fashion</a></li>
<li><a href="test">Technology</a></li>

Any help would be appreciated and repped, thanks!
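
For reference, a minimal sketch of how the one-liner above could be made to work. As posted it cannot run: the unescaped `/` before `>` ends the pattern early, `$1` and `$2` are not valid PHP variables outside a string, and `explode()` would be evaluated once before `preg_replace` runs rather than once per match. The sketch assumes `preg_replace_callback` with a closure (PHP 5.3 or later) is acceptable; the `[^>]*` attribute pattern is an illustrative choice, not something from the original post.

```php
<?php
$subcategories = '<li><a href="test">Test - Sports</a></li>
<li><a href="test">Something - Fashion</a></li>
<li><a href="test">Random - Technology</a></li>';

// Group 1: everything inside the opening <a ...> tag; group 2: the link text.
$result = preg_replace_callback(
    '/<a ([^>]*)>(.*?)<\/a>/',
    function (array $m) {
        // Split on the first " - " only; keep the text unchanged if there is none.
        $parts = explode(' - ', $m[2], 2);
        $text = isset($parts[1]) ? $parts[1] : $m[2];
        return '<a ' . $m[1] . '>' . $text . '</a>';
    },
    $subcategories
);

echo $result;
```

The callback receives the match array for each anchor, so the per-match logic (here the `explode`) lives in ordinary PHP code instead of in the replacement string.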

Kate · May 30, 2009

You could split the whole string into an array of lines, using the line break as the delimiter.

PHP:

<?php
$subcategories = '<li><a href="test">Test - Sports</a></li>
<li><a href="test">Test - Fashion</a></li>
<li><a href="test">Test - Technology</a></li>';
//$subcategories = preg_replace('/<a (.*?)/>(.*?)/<\/a/>','<a '.$1.'>'.(isset(explode(' - ', $2)[1])) ? explode(' - ', $2)[1]:$2.'</a>', $subcategories); 
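// split() was removed in PHP 7; preg_split() is the PCRE equivalent.
// Split on any run of line-break characters to get one list item per element.
$lines = preg_split('/[\r\n]+/', $subcategories);

print_r($lines);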
?>

DylanButler · May 30, 2009

Thanks sdsinc, I will give this a shot. I am not sure this will solve the whole problem. I still need to match on the hyphen (-) and remove the first word from before it. That's the part I'm having the most trouble with.

Kate · May 31, 2009

This way I capture everything except the first word inside the anchor (assuming the href target is always "test"):

PHP:

<?php
$subcategories = '<li><a href="test">Test - Sports</a></li>
<li><a href="test">Something - Fashion</a></li>
<li><a href="test">Random - Technology</a></li>';

$pattern = '/(<li><a href="test">)\w+ - (.*)(<\/a><\/li>)/i';

$replacement = '$1$2$3';
echo preg_replace($pattern, $replacement, $subcategories);
?>

DylanButler · May 31, 2009

sdsinc said:
This way I capture everything except the first word inside the anchor (assuming the href target is always "test")

Close, but my example wasn't general enough. The list items and anchor tags actually have title attributes and differing href values, so it would be something more like this:

PHP:

$subcategories = '<li><a href="sports.htm" title="Sports keywords">Test - Sports</a></li>
<li><a href="fashion.php" title="Fashion Stuff">Something - Fashion</a></li>
<li><a href="test">Random - Technology</a></li>';

Rep given, I appreciate you looking at this for me.

Kate · May 31, 2009

OK, let's spice things up, lol. What about this:

PHP:

<?php
$subcategories = '<li><a href="sports.htm" title="Sports keywords">Test - Sports</a></li>
<li><a href="fashion.php" title="Fashion Stuff">Something - Fashion</a></li>
<li><a href="test">Random - Technology</a></li>'; 

$pattern = '/<li><a href="([^"]+)"(?: title="[^"]+")?>\w+ - (.*)<\/a><\/li>/im';

$replacement = '<li><a href="$1">$2</a></li>';
echo preg_replace($pattern, $replacement, $subcategories); 
?>

Result:

HTML:

<li><a href="sports.htm">Sports</a></li>
<li><a href="fashion.php">Fashion</a></li>
<li><a href="test">Technology</a></li>

First of all, note that we capture only the two fields you need: the href value and the link text.
Basically, the pattern says:

capture the value within href="" (anything but a double quote ")
allow for an optional title attribute, located one space after the href
note the ?: at the beginning of the title group: it matches the expression between the parentheses without capturing it, since I assume you don't need it
for the title attribute, the same simplistic expression: anything but a double quote "
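
To make the difference between a capturing group and a (?: ) group concrete, here is a small sketch that runs both forms against one of the sample lines. The variable names are only illustrative.

```php
<?php
$line = '<li><a href="sports.htm" title="Sports keywords">Test - Sports</a></li>';

// (?: ... ) groups the title attribute so that ? can make it optional,
// but the group is not stored in the match array.
$nonCapturing = '/<a href="([^"]+)"(?: title="[^"]+")?>/';
// The same pattern with an ordinary capturing group around the title attribute.
$capturing = '/<a href="([^"]+)"( title="[^"]+")?>/';

preg_match($nonCapturing, $line, $a);
preg_match($capturing, $line, $b);

print_r($a); // $a[1] holds the href value; there is no $a[2]
print_r($b); // $b[2] additionally holds the matched title attribute
```

With the non-capturing form the numbering of the remaining groups stays the same whether or not the title is present, which keeps the replacement string simple.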

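Note the /im modifiers at the end of the pattern: i means case-insensitive and m stands for multiline. The m modifier only changes how ^ and $ behave (they also match at line breaks), so it is not strictly needed here, as the pattern uses neither. There are other modifiers, such as u (UTF-8).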
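
A quick sketch of what the two modifiers change, using a made-up three-line string rather than the HTML from this thread:

```php
<?php
$text = "sports\nFashion\nTECH";

// Without m, ^ and $ anchor only at the start and end of the whole string.
$withoutM = preg_match_all('/^\w+$/', $text, $m1);
// With m, ^ and $ also match at each line break.
$withM = preg_match_all('/^\w+$/m', $text, $m2);
// i makes the literal letters match regardless of case.
$withI = preg_match_all('/^fashion$/im', $text, $m3);

echo "$withoutM $withM $withI\n"; // prints: 0 3 1
```

Since m only affects ^ and $, a pattern with neither anchor matches the same text with or without it.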

DylanButler · Jun 1, 2009

That will work perfectly for what I need, thank you! By the way I like you too much to give more rep to you, apparently. :/
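
One caveat on the accepted pattern: its replacement rebuilds the anchor from the href alone, so the title attributes are dropped, as the result above shows. If they should survive (an assumption; the thread does not say either way), a hedged variant is to capture the whole opening tag and strip only the "word - " prefix that follows it:

```php
<?php
$subcategories = '<li><a href="sports.htm" title="Sports keywords">Test - Sports</a></li>
<li><a href="fashion.php" title="Fashion Stuff">Something - Fashion</a></li>
<li><a href="test">Random - Technology</a></li>';

// Group 1: the whole opening <a ...> tag, whatever attributes it carries.
// The "\w+ - " prefix after it is matched but left out of the replacement.
$pattern = '/(<a\s[^>]*>)\w+ - /i';
$replacement = '$1';

$result = preg_replace($pattern, $replacement, $subcategories);
echo $result;
```

Like the patterns earlier in the thread, this assumes the prefix is a single word made of \w characters; a multi-word prefix would need a looser expression.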
