Aidan's PHP Repository

A repository for PHP functions and classes ...

function.str_highlight.php

Highlight a string in text without corrupting HTML tags

  • Author: Aidan Lister <aidan@php.net>
  • Version: 3.1.1
  • Link: http://aidanlister.com/repos/v/function.str_highlight.php
  • Return: Text with needle highlighted
  • Views: 25659
  • Downloads: 1678

Source

<?php
/**
 * Perform a simple text replace
 * This should be used when the string does not contain HTML
 * (off by default)
 */
define('STR_HIGHLIGHT_SIMPLE', 1);
 
/**
 * Only match whole words in the string
 * (off by default)
 */
define('STR_HIGHLIGHT_WHOLEWD', 2);
 
/**
 * Case sensitive matching
 * (off by default)
 */
define('STR_HIGHLIGHT_CASESENS', 4);
 
/**
 * Overwrite links if matched
 * This should be used when the replacement string is a link
 * (off by default)
 */
define('STR_HIGHLIGHT_STRIPLINKS', 8);
 
/**
 * Highlight a string in text without corrupting HTML tags
 *
 * @author      Aidan Lister <aidan@php.net>
 * @version     3.1.1
 * @link        http://aidanlister.com/repos/v/function.str_highlight.php
 * @param       string          $text           Haystack - The text to search
 * @param       array|string    $needle         Needle - The string to highlight
 * @param       int             $options        Bitwise set of STR_HIGHLIGHT_* options
 * @param       string          $highlight      Replacement string passed to preg_replace
 * @return      string          Text with needle highlighted
 */
function str_highlight($text, $needle, $options = null, $highlight = null)
{
    // Default highlighting
    if ($highlight === null) {
        $highlight = '<strong>\1</strong>';
    }
 
    // Select pattern to use
    if ($options & STR_HIGHLIGHT_SIMPLE) {
        $pattern = '#(%s)#';
        $sl_pattern = '#(%s)#';
    } else {
        $pattern = '#(?!<.*?)(%s)(?![^<>]*?>)#';
        $sl_pattern = '#<a\s(?:.*?)>(%s)</a>#';
    }
 
    // Case sensitivity
    if (!($options & STR_HIGHLIGHT_CASESENS)) {
        $pattern .= 'i';
        $sl_pattern .= 'i';
    }
 
    $needle = (array) $needle;
    foreach ($needle as $needle_s) {
        // Escape regex metacharacters, including the '#' delimiter
        $needle_s = preg_quote($needle_s, '#');
 
        // Optional whole word check
        if ($options & STR_HIGHLIGHT_WHOLEWD) {
            $needle_s = '\b' . $needle_s . '\b';
        }
 
        // Strip links
        if ($options & STR_HIGHLIGHT_STRIPLINKS) {
            $sl_regex = sprintf($sl_pattern, $needle_s);
            $text = preg_replace($sl_regex, '\1', $text);
        }
 
        $regex = sprintf($pattern, $needle_s);
        $text = preg_replace($regex, $highlight, $text);
    }
 
    return $text;
}
 
?>

Example

<pre>
<?php
require_once 'function.str_highlight.php';
 
// Simple Example
$string = 'This is a site about PHP and SQL';
$search = array('php', 'sql');
echo str_highlight($string, $search);
echo "\n";
 
// With HTML in the text
$string = 'Link to <a href="php">php</a>';
$search = 'php';
echo htmlspecialchars(str_highlight($string, $search));
echo "\n";
 
// Matching whole words only
$string = 'I like to eat bananas with my nana!';
$search = 'Nana';
echo str_highlight($string, $search, STR_HIGHLIGHT_SIMPLE|STR_HIGHLIGHT_WHOLEWD);
echo "\n";
 
// With custom highlighting
$string = 'With custom highlighting!';
$search = 'custom';
$highlight = '<span style="text-decoration: underline;">\1</span>';
echo str_highlight($string, $search, STR_HIGHLIGHT_SIMPLE, $highlight);
echo "\n";
 
// With links
$string = 'I am a <a href="http://www.php.net">link</a>';
$search = 'link';
$highlight = '<a href="http://www.google.com/">\1</a>';
echo htmlspecialchars(str_highlight($string, $search, STR_HIGHLIGHT_STRIPLINKS, $highlight));
 
?>
</pre>

Output

This is a site about PHP and SQL
Link to <a href="php"><strong>php</strong></a>
I like to eat bananas with my nana!
With custom highlighting!
I am a <a href="http://www.google.com/">link</a>

Note: in a browser, lines 1, 3 and 4 show the matched words (PHP, SQL, nana, custom) in bold or underlined; that formatting is not visible in plain text. Lines 2 and 5 show the markup itself because the example passes them through htmlspecialchars().

Comments

July 13th, 2006
Fantastic stuff!
June 27th, 2006
Thank You very much! Very simple and useful!
April 24th, 2006
Hello Aidan, thanks for the script. There is only one problem I have come across and that is ... The output becomes My name is toseef and I want to see the world Notice how toseef isn't right? [Editor's Note: Yes, if the search strings overlap, the word will be highlighted twice. Unfortunately, this cannot be solved without drastically increasing the complexity of the function. On the positive side, the user should rarely notice.]
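
One possible way around the double highlighting described above is to match all needles in a single pass instead of one pass per needle. The sketch below is not part of str_highlight: the helper name `highlight_once` is hypothetical, the needles `see` and `toseef` are assumed (the original comment does not show them), and it handles plain text only, with no attempt to skip HTML tags.

```php
<?php
/**
 * Hypothetical helper (not part of str_highlight): highlight several
 * needles in one pass so that overlapping needles are not wrapped twice.
 * Plain text only; HTML tags are not treated specially.
 */
function highlight_once($text, array $needles, $open = '<strong>', $close = '</strong>')
{
    // Drop empty needles, which would otherwise match at every position
    $needles = array_filter($needles, 'strlen');
    if (!$needles) {
        return $text;
    }

    // Longest first: at a given position the engine takes the first
    // alternative that matches, so a longer needle must precede its prefix
    usort($needles, function ($a, $b) {
        return strlen($b) - strlen($a);
    });

    // Quote each needle for use inside a '#'-delimited pattern
    $quoted = array_map(function ($n) {
        return preg_quote($n, '#');
    }, $needles);

    $regex = '#(' . implode('|', $quoted) . ')#i';

    // Each stretch of text is consumed by at most one match
    return preg_replace_callback($regex, function ($m) use ($open, $close) {
        return $open . $m[1] . $close;
    }, $text);
}

// Assumed needles for illustration
echo highlight_once('My name is toseef and I want to see the world', array('see', 'toseef'));
// My name is <strong>toseef</strong> and I want to <strong>see</strong> the world
```

Because the text is scanned once, the output of an earlier replacement is never searched again, which is what causes the nested tags in the per-needle loop.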
March 25th, 2006
Awesome! I stole your regex for something completely different, but I was long looking for such a thing.
December 8th, 2005
Exactly what I was looking for !!! Many Thanks David
October 21st, 2005
Just found that the word boundary flag (\b) makes the pattern fail if it is next to a high ASCII character. Example: \b?t?\b \W doesn't have this limitation but doesn't mean the same thing. Any idea how I could work around this? Otherwise, this library is really cool! [Editor's Note: Unfortunately this is a problem with the regex library and probably won't be fixed until PHP6]
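
For UTF-8 text, one workaround for the word boundary problem is to avoid \b and spell out the boundary with Unicode property classes under the `u` modifier. This is a sketch under assumptions: the helper name `highlight_word_utf8` is hypothetical, the input and the source file are assumed to be valid UTF-8, PCRE is assumed to have Unicode support, and the French word "été" is used purely for illustration.

```php
<?php
/**
 * Hypothetical helper: whole-word highlighting for UTF-8 text.
 * Assumes $text and $needle are valid UTF-8.
 */
function highlight_word_utf8($text, $needle, $highlight = '<strong>\1</strong>')
{
    $quoted = preg_quote($needle, '#');

    // "Not preceded and not followed by a letter, digit or underscore",
    // written with Unicode properties instead of \b
    $regex = '#(?<![\p{L}\p{N}_])(' . $quoted . ')(?![\p{L}\p{N}_])#iu';

    $result = preg_replace($regex, $highlight, $text);

    // preg_replace returns null on error (for example, invalid UTF-8)
    return $result === null ? $text : $result;
}

echo highlight_word_utf8('Un été chaud, la météo', 'été');
// Un <strong>été</strong> chaud, la météo
```

The match inside "météo" is rejected because it is preceded by the letter "m".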
April 29th, 2005
Thanks for a great function! Keep up the good work...
April 25th, 2005
Aidan... You're my hero :)
March 28th, 2005
Great job ! I reused your script in my website ! Thanks a lot.
March 22nd, 2005
Hi, good work. Enjoy it.
March 15th, 2005
When a Title is entered into the link, and the title contains a keyword it creates a link inside of a link. Can you fix this??? [Editor's Note: Not really, the regex becomes too complicated. It may be easier to start lexing the HTML instead.]
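
The editor's suggestion of lexing the HTML can be approximated with a very small tokenizer: split the input into tags and text, track whether the current position is inside an `<a>` element, and highlight only the text outside links. This is a rough sketch, not a real HTML parser; the helper name `highlight_outside_links` is hypothetical, and it assumes attribute values never contain a literal `>`.

```php
<?php
/**
 * Hypothetical helper: highlight $needle only in text that is
 * outside tags and outside existing <a>...</a> elements.
 */
function highlight_outside_links($html, $needle, $highlight = '<strong>\1</strong>')
{
    $regex = '#(' . preg_quote($needle, '#') . ')#i';

    // With one capture group, the result alternates:
    // even indexes are text, odd indexes are tags
    $parts = preg_split('#(<[^>]*>)#', $html, -1, PREG_SPLIT_DELIM_CAPTURE);

    $depth = 0; // how many <a> elements are currently open
    foreach ($parts as $i => $part) {
        if ($i % 2 === 1) {
            if (preg_match('#^<a[\s>]#i', $part)) {
                $depth++;
            } elseif (preg_match('#^</a\s*>#i', $part)) {
                $depth = max(0, $depth - 1);
            }
            continue; // tags are never modified
        }
        if ($depth === 0) {
            $parts[$i] = preg_replace($regex, $highlight, $part);
        }
    }

    return implode('', $parts);
}

$html = 'See <a href="x" title="php docs">php docs</a> for php';
echo highlight_outside_links($html, 'php', '<a href="/s">\1</a>');
// See <a href="x" title="php docs">php docs</a> for <a href="/s">php</a>
```

The existing link, including its title attribute, is left untouched, so no link is created inside another link.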
February 5th, 2005
Jacky, Look at the example "with custom highlighting". Something like: style="color: red;" would work fine.
February 2nd, 2005
great job! nice tool! question: is it possible to highlight the string with a special color?
January 25th, 2005
if you search for ' '.$needle.' ' it should avoid any html links
January 25th, 2005
Can you make it so that it doesn't replace already made links containing the word? I just want it to skip the replacing if the highlighted word is in a link (i.e. between ... ).
January 20th, 2005
Great, tenx a lot. I've searched all day for something like that and found only crabs. Really tenx. Great!!!
December 15th, 2004
I like it
December 3rd, 2004
Awesome little bit of code! If the search comes up with a page that contains a large amount of text how can I show just, say, 50 words from the text WHILE ALSO including the 50-word section that contains at least one of the searched-for words? By the way, I added $needle = explode(' ',$needle); to the top of the function to convert my search words into an array which works really well at finding all occurrences of searched words regardless of whether or not they are adjacent to each other. Thanks again and I look forward to some ideas of how I can summarise my search results. - Galen
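
The summarising question above could be approached by splitting the text into words, locating the first word that contains a search term, and slicing a window of words around it. The following is a minimal sketch only; the helper name `excerpt_around_match` and the ellipsis markers are assumptions, and the result could then be passed to str_highlight.

```php
<?php
/**
 * Hypothetical helper: return up to $limit words of $text, positioned so
 * the window contains the first word that matches any of $needles.
 * If nothing matches, the first $limit words are returned.
 */
function excerpt_around_match($text, array $needles, $limit = 50)
{
    // Split on runs of whitespace, discarding empty pieces
    $words = preg_split('/\s+/', trim($text), -1, PREG_SPLIT_NO_EMPTY);

    // Index of the first word containing any needle (case-insensitive)
    $hit = 0;
    foreach ($words as $i => $word) {
        foreach ($needles as $needle) {
            if ($needle !== '' && stripos($word, $needle) !== false) {
                $hit = $i;
                break 2;
            }
        }
    }

    // Centre the window on the hit, clamped to the start of the text
    $start = max(0, $hit - (int) floor($limit / 2));
    $slice = array_slice($words, $start, $limit);

    // Mark truncation at either end
    return ($start > 0 ? '... ' : '')
        . implode(' ', $slice)
        . ($start + $limit < count($words) ? ' ...' : '');
}

$text = 'one two three four five six seven eight nine ten';
// The search string is split into words, as in the comment above
echo excerpt_around_match($text, explode(' ', 'eight'), 4);
// ... six seven eight nine ...
```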
November 30th, 2004
WOW! That's what I was searching for a long while! THX!!!
November 5th, 2004
I tried to solve a similar problem but when I saw your function I immediately realized I could re-use your idea. Elegant solution! Thanks a lot!
November 5th, 2004
I love this solution. Quick and easy. However I'm such a novice at regular expressions... How do you make it so that it only highlights "whole" words? Thanks! [Editor's Note: I've added this in as an option]
November 5th, 2004
this looks great and a good solution for me.
November 5th, 2004
Thanks a lot! I was about to code a similar function, but yours does the job with style!
November 5th, 2004
Many thanks, from Argentina. Thanks!
November 5th, 2004
Thanks for your str_highlight! Useful for all our search results at www.3fragezeichen.de! Matthias
November 5th, 2004
Thanks a lot, very useful! I've got the same problem as muscottyb: I'm a novice at regular expressions and want to highlight a string case-sensitively. How to do this? Thanks! [Editor's Note: Set fifth param to true]
November 5th, 2004
hi aidan's wonderfull!!!!!!!
November 5th, 2004
Thank you very much for sharing your work. I am a real beginner and this helped me out in my current project. But, more importantly it helped me learn.
November 5th, 2004
Due to the number of options people have kindly requested, I've had to change the function prototype. You may now use bitwise constants to set each option in the 3rd parameter. I've added support for highlighting with links with the STR_HIGHLIGHT_STRIPLINKS constant. This will simply remove the existing link if a match is found in the text, ostensibly to be replaced by the replacement link. Note: If you don't like the length of the constants, you can use the values instead. For example, STR_HIGHLIGHT_SIMPLE|STR_HIGHLIGHT_WHOLEWD is the same as 1|2 or 3. Also note the highlight parameter no longer takes an array, but a string which is fed directly to preg_replace. Thanks for all the feedback guys, let me know how you like the new changes.
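
The bitwise options described above can be checked in isolation. This small sketch repeats the constant definitions from function.str_highlight.php so that it runs on its own, and shows how flags are combined with `|` and tested with `&`.

```php
<?php
// Same values as in function.str_highlight.php; the defined() guards
// avoid redefinition if that file has already been included
defined('STR_HIGHLIGHT_SIMPLE')     || define('STR_HIGHLIGHT_SIMPLE', 1);
defined('STR_HIGHLIGHT_WHOLEWD')    || define('STR_HIGHLIGHT_WHOLEWD', 2);
defined('STR_HIGHLIGHT_CASESENS')   || define('STR_HIGHLIGHT_CASESENS', 4);
defined('STR_HIGHLIGHT_STRIPLINKS') || define('STR_HIGHLIGHT_STRIPLINKS', 8);

// Combine two options into a single integer
$options = STR_HIGHLIGHT_SIMPLE | STR_HIGHLIGHT_WHOLEWD;

var_dump($options === 3);                             // bool(true)
var_dump(($options & STR_HIGHLIGHT_WHOLEWD) !== 0);   // bool(true): flag is set
var_dump(($options & STR_HIGHLIGHT_CASESENS) !== 0);  // bool(false): flag is not set
```

Each constant is a distinct power of two, so any combination maps to a unique integer and individual flags can be tested independently.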