09-24-2008, 06:04 AM, #1, Dave (thread starter)

PHP5 Basic Spam Class


Hey guys,

This class contains a method that helps you determine whether a string is spam. Here is what it checks:
  • Length - Checks the string to make sure it isn't too short or too long.
  • Standard Text - Makes sure the string is standard text, which means only letters, numbers, whitespace, dashes, underscores, periods, question marks and exclamation marks.
  • Links - Makes sure the string doesn't contain too many links (HTML anchors, BBCode [url] tags or plain http:// URLs).
  • Optional basic grammar check - It can check that the first letter of the string is capitalized and that no word repeats a character too many times. This stops stupid messages like 'hiiiii' or 'you smellll!!!!!'.
  • Bad words - Makes sure the string doesn't contain any bad words that you don't want!
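
One caveat on the bad-words check: the class below uses stripos(), which is a substring match, so a bad word like 'bad' also rejects 'badge'. If that is too strict for you, here is a minimal sketch of a whole-word variant. The function name contains_bad_word() is made up for this illustration and is not part of the class.

```php
<?php
// Hypothetical helper, not part of the original Spam class.
// Returns true if $str contains any of $bad_words as a whole word.
function contains_bad_word($str, array $bad_words)
{
    foreach ($bad_words as $bad_word) {
        // preg_quote() escapes regex metacharacters in the word. The two
        // lookarounds require that no letter or digit touches the word on
        // either side, so 'bad' matches "bad!" but not "badge".
        $pattern = '/(?<![\pL\pN])' . preg_quote($bad_word, '/') . '(?![\pL\pN])/iu';
        if (preg_match($pattern, $str) === 1) {
            return true;
        }
    }

    return false;
}

var_dump(contains_bad_word('That was a bad idea', array('bad'))); // bool(true)
var_dump(contains_bad_word('He wore a badge', array('bad')));     // bool(false)
```

Whether whole-word matching is the better choice depends on your word list. Substring matching catches more disguised variants but produces more false positives.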

Spam Class

PHP Code:
<?php
/**
 * Basic spam class for PHP5. Aids in determining whether a
 * string is spam using several factors.
 *
 * @package        Spam
 * @author         David Parr <dave@snezo.com>
 * @copyright      Copyright (c) David Parr, 2008
 */

class Spam
{
    // Configuration
    protected $config;

    // Bad words that we don't want in the string
    protected $bad_words;

    /**
     * Constructor. Sets configuration.
     *
     * @param array Configuration
     * @param array Bad words to reject
     * @return void
     */
    public function __construct($config, $bad_words)
    {
        $this->config = $config;
        $this->bad_words = $bad_words;
    }
    
    
    /**
     * Performs several tests on a string to help
     * determine whether or not it is spam.
     *
     * @param string String we are checking
     * @return bool True if the string passed every check, false if it looks like spam
     */
    public function check($str)
    {
        // Check the length of the string isn't too short or too long.
        // Note that strlen() counts bytes, not characters.
        $length = strlen($str);
        if($length < $this->config['min_length'] OR $length > $this->config['max_length'])
        {
            return false;
        }
        
        
        // Check the string is standard text (only letters, numbers, whitespace,
        // dashes, underscores, periods, exclamation marks and question marks).
        if( ! preg_match('/^[-\pL\pN\pZ_.!?]++$/uD', $str))
        {
            return false;
        }

        // Count the number of links found in the string
        // (HTML anchors, BBCode [url] tags and plain http:// URLs).
        preg_match_all('#(<a href|\[url|http:\/\/)#i', $str, $matches, PREG_PATTERN_ORDER);
        if(count($matches[0]) > $this->config['max_links'])
        {
            return false;
        }

        // I always like to cleanup after myself :D
        unset($matches);
        
        
        // Grammar check?
        if($this->config['grammar_check'])
        {
            // First letter should always be capitalized, so reject a lowercase first letter.
            if(preg_match('/^\p{Ll}/u', $str))
            {
                return false;
            }

            // No character should appear more than 3 times in any one word. Very few words have more.
            // This is ugly but it does work. If you know of any other way please let me know.
            // This is useful for stopping idiots entering things like "hiiiiii" or "you smeeeeelll!!!!" :o
            $words = explode(' ', $str);
            $found = false;
            foreach($words as $word)
            {
                // The length of the word should be greater than 1 if it's not an 'a' or a number.
                // This prevents things like U and R
                if(strlen($word) < 2 AND strtolower($word) != 'a' AND ! is_numeric($word))
                {
                    $found = true;
                    break;
                }

                // Split the word into characters (UTF-8 aware) and count each one.
                $chars = preg_split('//u', $word, -1, PREG_SPLIT_NO_EMPTY);
                $chars_count = array();
                foreach($chars as $char)
                {
                    if( ! isset($chars_count[$char]))
                    {
                        $chars_count[$char] = 0;
                    }

                    // True on the 4th occurrence of the same character in this word.
                    if($chars_count[$char]++ > 2)
                    {
                        $found = true;
                        break 2;
                    }
                }
            }

            unset($words, $chars, $chars_count);

            if($found)
            {
                return false;
            }
        }
        
        
        // Check for any bad words (case-insensitive substring match).
        foreach($this->bad_words as $bad_word)
        {
            if(stripos($str, $bad_word) !== FALSE)
            {
                return false;
            }
        }

        // If we got here then everything is fine ;)
        return true;
    }
}
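
The comment in the class asks whether there is another way to catch repeated characters. One option, sketched here under my own assumptions, is a regular expression with a backreference. The function name has_long_run() and the $max_run parameter are made up for this example. Note that it behaves differently from the loop in the class: it only looks at characters repeated directly after each other, so a word like 'Mississippi' is not flagged just because it contains four s's in total.

```php
<?php
// Hypothetical helper, not part of the original Spam class.
// Returns true if any character occurs more than $max_run times in a row.
function has_long_run($str, $max_run = 2)
{
    // (.) captures one character (UTF-8 aware because of the /u flag) and
    // \1{N,} requires at least N more copies of it directly afterwards.
    $pattern = '/(.)\1{' . (int) $max_run . ',}/u';

    // preg_match() returns 1 on a match, 0 on no match and false on error
    // (for example invalid UTF-8), so only an actual match counts.
    return preg_match($pattern, $str) === 1;
}

var_dump(has_long_run('hiiiii'));           // bool(true)
var_dump(has_long_run('you smellll!!!!!')); // bool(true)
var_dump(has_long_run('Mississippi'));      // bool(false)
var_dump(has_long_run('Hello'));            // bool(false)
```

Because it works on the whole string, there is no need to split it into words first. The comparison is case sensitive, so 'Aaa' counts as a run of two.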
Example

PHP Code:
<?php

require_once('classes/Spam.php');

$config = array(
    'min_length'    => 10,
    'max_length'    => 255,
    'max_links'     => 0,
    'grammar_check' => true
);

// Replace these with much worse words lol
$bad_words = array('bad', 'terrible', 'awful');

$spam = new Spam($config, $bad_words);

$str = 'Hi everybody how are you!?'; // Will pass the check

$result = $spam->check($str);

if($result)
{
    echo 'String contains no spam';
}
else
{
    echo 'String does contain spam';
}

// TESTS

$str = 'Hello'; // Is too short so would fail
$str = 'http://games.com'; // Fails because of the link (the ':' and '/' already fail the standard text check)

// If we have grammar check on the following will fail
$str = 'hi everyone how are you?'; // First letter isn't capitalized.
$str = 'hiiiii'; // Repetitive (with min_length 10 it is also too short)
$str = 'Hey u r idiot'; // Fails because of u and r
?>
If you don't want the grammar check, simply set 'grammar_check' to false in the config.
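
The class reads every config key directly, so leaving one out (instead of setting it to false) triggers an undefined index notice or warning. A small sketch of one way around that is to merge your settings over a set of defaults. The function name spam_config() is made up, and the default values are only assumptions borrowed from the example above, with the grammar check switched off.

```php
<?php
// Hypothetical helper, not part of the original post.
// Builds a complete config array for the Spam class from partial settings.
function spam_config(array $overrides = array())
{
    // Assumed defaults, taken from the example config except for grammar_check.
    $defaults = array(
        'min_length'    => 10,
        'max_length'    => 255,
        'max_links'     => 0,
        'grammar_check' => false,
    );

    // Keys given in $overrides win; missing keys fall back to $defaults.
    return array_merge($defaults, $overrides);
}

$config = spam_config(array('max_length' => 500));
// $config['max_length'] is now 500 and $config['grammar_check'] is false.
// It can then be passed on as usual: new Spam($config, $bad_words);
```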

Enjoy!
Last edited by Dave; 09-24-2008 at 06:12 AM.