PHPBuilder - Dynamic Document Search Engine - Part 1 Page 4




Dynamic Document Search Engine - Part 1 - Page 4

by: M.Murali Dharan | February 17, 2004

ExtractWords() Function:

This function extracts words from the text, keeping only alphabetic characters. To do this, I used a simple state machine that examines the text one character at a time.
The machine is in STATE1 while it is reading alphabetic characters and in STATE0 while it is reading anything else (digits and special characters). It starts in STATE0. When it encounters an alphabetic character, it switches to STATE1 and begins collecting a word; otherwise it stays in STATE0. When a non-alphabetic character arrives in STATE1, the collected word is stored in lowercase and the machine returns to STATE0. As a result, every word in the list contains only alphabetic characters.

<?php
function ExtractWords($text) {
    $STATE0 = 0;  // Numeric / other characters
    $STATE1 = 1;  // Alphabetic characters
    $state = $STATE0;

    $wordList = array();
    $curWord = "";

    for ($i = 0; $i < strlen($text); ++$i) {
        // Square brackets replace the older $text{$i} syntax, removed in PHP 8.
        $ch = $text[$i];
        $isAlpha = ctype_alpha($ch);

        if ($state == $STATE0) {
            if ($isAlpha) {
                // First letter of a new word: start collecting.
                $curWord = $ch;
                $state = $STATE1;
            }
        }
        else if ($state == $STATE1) {
            if ($isAlpha) {
                $curWord .= $ch;
            }
            else {
                // A non-letter ends the current word.
                $wordList[] = strtolower($curWord);
                $state = $STATE0;
            }
        }
    }

    // Store the last word if the text ended on a letter.
    if ($state == $STATE1) {
        $wordList[] = strtolower($curWord);
    }

    return $wordList;
}
?>
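
To check what the state machine should return for a given input, it can help to compare it against a much shorter regular-expression version. The sketch below is not part of the original article: `extractWordsRegex` is a hypothetical helper name, and the pattern assumes plain ASCII letters, which matches `ctype_alpha()` only under the default "C" locale.

```php
<?php
// Hypothetical helper (not from the article): collects the same runs of
// alphabetic characters with a regular expression instead of a state machine.
function extractWordsRegex(string $text): array
{
    // Each match is a maximal run of ASCII letters, which corresponds to
    // one uninterrupted stay in STATE1.
    preg_match_all('/[A-Za-z]+/', $text, $matches);

    // Lowercase every word, as the state machine version does.
    return array_map('strtolower', $matches[0]);
}

print_r(extractWordsRegex("PHP4 rocks, doesn't it?"));
// words found: php, rocks, doesn, t, it
```

Note that the digit and the apostrophe both act as word separators, so "doesn't" is split into "doesn" and "t". The state machine behaves the same way, because any non-alphabetic character sends it back to STATE0.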
