PHPBuilder - Dynamic Document Search Engine - Part 2




by: M.Murali Dharan | February 25, 2004

Introduction:

Part 1 of this article discussed document-based searches that rank results by the number of search words found in each document. This part extends that approach: results are ranked by the number of search words found plus the number of occurrences of each search word in the document.
Consider a search for “php tutorials and examples”. The following table shows, for each matching article, the number of occurrences of each search word. The program removes common words such as “is”, “was” and “and” from the search terms, so this example has three search words: ‘php’, ‘tutorials’ and ‘examples’.
No.  Article Number  php  tutorials  examples  Total Occurrences  Rank
1.   Article #189    15   11         16        42                 3
2.   Article #203    25   12         8         45                 1
3.   Article #257    18   16         5         39                 4
4.   Article #145    6    8          17        31                 5
5.   Article #526    5    17         21        43                 2
6.   Article #86     14   4          10        28                 6
Article #203 has the highest total number of occurrences (45), so it is given rank 1. The other results are ranked the same way, in descending order of total occurrences.
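The ranking described above can be sketched in PHP. This is only an illustration, not the article's own implementation: the function names, the short stop-word list and the in-memory array of counts are all assumptions made for the example, and the counts are copied from the table.

```php
<?php
// Hypothetical stop-word list; the article does not give the actual list.
const STOP_WORDS = ['a', 'an', 'and', 'are', 'is', 'of', 'the', 'was'];

/**
 * Split a query into lowercase search words and drop common words.
 */
function extractSearchWords(string $query): array
{
    $words = preg_split('/[^a-z0-9]+/', strtolower($query), -1, PREG_SPLIT_NO_EMPTY);
    $words = array_diff($words, STOP_WORDS);
    return array_values(array_unique($words));
}

/**
 * Rank documents by total occurrences, highest total first.
 * $counts maps a document label to [search word => occurrences].
 * Returns [document label => rank]. Documents with equal totals
 * simply receive consecutive ranks in this sketch.
 */
function rankByTotalOccurrences(array $counts): array
{
    // array_map keeps the string keys because only one array is passed.
    $totals = array_map('array_sum', $counts);
    arsort($totals); // sort by total, descending, preserving keys

    $ranks = [];
    $rank = 1;
    foreach (array_keys($totals) as $doc) {
        $ranks[$doc] = $rank++;
    }
    return $ranks;
}

$words = extractSearchWords('php tutorials and examples');

// Occurrence counts taken from the table above.
$counts = [
    'Article #189' => ['php' => 15, 'tutorials' => 11, 'examples' => 16],
    'Article #203' => ['php' => 25, 'tutorials' => 12, 'examples' => 8],
    'Article #257' => ['php' => 18, 'tutorials' => 16, 'examples' => 5],
    'Article #145' => ['php' => 6,  'tutorials' => 8,  'examples' => 17],
    'Article #526' => ['php' => 5,  'tutorials' => 17, 'examples' => 21],
    'Article #86'  => ['php' => 14, 'tutorials' => 4,  'examples' => 10],
];
$ranks = rankByTotalOccurrences($counts);
```

With the table's data, `$words` contains the three search words and `$ranks` reproduces the Rank column, with Article #203 first and Article #86 last.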
Building The Database:
The database consists of three tables: a content (document) table, a keyword table and a link table. The content table holds each article’s title and abstract. The keyword table holds the keywords, and its keyword field is indexed. The link table holds a keyword id, a content id and the number of occurrences of that keyword in that document.
The SQL statements for creating these three tables are shown below.
Content Table:
CREATE TABLE content (
    contid mediumint NOT NULL auto_increment,
    title text NOT NULL,
    abstract longtext NOT NULL,
    PRIMARY KEY (contid)
) ENGINE=MyISAM;
Keyword Table:
CREATE TABLE keytable (
    keyid mediumint NOT NULL auto_increment,
    keyword varchar(100) NOT NULL,
    PRIMARY KEY (keyid),
    KEY keyword (keyword)
) ENGINE=MyISAM;
Link Table:
CREATE TABLE link (
    keyid mediumint NOT NULL,
    contid mediumint NOT NULL,
    occurances mediumint NOT NULL
) ENGINE=MyISAM;
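To show how the three tables relate, here is a minimal sketch of a helper that builds a ranking query over them. It is an assumption about how the schema might be queried, not the query used by the article: the function name `buildRankingQuery`, the aliases `words_found` and `total_occurrences`, and the ordering (distinct search words found first, then total occurrences) are all hypothetical. It also assumes each keyword appears once in keytable and that link holds at most one row per keyword and document pair. The column name `occurances` is spelled as in the schema above.

```php
<?php
/**
 * Build a parameterized ranking query for the given search words.
 * Returns [SQL string, parameter list], suitable for a prepared statement.
 */
function buildRankingQuery(array $searchWords): array
{
    if (count($searchWords) === 0) {
        throw new InvalidArgumentException('At least one search word is required.');
    }

    // One "?" placeholder per search word, for example "?, ?, ?".
    $placeholders = implode(', ', array_fill(0, count($searchWords), '?'));

    // keytable -> link -> content: find the documents that contain any of
    // the search words, then count the words found and sum their occurrences.
    $sql = "SELECT c.contid, c.title,
       COUNT(*) AS words_found,
       SUM(l.occurances) AS total_occurrences
FROM keytable k
JOIN link l ON l.keyid = k.keyid
JOIN content c ON c.contid = l.contid
WHERE k.keyword IN ($placeholders)
GROUP BY c.contid, c.title
ORDER BY words_found DESC, total_occurrences DESC";

    return [$sql, array_values($searchWords)];
}

list($sql, $params) = buildRankingQuery(['php', 'tutorials', 'examples']);
```

The returned SQL and parameters could then be passed to a prepared statement, for instance through PDO's `prepare()` and `execute()`, so that the search words are never concatenated into the query text.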
