Algorithm for filtering
-
The web query page obtained is divided into 2 parts: one
describing the auction Time, Price & Region, and the other containing
the Description of the product.
-
If the essentials of the 1st part aren't satisfied, i.e. the
price is above the range specified, or the country doesn't match, or
the time for closure is beyond the one the user wants, then the page is
rejected directly, else:
-
Words in the Description are matched with the keywords &
the synonyms generated using WordNet.
-
The stopwords are eliminated using the stopword database
generated with WordNet.
-
The vector summation of the products of (weight specified by
the user & frequency of occurrence of the keyword in the Description)
is obtained over all keywords.
-
The summation is normalised by dividing it by the square
root of (No. of words in the Description - No. of stopwords).
-
These normalised summations for the different Description files are sorted.
-
The average difference between the values of adjacent
files in the sorted list is obtained.
-
If the user's option for search results is <10%,
<20%, <50% or All results, then that proportion of the search results
is mailed to him directly.
-
If the user's option is Histogram filter, then within the best
20% of results we start from the bottom & go upwards until the difference
between two adjacent files is greater than the average difference obtained
for all of them. All the results above this point are sent to him.
-
We also store the value of the first unselected result, to be
used in the corresponding follow-up searches.
-
The selected & the rejected results are stored in a file.
-
The selected results are mailed to the user.
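
The steps above can be sketched in Python as follows. This is only an
illustrative reading of the list, not the original implementation: the
function names, the whitespace tokenisation, the use of hours for the
closing time, counting stopword occurrences (rather than distinct
stopwords) in the normalisation, rounding percentages down, and keeping
the whole best-20% slice when no large gap is found are all assumptions.

```python
import math


def passes_essentials(price, country, hours_to_close,
                      max_price, wanted_country, max_hours):
    """First-part check: price, country and closing time must all fit."""
    return (price <= max_price
            and country == wanted_country
            and hours_to_close <= max_hours)


def score(description, weights, synonyms, stopwords):
    """Weighted keyword frequency, normalised by sqrt(words - stopwords).

    weights:  keyword -> weight given by the user
    synonyms: keyword -> set of synonyms (e.g. taken from WordNet)
    """
    words = description.lower().split()   # assumes already-cleaned text
    content = [w for w in words if w not in stopwords]
    if not content:
        return 0.0
    total = 0.0
    for keyword, weight in weights.items():
        terms = {keyword} | synonyms.get(keyword, set())
        frequency = sum(1 for w in content if w in terms)
        total += weight * frequency
    return total / math.sqrt(len(content))


def average_gap(scores):
    """Mean difference between adjacent scores (sorted descending)."""
    if len(scores) < 2:
        return 0.0
    gaps = [a - b for a, b in zip(scores, scores[1:])]
    return sum(gaps) / len(gaps)


def percentage_cut(scores, percent):
    """How many results to keep for the 10, 20, 50 or 100 (All) options."""
    return len(scores) * percent // 100   # assumption: round down


def histogram_cut(scores, top_percent=20):
    """How many results to keep under the Histogram filter."""
    top = percentage_cut(scores, top_percent)
    avg = average_gap(scores)
    # Walk upwards from the bottom of the best 20% of results.
    for i in range(top - 1, 0, -1):
        if scores[i - 1] - scores[i] > avg:
            return i   # keep scores[:i]; scores[i] is the first unselected
    return top         # assumption: no large gap, keep the whole slice
```

For twenty sorted scores whose only large gap falls after the second
result, histogram_cut returns 2, and scores[2] is the value that would be
stored as the first unselected result.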
Results selection for corresponding searches
-
Results rejected won't be considered again.
-
Each newly appearing entry is compared with the old selected
ones on the basis of the average threshold, provided the necessary
essentials are satisfied. Here we also take into account the value of the
first unselected file from the previous search.
-
The old selected entries are also checked for any changes
in price; if the price is unchanged, the entry is not notified to the
user again.
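
A minimal Python sketch of one possible follow-up pass is given below. The
list does not say exactly how the average threshold and the stored first
unselected value are combined, so this sketch simply assumes that a new
entry is accepted when its score is higher than the stored first unselected
value. SearchState, follow_up and the shape of the entries are hypothetical
names introduced only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SearchState:
    """What is kept on file between searches (hypothetical layout)."""
    selected: dict = field(default_factory=dict)  # id -> last mailed price
    rejected: set = field(default_factory=set)    # ids never shown again
    first_unselected: float = 0.0                 # cutoff from last search


def follow_up(state, entries):
    """Return the ids to mail for a follow-up search.

    entries: iterable of (listing_id, price, score, essentials_ok)
    """
    notify = []
    for listing_id, price, score, essentials_ok in entries:
        if listing_id in state.rejected:
            continue                      # rejected results stay rejected
        if listing_id in state.selected:
            # Old selected entry: mail it again only if the price changed.
            if price != state.selected[listing_id]:
                state.selected[listing_id] = price
                notify.append(listing_id)
            continue
        # New entry. Assumption: accept it when it beats the stored cutoff.
        if essentials_ok and score > state.first_unselected:
            state.selected[listing_id] = price
            notify.append(listing_id)
        else:
            state.rejected.add(listing_id)
    return notify
```

Keeping the last mailed price next to each selected id makes the price
check a single dictionary lookup, and the rejected set ensures an entry is
scored at most once across searches.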