Wednesday, February 23, 2011

Project workflow and description (Business Profile extraction), first version.


BusinessSearch Project work flow and description

Objective of BusinessSearch Project:
The main goal of this project is to collect business profiles in an automated way and to develop a system that can update each profile on a quarterly or half-yearly basis.



Flow Description:
The target of our project is to collect as many business profiles as possible. A business profile depends entirely on a host name: given a single host name, we can create a profile from its home page, "contact us" page, and "about us" page. At present we do not have enough host names to extract business profiles for the project. That is why our main target is to collect as many host names as possible and convert each one into a complete business profile.
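The step from a host name to the pages worth fetching can be sketched as below. This is only a minimal illustration, not the project's actual crawler: the function name `candidate_urls` and the list of page paths are assumptions, since real sites name their contact and about pages in many different ways.

```python
from urllib.parse import urljoin

# Hypothetical page paths (an assumption); real sites use many variants.
CANDIDATE_PATHS = ["/", "/contact", "/contact-us", "/about", "/about-us"]


def candidate_urls(host_name, scheme="http"):
    """Return the URLs that might hold profile data for one host name."""
    base = f"{scheme}://{host_name.strip().lower()}"
    # urljoin resolves each absolute path against the host's base URL.
    return [urljoin(base, path) for path in CANDIDATE_PATHS]


if __name__ == "__main__":
    for url in candidate_urls("Example.com"):
        print(url)
```

A crawler would then fetch each URL, skip the ones that do not exist, and pass the page text on to the attribute extractors.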

Current / available attributes for Business profile are:
·         Fax
·         Email
·         Website
·         Zip code
·         Address
·         Phone no
·         GPS location
·         Contact person
·         Contact person’s designation
Upcoming/ future attributes are:
·         Business type
·         Business Description
·         Owner/ Proprietor/ Director
·         Establishment date
·         Registration Date
·         Logo
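
The two attribute lists above could be held in a single record type. The sketch below is an assumption about how such a record might look, not the project's real schema: the class name `BusinessProfile`, the field names, and the helper `missing_attributes` are all made up for illustration.

```python
from dataclasses import dataclass, fields
from typing import List, Optional, Tuple


@dataclass
class BusinessProfile:
    # The host name / website is the key that everything else depends on.
    website: str
    # Current attributes (None means "not extracted yet").
    fax: Optional[str] = None
    email: Optional[str] = None
    zip_code: Optional[str] = None
    address: Optional[str] = None
    phone: Optional[str] = None
    gps_location: Optional[Tuple[float, float]] = None  # (latitude, longitude)
    contact_person: Optional[str] = None
    contact_designation: Optional[str] = None
    # Upcoming / future attributes.
    business_type: Optional[str] = None
    business_description: Optional[str] = None
    owner: Optional[str] = None
    establishment_date: Optional[str] = None
    registration_date: Optional[str] = None
    logo_url: Optional[str] = None

    def missing_attributes(self) -> List[str]:
        """List the attributes that still have no value."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]


if __name__ == "__main__":
    profile = BusinessProfile(website="example.com", email="info@example.com")
    print(profile.missing_attributes())
```

Keeping every attribute optional makes it easy to see which fields a quarterly or half-yearly update still needs to fill.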

Challenges (web crawler):
·         Robustness
·         Mirror
·         Hashing
·         Unexpected bugs
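
One possible reading of the "Mirror" and "Hashing" items is that the crawler has to notice when two host names serve the same content. The sketch below shows that idea with a content hash. It is an assumption about the approach, not a description of the project's code, and the names `content_fingerprint` and `MirrorFilter` are hypothetical. It only catches exact copies after whitespace and case normalization; near-duplicates would need a different technique.

```python
import hashlib
import re


def content_fingerprint(html_text):
    """Hash the page text after collapsing whitespace and lowercasing."""
    normalized = re.sub(r"\s+", " ", html_text).strip().lower()
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()


class MirrorFilter:
    """Remember which host first served each fingerprint."""

    def __init__(self):
        self._first_host = {}  # fingerprint -> host name that served it first

    def is_mirror(self, host_name, html_text):
        fingerprint = content_fingerprint(html_text)
        # setdefault stores the host on first sight and returns the stored one.
        first = self._first_host.setdefault(fingerprint, host_name)
        return first != host_name


if __name__ == "__main__":
    mirrors = MirrorFilter()
    print(mirrors.is_mirror("a.example", "<p>Hello  world</p>"))
    print(mirrors.is_mirror("b.example", "<p>hello world</p>\n"))
```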

Work done so far in Business Search:
·         Fax: 80-90% accuracy
·         Phone: 80-90%
·         Zip code: 70-80%
·         Website: 85-90%
·         Email: 85-90%
·         Address: 45-50%
·         Company name: 45-50%
·         Contact person: 50-60%
·         Contact person's designation: 60-70%
·         GPS coordinates: 60-70% (we will revisit this later with the support of iSearch/Lucene)
·         Branch: 70-80%
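
As an illustration of what one of these extractors might look like, here is a minimal sketch for the email attribute. The regular expression is a deliberate simplification and an assumption, not the pattern used in the project, and `extract_emails` is a hypothetical name. Fields such as phone, fax, and zip code depend on country-specific formats and are left out.

```python
import re

# Simplified email pattern (an assumption); it does not cover every valid address.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(page_text):
    """Return the distinct email addresses in page_text, in order of appearance."""
    found = []
    for match in EMAIL_RE.findall(page_text):
        email = match.lower()
        if email not in found:
            found.append(email)
    return found


if __name__ == "__main__":
    text = "Contact: Info@Example.com or sales@example.com, info@example.com."
    print(extract_emails(text))
```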

Yet to Develop:
·         Business type – Machine Learning
·         Business Description – Snippet
·         Owner/ Proprietor/ Director – Natural Language Processing
·         Establishment date – Parsing
·         Registration Date – Parsing
·         Logo – Parsing + Tricks + Crawling



