Vision-based document segmentation is a new patent obtained by Microsoft. Web pages mix so many different kinds of material, including content, links, labels and ads, that search engines find it difficult to index them and return the right pages to searchers.
The patent describes breaking a web page down into a number of meaningful blocks. This is done on the basis of how we see the page. The sections are identified as portions of the page with different meanings, and they may be completely unrelated to one another.
Having these blocks helps a search engine when it looks for pages, because it can decide on indexing based on the content found in each part of the page. This helps searchers get more information on what they were looking for.
According to the patent, the page is broken into visual blocks and the differences between page sections are detected. These differences are used to create visual separators between the sections. The blocks and separators are then used to build a content structure for the page. Pages are not ranked directly; instead, the sections and blocks of a page are used for ranking, so that it is easier to find the right information for searchers.
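To make the idea of blocks, separators and block-level ranking more concrete, here is a minimal Python sketch. It is not the method described in the patent. It assumes each block is already known by its vertical position and its text, treats any vertical gap of at least 20 pixels as a separator, and scores sections by simple word overlap with the query. The names Block, find_separators, build_sections and rank_sections, the min_gap threshold and the sample page are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Block:
    top: int      # y coordinate of the block's upper edge, in pixels (assumed)
    bottom: int   # y coordinate of the block's lower edge, in pixels (assumed)
    text: str


def find_separators(blocks, min_gap=20):
    """Return (start, end) vertical gaps of at least min_gap between neighbours."""
    ordered = sorted(blocks, key=lambda b: b.top)
    separators = []
    for upper, lower in zip(ordered, ordered[1:]):
        if lower.top - upper.bottom >= min_gap:
            separators.append((upper.bottom, lower.top))
    return separators


def build_sections(blocks, min_gap=20):
    """Group blocks into sections, starting a new section at each separator."""
    ordered = sorted(blocks, key=lambda b: b.top)
    sections = []
    for block in ordered:
        if sections and block.top - sections[-1][-1].bottom < min_gap:
            sections[-1].append(block)   # close to the previous block: same section
        else:
            sections.append([block])     # first block, or a separator was crossed
    return sections


def rank_sections(sections, query):
    """Score each section by how many query words it contains, best first."""
    terms = set(query.lower().split())
    scored = []
    for section in sections:
        words = set(" ".join(b.text for b in section).lower().split())
        scored.append((len(terms & words), section))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical page: a navigation area, an article area and an ad.
page = [
    Block(0, 40, "Home News Contact"),
    Block(45, 80, "Login Register"),
    Block(120, 300, "Microsoft patent on page segmentation"),
    Block(305, 400, "Blocks and separators form a content structure"),
    Block(450, 500, "Buy cheap widgets today"),
]

sections = build_sections(page)
best_score, best_section = rank_sections(sections, "patent segmentation")[0]
```

In this toy page the navigation links, the article text and the ad end up in three separate sections, and only the article section matches the query. A real system would work from the rendered layout of the page and use far richer visual cues than vertical gaps alone.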

