I attended the “Search Algorithms: The Patent Files” session first thing this morning. The panelists were Rand Fishkin, CEO of SEOmoz.org, Ani Kortikar, Founder and CEO, Netramind, Dr. E. Garcia of Mi Islita.com, and Jon Glick, Senior Director of Product Search, Become.com. My favorite presentation was from Jon. He was not overly technical (Dr. Garcia lost me at the advanced mathematics talking about calculating dot products of vectors) yet he gave solid advice. Here’s what he had to say, in summary:
Take these patents with a grain of salt, because…
– patent applicants don’t need to use all the stuff they include in a patent application.
– patent applicants don’t have to disclose all of their technology’s features in a patent application.
– and they recognize that SEOs and their competitors are poring over their patent apps.
With that said, there are some valuable learnings from the 2003 Google patent. Search engines may take into account: CTR on your page in SERPs, rapid changes in content, rapid growth of in-links, and length of time users spend on your site.
So which of these actually impact your rankings? Some are red herrings, such as:
– Clickthrough rate (CTR): it’s too easy to distort (e.g. through clickbotting, which is evil and likely to get you penalized). Probably CTR is used for demotion only. In other words, high CTR won’t help your organic rankings, but low CTR may lower your rankings.
– Time spent on a site: when users hit the back button almost immediately, it can signify an irrelevant page or 404 error. However, if this were used as a ranking signal, it would in effect reward black hat tactics like mousetrapping and endless pop-ups, which trap users within a site.
– Rate of change in content: Most recent crawl date, last time the content changed, registration date, and first crawl date mostly affect crawl frequency, not ranking. Duplicate detection technologies are used to find meaningful changes in site content. Putting today’s date or today’s weather on the page does not count as a meaningful change, and it doesn’t help rankings. When a site changes its IP address, it is often re-evaluated because it is possibly under new ownership.
According to Jon, what’s not a red herring is:
– Rate of change in links: Most search engines limit how quickly a site can gain connectivity (sandboxing, link aging). A sudden jump in in-links (e.g. from link farming, interlinking, or triangle linking lots of domains) can draw scrutiny. There are exceptions for “spike” sites (editorial review, lots of accompanying news/blog posts, lots of web searches).
Patent lawyers try to patent as much as they can get in each patent they file. In other words, they shoot for the stars, moon and galaxy when they file each patent. This just makes your patent stronger and your company worth more.
“Dr. Garcia lost me at the advanced mathematics talking about calculating dot products of vectors.”
Sorry to hear that. To calculate a dot product between any two points, simply multiply their corresponding coordinates and add the results together. It’s that simple 🙂
So for any two points A and B in 2 dimensions with coordinates A(x1, y1) and B(x2, y2) the DOT Product is just DOT P = x1*x2 + y1*y2
This can be extended to any N-Dimensions; nothing mysterious about it.
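As a rough illustration of the N-dimensional case, here is a minimal Python sketch. The function name dot_product and the sample coordinates are made up for this example; they do not come from the session or from any search engine's code.

```python
from typing import Sequence


def dot_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Multiply corresponding coordinates and add up the results."""
    if len(a) != len(b):
        raise ValueError("both points must have the same number of dimensions")
    return sum(x * y for x, y in zip(a, b))


# 2-dimensional case: x1*x2 + y1*y2
print(dot_product([1, 2], [3, 4]))        # 1*3 + 2*4 = 11

# The same function handles 3 (or N) dimensions without changes.
print(dot_product([1, 2, 3], [4, 5, 6]))  # 1*4 + 2*5 + 3*6 = 32
```

The length check is a design choice for this sketch: a dot product is only defined when both points have the same number of dimensions, so mismatched inputs raise an error rather than being silently truncated.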
A summary is now available at http://www.miislita.com/search-engine-conferences/duplicated-content-patents.pdf
An HTML version is also available.
Dr. Garcia
Hi, Stephen
A tutorial on cosine similarity and dot product is available now at
http://www.miislita.com/information-retrieval-tutorial/cosine-similarity-tutorial.html
Dr. Garcia
Good point, Neil. The way the patent process works, submitters are encouraged to claim the sun, the moon and the stars. They then submit increasingly narrower “claims” and the USPTO usually grants them one star…and only when viewed from a particular angle. What has been notable about some web patents is the range of coverage that the USPTO has granted; I’m attributing this to the relative newness of the field and the lack of experience the patent examiners have with it. IMO many of the current web patents have so much “prior art” that they would be unlikely to withstand a challenge. For example, MSN is clearly using link analysis in its new engine and we haven’t heard a peep from Google about potential violation of their PageRank patent.