cipherdyne.org

Michael Rash, Security Researcher



Software Release - gootrude-0.2

Gootrude-0.2 is available for download. This release changes the default search engine from Google to Yahoo because it appears that Yahoo's Terms of Service do not prohibit automated queries (although I'm not a lawyer). Gootrude retains the ability to query Google at your own risk if you so desire. As long as you use Gootrude as designed, collecting search engine results only once per day with a limited number of search terms (say, 20 or fewer), it is unlikely to provoke a negative reaction from search engines. Over time, support for many additional search engines will be added to Gootrude so that search results can be trended from all sorts of different data sources.

Also, this post should really have announced Gootrude as a "search results trender" instead of a "search trender". The goal of the project is made clear in the README, and here is the complete ChangeLog for the Gootrude-0.2 release:

  • Added support for querying Yahoo for search results and made this the default because Google's TOS does not technically allow automated queries. However, Gootrude continues to support querying Google for search results, and responsible use of Gootrude once per day with a limited set of search terms should most likely not be a cause for concern for Google. A new syntax in the searchterms.conf file lets each search term specify which search engine should be queried, as follows (note the middle element): [Linux "highspeed firewalls"] [yahoo:count] [Linux_highspeed_firewalls.dat]
  • Changed the default USER_AGENT variable to "Gootrude <version>".
  • Added a percentage-based offset for the plot minimum and maximum ranges on the y-axis (with a default of 10%, controlled by two new variables, MIN_PERCENT_DIFF and MAX_PERCENT_DIFF). This value can also be set per search term via an additional config element, for example to use a 20% offset instead: [yrange:20%/20%]
  • Increased SEARCH_TERM_DELAY to two seconds to be nicer to search engines.
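
To make the searchterms.conf syntax above concrete, here is a rough Python sketch of how such a line could be read. This is not Gootrude's own code; the function names (parse_search_line, y_range) and the dictionary keys are hypothetical. It also assumes that the optional yrange element follows the three standard elements on the same line, that its two percentages are given in minimum/maximum order, and that each offset is taken as a percentage of the smallest or largest plotted value. The ChangeLog does not spell out any of these details.

```python
import re

# Defaults mirroring the 10% value mentioned in the ChangeLog.
MIN_PERCENT_DIFF = 10.0
MAX_PERCENT_DIFF = 10.0


def parse_search_line(line):
    """Split a searchterms.conf-style line into its bracketed elements.

    Hypothetical helper: returns a dict describing one search term.
    """
    fields = re.findall(r"\[([^\]]*)\]", line)
    if len(fields) < 3:
        raise ValueError("expected at least three bracketed elements")
    terms, engine_spec, data_file = fields[:3]
    # The middle element looks like "yahoo:count": engine, then a mode.
    engine, _, mode = engine_spec.partition(":")
    entry = {
        "terms": terms,
        "engine": engine,
        "mode": mode,
        "file": data_file,
        "min_pct": MIN_PERCENT_DIFF,
        "max_pct": MAX_PERCENT_DIFF,
    }
    # Assumption: an optional [yrange:MIN%/MAX%] element may follow.
    for extra in fields[3:]:
        m = re.fullmatch(r"yrange:(\d+(?:\.\d+)?)%/(\d+(?:\.\d+)?)%", extra)
        if m:
            entry["min_pct"] = float(m.group(1))
            entry["max_pct"] = float(m.group(2))
    return entry


def y_range(values, min_pct, max_pct):
    """Pad the plot range below the minimum and above the maximum.

    Assumption: each offset is a percentage of the extreme value itself.
    """
    lo, hi = min(values), max(values)
    return lo * (1 - min_pct / 100.0), hi * (1 + max_pct / 100.0)


if __name__ == "__main__":
    line = ('[Linux "highspeed firewalls"] [yahoo:count] '
            '[Linux_highspeed_firewalls.dat] [yrange:20%/20%]')
    entry = parse_search_line(line)
    print(entry["engine"], entry["mode"], entry["file"])
    print(y_range([100, 200], entry["min_pct"], entry["max_pct"]))
```

Falling back to the global defaults when no yrange element is present keeps lines written in the basic three-element form working unchanged.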