Web Log Storming version 2.8 is available for download. The most important changes are an improved spider identification algorithm and custom IP resolving. This update is free for everyone who bought the software after February 14th, 2011.
Even well-established web crawlers don’t behave as they used to. We remember the happy times when all decent bots (excluding shady ones) identified themselves in the User Agent field, so it was easy to spot them while parsing log files.
As one of our users noticed and pointed out (thanks, John!), even big players like MSN Bot now introduce themselves as regular browsers. This wasn’t easy to notice without resolving IPs to domain names; even when closely inspecting their sessions, you can rarely see anything unusual (they look like regular visitors using the IE7 browser on various Windows versions). Only after resolving the IPs can you see their true nature.
That’s why, in this version, we have added the ability to define a Spider Domains identification list, in addition to the existing Spider User Agents list. The Domains list can contain IP addresses (with wildcards) or domain names. That way, identification works regardless of whether the user resolves domain names or not.
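To illustrate the idea behind a Spider Domains list, here is a minimal Python sketch of wildcard matching against both the raw IP address and the resolved host name. It is not Web Log Storming's actual implementation; the function name, the pattern syntax (shell-style wildcards) and the list entries are assumptions made for the example.

```python
from fnmatch import fnmatchcase

# Hypothetical list entries for illustration only; the real list format
# and contents in Web Log Storming may differ.
SPIDER_DOMAINS = ["65.55.*", "*.search.msn.com", "*.googlebot.com"]


def is_spider(ip, hostname=None, patterns=SPIDER_DOMAINS):
    """Return True if the IP or the resolved host name matches any pattern.

    The IP is always checked, so detection still works when host names
    have not been resolved (hostname is None).
    """
    candidates = [ip]
    if hostname:
        candidates.append(hostname.lower())
    return any(
        fnmatchcase(candidate, pattern.lower())
        for candidate in candidates
        for pattern in patterns
    )
```

Because the IP address is always among the candidates, an entry such as `65.55.*` catches a visitor even when its host name was never resolved, while a domain entry such as `*.googlebot.com` only takes effect once resolving is enabled.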
Along with this change, we have also added a right-click menu option, Add Selected to Spiders, to the Domains report.
Custom IP resolving
Speaking of IP resolving, we have added a new tool that complements the Manually edit host name * option (Professional version only). If you work in a team, you can now define the URL of a web script that will be used to get these custom names from a shared web space, for example a database.
This script should follow simple rules for getting and setting IP/domain pairs. You can see more details here.
* In case you are not already familiar with it, Manually edit host name allows you to assign any text to an IP address, and this text will appear in reports as if it were a normal domain name. That way you can assign labels like “Our company”, “Prospective client #1”, “Hacker”, etc.