The WebMoose FAQ
Q: What is WebMoose?
Q: How Can I See the Statistics WebMoose has Generated?
Q: Who Runs WebMoose?
Q: When Will WebMoose be Done?
Q: Where Can I Get the WebMoose Source Code?
Q: What Makes WebMoose Go?
Q: How Often Does WebMoose Run?
Q: Does WebMoose Follow the Standard for Robot Exclusion?
Q: How do I Know that WebMoose is Visiting My Site?
What is WebMoose?
WebMoose is a web robot.
WebMoose visits WWW sites and downloads their
HTML pages. It processes the downloaded HTML and generates a database of
HTML keyword usage frequency
server name and version
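As a rough illustration of the two statistics listed above, here is a minimal Python sketch. It is not WebMoose's code (WebMoose is written in C++ and its source isn't available). It assumes "HTML keyword" means tag names and that a server reports itself in the usual `name/version` form. The names `TagCounter`, `tag_frequencies`, and `split_server_header`, and the sample values, are all made up for the example.

```python
from collections import Counter
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Counts how often each HTML start tag appears in a page."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        # HTMLParser hands us the tag name already lowercased.
        self.counts[tag] += 1


def tag_frequencies(html_text):
    """Return a Counter mapping tag name to number of occurrences."""
    parser = TagCounter()
    parser.feed(html_text)
    parser.close()
    return parser.counts


def split_server_header(value):
    """Split a 'name/version' server string into (name, version).

    Only the first whitespace-separated token is examined; anything
    after it (comments, extra products) is ignored in this sketch.
    """
    tokens = value.split()
    first = tokens[0] if tokens else ""
    name, _, version = first.partition("/")
    return name, version


# Hypothetical sample data, not taken from a real site.
SAMPLE = "<html><body><p>Hi</p><p>See <a href='x.html'>this</a></p></body></html>"
freq = tag_frequencies(SAMPLE)
server = split_server_header("ExampleServer/1.0 (Unix)")
```

Here `freq` holds one count per tag name found in the sample page, and `server` is a `(name, version)` pair that could be stored alongside it.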
When WebMoose notices a link, it tosses it in its database and visits
the link sometime later. WebMoose tries not to flail on HTTP server
sites, but because of the algorithm used, WebMoose might hit
a particular site several times in a row before wandering off to
another site.
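To see why a robot that queues links as it finds them can hit one site several times in a row, consider this small Python sketch. It assumes a plain first-in, first-out queue, which may or may not be what WebMoose actually does. The function `crawl_order` and the link graph `LINKS` are hypothetical stand-ins for the real database and downloaded pages.

```python
from collections import deque
from urllib.parse import urlparse


def crawl_order(start_url, link_graph, limit=10):
    """Return the order in which a simple FIFO robot would visit URLs."""
    pending = deque([start_url])  # links noticed but not yet visited
    seen = {start_url}            # every link ever queued
    visited = []
    while pending and len(visited) < limit:
        url = pending.popleft()
        visited.append(url)
        # "Download" the page by looking up its links in the fake graph.
        for link in link_graph.get(url, []):
            if link not in seen:
                seen.add(link)
                pending.append(link)
    return visited


# Hypothetical link graph: most links on a page point back to the same site.
LINKS = {
    "http://a.example/": [
        "http://a.example/one.html",
        "http://a.example/two.html",
        "http://b.example/",
    ],
    "http://a.example/one.html": ["http://a.example/two.html"],
}

order = crawl_order("http://a.example/", LINKS)
hosts = [urlparse(u).netloc for u in order]
```

Because pages mostly link within their own site, the queue fills with same-site links first, so `hosts` shows three visits to `a.example` before the robot reaches `b.example`.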
How Can I See the Statistics WebMoose has Generated?
You can't, just yet. This is mainly because WebMoose is still under
development. Therefore, I haven't let it run quite long enough to
get any meaningful results.
Who Runs WebMoose?
I do. My name is Mike Blaszczak. Visit my home
page, if you're curious, or write to me directly.
When Will WebMoose be Done?
I don't know. I'm writing it in my spare time, and I have plenty of
other things to do. Some of those other things are actually more
interesting than WebMoose, so it's possible that WebMoose may
never be finished.
Where Can I Get the WebMoose Source Code?
You can't. WebMoose isn't done, and even when it is done (if it
is ever done), I might not release its source because there's just
too much chance for abuse.
What Makes WebMoose Go?
WebMoose itself runs on a 200 MHz Pentium Pro® system that reaches
The Microsoft Network over an ISDN connection. WebMoose was written using
MFC 4.2 and Microsoft Visual C++ 4.2. WebMoose runs under Windows 95.
WebMoose talks over a local Ethernet connection to a 90 MHz
Pentium system running Windows NT Server 4.0. This box runs Microsoft
SQL Server Version 6.0, and stores information about everything
that WebMoose has found lately.
How Often Does WebMoose Run?
I generally run WebMoose for a few hours late on weekend evenings.
I run WebMoose against a local web server to test it, so it doesn't
often get out in public.
Does WebMoose Follow the Standard for Robot Exclusion?
The Standard for Robot Exclusion gives web masters a chance at having web
robots, like WebMoose, completely pass their site by. The standard is
simple and flexible: it affords the server administrator a way to
exclude robots by name, and to exclude robots from certain parts of their
site.
For now, WebMoose doesn't follow the standard, but I'm working on it.
(This is why, for now, I don't let the moose roam very far.) I'll
probably implement handling of this standard before going much further
with the development of the tool.
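For readers unfamiliar with the standard, here is a small Python sketch of the two kinds of exclusion it allows, using the standard library's `urllib.robotparser`. This is only an illustration of the standard, not WebMoose's planned implementation. The sample file contents, the host `www.example.com`, and the robot name `OtherRobot` are made up.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical ROBOTS.TXT: shut WebMoose out entirely, and keep every
# other robot out of /private/ only.
SAMPLE_ROBOTS_TXT = """\
User-agent: WebMoose
Disallow: /

User-agent: *
Disallow: /private/
"""

rules = RobotFileParser()
rules.parse(SAMPLE_ROBOTS_TXT.splitlines())

# Exclusion by robot name: WebMoose may not fetch anything.
moose_home = rules.can_fetch("WebMoose", "http://www.example.com/index.html")

# Exclusion by path: other robots are kept out of one part of the site.
other_home = rules.can_fetch("OtherRobot", "http://www.example.com/index.html")
other_private = rules.can_fetch("OtherRobot", "http://www.example.com/private/a.html")
```

A robot that honors the standard would run a check like `can_fetch` before each request and skip any URL for which it returns `False`.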
The presence or absence of a
ROBOTS.TXT file, and the
proper response to a request for such a file (whether it exists or not)
are other statistics that WebMoose will keep.
How do I Know that WebMoose is Visiting My Site?
In all HTTP requests it makes,
WebMoose identifies itself with a
User-Agent: header that
looks like this:
h is a digit identifying the major version of the
moose.
kk is a pair of digits identifying the minor
version of the moose.
bbbb is a string of four digits
identifying the build number of the moose. The build number increments
each and every time the moose is recompiled, and that happens very often
since WebMoose is still under development.
At the exact time of this writing, WebMoose uses the string:
to identify itself. Undoubtedly, the last four digits have been incremented
since I just thought of something else to fix.
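If you want to check your own server logs, a sketch along these lines may help. It assumes only that the User-Agent value contains the name "WebMoose" somewhere. It does not try to parse the version or build digits, and the sample values below are placeholders, not the real string. The names `is_webmoose` and `count_webmoose_hits` are made up for the example.

```python
def is_webmoose(user_agent):
    """Guess whether a User-Agent value came from WebMoose.

    Assumption: the robot's name appears somewhere in the header value.
    """
    return "webmoose" in user_agent.lower()


def count_webmoose_hits(user_agents):
    """Count how many of the given User-Agent values look like WebMoose."""
    return sum(1 for ua in user_agents if is_webmoose(ua))


# Placeholder User-Agent values, as they might be pulled from a log file.
SAMPLE_AGENTS = [
    "SomeBrowser/1.0",
    "WebMoose (placeholder, not the real string)",
    "AnotherRobot/2.3",
]
hits = count_webmoose_hits(SAMPLE_AGENTS)
```

Matching on the name alone keeps the check working even as the build number changes from one visit to the next.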
Last modified on 25 November, 1997.