 2010/09/09
Last update 1998/11/27
 The Labs - Design & Functionality For The NetSearch Engine & Analyzation Tool
- Introduction
- Download
- Usage
- Limitations
- Examples
- Other Search Engines

WebSherpa was programmed within a few hours, because I wanted one
compact tool to analyze my site and, as a side effect, to use it as a
search engine as well.
Since the initial version (0.001) it has been improved with state-of-the-art
search-engine features.
- Index a web site (URL based): it only indexes pages which are
reachable via your site, and not any hidden file you might have forgotten
to remove from your server.
- If a search returns no result, lexically close words are listed (maybe
a misspelling happened). This feature will be developed further in
the next versions.
- Find missing links or pictures within your site (not yet)
- Display the logical structure as a GIF picture (under development)
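The "lexically close words" fallback can be illustrated with a small sketch. It is written in Python for brevity, while WebSherpa itself is a Perl script; the function name `suggest_words`, the sample word list, and the similarity cutoff are assumptions made for this example, not taken from the WebSherpa source.

```python
import difflib

def suggest_words(query, indexed_words, limit=5):
    """Return indexed words that are lexically close to the query.

    Hypothetical helper: uses difflib's similarity ratio, which may differ
    from whatever measure WebSherpa applies.
    """
    return difflib.get_close_matches(
        query.lower(), indexed_words, n=limit, cutoff=0.7
    )

# Assumed sample vocabulary standing in for the words of an index file.
indexed_words = ["programming", "program", "perl", "search", "engine"]

# A misspelled query that has no exact match in the vocabulary.
suggestions = suggest_words("programing", indexed_words)
print(suggestions)
```

The best candidate comes first, so a search front end could offer the first few suggestions as alternative queries.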
$MyVersion: 0.009 - Fri Nov 27 11:04:07 EST 1998 - kiwi$
websherpa (perl-source)
It (still) requires lynx.
NOTE: This is still an alpha version of the program; it will
be improved almost weekly during the next weeks.
Index Site

% ./websherpa -c http://mysite.com/fullpath/

The file created will be sherpa.index, unless you pass -i alt-file to
use another index file.
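Why URL-based indexing skips forgotten files can be shown with a minimal sketch: only pages reachable by following links from the start URL are visited. This is Python written for illustration, not the Perl code of WebSherpa (which fetches pages through lynx); `reachable_pages`, `LinkExtractor`, and the in-memory `SITE` dictionary are hypothetical names, and the rule "stay below the start URL" is an assumption about how the site boundary is drawn.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag

class LinkExtractor(HTMLParser):
    """Collect the href values of all <a> tags in a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def reachable_pages(start_url, fetch):
    """Breadth-first walk over links, staying below start_url.

    fetch(url) returns the page's HTML, or None if the page is missing.
    """
    seen = {start_url}
    queue = deque([start_url])
    order = []
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        order.append(url)
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop any #fragment part.
            target, _fragment = urldefrag(urljoin(url, href))
            if target.startswith(start_url) and target not in seen:
                seen.add(target)
                queue.append(target)
    return order

# Assumed stand-in for a web server: forgotten.html exists but is never linked.
SITE = {
    "http://mysite.com/fullpath/":
        '<a href="a.html">A</a> <a href="http://other.example/">X</a>',
    "http://mysite.com/fullpath/a.html": '<a href="./">home</a>',
    "http://mysite.com/fullpath/forgotten.html": "never linked",
}
pages = reachable_pages("http://mysite.com/fullpath/", SITE.get)
print(pages)
```

In this sketch the unlinked file and the external link are both left out of the result, which matches the behaviour described for the index.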
Search Site

% ./websherpa -s 'programming'

With -i you can force the use of another index file to search in.
CGI

<form action=websherpa.cgi>
<input name=search>
</form>

and create in the same directory a search.html with the line
<!-- insert-result --> where you would like the result output to be
placed. To use another filename, edit the source code of WebSherpa.
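The template mechanism can be pictured as a plain marker substitution. The following Python sketch is an assumption about the general idea, not the actual websherpa.cgi code: `render_results`, the sample template, and the list markup used for the hits are all hypothetical.

```python
import html

MARKER = "<!-- insert-result -->"

def render_results(template, hits):
    """Replace the marker line in the template with a list of hits.

    hits is a list of (url, title) pairs; both are HTML-escaped.
    """
    if hits:
        items = "\n".join(
            '<li><a href="{0}">{1}</a></li>'.format(
                html.escape(url, quote=True), html.escape(title)
            )
            for url, title in hits
        )
        body = "<ul>\n" + items + "\n</ul>"
    else:
        body = "<p>No result.</p>"
    # Only the first marker is replaced; the rest of the page is untouched.
    return template.replace(MARKER, body, 1)

# Assumed content of search.html.
template = "<html><body>\n<h1>Search</h1>\n<!-- insert-result -->\n</body></html>"
page = render_results(template, [("http://mysite.com/fullpath/a.html", "Page A & more")])
print(page)
```

Keeping the page layout in search.html and only substituting the marker means the look of the result page can be changed without touching the script.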
Since WebSherpa is a really small package, actually one single Perl script,
there are some limitations you should know of:
- Index a site with fewer than 5'000 pages; if your server has enough
memory (>64MB), then 10'000 should work as well.
- All URLs and their titles are
loaded when a search is started; the searching is hash-based and doesn't
take much memory, as the word list isn't loaded into memory.
- Indexing a large site also takes a long time, since word extraction
is done in Perl as well.
Most web sites have fewer than 5'000 pages, and therefore WebSherpa
could be your engine.
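The memory behaviour described above (URLs and titles in memory, word list looked up by hash on disk) can be sketched as follows. This is a Python illustration under stated assumptions: the real layout of sherpa.index is not documented here, so the on-disk hash is modeled with the standard `dbm.dumb` module, and `build_word_db`, `search`, and the sample data are hypothetical.

```python
import dbm.dumb
import os
import tempfile

def build_word_db(path, postings):
    """Write word -> page-number list into an on-disk hash (assumed format)."""
    with dbm.dumb.open(path, "n") as db:
        for word, page_ids in postings.items():
            value = " ".join(str(i) for i in page_ids)
            db[word.encode("utf-8")] = value.encode("utf-8")

def search(path, pages, word):
    """Look up one word on disk; only the URL/title list lives in memory."""
    with dbm.dumb.open(path, "r") as db:
        raw = db.get(word.lower().encode("utf-8"))
    if raw is None:
        return []
    return [pages[int(i)] for i in raw.decode("utf-8").split()]

# In-memory part: one (url, title) pair per indexed page.
pages = [
    ("http://mysite.com/fullpath/", "Home"),
    ("http://mysite.com/fullpath/perl.html", "Perl Notes"),
]

# On-disk part: the word list, written to a temporary directory here.
workdir = tempfile.mkdtemp()
db_path = os.path.join(workdir, "words")
build_word_db(db_path, {"programming": [0, 1], "perl": [1]})

hits = search(db_path, pages, "Perl")
print(hits)
```

With this split, memory use grows with the number of pages rather than with the number of distinct words, which is consistent with the page-count limits given above.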
We used WebSherpa for dedicated search-engines at TheLabs:
- WebSherpa

6. Other Search Engines

All Rights Reserved - (C) 1997 - 2009 by The Labs.Com