dcbspider is an automated robot that 'crawls'
the web to gather information on the pages it encounters. The robot
employs a specialized algorithm to detect specific
Inflation Techniques in order
to keep search results relevant and less vulnerable to spam.
The relevant data is extracted from the pages and indexed in a database
for later retrieval by search queries.
The robot identifies itself via the User-Agent string in the HTTP request
header:
dcbspider/1.0 (+http://www.deltacrawler.com/bot.htm)
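To recognize the robot in your own server logs or request handlers, you can match on the product token at the start of that string. The following is a minimal sketch, not part of dcbspider itself; the function name `is_dcbspider` is hypothetical, and it assumes that future versions keep the `dcbspider/` prefix and change only the version number.

```python
def is_dcbspider(user_agent: str) -> bool:
    """Return True if a User-Agent header value appears to come from dcbspider.

    Only the product token before the slash is compared, so a later
    version number (an assumption, e.g. "dcbspider/1.1") would still match.
    """
    # Header values may carry stray whitespace; the comparison ignores case.
    return user_agent.strip().lower().startswith("dcbspider/")


if __name__ == "__main__":
    header = "dcbspider/1.0 (+http://www.deltacrawler.com/bot.htm)"
    print(is_dcbspider(header))
    print(is_dcbspider("Mozilla/5.0 (X11; Linux x86_64)"))
```

Keep in mind that any client can send this string, so a match indicates only what the client claims to be.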
|
|
dcbspider complies with the standard
robots exclusion protocol
and will not download files or directories that you disallow in your robots.txt file.
For example, to prevent the robot from downloading any file below your /cgi-bin directory,
create a robots.txt file at the root of your website, or add the following lines to your existing robots.txt:
User-agent: dcbspider
Disallow: /cgi-bin/
To deny the robot access to every file on your site, use the following lines:
User-agent: dcbspider
Disallow: /
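If you want to check how such rules are interpreted before publishing them, the sketch below uses Python's standard `urllib.robotparser` module. It illustrates the robots exclusion rules in general and is not dcbspider's own parser; the helper name `is_allowed`, the constants, and the `www.example.com` URLs are placeholders chosen for this example.

```python
from urllib.robotparser import RobotFileParser

# The two rule sets shown above, as they would appear in robots.txt.
CGI_RULES = """User-agent: dcbspider
Disallow: /cgi-bin/
"""

BLOCK_ALL = """User-agent: dcbspider
Disallow: /
"""


def is_allowed(robots_txt: str, url: str, agent: str = "dcbspider") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    # parse() takes the file content as an iterable of lines.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for name, rules in (("cgi-bin only", CGI_RULES), ("block all", BLOCK_ALL)):
        for path in ("/index.html", "/cgi-bin/search.pl"):
            url = "http://www.example.com" + path
            print(name, path, is_allowed(rules, url))
```

Because both rule sets name only dcbspider, other robots are unaffected by them; use `User-agent: *` if the rule should apply to every robot.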
|
|
|
dcbspider crawls the web on an ongoing basis.
Submitting websites is not required, because dcbspider will
likely find sites automatically. The DeltaCrawler Project
is not accepting manual website submissions at this time, but this will likely
change in the future.
|
|
|
dcbspider was designed to keep server bandwidth and load to a minimum
as pages are downloaded. We apologize in advance if our robot adversely affects
your server. If you notice any problems, please let us
know immediately so that we can fix the problem or add your server to our exclusion
list. See the contact information below.
|
|
If you have questions or comments, please send e-mail to admin@nospam.deltacrawler.com
with the 'nospam.' portion removed.
Thank you.
|
|