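A robots.txt file is written for search engines: it tells their crawlers (spiders) which parts of a site they may retrieve. It normally sits in the root directory of the website. Four ways of writing a robots.txt file are listed below. Each can be used as-is or modified; name the file robots.txt (all lowercase) and place it in the root directory of the site.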
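Because spiders look for the file only at the root of a site, its address can be derived from any page URL on that site. The following is a minimal sketch of that derivation; the function name robots_url and the example.com addresses are assumptions made for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL for the site that serves page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Hypothetical page URL, used only for illustration.
print(robots_url("https://www.example.com/blog/post.html?id=3"))
# prints https://www.example.com/robots.txt
```

A file placed in a subdirectory, such as /blog/robots.txt, would not be found this way, which is why the root directory matters.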
1. Allow all spiders
# robots.txt example
#
# Allow all spiders
User-agent: *
Disallow:
2. Block all spiders
# robots.txt example
#
# Block all spiders
User-agent: *
Disallow: /
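The first two examples differ only in the value after Disallow, so it can help to check how a parser reads them. The sketch below uses Python's standard urllib.robotparser module; the agent name AnyBot and the example.com URL are placeholders, and other crawlers may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser

ALLOW_ALL = """\
User-agent: *
Disallow:
"""

BLOCK_ALL = """\
User-agent: *
Disallow: /
"""


def can_fetch(robots_txt: str, agent: str, url: str) -> bool:
    """Parse robots.txt text and ask whether agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


url = "http://www.example.com/index.html"  # hypothetical page
print(can_fetch(ALLOW_ALL, "AnyBot", url))  # True: an empty Disallow blocks nothing
print(can_fetch(BLOCK_ALL, "AnyBot", url))  # False: "/" is a prefix of every path
```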
3. Block spiders from specific directories
# robots.txt example
#
# Block all spiders from the cgi-bin and images directories
User-agent: *
Disallow: /cgi-bin/
Disallow: /images/
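Each Disallow value in this example is matched against the beginning of the requested path, so the two rules block the cgi-bin and images directories and leave everything else open. A small sketch with Python's urllib.robotparser shows the effect; the helper blocked_paths, the agent name and the sample paths are assumptions chosen for illustration.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /images/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def blocked_paths(paths, agent="AnyBot", site="http://www.example.com"):
    """Return the paths that the rules above forbid for the given agent."""
    return [p for p in paths if not parser.can_fetch(agent, site + p)]


# Hypothetical paths on a hypothetical site.
paths = ["/index.html", "/cgi-bin/search", "/images/logo.gif", "/docs/images/a.gif"]
print(blocked_paths(paths))
# prints ['/cgi-bin/search', '/images/logo.gif']
```

The path /docs/images/a.gif stays accessible because the rule /images/ only matches paths that start with that prefix.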
4. Allow only well-known search engine spiders
# robots.txt example
#
# Allow only well-known search engine spiders
#
User-agent: Mozilla/3.0 (compatible;miner;mailto:miner@miner.com.br)
Disallow:

User-agent: WebFerret
Disallow:

User-agent: Due to a deficiency in Java it's not currently possible to set the User-agent.
Disallow:

User-agent: no
Disallow:

User-agent: 'Ahoy! The Homepage Finder'
Disallow:

User-agent: Arachnophilia
Disallow:

User-agent: ArchitextSpider
Disallow:

User-agent: ASpider/0.09
Disallow:

User-agent: AURESYS/1.0
Disallow:

User-agent: BackRub/*.*
Disallow:

User-agent: Big Brother
Disallow:

User-agent: BlackWidow
Disallow:

User-agent: BSpider/1.0 libwww-perl/0.40
Disallow:

User-agent: CACTVS Chemistry Spider
Disallow:

User-agent: Digimarc CGIReader/1.0
Disallow:

User-agent: Checkbot/x.xx LWP/5.x
Disallow:

User-agent: CMC/0.01
Disallow:

User-agent: combine/0.0
Disallow:

User-agent: conceptbot/0.3
Disallow:

User-agent: Crescent Internet ToolPak HTTP OLE Control v.1.0
Disallow:

User-agent: root/0.1
Disallow:

User-agent: CS-HKUST-IndexServer/1.0
Disallow:

User-agent: CyberSpyder/2.1
Disallow:

User-agent: Deweb/1.01
Disallow:

User-agent: DragonBot/1.0 libwww/5.0
Disallow:

User-agent: EIT-Link-Verifier-Robot/0.2
Disallow:

User-agent: Emacs-w3/v[0-9\.]+
Disallow:

User-agent: EmailSiphon
Disallow:

User-agent: EMC Spider
Disallow:

User-agent: explorersearch
Disallow:

User-agent: Explorer
Disallow:

User-agent: ExtractorPro
Disallow:

User-agent: FelixIDE/1.0
Disallow:

User-agent: Hazel's Ferret Web hopper,
Disallow:

User-agent: ESIRover v1.0
Disallow:

User-agent: fido/0.9 Harvest/1.4.pl2
Disallow:

User-agent: Hämähäkki/0.2
Disallow:

User-agent: KIT-Fireball/2.0 libwww/5.0a
Disallow:

User-agent: Fish-Search-Robot
Disallow:

User-agent: Mozilla/2.0 (compatible fouineur v2.0; fouineur.9bit.qc.ca)
Disallow:

User-agent: Robot du CRIM 1.0a
Disallow:

User-agent: Freecrawl
Disallow:

User-agent: FunnelWeb-1.0
Disallow:

User-agent: gcreep/1.0
Disallow:

User-agent: ???
Disallow:

User-agent: GetURL.rexx v1.05
Disallow:

User-agent: Golem/1.1
Disallow:

User-agent: Gromit/1.0
Disallow:

User-agent: Gulliver/1.1
Disallow:

User-agent: yes
Disallow:

User-agent: AITCSRobot/1.1
Disallow:

User-agent: wired-digital-newsbot/1.5
Disallow:

User-agent: htdig/3.0b3
Disallow:

User-agent: HTMLgobble v2.2
Disallow:

User-agent: no
Disallow:

User-agent: IBM_Planetwide,
Disallow:

User-agent: gestaltIconoclast/1.0 libwww-FM/2.17
Disallow:

User-agent: INGRID/0.1
Disallow:

User-agent: IncyWincy/1.0b1
Disallow:

User-agent: Informant
Disallow:

User-agent: InfoSeek Robot 1.0
Disallow:

User-agent: Infoseek Sidewinder
Disallow:

User-agent: InfoSpiders/0.1
Disallow:

User-agent: inspectorwww/1.0 http://www.greenpac.com/inspectorwww.html
Disallow:

User-agent: 'IAGENT/1.0'
Disallow:

User-agent: IsraeliSearch/1.0
Disallow:

User-agent: JCrawler/0.2
Disallow:

User-agent: Jeeves v0.05alpha (PERL, LWP, lglb@doc.ic.ac.uk)
Disallow:

User-agent: Jobot/0.1alpha libwww-perl/4.0
Disallow:

User-agent: JoeBot,
Disallow:

User-agent: JubiiRobot
Disallow:

User-agent: jumpstation
Disallow:

User-agent: Katipo/1.0
Disallow:

User-agent: KDD-Explorer/0.1
Disallow:

User-agent: KO_Yappo_Robot/1.0.4(http://yappo.com/info/robot.html)
Disallow:

User-agent: LabelGrab/1.1
Disallow:

User-agent: LinkWalker
Disallow:

User-agent: logo.gif crawler
Disallow:

User-agent: Lycos/x.x
Disallow:

User-agent: Lycos_Spider_(T-Rex)
Disallow:

User-agent: Magpie/1.0
Disallow:

User-agent: MediaFox/x.y
Disallow:

User-agent: MerzScope
Disallow:

User-agent: NEC-MeshExplorer
Disallow:

User-agent: MOMspider/1.00 libwww-perl/0.40
Disallow:

User-agent: Monster/vX.X.X -$TYPE ($OSTYPE)
Disallow:

User-agent: Motor/0.2
Disallow:

User-agent: MuscatFerret
Disallow:

User-agent: MwdSearch/0.1
Disallow:

User-agent: NetCarta CyberPilot Pro
Disallow:

User-agent: NetMechanic
Disallow:

User-agent: NetScoop/1.0 libwww/5.0a
Disallow:

User-agent: NHSEWalker/3.0
Disallow:

User-agent: Nomad-V2.x
Disallow:

User-agent: NorthStar
Disallow:

User-agent: Occam/1.0
Disallow:

User-agent: HKU WWW Robot,
Disallow:

User-agent: Orbsearch/1.0
Disallow:

User-agent: PackRat/1.0
Disallow:

User-agent: Patric/0.01a
Disallow:

User-agent: Peregrinator-Mathematics/0.7
Disallow:

User-agent: Duppies
Disallow:

User-agent: Pioneer
Disallow:

User-agent: PGP-KA/1.2
Disallow:

User-agent: Resume Robot
Disallow:

User-agent: Road Runner: ImageScape Robot (lim@cs.leidenuniv.nl)
Disallow:

User-agent: Robbie/0.1
Disallow:

User-agent: ComputingSite Robi/1.0 (robi@computingsite.com)
Disallow:

User-agent: Roverbot
Disallow:

User-agent: SafetyNet Robot 0.1,
Disallow:

User-agent: Scooter/1.0
Disallow:

User-agent: not available
Disallow:

User-agent: Senrigan/xxxxxx
Disallow:

User-agent: SG-Scout
Disallow:

User-agent: Shai'Hulud
Disallow:

User-agent: SimBot/1.0
Disallow:

User-agent: Open Text Site Crawler V1.0
Disallow:

User-agent: SiteTech-Rover
Disallow:

User-agent: Slurp/2.0
Disallow:

User-agent: ESISmartSpider/2.0
Disallow:

User-agent: Snooper/b97_01
Disallow:

User-agent: Solbot/1.0 LWP/5.07
Disallow:

User-agent: Spanner/1.0 (Linux 2.0.27 i586)
Disallow:

User-agent: no
Disallow:

User-agent: Mozilla/3.0 (Black Widow v1.1.0; Linux 2.0.27; Dec 31 1997 12:25:00
Disallow:

User-agent: Tarantula/1.0
Disallow:

User-agent: tarspider
Disallow:

User-agent: dlw3robot/x.y (in TclX by http://hplyot.obspm.fr/~dl/)
Disallow:

User-agent: Templeton/
Disallow:

User-agent: TitIn/0.2
Disallow:

User-agent: TITAN/0.1
Disallow:

User-agent: UCSD-Crawler
Disallow:

User-agent: urlck/1.2.3
Disallow:

User-agent: Valkyrie/1.0 libwww-perl/0.40
Disallow:

User-agent: Victoria/1.0
Disallow:

User-agent: vision-search/3.0'
Disallow:

User-agent: VWbot_K/4.2
Disallow:

User-agent: w3index
Disallow:

User-agent: W3M2/x.xxx
Disallow:

User-agent: WWWWanderer v3.0
Disallow:

User-agent: WebCopy/
Disallow:

User-agent: WebCrawler/3.0 Robot libwww/5.0a
Disallow:

User-agent: WebFetcher/0.8,
Disallow:

User-agent: weblayers/0.0
Disallow:

User-agent: WebLinker/0.0 libwww-perl/0.1
Disallow:

User-agent: no
Disallow:

User-agent: WebMoose/0.0.0000
Disallow:

User-agent: Digimarc WebReader/1.2
Disallow:

User-agent: webs@recruit.co.jp
Disallow:

User-agent: webvac/1.0
Disallow:

User-agent: webwalk
Disallow:

User-agent: WebWalker/1.10
Disallow:

User-agent: WebWatch
Disallow:

User-agent: Wget/1.4.0
Disallow:

User-agent: w3mir
Disallow:

User-agent: no
Disallow:

User-agent: WWWC/0.25 (Win95)
Disallow:

User-agent: none
Disallow:

User-agent: XGET/0.7
Disallow:

User-agent: Nederland.zoek
Disallow:

User-agent: BizBot04 kirk.overleaf.com
Disallow:

User-agent: HappyBot (gserver.kw.net)
Disallow:

User-agent: CaliforniaBrownSpider
Disallow:

User-agent: EI*Net/0.1 libwww/0.1
Disallow:

User-agent: Ibot/1.0 libwww-perl/0.40
Disallow:

User-agent: Merritt/1.0
Disallow:

User-agent: StatFetcher/1.0
Disallow:

User-agent: TeacherSoft/1.0 libwww/2.17
Disallow:

User-agent: WWW Collector
Disallow:

User-agent: processor/0.0ALPHA libwww-perl/0.20
Disallow:

User-agent: wobot/1.0 from 206.214.202.45
Disallow:

User-agent: Libertech-Rover www.libertech.com?
Disallow:

User-agent: WhoWhere Robot
Disallow:

User-agent: ITI Spider
Disallow:

User-agent: w3index
Disallow:

User-agent: MyCNNSpider
Disallow:

User-agent: SummyCrawler
Disallow:

User-agent: OGspider
Disallow:

User-agent: linklooker
Disallow:

User-agent: CyberSpyder (amant@www.cyberspyder.com)
Disallow:

User-agent: SlowBot
Disallow:

User-agent: heraSpider
Disallow:

User-agent: Surfbot
Disallow:

User-agent: Bizbot003
Disallow:

User-agent: WebWalker
Disallow:

User-agent: SandBot
Disallow:

User-agent: EnigmaBot
Disallow:

User-agent: spyder3.microsys.com
Disallow:

User-agent: www.freeloader.com.
Disallow:

User-agent: Googlebot
Disallow:

User-agent: METAGOPHER
Disallow:

User-agent: *
Disallow: /
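The whitelist pattern above is one record with an empty Disallow per admitted spider, followed by a catch-all record that blocks everyone else. A file of this shape can be generated and checked rather than typed by hand. The sketch below is one way to do that with Python's urllib.robotparser; the helper build_whitelist, the three agent names taken from the list, the name UnlistedBot and the example.com URL are all illustrative assumptions.

```python
from urllib.robotparser import RobotFileParser


def build_whitelist(agents):
    """Build robots.txt text that admits only the named agents."""
    lines = []
    for agent in agents:
        # One record per admitted agent; an empty Disallow blocks nothing.
        lines += [f"User-agent: {agent}", "Disallow:", ""]
    # Catch-all record: every agent not named above is blocked.
    lines += ["User-agent: *", "Disallow: /"]
    return "\n".join(lines) + "\n"


robots_txt = build_whitelist(["Googlebot", "Roverbot", "LinkWalker"])

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

page = "http://www.example.com/page.html"  # hypothetical page
print(parser.can_fetch("Googlebot", page))    # True: matched by its own record
print(parser.can_fetch("UnlistedBot", page))  # False: falls through to "*"
```

Keep in mind that robots.txt is advisory: it only restrains spiders that choose to read and obey it.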