uCoz Community » For Webmasters » Site Promotion » Indexing Policy & Robots.txt
Indexing Policy & Robots.txt
Sunny
Posts: 9296
Reputation: 456

Message # 1 | 10:41 AM
Website's Indexing Status


Every uCoz website has an indexing status, displayed at the top of the Control Panel's main page (/panel/?a=cp). It shows whether search engines are allowed to index the website or whether the website is still in quarantine.
The indexing status shows one of two options: "indexing is allowed (quarantine is removed)" or "indexing is prohibited (the website is in quarantine)".

The status "indexing is prohibited (the website is in quarantine)" is assigned by default to all newly created websites.

Quarantine Removal Policy


A website becomes available for indexing either automatically (when a premium plan is purchased) or at the website owner's request. If the website does not have a premium plan and the owner wants the quarantine removed, a request should be submitted from the website's Control Panel.

A pop-up window with information on the quarantine policy will then appear.


After the request has been submitted, the website is checked automatically against a number of criteria: the website's age, presence of a custom domain name, content, a verified phone number, etc. Based on these criteria, the system decides whether the quarantine should be removed. We cannot provide a more detailed description of the algorithm.

Note! If quarantine removal was denied, the next request can be submitted no sooner than 7 days later.


Robots.txt


A website's robots.txt file is located at http://your_website_address/robots.txt. The default robots.txt is set up so that only pages with content are indexed, rather than every existing page (e.g. the login or registration page). As a result, uCoz websites are indexed better and get higher priority than sites where unnecessary pages are also indexed.

That's why we strongly recommend not replacing the default robots.txt with your own.


If you still want to replace the file with your own, create a text file using Notepad or any other text editor and name it "robots.txt". Then upload it to the root folder of your website via File Manager or FTP. Note: while website indexing is prohibited, the robots.txt file cannot be modified.

The default robots.txt looks as follows:
Quote

User-agent: *
Allow: /*?page
Allow: /*?ref=
Allow: /stat/dspixel
Disallow: /*?
Disallow: /stat/
Disallow: /index/1
Disallow: /index/3
Disallow: /register
Disallow: /index/5
Disallow: /index/7
Disallow: /index/8
Disallow: /index/9
Disallow: /index/sub/
Disallow: /panel/
Disallow: /admin/
Disallow: /informer/
Disallow: /secure/
Disallow: /poll/
Disallow: /search/
Disallow: /abnl/
Disallow: /*_escaped_fragment_=
Disallow: /*-*-*-*-987$
Disallow: /shop/order/
Disallow: /shop/printorder/
Disallow: /shop/checkout/
Disallow: /shop/user/
Disallow: /*0-*-0-17$
Disallow: /*-0-0-

Sitemap: http://forum.ucoz.com/sitemap.xml
Sitemap: http://forum.ucoz.com/sitemap-forum.xml
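
To see how the Allow and Disallow lines above interact, here is a minimal Python sketch of wildcard matching. It assumes the precedence described in RFC 9309 and in Google's documentation (the longest matching pattern wins, and Allow wins a tie); other crawlers may resolve conflicts differently. The helper names rule_to_regex and is_allowed, the sample paths, and the shortened RULES list are illustrative and not part of uCoz.

```python
import re

# A shortened copy of the default rules quoted above (illustration only).
RULES = [
    ("Allow", "/*?page"),
    ("Allow", "/*?ref="),
    ("Allow", "/stat/dspixel"),
    ("Disallow", "/*?"),
    ("Disallow", "/stat/"),
    ("Disallow", "/register"),
    ("Disallow", "/*-0-0-"),
]


def rule_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    Without '$', the pattern is a prefix match.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if `path` may be crawled under `rules`.

    Assumed precedence: longest matching pattern wins, Allow wins ties.
    A path that matches no rule is allowed.
    """
    best = None  # (pattern length, is_allow)
    for directive, pattern in rules:
        if rule_to_regex(pattern).match(path):
            candidate = (len(pattern), directive == "Allow")
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]


if __name__ == "__main__":
    for sample in ["/news/?page2", "/blog/?sort=date", "/stat/dspixel", "/register"]:
        print(sample, "->", "allowed" if is_allowed(sample) else "blocked")
```

Under these assumptions, /news/?page2 stays crawlable because "Allow: /*?page" is longer than "Disallow: /*?", while other query-string URLs such as /blog/?sort=date are blocked.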



Robots.txt during the quarantine looks as follows:

Quote

User-agent: *
Disallow: /
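
As a rough illustration of what the quarantine file means for a crawler, the sketch below feeds those two lines to Python's standard urllib.robotparser. The site address and the build_parser helper are made up for the example; real search engines use their own parsers.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt served during quarantine, as quoted above.
QUARANTINE = "User-agent: *\nDisallow: /\n"


def build_parser(robots_txt):
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


quarantined = build_parser(QUARANTINE)

if __name__ == "__main__":
    # Hypothetical site address; every path is refused for every bot.
    for url in ["http://site.ucoz.com/", "http://site.ucoz.com/blog/"]:
        print(url, quarantined.can_fetch("Googlebot", url))
```

Because "Disallow: /" is a prefix of every path, a compliant crawler will not fetch any page of the site while this file is in place.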




Robots.txt FAQ


Question: Why are informers not indexed?

Answer: Informers are not indexed because they display information that already exists elsewhere on the site. As a rule, this information is already indexed on the corresponding pages.


Question: I have accidentally messed up robots.txt. What should I do?

Answer: Delete it. The default robots.txt file will be added back automatically (the system checks whether a website has it, and if not – adds back the default file).


Question: Is there any use in submitting a website to search engines if the quarantine hasn't been removed yet?

Answer: No, your website won't be indexed while in quarantine.


Question: Will the robots.txt file be replaced automatically after the quarantine has been removed? Or should I update it manually?

Answer: It will be updated automatically.


Question: Is it possible to delete the default robots.txt?

Answer: You can't delete it because it's a system file, but you can add your own file. However, as stated above, we don't recommend doing this. During the quarantine it is impossible to upload a custom robots.txt.


Question: What should I do to forbid indexing of the following pages?
_http://site.ucoz.com/index/0-4
_http://site.ucoz.com/index/0-5

Answer: Add the following lines to the robots.txt file, under the "User-agent: *" line:
Disallow: /index/0-4
Disallow: /index/0-5
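
A quick way to check rules like these before uploading is Python's standard urllib.robotparser. The sketch assumes the two paths are written as full "Disallow:" lines under "User-agent: *"; the blocked helper is a hypothetical name, and the module does plain prefix matching, which may differ in detail from what individual search engines do.

```python
from urllib.robotparser import RobotFileParser

# Assumed custom rules for the two pages from the question.
RULES = """\
User-agent: *
Disallow: /index/0-4
Disallow: /index/0-5
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def blocked(path):
    """Return True if a compliant crawler would skip `path`."""
    return not parser.can_fetch("*", "http://site.ucoz.com" + path)


if __name__ == "__main__":
    for path in ["/index/0-4", "/index/0-5", "/index/0-40", "/index/0-3"]:
        print(path, "blocked" if blocked(path) else "allowed")
```

Note that a Disallow path is a prefix: /index/0-40 is blocked as well as /index/0-4, so it is worth testing neighbouring addresses too.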


Question: I have forbidden indexing of some links by means of robots.txt but they are still displayed. Why is it so?

Answer: By means of robots.txt you can forbid indexing of pages, not links.


Question: I want to make some changes in my robots.txt file. How can I do this?

Answer: Download it to your PC, edit it and then upload it back via File Manager or FTP.

I'm not active on the forum anymore. Please contact other forum staff.
Mercury™
Posts: 1
Reputation: 0

Message # 106 | 10:57 AM
Sunny,
how can I access Google bots to post my website on Google.com?
Thanks.
Paradox
Old Guard
Posts: 3284
Reputation: 145

Message # 107 | 12:11 PM
Mercury™, you can't access Googlebot, as it is a web crawler. You can add your site to Google manually by submitting the URL, or simply wait for Googlebot to crawl your site. If site statistics are enabled in your settings, you can generally monitor which bots have crawled your site.

Please remember: bots are unable to crawl free sites until the sites have been active for 30 days after creation.

Hope this helps.

Jack of all trades in development, design, strategy.
Working as a Support Engineer.
Been here for 13 years and counting.
teammember
Posts: 6
Reputation: 0

Message # 108 | 9:59 AM
Hello all,

I changed my robots.txt because I want to remove the http://member-money.ucoz.com/poll page from indexing.

Now, when I use Google Webmaster Tools or any other robots.txt checker, I see this:

From Google Webmaster Tool

Line 1: ?User-agent: * Syntax not understood
Line 2: Disallow: /a/ No user-agent specified

From other tool

Line Contents
1 User-agent: *
Missing / at start of file or folder name

2 Disallow: /a/
The line above must be an allow, disallow, comment or user agent statement

But when I open my robots.txt (http://member-money.ucoz.com/robots.txt) I see this:

User-agent: *
Disallow: /a/
Disallow: /stat/
Disallow: /index/1
Disallow: /index/2
Disallow: /index/3
Disallow: /index/5
Disallow: /index/7
Disallow: /index/8
Disallow: /index/9
Disallow: /panel/
Disallow: /admin/
Disallow: /secure/
Disallow: /informer/
Disallow: /mchat
Disallow: /search
Disallow: /poll

Sitemap: http://www.member-money.ucoz.com/sitemap.xml

Which seems correct to me.

So can you help me with this problem, guys?
Natashko
Posts: 3366
Reputation: 171

Message # 109 | 2:15 PM
teammember, there are mistakes in your robots.txt file, and they are causing the problems. We also do not recommend editing this file in Microsoft Word. This is why we recommend not replacing the default robots.txt file. Once you have replaced it, you are on your own, and there is nothing we can help you with.
I have already answered you here: http://forum.ucoz.com/forum/23-14087-1#81075 Posting similar threads is against forum rules.
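
For anyone hitting the same "Syntax not understood" error on line 1: the stray "?" before "User-agent" in the checker output quoted above is consistent with an invisible UTF-8 byte order mark (BOM) at the start of the file, which some editors add when saving. That diagnosis is an assumption, since the raw bytes of the file are not shown in this thread. A minimal Python sketch for detecting and stripping a BOM before uploading (the function names are hypothetical):

```python
import codecs


def strip_utf8_bom(data):
    """Return `data` (bytes) without a leading UTF-8 byte order mark."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def clean_file(path):
    """Rewrite the file at `path` without a BOM.

    Returns True if a BOM was found and removed, False otherwise.
    """
    with open(path, "rb") as handle:
        raw = handle.read()
    cleaned = strip_utf8_bom(raw)
    if cleaned == raw:
        return False
    with open(path, "wb") as handle:
        handle.write(cleaned)
    return True
```

Calling clean_file("robots.txt") on the local copy and re-uploading it would rule this cause in or out; saving the file from a plain text editor as UTF-8 without BOM (or ANSI) has the same effect.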
Iorga805S
Posts: 45
Reputation: 0

Message # 110 | 12:16 PM
How many visits does it take to unlock robots.txt?
Paradox
Old Guard
Posts: 3284
Reputation: 145

Message # 111 | 12:57 PM
Iorga805S, your site will be opened for indexing after 30 days unless you are paying for a premium package.

Once the ban has ended, there are no hit limits on website indexing.

Jack of all trades in development, design, strategy.
Working as a Support Engineer.
Been here for 13 years and counting.
Iorga805S
Posts: 45
Reputation: 0

Message # 112 | 4:19 PM
ok

Added (2011-09-05, 10:19 Am)
---------------------------------------------
Sorry for the double post!

I made a website on August 6 and now the 30 days have passed, but robots.txt still blocks indexing by Google and other search engines. What can I do?

Post edited by Iorga805S - Monday, 2011-09-05, 4:17 PM
Natashko
Posts: 3366
Reputation: 171

Message # 113 | 10:04 AM
Iorga805S,
Quote (Iorga805S)
I made a website on August 6 and now the 30 days have passed, but robots.txt still blocks indexing by Google and other search engines. What can I do?

Which website are you referring to? This one, http://ro-auto.ucoz.net, is young, so it is still in quarantine. And this one, imobiliare-ro.ucoz.com, is open for indexing.
Iorga805S
Posts: 45
Reputation: 0

Message # 114 | 10:53 AM
It is working now, thanks!
nmrs
Posts: 5
Reputation: -2

Message # 115 | 11:49 AM
Google can't connect to my website

http://nmrs.at.ua/

because of robots.txt.

What must I do now?

http://i.imgur.com/XDlAi.png
lilu
Posts: 70
Reputation: 6

Message # 116 | 12:13 PM
nmrs, wait for 30 days. More info here: http://faq.ucoz.com/faq/0-0-29
Or pay for any paid service (the cost should be no less than $2) and it will be unblocked.
nmrs
Posts: 5
Reputation: -2

Message # 117 | 6:39 PM
This is so bad. I am tired. I worked to make a good site to earn some money from downloads, and now my site is blocked to visitors.
DEEPKG
Posts: 316
Reputation: 8

Message # 118 | 6:43 PM
nmrs, I don't understand what problem you are actually facing.
I Try to help. U can Try to give Rep ++ For my try :P
nmrs
Posts: 5
Reputation: -2

Message # 119 | 6:50 PM
DEEPKG, I have added my site http://nmrs.at.ua/ to Google.

Google told me that the robots.txt on my site is blocking the site from appearing in Google search.

See the screenshot:

http://i.imgur.com/XDlAi.png

Excuse my English.
DEEPKG
Posts: 316
Reputation: 8

Message # 120 | 6:54 PM
nmrs, oh, that's too bad. No problem, it is going to be fixed soon.
I Try to help. U can Try to give Rep ++ For my try :P