Description
Hi there, I'm just curious whether this is an issue or not, but the robots-parser result apparently does not match the actual robots.txt of the URL. For example, the content of tokopedia.com/robots.txt is the first block below. With your library, the result is mostly the same, except that the allow and disallow lists contain the same directories. You provided a guideline from Google saying that if an allow rule and a disallow rule are the same, the allow rule is the one chosen, but the actual file states that these paths are disallowed, not allowed. The result from your library is the second block:
`User-agent: *
Allow: /blog/etalase
Allow: /blog/note
Allow: /blog/review
Allow: /tokopoints/intro/
Disallow: */tokopedia-lite-production/
Disallow: /*.pl
Disallow: /*/*/review
Disallow: /*/*/talk
Disallow: /*/note
Disallow: /*/review
Disallow: /amp/api/*
Disallow: /archive-*
Disallow: /cart?*
Disallow: /cart/*
Disallow: /chat?*
Disallow: /chat/*
Disallow: /content/*
Disallow: /events/
Disallow: /events/search*
Disallow: /feed?sc=*
Disallow: /feedcommunicationdetail/*
Disallow: /flight/search/*
Disallow: /graphql
Disallow: /hotel/search*
Disallow: /image-search/
Disallow: /insight/*
Disallow: /kartu-kredit*?id=*
Disallow: /myshop/*
Disallow: /order-list/
Disallow: /p/tour-travel
Disallow: /payment/*
Disallow: /people/*
Disallow: /provi/check*
Disallow: /rekomendasi/*/d/
Disallow: /reputationapp/*
Disallow: /search?*
Disallow: /search/*
Disallow: /similar-products*
Disallow: /tokopoints
Disallow: /wishlist?*
Disallow: /helios-client/*
Sitemap: https://www.tokopedia.com/sitemap/catalog-index.xml
Sitemap: https://www.tokopedia.com/sitemap/category-index.xml
Sitemap: https://www.tokopedia.com/sitemap/deals-index.xml
Sitemap: https://www.tokopedia.com/sitemap/egold-index.xml
Sitemap: https://www.tokopedia.com/sitemap/events-index.xml
Sitemap: https://www.tokopedia.com/sitemap/flight-index.xml
Sitemap: https://www.tokopedia.com/sitemap/hotel-index.xml
Sitemap: https://www.tokopedia.com/sitemap/official-store-brand-index.xml
Sitemap: https://www.tokopedia.com/sitemap/official-store-category-index.xml
Sitemap: https://www.tokopedia.com/sitemap/official-store-index.xml
Sitemap: https://www.tokopedia.com/sitemap/product-find-city-index-0.xml
Sitemap: https://www.tokopedia.com/sitemap/product-find-city-index-1.xml
Sitemap: https://www.tokopedia.com/sitemap/product-find-city-index-2.xml
Sitemap: https://www.tokopedia.com/sitemap/product-find-city-index-3.xml
Sitemap: https://www.tokopedia.com/sitemap/product-find-city-index-4.xml
Sitemap: https://www.tokopedia.com/sitemap/product-find-city-index-5.xml
Sitemap: https://www.tokopedia.com/sitemap/product-find-city-index-6.xml
Sitemap: https://www.tokopedia.com/sitemap/product-find-city-index-7.xml
Sitemap: https://www.tokopedia.com/sitemap/product-find-city-index-8.xml
Sitemap: https://www.tokopedia.com/sitemap/product-find-city-index-9.xml
Sitemap: https://www.tokopedia.com/sitemap/product-find-index.xml
Sitemap: https://www.tokopedia.com/sitemap/products-index-0.xml
Sitemap: https://www.tokopedia.com/sitemap/products-index-1.xml
Sitemap: https://www.tokopedia.com/sitemap/products-index-2.xml
Sitemap: https://www.tokopedia.com/sitemap/products-index-3.xml
Sitemap: https://www.tokopedia.com/sitemap/products-index-4.xml
Sitemap: https://www.tokopedia.com/sitemap/products-index-5.xml
Sitemap: https://www.tokopedia.com/sitemap/products-index-6.xml
Sitemap: https://www.tokopedia.com/sitemap/products-index-7.xml
Sitemap: https://www.tokopedia.com/sitemap/products-index-8.xml
Sitemap: https://www.tokopedia.com/sitemap/products-index-9.xml
Sitemap: https://www.tokopedia.com/sitemap/recharge-index.xml
Sitemap: https://www.tokopedia.com/sitemap/salam-index.xml
Sitemap: https://www.tokopedia.com/sitemap/shop-index.xml`
`{
"url": "https://tokopedia.com",
"data": {
"agents": {
"all": {
"allow": [
"/blog/etalase",
"/blog/note",
"/blog/review",
"/tokopoints/intro/",
"/*.pl",
"/*/*/review",
"/*/*/talk",
"/*/note",
"/*/review",
"/amp/api/*",
"/archive-*",
"/cart?*",
"/cart/*",
"/chat?*",
"/chat/*",
"/content/*",
"/events/",
"/events/search*",
"/feed?sc=*",
"/feedcommunicationdetail/*",
"/flight/search/*",
"/graphql",
"/hotel/search*",
"/image-search/",
"/insight/*",
"/kartu-kredit*?id=*",
"/myshop/*",
"/order-list/",
"/p/tour-travel",
"/payment/*",
"/people/*",
"/provi/check*",
"/rekomendasi/*/d/",
"/reputationapp/*",
"/search?*",
"/search/*",
"/similar-products*",
"/tokopoints",
"/wishlist?*",
"/helios-client/*"
],
"disallow": [
"/*.pl",
"/*/*/review",
"/*/*/talk",
"/*/note",
"/*/review",
"/amp/api/*",
"/archive-*",
"/cart?*",
"/cart/*",
"/chat?*",
"/chat/*",
"/content/*",
"/events/",
"/events/search*",
"/feed?sc=*",
"/feedcommunicationdetail/*",
"/flight/search/*",
"/graphql",
"/hotel/search*",
"/image-search/",
"/insight/*",
"/kartu-kredit*?id=*",
"/myshop/*",
"/order-list/",
"/p/tour-travel",
"/payment/*",
"/people/*",
"/provi/check*",
"/rekomendasi/*/d/",
"/reputationapp/*",
"/search?*",
"/search/*",
"/similar-products*",
"/tokopoints",
"/wishlist?*",
"/helios-client/*"
]
}
},
"allow": [
"/blog/etalase",
"/blog/note",
"/blog/review",
"/tokopoints/intro/",
"/*.pl",
"/*/*/review",
"/*/*/talk",
"/*/note",
"/*/review",
"/amp/api/*",
"/archive-*",
"/cart?*",
"/cart/*",
"/chat?*",
"/chat/*",
"/content/*",
"/events/",
"/events/search*",
"/feed?sc=*",
"/feedcommunicationdetail/*",
"/flight/search/*",
"/graphql",
"/hotel/search*",
"/image-search/",
"/insight/*",
"/kartu-kredit*?id=*",
"/myshop/*",
"/order-list/",
"/p/tour-travel",
"/payment/*",
"/people/*",
"/provi/check*",
"/rekomendasi/*/d/",
"/reputationapp/*",
"/search?*",
"/search/*",
"/similar-products*",
"/tokopoints",
"/wishlist?*",
"/helios-client/*"
],
"disallow": [
"/*.pl",
"/*/*/review",
"/*/*/talk",
"/*/note",
"/*/review",
"/amp/api/*",
"/archive-*",
"/cart?*",
"/cart/*",
"/chat?*",
"/chat/*",
"/content/*",
"/events/",
"/events/search*",
"/feed?sc=*",
"/feedcommunicationdetail/*",
"/flight/search/*",
"/graphql",
"/hotel/search*",
"/image-search/",
"/insight/*",
"/kartu-kredit*?id=*",
"/myshop/*",
"/order-list/",
"/p/tour-travel",
"/payment/*",
"/people/*",
"/provi/check*",
"/rekomendasi/*/d/",
"/reputationapp/*",
"/search?*",
"/search/*",
"/similar-products*",
"/tokopoints",
"/wishlist?*",
"/helios-client/*"
],
"sitemaps": [
"https://www.tokopedia.com/sitemap/catalog-index.xml",
"https://www.tokopedia.com/sitemap/category-index.xml",
"https://www.tokopedia.com/sitemap/deals-index.xml",
"https://www.tokopedia.com/sitemap/egold-index.xml",
"https://www.tokopedia.com/sitemap/events-index.xml",
"https://www.tokopedia.com/sitemap/flight-index.xml",
"https://www.tokopedia.com/sitemap/hotel-index.xml",
"https://www.tokopedia.com/sitemap/official-store-brand-index.xml",
"https://www.tokopedia.com/sitemap/official-store-category-index.xml",
"https://www.tokopedia.com/sitemap/official-store-index.xml",
"https://www.tokopedia.com/sitemap/product-find-city-index-0.xml",
"https://www.tokopedia.com/sitemap/product-find-city-index-1.xml",
"https://www.tokopedia.com/sitemap/product-find-city-index-2.xml",
"https://www.tokopedia.com/sitemap/product-find-city-index-3.xml",
"https://www.tokopedia.com/sitemap/product-find-city-index-4.xml",
"https://www.tokopedia.com/sitemap/product-find-city-index-5.xml",
"https://www.tokopedia.com/sitemap/product-find-city-index-6.xml",
"https://www.tokopedia.com/sitemap/product-find-city-index-7.xml",
"https://www.tokopedia.com/sitemap/product-find-city-index-8.xml",
"https://www.tokopedia.com/sitemap/product-find-city-index-9.xml",
"https://www.tokopedia.com/sitemap/product-find-index.xml",
"https://www.tokopedia.com/sitemap/products-index-0.xml",
"https://www.tokopedia.com/sitemap/products-index-1.xml",
"https://www.tokopedia.com/sitemap/products-index-2.xml",
"https://www.tokopedia.com/sitemap/products-index-3.xml",
"https://www.tokopedia.com/sitemap/products-index-4.xml",
"https://www.tokopedia.com/sitemap/products-index-5.xml",
"https://www.tokopedia.com/sitemap/products-index-6.xml",
"https://www.tokopedia.com/sitemap/products-index-7.xml",
"https://www.tokopedia.com/sitemap/products-index-8.xml",
"https://www.tokopedia.com/sitemap/products-index-9.xml",
"https://www.tokopedia.com/sitemap/recharge-index.xml",
"https://www.tokopedia.com/sitemap/salam-index.xml",
"https://www.tokopedia.com/sitemap/shop-index.xml"
],
"host": ""
}
}`
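To make the expectation concrete, here is a minimal Python sketch of how I would expect the rules to be grouped. It is not this library's code: `parse_rules` is a hypothetical helper, the sample is a trimmed copy of the file above, and the parsing is simplified (it ignores sitemaps and other fields).

```python
from collections import defaultdict


def parse_rules(robots_txt: str) -> dict:
    """Group Allow/Disallow paths by user-agent (simplified, hypothetical helper)."""
    agents = defaultdict(lambda: {"allow": [], "disallow": []})
    current = []            # user-agents the following rules apply to
    last_was_agent = False  # consecutive User-agent lines share one group
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if not last_was_agent:
                current = []  # a new group starts here
            current.append(value)
            last_was_agent = True
            continue
        last_was_agent = False
        if field in ("allow", "disallow") and value:
            for agent in current:
                agents[agent][field].append(value)
    return dict(agents)


# Trimmed sample taken from the robots.txt above.
sample = """User-agent: *
Allow: /blog/review
Allow: /tokopoints/intro/
Disallow: /*/review
Disallow: /tokopoints
"""

rules = parse_rules(sample)
print(rules["*"])
```

With this grouping, a `Disallow` path only ever lands in the `disallow` list, so the two lists do not overlap unless the file itself repeats a path under both directives.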
My conclusion is that only the directories that appear in the allow list but not in the disallow list are actually allowed to be crawled. Am I wrong, or am I missing something? If not, could you make the output simpler? Thanks in advance.
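As a workaround for now, the conclusion can be sketched in a few lines of Python. This assumes the output shape shown above (an `allow` and a `disallow` list per agent); `effective_allow` is a hypothetical helper of mine, not part of the library, and the input is a trimmed copy of the `all` group.

```python
def effective_allow(group: dict) -> list:
    """Return allow entries that do not also appear in the disallow list."""
    blocked = set(group["disallow"])
    return [path for path in group["allow"] if path not in blocked]


# Trimmed version of the "all" group from the library output above.
all_group = {
    "allow": [
        "/blog/etalase",
        "/blog/note",
        "/blog/review",
        "/tokopoints/intro/",
        "/*.pl",
        "/graphql",
        "/tokopoints",
    ],
    "disallow": ["/*.pl", "/graphql", "/tokopoints"],
}

print(effective_allow(all_group))
# ['/blog/etalase', '/blog/note', '/blog/review', '/tokopoints/intro/']
```

This only compares the strings exactly, so it recovers the four `Allow` lines from the original file but does not evaluate wildcard patterns or rule precedence.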