
The robot-parse result is unmatched with the actual robots.txt #16

@thoriqadillah

Hi there, I'm just curious whether this is an issue or not, but the robots-parser result apparently does not match the actual robots.txt at the URL. For example, the content of tokopedia.com/robots.txt is the first listing below. With your library the result is mostly the same, except that the allow and disallow lists contain the same paths. You pointed to a guideline from Google saying that if an allow rule and a disallow rule are the same, the allow rule is the one chosen, but the actual file states that these paths are disallowed, not allowed. The result from your library is the second listing:

`User-agent: *
Allow: /blog/etalase
Allow: /blog/note
Allow: /blog/review
Allow: /tokopoints/intro/
Disallow: */tokopedia-lite-production/
Disallow: /*.pl
Disallow: /*/*/review
Disallow: /*/*/talk
Disallow: /*/note
Disallow: /*/review
Disallow: /amp/api/*
Disallow: /archive-*
Disallow: /cart?*
Disallow: /cart/*
Disallow: /chat?*
Disallow: /chat/*
Disallow: /content/*
Disallow: /events/
Disallow: /events/search*
Disallow: /feed?sc=*
Disallow: /feedcommunicationdetail/*
Disallow: /flight/search/*
Disallow: /graphql
Disallow: /hotel/search*
Disallow: /image-search/
Disallow: /insight/*
Disallow: /kartu-kredit*?id=*
Disallow: /myshop/*
Disallow: /order-list/
Disallow: /p/tour-travel
Disallow: /payment/*
Disallow: /people/*
Disallow: /provi/check*
Disallow: /rekomendasi/*/d/
Disallow: /reputationapp/*
Disallow: /search?*
Disallow: /search/*
Disallow: /similar-products*
Disallow: /tokopoints
Disallow: /wishlist?*
Disallow: /helios-client/*

Sitemap: https://www.tokopedia.com/sitemap/catalog-index.xml
Sitemap: https://www.tokopedia.com/sitemap/category-index.xml
Sitemap: https://www.tokopedia.com/sitemap/deals-index.xml
Sitemap: https://www.tokopedia.com/sitemap/egold-index.xml
Sitemap: https://www.tokopedia.com/sitemap/events-index.xml
Sitemap: https://www.tokopedia.com/sitemap/flight-index.xml
Sitemap: https://www.tokopedia.com/sitemap/hotel-index.xml
Sitemap: https://www.tokopedia.com/sitemap/official-store-brand-index.xml
Sitemap: https://www.tokopedia.com/sitemap/official-store-category-index.xml
Sitemap: https://www.tokopedia.com/sitemap/official-store-index.xml
Sitemap: https://www.tokopedia.com/sitemap/product-find-city-index-0.xml
Sitemap: https://www.tokopedia.com/sitemap/product-find-city-index-1.xml
Sitemap: https://www.tokopedia.com/sitemap/product-find-city-index-2.xml
Sitemap: https://www.tokopedia.com/sitemap/product-find-city-index-3.xml
Sitemap: https://www.tokopedia.com/sitemap/product-find-city-index-4.xml
Sitemap: https://www.tokopedia.com/sitemap/product-find-city-index-5.xml
Sitemap: https://www.tokopedia.com/sitemap/product-find-city-index-6.xml
Sitemap: https://www.tokopedia.com/sitemap/product-find-city-index-7.xml
Sitemap: https://www.tokopedia.com/sitemap/product-find-city-index-8.xml
Sitemap: https://www.tokopedia.com/sitemap/product-find-city-index-9.xml
Sitemap: https://www.tokopedia.com/sitemap/product-find-index.xml
Sitemap: https://www.tokopedia.com/sitemap/products-index-0.xml
Sitemap: https://www.tokopedia.com/sitemap/products-index-1.xml
Sitemap: https://www.tokopedia.com/sitemap/products-index-2.xml
Sitemap: https://www.tokopedia.com/sitemap/products-index-3.xml
Sitemap: https://www.tokopedia.com/sitemap/products-index-4.xml
Sitemap: https://www.tokopedia.com/sitemap/products-index-5.xml
Sitemap: https://www.tokopedia.com/sitemap/products-index-6.xml
Sitemap: https://www.tokopedia.com/sitemap/products-index-7.xml
Sitemap: https://www.tokopedia.com/sitemap/products-index-8.xml
Sitemap: https://www.tokopedia.com/sitemap/products-index-9.xml
Sitemap: https://www.tokopedia.com/sitemap/recharge-index.xml
Sitemap: https://www.tokopedia.com/sitemap/salam-index.xml
Sitemap: https://www.tokopedia.com/sitemap/shop-index.xml`



`{
  "url": "https://tokopedia.com",
  "data": {
    "agents": {
      "all": {
        "allow": [
          "/blog/etalase",
          "/blog/note",
          "/blog/review",
          "/tokopoints/intro/",
          "/*.pl",
          "/*/*/review",
          "/*/*/talk",
          "/*/note",
          "/*/review",
          "/amp/api/*",
          "/archive-*",
          "/cart?*",
          "/cart/*",
          "/chat?*",
          "/chat/*",
          "/content/*",
          "/events/",
          "/events/search*",
          "/feed?sc=*",
          "/feedcommunicationdetail/*",
          "/flight/search/*",
          "/graphql",
          "/hotel/search*",
          "/image-search/",
          "/insight/*",
          "/kartu-kredit*?id=*",
          "/myshop/*",
          "/order-list/",
          "/p/tour-travel",
          "/payment/*",
          "/people/*",
          "/provi/check*",
          "/rekomendasi/*/d/",
          "/reputationapp/*",
          "/search?*",
          "/search/*",
          "/similar-products*",
          "/tokopoints",
          "/wishlist?*",
          "/helios-client/*"
        ],
        "disallow": [
          "/*.pl",
          "/*/*/review",
          "/*/*/talk",
          "/*/note",
          "/*/review",
          "/amp/api/*",
          "/archive-*",
          "/cart?*",
          "/cart/*",
          "/chat?*",
          "/chat/*",
          "/content/*",
          "/events/",
          "/events/search*",
          "/feed?sc=*",
          "/feedcommunicationdetail/*",
          "/flight/search/*",
          "/graphql",
          "/hotel/search*",
          "/image-search/",
          "/insight/*",
          "/kartu-kredit*?id=*",
          "/myshop/*",
          "/order-list/",
          "/p/tour-travel",
          "/payment/*",
          "/people/*",
          "/provi/check*",
          "/rekomendasi/*/d/",
          "/reputationapp/*",
          "/search?*",
          "/search/*",
          "/similar-products*",
          "/tokopoints",
          "/wishlist?*",
          "/helios-client/*"
        ]
      }
    },
    "allow": [
      "/blog/etalase",
      "/blog/note",
      "/blog/review",
      "/tokopoints/intro/",
      "/*.pl",
      "/*/*/review",
      "/*/*/talk",
      "/*/note",
      "/*/review",
      "/amp/api/*",
      "/archive-*",
      "/cart?*",
      "/cart/*",
      "/chat?*",
      "/chat/*",
      "/content/*",
      "/events/",
      "/events/search*",
      "/feed?sc=*",
      "/feedcommunicationdetail/*",
      "/flight/search/*",
      "/graphql",
      "/hotel/search*",
      "/image-search/",
      "/insight/*",
      "/kartu-kredit*?id=*",
      "/myshop/*",
      "/order-list/",
      "/p/tour-travel",
      "/payment/*",
      "/people/*",
      "/provi/check*",
      "/rekomendasi/*/d/",
      "/reputationapp/*",
      "/search?*",
      "/search/*",
      "/similar-products*",
      "/tokopoints",
      "/wishlist?*",
      "/helios-client/*"
    ],
    "disallow": [
      "/*.pl",
      "/*/*/review",
      "/*/*/talk",
      "/*/note",
      "/*/review",
      "/amp/api/*",
      "/archive-*",
      "/cart?*",
      "/cart/*",
      "/chat?*",
      "/chat/*",
      "/content/*",
      "/events/",
      "/events/search*",
      "/feed?sc=*",
      "/feedcommunicationdetail/*",
      "/flight/search/*",
      "/graphql",
      "/hotel/search*",
      "/image-search/",
      "/insight/*",
      "/kartu-kredit*?id=*",
      "/myshop/*",
      "/order-list/",
      "/p/tour-travel",
      "/payment/*",
      "/people/*",
      "/provi/check*",
      "/rekomendasi/*/d/",
      "/reputationapp/*",
      "/search?*",
      "/search/*",
      "/similar-products*",
      "/tokopoints",
      "/wishlist?*",
      "/helios-client/*"
    ],
    "sitemaps": [
      "https://www.tokopedia.com/sitemap/catalog-index.xml",
      "https://www.tokopedia.com/sitemap/category-index.xml",
      "https://www.tokopedia.com/sitemap/deals-index.xml",
      "https://www.tokopedia.com/sitemap/egold-index.xml",
      "https://www.tokopedia.com/sitemap/events-index.xml",
      "https://www.tokopedia.com/sitemap/flight-index.xml",
      "https://www.tokopedia.com/sitemap/hotel-index.xml",
      "https://www.tokopedia.com/sitemap/official-store-brand-index.xml",
      "https://www.tokopedia.com/sitemap/official-store-category-index.xml",
      "https://www.tokopedia.com/sitemap/official-store-index.xml",
      "https://www.tokopedia.com/sitemap/product-find-city-index-0.xml",
      "https://www.tokopedia.com/sitemap/product-find-city-index-1.xml",
      "https://www.tokopedia.com/sitemap/product-find-city-index-2.xml",
      "https://www.tokopedia.com/sitemap/product-find-city-index-3.xml",
      "https://www.tokopedia.com/sitemap/product-find-city-index-4.xml",
      "https://www.tokopedia.com/sitemap/product-find-city-index-5.xml",
      "https://www.tokopedia.com/sitemap/product-find-city-index-6.xml",
      "https://www.tokopedia.com/sitemap/product-find-city-index-7.xml",
      "https://www.tokopedia.com/sitemap/product-find-city-index-8.xml",
      "https://www.tokopedia.com/sitemap/product-find-city-index-9.xml",
      "https://www.tokopedia.com/sitemap/product-find-index.xml",
      "https://www.tokopedia.com/sitemap/products-index-0.xml",
      "https://www.tokopedia.com/sitemap/products-index-1.xml",
      "https://www.tokopedia.com/sitemap/products-index-2.xml",
      "https://www.tokopedia.com/sitemap/products-index-3.xml",
      "https://www.tokopedia.com/sitemap/products-index-4.xml",
      "https://www.tokopedia.com/sitemap/products-index-5.xml",
      "https://www.tokopedia.com/sitemap/products-index-6.xml",
      "https://www.tokopedia.com/sitemap/products-index-7.xml",
      "https://www.tokopedia.com/sitemap/products-index-8.xml",
      "https://www.tokopedia.com/sitemap/products-index-9.xml",
      "https://www.tokopedia.com/sitemap/recharge-index.xml",
      "https://www.tokopedia.com/sitemap/salam-index.xml",
      "https://www.tokopedia.com/sitemap/shop-index.xml"
    ],
    "host": ""
  }
}`
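To make the Google guideline mentioned above concrete, here is a minimal sketch of how I understand that precedence rule: the longest matching pattern wins, and when an allow and a disallow pattern of equal length both match, allow wins. This is not the library's code; the function names `pattern_to_regex` and `is_allowed` are made up for illustration, and the rule lists are a small subset of the file above.

```python
import re
from typing import Iterable


def pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a robots.txt path pattern into a prefix-matching regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path: str, allow: Iterable[str], disallow: Iterable[str]) -> bool:
    """Longest matching pattern wins; on a tie in length, allow wins."""
    best_len = -1
    allowed = True  # no matching rule means the path may be crawled
    for rules, verdict in ((disallow, False), (allow, True)):
        for pattern in rules:
            if not pattern or not pattern_to_regex(pattern).match(path):
                continue
            longer = len(pattern) > best_len
            tie_for_allow = len(pattern) == best_len and verdict
            if longer or tie_for_allow:
                best_len = len(pattern)
                allowed = verdict
    return allowed


# Rules as written in the actual robots.txt (subset).
allow = ["/blog/review", "/tokopoints/intro/"]
disallow = ["/*/review", "/tokopoints", "/graphql"]

print(is_allowed("/blog/review", allow, disallow))  # True: the allow rule is longer
print(is_allowed("/shop/review", allow, disallow))  # False: only the disallow rule matches
print(is_allowed("/graphql", allow, disallow))      # False: only the disallow rule matches

# If every disallow entry is also copied into allow, as in the JSON above,
# the tie rule makes those paths look crawlable.
print(is_allowed("/graphql", allow + disallow, disallow))  # True
```

The last call shows why the duplicated entries matter: with identical patterns in both lists, the tie always resolves to allow, which contradicts the original file.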

My conclusion is that only the paths that appear in the allow list but not in the disallow list are the ones actually allowed to be crawled. Am I wrong, or am I missing something? If not, can you make it simpler? Thanks in advance.
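As a workaround, this is roughly how I would recover the paths that are really allowed from the parsed result, assuming the conclusion above is right. It is only a sketch: the helper name `allow_only` is made up, and the `rules` dict is a shortened copy of the `"all"` agent entry from the output above.

```python
def allow_only(agent_rules: dict) -> list:
    """Return allow entries that are not also listed under disallow, keeping order."""
    blocked = set(agent_rules.get("disallow", []))
    return [path for path in agent_rules.get("allow", []) if path not in blocked]


# Shortened copy of data["agents"]["all"] from the library output.
rules = {
    "allow": [
        "/blog/etalase",
        "/blog/note",
        "/blog/review",
        "/tokopoints/intro/",
        "/*.pl",
        "/graphql",
    ],
    "disallow": ["/*.pl", "/graphql"],
}

print(allow_only(rules))
# ['/blog/etalase', '/blog/note', '/blog/review', '/tokopoints/intro/']
```

With the full output this leaves exactly the four Allow lines from the original robots.txt.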
