Items selected where Item is 10997
Questions & Answers
Q: I am evaluating your sitemap software. It does not seem to honor robots.txt. Is there a way we can configure it to honor this (or some variation)? We run an eCommerce website and do not want the software to walk the entire site. I also noticed during my tests that it does not accurately extract dynamic content, which is of course a bigger issue.
A: The SiteMap XML generator is not a robot crawler and does not use the robots.txt file. It crawls your site structure by following the href links in your pages, starting at the Start URL. Any link formatted as an href link in the web page code will be extracted, whether it is a dynamic link or a static link. If an extracted link is not correct, the issue is with the href link structure. So for a page to be crawled, it must be part of the site structure: it must either have a link from one of the crawled pages or be set as a Start URL.

The URLs included in the site map are determined by the settings used. Only URLs that match the Include URLs with pattern setting will be included, and URLs can be excluded using the Exclude URLs with pattern setting. You can use this to exclude some links, but if a search engine can crawl them, they will most likely still be indexed even if they are not in the sitemap. Hope this helps.
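
To illustrate the behaviour described in the answer, here is a minimal Python sketch of href-based link extraction followed by include/exclude filtering. It is not the generator's actual code. The function names, the example.com URLs, and the shell-style wildcard syntax for the patterns are all assumptions made for illustration; the answer does not say which pattern syntax the product uses.

```python
from fnmatch import fnmatchcase
from html.parser import HTMLParser
from urllib.parse import urljoin


class HrefCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(page_url, html):
    """Return absolute URLs for all href links found in the page source."""
    collector = HrefCollector()
    collector.feed(html)
    return [urljoin(page_url, href) for href in collector.hrefs]


def select_for_sitemap(urls, include_pattern, exclude_pattern):
    """Keep URLs that match the include pattern and not the exclude pattern.

    Assumption: patterns are shell-style wildcards (hypothetical syntax).
    """
    return [
        url
        for url in urls
        if fnmatchcase(url, include_pattern)
        and not fnmatchcase(url, exclude_pattern)
    ]


# Hypothetical page: two href links and one link that exists only in script.
sample_html = """
<a href="/products/widget?id=7">Widget</a>
<a href="/cart/checkout">Checkout</a>
<span onclick="location='/hidden'">Not an href link</span>
"""

links = extract_links("https://shop.example.com/", sample_html)
selected = select_for_sitemap(links, "https://shop.example.com/*", "*/cart/*")
print(selected)
```

In this sketch the dynamic link with a query string is picked up because it is a plain href, the script-only link is never seen, and the cart URL is extracted but left out of the sitemap by the exclude pattern. Leaving a URL out this way does not stop a search engine from crawling it.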