Announcing Sitemap Generator version 1.3: Improved encoding support
September 15, 2005
The Sitemap Generator version 1.3 is now available and provides improved encoding support. If your webserver uses an encoding other
than UTF-8, or if your domain name or some of the URLs in your site use non-ASCII characters, and you
plan to use the Sitemap Generator to create your Sitemap, you should download this latest version.
Generally, non-ASCII URLs should be encoded using UTF-8 before being percent-escaped. However, some webservers respond correctly only if URLs
are encoded specifically for the webserver's configuration. All URLs within your Sitemap, as well
as the URL of the Sitemap itself, must be encoded for readability by the web server on which they
are located.
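
As a rough illustration of the difference between the two cases, the following sketch percent-escapes the same path twice: once after encoding it as UTF-8, and once after encoding it as Latin-1 for a webserver configured that way. It uses Python 3's urllib.parse.quote rather than anything from the Sitemap Generator itself; the escape_path helper and the sample path are hypothetical and exist only for this example.

```python
from urllib.parse import quote


def escape_path(path, server_encoding="utf-8"):
    """Encode a URL path to bytes in server_encoding, then percent-escape it.

    Hypothetical helper for illustration; not part of the Sitemap Generator.
    """
    # quote() converts the text to bytes with the given encoding and
    # escapes every byte that is not a safe ASCII character.
    return quote(path, safe="/", encoding=server_encoding)


# The general recommendation: UTF-8 first, then percent-escape.
utf8_url = escape_path("/café/menu.html")

# A webserver configured for Latin-1 may expect this form instead.
latin1_url = escape_path("/café/menu.html", "latin-1")

print(utf8_url)    # /caf%C3%A9/menu.html
print(latin1_url)  # /caf%E9/menu.html
```

The two results name the same resource only if the webserver decodes them with the matching encoding, which is why the Sitemap has to use whichever form the server actually answers to.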
If you are using the Sitemap Generator,
you can specify the encoding of the URLs contained in the Sitemap from within the config.xml file. Within the site definition section of that config file, use the optional default_encoding attribute to specify the encoding used by your webserver. If you don't use this attribute and your webserver uses an encoding other
than UTF-8, the Sitemap Generator can't know which encoding to use, although it does attempt to
determine the correct encoding. If the generated Sitemap doesn't list the URLs correctly, you
should explicitly indicate the encoding with the default_encoding attribute and run the Sitemap
Generator again.
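
To make the attribute concrete, here is a small sketch that parses a site definition and reads default_encoding from it. Only the default_encoding attribute comes from this post; the site element name, the base_url and store_into attributes, and their values are assumptions made for illustration, so check them against the config reference before copying anything. The parsing code is likewise a stand-alone Python 3 example, not the Sitemap Generator's own logic.

```python
import codecs
import xml.etree.ElementTree as ET

# Assumed shape of a site definition. Only default_encoding is taken from
# the announcement; the other names and values are placeholders.
CONFIG_FRAGMENT = """
<site base_url="http://www.example.com/"
      store_into="/var/www/docroot/sitemap.xml.gz"
      default_encoding="ISO-8859-1">
</site>
""".strip()

site = ET.fromstring(CONFIG_FRAGMENT)

# The attribute is optional, so fall back to UTF-8 when it is absent.
encoding = site.get("default_encoding", "UTF-8")

# codecs.lookup raises LookupError if Python does not know the encoding
# name, which catches typos before any URL is encoded.
codecs.lookup(encoding)

print(encoding)  # ISO-8859-1
```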
If your URLs contain non-ASCII characters, we recommend that you run the Sitemap Generator script
using Python 2.3 or higher. This version of Python has increased non-ASCII support. If your domain
name contains non-ASCII characters, you must use Python 2.3 or later, as Internationalizing Domain Names in Applications (IDNA) support wasn't added until this version. Without IDNA support, the Sitemap Generator can't
correctly encode a non-ASCII domain name.
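
For readers unfamiliar with IDNA, the sketch below shows what encoding a non-ASCII domain name looks like with Python's built-in idna codec. It is written for Python 3 and is not taken from the Sitemap Generator; the to_ascii_host helper and the sample hostname are hypothetical.

```python
def to_ascii_host(hostname):
    """Convert a hostname to its ASCII (IDNA) form.

    Hypothetical helper for illustration; uses Python's built-in
    "idna" codec, which follows RFC 3490.
    """
    # Each non-ASCII label becomes an "xn--" Punycode label;
    # labels that are already ASCII pass through unchanged.
    return hostname.encode("idna").decode("ascii")


ascii_host = to_ascii_host("bücher.example")
print(ascii_host)  # xn--bcher-kva.example
```

Only the host part of a URL is converted this way. The path and query still go through the byte encoding and percent-escaping described earlier.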
The Sitemap Generator is no longer maintained.

Posted by Vanessa Fox