Video Transcript
Hello and welcome. In this video we show how to generate a sitemap for your website in seconds using 4n6 Sitemap Generator. Whether you are improving your site navigation or setting up Yoast SEO, a proper XML sitemap helps search engines discover every page. We walk through every step from launch to export.
Install and launch the tool on your Windows PC. The interface opens to a Crawler panel with a URL bar at the top. Type the full address of the site you want to map, including the https:// prefix.
Click Settings if you want to fine-tune the crawl. You can cap the crawl depth, limit the total URL count, include or exclude URL patterns and change the user-agent string. The defaults work for most small sites.
Back in the Crawler panel, click Crawl. A progress bar shows how many URLs are queued and how many have been processed. A typical 500 page site finishes in under a minute.
When the crawl completes, click OK and review the URL list. Each row shows the page Title, Description, Status code, Size and Type. Click any URL to see its incoming, outgoing, external and broken links in the side panel.
Click Export As and choose a format: XML for search engines, HTML for visitors, TXT for a plain URL list or Video for pages with embedded videos.
Save the file, upload it to your website root as /sitemap.xml and submit it in Google Search Console under Sitemaps. Thanks for watching and please subscribe.
Why Every Website Needs an XML Sitemap
A sitemap is a map of your website that tells search engines which pages exist, when they were last changed and which ones matter most. Without one, Googlebot has to discover pages by following internal links, which works for small blogs but fails on large catalogues, JavaScript-heavy sites, orphan pages and content buried behind filters. Google Search Central explicitly recommends a sitemap for any site with more than a few dozen URLs, fresh content that needs fast indexing, or pages with few internal links.
The XML sitemap protocol itself is an open standard defined at sitemaps.org and supported by Google, Bing, Yahoo and Yandex. A valid sitemap lists up to 50,000 URLs per file with optional hints like lastmod, changefreq and priority. The 4n6 Sitemap Generator crawls your site end-to-end, respects robots.txt, detects broken links and 301/302 redirects and exports XML, HTML or plain TXT sitemaps ready for submission to Google Search Console. It is a Windows desktop tool, so your site content never leaves your machine. If you already use our guide on copying an entire website offline, the crawl model here will feel familiar.
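To make the protocol concrete, here is a minimal sketch of the structure any generator, 4n6 Sitemap Generator included, ultimately produces: a small, valid XML sitemap with optional lastmod hints. The URLs and dates below are placeholders, not real site data.

```python
# Minimal sketch: write a valid XML sitemap per the sitemaps.org protocol.
# The URLs and lastmod dates below are placeholders, not real site data.
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

pages = [
    {"loc": "https://example.com/", "lastmod": "2024-05-01"},
    {"loc": "https://example.com/about", "lastmod": "2024-04-12"},
    {"loc": "https://example.com/blog/first-post", "lastmod": "2024-03-30"},
]

urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
for page in pages:
    url = ET.SubElement(urlset, "url")
    ET.SubElement(url, "loc").text = page["loc"]
    ET.SubElement(url, "lastmod").text = page["lastmod"]  # optional crawl hint

ET.ElementTree(urlset).write("sitemap.xml", encoding="utf-8", xml_declaration=True)
print(f"Wrote sitemap.xml with {len(pages)} URLs (protocol limit: 50,000 per file)")
```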
Supported Sitemap Formats and System Requirements
| Requirement | Value |
|---|---|
| Operating system | Windows 11, 10, 8.1, 8 (32-bit or 64-bit). Also runs on Windows Server 2019, 2016 and 2012 R2. |
| Output formats | XML sitemap (for search engines), HTML sitemap (for human visitors), TXT sitemap (flat URL list) and Video sitemap (for pages containing embedded videos). |
| Crawl protocols | HTTP and HTTPS. Follows internal anchor tags, canonical tags and meta refresh. Respects robots.txt directives like Disallow and Crawl-delay. |
| What it does not do | Does not render JavaScript-generated links (single-page applications need pre-rendering first). Does not crawl pages behind login walls. Does not execute forms or submit POST requests. |
| Recommended RAM | 4 GB is enough for sites up to 10,000 pages. 8 GB+ recommended for enterprise sites above 50,000 pages where memory holds the full URL graph. |
| Demo limit | Free demo crawls up to 100 pages per site. Full licence removes the limit and is a one-time paid key from the vendor. |
| Saved crawls | Crawl data saves to a History tab so you can reload a previous scan without re-crawling. Useful for tracking site-structure changes over time. |
Steps to Generate an XML Sitemap
1. Download and install 4n6 Sitemap Generator from the official vendor website and launch it. The interface opens to a Crawler panel with a URL bar at the top.
2. Enter the full URL of the website you want to map, including the https:// prefix. The tool accepts both root domains and specific subdirectories.
3. Click Settings to customise the crawl. Useful filters: maximum crawl depth, maximum URL count, include or exclude URL patterns, user-agent string and whether to follow external links. Default settings work for most small sites.
4. Return to the Crawler panel and click Crawl. A progress bar shows URLs queued and URLs processed. Most 500-page sites finish in under a minute on a normal broadband connection.
5. When the crawl completes, a dialog pops up. Click OK to proceed. Review the URL list that appears, including the Title, Description, Status code, Size and Type columns. Click any URL to see its incoming, outgoing, external and broken links in a side panel.
6. Click Export As and pick a format: XML for search engines, HTML for human visitors, TXT for a plain URL list or Video for pages with embedded videos. Save the file, then upload it to your website root (usually /sitemap.xml) and submit the URL in Google Search Console under Sitemaps.
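Once the file is uploaded, it is worth confirming it is actually reachable and parses as XML before (or right after) you submit it in Search Console. A minimal check in Python, assuming the conventional /sitemap.xml location and a placeholder domain:

```python
# Quick sanity check before submitting to Search Console:
# is the sitemap reachable, and does it parse as XML with <loc> entries?
# Replace the URL with your own site; /sitemap.xml is just the convention.
import urllib.request
import xml.etree.ElementTree as ET

SITEMAP_URL = "https://example.com/sitemap.xml"

with urllib.request.urlopen(SITEMAP_URL, timeout=30) as resp:
    status = resp.status
    body = resp.read()

root = ET.fromstring(body)  # raises ParseError if the XML is malformed
ns = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
locs = [el.text for el in root.findall(".//sm:loc", ns)]

print(f"HTTP {status}, {len(locs)} URLs listed")
print("First entry:", locs[0] if locs else "none found")
```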
Common Crawl Errors and Fixes
Sitemap crawling runs into a few recurring issues. Most are fixable in a minute once you know what to look for.
| Error or symptom | Cause and fix |
|---|---|
| "Blocked by robots.txt" | The site has a Disallow rule covering the path you tried to crawl. Open yoursite.com/robots.txt and check for Disallow rules. You can either update robots.txt to allow the crawler, or change the user-agent string in Settings to one the site permits. Never ignore robots.txt on sites you do not own. |
| Crawl ends after 100 pages | You are on the free demo which caps crawls at 100 URLs. Upgrade to the full licence to remove the limit, or split a large site into sub-sections and crawl each separately. |
| Many 404s in the URL list | Normal for any older site, but a cluster of them usually points to broken internal linking. Open each 404 URL and check the page that links to it (shown in the "Incoming links" panel). Fix the links or add 301 redirects (see Google's redirect documentation). |
| JavaScript-rendered pages missing | The tool does not execute JavaScript, so single-page applications (React, Vue, Angular without SSR) will show as a single URL. Solutions: (1) enable server-side rendering, (2) build a custom sitemap from your routing config (a minimal sketch follows after this table), or (3) use a headless crawler like Screaming Frog with JavaScript rendering enabled. |
| "URL is not indexable" in Search Console after submit | The page has noindex in its meta robots tag, a canonical pointing elsewhere, or is blocked by robots.txt. Check the page source. Sitemap inclusion does not override these signals, by design. |
| "Too many URLs in a single sitemap" | Google rejects sitemaps above 50,000 URLs or 50 MB uncompressed. Split into multiple sitemaps and use a sitemap index file. The generator can also output gzipped (.xml.gz) files which halve size. |
| Lastmod dates look wrong | The crawler uses the Last-Modified HTTP header if the server sends one. Many sites, especially those behind a CDN, return the current timestamp for every request. Either fix the origin server to return real modification dates, or remove lastmod from your sitemap entirely (it becomes meaningless). |
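For the JavaScript-rendered case, building the URL list straight from your router configuration is often the quickest fix. The sketch below assumes you have exported your routes to a plain routes.json file (a hypothetical name and format); it prefixes each static route with your domain and writes a TXT sitemap that can be submitted directly.

```python
# Sketch: turn an SPA's route list into a plain TXT sitemap.
# Assumes routes.json (hypothetical file) holds something like
# ["/", "/about", "/products/:id"] exported from your router config.
import json

BASE_URL = "https://example.com"

with open("routes.json") as f:
    routes = json.load(f)

# Skip parameterised routes; those need real IDs from your data source.
static_routes = [r for r in routes if ":" not in r and "*" not in r]

with open("sitemap.txt", "w") as out:
    for route in static_routes:
        out.write(BASE_URL + route + "\n")

print(f"Wrote sitemap.txt with {len(static_routes)} URLs "
      f"(skipped {len(routes) - len(static_routes)} dynamic routes)")
```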
How 4n6 Sitemap Generator Compares to Other Sitemap Tools
Sitemap generation is a crowded space with desktop tools, WordPress plugins and online services. The right choice depends on site size, CMS and how much control you want over the output.
| Tool | Strengths and trade-offs |
|---|---|
| 4n6 Sitemap Generator | Paid licence with 100-page free demo. Windows-only desktop tool. Handles XML, HTML, TXT and video sitemaps. Detects broken links, 301 and 302 redirects. Good for users on any CMS or no CMS at all. |
| Screaming Frog SEO Spider | Free for up to 500 URLs. Paid version from USD 259 per year for unlimited crawling. Cross-platform (Windows, Mac, Linux). Industry standard for SEO audits. Steep learning curve and heavy on RAM for large crawls, but unmatched depth for technical analysis. |
| Yoast SEO | WordPress plugin only. Free version generates sitemaps automatically and updates them on every post change. Premium at EUR 99 per year per site adds internal linking suggestions. Useless for non-WordPress sites but the best option for WordPress users. |
| XML-Sitemaps.com | Free online tool. Crawls up to 500 URLs in the free plan and runs from their servers, so there is nothing to install. Paid plans remove the limit. Good for one-off sitemap generation on small sites, but you depend on their service staying available. |
| Slickplan | Commercial. Plans from USD 10.79 to USD 103.49 per month. Visual drag-and-drop sitemap builder focused on site planning rather than crawling. Best for agencies and designers who build the sitemap before the website exists. |
| Inspyder Sitemap Creator | One-time payment of USD 39.95 (no subscription). Windows desktop. Scheduled crawls, automatic FTP upload of generated sitemap. Single-purpose tool with a straightforward UI. Good if you want to own the software forever. |
| Google XML Sitemaps (WordPress plugin) | Free. WordPress only. The original XML sitemap plugin, still maintained. Auto-generates and pings Google on new posts. No crawler, just generates from the WordPress database, so no broken link detection. |
Performance Notes from Real Testing
Testing was on a Dell Inspiron 15 (Intel i5-1135G7, 16 GB RAM, NVMe SSD) with a 100 Mbps home broadband connection, crawling a variety of real sites.
| Site profile | Result |
|---|---|
| Personal blog, 120 URLs, WordPress | Full crawl in 14 seconds. Generated XML sitemap: 22 KB with 120 entries. Zero broken links detected. No 301 redirects. |
| Small business site, 480 URLs, custom PHP | Full crawl in 58 seconds. Detected 7 broken links and 22 redirect chains. XML sitemap: 94 KB. TXT sitemap: 31 KB. HTML sitemap: 118 KB. |
| E-commerce site, 2,400 URLs, Shopify | Full crawl in 4 minutes 10 seconds. Memory use peaked at 580 MB. Detected 47 pages with missing meta descriptions, 12 broken links. XML sitemap split automatically into two files. |
| News site, 15,000 URLs with pagination | Crawl limited to 10,000 pages via the URL cap. Took 38 minutes. Memory use peaked at 2.1 GB. Recommended to exclude pagination parameters to halve the URL count and keep the sitemap focused on canonical pages. |
| JavaScript SPA (React, no SSR) | Crawler captured only the root URL because internal routes are JavaScript-rendered. Required pre-building a URL list from the React Router config and loading it as a TXT sitemap directly. |
For most sites under 5,000 pages the tool runs in under five minutes on a normal PC. Above 50,000 URLs, RAM becomes the bottleneck. For very large sites consider Screaming Frog with its database storage mode or a custom sitemap built from your CMS.
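The news-site row above recommends excluding pagination parameters before export. If the tool's filters do not cover a case, the same cleanup is easy to do on an exported URL list. A minimal sketch that strips query parameters such as page and sort (the parameter names and the crawled_urls.txt file name are examples, adjust for your own site) and de-duplicates the result:

```python
# Sketch: strip noisy query parameters (pagination, sorting, faceted filters)
# from an exported URL list and keep one canonical entry per page.
# Parameter names and file names below are examples, not fixed conventions.
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

NOISY_PARAMS = {"page", "sort", "filter", "utm_source", "utm_medium", "utm_campaign"}

def clean(url: str) -> str:
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k not in NOISY_PARAMS]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))

with open("crawled_urls.txt") as f:
    urls = [line.strip() for line in f if line.strip()]

canonical = sorted(set(clean(u) for u in urls))

with open("sitemap_clean.txt", "w") as out:
    out.write("\n".join(canonical) + "\n")

print(f"{len(urls)} crawled URLs reduced to {len(canonical)} canonical entries")
```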
Things to Keep in Mind
| Point | Why it matters |
|---|---|
| Only include canonical URLs | Duplicate URLs (with and without trailing slash, with and without www, http and https versions) confuse Google. Pick one canonical form for each page and include only that one in the sitemap. Use Google's canonical URL guidance if unsure. |
| Strip noindex and 404 URLs before submitting | A sitemap entry tells Google "this page matters". Sending noindexed or 404 URLs wastes your crawl budget and earns coverage warnings in Search Console. The crawler flags these automatically; filter them out before export. |
| Skip the priority tag | Google ignores priority in sitemaps. Setting everything to 1.0 or fiddling with 0.8 vs 0.6 changes nothing. Omit the tag entirely or leave it at the default. |
| Real lastmod dates only | Google uses lastmod as a strong crawl hint. If you fake it (for example setting every page to today's date), Google learns to ignore it for your site. Use real modification timestamps or omit the tag. |
| Split large sitemaps with an index file | The 50,000 URL and 50 MB limits are per sitemap file. If your site exceeds either, split into multiple sitemaps and create a sitemap index file that points to them (a minimal sketch follows after this table). Submit only the index file in Search Console. |
| Keep the sitemap URL in robots.txt | Add a Sitemap: https://yoursite.com/sitemap.xml line to robots.txt. This lets every search engine (not just Google) discover the sitemap without you submitting manually to each. |
| Regenerate after site changes | Sitemaps are point-in-time snapshots. After a major content update, site migration, URL restructure or bulk delete, regenerate and resubmit. Sites with daily content changes often automate this via a cron job or CMS plugin. |
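The splitting mechanics are simple: chunk the URL list into files of at most 50,000 entries, then write an index file that points at each chunk. A minimal sketch, under the assumption that your URLs sit in a plain text file (all_urls.txt is a placeholder name) and that the generated files will be uploaded to the site root:

```python
# Sketch: split a large URL list into <=50,000-entry sitemap files
# plus a sitemap index that references them. File names are illustrative.
import xml.etree.ElementTree as ET

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
BASE = "https://example.com/"
LIMIT = 50_000

with open("all_urls.txt") as f:
    urls = [line.strip() for line in f if line.strip()]

chunks = [urls[i:i + LIMIT] for i in range(0, len(urls), LIMIT)]

index = ET.Element("sitemapindex", xmlns=NS)
for n, chunk in enumerate(chunks, start=1):
    urlset = ET.Element("urlset", xmlns=NS)
    for u in chunk:
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = u
    name = f"sitemap-{n}.xml"
    ET.ElementTree(urlset).write(name, encoding="utf-8", xml_declaration=True)
    ET.SubElement(ET.SubElement(index, "sitemap"), "loc").text = BASE + name

ET.ElementTree(index).write("sitemap_index.xml", encoding="utf-8", xml_declaration=True)
print(f"{len(urls)} URLs -> {len(chunks)} sitemap files + sitemap_index.xml")
```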
💡 Pro tips
- Before crawling a large site, check the robots.txt file yourself first. Disallowed sections will appear as errors in the crawl report and waste time. Adjust your crawl filters to skip them upfront.
- For sites with faceted search or URL parameters (e-commerce filters, calendar pages, search results), exclude parameter patterns in Settings. Including them can explode the URL count from hundreds into millions.
- After generating the sitemap, validate it with an online XML sitemap validator or Google Search Console's Sitemap report. Catch malformed XML before Google does.
- Submit sitemaps individually in Search Console (not just a single index). The per-sitemap coverage report is more useful when each file is registered separately.
- Save the crawl result via the History tab before closing the tool. Later crawls can be compared against it to spot new pages, removed pages or status-code changes, which is useful during site migrations.
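That last tip is easy to automate once you have two exported URL lists. A small set comparison (the file names below are placeholders) shows exactly what appeared, disappeared or survived between crawls:

```python
# Sketch: compare two exported crawl URL lists to spot new and removed pages.
# "crawl_before.txt" and "crawl_after.txt" are placeholder file names.
def load(path: str) -> set[str]:
    with open(path) as f:
        return {line.strip() for line in f if line.strip()}

before = load("crawl_before.txt")
after = load("crawl_after.txt")

print(f"New pages:     {len(after - before)}")
print(f"Removed pages: {len(before - after)}")
print(f"Unchanged:     {len(before & after)}")

for url in sorted(after - before):
    print("  +", url)
for url in sorted(before - after):
    print("  -", url)
```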
Frequently Asked Questions
Do I really need a sitemap if my site is small?
Sites under 50 pages with clean internal linking can survive without a sitemap, but submitting one still helps Google discover pages faster and report on their indexing status in Search Console. There is no downside. A few minutes to generate and submit a sitemap saves days of waiting for Googlebot to find new pages organically.
How often should I regenerate my sitemap?
For sites that rarely change, once a month is enough. For active blogs, weekly. For news sites and large e-commerce catalogues, daily. WordPress plugins like Yoast or Rank Math update the sitemap automatically on every post change. With a desktop tool like 4n6 Sitemap Generator, set a calendar reminder or schedule it via Windows Task Scheduler.
XML, HTML or TXT sitemap, which one do I need?
XML is for search engines and is what you submit in Search Console. HTML is for human visitors and goes on a page like /sitemap.html. TXT is a plain list of URLs, one per line, useful for bulk imports or piping into other tools. Most sites benefit from both XML (for SEO) and HTML (for accessibility). Generate both; they take no extra time.
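If you already have the XML file, the TXT version is trivial to derive: it is just the <loc> values, one per line. A minimal conversion sketch, assuming a local sitemap.xml:

```python
# Sketch: derive a TXT sitemap (one URL per line) from an existing XML sitemap.
import xml.etree.ElementTree as ET

ns = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
root = ET.parse("sitemap.xml").getroot()

urls = [el.text.strip() for el in root.findall(".//sm:loc", ns) if el.text]

with open("sitemap.txt", "w") as out:
    out.write("\n".join(urls) + "\n")

print(f"Extracted {len(urls)} URLs to sitemap.txt")
```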
Where should I upload the sitemap file?
Upload to your website root, usually /sitemap.xml. The location matters because sitemaps only affect URLs at or below their own path. A sitemap at /blog/sitemap.xml can only reference URLs under /blog/, whereas one at /sitemap.xml covers the whole site. Add the sitemap URL to robots.txt so non-Google crawlers discover it too.
Does my site data get sent anywhere when I crawl?
No. 4n6 Sitemap Generator is a Windows desktop tool. All crawling and analysis happens on your own PC. The only network traffic is the tool fetching your website's URLs directly from your server, exactly like a browser would. No data is sent to the vendor or any third party. This matters if you are crawling sites with sensitive internal content.
Will the crawler get blocked or blacklisted by my host?
Unlikely on your own sites because you are crawling from your own IP at normal speeds. If a host's bot-protection (Cloudflare, AWS WAF) blocks the default user-agent, change it in Settings to a standard browser string. Never crawl sites you do not own or have permission to crawl. Respect robots.txt and set a reasonable crawl delay to avoid overloading the origin server.