The New York Times sitemap is legendary in SEO. The NYT uses automated logic to ensure that every article posted is included in its HTML sitemap and neatly grouped by publication date.
Here's how the hierarchy of the sitemap works:
Page One: Year The Article Was Published
Page Two: Month The Article Was Published
Page Three: Day The Article Was Published
Page Four: List Of All Articles That Day
Even an article written in the 1850s is still only 5 "steps" away from the home page. This ensures that a search engine crawler always has a path to the content and that no article is ever orphaned or pushed deep down the site architecture.
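The year, month, and day grouping described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the NYT's actual implementation: the function names, the sample articles, and the `/sitemap/YYYY/MM/DD/` URL pattern are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import date


def build_sitemap_tree(articles):
    """Group (published_date, title, url) tuples into year -> month -> day -> articles."""
    tree = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
    for published, title, url in articles:
        tree[published.year][published.month][published.day].append((title, url))
    return tree


def day_page_paths(tree):
    """Return the path of every day-level sitemap page, in date order.

    The "/sitemap/YYYY/MM/DD/" pattern is a hypothetical URL scheme.
    """
    paths = []
    for year in sorted(tree):
        for month in sorted(tree[year]):
            for day in sorted(tree[year][month]):
                paths.append(f"/sitemap/{year}/{month:02d}/{day:02d}/")
    return paths


# Hypothetical sample data: one very old article and two recent ones.
articles = [
    (date(2024, 1, 5), "Example story B", "/2024/01/05/example-b.html"),
    (date(1851, 9, 18), "Example story A", "/1851/09/18/example-a.html"),
    (date(2024, 1, 5), "Example story C", "/2024/01/05/example-c.html"),
]

tree = build_sitemap_tree(articles)
for path in day_page_paths(tree):
    print(path)
```

Because every article is filed under exactly one year, month, and day, the click depth stays fixed no matter how large the archive grows. Adding a new article only appends to one day page; nothing older gets pushed deeper.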
When you have sites this big, figuring out how to manage your content library at scale is crucial. The New York Times' brilliant solution ensures that search engines will always be able to discover its content.
This post was originally shared by Chris Long on LinkedIn.