Hide old docs from search engines via canonical link #24
JazzTap wants to merge 3 commits into matplotlib:master
There are a ton of extraneous changes here; is there a way to get it to only do the two things you mentioned? It's not just whitespace changes that are added.
That's lxml snapping all the docs to its grammar, but I didn't spot anything else beyond property re-ordering. Are there semantic changes? As I understand it, all the HTML is (was) machine-generated from source in the first place. Instead of parsing, though, one could carefully regex for the lines of form
There appears to be a way to get git to not add whitespace-only changes (https://stackoverflow.com/questions/3515597/add-only-non-whitespace-changes). We should hold off on worrying about the whitespace for now; @JazzTap and I are at the scipy sprints and agreed in person to focus on using the
See #39, which only changes a single line per file.
I'll close this in favor of #49, which does almost the same thing. Thanks a lot for the work on this though - it was very helpful.
Project initiated with @JLegs to point search engines (and users, gently) at the current docs. Dumb approach used: delete the version string from the URL (and use an absolute link to matplotlib.org to avoid baseurl shenanigans).
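The "delete the version string" step could look roughly like the sketch below. It assumes old docs sit under a leading version directory such as `2.0.0/`; the function name `canonical_url`, the `VERSION_PREFIX` pattern, and the example paths are illustrative and not taken from this PR.

```python
import re

# Assumption: a versioned page lives under a leading directory like "2.0.0/"
# or "v1.5.1rc1/". The real directory layout may differ.
VERSION_PREFIX = re.compile(r"^v?\d+(?:\.\d+)+(?:rc\d+)?/")


def canonical_url(relative_path, site="https://matplotlib.org/"):
    """Return an absolute, unversioned URL for a page in a versioned docs tree."""
    # Drop a leading slash so the version prefix is anchored at the start.
    path = relative_path.lstrip("/")
    # Remove at most one leading version directory; other paths pass through.
    unversioned = VERSION_PREFIX.sub("", path, count=1)
    # Absolute link, so the result does not depend on any base URL setting.
    return site + unversioned


print(canonical_url("2.0.0/api/pyplot_api.html"))
```

Note that this maps every old page to the same path in the current docs, whether or not that page still exists there, which is the known weakness of the approach.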
All HTML is parsed through lxml by the 'tools/docs_deprecator' notebook or script. The only changes expected, besides whitespace and property ordering, are 1) a <link> at the bottom of <head> and 2) a <div> at the top of <body>. (The bot-forwarder and human-forwarder respectively.)
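For comparison, here is a minimal sketch of the same two insertions done with plain string substitution rather than lxml, in the spirit of the regex suggestion in the conversation. This is not the code in 'tools/docs_deprecator'; the function name `add_forwarders`, the `olddocs-message` id, and the banner wording are all hypothetical.

```python
import re


def add_forwarders(html, canonical):
    """Insert a canonical <link> before </head> and a banner <div> after <body>."""
    # Skip pages that already carry a canonical link, so reruns change nothing.
    if 'rel="canonical"' in html:
        return html

    # Bot-forwarder: tells search engines which URL to index instead.
    link = '<link rel="canonical" href="%s">\n' % canonical
    # Human-forwarder: hypothetical id and wording, for illustration only.
    banner = (
        '\n<div id="olddocs-message">You are reading an old version of the '
        'documentation. For the latest version see <a href="%s">%s</a></div>'
        % (canonical, canonical)
    )

    # Function replacements avoid backslash handling in the replacement text.
    html = re.sub(r"</head>", lambda m: link + m.group(0), html,
                  count=1, flags=re.IGNORECASE)
    html = re.sub(r"<body[^>]*>", lambda m: m.group(0) + banner, html,
                  count=1, flags=re.IGNORECASE)
    return html
```

Because the rest of the file is left byte-for-byte untouched, the resulting diff stays small, at the cost of relying on every page having recognizable `</head>` and `<body>` tags.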
Corresponding comment in issue tracker: matplotlib/matplotlib#10016 (comment)
Note that in an ideal world we'd forward pages using a database of pages and their descendants, replacements, or whatever else applies. Computing that mapping automatically is compute-heavy, as discussed above.