X Tutup
Skip to content

DOC: Use a consistent random seed in examples.#7382

Closed
QuLogic wants to merge 1 commit intomatplotlib:v2.xfrom
QuLogic:doc-example-derandomize
Closed

DOC: Use a consistent random seed in examples.#7382
QuLogic wants to merge 1 commit intomatplotlib:v2.xfrom
QuLogic:doc-example-derandomize

Conversation

@QuLogic
Copy link
Copy Markdown
Member

@QuLogic QuLogic commented Nov 2, 2016

This ensures that repeated builds of the documentation produce the same images, which improves consistency.

There are about 100 examples that use randomness without setting a seed, so this seems the more prudent (and lazier) solution.

This ensures that repeated builds of the documentation produce the same
images, which improves consistency.
@QuLogic QuLogic added this to the 2.0.1 (next bug fix release) milestone Nov 2, 2016
# Initialize random number generator for consistent plots.
import random
random.seed(20021210)
del random
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the point of importing random, then deleting random, then using np.random?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know for certain whether any of our examples use random, but I don't think np.random shares seed with it. I delete it because our documentation already mysteriously depends on np and plt existing and I don't want anyone else to start depending on random, too.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That doesn't do anything. It fixes the seed for the stdlib random generator, but not numpy's.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, which is why there's a line just underneath setting numpy's seed.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so once #7385 is merged, you can get rid of these three lines.

@NelleV
Copy link
Copy Markdown
Member

NelleV commented Nov 2, 2016

If we want to have to have reproducible examples, fixing the random seed should be done in the examples, not in our documentation building system.

(That is also unlikely to work with sphinx-gallery).

@dopplershift
Copy link
Copy Markdown
Contributor

I'm not sure I agree about putting it in the examples themselves. IMO, the seed isn't for the benefit of those using them, so much as our ability to use them to smoke out changes. You'd be adding more lines of boilerplate for no educational benefit.

@QuLogic
Copy link
Copy Markdown
Member Author

QuLogic commented Nov 2, 2016

I'm not aiming for reproducible examples, but for reproducible documentation. It means if you change some small documentation thing, only that thing will change.

In a way, I think it's more "fun" (or let's say, interesting) to have the examples be truly random, and it detracts from them by setting a seed.

@NelleV NelleV modified the milestones: 2.1 (next point release), 2.0.1 (next bug fix release) Nov 2, 2016
@story645
Copy link
Copy Markdown
Member

story645 commented Nov 2, 2016

Couldn't a quick git grep tell us if the examples use random? And then change those over to np.random just to keep this cleaner.

@ivanov
Copy link
Copy Markdown
Member

ivanov commented Nov 2, 2016

There's only one example importing standard library's random and it's not using it (I just put up a PR removing it in #7385).

@ivanov
Copy link
Copy Markdown
Member

ivanov commented Nov 3, 2016

I like the idea of reproducible plots in our docs.

I'm not sure what the significance of 20021210 so if it's all the same, and we don't want to just used 0 as the seed, I suggest 19680801 - John Hunter's birthday.

@NelleV mentioned that this won't affect sphinx-gallery when we switch to that, but perhaps it'd be worth adding the plot_pre_code feature of plot_directive we're utilizing here. I added an issue for this in sphinx-gallery: sphinx-gallery/sphinx-gallery#158

@ivanov
Copy link
Copy Markdown
Member

ivanov commented Nov 3, 2016

On the other hand - there are only 17 files that use np.random, and of those, most are rst docs, that already set the seed, leaving only 7 .py files:

$ git grep  np\.random | cut -f 1 -d\: | uniq |grep \.py 
pyplots/align_ylabels.py
pyplots/boxplot_demo.py
pyplots/compound_path_demo.py
pyplots/dollar_ticks.py
pyplots/fig_axes_labels_simple.py
pyplots/pyplot_scales.py
pyplots/pyplot_text.py

of those, pyplots/boxplot_demo.py already sets a seed, so the alternative to writing the seed into the examples isn't that onerous either.

@story645
Copy link
Copy Markdown
Member

story645 commented Nov 3, 2016

@ivanov then converting those docs to using set seeds sounds like a great little project for new contributors or @NelleV's machine shop if they still need tasks. Unless you just want to throw that into #7385...

@QuLogic
Copy link
Copy Markdown
Member Author

QuLogic commented Nov 3, 2016

Why are you only looking at pyplots? All of the examples are included in the docs.

@QuLogic
Copy link
Copy Markdown
Member Author

QuLogic commented Nov 3, 2016

I could have sworn it was >100, but I must have forgotten to exclude the seeded ones. There are 80 examples without seeds:

$ grep -L 'RandomState' $(grep -L seed $(grep -IRl 'np.random' mpl_examples/)) | wc -l
80

plus the 6 in pyplots that @ivanov noted above and I forgot to count.

@QuLogic
Copy link
Copy Markdown
Member Author

QuLogic commented Nov 3, 2016

Oh, and 20021210 was the first date I could pull out of the changelog, but I have no real tie to it.

@ivanov
Copy link
Copy Markdown
Member

ivanov commented Nov 3, 2016

Why are you only looking at pyplots? All of the examples are included in the docs.

I used git grep - which doesn't follow symlinks. I did not realize that mpl_examples was symlinked to ../examples and therefore none of the ~100 examples that use np.random in there showed up in my original search. So I retract my previous suggestion of just adding a seed to all of those.

@story645
Copy link
Copy Markdown
Member

story645 commented Nov 3, 2016

I still think this could be a good machine-shop or new contributor task as part of the general docs overhaul...but that probably shouldn't be a blocker for this/this code can be taken out if that ever happens.

@NelleV
Copy link
Copy Markdown
Member

NelleV commented Nov 3, 2016

I can volunteer to make the examples reproducible.

@QuLogic QuLogic modified the milestones: unassigned, 2.1 (next point release) Nov 3, 2016
@QuLogic
Copy link
Copy Markdown
Member Author

QuLogic commented Nov 3, 2016

Closing for #7386.

@QuLogic QuLogic closed this Nov 3, 2016
@QuLogic QuLogic deleted the doc-example-derandomize branch November 3, 2016 04:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants

X Tutup