X Tutup
Skip to content

Apply IWYU to {test/unit,examples,src}/*.{cpp,hpp}.#185

Closed
bcsgh wants to merge 3 commits intoetr:masterfrom
bcsgh:bcsgh_IWYU
Closed

Apply IWYU to {test/unit,examples,src}/*.{cpp,hpp}.#185
bcsgh wants to merge 3 commits intoetr:masterfrom
bcsgh:bcsgh_IWYU

Conversation

@bcsgh
Copy link
Copy Markdown
Contributor

@bcsgh bcsgh commented Mar 8, 2020

Identify the Bug

#177 (General code style issues)

Description of the Change

Apply IWYU to src/*.{cpp,hpp}.
https://include-what-you-use.org/

This required some hand edits for networking/windows stuff as well as adding std:: in a few places.

The actual command used was:

.../include-what-you-use\
    -DHTTPSERVER_COMPILATION \
    -Xiwyu --mapping_file=custom.imp \
    -Xiwyu --no_fwd_decls \
    -I${AS_NEEDED} ... \
    ${FILE}

At least on my setup, the default mapping doesn't cover everything and I ended up needing to pass --mapping_file=custom.imp with custom.imp being the following:

[
  { include: ["\"microhttpd.h\"", "private", "<microhttpd.h>", "public"] },
  { include: ["<ext/alloc_traits.h>", "private", "<memory>", "public"] },
  
  { symbol: ["std::exception",     "private", "<exception>", "public"]},
  { symbol: ["std::shared_ptr",    "private", "<memory>",    "public"]},
  { symbol: ["std::uint16_t",      "private", "<cstdint>",   "public"]},
  { symbol: ["std::uint64_t",      "private", "<cstdint>",   "public"]},
  { symbol: ["std::istringstream", "private", "<sstream>",   "public"]},
  { symbol: ["std::stringstream",  "private", "<sstream>",   "public"]},
  { symbol: ["std::ifstream",      "private", "<fstream>",   "public"]},

  { symbol: ["uint16_t", "private", "<stdint.h>", "public"]},
  { symbol: ["uint64_t", "private", "<stdint.h>", "public"]},

  { symbol: ["MHD_Connection", "private", "<microhttpd.h>", "public"]},
]

Alternate Designs

Do nothings.

Possible Drawbacks

It's very possible this will break a consuming build, however any break should be as a result of something depending on a transitive include and thus be fixable by adding some needed header (probably in the consuming code) which should be done anyway.

Also, the windows+networking includes the are in #ifdef's are just my best guess as to what should be included.

Verification Process

./bootstrap
mkdir build ; cs build
../configure
make

Release Notes

Apply IWYU to src/*.{cpp,hpp}.

@codecov-io
Copy link
Copy Markdown

codecov-io commented Mar 8, 2020

Codecov Report

Merging #185 into master will not change coverage.
The diff coverage is 100%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master     #185   +/-   ##
=======================================
  Coverage   95.19%   95.19%           
=======================================
  Files          35       35           
  Lines        3140     3140           
=======================================
  Hits         2989     2989           
  Misses        151      151
Impacted Files Coverage Δ
src/httpserver/http_resource.hpp 33.33% <ø> (ø) ⬆️
test/unit/string_utilities_test.cpp 100% <ø> (ø) ⬆️
src/file_response.cpp 83.33% <ø> (ø) ⬆️
src/details/http_endpoint.cpp 100% <ø> (ø) ⬆️
src/http_request.cpp 96.69% <ø> (ø) ⬆️
src/webserver.cpp 90.47% <ø> (ø) ⬆️
src/httpserver/http_utils.hpp 100% <ø> (ø) ⬆️
src/httpserver/http_request.hpp 92.3% <ø> (ø) ⬆️
src/httpserver/deferred_response.hpp 100% <ø> (ø) ⬆️
src/httpserver/file_response.hpp 100% <ø> (ø) ⬆️
... and 16 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update bc700b5...93b74b1. Read the comment docs.

@bcsgh bcsgh marked this pull request as ready for review March 8, 2020 04:10
@bcsgh bcsgh changed the title Apply IWYU to src/*.{cpp,hpp}. Apply IWYU to {test/unit,examples,src}/*.{cpp,hpp}. Mar 8, 2020
Comment on lines -38 to -39
class webserver;
class http_request;
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why here we are preferring an explicit inclusion to a slimmer forward declaration.
It actually feels opposite (or orthogonal at the very least) to the spirit of the commit. While I am in agreement with the rest of the commit (as a meant to avoid implicit transitive dependencies) I disagree with a general preference for inclusion over forward-declaration (if that is what we are trying to achieve).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My experience is that forward declarations[1] are almost never a good idea.

tl;dr; a bunch of people, who I've worked with and who's judgment I trust and respect, and who's job it is to study and understand the costs and benefits of this sort of decision say that using forward declarations doesn't actually give the benefits you are seeking.

Starting with this: https://google.github.io/styleguide/cppguide.html#Forward_Declarations
the only cited pros:

  • avoid reading more headers. But in general that's doesn't happen, as very often the header ends up being included anyway in the translation unit. Also the full build time for everything here is maybe 30 seconds, given any effort to run a parallel build, that advantage vanishes.
  • Avoid extra compiles. Again, that's dubious for the same reason.

And then they list the cons, of which there are many.

If you want to cut down on includes and build times, then the biggest lever would be to allow and encourage direct use of the httpserver/*.hpp files by all clients.

That said, while I strongly think that using forward declarations provide little if any value and introduces real risks and costs, and am willing to discuss that with you in an effort to convince you[2]; this is your code. If you decide to use forward declarations then that's your decision to make and I'll respect that.

[1] that is where they are not in the header associated with the definition.
[2] there is a good snippet out of a conference talk I watched somewhere on this (but I'm not finding it) that goes over some empirical anecdata about the actual effect (or lack there of) with modern tooling.

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I agree with their cons in google styleguide. Most of their cons do not apply here (especially in the context of forward declaring within the library itself). Their pros do.

They seem to be ignoring the pollution that inclusion in C++ necessarily drags along. Also, it feels like what is happening here is that they are removing the good uses of an instrument in view of potential bad uses. In essence, I disagree with them :). I think it might make sense when managing thousands of people to make a broad draconian law, it makes more sense in this case to use the specific advantage of the tool at hand.

Taking their cons one by one:

Forward declarations can hide a dependency, allowing user code to skip necessary recompilation when headers change.

Not sure how this could be the case as we should be using them anyhow only in case of pointers/references.

A forward declaration may be broken by subsequent changes to the library. Forward declarations of functions and templates can prevent the header owners from making otherwise-compatible changes to their APIs, such as widening a parameter type, adding a template parameter with a default value, or migrating to a new namespace.

Does not apply here.

Forward declaring symbols from namespace std:: yields undefined behavior.

Does not apply here (it also feels hardly generalizable)

It can be difficult to determine whether a forward declaration or a full #include is needed.

Not sure it is a con but rather a statement at root

Forward declaring multiple symbols from a header can be more verbose than simply #includeing the header.

Maybe, but it feels like a weak counter-argument against recompilation and pollution

Structuring code to enable forward declarations (e.g. using pointer members instead of object members) can make the code slower and more complex.

Valid, but again it is not happening here.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

forward declaration may be broken by subsequent changes to the library.

Does not apply here.

That does apply here, maybe. For example if you were to need to make one of the classes into a template (with default arguments) or an alias (typedef or using) then it would be necessary to go remove or update the forward declarations. (Thought I don't know how likely that kind of change would be.)

Forward declaring multiple symbols from a header can be more verbose ...

Maybe, but it feels like a weak counter-argument against recompilation and pollution

Much of my argument is that "recompilation and pollution" is it self a weak argument: I'm tempted to see how much the transitive includes from the .cpp files change if forward declarations are aggressively used. And then how much the compile time changes. (I'm guessing not much in both cases.)

The other half of the argument is that using forward declarations introduces something, at a distance, that needs to be maintained in sync with the definition. Which is to say that it can be out of sync, and (by Murphy's law) sooner or later will.

(FWIW: there is one forward declaration that I had to leave in. It's associated with a strange cycle in the dependencies that I'm hoping to remove at some point.)

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still don't see the cons applied here and hence the whole problems is falling for me in the realm of speculation. Not sure even what we need to keep in sync here given that we are not forward declaring anything but class names, so I'd rather stay close to the practicality of the problem rather than generalize so much to land in the space of philosophy.
The real problem for me is pollution (not much compilation speed). I can't see why I would want to drag this code and every piece of code depending on it to include everything included here. Moving the inclusions in the cop file allows me to free these classes and their dependents from their weight (be it bytes or noise).
Let's leave these in places and as a general policy for the lib in general I do prefer forward declarations to includes unless there is a tangible material risk associated with doing so.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll grant it is somewhat of a hypothetical but; say for example you want to allow using multiple threading models in the same binary and chose to to that by making all the relevant classes templates. You would need to touch every relevant forward declaration. Or maybe you find a need to rename a type and to allow the old new to work (but be deprecated) you add a using old_name = new_name; or similarly for moving something to a new namespace. Again every declaration needs to change.

As for pollution: Out of curiosity I looked up the set of include from each .cpp (by invoking cpp -M ...) in the branch vs. in the current head. With a little filtering to canonicalize things, the only addition this PR makes is to add httpserver/details/modded_request.hpp to the includes of each examples/*.cpp. Everything else was already included. (That said, I haven't compared that to this branch + use of forward declarations, but my guess is it won't change much.)

for f in $(find -name '*.cpp' | sort);
do
  echo $(\
      cpp -I${libmicrohttpd_include} -Itest -Isrc \
        -DHTTPSERVER_COMPILATION  \
        -M $f | \
      sed 's| /[^ ]*||g;s/\b \b/\n /g;s/ \\$//;s/.*://' | \
      sort -u
  );
done

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On templatizing those classes: I promise I will not do that. They were, and are not anymore. If someone does it, they will likely do it once say in N years (not every day of the week). We will be able to survive.

On the pollution: That is not my point of pollution. If anything the test is relatively obvious because for every include missing in a hpp you are going to find it in the cpp file. I am perfectly happy at symbols being in the so (where they cost nothing as, at worst, is a matter of linking), they still aren't forcing them down in every "h" file. To me the point is that each piece of code should just know only the bare minimum it needs to know. Nothing else. The recompilation of everything that results in this case to me is just the symptom (not the cause) of the problem with it. It just means that "needless noise" has been just added to the hpp.
Again, I am talking specifically here of how this semantic is used in the cases (one by one) you are touching in which what we need is just the name of a struct/class and the speculations on its structure are minimal or close to none. For example, microhttpd.h was never included in "h" files by convention as I don't know (and don't want to know) what they include in their "h" file to be able to use a single struct as pointer and I don't want my clients to have to know that. They don't need it. We have often run exercises to convert includes in FDs with this purpose and I intend to keep doing so in the future.

I am not sure this debate is useful at this point though. I'd rather capitalize on the rest of the change that I believe it is useful than staying stuck on something I cannot be brought to care about.
I think we should just move on and leave the forward declarations as is, unless, again we have a problem with them that is real, happening right now and currently acting against us in the library (and that also would have to be specific for each use of them, because I am still unconvinced by the generality of the tool being bad).

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For a not so funny and practical example of what I am talking about and referring as "pollution", see what is happening here #187 where a simple movement of includes up and down is breaking a build on windows (still to root-cause why).
Especially in light of how much garbage many libraries tend to include in pre-processor, macros, etc... I'd rather not have them where it more visibly impacts everything else and our customers.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding including microhttpd.h from headers, if anything, that's the case I'd most want to avoid FD's as this project doesn't control that one (or are you also an author of it?) and adding FD's has your code making assumptions about it's implementation details. (What if one of those types were to be converted to a typedef or even a macro #define?)


Regarding the ordering issue you are chasing down, I come to the opposite conclusion: the goal state should be that every header used in the project should be 100% self contained, that is the order they are instantiated in shouldn't ever matter. The easy test for that is to make sure that for every header there is some .cpp file with it as the first header.

In my experience, the cases were that can't be done are rather limited. E.g.:

  • Some external header refuses to cooperate (like the windows stuff, in which case I'd wrap it in my own header that hides the issue away).
  • "textual headers" (I've seen ones that are specifically intended to be included multiple times in the same translation unit, and not necessarily from global scope, which I consider a bad idea.)

Other than those, I don't remember ever seeing any cases where "solving that problem" didn't improve the code in general.


Regardless which way you decide to approve, I'm going to sit on this until you have the windows build sorted out and I can place this work on top of that.

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Playing with it right now (windows I mean). I got it to build by fixing the define they need for the windows version. I noticed that they have improved the support for msys2 so trying to get it to work in a cleaner way (possibly running tests this time around). I'll ping you once done.

Comment on lines 33 to +35
#include "httpserver/http_request.hpp"
#include "httpserver/http_resource.hpp"
#include "httpserver/http_response.hpp"
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wonder if a forward declaration here for these 3 would be a better strategy


#include "httpserver/http_utils.hpp"

struct MHD_Connection;
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to the comment above. I am not sure that I prefer a full-fledged include over a forward-declaration

{

class webserver;
class http_request;
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment on forward-declaration

Comment on lines -37 to -38
struct MHD_Connection;
struct MHD_Response;
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment on forward declaration

Comment on lines -51 to -62
class http_resource;
class create_webserver;

namespace http {
struct ip_representation;
struct httpserver_ska;
};

namespace details {
struct modded_request;
}

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same on forward declarations

@etr
Copy link
Copy Markdown
Owner

etr commented Mar 8, 2020

It also might be worth adding this to travis and have a dedicated build with it. Otherwise I can see it regressing back very quickly to where it was

@bcsgh
Copy link
Copy Markdown
Contributor Author

bcsgh commented Mar 8, 2020

I don't know enough about Travis to know how to make a check out of these. Heck, I don't even know if IWYU has a mode to support that.

@etr
Copy link
Copy Markdown
Owner

etr commented Mar 8, 2020

Don't worry too much about travis. You are being doing a lot anyhow and I don't want you to do anything that is not directly part of your interest. I'll open an issue after this goes through to add it to travis myself.

@etr
Copy link
Copy Markdown
Owner

etr commented Mar 9, 2020

Windows builds have been fixed. I see that there are conflicts on this PR but hopefully nothing too complex to fix.
As discussed, let's keep the forward declarations in place where they are so that I can approve and merge. The rest looks fine.

@bcsgh
Copy link
Copy Markdown
Contributor Author

bcsgh commented Mar 10, 2020

Note: I'm pushing some unfinished stuff to get the CI results. No need to review anything yet, I already know it's not done.

@etr etr mentioned this pull request Aug 27, 2020
@etr
Copy link
Copy Markdown
Owner

etr commented Aug 27, 2020

Closing this in favour of: #208

@etr etr closed this Aug 27, 2020
@bcsgh bcsgh deleted the bcsgh_IWYU branch February 15, 2021 02:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants

X Tutup