Submitted URL:
http://herbsutter.com/
Effective URL:
https://herbsutter.com/
Submission: On May 30 via api (May 30th 2022, 3:17:17 pm UTC) from GB — Scanned from GB


SUTTER’S MILL

Herb Sutter on software development


MY CPPCON 2021 TALK VIDEO IS ONLINE

Whew — I’m now back from CppCon, after remembering how to travel.

My talk video is now online. If you haven’t already seen this via JetBrains’
CppCon 2021 video page or the Reddit post, here’s a link:




Please direct technical comments to the Reddit thread and I’ll watch for them
there and respond to as many comments as I can. Thanks!

Thanks again to everyone who attended in person for supporting our requirements
for meeting together safely. Interestingly, this was the largest CppCon ever
(and the largest C++-specific conference ever as far as I know) in terms of
total attendance, though most were attending online. It was good to see and
e-see you all! With any luck, by CppCon 2022 our lives will be much closer to
normal everywhere in the world… here’s hoping. Thanks again, and stay safe.

Herb Sutter Uncategorized 2021-10-31 1 Minute


TRIP REPORT: SUMMER 2021 ISO C++ STANDARDS MEETING (VIRTUAL)

On Monday, the ISO C++ committee held its third full-committee (plenary) meeting
of the pandemic and adopted a few more features and improvements for draft
C++23.

We had representatives from 17 voting nations at this meeting: Austria,
Bulgaria, Canada, Czech Republic, Finland, France, Germany, Israel, Italy,
Netherlands, Poland, Russia, Slovakia, Spain, Switzerland, United Kingdom, and
United States. Slovakia is our newest national body to officially join
international C++ work. Welcome!

We continue to have the same priorities and the same schedule we originally
adopted for C++23, but online via Zoom during the pandemic.


THIS WEEK: A FEW MORE C++23 FEATURES ADOPTED

This week we formally adopted a third round of small features for C++23, as well
as a number of bug fixes. Below, I’ll list some of the more user-noticeable
changes and credit all those paper authors, but note that this is far from an
exhaustive list of important contributors… even for these papers, nothing gets
done without help from a lot of people and unsung heroes, so thank you first to
all of the people not named here who helped the authors move their proposals
forward! And thank you to everyone who worked on the adopted issue resolutions
and smaller papers I didn’t include in this list.

P1938 by Barry Revzin, Richard Smith, Andrew Sutton, and Daveed Vandevoorde
adds the if consteval feature to C++23. If you know about C++17 if constexpr and
C++20 std::is_constant_evaluated, then you might think we already have this
feature under the spelling if constexpr (std::is_constant_evaluated())… and
that’s one of the reasons to add this feature, because that code actually
doesn’t do what one might think. See the paper for details, and why we really
want if consteval in the language.

P1401 by Andrzej Krzemieński enables testing integers as booleans in static_assert
and if constexpr without having to cast the result to bool first (or test
against zero). This is a small-but-nice example of removing redundant ceremony
to help make C++ code that much cleaner and more readable.

P1132 by Jean-Heyd Meneide, Todor Buyukliev, and Isabella Muerte adds out_ptr and
inout_ptr abstractions to help with potential pointer ownership transfer when
passing a smart pointer to a function that is declared with a T** “out”
parameter. In a nutshell, if you’ve ever wanted to call a C API by writing
something like some_c_function( &my_unique_ptr ); then these types will likely
help you. The idea is that a call site can use one of these types to wrap a
smart pointer argument, and then when the helper type is destroyed it
automatically updates the pointer it wraps (using a reset call or semantically
equivalent behavior).
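
To make the mechanism concrete, here is a minimal C++17 sketch of the idea. It is an illustration only and not the std::out_ptr specified by P1132: the names out_ptr_sketch, make_answer, and call_with_wrapper are invented for this example, and the sketch handles only std::unique_ptr with the default deleter.

```cpp
#include <cassert>
#include <memory>

// Hypothetical C-style API with a T** "out" parameter; it hands back
// ownership of a newly allocated int.
inline int make_answer(int** out) {
    *out = new int(42);
    return 0;  // 0 means success
}

// Simplified illustration only; not the std::out_ptr specified by P1132.
template <typename T>
class out_ptr_sketch {
public:
    explicit out_ptr_sketch(std::unique_ptr<T>& owner) : owner_(owner) {
        owner_.reset();  // release whatever the smart pointer held before
    }
    out_ptr_sketch(const out_ptr_sketch&) = delete;
    out_ptr_sketch& operator=(const out_ptr_sketch&) = delete;

    // When the temporary wrapper dies at the end of the call expression,
    // hand the raw pointer written by the callee to the smart pointer.
    ~out_ptr_sketch() {
        if (raw_) owner_.reset(raw_);
    }

    // Lets the wrapper be passed where the C API expects a T**.
    operator T**() { return &raw_; }

private:
    std::unique_ptr<T>& owner_;
    T* raw_ = nullptr;
};

inline int call_with_wrapper() {
    std::unique_ptr<int> p;
    make_answer(out_ptr_sketch<int>(p));  // instead of passing &p
    return p ? *p : -1;
}
```

The wrapper is a temporary, so its destructor runs after the C function returns and before the next statement, which is why p already owns the new object when call_with_wrapper reads it.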

P1659 by Christopher DiBella generalizes the C++20 starts_with and ends_with on
string and string_view by adding the general forms ranges::starts_with and
ranges::ends_with to C++23. These can work on arbitrary ranges, and also answer
questions such as “are the starting elements of r1 less than the elements of
r2?” and “are the final elements of r1 greater than the elements of r2?”.
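
As a rough illustration of what the general forms compute, here is a minimal C++17 sketch built on std::mismatch. The name starts_with_sketch is made up for this example and is not the C++23 library interface; the real ranges::starts_with also supports projections and sentinels.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <iterator>
#include <vector>

// Hypothetical helper (not the standard API): true if every element of
// `prefix` is matched, under `pred`, by the element of `r` at the same position.
template <typename R1, typename R2, typename Pred = std::equal_to<>>
bool starts_with_sketch(const R1& r, const R2& prefix, Pred pred = {}) {
    auto result = std::mismatch(std::begin(r), std::end(r),
                                std::begin(prefix), std::end(prefix), pred);
    // If we ran off the end of `prefix`, all of its elements matched.
    return result.second == std::end(prefix);
}

inline bool starts_with_demo() {
    const std::vector<int> r1{1, 2, 3, 4};
    const std::vector<int> r2{1, 2};
    const std::vector<int> r3{5, 6, 7};
    return starts_with_sketch(r1, r2)                 // plain prefix test
        && starts_with_sketch(r1, r3, std::less<>{})  // 1<5, 2<6, 3<7
        && !starts_with_sketch(r2, r1);               // r1 is longer than r2
}
```

Passing std::less<> as the predicate is what turns the prefix test into the "are the starting elements of r1 less than the elements of r2?" question.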

P2166 by Yuriy Chernyshov helps reduce a commonly-taught pitfall with
std::string. You know how since forever (C++98) you can construct a string from
a string literal, like std::string("xyzzy")? But that you’d better watch out
(and you’d better not cry or pout) not to pass a null pointer, like
std::string(nullptr), because that’s undefined behavior where implementations
aren’t required to check the pointer for null and can do just whatever they
like, including crash? That’s still the case if you pass a pointer variable
whose value is null (sorry!), but with this paper, as of C++23 at least now we
have overloads that reject attempts to construct or assign a std::string from
nullptr specifically, as a compile-time “d’oh! don’t do that.”
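
The overload technique itself is easy to try out in C++17 on a type of your own. The following sketch uses a hypothetical class named text (it is not std::string and not the wording from the paper) to show how a deleted std::nullptr_t overload rejects the literal while leaving ordinary pointers alone.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

// Hypothetical string-like wrapper, used only to show the overload technique.
class text {
public:
    text(const char* s) : value_(s) {}  // still undefined behavior if s is a null pointer variable
    text(std::nullptr_t) = delete;      // but the literal nullptr is rejected at compile time
    const std::string& str() const { return value_; }
private:
    std::string value_;
};

// The type traits observe the same overload resolution a call site would.
constexpr bool accepts_literal = std::is_constructible<text, const char*>::value;
constexpr bool accepts_nullptr = std::is_constructible<text, std::nullptr_t>::value;
static_assert(accepts_literal, "string literals still work");
static_assert(!accepts_nullptr, "text(nullptr) does not compile");
```

The deleted overload wins overload resolution for nullptr because it is an exact match, and selecting a deleted function is an error, so the mistake is caught before the program ever runs.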

We also adopted a number of other issue resolutions and small papers that made
additional improvements, including a number that will be backported
retroactively to C++20. Quite a few were of the “oh, you didn’t know that rare
case didn’t work? now it does” variety.


OTHER PROGRESS

We also approved work on a second Concurrency TS. Recall that a “TS” or
“Technical Specification” is like doing work in a feature branch, which can
later be merged into the C++ standards (trunk).

Two related pieces of work were approved to go into the Concurrency TS: P1121
and P1122 by Paul McKenney, Maged M. Michael, Michael Wong, Geoffrey Romer,
Andrew Hunter, Arthur O’Dwyer, Daisy Hollman, JF Bastien, Hans Boehm, David
Goldblatt, Frank Birbacher, Erik Rigtorp, Tomasz Kamiński, and Jens Maurer add
support for hazard pointers and read-copy-update (RCU) which are useful in
highly concurrent applications.


WHAT’S NEXT

We’re going to keep meeting virtually in subgroups, and then have at least one
more virtual plenary session to adopt features into the C++23 working draft in
October.

The next tentatively planned ISO C++ face-to-face meeting is February 2022 in
Portland, OR, USA. (Per our C++23 schedule, this is the “feature freeze”
deadline for design-approving new features targeting the C++23 standard, whether
the meeting is physical or virtual.) Meeting in person next February continues
to look promising – barring unexpected surprises, it’s possible that by that
time most ISO C++ participating nations will have been able to resume local
sports/theatre/concert events with normal audiences, and removed travel
restrictions among each other, so that people from most nations will be able to
participate at an in-person meeting. But we still have to wait and see… we
likely won’t know for sure until well into the autumn, and so we’re still
calling this one “tentative” for now. You can find a list of our meeting plans
on the Upcoming Meetings page.

Thank you again to the hundreds of people who are working tirelessly on C++,
even in our current altered world. Your flexibility and willingness to adjust
are much appreciated by all of us in the committee and by all the C++
communities! Thank you, and see you on Zoom.

Herb Sutter Uncategorized 2021-06-09 4 Minutes


GOTW #102 SOLUTION: ASSERTIONS AND “UB” (DIFFICULTY: 7/10)

This special Guru of the Week series focuses on contracts. Now that we have
considered assertions, postconditions, and preconditions in GotWs #97-101, let’s
pause and reflect: To what extent does a failed contract imply “UB”… either the
Hidden Dragon of Undefined Behavior, or the Crouching Tiger of Unspecified
Behavior?


1. BRIEFLY, WHAT IS THE DIFFERENCE AMONG:


(A) UNDEFINED BEHAVIOR

Undefined behavior is what happens when your program tries to do something whose
meaning is not defined at all in the C++ standard language or library (illegal
code and/or data). A compiler is allowed to generate an executable that does
anything at all, from data corruption (objects not meeting the requirements of
their types) to injecting new code to reformat your hard drive if the program is
run on a Tuesday, even if there’s nothing in your source code that could
possibly reformat anything. Note that undefined behavior is a global property —
it always applies not only to the undefined operation, but to the whole program.
[1]


(B) UNSPECIFIED BEHAVIOR

Unspecified behavior is what happens when your program does something for which
the C++ standard doesn’t document the results. You’ll get some valid result, but
you won’t know what the result is until your code looks at it. A compiler is not
allowed to give you a corrupted object or to inject new code to reformat your
hard drive, not even on Tuesdays.


(C) IMPLEMENTATION-DEFINED BEHAVIOR

Implementation-defined behavior is like unspecified behavior, where the
implementation additionally is required to document what the actual result will
be on this particular implementation. You can’t rely on a particular answer in
portable code because another implementation could choose to do something
different, but you can rely on what it will be on this compiler and platform.
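
For a feel of the last two categories, here is a small sketch with well-known examples chosen for this illustration (they are not taken from the article): the size of int and the number of bits in a char are implementation-defined, while the order in which function arguments are evaluated is unspecified. The helper names are invented.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <string>

// Implementation-defined: each implementation must document these values.
constexpr int bits_per_char = CHAR_BIT;        // at least 8
constexpr std::size_t int_size = sizeof(int);  // commonly 4, but not guaranteed

// Unspecified: the order in which the two arguments of join() are evaluated.
inline std::string& trace() { static std::string t; return t; }
inline int first()  { trace() += 'a'; return 1; }
inline int second() { trace() += 'b'; return 2; }
inline int join(int x, int y) { return x * 10 + y; }

inline int evaluate_arguments() {
    trace().clear();
    // The result is always 12; trace() ends up as "ab" or "ba", and a
    // portable program must not depend on which.
    return join(first(), second());
}
```

Either order is a valid outcome and neither corrupts anything, which is the difference from undefined behavior.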


2. FOR EACH OF THE FOLLOWING, WRITE A SHORT FUNCTION … WHERE IF THE ASSERTION IS
NOT CHECKED AND IS FALSE THEN THE EFFECT:


(A) IS ALWAYS UNDEFINED BEHAVIOR

Easy peasy! Let’s dereference a null pointer:

// Example 2(a): If assert is violated, always undefined behavior
 
void deref_and_set( int* p ) {
    assert( p );
    *p = 42;
}

The function asserts that p is not null, and then on the next line
unconditionally dereferences p and scribbles over the location it points to. If
p is null and the assertion checking is off so that we can get to the next line,
the compiler is allowed to make running the whole program format our hard drive.


(B) POSSIBLY RESULTS IN UNDEFINED BEHAVIOR

A general way to describe this class of program is that the call site has two
bugs: first, it violates a precondition (so the callee’s results are always at
least unspecified), and then it additionally uses the unspecified result
without checking it and/or in a dangerous way.

To make up an example, let’s bisect a numeric range:

// Example 2(b): If assert is violated, might lead to undefined behavior
 
int midpoint( int low, int high ) {
    assert( low <= high );
    return low + (high-low)/2;
        // less overflow-prone than “(low+high)/2”
        // more accurate than “low/2 + high/2”
}

The author of midpoint could have made the function more robust to take the
values in either order, and thus eliminated the assertion, but assume they had a
reason not to, as alluded to in the comments.

Violating the assertion does not result in undefined behavior directly. The
function just doesn’t specify (ahem!) its results if call sites call it in a way
that violates the precondition the assertion is testing. If the precondition is
violated, then the function can add a negative number to low. But just
calculating and returning some other int is not (yet) undefined behavior.

For many call sites, a bad call to midpoint won’t lead to later undefined
behavior.

However, it’s possible that some call site might go on to use the unspecified
result in a way that does end up being real undefined behavior, such as using it
as an array index that performs an out-of-bounds access:

auto m = midpoint( low_index(arr1), high_index(arr2) );   // unspecified
   // here we expect m >= low_index(arr1) ...
stats[m-low_index(arr1)]++;                 // --> potentially undefined

This call site code has a typo, and accidentally mixes the low and high indexes
of unrelated containers, which can violate the precondition and result in an
index that is less than the “low” value. Then in the next line it tries to use
it as an offset index into an instrumentation statistics array, which is
undefined behavior for a negative number.
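
To see the numbers, here is a small sketch that computes the offset such a call site would use, without ever indexing anything. The function midpoint_unchecked repeats the arithmetic of Example 2(b) but omits the assert, to model a build where assertion checking is off; stats_offset and the sample values are invented for this illustration.

```cpp
#include <cassert>

// Same arithmetic as Example 2(b), written without the assert to model a
// build where assertion checking is off. Precondition (unchecked): low <= high.
inline int midpoint_unchecked(int low, int high) {
    return low + (high - low) / 2;
}

// Offset a caller would compute before indexing a statistics array.
inline int stats_offset(int low, int high) {
    return midpoint_unchecked(low, high) - low;
}
```

With low = 3 and high = 10 the midpoint is 6 and the offset is 3. With the arguments swapped (low = 10, high = 3) the function quietly returns 7, which is below "low", and the offset becomes -3. Nothing undefined has happened yet; that only comes if the caller goes on to use -3 as an array subscript.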

GUIDELINE: Remember that an unspecified result is not in itself undefined
behavior, but a call site can run with it and end up with real undefined
behavior later. This happens particularly when the calculated value is a pointer,
or an integer used as an array index (which, remember, is basically the same
thing; a pointer value is just an index into all available memory viewed as an
array). If a program relies on unspecified behavior to avoid performing
undefined behavior, then it has a path to undefined behavior, and so unspecified
behavior is a Crouching Tiger, if you will… still dangerous, and can be turned
into the full dragon.

GUIDELINE: Don’t specify your function’s behavior (output postconditions) for
invalid inputs (precondition violations), except for defense in depth (see
Example 2(c)). By definition, if a function’s preconditions are violated, then
the results are not specified. If you specify the outputs for precondition
violations, then (a) callers will depend on the outputs, and (b) those
“preconditions” aren’t really preconditions at all.

While we’re at it, here’s a second example: Let’s compare pointers in a way the
C++ standard says is unspecified. This program attempts to use pointer
comparisons to see whether a pointer points into the contiguous data stored in a
vector, but this technique doesn’t work because today’s C++ standard only
specifies the results of raw pointer comparison when the pointers point at
(into, or one-past-the-end of) the same allocation, and so when ptr is not
pointing into v’s buffer it’s unspecified whether either pointer comparison in
this test evaluates to false:

// Example 2(b)(ii): If assert is violated, might lead to undefined behavior
 
// std::vector<int> v = ...;
assert(&v[0] <= ptr && ptr < (&v[0])+v.size());           // unspecified
*ptr = 42;                                  // --> potentially undefined


(C) IS NEVER UNDEFINED OR UNSPECIFIED BEHAVIOR

An assertion violation is never undefined behavior if the function specifies
what happens in every case even when the assertion is violated. Here’s an
example mentioned in my paper P2064, distilled from real-world code:

// Example 2(c): If assert is violated, never undefined behavior
//               (function documents its result even when x==0)
 
some_result_value DoSomething( int x ) {
    assert( x != 0 );
    if    ( x == 0 ) { return error_value; }
    return sensible_result(x);
}

The function asserts that the parameter is not zero, to express that the call
site shouldn’t do that, in a way the call site can check and test… but then it
also immediately turns around and checks for the errant value and takes a
well-defined fallback path anyway even if it does happen. Why? This is an
example of “defense in depth,” and can be a useful technique for writing robust
software. This means that even though the assertion may be violated, we are
always still in a well-defined state and so this violation does not lead to
undefined behavior.

GUIDELINE: Remember that violating an assertion does not necessarily lead to
undefined behavior.

GUIDELINE: Function authors, always document your function’s requirements on
inputs (preconditions). The caller needs to know what inputs are and aren’t
valid. The requirements that are reasonably checkable should be written as code
so that the caller can perform the checks when testing their code.

GUIDELINE: Always satisfy the requirements of a function you call. Otherwise,
you are feeding “garbage in,” and the best you can hope for is “garbage out.”
Make sure your code’s tests include verifying all the reasonably checkable
preconditions of functions that it calls.

Writing the above pattern has two problems: First, it repeats the condition,
which invites copy/paste errors. Second, it makes life harder for static
analysis tools, which often trust assertions to be true in order to reduce false
positive results, but then will think the fallback path is unreachable and so
won’t properly analyze that path. So it’s better to use a helper to express the
“either assert this or check it and do a fallback operation” in one shot, which
always avoids repeating the condition, and could in principle help static
analysis tools that are aware of this macro (yes, it would be nicer to do it
without resorting to a macro, but it’s annoyingly difficult to write the early
return without a macro, because a return statement inside a lambda doesn’t mean
the same thing):

// Using a helper that asserts the condition or performs the fallback
 
#define ASSERT_OR_FALLBACK(B, ACTION) { \
    bool b = B;                         \
    assert(b);                          \
    if(!b) ACTION;                      \
}
 
some_result_value DoSomething( int x ) {
    ASSERT_OR_FALLBACK( x != 0, return error_value; );
    return sensible_result(x);
}


3. EXPLAIN HOW YOUR ANSWERS TO QUESTIONS 1 AND 2 DO, OR DO NOT, CORRESPOND WITH
EACH OTHER.

In Example 2(a), violating the assertion leads to undefined behavior, 1(a).

In Example 2(b), violating the assertion leads to unspecified behavior, 1(b). At
buggy call sites, this could subsequently lead to undefined behavior.

In Example 2(c), violating the assertion leads to implementation-defined
behavior, 1(c), which never in itself leads to undefined behavior.


4. BONUS: DESCRIBE A VALUABLE SERVICE THAT A TOOL COULD PERFORM FOR ASSERTIONS
THAT SATISFY THE REQUIREMENT IN 2(A), THAT IS NOT POSSIBLE FOR OTHER ASSERTIONS.

There are many. Here is just one example, that happens to be nice because it is
perfectly accurate.

Let’s say we have all the code examples in question 2, written using C assert
today (or even with those assertions missing!), and then at some future time we
get a version of standard C++ that can express them as preconditions. Then only
in Example 2(a), where we can see that the function body (and possibly
transitively its further callees with the help of inlining) exercises undefined
behavior, a tool can infer the precondition annotation and add it mechanically,
and get the benefit of diagnosing existing bugs at call sites:

// What a precondition-aware tool could generate for Example 2(a)
 
auto f( int* p )
    [[pre( p )]]  // can add this automatically: because a violation
                  // leads to undefined behavior, this precondition
                  // is guaranteed to never cause a false positive
{
    assert( p );
    *p = 42;
}

For example, after some future C++2x ships with contracts, a vendor could write
an automated tool that goes through every open source C++ project on GitHub and
mechanically generates a pull request to insert preconditions for functions like
Example 2(a) – but not (b) or (c) – whether or not the assertion already exists,
just by noticing the undefined behavior. And it can inject those contract
preconditions with complete confidence that none of them will ever cause a false
positive, that they will purely expose existing bugs at call sites when that
call site is built with contract checking enabled. I would expect such a tool to
identify a good number of (at least latent if not actual) bugs, and be a boon
for C++ users, and it’s possible only for functions in the category of 2(a).

“Automated adoption” of at least part of a new C++ feature, combined with
“automatically identifies existing bugs” in today’s code, is a pretty good value
proposition.


ACKNOWLEDGMENTS

Thank you to the following for their comments on this material: Joshua Berne,
Gabriel Dos Reis, Gábor Horváth, Andrzej Krzemieński, Ville Voutilainen.


NOTES

[1] In the standard, there are two flavors of undefined behavior. The basic
“undefined behavior” is allowed to enter your program only once you actually try
to execute the undefined part. But some code is so extremely ill-formed (with
magical names like “IF-NDR”) that its very existence in the program makes the
entire program invalid, whether you try to execute it or not.

Herb Sutter Uncategorized 10 Comments 2021-06-03 8 Minutes


GOTW #102: ASSERTIONS AND “UB” (DIFFICULTY: 7/10)

This special Guru of the Week series focuses on contracts. Now that we have
considered assertions, postconditions, and preconditions in GotWs #97-101, let’s
pause and reflect: To what extent does a failed contract imply “UB”… either the
Hidden Dragon of Undefined Behavior, or the Crouching Tiger of Unspecified
Behavior?


JG QUESTION

1. Briefly, what is the difference among:

(a) undefined behavior

(b) unspecified behavior

(c) implementation-defined behavior


GURU QUESTIONS

2. For each of the following, write a short function of the form:

/*...function name and signature...*/
{
    assert( /*...some condition about the parameters...*/ );
    /*...do something with parameters...*/;
}

where if the assertion is not checked and is false then the effect:

(a) is always undefined behavior

(b) possibly results in undefined behavior

(c) is never undefined or unspecified behavior

3. Explain how your answers to Questions 1 and 2 do, or do not, correspond with
each other.

4. BONUS: Describe a valuable service that a tool could perform for assertions
that satisfy the requirement in 2(a), that is not possible for other assertions.

Herb Sutter Uncategorized 6 Comments 2021-05-25 1 Minute


GOTW #101 SOLUTION: PRECONDITIONS, PART 2 (DIFFICULTY: 7/10)

This special Guru of the Week series focuses on contracts. We covered some
basics of preconditions in GotW #100. This time, let’s see how we can use
preconditions in some practical examples…


1. CONSIDER THESE FUNCTIONS, EXPANDED FROM AN ARTICLE BY ANDRZEJ KRZEMIEŃSKI:
[1] … HOW MANY WAYS COULD A CALLER OF EACH FUNCTION GET THE ARGUMENTS WRONG, BUT
THAT WOULD SILENTLY COMPILE WITHOUT ERROR? NAME AS MANY DIFFERENT WAYS AS YOU
CAN.

There are several ways to break this down. I’ll use three major categories of
possible mistakes, the first two of which overlap:

 * wrong order: passing an argument in the wrong position
 * wrong value: passing an argument with a valid but wrong value (e.g., index
   out of range)
 * invalid value: passing an argument that is already invalid (e.g., an invalid
   iterator)

Let’s see how these play out with our three examples, starting with (a).


(A) IS_IN_VALUES (INT VAL, INT MIN, INT MAX)

// Example 1: Adapted from [1]
 
auto is_in_values (int val, int min, int max)
  -> bool;  // true iff val is in the values [min, max]

Oh my, three identically typed integer parameters… what could be confusing about
that?!

Wrong order (5 ways): First, there are five ways to pass these in the wrong
order, because there are 3! = 6 permutations, all of which compile but only the
first of which is correct:

is_in_values( v,  lo, hi );    // correct
 
is_in_values( v,  hi, lo );    // all these are wrong, but compile :(
is_in_values( lo, v,  hi );
is_in_values( lo, hi, v  );
is_in_values( hi, v,  lo );
is_in_values( hi, lo, v  );

Some of these argument orders may seem strange, but some are orders that other
libraries’ similar APIs might use, which makes confusion easier. We all make
mistakes… and the type system isn’t helping us at all.

Wrong value (1 way): Second, there is an implicit precondition that min <= max,
so passing arguments where min > max would be wrong, but would silently compile.
Some of these are exercised by the “wrong order” permutations above, but even
call sites that remember the right argument order can make mistakes about the
actual values.

Invalid value (0 ways): Finally, all possible values of an int are valid — some
may be suspiciously big or small, but int doesn’t have the concept of “not a
number” (NaN) as we have with floats, or the concept of “invalidated” like we
have with iterators.
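
Here is a quick sketch of how silently this goes wrong. The article only declares is_in_values, so the body below is one plausible implementation assumed for illustration, and wrong_order_changes_answer is an invented helper.

```cpp
#include <cassert>

// One plausible implementation; the article only gives the declaration.
inline auto is_in_values(int val, int min, int max) -> bool {
    return min <= val && val <= max;
}

inline bool wrong_order_changes_answer() {
    const int v = 5, lo = 1, hi = 10;
    const bool correct = is_in_values(v, lo, hi);  // is 5 in [1, 10]? yes
    const bool swapped = is_in_values(v, hi, lo);  // is 5 in [10, 1]? no
    return correct && !swapped;
}
```

Both calls compile without a warning, and the swapped one simply returns a different answer.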


(B) IS_IN_CONTAINER (INT VAL, INT IDX_MIN, INT IDX_MAX)

It sure doesn’t help that the next function has the identical signature as
is_in_values, but with very different meaning:

auto is_in_container (int val, int idx_min, int idx_max)
  -> bool; // true iff container[i]==val for some i in [idx_min, idx_max]

Wrong order (5 ways): As in (a), we again have five ways to pass these in the
wrong order, all of which compile but only the first is correct:

is_in_container( v,  lo, hi );    // correct
 
is_in_container( v,  hi, lo );    // all these are wrong, but compile :(
is_in_container( lo, v,  hi );
is_in_container( lo, hi, v  );
is_in_container( hi, v,  lo );
is_in_container( hi, lo, v  );

Wrong value (3 ways): Again as in (a), we have the implicit precondition that
idx_min <= idx_max, so passing idx_min > idx_max would be wrong, but would
silently compile. But this time there are two additional ways to go wrong,
because idx_min and idx_max must both be valid subscripts into container, so if
either is outside the range [0, container.size()) it is a valid integer but an
out of bounds value for this use.

Invalid value (0 ways): Again as in (a), all possible values of an int are valid
— though some may be wrong values if they’re out of bounds as we noted above,
they’re still valid integers.
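
The three "wrong value" cases can be written down as checkable code. This sketch assumes a hypothetical global container, exposed here through an invented accessor function container(), and uses C assert as in the earlier GotWs; the function body is an assumption, since the article only gives the declaration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical container the function searches; the article leaves it unnamed.
inline std::vector<int>& container() {
    static std::vector<int> c{7, 42, 13, 99};
    return c;
}

inline auto is_in_container(int val, int idx_min, int idx_max) -> bool {
    const int size = static_cast<int>(container().size());
    assert(0 <= idx_min && idx_min < size);  // idx_min is a valid subscript
    assert(0 <= idx_max && idx_max < size);  // idx_max is a valid subscript
    assert(idx_min <= idx_max);              // the bounds are ordered
    const auto first = container().begin() + idx_min;
    const auto last  = container().begin() + idx_max + 1;  // closed range [idx_min, idx_max]
    return std::find(first, last, val) != last;
}
```

Each assert corresponds to one of the three ways listed above, so a test build can catch all of them at the call site.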


(C) IS_IN_RANGE (T VAL, ITER FIRST, ITER LAST)

template <typename T, typename Iter>
auto is_in_range (T val, Iter first, Iter last)
  -> bool; // true iff *i==val for some i in [first,last)

Wrong order (1 way): This time there’s only one way to pass the parameters in
the wrong order (ignoring pathological cases where the same argument might
convert to both T and Iter):

is_in_range( v, istart, iend );    // correct
 
is_in_range( v, iend, istart );    // wrong, but compiles :(

Wrong value (2 ways): We could pass a first and last that are not a valid range
in two ways:

 * they point into the same container, but first doesn’t precede last
 * they point into different containers

Invalid value (2 ways): And finally, either of first or last could actually be
an invalidated iterator (e.g., dangling). For example, the container they point
into may be destroyed so that both are invalid; or one of the two iterators
might have been calculated before a more recent operation like vector::push_back
that could have invalidated it.
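
The push_back case can be observed safely by watching the capacity instead of touching a stale iterator. This is a sketch with an invented helper name; it relies only on the rule that a vector must reallocate, and so invalidate all iterators, when its size would exceed its capacity.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true if one more push_back forced the vector to reallocate,
// which is the event that invalidates every outstanding iterator.
inline bool push_back_reallocates() {
    std::vector<int> v{1, 2, 3};
    while (v.size() < v.capacity()) v.push_back(0);  // fill the current buffer
    const std::size_t old_capacity = v.capacity();
    // Any iterator obtained here, e.g. v.begin(), must not be used after the next line.
    v.push_back(4);
    return v.capacity() > old_capacity;
}
```

An iterator saved before that last push_back would still compile as an argument to is_in_range, which is exactly the "invalid value" mistake.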

But if the sight of these function signatures has had you pulling your hair and
shouting “use the type system, Luke!” at your screen, you’re not alone… now
let’s make things better.


2. SHOW HOW CAN YOU IMPROVE THE FUNCTION DECLARATIONS IN QUESTION 1 BY …


(A) JUST GROUPING PARAMETERS, USING A STRUCT WITH PUBLIC VARIABLES

Interestingly, we actually get a lot of benefit simply by grouping ‘parameters
that go together,’ by creating an aggregate or “grouping” helper
struct.[3] For example:

// Example 2(a)(i): Improving Example 1 with aggregate types
 
struct min_max { int min, max; };
 
auto is_in_values (int val, min_max minmax) -> bool;
auto is_in_container (int val, min_max rng) -> bool;
 
template <typename Iter> struct two_iters { Iter first, last; };
 
template <typename T, typename Iter>
auto is_in_range (T val, two_iters<Iter> rng) -> bool;

Or even just venerable anonymous std::pair is better than no grouping:

// Example 2(a)(ii): Improving Example 1 with aggregate types
 
auto is_in_values (int val, std::pair<int,int> minmax) -> bool;
auto is_in_container (int val, std::pair<int,int> rng) -> bool;
 
template <typename T, typename Iter>
auto is_in_range (T val, std::pair<Iter,Iter> rng) -> bool;

With either of the above, there’s only one way for callers to get the argument
order wrong. And it requires only two extra characters at call sites, because we
can use { } to group the arguments without creating actual named objects of the
helper struct:

is_in_values( v, {lo, hi} );    // correct
is_in_values( v, {hi, lo} );    // wrong, but compiles
 
is_in_container( v, {lo, hi} ); // correct
is_in_container( v, {hi, lo} ); // wrong, but compiles
 
is_in_range( v, {i1, i2} );     // correct
is_in_range( v, {i2, i1} );     // wrong, but compiles
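
For anyone who wants to try this, here is a compilable sketch of the first function from Example 2(a)(i). The function body and the helper grouped_call_demo are assumptions added for illustration; the article only gives the declaration.

```cpp
#include <cassert>

struct min_max { int min, max; };

// One plausible body for the declaration in Example 2(a)(i).
inline auto is_in_values(int val, min_max minmax) -> bool {
    return minmax.min <= val && val <= minmax.max;
}

inline bool grouped_call_demo() {
    const int v = 42, lo = 10, hi = 100;
    // is_in_values( lo, v, hi );   // no longer compiles: wrong number of arguments
    // is_in_values( {lo, hi}, v ); // no longer compiles: an int cannot be made from {lo, hi}
    return is_in_values(v, {lo, hi})    // correct
        && !is_in_values(v, {hi, lo});  // the one remaining misordering
}
```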

So just grouping parameters using a struct eliminates some errors. But really
using the type system is even better…


(B) JUST USING AN ENCAPSULATED CLASS, USING A CLASS WITH PRIVATE VARIABLES (AN
ABSTRACTION WITH ITS OWN INVARIANT)

Clearly all three functions are crying out for a “range”-like abstraction for
their pair of parameters, in the first two cases a range of values and in the
third a range of iterators. How do we know? Because:

Here’s one way we can apply class types we can find in the standard library or
Boost today:

// Example 2(b): Improving Example 1 with encapsulated class types
 
auto is_in_values (int val, boost::integer_range<int> rng) -> bool;
 
auto is_in_container (int val, boost::integer_range<int> rng) -> bool;
 
template <typename T, std::ranges::input_range Range>
auto is_in_range (T val, Range&& rng) -> bool;

This gives us all the mistake-reduction goodness we got in (a), plus more.

First, as in (a), absent pathological conversions, it’s very difficult to get
arguments in the wrong order simply because of being forced to group the
parameters:

auto minmax = boost::irange(10, 100);
is_in_values( 42, minmax );
 
auto myvec = std::vector<int>();    // declared before its first use below
auto minmax2 = boost::irange(0, ssize(myvec)-1);
is_in_container( 42, minmax2 );
 
is_in_range( 42, myvec );

But, unlike our helper structs in (a), we now get additional safety because the
types can express constructor preconditions. That moves some of those mistakes
(such as (hi,lo) misordering) into the constructors of class abstractions, which
can then preserve the checked conditions as invariants. [4] The mistake can
still be made, but in fewer places: only where we construct or modify the
abstracted object (e.g., a range), rather than every time we use un-abstracted
separate values (e.g., a couple of iterator objects we have lying around and
whose relationship we have to maintain by hand over time). This is why we
sometimes say “types are predicates,” because a type encapsulates a predicate,
namely its invariant.
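
To make “constructor preconditions become invariants” concrete, here is a minimal sketch of a hand-written range-of-values class. The name value_range and its members are hypothetical (this is not boost::integer_range), and throwing on violation is just one possible choice made here so the behavior is easy to observe.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical class for illustration: a closed range of int values [lo, hi].
class value_range {
public:
    // The (lo <= hi) precondition is checked once, here...
    value_range(int lo, int hi) : lo_{lo}, hi_{hi} {
        if (lo > hi) throw std::invalid_argument("value_range: lo > hi");
    }
    int lo() const { return lo_; }
    int hi() const { return hi_; }
    bool contains(int v) const { return lo_ <= v && v <= hi_; }

private:
    // ...and because the members are private and never modified afterwards,
    // lo_ <= hi_ holds for the lifetime of every value_range object.
    int lo_, hi_;
};

// The function no longer needs to state or check the precondition itself.
inline auto is_in_values(int val, value_range rng) -> bool {
    return rng.contains(val);
}
```

A caller can still write value_range{hi, lo} by mistake, but the error now surfaces at the one place the range is built, not at every function that consumes it.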

GUIDELINE: When multiple functions state the same precondition, it’s a telltale
sign there’s a missing class that should turn it into an invariant. A repeated
precondition is nearly always a “naked invariant” that should be encapsulated up
inside a type. This is more obvious when the precondition involves multiple
parameters (or ordinary variables for that matter); a poster child is the STL’s
pervasive use of iterator pairs, which have long been crying out to be
encapsulated using a range abstraction, and fortunately we now have that in
C++20. Consider using a class instead.

GUIDELINE: Remember that a key reason why encapsulated classes are powerful is
that they wrap up preconditions and turn them into invariants. Hiding data
members is good dependency management because it limits the code that can depend
on the details of the data and is responsible for maintaining the correct
relationship among the data members.


(C) JUST USING POST-C++20 CONTRACT PRECONDITIONS (NOT YET VALID C++, BUT
SOMETHING LIKE THE SYNTAX IN [2])

Preconditions test values, so they can let us eliminate the “wrong values” kinds
of mistakes. Consider this code:

// Example 2(c): Improving Example 1 with boolean preconditions
 
auto is_in_values (int val, int min, int max)
  -> bool // true iff val is in the values [min, max]
     [[pre (min <= max)]]
;
 
auto is_in_container (int val, int idx_min, int idx_max)
  -> bool // true iff container[i]==val for some i in [idx_min, idx_max]
     [[pre (0       <= idx_min
         && idx_min <= idx_max
         && idx_max <  container.size())]]          // see note [5]
;
 
template <typename T, typename Iter>
auto is_in_range (T val, Iter first, Iter last)
  -> bool // true iff *i==val for some i in [first,last)
     [[pre (/*... is_reachable? is_not_dangling? hmm ...*/)]]
;

For the first two functions, we can write clear preconditions that can check the
“wrong value” bugs.
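
Until contract syntax is available, the same boolean conditions can be written as ordinary predicates and checked with assert. The sketch below does that for is_in_container. The helper name valid_index_range is hypothetical, and passing the container as an explicit parameter is an assumption made to keep the example self-contained (the article's declaration refers to a container that is not shown).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: the precondition from Example 2(c) as a plain predicate.
// Signed arithmetic avoids the signed/unsigned comparison against size().
constexpr bool valid_index_range(std::ptrdiff_t idx_min,
                                 std::ptrdiff_t idx_max,
                                 std::ptrdiff_t size) {
    return 0 <= idx_min && idx_min <= idx_max && idx_max < size;
}

// Assumed signature: the container is passed in explicitly for this sketch.
inline bool is_in_container(std::vector<int> const& container,
                            int val, int idx_min, int idx_max) {
    // Approximates [[pre(...)]] with an assertion in the function body.
    assert(valid_index_range(idx_min, idx_max,
                             static_cast<std::ptrdiff_t>(container.size())));
    for (int i = idx_min; i <= idx_max; ++i) {
        if (container[static_cast<std::size_t>(i)] == val) return true;
    }
    return false;
}
```

Keeping the predicate as a separate function means it can be unit-tested on its own, including with the swapped and out-of-bounds arguments it is meant to reject.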

In these particular examples, the best place to write the preconditions is right
on the constructors of the class types we saw in (b), and if we write them there
then we don’t have to repeat them as explicit contracts on every function.

But is (b) always better than (c), in other examples? This brings us to our last
question, which is all about “can” versus “should”…


3. CONSIDER THESE THREE EXAMPLES, WHERE EACH SHOWS EXPRESSING A BOOLEAN
CONDITION EITHER AS A FUNCTION PRECONDITION OR AS AN ENCAPSULATED INVARIANT
INSIDE A NEW TYPE… IN EACH OF THESE CASES, WHICH WAY IS BETTER? EXPLAIN YOUR
ANSWER.

In Question 2, writing a type was often the best choice, but it isn’t always.

The benefits to writing a type include:

 * Encapsulation. We limit the code that is responsible for maintaining the
   boolean condition.
 * Language support. We get the help of the type system to statically enforce
   requirements.

But there are costs and limitations too:

 * What’s the abstraction? There may not be a suitable one. We can’t write a
   good type unless we can discover a useful abstraction that the type’s
   interface should support. A good type represents a useful reusable domain
   abstraction that programmers can understand and that makes their code clearer
   by elevating the vocabulary of the code. There won’t always be a practical
   and reusable abstraction; when there isn’t, we won’t be able to write a
   useful and reusable type. — Even when there is, we have to design that all
   ahead of time, which requires a lot more advance knowledge and engineering
   than just writing ad-hoc boolean conditions on individual functions.
 * What’s the cost? It may not be feasible to maintain the invariant. We have to
   do any extra work it takes to maintain the invariant, and it has to be
   practical to do. When it isn’t, we can’t maintain the invariant without help
   from outside code, and so we won’t be able to really encapsulate it properly.
 * Does it make sense as an independent abstraction? Will the user be carrying
   around objects of this type, or are we just jamming a precondition common to
   a few functions (or only one) into a type and calling it useful? Occam’s
   Razor: Don’t multiply entities beyond necessity.
 * What’s the type the caller is using? This is where a real usable abstraction
   shines, because many callers will be using it independently of calling our
   function. But if the caller isn’t using this type, then there typically has
   to be an implicit or explicit conversion (because inheritance from all
   argument types our callers might already have usually isn’t an option), and
   that conversion would need to be usable and sufficiently cheap.

GUIDELINE: Remember that types and contracts are “better together.” Use both.
They are complementary, neither is a substitute for the other. All we are trying
to accomplish with contracts is to augment the language’s static type checking
with runtime checking where that is more appropriate because we can’t design a
practical abstraction. And this is why we want contracts on functions
(preconditions, postconditions) even though we already have types, and why we
also want contracts on types (invariants).

Let’s consider the three examples.


(A) A VECTOR THAT IS SORTED

template <typename T>
void f( vector<T> const& v ) [[pre( is_sorted(v) )]] ;
 
template <typename T>
void f( sorted<vector<T>> const& v );

If this looks familiar, it’s because is_sorted is one of the classic examples we
saw in GotW #98 of conditions that are often impractical to check and enforce as
an assertion, in this case a precondition.

Can we do better by making it a type, perhaps a sorted wrapper around a
container like vector that maintains the guarantee that it’s always sorted?
Well, we have to answer some questions about a sorted<T>:

 * What’s the abstraction it provides? It can’t easily fulfill the requirements
   of a sequence container like vector itself; for example, push_back doesn’t
   make much sense because letting the caller insert an arbitrary value at a
   specific location would easily cause the container to be unsorted. Instead,
   it would naturally want a more general insert function, and the interface
   would be more like set. This part could be workable.
 * What’s the cost? This is where it starts to break down: Keeping a vector
   sorted all the time means that every insertion would cost O(N) work.
   Which leads into…
 * Does it make sense as an independent abstraction? … that it’s very common for
   code to maintain an “almost-sorted” vector, such as by inserting new elements
   at the end which is fast (and, hmm, affects our abstraction design, because
   then it would make sense to have push_back after all, wouldn’t it? hmm) but
   leaves a suffix of unsorted elements in the container, and then periodically
   sorting the whole container so that the sorting cost is amortized. But an
   almost-sorted vector isn’t good enough, and so doesn’t fit the bill. We don’t
   have empirical evidence of such types in general use.
 * What’s the type the caller is using? And now we’re busted all the way,
   because we want this interface to be usable by anyone who has a vector<T>,
   which would require a conversion to sorted<vector<T>>. If we do a deep copy,
   that’s prohibitively expensive. Even if the conversion is lightweight by
   avoiding a deep copy, such as by just wrapping an existing vector object, it
   wouldn’t be very useful unless it did O(N) work every time unconditionally to
   verify the invariant. And even then the abstraction design is affected and
   compromised: If the user can still see and modify the original vector, then
   that’s still part of the accessible interface to the data, so the user can
   make the container be not fully sorted and we’re unable to really encapsulate
   and maintain our intended invariant.

So is_sorted is much better as a function precondition.
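
The last bullet can be seen in a few lines of code. The following is a sketch of the “lightweight wrapper” option only. The type sorted_ref is hypothetical, and it throws on violation purely so the failure is observable. It pays an O(N) check on every conversion, and it still cannot stop the caller from changing the underlying vector afterwards.

```cpp
#include <algorithm>
#include <cassert>
#include <stdexcept>
#include <vector>

// Hypothetical non-owning wrapper; not a standard or Boost type.
template <typename T>
class sorted_ref {
public:
    explicit sorted_ref(std::vector<T> const& v) : v_{&v} {
        // O(N) work on every conversion, just to establish the invariant.
        if (!std::is_sorted(v.begin(), v.end()))
            throw std::invalid_argument("sorted_ref: vector is not sorted");
    }
    std::vector<T> const& get() const { return *v_; }

private:
    std::vector<T> const* v_;   // the caller still owns, and can modify, the vector
};
```

For example, after constructing a sorted_ref over a sorted vector v, a later v.push_back(0) through the original name leaves the wrapper referring to an unsorted vector, so the “invariant” is not really encapsulated.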


(B) A VECTOR THAT IS NOT EMPTY

template <typename T>
void f( vector<T> const& v ) [[pre( !v.empty() )]] ;
 
template <typename T>
void f( not_empty<vector<T>> const& v );

This one is more feasible as a type, but still not ideal:

 * What’s the abstraction it provides? It’s a vector, and we can make the
   interface identical to vector with just extra preconditions on pop and erase
   functions to not remove the last element in the container.
 * What’s the cost? Emptiness is cheap enough to check and maintain.
 * Does it make sense as an independent abstraction? This is where it starts to
   get questionable… the answer is at best “maybe.” It’s not clear to me that a
   “nonempty vector” is a generally useful abstraction.
 * What’s the type the caller is using? This is where I think we break down
   again. Again, we want this interface to be usable by anyone who has a
   vector<T>, and that means a conversion to not_empty<vector<T>>. If we do a
   deep copy, that’s prohibitively expensive. This time if we just wrap an
   existing vector object to avoid the deep copy, the check is cheap. But then
   we still have the problem that the abstraction design is affected and
   compromised so that it can’t maintain its invariant, because if the user can
   still see and modify the original vector, they can remove the last element on
   us.

So not_empty seems better as a function precondition.


(C) A POINTER THAT IS NOT NULL

void f( int* p ) [[pre( p != nullptr )]] ;
 
void f( not_null<int*> p );

This time we can do better:

 * What’s the abstraction it provides? This one’s easy to state: It’s a not-null
   pointer. That’s a far simpler interface than a container, because we just
   need operator* and operator->, construction, destruction, and copying. Even
   so it’s not totally without subtlety, because not_null should not have move
   operations that modify the source object. This means that a
   not_null<unique_ptr<T>> is legal but there’s not much you can do with it
   besides dereference it and destroy it: It can’t be copyable because
   unique_ptr isn’t copyable, and it must not be movable because moving a
   unique_ptr leaves the source null.
 * What’s the cost? Nullness is cheap enough to check and maintain.
 * Does it make sense as an independent abstraction? Definitely. A “non-null
   pointer” has been widely rediscovered and reinvented as a generally useful
   abstraction.
 * What’s the type the caller is using? A not_null<int*> is a useful object in
   its own right in the calling code, independently of calling this particular
   function. And if our function is invoked by someone who has only an ordinary
   int*, doing a full copy of the pointer is cheap, and applying the nullness
   check as a precondition on that converting constructor is exactly equivalent
   to writing the precondition by hand, but is automated.

So not_null seems better as a type, primarily because it is independently
useful. This is why it has been reinvented a number of times, including as
gsl::not_null. [6]
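
For a feel of how small such a type can be, here is a deliberately simplified sketch for raw pointers. The name not_null_ptr is hypothetical and this is not the gsl::not_null implementation. Throwing on a null argument is an assumption of the sketch; a real implementation might assert or terminate instead.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical, simplified non-null wrapper for raw pointers.
template <typename T>
class not_null_ptr {
public:
    // Implicit on purpose, so a caller holding a plain T* can still call
    // functions that take not_null_ptr<T>; the null check runs exactly once here.
    not_null_ptr(T* p) : p_{p} {
        if (p == nullptr) throw std::invalid_argument("not_null_ptr: null pointer");
    }
    // Passing the literal nullptr is rejected at compile time.
    not_null_ptr(std::nullptr_t) = delete;

    T& operator*()  const { return *p_; }
    T* operator->() const { return p_; }
    T* get()        const { return p_; }

private:
    T* p_;   // invariant: never null (copying a raw pointer leaves the source intact)
};

// The function body can dereference without restating the precondition.
inline int deref(not_null_ptr<int> p) { return *p; }
```

With this sketch, deref(&x) works for an int x, deref(nullptr) does not compile, and passing a null int* variable fails at the converting constructor, which is where the precondition now lives.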

GUIDELINE: Wherever practical, design interfaces so that incorrect call sites
are illegal (won’t compile, using the type system) or loud (won’t pass unit
tests, using preconditions). This is a key part of achieving the goal to “make
interfaces easy to use correctly, and hard to use incorrectly.” Preconditions
directly help with that by letting us catch entire groups of errors at test
time, and are a complement to the type system which makes incorrect uses “not
fit” through the compiler and also carries extra preconditions around for us in
the form of invariants.

GUIDELINE: Remember that the type system is a hammer, and not every precondition
is a nail. The type system is a powerful tool, but not every precondition is
naturally (part of) an invariant of a useful type that provides a good reusable
abstraction that’s generally useful independently of this function.




NOTES

[1] A. Krzemieński. “Contracts, preconditions and invariants” (Andrzej’s C++
blog, December 2020).

[2] G. Dos Reis, J. D. Garcia, J. Lakos, A. Meredith, N. Myers, and B.
Stroustrup. “P0542: Support for contract based programming in C++” (WG21 paper,
June 2018). Subsequent EWG discussion favored changing “expects” to “pre” and
“ensures” to “post,” and to keep it as legal compilable (if unenforced) C++20
for this article I also modified the syntax from : to ( ), and to name the
return value _return_ for postconditions. That’s not a statement of preference,
it’s just so the examples can compile today to make them easier to check.

[3] For 2(a) and 2(b), on platform ABIs that do not pass small structs/classes
in registers, turning individual parameters into a struct/class could cause them
to be passed in stack memory instead of in registers.

[4] Upcoming GotWs will cover invariants and violation handling.

[5] If C++ gets chained comparisons as proposed in P0515 and P0893 we could
write this much more clearly, and with fewer opportunities for mistakes, as:

[[pre( 0 <= idx_min <= idx_max < container.size() )]]

[6] B. Stroustrup and H. Sutter (eds.) “I.12 Declare a pointer that must not be
null as not_null” (C++ Core Guidelines.) If the not_null<T> type we are using is
implicitly convertible from T, which is the intent of I.12 to provide a drop-in
replacement for pointer parameters, then the usability is the same as with the
precondition. Otherwise, the caller has to provide a not_null argument at the
call site, either by doing an explicit conversion or by just using a not_null
local variable in their own body.


ACKNOWLEDGMENTS

Thank you to the following for their feedback on this material: Joshua Berne,
Gabriel Dos Reis, J. Daniel Garcia, Gábor Horváth, Andrzej Krzemieński, Bjarne
Stroustrup, Andrew Sutton, Ville Voutilainen.

Herb Sutter C++, GotW, Uncategorized 9 Comments 2021-03-25 2021-04-02 15 Minutes


GOTW #101: PRECONDITIONS, PART 2 (DIFFICULTY: 7/10)

This special Guru of the Week series focuses on contracts. We covered some
basics of preconditions in GotW #100. This time, let’s see how we can use
preconditions in some practical examples…


JG QUESTION

1. Consider these functions, expanded from an article by Andrzej Krzemieński:
[1]

// Adapted from [1]
 
auto is_in_values (int val, int min, int max)
  -> bool; // true iff val is in the values [min, max]
 
auto is_in_container (int val, int idx_min, int idx_max)
  -> bool; // true iff container[i]==val for some i in [idx_min, idx_max]
 
template <typename T, typename Iter>
auto is_in_range (T val, Iter first, Iter last)
  -> bool; // true iff *i==val for some i in [first,last)

How many ways could a caller of each function get the arguments wrong, but that
would silently compile without error? Name as many different ways as you can.


GURU QUESTIONS

2. Show how you can improve the function declarations in Question 1 by:

(a) just grouping parameters, using a struct with public variables

(b) just using an encapsulated class, using a class with private variables (an
abstraction with its own invariant)

(c) just using post-C++20 contract preconditions (not yet valid C++, but
something like the syntax in [2])

In each case, how many of the possible kinds of mistakes for each function can
the approach prevent?

3. Consider these three examples, where each shows expressing a boolean
condition either as a function precondition or as an encapsulated invariant
inside a new type:

// (a) A vector that is sorted
 
template <typename T>
void f( vector<T> const& v ) [[pre( is_sorted(v) )]] ;
 
template <typename T>
void f( sorted<vector<T>> const& v );
 
 
// (b) A vector that is not empty
 
template <typename T>
void f( vector<T> const& v ) [[pre( !v.empty() )]] ;
 
template <typename T>
void f( not_empty<vector<T>> const& v );
 
 
// (c) A pointer that is not null
 
void f( int* p ) [[pre( p != nullptr )]] ;
 
void f( not_null<int*> p );

In each of these cases, which way is better? Explain your answer.


NOTES

[1] A. Krzemieński. “Contracts, preconditions and invariants” (Andrzej’s C++
blog, December 2020).

[2] G. Dos Reis, J. D. Garcia, J. Lakos, A. Meredith, N. Myers, and B.
Stroustrup. “P0542: Support for contract based programming in C++” (WG21 paper,
June 2018). Subsequent EWG discussion favored changing “expects” to “pre” and
“ensures” to “post,” and to keep it as legal compilable (if unenforced) C++20
for this article I also modified the syntax from : to ( ). That’s not a
statement of preference, it’s just so the examples can compile today to make
them easier to check.

Herb Sutter C++, GotW 2021-03-08 2 Minutes


GOTW #100 SOLUTION: PRECONDITIONS, PART 1 (DIFFICULTY: 8/10)

This special Guru of the Week series focuses on contracts. We’ve seen how
postconditions are directly related to assertions (see GotWs #97 and #99). So
are preconditions, but they are fundamentally different in one important way.
What is that? And why would having language support benefit us even more for
writing preconditions than for the other two?


1. WHAT IS A PRECONDITION, AND HOW IS IT RELATED TO AN ASSERTION?

A precondition is a “call site prerequisite on the inputs”: a condition that
must be true at each call site before the caller can invoke this function. In
math terms, it’s about expressing the domain of recognized inputs for the
function. If preconditions don’t hold, the function can’t possibly do its work
(achieve its postconditions), because the caller hasn’t given it a starting
point it understands.

A precondition IS-AN assertion in every way described in GotW #97, with the
special addition that whereas a general assertion is always checked where it is
written, a precondition is written on the function and conceptually checked at
every call site. (In less-ideal implementations, including if we write it as a
library today, the precondition check might be in the function body; see
Question 2.)


EXPLAIN YOUR ANSWER USING THE FOLLOWING EXAMPLE, WHICH USES A VARIATION OF A
PROPOSED POST-C++20 SYNTAX FOR PRECONDITIONS. [1]

// Example 1(a): A precondition along the lines proposed in [1]
 
void f( int min, int max )
    [[pre( min <= max )]]
{
    // ...
}

The above would be roughly equivalent to writing the test before the call at
every call site instead. For example, for a call site that performs f(x, y), we
want to check the precondition at this specific call site at least when it is
being tested (and possibly earlier and/or later, see GotW #97 Question 4):

// Example 1(b): What a compiler might generate at a call site
//               “f(x, y)” for the precondition in Example 1(a)
 
assert( x <= y ); // implicitly injected assertion at this call site,
                  // checked (at least) when this call site is tested
f(x, y);

And, as we’ll see in Question 4, language support for preconditions should apply
this rewrite recursively for subexpressions that are themselves function calls
with preconditions.

GUIDELINE: Use a precondition to write “this is what a bug is” as code the
caller can check. A precondition states in code the circumstances under which
this function’s behavior is not documented.


2. REWRITE THE EXAMPLE IN QUESTION 1 TO SHOW HOW TO APPROXIMATE THE SAME EFFECT
USING ASSERTIONS IN TODAY’S C++.

Here’s one way we can do it, that extends the MY_POST technique from GotW #99
Example 2 to also support preconditions. Again, instead of MY_ you’d use your
company’s preferred unique macro prefix: [2]

// Eliminate forward-boilerplate with a macro (written only once)
#define MY_PRE_POST(preconditions, postconditions)         \
    assert( preconditions );                               \
    auto post = [&](auto&& _return_) -> auto&& {           \
        assert( postconditions );                          \
        return std::forward<decltype(_return_)>(_return_); \
    };

And then the programmer can just write:

// Example 2: Sample precondition
 
void f( int min, int max )
{   MY_PRE_POST_V( min <= max, true ); // true == no postconditions here
    // ...
}

This has the big benefit that it works using today’s C++. It has the same
advantages as MY_POST in GotW #99, including that it’s future-friendly… if we
use the macro as shown above, then if in the future C++ has language support for
preconditions and postconditions with a syntax like [1], migrating your code to
that could be as simple as search-and-replace:

{ MY_PRE_POST( **, * ) ➜ [[pre( ** )]] [[post( * )]] {

return post( * ) ➜ return *
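
For a function that returns a value, the same macro is used together with “return post(...)”. Here is a small self-contained sketch. The macro is repeated from above so the example compiles on its own, and the function midpoint_of is a hypothetical example, not from the article.

```cpp
#include <cassert>
#include <utility>

// Same macro as shown earlier in this section.
#define MY_PRE_POST(preconditions, postconditions)         \
    assert( preconditions );                               \
    auto post = [&](auto&& _return_) -> auto&& {           \
        assert( postconditions );                          \
        return std::forward<decltype(_return_)>(_return_); \
    };

// Hypothetical function: the precondition is checked on entry, and the
// postcondition is checked on the value passed through post() at each return.
inline int midpoint_of( int min, int max )
{   MY_PRE_POST( min <= max, min <= _return_ && _return_ <= max );
    return post( min + (max - min) / 2 );
}
```

Note that both checks compile away when NDEBUG is defined, because they are ordinary assert calls.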

GUIDELINE (extended from GotW #99): If you don’t already use a way to write
preconditions and postconditions as code, consider trying something like
MY_PRE_POST until language support is available. It’s legal C++ today, it’s not
terrible, and it’s future-friendly to adopting future C++ language contracts.

But even if macros don’t trigger your fight-or-flight response, it’s still a far
cry from language support …


ARE THERE ANY DRAWBACKS TO YOUR SOLUTION COMPARED TO HAVING LANGUAGE SUPPORT FOR
PRECONDITIONS?

Yes:

 * Callee-body checking only. This method can run the check only inside the
   function’s body. First, this means we can’t easily perform the check at each
   call site, which would be ideal, not least so that we can turn the check on
   for one call site but not another when testing a specific caller. Second, for
   constructors it can’t run at the very beginning of construction because
   member initialization happens before we enter the constructor body.
 * Doesn’t directly handle nested preconditions, meaning preconditions of
   functions invoked as part of the precondition itself. We’ll come to this in
   Question 4.
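
The constructor drawback is easy to run into in practice. The sketch below shows one possible workaround in today's C++: route the arguments through a checking helper inside the member initializer list, so the check runs before the member is built. The names histogram and checked_bin_count are hypothetical, and throwing (instead of assert) is an assumption made so the failure can be observed.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: check the precondition, then compute the initializer value.
inline std::size_t checked_bin_count(int min, int max) {
    if (min > max) throw std::invalid_argument("histogram: min > max");
    return static_cast<std::size_t>(max - min) + 1;
}

class histogram {
public:
    histogram(int min, int max)
        : bins_(checked_bin_count(min, max))  // the check runs here, before bins_ is built
    {
        // A check written here, in the body, would run only after bins_ had
        // already been constructed from the unchecked arguments.
    }
    std::size_t bin_count() const { return bins_.size(); }

private:
    std::vector<int> bins_;
};
```

This is still a callee-side check, so it does not address the first point about enabling checks per call site.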


3. IF A PRECONDITION FAILS, WHAT DOES THAT INDICATE, AND WHO IS RESPONSIBLE FOR
FIXING THE FAILURE?

Each call site is responsible for making sure it meets all of a function’s
preconditions before calling that function. If a precondition is false, it’s a
bug in the calling code, and it’s the calling code author who is responsible for
fixing it.


EXPLAIN HOW THIS MAKES A PRECONDITION FUNDAMENTALLY DIFFERENT FROM EVERY OTHER
KIND OF CONTRACT.

A precondition is the only kind of contract you can write that someone else has
to fulfill, and so if it’s ever false then it’s someone else’s fault — it’s the
caller’s bug that they need to go fix.

GUIDELINE: Remember the fundamental way preconditions are unique… if they’re
false, then it’s someone else’s fault (the calling code author). When you write
any of the other contracts (assertions, function postconditions, class
invariants), you state something that must be true about your own function or
class, and if prior contracts were written and well tested then likely it’s your
function or class that created the first unexpected state.


4. CONSIDER THIS EXAMPLE, EXPANDED FROM A SUGGESTION BY GÁBOR HORVÁTH:

// Example 4(a): What are the implicit preconditions?
 
auto calc( std::vector<int> const&  x ,
           std::floating_point auto y ) -> double
    [[pre( x[0] <= std::sqrt(y) )]] ;

Note that std::floating_point is a C++20 concept.


A) WHAT KINDS OF PRECONDITIONS MUST A CALLER OF CALC SATISFY THAT CAN’T
GENERALLY BE WRITTEN AS TESTABLE BOOLEAN EXPRESSIONS?

The language requires the number and types of arguments to match the parameter
list. Here, calc must be called with two arguments. The first must be a
std::vector<int> or something convertible to that. The second one’s type has to
satisfy the floating_point concept (it must be float, double, or long double).

It’s worth remembering that these language-enforced rules are conceptually part
of the function’s precondition, in the sense that they are requirements on call
sites. Even though we generally can’t write testable boolean predicates for
these to check that we didn’t write a bug, we also never need to do that because
if we write a bug the code just won’t compile. [3] Code that is “correct by
construction” doesn’t need to add assertions to find potential bugs.

GUIDELINE: Remember that a static type is a (non-boolean) precondition. It’s
just enforced by language semantics with always-static checking (edit or compile
time), and never needs to be tested using a boolean predicate whose test could
be delayed until dynamic checking (test or run time).

COROLLARY: A function’s number, order, and types of parameters are all
(non-boolean) parts of its precondition. This falls out of the “static type”
statement because the function’s own static type includes those things. For
example, the language won’t let us invoke this function with the argument lists
() or (1,2,3,4,5) or (3.14, myvector). We’ll delve into this more deeply in GotW
#101.
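
One way to see these static “preconditions” at work is to ask the compiler directly which argument lists are acceptable. This sketch uses std::is_invocable_v from C++17. To keep it simple it assumes a stand-in declaration of calc whose second parameter is a plain double, not the constrained template parameter in Example 4(a).

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Simplified stand-in for calc (assumption: y is a plain double here).
// Only the declaration is needed; nothing below calls the function.
auto calc(std::vector<int> const& x, double y) -> double;

using calc_t = decltype(&calc);

// The matching argument list is accepted...
static_assert( std::is_invocable_v<calc_t, std::vector<int>, double>,
               "(myvector, 3.14) matches the parameter list");
// ...and the argument lists from the text are rejected at compile time.
static_assert(!std::is_invocable_v<calc_t>,
               "() has too few arguments");
static_assert(!std::is_invocable_v<calc_t, int, int, int, int, int>,
               "(1,2,3,4,5) has too many arguments");
static_assert(!std::is_invocable_v<calc_t, double, std::vector<int>>,
               "(3.14, myvector) has the arguments in the wrong order");
```

None of these checks needs a test run: a call that violates them never produces an executable.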

COROLLARY: All functions have preconditions. Even void f() { }, which takes no
inputs at all including that it reads no global state, has the precondition that
it must be passed zero arguments. The only counterexample I can think of is
pathological: void f(...) { } can be invoked with any number of arguments but
ignores them all.


B) WHAT KINDS OF BOOLEAN-TESTABLE PRECONDITIONS ARE IMPLICIT WITHIN THE
EXPLICITLY WRITTEN DECLARATION OF CALC?

There are three possible kinds of implied boolean preconditions. All three are
present in this example.


(1) TYPE INVARIANTS

Each object must meet the invariants of its type. This is subtly different from
“the object’s type matches” (a static property) that we saw in 4(a), because
this means additionally “the object’s value is not corrupt” (a dynamic
property).

Here, this means x must obey the invariant of vector<int>, even though that
invariant isn’t expressed in code in today’s C++. [4] For y this is fairly easy
because all bit patterns are valid floating point values (more about NaNs in
just a moment).


(2) SUBEXPRESSION PRECONDITIONS

The subexpression x[0] calls x.operator[] which has its own precondition, namely
that the subscript be non-negative and less than x.size(). For 0, that’s true if
x.size() > 0 is true, or equivalently !x.empty(), so that becomes an implicit
part of our whole precondition.


(3) SUBEXPRESSIONS THAT MAKE THE WHOLE PRECONDITION FALSE

The subexpression std::sqrt(y) invokes C’s sqrt. The C standard says that unless
y >= 0, the result of sqrt(y) is NaN (“not a number”), which means our
precondition amounts to something <= NaN which is always false. Therefore, y >=
0 is effectively part of calc’s precondition too. [5]
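
A quick way to convince yourself of this is to evaluate the written condition for a negative y. The sketch assumes IEEE 754 floating point, which is what mainstream platforms provide, and uses a hypothetical helper written_pre that takes x[0] directly as x0.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper mirroring the written precondition x[0] <= std::sqrt(y),
// with x[0] passed in directly as x0.
inline bool written_pre(int x0, double y) {
    // For y < 0, std::sqrt(y) is NaN, and every ordered comparison
    // against NaN evaluates to false.
    return x0 <= std::sqrt(y);
}
```

So written_pre(x0, -1.0) is false for every x0, which is why y >= 0 does not need to be spelled out separately.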


PUTTING IT ALL TOGETHER

If we were to write this all out, the full precondition would be something like
this — and note that the order is important! Here we’ll ignore the parts that
are enforced by the language, such as parameter arity, and focus on the parts
that can be written as boolean expressions:

// Example 4(b): Trying to write the precondition out more explicitly
//               (NOT all are recommended, this is for exposition)
 
auto calc( std::vector<int> const&  x ,
           std::floating_point auto y ) -> double
    [[pre(
 
//  1. parameter type invariants:
           /* x is a valid object, but we can’t spell that, so: */ true
 
//  2. subexpression preconditions:
        && x.size() > 0  // so checking x[0] won’t be undefined (!)
 
//  3. subexpression values that make our precondition false:
        && y >= 0        // redundant with the expression below
 
// finally, our explicit precondition itself:
        && x[0] <= std::sqrt(y)
    )]] ;

GUIDELINE: Remember that your function’s full effective precondition is the
precondition you write plus all its implicit prerequisites. Those are: (1) each
parameter’s type invariants, (2) any preconditions of other function calls
within the precondition, and (3) any defined results of function calls within
the precondition that would make the precondition false.


C) SHOULD ANY OF THESE BOOLEAN-TESTABLE IMPLICIT PRECONDITIONS ALSO BE WRITTEN
EXPLICITLY HERE IN THIS PRECONDITION CODE? EXPLAIN.

For #1 and #3, we generally shouldn’t be repeating them as in 4(b):

 * We can skip repeating #1 because it’s enforced by the type system, plus if
   there is a bug it’s likely in the type itself rather than in our code or our
   caller’s code and will be checked when we check the type’s invariants.
 * We can skip repeating #3 because it’ll just make the whole condition be false
   and so is already covered.

But #2 is the problematic case: If x is actually empty, the subexpression’s
precondition would actually make our precondition undefined to evaluate!
“Undefined” is a very bad answer if we ever check this precondition, because if
in our checking the full precondition is ever violated then we absolutely want
that check to do something well-defined — we want it to evaluate to false and
fail.

If a subexpression of our precondition itself has a real precondition, then we
do want to check that first, otherwise we cannot check our full precondition
without undefined behavior if that subexpression’s precondition was not met:

// Example 4(c): Today, we should repeat our subexpressions’ real
//               preconditions, so we can check our precondition
//               without undefined behavior
 
auto calc( std::vector<int> const&  x ,
           std::floating_point auto y ) -> double
    [[pre( x.size() > 0 && x[0] <= std::sqrt(y) )]] ;

With today’s library-based preconditions, such as the one shown in Question 2,
we need to repeat subexpressions’ preconditions if we want to check our
precondition without undefined behavior. One of the potential advantages of a
language-supported contracts system is that it can “flatten” the preconditions
to automatically test category #2, so that nested preconditions like this one
don’t need to be repeated (assuming that the types and functions you use, here
std::vector and its member functions, have written their preconditions and
invariants)… and then we could still debate whether or not to explicitly repeat
subexpression preconditions in our preconditions, but it would be just a
water-cooler stylistic debate, not a “can this even be checked at all without
invoking undefined behavior” correctness debate.
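
In today's C++, the ordering in Example 4(c) can be captured in a small predicate. This is a sketch under assumptions: calc_pre is a hypothetical helper name, y is a plain double to keep the code short, and the body of calc is a placeholder because the article never shows one.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper: the full checked precondition of calc.
inline bool calc_pre(std::vector<int> const& x, double y) {
    // !x.empty() must come first: && short-circuits, so x[0] is
    // evaluated only when evaluating it is well-defined.
    return !x.empty() && x[0] <= std::sqrt(y);
}

inline double calc(std::vector<int> const& x, double y) {
    assert(calc_pre(x, y));        // body-side approximation of [[pre(...)]]
    return std::sqrt(y) - x[0];    // placeholder body, not from the article
}
```

Swapping the two operands of && would compile just as well, but would evaluate x[0] on an empty vector before the emptiness test had a chance to fail the check.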

Here’s a subtle variation suggested by Andrzej Krzemieński. For the sake of
discussion, suppose we have a nested precondition that is not used in the
function body (which I think is terribly unlikely, but let’s just consider it):

void display( /*...*/ )
    [[pre( globalData->helloMessageHasBeenPrinted() )]]
{
    // assume for sake of discussion that globalData is not
    // dereferenced directly or indirectly by this function body
}

Here, someone could argue: “If globalData is null, only actually checking the
precondition would be undefined behavior, but executing the function body would
not be undefined behavior.”

Question: Is globalData != nullptr an implicit precondition of display, since it
applies only to the precondition, and is not actually used in the function body?
Think about it for a moment before continuing…

…

… okay, here’s my answer: Yes, it’s absolutely part of the precondition of
display, because by definition a precondition is something the caller is
required to ensure is true before calling display, and a condition that is
undefined to evaluate at all cannot be true.
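To make that concrete, here is a small sketch that writes the implicit part of display's precondition out by hand. The global_data type and the display_precondition helper are hypothetical stand-ins introduced only for illustration; the article does not define them.

```cpp
#include <cassert>

// Hypothetical stand-in for whatever globalData points to.
struct global_data {
    bool printed = false;
    bool helloMessageHasBeenPrinted() const { return printed; }
};

global_data* globalData = nullptr;

// The full precondition of display, with the implicit part first so
// the second operand is only evaluated when it is well defined.
bool display_precondition()
{
    return globalData != nullptr
        && globalData->helloMessageHasBeenPrinted();
}

void display()
{
    assert( display_precondition() );
    // body does not dereference globalData
}
```

With the null test written first, a null globalData makes the precondition evaluate to false instead of making the check itself undefined.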

GUIDELINE: If your checked precondition has a subexpression with its own
preconditions, make sure those are checked first. Otherwise, you might find your
precondition check doesn’t fire even when it’s violated. In the future, language
support for preconditions might automate this for you; until then, be careful to
write out the subexpression precondition by hand and put it first.


NOTES

[1] G. Dos Reis, J. D. Garcia, J. Lakos, A. Meredith, N. Myers, and B.
Stroustrup. “P0542: Support for contract based programming in C++” (WG21 paper,
June 2018). Subsequent EWG discussion favored changing “expects” to “pre” and
“ensures” to “post,” and to keep it as legal compilable (if unenforced) C++20
for this article I also modified the syntax from : to ( ), and to name the
return value _return_ for postconditions. That’s not a statement of preference,
it’s just so the examples can compile today to make them easier to check.

[2] Again, as in GotW #99 Note 4, in a real system we’d want a few more
variations, such as:

// A separate _V version for functions that don’t return
// a value, because 'void' isn’t regular
#define MY_PRE_POST_V(preconditions, postconditions) \
    assert( preconditions );                         \
    auto post = [&]{ assert( postconditions ); };
 
// Parallel _DECL forms to work on forward declarations,
// for people who want to repeat the postcondition there
#define MY_PRE_POST_DECL(preconditions, postconditions)
#define MY_PRE_POST_V_DECL(preconditions, postconditions)

And see GotW #99 Note 5 for how to guarantee the programmer didn’t forget to
write “return post” at each return.

[3] Sure, there are things like is_invocable, but the point is we can’t always
write those expressions, and we don’t have to here.

[4] Upcoming GotWs will cover invariants and violation handling. For type
invariants, today’s C++ doesn’t yet provide a way to write those as checkable
assertions to help us find bugs where we got it wrong and corrupted an object.
The language just flatly assumes that every object meets the invariants of its
type during the object’s lifetime, which is from the end of its construction to
the beginning of its destruction.

[5] There’s more nuance to the details of what the C standard says, but it ends
up that we should expect the result of passing a negative or NaN value to sqrt
will be NaN. Although C calls negative and NaN inputs “domain errors,” which
hints at a precondition, it still defines the results for all inputs and so
strictly speaking doesn’t have a precondition.
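A short sketch of what that means for a comparison like the one in calc's precondition, assuming an IEEE 754 floating-point implementation with default (non-fast-math) settings. The helper name at_most_sqrt is hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper mirroring the comparison "x[0] <= std::sqrt(y)".
bool at_most_sqrt( int lhs, double y )
{
    // Any ordered comparison against NaN is false, so a negative y
    // makes this return false rather than invoking undefined behavior.
    return lhs <= std::sqrt(y);
}
```

So a negative y does not need its own explicit test for the check to be safe to evaluate; it simply makes the written condition false.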


ACKNOWLEDGMENTS

Thank you to the following for their feedback on this material: Joshua Berne,
Gabriel Dos Reis, J. Daniel Garcia, Gábor Horváth, Andrzej Krzemieński,
JeanHeyd Meneide, Bjarne Stroustrup, Andrew Sutton, Jim Thomas, Ville
Voutilainen.

Herb Sutter C++, GotW 2021-02-25 11 Minutes


TRIP REPORT: WINTER 2021 ISO C++ STANDARDS MEETING (VIRTUAL)

Today, the ISO C++ committee held its second full-committee (plenary) meeting of
the pandemic and adopted a few more features and improvements for draft C++23.

A record of 18 voting nations sent representatives to this meeting: Austria,
Bulgaria, Canada, Czech Republic, Finland, France, Germany, Israel, Italy,
Japan, Netherlands, Poland, Romania, Russia, Spain, Switzerland, United Kingdom,
and United States. Japan had participated in person during C++98 and C++11, and
has always given us good remote ballot feedback during C++14/17/20, and is
attending again now; welcome back! Italy and Romania are our newest national
bodies; welcome!


OUR VIRTUAL 2021

We continue to have the same priorities and the same schedule we originally
adopted for C++23. However, since the pandemic began, WG21 and its subgroups
have had to meet all-virtually via Zoom, and we are not going to try to have a
face-to-face meeting in 2021 (see What’s Next below). Some subgroups had already
been having virtual meetings for years, but this was a major change for other
groups including our two main design groups – the language and library evolution
working groups (EWG and LEWG). In all, over the past year we have held
approximately 200 virtual meetings.


TODAY: A FEW MORE C++23 FEATURES ADOPTED

Today we formally adopted a second round of small features for C++23, as well as
a number of bug fixes. Below, I’ll list some of the more user-noticeable changes
and credit all those paper authors, but note that this is far from an exhaustive
list of important contributors… even for these papers, nothing gets done without
help from a lot of people and unsung heroes, so thank you first to all of the
people not named here who helped the authors move their proposals forward! And
thank you to everyone who worked on the adopted issue resolutions and smaller
papers I didn’t include in this list.

P1102 by Alex Christensen and JF Bastien is the main noticeable change we
adopted for the core language itself. It’s just a tiny bit of cleanup, but one
that I’m personally fond of: In C++23 we will be able to omit empty ( ) lambda
parameter lists even when we have to declare the lambda mutable. I’m the one who
proposed the lambda syntax we have today (except for the mutable part which
wasn’t mine and I never liked), including that it enabled making unused parts of
the syntax optional so that we can write simple lambdas simply. For example,
today we can already write

[x]{ return f(x); }

as a legal synonym for

[x] () -> auto { return f(x); }

and omit the empty parameter list and deduced return type. Even so, I’ve noticed
a lot of people write the ( ) part anyway, which isn’t wrong or anything, it’s
just that often they write it because they don’t know they can omit it too. And
part of the problem was the oddity in pre-C++23 that if you need to write
mutable, then you actually do have to also write the ( ) (but not the return
type), which was just weird but was another reason for people to just write ( )
all the time, because sometimes they had to. With P1102, we don’t have to.
That’s more consistent. Thanks, Alex and JF!
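For illustration, here is a small sketch of the mutable case. The function name make_counter is hypothetical. The lambda is written in the pre-C++23 form so it compiles with today's compilers, and the comment shows the shorter spelling that P1102 permits.

```cpp
#include <cassert>

// Returns a stateful counter: each call yields the next integer from 1.
inline auto make_counter()
{
    // Pre-C++23: the empty ( ) is required because of mutable.
    // With P1102 (C++23) this can be: [n = 0] mutable { return ++n; }
    return [n = 0]() mutable { return ++n; };
}
```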

In the spirit of “completing C++20,” P2259 by Tim Song makes several fixes to
iterator_category to make it work better with ranges and adaptors. Here is an
example of code that does not compile today for arcane reasons (see the paper),
but will be legal C++23 thanks to Tim:

std::vector<int> vec = {42};
auto r = vec | std::views::transform([](int c) { return std::views::single(c);})
             | std::views::join
             | std::views::filter([](int c) { return c > 0; });
r.begin();

Further in the “completing C++20” spirit, P2017 by Barry Revzin fixes some
additional glitches in ranges to make them work better. Here is an example of
safe and efficient code that does not compile today, where for arcane reasons
the declaration of e isn’t supported and today’s workaround is to make the code
more complex and less efficient. This will be legal C++23 thanks to Barry:

auto trim(std::string const& s) {
    auto isalpha = [](unsigned char c){ return std::isalpha(c); };
    auto b = ranges::find_if(s, isalpha);
    auto e = ranges::find_if(s | views::reverse, isalpha).base();
    return subrange(b, e);
}

P2212 by Alexey Dmitriev and Howard Hinnant generalizes time_point::clock to
allow for greater flexibility in the kinds of clocks it supports, including
stateful clocks, external system clocks that don’t really have time_points,
representing “time of day” as a distinct time_point, and more.

P2162 by Barry Revzin takes an important first step toward cleaning up
std::visit and laying the groundwork for its further generalization. Even if you
don’t yet love std::visit, it’s a useful tool that P2162 makes more useful by
making it work more regularly. We expect to see further generalization in the
future, which is much easier to do with a cleaner and more regular existing
feature to build upon.

Finally, I saw cheers and celebratory emoji erupt in the Zoom chat window when
we adopted P1682 by JeanHeyd Meneide. It’s very small, but very useful. When
passing an enum to an API that uses the underlying type, today we have to write
a static_cast to the std::underlying_type, which makes us repeat the enum’s name
and so is cumbersome all the time and brittle for type-safety under maintenance
if we change to use a different enum:

some_untyped_api( static_cast<std::underlying_type_t<ABCD>>(some_value) );

Thanks to JeanHeyd, in C++23 we will be able to write:

some_untyped_api( std::to_underlying(some_value) );

Note that of course standard library vendors don’t have to wait until 2023 to
provide to_underlying or any of these other fixes and improvements. Just having
a feature like this one voted into the draft standard is often enough for
vendors to be proactive in providing it… these days, vendors are more closely
tracking our draft standard meeting by meeting rather than waiting for the
official release, in part because we are shipping regularly and predictably and
we don’t vote features into the draft standard until we think they’re pretty
well baked so that vendors have less risk in implementing them early.
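Until a given toolchain ships it, a stand-in with the same shape is easy to sketch. The name to_underlying_sketch and the color enum below are hypothetical and exist only for illustration; the real facility is std::to_underlying in <utility>.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical enum used only for this illustration.
enum class color : unsigned char { red = 1, green = 2 };

// Hypothetical stand-in: converts an enum value to its underlying
// type without the caller having to repeat the enum's name.
template <typename E>
constexpr auto to_underlying_sketch( E e ) noexcept
{
    return static_cast<std::underlying_type_t<E>>(e);
}
```

Because the enum type is deduced from the argument, switching some_value to a different enum later needs no change at the call site.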

We also adopted a number of other issue resolutions and small papers that made
additional improvements.

Finally, we came close to adopting P0533 by Edward Rosten and Oliver Rosten,
which is about adding constexpr to many of the functions in math.h that we share
with C. This is clearly a Good Thing and therefore many voted in favor of
adopting the paper. The only hesitation that stopped it from getting consensus
this time was a concern that it needed more time to iron out how implementations
would implement it, such as how to deal with errno in a constexpr context. This
is the kind of question that often arises when we want to make improvements to
entities declared in the C headers, because not only are they governed by the C
standard rather than the C++ standard, but typically they are provided and
controlled by the operating system vendor rather than by the C++
compiler/library writer, and those constraints always mean a bit of extra work
when we want to make improvements for C++ programmers and remain compatible. As
far as I know, everyone wants to see these functions made constexpr, so we
expect to see this paper come to plenary again in the future. Thanks for your
perseverance, Edward and Oliver!


WHAT’S NEXT

As long as we are meeting virtually, we will continue to have virtual plenaries
like the one we had this week to formally adopt new features as they progress
through subgroups. Our next two virtual plenaries to adopt features into the
C++23 working draft will be held in June and November. Progress will be slower
than when we can meet face-to-face, and we’ll doubtless defer some topics that
really need in-person discussion until we can meet again safely, but in the
meantime we’ll make what progress we can and we’ll ship C++23 on time.

The next tentatively planned face-to-face meeting is February 2022 in Portland,
OR, USA; however, we likely won’t know until well into the autumn whether we’ll
be able to confirm that or need to postpone it. You can find a list of our
meeting plans on the Upcoming Meetings page.

Thank you again to the hundreds of people who are working tirelessly on C++,
even in our current altered world. Your flexibility and willingness to adjust
are much appreciated by all of us in the committee and by all the C++
communities! Thank you, and see you on Zoom.

Herb Sutter Uncategorized 2 Comments 2021-02-22 6 Minutes


GOTW #100: PRECONDITIONS, PART 1 (DIFFICULTY: 8/10)

This special Guru of the Week series focuses on contracts. We’ve seen how
postconditions are directly related to assertions (see GotWs #97 and #99). So
are preconditions, but in one important way they are fundamentally
different. What is that? And why would having language support benefit us even
more for writing preconditions than for the other two?


JG QUESTION

1. What is a precondition, and how is it related to an assertion? Explain your
answer using the following example, which uses a variation of a proposed
post-C++20 syntax for preconditions. [1]

// A precondition along the lines proposed in [1]
 
void f( int min, int max )
    [[pre( min <= max )]]
{
    // ...
}


GURU QUESTIONS

2. Rewrite the example in Question 1 to show how to approximate the same effect
using assertions in today’s C++. Are there any drawbacks to your solution
compared to having language support for preconditions?

3. If a precondition fails, what does that indicate, and who is responsible for
fixing the failure? Explain how this makes a precondition fundamentally
different from every other kind of contract.

4. Consider this example, expanded from a suggestion by Gábor Horváth:

auto calc( std::vector<int> const&  x ,
           std::floating_point auto y ) -> double
    [[pre( x[0] <= std::sqrt(y) )]] ;

Note that std::floating_point is a C++20 concept.

 * What kinds of preconditions must a caller of calc satisfy that can’t
   generally be written as testable boolean expressions?
 * What kinds of boolean-testable preconditions are implicit within the
   explicitly written declaration of calc?
 * Should any of these boolean-testable implicit preconditions also be written
   explicitly here in this precondition code? Explain.


NOTES

[1] G. Dos Reis, J. D. Garcia, J. Lakos, A. Meredith, N. Myers, and B.
Stroustrup. “P0542: Support for contract based programming in C++” (WG21 paper,
June 2018). Subsequent EWG discussion favored changing “expects” to “pre” and
“ensures” to “post,” and to keep it as legal compilable (if unenforced) C++20
for this article I also modified the syntax from : to ( ). That’s not a
statement of preference, it’s just so the examples can compile today to make
them easier to check.

Herb Sutter C++, GotW 4 Comments 2021-02-10 1 Minute


GOTW #99 SOLUTION: POSTCONDITIONS (DIFFICULTY: 7/10)

This special Guru of the Week series focuses on contracts. Postconditions are
directly related to assertions (see GotW #97)… but how, exactly? And since we
can already write postconditions using assertions, why would having language
support benefit us more for writing postconditions than for writing
(ordinary) assertions?


1. WHAT IS A POSTCONDITION, AND HOW IS IT RELATED TO AN ASSERTION?

A function’s postconditions document “what it does” — they assert the function’s
intended effects, including the return value and any other caller-visible side
effects, which must hold at every return point when the function returns to the
caller.

A postcondition IS-AN assertion in every way described in GotW #97, with the
special addition that whereas a general assertion is always checked where it is
written, a postcondition is written on the function and checked at every return
(which could be multiple places). Otherwise, it’s “just an assertion”: As with
an assertion, if a postcondition is false then it means there is a bug, likely
right there inside the function on which the postcondition is written (or in the
postcondition itself), because if prior contracts were well tested then likely
this function created the first unexpected state. [2]


EXPLAIN YOUR ANSWER USING THE FOLLOWING EXAMPLE, WHICH USES A VARIATION OF A
PROPOSED POST-C++20 SYNTAX FOR POSTCONDITIONS. [1]

// Example 1(a): A postcondition along the lines proposed in [1]
 
string combine_and_decorate( const string& x, const string& y )
    [[post( _return_.size() > x.size() + y.size() )]]
{
    if (x.empty()) {
        return "[missing] " + y + optional_suffix();
    } else {
        return x + ' ' + y + something_computed_from(x);
    }
}

The above would be roughly equivalent to writing the test before every return
statement instead:

// Example 1(b): What a compiler might generate for Example 1(a)
 
string combine_and_decorate( const string& x, const string& y )
{
    if (x.empty()) {
        auto&& _return_ = "[missing] " + y + optional_suffix();
        assert( _return_.size() > x.size() + y.size() );
        return std::forward<decltype(_return_)>(_return_);
    } else {
        auto&& _return_ = x + ' ' + y + something_computed_from(x);
        assert( _return_.size() > x.size() + y.size() );
        return std::forward<decltype(_return_)>(_return_);
    }
}


2. REWRITE THE EXAMPLE IN QUESTION 1 TO SHOW HOW TO APPROXIMATE THE SAME EFFECT
USING ASSERTIONS IN TODAY’S C++. ARE THERE ANY DRAWBACKS TO YOUR SOLUTION
COMPARED TO HAVING LANGUAGE SUPPORT FOR POSTCONDITIONS?

We could always write Example 1(b) by hand, but language support for
postconditions is better in two key ways:

(A) The programmer should only write the condition once.

(B) The programmer should not need to write forwarding boilerplate by hand to
make looking at the return value efficient.

How can we approximate those advantages?


OPTION 1 (BASIC): NAMED RETURN OBJECT + AN EXIT GUARD

The simplest way to achieve (A) would be to use the C-style goto exit; pattern:

// Example 2(a)(i): C-style “goto exit;” postcondition pattern
 
string combine_and_decorate( const string& x, const string& y )
{
    auto _return_ = string();
    if (x.empty()) {
        _return_ = "[missing] " + y + optional_suffix();
        goto post;
    } else {
        _return_ = x + ' ' + y + something_computed_from(x);
        goto post;
    }
 
post:
    assert( _return_.size() > x.size() + y.size() );
    return _return_;
}

If you were thinking, “in C++ this wants a scope guard,” you’re right! [3]
Guards still need access to the return value, so the structure is basically
similar:

// Example 2(a)(ii): scope_guard pattern, along the lines of [3]
 
string combine_and_decorate( const string& x, const string& y )
{
    auto _return_ = string();
    auto post = std::experimental::scope_success([&]{
        assert( _return_.size() > x.size() + y.size() );
    });
 
    if (x.empty()) {
        _return_ = "[missing] " + y + optional_suffix();
        return _return_;
    } else {
        _return_ = x + ' ' + y + something_computed_from(x);
        return _return_;
    }
}

Advantages:

 * Achieved (A). The programmer writes the condition only once.

Drawbacks:

 * Didn’t achieve (B). There’s no forwarding boilerplate, but only because we’re
   not even trying to forward…
 * Overhead (maybe). … and to look at the return values we require a named
   return value and a move assignment into that object, which is overhead if the
   function wasn’t already doing that.
 * Brittle. The programmer has to remember to convert every return site to
   _return_ = ...; goto post; or _return_ = ...; return _return_;… If they
   forget, the code silently compiles but doesn’t check the postcondition.


OPTION 2 (BETTER): “RETURN POST” POSTCONDITION PATTERN

Here’s a second way to do it that achieves both goals, using a local function
(which we have to write as a lambda in C++):

// Example 2(b): “return post” postcondition pattern
 
string combine_and_decorate( const string& x, const string& y )
{
    auto post = [&](auto&& _return_) -> auto&& {
        assert( _return_.size() > x.size() + y.size() );
        return std::forward<decltype(_return_)>(_return_);
    };
 
    if (x.empty()) {
        return post( "[missing] " + y + optional_suffix() );
    } else {
        return post( x + ' ' + y + something_computed_from(x) );
    }
}

Advantages:

 * Achieved (A). The programmer writes the condition only once.
 * Efficient. We can look at return values efficiently, without requiring a
   named return value and a move assignment.

Drawbacks:

 * Didn’t achieve (B). We still have to write the forwarding boilerplate, but at
   least it’s only in one place.
 * Brittle. The programmer has to remember to convert every return site to
   return post. If they forget, the code silently compiles but doesn’t check the
   postcondition.
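One way to see the "Efficient" point is to apply the same pattern to a move-only return type, where any hidden copy would fail to compile. This is a sketch only: the function make_positive and its condition are hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical function returning a move-only type.
std::unique_ptr<int> make_positive( int v )
{
    auto post = [&](auto&& _return_) -> auto&& {
        assert( _return_ != nullptr && *_return_ > 0 );
        return std::forward<decltype(_return_)>(_return_);
    };

    // The temporary is forwarded through post as an rvalue, so the
    // returned object is move-constructed from it, never copied.
    return post( std::make_unique<int>(v) );
}
```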


OPTION 3 (MO’BETTA): WRAPPING UP OPTION 2… WITH A MACRO

We can improve Option 2 by wrapping the boilerplate up in a macro (sorry). Note
that instead of “MY_” you’d use your company’s preferred unique macro prefix:
[4]

// Eliminate forward-boilerplate with a macro (written only once)
#define MY_POST(postconditions)                            \
    auto post = [&](auto&& _return_) -> auto&& {           \
        assert( postconditions );                          \
        return std::forward<decltype(_return_)>(_return_); \
    };

And then the programmer can just write:

// Example 2(c): “return post” with boilerplate inside a macro
 
string combine_and_decorate( const string& x, const string& y )
{   MY_POST( _return_.size() > x.size() + y.size() );
 
    if (x.empty()) {
        return post( "[missing] " + y + optional_suffix() );
    } else {
        return post( x + ' ' + y + something_computed_from(x) );
    }
}

Advantages:

 * Achieved (A) and (B). The programmer writes the condition only once, and
   doesn’t write the forwarding boilerplate.
 * Efficient. We can look at the return value without requiring a local variable
   for the return value, and without an extra move operation to put the value
   there.
 * Future-friendly. You may have noticed that I changed my usual brace style to
   write { MY_POST on a single line; that’s to make it easily replaceable with
   search-and-replace. If you systematically declare the condition as { MY_POST
   at the start of the function, and systematically write return post() to use
   it, the code is likely more future-proof — if we get language support for
   postconditions with a syntax like [1], migrating your code to that could be
   as simple as search-and-replace:

{ MY_POST( * ) ➜ [[post _return_: * ]] {

return post( * ) ➜ return *

Drawbacks:

 * (improved) Brittle. It’s still a manual pattern, but now we have the option
   of making it impossible for the programmer to forget return post by extending
   the macro to include a check that post was used before each return (see [5]).
   That’s feasible to put into the Option 3 macro, whereas it was not realistic
   to ask the programmer to write out by hand in Options 1 and 2.

GUIDELINE: If you don’t already use a way to write postconditions as code,
consider trying something like MY_POST until language support is available. It’s
legal C++ today, it’s not terrible, and it’s future-friendly to adopting future
C++ language contracts.

Finally, all of these options share a common drawback:

 * Less composable/toolable. The next library or team will have THEIR_POST
   convention that’s different, which makes it hard to write tools to support
   both styles. Language support has an important incidental benefit of
   providing a common syntax that portable code and tools can rely upon.




3. SHOULD A POSTCONDITION BE EXPECTED TO BE TRUE IF THE FUNCTION THROWS AN
EXCEPTION BACK TO THE CALLER?

No.

First, let’s generalize the question: Anytime you see “if the function throws an
exception,” mentally rewrite it to “if the function reports that it couldn’t do
what it advertised, namely complete its side effects.” That’s independent of
whether it reports said failure using an exception, std::error_code, HRESULT,
errno, or any other way.

Then the question answers itself: No, by definition. A postcondition documents
the side effects, and if those weren’t achieved then there’s nothing to check.
And for postconditions involving the return value we can add: No, those are
meaningless by construction, because the return value doesn’t exist.
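Here is a small sketch of the same idea with a non-exception error channel, using std::error_code as mentioned above. The function parse_digit and its conditions are hypothetical; the postcondition is checked only on the path that reports success.

```cpp
#include <cassert>
#include <system_error>

// Hypothetical function: converts '0'..'9' to 0..9, reporting failure
// through ec instead of throwing.
int parse_digit( char c, std::error_code& ec )
{
    // Postcondition on the result, applied only where there is one.
    auto post = [](int r) { assert( 0 <= r && r <= 9 ); return r; };

    if (c < '0' || c > '9') {
        ec = std::make_error_code(std::errc::invalid_argument);
        return -1;   // failure reported: no postcondition to check
    }
    ec.clear();
    return post( c - '0' );
}
```

The -1 on the failure path is a placeholder the caller must not interpret; it is deliberately not run through post.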

“But wait!” someone might interrupt. “Aren’t there still things that need to be
true on function exit even if the function failed?” Yes, but those aren’t
postconditions. Let’s take a look.


JUSTIFY YOUR ANSWER WITH EXAMPLE(S).

Consider this code:

// Example 3: (Not) a reasonable postcondition?
 
void append_and_decorate( string& x, string&& y )
    [[post( x.size() <= x.capacity() && /* other non-corruption */ )]]
{
    x += y + optional_suffix();
}

This can seem like a sensible “postcondition” even when an exception is thrown,
but it is testing whether x is still a valid object of its type… and sure, that
had better be true. But that’s an invariant, which should be written once on the
type [2], not a postcondition to be laboriously repeated arbitrarily many times
on every function that ever might touch an object of that type.

When reasoning about function failures, we use the well-known Abrahams error
safety guarantees, and now it becomes important to understand them in terms of
invariants:

 * The nofail guarantee is “the function cannot fail” (e.g., such functions
   should be noexcept), and so doesn’t apply here since we’re discussing what
   happens if the function does fail.
 * The basic guarantee is “no corruption,” every object we might have tried to
   modify is still a valid object of its type… but that’s identical to saying
   “the object still meets the invariants of its type.”
 * The strong guarantee is “all or nothing,” so in the case we’re talking about
   where an error is being reported, a strong guarantee function is again saying
   that all invariants hold. (It also says observable state did not change, but
   I’ll ignore that for now; for how we might want to check that, see [6].)

So we’re talking primarily about class invariants… and those should hold on both
successful return and error exit, and they should be written on the type rather
than on every function that uses the type.
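As a rough sketch of "written once on the type": the class below is hypothetical and uses an ordinary member function named invariant, since today's C++ has no language syntax for type invariants. The same check holds after a successful call and after an error exit that gives at least the basic guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical type for illustration.
class bounded_stack {
    std::vector<int> items_;
    std::size_t      limit_;
public:
    explicit bounded_stack( std::size_t limit ) : limit_{limit} {}

    // The invariant, stated once on the type.
    bool invariant() const { return items_.size() <= limit_; }

    std::size_t size() const { return items_.size(); }

    void push( int v )
    {
        if (items_.size() == limit_)
            throw std::length_error("bounded_stack is full"); // error exit
        items_.push_back(v);
    }
};
```

Callers and tests can assert invariant() after either outcome, without each function restating it as a "postcondition".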

GUIDELINE: If you’re trying to write a “postcondition” that should still be true
even if an exception or other error is reported, you’re probably either trying
to write an invariant instead [2], or trying to check the strong did-nothing
guarantee [6].


4. SHOULD POSTCONDITIONS BE ABLE TO REFER TO BOTH THE INITIAL (ON ENTRY) AND
FINAL (ON EXIT) VALUE OF A PARAMETER, IF THOSE COULD BE DIFFERENT?

Yes.


IF SO, GIVE AN EXAMPLE.

Consider this code, which uses a strawman _in_() syntax for referring to
subexpressions of the postcondition that should be computed on entry so they can
refer to the “in” value of the parameter (note: this was not proposed in [1]):

// Example 4(a): Consulting “in” state in a postcondition
 
void instrumented_push( vector<widget>& c, const widget& value )
    [[post( _in_(c.size())+1 == c.size() )]]
{
 
    c.push_back(value);
 
    // perform some extra work, such as logging which
    // values are added to which containers, then return
}

Postconditions like this one express relative side effects, where the “out”
state is a delta from the “in” state of the parameter. To write postconditions
like this one, we have to be able to refer to both states of the parameter, even
for parameters that must be modifiable.

Note that this doesn’t require taking a copy of the parameter… that would be
expensive for c! Rather, an implementation would just evaluate any _in_
subexpression on entry and store only that result as a temporary, then evaluate
the rest of the expression on exit. For example, in this case the
implementation could generate something like this:

// Example 4(b): What an implementation might generate for 4(a)
 
void instrumented_push( vector<widget>& c, const widget& value )
{
    auto __in_c_size = c.size();
 
    c.push_back(value);
 
    // perform some extra work, such as logging which
    // values are added to which containers, then return
 
    assert( __in_c_size+1 == c.size() );
}


NOTES

[1] G. Dos Reis, J. D. Garcia, J. Lakos, A. Meredith, N. Myers, and B.
Stroustrup. “P0542: Support for contract based programming in C++” (WG21 paper,
June 2018). Subsequent EWG discussion favored changing “expects” to “pre” and
“ensures” to “post,” and to keep it as legal compilable (if unenforced) C++20
for this article I also modified the syntax from : to ( ), and to name the
return value _return_ for postconditions. That’s not a statement of preference,
it’s just so the examples can compile today to make them easier to check.

[2] Upcoming GotWs will cover preconditions and invariants, including how
invariants relate to postconditions.

[3] P. Sommerlad and A. L. Sandoval. “P0052: Generic Scope Guard and RAII
Wrapper for the Standard Library” (WG21 paper, February 2019). Based on
pioneering work by Andrei Alexandrescu and Petru Marginean starting with “Change
the Way You Write Exception-Safe Code – Forever” (Dr. Dobb’s Journal, December
2000), and widely implemented in D and other languages, the Folly library, and
more.

[4] In a real system we’d want a few more variations, such as:

// A separate _V version for functions that don’t return
// a value, because 'void' isn’t regular
#define MY_POST_V(postconditions)                          \
    auto post = [&]{ assert( postconditions ); };
 
// Parallel _DECL forms to work on forward declarations,
// for people who want to repeat the postcondition there
#define MY_POST_DECL(postconditions)   // intentionally empty
#define MY_POST_V_DECL(postconditions) // intentionally empty

Note: We could try to combine MY_POST_V and MY_POST by always creating both a
single-parameter lambda and a no-parameter lambda, and then “overloading” them
using something like compose from Boost’s wonderful Higher-Order Functions
library by Paul Fultz II. Then in a void-returning function return post() still
works fine even with empty parens. I didn’t do that because the future
in-language contracts proposed in [1] use a slightly different syntax depending
on whether there’s a return value, so if our syntax doesn’t somehow have such a
distinction then it will be harder to migrate this macro to a syntax like [1]
with a simple search-and-replace.

[5] We could add extra machinery to help the programmer remember to write return
post, so that just executing a return without post will assert… set a flag on
every post() evaluation, and then assert that flag in the destructor of an RAII
object for every normal return. The code is pretty simple with a scope guard
[3]:

// Check that the programmer wrote “return post” each time
#define MY_POST_CHECKED                                     \
    auto post_checked = false;                              \
    auto post_guard = std::experimental::scope_success([&]{ \
        assert( post_checked );                             \
    });

Then in MY_POST and MY_POST_V, pull in this machinery and then also set
post_checked:

#define MY_POST(postconditions)                             \
    MY_POST_CHECKED                                         \
    auto post = [&](auto&& _return_) -> auto&& {            \
        assert( postconditions );                           \
        post_checked = true;                                \
        return std::forward<decltype(_return_)>(_return_);  \
    };
 
#define MY_POST_V(postconditions)                           \
    MY_POST_CHECKED                                         \
    auto post = [&]{                                        \
        assert( postconditions );                           \
        post_checked = true;                                \
    };

If you don’t have a scope guard helper, you can roll your own, where “successful
exit” is detectable by seeing that the std::uncaught_exceptions() exception
count hasn’t changed:

// Hand-rolled alternative if you don’t have a scope guard
#define MY_POST_CHECKED                                     \
    auto post_checked = false;                              \
    struct post_checked_ {                                  \
        const bool *pflag;                                  \
        const int  ecount = std::uncaught_exceptions();     \
        post_checked_(const bool* p) : pflag{p} {}          \
        ~post_checked_() {                                  \
            assert( *pflag ||                               \
                    ecount != std::uncaught_exceptions() ); \
        }                                                   \
    } post_checked_guard{&post_checked};
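
To see the pieces working together, here is a minimal sketch that repeats the hand-rolled MY_POST_CHECKED and the MY_POST macro from above and uses them in two functions. The functions clamp_percent and percent_of are hypothetical names invented for this illustration, and the sketch assumes that the postcondition expression may refer to the lambda parameter _return_.

```cpp
#include <cassert>
#include <exception>   // std::uncaught_exceptions
#include <stdexcept>   // std::invalid_argument
#include <utility>     // std::forward

// Hand-rolled guard from above: on a normal return, assert that post() ran
#define MY_POST_CHECKED                                     \
    auto post_checked = false;                              \
    struct post_checked_ {                                  \
        const bool *pflag;                                  \
        const int  ecount = std::uncaught_exceptions();     \
        post_checked_(const bool* p) : pflag{p} {}          \
        ~post_checked_() {                                  \
            assert( *pflag ||                               \
                    ecount != std::uncaught_exceptions() ); \
        }                                                   \
    } post_checked_guard{&post_checked};

// MY_POST from above: check the postconditions, then record that we did
#define MY_POST(postconditions)                             \
    MY_POST_CHECKED                                         \
    auto post = [&](auto&& _return_) -> auto&& {            \
        assert( postconditions );                           \
        post_checked = true;                                \
        return std::forward<decltype(_return_)>(_return_);  \
    };

// Hypothetical function: every normal return goes through post()
int clamp_percent(int value) {
    MY_POST( 0 <= _return_ && _return_ <= 100 );
    if (value < 0)   return post(0);
    if (value > 100) return post(100);
    return post(value);
}

// Hypothetical function with an error exit: the throw skips post(), and the
// guard stays quiet because std::uncaught_exceptions() has changed
int percent_of(int part, int whole) {
    MY_POST( 0 <= _return_ && _return_ <= 100 );
    if (whole <= 0 || part < 0 || part > whole)
        throw std::invalid_argument("percent_of: bad arguments");
    return post(part * 100 / whole);
}
```

Removing the post(...) wrapper from any of these return statements should make the guard's assertion fire in a build where assert is enabled, because post_checked is still false when the guard is destroyed on a normal return.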

[6] For strong-guarantee functions, we could try to check that all observable
state is the same as on function entry. In some cases, we can partly do that…
for example, writing the test that a failed vector::push_back didn’t invalidate
any pointers into the container may sound hard, but it’s actually the easy part
of that function’s “error exit” condition! Using a strawman syntax like [1],
extended to include an “error” exit condition:

// (Using a hypothetical “error exit” condition)
// This is enough to check that no pointers into *this are invalid
 
template <typename T, typename Allocator>
constexpr void vector<T, Allocator>::push_back( const T& )
    [[error( _in_.data() == data() && _in_.size() == size() )]] ;

But other “error exit” checks for this same function would be hard, expensive,
or impossible to express. For example, it would be expensive to write the check
that all elements in the vector have their original values, which would require
first taking a deep copy of the container.
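
Without any language support, the cheap half of that check can be approximated today with a wrapper. The following is only a sketch under stated assumptions: checked_push_back and Fragile are hypothetical names made up for this illustration, and the snapshot of data() and size() taken on entry stands in for the strawman's _in_.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: push_back plus a hand-written "error exit" check
template <typename T, typename Allocator>
void checked_push_back(std::vector<T, Allocator>& v, const T& value) {
    const T* const    data_in = v.data();   // cheap snapshot of the entry state
    const std::size_t size_in = v.size();
    try {
        v.push_back(value);
    } catch (...) {
        // Error exit: same buffer and same size, so no pointers into the
        // container were invalidated
        assert( data_in == v.data() && size_in == v.size() );
        throw;
    }
}

// Hypothetical element type whose copy constructor can be told to fail,
// used only to force push_back down its error path
struct Fragile {
    int  id;
    bool fail_on_copy;
    Fragile(int i, bool f) : id{i}, fail_on_copy{f} {}
    Fragile(const Fragile& other)
        : id{other.id}, fail_on_copy{other.fail_on_copy} {
        if (other.fail_on_copy) throw std::runtime_error("copy failed");
    }
};
```

The snapshot is two words, which is why this part of the condition is easy. Checking that every element still has its original value would need a copy of the whole container on entry, which is the expensive part described above.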


ACKNOWLEDGMENTS

Thank you to the following for their feedback on this material: Joshua Berne,
Gábor Horváth, Andrzej Krzemieński, James Probert, Bjarne Stroustrup, Andrew
Sutton.

Herb Sutter | C++, GotW | 11 Comments | 2021-02-08, 2021-02-09 | 11 Minutes




